
POLITECNICO DI MILANO

Advanced Computer Architectures (and HPPS)

Miscellanea of pipelining
Complex pipelining, Pipeline

Marco D. Santambrogio: marco.santambrogio@polimi.it

ACA
2
Complex Pipeline
  In this problem we will examine the execution of a code segment
on the following single-issue out-of-order processor:

3
You can assume that
  All functional units are pipelined
  ALU operations take 1 cycle
  Memory operations take 3 cycles (includes time in ALU)
  Floating-point add instructions take 3 cycles
  Floating-point multiply instructions take 5 cycles
  There is no register renaming
  Instructions are fetched, decoded and issued in order
  The issue stage is a buffer of unlimited length that holds
instructions waiting to start execution
  An instruction will only enter the issue stage if it does not
cause a WAR or WAW hazard
  Only one instruction can be issued at a time, and in the
case multiple instructions are ready, the oldest one will go
first
4
Code

 
I1 L.D F3, 0(R0)
I2 ADD.D F2, F2, F3
I3 MUL.D F5, F4, F4
I4 ADD.I R0, R0, 8
I5 L.D F3, 0(R0)
I6 ADD.D F2, F3, F5
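
As a rough cross-check, the sketch below computes a dataflow lower bound for this code: the earliest cycle at which each result could be ready if only RAW dependences and the stated latencies mattered. It deliberately ignores the single-issue limit, the WAR/WAW rule of the issue stage and any fetch, decode or write-back stages, so it is not the pipeline schedule asked for in the exercise. The names `LATENCY`, `PROGRAM` and `dataflow_finish_times` are illustrative and do not come from the slides.

```python
# Latencies in cycles, taken from the assumptions stated for this processor.
LATENCY = {"L.D": 3, "ADD.D": 3, "MUL.D": 5, "ADD.I": 1}

# (label, opcode, destination register, source registers)
PROGRAM = [
    ("I1", "L.D",   "F3", ["R0"]),
    ("I2", "ADD.D", "F2", ["F2", "F3"]),
    ("I3", "MUL.D", "F5", ["F4", "F4"]),
    ("I4", "ADD.I", "R0", ["R0"]),
    ("I5", "L.D",   "F3", ["R0"]),
    ("I6", "ADD.D", "F2", ["F3", "F5"]),
]


def dataflow_finish_times(program, latency):
    """Earliest finish cycle per instruction, considering RAW chains only."""
    ready_at = {}   # register -> cycle at which its latest value is ready
    finish = {}
    for label, op, dest, srcs in program:
        # An instruction can start once all of its sources are ready;
        # registers never written in this fragment are ready at cycle 0.
        start = max(ready_at.get(reg, 0) for reg in srcs)
        finish[label] = start + latency[op]
        ready_at[dest] = finish[label]
    return finish


FINISH = dataflow_finish_times(PROGRAM, LATENCY)
print(FINISH)
print("lower bound:", max(FINISH.values()), "cycles")
```

Under these assumptions the longest chain is I3 followed by I6 (5 + 3 = 8 cycles); any real schedule on the processor above can only be longer.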

5
Code and Architecture

 
I1 L.D F3, 0(R0)
I2 ADD.D F2, F2, F3
I3 MUL.D F5, F4, F4
I4 ADD.I R0, R0, 8
I5 L.D F3, 0(R0)
I6 ADD.D F2, F3, F5

ALU OP: 1 cycle
MEM OP: 3 cycles
FP ADD: 3 cycles
FP MULT: 5 cycles

6
Conflicts

 
I1 L.D F3, 0(R0)
I2 ADD.D F2, F2, F3
I3 MUL.D F5, F4, F4
I4 ADD.I R0, R0, 8
I5 L.D F3, 0(R0)
I6 ADD.D F2, F3, F5

RAW I1-I2 on F3
WAW I1-I5 on F3
WAR I2-I5 on F3
WAR I1-I4 on R0
RAW I4-I5 on R0
RAW I5-I6 on F3
RAW I3-I6 on F5
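
The conflicts on this slide can be cross-checked mechanically. The sketch below is one possible pairwise check, not the method used in the course: for every ordered pair of instructions it reports RAW, WAW and WAR on a register, skipping pairs where another instruction in between overwrites that register. The function and variable names are illustrative.

```python
# (label, opcode, destination register, source registers)
PROGRAM = [
    ("I1", "L.D",   "F3", ["R0"]),
    ("I2", "ADD.D", "F2", ["F2", "F3"]),
    ("I3", "MUL.D", "F5", ["F4", "F4"]),
    ("I4", "ADD.I", "R0", ["R0"]),
    ("I5", "L.D",   "F3", ["R0"]),
    ("I6", "ADD.D", "F2", ["F3", "F5"]),
]


def written_between(program, i, j, reg):
    """True if some instruction strictly between positions i and j writes reg."""
    return any(dest == reg for _, _, dest, _ in program[i + 1:j])


def find_hazards(program):
    """Return (kind, earlier, later, register) tuples for all pairs i < j."""
    hazards = []
    for j, (later, _, dest_j, srcs_j) in enumerate(program):
        for i in range(j):
            earlier, _, dest_i, srcs_i = program[i]
            # RAW: the later instruction reads what the earlier one writes.
            if dest_i in srcs_j and not written_between(program, i, j, dest_i):
                hazards.append(("RAW", earlier, later, dest_i))
            # WAW: both instructions write the same register.
            if dest_i == dest_j and not written_between(program, i, j, dest_i):
                hazards.append(("WAW", earlier, later, dest_i))
            # WAR: the later instruction overwrites what the earlier one reads.
            if dest_j in srcs_i and not written_between(program, i, j, dest_j):
                hazards.append(("WAR", earlier, later, dest_j))
    return hazards


HAZARDS = find_hazards(PROGRAM)
for kind, earlier, later, reg in HAZARDS:
    print(f"{kind} {earlier}-{later} on {reg}")
```

With this definition the check reproduces the seven conflicts listed on the slide and additionally reports WAW I2-I6 and WAR I2-I6 on F2, since I2 both reads and writes F2 and I6 writes it again.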

7
Pipeline schema
 
I1 L.D F3, 0(R0)
I2 ADD.D F2, F2, F3
I3 MUL.D F5, F4, F4
I4 ADD.I R0, R0, 8
I5 L.D F3, 0(R0)
I6 ADD.D F2, F3, F5

RAW I1-I2 on F3
WAR I1-I4 on R0
WAR I2-I5 on F3
RAW I5-I6 on F3
RAW I3-I6 on F5

[Figure: pipeline diagram with one row per instruction (I1 to I6), each marked with stages I, E, C]

8
9
The Code
while (i != N)
The program has been compiled in MIPS assembly
The symbols BASEA, BASEB and BASEC are 16-
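L1: lw   $2, BASEA($4)    # $2 = word at BASEA + $4
    addi $2, $2, INC1     # $2 = $2 + INC1
    lw   $3, BASEB($4)    # $3 = word at BASEB + $4
    addi $3, $3, INC2     # $3 = $3 + INC2
    add  $5, $2, $3       # $5 = $2 + $3
    sw   $5, BASEC($4)    # store $5 at BASEC + $4
    addi $4, $4, 4        # advance the byte offset by one word
    bne  $4, $7, L1       # repeat while $4 != $7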
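
To make the data flow of the loop explicit, here is a minimal Python sketch of what the assembly computes, read directly off the instructions above. It assumes the three arrays are word-addressed lists, that `$4` holds a byte offset and `$7` the end offset, and it ignores 32-bit overflow. The function name `run_loop` and its parameters are hypothetical and do not appear in the original exercise.

```python
def run_loop(mem_a, mem_b, inc1, inc2, start, end):
    """Emulate the loop body; start and end play the roles of $4 and $7."""
    mem_c = [0] * len(mem_a)
    offset = start                          # $4: byte offset into the arrays
    while True:
        index = offset // 4                 # one word is 4 bytes
        r2 = mem_a[index] + inc1            # lw $2 ... ; addi $2, $2, INC1
        r3 = mem_b[index] + inc2            # lw $3 ... ; addi $3, $3, INC2
        mem_c[index] = r2 + r3              # add $5, $2, $3 ; sw $5 ...
        offset += 4                         # addi $4, $4, 4
        if offset == end:                   # bne $4, $7, L1 falls through
            break
    return mem_c


RESULT = run_loop([1, 2, 3], [10, 20, 30], inc1=1, inc2=2, start=0, end=12)
print(RESULT)
```

Because the test sits at the bottom of the loop, the body always runs at least once, which is why the sketch uses a loop with an exit check at the end rather than a condition at the top.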
10
Pipelining
  Consider the code to be executed on a 2-issue superscalar MIPS
architecture with static branch prediction BTFNT (BACKWARD TAKEN,
FORWARD NOT TAKEN) and a Branch Target Buffer
  Assume the following optimizations in the pipeline:
  In each issue cycle, at most 1 ALU/BRANCH instruction and
1 LOAD/STORE instruction can be issued
  The register file has 4 read ports and 2 write ports; a single
read and a single write to the same address can both be executed
  Forwarding
  Computation of the PC and of the TARGET ADDRESS for branch and
jump instructions is anticipated in the ID stage
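
The BTFNT rule itself is simple enough to sketch in a few lines: a branch is predicted taken when its target lies at a lower address than the branch (a backward branch, typically a loop), and not taken otherwise. The code below is only an illustration of that rule; the addresses are hypothetical (loop label at 0x00, `bne` as the eighth 4-byte instruction at 0x1C), and the Branch Target Buffer is not modelled.

```python
def predict_taken_btfnt(branch_pc, target_pc):
    """Static BTFNT rule: backward branches are predicted taken."""
    return target_pc < branch_pc


def count_mispredictions(branch_pc, target_pc, outcomes):
    """Count how often the static prediction disagrees with the real outcome."""
    prediction = predict_taken_btfnt(branch_pc, target_pc)
    return sum(1 for taken in outcomes if taken != prediction)


# Hypothetical layout: L1 at address 0x00, bne at address 0x1C.
L1_PC, BNE_PC = 0x00, 0x1C

# A loop of 10 iterations: the branch is taken 9 times, then falls through.
OUTCOMES = [True] * 9 + [False]

print(predict_taken_btfnt(BNE_PC, L1_PC))
print(count_mispredictions(BNE_PC, L1_PC, OUTCOMES))
```

For a loop closed by a backward branch, this static scheme mispredicts only the final, falling-through execution of the branch, regardless of the iteration count.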

11
Open issues…

12
