Академический Документы
Профессиональный Документы
Культура Документы
Computer Architecture
Lecture 8: Pipelining II:
Data and Control Dependence Handling
Prof. Onur Mutlu
Carnegie Mellon University
Spring 2015, 2/2/2015
Single-cycle Microarchitectures
Pipelining
Out-of-Order Execution
Wrap Up Microprogramming
Pipelining
Interlocking
Scoreboarding vs. Combinational
Handling dependences
Data
Control
Data dependence
Control dependence
Review: Interlocking
MIPS acronym?
Data Forwarding/Bypassing
Control dependence
11
Control Dependence
13
Read-after-Write
(RAW)
Anti dependence
r3
r1 op r2
r1
r4 op r5
Write-after-Read
(WAR)
Output-dependence
r3
r1 op r2
r5
r3 op r4
r3
r6 op r7
Write-after-Write
(WAW)
14
No need to detect
15
Summary:
http://www.ece.cmu.edu/~calcm/doku.php?id=seminars:se
minars
16
addi
ra r- -
addi
r- ra -
addi
r- ra -
addi
r- ra -
addi
r- ra -
addi
r- ra -
IF
ID
EX
MEM WB
IF
ID
EX
MEM WB
IF
ID
EX
MEM
IF
ID
EX
IF
?ID
IF
17
LW
SW
Br
read RF
read RF
read RF
read RF
write RF
write RF
Jr
IF
ID
read RF
EX
MEM
WB
Reg Read
iFj
j:rk_
Reg Write
iAj
j:rk_
Reg Write
iOj
stage Y
i:rk_
Reg Write
RAW Dependence
i:_rk
Reg Read
WAR Dependence
i:rk_
Reg Write
WAW Dependence
19
LW
SW
Br
read RF
read RF
read RF
read RF
write RF
write RF
Jr
IF
ID
read RF
EX
MEM
WB
t1
t2
t3
t4
ID
IF
ALU
ID
IF
MEM
ALU
ID
IF
WB
MEM
ALU
ID
ID
IF
IF
t5
WB
MEM
ALU
ID
ALU
ID
IF
ID
IF
IF
ID
WB
MEM
ALU
MEM
ALU
ID
IF
ALU
ID
IF
ID
IF
IF
ALU
WB
MEM
WB
MEM
ALU
ID
MEM
ALU
ID
IF
ALU
ID
IF
ID
IF
dist(i,j)=1
Stall = make the dependent instruction
IF
dist(i,j)=2 wait until its source data value is available
dist(i,j)=3
1. stop all up-stream stages
dist(i,j)=4
2. drain all down-stream stages
21
ID/EX
0
M
u
x
1
WB
Control
IF/ID
EX/MEM
WB
EX
MEM/WB
WB
Add
Add
Add result
Instruction
memory
ALUSrc
Read
register 1
Read
data 1
Read
register 2
Registers Read
Write
data 2
register
Write
data
Zero
ALU ALU
result
0
M
u
x
1
MemtoReg
Address
Branch
Shift
left 2
MemWrite
PC
Instruction
RegWrite
Address
Data
memory
Read
data
Write
data
Instruction 16
[15 0]
Sign
extend
Instruction
[20 16]
Instruction
[15 11]
32
ALU
control
0
M
u
x
1
1
M
u
x
0
MemRead
ALUOp
RegDst
Stall
Based on original figure from [P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
22
Stall Conditions
23
Helper functions
Stall when
or
or
or
or
or
addi
slti
bne
sll
add
lw
lw
slt
beq
.........
addi
j
$s1, $s0, -1
$t0, $s1, 0
$t0, $zero, exit2
$t1, $s1, 2
$t2, $a0, $t1
$t3, 0($t2)
$t4, 4($t2)
$t0, $t4, $t3
$t0, $zero, exit2
3 stalls
3 stalls
3 stalls
3 stalls
3 stalls
3 stalls
$s1, $s1, -1
for2tst
exit2:
26
Remember dataflow?
add
rz r- r-
addi
r- rz r-
IF
ID
EX
MEM WB
IF
ID
ID
EX
ID
MEM
ID
WB
28
29
EX/MEM
MEM/WB
dist(i,j)=3
M
u
x
Registers
ForwardA
M
u
x
internal
forward?
Rs
Rt
Rt
Rd
ALU
dist(i,j)=1
dist(i,j)=2
Data
memory
M
u
x
ForwardB
M
u
x
EX/MEM.RegisterRd
Forwarding
unit
MEM/WB.RegisterRd
dist(i,j)=3
b. With forwarding
[Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
30
EX/MEM
MEM/WB
dist(i,j)=3
M
u
x
Registers
ForwardA
M
u
x
Rs
Rt
Rt
Rd
ALU
dist(i,j)=1
dist(i,j)=2
Data
memory
M
u
x
ForwardB
M
u
x
EX/MEM.RegisterRd
Forwarding
unit
MEM/WB.RegisterRd
b. With forwarding
[Based on original figure from P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]
31
Assumes RF forwards internally
LW
SW
Br
Jr
IF
ID
EX
MEM
use
use
produce
use
use
produce
(use)
use
WB
33
addi
slti
bne
sll
add
lw
lw
slt
beq
.........
addi
j
$s1, $s0, -1
$t0, $s1, 0
$t0, $zero, exit2
$t1, $s1, 2
$t2, $a0, $t1
$t3, 0($t2)
$t4, 4($t2)
$t0, $t4, $t3
$t0, $zero, exit2
3 stalls
3 stalls
3 stalls
3 stalls
3 stalls
3 stalls
$s1, $s1, -1
for2tst
exit2:
34
35
36
Fetch
Decode/RF Access
Address Generation/Execute
Memory
Store Result
Stall on branch
Stall on flow dependence
37
39
40
41
42
43
44
Stall Signals
46
47
Stall Signals
48
Pipelined LC-3b
http://www.ece.cmu.edu/~ece447/s14/lib/exe/fetch.php?m
edia=18447-lc3b-pipelining.pdf
49
50
Questions to Ponder
51
Questions to Ponder
52
Answer: Profiling
53
54
Branch Types
Type
Direction at
fetch time
Number of
When is next
possible next
fetch address
fetch addresses? resolved?
Conditional
Unknown
Execution (register
dependent)
Unconditional
Always taken
Decode (PC +
offset)
Call
Always taken
Decode (PC +
offset)
Return
Always taken
Many
Execution (register
dependent)
Indirect
Always taken
Many
Execution (register
dependent)
56
Insth
Insti
Instj
Instk
Instl
t0
t1
IF
ID
IF
t2
t3
ALU MEM
IF
ID
IF
t4
t5
WB
ALU MEM
IF
ID
IF
WB
ALU
IF
58
Guessing NextPC = PC + 4
Software: Lay out the control flow graph such that the likely
next instruction is on the not-taken path of a branch
Guessing NextPC = PC + 4
61
3 conditional branches