Appendix A - Pipelining
What Is Pipelining
Laundry Example
Ann, Brian, Cathy, Dave
each have one load of clothes
to wash, dry, and fold
Washer takes 30 minutes
Dryer takes 40 minutes
Folding takes 20 minutes
[Figure: sequential laundry timeline from 6 PM to midnight. Loads A, B, C, D run one after another in task order, each taking 30 + 40 + 20 minutes.]
Start work ASAP
[Figure: pipelined laundry timeline starting at 6 PM. Loads A, B, C, D overlap in task order: 30 minutes of washing, four 40-minute drying slots, then 20 minutes of folding.]
Pipelined laundry takes 3.5 hours for 4 loads
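The two laundry timelines can be checked with a short calculation. This is a minimal sketch, assuming the stage times shown on the timeline (30, 40, 20 minutes) and that a finished load can wait between stages; the function names are illustrative.

```python
# Stage times in minutes from the laundry example: wash, dry, fold.
STAGE_MINUTES = [30, 40, 20]

def sequential_minutes(loads, stages=STAGE_MINUTES):
    """Each load finishes every stage before the next load starts."""
    return loads * sum(stages)

def pipelined_minutes(loads, stages=STAGE_MINUTES):
    """Overlap loads; each stage handles one load at a time."""
    free_at = [0] * len(stages)   # when each stage next becomes free
    for _ in range(loads):
        ready = 0                 # when this load leaves the previous stage
        for s, duration in enumerate(stages):
            start = max(ready, free_at[s])
            ready = start + duration
            free_at[s] = ready
    return free_at[-1]

print(sequential_minutes(4) / 60)  # 6.0 hours
print(pipelined_minutes(4) / 60)   # 3.5 hours
```

The pipelined total is set by the slowest stage: after the first wash, the dryer is busy for four back-to-back 40-minute slots.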
Pipelining Lessons
[Figure: pipelined laundry timeline from 6 PM to 9, loads A, B, C, D in task order, with stage times 30, 40, 40, 40, 40, 20 minutes.]
MIPS Without Pipelining
[Figure: unpipelined MIPS datapath with the stages Instruction Fetch; Instr. Decode / Reg. Fetch; Execute / Addr. Calc; Memory Access; Write Back. The IR and LMD registers are labeled.]
Latches between each stage provide pipelining.
[Figure 3.3: pipeline diagram. Instructions in program order each pass through Ifetch, Reg, ALU, DMem, Reg, overlapped in time.]
Pipeline Hurdles
A.1 What is Pipelining?
A.2 The Major Hurdle of Pipelining - Structural Hazards
-- Structural Hazards
Data Hazards
Control Hazards
Pipeline Hurdles
Definition
conditions that lead to incorrect behavior if not fixed
Structural hazard
two different instructions use same h/w in same cycle
Data hazard
two different instructions use same storage
must appear as if the instructions execute in correct order
Control hazard
one instruction affects which instruction is next
Resolution
Pipeline interlock logic detects hazards and fixes them
simple solution: stall
increases CPI, decreases performance
better solution: partial stall
some instructions stall, others proceed
better to stall early than late
Structural Hazards
When two or more different instructions want to use the same hardware resource in the same cycle, e.g., MEM uses the same memory port as IF as shown in this slide.
[Figure 3.6: pipeline diagram over time (clock cycles) for Load, Instr 1, Instr 2, Instr 3, Instr 4 in program order, each passing through Ifetch, Reg, ALU, DMem, Reg.]
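The memory-port conflict can be located mechanically. This is a minimal sketch, assuming a five-stage pipeline that fetches one instruction per cycle and has a single memory port shared by IF and MEM; the function and tuple layout are illustrative.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def memory_port_conflicts(program):
    """program: list of (name, uses_data_memory) in program order.

    Instruction i (0-based) is fetched in cycle i + 1. Returns a list of
    (cycle, memory_instr, fetching_instr) for every cycle in which a MEM
    access and an instruction fetch would need the single port at once.
    """
    conflicts = []
    for i, (name, uses_data_memory) in enumerate(program):
        if not uses_data_memory:
            continue
        mem_cycle = (i + 1) + STAGES.index("MEM")
        fetched = mem_cycle - 1          # index of the instruction fetched then
        if fetched < len(program):
            conflicts.append((mem_cycle, name, program[fetched][0]))
    return conflicts

program = [("Load", True), ("Instr 1", False), ("Instr 2", False),
           ("Instr 3", False), ("Instr 4", False)]
print(memory_port_conflicts(program))  # [(4, 'Load', 'Instr 3')]
```

The load reaches MEM in cycle 4, which is the cycle in which Instr 3 would be fetched.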
Structural Hazards
This is another way of looking at the effect of a stall.
[Figure 3.7: pipeline diagram over time (clock cycles) for Load, Instr 1, Instr 2, Stall, Instr 3 in program order. The stall appears as a row of bubbles moving through the pipeline.]
Structural Hazards
Dealing with Structural Hazards
Stall
low cost, simple
Increases CPI
use for rare case since stalling has performance effect
Pipeline hardware resource
useful for multi-cycle resources
good performance
sometimes complex e.g., RAM
Replicate resource
good performance
increases cost (+ maybe interconnect delay)
useful for cheap or divisible resources
Structural Hazards
Structural hazards are reduced with these rules:
Each instruction uses a resource at most once
Always use the resource in the same pipeline stage
Use the resource for one cycle only
Many RISC ISAs are designed with this in mind
Sometimes very complex to do this. For example, memory of necessity is used in the IF and MEM stages.
Some common Structural Hazards:
Memory - we've already mentioned this one.
Floating point - Since many floating point instructions require many cycles, it's easy for them to interfere with each other.
Starting up more of one type of instruction than there are resources. For instance, the PA-8600 can support two ALU + two load/store instructions per cycle - that's how much hardware it has available.
Data Hazards
A.1 What is Pipelining?
A.2 The Major Hurdle of Pipelining - Structural Hazards
-- Structural Hazards
Data Hazards
Control Hazards
Data Hazards
Read After Write (RAW): Instr J tries to read an operand before Instr I writes it.
Execution order is: Instr I, then Instr J
I: add r1,r2,r3
J: sub r4,r1,r3
Caused by a dependence (in compiler nomenclature).
This hazard results from an actual need for communication.
Data Hazards
Write After Read (WAR): Instr J tries to write an operand before Instr I reads it.
Execution order is: Instr I, then Instr J
I: sub r4,r1,r3
J: add r1,r2,r3
K: mul r6,r1,r7
Called an anti-dependence by compiler writers.
This results from reuse of the name r1.
Can't happen in the MIPS 5-stage pipeline because:
All instructions take 5 stages, and
Reads are always in stage 2, and
Writes are always in stage 5
Data Hazards
Write After Write (WAW): Instr J tries to write an operand before Instr I writes it.
Execution order is: Instr I, then Instr J
I: sub r1,r4,r3
J: add r1,r2,r3
K: mul r6,r1,r7
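The three I/J pairs on the preceding slides differ only in which register is read and which is written. This is a minimal sketch of that classification; the `Instr` tuple and `classify` function are illustrative names, not part of any tool.

```python
from collections import namedtuple

# dest is the register written; srcs are the registers read.
Instr = namedtuple("Instr", "op dest srcs")

def classify(first, second):
    """Name the hazards between two instructions in program order."""
    hazards = []
    if first.dest in second.srcs:
        hazards.append("RAW")   # second reads what first writes
    if second.dest in first.srcs:
        hazards.append("WAR")   # second writes what first reads
    if first.dest == second.dest:
        hazards.append("WAW")   # both write the same register
    return hazards

print(classify(Instr("add", "r1", ("r2", "r3")),
               Instr("sub", "r4", ("r1", "r3"))))  # ['RAW']
print(classify(Instr("sub", "r4", ("r1", "r3")),
               Instr("add", "r1", ("r2", "r3"))))  # ['WAR']
print(classify(Instr("sub", "r1", ("r4", "r3")),
               Instr("add", "r1", ("r2", "r3"))))  # ['WAW']
```

Whether a named hazard can actually cause a wrong result depends on the pipeline: in the 5-stage MIPS pipeline only RAW needs handling.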
Data Hazards
Simple Solution to RAW
Hardware detects RAW and stalls
Assumes register written then read each cycle
+ low cost to implement, simple
-- reduces IPC
Try to minimize stalls
Minimizing RAW stalls
Bypass/forward/short-circuit (we will use the word forward)
Use data before it is in the register
+ reduces/avoids stalls
-- complex
Crucial for common RAW hazards
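The cost of the stall-only solution can be estimated per producer/consumer pair. This is a minimal sketch, assuming the 5-stage pipeline (registers read in stage 2, written in stage 5), no forwarding, and each pair considered in isolation; `write_then_read` models the assumption that a register is written and then read within the same cycle.

```python
ID_STAGE = 2   # registers are read in stage 2
WB_STAGE = 5   # registers are written in stage 5

def raw_stalls(distance, write_then_read=True):
    """Stall cycles needed by a consumer issued `distance` instructions
    after its producer, with no forwarding."""
    # Earliest cycle (relative to the producer's fetch in cycle 1) in
    # which the consumer can read the new value from the register file.
    earliest_read = WB_STAGE if write_then_read else WB_STAGE + 1
    natural_read = ID_STAGE + distance
    return max(0, earliest_read - natural_read)

# Consumers of r1 at distances 1 to 4 behind the producing add.
for name, distance in [("sub", 1), ("and", 2), ("or", 3), ("xor", 4)]:
    print(name, raw_stalls(distance), raw_stalls(distance, write_then_read=False))
# sub 2 3
# and 1 2
# or 0 1
# xor 0 0
```

With forwarding, an ALU result is handed directly to the next instruction's ALU input, so none of these pairs would need a stall.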
Data Hazards
[Figure 3.9: pipeline diagram over time (clock cycles), stages IF, ID/RF, EX, MEM, WB, for the sequence below in program order.]
add r1,r2,r3
sub r4,r1,r3
and r6,r1,r7
or r8,r1,r9
xor r10,r1,r11
The use of the result of the ADD instruction in the next three instructions causes a hazard, since the register is not written until after those instructions read it.
Data Hazards
Forwarding To Avoid Data Hazard
[Figure 3.10: pipeline diagram showing forwarding for the instructions sub r4,r1,r3; and r6,r1,r7; or r8,r1,r9; xor r10,r1,r11.]
Data Hazards
There are some instances where hazards occur, even with forwarding.
[Figure 3.12: pipeline diagram for the instructions sub r4,r1,r6; and r6,r1,r7; or r8,r1,r9.]
Data Hazards
There are some instances where hazards occur, even with forwarding.
[Figure 3.13: pipeline diagram for lw r1, 0(r2); sub r4,r1,r6; and r6,r1,r7; or r8,r1,r9, with bubbles inserted after the load.]
Data Hazards
This is another representation of the stall.

LW  R1, 0(R2)   IF    ID    EX    MEM   WB
SUB R4, R1, R6        IF    ID    EX    MEM   WB
AND R6, R1, R7              IF    ID    EX    MEM   WB
OR  R8, R1, R9                    IF    ID    EX    MEM   WB

LW  R1, 0(R2)   IF    ID    EX    MEM   WB
SUB R4, R1, R6        IF    ID    stall EX    MEM   WB
AND R6, R1, R7              IF    stall ID    EX    MEM   WB
OR  R8, R1, R9                    stall IF    ID    EX    MEM   WB
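A stall chart of this kind can be generated from the program. This is a minimal sketch, assuming full forwarding, so the only interlock is a one-cycle stall when an instruction reads the destination of the load immediately before it; the tuple layout and function names are illustrative.

```python
def _skip_frozen(row, cycle, frozen):
    """Mark frozen cycles as stalls and return the first free cycle."""
    while cycle in frozen:
        row[cycle] = "stall"
        cycle += 1
    return cycle

def schedule(program):
    """program: list of (text, is_load, dest, srcs) in program order.

    Returns one dict per instruction mapping clock cycle -> stage label.
    """
    frozen = set()        # cycles in which IF and ID are held
    rows, prev, next_fetch = [], None, 1
    for instr in program:
        text, is_load, dest, srcs = instr
        row = {}
        cycle = _skip_frozen(row, next_fetch, frozen)
        row[cycle] = "IF"
        next_fetch = cycle + 1
        cycle = _skip_frozen(row, cycle + 1, frozen)
        row[cycle] = "ID"
        cycle += 1
        if prev is not None and prev[1] and prev[2] in srcs:
            frozen.add(cycle)   # load-use interlock: hold for one cycle
        cycle = _skip_frozen(row, cycle, frozen)
        for stage in ("EX", "MEM", "WB"):
            row[cycle] = stage
            cycle += 1
        rows.append(row)
        prev = instr
    return rows

def render(program, rows):
    last = max(max(row) for row in rows)
    lines = []
    for (text, *_), row in zip(program, rows):
        cells = "".join(f"{row.get(c, ''):<6}" for c in range(1, last + 1))
        lines.append(f"{text:<16}{cells}".rstrip())
    return "\n".join(lines)

program = [
    ("lw  r1, 0(r2)", True, "r1", ("r2",)),
    ("sub r4, r1, r6", False, "r4", ("r1", "r6")),
    ("and r6, r1, r7", False, "r6", ("r1", "r7")),
    ("or  r8, r1, r9", False, "r8", ("r1", "r9")),
]
rows = schedule(program)
print(render(program, rows))
```

Only the sub triggers the interlock; the and and the or are delayed by the same frozen cycle because they sit behind it in the pipeline.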
Data Hazards
Pipeline Scheduling
[Figure: bar chart for gcc, spice, and tex comparing scheduled and unscheduled code, axis from 0% to 80%.]
Benchmark   Unscheduled   Scheduled
gcc         54%           31%
spice       42%           14%
tex         65%           25%
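To illustrate what scheduling buys, the sketch below counts load-use stalls in two orderings of the same work. It assumes full forwarding (one stall when an instruction reads the destination of the load directly before it) and uses a hypothetical instruction sequence computing a = b + c and d = e - f; the sequence and register names are not taken from the slides.

```python
def load_use_stalls(program):
    """program: list of (op, dest, srcs) in program order."""
    stalls = 0
    for (op, dest, _), (_, _, srcs) in zip(program, program[1:]):
        if op == "lw" and dest in srcs:
            stalls += 1
    return stalls

# Hypothetical code for a = b + c; d = e - f (address registers omitted).
unscheduled = [
    ("lw", "rb", ()), ("lw", "rc", ()),
    ("add", "ra", ("rb", "rc")), ("sw", None, ("ra",)),
    ("lw", "re", ()), ("lw", "rf", ()),
    ("sub", "rd", ("re", "rf")), ("sw", None, ("rd",)),
]
# Same instructions, reordered so no load is followed by its consumer.
scheduled = [
    ("lw", "rb", ()), ("lw", "rc", ()), ("lw", "re", ()),
    ("add", "ra", ("rb", "rc")), ("lw", "rf", ()),
    ("sw", None, ("ra",)),
    ("sub", "rd", ("re", "rf")), ("sw", None, ("rd",)),
]

print(load_use_stalls(unscheduled))  # 2
print(load_use_stalls(scheduled))    # 0
```

The reordering keeps every dependence intact; it only moves independent instructions into the slot after each load.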
Control Hazards
A.1 What is Pipelining?
A.2 The Major Hurdle of Pipelining - Structural Hazards
-- Structural Hazards
Data Hazards
Control Hazards
Control Hazard on Branches
Three Stage Stall
[Figure: pipeline diagram for a branch and the instructions that follow it, e.g. 18: or r6,r1,r7, each passing through Ifetch, Reg, ALU, DMem, Reg.]
Control Hazards
MIPS Solution:
Move Zero test to ID/RF stage
Adder to calculate new PC in ID/RF stage
must be fast
can't afford to subtract
compares with 0 are simple
Greater-than and less-than test the sign bit, but not-equal must OR all bits
more general compares need ALU
1 clock cycle penalty for branch versus 3
In the next chapter, we'll look at ways to avoid the branch altogether.
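The reason compares with zero are cheap enough for the ID/RF stage can be seen at the bit level. This is a minimal sketch, assuming 32-bit two's-complement registers; the function names are illustrative.

```python
WIDTH = 32
MASK = (1 << WIDTH) - 1

def sign_bit(value):
    """Top bit of the 32-bit two's-complement pattern of value."""
    return ((value & MASK) >> (WIDTH - 1)) & 1

def less_than_zero(value):
    return sign_bit(value) == 1        # one bit, no subtraction

def greater_or_equal_zero(value):
    return sign_bit(value) == 0

def not_equal_zero(value):
    bits = value & MASK
    any_set = 0
    for i in range(WIDTH):             # in hardware: one wide OR gate
        any_set |= (bits >> i) & 1
    return any_set == 1

print(less_than_zero(-5), less_than_zero(7))             # True False
print(not_equal_zero(0), not_equal_zero(1 << 20))        # False True
```

A general compare such as r1 < r2 needs a subtraction, which is why it stays in the ALU.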
Control Hazards
Delayed Branch
Control Hazards
Evaluating Branch Alternatives
$$\text{Pipeline speedup} = \frac{\text{Pipeline depth}}{1 + \text{Branch frequency} \times \text{Branch penalty}}$$
Scheduling scheme    Branch penalty   CPI    Speedup v. unpipelined   Speedup v. stall
Stall pipeline       3                1.42   3.5                      1.0
Predict taken        1                1.14   4.4                      1.26
Predict not taken    1                1.09   4.5                      1.29
Delayed branch       0.5              1.07   4.6                      1.31
65% change PC
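The CPI column follows from the speedup formula. This is a minimal sketch under two assumptions that the table implies but does not state: a branch frequency of 14% (from 1.42 = 1 + f × 3) and a pipeline depth of 5 (from 3.5 ≈ 5 / 1.42). For predict-not-taken, the penalty is paid only by the 65% of branches that change the PC.

```python
PIPELINE_DEPTH = 5        # assumption: 5-stage pipeline
BRANCH_FREQUENCY = 0.14   # assumption: inferred from CPI 1.42 with penalty 3
CHANGE_PC = 0.65          # fraction of branches that change the PC

# Average penalty in cycles per branch for each scheme.
average_penalty = {
    "Stall pipeline": 3,
    "Predict taken": 1,
    "Predict not taken": 1 * CHANGE_PC,
    "Delayed branch": 0.5,
}

def cpi(penalty):
    return 1 + BRANCH_FREQUENCY * penalty

def speedup(penalty):
    return PIPELINE_DEPTH / cpi(penalty)

for scheme, penalty in average_penalty.items():
    print(f"{scheme:18s} CPI = {cpi(penalty):.2f}  "
          f"speedup v. unpipelined = {speedup(penalty):.2f}")
```

The computed speedups agree with the table to within rounding of the last digit; the table's final column is the ratio of each speedup to that of the stall scheme.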
Control Hazards
Pipelining Introduction
Summary
$$\text{Speedup} = \frac{\text{Pipeline Depth}}{1 + \text{Pipeline stall CPI}}$$
Control Hazards
Compiler Static Prediction of Taken/Untaken Branches
[Figure: two bar charts per benchmark (alvinn, compress, doduc, espresso, gcc, hydro2d, mdljsp2, ora, swm256, tomcatv). Panels: Always taken; Taken backwards, Not Taken Forwards. Vertical axes: Misprediction Rate and Frequency of Misprediction.]
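The two static strategies named in the charts can be compared on a branch trace. This is a minimal sketch with a made-up trace (a backward loop branch taken 9 times out of 10 and a forward branch taken 1 time out of 4); the numbers are illustrative and are not the benchmark data.

```python
def always_taken(pc, target):
    return True

def backward_taken(pc, target):
    """Taken backwards, not taken forwards."""
    return target < pc

def misprediction_rate(trace, predictor):
    """trace: list of (pc, target, taken) for executed branches."""
    wrong = sum(1 for pc, target, taken in trace
                if predictor(pc, target) != taken)
    return wrong / len(trace)

# Hypothetical trace: loop branch at 100 -> 40, forward branch at 60 -> 80.
TRACE = ([(100, 40, True)] * 9 + [(100, 40, False)]
         + [(60, 80, False)] * 3 + [(60, 80, True)])

print(misprediction_rate(TRACE, always_taken))    # 4 wrong out of 14
print(misprediction_rate(TRACE, backward_taken))  # 2 wrong out of 14
```

The direction rule helps exactly where forward branches are mostly not taken; a program dominated by taken forward branches would see the opposite effect.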
Control Hazards
Evaluating Static Branch Prediction Strategies
Misprediction ignores frequency of branch
Instructions between mispredicted branches is a better metric
[Figure: instructions between mispredicted branches per benchmark (alvinn, compress, doduc, espresso, gcc, hydro2d, mdljsp2, ora, swm256, tomcatv), Profile-based vs. Direction-based, log scale from 1 to 100000.]
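The metric combines the two quantities that the misprediction rate alone keeps apart. This is a minimal sketch; the two example programs and their numbers are hypothetical.

```python
def instructions_between_mispredictions(branch_frequency, misprediction_rate):
    """Average instruction count from one mispredicted branch to the next.

    branch_frequency: branches per instruction (0..1)
    misprediction_rate: mispredicted branches per branch (0..1)
    """
    return 1.0 / (branch_frequency * misprediction_rate)

# Hypothetical programs: B mispredicts twice as often per branch,
# but has a quarter as many branches.
program_a = instructions_between_mispredictions(0.20, 0.05)
program_b = instructions_between_mispredictions(0.05, 0.10)
print(round(program_a), round(program_b))  # 100 200
```

Program B looks worse by misprediction rate yet runs twice as long between mispredictions, which is what the pipeline actually experiences.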
What Makes Pipelining Hard?
Interrupts cause great havoc!
Examples of interrupts:
Power failing,
Arithmetic overflow,
I/O device request,
OS call,
Page fault
What Makes Pipelining Hard?
Complex Instructions
Multi-Cycle
Operations
Floating Point
[Table: latency and initiation rate for seven floating-point operations.]
Latency           4    8    36    112    2    2    3
Initiation Rate   3    4    35    111    1    1    2
Multi-Cycle
Operations
Floating Point
[Figure: pipeline timing for instructions I through I + 6 over clock cycles 1 to 11, showing stages IF, ID, EX (multiple cycles for some instructions), MEM, WB, with stall cycles marked ---.]
Notes:
I + 2: no WAW, but this complicates an interrupt
I + 4: no WB conflict
I + 5: stall forced by structural hazard
I + 6: stall forced by in-order issue
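One structural hazard that multi-cycle operations introduce is two instructions reaching WB in the same cycle. This is a minimal sketch, assuming one instruction is fetched per cycle with no stalls and a stage layout of IF, ID, a variable number of EX cycles, MEM, WB; the program and instruction names are hypothetical and do not correspond to the rows of the timing figure.

```python
from collections import defaultdict

def write_back_cycles(program):
    """program: list of (name, ex_cycles) in program order.

    Instruction i (0-based) is fetched in cycle i + 1.
    """
    cycles = {}
    for i, (name, ex_cycles) in enumerate(program):
        fetch = i + 1
        # ID is one cycle after fetch, then ex_cycles of EX, then MEM, then WB.
        cycles[name] = fetch + 1 + ex_cycles + 2
    return cycles

def write_back_conflicts(program):
    """Cycles in which more than one instruction would be in WB."""
    by_cycle = defaultdict(list)
    for name, cycle in write_back_cycles(program).items():
        by_cycle[cycle].append(name)
    return {c: names for c, names in by_cycle.items() if len(names) > 1}

# Hypothetical program: C has a 3-cycle EX, the rest take 1 cycle.
program = [("A", 1), ("B", 1), ("C", 3), ("D", 1), ("E", 1)]
print(write_back_conflicts(program))  # {9: ['C', 'E']}
```

With a single register write port, one of the two colliding instructions has to be stalled, which is the kind of stall the notes attribute to a structural hazard.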
Credits
I have not written these notes by myself. There's a great deal of fancy artwork here that takes considerable time to prepare.
I have borrowed from:
Wen-mei & Patel: http://courses.ece.uiuc.edu/ece511/lectures/lecture3.ppt
Patterson: http://www.cs.berkeley.edu/~pattrsn/252S98/index.html
Rabaey: (He used lots of Patterson material):
http://bwrc.eecs.berkeley.edu/Classes/CS252/index.htm
Katz: (Again, he borrowed heavily from Patterson):
http://http.cs.berkeley.edu/~randy/Courses/CS252.F95/CS252.Intro.html
Summary
A.1 What is Pipelining?
A.2 The Major Hurdle of Pipelining-Structural Hazards
Data Hazards
Control Hazards