
COMP 206: Computer Architecture and Implementation

Montek Singh
Wed, Sep 21, 2005 Topic: Pipelining -- Intermediate Concepts (Control Hazards)

Control Hazard
A peculiar kind of RAW hazard involving the program counter
  - PC written by branch instruction
  - PC read by instruction fetch unit (not another instruction)
Possible misbehavior: the instructions fetched and executed after the branch instruction are not the ones specified by the branch instruction

Control Hazard: Example


Unpipelined implementation (stages F D X M W, clock cycles 1-15): each instruction completes all five stages before the next one is fetched. In every F stage the fetch unit reads the PC, fetches the instruction, and sets PC += 4.

Cycle     1      2-5     6     7-10                  11    12-15
F         Br-1           Br                          T
D X M W          Br-1          Br (PC = BTA in M)          T

Pipelined with PNT strategy (clock cycles 1-6): in every cycle the F stage reads the PC, fetches an instruction, and sets PC += 4.

Cycle   1      2      3      4      5               6
F       Br-1   Br     Br+1   Br+2   Br+3            T
D              Br-1   Br     Br+1   Br+2            annul
X                     Br-1   Br     Br+1            annul
M                            Br-1   Br (PC = BTA)   annul
W                                   Br-1            Br

More on Control Hazards


Branch delay: the length of the control hazard
What determines branch delay?
  - We need to know that we have a branch instruction
  - We need to have the BTA
  - We need to know the branch outcome
  - So, we have to wait until we know all of these quantities
An older pipeline (DLX, HP2):
  - computes BTA in EX
  - computes branch outcome in EX
  - changes PC in MEM
To reduce branch delay, these steps are moved to earlier pipeline stages in MIPS (HP3):
  - Can't move up beyond ID (need to know it's a branch instruction)

Reducing Branch Delays


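Example:
      sub  $10, $4, $8
      beq  $10, $3, go
      add  $12, $2, $5
      ...
go:   lw   $4, 16($12)

[Figure: pipeline datapath between the IF/ID and ID/EX registers, showing the PC, instruction memory, register file (Rs, Rt, Rd; outputs Bus A and Bus B), sign extend of Imm, shift left by 2 (<<2), two adders (+), an equality comparator (=), and the Control & Hazards unit.]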

Dealing with Branch Delays


Four strategies
  - Stall
  - Predict Taken, variation A (PTA)
  - Predict Taken, variation B (PTB)
  - Predict Not Taken (PNT)
Consider a hypothetical 12-stage pipeline
  - Instruction is fetched in stage 1 (IF)
  - Opcode becomes known in stage 2 (ID)
  - BTA becomes known in stage 4
  - Branch outcome becomes known in stage 6
Parameters
  - PU, PT, PNT: penalties of unconditional branch, taken branch, untaken branch
  - T: probability of branch being taken

Stall Strategy: 12-Stage Pipeline

[Chart: pipeline occupancy by clock cycle (1-29) and stage (1-12), with opcode known in stage 2, BTA in stage 4, and branch outcome in stage 6. Instruction 2 waits in stage 1 until branch 1 resolves in clock 6; after taken branch 8 resolves in clock 17, target instructions (u) are fetched from clock 18.]

Pipeline stalls on all branches
Instructions 1 and 8 are branches
  - 1 is not taken, 8 is taken
Opcode determination in stage 2 stalls pipeline
Branch outcome determination in stage 6 restarts pipeline from IF or ID
BTA determination in stage 4 would restart pipeline from IF for jumps
PU = 3, PT = 5, PNT = 4

PNT Strategy: 12-Stage Pipeline

[Chart: pipeline occupancy by clock cycle (1-29) and stage (1-12), with opcode known in stage 2, BTA in stage 4, and branch outcome in stage 6. Instructions flow without bubbles after untaken branch 1; taken branch 12 resolves in clock 17, instructions 13-17 are cancelled, and target instructions (u) are fetched from clock 18.]

Pipeline continues execution assuming that the branch will fall through
Instructions 1 and 12 are branches
  - 1 is not taken, 12 is taken
Branch outcome determination in stage 6 restarts pipeline from IF for taken branches (cancelling instructions already in pipeline)
BTA determination in stage 4 would restart pipeline from IF for jumps
PU = 3, PT = 5, PNT = 0

PTA Strategy: 12-Stage Pipeline

[Chart: pipeline occupancy by clock cycle (1-29) and stage (1-12), with opcode known in stage 2, BTA in stage 4, and branch outcome in stage 6. Once the BTA of untaken branch 1 is known in clock 4, instructions 2-4 are cancelled and u, u+1 are fetched in clocks 5-6; the outcome in clock 6 restarts fetch at instruction 2 in clock 7. For taken branch 7, target instructions (v, v+1, v+2) are fetched from clock 16.]

Pipeline predicts all branches to be taken and restarts pipeline from IF at BTA as soon as BTA is known (cancelling instructions already in pipe)
Instructions 1 and 7 are branches
  - 1 is not taken, 7 is taken
Branch outcome determination in stage 6 restarts pipeline from IF for untaken branches (cancelling instructions already in pipeline)
PU = 3, PT = 3, PNT = 5



PTB Strategy: 12-Stage Pipeline

[Chart: pipeline occupancy by clock cycle (1-29) and stage (1-12), with opcode known in stage 2, BTA in stage 4, and branch outcome in stage 6. Once the BTA of untaken branch 1 is known in clock 4, u and u+1 are fetched in clocks 5-6 while instructions 2-4 stay in the pipeline; the outcome in clock 6 resumes fetch at instruction 5 in clock 7. For taken branch 10, target instructions (v, v+1, v+2) are fetched from clock 16.]

Pipeline predicts all branches to be taken and starts fetching from BTA as soon as it is known in stage 4 (but without cancelling instructions already in pipeline)
Instructions 1 and 10 are branch instructions
  - 1 is not taken, 10 is taken
Branch outcome determination in stage 6 restarts pipeline from IF on fall-through path (for untaken branches), and causes cancellation
PU = 3, PT = 3, PNT = 2
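
The penalty triples quoted for the four strategies can be reproduced from the three stage numbers alone. The following is a minimal Python sketch of that bookkeeping; the function name `penalties` and the closed-form expressions are an illustrative reading of the charts, not formulas stated in the lecture, and they assume fetch happens in stage 1 with one instruction entering the pipeline per clock.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Penalties:
    pu: int   # unconditional branch (jump)
    pt: int   # taken conditional branch
    pnt: int  # untaken conditional branch


def penalties(strategy: str, opcode: int = 2, bta: int = 4, outcome: int = 6) -> Penalties:
    """Penalty cycles, given the stage in which each fact becomes known.

    Assumed model (hypothetical helper, inferred from the slide values).
    """
    # In every strategy a jump redirects fetch once its BTA is known.
    pu = bta - 1
    if strategy == "stall":
        # Fetch is frozen from opcode decode until the outcome is known.
        # Taken: restart from IF. Untaken: the instruction already fetched
        # simply resumes, so only the frozen cycles are lost.
        return Penalties(pu, outcome - 1, outcome - opcode)
    if strategy == "pnt":
        # Fall-through instructions keep flowing; only taken branches pay.
        return Penalties(pu, outcome - 1, 0)
    if strategy == "pta":
        # Redirect to the BTA and cancel; an untaken branch restarts from IF.
        return Penalties(pu, bta - 1, outcome - 1)
    if strategy == "ptb":
        # Redirect to the BTA without cancelling; an untaken branch only
        # loses the target instructions fetched before the outcome.
        return Penalties(pu, bta - 1, outcome - bta)
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    for name in ("stall", "pnt", "pta", "ptb"):
        print(name, penalties(name))
```

With the default stage numbers (2, 4, 6) the sketch returns the same PU, PT, and PNT values listed on the four strategy slides; changing the stage arguments shows how moving a decision earlier shrinks the corresponding penalty.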


Effect of Control Hazards on Pipelines


$$\text{Pipeline speedup} = \frac{\text{Pipeline depth}}{1 + \dfrac{\text{Branch frequency} \times \text{Branch penalty}}{\text{Ideal CPI}}}$$

Assume that 20% of all instructions are transfers of control, split 5% for unconditional jumps and 15% for conditional branches. For each of the four branching schemes for the 12-stage pipeline, determine the branch penalty as a function of T, the probability of a branch being taken.

$$\text{Branch penalty} = \frac{5}{20} \times PU + \frac{15}{20} \times \left[ T \times PT + (1 - T) \times PNT \right]$$
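
The weighted-average penalty is easy to evaluate numerically. The sketch below is a minimal Python version under stated assumptions: the function names are hypothetical, the PU/PT/PNT triples are the ones quoted for the 12-stage pipeline, the speedup relation is read as depth / (1 + branch frequency × branch penalty / ideal CPI), and T = 0.6 in the demo loop is an arbitrary illustrative value.

```python
# (PU, PT, PNT) for the 12-stage pipeline, as listed on the strategy slides.
STRATEGIES = {
    "Stall": (3, 5, 4),
    "PTA": (3, 3, 5),
    "PTB": (3, 3, 2),
    "PNT": (3, 5, 0),
}


def branch_penalty(t, pu, pt, pnt, jump_share=5 / 20, cond_share=15 / 20):
    """Average penalty per control transfer for taken-probability t."""
    return jump_share * pu + cond_share * (t * pt + (1 - t) * pnt)


def pipeline_speedup(depth, branch_freq, penalty, ideal_cpi=1.0):
    """Speedup over the unpipelined machine when only branches cause stalls."""
    return depth / (1 + branch_freq * penalty / ideal_cpi)


if __name__ == "__main__":
    t = 0.6  # illustrative value only, not from the lecture
    for name, (pu, pt, pnt) in STRATEGIES.items():
        p = branch_penalty(t, pu, pt, pnt)
        s = pipeline_speedup(depth=12, branch_freq=0.20, penalty=p)
        print(f"{name}: penalty {p:.3f} cycles, speedup {s:.2f}")
```

Sweeping `t` from 0 to 1 gives one straight line per strategy, which makes it simple to see where one scheme overtakes another.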


Solution for 12-Stage Pipeline

Stall: 0.25*3 + 0.75*(T*5 + (1-T)*4) = 3.75 + 0.75T
PTA:   0.25*3 + 0.75*(T*3 + (1-T)*5) = 4.5 - 1.5T
PTB:   0.25*3 + 0.75*(T*3 + (1-T)*2) = 2.25 + 0.75T
PNT:   0.25*3 + 0.75*(T*5 + (1-T)*0) = 0.75 + 3.75T

[Plot: "Branch penalties, 12-stage pipeline": penalty (0 to 5) versus branch probability T (0 to 1), one line each for Stall (3.75 to 4.5), PTA (4.5 to 3), PTB (2.25 to 3), and PNT (0.75 to 4.5).]


Delayed Branches on MIPS


One branch delay slot on MIPS
Always execute instruction in branch delay slot (irrespective of branch outcome)
Question: What instruction do we put in the branch delay slot?
  - Fill with NOP (always possible, penalty = 1)
  - Fill from before (not always possible, penalty = 0)
  - Fill from target (not always possible, penalty = 1-T)
      - BTA is dynamic
      - BTA is another branch
  - Fill from fall-through (not always possible, penalty = T)


Details of Various Branch Flavors


[Figure: control-flow graph. Block A B C D ends in branch X: cond; the true edge leads to E F G H, the false edge to M N P Q.]

Code layout, ordinary branch:
  A:  B:  C:  D:
  X:  if cond goto E
  M:  N:  P:  Q:
  E:  F:  G:  H:

Code layout, delayed branch filled from before:
  A:  B:  C:
  X:  if cond goto E
  D:
  M:  N:  P:  Q:
  E:  F:  G:  H:

Executed instruction sequences:

Type of branch   Filled with     cond = true             cond = false
Ordinary         -               A B C D X E F G H       A B C D X M N P Q
Delayed          NOP             A B C D X nop E F G H   A B C D X nop M N P Q
Delayed          Before          A B C X D E F G H       A B C X D M N P Q
Delayed          Target          A B C D X E F G H       A B C D X E M N P Q
Delayed          Fall-through    A B C D X M E F G H     A B C D X M N P Q
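
The executed sequences for each branch flavor can be generated mechanically, which also shows where the wasted delay-slot cycle appears. This is a minimal Python sketch using the block labels from the slide; the function names `trace` and `wasted` and the flavor strings are hypothetical, chosen for illustration.

```python
BEFORE = ["A", "B", "C", "D"]   # instructions ahead of branch X
TAKEN = ["E", "F", "G", "H"]    # path when cond is true
FALL = ["M", "N", "P", "Q"]     # path when cond is false


def trace(fill: str, taken: bool) -> list[str]:
    """Instruction sequence executed for a given delay-slot fill."""
    path = TAKEN if taken else FALL
    if fill == "ordinary":        # no delay slot at all
        return BEFORE + ["X"] + path
    if fill == "nop":
        return BEFORE + ["X", "nop"] + path
    if fill == "before":          # D moves from before X into the slot
        return BEFORE[:-1] + ["X", "D"] + path
    if fill == "target":          # E sits in the slot, branch goes to F
        return BEFORE + ["X", "E"] + (TAKEN[1:] if taken else FALL)
    if fill == "fall-through":    # M sits in the slot, fall-through resumes at N
        return BEFORE + ["X", "M"] + (TAKEN if taken else FALL[1:])
    raise ValueError(f"unknown fill: {fill}")


def wasted(fill: str, taken: bool) -> int:
    """Extra instruction slots compared with an ordinary branch."""
    return len(trace(fill, taken)) - len(trace("ordinary", taken))


if __name__ == "__main__":
    for fill in ("ordinary", "nop", "before", "target", "fall-through"):
        for taken in (True, False):
            print(f"{fill:13s} cond={taken!s:5s}", " ".join(trace(fill, taken)))
```

Counting the wasted slots reproduces the delay-slot penalties: always 1 for a NOP, 0 when filled from before, 1 only when untaken for a target fill, and 1 only when taken for a fall-through fill.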



Instruction Sequence Alteration Strategies


To allow for more aggressive filling of branch delay slot from target or fall-through, we can selectively cancel instructions
Classification of branches
  - Delayed branch
      - Instruction in branch delay slot is always executed
  - Plain branch
      - Instruction in branch delay slot is cancelled if branch is taken
      - Useful if compiler filled branch delay slot from fall-through
  - Canceling (annulling, nullifying) branch
      - Instruction in branch delay slot is cancelled if branch is not taken
      - Useful if compiler filled branch delay slot from target
Should not cancel instruction if it may cause exception
A bit in the instruction set by compiler makes the choice
  - MIPS, SPARC, PA-RISC: delayed (0), canceling (1)
  - M 88000, i860: delayed (0), plain (1)
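
The three branch classes differ only in when the delay-slot instruction is allowed to complete. A minimal Python sketch of that rule follows; the function name `slot_executes` and the flavor strings are hypothetical labels for illustration.

```python
def slot_executes(flavor: str, taken: bool) -> bool:
    """Return True if the delay-slot instruction is executed, not cancelled."""
    if flavor == "delayed":
        return True            # always executed
    if flavor == "plain":
        return not taken       # cancelled if the branch is taken
    if flavor == "canceling":
        return taken           # cancelled if the branch is not taken
    raise ValueError(f"unknown branch flavor: {flavor}")


if __name__ == "__main__":
    for flavor in ("delayed", "plain", "canceling"):
        for taken in (True, False):
            print(flavor, "taken" if taken else "untaken",
                  "executes" if slot_executes(flavor, taken) else "cancelled")
```

This matches the compiler's choice of fill: a fall-through instruction is only useful on the untaken path (plain), and a target instruction only on the taken path (canceling).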

Example: Branch Penalties


Consider a DLX pipeline with a single branch delay slot in which 25% of branches are unconditional. 50% of the unconditional branches have their delay slots filled from before, 40% from the target, and 10% with NOPs. The branch delay slots of the conditional branches are filled from various sources as shown in the table below, depending on the kind of branch used. For each of the cases, determine the branch penalty as a function of T, the probability that a conditional branch is taken. How do these penalties compare to those obtained by using a Stall, PT, or PNT strategy?

Branch flavor   Before   Target   Fall-through   NOP
Delayed         30%      20%      10%            40%
Plain           30%      0%       40%            30%
Canceling       30%      40%      0%             30%

For all of Stall, PT, and PNT on DLX: PU = 1, PT = 1, PNT = 0


Solution: Branch Penalties


Branch flavor   Before       Target           Fall-through   NOP          Conditional (75%)   Unconditional (25%)   Total
Delayed         30%: 0.3*0   20%: 0.2*(1-T)   10%: 0.1*T     40%: 0.4*1   0.6-0.1*T           0.1                   0.475-0.075*T
Plain           30%: 0.3*0   0%: 0.0*(1-T)    40%: 0.4*T     30%: 0.3*1   0.3+0.4*T           0.1                   0.25+0.3*T
Canceling       30%: 0.3*0   40%: 0.4*(1-T)   0%: 0.0*T      30%: 0.3*1   0.7-0.4*T           0.1                   0.55-0.3*T
Stall/PT/PNT                                                                                                        0.25+0.75T
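
The totals can be checked by weighting each fill source's penalty by its share. The following is a minimal Python sketch; the dictionary and function names are hypothetical, the fill percentages come from the example statement, and the unconditional branches are treated as always taken (T = 1), which is what yields their 0.1 penalty.

```python
# Delay-slot penalty for each fill source, as a function of T.
FILL_PENALTY = {
    "before": lambda t: 0.0,
    "target": lambda t: 1.0 - t,
    "fall_through": lambda t: t,
    "nop": lambda t: 1.0,
}

# Share of conditional-branch delay slots filled from each source.
COND_MIX = {
    "delayed": {"before": 0.3, "target": 0.2, "fall_through": 0.1, "nop": 0.4},
    "plain": {"before": 0.3, "target": 0.0, "fall_through": 0.4, "nop": 0.3},
    "canceling": {"before": 0.3, "target": 0.4, "fall_through": 0.0, "nop": 0.3},
}

# Share of unconditional-branch delay slots filled from each source.
UNCOND_MIX = {"before": 0.5, "target": 0.4, "fall_through": 0.0, "nop": 0.1}


def mix_penalty(mix, t):
    """Average penalty of one branch class for taken-probability t."""
    return sum(share * FILL_PENALTY[src](t) for src, share in mix.items())


def total_penalty(flavor, t, uncond_share=0.25):
    """Overall branch penalty for a given conditional-branch flavor."""
    uncond = mix_penalty(UNCOND_MIX, 1.0)  # unconditional: always taken
    cond = mix_penalty(COND_MIX[flavor], t)
    return uncond_share * uncond + (1 - uncond_share) * cond


def stall_pt_pnt_penalty(t, pu=1, pt=1, pnt=0, uncond_share=0.25):
    """Reference penalty for Stall, PT, or PNT with the DLX values above."""
    return uncond_share * pu + (1 - uncond_share) * (t * pt + (1 - t) * pnt)


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0):
        row = {f: round(total_penalty(f, t), 4) for f in COND_MIX}
        print(t, row, round(stall_pt_pnt_penalty(t), 4))
```

Evaluating at T = 0 and T = 1 gives the endpoints of the four lines in the comparison, so the crossover between flavors can be read off directly.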

[Plot: "Branch penalties, MIPS": branch penalty (0 to 1.2) versus T (0 to 1), one line each for Delayed (0.475 to 0.4), Plain (0.25 to 0.55), Canceling (0.55 to 0.25), and Stall/PT/PNT (0.25 to 1).]
