
Speculation and Future-Generation Computer Architecture

Guri Sohi

University of Wisconsin — Madison


URL: http://www.cs.wisc.edu/~sohi
Outline

• Computer architecture and speculation
  • control, dependence, value speculation
• Multiscalar: next generation microarchitecture

Guri Sohi Speculation and Future-Generation Computer Architecture Slide


2
Processing: Big Picture

• Sequence through static representation of program to create dynamic stream of operations
• Execute operations in the dynamic stream
  • Determine dependence relationships
  • Schedule operations for execution
  • Execute operations
  • Communicate values

Role of Computer Architect

• Use available technology to perform processing tasks
• Match processing tasks to hardware blocks constructed from available technology
• Do so in a manner that is easy to design/verify

Constraints for the Architect

• Program ordering constraints, a.k.a. dependences
  • Control
  • Artificial (i.e., name or storage)
  • Ambiguous
  • True

• Increased latencies manifest as performance-degrading stalls through program dependences
• Need for architect to develop techniques to overcome dependences

Speculation and Computer Architecture

Speculation: “.. to assume a business risk in hope of gain” -- Webster

• Speculation in computer architecture is used to try to overcome constraining conditions

Speculation and Computer Architecture

• Speculate outcome of event rather than waiting for outcome to be known
  • mis-speculation if wrong
  • mis-speculation can have penalty
• Develop techniques to speculate better

Memory Hierarchies

• Technology trend: small and fast, or large and slow
• Can I get fast with appearance of large?

• Speculate: going to use referenced data item again (temporal locality)
• Speculate: going to reference data items in proximity to referenced item (spatial locality)
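
The two locality bets can be made concrete with a toy cache model. The following is a minimal sketch, not a description of any real cache: the direct-mapped organization, the sizes NUM_LINES and LINE_BYTES, and the function name count_hits are all assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_LINES  64   /* hypothetical: 64 cache lines          */
#define LINE_BYTES 32   /* hypothetical: 32 bytes per cache line */

/* Count hits for a trace of byte addresses on a direct-mapped cache. */
size_t count_hits(const uint32_t *trace, size_t n)
{
    uint32_t tags[NUM_LINES] = {0};
    int valid[NUM_LINES] = {0};
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t block = trace[i] / LINE_BYTES;  /* memory line holding the address */
        uint32_t index = block % NUM_LINES;      /* cache slot the line maps to     */
        uint32_t tag   = block / NUM_LINES;      /* distinguishes lines in one slot */

        if (valid[index] && tags[index] == tag) {
            hits++;              /* an earlier bet on reuse or proximity paid off */
        } else {
            valid[index] = 1;    /* miss: keep the whole line, betting that it    */
            tags[index] = tag;   /* or its neighbours will be referenced again    */
        }
    }
    return hits;
}
```

In this model, re-referencing address 0 hits because the line was kept (temporal locality), and references to addresses 4, 8 and 12 hit because they share the line fetched for address 0 (spatial locality). Two addresses that map to the same slot evict each other, which is the mis-speculation case.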

Processor Generations

[Figure: Generation 1, Generation 2, Generation 3, Generation 4, ...]

Superscalar (Generation 3)

• Technology allows large execution engines to be built
• Feeding execution engines is a problem
  • branch instructions constrain program sequencing and instruction fetching
  • waiting for branch outcome wasteful

• Speculate the outcome of a branch (control speculation)
• Use mechanisms to improve accuracy of speculation: branch predictors
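
As one illustration of a branch predictor, here is a minimal sketch of a table of 2-bit saturating counters indexed by branch address. The slides do not say which predictor is meant; the table size, the indexing, and all names (BranchPredictor, bp_init, bp_predict, bp_update, bp_run) are assumptions.

```c
#include <stddef.h>

#define TABLE_SIZE 1024   /* hypothetical number of counters */

typedef struct {
    unsigned char counter[TABLE_SIZE];  /* 0,1 = predict not taken; 2,3 = predict taken */
} BranchPredictor;

/* Start every counter at "weakly not taken". */
void bp_init(BranchPredictor *bp)
{
    for (size_t i = 0; i < TABLE_SIZE; i++)
        bp->counter[i] = 1;
}

/* Speculate the outcome of the branch at address pc (1 = taken). */
int bp_predict(const BranchPredictor *bp, unsigned pc)
{
    return bp->counter[(pc >> 2) % TABLE_SIZE] >= 2;
}

/* Train on the real outcome once it is known; counters saturate at 0 and 3. */
void bp_update(BranchPredictor *bp, unsigned pc, int taken)
{
    unsigned char *c = &bp->counter[(pc >> 2) % TABLE_SIZE];
    if (taken) {
        if (*c < 3) (*c)++;
    } else {
        if (*c > 0) (*c)--;
    }
}

/* Replay a sequence of outcomes for one branch; return the number predicted correctly. */
size_t bp_run(BranchPredictor *bp, unsigned pc, const int *outcomes, size_t n)
{
    size_t correct = 0;
    for (size_t i = 0; i < n; i++) {
        if (bp_predict(bp, pc) == (outcomes[i] != 0))
            correct++;
        bp_update(bp, pc, outcomes[i] != 0);
    }
    return correct;
}
```

For a loop branch taken nine times and then not taken, this sketch mispredicts only the first and the last outcome; the two-bit hysteresis keeps the single loop exit from flipping the prediction for the next run of the loop.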

Software or Hardware?

• Speculation can be done in both hardware and software
• Software speculation requires instruction set (ISA) support
  - problematic if ISA change not practical
  + requires little hardware support
• Hardware speculation does not require ISA support
  + better if business reasons preclude changing ISA
  - requires more hardware

Hardware Speculation Mechanisms

• Need to give appearance of sequential execution
• Need to recover from mis-speculations
  • history buffers: keep checkpoint, and back up to checkpoint
  • reorder buffers: do not update (architectural) state until speculation outcome is known

• Speculation improvement mechanisms
  • branch predictors
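
The reorder-buffer idea above can be sketched in a few lines: speculative results wait in a queue and reach architectural state only at commit, so recovery amounts to discarding queue entries. This is a simplified model under stated assumptions; the sizes and the names Machine, rob_issue, rob_commit and rob_squash are hypothetical, and real designs track far more per entry.

```c
#include <stddef.h>
#include <string.h>

#define NUM_REGS 8    /* hypothetical architectural register count */
#define ROB_SIZE 16   /* hypothetical reorder buffer capacity      */

typedef struct {
    int dest;    /* destination register number */
    int value;   /* speculatively computed result */
} RobEntry;

typedef struct {
    int regs[NUM_REGS];           /* architectural state: written only at commit */
    RobEntry entries[ROB_SIZE];   /* circular queue in program order             */
    size_t head;                  /* oldest uncommitted entry                    */
    size_t count;                 /* number of entries in flight                 */
} Machine;

void machine_init(Machine *m)
{
    memset(m, 0, sizeof *m);
}

/* Record a speculative result without touching regs; returns 0 if the buffer is full. */
int rob_issue(Machine *m, int dest, int value)
{
    if (m->count == ROB_SIZE)
        return 0;
    size_t tail = (m->head + m->count) % ROB_SIZE;
    m->entries[tail].dest = dest;
    m->entries[tail].value = value;
    m->count++;
    return 1;
}

/* Speculation confirmed: retire up to n of the oldest entries, in program order. */
void rob_commit(Machine *m, size_t n)
{
    for (; n > 0 && m->count > 0; n--) {
        const RobEntry *e = &m->entries[m->head];
        m->regs[e->dest] = e->value;
        m->head = (m->head + 1) % ROB_SIZE;
        m->count--;
    }
}

/* Mis-speculation: drop the n youngest entries; regs never saw them. */
void rob_squash(Machine *m, size_t n)
{
    m->count = (n > m->count) ? 0 : m->count - n;
}
```

Because regs is written only in rob_commit, a squash needs no undo step. A history buffer makes the opposite choice: update state eagerly and keep old values so that it can back up to a checkpoint.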

Observation

• Mechanisms rely on ability to give appearance of sequential execution
• Appearance of sequential execution implies appearance that no ordering constraints were violated
• Basic (recovery) mechanisms for supporting control speculation can be used to support arbitrary forms of speculative execution!

Out-of-Order Processing

Processing Phase                         Program Form
                                         static program
instruction fetch & branch prediction
                                         dynamic instruction stream
dependence checking & dispatch
                                         execution window
instruction issue                        (Execution Wavefront)
instruction execution
instruction reorder & commit
                                         completed instructions

Generic Superscalar Processor

[Figure: block diagram with pre-decode, instr. cache, instr. buffer, decode/rename & dispatch, floating pt. and integer/address instruction buffers, floating pt. and integer register files, functional units, functional units and data cache, memory interface, re-order buffer]

Technology Trends

• Wires used to pass values
• Wires getting relatively slower
• Short wires for fast clock
• Short wires imply localized communication

Alpha 21264

[Figure: Reservation Stations feeding two clusters, each with a register file (Reg File0, Reg File1) and two Exec units; 1 CP delay between clusters]

Decentralized Scheduling

• Scheduler in one cluster needs dependence information for instructions in other cluster
  • problematic for memory operations
  • knowledge of memory operations needed
  • address calculations needed
• Can we work without knowledge of memory operations/addresses, i.e., ambiguous dependences?
• Use data dependence speculation

Scheduling Memory Operations

• Data dependence speculation is the default
  • predict no dependences
• Improving accuracy of data dependence prediction
  - akin to branch prediction for control dependences
• Track history of dependence mis-speculations
  • small number of static dependence pairs
  • exhibit temporal locality
• Use history for future data dependence speculation/synchronization decisions
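
One way to picture the history mechanism is a small table of (load, store) pairs that have mis-speculated before. The sketch below is an illustration only, not the predictor used in any particular design: the fully searched FIFO table, its size, and the names DepPredictor, mdp_should_wait and mdp_record_violation are assumptions.

```c
#include <stddef.h>
#include <string.h>

#define MDP_ENTRIES 64   /* hypothetical table size */

typedef struct {
    unsigned load_pc;    /* static load that executed too early */
    unsigned store_pc;   /* static store it actually depended on */
    int valid;
} DepPair;

typedef struct {
    DepPair table[MDP_ENTRIES];
    size_t next;         /* next slot to overwrite (FIFO replacement) */
} DepPredictor;

void mdp_init(DepPredictor *p)
{
    memset(p, 0, sizeof *p);
}

/* Returns 1 if the load should synchronize with the store, 0 to speculate. */
int mdp_should_wait(const DepPredictor *p, unsigned load_pc, unsigned store_pc)
{
    for (size_t i = 0; i < MDP_ENTRIES; i++) {
        const DepPair *d = &p->table[i];
        if (d->valid && d->load_pc == load_pc && d->store_pc == store_pc)
            return 1;    /* this pair has mis-speculated before */
    }
    return 0;            /* default: predict no dependence */
}

/* Called after a dependence mis-speculation is detected: remember the pair. */
void mdp_record_violation(DepPredictor *p, unsigned load_pc, unsigned store_pc)
{
    if (mdp_should_wait(p, load_pc, store_pc))
        return;          /* already tracked */
    DepPair *d = &p->table[p->next];
    d->load_pc = load_pc;
    d->store_pc = store_pc;
    d->valid = 1;
    p->next = (p->next + 1) % MDP_ENTRIES;
}
```

A small table can suffice because, as the bullets note, only a small number of static dependence pairs cause mis-speculations and they tend to recur.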

Overcoming True Dependences

• Break true dependence using value speculation
  • predict the outcome of an operation (e.g., 32 bits)
• Mechanisms to support other forms of speculation (e.g., control speculation) can be used to recover
• Subject of significant current research
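
A simple form of value speculation is to guess that an instruction will produce the same 32-bit result it produced last time. The following last-value predictor is a sketch chosen for illustration; the slides do not name a scheme, and the table size and the names ValuePredictor, vp_predict and vp_verify are assumptions.

```c
#include <stdint.h>
#include <string.h>

#define VP_ENTRIES 256   /* hypothetical table size */

typedef struct {
    uint32_t last[VP_ENTRIES];   /* last result seen per table slot */
    int valid[VP_ENTRIES];
} ValuePredictor;

void vp_init(ValuePredictor *vp)
{
    memset(vp, 0, sizeof *vp);
}

/* Returns 1 and stores a guess if a prediction exists for the instruction at pc. */
int vp_predict(const ValuePredictor *vp, unsigned pc, uint32_t *guess)
{
    unsigned idx = (pc >> 2) % VP_ENTRIES;
    if (!vp->valid[idx])
        return 0;
    *guess = vp->last[idx];
    return 1;
}

/* Once the real result is known: train the table and report whether the
 * guess was right. A wrong guess means dependents must be re-executed. */
int vp_verify(ValuePredictor *vp, unsigned pc, uint32_t guess, uint32_t actual)
{
    unsigned idx = (pc >> 2) % VP_ENTRIES;
    vp->last[idx] = actual;
    vp->valid[idx] = 1;
    return guess == actual;
}
```

Dependent operations would start with the guess instead of waiting; when vp_verify reports a mismatch, the same recovery machinery used for control speculation discards the work that consumed the wrong value.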

Multiscalar Processors

• Paradigm for Generation 4 processors
• Proposed in early 1990s for 2000s processors
• Employs “thread-level speculation”
• First commercial example of concept: Space Time Computing (STC) in Sun MAJC

Processing Hardware: Big Picture

• Start with a static representation of a program
• Sequence through the program to generate the dynamic stream of operations
  • Use single PC to walk through static representation?

• Execute operations in dynamic stream
  • Schedule operations for execution
  • Execute operations
  • Communicate values

[Figure: PROGRAM divided into regions A, B, C]

Basic Issues

• Sequencing
• Scheduling
• Operation execution
• Operand communication

• How do we sequence, schedule, execute, and communicate in a more powerful manner?

Trends

• Lots of transistors available
• Wires getting slower
• Design complexity getting unwieldy
• Verification complexity getting unwieldy
• Software complexity getting unwieldy

Hardware Wish List

• Simplify engineering (design, verification, testing)
• Use of simple, regular hardware structures
  • clock speeds comparable to single-issue processors
• “Locality” of interconnect
• Easy growth path from one generation to next
  • reuse existing processing cores
• No centralized bottlenecks
• Still build high-performance processor
  • speed up execution of single program

The “Hardware-Influenced” Solution

• Take current generation processor
• Replicate some parts, share others
• Have next generation processor
• Different units can sequence, schedule, etc. in parallel

BUT, the software problem .......

The Software Problem

• Can’t always break up single program into “independent” chunks (i.e., multiple sequencers) statically
  • control dependences
  • data dependences (especially ambiguous ones)
  • also load balance

• Can’t map program onto rigid hardware model

Hardware/Software Cooperation

• Take “mostly sequential” static program
• Use speculation to overcome dependence limitations
• When in doubt, speculate
• Break up program into “potentially independent” chunks dynamically

Sequencing

• Unraveling the operations to be executed dynamically
• Use 2-level sequencing
  • sequence high level in task-sized steps
  • sequence within task (task == thread)
  • vectors?
• Use control flow speculation to increase sequencing power
  • overcome “stalls”

[Figure: PROGRAM divided into regions A, B, C]

Scheduling

• Use multiple schedulers to improve scheduling power
• Use data dependence speculation to overcome scheduling limitations
  - ambiguous dependences
• Use value speculation to overcome scheduling limitations
  - true dependences

[Figure: PROGRAM divided into regions A, B]

Operand Communication

• Values bound to registers and memory
• Values created speculatively
• Storage
  • where should values be buffered?
• Synchronization
  • operation uses value of latest producer
• Communication
  • forwarding created value to (future) consumers
• Create and exploit localities to reduce/simplify interconnect!

[Figure: PROGRAM divided into regions A, B, C]

Multiscalar Paradigm

• Break sequencing process into two steps
  • Sequence through static representation in task-sized steps
  • Sequence through each task in conventional manner
• Split large instruction window into ordered tasks
• Assign a task to a simple execution engine; exploit parallelism by overlapping execution of multiple tasks
• Use separate PCs to sequence through separate tasks
• Maintain the appearance of a single-PC sequencing through the static representation
• Use control and data dependence speculation
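
The task-level half of this scheme can be modelled as a circular queue of processing units holding tasks in program order. The sketch below is a simplified illustration rather than the actual Multiscalar hardware: the unit count and the names TaskRing, ring_init, ring_assign, ring_commit_head and ring_squash_after are assumptions, and task ids are assumed to be non-negative.

```c
#include <stddef.h>

#define NUM_PUS 4   /* hypothetical number of processing units */

typedef struct {
    int task[NUM_PUS];   /* task id held by each unit (meaningful for active slots) */
    size_t head;         /* unit holding the oldest, non-speculative task           */
    size_t active;       /* number of units currently holding tasks                 */
} TaskRing;

void ring_init(TaskRing *r)
{
    r->head = 0;
    r->active = 0;
}

/* Assign the next predicted task to the unit after the tail; 0 if all units are busy. */
int ring_assign(TaskRing *r, int task_id)
{
    if (r->active == NUM_PUS)
        return 0;
    r->task[(r->head + r->active) % NUM_PUS] = task_id;
    r->active++;
    return 1;
}

/* Retire the oldest task. Tasks commit strictly in program order, which
 * preserves the appearance of a single PC. Returns the task id, or -1 if idle. */
int ring_commit_head(TaskRing *r)
{
    if (r->active == 0)
        return -1;
    int id = r->task[r->head];
    r->head = (r->head + 1) % NUM_PUS;
    r->active--;
    return id;
}

/* The prediction made after the task at position pos (0 = head) was wrong:
 * squash every younger task. Returns the number of tasks squashed. */
size_t ring_squash_after(TaskRing *r, size_t pos)
{
    if (pos + 1 >= r->active)
        return 0;
    size_t squashed = r->active - (pos + 1);
    r->active = pos + 1;
    return squashed;
}
```

Each active unit would sequence through its own task with its own PC; only the assignment, commit and squash order shown here is shared.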

What is a Task/Thread?

• A portion of the static representation resulting in a contiguous portion of the dynamic instruction stream

- part of a basic block
- basic block
- multiple basic blocks
- loop iteration
- entire loop
- procedure call, etc.

Multiscalar Big Picture: Basics

[Figure: PROGRAM divided into tasks A, B, C; successive tasks are predicted and assigned to PROC UNIT 1, PROC UNIT 2, PROC UNIT 3]

Multiscalar: Logical Hardware

[Figure: a SEQUENCER above four processing units (P.U.), each with its own PIPELINE and REG file; MEMORY DISAMBIGUATION and CACHE HIERARCHY below]

Register Values

• Each core works out of its “local” register file
• Multiple register files act like separate “renamed” files
• Each register file contains register state at a particular time in the (speculative) execution of a program

Memory Values

• Storage
• Synchronization
• Communication
• Versions

Scheduling Memory Operations

• Data dependence speculation is the default
  • predict no dependences
• Improving accuracy of data dependence prediction
  - akin to branch prediction for control dependences
• Track history of dependence mis-speculations
  • small number of static dependence pairs
  • exhibit temporal locality
• Use history for future data dependence speculation/synchronization decisions

Example: Problem

• Process stream of tokens
• Create entry in list for new token
• Use information in list to process token

Example: C Code

for (indx = 0; indx < BUFSIZE; indx++) {
    /* get the symbol for which to search */
    symbol = SYMVAL(buffer[indx]);

    /* do a linear search for the symbol in the list */
    for (list = listhd; list; list = LNEXT(list)) {
        /* if symbol already present, process entry */
        if (symbol == LELE(list)) {
            process(list);
            break;
        }
    }
    /* if symbol not found, add it to the tail */
    if (! list) {
        addlist(symbol);
    }
}
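
The fragment above relies on definitions the slide does not show. A self-contained version might look like the sketch below, where the node layout, the macro bodies, the behaviour of process and addlist, and the helper count_of are all assumptions made only so the loop can be compiled and run.

```c
#include <stdlib.h>

/* Hypothetical list node; the slide does not show the real layout. */
typedef struct node {
    int ele;             /* the symbol stored in this entry            */
    int count;           /* assumed payload: how often it was seen     */
    struct node *next;
} node;

#define SYMVAL(tok) (tok)          /* assumption: a token is its own symbol */
#define LELE(l)     ((l)->ele)
#define LNEXT(l)    ((l)->next)

static node *listhd = NULL;        /* list head */
static node *listtl = NULL;        /* list tail, so additions go at the end */

/* Assumed meaning of "process entry": count one more occurrence. */
static void process(node *l)
{
    l->count++;
}

/* Append a new entry for symbol at the tail of the list. */
static void addlist(int symbol)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->ele = symbol;
    n->count = 1;
    n->next = NULL;
    if (listtl != NULL)
        listtl->next = n;
    else
        listhd = n;
    listtl = n;
}

/* The loop from the slide; each outer iteration is one complete list search. */
void scan(const int *buffer, int bufsize)
{
    for (int indx = 0; indx < bufsize; indx++) {
        int symbol = SYMVAL(buffer[indx]);
        node *list;
        for (list = listhd; list; list = LNEXT(list)) {
            if (symbol == LELE(list)) {
                process(list);
                break;
            }
        }
        if (!list)
            addlist(symbol);
    }
}

/* Inspection helper: occurrences recorded for symbol, or 0 if absent. */
int count_of(int symbol)
{
    for (const node *l = listhd; l; l = l->next)
        if (l->ele == symbol)
            return l->count;
    return 0;
}
```

Calling scan on the tokens 7, 3, 7, 7, 9 leaves three entries in the list, with the entry for 7 counted three times. Note that one iteration affects a later one only through an update to a list entry or through addlist, so the iterations are mostly, but not always, independent.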

Example

• Each task is a complete list search
• Searches are usually independent and parallel
• Multiscalar can assume they are always independent
• Branches that separate tasks are predictable
• Branches within a task unlikely to be 100% predictable
• Superscalar/VLIW unlikely to be able to overlap processing of different tokens

Example: Executable

Targ Spec    Branch, Branch
Targ1        OUTER
Targ2        OUTERFALLOUT
Create mask  $4,$8,$17,$20,$23

Forward Bits: F          Stop Bits: Stop Always

OUTER:
    addu    $20, $20, 16              F
    ld      $23, SYMVAL-16($20)       F
    move    $17, $21
    beq     $17, $0, SKIPINNER
INNER:
    ld      $8, LELE($17)
    bne     $8, $23, SKIPCALL
    move    $4, $17
    jal     process
    j       INNERFALLOUT
SKIPCALL:
    ld      $17, NEXTLIST($17)
    bne     $17, $0, INNER
INNERFALLOUT:
    release $8, $17
    bne     $17, $0, SKIPINNER
    move    $4, $23                   F
    jal     addlist
SKIPINNER:
    release $4
    bne     $20, $16, OUTER           Stop Always
OUTERFALLOUT:

• Going from one generation to another could leave binary untouched!

Concluding Remarks - 1

• Speculation is a very important tool in computer architect’s toolset
  • used to overcome performance-limiting constraints
  • control, dependence, value, other forms, .....
  • heavy use likely in future computer systems
• Wire delays will cause future microarchitectures to be distributed
• Distributed microarchitectures have many pluses
  • natural support of “threads”
  • easier to design, verify, test?
  • fault-tolerance

Concluding Remarks - 2

• Multiscalar model enables distributed execution of a sequential (or parallel) program
  - makes heavy use of dependence speculation
• Beginning of a new generation of microarchitectures
  - much work remains to be done
• More info at www.cs.wisc.edu/~mscalar
