
Datapath and Control Unit Design

Simple Processor!

(Sections 4.1-4.4, 4th ed.)

Oct-16 Cpu control.1


Intel SOC Architecture

• Intel Core(s) + graphics



Intel SOC: EU

• Intel Core(s) + graphics



Datapath vs Control

[Figure: Datapath and Controller blocks, connected by signals and control points]

• Datapath: storage, functional units (FU), and interconnect sufficient to perform the desired functions
• Controller: controls operations on the datapath


CPU Performance

[Figure: triangle relating CPI, Inst. Count, and Cycle Time]

• Performance determined by:
– Instruction count (code)
– Cycle time (e.g. a 2 GHz clock)
– Cycles per instruction (CPI)
• Processor design impacts:
– Cycle time (clock)
– Cycles per instruction
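The three factors listed on this slide multiply together to give CPU time. A minimal sketch, assuming a hypothetical program of one million instructions and the 2 GHz clock mentioned on the slide; the function name `cpu_time_seconds` and the CPI values are illustrative only.

```python
def cpu_time_seconds(instruction_count, cpi, clock_hz):
    """CPU time = instruction count x CPI x cycle time."""
    cycle_time = 1.0 / clock_hz  # seconds per clock cycle
    return instruction_count * cpi * cycle_time


# Hypothetical workload: 1,000,000 instructions at CPI = 1 on a 2 GHz clock.
single_cycle = cpu_time_seconds(1_000_000, 1.0, 2e9)
# Same program and clock, with an assumed average CPI of 1.5.
higher_cpi = cpu_time_seconds(1_000_000, 1.5, 2e9)
print(single_cycle, higher_cpi)
```

Processor design can change only the last two factors; the instruction count comes from the code.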


MIPS Format (Review)
• All MIPS instructions are 32 bits. Three formats:

  R:  op [31-26]  rs [25-21]  rt [20-16]  rd [15-11]  shamt [10-6]  funct [5-0]
      6 bits      5 bits      5 bits      5 bits      5 bits        6 bits
  I:  op [31-26]  rs [25-21]  rt [20-16]  immediate [15-0]
      6 bits      5 bits      5 bits      16 bits
  J:  op [31-26]  target address [25-0]
      6 bits      26 bits
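The field boundaries in the three formats can be extracted with shifts and masks. The sketch below is one possible decoding helper (the name `decode` is illustrative); it returns every field, and the caller picks the ones that apply to the instruction's format.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fields."""
    return {
        "op": (word >> 26) & 0x3F,      # bits 31-26, all formats
        "rs": (word >> 21) & 0x1F,      # bits 25-21, R and I formats
        "rt": (word >> 16) & 0x1F,      # bits 20-16, R and I formats
        "rd": (word >> 11) & 0x1F,      # bits 15-11, R format
        "shamt": (word >> 6) & 0x1F,    # bits 10-6, R format
        "funct": word & 0x3F,           # bits 5-0, R format
        "immediate": word & 0xFFFF,     # bits 15-0, I format
        "target": word & 0x3FFFFFF,     # bits 25-0, J format
    }


# add $1, $2, $3 encodes as 0x00430820.
fields = decode(0x00430820)
print(fields["op"], fields["rs"], fields["rt"], fields["rd"], fields["funct"])
```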


Instructions executed in 4-5 steps
• R-type:
– fetch instruction
– select registers (rs, rt) [operand fetch]
– ALU operation
– write back to register
• lw/sw:
– fetch instruction
– select a register (rs)
– calculate address (needs ALU)
– access memory (read/write)
– write register file (lw only)
• Branch:
– fetch instruction
– select registers (for beq)
– test condition and calculate target address (needs ALU)
• First two steps are common to all instructions


Functional Units - to build datapath review
[Figure: functional units used to build the datapath]
a. Instruction memory (input: Instruction address; output: Instruction)
b. Program counter (PC)
c. Adder (output: Sum)
a. Data memory unit (inputs: Address, Write data, MemWrite, MemRead; output: Read data)
b. Sign-extension unit (16-bit input, 32-bit output)
a. Registers (inputs: 5-bit Read register 1, Read register 2, and Write register numbers, Write data, RegWrite; outputs: Read data 1, Read data 2)
b. ALU (input: 3-bit ALU control; outputs: Zero, ALU result)

Review: How Registers work
• Register
– Similar to a D flip-flop, with:
• N-bit Data In and Data Out
• Write Enable input
• Clk input
– Write Enable:
• negated (0): Data Out will not change
• asserted (1): Data Out will become Data In after the clock edge
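The write-enable behavior described on this slide can be modeled in a few lines. This is a behavioral sketch only, not a hardware description; the class name `Register` and the method `clock_edge` are illustrative.

```python
class Register:
    """N-bit register with a write enable, updated on a clock edge."""

    def __init__(self, width=32):
        self.mask = (1 << width) - 1
        self.data_out = 0

    def clock_edge(self, data_in, write_enable):
        # Write Enable asserted (1): Data Out becomes Data In after the edge.
        # Write Enable negated (0): Data Out does not change.
        if write_enable:
            self.data_out = data_in & self.mask
        return self.data_out


reg = Register(width=8)
reg.clock_edge(0x5A, write_enable=1)
held = reg.clock_edge(0xFF, write_enable=0)
print(hex(held))
```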


MIPS Register File
• Register File consists of 32 registers:
– Two 32-bit outputs: Read data 1 & Read data 2
– A 32-bit input bus: Write data
• Register selection (5-bit RW, R1, R2):
– R1 (read register 1) selects the register to put on Read data 1
– R2 (read register 2) selects the register to put on Read data 2
– RW (write register) selects the register to be written (with Write data) when Write Enable is 1 (RegWrite)
• Clock input (CLK)
– The CLK input is a factor ONLY during a write operation
– During a read operation, the register file behaves as a combinational logic block:
• Read data 1 & Read data 2 are valid after the “access time.”

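A behavioral sketch of the register file described on this slide: reads are combinational, and writes happen only on a clock edge with RegWrite asserted. The class and method names are illustrative. Keeping register 0 at zero is the usual MIPS convention and is an assumption here, since the slide does not mention it.

```python
class RegisterFile:
    """32 registers of 32 bits: two read ports, one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, r1, r2):
        # Combinational read: the clock plays no role.
        return self.regs[r1 & 0x1F], self.regs[r2 & 0x1F]

    def clock_edge(self, rw, write_data, reg_write):
        # The write takes effect only on a clock edge with RegWrite = 1.
        # Assumption (MIPS convention, not on the slide): register 0 stays 0.
        if reg_write and (rw & 0x1F) != 0:
            self.regs[rw & 0x1F] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(rw=1, write_data=42, reg_write=1)
rf.clock_edge(rw=2, write_data=99, reg_write=0)  # ignored: RegWrite is 0
data1, data2 = rf.read(1, 2)
print(data1, data2)
```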
Memory review
• Memory (Data)
– Input: Data In (Write data), 32 bits
– Output: Data Out (Read data), 32 bits
• Memory word selection:
– Address selects the word
– Write Enable = 1: Address selects the memory word to be written via Data In (MemWrite)
• Clock input (CLK) (omitted from the book's diagram for simplicity)
– The CLK input is a factor ONLY during a write operation
– During a read operation, memory behaves as a combinational logic block:
• Address valid => Data Out valid after the “access time.”
• Instruction memory is similar (not shown)


Clocking - Review
[Figure: clock waveform with setup and hold windows around each clock edge; signals are "don't care" outside those windows]

• All storage elements are clocked by the same clock edge
• Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew
• (CLK-to-Q + Shortest Delay Path - Clock Skew) > Hold Time

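The two timing constraints on this slide translate directly into code. The sketch below uses made-up delay values in picoseconds purely for illustration; none of the numbers come from the slides, and the function names are illustrative.

```python
def min_cycle_time(clk_to_q, longest_path, setup, skew):
    """Cycle Time = CLK-to-Q + Longest Delay Path + Setup + Clock Skew."""
    return clk_to_q + longest_path + setup + skew


def hold_satisfied(clk_to_q, shortest_path, skew, hold):
    """(CLK-to-Q + Shortest Delay Path - Clock Skew) > Hold Time."""
    return clk_to_q + shortest_path - skew > hold


# Hypothetical values in ps.
cycle = min_cycle_time(clk_to_q=50, longest_path=700, setup=50, skew=20)
hold_ok = hold_satisfied(clk_to_q=50, shortest_path=30, skew=20, hold=40)
print(cycle, hold_ok)
```

The longest path limits how fast the clock can run, while the shortest path decides whether new data arrives too early and corrupts a value still being captured.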
Single-Cycle: Instruction Fetch Datapath
[Figure: the PC feeds the Read address of the instruction memory and an adder that adds 4; next-address logic selects the new PC]

• Instruction fetch
– Instruction is in the instruction memory
– Program counter points to the current instruction
– Adder increments the PC to point to the next instruction
– For a branch instruction, the incremented address may not be the valid next instruction address


R-type Datapath
• R[rd] <- R[rs] op R[rt]   Example: add rd, rs, rt
– Ra, Rb, and Rw come from the instruction’s rs, rt, and rd fields
– ALUctr and RegWr: set by control logic after decoding the instruction

  op [31-26]  rs [25-21]  rt [20-16]  rd [15-11]  shamt [10-6]  funct [5-0]
  6 bits      5 bits      5 bits      5 bits      5 bits        6 bits

[Figure: register file of 32 32-bit registers (Rw = Rd, R1 = Rs, R2 = Rt, 5 bits each); Read data 1 and Read data 2 feed the ALU, and the 32-bit ALU Result returns as Write data; control inputs are Write, ALU control, and Clk]
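The register transfer R[rd] <- R[rs] op R[rt] can be sketched as a small behavioral model. The operation names and helper functions below are illustrative, not part of the slides; the five operations are the ones the ALU control slide lists (add, sub, AND, OR, slt).

```python
MASK32 = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def alu(op, a, b):
    """32-bit ALU covering add, sub, and, or, slt."""
    if op == "add":
        return (a + b) & MASK32
    if op == "sub":
        return (a - b) & MASK32
    if op == "and":
        return a & b
    if op == "or":
        return a | b
    if op == "slt":
        return 1 if to_signed(a) < to_signed(b) else 0
    raise ValueError(f"unsupported ALU operation: {op}")


def execute_rtype(regs, op, rd, rs, rt):
    """R[rd] <- R[rs] op R[rt]."""
    regs[rd] = alu(op, regs[rs], regs[rt])


regs = [0] * 32
regs[2], regs[3] = 7, 5
execute_rtype(regs, "add", rd=1, rs=2, rt=3)   # $1 = 7 + 5
execute_rtype(regs, "sub", rd=4, rs=3, rt=2)   # $4 = 5 - 7, wraps to 32 bits
execute_rtype(regs, "slt", rd=5, rs=4, rt=2)   # $5 = 1 because -2 < 7
print(regs[1], hex(regs[4]), regs[5])
```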


Complete R-type Datapath
[Figure: complete R-type datapath: PC, instruction memory, adder (PC + 4) with next-address logic, register file (Read register 1, Read register 2, Write register, Write data, Read data 1, Read data 2, Write), and ALU (ALU control, Zero, result)]

Timing: One complete cycle
[Figure: timing of one complete cycle for the R-type datapath]
• Clk edge, then Clk-to-Q: PC changes from old value to new value
• Instruction memory access time: Rs, Rt, Rd, Op, Func change to new values
• Delay through control logic: ALUctr and RegWr change to new values
• Register file access time: Read data 1 & 2 change to new values
• ALU delay: Write data changes to new value
• Register setup time, then the register write occurs at the next clock edge


Load/Store Datapath
• Fetch is the same as for R-type
• lw $1, offset-value($2) ; sw $1, offset-value($2)
• Register file (get base register)
• Sign extension (extends the 16-bit offset to 32 bits)
• ALU to calculate the memory address
• Data memory: read OR write

[Figure: register file (Read data 1, Read data 2, write reg, write data), sign-extension unit (16 to 32 bits), ALU with ALU control, and data memory (address, write data, read data, read and write controls)]

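The address calculation for lw/sw combines the sign-extension unit and the ALU. A minimal sketch, with illustrative function names; the base and offset values are made up for the example.

```python
def sign_extend16(imm):
    """Extend a 16-bit two's-complement value to a Python integer."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def effective_address(base, offset16):
    """Memory address = base register + sign-extended 16-bit offset."""
    return (base + sign_extend16(offset16)) & 0xFFFFFFFF


# lw $1, 8($2) with $2 = 0x1000 accesses address 0x1008.
forward = effective_address(0x1000, 0x0008)
# lw $1, -4($2) uses the 16-bit offset 0xFFFC.
backward = effective_address(0x1000, 0xFFFC)
print(hex(forward), hex(backward))
```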
Branch Inst. Datapath

• beq $1, $2, offset
– if ($1 == $2) goto PC + offset*4
• ALU for the branch condition
• Adder for computing the branch target address
• Shift left 2: increases the range of the offset by a factor of 4
• Zero: goes to the control logic to decide whether to branch

[Figure: PC+4 from the instruction datapath and the sign-extended offset (16 to 32 bits, shifted left 2) feed an adder that produces the branch target; Read data 1 and Read data 2 from the registers feed the ALU, whose Zero output goes to the branch control logic]
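The branch target and the branch decision can be sketched as follows. The code follows the diagram label "PC+4 from inst. datapath", so the offset is added to PC + 4; the function names are illustrative.

```python
def sign_extend16(imm):
    """Extend a 16-bit two's-complement value to a Python integer."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def branch_target(pc, offset16):
    """Target = (PC + 4) + (sign-extended offset shifted left by 2)."""
    return (pc + 4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF


def next_pc_beq(pc, offset16, rs_value, rt_value):
    """beq: take the branch only when the ALU subtraction gives Zero."""
    zero = ((rs_value - rt_value) & 0xFFFFFFFF) == 0
    return branch_target(pc, offset16) if zero else (pc + 4) & 0xFFFFFFFF


taken = next_pc_beq(0x100, 3, rs_value=5, rt_value=5)
not_taken = next_pc_beq(0x100, 3, rs_value=5, rt_value=6)
print(hex(taken), hex(not_taken))
```

Because instructions are word aligned, shifting the 16-bit offset left by 2 lets it reach four times as far as a byte offset would.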


Complete Datapath for : R, LD/ST, BEQ

[Figure: complete datapath combining the PC, instruction memory, two adders (PC + 4 and branch target with Shift left 2), registers, sign extend (16 to 32 bits), ALU (Zero, ALU result), data memory, and muxes]

• Executes basic instructions in a single clock cycle
• Any resource can be used only once during a single cycle

Datapath controlled by control unit
[Figure: the same datapath with its control points labeled. Identify your controls: PCSrc, RegWrite, ALUSrc, ALU operation (3 bits), MemWrite, MemRead, MemtoReg]


Single-Cycle: Control Signals

• Main control:
– input: 6-bit opcode, inst[31-26]
– output: 9 control lines: 7 lines to the register file, memory, and muxes, plus the 2-bit ALUop
• ALU control:
– input: 2-bit ALUop + 6-bit function field, inst[5-0]
– output: 3 lines (3-bit ALU control)
– for I- and J-type instructions, ALU control depends only on ALUop


ALU Control, Truth Table
Inst opcode | Inst. operation | Desired ALU action | ALUop | Function code | ALU control
lw          | load word       | add                | 00    | xxxxxx        |
sw          | store word      | add                | 00    | xxxxxx        |
beq         | branch equal    | subtract           | 01    | xxxxxx        |
R-type      | add             | add                | 10    | 100000        |
R-type      | sub             | subtract           | 10    | 100010        |
R-type      | AND             | and                | 10    | 100100        |
R-type      | OR              | or                 | 10    | 100101        |
R-type      | slt             | set on less than   | 10    | 101010        |

*ALUop: output of main control (R-type: ALUop = 10; lw/sw: ALUop = 00)
*ALU control: combinational logic with 8 inputs and 3 outputs
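The truth table can be read as a two-level decision: ALUop alone decides for lw, sw, and beq, and the function field decides for R-type. The sketch below returns the name of the desired ALU action rather than a bit pattern, since the 3-bit output encoding is not given here; the function and table names are illustrative.

```python
# Function-field values for the R-type instructions in the table.
FUNCT_TO_ACTION = {
    0b100000: "add",
    0b100010: "subtract",
    0b100100: "and",
    0b100101: "or",
    0b101010: "set on less than",
}


def alu_action(alu_op, funct=0):
    """Map the 2-bit ALUop and 6-bit function field to an ALU action."""
    if alu_op == 0b00:      # lw / sw: add base and offset
        return "add"
    if alu_op == 0b01:      # beq: subtract and test Zero
        return "subtract"
    if alu_op == 0b10:      # R-type: the function field decides
        return FUNCT_TO_ACTION[funct & 0x3F]
    raise ValueError(f"unsupported ALUop: {alu_op:02b}")


print(alu_action(0b00), alu_action(0b01), alu_action(0b10, 0b101010))
```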


Datapath with Control unit
[Figure: datapath with control unit. The Control block decodes Instruction [31-26] and drives RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite. Instruction [25-21] and [20-16] select Read register 1 and Read register 2; a mux chooses Instruction [20-16] or [15-11] as the Write register; Instruction [15-0] is sign-extended from 16 to 32 bits; Instruction [5-0] feeds ALU control.]




Datapath timings in psec
[Figure: datapath with control unit, annotated with component delays in ps. Values shown: 400, 200, 120, 350, 100, 100, and 30 at the muxes; the Regfile setup time is 50.]

R-format timing = 400 + 200 + 30 + 120 + 30 + 50 (IF → WB)
OR = 400 + 100 (IF – cntl – PC mux)

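The path sums on this slide can be checked by adding up per-component delays. In the sketch below the delay values come from the slide, but the mapping of each value to a component name is an assumption inferred from the "(IF → WB)" ordering of the R-format sum. The load path is also an assumption, added only to show how a longer path would be evaluated.

```python
# Delays in ps. Values are from the slide; the component names are assumed.
DELAY_PS = {
    "inst_mem": 400,
    "reg_read": 200,
    "mux": 30,
    "alu": 120,
    "data_mem": 350,
    "control": 100,
    "reg_setup": 50,
}


def path_delay(stages):
    """Sum the delays of the components along one path."""
    return sum(DELAY_PS[stage] for stage in stages)


# R-format path as written on the slide: 400 + 200 + 30 + 120 + 30 + 50.
r_format_ps = path_delay(
    ["inst_mem", "reg_read", "mux", "alu", "mux", "reg_setup"])
# Alternative path on the slide: 400 + 100.
control_ps = path_delay(["inst_mem", "control"])
# Assumed (not on the slide): a load also goes through data memory.
load_ps = path_delay(
    ["inst_mem", "reg_read", "mux", "alu", "data_mem", "mux", "reg_setup"])
print(r_format_ps, control_ps, load_ps)
```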
Control Unit -- Control Signal Definitions

Signal Name | Effect when deasserted                            | Effect when asserted
MemRead     | None                                              | Data is put on the read data output
MemWrite    | None                                              | Write data is written into memory
RegWrite    | None                                              | Register is written
ALUSrc      | 2nd ALU input comes from the register file        | 2nd ALU input comes from inst[15-0]
PCSrc       | PC = PC+4                                         | PC = branch address
MemtoReg    | Result of ALU is sent to the register file        | Data in memory is sent to the register file
RegDst      | inst[20-16] (rt) provides register write address  | inst[15-11] (rd) provides register write address

PCSrc = Branch AND Zero

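One way to picture the main control is as a lookup from opcode to signal values. The sketch below uses the standard MIPS opcodes (R-type 0, lw 35, sw 43, beq 4) and the usual textbook settings for these four instruction classes; the table layout and function names are illustrative. Don't-care outputs are shown as 0.

```python
# Signal settings per opcode. Don't-care values are written as 0.
CONTROL_TABLE = {
    0:  {"RegDst": 1, "ALUSrc": 0, "MemtoReg": 0, "RegWrite": 1,
         "MemRead": 0, "MemWrite": 0, "Branch": 0, "ALUOp": 0b10},  # R-type
    35: {"RegDst": 0, "ALUSrc": 1, "MemtoReg": 1, "RegWrite": 1,
         "MemRead": 1, "MemWrite": 0, "Branch": 0, "ALUOp": 0b00},  # lw
    43: {"RegDst": 0, "ALUSrc": 1, "MemtoReg": 0, "RegWrite": 0,
         "MemRead": 0, "MemWrite": 1, "Branch": 0, "ALUOp": 0b00},  # sw
    4:  {"RegDst": 0, "ALUSrc": 0, "MemtoReg": 0, "RegWrite": 0,
         "MemRead": 0, "MemWrite": 0, "Branch": 1, "ALUOp": 0b01},  # beq
}


def main_control(opcode):
    """Return the control signal settings for a 6-bit opcode."""
    return CONTROL_TABLE[opcode & 0x3F]


def pc_src(branch, zero):
    """PCSrc = Branch AND Zero."""
    return int(bool(branch) and bool(zero))


lw_signals = main_control(35)
beq_taken = pc_src(main_control(4)["Branch"], zero=1)
print(lw_signals["MemRead"], lw_signals["MemtoReg"], beq_taken)
```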


Example 1: Execution flow for
add $1, $1, $3 (4 steps + bypass)
[Figure: datapath with control unit, with the execution steps marked]
1. IF
2. D
3. EX, ALU func.
4. Bypass (data memory is not used)
5. WB, write back result

Example 2: LW S0, OFF(S1)
Memory address = OFF + S1

[Figure: datapath with control unit, with the execution steps marked; OFF is the 16-bit field Instruction [15-0], sign-extended to 32 bits]
1. IF
2. D
3. EX, calc address
4. Mem rd
5. WB, write back result

Example 3: BEQ S1, S0, cs330
target address = PC + offset x 4

[Figure: datapath with control unit, with the execution steps marked]
1. IF
2. D
3. EX, compare s1:s0
Update the PC with the target address if the branch is successful


Single-Cycle: J-type

• So far, the datapath can handle R-type, lw/sw, beq
– How about J-type?
– J-type: j L1 (P.372), jal L1 (Exercise 5.6)

  op [31-26]  address [25-0]

address = a25 ... a0     current PC = pc31 pc30 ... pc0
Actual address L1 = pc31 pc30 pc29 pc28 a25 ... a0 00
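The address formation shown on this slide concatenates the top four PC bits, the 26-bit address field, and two zero bits. A minimal sketch with an illustrative function name; the `pc` argument is assumed to be whichever PC value supplies the upper bits (in MIPS this is the incremented PC).

```python
def jump_target(pc, address26):
    """Build pc31..pc28 | a25..a0 | 00 from the PC and the address field."""
    upper = pc & 0xF0000000                 # keep pc31 pc30 pc29 pc28
    lower = (address26 & 0x3FFFFFF) << 2    # a25..a0 followed by 00
    return upper | lower


target = jump_target(0x40000008, 0x0000100)
print(hex(target))
```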


Single-Cycle: Datapath + Control including jump inst



What’s wrong with Single cycle CPI=1 processor?
Arithmetic & Logical:   Inst Memory -> Reg File -> ALU -> RegW
Load (critical path):   Inst Memory -> Reg File -> ALU -> Data Mem -> RegW
Store:                  Inst Memory -> Reg File -> ALU -> Data Mem
Branch:                 Inst Memory -> Reg File -> cmp

• Long cycle time
• All instructions take as much time as the slowest
• Real memory is slow


Single Cycle Timing Diagram

[Figure: Clk over Cycle 1 and Cycle 2. Single-cycle implementation: Load, then Store followed by wasted time]


CPU VS Microcontroller
Microcontroller = CPU + Flash (ROM) + RAM + popular I/O peripherals.

[Figure: 8051 microcontroller block diagram]

• 8051: used in the lab project
• Used to implement low-cost applications and embedded systems, e.g. automotive, appliances, elevators


Microcontroller Block Diagram: PIC
