
Computer Organization & Software Systems (SSZG 516)
Lecture #1
Instructor: Virendra S Shekhawat
Date: 23/01/06
Time: 9:00 AM to 10:30 AM

Chapter 4

Cache Memory

Characteristics
  Location
  Capacity
  Unit of transfer
  Access method
  Performance
  Physical type
  Physical characteristics
  Organisation

Memory Hierarchy - Diagram

Cache
  Small amount of fast memory
  Sits between normal main memory and CPU
  May be located on CPU chip or module

Cache operation - overview
  CPU requests contents of memory location
  Check cache for this data
  If present, get from cache (fast)
  If not present, read required block from main memory to cache
  Then deliver from cache to CPU
  Cache includes tags to identify which block of main memory is in each cache slot

Cache Design
  Size
  Mapping Function
  Replacement Algorithm
  Write Policy
  Block Size
  Number of Caches

Typical Cache Organization

Mapping Function
  Cache of 64 KBytes
  Cache block of 4 bytes
    i.e. cache is 16K (2^14) lines of 4 bytes
  16 MBytes main memory
  24-bit address
    (2^24 = 16M)

Direct Mapping
  Each block of main memory maps to only one cache line
    i.e. if a block is in cache, it must be in one specific place
  Address is in two parts
  Least significant w bits identify a unique word
  Most significant s bits specify one memory block
  The MSBs are split into a cache line field of r bits and a tag of s-r bits (most significant)

Direct Mapping Address Structure

  Field:  Tag (s-r) | Line or Slot (r) | Word (w)
  Bits:   8         | 14               | 2

  24-bit address
  2-bit word identifier (4-byte block)
  22-bit block identifier
    8-bit tag (= 22 - 14)
    14-bit slot or line
  No two blocks that map to the same line have the same Tag field
  Check contents of cache by finding line and checking Tag
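The field widths above can be checked with a few lines of Python. This is only a sketch under the slide's parameters (8-bit tag, 14-bit line, 2-bit word); the function name split_direct and the sample address are illustrative choices, not part of the lecture.

```python
# Field widths taken from the slide: 8-bit tag, 14-bit line, 2-bit word.
TAG_BITS, LINE_BITS, WORD_BITS = 8, 14, 2

def split_direct(address):
    """Split a 24-bit address into (tag, line, word) for direct mapping."""
    word = address & ((1 << WORD_BITS) - 1)
    line = (address >> WORD_BITS) & ((1 << LINE_BITS) - 1)
    tag = address >> (WORD_BITS + LINE_BITS)
    return tag, line, word

# Geometry check: a 64 KByte cache of 4-byte blocks has 2**14 lines.
cache_lines = (64 * 1024) // 4

# Illustrative address (an assumption, chosen only to exercise the function).
tag, line, word = split_direct(0x16339C)
print(hex(tag), hex(line), word)
```

The line field alone picks the cache slot; the tag is then compared against the tag stored in that slot to decide hit or miss.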

Direct Mapping Cache Organization


Direct Mapping pros & cons
  Simple
  Inexpensive
  Fixed location for given block
    If a program repeatedly accesses 2 blocks that map to the same line, the cache miss rate is very high
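The conflict problem of direct mapping can be made concrete with a toy simulation. The sketch below assumes the same 14-bit line and 2-bit word fields as the running example; count_misses and the two addresses are hypothetical names and values chosen for illustration.

```python
LINE_BITS, WORD_BITS = 14, 2

def count_misses(addresses):
    """Simulate a direct-mapped cache that keeps one tag per line."""
    tags = {}          # line number -> tag currently held in that line
    misses = 0
    for addr in addresses:
        line = (addr >> WORD_BITS) & ((1 << LINE_BITS) - 1)
        tag = addr >> (WORD_BITS + LINE_BITS)
        if tags.get(line) != tag:
            misses += 1
            tags[line] = tag   # the new block evicts whatever the line held
    return misses

# Two illustrative addresses with the same line field but different tags.
a, b = 0x000004, 0x010004
conflicting = count_misses([a, b] * 4)     # a, b, a, b, ... every access misses
friendly = count_misses([a, a + 4] * 4)    # two blocks that use different lines
print(conflicting, friendly)
```

With alternating accesses to a and b every reference evicts the other block, whereas two blocks in different lines miss only on their first use.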

Associative Mapping
  A main memory block can load into any line of cache
  Memory address is interpreted as tag and word
  Tag uniquely identifies block of memory
  Every line's tag is examined for a match
  Cache searching gets expensive

Fully Associative Cache Organization


Associative Mapping Address Structure

  Field:  Tag | Word
  Bits:   22  | 2

  22-bit tag stored with each 32-bit block of data
  Compare tag field with tag entry in cache to check for hit
  Least significant 2 bits of address identify which byte is required from the 32-bit data block

Set Associative Mapping
  Cache is divided into a number of sets
  Each set contains a number of lines
  A given block maps to any line in a given set
    e.g. Block B can be in any line of set i
  e.g. 2 lines per set
    2-way associative mapping
    A given block can be in one of 2 lines in only one set

Two Way Set Associative Cache Organization


Set Associative Mapping Address Structure

  Field:  Tag | Set | Word
  Bits:   9   | 13  | 2

  Use set field to determine cache set to look in
  Compare tag field to see if we have a hit
  e.g.

  Address     Tag   Data       Set number
  1FF 7FFC    1FF   12345678   1FFF
  001 7FFC    001   11223344   1FFF
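The two example rows can be reproduced by decoding the addresses. This sketch assumes the slide writes each address as the 9-bit tag followed by the remaining 15 bits (set and word); the helper names are illustrative.

```python
TAG_BITS, SET_BITS, WORD_BITS = 9, 13, 2

def split_set_assoc(address):
    """Split a 24-bit address into (tag, set number, word)."""
    word = address & ((1 << WORD_BITS) - 1)
    set_no = (address >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag = address >> (WORD_BITS + SET_BITS)
    return tag, set_no, word

def from_slide_notation(tag, low15):
    """Rebuild the 24-bit address from 'tag' plus 'low 15 bits' (assumed notation)."""
    return (tag << (SET_BITS + WORD_BITS)) | low15

first = split_set_assoc(from_slide_notation(0x1FF, 0x7FFC))
second = split_set_assoc(from_slide_notation(0x001, 0x7FFC))
print([hex(x) for x in first], [hex(x) for x in second])
```

Both addresses select set 1FFF but carry different tags, so in a 2-way cache they can occupy the two lines of that set at the same time.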

Chapter 10

Instruction Sets: Characteristics and Functions

What is an instruction set?
  The complete collection of instructions that are understood by a CPU
  Machine Code
  Binary
  Usually represented by assembly codes

Elements of an Instruction
  Operation code (Op code)
    Do this
  Source Operand reference
    To this
  Result Operand reference
    Put the answer here
  Next Instruction Reference
    When you have done that, do this...

Instruction Cycle State Diagram


Instruction Representation
  In machine code each instruction has a unique bit pattern
  For human consumption (well, programmers anyway) a symbolic representation is used
    e.g. ADD, SUB, LOAD
  Operands can also be represented in this way
    ADD A,B

Simple Instruction Format

Instruction Types
  Data processing
  Data storage (main memory)
  Data movement (I/O)
  Program flow control

Number of Addresses (a)
  3 addresses
    Operand 1, Operand 2, Result
    a = b + c;
    May be a fourth: next instruction (usually implicit)
    Not common
    Needs very long words to hold everything
  2 addresses
    One address doubles as operand and result
    a = a + b
    Reduces length of instruction
    Requires some extra work
      Temporary storage to hold some results

Number of Addresses (b)
  1 address
    Implicit second address
    Usually a register (accumulator)
    Common on early machines
  0 (zero) addresses
    All addresses implicit
    Uses a stack
    e.g. push a
         push b
         add
         pop c
    c = a + b
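The zero-address example can be run on a toy stack machine. The interpreter below is a minimal sketch, not a model of any real instruction set; run_stack_program and the initial values of a and b are assumptions made for illustration.

```python
def run_stack_program(program, memory):
    """Execute a tiny stack program; the operands of 'add' are implicit."""
    stack = []
    for instruction in program:
        op, *arg = instruction.split()
        if op == "push":
            stack.append(memory[arg[0]])       # copy a memory cell onto the stack
        elif op == "pop":
            memory[arg[0]] = stack.pop()       # store top of stack into memory
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)         # no address field needed
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return memory

# Illustrative values for a and b.
memory = run_stack_program(["push a", "push b", "add", "pop c"], {"a": 2, "b": 3})
print(memory["c"])
```

Only push and pop name a memory location; add finds both operands on top of the stack, which is what makes it a zero-address instruction.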

Design Decisions
  Operation repertoire
    How many ops?
    What can they do?
    How complex are they?
  Data types
  Instruction formats
    Length of op code field
    Number of addresses
  Registers
    Number of CPU registers available
    Which operations can be performed on which registers?
  Addressing modes (later)
  RISC v CISC

Types of Operand
  Addresses
  Numbers
    Integer/floating point
  Characters
    ASCII etc.
  Logical Data
    Bits or flags

Types of Operation
  Data Transfer
  Arithmetic
  Logical
  Conversion
  I/O
  System Control
  Transfer of Control

Data Transfer
  Specify
    Source
    Destination
    Amount of data
  May be different instructions for different movements
    e.g. IBM 370
  Or one instruction and different addresses
    e.g. VAX

Arithmetic
  Add, Subtract, Multiply, Divide
  Signed Integer
  Floating point?
  May include
    Increment (a++)
    Decrement (a--)
    Negate (-a)

Logical Operations
  Bitwise operations
  AND, OR, NOT

Shift and Rotate Operations


Input/Output
  May be specific instructions
  May be done using data movement instructions (memory mapped)
  May be done by a separate controller (DMA)

Chapter 11

Instruction Sets: Addressing Modes and Formats

Addressing Modes
  Immediate
  Direct
  Indirect
  Register
  Register Indirect
  Displacement (Indexed)
  Stack

Immediate Addressing
  Operand is part of instruction
  Operand = address field
  e.g. ADD 5
    Add 5 to contents of accumulator
    5 is operand
  No memory reference to fetch data
  Fast
  Limited range
  Instruction format: Opcode | Operand

Direct Addressing
  Address field contains address of operand
  Effective address (EA) = address field (A)
  e.g. ADD A
    Add contents of cell A to accumulator
    Look in memory at address A for operand
  Single memory reference to access data
  No additional calculations to work out effective address
  Limited address space

Direct Addressing Diagram
  [Figure: instruction (Opcode, Address A); A selects the memory cell that holds the operand]

Indirect Addressing
  Memory cell pointed to by address field contains the address of (pointer to) the operand
  EA = (A)
    Look in A, find address (A) and look there for operand
  e.g. ADD (A)
    Add contents of cell pointed to by contents of A to accumulator
  Large address space
  2^n where n = word length
  May be nested, multilevel, cascaded
    e.g. EA = (((A)))
      Draw the diagram yourself
  Multiple memory accesses to find operand
  Hence slower
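The cost difference between immediate, direct and indirect addressing comes down to how many memory references are needed to reach the operand. The sketch below counts them; fetch_operand and the memory contents are hypothetical, chosen only to illustrate the three modes.

```python
def fetch_operand(mode, a, memory):
    """Return (operand, memory_references) for one addressing mode."""
    if mode == "immediate":
        return a, 0                      # operand is the address field itself
    if mode == "direct":
        return memory[a], 1              # EA = A
    if mode == "indirect":
        return memory[memory[a]], 2      # EA = (A): one read for the pointer, one for the operand
    raise ValueError(f"unknown mode: {mode}")

# Illustrative memory: cell 10 holds a pointer to cell 20, which holds 99.
memory = {10: 20, 20: 99}
for mode in ("immediate", "direct", "indirect"):
    print(mode, fetch_operand(mode, 10, memory))
```

Each extra level of indirection adds one more memory reference, which is why nested indirect addressing is slower still.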

Indirect Addressing Diagram
  [Figure: instruction (Opcode, Address A); memory cell A holds a pointer to the operand, which is also in memory]

Register Addressing (1)
  Operand is held in register named in address field
  EA = R
  Limited number of registers
  Very small address field needed
    Shorter instructions
    Faster instruction fetch

Register Addressing (2)
  No memory access
  Very fast execution
  Very limited address space
  Multiple registers help performance
    Requires good assembly programming or compiler writing
    N.B. C programming
      register int a;
  c.f. Direct addressing

Register Addressing Diagram
  [Figure: instruction (Opcode, Register Address R); register R holds the operand]

Register Indirect Addressing
  c.f. indirect addressing
  EA = (R)
  Operand is in memory cell pointed to by contents of register R
  Large address space (2^n)
  One fewer memory access than indirect addressing

Register Indirect Addressing Diagram
  [Figure: instruction (Opcode, Register Address R); register R holds a pointer to the operand in memory]

Displacement Addressing
  EA = A + (R)
  Address field holds two values
    A = base value
    R = register that holds displacement
    or vice versa

Displacement Addressing Diagram
  [Figure: instruction (Opcode, Register R, Address A); contents of register R are added to A to form a pointer to the operand in memory]

Relative Addressing
  A version of displacement addressing
  R = Program counter, PC
  EA = A + (PC)
  i.e. get operand from A cells from current location pointed to by PC
  c.f. locality of reference & cache usage

Base-Register Addressing
  A holds displacement
  R holds pointer to base address
  R may be explicit or implicit
  e.g. segment registers in 80x86

Indexed Addressing
  A = base
  R = displacement
  EA = A + R
  Good for accessing arrays
    EA = A + R
    R++
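Indexed addressing with an incrementing index register is the usual way to step through an array. The sketch below follows the slide's EA = A + R, R++ pattern; sum_array, the base address 100 and the array contents are assumptions made for illustration.

```python
def sum_array(memory, base, length):
    """Walk an array using EA = A + R, incrementing R after each access."""
    r = 0                      # index register R
    total = 0
    for _ in range(length):
        ea = base + r          # EA = A + R
        total += memory[ea]
        r += 1                 # R++
    return total

# Illustrative array of three words stored from address 100.
memory = {100: 7, 101: 8, 102: 9}
print(sum_array(memory, 100, 3))
```

The instruction's address field A stays fixed at the array base while only the register changes between iterations.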

Stack Addressing
  Operand is (implicitly) on top of stack
  e.g.
    ADD
    Pop top two items from stack and add

Chapter 16

Control Unit Operation

Micro-Operations
  A computer executes a program
  Fetch/execute cycle
  Each cycle has a number of steps
    see pipelining
  Called micro-operations
  Each step does very little
  Atomic operation of CPU

Constituent Elements of Program Execution


Fetch - 4 Registers
  Memory Address Register (MAR)
    Connected to address bus
    Specifies address for read or write op
  Memory Buffer Register (MBR)
    Connected to data bus
    Holds data to write or last data read
  Program Counter (PC)
    Holds address of next instruction to be fetched
  Instruction Register (IR)
    Holds last instruction fetched

Fetch Sequence
  Address of next instruction is in PC
  Address (MAR) is placed on address bus
  Control unit issues READ command
  Result (data from memory) appears on data bus
  Data from data bus copied into MBR
  PC incremented by 1 (in parallel with data fetch from memory)
  Data (instruction) moved from MBR to IR
  MBR is now free for further data fetches
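The fetch sequence can be written as three time units acting on the four registers. This is a sketch only: the registers are modelled as a Python dict, the program contents are made up, and the PC increment is placed alongside the memory read because the slide says the two happen in parallel.

```python
def fetch(cpu, memory):
    """Run the fetch sequence as three time units on a dict of registers."""
    # t1: MAR <- (PC)
    cpu["MAR"] = cpu["PC"]
    # t2: MBR <- (memory); PC <- (PC) + 1  (independent, so grouped together)
    cpu["MBR"] = memory[cpu["MAR"]]
    cpu["PC"] = cpu["PC"] + 1
    # t3: IR <- (MBR)
    cpu["IR"] = cpu["MBR"]
    return cpu

# Illustrative two-instruction program stored at addresses 0 and 1.
memory = {0: "ADD R1,X", 1: "ISZ X"}
cpu = fetch({"PC": 0, "MAR": None, "MBR": None, "IR": None}, memory)
print(cpu)
```

After one fetch the IR holds the instruction from address 0 and the PC already points at the next instruction.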

Rules for Clock Cycle Grouping
  Proper sequence must be followed
    MAR <- (PC) must precede MBR <- (memory)
  Conflicts must be avoided
    Must not read & write same register at same time
    MBR <- (memory) & IR <- (MBR) must not be in same cycle
  Also: PC <- (PC) + 1 involves addition
    Use ALU
    May need additional micro-operations

Interrupt Cycle
  t1: MBR <- (PC)
  t2: MAR <- save-address
      PC <- routine-address
  t3: memory <- (MBR)
  This is a minimum
    May be additional micro-ops to get addresses
  N.B. saving context is done by interrupt handler routine, not micro-ops

Execute Cycle (ADD)
  Different for each instruction
  e.g. ADD R1,X - add the contents of location X to Register 1, result in R1
    t1: MAR <- (IRaddress)
    t2: MBR <- (memory)
    t3: R1 <- R1 + (MBR)
  Note no overlap of micro-operations

Execute Cycle (ISZ)
  ISZ X - increment and skip if zero
    t1: MAR <- (IRaddress)
    t2: MBR <- (memory)
    t3: MBR <- (MBR) + 1
    t4: memory <- (MBR)
        if (MBR) == 0 then PC <- (PC) + 1
  Notes:
    if is a single micro-operation
    Micro-operations done during t4
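The ISZ micro-operations can be traced step by step in a few lines. This is a sketch with registers held in a dict; the key IR_address (standing for the address field of the IR), the location 50 and the counter values are assumptions made for illustration.

```python
def execute_isz(cpu, memory):
    """Execute ISZ X as the four time units listed on the slide."""
    cpu["MAR"] = cpu["IR_address"]          # t1: MAR <- (IRaddress)
    cpu["MBR"] = memory[cpu["MAR"]]         # t2: MBR <- (memory)
    cpu["MBR"] = cpu["MBR"] + 1             # t3: MBR <- (MBR) + 1
    memory[cpu["MAR"]] = cpu["MBR"]         # t4: memory <- (MBR)
    if cpu["MBR"] == 0:                     # t4: if (MBR) == 0 then PC <- (PC) + 1
        cpu["PC"] = cpu["PC"] + 1
    return cpu

# Illustrative case: location 50 holds -1, so the increment reaches zero and the skip happens.
memory = {50: -1}
cpu = execute_isz({"IR_address": 50, "PC": 7, "MAR": None, "MBR": None}, memory)
print(memory[50], cpu["PC"])
```

When the incremented value is non-zero the PC is left alone and the next instruction in sequence runs as usual.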

Execute Cycle (BSA)
  BSA X - Branch and save address
    Address of instruction following BSA is saved in X
    Execution continues from X+1
    t1: MAR <- (IRaddress)
        MBR <- (PC)
    t2: PC <- (IRaddress)
        memory <- (MBR)
    t3: PC <- (PC) + 1
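The BSA sequence can be checked the same way. In this sketch the registers are again a dict, IR_address stands for the address field X of the instruction, and the concrete addresses (BSA at 20, X = 100) are made up; the PC is assumed to have been incremented already by the fetch cycle.

```python
def execute_bsa(cpu, memory):
    """Execute BSA X as the three time units listed on the slide."""
    # t1: MAR <- (IRaddress); MBR <- (PC)
    cpu["MAR"] = cpu["IR_address"]
    cpu["MBR"] = cpu["PC"]
    # t2: PC <- (IRaddress); memory <- (MBR)
    cpu["PC"] = cpu["IR_address"]
    memory[cpu["MAR"]] = cpu["MBR"]
    # t3: PC <- (PC) + 1
    cpu["PC"] = cpu["PC"] + 1
    return cpu

# Illustrative case: BSA 100 was fetched from address 20, so PC is 21 on entry.
memory = {}
cpu = execute_bsa({"IR_address": 100, "PC": 21, "MAR": None, "MBR": None}, memory)
print(memory[100], cpu["PC"])
```

The return address ends up in location X and execution resumes at X+1, which is how this instruction implements a subroutine call.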

Functional Requirements
  Define basic elements of processor
  Describe micro-operations processor performs
  Determine functions control unit must perform

Basic Elements of Processor
  ALU
  Registers
  Internal data paths
  External data paths
  Control Unit

Types of Micro-operation
  Transfer data between registers
  Transfer data from register to external
  Transfer data from external to register
  Perform arithmetic or logical ops

Functions of Control Unit
  Sequencing
    Causing the CPU to step through a series of micro-operations
  Execution
    Causing the performance of each micro-op
  This is done using Control Signals

Control Signals
  Clock
    One micro-instruction (or set of parallel micro-instructions) per clock cycle
  Instruction register
    Op-code for current instruction
    Determines which micro-instructions are performed
  Flags
    State of CPU
    Results of previous operations
  From control bus
    Interrupts
    Acknowledgements

Control Signals - output
  Within CPU
    Cause data movement
    Activate specific functions
  Via control bus
    To memory
    To I/O modules

Example Control Signal Sequence - Fetch
  MAR <- (PC)
    Control unit activates signal to open gates between PC and MAR
  MBR <- (memory)
    Open gates between MAR and address bus
    Memory read control signal
    Open gates between data bus and MBR

Internal Organization
  Usually a single internal bus
  Gates control movement of data onto and off the bus
  Control signals control data transfer to and from external systems bus
  Temporary registers needed for proper operation of ALU

Hardwired Implementation (1)
  Control unit inputs
  Flags and control bus
    Each bit means something
  Instruction register
    Op-code causes different control signals for each different instruction
    Unique logic for each op-code
    Decoder takes encoded input and produces single output
    n binary inputs and 2^n outputs
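The decoder described above turns an n-bit op-code into 2^n output lines of which exactly one is active. A minimal sketch follows; the function name decode and the 3-bit op-code are illustrative choices.

```python
def decode(opcode, n):
    """n-to-2**n decoder: exactly one of the output lines is 1."""
    if not 0 <= opcode < 2 ** n:
        raise ValueError("opcode does not fit in n bits")
    return [1 if i == opcode else 0 for i in range(2 ** n)]

# Illustrative 3-bit op-code 101 (decimal 5) selects output line 5 of 8.
outputs = decode(0b101, 3)
print(outputs)
```

In a hardwired control unit each of these output lines feeds the logic that is unique to one op-code.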

Hardwired Implementation (2)
  Clock
    Repetitive sequence of pulses
    Useful for measuring duration of micro-ops
    Must be long enough to allow signal propagation
    Different control signals at different times within instruction cycle
    Need a counter with different control signals for t1, t2 etc.

Problems With Hard Wired Designs
  Complex sequencing & micro-operation logic
  Difficult to design and test
  Inflexible design
  Difficult to add new instructions

Chapter 17

Micro-programmed Control

Implementation (1)
  All the control unit does is generate a set of control signals
  Each control signal is on or off
  Represent each control signal by a bit
  Have a control word for each micro-operation
  Have a sequence of control words for each machine code instruction
  Add an address to specify the next micro-instruction, depending on conditions
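The idea of one bit per control signal can be sketched by packing signal names into an integer control word. The signal names and their bit positions below are hypothetical; a real control unit would define its own set.

```python
# Hypothetical control signals; each one is given its own bit position.
SIGNALS = ["PC_to_MAR", "MEM_READ", "DATA_TO_MBR", "PC_INCREMENT", "MBR_to_IR"]

def control_word(*active):
    """Pack the named signals into an integer, one bit per signal."""
    word = 0
    for name in active:
        word |= 1 << SIGNALS.index(name)
    return word

def active_signals(word):
    """List the names of the signals whose bits are set in a control word."""
    return [name for i, name in enumerate(SIGNALS) if (word >> i) & 1]

# A sequence of control words, one per time unit of a fetch.
FETCH = [
    control_word("PC_to_MAR"),
    control_word("MEM_READ", "DATA_TO_MBR", "PC_INCREMENT"),
    control_word("MBR_to_IR"),
]
for word in FETCH:
    print(format(word, "05b"), active_signals(word))
```

A machine instruction then corresponds to a stored sequence of such words rather than to dedicated logic.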

Implementation (2)
  Today's large microprocessor
    Many instructions and associated register-level hardware
    Many control points to be manipulated
  This results in control memory that
    Contains a large number of words
      corresponding to the number of instructions to be executed
    Has a wide word width
      Due to the large number of control points to be manipulated

Micro-program Word Length
  Based on 3 factors
    Maximum number of simultaneous micro-operations supported
    The way control information is represented or encoded
    The way in which the next micro-instruction address is specified

Micro-instruction Types
  Each micro-instruction specifies single (or few) micro-operations to be performed
    (vertical micro-programming)
  Each micro-instruction specifies many different micro-operations to be performed in parallel
    (horizontal micro-programming)

Vertical Micro-programming
  Width is narrow
  n control signals encoded into log2(n) bits
  Limited ability to express parallelism
  Considerable encoding of control information requires external memory word decoder to identify the exact control line being manipulated

  Diagram (micro-instruction fields): Micro-instruction Address, Function Codes, Jump Condition
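The saving from encoding can be put in numbers by comparing the log2(n) bits of an encoded field with the n bits needed when every control signal has its own bit. The sketch below rounds log2(n) up to a whole number of bits; the function names and signal counts are illustrative.

```python
def encoded_field_bits(n_signals):
    """Bits needed to name one of n_signals control lines: ceil(log2 n), at least 1."""
    return max(1, (n_signals - 1).bit_length())

def unencoded_field_bits(n_signals):
    """Bits needed when each control signal has its own bit."""
    return n_signals

for n in (16, 17, 64):
    print(n, encoded_field_bits(n), unencoded_field_bits(n))
```

The encoded form is much narrower, but it can name only one signal of the group per micro-instruction, which is the limited parallelism noted above.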

Horizontal Micro-programming
  Wide memory word
  High degree of parallel operations possible
  Little encoding of control information

  Diagram (micro-instruction fields): Internal CPU Control Signals, System Bus Control Signals, Jump Condition, Micro-instruction Address

Compromise
  Divide control signals into disjoint groups
  Implement each group as separate field in memory word
  Supports reasonable levels of parallelism without too much complexity

Control Memory

  Routine                   Ends with
  Fetch cycle routine       Jump to Indirect or Execute
  Indirect cycle routine    Jump to Execute
  Interrupt cycle routine   Jump to Fetch
  Execute cycle begin       Jump to Op code routine
  AND routine               Jump to Fetch or Interrupt
  ADD routine               Jump to Fetch or Interrupt

Control Unit


Control Unit Function
  Sequence logic unit issues read command
  Word specified in control address register is read into control buffer register
  Control buffer register contents generate control signals and next address information
  Sequence logic loads new address into control address register based on next address information from control buffer register and ALU flags
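The read, issue, choose-next-address loop can be sketched as a toy interpreter. Everything concrete here is an assumption made for illustration: the layout of control memory, the field names (signals, branch_if, branch_to, next, halt) and the single flag called indirect.

```python
def run_microprogram(control_memory, start, flags, max_steps=20):
    """Trace the control signals issued by a toy micro-programmed control unit."""
    car = start                        # control address register
    issued = []
    for _ in range(max_steps):
        cbr = control_memory[car]      # word read into control buffer register
        issued.append(cbr["signals"])  # CBR contents drive the control signals
        if cbr.get("halt"):
            break
        # Sequence logic: choose the next address from the CBR and the flags.
        condition = cbr.get("branch_if")
        if condition is not None and flags.get(condition):
            car = cbr["branch_to"]
        else:
            car = cbr.get("next", car + 1)
    return issued

# Hypothetical control memory: addresses 0-2 form a fetch routine.
control_memory = {
    0: {"signals": "MAR <- (PC)"},
    1: {"signals": "MBR <- (memory); PC <- (PC) + 1"},
    2: {"signals": "IR <- (MBR)", "branch_if": "indirect", "branch_to": 10, "next": 20},
    10: {"signals": "indirect cycle", "next": 20},
    20: {"signals": "execute cycle", "halt": True},
}
print(run_microprogram(control_memory, 0, {"indirect": True}))
```

With the indirect flag clear the trace goes straight from the fetch routine to the execute routine; with it set, the indirect routine runs in between.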

Design Considerations
  Size of microinstructions
  Address generation time
    Determined by instruction register
      Once per cycle, after instruction is fetched
    Next sequential address
      Common in most designs
    Branches
      Both conditional and unconditional

Advantages and Disadvantages
  Simplifies design of control unit
    Cheaper
    Less error-prone
  Slower

Sequencing Techniques
  Based on current microinstruction, condition flags, contents of IR, control memory address must be generated
  Based on format of address information
    Two address fields
    Single address field
    Variable format

Address Generation
  Explicit
    Two-field
    Unconditional branch
    Conditional branch
  Implicit
    Mapping
    Addition
    Residual control

Execution
  The cycle is the basic event
  Each cycle is made up of two events
    Fetch
      Determined by generation of microinstruction address
    Execute

Thanks

