
Chapter 7: Memory System Design


Topics
7.1 Introduction: The Components of the Memory System
7.2 RAM Structure: The Logic Designer's Perspective
7.3 Memory Boards and Modules

Computer Systems Design and Architecture by V. Heuring and H. Jordan

© 1997 V. Heuring and H. Jordan


Fig 7.1 The CPU-Memory Interface

[Figure: the CPU (register file, m-bit MAR, w-bit MDR) is connected to main memory (2^m words of s bits, addresses 0 to 2^m - 1) by an m-bit address bus (A0 to Am-1), a b-bit data bus (D0 to Db-1), and the control signals R/W, REQUEST, and COMPLETE.]

Sequence of events:

Read:
1. CPU loads MAR, issues Read, and REQUEST
2. Main memory transmits words to MDR
3. Main memory asserts COMPLETE

Write:
1. CPU loads MAR and MDR, asserts Write, and REQUEST
2. Value in MDR is written into address in MAR
3. Main memory asserts COMPLETE
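The Read and Write sequences can be sketched as a small simulation. This is only an illustrative model: the class names `MainMemory` and `CPU`, the method names, and the single-call handshake are assumptions made for the example, not part of the original slides.

```python
class MainMemory:
    """Toy main memory with 2**m words of s bits (hypothetical model)."""

    def __init__(self, m: int, s: int = 8):
        self.word_mask = (1 << s) - 1
        self.cells = [0] * (1 << m)
        self.complete = False  # models the COMPLETE control line

    def request(self, address: int, read: bool, data: int = 0) -> int:
        """Service one REQUEST and assert COMPLETE when done."""
        self.complete = False
        if not read:
            self.cells[address] = data & self.word_mask
        value = self.cells[address]
        self.complete = True
        return value


class CPU:
    def __init__(self, memory: MainMemory):
        self.memory = memory
        self.mar = 0  # memory address register
        self.mdr = 0  # memory data register

    def read(self, address: int) -> int:
        self.mar = address                                   # 1. load MAR, issue Read and REQUEST
        self.mdr = self.memory.request(self.mar, read=True)  # 2. memory transmits the word to MDR
        assert self.memory.complete                          # 3. memory asserts COMPLETE
        return self.mdr

    def write(self, address: int, value: int) -> None:
        self.mar, self.mdr = address, value                       # 1. load MAR and MDR, assert Write and REQUEST
        self.memory.request(self.mar, read=False, data=self.mdr)  # 2. MDR value is stored at the MAR address
        assert self.memory.complete                               # 3. memory asserts COMPLETE


cpu = CPU(MainMemory(m=4))
cpu.write(3, 0xAB)
value = cpu.read(3)
```

A real interface would overlap these steps in time and wait on COMPLETE; the sketch collapses each transaction into one call to keep the ordering visible.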

Tbl 7.1 Some Memory Properties

Symbol   Definition                           Intel 8088     Intel 8086     PowerPC 601
w        CPU word size                        16 bits        16 bits        64 bits
m        Bits in a logical memory address     20 bits        20 bits        32 bits
s        Bits in smallest addressable unit    8 bits         8 bits         8 bits
b        Data bus size                        8 bits         16 bits        64 bits
2^m      Memory word capacity, s-sized words  2^20 words     2^20 words     2^32 words
2^m x s  Memory bit capacity                  2^20 x 8 bits  2^20 x 8 bits  2^32 x 8 bits


Tbl 7.2 Memory Performance Parameters

Symbol                  Definition         Units       Meaning
ta                      Access time        time        Time to access a memory word
tc                      Cycle time         time        Time from start of access to start of next access
k                       Block size         words       Number of words per block
(bandwidth)             Bandwidth          words/time  Word transmission rate
tl                      Latency            time        Time to access first word of a sequence of words
tbl = tl + k/bandwidth  Block access time  time        Time to access an entire block of words

(Information is often stored and moved in blocks at the cache and disk level.)
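As a worked instance of the block access time relation (latency plus block size divided by bandwidth), here is a minimal sketch. The function name and the sample figures (10 ms latency, a 4096-byte block, 50 MB/s) are illustrative assumptions, and the block is measured in bytes rather than words for simplicity.

```python
def block_access_time(latency_s: float, block_size: float, bandwidth_per_s: float) -> float:
    """Time to access a whole block: latency + block size / bandwidth.

    block_size and bandwidth_per_s must use the same unit (words or bytes).
    """
    return latency_s + block_size / bandwidth_per_s


# Assumed disk-like values: the latency term dominates the transfer term.
t_block = block_access_time(latency_s=10e-3, block_size=4096, bandwidth_per_s=50e6)
transfer_part = 4096 / 50e6
```

With these assumed numbers the transfer term is under 0.1 ms, so nearly all of the block time is latency, which is why moving data in blocks pays off at the slower levels.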

Table 7.3 The Memory Hierarchy, Cost, and Performance

Some typical values (as of 2003-4; they go out of date immediately):

Component        CPU                Cache                        Main Memory   Disk Memory  Tape Memory
Access           Random             Random                       Random        Direct       Sequential
Capacity, bytes  64-1024+           8KB-8MB                      64MB-2GB      8GB          1TB
Latency          0.4-10 ns          0.4-20 ns                    10-50 ns      10 ms        10 ms-10 s
Block size       1 word             16 words                     16 words      4KB          4KB
Bandwidth        System clock rate  System clock rate to 80MB/s  10-4000 MB/s  50MB/s       1MB/s
Cost/MB          High               $10                          $0.25         $0.002       $0.01


Fig 7.3 Conceptual Structure of a Memory Cell


Regardless of the technology, all RAM memory cells must provide these four functions: Select, DataIn, DataOut, and R/W.
[Figure: memory cell with Select, DataIn, DataOut, and R/W connections.]

This static RAM cell is unrealistic in practice, but it is functionally correct. We will discuss more practical designs later.
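The four functions of a cell can be expressed as a tiny behavioral model. This is a sketch only: the class name `MemoryCell`, the `access` method, and the convention that an unselected cell returns `None` (output not driven) are assumptions for illustration.

```python
class MemoryCell:
    """Behavioral one-bit cell with Select, DataIn, DataOut, and R/W."""

    def __init__(self):
        self._bit = 0

    def access(self, select: bool, read: bool, data_in: int = 0):
        """Return DataOut on a selected Read; otherwise return None."""
        if not select:
            return None          # cell ignores DataIn and does not drive DataOut
        if read:
            return self._bit     # R/W = Read: present the stored bit
        self._bit = data_in & 1  # R/W = Write: latch DataIn
        return None


cell = MemoryCell()
cell.access(select=True, read=False, data_in=1)
stored = cell.access(select=True, read=True)
```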

Fig 7.4 An 8-Bit Register as a 1-D RAM Array


The entire register is selected with one select line, and uses one R/W line
[Figure: eight D cells sharing one Select line and one R/W line, with data lines d0 to d7.]

Data bus is bidirectional and buffered. (Why?)



Fig 7.5 A 4 x 8 2-D Memory Cell Array


A 2-to-4 line decoder selects one of the four 8-bit arrays.

[Figure: a 2-to-4 decoder driven by the 2-bit address (A1, A0) selects one of four rows of eight D cells. R/W is common to all cells. A bidirectional, buffered 8-bit data bus carries d0 to d7.]
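A behavioral sketch of the 4 x 8 array follows, assuming a one-hot 2-to-4 decoder and a shared R/W line. The names `decode_2_to_4` and `Memory4x8` are hypothetical, and the bidirectional bus is modeled simply as an argument in and a return value out.

```python
def decode_2_to_4(a1: int, a0: int) -> list:
    """One-hot decode of a 2-bit address into four select lines."""
    index = (a1 << 1) | a0
    return [1 if i == index else 0 for i in range(4)]


class Memory4x8:
    """Four 8-bit rows; exactly one row is selected per access."""

    def __init__(self):
        self.rows = [0] * 4

    def access(self, a1: int, a0: int, read: bool, data_in: int = 0):
        select = decode_2_to_4(a1, a0)
        for row, selected in enumerate(select):
            if selected:
                if read:
                    return self.rows[row]      # selected row drives the data bus
                self.rows[row] = data_in & 0xFF  # selected row latches the bus
        return None


mem = Memory4x8()
mem.access(1, 0, read=False, data_in=0x5A)
row2 = mem.access(1, 0, read=True)
```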



Fig 7.6 A 64 K x 1 Static RAM Chip


The approximately square array fits the IC design paradigm.

Selecting rows separately from columns means only 256 x 2 = 512 circuit elements instead of 65536 circuit elements!

CS, Chip Select, allows chips in arrays to be selected individually.

[Figure: row address A0-A7 feeds an 8-to-256 row decoder driving a 256 x 256 cell array. Column address A8-A15 controls a 256-to-1 mux for reading and a 1-to-256 demux for writing. R/W and CS are control inputs.]

This chip requires 21 pins including power and ground, and so will fit in a 22-pin package.
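The row/column split and the 512-versus-65536 comparison can be checked with a short sketch. The function names are hypothetical; the assignment of A0-A7 to the row and A8-A15 to the column follows the figure labels.

```python
def split_address(address: int, row_bits: int = 8, col_bits: int = 8):
    """Split an address into (row, column): low bits A0..A7 are the row,
    high bits A8..A15 the column."""
    if not 0 <= address < (1 << (row_bits + col_bits)):
        raise ValueError("address out of range")
    row = address & ((1 << row_bits) - 1)
    col = address >> row_bits
    return row, col


def select_line_count(row_bits: int, col_bits: int) -> int:
    """Select lines needed when rows and columns are decoded separately."""
    return (1 << row_bits) + (1 << col_bits)


row, col = split_address(0x1234)
two_dimensional = select_line_count(8, 8)  # 256 + 256
one_dimensional = 1 << 16                  # one select line per cell
```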

Fig 7.7 A 16 K x 4 SRAM Chip


There is little difference between this chip and the previous one, except that there are four 64-to-1 multiplexers instead of one 256-to-1 multiplexer.

[Figure: row address A0-A7 feeds an 8-to-256 row decoder driving four 64 x 256 cell arrays. Column address A8-A13 (6 bits) controls four 64-to-1 muxes and four 1-to-64 demuxes, one per data bit. R/W and CS are control inputs.]

This chip requires 24 pins including power and ground, and so will require a 24-pin package. Package size and pin count can dominate chip cost.

Fig 7.8 Matrix and Tree Decoders


2-level decoders are limited in size because of gate fan-in. Most technologies limit fan-in to ~8. When decoders must be built with fan-in >8, then additional levels of gates are required. Tree and matrix decoders are two ways to design decoders with large fan-in:
[Figure: 3-to-8 line tree decoder constructed from 2-input gates (inputs x0 to x2, outputs m0 to m7).]

[Figure: 4-to-16 line matrix decoder constructed from 2-input gates (2-to-4 decoders on inputs x0 to x3, outputs m0 to m15).]
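The matrix decoder idea can be sketched with bitwise operations standing in for gates: two 2-to-4 decoders feed a 4 x 4 grid of 2-input AND gates. The function names and the output ordering (m_i for the binary value x3 x2 x1 x0 = i) are assumptions made for the example.

```python
def decode_2_to_4(x1: int, x0: int) -> list:
    """2-to-4 decoder built from inverters and 2-input ANDs."""
    n1, n0 = 1 - x1, 1 - x0
    return [n1 & n0, n1 & x0, x1 & n0, x1 & x0]


def matrix_decode_4_to_16(x3: int, x2: int, x1: int, x0: int) -> list:
    """4-to-16 matrix decoder: one 2-input AND per output, so no gate
    needs a fan-in above 2."""
    low = decode_2_to_4(x1, x0)
    high = decode_2_to_4(x3, x2)
    return [high[i // 4] & low[i % 4] for i in range(16)]


outputs = matrix_decode_4_to_16(1, 0, 1, 1)  # input value 0b1011 = 11
```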



Fig 7.9 Six-Transistor Static RAM Cell


This is a more practical design than the 8-gate design shown earlier. A value is read by precharging the bit lines to a value halfway between a 0 and a 1, while asserting the word line. This allows the latch to drive the bit lines to the value stored in the latch.

[Figure: dual-rail data lines for reading and writing (b_i and its complement), active loads tied to +5 V, the storage cell, word line w_i, and switches to control access to the cell. Additional cells share the bit lines. Column select comes from the column address decoder. Sense/write amplifiers sense and amplify data on Read and drive b_i and its complement on Write. R/W, CS, and data line d_i connect to the amplifiers.]


Fig 7.10 Static RAM Read Operation


[Timing diagram: memory address, Read/write, CS, and Data waveforms, with tAA marked.]

tAA: access time from address, the time required by the RAM array to decode the address and provide the value to the data bus.

Fig 7.11 Static RAM Write Operations


[Timing diagram: memory address, Read/write, CS, and Data waveforms, with tw marked.]

tw: write time, the time the data must be held valid in order to decode the address and store the value in the memory cells.


Fig 7.12 Dynamic RAM Cell Organization

- The capacitor stores charge for a 1, no charge for a 0.
- The capacitor will discharge in 4-15 ms.
- Refresh the capacitor by reading (sensing) the value on the bit line, amplifying it, and placing it back on the bit line, where it recharges the capacitor.
- Write: place the value on the bit line and assert the word line.
- Read: precharge the bit line, assert the word line, and sense the value on the bit line with the sense amplifier.

This need to refresh the storage cells of dynamic RAM chips complicates DRAM system design.

[Figure: a single bit line b_i, a switch controlled by word line w_j that gives access to the cell, and the storage capacitor. Additional cells share the bit line. Column select comes from the column address decoder. Sense/write amplifiers sense and amplify data on Read and drive the bit line on Write. R/W, CS, and data line d_i connect to the amplifiers.]

Fig 7.13 Dynamic RAM Chip Organization

- Addresses are time-multiplexed on the address bus, using RAS and CAS as strobes of rows and columns.
- CAS is normally used as the CS function.

Notice pin counts:
- Without address multiplexing: 27 pins including power and ground.
- With address multiplexing: 17 pins including power and ground.

[Figure: address lines A0-A9 feed 10-bit row latches and a decoder selecting one of 1024 rows of a 1024 x 1024 cell array; 1024 sense/write amplifiers and column latches; 10 column address latches with 1-of-1024 muxes and demuxes; control logic driven by RAS, CAS, and R/W.]
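Time-multiplexing the address over ten pins can be sketched as follows. The names `multiplex` and `DramChip` are hypothetical, and sending the high half as the row and the low half as the column is an assumption for illustration; the slides do not say which half goes first.

```python
ROW_COL_BITS = 10  # 10 row bits + 10 column bits address a 1024 x 1024 array


def multiplex(address: int, bits: int = ROW_COL_BITS):
    """Split a 20-bit address into the two halves sent over the same pins."""
    mask = (1 << bits) - 1
    return (address >> bits) & mask, address & mask


class DramChip:
    """Latches the shared address pins on the RAS and CAS strobes."""

    def __init__(self, bits: int = ROW_COL_BITS):
        self.bits = bits
        self.mask = (1 << bits) - 1
        self.row_latch = None
        self.col_latch = None

    def strobe_ras(self, address_pins: int) -> None:
        self.row_latch = address_pins & self.mask  # row half captured on RAS

    def strobe_cas(self, address_pins: int) -> None:
        self.col_latch = address_pins & self.mask  # column half captured on CAS

    def selected_cell(self) -> int:
        return (self.row_latch << self.bits) | self.col_latch


chip = DramChip()
row_half, col_half = multiplex(0xABCDE)
chip.strobe_ras(row_half)
chip.strobe_cas(col_half)
cell_index = chip.selected_cell()
```

The same ten pins carry both halves, which is where the ten-pin saving between the two pin counts comes from.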


Figs 7.14, 7.15 DRAM Read and Write Cycles


[Timing diagrams: typical DRAM Read operation and typical DRAM Write operation. In both, the memory address lines carry the row address and then the column address. RAS is asserted for tRAS and followed by the precharge interval tPrechg. CAS is asserted after RAS. The Read diagram shows R/W and Data with access time tA and cycle time tC. The Write diagram shows Data with tDHR (data hold from RAS) and cycle time tC.]

Notice that it is the bit line precharge operation that causes the difference between access time and cycle time.


DRAM Refresh and Row Access


- Refresh is usually accomplished by a "RAS-only" cycle. The row address is placed on the address lines and RAS is asserted. This refreshes the entire row. CAS is not asserted. The absence of a CAS phase signals the chip that a row refresh is requested, and thus no data is placed on the external data lines.
- Many chips use "CAS before RAS" to signal a refresh. The chip has an internal counter, and whenever CAS is asserted before RAS, it is a signal to refresh the row pointed to by the counter and to increment the counter.
- Most DRAM vendors also supply one-chip DRAM controllers that encapsulate the refresh and other functions.
- Page mode, nibble mode, and static column mode allow rapid access to the entire row that has been read into the column latches.
- Video RAMs (VRAMs) clock an entire row into a shift register, where it can be rapidly read out, bit by bit, for display.
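The CAS-before-RAS scheme amounts to an internal row counter that advances on every refresh request. The sketch below models that counter and the spacing between refresh requests; the class and function names, the 1024-row array, and the 4 ms retention figure are assumptions chosen for illustration.

```python
class RefreshCounter:
    """Internal row counter used by CAS-before-RAS refresh."""

    def __init__(self, rows: int):
        self.rows = rows
        self.next_row = 0

    def cas_before_ras(self) -> int:
        """Refresh the row the counter points at, then advance the counter."""
        row = self.next_row
        self.next_row = (self.next_row + 1) % self.rows
        return row


def refresh_interval(retention_s: float, rows: int) -> float:
    """Longest allowed spacing between refresh requests if every row must
    be refreshed once per retention period."""
    return retention_s / rows


counter = RefreshCounter(rows=1024)
first_rows = [counter.cas_before_ras() for _ in range(3)]
interval = refresh_interval(retention_s=4e-3, rows=1024)
```

Under these assumed numbers a refresh request is needed roughly every 3.9 microseconds, which is the kind of bookkeeping a one-chip DRAM controller takes over.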

Types of High-Speed RAM


- FPM DRAM (Fast Page Mode DRAM)
- EDO DRAM (Extended Data Out DRAM)
- SDRAM (Synchronous DRAM)
- RDRAM (Rambus DRAM)
- DDR SDRAM (Double Data Rate SDRAM)
- SynchLink DRAM
- Dual Port Graphics Buffer
- SGRAM (Synchronous Graphics RAM)
- DDR SGRAM (Double Data Rate SGRAM)
- SSRAM (Synchronous SRAM)
- DDR SSRAM (Double Data Rate SSRAM)


FPM DRAM (Fast Page Mode DRAM)


- A DRAM with a page mode that is faster than conventional DRAM access.
- In random access, FPM DRAM performs data I/O only once per RAS# cycle. In page mode, it can perform data I/O continuously during one RAS# cycle.
- In page mode, the access time of the second and subsequent data items is shorter.
- Other DRAM types, such as NB (Nibble Mode) and SC (Static Column Mode), reach higher speed using specifications different from those of FPM DRAM.
- In 1995, however, FPM DRAM represented approximately 90% of all DRAM shipments.


EDO DRAM (Extended Data Out DRAM)


- Faster than FPM DRAM.
- If the read cycle of FPM DRAM is made faster, the data output time becomes shorter. EDO DRAM has an extended output function, so the data output time does not shrink even at higher speed.
- It holds the data valid even after the signal that "strobes" the column address goes inactive.
- This allows faster CPUs to manage time more efficiently: while the EDO DRAM is retrieving an instruction for the microprocessor, the CPU can perform other tasks without concern that the data will become invalid.
- Because EDO DRAM and FPM DRAM come in packages with the same pin configuration, EDO DRAM can easily replace FPM DRAM.
- In the middle of 1996, EDO DRAM represented approximately 50% of all DRAM shipments.

SDRAM (Synchronous DRAM)


- A type of DRAM that operates in synchronization with an input clock.
- It latches each control signal at the rising edge of the input clock and inputs/outputs data in synchronization with the clock signal.
- With synchronous control, the DRAM latches information from the processor under control of the system clock. These latches store the addresses, data, and control signals, which frees the processor to handle other tasks.
- After a specific number of clock cycles the data becomes available, and the processor can read it from the output lines.
- SDRAM represented more than 50% of DRAM shipments around 1998.


DDR (Double Data Rate)


- A technology designed to double the data rate of the memory without raising its clock frequency.
- Activates output on both the rising and falling edges of the system clock rather than on just the rising edge, potentially doubling output.
- Variants:
  - DDR SDRAM (Double Data Rate SDRAM)
  - DDR SSRAM (Double Data Rate SSRAM)
  - DDR SGRAM (Double Data Rate SGRAM)
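The effect of using both clock edges can be illustrated with a peak transfer rate calculation. The function name and the sample values (a 100 MHz clock and an 8-byte data bus) are assumptions for the example, not figures from the slides.

```python
def peak_transfer_rate(clock_hz: int, bus_bytes: int, transfers_per_clock: int) -> int:
    """Peak bytes per second: clock rate x bus width x transfers per cycle."""
    return clock_hz * bus_bytes * transfers_per_clock


CLOCK_HZ = 100_000_000  # assumed 100 MHz memory clock
BUS_BYTES = 8           # assumed 64-bit data bus

sdr_rate = peak_transfer_rate(CLOCK_HZ, BUS_BYTES, transfers_per_clock=1)  # rising edge only
ddr_rate = peak_transfer_rate(CLOCK_HZ, BUS_BYTES, transfers_per_clock=2)  # both edges
```

The clock frequency is identical in both cases; only the number of transfers per cycle changes, and this is a peak figure that ignores row activation and other latencies.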


Types of High-Speed RAM


Comparison: Access Time of High-Speed DRAM

- Random access time: the access time when both the row address and the column address differ from those of the preceding cycle.
- Burst access time: the access time when the row address is the same as in the preceding cycle and the column address differs.
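The two definitions can be turned into a simple classifier over an address trace. This is a simplified sketch: the function name and the 10-bit column field are assumptions, and the classification looks only at whether the row matches the preceding access, so repeated accesses to the same column are also counted as burst.

```python
def classify_accesses(addresses, col_bits: int = 10):
    """Label each access 'burst' if it hits the same row as the preceding
    access, otherwise 'random'."""
    kinds = []
    open_row = None
    for address in addresses:
        row = address >> col_bits
        kinds.append("burst" if row == open_row else "random")
        open_row = row
    return kinds


trace = [0x000, 0x001, 0x002, 0x400, 0x401]
kinds = classify_accesses(trace)
```

A trace that walks through consecutive addresses pays the random access time once per row and the shorter burst time for the rest.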


Memory Bus Clock Each High-speed DRAM Can Support


Fig 7.16 A 2-D CMOS ROM Chip


[Figure: 2-D CMOS ROM array with Address inputs to a row decoder, a +V supply, and a CS input.]


Tbl 7.4 ROM Types

ROM Type             Cost              Programmability    Time to Program  Time to Erase
Mask-programmed ROM  Very inexpensive  At factory only    Weeks            N/A
PROM                 Inexpensive       Once, by end user  Seconds          N/A
EPROM                Moderate          Many times         Seconds          20 minutes
Flash EPROM          Expensive         Many times         100 µs           1 s, large block
EEPROM               Very expensive    Many times         100 µs           10 ms, byte


Memory Boards and Modules


- There is a need for memories that are larger and wider than a single chip.
- Chips can be organized into "boards." Boards may not be actual, physical boards, but may consist of structured chip arrays present on the motherboard.
- A board or collection of boards makes up a memory module.
- Memory modules:
  - Satisfy the processor-main memory interface requirements
  - May have DRAM refresh capability
  - May expand the total main memory capacity
  - May be interleaved to provide faster access to blocks of words


Fig 7.17 General Structure of a Memory Chip


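This is a slightly different view of the memory chip than the previous ones.

Multiple chip selects ease the assembly of chips into chip arrays. This is usually provided by an external AND gate.

[Figure: an m-bit Address feeds the row decoder of the memory cell array; an I/O multiplexer connects the array to the s-bit Data lines; R/W and multiple chip selects (combined into CS) control the chip. Alongside is the block symbol with Address (m), R/W, CS, and Data (s).]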

Fig 7.18 Word Assembly from Narrow Chips


All chips have common CS, R/W, and Address lines.
[Figure: p chips side by side, each with Address, R/W, CS, and s-bit Data. Select, Address, and R/W are shared, and the data lines are concatenated into a p x s-bit word.]

p chips expand the word size from s bits to p x s bits.
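Word assembly from narrow chips can be sketched as p chips that all see the same address and each hold one s-bit slice of the word. The class names `NarrowChip` and `WideMemory`, and the choice to put the least significant slice in chip 0, are assumptions for illustration.

```python
class NarrowChip:
    """A chip with 2**m words of s bits."""

    def __init__(self, m: int, s: int):
        self.mask = (1 << s) - 1
        self.words = [0] * (1 << m)

    def read(self, address: int) -> int:
        return self.words[address]

    def write(self, address: int, value: int) -> None:
        self.words[address] = value & self.mask


class WideMemory:
    """p chips with common Address, R/W, and CS form a (p x s)-bit word."""

    def __init__(self, p: int, m: int, s: int):
        self.s = s
        self.chips = [NarrowChip(m, s) for _ in range(p)]

    def write(self, address: int, value: int) -> None:
        for i, chip in enumerate(self.chips):
            chip.write(address, value >> (i * self.s))  # chip i stores slice i

    def read(self, address: int) -> int:
        word = 0
        for i, chip in enumerate(self.chips):
            word |= chip.read(address) << (i * self.s)  # concatenate the slices
        return word


memory = WideMemory(p=4, m=4, s=4)  # four 16 x 4 chips act as one 16 x 16 memory
memory.write(5, 0xBEEF)
word = memory.read(5)
```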


Fig 7.19 Increasing the Number of Words by a Factor of 2^k

The additional k address bits are used to select one of 2^k chips, each of which has 2^m words.

[Figure: of the (m+k)-bit address, k bits feed a k-to-2^k decoder that drives the CS input of one chip, and m bits go to the Address inputs of every chip. R/W and the s-bit Data lines are shared.]

Word size remains at s bits.


Fig 7.20 Chip Matrix Using Two Chip Selects

Multiple chip select lines are used to replace the last level of gates in this matrix decoder scheme.

This scheme simplifies the decoding from use of a (q+k)-bit decoder to using one q-bit and one k-bit decoder.

[Figure: the (m+q+k)-bit address is split into m bits for the chip Address inputs and q and k bits for the horizontal and vertical decoders, whose outputs drive the CS1 and CS2 inputs of each chip in the matrix. R/W and the s-bit Data lines are shared. The matrix selects one of 2^(m+q+k) s-bit words.]



Fig 7.21 Three-Dimensional Dynamic RAM Array

- CAS is used to enable the top decoder in the decoder tree.
- Use one 2-D array for each bit. Each 2-D array is on a separate board.

[Figure: the kc + kr high address bits feed a decoder tree (2^kc and 2^kr decoders, with CAS as the enable) whose outputs drive the RAS and CAS inputs of the chips. The multiplexed address (m/2 bits), R/W, and Data lines go to every chip. The w data bits come from w 2-D arrays.]


Fig 7.22 A Memory Module and Its Interface


Must provide:
- Read and Write signals.
- Ready: memory is ready to accept commands.
- Address: to be sent with the Read/Write command.
- Data: sent with Write, or available upon Read when Ready is asserted.
- Module select: needed when there is more than one module.

Control signal generator:
- For SRAM, just strobes data on Read and provides Ready on Read/Write.
- For DRAM, also provides CAS, RAS, and R/W, multiplexes the address, generates refresh signals, and provides Ready.

[Figure: bus interface with a (k+m)-bit Address into an address register (k bits to chip/board selection, m bits to the memory boards and/or chips); Module select, Read, Write, and Ready connected to the control signal generator; and a data register on the Data lines.]


Fig 7.23 Dynamic RAM Module with Refresh Control


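[Figure: the (k+m)-bit address enters the address register. The k bits go to chip/board selection, which produces the board and chip selects. The m bits go as two m/2-bit halves to the address multiplexer, which also takes the m/2-bit output of the refresh counter and drives the address lines of the dynamic RAM array. The refresh clock and control logic exchanges Request, Grant, and Refresh signals with the memory timing generator, which receives Module select, Read, and Write and produces RAS, CAS, R/W, and Ready. A w-bit data register connects the data lines to the bus.]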

Fig 7.24 Two Kinds of Memory Module Organization

Memory modules are used to allow access to more than one word simultaneously.

(a) Consecutive words in consecutive modules (interleaving): the k least significant bits of the (j + k = m)-bit address select one of the 2^k modules (Module 0 to Module 2^k - 1), and the j most significant bits address the word within the module.

(b) Consecutive words in the same module: the k most significant bits of the (k + j = m)-bit address select the module, and the j least significant bits address the word within the module.
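The two organizations differ only in which address bits pick the module. A minimal sketch, with hypothetical function names and small sample sizes (k = 2, j = 4) chosen for illustration:

```python
def interleaved(address: int, k: int):
    """Organization (a): the k least significant bits select the module."""
    module = address & ((1 << k) - 1)
    offset = address >> k
    return module, offset


def consecutive(address: int, j: int):
    """Organization (b): the bits above the j-bit offset select the module."""
    module = address >> j
    offset = address & ((1 << j) - 1)
    return module, offset


K, J = 2, 4  # 4 modules of 16 words each, 6-bit addresses
interleaved_modules = [interleaved(a, K)[0] for a in range(4)]
consecutive_modules = [consecutive(a, J)[0] for a in range(4)]
```

For four consecutive addresses, organization (a) touches four different modules while organization (b) stays in one, which is what lets interleaving overlap accesses to a block of words.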


Fig 7.25 Timing of Multiple Modules on a Bus


If the time to transmit information over the bus, tb, is less than the module cycle time, tc, it is possible to time-multiplex information transmission to several modules.

Example: store one word of each cache line in a separate module.

Main memory address: [ Word | Module No. ]

This provides successive words in successive modules.

[Timing diagram: the bus carries the module 0 read address and then the module 3 write address and data, each taking tb. The module 0 read and the module 3 write each take tc and overlap in time. The module 0 data return then uses the bus for tb.]

With interleaving of 2^k modules, and tb < tc/2^k, it is possible to get a 2^k-fold increase in memory bandwidth, provided memory requests are pipelined. DMA satisfies this requirement.
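The bandwidth claim can be checked with a simplified timeline model. The assumptions are loud ones: requests go to modules in round-robin order, each request occupies the bus for tb and its module for tc, and the time to return data over the bus is ignored. The function name and the sample values (tb = 10, tc = 100) are hypothetical.

```python
def time_to_access(n_words: int, modules: int, tb: float, tc: float) -> float:
    """Finish time for n_words pipelined requests spread round-robin
    across interleaved modules (data-return bus time is ignored)."""
    module_free_at = [0.0] * modules
    bus_free_at = 0.0
    finish = 0.0
    for i in range(n_words):
        module = i % modules
        start = max(bus_free_at, module_free_at[module])
        bus_free_at = start + tb             # bus is busy sending this request
        module_free_at[module] = start + tc  # module is busy for one cycle time
        finish = start + tc
    return finish


single = time_to_access(n_words=400, modules=1, tb=10, tc=100)
four_way = time_to_access(n_words=400, modules=4, tb=10, tc=100)  # tb < tc/4
speedup = single / four_way
```

With these assumed values the speedup approaches the number of modules as the number of pipelined requests grows.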

Memory System Performance


Breaking the memory access process into steps:

- For all accesses:
  - transmission of address to memory
  - transmission of control information to memory (R/W, Request, etc.)
  - decoding of address by memory
- For a Read:
  - return of data from memory
  - transmission of completion signal
- For a Write:
  - transmission of data to memory (usually simultaneous with address)
  - storage of data into memory cells
  - transmission of completion signal

The next slide shows the access process in more detail.



Fig 7.26 Sequence of Steps in Accessing Memory


[Figure (a), static RAM behavior: step sequence with Command to memory, Address to memory, and Address decode for a Read or a Write; Return data for a Read; Write data to memory and Write data for a Write; then Complete. Access time ta and cycle time tc are marked.]

[Figure (b), dynamic RAM behavior: step sequence with Row address & RAS, Column address & CAS, and R/W; Return data for a Read or Write data to memory for a Write; then Precharge and Complete. A pending refresh adds Refresh, Precharge, and Complete steps. Access time ta and cycle time tc are marked.]

Hidden refresh cycle. A normal cycle would exclude the pending refresh step.


Example SRAM Timings


Approximate values for static RAM Read timing:
- Address bus drivers turn-on time: 40 ns
- Bus propagation and bus skew: 10 ns
- Board select decode time: 20 ns
- Time to propagate select to another board: 30 ns
- Chip select: 20 ns

PROPAGATION TIME FOR ADDRESS AND COMMAND TO REACH CHIP: 120 ns

- On-chip memory read access time: 80 ns
- Delay from chip to memory board data bus: 30 ns
- Bus driver and propagation delay (as before): 50 ns

TOTAL MEMORY READ ACCESS TIME: 280 ns

Moral: 70 ns chips do not necessarily provide 70 ns access time!
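The two totals are plain sums of the listed delays, which a few lines of arithmetic confirm. The dictionary keys are shorthand labels introduced for this sketch.

```python
# Delays on the way to the chip, in nanoseconds
address_path_ns = {
    "address bus drivers turn-on": 40,
    "bus propagation and skew": 10,
    "board select decode": 20,
    "propagate select to another board": 30,
    "chip select": 20,
}

# Delays from the chip back to the requester, in nanoseconds
data_path_ns = {
    "on-chip read access": 80,
    "chip to memory board data bus": 30,
    "bus driver and propagation": 50,
}

time_to_reach_chip_ns = sum(address_path_ns.values())
total_read_access_ns = time_to_reach_chip_ns + sum(data_path_ns.values())
chip_share = data_path_ns["on-chip read access"] / total_read_access_ns
```

The chip's own access time is well under a third of the total in this example, so the surrounding bus and select logic dominates the access time seen by the CPU.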

[Figure: the memory module bus interface (a (k+m)-bit Address into the address register, board select, Module select, Read, Write, and Ready at the control signal generator, and a w-bit data register) combined with the chip matrix using two chip selects (horizontal and vertical decoders driving CS1 and CS2, with an (m+q+k)-bit address), selecting one of 2^(m+q+k) s-bit words.]


Chapter 7 Summary

- Most memory systems are multileveled: cache, main memory, and disk.
- Static and dynamic RAM are the fastest components, and their speed has the strongest effect on system performance.
- Chips are organized into boards and modules.
- Larger, slower memory is attached to faster memory in a hierarchical structure.
