
ECE 485/585 Microprocessor System Design

Prof. Mark G. Faust, Maseeh College of Engineering and Computer Science

Memory

"640K ought to be enough for anybody." Bill Gates, 1981

Memory

Taxonomy of Memories
Memory Hierarchy
SRAM
  Basic Cell, Devices, Timing
Memory Organization
  Multiple banks, interleaving
DRAM
  Basic Cell, Timing
DRAM Evolution
DRAM modules
Error Correction
Memory Controllers

ECE485/585:Memory

Memory Taxonomy

Read/Write Memory, Volatile
  Random Access: SRAM, DRAM
  Non-Random Access: Shift Register, FIFO, CAM
Read/Write Memory, Non-Volatile
  EPROM, E2PROM, Flash (NAND, NOR), NVRAM
Read Only
  Mask ROM, PROM

Computer Memory Hierarchy

From Hennessy & Patterson, Computer Architecture: A Quantitative Approach (4th edition)

[Figure: memory hierarchy. Levels shown: Processor (Control, Datapath, Registers, On-Chip Cache), Second Level Cache (SRAM), Third Level Cache (SRAM), Main Memory (DRAM), Secondary Storage (Disk), Tertiary Storage (Tape). Content labels: Intermediate results, Instructions, Data, Cached DRAM, File System, Paging [Cached Files], Archive, Backup.]

Register Files

General Purpose Registers
Usually have multiple ports
  Support CPU architecture's datapaths
  Ability to read two operands, write one
Operate at CPU speed
For read operations, the register file is equivalent to a 2D array of flip-flops with tri-state outputs
For write operations, we add some additional circuitry to the basic cell

[Figure: Register File block with data ports dataa, datab, datac and select inputs sela, selb, selc]
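The two-read, one-write port behavior described on this slide can be sketched as a small behavioral model. This is only an illustration, not a circuit description: the class name, the method names, and the eight-register size are assumptions; the port names (sela, selb, selc, dataa, datab, datac) follow the figure labels.

```python
class RegisterFile:
    """Toy behavioral model: two read ports, one write port (hypothetical names)."""

    def __init__(self, num_regs=8):
        self.regs = [0] * num_regs

    def read(self, sela, selb):
        # Reads are combinational: the outputs simply follow the select inputs.
        return self.regs[sela], self.regs[selb]

    def clock_edge(self, selc, datac, we):
        # A write happens only on a clock edge, and only when WE is asserted.
        if we:
            self.regs[selc] = datac


rf = RegisterFile()
rf.clock_edge(selc=3, datac=42, we=True)
rf.clock_edge(selc=4, datac=7, we=False)  # WE deasserted: register 4 is unchanged
dataa, datab = rf.read(sela=3, selb=4)
```

Both operands come back from a single `read` call, which mirrors the "read two operands, write one" requirement of the datapath.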

Address Decoding

Address decoder generates a one-hot code (1-of-n code) from the address (binary to unary)
The output is used for row selection
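A minimal sketch of the binary-to-one-hot decoding described above, assuming an n-bit address and 2^n row-select lines. The function name `decode_one_hot` is illustrative only.

```python
def decode_one_hot(address, n_bits):
    """Return the 2**n_bits row-select lines as a list; exactly one line is 1."""
    if not 0 <= address < (1 << n_bits):
        raise ValueError("address out of range")
    return [1 if row == address else 0 for row in range(1 << n_bits)]


# A 2-bit address selects one of four rows.
lines = decode_one_hot(2, n_bits=2)
```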

Accessing Register Files

Read: address following (asynchronous)
  Change address
  Data from new address appears on output
Write is synchronous
  If WE, input data is written to selected word on the clock edge

[Figure: Register File block with Clock, RegID, WE, Din inputs and Dout output. Timing diagram of Clock, RegID, Dout, Din, WE: RegID changes from X to Y, Dout follows with R[X] then R[Y], Din carries val.]

SRAM Technology

SRAM Cell
  6 transistors
  Which will be longer: bit lines or word lines? Bit lines!
  For density and low power, want tiny transistors
    Unable to drive long bit lines
    Precharge bit lines (Vdd/2) before read
    Use differential between bit and /bit

Write
  Write bit and /bit onto bit lines
  Select desired word (row)
    Turns on pass transistors
    Writes new value to cell
    [One inverter input will be low, turning its output high]

Read
  Select desired word (row)
  One bit line will be pulled low
  Other will remain high
  Takes long time for bit line to be pulled low with tiny transistor
  Don't need to wait: can just sense difference between two bit lines!

[Figure: SRAM cell with addr, word line, data, and the two bit lines]

Dual-ported Memory Internals

Add decoder, another set of read/write logic, bit lines, word lines
Example cell: SRAM
Repeat everything but cross-coupled inverters.
This scheme extends up to a couple more ports, then need to add additional transistors.

[Figure: cell array with decoders deca and decb on the address ports, word lines WL1 and WL2, bit line pairs b1 and b2, and two r/w logic blocks on the data ports]

Basic SRAM

Size in bits (organization)
  1Mb (256K x 4): 256K words of 4 bits
  1Mb (128K x 8): 128K words of 8 bits
Most control signals are active low
Chip Select (/CS) effectively an enable
Write Enable (/WE) controls read/write
  Write: /WE is asserted (Low), /CS is asserted (Low)
  Read: /WE is deasserted (High), /CS is asserted (Low)

[Figure: 2^n x b RAM block with address inputs A0..An-1, data inputs DIN0..DINb-1, data outputs DOUT0..DOUTb-1, and control inputs CS and WE]

SRAM Variations

Dedicated Din & Dout
  Trade pin count ($) for higher performance
  No bidirectional turnaround time required
Din & Dout often combined to save pins ($)
  A new control signal, Output Enable (/OE)

[Figure: two 2^n x b RAM blocks. One has separate DIN0..DINb-1 and DOUT0..DOUTb-1 pins with CS and WE; the other has shared D0..Db-1 pins with CS, OE, and WE.]

Simplified SRAM timing diagram

Read: Valid address, then /CS (Chip Select) asserted
Access Time: Address good to data valid
Cycle Time: Minimum time between subsequent memory operations
Write: Valid address and data with /WE asserted, then /CS asserted
  Address must be stable a setup time before /WE and /CS go low
  And hold time after one goes high

Typical SRAM Timing

/OE determines direction: Hi = Write, Lo = Read

[Figure: 2^N words x M bit SRAM with N address lines (A), data lines (D), and control inputs /WE, /OE, /CS]

Write Timing:
[Timing diagram of D, A, /OE, /WE: Data In and Write Address, with Write Setup Time and Write Hold Time marked]

Read Timing:
[Timing diagram: D goes from High Z through Junk to Data Out a Read Access Time after each Read Address]

Internal SRAM Organization (16 x 4)

[Figure: 16 words (Word 0 to Word 15) of four SRAM cells each. An address decoder on A0..A3 selects the word. Din 3..Din 0 feed write drivers gated by Write Enable; each bit column ends in a sense amp driving Dout 3..Dout 0.]

Example: Cypress SRAM

Note address following mode
Key SRAM timing parameters
  tAA: Address access time: time between a valid address being applied and valid data available on data outputs
  tRC: Read cycle time: minimum time that one address must be held on the address lines before a second address can be presented
tAA represents latency
tRC represents bandwidth (throughput)

What happens as number of bits increases?

Decoder larger and slower
Bit lines increase in length
  Large distributed RC load
  Larger, slower transistors
Remember
  Treat output as differential signal
  Precharge both bit lines high
  Memory cell pulls only one low
  Sense bit value by comparing sense lines
Make it shorter and wider!

[Figure: tall, thin array of n bits addressed by a log2(n)-bit address]

Inside a Tall Thin RAM is ...

[Figure: array of n = k x m bits. A log2(k)-bit row address selects the row; sense amps feed a mux controlled by a log2(m)-bit column address; 1 data bit out.]

Replicate for Desired Width

[Figure: the n = k x m bit array, with its log2(k)-bit row address, sense amps, mux, and log2(m)-bit column address, replicated to give 4 data bits]
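The row/column split in these figures can be sketched as follows. The sketch assumes k and m are powers of two and that the low-order address bits drive the column mux; the slides do not say which bits go where, so that ordering and the name `split_address` are assumptions.

```python
def split_address(addr, rows, cols):
    """Split a flat bit address into (row, column) for a rows x cols array.

    Assumes rows and cols are powers of two, and (as a choice made for this
    sketch) that the low-order bits select the column.
    """
    col_bits = cols.bit_length() - 1      # log2(cols)
    row = addr >> col_bits                # high-order bits: row decoder
    col = addr & (cols - 1)               # low-order bits: column mux
    if row >= rows:
        raise ValueError("address out of range")
    return row, col


# 16 bits arranged as a 4 x 4 array: address 0b1101 is row 3, column 1.
row, col = split_address(0b1101, rows=4, cols=4)
```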

Physical SRAM Array Should Be Square

Example: 16 x 1 SRAM, 4 x 4 Array

[Figure: 4 x 4 array of cells (each with IN, OUT, SEL, WR), 2-to-4 decoders, a 4-to-1 mux driving DO, data input DI, address inputs A3 A2 and A1 A0, and control inputs /WE, /CS, /OE]

Evolutionary Modifications

Add Output Enable
  Can turn off drivers and immediately start driving data for write without bus contention
Add latches to input data
  Chip Select functions this way
  Only need to hold data until latched
Add latches to output data
  Outputs available to be read while next access begun
Provide synchronous interface

Memory Organization

How do we build memory subsystems out of memory devices?

256K x 8 Memory System

4 x 64K x 8 RAM chips
256K: 18 address lines
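One plausible way to wire this system is sketched below: the low 16 address lines go to every 64K x 8 chip, and the top 2 lines are decoded into chip selects. The slide figure did not survive extraction, so the choice of high-order bits for chip selection is an assumption, as is the name `select_chip`.

```python
CHIP_ADDR_BITS = 16   # 64K words per chip
SYSTEM_ADDR_BITS = 18  # 256K words in the whole system


def select_chip(addr):
    """Return (chip number, address within chip) for an 18-bit system address.

    Assumption for this sketch: the two high-order bits pick the chip.
    """
    if not 0 <= addr < (1 << SYSTEM_ADDR_BITS):
        raise ValueError("address out of range")
    chip = addr >> CHIP_ADDR_BITS
    offset = addr & ((1 << CHIP_ADDR_BITS) - 1)
    return chip, offset


chip, offset = select_chip(0x2ABCD)
```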

64K x 16 Memory System

2 x 64K x 8 RAM chips

Improving Memory System Performance

[Figure: timeline showing Access Time within Cycle Time]

DRAM (Read/Write) Cycle Time >> DRAM (Read/Write) Access Time
  2:1 Why?
DRAM (Read/Write) Cycle Time
  How frequently can you initiate an access?
  Analogy: A little kid can only ask his father for money on Saturday
DRAM (Read/Write) Access Time
  How quickly will you get what you want once you initiate an access?
  Analogy: As soon as he asks, his father will give him the money
DRAM Bandwidth Limitation analogy
  What happens if he runs out of money on Wednesday?
  Ask Mom!

Increasing Bandwidth: Interleaving

Access pattern without interleaving:
[Figure: CPU and a single Memory. Start access for D1, D1 available, then start access for D2.]

Access pattern with 4-way interleaving:
[Figure: CPU and Memory Banks 0 to 3. Access Bank 0, Access Bank 1, Access Bank 2, Access Bank 3, after which we can access Bank 0 again.]

Memory Interleaving

for (i = 0; i < 16; i++) A[i] = A[i] * c + d; (assume A[0] at address 0)

read 00000
read 00001
read 00002
read 00003
read 00004

Bank 0    Bank 1    Bank 2    Bank 3
address   address   address   address
0         1         2         3
4         5         6         7
8         9         10        11
12        13        14        15
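The bank table above corresponds to taking the address modulo the number of banks. A minimal sketch, with illustrative names:

```python
NUM_BANKS = 4


def bank_of(addr):
    """Return (bank, index within bank) for low-order interleaving."""
    return addr % NUM_BANKS, addr // NUM_BANKS


# Rebuild the table from the slide: which addresses land in which bank.
layout = {
    bank: [addr for addr in range(16) if addr % NUM_BANKS == bank]
    for bank in range(NUM_BANKS)
}
```

Consecutive addresses, such as the ones touched by the loop over A[i], fall in different banks, so the next access can start while the previous bank is still finishing its cycle.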

Low Order Memory Interleaving

[Figures: low order memory interleaving, two slides]

High Order Memory Interleaving

Seen this before

[Figures: high order memory interleaving, two slides]

DRAM Technology

DRAM Cell
  1 transistor
  Charge stored on tiny capacitor
  Read is destructive: read must restore value
  Charge leaks over time: refresh

Write
  Drive bit line
  Select desired word (row)

Read
  Precharge bit line
  Select desired word (row)
  Sense charge
  Write value back (restore)

Refresh!
  Periodically read each cell (forcing write-back)

[Figure: DRAM cell with word line and bit line]

Volatile Memory Comparison

SRAM Cell
  Larger cell: lower density, higher cost/bit
  No dissipation
  Read non-destructive
  No refresh required
  Simple read: faster access
  Standard IC process: natural for integration with logic

DRAM Cell
  Smaller cell: higher density, lower cost/bit
  Needs periodic refresh
  Refresh after read
  Complex read: longer access time
  Special IC process: difficult to integrate with logic circuits
  Density impacts addressing (multiplex address lines)

[Figure: SRAM cell with word line and two bit lines; DRAM cell with word line and one bit line]

DRAM Device Pin Outs

Cost Rules!
  Fewer pins, smaller package, less $
2Mb device organized as 256K x 8 bits
  256K = 18 address bits
  9 row address bits
  9 column address bits
So Multiplex
  Data (In/Out)
    /WE asserted (low) for write
    /OE asserted (low) to enable output buffers
  Address (Row/Column)
    /RAS (Row Address Strobe) asserted after row placed on address pins
    /CAS (Column Address Strobe) asserted after column placed on address pins

[Figure: 2Mb (256K x 8) DRAM with pins Data (DQ), Address, /RAS, /CAS, /WE, /OE]
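The address multiplexing above can be sketched as splitting the 18-bit address into two 9-bit halves that share the same pins. Which half is the row is not stated on the slide; the sketch assumes the upper nine bits, and `multiplex` is a hypothetical name.

```python
ROW_BITS = 9
COL_BITS = 9


def multiplex(addr):
    """Return the two values driven onto the 9 address pins, in order.

    Assumption for this sketch: the upper 9 bits form the row address.
    """
    if not 0 <= addr < (1 << (ROW_BITS + COL_BITS)):
        raise ValueError("address out of range")
    row = addr >> COL_BITS
    col = addr & ((1 << COL_BITS) - 1)
    # Row goes out first and is strobed by /RAS; column follows with /CAS.
    return [("RAS", row), ("CAS", col)]


steps = multiplex(0x2ABCD)
```

Nine pins instead of eighteen is the saving that motivates the scheme; the price is two address phases per access.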

How do we keep square row/column organization but get devices of more than x1 bit?

(from Bruce Jacob)

Internal DRAM Organization: 2Mb as 256K x 8

Square keeps the wires short:
  Power and speed advantages
  Less RC, faster precharge and discharge: faster access time!

[Figure: Address Buffer (A0-A8), Row Decoder, and Column Decoder driving a stack of Memory Arrays (512 x 512), each with its own Word Lines, Bit Lines, Sense Amps & I/O, and Data In/Out]

DRAM Timing

Read cycle: RAS + CAS
  (RAS asserted) Entire row is latched in register
  (CAS asserted) Data in register is multiplexed to output
  (RAS deasserted) Data in register is rewritten
  (CAS deasserted) Output is released
Write cycle: RAS + WE + CAS
  (RAS asserted) Entire row is latched in register
  (WE asserted) Data is stable
  (CAS asserted) Write data to register
  (WE deasserted) Data is no longer stable
  (RAS deasserted) Data in register is rewritten
  (CAS deasserted) Operation complete
Refresh cycle: RAS only
  (RAS asserted) Entire row is latched in register
  (RAS deasserted) The data in the register is rewritten

DRAM Read Timing

Every DRAM access begins at the assertion of /RAS
Two ways to read: early or late v. /CAS
  Early Read Cycle: /OE asserted before /CAS
  Late Read Cycle: /OE asserted after /CAS

[Figure: 256K x 8 DRAM with 9 address lines (A), 8 data lines (D), and /RAS, /CAS, /WE, /OE]
[Timing diagram of /RAS, /CAS, A, /WE, /OE, D over two DRAM Read Cycle Times: A carries Row Address then Col Address; D goes from High Z through Junk to Data Out; Read Access Time and Output Enable Delay are marked]

DRAM Write Timing

Early Wr Cycle: /WE asserted before /CAS
Late Wr Cycle: /WE asserted after /CAS

[Timing diagram of /RAS, /CAS, A, /OE, /WE, D over two DRAM WR Cycle Times: A carries Row Address then Col Address; D carries Data In; WR Access Time is marked]

Actual DRAM Read Cycle

[Timing figure: actual DRAM read cycle]

tRAC: Random Access Delay, determines latency
  Minimum time from /RAS falling to valid data
  Quoted as the speed of a DRAM
  tRAC = tRCD + tCAC

tRC: Row Cycle Time, determines bandwidth
  Minimum time between successive row accesses
  tRC = tRAS + tRP

Actual DRAM Write Cycle

[Timing figure: actual DRAM write cycle]

Key DRAM Timing Parameters

tRAC: Random Access Delay, determines latency
  Minimum time from /RAS falling to valid data output
  Quoted as the speed of a DRAM
  tRAC = tRCD + tCAC
tRCD: Row Command Delay (RAS/CAS Delay)
  Minimum time between a row command and a column command
tCAC: Column Access Time
  Delay from falling /CAS to valid data out
tRC: Row Cycle Time, determines bandwidth
  Minimum time between successive row accesses
  tRC = tRAS + tRP
tRAS: Row Address Strobe
  Minimum time /RAS must be maintained
tRP: Row Precharge Delay
  Minimum time to precharge so another /RAS can begin
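The two sums above can be worked through with numbers. The values below are made up for illustration and do not come from any datasheet; only the relations tRAC = tRCD + tCAC and tRC = tRAS + tRP are taken from the slide.

```python
# Hypothetical timing values in nanoseconds (not from a real part).
t_rcd = 30   # row command to column command
t_cac = 30   # falling /CAS to valid data
t_ras = 60   # minimum time /RAS is held
t_rp = 40    # precharge before the next /RAS

t_rac = t_rcd + t_cac   # latency: /RAS falling to valid data
t_rc = t_ras + t_rp     # row cycle time: limits how often a row access can start

# Upper bound on row accesses per microsecond implied by tRC.
row_accesses_per_us = 1000 / t_rc
```

With these assumed values the data arrives after 60 ns, but a new row access can only begin every 100 ns, which is the latency versus bandwidth distinction the slide draws.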

Refresh

Depends upon device
  Refresh period: time by which all rows must be refreshed
    Typical 64 ms
  Refresh interval: time between refresh of each row
    Typical 15.6 us
  Distributed vs. burst refresh
Interleave with normal operation
Refresh entire row at a time
Use /RAS and /CAS to signal
RAS only
  /RAS cycling, no /CAS cycling
  External memory controller maintains address of last row refreshed
CAS before RAS (CBR)
  DRAM maintains refresh address
Hidden Refresh
  Following Read or Write
  /CAS left low, /RAS cycled
  Output data remains valid
  Requires time, delaying subsequent read/write
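The two typical figures are related: with distributed refresh, the interval is the refresh period divided by the number of rows. The sketch below assumes a device with 4096 rows, a count chosen here because it reproduces the 15.6 us figure; the slide does not state the row count.

```python
REFRESH_PERIOD_US = 64_000   # 64 ms, the typical refresh period from the slide
NUM_ROWS = 4096              # assumed row count for this sketch

# Distributed refresh: one row refreshed every interval, all rows within the period.
refresh_interval_us = REFRESH_PERIOD_US / NUM_ROWS
```

The result is 15.625 us, which rounds to the 15.6 us quoted as typical.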

DRAM Evolution

(from Bruce Jacob & David Wang)

FPM: Fast Page Mode (<1995)
EDO: Extended Data Out (1996-1999)
BEDO: Burst Extended Data Out
SDRAM: Synchronous DRAM (>1995)
  SDRAM
  DDR SDRAM (double data rate)
  DDR2
  DDR3
Other DRAM technologies
  Rambus DRAM (RDRAM)
  Concurrent Rambus DRAM
  Direct Rambus DRAM (DRDRAM)
Numerous specialty DRAMs not shown
  Virtual Channel Memory (VCDRAM)
  Enhanced SDRAM (ESDRAM)
  MoSys 1T-SRAM
  Reduced Latency DRAM (RLDRAM)
  Fast Cycle DRAM (FCRAM)

By 2002 most PCs using SDRAM and DDR SDRAM
By 2010 PCs shipping with DDR3 in volume

Conventional DRAM Read Timing

[Timing figure]

Fast Page Mode DRAM Read Timing

Innovation: Hold entire row (page) in sense amplifiers
Benefit: CPU can access each column in row without providing row address each time (and precharging)

Extended Data Out DRAM Read Timing

Innovation: Add latch between sense amplifiers and output pins
Benefit: Can begin precharging sooner (data from prior access remains valid)

Burst EDO DRAM Read Timing

Innovation: DRAM provides column data sequentially
Benefit: No need to transfer column address

Synchronous DRAM Read Timing

Innovation: Pipeline access, command interface (ACTIVATE, READ)
Benefit: No need to transfer column address

[from Bruce Jacob & David Wang]

Key SDRAM Timing Parameters

tRCD: Minimum time between an Activate command and Read command
  Analogous to DRAM parameter tRCD: Row Command Delay (RAS/CAS Delay)
CL: CAS Latency
  Time between Read command and first data valid
  Analogous to DRAM parameter tCAC: Column Access Time

[Figure: SDR DRAM examples]
(DDR can have CAS latency of 2.5)

SDRAM (Synchronous DRAM)

Adopted for Pentium use
Synchronous (clocked) interface (simplified timing)
RAS, CAS signals combined to make command
Ideal for cache line fill when bus width < cache line size
Burst read/write
  Initial latency, then data every clock cycle
Internal interleaved banks allow multiple rows (pages) to be open for read/write
Self-contained refresh

SDRAM Details

Multiple banks of cell arrays are used to reduce access time:
  Each bank is 4K rows by 512 columns by 16 bits (for our part)
Read and Write operations are split into RAS (row access) followed by CAS (column access)
These operations are controlled by sending commands
  Commands are sent using the RAS, CAS, CS, & WE pins.
Address pins are time multiplexed
  During RAS operation, address lines select the bank and row
  During CAS operation, address lines select the column.
ACTIVE command opens a bank/row for operation
  Transfers contents of the entire row to a buffer
Subsequent READ or WRITE commands access the contents of the row buffer
For burst reads and writes during READ or WRITE the starting address of the block is supplied.
  Burst length is programmable as 1, 2, 4, 8 or a full page (entire row) with a burst terminate option
Special commands are used for initialization (burst options etc.)
A burst operation takes 4 + n cycles (for n words)
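The "4 + n cycles" rule shows why longer bursts use the data bus more efficiently. A small sketch, taking the 4-cycle overhead from the slide and using illustrative names:

```python
SETUP_CYCLES = 4  # fixed overhead per burst, from the "4 + n" rule


def burst_cycles(n_words):
    """Total cycles for a burst of n_words."""
    return SETUP_CYCLES + n_words


def bus_efficiency(n_words):
    """Fraction of the burst's cycles that actually transfer data."""
    return n_words / burst_cycles(n_words)


cycles = {n: burst_cycles(n) for n in (1, 2, 4, 8)}
```

Under this rule a single-word access spends 1 cycle in 5 moving data, while a burst of 8 spends 8 in 12.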

Functional Block Diagram: 8 Meg x 16 SDRAM

[Figure: functional block diagram]
2^12 = 4096 rows (pages)
2^2 = 4 banks
2^9 = 512 columns

Banks Incorporated Into SDRAM

Memory address: Row | Bank | Column

Why row/bank/column, not bank/row/column?
  Consider spatial locality
  Imagine accessing a series of sequential memory addresses
  After exhausting a column, references to another bank
  Consider if row/bank reversed
    Bank would rarely be used, lose benefit of interleaving
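The row/bank/column ordering can be made concrete with the field widths of the 8 Meg x 16 part from the block diagram slide (12 row bits, 2 bank bits, 9 column bits). The sketch below treats the address as a word address; `decode` is a hypothetical helper.

```python
ROW_BITS, BANK_BITS, COL_BITS = 12, 2, 9


def decode(addr):
    """Split a word address laid out as Row | Bank | Column."""
    col = addr & ((1 << COL_BITS) - 1)
    bank = (addr >> COL_BITS) & ((1 << BANK_BITS) - 1)
    row = addr >> (COL_BITS + BANK_BITS)
    return row, bank, col


last_col = decode(511)    # last column of row 0 in bank 0
next_word = decode(512)   # the next sequential address
next_row = decode(2048)   # first address that needs a new row
```

Stepping past the last column moves to the next bank rather than the next row of the same bank, so a sequential stream rotates through all four banks before any bank has to open a new row.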

Page Size

ACTIVATE command reads data for all columns in the page
  Reads data into sense amplifiers
  Writes data back to data cells
Consequently, page size is a factor in power consumption

Page Size (bytes) = (Number of Columns x Number of DQs) / 8
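As a worked instance of the page size formula, the sketch below plugs in 512 columns and 16 DQs, the figures given earlier for the 8 Meg x 16 part. The function name is illustrative.

```python
def page_size_bytes(num_columns, num_dqs):
    """Page size in bytes: columns times data pins, divided by 8 bits per byte."""
    return num_columns * num_dqs // 8


page = page_size_bytes(num_columns=512, num_dqs=16)
```

For that part every ACTIVATE moves 1024 bytes into the sense amplifiers, whether or not all of them are subsequently read.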

DDR (Double Data Rate) SDRAM

Innovation: Transfer data on rising and falling edges of clock
  Same internal SDRAM core but 2n prefetch
Benefit: Twice the bandwidth, same control and signals as SDRAM
Significant Differences
  Differential clock
  Source synchronous (DQS)
  Burst length of 2, 4, 8 only
  CL = 2, 2.5, 3
  SSTL2 Stub Series Terminated Logic (2.5V) vs LVTTL (3.3V)

DDR DRAM

2n prefetch
  Use same DRAM core (cell array)
  Fetch twice as many bits
  Same latency for first data transfer
Source synchronous
  Data transfer is twice clock rate
  Data strobe sent alongside data
    Read: supplied by DRAM
      Data aligned with strobe edge
    Write: supplied by controller
      Data centered on strobe edge

[from Elpida]

DDR SDRAM Access Examples

Reads from same open page/bank
[Timing figure. From: Samsung]

Reads from different banks, same row
[Timing figure, with memory address fields Row | Bank | Column. From: Samsung]

Reads from different row
[Timing figure, with memory address fields Row | Bank | Column. From: Samsung]

READ burst (with auto precharge)

[Timing figure]

WRITE burst (with auto precharge)

[Timing figure]
See datasheet for more details.
Verilog simulation models available.

DDR2 SDRAM

Currently (2009-2010) the dominant high volume (PC) memory technology
Innovation: 4n prefetch, faster clocks
Benefit: Increased bandwidth, same control and signals as DDR SDRAM
Significant Differences
  SSTL18 (1.8V) vs. SSTL2 (2.5V)
  Low power (from lower supply voltage and new low power modes)
  (Optional) differential strobe (DQS, DQS#)
  ODT (On Die Termination)
  400 MHz clocks vs. 200 MHz
  CL = 3, 4, 5
  2Gb devices
  4/8 banks vs. 4 banks
    tFAW = four bank activation window
  Burst lengths of 4, 8 (no 2 because of 4n prefetch)
  Additive latency

[from Micron]

Mode Register Changes

[Figure: mode register changes]

Additive Latency

No additive latency, tRCD = CL = 4
[Timing figure, from Micron]
Can't place ACT command in cycle 4: slot occupied by RD AP: B0, Cx
  So ACT is delayed by full cycle
  so RD AP: B2, Cx is delayed
  and resulting data out is delayed
ACT B<n>, R<x> = activate row <x> in bank <n>
RD AP B<n>, C<x> = read column <x> from activated bank <n> (auto precharge)

Additive Latency

[Timing figure, from Micron]
Can issue the RD B0, Cx command early (in available slot)
  but processing will still require adherence to tRCD, CL timing constraints
Permits continuous data read from DRAM

DDR3

Current generation SDRAM
Key Differences
  8n prefetch
  SSTL15 (1.5V) vs. SSTL18 (1.8V)
    Reduced power consumption (~30%)
  667-800 MHz clocks
    2x bandwidth of DDR2
  8 banks vs. 4 banks
    More open banks: less latency
Adoption rate
  Introduction in 2007 (insignificant quantities)
  Expected to be dominant PC memory by 2010
  Samsung 8Gb DDR3 described at ISSCC February 2009

Micron DDR3 Datasheet

[Figure: datasheet excerpt]

DDR3 Commands

[Figure: command table]

DDR3 Commands (Notes)

[Figure: notes to the command table]
(RFU: Reserved for Future Use)

DDR3 Read Cycle

Single read
[Timing figure]

Back to back reads to open page
[Timing figure]

Memory Modules

[Figure: 184 pin DDR SDRAM DIMM. From Hsien-Hsin Sean Lee, Georgia Institute of Technology]

All chips in a rank receive same address and control signals
Each chip responsible for subset of data bits in its rank
Module acts as high capacity DRAM with wide data path
  Example: 8 chips, each 8 bits wide = 64 bits
Easy to add/replace memory in a system
  No need to solder or remove individual chips
Memory granularity issue
  What's the smallest increment in memory size?

DRAM Ranks

[Figure: DRAM ranks]

Organization of DRAM Modules

[Figure: organization of DRAM modules]

Memory Modules

SIMM (Single Inline Memory Module)
  30 pin: some 286, most 386, some 486 systems
    Page Mode, Fast Page Mode devices
  72 pin: some 386, most 486, nearly all Pentium (before DIMM)
    Fast Page Mode, EDO devices
DIMM (Dual Inline Memory Module)
  Dominant today
SODIMM (Small Outline DIMM)
  Used in notebooks, Apple iMac
RIMM (Rambus RDRAM Module)
SPD: Serial Presence Detect
  8 pin serial EEPROM on memory module
  Key parameters for SDRAM controller
    Number of row/column addresses
    Number of ranks
    Module width
    Refresh rate/type
    Error checking (none, parity, ECC)
    Latency
    Timing parameters

[Figures: SIMM; 168 pin SDRAM DIMM; 184 pin DDR SDRAM DIMM; 200 pin DDR2, DDR3 SDRAM DIMM; SODIMM; RIMM]

DRAM and DIMM Nomenclature

Device name   Clock     M transfers per sec   MB/sec per DIMM   DIMM name
DDR200        100 MHz   200                   1,600 MB/s        PC-1600
DDR266        133 MHz   266                   2,133 MB/s        PC-2100
DDR333        166 MHz   333                   2,666 MB/s        PC-2700
DDR400        200 MHz   400                   3,200 MB/s        PC-3200
DDR2-400      200 MHz   400                   3,200 MB/s        PC2-3200
DDR2-533      266 MHz   533                   4,266 MB/s        PC2-4200
DDR2-667      333 MHz   666                   5,333 MB/s        PC2-5300
DDR2-800      400 MHz   800                   6,400 MB/s        PC2-6400
DDR2-1066     533 MHz   1066                  8,533 MB/s        PC2-8500
DDR3-800      400 MHz   800                   6,400 MB/s        PC3-6400
DDR3-1066     533 MHz   1066                  8,500 MB/s        PC3-8500
DDR3-1333     666 MHz   1333                  10,666 MB/s       PC3-10600
DDR3-1600     800 MHz   1600                  12,800 MB/s       PC3-12800

M transfers/second = 2 x Clock Rate (DDR)
DRAM name incorporates M transfers per second
MB/sec = 8 x M transfers per second
DIMM name incorporates MB/sec (rounded)
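The two rules under the table can be checked directly. The sketch assumes a 64-bit (8-byte) module data path, which is what the factor of 8 in the MB/sec rule implies; `ddr_rates` is an illustrative name.

```python
def ddr_rates(clock_mhz, bus_bytes=8):
    """Return (M transfers per second, MB per second) for a DDR module."""
    mega_transfers = 2 * clock_mhz          # data on both clock edges
    return mega_transfers, mega_transfers * bus_bytes


ddr400 = ddr_rates(200)       # DDR400 / PC-3200 row
ddr3_1600 = ddr_rates(800)    # DDR3-1600 / PC3-12800 row
```

Rows with fractional clocks (133, 166, 266 MHz and so on) only match the table after rounding, which is why the DIMM names are described as rounded.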

DRAM History

DRAMs: capacity +60%/yr, cost -30%/yr
  2.5X cells/area, 1.5X die size in 3 years
DRAM fab costs $2B
  DRAM only: density, leakage v. speed
Rely on increasing number of computers & memory per computer (>60% market)
  SIMM or DIMM is replaceable unit
Commodity, second source industry => high volume, low profit, conservative
  Standardization: JEDEC (Joint Electronic Devices Engineering Council), EIA (Electronics Industries Alliance)
Order of importance: 1) Cost/bit 2) Capacity
  First RAMBUS: 10X BW, +30% cost => little impact

DRAM/SDRAM Latency Specifications

DRAM
  Used 4 numbers (e.g. 4-1-1-1)
  Indicates number of CPU cycles for 1st and successive accesses
SDRAM
  CAS Latency (CAS or CL)
    Delay in clock cycles between request and the time the first data is available
    PC133 module might be described as CAS2, CAS=2, CL2, CL-2, or CL=2
  SDR DRAM
    CAS Latency of 1, 2, or 3
  DDR DRAM
    CAS Latency of 2 or 2.5
  When three numbers appear (e.g. 3-2-2)
    CAS Latency (tCAC)
    RAS-to-CAS delay (tRCD)
    RAS precharge time (tRP)
  DDR3 seeing use of four
    CAS Latency (tCAS, tCL)
    RAS-to-CAS delay (tRCD)
    RAS precharge time (tRP)
    RAS access time (tRAS)


Error Correction

Motivation
  Failures/time proportional to number of bits
  As DRAM cell sizes & voltages shrink, more vulnerable
Why was this not an issue on your PC?
  Failure rate was low
  Few consumers would know what to do anyway
  DRAM banks too large now
  Servers always corrected memory systems
Sources
  Alpha particles (impurities in IC manufacturing)
  Cosmic rays (vary with altitude)
    Bigger problem in Denver and on space-bound electronics
  Noise
Need to handle failures throughout memory subsystem
  DRAM chips, module, bus
  DRAM chips don't incorporate ECC
    Store the ECC bits in DRAM alongside the data bits
  Chipset (or integrated controller) handles ECC

Error Detection: Parity

[Figure: parity generation and checking, from Bruce Jacob]
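Since the slide's figure did not survive extraction, here is a minimal sketch of even parity over a data word. It is a generic illustration rather than a reconstruction of the figure, and the function names are assumptions.

```python
def even_parity_bit(word):
    """Parity bit that makes the total number of 1s (data plus parity) even."""
    return bin(word).count("1") % 2


def parity_ok(word, parity):
    """True when the stored parity bit still matches the data word."""
    return even_parity_bit(word) == parity


data = 0b10110010
stored_parity = even_parity_bit(data)

one_flip = data ^ 0b00000100    # a single-bit error
two_flips = data ^ 0b00000110   # a double-bit error

single_detected = not parity_ok(one_flip, stored_parity)
double_detected = not parity_ok(two_flips, stored_parity)
```

A single flipped bit changes the parity and is caught; two flipped bits cancel out and slip through, which is the limitation that motivates stronger codes.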

Error Correction Codes (ECC)

Single-bit error correction requires n+1 check bits for 2^n data bits
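The "n+1 check bits for 2^n data bits" rule can be checked against the standard Hamming bound, under which r check bits suffice for m data bits when 2^r >= m + r + 1. The sketch below is an assumption-labelled check of that bound for the word sizes 4 through 64, not the construction used on the slides.

```python
def min_check_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (Hamming bound for single-error correction)."""
    r = 1
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


# Data widths 2**n for n = 2..6, mapped to the required number of check bits.
check_bits = {2 ** n: min_check_bits(2 ** n) for n in range(2, 7)}
```

For each of these widths the result equals n + 1; for example, 64 data bits (n = 6) need 7 check bits.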

Error Correction Codes (ECC)

[Figure: ECC encoding]

An example: decoding and verifying
[Figure: worked decoding example]

An example: multiple bit errors
[Figure: worked example with multiple bit errors]

Error Correction Codes (ECC)

Add another check bit
SECDED: Single Error Correction, Double Error Detection
  requires n+2 check bits for 2^n data bits

Error Correction Codes (ECC)

Double Error Detection Example (even parity)
[Figure: worked example]
Actually it says there wasn't an odd number of bit errors

Error Correction Codes (ECC)

64-bit data path + 8 bits ECC stored to DRAM module

[Figure from Bruce Jacob]

Memory Controllers

Handle the actual interface to memory
  Determine memory configuration/capability
  Memory timing/signal interface
  Address mapping
    Physical address to memory topology
  Error correction
  Scheduling
  Refresh
Reside in North Bridge of chipset
  Intel prior to Nehalem
  MCH (Memory Controller Hub)
  Isolates microprocessor from memory technology/device changes
Integrated with microprocessor
  AMD, Intel Nehalem
  Low latency for high performance
  Opens possibility for processor-directed hints

Address Mapping

[Figure: dual channels and a memory module, with address fields Channel ID | Rank | Row | Bank | Column]

Channel
  Physical path between CPU and memory
Rank
  Group of DRAM chips operating in lockstep
  Same address, control, CS
  Responsible for subset of same word
Bank
  Set of independent memory arrays in DRAM chip
Row/Column
  Address of bit cell in a bank
  May be several planes to achieve n bits wide

Address Mapping

[Figure from Simon Albert, PhD thesis]

Symmetric and Asymmetric Channels

[Figure: symmetric and asymmetric channels]

Address Mapping

[Figure: address mapping]

Memory Scheduling

Memory transactions: read, write
DRAM commands: refresh, activate, read, write, precharge
Memory scheduling policy
  Handle transaction requests
    Possibly from different cores
  Refresh
  Prioritize low/high priority
    CPU cache line fill request
    Prefetch
  Prioritize Read over Write
  Reorder to take advantage of open page in bank
  Page policy
    Open Page
    Close Page

Memory Scheduling

References, as (bank, row, col):
  (0,0,0) (0,1,0) (0,0,1) (0,1,3) (1,0,0) (1,1,1) (1,0,1) (1,1,2)

DRAM commands
  P: bank precharge (3 cycles)
  A: row activation (3 cycles)
  C: column access (1 cycle)

Without access scheduling (56 DRAM cycles)
[Timeline figure: each of the eight references performs P, A, C in turn; time axis 0 to 56 cycles]

With access scheduling (19 DRAM cycles)
[Timeline figure: references to an already open row need only C, and commands to different banks overlap; time axis 0 to 20 cycles]
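The 56-cycle figure follows from issuing precharge, activate, and column access for every reference in order. The sketch below reproduces that count and also counts how many row openings each ordering needs. It deliberately does not model command overlap between banks, so it does not reproduce the 19-cycle schedule; the helper names and the use of a simple sort as the "reordering" are assumptions made for illustration.

```python
COST = {"P": 3, "A": 3, "C": 1}   # cycles per command, from the slide

# (bank, row, col) references in arrival order, from the slide.
refs = [(0, 0, 0), (0, 1, 0), (0, 0, 1), (0, 1, 3),
        (1, 0, 0), (1, 1, 1), (1, 0, 1), (1, 1, 2)]


def serial_cycles(requests):
    """Cycles when every request does P, A, C with no overlap and no reuse."""
    return len(requests) * sum(COST.values())


def row_openings(requests):
    """Count how often a bank must open a row different from its open one."""
    open_row = {}
    openings = 0
    for bank, row, _col in requests:
        if open_row.get(bank) != row:
            openings += 1
            open_row[bank] = row
    return openings


unscheduled = serial_cycles(refs)
openings_in_order = row_openings(refs)
openings_reordered = row_openings(sorted(refs))  # group by bank, then row
```

In arrival order every reference lands on a closed row, so all eight need an activation; grouping references to the same row halves that to four, which is where a scheduler gets most of its savings before any overlap is considered.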

Refresh Revisited

[Figure: refresh]

Refresh

[Figure: refresh]

Memory Technology Trends

[Figure from Hennessy & Patterson, Computer Architecture: A Quantitative Approach (4th edition)]

Processor/Memory Gap

[Figure: processor/memory performance gap]

Computer Memory Hierarchy

From Hennessy & Patterson, Computer Architecture: A Quantitative Approach (4th edition)

[Figure: the memory hierarchy diagram shown near the start of the deck, from processor registers and on-chip cache through SRAM caches, DRAM main memory, disk, and tape]

Registers, SRAM, DRAM

Registers
  Simple interface
  At-speed access (CPU)
  Multiported
SRAM
  Simple interface
  Moderate density
  Moderate cost/bit
  Single or double ported
  Primary design goal: speed
  Can be integrated with logic
DRAM
  Complex interface
  Multiple clock cycles to access
  Burst/page modes
  Very high density
  Low cost/bit
  Primary design goals: density, $
  Usually single ported
  Specialized fab process
    Rarely integrated with logic

Intel Pentium 4 3.2 GHz Server

Component   Access Speed (time for data to be returned)
Registers   1 cycle = 0.3 nanoseconds
L1 Cache    3 cycles = 1 nanosecond
L2 Cache    20 cycles = 7 nanoseconds
L3 Cache    40 cycles = 13 nanoseconds
Memory      300 cycles = 100 nanoseconds

Putting it all together

Improving Bandwidth
  Wider memory access/bus
  Banks & interleaving
  Page/Burst modes
Improving Latency
  Remove redundant steps
  Integration
    Refresh row counter in DRAM chip
    Memory controller on processor chip
Caching

Why Spend So Much Time on Memory?

Huge impact on computing performance (and increasingly, power consumption)
Perhaps no other single technology has impacted the evolution of PC architecture as much
  Caches
  Microprocessor architecture (prefetch)
  Bus width, speed
  Chipsets
PCs aren't the only high performance application for memory system design
  Embedded Systems
  Video/Graphics/Game Processors
  Digital Signal Processing (DSP)
  Automated Test Equipment (ATE)

Acronyms and Definitions

RAM: random access memory
ROM: read only memory
PROM: programmable read only memory
EPROM: erasable PROM
EEPROM/E2PROM: electrically erasable PROM
CAM: content addressable memory
DRAM: dynamic RAM (requires refresh)
SRAM: static RAM (no refresh)
SDRAM: synchronous DRAM
NVRAM: non-volatile RAM (often RAM with battery backup)
SDR SDRAM: single data rate SDRAM
DDR SDRAM: double data rate SDRAM
RDRAM: RAMBUS DRAM
ECC: Error Correction Codes
DIMM: Dual Inline Memory Module

Acronyms and Definitions

Rank: Group of memory chips with same control signals (each chip typically supplies a subset of the bits that comprise a memory word)
Bank: Portion of DRAM chip with own sense amplifiers, open page
