Memory
"640K ought to be enough for anybody." Bill Gates, 1981
Taxonomy of Memories
Memory Hierarchy
SRAM
  Basic Cell, Devices, Timing
DRAM
  Basic Cell, Timing
DRAM Evolution
ECE485/585:Memory
Taxonomy of Memories
Non-random access: shift register, FIFO, CAM
Random access: SRAM, DRAM
Read-only and non-volatile: mask ROM, PROM, EPROM, E2PROM, Flash (NAND, NOR), NVRAM
Computer Memory Hierarchy
[Figure: memory hierarchy from the processor (datapath, registers holding intermediate results, on-chip cache) through cached DRAM to archive and backup storage. From Hennessy & Patterson, Computer Architecture: A Quantitative Approach (4th edition)]
Register File
Operates at CPU speed
Address decoder generates a one-hot (1-of-n) code from the address (binary to unary)
  The output is used for row selection
Write is synchronous
  If WE, input data is written to the selected word on the clock edge
[Figure: register file with signals Clock, RegID, Din, Dout, WE; timing shows RegID X producing R[X] and RegID Y producing R[Y] on Dout]
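The register file behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not the circuit from the slides: the class name `RegisterFile`, the default of 8 registers, and the method names are all assumptions made for the example.

```python
class RegisterFile:
    """Toy register file: decoded read, clocked write (hypothetical model)."""

    def __init__(self, num_regs=8):  # register count is an assumed value
        self.regs = [0] * num_regs

    def decode(self, reg_id):
        # Binary-to-unary: exactly one select line is 1 (one-hot, 1-of-n).
        return [1 if i == reg_id else 0 for i in range(len(self.regs))]

    def read(self, reg_id):
        # The one-hot select lines pick the single row that drives Dout.
        select = self.decode(reg_id)
        return sum(s * r for s, r in zip(select, self.regs))

    def clock_edge(self, reg_id, din, we):
        # Synchronous write: Din is stored only on a clock edge with WE asserted.
        if we:
            self.regs[reg_id] = din


rf = RegisterFile()
rf.clock_edge(reg_id=3, din=42, we=True)
rf.clock_edge(reg_id=5, din=7, we=False)  # WE deasserted, so nothing is written
print(rf.decode(3))
print(rf.read(3), rf.read(5))
```

Calling `clock_edge` with `we=False` leaves the register unchanged, which mirrors the "if WE" condition on the slide.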
SRAM cell: 6 transistors
[Figure: SRAM cell storing data between a bit line and its complement, selected by the word line]
Write
  Write bit and bit-bar onto the bit lines
  Select desired word (row)
  Turns on pass transistors
  Writes new value to cell [one inverter input will be low, turning its output high]
Which will be longer: bit lines or word lines? Bit lines!
  For density and low power, want tiny transistors
  Unable to drive long bit lines
  Precharge bit lines (Vdd/2) before read
  Use differential between bit and bit-bar
Read
  Select desired word (row)
  One bit line will be pulled low
  Other will remain high
  Takes a long time for a bit line to be pulled low with a tiny transistor
  Don't need to wait: can just sense the difference between the two bit lines!
[Figure: multi-ported cell array with two address decoders (deca, decb), two bit-line pairs (b1, b2), address ports and data ports]
Most control signals are active low
Chip Select (/CS) is effectively an enable
Write Enable (/WE) controls read/write
Write
  /WE is asserted (low)
  /CS is asserted (low)
Read
  /WE is deasserted (high)
  /CS is asserted (low)
SRAM Variations
Dedicated Din & Dout
  Trade pin count ($) for higher performance
  No bidirectional turnaround time required
[Figure: 2^n x b RAM symbols with address inputs A0 to An-1, data inputs DIN0 to DINb-1, CS and WE]
Simplified SRAM timing diagram
/OE determines direction: Hi = Write, Lo = Read
Write timing:
  [Figure: signals D, A, /OE, /WE; Data In and Write Address are held for the write setup time and write hold time]
Read timing:
  [Figure: D goes from High-Z through junk to Data Out one read access time after each Read Address]
Internal SRAM Organization (16 x 4)
[Figure: address bits A0 to A3 feed an address decoder that selects one of Word 0 through Word 15; each word is a row of four SRAM cells; Din 0 to Din 3 enter through write drivers gated by Write Enable]
Key SRAM timing parameters (note address-following mode)
tAA, address access time: time between a valid address being applied and valid data being available on the data outputs
tRC, read cycle time: minimum time that one address must be held on the address lines before a second address can be presented
tAA represents latency
tRC represents bandwidth (throughput)
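To make the latency versus bandwidth distinction concrete, here is a small sketch that turns tAA and tRC into numbers. The function name `sram_read_figures` and the example values (8 ns, 10 ns, 16-bit word) are hypothetical and are not taken from any datasheet.

```python
def sram_read_figures(t_aa_ns, t_rc_ns, width_bits):
    """Return (latency in ns, peak read bandwidth in MB/s) for back-to-back reads."""
    latency_ns = t_aa_ns  # tAA: address valid to data valid
    # One word per read cycle; bytes per nanosecond times 1000 gives MB/s.
    bandwidth_mb_s = (width_bits / 8) * 1000 / t_rc_ns
    return latency_ns, bandwidth_mb_s


# Assumed example part: tAA = 8 ns, tRC = 10 ns, 16-bit data bus.
latency, bandwidth = sram_read_figures(t_aa_ns=8, t_rc_ns=10, width_bits=16)
print(latency, bandwidth)
```

Shortening tAA alone improves how soon the first word arrives; only a shorter tRC (or a wider word) raises the sustained rate.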
What happens as the number of bits increases?
Decoder gets larger and slower
Bit lines increase in length
  Large distributed RC load
  Larger, slower transistors
[Figure: column of n bits selected by a log2 n bit address]
Remember
  Treat output as differential signal
  Precharge both bit lines high
  Memory cell pulls only one low
  Sense bit value by comparing sense lines
Make it shorter and wider!
[Figure: array organization with 1 data bit]
[Figure: array organization with 4 data bits]
Physical SRAM Array Should Be Square
Example: 16 x 1 SRAM as a 4 x 4 array
[Figure: 4 x 4 grid of cells, each with IN, OUT, SEL and WR; address bits A1 and A0 drive a 2-to-4 decoder, a second 2-to-4 decoder and a 4-to-1 mux complete the selection; DI is data in, DO is data out, WR is write]
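The square-array idea can be illustrated with a short model of a 16 x 1 memory stored as a 4 x 4 grid. This is a sketch under assumptions: the class name `Sram16x1` is made up, and the choice that the upper two address bits select the row and the lower two select the column is an assumption, since the figure's exact wiring is not recoverable here.

```python
class Sram16x1:
    """16 x 1 memory laid out as a 4 x 4 cell array (illustrative only)."""

    def __init__(self):
        self.cells = [[0] * 4 for _ in range(4)]

    @staticmethod
    def split(addr):
        # Assumption: upper 2 bits go to the row decoder, lower 2 bits to the column mux.
        return (addr >> 2) & 0b11, addr & 0b11

    def write(self, addr, bit):
        row, col = self.split(addr)
        self.cells[row][col] = bit & 1

    def read(self, addr):
        row, col = self.split(addr)
        selected_row = self.cells[row]  # row decoder enables one word line
        return selected_row[col]        # 4-to-1 mux picks one column


mem = Sram16x1()
mem.write(9, 1)
print(Sram16x1.split(9), mem.read(9), mem.read(8))
```

With a 4 x 4 layout each decoder handles only 2 address bits, instead of one 4-to-16 decoder driving sixteen long rows.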
Add latches to input data
  Chip Select functions this way
  Only need to hold data until latched
Add latches to output data
  Outputs available to be read while next access begun
Provide synchronous interface
64K x 16 Memory System
2 x 64K x 8 RAM chips
DRAM (Read/Write) Cycle Time >> DRAM (Read/Write) Access Time
  2:1. Why?
DRAM (Read/Write) Cycle Time
  How frequently can you initiate an access?
  Analogy: a little kid can only ask his father for money on Saturday
DRAM (Read/Write) Access Time
  How quickly will you get what you want once you initiate an access?
  Analogy: as soon as he asks, his father will give him the money
DRAM bandwidth limitation analogy
  What happens if he runs out of money on Wednesday?
  Ask Mom!
Increasing Bandwidth: Interleaving
Access pattern without interleaving:
  [Figure: CPU and memory timeline marking "Start Access for D1" and "D1 available"]
Access pattern with 4-way interleaving:
  [Figure: CPU timeline with successive accesses to memory Bank 0 through Bank 3]
Address placement across banks:
  Bank 0: addresses 0, 4, 8, 12
  Bank 1: addresses 1, 5, 9, 13
  Bank 2: addresses 2, 6, 10, 14
  Bank 3: addresses 3, 7, 11, 15
[Figure: annotated "Seen this before"]
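The bank assignment in the 4-way interleaving table follows a simple rule: the low-order address bits pick the bank. A minimal sketch, with the helper names `bank_of` and `index_in_bank` invented for this example:

```python
NUM_BANKS = 4  # 4-way interleaving, as in the table


def bank_of(addr, num_banks=NUM_BANKS):
    # Low-order interleaving: consecutive addresses land in consecutive banks.
    return addr % num_banks


def index_in_bank(addr, num_banks=NUM_BANKS):
    # Position of the word inside its bank.
    return addr // num_banks


banks = {b: [a for a in range(16) if bank_of(a) == b] for b in range(NUM_BANKS)}
for b, addrs in banks.items():
    print("Bank", b, addrs)
```

Because a sequential stream touches each bank only once every four accesses, one bank can finish its cycle time while the other three are being accessed.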
DRAM Technology
DRAM Cell
[Figure: DRAM cell connected to a word line and a bit line]
Write
  Drive bit line
  Select desired word (row)
Read
  Precharge bit line
  Select desired word (row)
  Sense charge
  Write value back (restore)
Refresh!
  Periodically read each cell (forcing write-back)
Volatile Memory Comparison
SRAM Cell
  Larger cell: lower density, higher cost/bit
  No dissipation
  Read is non-destructive
  No refresh required
  Simple read: faster access
  Standard IC process: natural for integration with logic
DRAM Cell
  Smaller cell: higher density, lower cost/bit
  Needs periodic refresh
  Refresh after read
  Complex read: longer access time
  Special IC process: difficult to integrate with logic circuits
  Density impacts addressing (multiplex address lines)
2 Mb (256K x 8) DRAM
  Pins: Data (DQ), Address, /RAS, /CAS, /WE, /OE
256K = 18 address bits
  9 row address bits
  9 column address bits
  So multiplex
Data (In/Out)
  /WE asserted (low) for write
  /OE asserted (low) to enable output buffers
Address (Row/Column)
  /RAS (Row Address Strobe) asserted after row placed on address pins
  /CAS (Column Address Strobe) asserted after column placed on address pins
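The address multiplexing above amounts to splitting an 18-bit address into two 9-bit halves that share the same pins. The sketch below assumes the upper 9 bits are sent as the row (with /RAS) and the lower 9 bits as the column (with /CAS); the function name `multiplex` is hypothetical.

```python
ROW_BITS = 9
COL_BITS = 9


def multiplex(addr):
    """Split an 18-bit address into (row, column) halves for the shared address pins."""
    if not 0 <= addr < (1 << (ROW_BITS + COL_BITS)):
        raise ValueError("address does not fit in 18 bits")
    row = addr >> COL_BITS              # presented first, latched by /RAS
    col = addr & ((1 << COL_BITS) - 1)  # presented second, latched by /CAS
    return row, col


print(multiplex(0))
print(multiplex(513))
print(multiplex(0x3FFFF))
```

Multiplexing halves the number of address pins (9 instead of 18) at the cost of sending the address in two steps.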
[Figure: from Bruce Jacob]
[Figure: DRAM internals: address buffer (A0 to A8), row decoder driving the word lines, memory arrays (512 x 512), sense amps & I/O, column decoder]
[Figure: 256K x 8 DRAM symbol with /RAS, /CAS, /WE, /OE and an 8-bit data bus D]
Early read cycle: /OE asserted before /CAS
Late read cycle: /OE asserted after /CAS
tRCD: Row Command Delay (RAS/CAS delay)
  Minimum time between a row command and a column command
tCAC: Column Access Time
  Delay from falling /CAS to valid data out
tRC: Row Cycle Time (determines bandwidth)
  Minimum time between successive row accesses
  tRC = tRAS + tRP
tRAS: Row Address Strobe
  Minimum time /RAS must be maintained
tRP: Row Precharge Delay
  Minimum time to precharge so another /RAS can begin
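As a worked example of how these parameters combine, the sketch below computes the row cycle time from tRC = tRAS + tRP and the delay from a row command to valid data as tRCD + tCAC. The function names and the nanosecond values are assumptions chosen for illustration, not figures from a specific device.

```python
def row_cycle_ns(t_ras_ns, t_rp_ns):
    # tRC = tRAS + tRP: minimum spacing between successive row accesses.
    return t_ras_ns + t_rp_ns


def row_to_data_ns(t_rcd_ns, t_cac_ns):
    # Row command, then column command after tRCD, then data after tCAC.
    return t_rcd_ns + t_cac_ns


# Hypothetical timings in nanoseconds.
t_rc = row_cycle_ns(t_ras_ns=40, t_rp_ns=20)
t_data = row_to_data_ns(t_rcd_ns=15, t_cac_ns=15)
print(t_rc, t_data)
```

With these assumed values the data arrives after 30 ns, but a new row cannot be opened for 60 ns, which is the cycle time versus access time gap discussed earlier.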
Refresh
Depends upon device
  Refresh period: time by which all rows must be refreshed
    Typical: 64 ms
    Typical: 15.6 us between row refreshes when distributed (64 ms / 4096 rows)
  Distributed vs. burst refresh
RAS only
  /RAS cycling, no /CAS cycling
  External memory controller maintains address of last row refreshed
CAS before RAS (CBR)
  DRAM maintains refresh address
Hidden refresh
  Following read or write
  /CAS left low, /RAS cycled
  Output data remains valid
  Requires time, delaying subsequent read/write
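The two typical numbers above are related by simple arithmetic, sketched here. The 4096-row count and the 60 ns per-row refresh time are assumptions for the example; the function names are invented.

```python
def distributed_interval_us(period_ms, rows):
    # Spread the row refreshes evenly over the refresh period.
    return period_ms * 1000 / rows


def refresh_overhead(period_ms, rows, t_refresh_ns):
    # Fraction of time the device is busy refreshing (same for burst or distributed).
    return rows * t_refresh_ns / (period_ms * 1_000_000)


interval = distributed_interval_us(period_ms=64, rows=4096)
overhead = refresh_overhead(period_ms=64, rows=4096, t_refresh_ns=60)
print(interval, overhead)
```

Burst refresh performs all row refreshes back to back once per period, while distributed refresh issues one every `interval` microseconds; the total time spent is the same, only its placement differs.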
DRAM Evolution
[Figure: from Bruce Jacob & David Wang]
Other DRAM technologies
  Rambus DRAM (RDRAM)
  Concurrent Rambus DRAM
  Direct Rambus DRAM (DRDRAM)
Numerous specialty DRAMs not shown
  Virtual Channel Memory (VCDRAM)
  Enhanced SDRAM (ESDRAM)
  MoSys 1T-SRAM
  Reduced Latency DRAM (RLDRAM)
  Fast Cycle DRAM (FCRAM)
By 2002 most PCs using SDRAM and DDR SDRAM
By 2010 PCs shipping with DDR3 in volume
Fast Page Mode DRAM Read Timing
Innovation: hold entire row (page) in sense amplifiers
Benefit: CPU can access each column in the row without providing the row address each time (and precharging)
[Figure: from Bruce Jacob & David Wang]
SDR DRAM examples
(DDR can have CAS latency of 2.5)
ACTIVE command opens a bank/row for operation
  Transfers contents of the entire row to a buffer
Subsequent READ or WRITE commands access the contents of the row buffer
For burst reads and writes during READ or WRITE, the starting address of the block is supplied
  Burst length is programmable as 1, 2, 4, 8 or a full page (entire row) with a burst terminate option
Special commands are used for initialization (burst options etc.)
A burst operation takes 4 + n cycles (for n words)
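The 4 + n rule shows why longer bursts use the bus more efficiently. A minimal sketch follows; the names `burst_cycles` and `burst_efficiency` are made up, and treating the 4 cycles as fixed overhead is this example's reading of the slide.

```python
def burst_cycles(n_words, overhead_cycles=4):
    # The slide's rule: a burst of n words takes 4 + n cycles.
    return overhead_cycles + n_words


def burst_efficiency(n_words, overhead_cycles=4):
    # Fraction of the cycles that actually transfer data.
    return n_words / burst_cycles(n_words, overhead_cycles)


for n in (1, 2, 4, 8):
    print(n, burst_cycles(n), burst_efficiency(n))
```

A single-word access spends most of its cycles on overhead, while a burst of 8 transfers data in 8 of its 12 cycles.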
2^12 = 4096 rows (pages)
2^2 = 4 banks
2^9 = 512 columns
Why row/bank/column, not bank/row/column?
Consider spatial locality
  Imagine accessing a series of sequential memory addresses
  After exhausting the columns of a row, references move to another bank
Consider if row/bank were reversed
  Bank would rarely be used, losing the benefit of interleaving
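The effect of field order can be seen by decoding the same sequential addresses under both layouts. This sketch assumes a 12-bit row, 2-bit bank and 9-bit column, and treats the address as a column-granularity index; both function names are hypothetical.

```python
ROW_BITS, BANK_BITS, COL_BITS = 12, 2, 9


def decode_row_bank_col(addr):
    # Layout, most to least significant: row | bank | column.
    col = addr & ((1 << COL_BITS) - 1)
    bank = (addr >> COL_BITS) & ((1 << BANK_BITS) - 1)
    row = addr >> (COL_BITS + BANK_BITS)
    return row, bank, col


def decode_bank_row_col(addr):
    # Alternative layout: bank | row | column.
    col = addr & ((1 << COL_BITS) - 1)
    row = (addr >> COL_BITS) & ((1 << ROW_BITS) - 1)
    bank = addr >> (COL_BITS + ROW_BITS)
    return row, bank, col


for addr in (511, 512):
    print(addr, decode_row_bank_col(addr), decode_bank_row_col(addr))
```

Crossing from address 511 to 512 moves to the next bank under row/bank/column, so the new row can be opened while the old bank precharges. Under bank/row/column the same step opens a new row in the same bank, and the other banks are not reached until 2^21 addresses later.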
Consequently, page size is a factor in power consumption
Page Size (bytes) = (Number of Columns x DQs) / 8
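A quick numeric check of the page size formula, using assumed device organizations (512 columns with 8 or 16 DQs, and 1024 columns with 4 DQs); the function name is invented for the example.

```python
def page_size_bytes(num_columns, dqs):
    # Page Size (bytes) = (Number of Columns x DQs) / 8
    return num_columns * dqs // 8


print(page_size_bytes(512, 8))
print(page_size_bytes(512, 16))
print(page_size_bytes(1024, 4))
```

Every ACTIVE command moves a whole page into the sense amplifiers, so a larger page means more bits are sensed and restored per activation.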
Source synchronous
  Data transfer is twice the clock rate
  Data strobe sent alongside data
Read: strobe supplied by DRAM
  Data aligned with strobe edge
Write: strobe supplied by controller
  Data centered on strobe edge
[Figure: from Samsung]
[Figures: memory address split into Row, Bank and Column fields (from Samsung)]
See datasheet for more details.
Verilog simulation models available.
[Figure: read timing with additive latency (from Micron)]
[Figure: read timing with no additive latency, tRCD = CL = 4 (from Micron)]
ACT B<n>, R<x> = activate row <x> in bank <n>
RD AP B<n>, C<x> = read column <x> from activated bank <n> (auto precharge)
Permits continuous data read from DRAM
DDR3
Current-generation SDRAM
Key differences
  8n prefetch
  SSTL15 (1.5 V) vs. SSTL18 (1.8 V)
    Reduced power consumption (~30%)
  667-800 MHz clocks
    2x bandwidth of DDR2
Adoption rate
  Introduction in 2007 (insignificant quantities)
  Expected to be dominant PC memory by 2010
  Samsung 8 Gb DDR3 described at ISSCC, February 2009
[Figure note: RFU = Reserved for Future Use]
[Figure: single read]
[Figure: from Hsien-Hsin Sean Lee, Georgia Institute of Technology]
Easy to add/replace memory in a system
No need to solder or remove individual chips
[Figure: 200-pin DDR2, DDR3 SDRAM DIMM; RIMM]
DRAM/SDRAM Latency Specifications
DRAM
  Used 4 numbers (e.g. 4-1-1-1)
  Indicates number of CPU cycles for 1st and successive accesses
SDRAM
  CAS Latency (CAS or CL)
  Delay in clock cycles between request and the time the first data is available
  PC133 module might be described as CAS2, CAS=2, CL2, CL-2, or CL=2
SDR DRAM
  CAS latency of 1, 2, or 3
DDR DRAM
  CAS latency of 2 or 2.5
When three numbers appear (e.g. 3-2-2)
  CAS latency (tCAC)
  RAS-to-CAS delay (tRCD)
  RAS precharge time (tRP)
DDR3 seeing use of four
  CAS latency (tCAS, tCL)
  RAS-to-CAS delay (tRCD)
  RAS precharge time (tRP)
  RAS access time (tRAS)
Why not an issue on your PC?
  Failure rate was low
  Few consumers would know what to do anyway
  DRAM banks too large now
  Servers always corrected memory systems
Sources
  Alpha particles (impurities in IC manufacturing)
  Cosmic rays (vary with altitude)
    Bigger problem in Denver and on space-bound electronics
  Noise
Need to handle failures throughout memory subsystem
  DRAM chips, module, bus
  DRAM chips don't incorporate ECC
  Store the ECC bits in DRAM alongside the data bits
  Chipset (or integrated controller) handles ECC
[Figure: from Bruce Jacob]
[Figures: examples labeled R1011 and R0111]
Actually it says there wasn't an odd number of bit errors
[Figure: from Bruce Jacob]
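The remark that a passing parity check only rules out an odd number of bit errors can be demonstrated directly. The sketch below uses even parity over an arbitrary 4-bit word; the helper names and the chosen data bits are assumptions for illustration and are not tied to the figures.

```python
def parity_bit(bits):
    # Even parity: the extra bit makes the total number of 1s even.
    return sum(bits) % 2


def parity_ok(word):
    # Passes whenever the count of 1s is even, i.e. zero or an even number of flips.
    return sum(word) % 2 == 0


def flip(word, *positions):
    return [b ^ 1 if i in positions else b for i, b in enumerate(word)]


data = [1, 0, 1, 1]
stored = data + [parity_bit(data)]

print(parity_ok(stored))              # no error
print(parity_ok(flip(stored, 0)))     # one bit flipped
print(parity_ok(flip(stored, 0, 1)))  # two bits flipped
```

The double-bit error passes the check, which is why ECC memory uses codes stronger than a single parity bit.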
Integrated with microprocessor
  AMD, Intel Nehalem
  Low latency for high performance
  Opens possibility for processor-directed hints
Channel
  Physical path between CPU and memory
Rank
  Group of DRAM chips operating in lockstep
  Same address, control, CS
  Responsible for subset of same word
Bank
  Set of independent memory arrays in DRAM chip
Row/Column
  Address of bit cell in a bank
  May be several planes to achieve n bits wide
[Figure: from Simon Albert, PhD thesis]
Memory transactions: read, write
DRAM commands: refresh, activate, read, write, precharge
Memory scheduling policy
  Handle transaction requests
    Possibly from different cores
  Refresh
  Prioritize low/high priority
    CPU cache line fill request
    Prefetch
With access scheduling (19 DRAM cycles)
[Figure: DRAM cycle timeline (0 to 20) for the references (bank, row, col) = (0,0,0), (0,1,0), (0,0,1), (0,1,3), (1,0,0), (1,1,1), (1,0,1), (1,1,2), with commands marked P, A and C]
Refresh
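One way to see why reordering helps is to count how many row activations the same eight references need in arrival order versus grouped by bank and row. This is a deliberately simplified sketch: it counts activations only, ignores command timing and bank overlap, and does not attempt to reproduce the 19-cycle figure. The function name `count_activates` and the grouping policy are assumptions.

```python
refs = [(0, 0, 0), (0, 1, 0), (0, 0, 1), (0, 1, 3),
        (1, 0, 0), (1, 1, 1), (1, 0, 1), (1, 1, 2)]  # (bank, row, col)


def count_activates(references):
    """Count row activations, keeping one open row per bank."""
    open_row = {}
    activates = 0
    for bank, row, _col in references:
        if open_row.get(bank) != row:  # row miss: precharge and activate needed
            activates += 1
            open_row[bank] = row
    return activates


# Hypothetical policy: serve all references to the same (bank, row) together.
scheduled = sorted(refs, key=lambda r: (r[0], r[1]))

print(count_activates(refs), count_activates(scheduled))
```

In arrival order every reference misses the open row; after grouping, each row is activated once and the remaining references are row-buffer hits.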
[Figure: from Hennessy & Patterson, Computer Architecture: A Quantitative Approach (4th edition)]
Complex interface
Multiple clock cycles to access
Burst/page modes
Very high density
Low cost/bit
Primary design goals: density, $
Usually single ported
Specialized fab process
  Rarely integrated with logic
Caching
Why Spend So Much Time on Memory?
Huge impact on computing performance (and increasingly, power consumption)
Perhaps no other single technology has impacted the evolution of PC architecture as much
  Caches
  Microprocessor architecture (prefetch)
  Bus width, speed
  Chipsets
PCs aren't the only high-performance application for memory system design
  Embedded systems
  Video/graphics/game processors
  Digital signal processing (DSP)
  Automated test equipment (ATE)