Digital Integrated Circuits: A Design Perspective (2nd ed.)
Semiconductor Memories
Chapter Overview
Memory Classification
Memory Architectures
The Memory Core
Periphery
Reliability
Case Studies
[Table: memory classification. Random access: SRAM, DRAM]
[Figure: N-word memory with one select signal per word. Labels: S_{N-2}, S_{N-1}, Word N-2, Word N-1]
Intuitive architecture for an N x M memory: too many select signals (N words require N select signals).
A decoder reduces the number of select signals to K = log2(N) address bits.
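The relation K = log2(N) can be illustrated with a minimal sketch. The function names `address_bits` and `decode` are hypothetical, chosen only for this example; the decoder is modeled behaviorally, not at the circuit level.

```python
def address_bits(num_words: int) -> int:
    """Number of address bits K needed to select one of num_words words."""
    if num_words < 1:
        raise ValueError("num_words must be positive")
    # ceil(log2(N)) computed exactly with integer arithmetic
    return (num_words - 1).bit_length()


def decode(address: int, num_words: int) -> list[int]:
    """Behavioral decoder: one-hot select signals S_0..S_{N-1}."""
    if not 0 <= address < num_words:
        raise ValueError("address out of range")
    return [1 if i == address else 0 for i in range(num_words)]


if __name__ == "__main__":
    n = 1024
    print(f"{n} words: {address_bits(n)} address bits instead of {n} select signals")
    print(decode(2, 4))
```

With the decoder on chip, only the K address lines have to be routed to the memory; the N select signals exist only inside it.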
[Figure: array-structured memory architecture. Labels: bit line, storage cell, word line, address bits A_K, A_{K+1}, ..., A_{L-1}, column decoder, input-output (M bits)]
[Figure: hierarchical memory architecture. Labels: global data bus, control circuitry, block selector, global amplifier/driver, I/O]
Advantages:
1. Shorter wires within blocks
2. Block address activates only one block => power savings
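As a sketch of how a block address can gate a single block, the example below splits a flat address into block, row, and column fields. The field order (block in the most significant bits) and the names `split_address` and `block_enables` are assumptions made for illustration; the slides do not specify them.

```python
from typing import NamedTuple


class DecodedAddress(NamedTuple):
    block: int
    row: int
    column: int


def split_address(addr: int, block_bits: int, row_bits: int, col_bits: int) -> DecodedAddress:
    """Split a flat address into (block, row, column), block in the MSBs (assumed layout)."""
    total = block_bits + row_bits + col_bits
    if not 0 <= addr < (1 << total):
        raise ValueError("address out of range")
    column = addr & ((1 << col_bits) - 1)
    row = (addr >> col_bits) & ((1 << row_bits) - 1)
    block = addr >> (col_bits + row_bits)
    return DecodedAddress(block, row, column)


def block_enables(block: int, num_blocks: int) -> list[bool]:
    """Only the addressed block is activated; all others stay idle."""
    return [i == block for i in range(num_blocks)]


if __name__ == "__main__":
    decoded = split_address(0b10_0110_011, block_bits=2, row_bits=4, col_bits=3)
    print(decoded)
    print(block_enables(decoded.block, 4))
```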
[Figure: read-only memory cells. Labels: WL, BL]
MOS OR ROM
[Figure: MOS OR ROM array. Labels: word lines WL[0]-WL[3], bit lines BL[0]-BL[3], V_DD]
[Figure: ROM array. Labels: word lines WL[0]-WL[3], bit lines BL[0]-BL[3], GND]
[Figure: precharged ROM array. Labels: V_DD, precharge devices, bit lines BL[0]-BL[3]]
PMOS precharge device can be made as large as necessary, but clock driver becomes harder to design.
[Figure: device cross-section and schematic symbol]
[Figure: programming by avalanche injection. Voltage labels: 20 V, 10 V -> 5 V; 0 V, -5 V; 5 V, -2.5 V]
FLOTOX EEPROM
[Figure: FLOTOX transistor cross-section (floating gate, source, gate, drain, n+ regions, p substrate; dimensions 20-30 nm and 10 nm) and I versus V_GD characteristic with ticks at -10 V and 10 V]
EEPROM Cell
[Figure: EEPROM cell. Labels: BL, WL, V_DD]
Absolute threshold control is hard
Unprogrammed transistor might be depletion => two-transistor cell
Flash EEPROM
[Figure: flash EEPROM cell cross-section. Labels: control gate, floating gate, thin tunneling oxide, n+ source (erasure), n+ drain (programming), p-substrate]
DYNAMIC (DRAM)
Periodic refresh required
Small (1-3 transistors/cell)
Slower
Single-ended
[Figure: cell schematics. Labels: BL, V_DD, C_bit, M1, M3, M4, M5, M6, Q = 0, Q = 1]
6T-SRAM Layout
[Figure: 6T-SRAM layout. Labels: V_DD, GND, WL, bit lines BL, node Q, transistors M1-M6]
[Figure: cell schematic. Labels: M1, M2]
Static power dissipation: want R_L large
Bit lines precharged to V_DD to address the t_p problem
SRAM Characteristics
[Figure: histogram of number of dies versus normalized I_OFF, with the worst-case corner marked]
Substantial variation in leakage across dies
4X variation between nominal and worst-case leakage
Performance determined at nominal leakage
Robustness determined at worst-case leakage
[Figure: cache share of chip area (%) versus technology (0.25, 0.18, 0.13, 0.10 micron)]
On-chip memory size is increasing with scaling
Challenges: leakage and variability
[Figure: Vt variation split into a global (inter-die) component and a local (intra-die) component]
Vt = Vt_GLOBAL + Vt_LOCAL
[Figure: SRAM cell (transistors AXL, AXR, PL, PR, NL, NR; nodes VL, VR; bit lines BL, BR; word line WL) and node voltages versus time during a read, with V_READ and V_TRIPRD marked]
Parametric failures:
Read failures
Write failures
Access failures
Hold failures
[Figure: faulty chips versus working chips]
[Figure: fault statistics, chip count histogram. Vt 30 mV, using BPTM 45 nm technology. Yield 33%]
[Figure: local (intra-die) distributions within the global (inter-die) distribution]
Apply correction to the global variation to reduce the number of failures due to local variations
[Figure: failures versus global (inter-die) shift. Low-Vt corners: read failure, hold failure. High-Vt corners]
Memory failure probabilities are high when the inter-die shift in process is high
[Figure: labels: reduce RF & HF, reduce AF & WF; body-bias settings RBB, ZBB, FBB]
Reduce the dominant failures at different inter-die corners to increase the width of the low-failure region
[Figure: monitored cell. Labels: BL, GND, BR]
Monitor circuit parameters, e.g. leakage current
Effect of inter-die variation can be masked by intra-die variation
$$Y = \sum_{i=1}^{N} X_i \;\Rightarrow\; \frac{\sigma_Y}{\mu_Y} = \frac{1}{\sqrt{N}}\,\frac{\sigma_X}{\mu_X}$$
Adding a large number of random variables reduces the effect of intra-die variation
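The averaging effect can be checked numerically. The sketch below is a Monte Carlo illustration under stated assumptions: per-cell leakage is modeled as an independent Gaussian variable with hypothetical mean 1.0 and standard deviation 0.3 (real leakage is not Gaussian), and the names `array_leakage_samples` and `relative_spread` are invented for this example.

```python
import random
import statistics


def relative_spread(values):
    """Standard deviation divided by mean."""
    return statistics.pstdev(values) / statistics.mean(values)


def array_leakage_samples(n_cells, n_arrays, mu=1.0, sigma=0.3, seed=0):
    """Total leakage of n_arrays arrays, each the sum of n_cells random cell leakages."""
    rng = random.Random(seed)
    return [sum(rng.gauss(mu, sigma) for _ in range(n_cells)) for _ in range(n_arrays)]


if __name__ == "__main__":
    for n in (1, 16, 256):
        simulated = relative_spread(array_leakage_samples(n, 1000))
        predicted = 0.3 / n ** 0.5
        print(f"N={n:4d}  simulated={simulated:.4f}  predicted={predicted:.4f}")
```

The relative spread of the summed leakage shrinks roughly as 1/sqrt(N), which is why monitoring the whole array exposes the inter-die component.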
[Figure: labels: calibrate signal, V_out, REF1, REF2, body-bias selection]
Entire array leakage is monitored to detect the inter-die corner, and the proper body bias is selected
[Figure: 128 KB SRAM including a 64 KB LVT array]
Dual-Vt triple-well technology
Number of transistors: ~7 million
Die size: 16 mm^2
VLSI CKT Symp. 2006, ITC 2005
[Figure: labels: Vt ~ 200 mV, Vt ~ 30 mV]
Continuous body-biasing scheme
Pros: better yield. Cons: higher cost and design complexity
Finite width of the low P_MEM region and sharp transition
Large allowable range for body bias => stability requirement for body bias is less critical
Quantized (three-level: FBB, ZBB, RBB) body-bias scheme is a cost-effective solution with good yield-enhancement potential
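A three-level selection of this kind could be sketched as two threshold comparisons on the monitored array leakage. This is an illustration only: the function name, the reference values, and the mapping of high leakage to RBB and low leakage to FBB are assumptions inferred from the low-Vt and high-Vt corner discussion, not a description of the actual circuit.

```python
def select_body_bias(array_leakage: float, ref_low: float, ref_high: float) -> str:
    """Pick one of three body-bias settings from the monitored array leakage.

    Assumed mapping:
      leakage above ref_high -> low-Vt corner  -> reverse body bias (RBB)
      leakage below ref_low  -> high-Vt corner -> forward body bias (FBB)
      otherwise              -> nominal corner -> zero body bias (ZBB)
    """
    if ref_low >= ref_high:
        raise ValueError("ref_low must be below ref_high")
    if array_leakage > ref_high:
        return "RBB"
    if array_leakage < ref_low:
        return "FBB"
    return "ZBB"


if __name__ == "__main__":
    # Hypothetical normalized leakage values and references
    for leakage in (0.5, 1.0, 2.5):
        print(leakage, select_body_bias(leakage, ref_low=0.7, ref_high=1.5))
```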
[Figure: read failures for HVT and LVT arrays (low-Vt cell)]
The low-Vt array shows more read failures
Applying reverse body bias (RBB) to the NMOS devices reduces the number of read failures
[Figure: hold failures for HVT and LVT arrays (low-Vt cell)]
The low-Vt array shows more hold failures
Applying reverse body bias (RBB) to the NMOS devices reduces the number of hold failures
[Figure: robustness squeeze. Number of dies versus normalized DC robustness (0.7 to 1.2), with saved dies and noise floor marked]
[Figure: delay squeeze. Number of dies versus normalized delay (0.8 to 1.2), conventional versus this work. Average delay: PCD = 0.90, conventional = 1.00]
Process detection
Leakage measurement
On-die leakage sensor
Program PCD using fuses
[Figure: test flow. Labels: assembly, burn-in, package test, customer]
[Figure: on-die leakage sensor layout. Labels: comparators, V_BIAS generator, NMOS device, current mirrors, test interface; 83 µm]
High leakage-sensing gain
Compact analog design sharing bias generators
[Figure: leakage sensor schematic. Current mirrors generate VSEN1-VSEN6, which comparators check against VREF to produce the outputs OUT[0], OUT[1], ...]
Incremental mirroring ratio for multi-bit resolution leakage sensing
Shared bias generators => compact design
Process-voltage insensitive I_REF and V_BIAS generators
Technology: 90 nm dual-Vt CMOS
VDD: 1.2 V
Resolution: 7 levels
Power consumption: 0.66 mW @ 80 °C
Dimensions: 83 x 73 µm^2
No constraints on device ratios
Reads are non-destructive
Value stored at node X when writing a "1" = V_WWL - V_Tn
3T-DRAM Layout
[Figure: 3T-DRAM layout. Labels: BL1, BL2, GND, RWL, WWL, M1, M2, M3]
Write: C_S is charged or discharged by asserting WL and BL.
Read: charge redistribution takes place between the bit line and the storage capacitance.
$$\Delta V = V_{BL} - V_{PRE} = (V_{BIT} - V_{PRE})\,\frac{C_S}{C_S + C_{BL}}$$
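To get a feel for the size of the read signal, the charge-redistribution relation can be evaluated directly. The component values below (C_S = 30 fF, C_BL = 300 fF, V_PRE = 1.25 V, stored levels 1.9 V and 0 V) are hypothetical numbers picked for illustration, not taken from the slides.

```python
def bitline_swing(v_bit: float, v_pre: float, c_s: float, c_bl: float) -> float:
    """Bit-line voltage change after charge sharing between C_S and C_BL."""
    return (v_bit - v_pre) * c_s / (c_s + c_bl)


if __name__ == "__main__":
    c_s, c_bl = 30e-15, 300e-15   # hypothetical capacitances in farads
    v_pre = 1.25                  # hypothetical precharge voltage in volts
    for label, v_bit in (("reading 1", 1.9), ("reading 0", 0.0)):
        dv = bitline_swing(v_bit, v_pre, c_s, c_bl)
        print(f"{label}: delta V = {dv * 1e3:.1f} mV")
```

Because C_BL is much larger than C_S, the swing is a small fraction of the stored voltage, which is why a sense amplifier is needed on every bit line.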
[Figure: bit-line voltage during a read. Labels: V(1), V_PRE, ΔV(1)]
[Figure: cell cross-section and layout]
Expensive in area
[Figure: trench cell (labels: cell plate Si, refilling poly, transfer gate, Si substrate) and stacked-capacitor cell]
[Figure: cell schematic. Labels: Bit, M5, M7, M9]
[Figure: memory organization. Blocks: CAM array, hit logic, SRAM array, input drivers. Signals: address, tag, hit, R/W, data]
Periphery
Decoders
Sense amplifiers
Input/output buffers
Control/timing circuitry
Row Decoders
Collection of 2^M complex logic gates, organized in a regular and dense fashion
(N)AND Decoder
NOR Decoder
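The two decoder forms compute the same word-line functions, related by De Morgan's law. The behavioral sketch below treats each word line as one M-input gate; the helper names are hypothetical and the address is given as a list of bits with A_0 first.

```python
from itertools import product


def literals(address, word, m):
    """Literal j is A_j if bit j of the word index is 1, otherwise NOT A_j."""
    return [address[j] if (word >> j) & 1 else 1 - address[j] for j in range(m)]


def and_decoder(address):
    """WL_w = AND of the M literals that match word index w."""
    m = len(address)
    return [int(all(literals(address, w, m))) for w in range(2 ** m)]


def nor_decoder(address):
    """WL_w = NOR of the complemented literals (De Morgan form of the AND)."""
    m = len(address)
    return [int(not any(1 - lit for lit in literals(address, w, m))) for w in range(2 ** m)]


if __name__ == "__main__":
    for bits in product((0, 1), repeat=2):
        a = list(bits)
        print(a, and_decoder(a), nor_decoder(a))
```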
Hierarchical Decoders
Multi-stage implementation improves performance
[Figure: hierarchical decoder. Address pairs (A0, A1) and (A2, A3) are predecoded into their four combinations each, which are then combined to drive word lines WL0, WL1, ...]
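A two-stage decoder for a 4-bit address can be sketched as follows, assuming two 2-bit predecoders whose outputs are combined by one 2-input gate per word line. The function names and the 4-bit size are assumptions for illustration.

```python
def predecode(pair_value: int) -> list[int]:
    """First stage: decode a 2-bit address pair into 4 one-hot lines."""
    return [int(pair_value == k) for k in range(4)]


def hierarchical_decode(address: int) -> list[int]:
    """Second stage: each word line combines one line from each predecoder."""
    if not 0 <= address < 16:
        raise ValueError("address must fit in 4 bits")
    low = predecode(address & 0b11)    # predecoded (A0, A1)
    high = predecode(address >> 2)     # predecoded (A2, A3)
    return [low[w & 0b11] & high[w >> 2] for w in range(16)]


if __name__ == "__main__":
    print(hierarchical_decode(0b0110))
```

Each final gate needs only two inputs instead of four, and the predecoder outputs are shared among all word lines.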
Dynamic Decoders
[Figure: dynamic decoder schematics with precharge devices. Labels: V_DD, GND, address inputs A0, A1 and their complements, word lines WL0-WL3]
[Figure: column decoder. A 2-input NOR decoder turns inputs A0, A1 into select signals S0-S3]
Advantages: speed (t_pd does not add to overall memory access time); only one extra transistor in signal path
Disadvantage: large transistor count
[Figure: decoder schematic. Labels: A1, D]
Number of devices drastically reduced
Delay increases quadratically with the number of sections; prohibitive for large decoders
Solutions: buffers, progressive sizing, combination of tree and pass-transistor approaches
[Figure: schematic with word lines WL1 and WL2, V_DD, clock signals, and reset R]
Sense Amplifiers
$$t_p = \frac{C \cdot \Delta V}{I_{av}}$$
C is large and I_av is small, so make ΔV as small as possible
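A quick evaluation shows why a small swing matters. The numbers (1 pF bit-line capacitance, 100 µA average cell current, 2.5 V full swing versus 0.25 V reduced swing) are hypothetical values for illustration.

```python
def discharge_time(c: float, delta_v: float, i_av: float) -> float:
    """Time for an average current i_av to move a capacitance c by delta_v."""
    if i_av <= 0:
        raise ValueError("i_av must be positive")
    return c * delta_v / i_av


if __name__ == "__main__":
    c_bl = 1e-12      # hypothetical bit-line capacitance (F)
    i_cell = 100e-6   # hypothetical average cell current (A)
    for swing in (2.5, 0.25):
        t = discharge_time(c_bl, swing, i_cell)
        print(f"swing {swing} V -> t_p = {t * 1e9:.2f} ns")
```

With these values, cutting the required swing by a factor of ten cuts the bit-line delay by the same factor; the sense amplifier then restores the full logic level.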
[Figure: (a) SRAM sensing scheme; (b) two-stage differential amplifier. Labels: SE, M5, output]
[Figure: sense amplifier. Label: SE]
Initialized at its metastable point with EQ
Once an adequate voltage gap is created, the sense amplifier is enabled with SE
Positive feedback quickly forces the output to a stable operating point
Charge-Redistribution Amplifier
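[Figure: charge-redistribution amplifier. Concept: M1 with gate at V_ref between node V_L (C_large) and node V_S (C_small); devices M2, M3. Transient response of V_in, V_L, and V_S]
[Figure: sensing path in an EPROM array. Labels: SE, load M4, Out, C_out, cascode device M3 with V_casc, column decoder M2 (WLC, C_col), EPROM array cell M1 (WL, BL, C_BL)]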
Single-to-Differential Conversion
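[Figure: single-to-differential conversion. The cell on BL (selected by WL) drives one input of a differential sense amplifier; V_ref drives the other; output]
[Figure: bit lines with storage capacitors C_S and a dummy cell. Labels: R0, R1]
[Figure: read waveforms versus t (ns). Bit-line voltage when reading 0 and when reading 1; control signals EQ, WL, SE]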
Voltage Regulator
[Figure: voltage regulator (labels: V_DD, M_drive, V_DL, V_REF, V_bias) and its equivalent model (labels: V_REF, M_drive, V_DL)]
Charge Pump
[Figure: charge pump. Labels: V_DD, CLK, nodes A and B, V_B, M1, M2, C_pump, V_load, C_load. Waveform levels: 0 V, V_DD - V_T, 2V_DD - V_T]
DRAM Timing
RDRAM Architecture
[Figure: RDRAM architecture. Labels: bus, clocks, data bus, widths k and k x l, network, memory array, column demux, row demux]
[Figure: address transition detection. Inputs A0, A1, ..., A_{N-1}; output ATD]
[Figure: trends of C_D, C_S, Q_S, and V_DD (V) versus memory capacity per chip (4K, 64K, 1M, 16M, 256M, 4G, 64G). From [Itoh01]]
$$Q_S = \frac{C_S V_{DD}}{2}, \qquad V_{smax} = \frac{Q_S}{C_S + C_D}$$
[Figure: bit-line pair with equalization. Labels: C_cross, EQ, word lines WL0, WL1, dummy word line WL_D, BL, C_BL, C_WBL, cell capacitors C]
Folded-Bitline Architecture
[Figure: folded-bitline architecture. Labels: word lines WL0, WL1, dummy word lines WL_D, BL, C_BL, C_WBL, cell capacitors C, EQ, sense amplifier with nodes x and y]
Transposed-Bitline Architecture
[Figure: (a) straightforward bit-line routing; (b) transposed bit-line architecture. Labels: C_cross, bit lines BL, sense amplifiers SA]
[Figure: labels: V_DD, SiO2]
Yield
Redundancy
[Figure: memory array with redundant rows and redundant columns. Blocks: row decoder, column decoder, fuse bank. Inputs: row address, column address]
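Row redundancy can be pictured as a small lookup that steers a faulty row address to a spare row. The sketch below is a behavioral illustration only: the class name, the dictionary standing in for the fuse bank, and the choice to place spare rows after the regular rows are all assumptions.

```python
class RowRedundancy:
    """Behavioral model of row repair with a fuse-programmed remap table."""

    def __init__(self, num_rows: int, num_spares: int):
        self.num_rows = num_rows
        self.num_spares = num_spares
        self.fuse_map = {}  # faulty row -> spare row (stands in for the fuse bank)

    def repair(self, faulty_row: int) -> None:
        """Program the next free spare row for a faulty row (done once, at test time)."""
        if not 0 <= faulty_row < self.num_rows:
            raise ValueError("row out of range")
        if faulty_row in self.fuse_map:
            return
        if len(self.fuse_map) >= self.num_spares:
            raise RuntimeError("no spare rows left; die cannot be repaired")
        self.fuse_map[faulty_row] = self.num_rows + len(self.fuse_map)

    def physical_row(self, row: int) -> int:
        """Row actually activated for a given row address."""
        return self.fuse_map.get(row, row)


if __name__ == "__main__":
    mem = RowRedundancy(num_rows=256, num_spares=2)
    mem.repair(17)
    print(mem.physical_row(17), mem.physical_row(18))
```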
Error-Correcting Codes
Example: Hamming Codes
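As a concrete instance, the sketch below implements the standard Hamming(7,4) code, which corrects any single-bit error in a 7-bit word carrying 4 data bits. The choice of the (7,4) code and the function names are assumptions for illustration; the slide does not say which code length a given memory uses.

```python
def encode(data):
    """Encode 4 data bits as [p1, p2, d1, p4, d2, d3, d4] (bit positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def correct(code):
    """Return (corrected codeword, error position); position 0 means no error found."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s4   # the syndrome spells out the faulty position
    if position:
        c[position - 1] ^= 1
    return c, position


def decode(code):
    """Correct a single-bit error, then extract the 4 data bits."""
    c, _ = correct(code)
    return [c[2], c[4], c[5], c[6]]


if __name__ == "__main__":
    word = encode([1, 0, 1, 1])
    word[4] ^= 1                      # inject a single-bit error at position 5
    print(correct(word)[1], decode(word))
```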
[Figure from [Itoh00]]
[Figure: labels: I_leakage, V_DD, sleep, V_SS,int; factor 7; 0.18 µm CMOS]
[Figure: currents I_ACT, I_AC, and I_DC (axis ticks 10^-6 to 10^-2) versus memory capacity (16M, 64M, 256M, 1G, 4G, 16G, 64G) and operating voltage (0.53, 0.40, 0.32, 0.24, 0.19, 0.16, 0.13 V). From [Itoh00]]
Case Studies
Programmable Logic Array
SRAM
Flash Memory
Main difference
ROM: fully populated
PLA: one element per minterm
Note: the importance of PLAs has drastically reduced
1. slow
2. better software techniques (multi-level logic synthesis)
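The AND-plane and OR-plane structure can be modeled behaviorally: the AND-plane forms product terms and the OR-plane sums selected terms into each output. The data layout (dictionaries for product terms, index lists for outputs) and the example functions are hypothetical and describe logic behavior, not the transistor-level implementation.

```python
def evaluate_pla(inputs, and_plane, or_plane):
    """Evaluate a PLA.

    and_plane: list of product terms, each a dict {input index: required value}.
    or_plane:  list of outputs, each a list of product-term indices to OR together.
    """
    products = [all(inputs[i] == v for i, v in term.items()) for term in and_plane]
    return [int(any(products[t] for t in terms)) for terms in or_plane]


# Hypothetical example with inputs x0, x1, x2:
#   f0 = x0*x1 + NOT x2
#   f1 = x0*x1 + (NOT x0)*(NOT x1)*x2
AND_PLANE = [{0: 1, 1: 1}, {2: 0}, {0: 0, 1: 0, 2: 1}]
OR_PLANE = [[0, 1], [0, 2]]

if __name__ == "__main__":
    for x in ((1, 1, 1), (0, 0, 1), (0, 1, 0)):
        print(x, evaluate_pla(x, AND_PLANE, OR_PLANE))
```

The product term x0*x1 is generated once and shared by both outputs, which is the kind of sharing a PLA exploits and a fully populated ROM does not.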
[Figure: PLA schematic with AND-plane and OR-plane. Inputs X0, X1, X2 and their complements; outputs f0, f1; rails V_DD, GND]
Dynamic PLA
[Figure: dynamic PLA with AND-plane and OR-plane, clocked by φ_AND and φ_OR. Inputs X0, X1, X2 and their complements; outputs f0, f1; rails V_DD, GND]
[Figure: clock signals φ, φ_AND, and φ_OR, with t_pre and t_eval marked]
PLA Layout
[Figure: PLA layout. AND-plane and OR-plane between V_DD and GND; inputs x0, x1, x2 and their complements; outputs f0, f1; pull-up devices]
Bit-line Circuitry
[Figure: bit-line circuitry. Labels: bit-line load, block select, ATD, BEQ, local WL, memory cell, B/T, CD, I/O line, I/O, sense amplifier]
[Figure: sense amplifier timing. Labels: ATD, block select (BS), SA, SEQ, data-cut]
[Figure: flash memory organization. Block0 to Block1023; bit line control circuit; sense latches (1024 + 32) x 8; data caches (1024 + 32) x 8; I/O; labels SGD, WL31. From [Nakamura02]]
[Figure: Vt of memory cells (0 V to 4 V). Panels: evolution of thresholds, final distribution. From [Nakamura02]]
[Figure: chip with the charge pump marked; dimensions 10.7 mm and 11.7 mm. From [Nakamura02]]
[Figure from [Nakamura02]]
[Figure from [Itoh01]]
[Figure from [Itoh01]]