
2017 30th International Conference on VLSI Design and 2017 16th International Conference on Embedded Systems

A Low-Voltage 13T Latch-Type Sense Amplifier with Regenerative Feedback for Ultra Speed Memory Access
Venkatesh Mani Tripathi∗ , Sandeep Mishra† , Jyotishman Saikia‡ and Anup Dandapat†
∗ Department of Electrical Engineering
Indian Institute of Technology Patna, Bihta 801103, India
† Department of Electronics and Communication Engineering
National Institute of Technology Meghalaya, Shillong 793003, India
Email: anup.dandapat@nitm.ac.in
‡ Department of Electrical Engineering
Indian Institute of Technology Guwahati, Guwahati 781039, India

Abstract—Sense amplifiers provide amplification to the very small voltage change in the memory datapath in near-zero access time. The sub-micron technology demands high performance sensing at extreme noise margins. In this paper, a state of the art latch-type storage element is proposed to provide a strong positive feedback for the small change in the differential sense input. A well synchronized 4T controlling circuit has been used to speed up the latch operation by dissipating minimal sense energy. The 13T ultra speed latch-type sense amplifier has been designed using predictive 45-nm CMOS technology and simulated in SPECTRE. The results show that the proposed design dissipates 0.152 fJ with extremely low sense time of 58.6 ps at 0.6 V. The design can function up to a supply voltage scaling of 0.4 V with an average sensing delay variation of just 29%.

Keywords—High speed memory design; latch delay; latch-type sense amplifier; low-voltage design; sense amplifier.

[Figure 1: block diagram with pre-charge circuit, input register, write circuit, address register, row decoder, cell array (C), sense amplifier, column decoder, and output buffer; signals Data in, Data out, Address, IE, OE, RW.]
Fig. 1. Organization of a typical memory array with sensing scheme and interconnects. IE: input enable, OE: output enable, and RW: write control.

I. INTRODUCTION
The sense amplifier (SA) plays a major role in the read operation of most types of memories, whose performance is usually limited by the access time. The SA is a fundamental building block in a number of applications, including multi-input comparators, memory access, analog-to-digital converters, and portable devices. Ideally, an SA must determine the stored content by sensing the smallest differential charge variation on the bitlines. This leads to the requirement of near-zero access time and near-zero bitline energy dissipation. SA differential offset and large bitline capacitance are the major constraints limiting its performance. Further, the need for higher speed, larger memory size, and low power consumption has implied whole new operating environments for SAs [1]–[3]. An optimized voltage/current mode SA must be designed for near-zero access time with minimal power consumption.

A body-driven positive feedback has been provided to improve the precision of the SA [4]. A comparative study on 8T SRAM with different types of SAs shows that the current latched SA (CLSA) has a higher sensing speed, lower noise margin, and more cell area than the voltage mode SA (VMSA) [5]. VMSAs are popular in associative memories for their low power advantage [6], [7]. Low-power VMSAs suffer from degraded read access time, whereas the faster current mode SAs are more sensitive to offset and leakage power consumption. A pre-amplifier has been used prior to the latch for reducing the offset voltage [8]. The increased node capacitance helps in reducing input-referred noise but slows down the comparison speed. Leakage suppression is vital in SRAM sensing and can be achieved through power gating techniques [9].

The functional blocks of a memory with its sensing scheme and interconnects are presented in Fig. 1. An SA is placed at each column of the memory array. The power-delay trade-off has been considered an important design objective in many memory applications. Various voltage mode (low-power, high noise immunity) and current mode (high performance) sensing schemes with moderate trade-offs have been presented in [10]–[14]. We emphasize reducing the sensing delay (latch delay) in a low-power VMSA to provide the best energy-delay product. The rest of this paper is organized as follows: Section II describes the proposed sensing scheme with regenerative feedback, Section III provides the measurement results and analysis with the compared designs, and Section IV concludes the paper.

2380-6923/16 $31.00 © 2016 IEEE
DOI 10.1109/VLSID.2017.15

[Figure 2: schematic with transistors T1-T13, nets N1-N4, inputs BL/BLB, outputs OUTA/OUTB, enables EN/ENB, tail current I0, and transmission gate TG.]
Fig. 2. Proposed latch-type sense amplifier with regenerative feedback. EN/ENB: sense enable, BL-BLB/OUTA-OUTB: sense I/Os.

[Figure 3: small-signal network of rD resistances, Gm·Vgs dependent sources, and Cload, with test voltage V and current I at node X.]
Fig. 3. Small signal approximate model of latch-type voltage mode sensing.
II. PROPOSED REGENERATIVE FEEDBACK SCHEME

In a conventional latch-type SA, a 4-transistor (4T) latch is created and sensing transistors are used to decouple the inputs from the outputs. The positive feedback provides a faster decision, but through series transistors. The fundamental concept used in the proposed SA is to reduce the latch time by adding regenerative feedback circuitry to the positive feedback. A 4T regenerative feedback (T10-T13) has been set between the source and drain of the nMOS latch pair, as shown in Fig. 2. Transistor T10 has been used for reducing the decision time, and the power reduction circuitry (T11-T13) for preventing the sense output discharge.

[Figure 4: small-signal network of rD resistances, Gm·Vgs dependent sources, and Cload, with test voltage V and current I at node Y.]
Fig. 4. Approximate small signal model of the proposed sense amplifier.

A. Sensing Mechanism

During pre-charge mode, EN is set to “0” and ENB is set to “1”. Outputs OUTA and OUTB are charged to VDD and T9 is cut off, which provides a high resistance that acts as an open circuit. The available charge on inputs BL or BLB (connected to the gates of T7 and T8) still cannot turn the bitline transistor (T9) on, as a closed loop cannot be formed. Thus, the whole circuit does not conduct, which results in a small tail current (I0) and reduces the leakage. During sense mode, EN and ENB are set to “1” and “0” respectively, which lets the current I0 flow, and net N3 reaches zero. The further operation is explained by considering BL and BLB set at “1” and “0” respectively. The net N2 reaches logic 0 through the path provided by T7 and T9. Transistor T2 turns on because during pre-charge mode VOUTA = VOUTB = “1”. Before the latch is turned on, a signal is taken from net N1 and passed through the control circuit to the gate of T10. The output of T10 is connected to the OUTA terminal to provide an early decision. OUTA reaches one stable state (0), which turns on the latch, thereby providing the other stable state to OUTB.

During pre-charge mode, the transmission gate (TG) formed by T11 and T12 remains in the cut-off state and T13 turns on, which passes logic 0 to net N4. The “0” value at the gate of transistor T10 keeps it in the cut-off state. In sense mode, TG turns on and T13 turns off so that the voltage at node N1 reaches node N4. The transistor T10 turns on and provides logic 0 to T3 for a direct connection to VDD, which provides a faster stable state to OUTA and OUTB. The design performs better when scaled down in voltage, as the pre-charge amount decreases in comparatively less time.

[Figure 5: layout of the proposed SA with labels VDD, GND, OUTA, OUTB, BL, BLB, EN, and ENB, and an inset table with columns VOH [mV], VOL [mV], Iavg [nA], and Delay [ps].]
Fig. 5. Layout of proposed sense amplifier with performance comparison at various noise margins.

[Figure 6: voltage [V] versus time [ns] waveforms of EN, OUTA, OUTB, and N4.]
Fig. 6. Charge variation at various nodes of proposed SA during the phase change (pre-charging to sense).

B. Small Signal Analysis

The discharging time constant decides the sensing delay in latch-type SAs. An approximate small signal analysis has been carried out to demonstrate the low latch time of the proposed design. The following assumptions have been made for a clear understanding of the models, by neglecting the effects of

1) Conductances (gbd, gbs) and resistances (rDB and rSB).
2) MOSFET capacitances (Cgd, Cgs, Cgb, Cbd and Cbs).
3) Dependent current sources (gmbs × vbs, inrD, inD and inrS).

The discharging time constant directly depends upon the Thevenin's resistance at the discharging output. The model of the latch-type SA [13] is presented in Fig. 3, and the RTh from the approximated model has been compared with the proposed model presented in Fig. 4 to demonstrate its fast latching operation. Applying KCL at node X in Fig. 3,

$$\frac{V}{r_D} + \frac{V}{2r_D} - I = 0 \tag{1}$$
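As a brief worked step (a sketch that uses only the symbols of Eq. (1) and introduces no further assumptions), the two branch currents in Eq. (1) can be combined over a common denominator before solving for V/I:

```latex
\[
\frac{V}{r_D} + \frac{V}{2r_D} = \frac{3V}{2r_D} = I
\quad\Longrightarrow\quad
\frac{V}{I} = \frac{2r_D}{3}
\]
```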

After simplification, the Thevenin's equivalent resistance can be written as

$$R_{eq} = \frac{V}{I} = \frac{2r_D}{3} \tag{2}$$

Therefore, the discharging time constant can be written as

$$\zeta = \frac{2}{3} r_D C_{load} \tag{3}$$

In Fig. 4, applying KCL at node Y,

$$\frac{V}{r_D} + \frac{V}{r_D} + \frac{V}{2r_D} - I = 0 \tag{4}$$

After simplifying Eq. (4),

$$R_{eq} = \frac{2r_D}{5} \tag{5}$$

The discharging time constant of the proposed model can be expressed as

$$\zeta = \frac{2}{5} r_D C_{load} \tag{6}$$

A significant improvement in the sensing delay using the proposed model can be observed by comparing Eqs. (3) and (6): for the same rD and Cload, the first-order time constant of the proposed model is three fifths of that of the conventional latch-type model.

III. RESULTS AND PERFORMANCE COMPARISON

The proposed design depicted in Fig. 5 has been implemented in a predictive 45-nm CMOS process using the generic process design kit (GPDK). An in-depth comparison with recently proposed relevant designs [10], [11], [13] has been conducted to demonstrate the efficacy of the proposed SA. All compared designs have been re-implemented in the 45-nm CMOS process and analyzed in the same environment. As a test structure, SRAMs have been used to provide the bitlines (BL and BLB). Power consumption, sensing delay, noise margin, process corner tolerance, and low-voltage operation are discussed in the following subsections.

[Figure 7: voltage [V] versus time [ns] waveforms of EN, OUTA, OUTB, SO, and SOB.]
Fig. 7. Charge variation at various nodes during the phase change (OUTA-OUTB: proposed SA output; SO-SOB: outputs of LVMSA [13]).

[Figure 8: (a) energy dissipation [fJ] and (b) sensing delay [ps] versus supply voltage [V] for the proposed SA, HSSA, LVMSA, and IVMSA.]
Fig. 8. Energy dissipation and delay analysis of compared sense amplifiers for supply voltage scaling from 1.2 to 0.4 V.

A. Power Performance

Output charge variation and tail transistor switching activity are the major contributors to SA power consumption. The pre-charge leakage also contributes to the overall energy dissipation. Charge variation at various nodes of the proposed design is depicted in Fig. 6. The voltage drop in the outputs during the phase change (from pre-charge to sense) contributes more to the average power consumption in both voltage and current SAs. The output (HIGH logic) drops to a voltage of 843 mV at 1 V supply voltage for the proposed design, which is a noticeable 68 mV higher than the latch-type SA [13], as shown in Fig. 7. The outputs have been pre-charged from 1.2 V to an extremely low level of 400 mV to analyze the worst case sensing.

The supply voltage tolerance comparison with the referred designs [10], [11], [13] is plotted in Fig. 8. The design presented in [10] dissipates more energy during sensing due to the direct supply rail connection. The decoupled transistors provide bi-stability in IVMSA [11], which dissipates more energy compared to LVMSA [13]. Due to the use of the control circuit designed to speed up the latching, the energy dissipation increases for higher supply voltages. As the supply voltage is scaled down, the proposed design dissipates the least energy among the compared architectures, with an average 17.9% reduction from the latch-type SA, as shown in Fig. 8(a).
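The output droop quoted in this subsection can be made explicit with a little arithmetic. The sketch below assumes that the HIGH output starts from the full 1 V supply (the text states that the outputs are pre-charged to VDD) and reads "68 mV higher" as a difference between the two HIGH levels; the function and variable names are illustrative only and do not come from the paper.

```python
def output_droop_mv(vdd_mv, high_level_mv):
    """Droop of the HIGH output below the supply rail, in millivolts."""
    return vdd_mv - high_level_mv


# Values quoted in the text for a 1 V supply.
VDD_MV = 1000
PROPOSED_HIGH_MV = 843           # HIGH level of the proposed SA
MARGIN_OVER_LVMSA_MV = 68        # proposed SA is stated to be 68 mV higher

# Assumption: the LVMSA [13] HIGH level is the proposed level minus 68 mV.
LVMSA_HIGH_MV = PROPOSED_HIGH_MV - MARGIN_OVER_LVMSA_MV

droop_proposed = output_droop_mv(VDD_MV, PROPOSED_HIGH_MV)
droop_lvmsa = output_droop_mv(VDD_MV, LVMSA_HIGH_MV)
relative_reduction = 1 - droop_proposed / droop_lvmsa

print(f"Proposed SA droop: {droop_proposed} mV")
print(f"LVMSA droop:       {droop_lvmsa} mV")
print(f"Relative droop reduction: {relative_reduction:.1%}")
```

Under these assumptions the droop is 157 mV for the proposed SA against 225 mV for the latch-type SA, which is roughly 30% less output droop during the phase change.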
[Figure 9: sensing delay [ps, logarithmic axis] versus supply voltage [V] from 0.4 to 1.2 V at the TT, SS, SF, FS, and FF corners.]
Fig. 9. Low voltage sensing delay variation of proposed sense amplifier at various process corners.

B. Sensing Delay

Sense amplifiers are often characterized by their sensing time (latch time in this design). It depends purely upon the rate of change of the sense output voltage while discharging from the pre-charged value to GND (strong “0”). The reduced discharging time constant (ζ) represented by Eq. (6) speeds up the latching in the proposed design. Fig. 8(b) shows the latch-time variation of the compared designs for supply voltage scaling from 1.2 to 0.4 V. The proposed design is the fastest in the voltage range of 1.2 to 0.8 V and performs better than all compared designs from 0.6 to 0.4 V except the HSSA, due to its logical short to GND [10].
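The effect of the reduced time constant can be checked numerically. The following is a minimal sketch, assuming that each term of the KCL equations (1) and (4) is read as a parallel resistive branch and that rD and Cload are the same in both models; rD is normalised to 1 because no device value is given in the text, and the helper name `parallel` is illustrative.

```python
from fractions import Fraction


def parallel(*resistances):
    """Equivalent resistance of resistors connected in parallel."""
    return 1 / sum(1 / r for r in resistances)


# Normalised r_D (assumption: the actual value is not given in the text).
r_d = Fraction(1)

# Eq. (2): branches r_D and 2*r_D seen at node X of the latch-type model.
req_conventional = parallel(r_d, 2 * r_d)

# Eq. (5): branches r_D, r_D and 2*r_D seen at node Y of the proposed model.
req_proposed = parallel(r_d, r_d, 2 * r_d)

# With equal C_load the time constants scale with the resistances.
ratio = req_proposed / req_conventional
reduction_percent = float((1 - ratio) * 100)

print(f"R_eq (latch-type model): {req_conventional} * r_D")
print(f"R_eq (proposed model):   {req_proposed} * r_D")
print(f"Time-constant ratio:     {ratio}")
print(f"Reduction:               {reduction_percent:.0f}%")
```

This reproduces the 2/3 and 2/5 factors of Eqs. (3) and (6) and gives a 40% smaller first-order time constant. It is only the small-signal estimate; the simulated delays reported in the tables include effects that the model neglects and need not scale by the same factor.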
The limitation of latch-type sense amplifiers ([13] and this design) is the non-functionality at near-threshold voltage for higher frequencies of operation. The reason is that the output nodes take part in the gate control of the transistors present in the discharging/feedback path. The only design that performs in this regime is the HSSA [10], but its unacceptable energy dissipation limits its use. The energy-delay product (EDP) is considered to be the optimum design metric due to the energy-delay trade-off. The proposed design is clearly the best choice, with a reduction of 31.16% from the voltage mode SA [13] and 59.92% from the current mode SA [10] at 0.4 V.

C. Noise Margin Tolerance

Along with the supply voltage variation, the noise margins strongly affect the sensing delay. The average power consumption changes by only a small amount with noise margin variation due to its sole dependency on the short-circuit current. Speed performance at low input resolution is one of the dominating performance metrics in sense amplifiers, and hence the proposed SA has been tested under various VOH and VOL values (BL and BLB) for the nominal supply voltage of 1.0 V and a low voltage of 0.5 V (the low-voltage limit of 0.4 V has not been considered due to the near-threshold effect). Advanced megabit memory arrays require a strength greater than 6 sigma variation, but the consequence of read failure is largely eliminated using 8T SRAMs. The test result is summarized in Fig. 5 for different VOH and VOL combinations. A high resolution of 10 mV is noted with acceptable sensing time.

D. Process Corner Tolerance

The process corner variation (TT, SS, SF, FS and FF) has been considered for the measurement of energy dissipation and sensing delay of the compared designs in Table I. The proposed design functions best at the typical and fast corners (TT, FS and FF), whereas the extra NMOS pass transistors affect the performance in the slow corner (SS). This issue has been minimized to a great extent by optimizing the presented design using selective low-threshold transistors (T2, T4 and T7-T10), and the modified result is presented in Table II. Transistors with standard threshold voltages have been used to calculate all other results for a fair comparison with the referred designs. The design tolerance to low supply voltage operation for various corners is presented in Fig. 9. The sensing performance is less affected by the process-voltage variation, with only 12.9% average change.

TABLE I
ENERGY DISSIPATION AND SENSING DELAY COMPARISON AT VARIOUS PROCESS CORNERS (TT: TYPICAL, SS: SLOW, SF: NMOS SLOW AND PMOS FAST, FS: NMOS FAST AND PMOS SLOW, FF: FAST CORNER)

              Energy [fJ]                             Sensing delay [ps]
Architecture  TT      SS      SF      FS      FF      TT      SS      SF      FS      FF
Proposed SA   2.023   0.6876  1.166   3.566   5.957   46.48   68.6    58.16   36.83   31.63
HSSA [10]     1532    768.5   1200    1732    2430    65.04   97.37   66.28   74.76   54.78
LVMSA [13]    0.38    0.329   0.359   0.388   0.41    56.95   75.36   67.65   48.3    43.08
IVMSA [11]    1.63    0.506   0.779   3.68    8.919   109.98  252.3   180.5   56.4    55.5

TABLE II
PERFORMANCE COMPARISON FOR THE PROPOSED SA AFTER DESIGN OPTIMIZATION (USING LOW-THRESHOLD TRANSISTORS (T2, T4 AND T7–T10) SHOWN IN FIG. 2)

Supply [V]  Pavg [nW]  Ppeak [μW]  E [fJ]  Delay [ps]  EDP [fJ×ps]  Idc [A]
1.2         459.1      11.7        73.45   31.4        2306.3       32.79
1.0         39.18      8.04        6.26    30.47       190.74       18.14
0.8         2.99       4.67        0.48    35.6        17.08        6.529
0.6         0.95       3.15        0.152   58.6        8.907        0.843
0.4         0.468      1.96        0.075   243.3       18.24        0.024

E. Memory Sensing Performance

The design presented in [10] suffers from high sensing delay because of the late discharging. The normalized energy dissipation of the proposed design and the design presented in [13] has been improved with similar sensing delay metrics. These results ensure the ultra speed feature of the proposed design to sense large memory arrays and prove the novel regenerative feedback advantage in sense amplifier design.

F. Performance Comparison Summary

The sense amplifier delay directly affects the memory access performance and contributes more to the EDP. Discharging at low supply voltages suits the proposed design in this regard due to the low output voltage swing. To conclude the exclusive implementation of various SAs, a performance comparison of the referred architectures is presented in Table III for supply voltages of 1.0 V and 0.4 V. The summary of SA performance of referred designs is presented in Table IV. The proposed design is asymmetric, unlike other sense amplifiers, but the effect is minimal as it enhances the symmetric latching operation. The high discharging rate with regenerative feedback makes the proposed design a better sense amplifier, with acceptable energy dissipation at higher supply voltages through the use of current blocking circuitry.

TABLE III
PERFORMANCE ANALYSIS OF VARIOUS SENSE AMPLIFIERS

              VDD = 1.0 V                      VDD = 0.4 V
Architecture  Ppeak [W]  E [fJ]  Delay [ps]    Ppeak [W]  E [fJ]  Delay [ps]
Proposed SA   5.41       2.03    46.4          1.24       0.06    1329
HSSA [10]     18.9       1532    65.1          5.77       3.96    62.7
LVMSA [13]    6.70       0.38    56.9          2.04       0.08    1585
IVMSA [11]    5.1        1.63    109           0.25       –       –

TABLE IV
PERFORMANCE COMPARISON SUMMARY OF REFERRED DESIGNS

Reference       [1]      [3]    [5]   [8]  [11]  (this)
Approach        CMOS LA  BTI    –     DLC  IVSA  Prop. SA
Process [nm]    350      45     180   90   180   45
Transistors     –        12     13    10   9     13
Supply [V]      2        1      2     1    1.8   1
Delay [ps]      –        36.19  6860  17   20    30.47
Bandwidth [Hz]  66 M     –      –     3 G  11 G  –
Power [μW]      620      –      1.95  154  99    8.1

IV. CONCLUSION

A state-of-the-art ultra speed latch-type sense amplifier has been presented. The novel regenerative feedback speeds up the latching operation with acceptable energy dissipation. The lower voltage drop during phase change among the compared designs provides the best energy-delay metric even at near-threshold voltages. The proposed sense amplifier provides the best sensing delay and EDP metrics among the compared architectures and is tolerant to process and voltage variations. The performance comparison results show that the proposed design latches in just 46 ps at 1.0 V. The supply voltage can be scaled to 0.4 V with a low EDP of 88 fJ×ps.

REFERENCES

[1] J. Ramos, J. L. Ausín, G. Torelli, and J. F. Duque-Carrillo, “Design trade-offs for sub-mW CMOS biomedical limiting amplifiers,” Microelectron. J., vol. 44, no. 10, pp. 904–911, Oct. 2013.
[2] M. Sharifkhani, E. Rahiminejad, S. M. Jahinuzzaman, and M. Sachdev, “A compact hybrid current/voltage sense amplifier with offset cancellation for high-speed SRAMs,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 19, no. 5, pp. 883–894, May 2011.
[3] I. Agbo, S. Khan, and S. Hamdioui, “BTI impact on SRAM sense amplifier,” in Proceedings of the IEEE Design and Test Symposium, 2013, pp. 1–6.
[4] F. Centurelli, A. Simonetti, and A. Trifiletti, “An improved common-mode feedback loop for the differential-difference amplifier,” Integration, Analog Integr Circ Sig Process, vol. 74, no. 1, pp. 33–48, Jan. 2013.

[5] S. L. M. Hassan, I. Dayah, and I. S. A. Halim, “Comparative study on 8T SRAM with different type of sense amplifier,” in Proceedings of the IEEE International Conference on Semiconductor Electronics, 2014, pp. 321–324.
[6] S. Mishra and A. Dandapat, “EMDBAM: A low-power dual bit associative memory with match error and mask control,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 24, no. 6, pp. 2142–2151, Jun. 2016.
[7] S. Mishra, T. V. Mahendra, and A. Dandapat, “A 9-T 833-MHz 1.72-fJ/bit/search quasi-static ternary fully associative cache tag with selective matchline evaluation for wire speed applications,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 63, no. 11, pp. 1910–1920, Nov. 2016.
[8] H. Jeon and Y. B. Kim, “A novel low-power, low-offset, and high-speed CMOS dynamic latched comparator,” Analog Integrated Circuits and Signal Processing, vol. 70, no. 3, pp. 337–346, Mar. 2012.
[9] M. Kavitha and T. Govindaraj, “Low-power multimodal switch for leakage reduction and stability improvement in SRAM cell,” Arab J Sci Eng, pp. 1–11, 2016.
[10] I. S. A. Halim, N. H. Basemu, and S. L. M. Hassan, “Comparative study on CMOS SRAM sense amplifiers using 90nm technology,” in Proceedings of the IEEE International Conference on Technology, Informatics, Management, Engineering, and Environment, 2013, pp. 171–175.
[11] P. Murugeswari, G. Anusha, P. Venkateshwarlu, M. Bhaskar, and B. Venkataramani, “A wide band voltage mode sense amplifier receiver for high speed interconnects,” in Proceedings of the IEEE TENCON, 2008, pp. 1–5.
[12] B. Wicht, T. Nirschl, and D. Schmitt-Landsiedel, “Yield and speed optimization of a latch-type voltage sense amplifier,” IEEE J. Solid-State Circuits, vol. 39, no. 7, pp. 1148–1158, Jul. 2004.
[13] J. Han et al., “A 64 × 32bit 4-read 2-write low power and area efficient register file in 65nm CMOS,” IEICE Electron. Exp., vol. 9, no. 16, pp. 1355–1361, 2012.
[14] M. F. Chang et al., “An asymmetric-voltage-biased current-mode sensing scheme for fast-read embedded flash macros,” IEEE J. Solid-State Circuits, vol. 50, no. 9, pp. 2188–2198, Sep. 2015.