
High-Performance and Low-Voltage Sense-Amplifier Techniques for sub-90nm SRAM

Manoj Sinha*, Steven Hsu, Atila Alvandpour, Wayne Burleson*, Ram Krishnamurthy, Shekhar Borkar
Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, USA*
Microprocessor Research Labs, Intel Corporation, Hillsboro, OR 97124, USA

ABSTRACT

Large bit-line capacitance is one of the main bottlenecks to the performance of on-chip caches. New sense amplifier techniques need to explicitly address this challenge. This paper describes two sensing techniques to overcome this problem, a current sense amplifier (CSA) and a charge transfer sense amplifier (CTSA), and their implementation based on 90nm CMOS technology. The current sense amplifier senses the cell current directly and shows a speed improvement of 17-20% for 128 memory cells as compared to the conventional voltage mode sense amplifier, for the same energy. The other is a charge transfer sense amplifier that takes advantage of large bit-line capacitance for its operation. CTSA shows an improvement of 18-22% in read delay for 128 memory cells and consumes 15-18% less energy than the voltage mode sense amplifier. CTSA results in reduced bit-line swing, which in turn leads to 30% lower bit-line energy than the conventional voltage mode.

I. INTRODUCTION

Technology and supply voltage scaling continues to improve the logic circuit delay with each technology generation. However, the speed of the overall circuit is increasingly limited by the signal delay over long interconnects and heavily loaded bit-lines due to increased capacitance and resistance [1]. SRAM design is constrained by its compact area requirement, which forces the use of near minimum sized transistors for the memory cell design. The small memory cell must drive large capacitive bit-lines, resulting in a very small signal swing. This will limit the speed of any sensing scheme that requires the development of a specific level of differential voltage to initiate the sensing operation. The key strategy to overcome the speed limitation may have to focus on diminishing the bit-line swing to reduce both the delay and energy involved in charging and discharging the bit-lines [2]. The small signal swing of the bit-lines calls for a sense-amplifier circuit design that can sense the signal reliably at a high speed. We propose to overcome the speed and power limitations of large capacitive bit-lines using two different circuit techniques: current sensing and charge transfer sensing.

We propose a modified current sensing circuit, a modification of the Clamped Bit Line sense amplifier [3], with low input resistance to achieve high-speed operation. The cell current is directly used as a signal and is detected by the current sense amplifier. This does not depend on differential discharging of the large bit-line capacitance for sensing and thus leads to speed improvement. In contrast, voltage mode sensing [9] relies on the differential discharge of the bit-line capacitance and hence will incur a larger delay as compared to the current sensing technique.

We also propose a differential charge transfer amplifier that takes advantage of the increased bit-line capacitance and also offers low-power operation without sacrificing speed. Heller et al. [6] proposed a balanced charge transfer amplifier. Kawashima et al. [7] recently proposed a similar charge transfer amplifier for SRAMs for low-power applications. The operation of the CTSA is based on the charge re-distribution mechanism between the very high bit-line capacitance and the low output capacitance of the sense amplifier. Fig. 1 shows the overall architecture of the cache used for the experiments.

A word consists of 128 bits traversed by an uninterrupted word line. A cache column holds 128 memory cells and the size of the cell determines the length of the bit-line. The bit-line is modeled using 128 memory cells and π-3 model distributed interconnects. The bit-line is connected to the sense amplifier through the column multiplexer.

Fig. 1: Bit-Line Architecture used for experiment

This paper is organized as follows: Section 2 introduces the proposed current sense amplifier followed by the simulation results and comparison with the voltage mode sense amplifier. Section 3 describes the basic working principle of the differential charge transfer sense amplifier (CTSA) followed by the simulation results and comparison with its voltage mode counterpart. Section 4 concludes the results.
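
The two capacitance arguments made in the introduction can be illustrated with a back-of-the-envelope sketch. Voltage-mode sensing has to wait until the cell current has discharged the bit-line by the required differential, roughly t = C_bl * dV / I_cell, so the wait grows with bit-line capacitance. Charge transfer, by the charge-conservation relation given in Section III, needs a bit-line swing of only C_out * V_out / C_b, which shrinks as the bit-line capacitance grows. The function names and all component values below are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
def bitline_develop_time(c_bitline, delta_v, i_cell):
    """First-order time (s) for the cell current to develop delta_v on the bit-line.

    t = C * dV / I, assuming a constant cell current and ignoring wire resistance.
    """
    if i_cell <= 0:
        raise ValueError("cell current must be positive")
    return c_bitline * delta_v / i_cell


def charge_transfer_bitline_swing(c_out, v_out, c_bitline):
    """Bit-line swing (V) needed to charge the amplifier output node to v_out.

    Charge conservation: C_b * dV_b = C_out * V_out, valid when C_b >> C_out.
    """
    if c_bitline <= 0:
        raise ValueError("bit-line capacitance must be positive")
    return c_out * v_out / c_bitline


# Hypothetical values for illustration only (not taken from the paper).
C_PER_CELL = 1.0e-15  # bit-line capacitance per attached cell, farads
I_CELL = 20e-6        # cell read current, amperes
DELTA_V = 0.1         # differential a voltage-mode amplifier waits for, volts
C_OUT = 5.0e-15       # sense-amplifier output node capacitance, farads
V_OUT = 0.5           # output voltage handed to the latch, volts

for cells in (64, 128, 256):
    c_bl = cells * C_PER_CELL
    t = bitline_develop_time(c_bl, DELTA_V, I_CELL)
    swing = charge_transfer_bitline_swing(C_OUT, V_OUT, c_bl)
    print(f"{cells:3d} cells: voltage-mode wait {t * 1e12:6.0f} ps, "
          f"charge-transfer bit-line swing {swing * 1e3:5.1f} mV")
```

With these placeholder numbers the voltage-mode wait grows in proportion to the number of cells on the bit-line, while the swing required by charge transfer shrinks. This is the sense in which the CTSA takes advantage of a large bit-line capacitance.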

† This work is partially supported by SRC Task 766.

0-7803-8182-3/03/$17.00 ©2003 IEEE

II. CURRENT SENSE AMPLIFIER

The current sense amplifier consists of two parts: one is the current-transporting circuit with unity-gain current transfer characteristics and the second is the current sense amplifier that senses the differential current. Fig. 2 shows the circuit implementation for the modified current sense amplifier. The current-transporting circuit consists of four equally sized PMOS transistors (P1 through P4) with positive feedback [2]. The intermediate nodes (Int and Int#) are pre-equalized using a PMOS to prevent latching and to ensure the same delay for all read cycles. The current-transporting circuit output is connected to the second current sense amplifier. The current sense amplifier is a modification of [3]. The circuit of [3] has been modified by shifting the sense amplifier enable signal closer to the output. All the nodes of the circuit have been pre-charged to full Vcc or pre-discharged to ground.

Fig. 2: Schematic of the Modified Current-Sense amplifier

The operation of the circuit is in two phases. In the precharge phase, the bit-lines and the output nodes sa and sa# are pre-charged high. This causes nodes A and B to be pre-discharged to ground. In the evaluation phase, Ysel is grounded and the current is immediately transported to the nodes A and B through the drains of the PMOS devices (P3 and P4). The difference in the current flowing through A and B will be equal to the cell current. The sense amplifier is enabled two inverter delays after SAen is pulled high. During this interval, bias current flows through the two legs of the sense amplifier, while PMOS (M) keeps the output equalized. After this two-inverter delay, PMOS (M) is disabled and the differential current flowing through the sense amplifier causes a differential voltage to be developed at sa and sa#. This differential voltage is then amplified to CMOS logic levels by the high-gain positive feedback cross-coupled inverters. The sensing delay is relatively insensitive to bit-line capacitance, as this technique does not depend on differential discharging of the large bit-line capacitance. This scheme enhances the speed of the sense amplifier further due to the fact that there is a flow of bias current before the sense amplifier is actually enabled. This results in a small increase in the static power consumption. Fig. 3 shows the schematic of the conventional voltage mode sense amplifier [9] with which the proposed technique has been compared.

The voltage mode sense amplifier also operates in two phases. In the pre-charge phase, the bit-lines and the sense amplifier output are pre-charged high. Ysel is grounded during the evaluation phase and SAen is pulled high for sensing the bit-lines. The voltage mode sense amplifier requires differential discharging of the bit-line capacitance for sensing the voltage difference.

Fig. 3: Schematic of the conventional Voltage-mode sense-amplifier

Fig. 4 shows the timing scheme used for both the modified current sense amplifier and the voltage mode sense amplifier.

Fig. 4: Timing scheme for conventional voltage and current mode sense amp

A. Simulation Results

We have carried out the simulations based on 90nm, 110°C CMOS technology for typical corners [8]. We have compared the result of the modified current sense amplifier with the voltage mode sense amplifier (VM SA) and the Izumikawa current sense amplifier [4]. The total width of the transistors for all the sense amplifiers has been kept equal. The time between word line enable and sense-amplifier enable (Δt in Fig. 4) has been swept to sense increasing differential voltages or currents. Fig. 5 shows that the modified current sense
amplifier (Mod CSA) is 17-20% faster than the VM SA and 10% faster than the Izumikawa current sense amplifier.

Fig. 5: Energy-Delay plot for different sensing schemes (SA Delay Comparison at 110°C, TTTT; curves: VM SA, Modified CSA, Izumikawa CSA; x-axis: Total Energy in pJ)

The Mod CSA has 50% lower bit-line swing than voltage sensing. This is because, for current sensing, the discharge path to ground of both bit-lines is cut off by the cross-coupled PMOS devices (P1 and P2). Suppose bit-line bl remains high and bl# goes low. This will cause node A and Int# to go high, which will switch off PMOS (P2). So, the path through node B is cut off due to P2, while the path through node A is cut off as A is high. To reduce the bit-line swing for the voltage mode, Ysel has to be used as a precise pulse after the sense-amplifier output has reacted. Even after using Ysel as a precise pulse, the bit-line swing for the Mod CSA is 25% lower than voltage mode. We have also studied the secondary design issues for the Mod CSA: the impact of Vt mismatch and orthogonal noise on the bit-lines.

Threshold voltage (Vt) mismatch: NMOS transistor stacking in the Mod CSA makes it more vulnerable to Vt mismatch than the voltage mode sense amplifier. Fig. 7 shows the impact of worst-case Vt mismatch for the VM SA and the Mod CSA. The sense-amplifier delay increases for both techniques as the Vt difference in these NMOS device pairs increases. In the Mod CSA, two stacked NMOS devices will have Vt mismatch as opposed to just a single NMOS device with Vt mismatch in the voltage mode sense amplifier.

Fig. 7: Effect of worst-case Vt mismatch on sense-amp delay (Delay comparison with worst case Vt mismatch; x-axis: Vt mismatch in mV)

The Mod CSA delay increases and becomes larger than that of the VM SA for a Vt mismatch of more than 10mV. Here 10mV of mismatch for the Mod CSA indicates two pairs of NMOS devices each with 10mV of Vt mismatch, while it indicates just a single pair of NMOS devices with 10mV of Vt mismatch for the VM SA.

Effect of orthogonal noise: Current sensing is less sensitive to orthogonal noise. We injected an equal noise pulse onto the bit-line to determine the impact on current and voltage mode sensing. The simulation results show that the sense-amplifier delay increased by 25% for the Mod CSA and 53% for the VM SA. Current sensing presents a low-impedance path for input signals and hence the injected charge is removed faster than in voltage sensing. This result, and the fact that current sensing does not depend on differential discharging of the bit-line capacitance, motivates us to enable the Mod CSA faster than the VM SA.

III. CHARGE TRANSFER SENSE AMPLIFIER

This paper proposes a differential charge transfer sense amplifier (CTSA) for on-chip caches, which offers a high-performance and low-power solution. Fig. 8 shows the schematic of the CTSA.

Fig. 8: Schematic of Differential Charge Transfer Amplifier (CTSA)

The basic operation of the CTSA is based on the charge re-distribution from the high bit-line capacitance to the low capacitance of the nodes Sa and Sa#. This charge re-distribution results in high-speed operation and low bit-line swing. The circuit consists of two parts. First is the common-gate cascode formed by M1, M3 and M5 (and M2, M4 and M6), with PMOS M1 and M2 biased at Vb. Second, the cross-coupled inverters formed by M7 through M11 latch the output of the common-gate amplifier (Sa and Sa#).

In the pre-charge phase, the bit-lines and all the intermediate nodes (A, B, C and D) are pre-charged high. The output of the common-gate amplifier (Sa and Sa#) is pre-discharged low by keeping SAen high. In the evaluation phase, Pch is pulled high and Ysel is grounded to select a column. The CTSA is enabled by pulling SAen low. Suppose,
the bit-line bl# is going low. As the voltage of bl# goes near Vb + |Vtp|, M1 goes into the sub-threshold region of operation, preventing the output node Sa# from getting charged. However, the other bit-line bl remains high and charges the output node Sa to high. Initially, the NMOS pair M10 and M11 helps in rejecting the common-mode noise and thereafter helps in latching the value sensed by the common-gate amplifier.

Fig. 9: Timing Scheme for CTSA

A. Simulation Results

We have carried out the simulation based on 90nm CMOS technology [8]. We have used the VM SA, shown in Fig. 3, for comparison. The CTSA is 18-22% faster than the VM SA. Fig. 10 shows the sense amplifier delay comparison.

Fig. 10: Sense amplifier delay comparison (SA Delay Comparison at 110°C, TTTT; x-axis: Bit Line Differential in mV)

Fig. 11: Sense amplifier energy comparison (SA Energy comparison for VM SA & CTSA; x-axis: Bit Line Voltage Diff. in mV)

Fig. 11 shows that the CTSA consumes 15-18% less energy than the VM SA as a result of the charge redistribution from the high bit-line capacitance to the small output capacitance.

The energy consumed by the bit-lines charging and discharging in the CTSA is 30% less than in voltage sensing. This can be understood by the following simple equation. The bit-line (bl#) path to ground is cut off as the biased PMOS device goes into the deep-subthreshold region of operation. The bit-line bl discharges very little while charging the output of the common-gate amplifier. This roughly depends on the ratio of the capacitance of the bit-line bl and the output node (Sa or Sa#). Let Cout be the capacitance of Sa and Cb be the bit-line capacitance. Then according to charge conservation,

$$\Delta V_b = \frac{C_{out} \cdot V_{out}}{C_b}, \quad \text{where } C_b \gg C_{out}$$

Moreover, the bit-line bl does not have to charge the output to full Vcc as the cross-coupled inverters convert the differential voltage into full CMOS level.

IV. CONCLUSION

Current sensing is typically 17-20% faster than voltage sensing. The current sense amplifier can be enabled faster than voltage mode, as current sensing does not depend on differential discharging of the large bit-line capacitance. The bit-line swing for current sensing is 35-40% less than the voltage mode, but the energy saved with reduced bit-line swing is offset by the static power dissipation of the current sense amplifier. The charge transfer sense amplifier offers both speed and power advantages. The CTSA is faster than the voltage mode sense amplifier by 18-22%. The CTSA consumes 15-18% lower energy than the voltage mode sense amplifier. The bit-line swing for the CTSA is at least 20% lower than its voltage mode counterpart and this gives a quadratic saving of about 30% in the energy. The CTSA is as robust as voltage mode with respect to Vt mismatch. The CTSA offers the dual advantage of high speed and low energy. However, the CTSA suffers from the additional complexity in the design of the bias voltage generator. We are further studying the impact of secondary design issues like Vt mismatch, bit-line capacitance mismatch etc. to determine the robustness of these two circuits.

REFERENCES

[1] R. Krishnamurthy et al., "High-performance and Low-power Challenges for Sub-70nm Microprocessor Circuits", IEEE CICC, pp. 125-128, May 2002.
[2] E. Seevinck et al., "Current-Mode Techniques for High-Speed VLSI Circuits with Application to Current Sense Amplifier for CMOS SRAM", IEEE JSSC, vol. 26, no. 4, pp. 525-536, April 1991.
[3] T.N. Blalock et al., "A High-Speed Clamped Bit-Line Current-Mode Sense Amplifier", IEEE JSSC, vol. 26, no. 4, pp. 542-548, April 1991.
[4] M. Izumikawa et al., "A 400MHz, 300mW, 8Kb CMOS RAM Macro with a Current Sensing Scheme", pp. 595-598, IEEE CICC 1994.
[5] B. Wicht et al., "Analysis and Compensation of Bitline Multiplexer in SRAM Current Sense Amplifiers", IEEE JSSC, vol. 36, no. 11, pp. 1745-1755, Nov. 2001.
[6] L.G. Heller et al., "High sensitivity charge-transfer sense amplifier", IEEE JSSC, vol. SC-11, pp. 596-601, Oct. 1976.
[7] S. Kawashima et al., "A Charge-Transfer Amplifier and an Encoded-Bus Architecture for Low-Power SRAM's", IEEE JSSC, vol. 33, no. 5, pp. 793-799, May 1998.
[8] S. Thompson, "A 90nm Logic Technology Featuring 50nm Strained-Silicon Transistors, 7 Layers Copper Interconnect, Low-K ILD, and 1µm² SRAM Cell", IEDM Tech. Digest, pp. 61-64, 2002.
[9] T. Sakurai, "High-Speed Circuit Design with Scaled-Down MOSFET's and Low Supply Voltage", ISCAS, vol. 3, pp. 1487-1490, May 1993.