
IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 46, NO. 11, NOVEMBER 2011 2713
A 250 mV 8 kb 40 nm Ultra-Low Power 9T Supply Feedback SRAM (SF-SRAM)
Adam Teman, Student Member, IEEE, Lidor Pergament, Omer Cohen, and Alexander Fish, Member, IEEE
Abstract: Low voltage operation of digital circuits continues to be an attractive option for aggressive power reduction. As standard SRAM bitcells are limited to operation in the strong-inversion regimes due to process variations and local mismatch, the development of specially designed SRAMs for low voltage operation has become popular in recent years. In this paper, we present a novel 9T bitcell, implementing a Supply Feedback concept to internally weaken the pull-up current during write cycles and thus enable low-voltage write operations. As opposed to the majority of existing solutions, this is achieved without the need for additional peripheral circuits and techniques. The proposed bitcell is fully functional under global and local variations at voltages from 250 mV to 1.1 V. In addition, the proposed cell presents a low-leakage state reducing power up to 60%, as compared to an identically supplied 8T bitcell. An 8 kbit SF-SRAM array was implemented and fabricated in a low-power 40 nm process, showing full functionality and ultra-low power.

Index Terms: CMOS memory integrated circuits, leakage suppression, SRAM, ultra low power.
Fig. 1. Schematics of standard SRAM bitcells. (a) 6T bitcell. (b) 8T bitcell.

I. INTRODUCTION
SUB-THRESHOLD and near-threshold operation have become popular alternatives for digital VLSI design in recent years [1]–[18]. These approaches utilize very low supply voltages for digital circuit operation, decreasing the dynamic power quadratically, and sufficiently reducing leakage currents. As static power is often the primary factor in a system's power consumption, especially for low to medium performance systems, supply voltage scaling for minimization of leakage currents is essential. Optimal power-delay studies show that the Minimum Energy Point (MEP) is found in the sub-threshold region, where ultra-low power figures are achieved; albeit, at the expense of orders-of-magnitude loss in performance [1], [19], [20]. Recent studies have shown that an attractive tradeoff between power and delay can be found in the near-threshold region with a slightly higher power figure, but with a significant increase in performance [1].

Low voltage operation of Static CMOS logic is quite straightforward, as its non-ratioed structure generally achieves robust operation under process variations and device mismatch [21]. However, extreme global and local variations that are intensified at deep-nanoscale technology processes can easily cause traditional ratioed topologies to lose functionality, especially at low voltages, where sizing and mobility are not always the dominant factors in device drive strength.

One of the major blocks that is highly ratioed, and therefore affected by the aforementioned process fluctuations, is the Static Random Access Memory (SRAM). The standard 6T bitcell, shown in Fig. 1(a), provides a robust, non-ratioed hold state, fully operational under process variations at very low supply voltages [22]. However, when the data is accessed (read and write operations), maintaining drive strength ratios is essential for maintaining functionality. Device sizing is sufficient to ensure these ratios under nominal strong inversion operation [23]; however, at low voltages, process variations and mismatch can cause a loss of functionality [24]. Both theoretical and measured analyses show that standard SRAM blocks are limited to operating voltages of no lower than 700 mV [8], [25], [26]. The read margin problem is easily solved with a small area penalty by decoupling the read out path, as in the two-port 8T cell, shown in Fig. 1(b). This structure features read margins equivalent to its hold margins; however, its write margins maintain the 700 mV supply limitation [7], [9]. Note that throughout the majority of this paper the 8T implementation will be used as a reference, as it has similar hold margins to the 6T cell, but improved read margins. Therefore, from a low voltage stability perspective, the 8T is superior to the 6T; however, this design introduces additional constraints due to single-ended readout and half-select sensitivity.

Manuscript received December 07, 2010; revised July 24, 2011; accepted July 25, 2011. Date of publication September 01, 2011; date of current version October 26, 2011. This paper was approved by Associate Editor Peter Gillingham.
The authors are with the Low Power Circuits and Systems Lab (LPC&S), VLSI Systems Center, Electrical and Computers Engineering Department, Ben-Gurion University, Be'er Sheva, Israel (e-mail: teman@ee.bgu.ac.il).
Digital Object Identifier 10.1109/JSSC.2011.2164009
0018-9200/$26.00 © 2011 IEEE

The requirement to develop low voltage SRAMs is derived not only from the need to integrate these blocks within sub/near threshold digital systems, but as an independent design focus in itself. SRAMs comprise very large portions of the total die area of most modern digital ICs, and this is only expected to grow [27]. Accordingly, the SRAM blocks are often the major factor in a given system's power consumption. They are even a more substantial factor in the static power consumption, as the majority of SRAM cells are in a standby hold state throughout the clock period. Standby power reduction in SRAMs is one of the best opportunities for power reduction in digital ICs.

Low voltage SRAM design has become increasingly popular over the past few years. Various bitcell designs and architectural techniques have been proposed to enable operation deep into the sub-threshold region [3], [7]–[9], [13], [28], [29]. These designs generally incorporate the addition of a number of transistors into the bitcell topology, trading off density with robust functionality. Decoupling the readout path is a common technique [7], [8], [28], as is write margin improvement by word line boosting and supply gating. One of the first examples of a fully operational sub-threshold bitcell is the 10T circuit of [8], while peripheral techniques enabled 350 mV operation of a standard 8T bitcell in [7]. Fully differential implementations are proposed in [3] and [30], enabling half-select stability. Other recent approaches for aggressive power reduction in SRAMs include Adiabatic operation [31], FinFET based SRAM design [32] and bitline charge recycling [33], [34].

In this paper, we propose a novel 9T bitcell with a new Supply Feedback approach to low voltage operation. This Supply Feedback SRAM (SF-SRAM) internally weakens the pull up network of the bitcell during a write and thus assists in discharging the high data node. By applying this concept, the cell achieves highly improved write margins and is fully functional under global and local process variations at voltages as low as 250 mV. This low voltage operation is accomplished without the need for any additional peripheral circuits or techniques. Cell functionality and stability are presented, including Monte Carlo (MC) statistical distributions for proof of concept.

In addition to low voltage operation, the cell topology presents a low-leakage state, at which the bitcell's static power is lower than an equivalently supplied 8T cell. In this state, the static power of the bitcell has a 15%–60% lower leakage factor than a standard 8T bitcell, depending on the cell implementation and supply voltage. This results in an 83× reduction in leakage power for an SF-SRAM cell operating at 300 mV, as compared to an 8T bitcell at 1.1 V.

An 8 kb array of SF-SRAM bitcells was fabricated as part of a 40 nm test chip and successfully measured in the laboratory. A deep analysis of the operating concepts, internal mechanisms, stability, power and performance is presented in this manuscript.

The rest of the paper is constructed as follows. Section II describes the problems that arise at low operating voltages and thus limit standard SRAM operation. The proposed 9T SF-SRAM bitcell is presented in Section III and its data retention is discussed; Section IV describes the read and write operations of the proposed cell. Test chip implementation and measurements are presented in Section V, including a discussion of design considerations according to the results. Section VI summarizes the results and concludes the paper.

II. DRIVE STRENGTH RATIOS AT LOW OPERATING VOLTAGES

Similar to other low voltage bitcells, the proposed 9T bitcell is based on the two-port 8T topology, shown in Fig. 1(b). This structure presents two major advantages. First, the Static CMOS cross coupled inverter structure of the cell core provides a non-ratioed robust positive feedback structure for bi-stability with high data retention ("hold") noise margins. Second, the readout buffer, comprised of transistors M7-M8, decouples the read path from the cell core, enabling a non-penetrative read operation. This makes the worst case read margin equivalent to the hold margin, as described in [35]. To ensure writeability, the write bit lines (WBL/WBLB) must be able to pull down the internal node (Q/QB) storing a '1' past the cross coupled inverters' trip point. This is achieved by maintaining stronger access transistors (M2/M5) than their respective pull up pMOS (M3/M6). At strong inversion voltages, mobility ($\mu$), device dimensions ($W/L$), and threshold voltage ($V_T$) have a linear (or square) influence on device currents, so it is usually sufficient to maintain a higher pull-down transconductance, i.e.,

(1)

with $g_m$ the device's transconductance, and $C_{ox}$ the oxide capacitance coefficient.

As the supply voltage is reduced and approaches the sub-threshold region, the device's current dependence on the overdrive voltage ($V_{GS} - V_T$, where $V_{GS}$ is the gate-to-source voltage) grows from a linear dependence to an exponential one. This is shown by the combination of two models: the sub-threshold conduction model, and the EKV model [36]. For voltages well below $V_T$, the sub-threshold conduction model shows the exponential dependence on overdrive [6]:

$$I_{DS} = I_0 \, e^{\frac{V_{GS} - V_T + \eta V_{DS}}{n \phi_t}} \left(1 - e^{-\frac{V_{DS}}{\phi_t}}\right) \tag{2}$$

with

$$I_0 = \mu_0 C_{ox} \frac{W}{L} (n - 1) \phi_t^2 \tag{3}$$

where $V_{DS}$ is the transistor's drain-to-source voltage, $\phi_t$ is the thermal voltage, $\eta$ is the Drain Induced Barrier Lowering (DIBL) coefficient, $n$ is the subthreshold swing coefficient, $\mu_0$ is zero bias mobility, $C_{ox}$ is gate oxide capacitance, and $W$ and $L$ are the width and length of the transistor, respectively. Note that this model also emphasizes the exponential increase of DIBL as $V_{DS}$ increases, as will be discussed later.

When the device approaches inversion, as $V_{GS}$ gets close to $V_T$, the sub-threshold model of (2) loses accuracy and the EKV model should be considered [1]:

(4)
where the terms of (4) are the current at the onset of inversion, a fitting parameter, and IC, the inversion coefficient, given by

(5)

Based on these models, it can easily be concluded that threshold voltage has a strong effect on device current. Therefore, differences between the threshold voltages of nMOS and pMOS devices, especially under extreme process variations, often overtake mobility as the primary factor in the device strength ratio. The problem is accentuated at the Slow nMOS/Fast pMOS (SF) process corner, where the pMOS devices become stronger than equivalently sized nMOS devices at voltages below 900 mV. This variation, which causes 8T SRAMs to fail at low voltages, is shown in Fig. 2, plotting the current ratios of minimum sized nMOS devices as compared to pMOS devices at different supply voltages. The figure shows that for a typical process corner at room temperature, the nMOS is always stronger, but for the SF corner at 0 °C, the pMOS becomes stronger below 900 mV. In the near-threshold and sub-threshold regions, the pMOS current is as much as 35× stronger, such that meeting the writeability constraints through sizing would require impractical device sizes.

Fig. 2. Simulated drive current ratio of minimum sized nMOS/pMOS devices at the typical (TT) and SF corners in a standard 40 nm LP process. Whereas the nMOS devices are stronger than the pMOS devices at all supply voltages for the typical corner, the pMOS devices are stronger than their nMOS equivalents at the SF corner under 900 mV.

Two relatively simple methods to improve this ratio are wordline boosting and increasing the channel length, as incorporated by the authors of [7]. Adding approximately 150 mV to the gate voltage of an nMOS will ensure stronger drive than an equivalent pMOS at sub and near-threshold voltages, but this comes at the expense of additional periphery and power to create and propagate this boosted voltage. Enlarging the channel length utilizes the Reverse Short Channel Effect (RSCE) [9] and reduces the susceptibility to process variations, therefore maintaining a better ratio. However, this alone cannot solve the problem for most processes, and the increase in device size and/or loss of drive current must be taken into account. In addition, layout implications at deep-nanoscale processes are non-trivial. Therefore, a standard 8T two-port SRAM cell is limited to a minimum voltage supply of approximately 700 mV for most nanoscale CMOS processes [8].

III. THE PROPOSED 9T CELL

The previous section presented the writeability limits of standard SRAM cells due to drive strength ratios between contending devices under process variations. The proposed 9T Supply Feedback SRAM bitcell, shown in Fig. 3, was developed to overcome this problem by utilizing a novel Supply Feedback concept. This concept is implemented by adding a supply gating transistor (M9) that is connected in a feedback loop to the data storage node (Q). The feedback weakens the pull up path of the cell during a write operation, ensuring that the cell flips, even when the pMOS devices are much stronger than the nMOS ones. In addition, the internal gating creates a slight voltage drop at the drain of the supply feedback transistor (M9) during one of the hold states. This voltage drop results in lower leakage currents at the expense of a slight reduction of hold noise margins. Hold states and stability of the 9T SF-SRAM cell are described in detail below, whereas read and write operations are described in Section IV.

Fig. 3. Schematics of the 9T SF-SRAM bitcell employing a Supply Feedback device (M9) connected to the internal data storage node (Q).

A. 9T Cell Hold States

In contrast with the symmetric 6T and 8T bitcells, the 9T SF-SRAM bitcell, shown in Fig. 3, presents a pair of asymmetric stable states for data storage. For the case of storing a '0' (i.e., node Q is discharged), the feedback loop turns on M9, propagating $V_{DD}$ to the virtual supply node, $V_{VDD}$. In this "trivial" case, the internal transistors (M1, M3, M4 and M6) create a standard pair of cross-coupled Static CMOS inverters, with similar noise margins to an equivalent 8T cell. QB is therefore charged to $V_{DD}$, providing a strong gate bias for M7 and enabling fast single-ended readout (RBL discharge) when RWL is asserted.

The opposite state is initiated when Q is charged and QB is discharged. Now M9 is cut off as its source-to-gate voltage drops below the device's threshold voltage. QB is strongly discharged, as the low resistance of a conducting M4 only has to overcome the high serial resistance of disconnected M6 and M9. M1 is therefore strongly cut off, with a very low gate voltage, and M3 is conducting, such that the final state of Q is equivalent to $V_{VDD}$. This voltage is set according to the contention between the primarily DIBL current through M1 and the sub-threshold current through M9, which is much stronger, providing a high level. Ultimately, the steady state voltage at Q is approximately 10% lower than $V_{DD}$, and fluctuates with the implant of the supply feedback transistor.

Fig. 4 shows the Monte Carlo statistical distribution of the steady state voltages at the Q and QB nodes with a supply voltage of 300 mV, under both global and local mismatch. The figure includes the distributions for the two nodes at both the hold '1' and hold '0' states. It is clearly shown that for the hold '0' state, both nodes are clamped to strong levels, while at the hold '1' state, the Q node shows a voltage drop with a mean value of 264 mV, which is 12% less than $V_{DD}$. This voltage drop results in a slight loss of static noise margin (SNM), but enables low voltage writes and provides internal leakage suppression. This voltage drop does not affect the readability of the cell, even at extreme variations, as can be seen in the QB state depictions in Fig. 4. All the QB distribution points remained close to GND for the hold '1' state, and had less than a 1% voltage drop for the hold '0' state. This ensures a strong overdrive voltage on M7 during read operations for all states and under all operating conditions.

Fig. 4. Simulated statistical distribution of steady state voltages at nodes Q and QB (left and right side graphs, respectively) for both the hold '0' and hold '1' states (top and bottom graphs, respectively) with $V_{DD} = 300$ mV. The results are for a 2500 point simulation under both global and local mismatch variations. Note that the graphs for Q at hold '0' and QB at hold '1' are given in a different unit.

Fig. 4 shows the steady state voltages of an SF-SRAM bitcell implemented exclusively with standard (SVT) threshold implants at a discrete sub-threshold voltage. Fig. 5 extends the discussion by showing the distribution over the full range of supply voltages and with three different implementation options. These three implementations, which will be compared throughout the text, are achieved by replacing the pMOS transistors with low-threshold (LVT) or high-threshold (HVT) devices. In general, using an LVT feedback device results in a higher stable value for Q in the hold '1' state, and consequently increased hold margins, but it achieves less of an improvement in write margins and leakage currents. Using an HVT feedback device provides the opposite trade-offs. The steady state node voltages at Q are shown for all three implementations in Fig. 5 at the full range of operating voltages, from 200 mV to 1.1 V. In addition, the figure shows the voltage drop under local and global variations. The distributions of QB and of the hold '0' state are not shown, as they are very close to the rails with a very small deviation (<1 mV) for all implementations and supply voltages. Note that as shown in Fig. 5, whereas the SVT and LVT implementations retain hold stability under 200 mV, the HVT implementation loses functionality at a slightly higher supply voltage.

With respect to global variations, the most problematic corner is the Fast nMOS/Slow pMOS (FS) corner, as the leakage of M1 is strengthened as compared to the current through M9 and M3. But at this corner M4 is also strengthened, providing a robust '0' level at QB despite the degraded level at Q. This ensures correct readout through these extreme situations.
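The sensitivity of the device strength ratio to threshold voltage, which drives both the write failures of Section II and the contention that sets the hold '1' level, can be illustrated numerically. The following is a minimal sketch that assumes the commonly cited textbook form of the sub-threshold conduction model; the prefactor, swing coefficient, DIBL coefficient and threshold values are hypothetical illustration values, not figures from the 40 nm process.

```python
import math

PHI_T = 0.02585  # thermal voltage kT/q near 300 K, in volts


def subthreshold_current(v_gs, v_ds, v_t, i0=1e-7, n=1.5, eta=0.1, phi_t=PHI_T):
    """Sub-threshold drain current in the textbook form (assumed here).

    All voltages are magnitudes, so the same function serves nMOS and pMOS.
    i0, n and eta are illustrative values, not extracted device parameters.
    """
    gate_term = math.exp((v_gs - v_t + eta * v_ds) / (n * phi_t))
    drain_term = 1.0 - math.exp(-v_ds / phi_t)
    return i0 * gate_term * drain_term


def drive_ratio(v_dd, v_tn, v_tp):
    """nMOS/pMOS current ratio with both devices biased at |V_GS| = |V_DS| = V_DD."""
    i_n = subthreshold_current(v_dd, v_dd, v_tn)
    i_p = subthreshold_current(v_dd, v_dd, v_tp)
    return i_n / i_p


if __name__ == "__main__":
    v_dd = 0.3
    nominal = drive_ratio(v_dd, v_tn=0.45, v_tp=0.45)
    # Hypothetical "slow nMOS / fast pMOS" skew: +50 mV on nMOS, -50 mV on pMOS.
    skewed = drive_ratio(v_dd, v_tn=0.50, v_tp=0.40)
    print(f"nominal nMOS/pMOS ratio: {nominal:.3f}")
    print(f"skewed nMOS/pMOS ratio:  {skewed:.3f}")
```

In this toy setup a 100 mV total threshold skew changes the ratio by a factor of exp(0.1 / (n * phi_t)), which is why sizing alone struggles to restore the ratio once the supply drops below the threshold voltage.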
Fig. 5. Simulated mean and 6σ boundaries of the steady state voltage at node Q compared to $V_{DD}$ for the hold '1' state. Three implementation options are shown, marked by the threshold implant of the feedback transistor (M9): SVT, LVT and HVT. The distributions for the QB node and for the hold '0' state are very close to the rails for all supply voltages and therefore are not shown.

B. Static Noise Margin

As described above, the threshold implant of the supply feedback device (M9) has a large effect on the voltage drop and therefore on the noise margins of the hold '1' state. Fig. 6 emphasizes this fluctuation in noise margins, showing the butterfly curves of the SF-SRAM cell with different threshold implants applied to M9. All curves in Fig. 6 are shown for a 300 mV supply voltage. Note that the 8T cell SNM is shown for comparison and due to the fact that various other low voltage SRAMs are based on this structure; however, the reference 8T cell is non-functional at this voltage. As expected, the 9T butterfly curves are asymmetric, as the QB node always holds strong levels, whereas the Q node's steady state level is a function of the leakage through M1. As the gate voltage of M1 increases, the voltage level at Q decreases, leading to a "deflated" lobe. When M9 is implemented with an LVT device, the loss of static level is limited, in contrast with the HVT device that presents a severely deflated lobe. However, if ultra-low power has a higher priority than large static noise margins, the HVT or SVT implementations are attractive alternatives, since the hold '1' state provides a low leakage factor, as will be shown later.

Fig. 6. Butterfly curves of an SF-SRAM cell with three threshold implant options for the supply feedback transistor (M9): SVT, LVT and HVT, as well as the reference 8T butterfly curve. The simulation data was plotted for a 300 mV supply voltage.

A comparison of the SNM of the 9T SF-SRAM cell with the standard 8T SRAM cell at the full range of supply voltages (0.2 V to 1.1 V) is given in Fig. 7. Due to the asymmetric characteristic of the proposed cell, the SNM is partitioned into hold '0' SNM and hold '1' SNM graphs. These are identical for symmetric cells, such as the 8T SRAM. Obviously the worst case (Hold '1') should be considered; however, we choose to show both cases, as SNM is an over-pessimistic metric for an asymmetric cell. A better comparison could be according to Dynamic Noise Margin (DNM), as described in [37], [38].

Fig. 7. SNM ratio of 9T SF-SRAM cell as compared to standard 8T cell at full range of supply voltages. (a) Hold '1' state with depleted noise margins for three possible implementations of the 9T cell. (b) Hold '0' state with equivalent noise margins to the 8T cell. All graphs are shown according to simulation at the TT process corner.

As can be seen in Fig. 7(a), which shows the SNM ratio for the hold '1' state, the LVT implementation has a 10–20% lower SNM for the ultra-low voltages (200 mV–450 mV). The HVT option is unstable below 300 mV, with a quick fall in SNM. At the median to nominal voltages, the SVT option should be considered, as its SNM is comparable to the LVT implementation, and it presents a lower power figure, as shown in the next subsection. The HVT option has static noise margins that are approximately 40% lower than the reference 8T cell, and so should only be considered for ultra-low power applications in a low noise environment or with the integration of strong error correction codes. Note that for the hold '0' state, shown in Fig. 7(b), the SNM is hardly degraded for the LVT and SVT implementations throughout the full range of supply voltages. The HVT option has degraded margins for this state as well; however, they are substantially higher than the hold '1' state.

Fig. 8. Simulated leakage current ratio of the 9T SF-SRAM cell, as compared to the reference 8T bitcell for the Hold '1' state.
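The way a degraded high level at Q "deflates" one butterfly lobe can be reproduced with a short numerical experiment. The sketch below is an illustration only: it uses the standard largest-inscribed-square construction for SNM on idealized tanh-shaped inverter curves, which are an assumption and not the simulated transfer curves behind the butterfly plots. The function names, the gain value and the 264 mV high level (taken from the mean hold '1' level quoted for a 300 mV supply) are chosen for illustration.

```python
import numpy as np


def inverter_vtc(v_in, v_dd, v_oh=None, gain=20.0):
    """Idealized inverter transfer curve: a smooth tanh step from v_oh down to 0.

    This shape is an assumption for illustration, not a device-level model.
    """
    v_oh = v_dd if v_oh is None else v_oh
    return 0.5 * v_oh * (1.0 - np.tanh(gain * (v_in - 0.5 * v_dd) / v_dd))


def butterfly_snm(vtc_q, vtc_qb, v_dd, points=4001):
    """Return (hold '1' SNM, hold '0' SNM) from the largest inscribed squares.

    vtc_q maps V(QB) to V(Q); vtc_qb maps V(Q) to V(QB).
    Plot axes are x = V(QB), y = V(Q), so the upper-left lobe is the hold '1' lobe.
    """
    sweep = np.linspace(0.0, v_dd, points)
    x1, y1 = sweep, vtc_q(sweep)    # curve 1: Q as a function of QB
    x2, y2 = vtc_qb(sweep), sweep   # curve 2: QB as a function of Q, mirrored
    # c = y - x selects a 45-degree line; s is the position along that line.
    c1, s1 = y1 - x1, (x1 + y1) / np.sqrt(2.0)
    c2, s2 = y2 - x2, (x2 + y2) / np.sqrt(2.0)
    c = np.linspace(max(c1.min(), c2.min()), min(c1.max(), c2.max()), points)
    s1_on_c = np.interp(c, c1[::-1], s1[::-1])  # c1 decreases along the sweep
    s2_on_c = np.interp(c, c2, s2)              # c2 increases along the sweep
    diagonal = s1_on_c - s2_on_c  # positive inside the upper-left lobe
    snm_hold1 = diagonal.max() / np.sqrt(2.0)     # square side = diagonal / sqrt(2)
    snm_hold0 = (-diagonal).max() / np.sqrt(2.0)
    return snm_hold1, snm_hold0


if __name__ == "__main__":
    v_dd = 0.3
    full = lambda v: inverter_vtc(v, v_dd)
    deflated = lambda v: inverter_vtc(v, v_dd, v_oh=0.264)  # lowered high level at Q
    print("symmetric cell:", butterfly_snm(full, full, v_dd))
    print("deflated Q    :", butterfly_snm(deflated, full, v_dd))
```

Lowering only the high level of the Q-side curve shrinks the hold '1' square while leaving the hold '0' square at least as large, which is the qualitative asymmetry the butterfly curves describe.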
C. Static Power

The major advantage of the proposed 9T SF-SRAM cell is its functionality at ultra-low voltages without the need for additional periphery due to its extended write margins. However, a by-product of its topology is internal leakage suppression in the hold '1' state.

As shown above, the steady state voltage of the data storage node (Q) is slightly lower than $V_{DD}$ (see Fig. 5). On the one hand, this presents a decrease in SNM, but on the other hand it reduces the internal leakage of the bitcell. For the standard 8T cell, the static leakage currents are set (assuming a stored '1' and precharged bitlines) by the cut off inverter transistors, M1 and M6 (see Fig. 1), along with the right access transistor (M5). The leakages are dominated by DIBL. In the proposed SF-SRAM cell, assuming a stored '1', the left leakage path is reduced due to the degraded $V_{DS}$ of M1, while the right leakage path is suppressed by the stack effect of M9 in series with M6. M9 has a sub-threshold gate bias with low DIBL, while M6 is not as severely cut off as in the standard 8T case, but has lower DIBL. Leakage through the access transistors is bitline dependent but hardly changed, and gate leakages are slightly decreased on M4 and increased on M6. M9 provides an additional, albeit very small, gate leakage current.

The final result of these effects is a drop in total leakage, as depicted in Fig. 8 for the three threshold implementations of the supply feedback transistor across the full range of supply voltages. The figure depicts the ratio of leakage currents as compared to the reference 8T bitcell. It is clear that the higher the threshold of the SF device, the lower the leakage, achieving as much as a 63% reduction in static power for a nominally biased HVT implementation. For the more robust LVT implementation, a 20%–25% reduction is achieved for all applicable voltages. The graphs for the hold '0' state are not included, as the leakage power is almost identical to the 8T reference cell. Note that the reference 8T cell is non-functional under 700 mV, due to lack of writeability. The SVT implementation of the proposed cell in the hold '1' state presents a 91% (11×) and 99% (83×) leakage power reduction at 300 mV as compared to the 8T cell at 700 mV and 1.1 V, respectively.

The leakage reduction described above is, as far as we know, the first time a sub-threshold bitcell has been proposed with an advantage over the standard 8T cell in static power dissipation. The majority of the proposed low-voltage cells have slightly larger static power dissipation due to the additional transistors, and therefore additional leakage paths. The primary focus of all of these implementations is to aggressively reduce the system's power dissipation at the expense of performance and area.

Even though the proposed cell's additional power reduction is only present at one stable state, data processing algorithms can maximize the number of cells storing a '1' (for example, unused storage could be pre-written to this state), maximizing the leakage reduction of the array. It should be noted that the hold '1' state is also beneficial for read power and bitline leakage, as M7 is cut off at this state.

IV. READ AND WRITE OPERATIONS

The previous section discussed in detail the static operation of the proposed 9T SF-SRAM cell when holding a '1' or a '0'. However, the major benefits of this topology are its readability and writeability at low operating voltages, without the need to implement additional peripheral circuitry and techniques (compared to a standard 8T cell).

A. Read Operation

The read operation of the proposed cell is equivalent to that of the reference 8T cell. Transistors M7 and M8 comprise a read buffer that decouples the readout path from the internal cell storage. M7 is gated by node QB, such that when a '1' is stored (at Q), M7 is cut off and when a '0' is stored, it is conducting. A read operation is initiated by precharging the read bit line (RBL in Fig. 3) and asserting the read word line (RWL). If a '0' is stored, RBL is discharged through M7 and M8. If a '1' is stored, M7 blocks the discharge path and RBL remains at its precharged value. A single ended sensing scheme is used to recognize whether RBL has been discharged or not. Despite the deflated level of Q when holding a '1', QB is always clamped to $V_{DD}$ or GND, resulting in strong conductance through M7 when a '0' is stored and equivalent read performance to the 8T reference cell. This decoupling of the readout path results in a read margin equivalent to the hold margin described in the previous section, which is sufficient at the full range of supply voltages under global and local process variations. From a performance perspective, the read access time is proportional to a number of design factors, primarily bitline capacitance, sense amplifier sensitivity and drive strength of read buffer transistors (M7, M8). Therefore the read performance is application/architecture specific and very controllable according to required specifications. However, it is clear that it degrades severely with reduction of the supply voltage.
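The dependence of read access time on bitline capacitance, sense amplifier sensitivity and read buffer drive can be made concrete with a first-order estimate. The sketch below assumes a constant discharge current through M7 and M8, which is a simplification, and all numeric values are hypothetical rather than measured.

```python
def read_access_time(c_bitline, v_sense, i_read):
    """First-order time for the read buffer to pull RBL down by v_sense.

    Assumes a constant discharge current (t = C * dV / I); real bitline
    discharge is nonlinear, so treat the result as an order-of-magnitude guide.
    """
    if i_read <= 0:
        raise ValueError("read current must be positive")
    return c_bitline * v_sense / i_read


if __name__ == "__main__":
    c_rbl = 20e-15    # hypothetical read bitline capacitance, farads
    v_sense = 0.05    # hypothetical sense amplifier sensitivity, volts
    # Hypothetical read currents at a nominal and a sub-threshold supply.
    for label, i_read in (("nominal supply", 10e-6), ("sub-threshold supply", 10e-9)):
        t = read_access_time(c_rbl, v_sense, i_read)
        print(f"{label}: {t * 1e9:.3g} ns")
```

Because the read current falls roughly exponentially once the supply drops below the threshold voltage, the estimate grows by orders of magnitude for the same bitline and sense amplifier, consistent with the severe degradation noted above.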
B. Write Operation
The most interesting aspect of the proposed SF-SRAM cell

is its performance during a write operation. Similar to an 8T
cell, the write operation is initiated by driving the differential Fig. 9. Write ’1’ operation.
write bit lines (WBL and WBLB of Fig. 3) to the level of the data
to be written and asserting the write word line (WWL). To en-
sure the success of this operation in a standard 8T cell, the pull
down path on the side to be written to ’0’ must overcome the
pull up pMOS that was previously holding the ’1’ state. As pre-
viously mentioned, at strong inversion voltages, this is solved
by transconductance ratios, and is easily solved by sizing the
pull up pMOS devices equivalent to the nMOS access transis-
tors, as hole mobility is lower than electron mobility. However,
as the gate voltages approach the device’s threshold, the large
current fluctuation due to process variations often disrupts this
ratio. Therefore, even a downsized pMOS can overcome the ac-
cess transistor that is weakened due to higher , longer channel
length, degraded gate widths, etc., resulting in a failed write.
In the proposed cell, the feedback loop from to the gate of
M9 assists in the write operation by weakening the pull up path. Fig. 10. Write margin ratio of a Write ’1’ operation as compared to the refer-
Again, as the cell is asymmetric, the operation is quite different ence 8T cell as simulated at various process corners and at all operating voltages.
for the case of writing a ’1’ to a cell holding a ’0’ and vice versa.
Therefore, these two cases will be described separately.
1) The Write ’1’ Operation: Let us assume that the cell is to the reference 8T cell. The great advantage of the proposed
in the hold ’0’ state, i.e., is discharged to GND and is cell over the reference cell at the SF corner is accentuated by
charged to VDD. In order to write a ’1’, WBL is driven to VDD and WBLB is discharged to GND. WWL is asserted and the write operation commences, as illustrated in Fig. 9. The read buffer transistors are omitted from the figure, as they are irrelevant to this operation. Initially, M9 is strongly conducting, enabling full contention (along with M6) to the pull-down path through M5. If this situation persisted, the node discharged through M5 would be pulled down toward a steady-state voltage between VDD and GND. Under standard conditions, this voltage would be low enough to turn on M3 and cut off M1, initiating the positive feedback of the cross-coupled inverters and resulting in a successful write. However, under certain conditions, such as the SF corner, the steady-state voltage is too high to initiate this feedback and the write would ultimately fail. In this case, the feedback of M9 comes into play, as depicted in Fig. 9. As the data node is charged, M9 is weakened, lowering the contention to M5. This enables an “easy” write, further enhanced by the weakening of M1 as the complementary node is discharged.

The Write ’1’ margins achieved for the SVT implementation of the proposed bitcell are shown in Fig. 10, as compared to the reference 8T cell. This graph shows a vast rise in the write margin ratio as the supply voltage is lowered. Below 500 mV, the 8T write margin becomes negative, while the proposed cell maintains a positive margin down to below 200 mV under global variations. At the typical corner, the proposed cell provides an advantage over the 8T cell at voltages under 700 mV. At higher voltages and in the FS corner, the proposed cell’s write margins during a Write ’1’ operation are approximately 10% lower than those of the reference cell; however, they are still sufficient, as their absolute value is high at these voltages.

2) The Write ’0’ Operation: The Write ’0’ operation includes a similar feedback process that provides improved write margins. We will now assume that the data node is charged high and that the complementary node is discharged to GND. To write a ’0’, WBL is discharged to GND and WBLB is driven to VDD. Subsequently, WWL is asserted, providing the initial state illustrated in Fig. 11 (again omitting M7 and M8, which are irrelevant to the process). In this case, the advantage of the write operation is straightforward: not only does the data node initially reside at a lower voltage, but M9 is initially cut off, providing a pull-up current
that is no match for the strong discharge current of M2. The positive feedback of the cross-coupled inverters is quickly initiated, and the storage nodes are pulled to their respective rails.

The ratios of the Write ’0’ write margins as compared to the 8T cell are shown in Fig. 12 at various process corners throughout the full range of supply voltages. The behavior is similar to that described for the Write ’1’ operation, with the proposed cell showing an even greater advantage for this operation. For example, at the typical corner with a supply voltage of 700 mV, the 9T cell’s write margin for the Write ’0’ operation is 30% higher than that of the 8T cell, whereas for the Write ’1’ operation it is 5% higher.

Fig. 11. Write ’0’ operation.

Fig. 12. Write margin ratio of a Write ’0’ operation as compared to the reference 8T cell according to simulation at various process corners and at all operating voltages.

It is important to look at the dynamic stability of the cell’s write operation in addition to its static write margins. This stability takes into account the trajectory of the internal cell voltages during write access according to different write access times. A successful write occurs when the pulse width of the WWL signal exceeds the critical write access time, that is, the time that brings the cell state past its separatrix [37], [38].

Fig. 13 shows the trajectories of Write ’1’ (Fig. 13(a)) and Write ’0’ (Fig. 13(b)) operations at 300 mV under the SF corner. The figures show four cases: a typical (successful) write; a successful write with a pulse width slightly above the critical width; a failed write with a pulse width slightly below the critical width; and the trajectory of a (non-time-limited) failed write into an 8T bitcell under these conditions.

Fig. 13. Dynamic trajectories of write operations to the SF-SRAM cell at the SF corner with a 300 mV power supply. (a) Write ’1’ operation. (b) Write ’0’ operation. Both figures include the trajectories of: (1) a successful write operation with a very long write pulse; (2) a successful write operation with a write pulse close to the minimum width; (3) a failed write pulse just below the minimum width; and (4) the trajectory of a (failed) write attempt into an 8T reference cell under these conditions (with a very long write pulse).

C. Write Access Time

As previously explained, the major feature of the SF-SRAM cell, as compared to the standard 8T cell, is its immunity to process variations at low voltages, specifically around the SF process corner. When writing to a standard 8T cell at the SF corner, the pMOS-based pull-up path wins the ratioed fight against the nMOS-based pull-down path, such that the write fails. In the SF-SRAM cell, the internal feedback loop weakens the pull-up path either at the beginning or at the end of the
discharge process, enabling low-voltage writes in spite of the strong pMOS. Ultimately, this also strongly affects the access time, providing a substantial advantage to the SF-SRAM at the SF corner, which presents the slowest write time for the 8T cell.

Fig. 14 plots the access time ratio between the SF-SRAM cell and the reference 8T cell across the supply voltage range. The figure shows the ratio of the worst case access time due to global variations. For the nominal voltages (above 800 mV), the proposed cell has slightly longer access times for the Write ’1’ operation, as the supply feedback slows down the positive feedback of the cross-coupled inverters that is initiated once the trip point is crossed. However, as the voltage is lowered and the drive strength ratio between the nMOS access transistors (M2 and M5 in Fig. 1(b)) and the pMOS pull-up devices is degraded, the 8T cell’s access time becomes slower. The SF-SRAM cell is less affected in these cases, and therefore the worst case access time is faster for the proposed cell. The access time of the 8T cell degrades rapidly as the voltage is lowered, resulting in write failure below 500 mV. At this point the SF-SRAM’s worst case write access is approximately 10× faster than that of the reference 8T, and successful writes are achieved under global variations at voltages lower than 200 mV. The access time advantage is not unique to the worst case comparison. At the TT corner, for example, the proposed cell’s access time is shorter than that of the 8T cell at all voltages below 500 mV.

Fig. 14. Worst case write access comparison of the proposed SF-SRAM cell with the reference 8T cell throughout the range of supply voltages. Under 500 mV, the 8T cell is non-functional at the SF corner, so this corner is excluded from the comparison at these voltages (i.e., the plot shows the simulated ratio to the worst case, not including the SF corner).

It should be noted that the access times measured above take into account the delay from the rise of WWL until the discharging storage node passes the cell’s trip point. This definition is almost non-conditional when discussing an 8T cell, as at this point the cell’s positive feedback has been initiated and the storage nodes will quickly be pulled to their respective rails. The same is true for the Write ’0’ operation on the SF-SRAM cell, but not for the Write ’1’ operation. When a ’1’ is written to the SF-SRAM cell, the complementary node is quickly discharged, ensuring that the final state will be written once WWL is lowered. However, the negative feedback on the data node causes it to charge at a much slower pace. Therefore, if WWL is lowered before the data node charges to its final value, there will be a finite period during which the voltage at the data node is lower than the final state voltage presented in Section III. During this period the noise margins will be slightly lower than those simulated for the cell’s final DC state.

V. IMPLEMENTATION AND MEASUREMENTS

A. Test Chip Implementation and Architecture

The 9T SF-SRAM bitcell was implemented in a low-power 40 nm TSMC technology, using only standard process steps and multiple implants. An 8 kbit array consisting of 256 rows and 32 columns was integrated into a test chip and fabricated on a TSMC shuttle. The micrograph of the test chip is shown in Fig. 15. The layout of the 9T cell is shown in Fig. 16. This layout was implemented according to the process design rules, using minimal-width devices with slightly larger than minimal lengths to dampen the variability. Poly strips were kept equivalently sized and oriented for advanced node fabrication considerations, such as double patterning. The implemented layout leaves the option of adding threshold implant masks to the entire n-well area, as well as adding an HVT implant to M1, M2 and M5, further enhancing the cell’s hold ’1’ margin and reducing leakage, at the expense of increased write time and slightly reduced write margins. For area comparison, Fig. 16 also includes the layout of a standard two-port 8T cell with full adherence to the same design rules for a fair comparison, rather than comparing the area to previously reported “pushed rules” implementations. The 9T layout shows an increased area of approximately 20% over the 8T cell.

Fig. 15. Test chip micrograph and evaluation board.

Post-layout simulations confirmed functionality of the array. Several tests were carried out to ensure that parasitic effects would not impair its operability, such as coupling into the cell during RWL assertion, especially in the hold ’1’ state. This type of operation has a negligible effect (<1%) on the internal data node voltage, partially due to the horizontal routing of the power rails, which creates shielding between the WWL and RWL wires. Bitline cross-capacitance also has a negligible effect on the internal data levels in the final array.

In order to maintain focus on functionality of the cell and eliminate risks, the cell periphery was synthesized with standard cells and read sensing was implemented using a skewed inverter. This sensing scheme results in an obvious loss in performance at nominal voltages, but enables low voltage testing without the risk of dysfunctional sense amplifiers at these supplies. To decouple the power consumption of the non-optimal
peripheries from that of the array core, the array and internal peripherals (precharge circuits, write drivers, sense inverters) were powered by a separate supply. Level shifters were inserted at the interface between the standard digital circuitry and the array to enable standard operation of the digital circuits at nominal voltages, while testing the novel array at low voltages. A comprehensive built-in-self-test (BIST) block was integrated with the array to test functionality.

Fig. 16. Layout of 9T SF-SRAM bitcell and area comparison to standard 2-port 8T bitcell.

In order to test the performance of the array core without taking into account the non-optimal, high power peripherals, the array was operated on a falling-edge initiated half cycle. Therefore, all performance was tested assuming the decoding, write driving and precharge signals were ready prior to the operation. The timing period started with the falling edge of the clock and included down-shifting, word line charging, bitcell read/write, read sensing and up-shifting, finishing with the rising edge of the next clock. An 8T reference array was fabricated with the same operating principle for accurate performance and power comparisons. Typical waveforms, showing subsequent operation of writing and then reading a ’1’ and then a ’0’, are shown in Fig. 17.

Fig. 17. Timing diagram of subsequent writes and reads (a single bit is illustrated). (1) Write a ’1’; (2) Read the stored ’1’; (3) Write a ’0’; (4) Read the stored ’0’.

The clock signal (CLK) is positive-edge synchronized with the Write-Read Select (WR), Data-In, and Data-Out signals. The digital periphery drives the write bitlines (WBL, WBLB) or precharges the read bitline (RBL) according to the WR and Data-In signals. The write and read wordlines (WWL and RWL, respectively) are raised at the negative edge of the clock, following which the internal nodes are written or RBL is conditionally discharged. When writing a ’1’, as in cycle (1), the data node is charged to a level slightly lower than the supply voltage; however, the complementary node is fully discharged. When writing a ’0’, as in cycle (3), both internal nodes are clamped to their respective rails. The read out value of the cell (Data Out) is stable prior to the next positive clock edge and is held stable until the next negative clock edge. When reading a ’0’, as in cycle (4), the Data Out signal falls once RBL crosses the switching threshold of the skewed sensing inverter. It should be noted that the signals CLK, WR, Data In and Data Out are biased at a nominal supply voltage (1.1 V), whereas WBL, WBLB, RBL, WWL, RWL, and the internal cell nodes are connected to the lower supply.

Each column was divided into four 64-row partitions in a divided bitline (DBL) structure [39], to reduce RBL capacitance and off-row leakage. Each partition was separately precharged and pre-sensed, with the inverted bitline section signal propagated to a NOR gate at the bottom of each row. Write bitlines were common for all 256 rows. The complete array architecture, with the addition of the chip select (CS) signal, is shown in Fig. 18. The reference 8T array was compiled in the same fashion.

B. Test Chip Measurements

The test chips were packaged and connected to a Xilinx evaluation board (see Fig. 15) to enable test control with a standard FPGA. The arrays were tested both with complete array tests using an on-chip BIST and with specific tests programmed into the FPGA and propagated through the chip’s I/O pads. All packaged test chips functioned successfully over the full range of supply voltages from 400 mV to 1.1 V. As mentioned above, the test setup did not enable operation at voltages below 400 mV. An example set of waveforms, measured at one of the test chip’s interfaces, is shown in Fig. 19. The waveforms show writing a ’1’ and a ’0’ to two separate addresses and subsequently reading them out. Synchronization of the array to the negative clock edge is pointed out in the figure.

Power measurements were taken using the Agilent B1500A Semiconductor Device Analyzer. Fig. 20 shows the static (Hold) power of the array when loaded with all zeros and all ones. At the minimum measured voltage, 400 mV, the array
consumed 40 nW and 60 nW for the array loaded with ones and with zeros, respectively. As shown in the figure, the measured static power of the 8T reference array (entirely loaded with zeros) was very similar to that of the SF-SRAM when loaded with zeros.

Dynamic power during single Read and Write cycles is shown in Fig. 21. The write power is shown for writing a 32 bit word equally composed of zeros and ones, whereas the read power is shown only for reading a 32 bit vector of zeros, as the dynamic power consumption for reading a ’1’ is negligible. The read power was measured for a row in the top quarter of the array, presenting a worst case power figure in the DBL structure. At 400 mV, a write operation consumed 360 fW and a worst case read operation consumed 210 fW. At this voltage the array was operated at 1 MHz; however, this could be improved significantly with the integration of an enhanced sensing scheme. An overall comparison of figures of merit with the reference 8T array is given in Table I.

C. Discussion

Several design factors were presented and compared for the proposed 9T SF-SRAM, as an alternative to standard SRAM implementations. The first and foremost advantage of the SF-SRAM cell is its robust functionality at low operating voltages, much lower than those achievable with standard 6T or 8T cells. In addition, the SF-SRAM bitcell provides in-cell leakage suppression for one of its stable states, sufficiently reducing the array’s static power dissipation. These advantages come without the need for additional peripheral circuits and techniques, such as wordline boosting and supply gating, that present various drawbacks.

The improved characteristics of the SF-SRAM technique come at the expense of an increase in area and a reduction of traditional static noise margin metrics. Three implementation alternatives were discussed, trading off leakage suppression with static noise margins in the hold ’1’ state. For ultra-low power applications, operating at sub-threshold supply voltages and at very low frequencies, the LVT implementation should be considered. Using a low-threshold transistor as the supply feedback device reduces the voltage drop at the data node during hold ’1’ cycles, resulting in a minimal loss of static noise margins. The leakage reduction is less significant using this implementation due to the low resistance of such a device; however, the power reduction at such low operating voltages is already extremely advantageous.

For applications with higher supply voltages for higher performance, but with an emphasis on static power minimization, the SVT or HVT implementations should be considered. The loss of static noise margin is better tolerated at higher voltages, and the power saving is substantial. Integrating an SF-SRAM array with a data processing algorithm to maximize cells in the low-leakage state further enhances this advantage.
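The idea of maximizing the number of cells in the low-leakage state can be illustrated with a simple per-word inversion code. The sketch below is only an illustration under stated assumptions: the paper does not specify a data processing algorithm, the function names `encode_word` and `decode_word` are hypothetical, and the scheme assumes one extra flag bit per 32-bit word, which is not part of the fabricated array. It treats a stored ’1’ as the low-leakage state, consistent with the measured static power being lower for the array loaded with ones than with zeros.

```python
WORD_BITS = 32  # word width of the array described above (32 columns)


def encode_word(word, bits=WORD_BITS):
    """Return (stored_word, invert_flag) so that at least half the bits are '1'.

    Hypothetical helper: if the word holds fewer ones than zeros, its
    complement is stored instead and the flag records the inversion.
    """
    mask = (1 << bits) - 1
    word &= mask
    ones = bin(word).count("1")
    if ones * 2 < bits:  # fewer ones than zeros: store the complement
        return (~word) & mask, 1
    return word, 0


def decode_word(stored, invert_flag, bits=WORD_BITS):
    """Recover the original word from the stored word and its flag bit."""
    mask = (1 << bits) - 1
    return (~stored) & mask if invert_flag else stored & mask


if __name__ == "__main__":
    stored, flag = encode_word(0x0000000F)
    print(hex(stored), flag)
    print(hex(decode_word(stored, flag)))
```

With such a code, every stored word holds at least 16 ones out of 32 bits. Whether the saving outweighs the cost of the extra flag bit and the inversion logic would have to be evaluated for a specific design.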
Fig. 18. Architecture of the 256 × 32 bitcell array with divided wordline and level shifting.

Fig. 19. Measured waveforms at test-chip interface. The figure shows a single bit written to two different addresses and subsequently read out.

Fig. 20. Measured static power of the fabricated SF-SRAM array when all zeros and all ones are loaded. The reference 8T array was measured with all zeros loaded, for comparison.

VI. CONCLUSION

This paper presented a novel 9T Supply Feedback SRAM bitcell. The operational concepts and mechanisms inside the bitcell were discussed. Static and dynamic metrics were presented and compared to those of a standard 8T bitcell. Various implementations were proposed and tradeoffs were explained. The proposed bitcell provides robust functionality under global and local process variations throughout the full range of supply voltages, as low as 250 mV. This is achieved without the need for additional peripheral circuitry. In addition, in one of its stable states, the cell provides internal leakage suppression, resulting
in a 15%–60% reduction of static power as compared to an 8T cell at the same supply voltage (depending on the implementation).

An 8 kbit array of SF-SRAM bitcells was implemented, fabricated and tested in a Low Power 40 nm CMOS process. Measurement results show full functionality at all voltages between 1.1 V and 400 mV (the limit of the test chip).

Fig. 21. Measured dynamic power consumption of single read and write operations. The write operation included writing a 32-bit word with an equal number of ’1’ and ’0’ bits. The read operation included reading a 32-bit vector of ’0’ bits.

TABLE I
9T SF-SRAM FIGURES OF MERIT

ACKNOWLEDGMENT

The authors would like to thank Mr. N. Sever, the Zoran Corporation, and the Alpha Consortium for their help and support in the completion of this work.

REFERENCES

[1] D. Markovic, C. C. Wang, L. P. Alarcon, T.-T. Liu, and J. M. Rabaey, “Ultralow-power design in near-threshold region,” Proc. IEEE, vol. 98, pp. 237–252, 2010.
[2] B. Zhai, S. Pant, L. Nazhandali, S. Hanson, J. Olson, A. Reeves, M. Minuth, R. Helfand, T. Austin, D. Sylvester, and D. Blaauw, “Energy-efficient subthreshold processor design,” IEEE Trans. Very Large Scale Integration (VLSI) Syst., vol. 17, pp. 1127–1137, 2009.
[3] I. J. Chang, J. J. Kim, S. P. Park, and K. Roy, “A 32 kb 10T sub-threshold SRAM array with bit-interleaving and differential read scheme in 90 nm CMOS,” IEEE J. Solid-State Circuits, vol. 44, pp. 650–658, 2009.
[4] E. A. Vittoz, “Weak inversion for ultra low-power and very low-voltage circuits,” in IEEE Asian Solid-State Circuits Conf. (A-SSCC 2009), 2009, pp. 129–132.
[5] L. Chang, R. K. Montoye, Y. Nakamura, K. A. Batson, R. J. Eickemeyer, R. H. Dennard, W. Haensch, and D. Jamsek, “An 8T-SRAM for variability tolerance and low-voltage operation in high-performance caches,” IEEE J. Solid-State Circuits, vol. 43, pp. 956–963, 2008.
[6] S. Fisher, A. Teman, D. Vaysman, A. Gertsman, O. Yadid-Pecht, and A. Fish, “Digital subthreshold logic design: motivation and challenges,” in Proc. IEEE 25th Convention of Electrical and Electronics Engineers in Israel (IEEEI 2008), 2008, pp. 702–706.
[7] N. Verma and A. P. Chandrakasan, “A 256 kb 65 nm 8T subthreshold SRAM employing sense-amplifier redundancy,” IEEE J. Solid-State Circuits, vol. 43, pp. 141–149, 2008.
[8] B. H. Calhoun and A. P. Chandrakasan, “A 256-kb 65-nm sub-threshold SRAM design for ultra-low-voltage operation,” IEEE J. Solid-State Circuits, vol. 42, pp. 680–688, 2007.
[9] T. Kim, J. Liu, and C. H. Kim, “An 8T subthreshold SRAM cell utilizing reverse short channel effect for write margin and read performance improvement,” in IEEE Custom Integrated Circuits Conf. (CICC’07), 2007, pp. 241–244.
[10] A. Wang, B. H. Calhoun, and A. P. Chandrakasan, Sub-Threshold Design for Ultra Low-Power Systems. New York: Springer-Verlag, 2006.
[11] B. H. Calhoun and A. Chandrakasan, “Analyzing static noise margin for sub-threshold SRAM in 65 nm CMOS,” in Proc. 31st European Solid-State Circuits Conf. (ESSCIRC 2005), 2005, pp. 363–366.
[12] A. Raychowdhury, S. Mukhopadhyay, and K. Roy, “A feasibility study of subthreshold SRAM across technology generations,” in Proc. 2005 IEEE Int. Conf. Computer Design: VLSI in Computers and Processors (ICCD 2005), 2005, pp. 417–422.
[13] A. Wang and A. Chandrakasan, “A 180-mV subthreshold FFT processor using a minimum energy design methodology,” IEEE J. Solid-State Circuits, vol. 40, pp. 310–319, 2005.
[14] A. Vladimirescu, Y. Cao, O. Thomas, H. Qin, D. Markovic, A. Valentian, R. Ionita, J. Rabaey, and A. Amara, “Ultra-low-voltage robust design issues in deep-submicron CMOS,” in Proc. 2nd Annu. IEEE Northeast Workshop on Circuits and Systems (NEWCAS 2004), 2004, pp. 49–52.
[15] A. Wang and A. Chandrakasan, “A 180 mV FFT processor using subthreshold circuit techniques,” in 2004 IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, 2004, vol. 1, pp. 292–529.
[16] D. Bol, R. Ambroise, D. Flandre, and J. Legat, “Interests and limitations of technology scaling for subthreshold logic,” IEEE Trans. Very Large Scale Integration (VLSI) Syst., vol. 17, pp. 1508–1519, 2009.
[17] D. Bol, R. Ambroise, D. Flandre, and J. Legat, “Impact of technology scaling on digital subthreshold circuits,” in Proc. IEEE Computer Society Annu. Symp. VLSI (ISVLSI ’08), 2008, pp. 179–184.
[18] D. Bol, C. Hocquet, D. Flandre, and J. Legat, “Robustness-aware sleep transistor engineering for power-gated nanometer subthreshold circuits,” in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS 2010), 2010, pp. 1484–1487.
[19] B. H. Calhoun, A. Wang, and A. Chandrakasan, “Modeling and sizing for minimum energy operation in subthreshold circuits,” IEEE J. Solid-State Circuits, vol. 40, pp. 1778–1786, 2005.
[20] D. Markovic, V. Stojanovic, B. Nikolic, M. A. Horowitz, and R. W. Brodersen, “Methods for true energy-performance optimization,” IEEE J. Solid-State Circuits, vol. 39, pp. 1282–1293, 2004.
[21] A. Wang, B. H. Calhoun, and A. P. Chandrakasan, Sub-Threshold Design for Ultra Low-Power Systems. Secaucus, NJ: Springer-Verlag New York, Inc., 2006, Series on Integrated Circuits and Systems.
[22] H. Qin, Y. Cao, D. Markovic, A. Vladimirescu, and J. Rabaey, “SRAM leakage suppression by minimizing standby supply voltage,” in Proc. 5th Int. Symp. Quality Electronic Design, 2004, pp. 55–60.
[23] J. M. Rabaey, A. Chandrakasan, and B. Nikolić, Digital Integrated Circuits: A Design Perspective, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, 2003, p. 761.
[24] J. Wang, S. Nalam, and B. H. Calhoun, “Analyzing static and dynamic write margin for nanometer SRAMs,” in Proc. ACM/IEEE Int. Symp. Low Power Electronics and Design (ISLPED 2008), 2008, pp. 129–134.
[25] M. Yamaoka, N. Maeda, Y. Shinozaki, Y. Shimazaki, K. Nii, S. Shimada, K. Yanagisawa, and T. Kawahara, “Low-power embedded SRAM modules with expanded margins for writing,” in 2005 IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, 2005, vol. 1, pp. 480–611.
[26] K. Zhang, U. Bhattacharya, Z. Chen, F. Hamzaoglu, D. Murray, N. Vallepalli, Y. Wang, B. Zheng, and M. Bohr, “SRAM design on 65-nm CMOS technology with dynamic sleep transistor for leakage reduction,” IEEE J. Solid-State Circuits, vol. 40, pp. 895–901, 2005.
[27] International Roadmap for Semiconductors, ITRS, 2009 [Online]. Available: http://www.itrs.net/
[28] I. J. Chang, J. J. Kim, S. P. Park, and K. Roy, “A 32 kb 10T subthreshold SRAM array with bit-interleaving and differential read scheme in 90 nm CMOS,” in 2008 IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, 2008, pp. 388–622.
[29] J. P. Kulkarni, K. Kim, and K. Roy, “A 160 mV, fully differential, robust schmitt trigger based sub-threshold SRAM,” in Proc. ACM/IEEE Int. Symp. Low Power Electronics and Design (ISLPED), 2007, pp. 171–176.
[30] M. F. Chang, J. J. Wu, K. T. Chen, Y. C. Chen, Y. H. Chen, R. Lee, H. J. Liao, and H. Yamauchi, “A differential data-aware power-supplied (D²AP) 8T SRAM cell with expanded write/read stabilities for lower VDDmin applications,” IEEE J. Solid-State Circuits, vol. 45, pp. 1234–1245, 2010.
[31] S. Nakata, T. Kusumoto, M. Miyama, and Y. Matsuda, “Adiabatic SRAM with a large margin of VT variation by controlling the cell-power-line and word-line voltage,” in IEEE Int. Symp. Circuits and Systems (ISCAS 2009), 2009, pp. 393–396.
[32] A. Carlson, Z. Guo, S. Balasubramanian, R. Zlatanovici, T. J. K. Liu, and B. Nikolic, “SRAM read/write margin enhancements using FinFETs,” IEEE Trans. Very Large Scale Integration (VLSI) Syst., vol. 18, pp. 887–900, 2010.
[33] B. D. Yang, “A low-power SRAM using bit-line charge-recycling for read and write operations,” IEEE J. Solid-State Circuits, vol. 45, pp. 2173–2183, 2010.
[34] K. Kim, H. Mahmoodi, and K. Roy, “A low-power SRAM using bit-line charge-recycling,” IEEE J. Solid-State Circuits, vol. 43, pp. 446–459, 2008.
[35] T. H. Kim, J. Liu, and C. H. Kim, “A voltage scalable 0.26 V, 64 kb 8T SRAM with Vmin lowering techniques and deep sleep mode,” IEEE J. Solid-State Circuits, vol. 44, pp. 1785–1795, 2009.
[36] C. C. Enz, F. Krummenacher, and E. A. Vittoz, “An analytical MOS transistor model valid in all regions of operation and dedicated to low-voltage and low-current applications,” Analog Integr. Circuits Signal Process., vol. 8, pp. 83–114, Jul. 1995.
[37] M. Sharifkhani and M. Sachdev, “SRAM cell stability: A dynamic perspective,” IEEE J. Solid-State Circuits, vol. 44, pp. 609–619, 2009.
[38] J. Wang, S. Nalam, and B. H. Calhoun, “Analyzing static and dynamic write margin for nanometer SRAMs,” in Proc. 13th Int. Symp. Low Power Electronics and Design (ISLPED 2008), 2008, pp. 129–134.
[39] A. Karandikar and K. K. Parhi, “Low power SRAM design using hierarchical divided bit-line approach,” in Proc. Int. Conf. Computer Design: VLSI in Computers and Processors (ICCD ’98), 1998, pp. 82–88.

Adam Teman (S’10) received the B.Sc. degree in electrical engineering from Ben-Gurion University, Be’er Sheva, Israel, in 2006. He worked as a Design Engineer at Marvell Semiconductors from 2006 to 2007, with an emphasis on physical implementation. He completed the M.Sc. degree at Ben-Gurion University in 2011. He is currently pursuing the Ph.D. degree under Dr. Alexander Fish as part of the Low Power Circuits and Systems (LPC&S) Lab in Ben-Gurion University’s VLSI Systems Center.
Mr. Teman’s research interests include low-voltage digital design, energy-efficient SRAM and Flash memory arrays, low-power CMOS image sensors and low-power design techniques for digital and analog VLSI chips. He has authored 11 scientific papers and two patent applications, and has presented excerpts from his research at a number of international conferences.
In 2010, Mr. Teman was honored with the Electrical Engineering Department’s “Teaching Excellence” recognition at Ben-Gurion University. He is a recipient of the Kreitman Foundation Fellowship for Doctoral Studies and received the Yizhak Ben-Ya’akov HaCohen Prize in 2010.

Lidor Pergament received the B.Sc. degree in electrical engineering from Ben-Gurion University, Be’er Sheva, Israel, in 2011. As part of his senior project, he worked on low-voltage/low-power SRAM design and culminated his work with the fabrication of the first 40 nm test chip by an academic group in Israel, two scientific papers, and two patent applications. Currently he is a Design and Verification Engineer with Mellanox Technologies, Tel Aviv, Israel.
Mr. Pergament’s senior project earned him an award of merit for outstanding projects in the BGU Department of Electrical Engineering for the 2009–2010 academic year.

Omer Cohen received the B.Sc. degree in electrical engineering from Ben-Gurion University, Be’er Sheva, Israel, in 2011. As part of his senior project, he worked on low-voltage/low-power SRAM design and culminated his work with the fabrication of the first 40 nm test chip by an academic group in Israel, two scientific papers, and two patent applications. Currently he is working as a Design and Verification Engineer at Marvell Semiconductors, Petah Tikva, Israel.
Mr. Cohen’s senior project earned him an award of merit for outstanding projects in the BGU Department of Electrical Engineering for the 2009–2010 academic year.

Alexander Fish (S’04–M’06) received the B.Sc. degree in electrical engineering from the Technion, Israel Institute of Technology, Haifa, Israel, in 1999. He completed the M.Sc. degree in 2002 and the Ph.D. degree (summa cum laude) in 2006, both at Ben-Gurion University in Israel.
He was a postdoctoral fellow in the ATIPS Laboratory at the University of Calgary, Canada, from 2006 to 2008. In 2008, he joined Ben-Gurion University, Israel, as a faculty member in the Electrical and Computer Engineering Department. There he founded the LPCAS Laboratory, specializing in low-power circuits and systems. His research interests include low-voltage digital design, energy-efficient SRAM and Flash memory arrays, low-power CMOS image sensors and low-power design techniques for digital and analog VLSI chips. He has authored over 60 scientific papers and patent applications. He has also published two book chapters.
Dr. Fish was a co-author of two papers that won the Best Paper Finalist awards at the ICECS’04 and ISCAS’05 conferences. He also received the Young Innovator Award for Outstanding Achievements in the field of Information Theories and Applications by ITHEA in 2005. In 2006, he was honored with the Engineering Faculty Dean “Teaching Excellence” recognition at Ben-Gurion University. He serves as Editor in Chief for the MDPI Journal of Low Power Electronics and Applications (JLPEA) and as an Associate Editor for the IEEE SENSORS JOURNAL. He was a co-organizer of special sessions on “smart” CMOS Image Sensors at IEEE Sensors Conference 2007, on low-power “Smart” Image Sensors and Beyond at the IEEE ISCAS 2008 and on Design Methodologies for Advanced Ultra Low Power Sensor and Memory Arrays at the IEEE Sensors Conference 2009.

