
A delay model allowing nano-CMOS standard cells statistical simulation at the logic level


Antonio Mastrandrea, Francesco Menichelli, Mauro Olivieri
DIET - Sapienza University of Rome, Italy
{mastrandrea, menichelli, olivieri}@die.uniroma1.it

Abstract: In nano-scale digital CMOS ICs, technology parameter variation
limits the usefulness of traditional corner-based timing simulation in favor
of statistical simulation. Yet, logic-level delay modeling featuring
technology-variation-aware timing is an open challenge. We present a new
semi-empirical delay model of digital CMOS cells, accounting for input slope
and technology parameters, featuring Spice-level accuracy and full suitability
for logic-level (i.e. fast) statistical timing simulation in an HDL
environment. The approach has been tested against Spice BSIM4 targeting a
library of 272 standard cells.

Keywords: delay model, standard cell, digital circuits, CMOS

I. INTRODUCTION

Timing analysis is a critical step in VLSI design flows, due to the presence
of nonlinear effects in nano-scale CMOS standard cells, such as velocity
saturation, input-output coupling, voltage feed-through [1] and the
short-circuit effect due to pull-up and pull-down transistors conducting
simultaneously [7], etc.

Traditionally, propagation delay estimation of logic cells in a complex IC is
accomplished by delay calculators relying on (deterministic) delay models of
the logic cells. Models used in delay calculators have evolved from simple
lookup tables to polynomial models, non-linear models and more recent
current-based models [6]. Nowadays, the growing statistical variability in
process parameters makes traditional corner-based approaches inadequate for a
realistic estimation of the fabrication yield. As a result, there is a strong
need for logic-level timing models capable of supporting statistical
simulation, in addition to deterministic delay simulations. The alternative of
circuit-level (SPICE) Monte Carlo statistical simulation is virtually
unfeasible for circuits of practical complexity, due to the enormous
computation time.

Many previous works targeted the definition of a compact model for the
(deterministic) delay of a CMOS stage. The first representative example was
[4], where the delay of a CMOS inverter is estimated as a function of the
alpha-power saturation current law in a CMOS transistor; yet, only a very
approximate empirical model for the effect of the input slew time is
developed. An empirical extension to a more complex gate, based on transistor
stacks, was introduced in [2]. A more complex, charge-based analytical model
was developed in [1]. Importantly, an output slew time analytical model is
presented in [3]. However, all the above aim to understand basic circuit
behavior and have no direct application to all types of (complex) CMOS logic
cells. The logical effort model [8] is a widely adopted paradigm for reasoning
about optimal circuit sizing, but as it was originally conceived for manual
optimization it is inherently a simplified, fully linear model, explicitly
neglecting transistor-stack diffusion capacitances, Miller and feed-through
effects, as well as non-linearity. Finally, a current-based statistical delay
model for individual cells is illustrated in [5], showing an analytical
approach to computing the statistical behavior.

We illustrate a semi-empirical delay model for arbitrarily complex logic CMOS
stages, aiming to allow (fast) Monte Carlo statistical analysis at the
hardware description language (HDL) logic level, with results practically
equivalent to a Spice-level Monte Carlo iteration. Our approach is
intrinsically more general than [1][2][3][4], and unlike [8] it aims at the
highest possible accuracy in modeling non-linear and parasitic effects. It
also differs from [5] in that we do not develop statistical models for each
single cell, but for specific sub-circuits (called logical drivers) that can
be combined to model virtually any CMOS cell, by means of executable HDL
specifications. The approach diverges from statistical static timing analysis
(SSTA) because we target statistical simulation, allowing the designer to see
statistical effects on the operation of the digital system on actual data.

II. DELAY MODEL

The model relies on a paradigm for describing CMOS cells that we refer to as
logical drivers. A logical driver is a logic unit which is an abstraction of a
current path flowing between the output of a cell and ground/Vdd, through a
stack of transistors. A logical driver corresponding to an N-transistor path
has N logic inputs (one for each transistor gate), one logic output, and a
delay value. In a logic cell, several logical drivers can be identified (e.g.
Fig. 1). An active logical driver corresponds to a current path that is
switching the cell output voltage.

The proposed model computes a delay for an active logical driver subject to
single-input switching (i.e. worst-case condition), by associating the circuit
depicted in Fig. 2 to the logical driver. I_avg is the average current flowing
through the transistor stack during the output node transition. The three
capacitors correspond to different physical capacitances:

- C_FANOUT accounts for the capacitive load associated to the output, i.e.
  basically the fan-out capacitance.
- C_INTRINSIC accounts for the capacitances associated to the drain terminals
  of the transistors identified as follows:

978-1-4244-9137-7/11/$26.00 2011 IEEE 217


  - they are off before, during and after the driving event,
  - their drain terminal voltage switches as a consequence of the input
    switching.

Such transistors are all outside the active logical driver, though they are in
the same CMOS stage. We refer to such parasitic capacitances as intrinsic
capacitances. Physically, they are diffusion capacitances and diffusion-metal
contact capacitances on the internal nodes of a cell.

- C_DRIVE accounts for the capacitances associated to the drain terminals of
  the transistors identified as follows:
  - they are on before, during, or after the driving event,
  - their drain terminal voltage switches as a consequence of the input
    switching.

At least one of the transistors so defined belongs to the active logical
driver. We refer to such parasitic capacitances as drive capacitances.
Physically, they correspond to (average) drain diffusion capacitances,
diffusion-metal contact capacitances and virtual additional Miller-effect
capacitances. Also, the discharge time of C_DRIVE takes into account the
voltage feed-through phenomenon and its effect on the total delay of the drain
voltage, so that C_DRIVE partially corresponds to a formal quantity rather
than a physical capacitance.

Fig. 1 Four logical driver units abstracted in a CMOS gate (inputs A, B, C;
output Z; external fan-out load capacitance).

Fig. 2 Equivalent circuit for the current-based delay model (C_DRIVE,
C_INTRINSIC, C_FANOUT, I_avg; terminals in, out).

Without loss of generality, we assume an n-type device in the following
analysis. According to Fig. 2, the delay associated to a driving event can be
decomposed in three terms:

where V_S is the output voltage swing and is usually Vdd/2 for delay
calculation. Referring to the average current driven by one minimum size MOS
transistor as , we can write

where is the drawn width of the transistor which is causing the output
voltage switching, and a is the difference between drawn width and effective
width.

Similarly,

where wd(j) is an integer expressing the width of every transistor j
contributing to the intrinsic capacitance, normalized to the minimum width,
and is the diffusion capacitance of a single transistor having minimum size.

Finally we can write

where is an integer expressing the width of every transistor j contributing
to the drive capacitance, normalized to the minimum width, is the gate
capacitance of a single transistor with minimum size, and is a constant
modeling the contribution to the delay of the Miller effect and of the voltage
feed-through effect.

By defining the quantities

the total delay associated to an active logical driver in a single-stage CMOS
standard cell can be written as:

(1)

III. DETERMINISTIC MODEL CHARACTERIZATION

The technology-dependent parameters a, τ_o, τ_i and τ_m can be determined by
characterizing all the possible circuit structures corresponding to logical
drivers, for each possible
switching input. The reference circuit structures that allow modeling of any
CMOS gate of practical interest are:

- 1 NMOS and 1 PMOS stack with a common input
- 1 NMOS and 2 PMOS stack with a common input
- 1 NMOS and 3 PMOS stack with a common input
- 1 NMOS and 4 PMOS stack with a common input
- 2 NMOS stack and 1 PMOS with a common input
- 2 NMOS stack and 2 PMOS stack with a common input
- 2 NMOS stack and 3 PMOS stack with a common input
- 2 NMOS stack and 4 PMOS stack with a common input
- 3 NMOS stack and 1 PMOS with a common input
- 3 NMOS stack and 2 PMOS stack with a common input
- 3 NMOS stack and 3 PMOS stack with a common input
- 4 NMOS stack and 1 PMOS with a common input
- 4 NMOS stack and 2 PMOS stack with a common input

We analyzed the above structures through Spice BSIM4 simulations in 45 nm and
32 nm technologies, and found the behavior of the parameters τ_o, τ_i and τ_m
as a function of load capacitance and input slope. Ideally, a purely linear
behavior of delay with respect to load capacitance would imply that τ_o, τ_i
and τ_m are constants. Actually, the non-ideal behavior of the MOS transistors
makes the delay non-linear for small loads, and τ_o, τ_i, τ_m always show the
typical behavior in Fig. 3.

Fig. 3 Behavior of τ_o, τ_i and τ_m vs load capacitance. Input slope 10 ps.
(Plot of Tau (ns) vs. Cload (fF); curves taum_n, tau0_n and taui_n, labeled
X1_1ps.)

It is easy to store the values of the τ_o, τ_i, τ_m functions for different
input slope values, and implement Eq. 1 within an HDL standard cell
description. The results obtained from our VHDL implementation show very good
agreement with Spice BSIM4 simulations. Fig. 4 shows how the results fit the
often non-linear behavior of standard cell delay for small load capacitance
values. Table I summarizes the results on the delay of non-trivial cells
compared with Spice.

Fig. 4 Results on deterministic delay (2-input nand). Tr is input slew time.
The overlapped curve is the delay obtained from Spice. (Two panels: nand2
input A X1 tr=50ps and nand2 input A X20 tr=50ps; delay HL (ps) vs. Cload
(fF); curves: HDL model, spice.)

TABLE I. % error in delay results of logic-level VHDL (average between HL and
LH transitions) against Spice BSIM4, 45 nm CMOS technology

                       tr = 10 ps                      tr = 50 ps
Cell         Cl=0.33fF  Cl=1.0fF  Cl=5.0fF  Cl=0.33fF  Cl=1.0fF  Cl=5.0fF
not X1             0.0       0.0       0.0        0.0       0.0       0.0
not X10            0.0       0.0       0.0        0.0       0.0       0.0
nand2 X1           0.4       0.2       0.1        0.3       0.1       0.1
nand2 X10          1.3       0.6       0.3        1.5       0.8       0.2
ao12_n X1          0.7       0.3       0.1       -0.1       0.1       0.1
ao12_n X10         0.8       0.8       0.7        0.0       0.0       0.1
ao112_n X1        -0.1      -0.3      -0.1       -0.7      -0.5      -0.1
ao112_n X10        1.3       1.5      -0.2       -1.1      -1.0      -0.6
ao22_n X1          0.5       0.2       0.1       -0.7      -0.2       0.0
ao22_n X10         1.5       1.5       0.6       -0.2      -0.4      -0.3
ao31_n X1         -1.4      -1.7      -0.7       -0.4      -0.9      -0.6
ao31_n X10         0.6       0.1      -1.3        1.3       0.8      -0.3
ao212_n X1        -1.3      -1.4      -0.6       -1.1      -1.1      -0.6
ao212_n X10       -0.4      -0.6      -1.4       -0.8      -0.9      -1.2
ao32_n X1         -1.4      -1.8      -0.8        0.6       0.1      -0.3
ao32_n X10         0.0      -0.3      -1.6        1.4       1.2       0.5
ao222_n X1        -1.5      -2.1      -1.4        0.0      -1.1      -1.3
ao222_n X10        0.8       0.1      -1.4        2.1       1.6       0.0
ao33_n X1         -0.6      -1.2      -0.6        2.5       1.7       0.3
ao33_n X10         1.6       1.2      -0.5        4.3       3.9       2.6

To our knowledge, no previous delay model suitable for direct implementation
in HDL supports all types of logic CMOS gates with the obtained accuracy.
Similar accuracy is obtained by off-line delay calculators integrated in VLSI
CAD tools, at the expense of large database storage occupancy. However, such
delay calculators generate static .sdf files which cannot support technology
variations in the simulation and are devoted to deterministic delay analysis
(corner-based analysis).
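
The storage-and-lookup step described in this section (tabulating a τ function against load capacitance for each characterized input slope) might be sketched as follows. This is only an illustration in Python, not the authors' VHDL implementation: the function names, the piecewise-linear interpolation, and every numeric value in the table are assumptions made for the example, and Eq. 1 itself is not reproduced. The last two helpers compute a signed percent error of a model delay against a Spice reference, averaged over the HL and LH transitions as in Table I; the sign convention (model minus Spice) is likewise an assumption.

```python
from bisect import bisect_right


def interpolate(xs, ys, x):
    """Piecewise-linear interpolation over sorted samples, clamped at both ends."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two matching sample points")
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1          # index of the sample just below x
    frac = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])


# Hypothetical characterization data: one tau curve (ns) sampled against
# Cload (fF), stored per characterized input slope (ps). Values are made up.
TAU_TABLE = {
    10.0: ([0.0, 1.0, 2.0, 4.0], [0.00040, 0.00030, 0.00026, 0.00024]),
    50.0: ([0.0, 1.0, 2.0, 4.0], [0.00050, 0.00038, 0.00033, 0.00031]),
}


def lookup_tau(slope_ps, cload_ff, table=TAU_TABLE):
    """Return tau at the given load for an input slope present in the table."""
    xs, ys = table[slope_ps]
    return interpolate(xs, ys, cload_ff)


def percent_error(model_delay, spice_delay):
    """Signed percent error of the model against Spice (assumed convention)."""
    return 100.0 * (model_delay - spice_delay) / spice_delay


def average_percent_error(model_hl, spice_hl, model_lh, spice_lh):
    """Average of the HL and LH signed percent errors."""
    return 0.5 * (percent_error(model_hl, spice_hl)
                  + percent_error(model_lh, spice_lh))
```

For instance, `lookup_tau(10.0, 1.5)` interpolates halfway between the 1 fF and 2 fF samples of the hypothetical 10 ps curve, giving about 0.00028 ns. Clamping outside the sampled range is a design choice of this sketch; it mirrors the flat asymptote that the τ functions show at larger loads, but a real characterization flow may prefer extrapolation or a denser grid at small loads where the curves bend most.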

IV. STATISTICAL MODEL CHARACTERIZATION

The behavior of the τ_o, τ_i and τ_m functions becomes interesting when we
take into account technology variations for statistical Monte Carlo analysis.

We built an automated script performing 10^4 iterations of Spice simulation
and extraction of the functions, injecting random variations in technology
parameters at each iteration. The variations considered were L, W, oxide
thickness, and channel doping. For all of them we assumed a Gaussian
distribution, widely used in statistical CMOS simulations, with 3σ variation
of 15%. The behavior that we observed is that only the vertical shift of the
functions is significantly affected by the technology variations (Fig. 5).

Fig. 5 Behavior of τ_o as affected by L variation (transistor drawn length).
Other technology variations all have a similar effect. (Plot of tau (ns) vs.
Cload (fF); curves for L +15%, L, and L -15%.)

By collecting and storing the 10^4 values of the asymptote for τ_o, τ_i and
τ_m obtained by the Spice-level Monte Carlo iterations, it is possible to
repeatedly execute the VHDL logic-level Monte Carlo timing simulation of a
CMOS cell, applying the asymptote statistical variations to the functions, so
that the delay statistical behavior is obtained. We implemented logic-level
Monte Carlo for some reference cells, reaching significantly accurate results.
Fig. 6 shows two cases referring to a not cell and to a 2-input and cell. A
comparison with [5] is possible although only partly significant, because of
the different underlying approach and different application (cross-talk
analysis). In [5], for a not cell the authors report up to 3.9% error in the
mean value and 4.4% in variance, referring to process variation effect on
cross-talk impact on delay.

Fig. 6 Results of statistical delay behavior for two different cell cases.
The overlapped curves are obtained from Spice Monte-Carlo simulation.
(Extracted density vs. delay (ps); curves: model, spice. Panel 1: not X1,
tr=10ps, Cload=1fF, tHL; mean error = -0.729%, deviation error = 1.65%.
Panel 2: and2 (nodeA) X1, tr=10ps, Cload=1fF, tHL; mean error = 2.839%,
deviation error = -6.72%.)

CONCLUSIONS

The presented approach gives very good results both with deterministic typical
delay and with statistical delay behavior, and can be implemented in a
logic-level HDL description of standard cell libraries. The advantage of this
approach is to enable technology-variation-aware statistical delay analysis at
the logic level, in place of SPICE-level Monte Carlo analysis.

Present work is focused on the application to circuits of practical interest.
The integration of the delay calculation, supporting generic input slope and
load, in a logic-level VHDL description of a standard cell library composed of
272 cells is close to completion.

ACKNOWLEDGEMENT

This work was developed within the ENIAC-120003 MODERN Project funded by the
European Commission.

REFERENCES

[1] Rossello, J. L., Segura, J., "An Analytical Charge-Based Compact Delay
    Model for Submicrometer CMOS Inverters," IEEE Transactions on Circuits
    and Systems I: Regular Papers, vol. 51, no. 7, Jul. 2004.
[2] Sakurai, T., Newton, A. R., "Delay analysis of series-connected MOSFET
    circuits," IEEE Journal of Solid-State Circuits, vol. 26, no. 2,
    pp. 122-131, Feb. 1991.
[3] Alioto, M., Poli, M., Palumbo, G., "Efficient and Accurate Models of
    Output Transition Time in CMOS Logic," Proc. of ICECS 2007,
    pp. 1264-1267, Dec. 2007.
[4] Sakurai, T., Newton, A. R., "Alpha-power law MOSFET model and its
    applications to CMOS inverter delay and other formulas," IEEE Journal of
    Solid-State Circuits, vol. 25, no. 2, pp. 584-594, Apr. 1990.
[5] Fatemi, H., Nazarian, S., Pedram, M., "Statistical logic cell delay
    analysis using a current-based model," 43rd ACM/IEEE Design Automation
    Conference, San Francisco, CA, 2006, pp. 253-256.
[6] Croix, J. F., Wang, D. F., "Blade and Razor: Cell and Interconnect Delay
    Analysis Using Current-Based Models," in Proc. Design Automation
    Conference, 2004.
[7] Veendrick, H., "Short-circuit dissipation of static CMOS circuitry and
    its impact on the design of buffer circuits," IEEE J. Solid-State
    Circuits, vol. SC-19, pp. 468-473, Aug. 1984.
[8] Sutherland, I., Sproull, B., Harris, D., Logical Effort: Designing Fast
    CMOS Circuits. Morgan Kaufmann, 1999.

