Abstract: In nano-scale digital CMOS ICs, technology parameter variation limits the usefulness of traditional corner-based timing simulation in favor of statistical simulation. Yet, logic level delay modeling featuring technology variation aware timing is an open challenge. We present a new semi-empirical delay model of digital CMOS cells, accounting for input slope and technology parameters, featuring Spice-level accuracy and full suitability for logic level (i.e. fast) statistical timing simulation in an HDL environment. The approach has been tested against Spice BSIM4 targeting a library of 272 standard cells.
Keywords: delay model, standard cell, digital circuits, CMOS
I. INTRODUCTION
Timing analysis is a critical step in VLSI design flows, due to the presence of nonlinear effects in nano-scale CMOS standard cells, such as velocity saturation, input-output coupling, voltage feed-through [1] and the short-circuit effect due to pull-up and pull-down transistors conducting simultaneously [7].
Traditionally, propagation delay estimation of logic cells in a complex IC is accomplished by delay calculators relying on (deterministic) delay models of the logic cells. Models used in delay calculators have evolved from simple lookup tables to polynomial models, nonlinear models and, more recently, current-based models [6]. Nowadays, the growing statistical variability in process parameters makes traditional corner-based approaches inadequate for a realistic estimation of the fabrication yield. As a result, there is a strong need for logic level timing models capable of supporting statistical simulation, in addition to deterministic delay simulations. The alternative, circuit (SPICE) level Monte-Carlo statistical simulation, is virtually unfeasible for circuits of practical complexity, due to the enormous computation time.
Many previous works targeted the definition of a compact model for the (deterministic) delay of a CMOS stage. The first representative example was [4], where the delay of a CMOS inverter is estimated as a function of the alpha-power saturation current law in a CMOS transistor; yet, only a very approximate empirical model for the effect of the input slew time is developed. An empirical extension to a more complex gate, based on transistor stacks, was introduced in [2]. A more complex, charge-based analytical model was developed in [1]. Importantly, an output slew time analytical model is presented in [3]. However, all the above aim to understand basic circuit behavior and have no direct application to all types of (complex) CMOS logic cells. The logical effort model [8] is a widely adopted paradigm for reasoning about optimal circuit sizing, but as it was originally conceived for manual optimization it is inherently a simplified, fully linear model, explicitly neglecting transistor-stack diffusion capacitances, Miller and feed-through effects, as well as nonlinearity. Finally, a current-based statistical delay model for individual cells is illustrated in [5], showing an analytical approach to computing statistical behavior.
We illustrate a semi-empirical delay model for arbitrarily complex logic CMOS stages, aiming to allow (fast) Monte-Carlo statistical analysis at hardware description language (HDL) logic level, with results practically equivalent to a Spice-level Monte Carlo iteration. Our approach is intrinsically more general than [1][2][3][4], and differently from [8] it aims at the highest possible accuracy in modeling nonlinear effects and parasitic effects. It also differs from [5] in that we do not develop statistical models for each single cell, but for specific sub-circuits (called logical drivers) that can be combined to model virtually any CMOS cell, by means of executable HDL specifications. The approach diverges from statistical static timing analysis (SSTA) because we target statistical simulation, allowing the designer to see statistical effects on the operation of the digital system on actual data.
II. DELAY MODEL
The model relies on a paradigm for describing CMOS cells that we refer to as logical drivers. A logical driver is a logic unit which is an abstraction of a current path flowing between the output of a cell and ground/vdd, through a stack of transistors. A logical driver corresponding to an N-transistor path has N logic inputs (one for each transistor gate), one logic output, and a delay value. In a logic cell, several logical drivers can be identified (e.g. Fig. 1). An active logical driver corresponds to a current path that is switching the cell output voltage.
The proposed model computes a delay for an active logical driver subject to single-input switching (i.e. worst case condition), by associating the circuit depicted in Fig. 2 to the logical driver. The current in this circuit is the average current flowing through the transistor stack during the output node transition. The three capacitors correspond to different physical capacitances:
CFANOUT accounts for the capacitive load associated with the output, i.e. basically the fan-out capacitance.
CINTRINSIC accounts for the capacitances associated with the drain terminals of the transistors identified as follows:
switching input. The reference circuit structures that allow modeling any CMOS gate of practical interest are:
1 NMOS and 1 PMOS stack with a common input
1 NMOS and 2 PMOS stack with a common input
1 NMOS and 3 PMOS stack with a common input
1 NMOS and 4 PMOS stack with a common input
2 NMOS stack and 1 PMOS with a common input
2 NMOS stack and 2 PMOS stack with a common input
2 NMOS stack and 3 PMOS stack with a common input
2 NMOS stack and 4 PMOS stack with a common input
3 NMOS stack and 1 PMOS with a common input
3 NMOS stack and 2 PMOS stack with a common input
3 NMOS stack and 3 PMOS stack with a common input
4 NMOS stack and 1 PMOS with a common input
4 NMOS stack and 2 PMOS stack with a common input
Fig. 3 Behavior of τ_o, τ_i and τ_m vs load capacitance (Cload, fF). Input slope 10 ps. [Plot legend: taum_n, tau0_n, taui_n, X1_1ps.]
It is easy to store the values of the τ_o, τ_i and τ_m functions for different input slope values, and implement Eq. 1 within an HDL standard cell description. The results obtained from our VHDL implementation show very good agreement with Spice BSIM4 simulations. Fig. 4 shows how the results fit the often nonlinear behavior of standard cell delay for small load capacitance values. Table I summarizes the results on the delay of non-trivial cells compared with Spice.
Fig. 4 Results on deterministic delay (2-input nand). Tr is input slew time. The overlapped curve is the delay obtained from Spice. [Panels: nand2 input A X1, tr=50ps and nand2 input A X20, tr=50ps. Axes: delay HL (ps) vs Cload (fF). Curves: HDL model, spice.]
TABLE I. % error in delay results of logic level VHDL (avg between HL, LH transitions) against Spice BSIM4, 45nm CMOS technology
Cell      Size   % error
ao112_n   X1     -0.1  -0.3  -0.1  -0.7  -0.5  -0.1
ao112_n   X10     1.3   1.5  -0.2  -1.1  -1.0  -0.6
ao22_n    X1      0.5   0.2   0.1  -0.7  -0.2   0.0
ao22_n    X10     1.5   1.5   0.6  -0.2  -0.4  -0.3
ao31_n    X1     -1.4  -1.7  -0.7  -0.4  -0.9  -0.6
ao31_n    X10     0.6   0.1  -1.3   1.3   0.8  -0.3
ao212_n   X1     -1.3  -1.4  -0.6  -1.1  -1.1  -0.6
ao212_n   X10    -0.4  -0.6  -1.4  -0.8  -0.9  -1.2
ao32_n    X1     -1.4  -1.8  -0.8   0.6   0.1  -0.3
ao32_n    X10     0.0  -0.3  -1.6   1.4   1.2   0.5
ao222_n   X1     -1.5  -2.1  -1.4   0.0  -1.1  -1.3
ao222_n   X10     0.8   0.1  -1.4   2.1   1.6   0.0
ao33_n    X1     -0.6  -1.2  -0.6   2.5   1.7   0.3
ao33_n    X10     1.6   1.2  -0.5   4.3   3.9   2.6
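As a rough illustration of storing the three tau functions of Fig. 3 per input slope, the following Python sketch keeps each function as samples over load capacitance and retrieves values by piecewise-linear interpolation. Everything in it is an assumption made for illustration: the names `TAU_TABLE`, `interp` and `lookup_taus`, the choice of linear interpolation, and all numeric values (placeholders, not data from the paper). Eq. 1 is not reproduced in this section, so the sketch stops at retrieving the three values that such an equation would consume.

```python
from bisect import bisect_right

# Hypothetical sample grid and values: placeholders, NOT taken from the paper.
# Load capacitances (fF) at which each tau function is sampled.
CLOAD_FF = [0.0, 1.0, 2.0, 3.0, 4.0]

# One entry per stored input slope (ps); units of the tau values are arbitrary here.
TAU_TABLE = {
    10.0: {
        "tau_o": [0.10, 0.12, 0.14, 0.16, 0.18],
        "tau_i": [0.20, 0.21, 0.22, 0.23, 0.24],
        "tau_m": [0.30, 0.33, 0.36, 0.39, 0.42],
    },
    50.0: {
        "tau_o": [0.15, 0.17, 0.19, 0.21, 0.23],
        "tau_i": [0.25, 0.26, 0.27, 0.28, 0.29],
        "tau_m": [0.35, 0.38, 0.41, 0.44, 0.47],
    },
}


def interp(xs, ys, x):
    """Piecewise-linear interpolation of (xs, ys) at x, clamped to the table range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    k = bisect_right(xs, x)  # index of the first grid point strictly greater than x
    x0, x1 = xs[k - 1], xs[k]
    y0, y1 = ys[k - 1], ys[k]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def lookup_taus(slope_ps, cload_ff):
    """Return (tau_o, tau_i, tau_m) for a stored input slope and a given load."""
    curves = TAU_TABLE[slope_ps]  # raises KeyError if the slope was not characterized
    return tuple(interp(CLOAD_FF, curves[name], cload_ff)
                 for name in ("tau_o", "tau_i", "tau_m"))


if __name__ == "__main__":
    print(lookup_taus(10.0, 1.5))
```

In an HDL cell description the same idea would typically map to constant arrays indexed by input slope, with the delay expression evaluated from the retrieved values; the interpolation scheme here is only one possible choice.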
IV. STATISTICAL MODEL CHARACTERIZATION
The behavior of the τ_o, τ_i and τ_m functions is of particular interest when we take into account technology variations for statistical Monte-Carlo analysis.
[Figure: tauO curves (legend entries "1-1" and "L +15%"); vertical axis tau (ns).]
By collecting and storing the 104 values of the asymptote for τ_o, τ_i and τ_m, obtained by the Spice-level Monte-Carlo iterations, it is possible to repeatedly execute the VHDL logic-level Monte-Carlo timing simulation of a CMOS cell, applying the asymptote statistical variations to the τ functions, so that the delay statistical behavior is obtained. We implemented logic level Monte Carlo for some reference cells, obtaining notably accurate results. Fig. 6 shows two cases referring to a NOT cell and to a 2-input AND cell. A comparison with [5] is