IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 22, NO. 12, DECEMBER 2014
FinPrin: FinFET Logic Circuit Analysis and Optimization Under PVT Variations
Yang Yang and Niraj K. Jha, Fellow, IEEE
Abstract: Continued scaling of bulk CMOS technology is
facing formidable challenges. FinFETs, with better control of
short-channel effects, offer a promising alternative for the
22-nm technology node and beyond. However, FinFETs still suffer
from process, voltage, and temperature (PVT) variations. Thus,
to analyze the delay of FinFET logic circuits, statistical static
timing analysis (SSTA) is more suitable than traditional static
timing analysis. In this paper, using an existing SSTA algorithm
as a foundation, we analyze silicon-on-insulator FinFET circuits
in various new ways: 1) basing the analysis on accurate device
simulation of the logic library; 2) considering voltage and temperature variations, in addition to process variations; 3) deriving
accurate timing and leakage macromodels; and 4) investigating
the impact of PVT variations on delay/leakage distributions at
the circuit level. We propose a simplified timing model that
greatly reduces the computational complexity without giving rise
to any convergence issues. The timing model has an average
absolute error of 3.4% and 4.4%, respectively, for gate output
slope and gate delay over all logic gates and sizes, compared
with accurate quasi-Monte Carlo simulations. We evaluate the
performance of our SSTA algorithm with respect to Monte Carlo
simulation, and extend the algorithm to enable statistical leakage
and dynamic power analysis as well. We investigate the impact of
PVT variations on delay/power distributions at the circuit level.
We show that deterministic optimization methods can optimize
both the mean and variance of circuit delay/power distributions,
and even the ratio of standard deviation to mean in some cases.
Finally, we show that FinFET circuits need to be carefully
optimized with temperature taken into consideration, since the
ratio between the leakage and dynamic power of a circuit
can vary drastically depending on the operating temperature
assumed.
Index Terms: FinFETs, low-power design, parametric
variation, self-consistent temperature, timing model.
Manuscript received February 14, 2013; revised July 5, 2013 and
October 10, 2013; accepted November 7, 2013. Date of publication December 20, 2013; date of current version November 20, 2014. This work was
supported by SRC under Contract 2010-HJ-2079.
The authors are with the Department of Electrical Engineering, Princeton
University, Princeton, NJ 08544 USA (e-mail: yangyang@princeton.edu;
jha@princeton.edu).
Color versions of one or more of the figures in this paper are available
online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TVLSI.2013.2293886
I. INTRODUCTION
TRADITIONAL planar devices face considerable processing challenges in sustaining further transistor scaling.
These challenges include severe short-channel effects (SCEs),
e.g., high drain-induced barrier lowering (DIBL) and threshold
voltage (Vth) rolloff. Countering these challenges requires
heavy channel doping, which, unfortunately, makes the transistor susceptible to significant process variations that can
degrade circuit performance and increase leakage power, thus
lowering the die yield. This has led to the advent of multigate
field-effect transistors (FETs) [1]. The presence of multiple
gates surrounding the channel enables much tighter control
of the electrostatic property of the channel. Among such
multigate FETs, FinFETs have been shown to hold the most
promise. They offer higher performance and lower power at
similar fabrication cost [2]. This has led several companies to
announce a switch to FinFETs at the upcoming technology
nodes.
The lightly doped channel of a FinFET improves its
resistance to process variations. For example, the static noise
margin of the FinFET implementation of a static RAM cell
is shown to be much better than its bulk CMOS implementation [3]. However, FinFETs still face intradie and interdie
process variations in a number of parameters, such as gate
length, fin thickness, work function, and oxide thickness, all of
which have an impact on the delay and power characteristics
of FinFET logic gates, and thus the die yield. In addition, the
IR drop, imperfect distribution of the voltage supply
network, and operating temperature also change the circuit
characteristics.
In the last decade, statistical static timing analysis (SSTA)
has been actively researched to address the issue of process
variations, since static timing analysis (STA) targeted at corner
cases has been shown to be inadequate [4]-[16]. In addition,
while mixed-mode technology computer-aided design (TCAD)
simulation [17] has been used to study the behavior of
a FinFET, computational complexity prevents TCAD from
simulating large circuits using device simulation. Therefore,
an efficient and accurate SSTA algorithm tailored to FinFET
circuits needs to be developed. Finally, since the impact of
supply voltage and temperature variation on power and delay
of FinFET circuits is significant, they should be considered,
together with process variations.
In this paper, we first describe delay/leakage macromodels
for FinFET logic gates of various sizes based on detailed
device simulation using mixed-mode TCAD. We present
models to analyze both the gate delay dgate and output
slope Sout with reasonable accuracy. We extend an existing
SSTA algorithm [11] to FinFET circuits, considering process,
voltage, and temperature (PVT) variations. We show what
impact power optimization of a FinFET circuit using Synopsys
design compiler [18] can have on its delay and power distributions. Since leakage increases exponentially with increasing
temperature and temperature increases with increasing power
consumption, it is important to analyze circuits where leakage
and temperature are in equilibrium. We analyze FinFET
circuits under such self-consistent temperatures.
1063-8210 © 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
A preliminary version of this paper was presented in [19].

The remainder of this paper is organized as follows.
Section II provides background material on the FinFET
structure and TCAD device simulation used in this paper.
Section III discusses PVT variations and our macromodeling methodology. Section IV introduces an empirical timing
model. Section V formally presents the SSTA formulation for
the FinFET domain. Section VI describes the simulation setup
and the results of model validation. Section VII presents a
synthesis flow for analyzing and optimizing FinFET circuits.
Section VIII presents the experimental results. Section IX
concludes this paper.
II. FinFET STRUCTURE
As shown in Fig. 1(a), a FinFET has a front and a back
gate. Here, $L_G$, $T_{OX}$, $T_{SI}$, and $H_{FIN}$ are the gate length, oxide
thickness, fin thickness, and fin height, respectively. If the two
gates are shorted at the top, the FinFET is called shorted-gate (SG). If not, the two gates are independently controllable,
as shown in this figure. Such a FinFET is called independent-gate (IG). If an SG FinFET has different work functions on its
front and back gates, it is called asymmetric-work function SG
(ASG) [20]. An SG FinFET is the fastest. An ASG FinFET
may have leakage that is 400× lower than the SG FinFET at
the expense of around 30% lower on-current. An IG FinFET
can be used to replace two parallel transistors by one by
providing two separate signals to its two gates. If its back gate
is reverse biased, it increases the Vth of the front gate, thus
reducing leakage at the expense of increased delay. A forward
bias has the reverse impact. SG and ASG FinFETs have the
same area. However, an IG FinFET requires more area because
of the need to land an extra contact on the back gate. These
three FinFET styles lead to a rich design space. In this paper,
we focus on the SG FinFET. However, our approach can be
extended to IG and ASG FinFETs in a straightforward manner.
The double-gate structure of a FinFET enhances its electrostatic integrity, while reducing the impact of SCEs. It enables
the FinFET to generate a higher on-current, thus improving
the delay of the corresponding logic gate. Reduced SCEs lead
to better control of DIBL and subthreshold swing, resulting in
reduced leakage current as well [21].
Fig. 1(b) shows the cross section of a FinFET. Such a cross
section is used in mixed-mode TCAD simulation. Here, $L_{GF}$,
$L_{GB}$, $T_{OXF}$, $T_{OXB}$, $H_{GF}$, $H_{GB}$, $L_{SPF}$, $L_{SPB}$, and $L_{UN}$ are the
physical front- and back-gate lengths, front- and back-gate
effective oxide thicknesses, front- and back-gate thicknesses,
front- and back-gate spacer thicknesses, and gate-drain/source
underlap, respectively. Additional parameters of interest are
body doping $N_{BODY}$, source/drain doping $N_{SD}$, fin pitch $F_P$,
and operating voltage Vdd. The work function of an n-FinFET
(p-FinFET) is denoted by $\Phi_N$ ($\Phi_P$).
Fig. 1. FinFET structure and its cross section. (a) 3-D FinFET structure.
(b) 2-D cross section of a FinFET.
III. PVT VARIATIONS
In this section, we discuss different types of variations that a

FinFET logic circuit may be subjected to, including variations
in various process parameters, Vdd , and operating temperature Top . Unlike planar devices, a FinFET does not suffer from
random dopant fluctuations due to its lightly doped channel.
However, it is still impacted by other types of variations.
For example, we found that the average leakage power of an
inverter (INV) may vary by 59.4% when $\Phi_N$ varies within its
$(-3\sigma, +3\sigma)$ range. Output resistance Rout (calculated by setting input
slope Sin = 0) of a NOR gate may vary by up to 3.3% within
the positive $3\sigma$ range of Vdd. Top variations may lead to an
increase in Rout by 18.0% for a NAND gate. Therefore, without
proper consideration of all PVT variations, it is difficult to
perform an analysis and optimization of a FinFET circuit.
Next, we discuss PVT variations and our approach in
modeling them.
A. Process Variation
Process variations, which are the result of unavoidable
inaccuracies introduced during the fabrication process, assume
greater importance as we move further into the deep-submicrometer regime since integrated circuits (ICs) become
more sensitive to parametric variations. Small differences
introduced in various process steps, such as chemical mechanical polishing and etching, may change the circuit behavior
in unexpected ways. Variations in physical parameters, such
as $L_G$, $T_{SI}$, and $T_{OX}$, in turn, introduce changes in electrical
parameters, such as output capacitance Cout (calculated by
setting Sin = 0), input capacitance Cin (calculated when load
capacitance $C_{load} \gg C_{out}$), and Rout. Eventually, this impacts
dgate and Sout.
In this paper, we consider variations in two groups of
process parameters [20], [22]-[24]. The first group includes
parameters, such as $L_G$, $T_{SI}$, and $T_{OX}$. The second group
includes work functions $\Phi_N$ and $\Phi_P$. Variations in the work
functions alter the Vth, thus impacting both leakage and delay.
We simulate four sizes (X1, X2, X4, and X8) of the
basic gates, INV, two-input NAND, and two-input NOR, with
respect to various parametric variations and store the results
in a FinFET design database, which is used to facilitate
the generation of the FinFET logic library. Some of these
simulation results are shown in Fig. 2 for size X8. Here, Ileak
refers to leakage current. Although these results were obtained
from simulations at Top = 298 K, later, we consider the impact
of temperature variations as well.
First, as can be observed from Fig. 2(a) and (b), less than
0.4% variation in $\Phi_N$ from the nominal value of 4.4 eV can
change Ileak by up to 38% in the NAND gate case. Although
variation in $\Phi_P$ has a smaller impact on Ileak for INV and
NAND, Ileak deviates by more than 17% for NOR with a
variation of only 0.33% in $\Phi_P$ from its nominal value of
4.8 eV. The impact of variation in $\Phi_N$ is generally greater
than that of variation in $\Phi_P$ because electrons have higher mobility
than holes. In the case of the NOR gate, variation in $\Phi_P$ has
a significant impact because, to balance the drive strength of
its n- and p-FinFET networks, a p-FinFET has four times as
many fins as an n-FinFET.
Fig. 2. Effect of process/voltage variations on Ileak, Rout, Cin, and Cout (Top = 298 K). (a) Impact of $\Phi_N$ variation on Ileak. (b) Impact of $\Phi_P$ variation
on Ileak. (c) Impact of $L_G$ variation on Rout. (d) Impact of $T_{SI}$ variation on Rout. (e) Impact of $T_{OX}$ variation on Cin. (f) Impact of Vdd variation on Cout.
Although we have not shown the impact of variations
in the other parameters on Ileak , their impact cannot be
ignored. A 1.6-nm increase in L G from its nominal value
of 20 nm can change Ileak by 33.3% for the NOR gate (X8,
T = 298 K). Since RC parameters impact gate delay, we next
consider the impact of process variations on them. As shown in
Fig. 2(c)-(e), a $3\sigma$ increase in $L_G$ and $T_{SI}$ from their nominal
values (of 20 and 10 nm, respectively) may change Rout by up
to 11.1% (INV) and 3.4% (NAND), respectively, and a 0.08-nm
increase in TOX from its nominal value of 1 nm may change
Cin by up to 6.6% (INV).
Process variations can be divided into two categories:
1) die-to-die (D2D) and 2) within-die (WID). D2D variations
have a global impact on all transistors and interconnects. WID
variations can be further decomposed into global variations
$\delta_{global}$, local variations $\delta_{local}$, and a random component $\varepsilon$ [25]
$$\delta_{WID} = \delta_{global} + \delta_{local} + \varepsilon \tag{1}$$
where $\delta_{global}$ is location dependent, $\delta_{local}$ is both proximity
dependent and layout specific, and $\varepsilon$ represents the random
WID variations across the chip and is modeled by
$$\varepsilon \sim N(0, \Sigma) \tag{2}$$
where $\Sigma$ is the covariance matrix of all process parameters

under the assumption of spatial correlation. In this paper,
we concentrate on WID variations. However, the tool can
be extended to address D2D variations in a straightforward
manner.
Similar to the method in [11], we do not consider the local
component in this paper. With the assumption of a Gaussian
distribution for all process variations, any parameter p can be
represented by
$$p = \bar{p} + \delta_x x + \delta_y y + N(0, \Sigma) \tag{3}$$
where $\bar{p}$ is the nominal process parameter value, $\delta_x$ and $\delta_y$
are variations along the x and y directions, respectively, and
p is computed at location (x, y).
We assume all process parameters are spatially correlated
and modeled with a Gaussian random variable (RV). However,
this assumption can be easily relaxed to include spatially
uncorrelated parameters, as shown in [11]. Spatial correlations
between different parameters are modeled using a multilevel
quad-tree [6], as shown in Fig. 3. In this model, spatial
correlation between two parameters depends on the grids they
share. For example, INV$_{2,4}$ shares both grids $\langle 1, 1 \rangle$ and
$\langle 0, 1 \rangle$ with NOR$_{2,3}$, but only grid $\langle 0, 1 \rangle$ with INV$_{2,10}$. For
interconnects, we consider metal width $W_{int_l}$, metal thickness
$T_{int_l}$, and interlayer dielectric (ILD) thickness $H_{ILD_l}$ as process
parameters, where subscript l denotes the metal layer. Each
interconnect is divided into segments based on the grid the
segment resides in. For example, the interconnect between
NOR$_{2,3}$ and INV$_{2,10}$ may be split into four grids: $\langle 2, 3 \rangle$,
$\langle 2, 8 \rangle$, $\langle 2, 9 \rangle$, and $\langle 2, 10 \rangle$.
Fig. 3. Multilevel quad-tree-based spatial correlation model.
Finally, we assume that spatial correlation only occurs
between parameters of the same type in different grids, and
interconnect parameters are considered to be different types
when in different metal layers.
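The grid-sharing rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's implementation: it assumes a hypothetical three-level quad-tree on a normalized die, assigns every grid an independent zero-mean Gaussian component with a made-up per-level variance, and takes the covariance of one parameter at two locations to be the sum of the variances of the grids they share. The names `grids_containing`, `covariance`, and `correlation` are illustrative only.

```python
# Sketch of a multilevel quad-tree spatial correlation model.
# Assumptions (not from the paper): normalized unit-square die, three levels,
# and the per-level variances below.

LEVEL_VARIANCES = [0.5, 0.3, 0.2]  # hypothetical variance of each level's grid RV


def grids_containing(x, y, levels=len(LEVEL_VARIANCES)):
    """Return the (level, index) grid at each level that contains point (x, y).

    Coordinates are normalized to [0, 1). Level k splits the die into
    2**k by 2**k grids, numbered row by row.
    """
    grids = []
    for level in range(levels):
        side = 2 ** level
        col = min(int(x * side), side - 1)
        row = min(int(y * side), side - 1)
        grids.append((level, row * side + col))
    return grids


def covariance(loc_a, loc_b):
    """Covariance of one parameter at two locations: sum over shared grids."""
    shared = set(grids_containing(*loc_a)) & set(grids_containing(*loc_b))
    return sum(LEVEL_VARIANCES[level] for level, _ in shared)


def correlation(loc_a, loc_b):
    """Correlation coefficient; every location has the same total variance."""
    return covariance(loc_a, loc_b) / sum(LEVEL_VARIANCES)
```

Under these assumptions, two gates in the same finest-level grid are fully correlated, while gates that share only the top-level grid keep just the top-level share of the variance.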
B. Supply Voltage Variation
Just like process parameters, supply voltage Vdd can also
vary across the IC. The presence of IR drop and an unbalanced
power distribution network places additional burdens on the
circuit designer to ensure that delay and power constraints are
met under variations in Vdd . In [26], the voltage drop across an
IC of size 8 mm × 8 mm with 160 K macros is shown to be
as high as 15%. Vdd may also rise above its nominal value due
to a poorly designed voltage regulator. Therefore, we consider
Vdd variations in both positive and negative directions.
In addition to its direct impact on leakage power Pleak and
dynamic power Pdyn , Vdd also has an impact on the timing
behavior of a logic gate by determining the drive current of
the gate, which can be modeled using Rout and Cout . As shown
in Fig. 2(f), Cout varies by up to 2.8% (INV) in the possible
range of supply voltages. Just like $L_G$, $T_{SI}$, $T_{OX}$, $\Phi_N$, and
$\Phi_P$, we treat Vdd as a Gaussian RV, with spatial correlation
modeled using the quad-tree method, since power is usually
supplied across the IC through a distributed network.
C. Operating Temperature Variation
Investigation of the cross impact between temperature and
circuit delay/power has been a popular research topic. In
addition, the temperature of a computing block in an IC does
not depend only on the power consumption of the block itself,
but also on the temperature of adjacent blocks because of lateral
heat transfer.
Temperature affects both leakage current and timing
characteristics, the latter being reflected in RC estimations.
From Fig. 4, we can observe that Rout of an X8 logic gate
can vary by as much as 16.5% when Top varies from 298 to
398 K. Even worse, Ileak may vary by up to 28× in the NOR
gate case over the 100-K increase in temperature.
Fig. 4. Effects of temperature variation. (a) Impact of Top variation on Rout.
(b) Impact of Top variation on Ileak.
We do not model temperature as an RV, since temperature
can settle at any value. Instead, we simulate the circuit to
obtain its expected Top. Thus, we model temperature as an
independent parameter with no correlation to process-voltage
variations, and use Top as a scaling factor to adjust the value
of other parameters. We validate this approach later.
D. Modeling of PVT Variations
Next, we discuss how the impact of PVT variations can be
modeled.
1) Modeling of RC: We model Rout, Cout, and Cin as
follows. First, we consider an arbitrary function $v = F(\vec{P})$,
where $\vec{P}$ represents a set of RVs, and RV $p_i \in \vec{P}$ has a
Gaussian distribution $N(\mu_{p_i}, \sigma_{p_i}^2)$. We then approximate this
function using the first-order Taylor expansion
$$v = v_{nom} + \sum_{p_i \in \vec{P}} \left.\frac{\partial F}{\partial p_i}\right|_{nom} \Delta p_i \tag{4}$$
where $v_{nom}$ and $\partial F/\partial p_i$ are computed using nominal values
of $p_i$, and $\Delta p_i$ has a Gaussian distribution $N(0, \sigma_{p_i}^2)$.
We can now derive the mean $\mu_v$ and variance $\sigma_v^2$ of the
function as
$$\mu_v = E[v] = v_{nom} \tag{5}$$
$$\sigma_v^2 = E[v^2] - (E[v])^2 = \sum_i \left(\left.\frac{\partial F}{\partial p_i}\right|_{nom}\right)^2 \sigma_{p_i}^2 + 2\sum_{i \neq j} \left.\frac{\partial F}{\partial p_i}\right|_{nom} \left.\frac{\partial F}{\partial p_j}\right|_{nom} \mathrm{cov}(p_i, p_j) \tag{6}$$
where $E[\cdot]$ denotes the expectation of the function and the second sum counts each unordered pair $i \neq j$ once.
Although the distribution of v might not be Gaussian, it is
reasonable to make this assumption when $\Delta p_i$ is sufficiently
small. Note that a large deviation from the nominal process
parameters is unlikely for WID variations. Equation (4) is
sufficient to model RC, since there is no need to calculate
$\mu_v$ and $\sigma_v^2$ for them. On the other hand, all three equations
are required to model circuit power and delay, as shown later.
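As a worked illustration of the first-order propagation in (4)-(6), the following Python sketch computes the mean and variance of a linearized quantity from its sensitivities and the parameter covariance matrix. The function name and the numbers in the usage note are assumptions made for illustration; they are not values from the FinFET library.

```python
# First-order (Taylor) propagation of Gaussian parameter variations, per (4)-(6).
# All inputs are plain Python lists; the example values are made up.

def taylor_mean_variance(v_nom, grads, cov):
    """Return (mean, variance) of v = v_nom + sum_i grads[i] * dp_i.

    grads[i] is dF/dp_i evaluated at the nominal point and
    cov[i][j] is cov(p_i, p_j), with cov[i][i] the variance of p_i.
    """
    n = len(grads)
    mean = v_nom  # every dp_i has zero mean, so the mean is the nominal value
    variance = 0.0
    for i in range(n):
        variance += grads[i] ** 2 * cov[i][i]  # own-variance term
        for j in range(i + 1, n):
            # covariance term, each unordered pair counted once
            variance += 2.0 * grads[i] * grads[j] * cov[i][j]
    return mean, variance
```

For example, `taylor_mean_variance(5.0, [2.0, -1.0], [[0.04, 0.01], [0.01, 0.09]])` combines the two own-variance terms (0.16 and 0.09) with the cross term (-0.04) to give a variance of 0.21 around a mean of 5.0.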
2) Extension to Modeling of Temperature: Since temperature is treated separately from other RVs, we extend the
definition of the arbitrary function as follows:
$$v = F(\vec{P}) = F_{stat}(\vec{P}_{stat}) \cdot F_{temp}(T_{op}) \tag{7}$$
where $\vec{P}_{stat}$ denotes the set of parameters modeled as RVs
and $F_{temp}(T_{op})$ denotes the macromodel that considers the
effect of temperature. In the rest of this paper, except where
otherwise stated, we will use the notation $F(\vec{P})$ and assume
that the temperature effect is already considered to simplify
the presentation.
3) Modeling of Dynamic Power: Dynamic power is given
by
$$P_{dyn} = \alpha\, C_{tot} V_{dd}^2 f \tag{8}$$
where $\alpha$ is the average switching activity of the circuit, $C_{tot}$
is the total load capacitance at the output of a gate, which
includes its Cout, the sum of Cin of transistors in the following
stage, and the capacitance of the interconnect that connects to
the gate in the following stage, and f is the frequency.
Taking the partial derivative of the equation, we have
$$\frac{\partial P_{dyn}}{\partial p_i} = \alpha V_{dd}^2 f\, \frac{\partial C_{tot}}{\partial p_i}, \quad \forall p_i \neq V_{dd} \tag{9}$$
$$\frac{\partial P_{dyn}}{\partial V_{dd}} = \alpha V_{dd}^2 f\, \frac{\partial C_{tot}}{\partial V_{dd}} + 2 \alpha\, C_{tot} V_{dd} f. \tag{10}$$
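The dynamic power model and its two sensitivities translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the switching activity, capacitance, supply voltage, and frequency used in the usage note are assumed round numbers, not values from this paper.

```python
# Dynamic power per (8) and its first-order sensitivities per (9) and (10).

def dynamic_power(alpha, c_tot, vdd, freq):
    """P_dyn = alpha * C_tot * Vdd^2 * f."""
    return alpha * c_tot * vdd ** 2 * freq


def dpdyn_dparam(alpha, vdd, freq, dctot_dp):
    """Sensitivity to a parameter other than Vdd, which acts only through C_tot."""
    return alpha * vdd ** 2 * freq * dctot_dp


def dpdyn_dvdd(alpha, c_tot, vdd, freq, dctot_dvdd):
    """Sensitivity to Vdd: the C_tot(Vdd) dependence plus the explicit Vdd^2 term."""
    return alpha * vdd ** 2 * freq * dctot_dvdd + 2.0 * alpha * c_tot * vdd * freq
```

With the assumed values alpha = 0.1, C_tot = 2 fF, Vdd = 0.9 V, and f = 1 GHz, `dynamic_power(0.1, 2e-15, 0.9, 1e9)` evaluates the product in (8) at about 0.16 µW. In this assumed example the explicit term in (10) dominates unless C_tot is strongly voltage dependent.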
4) Modeling of Leakage Power: Average leakage power of
a logic gate over all possible input vectors is modeled as an
exponential function as follows:
$$P_{leak} = e^{a_0 + a_1 L_G + a_2 T_{SI} + a_3 T_{OX} + a_4 \Phi_N + a_5 \Phi_P + a_6 \Phi_P^2 + a_7 V_{dd}}. \tag{11}$$
Temperature is also considered by an exponential function.
With the above macromodel, $P_{leak}$ is treated separately from
circuit delay $D_{ckt}$ and $P_{dyn}$ in this paper as a log-normal
RV denoted by $\ln N(\mu_{leak}, \sigma_{leak}^2)$. A detailed analysis is
presented in Section V.
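To make the log-normal treatment concrete, the following Python sketch computes the moments of a leakage value whose logarithm is a linear combination of Gaussian parameters. It is a simplified illustration under stated assumptions: the quadratic work-function term is dropped so that the exponent is exactly Gaussian, the parameters are taken as independent, and the function name and inputs are hypothetical.

```python
import math

# Moments of P = exp(a0 + sum_i a_i * p_i) with independent Gaussian p_i.
# Assumption: the exponent is linear in the parameters (no squared term),
# so ln(P) is exactly Gaussian and P is exactly log-normal.

def lognormal_leakage_moments(a0, coeffs, means, sigmas):
    """Return (mu_ln, var_ln, mean_P, var_P).

    mu_ln and var_ln are the mean and variance of ln(P);
    mean_P and var_P are the standard log-normal moments of P itself.
    """
    mu_ln = a0 + sum(a * m for a, m in zip(coeffs, means))
    var_ln = sum((a * s) ** 2 for a, s in zip(coeffs, sigmas))
    mean_p = math.exp(mu_ln + var_ln / 2.0)
    var_p = (math.exp(var_ln) - 1.0) * math.exp(2.0 * mu_ln + var_ln)
    return mu_ln, var_ln, mean_p, var_p
```

One consequence visible in the formulas is that the mean leakage exceeds the leakage at nominal parameters by the factor exp(var_ln / 2), so ignoring variation underestimates average leakage.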
IV. TIMING MODEL FOR FinFET CIRCUITS
Compared with bulk planar devices, the challenge in analyzing FinFET circuit delay lies in the computation of the
different timing parameters. To the best of our knowledge,
no FinFET timing model has been presented that is both
efficient and sufficiently accurate. SPICE simulation, such as
SPICE3-UFDG [27] and BSIM-CMG/IMG [28], either is not
as accurate as device simulation at the 22-nm technology node
and beyond or often fails to converge. The direct current
modeling technique presented in [29] only models FinFET
behavior up to the 60-nm node and requires large CPU times.
Therefore, we describe an efficient, yet reasonably accurate,
FinFET gate-level timing model to obtain values of dgate
and Sout .
A. Modeling of Gate Delay

Gate delay dgate is defined as the time between the switching
point Vsw (50% of the voltage range) of its input and output
signals. We derive our empirical FinFET model based on a
modification of the well-known Horowitz delay approximation [30] as follows:
$$d_{gate} = \mathrm{Delay}(\tau, S_{in}, V_{sw}) = \tau \sqrt{\alpha_d + A\,(S_{in}/\tau)} \tag{12}$$
where $\alpha_d$ is used to replace the original $(\log[V_{th}])^2$, $A = 2\beta_d(1 - V_{sw})$
for the rising edge, $A = 2\beta_d V_{sw}$ for the falling
edge, $\tau = R_{tot} C_{tot}$ is the time constant of the gate when
$S_{in} = 0$, $R_{tot}$ is the sum of Rout of the gate and $R_w$ of all its
interconnects, and both $\alpha_d$ and $\beta_d$, stored in the FinFET logic
library, are the scaling factors that are assigned different values
for different gates and sizes for the rising (falling) transition.
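The advantage of the closed-form model shown in (12) is
that it enables efficient timing calculation. It also makes it
convenient to derive the partial derivatives of $d_{gate}$, $\forall p_i \in \vec{P}$
$$\frac{\partial d_{gate}}{\partial p_i} = \frac{1}{2 d_{gate}} \left[ \frac{\partial R_{tot}}{\partial p_i}\left(2 \alpha_d R_{tot} C_{tot}^2 + A\, S_{in} C_{tot}\right) + \frac{\partial C_{tot}}{\partial p_i}\left(2 \alpha_d R_{tot}^2 C_{tot} + A\, S_{in} R_{tot}\right) + \frac{\partial S_{in}}{\partial p_i}\, A\, R_{tot} C_{tot} \right]. \tag{13}$$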
B. Modeling of Output Slopes
The output slope Sout of a FinFET logic gate is calculated as
the time elapsed in the transition from the high (low) to low
(high) threshold of the output signal for the falling (rising)
edge. As is common, we choose 90% of the voltage range
for the high threshold and 10% for the low threshold. Since
Rout and Cout are derived under Sin = 0, construction of Sout
should take Sin into consideration.
With our TCAD simulation results, we propose the
following empirical model to calculate the Sout of a FinFET
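logic gate:
$$S_{out} = \mathrm{Slope}(S_{base}, S_{in}) = S_{base}\left[\alpha_s + \left(\ln\left(\frac{S_{in}}{S_{base}} + \beta_s\right)\gamma_s\right)^2\right] \tag{14}$$
where $S_{base} = \tau \ln 9$ is the output slope when $S_{in} = 0$, and $\alpha_s$,
$\beta_s$, and $\gamma_s$ are scaling factors derived for each type and size
of gate in the FinFET logic library, similar to the $d_{gate}$ case.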
With the closed-form representation of (14), we can derive
the partial differential form of Sout that can be employed in
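the SSTA algorithm described later. Let
$$X = \ln\left(\frac{S_{in}}{S_{base}} + \beta_s\right)\gamma_s \tag{15}$$
then we have
$$\frac{\partial S_{out}}{\partial p_i} = \ln 9\,(\alpha_s + X^2)\left(R_{tot}\frac{\partial C_{tot}}{\partial p_i} + C_{tot}\frac{\partial R_{tot}}{\partial p_i}\right) + 2 S_{base} X \gamma_s \frac{\partial S_{in}/\partial p_i}{S_{in} + \beta_s S_{base}} - \frac{2 S_{base} X \gamma_s S_{in}}{S_{in} + \beta_s S_{base}}\left(\frac{1}{R_{tot}}\frac{\partial R_{tot}}{\partial p_i} + \frac{1}{C_{tot}}\frac{\partial C_{tot}}{\partial p_i}\right). \tag{16}$$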
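The output-slope model and its sensitivity can be checked numerically with a short Python sketch. It assumes that the base slope equals ln 9 times Rtot times Ctot and that the scaling factor multiplying the logarithm sits inside the square, which is the reading under which the analytic derivative matches a finite-difference estimate. Function names and the parameter values used in any trial run are illustrative assumptions.

```python
import math

LN9 = math.log(9.0)  # 10%-90% transition time of a single-pole RC response, in units of tau


def output_slope(r_tot, c_tot, s_in, alpha_s, beta_s, gamma_s):
    """S_out per (14), with S_base = ln(9) * R_tot * C_tot."""
    s_base = LN9 * r_tot * c_tot
    x = gamma_s * math.log(s_in / s_base + beta_s)
    return s_base * (alpha_s + x * x)


def output_slope_sensitivity(r_tot, c_tot, s_in, alpha_s, beta_s, gamma_s,
                             dr_dp, dc_dp, ds_dp):
    """dS_out/dp_i per (16), given dR_tot/dp_i, dC_tot/dp_i, and dS_in/dp_i."""
    s_base = LN9 * r_tot * c_tot
    x = gamma_s * math.log(s_in / s_base + beta_s)
    denom = s_in + beta_s * s_base
    # change of S_base itself
    term_base = LN9 * (alpha_s + x * x) * (r_tot * dc_dp + c_tot * dr_dp)
    # change of X through S_in
    term_sin = 2.0 * s_base * x * gamma_s * ds_dp / denom
    # change of X through S_base in the ratio S_in / S_base
    term_ratio = (2.0 * s_base * x * gamma_s * s_in / denom
                  * (dr_dp / r_tot + dc_dp / c_tot))
    return term_base + term_sin - term_ratio
```

As with the delay model, comparing `output_slope_sensitivity` against a central finite difference of `output_slope` in each of `r_tot`, `c_tot`, and `s_in` is a cheap way to catch a sign or index slip in the derivative.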
V. SSTA FORMULATION AND ITS EXTENSIONS
In this section, we present the SSTA formulation. We use
Monte Carlo (MC) simulation as a reference for simulation
results in this paper. We then discuss the modeling of delay
distribution for both gates and interconnects, followed by a
brief description of principal component analysis (PCA) and
how it is applicable in this context. Then, we review the two
operations, statistical sum and max, that are crucial to an SSTA
formulation. Finally, we discuss how the methodology can be
extended to statistical power analysis.
A. Problem Formulation
The presence of PVT variations makes STA, which aims
to find the critical path deterministically, impractical. Using
STA to address the process variation problem would require
corner-case analysis over the entire parameter variation space,
but with only interdie variations and no spatial correlation
considered. Even though this problem can be mitigated using
MC simulation that performs parameter value sampling for
each gate and wire segment, it takes inordinately long computation times before converging. This prevents its use in a
logic synthesis environment where timing analysis may need
to be done repeatedly. SSTA offers an alternative approach,
which takes parameter distributions as input and constructs
the statistical timing graph of a circuit based on the concept
of program evaluation and review technique [31], [32].
Definition 1: A statistical timing graph $G_{stat} = \{N, E, PI, PO\}$
is a directed graph that has exactly one
source node PI and one sink node PO, where N is a set of
nodes and E is a set of directed edges. The delay distribution
of a gate or interconnect on edge i is denoted by weight $d_i$,
which is an RV, either correlated or independent.
Fig. 5 shows an example of how $G_{stat}$ is generated from
a circuit. Since the SSTA formulation is targeted at combinational circuits, the D flip-flop (DFF) is split into two parts:
1) the DFF output is the DFF primary input of the circuit;
2) the DFF input is the DFF primary output of the circuit.
A virtual PI and a PO are added to complete $G_{stat}$.
Fig. 5. Partial s27 circuit from the ISCAS89 benchmark suite and its
statistical timing graph. (a) Partial circuit from s27. (b) Corresponding $G_{stat}$.
Definition 2: Let a path $p_i$ be a set of ordered edges from
PI to PO in $G_{stat}$. Define $D_i$ to be the path length distribution
of $p_i$, where $D_i$ is computed as the statistical sum of weight
$d_j$ of edge $e_j$ for all edges on path $p_i$. The SSTA problem
of a circuit is defined as finding the distribution of $D_{max} =
\max(D_1, \ldots, D_i, \ldots, D_m)$ over all paths in the circuit (from
$p_1$ to $p_m$), where the max operation is also computed in
statistical form.
Under PVT variations, the set of ordered edges that form
the critical path $p_{critical}$ is subjected to variation. Therefore,
the purpose of SSTA is not to select the critical path with its
timing distribution, but rather to derive the statistical maximum
of all delay distributions that arrive at PO.
B. Modeling of Delay Distribution

With the generalized first-order Taylor expansion presented
in Section III-D, the delay of a gate/interconnect is simply
given by
$$d = d_{nom} + \sum_{p_i \in \vec{P}} \left.\frac{\partial F}{\partial p_i}\right|_{nom} \Delta p_i \tag{17}$$
where $d_{nom}$ is the nominal delay when all parameters are
at their respective nominal values. Therefore, mean $\mu_d$ and
variance $\sigma_d^2$ can be calculated using (5) and (6). Again, without
explicitly stating, we assume that the temperature effect has
already been considered in these equations.
1) Modeling of Interconnect Delay: Since technology scaling does not affect interconnect modeling beyond the fact that
the interconnect size is shrunk and interconnect networks are
mostly tree structured, we use the Elmore delay model to
model interconnect delay. The interconnect delay of a directed
edge is given by [11]
$$d_{int} = D_{int}(\vec{R}_w, \vec{C}_w, \vec{C}_{in}) \tag{18}$$
where $\vec{R}_w$ ($\vec{C}_w$) is the vector of wire resistances (capacitances)
on the edge and $\vec{C}_{in}$ is the vector of input capacitances of
transistors connected to the receiver end of the interconnect.
Note that though $G_{stat}$ shows only one wire segment for
each directed edge, edges are actually split into multiple wire
segments due to place-and-route (P&R) and the quad-tree
model.
With the discussion in previous sections, the RVs involved
in interconnect delay computation include $W_{int_l}$, $T_{int_l}$, $H_{ILD_l}$,
$L_G$, $T_{SI}$, $T_{OX}$, $\Phi_N$, $\Phi_P$, and Vdd. Since RC values of metal
lines are not affected much by Vdd variations, given the
possible Vdd operating range, we do not consider Vdd variations any further. In the future, should interconnects be made
from other materials for which RC values are significantly
impacted by variations in Vdd, this approach can be easily
extended. Moreover, the temperature effect is not considered
as interconnects do not dissipate leakage current and their RC
values are not impacted much by temperature either. Using the
chain rule, we can rewrite the delay function as shown in (19),
where z is the number of metal
layers, $\Gamma_g$ is the set of grids that the gates connected to the
interconnect reside in, and $\Gamma_{int}$ is the set of grids that the wire
segments of the interconnect reside in. $\Delta L_{G,i} = L_{G,i} - \mu_{L_{G,i}}$,
where $L_{G,i}$ is the gate length RV in grid i. The other process-voltage parameters are defined in a similar fashion.
$$\begin{aligned}
d_{int} = d_{int,nom} &+ \sum_{i \in \Gamma_g} \left.\frac{\partial D_{int}}{\partial L_{G,i}}\right|_{nom} \Delta L_{G,i} + \sum_{i \in \Gamma_g} \left.\frac{\partial D_{int}}{\partial T_{SI,i}}\right|_{nom} \Delta T_{SI,i} + \sum_{i \in \Gamma_g} \left.\frac{\partial D_{int}}{\partial T_{OX,i}}\right|_{nom} \Delta T_{OX,i} \\
&+ \sum_{i \in \Gamma_g} \left.\frac{\partial D_{int}}{\partial \Phi_{N,i}}\right|_{nom} \Delta \Phi_{N,i} + \sum_{i \in \Gamma_g} \left.\frac{\partial D_{int}}{\partial \Phi_{P,i}}\right|_{nom} \Delta \Phi_{P,i} + \sum_{i \in \Gamma_g} \left.\frac{\partial D_{int}}{\partial V_{dd,i}}\right|_{nom} \Delta V_{dd,i} \\
&+ \sum_{l=1}^{z} \left[ \sum_{i \in \Gamma_{int}} \left.\frac{\partial D_{int}}{\partial W_{int_l,i}}\right|_{nom} \Delta W_{int_l,i} + \sum_{i \in \Gamma_{int}} \left.\frac{\partial D_{int}}{\partial T_{int_l,i}}\right|_{nom} \Delta T_{int_l,i} + \sum_{i \in \Gamma_{int}} \left.\frac{\partial D_{int}}{\partial H_{ILD_l,i}}\right|_{nom} \Delta H_{ILD_l,i} \right]
\end{aligned} \tag{19}$$
2) Modeling of Gate Delay and Output Edge Delay: We
model $d_{gate}$ and $S_{out}$ in an analogous manner to how we model
interconnect delay
$$d_{gate,i} = D_{gate,i}(\vec{C}_w, \vec{C}_{out}, \vec{C}_{in}, \vec{R}_w, \vec{R}_{out}, S_{in,i}) \tag{20}$$
$$S_{out,i} = S_{out,i}(\vec{C}_w, \vec{C}_{out}, \vec{C}_{in}, \vec{R}_w, \vec{R}_{out}, S_{in,i}) \tag{21}$$
for pin i of the gate.
Since any $d_{gate,i}$ can potentially be on the critical path,
$S_{out}$ is obtained as a weighted sum, the weight given by the
probability that pin i has the longest gate delay
$$S_{out} = \sum_{i=1}^{n} P\left(d_{path,i} > \max_{j \neq i}(d_{path,j})\right) S_{out,i} \tag{22}$$
where n is the number of pins of the gate, $d_{path,i}$ is the path
delay through pin i, and $P(\cdot)$ computes the probability of the
Gaussian RV based on a lookup table.
C. Principal Component Analysis
When spatial correlation is considered, computing the statistical sum and max of a group of RVs is quite nontrivial. PCA,
which translates the correlated RVs to a different coordinate
system such that all translated RVs are independent of each
other, has been proven to be effective in addressing this
problem [11].
Assume that for RV $x_i \in X$
$$x_i \sim N(\mu_i, \Sigma) \tag{23}$$
where $\mu_i$ is the mean of $x_i$ and $\Sigma$ is the covariance of X, under
the Gaussian distribution assumption. Then, normalizing $x_i$ by
$(x_i - \mu_i)/\sigma_i$ gives us
$$\frac{x_i - \mu_i}{\sigma_i} \sim N(0, \Sigma_{nom}). \tag{24}$$
Performing PCA over the normalized covariance $\Sigma_{nom}$, we
have
$$\Sigma_{nom} = V \Lambda V^T \tag{25}$$
where V is the eigenvector matrix of $\Sigma_{nom}$ and $\Lambda = \mathrm{diag}\{\lambda_i\}$,
where $\lambda_i$ is the i th eigenvalue of $\Sigma_{nom}$. This transformation
allows us to represent the normalized X as
$$X_{nom} = V \Lambda^{1/2} X_{ind} \tag{26}$$
where $X_{ind} \sim N(0, I)$ is the principal component of $\Sigma_{nom}$
with zero mean and unit variance.
Equation (26) enables us to express spatially correlated
parameters with uncorrelated principal components. Therefore,
delay $d_i$ of a component (an interconnect, gate, or path) can
be simplified as
$$d_i = d_{nom_i} + \sum_{k=1}^{c} \alpha_{i,k}\, p_{ind_k} \tag{27}$$
where $\alpha_{i,k}$ is the coefficient computed as shown above and
c is the size of $\vec{P}_{ind}$ ($p_{ind_k}$ is its kth component) that is the
principal component of $\vec{P}$, which is equal to $|L_G| + |T_{SI}| + |T_{OX}| + |\Phi_N| + |\Phi_P| + |V_{dd}| + |W_{int_l}| + |T_{int_l}| + |H_{ILD_l}|$.
D. Statistical Sum and Max
Two major operations, the statistical sum and max,
need to be performed while traversing G stat . Transforming
the original P into the uncorrelated Pind simplifies these
operations.
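1) Statistical Sum Operation: Since for each $d_i$, $p_{ind_k}$ has
zero mean, the mean of the sum of delay over the directed
edges from one to n is basically
$$\mu_{d_{sum}} = \sum_{i=1}^{n} d_{nom_i}. \tag{28}$$
Similarly, as all $p_{ind_k}$ are independent with unit variance and are shared by
all edges, the coefficients of each principal component add across the edges,
and the variance of $d_{sum}$ can be expressed as
$$\sigma_{d_{sum}}^2 = \sum_{k=1}^{c} \left(\sum_{i=1}^{n} \alpha_{i,k}\right)^2. \tag{29}$$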
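The sum operation is easy to express once every delay is written as a nominal value plus coefficients on shared, independent, unit-variance principal components. The Python sketch below illustrates that bookkeeping; the function name and the numbers in the usage note are assumptions for illustration. It adds the coefficients of each component across edges before squaring, which is what independence of the components (but not of the edges) implies.

```python
# Statistical sum of delays expressed over shared principal components.
# Each delay i is d_i = nominals[i] + sum_k coeffs[i][k] * p_k, where the p_k
# are independent, zero-mean, unit-variance Gaussians common to all edges.

def statistical_sum(nominals, coeffs):
    """Return (mean, variance, summed_coeffs) of the sum of all delays."""
    mean = sum(nominals)
    num_components = len(coeffs[0])
    # Coefficients on the same principal component add linearly.
    summed = [sum(row[k] for row in coeffs) for k in range(num_components)]
    # Independent unit-variance components: variance is the sum of squares.
    variance = sum(a * a for a in summed)
    return mean, variance, summed
```

For instance, `statistical_sum([10.0, 12.0], [[1.0, 0.5], [2.0, -0.5]])` gives a mean of 22.0, summed coefficients of `[3.0, 0.0]`, and a variance of 9.0: the second component cancels because the two edges respond to it with opposite signs. Returning the summed coefficients keeps the result in the same form as its inputs, so it can feed the next sum or max along the graph.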
2) Statistical Max Operation: We borrow the analytical formula for statistical max over $m$ delays $d_i$, at either the output of a single gate or over all output ports of the circuit, from [33]. We then use the method in [11], which shows how to obtain the max of multiple RVs.
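For two Gaussian RVs, the moment-matching formulas from [33] can be sketched as follows. This is a minimal illustration of the two-variable case only; the way [11] propagates correlations through the principal-component coefficients is not reproduced here, and the function name `clark_max` and its arguments are assumptions made for this example.

```python
import math


def _pdf(x):
    """Standard normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def _cdf(x):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def clark_max(mu_a, sigma_a, mu_b, sigma_b, rho=0.0):
    """Mean and SD of the Gaussian that matches the first two moments of max(A, B)."""
    theta = math.sqrt(sigma_a ** 2 + sigma_b ** 2 - 2.0 * rho * sigma_a * sigma_b)
    if theta == 0.0:
        # A and B differ only by a constant shift, so the larger mean always wins.
        return (mu_a, sigma_a) if mu_a >= mu_b else (mu_b, sigma_b)
    alpha = (mu_a - mu_b) / theta
    first = mu_a * _cdf(alpha) + mu_b * _cdf(-alpha) + theta * _pdf(alpha)
    second = ((mu_a ** 2 + sigma_a ** 2) * _cdf(alpha)
              + (mu_b ** 2 + sigma_b ** 2) * _cdf(-alpha)
              + (mu_a + mu_b) * theta * _pdf(alpha))
    return first, math.sqrt(max(second - first * first, 0.0))


# Two independent standard normal arrival times (illustrative values).
mean_max, sd_max = clark_max(0.0, 1.0, 0.0, 1.0)
```

The returned pair describes a Gaussian with the same mean and variance as the true max, so any skewness of the exact distribution is dropped.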
The main challenge in performing the max operation is
that the result is not a perfect Gaussian distribution. As
pointed out in [8], the distribution is significantly skewed
when two RVs, A and B, have close means but very different
variances. In this case, the part above (below) the mean
value of the resulting distribution is dominated by the RV
with the larger (smaller) variance. On the other hand, when
A and B both have close means and variances, the skewness
is reduced significantly. When one RV dominates another,
skewness is negligibly small. Therefore, although in theory,
such an approximation of the result of max introduces a risk
due to ignoring skewness, in practice, the first case rarely occurs. As demonstrated through our simulation results later, assuming a zero-skewness Gaussian distribution gives good performance compared with MC simulation.
TABLE I
FIXED FinFET DEVICE PARAMETERS
TABLE II
VARIABLE PVT PARAMETERS
TABLE III
FITTING AND TESTING ERRORS FOR THE LEAKAGE MODEL
TABLE IV
FITTING AND TESTING ERRORS FOR THE TIMING MODEL
E. Extension to Statistical Power Estimation
The above approach of obtaining the delay distributions can be extended to statistical dynamic power estimation in a straightforward fashion. Similar to (18), we can represent dynamic power as
$$P_{dyn,i} = P_{dyn}(C_w, C_{out}, C_{in}, V_{dd}) \quad (30)$$
where $i$ represents an individual gate.
After performing PCA, we have
$$P_i = P_{nom_i} + \sum_{k=1}^{c} \beta_{i,k}\, p_{ind_k} \quad (31)$$
where $\beta_{i,k}$ is calculated using PCA.
Because every gate dissipates power, the only relevant statistical operation for power is sum. Therefore, similar to (28) and (29), $\mu_{P_{sum}}$ and $\sigma^2_{P_{sum}}$ can be derived as
$$\mu_{P_{sum}} = \sum_{i=1}^{N} P_{nom_i} \quad (32)$$
$$\sigma^2_{P_{sum}} = \sum_{j=1}^{c} \left( \sum_{i=1}^{N} \beta_{i,j} \right)^2 \quad (33)$$
where $N$ is the total number of gates in the circuit.
On the other hand, since $P_{leak}$ is modeled as a log-normal RV, a statistical approach to deriving the sum of $P_{leak}$ requires a different mechanism. Although the sum of log-normal RVs is theoretically not known to have a closed form, Wilkinson's method gives a fairly accurate approximation to this problem [34]. Due to space limitations, we will not elaborate upon the detailed mathematics involved, but interested readers can refer to [12]. Still, the direct application of Wilkinson's method to leakage power analysis has a complexity of $O(N^2)$. Hence, the computation time increases drastically for large benchmark circuits.
To overcome the above problem, instead of computing leakage power on a gate-wise basis, we first divide the circuit into grids using the grid plan discussed previously. Next, inside each grid, the leakage power of all gates is summed up to give the leakage power of the grid, assuming a perfect correlation. Finally, leakage power of the circuit is obtained by applying Wilkinson's method to all the grids in the same fashion. In this way, complexity is reduced from $O(N^2)$ to $O(n^2)$, where $n$ is the total number of grids, $n \ll N$. This has a minimal impact on accuracy, as shown later.
VI. SIMULATION SETUP AND MODEL VALIDATION
In this section, we first describe the setup for FinFET

gate-level simulation using mixed-mode TCAD under PVT
variations. Next, we present our approach for validating the
gate-level leakage, timing, and temperature models described
in the previous sections.
A. Simulation Setup
The 22-nm FinFET parameters, which are assumed to not
suffer from process variations, and their values are given in
Table I. We adopt the 2-D TCAD FinFET simulation model
given in [35] and [36], which is shown to have accuracy
close to 3-D TCAD FinFET simulation, yet with a 100–1000× improvement in computational efficiency.
Fig. 6. FinFET synthesis flow. (a) Initialization, optimization, and P and R stages. (b) Analysis stage.
TABLE V
TEMPERATURE IMPACT ON DELAY-OPTIMIZED s38584
Fig. 7. Layout of basic FinFET cells. (a) INV(X2). (b) NAND(X2). (c) NOR(X1).
Table II lists all parameters that are subjected to variations along with their respective nominal values, range of variation, and step size. The range is assumed to correspond to $[-3\sigma, 3\sigma]$ values. Although the gate length and oxide thickness are defined separately for both the front and back gates, we assume that the front and back gate values of these parameters vary in the same fashion under process variations. Hence, we just use $L_G$ and $T_{OX}$ to refer to these parameters for ease of exposition. We do not assume any nominal values for $T_{op}$, $S_{in}$, and $C_{load}$, but simulate them over their range assuming a uniform distribution.
Even though the 2-D TCAD simulation model [35], [36] greatly reduces the simulation time compared with the 3-D simulation model, performing 2-D TCAD simulation of RVs in a 9-D space is still not practical. The 2-D simulation of $R_{out}$, $C_{out}$, and $C_{in}$ of the X8 SG NAND gate takes more than an hour of CPU time on four computing nodes consisting of 2.67-GHz Westmere CPUs and 4-GB RAM, and more than 4.2 billion simulations would need to be conducted in this fashion. Hence, except for $T_{op}$, which is varied for each change in the variable parameters, process-voltage variations are introduced one at a time.
Finally, to complete our FinFET logic library, we simulate SG INV, NOR, and NAND gates in four sizes (X1, X2, X4, and X8). This library is useful when the circuit needs to be delay optimized or else power optimized under a delay constraint.
B. Model Validation
In this section, we first present results that validate the

leakage model under temperature variation and then results
that validate the timing model.
1) Leakage Model Validation Under Temperature Variation: This validation is conducted in two phases. First, we demonstrate that the fitting error is small for the exponential model for leakage estimation. Next, we perform parameter value sampling for quasi-MC (QMC) simulations with Gaussian distributed $L_G$, $T_{SI}$, $T_{OX}$, $\Phi_N$, $\Phi_P$, and $V_{dd}$, and uniformly distributed $T_{op}$. We simulate 100 samples for each of the three gate types and four gate sizes and report the testing error.
Table III reports the simulation results. The third column
reports the fitting error and the fourth column reports the
testing error. The maximum error in both cases is less than or
equal to 4% (the average errors are 3.0% and 3.1%, respectively). This seems to offer a good efficiency-accuracy tradeoff.
If a higher level of accuracy is desired, our macromodel can
be extended to higher order polynomials.
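The QMC sampling step can be sketched as below. The text does not say which low-discrepancy sequence was used, so a Halton sequence is assumed here; the nominal values, standard deviations, and $T_{op}$ range are placeholders rather than the Table II entries, and the function names are illustrative.

```python
from statistics import NormalDist


def halton(index, base):
    """Element number `index` (1-based) of the van der Corput sequence in `base`."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


# Hypothetical (nominal, sigma) pairs; placeholders, NOT the Table II values.
GAUSSIAN_PARAMS = {
    "L_G": (20.0, 0.5),
    "T_SI": (10.0, 0.3),
    "T_OX": (1.0, 0.03),
    "Phi_N": (4.4, 0.02),
    "Phi_P": (4.8, 0.02),
    "V_dd": (0.9, 0.03),
}
T_OP_RANGE = (298.0, 398.0)  # assumed range in kelvin
PRIMES = [2, 3, 5, 7, 11, 13, 17]  # one Halton base per sampled dimension


def qmc_samples(count):
    """Return `count` parameter sets: six Gaussian parameters plus a uniform T_op."""
    std_normal = NormalDist()
    samples = []
    for n in range(1, count + 1):
        point = [halton(n, p) for p in PRIMES]
        sample = {}
        for u, (name, (nominal, sigma)) in zip(point, GAUSSIAN_PARAMS.items()):
            # Map the uniform coordinate to a Gaussian value via the inverse CDF.
            sample[name] = nominal + sigma * std_normal.inv_cdf(u)
        low, high = T_OP_RANGE
        sample["T_op"] = low + (high - low) * point[-1]
        samples.append(sample)
    return samples


samples = qmc_samples(100)
```

Each sample would then be passed to device simulation and to the macromodel, and the relative difference between the two gives the testing error.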
2) Timing Model Validation: To validate the timing model,
we first validate the Sout and dgate models of each gate, as
shown in Table IV. In addition to obtaining the QMC samples
mentioned previously, we added samples for Sin and Cload
based on uniform distribution and evaluated the testing error
of the proposed timing model. Again, both the fitting and testing errors seem reasonable. The QMC test showed that our first-order RC macromodel and proposed timing model have an average testing error of just 3.4% and 4.4% for $S_{out}$ and $d_{gate}$, respectively.
Fig. 8. Impact of process-voltage variations on $D_{ckt}$, $P_{leak}$, and $P_{dyn}$ for s38584 ($T$ = 298 K). (a) Variation impact on $D_{ckt}$. (b) Variation impact on $P_{leak}$. (c) Variation impact on $P_{dyn}$.
Fig. 9. Self-consistent temperature for s38584.
VII. FinFET LOGIC SYNTHESIS FLOW
In this section, we describe the FinFET logic synthesis flow, which enables us to analyze the area and delay/power distributions of FinFET logic circuits. The flow is divided into four stages: 1) initialization; 2) optimization; 3) placement and routing; and 4) analysis.
1) Initialization: We start the simulation with the given benchmark circuit (we use the combinational logic of circuits from the ISCAS89 benchmark suite in our experiments), as shown in Fig. 6(a). We first replace noninverting logic gates with their inverting counterparts and an INV, then decompose logic gates with three or more inputs into gates with only one or two inputs. This is because our FinFET logic library only contains two-input gates of various sizes. However, it is easy to extend the flow to a library that has logic gates with three or more inputs. Following this transformation, we obtain an initial gate-size assignment in which all logic gates are set to either X1 or X8. We next generate the Verilog code for the benchmark, which is used in the next stage.
2) Optimization: In this stage, Synopsys Design Compiler [18] performs unconstrained delay optimization, and then power optimization under a given timing slack, using the FinFET logic library. We perform gate size optimization and save the result using the report_cell function provided in Design Compiler.
3) Placement and Routing: Using the FinFET layout approach from [20], we generate the layouts of SG standard cells: INV, NAND, and NOR, as shown in Fig. 7. Using these layouts, all gates in the circuit are placed with the placement tool CAPO [37], and then global routing is applied to the placement.
4) Analysis: Next, we perform statistical analysis of both power and delay in a loop, as shown in Fig. 6(b). As mentioned earlier, it is important to consider the impact of temperature on leakage and delay. Therefore, in our simulation flow, we do not assume an operating temperature for the circuit a priori. We start the analysis from 298 K. In each round of analysis, we provide the power (this includes both leakage and dynamic power) and P and R information to the thermal analysis tool HotSpot [38], which estimates the temperature of each grid. After that, we compare the maximum temperature from HotSpot with the one used as input to SSTA to check if temperature has converged to within 0.1 K. If so, we stop; otherwise, we go through the loop again until a self-consistent temperature is reached.
VIII. RESULTS AND DISCUSSION
Our statistical modeling approach uses MinnSSTA [11] as a

foundation. MinnSSTA was used to model delay distributions
of CMOS circuits at the 100-nm node. We have extended
MinnSSTA to obtain FinFET gate-level timing/leakage models
from relevant FinFET process parameters, which enable us
to derive circuit-level power distribution at self-consistent
temperatures, in addition to circuit-level delay distributions.
We call our approach FinPrin. Note that MinnSSTA does
not consider variations in Vdd and temperature, and does not
analyze circuit power consumption under PVT variations.
We divide each benchmark into grids, as discussed previously, such that each grid contains fewer than 150 gates. Since
process variation data of FinFET ICs are not yet publicly available, we obtain the covariance matrix using the model in [4].
We assume a frequency of 500 MHz for power computation.
We ran all experiments with FinPrin on a Linux PC with
2.26-GHz dual-core CPU and 2048 MB of memory.
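The grid partitioning above feeds the grid-level leakage sum of Section V-E, where each grid contributes one log-normal term and the terms are combined with Wilkinson's method [34]. The sketch below shows only the generic moment-matching step under stated assumptions: each term is $e^{Y_i}$ with Gaussian $Y_i$, and the correlation matrix `rho` is supplied by the caller. The function name and the numbers are hypothetical.

```python
import math


def wilkinson_sum(mus, sigmas, rho):
    """(mu, sigma) of the log-normal matching the first two moments of sum_i exp(Y_i).

    Y_i ~ N(mus[i], sigmas[i]**2) and rho[i][j] is the correlation of Y_i and Y_j.
    """
    count = len(mus)
    first = sum(math.exp(m + 0.5 * s * s) for m, s in zip(mus, sigmas))
    second = 0.0
    # The double loop over all pairs is the source of the quadratic complexity.
    for i in range(count):
        for j in range(count):
            cov = rho[i][j] * sigmas[i] * sigmas[j]
            second += math.exp(mus[i] + mus[j]
                               + 0.5 * (sigmas[i] ** 2 + sigmas[j] ** 2) + cov)
    sigma_sq = math.log(second / (first * first))
    mu = math.log(first) - 0.5 * sigma_sq
    return mu, math.sqrt(max(sigma_sq, 0.0))


# Two hypothetical grids with identical, perfectly correlated leakage terms.
mu_total, sigma_total = wilkinson_sum([0.3, 0.3], [0.4, 0.4],
                                      [[1.0, 1.0], [1.0, 1.0]])
```

Because the pairwise loop runs over grids rather than gates, the cost of this step scales with the square of the number of grids.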
A. Impact of Process-Voltage Variations on Delay and Power
The impact of process-voltage variations, when the parameters vary within their $[-3\sigma, 3\sigma]$ range (see Table II), on $D_{ckt}$, $P_{leak}$, and $P_{dyn}$ of optimized benchmark s38584 (the largest one from the ISCAS89 benchmark suite) is shown in Fig. 8. The x-axis represents values in the $[-3\sigma, 3\sigma]$ range for the six parameters. Each parameter was varied over 21 steps within its range.
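A one-at-a-time sweep of this kind can be sketched as follows. The metric function is a toy stand-in rather than a FinFET model, the nominal value and sigma are placeholders, and defining the variation as the spread of the metric relative to its nominal value is an assumption made for this example.

```python
def sweep_points(nominal, sigma, steps=21):
    """Evenly spaced parameter values from nominal - 3*sigma to nominal + 3*sigma."""
    return [nominal + sigma * (-3.0 + 6.0 * k / (steps - 1)) for k in range(steps)]


def percent_variation(metric, nominal, sigma, steps=21):
    """Spread of `metric` over the sweep, as a percentage of its value at the nominal point."""
    values = [metric(x) for x in sweep_points(nominal, sigma, steps)]
    return 100.0 * (max(values) - min(values)) / metric(nominal)


def toy_delay(gate_length):
    """Hypothetical linear dependence of circuit delay on gate length."""
    return 100.0 + 2.0 * (gate_length - 20.0)


points = sweep_points(20.0, 0.5)
variation = percent_variation(toy_delay, 20.0, 0.5)
```

Repeating the sweep for each of the six parameters while holding the others at their nominal values gives one curve per parameter.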
Fig. 8(a) shows that variations in $L_G$ have the most impact on $D_{ckt}$, which sees a 7.6% variation in the $[-3\sigma, 3\sigma]$ range. On the other hand, $\Phi_N$ and $\Phi_P$ have a negligible impact on delay, with less than 1% impact on $D_{ckt}$ over the entire variation range. In the case of $P_{leak}$ of the circuit, $\Phi_N$ has the most impact since it alters the $V_{th}$. $P_{leak}$ is reduced by as much as 71.0% over the 21 steps. Finally, since we use a fixed frequency independent of actual critical delay, $P_{dyn}$ is mostly impacted by $V_{dd}$ (by 71.4% in the range). However, it is interesting to note that $T_{OX}$ stands out in terms of its impact on $P_{dyn}$ relative to the other process parameters, resulting in a 7.3% variation in $P_{dyn}$ over the range.
Fig. 10. Distribution of $D_{ckt}$, $P_{leak}$, and $P_{dyn}$ for s38584. (a) Distribution of $D_{ckt}$. (b) Distribution of $P_{leak}$. (c) Distribution of $P_{dyn}$.
TABLE VI
COMPARISON OF RESULTS OBTAINED FROM MC SIMULATIONS (10 000 ITERATIONS) AND FinPrin, ASSUMING SPATIAL CORRELATION
TABLE VII
COMPARISON OF RESULTS BETWEEN X1-ONLY AND DELAY-OPTIMIZED CIRCUITS USING FinPrin
B. Impact of Temperature
Temperature impacts both delay and leakage, but does not
have a significant impact on dynamic power. This is evident
from the mean values shown in Table V for circuit delay and
power for benchmark s38584. Since circuit delay increases by
5.2% when Top increases from 298 to 398 K, it is important

to analyze and optimize a delay-constrained circuit under the
correct temperature. Moreover, the impact of temperature on
Pleak is dramatic. The ratio of Pleak to Pdyn rises from 10.1%
to 267.0%. Thus, power analysis and optimization need to be
done at the temperature the circuit is likely to be operating at,
considering the likely temperature of the surrounding circuitry
(since lateral heat flow can impact the temperature of the
circuit in question).
TABLE VIII
COMPARISON OF RESULTS BETWEEN DELAY- AND POWER-OPTIMIZED CIRCUITS (WITH 10% AND 30% SLACK) USING FinPrin, ASSUMING SPATIAL CORRELATION
Fig. 11. PDF and CDF curves for s38584, where X1 represents the circuit with all X1 gates, DO represents $D_{ckt}$ optimization, PO-10 represents power optimization with 10% slack, and PO-30 represents power optimization with 30% slack. (a) PDF of $D_{ckt}$. (b) PDF of $P_{leak}$. (c) PDF of $P_{dyn}$. (d) CDF of $D_{ckt}$. (e) CDF of $P_{leak}$. (f) CDF of $P_{dyn}$.
When operating in a stand-alone fashion (i.e., when there is no lateral heat flow to the circuit in question), the self-consistent temperature can be obtained for the circuit. This is shown for s38584 in Fig. 9. The temperature is obtained through the flow shown in Fig. 6(b). Note that this flow will yield the correct temperature even when the circuit is surrounded by other circuits, as long as all the circuits are subjected to this flow. For the stand-alone case, when we start the flow with the initial temperature 298 K, it quickly jumps to 319.0 K after the first iteration, which is nearly identical to the temperature obtained after five iterations of the flow. This is generally true for all the ISCAS89 benchmarks we synthesized. Even when the initial temperature is set to 398 K, after the first iteration, the temperature becomes 320.5 K, and thereafter it quickly saturates. Thus, the fact that only a few iterations are required to reach the self-consistent temperature means that the approach is practical.
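The convergence behavior described above is that of a fixed-point iteration between power analysis and thermal analysis. The sketch below uses toy stand-ins for both: an exponential leakage model replaces the statistical power analysis, and a single lumped thermal resistance replaces HotSpot. All constants are hypothetical and are not fitted to s38584; only the 0.1 K stopping rule is taken from the flow description.

```python
import math

T_AMBIENT_K = 298.0        # assumed ambient temperature
R_THERMAL_K_PER_W = 2.0    # hypothetical lumped thermal resistance
P_DYN_W = 8.0              # hypothetical temperature-independent dynamic power
P_LEAK_298_W = 1.0         # hypothetical leakage power at 298 K
LEAK_TEMP_COEFF = 0.03     # hypothetical exponential leakage coefficient (1/K)


def total_power(temp_k):
    """Toy stand-in for power analysis at a given temperature."""
    leak = P_LEAK_298_W * math.exp(LEAK_TEMP_COEFF * (temp_k - 298.0))
    return P_DYN_W + leak


def thermal_model(power_w):
    """Toy stand-in for the thermal tool: temperature for a given power."""
    return T_AMBIENT_K + R_THERMAL_K_PER_W * power_w


def self_consistent_temperature(start_k=298.0, tol_k=0.1, max_iters=50):
    """Iterate power and thermal analysis until the temperature changes by less than tol_k."""
    temp = start_k
    for iteration in range(1, max_iters + 1):
        new_temp = thermal_model(total_power(temp))
        if abs(new_temp - temp) < tol_k:
            return new_temp, iteration
        temp = new_temp
    raise RuntimeError("temperature did not converge")


t_cold, iters_cold = self_consistent_temperature(298.0)
t_hot, iters_hot = self_consistent_temperature(398.0)
```

In this toy setting both starting points settle at nearly the same temperature within a handful of iterations, because the leakage feedback is weak relative to the dynamic power.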
C. FinPrin Versus MC Simulation
Let us first compare the results obtained by FinPrin with those obtained through 10 000 MC simulations, assuming the same grid plan. Consider the delay-optimized s38584 at its self-consistent temperature of 319.0 K. Its $D_{ckt}$, $P_{leak}$, and $P_{dyn}$ distributions, under the assumption that the process-voltage parameters are correlated, are shown in Fig. 10(a)–(c). The means match closely in the case of $D_{ckt}$ and the error in standard deviation (SD) is 1.8%. In the case of $P_{leak}$ and $P_{dyn}$, the SD errors are as small as 0.2% for both, with the error in both the means less than 0.3%.
Table VI compares the results obtained from MC simulation and FinPrin for delay-optimized versions of various ISCAS89 benchmarks. The high CPU time ratio ($CPUT_{ratio}$) of MC simulation with respect to FinPrin shows that the statistical approach is much more efficient, as expected, since $G_{stat}$ is traversed only once for delay estimation. In addition, the mean values obtained through FinPrin are very close to those obtained through MC simulations; for this reason, we do not show the error in the mean values. The SD errors are also reasonable, except in the $D_{ckt}$ distribution case for s38417. This is because of the Gaussian approximation used for the statistical max operation. The CPU time for FinPrin varies from 0.02 to 209.3 s, whereas for MC, it varies from 4.1 to 91 761.5 s.
D. Delay and Power Optimization
With the FinFET logic library we have developed for different gates and sizes, the proposed FinFET analysis flow can be used to analyze benchmark circuits under different gate size assignments. For example, suppose we want to compare the results for a circuit that only uses X1 logic gates with those for a delay-optimized circuit. This comparison is shown in Table VII. To compare the two sets of results in a statistical context, we normalize the value of the SD to its mean. It is interesting to note that in addition to the reduced delay, the normalized SDs of $D_{ckt}$ and $P_{leak}$ also improve slightly, even though the optimization is done through a deterministic algorithm. However, the normalized SDs of $P_{dyn}$ are slightly higher.
Next, we report the results for power optimization. In the first (second) experiment, we use 10% (30%) timing slack over the delay-optimized circuit as the delay constraint. The results are presented in Table VIII. For the 10% slack case, Design Compiler produces an average of 48.9% and 40.1% reduction in $P_{leak}$ and $P_{dyn}$, respectively, over all benchmarks. With 30% slack, power can be reduced by 54.6% and 45.2%, respectively. However, the normalized SDs are not impacted much.
Finally, Fig. 11 shows the probability density function (PDF) and cumulative distribution function (CDF) of delay and power for circuit s38584 under different gate assignments and optimizations. We can see that, under a given optimization, the curves shift to the left, thus leading to a higher timing or power yield.
IX. CONCLUSION
In this paper, we presented delay and power macromodels for logic gates and interconnects under PVT variations.
We used these macromodels to develop statistical timing and
power analysis techniques. We also evaluated the effect of
temperature on circuit delay and power, and showed the
importance of performing circuit analysis and optimization
at self-consistent temperatures. We presented the FinPrin tool
that was shown to have reasonable accuracy compared with
MC simulation, but much higher run-time efficiency. Finally,
we used the tool to analyze delay- and power-optimized
circuits. We showed that even a deterministic optimization
algorithm can improve not just the mean, but often the SD, of
the metric being optimized.
REFERENCES
[1] (2012). International Technology Roadmap for Semiconductors [Online]. Available: http://www.itrs.net
[2] E. J. Nowak, I. Aller, T. Ludwig, K. Kim, R. V. Joshi, C.-T. Chuang, et al., "Turning silicon on its edge," IEEE Circuits Devices Mag., vol. 20, no. 1, pp. 20–31, Jan./Feb. 2004.
[3] H. Kawasaki et al., "Embedded bulk FinFET SRAM cell technology with planar FET peripheral circuit for hp32 nm node and beyond," in VLSI Symp. Technol. Dig. Tech. Papers, Oct. 2006, pp. 70–71.
[4] A. Agarwal, D. Blaauw, V. Zolotov, S. Sundareswaran, M. Zhao, K. Gala, et al., "Statistical delay computation considering spatial correlations," in Proc. IEEE Asia South Pacific Des. Autom. Conf., Apr. 2003, pp. 271–276.
[5] J. A. G. Jess, K. Kalafala, S. R. Naidu, R. H. J. M. Otten, and C. Visweswariah, "Statistical timing for parametric yield prediction of integrated circuits," in Proc. IEEE Des. Autom. Conf., Jun. 2003, pp. 932–937.
[6] A. Agarwal, D. Blaauw, and V. Zolotov, "Statistical timing analysis for intra-die process variations with spatial correlations," in Proc. IEEE Int. Conf. Comput. Aided Des., Nov. 2003, pp. 900–907.
[7] A. Agarwal, K. Chopra, and D. Blaauw, "Statistical timing based optimization using gate sizing," in Proc. IEEE Des. Autom. Test Eur. Conf., Mar. 2005, pp. 400–405.
[8] D. Blaauw, K. Chopra, A. Srivastava, and L. Scheffer, "Statistical timing analysis: From basic principles to state of the art," IEEE Trans. Comput. Aided Des., vol. 27, no. 4, pp. 589–607, Mar. 2008.
[9] V. Khandelwal and A. Srivastava, "A quadratic modeling-based framework for accurate statistical timing analysis considering correlation," IEEE Trans. Very Large Integr. (VLSI) Syst., vol. 15, no. 2, pp. 206–215, Feb. 2007.
[10] H. Chang and S. S. Sapatnekar, "Statistical timing analysis considering spatial correlations using a single PERT-like traversal," in Proc. IEEE Int. Conf. Comput. Aided Des., Nov. 2003, pp. 621–625.
[11] H. Chang and S. S. Sapatnekar, "Statistical timing analysis under spatial correlations," IEEE Trans. Comput. Aided Des., vol. 24, no. 9, pp. 1467–1482, Aug. 2005.
[12] H. Chang and S. S. Sapatnekar, "Full-chip analysis of leakage power under process variations, including spatial correlations," in Proc. IEEE Des. Autom. Conf., Jun. 2005, pp. 523–528.
[13] J. Singh and S. S. Sapatnekar, "Statistical timing analysis with correlated non-Gaussian parameters using independent component analysis," in Proc. IEEE Des. Autom. Conf., Jun. 2006, pp. 155–160.
[14] W. S. Wang and M. Orshansky, "Statistical timing based on incomplete probabilistic descriptions of the parameter uncertainty," in Proc. IEEE Des. Autom. Conf., Jun. 2006, pp. 161–166.
[15] W. S. Wang and M. Orshansky, "Path-based statistical timing analysis handling arbitrary delay correlations: Theory and implementations," IEEE Trans. Comput. Aided Des., vol. 25, no. 12, pp. 2976–2988, Dec. 2006.
[16] M. Orshansky and W. S. Wang, "Statistical analysis of circuit timing using majorization," Commun. ACM, vol. 52, no. 8, pp. 95–100, Aug. 2009.
[17] (2013). Sentaurus TCAD Manuals [Online]. Available: http://www.synopsys.com
[18] (2010). Synopsys Design Compiler Manuals [Online]. Available: http://www.synopsys.com
[19] Y. Yang and N. K. Jha, "FinPrin: Analysis and optimization of FinFET logic circuits under PVT variations," in Proc. IEEE 26th Int. Conf. VLSI Des., Jan. 2013, pp. 350–355.
[20] A. N. Bhoj and N. K. Jha, "Design of ultra-low-leakage logic gates and flip-flops in high-performance FinFET technology," in Proc. IEEE Int. Symp. Qual. Electron. Des., Mar. 2011, pp. 1–8.
[21] B. Swahn and S. Hassoun, "Gate sizing: FinFET vs. 32 nm bulk MOSFETs," in Proc. IEEE 43rd Des. Autom. Conf., Jul. 2006, pp. 528–531.
[22] S. Xiong and J. Bokor, "Sensitivity of double-gate and FinFET devices to process variations," IEEE Trans. Electron Devices, vol. 50, no. 11, pp. 2255–2261, Nov. 2003.
[23] J. H. Choi, J. Murthy, and K. Roy, "The effect of process variations on device temperature in FinFET circuits," in Proc. IEEE Int. Conf. Comput. Aided Des., Nov. 2007, pp. 747–751.
[24] P. Mishra, A. N. Bhoj, and N. K. Jha, "Die-level leakage power analysis of FinFET circuits considering process variations," in Proc. IEEE Int. Symp. Qual. Electron. Des., Mar. 2010, pp. 347–355.
[25] Y. Liu, S. R. Nassif, L. T. Pileggi, and A. J. Strojwas, "Impact of interconnect variations on the clock skew of a gigahertz microprocessor," in Proc. IEEE Des. Autom. Conf., Jun. 2000, pp. 168–171.
[26] H. Su, T. Liu, A. Devgan, E. Acar, and S. Nassif, "Full chip leakage estimation considering power supply and temperature variations," in Proc. IEEE Int. Symp. Low Power Electron. Des., Aug. 2003, pp. 78–83.
[27] J. G. Fossum, L. Ge, M.-H. Chiang, V. P. Trivedi, M. M. Chowdhury, L. Mathew, et al., "A process/physics-based compact model for nonclassical CMOS device and circuit design," Solid State Electron., vol. 48, no. 6, pp. 919–926, Jun. 2004.
[28] M. V. Dunga, L. Chung-Hsun, D. D. Lu, X. Weize, C. R. Cleavelin, P. Patruno, et al., "BSIM-MG: A versatile multi-gate FET model for mixed-signal design," in Proc. Int. Symp. VLSI Technol., Jun. 2007, pp. 60–61.
[29] B. Diagne, F. Prégaldiny, C. Lallement, J.-M. Sallese, and F. Krummenacher, "Explicit compact model for symmetric double-gate MOSFETs including solutions for small-geometry effects," Solid-State Electron., vol. 52, no. 1, pp. 99–106, Jan. 2008.
[30] M. A. Horowitz, "Timing models for MOS circuits," Dept. Electrical Eng., Stanford Univ., Stanford, CA, USA, Tech. Rep. SEL83-003, 1983.
[31] S. S. Sapatnekar, Timing. Boston, MA, USA: Kluwer Academic Publishers, 2004.
[32] T. Kirkpatrick and N. Clark, "PERT as an aid to logic design," IBM J. Res. Develop., vol. 10, no. 2, pp. 135–141, Jun. 1966.
[33] C. Clark, "The greatest of a finite set of random variables," Oper. Res., vol. 9, no. 2, pp. 85–91, 1961.
[34] A. A. Abu-Dayya and N. C. Beaulieu, "Comparison of methods of computing correlated lognormal sum distributions and outages for digital wireless applications," in Proc. IEEE 44th Veh. Technol. Conf., vol. 1, Jun. 1994, pp. 175–179.
[35] S. Chaudhuri and N. K. Jha, "3D vs. 2D analysis of FinFET logic gates under process variations," in Proc. IEEE Int. Conf. Comput. Des., Oct. 2011, pp. 435–436.
[36] S. Chaudhuri and N. K. Jha, "3D vs. 2D device simulation of FinFET logic gates under PVT variations," accepted for publication in ACM J. Emerging Technol. Comput. Syst.
[37] (2005). Capo: A Large-Scale Fixed-Die Placer From UCLA [Online]. Available: http://vlsicad.ucsd.edu/GSRC/bookshelf/Slots/Placement
[38] (2009). HotSpot 5.0: Temperature Modeling Tool [Online]. Available: http://lava.cs.virginia.edu/HotSpot/index.htm
Yang Yang received the B.Eng. degree from Nanyang Technological University, Singapore, in 2010, and the M.A. degree from Princeton University, Princeton, NJ, USA, in 2013, all in electrical engineering.
His current research interests include FinFET-based circuit design and optimization.

Niraj K. Jha (S'85–M'85–SM'93–F'98) received the B.Tech. degree in electronics and electrical communication engineering from the Indian Institute of Technology, Kharagpur, India, in 1981, the M.S. degree from the State University of New York at Stony Brook, Stony Brook, NY, USA, in 1982, and the Ph.D. degree from the University of Illinois at Urbana, Urbana, IL, USA, in 1985, both in electrical engineering.
He is a Professor of electrical engineering with Princeton University, Princeton, NJ, USA. He has co-authored or co-edited five books, Testing and Reliable Design of CMOS Circuits (Kluwer, 1990), High-Level Power Analysis and Optimization (Kluwer, 1998), Testing of Digital Systems (Cambridge University Press, 2003), Switching and Finite Automata Theory (Cambridge University Press, 2009), and Nanoelectronic Circuit Design (Springer, 2010). He has authored 12 book chapters. He has authored or co-authored more than 390 technical papers. He holds 14 U.S. patents. He has given several keynote speeches in the area of nanoelectronic design and test. His current research interests include FinFETs, low power hardware/software design, computer-aided design of integrated circuits and systems, digital system testing, quantum computing, and secure computing.
Dr. Jha is a fellow of the ACM. He has served as the Editor-in-Chief of the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS and an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I AND II, the IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN, the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, and the Journal of Electronic Testing: Theory and Applications. He currently serves as an Associate Editor of the IEEE TRANSACTIONS ON COMPUTERS, the Journal of Low Power Electronics, and the Journal of Nanotechnology. He has served as the Program Chairman of the 1992 Workshop on Fault-Tolerant Parallel and Distributed Systems, the 2004 International Conference on Embedded and Ubiquitous Computing, and the 2010 International Conference on VLSI Design. He has served as the Director of the Center for Embedded System-on-a-Chip Design funded by the New Jersey Commission on Science and Technology. He has served on the program committees of more than 140 conferences and workshops. He is a recipient of the AT&T Foundation Award and the NEC Preceptorship Award for research excellence, the NCR Award for teaching excellence, and the Princeton University Graduate Mentoring Award. He received the Best Paper Award at ICCD in 1993, FTCS in 1997, ICVLSID in 1998, DAC in 1999, PDCS in 2002, ICVLSID in 2003, CODES in 2006, ICCD in 2009, and CLOUD in 2010. His paper was selected for The Best of ICCAD: A collection of the best IEEE International Conference on Computer-Aided Design papers of the past 20 years. Two of his papers were selected by the IEEE Micro Magazine as one of the top picks from the 2005 and 2007 Computer Architecture conferences, and two others as being among the most influential papers of the last ten years at the IEEE Design Automation and Test in Europe Conference.
