Lecture 03 13 PDF

MOS Device Capacitance Estimation
[Figure: MOS transistor cross-section (gate, source, drain, substrate, gate oxide of thickness t_ox, inversion layer (channel), depletion layer) and its equivalent circuit with capacitances Cgs, Cgd, Cgb, Csb, and Cdb.]
In the cutoff region, the gate-to-channel capacitance is composed entirely of Cgb, where

$C_{gb} = C_{ox} W L_{eff}$, with $C_{ox} = \dfrac{\varepsilon_0 \, \varepsilon_{SiO_2}}{t_{ox}}$

where $\varepsilon_0$ is the free-space permittivity and $\varepsilon_{SiO_2}$ is the relative permittivity of SiO2.
Once the channel is formed, it shields the substrate from the gate, so Cgb is blocked. In the linear region, the gate-to-channel capacitance is split evenly between Cgs and Cgd:

$C_{gs} = C_{gd} = \frac{1}{2} C_{ox} W L_{eff}$

In saturation, the channel is pinched off at the drain, so $C_{gd} \approx 0$ and $C_{gs} = \frac{2}{3} C_{ox} W L_{eff}$.
Average channel capacitances of MOSFETs for different operation regions:
Region of operation   Cutoff        Linear              Saturation
Cgb                   Cox W Leff    ~0                  ~0
Cgs                   ~0            (1/2) Cox W Leff    (2/3) Cox W Leff
Cgd                   ~0            (1/2) Cox W Leff    ~0
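The table can be sketched as a small lookup. This is a minimal illustration, not part of the original notes: the function name channel_caps is hypothetical, and the numeric values (Cox = 17e-4 pF/µm², W = 4 µm, Leff = 1 µm) are assumed for the printout.

```python
def channel_caps(region, cox, w, l_eff):
    """Return (Cgb, Cgs, Cgd) for an operating region.

    cox is in pF/um^2, w and l_eff are in um, so results are in pF.
    The "~0" table entries are modeled as exactly zero.
    """
    c_total = cox * w * l_eff
    table = {
        "cutoff": (c_total, 0.0, 0.0),
        "linear": (0.0, c_total / 2, c_total / 2),
        "saturation": (0.0, 2 * c_total / 3, 0.0),
    }
    return table[region]


# Assumed illustrative device: Cox = 17e-4 pF/um^2, W = 4 um, Leff = 1 um.
for region in ("cutoff", "linear", "saturation"):
    cgb, cgs, cgd = channel_caps(region, 17e-4, 4.0, 1.0)
    cg = cgb + cgs + cgd  # Cg = Cgb + Cgs + Cgd
    print(f"{region:10s} Cgb={cgb:.4f} Cgs={cgs:.4f} Cgd={cgd:.4f} Cg={cg:.4f} pF")
```

Note that the total gate capacitance Cg dips in saturation to 2/3 of its cutoff and linear value.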
Cg = Cgb + Cgs + Cgd
Cg,total versus VGS:
Source/Drain Capacitance
[Figure: top view and cross-section of the source and drain diffusions on either side of the poly gate. Each diffusion has dimensions a by b and junction depth xj (donor doping ND in a substrate of acceptor doping NA); Cja acts on the bottom area and Cjp on the side walls.]
Two components:
  Cbottom: diffusion bottom area to substrate
  Csidewall: diffusion side wall (diffusion depth along the periphery) to substrate

Cja = junction capacitance per µm²
Cjp = periphery capacitance per µm

Cdiff = Cbottom + Csw = Cja × area + Cjp × perimeter = Cja × a × b + Cjp × (2a + 2b)

Typical diffusion capacitance values for a 1 µm n-well process:

                       Cja                  Cjp
n-device (or wire)     3 × 10^-4 pF/µm²     4 × 10^-4 pF/µm
p-device (or wire)     5 × 10^-4 pF/µm²     4 × 10^-4 pF/µm
The source/drain areas form p/n junctions with the substrate or well. The junction voltage affects both Cja and Cjp.
General expression:

$C_j = C_{jo} \left(1 - \dfrac{V_j}{V_b}\right)^{-m}$

where
  Vj = junction voltage, negative for reverse bias
  Cjo = zero-bias capacitance
  Vb = built-in junction potential (about 0.6 V)
  m = grading coefficient (typical values between 0.3 and 0.5)
SPICE Computation of MOS Capacitance

    .
    M1 4 3 5 0 NFET W=4U L=1U AS=15P AD=15P PS=11.5U PD=11.5U
    .
    .
    .MODEL NFET NMOS
    + TOX=200E-8
    + CGBO=200P CGSO=600P CGDO=600P
    + CJ=200U CJSW=400P MJ=0.5 MJSW=0.3 PB=0.7
    + .
    .
Definitions:
  AS = area of Source
  AD = area of Drain
  PS = perimeter of Source
  PD = perimeter of Drain

The TOX parameter allows computation of Cox.
  Cg = Cg(intrinsic) + Cg(extrinsic)
  Cg(intrinsic) = Cox W Leff (only 2/3 of this if in saturation)
Extrinsic Cg is caused by overlap of the gate with the source/drain and channel.

[Figure: top view of a transistor with the poly gate crossing the channel of length L between source and drain.]

Cgbo is caused by the poly extension past the channel. Cgso and Cgdo are caused by the overlap of poly with the source/drain.
Oxide encroachment:
  Weff = Wdrawn − DW
Lateral diffusion of source and drain:
  Leff = Ldrawn − 2LD = Ldrawn − DL
Cgbo is multiplied by the channel length; Cgso and Cgdo are multiplied by the channel width.
Typically, gate capacitance tends to dominate drain and source capacitance, but this can vary significantly with process.
Example from book:
Cg(intrinsic) = W × L × Cox = 4 × 1 × 17 × 10^-4 pF = 0.0068 pF
In this example, the extrinsic gate capacitance for a typical MOS transistor is
Cg(extrinsic) = (W × Cgso) + (W × Cgdo) + (2L × Cgbo)
              = (4 × 6 × 10^-4) + (4 × 6 × 10^-4) + 2 × (1 × 2 × 10^-4) pF
              = 0.0052 pF
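The arithmetic of this example can be checked with a short sketch. The function name gate_capacitance is hypothetical; the inputs are the values from the element and model cards, converted to µm and pF/µm units (CGSO = CGDO = 6e-4 pF/µm, CGBO = 2e-4 pF/µm, Cox = 17e-4 pF/µm²).

```python
def gate_capacitance(w, l, cox, cgso, cgdo, cgbo):
    """Return (intrinsic, extrinsic) gate capacitance in pF.

    w and l are in um, cox in pF/um^2, overlap capacitances in pF/um.
    """
    intrinsic = w * l * cox
    # Cgso and Cgdo scale with channel width, Cgbo with channel length.
    extrinsic = w * cgso + w * cgdo + 2 * l * cgbo
    return intrinsic, extrinsic


c_int, c_ext = gate_capacitance(w=4.0, l=1.0, cox=17e-4,
                                cgso=6e-4, cgdo=6e-4, cgbo=2e-4)
print(f"intrinsic = {c_int:.4f} pF, extrinsic = {c_ext:.4f} pF, "
      f"total = {(c_int + c_ext) * 1000:.1f} fF")
```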
In SPICE the capacitance of a source or drain diffusion is calculated as follows:

$C_j = \text{Area} \times CJ \times \left(1 + \dfrac{VJ}{PB}\right)^{-MJ} + \text{Periphery} \times CJSW \times \left(1 + \dfrac{VJ}{PB}\right)^{-MJSW}$

where
  CJ = the zero-bias capacitance per junction area
  CJSW = the zero-bias junction capacitance per junction periphery
  MJ = the grading coefficient of the junction bottom
  MJSW = the grading coefficient of the junction sidewall
  VJ = the junction potential (entered here as the magnitude of the reverse bias)
  PB = the built-in voltage (~0.4 to 0.8 V)
  Area = AS or AD, the area of the source or drain
  Periphery = PS or PD, the periphery of the source or drain

PB, CJ, CJSW, MJ, and MJSW are specified in the model card. AS, AD, PS, and PD are specified by the element card. VJ depends on circuit conditions.

At VJ = 2.5 V (half rail, VDD = 5 V), with the area in µm², the periphery in µm, CJ = 2 × 10^-4 pF/µm², and CJSW = 4 × 10^-4 pF/µm:

Cj,drain = 15 × 2 × 10^-4 × [1 + (2.5/0.7)]^-0.5 + 11.5 × 4 × 10^-4 × [1 + (2.5/0.7)]^-0.3 pF
         = (15 × 2 × 10^-4 × 0.47) + (11.5 × 4 × 10^-4 × 0.63) pF
         = 0.0014 + 0.0029 pF
         = 0.0043 pF = 4.3 fF

Summarizing these capacitances:
  Cgtotal = 0.0068 + 0.0052 pF = 0.012 pF = 12 fF
  Cdrain = Csource = 0.0043 pF (@ 2.5 V)
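A minimal sketch of the diffusion capacitance calculation, using the SPICE parameter names as argument names. The function name junction_capacitance is hypothetical, and units are assumed to be µm², µm, pF/µm², and pF/µm so that the result comes out in pF.

```python
def junction_capacitance(area, periphery, vj, cj, cjsw, mj, mjsw, pb):
    """Return (bottom, sidewall) junction capacitance in pF.

    vj is the magnitude of the reverse bias in volts.
    """
    bottom = area * cj * (1 + vj / pb) ** (-mj)
    sidewall = periphery * cjsw * (1 + vj / pb) ** (-mjsw)
    return bottom, sidewall


# Drain of M1 at half rail: AD = 15 um^2, PD = 11.5 um, VJ = 2.5 V.
c_bottom, c_side = junction_capacitance(area=15.0, periphery=11.5, vj=2.5,
                                        cj=2e-4, cjsw=4e-4,
                                        mj=0.5, mjsw=0.3, pb=0.7)
print(f"bottom = {c_bottom:.4f} pF, sidewall = {c_side:.4f} pF, "
      f"total = {(c_bottom + c_side) * 1000:.1f} fF")
```

Calling the function with vj=0 returns the zero-bias values Area × CJ and Periphery × CJSW, which is a quick sanity check on the exponents.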
Routing Capacitance
[Figure: metal interconnect over SiO2 and substrate, showing the parallel (plate) capacitance to the substrate, the fringing field capacitance at the conductor edges, and the capacitance to an adjacent conductor.]
Fringing Field Capacitance occurs at edge of the conductor and is due to the conductor's finite thickness. Fringing Field Capacitance will cause effective capacitance to increase.
Use empirical formulas to estimate.
Also have inter-layer capacitances (from p. 196 of text):

[Figure: cross-section of the interconnect stack (substrate, diffusion, poly, metal1 (m1), metal2 (m2), passivation) with dielectric thicknesses labeled 3k, 6k, 12k, and 20k (passivation), a very thin oxide (200), and the capacitance conditions A through E marked between layers.]

Inter-layer capacitances are computationally intensive to calculate and require 3-D CAD. Typically, just use the substrate capacitance multiplied by a "fudge" factor of ~1.1, ~1.3, or even ~2.0.

CONDITION   LAYER               LINE-to-GROUND EQUATION #   LINE-to-LINE EQUATION #
                                (see text)                  (see text)
A           Poly-substrate      4.19                        4.21
B           Metal2-substrate    4.19                        4.21
C           Poly-metal2         4.20                        4.22
D           Metal1-substrate    4.20                        4.22
E           Metal1-poly         4.20                        4.22
E           Metal1-metal2       4.20                        4.22
F           Metal1-diffusion    4.20                        4.22
G           Metal2-diffusion    4.19                        4.21
Delay
A long wire behaves as a distributed RC line.
[Figure: wire modeled as a ladder of series resistances R and shunt capacitances C.]
First-order approximation:

$\text{delay} = \dfrac{r \, c \, l^2}{2}$

where
  r = resistance per unit length
  c = capacitance per unit length
  l = length of the wire

Important fact: interconnect delay does not scale with lambda; it is constant. When lambda decreases, R increases and C decreases, so the delay stays constant.
Inserting a buffer in a long resistance line can be advantageous.
For a poly run of 2 mm length, with r = 20 Ω/µm and c = 4 × 10^-4 pF/µm:

$\text{delay}_{2\,mm} = \dfrac{20 \times 4 \times 10^{-16} \times 2000^2}{2}\ \text{s} = 16\ \text{ns}$
If broken into two 1mm sections, then delay of each section = 4ns. Add a buffer with delay = 1ns and total delay becomes 4 + 1 + 4 = 9ns.
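The buffer-insertion arithmetic can be sketched as follows. The helper names wire_delay_ns and buffered_delay_ns are hypothetical, and the sketch assumes equal-length sections with one buffer between each adjacent pair.

```python
def wire_delay_ns(r_ohm_per_um, c_pf_per_um, length_um):
    """First-order distributed RC delay, r*c*l^2/2, in ns."""
    delay_ps = r_ohm_per_um * c_pf_per_um * length_um ** 2 / 2  # ohm * pF = ps
    return delay_ps / 1000.0


def buffered_delay_ns(r_ohm_per_um, c_pf_per_um, length_um, sections, t_buf_ns):
    """Delay of a wire split into equal sections with a buffer between sections."""
    per_section = wire_delay_ns(r_ohm_per_um, c_pf_per_um, length_um / sections)
    return sections * per_section + (sections - 1) * t_buf_ns


# Poly run from the example: r = 20 ohm/um, c = 4e-4 pF/um, 2 mm long.
print(wire_delay_ns(20.0, 4e-4, 2000.0))              # unbuffered wire
print(buffered_delay_ns(20.0, 4e-4, 2000.0, 2, 1.0))  # two 1 mm sections, 1 ns buffer
```

Because the wire delay grows with the square of the length, halving the length quarters each section's delay, which is why the added buffer delay is more than paid back.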
Typically, the resistive effects of interconnect are much more important than its capacitive effects, since the capacitance tends to be dominated by the gate capacitances.
[Figure: a driver connected through interconnect (modeled by its resistance and capacitance) to several loads, each modeled by the capacitance of a MOSFET load.]
MOSFET load capacitance >> wire capacitance [unless DSM (deep submicron, 0.25 µm) CMOS technology]
So, if we decrease the interconnect resistance, we reduce the overall propagation delay between driver and load.
Reduce the interconnect resistance by using metal or by increasing the width of the interconnect.
Usually just want delay ≈ RC, where R is the resistance of the interconnect and C is the total of all the capacitive loads.
Example (from text): A register that fits in a data-path is 25 µm tall (the direction of repetition). A metal2 clock line runs vertically to link all registers in an n-bit register. The register has 30 µm of 1 µm metal1, 20 µm of 1 µm poly (over field oxide), and 16 µm of 1 µm gate capacitance.
1. Calculate the per-bit clock load and the load for a 16-bit register.
2. What would be the RC delay of the register from a clock buffer using 5 mm of 1 µm metal2 (0.05 Ω/sq.)?
3. How wide would the clock line have to be to keep the skew below 0.5 ns if a register file containing 32 16-bit registers was fed with the same 5 mm metal2 wire?

[Figure: 5 mm metal2 clock line running vertically past register bits Bit0, Bit1, Bit2, ..., Bit15, each 25 µm tall.]

Solution: [Capacitance values found in Table 4.6, page 202 of text.]
1. The parasitics are as follows:
   Cm1 = 30 × 30 aF = 900 aF
   Cpoly = 20 × 50 aF = 1000 aF = 1 fF
   Cgs = 16 × 1800 aF = 28,800 aF
   Creg1 = 900 + 1000 + 28,800 aF ≈ 30 fF
   Creg16 = 16 × Creg1 = 480 fF
2. Rmetal2 = 5000 × 0.05 Ω/sq. = 250 Ω
   Because the capacitive load is at the end of the wire, we approximate the RC delay by adding the metal2 track capacitance to the load capacitance and performing a simple RC calculation.
   Ctotal = 0.48 + Cmetal2 pF = 0.48 + (5000 × 20 × 10^-6) pF = 0.58 pF
   RC = 250 × 0.58 × 10^-12 seconds = 0.145 ns
3. We now have 32 registers, so the load capacitance of the registers is Cregfile = 32 × Creg16 = 15.36 pF.
   The RC for a 1 µm-wide clock feed is 250 Ω × 15.36 pF = 3.84 ns. A delay of 3.84 ns is too big, so widen the wire to reduce R; this will increase C somewhat, but the capacitance is dominated by cell capacitance. The clock line has to be widened by a factor of 3.84/0.5, or 7.68. To be conservative, one might choose a 10 µm wire. Now
   Ctotal = 15.36 + Cmetal2 pF = 15.36 + (5000 × 10 × 20 × 10^-6) pF = 16.36 pF
   Note: R is reduced by 10x, and Ctotal is slightly increased.
   RC = 25 × 16.36 × 10^-12 seconds = 0.41 ns
   Overall delay went down!
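The clock-line calculation can be sketched with a single helper. The name clock_rc_ns is hypothetical; the default sheet resistance (0.05 Ω/sq.) and metal2 capacitance (20 aF/µm², written as 20e-6 pF/µm²) are the values used in the example. Unlike the quick 3.84 ns estimate in part 3, this sketch always includes the wire capacitance, so its 1 µm result is slightly higher.

```python
def clock_rc_ns(load_pf, length_um, width_um,
                r_sheet_ohm=0.05, c_wire_pf_per_um2=20e-6):
    """Lumped RC delay (ns) of a clock wire driving a load at its far end."""
    squares = length_um / width_um
    resistance = squares * r_sheet_ohm                    # ohms
    c_wire = length_um * width_um * c_wire_pf_per_um2     # pF
    return resistance * (load_pf + c_wire) / 1000.0       # ohm * pF = ps -> ns


print(clock_rc_ns(0.48, 5000.0, 1.0))    # 16-bit register, 1 um wire
print(clock_rc_ns(15.36, 5000.0, 1.0))   # register file, 1 um wire
print(clock_rc_ns(15.36, 5000.0, 10.0))  # register file, 10 um wire
```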
For short and lightly loaded wire lengths, can ignore the R and just model wires as lumped capacitances.
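How short? The wire delay must be much smaller than the gate delay, $\tau_w \ll \tau_g$, where

$\tau_w = \dfrac{r c l^2}{2}$, so $l \ll \sqrt{\dfrac{2 \tau_g}{r c}}$

For a minimum-width (1 µm) aluminum wire and a gate delay of 200 ps (using data from the previous example):

$l \ll \sqrt{\dfrac{2 \times 0.2 \times 10^{-9}}{0.05 \times 30 \times 10^{-18}}} \approx 16000$ µm

So conservatively, l < 5000 µm.

Guidelines for ignoring RC wire delays:

LAYER          MAXIMUM LENGTH (λ)
Metal3         10000
Metal2         8000
Metal1         5000
Silicide       600
Polysilicon    200
Diffusion      60

If lambda = 0.5 µm, ignore RC delay for < 2.5 mm metal runs (see table).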
Do NOT ignore for heavily loaded lines like clock lines!
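The length bound for a lumped-capacitance wire model can be evaluated directly. This is a sketch under the numbers used in these notes (r = 0.05 Ω/µm for a 1 µm-wide wire at 0.05 Ω/sq., c = 30 aF/µm, gate delay 200 ps); the function name max_lumped_length_um is hypothetical.

```python
import math


def max_lumped_length_um(tau_gate_s, r_ohm_per_um, c_f_per_um):
    """Length at which the wire delay r*c*l^2/2 equals the gate delay."""
    return math.sqrt(2 * tau_gate_s / (r_ohm_per_um * c_f_per_um))


limit = max_lumped_length_um(tau_gate_s=0.2e-9,
                             r_ohm_per_um=0.05,
                             c_f_per_um=30e-18)
print(f"wire delay equals gate delay at about {limit:.0f} um")
```

The result is the point where the two delays are equal, so the usable length is a fraction of it, consistent with the conservative 5000 µm guideline.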
Gate Delay Models for Rise/Fall Time

Definition of Rise/Fall Delay Times:
[Figure: input and output voltage waveforms versus time, with tpd marked between the 50% point of the input and the 50% point of the output.]
tdelay,50-50 (or tpd) = time between input reaching 50% point and output reaching 50% point
One advantage of using 50% points for measurement is that it does not matter if output is rising or falling (gate inverting or non-inverting).
One problem with 50% propagation delays is that you can end up with a negative propagation delay for slowly rising/falling inputs.
[Figure: slowly changing input and its output versus time; the output begins changing, and crosses its 50% point, before the input reaches its 50% point.]
Can also define delay at 30% -to- 70% points, 10% -to- 90% points, etc.
For non-inverting gates, if we use 30%-to-70% points:
  tpdlh: prop delay low to high (measure between 30% input, 30% output)
  tpdhl: prop delay high to low (measure between 70% input, 70% output)
[Figure: input and output waveforms of a non-inverting gate with tpdlh marked at the 30% points and tpdhl at the 70% points.]
For inverting gates, if we use 30%-to-70% points:
  tpdlh: measure 70% input to 30% output
  tpdhl: measure 30% input to 70% output
[Figure: input and output waveforms of an inverting gate with the 30% and 70% levels, tpdlh, and tpdhl marked.]
Modeling Delay
For a step input, propagation delay simplifies to just the rise/fall time of the output to a particular point (50%, 30%/70%, etc.).
[Figure: step input and rising output versus time, with tr measured to the 30% point of the output.]
Delay can be modeled in terms of an RC delay:
  τrise = Rrise × Cload and τfall = Rfall × Cload, for a particular VDD.
[Figure: inverter with input Vin driving Cload, and its equivalent circuits: Rrise from VDD to Cload when pulling up, Rfall from Cload to ground when pulling down.]
Effective resistance is inversely proportional to the β of the transistor:
  Rrise = k_rise × (1/βp)
  Rfall = k_fall × (1/βn)
How do I determine k rise, k fall? Do SPICE simulation for a particular Cload, measure delay, solve for k rise, k fall.
These values of k rise, k fall would be valid for the particular VDD you used in the simulations.
By characterizing an inverter this way, then one can predict delay for more complex gates after transforming the complex gates into an "equivalent" inverter. In this procedure, the original characterized inverter is sometimes called the "base" inverter.
[Figure: 2-input NAND gate (inputs A and B, output Y, pMOS devices sized Wp/Lp and nMOS devices sized Wn/Ln) next to the base inverter (input Vin, Wp/Lp and Wn/Ln), each driving Cload.]
In delay terms (for this NAND), for trise, would expect to have the same worst-case trise as the base inverter because either the A or the B pMOS will be pulling.
For tfall, would expect the NAND gate to be twice as slow (as the base inverter) because the channel lengths add.
A more accurate delay model breaks gate delay into two parts, internal and output (external):
  τinternal = gate delay with zero load (only internal capacitance values affect delay)
  τexternal = portion of delay proportional to external load
  τgate = τinternal + k × (Cload / Cunit load), where Cunit load = C1X load
Make a SPICE measurement at no load to get τinternal.
Make a SPICE measurement at unit load (typical output load) to determine the k value.
Hopefully, k is a constant for different output loads, but it may not be.
In this case, take SPICE measurements at different output loads and perform a curve fit of k against C values.
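The two-measurement characterization can be sketched as below. All numbers are hypothetical stand-ins for SPICE results (0.10 ns at no load, 0.25 ns at unit load), the function names fit_k and gate_delay are made up for illustration, and k is assumed constant across loads, which the notes warn may not hold.

```python
def fit_k(tau_measured, tau_internal, c_load, c_unit):
    """Solve tau_gate = tau_internal + k * (c_load / c_unit) for k."""
    return (tau_measured - tau_internal) / (c_load / c_unit)


def gate_delay(tau_internal, k, c_load, c_unit):
    """Predict gate delay for a given load using the linear model."""
    return tau_internal + k * c_load / c_unit


# Hypothetical measurements (ns); real values would come from SPICE runs.
tau_internal = 0.10          # measured with zero external load
tau_at_unit_load = 0.25      # measured with a 1X load
k = fit_k(tau_at_unit_load, tau_internal, c_load=1.0, c_unit=1.0)

for multiple in (1, 4, 10, 20):
    delay = gate_delay(tau_internal, k, c_load=multiple, c_unit=1.0)
    print(f"{multiple:2d}X load: {delay:.2f} ns")
```

Comparing these predictions against additional SPICE runs at 4X, 10X, and 20X loads shows whether a constant k is adequate or a curve fit is needed.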
[Figure: two plots of k versus load (1xC, 4xC, 10xC, 20xC): the desired constant k compared with the actual k values, and the actual k curve starting at ko.]
k = k o [exp(Cnorm )] curve fit and get values for k o , ,
Cnorm = Cload / C1X load
Delay calculation:
  a) compute k value based on Cload
  b) compute delay value based on k
For a gate, need a propagation delay factor for each input, both H→L and L→H:
[Figure: 2-input gate with inputs A and B and output Z.]
  Tphl_a_to_z, Tphl_b_to_z
  Tplh_a_to_z, Tplh_b_to_z
Would need ko, the two other curve-fit coefficients, and τinternal for each one of these...
But wait a minute! All of the previous discussion assumed a step input!
Is this realistic? No
Actual waveforms in circuit look something like this:
[Figure: circuit waveforms with varying input slopes.]
How do I determine the range of input slopes I might see in a circuit? Need to know fastest slope, slowest slope
Fastest case would probably be for the inverter driving a 1X load, pulling down.
[Figure: inverter with a step input driving a 1X load; the output slope is measured.]
Measure the output slope. Call it your fastest slope. Measure again for 4X load and call that your typical slope.
To get a representative "slow" slope, use a 2-input NOR gate pulling high.
[Figure: 2-input NOR gate with a typical slope applied at its input, driving a 15x "heavy load"; the output slope is measured.]
Why use a NOR gate? Two pMOS devices in series will be slow.
Now that you have a fast slope, and slow slope, pick values in between and generate tables of model parameters (k o , , , internal)
For different slope values do table lookup based on input slope value.
How do I define input slope? Typically by the 30% and 70% points:
  tslope = 30%-70% time difference
During characterization, I can apply a straight-line input (figure, left). Not very realistic. Probably want to apply a more realistic waveform (figure, right).
[Figure: a straight-line ramp from 0 to Vmax (left) and a more realistic rounded waveform (right), each with the 30% and 70% points and tslope marked.]
Most realistic waveform is achieved however by having another gate drive the input:
[Figure: a driving gate with a step input applied and capacitance Cin on its output feeds the gate under test, which drives Cload.]
This will simulate realistic driving conditions.
Vary Cin to control the input slope. The only problem is that precise control of the input slope is difficult; one must be able to accurately predict the output slope of the driving gate from the value of Cin.
Stage Ratio - Delay Optimization

To drive a large load, do not just want to make one large driver: large transistors represent a large load back to the internal circuitry.
[Figure: one large driver (large transistors) driving Cload.]
Want to drive the load with a series of progressively larger drivers.
[Figure: chain of inverters inv-1, inv-2, inv-3, ..., inv-n with relative sizes 1, a, a^2, ..., a^N and input gate capacitances Cg, Cg1, Cg2, ..., CgN; inv-1 is the minimum-sized inverter and inv-n drives Cload. n stages (number of inverters = n).]
Each driver (inverter) is larger than the preceding driver by the stage ratio "a". Let Cg be the gate load of the first driver, which is minimum size. Then CgN will be $C_g a^N$, and we want $C_g a^n \le C_L$ [Note: n = N + 1] to guarantee that none of the capacitances internal to the chain of inverters exceed Cload. For example, if $C_{gN} \ge C_{load}$, why would we need the n-th inverter at all?
So when the condition on $C_g a^n$ and $C_L$ is set to equality we have

$a^n = \dfrac{C_L}{C_g}$
Question: What value of "a" will lead to minimum delay? What value of "n"? If we find one, we can compute the other.
Delay through each stage is approximately a × td, where td is the delay through a minimum-sized inverter driving another minimum-sized inverter.

Total Delay = n × a × td

We know $a^n = \dfrac{C_L}{C_g}$, so $a = \left(\dfrac{C_L}{C_g}\right)^{1/n}$.

Substituting,

$\text{Total Delay} = n \left(\dfrac{C_L}{C_g}\right)^{1/n} t_d$

To find the optimum value for n, differentiate with respect to n and set equal to zero. If we do, then we find

$n_{opt} = \ln\left(\dfrac{C_L}{C_g}\right)$

Once we know n_opt, find a_opt:

$a^{n} = \dfrac{C_L}{C_g} \;\Rightarrow\; a^{\ln(C_L/C_g)} = \dfrac{C_L}{C_g}$

Take the natural log (i.e., ln) of both sides:

$\ln\left(\dfrac{C_L}{C_g}\right) \ln(a) = \ln\left(\dfrac{C_L}{C_g}\right)$

$\ln(a) = 1 \;\Rightarrow\; a = e^1 \approx 2.7$
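The result can be checked numerically. In this sketch the load ratio CL/Cg = 1000 is an assumed example value, td is normalized to 1, and the function names total_delay and optimum_stages are hypothetical. Since a real chain needs a whole number of inverters, the sketch also scans integer n.

```python
import math


def total_delay(n, cl_over_cg, td=1.0):
    """Total delay n * a * td with a = (CL/Cg)^(1/n)."""
    a = cl_over_cg ** (1.0 / n)
    return n * a * td


def optimum_stages(cl_over_cg):
    """Continuous optimum: n_opt = ln(CL/Cg) and the matching stage ratio."""
    n_opt = math.log(cl_over_cg)
    a_opt = cl_over_cg ** (1.0 / n_opt)
    return n_opt, a_opt


ratio = 1000.0  # assumed CL/Cg for illustration
n_opt, a_opt = optimum_stages(ratio)
print(f"n_opt = {n_opt:.2f}, a_opt = {a_opt:.3f}")

best_n = min(range(1, 20), key=lambda n: total_delay(n, ratio))
print(f"best integer n = {best_n}, delay = {total_delay(best_n, ratio):.2f} td")
```

The delay curve is fairly flat near the optimum, so choosing a neighboring integer n costs little.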
A more detailed analysis shows that the intrinsic output capacitance of the inverter will affect this ratio:

$a_{opt} = \exp\left(\dfrac{k + a_{opt}}{a_{opt}}\right)$, where $k = \dfrac{C_{drain}}{C_{gate}}$

Page 190 of text computed Cdrain = 0.0043 pF and Cgate = 0.02 pF for a 1 µm process, so

$k = \dfrac{C_{drain}}{C_{gate}} = \dfrac{0.0043}{0.02} = 0.215$

a_opt = 2.93
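Because a_opt appears on both sides of this equation, it has to be solved numerically. The sketch below uses simple fixed-point iteration starting from e; this is one possible method chosen for illustration, not necessarily how the text obtained its value, and the function name optimum_stage_ratio is hypothetical.

```python
import math


def optimum_stage_ratio(k, a0=math.e, tol=1e-9, max_iter=200):
    """Solve a = exp((k + a) / a) by fixed-point iteration."""
    a = a0
    for _ in range(max_iter):
        a_next = math.exp((k + a) / a)
        if abs(a_next - a) < tol:
            return a_next
        a = a_next
    raise RuntimeError("fixed-point iteration did not converge")


k = 0.0043 / 0.02  # Cdrain / Cgate from the text
print(f"k = {k:.3f}, a_opt = {optimum_stage_ratio(k):.2f}")
```

With k = 0 the function returns e, recovering the simpler result derived earlier.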
External Conditions which can affect delay

a) Operating Temperature
b) Supply Voltage
c) Process Variation

Drain current is proportional to T^(-1.5). As temperature is increased, drain current is reduced for a given set of operating conditions, so delay increases.
The temperature of the die is what counts. It is expressed as

  Tj = Ta + θja × Pd

where
  Ta = ambient temperature (°C)
  θja = package thermal impedance (°C/watt)
  Pd = power dissipation
Typical values for θja range from 35 to 45 °C/watt, depending on the chip package.
Package Type                        Pin Count   θja still air   θja 300 ft/min.   Units
Plastic J-Leaded Chip Carrier       44          45              35                °C/W
Plastic J-Leaded Chip Carrier       68          38              29                °C/W
Plastic J-Leaded Chip Carrier       84          37              28                °C/W
Plastic Quad Flatpack               100         48              40                °C/W
Very Thin (1.0 mm) Quad Flatpack    80          43              35                °C/W
Ceramic Pin Grid Array              84          33              20                °C/W
Ceramic Quad Flatpack               84          40              30                °C/W
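A short sketch of the junction temperature calculation. The operating point (70 °C ambient, 0.5 W dissipation) is assumed for illustration only, the function name junction_temperature is hypothetical, and the two thermal impedances are the first still-air and 300 ft/min. entries listed in the table.

```python
def junction_temperature(t_ambient_c, theta_ja_c_per_w, power_w):
    """Tj = Ta + theta_ja * Pd, in degrees C."""
    return t_ambient_c + theta_ja_c_per_w * power_w


# Assumed operating point: 70 C ambient, 0.5 W dissipated.
still_air = junction_temperature(70.0, 45.0, 0.5)
forced_air = junction_temperature(70.0, 35.0, 0.5)
print(f"still air: {still_air:.1f} C, 300 ft/min.: {forced_air:.1f} C")
```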
Parts are usually characterized for different temperature ranges:

  Commercial:   0 to 70 °C
  Industrial:   −40 to 85 °C
  Military:     −55 to 125 °C
Voltage also affects device speed: as voltage increases, drain current increases and delay decreases.
Typically characterize the device around a power supply tolerance.

Power Supply Voltage Tolerance

  Commercial:   ±5%
  Industrial:   ±10%
  Military:     ±10%
Process Variations also affect delay

Wafer fabrication is a long series of chemical operations. Variations in diffusion depth, dopant densities, and oxide/diffusion geometry can cause transistor switching speeds to vary from wafer batch to wafer batch, from wafer to wafer, and even on the same wafer.
Transistors are typically characterized as "fast", "nominal", and "slow". Need SPICE transistor models for these cases.
However, variations between nMOS speeds and pMOS speeds can be independent, so one can obtain a "four corners" model:
  slow nMOS, fast pMOS        fast nMOS, fast pMOS
  slow nMOS, slow pMOS        fast nMOS, slow pMOS
When characterizing for high speed, also want to use lowest temperature, highest voltage.
When characterizing for "slow" case, want highest temperature, lowest voltage.
CMOS Digital Systems Checks (Commercial)
PROCESS            TEMPERATURE   VOLTAGE        TESTS
Fast-n / fast-p    0 °C          5.5V (3.6V)    Power dissipation (DC), clock races
Slow-n / slow-p    125 °C        4.5V (3.0V)    Circuit speed, external setup and hold times
Slow-n / fast-p    0 °C          5.5V (3.6V)    Pseudo-nMOS noise margin, level shifters, memory write/read, ratioed circuits
Fast-n / slow-p    0 °C          5.5V (3.6V)    Memories, ratioed circuits, level shifters
Power Dissipation
Power Dissipation has three components:
1. Static
2. Dynamic
3. Short Circuit
For traditional CMOS design, static dissipation is limited to the leakage currents in the reverse-biased diodes formed between the substrate (or well) and the source/drain regions. But in some DSM CMOS technologies, subthreshold leakage also contributes significant static dissipation. Subthreshold leakage increases exponentially as threshold voltage decreases; i.e., a lower-VT (VTn and |VTp|) CMOS technology has more static power dissipation (due to subthreshold leakage) than a higher-VT technology.
Static power dissipation can be extremely small: 1 inverter @ 5V 1 to 2 nanowatts static power
Dynamic Power is governed by

$P_d = C_L V_{DD}^2 f_p$

This is the amount of power dissipated by charging/discharging internal capacitance and load capacitance.
Note the relations:
  The higher the switching speed, the higher Pd.
  The lower the voltage, the lower Pd.
  The bigger the gates, the higher Pd.
To estimate Pd , need to know the switching frequencies of the internal signals
Typically break this into two parts:
Pd = (Pd) clock network + (Pd) all the rest
The power dissipation in the clock network tends to dominate in most designs. Usually assume the switching frequency of logic signals as some fraction of the clock frequency, can estimate by running some sample simulations and keeping switching statistics on internal nodes to build a probabilistic model of switching activity.
Logic synthesis techniques can be used to do the following:
  a. minimize # of gates
  b. maximize speed
and/or
  c. minimize switching activity
Also, have "short-circuit" power dissipation proportional to the amount of time when both p - and n -trees are conducting.
Slow rise/fall times on nodes can make this significant. Usually ignored in most calculations.
Sizing Routing Calculation

The sizing of signal lines to achieve a particular RC delay was previously discussed.
For power conductors, need to worry about:
1. Metal migration: too much current in too small a conductor will "blow" the conductor.
2. Ground bounce: large current spikes in the VDD/GND leads can occur when simultaneous outputs switch.
Two components to ground bounce:
  a. IR: for on-chip conductors, R is the resistance of the on-chip conductor.
  b. L di/dt: L is the on-chip inductance and the package inductance in the VDD/GND pins. Package inductance dominates. Note that di/dt is affected by slew rates on input/output pins.
Example: What would be the conductor width of power and ground wires to a 50 MHz clock buffer that drives 100 pF of on-chip load to satisfy the metal-migration consideration (JAL = 0.5 mA/µm)? What is the ground bounce with the chosen conductor size? The module is 500 µm from both the power and ground pads and the supply voltage is 5 volts.
1. P = C × VDD² × f = 100 × 10^-12 × 25 × 50 × 10^6 = 125 mW
   I = P/V = 25 mA
   Thus the width of the wires should be at least 25 mA / (0.5 mA/µm) = 50 µm. A good choice would be 100 µm.
2. R = (500/100) × 0.05 Ω/sq. = 5 squares × 0.05 Ω/sq. = 0.25 Ω
   IR = 0.25 × 25 × 10^-3 = 6.25 mV
Typically, the IR term of ground bounce is very small compared to the L di/dt term.
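The steps of this example can be sketched in one function. The name power_rail_sizing and its argument names are hypothetical; the sheet resistance of 0.05 Ω/sq. is the value used in the example, and only the IR component of ground bounce is computed.

```python
def power_rail_sizing(c_load_f, vdd, freq_hz, j_max_ma_per_um,
                      length_um, width_um, r_sheet_ohm):
    """Return power (W), current (A), minimum width (um), rail R (ohm), IR drop (V)."""
    power_w = c_load_f * vdd ** 2 * freq_hz          # P = C * VDD^2 * f
    current_a = power_w / vdd                        # I = P / V
    min_width_um = current_a * 1e3 / j_max_ma_per_um  # metal-migration limit
    resistance = (length_um / width_um) * r_sheet_ohm
    ir_drop_v = current_a * resistance
    return power_w, current_a, min_width_um, resistance, ir_drop_v


p, i, w_min, r, ir = power_rail_sizing(c_load_f=100e-12, vdd=5.0, freq_hz=50e6,
                                       j_max_ma_per_um=0.5, length_um=500.0,
                                       width_um=100.0, r_sheet_ohm=0.05)
print(f"P = {p * 1e3:.0f} mW, I = {i * 1e3:.0f} mA, min width = {w_min:.0f} um")
print(f"R = {r:.2f} ohm, IR drop = {ir * 1e3:.2f} mV")
```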
