
MOS Device Capacitance Estimation

[Figure: MOS transistor cross-section and equivalent circuit. Labels: gate, source, drain, substrate, inversion layer (channel), depletion layer, oxide thickness $t_{ox}$; capacitances $C_{gs}$, $C_{gd}$, $C_{gb}$, $C_{sb}$, $C_{db}$.]

In the cutoff region, the gate-to-channel capacitance consists entirely of $C_{gb}$:

$$C_{gb} = C_{ox} W L_{eff}, \qquad C_{ox} = \frac{\varepsilon_0\,\varepsilon_{SiO_2}}{t_{ox}}$$

where $\varepsilon_0$ is the free-space permittivity and $\varepsilon_{SiO_2}$ is the relative permittivity of SiO2.

Once a channel forms, it shields the substrate from the gate, so $C_{gb} \approx 0$. In the linear region, the gate-to-channel capacitance is split evenly between $C_{gs}$ and $C_{gd}$:

$$C_{gs} = C_{gd} = \tfrac{1}{2}\,C_{ox} W L_{eff}$$

In saturation, the channel is pinched off at the drain, so

$$C_{gd} \approx 0, \qquad C_{gs} = \tfrac{2}{3}\,C_{ox} W L_{eff}$$

Average channel capacitances of MOSFETs for different operation regions:

Region of operation   Cutoff        Linear              Saturation
Cgb                   Cox W Leff    ~0                  ~0
Cgs                   ~0            (1/2) Cox W Leff    (2/3) Cox W Leff
Cgd                   ~0            (1/2) Cox W Leff    ~0

Cg = Cgb + Cgs + Cgd
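The region table can be turned into a small lookup. The following is a minimal Python sketch, not part of the original notes; the function names are hypothetical, and the sample values (Cox = 17e-4 pF/µm², W = 4 µm, Leff = 1 µm) are borrowed from the SPICE example later in these notes purely for illustration.

```python
def channel_capacitances(region, cox, w, leff):
    """Return (Cgb, Cgs, Cgd) using the average values for each region."""
    c_full = cox * w * leff  # Cox * W * Leff
    table = {
        "cutoff": (c_full, 0.0, 0.0),
        "linear": (0.0, c_full / 2, c_full / 2),
        "saturation": (0.0, 2 * c_full / 3, 0.0),
    }
    return table[region]


def total_gate_capacitance(region, cox, w, leff):
    """Cg = Cgb + Cgs + Cgd for the given region."""
    return sum(channel_capacitances(region, cox, w, leff))


# Illustrative values (assumed): Cox in pF/um^2, W and Leff in um.
for region in ("cutoff", "linear", "saturation"):
    cg = total_gate_capacitance(region, 17e-4, 4.0, 1.0)
    print(f"{region:10s} Cg = {cg:.4f} pF")
```

Under these averages the total is the same in cutoff and linear and drops to two thirds of that value in saturation.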

[Figure: $C_{g,total}$ versus $V_{GS}$.]

Source/Drain Capacitance
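[Figure: top view and cross-section of the source/drain diffusions. Labels: poly gate, source diffusion area, drain diffusion area, diffusion dimensions a and b, junction depth $x_j$, bottom and side wall, $C_{ja}$, $C_{jp}$, channel, source doping $N_D$, substrate doping $N_A$.]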

Two components:

Cbottom: diffusion area to substrate
Csidewall: peripheral (side wall) area along the diffusion depth

$C_{ja}$ = junction capacitance per µm²
$C_{jp}$ = periphery capacitance per µm

$$C_{diff} = C_{bottom} + C_{sw} = C_{ja} \times \text{area} + C_{jp} \times \text{perimeter} = C_{ja}\,ab + C_{jp}\,(2a + 2b)$$

Typical diffusion capacitance values for a 1 µm n-well process:

        n-device (or wire)     p-device (or wire)
Cja     3 × 10^-4 pF/µm²       5 × 10^-4 pF/µm²
Cjp     4 × 10^-4 pF/µm        4 × 10^-4 pF/µm

The source/drain areas form p-n junctions with the substrate or well. The junction voltage affects the capacitance, both $C_{ja}$ and $C_{jp}$.

General expression:

$$C_j = C_{jo}\left(1 - \frac{V_j}{V_b}\right)^{-m}$$

where
$V_j$ = junction voltage, negative for reverse bias
$C_{jo}$ = zero bias capacitance
$V_b$ = built-in junction potential (0.6 V)
$m$ = grading coefficient (typical values between 0.3 and 0.5)

SPICE Computation of MOS Capacitance


.
M1 4 3 5 0 NFET W=4U L=1U AS=15P AD=15P PS=11.5U PD=11.5U
.
.
.MODEL NFET NMOS
+ TOX=200E-8
+ CGBO=200P CGSO=600P CGDO=600P
+ CJ=200U CJSW=400P MJ=0.5 MJSW=0.3 PB=0.7
+ . . .
.

Definitions:
AS = area of Source
AD = area of Drain
PS = perimeter of Source
PD = perimeter of Drain

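The TOX parameter allows computation of $C_{ox}$.

$C_g = C_g(\text{intrinsic}) + C_g(\text{extrinsic})$

$C_g(\text{intrinsic}) = C_{ox} W L_{eff}$ (only 2/3 of this value if in saturation)

Extrinsic $C_g$ is caused by overlap of the gate with source/drain and channel.

[Figure: top view of the transistor showing poly crossing the channel of length L between source and drain.]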

$C_{gbo}$ is caused by the poly extension past the channel.
$C_{gso}$, $C_{gdo}$ are caused by the overlap of poly with source/drain.

Oxide encroachment:

$$W_{eff} = W_{drawn} - DW$$

Lateral diffusion of source and drain:

$$L_{eff} = L_{drawn} - 2LD = L_{drawn} - DL$$

$C_{gbo}$ is multiplied by channel length; $C_{gso}$, $C_{gdo}$ are multiplied by channel width.

Typically, gate capacitance tends to dominate drain and source capacitance, but this can vary significantly with process. Example from book:

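$$C_g(\text{intrinsic}) = W \times L \times C_{ox} = 4 \times 1 \times 17 \times 10^{-4}\ \text{pF} = 0.0068\ \text{pF}$$

In this example, the extrinsic gate capacitance for a typical MOS transistor is

$$C_g(\text{extrinsic}) = (W \times C_{gso}) + (W \times C_{gdo}) + (2L \times C_{gbo})$$
$$= (4 \times 6 \times 10^{-4}) + (4 \times 6 \times 10^{-4}) + 2\,(1 \times 2 \times 10^{-4})\ \text{pF} = 0.0052\ \text{pF}$$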
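The gate-capacitance arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name is hypothetical, and the overlap values are the per-µm equivalents of CGSO, CGDO, and CGBO used in the book example (pF/µm), with Cox in pF/µm².

```python
def gate_capacitance_pf(w_um, l_um, cox, cgso, cgdo, cgbo):
    """Return (intrinsic, extrinsic) gate capacitance in pF.

    cox is in pF/um^2; cgso, cgdo, cgbo are in pF/um.
    """
    intrinsic = w_um * l_um * cox
    # Cgso and Cgdo scale with channel width, Cgbo with channel length
    # (the poly extends past the channel on both sides, hence the 2).
    extrinsic = w_um * cgso + w_um * cgdo + 2 * l_um * cgbo
    return intrinsic, extrinsic


intrinsic, extrinsic = gate_capacitance_pf(4.0, 1.0, 17e-4, 6e-4, 6e-4, 2e-4)
print(f"intrinsic = {intrinsic:.4f} pF, extrinsic = {extrinsic:.4f} pF")
# prints: intrinsic = 0.0068 pF, extrinsic = 0.0052 pF
```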

In SPICE the capacitance of a source or drain diffusion is calculated as follows:


$$C_j = \text{Area} \times CJ \times \left(1 + \frac{VJ}{PB}\right)^{-MJ} + \text{Periphery} \times CJSW \times \left(1 + \frac{VJ}{PB}\right)^{-MJSW}$$

where
CJ = the zero-bias capacitance per junction area
CJSW = the zero-bias junction capacitance per junction periphery
MJ = the grading coefficient of the junction bottom
MJSW = the grading coefficient of the junction sidewall
VJ = the junction potential
PB = the built-in voltage (~0.4 to 0.8 V)
Area = AS or AD, the area of the source or drain
Periphery = PS or PD, the periphery of the source or drain

PB, CJ, CJSW, MJ, and MJSW are specified in the model card. AS, AD, PS, and PD are specified by the element card. VJ depends on circuit conditions.

At VJ = 2.5 V (half rail, $V_{DD}$ = 5 V), using the SI values from the model and element cards:

$$C_{j,drain} = 15 \times 10^{-12} \times 2 \times 10^{-4} \times \left[1 + \frac{2.5}{0.7}\right]^{-0.5} + 11.5 \times 10^{-6} \times 4 \times 10^{-10} \times \left[1 + \frac{2.5}{0.7}\right]^{-0.3}\ \text{F}$$

Expressed in µm and pF:

$$C_{j,drain} = (15 \times 2 \times 10^{-4} \times 0.47) + (11.5 \times 4 \times 10^{-4} \times 0.63)\ \text{pF} = 0.0014 + 0.0029\ \text{pF} = 0.0043\ \text{pF} = 4.3\ \text{fF}$$

Summarizing these capacitances then,

$C_{g,total}$ = 0.0068 + 0.0052 pF = 12 fF
$C_{drain}$ = $C_{source}$ = 0.0043 pF (@ 2.5 V)
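As a cross-check on the drain value, here is a minimal Python sketch of the SPICE diffusion-capacitance expression. The function name is hypothetical; the arguments are the model-card and element-card values from the listing above in SI units (m², m, F/m², F/m), with VJ taken as the 2.5 V reverse bias used in the example.

```python
def junction_capacitance(area, perimeter, cj, cjsw, mj, mjsw, pb, vj):
    """Bottom plus sidewall junction capacitance in farads."""
    bottom = area * cj * (1 + vj / pb) ** (-mj)
    sidewall = perimeter * cjsw * (1 + vj / pb) ** (-mjsw)
    return bottom + sidewall


# AD=15P, PD=11.5U, CJ=200U, CJSW=400P, MJ=0.5, MJSW=0.3, PB=0.7
c_drain = junction_capacitance(15e-12, 11.5e-6, 200e-6, 400e-12,
                               0.5, 0.3, 0.7, 2.5)
print(f"Cdrain = {c_drain * 1e15:.1f} fF")  # prints: Cdrain = 4.3 fF
```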

Routing Capacitance
[Figure: metal interconnect over SiO2 and substrate, showing parallel capacitance, fringing field capacitance, and capacitance to an adjacent conductor.]

Fringing Field Capacitance occurs at edge of the conductor and is due to the conductor's finite thickness. Fringing Field Capacitance will cause effective capacitance to increase.
Use empirical formulas to estimate.

Also have inter-layer capacitances (from p. 196 of text):


[Figure: cross-section of the layer stack for conditions A through E, showing metal2 (m2), metal1 (m1), poly, and diffusion over the substrate with the capacitances C between them. Thickness labels: 20k passivation, 12k, 6k, 3k, and very thin oxide (200).]

Inter-layer capacitance is computationally intensive to calculate and requires 3-D CAD. Typically, just use the substrate capacitance multiplied by a "fudge" factor of ~1.1, ~1.3, or even ~2.0.

CONDITION   LAYER               LINE-to-GROUND EQUATION #   LINE-to-LINE EQUATION #
                                (see text)                  (see text)
A           Poly-substrate      4.19                        4.21
B           Metal2-substrate    4.19                        4.21
C           Poly-metal2         4.20                        4.22
D           Metal1-substrate    4.20                        4.22
E           Metal1-poly         4.20                        4.22
E           Metal1-metal2       4.20                        4.22
F           Metal1-diffusion    4.20                        4.22
G           Metal2-diffusion    4.19                        4.21

Delay
A long wire behaves as a distributed RC line.

[Figure: wire modeled as a ladder of series resistances R and shunt capacitances C.]

First-order approximation:

$$\text{delay} = \frac{r\,c\,l^2}{2}$$

where
r = resistance per unit length
c = capacitance per unit length
l = length of the wire

Important fact: interconnect delay does not scale with lambda; it stays constant. When lambda decreases, R increases and C decreases, so the delay remains constant.

Inserting a buffer in a long resistance line can be advantageous.

For a poly run of length 2 mm, with r = 20 Ω/µm and c = 4 × 10^-4 pF/µm:

$$\text{delay}_{2mm} = \frac{20 \times 4 \times 10^{-16} \times (2000)^2}{2}\ \text{s} = 16\ \text{ns}$$

If broken into two 1mm sections, then delay of each section = 4ns. Add a buffer with delay = 1ns and total delay becomes 4 + 1 + 4 = 9ns.
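The buffer-insertion arithmetic can be reproduced with a short Python sketch. The helper name and constants are illustrative; the numbers are the poly-run values from the example (r = 20 Ω/µm, c = 4e-4 pF/µm = 4e-16 F/µm, buffer delay 1 ns).

```python
def wire_delay(r_per_um, c_per_um, length_um):
    """First-order distributed RC delay: r * c * l^2 / 2 (seconds)."""
    return r_per_um * c_per_um * length_um ** 2 / 2


R_POLY = 20.0        # ohm per um
C_POLY = 4e-16       # farad per um (4e-4 pF/um)
BUFFER_DELAY = 1e-9  # seconds

unbuffered = wire_delay(R_POLY, C_POLY, 2000)
buffered = 2 * wire_delay(R_POLY, C_POLY, 1000) + BUFFER_DELAY

print(f"unbuffered: {unbuffered * 1e9:.1f} ns")  # prints: unbuffered: 16.0 ns
print(f"buffered:   {buffered * 1e9:.1f} ns")    # prints: buffered:   9.0 ns
```

Because the delay grows with the square of the length, halving the segment length cuts each segment's delay by four, which is why the extra buffer delay still pays off.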

Typically, the resistive effects of interconnect are much more important than the capacitive effects, since the capacitance tends to be dominated by the gate capacitances.

[Figure: a driver connected through interconnect (resistance/capacitance of interconnect) to three MOSFET loads (capacitance of MOSFET load).]

MOSFET load capacitance >> wire capacitance [unless DSM (deep submicron, 0.25 µm) CMOS technology]

So, if we decrease interconnect resistance, then we reduce overall propagation delay between driver and load.

Reduce interconnect resistance by using metal or by increasing the width of the interconnect.

Usually we just want the delay (RC), where R is the resistance of the interconnect and C is the total of all the capacitive loads.

Example (from text): A register that fits in a data-path is 25 µm tall (the direction of repetition). A metal2 clock line runs vertically to link all registers in an n-bit register. The register has 30 µm of 1 µm metal1, 20 µm of 1 µm poly (over field oxide), and 16 µm of 1 µm gate capacitance.

1. Calculate the per-bit clock load and the load for a 16-bit register.
2. What would be the RC delay of the register from a clock buffer using 5 mm of 1 µm metal2 (0.05 Ω/sq.)?
3. How wide would the clock line have to be to keep the skew below 0.5 ns if a register file containing 32 16-bit registers was fed with the same 5 mm metal2 wire?

[Figure: 5 mm metal2 clock line feeding a column of register bits Bit0, Bit1, Bit2, ..., Bit15, each 25 µm tall.]

Solution: [Capacitance values found in Table 4.6, page 202 of text.]

1. The parasitics are as follows:

$C_{m1} = 30 \times 30\ \text{aF} = 900\ \text{aF}$
$C_{poly} = 20 \times 50\ \text{aF} = 1000\ \text{aF} = 1\ \text{fF}$
$C_{gs} = 16 \times 1800\ \text{aF} = 28{,}800\ \text{aF}$
$C_{reg1} = 900 + 1000 + 28{,}800\ \text{aF} \approx 30\ \text{fF}$
$C_{reg16} = 16 \times C_{reg1} = 480\ \text{fF}$

2. $R_{metal2} = 5000 \times 0.05\ \Omega/\text{sq.} = 250\ \Omega$

Because the capacitive load is at the end of the wire, we approximate the RC delay by adding the metal2 track capacitance to the load capacitance and performing a simple RC calculation.

$C_{total} = 0.48 + C_{metal2}\ \text{pF} = 0.48 + (5000 \times 20 \times 10^{-6})\ \text{pF} = 0.58\ \text{pF}$
$RC = 250 \times 0.58 \times 10^{-12}\ \text{s} = 0.145\ \text{ns}$

3. We now have 32 registers, so the load capacitance of the registers is $C_{regfile} = 32 \times C_{reg16} = 15.36\ \text{pF}$.

The RC for a 1 µm-wide clock feed is 250 Ω × 15.36 pF = 3.84 ns. A delay of 3.84 ns is too big, so widen the wire to reduce R; this will increase C somewhat, but the capacitance is dominated by the cell capacitance. The clock line has to be widened by a factor of 3.84/0.5, or 7.68. To be conservative, one might choose a 10 µm wire. Now

$C_{total} = 15.36 + C_{metal2}\ \text{pF} = 15.36 + (5000 \times 10 \times 20 \times 10^{-6})\ \text{pF} = 16.36\ \text{pF}$

Note: R is reduced by 10x, while $C_{total}$ is slightly increased.

$RC = 25 \times 16.36 \times 10^{-12}\ \text{s} = 0.41\ \text{ns}$

Overall delay went down!
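The trade-off in the clock-line example (wider wire, lower R, slightly higher C) can be explored with a small Python sketch. The function name and default arguments are assumptions made for illustration; the defaults are the example's 5 mm metal2 line at 0.05 Ω/sq. and 20e-6 pF/µm².

```python
def clock_rc(width_um, load_pf, length_um=5000.0,
             r_sheet=0.05, c_wire_pf_per_um2=20e-6):
    """Lumped RC estimate (seconds) with the load at the end of the wire."""
    squares = length_um / width_um
    resistance = squares * r_sheet                        # ohms
    c_wire_pf = length_um * width_um * c_wire_pf_per_um2  # wire capacitance
    return resistance * (load_pf + c_wire_pf) * 1e-12


one_register = clock_rc(1.0, 0.48)      # 16-bit register, 1 um wire
wide_regfile = clock_rc(10.0, 15.36)    # 32 registers, 10 um wire

print(f"{one_register * 1e9:.3f} ns")   # prints: 0.145 ns
print(f"{wide_regfile * 1e9:.3f} ns")   # prints: 0.409 ns
```

Note that this sketch always includes the wire capacitance, so for the 1 µm register-file case it gives a value slightly above the 3.84 ns quoted in the example, which neglects the wire capacitance at that step.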

For short and lightly loaded wire lengths, can ignore the R and just model wires as lumped capacitances.

How short? $\tau_w \ll \tau_g$

$$\tau_w = \frac{r\,c\,l^2}{2} \quad\Rightarrow\quad l \ll \sqrt{\frac{2\,\tau_g}{r\,c}}$$

Minimum width (1 µm) aluminum wire, gate delay = 200 ps (using data from the previous example):

$$l \ll \sqrt{\frac{2 \times 0.2 \times 10^{-9}}{0.05 \times 30 \times 10^{-18}}} \approx 16000\ \text{µm}$$

So conservatively, l < 5000 µm.

Guidelines for ignoring RC wire delays:

LAYER          MAXIMUM LENGTH (λ)
Metal3         10000
Metal2         8000
Metal1         5000
Silicide       600
Polysilicon    200
Diffusion      60

If lambda = 0.5 µm, ignore RC delay for < 2.5 mm metal runs (see table).

Do NOT ignore for heavily loaded lines like clock lines!
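The length bound follows directly from setting the wire delay equal to the gate delay. A minimal Python sketch, with a hypothetical function name and the aluminum-wire numbers used above (r = 0.05 Ω/µm, c = 30 aF/µm, gate delay 200 ps):

```python
import math


def max_lumped_length_um(gate_delay_s, r_per_um, c_per_um):
    """Length at which r*c*l^2/2 equals the gate delay (in um).

    Wires much shorter than this can be treated as lumped capacitances.
    """
    return math.sqrt(2 * gate_delay_s / (r_per_um * c_per_um))


limit_um = max_lumped_length_um(200e-12, 0.05, 30e-18)
print(f"l << {limit_um:.0f} um")  # roughly 16000 um
```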

Gate Delay Models for Rise/Fall Time


Definition of Rise/Fall Delay Times:

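[Figure: input and output voltage waveforms versus time, with the 50% points marked and $t_{pd}$ measured between them.]

$t_{delay,50\text{-}50}$ (or $t_{pd}$) = time between the input reaching its 50% point and the output reaching its 50% point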

One advantage of using 50% points for measurement is that it does not matter if output is rising or falling (gate inverting or non-inverting).

One problem with 50% propagation delays is that you can end up with a negative propagation delay for slowly rising/falling inputs.

[Figure: slowly changing input and the resulting output versus time; the output begins changing before the input reaches its 50% point.]

Can also define delay at 30% -to- 70% points, 10% -to- 90% points, etc.

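For non-inverting gates, if we use 30%-to-70% points:

tpdlh: propagation delay low to high (measure between 30% input, 30% output)
tpdhl: propagation delay high to low (measure between 70% input, 70% output)

[Figure: rising input and output with tpdlh measured at the 30% points; falling input and output with tpdhl measured at the 70% points.]

For inverting gates, if we use 30%-to-70% points:

tpdlh: measure 70% input to 30% output
tpdhl: measure 30% input to 70% output

[Figure: falling input and rising output with tpdlh measured from the 70% input point to the 30% output point; rising input and falling output with tpdhl measured from the 30% input point to the 70% output point.]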

Modeling Delay
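For a step input, propagation delay simplifies to just the rise/fall time of the output to a particular point (50%, 30%/70%, etc.).

[Figure: step input and output waveform in volts versus time; $t_r$ is measured to the 30% point of the output.]

Delay can be modeled in terms of an RC delay:

$$\tau_{rise} = R_{rise}\,C_{load} \quad\text{and}\quad \tau_{fall} = R_{fall}\,C_{load}$$

for a particular $V_{DD}$.

[Figure: inverter with input Vin driving $C_{load}$, and its equivalent circuits: $R_{rise}$ from $V_{DD}$ charging $C_{load}$, and $R_{fall}$ discharging $C_{load}$.]

Effective resistance is inversely proportional to $\beta$ of the transistor:

$$R_{rise} = k_{rise}\,\frac{1}{\beta_p}, \qquad R_{fall} = k_{fall}\,\frac{1}{\beta_n}$$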

How do I determine $k_{rise}$, $k_{fall}$? Do a SPICE simulation for a particular $C_{load}$, measure the delay, and solve for $k_{rise}$, $k_{fall}$.

These values of $k_{rise}$, $k_{fall}$ would be valid for the particular $V_{DD}$ you used in the simulations.

By characterizing an inverter this way, then one can predict delay for more complex gates after transforming the complex gates into an "equivalent" inverter. In this procedure, the original characterized inverter is sometimes called the "base" inverter.

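[Figure: base inverter (pMOS Wp/Lp, nMOS Wn/Ln, input Vin, load $C_{load}$) next to a 2-input NAND gate with inputs A and B and output Y, built from two parallel pMOS (Wp/Lp) and two series nMOS (Wn/Ln).]

In delay terms (for this NAND), for $t_{rise}$, would expect to have the same worst case $t_{rise}$ as the base inverter because either the A or the B pMOS will be pulling.

For $t_{fall}$, would expect the NAND gate to be twice as slow (as the base inverter) because channel lengths add.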

A more accurate delay model breaks gate delay into two parts: internal and external (output).

$\tau_{internal}$ = gate delay with zero load (only internal capacitance values affect delay)
$\tau_{external}$ = portion of delay proportional to external load

$$\tau_{gate} = \tau_{internal} + k\,\frac{C_{load}}{C_{unit\,load}}, \qquad C_{unit\,load} = C_{1X\,load}$$

Make a SPICE measurement at no load to get $\tau_{internal}$.

Make a SPICE measurement at unit load (typical output load) to determine the k value.

Hopefully, k is a constant for different output loads, but it may not be.

If it is not, take SPICE measurements at different output loads and perform a curve fit of k against the C values.
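The two-measurement characterization can be sketched in Python as follows. This is only an illustration of the constant-k case: the function names are hypothetical and the "measured" delays are made-up numbers standing in for SPICE results, not data from the text.

```python
def fit_k(delay_at_unit_load, tau_internal):
    """At Cload = Cunit the model reads delay = tau_internal + k."""
    return delay_at_unit_load - tau_internal


def gate_delay(tau_internal, k, c_load, c_unit):
    """tau_gate = tau_internal + k * Cload / Cunit."""
    return tau_internal + k * c_load / c_unit


# Hypothetical SPICE measurements (ns), assumed for illustration only.
tau_internal = 0.10        # delay with no load
delay_unit_load = 0.25     # delay with a 1X load

k = fit_k(delay_unit_load, tau_internal)
# Predict the delay for a 4X load (4 unit loads).
print(f"k = {k:.2f} ns, delay(4X) = {gate_delay(tau_internal, k, 4.0, 1.0):.2f} ns")
```

Comparing such predictions against SPICE runs at 4X, 10X, and 20X loads is one way to see whether k really is constant.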

[Figure: two plots of k versus load (1xC, 4xC, 10xC, 20xC): the desired constant k compared with the actual k values, and the actual k curve starting from $k_o$.]

k = k o [exp(Cnorm )] curve fit and get values for k o , ,

$$C_{norm} = \frac{C_{load}}{C_{1X\,load}}$$

Delay calculation:
a) compute the k value based on $C_{load}$
b) compute the delay value based on k

For a gate, need a propagation delay factor for each input, both H-to-L and L-to-H:

[Figure: 2-input gate with inputs A and B.]

Tphl_a_to_z, Tphl_b_to_z
Tplh_a_to_z, Tplh_b_to_z

Would need $k_o$, the remaining curve-fit parameters, and $\tau_{internal}$ for each one of these...

But wait a minute! All of the previous discussion assumed a step input!

Is this realistic? No

Actual waveforms in a circuit look something like this:

[Figure: waveforms with varying input slopes.]

How do I determine the range of input slopes I might see in a circuit? Need to know fastest slope, slowest slope

Fastest case would probably be for the inverter driving a 1X load, pulling down.

[Figure: inverter driven by a step input, with the output slope marked "measure this output slope".]

Measure the output slope. Call it your fastest slope. Measure again for 4X load and call that your typical slope.

To get a representative "slow" slope, use a 2-input NOR gate pulling high

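[Figure: 2-input NOR gate with a typical slope applied at its input, driving a 15x "heavy load"; the output slope is measured.]

Why use a NOR gate?

[Figure: two series pMOS transistors connected to $V_{DD}$.]

Two pMOS's in series will be slow.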

Now that you have a fast slope and a slow slope, pick values in between and generate tables of model parameters ($k_o$, the remaining curve-fit parameters, and $\tau_{internal}$).

For different slope values do table lookup based on input slope value.

How do I define input slope? Typically by the 30% and 70% points:

[Figure: rising waveform with the 30% and 70% points marked.]

$t_{slope}$ = 30%-70% time difference

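During characterization, I can apply a straight-line input (as shown below left).

[Figure: left, a straight-line ramp from 0 to $V_{max}$ with the 30% and 70% points and $t_{slope}$ marked; right, a more realistic curved waveform with the same 30% and 70% points and $t_{slope}$.]

Not very realistic. Probably want to apply a more realistic waveform (above right).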

Most realistic waveform is achieved however by having another gate drive the input:

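[Figure: a driving gate with a step input applied ("Apply step input here"), loaded by $C_{in}$ at the input of the gate under test, which drives $C_{load}$.]

This will simulate realistic driving conditions.

Vary $C_{in}$ to control the input slope. The only problem is that precise control of the input slope is difficult; you must be able to accurately predict the slope of the driving gate's output based upon the value of "$C_{in}$".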

Stage Ratio - Delay Optimization


To drive a large load, do not just want to make one large driver: large transistors represent a large load back to the internal circuitry.

[Figure: a single large driver (large transistors) driving $C_{load}$.]

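Want to drive the load with a series of progressively larger drivers.

[Figure: chain of n inverters inv-1, inv-2, inv-3, ..., inv-n with relative sizes 1, a, a², ..., a^N and gate capacitances $C_g$, $C_{g1}$, $C_{g2}$, ..., $C_{gN}$, driving $C_{load}$. inv-1 is a minimum sized inverter; n stages (number of inverters = n).]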

Each driver (inverter) is larger than the preceding driver by the stage ratio "a". Let $C_g$ be the gate load of the first driver, which is minimum size. Then $C_{gN}$ will be $C_g a^N$, and we want

$$C_g\,a^n \le C_L \qquad [\text{Note: } n = N + 1]$$

to guarantee that none of the capacitances internal to the chain of inverters exceed $C_{load}$. For example, if $C_{gN} \ge C_{load}$, why would we need the n-th inverter at all?

So when the condition on $C_g a^n$ and $C_L$ is set to equality, we have

$$a^n = \frac{C_L}{C_g}$$

Question: What value of "a" will lead to minimum delay? What value of "n"? If we find one, we can compute the other.

Delay through each stage is approximately $a\,t_d$, where $t_d$ is the delay through a minimum-sized inverter driving another minimum-sized inverter.

$$\text{Total Delay} = n\,a\,t_d$$

We know $a^n = \dfrac{C_L}{C_g}$, so

$$a = \left(\frac{C_L}{C_g}\right)^{1/n}$$

Substituting,

$$\text{Total Delay} = n \left(\frac{C_L}{C_g}\right)^{1/n} t_d$$

To find the optimum value for n, differentiate with respect to n and set equal to zero.

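If we do, then we find

$$n_{opt} = \ln\left(\frac{C_L}{C_g}\right)$$

Once we know $n_{opt}$, find $a_{opt}$:

$$a^{n} = \frac{C_L}{C_g} \quad\Rightarrow\quad a^{\ln(C_L/C_g)} = \frac{C_L}{C_g}$$

Take the natural log (i.e., ln) of both sides:

$$\ln\left(\frac{C_L}{C_g}\right)\ln(a) = \ln\left(\frac{C_L}{C_g}\right)$$

$$\ln(a) = 1 \quad\Rightarrow\quad a = e^1 \approx 2.7$$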
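The result can be checked numerically by evaluating the total delay for integer stage counts. This is a minimal Python sketch; the load ratio CL/Cg = 1000 is a hypothetical value chosen only for illustration, and delays are in units of td.

```python
import math


def total_delay(n, load_ratio, td=1.0):
    """Total Delay = n * (CL/Cg)^(1/n) * td."""
    return n * load_ratio ** (1.0 / n) * td


load_ratio = 1000.0                 # hypothetical CL/Cg
n_opt = math.log(load_ratio)        # continuous optimum, ln(CL/Cg)

# Best integer number of stages, found by brute force.
best_n = min(range(1, 21), key=lambda n: total_delay(n, load_ratio))
stage_ratio = load_ratio ** (1.0 / best_n)

print(f"n_opt = {n_opt:.2f}, best integer n = {best_n}, a = {stage_ratio:.2f}")
```

Since n must be an integer in practice, the resulting stage ratio lands near e rather than exactly on it.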

A more detailed analysis shows that the intrinsic output capacitance of the inverter will affect this ratio:

$$a_{opt} = \exp\left(\frac{k + a_{opt}}{a_{opt}}\right), \qquad \text{where } k = \frac{C_{drain}}{C_{gate}}$$

Page 190 of text computed $C_{drain}$ = 0.0043 pF, $C_{gate}$ = 0.02 pF for a 1 µm process:

$$k = \frac{C_{drain}}{C_{gate}} = 0.215 \quad\Rightarrow\quad a_{opt} = 2.93$$
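Because the equation for the optimum stage ratio is implicit, it has to be solved numerically. One simple option is fixed-point iteration, sketched below in Python; the function name and the choice of iteration scheme are assumptions, not the method used in the text.

```python
import math


def optimum_stage_ratio(k, a=math.e, iterations=200):
    """Solve a = exp((k + a) / a) by repeated substitution.

    Starts from a = e, the solution when k = 0.
    """
    for _ in range(iterations):
        a = math.exp((k + a) / a)
    return a


a_opt = optimum_stage_ratio(0.0043 / 0.02)   # k = Cdrain / Cgate = 0.215
print(f"a_opt = {a_opt:.2f}")                # prints: a_opt = 2.93
```

The iteration converges quickly here because the right-hand side changes only slowly with a near the solution.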

External Conditions which can affect delay


a) Operating Temperature
b) Supply Voltage
c) Process Variation

Drain current is proportional to $T^{-1.5}$. As temperature is increased, drain current is reduced for a given set of operating conditions, so delay increases.

The temperature of the die is what counts. This is expressed as

$$T_j = T_a + \theta_{ja}\,P_d$$

where
$T_a$ = ambient temperature (°C)
$\theta_{ja}$ = package thermal impedance (°C/watt)
$P_d$ = power dissipation

Typical values for $\theta_{ja}$ range from 35 to 45 (°C/watt), depending on chip package.
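A quick Python sketch of the die-temperature estimate. The function name is hypothetical, and the inputs (70 °C ambient, θja = 45 °C/W from the top of the typical range, 0.5 W dissipation) are assumed values for illustration.

```python
def junction_temperature(t_ambient_c, theta_ja_c_per_w, power_w):
    """Tj = Ta + theta_ja * Pd, in degrees Celsius."""
    return t_ambient_c + theta_ja_c_per_w * power_w


# Assumed operating point: 70 C ambient, 45 C/W package, 0.5 W dissipation.
tj = junction_temperature(70.0, 45.0, 0.5)
print(f"Tj = {tj:.1f} C")  # prints: Tj = 92.5 C
```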

Package Type                         Pin Count   θja still air   θja 300 ft/min.   Units
Plastic J-Leaded Chip Carrier        44          45              35                °C/W
Plastic J-Leaded Chip Carrier        68          38              29                °C/W
Plastic J-Leaded Chip Carrier        84          37              28                °C/W
Plastic Quad Flatpack                100         48              40                °C/W
Very Thin (1.0mm) Quad Flatpack      80          43              35                °C/W
Ceramic Pin Grid Array               84          33              20                °C/W
Ceramic Quad Flatpack                84          40              30                °C/W

Parts are usually characterized for different temperature ranges:

Commercial:   0 to 70 °C
Industrial:   -40 to 85 °C
Military:     -55 to 125 °C

Voltage also affects device speed: as voltage increases, drain current increases and delay decreases.

Typically characterize the device around a power supply tolerance.

Power Supply Voltage Tolerance

Commercial:   ±5%
Industrial:   ±10%
Military:     ±10%

Process variations also affect delay. Wafer fabrication is a long series of chemical operations; variations in diffusion depth, dopant densities, and oxide/diffusion geometry can cause transistor switching speeds to vary from wafer batch to wafer batch, wafer to wafer, and even on the same wafer.

Transistors are typically characterized as "fast", "nominal", and "slow". Need SPICE transistor models for these cases.

However, variations between nMOS speeds and pMOS speeds can be independent, so one can obtain a "four corners" model.

[Figure: four process corners: slow-nMOS/fast-pMOS, fast-nMOS/fast-pMOS, slow-nMOS/slow-pMOS, fast-nMOS/slow-pMOS.]

When characterizing for high speed, also want to use lowest temperature, highest voltage.

When characterizing for "slow" case, want highest temperature, lowest voltage.

CMOS Digital Systems Checks (Commercial)

PROCESS            TEMPERATURE   VOLTAGE        TESTS
Fast-n / fast-p    0 °C          5.5V (3.6V)    Power dissipation (DC), clock races
Slow-n / slow-p    125 °C        4.5V (3.0V)    Circuit speed, external setup and hold times
Slow-n / fast-p    0 °C          5.5V (3.6V)    Pseudo-nMOS noise margin, level shifters, memory write/read, ratioed circuits
Fast-n / slow-p    0 °C          5.5V (3.6V)    Memories, ratioed circuits, level shifters

Power Dissipation
Power Dissipation has three components:

1. Static
2. Dynamic
3. Short Circuit

For traditional CMOS design, static dissipation is limited to the leakage currents in the reverse-biased diodes formed between the substrate (or well) and source/drain regions. But in some DSM CMOS technology, subthreshold leakage also tends to contribute significant static dissipation. Subthreshold leakage increases exponentially as threshold voltage decreases; i.e., lower $V_T$ ($V_{Tn}$ and $|V_{Tp}|$) CMOS technology has more static power dissipation (due to subthreshold leakage) than higher $V_T$ technology.

Static power dissipation can be extremely small: 1 inverter @ 5 V dissipates 1 to 2 nanowatts of static power.

Dynamic Power is governed by

$$P_d = f_p\,C_L\,V_{DD}^2$$

This is the amount of power dissipated by charging/discharging internal capacitance and load capacitance.
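A short Python sketch of how dynamic power responds to frequency and supply voltage. The function name is hypothetical, and the baseline (50 MHz, 100 pF, 5 V) is simply a convenient operating point assumed for illustration.

```python
def dynamic_power(f_p_hz, c_load_f, v_dd):
    """Pd = fp * CL * VDD^2, in watts."""
    return f_p_hz * c_load_f * v_dd ** 2


base = dynamic_power(50e6, 100e-12, 5.0)        # assumed baseline
faster = dynamic_power(100e6, 100e-12, 5.0)     # double the frequency
lower_v = dynamic_power(50e6, 100e-12, 3.3)     # drop the supply to 3.3 V

print(f"baseline: {base * 1e3:.0f} mW")
print(f"2x frequency: {faster / base:.2f}x power")
print(f"3.3 V supply: {lower_v / base:.2f}x power")
```

Power scales linearly with frequency and load capacitance but quadratically with supply voltage, which is why lowering VDD is the most effective knob.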

Note the relations:
The higher the switching speed, the higher $P_d$.
The lower the voltage, the lower $P_d$.
The bigger the gates, the higher $P_d$.

To estimate $P_d$, need to know the switching frequencies of the internal signals.

Typically break this into two parts:

$$P_d = (P_d)_{\text{clock network}} + (P_d)_{\text{all the rest}}$$

The power dissipation in the clock network tends to dominate in most designs. Usually assume the switching frequency of logic signals is some fraction of the clock frequency. This fraction can be estimated by running some sample simulations and keeping switching statistics on internal nodes to build a probabilistic model of switching activity.

Logic synthesis techniques can be used to do one or more of the following:
a. minimize # of gates
b. maximize speed
c. minimize switching activity

There is also "short-circuit" power dissipation, proportional to the amount of time when both the p- and n-trees are conducting.

Slow rise/fall times on nodes can make this significant. It is usually ignored in most calculations.

Sizing Routing Calculation


The sizing of signal lines to achieve a particular RC delay was previously discussed.

For power conductors, need to worry about
1. Metal migration: too much current in too small a conductor will "blow" the conductor
2. Ground bounce: large current spikes in $V_{DD}$/GND leads can occur when simultaneous outputs switch

Two components to ground bounce:

a. $IR$: for on-chip conductors, R is the resistance of the on-chip conductor
b. $L\,\dfrac{di}{dt}$: L is the on-chip inductance and package inductance in the $V_{DD}$/GND pins. Package inductance dominates. Note that $\dfrac{di}{dt}$ is affected by slew rates on input/output pins.

Example: What would be the conductor width of power and ground wires to a 50 MHz clock buffer that drives 100 pF of on-chip load to satisfy the metal-migration consideration ($J_{AL}$ = 0.5 mA/µm)? What is the ground bounce with the chosen conductor size? The module is 500 µm from both the power and ground pads and the supply voltage is 5 volts.

1. $P = C\,V_{DD}^2\,f = 100 \times 10^{-12} \times 25 \times 50 \times 10^{6} = 125\ \text{mW}$

$I = P/V = 25\ \text{mA}$

Thus the width of the power and ground wires should be at least 50 µm. A good choice would be 100 µm.

2. $R = (500/100) \times 0.05\ \Omega/\text{sq.} = 5\ \text{squares} \times 0.05\ \Omega/\text{sq.} = 0.25\ \Omega$

$IR = 0.25 \times 25 \times 10^{-3} = 6.25\ \text{mV}$

Typically, the IR term of ground bounce is very small compared to the $L\,\dfrac{di}{dt}$ term.
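The sizing steps in this example can be collected into a short Python sketch. The function names are hypothetical; the numbers are those of the example (50 MHz, 100 pF, 5 V, 0.5 mA/µm, 500 µm run, 100 µm width, 0.05 Ω/sq.).

```python
def size_power_wire(f_hz, c_load_f, vdd, j_limit_ma_per_um):
    """Return (power in W, average current in A, minimum width in um)."""
    power_w = c_load_f * vdd ** 2 * f_hz
    current_a = power_w / vdd
    min_width_um = current_a * 1e3 / j_limit_ma_per_um
    return power_w, current_a, min_width_um


def ir_drop(length_um, width_um, r_sheet, current_a):
    """IR component of ground bounce along one supply wire (volts)."""
    squares = length_um / width_um
    return squares * r_sheet * current_a


power, current, min_width = size_power_wire(50e6, 100e-12, 5.0, 0.5)
drop = ir_drop(500.0, 100.0, 0.05, current)

print(f"P = {power * 1e3:.0f} mW, I = {current * 1e3:.0f} mA")
print(f"minimum width = {min_width:.0f} um, IR drop = {drop * 1e3:.2f} mV")
```

This covers only the IR term; the L di/dt term depends on package inductance and output slew rates, which are not given in the example.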