Winter 2005
Power Consumption
http://vlsicad.ucsd.edu
Area / cost
Performance
Power consumption
Reliability
Manufacturing yield
Power Dissipation
Lead microprocessors' power continues to increase
[Figure: Power (Watts, log scale from 0.1 to 100) versus year (1971 to 2000) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium, and P6 processors. Courtesy, Intel]
Power Density
[Figure: Power density (log scale from 1 to 10000) versus year (1970 to 2010) for the 4004, 8008, 8080, 8085, 8086, 286, 386, 486, Pentium, and P6 processors, with hot plate, nuclear reactor, and rocket nozzle marked as reference levels. Courtesy, Intel]
Consumer products
Temperature
- Every 10 °C increase in operating temperature roughly doubles a component's failure rate
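The doubling rule above can be turned into a quick calculation. This is a minimal sketch that treats the rule of thumb as an exact exponential, which is an assumption; the function name and the 10-degree default are illustrative only.

```python
def failure_rate_multiplier(delta_t_celsius: float, doubling_step: float = 10.0) -> float:
    """Relative failure rate after a temperature rise, using the
    'doubles every doubling_step degrees' rule of thumb (assumed exact here)."""
    return 2.0 ** (delta_t_celsius / doubling_step)


if __name__ == "__main__":
    for rise in (10, 20, 30):
        print(f"+{rise} C -> failure rate x{failure_rate_multiplier(rise):.0f}")
```

Under this assumption a 30-degree rise corresponds to three doublings of the failure rate.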
Watts
Energy: $E = \int P \, dt$
Energy-delay products
Outline
Problem statement
Power dissipation components
Power estimation
Optimization techniques
Supply voltage: has been dropping with successive generations
Activity factor: how often, on average, do wires switch?
Clock frequency: increasing
[Figure: switch model of a gate with pull-up resistance RP, pull-down resistance RN, and load capacitance CL; transition time]
$I_D \propto \frac{W}{L} \exp\left(\frac{q (V_{GS} - V_T)}{n k T}\right)$
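To get a feel for the exponential dependence in the subthreshold current expression, the sketch below evaluates how much the current changes when $V_{GS} - V_T$ shifts by a given amount. The slope factor n = 1.5 and the 300 K temperature are assumed example values, not figures from the slides, and the function names are illustrative.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23
ELECTRON_CHARGE_C = 1.602176634e-19


def thermal_voltage(temp_k: float = 300.0) -> float:
    """kT/q in volts."""
    return BOLTZMANN_J_PER_K * temp_k / ELECTRON_CHARGE_C


def subthreshold_ratio(delta_v: float, n: float = 1.5, temp_k: float = 300.0) -> float:
    """Factor by which I_D changes when (V_GS - V_T) changes by delta_v volts.
    n = 1.5 is an assumed slope factor."""
    return math.exp(delta_v / (n * thermal_voltage(temp_k)))


def swing_mv_per_decade(n: float = 1.5, temp_k: float = 300.0) -> float:
    """Gate-voltage change (mV) needed for a 10x change in subthreshold current."""
    return n * thermal_voltage(temp_k) * math.log(10.0) * 1000.0


if __name__ == "__main__":
    print(f"swing: {swing_mv_per_decade():.1f} mV/decade")
    print(f"lowering V_T by 100 mV multiplies leakage by {subthreshold_ratio(0.1):.1f}")
```

The W/L prefactor cancels in the ratio, so only the exponential term is needed here.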
[Figure: subthreshold leakage current Isd,leak and 1/τ (GHz), both on log scales, versus year (2001 to 2015) for high-performance and low-power devices]
ECE 260B CSE 241A Power Consumption 14
[Figure: normalized gate oxide thickness Tox and normalized gate leakage Jgate versus year (2001 to 2016), compared with the Igate specification from ITRS. Annotation: oxy-nitride no longer adequate; high K needed]
Need for high K driven by Low Power, not High Performance
Dynamic power (~90% today and decreasing relatively)
Short-circuit power (~8% today and decreasing absolutely)
Leakage power (~2% today and increasing relatively)
Outline
Problem statement
Power dissipation components
Power estimation
Optimization techniques
[Figure: design flow from RTL through synthesis, logic optimization, and transistor optimization, with power analysis at each stage]
[Figure: transistor-level power analysis. Labels: circuit simulation, logic optimization, transistor optimization, current flows, power analysis]
Power Estimation
Dynamic Analysis
Simulation
Very accurate
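Power Ingredients
Dynamic dissipation: $P_{dyn} = C_L V_{DD} V_{sw} f_{0 \to 1}$
Short-circuit currents: $P_{sc} = V_{DD} I_{sc}$
Static dissipation
[Figure: CMOS inverter with supply VDD, input In, output Out, load capacitance CL, and short-circuit current ISC]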
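The two closed-form expressions above are easy to evaluate directly. The sketch below is illustrative only: the function names are hypothetical, the voltage swing defaults to the full supply when not given (an assumption), and the 50 fF, 1.2 V, and 100 MHz figures are made-up example values.

```python
from typing import Optional


def dynamic_power(c_load: float, v_dd: float, f_0_to_1: float,
                  v_swing: Optional[float] = None) -> float:
    """P_dyn = C_L * V_DD * V_sw * f_0to1 (watts).
    If v_swing is omitted, a full-rail swing (V_sw = V_DD) is assumed."""
    if v_swing is None:
        v_swing = v_dd
    return c_load * v_dd * v_swing * f_0_to_1


def short_circuit_power(v_dd: float, i_sc_avg: float) -> float:
    """P_sc = V_DD * I_sc, with I_sc taken as the average short-circuit current."""
    return v_dd * i_sc_avg


if __name__ == "__main__":
    # Example values are assumptions chosen for illustration.
    p_full = dynamic_power(c_load=50e-15, v_dd=1.2, f_0_to_1=100e6)
    p_half = dynamic_power(c_load=50e-15, v_dd=0.6, f_0_to_1=100e6)
    print(f"P_dyn at 1.2 V: {p_full * 1e6:.2f} uW")
    print(f"P_dyn at 0.6 V: {p_half * 1e6:.2f} uW")
```

With a full-rail swing the supply voltage enters quadratically, so halving it cuts the dynamic term to a quarter.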
$P = \frac{1}{T} \int_0^T i(t) \, v(t) \, dt$
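When a circuit simulator reports sampled current and voltage waveforms, the average-power integral can be approximated numerically. This is a minimal sketch using the trapezoidal rule on hypothetical sample lists; it is not tied to any particular simulator's output format.

```python
from typing import Sequence


def average_power(times: Sequence[float],
                  currents: Sequence[float],
                  voltages: Sequence[float]) -> float:
    """Trapezoidal estimate of (1/T) * integral of i(t) * v(t) dt
    over the sampled interval [times[0], times[-1]]."""
    if not (len(times) == len(currents) == len(voltages)):
        raise ValueError("times, currents, and voltages must have equal length")
    if len(times) < 2:
        raise ValueError("need at least two samples")
    energy = 0.0
    for k in range(1, len(times)):
        p_prev = currents[k - 1] * voltages[k - 1]
        p_curr = currents[k] * voltages[k]
        energy += 0.5 * (p_prev + p_curr) * (times[k] - times[k - 1])
    return energy / (times[-1] - times[0])


if __name__ == "__main__":
    # A triangular current pulse drawn from a constant 1 V supply (made-up data).
    print(average_power([0.0, 1.0, 2.0], [0.0, 2.0, 0.0], [1.0, 1.0, 1.0]))
```

The intermediate sum is the energy over the window, which connects this expression to the energy integral used elsewhere in the deck.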
Timing Simulation
[Figure: gate chain with input in and outputs out1, out2, out3; waveforms of the supply current i(Vdd) and of in, out1, out2, out3, with the Vdd-Vth level marked]
Switch-Level Simulation
Up to 3 orders of magnitude faster than circuit simulation
Accurate for dynamic power
[Figure: capacitance (fF/bit) per sample, IRSIM versus SPICE, for samples 10 to 60]
            Adder                  Shift Register
            % Error    Speedup     % Error    Speedup
Timing      6          15          7          3.7
Switch      27         60          4          22
PowerMill (Epic)
Star-ADM (Avant!)
LSIM Analyst (Mentor Graphics)
Mixed transistor/gate simulation
Series-parallel switch algorithm
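[Figure: gate-level power model of a cell between the supply and GND, showing the input transition, internal current IInt, switching current ISW, leakage current ILeak, and capacitance Ci]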
Static
Leakage Power (Ileak) [< 1%]
Sub-threshold leakage dominates; some is due to substrate leakage
[Figure: gate-level power analysis options. Labels: probabilistic analysis, simulation with integrated power analysis, transistor optimization, simulation, toggle rates, power analysis]
Problems:
Probabilistic Propagation
glitches?
Simulation
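Signal probability: $P_i = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} x_i(t) \, dt$
Signal activity: $A_i = \lim_{T \to \infty} \frac{n_i(T)}{T}$
where $x_i(t)$ is the logic value of node i and $n_i(T)$ is the number of transitions of node i in an interval of length T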
Normalized activity
f: clock frequency
$a_i = \frac{A_i}{f}$
Average power: $\frac{1}{2} V_{dd}^2 f \sum_{j \in \text{all nodes}} C_j a_j \propto \sum_{j \in \text{all nodes}} \text{fanout}_j \, a_j$
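As a small illustration of how normalized activities feed a power estimate, the sketch below sums capacitance times activity over a list of nodes. It assumes the common form $\frac{1}{2} V_{dd}^2 f \sum_j C_j a_j$; the function names and the node data are hypothetical.

```python
from typing import Iterable, Tuple


def normalized_activity(transitions_per_second: float, clock_hz: float) -> float:
    """a_i = A_i / f: average number of transitions per clock cycle."""
    return transitions_per_second / clock_hz


def average_power_from_activity(v_dd: float, clock_hz: float,
                                nodes: Iterable[Tuple[float, float]]) -> float:
    """0.5 * Vdd^2 * f * sum(C_j * a_j) over (capacitance, normalized activity) pairs."""
    return 0.5 * v_dd ** 2 * clock_hz * sum(c * a for c, a in nodes)


if __name__ == "__main__":
    # Made-up example: three nodes with capacitances in farads.
    nodes = [(10e-15, 0.5), (20e-15, 0.25), (5e-15, 1.0)]
    print(f"{average_power_from_activity(1.2, 200e6, nodes) * 1e6:.3f} uW")
```

Replacing each capacitance with the node's fanout count gives the cruder proportional estimate that needs no extracted parasitics.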
Probability Propagation
Let $y = f(x_1, \ldots, x_n)$ be a Boolean function with independent variables $x_i$. The signal probability of f can be obtained in linear time as follows.
$P(y) = P(x_1) \, P(f_{x_1}) + P(\bar{x}_1) \, P(f_{\bar{x}_1})$
where
$f_{x_1} = f(1, x_2, \ldots, x_n)$, $f_{\bar{x}_1} = f(0, x_2, \ldots, x_n)$
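The Shannon-expansion recurrence can be written as a short recursive routine. This is only an illustrative sketch: it represents f as a plain Python callable and expands every variable, so it visits all input combinations and is exponential in n. A linear-time bound would require a more compact function representation, which is not attempted here.

```python
from typing import Callable, Sequence, Tuple


def signal_probability(f: Callable[..., object], probs: Sequence[float]) -> float:
    """P(f = 1) for independent inputs, by recursive Shannon expansion:
    P(y) = P(x) * P(f with x=1) + (1 - P(x)) * P(f with x=0)."""

    def expand(assigned: Tuple[int, ...]) -> float:
        i = len(assigned)
        if i == len(probs):
            # All variables fixed: the cofactor is a constant 0 or 1.
            return 1.0 if f(*assigned) else 0.0
        p = probs[i]
        return p * expand(assigned + (1,)) + (1.0 - p) * expand(assigned + (0,))

    return expand(())


if __name__ == "__main__":
    print(signal_probability(lambda a, b: a and b, [0.5, 0.5]))
    print(signal_probability(lambda a, b: a or b, [0.5, 0.5]))
```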
Activity Propagation
Let $y = f(x_1, \ldots, x_n)$ be a Boolean function with independent variables $x_i$. The signal activity of f can be obtained in linear time as follows.
$A(y) = \sum_{i=1}^{n} P\left(\frac{\partial y}{\partial x_i}\right) A(x_i)$
where the Boolean difference is
$\frac{\partial y}{\partial x} = y|_{x=1} \oplus y|_{x=0}$
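The activity formula weights each input's activity by the probability that the output is sensitive to that input. The sketch below computes that probability by brute-force enumeration of the other inputs, which is exponential and meant only to illustrate the formula; names are hypothetical and inputs are assumed independent.

```python
from itertools import product
from typing import Callable, Sequence


def boolean_difference_probability(f: Callable[..., object],
                                   probs: Sequence[float], i: int) -> float:
    """P(dy/dx_i = 1): probability that flipping x_i flips the output."""
    others = [k for k in range(len(probs)) if k != i]
    total = 0.0
    for values in product((0, 1), repeat=len(others)):
        x = [0] * len(probs)
        weight = 1.0
        for k, v in zip(others, values):
            x[k] = v
            weight *= probs[k] if v else 1.0 - probs[k]
        x[i] = 1
        y_with_one = bool(f(*x))
        x[i] = 0
        y_with_zero = bool(f(*x))
        if y_with_one != y_with_zero:  # XOR of the two cofactors is 1
            total += weight
    return total


def output_activity(f: Callable[..., object], probs: Sequence[float],
                    activities: Sequence[float]) -> float:
    """A(y) = sum_i P(dy/dx_i) * A(x_i)."""
    return sum(boolean_difference_probability(f, probs, i) * activities[i]
               for i in range(len(probs)))


if __name__ == "__main__":
    print(output_activity(lambda a, b: a and b, [0.5, 0.5], [1.0, 2.0]))
```

For a 2-input AND gate the output is sensitive to one input exactly when the other input is 1, so each input's activity is scaled by the other input's signal probability.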
Probability Propagation
Propagate signal probabilities from inputs to outputs
AND gate: sp(1) = sp1 * sp2
tp(0→1) = sp * (1 - sp)
[Figure: example network with signal probability 1/2 at each input, 1/4 at the intermediate nodes, and 7/16 at the output]
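The numbers in the example can be reproduced with exact fractions. The network topology is not recoverable from the text, so the sketch assumes two 2-input AND gates with four independent inputs feeding an OR gate, which is one structure consistent with 1/2, 1/4, and 7/16. The OR rule is the standard independent-input formula rather than something stated on the slide.

```python
from fractions import Fraction


def and_sp(sp1: Fraction, sp2: Fraction) -> Fraction:
    """Signal probability of an AND gate with independent inputs."""
    return sp1 * sp2


def or_sp(sp1: Fraction, sp2: Fraction) -> Fraction:
    """Signal probability of an OR gate with independent inputs."""
    return 1 - (1 - sp1) * (1 - sp2)


def tp_0_to_1(sp: Fraction) -> Fraction:
    """Probability of a 0-to-1 transition, assuming temporally independent values."""
    return sp * (1 - sp)


half = Fraction(1, 2)
n1 = and_sp(half, half)   # assumed first AND gate
n2 = and_sp(half, half)   # assumed second AND gate, on separate inputs
out = or_sp(n1, n2)       # assumed OR gate combining them

if __name__ == "__main__":
    print(n1, n2, out, tp_0_to_1(out))  # prints: 1/4 1/4 7/16 63/256
```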
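Problem: reconvergent fan-out creates spatial correlation between signals
[Figure: circuit with reconvergent fan-out. Input probabilities 0.5 and 0.5, internal node 0.75; independent propagation gives 0.375 at the output X, but the correct value is 0.5]
$P(X) = P(B=1) \cdot P(X=1 \mid B=1)$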
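The discrepancy between 0.375 and 0.5 is easy to reproduce. The circuit itself is lost in extraction, so the sketch assumes X = (A OR B) AND B, a structure in which B fans out and reconverges and which matches the numbers 0.5, 0.75, 0.375, and 0.5. The exact value is obtained by enumerating all input combinations.

```python
from itertools import product
from typing import Callable, Sequence


def exact_probability(f: Callable[..., object], probs: Sequence[float]) -> float:
    """P(f = 1) by enumerating all input combinations of independent inputs."""
    total = 0.0
    for bits in product((0, 1), repeat=len(probs)):
        weight = 1.0
        for bit, p in zip(bits, probs):
            weight *= p if bit else 1.0 - p
        if f(*bits):
            total += weight
    return total


p_a = p_b = 0.5
p_or = 1.0 - (1.0 - p_a) * (1.0 - p_b)   # internal node A OR B
naive = p_or * p_b                        # wrongly treats (A OR B) and B as independent
exact = exact_probability(lambda a, b: (a or b) and b, [p_a, p_b])

if __name__ == "__main__":
    print(p_or, naive, exact)  # prints: 0.75 0.375 0.5
```

In this assumed circuit X reduces to B, so conditioning on B gives P(X) = P(B=1) * 1 = 0.5, in line with the conditional-probability expression above.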
Solution to Reconvergence
Preferred technique: OBDD
Other approaches: super-gates, computation of correlation coefficients
[Figure: OBDD over variables a, b, c, annotated with probabilities 0.125, 0.25, 0.375, 0.5, and 0.75]
Z = bc + abc
Symbolic Network
Transition Counters
Value of d at time t = 0
Probability Simulation
[Figure: probability waveform with values between 0.0 and 0.75 at time points t1, t2, t3]
[Figure: sequential circuit model with inputs I0 and It, present states PS0 and PSt, next-state logic, and combinational logic]
DesignPower (Synopsys)
PowerSim
Power_tool (Veritools): simulation based
WattWatcher Gate (Sente): simulation based
POET (Viewlogic): simulation based
Xpower (Genashor): asynchronous designs
Power Estimation
Simulation
Monte-Carlo technique
Hierarchical simulation
Architectural/gate/transistor-level
Statistical estimation
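A Monte-Carlo estimate can be sketched as repeated simulation of random input vectors until the confidence interval on the mean toggle rate is narrow enough. Everything below is an illustrative assumption: the circuit is a plain Python callable with one output, inputs are uniform random bits, and the stopping rule is a normal-approximation confidence interval over batch means.

```python
import math
import random
import statistics
from typing import Callable, Tuple


def monte_carlo_toggle_rate(circuit: Callable[..., object], n_inputs: int,
                            batch_size: int = 200, rel_error: float = 0.05,
                            z: float = 1.96, min_batches: int = 5,
                            max_batches: int = 500, seed: int = 0) -> Tuple[float, int]:
    """Estimate output toggles per cycle; returns (estimate, batches used)."""
    rng = random.Random(seed)

    def random_vector():
        return [rng.randint(0, 1) for _ in range(n_inputs)]

    batch_means = []
    prev_out = bool(circuit(*random_vector()))
    while len(batch_means) < max_batches:
        toggles = 0
        for _ in range(batch_size):
            out = bool(circuit(*random_vector()))
            toggles += int(out != prev_out)
            prev_out = out
        batch_means.append(toggles / batch_size)
        if len(batch_means) >= min_batches:
            mean = statistics.mean(batch_means)
            half_width = (z * statistics.stdev(batch_means)
                          / math.sqrt(len(batch_means)))
            # Stop once the interval half-width is within the relative error target.
            if mean > 0 and half_width <= rel_error * mean:
                break
    return statistics.mean(batch_means), len(batch_means)


if __name__ == "__main__":
    estimate, batches = monte_carlo_toggle_rate(lambda a, b: a ^ b, 2)
    print(f"toggle rate ~ {estimate:.3f} after {batches} batches")
```

Multiplying the estimated toggle rate by the node capacitance, the supply voltage squared, and the clock frequency (with the usual factor of one half) would turn it into a dynamic power figure.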
[Figure: RTL power estimation flow. Characterization path: RTL design, RTL planning/mapping, synthesis, P&R, post-layout netlist, structure (macro) netlist, power characterization. A power model library generator produces Powerlib.vhd, Powerlib.v, and Powerlib.c. The enhanced RTL is simulated with testbench stimuli to produce a power report and a power waveform/profile]
The number of input stimuli did not cause any error above 10%, provided at least 10 input patterns were considered
[Figure: percentages by abstraction level: behavioral 400%, RTL 50%, gate 20%, switch 10%]
Expectations
Algorithmic     Algorithm selection      orders of magnitude
Behavioral      Concurrency, memory      several times
Power manage    Clock ctrl               10-90%
RT level        Structural transform.    10-15%
Tech. indep.    Extr/decomp              15%
Tech. dep.      Tech. mapping            20%
                Gate sizing              20%
Layout          Placement                20%
Outline
Problem statement
Power dissipation components
Power estimation
Optimization techniques
Reducing Capacitance
Clock gating
Sleep transistors
Sleep transistors
Optimization modes:
Optimization Goals
Delay
Power
Slack
AMPS - Epic
[Figure: gate-level power optimization flow. Inputs: logic or gate netlist, switching activity, constraints (timing, power, area), technology library, and parasitics (capacitance). Steps: logic optimization and power optimization. Output: power-optimized gate-level netlist]
Factoring
Structuring
Buffer insertion/deletion
Don't-care optimization
Technology mapping
Sizing
Pin assignment
Factoring
Pa = 0.1
Pb = 0.5
Pc = 0.5
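To see how these input probabilities affect a factoring decision, the sketch below compares the summed 0-to-1 transition probabilities of two equivalent forms. The function is lost in extraction, so f = ab + ac versus f = a(b + c) is a hypothetical example chosen for illustration, with independent inputs, unit capacitance on every node, and the transition probability sp * (1 - sp).

```python
from fractions import Fraction


def tp(sp: Fraction) -> Fraction:
    """0-to-1 transition probability, assuming temporally independent values."""
    return sp * (1 - sp)


p_a, p_b, p_c = Fraction(1, 10), Fraction(1, 2), Fraction(1, 2)

# Hypothetical unfactored form: f = a*b + a*c (two AND gates feeding an OR).
sp_ab = p_a * p_b
sp_ac = p_a * p_c

# Hypothetical factored form: f = a*(b + c) (one OR gate feeding an AND).
sp_b_or_c = 1 - (1 - p_b) * (1 - p_c)

# Both forms compute the same function; a is independent of (b + c).
sp_f = p_a * sp_b_or_c

unfactored_activity = tp(sp_ab) + tp(sp_ac) + tp(sp_f)
factored_activity = tp(sp_b_or_c) + tp(sp_f)

if __name__ == "__main__":
    print("unfactored:", unfactored_activity)  # prints: unfactored: 263/1600
    print("factored:  ", factored_activity)    # prints: factored:   411/1600
```

Under these assumptions the factored form switches more, because the rarely active signal a no longer filters b and c at the first logic level. Fewer literals does not automatically mean lower power.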
Logic Restructuring
Technology Mapping
[Figure: technology mapping example with signals a and d; slack = 1]
Technology Mapping
Example: 6-input AND
Implemented using 6-input NAND, 3-input NAND, and 2-input NAND gates [Bellaouar, ElMasry]
Library 1: High-Speed
Library 2: Low-Area
            Area    Delay (ns)    Energy (fF)
6-input     9       1.1           6.7
3-input     11      0.86          42.5
2-input     13      0.83          89.4

            Library 1    Library 2
6-input     6.7          3.5
3-input     42.5         19.5
2-input     89.4         43.7
mostly ad hoc
Clock gating
Pre-computation
Clock gating
Pre-computation
Other options:
guarded evaluation
set output directly
Power Compiler
Results:
design dependent
library dependent
Loop unrolling
Retiming
Pipelining
Summary