Dynamic Power
Q1. A 32-bit off-chip bus operating at 1.5 V with a 1 GHz clock drives a capacitance
of 2.5 pF per bit. Each bit is estimated to toggle with probability 0.25 in each clock
cycle. What is the power dissipated in operating the bus?
Solution:
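Total capacitance,
$C = 32 \times 2.5\,\text{pF} = 80\,\text{pF}$
Power dissipation,
$P = 0.5\,\alpha C V^2 f$, where $\alpha$ is the toggling probability per clock cycle
$= 0.5 \times 0.25 \times (80 \times 10^{-12}) \times 1.5^2 \times 10^9\ \text{W}$
$= 22.5\,\text{mW}$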
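The arithmetic above can be checked with a short script. This is a minimal sketch: the function name `dynamic_power` and its parameter names are illustrative choices, not part of the original solution.

```python
def dynamic_power(alpha, c_total, vdd, freq):
    """Average dynamic power P = 0.5 * alpha * C * V^2 * f, in watts.

    alpha is the toggling probability per clock cycle (illustrative name).
    """
    return 0.5 * alpha * c_total * vdd ** 2 * freq


# Q1 values: 32 bits at 2.5 pF each, 1.5 V, 1 GHz, toggle probability 0.25
c_bus = 32 * 2.5e-12          # total bus capacitance in farads (80 pF)
p_bus = dynamic_power(0.25, c_bus, 1.5, 1e9)
print(f"{p_bus * 1e3:.1f} mW")   # prints 22.5 mW
```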
Q2. The chip size of a CPU is 15 mm × 25 mm, with a clock frequency of 300 MHz and
a 3.3 V supply. The length of the clock routing is estimated to be twice the circumference of the
chip. Assume that the clock signal is routed on a metal layer with a width of 1.2 µm and that the
parasitic capacitance of the metal layer is 1 fF/µm². What is the power dissipation of the
clock signal?
Solution:
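Total capacitance,
$C = 2 \times 2(15 + 25)\,\text{mm} \times 1.2\,\mu\text{m} \times 1\,\text{fF}/\mu\text{m}^2 = 192\,\text{pF}$
Power dissipation,
$P = 0.5\,\alpha C V^2 f$, with $\alpha = 2$ because the clock makes two transitions per cycle
$= 0.5 \times 2 \times (192 \times 10^{-12}) \times 3.3^2 \times (300 \times 10^6)\ \text{W}$
$= 627\,\text{mW}$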
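The same calculation can be written out step by step. This is a sketch under the stated problem data, with the wire geometry converted to micrometres; all variable names are illustrative, and the factor 2 is read here as two clock transitions per cycle.

```python
# Q2 values; variable names are illustrative
chip_w_um, chip_h_um = 15e3, 25e3               # chip sides in micrometres
wire_len_um = 2 * 2 * (chip_w_um + chip_h_um)   # twice the chip circumference
wire_area_um2 = wire_len_um * 1.2               # metal width of 1.2 um
c_clock = wire_area_um2 * 1e-15                 # 1 fF per um^2, in farads

vdd, freq = 3.3, 300e6
alpha = 2                                       # two clock transitions per cycle
p_clock = 0.5 * alpha * c_clock * vdd ** 2 * freq
print(f"C = {c_clock * 1e12:.0f} pF, P = {p_clock * 1e3:.0f} mW")
# prints C = 192 pF, P = 627 mW
```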
(b) Qualitatively analyze the effect of reducing the transition (rise or fall) time of the
input waveform on the short-circuit power of a CMOS inverter.
Answer: As the transition time of the input signal decreases, the interval during which
neither the n-channel nor the p-channel device is in cutoff shortens. This is the interval during
which the short-circuit current flows. Thus, reducing the input transition time
reduces the short-circuit power.
(c) As the output load capacitance of a CMOS gate is increased, how does the output
transition time change? Does the short-circuit power dissipation increase or decrease?
Answer: Increasing the load capacitance increases the output charging time constant and
hence proportionately increases the output transition time. As a result, a larger proportion
of the total current between the supply and ground is used for charging and discharging
the load capacitance, and a smaller proportion flows to ground as short-circuit current.
Thus, short-circuit power decreases.
Q4. A 65nm digital CMOS device is found to consume equal amounts (P) of dynamic
power and leakage power while the short-circuit power is negligible. The energy
consumed by a computing task, that takes T seconds, is 2PT. The following strategies are
being considered to reduce the power consumption. Determine the energy that the
computing task will consume in each case:
(a) Clock frequency is reduced to half, keeping all other parameters constant.
(b) Supply voltage is reduced to half. This slows the gates down and forces the clock
frequency to be lowered to half of its original (full voltage) value. Assume that
leakage current is held unchanged by modifying the design of transistors.
Solution:
(a) Reducing the clock frequency will reduce dynamic power to P/2, keep the static
power the same as P, and double the execution time of the task. Therefore, the
energy consumption will be,
Energy = (P/2 + P) × 2T = 3PT
(b) When the supply voltage and clock frequency are both halved, dynamic power,
which is proportional to $V^2 f$, is reduced to P/8. With the leakage current
unchanged, static power is proportional to the supply voltage and falls to P/2.
The task time is doubled and the total energy consumption is,
Energy = (P/8 + P/2) × 2T = 5PT/4 = 1.25PT
The voltage reduction strategy reduces the energy consumption while a simple
frequency reduction consumes more energy.
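The two cases can be reproduced with a small helper. This is a sketch, assuming dynamic power scales as V²f, leakage power scales as V (leakage current fixed), and task time scales as 1/f; the function `task_energy` and its scale-factor parameters are hypothetical names introduced for illustration.

```python
def task_energy(p_dyn, p_leak, time, f_scale=1.0, v_scale=1.0):
    """Energy of a fixed task after scaling clock frequency and supply voltage.

    Assumes dynamic power ~ V^2 * f, leakage power ~ V (leakage current fixed),
    and execution time ~ 1 / f.
    """
    dyn = p_dyn * v_scale ** 2 * f_scale
    leak = p_leak * v_scale
    return (dyn + leak) * time / f_scale


P, T = 1.0, 1.0   # normalised units, so results are multiples of PT
print(task_energy(P, P, T))                             # baseline: 2.0
print(task_energy(P, P, T, f_scale=0.5))                # case (a): 3.0
print(task_energy(P, P, T, f_scale=0.5, v_scale=0.5))   # case (b): 1.25
```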
Power Estimation
Q5. An adder circuit produces a 2-bit sum and a 1-bit carry for two 2-bit positive binary
integers. All four input bits are assumed to have equal and independent probabilities for 0
and 1. Compute input and output entropies. If the total capacitance in the circuit is C,
supply voltage is 2.5V and inputs are applied at the rate of 100 million per second,
compute a high-level estimate of power.
Solution:
There are four input lines and all values are equiprobable, therefore, input entropy is,
$H_i = 4$ bits
Considering two integers at the input, the output values are given below:

Adder output (rows: second input integer, columns: first input integer)
        0   1   2   3
    0   0   1   2   3
    1   1   2   3   4
    2   2   3   4   5
    3   3   4   5   6
We see that on the three output lines, only 7 combinations occur and their probabilities
are, P(0)=1/16, P(1)=2/16, P(2)=3/16, P(3)=4/16, P(4)=3/16, P(5)=2/16, and P(6)=1/16.
The output entropy is calculated as,
$H_o = -2(1/16)\log_2(1/16) - 2(2/16)\log_2(2/16) - 2(3/16)\log_2(3/16) - (4/16)\log_2(4/16)$
$= 0.5 + 0.75 + 0.9056 + 0.5 = 2.6556$ bits
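Both entropies can be confirmed by enumerating the 16 equiprobable operand pairs. This is a minimal sketch; the helper name `entropy` is an illustrative choice.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(probabilities):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


# All 16 equiprobable pairs of 2-bit operands
pairs = list(product(range(4), repeat=2))
h_in = entropy([1 / len(pairs)] * len(pairs))

# Distribution of the 3-bit result (2-bit sum plus carry), i.e. a + b
counts = Counter(a + b for a, b in pairs)
h_out = entropy([c / len(pairs) for c in counts.values()])

print(f"Hi = {h_in:.4f} bits, Ho = {h_out:.4f} bits")
# prints Hi = 4.0000 bits, Ho = 2.6556 bits
```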
Considering the given data, we need the average activity factor for the circuit. We use the
average activity formula for an n input and m output circuit:
$H_{avg} = \frac{2}{3} \cdot \frac{H_i + 2H_o}{n + m}$
where $n = 4$ and $m = 3$.
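As a numerical check, the average entropy, taken as two thirds of (Hi + 2Ho) divided by (n + m), can be evaluated with the entropies found above. This is a sketch; the function name `average_entropy` is illustrative.

```python
def average_entropy(h_in, h_out, n_inputs, m_outputs):
    """Average entropy estimate: (2/3) * (Hi + 2*Ho) / (n + m)."""
    return (2.0 / 3.0) * (h_in + 2.0 * h_out) / (n_inputs + m_outputs)


# Hi = 4 bits, Ho = 2.6556 bits, n = 4 inputs, m = 3 outputs
h_avg = average_entropy(4.0, 2.6556, 4, 3)
print(f"average entropy = {h_avg:.4f} bits")   # prints 0.8868 bits
```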
Because entropy is twice the node activity, to determine the fraction of the total
capacitance C that is switched every vector, we take half of the average entropy. The
power is estimated as:
Power =
Q6. A microcontroller system operates with a system frequency of 100 Hz and repeats the
following sequence: A/D conversion, process data, real-time clock, idle. What is the
expected operating time of the system when it is supplied from a battery of 720 mAh
capacity?
Current consumption of the various components is:
Analogue interface circuit 2 mA, cycle operation time - 100 µs.
Microcontroller in active mode 400 µA, cycle processing time:
40 µs with probability 20%
80 µs with probability 50%
120 µs with probability 30%
Microcontroller in idle mode 1.6 µA.
Real Time Clock 350 µA, cycle operation time - 100 µs.
LCD display 20 µA, always active.
Solution:
System clock frequency fs = 100 Hz → cycle time T = 10 ms
Average microcontroller cycle activity
Ta = 0.2*40 µs + 0.5*80 µs + 0.3*120 µs = 84 µs
Iavg = 2 mA * 100 µs / T (analog interface circuit)
+ 400 µA * Ta / T (microcontroller - active mode)
+ 1.6 µA * (T - Ta) / T (microcontroller - idle mode)
+ 350 µA * 100 µs / T (RTC)
+ 20 µA (LCD)
Iavg = 20 µA + 3.36 µA + 1.58656 µA + 3.5 µA + 20 µA = 48.44656 µA
Expected operation time OT = 720 mAh / 48.44656 µA ≈ 14862 hrs or 619 days
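The duty-cycle weighted average current and the resulting battery life can be recomputed as follows. This is a sketch using the problem data in SI units; all variable names are illustrative.

```python
# Q6 values; names are illustrative. Currents in amperes, times in seconds.
T = 1 / 100.0                        # 100 Hz system cycle, i.e. 10 ms

# Expected active processing time per cycle
t_active = 0.2 * 40e-6 + 0.5 * 80e-6 + 0.3 * 120e-6

i_avg = (
    2e-3 * 100e-6 / T                # analog interface circuit
    + 400e-6 * t_active / T          # microcontroller, active mode
    + 1.6e-6 * (T - t_active) / T    # microcontroller, idle mode
    + 350e-6 * 100e-6 / T            # real-time clock
    + 20e-6                          # LCD, always on
)

hours = 720e-3 / i_avg               # 720 mAh battery capacity
print(f"Iavg = {i_avg * 1e6:.5f} uA")
print(f"{hours:.0f} h, about {hours / 24:.0f} days")
```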
probability of a data pattern consuming $CV^2$ energy on a wire is 1/4. Therefore, the average
per-pattern energy for all n wires of the bus is $CV^2 n/4$.
Encoded bus: The encoded bus contains $2^n$ wires. The 1-hot encoding ensures that whenever
there is a change in the data pattern, exactly one wire will have a 0→1 transition, charging
its capacitance and consuming $CV^2$ energy. There can be $2^n$ possible data patterns and
exactly one of these will match the previous pattern and consume no energy. Thus, the
per-pattern energy consumption of the bus is 0 with probability $2^{-n}$, and $CV^2$ with
probability $1 - 2^{-n}$. The average per-pattern energy for the 1-hot encoded bus is
$CV^2(1 - 2^{-n})$.
Power ratio
$= \frac{CV^2(1 - 2^{-n})}{CV^2 n/4}$
$= \frac{4(1 - 2^{-n})}{n}$
For the encoding to be beneficial, the above power ratio should be less than 1. That is,
$4(1 - 2^{-n})/n < 1$, or $1 - 2^{-n} < n/4$, i.e., $n/4 \ge 1$ (approximately), so $n \ge 4$.
The following table shows the 1-hot encoded bus power ratio as a function of bus width:

n     $4(1 - 2^{-n})/n$      n     $4(1 - 2^{-n})/n$
1     2.0000                 8     0.4980
2     1.5000                 16    0.2500 ≈ 1/4
3     1.1667                 32    ≈ 1/8
4     0.9375                 64    ≈ 1/16
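The ratio of 1-hot encoded to unencoded bus energy, 4(1 − 2^(−n))/n, is easy to tabulate for any bus width. This is a minimal sketch; the function name `one_hot_power_ratio` is illustrative.

```python
def one_hot_power_ratio(n):
    """Ratio of 1-hot encoded to unencoded bus energy: 4 * (1 - 2^-n) / n."""
    return 4.0 * (1.0 - 2.0 ** -n) / n


for n in (1, 2, 3, 4, 8, 16, 32, 64):
    print(f"n = {n:2d}  ratio = {one_hot_power_ratio(n):.4f}")

# Smallest bus width for which the encoding reduces power
break_even = next(n for n in range(1, 65) if one_hot_power_ratio(n) < 1.0)
print("encoding helps from n =", break_even)   # prints 4
```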
Q8. A combinational one-bit adder cell with a carry input is fully tested by five vectors
000, 111, 001, 010 and 101.
(a) Give a procedure to re-sequence these vectors to minimize the peak power
consumption during test.
(b) How would you further reduce the average power consumption of the re-sequenced
test vectors?
Solution:
(a) The following graph shows the Hamming distances (bit changes) between vectors.
To have a shortest distance tour, we should enter and exit a node through unit
distance edges. That means the node should have two unit-distance edges. We notice
that 111 has only one unit-distance edge. So we can make it a start (or end) node. A
shortest distance tour, 111, 101, 001, 000, 010, which has a total of 4 transitions, is
shown on the graph.
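[Figure: graph with the five vectors 000, 001, 010, 101 and 111 as nodes and Hamming distances as edge weights; the tour starts at 111.]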
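With only five vectors, the shortest tour can also be found by exhaustive search over all orderings. This is a sketch of that brute-force check, not the graph procedure described in the solution; the helper names `hamming` and `tour_cost` are illustrative.

```python
from itertools import permutations

VECTORS = ["000", "111", "001", "010", "101"]


def hamming(a, b):
    """Number of bit positions in which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def tour_cost(order):
    """Total number of bit transitions along an ordered vector sequence."""
    return sum(hamming(a, b) for a, b in zip(order, order[1:]))


# Exhaustive search is feasible here: only 5! = 120 orderings
best = min(permutations(VECTORS), key=tour_cost)
print(best, tour_cost(best))

# The ordering given in the solution: total transitions and peak per step
solution = ("111", "101", "001", "000", "010")
peak = max(hamming(a, b) for a, b in zip(solution, solution[1:]))
print(tour_cost(solution), peak)   # prints 4 1
```

Several orderings reach the minimum of 4 transitions, so the ordering printed first need not be the one in the solution.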
(b) If we repeat the vectors, the average number of transitions per vector will decrease,
reducing the power consumption. This has the same effect as slowing down the clock.
If two consecutive vectors differ in several bits, additional vectors can be inserted
between them so that the transitions are made gradually. In all of these methods the test
time increases. Vector duplication and clock slowing are applicable
to sequential circuits as well.
$P_N = P_1 \cdot \frac{V_N^2}{5^2} \cdot [1 + 0.1(N-1)]$
where N is the degree of duplication and 0.1 is the hardware overhead for each
duplicated core. $P_1$ is given as 15 W. The clock frequency for the duplicated cores is
200/N MHz. For better synchronization of clock phases, we may have N assume
integer values that can divide 200, i.e., N = 2, 4, 5, 8, 10, 20, 25, 40, 50, 100. For an N
times increase in delay, the supply voltage is the $V_{DD}$ obtained from the given
equation,
$V_N = 0.5 + (20.25/N)^{1/2}$
The following table gives the supply voltages and power for different values of N.

N                   1      2      4      5      8
Clock (MHz)         200    100    50     40     25
Supply VN (Volts)   5.00   3.68   2.75   2.51   2.10
Power (W)           15.0   8.94   5.90   5.29   4.50
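The table rows can be regenerated from the two relations used in the solution: the supply voltage 0.5 + (20.25/N)^(1/2), and the single-core power of 15 W scaled by (V_N/5)² and by the overhead factor 1 + 0.1(N − 1). This is a sketch; the function and constant names are illustrative.

```python
from math import sqrt

P1 = 15.0        # single-core power in watts at 5 V, 200 MHz
V1 = 5.0         # single-core supply voltage in volts
F1_MHZ = 200.0   # single-core clock in MHz


def supply_voltage(n):
    """Supply voltage for an N-fold delay increase: 0.5 + sqrt(20.25 / N)."""
    return 0.5 + sqrt(20.25 / n)


def parallel_power(n):
    """Power of N duplicated cores with 10% overhead per extra core."""
    return P1 * (supply_voltage(n) / V1) ** 2 * (1 + 0.1 * (n - 1))


for n in (1, 2, 4, 5, 8):
    print(f"N={n}  clock={F1_MHZ / n:.0f} MHz  "
          f"V={supply_voltage(n):.2f} V  P={parallel_power(n):.2f} W")
```

The printed values can differ from the table in the last digit (for example, N = 8 gives about 2.09 V and 4.46 W), apparently because the table computes power from voltages already rounded to two decimals.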
Thus a design with 5 or 8 cores can be used. The architecture for a 5 core design is
given below. An eight core design can be constructed similarly.
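[Figure: 5-core multiplier architecture. The input feeds five cores (Core 1 to Core 5), each a register followed by a multiplier clocked at 40 MHz; a 5-to-1 mux drives the output register; a multiphase clock generator and mux control block is driven by the 200 MHz clock CK.]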