
B31DE2 Advanced Digital Electronics

Low Power Digital Design Tutorial Solutions

Dynamic Power
Q1. A 32-bit off-chip bus operating at 1.5 V and a 1 GHz clock rate is driving a capacitance
of 2.5 pF/bit. Each bit is estimated to have a toggling probability of 0.25 at each clock
cycle. What is the power dissipation in operating the bus?
Solution:
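Total capacitance,

$C = 32 \times 2.5\,\text{pF} = 80\,\text{pF}$

Power dissipation, with toggling probability $\alpha = 0.25$,

$P = 0.5\,\alpha C V^2 f$
$= 0.5 \times 0.25 \times (80 \times 10^{-12}) \times 1.5^2 \times 10^9$ W
$= 22.5$ mW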
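The arithmetic above can be checked with a short Python sketch. The function name `dynamic_power` and its parameter names are illustrative choices, not part of the tutorial.

```python
def dynamic_power(alpha, c_farads, vdd, freq_hz):
    """Average switching power P = 0.5 * alpha * C * V^2 * f, in watts."""
    return 0.5 * alpha * c_farads * vdd ** 2 * freq_hz

# 32 bus wires, each loaded with 2.5 pF
bus_capacitance = 32 * 2.5e-12
bus_power = dynamic_power(alpha=0.25, c_farads=bus_capacitance, vdd=1.5, freq_hz=1e9)
print(f"{bus_power * 1e3:.1f} mW")  # 22.5 mW
```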

Q2. The chip size of a CPU is 15 mm × 25 mm, with a clock frequency of 300 MHz, operating
at 3.3 V. The length of the clock routing is estimated to be twice the circumference of the
chip. Assume that the clock signal is routed on a metal layer with a width of 1.2 µm and the
parasitic capacitance of the metal layer is 1 fF/µm². What is the power dissipation of the
clock signal?
Solution:
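Total capacitance (routing length $= 2 \times 2(15+25)$ mm $= 4(15+25) \times 10^3$ µm),

$C = 4(15+25) \times 10^3 \times 1.2 \times 1\,\text{fF} = 192\,\text{pF}$

For the clock signal, the activity factor is

$\alpha = 2$

Power dissipation,

$P = 0.5\,\alpha C V^2 f$
$= 0.5 \times 2 \times (192 \times 10^{-12}) \times 3.3^2 \times (300 \times 10^6)$ W
$= 627$ mW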
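As a rough check of the geometry and the power figure, the calculation can be sketched in Python. The helper name `clock_wire_capacitance` and the constant `MM_TO_UM` are hypothetical, introduced only for this illustration.

```python
MM_TO_UM = 1e3  # micrometres per millimetre

def clock_wire_capacitance(width_mm, height_mm, wire_width_um, cap_ff_per_um2):
    """Capacitance in farads of a wire twice as long as the chip circumference."""
    length_um = 2 * 2 * (width_mm + height_mm) * MM_TO_UM
    area_um2 = length_um * wire_width_um
    return area_um2 * cap_ff_per_um2 * 1e-15  # fF -> F

c_clock = clock_wire_capacitance(15, 25, wire_width_um=1.2, cap_ff_per_um2=1.0)
# Activity factor of 2 for the clock, as used in the solution
p_clock = 0.5 * 2 * c_clock * 3.3 ** 2 * 300e6
print(f"C = {c_clock * 1e12:.0f} pF, P = {p_clock * 1e3:.0f} mW")  # C = 192 pF, P = 627 mW
```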

Short-Circuit and Leakage Power


Q3. (a) How will you design a CMOS circuit so that short-circuit current is completely
eliminated?
Answer: Short-circuit power is eliminated when $V_{DD} < V_{tn} + |V_{tp}|$, i.e., the supply
voltage is lower than the sum of the threshold voltage magnitudes for the n- and p-channel
MOSFETs.

(b) Qualitatively analyze the effect of reducing the transition (rise or fall) time of the
input waveform on the short-circuit power of a CMOS inverter.
Answer: As transition time of the input signal decreases, the interval during which
neither n nor p channel devices are in the cutoff state shortens. This is the interval during
which the short-circuit current flows. Thus, reduction in the input transition time will
reduce the short-circuit power.
(c) As the output load capacitance of a CMOS gate is increased, how does the output
transition time change? Does the short-circuit power dissipation increase or decrease?
Answer: Increasing the load capacitance increases the output charging time constant and
hence proportionately increases the output transition time. As a result, a larger proportion
of the total current between the supply and ground is used for charging and discharging
the load capacitance, and a smaller proportion flows to ground as short-circuit current.
Thus, short-circuit power is decreased.
Q4. A 65nm digital CMOS device is found to consume equal amounts (P) of dynamic
power and leakage power while the short-circuit power is negligible. The energy
consumed by a computing task, that takes T seconds, is 2PT. The following strategies are
being considered to reduce the power consumption. Determine the energy that the
computing task will consume in each case:
(a) Clock frequency is reduced to half, keeping all other parameters constant.
(b) Supply voltage is reduced to half. This slows the gates down and forces the clock
frequency to be lowered to half of its original (full voltage) value. Assume that
leakage current is held unchanged by modifying the design of transistors.
Solution:
(a) Reducing the clock frequency will reduce dynamic power to P/2, keep the static
power the same as P, and double the execution time of the task. Therefore, the
energy consumption will be,
Energy = (P/2 + P) × 2T = 3PT
(b) When the supply voltage and clock frequency are reduced to half their values,
dynamic power (proportional to V²f) is reduced to P/8 and static power (supply
voltage times the unchanged leakage current) to P/2. The task time is doubled
and the total energy consumption is,
Energy = (P/8 + P/2) × 2T = 5PT/4 = 1.25PT
The voltage reduction strategy reduces the energy consumption, while a simple
frequency reduction consumes more energy.
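The two cases can be compared with a small Python sketch. It assumes, as the solution does, that dynamic power scales as V²f, that leakage power scales as V at constant leakage current, and that run time scales as 1/f. The function `task_energy` and its arguments are illustrative names; energies are in units of PT.

```python
def task_energy(v_scale, f_scale, p_dyn=1.0, p_leak=1.0, t=1.0):
    """Energy of a fixed task after scaling supply voltage and clock frequency.

    Assumes dynamic power ~ V^2 * f, leakage power ~ V (leakage current held
    constant), and execution time ~ 1 / f.
    """
    dynamic = p_dyn * v_scale ** 2 * f_scale
    leakage = p_leak * v_scale
    return (dynamic + leakage) * t / f_scale

print(task_energy(1.0, 1.0))  # baseline: 2.0 (i.e. 2PT)
print(task_energy(1.0, 0.5))  # case (a): 3.0
print(task_energy(0.5, 0.5))  # case (b): 1.25
```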

Power Estimation
Q5. An adder circuit produces a 2-bit sum and a 1-bit carry for two 2-bit positive binary
integers. All four input bits are assumed to have equal and independent probabilities for 0
and 1. Compute input and output entropies. If the total capacitance in the circuit is C,
supply voltage is 2.5V and inputs are applied at the rate of 100 million per second,
compute a high-level estimate of power.
Solution:
There are four input lines and all values are equiprobable; therefore, the input entropy is

$H_i = 4$ bits

Considering two integers at the input, the output values are given below:

Adder output            First input integer
                        0     1     2     3
Second input      0     0     1     2     3
integer           1     1     2     3     4
                  2     2     3     4     5
                  3     3     4     5     6

We see that on the three output lines, only 7 combinations occur and their probabilities
are, P(0)=1/16, P(1)=2/16, P(2)=3/16, P(3)=4/16, P(4)=3/16, P(5)=2/16, and P(6)=1/16.
The output entropy is calculated as,
$H_o = -2(1/16)\log_2(1/16) - 2(2/16)\log_2(2/16) - 2(3/16)\log_2(3/16) - (4/16)\log_2(4/16)$
$= 0.5 + 0.75 + 0.9056 + 0.5 = 2.6556$ bits
Considering the given data, we need the average activity factor for the circuit. We use the
average entropy formula for an n-input, m-output circuit:

$\text{Average entropy} = \dfrac{2/3}{n+m}\,(H_i + 2H_o)$

where n = 4 and m = 3.
Because entropy is twice the node activity, to determine the fraction of the total
capacitance C that is switched every vector, we take half of the average entropy. The
power is estimated as:

$\text{Power} = \tfrac{1}{2}\,C V^2 f \times (\text{average activity})$
$= \tfrac{1}{2} \times C \times 2.5^2 \times (100 \times 10^6) \times (2/21) \times (4 + 2 \times 2.6556)/2$
$\approx 1.386 \times 10^8\,C$ watts, where C is the total chip capacitance in farads
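The entropy figures and the power estimate can be reproduced with a brief Python sketch. The factor of one half in front of CV²f is an assumption here, chosen because it reproduces the stated result of roughly 1.39 × 10^8 C watts. All variable names are illustrative.

```python
from collections import Counter
from math import log2

def entropy(probabilities):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in probabilities if p > 0)

# Tally how often each sum occurs over all 16 equiprobable input pairs
sum_counts = Counter(a + b for a in range(4) for b in range(4))
h_in = entropy([1 / 16] * 16)                          # 4 bits
h_out = entropy([c / 16 for c in sum_counts.values()])  # about 2.6556 bits

n, m = 4, 3
h_avg = (2 / 3) / (n + m) * (h_in + 2 * h_out)
activity = h_avg / 2            # entropy taken as twice the node activity

# Power per farad of total capacitance, assuming P = 0.5 * C * V^2 * f * activity
power_per_farad = 0.5 * 2.5 ** 2 * 100e6 * activity
print(f"Ho = {h_out:.4f} bits, P = {power_per_farad:.3e} * C watts")
```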

Q6. A microcontroller system operates with a system frequency of 100 Hz and repeats the
following sequence: A/D conversion, process data, real-time clock, idle. What is the
expected operation time if the system is supplied by a battery with 720 mAh
capacity?
Current consumption of the various components is:
Analogue interface circuit 2 mA, cycle operation time: 100 µs.
Microcontroller in active mode 400 µA, cycle processing time:
40 µs with probability 20%
80 µs with probability 50%
120 µs with probability 30%
Microcontroller in idle mode 1.6 µA.
Real-time clock 350 µA, cycle operation time: 100 µs.
LCD display 20 µA, always active.
Solution:
System clock frequency fs = 100 Hz, so cycle time T = 10 ms
Average microcontroller active time per cycle
Ta = 0.2 * 40 µs + 0.5 * 80 µs + 0.3 * 120 µs = 84 µs
Iavg = 2 mA * 100 µs / T (analogue interface circuit)
+ 400 µA * Ta / T (microcontroller, active mode)
+ 1.6 µA * (T - Ta) / T (microcontroller, idle mode)
+ 350 µA * 100 µs / T (RTC)
+ 20 µA (LCD)
Iavg = 20 µA + 3.36 µA + 1.58656 µA + 3.5 µA + 20 µA = 48.44656 µA
Expected operation time OT = 720 mAh / 48.44656 µA ≈ 14862 hours, or 619 days
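The duty-cycle weighting above can be checked with a short Python sketch. Currents are in µA and times in µs. The variable names are illustrative, and the calculation assumes an ideal battery that delivers its full rated capacity.

```python
CYCLE_US = 1e6 / 100  # 100 Hz system frequency -> 10,000 us per cycle

# Expected active processing time per cycle, weighted by probability
t_active_us = 0.2 * 40 + 0.5 * 80 + 0.3 * 120

i_avg_ua = (
    2000 * 100 / CYCLE_US                        # analogue interface, 2 mA for 100 us
    + 400 * t_active_us / CYCLE_US               # microcontroller, active mode
    + 1.6 * (CYCLE_US - t_active_us) / CYCLE_US  # microcontroller, idle mode
    + 350 * 100 / CYCLE_US                       # real-time clock
    + 20                                         # LCD, always on
)

hours = 720e3 / i_avg_ua  # 720 mAh expressed in uAh
print(f"Iavg = {i_avg_ua:.5f} uA, {hours:.0f} h, {hours / 24:.0f} days")
```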

Bus Encoding for Low Power


Q7. The 1-hot encoding is to be used for reducing the capacitive power consumption of
an n-bit data bus. All n bits are assumed to be independent and random. Derive a formula
for the ratio of power consumptions on the encoded and the un-coded buses. Show that
n ≥ 4 is essential for the 1-hot encoding to be beneficial. Reference: A. P. Chandrakasan
and R. W. Brodersen, Low Power Digital CMOS Design, Boston: Kluwer Academic
Publishers, 1995, pp. 224-225. [Hint: You should be able to solve this problem without
the help of the reference.]
Solution:
Un-coded bus: Two consecutive bits can be 00, 01, 10 and 11, each with a probability of
0.25. Considering only the 0→1 transition, which draws energy from the supply, the
probability of a data pattern consuming $CV^2$ energy on a wire is 1/4. Therefore, the average
per-pattern energy for all n wires of the bus is $CV^2 n/4$.
Encoded bus: The encoded bus contains $2^n$ wires. The 1-hot encoding ensures that whenever
there is a change in the data pattern, exactly one wire will have a 0→1 transition, charging
its capacitance and consuming $CV^2$ energy. There can be $2^n$ possible data patterns and
exactly one of these will match the previous pattern and consume no energy. Thus, the
per-pattern energy consumption of the bus is 0 with probability $2^{-n}$, and $CV^2$ with
probability $1 - 2^{-n}$. The average per-pattern energy for the 1-hot encoded bus is
$CV^2(1 - 2^{-n})$.
Power ratio = (encoded bus power) / (un-coded bus power)
$= 4(1 - 2^{-n})/n \approx 4/n$ for large n

For the encoding to be beneficial, the above power ratio should be less than 1. That is,
$4(1 - 2^{-n})/n < 1$, or $1 - 2^{-n} < n/4$, i.e., $n/4 \geq 1$ (approximately), so $n \geq 4$.
The following table shows the 1-hot encoded bus power ratio as a function of bus width:

n     4(1 − 2^−n)/n       n     4(1 − 2^−n)/n
1     2.0000              8     0.4980
2     1.5000              16    0.2500 ≈ 1/4
3     1.1667              32    ≈ 1/8
4     0.9375              64    ≈ 1/16
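The ratio can be tabulated with a few lines of Python as a check on the formula. The function name `one_hot_power_ratio` is an illustrative choice.

```python
def one_hot_power_ratio(n):
    """Encoded-to-un-coded bus power ratio, 4 * (1 - 2^-n) / n."""
    return 4 * (1 - 2 ** -n) / n

for n in (1, 2, 3, 4, 8, 16, 32, 64):
    ratio = one_hot_power_ratio(n)
    verdict = "beneficial" if ratio < 1 else "not beneficial"
    print(f"n = {n:2d}  ratio = {ratio:.4f}  {verdict}")
```

The ratio first drops below 1 at n = 4, in line with the derivation.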
Q8. A combinational one-bit adder cell with a carry input is fully tested by five vectors
000, 111, 001, 010 and 101.
(a) Give a procedure to re-sequence these vectors to minimize the peak power
consumption during test.
(b) How will you further reduce the average power consumption of the re-sequenced
test vectors?
Solution:
(a) The following graph shows the Hamming distances (bit changes) between vectors.
To have a shortest distance tour, we should enter and exit a node through unit
distance edges. That means the node should have two unit-distance edges. We notice
that 111 has only one unit-distance edge. So we can make it a start (or end) node. A
shortest distance tour, 111, 101, 001, 000, 010, which has a total of 4 transitions, is
shown on the graph.

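[Figure: Hamming-distance graph over the vectors 000, 001, 010, 101 and 111, with edges
labelled by distance, 111 marked as the start node, and the shortest-distance tour highlighted.]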
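For a set this small, the re-sequencing in part (a) can also be done by exhaustive search. The Python sketch below is one possible procedure, not necessarily the one intended by the tutorial. It uses the number of input bit changes between consecutive vectors as a proxy for power, and it minimises the largest single step first and the total second. The function names are illustrative.

```python
from itertools import permutations

def hamming(a, b):
    """Number of bit positions in which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def best_order(vectors):
    """Try every ordering; minimise the peak step first, then the total transitions."""
    def cost(order):
        steps = [hamming(a, b) for a, b in zip(order, order[1:])]
        return max(steps), sum(steps)
    best = min(permutations(vectors), key=cost)
    return best, cost(best)

order, (peak, total) = best_order(["000", "111", "001", "010", "101"])
print(order, "peak =", peak, "total =", total)
```

Exhaustive search grows factorially with the number of vectors, so it is only practical for very short test sets such as this one.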

(b) If we repeat the vectors, the average number of transitions per vector will decrease,
reducing the power consumption. This has the same effect as slowing down the clock.
If two consecutive vectors differ in several bits, intermediate vectors can be inserted
between them so that the transitions are made gradually. All of these methods increase
the test time. Duplicating vectors and slowing the clock are applicable
to sequential circuits as well.

Class example (for illustration, not examinable)


Q9. A 200MHz multiplier core has a rated supply voltage of 5V and consumes 15W.
The core is designed to work at reduced supply voltages. Its delay at a supply voltage
VDD with respect to the rated operation is specified as:
$\text{Relative delay} = 20.25 / (V_{DD} - 0.5)^2$
The multiplier core is to be integrated into a 200MHz data path chip but the allocated
power budget for the multiplier is only 5W. Devise a multi-core architecture for the
multiplier assuming 10% hardware overhead per duplicated core.
Solution: Power for a multi-core parallel architecture is given by:

$P_N = P_1 \times \dfrac{V_N^2}{5^2} \times [1 + 0.1(N-1)]$

where N is the degree of duplication and 0.1 is the hardware overhead for each
duplicated core. $P_1$ is given as 15 W. The clock frequency for the duplicated cores is
200/N MHz. For better synchronization of clock phases, we may restrict N to
integer values that divide 200, i.e., N = 2, 4, 5, 8, 10, 20, 25, 40, 50, 100. For an
N-fold increase in delay, the supply voltage is the VDD obtained from the given
equation,

$V_N = 0.5 + (20.25/N)^{1/2}$

The following table gives the supply voltages and power for different values of N.

N     Clock (MHz)     Supply V_N (volts)     Power (W)
1     200             5.00                   15.0
2     100             3.68                   8.94
4     50              2.75                   5.90
5     40              2.51                   5.29
8     25              2.09                   4.46
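The table entries can be regenerated with a short Python sketch of the two formulas. The function names and default arguments are illustrative. Because the sketch does not round the supply voltage before squaring it, the last digit of the power may differ slightly from the tabulated values.

```python
from math import sqrt

def supply_voltage(n):
    """VDD at which the relative delay 20.25 / (VDD - 0.5)^2 equals n."""
    return 0.5 + sqrt(20.25 / n)

def multicore_power(n, p1=15.0, v1=5.0, overhead=0.1):
    """Power of n parallel cores, each clocked at 1/n of the rated frequency."""
    vn = supply_voltage(n)
    return p1 * (vn / v1) ** 2 * (1 + overhead * (n - 1))

for n in (1, 2, 4, 5, 8):
    print(f"N = {n}  clock = {200 / n:.0f} MHz  "
          f"VN = {supply_voltage(n):.2f} V  P = {multicore_power(n):.2f} W")
```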

Thus an 8-core design meets the 5 W budget, while a 5-core design comes close but, at
5.29 W, slightly exceeds it. The architecture for a 5-core design is given below. An
8-core design can be constructed similarly.

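[Figure: 5-core multiplier architecture. The input feeds five registers (Reg), each followed
by a multiplier core (Core 1 to Core 5) clocked at 40 MHz. A 5-to-1 mux and an output
register (Reg) produce the output at 200 MHz. A multiphase clock generator and mux
control block is driven by the clock CK.]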
