
CHAPTER 5

LOW POWER VLSI ARCHITECTURE USING GATE DIFFUSION INPUT (GDI) CELL

5.1 INTRODUCTION

Minimizing power consumption in digital systems involves optimization at all
levels of the design. This optimization covers the technology used to
implement the digital circuits, the circuit style, the topology and the
architecture. This chapter demonstrates the design of a Viterbi decoder using
the GDI cell. The GDI cell has a structure similar to that of a CMOS inverter,
with low complexity and low switching activity. The chapter is organized as
follows: Section 5.1 introduces the chapter and states the existing problem.
Section 5.2 discusses the performance analysis of GDI logic. Sections 5.3 and
5.4 elaborate the GDI implementation of the BMU, ACS and SMU of the Viterbi
decoder. The block diagram of the GDI based Viterbi decoder is provided in
Section 5.5. Sections 5.6 and 5.7 discuss the simulation results and a
performance comparison, and Section 5.8 concludes the chapter.

The literature survey shows that Mohammad K. Akbari et al. (2004) designed
the ACS unit with logic styles such as CMOS, pseudo nMOS and dynamic logic.
The switching activity in these logic styles is high, which leads to high
power dissipation. Dynamic logic styles are often a good choice for high
speed, but not for low power circuit implementations, because of their high
node activity and large clock loads. The limitation of the wave pipelined SRL
based Viterbi decoder detailed in Chapter 4 is that the SRL gate suffers from
high gate density. To reduce the power consumption and the gate density, the
Viterbi decoder is designed using GDI and its performance is compared with
that of CMOS.

The GDI cell is constructed from ordinary CMOS transistors, as shown in
Figure 5.1. The differences between a CMOS inverter and a GDI cell are: (1)
the GDI cell has three inputs and (2) the bulks of the nMOS and pMOS
transistors are connected to N and P respectively, so they can be arbitrarily
biased, in contrast with the CMOS inverter. The GDI cell has four terminals:
G (common gate input of the nMOS and pMOS transistors), P (outer diffusion
node of the pMOS transistor), N (outer diffusion node of the nMOS transistor)
and D (common diffusion node of both transistors). A salient feature of GDI
is its improved logic level swing and static power characteristics.

Figure 5.1 Basic GDI Cell



5.2 PERFORMANCE ANALYSIS OF GDI CELL

The GDI cell has n + 2 inputs, compared with n inputs for the equivalent CMOS
structure. Arkadiy Morgenshtein et al. (2002) analyzed the performance of GDI
in terms of noise margin, body effect, fan out and delay. In realizing the
function F1 = ā·b, the GDI cell has been found to behave like pMOS pass
transistor logic, in which the output settles at Vtp instead of 0. This is
the only case in which a logic degradation of one Vt takes place. To restore
the logic level, a GDI based buffer is added at the output. The GDI cell can
itself act as a buffer when P=1, performing its logic evaluation and also
restoring the logic level; this is a main advantage of the GDI cell. GDI
cells exhibit a Vt drop at the output only in some of the transitions. With
proper interconnection, several GDI cells can be connected in series or in
parallel without accumulating the voltage drop. To restore the logic swing,
an inverter is used at the output (which can also perform a logical
function). In this way, the Vt drop and the noise margin reduction are
localized. GDI also has potential advantages in terms of reliability:

- The lower voltage levels reduce the impact of crosstalk on neighboring
  wires.
- Building complex functions from multiple instances of the same GDI cell
  contributes to reduced variability.
- The smaller area and transistor count of GDI mean shorter interconnects and
  less crosstalk, which enables more efficient place and route.

The logic functions obtained for the different input combinations of the GDI
cell that are used in this design are listed in Table 5.1. Most of these
functions are complex (6–12 transistors) when implemented in CMOS or in
standard pass transistor logic, but they are very simple (only two
transistors per function) in the GDI design method.

Table 5.1 Various Logic Functions for Different Input Combinations of GDI Cell

N = outer diffusion node of the nMOS transistor
P = outer diffusion node of the pMOS transistor
G = common gate input of the nMOS and pMOS transistors

N      P      G      Output function (D)
0      b      a      ā·b
1      b      a      a + b
b      0      a      a·b
c      b      a      ā·b + a·c
0      1      a      ā
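The rows of Table 5.1 can be checked with a short behavioral model. The
sketch below is an idealization, not a transistor level description: it
assumes that G=0 turns the pMOS transistor on and passes P to D, that G=1
turns the nMOS transistor on and passes N to D, and it ignores the Vt drop
discussed in Section 5.2. All function names are illustrative and do not come
from the original design.

```python
def gdi_cell(g: int, p: int, n: int) -> int:
    """Idealized GDI cell: D follows P when G=0 and follows N when G=1.

    Assumption: full-swing logic levels, no threshold-voltage drop.
    """
    return n if g else p


# One helper per row of Table 5.1 (names are hypothetical).
def gdi_f1(a: int, b: int) -> int:
    return gdi_cell(g=a, p=b, n=0)      # (not a) and b


def gdi_or(a: int, b: int) -> int:
    return gdi_cell(g=a, p=b, n=1)      # a or b


def gdi_and(a: int, b: int) -> int:
    return gdi_cell(g=a, p=0, n=b)      # a and b


def gdi_mux(a: int, b: int, c: int) -> int:
    return gdi_cell(g=a, p=b, n=c)      # (not a) and b, or a and c


def gdi_not(a: int) -> int:
    return gdi_cell(g=a, p=1, n=0)      # not a
```

Each function corresponds to a single two-transistor cell, which is the
source of the transistor count advantage quoted in the text.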

5.3 DESIGN OF BMU USING GDI

The schematic of the BMU using GDI is given in Figure 5.2. The BMU consists
of an XOR gate and a counter that counts the number of differing bits between
the bits received from the channel and the expected bits.
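Functionally, the XOR gate followed by a counter computes a Hamming distance.
The sketch below shows that behavior only; it is not the SPICE level design,
and the function name and list based interface are assumptions made for
illustration.

```python
def branch_metric(received: list[int], expected: list[int]) -> int:
    """Count the bit positions in which the received and expected bits differ.

    The XOR marks each differing bit; summing the marks plays the role of
    the counter in the BMU.
    """
    if len(received) != len(expected):
        raise ValueError("received and expected must have the same length")
    return sum(r ^ e for r, e in zip(received, expected))
```

For example, `branch_metric([1, 0], [1, 1])` gives 1, because only the second
bit differs.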

Figure 5.2 SPICE Schematic of BMU Using GDI



The GDI cell based XOR gate is shown in Figure 5.3, in which a and b are the
inputs and x is the output. When a=0 and b=1, the pMOS transistor turns on
and drives its output high, while the nMOS transistor connected to input a is
off. With b=1, the nMOS transistor driven by b is on and passes the high
output of the first stage to x, so x=1. This x acts as the clk signal for the
counter. The 3-bit counter is constructed by cascading T flip-flops (TFFs):
the output of one flip-flop is fed as the clock input to the next, and the t
input of every flip-flop is held high. Each TFF has t, clk, clr and preset as
inputs and q1 and q2 as outputs.
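The cascaded TFF arrangement can be sketched behaviorally as a ripple
counter. Several details below are assumptions, since the text does not fix
them: each flip-flop is taken to toggle on a rising clock edge, the next
stage is taken to be clocked from the inverted output q2 so that the chain
counts upward, and the clr and preset inputs are omitted. The class and
function names are illustrative.

```python
class TFlipFlop:
    """Behavioral T flip-flop: toggles q1 on a rising clock edge when t=1."""

    def __init__(self, initial_clk: int = 0) -> None:
        self.q1 = 0
        self._last_clk = initial_clk

    @property
    def q2(self) -> int:
        # Inverted output.
        return self.q1 ^ 1

    def clock(self, clk: int, t: int = 1) -> None:
        rising = self._last_clk == 0 and clk == 1
        if rising and t == 1:
            self.q1 ^= 1
        self._last_clk = clk


class RippleCounter:
    """Cascade of T flip-flops; stage i+1 is clocked by q2 of stage i (assumed)."""

    def __init__(self, bits: int = 3) -> None:
        # With every q1 cleared, every q2 is 1, so the later stages start
        # with their clock input already high.
        self.stages = [TFlipFlop(initial_clk=0)] + [
            TFlipFlop(initial_clk=1) for _ in range(bits - 1)
        ]

    def drive(self, x: int) -> None:
        signal = x
        for stage in self.stages:
            stage.clock(signal)   # t is held high for every stage
            signal = stage.q2

    @property
    def value(self) -> int:
        return sum(stage.q1 << i for i, stage in enumerate(self.stages))


def count_pulses(levels: list[int]) -> int:
    """Feed a sequence of x levels to a 3-bit counter and return its count."""
    counter = RippleCounter(bits=3)
    for level in levels:
        counter.drive(level)
    return counter.value
```

With three stages the count wraps after eight pulses, so this model only
represents metrics from 0 to 7.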

Figure 5.3 GDI Cell Based XOR Gate

The GDI cell based TFF is shown in Figure 5.4. For t=1, the nMOS transistor
is on and the low output is fed to the next stage. The t input is also fed
directly to another inverter, and the output of each stage is carried on to
the next stage. The flip-flop produces its output at q1 and the inverted
output at q2.

Figure 5.4 GDI Cell Based T FF

On a high-to-low transition of the clock, q1 holds the value of the previous
state. When the output of the XOR gate goes high, the counter counts these
high output values. The resulting count is the output of the BMU.

5.4 DESIGN OF ACS UNIT USING GDI

The adder unit proposed in the design consists of two Full Adders (FA) and
one Half Adder (HA). The schematic of the adder unit using GDI is illustrated
in Figure 5.5.

Figure 5.5 Schematic of Adder Unit Using GDI

The circuit diagram of the half adder designed using GDI is given in
Figure 5.6. The inputs of the half adder are a and b; its outputs are the sum
s and the carry c.

Figure 5.6 Circuit Diagram for Half Adder Using GDI

When a=0 and b=1, the pMOS transistor is on and its high output is fed to the
nMOS transistor that has b as its gate input. Because b=1, this nMOS
transistor is on and the corresponding pMOS transistor is off, so the high
input to the nMOS transistor makes the output s equal to 1. Since a=0, the
pMOS transistor of the third stage turns on and the output c becomes 0,
because the diffusion input of this pMOS transistor is grounded. The full
adder circuit is designed in the same way.
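The logical behavior of an adder unit built from one half adder and two full
adders can be sketched as a 3-bit ripple carry adder. The arrangement below
(half adder on the least significant bit, full adders on the two upper bits,
carry rippling upward) is an assumption consistent with the component count
given above, not a transcription of Figure 5.5; the function names are
illustrative.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Return (sum, carry) for two input bits and a carry-in."""
    partial_sum, carry1 = half_adder(a, b)
    total, carry2 = half_adder(partial_sum, carry_in)
    return total, carry1 | carry2


def adder_unit(a_bits: list[int], b_bits: list[int]) -> tuple[list[int], int]:
    """Add two 3-bit words given LSB first as [x0, x1, x2].

    Returns the sum bits [s0, s1, s2] and the final carry.
    """
    s0, c0 = half_adder(a_bits[0], b_bits[0])
    s1, c1 = full_adder(a_bits[1], b_bits[1], c0)
    s2, c2 = full_adder(a_bits[2], b_bits[2], c1)
    return [s0, s1, s2], c2
```

The half adder case described in the text, a=0 and b=1, corresponds to
`half_adder(0, 1)`, which returns sum 1 and carry 0.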

The comparator compares all the possible BMU values and provides the lowest
PM value. It consists of inverters, NAND gates and NOR gates. Figure 5.7
shows the internal circuit of the 4-bit magnitude comparator using GDI.

Figure 5.7 Internal Circuit of 4-Bit Magnitude Comparator Using GDI

The two 4-bit inputs of the comparator are a0, a1, a2, a3 and b0, b1, b2, b3.
These inputs are first fed to the inverters and then processed by the AND and
NAND gates. Finally, the output is obtained through the NOR gates. The
comparator design keeps the number of transistors to a minimum. The selector
unit selects the lowest PM value using four 2:1 multiplexers.
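The compare and select steps can be sketched behaviorally as follows. The
comparator is modeled as a plain integer comparison rather than gate by gate,
and the select polarity (sel=1 chooses the b input) and the handling of equal
values (a is kept) are assumptions made for illustration; the text does not
specify either.

```python
def mux2(sel: int, d0: int, d1: int) -> int:
    """2:1 multiplexer: returns d0 when sel=0 and d1 when sel=1."""
    return d1 if sel else d0


def compare_select(pm_a: int, pm_b: int, width: int = 4) -> tuple[int, int]:
    """Return (smaller path metric, select bit) for two unsigned values.

    One 2:1 multiplexer is used per output bit, so width=4 corresponds to
    the four multiplexers of the selector unit.
    """
    for pm in (pm_a, pm_b):
        if not 0 <= pm < (1 << width):
            raise ValueError("path metric does not fit in the given width")
    sel = int(pm_b < pm_a)  # behavioral stand-in for the magnitude comparator
    bits = [mux2(sel, (pm_a >> i) & 1, (pm_b >> i) & 1) for i in range(width)]
    selected = sum(bit << i for i, bit in enumerate(bits))
    return selected, sel
```

For example, `compare_select(0b1110, 0b1000)` returns `(0b1000, 1)`: the b
input is smaller, so every multiplexer passes its b bit.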

The design of the SMU using shift registers is given in Figure 5.8. The shift
registers store the decision bits. The 4-bit output of the selector unit is
the input of the SMU. The SMU comprises a 4x4 shift register constructed from
D flip-flops (DFFs), each of which stores one bit.
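A behavioral sketch of such a register is given below. It reads "4x4" as four
stages of four bits each, which is an assumption; the class name and its
interface are illustrative and do not reflect the SPICE schematic.

```python
from collections import deque


class SurvivorMemory:
    """Shift register of `depth` stages, each holding a `width`-bit word."""

    def __init__(self, depth: int = 4, width: int = 4) -> None:
        self.width = width
        # stages[0] is the newest word, stages[-1] the oldest.
        self.stages = deque([0] * depth, maxlen=depth)

    def clock(self, word: int) -> int:
        """Shift in a new word and return the word that is shifted out."""
        if not 0 <= word < (1 << self.width):
            raise ValueError("word does not fit in the register width")
        oldest = self.stages[-1]
        self.stages.appendleft(word)  # maxlen discards the oldest stage
        return oldest
```

After four clock cycles the first word written reaches the last stage and is
returned on the following cycle.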

Figure 5.8 Design of SMU Using Shift Registers (SPICE)

The shift registers are constructed from DFFs, and the design of the GDI
based DFF is shown in Figure 5.9. The DFF works as follows. When the clock
signal clk=0, the pMOS transistor connected to clk is on. When d=1, the high
level is applied to the first nMOS and pMOS transistor pair and turns on the
nMOS transistor. The resulting low output is fed to the next inverter stage.
Since clk=0 and dbar=0, this output is fed as input to the third stage
transistors, where it turns on the pMOS transistor, and the high output is
fed to the subsequent stages. The q1 output is therefore obtained in the high
state (q1=1).

Figure 5.9 GDI Based D FF

5.5 BLOCK DIAGRAM OF VITERBI DECODER DESIGNED USING GDI

Figure 5.10 shows the block diagram of the Viterbi decoder, designed using
GDI and CMOS logic and integrating the BMU, ACSU and SMU. The BMU calculates
the BM between the expected sequence (the original random input, a and c) and
the received sequence (the input with introduced errors, b and d). The inputs
are processed and the path with the minimum PM value is decoded. Analysis of
the GDI based Viterbi decoder shows that its power consumption arises from
the input transitions rather than solely from the supply voltage. This is not
the case for CMOS, where every gate is powered from Vdd.

Figure 5.10 Block Diagram of Viterbi Decoder Using GDI and CMOS Logics

5.6 SIMULATION RESULTS OF VITERBI DECODER BASED ON GDI

The Viterbi decoder is designed using GDI and CMOS logic. The results are
verified using the Tanner tool T-SPICE (Tanner Simulation Program with
Integrated Circuit Emphasis) in 0.25 µm technology, with Vdd = 2.5 V and at a
frequency of 25 MHz. Since the GDI cell is a transistor level implementation,
T-SPICE is preferred for designing all the blocks and for obtaining the
simulation results.

5.6.1 Output Waveform of BMU

The XOR gate designed using the GDI cell has a and b as its inputs and x
(which serves as clk) as its output. The output waveform of the BMU is shown
in Figure 5.11.

Figure 5.11 Output Waveform of BMU

5.6.2 Output Waveform of ACS


5.6.2.1 Output Waveform of Adder

The 3-bit output of the BMU is given as input to the adder unit. The inputs
to the adder are a0, a1, a2 and b0, b1, b2 respectively. The outputs are sum1
(s0), sum2 (s1) and sum3 (s2). The input and output waveforms of the adder
unit are given in Figure 5.12.

Figure 5.12 Input and Output Waveforms of Adder Unit

5.6.2.2 Output Waveform of Selector Unit

The selector unit is built from four 2:1 multiplexers. The output of the
comparator unit drives the select line of the multiplexers. The output
waveform of the selector unit is shown in Figure 5.13. When the inputs to the
selector unit are a0=0, a1=1, a2=1, a3=1, b0=0, b1=0, b2=0, b3=1 and
sel (s)=11, the outputs are F0=0, F1=0, F2=1, F3=0.

Figure 5.13 Output Waveform of Selector Unit

5.6.3 Simulation Result of Viterbi Decoder Using GDI

The decoding algorithm is discussed by means of a trellis diagram which has
two input paths. A waveform showing all the signals cannot be viewed clearly,
so only the input and output waveforms of the Viterbi decoder using GDI are
presented in Figure 5.14. The input to the decoder, a and c, is 11 00 11 10
(8 bits). The received sequence, b and d, is the original signal with random
errors introduced. The decoder decodes the received sequence and its output
is 0011.

Figure 5.14 Input and Output Waveforms of Viterbi Decoder Using GDI

5.7 PERFORMANCE COMPARISON WITH CMOS AND OTHER CIRCUIT STYLES

For comparison, the Viterbi decoder is also designed using CMOS logic and its
performance is verified for the same input combinations. The power, area and
delay of the CMOS and GDI designs are compared in Table 5.2. At a frequency
of 25 MHz, the average power consumption of the GDI based Viterbi decoder is
29% lower than that of the CMOS circuit. The area of the decoder, in terms of
transistor count, is reduced by 66% compared with CMOS logic. The output is
checked over frequencies from 5 MHz to 25 MHz; it is obtained correctly, and
at 25 MHz there are no glitches in the transitions. The delay of both CMOS
and GDI is observed at this frequency, and the GDI delay is found to be lower
by a factor of 1.26.

Table 5.2 Comparison of Power, Area and Delay of CMOS and GDI

                                              Viterbi decoder
Parameter                                     CMOS       GDI
Transistor count                              1682       568
Power consumption (mW), K = 3                 49.56      34.12
Power consumption (mW), K = 4                 49.68      34.93
Power consumption (mW), K = 5                 49.74      35.47
Power consumption (mW), K = 6                 49.93      35.85
Power consumption (mW), K = 7                 52.31      36.93
Average power consumption (mW)                50.24      35.46
Frequency of operation (MHz)                  25         25
Delay (ns)                                    4          3.16

K denotes the constraint length.
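The percentages quoted for Table 5.2 follow directly from its entries. The
short calculation below reproduces them as a check; the variable names are
illustrative, and the numbers are copied from the table.

```python
# Power consumption in mW, indexed by constraint length K (from Table 5.2).
cmos_power = {3: 49.56, 4: 49.68, 5: 49.74, 6: 49.93, 7: 52.31}
gdi_power = {3: 34.12, 4: 34.93, 5: 35.47, 6: 35.85, 7: 36.93}

avg_cmos = sum(cmos_power.values()) / len(cmos_power)
avg_gdi = sum(gdi_power.values()) / len(gdi_power)

# Relative savings of GDI with respect to CMOS, in percent.
power_reduction = (avg_cmos - avg_gdi) / avg_cmos * 100
transistor_reduction = (1682 - 568) / 1682 * 100

# Ratio of CMOS delay to GDI delay (both in ns).
delay_ratio = 4 / 3.16

print(f"average power: CMOS {avg_cmos:.2f} mW, GDI {avg_gdi:.2f} mW")
print(f"power reduction: {power_reduction:.1f} %")
print(f"transistor count reduction: {transistor_reduction:.1f} %")
print(f"delay ratio: {delay_ratio:.3f}")
```

The averages agree with the table (50.24 mW and 35.46 mW), the reductions
come out at roughly 29% and 66%, and the delay ratio is about 1.266, which
the text quotes truncated to 1.26.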

The transistor count and power consumption of different logic styles
(Mohammad K. Akbari et al. 2004) are compared in Figure 5.15. The results
confirm that GDI uses about 50% to 60% fewer transistors than the other logic
styles. It also has the lowest average power consumption, which is an
advantage over the earlier designs.

Figure 5.15 Transistor Count and Power Consumption Comparison of Different
Logic Styles

5.8 CONCLUSION

The chapter focused on the design of a low power, high performance Viterbi
decoder using GDI and CMOS logic styles. It also presented an area efficient
approach to low power design, as GDI requires fewer transistors than CMOS and
other circuit styles. The method has thus proved to be very efficient in
terms of fabrication of the Viterbi decoder.
