Vedic Multiplier Using Energy Recovery PDF

Low Power Vedic Multiplier Using Energy Recovery Logic

Hardik Sangani (1), Tanay M. Modi (2), V.S. Kanchana Bhaaskaran (3)
(1, 2, 3) School of Electronics Engineering, VIT University, Chennai-600127, Tamil Nadu, India.
(1) sangani.hardik2013@vit.ac.in, (2) tanay.mayankbhai2013@vit.ac.in, (3) kanchana.vs@vit.ac.in

Abstract: The multiplier is one of the primary hardware blocks in modern digital signal processing (DSP)
and communication systems. It is used extensively in DSP and image processing applications such as the
Fast Fourier Transform (FFT), convolution, correlation and filtering, and in the ALUs of microprocessors.
High speed, low area and power efficiency of the multiplier therefore remain critical factors for the
overall system. This paper presents a high performance and energy efficient implementation of the binary
multiplier. The design is based on the ancient Indian Vedic multiplication process and on low power
energy recovery (also known as adiabatic) logic. The generation of partial sums and products in a single
step in the Vedic approach and the energy recovery capability of adiabatic logic together realize high
speed and low power operation of the design. A 16x16 Vedic multiplier and a conventional array multiplier
based on Differential Cascode Pre-resolve Adiabatic Logic (DCPAL) are proposed in the paper. Simulation
results validate the design, which consumes 87.21 percent less power than the equivalent standard CMOS
design.

Keywords: Vedic multiplier; energy recovery; DCPAL; adiabatic logic

I. INTRODUCTION

A plethora of multiplication algorithms have been proposed in the literature. Of these, tree and array
architecture based multipliers are the two major types employed [1]. The Baugh-Wooley array multiplier is
suitable for operands with few bits. For larger operand lengths, the Booth algorithm based multiplier
provides high speed, although it incurs higher power dissipation. The Wallace tree multiplier with
modified Booth encoding is one of the faster multipliers presented in the literature. However, it has a
large number of interconnects, resulting in increased chip area and significantly higher power
dissipation. These factors make it unsuitable for low power and high frequency applications [2] [3]. Fast
multipliers based on the Urdhva Tiryagbhyam sutra of Vedic computation techniques, adopted from the
ancient Indian Vedas, are found to achieve high speed and reduced power dissipation compared to the
Baugh-Wooley multiplier [4] [5]. This property makes them effective for implementing high performance
signal processing applications.

The energy recovery or adiabatic switching technique is a promising approach for low-power operation. The
circuit reclaims all or part of the energy stored in the nodal capacitances and recovers that energy with
the help of the power-clock [6]. The power-clock is so called because it powers the circuit while also
taking care of the timing across the pipelined circuit. In this paper, we employ Differential Cascode
Pre-resolve Adiabatic Logic (DCPAL) for implementing the design [7]. DCPAL provides enhanced energy
recovery and reduced non-adiabatic loss, because there is no direct path between the supply rails. The
logic style features reduced latency and a lower transistor count. Furthermore, the logic incurs reduced
output nodal capacitances, leading to an increased operating frequency range. The use of a four-phase
power-clock gives better speed performance. These features make DCPAL a very attractive logic style for
energy efficient operation even at higher frequencies and lower technology nodes.

The paper is organized as follows: Section II explains Vedic mathematics and the sutra used in the paper.
Section III describes the circuit implementation of the Vedic multiplier. Section IV discusses the
implementation of the proposed low power Vedic multiplier based on DCPAL. Section V presents the
simulation results, and Section VI concludes the comparison.

II. VEDIC MATHEMATICS

Vedic Mathematics is based on the 16 sutras (algorithms) and 16 upa-sutras (sub-algorithms) derived by
Shree Bharti Krishna Tirtha Ji Maharaj after his broad research on the ancient Indian holy text, the
Atharva Veda [8]. These sutras provide a very effective approach to practical mathematical problems
across a wide range of areas such as trigonometry, algebra, geometry and arithmetic, to name a few. This
paper uses the Urdhva Tiryagbhyam sutra of ancient Vedic mathematics for designing and validating a fast,
low power Vedic multiplier.
978-1-4799-3080-7/14/$31.00 © 2014 IEEE
A. Urdhva Tiryagbhyam (UT) sutra

The name of this sutra literally means "vertically and crosswise". It is one of the best known of the
Vedic sutras and provides an effective algorithm applicable to all multiplication cases. In addition, the
sutra yields faster operation by generating the partial products and their sum in a single iteration
step. Two-digit multiplication based on this method is shown in Fig. 1.

As an example, consider the multiplication of the two-digit numbers 23 and 52. The steps to be followed
are shown in Fig. 1. The first step is the multiplication 3x2, which gives the least significant digit.
Then the products 2x2 and 5x3 are added. The units digit of this sum is placed to the left of the least
significant digit and the carry is forwarded to the next stage. The next step is the multiplication 2x5,
whose product is added to the previous carry; the result occupies the next positions to the left. Here
3x2 = 6, 2x2 + 5x3 = 19 (digit 9, carry 1) and 2x5 + 1 = 11, giving 23 x 52 = 1196.

Fig. 1. Two digit multiplication using UT sutra

This procedure can be extended to 4x4 bit multiplication as shown in Fig. 2, and further to NxN bit
multiplication. In Fig. 2, vertical and slanted lines represent the AND operation, i.e. partial product
generation between operands, while horizontal lines represent the summing of partial products.

Fig. 2. Four bit multiplication using UT sutra

III. CIRCUIT IMPLEMENTATION OF VEDIC MULTIPLIER

A. 2x2 multiplication

The 2x2 multiplier serves as the building block of 4x4, 8x8 and NxN multipliers. Its truth table is
reduced using a K-map to obtain the following Boolean equations:

M0 = (A0 and B0)                              (1)
M1 = (A0 and B1) xor (B0 and A1)              (2)
M2 = (A1 and B1) and not (A0 and B0)          (3)
M3 = (A1 and B1) and (A0 and B0)              (4)

B. 4x4 multiplication

4x4 bit multiplication based on the Vedic Urdhva Tiryagbhyam algorithm can be realized using the
following equations, in which + denotes arithmetic addition. C1 to C5 and H6 are the carries generated in
the corresponding stages and propagated to the next stage.

M0 = A0*B0                                    (5)
M1 = A0*B1 + A1*B0                            (6)
M2 = A0*B2 + A2*B0 + A1*B1 + C1               (7)
M3 = A3*B0 + A0*B3 + A1*B2 + A2*B1 + C2       (8)
M4 = A3*B1 + A1*B3 + A2*B2 + C3               (9)
M5 = A3*B2 + A2*B3 + C4                       (10)
M6 = A3*B3 + C5                               (11)
M7 = H6                                       (12)

The 4x4 multiplier is implemented by arranging 2x2 structures in a realizable and efficient manner [9].
Fig. 3 shows the schematic of the 4x4 multiplier implemented in CMOS using the 2x2 block. Each input bit
pair is given to a separate 2x2 multiplier, which together produce four partial product rows. The final
product bits are obtained by adding these partial product rows with full adders.

Fig. 3. Multiplication of 4x4 bit using CMOS logic.
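The vertically-and-crosswise procedure and the 2x2 building block can be checked with a short software model. The following is a minimal sketch, not the circuit described in the paper: the names ut_multiply, to_int and vedic_2x2 are hypothetical, digits are assumed to be stored least significant first, and the gate-level block uses M2 = (A1 and B1) and not (A0 and B0), the form for which the block agrees with the 2-bit product for every input.

```python
def ut_multiply(a_digits, b_digits, base=10):
    """Multiply two equal-length digit lists (least significant digit first)
    column by column, "vertically and crosswise"."""
    n = len(a_digits)
    assert len(b_digits) == n, "operands must have the same number of digits"
    result, carry = [], 0
    for k in range(2 * n - 1):
        # Column k collects every cross product a[i] * b[j] with i + j == k,
        # plus the carry forwarded from the previous column.
        column = carry + sum(
            a_digits[i] * b_digits[k - i]
            for i in range(n)
            if 0 <= k - i < n
        )
        result.append(column % base)  # digit kept in this position
        carry = column // base        # carry forwarded to the next column
    # Any remaining carry supplies the top digit(s), e.g. M7 in the 4x4 case.
    while carry:
        result.append(carry % base)
        carry //= base
    return result


def to_int(digits, base=10):
    """Convert a least-significant-first digit list back to an integer."""
    return sum(d * base**i for i, d in enumerate(digits))


def vedic_2x2(a1, a0, b1, b0):
    """Gate-level 2x2 block on single bits; returns (M3, M2, M1, M0)."""
    m0 = a0 & b0
    m1 = (a0 & b1) ^ (a1 & b0)
    m2 = (a1 & b1) & (1 - (a0 & b0))  # (A1 and B1) and not (A0 and B0)
    m3 = (a1 & b1) & (a0 & b0)
    return m3, m2, m1, m0


if __name__ == "__main__":
    # Decimal example from the text: 23 x 52, digits listed LSB first.
    print(to_int(ut_multiply([3, 2], [2, 5])))
    # The same routine in base 2 follows the 4x4 column structure.
    print(to_int(ut_multiply([1, 1, 0, 1], [1, 0, 1, 1], base=2), base=2))
```

With base=2 and four-bit operands the loop visits the columns M0 to M6 in turn, and the final carry plays the role of M7.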
2014 International Conference on Advances in Computing, Communications and Informatics (ICACCI)

IV. LOW POWER ADIABATIC VEDIC MULTIPLIER

Fig. 4. Proposed 4x4 low power Vedic multiplier with () indicating latency

This section deals with the implementation of the proposed low power Vedic multiplier based on DCPAL
energy recovery logic. The proposed 4x4 low power Vedic multiplier is shown in Fig. 4. DCPAL is used to
design all the individual cells, such as logic gates, half adders and full adders. DCPAL is a dual-rail
pre-resolve logic designed using an nMOS based Differential Cascode Voltage Switch (DCVS) tree structure
and a pMOS sense-amplifier memory recharge scheme to achieve efficient energy recovery. It also has two
pull-down nodes that provide both the function and its complement in the same circuit, thereby reducing
circuit complexity and improving overall latency [7]. A NAND/AND gate realized using DCPAL is shown in
Fig. 5.

Fig. 5. Schematic of AND/NAND gate using DCPAL

The output waveform for A = 1, B = 0 and /A = 0, /B = 1 is shown in Fig. 6. With A = 1 and B = 0, the node
out goes low during the evaluate phase of the PC. This turns MP1 ON, which raises the /out node along
with the rising PC1. During the recovery phase, the fully charged /out node releases its charge back to
the PC through the recovery path formed by the conducting MP1 transistor. In the hold phase, the next
stage evaluates, and the process goes on.

Fig. 6. Waveforms of AND/NAND gate using DCPAL
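The hand-off between pipeline stages, in which the next stage evaluates while the current one holds, can be pictured with a small timing model. This is a minimal sketch under stated assumptions rather than a model of the actual circuit: the four phases Evaluate, Hold, Recovery and Wait are assumed to have equal duration, successive stages are assumed to be driven by power-clocks that lag one another by a quarter period (90°), and the names phase and latency_in_periods are hypothetical.

```python
PHASES = ("Evaluate", "Hold", "Recovery", "Wait")


def phase(stage, slot):
    """Phase of the power-clock driving pipeline stage `stage` (0-based)
    during time slot `slot`, where one slot is a quarter of the clock period.
    Each stage's clock lags the previous stage's clock by one slot (90 degrees)."""
    return PHASES[(slot - stage) % 4]


def latency_in_periods(n_stages):
    """Pipeline latency in power-clock periods, assuming each stage adds
    one quarter period (one phase) of delay."""
    return n_stages / 4


if __name__ == "__main__":
    # Show which phase each of the first four stages is in, slot by slot.
    for slot in range(6):
        row = ", ".join(f"stage {s}: {phase(s, slot)}" for s in range(4))
        print(f"slot {slot}: {row}")
    # An 8-stage pipeline under this model.
    print(latency_in_periods(8))
```

Under these assumptions an 8-stage pipeline corresponds to two power-clock periods of latency.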

DCPAL based designs operate with a four-phase power-clock (PC) with Evaluate, Hold, Recovery and Wait
phases to achieve the adiabatic pipelined structure. Fig. 4 depicts the pipelined structure of the
adiabatic Vedic multiplier. A 90° phase shift is present between successive power-clocks. To ensure that
a steady input is applied during the evaluate phase of the power-clock, the input signal is applied 90°
ahead of PC1. The proposed multiplier uses a shared transistor approach for all the full adders of the
design, as shown in Fig. 7.

Fig. 7. Schematic of (a) sum block and (b) carry block of DCPAL full adder

Shared transistor based logic trees for Sum and Carry form the pull-down arms of the full adder. When PC3
is high, node N1 is grounded through the conducting MN11. In this state, if the input (A, B, Cin) is
true, /S is connected to node N1, pulling /S low, and S is disconnected from N1. At T2, the circuit
evaluates, making S high with the help of the cross-coupled MP1 and MP2. If the input (A, B, Cin) is
false during T1, the reverse operation takes place, pulling down node S. The carry block works in the
same way. A prominent feature of the design is that the full adder output is available after a latency
of just one, whereas other adiabatic logic based circuits ([2], [6]) have a latency of 2 for the full
adder. The multiplier output is therefore received at the end of the 8th stage, so the latency of the
design is 2 power-clock periods. This is an improvement of 1.25 power-clock periods over the 2N-2P
adiabatic logic based Vedic multiplier [2]. Some product bits arrive early; clocked buffers are therefore
introduced to maintain the pipelining and timing of the design.

V. SIMULATION RESULTS

4x4 and 8x8 Vedic multipliers based on DCPAL and on standard CMOS logic, operating at 25 MHz, are
simulated at the 45 nm technology node using the industry standard SPICE simulator Spectre. The average
power consumed by the Vedic multiplier structures is compared with that of conventional array multiplier
structures, for DCPAL and for its standard CMOS counterpart, in Table I. Fig. 8 shows the average power
(in uW) comparison as a chart.

TABLE I. COMPARISON BETWEEN DCPAL AND CMOS

Circuit            DCPAL (uW)           CMOS (uW)
                   Vedic     Array      Vedic     Array
4x4 Multiplier     0.484     0.634      1.39      1.73
8x8 Multiplier     3.13      4.67       8.57      11.14

Fig. 8. Average power (in uW) for Vedic multiplier and conventional array multiplier using CMOS & DCPAL

VI. CONCLUSION

Comparison results show that the Vedic multiplier designed using the DCPAL technique consumes 62% and 67%
less power than its CMOS equivalent for the 4x4 and 8x8 designs, and that the array multiplier based on
DCPAL consumes 57% and 63% less power than its CMOS counterparts. The DCPAL based Vedic multiplier
structure yields 23% and 33% lower power than the equivalent conventional 4x4 and 8x8 array multipliers.
In CMOS, on the other hand, the Vedic multiplier realizes a power saving of 18% and 23% respectively over
its conventional array counterpart. This validates that the Vedic multiplier is more power efficient than
the conventional array multiplier, and that the DCPAL based design achieves a significant power reduction
of around 57% and 67% for Vedic and array multiplier structures compared with standard CMOS. In addition
to low power operation, the design offers reduced overall latency compared to CMOS. Moreover, the four
phase power-clocks of DCPAL serve both as the power supply to the circuit and as the synchronizing clock
for achieving a pipelined structure. This feature helps to reduce the problems of electromagnetic
interference and crosstalk across the interconnects of complex designs, especially at lower technology
nodes.
This work can be extended to the study and analysis of the different power components of different
adiabatic logic families over a higher range of operating frequencies, with the aim of reducing the
adiabatic pipeline depth.
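The relative power savings discussed in the conclusion can be recomputed directly from the average power values in Table I. The sketch below assumes the four numeric columns of the table are, in order, DCPAL Vedic, DCPAL Array, CMOS Vedic and CMOS Array; the names POWER_UW and saving_percent are hypothetical. Values recomputed from the rounded table entries need not coincide exactly with the percentages quoted in the text.

```python
# Average power in uW, transcribed from Table I.
# Column assignment (DCPAL Vedic, DCPAL Array, CMOS Vedic, CMOS Array) is an assumption.
POWER_UW = {
    ("4x4", "DCPAL", "Vedic"): 0.484,
    ("4x4", "DCPAL", "Array"): 0.634,
    ("4x4", "CMOS", "Vedic"): 1.39,
    ("4x4", "CMOS", "Array"): 1.73,
    ("8x8", "DCPAL", "Vedic"): 3.13,
    ("8x8", "DCPAL", "Array"): 4.67,
    ("8x8", "CMOS", "Vedic"): 8.57,
    ("8x8", "CMOS", "Array"): 11.14,
}


def saving_percent(reference_uw, improved_uw):
    """Power saved by the improved design, as a percentage of the reference."""
    return 100.0 * (reference_uw - improved_uw) / reference_uw


if __name__ == "__main__":
    for size in ("4x4", "8x8"):
        # DCPAL versus CMOS, for each architecture.
        for arch in ("Vedic", "Array"):
            s = saving_percent(POWER_UW[(size, "CMOS", arch)],
                               POWER_UW[(size, "DCPAL", arch)])
            print(f"{size} {arch}: DCPAL saves {s:.1f}% over CMOS")
        # Vedic versus array, within each logic style.
        for logic in ("DCPAL", "CMOS"):
            s = saving_percent(POWER_UW[(size, logic, "Array")],
                               POWER_UW[(size, logic, "Vedic")])
            print(f"{size} {logic}: Vedic saves {s:.1f}% over Array")
```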
REFERENCES
[1] J.M. Rabaey, A. Chandrakasan, B. Nikolic, Digital Integrated Circuits, a Design Perspective,
2nd ed. Prentice Hall, Englewood Cliffs, NJ, 2002.
[2] Belgudri Ritesh and V.S. Kanchana Bhaaskaran, Design and implementation of an efficient multiplier
using Vedic mathematics and charge recovery logic, Proc. of Int. Conference on VLSI, Communication,
Advanced Devices, Signals & Systems and Networking (VCASAN), Chapter 15, pp. 101-108, 2013.
[3] Himanshu Thapliyal and M.B. Srinivas, High speed efficient N X N bit parallel hierarchical overlay
multiplier architecture based on ancient Indian Vedic Mathematics, Transactions on Engineering,
Computing and Technology, v2, pp. 225-228, Dec. 2004.
[4] Jayaprakasan V, Vijayakumar and V.S. Kanchana Bhaaskaran, Evaluation of the conventional versus
ancient computation methodology for energy efficient arithmetic architecture, Int. Conference on
Process Automation, Control and Computing (PACC), pp. 20-22, April 2011.
[5] H. D. Tiwari, G. Gankhuyag, C. M. Kim and Y. B. Cho, Multiplier design based on ancient Indian Vedic
Mathematics, Proc. IEEE International SoC Design Conference, pp. 65-68, Nov. 24-25, 2008.
[6] Y. Moon and D.-K. Jeong, An efficient charge recovery logic circuit, IEEE J. of Solid-State
Circuits, Vol. 31, No. 4, pp. 514-522, 1996.
[7] V. S. Kanchana Bhaaskaran, J. P. Raina, Differential Cascode adiabatic logic structure for low
power, Journal of Low Power Electronics, vol. 4, pp. 178-190, 2008.
[8] Maharaja Jagadguru Swami Sri Bharati Krisna Tirthaji, Vedic Mathematics, Motilal Banarasidass
Publishers Pvt Ltd, Delhi, 2001.
[9] Gurumurthy KS, Prahalad MS, Fast & power efficient 16 X 16 array of array multiplier using Vedic
multiplication, Microsystems Packaging Assembly & Circuits Technology Conference (IMPACT) 2010,
pp. 48-53.
