VLSI Architecture of Parallel MAC Based on Radix-2 Modified Booth Algorithm

Basic Conceptual Document, date: 06-8-2010
A NEW VLSI ARCHITECTURE OF PARALLEL
MULTIPLIER-ACCUMULATOR BASED
ON RADIX-2 MODIFIED BOOTH ALGORITHM
ABSTRACT
This paper is mainly concerned with the radix-2 modified Booth algorithm (MBA), used inside a
MAC (multiply and accumulate) structure. The aim is to speed up the arithmetic operation. The
proposed design is compared with the existing Booth-based design, and the MBA-based design is
found to perform better. The MBA used here works with the 1's complement method. The CSA
propagates carries toward the least significant bits and adds those bits in advance, which
decreases the number of bits passed to the final adder, and the MAC accumulates the partial
product value in the form of SUM and CARRY. The improvement in high-speed arithmetic comes from
combining the multiply and accumulate steps and devising a hybrid-type CSA. To estimate the
delay, SAKURAI'S ALPHA POWER LAW is used.
Basic concept and introduction:

The MAC is mainly used in digital signal processing (DSP), such as video, image or other data
processing, because most DSP methods rely on functions such as the discrete cosine transform,
which are computed by repeated multiplication and addition.
For high-speed multiplication the radix-4 modified Booth algorithm is commonly used, but it has
a drawback: the critical path for multiplication is still long. Coming to the multiplier
itself, it can be classified into 3 parts:
1.) Booth encoder
2.) Partial products
3.) Final addition
The most effective way to increase the speed of the multiplier is to reduce the number of partial
products, because multiplication proceeds as a series of additions of the partial products. To
reduce the number of calculation steps for the partial products, the MBA is mostly applied, while
the Wallace tree plays a vital role in increasing the speed of the addition.
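How Booth recoding reduces the number of partial products can be sketched in Python. This is only an illustration, assuming the usual overlapping 3-bit recoding into digits from {-2, -1, 0, 1, 2} and an even word length; the function names `booth_digits` and `booth_multiply` are hypothetical and are not taken from the paper.

```python
def booth_digits(y, n):
    """Recode an n-bit two's complement multiplier y into n/2 signed digits.

    Each digit is in {-2, -1, 0, 1, 2} and comes from an overlapping
    3-bit window (y[2i+1], y[2i], y[2i-1]), with y[-1] taken as 0.
    """
    assert n % 2 == 0, "this sketch assumes an even word length"
    y &= (1 << n) - 1              # keep only the n-bit pattern
    digits = []
    prev = 0                       # y[-1] = 0
    for i in range(0, n, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x, y, n):
    """Multiply signed x by a signed n-bit y using n/2 partial products."""
    partial_products = [(d * x) << (2 * i)
                        for i, d in enumerate(booth_digits(y, n))]
    return sum(partial_products), len(partial_products)


print(booth_digits(13, 8))         # [1, -1, 1, 0]
print(booth_multiply(-7, 13, 8))   # (-91, 4)
```

For an 8-bit multiplier the sketch produces 4 partial products instead of 8, which is the halving that the text relies on.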
The figure shows the general MAC operation of the existing method, which is divided into 4 stages:
1.) Booth encoding
2.) Partial product
3.) Addition
4.) Accumulation
Here a high-speed MAC is proposed, together with a hybrid CSA structure that improves the
output rate.
The MBA used here works with the 1's complement number system. A carry look-ahead adder (CLA) is
inserted in the CSA tree to reduce the number of bits in the final adder. To increase the output
rate, intermediate calculation results are accumulated in the form of sum and carry.
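As an illustration of the carry look-ahead idea, the following Python sketch adds two 2-bit values by computing the carries directly from generate and propagate signals instead of rippling them. The 2-bit width matches the CLA described later for the proposed design; the function name `cla_2bit` and the bit-level formulation are assumptions made for this example, not the paper's circuit.

```python
def cla_2bit(a, b, cin=0):
    """Add two 2-bit values with carry look-ahead; returns (sum, carry_out)."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    g0, g1 = a0 & b0, a1 & b1          # generate: this bit creates a carry
    p0, p1 = a0 ^ b0, a1 ^ b1          # propagate: this bit passes a carry on
    c1 = g0 | (p0 & cin)               # carry into bit 1
    c2 = g1 | (p1 & g0) | (p1 & p0 & cin)   # carry out, from inputs only
    s0 = p0 ^ cin
    s1 = p1 ^ c1
    return s0 | (s1 << 1), c2


print(cla_2bit(3, 2, 1))   # (2, 1), i.e. 3 + 2 + 1 = 6 = 0b110
```

Both carries are expressed directly in terms of the inputs, so the carry out does not wait for the carry into bit 1.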
MAC CONCEPT:
A MAC is nothing but multiplying 2 numbers and adding the product to the accumulator:
Z<=X * Y + Z;
A MAC operates on a multiplicand and a multiplier. According to the paper, the first part is the
radix-2 Booth encoding with the multiplicand Y and multiplier X; the partial products are then
generated and converted into the form of sum and carry; the last part is the final addition. A
derivation of the MAC is shown in the paper. If N-bit data are multiplied, the number of
generated partial products is proportional to N. In order to add them serially, the execution time
is also proportional to N. The fastest multiplier architecture uses radix-2 Booth encoding to
generate the partial products and a Wallace tree based on CSA as the adder array to add them. If
radix-2 Booth encoding is used, the number of partial products, i.e., the inputs to the Wallace
tree, is reduced to half, resulting in fewer CSA tree steps.
In the MAC equation above, X and Y are 2 binary numbers of length N bits; 4 additional bits are
recommended to avoid overflow.
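The effect of the additional bits can be illustrated with a small Python model. It assumes, for the sake of the example, signed two's complement operands and that the 4 extra bits are appended to a 2N-bit accumulator; the helper names `to_signed` and `mac` are hypothetical.

```python
def to_signed(value, width):
    """Interpret the low `width` bits of value as a two's complement number."""
    value &= (1 << width) - 1
    return value - (1 << width) if value >> (width - 1) else value


def mac(pairs, width):
    """Accumulate x*y for each pair in a register of the given bit width."""
    z = 0
    for x, y in pairs:
        z = to_signed(x * y + z, width)    # Z <= X * Y + Z, wrapped to width
    return z


N = 8
worst_case = [(-128, -128)] * 16           # 16 largest-magnitude 8-bit products
exact = sum(x * y for x, y in worst_case)
print(mac(worst_case, 2 * N) == exact)     # False: 2N bits overflow
print(mac(worst_case, 2 * N + 4) == exact) # True: 4 extra bits absorb the sum
```

In this model the 2N-bit accumulator wraps around after a few worst-case products, while the accumulator with 4 extra bits still holds the exact result after 16 of them.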
The basic MAC unit is divided into 2 main blocks:
1.) Multiplier
2.) Accumulator
The multiplier is divided into partial product generation and reduction blocks.
The partial product reduction is further divided into a summation tree and a final adder. Tree
architectures were proposed to improve the speed of the partial product addition.
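The split into a summation tree and a final adder can be sketched in Python with a 3:2 carry-save adder. This is a simplified model, not a true Wallace tree: it applies 3:2 compressions repeatedly on fixed-width unsigned words, and the names `csa` and `reduce_and_add` are assumptions made for this example.

```python
def csa(a, b, c, width):
    """3:2 carry-save adder: compress three words into (sum, carry)."""
    mask = (1 << width) - 1
    s = a ^ b ^ c                                  # bitwise sum, no carry ripple
    carry = ((a & b) | (b & c) | (a & c)) << 1     # carries move one bit left
    return s & mask, carry & mask


def reduce_and_add(operands, width):
    """Compress operands with CSAs until two remain, then add them once."""
    mask = (1 << width) - 1
    ops = [x & mask for x in operands]
    while len(ops) > 2:
        s, c = csa(ops[0], ops[1], ops[2], width)
        ops = ops[3:] + [s, c]                     # three words become two
    return sum(ops) & mask                         # the only carry-propagating add


print(reduce_and_add([3, 5, 7, 9], 16))   # 24
```

Only the last step needs a carry-propagating addition, which is why the width of the final adder dominates the critical path.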
PROPOSED MAC:
Suppose we want to multiply two N-bit numbers and accumulate the result into a 2N-bit number.
The overall performance of the proposed MAC is improved by eliminating the accumulator
itself by combining it with the CSA function. If the accumulator has been eliminated, the
critical path is then determined by the final adder in the multiplier. The basic method to improve
the performance of the final adder is to decrease the number of input bits. In order to reduce this
number of input bits, the multiple partial products are compressed into a sum and a carry by
CSA. The number of bits of sums and carries to be transferred to the final adder is reduced by
adding the lower bits of sums and carries in advance within the range in which the overall
performance will not be degraded. A 2-bit CLA is used to add the lower bits in the CSA. In
addition, to increase the output rate when pipelining is applied, the sums and carries from the
CSA are accumulated instead of the outputs from the final adder: the sum and carry from the CSA
in the previous cycle are fed back as inputs to the CSA. Due to this feedback of both sum and
carry, the number of inputs to the CSA increases compared to the standard design.
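The feedback of sum and carry can be modelled in a few lines of Python. This is a behavioural sketch under simplifying assumptions: each product enters the CSA as a single word instead of as separate Booth partial products, the lower bits are not added in advance by a CLA, and the names `csa` and `mac_carry_save` are hypothetical.

```python
def csa(a, b, c, mask):
    """3:2 carry-save adder on words limited by mask."""
    s = a ^ b ^ c
    carry = ((a & b) | (b & c) | (a & c)) << 1
    return s & mask, carry & mask


def mac_carry_save(pairs, width):
    """Accumulate x*y in (sum, carry) form; one final addition at the end."""
    mask = (1 << width) - 1
    s, c = 0, 0                              # fed-back sum and carry
    for x, y in pairs:
        product = (x * y) & mask             # stands in for the partial products
        s, c = csa(product, s, c, mask)      # no carry-propagating add per cycle
    total = (s + c) & mask                   # final adder, used once
    return total - (1 << width) if total >> (width - 1) else total


print(mac_carry_save([(3, -4), (5, 6), (-7, -8)], 20))   # 74
```

The accumulated value is never held as a single number inside the loop; it exists only as the pair (s, c) until the final addition.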
The proposed MAC is then derived in the form P=X*Y + Z;
MAC ARCHITECTURE:
In the previous architecture, accumulation is a separate step. In this method the accumulation
is merged into the partial product addition, giving a combined partial products + accumulation
step.
From the hardware point of view, the n-bit inputs X and Y are converted into (n+1)-bit partial
products by passing through the Booth encoder. S, C and Z (the sum, the carry, and the sum of the
lower bits) are then generated. These 3 values are fed back and used for the next accumulation.
CSA ARCHITECTURE
Before entering into the CSA concept, we need to know what CSA stands for and its basic
definition: a CSA is a carry-save adder, which compresses three input words into a sum word and
a carry word without propagating carries along the word.
A hybrid-type CSA architecture that complies with the operation of the proposed MAC is proposed;
the example performs an 8*8-bit operation.
The figure shows the proposed CSA architecture. In it, si is used for the sign extension, and
n(i) compensates a 1's complement number into 2's complement. s(i) and c(i) are the ith bits of
the fed-back sum and carry, and z(i) is the ith bit of the sum of the lower bits for each partial
product. The example is for 8-bit numbers, so only 4 partial products are generated. The CSA
requires at least 4 rows of full adders for the four partial products, and one more row for the
accumulation, so 5 rows of full adders are necessary in total. In general, for an n*n-bit MAC
operation the level of CSA is (n/2+1). In the figure:
1.) White square = full adder
2.) Gray square = half adder
3.) Rectangular symbol = 2-bit CLA
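The role of n(i), compensating a 1's complement number into 2's complement, can be illustrated with a short Python check. The idea assumed here is that a negative partial product is formed by inverting the bits only, and the missing +1 is supplied as a separate correction bit to the adder array; the function name `negate_via_ones_complement` and the 9-bit width are choices made for this example.

```python
def negate_via_ones_complement(x, width):
    """Return the width-bit pattern of -x as (inverted bits, correction bit)."""
    mask = (1 << width) - 1
    inverted = ~x & mask       # 1's complement: every bit flipped
    n = 1                      # correction bit, added later by the adder array
    return inverted, n


width = 9                      # e.g. an (n+1)-bit partial product for n = 8
x = 0b001011010                # 90
inverted, n = negate_via_ones_complement(x, width)
mask = (1 << width) - 1
assert (inverted + n) & mask == (-x) & mask   # inverted + 1 is the 2's complement
```

Deferring the +1 keeps the partial product generator free of carry propagation; the correction bit simply occupies an otherwise unused input of the adder array.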
