ADSD Fall2011 11 MultiOperandAddition

ADSD Fall 2011
Lecture # 11
Dr. Rehan Hafiz
<rehan.hafiz@seecs.edu.pk>
Course Website for ADSD Fall 2011: http://lms.nust.edu.pk/
Acknowledgement: Material from the following sources has been consulted/used in these slides:
1. [CIL] Advanced Digital Design with the Verilog HDL, M. D. Ciletti
2. [SHO] Digital Design of Signal Processing Systems, Dr. Shoab A. Khan
3. [STV] Advanced FPGA Design, Steve Kilts
4. Ercegovac's book: Digital Arithmetic, 2004
5. Dr. Shoab A. Khan's CASE lectures on Advanced Digital System Design
Material from these slides can be used with the following citing reference: Dr. Rehan Hafiz: Advanced Digital System Design 2010. Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
Lectures: Tuesday @ 5:30-6:20 pm, Friday @ 6:30-7:20 pm
Contact: By appointment/Email
Office: VISpro Lab above SEECS Library
Lecture Overview
Last Lecture
  Two Operand Adders
This Lecture
  Where & Why Multi-Operand Addition
  Multi-Operand Addition <with Focus on Low Latency Design>
    Carry Save Adders
    Wallace Compression Tree
    Dadda Compression Tree
Multi-operand Addition
Adder tree for eight operands:
  Inputs:  a[7] a[6] a[5] a[4] a[3] a[2] a[1] a[0]
  Level 1: a[7]+a[6], a[5]+a[4], a[3]+a[2], a[1]+a[0]
  Level 2: a[7]+a[6]+a[5]+a[4], a[3]+a[2]+a[1]+a[0]
  Level 3: a[7]+a[6]+a[5]+a[4]+a[3]+a[2]+a[1]+a[0]
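The pairwise adder tree can be sketched in a few lines of Python. This is only a behavioural illustration: the function name `adder_tree_sum` and the sample operand values are assumptions made for the example, not part of the lecture material.

```python
def adder_tree_sum(operands):
    """Sum operands pairwise, level by level, as in a balanced adder tree.

    Returns (total, depth), where depth is the number of adder levels used.
    """
    level = list(operands)
    depth = 0
    while len(level) > 1:
        # Add neighbouring pairs; an odd leftover operand passes through unchanged.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
        depth += 1
    return (level[0] if level else 0), depth


a = [3, 1, 4, 1, 5, 9, 2, 6]          # hypothetical values for a[0]..a[7]
total, depth = adder_tree_sum(a)
print(total, depth)                   # 31 3
```

Eight operands need three levels of two-operand adders, and every one of those adders is a full carry-propagate adder.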
Example: Matrix Multiplication
Each term requires:
  Addition of 4 products
  Each product in turn requires the addition of 8 partial products generated by the multiplication (assuming 8-bit numbers)
  A total of (28 + 3) additions: 4 x 7 = 28 to sum the partial products inside the four products, plus 3 to sum the four products
So the above multiplication requires 465 additions in total
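The per-term count follows from the fact that summing k values takes k - 1 two-operand additions. A minimal sketch, with the hypothetical helper name `additions_per_term`:

```python
def additions_per_term(num_products=4, partial_products=8):
    """Two-operand additions needed for one term of the result matrix."""
    # Summing k values takes k - 1 two-operand additions.
    inside_products = num_products * (partial_products - 1)   # 4 * 7 = 28
    across_products = num_products - 1                        # 3
    return inside_products + across_products


print(additions_per_term())   # 31
```

The quoted total of 465 equals 15 x 31. Reading that as 15 result terms is an assumption, since the matrix dimensions are not stated in the text.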
Dot Representation
A useful, simplified notation when the positioning or alignment of the bits, rather than their values, is important.
Each dot represents a digit in a positional number system. Dots in the same column have the same positional weight.
[Figure: dot diagram with rows of eight dots and a 15-dot row]
So should we use Ripple Carry Adders (RCAs) / Carry Propagate Adders (CPAs) everywhere?
Applications: Addition, FIR Filtering, Matrix Multiplication
Accumulate using cascaded CP-Adders
OR: just delay the computation of the final result (defer carry propagation to a single final CPA)
Compression Trees
  Counter (m,n)
  Carry Save
  Dual Carry Save
  Wallace Tree
  Dadda Tree
(m,n) Counters as Compression Trees / Term Reducers
An (m, n) counter takes as input m bits (all of the same power-of-2 weight) and produces an n-bit binary number whose value is the number of inputs that are equal to 1. In other words, it counts the number of 1s in the input and outputs the binary count value.
Example: (3, 2) counter.
Of the 3 inputs, either 0, 1, 2 or 3 can be equal to 1. All four of these values can be represented as a 2-bit binary number. A (3, 2) counter is nothing but a full adder, where the sum is the LSB count output and the carry-out is the MSB count output.
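The equivalence between a (3, 2) counter and a full adder can be checked exhaustively. The sketch below is a behavioural model only; the names `counter` and `full_adder` are chosen for this illustration.

```python
from itertools import product


def counter(bits):
    """(m, n) counter: count the 1s in `bits`, returned as an LSB-first bit list."""
    m = len(bits)
    n = m.bit_length()                 # bits needed to represent the values 0..m
    count = sum(bits)
    return [(count >> i) & 1 for i in range(n)]


def full_adder(a, b, cin):
    """Gate-level full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


# The (3, 2) counter output [LSB, MSB] matches the full adder (sum, carry) for all inputs.
for a, b, c in product((0, 1), repeat=3):
    assert counter([a, b, c]) == list(full_adder(a, b, c))
```

The same `counter` function models larger counters as well, for example a (7, 3) counter when given seven input bits.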
Carry Save Adder
Successively reduces 3 input vectors to 2 output vectors, i.e. a sum vector and a carry vector. Each bit of these two vectors is computed independently of all other bits, and there is no carry propagation between adjacent bit positions.
Implements a compression from 3 vectors X, Y and Z to 2 vectors S and C.
Hardware: a parallel set of (3, 2) counters, i.e. a parallel set of full adders.
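A bit-level sketch of the carry save adder as a row of independent full adders follows. The helper names (`to_bits`, `from_bits`, `carry_save_add`) and the operand values are assumptions for illustration. The carry vector is returned unshifted, so it carries twice the weight of the sum vector.

```python
def to_bits(value, width):
    """Integer -> LSB-first list of `width` bits."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """LSB-first bit list -> integer."""
    return sum(b << i for i, b in enumerate(bits))


def carry_save_add(x, y, z):
    """Compress three equal-length bit vectors into a sum vector S and a carry vector C.

    Each position is handled by its own full adder; no carry moves between positions.
    C[i] is the carry generated at position i, so it has weight 2**(i + 1).
    """
    s, c = [], []
    for xi, yi, zi in zip(x, y, z):
        s.append(xi ^ yi ^ zi)
        c.append((xi & yi) | (xi & zi) | (yi & zi))
    return s, c


x, y, z = (to_bits(v, 8) for v in (0x5A, 0x3C, 0xF1))   # hypothetical operands
s, c = carry_save_add(x, y, z)
# S + 2*C equals the three-operand sum; only this last step needs carry propagation.
assert from_bits(s) + 2 * from_bits(c) == 0x5A + 0x3C + 0xF1
```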
Adding 3 8-bit numbers
[Figure: three 8-bit operands (Operand-1, Operand-2, Operand-3) reduced column by column to a row of Sum bits and a row of Carry bits]
Now add the two resulting vectors (sum and carry) using a Carry Propagate Adder (CPA)
1 Carry Save Adder (CSA) + 1 CPA instead of 2 CPAs
1 FA delay per CSA
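As a worked illustration of the "1 CSA + 1 CPA" scheme, the sketch below operates on whole words using bitwise operators. The three operand values are hypothetical, and Python's `+` stands in for the carry-propagate adder.

```python
def csa(x, y, z):
    """Word-level carry save adder: returns (sum_vector, carry_vector).

    The carry vector is already shifted left by one bit position.
    """
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


x, y, z = 0b01010110, 0b00101011, 0b00011101   # hypothetical 8-bit operands
s, c = csa(x, y, z)                            # one full-adder delay, no carry ripple

for name, value in (("Operand-1", x), ("Operand-2", y), ("Operand-3", z),
                    ("Sum", s), ("Carry", c)):
    print(f"{name:<10}{value:09b}")

total = s + c                                  # the single carry-propagate addition
assert total == x + y + z
```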
Carry Save Adder in Dot Notation
[Figure: dot-notation diagram of a carry save adder]
Example: 8-bit addition of 4 numbers
[Figure: four 8-bit operands added column by column, with a row of column carries (values up to 2) and the resulting Sum row]
What can be done?
[Figure: the four operands pass through two CSA stages, and the remaining two vectors are added by a single CPA]
Carry Save Reduction
Level 0 shows all the partial products (PPs)
The carry save reduction scheme takes the first 3 PP layers at Level 0
It reduces these to 2 PP layers using 3:2 and 2:2 compressors
The rest of the partial product rows are not touched at this level of reduction
The resulting two rows from the previous reduction level are then added together with the PP4 row in the next level
At each level, three partial product rows are added, resulting in two output rows
The process continues until the array is reduced to no more than two rows
The number of levels for N PP layers is N-2 (starting from Level 0)
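The linear carry save reduction can be modelled at word level as follows. This is a sketch under simplifying assumptions: each row is a Python integer rather than a column of dots, half adders are not modelled separately, and the names `csa` and `carry_save_reduce` are chosen for this example.

```python
def csa(x, y, z):
    """Word-level carry save adder; the carry vector is pre-shifted by one bit."""
    return x ^ y ^ z, ((x & y) | (x & z) | (y & z)) << 1


def carry_save_reduce(rows):
    """Reduce N rows to two, absorbing one new row per level.

    Returns ([sum_row, carry_row], number_of_levels).
    """
    rows = list(rows)
    if len(rows) < 3:
        return rows, 0
    s, c = csa(rows[0], rows[1], rows[2])      # first level takes three rows
    levels = 1
    for row in rows[3:]:                       # each later level adds one more row
        s, c = csa(s, c, row)
        levels += 1
    return [s, c], levels


operands = [13, 7, 200, 45, 99, 1, 255, 128]   # hypothetical values
final_rows, levels = carry_save_reduce(operands)
assert sum(final_rows) == sum(operands)
assert levels == len(operands) - 2             # N - 2 levels
```

The level count grows linearly with the number of rows, which is the motivation for the tree-structured schemes.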
Carry Save Architecture
[Figure: carry save array with four levels (Level 1 to Level 4), each a row of full adders (FA) with half adders (HA) at the ends. Each level produces one free product bit (P0, P1, P2, P3). The outputs of the last level are the sum bits PS3 to PS9 and the carry bits PC4 to PC10.]
Adding 6 Terms Using Dual CSA
The partial product rows are divided into two equal-size groups
Carry save reduction is applied to both groups simultaneously
The resulting 4 terms are again reduced using the CSA architecture
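A word-level sketch of the dual carry save idea is shown below. It assumes rows are plain integers and counts only CSA levels; the function names are hypothetical. The two halves are reduced in parallel, so the level count is the larger of the two halves plus the levels of the final merge.

```python
def csa(x, y, z):
    """Word-level carry save adder; the carry vector is pre-shifted by one bit."""
    return x ^ y ^ z, ((x & y) | (x & z) | (y & z)) << 1


def linear_reduce(rows):
    """Sequential carry save reduction: returns (rows_left, levels)."""
    rows = list(rows)
    levels = 0
    while len(rows) > 2:
        s, c = csa(rows[0], rows[1], rows[2])
        rows = [s, c] + rows[3:]
        levels += 1
    return rows, levels


def dual_csa_reduce(rows):
    """Split the rows into two groups, reduce both at once, then merge the results."""
    rows = list(rows)
    half = len(rows) // 2
    left, lv_left = linear_reduce(rows[:half])
    right, lv_right = linear_reduce(rows[half:])
    final, lv_final = linear_reduce(left + right)
    # The two groups work simultaneously, so only the slower one counts.
    return final, max(lv_left, lv_right) + lv_final


terms = [11, 22, 33, 44, 55, 66]               # hypothetical values
final_rows, levels = dual_csa_reduce(terms)
assert sum(final_rows) == sum(terms)
print(levels, linear_reduce(terms)[1])         # 3 4
```

For six terms this gives three CSA levels instead of the four needed by a single linear chain.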
Wallace Tree
Can reduce N binary numbers to two numbers in O(log N) levels.
A simultaneous CSA (carry save) operation is applied to every possible group of three terms (partial products) to be added
The same technique is repeated on the resulting matrix:
  The rows are again grouped into threes
  Each group is reduced to two rows simultaneously
This process continues until only two rows are left
The final rows are added together for the final product
[Figure: dot diagrams of the reduction at Level 0, Level 1, Level 2 and Level 3]
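The row-grouping procedure can be sketched at word level. The model below treats each row as an integer and applies a CSA to every complete group of three rows per level; leftover rows pass through untouched. It is an illustrative simplification (a real Wallace tree works column by column on dots and also uses half adders), and the names are assumptions.

```python
def csa(x, y, z):
    """Word-level carry save adder; the carry vector is pre-shifted by one bit."""
    return x ^ y ^ z, ((x & y) | (x & z) | (y & z)) << 1


def wallace_reduce(rows):
    """Reduce rows to two using as many parallel CSAs as possible per level.

    Returns (final_rows, levels).
    """
    rows = list(rows)
    levels = 0
    while len(rows) > 2:
        grouped = len(rows) - len(rows) % 3    # rows that form complete triples
        nxt = []
        for i in range(0, grouped, 3):
            nxt.extend(csa(*rows[i:i + 3]))    # each triple becomes two rows
        nxt.extend(rows[grouped:])             # leftover rows wait for the next level
        rows = nxt
        levels += 1
    return rows, levels


operands = [17, 250, 3, 96, 128, 77, 5, 201]   # hypothetical 8-bit values
final_rows, levels = wallace_reduce(operands)
assert sum(final_rows) == sum(operands)        # the final CPA would compute this sum
print(levels)                                  # 4
```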
Wallace Tree 7 Operands Example
Wallace Tree Architecture 6 Operands
Compressing 6 operands P0, P1, ..., P5 to 2 vectors S and C. This can be done using 3 levels of CSAs.
A left arrow on a CSA input means that the vector is shifted left by one bit position, because it is the carry vector output of a prior CSA.
Wallace Tree Architecture 8 Operands
No. of Operands Vs. No. of Full Adder Levels

Number of operands (n)    No. of full adder levels
3                         1
4                         2
5 <= n <= 6               3
7 <= n <= 9               4
10 <= n <= 13             5
14 <= n <= 19             6
20 <= n <= 28             7
29 <= n <= 42             8
43 <= n <= 63             9

Assuming Level 0 = 1st level
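The table can be reproduced by repeatedly applying the 3-to-2 reduction to the row count: every complete group of three rows becomes two, and leftover rows carry over. A minimal sketch, with the hypothetical function name `full_adder_levels`:

```python
def full_adder_levels(n):
    """Number of full adder (CSA) levels needed to reduce n operands to 2."""
    levels = 0
    while n > 2:
        n = 2 * (n // 3) + n % 3   # each triple -> 2 rows, the remainder passes through
        levels += 1
    return levels


for n in (3, 4, 6, 9, 13, 19, 28, 42, 63):
    print(n, full_adder_levels(n))
```

The largest operand count for each level (3, 4, 6, 9, 13, 19, 28, 42, 63) grows by a factor of roughly 1.5 per level, which is where the O(log N) depth comes from.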
Dadda Trees
An area-optimized Wallace tree for partial products
Requires the same number of adder levels
Uses fewer computational elements than a Wallace tree
This corresponds to less power dissipation and less area
Dadda Trees

Number of operands (n)    No. of full adder levels
3                         1
4                         2
5 <= n <= 6               3
7 <= n <= 9               4
10 <= n <= 13             5
14 <= n <= 19             6
20 <= n <= 28             7
29 <= n <= 42             8
43 <= n <= 63             9

For each column, reduce only as far as needed so that the number of PPs in the next level equals the maximum of the next lower range of "Number of operands" in the Wallace tree table (..., 13, 9, 6, 4, 3), ending at 2 rows.
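The sequence of target heights used by this rule can be generated directly: start from 2 and multiply by 1.5 (rounding down) until the next value would no longer be below the number of rows. The sketch below uses the hypothetical name `dadda_heights` and covers only the height schedule, not the placement of full and half adders.

```python
def dadda_heights(n):
    """Target maximum column heights, one per level, for reducing n rows to 2."""
    if n <= 2:
        return []                              # nothing to reduce
    heights = [2]
    while heights[-1] * 3 // 2 < n:
        heights.append(heights[-1] * 3 // 2)   # 2, 3, 4, 6, 9, 13, 19, 28, 42, 63, ...
    return heights[::-1]                       # largest target first


print(dadda_heights(8))   # [6, 4, 3, 2]
```

For 8 partial product rows the first level only has to bring every column down to 6 dots, so columns that already hold 6 or fewer are left alone at that level.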
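[Figure annotations]
Columns with more than 6 PPs need to be compressed. Carries coming from the previous column (e.g. from the 7th to the 8th) must also be taken into account.
No need to reduce the other columns, since these already have 6 PPs/operands.

Full adder (FA) and half adder (HA) counts per level:

Level     Dadda FAs   Dadda HAs   Wallace FAs   Wallace HAs
1         3           3           12            4
2         12          2           13            3
3         9           1           6             4
4         11          1           8             3
Total     35          7           39            14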
Further Reading
Ron S. Waters, Earl E. Swartzlander, "A Reduced Complexity Wallace Multiplier Reduction," IEEE Transactions on Computers, vol. 59, no. 8, pp. 1134-1137, Apr. 2010.
Quiz (20 Minutes)
You have to design an architecture that performs the addition of four 8-bit numbers with the minimum possible latency.
Questions
  Draw the micro-architecture of your design [5 Marks]
  Report the latency of your design in terms of gate delays [3 Marks]
  Report the latency in terms of gate delays if you were using Ripple Carry Adders for these four 8-bit numbers [2 Marks]
Assumptions
  Assume a single gate delay for all gates, including the XOR gate
  Assume a 2-gate delay for an FA
  You do not need to take care of fan-outs/fan-ins
  For your selection of adder, you just need to draw its block diagram
Lecture Ends !
Assignment-02
You have to explain the micro-architecture and write the Verilog code for the addition of eight 8-bit binary numbers for the following cases, and compare their performance (as reported by Xilinx):
  Ripple Carry Adder
  Carry Select Adder
  Wallace Compression Tree
Duration: 2 Weeks
Quiz