
Multiplication for 2's Complement System: Booth Algorithm


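Consider an unsigned five-bit number:
$B = B_4 B_3 B_2 B_1 B_0$
$= B_4 \times 16 + B_3 \times 8 + B_2 \times 4 + B_1 \times 2 + B_0 \times 1$
For a 2's complement number:
$B_{\text{2's comp}} = B_4 \times (-16) + B_3 \times 8 + B_2 \times 4 + B_1 \times 2 + B_0 \times 1$
which can be re-expressed as:
$B_{\text{2's comp}} = (-16) B_4 + (16 - 8) B_3 + (8 - 4) B_2 + (4 - 2) B_1 + (2 - 1) B_0$
$= -16 (B_4 - B_3) - 8 (B_3 - B_2) - 4 (B_2 - B_1) - 2 (B_1 - B_0) - 1 (B_0 - 0)$
The value in parentheses is the difference of two consecutive bits, which could be +1, 0, or -1.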
ECE152B AU 1

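Example: Use the Booth's algorithm recoding scheme to perform the multiplication:
$25_{10} \times (-19)_{10}$

  0 1 1 0 0 1   -- A = 25_10, multiplicand
  1 0 1 1 0 1   -- B = -19_10, multiplier (bits B5 B4 B3 B2 B1 B0)

$P = A \times B = -32 (B_5 - B_4) A - 16 (B_4 - B_3) A - 8 (B_3 - B_2) A - 4 (B_2 - B_1) A - 2 (B_1 - B_0) A - 1 (B_0 - 0) A$
$= (-32 + 16 + 0 - 4 + 2 - 1) A = -19 A = -475$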
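
The recoding used in this example can be checked with a short Python sketch. The function names `booth_digits` and `booth_multiply` are illustrative and not part of the slides; each recoded digit is taken as B(i-1) - B(i), with an implicit 0 to the right of B0, which is the sign-flipped form of the differences in parentheses.

```python
def booth_digits(b, n):
    """Booth-recode the n-bit two's complement pattern b.

    Returns digits d[i] = B[i-1] - B[i] (with B[-1] = 0), each +1, 0 or -1,
    so that the signed value of b equals sum(d[i] * 2**i).
    """
    bits = [(b >> i) & 1 for i in range(n)]  # B0 first
    prev = 0  # implicit 0 to the right of B0
    digits = []
    for bit in bits:
        digits.append(prev - bit)
        prev = bit
    return digits


def booth_multiply(a, b, n):
    """Multiply integer a by the n-bit two's complement pattern b."""
    return sum(d * (a << i) for i, d in enumerate(booth_digits(b, n)))


# The slide example: A = 25, B = 101101 (-19 in 6 bits).
print(booth_digits(0b101101, 6))        # [-1, 1, -1, 0, 1, -1]
print(booth_multiply(25, 0b101101, 6))  # -475
```

Reading the digit list from the right gives the weights -32, +16, 0, -4, +2, -1 used in the example.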

[Figure: Booth multiplier datapath. The multiplicand is held in an 8-bit register (clock PCAND_CLK). The multiplier is held in an 8-bit shift register (PIER_LD, PIER_CLK) followed by a 1-bit flip-flop; bits B_i and B_(i-1) feed combinational logic that drives the select lines of an 8-bit ALU. The product is accumulated in two 8-bit registers (prod_clr, prod_clk).]

Multiplier bits    ALU control set
B1-H  B0-H         S3-H  S2-H  S1-H  S0-H    Function
 0     0            0     0     0     0      pass product value
 0     1            1     0     0     1      product plus multiplicand
 1     0            0     1     1     0      product minus multiplicand
 1     1            0     0     0     0      pass product value

The logic for the ALU select lines is implemented by two NAND gates.
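
The control table can be exercised with a small behavioural sketch in Python. It is not a model of the actual registers: the hardware shifts the product, while this sketch, as a simplifying assumption, shifts the multiplicand left instead. The names `ALU_OP` and `booth_sequential` are illustrative.

```python
# (current bit B_i, previous bit B_(i-1)) -> ALU operation, as in the table.
ALU_OP = {
    (0, 0): "pass",
    (0, 1): "add",       # product plus multiplicand
    (1, 0): "subtract",  # product minus multiplicand
    (1, 1): "pass",
}


def booth_sequential(multiplicand, multiplier, n=8):
    """Step through an n-bit Booth multiply, one multiplier bit per cycle."""
    product = 0
    prev_bit = 0  # plays the role of the 1-bit flip-flop, cleared at the start
    for i in range(n):
        bit = (multiplier >> i) & 1
        op = ALU_OP[(bit, prev_bit)]
        if op == "add":
            product += multiplicand << i
        elif op == "subtract":
            product -= multiplicand << i
        prev_bit = bit
    return product


print(booth_sequential(25, 0b11101101))  # -475 (0b11101101 is -19 in 8 bits)
```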

Multiplication

               A3   A2   A1   A0
          x    B3   B2   B1   B0
               R0,3 R0,2 R0,1 R0,0
          R1,3 R1,2 R1,1 R1,0
     R2,3 R2,2 R2,1 R2,0
R3,3 R3,2 R3,1 R3,0
Sum of partial products

Multiplication Using Adders

Using adders to add rows: the four partial product rows R0 to R3 are added one row at a time.
1st level adder: adds rows R0 and R1.
2nd level adder: adds row R2 to the output of the 1st level adder.
3rd level adder: adds row R3 to the output of the 2nd level adder, giving the sum of partial products.
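
A minimal Python sketch of adding the partial product rows level by level. It assumes that row i is the multiplicand ANDed with multiplier bit B_i and shifted left i places, which the slides imply but do not state; the function names are illustrative.

```python
def partial_products(a, b, n=4):
    """Row i is A shifted left i places if bit B_i is 1, else 0."""
    return [(a << i) if (b >> i) & 1 else 0 for i in range(n)]


def add_rows_with_adders(rows):
    """Add the rows one adder level at a time; return every running sum."""
    running = rows[0]
    sums = []
    for row in rows[1:]:
        running = running + row  # one adder level
        sums.append(running)
    return sums


rows = partial_products(0b1011, 0b1101)  # 11 x 13
print(rows)                              # [11, 0, 44, 88]
print(add_rows_with_adders(rows))        # [11, 55, 143]
```

The last running sum is the product. Each level has to wait for the carry propagation of the level before it, which is what the row reduction method avoids.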

Row Reduction Method for Multiplication
Using Carry Save Adders

The 1st level adder (row-reduction unit) takes rows R0, R1 and R2:

               R0,3 R0,2 R0,1 R0,0
          R1,3 R1,2 R1,1 R1,0
     R2,3 R2,2 R2,1 R2,0
     F5   F4   F3   F2   F1   F0      Outputs of 1st level adder
     C5   C4   C3   C2   C1   C0

The 2nd level adder (row-reduction unit) takes F, C and row R3:

     F5   F4   F3   F2   F1   F0
     C5   C4   C3   C2   C1   C0
R3,3 R3,2 R3,1 R3,0
F6   F5   F4   F3   F2   F1   F0      Outputs of 2nd level adder
C6   C5   C4   C3   C2   C1   C0

Use a regular adder to add these two rows.
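
The two reduction levels can be sketched in Python with a 3-2 carry-save step. The names are illustrative, and the carry row is stored already shifted one place left, which is a modelling choice and not something the slides specify.

```python
def carry_save_add(x, y, z):
    """3-2 row reduction: three rows in, a sum row F and a carry row C out."""
    f = x ^ y ^ z                           # bitwise sum, no carry propagation
    c = ((x & y) | (x & z) | (y & z)) << 1  # carries, moved one place left
    return f, c


def multiply_row_reduction(a, b, n=4):
    """Reduce the n partial product rows to two, then add those two."""
    rows = [(a << i) if (b >> i) & 1 else 0 for i in range(n)]
    while len(rows) > 2:
        f, c = carry_save_add(rows[0], rows[1], rows[2])
        rows = [f, c] + rows[3:]
    return sum(rows)  # the final regular (carry-propagate) adder


print(carry_save_add(0b101, 0b011, 0b110))     # (0, 14)
print(multiply_row_reduction(0b1011, 0b1101))  # 143
```

For four rows the loop runs twice, matching the 1st and 2nd level adders; only the final addition propagates carries.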

Generalized row reduction method:

[Figure: the MULTIPLICAND and MULTIPLIER form the partial product array in parallel; the array is reduced to intermediate partial product sums, which are then summed to give the PRODUCT.]

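Example: design a high speed multiplier
56 x 56 bit
longest available row reduction unit: 15-4
the final stage is a LACA with 8-bit basic adders.

[Figure: the 56 partial product rows pass through 15-4 row-reduction units in two levels, then a 7-3 unit, a 3-2 unit and the final adder (FA), which produces the product.]

DELAY: $1 + 2\,T_{15\text{-}4\,rru} + T_{7\text{-}3\,rru} + T_{3\text{-}2\,rru} + T_{LACA}$
where the leading 1 is the delay to form the partial products, rru stands for row reduction unit, and
$T_{LACA} = 2 + 4(\lceil \log_8 112 \rceil - 1) = 2 + 4(3 - 1) = 10$ gate delays.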
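
The final-adder term can be reproduced with a few lines of Python. The sketch takes the slide's expression 2 + 4(levels - 1) at face value and assumes the logarithm is rounded up to a whole number of lookahead levels; the function names are illustrative.

```python
def lookahead_levels(width, group):
    """Smallest k with group**k >= width, i.e. ceil(log_group(width))."""
    levels, span = 0, 1
    while span < width:
        span *= group
        levels += 1
    return levels


def laca_delay(width, group=8):
    """Gate delays of the final adder, using 2 + 4 * (levels - 1)."""
    return 2 + 4 * (lookahead_levels(width, group) - 1)


print(lookahead_levels(112, 8))  # 3
print(laca_delay(112))           # 10
```

A 56 x 56 product is 112 bits wide, which is why 112 appears in the logarithm.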

Multiplication with Sectioning

Design an 8 x 8 multiplier using 4 x 4 multipliers.

A 4 x 4 multiplier:

               X3   X2   X1   X0
          x    Y3   Y2   Y1   Y0
               R0,3 R0,2 R0,1 R0,0
          R1,3 R1,2 R1,1 R1,0
     R2,3 R2,2 R2,1 R2,0
R3,3 R3,2 R3,1 R3,0
P7   P6   P5   P4   P3   P2   P1   P0

For the 8 x 8 case:

X7 X6 X5 X4 X3 X2 X1 X0
Y7 Y6 Y5 Y4 Y3 Y2 Y1 Y0

[Figure: the 8 x 8 partial product array, rows R0,7 ... R0,0 onward, divided into four 4 x 4 sections P1 to P4, with P1 = X3-0 x Y3-0 and P2 = X7-4 x Y3-0.]

Each section is an 8-bit product; the four products are then added, each at its own weight:

P1,7 P1,6 P1,5 P1,4 P1,3 P1,2 P1,1 P1,0
P2,7 P2,6 P2,5 P2,4 P2,3 P2,2 P2,1 P2,0
P3,7 P3,6 P3,5 P3,4 P3,3 P3,2 P3,1 P3,0
P4,7 P4,6 P4,5 P4,4 P4,3 P4,2 P4,1 P4,0
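
A Python sketch of the sectioned multiplier. P1 and P2 follow the definitions on the slide; the assignment of P3 and P4 to the two remaining sections, and the names `mul4x4` and `mul8x8_sectioned`, are assumptions made for illustration.

```python
def mul4x4(x, y):
    """Stand-in for one 4 x 4 multiplier block (8-bit result)."""
    assert 0 <= x < 16 and 0 <= y < 16
    return x * y


def mul8x8_sectioned(x, y):
    """Build an 8 x 8 unsigned product from four 4 x 4 products."""
    x_lo, x_hi = x & 0xF, x >> 4
    y_lo, y_hi = y & 0xF, y >> 4
    p1 = mul4x4(x_lo, y_lo)  # X3-0 x Y3-0, weight 2**0
    p2 = mul4x4(x_hi, y_lo)  # X7-4 x Y3-0, weight 2**4
    p3 = mul4x4(x_lo, y_hi)  # assumed: X3-0 x Y7-4, weight 2**4
    p4 = mul4x4(x_hi, y_hi)  # assumed: X7-4 x Y7-4, weight 2**8
    return p1 + (p2 << 4) + (p3 << 4) + (p4 << 8)


print(mul8x8_sectioned(200, 123))  # 24600
```

The final sum of the four shifted products could itself be done with the row reduction method.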

Division

DD = Q x DS + R
(DD: dividend, Q: quotient, DS: divisor, R: remainder)

The most straightforward method is to mimic the operations of paper-and-pencil long division for positive numbers.

Example:
                  1011      Quotient, Q
Divisor, DS 101)111010      Dividend, DD
                101         Q3 x DS
                 10010      R > DS, continue
                 000        Q2 x DS, shifted
                 10010      R > DS, continue
                  101       Q1 x DS, shifted
                  1000      R > DS, continue
                   101      Q0 x DS, shifted
                    11      R < DS, done
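
The long-division example can be replayed with a shift-and-subtract loop in Python. This is a sketch for unsigned operands only; the function name and the explicit bit-width parameter `n` are illustrative.

```python
def shift_subtract_divide(dividend, divisor, n):
    """Divide an n-bit unsigned dividend by a positive divisor.

    Returns (quotient, remainder), producing one quotient bit per step.
    """
    if divisor <= 0:
        raise ValueError("divisor must be positive")
    quotient, remainder = 0, 0
    for i in range(n - 1, -1, -1):
        # Bring down the next dividend bit: shift R left, append bit i.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:  # the subtraction would not go negative
            remainder -= divisor
            quotient |= 1
    return quotient, remainder


q, r = shift_subtract_divide(0b111010, 0b101, 6)
print(bin(q), bin(r))  # 0b1011 0b11
```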

A block diagram for such a divider:

[Figure: registers DS, R and Q, with DS and Q loaded from the inputs; an ALU (subtract) operating on R and DS; Q receives one bit from the control on each step. Termination: quotient in Q, remainder in R.]

The division process involves repetitive shifts and subtraction operations.

[Flowchart for the divider, using the DS, R and Q registers and the ALU (subtract):]
Load DS, Q; clear Cout; clear R
begin
shift Q, R left one bit; inc count
is R - DS positive?
   Yes: prepare 1 for Q reg; R = R - DS
   No: prepare 0 for Q reg
is count = N?
   No: shift Q, R left one bit and repeat
   Yes: Done
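
Fig 6.10 Parallel Array Divider

[Figure: an array of cells with divisor inputs d1 ... dm, dividend inputs D1 ... D2m, quotient outputs q1 ... qm and remainder outputs r1 ... rm. Each cell has inputs d, D, borrow-in bi and control c, and outputs borrow-out bo and R. The borrow is always computed; R is either D or (D - d - bi) mod 2, depending on c.]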

Division by Repeated Multiplication

Cost-effective if the system contains a high-speed multiplier
$Q = DD / DS$
In each iteration, a factor $f_i$ is generated and used to multiply both divisor DS and dividend DD.
$Q = (DD \times f_0 \times f_1 \times f_2 \cdots) / (DS \times f_0 \times f_1 \times f_2 \cdots)$
$f_i$ is so chosen that $DS \times f_0 \times f_1 \times f_2 \cdots$ converges rapidly toward 1.
If the denominator converges toward 1, the numerator converges toward Q.
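
For simplicity, assume DD and DS are positive normalized fractions: $DS = 1 - x$ where $x < 1$.
Set $f_0 = 1 + x$
=> $DS \times f_0 = 1 - x^2$ (closer to 1 than DS)
=> $Q = (DD (1 + x)) / (1 - x^2)$
Set $f_1 = 1 + x^2$
=> $DS \times f_0 \times f_1 = 1 - x^4$ (even closer to 1)
=> $Q = (DD (1 + x)(1 + x^2)) / (1 - x^4)$
$f_0 = 1 + x = 1 + (1 - DS) = 2 - DS$ (2's complement of DS)
$f_1 = 1 + x^2 = 1 + (1 - DS \times f_0) = 2 - DS \times f_0 = 2 - DS_1$
=> $f_i = 2 - DS \times f_0 \cdots f_{i-1} = 2 - DS_i$, where $DS_i = DS \times f_0 \cdots f_{i-1}$ is the denominator after i iterations ($DS_0 = DS$).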
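
The iteration can be sketched in Python with floating-point values standing in for the fixed-point hardware. The function name and the `iterations` parameter are illustrative; each factor is formed as 2 minus the current denominator.

```python
def divide_by_repeated_multiplication(dd, ds, iterations=6):
    """Approximate dd / ds for a normalized fraction 0 < ds < 1."""
    rows = []
    for _ in range(iterations):
        f = 2.0 - ds  # f_i = 2 - (current denominator)
        rows.append((dd, ds, f))
        dd *= f       # numerator converges toward Q
        ds *= f       # denominator converges toward 1
    return dd, rows


q, rows = divide_by_repeated_multiplication(0.4, 0.7)
for i, (dd, ds, f) in enumerate(rows):
    print(f"DD{i} {dd:.7f}  DS{i} {ds:.7f}  f{i} {f:.7f}")
print(f"Q is approximately {q:.7f}")
```

Because the error term is squared at every step, the number of correct bits roughly doubles per iteration once the denominator is close to 1.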

Example: (1). 0.4/0.7:


DD0 0.4000000 DS0 0.7000000 f0 1.3000000
DD1 0.5200000 DS1 0.9099999 f1 1.0900000
DD2 0.5668000 DS2 0.9918999 f2 1.0081000
DD3 0.5713911 DS3 0.9999344 f3 1.0000656
DD4 0.5714286 DS4 0.9999999 f4 1.0000000
DD5 0.5714286 DS5 1.0000000 f5 1.0000000
DD6 0.5714286 DS6 1.0000000

(2). 0.7/0.4:
DD0 0.7000000 DS0 0.4000000 f0 1.5999999
DD1 1.1199999 DS1 0.6400000 f1 1.3599999
DD2 1.5231999 DS2 0.8704000 f2 1.1295999
DD3 1.7206066 DS3 0.9832038 f3 1.0167962
DD4 1.7495062 DS4 0.9997178 f4 1.0002821
DD5 1.7499998 DS5 0.9999999 f5 1.0000001
DD6 1.7499999 DS6 1.0000000

(3). 0.1/0.15:
DD0 0.1000000 DS0 0.1500000 f0 1.8499999
DD1 0.1850000 DS1 0.2775000 f1 1.7224999
DD2 0.3186625 DS2 0.4779938 f2 1.5220062
DD3 0.4850063 DS3 0.7275094 f3 1.2724905
DD4 0.6171659 DS4 0.9257489 f4 1.0742511
DD5 0.6629912 DS5 0.9944868 f5 1.0055132
DD6 0.6666464 DS6 0.9999696

The # of iterations required is determined by the value of DS.
It's better to use a fixed # of iterations.
To assure that the process converges to the correct answer for all data, instead of using 2 - DS to calculate f0, use a ROM to find an appropriate value for f0.
It can then guarantee correct results after a fixed # of iterations.

Suppose ROM has $2^8$ words
(a) If DS is 8-bit, one iteration is sufficient
=> $f_0 = 1 / DS$
(b) If DS is > 8-bit, more than one iteration is required, $DS \times f_0 = 1 - x$ and $x < 2^{-8}$
At the 2nd iteration, $DS \times f_0 \times f_1 = 1 - x^2$
=> the difference from 1 is $< 2^{-16}$
At the ith iteration (i > 2), the denominator is
$DS \times f_0 \times f_1 \cdots f_{i-1} = 1 - x^{2^{(i-1)}}$
=> the difference from 1 is $< (2^{-8})^{2^{(i-1)}}$
(3rd iteration error $< 2^{-32}$)
(4th iteration error $< 2^{-64}$)
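
The error bound translates directly into a fixed iteration count. The Python sketch below assumes, as on the slide, that the ROM lookup leaves an error below 2^-8 and that every later factor squares the error; the function name and parameters are illustrative.

```python
def iterations_needed(result_bits, rom_bits=8):
    """Iterations until the error bound drops to 2**-result_bits or below."""
    i = 1
    error_exponent = rom_bits  # after f0 the error is below 2**-rom_bits
    while error_exponent < result_bits:
        error_exponent *= 2    # each further factor squares the error
        i += 1
    return i


for bits in (8, 16, 32, 64):
    print(bits, iterations_needed(bits))  # 1, 2, 3 and 4 iterations
```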

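[Figure: DS addresses a ROM that supplies f0. One chain of three multipliers forms DD x f0 x f1 x f2 and delivers Q. A second chain of two multipliers forms DS x f0 and DS x f0 x f1; a 2's complement unit after each of these produces f1 and f2.]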

Fig 6.14 Floating-Point Number Format

Field:    Sign    Exponent    Fraction
Symbol:   s       e           f
Bits:     1       m_e         m_f         (m bits in total)

$1 + m_e + m_f = m$, $\text{Value}(s, e, f) = (-1)^s \times f \times 2^e$

s is sign, e is exponent, and f is significand (mantissa)
We will assume a fraction mantissa, but some representations have used integers

Floating Point Arithmetic

Floating point addition

The difficulty when adding two floating point numbers stems from the fact that the mantissas, in general, have different significance.
$A = B + C$
$= M_B \times r^{E_B} + M_C \times r^{E_C}$
Before the two numbers can be properly added together, the mantissas must be aligned.
$A = (M_B \times r^{E_B - E_C} + M_C) \times r^{E_C}$ (assume $|B| < |C|$)

This involves determining which operand
value is smaller, and then aligning the
mantissa of that operand appropriately with
the mantissa of the larger operand.
The alignment is accomplished by shifting
the mantissa of the smaller operand to line
up with the digits of the same significance in
the larger operand.
The amount of the alignment, i.e. the # of
positions to shift, is determined by the
difference in the exponents.
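
The alignment step can be sketched in Python for base 2. Mantissas are modelled as unsigned integers holding the fraction bits, and bits shifted out are simply dropped; both choices, and the function name `align`, are assumptions made for illustration.

```python
def align(mant_b, exp_b, mant_c, exp_c):
    """Align two base-2 operands to the larger exponent.

    Returns (aligned_b, aligned_c, common_exponent). The mantissa of the
    operand with the smaller exponent is shifted right by the exponent
    difference.
    """
    shift = abs(exp_b - exp_c)
    if exp_b < exp_c:
        return mant_b >> shift, mant_c, exp_c
    return mant_b, mant_c >> shift, exp_b


# 0.11000000 x 2**1 and 0.10000000 x 2**3, with 8-bit fractions.
b, c, e = align(0b11000000, 1, 0b10000000, 3)
print(format(b, "08b"), format(c, "08b"), e)  # 00110000 10000000 3
```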

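A block diagram:

[Figure: Exponent B and Exponent C feed an Exponent Compare unit, which controls the Select unit for the exponent and the Select and align unit for Mantissa B and Mantissa C. The aligned mantissas go to Add/Subtract and then to Post Normalization, while Exponent Adjust corrects the selected exponent. Outputs: Result Exponent and Result Mantissa.]
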
The selection of the appropriate mantissa to
be aligned is made based on a comparison of
the magnitude of the two exponents.
The resulting number of the
addition/subtraction is provided to the
Post-Normalization unit.
Examples:
  0.8045   Input A is normalized
+ 0.7133   Input B is normalized
  1.5178   Result is not normalized

  0.8045
- 0.8032
  0.0013

The post normalization unit must be capable of shifting the mantissa by one or more positions and of adjusting the exponent to reflect the normalization.
Floating point addition, then, requires many more operations, and hence more hardware, than its integer counterpart.
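
A Python sketch of post normalization for the four-digit decimal examples. A mantissa is stored as an integer m standing for m / 10000, results are non-negative, and digits shifted out are truncated; these are simplifying assumptions, and `post_normalize` is an illustrative name.

```python
DIGITS = 4            # mantissa digits, as in the 0.8045 examples
LIMIT = 10 ** DIGITS  # a mantissa m stands for m / LIMIT


def post_normalize(mantissa, exponent):
    """Return a normalized (mantissa, exponent) pair for a non-negative result."""
    if mantissa == 0:
        return 0, 0                # zero has no normalized form
    while mantissa >= LIMIT:       # overflow: shift right one digit
        mantissa //= 10            # the dropped digit is truncated
        exponent += 1
    while mantissa < LIMIT // 10:  # leading zeros: shift left
        mantissa *= 10
        exponent -= 1
    return mantissa, exponent


print(post_normalize(8045 + 7133, 0))  # (1517, 1)
print(post_normalize(8045 - 8032, 0))  # (1300, -2)
```

The sum needs one right shift, while the difference of two nearly equal operands needs two left shifts, which is why the unit must handle more than one position.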

Floating-point adder of IBM System/360 Model 91

[Figure: exponents E1, E2 and mantissas M1, M2 arrive from the Input Bus. Exponent comparison and mantissa alignment: Adder 1 forms E1 - E2, which controls Shifter 1. Mantissa addition-subtraction: Adder 2 produces the result R. Result normalization: a zero digit checker, Adder 3 and Shifter 2 produce E3 and M3, which go to the Output Bus.]

Design a network to align the smaller mantissa to be added to the larger mantissa.
Assume that the mantissa is 24 bits.
The alignment network must be capable of shifting any number of bits, from 0 to 24 (shift right).
Assume the adders used to compare the exponents provide a binary number (size: 0 to 24; hence 5 bits S4S3S2S1S0) which indicates how far the number needs to be shifted in the alignment process.

Fig 6.11 An N x N Bit Crossbar Design for Barrel Rotator

[Figure: a crossbar with inputs x0 ... x5 on the rows and outputs y0 ... y5 on the columns; a shift count decoder selects which connections are enabled. x = input, y = output.]

Properties of the Crossbar Barrel Shifter

There is a 2-gate delay for any length shift
Each output line is effectively an n-way multiplexer for shifts of up to n bits
There are $n^2$ 3-state drivers for an n-bit shifter
For n = 32, this means 1024 3-state drivers
For 32 bits, the decoder is 5 bits (1 out of 32)
The minimum delay but large number of gates in the crossbar prompts a compromise: the logarithmic barrel shifter

Logarithmic Barrel Shifter

[Figure: input word x0 ... x31, shift count s4 s3 s2 s1 s0, output word y0 ... y31. Five rows of shift/bypass cells, one row per shift count bit: bypass/shift 1 bit right, 2 bits right, 4 bits right, 8 bits right and 16 bits right.]

The LSB of the shift count is used by the first level of MUXs to shift the number by 1 bit (the 1 condition), or provide no shift at all (the 0 condition).
Similarly, the second LSB controls the 2nd set of MUXs to shift the number by 2 more or not to shift.
Similarly, the MSB controls the 5th set of MUXs to shift the number by 16 more or not to shift.
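
The staged behaviour can be sketched in Python, with one loop pass per level of MUXs. The sketch assumes a logical right shift with zero fill (not a rotate); the function name and the `width` parameter are illustrative.

```python
def log_barrel_shift_right(word, count, width=32):
    """Shift a width-bit word right by count (0..width-1) in log2(width) stages."""
    mask = (1 << width) - 1
    word &= mask
    stage = 0
    while (1 << stage) < width:
        if (count >> stage) & 1:   # control bit s_stage: 1 = shift, 0 = bypass
            word >>= (1 << stage)  # this stage shifts by 2**stage places
        stage += 1
    return word


print(hex(log_barrel_shift_right(0x80000000, 21)))  # 0x400
```

A shift of 21 = 10101 in binary uses the 1-bit, 4-bit and 16-bit stages and bypasses the other two.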

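Floating Point Multiplication
$A = B \times C$
$= M_B \times r^{E_B} \times M_C \times r^{E_C}$
$= M_B \times M_C \times r^{E_B + E_C}$

[Figure: Exponent B and Exponent C feed an Exponent Add unit; Mantissa B and Mantissa C feed a Multiply unit. Post Normalization follows the multiplier and Exponent Adjust follows the exponent adder. Outputs: Result Exponent and Result Mantissa.]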

Post normalization unit only needs to shift the result by at most one bit position.
Consider two extreme cases:

Largest x Largest
     Base 2    Base 10
     0.1111    0.9999
  x  0.1111    0.9999
     0.1110    0.9998     Aligned properly => no postnormalization.

Smallest x Smallest
     Base 2    Base 10
     0.1000    0.1000
  x  0.1000    0.1000
     0.0100    0.0100     Not aligned properly => postnormalization of one digit position.
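
The two extreme cases can be checked with a Python sketch for 4-bit binary fractions. Mantissas are integers in which 0b1111 stands for 0.1111, inputs are assumed normalized (leading fraction bit 1), and low-order product bits are truncated; the names are illustrative.

```python
FRACTION_BITS = 4  # mantissas are 4-bit fractions, e.g. 0b1111 stands for 0.1111


def fp_multiply(mant_b, exp_b, mant_c, exp_c):
    """Multiply two normalized base-2 operands, truncating to FRACTION_BITS."""
    product = mant_b * mant_c                  # 2 * FRACTION_BITS fraction bits
    exponent = exp_b + exp_c
    mantissa = product >> FRACTION_BITS        # keep the top FRACTION_BITS bits
    if mantissa < (1 << (FRACTION_BITS - 1)):  # leading fraction bit is 0
        mantissa = product >> (FRACTION_BITS - 1)  # shift left one position
        exponent -= 1
    return mantissa, exponent


print(fp_multiply(0b1111, 0, 0b1111, 0))  # (14, 0): 0.1110, no shift needed
print(fp_multiply(0b1000, 0, 0b1000, 0))  # (8, -1): 0.1000 x 2**-1
```

For normalized inputs the product lies between 0.25 and 1, so a single left shift is always enough.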
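
Floating Point Division
$A = B / C$
$= (M_B \times r^{E_B}) / (M_C \times r^{E_C})$
$= (M_B / M_C) \times r^{E_B - E_C}$

[Figure: Exponent B and Exponent C feed an Exponent Subtract unit; Mantissa B and Mantissa C feed a Divide unit. Post-Normalization follows the divider and Exponent Adjust follows the exponent subtractor.]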

Largest / Smallest
     Base 2    Base 10
     0.1111    0.9999
  /  0.1000    0.1000
     1.1110    9.9990     Not aligned properly => postnormalization is required.

Smallest / Largest
     Base 2    Base 10
     0.1000    0.1000
  /  0.1111    0.9999
     0.1000    0.1000     Aligned properly => no postnormalization.

The result of the mantissa division may require post-normalization by at most one bit position, in the opposite direction to that of the multiplier.
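
A matching Python sketch for division, again with 4-bit binary fractions stored as integers, normalized inputs and a truncated quotient. These are assumptions for illustration, as is the name `fp_divide`.

```python
FRACTION_BITS = 4  # mantissas are 4-bit fractions, e.g. 0b1000 stands for 0.1000


def fp_divide(mant_b, exp_b, mant_c, exp_c):
    """Divide two normalized base-2 operands, truncating to FRACTION_BITS."""
    if mant_c == 0:
        raise ZeroDivisionError("divisor mantissa is zero")
    exponent = exp_b - exp_c
    # Quotient with FRACTION_BITS fraction bits; it lies between 0.5 and 2.
    quotient = (mant_b << FRACTION_BITS) // mant_c
    if quotient >= (1 << FRACTION_BITS):  # form 1.xxxx: shift right one position
        quotient >>= 1
        exponent += 1
    return quotient, exponent


print(fp_divide(0b1111, 0, 0b1000, 0))  # (15, 1): 1.1110 becomes 0.1111 x 2**1
print(fp_divide(0b1000, 0, 0b1111, 0))  # (8, 0): 0.1000, no shift needed
```

Here the correction is a right shift with an exponent increment, the opposite of the left shift needed after multiplication.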