3/20/2018 arithmetic.1
1-bit adder Review (Appendix B.5, B.6)
[Figure: 1-bit full adder. Inputs a, b, CarryIn (Cin); outputs Sum, CarryOut (Co).]
• 1 unit of delay from Cin to Sum
• 2 units of delay from A/B to Sum
• CarryOut: 2 gate delays
CarryOut = a'bc + ab'c + abc' + abc = ab + ac + bc   (x' denotes NOT x)
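The carry equation can be checked exhaustively. The following is a minimal sketch, not code from the slides; the function names `full_adder` and `carry_simplified` are illustrative assumptions.

```python
def full_adder(a, b, cin):
    """Return (sum, carry_out) for three 1-bit inputs (each 0 or 1)."""
    na, nb, nc = 1 - a, 1 - b, 1 - cin   # complemented inputs
    s = a ^ b ^ cin                      # sum bit is the 3-input XOR
    # Sum-of-products carry: a'bc + ab'c + abc' + abc
    cout = (na & b & cin) | (a & nb & cin) | (a & b & nc) | (a & b & cin)
    return s, cout


def carry_simplified(a, b, cin):
    """Majority form of the same carry: ab + ac + bc."""
    return (a & b) | (a & cin) | (b & cin)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for cin in (0, 1):
                print(a, b, cin, full_adder(a, b, cin))
```

Both carry forms agree on all eight input combinations, which is why the 4-term expression can be built with fewer gates.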
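1-bit ALU: AND, OR, a+b, a+b'
[Figure a: 1-bit ALU. Inputs a, b, CarryIn, Less; control lines Binvert and Operation; a multiplexor with inputs 0, 1, 2 and Less on input 3 selects Result; output CarryOut.]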
ALU Delays
• Result = 1 gate delay
• From a to Result = 2
• From b to Result = 2 (ignore b invert)
[Figure b: 1-bit ALU for the most significant bit. Same inputs and control lines as the other bits (a, b, CarryIn, Less, Binvert, Operation), plus overflow detection logic producing the Set and Overflow outputs.]
32-bit ALU
+ Zero
+ SLT support
[Figure: 32-bit ALU built from 1-bit ALUs, ALU0 to ALU31. Inputs a0..a31 and b0..b31; control lines Bnegate and Operation; each CarryOut feeds the next CarryIn; outputs Result0..Result31, Zero, and Overflow. For SLT, the Set output of ALU31 feeds the Less input of ALU0, and the Less inputs of the other bits are 0.]
Overflow ?? - 4-bit example
Decimal  Binary     Decimal  2's Complement
 0       0000         0      0000
 1       0001        -1      1111
 2       0010        -2      1110
 3       0011        -3      1101
 4       0100        -4      1100
 5       0101        -5      1011
 6       0110        -6      1010
 7       0111        -7      1001
                     -8      1000
• Examples: 7 + 3 = 10 but ...
•           -4 - 5 = -9 but ...
  0 1 1 1             1 0
    0 1 1 1     7       1 1 0 0    -4
  + 0 0 1 1     3     + 1 0 1 1    -5
    1 0 1 0    -6       0 1 1 1     7
Overflow Detection
• Overflow: result too large (or too small) to represent properly
– Example: -8 ≤ 4-bit binary number ≤ 7
• When adding operands with different signs, overflow cannot occur!
• Overflow occurs when adding:
– 2 positive numbers and sum is negative
– 2 negative numbers and the sum is positive
• On your own: Prove you can detect overflow by:
– Carry into MSB ≠ Carry out of MSB
  0 1 1 1             1 0
    0 1 1 1     7       1 1 0 0    -4
  + 0 0 1 1     3     + 1 0 1 1    -5
    1 0 1 0    -6       0 1 1 1     7
Overflow Detection Logic
• Carry into MSB ≠ Carry out of MSB
– For an N-bit ALU: Overflow = CarryIn[N - 1] XOR CarryOut[N - 1]
[Figure: 4-bit example running from CarryIn0 to CarryOut3.]
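The XOR rule can be tried on the two 4-bit examples from the earlier slides. This is a minimal sketch under stated assumptions: `add_with_overflow` and `to_signed` are hypothetical helper names, and operands are passed as unsigned bit patterns.

```python
def to_signed(bits, n=4):
    """Interpret an n-bit pattern as a two's-complement integer."""
    return bits - (1 << n) if bits & (1 << (n - 1)) else bits


def add_with_overflow(x, y, n=4):
    """Add two n-bit patterns; return (n-bit result, overflow flag)."""
    mask = (1 << n) - 1
    low = (1 << (n - 1)) - 1                 # mask for the bits below the MSB
    x &= mask
    y &= mask
    carry_into_msb = ((x & low) + (y & low)) >> (n - 1)
    total = x + y
    carry_out_of_msb = total >> n
    # Overflow = CarryIn[N - 1] XOR CarryOut[N - 1]
    return total & mask, carry_into_msb ^ carry_out_of_msb


if __name__ == "__main__":
    for x, y in ((0b0111, 0b0011), (0b1100, 0b1011), (0b0111, 0b1100)):
        result, overflow = add_with_overflow(x, y)
        print(to_signed(x), "+", to_signed(y), "->", to_signed(result),
              "overflow" if overflow else "ok")
```

The first two pairs are the slide examples (7 + 3 and -4 - 5); the third mixes signs and so cannot overflow.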
Fast Add - Carry Select - review
16-bit Carry Select - review
Fast Addition : Carry Lookahead
• Carry inputs can be precomputed by logic
  p0 = a0 + b0    g0 = a0 b0                        (1 unit delay each p, g)
  c1 = g0 + c0 p0
     = a0 b0 + c0 (a0 + b0)
  p1 = a1 + b1    g1 = a1 b1                        (1 unit delay)
  c2 = g1 + p1 c1
     = a1 b1 + c1 (a1 + b1)
     = g1 + p1 g0 + p1 p0 c0                        (3 units of delay)
  c3 = g2 + p2 g1 + p2 p1 g0 + p2 p1 p0 c0          (3 units of delay)
  c4 = g3 + p3 g2 + p3 p2 g1 + p3 p2 p1 g0
          + p3 p2 p1 p0 c0                          (3 units of delay)
  c4 = func(a3, b3, a2, b2, a1, b1, a0, b0, c0)
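The lookahead equations can be evaluated directly, with every carry computed from g, p, and c0 instead of from the previous carry. A minimal sketch; the name `cla4` and the integer packing of the operands are assumptions made for illustration.

```python
def cla4(a, b, c0=0):
    """4-bit carry-lookahead add of two 4-bit ints; returns (sum, c4)."""
    abit = [(a >> i) & 1 for i in range(4)]
    bbit = [(b >> i) & 1 for i in range(4)]
    g = [x & y for x, y in zip(abit, bbit)]   # generate:  g_i = a_i b_i
    p = [x | y for x, y in zip(abit, bbit)]   # propagate: p_i = a_i + b_i

    # Each carry is a two-level function of g, p, and c0 only.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))

    carries = [c0, c1, c2, c3]
    s = sum((abit[i] ^ bbit[i] ^ carries[i]) << i for i in range(4))
    return s, c4


if __name__ == "__main__":
    print(cla4(0b0111, 0b0011))   # 7 + 3
```

In software this is no faster than a ripple loop; the point is only that the expanded equations produce the same carries.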
Fast Addition: Carry Look Ahead – 4 bits
C0 = Cin
A  B  C-out
0  0  0       "kill"
0  1  C-in    "propagate"
1  0  C-in    "propagate"
1  1  1       "generate"
g = a and b, p = a or b: 1 delay
c1 = g0 + c0 p0
c2 = g1 + g0 p1 + c0 p0 p1
c3 = g2 + g1 p2 + g0 p1 p2 + c0 p0 p1 p2
G0 = g3 + p3 g2 + p3 p2 g1 + p3 p2 p1 g0
P0 = p3 p2 p1 p0
• 3 units of delay for c1, c2, c3, (c4)
• 4 units of delay for S1, S2, S3
[Figure: four bit slices (a0, b0 to a3, b3), each producing g, p, and a sum bit S, connected to the lookahead logic.]
MIPS arithmetic instructions
Instruction          Example           Meaning                Comments
add                  add $1,$2,$3      $1 = $2 + $3           3 operands; exception possible
subtract             sub $1,$2,$3      $1 = $2 - $3           3 operands; exception possible
add immediate        addi $1,$2,100    $1 = $2 + 100          + constant; exception possible
add unsigned         addu $1,$2,$3     $1 = $2 + $3           3 operands; no exceptions
subtract unsigned    subu $1,$2,$3     $1 = $2 - $3           3 operands; no exceptions
add imm. unsign.     addiu $1,$2,100   $1 = $2 + 100          + constant; no exceptions
multiply             mult $2,$3        Hi, Lo = $2 x $3       64-bit signed product
multiply unsigned    multu $2,$3       Hi, Lo = $2 x $3       64-bit unsigned product
divide               div $2,$3         Lo = $2 ÷ $3,          Lo = quotient, Hi = remainder
                                       Hi = $2 mod $3
divide unsigned      divu $2,$3        Lo = $2 ÷ $3,          Unsigned quotient & remainder
                                       Hi = $2 mod $3
Move from Hi         mfhi $1           $1 = Hi                Used to get copy of Hi
Move from Lo         mflo $1           $1 = Lo                Used to get copy of Lo
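The Hi/Lo split for multiply and divide can be modeled on 32-bit patterns. This is a rough sketch, not a MIPS simulator: the function names mirror the mnemonics for readability only, signed `div` is left out, and division by zero is not guarded.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mult(rs, rt):
    """Signed multiply: return (hi, lo) halves of the 64-bit product."""
    product = (to_signed32(rs) * to_signed32(rt)) & ((1 << 64) - 1)
    return product >> 32, product & MASK32


def multu(rs, rt):
    """Unsigned multiply: return (hi, lo) halves of the 64-bit product."""
    product = (rs & MASK32) * (rt & MASK32)
    return product >> 32, product & MASK32


def divu(rs, rt):
    """Unsigned divide: return (hi, lo) = (remainder, quotient)."""
    rs &= MASK32
    rt &= MASK32
    return rs % rt, rs // rt


if __name__ == "__main__":
    print([hex(v) for v in mult(0xFFFFFFFF, 2)])    # -1 x 2, signed
    print([hex(v) for v in multu(0xFFFFFFFF, 2)])   # 4294967295 x 2, unsigned
```

The same operand bits give different Hi values under `mult` and `multu`, which is why the two instructions exist.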
MULTIPLY (unsigned)
• Paper and pencil example:
  Multiplicand A     1000
  Multiplier B     × 1001
                     1000     a3b0 a2b0 a1b0 a0b0
                    0000      a3b1 a2b1 a1b1 a0b1
                   0000       a3b2 a2b2 a1b2 a0b2
                  1000        a3b3 a2b3 a1b3 a0b3
  Product        01001000
• m bits x n bits = m+n bit product
• Binary makes it easy:
– 0 => place 0 (0 x multiplicand)
– 1 => place a copy (1 x multiplicand)
• 2 architectures: fast array multiplier and slow shift & add
Fast unsigned Multiply: Array Multiplier
[Figure: 4 x 4 array multiplier made of full-adder (FA) cells with carry in, carry out, and sum out. Each row takes A3 A2 A1 A0 and one bit of Multiplier B; the top row inputs are 0; the outputs are Product P = P7 P6 P5 P4 P3 P2 P1 P0.]
Cell delays ?
Fast signed Array Multiplier - Baugh-Wooley alg
[Figure: 4 x 4 signed array multiplier. Each FA cell takes ai, bj, sum in, and carry in, and produces sum out and carry out; rows take A3 A2 A1 A0 and one bit of Multiplier B; the array produces P3 P2 P1 P0, and a final row of FA adders with constant 1 inputs produces P7 P6 P5 P4.]
• 2 cell types used; a row of FA adders added
Array Multiplier - Baugh-Wooley Equations
a(n-1) and b(n-1) are the sign bits; the slide's equation is written for the 4-bit example.
Baugh-Wooley MPY
Example II: 1011 * 0011
1 0 1 1
1
0 0 1 1 1
0 0 1 1
0 0 1 1
0
1 1 0 0
1 0 0 0
0 0 0 0
1 1 1 0 0
0 1 1 1
0 1 1 1
1 1 1 1 0 0 0 1 = -15
Multiplication, using shift & Add
[Figure: partial-product array. Each multiplier bit B0 to B3 selects a copy of A3 A2 A1 A0, shifted one position left per row; the rows are summed to give P7 P6 P5 P4 P3 P2 P1 P0.]
  multiplicand       1000
  multiplier       × 1001
                     1000
                    0000
                   0000
                  1000
  product         1001000
• Length of product is the sum of operand lengths
Multiplication Hardware using shift & Add
[Figure: shift & add multiplier datapath; the Product register is initially 0.]
Optimized Multiplier
using shift & Add
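• Perform steps in parallel: add/shift
• Step 1: add the Multiplicand to the left half of the Product register if the low bit of Product is 1
• Step 2: shift the Product register right 1 bit
• 32nd repetition? No (< 32 repetitions): repeat. Yes (32 repetitions): done
• 4-bit example (multiplier 0011 starts in the right half of Product):
  Step      Product      Multiplicand
  Initial   0000 0011    0010
  1:        0010 0011    0010
  2:        0001 0001    0010
  1:        0011 0001    0010
  2:        0001 1000    0010
  1:        0001 1000    0010
  2:        0000 1100    0010
  1:        0000 1100    0010
  2:        0000 0110    0010
  Done      0000 0110    0010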
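The add/shift loop on a combined Product register can be sketched in a few lines. This is an illustrative model, not the slide's hardware: `shift_add_multiply` is a hypothetical name, and a Python integer stands in for the register, so the extra carry bit of the add is kept automatically.

```python
def shift_add_multiply(multiplicand, multiplier, n=4):
    """Multiply two unsigned n-bit values; return (product, trace)."""
    mask = (1 << n) - 1
    product = multiplier & mask          # right half = multiplier, left half = 0
    trace = [product]
    for _ in range(n):
        if product & 1:                  # step 1: add when the low bit is 1
            product += (multiplicand & mask) << n
        trace.append(product)
        product >>= 1                    # step 2: shift right 1 bit
        trace.append(product)
    return product, trace


if __name__ == "__main__":
    result, trace = shift_add_multiply(0b0010, 0b0011)
    for value in trace:
        print(format(value >> 4, "04b"), format(value & 0xF, "04b"))
```

With multiplicand 0010 and multiplier 0011 the printed register values follow the 4-bit trace on the slide, ending at 0000 0110.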
MIPS logical instructions
Instruction           Example           Meaning              Comment
and                   and $1,$2,$3      $1 = $2 & $3         3 reg. operands; Logical AND
or                    or $1,$2,$3       $1 = $2 | $3         3 reg. operands; Logical OR
xor                   xor $1,$2,$3      $1 = $2 ^ $3         3 reg. operands; Logical XOR
nor                   nor $1,$2,$3      $1 = ~($2 | $3)      3 reg. operands; Logical NOR
and immediate         andi $1,$2,10     $1 = $2 & 10         Logical AND reg, constant
or immediate          ori $1,$2,10      $1 = $2 | 10         Logical OR reg, constant
xor immediate         xori $1,$2,10     $1 = $2 ^ 10         Logical XOR reg, constant
shift left logical    sll $1,$2,10      $1 = $2 << 10        Shift left by constant
shift right logical   srl $1,$2,10      $1 = $2 >> 10        Shift right by constant
shift right arithm.   sra $1,$2,10      $1 = $2 >> 10        Shift right (sign extend)
shift left logical    sllv $1,$2,$3     $1 = $2 << $3        Shift left by variable
shift right logical   srlv $1,$2,$3     $1 = $2 >> $3        Shift right by variable
shift right arithm.   srav $1,$2,$3     $1 = $2 >> $3        Shift right arith. by variable
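The table writes `>>` for both right shifts; they differ in what fills the vacated high bits. A minimal sketch on 32-bit patterns, assuming shift amounts from 0 to 31; the function names simply reuse the mnemonics.

```python
MASK32 = 0xFFFFFFFF


def sll(x, shamt):
    """Shift left logical: zeros enter from the right, result kept to 32 bits."""
    return (x << shamt) & MASK32


def srl(x, shamt):
    """Shift right logical: zeros enter from the left."""
    return (x & MASK32) >> shamt


def sra(x, shamt):
    """Shift right arithmetic: copies of the sign bit enter from the left."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32                  # reinterpret as a negative integer
    return (x >> shamt) & MASK32      # Python's >> on negative ints sign-extends


if __name__ == "__main__":
    print(hex(srl(0x80000000, 4)), hex(sra(0x80000000, 4)))
```

For a value with the top bit clear, `srl` and `sra` give the same result.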
How shift instructions are implemented
Two kinds:
ARM: Barrel Shifter
[Figure: ARM barrel shifter feeding the ALU, which produces Result.]
Barrel Shifter, used in ICs
Shift Right using one transistor per switch
[Figure: switch matrix with control lines SR3 SR2 SR1 SR0, inputs A6 to A0, and outputs D3 D2 D1 D0.]
Barrel Shifter, used in ICs
Shift left & right
[Figure: switch matrix with control lines SL1 SL2 SL3 and SR2 SR1 SR0, inputs A5 to A0, and outputs D3 D2 D1 D0.]
Summary: Multiply & Shift
• Multiply: successive refinement to see final design
– 32-bit Adder, 64-bit shift register, 32-bit Multiplicand Register
• Fast multiply: array multiplier
Multilevel shifting – Shift right logical
• 5 shift levels for 32-bit ALU: Shift 16 or 0, Shift 8 or 0, Shift 4 or 0, Shift 2 or 0, Shift 1 or 0
• Each Mux is 2 CMOS transistors
• Total transistor count = 5 * 32 * 2 = 320
[Figure: 32 rows by 5 columns of 2-input muxes. Inputs A31..A0, intermediate signals X, Y, Z, M, outputs D31..D0; vacated high-order positions are fed "0".]
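The five levels map onto the five bits of the shift amount: each level either shifts by its fixed distance or passes the value through. A minimal sketch; `srl_multilevel` is an illustrative name, and the mux columns are modeled as one conditional per level.

```python
def srl_multilevel(value, shamt):
    """Shift right logical on 32 bits using levels of 16, 8, 4, 2, 1."""
    value &= 0xFFFFFFFF
    for level in (16, 8, 4, 2, 1):
        if shamt & level:        # this bit of shamt is the mux select
            value >>= level      # shift by 'level', zeros fill from the left
        # otherwise the level passes the value through ("or 0")
    return value


if __name__ == "__main__":
    print(hex(srl_multilevel(0xDEADBEEF, 12)))   # levels 8 and 4 are active
    print("transistors:", 5 * 32 * 2)
```

Any shift amount from 0 to 31 is reached with at most five stages, instead of one switch per (input, output) pair as in a full barrel shifter.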
Floating Point Arithmetic
• How to represent
– numbers with fractions, e.g., 3.1416
– very small numbers, e.g., .000000001
– very large numbers, e.g., 3.15576 × 10^9
• Fixed point
• Floating point: a number system with a floating decimal point
• Normalized numbers: no leading 0's, single digit before the decimal point
– Slide examples: 1.0 × 10^9, 3.1557 × 10^9, 35, 0.03
Floating Point Notation – IEEE 754 FP
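[Figure: scientific notation with its parts labeled: sign, magnitude, decimal point, exponent.]
6.02 × 10^23        1.673 × 10^-24
• Single: 1 01111110 1000…00   (sign, 8-bit exponent, fraction)
• Double: 1 01111111110 1000…00   (sign, 11-bit exponent, fraction)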
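The Single and Double bit patterns above can be reproduced with Python's standard `struct` module. The slide text does not name the encoded value; reading the fields as sign 1, biased exponent 126 (single) or 1022 (double), and fraction .1 gives -0.75, which this sketch takes as an assumption and checks. The helper names are hypothetical.

```python
import struct


def float_bits(x):
    """32-bit IEEE 754 single-precision pattern of x as a bit string."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    return format(raw, "032b")


def double_bits(x):
    """64-bit IEEE 754 double-precision pattern of x as a bit string."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))
    return format(raw, "064b")


def split_fields(bits, exponent_width):
    """Split a bit string into (sign, exponent, fraction) fields."""
    return bits[0], bits[1:1 + exponent_width], bits[1 + exponent_width:]


if __name__ == "__main__":
    print(split_fields(float_bits(-0.75), 8))
    print(split_fields(double_bits(-0.75), 11))
```

The exponent fields differ only because of the bias: -1 + 127 = 126 for single and -1 + 1023 = 1022 for double.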
Exponent Bias used to simplify comparisons
Floating-Point Addition – Decimal
Key point – Make exponents equal
• 4-digit example
   9.999 × 10^1 + 1.610 × 10^-1
1. Align decimal points
   Shift number with smaller exponent
   9.999 × 10^1 + 0.016 × 10^1
2. Add significands
   9.999 × 10^1 + 0.016 × 10^1 = 10.015 × 10^1
3. Normalize result & check for over/underflow
   1.0015 × 10^2
4. Round and renormalize if necessary
   1.002 × 10^2
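The four steps can be followed with integer significands. This is a simplified sketch under several assumptions: both operands are positive, a significand d.ddd is stored as the integer dddd, digits shifted out during alignment are dropped (as in the 0.016 on the slide), rounding is half-up, and there is no over/underflow check. The name `fp_add_decimal` is illustrative.

```python
def fp_add_decimal(sig_a, exp_a, sig_b, exp_b, digits=4):
    """Add two positive decimal floats given as (significand, exponent)."""
    # Step 1: align, shifting the number with the smaller exponent right.
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    sig_b //= 10 ** (exp_a - exp_b)

    # Step 2: add significands.
    sig, exp = sig_a + sig_b, exp_a

    # Steps 3 and 4: normalize back to 'digits' digits, rounding half-up;
    # the loop repeats if rounding itself overflows the significand.
    limit = 10 ** digits
    while sig >= limit:
        sig, dropped = divmod(sig, 10)
        if dropped >= 5:
            sig += 1
        exp += 1
    return sig, exp


if __name__ == "__main__":
    print(fp_add_decimal(9999, 1, 1610, -1))   # 9.999e1 + 1.610e-1
```

The result pair (1002, 2) stands for 1.002 × 10^2, matching step 4 above.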
Floating-Point Addition – Binary
Key point – Make exponents equal
• 4-bit example
   1.000₂ × 2^-1 + -1.110₂ × 2^-2   (0.5 + -0.4375)
1. Align binary points
   Shift number with smaller exponent
   1.000₂ × 2^-1 + -0.111₂ × 2^-1
2. Add significands
   1.000₂ × 2^-1 + -0.111₂ × 2^-1 = 0.001₂ × 2^-1
3. Normalize result & check for over/underflow
   1.000₂ × 2^-4, with no over/underflow
4. Round and renormalize if necessary
   1.000₂ × 2^-4 (no change) = 0.0625
[Flowchart: overflow or underflow raises an exception; otherwise round the mantissa; if the result is not normalized, repeat; if normalized, done.]
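The binary example can be traced the same way with signed integer significands. A minimal sketch with assumptions stated up front: a magnitude 1.xxx is stored as a 4-bit integer (1.000 is 0b1000), bits shifted out are truncated rather than rounded, and over/underflow is not checked. `fp_add_binary` and `to_float` are hypothetical names.

```python
def fp_add_binary(sig_a, exp_a, sig_b, exp_b, bits=4):
    """Add two binary floats given as (signed significand, exponent)."""
    # Step 1: align, shifting the number with the smaller exponent right.
    if exp_a < exp_b:
        sig_a, exp_a, sig_b, exp_b = sig_b, exp_b, sig_a, exp_a
    shifted = abs(sig_b) >> (exp_a - exp_b)
    sig_b = -shifted if sig_b < 0 else shifted

    # Step 2: add significands.
    sig, exp = sig_a + sig_b, exp_a
    if sig == 0:
        return 0, 0

    # Step 3: normalize so the magnitude is 1.xxx again.
    sign = -1 if sig < 0 else 1
    mag = abs(sig)
    top = 1 << (bits - 1)
    while mag >= 2 * top:      # too large: shift right, raise exponent
        mag >>= 1
        exp += 1
    while mag < top:           # too small: shift left, lower exponent
        mag <<= 1
        exp -= 1
    return sign * mag, exp


def to_float(sig, exp, bits=4):
    """Value of a (significand, exponent) pair as a Python float."""
    return sig / (1 << (bits - 1)) * 2.0 ** exp


if __name__ == "__main__":
    result = fp_add_binary(0b1000, -1, -0b1110, -2)
    print(result, to_float(*result))
```

Here step 4 changes nothing, as on the slide, because the normalized result already fits in 4 bits.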
Floating Point Addition Summary
Floating Point Multiplication
FP Adder Hardware
[Figure: FP adder datapath annotated with Steps 1 to 4: exponents compared, smaller number shifted right, result iterated until normalized.]
Floating Point: Overflow & Underflow
Summary of Floating Point Arithmetic