Implementation of Adders
in FPGAs
ECE 645: Lecture 3
Required Reading
Chapter 5, Basic Addition and Counting,
Sections 5.1-5.5, pp. 75-85.
Behrooz Parhami,
Computer Arithmetic: Algorithms and Hardware Design
Required Reading
Chapter 9, Using Carry and Arithmetic Logic
Spartan-3 Generation FPGA User Guide
http://www.xilinx.com/support/documentation/spartan3_user_guides.htm
Half-adder (HA)
Inputs: x, y. Outputs: c (carry), s (sum).
$x + y = (c\,s)_2$, where c has weight 2 and s has weight 1.

x  y | c  s
0  0 | 0  0
0  1 | 0  1
1  0 | 0  1
1  1 | 1  0
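The half-adder truth table can be reproduced with a minimal Python sketch. The function name `half_adder` is an illustrative choice, not something defined in the lecture.

```python
def half_adder(x: int, y: int) -> tuple[int, int]:
    """Return (c, s) for single-bit inputs so that x + y == 2*c + s."""
    s = x ^ y  # sum bit: XOR of the inputs
    c = x & y  # carry bit: AND of the inputs
    return c, s


if __name__ == "__main__":
    # Print the four rows of the truth table in the order x, y, c, s.
    for x in (0, 1):
        for y in (0, 1):
            print(x, y, *half_adder(x, y))
```

Each returned pair satisfies the defining relation x + y = 2c + s.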
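Half-adder
Alternative implementations (1)
a) $c = xy$, $s = x \oplus y$
b) $c = \overline{\bar{x} + \bar{y}}$, $s = \bar{x}y + x\bar{y}$
c) $c = xy$, $s = x\bar{c} + y\bar{c} = \overline{\overline{x\bar{c}} \cdot \overline{y\bar{c}}}$
Half-adder
Alternative implementations (2)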
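Full-adder (FA)
Inputs: x, y, c_in. Outputs: c_out (carry), s (sum).
$x + y + c_{in} = (c_{out}\,s)_2$, where c_out has weight 2 and s has weight 1.

x  y  c_in | c_out  s
0  0  0    | 0      0
0  0  1    | 0      1
0  1  0    | 0      1
0  1  1    | 1      0
1  0  0    | 0      1
1  0  1    | 1      0
1  1  0    | 1      0
1  1  1    | 1      1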
Full-adder
Alternative implementations (1)
a) $s = (x \oplus y) \oplus c_{in}$
   $c_{out} = xy + c_{in}(x \oplus y)$
Full-adder
Alternative implementations (2)
b) $s = x \oplus y \oplus c_{in} = x y c_{in} + x \bar{y} \bar{c}_{in} + \bar{x} y \bar{c}_{in} + \bar{x} \bar{y} c_{in}$
   $c_{out} = xy + x c_{in} + y c_{in}$
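As a sanity check, the two gate-level forms of the full-adder (cascaded XORs with c_out = xy + c_in(x XOR y), and the three-term carry c_out = xy + x c_in + y c_in) can be compared exhaustively against the arithmetic definition. This is a minimal sketch; the function names are illustrative and do not come from the lecture.

```python
from itertools import product


def full_adder_a(x: int, y: int, c_in: int) -> tuple[int, int]:
    """Form a): s = (x XOR y) XOR c_in, c_out = xy + c_in (x XOR y)."""
    p = x ^ y
    s = p ^ c_in
    c_out = (x & y) | (c_in & p)
    return c_out, s


def full_adder_b(x: int, y: int, c_in: int) -> tuple[int, int]:
    """Form b): s = x XOR y XOR c_in, c_out = xy + x c_in + y c_in."""
    s = x ^ y ^ c_in
    c_out = (x & y) | (x & c_in) | (y & c_in)
    return c_out, s


def forms_agree() -> bool:
    """True if both forms match x + y + c_in = 2*c_out + s on all 8 inputs."""
    return all(
        full_adder_a(x, y, c) == full_adder_b(x, y, c) == divmod(x + y + c, 2)
        for x, y, c in product((0, 1), repeat=3)
    )
```

Calling `forms_agree()` walks all eight input combinations, so it doubles as a check of the full-adder truth table.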
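Full-adder
Alternative implementations (3)
c) c_out and s expressed as functions of c_in, selected by x and y:

x  y | c_out      s
0  0 | 0          $c_{in}$
0  1 | $c_{in}$   $\bar{c}_{in}$
1  0 | $c_{in}$   $\bar{c}_{in}$
1  1 | 1          $c_{in}$

[Figure: full-adder schematic. Labels: x, y, A2, A1, D, XOR, 0/1 multiplexer, p, g, C_in, C_out, S.]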
Full-adder
Alternative implementations (4)
Implementation used to generate fast carry logic in Xilinx FPGAs

x  y | c_out
0  0 | y
0  1 | c_in
1  0 | c_in
1  1 | y

$p = x \oplus y$
$g = y$
$s = p \oplus c_{in} = x \oplus y \oplus c_{in}$
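The carry selection described on this slide (c_out = y when p = 0, c_out = c_in when p = 1) can be sketched as a two-input multiplexer controlled by p. The code below is an illustrative Python model, not vendor code; the name `full_adder_mux` is an assumption.

```python
def full_adder_mux(x: int, y: int, c_in: int) -> tuple[int, int]:
    """Full-adder with the carry chosen by a multiplexer.

    p = x XOR y is the propagate signal and g = y is the value
    forwarded as the carry when p = 0 (in that case x == y, so y
    alone tells whether a carry is generated).
    """
    p = x ^ y
    g = y
    c_out = c_in if p else g  # 2-to-1 multiplexer controlled by p
    s = p ^ c_in
    return c_out, s
```

Using g = y instead of g = xy saves an AND gate: the multiplexer only looks at g when x and y are equal.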
Latency of a k-bit ripple-carry adder
$T_{ripple-add} = T_{FA}(x,y \to c_{out}) + (k-2)\,T_{FA}(c_{in} \to c_{out}) + T_{FA}(c_{in} \to s)$
Latency $\approx k\,T_{FA}$
Latency grows linearly with k
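The latency expression can be evaluated directly. In the sketch below, the function name and the three delay parameters are hypothetical placeholders for the full-adder path delays T_FA(x,y -> c_out), T_FA(c_in -> c_out), and T_FA(c_in -> s); no real device timing is implied.

```python
def ripple_add_latency(k: int, t_xy_to_cout: float,
                       t_cin_to_cout: float, t_cin_to_s: float) -> float:
    """Worst-case latency of a k-bit ripple-carry adder.

    The first stage contributes the x,y -> c_out delay, the k-2 middle
    stages contribute c_in -> c_out each, and the last stage contributes
    c_in -> s.
    """
    if k < 2:
        raise ValueError("the formula assumes at least two stages (k >= 2)")
    return t_xy_to_cout + (k - 2) * t_cin_to_cout + t_cin_to_s
```

If all three delays are taken as one common unit T_FA, the result is exactly k units, which is where the approximation latency ~ k T_FA comes from.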
Overflow for signed numbers (1)
Indication of overflow
  Positive + Positive = Negative
  Negative + Negative = Positive
Formulas
$\text{Overflow}_{2's\ complement} = x_{k-1}\, y_{k-1}\, \bar{s}_{k-1} + \bar{x}_{k-1}\, \bar{y}_{k-1}\, s_{k-1} = c_k \oplus c_{k-1}$
Overflow for signed numbers (2)

x_{k-1}  y_{k-1}  c_{k-1} | c_k  s_{k-1} | overflow  c_k XOR c_{k-1}
0        0        0       | 0    0       | 0         0
0        0        1       | 0    1       | 1         1
0        1        0       | 0    1       | 0         0
0        1        1       | 1    0       | 0         0
1        0        0       | 0    1       | 0         0
1        0        1       | 1    0       | 0         0
1        1        0       | 1    0       | 1         1
1        1        1       | 1    1       | 0         0
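Both overflow formulas (the sign-based one and c_k XOR c_{k-1}) can be checked on whole k-bit words. The following is a minimal Python sketch; the function names and the choice of representing operands as unsigned integers in [0, 2^k) are assumptions made for illustration.

```python
def add_twos_complement(x: int, y: int, k: int) -> tuple[int, int, int]:
    """Add two k-bit two's-complement words given as ints in [0, 2**k).

    Returns (s, overflow_from_signs, overflow_from_carries).
    """
    mask = (1 << k) - 1
    total = x + y
    s = total & mask
    c_k = total >> k                            # carry out of position k-1
    low = mask >> 1                             # selects bits 0 .. k-2
    c_k1 = ((x & low) + (y & low)) >> (k - 1)   # carry into position k-1
    xs, ys, ss = x >> (k - 1), y >> (k - 1), s >> (k - 1)  # sign bits
    overflow_signs = (xs & ys & (ss ^ 1)) | ((xs ^ 1) & (ys ^ 1) & ss)
    overflow_carries = c_k ^ c_k1
    return s, overflow_signs, overflow_carries


def to_signed(v: int, k: int) -> int:
    """Interpret a k-bit word as a two's-complement integer."""
    return v - (1 << k) if v >> (k - 1) else v
```

For example, with k = 4, adding 5 (0101) and 6 (0110) gives the word 1011, which reads as a negative number, and both overflow indicators are 1.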
Implementation of Adders in FPGAs
Xilinx FPGA Devices
Technology   Low-cost    High-performance
120/150 nm   -           Virtex 2, 2 Pro
90 nm        Spartan 3   Virtex 4
65 nm        -           Virtex 5
45 nm        Spartan 6   -
40 nm        -           Virtex 6
Altera FPGA Devices
Technology   Low-cost      Mid-range   High-performance
130 nm       Cyclone       -           Stratix
90 nm        Cyclone II    -           Stratix II
65 nm        Cyclone III   Arria I     Stratix III
40 nm        Cyclone IV    Arria II    Stratix IV
23 ECE 448 FPGA and ASIC Design with VHDL
Programmable
interconnect
Programmable
logic blocks
The Design Warrior's Guide to FPGAs:
Devices, Tools, and Flows. ISBN 0750676043
Copyright 2004 Mentor Graphics Corp. (www.mentor.com)
General structure of an FPGA
[Figure: array of CLBs. The CLB shown in detail contains four slices, each with two logic cells.]
Configurable logic block (CLB)
The Design Warrior's Guide to FPGAs:
Devices, Tools, and Flows. ISBN 0750676043
Copyright 2004 Mentor Graphics Corp. (www.mentor.com)
Xilinx Spartan 3 FPGAs
CLB Structure
CLB Slice Structure
Each slice contains two sets of the following:
  Four-input LUT
    Any 4-input logic function,
    or 16-bit x 1 sync RAM (SLICEM only),
    or 16-bit shift register (SLICEM only)
  Carry & Control
    Fast arithmetic logic
    Multiplier logic
    Multiplexer logic
  Storage element
    Latch or flip-flop
    Set and reset
    True or inverted inputs
    Sync. or async. control
LUT (Look-Up Table) Functionality
Look-up tables are the primary elements for logic implementation.
Each LUT can implement any function of 4 inputs.
[Figure: a 4-input LUT with inputs x1, x2, x3, x4 and output y, shown with two example truth-table contents. In example 2, y = 0 only when x1 = x2 = 1, so the LUT realizes a function of x1 and x2 alone, matching the two-input gate with inputs x1, x2 drawn in the figure.]

x1 x2 x3 x4 | y (example 1)  y (example 2)
0  0  0  0  | 0              1
0  0  0  1  | 1              1
0  0  1  0  | 0              1
0  0  1  1  | 0              1
0  1  0  0  | 0              1
0  1  0  1  | 1              1
0  1  1  0  | 0              1
0  1  1  1  | 1              1
1  0  0  0  | 0              1
1  0  0  1  | 1              1
1  0  1  0  | 0              1
1  0  1  1  | 0              1
1  1  0  0  | 1              0
1  1  0  1  | 1              0
1  1  1  0  | 0              0
1  1  1  1  | 0              0
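A LUT can be modeled as a 16-entry memory addressed by its four inputs. The sketch below stores the two example output columns from this slide; the function name `lut4`, the constant names, and the bit ordering (x1 as the most significant address bit) are assumptions made for illustration and do not reflect any vendor's configuration format.

```python
def lut4(table: list[int], x1: int, x2: int, x3: int, x4: int) -> int:
    """Return y for inputs x1..x4; table[i] is the output for row i,
    where the row index is the binary number x1 x2 x3 x4."""
    index = (x1 << 3) | (x2 << 2) | (x3 << 1) | x4
    return table[index]


# Output columns of the two example truth tables, rows 0000 .. 1111.
EXAMPLE_1 = [0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0]
EXAMPLE_2 = [1] * 12 + [0] * 4  # 0 only when x1 = x2 = 1
```

Changing the function implemented by the LUT only requires changing the 16 stored bits; the lookup logic stays the same.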
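[Figure: SLICE block diagram. Two look-up tables (inputs G4-G1 and F4-F1, output O), each followed by carry & control logic and a storage element (D, Q, CK, EC, S, R). The carry chain runs from CIN to COUT. Other labeled signals: F5IN, BY, SR, CLK, CE; outputs Y, YB, X, XB.]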
Carry & Control Logic

x  y | COUT
0  0 | y
0  1 | CIN
1  0 | CIN
1  1 | y

Propagate = $x \oplus y$
Generate = $y$
Sum = Propagate $\oplus$ CIN = $x \oplus y \oplus$ CIN
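Chaining this per-bit logic gives a k-bit adder in which each stage only selects between its incoming carry and its generate value. The following Python model is a sketch under that reading of the slide; the function name `carry_chain_add` and the integer-based bit extraction are illustrative choices.

```python
def carry_chain_add(x: int, y: int, k: int, c_in: int = 0) -> tuple[int, int]:
    """Add two k-bit unsigned words stage by stage.

    Per stage: Propagate = x_i XOR y_i, Generate = y_i,
    Sum = Propagate XOR CIN, COUT = CIN if Propagate else Generate.
    Returns (sum_word, carry_out).
    """
    carry = c_in
    result = 0
    for i in range(k):
        xi = (x >> i) & 1
        yi = (y >> i) & 1
        propagate = xi ^ yi
        generate = yi
        result |= (propagate ^ carry) << i          # sum bit uses incoming carry
        carry = carry if propagate else generate    # carry multiplexer
    return result, carry
```

Only the propagate signal depends on both operand bits; everything on the carry path is a multiplexer, which is what makes a dedicated fast carry chain attractive.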
Carry & Control Logic in Xilinx FPGAs
Carry & Control Logic in Spartan 3 FPGAs
LUT
Hardwired (fast) logic
Simplified View of Spartan-3 FPGA
Carry and Arithmetic Logic in One
Logic Cell
Simplified View of Carry Logic in One Spartan 3 Slice
Critical Path for an
Adder Implemented Using
Xilinx Spartan 3 FPGAs
Number and Length of Carry Chains
for Spartan 3 FPGAs
Bottom Operand Input to Carry Out Delay
  $T_{OPCYF}$ = 0.9 ns for Spartan 3
Carry Propagation Delay
  $t_{BYP}$ = 0.2 ns for Spartan 3
Carry Input to Top Sum Combinational Output Delay
  $T_{CINY}$ = 1.2 ns for Spartan 3
Critical Path Delays and Maximum Clock Frequencies
(taking into account surrounding registers)
Major Differences between Xilinx Families

                                       Spartan 3,   Virtex 5, Virtex 6,
                                       Virtex 4     Spartan 6
Look-Up Tables                         4-input      6-input
Number of CLB slices per CLB           4            2
Number of LUTs per CLB slice           2            4
Number of adder stages per CLB slice   2            4
Altera Cyclone III
Logic Element (LE) Normal Mode
Altera Cyclone III
Logic Element (LE) Arithmetic Mode
Altera Stratix III, Stratix IV
Adaptive Logic Modules (ALM) Normal Mode
Altera Stratix III, Stratix IV
Adaptive Logic Modules (ALM) Arithmetic Mode
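Bit-Serial & Digit-Serial Adders
[Figure: bit-serial adder. Inputs x_i, y_i (1 bit each); output s_i; the carry c_{i+1} is stored in a register clocked by clk and initialized to c_0 by the start signal.]
[Figure: digit-serial adder. Same structure with d-bit-wide x_i, y_i, and s_i.]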
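The behavior of a bit-serial adder can be sketched in Python as a full-adder plus one stored carry bit that is updated on every clock. The class and method names are illustrative, and feeding operands least-significant bit first is an assumption (it is the order in which the stored carry is meaningful).

```python
class BitSerialAdder:
    """One full-adder plus a carry register, processing one bit per clock."""

    def __init__(self, c0: int = 0):
        self.c0 = c0       # initial carry, loaded by start()
        self.carry = c0

    def start(self) -> None:
        """Model of the start signal: reload the carry register with c0."""
        self.carry = self.c0

    def clock(self, x_i: int, y_i: int) -> int:
        """Consume one bit of each operand, return s_i, store c_{i+1}."""
        total = x_i + y_i + self.carry
        self.carry = total >> 1
        return total & 1


def serial_add(x: int, y: int, k: int) -> tuple[int, int]:
    """Add two k-bit words in k clock cycles, LSB first."""
    adder = BitSerialAdder()
    adder.start()
    s = 0
    for i in range(k):
        s |= adder.clock((x >> i) & 1, (y >> i) & 1) << i
    return s, adder.carry
```

A digit-serial adder follows the same pattern with d bits consumed per clock instead of one.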
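Addition of a Constant
Addition of a constant (1)

General case:
    $x_{k-1}\ x_{k-2}\ \dots\ x_1\ x_0$     variable
  + $y_{k-1}\ y_{k-2}\ \dots\ y_1\ y_0$     constant
  = $s_{k-1}\ s_{k-2}\ \dots\ s_1\ s_0$

Constant whose least significant 1 is at position h:
    $x_{k-1}\ x_{k-2}\ \dots\ x_{h+1}\ x_h\ x_{h-1}\ \dots\ x_0$     variable
  + $y_{k-1}\ y_{k-2}\ \dots\ y_{h+1}\ 1\ 0\ \dots\ 0$     constant
  = $s_{k-1}\ s_{k-2}\ \dots\ s_{h+1}\ \bar{x}_h\ x_{h-1}\ \dots\ x_0$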
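Addition of a constant (2)
[Figure: chain of HA/MHA cells for bit positions h+1 to k-1, with inputs $x_{h+1}, x_{h+2}, \dots, x_{k-2}, x_{k-1}$, outputs $s_{h+1}, s_{h+2}, \dots, s_{k-2}, s_{k-1}$, and carry-out $c_k$. Bits $x_0 \dots x_{h-1}$ and $x_h$ bypass the chain.]
If $y_i = 0$: half-adder (HA)
If $y_i = 1$: modified half-adder (MHA)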
Modified half-adder (MHA)
Inputs: x, y. Outputs: c (carry), s (sum).
$x + y + 1 = (c\,s)_2$, where c has weight 2 and s has weight 1.

x  y | c  s
0  0 | 0  1
0  1 | 1  0
1  0 | 1  0
1  1 | 1  1
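The HA/MHA scheme for adding a constant can be sketched as follows. The code assumes the reading that bits below the least significant 1 of the constant (position h) pass through unchanged, bit h is complemented, and x_h enters the chain as its carry-in; the function names are illustrative.

```python
def half_adder(x: int, c_in: int) -> tuple[int, int]:
    """HA: x + c_in = (c s)_2."""
    return x & c_in, x ^ c_in


def modified_half_adder(x: int, c_in: int) -> tuple[int, int]:
    """MHA: x + c_in + 1 = (c s)_2."""
    return x | c_in, 1 ^ x ^ c_in


def add_constant(x: int, const: int, k: int) -> tuple[int, int]:
    """Add a k-bit constant to a k-bit x using one HA or MHA per bit.

    Returns (sum_word, carry_out c_k).
    """
    if const == 0:
        return x, 0
    h = (const & -const).bit_length() - 1   # position of least significant 1
    s = x & ((1 << h) - 1)                  # bits below h pass through
    x_h = (x >> h) & 1
    s |= (x_h ^ 1) << h                     # s_h = NOT x_h
    carry = x_h                             # carry into position h+1
    for i in range(h + 1, k):
        cell = modified_half_adder if (const >> i) & 1 else half_adder
        carry, bit = cell((x >> i) & 1, carry)
        s |= bit << i
    return s, carry
```

For instance, `add_constant(3, 6, 5)` uses h = 1, one MHA at position 2, and HAs at positions 3 and 4. Since the constant is fixed at design time, the choice between HA and MHA per position is hard-wired rather than computed.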
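Incrementer
[Figure: chain of half-adders (HA) for bit positions 1 to k-1, with inputs $x_1, x_2, \dots, x_{k-2}, x_{k-1}$, outputs $s_1, s_2, \dots, s_{k-2}, s_{k-1}$, and carry-out $c_k$. At position 0, $x_0$ feeds the carry input of the chain and $s_0 = \bar{x}_0$.]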
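Decrementer
[Figure: chain of modified half-adders (MHA) for bit positions 1 to k-1, with inputs $x_1, x_2, \dots, x_{k-2}, x_{k-1}$, outputs $s_1, s_2, \dots, s_{k-2}, s_{k-1}$, and carry-out $c_k$. At position 0, $x_0$ feeds the carry input of the chain and $s_0 = \bar{x}_0$.]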
Asynchronous Adders
Possible solutions to the carry propagation problem
1. Detect the end of propagation rather than wait for the worst-case time
2. Speed up propagation via
     look-ahead
     carry skip
     carry select, etc.
3. Limit carry propagation to within a small number of bits
4. Eliminate carry propagation through a redundant number representation
Analysis of carry propagation
Probability of carry generation = P($x_i y_i = 11$) = 1/4
Probability of carry propagation = P($x_i y_i$ = 01 or 10) = 1/2
Probability of carry annihilation = P($x_i y_i$ = 00 or 11) = 1/2

Position:   j          j-1 . . . i+1     i
x:          1          0  1  0  1        1
y:          1          1  0  1  0        1
            11 or 00   01 or 10

Probability of carry propagating from position i to position j
$= \left(\frac{1}{2}\right)^{j-1-i} \cdot \frac{1}{2} = \left(\frac{1}{2}\right)^{j-i}$
where the first factor is the probability of propagation through positions i+1 to j-1 and the second factor is the probability of annihilation at position j.
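Expected length of the carry chain
that starts at position i (1)
$\text{Expected length}(i, k) = \sum_{j=i+1}^{k-1} (j-i)\left(\frac{1}{2}\right)^{j-i} + (k-i)\left(\frac{1}{2}\right)^{k-1-i}$
where
  $(j-i)$ is the length of the carry chain,
  $\left(\frac{1}{2}\right)^{j-i}$ is the probability of the given length,
  $(k-i)$ is the distance till the end of the adder,
  $\left(\frac{1}{2}\right)^{k-1-i}$ is the probability of propagation till the end of the adder.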
Expected length of the carry chain
that starts at position i (2)
$\text{Expected length}(i, k) = 2 - 2^{-(k-i-1)}$
For $i \ll k$, the expected length of the carry propagation is approximately 2.
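The closed form can be cross-checked against the defining sum with exact rational arithmetic. The sketch below assumes the sum reads as: chain lengths (j - i) weighted by (1/2)^(j-i) for j = i+1 to k-1, plus the term (k - i)(1/2)^(k-1-i) for a carry that propagates to the end of the adder. The function names are illustrative.

```python
from fractions import Fraction


def expected_length_sum(i: int, k: int) -> Fraction:
    """Expected carry-chain length from position i, evaluated term by term."""
    half = Fraction(1, 2)
    total = sum((j - i) * half ** (j - i) for j in range(i + 1, k))
    return total + (k - i) * half ** (k - 1 - i)


def expected_length_closed(i: int, k: int) -> Fraction:
    """Closed form 2 - 2^-(k-i-1)."""
    return 2 - Fraction(1, 2) ** (k - i - 1)
```

For a 32-bit adder and i = 0, the closed form differs from 2 by only 2^-31, which illustrates why the expected length is about 2 whenever i is much smaller than k.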