
in FPGAs
ECE 645: Lecture 3
Chapter 5, Basic Addition and Counting,
Sections 5.1-5.5, pp. 75-85.
Behrooz Parhami,
Computer Arithmetic: Algorithms and Hardware Design
Chapter 9, Using Carry and Arithmetic Logic
Spartan-3 Generation FPGA User Guide
http://www.xilinx.com/support/documentation/spartan-3_user_guides.htm

Half-adder (HA)

$x + y = (c\ s)_2$   (c has weight 2, s has weight 1)

x  y  |  c  s
0  0  |  0  0
0  1  |  0  1
1  0  |  0  1
1  1  |  1  0
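Alternative implementations (1)

a)  $s = x \oplus y$
    $c = xy$

b)  $s = \bar{x}y + x\bar{y}$
    $c = \overline{\bar{x} + \bar{y}}$

Alternative implementations (2)

c)  $c = xy$
    $s = x\bar{c} + y\bar{c} = \overline{\overline{x\bar{c}}\;\overline{y\bar{c}}}$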
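The half-adder forms a), b) and c) can be checked against each other exhaustively. The following is a minimal sketch, assuming the complemented terms written in the comments; the function names are illustrative and do not come from the slides.

```python
from itertools import product


def half_adder(x: int, y: int) -> tuple[int, int]:
    """Form a): s = x XOR y, c = x AND y. Returns (c, s)."""
    return x & y, x ^ y


def ha_nor_form(x: int, y: int) -> tuple[int, int]:
    """Form b), assumed as s = x'y + xy', c = (x' + y')'. Returns (c, s)."""
    nx, ny = 1 - x, 1 - y
    s = (nx & y) | (x & ny)
    c = 1 - (nx | ny)
    return c, s


def ha_nand_form(x: int, y: int) -> tuple[int, int]:
    """Form c), assumed as c = xy, s = xc' + yc'. Returns (c, s)."""
    c = x & y
    nc = 1 - c
    s = (x & nc) | (y & nc)
    return c, s


if __name__ == "__main__":
    for x, y in product((0, 1), repeat=2):
        print(x, y, half_adder(x, y), ha_nor_form(x, y), ha_nand_form(x, y))
```

All three functions return the pair (c, s) with x + y = 2c + s, so any of them can stand in for the HA block.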
Full-adder (FA)

$x + y + c_{in} = (c_{out}\ s)_2$   (c_out has weight 2, s has weight 1)

x  y  c_in  |  c_out  s
0  0  0     |  0      0
0  0  1     |  0      1
0  1  0     |  0      1
0  1  1     |  1      0
1  0  0     |  0      1
1  0  1     |  1      0
1  1  0     |  1      0
1  1  1     |  1      1
Alternative implementations (1)

a)  $s = (x \oplus y) \oplus c_{in}$
    $c_{out} = xy + c_{in}(x \oplus y)$
Alternative implementations (2)

b)  $s = x \oplus y \oplus c_{in} = \bar{x}\bar{y}c_{in} + \bar{x}y\bar{c}_{in} + x\bar{y}\bar{c}_{in} + xyc_{in}$
    $c_{out} = xy + xc_{in} + yc_{in}$
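The XOR-based form a) and the two-level form b) of the full adder can be compared against plain integer addition. This is a small sketch under the assumption that the minterms of s carry the complements shown in the comments; the function names are hypothetical.

```python
from itertools import product


def fa_xor_form(x: int, y: int, c_in: int) -> tuple[int, int]:
    """Form a): s = (x XOR y) XOR c_in, c_out = xy + c_in(x XOR y)."""
    p = x ^ y
    return (x & y) | (c_in & p), p ^ c_in


def fa_two_level(x: int, y: int, c_in: int) -> tuple[int, int]:
    """Form b): s as a sum of four minterms, c_out = xy + x c_in + y c_in."""
    nx, ny, nc = 1 - x, 1 - y, 1 - c_in
    s = (nx & ny & c_in) | (nx & y & nc) | (x & ny & nc) | (x & y & c_in)
    c_out = (x & y) | (x & c_in) | (y & c_in)
    return c_out, s


if __name__ == "__main__":
    for x, y, c_in in product((0, 1), repeat=3):
        print(x, y, c_in, fa_xor_form(x, y, c_in), fa_two_level(x, y, c_in))
```

Both functions return (c_out, s) and agree with x + y + c_in = 2 c_out + s for all eight input combinations.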
Alternative implementations (3)

c)
x  y  |  c_out   s
0  0  |  0       $c_{in}$
0  1  |  c_in    $\bar{c}_{in}$
1  0  |  c_in    $\bar{c}_{in}$
1  1  |  1       $c_{in}$
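Alternative implementations (4)
Implementation used to generate fast carry logic
in Xilinx FPGAs

[Figure: carry-logic schematic with labels x, y, A1, A2, XOR, D, a 0/1 multiplexer, C_in, C_out, S, p, g]

x  y  |  c_out
0  0  |  y
0  1  |  c_in
1  0  |  c_in
1  1  |  y

$p = x \oplus y$
$g = y$
$s = p \oplus c_{in} = x \oplus y \oplus c_{in}$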
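The multiplexer-style carry described by p, g and the c_out table can be modelled in a few lines. This is only an illustrative sketch: `fa_mux_carry` is a made-up name, and the model captures the logic function, not the timing of the dedicated carry hardware.

```python
from itertools import product


def fa_mux_carry(x: int, y: int, c_in: int) -> tuple[int, int]:
    """Full adder with a multiplexer-selected carry. Returns (c_out, s)."""
    p = x ^ y                   # propagate: drives the multiplexer select
    g = y                       # value passed to c_out when p = 0 (then x == y)
    c_out = c_in if p else g    # 2-to-1 multiplexer
    s = p ^ c_in                # sum = propagate XOR carry-in
    return c_out, s


if __name__ == "__main__":
    for x, y, c_in in product((0, 1), repeat=3):
        print(x, y, c_in, fa_mux_carry(x, y, c_in))
```

When p = 0 the operands are equal, so passing y (or equally x) to c_out gives 0 for inputs 00 and 1 for inputs 11, which matches the majority function.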

Latency of a k-bit ripple-carry adder

$T = T_{FA}(x, y \to c_{out}) + (k-2)\,T_{FA}(c_{in} \to c_{out}) + T_{FA}(c_{in} \to s)$

$\text{Latency} \approx k\,T_{FA}$

$\text{Latency} \propto k$
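A ripple-carry adder is a chain of k full adders in which each carry-out feeds the next carry-in. The sketch below is a behavioural model with bits listed least-significant first; `ripple_latency` simply evaluates the latency expression for k >= 2 with delay values supplied by the caller. All names and the bit ordering are assumptions made for illustration.

```python
def full_adder(x: int, y: int, c_in: int) -> tuple[int, int]:
    """Return (c_out, s) with x + y + c_in = 2*c_out + s."""
    total = x + y + c_in
    return total >> 1, total & 1


def ripple_carry_add(x_bits, y_bits, c_in=0):
    """Add two equal-length bit lists (LSB first). Returns (sum_bits, c_out)."""
    if len(x_bits) != len(y_bits):
        raise ValueError("operands must have the same number of bits")
    carry = c_in
    s_bits = []
    for x, y in zip(x_bits, y_bits):
        carry, s = full_adder(x, y, carry)  # the carry ripples to the next stage
        s_bits.append(s)
    return s_bits, carry


def ripple_latency(k, t_xy_cout, t_cin_cout, t_cin_s):
    """Worst-case latency of a k-bit ripple-carry adder, for k >= 2."""
    return t_xy_cout + (k - 2) * t_cin_cout + t_cin_s


if __name__ == "__main__":
    # 11 + 5 on 4 bits, LSB first
    print(ripple_carry_add([1, 1, 0, 1], [1, 0, 1, 0]))
    print(ripple_latency(8, 1.0, 1.0, 1.0))
```

With all three full-adder delays set to the same value T_FA, `ripple_latency` returns k times T_FA, which is the approximation stated on the slide.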
Overflow for signed numbers (1)

Indication of overflow

  Positive + Positive = Negative
  Negative + Negative = Positive

Formulas

$\text{Overflow}_{\text{2's complement}} = x_{k-1}\,y_{k-1}\,\bar{s}_{k-1} + \bar{x}_{k-1}\,\bar{y}_{k-1}\,s_{k-1} = c_k \oplus c_{k-1}$

Overflow for signed numbers (2)

x_{k-1}  y_{k-1}  c_{k-1}  |  c_k  s_{k-1}  |  overflow  |  c_k XOR c_{k-1}
0        0        0        |  0    0        |  0         |  0
0        0        1        |  0    1        |  1         |  1
0        1        0        |  0    1        |  0         |  0
0        1        1        |  1    0        |  0         |  0
1        0        0        |  0    1        |  0         |  0
1        0        1        |  1    0        |  0         |  0
1        1        0        |  1    0        |  1         |  1
1        1        1        |  1    1        |  0         |  0
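The two overflow indications, the sign-based rule and the XOR of the two most significant carries, can be compared for every pair of k-bit two's-complement operands. The following is a minimal sketch; the function name and the returned tuple layout are assumptions made for this example.

```python
def add_twos_complement(x: int, y: int, k: int):
    """Add two k-bit two's-complement integers.

    Returns (s, ovf_sign, ovf_carry): s is the k-bit result pattern,
    ovf_sign uses the sign bits of x, y and s, ovf_carry is c_k XOR c_{k-1}.
    """
    mask = (1 << k) - 1
    ux, uy = x & mask, y & mask                 # k-bit patterns of x and y
    low = (1 << (k - 1)) - 1
    c_km1 = ((ux & low) + (uy & low)) >> (k - 1)  # carry into the sign position
    total = ux + uy
    c_k = total >> k                            # carry out of the sign position
    s = total & mask
    xs, ys, ss = ux >> (k - 1), uy >> (k - 1), s >> (k - 1)
    ovf_sign = (xs & ys & (1 - ss)) | ((1 - xs) & (1 - ys) & ss)
    ovf_carry = c_k ^ c_km1
    return s, ovf_sign, ovf_carry


if __name__ == "__main__":
    print(add_twos_complement(5, 6, 4))    # positive + positive on 4 bits
    print(add_twos_complement(-3, 2, 4))   # mixed signs
```

Both flags are set exactly when the true sum falls outside the range from -2^(k-1) to 2^(k-1) - 1.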
Xilinx FPGA Devices

Technology    Low-cost     High-performance
120/150 nm                 Virtex 2, 2 Pro
90 nm         Spartan 3    Virtex 4
65 nm                      Virtex 5
45 nm         Spartan 6
40 nm                      Virtex 6

Altera FPGA Devices

Technology    Low-cost       Mid-range    High-performance
130 nm        Cyclone                     Stratix
90 nm         Cyclone II                  Stratix II
65 nm         Cyclone III    Arria I      Stratix III
40 nm         Cyclone IV     Arria II     Stratix IV
23 ECE 448 FPGA and ASIC Design with VHDL
General structure of an FPGA
[Figure: programmable logic blocks connected by programmable interconnect.
Source: The Design Warrior's Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043.
Copyright 2004 Mentor Graphics Corp. (www.mentor.com)]
Configurable logic block (CLB)
[Figure: array of CLBs; each CLB is drawn as four slices, and each slice contains two logic cells.
Source: The Design Warrior's Guide to FPGAs: Devices, Tools, and Flows. ISBN 0750676043.
Copyright 2004 Mentor Graphics Corp. (www.mentor.com)]
Xilinx Spartan 3 FPGAs
CLB Structure
CLB Slice Structure
Each slice contains two sets of the following:
  - Four-input LUT
      - Any 4-input logic function,
      - or 16-bit x 1 sync RAM (SLICEM only)
      - or 16-bit shift register (SLICEM only)
  - Carry & Control
      - Fast arithmetic logic
      - Multiplier logic
      - Multiplexer logic
  - Storage element
      - Latch or flip-flop
      - Set and reset
      - True or inverted inputs
      - Sync. or async. control
LUT (Look-Up Table) Functionality

Look-Up Tables are the primary elements for logic implementation.
Each LUT can implement any function of 4 inputs.

[Figure: a logic function y of inputs x1, x2, x3, x4 and a logic function y of inputs x1, x2, each mapped onto a LUT with inputs x1, x2, x3, x4 and output y]

Example LUT contents:

x1 x2 x3 x4  |  y (function of x1..x4)  |  y (function of x1, x2)
0  0  0  0   |  0                       |  1
0  0  0  1   |  1                       |  1
0  0  1  0   |  0                       |  1
0  0  1  1   |  0                       |  1
0  1  0  0   |  0                       |  1
0  1  0  1   |  1                       |  1
0  1  1  0   |  0                       |  1
0  1  1  1   |  1                       |  1
1  0  0  0   |  0                       |  1
1  0  0  1   |  1                       |  1
1  0  1  0   |  0                       |  1
1  0  1  1   |  0                       |  1
1  1  0  0   |  1                       |  0
1  1  0  1   |  1                       |  0
1  1  1  0   |  0                       |  0
1  1  1  1   |  0                       |  0
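A 4-input LUT is a 16-entry memory addressed by its inputs, so any truth-table column of 16 bits defines a function. The sketch below models this behaviour; `make_lut4` and the row ordering (x1 as the most significant address bit, rows listed from 0000 to 1111) are assumptions chosen for this example.

```python
def make_lut4(truth_column):
    """Build a 4-input function from 16 output bits.

    truth_column lists y for x1 x2 x3 x4 = 0000, 0001, ..., 1111.
    """
    table = tuple(truth_column)
    if len(table) != 16 or any(bit not in (0, 1) for bit in table):
        raise ValueError("a 4-input LUT needs exactly 16 bits")

    def lut(x1: int, x2: int, x3: int, x4: int) -> int:
        address = (x1 << 3) | (x2 << 2) | (x3 << 1) | x4
        return table[address]

    return lut


# A column of twelve 1s followed by four 0s ignores x3 and x4
# and behaves as NAND(x1, x2).
nand_x1_x2 = make_lut4([1] * 12 + [0] * 4)

if __name__ == "__main__":
    print(nand_x1_x2(1, 1, 0, 1), nand_x1_x2(0, 1, 1, 1))
```

Changing only the 16 stored bits changes the implemented function; the surrounding wiring stays the same.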
[Figure: SLICE block diagram. Two Look-Up Tables (inputs G4-G1 and F4-F1, output O), each followed by Carry & Control Logic and a storage element (D, Q, CK, EC, S, R). Other signals: CIN, COUT, CLK, CE, SR, BY, F5IN. Outputs: Y, YB, X, XB.]
Carry & Control Logic

x  y  |  COUT
0  0  |  y
0  1  |  CIN
1  0  |  CIN
1  1  |  y

$\text{Propagate} = x \oplus y$
$\text{Generate} = y$
$\text{Sum} = \text{Propagate} \oplus \text{CIN} = x \oplus y \oplus \text{CIN}$
Carry & Control Logic in Xilinx FPGAs
Carry & Control Logic in Spartan 3 FPGAs
[Figure: carry and control logic split between the LUT and hardwired (fast) logic]
Simplified View of Spartan-3 FPGA Carry and Arithmetic Logic in One Logic Cell
Simplified View of Carry Logic in One Spartan 3 Slice
Critical Path for an
Xilinx Spartan 3 FPGAs
Number and Length of Carry Chains
for Spartan 3 FPGAs
Bottom Operand Input to Carry Out Delay
  $T_{OPCYF}$ = 0.9 ns for Spartan 3
Carry Propagation Delay
  $t_{BYP}$ = 0.2 ns for Spartan 3
Carry Input to Top Sum Combinational Output Delay
  $T_{CINY}$ = 1.2 ns for Spartan 3
Critical Path Delays and Maximum Clock Frequencies
(taking into account surrounding registers)

Major Differences between Xilinx Families

                                 Spartan 3,    Virtex 5, Virtex 6,
                                 Virtex 4      Spartan 6
Number of CLB slices per CLB     4             2
Number of LUTs per CLB slice     2             4
Look-Up Tables                   4-input       6-input
stages per CLB slice             2             4
Altera Cyclone III
Logic Element (LE) Normal Mode
Altera Cyclone III
Logic Element (LE) Arithmetic Mode
Altera Stratix III, Stratix IV
Adaptive Logic Modules (ALM) Normal Mode
Altera Stratix III, Stratix IV
Adaptive Logic Modules (ALM) Arithmetic Mode
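Bit-serial
[Figure: bit-serial adder with inputs x_i and y_i, output s_i, carry c_{i+1} fed back through a register clocked by clk, and initial carry c_0 loaded on start]

Digit-serial
[Figure: digit-serial adder with d-bit inputs x_i and y_i, d-bit output s_i, carry c_{i+1} fed back through a register clocked by clk, and initial carry c_0 loaded on start]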
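Serial adders process one bit, or one d-bit digit, per clock cycle and keep the carry in a register between cycles. The following behavioural sketch models each clock cycle as one loop iteration; the function names and the least-significant-first ordering of the input streams are assumptions for illustration.

```python
def bit_serial_add(x_bits, y_bits, c0=0):
    """Add two bit streams (LSB first), one bit per clock cycle."""
    carry = c0                      # carry register, loaded with c_0 at start
    s_bits = []
    for x_i, y_i in zip(x_bits, y_bits):
        total = x_i + y_i + carry
        s_bits.append(total & 1)    # s_i appears at the output in this cycle
        carry = total >> 1          # c_{i+1} is stored at the clock edge
    return s_bits, carry


def digit_serial_add(x_digits, y_digits, d, c0=0):
    """Add two streams of d-bit digits (least significant digit first)."""
    carry = c0
    digit_mask = (1 << d) - 1
    s_digits = []
    for x_i, y_i in zip(x_digits, y_digits):
        total = x_i + y_i + carry
        s_digits.append(total & digit_mask)
        carry = total >> d          # single carry bit between digits
    return s_digits, carry


if __name__ == "__main__":
    print(bit_serial_add([1, 0, 1], [1, 1, 0]))          # 5 + 3, 3 bits
    print(digit_serial_add([0xB, 0x1], [0x7, 0x2], 4))   # 0x1B + 0x27, d = 4
```

A k-bit addition takes k cycles in the bit-serial version and k/d cycles in the digit-serial version, at the cost of a d-bit adder per cycle.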

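[Figure: addition of a variable and a constant]

  variable      x_{k-1}  x_{k-2}  . . .  x_1  x_0
  constant   +  y_{k-1}  y_{k-2}  . . .  y_1  y_0
                s_{k-1}  s_{k-2}  . . .  s_1  s_0

  variable      x_{k-1}  x_{k-2}  . . .  x_{h+1}  x_h          x_{h-1}  . . .  x_0
  constant   +  y_{k-1}  y_{k-2}  . . .  y_{h+1}  1            0        . . .  0
                s_{k-1}  s_{k-2}  . . .  s_{h+1}  $\bar{x}_h$  x_{h-1}  . . .  x_0

[Figure: chain of HA/MHA cells for positions h+1 to k-1, with inputs x_{h+1}, x_{h+2}, . . ., x_{k-2}, x_{k-1}, outputs s_{h+1}, s_{h+2}, . . ., s_{k-2}, s_{k-1} and carry-out c_k. Bits x_0, . . ., x_{h-1} pass through unchanged and position h outputs $\bar{x}_h$. The cell at position i is an HA if y_i = 0 and an MHA if y_i = 1.]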
MHA

$x + y + 1 = (c\ s)_2$   (c has weight 2, s has weight 1)

x  y  |  c  s
0  0  |  0  1
0  1  |  1  0
1  0  |  1  0
1  1  |  1  1
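Adding a known constant to a variable needs only two-input cells: an HA where the constant bit is 0 and an MHA where it is 1, with the second cell input taken from the carry of the previous position. The sketch below models such a chain under these assumptions; `ha`, `mha` and `add_constant` are illustrative names, and bits are listed least-significant first.

```python
def ha(x: int, c: int) -> tuple[int, int]:
    """Half adder: x + c = 2*c_out + s. Returns (c_out, s)."""
    total = x + c
    return total >> 1, total & 1


def mha(x: int, c: int) -> tuple[int, int]:
    """MHA cell: x + c + 1 = 2*c_out + s. Returns (c_out, s)."""
    total = x + c + 1
    return total >> 1, total & 1


def add_constant(x_bits, y_bits):
    """Add the constant y to the variable x (both LSB first).

    Position i uses an HA when y_i = 0 and an MHA when y_i = 1.
    Returns (sum_bits, c_k).
    """
    carry = 0
    s_bits = []
    for x_i, y_i in zip(x_bits, y_bits):
        cell = mha if y_i else ha
        carry, s_i = cell(x_i, carry)
        s_bits.append(s_i)
    return s_bits, carry


if __name__ == "__main__":
    # x = 0b0110 plus the constant 0b0100, LSB first
    print(add_constant([0, 1, 1, 0], [0, 0, 1, 0]))
```

Below the lowest 1 of the constant, every cell is an HA with carry-in 0, so those bits pass through unchanged; at the lowest 1 the MHA with carry-in 0 outputs the complement of the input bit and forwards the input bit as carry. In hardware those cells can therefore be replaced by wires and one inverter.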
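Incrementer
[Figure: chain of HA cells with inputs x_1, x_2, . . ., x_{k-2}, x_{k-1} and outputs s_1, s_2, . . ., s_{k-2}, s_{k-1}, c_k. The chain is driven by x_0, and position 0 outputs $\bar{x}_0$.]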
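Decrementer
[Figure: chain of MHA cells with inputs x_1, x_2, . . ., x_{k-2}, x_{k-1} and outputs s_1, s_2, . . ., s_{k-2}, s_{k-1}, c_k. The chain is driven by x_0, and position 0 outputs $\bar{x}_0$.]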
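The incrementer and decrementer chains can be modelled directly: position 0 outputs the complement of x_0 and passes x_0 on as the first carry, after which the incrementer uses HA cells and the decrementer uses MHA cells (subtracting 1 is treated here as adding the all-ones constant). This is a sketch under those assumptions, with bits listed least-significant first and hypothetical function names.

```python
def incrementer(x_bits):
    """Compute x + 1 with a chain of half adders. Returns (sum_bits, c_k)."""
    s_bits = [1 - x_bits[0]]        # s_0 is the complement of x_0
    carry = x_bits[0]               # carry into position 1 is x_0
    for x_i in x_bits[1:]:
        s_bits.append(x_i ^ carry)  # HA sum
        carry = x_i & carry         # HA carry
    return s_bits, carry


def decrementer(x_bits):
    """Compute x - 1 as x + 11...1 with a chain of MHA cells."""
    s_bits = [1 - x_bits[0]]        # s_0 is the complement of x_0
    carry = x_bits[0]               # carry into position 1 is x_0
    for x_i in x_bits[1:]:
        total = x_i + carry + 1     # MHA: x + c + 1
        s_bits.append(total & 1)
        carry = total >> 1
    return s_bits, carry


if __name__ == "__main__":
    print(incrementer([1, 1, 0, 0]))   # 3 + 1 on 4 bits
    print(decrementer([0, 0, 1, 0]))   # 4 - 1 on 4 bits
```

In the decrementer, c_k = 0 occurs only for x = 0, where the result wraps around to the all-ones pattern.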
Possible solutions to the
carry propagation problem
1. Detect the end of propagation rather than wait for
   the worst-case time
2. Speed up propagation via
     carry skip
     carry select, etc.
3. Limit carry propagation to within a small number of bits
4. Eliminate carry propagation through a redundant
   number representation
Analysis of carry propagation

Probability of carry generation   = $P(x_i y_i = 11) = \frac{1}{4}$
Probability of carry propagation  = $P(x_i y_i = 01 \text{ or } 10) = \frac{1}{2}$
Probability of carry annihilation = $P(x_i y_i = 00 \text{ or } 11) = \frac{1}{2}$
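Position:   j     j-1   . . . . .   i+1   i
x:          1     0     1     0     1     1
y:          1     1     0     1     0     1
            11 or 00 at position j; 01 or 10 at positions i+1 to j-1

Probability of carry propagating from position i to position j

$= \left(\frac{1}{2}\right)^{j-1-i} \cdot \frac{1}{2} = \left(\frac{1}{2}\right)^{j-i}$

where $\left(\frac{1}{2}\right)^{j-1-i}$ is the probability of propagation through positions i+1 to j-1 and $\frac{1}{2}$ is the probability of annihilation at position j.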
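The value (1/2)^(j-i) can be confirmed by brute-force enumeration of the operand bits in positions i+1 to j, assuming independent and uniformly random bits. The sketch below counts the bit patterns in which every intermediate position propagates (01 or 10) and position j stops the chain (00 or 11); the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def prob_chain_ends_at(distance: int) -> Fraction:
    """Probability that a carry entering position i+1 travels exactly
    to position j, where distance = j - i >= 1."""
    bit_pairs = list(product((0, 1), repeat=2))   # (x, y) at one position
    favourable = 0
    total = 0
    # one (x, y) pair for each of the positions i+1, ..., j
    for pairs in product(bit_pairs, repeat=distance):
        total += 1
        *middle, last = pairs
        propagates = all(x != y for x, y in middle)  # 01 or 10
        stops = last[0] == last[1]                   # 00 or 11
        if propagates and stops:
            favourable += 1
    return Fraction(favourable, total)


if __name__ == "__main__":
    for d in range(1, 6):
        print(d, prob_chain_ends_at(d))
```

Exact fractions are used instead of floating point so that the result can be compared with (1/2)^(j-i) without rounding concerns.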
Expected length of the carry chain
that starts at position i (1)

$\text{Expected length}(i, k) = \sum_{j=i+1}^{k-1} (j-i)\left(\frac{1}{2}\right)^{j-i} + (k-i)\left(\frac{1}{2}\right)^{k-1-i}$
Length
of the
carry chain
Probability
of the given
length
Probability
of propagation
till the end of