VLSI Implementation of Floating Point Adder

OBJECTIVE:
To design an IEEE 754/854 floating-point adder with
reduced area and delay by using a carry cut-back
adder for mantissa addition, and to implement it on
an FPGA.
INTRODUCTION:
FLOATING POINT NUMBERS:
• A floating-point number has no fixed number of digits before and after the decimal point.
• A real number in scientific notation has three components: sign, exponent and mantissa.
• One of the challenges in floating-point programming is ensuring that approximation still leads to reasonable results.
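The three components can be inspected directly. The following is a minimal Python sketch, assuming the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits); the helper name `decompose` is hypothetical and not part of the design.

```python
import struct

def decompose(x):
    """Return (sign, biased exponent, 23-bit fraction) of x as a float32."""
    # Reinterpret the 4 bytes of the float32 as an unsigned 32-bit integer.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction

sign, exponent, fraction = decompose(7.375)
print(sign, format(exponent, "08b"), format(fraction, "023b"))
```

The hidden bit of a normal number is not stored, so the fraction field holds only the 23 bits after the binary point.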
FLOATING POINT ADDITION:
• Floating-point addition is a complex operation on operands represented in scientific notation as sign, exponent and mantissa.
• Before the addition, the exponents of the two operands must be compared and the mantissas aligned.
• It is more difficult than multiplication because the mantissas must be aligned before they are added.
• Reducing latency and area is the main focus for improving performance.
ADDERS
CLA
  Advantages: fast; reduced carry propagation time
  Disadvantages: speed will drop with increase in bit size
Carry-skip
  Advantages: fast adder; reduces the path delay
  Disadvantages: increased complexity
Carry-save
  Advantages: no propagation delay; high clock speed; higher throughput
  Disadvantages: the result of the addition is not directly known, which is a drawback when implementing modular multiplication
Compound
  Advantages: computes both the sum and the incremented sum; faster
  Disadvantages: more area
CSA
  Advantages: low delay compared to RCA; speed of addition is limited by time
  Disadvantages: costly; more area
Parallel-prefix
  Advantages: efficient in designing; faster in operation
  Disadvantages: more area and power
ADDERS
APPROXIMATE ADDER
Speculative
  Advantages: reduces area and power; improves speed
  Disadvantages: moderate accuracy
Segmented
  Advantages: efficient design; performs faster
  Disadvantages: inaccurate error rate
Carry select
  Advantages: consumes low power; area efficient
  Disadvantages: more complex
Approximate full
  Advantages: occupies reduced area; power efficient
  Disadvantages: operation is slow since it is concentrated on the LSB part
LITERATURE SURVEY
Each entry lists the adder, the technique used, and the reported advantages and disadvantages.

Carry save adder [1,3]
  Technique used: early normalization
  Advantages: overhead is reduced; adder size is reduced by half; reduced latency [35%]
  Disadvantages: earlier calculation is needed

Parallel prefix adder [13,11]
  Technique used: dual reduction
  Advantages: eliminates complementation; reduces latency [35%]
  Disadvantages: applicable for three-term floating point adder

Compound adder [4: parallel prefix + carry look-ahead adder]
  Technique used: far and near path
  Advantages: speeds up the addition; low latency [37%]
  Disadvantages: increased area [13.5%]
  Technique used: compound addition
  Advantages: fast rounding; delay of rounding is hidden
  Disadvantages: carry propagation; overflow; rounding position is changed

Compound adder [14,20: ripple carry adder + carry save adder]
  Technique used: leading one predictor
  Advantages: adds parallelism to the design; latency is reduced [6.5%]
  Disadvantages: increased area [38%]

Compound adder [19: carry look-ahead + carry save adder], [28: carry look-ahead adder + ripple carry adder]
  Technique used: pipelining
  Advantages: improves the throughput [2.7 times]; reduced latency [8.2%]
  Disadvantages: more area [23%]; power consumption [5-8%]

Compound adder [7: parallel prefix + carry save adder]
  Technique used: dual path approach (R path and N path)
  Advantages: high speed; conversion step can be skipped; no rounding is required; simpler and easy to implement
  Disadvantages: occupies more area [42%]

Carry look-ahead adder [22]
  Technique used: embedded shifter
  Advantages: enhances floating point performance; area savings [14.6%]; clock rate increases [3.3%]
  Disadvantages: probability of wasted space

Carry look-ahead adder [5,7,15]
  Technique used: sign embedding
  Advantages: reduced latency [2.99 ns]
  Disadvantages: carry propagation; asymmetry of digit set
  Technique used: shared exponent difference logic; alignment and normalization path
  Advantages: improves the speed; high performance
  Disadvantages: latency of alignment shift is significant; lengthy normalization shift is necessary

Approximate full adder [27]
  Advantages: small area and less delay (31.236 ns); higher performance
  Disadvantages: in-exact adder produces an approximation error that can be tolerated

Approximate speculative adder [26]
  Technique used: gate level pruning
  Advantages: overcomes the area problem; time consumption of the addition process is reduced; reduces delay [11.88%]
  Disadvantages: approximate adder causes a small error due to the in-exact sum, which can be tolerated
SUMMARY
• From the literature survey, pipelining and dual-path methods occupy more area but give a significant reduction in latency.
• With early normalization and sign embedding, both adder size and latency are reduced.
• With dual reduction, latency is reduced, and the technique is applicable to three-term adders.
• The leading-one predictor reduces the delay because it operates in parallel with the significand addition.
• The far and near path design exhibits the smallest latency of the three-term adders, because a shifter is eliminated from the critical path.
PROBLEMS IDENTIFIED:
• The problems found in floating-point adders are complex exponent processing and significand alignment, complementation after significand addition, and complex round processing.
• Pipelining or a dual-path approach can be used to reduce the latency, but these techniques need extra hardware and hence increase the area.
• Previously, a compound adder built from carry look-ahead and ripple carry adders was used to reduce area and improve performance.
• Replacing the compound adder with an approximate adder provides better performance.
BLOCK DIAGRAM:
Inputs: IN1 (S1, E1, M1) and IN2 (S2, E2, M2), obtained by data extraction.
1. Swapping: exponent comparator; swap IN1, IN2
2. Pre-normalization: exponent difference (R-shift-amt); right shifter
3. Mantissa addition: LOP; add (carry cut-back adder)
4. Post-normalization: left shifter
5. Rounding: rounding unit; output
FLOW CHART:
1. Start. Enter N1 and N2 in floating-point format.
2. Is E1 or E2 = 0? If yes, set S23 (i.e. the hidden bit) of N1 or N2 to 0.
3. Is E2 > E1? If yes, swap N1 and N2.
4. Calculate the difference d = E1 - E2.
5. Shift S2 of N2 to the right by d and fill the vacated MSBs with zeros. The shift amount d is added to the exponent of N2.
6. Do N1 and N2 have different signs?
   No (same sign):
   • Compute significand S = S1 + S2. Sign = sign of N1 or N2.
   • Carry out: add '1' to the exponent and shift the overall result right, dropping the LSB and making the MSB '1'.
   • No carry out: the previous exponent is the real exponent.
   Yes (different signs):
   • Replace S2 of N2 by its 2's complement and compute significand S = S1 + S2. Sign = sign of the larger number (N1 or N2).
   • Carry out: discard the carry.
   • No carry out and MSB is '1': replace S by its 2's complement.
   • Shift the result left until there is a '1' at the MSB, filling the LSB with zeros. The amount of shifting is subtracted from the exponent to produce the result exponent.
7. Assemble the result into 32-bit format.
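The flow above can be sketched in software. This is a simplified Python model, not the hardware design: it assumes normal (non-subnormal) float32 operands, no exponent overflow or underflow, and truncation instead of rounding. It also swaps on magnitude and subtracts directly, which gives the same result as the 2's complement path of the flow chart. The names `unpack_f32` and `fp_add` are hypothetical.

```python
import struct

def unpack_f32(x):
    """Split x into (sign, biased exponent, 24-bit significand)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exp = (bits >> 23) & 0xFF
    sig = bits & 0x7FFFFF
    if exp != 0:
        sig |= 1 << 23  # restore the hidden bit of a normal number
    return sign, exp, sig

def fp_add(x, y):
    s1, e1, m1 = unpack_f32(x)
    s2, e2, m2 = unpack_f32(y)
    # Swap so that operand 1 has the larger magnitude.
    if (e2, m2) > (e1, m1):
        s1, e1, m1, s2, e2, m2 = s2, e2, m2, s1, e1, m1
    m2 >>= e1 - e2  # align: right shift, shifted-out bits are truncated
    if s1 == s2:
        total = m1 + m2
        if total >> 24:  # carry out: shift right and increment the exponent
            total >>= 1
            e1 += 1
    else:
        total = m1 - m2  # same value as adding the 2's complement of m2
        if total == 0:
            return 0.0
        while not total >> 23:  # shift left until the MSB is 1
            total <<= 1
            e1 -= 1
    bits = (s1 << 31) | (e1 << 23) | (total & 0x7FFFFF)
    return struct.unpack(">f", struct.pack(">I", bits))[0]

print(fp_add(7.375, 2.28125))
```

Because the sketch truncates, it matches exact addition only when no non-zero bits are shifted out during alignment.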
FLOATING POINT ADDER
Steps involved in floating point adder are:
1. Swapping
2. Pre-normalization
3. Mantissa addition
4. Post-normalization
5. Rounding
[Figure: floating point adder datapath. Inputs: Exp A, Exp B, Mant A, Mant B. Blocks: exponent difference, swap (Exp B > Exp A), larger exponent, pre-alignment by the exponent difference (g, r, s bits), adder, LOP, normalization shifter (g, r, s bits), exponent adjust (subtract), rounding unit (r, s bits), exponent increment on overflow. Outputs: Exp result, Mant result.]
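GENERAL BLOCK DIAGRAM OF CARRY CUT-BACK ADDER DESIGN:
[Figure: the N-bit operands A and B are split into X-bit blocks, from A(N-1)..A(N-X), B(N-1)..B(N-X) down to A(X-1)..A(0), B(X-1)..B(0). Each block feeds an ADD unit that produces the corresponding sum bits, from S(N-1)..S(N-X) down to S(X-1)..S(0). PROP and SPEC blocks sit between the ADD units and are linked by CUT signals.]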
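24-BIT CARRY CUT-BACK ADDER DESIGN
[Figure: inputs a, b feed three adders. ADD (12 bit) produces S23-S12 (MSB part), ADD (8 bit) produces S11-S4, and ADD (4 bit) produces S3-S0 (LSB part). PROP and SPEC blocks sit between the adders. Labels shown: CARRY = 0 with CUT = 0, CARRY = 1 with CUT = 1.]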
ALGORITHM FOR CARRY CUT-BACK ADDER
[Figure labels: prop and spec with Cut = 1, input guess; carry marks 1 1 1 1]
Operands:      0100110010111 11011111 1111
               0000000000101 00000000 0100
In-exact sum:  0100110011100 11011111 0011
The carry out of the LSB block (1111 + 0100) is cut and guessed as 0, so the middle block stays 11011111 instead of becoming 11100000.
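The example above can be reproduced with a simplified software model. This Python sketch only models the effect of a cut: at each flagged block boundary the real carry is replaced by a guessed value. The conditions under which the PROP block actually raises CUT are not modeled, and the function name `cut_back_add` and its parameters are hypothetical.

```python
def cut_back_add(a, b, widths, cuts, guess=0):
    """Add a and b block by block, LSB block first.

    widths: block widths, LSB block first.
    cuts[i]: True if the carry out of block i is cut and replaced by `guess`.
    """
    result, carry, shift = 0, 0, 0
    for i, width in enumerate(widths):
        mask = (1 << width) - 1
        block = ((a >> shift) & mask) + ((b >> shift) & mask) + carry
        result |= (block & mask) << shift
        carry = block >> width
        if i < len(cuts) and cuts[i]:
            carry = guess  # cut the carry chain and speculate the carry
        shift += width
    return result

# Operands from the slide, split as 13 + 8 + 4 bits.
a = int("0100110010111" "11011111" "1111", 2)
b = int("0000000000101" "00000000" "0100", 2)
inexact = cut_back_add(a, b, [4, 8, 13], [True, False])
exact = cut_back_add(a, b, [4, 8, 13], [False, False])
print(format(inexact, "025b"), exact - inexact)
```

With no cut active the model degenerates to an ordinary block-wise ripple addition, which makes it easy to compare the in-exact and exact sums.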
MANTISSA ADDITION BY APPROXIMATE
ADDER USING CARRY CUT-BACK
TECHNIQUE:
• An approximate adder architecture called the carry cut-back adder is used for mantissa addition.
• It includes propagation (PROP) and speculation (SPEC) blocks: when the carry-out is '1', the propagation block cuts the carry and the carry-out is speculated as '0' using multiplexers.
• The high-significance carry stages are monitored in order to cut the carry propagation chain at lower-significance positions.
• The in-exact sum causes only a small error, which is tolerable.
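A rough bound shows why the error is tolerable. The derivation below is a sketch under stated assumptions: a single cut at bit position $k$, a guessed carry of 0, same-sign operands, and a normalized 24-bit larger mantissa (so the exact sum is at least $2^{23}$). The symbols $S_{\text{exact}}$, $S_{\text{inexact}}$ and $c_k$ are introduced here for illustration.

```latex
% c_k is the true carry into bit position k that the cut discards.
S_{\text{exact}} - S_{\text{inexact}} = c_k \cdot 2^{k}, \qquad c_k \in \{0, 1\}

% Relative error for a normalized 24-bit mantissa sum.
\frac{\lvert S_{\text{exact}} - S_{\text{inexact}} \rvert}{S_{\text{exact}}}
  \le \frac{2^{k}}{2^{23}} = 2^{\,k-23}

% Cut between the 4-bit LSB block and the next block (k = 4).
2^{\,4-23} = 2^{-19} \approx 1.9 \times 10^{-6}
```

Placing the cut closer to the LSB lowers this bound, which is consistent with cutting the chain at lower-significance positions.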
ALGORITHM:
STEP 1: Enter N1 and N2 (operands)
STEP 2: Data extraction & exception check
N1 → S1, E1, M1: N1 = 2.3 = 0 10000000 100100100000000000000000
N2 → S2, E2, M2: N2 = 7.4 = 0 10000001 111011000000000000000000
(M is shown as 24 bits including the hidden bit. The mantissas are truncated for illustration, so the patterns represent 2.28125 and 7.375.)
• Check for infinity, subnormals and NaN (not a number)
• Update the hidden bit of the mantissa for subnormals
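The exception check of Step 2 can be sketched as follows, assuming the standard IEEE 754 single-precision encoding (exponent field all ones for infinity and NaN, all zeros for zero and subnormals). The helper names `f32_bits` and `classify` are hypothetical.

```python
import struct

def f32_bits(x):
    """Return the 32-bit pattern of x stored as a float32."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def classify(bits):
    """Classify a 32-bit pattern by its exponent and fraction fields."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0xFF:
        return "nan" if fraction else "infinity"
    if exponent == 0:
        # Hidden bit is 0 for these cases.
        return "subnormal" if fraction else "zero"
    return "normal"  # hidden bit is 1

print(classify(f32_bits(2.3)), classify(f32_bits(1e-40)))
```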
STEP 3,4: Compare, swap & dynamic right shift:
Swap N1 & N2 because E2 > E1 (i.e.) 10000001 > 10000000
After swapping, new N1 = 0 10000001 111011000000000000000000
                new N2 = 0 10000000 100100100000000000000000
Exponent difference 'd' = {E1} - {E2} after swapping, i.e. 10000001 - 10000000 = 00000001 (1 bit)
Right shift {M2} of N2 by 'd', i.e. N2 = 0 10000000 010010010000000000000000
• d + {E2} gives the new {E2} of N2, i.e. 00000001 (1) + 10000000 = 10000001
STEP 5: Pre-normalization: now E1 = E2, i.e. 10000001 = 10000001
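Steps 3 to 5 can be traced with a short Python sketch using the operand patterns of this example. It assumes 24-bit mantissas with the hidden bit included and simply drops the bits shifted out during alignment; the function name `swap_and_align` is hypothetical.

```python
def swap_and_align(e1, m1, e2, m2):
    """Return (common exponent, larger-exponent mantissa, aligned mantissa, d)."""
    if e2 > e1:
        # Swap so that operand 1 carries the larger exponent.
        e1, m1, e2, m2 = e2, m2, e1, m1
    d = e1 - e2
    return e1, m1, m2 >> d, d  # right shift fills the vacated MSBs with zeros

exp, m_large, m_small, d = swap_and_align(
    0b10000000, 0b100100100000000000000000,   # N1 of the example
    0b10000001, 0b111011000000000000000000)   # N2 of the example
print(d, format(exp, "08b"), format(m_small, "024b"))
```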
STEP 6: Use Leading One Predictor (LOP)
The LOP predicts the position of the leading one of the result from M1 and M2. It operates in parallel with the mantissa addition.
STEP 7: Mantissa addition
1) Sign (+) calculation for the final output
2) Perform the desired operation (+)
Addition: M1 + M2
(i.e.) M1 + M2 = 1 001101010000000000000000 (the leading 1 is the carry)
Add '1' to {E} = 10000010 and shift the sum right by one bit: M = 100110101000000000000000
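The addition of Step 7 can be checked with exact integer arithmetic. This Python sketch uses an exact adder in place of the carry cut-back adder and assumes 24-bit mantissas with the hidden bit included; the function name `add_mantissas` is hypothetical.

```python
WIDTH = 24  # mantissa width including the hidden bit

def add_mantissas(m1, m2_aligned, exp):
    """Add aligned mantissas; on carry-out shift right and increment exp."""
    total = m1 + m2_aligned
    if total >> WIDTH:   # carry out of the 24-bit adder
        total >>= 1      # drop the LSB, the carry becomes the new MSB
        exp += 1
    return total, exp

mant, exp = add_mantissas(0xEC0000, 0x490000, 0b10000001)
print(format(mant, "024b"), format(exp, "08b"))
```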
STEP 8: Post-normalization:
• Normalizing left shift (needed when the sum has leading zeros)
STEP 9: Rounding
• Ensures that the result of the operation fits into the available bits.
STEP 10: Finalizing output
• Exceptional cases: update exponent and mantissa
• Assemble S, E & M
Result is {S, E, M} = 0 10000010 00110101000000000000000 (23-bit fraction after dropping the hidden bit), i.e. 9.65625
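Steps 9 and 10 can be sketched as follows. The slides do not state the rounding mode, so round-to-nearest-even (the IEEE 754 default) is assumed here, applied to a significand that carries extra low-order bits such as guard, round and sticky. The names `round_nearest_even` and `assemble` are hypothetical.

```python
import struct

def round_nearest_even(sig, extra_bits):
    """Drop `extra_bits` (>= 1) low-order bits, rounding to nearest, ties to even."""
    kept = sig >> extra_bits
    remainder = sig & ((1 << extra_bits) - 1)
    half = 1 << (extra_bits - 1)
    if remainder > half or (remainder == half and kept & 1):
        kept += 1  # may overflow the mantissa; the caller must renormalize
    return kept

def assemble(sign, exp, mant24):
    """Pack sign, biased exponent and 24-bit mantissa into a float32 value."""
    bits = (sign << 31) | (exp << 23) | (mant24 & 0x7FFFFF)  # drop hidden bit
    return struct.unpack(">f", struct.pack(">I", bits))[0]

print(round_nearest_even(0b1011100, 3), assemble(0, 0b10000010, 0x9A8000))
```

In the worked example no bits are lost, so rounding leaves the mantissa unchanged and only the assembly step matters.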
RESULTS
CARRY CUT-BACK ADDER RESULTS
RTL SCHEMATIC OF CARRY CUT-BACK ADDER:
TECHNOLOGY SCHEMATIC OF CARRY CUT-BACK ADDER:
DEVICE UTILIZATION FOR 24-BIT RCA+CLA
LOGIC UTILIZATION        USED   AVAILABLE   UTILIZATION
No. of Slices            23     93120       1%
No. of slice LUT         25     46560       1%
No. of occupied slices   25     11640       1%
No. of bonded IOBs       96     240         40%
DELAY: 6.639 ns   POWER: 1.202 W
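DEVICE UTILIZATION FOR 24-BIT CLA
LOGIC UTILIZATION        USED   AVAILABLE   UTILIZATION
No. of Slices            23     93120       1%
No. of slice LUT         34     46560       1%
No. of occupied slices   12     11640       1%
No. of bonded IOBs       74     240         30%
DELAY: 6.681 ns   POWER: 1.293 W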
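DEVICE UTILIZATION FOR 24-BIT CCBA:
LOGIC UTILIZATION        USED   AVAILABLE   UTILIZATION
No. of Slices            23     93120       1%
No. of slice LUT         146    46560       1%
No. of occupied slices   23     146         15%
No. of bonded IOBs       73     240         30%
DELAY: 1.606 ns   POWER: 0.042 W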
SINGLE PRECISION FLOATING POINT ADDITION
i) Swapping case (exp b > exp a)
ii) Non-swapping case (exp a > exp b)
iii) Leading one prediction
DESIGN OF FLOATING POINT ADDER
i) With swapping case
ii) Without swapping case
RTL SCHEMATIC OF FLOATING POINT ADDER:
TECHNOLOGY SCHEMATIC OF FLOATING POINT ADDER:
DEVICE UTILIZATION FOR FLOATING POINT
ADDER DESIGN USING CARRY CUT-BACK ADDER:
LOGIC UTILIZATION         USED   AVAILABLE   UTILIZATION
No. of Slices             886    960         92%
No. of slice flip flops   448    1920        24%
No. of 4 input slices     1589   1920        82%
No. of bonded IOBs        631    66          956%
COMBINATIONAL PATH DELAY = 11.808 ns
DEVICE UTILIZATION FOR FLOATING POINT
ADDER DESIGN IN HIGHER ORDER FAMILIES OF
FPGA
LOGIC UTILIZATION               VIRTEX 6          VIRTEX 7           ARTIX 7
No. of Slices                   400/93120 (1%)    387/408000 (1%)    389/126800 (1%)
No. of 4 input slices           723/46560 (1%)    727/204000 (1%)    728/63400 (1%)
No. of slice flip flops         292/831 (35%)     293/821 (35%)      294/823 (35%)
No. of bonded IOBs              631/240 (262%)    631/600 (105%)     631/210 (300%)
Minimum period (ns)             2.450             2.283              2.951
Maximum frequency (MHz)         408.105           437.92             338.816
Combinational path delay (ns)   0.401             0.395              0.396
WORK TO BE DONE
PHASE II WORK:
• Apply pipelining to the floating-point adder design so that the delay can be further reduced.
• Convert the approximate adder into a self-timed adder. This helps reduce both delay and power, and may increase the overall performance of the proposed IEEE-754 floating-point adder design.
CONCLUSION:
• An IEEE-754/854 floating-point adder design is proposed using a carry cut-back adder, which has a minimized latency of 1.606 ns.
REFERENCES :
1. Ali Malik and Seok-Bum Ko (2006), “A Study on the Floating-Point Adder in FPGAs,” IEEE Conference on Electrical and Computer Engineering, pp. 86-89, Ottawa, Ont.
2. Ankit Kumar Kusumakar and Utsav Malviya (2013), “Implementation of Area Optimized Floating Point Units in Hybrid FPGA,” International Journal of Engineering Trends in Technology, Vol. 4, Issue 5, pp. 1540-1542.
3. Beaumont-Smith A., Burgess N., Lefrere S. and Lim C.C. (1999), “Reduced Latency IEEE Floating-Point Standard Adder Architectures,” IEEE Transactions on Computer Arithmetic, pp. 35-42, Adelaide, SA.
4. Dhiraj Sangwan and Mahesh Yadav K. (2010), “Design and Implementation of Adder/Subtractor and Multiplication Units for Floating Point Arithmetic,” International Journal of Electronics Engineering, pp. 197-203.
5. Deepak Mishra (2015), “A Survey on Floating Point Adders,” International Journal of Computer Science Trends and Technology, Vol. 3, Issue 2.
6. Ghassem Jaberipur, Behrooz Parhami and Saeid Gorgin (2010), “Redundant-Digit Floating-Point Addition Scheme on a Stored Rounding Value,” IEEE Transactions on Computers, Vol. 59, No. 5.
7. Giorgos Dimitrakopoulos, Kostas Galanopoulos, Christos Mavrokefalidis and Dimitris Nikolos (2008), “Low-Power Leading-Zero Counting and Anticipation Logic for High-Speed Floating Point Units,” IEEE Transactions on VLSI, Vol. 16, No. 7, pp. 837-850.
8. Jongwook Sohn and Earl E. Swartzlander, Jr. (2014), “A Fused Floating-Point Three-Term Adder,” IEEE Transactions on Circuits and Systems, Vol. 61, No. 10, pp. 2842-2850.
9. Javier Bruguera D. and Tomas Lang (2015), “Floating-Point Fused Multiply-Add: Reduced Latency for Floating Point Addition,” IEEE Transactions on Computer Arithmetic, pp. 42-51.
10. Jun Xu and Hong Wang (2011), “Desynchronize a Legacy Floating-Point Adder with Operand-Dependant Delay Elements,” IEEE Transactions on Circuits and Systems, pp. 1427-1430, Rio de Janeiro.
11. Karan Gumber and Sharmelee Thangjam (2012), “Performance Analysis of Floating Point Adder Using VHDL on Reconfigurable Hardware,” International Journal of Computer Applications, Vol. 46, No. 9.
12. Lakshminarayanan A., Jayapal N., Kumar K., Krishnakumar V. and Shajudeen K. (2014), “Advanced Pipelined Area and Speed Efficient Floating-Point ALU Embedded System in FPGA,” International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, Vol. 3, Issue 5, pp. 9378-9385.
13. Michael J. Beauchamp, Scott Hauck, Keith D. Underwood and K. Scott Hemmert (2008), “Architectural Modifications to Enhance the Floating-Point Performance of FPGAs,” IEEE Transactions on VLSI 2006, Vol. 16, No. 2, pp. 177-187.
14. Manish Kumar Jaiswal, Ray C. C. Cheung, M. Balakrishnan and Kolin Paul (2014), “Unified Architecture for Double/Two-Parallel Single Precision Floating Point Adder,” IEEE Transactions on Circuits and Systems, Vol. 61, No. 7, pp. 521-525.
15. Meera K. A. (2015), “Improved Architecture for Floating Point Addition,” International Journal of Innovative Research in Computer and Communication Engineering, Vol. 3, Issue 9.
16. Peter-Michael Seidel and Guy Even (2011), “On the Design of Fast IEEE Floating-Point Adders,” IEEE Transactions on Computer Arithmetic, pp. 184-194, Vail, CO.
17. Pedro Echeverria and Marisa Lopez-Vallejo (2011), “Customizing Floating-Point Units for FPGA: Area-Performance-Standard Trade-offs,” Science Direct Journal on Microprocessors and Microsystems 2011, Vol. 35, No. 6, pp. 535-546.
18. Preethi Sudha Gollamudi and Kamaraju M. (2013), “Design of High Performance IEEE-754 Single Precision (32 Bit) Floating Point Adder Using VHDL,” International Journal of Engineering Research and Technology, Vol. 2, Issue 7.
19. Riya Saini, Galani Tina G. and R.D. Daruwala (2013), “Efficient Implementation of Pipelined Double Precision Floating Point Unit on FPGA,” International Journal of Emerging Trends in Electrical and Electronics, Vol. 5, Issue 1.
20. Sahdev D. Kanjariya and Rutarth Patel (2015), “Architecture and Design of Generic IEEE-754 Based Floating Point Adder, Subtractor and Multiplier,” International Journal on Recent and Innovation Trends in Computing and Communication, Vol. 3, Issue 5.
21. Subha S. (2017), “An Improved Floating Point Addition Algorithm,” ARPN Journal of Engineering and Applied Science, Vol. 12, No. 1.
22. Somsubhra Ghosh, Shubhobrata Rudra, Prarthana Bhattacharyya and Arka Dutta (2013), “FPGA Based Implementation of a Double Precision IEEE Floating-Point Adder,” IEEE Conference on Intelligent Systems and Control 2013, pp. 271-275, India.
23. Suganya (2011), “Design and Optimized Implementation of Six-Operand Single-Precision Floating Point Addition,” International Journal of Engineering Research and Technology, Vol. 20, Issue 7.
24. Suresh V. and Malyadri G. (2017), “Design and Analysis of Area and Delay Efficient Double Precision Floating-Point Adder,” International Journal of Research, Vol. 04, Issue 08.
25. Thiruvengadam K., Ramesh J. and Kalaiyarasi V. (2016), “An Area Efficient Multi-Mode Quadruple Precision Floating Point Adder,” Elsevier, Vol. [1-14].
26. Vincent Camus, Jeremy Schlachter and Christian Enz (2016), “Approximate 32-bit Floating-Point Unit Design with 53% Power-Area Product Reduction,” IEEE Conference on Electrical and Computer Engineering, Vol. 14.
27. Weiqiang Liu, Linbin Chen, Chenghua Wang, Maire O'Neill and Fabrizio Lombardi (2015), “Design and Analysis of Inexact Floating-Point Adders,” IEEE Transactions on Computers, Vol. PP, No. 99, pp. 1-8.
28. Yedukondala Rao Veeranki and Nakkeeran R. (2013), “Spartan 3E Synthesizable FPGA Based Floating Point Arithmetic Unit,” International Journal of Computer and Technology, Vol. 4, Issue 4, pp. 751-755.
THANK YOU
