2842 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-I: REGULAR PAPERS, VOL. 61, NO. 10, OCTOBER 2014
A Fused Floating-Point Three-Term Adder

Jongwook Sohn, Member, IEEE, and Earl E. Swartzlander, Jr., Life Fellow, IEEE
Abstract: This paper presents improved architectures for a fused floating-point three-term adder. The fused floating-point three-term adder performs two additions in a single unit to achieve better performance and better accuracy compared to a network of traditional floating-point two-term adders, which is referred to as a discrete design. In order to further improve the performance of the three-term adder, several optimization techniques are applied including a new exponent compare and significand alignment, dual-reduction, early normalization, three-input leading zero anticipation, compound addition/rounding and pipelining. The proposed design is implemented for both single and double precision and synthesized with a 45 nm CMOS standard-cell library. The improved fused floating-point three-term adder reduces the area and power consumption by about 20% and reduces the latency by about 35% compared to a discrete floating-point three-term adder. Based on the data flow analysis, the proposed three-term adder can be split into three pipeline stages. Since the latencies of three pipeline stages are fairly well balanced, the throughput is increased to 2.7 times that of the non-pipelined design.

Index Terms: Floating-point arithmetic, fused floating-point operations, high speed computer arithmetic, three-term adder.

Manuscript received November 23, 2013; revised March 01, 2014; accepted March 17, 2014. Date of publication July 22, 2014; date of current version September 25, 2014. This paper was recommended by Associate Editor F. Clermidy.
J. Sohn is with Intel Corp., Austin, TX 78746 USA (e-mail: jongwook.sohn@intel.com).
E. E. Swartzlander, Jr. is with the Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX 78712 USA (e-mail: e.swartzlander@IEEE.org).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TCSI.2014.2333680
1549-8328 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

MOST general purpose processors and application specific processors use floating-point arithmetic which is specified in the IEEE-754 Standard for floating-point arithmetic [1]. The benefits of floating-point arithmetic over fixed-point arithmetic come from its constant relative precision over a wide dynamic range. However, floating-point operations require complex processes such as alignment, normalization and rounding, which increases the area, power consumption and latency. In order to reduce the overhead, fused floating-point units have been proposed, which execute several operations in a single unit to reduce the area, power consumption and latency. Several fused floating-point units have been introduced: Fused Multiply-Add (FMA) [2]–[4], fused add-subtract [5], [6], fused dot product [7], [8] and fused three-term adder [9], [10].

Addition is the most frequently used operation in many algorithms and applications. Traditional floating-point two-term adders are extensively discussed in the previous work [11]–[13]. In case of the additions in series, however, a network of the two-term adders loses accuracy due to the multiple roundings, one after each addition. Many recent floating-point units can accommodate operations that have three inputs (e.g., FMA). As a result adding a three-term adder does not require a special data path or register file. Several issues for the design of the fused floating-point three-term adder are discussed in the previous work [9], [10]: 1) Complex exponent processing and significand alignment, 2) Complementation after the significand addition, 3) Large precision significand addition, 4) Massive cancellation management, and 5) Complex round processing. In this paper, those issues are addressed by investigating several optimization techniques. The algorithms and optimizations described in this paper can be also extended to fused floating-point multi-term adders with more than three operands. Therefore, the improved fused floating-point three-term adder will contribute to the next generation of floating-point arithmetic unit design.

The proposed fused floating-point three-term adder takes three normalized operands and executes two additions (or subtractions) as

(1)

It supports all five of the rounding modes specified in the IEEE-754 Standard [1]. Several algorithms and optimization techniques are applied not only to resolve the design issues but also to improve the performance:

1) A new exponent compare and significand alignment scheme is proposed. The three exponent differences are computed in parallel and the differences are used for the significand alignment. By shifting the significands with the partial difference results, the exponent difference computation and the significand alignment can be overlapped. The control logic determines the largest exponent and three aligned significands. This approach reduces the latency by performing the exponent processing and significand alignment simultaneously.

2) Dual-reduction is used to handle both cases that the result of the significand addition is positive and negative. Two reduction trees generate both the positive and negative significand pairs and the positive significand pair is selected based on the significand comparison. The selected significand pair produces a positive sum so that the complementation after the significand addition can be skipped, which reduces the latency.

3) Early normalization is applied to reduce the significand addition size. By performing the normalization prior to the significand addition, the adder size is reduced by half and the rest of bits are covered by the rounding, which significantly reduces the latency.

4) Since the normalization is performed prior to the significand addition, the Leading Zero Anticipation (LZA) and normalization shift are on the critical path. In order to reduce the latency, a three-input LZA is proposed, which hides the delay of the 3:2 reduction trees.
Fig. 1. Discrete vs. fused floating-point three-term adders.
5) Compound addition is used for fast rounding. The compound addition produces the rounded and unrounded sums together, and then the correct one is selected by the round logic so that the delay of the rounding is hidden.
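
As a minimal sketch of item 5, the following Python model produces both candidate sums up front and lets a late-arriving round decision pick one. The function names `compound_add` and `select_rounded`, and the use of plain Python integers as significands, are illustrative assumptions and are not taken from the proposed hardware.

```python
from typing import Tuple


def compound_add(x: int, y: int) -> Tuple[int, int]:
    """Return (sum, sum + 1) for two non-negative significands.

    A hardware compound adder shares one prefix tree for both results;
    this model only reproduces the two outputs.
    """
    s = x + y
    return s, s + 1


def select_rounded(x: int, y: int, round_up: bool) -> int:
    """Pick the rounded or unrounded sum once the round decision is known."""
    unrounded, rounded = compound_add(x, y)
    return rounded if round_up else unrounded
```

Because both sums already exist when the round decision arrives, the selection costs only a multiplexer, which is the sense in which the rounding delay is hidden.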
In order to increase the throughput, pipelining can be applied. Based on the data flow analysis, the proposed fused floating-point three-term adder is split into three stages. Since the latencies of three stages are fairly well balanced, the throughput is improved.
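
The relationship between stage balance and throughput can be illustrated with a short calculation. In the sketch below, the function name `pipeline_speedup` and all latency values are hypothetical placeholders chosen for illustration; they are not measurements from this paper. Only the rule that the slowest stage sets the cycle time comes from the text.

```python
def pipeline_speedup(non_pipelined_latency: float, stage_latencies: list) -> float:
    """Throughput gain of a pipeline over the non-pipelined unit.

    One result leaves the pipeline every max(stage_latencies) time units,
    so the slowest stage determines the throughput.
    """
    cycle = max(stage_latencies)
    return non_pipelined_latency / cycle


# Hypothetical latencies in arbitrary time units (not from the paper).
stages = [1.0, 1.2, 1.1]        # each value includes latch overhead
unpipelined = 3.0
speedup = pipeline_speedup(unpipelined, stages)
total_latency = len(stages) * max(stages)   # latency of one operation through the pipeline
```

With perfectly balanced stages and no latch overhead a three-stage pipeline would reach a factor of 3; imbalance and latches pull the factor below that.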
Section II describes the traditional fused floating-point three-term adder. The next two sections present improved architectures for a fused floating-point three-term adder design. In Section III, several optimization techniques are introduced to improve the performance of the fused floating-point three-term adder. In Section IV, pipelining is applied to the improved fused floating-point three-term adder based on a data flow analysis. The proposed design is implemented for both single and double precision and synthesized with the Nangate 45 nm CMOS technology standard-cell library [14]. To evaluate the improvement of the proposed design, the area, latency, throughput, and power consumption are compared with those of the traditional floating-point three-term adders. The comparison results for both single and double precision are discussed in Section V. Finally, the benefits of the proposed designs are summarized in Section VI.

II. TRADITIONAL FUSED FLOATING-POINT THREE-TERM ADDERS

A simple way to implement a floating-point three-term adder is to concatenate two identical floating-point adders, which is referred to as a discrete floating-point three-term adder. One of the high-performance floating-point adders [11]–[13] can be used for the discrete design. Although the floating-point adders are well-optimized for two-term addition, the discrete design for the three-term adder takes twice the area, latency and power consumption of a single floating-point adder. In order to reduce the overhead, fused floating-point three-term adders have been proposed, which perform two additions in a single unit [9], [10]. Fig. 1 shows a high level comparison of the discrete design and a representative fused design. The fused floating-point three-term adder shares common logic to reduce the area, power consumption and latency. In addition, the fused floating-point three-term adder improves the accuracy. While the discrete floating-point three-term adder performs rounding twice, the fused floating-point three-term adder performs the rounding only once, which improves the accuracy.

Fig. 2 shows a traditional fused floating-point three-term adder [9], [10]. The traditional floating-point three-term adder takes three operands and performs the two additions at once. The procedure of the traditional fused floating-point three-term adder is

1) The exponent compare logic determines the largest of the three exponents and computes the differences between the largest exponent and each exponent. The three significands are shifted by the amount of the corresponding exponent differences.

2) The effective operations are determined based on the three sign bits and the two op codes. The aligned significands are inverted if the corresponding operations are subtraction. Then, the significands are passed to the 3:2 reduction tree. Carry Save Adders (CSA) are used to perform the reduction, which reduces the three significands to two.

3) The significand addition is performed and the sum is complemented if it is negative. The LZA is performed in parallel with the significand addition and the significand sum is shifted by the amount of the LZA result. The carry-out of the significand addition is passed to the sign logic and the exponent adjustment logic.

4) The sign logic determines the sign bit of the sum. The sign bit is passed to the round logic. The normalized significand is rounded and post-normalized. The exponent is adjusted with the carry-out of the significand addition and the shift amount from the LZA.

Fig. 2. Traditional fused floating-point three-term adder (after [9], [10]).

III. AN IMPROVED FUSED FLOATING-POINT THREE-TERM ADDER

The traditional fused floating-point three-term adder reduces the area, latency and power consumption compared to the discrete floating-point three-term adder by sharing the common logic [9], [10]. However, it is an initial design so that optimizations can be applied to improve the performance. Fig. 3 shows the modified design for an improved fused floating-point
three-term adder. In this section, five optimizations for the improved fused floating-point three-term adder are described: 1) A new exponent compare and significand alignment scheme, 2) Dual-reduction to avoid the need for complementation after the significand addition, 3) Early normalization, 4) Three-input LZA, and 5) Compound addition and rounding.

Fig. 3. An improved fused floating-point three-term adder.

A. Exponent Compare and Significand Alignment

It is necessary to determine the largest exponent and aligned significands to handle the three operands. The traditional fused floating-point three-term adder determines the largest exponent based on the exponent differences. Then, the exponent differences are used for the significand alignment. This approach requires performing the exponent subtractions, complementation and significand shift sequentially, which takes a large latency.

In order to reduce the latency, a new exponent compare and significand alignment logic is proposed as shown in Fig. 4. Six subtractions are performed to compute all the combinations of exponent differences. In each pair of differences, an absolute value is selected based on the exponent comparison result, which enables skipping the complementation after the subtractions.

Fig. 4. Exponent compare and significand alignment logic.

The exponent differences are used for the significand shifters. Since a carry lookahead style adder produces the difference LSBs first, they can be directly used for the significand shifter so that the exponent difference computation and the significand alignment are overlapped. Since most of the significand shifter delay is overlapped with the exponent difference computation, the critical path includes only the last level of the shifter. Also, the sticky logic is performed during the alignment to determine the guard, round and sticky bits. The first and second bits under the LSB become the guard and round bits and the sticky bit is set if at least one of the over-shifted bits is 1, which can be implemented with OR trees. The rest of the over-shifted bits under the sticky bit are discarded. Fig. 5 shows the significand shifter and sticky logic example for the single precision.

Fig. 5. Significand shifter and sticky logic for the single precision.

The control logic determines the largest exponent and the aligned significands based on the exponent comparison results as shown in Table I. In order to guarantee the significand precision, the aligned significands become 2f + 6 bits wide including two overflow bits, and guard, round and sticky bits, where f is the number of the significand bits as shown in Fig. 6. The new exponent compare and significand alignment logic reduces the latency compared to the traditional method by performing the exponent process and significand alignment simultaneously.

B. Invert & Dual-Reduction

The sign logic generates the three effective sign bits based on the three sign bits and two op codes as

(2)
The two op-code terms in (2) are the first and second op codes, respectively.

The aligned significands are inverted based on the effective sign bits for the subtraction. The three operand subtraction requires that up to two significands are complemented. If all three operands are negative, they are added and the sign becomes negative. In order to avoid the increments after the inverters, 2 bits are extended to the LSB of the significands that are propagated to the significand addition to handle the cases that one or two significands are inverted by effectively adding 1 or 2, respectively. Table II shows the 2 bit extended LSBs based on the effective sign bits.

TABLE I. EXPONENT COMPARE CONTROL LOGIC

TABLE II. 2 BIT EXTENDED LSBS FOR COMPLEMENTATION

Fig. 6. Significand process: alignment, invert, reduction, normalization, addition and rounding.

The significand sum must be complemented again, in case it is negative. Dual-reduction is used to eliminate the complementation after the significand addition. One reduction tree takes the three inverted significands and the other takes the reversely inverted significands. The two 3:2 CSAs produce the two reduced significand pairs. Between the two significand pairs, the positive pair is selected based on the sign of the significand sum. Although the sign detection requires a significand comparison, the delay is overlapped with the LZA so that the critical path latency is not impacted, which is described in the next section. Since the selected significand pair produces the positive sum, the complementation after the significand addition is unnecessary, which reduces the latency of the critical path.

For the delay-optimization, the first part of the significand addition, which is the PG generation, is performed after the reduction to balance with the delay of the LZA.

(3)

The terms in (3) are the bits of the two significands at each bit position and the level of the prefix tree adder. A Kogge-Stone adder is used in this paper, but any type of adder can be used. The number of levels implemented in the first part addition depends on the delay-optimization to balance with the delay of LZA, but level 0 and level 1 are implemented in this paper.

C. Early Normalization

One of the design issues for the fused floating-point three-term adder is the high precision significand addition. The traditional fused floating-point three-term adder aligns the significands to 2f + 6 bits, where f is the number of significand bits [9], [10]. Such large significands require a large significand addition and normalization, which is the biggest bottleneck of the fused floating-point three-term adder. To reduce the overhead, early normalization is applied, which was previously proposed for the fused floating-point multiply-add unit [4]. As shown in Fig. 6, the normalization is performed prior to the significand addition so that the significand adder size is reduced. The rest of the lower bits are passed to rounding. By normalizing the significand pair prior to the significand addition, the round position is fixed so that the significand addition and
rounding can be performed in parallel, which significantly reduces the latency of the critical path. More details of the significand addition and rounding are described in the next section.

D. Three-Input LZA and Significand Comparison

Since the normalization is performed prior to the significand addition, the LZA and normalization are on the critical path. To use a traditional two-input LZA, the three significands need to be reduced to two using a 3:2 CSA, which increases the delay. The three-input LZA encodes the three inputs at once to skip the delay of the 3:2 CSA. The three-input LZA can be implemented by extending the traditional two-input LZA [15]. Like most of the LZAs, the three-input LZA consists of two parts: 1) Pre-encoding indicator vectors and 2) Leading Zero Detection (LZD) logic for generating the leading zero count. The pre-encoder performs the bitwise operations to generate the W vector as

(4)

Equation (4) operates on the bits of the three significands at each bit position. Since the input significands are inverted based on the effective sign bits, the W vector is always positive. Each element of the W vector can be represented by one of four elements, indicating that w_i equals 0, 1, 2 and 3, respectively.

(5)

Using the four elements, the W vector is pre-encoded into three symbols as

(6)

The F vector is computed with the three symbols as

(7)

The F vector is passed to the LZD logic. The LZD produces the leading zero count which becomes the shift amount of the normalization. For the LZD, any type of tree logic can be used as discussed in the previous work [16], [17]. In this paper, for fast normalization, an LZD logic producing the MSBs of the shift amount first is selected so that the LZD logic and the normalization shifter are overlapped [4]. Fig. 7 shows the 64 bit LZD tree, which can be used for the single precision. The higher shift bits are generated first, so most of the shifter delay is overlapped with the lower levels of the LZD logic. As a result, only the last level of the shifter is in the critical path.

Fig. 7. 64 bit LZD tree for the single precision.

Most of the two-input LZAs are inaccurate due to a possible 1 bit error. Similarly, the proposed three-input LZA also requires correction logic. For fast error detection and correction, concurrent error correction logic can be used, which was previously proposed in [18]–[20].(1)

(1) The error correction logic in [18] is modified by [19] and [20] to improve the accuracy and eliminate the redundancy, respectively.

In order to determine the sign of the significand sum, which is used for the selection of the dual-reduction, significand comparison is required. In order to reduce the overhead, LZA pre-encoded bits can be used for the comparison tree [21].

(8)

Equation (8) refers to the MSB position of the significands. The delay of the comparison tree is less than that of LZD so that it does not impact the critical path. The significand comparison result is also used for the sign decision.

(9)

Since some of the rounding modes specified in the IEEE-754 Standard [1] (i.e., round to positive and negative infinity) require knowing the sign, it must be determined prior to the rounding.
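
The MSB-first leading zero detection described in this subsection can be sketched in Python as follows. This is a behavioral model under stated assumptions, not the circuit of Fig. 7: the names `lzd_bits_msb_first`, `normalize` and `leading_zero_count` are hypothetical, `x` stands in for the F vector as a plain integer, `width` is assumed to be a power of two, and the all-zero input is assumed to be handled separately.

```python
def lzd_bits_msb_first(x: int, width: int = 64) -> list:
    """Leading-zero count of x as a list of bits, most significant bit first.

    Assumes width is a power of two and 0 < x < 2**width.
    """
    bits = []
    window, size = x, width
    while size > 1:
        half = size // 2
        upper = window >> half
        if upper == 0:
            # Upper half is all zeros: this count bit is 1, continue in the lower half.
            bits.append(1)
            window &= (1 << half) - 1
        else:
            # A one exists in the upper half: this count bit is 0, continue there.
            bits.append(0)
            window = upper
        size = half
    return bits


def normalize(x: int, width: int = 64) -> int:
    """Left-shift x until its leading one is at the MSB, one shifter level per count bit."""
    stage = width // 2
    for bit in lzd_bits_msb_first(x, width):
        if bit:
            x = (x << stage) & ((1 << width) - 1)
        stage //= 2
    return x


def leading_zero_count(x: int, width: int = 64) -> int:
    """Assemble the count bits into an integer shift amount."""
    count = 0
    for bit in lzd_bits_msb_first(x, width):
        count = (count << 1) | bit
    return count
```

Each count bit is available before the lower-order ones, so the coarse shifter levels (shift by 32, 16, ...) can start while the remaining bits are still being resolved, which mirrors the overlap between LZD and normalization shifter described above.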
Fig. 8. Significand result selection and rounding position.
E. Compound Addition and Rounding

The normalized significand pair is passed to the significand addition and rounding. As described in Fig. 6, the upper significand bits are used for the addition and the lower bits are used for the rounding. In the two-term adder, only one significand is shifted for the alignment so that there is no carry propagation for the lower part.
Fig. 9. Compound addition and rounding.
However, the three-term adder may have two aligned significands, which causes carry propagation from the lower part. Moreover, overflow can occur up to 2 bits so that the significand addition result needs to be right shifted for the post-normalization, which changes the rounding position. Fig. 8 shows the significand result selection and the rounding position depending on the overflows of the significand addition. It marks the two overflow bits, the LSB position for the cases of two, one or no overflow, and the guard, round and sticky bits (G, R and S). Since the significand sum produces an overflow of at most 2, the two overflow bits are mutually exclusive.
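
How the rounding position moves with the overflow can be sketched as below. This is an illustrative model only: the function name `split_at_round_position` and the parameters `total`, `f` and `low` are assumptions introduced here, the sum is treated as an exact non-negative Python integer, and the round-up decision itself (Fig. 10) is not modeled.

```python
def split_at_round_position(total: int, f: int, low: int):
    """Split an exact positive sum into the kept bits and the L, G, R, S bits.

    total : sum whose width without overflow is f + low bits (assumed normalized)
    f     : number of result significand bits
    low   : number of bits below the no-overflow LSB (assumes low >= 2)
    """
    # Number of overflow positions: 0, 1 or 2 for three aligned operands.
    overflow = max(0, total.bit_length() - (f + low))
    # The rounding position moves up by one bit per overflow position.
    shift = low + overflow
    kept = total >> shift
    below = total & ((1 << shift) - 1)
    guard = (below >> (shift - 1)) & 1            # first bit under the LSB
    round_bit = (below >> (shift - 2)) & 1        # second bit under the LSB
    sticky = int((below & ((1 << (shift - 2)) - 1)) != 0)  # OR of everything lower
    lsb = kept & 1
    return overflow, kept, lsb, guard, round_bit, sticky
```

For example, with `f = 4` and `low = 3`, the inputs `0b1011101`, `0b11011101` and `0b100000100` exercise the no-overflow, one-overflow and two-overflow cases, and in each case the kept part is the top four bits of the sum.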
The compound addition and rounding for the proposed fused floating-point three-term adder can be implemented based on that for the floating-point multiplier [22] as shown in Fig. 9. The compound addition determines the upper f bits including possible two overflow bits, and the rounding determines the rest of three LSBs and the round decision. The upper bits are passed to the compound addition, which produces sum and sum + 1 simultaneously. Two significand sums are right shifted by up to 2 bits based on the overflow bits. Then, one correct significand sum is selected based on the round decision.

The round logic requires computing the LSB, carry, guard, round and sticky bits (L, C, G, R and S) to determine if the significand sum is rounded up or not. As described in Fig. 8, however, the rounding position varies depending on the overflow. Therefore, three rounding cases (i.e., for two, one, and no overflows) are separately computed. Fig. 10 shows the round logic, which determines the round-up bit with certain L, G, R and S bits. A 3 bit adder sums the three round-up bits and the , and bits to determine the three LSBs of the significand result. Also, the carry-out of the addition is used for the final round-up bit, which determines the upper significand result between sum and sum + 1.

Fig. 10. Round logic.

The largest exponent determined by the exponent compare logic is adjusted by subtracting the shift amount from the LZA and adding the carry-out of the significand addition as shown in Fig. 11. Since the three significands generate overflow up to 2 bits, two carry-out bits are used for the adjustment. The normalization shift amount is subtracted in case massive cancellation occurs during the subtraction. Since the normalization shift amount is produced prior to the significand addition, only the two bit carry-out addition affects the critical path. Using the adjusted exponent, the exceptions are detected. The three exception cases specified in IEEE-754 Standard [1] are detected as

(10)

One term in (10) is the rounding decision of the significand addition result.

Fig. 11. Exponent adjust logic.

IV. A PIPELINED FUSED FLOATING-POINT THREE-TERM ADDER

Pipelining can be applied to the fused floating-point three-term adder to improve the throughput. Based on a data flow
analysis, the proposed improved fused floating-point three-term adder can be split into three pipeline stages so that results are produced on every cycle. Fig. 12 shows the data flow and critical path of the improved fused floating-point three-term adder. The critical paths of the three pipeline stages are

First stage: Unpack → Exponent compare → Significand alignment
Second stage: Invert → LZA/LZD → Normalization
Third stage: Significand addition → Round select.

Since the second stage has the largest latency among the three pipeline stages, its latency determines the throughput. Due to the latches and control signals between the pipeline stages, the area and power consumption are slightly increased compared to the non-pipelined fused floating-point three-term adder. Also, the total latency of the pipelined fused floating-point three-term adder is three times the latency of the second stage, which is the largest one among the three pipeline stages. However, the latencies of the three pipeline stages are fairly well balanced so that the throughput is significantly increased compared to the non-pipelined fused floating-point three-term adder.

Fig. 12. Data flow of a pipelined improved fused floating-point three-term adder.

V. RESULTS

Previous sections introduced several optimization techniques for a fused floating-point three-term adder. The proposed design is implemented for both single and double precision in Verilog-HDL and synthesized with the Nangate 45 nm CMOS technology standard cell library [14]. In order to evaluate the improvement of the proposed design, the area and latency are compared with the traditional designs [9], [10]. Fig. 13 shows the delay-area curve for the three single precision fused floating-point three-term adders. Depending on the target frequency, the implementations are synthesized with different area and delay. For a fair comparison, the most efficient point of delay-area product for each design is used. Table III compares the area of the three single precision fused floating-point three-term adders. The proposed design has a smaller significand adder compared to the traditional designs. Also, the proposed design does not use incremental adders for the complementation and rounding, while the traditional designs require the incremental adders for the exponent and significand computations and rounding, which requires additional adder area. Although the proposed design has twice as many shifters and CSAs, area reduction in the main adders achieves smaller area for the entire logic compared to the traditional designs.

Fig. 13. Delay-area curve for the three-term adders (single precision).

Table IV compares the latency of the three designs. The main difference between the proposed design and the traditional designs is the significand addition and rounding. The proposed design performs a smaller significand addition compared to the traditional designs (for more delay-optimization, the first level pg generator is performed prior to the normalization) without complementation by using the dual-reduction. In contrast, the
traditional designs require a large significand adder and the inverters followed by incremental adders for the complementation. Also, the proposed design performs the significand addition and rounding simultaneously so that the latency is significantly reduced. Finally, the shifters for the alignment and normalization are overlapped with the exponent difference computation and LZD logic, respectively, so that only the last level of the shifter is in the critical path.

TABLE III. AREA COMPARISON OF FUSED FLOATING-POINT THREE-TERM ADDERS (SINGLE PRECISION)

TABLE IV. LATENCY COMPARISON OF FUSED FLOATING-POINT THREE-TERM ADDERS (SINGLE PRECISION)

TABLE V. RESULT COMPARISON

Table V summarizes the results for both single and double precision three-term adders. For the discrete design, the delay-optimized floating-point adder [13] is used, which is well known as a high-performance floating-point adder. All the percentages in the table are ratios compared to the discrete design. The traditional fused floating-point three-term adders have achieved a better accuracy with reduced area, power consumption (5–8%) and latency (3–14%) compared to the discrete design. The proposed fused floating-point three-term adder applies several techniques discussed above so that the area and power consumption are reduced by about 15–20% and the latency is reduced by about 35% compared to the discrete design, which is much better than that of the traditional fused designs.

The double precision implementation requires about twice as much area and power consumption as the single precision implementation due to the larger logic components. Since the major components such as significand alignment, significand addition, LZA and normalization are implemented using the tree structures that logarithmically increase the latency, the latency of the double precision implementation is increased by only 20%. The benefits of the proposed optimization techniques are shown in both single and double precision.

The pipelined fused floating-point three-term adder is split into three stages. Table VI shows the area, latency and power consumption of the three pipeline stages. Each pipeline stage requires latches to maintain the data and control signals between the stages, which increases the area, latency and power consumption. However, the latencies of the three pipeline stages
TABLE VI [12] S. F. Oberman, H. Al-Twaijry, and M. J. Flynn, The SNAP project:

PIPELINE STAGE COMPARISON Design of floating point arithmetic units, in Proc. 14th IEEE Symp.
Computer Arithmetic, 1997, pp. 156165.
[13] P. M. Seidel and G. Even, Delay-optimized implementation of IEEE
floating-point addition, IEEE Trans. Computers, vol. 53, no. 2, pp.
97113, Feb. 2004.
[14] Nangate Open Cell Library [Online]. Available: http://si2.org/openeda.
si2.org/projects/nangatelib/
[15] M. S. Schmookler and K. J. Nowka, Leading zero anticipation and
detection-a comparison of methods, in Proc. 15th IEEE Symp. Com-
puter Arithmetic, 2001, pp. 712.
[16] V. G. Oklobdzija, An algorithmic and novel design of a leading zero
detector circuit comparison with logic synthesis, IEEE Trans. VLSI
Syst., vol. 2, no. 1, pp. 124128, Mar. 1994.
[17] G. Dimitrakopoulos, K. Galanopoulos, C. Mavrokefalidis, and D. Nikolos, "Low-power leading zero counting and anticipation logic for high-speed floating point units," IEEE Trans. VLSI Syst., vol. 16, no. 7, pp. 837–850, Jul. 2008.
[18] J. D. Bruguera and T. Lang, "Leading-one prediction with concurrent position correction," IEEE Trans. Computers, vol. 48, no. 10, pp. 1083–1097, Oct. 1999.
[19] R. Ji, Z. Ling, X. Zeng, B. Sui, L. Chen, J. Zhang, Y. Feng, and G. Luo, "Leading-one prediction with concurrent position correction," IEEE Trans. Comput., vol. 58, no. 12, pp. 1726–1727, Dec. 2009.
[20] P. Kornerup, "Correcting the normalization shift of redundant binary representations," IEEE Trans. Computers, vol. 58, pp. 1435–1439, 2009.
[21] K. T. Lee and K. J. Nowka, "1 GHz leading zero anticipator using independent sign-bit determination logic," in Proc. Dig. Tech. Papers VLSI Circuits Symp., 2000, pp. 194–195.
[22] G. Even and P. M. Seidel, "A comparison of three rounding algorithms for IEEE floating-point multiplication," IEEE Trans. Comput., vol. 49, pp. 638–650, 2000.

are fairly well balanced so that the throughput is increased to about 2.7 times that of the non-pipelined design.

VI. CONCLUSION

The improved architecture design and implementation for a fused floating-point three-term adder have been presented. There are several critical design issues for the fused floating-point three-term adder: 1) Complex exponent processing and significand alignment, 2) Complementation after the significand addition, 3) Large precision significand adder, 4) Massive cancellation management, and 5) Complex round processing. To resolve those issues, several algorithms and optimization techniques are applied: 1) A new exponent compare and significand alignment, 2) Dual-reduction, 3) Early normalization, 4) Three-input LZA, and 5) Compound addition and rounding. The improved fused floating-point three-term adder reduces the area and power consumption by about 20% and reduces the latency by about 35% relative to a discrete design. For further performance improvement, pipelining can be applied. Based on the data flow analysis, the proposed fused floating-point three-term adder can be split into three pipeline stages, which increases the throughput to about 2.7 times that of the non-pipelined fused floating-point three-term adder.

ACKNOWLEDGMENT

The authors thank the anonymous reviewers for their constructive comments.

REFERENCES

[1] IEEE Standard for Floating-Point Arithmetic, ANSI/IEEE Standard 754-2008, IEEE, Inc., 2008.
[2] R. K. Montoye, E. Hokenek, and S. L. Runyon, "Design of the IBM RISC system/6000 floating-point execution unit," IBM J. Res. Develop., vol. 34, pp. 59–70, 1990.
[3] E. Hokenek, R. K. Montoye, and P. W. Cook, "Second-generation RISC floating point with multiply-add fused," IEEE J. Solid-State Circuits, vol. 25, pp. 1207–1213, 1990.
[4] T. Lang and J. D. Bruguera, "Floating-point fused multiply-add with reduced latency," IEEE Trans. Computers, vol. 53, pp. 988–1003, 2004.
[5] H. H. Saleh and E. E. Swartzlander, Jr., "A floating-point fused add-subtract unit," in Proc. 51st IEEE Midwest Symp. Circuits Syst., 2008, pp. 519–522.
[6] J. Sohn and E. E. Swartzlander, Jr., "Improved architectures for a fused floating-point add-subtract unit," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 59, no. 10, pp. 2285–2291, Oct. 2012.
[7] H. H. Saleh and E. E. Swartzlander, Jr., "A floating-point fused dot-product unit," in Proc. IEEE Int. Conf. Computer Des., 2008, pp. 427–431.
[8] J. Sohn and E. E. Swartzlander, Jr., "Improved architectures for a floating-point fused dot product unit," in Proc. 21st Symp. Computer Arithmetic, 2013, pp. 41–48.
[9] A. Tenca, "Multi-operand floating-point addition," in Proc. 21st Symp. Computer Arithmetic, 2009, pp. 161–168.
[10] Y. Tao, G. Deyuan, F. Xiaoya, and R. Xianglong, "Three-operand floating-point adder," in Proc. 12th IEEE Int. Conf. Comput. Inf. Technol., 2012, pp. 192–196.
[11] M. P. Farmwald, "On the Design of High Performance Digital Arithmetic Units," Ph.D. dissertation, Computer Science, Stanford University, Stanford, CA, USA, 1981.

Jongwook Sohn (S'12–M'14) received the B.S. degree in electrical engineering from Korea University, Korea, in 2009, the M.S. and Ph.D. degrees in electrical and computer engineering from the University of Texas at Austin, TX, USA, in 2011 and 2013, respectively. Since 2011, he has been with Intel Corporation, Austin, TX, USA, where he has been working on CPU design. His research interests include high-speed and low-power computer arithmetic, computer architecture, floating-point unit design, VLSI circuit design, and application specific processor design.

Earl E. Swartzlander, Jr. (SM'79–F'88–LF'11) received the B.S. degree from Purdue University, West Lafayette, IN, USA, in 1967, the M.S. degree from the University of Colorado, Boulder, CO, USA, in 1969, and the Ph.D. degree from the University of Southern California, Los Angeles, CA, USA, in 1972, all in electrical engineering. From 1975 to 1990, he held a variety of positions at TRW including the Director of Independent Research and Development in the TRW Defense Systems Group, the Manager of the Digital Processing Laboratory in the Electronics and Technology Division, and the Manager of the Advanced Development Office in the System Development Division. He is the author of two books, editor of seven books, and the author or coauthor of 76 refereed journal papers, 39 book chapters, and 297 conference papers. Dr. Swartzlander, Jr. was the editor-in-chief of the IEEE TRANSACTIONS ON COMPUTERS from 1990 to 1994 and was the founding Editor-in-Chief of the Journal of VLSI Signal Processing. In addition, he has served as an associate editor for the IEEE TRANSACTIONS ON COMPUTERS, the IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, and the IEEE JOURNAL OF SOLID-STATE CIRCUITS. He has been a member of the Board of Governors of the IEEE Computer Society (1987–1991), the IEEE Signal Processing Society (1992–1994), and the IEEE Solid-State Circuits Council/Society (1986–1991). He has been a member of the IEEE History Committee (1996–2004), the IEEE Fellows Committee (2000–2003), and the IEEE James H. Mulligan, Jr., Education Medal Committee (2007–2011). He has chaired a number of conferences. He has been honored with the IEEE Third Millennium Medal, the Distinguished Engineering Alumnus Award from the University of Colorado, the Outstanding Electrical Engineer and Distinguished Engineering Alumnus Awards from Purdue University, and the IEEE Computer Society Golden Core Award.


