
Darjn Esposito, Davide De Caro, Senior Member, IEEE, Ettore Napoli, Nicola Petra, Member, IEEE, and Antonio Giuseppe Maria Strollo, Senior Member, IEEE

Abstract: Variable latency adders have recently been proposed in the literature. A variable latency adder employs speculation: the exact arithmetic function is replaced with an approximate one that is faster and gives the correct result most of the time, but not always. The approximate adder is augmented with an error detection network that asserts an error signal when speculation fails. Speculative variable latency adders have attracted strong interest thanks to their capability to reduce average delay compared to traditional architectures. This paper proposes a novel variable latency speculative adder based on the Han-Carlson parallel-prefix topology, which proves more effective than the variable latency Kogge-Stone topology. The paper describes the stages into which variable latency speculative prefix adders can be subdivided and presents a novel error detection network that reduces error probability compared to previous approaches. Several variable latency speculative adders, for various operand lengths, using both Han-Carlson and Kogge-Stone topologies, have been synthesized using the UMC 65 nm library. The results show that the proposed variable latency Han-Carlson adder outperforms both previously proposed speculative Kogge-Stone architectures and non-speculative adders when high speed is required. It is also shown that non-speculative adders remain the best choice when the speed constraint is relaxed.

Index Terms: Addition, digital arithmetic, parallel-prefix adders, speculative adders, speculative functional units, variable latency adders.

I. INTRODUCTION

Adders are basic functional units in computer arithmetic. Binary adders are used in microprocessors for addition and subtraction operations as well as for floating point multiplication and division. Therefore adders are fundamental components, and improving their performance is one of the major challenges in digital design. Theoretical research [1] has established lower bounds on area and delay of $n$-bit adders: the former varies linearly with adder size, the latter has an $O(\log_2 n)$ behavior.

High speed adders are based on well established parallel-prefix architectures [1], [2], including Brent-Kung [3], Kogge-Stone [4], Sklansky [5], Han-Carlson [6], Ladner-Fischer [7], and Knowles [8]. These standard architectures operate with fixed latency. Better average performance can be achieved by using variable latency adders, which have recently been proposed in the literature [9]. A variable latency adder employs speculation: the exact arithmetic function is replaced with an approximate one that is faster and gives the correct result most of the time, but not always. The approximate adder is augmented with an error detection network that asserts an output signal when speculation fails. In this case (misprediction), another clock cycle is needed to obtain the correct result with the help of a correction stage. Since the addition time is one clock cycle when no error occurs and two clock cycles when the speculation fails, the average addition time can be computed as

$$T_{avg} = T_{clk}\,(1 - P_{err}) + 2\,T_{clk}\,P_{err} = T_{clk}\,(1 + P_{err}) \tag{1}$$

where $T_{clk}$ is the clock period and $P_{err}$ is the error probability of the speculative adder.

Speculative adders are built upon the observation that the critical path is rarely activated in traditional adders [9]–[13]. In particular, in traditional adders each output depends on all previous bits, so the most significant output depends on all the $n$ input bits. Instead, in speculative adders each output depends only on the previous $K$ bits, where $K$ goes as $O(\log_2 n)$ [12]–[15]. This reflects the fact that a propagate chain longer than $K$ is a very rare event.

A first speculative approach to addition was proposed by Nowick [12] in an asynchronous context; it implements a variable latency adder by cutting the lowest levels of a Kogge-Stone adder. In a synchronous context, Verma et al. [13] propose a variable latency speculative adder; here the speculative addition is realized in the same way as in [12], by cutting the lower levels of a Kogge-Stone adder. A similar approach is employed in [14]. In [15] a variable latency carry-select adder is introduced, where the adder is split into several windows, each one containing a Kogge-Stone adder.

The Kogge-Stone adder is often used when speed is the primary concern, since it uses the minimum number of logic levels and each cell in the adder tree has a fanout of 2. This comes at the cost of using many propagate-generate cells and many wires that must be routed between stages.

In this paper we propose a novel variable latency speculative adder based on the Han-Carlson [6] parallel-prefix topology. The Han-Carlson topology uses one more stage than the Kogge-Stone adder, while requiring a reduced number of cells and simplified wiring. Thus, it can achieve speed performance similar to that of the Kogge-Stone adder, at lower power consumption and area [16]. We show that a speculative carry tree can be obtained by pruning some intermediate levels of the classical Han-Carlson topology. The paper presents a rigorous derivation of the error detection network and shows that the error detection network required in speculative Han-Carlson adders is significantly faster than the one used by the speculative Kogge-Stone architecture. An extensive set of implementation results for 65 nm CMOS technology shows that the proposed Han-Carlson variable latency adders outperform previously developed variable latency Kogge-Stone architectures. Compared with traditional, non-speculative adders, our analysis demonstrates that variable latency Han-Carlson adders show appreciable improvements when the highest speed is required; otherwise the burden imposed by the error detection and error correction stages overwhelms any advantage.

Manuscript received July 23, 2014; revised December 12, 2014; accepted January 29, 2015. This paper was recommended by Associate Editor C. P. Ravikumar.

The authors are with the Department of Electrical Engineering and Information Technology, University of Napoli “Federico II”, I80125 Naples, Italy (e-mail: dadecaro@unina.it).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCSI.2015.2403036

1549-8328 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.
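As a quick numerical illustration of the average addition time (1), the sketch below weights a one-cycle addition by the probability of a correct speculation and a two-cycle addition by the error probability. The function name and the numbers in the example call are assumptions made here for illustration; they are not values from the paper.

```python
def average_addition_time(t_clk: float, p_err: float) -> float:
    """Average addition time of a variable latency adder.

    One clock period is spent when speculation succeeds, two when it fails.
    """
    if not 0.0 <= p_err <= 1.0:
        raise ValueError("p_err must be a probability in [0, 1]")
    return t_clk * (1.0 - p_err) + 2.0 * t_clk * p_err


if __name__ == "__main__":
    # Hypothetical clock period (ps) and error probability.
    print(average_addition_time(200.0, 0.01))
```

The expression is equivalent to multiplying the clock period by one plus the error probability, so a small error probability adds only a small penalty to the average delay.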

The paper is organized as follows. In Section II we recall the basic architecture of parallel-prefix adders. The stages into which variable latency speculative prefix adders can be subdivided are presented in Section III where, after a brief review of the Kogge-Stone speculative prefix-processing stage introduced in [12], we present the proposed Han-Carlson speculative topology. A detailed discussion of the error detection stage is also reported in that section. Section IV presents the spatial and timing complexity of the investigated architectures. Section V shows detailed implementation and synthesis results for the proposed adders, for operand sizes ranging from 32 to 128 bits. Section VI concludes the paper with some final remarks.

II. PRELIMINARIES

A. Prefix Addition

The binary addition problem can be formulated as follows: given an $n$-bit augend $A = a_{n-1} \ldots a_0$ and an $n$-bit addend $B = b_{n-1} \ldots b_0$, generate the $n$-bit sum $S = s_{n-1} \ldots s_0$. Let us indicate as $c_i$ the carry out of the $i$-th bit. The sum bit $s_i$ and the carry $c_i$ can be computed as follows:

$$s_i = a_i \oplus b_i \oplus c_{i-1} \tag{2}$$

$$c_i = a_i b_i + (a_i \oplus b_i)\, c_{i-1} \tag{3}$$

In prefix addition we use three stages to compute the sum: pre-processing, prefix-processing and post-processing.

In the pre-processing stage the generate ($g_i$) and propagate ($p_i$) signals are computed as:

$$g_i = a_i \cdot b_i \tag{4}$$

$$p_i = a_i \oplus b_i \tag{5}$$

The condition $g_i = 1$ means that a carry is generated at bit $i$, while the condition $p_i = 1$ means that a carry is propagated through bit $i$.

The concept of generate and propagate can be extended to a block of contiguous bits, from bit $k$ to bit $i$ (with $i \ge k$), as follows:

$$G_{[i:k]} = \begin{cases} g_i & \text{if } i = k \\ g_i + p_i\, G_{[i-1:k]} & \text{otherwise} \end{cases} \tag{6}$$

$$P_{[i:k]} = \begin{cases} p_i & \text{if } i = k \\ p_i\, P_{[i-1:k]} & \text{otherwise} \end{cases} \tag{7}$$

The condition $G_{[i:k]} = 1$ means that a carry is generated in the block $[i:k]$, while the condition $P_{[i:k]} = 1$ means that a carry is propagated through the block. Thus, for any bit $i$, the carry $c_i$ can be expressed as:

$$c_i = G_{[i:0]} + P_{[i:0]}\, c_{-1} \tag{8}$$

where $c_{-1}$ is the input carry of the $n$-bit adder. In the following, for the sake of simplicity, we assume that $c_{-1} = 0$, so that (8) simplifies as:

$$c_i = G_{[i:0]} \tag{9}$$

The block generate and propagate terms are computed in the prefix-processing stage of the adder. To that purpose, the ($G_{[i:k]}$, $P_{[i:k]}$) couples are expressed with the help of the prefix operator $\circ$ defined as follows:

$$(G_{[i:k]}, P_{[i:k]}) = (G_{[i:j]}, P_{[i:j]}) \circ (G_{[j-1:k]}, P_{[j-1:k]}) = (G_{[i:j]} + P_{[i:j]}\, G_{[j-1:k]},\; P_{[i:j]}\, P_{[j-1:k]}) \tag{10}$$

where $i \ge j > k$. The prefix operator has two important properties: it is associative and it is idempotent. These properties are exploited in the prefix-processing stage to speed up the computation.

Finally, in the post-processing stage, the sum bits are computed using (8) and:

$$s_i = p_i \oplus c_{i-1} \tag{11}$$

B. Han-Carlson and Kogge-Stone Parallel-Prefix Adder Topologies

The pre-processing and post-processing stages of a prefix adder involve only simple operations on signals local to each bit position. Therefore, adder performance mainly depends on the prefix-processing stage.

Fig. 1. Han-Carlson and Kogge-Stone parallel-prefix topologies ($n = 16$).

Fig. 1 shows the Han-Carlson and Kogge-Stone prefix adder topologies. Here black dots represent the prefix operator (10), while white dots are simple placeholders.

The Kogge-Stone adder is composed of $\log_2 n$ levels and presents a fanout of two at each level, using a large number of black cells and many wire tracks. A good trade-off between fanout, number of logic levels and number of black cells is given by Han-Carlson. The outer rows of the Han-Carlson topology are Brent-Kung [3] graphs, while the inner rows are Kogge-Stone graphs. The Han-Carlson adder in Fig. 1 uses a single Brent-Kung level at the beginning and at the end of the graph, and the number of levels is $1 + \log_2 n$.

III. VARIABLE LATENCY SPECULATIVE PREFIX ADDERS

Variable latency speculative prefix adders can be subdivided into five stages: pre-processing, speculative prefix-processing, post-processing, error detection and error correction. The error correction stage is off the critical path, as it has two clock cycles to obtain the exact sum when speculation fails.
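As a reference point for the speculative stages, the exact three-stage prefix addition recalled in Section II can be sketched in software. The sketch assumes the standard definitions (generate as the AND and propagate as the XOR of the operand bits, zero input carry) and uses a Kogge-Stone arrangement of the prefix operator; the function names are illustrative and do not come from the paper.

```python
def prefix_op(hi, lo):
    """Prefix operator: merge the (G, P) pair of a block with that of the
    adjacent, less significant block."""
    g_hi, p_hi = hi
    g_lo, p_lo = lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def prefix_add(a, b, n):
    """Add two n-bit integers (zero input carry) with a Kogge-Stone prefix tree.

    Returns the n sum bits plus the carry out as an (n + 1)-bit integer.
    """
    # Pre-processing: bitwise generate and propagate signals.
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]

    # Prefix-processing: after the level with distance d, node i spans
    # bits i down to max(0, i - 2*d + 1).
    gp = list(zip(g, p))
    d = 1
    while d < n:
        gp = [prefix_op(gp[i], gp[i - d]) if i >= d else gp[i]
              for i in range(n)]
        d *= 2
    carry = [G for G, _ in gp]  # carry[i] is the block generate from bit i to bit 0

    # Post-processing: each sum bit is the propagate XOR the incoming carry.
    total = 0
    for i in range(n):
        c_in = carry[i - 1] if i > 0 else 0
        total |= (p[i] ^ c_in) << i
    return total | (carry[n - 1] << n)
```

For a power-of-two operand length the loop runs for log2(n) levels, which is the level count quoted for the Kogge-Stone adder; a speculative variant would simply stop this loop early.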


Fig. 2. [...] $n = 16$ bit Kogge-Stone adder is pruned, resulting in a speculative prefix-processing stage with $K = 8$.

Fig. 3. Han-Carlson speculative prefix-processing stage. The last Kogge-Stone row of the $n = 16$ bit graph is pruned, resulting in a speculative prefix-processing stage with $K = 8$.

A. Pre-Processing

In the pre-processing stage the generate and propagate signals are computed as in (4), (5).

B. Speculative Prefix-Processing

The speculative prefix-processing stage is one of the main differences compared with the standard prefix adders recalled in the previous section. Instead of computing all the $G_{[i:0]}$ and $P_{[i:0]}$ required in (8) to obtain the exact carry values, only a subset of block generate and propagate signals is calculated; in the post-processing stage approximate carry values are obtained from this subset. The output of the speculative prefix-processing stage will also be used in the error detection and in the error correction stages discussed in the following.

The basic assumption behind the speculative prefix-processing stage is that carry signals propagate for no more than $K$ bits, with $K < n$ and $K = O(\log_2 n)$. This assumption is corroborated by the analyses in [13], [17], which demonstrate that having a propagate chain longer than $K$ is a very rare event.

1) Kogge-Stone Topology: The Kogge-Stone speculative prefix-processing stage has been proposed in [12], [13] and can be obtained by pruning the last levels of a traditional Kogge-Stone adder. In the example shown in Fig. 2, the last level of an $n = 16$ bit Kogge-Stone adder is pruned. As can be observed, for $i \ge 7$ the length of the propagate chains extends for 8 bits, resulting in a speculative prefix-processing stage with $K = 8$.

In general, one has $K = n/2^p$, where $p$ is the number of pruned levels; the number of levels of the speculative stage is correspondingly reduced from $\log_2 n$ to $\log_2 n - p$ (assuming that $n$ is a power of two).

In general, the computed propagate and generate signals for the speculative Kogge-Stone architecture are:

$$(G_{[i:k]}, P_{[i:k]}), \qquad k = \begin{cases} 0 & \text{for: } i \le K-1 \\ i-K+1 & \text{otherwise} \end{cases} \tag{12}$$

2) Han-Carlson Topology: The Han-Carlson adder constitutes a good trade-off between fanout, number of logic levels and number of black cells. Because of this, the Han-Carlson adder can achieve the same speed performance as the Kogge-Stone adder, at lower power consumption and area [16]. Therefore it is interesting to implement a speculative Han-Carlson adder.

For these reasons, we have generated a Han-Carlson speculative prefix-processing stage by deleting the last rows of the Kogge-Stone part of the adder. As an example, Fig. 3 shows the Han-Carlson adder of Fig. 1 in which the two Brent-Kung rows at the beginning and at the end of the graph are unchanged, while the last Kogge-Stone row is pruned. This yields a speculative stage with $K = 8$. In general, one has $K = n/2^p$, where $p$ is the number of pruned levels; the number of levels of the speculative Han-Carlson stage reduces from $1 + \log_2 n$ to $1 + \log_2 n - p$ (assuming that $n$ is a power of two).

As can be observed in Fig. 3, the length of the propagate chains is only $K$ for odd $i$ (with $i > K$), while for even $i$ the propagate chain length is $K + 1$.

In general, the computed propagate and generate signals for the speculative Han-Carlson architecture are:

$$(G_{[i:k]}, P_{[i:k]}), \qquad k = \begin{cases} 0 & \text{for: } i \le K \\ i-K+1 & \text{for: } i > K,\ i \text{ odd} \\ i-K & \text{for: } i > K,\ i \text{ even} \end{cases} \tag{13}$$

As will be apparent in the following, having propagate lengths equal to $K$ for half of the outputs greatly simplifies the error detection.

C. Post-Processing

In the post-processing stage we first compute the approximate carries, $\tilde{c}_i$, and then use them to obtain the approximate sum bits $\tilde{s}_i$ as follows:

$$\tilde{s}_i = p_i \oplus \tilde{c}_{i-1} \tag{14}$$

Similarly to (9), the approximate carries are obtained as the generate signals available in the last level of the prefix-processing stage. For Kogge-Stone we have:

$$\tilde{c}_i = \begin{cases} G_{[i:0]} & \text{for: } i \le K-1 \\ G_{[i:i-K+1]} & \text{otherwise} \end{cases} \tag{15}$$

and for Han-Carlson:

$$\tilde{c}_i = \begin{cases} G_{[i:0]} & \text{for: } i \le K \\ G_{[i:i-K+1]} & \text{for: } i > K,\ i \text{ odd} \\ G_{[i:i-K]} & \text{for: } i > K,\ i \text{ even} \end{cases} \tag{16}$$

D. Error Detection

The conditions in which at least one of the approximate carries is wrong (misprediction) are signaled by the error detection stage. In case of misprediction, an error signal is asserted by the error detection stage and the output of the post-processing stage is discarded. The error correction stage will give the correct sum in the next clock period.

1) Kogge-Stone: The error condition $e_i$ for carry $c_i$ can be obtained from (9), (15), using the properties of the propagate and generate signals, as:

$$e_i = \begin{cases} P_{[i:i-K+1]} \cdot G_{[i-K:0]} & \text{for: } i \ge K \\ 0 & \text{otherwise} \end{cases} \tag{17}$$

Thus, the error signal can be expressed as:

$$ERR = \bigvee_{i=K}^{n-1} P_{[i:i-K+1]} \cdot G_{[i-K:0]} \tag{18}$$

where the symbol $\bigvee$ represents the logical OR.

It is important to note that (18) is a necessary and sufficient error condition that requires the calculation of $G_{[i-K:0]}$. Unfortunately, these terms are actually not computed by the speculative prefix-processing stage (avoiding the computation of these terms is the key idea of speculative adders). Thus, in previous papers, (18) is replaced by the following looser relation:

$$ERR = \bigvee_{i=K}^{n-1} P_{[i:i-K+1]} \tag{19}$$

The last equation is a necessary-only error condition. By using (19), the error signal can be triggered even in the absence of an actual misprediction. While this does not harm the correct operation of the speculative adder, a high rate of such “false positive” errors degrades the average addition time (1).

In this paper, instead, we rewrite the necessary and sufficient condition (18) in a form that does not require the $G_{[i-K:0]}$ terms. To that purpose, let us consider the last two terms of the OR in (18), with index $n-1$ and $n-2$:

$$P_{[n-1:n-K]} \cdot G_{[n-K-1:0]} + P_{[n-2:n-K-1]} \cdot G_{[n-K-2:0]} \tag{20}$$

One has:

$$G_{[n-K-1:0]} = g_{n-K-1} + p_{n-K-1} \cdot G_{[n-K-2:0]} \tag{21}$$

Substituting (21) in (20), the terms with index $n-1$ and $n-2$ of (18) can be simplified as:

$$P_{[n-1:n-K]} \cdot g_{n-K-1} + P_{[n-2:n-K-1]} \cdot G_{[n-K-2:0]} \tag{22}$$

since the remaining product $P_{[n-1:n-K]} \cdot p_{n-K-1} \cdot G_{[n-K-2:0]}$ is absorbed by the second term. Similar simplifications can be realized by considering in (18) the terms $n-2$ and $n-3$, and so on. Finally one obtains:

$$ERR = \bigvee_{i=K}^{n-1} P_{[i:i-K+1]} \cdot g_{i-K} \tag{23}$$

Let us consider, as an example, the prefix-processing stage in Fig. 2. The error signal (23) is given by:

$$ERR = P_{[15:8]} \cdot g_7 + P_{[14:7]} \cdot g_6 + \cdots + P_{[8:1]} \cdot g_0 \tag{24}$$

2) Han-Carlson: The error condition for carry $c_i$ can be obtained from (9), (16) as:

$$e_i = \begin{cases} 0 & \text{for: } i \le K \\ P_{[i:i-K+1]} \cdot G_{[i-K:0]} & \text{for: } i > K,\ i \text{ odd} \\ P_{[i:i-K]} \cdot G_{[i-K-1:0]} & \text{for: } i > K,\ i \text{ even} \end{cases} \tag{25}$$

The error signal can be written as:

$$ERR = \bigvee_{\substack{i=K+1 \\ i \text{ odd}}}^{n-1} P_{[i:i-K+1]} \cdot G_{[i-K:0]} \;+\; \bigvee_{\substack{i=K+2 \\ i \text{ even}}}^{n-1} P_{[i:i-K]} \cdot G_{[i-K-1:0]} \tag{26}$$

It can easily be seen that in (26) the terms in the second OR imply the corresponding terms in the first OR, and are therefore redundant. Let us consider, for instance, the first two terms of the OR (assuming that $K$ is even). We have:

$$P_{[K+2:2]} \cdot G_{[1:0]} = p_{K+2} \cdot \left( P_{[K+1:2]} \cdot G_{[1:0]} \right) \tag{27}$$

Thus, we can write:

$$ERR = \bigvee_{\substack{i=K+1 \\ i \text{ odd}}}^{n-1} P_{[i:i-K+1]} \cdot G_{[i-K:0]} \tag{28}$$

The last equation can be simplified, following an approach similar to the previous subsection. Let us consider the last two terms of the OR in (28), with index $n-1$ and $n-3$ (assuming that $n$ is even):

$$P_{[n-1:n-K]} \cdot G_{[n-K-1:0]} + P_{[n-3:n-K-2]} \cdot G_{[n-K-3:0]} \tag{29}$$

One has:

$$G_{[n-K-1:0]} = G_{[n-K-1:n-K-2]} + P_{[n-K-1:n-K-2]} \cdot G_{[n-K-3:0]} \tag{30}$$

Substituting (30) in (29), the terms with index $n-1$ and $n-3$ of (28) can be simplified as:

$$P_{[n-1:n-K]} \cdot G_{[n-K-1:n-K-2]} + P_{[n-3:n-K-2]} \cdot G_{[n-K-3:0]} \tag{31}$$

Similar simplifications can be realized by considering in (28) the terms $n-3$ and $n-5$, and so on. Finally one obtains:

$$ERR = \bigvee_{\substack{i=K+1 \\ i \text{ odd}}}^{n-1} P_{[i:i-K+1]} \cdot G_{[i-K:i-K-1]} \tag{32}$$

Let us consider, as an example, the prefix-processing stage in Fig. 3. The error signal (32) is given by:

$$ERR = P_{[15:8]} \cdot G_{[7:6]} + P_{[13:6]} \cdot G_{[5:4]} + P_{[11:4]} \cdot G_{[3:2]} + P_{[9:2]} \cdot G_{[1:0]}$$

By comparing (23) and (32), it can easily be seen that the number of terms to be OR-ed to obtain the error signal is halved in the Han-Carlson topology, compared to Kogge-Stone.

We name “checking nodes” the nodes of the prefix-processing stage whose outputs are needed to compute the error signal. The checking nodes for both the Kogge-Stone example of Fig. 2 and the Han-Carlson example of Fig. 3 are highlighted as big hatched dots in Fig. 4.

Fig. 4. The nodes of the prefix-processing stage whose outputs are needed to compute the error signal are named “checking nodes” and are highlighted as big hatched dots, for the topologies in Figs. 2 and 3.

As can be observed, in Kogge-Stone some of the checking cells are at the last level of the graph; their output signals are available after a delay of three black cells. In Han-Carlson the critical checking cells are in the second-last level of the graph and are also available after a delay of three black cells, in spite of the larger number of levels of the Han-Carlson prefix-processing stage.
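The difference between the two Kogge-Stone detectors discussed above can be checked by brute force on short operands. The sketch below is a software model written under stated assumptions, not the authors' hardware: the approximate carry of bit i looks back at most K bits, the "precise" detector flags a run of K propagate bits sitting directly above a generate bit, and the "coarse" detector is assumed to flag any run of K propagate bits. All function names are illustrative.

```python
def preprocess(a, b, n):
    """Bitwise generate (AND) and propagate (XOR) signals of two n-bit operands."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    return g, p


def block_gp(g, p, i, k):
    """Block generate/propagate from bit k up to bit i, by the usual recurrences."""
    G, P = g[k], p[k]
    for j in range(k + 1, i + 1):
        G, P = g[j] | (p[j] & G), p[j] & P
    return G, P


def ks_speculative(a, b, n, K):
    """Exact carries, K-bit speculative carries and the two error flags."""
    g, p = preprocess(a, b, n)
    exact = [block_gp(g, p, i, 0)[0] for i in range(n)]
    approx = [block_gp(g, p, i, max(0, i - K + 1))[0] for i in range(n)]
    precise = coarse = 0
    for i in range(K, n):
        chain = block_gp(g, p, i, i - K + 1)[1]  # K propagates ending at bit i
        precise |= chain & g[i - K]              # ... fed by a generate below
        coarse |= chain                          # assumed looser condition
    return exact, approx, precise, coarse


def count_outcomes(n, K):
    """Exhaustively compare both detectors with the true misprediction flag."""
    mispredictions = false_positives = 0
    for a in range(1 << n):
        for b in range(1 << n):
            exact, approx, precise, coarse = ks_speculative(a, b, n, K)
            wrong = int(exact != approx)
            assert precise == wrong   # necessary and sufficient
            assert coarse >= wrong    # necessary only
            mispredictions += wrong
            false_positives += coarse - wrong
    return mispredictions, false_positives
```

Calling `count_outcomes(6, 3)` walks all 4096 operand pairs; a nonzero false-positive count is what makes the looser detector cost extra clock cycles on average. Here a misprediction is taken to mean that any of the n carries, including the carry out, differs from its exact value.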

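The claim that the Han-Carlson detector needs only half as many OR terms can also be explored in software. The sketch below assumes that carries up to bit K are exact, that odd positions above K look back K bits and even positions K + 1 bits, and that the detector ORs one product per odd position: a K-bit propagate run times the two-bit block generate just below it. It further assumes that K and n are even. These modelling choices and all names are assumptions of this sketch.

```python
def preprocess(a, b, n):
    """Bitwise generate (AND) and propagate (XOR) signals of two n-bit operands."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]
    return g, p


def block_gp(g, p, i, k):
    """Block generate/propagate from bit k up to bit i."""
    G, P = g[k], p[k]
    for j in range(k + 1, i + 1):
        G, P = g[j] | (p[j] & G), p[j] & P
    return G, P


def hc_speculative(a, b, n, K):
    """Exact carries, speculative Han-Carlson carries, error flag, term count."""
    g, p = preprocess(a, b, n)
    exact = [block_gp(g, p, i, 0)[0] for i in range(n)]
    approx = []
    for i in range(n):
        if i <= K:
            low = 0              # short chains are computed exactly
        elif i % 2 == 1:
            low = i - K + 1      # odd position: K-bit chain
        else:
            low = i - K          # even position: (K + 1)-bit chain
        approx.append(block_gp(g, p, i, low)[0])
    # One product per odd position above K.
    terms = [block_gp(g, p, i, i - K + 1)[1] & block_gp(g, p, i - K, i - K - 1)[0]
             for i in range(K + 1, n, 2)]
    return exact, approx, int(any(terms)), len(terms)


def check_han_carlson(n, K):
    """Exhaustively confirm that the flag equals the true misprediction flag."""
    for a in range(1 << n):
        for b in range(1 << n):
            exact, approx, err, _ = hc_speculative(a, b, n, K)
            if err != int(exact != approx):
                return False
    return True
```

For 16-bit operands and K = 8 the model produces four product terms, against the eight (n minus K) terms of the Kogge-Stone detector, which matches the halving noted in the text.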

Fig. 5. Error correction and detection stages for the proposed speculative Han-Carlson adder of Fig. 3.

From the above observations, it can be concluded that error detection is considerably simplified and potentially faster in Han-Carlson, compared to Kogge-Stone.

As an additional note, the need to drive the gates of the error detection stage increases the fanout of the checking cells, slowing the speculative prefix-processing stage.

E. Error Correction

The error correction stage computes the exact carry signals (9), to be used in case of misprediction.

The error correction stage is composed of the levels of the prefix-processing stage that were pruned to obtain the speculative adder. Fig. 5 shows the error correction stage of the proposed speculative Han-Carlson adder; the error correction for the Kogge-Stone topology can be obtained similarly.

It can be observed that the inclusion of the error correction stage increases the fanout of some of the cells of the speculative prefix-processing stage, with an adverse effect on adder speed.

F. Post-Processing

The approximate carries are already available at the output of the prefix-processing stage. The post-processing, according to (14), is the same as that of a non-speculative adder and consists of XOR gates.

IV. ADDERS CHARACTERIZATION

In this section we provide a characterization of the spatial and timing complexity of the investigated variable latency speculative adders, using either Han-Carlson or Kogge-Stone topologies. Results for non-speculative adders are also reported, for comparison. This will be achieved with the help of simplifying hypotheses on the area and speed of the employed gates, with the aim of obtaining an analytic (albeit approximate) comparison between the various topologies. Accurate values of area, speed and power for 65 nm technology will be presented in the next section for a quantitative assessment of variable latency speculative adders. Results of the error rate analysis will also be reported at the end of this section.

A. Spatial and Timing Complexity

In order to estimate adder complexity, we make some simplifying hypotheses.

We assume that the spatial complexity of speculative prefix-processing and error correction is proportional to the number of employed black cells (AO gates). Error detection spatial complexity is simply estimated assuming that it is composed of a set of AND gates (to compute the product terms in (23) or in (32)) followed by a tree of two-input OR gates to compute the error signal (see Fig. 5 for an example). According to the model proposed in [18], we assume as unit gate a basic 2-input gate, such as an AND gate or an OR gate, while we count black cells (AO gates) as two unit gates.

Regarding delay, we assume that the speculative sum delay is proportional to the number of levels of the speculative parallel-prefix stage, plus two additional levels to take into account pre-processing and post-processing. Error detection delay is estimated as the number of OR-tree levels, plus one additional level to take into account the AND gates computing the product terms in (23) or in (32). Assuming the unit gate delay model of [18], we count the basic 2-input gates such as AND and OR as one gate delay, with the exception of the XOR gates, which we count as two gate delays.

The results are shown in Table I. For speculative adders, spatial complexity is reported as the sum of two contributions. The first one (curly brackets) is the contribution of the speculative prefix stage and the error correction stage; the second one (square brackets) is the contribution of the error detection stage. As can be observed, the two area contributions are both lower in the proposed Han-Carlson speculative adder, compared to Kogge-Stone. It is also worth noting that the spatial complexity of speculative adders is higher than that of non-speculative ones, because of the error detection and correction stages.

Regarding timing complexity, Table I reports the values for both the speculative sum and the error detection. The Kogge-Stone adder saves two gate levels in performing the speculative sum. However, the critical path traverses the error detection stage and hence the proposed Han-Carlson architecture appears faster than the Kogge-Stone speculative adder, owing to the halved number of terms to be OR-ed (reported in Table I) to obtain the error signal.

B. Error Rate Analysis

The value of the error probability is fundamental to understand the degradation of the average addition time (1) caused by misprediction. In order to evaluate error probabilities, the proposed speculative Han-Carlson and the Kogge-Stone topologies have been simulated by using a Monte Carlo approach with a 1% relative error and a 99% confidence level. Input vectors have been chosen uniformly distributed [12], [13]. Table II reports the results of the analysis. For Kogge-Stone we name “Precise” the error detection stage based on equation (23), while we name “Coarse” the one based on the necessary-only error condition (19); a similar naming convention is used for the Han-Carlson topology.

TABLE I. SPATIAL AND TIMING COMPLEXITY

TABLE II. ERROR PROBABILITY VALUES

The Precise error detection stage significantly reduces the error probability, compared to the Coarse one. Moreover, the Han-Carlson speculative adder exhibits a lower error probability than the Kogge-Stone one. This can be interpreted as follows: in the Kogge-Stone speculative prefix stage all the carries are computed independently from each other, whereas in Han-Carlson half of the carries (those in even bit-positions) are calculated from “parent” carries (those in odd bit-positions), through an additional level of the tree. This reduces the error probability (if a parent carry is correct the “child” carry will be correct, too).

In the following we will consider only implementations with the Precise error detection stage which, as shown, provides lower error probabilities than Coarse detection.

V. SYNTHESIS RESULTS

We have developed Matlab scripts which generate Verilog descriptions of the proposed variable latency speculative adders, and of their non-speculative counterparts. The [...] synthesis command was used to mark the non-speculative outputs of the speculative adders. We have synthesized these adders in the UMC 65 nm library, for 32 bit, 64 bit, and 128 bit operands.

[...] speed, and area) of different designs, since they strongly depend on the timing constraint used during synthesis. The results reported in the following have been obtained by performing several syntheses of the circuits under investigation, varying the timing constraint. In this way we can compare the various topologies and find the most effective ones depending on the required speed.

The dynamic power dissipation has been evaluated after synthesis by extracting the node activities from a back-annotated simulation.

The variable latency parallel-prefix adders depend on the choice of the parameter $K$ (the assumed maximum length of propagate chains, see Section III). One has $K = n/2^p$, where $p$ is the number of pruned levels of the parallel-prefix stage.

The optimal $K$ value results from the following trade-off: if we increase $K$ we reduce the error probability (with positive effects on the average delay (1)), but we make the parallel-prefix stage slower (fewer levels are pruned) and we also make the error detection slower (because the checking cells move toward the last levels).

To investigate this trade-off, we have synthesized the variable latency speculative adders for different values of the $K$ parameter. Results for the Han-Carlson topology, obtained by varying the synthesis timing constraint, are displayed in Fig. 6, considering 32, 64, and 128 bit adders. Note that the x-axis variable is the average delay, which takes into account the error probability. For comparison, we also report in Fig. 6 the results obtained for the non-speculative Han-Carlson adder.

As can be observed in Fig. 6, the implementations with $K =$ [...] and with $K =$ [...] prove ineffective, the former because of the high error rate, the latter because only a single level is pruned compared to the non-speculative adder.

For $n = 32$ bit (Fig. 6(a)) the optimum value is $K =$ [...]; this value of $K$ is also the best choice for $n = 64$ bit (Fig. 6(b)). For $n = 128$ bit (Fig. 6(c)) both $K =$ [...] and $K =$ [...] give similar performance.

Comparison between the variable latency adder and the non-speculative Han-Carlson topology reveals that variable latency adders reduce the minimum achievable delay. For instance, in the 64 bit case, the minimum achievable delay is about 280 ps for the non-speculative adder and drops to 225 ps in the variable latency architecture.
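The trade-off on the maximum assumed chain length can be explored with a small Monte Carlo experiment in the spirit of the error rate analysis of Section IV. The sketch below draws uniformly distributed operands and estimates how often a Kogge-Stone style K-bit speculation would mispredict, using the condition that a run of K propagate bits sits directly above a generate bit. The sample counts, the seed, the 200 ps clock period and every function name are assumptions chosen for illustration; the sketch does not reproduce the confidence-level procedure or the figures of the paper.

```python
import random


def ks_misprediction(a, b, n, K):
    """True when a run of K propagate bits sits right above a generate bit.

    All bit positions are evaluated at once with integer bit operations.
    """
    mask = (1 << n) - 1
    g = a & b
    p = (a ^ b) & mask
    chain = p
    for j in range(1, K):
        chain &= p << j  # bit i of chain: p_i & p_(i-1) & ... & p_(i-K+1)
    return (chain & (g << K) & mask) != 0


def estimate_error_probability(n, K, samples, seed=0):
    """Fraction of uniformly random operand pairs that mispredict."""
    rng = random.Random(seed)
    errors = sum(
        ks_misprediction(rng.getrandbits(n), rng.getrandbits(n), n, K)
        for _ in range(samples)
    )
    return errors / samples


def average_delay(t_clk, p_err):
    """Average addition time: one cycle, plus one more on a misprediction."""
    return t_clk * (1.0 + p_err)


if __name__ == "__main__":
    # Hypothetical 200 ps clock; a larger K lowers the error probability, but
    # in hardware it would also lengthen the clock period, which is not
    # modelled here.
    for K in (4, 8, 16):
        p_err = estimate_error_probability(32, K, samples=20000)
        print(K, p_err, average_delay(200.0, p_err))
```

Because the clock period is held fixed, the sketch only shows one side of the trade-off: the error probability falls quickly as K grows, while the delay cost of pruning fewer levels has to come from synthesis results.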


Fig. 6. Area and power of Han-Carlson speculative and non-speculative adders, as a function of the timing constraint. (a) 32 bit, (b) 64 bit, (c) 128 bit. Different $K$ values are used for speculative adders.

The analysis of area occupation and power dissipation shows that speculative adders are not effective for large average delay. As the timing constraint imposed during synthesis is made tighter, speculative adders become advantageous. For instance, in the 64-bit case, the speculative Han-Carlson adder results in a lower area for an average delay lower than 385 ps and also in a lower power dissipation for [...]. For [...], the non-speculative adder presents an area of [...] and a power of [...], while the variable latency adder exhibits an area of [...] (20% reduction) and a power of about [...] (9% reduction).


Fig. 7. Comparison of Han-Carlson and Kogge-Stone speculative and non-speculative adders, as a function of the timing constraint. (a) 32 bit, (b) 64 bit, (c) 128 bit.

B. Comparison with Kogge-Stone Variable Latency Speculative Adder

Fig. 7 shows the comparison between the proposed speculative adder and the Kogge-Stone one. Also in this case, we report the performance of non-speculative adders, in order to identify the region where the speculative approach is effective (the optimum $K$ value for the variable latency speculative Kogge-Stone adder is: [...] for [...], [...] for [...] and [...]).


The proposed variable latency Han-Carlson adder outperforms the speculative Kogge-Stone architecture in all the considered cases, confirming the trend highlighted in Table I. As an example, focusing on 64-bit adders, for an average delay lower than 350 ps, the proposed Han-Carlson speculative adder is the best choice in terms of silicon area and power consumption. Moreover, it reduces the minimum achievable delay to 225 ps, an 18% improvement with respect to the Kogge-Stone non-speculative adder and an 11% improvement with respect to the Kogge-Stone speculative adder. For [...], the proposed speculative adders offer a 45% area reduction and a 35% power saving compared to the Kogge-Stone non-speculative adder.

VI. CONCLUSION

In this paper a novel variable latency Han-Carlson parallel-prefix speculative adder for high-speed applications is proposed. A new, more accurate error detection network is introduced, which reduces the error probability compared to the previous approaches.

An extensive set of implementation results for 65 nm CMOS technology shows that the proposed Han-Carlson variable latency adders outperform previously developed variable latency Kogge-Stone architectures. Compared with traditional, non-speculative adders, our analysis demonstrates that variable latency Han-Carlson adders show appreciable improvements when the highest speed is required; otherwise the burden imposed by the error detection and error correction stages overwhelms any advantage. Additional work is required to extend the speculative approach to other parallel-prefix architectures, such as Brent-Kung, Ladner-Fischer, and Knowles.

REFERENCES

[1] I. Koren, Computer Arithmetic Algorithms. Natick, MA, USA: A K Peters, 2002.

[2] R. Zimmermann, “Binary adder architectures for cell-based VLSI and their synthesis,” Ph.D. thesis, Swiss Federal Institute of Technology (ETH) Zurich, Zurich, Switzerland, 1998, Hartung-Gorre Verlag.

[3] R. P. Brent and H. T. Kung, “A regular layout for parallel adders,” IEEE Trans. Comput., vol. C-31, no. 3, pp. 260–264, Mar. 1982.

[4] P. M. Kogge and H. S. Stone, “A parallel algorithm for the efficient solution of a general class of recurrence equations,” IEEE Trans. Comput., vol. C-22, no. 8, pp. 786–793, Aug. 1973.

[5] J. Sklansky, “Conditional-sum addition logic,” IRE Trans. Electron. Comput., vol. EC-9, pp. 226–231, Jun. 1960.

[6] T. Han and D. A. Carlson, “Fast area-efficient VLSI adders,” in Proc. IEEE 8th Symp. Comput. Arith. (ARITH), May 18–21, 1987, pp. 49–56.

[16] S. K. Mathew, R. K. Krishnamurthy, M. A. Anders, R. Rios, K. R. Mistry, and K. Soumyanath, “Sub-500-ps 64-b ALUs in 0.18-µm SOI/bulk CMOS: Design and scaling trends,” IEEE J. Solid-State Circuits, vol. 36, no. 11, pp. 1636–1646, Nov. 2001.

[17] B. Parhami, Computer Arithmetic: Algorithms and Hardware Design. New York: Oxford Univ. Press, 2000.

[18] A. Tyagi, “A reduced-area scheme for carry-select adders,” IEEE Trans. Comput., vol. 42, no. 10, pp. 1163–1170, Oct. 1993.

Darjn Esposito was born in 1989 in Naples, Italy. He received the M.S. degree (with honors) in electronic engineering from the University of Naples “Federico II,” in 2013, where he is currently working toward the Ph.D. degree. His research interests include design of digital VLSI circuits, with particular emphasis on speculative functional units.

Davide De Caro (M'05–SM'09) received the M.S. degree in electronic engineering with honors, in July 1999, and the Ph.D. degree in electronic engineering and computer science, in February 2003, both from the University of Naples “Federico II”, Italy. He has worked in the area of digital integrated VLSI circuit design for the last fourteen years. Since March 2003 he has been a Researcher at the Department of Electrical Engineering and Information Technology. Dr. De Caro is author of more than 50 technical papers in international journals and refereed international conferences.

[...] the Electronic engineering degree with honors in 1995; the Ph.D. degree in electronic engineering in 1999, and the Physics degree with honors in 2009. He has been an Associate Professor, University of Napoli, Italy, since 2005. He was a Research Associate at the Engineering Dept. of the University of Cambridge, U.K., in 2004. His scientific interests include modeling and design of power semiconductor devices and VLSI circuit design. Prof. Napoli is author or coauthor of more than 100 papers published in international journals and conferences.

Nicola Petra (M'05) received the Laurea degree

[7] R. E. Ladner and M. J. Fischer, “Parallel preﬁx computation,” J. ACM, and the Ph.D. degree from the University of Napoli

vol. 27, no. 4, pp. 831–838, Oct. 1980. “Federico II,” Italy, in 2002 and 2007 respectively.

[8] S. Knowles, “A Family of Adders,” in Proc. 14th IEEE Symp. Comput. His research interests include design of digital VLSI

Arith., Vail, CO, USA, Jun. 2001, pp. 277–281. circuits for telecommunications and high-perfor-

[9] S.-L. Lu, “Speeding up processing with approximation circuits,” Com- mance arithmetic circuits. He is now working as

puter, vol. 37, no. 3, pp. 67–73, Mar. 2004. a Researcher at the Department of Electronics and

[10] T. Liu and S.-L. Lu, “Performance improvement with circuit-level Telecommunications Engineering of the University

speculation,” in Proc. 33rd Annu. IEEE/ACM Int. Symp. Microarchit. of Napoli “Federico II.” He has authored or coau-

(MICRO-33), 2000, pp. 348–355. thored more than 30 papers on scientiﬁc journals and

[11] N. Zhu, W.-L. Goh, and K.-S. Yeo, “An enhanced low-power high- international conferences.

speed Adder For Error-Tolerant application,” in Proc. 2009 12th Int.

Symp. Integr. Circuits (ISIC '09), Dec. 14–16, 2009, pp. 69–72.

[12] S. M. Nowick, “Design of a low-latency asynchronous adder using Antonio Giuseppe Maria Strollo (M'05–SM'06) re-

speculative completion,” IEE Proc. Comput. Digit. Tech., vol. 143, no. ceived the Laurea degree (cum laude) and the Ph.D.

5, pp. 301–307, Sep. 1996. degree in electronic engineering from the University

[13] A. K. Verma, P. Brisk, and P. Ienne, “Variable Latency Speculative of Napoli Federico II, Italy. From 2002 he is full

Addition: A New Paradigm for Arithmetic Circuit Design,” in Proc. professor at the same University. He has published

Design, Autom., Test Eur. (DATE '08), Mar. 2008, pp. 1250–1255. more than 110 papers on international journals

[14] A. Cilardo, “A new speculative addition architecture suitable for two's and conferences. His current research interests are

complement operations,” in Proc. Design, Autom., Test Eur. Conf. design and analysis of VLSI circuits. From 2009

Exhib. (DATE '09), Apr. 2009, pp. 664–669. to 2012 he served as Associate Editor of the IEEE

[15] K. Du, P. Varman, and K. Mohanram, “High performance reliable vari- TRANSACTIONS ON CIRCUITS AND SYSTEMS—PART

able latency carry select addition,” in Proc. Design, Autom., Test Eur. I: REGULAR PAPERS; currently he is Associate Editor

Conf. Exhib. (DATE '12), Mar. 2012, pp. 1257–1262. of Integration, the VLSI Journal.
