VLSI Project

INTRODUCTION
1.1. Objective
To implement the Leading Zero Counting (LZC) and Leading Zero Anticipation (LZA) logic for high-speed floating-point units.
1.2. Description
The description of the basic components used in the project is given below.

Floating-Point Units: A floating-point unit is a part of a computer system specially designed to carry out operations on floating-point numbers. Typical operations are addition, subtraction, multiplication, division and square root. Transcendental functions such as exponential and trigonometric calculations are done in most modern processors with software library routines.

Leading Zero Counting logic: Leading zero counting (LZC) is the procedure of encoding in binary representation the number of consecutive zeros that appear in a word before the first (most significant) bit that is equal to one. The opposite holds for the case of leading ones. That first differing bit is called the leading digit. Leading-zero counting means counting the number of leading zeros and updating the exponent value of the floating-point number accordingly.

Leading Zero Anticipation logic: Leading Zero Anticipation (LZA) is a technique that predicts the location of the most significant digit in a floating-point addition given the inputs to the adder. LZA designs have been incorporated in most realistic floating-point processing units (FPU) and commercial processors. The choice of LZA style often depends on the overall design of the floating-point addition unit, that is, on how subtraction is handled when the exponents are the same and how the unit detects and corrects the possible one-bit error of the LZA.
1.3. Introduction to VLSI

VLSI stands for very large scale integration, which refers to integrated circuits that combine thousands of transistor-based circuits into a single chip. VLSI began in the 1970s when complex semiconductor and communication technologies were being developed. The microprocessor is a VLSI device. The term is no longer as common as it once was, as chips have increased in complexity into the hundreds of millions of transistors. The circuits designed may be general purpose integrated circuits such as microprocessors, digital signal processors and memories.

What is VLSI? VLSI stands for "Very Large Scale Integration". This is the field which involves packing more and more logic devices into smaller and smaller areas.
1. Put simply, an integrated circuit is many transistors on one chip.
2. Design/manufacturing of extremely small, complex circuitry using modified semiconductor material.
3. An integrated circuit (IC) may contain millions of transistors, each a few µm or less in size.
4. Applications are wide ranging: most electronic logic devices.

History of Scale Integration:
late 40s: Transistor invented at Bell Labs
late 50s: First IC (JK-FF) by Jack Kilby at TI
early 60s: Small Scale Integration (SSI), 10s of transistors on a chip
late 60s: Medium Scale Integration (MSI), 100s of transistors on a chip
early 70s: Large Scale Integration (LSI), 1000s of transistors on a chip
early 80s: VLSI, 10,000s of transistors on a chip (later 100,000s and now 1,000,000s)
Ultra LSI is sometimes used for 1,000,000s.
o SSI - Small-Scale Integration ($0$ to $10^2$)
o MSI - Medium-Scale Integration ($10^2$ to $10^3$)
o LSI - Large-Scale Integration ($10^3$ to $10^5$)
o VLSI - Very Large-Scale Integration ($10^5$ to $10^7$)
o ULSI - Ultra Large-Scale Integration ($\geq 10^7$)
Applications of VLSI: Electronic systems now perform a wide variety of tasks in daily life, some of them visible, some more hidden:
• Personal entertainment systems such as portable MP3 players and DVD players perform sophisticated algorithms with remarkably little energy.
• Electronic systems in cars operate stereo systems and displays; they also control fuel injection systems, adjust suspensions to varying terrain, and perform the control functions required for anti-lock braking (ABS) systems.
• Digital electronics compress and decompress video, even at high-definition data rates, on the fly in consumer electronics.
• Low-cost terminals for Web browsing still require sophisticated electronics, despite their dedicated function.
• Personal computers and workstations provide word processing, financial analysis, and games. Computers include both central processing units (CPUs) and special-purpose hardware for disk access, faster screen display, etc.
• Medical electronic systems measure bodily functions and perform complex processing algorithms to warn about unusual conditions.
The availability of these complex systems, far from overwhelming consumers, only creates demand for even more complex systems.

There are different entities that one would like to optimize when designing a VLSI circuit. These entities often cannot be optimized simultaneously; one entity is improved only at the expense of one or more others. The most important entities are:
1. Area: Minimization of the chip area is important not only because less silicon is used but also because the yield is in general increased. The yield is the percentage of correct circuits.
2. Speed: The faster a circuit performs its intended computation, the more attractive it is to use. Increasing the operating speed normally requires a larger area, so the tradeoff between speed and area must always be considered carefully. Speed is a design constraint rather than an entity to optimize.
3. Power dissipation: When a chip dissipates too much power it will either become too hot and stop working or will need extra cooling. Besides, there is a special category of applications, viz. portable equipment powered by batteries, for which low power consumption is of primary importance. Here also a tradeoff between low power and chip area exists.
4. Design time: A chip satisfying the specifications should be available as soon as possible. The costs are an important factor, especially when only a small number of chips need to be manufactured.
5. Testability: As a significant percentage of the chips fabricated is expected to be defective, all of them have to be tested before being used in a product. It is important that a chip is easily testable, as testing equipment is expensive. This calls for minimizing the time spent to test a single chip.
One can combine all these entities into a single cost function, the VLSI cost function. Power dissipation is recognized as a critical parameter in the modern VLSI design field. The design of an integrated circuit that is efficient in terms of power, area, and speed simultaneously has become a very challenging problem.
1.3.1 Need for Low Power VLSI Chips
As the scale of integration improves, more transistors, faster and smaller than their predecessors, are being packed into a chip. This leads to a steady growth of the operating frequency and processing capacity per chip, resulting in increased power dissipation. Moore's law predicts the growth rate of integrated circuits. One estimate places the rate at 2× every eighteen months. A need for low power VLSI chips arises from such evolution forces of integrated circuits. Another factor that fuels the need for low power chips is the increased market demand for portable consumer electronics powered by batteries. The craving for smaller, lighter and more durable electronic products indirectly translates to low power requirements. Battery life is becoming a product differentiator in many portable electronic markets. Ironically, high performance computing systems characterized by large power dissipation also drive the low power needs. Power dissipation has a direct impact on the packaging cost of the chip and the cooling cost of the system. Another major demand for low power chips and systems comes from environmental concerns. Since electricity generation is a major source of air pollution, inefficient energy usage in computing equipment indirectly contributes to environmental pollution.
1.3.2 Components of Power Dissipation in CMOS circuits
The power dissipation in digital CMOS circuits can be described by:
$P_{avg} = P_{dynamic} + P_{short\ circuit} + P_{leakage} + P_{static}$    (1.1)
where $P_{avg}$ is the average power dissipation, $P_{dynamic}$ is the dynamic power dissipation due to switching of transistors, $P_{short\ circuit}$ is the short circuit current power dissipation when there is a direct current path from the power supply down to ground, $P_{leakage}$ is the power dissipation due to leakage currents, and $P_{static}$ is the static power dissipation.

Dynamic power dissipation: The dynamic power dissipation, $P_{dynamic}$, is caused by the charging and discharging of capacitances in the circuit. The golden formula for calculation of $P_{dynamic}$ is
$P_{dynamic} = C_L V^2 f$    (1.2)
where $C_L$ is the load capacitance, $V$ is the supply voltage and $f$ is the switching frequency. The dynamic power dissipation is the dominant factor compared with the other components of power dissipation in digital CMOS circuits. For technologies up to 0.35 µm, the dynamic power dissipation is about 80% of a circuit's total dissipation. As the technology scales down, i.e. for submicron technologies, the contribution of dynamic power dissipation also increases because of increased functionality requirements and clock frequencies. Consequently, the majority of existing low power design and power estimation techniques focus on this dynamic component of dissipation.

Power reduction approaches of dynamic dissipation: $P_{dynamic}$ is proportional to the load capacitance $C_L$, the square of the supply voltage $V_{dd}$, the switching activity and the clock frequency $f$. Consequently, power reduction can be achieved in various ways:
• Reduction of output capacitance, $C_L$
• Reduction of power supply voltage, $V_{dd}$
• Reduction of average number of transitions per clock cycle
• Reduction of clock frequency.

Short circuit power dissipation: The short circuit power dissipation, $P_{short\ circuit}$, is caused by the current flow through the direct path existing between the power supply and ground during the transition phase. Short circuit power dissipation exists in static CMOS logic families, but not in dynamic logic gates. Reduction in the short circuit power dissipation can be achieved by applying various techniques. By reducing the transistor ratio W/L and scaling down the technology, the switched capacitance and supply voltage are reduced. $P_{short\ circuit}$ is linearly proportional to the input signal rise and fall times; therefore, reducing the input transition times decreases the short circuit current.
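Returning to equation (1.2) and the reduction approaches listed above, the following is a minimal numerical sketch of how the dynamic term scales. The function name `dynamic_power` and the `activity` parameter are illustrative assumptions; equation (1.2) itself has no activity factor, so it defaults to 1.0 here.

```python
def dynamic_power(c_load, v_supply, f_clock, activity=1.0):
    """Dynamic power in watts, following P = C_L * V^2 * f from equation (1.2).

    c_load   : load capacitance in farads
    v_supply : supply voltage in volts
    f_clock  : switching (clock) frequency in hertz
    activity : assumed average fraction of cycles with a transition (not in 1.2)
    """
    return activity * c_load * v_supply ** 2 * f_clock


if __name__ == "__main__":
    base = dynamic_power(1e-12, 1.0, 1e9)      # 1 pF, 1 V, 1 GHz
    low_v = dynamic_power(1e-12, 0.5, 1e9)     # same node at half the supply voltage
    low_f = dynamic_power(1e-12, 1.0, 0.5e9)   # same node at half the clock frequency
    print("baseline      :", base, "W")
    print("half voltage  :", low_v / base, "of baseline")
    print("half frequency:", low_f / base, "of baseline")
```

Because the supply voltage enters quadratically, halving it cuts the dynamic term to a quarter, while halving the frequency or the capacitance only halves it.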
Leakage power dissipation: The nMOS and pMOS transistors used in a CMOS logic circuit commonly have nonzero reverse leakage and subthreshold currents. In a CMOS integrated circuit, which encompasses a very large number of transistors, these currents can contribute to the total power dissipation even when the transistors are not performing any switching action. The magnitude of the leakage currents depends mainly on the technology parameters used. The leakage power dissipation, $P_{leakage}$, is caused by two types of leakage current:
• The reverse bias diode leakage current between the drain and source terminal of the transistor, and
• The subthreshold current through a turned-off transistor channel.
These current components are technologically controlled and thus the designer can do a number of things for their minimization.

Static power dissipation: Strictly speaking, digital CMOS circuits are not supposed to consume static power from constant static current flow. All non-leakage current in CMOS circuits should only occur in transients when signals are switching. However, there are times when deviations from CMOS style circuit design are necessary. An example is the subthreshold current logic circuit. The subthreshold current does not require a P transistor network and saves half the transistors required for logic computation as compared to CMOS logic. An example where this feature can be exploited is the system reset circuitry.
BASIC COMPONENTS
2.1 Floating-point Unit

A floating-point unit (FPU) is a part of a computer system specially designed to carry out operations on floating-point numbers. Typical operations are addition, subtraction, multiplication, division and square root. Transcendental functions such as exponential and trigonometric calculations are done in most modern processors with software library routines. In most modern general purpose computer architectures, one or more FPUs are integrated with the CPU; however, many embedded processors, especially older designs, do not have any hardware support for floating-point operations. In the absence of an FPU, many FPU functions can be emulated, which saves the hardware cost of an FPU but is significantly slower. Emulation can be implemented at any of several levels in the CPU: as microcode, as an operating system function, or in user space code. The IEEE floating-point standard represents floating-point numbers in 3 fields. The fields are sign, exponent and mantissa. The exponent is a biased number in order to represent all values as positive numbers. The mantissa is the fractional value that is added to a one. This hidden one allows an extra bit of accuracy. Floating-point operations are often pipelined. A floating-point unit is used in processor designs to enhance performance and is a target for low power and high speed design.
2.2 Carry Look-ahead Adder
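As an illustration of the three fields described above, the following sketch unpacks an IEEE 754 single-precision value into sign, biased exponent and fraction, then rebuilds a normal number using the hidden one. The helper names `decode_float32` and `rebuild_normal` are assumptions made for this example; only the standard-library `struct` module is used.

```python
import struct


def decode_float32(x):
    """Split a value, rounded to IEEE 754 single precision, into its 3 fields."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]  # raw 32-bit pattern
    sign = bits >> 31                 # 1 bit
    biased_exp = (bits >> 23) & 0xFF  # 8 bits, bias 127
    fraction = bits & 0x7FFFFF        # 23 bits, the part after the hidden one
    return sign, biased_exp, fraction


def rebuild_normal(sign, biased_exp, fraction):
    """Rebuild a normal number (biased exponent 1..254) from its fields."""
    significand = 1 + fraction / 2 ** 23          # hidden one plus fraction
    return (-1) ** sign * significand * 2.0 ** (biased_exp - 127)


if __name__ == "__main__":
    fields = decode_float32(-6.5)     # -6.5 = -1.625 * 2^2
    print(fields)
    print(rebuild_normal(*fields))
```

Zero, subnormals, infinities and NaNs use the reserved exponent values 0 and 255 and are deliberately left out of `rebuild_normal`.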

A Carry Look-ahead Adder (CLA) can produce carries faster because the carry bits are generated in parallel whenever the inputs change. This technique uses carry bypass logic to speed up the carry propagation. In order to explain carry look-ahead, two important signals, traditionally called carry generate ($G_i$) and carry propagate ($P_i$), are defined as follows.
$G_i = A_i \cdot B_i$
$P_i = A_i \oplus B_i$
The concept of carry generation and propagation can be explained as follows. For a given stage, a carry signal is generated if $G_i$ is true, and the stage propagates an input carry to its output if $P_i$ is true. The carry output signal can be derived from the carry generate, carry propagate and carry-in signals, as expressed by
$C_{i+1} = G_i + P_i \cdot C_i$
To avoid carry ripple, the carry output $C_{i+1}$ of each stage should be expressed directly in terms of the generate and propagate signals and the carry-in $C_0$. Let us use this technique for the carries of a 4-bit CLA adder:
$C_1 = G_0 + P_0 \cdot C_0$
$C_2 = G_1 + P_1 \cdot (G_0 + P_0 \cdot C_0)$
$C_3 = G_2 + P_2 \cdot G_1 + P_2 \cdot P_1 \cdot G_0 + P_2 \cdot P_1 \cdot P_0 \cdot C_0$
$C_4 = G_3 + P_3 \cdot G_2 + P_3 \cdot P_2 \cdot G_1 + P_3 \cdot P_2 \cdot P_1 \cdot G_0 + P_3 \cdot P_2 \cdot P_1 \cdot P_0 \cdot C_0$
For each of the above equations, there is a corresponding multi-input circuit. Figure 2.1 shows the block diagram of the 4-bit CLA adder. From the figure, the CLA circuit generates the carry signals $C_1$, $C_2$, $C_3$ and $C_4$ simultaneously by using the carry-in $C_0$. The adder circuits generate the sums, which are expressed by
$S_i = P_i \oplus C_i$
In general, a 4-bit look-ahead block is used to implement an n-bit CLA adder with a single level. To go faster, an n-bit CLA adder can be implemented with multiple look-ahead levels. The number of look-ahead levels is $\lceil \log_r n \rceil$, where r is the maximum number of inputs per gate.
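The 4-bit carry equations can be checked with a small bit-level model. This is only a behavioural sketch, not a gate-level description; the function name `cla4` is an assumption, and the sum is formed as the XOR of each propagate bit with its incoming carry, the usual companion of the carry equations.

```python
def cla4(a, b, c0=0):
    """Add two 4-bit operands with look-ahead carries; returns (sum, carry_out)."""
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    G = [A[i] & B[i] for i in range(4)]   # carry generate
    P = [A[i] ^ B[i] for i in range(4)]   # carry propagate

    # Every carry is written directly in terms of G, P and c0 (no ripple).
    c1 = G[0] | (P[0] & c0)
    c2 = G[1] | (P[1] & G[0]) | (P[1] & P[0] & c0)
    c3 = G[2] | (P[2] & G[1]) | (P[2] & P[1] & G[0]) | (P[2] & P[1] & P[0] & c0)
    c4 = (G[3] | (P[3] & G[2]) | (P[3] & P[2] & G[1])
          | (P[3] & P[2] & P[1] & G[0]) | (P[3] & P[2] & P[1] & P[0] & c0))

    carries = [c0, c1, c2, c3]
    total = sum((P[i] ^ carries[i]) << i for i in range(4))  # sum bit i = P_i xor C_i
    return total, c4


if __name__ == "__main__":
    print(cla4(0b1011, 0b0110))  # 11 + 6
```

In hardware the four carry expressions are evaluated in parallel, which is where the speed advantage over a ripple carry adder comes from; the Python model only confirms that the expanded equations compute the same carries.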
Fig. 2.1: Architecture of Carry Look-ahead Adder

The delay of the CLA adder increases as the logarithm of the word size, whereas the delay of the ripple carry adder increases linearly with the word size. Thus, the addition performed by a multi-level CLA for a large word size is much faster than with a ripple carry adder. For example, comparing the number of gate delays on the critical path of two 16-bit adders, one using ripple carry and the other using two-level carry look-ahead, the carry look-ahead adder is six times faster than ripple carry. On the other hand, due to the high complexity of the carry look-ahead circuit, it consumes more power than a ripple carry adder.
2.3 Multiplexer
• A MUX is an "input selector". A multiplexer picks one of several inputs and directs it to the output. It allows you to select 1 of N inputs and direct it to the output using ceil(lg N) control bits.
• A MUX is also a combinational logic device, meaning that once the input to the MUX changes, then after a small delay, the output changes. Unlike a register, a MUX does not use a clock to control it.
• A MUX is very handy in a CPU because there are many occasions where you need to select one of several different inputs to some device. Usually, these MUXes are 32-bit m-1 MUXes for some value of m.
• A DeMUX is an "output selector", letting you pick one of N outputs to direct an input to.

A 3-1 MUX
• Usually MUXes are of the form $2^k$-1 MUX where $k \geq 1$. That is, the number of inputs for a typical MUX is a power of 2.
• How many control bits are needed? This can be calculated using the same formula as before: ceil(lg 3) = 2.
• lg 3 evaluates to a value that is greater than 1, but less than 2. When we take the ceiling of that value, we get 2.
• With 2 control bits, we can specify up to four different inputs, but we only have three. When the user specifies input 11, this would normally mean z = x3, but with only 3 inputs, x3 does not exist. The solution is to place a don't care value on the output for that selection. MUXes with k inputs, where k is not a power of 2, are fairly common.
2.4 Multiplication Process
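The control-bit formula and the don't care case of the 3-1 MUX can be sketched behaviourally as follows. The names `control_bits` and `mux`, and the choice of returning `None` to stand for a don't care output, are assumptions made for this example.

```python
def control_bits(n_inputs):
    """Number of select lines for an n-input MUX, i.e. ceil(lg n).

    (n - 1).bit_length() equals ceil(log2(n)) for n >= 1 and avoids
    floating-point rounding.
    """
    return (n_inputs - 1).bit_length()


def mux(inputs, select):
    """Return the selected input; None models a don't care output."""
    width = control_bits(len(inputs))
    if not 0 <= select < (1 << width):
        raise ValueError("select does not fit in the control bits")
    if select >= len(inputs):
        return None          # e.g. select = 0b11 on a 3-1 MUX: x3 does not exist
    return inputs[select]


if __name__ == "__main__":
    x = ["x0", "x1", "x2"]
    print(control_bits(len(x)))      # select lines needed for a 3-1 MUX
    print(mux(x, 0b10))              # selects x2
    print(mux(x, 0b11))              # unused code point, don't care
```

A hardware don't care lets the synthesizer pick whichever output value minimizes the logic; the `None` here only marks that the value is unspecified.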

The simplest multiplication operation is to directly calculate the product of two numbers by hand. This procedure can be divided into three steps: partial product generation, partial product reduction and the final addition. To further specify the operation process, let us calculate by hand the product of two two's complement numbers, for example $1101_{two}$ ($-3_{ten}$) and $0101_{two}$ ($5_{ten}$), as described in figure 2.2.
Fig. 2.2: Multiplication calculation by hand

The bold italic digits are the sign extension bits of the partial products. The first operand is called the multiplicand and the second the multiplier. The intermediate products are called partial products and the final result is called the product. The multiplication process obtained when this method is directly mapped to hardware is shown in figure 2.3.
Fig. 2.3: Multiplication operation in Hardware

As can be seen in the figures, the multiplication operation in hardware consists of partial product (PP) generation, PP reduction and final addition steps. The two rows before the product are called the sum and carry bits. The method takes one multiplier bit at a time from right to left, multiplies the multiplicand by that single bit, and shifts the intermediate product one position to the left of the earlier intermediate products. All the bits of the partial products in each column are added to obtain two bits: sum and carry. Finally, the sum and carry bits in each column have to be summed. Similarly, for the multiplication of an n-bit multiplicand and an m-bit multiplier, a product n + m bits long and m partial products can be generated.
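The partial-product steps above can be sketched for the worked operands $1101_{two} \times 0101_{two}$. This is a behavioural model under stated assumptions: the names `to_signed` and `multiply_twos_complement` are illustrative, the partial product of the multiplier's sign bit is subtracted (one common way to handle a negative two's complement multiplier, not something the text specifies), and the reduction to separate sum and carry rows is collapsed into a plain addition.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def multiply_twos_complement(multiplicand, multiplier, n=4, m=4):
    """Return (partial_products, signed_product) for n-bit by m-bit operands."""
    width = n + m
    mask = (1 << width) - 1
    md = to_signed(multiplicand, n) & mask      # sign-extended to n + m bits
    partial_products = []
    for i in range(m):                          # multiplier bits, right to left
        bit = (multiplier >> i) & 1
        pp = (md << i) & mask if bit else 0     # shift left by bit position
        if bit and i == m - 1:
            pp = (-pp) & mask                   # sign bit has negative weight
        partial_products.append(pp)
    product = sum(partial_products) & mask      # reduction and final addition
    return partial_products, to_signed(product, width)


if __name__ == "__main__":
    pps, product = multiply_twos_complement(0b1101, 0b0101)
    for pp in pps:
        print(format(pp, "08b"))
    print("product:", product)
```

The leading ones in the printed partial products are the sign extension bits mentioned for figure 2.2, and there are m = 4 partial products for the n + m = 8 bit product.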
LZC & LZA
3.1 LZC Architecture
Leading zero counting is the procedure of encoding in binary representation the number of consecutive zeros that appear in a word before the first (most significant) bit that is equal to one. The opposite holds for the case of leading ones. That first differing bit is called the leading digit. Leading-zero counting means counting the number of leading zeros and updating the exponent value of the floating-point number accordingly.

LZC Logic: The problem of normalizing the result can be solved in two ways. The first one involves counting the number of leading zeros of the result and then shifting the result to the left according to the outcome of the LZC unit. This method is slow and is rarely preferred. The second way is to try to predict, in parallel with the true operation (addition/subtraction), a pseudo result that will have almost the same number of leading zeros as the true result. In this way, both the true operation and leading-zero counting on the pseudo result can occur simultaneously. The predicted leading zero count is given to a shifter in order to normalize the true result.
3.2 Review of LZC Architecture

The adoption of this technique depends on the specific design choices made for the complete floating-point unit, such as the number of pipeline stages and the clock period. Counting the number of leading zeros and then updating the fractional part is the main goal of LZC.
3.2.1 Two-step encoding procedure
The first method for determining the leading zero count of a word is based on a two-step encoding procedure. At first, the position of the leading digit of the input operand is marked and the remaining bits are set to zero (one-hot representation). For example, for the input 00110100 the position of the leading digit is given by the codeword 00100000. To derive the one-hot representation, an intermediate string s is produced first. The leading digit and the bits of s that follow it are set to one, while the more significant bits remain zero. For the same input, s is equal to 00111111. Flag v is the LZC flag and denotes the all-zero case for the input.

Algorithmic approach: The second method for computing the leading zero count is based on an algorithmic approach. At first, the input is partitioned into n/2 two-bit groups of adjacent bits. For each group, a 2-bit leading-zero count is generated. The most significant of the two bits also acts as an all-zero indicator for the bits of the group. At the next level, neighbor groups are combined and the leading-zero count of either the left or the right group is selected using a set of multiplexers. The selection is based on the value of the most significant bit of the left group. Based on the number of leading zeros determined from the result selected by the fast rounding unit, the bits in the result selected by the slower rounding unit are left shifted, thereby normalizing the result.
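Both review methods can be modelled in a few lines. The sketch below uses hypothetical helper names (`one_hot_leading_digit`, `lzc_tree`) and describes behaviour only: the first function builds the intermediate string s and the one-hot codeword, and the second combines group counts level by level, with plain recursion standing in for the multiplexer tree.

```python
def one_hot_leading_digit(x, n=8):
    """Two-step encoding: return (s, one_hot, v) for an n-bit input x."""
    s = x
    shift = 1
    while shift < n:            # smear the leading one into all lower positions
        s |= s >> shift
        shift <<= 1
    one_hot = s & ~(s >> 1)     # keep only the most significant one of s
    v = int(x == 0)             # all-zero flag
    return s, one_hot, v


def lzc_tree(bits):
    """Algorithmic approach on a list of bits (MSB first, length a power of 2).

    Returns (count, all_zero). For a 2-bit group the count 2 (binary 10)
    has its top bit set exactly when the group is all zero.
    """
    if len(bits) == 2:
        hi, lo = bits
        if hi:
            return 0, False
        return (1, False) if lo else (2, True)
    half = len(bits) // 2
    left_count, left_zero = lzc_tree(bits[:half])
    right_count, right_zero = lzc_tree(bits[half:])
    if left_zero:               # select the right group, offset by the left width
        return half + right_count, right_zero
    return left_count, False    # select the left group


if __name__ == "__main__":
    s, one_hot, v = one_hot_leading_digit(0b00110100)
    print(format(s, "08b"), format(one_hot, "08b"), v)
    print(lzc_tree([0, 0, 1, 1, 0, 1, 0, 0]))
```

For the input 00110100 used in the text, the first function reproduces s = 00111111 and the codeword 00100000.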
Fig. 3.1: 16-bit LZC unit

The form of the derived equations shows that the leading-zero count can be efficiently computed using standard carry-look-ahead techniques. Our goal is to clarify which part of the prediction circuit, which consists of the LZA logic and the LZC unit, is more critical in terms of energy and delay for the performance of the whole circuit. The benefits of the new LZA error handling method are analyzed and compared to previously known techniques. The efficiency of the proposed circuits has been validated using static and dynamic CMOS implementations in a standard-performance technology. The given word is split into groups with equal numbers of bits, with the LSB and MSB parts handled separately, thereby achieving minimization of delay.

Recursive algorithmic approach: The last method computes the leading zero count bit by bit from the previously computed, more significant bits. The algorithm recursively calculates the i-th bit of the leading zero count based on the precomputed more significant bits. The more significant bits are used in the first stages of the shifter, which perform the coarse normalizing steps. The less significant bits can be delayed since they are not used until the last shifting stages of the normalization shifter. This technique may be beneficial in some cases; however, it cannot be applied when the leading zero counter and the normalization shifter belong to different pipeline stages. Delaying the computation of bits of the leading zero count can be applied to all other LZC circuits by properly sizing the gates of the circuit. The extra time slack provided to the less significant bits can be used for reducing the power dissipation of the circuit.
3.2.2 Proposed LZC Algorithm
In this section, the proposed LZC unit is presented. Following a mathematical approach, we simplify the boolean relations that describe the Z bits of the leading-zero count. Conventionally, when the input is equal to zero, the bits of the count are also set to zero, indicating that no normalization is required. However, as long as the all-zero flag is asserted, we can map the bits to any other value. Therefore, we chose to set each bit to 1. Counting the leading zeros of a word is also useful in many other cases besides floating-point data paths. Almost all instruction sets of contemporary microprocessors include a count leading zeros (CLZ) instruction for fixed-point operands.
Fig. 3.2: Example of the new method for determining the leading-zero count of an 8-bit operand.

The least significant bit determines whether the leading-zero count is an odd or even number. When the function F is asserted, it means that the leading-zero count of the input string X is an even number. The least significant bit is unused. The reason is that when X is equal to zero, we are allowed to treat the bits as don't care values. The bits are computed by the application of the operator to different groups of bits of the input operand. The finalized equations for the proposed architecture are as follows.
$Z_0 = F(A_7, A_6, A_5, A_4, A_3, A_2, A_1, A_0)$    (3.1)
$Z_1 = F(A_7 + A_6, A_5 + A_4, A_3 + A_2, A_1 + A_0)$    (3.2)
$Z_2 = F(A_7 + A_6 + A_5 + A_4, A_3 + A_2 + A_1 + A_0)$    (3.3)
$V = A_7 + A_6 + A_5 + A_4 + A_3 + A_2 + A_1 + A_0$    (3.4)
For the 8-bit leading-zero count, by applying function F to all the bits of the input operand, we determine whether its leading-zero count is an even number. In the example this is false, and so $Z_0$ is set equal to 1. This operation requires several logic stages to complete. Thus, in parallel, we perform a bitwise OR operation between the neighbor bits of the input operand, deriving a new half-size string. We then determine for the new string 0011 whether its leading-zero count is even. Since 0011 has an even number of leading zeros, bit $Z_1$ is set equal to 0. In this case, $Z_1$ assumes half the inputs compared to the computation of $Z_0$, allowing the computation of $Z_0$ and $Z_1$ to finish almost simultaneously. Finally, we can OR the adjacent bits of the intermediate string 0011 and apply the operator to the new word 01 to derive the bit $Z_2$ of the leading-zero count.
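The worked example can be replayed with a short behavioural model of equations (3.1) to (3.4). The names `or_pairs`, `lzc_is_odd` and `proposed_lzc8` are assumptions made for this sketch. Following the example rather than the wording around F, each count bit is set to 1 when the leading-zero count of its string is odd, and the all-zero case returns 1 to mirror the choice of setting each bit to 1.

```python
def or_pairs(bits):
    """OR neighbouring bits (MSB first), halving the string length."""
    return [bits[i] | bits[i + 1] for i in range(0, len(bits), 2)]


def lzc_is_odd(bits):
    """1 if the number of leading zeros of `bits` is odd, else 0.

    An all-zero string is a don't care case; 1 is returned for it here.
    """
    for index, bit in enumerate(bits):
        if bit:
            return index % 2
    return 1


def proposed_lzc8(x):
    """Return (Z2, Z1, Z0, V) for an 8-bit operand x."""
    level0 = [(x >> i) & 1 for i in range(7, -1, -1)]   # A7 .. A0
    level1 = or_pairs(level0)                           # ORed pairs
    level2 = or_pairs(level1)                           # ORed groups of four
    v = level2[0] | level2[1]                           # OR of all bits, eq. (3.4)
    return lzc_is_odd(level2), lzc_is_odd(level1), lzc_is_odd(level0), v


if __name__ == "__main__":
    # 00000110 gives the intermediate strings 0011 and 01 used in the example.
    print(proposed_lzc8(0b00000110))
```

The scheme works because ORing neighbouring bits halves the position of the leading one, so the parity of the leading-zero count at each level is the next bit of the binary count. Each bit depends only on the OR tree and its own parity function, never on the other count bits.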
3.3 LZC Unit Organization

The basic block of a carry-look-ahead tree is the well known carry merge (CM) cell. In the proposed LZC unit, the right part of the design, which combines the less significant bits, requires a simplified form of the carry merge cell, composed only of an AND-OR gate.
3.3.1 Proposed 16-bit LZC unit
Each bit of the leading-zero count is computed independently using a separate single-output carry-look-ahead tree. The structure of such a tree, which computes the least significant bit of the leading-zero count in the case of an 8-bit input operand, is shown in Figure 3.2. Besides $Z_0$, which is computed directly from the input bits, the carry-look-ahead trees that compute the remaining bits of the leading-zero count assume as input the OR function of specific groups of the input bits, according to the algorithm described in Figure 2.1. Therefore, a complete binary tree of OR gates is required. The intermediate results produced at each level of the binary OR tree are given as input to the corresponding carry-look-ahead trees that compute the bits of the leading-zero count. Moreover, the final output of the binary OR tree represents the all-zero flag that is also required by the leading-zero counter. In our design, independent carry-look-ahead trees, one per bit of the count, are required to compute the leading-zero count of an n-bit input operand. Each tree combines a different number of bits. $Z_0$ is computed directly from the bits of the input operand, since the OR tree does not participate in its computation, while $Z_1$ assumes as input the ORed pairs.
Fig. 3.3: Proposed LZC Unit

The only circuit added is a single-output carry tree that computes the least significant bit (LSB) of the leading-zero count of the true unnormalized result. This circuit is equivalent to the one shown in Figure 3.2. The circuit runs in parallel with the shifter and, together with the value of the LSB of the leading-zero count that is predicted by the LZA logic, controls the last shifting stage of the normalization shifter.
3.4 LZA Architecture

Leading Zero Anticipation (LZA) is a technique that predicts the location of the most significant digit in a floating-point addition given the inputs to the adder. LZA designs have been incorporated in most realistic floating-point processing units (FPU) and commercial processors. The choice of LZA style often depends on the overall design of the floating-point addition unit, that is, on how subtraction is handled when the exponents are the same and how the unit detects and corrects the possible one-bit error of the LZA. LZA logic consists of two main parts: a pre-encoding module, which generates a string of bits whose most significant "1" has the same position as in the actual sum output, and a leading zero detector (LZD), which is then employed to encode the pre-encoding result. LZA is often used in a floating-point adder when the operation is an effective subtraction. Leading zeros occur when the result of the subtraction is positive; leading ones occur when the result of the subtraction is negative. Leading zero anticipators predict the location of the most significant bit of the result of a floating-point addition directly from the inputs to the adder. Normalization is used as a means of referencing a fixed radix point. Normalization strips out all leading sign bits so that the two bits immediately adjacent to the radix point are of opposite polarity. An LZA device anticipates the number of leading zeros or ones in a sum of mantissas irrespective of the sign of the result or of the relative magnitudes of the input operands.
3.4.1 LZA Logic
One of the LZA techniques can be used more efficiently along with the proposed LZC units. Finally, a new method for handling the error of LZA logic is introduced that further reduces the complexity of the normalization circuit. LZA logic tries to produce from the input operands a prediction string of bits that will have almost the same number of leading zeros or ones as the outcome of the true operation. Leading zeros occur when the result is positive, and leading ones occur when the result is negative. Because the prediction may be off by one bit position, this possible one-bit error must be detected and corrected. Each bit of the prediction string is equal to the value of an indicator that is computed using the carry propagate, carry generate and carry kill functions of the input bits. Many algorithms have been presented so far for the design of the LZA logic. Their difference lies in the boolean relations that describe the function of the indicator. These techniques can be roughly separated into two categories. The first category contains the circuits that detect the case of either leading zeros or ones using a single set of indicators. The second category separates the case of leading zeros from the case of leading ones and uses two distinct sets of indicators. For both LZA architectures, the prediction string is given to a single or two separate LZC units, respectively, in order to encode the number of leading zeros or ones.
Any of the already described LZC units can be used for the encoding. In some cases the indicator predicts the position of the leading digit using information from both the left and the right neighbor bits, irrespective of the sign of the true result.

In the second class of LZA logic, which employs two separate prediction units, the indicators are simpler since they do not need to detect both leading zeros and leading ones. The method presented in Leading-Zero Anticipation and degeneralization predicts the leading zeros and ones using two separate units. The outputs of the two units are combined to produce a single prediction string. Each unit has its own set of indicators: one unit's indicator detects leading zeros and the other unit's indicator detects leading ones. The indicators in each unit are ORed from left to right to create two monotonic strings of zeros followed by ones. The correct encoding (leading zeros or leading ones) is selected using the sign of the true result, which is computed separately from the input operands. This method is efficient in the case of dynamic-CMOS implementations.

In the original version of the LZA logic with two separate units, the indicators of each unit were first ORed and two monotonically increasing strings (zeros followed by ones) were generated. When the indicators are fed directly to the proposed LZC units, generating a monotonic string from either set of indicators is redundant. For the case of split LZA prediction, feeding the two sets of indicators to the proposed LZC units produces two distinct weighted binary representations. The first encodes the predicted number of leading zeros and the other the predicted number of leading ones.

Which of the two binary representations contains the correct normalization information can be selected in two ways. The first is based on the sign of the true result. If the result is positive, the encoding of the leading-zero indicators is selected. In the opposite case, the encoding of the leading-one indicators contains the information needed for the normalization of the result. The second way to get the valid number of leading zeros or ones is to compare the outputs of the two LZC units and select the maximum. This approach is derived directly from the functionality of the LZA logic with two separate prediction units.
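The two selection rules can be compared on a concrete word. The Python sketch below is only an illustration under a simplifying assumption: it counts leading zeros and leading ones of an actual two's-complement result word, not of the two LZA indicator strings, and the function names are hypothetical.

```python
def leading_run(word, width, digit):
    """Count how many consecutive bits equal to `digit` start at the MSB."""
    count = 0
    for i in range(width - 1, -1, -1):
        if (word >> i) & 1 != digit:
            break
        count += 1
    return count


def select_by_sign(word, width):
    """Pick the leading-zero count for a positive word, else the leading-one count."""
    sign = (word >> (width - 1)) & 1
    zeros = leading_run(word, width, 0)
    ones = leading_run(word, width, 1)
    return ones if sign else zeros


def select_by_max(word, width):
    """Pick whichever of the two counts is larger."""
    return max(leading_run(word, width, 0), leading_run(word, width, 1))
```

Both rules agree for every word, because a word that starts with 0 has no leading ones and a word that starts with 1 has no leading zeros. The max-based rule avoids waiting for the sign at the cost of a comparator.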
3.4.2 LZA Error Handling Logic
In certain cases, the predicted position of the leading digit may differ from that of the true result by one. The result is then not correctly normalized, and an additional left shift by one position must take place. The exponent must also be decreased by one. Three existing approaches handle this error.

The first one (LZE I) involves checking the most significant bit of the output of the normalization shifter. If it is equal to zero, the result is not normalized and an additional shift is required. More bits from the intermediate levels of the shifter can be checked to reduce the delay overhead in the last shifting stage.

The second approach (LZE II) combines the information produced by the LZA logic with signals from the adder that produces the true unnormalized result. In some cases the output of the adder is ANDed with the one-hot encoding of the anticipated leading digit. If all the bits of the derived word are equal to zero, then the predicted position of the leading digit is not correct. Hence, an all-zero detector is used to detect the misprediction in parallel with the shifter.

The last approach (LZE III) generates an error indication signal in parallel with the adder and the LZA logic. This approach uses the indicators of the LZA logic to detect specific patterns of the input bits that cause the error. The circuits that implement this form of pattern detection have significant delay and energy cost compared to LZE I and II.

We propose a new LZA error detection and correction method that is as simple as the LZE II techniques without imposing any limitations on the selection of the LZC unit. The structure of the proposed error handling method is shown in Figure 3.4.
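As a rough behavioral illustration of the LZE I idea, the Python sketch below shifts by the predicted count, inspects the MSB of the shifter output and applies one more shift when that bit is zero. It assumes a positive (magnitude) result and assumes that the exponent is decremented by the total shift amount; the function name and argument layout are hypothetical.

```python
def normalize_lze1(result, exponent, predicted, width):
    """Normalize a positive `result` whose predicted count may be one too small."""
    mask = (1 << width) - 1

    # Main normalization shift, driven by the predicted leading-zero count.
    shifted = (result << predicted) & mask
    exponent -= predicted

    # LZE I check: an MSB of zero means the prediction was one position short.
    if result != 0 and not (shifted >> (width - 1)) & 1:
        shifted = (shifted << 1) & mask
        exponent -= 1
    return shifted, exponent
```

With the 8-bit word 00010110 and an exponent of 10, a correct prediction of 3 and an under-prediction of 2 both yield 10110000 with exponent 7. The correction depends on the shifter output, which is why this check sits on the critical path.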
Fig. 3.4: Proposed method for detecting and correcting the 1-bit error of the LZA logic.
In the block diagram, the adder could in principle be a ripple-carry adder, but in this implementation a carry-lookahead adder is used to reduce the delay: the carry for every block is derived from the same input carry instead of rippling from stage to stage, so the delay of the circuit is lower. In the floating-point unit, the shifter performs only left shift operations. A barrel shifter built from multiplexers and registers is used, so that multiple shift amounts are applied simultaneously to increase the speed. Depending on the shifter outputs, the 3:1 multiplexer selects its input. The exponent update unit acts as a subtractor and incrementor: the 4-bit value produced by the LZC unit is used to update the exponent. The decision logic indicates which value the multiplexer selects.

The only circuit added is a single-output carry tree that computes the least significant bit (LSB) of the leading-zero count of the true unnormalized result. This circuit is equivalent to the one shown in Figure 3.3. The circuit runs in parallel with the shifter and, together with the value of the LSB of the leading-zero count that is predicted by the LZA logic, controls the last shifting stage of the normalization shifter.
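One possible reading of this structure is sketched below in Python as a behavioral model. It assumes a positive result, a prediction that is either exact or one too small, a coarse shifter that uses every bit of the predicted count except the LSB, and a final 3-to-1 multiplexer that shifts by 0, 1 or 2. The decision rule and all names are assumptions for illustration; in hardware the true LSB would come from the carry tree, whereas here it is read from the true count.

```python
def count_leading_zeros(word, width):
    """Number of zero bits above the most significant 1 (width if word is 0)."""
    return width - word.bit_length()


def normalize_proposed(result, exponent, predicted, width):
    """Normalize `result` using a predicted count that may be one too small."""
    mask = (1 << width) - 1

    # Coarse shifter: one multiplexer stage per bit of the predicted count
    # except the LSB, so it shifts by the prediction with its LSB cleared.
    shifted = result
    for k in range(1, predicted.bit_length()):
        if (predicted >> k) & 1:
            shifted = (shifted << (1 << k)) & mask

    # Stand-in for the single-output carry tree (assumption: modelled by
    # counting the leading zeros of the true result directly).
    true_lsb = count_leading_zeros(result, width) & 1
    pred_lsb = predicted & 1

    # Decision logic for the last stage: shift by 0, 1 or 2 positions.
    extra = pred_lsb + (pred_lsb ^ true_lsb)
    shifted = (shifted << extra) & mask

    total_shift = (predicted & ~1) + extra
    return shifted, exponent - total_shift
```

Because the prediction is off by at most one, a mismatch between the two LSBs is enough to detect the error, and the final stage can be controlled without waiting for the shifter output.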
3.5 Energy Delay Comparisons

In order to explore the energy-delay space for each design, we performed gate sizing for several delay targets, beginning from the circuit's minimum achievable delay. Optimization is performed using an in-house tool developed around the geometric programming solver and following the gate sizing methodology. During optimization and measurements, inter-stage wiring loads, both capacitance and resistance, have also been taken into account, assuming for the design a bit slice of 16 metal-1 tracks, as used in state-of-the-art microprocessors. To get reasonable delays, all compared designs have been optimized assuming a limit on the maximum allowable input capacitance of each circuit.

3.5.1 LZC Units
At first, the proposed 64-bit LZC units were compared to the most efficient architecture for static CMOS implementations. The main benefit of the proposed designs compared to previous approaches is their energy efficiency. For equal delay measurements, the energy savings range from 10% to 49%. This result stems from the reduced number of gates required to compute the leading-zero count and the simpler gates that appear on the critical path.

The shared-carry propagate approach requires more energy than the straightforward implementation of the proposed LZC unit. This behavior is explained by the fact that the second variant of the proposed LZC unit has more gates on the critical path and a larger fan-out of the internal nodes of the circuit compared to the straightforward implementation. The derived design therefore has increased gate sizes, which also increase the energy requirements of the circuit.

3.5.2 LZA Circuits
The minimum delay achieved in static CMOS by the new LZC unit leaves a lot of room for the insertion of any form of LZA logic. The available delay slack is determined by the speed of the adder that computes the true result. For larger delay targets it is better to employ the shared-carry propagate approach, further reducing the energy of the prediction circuit. (The term prediction circuit denotes the pair of the LZA logic and the LZC unit that predicts the position of the leading digit.)
Fig. 3.5: (a) Energy-delay curves for the static CMOS implementation of the prediction circuit that uses the combined-indicator LZA logic along with the proposed LZC unit. (b) Energy breakdown of prediction circuits using different LZC units.

For all cases in the above figure, including both circuits under comparison, the simulations show that the critical path is unevenly distributed between the LZA logic and the LZC unit. The LZA logic is responsible for roughly 1/3 of the delay of the critical path, while the LZC unit contributes 2/3 of the total delay. Therefore, as far as delay is concerned, the more critical part of the prediction circuit, which needs to be better optimized, is the LZC unit and not the LZA logic. Roughly the same distribution holds for energy.

For the implementation of the split-LZA prediction circuits, we assumed that the correct number of leading zeros is selected according to the value of the true sign of the addition using 2-to-1 multiplexers. The sign is computed by a separate carry-lookahead tree, whose energy and delay overhead has also been included in the designs. The energy-delay diagrams derived are shown in Figure 3.5, where we have also included the prediction circuit that uses the combined-indicator LZA logic and the proposed LZC unit. Figure 3.5 (b) depicts the energy breakdown of the prediction circuits that use the combined-indicator LZA logic and the split LZA logic, respectively, utilizing in both cases the proposed LZC units.

The energy of the two LZC units does not increase, since, after delay optimization, they get less input capacitance compared to the LZC unit with the combined indicator. The capacitance removed from the two LZC units goes to the carry tree that computes the sign of the true result, so that all paths have almost equal delay. Therefore, the energy overhead imposed by the split LZA architecture and the needed sign detection logic is significant compared to the combined-indicator LZA logic. The LZA architecture is responsible for roughly one-third of the total energy of the prediction circuit.

Concluding with the static CMOS prediction circuits, we can say that, when using the proposed LZC unit that gives the most energy-efficient designs, there is no point in using the split LZA architectures, since in all cases they give less efficient circuits than the combined-indicator LZA approach.

The same conclusion can be derived for the case of dynamic CMOS implementations of the prediction circuit. For this case, we performed two sets of experiments. At first, we compared the prediction circuit that uses the combined-indicator LZA logic and the single-rail implementation of the proposed LZC unit with the most efficient previous architecture. The combined-indicator LZA logic is implemented in full dual-rail dynamic CMOS, which then drives the proposed single-rail LZC unit.

3.5.3 LZA Error Handling Methods
A set of experiments has also been performed in order to quantify the energy and the delay requirements of the proposed and the previous LZA error handling methods. As a baseline for our comparisons we assume the energy and the delay of a standalone shifter, which is composed of stages of 2-to-1 multiplexers. All error handling methods should try to keep to a minimum the delay and the energy overhead they add to the normalization shifter.

At first, we designed the technique LZE I. Examining the MSB of the output of the shifter and controlling the extra shifting stage according to its value imposes a significant delay overhead. The extra delay is caused by the large fan-out imposed by the select line. Although the signal is appropriately buffered, the delay overhead is over 18%.
For the case of LZE II, we included only the overhead imposed by the AND gates used to detect whether the position of the predicted leading digit matches the leading digit of the true unnormalized result, and the binary OR tree used to detect the presence of a bit equal to 1. We did not include any of the overhead imposed by the encoder-based LZC unit or the extra circuits that may be used to produce a monotonic string that denotes the position of the leading digit. If we at least include the overhead added by the encoder-based LZC unit that is required to produce the monotonic prediction string used by LZE II, we can safely conclude that the proposed method imposes significantly less energy overhead on the normalization shifter, while keeping the delay penalty to a minimum.
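The AND gates and the OR tree of LZE II can be pictured with the following Python sketch. It is a behavioral illustration only: it assumes a positive, non-zero result and a predicted position inside the word, and the names are hypothetical.

```python
def or_tree(bits):
    """Reduce a list of bits with a binary tree of 2-input OR gates."""
    while len(bits) > 1:
        if len(bits) % 2:
            bits = bits + [0]  # pad odd levels with a constant 0
        bits = [bits[i] | bits[i + 1] for i in range(0, len(bits), 2)]
    return bits[0]


def lze2_mispredicted(result, predicted, width):
    """Return True when the anticipated leading-digit position is wrong."""
    if not 0 <= predicted < width:
        raise ValueError("predicted position must lie inside the word")
    # One-hot encoding of the anticipated leading digit (position 0 = MSB).
    one_hot = 1 << (width - 1 - predicted)
    masked = result & one_hot  # row of AND gates
    masked_bits = [(masked >> i) & 1 for i in range(width)]
    # An all-zero masked word means no 1 sits at the predicted position.
    return or_tree(masked_bits) == 0
```

For the 8-bit result 00010110, a predicted position of 3 is reported as correct and a predicted position of 2 as a misprediction. The check needs only the adder output and the one-hot prediction, so it can run in parallel with the shifter.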
VHDL AND DESIGN TOOLS

4.1 Introduction to VHDL
VHDL stands for VHSIC Hardware Description Language. VHSIC is an abbreviation for Very High Speed Integrated Circuit, a project sponsored by the US Government and Air Force that began in 1980 to advance techniques for designing VLSI silicon chips. Different implementation technologies offer different trade-offs. VHDL synthesis offers an easy way to target a model towards different implementations.

VHSIC Program
VHDL is an offshoot of the Very High Speed Integrated Circuit (VHSIC) program that was funded by the Department of Defense in the late 1970s and early 1980s. The goal of the VHSIC program was to produce the next generation of integrated circuits. Program participants were urged to push technology limits in every phase of the design and manufacture of integrated circuits. The goals were accomplished admirably but, in the process of developing these extremely complex integrated circuits, the designers found that the tools used to create these large designs were inadequate for the task. The tools that were available to the designers were mostly based at the gate level. Creating designs of thousands of gates using gate-level tools was an extremely challenging task, and therefore a new method of description was in order. Initially, VHDL was designed to be a documentation and simulation language. Synthesis tools that describe hardware designs at the register level embraced VHDL later.
4.2 Levels of Abstraction

The various abstraction levels of hardware can be classified as:
• Behavioral
• Data flow
• Structural

Behavioral
A behavioral description is the most abstract. It describes the function of the design in a software-like procedural form and provides no detail as to how the design is to be implemented. The behavioral level of abstraction is most appropriate for fast simulation of complete hardware units, verification and functional simulation of design ideas, modeling standard components, and documentation. The statements inside a process in behavioral modeling are executed sequentially, but the process statements themselves are executed in parallel with one another.

Data flow
The data flow description is a representation of the flow of control and the movement of data. Concurrent data components and carriers communicate through buses and interconnections, and control hardware issues signals for the control of this communication. The level of hardware detail involved in data flow descriptions is great enough that such descriptions cannot serve as end-user or non-technical documentation media.

Structural
A structural description is the lowest and the most detailed level of description considered and is the simplest to synthesize to hardware. A structural description includes a list of concurrently active components and their interconnections. The corresponding function of the hardware is not evident from such descriptions unless the components used are known. A structural description that describes the wiring of logic gates is said to be the hardware description at the gate level.
4.3 Elements of VHDL

VHDL is the most commonly used language for designing. It is a language with a large number of elements suitable for designing digital circuits.

VHDL Terms
Entity: All designs are expressed in terms of entities. An entity is the most basic building block in a design. The input and output pins are declared in the entity. The uppermost level of the design is the top-level entity. If the design is hierarchical, then the top-level description will have lower-level descriptions contained in it. These lower-level descriptions will be lower-level entities in the top-level description.

Architecture: All the entities that can be simulated have an architecture description. The architecture describes the behavior of the entity, and an entity can have multiple architectures. An architecture may use any of the modeling styles: behavioral, structural or dataflow. One architecture might be a behavioral description while another might be a structural description of the design.

Configuration: A configuration statement is used to bind a component instance to an entity-architecture pair. A configuration can be considered like a parts list for a design. It describes which behavior to use for each entity, much like a parts list describes which part to use for each part in the design.

Package: A package is a collection of commonly used data types and subprograms used in a design. Think of a package as a toolbox that contains tools used to build designs. The subprograms available are functions and procedures. The main difference between them is that a function can return only one value as its return value, while a procedure can return more than one value.

Bus: The term bus usually brings to mind a group of signals or a particular method of communication used in the design of hardware. In VHDL a bus is a special kind of signal that may have its drivers turned off.

Process: A process is the basic unit of execution in VHDL. All operations that are performed in a simulation of a VHDL description are broken into single or multiple processes.

Sequential statements: Sequential statements exist inside the boundaries of a process statement as well as in subprograms. Some of the sequential statements are listed below:
• IF
• CASE
• LOOP
• WAIT

VHDL objects
VHDL objects consist of one of the following:
• Signal, which represents the interconnection wires that connect component instantiation ports together.
• Variable, which is used for local storage of temporary data, visible only inside a process.
• Constant, which names specific values of a type.
4.4 Design Tools

4.4.1 ACTIVE HDL
Active-HDL is an integrated environment designed for the development of VHDL designs. The core of the system is a VHDL simulator. Along with debugging and design entry tools, it makes up a complete system that allows you to write, debug and simulate VHDL code. Based on the concept of a design, Active-HDL allows you to organize your VHDL resources into a convenient and clear structure.

Active-HDL can be used to perform the following tasks:
• Development of VHDL-based designs,
• Functional simulation of their code,
• Functional simulation of the synthesized code,
• Timing simulation of the hardware implementation.

Objective:
• Create a new design or add .vhd files to your design
• Compile and debug your design
• Perform simulation

4.4.2 XILINX
The first step in the implementation of a design on an FPGA is the system specification. Specifications refer to the kind of inputs, the kind of outputs and the range of values that the kit can take in. After the system specification, the next step is the architecture. The architecture describes the interconnections between all the blocks involved in our design. Each and every block in the architecture, along with its interconnections, is modeled in either VHDL or Verilog, depending on the ease. All these blocks are then simulated and the outputs are verified for correct functioning.
Fig. 4.1: Xilinx Implementation Design Flow-Chart

After the simulation step, the next step is synthesis. This is a very important step in knowing whether our design can be implemented on an FPGA kit or not. Synthesis converts our VHDL code into its functional components, which are vendor specific. The bit map file contains the whole design, which is placed on the FPGA die; the outputs can then be observed from the FPGA LEDs. This step completes the whole process of implementing our design on an FPGA.

Xilinx ISE 10.1 software
Introduction: Xilinx ISE (Integrated Software Environment) 9.2i software is from the XILINX company and is used to design any digital circuit and implement it onto a Spartan-3E FPGA device. XILINX ISE 10.1i software is used to design the application, verify the functionality and finally download the design onto a Spartan-3E FPGA device.

Xilinx ISE 10.1 software tools
• SIMULATION: ISE (Integrated Software Environment) Simulator
• SYNTHESIS: XST (Xilinx Synthesis Technology) Synthesizer

Design steps using Xilinx ISE 10.1
1. Create an ISE PROJECT for the particular embedded system application.
2. Write the assembly code in notepad or write pad and generate the Verilog or VHDL module by making use of the assembler.
3. Check the syntax of the design.
4. Create a Verilog test fixture for the design.
5. Simulate the test bench waveform (BEHAVIORAL SIMULATION) for functional verification of the design using the ISE simulator.
6. Synthesize and implement the top-level module using the XST synthesizer.
SIMULATION AND SYNTHESIS

5.1 Simulation Results
5.1.1 16-bit Carry Look ahead Adder
Fig. 5.1: Simulated result for 16-bit Carry Look ahead Adder
5.1.2 LZA with error detection unit
Fig. 5.2: Simulated result for LZA with error detection unit
5.1.3 Proposed 16-bit Leading Zero Count unit
Fig. 5.3: Simulated result for proposed 16-bit Leading Zero Count unit
5.1.4 Shifter with log2n-1 stages
Fig. 5.4: Simulated result for shifter with log2n-1 stages
5.1.5 Exponent Updater & Decision Block
Fig. 5.5: Simulated result for Exponent Updater & Decision Block
5.1.6 3 to 1 Multiplexer
Fig. 5.6: Simulated result for 3 to 1 Multiplexer
5.1.7 Proposed method for detecting and correcting the 1-bit error of the LZA logic unit
Fig. 5.7: Simulated result for detecting and correcting the 1-bit error of the LZA logic unit

5.2 Synthesis Results

5.2.1 Logic Circuit
Fig. 5.8: Schematic with basic Inputs and Output
5.2.2 RTL Schematic
Fig. 5.9: Blocks inside the basic design
Internal circuits for RTL Schematic:
Fig. 5.10: Logic involved inside the Carry Look ahead Adder
Fig. 5.11: Logic involved inside the LZA with error detection unit
Fig. 5.12: Logic involved inside the proposed 16-bit LZC unit
5.2.3 Technology Schematic
Fig. 5.14: Technology schematic
Internal circuits for Technology Schematic:
Fig. 5.15: Internal circuit for LZA with error detection unit
Fig. 5.16: Internal circuit for the proposed 16-bit LZC unit

5.3 Synthesis Report

============================================================
* Synthesis Options Summary *
============================================================
---- Source Parameters
Input File Name                    : "LZA_det_corr_top.prj"
Input Format                       : mixed
Ignore Synthesis Constraint File   : NO

---- Target Parameters
Output File Name                   : "LZA_det_corr_top"
Output Format                      : NGC
Target Device                      : CoolRunner2 CPLDs

---- Source Options
Top Module Name                    : LZA_det_corr_top
Automatic FSM Extraction           : YES
FSM Encoding Algorithm             : Auto
Safe Implementation                : No
Mux Extraction                     : YES
Resource Sharing                   : YES

---- Target Options
Add IO Buffers                     : YES
MACRO Preserve                     : YES
XOR Preserve                       : YES
Equivalent register Removal        : YES

---- General Options
Optimization Goal                  : Speed
Optimization Effort                : 1
Library Search Order               : LZA_det_corr_top.lso
Keep Hierarchy                     : YES
RTL Output                         : Yes
Hierarchy Separator                : /
Bus Delimiter                      : <>
Case Specifier                     : maintain
Verilog 2001                       : YES
============================================================
* HDL Compilation *
============================================================
Compiling vhdl file "E:/lzcnlzaprj/code/or2.vhd" in Library work.
Architecture or2_arch of Entity or1 is up to date.
Compiling vhdl file "E:/lzcnlzaprj/code/not1.vhd" in Library work.
Architecture not1_arch of Entity not1 is up to date.
Compiling vhdl file "E:/lzcnlzaprj/code/and2.vhd" in Library work.
Architecture and2_arch of Entity and1 is up to date.
Compiling vhdl file "E:/lzcnlzaprj/code/xor2.vhd" in Library work.
Architecture xor2_arch of Entity xor1 is up to date.
Compiling vhdl file "E:/lzcnlzaprj/code/carrygen.vhd" in Library work.
Architecture carrygen_arch of Entity carrygen is up to date.
Compiling vhdl file "E:/lzcnlzaprj/code/CLA.vhd" in Library work.
Architecture cla_16b_arch of Entity cla_16b is up to date.
Compiling vhdl file "E:/lzcnlzaprj/code/LZA_errdet.vhd" in Library work.
Architecture lza_errdet_arch of Entity lza_errdet is up to date.
Compiling vhdl file "E:/lzcnlzaprj/code/LZC_16b_proposed.vhd" in Library work.
Architecture lzc_16b_p_arch of Entity lzc_16b_p is up to date.
Compiling vhdl file "E:/lzcnlzaprj/code/shifter.vhd" in Library work.
Architecture shifter_arch of Entity shifter is up to date.
Compiling vhdl file "E:/lzcnlzaprj/code/expon_update.vhd" in Library work.
Architecture expon_update_arch of Entity expon_update is up to date.
Compiling vhdl file "E:/lzcnlzaprj/code/mux3.vhd" in Library work.
Architecture mux3_arch of Entity mux3 is up to date.
Compiling vhdl file "E:/lzcnlzaprj/code/LZA_det_corr_top.vhd" in Library work.
Architecture lza_det_corr_top_arch of Entity lza_det_corr_top is up to date.

============================================================
* Design Hierarchy Analysis *
============================================================
Analyzing hierarchy for entity <LZA_det_corr_top> in library <work> (architecture <lza_det_corr_top_arch>).
Analyzing hierarchy for entity <CLA_16b> in library <work> (architecture <cla_16b_arch>).
Analyzing hierarchy for entity <LZA_errdet> in library <work> (architecture <lza_errdet_arch>).
Analyzing hierarchy for entity <LZC_16b_p> in library <work> (architecture <lzc_16b_p_arch>).
Analyzing hierarchy for entity <shifter> in library <work> (architecture <shifter_arch>).
Analyzing hierarchy for entity <expon_update> in library <work> (architecture <expon_update_arch>).
Analyzing hierarchy for entity <mux3> in library <work> (architecture <mux3_arch>).
Analyzing hierarchy for entity <xor1> in library <work> (architecture <xor2_arch>).
Analyzing hierarchy for entity <and1> in library <work> (architecture <and2_arch>).
Analyzing hierarchy for entity <carrygen> in library <work> (architecture <carrygen_arch>).
Analyzing hierarchy for entity <or1> in library <work> (architecture <or2_arch>).
Analyzing hierarchy for entity <not1> in library <work> (architecture <not1_arch>).

============================================================
* HDL Analysis *
============================================================
Analyzing Entity <LZA_det_corr_top> in library <work> (Architecture <lza_det_corr_top_arch>).
Entity <LZA_det_corr_top> analyzed. Unit <LZA_det_corr_top> generated.
Analyzing Entity <CLA_16b> in library <work> (Architecture <cla_16b_arch>).
Entity <CLA_16b> analyzed. Unit <CLA_16b> generated.
Analyzing Entity <xor1> in library <work> (Architecture <xor2_arch>).
Entity <xor1> analyzed. Unit <xor1> generated.
Analyzing Entity <and1> in library <work> (Architecture <and2_arch>).
Entity <and1> analyzed. Unit <and1> generated.
Analyzing Entity <carrygen> in library <work> (Architecture <carrygen_arch>).
Entity <carrygen> analyzed. Unit <carrygen> generated.
Analyzing Entity <LZA_errdet> in library <work> (Architecture <lza_errdet_arch>).
Entity <LZA_errdet> analyzed. Unit <LZA_errdet> generated.
Analyzing Entity <LZC_16b_p> in library <work> (Architecture <lzc_16b_p_arch>).
Entity <LZC_16b_p> analyzed. Unit <LZC_16b_p> generated.
Analyzing Entity <or1> in library <work> (Architecture <or2_arch>).
Entity <or1> analyzed. Unit <or1> generated.
Analyzing Entity <not1> in library <work> (Architecture <not1_arch>).
Entity <not1> analyzed. Unit <not1> generated.
Analyzing Entity <shifter> in library <work> (Architecture <shifter_arch>).
Entity <shifter> analyzed. Unit <shifter> generated.
Analyzing Entity <expon_update> in library <work> (Architecture <expon_update_arch>).
Entity <expon_update> analyzed. Unit <expon_update> generated.
Analyzing Entity <mux3> in library <work> (Architecture <mux3_arch>).
Entity <mux3> analyzed. Unit <mux3> generated.

============================================================
* HDL Synthesis *
============================================================
Performing bidirectional port resolution...
Synthesizing Unit <LZA_errdet>.
Synthesizing Unit <shifter>.
Synthesizing Unit <expon_update>.
Synthesizing Unit <mux3>.
Synthesizing Unit <xor1>.
Synthesizing Unit <and1>.
Synthesizing Unit <carrygen>.
Synthesizing Unit <or1>.
Synthesizing Unit <not1>.
Synthesizing Unit <CLA_16b>.
Synthesizing Unit <LZC_16b_p>.
Synthesizing Unit <LZA_det_corr_top>.

============================================================
HDL Synthesis Report

Macro Statistics
# Adders/Subtractors               : 2
 4-bit addsub                      : 1
 4-bit subtractor                  : 1
# Registers                        : 1
 1-bit register                    : 1
# Latches                          : 7
 1-bit latch                       : 1
 16-bit latch                      : 4
 17-bit latch                      : 1
 4-bit latch                       : 1
# Comparators                      : 21
 16-bit comparator equal           : 17
 16-bit comparator greater         : 1
 16-bit comparator less            : 1
 4-bit comparator greater          : 2
# Multiplexers                     : 2
 17-bit 8-to-1 multiplexer         : 2
# Logic shifters                   : 1
 17-bit shifter rotate left        : 1
# Xors                             : 33
 1-bit xor2                        : 32
 16-bit xor2                       : 1
============================================================

============================================================
* Advanced HDL Synthesis *
============================================================
Advanced HDL Synthesis Report

Macro Statistics
# Adders/Subtractors               : 2
 4-bit addsub                      : 1
 4-bit subtractor                  : 1
# Registers                        : 1
 Flip-Flops                        : 1
# Latches                          : 7
 1-bit latch                       : 1
 16-bit latch                      : 4
 17-bit latch                      : 1
 4-bit latch                       : 1
# Comparators                      : 17
 16-bit comparator equal           : 17
# Multiplexers                     : 2
 17-bit 8-to-1 multiplexer         : 2
# Xors                             : 1
 16-bit xor2                       : 1
============================================================

============================================================
* Low Level Synthesis *
============================================================
Optimizing unit <LZA_det_corr_top> ...
Optimizing unit <carrygen> ...
Optimizing unit <xor1> ...
Optimizing unit <LZC_16b_p> ...
Optimizing unit <LZA_errdet> ...
Optimizing unit <shifter> ...
Optimizing unit <mux3> ...
Optimizing unit <CLA_16b> ...
Optimizing unit <expon_update> ...

============================================================
* Partition Report *
============================================================
Partition Implementation Status
-------------------------------
No Partitions were found in this design.
-------------------------------

============================================================
* Final Report *
============================================================
Final Results
RTL Top Level Output File Name     : LZA_det_corr_top.ngr
Top Level Output File Name         : LZA_det_corr_top
Output Format                      : NGC
Optimization Goal                  : Speed
Keep Hierarchy                     : YES

Design Statistics
# IOs                              : 51

Cell Usage :
# BELS                             : 2256
#      AND2                        : 916
#      AND3                        : 61
#      AND4                        : 4
#      AND5                        : 2
#      AND6                        : 6
#      AND7                        : 4
#      AND8                        : 30
#      INV                         : 565
#      OR2                         : 393
#      OR3                         : 67
#      OR4                         : 28
#      OR5                         : 11
#      OR8                         : 28
#      XOR2                        : 141
# FlipFlops/Latches                : 81
#      FDC                         : 1
#      LD                          : 61
#      LDCP                        : 19
# IO Buffers                       : 51
#      IBUF                        : 34
#      OBUF                        : 17
============================================================

Device utilization summary:
---------------------------
Selected Device : 3s500efg320-5

 Number of Slices:                 215  out of   4656     4%
 Number of Slice Flip Flops:        64  out of   9312     0%
 Number of 4 input LUTs:           394  out of   9312     4%
 Number of IOs:                     51
 Number of bonded IOBs:             51  out of    232    21%
    IOB Flip Flops:                 17
 Number of GCLKs:                    2  out of     24     8%

---------------------------
Partition Resource Summary:
---------------------------
No Partitions were found in this design.
---------------------------

============================================================
TIMING REPORT

NOTE: THESE TIMING NUMBERS ARE ONLY A SYNTHESIS ESTIMATE.
FOR ACCURATE TIMING INFORMATION PLEASE REFER TO THE TRACE REPORT
GENERATED AFTER PLACE-and-ROUTE.

Clock Information:
------------------
-----------------------------------+------------------------+-------+
Clock Signal                       | Clock buffer(FF name)  | Load  |
-----------------------------------+------------------------+-------+
clk                                | BUFGP                  | 1     |
-----------------------------------+------------------------+-------+

Asynchronous Control Signals Information:
----------------------------------------
-----------------------------------+------------------------+-------+
Control Signal                     | Buffer(FF name)        | Load  |
-----------------------------------+------------------------+-------+
rst                                | IBUF                   | 1     |
-----------------------------------+------------------------+-------+

Timing Summary:
---------------
Speed Grade: -5

 Minimum period: 2.320ns (Maximum Frequency: 431.099MHz)
 Minimum input arrival time before clock: 23.700ns
 Maximum output required time after clock: 4.114ns
 Maximum combinational path delay: No path found

Timing Detail:
--------------
All values displayed in nanoseconds (ns)

============================================================
Timing constraint: Default period analysis for Clock
 Clock period: 2.040ns (frequency: 490.280MHz)
 Total number of paths / destination ports: 16 / 16
------------------------------------------------------------
Delay:               2.040ns (Levels of Logic = 1)
 Source:             u2/eint_mux0002_14 (LATCH)
 Destination:        u2/eint_mux0002_14 (LATCH)
 Source Clock:       u2/Mcompar_gint_cmp_gt0000_cy<15>1 falling
 Destination Clock:  u2/Mcompar_gint_cmp_gt0000_cy<15>1 falling

 Data Path: u2/eint_mux0002_14 to u2/eint_mux0002_14
                          Gate    Net
 Cell:in->out    fanout   Delay   Delay   Logical Name (Net Name)
 ------------------------------------------------------------
 LD:G->Q              6   0.588   0.572   u2/eint_mux0002_14 (u2/eint_mux0002_14)
 LUT4:I3->O           1   0.612   0.000   u2/eint_mux0003<14>1 (u2/eint_mux0003<14>)
 LD:D                     0.268           u2/eint_mux0002_14
 ------------------------------------------------------------
 Total                    2.040ns (1.468ns logic, 0.572ns route)
                                  (72.0% logic, 28.0% route)

============================================================
Timing constraint: Default period analysis for Clock 'clk'
 Clock period: 2.320ns (frequency: 431.099MHz)
 Total number of paths / destination ports: 1 / 1
------------------------------------------------------------
Delay:               2.320ns (Levels of Logic = 1)
 Source:             u5/sel (FF)
 Destination:        u5/sel (FF)
 Source Clock:       clk rising
 Destination Clock:  clk rising

 Data Path: u5/sel to u5/sel
                          Gate    Net
 Cell:in->out    fanout   Delay   Delay   Logical Name (Net Name)
 ------------------------------------------------------------
 FDC:C->Q             6   0.514   0.569   u5/sel (u5/sel)
 INV:I->O             1   0.612   0.357   u5/sel_not00011_INV_0 (u5/sel_not0001)
 FDC:D                    0.268           u5/sel
 ------------------------------------------------------------
 Total                    2.320ns (1.394ns logic, 0.926ns route)
                                  (60.1% logic, 39.9% route)

============================================================
Timing constraint: Default OFFSET IN BEFORE for Clock 'u6/y_cmp_eq0000'
 Total number of paths / destination ports: 7714 / 13
------------------------------------------------------------
Offset:              19.896ns (Levels of Logic = 19)
 Source:             a<1> (PAD)
 Destination:        u6/y_6 (LATCH)
 Destination Clock:  u6/y_cmp_eq0000 rising

 Data Path: a<1> to u6/y_6
                          Gate    Net
 Cell:in->out    fanout   Delay   Delay   Logical Name (Net Name)
 ------------------------------------------------------------
 IBUF:I->O           12   1.106   0.969   a_1_IBUF (a_1_IBUF)
 LUT4:I0->O           1   0.612   0.387   u1/cc1/c_3_or0000_SW0 (N22)
 LUT3:I2->O           2   0.612   0.410   u1/cc1/c_3_or0000 (u1/co<3>)
 LUT3:I2->O           2   0.612   0.449   u1/cc1/c_4_or00001 (u1/co<4>)
 LUT3:I1->O           2   0.612   0.449   u1/cc1/c_5_or00001 (u1/co<5>)
 LUT3:I1->O           3   0.612   0.520   u1/cc1/c_6_or00001 (u1/co<6>)
 LUT3:I1->O           3   0.612   0.520   u1/cc1/c_7_or00001 (u1/co<7>)
 LUT3:I1->O           3   0.612   0.520   u1/cc1/c_8_or00001 (u1/co<8>)
 LUT3:I1->O           5   0.612   0.690   u1/cc1/c_9_or00001 (u1/co<9>)
 LUT3:I0->O           1   0.612   0.387   u1/cc1/c_15_or000025 (u1/cc1/c_15_or000025)
 LUT4:I2->O           1   0.612   0.387   u1/cc1/c_15_or00001111_SW0 (N45)
 LUT4:I2->O           1   0.612   0.387   u1/cc1/c_15_or00001111 (u1/cc1/c_15_or0000111)
 LUT3:I2->O           2   0.612   0.410   u1/cc1/c_15_or0000136 (u1/cc1/c_15_or0000136)
 LUT3:I2->O           7   0.612   0.671   u1/ss16/Mxor_c_Result1 (sum<15>)
 LUT3:I1->O           5   0.612   0.607   lzc_out<0>1 (lzc_out<0>_mmx_out)
 LUT4:I1->O           1   0.612   0.000   u6/Mmux_y_mux0000101_G (N118)
 MUXF5:I1->O          2   0.278   0.410   u6/Mmux_y_mux0000101 (N21)
 LUT4:I2->O           1   0.612   0.000   u6/Mmux_y_mux000010162_F (N83)
 MUXF5:I0->O          1   0.278   0.000   u6/Mmux_y_mux000010162 (u6/y_mux0000<6>)
 LD_1:D                   0.268           u6/y_6
 ------------------------------------------------------------
 Total                   19.896ns (11.722ns logic, 8.174ns route)
                                  (58.9% logic, 41.1% route)

============================================================
Timing constraint: Default OFFSET IN BEFORE for Clock 'u2/Mcompar_gint_cmp_gt0000_cy<15>1'
 Total number of paths / destination ports: 2144 / 64
------------------------------------------------------------
Offset:              7.183ns (Levels of Logic = 20)
 Source:             a<0> (PAD)
 Destination:        u2/eint_mux0002_15 (LATCH)
 Destination Clock:  u2/Mcompar_gint_cmp_gt0000_cy<15>1 falling

 Data Path: a<0> to u2/eint_mux0002_15
                          Gate    Net
 Cell:in->out    fanout   Delay   Delay   Logical Name (Net Name)
 ------------------------------------------------------------
 IBUF:I->O           12   1.106   0.969   a_0_IBUF (a_0_IBUF)
 LUT2:I0->O           1   0.612   0.000   u2/Mcompar_eint_cmp_lt0000_lut<0>
 MUXCY:S->O           1   0.404   0.000   u2/Mcompar_eint_cmp_lt0000_cy<0>
 MUXCY:CI->O          1   0.051   0.000   u2/Mcompar_eint_cmp_lt0000_cy<1>
 MUXCY:CI->O          1   0.052   0.000   u2/Mcompar_eint_cmp_lt0000_cy<2>
 MUXCY:CI->O          1   0.052   0.000   u2/Mcompar_eint_cmp_lt0000_cy<3>
 MUXCY:CI->O          1   0.052   0.000   u2/Mcompar_eint_cmp_lt0000_cy<4>
 MUXCY:CI->O          1   0.052   0.000   u2/Mcompar_eint_cmp_lt0000_cy<5>
 MUXCY:CI->O          1   0.052   0.000   u2/Mcompar_eint_cmp_lt0000_cy<6>
 MUXCY:CI->O          1   0.052   0.000   u2/Mcompar_eint_cmp_lt0000_cy<7>
 MUXCY:CI->O          1   0.052   0.000   u2/Mcompar_eint_cmp_lt0000_cy<8>
 MUXCY:CI->O          1   0.052   0.000   u2/Mcompar_eint_cmp_lt0000_cy<9>
 MUXCY:CI->O          1   0.052   0.000   u2/Mcompar_eint_cmp_lt0000_cy<10>
 MUXCY:CI->O          1   0.052   0.000   u2/Mcompar_eint_cmp_lt0000_cy<11>
 MUXCY:CI->O          1   0.052   0.000   u2/Mcompar_eint_cmp_lt0000_cy<12>
 MUXCY:CI->O          1   0.052   0.000   u2/Mcompar_eint_cmp_lt0000_cy<13>
 MUXCY:CI->O          1   0.052   0.000   u2/Mcompar_eint_cmp_lt0000_cy<14>
 MUXCY:CI->O          2   0.399   0.449   u2/Mcompar_eint_cmp_lt0000_cy<15>
 LUT2:I1->O          16   0.612   1.031   u2/eint_mux0003<0>11
 LUT4:I0->O           1   0.612   0.000   u2/eint_mux0003<9>1
 LD:D                     0.268           u2/eint_mux0002_9
 ------------------------------------------------------------
 Total                    7.183ns (4.735ns logic, 2.449ns route)
                                  (65.9% logic,
?6.+K route) AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA Timing constraint5 1efault F))&BT -Q 'B)FLB for Cloc3 Su<4%comparXpredXerrXcmpXe"0000Xc Y/@S Total number of paths 4 destination ports5 ?0 4 +7 ------------------------------------------------------------------------Fffset5 <..+.ns (Levels of Logic A <) &ource5 bY.@ (*A1) 1estination5 u<4 X. (LATCE) 1estination Cloc35 u<4%comparXpredXerrXcmpXe"0000Xc Y/@ falling 1ata *ath5 bY.@ to u<4 X. Uate Qet
76
Cell5in-@out fanout 1ela 1ela Logical Qame (Qet Qame) --------------------------------------------------------------------------------------------------'()5--@F +7 +.+0: 0..?? bX.X-'() (bX.X-'()) L(T?5-+-@F + 0.:+< 0.000 u<4 Xmu#000+Y.@+ (u<4 Xmu#000+Y.@) L1C*51 0.<:= u<4 X. ---------------------------------------Total <..+.ns (+..=:ns logic! 0..??ns route) (:=.0K logic! ?<.0K route) AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA Timing constraint5 1efault F))&BT -Q 'B)FLB for Cloc3 Su74e#ponentXcmpXgt0000S Total number of paths 4 destination ports5 6+6< 4 6 ------------------------------------------------------------------------Fffset5 <?./00ns (Levels of Logic A <+) &ource5 aY+@ (*A1) 1estination5 u74e#ponentX? (LATCE) 1estination Cloc35 u74e#ponentXcmpXgt0000 falling 1ata *ath5 aY+@ to u74e#ponentX? Uate Qet Cell5in-@out fanout 1ela 1ela Logical Qame (Qet Qame) --------------------------------------------------------------------------------------------------'()5--@F +< +.+0: 0..:. aX+X-'() (aX+X-'()) L(T65-0-@F + 0.:+< 0.?=/ u+4cc+4cX?Xor0000X&D0 (Q<<) L(T?5-<-@F < 0.:+< 0.6+0 u+4cc+4cX?Xor0000 (u+4coY?@) L(T?5-<-@F < 0.:+< 0.66. u+4cc+4cX6Xor0000+ (u+4coY6@) L(T?5-+-@F < 0.:+< 0.66. u+4cc+4cX7Xor0000+ (u+4coY7@) L(T?5-+-@F ? 0.:+< 0.7<0 u+4cc+4cX:Xor0000+ (u+4coY:@) L(T?5-+-@F ? 0.:+< 0.7<0 u+4cc+4cX/Xor0000+ (u+4coY/@) L(T?5-+-@F ? 0.:+< 0.7<0 u+4cc+4cX=Xor0000+ (u+4coY=@) L(T?5-+-@F 7 0.:+< 0.:.0 u+4cc+4cX.Xor0000+ (u+4coY.@) L(T?5-0-@F + 0.:+< 0.?=/ u+4cc+4cX+7Xor0000<7 L(T65-<-@F + 0.:+< 0.?=/ u+4cc+4cX+7Xor0000++++X&D0 (Q67) L(T65-<-@F + 0.:+< 0.?=/ u+4cc+4cX+7Xor0000++++ L(T?5-<-@F < 0.:+< 0.6+0 u+4cc+4cX+7Xor0000+?: L(T?5-<-@F / 0.:+< 0./76 u+4ss+:4%#orXcXLesult+ (sumY+7@) L(T65-0-@F + 0.:+< 0.000 u74e#ponentXcmpXgt000++=?+ %(H)75-+-@F +: 0.</= 0..0. 
u74e#ponentXcmpXgt000++=?Xf7 L(T?5-<-@F < 0.:+< 0.7?< u74%subXe#ponentXmu#000<Xc Y0@++ L(T65-0-@F 6 0.:+< 0.:7+ u74%subXe#ponentXmu#000<X#orY+@++ L(T65-0-@F 6 0.:+< 0.7:= u74%addsubXe#ponentXmu#000+Xc Y+@++ L(T?5-+-@F ? 0.:+< 0.7<0 u74%addsubXe#ponentXmu#000+Xc Y<@++ L(T65-+-@F + 0.:+< 0.000 u74%addsubXe#ponentXmu#000+X#orY?@++ L1C*51 0.<:= u74e#ponentX? -------------------------------------------------------------------------------------------------Total <?./00ns (+?.<=0ns logic! +0.6<0ns route) (7:.0K logic! 66.0K route) AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
77
Timing constraint5 1efault F))&BT F(T A)TBL for Cloc3 Su:4 XcmpXe"0000S Total number of paths 4 destination ports5 +? 4 +? ------------------------------------------------------------------------Fffset5 6.++6ns (Levels of Logic A +) &ource5 u:4 X+: (LATCE) 1estination5 normali$eXresultY+:@ (*A1) &ource Cloc35 u:4 XcmpXe"0000 rising 1ata *ath5 u:4 X+: to normali$eXresultY+:@ Uate Qet Cell5in-@out fanout 1ela 1ela Logical Qame (Qet Qame) -----------------------------------------------------------------------------------------------L1X+5U-@\ + 0.7== 0.?7/ u:4 X+: (u:4 X+:) F'()5--@F ?.+:. normali$eXresultX+:XF'() ------------------------------------------------------------------------------------------------Total 6.++6ns (?./7/ns logic! 0.?7/ns route) (.+.?K logic! =./K route) AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA Timing constraint5 1efault F))&BT F(T A)TBL for Cloc3 Su74e#ponentXcmpXgt0000S Total number of paths 4 destination ports5 6 4 6 ------------------------------------------------------------------------Fffset5 6.++6ns (Levels of Logic A +) &ource5 u74e#ponentX? (LATCE) 1estination5 normali$eXresultY+7@ (*A1) &ource Cloc35 u74e#ponentXcmpXgt0000 falling 1ata *ath5 u74e#ponentX? to normali$eXresultY+7@ Uate Qet Cell5in-@out fanout 1ela 1ela Logical Qame (Qet Qame) -------------------------------------------------------------------------------------------------L1C*5U-@\ + 0.7== 0.?7/ u74e#ponentX? (u74e#ponentX?) F'()5--@F ?.+:. normali$eXresultX+7XF'() -------------------------------------------------------------------------------------------------Total :.11:ns (?./7/ns logic! 0.?7/ns route) (.+.?K logic! =./K route) AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA Total LBAL time to Hst completion5 +6.00 secs Total C*( time to Hst completion5 +6.<7 secs --@ Total memor usage is +:/666 3ilob tes
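As a cross-check on the figures above, the 'clk' period can be recomputed from the per-cell delays that the report lists for the u5/sel path. This is only a worked sum over the report's own numbers, not additional synthesis data:

```latex
\begin{aligned}
T_{\text{logic}} &= 0.514 + 0.612 + 0.268 = 1.394\ \text{ns} \\
T_{\text{route}} &= 0.569 + 0.357 = 0.926\ \text{ns} \\
T_{\text{clk}}   &= T_{\text{logic}} + T_{\text{route}} = 2.320\ \text{ns} \\
f_{\max}         &= \frac{1}{T_{\text{clk}}} \approx 431\ \text{MHz}
\end{aligned}
```

The reported 431.099 MHz is slightly higher than 1/2.320 ns (about 431.03 MHz), presumably because the tool derives the frequency from the unrounded period.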
CONCLUSION AND FUTURE SCOPE
Two new LZC circuits have been presented in this paper. The computation of the leading-zero count is reduced to well-known carry-lookahead techniques in a unified manner. Significant energy reductions are achieved by the proposed designs compared to the most efficient previous implementations, both in static and dynamic CMOS logic. Based on the new LZC units, simplified prediction circuits are derived that outperform already reported architectures. From the presented analysis, the most efficient combination of LZA logic and LZC unit is derived for the design of the whole prediction circuit. Also, a novel technique for handling the possible error of the LZA logic has been described that imposes the minimum overhead on the normalization shifter without introducing any further limitation.
REFERENCES
1. G. Dimitrakopoulos, K. Galanopoulos, C. Mavrokefalidis, and D. Nikolos, “Low-power leading-zero counting and anticipation logic for high-speed floating point units,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 16, no. 7, pp. 837–850, Jul. 2008.
2. A. Beaumont-Smith, N. Burgess, S. Lefrere, and C. C. Lim, “Reduced latency IEEE floating-point standard adder architecture,” in Proc. 15th IEEE Symp. Comput. Arithmetic, Jul. 2001, pp. 35–42.
3. IEEE Standard for Binary Floating-Point Arithmetic, Std 754-1985, American National Standards Institute and Institute of Electrical and Electronic Engineers, 1985.
4. J. D. Bruguera and T. Lang, “Leading-one prediction with concurrent position correction,” IEEE Trans. Comput., vol. 48, no. 10, pp. 1083–1097, Oct. 1999.
5. M. S. Schmookler and K. J. Nowka, “Leading zero anticipation and detection: A comparison of methods,” in Proc. 15th IEEE Symp. Comput. Arithmetic, Jul. 2001, pp. 7–12.
6. N. Quach and M. J. Flynn, “Leading one prediction: Implementation, generalization, and application,” Stanford University, Stanford, CA, Tech. Rep. CSL-TR-91-463, 1991.
7. V. Oklobdzija, “An algorithmic and novel design of a leading zero detector circuit: Comparison with logic synthesis,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 2, no. 1, pp. 124–128, Mar. 1994.
8. H. Suzuki, H. Morinaka, H. Makino, Y. Nakase, K. Mashiko, and T. Sumi, “Leading-zero anticipatory logic for high-speed floating-point addition,” IEEE J. Solid-State Circuits, vol. 31, no. 8, pp. 1157–1164, Aug. 1996.
9. G. Zhang, Z. Qi, and W. Hu, “A novel design of leading zero anticipation circuit with parallel error detection,” in Proc. IEEE Int. Symp. Circuits Syst., May 2005, pp. 676–679.
Appendix: Code
16-bit Carry Look ahead Adder

library ieee;
use ieee.std_logic_1164.all;

entity CLA_16b is
  port ( a, b : in  std_logic_vector(15 downto 0);
         cin  : in  std_logic;
         su   : out std_logic_vector(15 downto 0);
         ca   : out std_logic );
end CLA_16b;

architecture CLA_16b_arch of CLA_16b is
  component xor1 is
    port ( a, b : in std_logic;
           c    : out std_logic );
  end component;
  component and1 is
    port ( a, b : in std_logic;
           c    : out std_logic );
  end component;
  component carrygen is
    port ( g, p : in  std_logic_vector(15 downto 0);
           ci   : in  std_logic;
           c    : out std_logic_vector(16 downto 1) );
  end component;
  signal p, g : std_logic_vector(15 downto 0);
  signal co   : std_logic_vector(16 downto 1);
begin
  -- propagate (p = a xor b) and generate (g = a and b) for each bit
  xx1  : xor1 port map ( a(0),  b(0),  p(0)  );  aa1  : and1 port map ( a(0),  b(0),  g(0)  );
  xx2  : xor1 port map ( a(1),  b(1),  p(1)  );  aa2  : and1 port map ( a(1),  b(1),  g(1)  );
  xx3  : xor1 port map ( a(2),  b(2),  p(2)  );  aa3  : and1 port map ( a(2),  b(2),  g(2)  );
  xx4  : xor1 port map ( a(3),  b(3),  p(3)  );  aa4  : and1 port map ( a(3),  b(3),  g(3)  );
  xx5  : xor1 port map ( a(4),  b(4),  p(4)  );  aa5  : and1 port map ( a(4),  b(4),  g(4)  );
  xx6  : xor1 port map ( a(5),  b(5),  p(5)  );  aa6  : and1 port map ( a(5),  b(5),  g(5)  );
  xx7  : xor1 port map ( a(6),  b(6),  p(6)  );  aa7  : and1 port map ( a(6),  b(6),  g(6)  );
  xx8  : xor1 port map ( a(7),  b(7),  p(7)  );  aa8  : and1 port map ( a(7),  b(7),  g(7)  );
  xx9  : xor1 port map ( a(8),  b(8),  p(8)  );  aa9  : and1 port map ( a(8),  b(8),  g(8)  );
  xx10 : xor1 port map ( a(9),  b(9),  p(9)  );  aa10 : and1 port map ( a(9),  b(9),  g(9)  );
  xx11 : xor1 port map ( a(10), b(10), p(10) );  aa11 : and1 port map ( a(10), b(10), g(10) );
  xx12 : xor1 port map ( a(11), b(11), p(11) );  aa12 : and1 port map ( a(11), b(11), g(11) );
  xx13 : xor1 port map ( a(12), b(12), p(12) );  aa13 : and1 port map ( a(12), b(12), g(12) );
  xx14 : xor1 port map ( a(13), b(13), p(13) );  aa14 : and1 port map ( a(13), b(13), g(13) );
  xx15 : xor1 port map ( a(14), b(14), p(14) );  aa15 : and1 port map ( a(14), b(14), g(14) );
  xx16 : xor1 port map ( a(15), b(15), p(15) );  aa16 : and1 port map ( a(15), b(15), g(15) );

  -- lookahead carries co(1) .. co(16)
  cc1 : carrygen port map ( g, p, cin, co );

  -- sum bits: su(i) = p(i) xor carry-in of bit i
  ss1  : xor1 port map ( p(0),  cin,    su(0)  );
  ss2  : xor1 port map ( p(1),  co(1),  su(1)  );
  ss3  : xor1 port map ( p(2),  co(2),  su(2)  );
  ss4  : xor1 port map ( p(3),  co(3),  su(3)  );
  ss5  : xor1 port map ( p(4),  co(4),  su(4)  );
  ss6  : xor1 port map ( p(5),  co(5),  su(5)  );
  ss7  : xor1 port map ( p(6),  co(6),  su(6)  );
  ss8  : xor1 port map ( p(7),  co(7),  su(7)  );
  ss9  : xor1 port map ( p(8),  co(8),  su(8)  );
  ss10 : xor1 port map ( p(9),  co(9),  su(9)  );
  ss11 : xor1 port map ( p(10), co(10), su(10) );
  ss12 : xor1 port map ( p(11), co(11), su(11) );
  ss13 : xor1 port map ( p(12), co(12), su(12) );
  ss14 : xor1 port map ( p(13), co(13), su(13) );
  ss15 : xor1 port map ( p(14), co(14), su(14) );
  ss16 : xor1 port map ( p(15), co(15), su(15) );

  ca <= co(16);
end CLA_16b_arch;

LZA with error detection unit

library ieee;
use ieee.std_logic_1164.all;

entity LZA_errdet is
  port ( a, b     : in  std_logic_vector(15 downto 0);
         pred_err : out std_logic;
         y        : out std_logic_vector(15 downto 0) );
end LZA_errdet;

architecture LZA_errdet_arch of LZA_errdet is
  signal g, s, e, f, p, n, z : std_logic_vector(15 downto 0);
  signal egg, zpp, egeg, zpzp, ege, zpz, eges, zpzn, egsg, egseg, egse, egses,
         ess, znn, eses, znzn, ese, znz, eseg, znzp, esgs, esges, esge, esgeg
         : std_logic_vector(15 downto 0) := (others => '0');
begin
  -- bitwise "greater", "smaller" and "equal" indicators
  process(a, b)
    variable gint, sint, eint : std_logic_vector(15 downto 0);
  begin
    if a > b then
      gint := (a and (not b));
    elsif a < b then
      sint := ((not a) and b);
    elsif a = b then
      eint := (a xnor b);
    end if;
    g <= gint;
    s <= sint;
    e <= eint;
  end process;

  process(g, s, e)
    variable fint : std_logic_vector(15 downto 0);
  begin
    fint := (e and g and (not s)) or ((not e) and s and (not s)) or
            (e and s and (not g)) or ((not e) and g and (not g));
    f <= fint;
  end process;

  process(g, s, e, p, n)
    variable pint, nint, zint : std_logic_vector(15 downto 0);
  begin
    pint := ((e or (e and g) or ((not e) and s)) and g);
    nint := ((e or (e and s) or ((not e) and g)) and s);
    zint := not (p or n);
    p <= pint;
    n <= nint;
    z <= zint;
  end process;

  -- pattern terms compared by the error-detection process below
  egg   <= e and g and g;
  zpp   <= z and p and p;
  egeg  <= e and g and e and g;
  zpzp  <= z and p and z and p;
  ege   <= e and g and e;
  zpz   <= z and p and z;
  eges  <= e and g and e and s;
  zpzn  <= z and p and z and n;
  egsg  <= e and g and s and g;
  egseg <= e and g and s and e and g;
  egse  <= e and g and s and e;
  egses <= e and g and s and e and s;
  ess   <= e and s and s;
  znn   <= z and n and n;
  eses  <= e and s and e and s;
  znzn  <= z and n and z and n;
  ese   <= e and s and e;
  znz   <= z and n and z;
  eseg  <= e and s and e and g;
  znzp  <= z and n and z and p;
  esgs  <= e and s and g and s;
  esges <= e and s and g and e and s;
  esge  <= e and s and g and e;
  esgeg <= e and s and g and e and g;

  process(a, b, egg, zpp, egeg, zpzp, ege, zpz, egsg, egseg, egse, ess, znn,
          eses, znzn, ese, znz, esgs, esges, esge, eges, zpzn, egses, eseg,
          znzp, esgeg)
  begin
    if (egg = zpp) or (egeg = zpzp) or (ege = zpz) or (egsg = zpzp) or
       (egseg = zpzp) or (egse = zpz) or (ess = znn) or (eses = znzn) or
       (ese = znz) or (esgs = znzn) or (esges = znzn) or (esge = znz) then
      pred_err <= '0';
      y <= a;
    elsif (eges = zpzn) or (egses = zpzn) or (eseg = znzp) or (esgeg = znzp) then
      pred_err <= '1';
      y <= b;
    end if;
  end process;
end LZA_errdet_arch;

Proposed 16-bit Leading Zero Count unit

library ieee;
use ieee.std_logic_1164.all;

entity LZC_16b_p is
  port ( a    : in  std_logic_vector(15 downto 0);
         vbar : out std_logic;
         zbar : out std_logic_vector(3 downto 0) );
end LZC_16b_p;

architecture LZC_16b_p_arch of LZC_16b_p is
  component or1 is
    port ( a, b : in std_logic;
           c    : out std_logic );
  end component;
  component not1 is
    port ( a : in std_logic;
           b : out std_logic );
  end component;
  component and1 is
    port ( a, b : in std_logic;
           c    : out std_logic );
  end component;
  signal n1, n2, n3, n4, n5, n6, n7, n8, n9, n10, n11 : std_logic;
  signal r1, r2, r3, r4, r5, r6, r7, r8, r9, r10, r11, r12, r13, r14, r15, r16,
         r17, r18, r19, r20, r21, r22 : std_logic;
  signal an1, an2, an3, an4, an5, an6, an7, an8, an9, an10, an11, an12, an13,
         an14, an15, an16 : std_logic;
begin
  ------- Stage 1 -------
  or11 : or1  port map ( a(0),  a(1),  r1 );
  nt11 : not1 port map ( a(2),  n1 );
  or12 : or1  port map ( a(2),  a(3),  r2 );
  nt12 : not1 port map ( a(4),  n2 );
  or13 : or1  port map ( a(4),  a(5),  r3 );
  nt13 : not1 port map ( a(6),  n3 );
  or14 : or1  port map ( a(6),  a(7),  r4 );
  nt14 : not1 port map ( a(8),  n4 );
  or15 : or1  port map ( a(8),  a(9),  r5 );
  nt15 : not1 port map ( a(10), n5 );
  or16 : or1  port map ( a(10), a(11), r6 );
  nt16 : not1 port map ( a(12), n6 );
  or17 : or1  port map ( a(12), a(13), r7 );
  nt17 : not1 port map ( a(14), n7 );
  or18 : or1  port map ( a(14), a(15), r8 );
  ------- Stage 2 -------
  and11 : and1 port map ( a(1),  n1, an1 );
  or21  : or1  port map ( an1, a(3),  r9 );
  or22  : or1  port map ( r1,  r2,    r10 );
  nt21  : not1 port map ( r3,  n8 );
  and12 : and1 port map ( a(5),  n3, an2 );
  and13 : and1 port map ( n2,  n3,   an3 );
  or23  : or1  port map ( an2, a(7),  r11 );
  or24  : or1  port map ( r3,  r4,    r12 );
  nt22  : not1 port map ( r5,  n9 );
  and14 : and1 port map ( a(9),  n5, an4 );
  and15 : and1 port map ( n4,  n5,   an5 );
  or25  : or1  port map ( an4, a(11), r13 );
  or26  : or1  port map ( r5,  r6,    r14 );
  nt23  : not1 port map ( r7,  n10 );
  and16 : and1 port map ( a(13), n7, an6 );
  and17 : and1 port map ( n6,  n7,   an7 );
  or27  : or1  port map ( an6, a(15), r15 );
  or28  : or1  port map ( r7,  r8,    r16 );
  ------- Stage 3 -------
  and31 : and1 port map ( r2,   n8,   an8 );
  or31  : or1  port map ( an8,  r4,   r17 );
  and32 : and1 port map ( r9,   an3,  an9 );
  or32  : or1  port map ( an9,  r11,  r18 );
  or33  : or1  port map ( r12,  r14,  r19 );
  nt31  : not1 port map ( r14,  n11 );
  and33 : and1 port map ( r6,   n10,  an10 );
  and34 : and1 port map ( n9,   n10,  an11 );
  or34  : or1  port map ( an10, r8,   r20 );
  and35 : and1 port map ( r13,  an7,  an12 );
  and36 : and1 port map ( an5,  an7,  an13 );
  or35  : or1  port map ( an12, r15,  r21 );
  or36  : or1  port map ( r14,  r16,  r22 );
  ------- Stage 4 -------
  and41 : and1 port map ( r12,  n11,  an14 );
  or41  : or1  port map ( an14, r16,  zbar(2) );
  and42 : and1 port map ( r17,  an11, an15 );
  or42  : or1  port map ( an15, r20,  zbar(1) );
  and43 : and1 port map ( r18,  an13, an16 );
  or43  : or1  port map ( an16, r21,  zbar(0) );
  or44  : or1  port map ( r19,  r22,  vbar );
  zbar(3) <= r22;
end LZC_16b_p_arch;

Shifter with log2n-1 stages

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;

entity shifter is
  port ( a     : in  std_logic_vector(16 downto 0);
         shift : in  std_logic_vector(2 downto 0);
         shift_out1, shift_out2, shift_out3 : out std_logic_vector(16 downto 0) );
end shifter;

architecture shifter_arch of shifter is
begin
  -- shift_out1: left rotation by shift + 2 (for non-zero shift)
  process(shift, a)
  begin
    case shift is
      when "000" => shift_out1 <= (a);
      when "001" => shift_out1 <= (a(13 downto 0) & a(16 downto 14));
      when "010" => shift_out1 <= (a(12 downto 0) & a(16 downto 13));
      when "011" => shift_out1 <= (a(11 downto 0) & a(16 downto 12));
      when "100" => shift_out1 <= (a(10 downto 0) & a(16 downto 11));
      when "101" => shift_out1 <= (a(9 downto 0) & a(16 downto 10));
      when "110" => shift_out1 <= (a(8 downto 0) & a(16 downto 9));
      when "111" => shift_out1 <= (a(7 downto 0) & a(16 downto 8));
      when others => null;
    end case;
  end process;

  -- shift_out2: left rotation by shift + 1 (for non-zero shift)
  process(shift, a)
  begin
    case shift is
      when "000" => shift_out2 <= (a);
      when "001" => shift_out2 <= (a(14 downto 0) & a(16 downto 15));
      when "010" => shift_out2 <= (a(13 downto 0) & a(16 downto 14));
      when "011" => shift_out2 <= (a(12 downto 0) & a(16 downto 13));
      when "100" => shift_out2 <= (a(11 downto 0) & a(16 downto 12));
      when "101" => shift_out2 <= (a(10 downto 0) & a(16 downto 11));
      when "110" => shift_out2 <= (a(9 downto 0) & a(16 downto 10));
      when "111" => shift_out2 <= (a(8 downto 0) & a(16 downto 9));
      when others => null;
    end case;
  end process;

  -- shift_out3: left rotation by shift
  process(shift, a)
  begin
    case shift is
      when "000" => shift_out3 <= (a);
      when "001" => shift_out3 <= (a(15 downto 0) & a(16));
      when "010" => shift_out3 <= (a(14 downto 0) & a(16 downto 15));
      when "011" => shift_out3 <= (a(13 downto 0) & a(16 downto 14));
      when "100" => shift_out3 <= (a(12 downto 0) & a(16 downto 13));
      when "101" => shift_out3 <= (a(11 downto 0) & a(16 downto 12));
      when "110" => shift_out3 <= (a(10 downto 0) & a(16 downto 11));
      when "111" => shift_out3 <= (a(9 downto 0) & a(16 downto 10));
      when others => null;
    end case;
  end process;
end shifter_arch;

Exponent Updater & Decision Block

library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_unsigned.all;

entity expon_update is
  port ( rst, clk : in  std_logic;
         z, su_l  : in  std_logic_vector(3 downto 0);
         sel_in   : out std_logic_vector(1 downto 0);
         exponent : out std_logic_vector(3 downto 0) );
end expon_update;

architecture expon_update_arch of expon_update is
  signal sel : std_logic;
begin
  -- exponent adjustment from the difference between su_l and z
  process (z, su_l)
    variable expon_sub : std_logic_vector(3 downto 0);
  begin
    if su_l > z then
      expon_sub := su_l - z;
      exponent <= expon_sub + '1';
    elsif z > su_l then
      expon_sub := z - su_l;
      exponent <= expon_sub - '1';
    end if;
  end process;

  -- sel toggles on every rising clock edge after reset
  process(rst, clk)
  begin
    if rst = '1' then
      sel <= '0';
    elsif clk'event and clk = '1' then
      sel <= not sel;
    end if;
  end process;

  process(z, su_l, sel)
  begin
    case sel is
      when '0' => sel_in <= su_l(1 downto 0);
      when '1' => sel_in <= z(1 downto 0);
      when others => null;
    end case;
  end process;
end expon_update_arch;

3 to 1 Multiplexer

library ieee;
use ieee.std_logic_1164.all;

entity mux3 is
  port ( a, b, c : in  std_logic_vector(16 downto 0);
         sel     : in  std_logic_vector(1 downto 0);
         y       : out std_logic_vector(16 downto 0) );
end mux3;

architecture mux3_arch of mux3 is
begin
  process(a, b, c, sel)
  begin
    case sel is
      when "00" => y <= a;
      when "01" => y <= b;
      when "10" => y <= c;
      when others => null;
    end case;
  end process;
end mux3_arch;

Proposed method for detecting and correcting the 1-bit error of the LZA logic unit

library ieee;
use ieee.std_logic_1164.all;

entity LZA_det_corr_top is
  port ( a, b     : in  std_logic_vector(15 downto 0);
         rst, clk : in  std_logic;
         normalize_result : out std_logic_vector(16 downto 0) );
end LZA_det_corr_top;

architecture LZA_det_corr_top_arch of LZA_det_corr_top is
  component CLA_16b is
    port ( a, b : in  std_logic_vector(15 downto 0);
           cin  : in  std_logic;
           su   : out std_logic_vector(15 downto 0);
           ca   : out std_logic );
  end component;
  component LZA_errdet is
    port ( a, b     : in  std_logic_vector(15 downto 0);
           pred_err : out std_logic;
           y        : out std_logic_vector(15 downto 0) );
  end component;
  component LZC_16b_p is
    port ( a    : in  std_logic_vector(15 downto 0);
           vbar : out std_logic;
           zbar : out std_logic_vector(3 downto 0) );
  end component;
  component shifter is
    port ( a     : in  std_logic_vector(16 downto 0);
           shift : in  std_logic_vector(2 downto 0);
           shift_out1, shift_out2, shift_out3 : out std_logic_vector(16 downto 0) );
  end component;
  component expon_update is
    port ( rst, clk : in  std_logic;
           z, su_l  : in  std_logic_vector(3 downto 0);
           sel_in   : out std_logic_vector(1 downto 0);
           exponent : out std_logic_vector(3 downto 0) );
  end component;
  component mux3 is
    port ( a, b, c : in  std_logic_vector(16 downto 0);
           sel     : in  std_logic_vector(1 downto 0);
           y       : out std_logic_vector(16 downto 0) );
  end component;
  signal cin : std_logic := '0';
  signal ca, pred_err, vflag : std_logic;
  signal sel_mux : std_logic_vector(1 downto 0);
  signal lzc_out, exponent : std_logic_vector(3 downto 0);
  signal sum, lza_out : std_logic_vector(15 downto 0);
  signal shift_o1, shift_o2, shift_o3, sum_in, normalize_out : std_logic_vector(16 downto 0);
begin
  u1 : CLA_16b      port map ( a, b, cin, sum, ca );
  u2 : LZA_errdet   port map ( a, b, pred_err, lza_out );
  u3 : LZC_16b_p    port map ( lza_out, vflag, lzc_out );
  u4 : shifter      port map ( sum_in, lzc_out(2 downto 0), shift_o1, shift_o2, shift_o3 );
  u5 : expon_update port map ( rst, clk, lzc_out, sum(15 downto 12), sel_mux, exponent );
  u6 : mux3         port map ( shift_o1, shift_o2, shift_o3, sel_mux, normalize_out );

  sum_in <= ca & sum;
  normalize_result <= normalize_out(16) & exponent & normalize_out(11 downto 0);
end LZA_det_corr_top_arch;
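
When simulating the design, it can help to compare the gate-level LZC unit against a simple behavioral reference. The sketch below is not part of the original design: the entity name lzc16_ref and its ports count and valid are hypothetical, and it returns the plain (non-complemented) leading-zero count, which may be encoded differently from the zbar/vbar outputs of LZC_16b_p.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Behavioral reference model (hypothetical, for simulation checks only).
-- Counts the zeros to the left of the most significant '1' in a.
entity lzc16_ref is
  port ( a     : in  std_logic_vector(15 downto 0);
         count : out std_logic_vector(4 downto 0);  -- 0 .. 16
         valid : out std_logic );                   -- '0' when a has no '1'
end lzc16_ref;

architecture behav of lzc16_ref is
begin
  process(a)
    variable n : natural range 0 to 16;
  begin
    n := 16;                       -- default: no '1' found in a
    for i in 15 downto 0 loop      -- scan from the MSB downwards
      if a(i) = '1' then
        n := 15 - i;               -- zeros above the first '1'
        exit;
      end if;
    end loop;
    count <= std_logic_vector(to_unsigned(n, 5));
    if n = 16 then
      valid <= '0';
    else
      valid <= '1';
    end if;
  end process;
end behav;
```

In a testbench, the same stimulus could drive both this model and LZC_16b_p, so that any disagreement between the two outputs shows up after accounting for the complemented encoding of the gate-level unit.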