
Single Precision Floating Point Unit

Mani Sudha Yalamanchi (sudhadimpy@yahoo.com)
Rajendar Koltur (rkoltur@yahoo.com)

Acknowledgements

Our work is based upon the openipcore Floating Point Unit Core designed and coded in Verilog by Rudolf Usselman. We are indebted to him and www.opencores.org for providing the source code. We also thank Raghu Yerramreddikalva of Intel Corporation for helping us with the synthesis of the floating point unit. With his help, we were able to synthesize our code using Synopsys and a real target library.

Objective

The objective of this project is to design a single precision floating point unit core in VHDL, simulate it and synthesize it. Currently the core supports only floating point addition, subtraction and multiplication.

Methodology

We started by studying computer arithmetic in Reference 2. Next, we read IEEE Standard 754 on binary floating point arithmetic [6]. Reference 5 listed a number of algorithms for high performance floating point arithmetic. To handle the complexity, we leveraged an existing design in Verilog. We rewrote the code in VHDL, learning a lot about both languages in the process. We designed our own testbench and, in addition, adopted the testing methodology of the opencores design and reran their tests. Finally, we synthesized the design using a real ASIC library and wire load model.

Introduction

The FPU core has the following features.
- Implements single precision (32-bit).
- Implements floating point addition, subtraction and multiplication.
- Implements all four rounding modes: round to nearest, round towards +inf, round towards -inf and round to zero.
- All exceptions are implemented and reported according to the IEEE standard.
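
The four rounding modes can be illustrated with a small decision block. This is only a sketch, not the logic used in post_norm: the entity name round_sel_sketch and the ports lsb, guard, round_bit, sticky and inc are hypothetical textbook names, and the rmode encoding is assumed to be the one listed in the signal descriptions (00 nearest even, 01 zero, 10 +inf, 11 -inf).

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper: decides whether the truncated magnitude must be
-- incremented by one unit in the last place for a given rounding mode.
entity round_sel_sketch is
  port (
    rmode     : in  std_logic_vector(1 downto 0);
    sign      : in  std_logic;  -- 1 = negative result
    lsb       : in  std_logic;  -- least significant bit that is kept
    guard     : in  std_logic;  -- first discarded bit (half an ulp)
    round_bit : in  std_logic;  -- second discarded bit
    sticky    : in  std_logic;  -- OR of all remaining discarded bits
    inc       : out std_logic   -- 1 = add one ulp to the magnitude
  );
end entity round_sel_sketch;

architecture rtl of round_sel_sketch is
  signal inexact : std_logic;
begin
  -- Any discarded bit set means the exact result is not representable.
  inexact <= guard or round_bit or sticky;

  with rmode select inc <=
    guard and (round_bit or sticky or lsb) when "00",   -- nearest, ties to even
    '0'                                    when "01",   -- towards zero: truncate
    (not sign) and inexact                 when "10",   -- towards +inf
    sign and inexact                       when "11",   -- towards -inf
    'X'                                    when others;
end architecture rtl;
```

Computing all four candidates from the same guard, round and sticky bits is what makes it cheap to evaluate every mode in parallel and select one at the end.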

Entity

Signal Descriptions

Inputs:

clk          clock input
opa, opb     operands A and B
rmode        rounding mode (00 - round to nearest even, 01 - round to zero, 10 - round to +inf, 11 - round to -inf)
fpu_op       floating point operation (0 - Add, 1 - Subtract, 2 - Multiply, 3 - Divide, 4 - Int to float conversion, 5 - Float to int conversion)

Outputs:

fpout        result output
inf          Asserted when the output is the special value INF
ine          Asserted when the calculation is inexact, i.e. some accuracy has been lost during computation
overflow     Asserted when an overflow occurs, i.e. the number is too large to be represented
underflow    Asserted when an underflow occurs, i.e. the number is too small to be represented
div_by_zero  Asserted when fpu_op is set to divide and opb is zero
zero         Asserted when the output is a numeric zero
snan         Asserted when either operand is an SNAN
qnan         Asserted when the output is a QNAN
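
The signal list above can be written down as an entity declaration. The following is a sketch under assumptions, not the authors' fpu_arch.vhd: the entity name fpu comes from the elaborate command in the synthesis section, the 3-bit fpu_op and 2-bit rmode widths match the sub-blocks listed later, and the 32-bit width of opa, opb and fpout is assumed from the single precision format.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch of the top-level interface reconstructed from the signal
-- descriptions. Port widths are assumptions (see the text above).
entity fpu is
  port (
    clk         : in  std_logic;
    rmode       : in  std_logic_vector(1 downto 0);   -- rounding mode
    fpu_op      : in  std_logic_vector(2 downto 0);   -- operation select
    opa, opb    : in  std_logic_vector(31 downto 0);  -- operands A and B
    fpout       : out std_logic_vector(31 downto 0);  -- result
    inf         : out std_logic;  -- result is INF
    ine         : out std_logic;  -- result is inexact
    overflow    : out std_logic;
    underflow   : out std_logic;
    div_by_zero : out std_logic;
    zero        : out std_logic;  -- result is a numeric zero
    snan        : out std_logic;  -- an operand is a signaling NaN
    qnan        : out std_logic   -- result is a quiet NaN
  );
end entity fpu;
```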

Floating Point Multiplication Algorithm

The algorithm for floating point multiplication consists of the following steps.

1. Check for zeros, NaNs and inf on the inputs.
2. Add the exponents.
3. Multiply the mantissas.
4. Normalize the product and round using the specified rounding mode. Also generate exceptions.
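
Steps 2 to 4 can be sketched in a few lines for the simplest case. This is a simplified illustration, not the core's datapath: the entity fmul_sketch is hypothetical, it uses ieee.numeric_std rather than the Synopsys packages used in the listings, it assumes both operands are normalized numbers, it truncates instead of rounding, and it performs no special-value, overflow or underflow checks.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical combinational multiplier for normalized operands only.
entity fmul_sketch is
  port (
    opa, opb : in  std_logic_vector(31 downto 0);
    prod     : out std_logic_vector(31 downto 0)
  );
end entity fmul_sketch;

architecture rtl of fmul_sketch is
begin
  process (opa, opb)
    variable ma, mb  : unsigned(23 downto 0);
    variable mant    : unsigned(47 downto 0);
    variable exp_sum : unsigned(9 downto 0);
    variable frac    : unsigned(22 downto 0);
  begin
    -- Step 2: add the biased exponents and remove one bias (127).
    exp_sum := resize(unsigned(opa(30 downto 23)), 10)
             + resize(unsigned(opb(30 downto 23)), 10) - 127;

    -- Step 3: multiply the 24-bit significands (hidden bit restored).
    ma   := '1' & unsigned(opa(22 downto 0));
    mb   := '1' & unsigned(opb(22 downto 0));
    mant := ma * mb;

    -- Step 4 (normalize only): the product lies in [1, 4), so at most
    -- one right shift is needed. Discarded bits are simply truncated.
    if mant(47) = '1' then
      frac    := mant(46 downto 24);
      exp_sum := exp_sum + 1;
    else
      frac := mant(45 downto 23);
    end if;

    -- The sign is the XOR of the operand signs.
    prod <= (opa(31) xor opb(31))
          & std_logic_vector(exp_sum(7 downto 0))
          & std_logic_vector(frac);
  end process;
end architecture rtl;
```

For example, opa = X"3FC00000" (1.5) and opb = X"40000000" (2.0) give an exponent of 127 + 128 - 127 = 128 and a significand product of X"600000000000", so prod is X"40400000" (3.0).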

Floating Point Addition Algorithm

The algorithm for floating point addition/subtraction consists of the following steps.

1. Check for special values on the inputs.
2. Align the mantissas, i.e. right shift the significand of the smaller operand by d bits.
3. Add or subtract the mantissas.
4. Normalize the result and round using the specified rounding mode. Also generate exceptions.
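
The alignment in step 2 must remember whether any 1 bits were shifted out, otherwise rounding would be wrong. A minimal sketch of that idea follows. The entity align_sketch and its port names are hypothetical, it uses ieee.numeric_std, and it computes the sticky bit with a loop, whereas the pre_norm listing below uses an explicit CASE statement for the same purpose.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical alignment shifter: shifts the smaller operand right by
-- d bits and ORs every shifted-out bit into the LSB (sticky bit).
entity align_sketch is
  port (
    frac_in  : in  std_logic_vector(26 downto 0);  -- hidden bit & fraction & "000"
    shamt    : in  std_logic_vector(4 downto 0);   -- exponent difference d
    frac_out : out std_logic_vector(26 downto 0)
  );
end entity align_sketch;

architecture rtl of align_sketch is
begin
  process (frac_in, shamt)
    variable d       : natural range 0 to 31;
    variable shifted : unsigned(26 downto 0);
    variable sticky  : std_logic;
  begin
    d := to_integer(unsigned(shamt));
    -- A 27-bit value is fully shifted out after 27 positions.
    if d > 27 then
      d := 27;
    end if;

    shifted := shift_right(unsigned(frac_in), d);

    -- Sticky bit: OR of the d low-order bits that fell off the end.
    sticky := '0';
    for i in 0 to 26 loop
      if i < d and frac_in(i) = '1' then
        sticky := '1';
      end if;
    end loop;

    shifted(0) := shifted(0) or sticky;
    frac_out   <= std_logic_vector(shifted);
  end process;
end architecture rtl;
```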

Microarchitecture

The FPU core consists of the following units.

- Pre Normalize Block for Add/Subtract - Calculates the difference between the smaller and larger exponent, adjusts the smaller fraction by right shifting it, and determines whether the operation is an add or a subtract after resolving the sign bits. Checks for NaNs on the inputs.
- Pre Normalize Block for Mul/Div - Computes the sum/difference of the exponents and checks for exponent overflow, the underflow condition and an INF value on an input.
- Add/Sub - 24 bit integer adder/subtractor.
- Multiply - 2 cycle 24-bit boolean integer multiplier.
- Divide - 2 cycle integer divider and remainder computation unit.
- Post Normalize and Round Unit - Normalizes fraction and exponent. It also does all the roundings in parallel and then picks the output corresponding to the chosen rounding mode.
- Exceptions Unit - Logic to stage and generate exception signals.

The block diagram is reproduced from Reference 1 and is given below.

Datapath and Pipeline

Code

pre_norm_arch.vhd
pre_norm_fmul_arch.vhd
add_sub27_arch.vhd
mul_r2_arch.vhd
div_r2_arch.vhd
post_norm_arch.vhd
except_arch.vhd
fpu_arch.vhd

Simulation

We used two strategies for testing. The first and simpler one was to write a testbench that exercised various features of the design. The floating point calculations are written to a file in an easy-to-read format with the hidden values recovered, which allowed us to verify the calculations by hand. A sample output file, the testbench and the related package file were linked from the original page.

The second strategy is the one used by the Opencores design, and its testing is much more extensive. A software implementation of the IEEE standard, Softfloat, is used as the reference design. This library is used to generate millions of test vectors that are stored in a file. A test can be a random test that selects the operation and rounding mode at random, or a focused test that exercises one FPU operation and rounding mode. The packed vectors are read from the test file and compared with the output produced by the FPU core. The model was simulated using Modelsim. All vectors that passed on the Verilog model also passed on the VHDL model.

Synthesis

The use of operators made our code compact and simple. We found that most of the operators, except the division (/) and remainder (rem) operators, were synthesizable. We synthesized our design using an ASIC library at Intel Corporation with Synopsys Design Compiler and DesignWare. Note that the integer divider and remainder code was disabled during synthesis.

The first step is to analyze and elaborate the design. This ensures that there are no errors in the VHDL and prepares an unoptimized netlist.

   dc_shell> define_design_lib work -path ../mra
   dc_shell> analyze -library work -format vhdl ../hdl/add_sub27_arch.vhd
   dc_shell> analyze -library work -format vhdl ../hdl/div_r2_arch.vhd
   dc_shell> analyze -library work -format vhdl ../hdl/mul_r2_arch.vhd
   dc_shell> analyze -library work -format vhdl ../hdl/except_arch.vhd
   dc_shell> analyze -library work -format vhdl ../hdl/pre_norm_fmul_arch.vhd
   dc_shell> analyze -library work -format vhdl ../hdl/pre_norm_arch.vhd
   dc_shell> analyze -library work -format vhdl ../hdl/post_norm_arch.vhd
   dc_shell> analyze -library work -format vhdl ../hdl/fpu_arch.vhd
   dc_shell> elaborate -library work fpu
   dc_shell> write -format db -hier -output fpu_pre.db

Next we set the wire load model and operating conditions and define the clock. Finally the model is compiled to the target library. The synthesis script was linked from the original page.

Some of the questions we answered were:

1. Up to what frequency can the FPU operate? The FPU can operate at a frequency of 100 MHz.
2. What is the number of library cells used? A total of 6871 cells were used.
3. What are the critical paths? The slowest timing paths are part of the post normalization unit, especially the generation of the underflow and overflow signals.

Some of the reports about the synthesis are:

1. fpu_check_design.rpt..>    08-Jul-2001 18:59   11k
2. fpu_compile_scr.txt        08-Jul-2001 18:59    2k
3. fpu_constraints_viol..>    08-Jul-2001 18:59   96k
4. fpu_loop.rpt.txt           08-Jul-2001 18:59    1k
5. fpu_timing.rpt.txt         08-Jul-2001 18:59  179k

Conclusions

We learned a lot in this project. We learned VHDL and Verilog coding and syntax, floating point unit microarchitecture, floating point addition, multiplication and division algorithms, the IEEE standard for binary floating-point arithmetic, design issues including pipelining, verification strategies, and synthesis.

References

1. Rudolf Usselman, Documentation for Floating Point Unit, http://www.opencores.org.
2. John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative Approach, 2nd Edition, Morgan Kaufmann, Appendix A.
3. Peter J. Ashenden, The Designer's Guide to VHDL, Morgan Kaufmann.
4. Donald E. Thomas and Philip R. Moorby, The Verilog Hardware Description Language, Kluwer Academic Publishers.
5. Stuart Oberman, Design Issues in High Performance Floating-Point Arithmetic Units, Stanford University Technical Report.
6. IEEE, IEEE-754-1985 Standard for Binary Floating-Point Arithmetic.

LIBRARY ieee ;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_misc.ALL;
USE ieee.std_logic_unsigned.ALL;
USE ieee.std_logic_arith.ALL;
--USE ieee.numeric_std.ALL;
--USE ieee.numeric_bit.ALL;

ENTITY pre_norm IS
   PORT(
      add              : IN     std_logic ;
      clk              : IN     std_logic ;
      opa              : IN     std_logic_vector (31 downto 0) ;
      opa_nan          : IN     std_logic ;
      opb              : IN     std_logic_vector (31 downto 0) ;
      opb_nan          : IN     std_logic ;
      rmode            : IN     std_logic_vector (1 downto 0) ;
      exp_dn_out       : OUT    std_logic_vector (7 downto 0) ;
      fasu_op          : OUT    std_logic ;
      fracta_out       : OUT    std_logic_vector (26 downto 0) ;
      fractb_out       : OUT    std_logic_vector (26 downto 0) ;
      nan_sign         : OUT    std_logic ;
      result_zero_sign : OUT    std_logic ;
      sign             : OUT    std_logic
   );
END pre_norm ;

ARCHITECTURE arch OF pre_norm IS
   signal signa, signb : std_logic ;
   signal signd_sel : std_logic_vector(2 DOWNTO 0) ;
   signal add_d_sel : std_logic_vector(2 DOWNTO 0) ;
   signal expa, expb : std_logic_vector (7 downto 0);
   signal fracta, fractb : std_logic_vector (22 downto 0);
   signal expa_lt_expb : std_logic ;
   signal fractb_lt_fracta : std_logic ;
   signal exp_small, exp_large : std_logic_vector (7 downto 0);
   signal exp_diff : std_logic_vector (7 downto 0);
   signal adj_op : std_logic_vector (22 downto 0);
   signal adj_op_tmp : std_logic_vector (26 downto 0);
   signal adj_op_out : std_logic_vector (26 downto 0);
   signal fracta_n, fractb_n : std_logic_vector (26 downto 0);
   signal fracta_s, fractb_s : std_logic_vector (26 downto 0);
   signal sign_d : std_logic ;
   signal add_d : std_logic ;
   signal expa_dn, expb_dn : std_logic ;
   signal sticky : std_logic ;
   signal add_r, signa_r, signb_r : std_logic ;
   signal exp_diff_sft : std_logic_vector (4 downto 0);
   signal exp_lt_27 : std_logic ;
   signal op_dn : std_logic ;
   signal adj_op_out_sft : std_logic_vector (26 downto 0);
   signal fracta_lt_fractb, fracta_eq_fractb : std_logic ;
   signal nan_sign1 : std_logic ;
   signal exp_diff1, exp_diff1a, exp_diff2 : std_logic_vector (7 downto 0);
BEGIN
   signa <= opa(31);
   signb <= opb(31);
   expa <= opa(30 downto 23);
   expb <= opb(30 downto 23);
   fracta <= opa(22 downto 0);
   fractb <= opb(22 downto 0);

   -- NOTE: despite its name, expa_lt_expb is '1' when expa > expb
   expa_lt_expb <= '1' WHEN (expa > expb) ELSE '0';
   expa_dn <= NOT or_reduce(expa);
   expb_dn <= NOT or_reduce(expb);

   -- Calculate the difference between the smaller and larger exponent
   exp_small <= expb WHEN (expa_lt_expb = '1') ELSE expa;
   exp_large <= expa WHEN (expa_lt_expb = '1') ELSE expb;
   exp_diff1 <= exp_large - exp_small;
   exp_diff1a <= exp_diff1 - '1';
   -- if one of the exponents is zero then exp_diff1a else exp_diff1
   exp_diff2 <= exp_diff1a WHEN ((expa_dn OR expb_dn) = '1') ELSE exp_diff1;
   -- exp_diff is 0 if both exponents are zero
   exp_diff <= X"00" WHEN ((expa_dn AND expb_dn) = '1') ELSE exp_diff2;

   PROCESS (clk)
   BEGIN
      IF clk'event AND clk = '1' THEN
         IF ((add_d = '0') AND (expa = expb) AND (fracta = fractb)) THEN
            exp_dn_out <= X"00";
         ELSE
            exp_dn_out <= exp_large;
         END IF;
      END IF;
   END PROCESS;

   -- Adjust the smaller fraction
   op_dn <= expb_dn WHEN (expa_lt_expb = '1') ELSE expa_dn;
   adj_op <= fractb WHEN (expa_lt_expb = '1') ELSE fracta;
   adj_op_tmp <= (NOT op_dn) & adj_op & "000";

   -- adj_op_out is 27 bits wide, so can only be shifted
   -- 27 bits to the right (8'd27)
   -- NOTE: despite its name, exp_lt_27 is '1' when exp_diff > 27
   exp_lt_27 <= '1' WHEN (exp_diff > "00011011") ELSE '0';
   exp_diff_sft <= "11011" WHEN (exp_lt_27 = '1') ELSE exp_diff(4 downto 0);

   -- adj_op_tmp_bitvec <= STD_LOGIC_VECTORtoBIT_VECTOR(adj_op_tmp);
   -- adj_op_out_sft <= To_StdLogicVector(adj_op_tmp_bitvec SRL conv_integer(exp_diff_sft));
   -- (conv_integer(exp_diff_sft));
   adj_op_out_sft <= shr(adj_op_tmp,exp_diff_sft);
   adj_op_out <= adj_op_out_sft(26 DOWNTO 1) & (adj_op_out_sft(0) OR sticky);

   -- Get truncated portion (sticky bit)
   PROCESS (exp_diff_sft,adj_op_tmp)
   BEGIN
      CASE exp_diff_sft IS
         WHEN "00000" => sticky <= '0';
         WHEN "00001" => sticky <= adj_op_tmp(0);
         WHEN "00010" => sticky <= or_reduce(adj_op_tmp(1 downto 0));
         WHEN "00011" => sticky <= or_reduce(adj_op_tmp(2 downto 0));
         WHEN "00100" => sticky <= or_reduce(adj_op_tmp(3 downto 0));
         WHEN "00101" => sticky <= or_reduce(adj_op_tmp(4 downto 0));
         WHEN "00110" => sticky <= or_reduce(adj_op_tmp(5 downto 0));
         WHEN "00111" => sticky <= or_reduce(adj_op_tmp(6 downto 0));
         WHEN "01000" => sticky <= or_reduce(adj_op_tmp(7 downto 0));
         WHEN "01001" => sticky <= or_reduce(adj_op_tmp(8 downto 0));
         WHEN "01010" => sticky <= or_reduce(adj_op_tmp(9 downto 0));
         WHEN "01011" => sticky <= or_reduce(adj_op_tmp(10 downto 0));
         WHEN "01100" => sticky <= or_reduce(adj_op_tmp(11 downto 0));
         WHEN "01101" => sticky <= or_reduce(adj_op_tmp(12 downto 0));
         WHEN "01110" => sticky <= or_reduce(adj_op_tmp(13 downto 0));
         WHEN "01111" => sticky <= or_reduce(adj_op_tmp(14 downto 0));
         WHEN "10000" => sticky <= or_reduce(adj_op_tmp(15 downto 0));
         WHEN "10001" => sticky <= or_reduce(adj_op_tmp(16 downto 0));
         WHEN "10010" => sticky <= or_reduce(adj_op_tmp(17 downto 0));
         WHEN "10011" => sticky <= or_reduce(adj_op_tmp(18 downto 0));
         WHEN "10100" => sticky <= or_reduce(adj_op_tmp(19 downto 0));
         WHEN "10101" => sticky <= or_reduce(adj_op_tmp(20 downto 0));
         WHEN "10110" => sticky <= or_reduce(adj_op_tmp(21 downto 0));
         WHEN "10111" => sticky <= or_reduce(adj_op_tmp(22 downto 0));
         WHEN "11000" => sticky <= or_reduce(adj_op_tmp(23 downto 0));
         WHEN "11001" => sticky <= or_reduce(adj_op_tmp(24 downto 0));
         WHEN "11010" => sticky <= or_reduce(adj_op_tmp(25 downto 0));
         WHEN "11011" => sticky <= or_reduce(adj_op_tmp(26 downto 0));
         WHEN OTHERS => sticky <= '0';
      END CASE;
   END PROCESS;

   -- Select operands for add/sub (recover hidden bit)
   fracta_n <= ((NOT expa_dn) & fracta & "000") WHEN (expa_lt_expb = '1') else adj_op_out;
   fractb_n <= adj_op_out WHEN (expa_lt_expb = '1') else ((NOT expb_dn) & fractb & "000");

   -- Sort operands (for sub only)
   -- NOTE: despite its name, fractb_lt_fracta is '1' when fractb_n > fracta_n
   fractb_lt_fracta <= '1' WHEN (fractb_n > fracta_n) ELSE '0';
   fracta_s <= fractb_n WHEN (fractb_lt_fracta = '1') ELSE fracta_n;
   fractb_s <= fracta_n WHEN (fractb_lt_fracta = '1') ELSE fractb_n;

   PROCESS (clk)
   BEGIN
      IF clk'event AND clk = '1' THEN
         fracta_out <= fracta_s;
         fractb_out <= fractb_s;
      END IF;
   END PROCESS;

   -- Determine sign for the output
   -- sign: 0=Positive Number; 1=Negative Number
   signd_sel <= signa & signb & add;

   PROCESS (signd_sel, fractb_lt_fracta)
   BEGIN
      CASE signd_sel IS
         -- Add
         WHEN "001" => sign_d <= '0';
         WHEN "011" => sign_d <= fractb_lt_fracta;
         WHEN "101" => sign_d <= NOT fractb_lt_fracta;
         WHEN "111" => sign_d <= '1';
         -- Sub
         WHEN "000" => sign_d <= fractb_lt_fracta;
         WHEN "010" => sign_d <= '0';
         WHEN "100" => sign_d <= '1';
         WHEN "110" => sign_d <= NOT fractb_lt_fracta;
         WHEN OTHERS => sign_d <= 'X';
      END CASE;
   END PROCESS;

   PROCESS (clk)
   BEGIN
      IF clk'event AND clk = '1' THEN
         sign <= sign_d;
         -- Fix sign for ZERO result
         signa_r <= signa;
         signb_r <= signb;
         add_r <= add;
         result_zero_sign <= ( add_r AND signa_r AND signb_r) OR
                             (NOT add_r AND signa_r AND NOT signb_r) OR
                             ( add_r AND (signa_r OR signb_r) AND (rmode(1) AND rmode(0))) OR
                             (NOT add_r AND NOT (signa_r xor signb_r) AND (rmode(1) AND rmode(0)));
      END IF;
   END PROCESS;

   -- Fix sign for NAN result
   PROCESS (clk)
   BEGIN
      IF clk'event AND clk = '1' THEN
         IF (fracta < fractb) THEN
            fracta_lt_fractb <= '1';
         ELSE
            fracta_lt_fractb <= '0';
         END IF;
         IF fracta = fractb THEN
            fracta_eq_fractb <= '1';
         ELSE
            fracta_eq_fractb <= '0';
         END IF;
         IF ((opa_nan AND opb_nan) = '1') THEN
            nan_sign <= nan_sign1;
         ELSIF (opb_nan = '1') THEN
            nan_sign <= signb_r;
         ELSE
            nan_sign <= signa_r;
         END IF;
      END IF;
   END PROCESS;

   nan_sign1 <= (signa_r AND signb_r) WHEN (fracta_eq_fractb = '1') ELSE
                signb_r WHEN (fracta_lt_fractb = '1') ELSE
                signa_r;

   add_d_sel <= signa & signb & add;

   -- Decode Add/Sub operation
   -- add: 1=Add; 0=Subtract
   PROCESS (add_d_sel)
   BEGIN
      CASE add_d_sel IS
         -- Add
         WHEN "001" => add_d <= '1';
         WHEN "011" => add_d <= '0';
         WHEN "101" => add_d <= '0';
         WHEN "111" => add_d <= '1';
         -- Sub
         WHEN "000" => add_d <= '0';
         WHEN "010" => add_d <= '1';
         WHEN "100" => add_d <= '1';
         WHEN "110" => add_d <= '0';
         WHEN OTHERS => add_d <= 'X';
      END CASE;
   END PROCESS;

   PROCESS (clk)
   BEGIN
      IF clk'event AND clk = '1' THEN
         fasu_op <= add_d;
      END IF;
   END PROCESS;

END arch;

LIBRARY ieee ;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_misc.ALL;
USE ieee.std_logic_unsigned.ALL;

ENTITY pre_norm_fmul IS
   PORT(
      clk       : IN     std_logic ;
      fpu_op    : IN     std_logic_vector (2 downto 0) ;
      opa       : IN     std_logic_vector (31 downto 0) ;
      opb       : IN     std_logic_vector (31 downto 0) ;
      exp_out   : OUT    std_logic_vector (7 downto 0) ;
      exp_ovf   : OUT    std_logic_vector (1 downto 0) ;
      fracta    : OUT    std_logic_vector (23 downto 0) ;
      fractb    : OUT    std_logic_vector (23 downto 0) ;
      inf       : OUT    std_logic ;
      sign      : OUT    std_logic ;
      sign_exe  : OUT    std_logic ;
      underflow : OUT    std_logic_vector (2 downto 0)
   );
END pre_norm_fmul ;

ARCHITECTURE arch OF pre_norm_fmul IS
   signal signa, signb : std_logic ;
   signal sign_d : std_logic ;
   signal exp_ovf_d : std_logic_vector (1 downto 0);
   signal expa, expb : std_logic_vector (7 downto 0);
   signal expa_int, expb_int : std_logic_vector (8 downto 0);
   signal exp_tmp1, exp_tmp2 : std_logic_vector (7 downto 0);
   signal exp_tmp1_int, exp_tmp2_int : std_logic_vector (8 downto 0);
   signal co1, co2 : std_logic ;
   signal expa_dn, expb_dn : std_logic ;
   signal exp_out_a : std_logic_vector (7 downto 0);
   signal opa_00, opb_00, fracta_00, fractb_00 : std_logic ;
   signal exp_tmp3, exp_tmp4, exp_tmp5 : std_logic_vector (7 downto 0);
   signal underflow_d : std_logic_vector (2 downto 0);
   signal op_div : std_logic ;
   signal exp_out_mul, exp_out_div : std_logic_vector (7 downto 0);
   signal exp_out_div_p1, exp_out_div_p2 : std_logic_vector (7 downto 0);
   SIGNAL signacatsignb : std_logic_vector(1 DOWNTO 0);
BEGIN
   -- Aliases
   signa <= opa(31);
   signb <= opb(31);
   expa <= opa(30 downto 23);
   expb <= opb(30 downto 23);

   -- Calculate Exponent
   expa_dn <= NOT (or_reduce(expa));
   expb_dn <= NOT (or_reduce(expb));
   opa_00 <= NOT (or_reduce(opa(30 downto 0)));
   opb_00 <= NOT (or_reduce(opb(30 downto 0)));
   fracta_00 <= NOT (or_reduce(opa(22 downto 0)));
   fractb_00 <= NOT (or_reduce(opb(22 downto 0)));

   -- Recover hidden bit
   fracta <= (NOT expa_dn) & opa(22 downto 0);
   -- Recover hidden bit
   fractb <= (NOT expb_dn) & opb(22 downto 0);

   op_div <= '1' WHEN (fpu_op = "011") ELSE '0';
   expa_int <= '0' & expa;
   expb_int <= '0' & expb;
   exp_tmp1_int <= (expa_int - expb_int) WHEN (op_div = '1') ELSE (expa_int + expb_int);
   exp_tmp1 <= exp_tmp1_int(7 DOWNTO 0);
   co1 <= exp_tmp1_int(8);
   exp_tmp2_int <= ((co1 & exp_tmp1) + X"7F") WHEN (op_div = '1') ELSE ((co1 & exp_tmp1) - X"7F");
   exp_tmp2 <= exp_tmp2_int(7 DOWNTO 0);
   co2 <= exp_tmp2_int(8);
   exp_tmp3 <= exp_tmp2 + '1';
   exp_tmp4 <= X"7F" - exp_tmp1;
   exp_tmp5 <= (exp_tmp4+'1') WHEN (op_div = '1') ELSE (exp_tmp4-'1');

   PROCESS (clk)
   BEGIN
      IF clk'event AND clk = '1' THEN
         IF op_div = '1' THEN
            exp_out <= exp_out_div;
         ELSE
            exp_out <= exp_out_mul;
         END IF;
      END IF;
   END PROCESS;

   exp_out_div_p1 <= exp_tmp5 WHEN (co2 = '1') ELSE exp_tmp3;
   exp_out_div_p2 <= exp_tmp4 WHEN (co2 = '1') ELSE exp_tmp2;
   exp_out_div <= exp_out_div_p1 WHEN ((expa_dn OR expb_dn) = '1') ELSE exp_out_div_p2;
   exp_out_mul <= exp_out_a WHEN (exp_ovf_d(1) = '1') ELSE
                  exp_tmp3 WHEN ((expa_dn OR expb_dn) = '1') ELSE
                  exp_tmp2;
   exp_out_a <= exp_tmp5 WHEN ((expa_dn OR expb_dn) = '1') ELSE exp_tmp4;
   exp_ovf_d(0) <= (expa(7) AND NOT expb(7)) WHEN (op_div = '1') ELSE
                   (co2 AND expa(7) AND expb(7));
   exp_ovf_d(1) <= co2 WHEN (op_div = '1') ELSE
                   ((NOT expa(7) AND NOT expb(7) AND exp_tmp2(7)) OR co2);

   PROCESS (clk)
   BEGIN
      IF clk'event AND clk = '1' THEN
         exp_ovf <= exp_ovf_d;
      END IF;
   END PROCESS;

   underflow_d(0) <= '1' WHEN ((exp_tmp1 < X"7f") AND (co1='0') AND
                               ((opa_00 OR opb_00 OR expa_dn OR expb_dn) = '0')) ELSE '0';
   underflow_d(1) <= '1' WHEN ((((expa(7) OR expb(7)) = '1') AND (opa_00 = '0') AND (opb_00 = '0')) OR
                               ((expa_dn AND NOT fracta_00) = '1') OR
                               ((expb_dn AND NOT fractb_00) = '1')) ELSE '0';
   underflow_d(2) <= '1' WHEN (((NOT opa_00 AND NOT opb_00) = '1') AND (exp_tmp1 = X"7F")) ELSE '0';

   PROCESS (clk)
   BEGIN
      IF clk'event AND clk = '1' THEN
         underflow <= underflow_d;
         IF op_div = '1' THEN
            inf <= expb_dn AND NOT expa(7);
         ELSE
            IF ((co1 & exp_tmp1) > "101111110") THEN   -- X"17e"
               inf <= '1';
            ELSE
               inf <= '0';
            END IF;
         END IF;
      END IF;
   END PROCESS;

   signacatsignb <= signa & signb;

   -- Determine sign for the output
   PROCESS (signacatsignb)
   BEGIN
      CASE signacatsignb IS
         WHEN "00" => sign_d <= '0';
         WHEN "01" => sign_d <= '1';
         WHEN "10" => sign_d <= '1';
         WHEN "11" => sign_d <= '0';
         WHEN OTHERS => sign_d <= 'X';
      END CASE;
   END PROCESS;

   PROCESS (clk)
   BEGIN
      IF clk'event AND clk = '1' THEN
         sign <= sign_d;
         sign_exe <= signa AND signb;
      END IF;
   END PROCESS;
END arch;

LIBRARY ieee ;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_unsigned.ALL;

ENTITY add_sub27 IS
   PORT(
      add : IN     std_logic ;
      opa : IN     std_logic_vector (26 downto 0) ;
      opb : IN     std_logic_vector (26 downto 0) ;
      co  : OUT    std_logic ;
      sum : OUT    std_logic_vector (26 downto 0)
   );
END add_sub27 ;

ARCHITECTURE arch OF add_sub27 IS
   signal opa_int : std_logic_vector (27 downto 0) ;
   signal opb_int : std_logic_vector (27 downto 0) ;
   signal sum_int : std_logic_vector (27 downto 0) ;
BEGIN
   opa_int <= '0' & opa;
   opb_int <= '0' & opb;
   sum_int <= opa_int + opb_int WHEN (add = '1') else opa_int - opb_int;
   sum <= sum_int(26 downto 0);
   co <= sum_int(27);
END arch;
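
In the spirit of the first testing strategy described under Simulation, a block like add_sub27 can be exercised with a small self-checking testbench. The following is a sketch and not the authors' testbench: the name add_sub27_tb, the stimulus values and the 10 ns settling delay are all assumptions, and it expects add_sub27 to be compiled into library work.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity add_sub27_tb is
end entity add_sub27_tb;

architecture sim of add_sub27_tb is
  signal add           : std_logic := '1';
  signal opa, opb, sum : std_logic_vector(26 downto 0) := (others => '0');
  signal co            : std_logic;
begin
  dut : entity work.add_sub27
    port map (add => add, opa => opa, opb => opb, co => co, sum => sum);

  stimulus : process
  begin
    -- 5 + 3 = 8, no carry out
    add <= '1';
    opa <= std_logic_vector(to_unsigned(5, 27));
    opb <= std_logic_vector(to_unsigned(3, 27));
    wait for 10 ns;
    assert unsigned(sum) = 8 and co = '0'
      report "5 + 3 failed" severity error;

    -- 5 - 3 = 2, no borrow
    add <= '0';
    wait for 10 ns;
    assert unsigned(sum) = 2 and co = '0'
      report "5 - 3 failed" severity error;

    -- 3 - 5 wraps around, so bit 27 (co) signals the borrow
    opa <= std_logic_vector(to_unsigned(3, 27));
    opb <= std_logic_vector(to_unsigned(5, 27));
    wait for 10 ns;
    assert co = '1'
      report "borrow not flagged" severity error;

    -- (2**27 - 1) + 1 overflows 27 bits: sum is zero with carry out
    add <= '1';
    opa <= (others => '1');
    opb <= std_logic_vector(to_unsigned(1, 27));
    wait for 10 ns;
    assert unsigned(sum) = 0 and co = '1'
      report "carry out failed" severity error;

    report "add_sub27_tb finished";
    wait;
  end process;
end architecture sim;
```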

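LIBRARY ieee ;
USE ieee.std_logic_1164.ALL;
-- The significands are unsigned magnitudes whose hidden bit is the MSB.
-- std_logic_signed would treat them as negative two's-complement values,
-- so the unsigned package is used for the "*" operator.
USE ieee.std_logic_unsigned.ALL;

ENTITY mul_r2 IS
   PORT(
      clk  : IN     std_logic ;
      opa  : IN     std_logic_vector (23 downto 0) ;
      opb  : IN     std_logic_vector (23 downto 0) ;
      prod : OUT    std_logic_vector (47 downto 0)
   );
END mul_r2 ;

ARCHITECTURE arch OF mul_r2 IS
   SIGNAL prod1 : std_logic_vector(47 DOWNTO 0);
BEGIN
   -- Two register stages: prod lags opa and opb by two clock cycles
   PROCESS (clk)
   BEGIN
      IF clk'event AND clk = '1' THEN
         prod1 <= opa * opb;
         prod <= prod1;
      END IF;
   END PROCESS;
END arch;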

LIBRARY ieee ;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_unsigned.ALL;
USE ieee.std_logic_arith.ALL;

ENTITY div_r2 IS
   PORT(
      clk       : IN     std_logic ;
      opa       : IN     std_logic_vector (49 downto 0) ;
      opb       : IN     std_logic_vector (23 downto 0) ;
      quo       : OUT    std_logic_vector (49 downto 0) ;
      remainder : OUT    std_logic_vector (49 downto 0)
   );
END div_r2 ;

ARCHITECTURE arch OF div_r2 IS
   SIGNAL quo1, rem1 : std_logic_vector (49 downto 0);
BEGIN
   PROCESS (clk)
      VARIABLE opa_int, opb_int, quo1_int, rem1_int : integer;
   BEGIN
      --opa_int := conv_integer(opa);
      --opb_int := conv_integer(opb);
      IF clk'event AND clk = '1' THEN
         --quo1_int := opa_int/opb_int;
         --rem1_int := opa_int REM opb_int;
         --quo1 <= conv_std_logic_vector(quo1_int, 50);
         --rem1 <= conv_std_logic_vector(rem1_int, 50);
         --quo <= quo1;
         --remainder <= rem1;
         quo <= opa;
         remainder <= opa;
      END IF;
   END PROCESS;
END arch;

LIBRARY ieee ;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_arith.ALL;
USE ieee.std_logic_unsigned.ALL;
USE ieee.std_logic_misc.ALL;

ENTITY post_norm IS
   PORT(
      clk          : IN     std_logic ;
      div_opa_ldz  : IN     std_logic_vector (4 downto 0) ;
      exp_in       : IN     std_logic_vector (7 downto 0) ;
      exp_ovf      : IN     std_logic_vector (1 downto 0) ;
      fpu_op       : IN     std_logic_vector (2 downto 0) ;
      fract_in     : IN     std_logic_vector (47 downto 0) ;
      opa_dn       : IN     std_logic ;
      opas         : IN     std_logic ;
      opb_dn       : IN     std_logic ;
      output_zero  : IN     std_logic ;
      rem_00       : IN     std_logic ;
      rmode        : IN     std_logic_vector (1 downto 0) ;
      sign         : IN     std_logic ;
      f2i_out_sign : OUT    std_logic ;
      fpout        : OUT    std_logic_vector (30 downto 0) ;
      ine          : OUT    std_logic ;
      overflow     : OUT    std_logic ;
      underflow    : OUT    std_logic
   );
END post_norm ;

ARCHITECTURE arch OF post_norm IS
   signal f2i_out_sign_p1, f2i_out_sign_p2: std_logic;
   signal fract_out : std_logic_vector (22 downto 0);
   signal exp_out : std_logic_vector (7 downto 0);
   signal exp_out1_co : std_logic ;
   signal fract_out_final : std_logic_vector (22 downto 0);
   signal fract_out_rnd : std_logic_vector (22 downto 0);
   signal exp_next_mi : std_logic_vector (8 downto 0);
   signal dn : std_logic ;
   signal exp_rnd_adj : std_logic ;
   signal exp_out_final : std_logic_vector (7 downto 0);
   signal exp_out_rnd : std_logic_vector (7 downto 0);
   signal op_dn : std_logic ;
   signal op_mul : std_logic ;
   signal op_div : std_logic ;
   signal op_i2f : std_logic ;
   signal op_f2i : std_logic ;
   signal fi_ldz : std_logic_vector (5 downto 0);
   signal g, r, s : std_logic ;
   signal round, round2, round2a, round2_fasu, round2_fmul : std_logic ;
   signal exp_out_rnd0, exp_out_rnd1, exp_out_rnd2, exp_out_rnd2a : std_logic_vector (7 downto 0);
   signal fract_out_rnd0, fract_out_rnd1, fract_out_rnd2, fract_out_rnd2a : std_logic_vector (22 downto 0);
   signal exp_rnd_adj0, exp_rnd_adj2a : std_logic ;
   signal r_sign : std_logic ;
   signal ovf0, ovf1 : std_logic ;
   signal fract_out_pl1 : std_logic_vector (23 downto 0);
   signal exp_out_pl1, exp_out_mi1 : std_logic_vector (7 downto 0);
   signal exp_out_00, exp_out_fe, exp_out_ff, exp_in_00, exp_in_ff : std_logic ;
   signal exp_out_final_ff, fract_out_7fffff : std_logic ;
   signal fract_trunc : std_logic_vector (24 downto 0);
   signal exp_out1 : std_logic_vector (7 downto 0);
   signal grs_sel : std_logic ;
   signal fract_out_00, fract_in_00 : std_logic ;
   signal shft_co : std_logic ;
   signal exp_in_pl1, exp_in_mi1 : std_logic_vector (8 downto 0);
   signal fract_in_shftr : std_logic_vector (47 downto 0);
   signal fract_in_shftl : std_logic_vector (47 downto 0);
   signal exp_div : std_logic_vector (7 downto 0);
   signal shft2 : std_logic_vector (7 downto 0);
   signal exp_out1_mi1 : std_logic_vector (7 downto 0);
   signal div_dn : std_logic ;
   signal div_nr : std_logic ;
   signal grs_sel_div : std_logic ;
   signal div_inf : std_logic ;
   signal fi_ldz_2a : std_logic_vector (6 downto 0);
   signal fi_ldz_2 : std_logic_vector (7 downto 0);
   signal div_shft1, div_shft2, div_shft3, div_shft4 : std_logic_vector (7 downto 0);
   signal div_shft1_co : std_logic ;
   signal div_exp1 : std_logic_vector (8 downto 0);
   signal div_exp2, div_exp3 : std_logic_vector (7 downto 0);
   signal div_exp2_temp : std_logic_vector (8 downto 0);
   signal left_right, lr_mul, lr_div : std_logic ;
   signal shift_right, shftr_mul, shftr_div : std_logic_vector (7 downto 0);
   signal shift_left, shftl_mul, shftl_div : std_logic_vector (7 downto 0);
   signal fasu_shift_p1 : std_logic_vector (7 downto 0);
   signal fasu_shift : std_logic_vector (7 downto 0);
   signal exp_fix_div : std_logic_vector (7 downto 0);
   signal exp_fix_diva, exp_fix_divb : std_logic_vector (7 downto 0);
   signal fi_ldz_mi1 : std_logic_vector (5 downto 0);
   signal fi_ldz_mi22 : std_logic_vector (5 downto 0);
   signal exp_zero : std_logic ;
   signal ldz_all : std_logic_vector (6 downto 0);
   signal ldz_dif : std_logic_vector (7 downto 0);
   signal div_scht1a : std_logic_vector (8 downto 0);
   signal f2i_shft : std_logic_vector (7 downto 0);
   signal exp_f2i_1 : std_logic_vector (55 downto 0);
   signal f2i_zero, f2i_max : std_logic ;
   signal f2i_emin : std_logic_vector (7 downto 0);
   signal conv_shft : std_logic_vector (7 downto 0);
   signal exp_i2f, exp_f2i, conv_exp : std_logic_vector (7 downto 0);
   signal round2_f2i : std_logic ;
   signal round2_f2i_p1 : std_logic ;
   signal exp_in_80 : std_logic ;
   signal rmode_00, rmode_01, rmode_10, rmode_11 : std_logic ;
   signal max_num, inf_out : std_logic ;
   signal max_num_t1, max_num_t2, max_num_t3,max_num_t4,inf_out_t1 : std_logic ;
   signal underflow_fmul : std_logic ;
   signal overflow_fdiv : std_logic ;
   signal undeflow_div : std_logic ;
   signal f2i_ine : std_logic ;
   signal fracta_del, fractb_del : std_logic_vector (26 downto 0);
   signal grs_del : std_logic_vector (2 downto 0);
   signal dn_del : std_logic ;
   signal exp_in_del : std_logic_vector (7 downto 0);
   signal exp_out_del : std_logic_vector (7 downto 0);
   signal fract_out_del : std_logic_vector (22 downto 0);
   signal fract_in_del : std_logic_vector (47 downto 0);
   signal overflow_del : std_logic ;
   signal exp_ovf_del : std_logic_vector (1 downto 0);
   signal fract_out_x_del, fract_out_rnd2a_del : std_logic_vector (22 downto 0);
   signal trunc_xx_del : std_logic_vector (24 downto 0);
   signal exp_rnd_adj2a_del : std_logic ;
   signal fract_dn_del : std_logic_vector (22 downto 0);
   signal div_opa_ldz_del : std_logic_vector (4 downto 0);
   signal fracta_div_del : std_logic_vector (23 downto 0);
   signal fractb_div_del : std_logic_vector (23 downto 0);
   signal div_inf_del : std_logic ;
   signal fi_ldz_2_del : std_logic_vector (7 downto 0);
   signal inf_out_del, max_out_del : std_logic ;
   signal fi_ldz_del : std_logic_vector (5 downto 0);
   signal rx_del : std_logic ;
   signal ez_del : std_logic ;
   signal lr : std_logic ;
   signal exp_div_del : std_logic_vector (7 downto 0);
   signal z : std_logic;
   signal undeflow_div_p1 : std_logic ;
   signal undeflow_div_p2 : std_logic ;
   signal undeflow_div_p3 : std_logic ;
   signal undeflow_div_p4 : std_logic ;
   signal undeflow_div_p5 : std_logic ;
   signal undeflow_div_p6 : std_logic ;
   signal undeflow_div_p7 : std_logic ;
   signal undeflow_div_p8 : std_logic ;
   signal undeflow_div_p9 : std_logic ;
   signal undeflow_div_p10 : std_logic ;

   CONSTANT f2i_emax : std_logic_vector(7 DOWNTO 0) := X"9d";
BEGIN
   op_dn <= opa_dn or opb_dn ;
   op_mul <= '1' WHEN (fpu_op(2 DOWNTO 0)="010") ELSE '0';
   op_div <= '1' WHEN (fpu_op(2 DOWNTO 0)="011") ELSE '0';
   op_i2f <= '1' WHEN (fpu_op(2 DOWNTO 0)="100") ELSE '0';
   op_f2i <= '1' WHEN (fpu_op(2 DOWNTO 0)="101") ELSE '0';

   -----------------------------------------------------------------------
   -- Normalize and Round Logic
   -----------------------------------------------------------------------

   -- Count Leading zeros in fraction
   PROCESS (fract_in)
   BEGIN
      IF fract_in(47) = '1' THEN
         fi_ldz <= conv_std_logic_vector(1,6);
      ELSIF fract_in(47 DOWNTO 46) = "01" THEN
         fi_ldz <= conv_std_logic_vector(2,6);
      ELSIF fract_in(47 DOWNTO 45) = "001" THEN
         fi_ldz <= conv_std_logic_vector(3,6);
      ELSIF fract_in(47 DOWNTO 44) = "0001" THEN
         fi_ldz <= conv_std_logic_vector(4,6);
      ELSIF fract_in(47 DOWNTO 43) = "00001" THEN
         fi_ldz <= conv_std_logic_vector(5,6);
      ELSIF fract_in(47 DOWNTO 42) = "000001" THEN
         fi_ldz <= conv_std_logic_vector(6,6);

ELSIF fract_in(47 DOWNTO 41) = "0000001" THEN fi_ldz <= conv_std_logic_vector(7,6);
ELSIF fract_in(47 DOWNTO 40) = "00000001" THEN fi_ldz <= conv_std_logic_vector(8,6);
ELSIF fract_in(47 DOWNTO 39) = "000000001" THEN fi_ldz <= conv_std_logic_vector(9,6);
ELSIF fract_in(47 DOWNTO 38) = "0000000001" THEN fi_ldz <= conv_std_logic_vector(10,6);
ELSIF fract_in(47 DOWNTO 37) = "00000000001" THEN fi_ldz <= conv_std_logic_vector(11,6);
ELSIF fract_in(47 DOWNTO 36) = "000000000001" THEN fi_ldz <= conv_std_logic_vector(12,6);
ELSIF fract_in(47 DOWNTO 35) = "0000000000001" THEN fi_ldz <= conv_std_logic_vector(13,6);
ELSIF fract_in(47 DOWNTO 34) = "00000000000001" THEN fi_ldz <= conv_std_logic_vector(14,6);
ELSIF fract_in(47 DOWNTO 33) = "000000000000001" THEN fi_ldz <= conv_std_logic_vector(15,6);
ELSIF fract_in(47 DOWNTO 32) = "0000000000000001" THEN fi_ldz <= conv_std_logic_vector(16,6);
ELSIF fract_in(47 DOWNTO 31) = "00000000000000001" THEN fi_ldz <= conv_std_logic_vector(17,6);
ELSIF fract_in(47 DOWNTO 30) = "000000000000000001" THEN fi_ldz <= conv_std_logic_vector(18,6);
ELSIF fract_in(47 DOWNTO 29) = "0000000000000000001" THEN fi_ldz <= conv_std_logic_vector(19,6);
ELSIF fract_in(47 DOWNTO 28) = "00000000000000000001" THEN fi_ldz <= conv_std_logic_vector(20,6);
ELSIF fract_in(47 DOWNTO 27) = "000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(21,6);
ELSIF fract_in(47 DOWNTO 26) = "0000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(22,6);
ELSIF fract_in(47 DOWNTO 25) = "00000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(23,6);
ELSIF fract_in(47 DOWNTO 24) = "000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(24,6);
ELSIF fract_in(47 DOWNTO 23) = "0000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(25,6);
ELSIF fract_in(47 DOWNTO 22) = "00000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(26,6);
ELSIF fract_in(47 DOWNTO 21) = "000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(27,6);
ELSIF fract_in(47 DOWNTO 20) = "0000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(28,6);
ELSIF fract_in(47 DOWNTO 19) = "00000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(29,6);
ELSIF fract_in(47 DOWNTO 18) = "000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(30,6);
ELSIF fract_in(47 DOWNTO 17) = "0000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(31,6);
ELSIF fract_in(47 DOWNTO 16) = "00000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(32,6);
ELSIF fract_in(47 DOWNTO 15) = "000000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(33,6);
ELSIF fract_in(47 DOWNTO 14) = "0000000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(34,6);
ELSIF fract_in(47 DOWNTO 13) = "00000000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(35,6);

ELSIF fract_in(47 DOWNTO 12) = "000000000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(36,6);
ELSIF fract_in(47 DOWNTO 11) = "0000000000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(37,6);
ELSIF fract_in(47 DOWNTO 10) = "00000000000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(38,6);
ELSIF fract_in(47 DOWNTO 9) = "000000000000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(39,6);
ELSIF fract_in(47 DOWNTO 8) = "0000000000000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(40,6);
ELSIF fract_in(47 DOWNTO 7) = "00000000000000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(41,6);
ELSIF fract_in(47 DOWNTO 6) = "000000000000000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(42,6);
ELSIF fract_in(47 DOWNTO 5) = "0000000000000000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(43,6);
ELSIF fract_in(47 DOWNTO 4) = "00000000000000000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(44,6);
ELSIF fract_in(47 DOWNTO 3) = "000000000000000000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(45,6);
ELSIF fract_in(47 DOWNTO 2) = "0000000000000000000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(46,6);
ELSIF fract_in(47 DOWNTO 1) = "00000000000000000000000000000000000000000000001" THEN fi_ldz <= conv_std_logic_vector(47,6);
ELSIF fract_in(47 DOWNTO 1) = "00000000000000000000000000000000000000000000000" THEN fi_ldz <= conv_std_logic_vector(48,6);
ELSE fi_ldz <= (OTHERS => 'X');
END IF;
END PROCESS;

-- Normalize
exp_in_ff <= and_reduce(exp_in);
exp_in_00 <= NOT (or_reduce(exp_in));
exp_in_80 <= exp_in(7) AND NOT (or_reduce(exp_in(6 DOWNTO 0)));
exp_out_ff <= and_reduce(exp_out);
exp_out_00 <= NOT (or_reduce(exp_out));
exp_out_fe <= (and_reduce(exp_out(7 DOWNTO 1))) AND NOT exp_out(0);
exp_out_final_ff <= and_reduce(exp_out_final);
fract_out_7fffff <= and_reduce(fract_out);
fract_out_00 <= NOT (or_reduce(fract_out));
fract_in_00 <= NOT (or_reduce(fract_in));
rmode_00 <= '1' WHEN (rmode = "00") ELSE '0';
rmode_01 <= '1' WHEN (rmode = "01") ELSE '0';
rmode_10 <= '1' WHEN (rmode = "10") ELSE '0';
rmode_11 <= '1' WHEN (rmode = "11") ELSE '0';

-- Fasu Output will be denormalized ...
dn <= NOT op_mul AND NOT op_div AND (exp_in_00 OR (exp_next_mi(8) AND NOT fract_in(47)) );

--------------------------------------------------------------------------
-- Fraction Normalization
--------------------------------------------------------------------------

-- Incremented fraction for rounding
fract_out_pl1 <= ('0' & fract_out) + '1';

-- Special Signals for f2i
f2i_emin <= X"7e" WHEN (rmode_00 = '1') ELSE X"7f";
f2i_zero <= '1' WHEN (((opas = '0') AND (exp_in < f2i_emin)) OR
                      ((opas = '1') AND (exp_in > f2i_emax)) OR
                      ((opas = '1') AND (exp_in < f2i_emin) AND ((fract_in_00 OR NOT rmode_11) = '1')))
            ELSE '0';
f2i_max <= '1' WHEN (((opas = '0') AND (exp_in > f2i_emax)) OR
                     ((opas = '1') AND (exp_in < f2i_emin) AND (fract_in_00 = '0') AND (rmode_11 = '1')))
           ELSE '0';

-- Calculate various shifting options
shftr_mul <= exp_out WHEN ((NOT exp_ovf(1) AND exp_in_00) = '1') ELSE exp_in_mi1(7 DOWNTO 0);
shft_co <= '0' WHEN ((NOT exp_ovf(1) AND exp_in_00) = '1') ELSE exp_in_mi1(8);
div_shft1 <= ("000" & div_opa_ldz) WHEN (exp_in_00 = '1') ELSE div_scht1a(7 DOWNTO 0);
div_shft1_co <= '0' WHEN (exp_in_00 = '1') ELSE div_scht1a(8);
div_scht1a <= ('0' & exp_in) - div_opa_ldz; -- 9 bits - includes carry out
div_shft2 <= exp_in + "10";
div_shft3 <= div_opa_ldz+exp_in;
div_shft4 <= div_opa_ldz-exp_in;
div_dn <= op_dn and div_shft1_co;
div_nr <= '1' WHEN ((op_dn = '1') and (exp_ovf(1) = '1') and
                    (or_reduce(fract_in(46 DOWNTO 23)) = '0') AND (div_shft3 > X"16"))
          ELSE '0';
f2i_shft <= exp_in - X"7d";

-- Select shifting direction
left_right <= lr_div WHEN (op_div = '1') ELSE
              lr_mul WHEN (op_mul = '1') ELSE
              '1';
lr_div <= '1' WHEN ((op_dn AND NOT exp_ovf(1) AND exp_ovf(0)) = '1') ELSE
          '0' WHEN ((op_dn AND exp_ovf(1)) = '1') ELSE
          '0' WHEN ((op_dn AND div_shft1_co) = '1') ELSE
          '1' WHEN ((op_dn AND exp_out_00) = '1') ELSE
          '1' WHEN ((NOT op_dn AND exp_out_00 AND NOT exp_ovf(1)) = '1') ELSE
          '0' WHEN ((exp_ovf(1)) = '1') ELSE
          '1';
lr_mul <= '1' WHEN ((shft_co OR (NOT exp_ovf(1) AND exp_in_00) OR
                     (NOT exp_ovf(1) AND NOT exp_in_00 AND (exp_out1_co OR exp_out_00) )) = '1') ELSE
          '0' WHEN (( exp_ovf(1) or exp_in_00 ) = '1') ELSE
          '1';

-- Select Left and Right shift value
fasu_shift_p1 <= X"02" WHEN (exp_in_00 = '1') ELSE exp_in_pl1(7 downto 0);
fasu_shift <= fasu_shift_p1 WHEN ((dn OR exp_out_00) = '1') ELSE ("00" & fi_ldz);
shift_right <= shftr_div WHEN (op_div = '1') ELSE shftr_mul;
conv_shft <= f2i_shft WHEN (op_f2i = '1') ELSE ("00" & fi_ldz);
shift_left <= shftl_div WHEN (op_div = '1') ELSE
              shftl_mul WHEN (op_mul = '1') ELSE
              conv_shft WHEN ((op_f2i or op_i2f) = '1') else
              fasu_shift;
shftl_mul <= exp_in_pl1(7 DOWNTO 0) WHEN ((shft_co OR (NOT exp_ovf(1) AND exp_in_00) OR
                                           (NOT exp_ovf(1) AND NOT exp_in_00 AND (exp_out1_co OR exp_out_00))) = '1') ELSE
             ("00" & fi_ldz);
shftl_div <= div_shft1(7 downto 0) WHEN ((op_dn and exp_out_00 and not (not exp_ovf(1) and exp_ovf(0))) ='1') else
             exp_in(7 downto 0) WHEN ((not op_dn and exp_out_00 and not exp_ovf(1))='1') else
             ("00" & fi_ldz);
shftr_div <= div_shft3 WHEN ((op_dn AND exp_ovf(1)) = '1') else
             div_shft4 WHEN ((op_dn AND div_shft1_co) = '1') else
             div_shft2;

-- Do the actual shifting
fract_in_shftr <= (OTHERS => '0') WHEN (or_reduce(shift_right(7 DOWNTO 6)) = '1') else
                  shr(fract_in,shift_right(5 DOWNTO 0));
fract_in_shftl <= (OTHERS => '0') WHEN ((or_reduce(shift_left(7 DOWNTO 6))='1') OR ((f2i_zero AND op_f2i)='1')) else
                  (SHL(fract_in,shift_left(5 DOWNTO 0)));

-- Choose final fraction output
fract_trunc <= fract_in_shftl(24 DOWNTO 0) WHEN (left_right = '1') else fract_in_shftr(24 DOWNTO 0);
fract_out <= fract_in_shftl(47 DOWNTO 25) WHEN (left_right = '1') else fract_in_shftr(47 DOWNTO 25);

--------------------------------------------------------------------------

-- Exponent Normalization
--------------------------------------------------------------------------
fi_ldz_mi1 <= fi_ldz - '1';
fi_ldz_mi22 <= fi_ldz - "10110";
exp_out_pl1 <= exp_out + '1';
exp_out_mi1 <= exp_out - '1';
-- 9 bits - includes carry out
exp_in_pl1 <= ('0' & exp_in) + '1';
-- 9 bits - includes carry out
exp_in_mi1 <= ('0' & exp_in) - '1';
exp_out1_mi1 <= exp_out1 - '1';
-- 9 bits - includes carry out
exp_next_mi <= exp_in_pl1 - fi_ldz_mi1;
exp_fix_diva <= exp_in - fi_ldz_mi22;
exp_fix_divb <= exp_in - fi_ldz_mi1;
exp_zero <= (exp_ovf(1) AND NOT exp_ovf(0) AND op_mul AND (NOT exp_rnd_adj2a OR NOT rmode(1))) OR
            (op_mul AND exp_out1_co);

exp_out1 <= exp_in_pl1(7 DOWNTO 0) WHEN (fract_in(47) = '1') else exp_next_mi(7 DOWNTO 0);
exp_out1_co <= exp_in_pl1(8) WHEN (fract_in(47) = '1') else exp_next_mi(8);

f2i_out_sign <= f2i_out_sign_p1 WHEN (opas ='0') ELSE f2i_out_sign_p2;
f2i_out_sign_p1 <= '0' WHEN (exp_in<f2i_emin) ELSE
                   '0' WHEN (exp_in>f2i_emax) ELSE
                   opas;
f2i_out_sign_p2 <= '0' WHEN (exp_in<f2i_emin) ELSE
                   '1' WHEN (exp_in>f2i_emax) ELSE
                   opas;

exp_i2f <= X"9e" WHEN ((fract_in_00 AND opas)='1') else
           X"00" WHEN ((fract_in_00 AND NOT opas)='1') else
           (X"9e"-fi_ldz);
exp_f2i_1 <= shl((fract_in(47) & fract_in(47) & fract_in(47) & fract_in(47) &
                  fract_in(47) & fract_in(47) & fract_in(47) & fract_in(47) & fract_in),f2i_shft);
exp_f2i <= (OTHERS => '0') WHEN (f2i_zero = '1') else
           X"ff" WHEN (f2i_max = '1') else
           exp_f2i_1(55 DOWNTO 48);
conv_exp <= exp_f2i WHEN (op_f2i = '1') ELSE exp_i2f;

exp_out <= exp_div WHEN (op_div = '1') ELSE
           conv_exp WHEN ((op_f2i OR op_i2f)='1') ELSE
           X"00" WHEN (exp_zero = '1') ELSE
           ("000000" & fract_in(47 downto 46)) WHEN (dn = '1') else
           exp_out1;
ldz_all <= ("00" & div_opa_ldz) + fi_ldz;

ldz_dif <= fi_ldz_2 - div_opa_ldz;
fi_ldz_2a <= "0010111" - fi_ldz;
fi_ldz_2 <= (fi_ldz_2a(6) & fi_ldz_2a(6 DOWNTO 0));
-- 9 bits - includes carry out
div_exp1 <= exp_in_mi1 + fi_ldz_2;
div_exp2_temp <= exp_in_pl1 - ldz_all;
div_exp2 <= div_exp2_temp(7 DOWNTO 0);
div_exp3 <= exp_in + ldz_dif;
exp_div <= div_exp3 when ((opa_dn AND opb_dn) = '1') ELSE
           div_exp1(7 DOWNTO 0) WHEN (opb_dn = '1') ELSE
           div_exp2 WHEN ((opa_dn = '1') AND NOT ( (exp_in<div_opa_ldz) OR (div_exp2>"011111110") )) ELSE
           (OTHERS => '0') WHEN ((opa_dn or (exp_in_00 and NOT exp_ovf(1)) ) = '1') ELSE
           exp_out1_mi1;
div_inf <= '1' WHEN ((opb_dn = '1') AND (opa_dn = '0') and (div_exp1(7 DOWNTO 0) < X"7f")) ELSE '0';

--------------------------------------------------------------------------
-- ROUND
--------------------------------------------------------------------------

-- Extract rounding (GRS) bits
grs_sel_div <= op_div and (exp_ovf(1) or div_dn or exp_out1_co or exp_out_00);
g <= fract_out(0) WHEN (grs_sel_div = '1') ELSE fract_out(0);
r <= (fract_trunc(24) AND NOT div_nr) WHEN (grs_sel_div = '1') ELSE fract_trunc(24);
s <= or_reduce(fract_trunc(24 DOWNTO 0)) WHEN (grs_sel_div = '1') ELSE
     (or_reduce(fract_trunc(23 DOWNTO 0)) OR (fract_trunc(24) AND op_div));

-- Round to nearest even
round <= (g and r) or (r and s) ;
fract_out_rnd0 <= fract_out_pl1(22 DOWNTO 0) WHEN (round = '1') ELSE fract_out;
exp_rnd_adj0 <= fract_out_pl1(23) WHEN (round = '1') ELSE '0';
exp_out_rnd0 <= exp_out_pl1 WHEN (exp_rnd_adj0 = '1') else exp_out;
ovf0 <= exp_out_final_ff and NOT rmode_01 AND NOT op_f2i;

-- round to zero
fract_out_rnd1 <= ("111" & X"fffff") WHEN ((exp_out_ff and NOT op_div AND NOT dn and NOT op_f2i) = '1') ELSE
                  fract_out;
exp_fix_div <= exp_fix_diva WHEN (fi_ldz>"010110") else exp_fix_divb;
exp_out_rnd1 <= exp_fix_div WHEN ((g and r and s and exp_in_ff AND op_div)='1') else
                exp_next_mi(7 DOWNTO 0) WHEN ((g and r and s and exp_in_ff AND NOT op_div)='1') else
                exp_in when ((exp_out_ff and not op_f2i)='1') else
                exp_out;
ovf1 <= exp_out_ff and NOT dn;

-- round to +inf (UP) and -inf (DOWN)
r_sign <= sign;
round2a <= NOT exp_out_fe or NOT fract_out_7fffff or (exp_out_fe and fract_out_7fffff);
round2_fasu <= ((r or s) and NOT r_sign) and (NOT exp_out(7) OR (exp_out(7) AND round2a));
round2_fmul <= NOT r_sign and (
                 (exp_ovf(1) and not fract_in_00 and
                   ( ((not exp_out1_co or op_dn) and (r or s or (not rem_00 and op_div) )) or
                     fract_out_00 or
                     (not op_dn and not op_div)) ) or
                 ( (r or s or (not rem_00 and op_div)) and
                   ( (not exp_ovf(1) and (exp_in_80 or not exp_ovf(0))) or
                     op_div or
                     ( exp_ovf(1) and not exp_ovf(0) and exp_out1_co) ) ) );
round2_f2i_p1 <= '1' WHEN (exp_in<X"80" ) ELSE '0';
round2_f2i <= rmode_10 and (( or_reduce(fract_in(23 DOWNTO 0)) AND NOT opas AND round2_f2i_p1) OR
                            (or_reduce(fract_trunc)));
round2 <= round2_fmul WHEN ((op_mul or op_div) = '1') ELSE
          round2_f2i WHEN (op_f2i = '1') else
          round2_fasu;
fract_out_rnd2a <= fract_out_pl1(22 DOWNTO 0) WHEN (round2 = '1') else fract_out;
exp_rnd_adj2a <= fract_out_pl1(23) WHEN (round2 = '1') else '0';
exp_out_rnd2a <= exp_out_mi1 WHEN ((exp_rnd_adj2a AND (exp_ovf(1) and op_mul))='1') ELSE
                 exp_out_pl1 WHEN ((exp_rnd_adj2a AND NOT (exp_ovf(1) AND op_mul))='1') ELSE
                 exp_out;
fract_out_rnd2 <= "111" & X"FFFFF" WHEN ((r_sign and exp_out_ff and NOT op_div and NOT dn AND NOT op_f2i) = '1') ELSE
                  fract_out_rnd2a;
exp_out_rnd2 <= X"FE" WHEN ((r_sign and exp_out_ff AND NOT op_f2i) = '1') else exp_out_rnd2a;

-- Choose rounding mode
PROCESS (rmode,exp_out_rnd0,exp_out_rnd1,exp_out_rnd2)
BEGIN
  CASE rmode IS
    WHEN "00" => exp_out_rnd <= exp_out_rnd0;
    WHEN "01" => exp_out_rnd <= exp_out_rnd1;

    WHEN "10" => exp_out_rnd <= exp_out_rnd2;
    WHEN "11" => exp_out_rnd <= exp_out_rnd2;
    WHEN OTHERS => exp_out_rnd <= (OTHERS => 'X');
  END CASE;
END PROCESS;

PROCESS (rmode,fract_out_rnd0,fract_out_rnd1,fract_out_rnd2)
BEGIN
  CASE rmode IS
    WHEN "00" => fract_out_rnd <= fract_out_rnd0;
    WHEN "01" => fract_out_rnd <= fract_out_rnd1;
    WHEN "10" => fract_out_rnd <= fract_out_rnd2;
    WHEN "11" => fract_out_rnd <= fract_out_rnd2;
    WHEN OTHERS => fract_out_rnd <= (OTHERS => 'X');
  END CASE;
END PROCESS;

--------------------------------------------------------------------------
-- Final Output Mux
--------------------------------------------------------------------------

-- Fix Output for denormalized and special numbers
max_num <= ( not rmode_00 and (op_mul or op_div ) and (
               ( exp_ovf(1) and exp_ovf(0)) or
               (not exp_ovf(1) and not exp_ovf(0) and exp_in_ff and (max_num_t2) and (max_num_t1) ) ) ) or
           ( op_div and (
               ( rmode_01 and ( div_inf or (exp_out_ff and not exp_ovf(1) ) or (exp_ovf(1) and exp_ovf(0) ) ) ) or
               ( rmode(1) and not exp_ovf(1) and (
                   ( exp_ovf(0) and exp_in_ff and r_sign and fract_in(47) ) or
                   ( and ( r_sign
                       (fract_in(4
                       (exp_in(7)
                       (exp_in(7)
                       max
                   ) ) or
                   ( exp_in_00 and r_sign and ( ) ) )
               ) ) );
max_num_t2 <= '1' WHEN (fi_ldz_2<"0011000") ELSE '0';
max_num_t1 <= '1' WHEN (exp_out/=X"fe") ELSE '0';
max_num_t3 <= '1' WHEN (exp_out/=X"7f") ELSE '0';
max_num_t4 <= '1' WHEN (div_exp1>"011111110") ELSE '0';

inf_out <= (rmode(1) and (op_mul or op_div) and not r_sign and
             ( (exp_in_ff and not op_div) or
               (exp_ovf(1) and exp_ovf(0) and (exp_in_00 or exp_in(7)) ) ) ) or
           (div_inf and op_div and
             ( rmode_00 or
               (rmode(1) and not exp_in_ff and not exp_ovf(1) and not exp_ovf(0) and not r_sign ) or
               (rmode(1) and not exp_ovf(1) and exp_ovf(0) and exp_in_00 and not r_sign) ) ) or
           (op_div and rmode(1) and exp_in_ff and op_dn and not r_sign and inf_out_t1 );
inf_out_t1 <= '1' WHEN ((fi_ldz_2 < 24) AND (exp_out_rnd/=X"fe")) ELSE '0';

fract_out_final <= (OTHERS => '0') when ((inf_out or ovf0 or output_zero ) = '1') ELSE
                   ("111" & X"fffff") WHEN ((max_num or (f2i_max and op_f2i) )= '1') else
                   fract_out_rnd;
exp_out_final <= X"00" WHEN (((op_div and exp_ovf(1) and not exp_ovf(0)) or output_zero )='1') else
                 X"ff" WHEN (((op_div and exp_ovf(1) and exp_ovf(0) and rmode_00) or inf_out or (f2i_max and op_f2i) )='1') ELSE
                 X"fe" WHEN (max_num = '1') else
                 exp_out_rnd;

--------------------------------------------------------------------------
-- Pack Result
--------------------------------------------------------------------------
fpout <= exp_out_final & fract_out_final;

--------------------------------------------------------------------------
-- Exceptions
--------------------------------------------------------------------------
z <= shft_co or ( exp_ovf(1) or exp_in_00) or
     (not exp_ovf(1) and not exp_in_00 and (exp_out1_co or exp_out_00));
underflow_fmul <= ( (or_reduce(fract_trunc)) and z and not exp_in_ff ) or
                  (fract_out_00 and not fract_in_00 and exp_ovf(1));

undeflow_div_p1 <= '1' WHEN (exp_out_final/=X"ff") ELSE '0';
undeflow_div_p2 <= '1' WHEN (exp_in>X"16") ELSE '0';
undeflow_div_p3 <= '1' WHEN (fi_ldz<"010111") ELSE '0';
undeflow_div_p4 <= '1' WHEN (exp_in<"00010111") ELSE '0';
undeflow_div_p5 <= '1' WHEN (exp_in(7)=exp_div(7)) ELSE '0';
undeflow_div_p6 <= '1' WHEN (exp_div(7 DOWNTO 1)="1111111") ELSE '0';
undeflow_div_p7 <= '1' WHEN (exp_in<X"7f") ELSE '0';
undeflow_div_p8 <= '1' WHEN (exp_in>X"20") ELSE '0';
undeflow_div_p9 <= '1' WHEN (ldz_all<"0010111") ELSE '0';
undeflow_div_p10 <= '1' WHEN (exp_in=X"01") ELSE '0';

undeflow_div <= not (exp_ovf(1) and exp_ovf(0) and rmode_00) and not inf_out and not max_num and undeflow_div_p1 and (
                  ((or_reduce(fract_trunc)) and not opb_dn and (
                      ( op_dn and not exp_ovf(1) and exp_ovf(0)) or
                      ( op_dn and exp_ovf(1)) or
                      ( op_dn and div_shft1_co) or
                      exp_out_00 or
                      exp_ovf(1) ) ) or
                  ( exp_ovf(1) and not exp_ovf(0) and (
                      ( op_dn and undeflow_div_p2 and undeflow_div_p3 ) or
                      ( op_dn and undeflow_div_p4 and undeflow_div_p3 and not rem_00) or
                      ( not op_dn and (undeflow_div_p5) and not rem_00) or
                      ( not op_dn and exp_in_00 and (undeflow_div_p6) ) or
                      ( not op_dn and undeflow_div_p7 and undeflow_div_p8 ) ) ) or
                  (not exp_ovf(1) and not exp_ovf(0) and (
                      ( op_dn and undeflow_div_p3 and exp_out_00) or
                      ( exp_in_00 and not rem_00) or
                      ( not op_dn and undeflow_div_p9 and undeflow_div_p10 and exp_out_00 and not rem_00) ) ) );

underflow <= (undeflow_div) WHEN (op_div = '1') ELSE
             (underflow_fmul) when (op_mul ='1') ELSE
             (NOT dn AND (NOT fract_in(47) AND exp_out1_co));

overflow_fdiv <= inf_out or
                 (NOT rmode_00 AND max_num) or
                 (exp_in(7) and op_dn and exp_out_ff) or
                 (exp_ovf(0) and (exp_ovf(1) or exp_out_ff) );
overflow <= overflow_fdiv WHEN (op_div ='1') else (ovf0 or ovf1);

f2i_ine <= '1' WHEN (((f2i_zero AND NOT fract_in_00 AND NOT opas)='1') OR
                     ((or_reduce(fract_trunc))='1') OR
                     ((f2i_zero='1') and (exp_in<X"80") and (opas='1') and (fract_in_00='0')) or
                     ((f2i_max='1') and (rmode_11='1') and (exp_in<X"80")))
           else '0';
ine <= f2i_ine WHEN (op_f2i='1') ELSE
       or_reduce(fract_trunc) WHEN (op_i2f='1') ELSE
       ((r and NOT dn) or (s and NOT dn) or max_num or (op_div and NOT rem_00));

--------------------------------------------------------------------------
-- Debugging Stuff
--------------------------------------------------------------------------
END arch;
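
The fi_ldz priority chain and the round-to-nearest-even equations in post_norm are hard to follow in listing form. The following is a minimal sketch, not part of the original core, that restates both as VHDL functions. The package name post_norm_sketch_pkg, the function names first_one_pos and round_nearest_even, and the testbench post_norm_sketch_tb are all hypothetical. The sketch assumes the fi_ldz convention visible in the chain (the 1-based position of the first '1' counted from bit 47, with 48 returned when bits 47 down to 1 are all zero) and the non-divide definitions of g, r and s.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical helper package; none of these names exist in the original core.
package post_norm_sketch_pkg is
  -- 1-based position of the first '1' seen from bit 47 (bit 0 is ignored).
  function first_one_pos(v : std_logic_vector(47 downto 0)) return natural;
  -- Returns carry & rounded fraction (24 bits) for round to nearest even.
  function round_nearest_even(fract : std_logic_vector(22 downto 0);
                              trunc : std_logic_vector(24 downto 0))
    return std_logic_vector;
end package post_norm_sketch_pkg;

package body post_norm_sketch_pkg is
  function first_one_pos(v : std_logic_vector(47 downto 0)) return natural is
    variable pos : natural := 48;      -- result when bits 47..1 hold no '1'
  begin
    for i in 1 to 47 loop              -- scan upward so the highest '1' wins
      if v(i) = '1' then
        pos := 48 - i;
      end if;
    end loop;
    return pos;
  end function first_one_pos;

  function round_nearest_even(fract : std_logic_vector(22 downto 0);
                              trunc : std_logic_vector(24 downto 0))
    return std_logic_vector is
    variable g, r, s : std_logic;
    variable sum     : unsigned(23 downto 0);
  begin
    g := fract(0);                     -- guard: least significant kept bit
    r := trunc(24);                    -- round: first discarded bit
    s := '0';                          -- sticky: OR of the remaining bits
    for i in 23 downto 0 loop
      s := s or trunc(i);
    end loop;
    sum := '0' & unsigned(fract);      -- one extra bit to hold the carry
    if ((g and r) or (r and s)) = '1' then
      sum := sum + 1;
    end if;
    return std_logic_vector(sum);
  end function round_nearest_even;
end package body post_norm_sketch_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.post_norm_sketch_pkg.all;

-- Hypothetical self-checking testbench for the two functions above.
entity post_norm_sketch_tb is
end entity post_norm_sketch_tb;

architecture sim of post_norm_sketch_tb is
begin
  process
    constant EVEN : std_logic_vector(22 downto 0) := (others => '0');
    constant ODD  : std_logic_vector(22 downto 0) := (0 => '1', others => '0');
    constant HALF : std_logic_vector(24 downto 0) := (24 => '1', others => '0');
  begin
    -- Bit 41 set corresponds to the "0000001" entry of the chain (value 7).
    assert first_one_pos(X"020000000000") = 7  report "bit 41" severity failure;
    assert first_one_pos(X"000000000002") = 47 report "bit 1" severity failure;
    assert first_one_pos(X"000000000001") = 48 report "bit 0 ignored" severity failure;
    -- An exact tie keeps an even fraction and rounds an odd one up.
    assert round_nearest_even(EVEN, HALF) = X"000000" report "tie, even" severity failure;
    assert round_nearest_even(ODD, HALF) = X"000002" report "tie, odd" severity failure;
    wait;
  end process;
end architecture sim;
```

Inside post_norm, a function of this kind could in principle drive the signal with fi_ldz <= conv_std_logic_vector(first_one_pos(fract_in), 6);. Unlike the ELSIF chain, the loop returns 48 rather than 'X' when the input contains unknown values, so the two are not equivalent in simulation.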

LIBRARY ieee ;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_misc.ALL;

ENTITY except IS
  PORT(
    clk     : IN  std_logic ;
    opa     : IN  std_logic_vector (31 downto 0) ;
    opb     : IN  std_logic_vector (31 downto 0) ;
    ind     : OUT std_logic ;
    inf     : OUT std_logic ;
    opa_00  : OUT std_logic ;
    opa_dn  : OUT std_logic ;
    opa_inf : OUT std_logic ;
    opa_nan : OUT std_logic ;
    opb_00  : OUT std_logic ;
    opb_dn  : OUT std_logic ;
    opb_inf : OUT std_logic ;
    opb_nan : OUT std_logic ;
    qnan    : OUT std_logic ;
    snan    : OUT std_logic
  );
END except ;

ARCHITECTURE arch OF except IS
  signal expa, expb : std_logic_vector (7 downto 0);
  signal fracta, fractb : std_logic_vector (22 downto 0);

  signal expa_ff, infa_f_r, qnan_r_a, snan_r_a : std_logic ;
  signal expb_ff, infb_f_r, qnan_r_b, snan_r_b : std_logic ;
  signal expa_00, expb_00, fracta_00, fractb_00 : std_logic ;
BEGIN
expa <= opa(30 downto 23);
expb <= opb(30 downto 23);
fracta <= opa(22 downto 0);
fractb <= opb(22 downto 0);

--------------------------------------------------------------------------
-- Determine if any of the input operators is a INF or NAN or any other special number
--------------------------------------------------------------------------
PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    expa_ff <= and_reduce(expa);
    expb_ff <= and_reduce(expb);
    infa_f_r <= NOT or_reduce(fracta);
    infb_f_r <= NOT or_reduce(fractb);
    qnan_r_a <= fracta(22);
    snan_r_a <= NOT fracta(22) AND or_reduce(fracta(21 downto 0));
    qnan_r_b <= fractb(22);
    snan_r_b <= NOT fractb(22) and or_reduce(fractb(21 downto 0));
    ind <= (expa_ff and infa_f_r) and (expb_ff and infb_f_r);
    inf <= (expa_ff and infa_f_r) or (expb_ff and infb_f_r);
    qnan <= (expa_ff and qnan_r_a) or (expb_ff and qnan_r_b);
    snan <= (expa_ff and snan_r_a) or (expb_ff and snan_r_b);
    opa_nan <= and_reduce(expa) and or_reduce(fracta(22 downto 0));
    opb_nan <= and_reduce(expb) and or_reduce(fractb(22 downto 0));
    opa_inf <= (expa_ff and infa_f_r);
    opb_inf <= (expb_ff and infb_f_r);
    expa_00 <= NOT or_reduce(expa);
    expb_00 <= NOT or_reduce(expb);
    fracta_00 <= NOT or_reduce(fracta);
    fractb_00 <= NOT or_reduce(fractb);
    opa_00 <= expa_00 and fracta_00;
    opb_00 <= expb_00 and fractb_00;
    opa_dn <= expa_00;
    opb_dn <= expb_00;
  END IF;
END PROCESS;
END arch;
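
The except block derives its flags from two tests on each operand: whether the exponent field is all ones or all zeros, and whether the fraction is zero or has its top bit set. The following is a minimal sketch of the same decision tree as a single combinational function. It is not part of the original core: the package name fp_class_pkg, the type fp_class_t and the function classify are hypothetical. Unlike the listing, where opa_dn is asserted for any zero exponent, the sketch reports zero and denormal as separate classes, and it has none of the pipeline registers.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper package; these names do not exist in the original core.
package fp_class_pkg is
  type fp_class_t is (FP_ZERO, FP_DENORMAL, FP_NORMAL, FP_INFINITY, FP_QNAN, FP_SNAN);
  function classify(op : std_logic_vector(31 downto 0)) return fp_class_t;
end package fp_class_pkg;

package body fp_class_pkg is
  function classify(op : std_logic_vector(31 downto 0)) return fp_class_t is
    constant EXP_ONES  : std_logic_vector(7 downto 0)  := (others => '1');
    constant EXP_ZERO  : std_logic_vector(7 downto 0)  := (others => '0');
    constant FRAC_ZERO : std_logic_vector(22 downto 0) := (others => '0');
    variable e : std_logic_vector(7 downto 0);
    variable f : std_logic_vector(22 downto 0);
  begin
    e := op(30 downto 23);             -- exponent field, sign bit 31 is ignored
    f := op(22 downto 0);              -- fraction field
    if e = EXP_ONES then
      if f = FRAC_ZERO then
        return FP_INFINITY;
      elsif f(22) = '1' then
        return FP_QNAN;                -- top fraction bit set
      else
        return FP_SNAN;                -- top bit clear, lower bits non-zero
      end if;
    elsif e = EXP_ZERO then
      if f = FRAC_ZERO then
        return FP_ZERO;
      else
        return FP_DENORMAL;
      end if;
    else
      return FP_NORMAL;
    end if;
  end function classify;
end package body fp_class_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.fp_class_pkg.all;

-- Hypothetical self-checking testbench.
entity fp_class_tb is
end entity fp_class_tb;

architecture sim of fp_class_tb is
begin
  process
  begin
    assert classify(X"7f800000") = FP_INFINITY report "inf"      severity failure;
    assert classify(X"7fc00001") = FP_QNAN     report "qnan"     severity failure;
    assert classify(X"7f800001") = FP_SNAN     report "snan"     severity failure;
    assert classify(X"00000000") = FP_ZERO     report "zero"     severity failure;
    assert classify(X"00000001") = FP_DENORMAL report "denormal" severity failure;
    assert classify(X"3f800000") = FP_NORMAL   report "normal"   severity failure;
    wait;
  end process;
end architecture sim;
```

The first three test vectors are the INF_VAL, QNAN_VAL and SNAN_VAL constants declared in the fpu architecture, so the sketch agrees with the encodings the core itself uses for those values.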

LIBRARY ieee ;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_arith.ALL;
USE ieee.std_logic_misc.ALL;
USE ieee.std_logic_unsigned.ALL;
LIBRARY work;

--------------------------------------------------------------------------
-- FPU Operations (fpu_op):
-- 0 = add
-- 1 = sub
-- 2 = mul
-- 3 = div
-- 4 =
-- 5 =
-- 6 =
-- 7 =
--------------------------------------------------------------------------
--------------------------------------------------------------------------
-- Rounding Modes (rmode):
-- 0 = round_nearest_even
-- 1 = round_to_zero
-- 2 = round_up
-- 3 = round_down
--------------------------------------------------------------------------
ENTITY fpu IS
  PORT(
    clk         : IN  std_logic ;
    fpu_op      : IN  std_logic_vector (2 downto 0) ;
    opa         : IN  std_logic_vector (31 downto 0) ;
    opb         : IN  std_logic_vector (31 downto 0) ;
    rmode       : IN  std_logic_vector (1 downto 0) ;
    div_by_zero : OUT std_logic ;
    fpout       : OUT std_logic_vector (31 downto 0) ;
    ine         : OUT std_logic ;
    inf         : OUT std_logic ;
    overflow    : OUT std_logic ;
    qnan        : OUT std_logic ;
    snan        : OUT std_logic ;
    underflow   : OUT std_logic ;
    zero        : OUT std_logic
  );
END fpu ;

ARCHITECTURE arch OF fpu IS
  signal opa_r, opb_r : std_logic_vector (31 downto 0);
  signal signa, signb : std_logic ;
  signal sign_fasu : std_logic ;
  signal fracta, fractb : std_logic_vector (26 downto 0);
  signal exp_fasu : std_logic_vector (7 downto 0);
  signal exp_r : std_logic_vector (7 downto 0);
  signal fract_out_d : std_logic_vector (26 downto 0);
  signal co : std_logic ;
  signal fract_out_q : std_logic_vector (27 downto 0);
  signal out_d : std_logic_vector (30 downto 0);

  signal overflow_d, underflow_d : std_logic ;
  signal mul_inf, div_inf : std_logic ;
  signal mul_00, div_00 : std_logic ;
  signal inf_d, ind_d, qnan_d, snan_d, opa_nan, opb_nan : std_logic ;
  signal opa_00, opb_00 : std_logic ;
  signal opa_inf, opb_inf : std_logic ;
  signal opa_dn, opb_dn : std_logic ;
  signal nan_sign_d, result_zero_sign_d : std_logic ;
  signal sign_fasu_r : std_logic ;
  signal exp_mul : std_logic_vector (7 downto 0);
  signal sign_mul : std_logic ;
  signal sign_mul_r : std_logic ;
  signal fracta_mul, fractb_mul : std_logic_vector (23 downto 0);
  signal inf_mul : std_logic ;
  signal inf_mul_r : std_logic ;
  signal exp_ovf : std_logic_vector (1 downto 0);
  signal exp_ovf_r : std_logic_vector (1 downto 0);
  signal sign_exe : std_logic ;
  signal sign_exe_r : std_logic ;
  signal underflow_fmul1_p1, underflow_fmul1_p2, underflow_fmul1_p3 : std_logic ;
  signal underflow_fmul_d : std_logic_vector (2 downto 0);
  signal prod : std_logic_vector (47 downto 0);
  signal quo : std_logic_vector (49 downto 0);
  signal fdiv_opa : std_logic_vector (49 downto 0);
  signal remainder : std_logic_vector (49 downto 0);
  signal remainder_00 : std_logic ;
  signal div_opa_ldz_d, div_opa_ldz_r1, div_opa_ldz_r2 : std_logic_vector (4 downto 0);
  signal ine_d : std_logic ;
  signal fract_denorm : std_logic_vector (47 downto 0);
  signal fract_div : std_logic_vector (47 downto 0);
  signal sign_d : std_logic ;
  signal sign : std_logic ;
  signal opa_r1 : std_logic_vector (30 downto 0);
  signal fract_i2f : std_logic_vector (47 downto 0);
  signal opas_r1, opas_r2 : std_logic ;
  signal f2i_out_sign : std_logic ;
  signal fasu_op_r1, fasu_op_r2 : std_logic ;
  signal out_fixed : std_logic_vector (30 downto 0);
  signal output_zero_fasu : std_logic ;
  signal output_zero_fdiv : std_logic ;
  signal output_zero_fmul : std_logic ;
  signal inf_mul2 : std_logic ;
  signal overflow_fasu : std_logic ;
  signal overflow_fmul : std_logic ;
  signal overflow_fdiv : std_logic ;
  signal inf_fmul : std_logic ;
  signal sign_mul_final : std_logic ;
  signal out_d_00 : std_logic ;
  signal sign_div_final : std_logic ;
  signal ine_mul, ine_mula, ine_div, ine_fasu : std_logic ;
  signal underflow_fasu, underflow_fmul, underflow_fdiv : std_logic ;
  signal underflow_fmul1 : std_logic ;
  signal underflow_fmul_r : std_logic_vector (2 downto 0);
  signal opa_nan_r : std_logic ;
  signal mul_uf_del : std_logic ;
  signal uf2_del, ufb2_del, ufc2_del, underflow_d_del : std_logic ;
  signal co_del : std_logic ;
  signal out_d_del : std_logic_vector (30 downto 0);
  signal ov_fasu_del, ov_fmul_del : std_logic ;

  signal fop : std_logic_vector (2 downto 0);
  signal ldza_del : std_logic_vector (4 downto 0);
  signal quo_del : std_logic_vector (49 downto 0);
  signal rmode_r1, rmode_r2, rmode_r3 : std_logic_vector (1 downto 0);
  signal fpu_op_r1, fpu_op_r2, fpu_op_r3 : std_logic_vector (2 downto 0);
  signal fpu_op_r1_0_not : std_logic ;
  signal fasu_op, co_d : std_logic ;
  signal post_norm_output_zero : std_logic ;

  CONSTANT INF_VAL  : std_logic_vector(31 DOWNTO 0) := X"7f800000";
  CONSTANT QNAN_VAL : std_logic_vector(31 DOWNTO 0) := X"7fc00001";
  CONSTANT SNAN_VAL : std_logic_vector(31 DOWNTO 0) := X"7f800001";

  COMPONENT add_sub27
    PORT(
      add : IN  std_logic ;
      opa : IN  std_logic_vector (26 downto 0) ;
      opb : IN  std_logic_vector (26 downto 0) ;
      co  : OUT std_logic ;
      sum : OUT std_logic_vector (26 downto 0)
    );
  END COMPONENT;

  COMPONENT div_r2
    PORT(
      clk       : IN  std_logic ;
      opa       : IN  std_logic_vector (49 downto 0) ;
      opb       : IN  std_logic_vector (23 downto 0) ;
      quo       : OUT std_logic_vector (49 downto 0) ;
      remainder : OUT std_logic_vector (49 downto 0)
    );
  END COMPONENT;

  COMPONENT except IS
    PORT(
      clk     : IN  std_logic ;
      opa     : IN  std_logic_vector (31 downto 0) ;
      opb     : IN  std_logic_vector (31 downto 0) ;
      ind     : OUT std_logic ;
      inf     : OUT std_logic ;
      opa_00  : OUT std_logic ;
      opa_dn  : OUT std_logic ;
      opa_inf : OUT std_logic ;
      opa_nan : OUT std_logic ;
      opb_00  : OUT std_logic ;
      opb_dn  : OUT std_logic ;
      opb_inf : OUT std_logic ;
      opb_nan : OUT std_logic ;
      qnan    : OUT std_logic ;
      snan    : OUT std_logic
    );
  END COMPONENT ;

  COMPONENT mul_r2 IS
    PORT(
      clk  : IN  std_logic ;
      opa  : IN  std_logic_vector (23 downto 0) ;
      opb  : IN  std_logic_vector (23 downto 0) ;
      prod : OUT std_logic_vector (47 downto 0)
    );
  END COMPONENT;

  COMPONENT post_norm IS
    PORT(
      clk          : IN  std_logic ;
      div_opa_ldz  : IN  std_logic_vector (4 downto 0) ;
      exp_in       : IN  std_logic_vector (7 downto 0) ;
      exp_ovf      : IN  std_logic_vector (1 downto 0) ;
      fpu_op       : IN  std_logic_vector (2 downto 0) ;
      fract_in     : IN  std_logic_vector (47 downto 0) ;
      opa_dn       : IN  std_logic ;
      opas         : IN  std_logic ;
      opb_dn       : IN  std_logic ;
      output_zero  : IN  std_logic ;
      rem_00       : IN  std_logic ;
      rmode        : IN  std_logic_vector (1 downto 0) ;
      sign         : IN  std_logic ;
      f2i_out_sign : OUT std_logic ;
      fpout        : OUT std_logic_vector (30 downto 0) ;
      ine          : OUT std_logic ;
      overflow     : OUT std_logic ;
      underflow    : OUT std_logic
    );
  END COMPONENT;

  COMPONENT pre_norm IS
    PORT(
      add              : IN  std_logic ;
      clk              : IN  std_logic ;
      opa              : IN  std_logic_vector (31 downto 0) ;
      opa_nan          : IN  std_logic ;
      opb              : IN  std_logic_vector (31 downto 0) ;
      opb_nan          : IN  std_logic ;
      rmode            : IN  std_logic_vector (1 downto 0) ;
      exp_dn_out       : OUT std_logic_vector (7 downto 0) ;
      fasu_op          : OUT std_logic ;
      fracta_out       : OUT std_logic_vector (26 downto 0) ;
      fractb_out       : OUT std_logic_vector (26 downto 0) ;
      nan_sign         : OUT std_logic ;
      result_zero_sign : OUT std_logic ;
      sign             : OUT std_logic
    );
  END COMPONENT;

  COMPONENT pre_norm_fmul IS
    PORT(
      clk       : IN  std_logic ;
      fpu_op    : IN  std_logic_vector (2 downto 0) ;
      opa       : IN  std_logic_vector (31 downto 0) ;
      opb       : IN  std_logic_vector (31 downto 0) ;
      exp_out   : OUT std_logic_vector (7 downto 0) ;
      exp_ovf   : OUT std_logic_vector (1 downto 0) ;
      fracta    : OUT std_logic_vector (23 downto 0) ;
      fractb    : OUT std_logic_vector (23 downto 0) ;
      inf       : OUT std_logic ;
      sign      : OUT std_logic ;
      sign_exe  : OUT std_logic ;
      underflow : OUT std_logic_vector (2 downto 0)
    );
  END COMPONENT;

BEGIN

PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    opa_r <= opa;
    opb_r <= opb;
    rmode_r1 <= rmode;
    rmode_r2 <= rmode_r1;
    rmode_r3 <= rmode_r2;
    fpu_op_r1 <= fpu_op;
    fpu_op_r2 <= fpu_op_r1;
    fpu_op_r3 <= fpu_op_r2;
  END IF;
END PROCESS;

--------------------------------------------------------------------------
-- Exceptions block
--------------------------------------------------------------------------
u0 : except
  PORT MAP (
    clk => clk,
    opa => opa_r,
    opb => opb_r,
    inf => inf_d,
    ind => ind_d,
    qnan => qnan_d,
    snan => snan_d,
    opa_nan => opa_nan,
    opb_nan => opb_nan,
    opa_00 => opa_00,
    opb_00 => opb_00,
    opa_inf => opa_inf,
    opb_inf => opb_inf,
    opa_dn => opa_dn,
    opb_dn => opb_dn
  );

--------------------------------------------------------------------------
-- Pre-Normalize block
-- Adjusts the numbers to equal exponents and sorts them
-- determine result sign
-- determine actual operation to perform (add or sub)
--------------------------------------------------------------------------
fpu_op_r1_0_not <= NOT fpu_op_r1(0);

u1 : pre_norm
  PORT MAP (
    clk => clk,                             -- System Clock
    rmode => rmode_r2,                      -- Rounding Mode
    add => fpu_op_r1_0_not,                 -- Add/Sub Input
    opa => opa_r,
    opb => opb_r,                           -- Registered OP Inputs
    opa_nan => opa_nan,                     -- OpA is a NAN indicator
    opb_nan => opb_nan,                     -- OpB is a NAN indicator
    fracta_out => fracta,                   -- Equalized and sorted fraction
    fractb_out => fractb,                   -- outputs (Registered)
    exp_dn_out => exp_fasu,                 -- Selected exponent output (registered)
    sign => sign_fasu,                      -- Encoded output Sign (registered)
    nan_sign => nan_sign_d,                 -- Output Sign for NANs (registered)
    result_zero_sign => result_zero_sign_d, -- Output Sign for zero result (registered)
    fasu_op => fasu_op                      -- Actual fasu operation output (registered)
  );

u2 : pre_norm_fmul
  PORT MAP (
    clk => clk,
    fpu_op => fpu_op_r1,
    opa => opa_r,
    opb => opb_r,
    fracta => fracta_mul,
    fractb => fractb_mul,
    exp_out => exp_mul,                     -- FMUL exponent output (registered)
    sign => sign_mul,                       -- FMUL sign output (registered)
    sign_exe => sign_exe,                   -- FMUL exception sign output (registered)
    inf => inf_mul,                         -- FMUL inf output (registered)
    exp_ovf => exp_ovf,                     -- FMUL exponent overflow output (registered)
    underflow => underflow_fmul_d
  );

PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    sign_mul_r <= sign_mul;
    sign_exe_r <= sign_exe;
    inf_mul_r <= inf_mul;
    exp_ovf_r <= exp_ovf;
    sign_fasu_r <= sign_fasu;
  END IF;
END PROCESS;

--------------------------------------------------------------------------
-- Add/Sub
--
u3 : add_sub27
  PORT MAP (
    add => fasu_op,                         -- Add/Sub
    opa => fracta,                          -- Fraction A input
    opb => fractb,                          -- Fraction B Input
    sum => fract_out_d,                     -- SUM output
    co => co_d                              -- Carry Output
  );

PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    fract_out_q <= co_d & fract_out_d;
  END IF;
END PROCESS;

--------------------------------------------------------------------------
-- Mul
--
u5 : mul_r2 PORT MAP (clk => clk, opa => fracta_mul, opb => fractb_mul, prod => prod);

--------------------------------------------------------------------------
-- Divide
--
PROCESS (fracta_mul)
BEGIN
  IF fracta_mul(22) = '1' THEN div_opa_ldz_d <= conv_std_logic_vector(1,5);
  ELSIF fracta_mul(22 DOWNTO 21) = "01" THEN div_opa_ldz_d <= conv_std_logic_vector(2,5);
  ELSIF fracta_mul(22 DOWNTO 20) = "001" THEN div_opa_ldz_d <= conv_std_logic_vector(3,5);
  ELSIF fracta_mul(22 DOWNTO 19) = "0001" THEN div_opa_ldz_d <= conv_std_logic_vector(4,5);
  ELSIF fracta_mul(22 DOWNTO 18) = "00001" THEN div_opa_ldz_d <= conv_std_logic_vector(5,5);
  ELSIF fracta_mul(22 DOWNTO 17) = "000001" THEN div_opa_ldz_d <= conv_std_logic_vector(6,5);
  ELSIF fracta_mul(22 DOWNTO 16) = "0000001" THEN div_opa_ldz_d <= conv_std_logic_vector(7,5);
  ELSIF fracta_mul(22 DOWNTO 15) = "00000001" THEN div_opa_ldz_d <= conv_std_logic_vector(8,5);
  ELSIF fracta_mul(22 DOWNTO 14) = "000000001" THEN div_opa_ldz_d <= conv_std_logic_vector(9,5);
  ELSIF fracta_mul(22 DOWNTO 13) = "0000000001" THEN div_opa_ldz_d <= conv_std_logic_vector(10,5);
  ELSIF fracta_mul(22 DOWNTO 12) = "00000000001" THEN div_opa_ldz_d <= conv_std_logic_vector(11,5);
  ELSIF fracta_mul(22 DOWNTO 11) = "000000000001" THEN div_opa_ldz_d <= conv_std_logic_vector(12,5);
  ELSIF fracta_mul(22 DOWNTO 10) = "0000000000001" THEN div_opa_ldz_d <= conv_std_logic_vector(13,5);
  ELSIF fracta_mul(22 DOWNTO 9) = "00000000000001" THEN div_opa_ldz_d <= conv_std_logic_vector(14,5);
  ELSIF fracta_mul(22 DOWNTO 8) = "000000000000001" THEN div_opa_ldz_d <= conv_std_logic_vector(15,5);
  ELSIF fracta_mul(22 DOWNTO 7) = "0000000000000001" THEN div_opa_ldz_d <= conv_std_logic_vector(16,5);
  ELSIF fracta_mul(22 DOWNTO 6) = "00000000000000001" THEN div_opa_ldz_d <= conv_std_logic_vector(17,5);
  ELSIF fracta_mul(22 DOWNTO 5) = "000000000000000001" THEN div_opa_ldz_d <= conv_std_logic_vector(18,5);
  ELSIF fracta_mul(22 DOWNTO 4) = "0000000000000000001" THEN div_opa_ldz_d <= conv_std_logic_vector(19,5);
  ELSIF fracta_mul(22 DOWNTO 3) = "00000000000000000001" THEN div_opa_ldz_d <= conv_std_logic_vector(20,5);

  ELSIF fracta_mul(22 DOWNTO 2) = "000000000000000000001" THEN div_opa_ldz_d <= conv_std_logic_vector(21,5);
  ELSIF fracta_mul(22 DOWNTO 1) = "0000000000000000000001" THEN div_opa_ldz_d <= conv_std_logic_vector(22,5);
  ELSIF fracta_mul(22 DOWNTO 1) = "0000000000000000000000" THEN div_opa_ldz_d <= conv_std_logic_vector(23,5);
  ELSE div_opa_ldz_d <= (OTHERS => 'X');
  END IF;
END PROCESS;

fdiv_opa <= ((SHL(fracta_mul,div_opa_ldz_d)) & "00" & X"000000") WHEN ((or_reduce(opa_r(30 DOWNTO 23)))='0')
            ELSE (fracta_mul & "00" & X"000000");

u6 : div_r2 PORT MAP (clk => clk, opa => fdiv_opa, opb => fractb_mul, quo => quo, remainder => remainder);

remainder_00 <= NOT or_reduce(remainder);

PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    div_opa_ldz_r1 <= div_opa_ldz_d;
    div_opa_ldz_r2 <= div_opa_ldz_r1;
  END IF;
END PROCESS;

------------------------------------------------------------------------
-- Normalize Result
--
PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    CASE fpu_op_r2 IS
      WHEN "000" => exp_r <= exp_fasu;
      WHEN "001" => exp_r <= exp_fasu;
      WHEN "010" => exp_r <= exp_mul;
      WHEN "011" => exp_r <= exp_mul;
      WHEN "100" => exp_r <= (others => '0');
      WHEN "101" => exp_r <= opa_r1(30 downto 23);
      WHEN OTHERS => exp_r <= (others => '0');
    END CASE;
  END IF;
END PROCESS;

fract_div <= quo(49 DOWNTO 2) WHEN (opb_dn = '1') ELSE (quo(26 DOWNTO 0) & '0' & X"00000");

PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    opa_r1 <= opa_r(30 DOWNTO 0);
    IF fpu_op_r2="101" THEN
      IF sign_d = '1' THEN
        fract_i2f <= conv_std_logic_vector(1,48)-(X"000000" &

                     (or_reduce(opa_r1(30 downto 23))) & opa_r1(22 DOWNTO 0)) - conv_std_logic_vector(1,48);
      ELSE
        fract_i2f <= (X"000000" & (or_reduce(opa_r1(30 downto 23))) & opa_r1(22 DOWNTO 0));
      END IF;
    ELSE
      IF sign_d = '1' THEN
        fract_i2f <= conv_std_logic_vector(1,48) - (opa_r1 & X"0000" & '1');
      ELSE
        fract_i2f <= (opa_r1 & '0' & X"0000");
      END IF;
    END IF;
  END IF;
END PROCESS;

PROCESS (fpu_op_r3,fract_out_q,prod,fract_div,fract_i2f)
BEGIN
  CASE fpu_op_r3 IS
    WHEN "000" => fract_denorm <= (fract_out_q & X"00000");
    WHEN "001" => fract_denorm <= (fract_out_q & X"00000");
    WHEN "010" => fract_denorm <= prod;
    WHEN "011" => fract_denorm <= fract_div;
    WHEN "100" => fract_denorm <= fract_i2f;
    WHEN "101" => fract_denorm <= fract_i2f;
    WHEN OTHERS => fract_denorm <= (others => '0');
  END CASE;
END PROCESS;

PROCESS (clk, opa_r(31),opas_r1,rmode_r2,sign_d)
BEGIN
  IF clk'event AND clk = '1' THEN
    opas_r1 <= opa_r(31);
    opas_r2 <= opas_r1;
    IF rmode_r2="11" THEN
      sign <= NOT sign_d;
    ELSE
      sign <= sign_d;
    END IF;
  END IF;
END PROCESS;

sign_d <= sign_mul WHEN (fpu_op_r2(1) = '1') ELSE sign_fasu;

post_norm_output_zero <= mul_00 or div_00;

u4 : post_norm PORT MAP (
  clk => clk,                           -- System Clock
  fpu_op => fpu_op_r3,                  -- Floating Point Operation
  opas => opas_r2,                      -- OPA Sign
  sign => sign,                         -- Sign of the result
  rmode => rmode_r3,                    -- Rounding mode
  fract_in => fract_denorm,             -- Fraction Input
  exp_ovf => exp_ovf_r,                 -- Exponent Overflow
  exp_in => exp_r,                      -- Exponent Input
  opa_dn => opa_dn,                     -- Operand A Denormalized

  opb_dn => opb_dn,                     -- Operand B Denormalized
  rem_00 => remainder_00,               -- Divide Remainder is zero
  div_opa_ldz => div_opa_ldz_r2,        -- Divide opa leading zeros count
  output_zero => post_norm_output_zero, -- Force output to Zero
  fpout => out_d,                       -- Normalized output (unregistered)
  ine => ine_d,                         -- Result Inexact output (unregistered)
  overflow => overflow_d,               -- Overflow output (unregistered)
  underflow => underflow_d,             -- Underflow output (unregistered)
  f2i_out_sign => f2i_out_sign          -- F2I Output Sign
);

------------------------------------------------------------------------
-- FPU Outputs
--
PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    fasu_op_r1 <= fasu_op;
    fasu_op_r2 <= fasu_op_r1;
    IF exp_mul = X"ff" THEN
      inf_mul2 <= '1';
    ELSE
      inf_mul2 <= '0';
    END IF;
  END IF;
END PROCESS;

-- Force pre-set values for non numerical output
mul_inf <= '1' WHEN ((fpu_op_r3="010") and ((inf_mul_r or inf_mul2)='1') and (rmode_r3="00")) else '0';
div_inf <= '1' WHEN ((fpu_op_r3="011") and ((opb_00 or opa_inf)='1')) ELSE '0';
mul_00 <= '1' WHEN ((fpu_op_r3="010") and ((opa_00 or opb_00)='1')) ELSE '0';
div_00 <= '1' WHEN ((fpu_op_r3="011") and ((opa_00 or opb_inf)='1')) else '0';

out_fixed <= QNAN_VAL(30 DOWNTO 0) WHEN (((qnan_d OR snan_d) OR (ind_d AND NOT fasu_op_r2) OR
               ((NOT fpu_op_r3(2) AND fpu_op_r3(1) AND fpu_op_r3(0)) AND opb_00 AND opa_00) OR
               (((opa_inf AND opb_00) OR (opb_inf AND opa_00 )) AND
                (NOT fpu_op_r3(2) AND fpu_op_r3(1) AND NOT fpu_op_r3(0))) )='1')
             ELSE INF_VAL(30 DOWNTO 0);

PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    IF ( ((mul_inf='1') or (div_inf='1') or

          ((inf_d='1') and (fpu_op_r3/="011") and (fpu_op_r3/="101")) or
          (snan_d='1') or (qnan_d='1')) and (fpu_op_r3/="100")) THEN
      fpout(30 DOWNTO 0) <= out_fixed;
    ELSE
      fpout(30 DOWNTO 0) <= out_d;
    END IF;
  END IF;
END PROCESS;

out_d_00 <= NOT or_reduce(out_d);

sign_mul_final <= NOT sign_mul_r WHEN ((sign_exe_r AND ((opa_00 AND opb_inf) OR (opb_00 AND opa_inf)))='1')
                  ELSE sign_mul_r;
sign_div_final <= NOT sign_mul_r WHEN ((sign_exe_r and (opa_inf and opb_inf))='1')
                  ELSE (sign_mul_r or (opa_00 and opb_00));

PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    IF ((fpu_op_r3="101") and (out_d_00='1')) THEN
      fpout(31) <= (f2i_out_sign and not(qnan_d OR snan_d) );
    ELSIF ((fpu_op_r3="010") and ((snan_d or qnan_d)='0')) THEN
      fpout(31) <= sign_mul_final;
    ELSIF ((fpu_op_r3="011") and ((snan_d or qnan_d)='0')) THEN
      fpout(31) <= sign_div_final;
    ELSIF ((snan_d or qnan_d or ind_d) = '1') THEN
      fpout(31) <= nan_sign_d;
    ELSIF (output_zero_fasu = '1') THEN
      fpout(31) <= result_zero_sign_d;
    ELSE
      fpout(31) <= sign_fasu_r;
    END IF;
  END IF;
END PROCESS;

-- Exception Outputs
ine_mula <= ((inf_mul_r OR inf_mul2 OR opa_inf OR opb_inf) AND (NOT rmode_r3(1) AND rmode_r3(0)) and
             NOT ((opa_inf AND opb_00) OR (opb_inf AND opa_00 )) AND fpu_op_r3(1));
ine_mul <= (ine_mula OR ine_d OR inf_fmul OR out_d_00 OR overflow_d OR underflow_d) AND
           NOT opa_00 and NOT opb_00 and NOT (snan_d OR qnan_d OR inf_d);
ine_div <= (ine_d OR overflow_d OR underflow_d) AND NOT (opb_00 OR snan_d OR qnan_d OR inf_d);
ine_fasu <= (ine_d OR overflow_d OR underflow_d) AND NOT (snan_d OR qnan_d OR inf_d);

PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    IF fpu_op_r3(2) = '1' THEN
      ine <= ine_d;

    ELSIF fpu_op_r3(1) = '0' THEN
      ine <= ine_fasu;
    ELSIF fpu_op_r3(0)='1' THEN
      ine <= ine_div;
    ELSE
      ine <= ine_mul;
    END IF;
  END IF;
END PROCESS;

overflow_fasu <= overflow_d AND NOT (snan_d OR qnan_d OR inf_d);
overflow_fmul <= NOT inf_d AND (inf_mul_r OR inf_mul2 OR overflow_d) AND NOT (snan_d OR qnan_d);
overflow_fdiv <= (overflow_d AND NOT (opb_00 OR inf_d OR snan_d OR qnan_d));

PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    underflow_fmul_r <= underflow_fmul_d;
    IF fpu_op_r3(2) ='1' THEN
      overflow <= '0';
    ELSIF fpu_op_r3(1) = '0' THEN
      overflow <= overflow_fasu;
    ELSIF fpu_op_r3(0) = '1' THEN
      overflow <= overflow_fdiv;
    ELSE
      overflow <= overflow_fmul;
    END IF;
  END IF;
END PROCESS;

underflow_fmul1_p1 <= '1' WHEN (out_d(30 DOWNTO 23) = X"00") else '0';
underflow_fmul1_p2 <= '1' WHEN (out_d(22 DOWNTO 0) = ("000" & X"00000")) else '0';
underflow_fmul1_p3 <= '1' WHEN (prod/=conv_std_logic_vector(0,48)) else '0';
underflow_fmul1 <= underflow_fmul_r(0) or (underflow_fmul_r(1) and underflow_d ) or
                   ((opa_dn or opb_dn) and out_d_00 and (underflow_fmul1_p3) and sign) or
                   (underflow_fmul_r(2) AND ((underflow_fmul1_p1) or (underflow_fmul1_p2)));
underflow_fasu <= underflow_d AND NOT (inf_d or snan_d or qnan_d);
underflow_fmul <= underflow_fmul1 AND NOT (snan_d or qnan_d or inf_mul_r);
underflow_fdiv <= underflow_fasu AND NOT opb_00;

PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    IF fpu_op_r3(2) = '1' THEN
      underflow <= '0';
    ELSIF fpu_op_r3(1) = '0' THEN
      underflow <= underflow_fasu;
    ELSIF fpu_op_r3(0) = '1' THEN
      underflow <= underflow_fdiv;

    ELSE
      underflow <= underflow_fmul;
    END IF;
    snan <= snan_d;
  END IF;
END PROCESS;

-- Status Outputs
PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    IF fpu_op_r3(2)='1' THEN
      qnan <= '0';
    ELSE
      qnan <= snan_d OR qnan_d OR (ind_d AND NOT fasu_op_r2) OR
              (opa_00 AND opb_00 AND (NOT fpu_op_r3(2) AND fpu_op_r3(1) AND fpu_op_r3(0))) OR
              (((opa_inf AND opb_00) OR (opb_inf AND opa_00 )) AND
               (NOT fpu_op_r3(2) AND fpu_op_r3(1) AND NOT fpu_op_r3(0)));
    END IF;
  END IF;
END PROCESS;

inf_fmul <= (((inf_mul_r OR inf_mul2) AND (NOT rmode_r3(1) AND NOT rmode_r3(0))) OR opa_inf OR opb_inf) AND
            NOT ((opa_inf AND opb_00) OR (opb_inf AND opa_00)) AND
            (NOT fpu_op_r3(2) AND fpu_op_r3(1) AND NOT fpu_op_r3(0));

PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    IF fpu_op_r3(2) = '1' THEN
      inf <= '0';
    ELSE
      inf <= (NOT (qnan_d OR snan_d) AND
              (((and_reduce(out_d(30 DOWNTO 23))) AND NOT (or_reduce(out_d(22 downto 0))) AND
                NOT(opb_00 AND NOT fpu_op_r3(2) AND fpu_op_r3(1) AND fpu_op_r3(0))) OR
               (inf_d AND NOT (ind_d AND NOT fasu_op_r2) AND NOT fpu_op_r3(1)) OR
               inf_fmul OR
               (NOT opa_00 AND opb_00 AND NOT fpu_op_r3(2) AND fpu_op_r3(1) AND fpu_op_r3(0)) or
               (NOT fpu_op_r3(2) AND fpu_op_r3(1) AND fpu_op_r3(0) AND opa_inf AND NOT opb_inf) ) );
    END IF;
  END IF;
END PROCESS;

output_zero_fasu <= out_d_00 AND NOT (inf_d OR snan_d OR qnan_d);
output_zero_fdiv <= (div_00 OR (out_d_00 AND NOT opb_00)) AND NOT (opa_inf AND opb_inf) AND
                    NOT (opa_00 AND opb_00) AND NOT (qnan_d OR snan_d);
output_zero_fmul <= (out_d_00 OR opa_00 OR opb_00) AND
                    NOT (inf_mul_r OR inf_mul2 OR opa_inf OR opb_inf OR snan_d OR qnan_d) AND
                    NOT (opa_inf AND opb_00) AND NOT (opb_inf AND opa_00);

PROCESS (clk)
BEGIN
  IF clk'event AND clk = '1' THEN
    IF fpu_op_r3="101" THEN
      zero <= out_d_00 and NOT (snan_d or qnan_d);
    ELSIF fpu_op_r3="011" THEN
      zero <= output_zero_fdiv;
    ELSIF fpu_op_r3="010" THEN
      zero <= output_zero_fmul;
    ELSE
      zero <= output_zero_fasu;
    END IF;
    IF (opa_nan = '0') AND (fpu_op_r2="011") THEN
      opa_nan_r <= '1';
    ELSE
      opa_nan_r <= '0';
    END IF;
    div_by_zero <= opa_nan_r AND NOT opa_00 AND NOT opa_inf AND opb_00;
  END IF;
END PROCESS;

END arch;
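
The ELSIF chain that computes div_opa_ldz_d in the Divide section is a priority encoder over bits 22 down to 1 of fracta_mul. As an illustration only, the same mapping can be sketched as a loop. The package name ldz_sketch_pkg, the function name ldz_count and the testbench ldz_sketch_tb are hypothetical and not part of the design. The 24-bit operand width is an assumption inferred from the 50-bit concatenation feeding fdiv_opa, and ieee.numeric_std is used here in place of the conv_std_logic_vector calls in the listing. Unlike the original chain, this sketch does not propagate 'X' for metavalue inputs.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

-- Hypothetical helper package (not in the original design).
PACKAGE ldz_sketch_pkg IS
  -- Assumes a 24-bit fraction, as fracta_mul appears to be.
  FUNCTION ldz_count (fract : std_logic_vector(23 DOWNTO 0)) RETURN std_logic_vector;
END PACKAGE ldz_sketch_pkg;

PACKAGE BODY ldz_sketch_pkg IS
  FUNCTION ldz_count (fract : std_logic_vector(23 DOWNTO 0)) RETURN std_logic_vector IS
  BEGIN
    -- Scan bits 22 down to 1; the first '1' at bit i yields 23 - i,
    -- matching the values 1..22 assigned by the ELSIF chain.
    FOR i IN 22 DOWNTO 1 LOOP
      IF fract(i) = '1' THEN
        RETURN std_logic_vector(to_unsigned(23 - i, 5));
      END IF;
    END LOOP;
    -- Bits 22..1 are all clear: the chain assigns 23 in this case.
    RETURN std_logic_vector(to_unsigned(23, 5));
  END FUNCTION ldz_count;
END PACKAGE BODY ldz_sketch_pkg;

LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE work.ldz_sketch_pkg.ALL;

-- Self-checking testbench for the sketch (hypothetical).
ENTITY ldz_sketch_tb IS
END ENTITY ldz_sketch_tb;

ARCHITECTURE sim OF ldz_sketch_tb IS
BEGIN
  PROCESS
  BEGIN
    -- Bit 22 set: count is 1.
    ASSERT ldz_count(X"400000") = "00001" REPORT "bit 22 case failed" SEVERITY failure;
    -- Only bit 1 set: count is 22.
    ASSERT ldz_count(X"000002") = "10110" REPORT "bit 1 case failed" SEVERITY failure;
    -- Bits 22..1 clear (only bit 0 set): count is 23.
    ASSERT ldz_count(X"000001") = "10111" REPORT "all-clear case failed" SEVERITY failure;
    WAIT;
  END PROCESS;
END ARCHITECTURE sim;
```

If such a helper were adopted, the combinational process could reduce to a single assignment of ldz_count(fracta_mul) to div_opa_ldz_d. Whether a synthesis tool maps the loop as efficiently as the explicit chain would need to be checked against the target library.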