
An Algorithm for Trading Off Quantization Error with Hardware Resources for MATLAB-Based FPGA Design

Sanghamitra Roy and Prith Banerjee, Fellow, IEEE
Abstract: Most practical FPGA designs of digital signal processing (DSP) applications are limited to fixed-point arithmetic owing to the cost and complexity of floating-point hardware. While mapping DSP applications onto FPGAs, a DSP algorithm designer must determine the dynamic range and desired precision of the input, intermediate, and output signals in a design implementation. The first step in a MATLAB-based hardware design flow is the conversion of the floating-point MATLAB code into a fixed-point version using quantizers from the Filter Design and Analysis (FDA) Toolbox for MATLAB. This paper describes an approach to automate the conversion of floating-point MATLAB programs into fixed-point MATLAB programs for mapping to FPGAs by profiling the expected inputs to estimate errors. Our algorithm attempts to minimize the hardware resources while constraining the quantization error within a specified limit. Experimental results on five MATLAB benchmarks are reported for Xilinx Virtex II FPGAs.

Index Terms: Automation, field programmable gate arrays, fixed-point arithmetic, floating-point arithmetic, quantization.

1 INTRODUCTION

With the introduction of advanced Field-Programmable Gate Array (FPGA) architectures that provide built-in Digital Signal Processing (DSP) support, such as the embedded multipliers and block RAMs on the Xilinx Virtex-II and the multiply-accumulators, DSP Blocks, and MegaRAMs on the Altera Stratix [1], a new hardware alternative is available for DSP designers, who can achieve even higher levels of performance than those attainable on general-purpose DSP processors. DSP design has traditionally been divided into two types of activities: systems/algorithm development and hardware/software implementation. The majority of DSP system designers and algorithm developers use the MATLAB language for prototyping their DSP algorithms. Hardware design teams take the specifications created by the DSP engineers (in the form of fixed-point MATLAB code) and create a physical implementation of the DSP design by writing a register transfer level (RTL) model in a hardware description language (HDL) such as VHDL or Verilog.
Fig. 1 shows the FPGA design flow for MATLAB-based DSP algorithms. The left side of the figure shows the approach where a designer has to manually convert floating-point MATLAB code into fixed-point MATLAB code. Subsequently, the designer can use a tool such as the MATCH compiler or the AccelChip compiler [17] to translate the fixed-point MATLAB into RTL VHDL (this step used to be done manually by a hardware designer before tools such as the MATCH compiler or the AccelChip compiler were developed). This RTL implementation is synthesized onto a netlist of gates using a logic synthesis tool such as Synplify Pro. Subsequently, the netlist of gates is placed and routed onto the FPGA using vendor tools such as those from Xilinx. While the remaining steps in the flow are automated, and hence fast, the manual conversion of floating-point MATLAB code to fixed-point MATLAB code is very time consuming. The flow shown on the right of Fig. 1 shows the automatic conversion of the floating-point MATLAB code into fixed-point MATLAB code using the algorithm described in this paper.
The algorithms used by DSP systems are typically specified as floating-point DSP operations. On the other hand, most digital FPGA implementations of these algorithms rely solely on fixed-point approximations to reduce the cost of hardware while increasing throughput rates. The essential design step of floating-point to fixed-point conversion is not only time consuming, but also complicated due to the nonlinear characteristics of the problem and the massive design optimization space. In a bid to achieve short product cycles, the execution of floating- to fixed-point conversion is often left to hardware designers, who are familiar with VLSI constraints. Compared with algorithm designers, this group often has less insight into the algorithm and depends on ad hoc approaches to evaluate the implications of fixed-point representations. The gap between algorithm and hardware design is further aggravated as algorithms continue to become more complex. Thus, an automated method for floating- to fixed-point conversion is urgently called for.
This paper describes an approach to automate the conversion of floating-point MATLAB programs into fixed-point MATLAB programs for mapping to FPGAs by profiling the expected inputs to estimate errors. Our algorithm attempts to minimize the hardware resources while constraining the quantization error within a specified limit.

Fig. 1. Motivation for autoquantization.

It should be noted that our approach of using input profiling to obtain improved hardware synthesis is similar to approaches used in improving branch prediction in high-performance microarchitectures [27], approaches used in parallel computing to increase thread parallelism using data dependence analysis via value prediction [26], and approaches in the area of high-level power estimation and synthesis, where power consumption is estimated based on input switching activity [28].
The rest of the paper is organized as follows: Section 2 describes related work in the area of floating- to fixed-point conversion. A brief description of floating- and fixed-point modeling in MATLAB is provided in Section 3. Our algorithm for automatically determining the quantization parameters is presented in Section 4. We analyze the complexity of our algorithm in Section 5. We report experimental results on five MATLAB benchmark examples on a Xilinx Virtex-II FPGA in Section 6. Conclusions and ideas for future work are presented in Section 7.

2 PREVIOUS WORK

The strategies for floating-point to fixed-point conversion can be roughly categorized into two groups [16]. The first is an analytical approach originating with algorithm designers who analyze the finite word-length effects of fixed-point arithmetic. The other approach is based on bit-true simulation and originates with hardware designers.

The analytical approach started from attempts to model quantization error statistically; it was then extended to specific linear time-invariant (LTI) systems such as digital filters, the FFT, etc. Over the past three decades, numerous papers have been devoted to this approach [6], [7], [8], [9], [10], [16]. The bit-true simulation method has been used extensively in recent years [11], [12], [13]. Its potential benefit lies in its ability to handle non-LTI as well as LTI systems.
There has been some work in the recent literature on automated compiler techniques for converting floating-point representations to fixed-point representations [14], [15]. Babb et al. [18] have developed the BITWISE compiler, which determines the precision of all input, intermediate, and output signals in a synthesized hardware design from a C program description. Stephenson et al. [30] have done extensive bit-width analysis using forward and backward range propagation for their BITWISE compiler. Range propagation primarily helps in integral-length calculation. While we have used only forward range propagation in our algorithm to calculate the integral lengths of quantizers, our algorithm concentrates on the calculation of efficient precision (fraction) bits for the signals. Synopsys has a commercial tool called Cocentric [14], which tries to automatically convert floating-point computations to fixed-point within a C compilation framework, but does not produce synthesizable fixed-point designs. Constantinides et al. [23] have used mixed integer linear programming techniques to solve the error-constrained area optimization problem. Constantinides [24] has developed a design tool to tackle both linear and nonlinear designs.
Chang and Hauck [25] have developed a tool called PRECIS for precision analysis in MATLAB. This tool guides the developer in precision selection and requires manual intervention. Nayak et al. [20] have developed precision and error analysis techniques for MATLAB programs in the MATCH compiler project at Northwestern University. An algorithm for automating the conversion of floating-point MATLAB to fixed-point MATLAB was presented in [21] using the AccelFPGA compiler, but that approach required various user-defined default precisions for variables and constants when the compiler could not infer the precisions.
Our proposed approach is most closely related to that of Sung and Kum [11], who developed a simulation-based method to determine the optimum word lengths for DSP algorithms. This method [11] measures the performance of a fixed-point algorithm using simulation results and iteratively modifies the word length of a signal to find a set of optimum word lengths satisfying the fixed-point performance. They used signal grouping to reduce the number of simulations. They calculate the uniform word-length and minimum word-length configurations and, starting from the minimum word-length configuration, perform an exhaustive or heuristic search to obtain their optimized configuration of word lengths. In contrast, we follow a binary search approach to quickly arrive at a coarse optimal point. We start our fine optimization from this coarse optimal point, thus reducing our complexity to vary linearly with N, the number of quantizers, even without signal grouping. In addition, our method is well suited to MATLAB-based design and integrates readily with automated tools like AccelChip and Synplify Pro, providing a complete automated flow from floating-point MATLAB design to FPGA hardware in minutes. Our method also reduces execution time considerably by using training sets for optimizing a design.

3 FIXED-POINT AND FLOATING-POINT NUMBER REPRESENTATIONS IN MATLAB

3.1 Background
Numeric representation in digital hardware may be either fixed- or floating-point. In a fixed-point representation, the available bit width is divided and allocated to the integer part and the fractional part, with the leftmost bit reserved for the sign (2's complement). In contrast, a floating-point representation allocates one sign bit and a fixed number of bits to an exponent and a mantissa. In fixed-point, relatively efficient implementations of arithmetic operations are possible in hardware, whereas the floating-point representation needs to normalize the exponents of the operands for addition and subtraction. Synthesizing customized hardware for fixed-point arithmetic operations is therefore considerably more efficient than for their floating-point counterparts, both in terms of performance and resource usage.

3.2 Fixed-Point Modeling in MATLAB
Using MATLAB for design, modeling, and simulation means that the arithmetic is carried out in the floating-point domain with the constraints of a fixed-point representation. Fixed-point support is provided using the MATLAB quantization functionality that comes with the Filter Design and Analysis (FDA) Toolbox [3]. This support is provided in the form of a quantizer object and two methods or functions that come with this object, namely, quantizer() and quantize(). The quantizer() function is used to define the quantizer object, which specifies the bit widths to be used, whether the number is signed or unsigned, what kind of rounding is to be used, and whether overflows saturate or wrap. The quantize() function applies the quantizer object to numbers, which are inputs to and outputs from arithmetic operations. For example, a quantization model of type signed fixed-point, with 16 total bits comprising one sign bit, 7 integer bits, and 8 fractional bits, rounding toward minus infinity ('floor'), and handling overflow with saturation is defined as follows in MATLAB:

>> quant = quantizer('fixed', 'floor', 'saturate', [16 8]);
>> Xq = quantize(quant, X);
This quantizer object is used to quantize an arbitrary numerical value X (which may be a scalar or a multidimensional vector), as shown above. The resulting number Xq has a double floating-point representation in MATLAB, but can be exactly represented as a 16-bit fixed-point signed number with 7 integer and 8 fractional bits. We will use a rigorous mathematical way of representing quantization errors, based on the notion of an error norm, in our automatic quantization algorithm described in the next section.
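For example, applying this quantizer to the constant pi illustrates the 'floor' rounding mode (our worked example, reusing the quant object defined above):

>> Xq = quantize(quant, pi)   % floor(pi*2^8)/2^8 = 804/256 = 3.140625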

4 ALGORITHM FOR AUTOQUANTIZATION

Our algorithm attempts to minimize the hardware resources while constraining the quantization error within a specified limit, which depends on the requirements of the user or application. We try to minimize the total bit precision of all quantizers, which is taken as a representative measure of the hardware resources. The autoquantization algorithm consists of the following passes, which are explained in detail in the next few sections:
. levelization,
. scalarization,
. computation of value ranges of variables,
. generation of fixed-point MATLAB code,
. autoquantization.
We consider the following small MATLAB code segment to explain each step of our algorithm:
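A sketch of such a segment, assuming the three inputs are read with load() from the input files used in Section 4.3 (the load() calls are our reconstruction; the assignment statement is the one analyzed in Section 4.1):

b = load('inputfile1');
c = load('inputfile2');
d = load('inputfile3');
a(1:100) = b(1:100) + c(1:2:200).*d(1:100);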

4.1 Levelization
The levelization pass takes a MATLAB assignment statement consisting of complex expressions on the right-hand side and converts it into a set of statements, each of which is in single-operator form. For example, the statement

a(1:100) = b(1:100) + c(1:2:200).*d(1:100);

in the example code segment will be converted into two statements.
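In single-operator form, with temp1 as the compiler-introduced temporary (the listing is reconstructed; temp1 is the variable name used in Section 4.3), these are:

temp1(1:100) = c(1:2:200).*d(1:100);
a(1:100) = b(1:100) + temp1(1:100);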

The reason for this pass is to make sure that the resulting synthesized output will be bit-true with respect to the original quantized MATLAB statement.

4.2 Scalarization
The scalarization pass takes a vectorized MATLAB statement and converts it into scalar form using enclosing FOR loops. The levelized code segment shown above will be converted to the following:
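A sketch of the scalarized loops (reconstructed from the levelized statements above; the loop index i is ours):

for i = 1:100
    temp1(i) = c(2*i-1)*d(i);
end
for i = 1:100
    a(i) = b(i) + temp1(i);
end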


We perform the above scalarization pass because we want to control the exact order in which the quantization errors accumulate in the case of a vectorized statement with accumulation. This makes sure that the resulting synthesized output will be bit-true with respect to the original quantized MATLAB statement. This is important for loop-carried dependencies, as in the following example:
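A minimal sketch of such an accumulation, summing the ten elements of a vector a (the variable names are illustrative):

s = 0;
for i = 1:10
    s = s + a(i);
end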

The quantization error depends on the order of accumulation and will differ, for example, between the orders ((((a(1)+a(2))+a(3))+a(4))+...+a(10)), (a(1)+a(2))+(a(3)+a(4))+...+(a(9)+a(10)), and ((((a(10)+a(9))+a(8))+a(7))+...+a(1)); scalarization specifies the exact order of accumulation.

4.3 Computation of Value Ranges of Variables
The third pass takes a levelized and scalarized MATLAB program, computes the value range of each variable in the program based on the actual inputs, and propagates the value ranges in the forward direction. The input variables are assigned MAX and MIN values from the maximum and minimum values in the input files. For all other variables, the pass sets the MAX value to INFINITY and the MIN value to -INFINITY. It then performs forward propagation of values. For an assignment, this pass calculates the value range of the RHS variables and propagates it to the value range of the LHS variable if the existing value range is +/-INFINITY, or if the existing MAX is less than the new MAX and/or the existing MIN is greater than the new MIN. A limitation of the forward propagation approach is that it gets stuck at +/-INFINITY bounds in the case of loop-carried dependencies. Let us assume that the input files have the following ranges:

inputfile1: Min = 0, Max = 100
inputfile2: Min = 10, Max = 500
inputfile3: Min = 0, Max = 200

After forward propagation, the ranges assigned to each variable are as follows. Variables b, c, and d get their ranges from the corresponding input files, and we perform forward propagation of ranges through the two for loops to find the ranges of temp1 and a. Hence, the ranges are given below:

b: Min = 0, Max = 100; c: Min = 10, Max = 500;
d: Min = 0, Max = 200; temp1: Min = 0, Max = 100000;
a: Min = 0, Max = 100100.

(The range of temp1 follows from the product c.*d, and that of a from the sum b + temp1.)

4.4 Generation of Fixed-Point MATLAB Code
We next convert the levelized, scalarized floating-point MATLAB program into a fixed-point MATLAB program by replacing each arithmetic and assignment operation with a quantized computation. Each quantizer is set to a default maximum precision, quantizer('fixed', 'floor', 'wrap', [40 32]), with 40 bits of representation: one sign bit, 7 integer bits, and 32 fraction bits. In later passes, the Fix_Integral_Range, Coarse_Optimize, and Fine_Optimize algorithms described in Section 4.5 set this quantizer to quantizer('fixed', 'floor', 'wrap', [mi + pi, pi]). After generating fixed-point code with default quantizers, the following code segment is obtained:
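A sketch of the generated fixed-point form of the first loop, assuming each operand and each operation result passes through the default quantizer (the exact placement of the quantize() calls is our reconstruction):

quant = quantizer('fixed', 'floor', 'wrap', [40 32]);
for i = 1:100
    temp1(i) = quantize(quant, quantize(quant, c(2*i-1)) * quantize(quant, d(i)));
end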

4.5 Autoquantization
We now execute our autoquantization algorithm using profiling of the input vectors, as shown in Fig. 2. For each benchmark and input set, we calculate the error vector e, which is the difference between the output vectors of the original floating-point code and of the fixed-point MATLAB code obtained in Section 4.4, using actual MATLAB-based floating-point and fixed-point simulation. Assuming that outdata is the output vector of the MATLAB code (a in our example),

e = outdata_float - outdata_fixed.

We next define an Error Metric (EM) as

EM% = (norm(e) / norm(outdata_float)) * 100,

where the vector norm used is the p-norm

||x||_p = (sum over i of |x_i|^p)^(1/p).

We have used p = 2 for computing the norm. We have assumed here that there is no floating-point quantization error; in other words, the MATLAB floating-point design is considered to be accurate.
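In MATLAB, with p = 2, the metric reduces to a one-line computation (a sketch; outdata_float and outdata_fixed denote the two simulation outputs defined above):

EM = 100 * norm(outdata_float - outdata_fixed) / norm(outdata_float);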
We define the following terms that are used in the algorithm:

. Mfloat: the floating-point MATLAB code.
. Mfixed: the fixed-point MATLAB code.
. qi: the ith quantizer, characterized by a [wordlength fractionlength] pair of [mi + pi, pi], where pi is the precision (fraction length) and mi is the integral length of the quantizer, including the sign bit.
. n: the total number of quantizers in Mfixed that are generated for assignment/arithmetic operations.

Fig. 2. Overview of autoquantization algorithm.

Autoquantize(EMmax, Mfloat, Inputs)
{
  Levelize(Mfloat)
  Scalarize(Mfloat)
  Compute_Ranges(Mfloat, Input files)
  Generate Mfixed(Mfloat)
  Fix_Integral_Range(list of quantizers in Mfixed, max range of each quantizer)
  Coarse_Optimize(EMmax, Mfloat, Mfixed, list of quantizers qi with known mi)
  Fine_Optimize(EMmax, Mfloat, Mfixed, list of coarse optimized quantizers qi, EMcoarse)
} End Autoquantize
The procedure Fix_Integral_Range determines the mi values for each qi. The procedure Coarse_Optimize determines the coarse optimized values pi for each qi, and the procedure Fine_Optimize refines these pi values. The mi values of all qi are kept constant during Coarse and Fine Optimization. The three procedures are described below:
Fix_Integral_Range(list of qi, list of maxi)
{
  for each quantizer qi
    maxi = max(abs(Maxi), abs(Mini))
    mi = floor(log2(maxi)) + 2
} End Fix_Integral_Range
Using the above algorithm, we obtain the following integral lengths in our example:
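Applying mi = floor(log2(maxi)) + 2 to the ranges computed in Section 4.3 (our worked arithmetic):

mb = floor(log2(100)) + 2 = 8
mc = floor(log2(500)) + 2 = 10
md = floor(log2(200)) + 2 = 9
mtemp1 = floor(log2(100000)) + 2 = 18
ma = floor(log2(100100)) + 2 = 18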

The fractional lengths of the quantizers representing loop indices and flags are set to zero; we do not apply the coarse and fine optimization algorithms to these quantizers.

The Coarse_Optimize phase of the autoquantization algorithm follows a divide-and-conquer approach to move quickly toward the optimal set of quantizers; we call the resulting point the coarse optimal point. We vary the precisions to obtain a set of coarse optimized values pi, which are all equal.
Coarse_Optimize(EMmax, Mfloat, Mfixed, list of qi)
{
. Start with the lowest precision quantizers (pi = 0 bits for all i) and mark this point L; mark the highest precision quantizers (pi = 32 bits for all i) H. Denote by M the precision midway between H and L (integral lengths are kept the same).
. Calculate the EM values for L, H, and M.
. If EM(M) < EMmax, we now search in the interval {L, M}; thus, we set H = M. If EM(M) > EMmax, we search in the interval {M, H}; thus, we set L = M.
. We repeat the above two steps until we reach the coarse optimal quantizer value M satisfying EM(M) < EMmax, 0 <= M - L <= 1, and 0 <= H - M <= 1. We call the EM at this point EMcoarse.
} End Coarse_Optimize
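A compact sketch of this binary search, assuming a helper error_metric(p) that simulates Mfixed with every quantizer precision set to p and returns the resulting EM% (the helper name and the variable EM_max are our own):

L = 0;  H = 32;                 % lowest and highest uniform precisions
while (H - L) > 1
    M = floor((L + H)/2);       % midpoint precision
    if error_metric(M) < EM_max
        H = M;                  % feasible: search lower precisions {L, M}
    else
        L = M;                  % infeasible: search higher precisions {M, H}
    end
end
p_coarse = H;                   % smallest uniform precision meeting EM_max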
Fig. 3. Overview of coarse optimization algorithm.

Fig. 3 illustrates the coarse optimization phase. In our example, with an EMmax of 0.1 percent, Coarse Optimization yields a set of equal precisions with EMcoarse = 0.0773 percent. We assume that, after coarse optimization, we get quantizers q1, q2, ..., qn with precisions p1, p2, ..., pn (which are all equal).

The accuracy of the output of an algorithm is less sensitive to some internal variables than to others [29]. Hence, starting from the coarse optimal point, we apply our Fine_Optimize algorithm to the coarse optimized quantizers. We do not perform the finer optimization directly from the start because it would take a long time to converge.
The Fine_Optimize algorithm tries to identify quantizers whose impact on the quantization error is smallest and to reduce precision bits in such quantizers as long as the error remains within the EM constraint. It then finds a quantizer whose impact on the error is largest and adds one bit of precision to it while simultaneously removing two bits from a quantizer with the smallest impact on the error, thereby saving one bit overall. Such bit reductions are performed iteratively until the EM constraint would be exceeded. Thus, we attain our fine optimal point for the given EMmax.
Fine_Optimize(EMmax, Mfloat, Mfixed, list of coarse optimized qi, EMcoarse)
{
. Set EMfine = EMcoarse.
. Let EMi = EM for quantizers q1, q2, ..., qi, ..., qn with precisions p1, p2, ..., pi + 1, ..., pn.
. Calculate an array DEM, where DEM(i) = EMfine - EMi,
for 1 <= i <= n (the ith element of DEM records the impact on the total error of increasing the precision of the ith quantizer by 1 bit).
. Choose the smallest element (say, index j) of array DEM for which pj > 0 (if pi = 0 for all i, terminate the algorithm, as we cannot further optimize our resources by reducing precision bits).
. Calculate EMj = EM for quantizers q1, q2, ..., qj, ..., qn with precisions p1, p2, ..., pj - 1, ..., pn.

. If EMj < EMmax, then EMfine = EMj, the quantizer precisions become p1, p2, ..., pj - 1, ..., pn, and we iteratively repeat the above three steps. If EMj > EMmax, we move to the next step.
. Calculate DEM using the same definition as above. Find the j for which DEM(j) is maximum. Also find the smallest element DEM(k) for which pk > 1 (if no such element is found, terminate the algorithm).
. Calculate EMj,k = EM for quantizers q1, ..., qj, ..., qk, ..., qn with precisions p1, ..., pj + 1, ..., pk - 2, ..., pn.
. If EMj,k < EMmax, then EMfine = EMj,k, pj and pk are changed to pj + 1 and pk - 2, and we iteratively repeat the above two steps. If EMj,k > EMmax, we have found our fine optimal quantizer values.
} End Fine_Optimize

TABLE 1a
Results of Autoquantization Algorithm versus Manual, Infinite Precision, and AccelChip Default Quantizers on Five MATLAB Benchmarks for a Training Set of 5 Percent and Testing Set of 95 Percent, Factor of Safety = 10%

TABLE 1b
Table 1, continued
Starting with the coarse optimized quantizers and EMcoarse = 0.0773 percent as obtained before, running our Fine_Optimize algorithm yields our fine optimized set of quantizers, with EMfine = 0.0793 percent, using a 10 percent factor of safety.
One limitation of the algorithm is that the objective function being minimized is the cumulative bit-width of all units; it does not selectively minimize the bit-widths of units with a lower area or cost. Cumulative bit-width is related to area and, hence, to hardware resources, although the impact of bit-width on area also depends on the functional unit. We chose the total bit-width as our objective function to keep the algorithm complexity low.

4.6 Use of Training Sets
The autoquantization algorithm can be used for various types of applications and different types of inputs. When using our algorithm for a specific system, we use a small percentage of the actual inputs to drive the algorithm, which then produces the optimized set of quantizers for this sample input. This is known as training the system on sample inputs. Once we have developed quantized hardware with our algorithm, we use a larger set of the actual inputs to test the system and verify that the actual quantization error is within the acceptable limit. We propose two approaches to the use of training sets. First, we can choose a small fixed number of inputs (say, 100 input vectors) for training and use the rest for evaluation. The second approach is to pick a small fraction of the inputs (say, 5 percent of the total number of inputs) for training and the rest for evaluation.

While training the system, we can apply a factor of safety to the EM constraint. For example, if the EM constraint of the actual system is 5 percent and we use a factor of safety of 10 percent, then the EM constraint fed to the autoquantization algorithm is 5 x 0.90 = 4.5 percent. For a system in which a small sample closely resembles the input signal, the factor of safety can be kept very low or at zero. Likewise, the percentage of actual inputs used for training can vary across applications. But, since we train the system with a very small percentage of the actual inputs, this process is fast in comparison with training on 100 percent of the inputs.

5 COMPLEXITY ANALYSIS

Let us assume that a given fixed-point MATLAB program takes E time steps to execute for a particular set of quantizations of its variables and a single set of inputs. Let I be the total number of input vectors in a given MATLAB execution, M be the maximum precision width of the quantizers, and N be the total number of quantizers in the fixed-point MATLAB code. For I input vectors, assuming that we use a training set of 5 percent of I to determine the actual error metric, the execution time for determining the error metric for a particular MATLAB execution is 0.05 x E x I = O(E x I). Our proposed Coarse Optimization algorithm uses a divide-and-conquer approach, starting from the maximum precision M for all the quantizers and halving the precision search interval at each stage. This algorithm takes at most log M iterations, and at each iteration it performs a single pass of a MATLAB execution of the application to determine the error metric. The second stage of our algorithm, Fine Optimization, changes individual quantizers one bit at a time to determine their final values. Since there are N quantizers in the application, at most N bit decrements are possible from the coarse optimal point. Hence, the Fine Optimization algorithm takes at most N iterations.
Hence, the two-stage coarse + fine optimization has a complexity of O(E x I x (log M + N)). If we start with M-bit precision for all the quantizers and perform Fine Optimization alone, then, in the worst case, we can have up to M bit decrements for each of the N quantizers. Hence, single-stage Fine Optimization will execute the MATLAB code O(M x N) times, and the total execution time complexity of the algorithm will be O(E x I x M x N).
We have therefore reduced the complexity of the autoquantization algorithm from O(E x I x M x N) for a naive fine-optimization-only approach to O(E x I x (log M + N)) for the two-stage Coarse and Fine Optimization approach; for instance, with hypothetical values M = 32 and N = 50, this is roughly log2(32) + 50 = 55 error-metric evaluations instead of 32 x 50 = 1600. Further, if we use a constant number of inputs to train our system, the complexity becomes O(E x (log M + N)).

TABLE 2
Results of AccelChip 2003_3.1.72 and Synplify Pro on Xilinx Virtex II XC2V250 Device for Quantizers Selected in Table 1 with a Requested Frequency of 10 MHz

TABLE 3
Comparison of Run Times for Autoquantization Algorithm (Two-Stage Coarse + Fine Optimization) versus Only Fine Optimization

TABLE 4
Comparison of Run Times for Autoquantization Algorithm Using 5 Percent Training Inputs versus 100 Percent Inputs

6 EXPERIMENTAL RESULTS

We now report experimental results on five benchmark MATLAB programs:

. A 16-tap Finite Impulse Response filter (fir).
. A Decimation-in-time FIR filter (dec).
. An Infinite Impulse Response filter of type DF1 (iir).
. An Interpolation FIR filter (int).
. A 64-point Fast Fourier Transform (fft).
The MATLAB programs vary in size from 20 lines to 175 lines. We ran MATLAB 6.5 to simulate the fixed- and floating-point MATLAB codes and the AUTOQUANTIZE algorithm. All experiments were measured on a Dell Latitude C840 laptop with a 1.60 GHz Pentium IV CPU, 524 MB RAM, and an 18 GB hard drive running Windows 2000. We took measurements of the quantizers and EM% corresponding to the training and testing inputs. We used randomly generated normalized inputs of size 2,048, in the range (0, 1), for our measurements, with 5 percent of the inputs used to train and 95 percent used to test the system. Subsequently, we used the AccelChip 2003_3.1.72 tool from AccelChip [17] to generate RTL VHDL automatically from our MATLAB benchmarks. Finally, we used the Synplify Pro 7.2 logic synthesis tool from Synplicity [22] to map the designs onto a Xilinx Virtex II XC2V250 device.
The autoquantization algorithm described above has been applied to the five MATLAB benchmarks. Table 1 shows the quantizers selected by the Autoquantize algorithm for EM constraints of 1 percent, 2 percent, and 5 percent, along with the actual EM values achieved by those quantizers on the training and testing inputs. The AccelChip 2003_3.1.72 tool also provides an autoquantization step, and we compared our autoquantization algorithm with the AccelChip quantizer. Some hardware designers convert floating-point MATLAB programs to fixed-point manually by following a simple heuristic: they fix the input quantizers at a wordlength of 8 or 16 bits; any variable obtained by addition/subtraction of these quantizers is given the same wordlength, and a variable obtained by the multiplication of two quantizers with wordlengths a and b is given a wordlength of (a + b). For example, under this heuristic the sum of two 16-bit inputs stays at 16 bits, while their product becomes 32 bits wide. We have provided results for maximum precision (32-bit) quantizers, manually selected quantizers using this simple heuristic, and the AccelChip default quantizers.
We have also performed an estimation of the hardware resources required for each set of quantizers. Table 2 shows the results for the Xilinx Virtex II device. Results are given in terms of the resources used and the performance obtained as
estimated by the Synplify Pro 7.2 tool. The resource results are reported in terms of the LUTs, multiplexers, and embedded multipliers used. Performance was measured as the internal clock frequency of the design inferred by the Synplify Pro 7.2 tool. We have also measured run times. Table 3 compares the run times of the two-step COARSE + FINE optimization against FINE optimization alone, for randomly generated 5 percent training inputs. Table 4 compares the run times of the autoquantization algorithm using 5 percent training inputs versus 100 percent of the inputs. These two tables explain our motivation for using a training set and two-stage optimization rather than fine optimization alone.
Fig. 4. Area versus error trade-off: Plot of normalized LUTs versus EM% for the five benchmarks.

Fig. 4 shows the plots of normalized LUTs versus EM% for the benchmarks, illustrating the trade-offs that can be obtained between the quantization error and the area needed.


7 CONCLUSIONS

Most practical FPGA designs are limited to finite-precision signal processing using fixed-point arithmetic because of the cost and complexity of floating-point hardware. Hence, the DSP algorithm designer must determine the dynamic range and desired precision of the input, intermediate, and output signals in a design implementation to ensure that the algorithm fidelity criteria are met. The first step in a flow that maps MATLAB applications into hardware is the conversion of the floating-point MATLAB algorithm into a fixed-point version using quantizers from the Filter Design and Analysis (FDA) Toolbox for MATLAB. This paper has described how the floating-point computations in MATLAB are automatically converted to fixed-point computations of specific precision for hardware design based on profiling the inputs. Experimental results were reported on a set of five MATLAB benchmarks mapped onto Xilinx Virtex II FPGAs.

Currently, the process is semiautomated. Future directions will aim at full automation of the process and at designing improved search algorithms for the finer optimization. The ideas introduced in this paper can also be extended to other programming languages such as SystemC and DSP-C; developing autoquantization techniques for these languages will be an interesting problem.

ACKNOWLEDGMENTS
This research was supported by the US Defense Advanced Research Projects Agency under contract no. F33615-01-C-1631. This work was done while the authors were at the Electrical and Computer Engineering Department of Northwestern University. A preliminary version of this paper was published in the Proceedings of the 41st Design Automation Conference, 2004.

REFERENCES
[1] Altera, "Stratix Datasheet," www.altera.com, Sept. 2004.
[2] M. Haldar, A. Nayak, A. Choudhary, and P. Banerjee, "A System for Synthesizing Optimized FPGA Hardware from MATLAB," Proc. Int'l Conf. Computer-Aided Design, Nov. 2001; see also www.ece.northwestern.edu/cpdc/Match/Match.html.
[3] Mathworks Corp., "MATLAB Technical Computing Environment," www.mathworks.com, Jan. 2003.

[4] Synplicity, "Synplify Pro Datasheet," www.synplicity.com, Nov. 2002.
[5] Xilinx, "Virtex II Datasheet," www.xilinx.com, July 2001.
[6] L.B. Jackson, "On the Interaction of Roundoff Noise and Dynamic Range in Digital Filters," Bell System Technical J., pp. 159-183, Feb. 1970.
[7] A.V. Oppenheim, R.W. Schafer, and J.R. Buck, Discrete-Time Signal Processing, second ed. Prentice Hall, 1999.
[8] K.H. Chang and W.G. Bliss, "Finite Word-Length Effects of Pipelined Recursive Digital Filters," IEEE Trans. Signal Processing, vol. 42, no. 8, pp. 1983-1995, Aug. 1994.
[9] L.B. Jackson, K.H. Chang, and W.G. Bliss, "Comments on 'Finite Word-Length Effects of Pipelined Recursive Digital Filters'," IEEE Trans. Signal Processing, vol. 43, no. 12, pp. 3031-3032, Dec. 1995.
[10] R.M. Gray and D.L. Neuhoff, "Quantization," IEEE Trans. Information Theory, vol. 44, no. 6, pp. 2325-2383, Oct. 1998.
[11] W. Sung and K.I. Kum, "Simulation-Based Word-Length Optimization Method for Fixed-Point Digital Signal Processing Systems," IEEE Trans. Signal Processing, vol. 43, no. 12, Dec. 1995.
[12] S. Kim and W. Sung, "Fixed-Point Error Analysis and Word-Length Optimization of a Distributed Arithmetic Based 8x8 2D-IDCT Architecture," Proc. Workshop VLSI Signal Processing, IX, pp. 398-407, 1996.
[13] H. Keding, M. Willems, M. Coors, and H. Meyr, "FRIDGE: A Fixed-Point Design and Simulation Environment," Proc. Design, Automation, and Test in Europe, pp. 429-435, 1998.
[14] Synopsys, "Cocentric System," http://www.synopsys.com, July 2003.
[15] L. De Coster, M. Ade, R. Lauwereins, and J. Peperstraete, "Code Generation for Compiled Bit-True Simulation of DSP Applications," Proc. 11th Int'l Symp. System Synthesis, pp. 9-14, 1998.
[16] C. Shi, "Statistical Method for Floating-Point to Fixed-Point Conversion," MS thesis, Electrical Eng. and Computer Science Dept., Univ. of California, Berkeley, 2002.
[17] P. Banerjee, M. Haldar, A. Nayak, V. Kim, V. Saxena, R. Anderson, and J. Uribe, "AccelFPGA: A DSP Design Tool for Making Area-Delay Tradeoffs while Mapping MATLAB Programs onto FPGAs," Proc. Int'l Signal Processing Conf. (ISPC) and Global Signal Processing Expo (GSPx), 2003.
[18] J. Babb, M. Rinard, C.A. Moritz, W. Lee, M. Frank, R. Barua, and A. Amarasinghe, "Parallelizing Applications into Silicon," Proc. FPGA-Based Custom Computing Machines (FCCM), Apr. 1999.
[19] D. Brooks and M. Martonosi, "Dynamically Exploiting Narrow Width Operands to Improve Processor Power and Performance," Proc. High Performance Computer Architecture (HPCA), Jan. 1999.
[20] A. Nayak, M. Haldar, A. Choudhary, and P. Banerjee, "Precision and Error Analysis of MATLAB Applications during Automated Hardware Synthesis for FPGAs," Proc. Design Automation and Test in Europe (DATE 2001), Mar. 2001.
[21] P. Banerjee, D. Bagchi, M. Haldar, A. Nayak, V. Kim, and R. Uribe, "Automatic Conversion of Floating Point MATLAB Programs into Fixed Point FPGA Based Hardware Design," Proc. FPGA-Based Custom Computing Machines (FCCM), 2003.
[22] Synplicity Corp., "Synplify Pro 7.2 Datasheet," www.synplicity.com, Nov. 2002.
[23] G.A. Constantinides, P.Y.K. Cheung, and W. Luk, "Optimum Wordlength Allocation," Proc. FPGA-Based Custom Computing Machines (FCCM), 2002.
[24] G.A. Constantinides, "Perturbation Analysis for Word-Length Optimization," Proc. FPGA-Based Custom Computing Machines (FCCM), 2003.
[25] M.L. Chang and S. Hauck, "Precis: A Design-Time Precision Analysis Tool," Proc. FPGA-Based Custom Computing Machines (FCCM), 2002.
[26] L. Codrescu and S. Wills, "Profiling for Input Predictable Threads," Proc. Int'l Conf. Computer Design (ICCD-98), Oct. 1998.
[27] R.B. Hildendorf, G.J. Heim, and W. Rosenstiel, "Evaluation of Branch Prediction Using Traces for Commercial Applications," IBM J. Research and Development, vol. 43, no. 4, 1999.
[28] S. Gupta and F.N. Najm, "Power Macromodeling for High Level Power Estimation," Proc. 34th Design Automation Conf., pp. 365-370, June 1997.
[29] G.A. Constantinides, P.Y.K. Cheung, and W. Luk, "The Multiple Wordlength Paradigm," Proc. IEEE Symp. Field-Programmable Custom Computing Machines (FCCM), 2001.
[30] M. Stephenson, J. Babb, and S. Amarasinghe, "Bitwidth Analysis with Application to Silicon Compilation," Proc. SIGPLAN Conf. Programming Language Design and Implementation, June 2000.

[31] K.-I. Kum and W. Sung, "Combined Word-Length Optimization and High-Level Synthesis of Digital Signal Processing Systems," IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, Aug. 2001.
[32] C. Shi and R.W. Brodersen, "Automated Fixed-Point Data-Type Optimization Tool for Signal Processing and Communication Systems," Proc. 41st Design Automation Conf., June 2004.
[33] S. Roy and P. Banerjee, "An Algorithm for Converting Floating-Point Computations to Fixed-Point in MATLAB Based FPGA Design," Proc. 41st Design Automation Conf., 2004.
[34] Alta Group, "Fixed-Point Optimizer User's Guide," Cadence Design Systems, Inc., Sunnyvale, Calif., Aug. 1994.
Sanghamitra Roy is a PhD candidate in the Electrical and Computer Engineering Department of the University of Wisconsin-Madison. She received the MS degree in computer engineering in December 2003 from the Electrical and Computer Engineering Department of Northwestern University, where she was the recipient of the Walter P. Murphy fellowship. She received the BS degree in electrical engineering from Jadavpur University, India, where she was awarded the University Gold Medal for graduating at the top of her class in electrical engineering. Her research interests are in computer-aided design, precision analysis, autoquantization, VLSI circuit design, and optimization.


Prith Banerjee received the BTech degree in electronics and electrical engineering from the Indian Institute of Technology, Kharagpur, India, in August 1981, and the MS and PhD degrees in electrical engineering from the University of Illinois at Urbana-Champaign in December 1982 and December 1984. He is dean of the College of Engineering at the University of Illinois at Chicago (UIC) and a UIC Distinguished Professor. Previously, Dr. Banerjee was the Walter P. Murphy Professor and chairman of electrical and computer engineering at Northwestern University. Prior to that, he was the director of the Computational Science and Engineering program and a professor of electrical and computer engineering at the University of Illinois at Urbana-Champaign. In 2000, he founded AccelChip, Inc., a developer of products and services for electronic design automation, and he served as president and CEO of AccelChip while on leave from Northwestern from 2000 to 2002. In addition, he has been on the technical advisory boards of many companies, such as Atrenta, Calypto Design Systems, and Ambit Design Systems. His research interests are in VLSI computer-aided design, parallel computing, and compilers; he is the author of about 300 research papers in these areas and of a book entitled Parallel Algorithms for VLSI CAD. He has supervised 33 PhD and 39 MS student theses thus far. He has served as program chair, general chair, and program committee member for more than 50 conferences in the past 20 years and has served as an associate editor of four journals. He has received numerous awards and honors during his career. He received the IEEE Taylor L. Booth Education Award from the IEEE Computer Society in 2001. He became a fellow of the ACM in 2000. He was the recipient of the 1996 Frederick Emmons Terman Award of ASEE's Electrical Engineering Division. He was elected to the fellow grade of the IEEE in 1995. He received the University Scholar award from the University of Illinois in 1993, the Senior Xerox Research Award in 1992, the US National Science Foundation's Presidential Young Investigator Award in 1987, the IBM Young Faculty Development Award in 1986, and the President of India Gold Medal from the Indian Institute of Technology, Kharagpur, in 1981.
