A Floating-point to Fixed-point C Converter for Fixed-point Digital Signal Processors
Ki-Il Kum, Jiyang Kang, and Wonyong Sung
Abstract
An automatic scaling C program translator is developed for the efficient execution of application programs on
fixed-point digital signal processors. The program for range estimation is generated automatically by inserting
code that collects the statistics of each signal during the simulation. With the range information, the number of
shifts needed for the scaling is determined, and the floating-point program is converted to a fixed-point program
by type modification, insertion of scaling code, and the use of fixed-point multiplications, which are important for
preserving fixed-point accuracy. The SUIF (Stanford University Intermediate Format) system is used for parsing
the original program, inserting the range estimation code, and generating the translated code. A fourth order
IIR filter and an ADPCM codec are implemented on Texas Instruments' TMS320C5x as examples. The fixed-point
C codes are 5 to 14 times faster than the floating-point programs, with acceptable finite word-length
effects in these examples.

I. Introduction
The reduction of development time by employing a high level language is much needed for the programming
of digital signal processors. In particular, C compilers for floating-point digital signal processors are
gaining acceptance because of the shortened development time and the improved compiler efficiency. However,
C compilers for fixed-point digital signal processors have met with little acceptance, mainly because of the
overhead of executing floating-point operations on a fixed-point data path. The development of fixed-point
programs requires scaling of variables to prevent overflows while maintaining accuracy [1][2]. However,
determining the number of shifts is usually considered one of the most tedious parts of using fixed-point
signal processors.
To solve these problems, a program that automatically converts a floating-point C program into a
fixed-point version is developed for fixed-point digital signal processors. In the proposed development
environment, programmers develop a C application program using floating-point arithmetic, and the floating-point
program is then converted automatically to a fixed-point version.
First, the ranges of the variables in the floating-point program are estimated by simulating a
modified floating-point program that collects statistical information on the variables. The range of each
variable is determined from the mean and the standard deviation, and is used for assigning the location
of the binary point of the corresponding fixed-point variable. In the conversion step, all of the floating-point variables
are converted to fixed-point type, and the expressions are modified by inserting scaling code and
replacing the integer multiplications. The integer multiplication of the ANSI standard C language poses a
problem for implementing fixed-point arithmetic because it keeps only the lower half of the product,
while fixed-point arithmetic needs the upper half to preserve accuracy.
II. Fixed-point data format and arithmetic rules
For the representation of fixed-point data, we employ the generalized fixed-point format [1], which allows an arbitrary
binary point location using the following attributes:

$$\langle \text{word length},\ \text{integer word length} \rangle \tag{1}$$

Here, the word-length (WL) is the total number of bits used to represent a fixed-point variable. The
integer word-length (IWL) is the number of bits to the left of the (hypothetical) binary point, while the
number of bits to the right of the binary point is called the fractional word-length (FWL). With the sign bit
counted separately, WL = 1 + IWL + FWL. Since each signal
can have a different range, a separate integer word-length can be assigned to each variable.
Note that the range (R) and the quantization step (Q) depend on the integer word-length as follows.

$$-2^{\mathrm{IWL}} \le R < 2^{\mathrm{IWL}} \tag{2}$$

$$Q = 2^{-\mathrm{FWL}} = 2^{-(\mathrm{WL} - 1 - \mathrm{IWL})} \tag{3}$$

For example, a 16-bit integer (WL = 16) has the binary point just to the right of the least significant bit
(IWL = 15), and thus has the range $(-2^{15}, 2^{15})$. By contrast, a 16-bit fixed-point number with an
IWL of 4, as shown in Fig. 1-(b), has the range $(-2^{4}, 2^{4})$ and a quantization step size of $2^{-11}$.
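The relations in Eqs. (2) and (3) can be checked numerically with a small host-side sketch. The helper names (`fx_pow2`, `fx_step`, `fx_range`, `fx_from_double`, `fx_to_double`) are hypothetical and not part of the converter; the sketch assumes a 16-bit word and truncation toward zero, without saturation.

```c
#include <assert.h>
#include <stdint.h>

/* 2^e for small integer exponents, computed exactly in double. */
static double fx_pow2(int e)
{
    double r = 1.0;
    while (e > 0) { r *= 2.0; e--; }
    while (e < 0) { r *= 0.5; e++; }
    return r;
}

/* Quantization step Q = 2^-(WL - 1 - IWL), as in Eq. (3). */
static double fx_step(int wl, int iwl)
{
    assert(wl > 1 && iwl < wl);
    return fx_pow2(-(wl - 1 - iwl));
}

/* Upper bound of the representable range, 2^IWL, as in Eq. (2). */
static double fx_range(int iwl)
{
    return fx_pow2(iwl);
}

/* Real value -> 16-bit word with the given IWL (truncates, no saturation). */
static int16_t fx_from_double(double v, int iwl)
{
    assert(v >= -fx_range(iwl) && v < fx_range(iwl));
    return (int16_t)(v / fx_step(16, iwl));
}

/* 16-bit word with the given IWL -> real value. */
static double fx_to_double(int16_t w, int iwl)
{
    return w * fx_step(16, iwl);
}
```

With these helpers, `fx_step(16, 4)` gives the step of 2^-11 quoted above, and `fx_from_double(0.9, 0)` gives 29491, the constant that appears later in Fig. 4-(c).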
The arithmetic rules based on this fixed-point data representation are shown in Fig. 2. Firstly, when an
assignment is performed between two variables that have different IWLs, the binary point
locations must be aligned. An arithmetic right shift by n bits corresponds to increasing the IWL by n. For example, a
variable x with an IWL of 2 cannot be assigned directly to another variable y with an IWL of 3. In this
case, y = x should actually be performed as y = x >> 1, and x = y should be x = y << 1.
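As an illustration of the assignment rule, the following sketch re-aligns a 16-bit word from one IWL to another. The function name `fx_align` is hypothetical. The sketch assumes that `>>` on a negative signed value is an arithmetic shift, which is common but implementation-defined in C, and it performs the left shift as a multiplication to avoid shifting a negative value.

```c
#include <assert.h>
#include <stdint.h>

/* Re-align a 16-bit fixed-point word from IWL `from` to IWL `to`. */
static int16_t fx_align(int16_t v, int from, int to)
{
    int n = to - from;
    assert(n > -16 && n < 16);
    if (n >= 0)
        return (int16_t)(v >> n);      /* larger IWL: right shift, loses low bits */
    return (int16_t)(v * (1 << -n));   /* smaller IWL: left shift, may overflow */
}
```

For x = 1.5 with an IWL of 2 (word 12288), `fx_align(12288, 2, 3)` returns 6144, which is 1.5 with an IWL of 3, matching y = x >> 1.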

Fig. 1. Integer format and fixed-point format: (a) integer format; (b) fixed-point format, with the IWL and FWL fields.


Fig. 2. Fixed-point arithmetic rules: (a) assignment (y = x >> 1 and x = y << 1); (b) addition/subtraction; (c) multiplication.

Secondly, in the case of addition and subtraction, not only the IWLs of the input operands but also that of the
result should be considered. In the above example, x + y becomes (x >> 1) + y when the IWL of the result
is 3. If the estimated IWL of x + y is larger than the maximum IWL of the two variables, each operand is shifted by
one more bit, as in (x >> 2) + (y >> 1), to prevent overflows.
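The addition rule can be sketched as a single helper that shifts both operands to the IWL of the result before adding. The name `fx_add` and its argument order are hypothetical; the sketch assumes 16-bit words and a result IWL that is at least as large as both operand IWLs.

```c
#include <assert.h>
#include <stdint.h>

/* Add x (IWL ix) and y (IWL iy); the result carries IWL ir. */
static int16_t fx_add(int16_t x, int ix, int16_t y, int iy, int ir)
{
    assert(ir >= ix && ir >= iy);
    assert(ir - ix < 16 && ir - iy < 16);
    /* Align both binary points to the result before adding. */
    return (int16_t)((x >> (ir - ix)) + (y >> (ir - iy)));
}
```

For x = 1.5 (IWL 2, word 12288) and y = 2.5 (IWL 3, word 10240), a result IWL of 3 gives (x >> 1) + y = 16384, which is 4.0 at IWL 3. A result IWL of 4 gives (x >> 2) + (y >> 1) = 8192, which is 4.0 at IWL 4.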
Thirdly, in the case of multiplication, the IWL of the result becomes the sum of the IWLs of the two variables
plus one, as shown in Fig. 2-(c), because the result has two sign bits in two's complement multiplication. When
a w-bit by w-bit multiplication is performed, a 2w-bit product is obtained. According to the ANSI C grammar,
the lower half is used in integer multiplication. In a digital signal processing algorithm, however, most
variables are aligned to the left to keep the precision. Thus, if we want to limit the result to w bits, the upper
w-bit part should be used.
In recent C compilers for digital signal processors, the upper w bits are accessible through the single
precision multiplication [3]. We used this feature of the TMS320C2x/5x compiler for the efficient implementation
of the multiplication. In traditional compilers, however, single to double precision integer conversion is needed
before a double precision multiplication. In some compilers, an upper w/2-bit by upper w/2-bit multiplication is
supported through intrinsics to obtain a w-bit result [4].
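The difference between the lower and upper half of the product can be seen in a portable sketch that widens to 32 bits before multiplying. This is the "traditional compiler" route described above, not the TMS320C2x/5x compiler feature. The names `mul_low` and `fx_mul` are hypothetical, and `mulh` here is an illustrative stand-in for the function named in Section III. An arithmetic right shift of negative values is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 16 bits of the 32-bit product of two 16-bit words. */
static int16_t mulh(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * (int32_t)b) >> 16);
}

/* Lower 16 bits of the product, which is what a plain 16-bit integer
   multiplication keeps. */
static int16_t mul_low(int16_t a, int16_t b)
{
    return (int16_t)(uint16_t)((int32_t)a * (int32_t)b);
}

/* Fixed-point multiply: the result IWL is ia + ib + 1. */
static int16_t fx_mul(int16_t a, int ia, int16_t b, int ib, int *ir)
{
    assert(ir != 0);
    *ir = ia + ib + 1;
    return mulh(a, b);
}
```

For 0.5 times 0.5 with both operands at IWL 0 (word 16384), `mulh` returns 4096, which is 0.25 at IWL 1, whereas `mul_low` returns 0 and loses the result entirely.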
III. Conversion Programs
The design flow of our conversion software is shown in Fig. 3. The first step is the range estimation, in
which the range of each floating-point variable is estimated during the simulation. The second step is the
fixed-point C code generation, including the type conversion and the scaling code insertion. All the steps are
implemented using the SUIF (Stanford University Intermediate Format) system [5]. SUIF is used for
parsing the floating-point C programs and for generating the range estimation and fixed-point programs. The
C front end and the SUIF-to-C converter are used without modification. The other programs are developed
with SUIF and the SUIF builder. The ID assignment in the range estimator and the IWL information in
the converter program are implemented using the annotation function of the SUIF system.

Fig. 3. The floating-point to fixed-point C converter design flow. Range estimator: C pre-processor, C front-end, ID assignment, subroutine call insertion, SUIF-to-C converter, producing the range estimation C program, whose execution (with the user) yields the IWL information. Program converter: C pre-processor, C front-end, IWL annotation, data type conversion, scaling code insertion, expression conversion, SUIF-to-C converter, producing the integer C program.
The first step can be skipped if programmers can estimate the ranges of all the variables by theoretical
analysis such as L1 norm analysis [6][7]. It can also be performed automatically by our simulation-based range
estimation. The floating-point C program is modified by inserting a subroutine call after every assignment
statement, as shown in Fig. 4-(b). The subroutine insertion method is about twice as fast as the C++
class based programs [8][9]. In the subroutine, the maximum absolute value, the average, and the standard
deviation of the variables are tracked. The subroutine `range' has two parameters. The first is the variable
to be tracked, and the second is the unique identification number of the variable, which is assigned automatically
by the conversion program. The ID numbers are annotated to the variables in the symbol
tables. The generated range estimation program allocates static storage for keeping the statistical data of
the variables. The tracked statistics are reported when the simulation is finished.
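A minimal sketch of such a tracking subroutine is given below. Only the name `range` and its two parameters come from the text. The table size `MAX_IDS`, the accessor functions, and the rule that derives an IWL from the maximum absolute value are assumptions made for illustration; the paper derives the range from the mean and the standard deviation without giving the formula here. The sketch reports the variance, and the standard deviation is its square root.

```c
#include <assert.h>

#define MAX_IDS 64   /* hypothetical upper bound on tracked variables */

struct range_stat {
    long   n;        /* number of samples seen */
    double sum;      /* running sum, for the average */
    double sum_sq;   /* running sum of squares, for the variance */
    double max_abs;  /* maximum absolute value seen */
};

/* Static storage for the statistics, indexed by variable ID. */
static struct range_stat stats[MAX_IDS];

/* Called after every assignment: record value v of variable `id`. */
void range(double v, int id)
{
    double a = (v < 0.0) ? -v : v;
    assert(id >= 0 && id < MAX_IDS);
    stats[id].n++;
    stats[id].sum += v;
    stats[id].sum_sq += v * v;
    if (a > stats[id].max_abs)
        stats[id].max_abs = a;
}

double range_mean(int id)
{
    return stats[id].n ? stats[id].sum / stats[id].n : 0.0;
}

double range_variance(int id)
{
    double m = range_mean(id);
    return stats[id].n ? stats[id].sum_sq / stats[id].n - m * m : 0.0;
}

/* Assumed rule: smallest non-negative IWL with max_abs < 2^IWL. */
int range_iwl(int id)
{
    int iwl = 0;
    double limit = 1.0;
    while (stats[id].max_abs >= limit) {
        limit *= 2.0;
        iwl++;
    }
    return iwl;
}
```

After the simulation, a report loop over the IDs would print these values; a variable whose largest magnitude was 3.0 would be assigned an IWL of 2 under the assumed rule.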
In the second step, the type and the expression tree conversions are performed with the IWL information.
The conversion program reads the original floating-point C program and the IWL information table that is
generated by the range estimation procedure. The type conversion is conducted by modifying the types of the
variables in the symbol tables. The types of the float variables are replaced by the integer type. Not only the
float type but also the float based types, such as float pointers, float arrays, and float functions, are converted
to integer based types. The expression tree conversion is based on the fixed-point arithmetic rules described
previously. The instructions in the expression tree are traversed bottom up, and floating-point instructions are
converted to integer type. The instructions to convert are mul, add, sub, cvt, cpy, lod, str, ldc, cal, array, neg,
seq, sne, sl, and sle. When these instructions are found, the operands of the instructions are examined, and
the number of shifts is determined from the IWLs of the operands and the result. If the instruction is mul, it is
replaced by a mulh() function call, which is implemented by a macro or an intrinsic according to the features of
the target compiler. Figure 4-(c) shows an example of the converted fixed-point C code. According to the
range estimation results, the IWLs of the variables x, y, and s are 0, 4, and 4, respectively. The floating-point
constant 0.9 is converted to the integer constant 29491, which is a 16-bit fixed-point number with an IWL of
0. The multiplication uses the mulh() macro to obtain the upper 16 bits of the result, and the IWL of the multiplication
result is set to 5, which is the sum of the IWLs of the two operands plus one. Since the IWL of x is 0, x must
be shifted right by 5 bits before it is added to the product. The sum is shifted left by 1 bit before it is assigned to y, whose
IWL is 4.
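The conversion described above can be tried on a host machine with the following sketch, which pairs the converted filter with its floating-point original. It assumes `int16_t` in place of the 16-bit `int` of the TMS320C5x, and it supplies portable stand-ins for `mulh()` and `sll()`; the real macros depend on the target compiler. The function names `iir1_fixed` and `iir1_float` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-ins for the target-specific macros (assumptions). */
#define mulh(a, b) ((int16_t)(((int32_t)(a) * (int32_t)(b)) >> 16))
#define sll(v, n)  ((int16_t)((v) * (1 << (n))))   /* shift-only variant */

/* Converted filter: x has IWL 0, s and y have IWL 4. */
int16_t iir1_fixed(int16_t x)
{
    static int16_t s = 0;
    int16_t y;

    /* product has IWL 5; x is aligned to IWL 5; the sum is aligned to IWL 4 */
    y = sll(mulh(29491, s) + (x >> 5), 1);
    s = y;
    return y;
}

/* Floating-point original, for comparison. */
float iir1_float(float x)
{
    static float s = 0;
    float y;

    y = 0.9f * s + x;
    s = y;
    return y;
}
```

Feeding a constant input of 0.5 (word 16384 at IWL 0) produces 1024, 1944, and 2772 from the fixed-point version. Divided by 2^11, these are 0.5, about 0.949, and about 1.354, against 0.5, 0.95, and 1.355 from the floating-point version.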
Finally, the converted fixed-point C code is tested to see whether it produces runtime overflows. Overflows can be
produced only in the left shift operations that are used for aligning the binary point in the assignment
statements. The macro sll() shown in Fig. 4-(c) can check for overflows and report their locations. If overflows
occur, the programmer should modify the IWL information table to eliminate them. The sll() macro can be
replaced by a shift-only macro in the final implementation.
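One possible shape for such a checking macro is sketched below. Only the name `sll` comes from the text. The helper `sll_checked`, the counter `sll_overflows`, the `FINAL_BUILD` switch, and the message format are hypothetical, and a 16-bit result word is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

static unsigned long sll_overflows = 0;   /* hypothetical overflow counter */

/* Left shift with overflow detection; reports the source location. */
static int16_t sll_checked(int32_t v, int n, const char *file, int line)
{
    int64_t r;
    assert(n >= 0 && n < 16);
    r = (int64_t)v * ((int64_t)1 << n);   /* exact, cannot overflow */
    if (r > INT16_MAX || r < INT16_MIN) {
        sll_overflows++;
        fprintf(stderr, "%s:%d: overflow in sll(%ld, %d)\n",
                file, line, (long)v, n);
    }
    return (int16_t)r;   /* wraps on overflow on typical targets */
}

#ifdef FINAL_BUILD
/* Shift-only variant for the final implementation. */
#define sll(v, n) ((int16_t)((v) * (1 << (n))))
#else
#define sll(v, n) sll_checked((v), (n), __FILE__, __LINE__)
#endif
```

In the checking build, `sll(1000, 1)` returns 2000 silently, while `sll(20000, 1)` increments the counter and prints the file and line of the offending shift, which points the programmer to the IWL entry that needs to be enlarged.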
IV. Implementation Examples
A fourth order IIR filter and an ADPCM codec are implemented for the TMS320C5x using the developed
floating-point to fixed-point C converter. We used the TMS320C2x/5x optimizing C compiler from Texas
Instruments (version 6.60) [3] to compile the floating-point and the fixed-point programs.
Table I shows the implementation results for the fourth order IIR filter. The signal to quantization noise ratio
of the fixed-point implementation is 49.3 dB, which is acceptable in most communication applications.
The fixed-point implementation based on the proposed translator is 13.8 times faster than the floating-point
implementation.
float iir1(float x)
{
    static float s = 0;   /* filter state, kept between calls */
    float y;

    y = 0.9 * s + x;
    s = y;
    return y;
}

(a) The floating-point C program.


float iir1(float x)
{
    static float s = 0;
    float y;

    y = 0.9 * s + x;
    range(y, 0);          /* track y, which has ID 0 */
    s = y;
    range(s, 1);          /* track s, which has ID 1 */
    return y;
}

(b) The range estimation program.


int iir1(int x)
{
    static int s = 0;
    int y;

    /* 29491 is 0.9 at IWL 0; the product and x >> 5 both have IWL 5 */
    y = sll(mulh(29491, s) + (x >> 5), 1);
    s = y;
    return y;
}

(c) The fixed-point C program.

Fig. 4. Conversion example of a first order IIR filter.


TABLE I
The performance results of a fourth order IIR filter.

                  floating-point   fixed-point
SQNR              -                49.3 dB
machine cycles    2980             215

TABLE II
The performance results of an ADPCM codec.

test signal       floating-point   fixed-point (16-bit)   fixed-point (32-bit)
1                 15.51 dB         12.42 dB               15.29 dB
2                 19.11 dB         14.23 dB               18.92 dB
3                 18.95 dB         14.58 dB               18.53 dB
4                 21.43 dB         16.91 dB               22.56 dB
machine cycles    125249           26718                  61401

The ADPCM implementation is based on the G.721 standard. The ADPCM program consists of three
files with 771 lines in total. This shows that our conversion software can handle reasonably complex
digital signal processing programs. The performance is measured by the signal to noise ratio between the input
and the reconstructed speech signal, as shown in Table II. There are two versions of the fixed-point program
in the table. The first is a 16-bit version whose variables are all 16-bit integers. It shows about 4 dB of
performance degradation. It should be possible to reduce this degradation by employing a signal
dependent scaling scheme, such as a block floating-point implementation. The 32-bit version performs better
than the 16-bit version, but it requires many more cycles because the 32-bit arithmetic
operations are implemented with subroutine calls. The single precision (16-bit) fixed-point implementation is 4.7
times faster than the floating-point implementation.
V. Concluding Remarks
In this paper, a floating-point C program translator is developed using the SUIF system. The translator
not only converts floating-point operations into integer type operations, but also supports automatic scaling. The
implementation results show that the translator can provide an acceptable compromise to the users of
fixed-point digital signal processors in terms of SQNR, execution speed, and development effort.
References
[1] Seehyun Kim and Wonyong Sung, "A Floating-point to Fixed-point Assembly Program Translator for the TMS 320C25,"
IEEE Trans. on Circuits and Systems, vol. 41, no. 11, pp. 730-739, Nov. 1994.
[2] Jiyang Kang and Wonyong Sung, "Fixed-Point C Compiler for TMS320C50 Digital Signal Processor," in Proceedings of the
International Conference on Acoustics, Speech, and Signal Processing '97, Apr. 1997, pp. 707-710.
[3] TMS320C2x/C2xx/C5x Optimizing C Compiler, Texas Instruments Inc., Houston, 1995.
[4] TMS320C6x Optimizing C Compiler, Texas Instruments Inc., Houston, 1997.
[5] The SUIF Library, Stanford Compiler Group, 1994.
[6] Leland B. Jackson, "On the Interaction of Roundoff Noise and Dynamic Range in Digital Filters," The Bell System Technical
Journal, pp. 159-183, Feb. 1970.
[7] Christos Caraiscos and Bede Liu, "A Roundoff Error Analysis of the LMS Adaptive Algorithm," IEEE Trans. on Acoustics,
Speech, and Signal Processing, vol. 32, no. 1, pp. 34-41, Feb. 1984.
[8] Seehyun Kim, Ki-Il Kum, and Wonyong Sung, "Fixed-Point Optimization Utility for C and C++ Based Digital Signal
Processing Programs," in Proceedings of the 1995 IEEE Workshop on VLSI Signal Processing, Oct. 1995, pp. 197-206.
[9] M. Willems, V. Buersgens, H. Keding, and H. Meyr, "FRIDGE: An Interactive Fixed-Point Code Generation Environment
for HW/SW CoDesign," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing '97, Apr.
1997.
