
Application Note

March 2002

Trigonometric Functions on the DSP16000 Core Digital Signal Processor

Introduction
This note describes how to program trigonometric math functions on the DSP16000 Core Digital Signal Processor.
It is intended as a tutorial for DSP16000 programmers who want either to develop their own trigonometric routines
or to modify an application library trigonometric routine. It does not provide complete application source
code, but instead includes code segment examples to guide the programmer.
The DSP16000 instruction set includes a feature that feeds accumulator results back to the yh register. This
architectural feature yields a significant performance improvement over the DSP1600 core, as shown in Table 1.
Table 1 also summarizes the sections of this document.

Table 1. Document Content

Function        Application   Approximate Cycle Reduction vs. DSP1600   Description
Trigonometric   Sine          47%                                       Describes the DSP16000 implementation of the sine function using the feedback to the yh register feature with input in the Q14 format.
Trigonometric   Cosine        36%                                       Describes the DSP16000 implementation of the cosine function using the feedback to the yh register feature with input in the Q14 format.

The trigonometric functions are computed as a Taylor series polynomial expansion. The general form for an
8-term Taylor series is shown below:
$$ f(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + c_4 x^4 + c_5 x^5 + c_6 x^6 + c_7 x^7 \tag{1} $$

The sine function is described in detail, followed by the cosine function. The applications for the other
trigonometric functions such as tangent, secant, cotangent, arcsine, etc., are not described in this note but are very
similar to the sine and cosine applications.


Sine
The Taylor series for the sine function is as follows:

$$ \sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} - \dots \tag{2} $$

or

$$ \sin(x) \cong c_1 x + c_3 x^3 + c_5 x^5 + c_7 x^7 + c_9 x^9 \tag{3} $$

where

$$
\begin{aligned}
c_1 &= 1 \\
c_3 &= -\frac{1}{3!} = -0.1666666667 \\
c_5 &= \frac{1}{5!} = 0.00833333 \\
c_7 &= -\frac{1}{7!} = -0.0001984127 \\
c_9 &= \frac{1}{9!} = 0.0000027557
\end{aligned}
$$

or

$$ \sin(x) \cong x \left( c_1 + x^2 \left( c_3 + x^2 \left( c_5 + x^2 \left( c_7 + c_9 x^2 \right) \right) \right) \right) \tag{4} $$

where the range of the input operand x is:


$$ -\frac{\pi}{2} \le x \le \frac{\pi}{2} $$
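The nested form of equation (4) can be checked in floating point before moving to fixed-point arithmetic. The following is a minimal Python sketch, not part of the DSP16000 routine; the function names sin_taylor and max_abs_error are illustrative only. It evaluates the five-term polynomial over the stated input range and measures how far it deviates from the library sine.

```python
import math

# Taylor coefficients c1, c3, c5, c7, c9 (alternating signs, 1/n!).
C1 = 1.0
C3 = -1.0 / math.factorial(3)
C5 = 1.0 / math.factorial(5)
C7 = -1.0 / math.factorial(7)
C9 = 1.0 / math.factorial(9)


def sin_taylor(x: float) -> float:
    """Evaluate the nested (Horner) form of equation (4) in floating point."""
    x2 = x * x
    return x * (C1 + x2 * (C3 + x2 * (C5 + x2 * (C7 + C9 * x2))))


def max_abs_error(samples: int = 1001) -> float:
    """Largest |sin_taylor(x) - sin(x)| on an even grid over [-pi/2, pi/2]."""
    worst = 0.0
    for i in range(samples):
        x = -math.pi / 2 + math.pi * i / (samples - 1)
        worst = max(worst, abs(sin_taylor(x) - math.sin(x)))
    return worst
```

The truncation error of the five-term series is largest at the ends of the range and stays below one least significant bit of a Q15 result (2^-15, about 3.05e-5), so in a 16-bit implementation the fixed-point rounding, rather than the series length, is expected to dominate the error.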

To represent x in 16 bits with maximum precision, the application encodes the input operand in the Q14 fixed-point
format as illustrated in the following figure:

[Figure: 16-bit word, bits 15 to 0. Bit 15 is the sign bit (s), bit 14 is the integer part, and bits 13 to 0 are the fractional part (14 bits). The implied binary point lies between bits 14 and 13.]

Figure 1. Q14 Fixed-Point Format
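As an illustration of the Q14 encoding, the following Python sketch converts a real value to a signed 16-bit Q14 integer and back. The helper names to_q14 and from_q14 are hypothetical, and rounding to the nearest integer is an assumption; the note does not state how the input operand is quantized.

```python
import math

Q14_ONE = 1 << 14  # 1.0 in Q14


def to_q14(value: float) -> int:
    """Encode a real value in [-2, 2) as a signed 16-bit Q14 integer."""
    raw = int(round(value * Q14_ONE))  # assumption: round to nearest
    if not -32768 <= raw <= 32767:
        raise ValueError("value does not fit in 16-bit Q14")
    return raw


def from_q14(raw: int) -> float:
    """Decode a signed 16-bit Q14 integer back to a real value."""
    return raw / Q14_ONE


half_pi_q14 = to_q14(math.pi / 2)  # upper end of the input range
```

Because Q14 covers values from -2 up to just under 2, the full input range of plus or minus pi/2 (about 1.5708) fits with one integer bit, which is why Q14 gives the most fractional bits available for this operand.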

Assuming that the input operand x is stored in a single-word memory location pointed to by register r1, the routine
computes x^2 as follows:

auc0=0                /* No shift of p0 register; clear yl on loads to yh */
xh=*r1                /* Load x (Q14) into xh register */
yh=*r1                /* Load x (Q14) into yh register; yl <- 0 */
p0=xh*yh yh=*r2++     /* p0 <- x^2; yh <- c9 (Q29) */
xh=p0h                /* xh register contains x^2 (Q12 format) */

Note that the routine computes x^2 and simultaneously loads the yh register with the coefficient c9 for future
use. The format of x^2 is Q12, as illustrated in the following figure.

[Figure: 16-bit word, bits 15 to 0. Bit 15 is the sign bit (s), bits 14 to 12 are the integer part, and bits 11 to 0 are the fractional part (12 bits). The implied binary point lies between bits 12 and 11.]

Figure 2. Q12 Fixed-Point Format
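The Q12 result can be reproduced with ordinary integer arithmetic. The sketch below is a Python model, not DSP16000 code: it assumes the 16 x 16 multiply produces a 32-bit signed product with no shift (auc0=0) and that reading p0h returns the upper 16 bits of that product. The helper name mul_high16 is hypothetical.

```python
import math


def mul_high16(a: int, b: int) -> int:
    """Signed 16x16 multiply; return the upper 16 bits of the 32-bit product.

    Python's >> on negative integers is an arithmetic shift, which matches
    taking the high half of a two's-complement product.
    """
    return (a * b) >> 16


x_q14 = 25736                       # pi/2 rounded to Q14
x2_q12 = mul_high16(x_q14, x_q14)   # Q14 * Q14 = Q28; upper half is Q12
x2_real = x2_q12 / (1 << 12)        # decoded value, close to (pi/2)^2
```

The full product has 28 fractional bits; discarding the lower 16 bits leaves 12, which is the Q12 format shown in Figure 2.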

In general, when a number in QN format is multiplied by a number in QM format, the upper 16 bits of the product (with
no programmed scaling in the auc0 register) are in the Q(M+N–16) format. Since x^2 is in the Q12 format, whenever it
is multiplied by a value that is in QM format, the product is in Q(M–4) format. The routine takes this fact into
account by prescaling each successive coefficient by four additional bit positions (a factor of 2^4 = 16). This has
the benefit of scaling the small coefficients by a proportionately large value, which minimizes the loss of precision
in coding the coefficients. Specifically, the routine codes the coefficients in memory as follows:

_coeff: int  0.0000027557!29   /* c9 in Q29 format, i.e., c9<<15 relative to Q14 */
        int -0.0001984127!25   /* c7 in Q25 format, i.e., c7<<11 relative to Q14 */
        int  0.0083333333!21   /* c5 in Q21 format, i.e., c5<<7 relative to Q14 */
        int -0.1666666667!17   /* c3 in Q17 format, i.e., c3<<3 relative to Q14 */
        int  1.0000000000!13   /* c1 in Q13 format, i.e., c1>>1 relative to Q14 */

For maximum precision, the routine shifts the coefficients left as far as possible without overflowing the 16-bit
fixed-point format, while maintaining the four-bit (factor of 16) scaling step between successive coefficients.
The limiting coefficient is c5, which is 0x4444 in Q21 format. The assembly-language statements of the
form int N!M are data declaration statements that initialize a 16-bit memory location to the constant N in QM
format. These statements cause the coefficients to be stored in the order c9, c7, c5, c3, c1 starting at memory location _coeff.
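The coefficient scaling can be checked numerically. The following Python sketch quantizes each coefficient to its stated Q format and confirms that shifting c5 one more bit would overflow 16 bits. The helper name quantize is hypothetical, and rounding to nearest is an assumption; the note does not say whether the assembler rounds or truncates, so the least significant bit of some entries may differ from what the assembler emits.

```python
def quantize(value: float, frac_bits: int) -> int:
    """Quantize value to a signed 16-bit integer with frac_bits fractional bits."""
    raw = int(round(value * (1 << frac_bits)))  # assumption: round to nearest
    if not -32768 <= raw <= 32767:
        raise OverflowError(f"{value} overflows 16 bits in Q{frac_bits}")
    return raw


# (value, Q format) pairs in the order they are stored at _coeff.
SINE_COEFFS = [
    (0.0000027557, 29),   # c9
    (-0.0001984127, 25),  # c7
    (0.0083333333, 21),   # c5, the limiting coefficient (0x4444)
    (-0.1666666667, 17),  # c3
    (1.0, 13),            # c1
]

table = [quantize(value, bits) for value, bits in SINE_COEFFS]


def fits(value: float, frac_bits: int) -> bool:
    """Return True if value is representable in 16 bits at this Q format."""
    try:
        quantize(value, frac_bits)
        return True
    except OverflowError:
        return False
```

With these definitions, fits(0.0083333333, 22) is False, which is the overflow that stops the left shift at Q21 for c5 and fixes the formats of the other four coefficients.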
A feature of the DSP16000 core that is used to enhance the efficiency of this application is the feedback to the yh
register (see Section 4.7.7 of the DSP16000 Digital Signal Processor Core Information Manual). The routine
enables this feature by configuring the auc1 register. Specifically, it sets bits 14:12 of auc1 to 110, causing bits
[31:16] of any result that is written to the a6 accumulator to be simultaneously written to the yh register. Recall:

$$ \sin(x) \cong x \left( c_1 + x^2 \left( c_3 + x^2 \left( c_5 + x^2 \left( c_7 + c_9 x^2 \right) \right) \right) \right) \tag{5} $$

The sine routine must successively multiply the quantity x^2 by a sum. To efficiently move the computation of the
sum to the input of the multiplier, the code uses the feedback to the yh register.
Recall that x^2 is stored in the xh register in Q12 format and c9 is stored in the yh register in Q29 format.

auc1=0x6000                    /* Set XYFBK to 110; feed back a6h to yh */
r2=_coeff                      /* r2 points to array of coefficients */
a1=0 a1=0 p0=0 p1=0            /* initialize a1 and p1 to zero */
p0=xh*yh yl=*r2++              /* p0 <- c9*x^2 (Q25); yl <- c7 (Q25) */
a6=a1+p0+p1                    /* yh <- a6h <- c9*x^2 (Q25) */
p0=xh*yh p1=xh*yl yl=*r2++     /* p0 <- c9*x^4 (Q21); p1 <- c7*x^2 (Q21); */
                               /* yl <- c5 (Q21) */
a6=a1+p0+p1                    /* yh <- a6h <- c9*x^4 + c7*x^2 (Q21) */
p0=xh*yh p1=xh*yl yl=*r2++     /* p0 <- c9*x^6 + c7*x^4 (Q17); */
                               /* p1 <- c5*x^2 (Q17); yl <- c3 (Q17) */
a6=a1+p0+p1                    /* yh <- a6h <- c9*x^6 + c7*x^4 + c5*x^2 (Q17) */
p0=xh*yh p1=xh*yl yl=*r2++     /* p0 <- c9*x^8 + c7*x^6 + c5*x^4 (Q13); */
                               /* p1 <- c3*x^2 (Q13); yl <- c1 (Q13) */
a6=a1+p0+p1                    /* yh <- a6h <- c9*x^8 + c7*x^6 + c5*x^4 + c3*x^2 (Q13) */
xh=*r1                         /* xh <- x */
p0=xh*yh p1=xh*yl              /* p0 <- c9*x^9 + c7*x^7 + c5*x^5 + c3*x^3 (Q11); */
                               /* p1 <- c1*x (Q11) */
a6=a1+p0+p1                    /* a6 <- c9*x^9 + c7*x^7 + c5*x^5 + c3*x^3 + c1*x (Q11) */
a0=a6<<4                       /* Result stored in a0 in Q15 format */

The repeated multiply and accumulate instruction pairs in the above example (marked with rectangles in the original layout) are compressed into cache loops as follows:

auc1=0x6000                    /* Set XYFBK to 110; feed back a6h to yh */
r2=_coeff                      /* r2 points to array of coefficients */
a1=0                           /* initialize a1 to zero */
yh=*r2++                       /* yh <- c9 (Q29) */
do 4 {
    p0=xh*yh p1=xh*yl yl=*r2++
    a6=a1+p0+p1
}
xh=*r1                         /* xh <- x */
redo 1
a0=a6<<4                       /* Result stored in a0 in Q15 format */

The compression of the code into cache loops makes more efficient use of program memory. Note that on the first
pass through the loop, yl has been previously cleared so the first result loaded into p1 is zero as required. The
final result is in Q15 format, which consists of a sign bit and 15 fractional bits. Since the result is in the range
between 1 and –1, this format optimizes the number of bits of precision for a 16-bit result.
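The data flow of the cache loop can be modeled with plain integers to see how the Q formats line up. The Python sketch below is an illustrative model under several assumptions that the note does not confirm: products are exact 32-bit values with no shift, the feedback to yh takes bits [31:16] by truncation, the coefficient table is rounded to nearest, and the final Q15 value is saturated to 16 bits. The function name sin_q15 is hypothetical.

```python
import math

# c9, c7, c5, c3, c1 in Q29, Q25, Q21, Q17, Q13 (rounded to nearest; assumed).
C9, C7, C5, C3, C1 = 1479, -6658, 17476, -21845, 8192


def sin_q15(x_q14: int) -> int:
    """Model the loop: 'do 4' passes on x^2, then one 'redo 1' pass on x."""
    xh = (x_q14 * x_q14) >> 16        # xh <- x^2 in Q12 (upper half of product)
    yh, yl = C9, 0                    # yh <- c9; yl cleared by the load to yh
    pending = [C7, C5, C3, C1]        # coefficients fetched by yl=*r2++
    a6 = 0
    for step in range(5):
        if step == 4:
            xh = x_q14                # xh <- x (Q14) before the final pass
        p0 = xh * yh                  # running sum times x^2 (or x)
        p1 = xh * yl                  # newest coefficient times x^2 (or x)
        if step < 4:
            yl = pending[step]        # load the next coefficient
        a6 = p0 + p1                  # a1 is zero, so a6 = p0 + p1
        yh = a6 >> 16                 # feedback of a6[31:16] to yh
    result = a6 >> 12                 # a6<<4, then upper 16 bits: Q15
    return max(-32768, min(32767, result))  # assumed saturation to 16 bits


def max_error(step: int = 64) -> float:
    """Largest deviation from math.sin over Q14 inputs in [-pi/2, pi/2]."""
    limit = int(math.pi / 2 * (1 << 14))
    worst = 0.0
    for raw in range(-limit, limit + 1, step):
        got = sin_q15(raw) / (1 << 15)
        worst = max(worst, abs(got - math.sin(raw / (1 << 14))))
    return worst
```

In this model, most of the error comes from truncating the running sum to 16 bits in Q13 before the final multiply by x, so the result is accurate to a few Q15 least significant bits rather than to a fraction of one. Real hardware may differ, for example in rounding mode or accumulator guard bits.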


Cosine
The Taylor series for the cosine function is as follows:
$$ \cos(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \dots \tag{6} $$

or

$$ \cos(x) \cong 1 + c_2 x^2 + c_4 x^4 + c_6 x^6 + c_8 x^8 \tag{7} $$

where

$$
\begin{aligned}
c_2 &= -\frac{1}{2!} = -0.5000000000 \\
c_4 &= \frac{1}{4!} = 0.04166667 \\
c_6 &= -\frac{1}{6!} = -0.00138889 \\
c_8 &= \frac{1}{8!} = 0.248 \times 10^{-4}
\end{aligned}
$$

or
$$ \cos(x) \cong 1 + x^2 \left( c_2 + x^2 \left( c_4 + x^2 \left( c_6 + c_8 x^2 \right) \right) \right) \tag{8} $$

As in the application that computes the sine function, this routine takes its input in the Q14 format. It computes the
value of x^2, which is in the Q12 format, and then evaluates the above equation in a cache loop, using feedback to the
yh register to enhance the efficiency of the code. For maximum precision, the cosine function shifts the coefficients left
as far as possible without overflowing the 16-bit fixed-point format, while maintaining the four-bit (factor of 16)
scaling step between successive coefficients. In this case, the limiting coefficient is c4, which is 0x5555 in Q19 format.
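The cosine polynomial and its limiting coefficient can be checked the same way as for the sine. The Python sketch below evaluates equation (8) in floating point and quantizes c4 to Q19. The names cos_taylor and c4_q19 are illustrative, and rounding to nearest is an assumption; the note does not list the Q formats of the other cosine coefficients, so they are not reproduced here.

```python
import math

# Taylor coefficients c2, c4, c6, c8 from equation (7).
C2 = -1.0 / math.factorial(2)
C4 = 1.0 / math.factorial(4)
C6 = -1.0 / math.factorial(6)
C8 = 1.0 / math.factorial(8)


def cos_taylor(x: float) -> float:
    """Evaluate the nested (Horner) form of equation (8) in floating point."""
    x2 = x * x
    return 1.0 + x2 * (C2 + x2 * (C4 + x2 * (C6 + C8 * x2)))


# c4 in Q19, rounded to nearest (assumed); the note gives this as 0x5555.
c4_q19 = int(round(C4 * (1 << 19)))

# One more left shift (Q20) would exceed the 16-bit signed maximum of 32767.
c4_q20_overflows = int(round(C4 * (1 << 20))) > 32767
```

The series is truncated after the x^8 term, so the floating-point error is largest near plus or minus pi/2, where the omitted x^10/10! term is about 2.5e-5.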



For additional information, contact your Agere Systems Account Manager or the following:
INTERNET: http://www.agere.com
E-MAIL: docmaster@agere.com
N. AMERICA: Agere Systems Inc., 555 Union Boulevard, Room 30L-15P-BA, Allentown, PA 18109-3286
1-800-372-2447, FAX 610-712-4106 (In CANADA: 1-800-553-2448, FAX 610-712-4106)
ASIA: Agere Systems Hong Kong Ltd., Suites 3201 & 3210-12, 32/F, Tower 2, The Gateway, Harbour City, Kowloon
Tel. (852) 3129-2000, FAX (852) 3129-2020
CHINA: (86) 21-5047-1212 (Shanghai), (86) 10-6522-5566 (Beijing), (86) 755-695-7224 (Shenzhen)
JAPAN: (81) 3-5421-1600 (Tokyo), KOREA: (82) 2-767-1850 (Seoul), SINGAPORE: (65) 6778-8833, TAIWAN: (886) 2-2725-5858 (Taipei)
EUROPE: Tel. (44) 7000 624624, FAX (44) 1344 488 045
Agere Systems Inc. reserves the right to make changes to the product(s) or information contained herein without notice. No liability is assumed as a result of their use or application.

Copyright © 2002 Agere Systems Inc.


All Rights Reserved

March 2002
AP02-044WINF (Replaces AP98-059WTEC)
