A Mathematical Perspective
Todd D. Mateer
Contents
Chapter 1. Introduction
Chapter 2. Polynomials
1. Basic operations
2. The Remainder Theorem and Synthetic Division
3. Modular reduction of a polynomial
4. Multipoint polynomial evaluation
Appendix
Bibliography
CHAPTER 1
Introduction
During the second half of the twentieth century, the Fast Fourier Transform
(FFT) has become one of the most important techniques in Electrical Engineering.
This statement is supported by the fact that over 2,000 papers have been published
on the topic since the 1960s [22] and that a list of over 75 applications of the
FFT is given in [7]. But what is the Fast Fourier Transform? This is not an
easy question to answer and the response you get depends upon who you ask.
After some introductory definitions, we will attempt to provide an answer to this
question. First, according to [DSP] a signal is defined as follows:
Signal.
A signal is any physical quantity that varies with time, space,
or any other independent variable or variables.
Those who have completed a high school mathematics curriculum may notice that
the definition of a signal closely matches the definition of a function. The only
difference seems to be that a function is a mathematical concept, whereas a signal
represents something tangible in the real world. In this text, the terms signal
and function will be used interchangeably.
Signals can be classified in many different ways. The first type of signal of
interest in this introductory section is:
Analog signal.
An analog signal is a function that is defined for all inputs in
a specified interval.
EXAMPLE
Let xa (t) be the analog signal shown in the figure below. It consists of two
triangles, each of which has a width of 40 milliseconds. The signal is zero for all
inputs less than 20 milliseconds and for all inputs more than 140 milliseconds, so
it is time-limited. The signal can be drawn without lifting one's pencil off of the
paper, so it is continuous.

FIGURE (graph of xa (t): horizontal axis t with ticks at 80 ms and 160 ms,
vertical axis with ticks at ±0.5 and ±1)
Digital signal.
A digital signal is a function that is only defined at certain
values of time (usually integer multiples of a fixed sampling
interval) and that takes on only a finite set of possible values.
EXAMPLE

FIGURE
A true digital signal only has a finite number of possible outputs, so x(t) is
typically rounded to the nearest finite value in this case. In this text, we will relax
this assumption and work with discrete-time signals which sample an analog
signal at evenly-spaced intervals, but place no restriction on the allowable output
values.
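The sampling process just described can be sketched in a few lines of code. The triangle pulse below is a stand-in analog signal (not necessarily the exact xa (t) pictured earlier), and the 10 ms sampling interval is likewise an arbitrary choice for illustration.

```python
def x_a(t):
    """A stand-in analog signal: a single triangle pulse of width 40 ms
    centered at t = 40 ms, zero elsewhere."""
    if 0.020 <= t <= 0.060:
        return 1.0 - abs(t - 0.040) / 0.020
    return 0.0

# Form a discrete-time signal by sampling x_a(t) every T = 10 ms.
# No rounding is applied to the outputs, so this is a discrete-time
# signal rather than a true digital signal.
T = 0.010
x = [x_a(n * T) for n in range(9)]   # samples at t = 0, 10, ..., 80 ms
```

Rounding each sample to, say, the nearest multiple of 1/256 would turn this discrete-time signal into a digital one.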
Now that we have seen the difference between an analog and discrete-time
signal, we are ready to return to the goal of figuring out what is meant by the
phrase Fast Fourier Transform. First, we will introduce the Fourier Transform
which operates on analog signals. Next, we will proceed to the Discrete Fourier
Transform which operates on discrete-time signals. Finally, we will explain what a
Fast Fourier Transform is and how it relates to the Discrete Fourier Transform.
In a typical signals analysis course in an Electrical Engineering curriculum, the
Fourier transform of some analog signal f (t) is usually defined by the following
formula

F(s) = \int_{-\infty}^{\infty} f(t) \, e^{-2\pi t s I} \, dt

Similarly, the Discrete Fourier Transform of a sampled signal f̃(τ) is defined as
either

F(k) = \sum_{\tau=0}^{N-1} \tilde{f}(\tau) \, e^{-2\pi \tau k I / N}

or

F(k) = \frac{1}{N} \sum_{\tau=0}^{N-1} \tilde{f}(\tau) \, e^{-2\pi \tau k I / N}

depending on what book one picks up. The Σ notation used in these
formulas means to add up the expression that follows the symbol at each value
of τ from 0 to N − 1. For example, if N = 4, then

\sum_{\tau=0}^{N-1} \tau = 0 + 1 + 2 + 3 = 6
A question that is often asked at this point is how the Fourier transform and the
discrete Fourier transform are related. This is also not an easy question to answer
and requires some knowledge of Calculus. A fairly detailed discussion of this topic can
be found in the appendix. For now, let it suffice to say that f̃(τ) is a function that
samples f (t) uniformly at N locations. As N increases, f̃(τ) becomes a better and
better approximation to f (t) and the Discrete Fourier Transform becomes a better
and better approximation to the Fourier Transform. From this point forward, we
will use the simpler notation f (τ) to represent the sampled version of f (t).
To illustrate how the Fourier Transform and Discrete Fourier Transform relate
to one another, let us consider what is probably the most popular example used in
courses that cover the Fourier Transform. The so-called rectangle function is a
function that takes on a value of 1 in the interval −1/2 < t < 1/2 and is defined to
be zero elsewhere.1 A graph of the rectangle function is given by

FIGURE

One nice feature about this function is that it is symmetric about the vertical
axis. This means that the graph to the left of the vertical axis is the mirror
image of the graph to the right of the vertical axis. When a function is symmetric
about the vertical axis, the Fourier Transform definition simplifies to

F(s) = 2 \int_{0}^{\infty} f(t) \cos(2\pi t s) \, dt

1At the values of t = −1/2 and t = 1/2 where the function suddenly jumps between 0 and
1, the rectangle function is usually defined to have a value of 1/2.
For the rectangle function,

F(s) = 2 \int_{0}^{1/2} 1 \cdot \cos(2\pi t s) \, dt
     = \left[ \frac{\sin(2\pi t s)}{\pi s} \right]_{t=0}^{1/2}
     = \frac{\sin(\pi s)}{\pi s}

which is defined for all values of s ≠ 0. The sinc function is defined using the
above formula with the added condition that sinc(0) = 1.
Two perspectives of the sinc function are given below. The domain of this
function is all of the real numbers, so it is impossible to show a graph of the entire
function. Observe that the sinc function has a value of zero for all integer inputs
with the exception that the function has an output of one when the input is zero.
This is a desirable feature of the sinc function which is useful in engineering.

FIGURE
For a sampled function that is symmetric about the vertical axis, the Discrete
Fourier Transform similarly simplifies to

F(k) = \frac{1}{N} \sum_{\tau=-N/2}^{N/2-1} f(\tau) \cos(2\pi \tau k / N)
which also eliminates the complex variables. The sample values of the rectangle
function can now be used to determine the DFT of the rectangle function for each
value of k in the range −N/2 to N/2 − 1.2
For example, when N = 4, then
2Typically, the range 0 to N − 1 is used for k, but this alternative range was selected so that
the DFT outputs more closely match the graph of the sinc function given earlier.
F(1) = \frac{1}{4} \left[ f(-2)\cos(-\pi) + f(-1)\cos(-\pi/2) + f(0)\cos(0) + f(1)\cos(\pi/2) \right]
     = \frac{1}{4} \left[ 0 \cdot (-1) + \frac{1}{2} \cdot 0 + 1 \cdot 1 + \frac{1}{2} \cdot 0 \right]
     = \frac{1}{4}
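These computations can be checked numerically. The sketch below assumes, as in the discussion above, that the rectangle function is sampled at the N points τ · T0 /N for τ = −N/2, …, N/2 − 1 with T0 = 2, and that the value 1/2 is used at the jumps.

```python
import math

def dft_symmetric(f, N):
    """DFT of a symmetric sampled function via the cosine formula
    F(k) = (1/N) * sum_{tau=-N/2}^{N/2-1} f(tau) * cos(2*pi*tau*k/N)."""
    return [sum(f(tau) * math.cos(2 * math.pi * tau * k / N)
                for tau in range(-N // 2, N // 2)) / N
            for k in range(-N // 2, N // 2)]

def rect(t):
    """Rectangle function, defined to be 1/2 at the jumps t = -1/2, 1/2."""
    if abs(t) < 0.5:
        return 1.0
    return 0.5 if abs(t) == 0.5 else 0.0

N = 4
F = dft_symmetric(lambda tau: rect(tau * 2 / N), N)   # k = -2, -1, 0, 1
# F is approximately [0, 1/4, 1/2, 1/4]
```

Increasing N here reproduces the graphs discussed next, where the DFT outputs approach the sinc function.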
Observe that as N increases, the results of the Discrete Fourier Transform look
more and more like the sinc function.
But we still have not really answered the question "What is the Discrete Fourier
Transform?" To resolve this question, we turn to the 1807 publication by Fourier
that describes the construct that bears his name. It can be shown that any periodic
function can be represented by an infinite series of sine and cosine functions
called a Fourier Series. It turns out that as the period T0 of the function increases
without bound, then nearly any function defined over a finite interval of input
values can be represented by an infinite series of sine and cosine functions, and the
inverse of the Fourier transform is given by

f(t) = \int_{-\infty}^{\infty} F(s) \, e^{2\pi t s I} \, ds

In the discrete setting, each component

F(k) = \frac{1}{N} \sum_{\tau=-N/2}^{N/2-1} f(\tau) \cos(2\pi \tau k / N)

gives the magnitude of the sinusoid cos(2πτk/N) in the Discrete Fourier Series.
This is illustrated in the figure on the next page. In the left column, each of the
components of the Discrete Fourier Transform is shown for the case where N = 4.
In the right column are the corresponding cosine functions. The function cos(2πτk/N)
is displayed with a light line in each graph and the scaled version of the function
(determined by the Discrete Fourier Transform component) is displayed with a
dark line. At the bottom of the page, the complete Discrete Fourier Transform is
displayed along with the result of adding all of the scaled sinusoids with the original
sample values displayed on the graph. This graph represents the function
f_4(\tau) = 0 \cdot \cos(\pi\tau) + \frac{1}{4} \cos(-\pi\tau/2) + \frac{1}{2} \cos(0\tau) + \frac{1}{4} \cos(\pi\tau/2)

Observe that cos(0τ) is equal to 1 for all values of τ and that cos(−πτ/2) =
cos(πτ/2) due to the symmetry properties of the cosine function. Therefore, the
above function can also be represented as

f_4(\tau) = \frac{1}{2} + \frac{1}{2} \cos(\pi\tau/2)
which is a more compact way of expressing the Discrete Fourier Series in this case.
The corresponding Discrete Fourier Series for N = 8 sample values is

f_8(\tau) = \frac{1}{2} + \frac{1+\sqrt{2}}{4} \cos(\pi\tau/4) + \frac{1-\sqrt{2}}{4} \cos(3\pi\tau/4)
using the magnitudes of the Discrete Fourier Transform computed earlier. The
function f8 ( ) is displayed in the left graph below and the function is displayed
with the original sample values in the right graph.
The graphs of the Discrete Fourier Series for the cases N = 16 and N = 32
are given below. Observe that as the number of samples increases, the resulting
Discrete Fourier Series more closely matches the rectangle input function.
How much effort does it take to compute the Discrete Fourier Transform? It
appears that the Discrete Fourier Transform formula must be evaluated N times
and that there are N terms involved in each formula evaluation. Also, every term
in the summation involves one multiplication. Thus, one is tempted to conclude
that a total of N² multiplications and N² − N additions are required.
In 1965, a short five-page paper written by Cooley and Tukey [11] appeared in
the literature which forever changed the field of Electrical Engineering. This paper
described an algorithm which computes the Discrete Fourier Transform in roughly
(N/2) · log2 (N ) multiplications and N · log2 (N ) additions. This algorithm is now
called the Fast Fourier Transform (FFT) and significantly reduces the amount
of work needed to compute the Discrete Fourier Transform for large sizes.
The following figure illustrates how much the FFT improves the Discrete Fourier
Transform computation. The upper line represents the number of multiplications
required if the formulas presented earlier in this section are evaluated literally. The
lower line (which is somewhat difficult to distinguish from the horizontal axis) represents the number of multiplications required using the Cooley-Tukey algorithm.
This graph only covers FFT sizes up to 256 where the slow method requires
65,536 multiplications and the fast method requires about 1,000 multiplications.
Typical FFT sizes are in the thousands or higher, but the difference in the number
of operations for these sizes is so significant that one really would not be able to tell
the difference between the FFT graph and the horizontal axis. One can already see
the incredible improvement of the FFT in the amount of effort needed to compute
the Discrete Fourier Transform.
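The counts quoted above are easy to tabulate. This sketch compares the N² multiplications of the direct method with the roughly (N/2) · log2 (N ) multiplications of the FFT; the exact FFT count depends on the variant, so these are ballpark figures.

```python
import math

def direct_mults(N):
    """Multiplications needed to evaluate the DFT formula literally."""
    return N * N

def fft_mults(N):
    """Approximate multiplications for the Cooley-Tukey FFT."""
    return (N // 2) * int(math.log2(N))

print(direct_mults(256))   # 65536, as quoted in the text
print(fft_mults(256))      # 1024, "about 1,000"
```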
Although the Cooley-Tukey paper is one of the most influential papers of the
20th century, it was not the first time that the technique was described in the
literature. The account of [23] shows that what is now known as the FFT may
have been discovered as early as 1805 by Carl Gauss. The main difference between
the success of the Cooley-Tukey paper and the unappreciated earlier papers was
a new invention called a computer which could be used to actually perform the
computations. Once engineers were able to compute the Discrete Fourier Transform
so quickly, they found many uses for the technique.
The process of creating a function which has a specified collection of function
values is called interpolation. Thus, the Discrete Fourier Transform can be viewed
as an interpolation process applied to the given sample values. The Discrete Fourier
Transform can be reversed to receive the Discrete Fourier Series as an input and recover
the N sample values. Because the Discrete Fourier Series is being evaluated at N
points, this process is called multipoint evaluation.
There are many good books (e.g. [7]) which consider the FFT from the above
perspective and give algorithms that can efficiently compute the FFT. This book
approaches the FFT from a different perspective which is more common among
some mathematicians. The alternative perspective defines the FFT as a special
case of multipoint evaluation and the inverse FFT to be an interpolation algorithm.
This alternative viewpoint was first introduced in a 1971 paper by Fiduccia [15]
and has since become popular with researchers who are interested in Computer
Algebra and Error-Correcting Codes. This is a somewhat unfortunate development
because papers are now written from both of these perspectives, and it can be
confusing for the beginning student to read these documents and adopt the same
perspective on the problem as each author.
A nice feature of the Fiduccia perspective is that the FFT can be developed from
an algebraic perspective using the remainders that result when two polynomials
are divided. This algebraic perspective has been studied in much greater detail by
Daniel Bernstein, whose work ([1], [2], [3]) has been very influential on the present
author's understanding of this alternative perspective of the FFT. This textbook
is intended to further expand upon the ideas of these two pioneers and present the
Fiduccia perspective of the FFT in a form that both undergraduate engineering
and mathematics students can understand and appreciate.
Because both viewpoints of the FFT are used in publications, both perspectives
will be treated in this book. After some background material in Chapters 2 and 3,
the FFT will be developed from the Fiduccia perspective in Chapters 4-5. Chapter
6 will then consider the problem of fast polynomial multiplication, an important
application of the FFT. Next in Chapter 7, we will return to the more traditional
approach to the FFT considered in this first chapter. Some additional topics will
be considered in Chapters 8 and 9.
EXERCISES.
1. Use a computer package to produce a graph of the rectangle function and
the sinc function as shown earlier in this chapter.
2. Write a routine that computes the Discrete Fourier Transform of a function
that is symmetric about the vertical axis using the formula given in this chapter.
Use the routine to compute the Discrete Fourier Transform of the rectangle function.
Make a graph of the Discrete Fourier Transform for N = 4, 8, 16, or 32 to verify
the results given in this section.
3. Compute the Discrete Fourier Transform of the rectangle function for the
case N = 128 (or even higher). Use these components to construct the Discrete
Fourier Series for the case N = 128 and graph the result. The graph should really
look like the rectangle function now (see below).
However, there are spikes where the graph transitions between 0 and 1. This
is called Gibbs phenomenon and is a problem that engineers must deal with when
working with signals constructed from the Discrete Fourier Transform. Special
components are used to prevent a signal from exceeding a certain value (called
overshoot) or going below a certain value (called undershoot). The result of a
signal going through these components might look something like the following:
FIGURE
4. Throughout this chapter, we assumed that T0 = 2. Select one or more of the
cases: T0 = 1/2, T0 = 1, T0 = 4, T0 = 8 and:
(A) Sample the function uniformly over one period of the function using N = 32
sample values. Produce a graph of the results.
(B) Compute the Discrete Fourier Transform of the sampled function and produce a graph of the results. Compare it with the graph of the sinc function produced
in this chapter. In particular, how does the height of the function compare with
the one produced in this chapter and how many values of the sinc function are
produced between each time that the function crosses the horizontal axis?
(C) Try to reconstruct the rectangle function with the Discrete Fourier Series
based on your Discrete Fourier Transform. Comment on the success or failure of
your attempt.
(D) The results of this exercise only give scaled versions of the input function.
Based on your work completed for this exercise, can you guess what scaling factor
(in terms of N and T0 ) each of these results should be multiplied by to recover
the original input function?
CHAPTER 2
Polynomials
1. Basic operations
In a typical introductory algebra course (e.g. [4]), one is introduced to the
concept of a polynomial. Although polynomials involving multiple variables are
frequently used in algebra, here we will restrict ourselves to polynomials involving
one variable. First, let us review the concept of a monomial.
Monomial.
A monomial is an expression of the form
a · x^n
where a is a real number called the coefficient and n is a
nonnegative integer called the exponent.
Polynomial.
A polynomial is either a monomial or a sum or difference of
monomials. Each monomial which comprises a polynomial is called
a term.
Although the above definitions specify that the coefficients are real numbers, it
is possible to create polynomials for other number systems. We will create polynomials with complex number coefficients in the next chapter.
The degree of a monomial is simply its exponent in the case of single variables.
The degree of a polynomial is the largest exponent which appears in one of its terms.
The monomial which has this exponent in the polynomial is called the leading
term.
EXAMPLE
Addition of polynomials.
The sum of the polynomials f (x) and g(x) is determined by
forming the expression f (x) + g(x) and combining like terms.
EXAMPLE.
Before reviewing the operation of subtraction, recall that the opposite or additive
inverse of a number a is some other number b such that a + b = 0. If a is
positive, then b is formed by putting a minus sign in front of a. If a is negative, then
b is formed by removing the minus sign in front of a. The opposite of a polynomial
f (x) is formed by replacing each term of f (x) with a term of the same degree and
the opposite of the coefficient in f (x). We will denote the opposite of f (x) with
the notation −f (x).
EXAMPLE
Subtraction of polynomials.
The difference of the polynomials f (x) and g(x) is determined by
forming the expression f (x) + (−g(x)) and combining like terms.
As one can see, polynomial multiplication requires much more effort than polynomial addition or subtraction. In Chapter 6, we will see that the FFT can be used
to significantly reduce the amount of work needed to multiply two polynomials of
large degree.
Polynomials are an example of what is called a Euclidean Domain. This means
that given two polynomials a(x) and b(x), there exists a polynomial q(x) called a
quotient and r(x) called a remainder such that

a(x) = q(x) · b(x) + r(x)

with the property that either r(x) is the zero polynomial or else the degree of r(x)
is less than the degree of b(x). It can be shown that q(x) and r(x) are unique in
this particular type of Euclidean Domain.
To determine the quotient and remainder of a(x) divided by b(x), we follow a
procedure that works much like the division of two integers.
Division of polynomials.
The quotient q(x) of the polynomial a(x) divided by b(x) with
remainder r(x) is determined by the following sequence of
steps:
1. Initialize r(x) equal to a(x) and q(x) equal to 0.
2. While deg(r(x)) ≥ deg(b(x)):
3.   Divide the leading term of r(x) by the leading term of b(x).
     Call this term ℓ(x) and add it to the quotient q(x).
4.   Subtract ℓ(x) · b(x) from r(x).
5. End while
EXAMPLE
It can be shown that there are no other polynomials q(x) and r(x) such that
a(x) = q(x) · b(x) + r(x) that have the property that the degree of r(x) is less than
the degree of b(x).
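The steps above can be sketched as follows. Polynomials are represented here as coefficient lists with the constant term first, a representation chosen for this illustration rather than taken from the text; b(x) is assumed to have a nonzero leading coefficient.

```python
def poly_divmod(a, b):
    """Divide polynomial a(x) by b(x), following the steps above.
    Coefficient lists are ordered from constant term to leading term."""
    r = list(a)                        # step 1: r(x) = a(x)
    q = [0.0] * max(len(a) - len(b) + 1, 1)
    while len(r) >= len(b):            # step 2: deg r >= deg b
        coef = r[-1] / b[-1]           # step 3: leading terms give l(x)
        shift = len(r) - len(b)
        q[shift] += coef
        for i in range(len(b)):        # step 4: subtract l(x)*b(x) from r(x)
            r[shift + i] -= coef * b[i]
        r.pop()                        # leading term of r now cancels
    return q, r

# a(x) = x^2 + 7x + 12 divided by b(x) = x + 3 gives q(x) = x + 4, r(x) = 0
q, r = poly_divmod([12, 7, 1], [3, 1])
```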
Now consider

a(x) = q(x) · (x − α) + r(x)

if b(x) = x − α, where either r(x) must be 0 or the degree of r(x) is less than the
degree of x − α, i.e. a polynomial of degree 0. In either case, r(x) must be a
constant. Let us call this constant C. If we evaluate the above equation at α, we
obtain

a(α) = q(α) · (α − α) + r(α)
     = q(α) · 0 + C
     = C
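The fact that the remainder C equals a(α) gives a quick way to compute remainders by a single evaluation. The sketch below evaluates a hypothetical example polynomial using Horner's rule (synthetic division in disguise).

```python
def poly_eval(a, x):
    """Evaluate a(x) by Horner's rule; coefficients are listed from
    constant term to leading term."""
    result = 0
    for c in reversed(a):
        result = result * x + c
    return result

# a(x) = x^3 - 2x + 5; by the argument above, the remainder of
# a(x) divided by (x - 2) is a(2) = 8 - 4 + 5 = 9.
a = [5, -2, 0, 1]
print(poly_eval(a, 2))   # 9
```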
1Mathematicians who have studied advanced algebra may object to this definition of
f (x) mod M(x). Traditionally, the notation f (x) mod M(x) is used to represent an element
of something called a quotient ring which consists of the set of all polynomials that have the
same remainder as the remainder of f (x) divided by M(x). However, these mathematicians then
select a representative element of this set for computational purposes. In terms of the representative elements, the definition of f (x) mod M(x) is the same as the one considered in this
section. The reader should be cautioned, however, that this view is only valid with polynomials
involving one variable. With polynomials involving two or more variables, one needs to learn the
concept of a quotient ring and the more advanced mathematical techniques used for computing in
these quotient rings. The multivariate case will not be encountered anywhere in this manuscript.
More details about the relationship between residue polynomials and quotient rings
are given in the appendix for those with the appropriate algebra background.
(f mod MA ) mod MB
(f mod MA ) mod MC

PICTURE (modulus tree with leaf nodes x + 4, x + 3, x + 2, x + 1, x − 3, x − 4
and parent nodes (x + 4)(x + 3) = x² + 7x + 12, (x + 2)(x + 1) = x² + 3x + 2,
and (x − 3)(x − 4) = x² − 7x + 12)

PICTURE (modulus tree with parent nodes written in factored form, e.g.
(x + 3)(x − 3) and (x + 1)(x − 1))

or in simplified form

PICTURE (modulus tree with interior nodes x⁴ − 5x² + 4, x² − 9, x² − 4, x² − 1
and leaves x + 3, x − 3, x + 2, x − 2, x + 1, x − 1)
Observe that many of the polynomials in this tree have fewer terms than the
nodes of the tree given in XXX. The sequence of modular reductions for evaluating
f at the points of S can be given by the tree
PICTURE
and this time only XX multiplications and XX additions are required. This is a
consequence of the fact that the modulus polynomials in the new tree involve fewer
terms.
Note that if MB is of the form x^m − b and MC is of the form x^m + b, then
MA will be of the form x^{2m} − b². Each of these polynomials has only two terms,
which resulted in the reduced operation count of the multipoint evaluation using the
second modulus tree. We were able to arrange the points of S so that we could
achieve this situation at the bottom of the modulus tree, but we were not able to
construct polynomials of the desired form higher in the tree. It turns out that it is
impossible to find two polynomials with two terms that multiply together to form
x^{2m} − b² whenever b² < 0 if we restrict ourselves to the real numbers. By selecting
the points in S from an extension of the real numbers called the complex numbers,
we will be able to reduce the number of terms of the modulus polynomials higher
in the tree and achieve a faster multipoint evaluation technique.
CHAPTER 3
Complex numbers
1. Number systems
Over the years, mathematicians have invented several new number systems to
handle cases where one cannot solve a particular problem with the existing number
systems.
In elementary school, one first learns the number system of the natural numbers,
which consists of all of the positive integers, i.e. 1, 2, 3, 4, … .
Next, the number zero is introduced and the number system is expanded into
the whole numbers.
Then, one is asked to find a number x such that
x + 1 = 0. Here, the concept of a negative number is introduced and the student's
number system is expanded to the integers,
i.e. …, −4, −3, −2, −1, 0, 1, 2, 3, 4, … .
Later, one is asked to find a solution to an equation similar to 2x = 1 and the
concept of a fraction is introduced. Now, the student's number system has been
expanded to the rational numbers, i.e. all numbers that can be expressed as a
ratio of two integers.
An equation similar to x² − 2 = 0 cannot be solved with the rational numbers
and the concept of an irrational number is introduced. The number system is next
expanded to the real numbers.
The number system must be expanded again so that an equation of the form
x² + 1 = 0 can be solved. We will introduce a new symbol, √−1, that we will define
as the solution to this equation. Mathematicians traditionally use the symbol i
to represent √−1, while engineers typically choose the symbol j to represent the
same quantity. In this book, we will use the symbol I, which is used in some popular
computer algebra packages. The number system has now been expanded to the
complex numbers.
Complex numbers.
The number system of complex numbers consists of all expressions of
the form
A + I · B
where A and B are real numbers.
We can extract the two components of a complex number using the following operations.
Components of a complex number.
If C = A + I B is a complex number, then
Re(C) = A is the real part of the complex number and
Im(C) = B is the imaginary part of the complex number.
EXAMPLE
It is unfortunate that one of the components of a complex number is called
imaginary. This term was introduced because at first some people did not believe that
these numbers had any practical applications. If one thinks about it carefully,
negative numbers can also be considered imaginary because they cannot be used to
count anything tangible in the real world. However, negative numbers have become
accepted because they can be used to represent the concepts of debt and loss. This
text is all about one of the important practical applications of complex numbers.
So, while the term imaginary is traditionally applied to one of the components
of a complex number, this term should not be interpreted as a description of the
usefulness of complex numbers.
So what is the next expansion after the complex numbers? A consequence of
the Fundamental Theorem of Algebra introduced by Carl Gauss is that the
complex number system is complete, meaning that all polynomial equations with
coefficients in the complex numbers can be solved using only the complex numbers.
It is possible to expand the complex number system two more times, but each
time we lose a property that one usually associates with numbers. First, the
quaternions are a set of numbers of the form A + I·B + J·C + K·D where A, B, C,
and D are real numbers and I² = J² = K² = −1 and I·J·K also equals −1.
Multiplication in this number system is not commutative, which means that a · b
can be different from b · a. This number system was long thought to be of only
theoretical interest, but has recently been applied to computer graphics and video
games. This number system can again be expanded into the octonions, which are
like the quaternions, but have eight components. Multiplication in this number
system is also not commutative, and it has the additional property that it is not
associative. That is to say, a · (b · c) may
be different from (a · b) · c. Currently, this number system seems to be mainly of
theoretical interest. It turns out that the octonions represent the last expansion of
the number system that can be made where both addition and multiplication are
defined.
The following table summarizes the various expansions of the number system
discussed in this section.
TABLE
2. Complex arithmetic
In the previous section, we learned that a complex number is of the form A+I B
where A and B are real numbers. The two quantities A and I B are kept distinct
because one cannot combine real and imaginary numbers. In [30], Loy relates the
components of a complex number to the idea of apples and oranges in a fruit basket.
Apples can be added to apples and oranges can be added to oranges, but apples
cannot be added to oranges. This illustration may be useful as we give the following
definition of addition and subtraction with complex numbers
Complex addition.
The sum of the two complex numbers A + I B and C + I D is given by
(A + C) + I (B + D).
EXAMPLE
Before discussing complex division, we first introduce the concept of the complex conjugate
Complex conjugate.
The conjugate of the complex number A + B · I is given by
A − B · I
Complex division.
The quotient of the two complex numbers A + B · I and C + D · I
is given by

\frac{A + B \cdot I}{C + D \cdot I} = \frac{AC + BD}{C^2 + D^2} + I \cdot \frac{BC - AD}{C^2 + D^2}
This division formula is established by multiplying the numerator and denominator of the quotient by the complex conjugate of C + D I and simplifying.
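The formula can be checked against Python's built-in complex arithmetic; the sample operands below are arbitrary.

```python
def complex_divide(a, b, c, d):
    """(A + B*I) / (C + D*I) via the conjugate formula above."""
    denom = c * c + d * d
    return (a * c + b * d) / denom, (b * c - a * d) / denom

re, im = complex_divide(3, 2, 1, -1)
expected = (3 + 2j) / (1 - 1j)
print(re, im)   # 0.5 2.5, matching the built-in result
```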
The complex numbers can be graphed on a Cartesian plane where the horizontal
axis is used for the real component and the vertical axis is used for the imaginary
component. The complex number A + B I is mapped on the complex plane
using the ordered pair (A, B).
FIGURE (the complex plane: horizontal axis Re, vertical axis Im, with the
point (A, B) plotted)
EXAMPLE.
3. Polar Representation of Complex Numbers
In this section, we will consider a second method of representing a complex
number called polar form illustrated in the figure below.
FIGURE (the complex plane with the point (r, θ) at distance r from the origin)
First, we will show how to convert from the Cartesian representation to the
polar representation. Observe that r can be computed using the Pythagorean
Theorem or the distance formula

r = \sqrt{A^2 + B^2}

and θ satisfies

\tan(\theta) = \frac{B}{A}

To solve this equation for θ, one must exercise caution because tan⁻¹ is restricted
to the range −90° < θ < 90°. The following formulas give a solution for θ in the
range 0 ≤ θ < 360° for all cases except when r = 0.
θ =
  tan⁻¹(B/A)          if A > 0 and B > 0
  tan⁻¹(B/A) + 180°   if A < 0 and B ≠ 0
  tan⁻¹(B/A) + 360°   if A > 0 and B < 0
  0°                  if A > 0 and B = 0
  90°                 if A = 0 and B > 0
  180°                if A < 0 and B = 0
  270°                if A = 0 and B < 0
  undefined           if A = 0 and B = 0
To convert from polar form to Cartesian form, we will apply the following
results from trigonometry:

A = r cos(θ)
B = r sin(θ)

The complex number can therefore be written as

r (cos(θ) + I sin(θ))

which is sometimes abbreviated as r cis θ. Engineers often instead use the
abbreviation r∠θ and call the polar representation of a complex number a phasor.
One advantage of the polar representation of complex numbers is that multiplication is easy in this form.
Complex multiplication (polar form). The product of two
complex numbers written in polar form r1 ∠θ1 and r2 ∠θ2 is given by

(r1 ∠θ1 ) · (r2 ∠θ2 ) = r1 r2 ∠(θ1 + θ2 )
Thus, all one needs to do is multiply the magnitudes and add the arguments of two
complex numbers to compute the product. This result is a consequence of the sum
and difference formulas learned in trigonometry. The derivation of the formula is
left as an exercise.
This multiplication formula can be used to derive an expression for the square
of a complex number:

(r∠θ)² = r² ∠(2θ)
By repeated use of the multiplication formula, we can derive de Moivre's Theorem,
which allows a complex number to be raised to any integer power n.

de Moivre's Theorem. If r∠θ is a complex number with magnitude
r and argument θ, then

(r∠θ)ⁿ = rⁿ ∠(nθ)
Similarly, the quotient of two complex numbers written in polar form is given by

(r1 ∠θ1 ) / (r2 ∠θ2 ) = (r1 / r2 ) ∠(θ1 − θ2 )
There are no good formulas for addition and subtraction in polar form. The
best course of action for these operations is to convert the two numbers to be added
or subtracted into Cartesian form, compute the sum or difference, and then convert
the result back into polar form if desired.
4. Primitive Roots of Unity
In this section, we will restrict ourselves to complex numbers which have magnitude 1 when represented in polar form. These complex numbers are said to form
the unit circle in the complex plane.
FIGURE (the unit circle of radius 1 in the complex plane, axes Re(z) and Im(z))
Consider the equation zⁿ = 1. Here, z is a complex variable, i.e. an expression
of the form x + I · y where x and y are unknown real numbers. Alternatively,
z can be a variable involving complex numbers represented in polar form.
A solution to the above equation is called an nth root of unity. By de
Moivre's Theorem discussed in the previous section, we can verify that

1∠(360°d/n) = cos(360°d/n) + I sin(360°d/n)

is a solution for each integer d in the range 0 ≤ d < n.
EXAMPLE
FIGURE (the 8th roots of unity on the unit circle: 1∠0°, 1∠45°, 1∠90°, 1∠135°,
1∠180°, 1∠225°, 1∠270°, 1∠315°)
EXAMPLE
EXAMPLE

1∠(θ/n + 360°d/n) = cos(θ/n + 360°d/n) + I sin(θ/n + 360°d/n)

is a solution to this equation for all 0 ≤ d < n. This gives all n solutions to this
equation.
EXAMPLE

FIGURE (four solutions on the unit circle: 1∠45°, 1∠135°, 1∠225°, 1∠315°)
Finally, the solutions to z⁴ − 1 = 0 are called the 4th roots of unity and are
given by {1, I, −1, −I}.

FIGURE (the 4th roots of unity 1, I, −1, −I on the unit circle)

Here ω = I is a primitive 4th root of unity. Since I⁴ = 1, the powers of
I cycle according to the following pattern:
ω⁰, ω⁴, ω⁸, … = 1
ω¹, ω⁵, ω⁹, … = I
ω², ω⁶, ω¹⁰, … = −1
ω³, ω⁷, ω¹¹, … = −I
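This cycling is easy to confirm with Python's built-in complex type, where I is written 1j:

```python
# Powers of I repeat with period 4: 1, I, -1, -I, 1, I, ...
cycle = [1j ** k for k in range(12)]
for k, value in enumerate(cycle):
    print(k, value)
```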
5. Euler's Formula
In a typical Calculus course, one encounters the following Taylor expansions
for the cosine, sine, and exponential functions:
cos(x) = 1 − x²/2! + x⁴/4! − x⁶/6! + ⋯

sin(x) = x − x³/3! + x⁵/5! − x⁷/7! + ⋯

eˣ = 1 + x/1! + x²/2! + x³/3! + x⁴/4! + x⁵/5! + x⁶/6! + x⁷/7! + ⋯
In the 18th century, Leonhard Euler sought a single formula that related these three
expressions. The expression

cos(x) + sin(x) = 1 + x/1! − x²/2! − x³/3! + x⁴/4! + x⁵/5! − x⁶/6! − x⁷/7! + ⋯

nearly matches the series for eˣ, but the signs of the terms do not agree.
38
3. COMPLEX NUMBERS
In order to make a formula that works, Euler decided to replace the x in the definition of e^x with I·y, where y is a real number. Then we obtain

e^{Iy} = 1 + (I·y)/1! + (I·y)^2/2! + (I·y)^3/3! + (I·y)^4/4! + (I·y)^5/5! + (I·y)^6/6! + · · ·
       = 1 + I·y/1! + I^2·y^2/2! + I^3·y^3/3! + I^4·y^4/4! + I^5·y^5/5! + I^6·y^6/6! + · · ·
       = 1 + I·y/1! − y^2/2! − I·y^3/3! + y^4/4! + I·y^5/5! − y^6/6! − · · ·
       = (1 − y^2/2! + y^4/4! − · · ·) + I·(y/1! − y^3/3! + y^5/5! − · · ·)
       = cos(y) + I·sin(y)
Now, certain mathematicians would correctly raise several objections to the derivation of the above formula. Euler did not concern himself with such issues and neither will we in this presentation. However, it should be mentioned that some more advanced mathematics, covered in a course in complex variables, is needed to properly derive the above result. In any event,

Euler's Formula.
e^{Iy} = cos(y) + I·sin(y)

Combining Euler's Formula with the polar representation of a complex number, we obtain

r·(cos(θ) + I·sin(θ)) = r·e^{Iθ}
6. Rotation Transformations

Writing a point as r∠θ and multiplying it by I = 1∠90°, in polar form we see that the result of the multiplication is r∠(θ + 90°). The new point is the same distance from the origin, but the angle between the horizontal axis and the ray connecting the point to the origin has been increased by 90 degrees.
EXAMPLE
[Figure: the point 1∠30° is rotated by 1∠90° = I to the point 1∠120°.]
We are now going to consider the effect of multiplying the complex number r∠θ by 1∠φ, where φ is any angle. Again, from the multiplication formula in polar form, the result of the multiplication is r∠(θ + φ). The distance that the new point is away from the origin remains unchanged, but the angle that the ray connecting the point to the origin makes with the horizontal axis has been increased by φ degrees.
EXAMPLE
[Figure: the point 1∠30° rotated by an angle φ.]
Next, suppose that we transform the complex plane with the mapping z̃ = 1∠φ · z. This means that every point in the complex plane should be multiplied by 1∠φ. In other words, the magnitude of every point is unchanged, but the argument of every point is increased by φ degrees. The figure below illustrates how four points are represented in a transformed complex plane where φ = 45°.

[Figure: z̃ = 1∠45° · z]

We see here that a negative value for φ decreases the argument of every point, or rotates the points clockwise in the complex plane.
Another way of looking at this transformation is to leave the points fixed in the transformed complex plane, but adjust the axis system instead. In this case, the axis system is rotated by −φ degrees. The following figure illustrates this idea for the case where φ = −45°, and thus the axis system is rotated counterclockwise by −(−45°) = 45°.

[Figure: z̃ = 1∠(−45°) · z]
The above two figures show two equivalent ways of looking at the same transformation. Another way of expressing the transformation is in terms of z̃ instead of z, using the notation z = 1∠(−φ) · z̃. This will be the form used in the coming chapters.
EXAMPLE
In the previous examples, we considered the transformation z̃ = 1∠(−45°) · z. Multiply both sides of this equation by 1∠45° to obtain

1∠45° · z̃ = 1∠45° · 1∠(−45°) · z
1∠45° · z̃ = 1 · z

So this transformation can be expressed with the equivalent formula z = 1∠45° · z̃.
EXAMPLE
[Figure: the four points 1∠45°, 1∠135°, 1∠225°, 1∠315° under this transformation.]

(1∠(−45°))^4 · z̃^4 = 1∠(−180°) · z̃^4 = 1∠180° · z̃^4
[Figure: the 8th roots of unity 1∠0°, 1∠45°, . . . , 1∠315° in the original complex plane, and the same points after the complex plane has been rotated by 1∠180°.]
These rotation transformations will play an important role in one of the FFT algorithms discussed in the next chapter.
CHAPTER 4
FFT algorithms
1. The Binary Reversal Function
Having completed a study of complex numbers, we are ready to resume our search for a fast multipoint evaluation algorithm. The Fast Fourier Transform (FFT) algorithms in this chapter evaluate a polynomial f at each of the nth roots of unity for some n = 2^k. To achieve this goal, we are going to construct modulus polynomials of the form z^{2m} − b^2, which can be factored into z^m − b and z^m + b at every node of the modulus polynomial tree. Before presenting the algorithms, we need a method to easily find the b's in the expressions above. This method requires one to convert numbers from our number system, called decimal because it is base 10, to the one that computers use, called binary because it is base 2. Note that "dec" is a prefix which means 10 and "bi" is a prefix associated with 2, which may be helpful in figuring out what to call other number systems.
In elementary school, one learned that each digit in a number represents a power of 10. For example, the number 123 is used to represent 1 group of a hundred, 2 groups of ten, and 3 groups of one, i.e. 123 = 1·10^2 + 2·10^1 + 3·10^0. Binary numbers work the same way, except that each digit represents a power of 2. For example,

(101)_2 = 1·2^2 + 0·2^1 + 1·2^0 = 5

The notation (·)_2 is used to indicate that this represents a binary number.
To convert from decimal to binary, divide the number by two and record the
remainder. Then divide the quotient by two and again record the remainder. Continue the process until the quotient is zero. The binary representation of the number
is the sequence of remainders in reverse order.
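The conversion procedure just described can be sketched in a few lines of Python (the function name `to_binary` is ours, not the text's):

```python
def to_binary(x):
    """Convert a nonnegative integer to its binary digits (most
    significant digit first) by repeated division by two."""
    if x == 0:
        return [0]
    remainders = []
    while x > 0:
        x, r = divmod(x, 2)      # divide by two, record the remainder
        remainders.append(r)
    return remainders[::-1]      # the remainders in reverse order

print(to_binary(13))             # [1, 1, 0, 1], i.e. (1101)_2
```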
EXAMPLE
13 ÷ 2 = 6 R 1
6 ÷ 2 = 3 R 0
3 ÷ 2 = 1 R 1
1 ÷ 2 = 0 R 1
So 13 = (1101)_2. To check this result, observe that

(1101)_2 = 1·2^3 + 1·2^2 + 0·2^1 + 1·2^0 = 1·8 + 1·4 + 0·2 + 1·1 = 8 + 4 + 0 + 1 = 13
The binary reversal of j with respect to n, denoted σ(j), is obtained by writing j in binary using log_2(n) digits and reversing the order of these digits.

EXAMPLE
Let j = 5 and n = 16 = 2^4. In binary form, j = (0101)_2. So σ(j) = (1010)_2. As a decimal number, σ(j) = 10.

Note in the example that leading zeros should be included in the binary reversal of the number. Also, n is often the same for many related calculations. If this is the case and it is understood from the context of the situation what n is, then it is not necessary to specify n in the notation and one can simply use σ(j).
Properties of the binary reversal function.
(1) σ(j) = 2·σ(2j) for j < n/2
(2) σ(2j+1) = σ(2j) + n/2 for j < n/2
(3) The function σ is a permutation of the integers {0, 1, 2, . . . , n−1}.

To establish the first property, let j < n/2 and write j in binary form, i.e. j = (0 b_{k−2} b_{k−3} . . . b_2 b_1 b_0)_2 where k = log_2(n). Then σ(j) = (b_0 b_1 b_2 . . . b_{k−3} b_{k−2} 0)_2. Now, 2j = (b_{k−2} b_{k−3} b_{k−4} . . . b_1 b_0 0)_2 and σ(2j) = (0 b_0 b_1 b_2 . . . b_{k−3} b_{k−2})_2. Multiplying this result by 2 gives σ(j) as desired. The proofs of the other two properties are left as exercises.
EXAMPLE
0 = (000)_2    σ(0) = (000)_2 = 0
1 = (001)_2    σ(1) = (100)_2 = 4
2 = (010)_2    σ(2) = (010)_2 = 2
3 = (011)_2    σ(3) = (110)_2 = 6
4 = (100)_2    σ(4) = (001)_2 = 1
5 = (101)_2    σ(5) = (101)_2 = 5
6 = (110)_2    σ(6) = (011)_2 = 3
7 = (111)_2    σ(7) = (111)_2 = 7
One can verify that the three properties above hold when n = 8.
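A minimal Python sketch of the binary reversal function (we call it `sigma` here) makes the three properties easy to check:

```python
def sigma(j, n):
    """Binary reversal of j with respect to n = 2^k: reverse the k-bit
    binary representation of j, leading zeros included."""
    k = n.bit_length() - 1              # k = log2(n)
    bits = format(j, '0%db' % k)        # k-bit binary string of j
    return int(bits[::-1], 2)           # reversed string, back to decimal

n = 8
print([sigma(j, n) for j in range(n)])  # [0, 4, 2, 6, 1, 5, 3, 7]

for j in range(n // 2):
    assert sigma(j, n) == 2 * sigma(2 * j, n)               # property (1)
    assert sigma(2 * j + 1, n) == sigma(2 * j, n) + n // 2  # property (2)
assert sorted(sigma(j, n) for j in range(n)) == list(range(n))  # property (3)
```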
The input to the reduction step of the classical radix-2 FFT algorithm is f mod (z^{2m} − b^2), and the step computes f mod (z^m − b) and f mod (z^m + b). Before discussing the reduction step itself, let us consider the tree of modulus polynomials. At the top of the tree is z^n − 1. Thus, we are going to evaluate f at each of the nth roots of unity, i.e. at the powers of a primitive nth root of unity ω. At each reduction step with input size 2m, let b = ω^{σ(2j)} for some j < n/(2m). In the previous section, we saw that b^2 = (ω^{σ(2j)})^2 = ω^{σ(j)} and −b = −ω^{σ(2j)} = ω^{σ(2j+1)}. So the input modulus polynomial is z^{2m} − ω^{σ(j)} and the output modulus polynomials are z^m − ω^{σ(2j)} and z^m − ω^{σ(2j+1)}. At the bottom of the tree are z − ω^{σ(j)} for all 0 ≤ j < n.
The modulus polynomial tree when n = 16 is given by:

[Figure: modulus polynomial tree for n = 16. The root is z^16 − 1; its children are z^8 − ω^0 and z^8 − ω^8; each node z^{2m} − ω^{σ(j)} has children z^m − ω^{σ(2j)} and z^m − ω^{σ(2j+1)}, down to the leaves z − ω^{σ(j)} for 0 ≤ j < 16.]
The reduction step is simple to perform with these modulus polynomials. Split the input into two blocks of size m by writing f mod (z^{2m} − ω^{σ(j)}) = fA·z^m + fB. Then the outputs are given by fY = ω^{σ(2j)}·fA + fB and fZ = −ω^{σ(2j)}·fA + fB. The reduction step can also be expressed in matrix form as

[ fY ]   [  ω^{σ(2j)}   1 ] [ fA ]
[ fZ ] = [ −ω^{σ(2j)}   1 ] [ fB ]
Engineers often represent the reduction step as a picture similar to the one below and call it a butterfly operation.

[Figure: butterfly operation with inputs fA and fB, twiddle factor ω^{σ(2j)}, and outputs fY and fZ.]
EXAMPLE
fY = b·fA + fB = (I·z + I) + (z + 0) = (1 + I)·z + I
fZ = −b·fA + fB = −(I·z + I) + (z + 0) = (1 − I)·z − I

[Figure: the corresponding butterfly operations, producing the coefficients I and 1 + I of fY and the coefficients −I and 1 − I of fZ.]
Suppose that we want to compute the FFT of a polynomial f of degree less than n = 2^k. Then f is equal to f mod (z^n − 1). We will recursively apply the reduction step with appropriate selections of m and b. After all of the reduction steps have been completed with input size 2m = 2, then we have f mod (z − ω^{σ(j)}) = f(ω^{σ(j)}) for all j < n, i.e. the desired FFT of f. In terms of butterfly operations, the FFT of size 8 is expressed as:
[Figure: butterfly diagram of the classical radix-2 FFT of size 8, with inputs f0, . . . , f7 and outputs f(ω^0), f(ω^4), f(ω^2), f(ω^6), f(ω^1), f(ω^5), f(ω^3), f(ω^7).]
[Figure: the intermediate results of the classical radix-2 FFT of f(z) = z^7 + 2z^6 + 3z^5 + z^4 + 2z^3 + 3z^2 + 2z + 1, showing intermediate polynomials such as 3z^3 + 5z^2 + 5z + 2 and 8z + 7, the values (1 + I)·z + I and (1 − I)·z − I, and the outputs, including 15.]
The above example was specially designed so that the exact answers could fit
nicely in the boxes for each step. Real-world problems use decimal approximations
instead as illustrated by the following example.
EXAMPLE
f(z) mod (z^2 + 1) = 2z + 3
f(z) mod (z^2 − I) = (1 + I)·z + (2 + I)
f(z) mod (z^2 + I) = (1 − I)·z + (2 − I)

Finally,
f(ω^0) = 4
f(ω^1) ≈ 2 + 2.4142·I
f(ω^3) ≈ 2 + 0.4142·I
f(ω^5) ≈ 2 − 0.4142·I
f(ω^7) ≈ 2 − 2.4142·I

[Figure: the remaining intermediate reductions and output values of this example.]
One can follow the steps of the algorithm using the two examples above to understand how the algorithm works.
Algorithm: Classical radix-2 FFT
Input: f mod (z^{2m} − ω^{σ(j)}), a polynomial of degree less than 2m with complex number coefficients; an nth root of unity ω. Here, m is a power of 2 where 2m ≤ n.
Output: f(ω^{σ(j·2m+0)}), f(ω^{σ(j·2m+1)}), . . . , f(ω^{σ(j·2m+2m−1)})

0. If (2m) = 1 then return f mod (z − ω^{σ(j)}) = f(ω^{σ(j)})
1. Split f mod (z^{2m} − ω^{σ(j)}) into two blocks fA and fB, each of size m, such that f mod (z^{2m} − ω^{σ(j)}) = fA·z^m + fB
2. Compute f mod (z^m − ω^{σ(2j)}) = fA·ω^{σ(2j)} + fB
3. Compute f mod (z^m − ω^{σ(2j+1)}) = −fA·ω^{σ(2j)} + fB
4. Compute the FFT of f mod (z^m − ω^{σ(2j)}) to obtain f(ω^{σ(j·2m+0)}), f(ω^{σ(j·2m+1)}), . . . , f(ω^{σ(j·2m+m−1)})
5. Compute the FFT of f mod (z^m − ω^{σ(2j+1)}) to obtain f(ω^{σ(j·2m+m)}), f(ω^{σ(j·2m+m+1)}), . . . , f(ω^{σ(j·2m+2m−1)})
6. Return f(ω^{σ(j·2m+0)}), f(ω^{σ(j·2m+1)}), . . . , f(ω^{σ(j·2m+2m−1)})

Figure 1. Pseudocode for classical radix-2 FFT
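The pseudocode of Figure 1 can be turned into a short recursive Python sketch. The helper `sigma` and the argument ordering are our own choices, not the text's:

```python
import cmath

def sigma(j, n):
    """Binary reversal of j with respect to n (a power of 2)."""
    k = n.bit_length() - 1
    return int(format(j, '0%db' % k)[::-1], 2)

def classical_fft(f, w, n, j=0):
    """Classical radix-2 FFT.  f is the coefficient list (constant term
    first) of f mod (z^(2m) - w^sigma(j)); w is a primitive nth root of
    unity.  Returns [f(w^sigma(j*2m)), ..., f(w^sigma(j*2m + 2m - 1))]."""
    two_m = len(f)
    if two_m == 1:
        return [f[0]]                   # f mod (z - w^sigma(j)) = f(w^sigma(j))
    m = two_m // 2
    fB, fA = f[:m], f[m:]               # f = fA * z^m + fB
    b = w ** sigma(2 * j, n)
    fY = [b * a + c for a, c in zip(fA, fB)]     # f mod (z^m - w^sigma(2j))
    fZ = [-b * a + c for a, c in zip(fA, fB)]    # f mod (z^m - w^sigma(2j+1))
    return (classical_fft(fY, w, n, 2 * j) +
            classical_fft(fZ, w, n, 2 * j + 1))

n = 8
w = cmath.exp(2 * cmath.pi * 1j / n)
values = classical_fft([1, 2, 3, 2, 1, 3, 2, 1], w, n)
```

Note that the outputs appear in bit-reversed order, matching the leaves z − ω^{σ(j)} of the modulus polynomial tree.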
Combining all of the results of the analysis of this algorithm for the case that j is never equal to 0, we can express the number of multiplications required using the formula

M(2m) = 2·M(m) + m

At this point, it is more convenient to express the input as size n = 2m. With this change of variables, the above equation becomes

M(n) = 2·M(n/2) + n/2
This is called a recurrence relation because it expresses the operation count in terms
of itself with a different input size.
To solve a recurrence relation, we need something called an initial condition.
This is a solution of the recurrence relation for some value of the input. In the case
of the multiplication count, we know that no multiplications are required when
n = 1, i.e. M (1) = 0.
There are several techniques that can be used to solve a recurrence relation.
A branch of mathematics called combinatorics involves the study of something
called generating functions, which are a powerful tool that can be used to solve
recurrence relations. The interested reader can study [32] to learn more about this
method. A simpler, but more limited method of solving recurrence relations is with
a technique called substitution.
Suppose that we let m = n/4 in the above recurrence relation. In this case, we have

M(n/2) = 2·M(n/4) + n/4
By replacing M(n/2) with the above expression in the formula derived for M(n), we obtain

M(n) = 2·M(n/2) + n/2
     = 2·(2·M(n/4) + n/4) + n/2
     = 2^2·M(n/4) + 2·(n/2)
The technique of substitution repeats this procedure until the initial condition is
reached. Usually, one has to discover a pattern to reduce the recurrence relation
to this point. The initial condition is then substituted into the formula to obtain
a closed-form solution for the recurrence relation. If n = 2^k for some k, i.e. k = log_2(n), then the derivation of the multiplication count for the classical radix-2 FFT algorithm continues as follows:
M(n) = 2^2·M(n/4) + 2·(n/2)
     = 2^2·(2·M(n/8) + n/8) + 2·(n/2)
     = 2^3·M(n/8) + 3·(n/2)
     = 2^4·M(n/16) + 4·(n/2)
     = 2^5·M(n/32) + 5·(n/2)
     = · · ·
     = 2^k·M(n/2^k) + k·(n/2)
     = 2^k·M(1) + k·(n/2)
     = 2^k·0 + k·(n/2)
     = (1/2)·n·log_2(n)
However, no multiplications are required in a reduction step where j = 0, since then b = ω^{σ(0)} = 1. Let Ms(n) denote the number of multiplications saved in this way. Since one of the two outputs of a reduction step with j = 0 again has j = 0, these savings satisfy the recurrence relation

Ms(n) = Ms(n/2) + n/2
Ms(n) = Ms(n/2) + n/2
      = Ms(n/4) + n/4 + n/2
      = Ms(n/8) + n/8 + n/4 + n/2
      = · · ·
      = Ms(n/2^k) + n/2^k + · · · + n/8 + n/4 + n/2
      = Ms(1) + n·((1/2)^k + · · · + (1/2)^3 + (1/2)^2 + 1/2)
      = 0 + n·((1/2)^k + · · · + (1/2)^3 + (1/2)^2 + 1/2)
      = n · Σ_{d=1}^{k} (1/2)^d
To complete the solution of this recurrence relation, we need the following result:

Geometric series.
If a is any real number other than 1, then

Σ_{d=L}^{k} a^d = (a^{k+1} − a^L) / (a − 1)

If a = 1, then

Σ_{d=L}^{k} a^d = k − L + 1

To resolve the summation in the above recurrence relation, we let a = 1/2 and L = 1. So,
Σ_{d=1}^{k} (1/2)^d = ((1/2)^{k+1} − 1/2) / (1/2 − 1)
                    = ((1/2)^{k+1} − 1/2) / (−1/2)
                    = 1 − (1/2)^k
                    = 1 − 1/n

Ms(n) = n · Σ_{d=1}^{k} (1/2)^d
      = n · (1 − 1/n)
      = n − 1
Subtracting these savings from the multiplication count derived above gives the total number of multiplications for the classical radix-2 FFT algorithm:

M(n) = (1/2)·n·log_2(n) − n + 1
We can follow a similar procedure to determine the number of additions for the classical radix-2 algorithm. The recurrence relation

A(n) = 2·A(n/2) + n

can be used to model this operation count. This formula is valid for all values of j, so it is not necessary to solve a second recurrence relation for this case. The number of additions required is given by the closed-form formula

A(n) = n·log_2(n)
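The closed forms above are easy to check numerically against the recurrences. A quick sketch (the function names are ours):

```python
def M_no_savings(n):
    """M(n) = 2*M(n/2) + n/2 with M(1) = 0 (j never equal to 0)."""
    return 0 if n == 1 else 2 * M_no_savings(n // 2) + n // 2

def Ms(n):
    """Ms(n) = Ms(n/2) + n/2 with Ms(1) = 0 (savings when j = 0)."""
    return 0 if n == 1 else Ms(n // 2) + n // 2

def A(n):
    """A(n) = 2*A(n/2) + n with A(1) = 0."""
    return 0 if n == 1 else 2 * A(n // 2) + n

for k in range(1, 12):
    n = 2 ** k
    assert M_no_savings(n) == n * k // 2                  # (1/2) n log2(n)
    assert Ms(n) == n - 1                                 # n - 1 saved multiplications
    assert M_no_savings(n) - Ms(n) == n * k // 2 - n + 1  # (1/2) n log2(n) - n + 1
    assert A(n) == n * k                                  # n log2(n)
```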
The techniques used in this section are sufficient to derive the operation counts
for most of the other algorithms in this text. These other operation count derivations will be left as exercises.
EXAMPLE
Twisting f(z) = z^3 + z^2 − z by θ = 1∠45° gives

f(θ·z) = θ^3·z^3 + θ^2·z^2 − θ·z + 0
       = (−√2/2 + I·√2/2)·z^3 + I·z^2 − (√2/2 + I·√2/2)·z
Assuming that the powers of θ are precomputed, the twisting of the polynomial f(z) of degree less than m requires m − 1 multiplications and no additions. A multiplication is not needed for the constant term of f(z).
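Computing a twisted polynomial is just a coefficient-wise scaling, as in this sketch (assuming the powers of θ are available; the name `twist` is ours):

```python
import cmath

def twist(f, theta):
    """Coefficients of f(theta*z) from those of f(z) (constant term
    first): the degree-t coefficient is scaled by theta**t.  The
    constant term is untouched, so len(f) - 1 multiplications suffice."""
    return [c * theta ** t for t, c in enumerate(f)]

# f(z) = z^3 + z^2 - z twisted by theta = 1 angle 45 degrees:
theta = cmath.exp(1j * cmath.pi / 4)
g = twist([0, -1, 1, 1], theta)   # coefficients of f(theta*z)
```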
So now, both the first and second outputs of the twisted radix-2 reduction step are in the proper form to be used as inputs for another application of the reduction step. The process of applying the reduction step and twisting the polynomial can also be expressed using butterfly operations. The following figure gives the butterfly operation which is part of an FFT of size n = 16. The input to this reduction step is of degree less than 8. So m = 4 and the value used to compute the twisted polynomial is θ = ω^{σ(1)/m} = ω^{(n/2)/m} = ω^{(16/2)/4} = ω^2.
[Figure: butterfly operation with inputs f0, . . . , f7 and outputs (fY)0, . . . , (fY)3 and (fZ)0, . . . , (fZ)3.]
[Figure: modulus polynomial tree for the twisted radix-2 FFT with n = 8. The root is z^8 − 1; each right branch is twisted before the next reduction step, with the transformed variables related by z_(3) = ω^2·z_(1), z_(5) = ω^4·z_(1), and z_(7) = ω^4·z_(3).]
One will probably agree that it is too complicated to create a new variable every time the complex plane is transformed. Therefore, we need a different notation to keep track of the transformations.

From this point forward, the inputs and outputs of the reduction steps will be expressed in terms of the original untransformed complex plane using the notation f(ω^c·z). This notation means to replace each z in the original polynomial f(z) with ω^c·z and simplify this into a new polynomial that is a function of z, i.e. compute a twisted polynomial with θ = ω^c. However, the z of the twisted polynomial is associated with a complex plane that has been rotated by ω^c. This may seem confusing at first, but it is better than inventing a new variable every time the complex plane is transformed. The main thing to remember is that when one sees the notation f(ω^c·z), this means a polynomial that is a function of z in a transformed complex plane where the axis system has been rotated counterclockwise by the negative of the angle of ω^c.

The following example may help to clarify the new notation.
EXAMPLE.
We can now present the reduction step of the radix-2 twisted FFT algorithm in terms of the original polynomial and the new notation. It can be shown that if the input to the reduction step is f(ω^{σ(j)/(2m)}·z) mod (z^{2m} − 1) for some j, then the outputs are given by f(ω^{σ(j)/(2m)}·z) mod (z^m − 1) = f(ω^{σ(2j)/m}·z) mod (z^m − 1) and, after twisting the second output by ω^{σ(1)/m}, f(ω^{σ(2j+1)/m}·z) mod (z^m − 1). These expressions can be determined by carefully keeping track of the cumulative rotations of the complex plane using the new notation and using the fact that σ(j)/(2m) = σ(2j)/m.

The twisted radix-2 FFT algorithm is initialized with f(z), which equals f(ω^0·z) mod (z^n − 1) if f has degree less than n. By recursively applying the reduction step to f(z), we obtain f(ω^{σ(j)}·z) mod (z − 1) = f(ω^{σ(j)}) for all j in the range 0 ≤ j < n. This is the desired FFT of f(z).
The following figure shows how every intermediate result of this FFT calculation relates to the original input polynomial f(z) for n = 8.

[Figure: f(z) mod (z^8 − 1) is reduced to f(z) mod (z^4 − 1) and f(ω·z) mod (z^4 − 1); these are reduced to f(z) mod (z^2 − 1), f(ω^2·z) mod (z^2 − 1), f(ω·z) mod (z^2 − 1), and f(ω^3·z) mod (z^2 − 1); the leaves are f(ω^0), f(ω^4), f(ω^2), f(ω^6), f(ω^1), f(ω^5), f(ω^3), f(ω^7). A butterfly diagram with inputs f0, . . . , f7 produces these outputs.]
The next diagram shows the intermediate results of the FFT of f(z) = z^7 + 2z^6 + 3z^5 + z^4 + 2z^3 + 3z^2 + 2z + 1 using the twisted method. One can compare these results with those given for the computation of the FFT using the classical algorithm provided in the previous section. Here, ω = √2/2 + I·√2/2 and ω^3 = −√2/2 + I·√2/2.

[Figure: intermediate results of the twisted FFT, including 3z^3 + 5z^2 + 5z + 2, 8z + 7, and the output 15.]
The operation counts for the twisted radix-2 FFT algorithm satisfy the recurrence relations

M(n) = 2·M(n/2) + n/2 − 1
A(n) = 2·A(n/2) + n

with closed-form solutions

M(n) = (1/2)·n·log_2(n) − n + 1
A(n) = n·log_2(n)
A multiplication by ±I requires no real arithmetic:

I·(A + I·B) = −B + I·A
−I·(A + I·B) = B + I·(−A)

so the components of the multiplicand are merely swapped and possibly negated.
The σ function has the following additional properties that will be useful in the development of the radix-4 algorithm:

Additional properties of the binary reversal function.
(4) σ(4j+1) = σ(4j) + n/2 for j < n/4
(5) σ(4j+2) = σ(4j) + n/4 for j < n/4
(6) σ(4j+3) = σ(4j) + 3n/4 for j < n/4
(7) σ(j) = 4·σ(4j) for j < n/4

These properties are left as exercises for the reader to verify. It follows from these properties that for any j < n/4,

ω^{σ(4j+1)} = −ω^{σ(4j)}
ω^{σ(4j+2)} = I·ω^{σ(4j)}
ω^{σ(4j+3)} = −I·ω^{σ(4j)}

The reduction step of the classical radix-4 FFT algorithm receives as input f(z) mod (z^{4m} − b^4) and produces as output

fW = f(z) mod (z^m − b)
fX = f(z) mod (z^m + b)
fY = f(z) mod (z^m − I·b)
fZ = f(z) mod (z^m + I·b)

where b = ω^{σ(4j)} for some j < n/(4m). By the above properties, the input can also be expressed as f(z) mod (z^{4m} − ω^{σ(j)}), and the outputs can be expressed as

fW = f(z) mod (z^m − ω^{σ(4j)})
fX = f(z) mod (z^m − ω^{σ(4j+1)})
fY = f(z) mod (z^m − ω^{σ(4j+2)})
fZ = f(z) mod (z^m − ω^{σ(4j+3)})
Each radix-4 reduction step essentially does two levels of radix-2 reduction steps.
A modulus polynomial tree for n = 16 is given by:
[Figure: modulus polynomial tree for the radix-4 FFT with n = 16. The root z^16 − 1 has children z^4 − 1, z^4 + 1, z^4 − ω^4, and z^4 − ω^12, each of which has four children of the form z − ω^{σ(j)}.]

One can verify that the bottom level results of the tree shown above are the same as the tree produced when the radix-2 algorithm is used.
The reduction step of the radix-4 FFT is fairly simple to perform as well. Split f into four blocks of size m by writing f = fA·z^{3m} + fB·z^{2m} + fC·z^m + fD. Then the four outputs of the reduction step {fW, fX, fY, fZ} are given by the matrix computation

[ fW ]   [  1    1    1   1 ] [ ω^{3σ(4j)}·fA ]
[ fX ] = [ −1    1   −1   1 ] [ ω^{2σ(4j)}·fB ]
[ fY ]   [ −I   −1    I   1 ] [ ω^{σ(4j)}·fC  ]
[ fZ ]   [  I   −1   −I   1 ] [ fD            ]
[Figure: radix-4 butterfly operation with inputs fA, fB, fC, fD, twiddle factors ω^{σ(4j)}, ω^{2σ(4j)}, ω^{3σ(4j)}, and outputs fW, fX, fY, fZ.]
The multiplication by ±I in the reduction step does not cost any arithmetic operations. Even the inversion given in the result at the beginning of this section does not require any computational effort if it is combined with the subtraction that immediately precedes this multiplication.
EXAMPLE.
If k = log_2(n) is even, then we can start with f = f mod (z^n − 1) and apply the radix-4 reduction steps until we have f mod (z − ω^{σ(j)}) = f(ω^{σ(j)}) for all j < n, i.e. the desired FFT of f. If k is odd, then one level of reduction steps from the classical radix-2 FFT is needed to complete the FFT computation. The following butterfly diagram illustrates this case for an FFT of size 8. The dotted lines show the location of the radix-4 reduction step in this computation. The radix-2 reduction steps are at the bottom of the diagram.
[Figure: butterfly diagram for an FFT of size 8 combining radix-4 and radix-2 reduction steps, with inputs f0, . . . , f7 and outputs f(ω^0), f(ω^4), f(ω^2), f(ω^6), f(ω^1), f(ω^5), f(ω^3), f(ω^7).]
Pseudocode for this FFT algorithm is given in Figure 3. We leave the analysis
and operation count of this algorithm as an exercise for the reader. In this algorithm, one must remember to subtract multiplications in the case where j = 0 and
a slightly simplified reduction step is used.
The number of operations required to compute an FFT using the classical radix-4 algorithm is given by

M(n) = (3/8)·n·log_2(n) − n + 1 if log_2(n) is even
M(n) = (3/8)·n·log_2(n) − (7/8)·n + 1 if log_2(n) is odd
A(n) = n·log_2(n)

This algorithm has the same addition count as the classical radix-2 FFT algorithm, but the multiplication count has been significantly reduced.
When j = 0, the reduction step simplifies to

[ fW ]   [  1    1    1   1 ] [ fA ]
[ fX ] = [ −1    1   −1   1 ] [ fB ]
[ fY ]   [ −I   −1    I   1 ] [ fC ]
[ fZ ]   [  I   −1   −I   1 ] [ fD ]
The significance of this case is that no complex multiplications are needed in the
transformation other than a multiplication by I which simply involves swapping
components.
The input to the twisted radix-4 FFT reduction step is always of the form f(ω^c·z) mod (z^{4m} − 1), and the output is always f(ω^c·z) mod (z^m − 1), f(ω^c·z) mod (z^m + 1), f(ω^c·z) mod (z^m − I), and f(ω^c·z) mod (z^m + I).
This is implemented by splitting the input polynomial into four blocks of size m by writing f(ω^c·z) = fA·z^{3m} + fB·z^{2m} + fC·z^m + fD, and then using the butterfly operation given in the previous section to produce {fW, fX, fY, fZ}. Again, j = 0, and so no multiplications are required at the beginning of the reduction step.
Now, fW is already in the form needed to apply the twisted FFT reduction step, but fX, fY, and fZ are not in the required form. A transformation of the complex plane implemented through the twisted polynomial can produce a result that is in the required form for each of these cases. It is left as an exercise for the reader to verify the value of the rotation needed in the twisted polynomial calculation to produce the desired effect for each of these three outputs.

The following figure illustrates the transformations involved with the same reduction steps considered in the previous section for n = 16.
[Figure: the twisted radix-4 FFT transformations for n = 16. The input f(z) mod (z^16 − 1) is reduced to f(z) mod (z^4 − 1), f(ω^2·z) mod (z^4 − 1), f(ω·z) mod (z^4 − 1), and f(ω^3·z) mod (z^4 − 1); the leaf evaluations include f(ω), f(ω^9), f(ω^5), f(ω^13), f(ω^3), f(ω^11), f(ω^7), and f(ω^15). A size-8 butterfly diagram with inputs f0, . . . , f7 and outputs f(ω^0), f(ω^4), f(ω^2), f(ω^6), f(ω^1), f(ω^5), f(ω^3), f(ω^7) follows.]
EXAMPLE.
As an exercise, the reader can write pseudocode to implement the twisted radix-4 FFT algorithm and then analyze it. One can compare the differences between the two radix-2 algorithms to help complete this task. As one might suspect, the operation counts of the twisted radix-4 FFT are the same as those of the classical radix-4 FFT.
[Figure: the four primitive 8th roots of unity, ω = 1∠45°, ω^3 = 1∠135°, ω^5 = 1∠225°, ω^7 = 1∠315°.]

The following result shows that a multiplication by one of these 8th roots of unity is cheaper than a multiplication by an arbitrary complex number.
ω·(A + I·B) = (√2/2)·(A − B) + I·(√2/2)·(A + B)

Similarly, multiplication by the other primitive 8th roots of unity ω^3, ω^5, and ω^7 is given by

ω^3·(A + I·B) = (√2/2)·(−A − B) + I·(√2/2)·(A − B)
ω^5·(A + I·B) = (√2/2)·(B − A) + I·(√2/2)·(−A − B)
ω^7·(A + I·B) = (√2/2)·(A + B) + I·(√2/2)·(B − A)

Note that each product requires 2 real multiplications and 2 real additions (assuming that √2/2 has been precomputed).
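This can be checked directly in Python; the formula for ω uses only the precomputed constant √2/2 (the function name is ours):

```python
import cmath, math

C = math.sqrt(2) / 2    # precomputed sqrt(2)/2

def mul_by_omega(A, B):
    """(A + I*B) times omega = sqrt(2)/2 + I*sqrt(2)/2, using
    2 real multiplications and 2 real additions."""
    return C * (A - B), C * (A + B)    # (real part, imaginary part)

re, im = mul_by_omega(3.0, 1.0)
direct = (3.0 + 1.0j) * cmath.exp(1j * math.pi / 4)
assert abs(complex(re, im) - direct) < 1e-12
```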
The reduction step of the classical radix-8 algorithm receives as input f(z) mod (z^{8m} − b^8) and produces as output

fS = f(z) mod (z^m − b)
fT = f(z) mod (z^m + b)
fU = f(z) mod (z^m − I·b)
fV = f(z) mod (z^m + I·b)
fW = f(z) mod (z^m − ω·b)
fX = f(z) mod (z^m + ω·b)
fY = f(z) mod (z^m − ω^3·b)
fZ = f(z) mod (z^m + ω^3·b)
where b = ω^{σ(8j)} for some j < n/(8m). A partial modulus polynomial tree for the case n = 64 is given in the figure below.

[Figure: partial modulus polynomial tree for n = 64, with root z^64 − 1, nodes z^8 − ω^0, z^8 − ω^32, z^8 − ω^16, z^8 − ω^48, z^8 − ω^8, z^8 − ω^40, z^8 − ω^24, z^8 − ω^56, and leaves of the form z − ω^{σ(j)}.]
The reduction step for the classical radix-8 algorithm is given by the matrix transformation

[ fS ]   [  1     1     1    1    1     1     1    1 ] [ ω^{7σ(8j)}·fA ]
[ fT ]   [ −1     1    −1    1   −1     1    −1    1 ] [ ω^{6σ(8j)}·fB ]
[ fU ]   [ −I    −1     I    1   −I    −1     I    1 ] [ ω^{5σ(8j)}·fC ]
[ fV ] = [  I    −1    −I    1    I    −1    −I    1 ] [ ω^{4σ(8j)}·fD ]
[ fW ]   [  ω^7  −I    ω^5  −1    ω^3   I     ω    1 ] [ ω^{3σ(8j)}·fE ]
[ fX ]   [ −ω^7  −I   −ω^5  −1   −ω^3   I    −ω    1 ] [ ω^{2σ(8j)}·fF ]
[ fY ]   [  ω^5   I    ω^7  −1    ω    −I     ω^3  1 ] [ ω^{σ(8j)}·fG  ]
[ fZ ]   [ −ω^5   I   −ω^7  −1   −ω    −I    −ω^3  1 ] [ fH            ]

where the input to the reduction step has been subdivided into eight blocks of size m, i.e.

f(z) mod (z^{8m} − b^8) = fA·z^{7m} + fB·z^{6m} + fC·z^{5m} + fD·z^{4m} + fE·z^{3m} + fF·z^{2m} + fG·z^m + fH
However, as with the radix-4 algorithms, the matrix computation does not describe the best way to perform the reduction step. Instead, the butterfly diagram below illustrates the most efficient method for calculating the classical radix-8 FFT reduction step.

[Figure: radix-8 butterfly operation with inputs fA, . . . , fH, twiddle factors ω^{σ(8j)}, ω^{2σ(8j)}, . . . , ω^{7σ(8j)}, and outputs fS, fT, fU, fV, fW, fX, fY, fZ.]
The rest of the algorithm details are similar to the classical radix-2 and radix-4 algorithms and are left as an exercise for the reader. One can show that a total of

M(n) = (3/8)·n·log_2(n) − n + 1 if log_2(n) mod 3 = 0
M(n) = (3/8)·n·log_2(n) − (7/8)·n + 1 if log_2(n) mod 3 = 1
M(n) = (3/8)·n·log_2(n) − n + 1 if log_2(n) mod 3 = 2
A(n) = n·log_2(n)
operations are needed to implement the classical radix-8 algorithm. This does not
appear to be an improvement compared to the radix-4 algorithms, but we have not
yet accounted for the special multiplications by the primitive 8th roots of unity.
One can show that

M8(n) = (1/12)·n·log_2(n) if log_2(n) mod 3 = 0
M8(n) = (1/12)·n·log_2(n) − (1/12)·n if log_2(n) mod 3 = 1
M8(n) = (1/12)·n·log_2(n) − (1/6)·n if log_2(n) mod 3 = 2

of these multiplications involve the primitive 8th roots of unity.
A multiplication by an arbitrary precomputed complex number A + I·B can be implemented using only three real multiplications. The product is given by

(A + I·B)·(C + I·D) = U + I·V

where

U = W − T·V
V = D + B·W
W = C − T·D

and T = (1 − A)/B = B/(1 + A) is precomputed.
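A sketch of this three-multiplication product in Python, with T computed from A and B (B ≠ 0 is assumed; in an actual FFT, T would be precomputed):

```python
import cmath, math

def mul3(A, B, C, D):
    """(A + I*B)*(C + I*D) for a unit constant A + I*B (A^2 + B^2 = 1),
    using the three real multiplications T*D, B*W, and T*V,
    where T = (1 - A)/B = B/(1 + A)."""
    T = (1 - A) / B          # precomputed in an actual FFT
    W = C - T * D
    V = D + B * W
    U = W - T * V
    return U, V              # U + I*V is the product

theta = 0.7
A, B = math.cos(theta), math.sin(theta)
U, V = mul3(A, B, 2.0, 5.0)
assert abs(complex(U, V) - (A + 1j * B) * (2 + 5j)) < 1e-12
```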
Using this technique for the multiplications in the classical radix-8 algorithm, the real operation counts are

MR(n) = (4/3)·n·log_2(n) − 4·n + 4
AR(n) = (11/4)·n·log_2(n) − 2·n + 2

By determining the number of real operations needed for the radix-4 algorithms, one can show that the classical radix-8 algorithm requires (1/6)·n·log_2(n) fewer real multiplications compared to the radix-4 algorithms. The savings for other input sizes are close to the above results, but not quite as attractive. These details are left as exercises.
A twisted radix-8 FFT algorithm can be constructed by following the same
techniques used to convert the classical radix-2 and radix-4 algorithms into the
twisted versions. This also is left as an exercise for the reader. As one might
suspect, the twisted radix-8 algorithm has the same operation count as the classical
radix-8 algorithm.
8. Split-radix FFT
It is possible to improve upon the counts of the radix-8 algorithm by constructing an algorithm which combines the radix-2 and radix-4 FFT reduction steps. This
can be done for both the classical and twisted formulations of the algorithm. Here,
we will show how to develop such an algorithm for the twisted case. This algorithm
was introduced in [38], but first clearly described and named over 15 years later in
[13].
Consider the computation of f(ω^c·z) mod (z^{8m} − 1) using the twisted radix-4 algorithm, where 0 ≤ c < n/(8m). The reduction step transforms this input polynomial into f(ω^c·z) mod (z^{2m} − 1), f(ω^c·z) mod (z^{2m} + 1), f(ω^c·z) mod (z^{2m} − I), and f(ω^c·z) mod (z^{2m} + I). The final three of these outputs are then twisted in order to continue using the simplified reduction step. However, it is not necessary to twist f(ω^c·z) mod (z^{2m} + 1) at this point. Rather, we can reduce this result into f(ω^c·z) mod (z^m − I) and f(ω^c·z) mod (z^m + I) without any multiplications. The radix-4 algorithm does not exploit this situation and thus there is room for improvement. It turns out that even greater savings can be achieved if we reduce f(ω^c·z) into f(ω^c·z) mod (z^{4m} − 1) instead, and then reduce f(ω^c·z) mod (z^{4m} + 1) into f(ω^c·z) mod (z^{2m} − I) and f(ω^c·z) mod (z^{2m} + I).
The split-radix algorithm is based on a reduction step which receives as input f(ω^c·z) mod (z^{4m} − 1), where 0 ≤ c < n/(4m), and produces as output

fW·z^m + fX = f(ω^c·z) mod (z^{2m} − 1)
fY = f(ω^c·z) mod (z^m − I)
fZ = f(ω^c·z) mod (z^m + I)
The algorithm is called split-radix because this reduction step can be viewed as a mixture of the radix-2 and radix-4 reduction steps. Here, fY and fZ need to be twisted after the reduction step, while the split-radix reduction step can be directly applied to fW·z^m + fX. A modulus polynomial tree is given below for the split-radix algorithm when n = 16.
[Figure: modulus polynomial tree for the split-radix algorithm when n = 16, with root z^16 − ω^0 and nodes such as z^8 − ω^0, z^4 − ω^0, z^4 − ω^4, z^4 − ω^12, z^2 − ω^0, z^2 − ω^4, z^2 − ω^12, z^2 − ω^2, z^2 − ω^6, down to leaves z − ω^{σ(j)}.]
Writing the input as fA·z^{3m} + fB·z^{2m} + fC·z^m + fD, the outputs of the reduction step are given by

fW = fA + fC
fX = fB + fD
fY = −I·fA − fB + I·fC + fD
fZ = I·fA − fB − I·fC + fD
As with the other algorithms, these formulas do not show the most efficient method of performing the computations. The following butterfly diagram shows how to perform the computations using only 6m additions.

[Figure: split-radix butterfly operation with inputs fA, fB, fC, fD and outputs fW, fX, fY, fZ.]
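The split-radix reduction step itself can be sketched as follows (coefficient lists, constant term first; the function name is ours). Note that fY and fZ share the intermediate sums fD − fB and I·(fA − fC):

```python
def split_radix_step(f):
    """Reduce f mod (z^(4m) - 1) (a list of length 4m) to
    f mod (z^(2m) - 1), f mod (z^m - I), and f mod (z^m + I),
    using only additions and multiplications by I."""
    m = len(f) // 4
    fD, fC, fB, fA = f[:m], f[m:2*m], f[2*m:3*m], f[3*m:]
    fWX = [x + y for x, y in zip(f[:2*m], f[2*m:])]   # (fA+fC) z^m + (fB+fD)
    s = [d - b for b, d in zip(fB, fD)]               # fD - fB
    t = [1j * (a - c) for a, c in zip(fA, fC)]        # I*(fA - fC)
    fY = [x - y for x, y in zip(s, t)]                # (fD - fB) - I*(fA - fC)
    fZ = [x + y for x, y in zip(s, t)]                # (fD - fB) + I*(fA - fC)
    return fWX, fY, fZ

# f(z) = 4z^3 + 3z^2 + 2z + 1 mod (z^4 - 1), so m = 1:
fWX, fY, fZ = split_radix_step([1, 2, 3, 4])
```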
After the reduction step, fY should be twisted by ω^{σ(2)/m} and fZ can be twisted by ω^{σ(3)/m} so that the split-radix algorithm reduction step can be applied to these results. The output fW·z^m + fX is already in the proper form.

It can be shown that ω^{−σ(2)/m} can also be used to compute the twisted polynomial for fZ. This observation was the basis for an algorithm called the conjugate-pair algorithm [27]. The only difference between this algorithm and the split-radix reduction step described above is this different transformation used for fZ. Originally, it was claimed that this reduces the number of operations of the split-radix algorithm, but it was later demonstrated ([21], [28]) that the two algorithms require the same number of operations. However, the conjugate-pair version requires less storage by exploiting the fact that ω^{σ(2)/m} and ω^{−σ(2)/m} are complex conjugates.
The split-radix algorithm receives as input f = f mod (z^n − 1). By recursively applying the reduction step, we obtain f(ω^{σ(j)}·z) mod (z^2 − 1) for some values of j and f(ω^{σ(j)}·z) mod (z − 1) for other values of j. Reduction steps from the twisted radix-2 algorithm can be used to complete the computation of the FFT. The following butterfly diagram shows how an FFT of size 8 can be computed using the conjugate-pair version of the split-radix FFT. Dotted lines have been added to the diagram to show the locations of the split-radix reduction steps.
[Figure: butterfly diagram of a size-8 conjugate-pair split-radix FFT, with inputs f0, . . . , f7 and outputs f(ω^0), f(ω^4), f(ω^2), f(ω^6), f(ω^1), f(ω^5), f(ω^3), f(ω^7).]
Pseudocode for the conjugate-pair version of the split-radix FFT algorithm is given in Figure 4. It is left as an exercise for the reader to analyze this algorithm. The resulting recurrence relations are given by

M(n) = M(n/2) + 2·M(n/4) + n/2 − 2
A(n) = A(n/2) + 2·A(n/4) + (3/2)·n
These recurrence relations have the closed-form solutions

M(n) = (1/3)·n·log_2(n) − (8/9)·n + 8/9 if log_2(n) is even
M(n) = (1/3)·n·log_2(n) − (8/9)·n + 10/9 if log_2(n) is odd
A(n) = n·log_2(n)
The number of multiplications by primitive 8th roots of unity satisfies the recurrence relation

M8(n) = M8(n/2) + 2·M8(n/4) + 2
This recurrence has the closed-form solution

M8(n) = (1/3)·n − 4/3 if log_2(n) is even
M8(n) = (1/3)·n − 2/3 if log_2(n) is odd
In terms of real operations, one can show that the split-radix algorithm requires

MR(n) = (4/3)·n·log_2(n) − (38/9)·n + (2/9)·(−1)^{log_2(n)} + 6

real multiplications and

AR(n) = (8/3)·n·log_2(n) − (16/9)·n + (2/9)·(−1)^{log_2(n)} + 2

real additions.
The results for input sizes which are a power of 4 are slightly more attractive than for other input sizes. In any event, the split-radix algorithm requires fewer operations than the radix-8 algorithm.
[Figure: the unit circle together with a circumscribing square of points.]

One can show that multiplication by any point on this square requires the same number of multiplications as the special primitive 8th roots of unity considered earlier in this chapter.
EXAMPLE
(A + I·B)·(1 + I·tan(22.5°)) = (A − tan(22.5°)·B) + I·(B + tan(22.5°)·A)
The new algorithm works by replacing the points on the circle used with the split-radix algorithm with the points on the square instead. This reduces the number of operations needed to implement the reduction steps, but distorts all of the answers into incorrect results.
The trick is to carefully keep track of how much each result has been distorted
and then unscale the results later in the algorithm. The scaling factors used in the
actual algorithm are somewhat more complicated, but allow the multiplications
by points on the square to be used as often as possible. The places where the
results are unscaled are more expensive than the reduction steps of the split-radix
algorithm, but the idea is to have more reduction steps involving the points on the
square than reduction steps where results are unscaled.
Because of all of this bookkeeping, the algorithm is somewhat complicated and
uses four different modified versions of the split-radix reduction step to achieve
the effect described above. The end result is that it reduces the number of real
multiplications by about six percent as the FFT size becomes large. Those who
would like to learn more about this algorithm are encouraged to read the papers
cited above; the details are too complicated to present here.
Linearity.
Given complex numbers a and b and polynomials f(z), g(z), and
h(z) = a · f(z) + b · g(z), then
h(ω^j) = a · f(ω^j) + b · g(ω^j) for all 0 ≤ j < n.
EXAMPLE
A related property involves the product of two input polynomials.
Convolution.
Given polynomials f(z), g(z), and h(z) = f(z) · g(z), then
h(ω^j) = f(ω^j) · g(ω^j) for all 0 ≤ j < n.
EXAMPLE
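A minimal numerical check of the convolution property, using a naive multipoint evaluation at the powers of ω = e^{2πI/n} (the primitive root of unity assumed throughout this chapter):

```python
import cmath

def poly_mul(f, g):
    # Classical multiplication of coefficient lists (degree-0 term first).
    h = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[i + j] += fi * gj
    return h

def evaluate_at_roots(p, n):
    # [p(w^0), p(w^1), ..., p(w^(n-1))] for w a primitive n-th root of unity.
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(c * w ** (j * k) for k, c in enumerate(p)) for j in range(n)]

f, g = [1, 2, 3], [4, 5]
h = poly_mul(f, g)
n = len(h)  # evaluate at enough points to determine the product
F, G, H = (evaluate_at_roots(p, n) for p in (f, g, h))
for j in range(n):
    assert abs(H[j] - F[j] * G[j]) < 1e-9
```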
We will see in Chapter 7 why this property is called convolution and why it
is probably the most important property of the FFT.
Next, suppose that the coefficients of some polynomial f (z) of degree less than
n were circularly shifted by i positions to form a new polynomial g(z). Then the
multipoint evaluation of g(z) is related to the multipoint evaluation of f (z) as follows:
Shifting.
If f(z) is a polynomial of degree less than n and
g(z) = z^i · f(z) mod (z^n − 1), then
g(ω^j) = ω^{i·j} · f(ω^j) for all 0 ≤ j < n.
EXAMPLE
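The shifting property can also be checked directly: circularly shifting the coefficients by i positions multiplies the j-th evaluation by ω^{ij}. A small sketch with a naive evaluation:

```python
import cmath

def evaluate_at_roots(p, n):
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(c * w ** (j * k) for k, c in enumerate(p)) for j in range(n)]

n, i = 8, 3
f = [1, 2, 0, -1, 4, 0, 2, 5]           # coefficients of f(z), degree < n
g = [f[(k - i) % n] for k in range(n)]  # z^i * f(z) mod (z^n - 1)
w = cmath.exp(2j * cmath.pi / n)
F, G = evaluate_at_roots(f, n), evaluate_at_roots(g, n)
for j in range(n):
    assert abs(G[j] - w ** (i * j) * F[j]) < 1e-9
```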
A related property instead circularly shifts the output of the multipoint evaluation:
Modulation.
If f(z) is a polynomial of degree less than n and g(z) is the polynomial
formed by multiplying the coefficient of degree k in f(z) by ω^{i·k} for
all 0 ≤ k < n, then
g(ω^j) = f(ω^{j+i}) for all 0 ≤ j < n.
EXAMPLE
A similar result holds if the coefficients of f (z) are reversed.
Reversal.
If f(z) is a polynomial of degree less than n and g(z) = z^i · f(z^{−1}),
then
g(ω^j) = ω^{i·j} · f(ω^{−j}) = ω^{i·j} · f(ω^{n−j})
since ω^{−j} = ω^{n−j}.
Symmetry Property I (Real polynomial).
If the coefficients of a polynomial f(z) of degree less than n are all
real, then
f(ω^{n−j}) = conj(f(ω^j)) for all 0 ≤ j < n,
where conj(·) denotes the complex conjugate.

Symmetry Property II (Imaginary polynomial).
If the coefficients of a polynomial f(z) of degree less than n are all
purely imaginary, then
f(ω^{n−j}) = −conj(f(ω^j)) for all 0 ≤ j < n.
Even sequence.
A sequence of numbers [f0, f1, . . . , f_{n−2}, f_{n−1}] of length n is said to
be even if fj = f_{n−j} for all 0 < j < n.
EXAMPLE
Odd sequence.
A sequence of numbers [f0, f1, . . . , f_{n−2}, f_{n−1}] of length n is said to
be odd if fj = −f_{n−j} for all 0 < j < n.
EXAMPLE
Now that we are clear what is meant by an even sequence and an odd sequence of
numbers, we are ready to state the remaining symmetry properties.
Symmetry Property III (Even sequence polynomial).
If the coefficients of a polynomial f(z) of degree less than n form
an even sequence, then f(ω^j) has no imaginary component for any
0 ≤ j < n, i.e.
f(ω^j) = Re(f(ω^j)).
EXAMPLE
Symmetry Property IV (Odd sequence polynomial).
If the coefficients of a polynomial f(z) of degree less than n form an
odd sequence, then f(ω^j) has no real component for any 0 ≤ j < n,
i.e.
f(ω^j) = I · Im(f(ω^j)).
EXAMPLE
The next result shows how to transform any polynomial into the sum of one whose
coefficients form an even sequence and one whose coefficients form an odd sequence
of numbers.
Decomposition.
Any polynomial f (z) can be decomposed into the sum of an even
sequence polynomial and an odd sequence polynomial as follows:
f(z) = (f(z) + frev(z)) / 2 + (f(z) − frev(z)) / 2
     = fa(z) + fb(z)
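A small sketch of this decomposition on a coefficient sequence, where frev is taken to be the sequence with entries (frev)_j = f_{(n−j) mod n} (the reversal that fixes f0):

```python
n = 8
f = [3, 1, 4, 1, 5, 9, 2, 6]
f_rev = [f[(n - j) % n] for j in range(n)]

fa = [(x + y) / 2 for x, y in zip(f, f_rev)]  # even sequence part
fb = [(x - y) / 2 for x, y in zip(f, f_rev)]  # odd sequence part

for j in range(1, n):
    assert fa[j] == fa[n - j]   # fa is an even sequence
    assert fb[j] == -fb[n - j]  # fb is an odd sequence
assert all(a + b == x for a, b, x in zip(fa, fb, f))
```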
fe(z) = f_{n−2} · z^{n/2−1} + · · · + f2 · z + f0
fo(z) = f_{n−1} · z^{n/2−1} + · · · + f3 · z + f1
We are going to create a new polynomial g(z) where all of the even-indexed coefficients of f(z) are in the real components of g(z) and all of the odd-indexed
coefficients of f(z) are in the imaginary components of g(z). In other words, if f(z)
is a polynomial with all real coefficients represented by
f(z) = f_{n−1} · z^{n−1} + f_{n−2} · z^{n−2} + · · · + f1 · z + f0
     = fe(z^2) + z · fo(z^2),
then
g(z) = fe(z) + I · fo(z).
Observe that if fe(ω^{2j}) and fo(ω^{2j}) are known for all 0 ≤ j < n/2, then we can
compute
f(ω^j) = fe(ω^{2j}) + ω^j · fo(ω^{2j})
for all 0 ≤ j < n/2. Symmetry Property I can then be used to
easily recover f(ω^j) for all n/2 ≤ j < n. So the problem of computing the FFT of
f(z) at each of the n powers of ω has been transformed into computing the FFT
of g(z) at each of the n/2 powers of ω^2.
The size n/2 FFT of g(z) can be computed with roughly half the effort of
computing the size n FFT of f(z), but we then need to relate the FFT of g(z)
to computing the FFT of f (z). Here is where the symmetry properties from the
previous section come into play.
At first glance, the solution appears simple because
g(ω^{2j}) = fe(ω^{2j}) + I · fo(ω^{2j})
and one is tempted to extract the real part of this result and call it fe(ω^{2j}) and
assign fo(ω^{2j}) to the imaginary part. The problem with this approach is that
fe(ω^{2j}) and fo(ω^{2j}) are themselves complex numbers which are combined to form
g(ω^{2j}), the output of the FFT. We need a way of separating this output into the
components fe(ω^{2j}) and fo(ω^{2j}).
From Symmetry Property I, we know that since fe(z) is a polynomial with purely
real coefficients, then
fe(ω^{n−j}) = conj(fe(ω^j)).
Similarly, from Symmetry Property II, we know that since I · fo(z) is a polynomial
with purely imaginary coefficients, then
fo(ω^{n−j}) = conj(fo(ω^j)).
So,
g(ω^{n−j}) = fe(ω^{n−j}) + I · fo(ω^{n−j})
           = conj(fe(ω^j)) + I · conj(fo(ω^j))
and
conj(g(ω^{n−j})) = conj(conj(fe(ω^j)) + I · conj(fo(ω^j)))
                 = fe(ω^j) − I · fo(ω^j).
Replacing j with 2j, we obtain the system

g(ω^{2j}) = fe(ω^{2j}) + I · fo(ω^{2j})
conj(g(ω^{n−2j})) = fe(ω^{2j}) − I · fo(ω^{2j})

which can be solved to give

fe(ω^{2j}) = (conj(g(ω^{n−2j})) + g(ω^{2j})) / 2
fo(ω^{2j}) = (conj(g(ω^{n−2j})) − g(ω^{2j})) · I / 2
The same formulas can also be obtained by applying Symmetry Properties III and
IV to
g(ω^{2j}) = fe(ω^{2j}) + I · fo(ω^{2j})
by decomposing the two polynomials on the right side of the equation into sums of
an even sequence polynomial and an odd sequence polynomial. This derivation is
left as an exercise.
One detail that needs to be worked out is how to compute f(ω^{n/2}) = f(−1).
Note that the above formulas do not apply to this case and g(ω^{n/2}) is not computed
as part of the FFT. However, it can be easily shown that f(ω^{n/2}) = fe(ω^0) − fo(ω^0),
and so the entire FFT of f(z) can be computed using just the components
of the FFT of g(z) with roughly half of the effort.
In summary, here are the steps that we need to do to compute the FFT of a
real polynomial f(z) of degree less than n:

1. Form g(z) = fe(z) + I · fo(z), where
   fe(z) = f_{n−2} · z^{n/2−1} + · · · + f2 · z + f0
   fo(z) = f_{n−1} · z^{n/2−1} + · · · + f3 · z + f1.
2. Compute the FFT of g(z) of size n/2, i.e. {g(ω^0), g(ω^2), g(ω^4), . . . , g(ω^{n−2})}.
3. Compute fe(ω^{2j}) = (conj(g(ω^{n−2j})) + g(ω^{2j})) / 2 for all 0 ≤ j < n/2.
4. Compute fo(ω^{2j}) = (conj(g(ω^{n−2j})) − g(ω^{2j})) · I / 2 for all 0 ≤ j < n/2.
5. Compute f(ω^j) = fe(ω^{2j}) + ω^j · fo(ω^{2j}) for all 0 ≤ j < n/2.
6. Compute f(ω^{n/2}) = fe(ω^0) − fo(ω^0).
7. Use Symmetry Property I to recover f(ω^{n−j}) = conj(f(ω^j)) for all 0 < j < n/2.
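This real-input FFT procedure can be sketched in a few lines. A naive DFT stands in for the size-n/2 FFT of step 2, and ω = e^{2πI/n} is assumed as the primitive root of unity; a practical implementation would substitute a fast transform.

```python
import cmath

def dft(p):
    # Naive evaluation of p at the powers of a primitive len(p)-th root of
    # unity; this stands in for the size-n/2 FFT of step 2.
    m = len(p)
    w = cmath.exp(2j * cmath.pi / m)
    return [sum(c * w ** (j * k) for k, c in enumerate(p)) for j in range(m)]

def real_fft(f):
    n = len(f)
    m = n // 2
    w = cmath.exp(2j * cmath.pi / n)
    g = [complex(f[2 * k], f[2 * k + 1]) for k in range(m)]  # step 1
    G = dft(g)                                               # step 2
    out = [0j] * n
    for j in range(m):
        Grev = G[(m - j) % m].conjugate()   # conj(g(w^(n-2j)))
        fe = (Grev + G[j]) / 2              # step 3
        fo = (Grev - G[j]) * 1j / 2         # step 4
        out[j] = fe + w ** j * fo           # step 5
        if j == 0:
            out[m] = fe - fo                # step 6: f(w^(n/2)) = f(-1)
    for j in range(1, m):
        out[n - j] = out[j].conjugate()     # step 7 (Symmetry Property I)
    return out

f = [1.0, 2.0, 0.0, -1.0, 4.0, 0.0, 2.0, 5.0]
assert all(abs(a - b) < 1e-9 for a, b in zip(real_fft(f), dft(f)))
```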
CHAPTER 5
Li(z) = ∏_{j=0, j≠i}^{n−1} (z − ω^j) / (ω^i − ω^j)
      = [(z − 1) · · · (z − ω^{i−1}) · (z − ω^{i+1}) · · · (z − ω^{n−1})]
        / [(ω^i − 1) · · · (ω^i − ω^{i−1}) · (ω^i − ω^{i+1}) · · · (ω^i − ω^{n−1})]

for all 0 ≤ i < n. These Lagrange interpolating polynomials
have the property that

Li(ω^j) = 0 if j ≠ i
Li(ω^j) = 1 if j = i
EXAMPLE.
We can construct f by combining g0, g1, . . . , g_{n−1} as given by
fY = f mod (z^m − b)
fZ = f mod (z^m + b)

If f = fA · z^m + fB, then in matrix form

[ fY ]   [  b  1 ] [ fA ]
[ fZ ] = [ −b  1 ] [ fB ]

or

fY = b · fA + fB
fZ = −b · fA + fB
The technique of elimination can be used to solve these equations for fA and fB .
The results of these computations are
fA = 1/2 · b^{−1} · (fY − fZ)
fB = 1/2 · (fY + fZ)

or in matrix form

[ fA ]         [ ω^{−σ(2j)}   −ω^{−σ(2j)} ] [ fY ]
[ fB ] = 1/2 · [     1             1      ] [ fZ ]
once b^{−1} = (ω^{σ(2j)})^{−1} = ω^{−σ(2j)} has been substituted into these equations. Observe
that ω^{−σ(2j)} = ω^{n−σ(2j)}. As a butterfly operation, the computation is expressed as
[Butterfly diagram: fY and fZ combine to produce fB = 1/2 · (fY + fZ) and fA = ω^{−σ(2j)}/2 · (fY − fZ).]
This could be used for the interpolation step of the classical radix-2 inverse FFT
(or IFFT) algorithm, but it has the drawback that two multiplications by 1/2 are
required in every interpolation step. This adds an extra step to the interpolation
step of the radix-2 algorithms and prevents the use of the special multiplication by I
needed to construct the radix-4, radix-8, and split-radix inverse FFT interpolation
steps.
Instead, the classical radix-2 IFFT saves the multiplications by 1/2 in each of
the interpolation steps until the end of the inverse FFT computation. So the inputs
of every interpolation step are scaled by m = 2^i, where m is the maximum size of
the inputs and the output is scaled by 2m. In other words, the revised interpolation
step receives as input
fY = m · f mod (z^m − ω^{σ(2j)})
fZ = m · f mod (z^m − ω^{σ(2j+1)})

and produces fA · z^m + fB = (2m) · f mod (z^{2m} − ω^{σ(j)}) as output using the formulas
fA = ω^{−σ(2j)} · (fY − fZ)
fB = fY + fZ
So the revised interpolation step can be expressed with the following butterfly
diagram:

[Butterfly diagram: fY and fZ combine to produce fB = fY + fZ and fA = ω^{−σ(2j)} · (fY − fZ).]
Again, the input polynomials are scaled by m, the output polynomials are scaled
by 2m, and the factor of 1/2 has been removed from the interpolation step.
EXAMPLE
The input to the IFFT algorithm is f(ω^{σ(j)}) = f mod (z − ω^{σ(j)}) for all 0 ≤
j < n. By recursively applying the interpolation step, we obtain n · f mod (z^n − 1)
at the end of the algorithm. Assuming that f(z) has degree less than n, this
output is equal to n · f(z). By multiplying this result by 1/n, the desired
polynomial f(z) is recovered.
The following butterfly diagram shows how the process would work on an FFT
of size 8.
[Butterfly diagram: classical radix-2 inverse FFT of size 8, transforming the inputs f(ω^0), f(ω^4), f(ω^2), f(ω^6), f(ω^1), f(ω^5), f(ω^3), f(ω^7) into the outputs n·f0, n·f1, . . . , n·f7.]
Note that the outputs need to be scaled by 1/n to produce the correct coefficients
of the desired polynomial.
Pseudocode for this IFFT algorithm is given in Figure 1. The cost analysis
of this algorithm is left as an exercise for the reader. The number of operations
required for each line of the algorithm is identical to one of the lines of the classical
radix-2 FFT algorithm. Thus, the total number of operations needed to compute
the scaled IFFT of size n is
M(n) = 2 · M(n/2) + n/2 − 1
A(n) = 2 · A(n/2) + n
M(n) = 1/2 · n · log2(n) − n + 1
A(n) = n · log2(n)
Thus, the classical radix-2 IFFT algorithm requires the same number of operations
as the companion FFT algorithm, except that n extra multiplications are required
to multiply the final result of the algorithm by 1/n to produce the correct answer.
Again, this scaling is necessary because there are factors of 1/2 ignored in each of
the k = log2 (n) levels of interpolation steps of the algorithm.
3. Twisted radix-2 IFFT
We are now going to consider how to undo the twisted radix-2 FFT reduction step.
Recall that this process had two parts. First, we receive as input the polynomial
fA · z^m + fB = f = f(ω^{σ(j)/(2m)} · z) mod (z^{2m} − 1) and produce

fY = f mod (z^m − 1)
fZ = f mod (z^m + 1).

Taking the scaling into account, the interpolation step receives as input

fY = m · f mod (z^m − 1)
fZ = m · f mod (z^m + 1)

and the formulas
fA = fY − fZ
fB = fY + fZ
can be derived to determine the two components of the output. Observe that there
are no multiplications required in this part of the interpolation step.
EXAMPLE.
The algorithm is initialized with f(ω^{σ(j)}) = f(ω^{σ(j)} · z) mod (z − 1) for all j
in the range 0 ≤ j < n. By recursively applying the interpolation step to these
results, we obtain n · f(z) mod (z^n − 1) = n · f(z) if f(z) has degree less than n.
By multiplying each coefficient of this result by 1/n, the desired polynomial f(z)
is recovered. The following butterfly diagram shows how the process works for an
inverse FFT of size 8.
[Butterfly diagram: twisted radix-2 inverse FFT of size 8, transforming the inputs f(ω^0), f(ω^4), f(ω^2), f(ω^6), f(ω^1), f(ω^5), f(ω^3), f(ω^7) into the outputs n·f0 through n·f7.]
Pseudocode for this IFFT algorithm is given in Figure 2. The cost analysis
of this algorithm is again left as an exercise for the reader. The total number of
operations to compute the twisted radix-2 IFFT of size n is
M(n) = 2 · M(n/2) + n/2 − 1
A(n) = 2 · A(n/2) + n
where M (1) = 0 and A(1) = 0. After solving these recurrence relations and including the extra n multiplications to unscale the final result of this algorithm, the
formulas
M(n) = 1/2 · n · log2(n) + 1
A(n) = n · log2(n)
are obtained. This algorithm has the exact same operation count as the classical
radix-2 IFFT algorithm and differs from the companion FFT algorithm only in
the n extra multiplications needed to undo the scaling introduced by ignoring the
factors of 1/2 in the interpolation steps.
[Butterfly diagram: classical radix-4 inverse FFT of size 8.]
[Butterfly diagram: twisted radix-4 inverse FFT of size 8.]
[Butterfly diagram: classical radix-8 inverse FFT of size 8.]
Note: For the n = 8 case, the butterfly diagram of the twisted radix-8 inverse
FFT is the same as that shown above for the classical radix-8 inverse FFT, with
the exception that the twist factors should instead appear on the bottom row of
the diagram to implement the twisted polynomial.
Split-radix inverse FFT:
[Butterfly diagram: split-radix inverse FFT of size 8.]
In all of the inverse FFTs considered in this chapter, the input was in scrambled
order while the output was in the natural order. Some inverse FFT algorithms
instead have the input in the natural order {f(ω^0), f(ω^1), . . . , f(ω^{n−1})} and the
output in scrambled order. We will address this in more detail later in the text.
The material in the next section will help to explain why there are these different
versions of inverse FFT algorithms.
5. The duality property
If f (z) is a polynomial with complex number coefficients, then the discrete
Fourier transform of f (z) can be represented as a polynomial using
F(w) = f(ω^{n−1}) · w^{n−1} + f(ω^{n−2}) · w^{n−2} + · · · + f(ω^1) · w + f(ω^0)

where coefficient d of F(w) is

Fd = f(ω^d) = Σ_{j=0}^{n−1} fj · (ω^d)^j
for all 0 ≤ d < n, where ω is a primitive nth root of unity. The discrete Fourier
transform can be efficiently computed using the FFT algorithms discussed in Chapter 4, but the outputs of the algorithm need to be rearranged according to the σ
function.
Suppose that we are given F (w), i.e. the evaluation of f (z) at each of the nth
roots of unity and we wish to recover the polynomial f (z). This is the problem of
computing the IFFT, and several algorithms were given earlier in this chapter for
this task. Using Lagrange interpolation, we can write

f(z) = Σ_{i=0}^{n−1} Fi · Li(z)
where Li (z) is the Lagrange interpolating polynomial. In this case, Li (z) is given
by
Li(z) = 1/n · ω^i · (z^n − 1) / (z − ω^i)
      = 1/n · (ω^i · z^{n−1} + ω^{2i} · z^{n−2} + · · · + ω^{(n−1)i} · z + 1)
f(z) = 1/n · ( F0 · z^{n−1} + F0 · z^{n−2} + · · · + F0 · z + F0
     + F1 · ω · z^{n−1} + F1 · ω^2 · z^{n−2} + · · · + F1 · ω^{n−1} · z + F1
     + F2 · ω^2 · z^{n−1} + F2 · ω^4 · z^{n−2} + · · · + F2 · ω^{2(n−1)} · z + F2
     + · · ·
     + F_{n−1} · ω^{n−1} · z^{n−1} + F_{n−1} · ω^{2(n−1)} · z^{n−2}
     + · · · + F_{n−1} · ω^{(n−1)(n−1)} · z + F_{n−1} )
Collecting the coefficient of z^j in this expression, we obtain

fj = 1/n · (F_{n−1} · ω^{(n−1)(n−j)} + F_{n−2} · ω^{(n−2)(n−j)} + · · ·
     + F2 · ω^{2(n−j)} + F1 · ω^{n−j} + F0)
   = 1/n · F(ω^{n−j})
the result by 1/n. So, f can be obtained by computing the FFT of F(w), but replacing
ω with ω^{−1} in the computation.
This relationship between the FFT and IFFT algorithms is known as the duality property. In summary,
Duality Property.
Suppose that f (z) is a function and F (w) is the polynomial representation of the discrete Fourier transform of f (z).
(1). If F(w) is computed with an FFT algorithm with input
f(z) and primitive root of unity ω, then the IFFT of F(w) can be
computed with the same FFT algorithm, but with ω replaced by ω^{−1}.
(2). If f(z) can be computed with an IFFT algorithm with
input F(w) and primitive root of unity ω^{−1}, then the FFT of f(z)
can be computed with the same IFFT algorithm, but with ω^{−1}
replaced by ω.
It should be noted that in part 1 of the duality property above, the final result
of the FFT computation needs to be scaled by 1/n to produce f (z). In part 2 of
the duality property, we will assume that the IFFT algorithm does not include the
scaling by 1/n and that this scaling should not be implemented to determine the
FFT of f (z).
EXAMPLE
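Part (1) of the duality property can be checked with a few lines: the same naive transform routine computes both directions, with only the root of unity swapped for its inverse and a final scaling by 1/n:

```python
import cmath

def transform(p, w):
    # Evaluate p at w^0, w^1, ..., w^(n-1). With w a primitive n-th root of
    # unity this computes the FFT's result; with w^(-1) it computes the IFFT
    # up to the 1/n scaling, as the duality property states.
    n = len(p)
    return [sum(c * w ** (j * k) for k, c in enumerate(p)) for j in range(n)]

n = 8
w = cmath.exp(2j * cmath.pi / n)
f = [1, 5, -2, 0, 3, 1, 0, 2]

F = transform(f, w)                # FFT with root w
back = transform(F, w ** -1)       # same algorithm, w replaced by w^(-1)
recovered = [v / n for v in back]  # scale by 1/n to obtain f
assert all(abs(a - b) < 1e-9 for a, b in zip(recovered, f))
```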
The only drawback to the reuse of an FFT algorithm to compute the IFFT
is that any outputs of an FFT algorithm must be unscrambled according to the σ
function before being used as inputs to the FFT algorithm to compute the IFFT.
The outputs of the IFFT computation must also be unscrambled to produce
the coefficients of f(z) in the correct order. At most 2n additional operations are
needed to complete these two shufflings of the algorithm data.
If the extra operations needed to reuse an FFT algorithm are too expensive to
justify having one algorithm compute both the FFT and inverse FFT, then one
can construct a separate IFFT algorithm by essentially performing the inverse of
the steps of any FFT algorithm in reverse order, as discussed in the previous sections
of this chapter. Nevertheless, duality is an important property of the FFT, as we
shall see in Chapter 7.
CHAPTER 6
Classical multiplication.
Suppose that f and g are two polynomials that we wish to multiply.
Select one of the polynomials, say f . Then multiply each term of f
by every term of g and collect like terms.
In an introductory algebra course, problems with only two or three terms in the
input polynomials are usually considered. This is because if each input polynomial has
n terms, then a total of n · n = n² terms need to be computed to obtain the output polynomial, and it requires significant effort to perform the computation for
values of n higher than those considered in the introductory algebra course. The
following example shows how much effort is required to compute the product of two
polynomials with four terms.
EXAMPLE
Another way of organizing this work arranges the like terms in columns. This
technique resembles multiplication of integers and is illustrated by the following
example.
EXAMPLE
      4x^3 + 3x^2 + 2x + 1
    × 1x^3 + 1x^2 + 1x + 2
    ------------------------------------------
                           8x^3 + 6x^2 + 4x + 2
                    4x^4 + 3x^3 + 2x^2 + 1x
             4x^5 + 3x^4 + 2x^3 + 1x^2
      4x^6 + 3x^5 + 2x^4 + 1x^3
    ------------------------------------------
      4x^6 + 7x^5 + 9x^4 + 14x^3 + 9x^2 + 5x + 2
In general, if f(x) and g(x) each have n coefficients, then coefficient k of the
product polynomial is given by

hk = Σ_{i=0}^{n−1} fi · g_{k−i}

where any coefficient with a negative index is treated as zero.
1. CLASSICAL MULTIPLICATION
EXAMPLE
f3 = 4, f2 = 3, f1 = 2, f0 = 1
g3 = 1, g2 = 1, g1 = 1, g0 = 2

h0 = f0·g0 + f_{−1}·g1 + f_{−2}·g2 + f_{−3}·g3 = 1·2 + 0·1 + 0·1 + 0·1 = 2
h1 = f1·g0 + f0·g1 + f_{−1}·g2 + f_{−2}·g3 = 2·2 + 1·1 + 0·1 + 0·1 = 5
h2 = f2·g0 + f1·g1 + f0·g2 + f_{−1}·g3 = 3·2 + 2·1 + 1·1 + 0·1 = 9
h3 = f3·g0 + f2·g1 + f1·g2 + f0·g3 = 4·2 + 3·1 + 2·1 + 1·1 = 14
h4 = f4·g0 + f3·g1 + f2·g2 + f1·g3 = 0·2 + 4·1 + 3·1 + 2·1 = 9
h5 = f5·g0 + f4·g1 + f3·g2 + f2·g3 = 0·2 + 0·1 + 4·1 + 3·1 = 7
h6 = f6·g0 + f5·g1 + f4·g2 + f3·g3 = 0·2 + 0·1 + 0·1 + 4·1 = 4
Observe that the computation of the product with the alternative version of
classical multiplication matches the product computed with the traditional classical
multiplication method. The traditional classical multiplication method is recommended for computing the entire product polynomial; the alternative method is
recommended if just a few terms of the product polynomial are to be computed.
For those experienced with computer programming, this is because there are fewer
loops involved with the first method and there is a slight cost associated with each
loop in terms of the time required to perform the multiplication.
All of the methods discussed in this section are quadratic multiplication methods because the effort needed to multiply two polynomials using these techniques is
proportional to the square of the maximum degree of the input polynomials. One
can show that each technique requires n² coefficient multiplications and n² − 2n + 1
coefficient additions to implement the polynomial multiplication. These derivations
are left as exercises. The rest of this chapter will explore several faster methods of
multiplying two polynomials.
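A direct implementation of classical multiplication is a pair of nested loops; counting the work confirms the n² coefficient multiplications (a small sketch):

```python
def classical_mul(f, g):
    # Multiply coefficient lists (degree-0 term first), collecting like terms.
    h = [0] * (len(f) + len(g) - 1)
    mults = 0
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[i + j] += fi * gj
            mults += 1
    return h, mults

h, mults = classical_mul([1, 2, 3, 4], [2, 1, 1, 1])
assert h == [2, 5, 9, 14, 9, 7, 4]  # matches the worked example
assert mults == 4 * 4               # n^2 coefficient multiplications
```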
2. Karatsuba multiplication
Suppose that we wish to multiply the two polynomials f1 x + f0 and g1 x + g0 .
If the classical multiplication method discussed in the previous section were used,
we would obtain the product f1·g1 · x² + (f1·g0 + f0·g1) · x + f0·g0. This requires 4
coefficient multiplications to compute the result.
Suppose we compute
(f1 + f0) · (g1 + g0) = f1·g1 + f1·g0 + f0·g1 + f0·g0
The sum of the middle two terms of this expression is the coefficient of x in the
above polynomial product. The other two terms are the coefficient of x2 and the
constant term. Then an alternative method for computing the middle term of the
above product polynomial is
(f1 + f0) · (g1 + g0) − f1·g1 − f0·g0.
So one can compute the coefficient of x² and the constant term as before and then
use this formula to determine the coefficient of x.
EXAMPLE
For f = 2x + 1 and g = x + 2:

f1 · g1 = 2 · 1 = 2
f0 · g0 = 1 · 2 = 2
(f1 + f0) · (g1 + g0) = 3 · 3 = 9

so the coefficient of x is 9 − 2 − 2 = 5, and the product is 2x² + 5x + 2.
f(z) = fA · z^m + fB
g(z) = gA · z^m + gB
Observe that fA , fB , gA and gB are each polynomials of degree less than m. The
polynomials f (z) and g(z) can be multiplied into the product polynomial h(z) using
h(z) = f(z) · g(z)
     = (fA · z^m + fB) · (gA · z^m + gB)
     = fA·gA · z^{2m} + (fA·gB + fB·gA) · z^m + fB·gB
A total of four products of two polynomials of degree less than m are required for
this computation. However, the middle term can also be computed using:
fB·gA + fA·gB = (fA + fB) · (gA + gB) − fA·gA − fB·gB,

so that

h(z) = fA·gA · z^{2m} + ((fA + fB) · (gA + gB) − fA·gA − fB·gB) · z^m + fB·gB
with only three products of two polynomials of degree less than m. Here, fA gA ,
fB gB , and (fA + fB ) (gA + gB ) can be computed with recursive calls to either
Karatsuba or classical multiplication. When each input polynomial has two terms,
the product can be computed using the technique described at the beginning of
this section and no more recursion is necessary.
EXAMPLE
fA = 4x + 3        gA = x + 1
fB = 2x + 1        gB = x + 2

fA · gA = 4x^2 + 7x + 3
fB · gB = 2x^2 + 5x + 2
(fA + fB) · (gA + gB) = (6x + 4) · (2x + 3) = 12x^2 + 26x + 12

The middle product is then 12x^2 + 26x + 12 − (4x^2 + 7x + 3) − (2x^2 + 5x + 2)
= 6x^2 + 14x + 7, so that

h(x) = (4x^2 + 7x + 3) · x^4 + (6x^2 + 14x + 7) · x^2 + (2x^2 + 5x + 2)
     = 4x^6 + 7x^5 + 9x^4 + 14x^3 + 9x^2 + 5x + 2.
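A compact recursive sketch of Karatsuba multiplication on coefficient lists (lowest-degree term first), checked against the running example f = 4x³ + 3x² + 2x + 1 and g = x³ + x² + x + 2:

```python
def poly_add(p, q):
    out = [0] * max(len(p), len(q))
    for i, c in enumerate(p):
        out[i] += c
    for i, c in enumerate(q):
        out[i] += c
    return out

def karatsuba(f, g):
    n = len(f)  # assumes len(f) == len(g) == a power of 2
    if n == 1:
        return [f[0] * g[0]]
    m = n // 2
    fB, fA = f[:m], f[m:]   # f = fA * z^m + fB
    gB, gA = g[:m], g[m:]
    lo = karatsuba(fB, gB)
    hi = karatsuba(fA, gA)
    mid = karatsuba(poly_add(fA, fB), poly_add(gA, gB))
    mid = [a - b - c for a, b, c in zip(mid, hi, lo)]  # subtract hi and lo
    out = [0] * (2 * n - 1)
    for i, c in enumerate(lo):
        out[i] += c
    for i, c in enumerate(mid):
        out[i + m] += c
    for i, c in enumerate(hi):
        out[i + 2 * m] += c
    return out

# f(x) = 4x^3 + 3x^2 + 2x + 1, g(x) = x^3 + x^2 + x + 2
assert karatsuba([1, 2, 3, 4], [2, 1, 1, 1]) == [2, 5, 9, 14, 9, 7, 4]
```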
One can show that the Karatsuba algorithm requires n^{log2(3)} ≈ n^{1.585} coefficient
multiplications and O(n^{log2(3)})
coefficient additions to multiply two polynomials of degree less than n into a polynomial of degree less than 2n1. For even moderate-sized polynomials, this technique
is superior to classical multiplication.
3. FFT-based multiplication
An even faster technique for multiplying two polynomials is based on the observation that
h(α) = f(α) · g(α)

where α is any element from the number system used for the coefficients of f, g,
and h.
To multiply polynomials f and g of degree less than n into a polynomial of
degree less than 2n, one could choose any selection of 2n points and evaluate f
and g at each of these points. The above formula can then be used to obtain the
evaluation of h at each of these points using only 2n coefficient multiplications.
These 2n evaluations can then be interpolated into the desired product polynomial
h. In summary:

1. Evaluate f at each of the 2n points.
2. Evaluate g at each of the 2n points.
3. Multiply the evaluations pointwise to obtain the evaluation of h at each of the 2n points.
4. Interpolate the 2n evaluations of h into the desired product polynomial.
EXAMPLE
Based on the multipoint evaluation and interpolation techniques from Chapter
2, this technique will be comparable in performance to classical multiplication and
will likely be less efficient than Karatsuba's multiplication if an arbitrary collection
of points is used. However, we learned that polynomial evaluation and interpolation
can be significantly improved if the powers of a primitive root of unity are used
instead. Since it does not matter which points are used to evaluate and interpolate
the polynomials, it is advantageous to select the powers of a primitive root of
unity for these computations and use the algorithms of Chapters 4 and 5 for steps 1,
2, and 4.
EXAMPLE
The ordering of the points is likewise unimportant for the FFT-based multiplication
technique. Thus, one can leave the evaluations of f(z) and g(z) in scrambled order, since the inverse FFT algorithms of Chapter 5 expect their input to be
in this scrambled order.
The number of operations needed to implement FFT-based multiplication is
equal to 2 times the number of operations required to compute the multipoint
evaluation of a polynomial of degree less than n, plus the number of operations
needed to interpolate n points into a polynomial of degree less than n, plus n
multiplications for the pointwise products. If the powers of a primitive root of
unity are used for the evaluation points, then the effort needed for the computation
is roughly 3 times the effort needed to compute an FFT of size n.
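A sketch of this approach, with a naive evaluation/interpolation at the 2n-th roots of unity standing in for the FFT and inverse FFT algorithms (the structure, not the speed, is the point here):

```python
import cmath

def eval_at_roots(p, size):
    w = cmath.exp(2j * cmath.pi / size)
    return [sum(c * w ** (j * k) for k, c in enumerate(p)) for j in range(size)]

def interp_from_roots(vals):
    # Inverse transform: the same evaluation with w^(-1), scaled by 1/size.
    size = len(vals)
    w = cmath.exp(-2j * cmath.pi / size)
    return [sum(v * w ** (j * k) for k, v in enumerate(vals)) / size
            for j in range(size)]

def fft_mul(f, g):
    size = 2 * max(len(f), len(g))      # enough points for the product
    F = eval_at_roots(f, size)          # step 1
    G = eval_at_roots(g, size)          # step 2
    H = [a * b for a, b in zip(F, G)]   # step 3: pointwise products
    h = interp_from_roots(H)            # step 4
    return [round(c.real) for c in h[:len(f) + len(g) - 1]]

assert fft_mul([1, 2, 3, 4], [2, 1, 1, 1]) == [2, 5, 9, 14, 9, 7, 4]
```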
CHAPTER 7
x(θ) = a0 + Σ_{k=1}^{n2} (ak · cos(2πk/n · θ) + bk · sin(2πk/n · θ))

where n2 = n/2 if n is even, (n − 1)/2 if n is odd.
Note that since the above function consists entirely of sine and cosine functions,
x(θ)¹ will be a periodic function with a period of n samples. Since the original

¹The use of the notation x(θ) is more traditional than f(θ) in the engineering literature, so we
will switch to this convention in this chapter.
function is nonzero only over a finite interval, the discrete Fourier series will match
this function at all of the points of interest and will repeat this pattern of signal
samples outside of the specified interval.
EXAMPLE.
In Chapter 1, we mentioned that the Discrete Fourier Series consisted of at
most n sinusoids. Let us more carefully investigate this claim in light of the above
formula for the Discrete Fourier Series.
First, we can interpret a0 as a0 · cos(0 · θ), where cos(0 · θ) = 1 for all integer
values of θ. We can also view b0 (which does not appear in the above formula)
as b0 · sin(0 · θ). This term does not appear because sin(0 · θ) = 0 for all integer
values of θ and thus does not contribute anything to the series. The following figure
illustrates these properties for these k = 0 terms where n = 8.
[Figure: sampled plots of cos(0 · θ) and sin(0 · θ) for n = 8.]
The component where k = n/2 when n is even also deserves some additional explanation. In this case, b_{n2} = 0 because sin(π · θ) = 0 for all integer values of θ.
The following figure illustrates this property for the case where n = 8 and n2 = 4.
[Figure: sampled plot of sin(π · θ) for n = 8.]
So with these two explanations, we can clearly see that there will always be at
most n sinusoids in the discrete Fourier series. The only way that there will be fewer
than n sinusoids is when one or more of the coefficients in the above formula turns
out to be zero. Because the frequency of each of the sinusoids in the above formula
is a multiple of 2π/n, one can say that these sinusoids are harmonically related.
If n = 8, the following figure gives the eight functions involved in the discrete
Fourier series.
[Figure: the eight sampled sinusoids cos(0 · θ), cos(π/4 · θ), sin(π/4 · θ), cos(π/2 · θ), sin(π/2 · θ), cos(3π/4 · θ), sin(3π/4 · θ), and cos(π · θ) for n = 8.]
But why is the Discrete Fourier Series comprised of sinusoids where k is restricted to the interval 0 ≤ k ≤ n/2? We will now explore several properties of
sinusoids to help us answer this question.
First, the cosine function has the property that it is an even function. In other
words,
Even function.
A function f(z) is said to be an even function if
f(−z) = f(z).

In the case of the cosine function cos(α · θ), where α is any constant, then
cos(−α · θ) = cos(α · θ) for all inputs to the function.

[Figure: sampled plots of cos(−π/4 · θ) and cos(π/4 · θ) for n = 8.]

Observe that this is the same sinusoid as the case n = 8 and k = 1 and does not contribute
anything new to the series.
Next, the sine function has the property that it is an odd function. In other
words,
Odd function.
A function f(z) is said to be an odd function if
f(−z) = −f(z).

In the case of the sine function sin(α · θ), where α is any constant, then sin(−α · θ) =
−sin(α · θ) for all inputs to the function. This is similar to the concept of the
odd sequence polynomial introduced in Chapter 4, but again involves the outputs
of the function rather than polynomial coefficients.

[Figure: sampled plots of sin(−π/4 · θ) and sin(π/4 · θ) for n = 8.]
Observe that this is the negative of the function where n = 8 and k = 1. The
coefficient associated with the leftmost sinusoid can be multiplied by -1 and can be
absorbed into the coefficient associated with the rightmost sinusoid.
So, sinusoid components with negative values of k do not contribute anything
new to the Fourier series. Although not traditional, we could have just as easily
defined the Discrete Fourier Series to consist of sinusoids associated with values of k
in the interval −n/2 ≤ k ≤ 0. The point is that only one representative of each type of
sinusoid should be included in the Discrete Fourier Series. Because the traditional
method is also the simplest to work with, we will continue to define the Discrete
Fourier Series with sinusoids associated with values of k in the interval 0 ≤ k ≤ n/2.
Next, we will explore why there are no terms in the Fourier series with k greater
than or equal to n.
Let K = k + d · n for any integer d. If cos(2πK/n · θ) is only evaluated at integer
values of θ, then cos(2πK/n · θ) = cos(2πk/n · θ). For example, let n = 8, k = 0,
and K = k + 1 · n = 8.

[Figure: sampled plots of cos(0 · θ) and cos(2π · θ) for n = 8.]
[Figure: sampled plots of sin(π/4 · θ) and sin(9π/4 · θ) for n = 8.]
So, any terms with values of k which are n or higher will also not contribute anything
new to the discrete Fourier series.
The phenomenon where a sampled version of sin(9π/4 · θ) looks like the function
sin(π/4 · θ) is called aliasing. This results when a particular signal is not sampled
often enough to accurately capture the characteristics of the signal. For example,
if a video camera does not sample the image of a car driving on a highway quickly
enough, it may appear on a television as if the wheels of the car are going backwards
when the car is driving forwards.
Now, let us combine the properties considered so far to handle the cases where
n/2 < k ≤ n. Let K = −(n − k). If cos(2πK/n · θ) is only evaluated at integer values
of θ, then cos(2πK/n · θ) = cos(2πk/n · θ). For example, let n = 8, k = 3, and
K = −(n − k) = −5.

[Figure: sampled plots of cos(3π/4 · θ) and cos(−5π/4 · θ) for n = 8.]
[Figure: sampled plots of sin(3π/4 · θ) and sin(−5π/4 · θ) for n = 8.]
So, none of the cosine or sine terms with k in the range n/2 < k ≤ n contributes
anything new to the discrete Fourier series either.
So, every discrete Fourier series associated with a discrete function with period
of n samples has exactly n terms in the series. In Chapter 1, we mentioned that
the coefficients of the discrete Fourier series can be computed using the Discrete
Fourier Transform. In the next section, we will more carefully explore this claim
using the Fast Fourier Transform. For this section, we will simply state the discrete
Fourier series associated with the x(θ) function used in the above examples.
EXAMPLE.
There is another way of expressing the Discrete Fourier Series that is sometimes
used by engineers. It can be shown that for a given n and any 0 ≤ k ≤ n/2 that

ak · cos(2πk/n · θ) + bk · sin(2πk/n · θ) = ck · cos(2πk/n · θ + φk)

where

ck² = ak² + bk²
tan(φk) = −bk / ak
x(θ) = a0 + Σ_{k=1}^{n2} ck · cos(2πk/n · θ + φk)

where n2 = n/2 if n is even, (n − 1)/2 if n is odd.
There are fewer terms in this Discrete Fourier series, but the same number of
unknowns. Observe that ak and bk in the original series have been replaced with
ck and φk. Also, observe that when bk = 0, then φk = 0.
EXAMPLE
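The conversion between the two forms is easy to check numerically. In this sketch φk is computed with atan2 so the correct quadrant is chosen even when ak < 0:

```python
import math

def to_amplitude_phase(a, b):
    # a*cos(X) + b*sin(X) = c*cos(X + phi) requires c*cos(phi) = a and
    # c*sin(phi) = -b, i.e. tan(phi) = -b/a.
    c = math.hypot(a, b)
    phi = math.atan2(-b, a)
    return c, phi

a, b, k, n = 1.5, -2.0, 3, 16
c, phi = to_amplitude_phase(a, b)
for theta in range(n):
    X = 2 * math.pi * k / n * theta
    assert abs(a * math.cos(X) + b * math.sin(X) - c * math.cos(X + phi)) < 1e-9
```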
2. A mathematician's perspective of the engineer's FFT
In this section, we are going to show that the discrete Fourier series representation of a given sequence of samples can be computed with the FFT algorithms
discussed in the previous chapters. To support this claim, we need to derive yet
another representation of the discrete Fourier series that involves complex numbers.
Euler's identity and the properties of sinusoids discussed in the previous section
can be used to show that
cos(2πk/n · θ) = (e^{I·2πk/n·θ} + e^{I·2π(n−k)/n·θ}) / 2
sin(2πk/n · θ) = −I · (e^{I·2πk/n·θ} − e^{I·2π(n−k)/n·θ}) / 2
for all integer values of θ. Substituting these results into the formula for the discrete
Fourier series given in the previous section for an even value of n, we obtain

x(θ) = Σ_{k=0}^{n−1} ck · e^{I·2πk/n·θ}

where

c0 = a0
ck = 1/2 · (ak − I · bk)            for 1 ≤ k ≤ n/2 − 1
c_{n/2} = a_{n/2}
ck = 1/2 · (a_{n−k} + I · b_{n−k})  for n/2 + 1 ≤ k ≤ n − 1
Observe that ck and c_{n−k} are complex conjugates if ak and bk are real numbers.
EXAMPLE
Let us define the function

f(z) = Σ_{k=0}^{n−1} fk · z^k
X(k) = Σ_{θ=0}^{n−1} x(θ) · W^{kθ}
where W = e^{−I·2π/n}, and computes X(0), X(1), . . . , X(n − 1) using one of the FFT
algorithms discussed in Chapter 4. Note that W, the primitive root of unity in this
approach, is equal to ω^{−1}. The duality property discussed at the end of Chapter 5
explains why the engineer's FFT with primitive root of unity W = ω^{−1} is essentially
equivalent to the mathematician's IFFT with primitive root of unity ω.
EXAMPLE
Observe that X(d) = n · cd for all 0 ≤ d < n. In other words, X(d) is the coefficient
of e^{I·2πd/n·θ} in the discrete Fourier series, but scaled by n. This is because the
engineer does not scale the output of the FFT by 1/n, which is typically required
at the end of the mathematician's IFFT interpolation algorithm.
The engineer uses the IFFT algorithms discussed in Chapter 5 to evaluate
n · x(θ) at each θ in {0, 1, . . . , n − 1}. Here, W^{−1} = ω = e^{I·2π/n} is used in
the IFFT algorithm. The principle of duality can again be used to explain the
application of the IFFT as an evaluation algorithm. The output of this algorithm
is n · x(0), n · x(1), . . . , n · x(n − 1). These results are typically scaled by 1/n to
produce the desired evaluations.
EXAMPLE
In summary, the engineer views the FFT as an interpolation algorithm and
the IFFT as an evaluation algorithm for the discrete Fourier series. The property of
duality is used to relate this perspective to the mathematical views of the algorithms
discussed in the earlier chapters.
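This relationship can be sketched briefly: compute X(k) with W = e^{−2πI/n} (a naive transform standing in for the FFT) and confirm that the samples are recovered by the evaluation x(θ) = (1/n) · Σ_k X(k) · e^{2πIkθ/n}:

```python
import cmath

def engineers_fft(x):
    n = len(x)
    W = cmath.exp(-2j * cmath.pi / n)  # W = e^(-I*2*pi/n) = omega^(-1)
    return [sum(x[t] * W ** (k * t) for t in range(n)) for k in range(n)]

x = [0.0, 1.0, 3.0, 2.0, -1.0, 0.5, 2.5, 1.0]
n = len(x)
X = engineers_fft(x)  # X(k) = n * c_k, the scaled series coefficients

# Evaluate the discrete Fourier series at each sample point and rescale.
w = cmath.exp(2j * cmath.pi / n)
for t in range(n):
    s = sum(X[k] * w ** (k * t) for k in range(n)) / n
    assert abs(s - x[t]) < 1e-9
```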
X(k) = Σ_{θ=0}^{n−1} x(θ) · W^{kθ}
X(k) = Σ_{θ=0}^{n/2−1} x(2θ) · W^{k·2θ} + Σ_{θ=0}^{n/2−1} x(2θ + 1) · W^{k·(2θ+1)}
     = Σ_{θ=0}^{n/2−1} x(2θ) · (W^{2k})^θ + W^k · Σ_{θ=0}^{n/2−1} x(2θ + 1) · (W^{2k})^θ

X(k + n/2) = Σ_{θ=0}^{n/2−1} x(2θ) · (W^{2k})^θ − W^k · Σ_{θ=0}^{n/2−1} x(2θ + 1) · (W^{2k})^θ

If we define

Xa(k) = Σ_{θ=0}^{n/2−1} x(2θ) · (W^{2k})^θ
Xb(k) = Σ_{θ=0}^{n/2−1} x(2θ + 1) · (W^{2k})^θ

then X(k) = Xa(k) + W^k · Xb(k) and X(k + n/2) = Xa(k) − W^k · Xb(k).
We see here that the input (time domain) sequence has been separated or
decimated into two subsequences, one consisting of the even-indexed samples and
the other consisting of the odd-indexed samples. By recursively applying this reduction of the computation of the discrete Fourier series, the so-called decimation-in-time (DIT) FFT is derived.
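The decimation-in-time recursion translates almost directly into code — a sketch using W = e^{−2πI/n}, checked against a naive transform:

```python
import cmath

def dit_fft(x):
    # Recursive decimation-in-time FFT: split into even- and odd-indexed
    # samples, transform each half, then combine with twiddle factors W^k.
    n = len(x)
    if n == 1:
        return list(x)
    Xa = dit_fft(x[0::2])   # even-indexed samples
    Xb = dit_fft(x[1::2])   # odd-indexed samples
    W = cmath.exp(-2j * cmath.pi / n)
    X = [0j] * n
    for k in range(n // 2):
        t = W ** k * Xb[k]
        X[k] = Xa[k] + t
        X[k + n // 2] = Xa[k] - t
    return X

def naive_dft(x):
    n = len(x)
    W = cmath.exp(-2j * cmath.pi / n)
    return [sum(x[t] * W ** (k * t) for t in range(n)) for k in range(n)]

x = [1, 2, 0, -1, 4, 0, 2, 5]
assert all(abs(a - b) < 1e-9 for a, b in zip(dit_fft(x), naive_dft(x)))
```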
But how does this FFT relate to those studied in Chapters 4 and 5? Let us
construct a diagram of this algorithm for the case where n = 8. First, let us reduce
the problem of computing the FFT of size 8 into two subproblems of size 4.
[Diagram: a size-8 DFT reduced to two size-4 DFTs on the even-indexed samples x(0), x(2), x(4), x(6) and the odd-indexed samples x(1), x(3), x(5), x(7), combined with twiddle factors W^0, W^1, W^2, W^3 to produce X(0) through X(7).]
[Figure: further reduction of the two size-4 DFTs into four size-2 DFTs on the input pairs (x(0), x(4)), (x(2), x(6)), (x(1), x(5)), (x(3), x(7)), again combined with the twiddle factors W^0, W^1, W^2, W^3 to produce X(0), …, X(7).]
The following diagram shows the operations needed for the entire computation.
Here W has been replaced by ω^{−1}.
[Figure: complete decimation-in-time butterfly diagram of size 8, with the inputs x(0), x(4), x(2), x(6), x(1), x(5), x(3), x(7) in bit-reversed order and the outputs X(0), …, X(7) in natural order.]
[Figure: size-8 FFT butterfly diagram with the inputs x(0), …, x(7) in natural order and the outputs X(0), X(4), X(2), X(6), X(1), X(5), X(3), X(7) in bit-reversed order, using the twiddle factors W^0, W^1, W^2, W^3.]
X(k) = Σ_{ℓ=0}^{n/2−1} x(ℓ)·W^{kℓ} + Σ_{ℓ=n/2}^{n−1} x(ℓ)·W^{kℓ}

Now,

X(k) = Σ_{ℓ=0}^{n/2−1} x(ℓ)·W^{kℓ} + Σ_{ℓ=0}^{n/2−1} x(ℓ + n/2)·W^{k(ℓ+n/2)}

Setting k to the even and odd output indices in turn,

X(2k) = Σ_{ℓ=0}^{n/2−1} x(ℓ)·(W^2)^{kℓ} + Σ_{ℓ=0}^{n/2−1} x(ℓ + n/2)·(W^2)^{kℓ + kn/2}
      = Σ_{ℓ=0}^{n/2−1} x(ℓ)·(W^2)^{kℓ} + Σ_{ℓ=0}^{n/2−1} x(ℓ + n/2)·(W^2)^{kℓ}

since (W^2)^{kn/2} = W^{kn} = 1, and

X(2k+1) = Σ_{ℓ=0}^{n/2−1} x(ℓ)·(W^{2k+1})^ℓ + Σ_{ℓ=0}^{n/2−1} x(ℓ + n/2)·(W^{2k+1})^{ℓ+n/2}
        = Σ_{ℓ=0}^{n/2−1} x(ℓ)·(W^2)^{kℓ}·W^ℓ − Σ_{ℓ=0}^{n/2−1} x(ℓ + n/2)·(W^2)^{kℓ}·W^ℓ

since (W^{2k+1})^{n/2} = W^{kn}·W^{n/2} = −1. So, to compute X(k), one can compute the two discrete Fourier series

Xa(k) = Σ_{ℓ=0}^{n/2−1} (x(ℓ) + x(ℓ + n/2))·(W^2)^{kℓ}
Xb(k) = Σ_{ℓ=0}^{n/2−1} (x(ℓ) − x(ℓ + n/2))·W^ℓ·(W^2)^{kℓ}

so that X(2k) = Xa(k) and X(2k+1) = Xb(k). The multiplications by W^ℓ in the second sequence correspond to the substitution z → W·z.
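As with the decimation-in-time case, this derivation translates into a recursive procedure. The sketch below (Python; the function name is ours) forms the two half-length sequences and recurses, returning the outputs in natural order:

```python
import cmath

def fft_dif(x):
    """Radix-2 decimation-in-frequency FFT; returns X in natural order."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    a = [x[l] + x[l + half] for l in range(half)]                  # feeds X(2k)
    b = [(x[l] - x[l + half]) * cmath.exp(-2j * cmath.pi * l / n)  # W^l twiddle
         for l in range(half)]                                     # feeds X(2k+1)
    even, odd = fft_dif(a), fft_dif(b)
    X = [0] * n
    X[0::2], X[1::2] = even, odd
    return X
```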
[Figures: decimation-in-frequency butterfly diagrams of size 8. In the first, the inputs x(0), …, x(7) appear in natural order and the outputs appear in the bit-reversed order X(0), X(4), X(2), X(6), X(1), X(5), X(3), X(7). In the second, the inputs are reordered as x(0), x(4), x(2), x(6), x(1), x(5), x(3), x(7) so that the outputs X(0), …, X(7) appear in natural order.]
4. Convolution
We learned in Chapter 6 that the FFT allows the mathematician to efficiently
compute the product of two polynomials of large degree. In the next section, we
will see that the engineer can use the FFT to efficiently compute a related operation
on two sequences of signal samples.
To introduce the engineering concepts involved, consider the system represented
by the following block-diagram
x(ℓ)  →  [ y(ℓ) = 2·x(ℓ) + x(ℓ−1) + x(ℓ−2) + x(ℓ−3) ]  →  y(ℓ)
EXAMPLE
Suppose the input sequence is x(0) = 1, x(1) = 2, x(2) = 3, x(3) = 4. Then the outputs are:

y(0) = 2·1 = 2
y(1) = 2·2 + 1 = 4 + 1 = 5
y(2) = 2·3 + 2 + 1 = 6 + 2 + 1 = 9
y(3) = 2·4 + 3 + 2 + 1 = 8 + 3 + 2 + 1 = 14
y(4) = 4 + 3 + 2 = 9
y(5) = 4 + 3 = 7
y(6) = 4
EXAMPLE
When the input is the unit impulse (x(0) = 1 and x(ℓ) = 0 for ℓ > 0), the output of the system is its impulse response:

h(0) = 2·1 = 2
h(1) = 1
h(2) = 1
h(3) = 1
In terms of the impulse response for a given system, the output of the system
for a given input is given by the convolution formula

Convolution.
Given a system with impulse response h(ℓ), the output of the system
y(ℓ) for the input sequence x(ℓ) is given by

y(ℓ) = Σ_{k=0}^{n−1} x(k)·h(ℓ − k)

or

y(ℓ) = Σ_{k=0}^{n−1} x(ℓ − k)·h(k)
EXAMPLE
Applying the convolution formula to the sequences x(ℓ) = (1, 2, 3, 4) and h(ℓ) = (2, 1, 1, 1) of the previous examples gives

y(0) = 2
y(1) = 5
y(2) = 9
y(3) = 14
y(4) = 9
y(5) = 7
y(6) = 4
This calculation is equivalent to the one given in the previous example, but
computed strictly in terms of the convolution formula.
However, there is a much simpler method of computing the convolution. Recall
that in Chapter 6, we discussed two methods of implementing classical multiplication. Observe the similarity between the second method for classical multiplication
and the convolution formula. The easier method of implementing the convolution
is to write x(ℓ) and h(ℓ) as polynomials and then multiply the polynomials using
the first method of classical multiplication.
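The equivalence between the convolution formula and polynomial multiplication can be illustrated with a short sketch (Python; the sequences are those of the running example):

```python
def convolve(x, h):
    """Direct convolution: y(l) = sum over k of x(k) * h(l - k)."""
    y = [0] * (len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        for j, hj in enumerate(h):
            y[k + j] += xk * hj   # x(k) contributes to output index k + j
    return y

# These loops are exactly the coefficient computation for the product of the
# polynomials whose coefficient lists are x and h.
x = [1, 2, 3, 4]   # x(0), x(1), x(2), x(3)
h = [2, 1, 1, 1]   # impulse response h(0), h(1), h(2), h(3)
print(convolve(x, h))   # [2, 5, 9, 14, 9, 7, 4]
```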
EXAMPLE
Writing the sequences as the polynomials

4·x^3 + 3·x^2 + 2·x + 1
x^3 + x^2 + x + 2

and multiplying them gives 4x^6 + 7x^5 + 9x^4 + 14x^3 + 9x^2 + 5x + 2, whose coefficients 2, 5, 9, 14, 9, 7, 4 (from the constant term upward) are exactly the values y(0), …, y(6) computed earlier.
[Figure: duality between polynomial multiplication and convolution. Polynomial side: f(z) and g(z) are evaluated using FFTs, the pointwise products of the evaluations are formed, and interpolation with an IFFT yields n·h(z). Signal side: x(ℓ) and h(ℓ) are transformed with FFTs into the frequency-domain values X(k) and H(k), the pointwise products Y(k) = X(k)·H(k) are formed, and an IFFT returns n·y(ℓ) in the time domain.]
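The fast method suggested by the diagram can be sketched as follows (Python; a minimal radix-2 FFT is included so the example is self-contained, and the 1/n scaling is applied at the end):

```python
import cmath

def fft(x, inverse=False):
    """Radix-2 FFT; the inverse variant omits the 1/n scaling."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    X = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        X[k], X[k + n // 2] = even[k] + w, even[k] - w
    return X

def fft_convolve(x, h):
    """Convolve by evaluation, pointwise products, and interpolation."""
    n = 1
    while n < len(x) + len(h) - 1:
        n *= 2                       # pad to a power of 2
    X = fft(x + [0] * (n - len(x)))
    H = fft(h + [0] * (n - len(h)))
    Y = fft([Xk * Hk for Xk, Hk in zip(X, H)], inverse=True)
    # The unscaled inverse returns n * y(l); divide by n and round
    return [round((y / n).real) for y in Y[:len(x) + len(h) - 1]]
```

For the running example, fft_convolve([1, 2, 3, 4], [2, 1, 1, 1]) reproduces the sequence 2, 5, 9, 14, 9, 7, 4 obtained by the direct method.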
CHAPTER 8
(121)₃ = 1·3^2 + 2·3^1 + 1·3^0 = 16

The notation ( )₃ is used to indicate that this represents a ternary number.
To convert from decimal to ternary, divide the number by three and record
the remainder. Then divide the quotient by three and again record the remainder.
Continue the process until the quotient is zero. The ternary representation of the
number is the sequence of remainders in reverse order.
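The conversion process just described can be sketched as follows (Python; the function name is ours):

```python
def to_ternary(n):
    """Repeatedly divide by 3; the remainders, reversed, are the digits."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, r = divmod(n, 3)   # quotient and remainder
        digits.append(str(r))
    return "".join(reversed(digits))

print(to_ternary(65))  # "2102", since 65 = 2*27 + 1*9 + 0*3 + 2
```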
EXAMPLE
Convert 65 to ternary:

65 ÷ 3 = 21, remainder R0 = 2
21 ÷ 3 = 7, remainder R1 = 0
7 ÷ 3 = 2, remainder R2 = 1
2 ÷ 3 = 0, remainder R3 = 2

So 65 = (2102)₃. Checking:

(2102)₃ = 2·3^3 + 1·3^2 + 0·3^1 + 2·3^0
        = 2·27 + 1·9 + 0·3 + 2·1
        = 54 + 9 + 0 + 2
        = 65
For the 2-adic FFT algorithms, we used the binary reversal function to give the
order of the outputs. For the radix-3 algorithm, we will use the following related
function, the ternary reversal function T(j)(n), to give the order of the outputs.
EXAMPLE
Let j = 11 and n = 27 = 3^3.
In ternary form, j = (102)₃.
So T(j)(27) = (201)₃.
As a decimal number, T(j)(27) = 2·9 + 0·3 + 1 = 19.
As with the binary reversal function, leading zeros should be included in the ternary
reversal of the number. Also, if n is the same for many related calculations and it
is understood from the context of the situation what n is, then it is not necessary
to specify n in the notation and one can simply use T(j) instead.
Properties of the ternary reversal function.
(1).
(2).
(3).
(4).
The proof of each of these properties closely follows a related property of the
binary reversal function. Each of these proofs is left as an exercise.
EXAMPLE
For n = 9, each j is written with two ternary digits:

0 = (00)₃   T(0) = (00)₃ = 0
1 = (01)₃   T(1) = (10)₃ = 3
2 = (02)₃   T(2) = (20)₃ = 6
3 = (10)₃   T(3) = (01)₃ = 1
4 = (11)₃   T(4) = (11)₃ = 4
5 = (12)₃   T(5) = (21)₃ = 7
6 = (20)₃   T(6) = (02)₃ = 2
7 = (21)₃   T(7) = (12)₃ = 5
8 = (22)₃   T(8) = (22)₃ = 8

One can verify that the four properties above hold when n = 9.
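A sketch of the ternary reversal function (Python; the function name is ours, and T(j)(n) is written ternary_reversal(j, n)):

```python
def ternary_reversal(j, n):
    """Reverse the base-3 digits of j, padded with leading zeros so that
    the digit count matches n = 3^k."""
    k = 0
    while 3 ** k < n:
        k += 1
    digits = []
    for _ in range(k):
        j, r = divmod(j, 3)
        digits.append(r)
    # digits holds the base-3 representation least-significant digit first,
    # which is exactly the reversed digit order
    value = 0
    for d in digits:
        value = value * 3 + d
    return value

print([ternary_reversal(j, 9) for j in range(9)])  # [0, 3, 6, 1, 4, 7, 2, 5, 8]
```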
2. Classical radix-3 FFT
A radix-3 FFT algorithm evaluates a polynomial of degree less than n at each
of the roots of z^n − 1 where n = 3^k. It can be shown that ω = 1∠(360°/n) is a
primitive nth root of unity and that {ω^0, ω^1, ω^2, …, ω^{n−1}} are the n evaluation
points that will be used in the process.
The radix-3 algorithm presented in this section is similar to a technique introduced by Winograd as presented in [33], but is believed to be easier to understand
than the Winograd algorithm. Another version of the radix-3 algorithm [12] requires the inputs to be transformed into a different number system. 1 The radix-3
1This number system is called the slanted complex number system, which consists of numbers of the form A + ∛1·B where A and B are real numbers and ∛1 is the element 1∠120°
algorithm discussed in this section requires fewer operations than the one found
in [12] and allows the inputs to remain in the traditional complex number system
presented in Chapter 3.
Let Ω = ω^{n/3} so that Ω^3 = 1. Note that Ω also has the property that Ω^2 + Ω +
1 = 0 and that Ω^2 is the complex conjugate of Ω:

Ω = −1/2 + (√3/2)·I,   Ω^2 = −1/2 − (√3/2)·I
The input to the reduction step of the classical radix-3 FFT algorithm is given
by f(z) mod (z^{3m} − b^3) and the output is f(z) mod (z^m − b), f(z) mod (z^m − Ω·b),
and f(z) mod (z^m − Ω^2·b). Before discussing the reduction step itself, let us consider
the tree of modulus polynomials.
At the top of the tree is z^n − 1. Thus, we are going to evaluate f(z) at each
of the nth roots of unity. At each reduction step with input size 3m, let b = ω^{T(3j)}
for some j < n/m. In the previous section, we saw that b^3 = (ω^{T(3j)})^3 = ω^{T(j)},
Ω·b = ω^{n/3}·ω^{T(3j)} = ω^{T(3j)+n/3} = ω^{T(3j+1)}, and Ω^2·b = ω^{2n/3}·ω^{T(3j)} = ω^{T(3j)+2n/3} =
ω^{T(3j+2)}. So the input modulus polynomial is z^{3m} − ω^{T(j)} and the output modulus
polynomials are z^m − ω^{T(3j)}, z^m − ω^{T(3j+1)}, and z^m − ω^{T(3j+2)}. At the bottom of
the tree are z − ω^{T(j)} for all 0 ≤ j < n.
The modulus polynomial tree when n = 9 is given by:

z^9 − 1
  z^3 − 1:      z − ω^0,  z − ω^3,  z − ω^6
  z^3 − ω^3:    z − ω^1,  z − ω^4,  z − ω^7
  z^3 − ω^6:    z − ω^2,  z − ω^5,  z − ω^8
from the traditional complex number system. It can be shown that the complex numbers and
slanted complex numbers are two different ways of representing the same collection of numbers.
where the 9 points used for the multipoint evaluation are given by the powers ω^0, ω^1, …, ω^8.
The reduction step is again simple to perform with these modulus polynomials.
Split the input into three blocks of size m by writing f(z) mod (z^{3m} − ω^{T(j)}) =
fA·z^{2m} + fB·z^m + fC. Then the outputs are given by

fX = ω^{2T(3j)}·fA + ω^{T(3j)}·fB + fC
fY = Ω^2·ω^{2T(3j)}·fA + Ω·ω^{T(3j)}·fB + fC
fZ = Ω·ω^{2T(3j)}·fA + Ω^2·ω^{T(3j)}·fB + fC
or, in matrix form,

[ fX ]   [ 1     1     1 ]   [ ω^{2T(3j)}·fA ]
[ fY ] = [ Ω^2   Ω     1 ] · [ ω^{T(3j)}·fB  ]
[ fZ ]   [ Ω     Ω^2   1 ]   [ fC            ]
We are going to compute the reduction step in a special way using the fact
that Ω and Ω^2 are complex conjugates. Express Ω = Ω_R + I·Ω_I and observe that
Ω^2 = Ω_R − I·Ω_I. Now let fR and fI be defined by

fR = Ω_R·(ω^{2T(3j)}·fA + ω^{T(3j)}·fB)
fI = I·Ω_I·(ω^{T(3j)}·fB − ω^{2T(3j)}·fA)

so that fY = fR + fI + fC and fZ = fR − fI + fC.
[Figure: butterfly for the classical radix-3 reduction step. The inputs fC, fB·ω^{T(3j)}, and fA·ω^{2T(3j)} are combined to form the outputs fX, fY, and fZ.]
EXAMPLE

fY = (z + 0) + (I·z + I) = (1 + I)·z + I
fZ = (z + 0) − (I·z + I) = (1 − I)·z − I
[Figure: butterfly diagram for the classical radix-3 reduction of a size-9 input. The coefficients f_8, …, f_0 are combined, with multiplications by ω^{T(3j)} and ω^{2T(3j)}, to produce the intermediate blocks fR, fS, fT, fU, fV, fW, fX, fY, fZ.]
The FFT of size 9 for the input polynomial f(z) = z^7 + 2z^6 + 3z^5 + z^4 + 2z^3 + 3z^2 + 2z + 1 considered earlier in this section is given by the following figure.

[Figure: reduction tree for the size-9 classical radix-3 FFT of f(z); the intermediate results include 3z^3 + 5z^2 + 5z + 2, 8z + 7, and 15.]
The operation counts for the classical radix-3 FFT algorithm satisfy the recurrence relations

M(n) = 3·M(n/3) + n − 2
A(n) = 3·A(n/3) + (5/3)·n

and the number of scalar multiplications satisfies

Ms(n) = 3·Ms(n/3) + (2/3)·n

where M(1) = A(1) = Ms(1) = 0.
Using the technique of substitution discussed earlier for the classical radix-2 algorithm, closed-form solutions for the number of operations needed for the
classical radix-3 algorithm are given by

M(n) = n·log_3(n) − n + 1
     = (1/log_2(3))·n·log_2(n) − n + 1
     ≈ 0.631·n·log_2(n) − n + 1

A(n) = (5/3)·n·log_3(n)
     = (5/(3·log_2(3)))·n·log_2(n)
     ≈ 1.051·n·log_2(n)
It can be shown that this is the same number of operations required by the Winograd
algorithm.
Observe that this algorithm is less efficient than any of the algorithms discussed
in Chapter 4, both in terms of the number of multiplications and the number of
additions.
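The closed-form solutions can be checked against the recurrence relations numerically. The sketch below assumes the recurrences M(n) = 3·M(n/3) + n − 2 and A(n) = 3·A(n/3) + (5/3)·n with M(1) = A(1) = 0 (a reconstruction consistent with the closed forms):

```python
def M(n):
    """Multiplications for the classical radix-3 FFT (assumed recurrence)."""
    return 0 if n == 1 else 3 * M(n // 3) + n - 2

def A(n):
    """Additions for the classical radix-3 FFT (assumed recurrence)."""
    return 0 if n == 1 else 3 * A(n // 3) + 5 * n // 3

for k in range(1, 8):
    n = 3 ** k
    assert M(n) == n * k - n + 1       # n log3(n) - n + 1
    assert A(n) == 5 * n * k // 3      # (5/3) n log3(n)
```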
We can interpret the second output as some polynomial that we wish to evaluate
at the solutions to z^m = Ω = 1∠120°. Using what we learned in Chapter 2, if
we apply the transformation z → (1∠(120°/m))·z to this equation, then it becomes
z^m = 1. Recall that this transformation can be viewed as rotating the points
in the complex plane clockwise by 120/m degrees or rotating the axis system of
the complex plane counterclockwise by 120/m degrees. Again, this twisting of the
complex plane is why Bernstein calls this type of FFT the twisted FFT. The
twisted polynomial is again used to implement the transformation.
EXAMPLE
The third output f(z) mod (z^m − Ω^2) can be interpreted as some polynomial that
we wish to evaluate at the solutions to z^m = Ω^2 = 1∠240°. If we apply the
transformation z → (1∠(240°/m))·z to this equation, then it becomes z^m = 1. So
now, all of the outputs of the twisted radix-3 reduction step are in the proper form
to be used as inputs for another application of the reduction step. As with the
radix-2 algorithm, it is too complicated to create a new variable each time the
complex plane is rotated, and the notation f(ω^c·z), where ω^c records the cumulative
rotation, will again be used.
It can be shown that if the input to the reduction step is f̃(z) = f(ω^{T(j)/(3m)}·z) mod (z^{3m} − 1) for some j, then the outputs are given by

fX = f(ω^{T(3j)/m}·z) mod (z^m − 1)
f̃Y = f(ω^{T(3j+1)/m}·z) mod (z^m − 1)
f̃Z = f(ω^{T(3j+2)/m}·z) mod (z^m − 1)
The notation f̃Y and f̃Z is used to indicate that these are the results of the reduction
step after the twisting has been completed. The above expressions can be
determined by carefully keeping track of the cumulative rotations of the complex
plane using the new notation and using the fact that T(j)/(3m) = T(3j)/m.
We will now present the reduction step of the twisted radix-3 FFT algorithm.
Split f̃(z) into three blocks of size m, i.e. f̃(z) = fA·z^{2m} + fB·z^m + fC. We
need to compute

fX = fA + fB + fC
fY = Ω^2·fA + Ω·fB + fC
fZ = Ω·fA + Ω^2·fB + fC
(f̃Y)_d = (Ω^2·(fA)_d + Ω·(fB)_d + (fC)_d)·ω^{d·T(1)/m}
(f̃Z)_d = (Ω·(fA)_d + Ω^2·(fB)_d + (fC)_d)·ω^{−d·T(1)/m}

The improved algorithm makes use of the fact that ω^{T(1)/m} and ω^{−T(1)/m} are
conjugate pairs and that Ω and Ω^2 are complex conjugate pairs. So
the formulas to compute the coefficient of degree d in fX, f̃Y, and f̃Z are now

(fX)_d = (fA)_d + (fB)_d + (fC)_d
(f̃Y)_d = (Ω·(fB − fA)_d + (fC − fA)_d)·ω^{d·T(1)/m}
        = (Ω·ω^{d·T(1)/m})·(fB − fA)_d + ω^{d·T(1)/m}·(fC − fA)_d
(f̃Z)_d = (Ω^2·ω^{−d·T(1)/m})·(fB − fA)_d + ω^{−d·T(1)/m}·(fC − fA)_d

Once (f̃Y)_d has been computed, then complex conjugate properties can be used to
compute (f̃Z)_d at a reduced cost. For both formulas, we will assume that ω^{d·T(1)/m} has
been precomputed and stored. By applying these concepts for all d in 0 ≤ d < m,
the entire reduction step can be computed with fewer operations. The following
example shows the technique involved with computing (f̃Y)_d and (f̃Z)_d.
EXAMPLE
Suppose that

(fB − fA)_d = 1 + I·2
(fC − fA)_d = 3 + I·4
ω^{d·T(1)/m} = 1∠40° = 0.766 + I·0.642

so that

Ω·ω^{d·T(1)/m} = 1∠120° · 1∠40° = 1∠160° = −0.939 + I·0.342
ω^{−d·T(1)/m} = 1∠−40° = 0.766 − I·0.642
Ω^2·ω^{−d·T(1)/m} = 1∠−160° = −0.939 − I·0.342

Then

fR = (−0.939)·(1 + I·2) + (0.766)·(3 + I·4) = 1.359 + I·1.186
fI = (I·0.342)·(1 + I·2) + (I·0.642)·(3 + I·4) = −3.252 + I·2.268

Finally,

(f̃Y)_d = fR + fI = (1.359 + I·1.186) + (−3.252 + I·2.268) = −1.893 + I·3.454

and

(f̃Z)_d = fR − fI = 4.611 − I·1.082
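The arithmetic in this example can be verified numerically. The sketch below (Python) recomputes fR and fI from the example's values (as reconstructed above) and confirms that fR + fI and fR − fI agree with the directly computed outputs:

```python
import cmath
import math

a = 1 + 2j                                # (fB - fA)_d
b = 3 + 4j                                # (fC - fA)_d
theta = cmath.exp(1j * math.radians(40))  # omega^(d T(1)/m) = 1 at 40 degrees
Omega = cmath.exp(1j * math.radians(120)) # 1 at 120 degrees

# Direct computation of both outputs
fY = Omega * theta * a + theta * b
fZ = (Omega * theta).conjugate() * a + theta.conjugate() * b

# Conjugate-pair shortcut: one set of real-part and imaginary-part products
fR = (Omega * theta).real * a + theta.real * b
fI = 1j * (Omega * theta).imag * a + 1j * theta.imag * b
assert cmath.isclose(fY, fR + fI)
assert cmath.isclose(fZ, fR - fI)
assert abs(fR - (1.359 + 1.186j)) < 0.01  # matches the example's fR
```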
The twisted radix-3 FFT algorithm is initialized with f̃(z), which equals f(ω^0·z) mod (z^n − 1) if f has degree less than n. By recursively applying the reduction
step to f̃(z), we obtain f(ω^{T(j)}·z) mod (z − 1) = f(ω^{T(j)}·1) = f(ω^{T(j)}) for all j in the range
0 ≤ j < n. This is the desired FFT of f(z).
The following figure shows how every intermediate result of this FFT calculation
relates to the original input polynomial f(z) for n = 9.
f(z) mod (z^9 − 1)
  f(z) mod (z^3 − 1):       f(ω^0), f(ω^3), f(ω^6)
  f(ω·z) mod (z^3 − 1):     f(ω^1), f(ω^4), f(ω^7)
  f(ω^2·z) mod (z^3 − 1):   f(ω^2), f(ω^5), f(ω^8)
The next diagram shows the intermediate results of the FFT of f(z) = z^7 +
2z^6 + 3z^5 + z^4 + 2z^3 + 3z^2 + 2z + 1 using the twisted method. One can compare
these results with those given for the computation of the classical
algorithm provided in the previous section.
[Figure: reduction tree for the twisted radix-3 FFT of f(z) = z^7 + 2z^6 + 3z^5 + z^4 + 2z^3 + 3z^2 + 2z + 1, with intermediate results including 3z^3 + 5z^2 + 5z + 2, 8z + 7, and 15.]
Butterfly diagram:
[Butterfly diagram: twisted radix-3 FFT of size 9, mapping the inputs f_0, …, f_8 to the outputs f(ω^0), f(ω^3), f(ω^6), f(ω^1), f(ω^4), f(ω^7), f(ω^2), f(ω^5), f(ω^8); the butterflies involve multiplications by Ω_R and I·Ω_I together with the precomputed twiddle factors.]
The operation counts 2 for the twisted radix-3 FFT algorithm satisfy the recurrence relations

M(n) = 3·M(n/3) + (2/3)·n − 1
A(n) = 3·A(n/3) + 2·n

where M(1) = 0 and A(1) = 0. The method of substitution can be used to solve these recurrence
relations and obtain the formulas given by

2We can subtract one multiplication for the case where d = 0, since then ω^{d·T(1)/m} = 1.
M(n) = (2/3)·n·log_3(n) − (1/2)·n + 1/2
     = (2/(3·log_2(3)))·n·log_2(n) − (1/2)·n + 1/2
     ≈ 0.421·n·log_2(n) − 0.5·n + 0.5

A(n) = 2·n·log_3(n)
     = (2/log_2(3))·n·log_2(n)
     ≈ 1.262·n·log_2(n)
It is left as a research problem to try to improve these algorithms and obtain an algorithm with
a lower operation count than the one presented in this section.
4. Radix-5 FFTs
A radix-5 FFT algorithm evaluates a polynomial of degree less than n at each
of the roots of z^n − 1 where n = 5^k. It can be shown that ω = 1∠(360°/n) is a
primitive nth root of unity and that {ω^0, ω^1, ω^2, …, ω^{n−1}} are the n evaluation
points that will be used in the process.
Let Ω = ω^{n/5} = 1∠72° so that Ω^5 = 1. Note that Ω also has the property that
Ω^4 + Ω^3 + Ω^2 + Ω + 1 = 0, and that Ω^4 is the complex conjugate of Ω while
Ω^3 is the complex conjugate of Ω^2.
The input to the reduction step of the classical radix-5 FFT algorithm is given
by f(z) mod (z^{5m} − b^5) and the output is

fV = f(z) mod (z^m − b)
fW = f(z) mod (z^m − Ω·b)
fX = f(z) mod (z^m − Ω^2·b)
fY = f(z) mod (z^m − Ω^3·b)
fZ = f(z) mod (z^m − Ω^4·b)
The radix-2 and radix-3 FFT algorithms involved a reversal function needed to
determine the values of b in the reduction step. For this particular FFT, we need
the pentary reversal function P(j), which involves the base-5 representation of numbers.
EXAMPLE

With b = ω^{P(5j)}, the following properties hold:

b^5 = (ω^{P(5j)})^5 = ω^{P(j)}
Ω·b = ω^{n/5}·ω^{P(5j)} = ω^{P(5j+1)}
Ω^2·b = ω^{2n/5}·ω^{P(5j)} = ω^{P(5j+2)}
Ω^3·b = ω^{3n/5}·ω^{P(5j)} = ω^{P(5j+3)}
Ω^4·b = ω^{4n/5}·ω^{P(5j)} = ω^{P(5j+4)}
The proof of each of these properties is left as an exercise for the reader.
At each reduction step with input size 5m, let b = ω^{P(5j)} for some j < n/m.
Using this selection for b, the input to the reduction step is f(z) mod (z^{5m} − ω^{P(j)})
and the outputs of the reduction step are

fV = f(z) mod (z^m − ω^{P(5j)})
fW = f(z) mod (z^m − ω^{P(5j+1)})
fX = f(z) mod (z^m − ω^{P(5j+2)})
fY = f(z) mod (z^m − ω^{P(5j+3)})
fZ = f(z) mod (z^m − ω^{P(5j+4)})
To perform the reduction step, split the input into five blocks of size m by
writing f(z) mod (z^{5m} − ω^{P(j)}) = fA·z^{4m} + fB·z^{3m} + fC·z^{2m} + fD·z^m + fE.
Then the outputs can be expressed in matrix form as

[ fV ]   [ 1     1     1     1     1 ]   [ ω^{4P(5j)}·fA ]
[ fW ]   [ Ω^4   Ω^3   Ω^2   Ω     1 ]   [ ω^{3P(5j)}·fB ]
[ fX ] = [ Ω^3   Ω     Ω^4   Ω^2   1 ] · [ ω^{2P(5j)}·fC ]
[ fY ]   [ Ω^2   Ω^4   Ω     Ω^3   1 ]   [ ω^{P(5j)}·fD  ]
[ fZ ]   [ Ω     Ω^2   Ω^3   Ω^4   1 ]   [ fE            ]
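When m = 1 and b = 1, the reduction step is ordinary evaluation at the five fifth roots of unity, which gives a quick numerical check of the Vandermonde-style matrix used here (Python; the test polynomial is our own choice):

```python
import cmath

Omega = cmath.exp(2j * cmath.pi / 5)   # primitive 5th root of unity
M = [[1, 1, 1, 1, 1],
     [Omega**4, Omega**3, Omega**2, Omega, 1],
     [Omega**3, Omega, Omega**4, Omega**2, 1],
     [Omega**2, Omega**4, Omega, Omega**3, 1],
     [Omega, Omega**2, Omega**3, Omega**4, 1]]

# With m = 1 and b = 1, the outputs fV..fZ are the values f(Omega^i).
coeffs = [5, 4, 3, 2, 1]               # f(z) = 5z^4 + 4z^3 + 3z^2 + 2z + 1
f = lambda z: sum(c * z**(4 - i) for i, c in enumerate(coeffs))
outputs = [sum(M[r][c] * coeffs[c] for c in range(5)) for r in range(5)]
expected = [f(1), f(Omega), f(Omega**2), f(Omega**3), f(Omega**4)]
assert all(abs(o - e) < 1e-9 for o, e in zip(outputs, expected))
```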
It is left as an exercise for the reader to determine the most efficient method
to implement the above reduction step. A starting point for this exercise is the
Winograd radix-5 algorithm discussed in [33]. However, one should attempt to
improve the readability of this algorithm as was done for the radix-3 case. The new
radix-5 algorithm should exploit the fact that Ω and Ω^4 are complex conjugates, the
fact that Ω^2 and Ω^3 are complex conjugates, and the fact that Ω^4 + Ω^3 + Ω^2 + Ω + 1 =
0. The reader can also develop pseudocode for the classical radix-5 algorithm and
determine its operation count.
It is also possible to construct a twisted radix-5 algorithm. The input to each
reduction step is the polynomial f̃(z) = f(ω^{P(j)/(5m)}·z) mod (z^{5m} − 1) for some
value of j and the outputs are

fV = f̃(z) mod (z^m − 1)
fW = f̃(z) mod (z^m − Ω)
fX = f̃(z) mod (z^m − Ω^2)
fY = f̃(z) mod (z^m − Ω^3)
fZ = f̃(z) mod (z^m − Ω^4)
by using the j = 0 case of the classical radix-5 reduction step. Now fV is already
in the form that can be used as input to another application of the twisted radix-5
reduction step, but an adjustment needs to be made to each of the other outputs. The
twisted polynomial is again used to rotate the complex roots of unity to put each
output in the required form. It can be shown that the rotation used for each of the
outputs is given by
fX = f̃(z) mod (z^m − Ω^2):  rotation by ω^{P(2)/m}
fY = f̃(z) mod (z^m − Ω^3):  rotation by ω^{P(3)/m} = ω^{−P(2)/m}
fZ = f̃(z) mod (z^m − Ω^4):  rotation by ω^{P(4)/m} = ω^{−P(1)/m}
Note that two values for the rotation were given for fY and fZ. By adapting the twisted
algorithm given for the radix-3 case, the reader should be able to create a twisted
radix-5 algorithm that exploits the fact that ω^{P(1)/m} and ω^{−P(1)/m} are complex
conjugate pairs as well as the fact that ω^{P(2)/m} and ω^{−P(2)/m} are complex conjugate
pairs. These details are left for the reader to resolve. The reader is also encouraged
to develop pseudocode for the twisted radix-5 FFT and give an operation count for
the number of operations required to implement the algorithm.
In theory, the techniques discussed in this section can also be used to develop
a radix-p algorithm for any value of p. There appears to be some use of radix-7
algorithms in modern FFT software, but there does not appear to be much
value in creating FFT routines for higher values of p. The reader is encouraged
to create the classical and twisted versions of the radix-7 algorithms without any
further explanation. One can follow the steps used in creating the radix-3 and
radix-5 algorithms in the previous sections to complete this exercise. The radix-7
algorithms developed by Winograd [33] may also prove useful in completing this
exercise.
5. Radix-6 algorithms
In this section, we will show how to combine the classical radix-2 and radix-3
FFTs into the classical radix-6 FFT. By adapting the technique used in this section,
the reader can construct an FFT that works for any size of the form 2^a · 3^b · 5^c · 7^d.
The input to the reduction step of the classical radix-6 FFT algorithm is given
by f(z) mod (z^{6m} − b^6) and computes

fU = f(z) mod (z^m − b)
fV = f(z) mod (z^m − Ω·b)
fW = f(z) mod (z^m − Ω^2·b)
fX = f(z) mod (z^m − Ω^3·b)
fY = f(z) mod (z^m − Ω^4·b)
fZ = f(z) mod (z^m − Ω^5·b)
Here Ω = ω^{n/6} = 1∠60°, which satisfies Ω^3 = −1, while Ω^5 is the complex conjugate of Ω and Ω^4 is the complex conjugate of Ω^2.
One can implement the radix-6 reduction step by decomposing it into a radix-2
reduction step followed by two radix-3 reduction steps or into a radix-3 reduction
step followed by three radix-2 reduction steps. We will present the second option
in this section.
The first part of the radix-6 reduction step receives as input f(z) mod (z^{6m} − b^6)
and produces f(z) mod (z^{2m} − b^2), f(z) mod (z^{2m} − Ω^2·b^2), and f(z) mod (z^{2m} −
Ω^4·b^2) using the radix-3 reduction step. The radix-2 reduction step is then used
to implement the following reductions

f(z) mod (z^{2m} − b^2)      →  f(z) mod (z^m − b),      f(z) mod (z^m − Ω^3·b)
f(z) mod (z^{2m} − Ω^2·b^2)  →  f(z) mod (z^m − Ω·b),    f(z) mod (z^m − Ω^4·b)
f(z) mod (z^{2m} − Ω^4·b^2)  →  f(z) mod (z^m − Ω^2·b),  f(z) mod (z^m − Ω^5·b)
It is convenient to present the reduction step outputs in the order f(z) mod (z^m − b),
f(z) mod (z^m − Ω·b), f(z) mod (z^m − Ω^2·b), …, f(z) mod (z^m − Ω^5·b). The
above outputs can be unscrambled into this order at no cost.
The following diagram shows how this reduction step can compute an FFT of
size 6 for an input polynomial of f(z) = fA·z^5 + fB·z^4 + fC·z^3 + fD·z^2 + fE·z + fF.
[Figure: FFT of size 6 computed with this reduction step, mapping the inputs f_5, f_4, f_3, f_2, f_1, f_0 to the outputs f(ω^0), f(ω^3), f(ω^1), f(ω^4), f(ω^2), f(ω^5).]
For FFTs of size 6^k, one must develop a function that unscrambles the order of
the FFT outputs back into the natural order {f(ω^0), f(ω^1), …, f(ω^{n−1})}. The
required function is just the hexary (base-6) reversal function, which involves writing
down an integer in base-6 form with respect to n (which is some power of 6), reversing
the hexits, and then returning the decimal version of the number obtained. We
will denote this function using the notation H(j).
EXAMPLE
Let j = 25 and n = 36 = 6^2.
In hexary form, j = (41)₆.
So H(j)(36) = (14)₆.
As a decimal number, H(j)(36) = 1·6 + 4 = 10.
[Figure: another arrangement of the size-6 FFT computation, with inputs f_5, …, f_0 and outputs f(ω^0), f(ω^3), f(ω^1), f(ω^4), f(ω^2), f(ω^5).]
It is left as an exercise for the reader to develop the theorems that demonstrate
that the twisted radix-6 algorithm works. The reader is also encouraged to develop
the pseudocode for the algorithm and count the number of multiplications and
additions required.
CHAPTER 9
Additional topics
The main goals for writing this book were to introduce the reader to the algebraic perspective of the FFT, to demonstrate the importance of the FFT for
performing the operation of convolution / polynomial multiplication, and to show
how an FFT algorithm can be constructed for any input size which is the product
of one or more powers of small primes. The previous eight chapters were designed
to meet these objectives. In this final chapter, we will introduce several additional
topics related to the Fast Fourier Transform which the reader can explore. Because
most of these topics require a higher mathematical background than that which is
assumed for this book, only an overview will be presented for each of these topics.
References will be given which can be used to obtain the necessary background
in each area. This chapter essentially surveys the highlights of the author's doctoral research, and further details on many of these topics can also be found in the
resulting doctoral dissertation.
The addition and multiplication tables for the integers modulo 2 and modulo 3 are given by

+ | 0 1      × | 0 1
0 | 0 1      0 | 0 0
1 | 1 0      1 | 0 1

+ | 0 1 2      × | 0 1 2
0 | 0 1 2      0 | 0 0 0
1 | 1 2 0      1 | 0 1 2
2 | 2 0 1      2 | 0 2 1
+ | 0 1 2 3 4      × | 0 1 2 3 4
0 | 0 1 2 3 4      0 | 0 0 0 0 0
1 | 1 2 3 4 0      1 | 0 1 2 3 4
2 | 2 3 4 0 1      2 | 0 2 4 1 3
3 | 3 4 0 1 2      3 | 0 3 1 4 2
4 | 4 0 1 2 3      4 | 0 4 3 2 1
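Tables such as these can be generated mechanically; a minimal sketch (Python):

```python
def tables(p):
    """Addition and multiplication tables for the integers modulo p."""
    add = [[(i + j) % p for j in range(p)] for i in range(p)]
    mul = [[(i * j) % p for j in range(p)] for i in range(p)]
    return add, mul

add5, mul5 = tables(5)
print(mul5[3])  # [0, 3, 1, 4, 2] -- the row for 3 in the mod-5 table above
```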
In algebra classes, one learns several properties required for a number system
to have all four basic arithmetic operations (addition, subtraction, multiplication,
and division). Using the above construction, it can be shown that the modulus
must be a prime number for the operation of division to be possible. Thus, all such
finite fields must have a prime number of elements.
It is possible to create additional finite fields using polynomials with coefficients
from one of the prime finite fields. 1 It turns out that all of these additional finite
fields have q = p^k elements where p is a prime and k is an integer. The elements
of such a finite field consist of all of the remainders possible when an arbitrary
polynomial is divided by some fixed polynomial of degree k called the generating
polynomial. Addition in such a structure consists of simply adding two of these
polynomials using the arithmetic of the finite field associated with the coefficients of
the polynomials. Multiplication consists of multiplying two of the polynomials with
the result reduced by the generating polynomial. In order for the multiplication
table to have the properties necessary for division to be possible, the generating
polynomial must be one that cannot be factored (such a polynomial is called
irreducible). It can be shown that every finite field with q elements
contains the zero element and some other element which is a primitive mth root
of unity where m = q − 1. The polynomial x can serve as
one of these primitive roots of unity in the system consisting of the q possible
residue polynomials. If the generating polynomial has the property that x is a
primitive mth root of unity where m = q − 1, then the generating polynomial is
said to be primitive. Because all finite fields of size q have the same algebraic
structure, we can use the notation GF(q) to denote THE finite field of size q.
For example, let us construct a finite field with 2^4 = 16 elements. This finite
field will consist of all remainders that are possible when an arbitrary polynomial
with coefficients in GF(2) is divided by some generating polynomial of degree 4 with
coefficients in GF(2). We are going to choose the generating polynomial x^4 + x + 1.
It can be shown that this polynomial cannot be factored over GF(2) and thus it is
irreducible. We must also show that it is primitive.
Although there are better ways to show that x is a primitive 15th root of unity
in the system of residue polynomials for x^4 + x + 1 with arithmetic in GF(2), we
are going to proceed by listing all of the powers of x in this system and showing
that d = 15 is the first positive exponent where x^d = 1. Clearly, x^2 = x·x
and x^3 = x^2·x. Now x^4 = x^3·x, but since this result is of degree 4, we must
1Mathematicians prefer to define these other finite fields in terms of a concept called a
quotient ring rather than the method used in this section. However, most mathematicians then
switch over to the system used in this section for computational purposes. The presentation in
this section is not traditional, but is believed to be easier for the beginning student to understand.
This topic is discussed in more detail in the appendix.
reduce it modulo the generating polynomial to obtain x^4 = x + 1. Continuing in
this manner (and writing α = x), the powers are:

α^1  = 0x^3 + 0x^2 + 1x + 0
α^2  = 0x^3 + 1x^2 + 0x + 0
α^3  = 1x^3 + 0x^2 + 0x + 0
α^4  = 0x^3 + 0x^2 + 1x + 1
α^5  = 0x^3 + 1x^2 + 1x + 0
α^6  = 1x^3 + 1x^2 + 0x + 0
α^7  = 1x^3 + 0x^2 + 1x + 1
α^8  = 0x^3 + 1x^2 + 0x + 1
α^9  = 1x^3 + 0x^2 + 1x + 0
α^10 = 0x^3 + 1x^2 + 1x + 1
α^11 = 1x^3 + 1x^2 + 1x + 0
α^12 = 1x^3 + 1x^2 + 1x + 1
α^13 = 1x^3 + 1x^2 + 0x + 1
α^14 = 1x^3 + 0x^2 + 0x + 1
α^15 = 0x^3 + 0x^2 + 0x + 1 = 1
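This listing can be reproduced with a few lines of bit arithmetic, representing each residue polynomial as a 4-bit integer whose bits are the coefficients of x^3, x^2, x, 1 (an encoding chosen here for illustration):

```python
def gf16_powers():
    """Powers of x modulo x^4 + x + 1 over GF(2), as 4-bit integers."""
    powers = [1]                 # x^0 = 1
    for _ in range(14):
        v = powers[-1] << 1      # multiply by x
        if v & 0b10000:          # a degree-4 term appeared:
            v ^= 0b10011         # subtract (XOR) x^4 + x + 1
        powers.append(v)
    return powers

p = gf16_powers()
assert len(set(p)) == 15         # x really is a primitive 15th root of unity
assert p[4] == 0b0011            # x^4 = x + 1
```

With this encoding, the list reproduces the table above; for example, p[7] is 0b1011, that is, x^3 + x + 1.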
The addition and multiplication tables for this finite field with 16 elements are given
by
[Tables: addition and multiplication for GF(16), with the symbols 0, 1, 2, …, 14 standing for 0, α^0, α^1, …, α^14. The multiplication table follows the rule α^i · α^j = α^{(i+j) mod 15}, with 0·α^i = 0.]
There is much more that can be said about the basic properties of finite fields.
In fact, entire books have been written about the topic. Probably the most popular
book is the one written by Lidl and Niederreiter [29]. However, the present author
also highly recommends a book written by Wan [36] which may be more accessible
for an undergraduate student.
Although a finite field can be constructed for any prime power, most finite
fields that are used in practice are of size which is a power of 2. This is because
it is convenient to store these elements using a computer and because addition can
be implemented by simply using the computer's exclusive-or (XOR) operator.
Suppose that we wish to efficiently evaluate a polynomial of degree less than
n = 2^k with GF(2^k) coefficients at each of the elements of this finite field. Since a
finite field consists of the zero element and the mth roots of unity where m = n − 1,
each of the elements of GF(2^k) is a root of x·(x^{n−1} − 1) = x^n − x. The number that
we think of as −1 in our number system is equivalent to the number +1 in GF(2^k),
so x^n − x = x^n + x. Because n − 1 is never the power of a small prime (2, 3, 5, 7)
or the product of powers of small primes, a finite field does not contain the correct
primitive root of unity to use one of the FFT algorithms discussed in Chapters 4
and 8. These algorithms are sometimes called multiplicative FFTs because they
take advantage of the multiplicative structure of the set of the roots of unity. The
algorithms introduced in this section are called additive FFTs because they take
advantage of an additive structure present in finite fields.
A basis is a special collection of elements that are linearly independent. Basically, this means that if you take any subset of the elements in the basis, you
cannot add or subtract the elements in the subset and end up with
another element of the basis that is not in the subset. 2 The elements of a basis are like building
blocks that can be combined to form a set of elements.
In the example of GF(16), the only combinations of β₁ = 1 are 0·1 = 0
and 1·1 = 1. Note that all finite fields with 2^k elements have the property that
1 + 1 = 0, so what we think of as the number 2 is really equivalent to 0 in this number
system. If the element β₂ = α^5 is added to the basis, then we can add combinations
of 1 and α^5 to form {0, 1, α^5, α^10}. Next, if we add β₃ = α to the basis, then we
can add linear combinations of 1, α^5, and α to form {0, 1, α, α^2, α^4, α^5, α^8, α^10}.
Finally, if we add β₄ = α^7 to the basis, then we obtain all of GF(16). A list of each
of the 16 elements in terms of the basis {1, α^5, α, α^7} is given below.
2The definitions of basis and linear independence are actually more complicated than this,
but these definitions depend on a more advanced knowledge of algebra.
0    = 0·α^7 + 0·α + 0·α^5 + 0·1
1    = 0·α^7 + 0·α + 0·α^5 + 1·1
α    = 0·α^7 + 1·α + 0·α^5 + 0·1
α^2  = 0·α^7 + 1·α + 1·α^5 + 0·1
α^3  = 1·α^7 + 1·α + 0·α^5 + 1·1
α^4  = 0·α^7 + 1·α + 0·α^5 + 1·1
α^5  = 0·α^7 + 0·α + 1·α^5 + 0·1
α^6  = 1·α^7 + 0·α + 1·α^5 + 1·1
α^7  = 1·α^7 + 0·α + 0·α^5 + 0·1
α^8  = 0·α^7 + 1·α + 1·α^5 + 1·1
α^9  = 1·α^7 + 0·α + 0·α^5 + 1·1
α^10 = 0·α^7 + 0·α + 1·α^5 + 1·1
α^11 = 1·α^7 + 1·α + 1·α^5 + 1·1
α^12 = 1·α^7 + 1·α + 1·α^5 + 0·1
α^13 = 1·α^7 + 0·α + 1·α^5 + 0·1
α^14 = 1·α^7 + 1·α + 0·α^5 + 0·1
The additive FFT is a special case of multipoint evaluation, just like the multiplicative FFTs discussed in the previous chapters are special cases of multipoint
evaluation. However, the modulus polynomial tree is based on the factorization of
x^n + x in this case. One design for a modulus polynomial tree that can be used to
efficiently evaluate a polynomial with coefficients in GF(16) at each of the elements
of this finite field is given below. Here, s = x^2 + x, as space did not allow this polynomial to be explicitly included in the figure, and this emphasizes the point that all
of the polynomials in each row of the table are the same except for their constant
term.
x^16 + x
  x^8 + x^4 + x^2 + x
    x^4 + x:         s,  s + 1
    x^4 + x + 1:     s + α^5,  s + α^10
  x^8 + x^4 + x^2 + x + 1
    x^4 + x + α^5:   s + α,  s + α^4
    x^4 + x + α^10:  s + α^2,  s + α^8

At the bottom of the tree are the 16 linear polynomials x + γ, one for each element γ of GF(16).
Note that at every branch of the tree, the elements are split into two equal-sized groups based on whether or not one of the basis elements is used in a linear
combination involving the elements of that group. For example, in the first branch
of the tree above, all of the elements which are roots of x^8 + x^4 + x^2 + x share the
property that β₄ = α^7 is not needed to express the element as a combination of the
basis {1, α^5, α, α^7} (see the list of GF(16) in terms of the basis elements above).
Conversely, all of the elements that are roots of x^8 + x^4 + x^2 + x + 1 require β₄ if the
elements are to be expressed as a combination of the basis elements. At the second
level of the tree, the elements of each set are divided into two equal-sized groups
based on whether or not β₃ = α is needed in the linear combination. The third
level of the tree subdivides the elements based on β₂ = α^5 and the last level of the
tree subdivides the elements based on β₁ = 1. Because the elements are subdivided
at each stage based on whether or not a basis element is needed to construct the
element and because the basis elements are added together to form the elements of
a finite field, an FFT that uses this technique to efficiently evaluate a polynomial
at each of the elements is sometimes called an additive FFT.
The additive FFT first appeared in the literature in a paper written by Wang and Zhu in 1988 [37]. However, that paper did not work out the details of how to construct the basis used to subdivide the elements at each stage of the FFT. Mathematicians often use big-O notation to measure the cost of an algorithm. This notation records only the term of the operation count that grows the fastest and omits the multiplicative coefficient of this term. So all of the multiplicative FFT algorithms require O(n log2(n)) multiplications and O(n log2(n)) additions. The Wang-Zhu algorithm requires O(n (log2(n))^2) operations in general. Shortly after the publication of this paper, Cantor [9] (1989) published a paper that uses a special basis to improve the number of operations needed to compute this FFT to O(n (log2(n))^1.585) operations. The set of elements {1, β^5, β, β^7} is an example of a Cantor basis and has the special property that (β_{i+1})^2 + β_{i+1} = β_i for each 1 ≤ i < k. In the case of the example used in this section, (β^5)^2 + β^5 = 1, (β)^2 + β = β^5, and (β^7)^2 + β^7 = β. Two important properties of the Cantor basis in the additive FFT are the fact that all of the nonzero coefficients in the modulus polynomials are ones except in the constant term, and that if si(x) is the modulus polynomial in the left node at a branch in the tree, then si(x) + 1 is the modulus polynomial in the right node at that branch in the tree. These properties of the modulus polynomials simplify the reduction step of the additive FFT algorithm compared to general multipoint evaluation and lead to the O(n (log2(n))^1.585) operation count. Additional details of these additive FFT algorithms can also be found in a paper by Gerhard and von zur Gathen [18] as well as the present author's
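To make the Cantor basis property concrete, the following sketch builds GF(16) as GF(2)[x]/(x^4 + x + 1), one common choice of irreducible polynomial, and checks that the basis {1, β^5, β, β^7} satisfies (β_{i+1})^2 + β_{i+1} = β_i. Taking β to be the class of x in this particular field representation is an assumption for illustration, not a construction from the text.

```python
# GF(16) = GF(2)[x]/(x^4 + x + 1); elements are 4-bit ints, beta = x = 0b0010.
# (The choice of irreducible polynomial is an assumption made for this sketch.)

def gf16_mul(a, b):
    """Carry-less multiply followed by reduction mod x^4 + x + 1."""
    p = 0
    for i in range(4):
        if (b >> i) & 1:
            p ^= a << i
    for i in range(7, 3, -1):           # clear bits x^7 .. x^4
        if (p >> i) & 1:
            p ^= 0b10011 << (i - 4)     # x^4 + x + 1
    return p

def gf16_pow(a, n):
    r = 1
    for _ in range(n):
        r = gf16_mul(r, a)
    return r

beta = 0b0010
# Cantor basis from the text: beta_1 = 1, beta_2 = beta^5, beta_3 = beta, beta_4 = beta^7
basis = [1, gf16_pow(beta, 5), beta, gf16_pow(beta, 7)]
for i in range(3):
    b_next, b_prev = basis[i + 1], basis[i]
    assert gf16_mul(b_next, b_next) ^ b_next == b_prev   # (b)^2 + b = previous
```

In this representation β^5 = x^2 + x and β^7 = x^3 + x + 1, and the three identities (β^5)^2 + β^5 = 1, β^2 + β = β^5, and (β^7)^2 + β^7 = β all check out.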
Another additive FFT algorithm is introduced in the present author's doctoral dissertation and is based on earlier work by Shuhong Gao that was only published in an informal set of notes [17] for a course that he taught in 2001. The new additive FFT algorithm was improved upon as part of the present author's doctoral research and is now in a form that requires fewer operations than the Cantor algorithm for all input sizes where the finite field has 2^k elements and k is itself a power of 2.3
3Most practical uses of finite fields also have the property that k is itself a power of 2 because this matches the most popular data type sizes used in a computer.
9. ADDITIONAL TOPICS
This O(n log2(n) log2(log2(n))) algorithm takes advantage of the fact that for any finite field of this form, x^d − x divides x^(2^k) − x for all values of d = 2^j where j is itself a power of 2. Observe that GF(16) = GF(2^4) is a finite field of this form since 4 = 2^2. Note that x^4 − x divides x^16 − x since 4 = 2^2 and 2 = 2^1. Also, x^2 − x divides x^16 − x since 2 = 2^1 and 1 = 2^0. However, 8 = 2^3 and 3 is not a power of 2. In this case, x^8 − x does not divide x^16 − x, but instead x^8 + x^4 + x^2 + x does. This modulus polynomial contains 4 terms while all of the other modulus polynomials contain only 2 terms. Observe that in the modulus polynomial tree above, the modulus polynomials on the power-of-2 levels contain fewer terms than the modulus polynomials on level 3 (x^8 + x^4 + x^2 + x and x^8 + x^4 + x^2 + x + 1). In general, the more terms that are contained in a modulus polynomial for multipoint evaluation, the more expensive the reduction step will be at that level. So the power-of-2 levels are less expensive than all of
the other levels. Instead of splitting a problem into 2 equal-sized subproblems, the new algorithm subdivides a problem of size n into √n subproblems of size √n. So an FFT of size 16 is subdivided into 4 subproblems of size 4 and each of these subproblems is subdivided into 2 subproblems of size 2. The details of the reduction step of the new additive FFT algorithm are beyond the scope of this book, but the important thing to understand is that all of the modulus polynomials involved in the algorithm are of the form x^d − x + C where C is some constant. We skip over all of the levels that contain a greater number of terms in the modulus polynomials and achieve a lower operation count for the algorithm. Further details about how this process works can be found in the author's doctoral dissertation.
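The divisibility facts used above can be checked directly with polynomial arithmetic over GF(2). The sketch below represents a polynomial as an integer bitmask (bit i standing for x^i); over GF(2), subtraction is the same as addition, so x^16 − x is written x^16 + x.

```python
# Polynomials over GF(2) as int bitmasks: bit i <-> x^i.

def gf2_polymod(a, b):
    """Remainder of a divided by b over GF(2)."""
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)
    return a

x16_x = (1 << 16) | 2                      # x^16 + x
for d in (2, 4):                           # d = 2^(2^j): x^d + x divides x^16 + x
    assert gf2_polymod(x16_x, (1 << d) | 2) == 0
assert gf2_polymod(x16_x, (1 << 8) | 2) != 0        # x^8 + x does not divide it
s3 = (1 << 8) | (1 << 4) | (1 << 2) | 2             # x^8 + x^4 + x^2 + x
assert gf2_polymod(x16_x, s3) == 0                  # ...but the 4-term s3 does
```

Indeed, (x^8 + x^4 + x^2 + x)(x^8 + x^4 + x^2 + x + 1) = x^16 + x over GF(2), which is why level 3 of the tree needs the 4-term modulus polynomials.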
One of the most important applications of finite fields is in the construction of Reed-Solomon codes. These codes are important because they allow a compact disc to work even when it is scratched or becomes dirty. Most of the information
on a CD is stored as finite field elements which can be easily converted to digital
form and used to produce music or information needed by a computer. However,
extra finite field elements are added to sections of data called a block. The extra
elements can be used to detect if there is a mistake somewhere in a block. If a
mistake is detected, then a process can be followed with the given information to
often correct the mistake and allow the CD to work properly anyway. This is called
forward error correction. A new algorithm for decoding Reed-Solomon codes is
included in the doctoral dissertation and is believed to be simpler to understand
than commonly used Reed-Solomon decoding routines. Because the block size on a
compact disc is typically small, the new algorithm is not currently of much practical
value. However, if data is stored differently on compact discs in the future that
uses different Reed-Solomon codes, the new algorithm may one day become more
advantageous.
g_{i+1} = g_i − φ(g_i)/φ′(g_i)
        = g_i − (1/g_i − f) / (−1/g_i^2)
        = 2 g_i − f g_i^2

where φ(g) = 1/g − f, so that the iteration converges to g = 1/f.
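The displayed iteration appears to be Newton's method applied to φ(g) = 1/g − f, which computes a reciprocal without any division; the numeric sketch below (with hypothetical sample values) shows the characteristic quadratic convergence, where the number of correct digits roughly doubles at each step.

```python
# Newton's iteration for the reciprocal: g_{i+1} = 2*g_i - f*g_i**2.
# No division appears in the update; g converges to 1/f when g0 is close enough
# (it suffices that |1 - f*g0| < 1).

def reciprocal(f, g0, steps=6):
    g = g0
    for _ in range(steps):
        g = 2 * g - f * g * g
    return g

assert abs(reciprocal(3.0, 0.3) - 1 / 3) < 1e-12
assert abs(reciprocal(10.0, 0.05) - 0.1) < 1e-12
```

The same update, applied coefficient-wise to truncated power series, is the standard tool for fast polynomial division.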
However, there is another method that can be used that does not require one to
factor any integers. This method, called the Euclidean algorithm, was invented
around 400 B.C. and is one of the oldest techniques in mathematics that is still
used today. The Euclidean algorithm is based on the fact that if a > b, then the
GCD of a and b is the same as the GCD of b and r where r is the remainder
resulting when a is divided by b. This fact is reused as many times as it takes to
compute the desired GCD.
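The repeated replacement of (a, b) by (b, r) described above can be sketched in a few lines:

```python
# The Euclidean algorithm: GCD(a, b) = GCD(b, a mod b), repeated until b = 0.

def gcd(a, b):
    while b != 0:
        a, b = b, a % b
    return a

assert gcd(1071, 462) == 21     # 1071 -> 462 -> 147 -> 21 -> 0
assert gcd(270, 192) == 6
```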
EXAMPLE
A variant of the Euclidean Algorithm called the Extended Euclidean Algorithm also finds integers u and v such that u·a + v·b = GCD(a, b). The algorithm also determines values of u_i and v_i such that u_i·a + v_i·b = r_i at each step of the algorithm, where r_i is one of the intermediate results of the Euclidean Algorithm.
EXAMPLE
The Euclidean Algorithm and Extended Euclidean Algorithm can also be applied to finite fields as well as polynomials with complex or finite field coefficients.
In fact, mathematicians call any algebraic structure where the Euclidean algorithm
can be applied a Euclidean Domain. The Extended Euclidean Algorithm is useful
for finding inverses of finite field elements.
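A minimal sketch of the Extended Euclidean Algorithm, together with the inverse computation it enables in a prime field GF(p) (the integer case is shown; the polynomial case is analogous):

```python
# Extended Euclidean algorithm: returns (g, u, v) with u*a + v*b = g = GCD(a, b).

def ext_gcd(a, b):
    u0, v0, u1, v1 = 1, 0, 0, 1
    while b != 0:
        q, r = divmod(a, b)
        a, b = b, r
        u0, u1 = u1, u0 - q * u1
        v0, v1 = v1, v0 - q * v1
    return a, u0, v0

def inverse_mod(a, p):
    """Inverse of a in GF(p): u*a + v*p = 1 implies u*a = 1 (mod p)."""
    g, u, _ = ext_gcd(a, p)
    assert g == 1
    return u % p

g, u, v = ext_gcd(240, 46)
assert g == 2 and u * 240 + v * 46 == 2
assert (inverse_mod(7, 31) * 7) % 31 == 1
```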
EXAMPLE
Let us more carefully examine the computations of the Extended Euclidean
Algorithm in the above example. Observe that in the first step, XXXXXXXXX.
Similarly, in the second step, XXXXXXXXX. We really don't need the lower coefficients of the input polynomials to compute these results. For example, the first result can also be computed as follows
EXAMPLE
Note that these input polynomials are formed from the upper coefficients of the input polynomials of the above example. By carefully figuring out which coefficients of the input polynomials can be discarded at each step of the Extended Euclidean Algorithm, it is possible to reduce the amount of effort needed to compute the GCD for large-sized problems. The resulting algorithm is called the Fast Euclidean Algorithm and is discussed in further detail in the Modern Computer Algebra book. Most practical-sized integer GCD problems of the late 20th century were not large enough to fully take advantage of the power of the Fast Euclidean Algorithm, and an algorithm by Kenneth Weber [REF] is more advantageous instead. However, as problem sizes in the 21st century grow, the Fast Euclidean Algorithm is expected to overtake Weber's Accelerated Euclidean Algorithm in many cases.
It is possible to integrate FFT multiplication using the truncated Fast Fourier Transform into the Fast Euclidean Algorithm, as discussed in the present author's doctoral dissertation. This chapter of the dissertation was motivated by a number of homework problems and research suggestions given in the Modern Computer Algebra book. More investigation is needed in this area, but most GCD problem
sizes are not yet large enough for such an investigation to likely yield any practical
improvement over techniques currently in place.
One important application of these fast GCD algorithms is the problem of
factorization of polynomials with finite field coefficients. This topic is the subject
of one of the chapters of the Modern Computer Algebra book. The case where the finite field has 2^k elements is treated in a separate paper [18] written by the same
authors. It should also be pointed out that the problems of multipoint evaluation
and interpolation are also considered in another chapter of the Modern Computer
Algebra book.
Because computer algebra is such a new and active area of research, no book on
the topic can cover all of the advances in this branch of mathematics. The present
author has expanded on some of this material and included some of the recent
results of others in his doctoral dissertation. Hopefully, some of this material will
be included in future editions of Modern Computer Algebra and will be improved
upon by others in the future.
z^N − 1 = (z − 1) · (z + 1) · (z^2 + 1) · (z^4 + 1) · · · (z^(2^(K−1)) + 1)
        = (z − 1) · ∏_{d=0}^{K−1} (z^(2^d) + 1)

where N = 2^K.
The truncated FFT evaluates a polynomial at each of the roots of some specially chosen polynomial M(z) of degree n constructed from the factors of z^N − 1 where n ≤ N. To determine M(z), express n in binary form and include a factor of z^(2^d) + 1 whenever a one appears in the 2^d place value of the binary expansion. For example, let n = 21 so that N = 32. The binary form of n is 21 = (10101)_2, so there are ones in the 2^4 = 16, 2^2 = 4, and 2^0 = 1 place values. In this case, M(z) = (z^16 + 1) · (z^4 + 1) · (z + 1).
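The factor-selection rule is just the binary expansion of n; a small sketch:

```python
# Choose the factors of M(z) from the binary expansion of n:
# include z^(2^d) + 1 whenever bit d of n is set.  Since each factor contributes
# degree 2^d, the degree of M(z) is exactly n.

def truncation_factors(n):
    return [d for d in range(n.bit_length()) if (n >> d) & 1]

# n = 21 = (10101)_2  ->  M(z) = (z^16 + 1)(z^4 + 1)(z + 1)
assert truncation_factors(21) == [0, 2, 4]
assert sum(2 ** d for d in truncation_factors(21)) == 21
```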
Recall that the reduction step of the radix-2 FFT algorithm receives as input f(z) mod (z^(2m) − b^2) and produces as output fY = f(z) mod (z^m − b) and fZ = f(z) mod (z^m + b) where m is a power of 2. The truncated FFT algorithm focuses on the cases where b = 1. The reduction step of the truncated FFT algorithm receives as input f(z) mod (z^(2m) − 1). If z^m + 1 is included in M(z), then the reduction step produces f(z) mod (z^m + 1) and then calls any of the 2-adic FFT algorithms to evaluate f(z) at each of the roots of z^m + 1. Regardless of whether or not z^m + 1 is included in M(z), the reduction step produces f(z) mod (z^m − 1), which is used in the next reduction step of the truncated FFT algorithm.
The following butterfly diagram shows how the algorithm would proceed in the case where n = 21 so that M(z) = (z^16 + 1) · (z^4 + 1) · (z + 1). The top row of the figure represents the input to the truncated FFT, where squares indicate the components of the input and the circles indicate coefficients of degree greater than the input size which are known to be zero in advance. The bottom row of the figure represents the 21 evaluations of the input polynomial. It can be shown that each of these locations corresponds to a root of M(z) = (z^16 + 1) · (z^4 + 1) · (z + 1).
fA = (1/2) (fY − fZ)
fB = (1/2) (fY + fZ)
At some of the places where the inverse truncated FFT algorithm seems to get stuck, we know fZ and some of the coefficients of fA (these are the known coefficients in the output polynomial) and we would like to recover some of the coefficients of fY. The first equation above can be solved for fY to obtain

fY = 2 fA + fZ

We will call this a mixed butterfly operation because it is a mixture of the reduction step of the FFT algorithm and the interpolation step of the inverse FFT algorithm. After applying the mixed butterfly operation, we have some of the coefficients of f(z) mod (z^m − 1). We can either apply the mixed butterfly operation again with the coefficients of f(z) mod (z^m + 1) or use the reduction step of the FFT algorithm to recover some of these coefficients. At some point, we will have all of the information that we need to recover the desired output polynomial using the inverse FFT interpolation step and the two additional mixed butterfly operations

fB = fA + fZ    and    fB = fY − fA
Again, the inverse truncated FFT is somewhat more complicated than the companion FFT algorithm. The following butterfly diagram is provided to illustrate
how the process works for the case where n = 21. Basically the algorithm proceeds
from the right of the figure to the left recovering values from the top of the figure
down to the bottom. Once enough coefficients have been recovered to undo the
recursion, the algorithm proceeds from the left of the figure to the right, recovering
coefficients from the bottom of the figure to the top.
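The butterfly relations of this section can be checked numerically. The sketch below assumes the convention (a reconstruction, since the surrounding figures did not survive) that fB holds one half of the coefficients and fA the other, with fY = fB + fA and fZ = fB − fA; all of the inverse and mixed butterfly identities then follow algebraically.

```python
import numpy as np

# Assumed convention: fY = fB + fA (reduction mod z^m - 1) and
# fZ = fB - fA (reduction mod z^m + 1), with b = 1.

rng = np.random.default_rng(0)
fA = rng.integers(-5, 5, 4).astype(float)
fB = rng.integers(-5, 5, 4).astype(float)
fY, fZ = fB + fA, fB - fA

assert np.allclose(0.5 * (fY - fZ), fA)   # inverse butterfly
assert np.allclose(0.5 * (fY + fZ), fB)
assert np.allclose(2 * fA + fZ, fY)       # mixed butterfly: fY from fA and fZ
assert np.allclose(fA + fZ, fB)           # the two additional mixed butterflies
assert np.allclose(fY - fA, fB)
```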
The following legend is also provided to indicate how each result of the above butterfly diagram is obtained. The truncated FFT algorithms are discussed in much more detail in the doctoral dissertation, where the algorithm is presented in a generalized form. The algorithm in the dissertation can also work with the additive FFT algorithms introduced earlier in this chapter, but is presented more abstractly and is thus somewhat more difficult to read than the description given in this section might suggest.
4. Implementation of FFT Algorithms
This textbook has used the model of computing the cost of an algorithm in
terms of the number of multiplications and the number of additions required to
implement the algorithm. While this is a good model when an algorithm has very
large input sizes, it is not a good model for practical-sized problems of the early
21st century. Computers of this time period have many advanced features that a
computer programmer needs to take advantage of in order to produce an algorithm
that runs the fastest. These features are often difficult to model mathematically,
so mathematicians continue to use operation counts as a measure of comparing
two algorithms. But most people are just interested in the FFT routine that runs
the fastest regardless of what the mathematicians think. The FFTW (Fastest
Fourier Transform in the West) [16] is widely regarded as one of the best software
packages available for computing the FFT. One can read the papers written by the
inventors of this package to better understand the computer science issues involved
with implementing the FFT. Although the current version of FFTW does not use the most efficient FFT algorithms from the mathematicians' viewpoint, one must periodically revisit the question of the most efficient FFT technique as input sizes grow in the coming years and new computer architectures are invented which may favor different methods of computing the FFT. Just by reading the papers of Frigo and Johnson, one can see several changes in their approach to computing the FFT over the lifetime of their product.
Another related topic for implementing FFT algorithms is parallel computing. Here, one gets two or more computers and distributes the work needed to compute the FFT among them. Special algorithms are needed to communicate the intermediate answers of the FFT to the other computers as efficiently as possible. The book [10] gives a number of algorithms for computing the FFT in
parallel. It is also possible to use these algorithms for multiplying two polynomials
or performing convolution. However, as of this writing, no one seems to be able to
distribute the work of polynomial division, the greatest common divisor, or polynomial factorization in the same way. These are important research problems that
would have great benefits if someone could discover these parallel algorithms.
5. Applications of the FFT
Earlier in this book, it was suggested that the main application of the FFT is
for the efficient computation of the operation of convolution. For many people, this
probably does not sound very impressive because they are not aware of the many
places where convolution is an important component of many practical and useful
techniques in science and engineering.
The book by Brigham [7] contains a large listing of applications of the FFT
and references where the interested reader can explore some of these applications.
Additionally, the extensive bibliography by NAME contains many articles which
discuss practical applications of the FFT.
The present author has recently become interested in the role that some of
the concepts discussed in this book play in the theory of music. Recall that the
original application of the Discrete Fourier Transform was to express a given signal
input as the sum of harmonically related sinusoids. The sinusoids harmonically related to one with a frequency of 440 Hz (also known as the note A) are the basis of much of traditional music theory. While the FFT does not play a prominent role in the composition of musical arrangements, the foundation upon which the FFT is built is the basis for what many would consider to be beautiful music, where "beautiful" is determined by the ear's natural pleasant response to encountering harmonically related sinusoids. The reference [30] is recommended for exploring this interesting
topic.
6. Concluding Remarks
During his undergraduate education in mathematics and engineering, the present
author had a somewhat difficult time learning the FFT because of the different
perspectives used by the two communities in their publications on the topic. As
a review, the engineers view the FFT as a technique used to efficiently compute
the coefficients of a discrete Fourier series associated with a particular sequence
of signal samples while mathematicians view the FFT as an efficient technique for
solving the multipoint evaluation problem.
Only time will tell whether the algebraic perspective of the FFT becomes more
accepted in the mathematics and engineering communities.
The impact of this book also remains to be seen. It is hoped that other mathematicians and engineers will continue to explore the topics of this book, particularly
those given in this final chapter. Those that make further contributions to the body
of knowledge represented in this book or know of contributions of others that should
be included in the book are encouraged to contact the author (currently at e-mail
APPENDIX A
F(s) = ∫_{−∞}^{∞} f(t) e^(−2π s t I) dt

Since this is an improper integral, let us choose some finite interval of size T0 over which to evaluate the integrand. This interval will start at a = −T0/2 and end at b = T0/2. So we will now consider the proper integral

∫_{a}^{b} f(t) e^(−2π s t I) dt

which can be evaluated as a Riemann sum. The Riemann sum is usually defined by dividing the interval of integration into N equal-sized parts and evaluating the function of interest at the left endpoint of each subinterval. Usually, these points are denoted t_0, t_1, t_2, ..., t_{N−1} with corresponding function values f(t_0), f(t_1), ..., f(t_{N−1}).
To illustrate this process, consider the function

f(t) = { t + 2   if −4 ≤ t ≤ 4
       { 0       otherwise

for the case where s = 0 so that the complex exponential component of the integral goes away. We will let T0 = 2 so that the region of integration is from t = −1 to t = 1. Letting N = 8, the shaded region in the figure
[Figure: eight shaded rectangles of width 1/4 over the sample points t_0, t_1, ..., t_7 on the interval from −1 to 1.]
is an approximation of the desired integral.
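The left-endpoint Riemann sum just described can be computed directly; the sketch below uses the example function as reconstructed above (f(t) = t + 2 on the sampled interval, an assumption where the original figure is garbled), whose exact integral over [−1, 1] is 4.

```python
# Left-endpoint Riemann sum for the s = 0 case: the sum reduces to an
# approximation of the ordinary integral of f over [a, b].

def riemann_sum(f, a, b, n):
    dt = (b - a) / n
    return sum(f(a + i * dt) for i in range(n)) * dt

f = lambda t: t + 2
assert riemann_sum(f, -1.0, 1.0, 8) == 3.75             # the N = 8 figure
assert abs(riemann_sum(f, -1.0, 1.0, 4096) - 4.0) < 1e-2  # converges to 4
```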
In general, since

t_i = a + ((b − a)/N) · i = −T0/2 + (T0/N) · i

for all integer values of i in the range 0 ≤ i < N and Δt = T0/N, the Riemann sum can be computed with the formula

Σ_{i=0}^{N−1} f(t_i) e^(−2π s t_i I) Δt
so that

∫_{−T0/2}^{T0/2} f(t) e^(−2π s t I) dt = lim_{N→∞} (T0/N) Σ_{i=0}^{N−1} f(t_i) e^(−2π s t_i I)
To evaluate the indefinite integral used to define the Fourier transform, we increase T0 without bound. Since a = −T0/2 and b = T0/2, then

F(s) = ∫_{−∞}^{∞} f(t) e^(−2π t s I) dt
     = lim_{T0→∞} ∫_{−T0/2}^{T0/2} f(t) e^(−2π s t I) dt
     = lim_{T0→∞} lim_{N→∞} (T0/N) Σ_{i=0}^{N−1} f(−T0/2 + (T0/N) i) · e^(−2π s (−T0/2 + (T0/N) i) I)
The above formula can be used to compute the Fourier Transform at any input s in the range −∞ < s < ∞. One problem with this approach is that a computer can only store the Fourier Transform at a finite number of values for s. So what we are going to do now is to define a new function F_{T0,N}(k) which approximates F(s) for only N of the possible values for s. Here, we will restrict k to be the N integers in the interval −N/2 ≤ k < N/2 and for an input of k, the function F_{T0,N}(k) will approximate F(f0·k) where f0 = 1/T0 and T0 is again the size of the region over which we will sample the function f(t). So one formula that can be used to compute F_{T0,N}(k) is given by
F_{T0,N}(k) = (T0/N) Σ_{i=0}^{N−1} f(−T0/2 + (T0/N) i) · e^(−2π (f0 k) (−T0/2 + (T0/N) i) I)
            = (T0/N) Σ_{i=0}^{N−1} f(−T0/2 + (T0/N) i) · e^(−2π k (−1/2 + i/N) I)
Observe that by restricting the values for which we will compute the Fourier Transform to multiples of f0, we have eliminated T0 from part of the formula. We are now going to introduce a new function that will simplify this formula even more. Let fe(τ) be a function defined as follows:

fe(τ) = { f((T0/N) · τ)        if τ < N/2
        { f((T0/N) · τ − T0)   if τ ≥ N/2

where the inputs are restricted to integers in the interval 0 ≤ τ < N. In terms of the example function for f(t) and the case where N = 8, fe(τ) is given by
[Figure: the eight samples fe(0), fe(1), ..., fe(7) of the example function, with the nonnegative-time samples appearing first and the negative-time samples second.]
We can now define the function F_{T0,N}(k) in terms of fe(τ) as follows:

F_{T0,N}(k) = (T0/N) Σ_{τ=0}^{N/2−1} fe(τ) e^(−2π k (τ/N) I) + (T0/N) Σ_{τ=N/2}^{N−1} fe(τ) e^(−2π k (τ/N − 1) I)
The reason why the multiplicative factor T0/N is included in this formula can be seen by observing the following figure

[Figure: the eight unit-width rectangles formed from the samples fe(0), ..., fe(7).]

which uses fe(τ) to compute F_{T0,N}(k) in the case of the example used throughout this section. Here, k = 0, which corresponds to the case where s = 0 earlier.
At first glance, this figure looks somewhat like the earlier one which uses eight rectangles to approximate the Fourier Transform of f(t) at s = 0, except that the rectangles are presented in a different order. However, there is another difference between the two figures. In the earlier case, the width of each rectangle was 1/4, or T0/N where T0 = 2 and N = 8. In the second figure, the width of each rectangle is 1. In order for the summation which uses fe(τ) to actually compute F_{T0,N}(k), we must multiply the resulting area by T0/N.
The second term in the above expression can be simplified to

(T0/N) Σ_{τ=N/2}^{N−1} fe(τ) e^(−2π k (τ/N) I) · e^(2π k I)
By Euler's Formula,

e^(2π k I) = cos(2πk) + I sin(2πk)
           = 1 + I·0
           = 1
and so F_{T0,N}(k) can now be computed using the simplified formula

F_{T0,N}(k) = (T0/N) Σ_{τ=0}^{N−1} fe(τ) e^(−2π k τ/N I)

which almost matches the Discrete Fourier Transform formulas introduced in Chapter 1. The only difference between these formulas is the multiplicative constant that appears in front of the summation.
Let us now define a new function Fe_{T0,N}(k) as follows:

Fe_{T0,N}(k) = Σ_{τ=0}^{N−1} fe(τ) e^(−2π k τ/N I)
where k is allowed to be any integer. Now select any integer c and observe that

Fe_{T0,N}(k + c·N) = Σ_{τ=0}^{N−1} fe(τ) e^(−2π (k + cN) τ/N I)
                   = Σ_{τ=0}^{N−1} fe(τ) e^(−2π k τ/N I) · e^(−2π c τ I)
                   = Σ_{τ=0}^{N−1} fe(τ) e^(−2π k τ/N I)
                   = Fe_{T0,N}(k)
where we have again used Euler's Formula to claim that e^(−2π c τ I) = 1. This result tells us that we can select any integer k1 and Fe_{T0,N}(k1) will be equal to (N/T0) · F_{T0,N}(k) for one of the k's in the interval −N/2 ≤ k < N/2. This means that we can choose any N consecutive integers, evaluate Fe_{T0,N}(k) at each of these integers, and we will have all N values of F_{T0,N}(k), scaled by a multiplicative constant. In practice, engineers typically use Fe_{T0,N}(k) to define the Discrete Fourier Transform with the range of integers 0 ≤ k < N and do not multiply the final results by T0/N. This matches one of the Discrete Fourier Transform formulas introduced in Chapter 1.
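The engineers' definition Fe(k) = Σ_τ fe(τ) e^(−2π k τ/N I) is exactly the convention used by numpy's FFT routines, and its N-periodicity can be checked numerically (the sample vector below is arbitrary, chosen only for illustration):

```python
import numpy as np

# Direct evaluation of the DFT sum and a check of F~(k + cN) = F~(k).

N = 8
f_tilde = np.arange(N, dtype=float)        # any sample vector

def dft(k):
    tau = np.arange(N)
    return np.sum(f_tilde * np.exp(-2j * np.pi * k * tau / N))

for k in range(-N // 2, N // 2):
    assert np.isclose(dft(k + N), dft(k))  # periodicity in k

# The same convention as np.fft.fft (no 1/N factor in the forward direction):
assert np.allclose([dft(k) for k in range(N)], np.fft.fft(f_tilde))
```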
Given a fixed value for T0, we now wish to show that F_{T0,N}(k) approximates F(k·f0) = F(k/T0) at the integer values for k in the interval −N/2 ≤ k < N/2. Let F_{T0}(k) be defined as the function that results when N is increased without bound in F_{T0,N}(k). Then
F_{T0}(k) = lim_{N→∞} (T0/N) Σ_{τ=0}^{N−1} fe(τ) e^(−2π k τ/N I)
          = lim_{N→∞} (T0/N) Σ_{i=0}^{N−1} f(−T0/2 + (T0/N) i) · e^(−2π (f0 k) (−T0/2 + (T0/N) i) I)
          = lim_{N→∞} Σ_{i=0}^{N−1} f(t_i) e^(−2π (f0 k) t_i I) Δt
          = ∫_{−T0/2}^{T0/2} f(t) e^(−2π (f0 k) t I) dt
If f(t) is time-limited (which means that there exists some M such that f(t) = 0 for all values of t that are greater than M and all values of t that are less than −M) and T0 is large enough to contain this interval, then F_{T0}(k) produces F(f0·k) exactly. Otherwise, as T0 increases, F_{T0}(k) becomes a better and better approximation to F(f0·k).
In the Discrete Fourier Transform F_{T0,N}(k), there are two parameters that we can control to specify how good of an approximation we can make to the Fourier Transform F(s). The parameter N controls the number of values of s for which the function F_{T0,N}(k) will be defined. The parameter T0 controls f0 = 1/T0, which determines how close the valid inputs to F_{T0,N}(k) will be to one another. Then the function F_{T0,N}(k) approximates F(s) for all values of s of the form k·f0 for all integer values of k in the interval −N/2 ≤ k < N/2. So T0 controls the size of the interval of values for which F_{T0,N}(k) is defined and N controls how many of the values inside the interval the function is defined for. By increasing both of the parameters without bound, F_{T0,N}(k) will be a better and better approximation to F(s).
We now want to show how the inverse Fourier Transform relates to the inverse Discrete Fourier Transform. The inverse Fourier Transform is defined with the formula

f(t) = ∫_{−∞}^{∞} F(s) e^(2π s t I) ds
We can evaluate this integral by letting the parameter fm increase without bound in the proper integral

∫_{a}^{b} F(s) e^(2π s t I) ds

where a = −fm/2 and b = fm/2. This integral can also be evaluated by using the Riemann sum. We will again divide the interval of integration into N equal-sized parts and evaluate the function of interest at the left endpoint of each subinterval. These points can be denoted by s_0, s_1, s_2, ..., s_{N−1} with corresponding function values F(s_0), F(s_1), ..., F(s_{N−1}). For example,1 if

F(s) = { 1/((s − 1)^2 + 1)   if −5 ≤ s ≤ 5
       { 0                   otherwise
[Figure: eight rectangles over the sample points s_0, s_1, ..., s_7 approximating the integral of F(s).]
Since

s_j = a + ((b − a)/N) · j = −fm/2 + (fm/N) · j

for all integer values of j in the range 0 ≤ j < N and Δs = fm/N, the Riemann sum can be computed with the formula

Σ_{j=0}^{N−1} F(s_j) e^(2π t s_j I) Δs
and
1This example is not intended to be the inverse of the earlier example. It was just selected to
produce the graphs that illustrate the integration process for the inverse discrete Fourier transform.
∫_{−fm/2}^{fm/2} F(s) e^(2π t s I) ds = lim_{N→∞} (fm/N) Σ_{j=0}^{N−1} F(s_j) e^(2π t s_j I)
If N = 32 in the above example, the following figure shows a much better approximation for the integral.

[Figure: thirty-two narrower rectangles approximating the same integral.]
By letting fm increase without bound, the inverse Fourier Transform formula is obtained.

In theory, the above formula can be used to evaluate f(t) at whatever real value of t that we want. In practice, however, we usually only know approximations to F(s) at a finite number of multiples of some predetermined fundamental frequency, which is the f0 from the forward Fourier Transform analysis. If N function values of these approximations F_{∞,N}(k) are known for all values of s = k·f0 where −N/2 ≤ k < N/2, then an approximation to this Riemann sum is given by

f(t) = f0 Σ_{k=−N/2}^{N/2−1} F_{∞,N}(k) e^(2π t (f0 k) I)
We are now going to restrict ourselves to inputs of f(t) which are multiples of T0/N. Let us use the notation f*(τ) to refer to the function f(t) with inputs restricted to values of the form t = (T0/N)·τ. In this case, f*(τ) simplifies to

f*(τ) = f0 Σ_{k=−N/2}^{N/2−1} F_{∞,N}(k) e^(2π ((T0/N) τ) (f0 k) I)
      = f0 Σ_{k=−N/2}^{N/2−1} F_{∞,N}(k) e^(2π τ k/N I)

Since the inputs to f*(τ) are restricted to multiples of T0/N, we can use the evaluations of Fe_{∞,N}(k) at all values of k in the interval 0 ≤ k < N in place of the evaluations of F_{∞,N}(k) given above. Now, f*(τ) becomes
f*(τ) = f0 Σ_{k=0}^{N/2−1} (T0/N) Fe_{∞,N}(k) e^(2π τ k/N I) + f0 Σ_{k=N/2}^{N−1} (T0/N) Fe_{∞,N}(k) e^(2π τ (k−N)/N I)
[Figure: the samples used to compute f*(τ) when N = 8.]
Observe that the second term in the above formula simplifies to

f0 Σ_{k=N/2}^{N−1} (T0/N) Fe_{∞,N}(k) e^(2π τ k/N I) · e^(−2π τ I)
= f0 (T0/N) Σ_{k=N/2}^{N−1} Fe_{∞,N}(k) e^(2π τ k/N I) · 1

and so

f*(τ) = (1/N) Σ_{k=0}^{N−1} Fe_{∞,N}(k) e^(2π τ k/N I)

where we have again used Euler's Formula to show that e^(−2π τ I) = 1. We have also made use of the fact that f0·T0 = 1.
Assuming that lim_{T0→∞} lim_{N→∞} F_{∞,N}(k) = F(f0·k), fm = f0·N = N/T0, and Δt = T0/N, then
lim_{T0→∞} lim_{N→∞} f*(τ)
= lim_{T0→∞} lim_{N→∞} (1/N) Σ_{k=0}^{N−1} Fe_{∞,N}(k) e^(2π τ k/N I)
= lim_{fm→∞} lim_{N→∞} (fm/N) Σ_{j=0}^{N−1} F(−fm/2 + (fm/N) j) e^(2π (τ/N)(j − N/2) I)
= lim_{fm→∞} lim_{N→∞} (fm/N) Σ_{j=0}^{N−1} F(s_j) e^(2π τ s_j/fm I)
= lim_{fm→∞} lim_{N→∞} (fm/N) Σ_{j=0}^{N−1} F(s_j) e^(2π ((T0/N) τ) s_j I)
= lim_{fm→∞} lim_{N→∞} Σ_{j=0}^{N−1} F(s_j) e^(2π t s_j I) Δs
= lim_{fm→∞} ∫_{−fm/2}^{fm/2} F(s) e^(2π s t I) ds
= ∫_{−∞}^{∞} F(s) e^(2π s t I) ds
= f(t)

where t = (T0/N)·τ = τ/fm.
So

f*(τ) = (1/N) Σ_{k=0}^{N−1} Fe_{∞,N}(k) e^(2π τ k/N I)

which is what the engineers typically use as the inverse Discrete Fourier Transform formula. As the number of frequency samples (N) increases and the fundamental frequency f0 = 1/T0 gets closer and closer to 0, f*(τ) becomes a better and better approximation to f(t) for all values of t that are multiples of T0/N. Since f0 gets closer and closer to 0, this means that f*(τ) becomes a better and better approximation to f(t) for all t.
The above derivation also shows that we can get away with omitting T0/N from the Discrete Fourier Transform formula, provided that the multiplicative factor 1/N is included in the inverse Discrete Fourier Transform formula. This is typically what is done if Fe_{∞,N}(k) is used to define the Discrete Fourier Transform. If

F_{T0,N}(k) = (T0/N) Σ_{τ=0}^{N−1} fe(τ) e^(−2π k τ/N I)

is used for the definition of the Discrete Fourier Transform, then the above analysis shows that the inverse Discrete Fourier Transform is computed using

f*(τ) = f0 Σ_{k=0}^{N−1} F_{T0,N}(k) e^(2π τ k/N I)
If

Fb_{T0,N}(k) = (1/N) Σ_{τ=0}^{N−1} fe(τ) e^(−2π k τ/N I)

is used for the definition of the Discrete Fourier Transform, then the inverse Discrete Fourier Transform can be computed using

f*(τ) = Σ_{k=0}^{N−1} Fb_{T0,N}(k) e^(2π τ k/N I)

To verify these claims, we use the fact that

Σ_{k=0}^{N−1} e^(2π k (τ − n)/N I) = { N   if τ = n
                                    { 0   if τ ≠ n
f*(τ) = (1/N) Σ_{k=0}^{N−1} Fe_{∞,N}(k) e^(2π τ k/N I)
      = (1/N) Σ_{k=0}^{N−1} ( Σ_{n=0}^{N−1} fe(n) e^(−2π k n/N I) ) e^(2π τ k/N I)
      = (1/N) Σ_{n=0}^{N−1} fe(n) Σ_{k=0}^{N−1} e^(2π k (τ − n)/N I)
      = (1/N) (0 + 0 + · · · + 0 + fe(τ)·N + 0 + · · · + 0)
      = fe(τ)
Since f*(τ) = fe(τ), we are now justified in using the phrase inverse Discrete Fourier Transform, as f*(τ) is the inverse of the Discrete Fourier Transform of fe(τ). A similar analysis can yield the same result for F_{∞,N}(k) and Fb_{∞,N}(k). This analysis, which relates each of these Discrete Fourier Transform definitions and their inverses, may also help to illuminate why there is a factor of 1/N somewhere in one of the two definitions.
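The orthogonality identity behind this verification, and the placement of the 1/N factor, can both be checked numerically (the random sample vector is illustrative only):

```python
import numpy as np

# sum_{k=0}^{N-1} exp(2*pi*I*k*(tau - n)/N) equals N when tau == n and 0
# otherwise; this is what collapses the double sum in the inverse DFT.

N = 8
k = np.arange(N)
for tau in range(N):
    for n in range(N):
        s = np.sum(np.exp(2j * np.pi * k * (tau - n) / N))
        assert np.isclose(s, N if tau == n else 0.0)

# numpy follows the engineers' convention: no 1/N in the forward transform,
# 1/N in the inverse, so the round trip returns the original samples.
f_tilde = np.random.default_rng(1).normal(size=N)
assert np.allclose(np.fft.ifft(np.fft.fft(f_tilde)), f_tilde)
```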
Let us now return to the expression

f(t) = (1/N) Σ_{k=0}^{N/2−1} Fe_{∞,N}(k) e^(2π t (f0 k) I) + (1/N) Σ_{k=N/2}^{N−1} Fe_{∞,N}(k) e^(2π t (f0 (k − N)) I)
where t is allowed to be any real number in the interval −∞ < t < ∞. Note that because t is permitted to be a value other than a multiple of T0/N, we cannot collapse the above two summations into a single expression. However, it is shown in Chapter 7 that if it is known that f(t) is a function with no imaginary components, then

f(t) = a0 + Σ_{k=1}^{N/2−1} ( a_k cos(2π f0 k t) + b_k sin(2π f0 k t) ) + a_{N/2} cos(π f0 N t)

where (a_k)^2 + (b_k)^2 = |Fe_{∞,N}(k)/N|^2 and tan(θ) = b_k/a_k, where θ is the argument of the complex value Fe_{∞,N}(k)/N when written in polar form.
This function is called the Discrete Fourier Series. We already know that f(t) = f*(τ) for the N values t = (T0/N)·τ where 0 ≤ τ < N. As N increases without bound, we obtain the Fourier Series

f(t) = a0 + Σ_{k=1}^{∞} ( a_k cos(2π f0 k t) + b_k sin(2π f0 k t) )

where this series equals f(t) for any periodic function f(t). In other words, any periodic function can be represented by an infinite summation of sinusoids where the magnitude of each sinusoid is determined by the Fourier transform. So we can generate better and better approximations to any periodic function f(t) by increasing the value of N in the Discrete Fourier Transform of f(t) and using the Discrete Fourier Series to reconstruct the function.
If T0 is allowed to increase without bound so that f0 gets closer and closer to zero, then we obtain

f(t) = ∫_{−∞}^{∞} F(s) e^(2π s t I) ds

in other words, the inverse Fourier Transform. Let FA(s) and FB(s) be defined as the functions such that (FA(s1))^2 + (FB(s1))^2 = |F(s1)|^2 and tan(θ) = FB(s1)/FA(s1), where θ is the argument of the complex value F(s1) when written in polar form, for any s1. Then the inverse Fourier Transform becomes

f(t) = ∫_{−∞}^{∞} ( FA(s) cos(2π s t) + FB(s) sin(2π s t) ) ds
The importance of the above two formulas is that now nearly any real function f(t)
can be represented as an infinite summation of sinusoids. Instead of the sinusoids
in the summation having frequencies that are multiples of some common value f0,
the summation now ranges over all possible frequency values. The Discrete
Fourier Transform can be used in conjunction with the Discrete Fourier Series to
approximate nearly any function f(t) over some finite interval −T0/2 ≤ t ≤ T0/2.
For the Discrete Fourier Series to be valid over a wider interval, one should increase
T0. For the Discrete Fourier Series to be a better approximation to f(t) within the
interval, one should increase N.
We have been careful to say that the above formulas can be used to approximate
f(t) for nearly all possible functions f(t). Some mathematicians who specialize in
Calculus enjoy constructing bizarre function definitions to be used as counterexamples when one carelessly uses limits to claim a particular result as a variable
is increased without bound. The so-called Dirichlet conditions restrict the set of
all functions to a smaller subset that can be represented as a Fourier Series.
The Dirichlet conditions state that if f(t) has a finite number of finite discontinuities,
f(t) has a finite number of maxima and minima, and f(t) is absolutely integrable,
i.e.

\int_{-\infty}^{\infty} |f(t)| \, dt < \infty

then f(t) can be represented by the inverse Fourier transform. The above conditions
are sufficient, but not necessary. This means that there are likely other functions
that can be represented by the inverse Fourier transform that are not covered by
the above conditions. Often the third condition above is relaxed to those functions
that have finite energy, i.e.

\int_{-\infty}^{\infty} |f(t)|^2 \, dt < \infty
Not all functions that meet the relaxed third condition can be represented by an
inverse Fourier transform, but most can. At this point, we will leave it to the
Calculus experts to further partition all possible functions into those that can be
represented by a discrete Fourier transform and those that cannot. Most practical
cases, however, are finite-energy functions that also satisfy the first two Dirichlet
conditions and can be represented by a discrete Fourier transform.
In summary, the Discrete Fourier Transform is a method for interpolating a set
of N values, obtained by sampling f(t) uniformly over some interval −T0/2 ≤ t ≤ T0/2,
into a Discrete Fourier Series of at most N sinusoids that, when evaluated at
the N sample points, yields the N sample values. As the number of samples
N within the interval increases, the Discrete Fourier Series becomes a better and
better approximation to f(t) within that interval. If f(t) has period T0, then the
resulting Fourier Series represents f(t) exactly. As T0 increases without bound,
then the Discrete Fourier Series approaches the inverse Fourier Transform, which can
represent nearly any function f(t) as an infinite summation of sinusoids. The inverse
Discrete Fourier Transform is used to efficiently evaluate the Discrete Fourier Series
at the N points used to generate this series. The Discrete Fourier Transform is an
approximation for the Fourier Transform, the inverse Discrete Fourier Transform
is an approximation for the inverse Fourier Transform, and the inverse Discrete
Fourier Transform is the inverse of the Discrete Fourier Transform. There are
several valid definitions of the Discrete Fourier Transform, each of which differs
from the others only by a multiplicative constant.
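The claim that the inverse DFT undoes the DFT is easy to check numerically. The sketch below uses NumPy's FFT, which implements one of the valid DFT definitions mentioned above (the unscaled one, with the 1/N factor placed in the inverse).

```python
import numpy as np

# The inverse DFT recovers the original N samples from the N transform
# values, up to floating-point rounding.
x = np.array([1.0, -2.0, 0.5, 3.0, 0.0, 1.5, -1.0, 2.0])
X = np.fft.fft(x)           # Discrete Fourier Transform of the samples
x_back = np.fft.ifft(X)     # inverse DFT (includes the 1/N scaling)
print(np.allclose(x_back.real, x))   # True
```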
In the main part of the book, we will use the simpler and more conventional
notation f(τ) to represent f_e(τ) and F(k) to represent F_{e,N}(k).
APPENDIX B
Residue Rings
1. Background
In an abstract algebra course, a quotient ring is typically defined as R/A where
R is some ring and A is an ideal of R. This partitions R into a number of subsets
which are the elements of R/A. A standard result of an abstract algebra course is
to prove that these subsets form a ring using the operations
(a + A) + (b + A) = (a + b) + A
(a + A) · (b + A) = (a · b) + A
When computing with a quotient ring, it is convenient to perform the arithmetic
on a representative element chosen from each of these subsets. This leads many students to
assume that the representatives are really the elements of the quotient ring, when
in reality the elements are actually subsets of elements in some other ring which
are combined in a way isomorphic to the representative elements. This method
of instruction is traditional, but typically very confusing for the beginning algebra
student and for engineers who need to learn finite fields for coding theory.
Throughout this manuscript, we have claimed that the collection of residue
polynomials for a fixed modulus polynomial is equivalent to the algebraic structure
of a quotient ring in the case of univariate polynomials. In this section of the appendix, we will provide a proof of this claim. First, we will show that the collection
of residue polynomials is indeed a ring and second we will show that this set of
elements is isomorphic to a similarly defined quotient ring.
2. Definitions
There is some variation among abstract algebra textbooks about definitions
involving basic abstract algebra concepts. Here, we have adopted the definitions
given in [?] (except for Euclidean Domain with unique remainders and residue rings
which are introduced in this discussion).
Integral domain.
A commutative ring with identity 1 ≠ 0 is called an integral
domain if it has no zero divisors.
Euclidean domain.
An integral domain R is said to be a Euclidean Domain if
there is a norm N on R such that for any two elements a and b of R
with b ≠ 0, there exist elements q and r in R with

(2) a = q · b + r

where r = 0 or N(r) < N(b).
Residues.
Let D be a Euclidean domain with unique remainders and let
m be some element of D. Let D\m denote the set of all possible
remainders r in (2) when b = m. We will call this collection of elements the
set of residues. This is also the set of elements a of D such that
N(a) < N(m), with any additional restrictions placed on the remainders so that D has unique remainders.

Given elements f and m of D, a Euclidean domain with unique
remainders, the function f rem m is defined as the element r of D
obtained from (2).

Note: Since D has unique remainders, then r is unique and so the function f rem
m is well-defined.
Let us define addition and multiplication as follows over D\m, given elements
a and b of D\m:

(3) a ⊕ b = (a + b) rem m

(4) a ⊗ b = (a · b) rem m

Here, ⊕ and ⊗ are the operations defined over D\m, while + and · are operations defined over D. Since + and · are assumed to be well-defined over D and the
rem operation produces a unique result for any element of D, then ⊕ and ⊗ are
also well-defined.
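For a concrete instance of (3) and (4), take D to be the integers with the nonnegative-remainder convention, so that rem is Python's % operator. The sketch below (the function names are mine) also illustrates why well-definedness matters: replacing an operand by anything in its residue class does not change the result.

```python
def o_plus(a, b, m):
    """a (+) b over D\\m: (a + b) rem m, with D = Z and nonnegative remainders."""
    return (a + b) % m

def o_times(a, b, m):
    """a (x) b over D\\m: (a * b) rem m."""
    return (a * b) % m

# Adding a multiple of m to an operand does not change the result.
m = 7
print(o_plus(5, 6, m), o_plus(5 + 3 * m, 6, m))    # 4 4
print(o_times(5, 6, m), o_times(5, 6 + 2 * m, m))  # 2 2
```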
3. The set of residues is a ring
Lemma 1. Let f be an element of D and let g be an element of D such that
f = g + k · m for some k ∈ D. Then f rem m = g rem m.

Proof:
Let rf be the result of computing f rem m. Thus, f = qf · m + rf where qf
and rf are uniquely determined and rf satisfies the conditions to be a remainder
in D. Let rg be the result of computing g rem m. Thus, g = qg · m + rg. Since
f = g + k · m for some k ∈ D, then g + k · m = qf · m + rf, and subtracting k · m
from both sides of this equation yields g = qf · m − k · m + rf = (qf − k) · m + rf.
Since rf satisfies the conditions to be a remainder in D and g rem m is unique,
then rf = rg and f rem m = g rem m.
Using this lemma, we will now show that the set of elements D\m with addition
operation defined by and multiplication operation defined by is a commutative
ring with identity. Some of the parts of this proof are left as exercises.
One of the key steps is the distributive law

(b ⊕ c) ⊗ a = (b ⊗ a) ⊕ (c ⊗ a)

which follows from Lemma 1 and the distributive law in D: (b ⊕ c) ⊗ a and
(b + c) · a differ by a multiple of m, so both have the same remainder, namely
(b ⊗ a) ⊕ (c ⊗ a).
Since the set of elements D\m with ⊕ and ⊗ satisfies all of the defining properties of a
commutative ring with identity, we will now refer to this set of elements as a
residue ring.
4. D\m is isomorphic to the quotient ring

Let (m) denote the ideal of D generated by m, and define the function
φ : D\m → D/(m) by φ(a) = a + (m) for each a ∈ D\m. We will show that φ is an
isomorphism. First, φ is 1-1:
Let φ(a) = φ(b) for two elements a and b of D\m. Then φ(a) and φ(b) are
elements of D/(m) and can be expressed as a + (m) and b + (m), respectively. Now
(a + (m)) − (a + (m)) = (a − a) + (m) = z + (m), where z is the zero element of
D\m. Since b + (m) = a + (m), then (b + (m)) − (a + (m)) = z + (m) as well.
Since (b + (m)) − (a + (m)) = (b − a) + (m), then (b − a) + (m) = z + (m).
Let δ = (b − a) rem m, an element of D\m. Since δ and b − a differ by a multiple
of m, then δ + (m) = (b − a) + (m) = z + (m). Since δ + (m) = z + (m),
there must exist an element C ∈ D such that δ = z + C · m = C · m. Assume that
δ ≠ z. By the definition of a Euclidean Domain, then N(δ) = N(C · m) ≥ N(m). But
this violates the definition of D\m, which requires N(δ) < N(m). So δ = z and
therefore a = b. Thus, φ is a 1-1 function.
Let a + (m) be any element of D/(m) where a is an element of D. Let b = a
rem m. Here, b is an element of D which is also an element of D\m. Furthermore,
b can be expressed as b = a − q · m for some q ∈ D. Now φ(b) = (a − q · m) + (m).
By properties of ideals, (a − q · m) + (m) = a + (m). Thus, for every element a + (m)
of D/(m), there exists an element b of D\m such that φ(b) = a + (m). So φ is an
onto function.
Since φ is a well-defined, 1-1, onto function, φ is a bijection. To show that
D\m and D/(m) are isomorphic, we must show that φ(a ⊕ b) = φ(a) + φ(b) and
φ(a ⊗ b) = φ(a) · φ(b) for all a and b in D\m. By definition of φ, φ(a) = a + (m)
and φ(b) = b + (m).

By definition of coset addition, φ(a) + φ(b) = (a + b) + (m). Now, a ⊕ b = (a + b) rem m.
There is an element q1 of D such that a ⊕ b can be expressed as (a + b) − q1 · m. So
φ(a ⊕ b) = ((a + b) − q1 · m) + (m). By properties of ideals, ((a + b) − q1 · m) + (m) =
(a + b) + (m). So, φ(a ⊕ b) = φ(a) + φ(b).

By definition of coset multiplication, φ(a) · φ(b) = (a · b) + (m). Now, a ⊗ b = (a · b) rem m.
There is an element q2 of D such that a ⊗ b can be expressed as (a · b) − q2 · m. So
φ(a ⊗ b) = ((a · b) − q2 · m) + (m). By properties of ideals, ((a · b) − q2 · m) + (m) =
(a · b) + (m). So, φ(a ⊗ b) = φ(a) · φ(b).
So φ is a bijection that satisfies φ(a ⊕ b) = φ(a) + φ(b) and φ(a ⊗ b) = φ(a) · φ(b).
Thus, D\m is isomorphic to D/(m), and we have shown that a residue ring is essentially equivalent to a quotient ring in the case where D has unique remainders. The
set of residues resulting from the division of all polynomials over a coefficient ring
by a fixed modulus polynomial is the residue ring used throughout this manuscript
that was argued to be isomorphic to a similarly defined quotient ring.
5. Examples
(1) The integers are a Euclidean domain with norm given by N(a) = |a|. It is
not a Euclidean domain with unique remainders, however. Suppose that a = 7 and
m = 3. Note that 7 = 2 · 3 + 1 = 3 · 3 − 2. Since both 1 and −2 have norms less
than N(m) = 3, then 1 and −2 are each valid remainders in this case. If we add the
additional restriction that the remainder be nonnegative, then we have a Euclidean
domain with unique remainders. Alternatively, one could add the restriction that
the remainder be nonpositive and obtain a different Euclidean domain with unique
remainder for the integers.
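The two decompositions of a = 7 above can be checked in a few lines of Python; `divmod` implements the nonnegative-remainder convention, while rounding the quotient up instead gives the nonpositive one.

```python
a, m = 7, 3

# nonnegative-remainder convention: 7 = 2 * 3 + 1
q1, r1 = divmod(a, m)
print(q1, r1)            # 2 1

# nonpositive-remainder convention: 7 = 3 * 3 - 2
q2 = -(-a // m)          # round the quotient up instead of down
r2 = a - q2 * m
print(q2, r2)            # 3 -2
```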
It is possible to restrict the allowable remainders to some other range of the
integers of length m by carefully selecting the definitions of norm and arithmetic
functions over D\m. For example, if we wish the elements of D\3 to be 3, 4,
and 5, then use N(a) = |a − 3| (plus the restriction that the remainder be greater
than 2) and define addition and multiplication as follows:

a ⊕ b = 3 + ((a + b) rem 3)
a ⊗ b = 3 + ((a · b) rem 3)
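The shifted residue system {3, 4, 5} can be checked directly. The sketch below (my own function names) implements the shifted ⊕ and ⊗ and verifies closure; note that 3 plays the role of zero and 4 the role of one.

```python
# Residue ring on representatives {3, 4, 5}: ordinary mod-3 arithmetic
# shifted so that every result lands back in {3, 4, 5}.
def shift_add(a, b):
    return 3 + (a + b) % 3

def shift_mul(a, b):
    return 3 + (a * b) % 3

elems = [3, 4, 5]
closed = all(shift_add(a, b) in elems and shift_mul(a, b) in elems
             for a in elems for b in elems)
print(closed)                            # True: both operations are closed
print(shift_add(3, 5), shift_mul(4, 5))  # 5 5: adding 3 or multiplying by 4
                                         # leaves each element unchanged
```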
The Gaussian integers Z[i] are also a Euclidean domain, using the norm
N(a + b·i) = a² + b². When dividing one Gaussian integer by another, the real part
s0 and the imaginary part t0 of the exact quotient are generally not integers, so to
obtain unique remainders one must specify whether to round up or round down. One can even specify
different rules for s0 and t0 if desired. Once these rules have been specified, then
Z[i] is a Euclidean domain with unique remainders. Then Z[i]\α is a remainder ring
for any α ≠ 0. The specific elements contained in Z[i]\α will vary depending on
the rounding rules selected for s0 and t0, but the algebraic structure should be the
same for each case.
For example, let α = 1 + i. If we round down for both s0 and t0, then Z[i]\α =
{0, i}. If we round up for both s0 and t0, then Z[i]\α = {0, −i}. If we round down
for s0 but up for t0, then Z[i]\α = {0, 1}. If we round up for s0 but down for t0,
then Z[i]\α = {0, −1}. Each of these rings is isomorphic to Z\2, Z/(2), and GF(2),
where the nonzero element is the identity element of the ring in each case.
By varying α, we obtain other finite rings. Only when Z[i]\α has a prime
number of elements will the finite ring be isomorphic to a finite field.
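The residue sets for α = 1 + i can be computed mechanically. The sketch below (my own helper, using Python complex numbers) applies a chosen rounding rule to the components s0 and t0 of the exact quotient and collects the resulting remainders.

```python
import math

def gauss_rem(a, alpha, round_s, round_t):
    """Remainder of Gaussian integer a modulo alpha, where round_s and
    round_t fix the rounding rules for the quotient components s0 and t0."""
    q_exact = a / alpha                                    # exact quotient in C
    q = complex(round_s(q_exact.real), round_t(q_exact.imag))
    return a - q * alpha

alpha = 1 + 1j
gaussians = [complex(s, t) for s in range(-4, 5) for t in range(-4, 5)]

down_down = {gauss_rem(a, alpha, math.floor, math.floor) for a in gaussians}
up_up     = {gauss_rem(a, alpha, math.ceil,  math.ceil)  for a in gaussians}
print(down_down == {0j, 1j}, up_up == {0j, -1j})   # True True
```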
(4) The Eisenstein integers Z[√−3] are a Euclidean domain using the norm

N(a + b\sqrt{D}) = a^2 - D b^2

for any element a + b√D ∈ Z[√D] (here D = −3). In order to obtain unique
remainders, rounding rules must again be specified for the components of a quotient in
Z[√D]. Once this has been determined, then Z[√D]\α is a remainder ring for any
α ∈ Z[√D] where α ≠ 0. Again, it remains to be determined if this yields any new
algebraic structures and if these structures have any practical importance.
6. Concluding Remarks
This section of the appendix showed that residue rings are isomorphic to
the quotient rings and finite fields constructed from Euclidean Domains with
unique remainders, but can be defined without the complicated concept of ideals
for beginning algebra students and engineers. Specifically, the representative elements commonly used in such quotient rings have an algebraic structure of their
own which can be used in place of the quotient ring.
The theory of residue rings can be used for more than just constructing finite
fields. Residue rings can replace any application of quotient rings when a ring used
to construct a quotient ring is also a Euclidean domain with unique remainders.
The Gaussian integer rings mentioned in the previous section are one example of
these additional applications.
APPENDIX C

The Convolution Theorem

Let y(τ) be the output of an engineering system with input x(τ) and transfer
function h̃(τ), so that

y(\tau) = \sum_{k=0}^{n-1} x(\tau - k) \, \tilde{h}(k)
Now let X(k) be the (engineers') discrete Fourier transform (DFT) of x(τ) and
H(k) be the (engineers') DFT of h̃(τ). So, the input and transfer functions can be
expressed in terms of the discrete Fourier series using

x(\tau) = \frac{1}{n} \sum_{i=0}^{n-1} X(i) \, W^{i\tau}

\tilde{h}(\tau) = \frac{1}{n} \sum_{j=0}^{n-1} H(j) \, W^{j\tau}

where W = e^{I \cdot 2\pi / n}. Recall that the engineers' FFT scales the coefficients of
the discrete Fourier series by n/T0, the engineers' IFFT includes a scaling of the
result by T0, and so a factor of 1/n is needed in the IFFT to produce the correct
coefficients for the series.
Using the definition of convolution, the output can now be expressed by

y(\tau) = \sum_{k=0}^{n-1} x(\tau - k) \, \tilde{h}(k)

= \sum_{k=0}^{n-1} \left( \frac{1}{n} \sum_{i=0}^{n-1} X(i) \, W^{i(\tau - k)} \right) \left( \frac{1}{n} \sum_{j=0}^{n-1} H(j) \, W^{jk} \right)

= \frac{1}{n^2} \sum_{k=0}^{n-1} \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} X(i) \, W^{i(\tau - k)} \, H(j) \, W^{jk}

= \frac{1}{n^2} \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \sum_{k=0}^{n-1} X(i) \, H(j) \, W^{i\tau} \, W^{(j-i)k}

= \frac{1}{n^2} \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} X(i) \, H(j) \, W^{i\tau} \left( \sum_{k=0}^{n-1} W^{(j-i)k} \right)

Since W is a primitive nth root of unity, the inner sum \sum_{k=0}^{n-1} W^{(j-i)k} equals n when
j = i and equals 0 otherwise. Thus,

y(\tau) = \frac{1}{n^2} \sum_{i=0}^{n-1} \left( n \cdot X(i) \, H(i) \right) W^{i\tau} = \frac{1}{n} \sum_{k=0}^{n-1} X(k) \, H(k) \, W^{k\tau}
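The pointwise-product relationship between the transforms can be verified numerically. The sketch below computes the cyclic convolution directly from its definition and compares both sides using NumPy's standard (unscaled) DFT, for which the same relation holds.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 8
x = rng.standard_normal(n)     # input sequence x(tau)
h = rng.standard_normal(n)     # transfer function h(tau)

# cyclic convolution y(tau) = sum_k x(tau - k) * h(k), indices taken mod n
y = np.array([sum(x[(tau - k) % n] * h[k] for k in range(n))
              for tau in range(n)])

# Convolution Theorem: the DFT of y is the pointwise product of the DFTs
print(np.allclose(np.fft.fft(y), np.fft.fft(x) * np.fft.fft(h)))   # True
```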
Now the output sequence of the system can also be expressed in terms of the discrete
Fourier series using

y(\tau) = \frac{1}{n} \sum_{k=0}^{n-1} Y(k) \, W^{k\tau}

where Y(k) is the discrete Fourier transform of y(τ). Subtracting these two formulas
for y(τ) and multiplying by n, we obtain

\sum_{k=0}^{n-1} \left( X(k) \, H(k) - Y(k) \right) W^{\tau k} = 0
Let c_k = X(k) H(k) − Y(k). Writing out this equation for each of the values of τ
in the interval 0 ≤ τ < n produces the linear system of equations

c_0 W^0 + c_1 W^0 + \cdots + c_{n-1} W^0 = 0
c_0 W^0 + c_1 W^1 + \cdots + c_{n-1} W^{n-1} = 0
\vdots
c_0 W^0 + c_1 W^{n-1} + \cdots + c_{n-1} W^{(n-1)(n-1)} = 0

In matrix form, the linear system of equations can be expressed as:

\begin{pmatrix}
1 & 1 & \cdots & 1 \\
1 & W & \cdots & W^{n-1} \\
\vdots & \vdots & & \vdots \\
1 & W^{n-1} & \cdots & W^{(n-1)(n-1)}
\end{pmatrix}
\begin{pmatrix} c_0 \\ c_1 \\ \vdots \\ c_{n-1} \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}

The coefficient matrix is a DFT matrix, which is nonsingular because W is a
primitive nth root of unity. The only solution to the system is therefore c_k = 0
for all k, i.e. Y(k) = X(k) H(k) for 0 ≤ k < n. This proves the following result.
Convolution Theorem.
Let x(τ) be the input to an engineering system with transfer function
h̃(τ), and let the output of the system for this input be y(τ) = x(τ) ⊛ h̃(τ).
Now let X(k) be the (engineers') discrete Fourier transform (DFT) of
x(τ), let H(k) be the (engineers') DFT of h̃(τ), and let Y(k) be the
(engineers') DFT of y(τ). Then Y(k) = X(k) · H(k) for all k in 0 ≤ k < n.
Bibliography
[1] Bernstein, D. Multidigit Multiplication for mathematicians.
<http://cr.yp.to/papers.html#m3>.
[2] Bernstein, D. Fast Multiplication and its applications.
<http://cr.yp.to/papers.html#multapps>.
[3] Bernstein, D. The Tangent FFT.
<http://cr.yp.to/papers.html#tangentfft>.
[4] Bittinger, Marvin L. Intermediate Algebra, 9th Edition, Pearson Education (2003).
[5] Bouguezel, Saad, M. Omair Ahmad, and M.N.S. Swamy. An Improved Radix-16 FFT Algorithm, Canadian Conference on Electrical and Computer Engineering, 2: 1089-92, 2004.
[6] Bouguezel, Saad, M. Omair Ahmad, and M.N.S. Swamy. Arithmetic Complexity of the Split-Radix FFT Algorithms, International Conference on Acoustics, Speech, and Signal Processing, 5: 137-40, 2005.
[7] Brigham, E. Oran. The Fast Fourier Transform and its Applications, Prentice Hall (1988).
[8] Buneman, Oscar. Journal of Computational Physics, 12: 127-8, 1973.
[9] Cantor, David G. and Erich Kaltofen. On fast multiplication of polynomials over arbitrary
algebras, Acta Informatica, 28: 693-701, 1991.
[10] Chu, Eleanor and Alan George. Inside the FFT Black Box: Serial and Parallel Fast Fourier
Transform Algorithms, CRC Press (2000).
[11] Cooley, J. and J. Tukey. An algorithm for the machine calculation of complex Fourier series,
Mathematics of Computation, 19: 297-301, 1965.
[12] Dubois, Eric and Anastasios N. Venetsanopoulos. A New Algorithm for the Radix-3 FFT,
IEEE Transactions on Acoustics, Speech, and Signal Processing, 26(3): 222-5, 1978.
[13] Duhamel, Pierre and H. Hollmann. Split-radix FFT algorithm, Electronics Letters, 20: 14-6,
1984.
[14] Duhamel, P. and M. Vetterli. Fast Fourier Transforms: A tutorial review and a state of the
art, Signal Processing, 19: 259-99, 1990.
[15] Fiduccia, Charles. Polynomial evaluation via the division algorithm: the fast Fourier transform revisited, Proceedings of the fourth annual ACM symposium on theory of computing,
88-93, 1972.
[16] Frigo, Matteo and Steven Johnson. The Design and Implementation of FFTW3, Proceedings
of the IEEE, 93(2): 216-231, 2005.
[17] Gao, Shuhong. Clemson University Mathematical Sciences 985 Course Notes, Fall 2001.
[18] von zur Gathen, Joachim and Jürgen Gerhard. Arithmetic and Factorization of Polynomials
over F2. Technical report, University of Paderborn, 1996.
[19] von zur Gathen, Joachim and Jürgen Gerhard. Modern Computer Algebra, Cambridge University Press (2003).
[20] Gentleman, Morven and Gordon Sande. Fast Fourier Transforms for fun and profit, AFIPS
1966 Fall Joint Computer Conference. Spartan Books, Washington, 1966.
[21] Gopinath, R. A. Comment: Conjugate Pair Fast Fourier Transform, Electronics Letters, 25(16):
1084, 1989.
[22] Heideman, M. T. and C. S. Burrus, A Bibliography of Fast Transform and Convolution
Algorithms II, Technical Report Number 8402, Electrical Engineering Dept., Rice University,
Houston, TX 77251-1892, 1984.
[23] Heideman, Michael T., Don H. Johnson, and C. Sidney Burrus, Gauss and the History of the
Fast Fourier Transform.
[24] van der Hoeven, Joris. The Truncated Fourier Transform and Applications. ISSAC 04 Proceedings, 2004.
[25] van der Hoeven, Joris. Notes on the Truncated Fourier Transform. Preprint., 2005.
[26] Johnson, Steven G. and Matteo Frigo. A modified split-radix FFT with fewer arithmetic
operations, IEEE Trans. Signal Processing, 55 (1): 111-119, 2007.
[27] Kamar, I. and Y. Elcherif. Conjugate Pair Fast Fourier Transform, Electronics Letters, 25(5):
324-5, 1989.
[28] Krot, A. M. and H. B. Minervina. Comment: Conjugate Pair Fast Fourier Transform, Electronics Letters, 28(12): 1143-4, 1992.
[29] Lidl, Rudolf, and Harald Niederreiter. Finite Fields. Encyclopedia of Mathematics and Its
Applications, Volume 20, Cambridge University Press (1987).
[30] Loy, Gareth. Musimathics: the mathematical foundations of music, volume 2, MIT Press,
2007.
[31] Mateer, Todd D. PhD Dissertation.
[32] Merris, Russell. Combinatorics (Second Edition), Wiley (2003).
[33] Nussbaumer, H.J. Fast Fourier Transforms and Convolution Algorithms, Springer (1990).
[34] Suzuki, Yôiti, Toshio Sone, and Kenuti Kido. A New FFT algorithm of Radix 3, 6, and 12,
IEEE Transactions on Acoustics, Speech, and Signal Processing, 34(2): 380-3, 1986.
[35] Takahashi, Daisuke. An Extended Split-Radix FFT Algorithm, IEEE Signal Processing Letters, 8(5): 145-7, 2001.
[36] Wan, Zhe-Xian. Lectures on Finite Fields and Galois Rings, World Scientific Publishing Co.
(2003).
[37] Wang, Yao and Xuelong Zhu. A Fast Algorithm for Fourier Transform Over Finite Fields
and its VLSI Implementation, IEEE Journal on Selected Areas in Communications, 6 (3):
572-7, 1988.
[38] Yavne, R. An economical method for calculating the discrete Fourier transform, Proc. Fall
Joint Computing Conference, 115-25, 1968.