Cheb Polynomials

18.
330 Lecture Notes:

Chebyshev Spectral Methods
Homer Reid
April 29, 2014
Contents
1 The question 2
2 The classical answer 3
3 The modern answer for periodic functions 5
4 The modern answer for non-periodic functions 6
5 Chebyshev polynomials 10
6 Chebyshev spectral methods 14
1
18.330 Lecture Notes 2
1 The question
In these notes we will concern ourselves with the following basic question: Given
a function f (x) on an interval x [a, b],
1. How accurately can we characterize f using only samples of its value at
N sample points {xn } in the interval [a, b]?
2. What is the optimal way to to choose the N sample points {xn }?
What does it mean to characterize a function f (x) over an interval [a, b]?
There are at least three possible answers:
Rb
1. We may want to evaluate the integral a f (x) dx. In this case, the problem
of characterizing f from N function samples is the problem of designing
an N -point quadrature rule.
2. We may want to evaluate the derivative of f at each of our sample points

using the information contained in the sample values. This is the problem
of constructing a differentiation stencil, and it arises when we try to solve
ODEs or PDEs: in that case we are trying to reconstruct f (x) given knowl-
edge of its derivative, so generally upon constructing the differentiation
stencil we will want to invert it.
3. We may want to construct an interpolant f interp (x) that agrees with f (x)
at the sample points but smoothly interpolates between those points in
a way that mimics the original function f (x) as closely as possible. For
example, f (x) may be the result of an experimental measurement or the
result of a costly numerical calculation, and we might to accelerate calcu-
lation of f (x) at arbitrary values of x by precomputing f (xn ) at just the
sample points {xn } and then interpolating to get values at intermediate
points x.
In a sense, the first half of our course was devoted to studying the answer
to this question furnished by classical numerical analysis, while the second half
has been focused on the modern answer. Lets begin by reviewing what the
classical approach had to offer.
2 The classical answer

Classical numerical analysis answers the question of how to choose the sample
points {xn } in the simplest possible way: We simply take the sample points to
be evenly spaced throughout the interval [a, b]:1
ba
xn = a + n, n = 0, 1, , N, = .
N
In this case,
The quadrature rules one obtains are the usual Newton-Cotes quadrature
rules, which we studied in the first and second weeks of our course. These
work by fitting polynomials through the function samples and then inte-
grating those polynomials to approximate the integral of the the function.
The differentiation stencils one obtains are the usual finite-difference sten-
cils, which we studied in the third and fourth weeks of our course. These
may again be interpreted as a form of polynomial interpolation: we are
essentially constructing and differentiating a low-degree approximation to
the Taylor-series polynomial
The interpolant one constructs is the unique N th degree polynomial P interp (x)
that agrees with the values of the underlying function f (x) at the N + 1
sample points. Although we didnt get to this in the first unit of our
course, it turns out to be easy to write down a formula for this polynomial
in terms of the sample points {xn } and the values of f at those points,
{fn } {f (xn )}. For example, for the cases N = 1, 2, 3 we have2
(x x2 ) (x x1 )
P1interp (x) = f1 + f2
(x1 x2 ) (x2 x1 )
(x x2 )(x x3 ) (x x1 )(x x3 ) (x x1 )(x x2 )
P2interp (x) = f1 + f2 + f3
(x1 x2 )(x1 x3 ) (x2 x1 )(x2 x3 ) (x3 x1 )(x3 x2 )
(x x2 )(x x3 )(x x4 ) (x x1 )(x x3 )(x x4 )
P3interp (x) = f1 + f2
(x1 x2 )(x1 x3 )(x1 x4 ) (x2 x1 )(x2 x3 )(x2 x4 )
(x x1 )(x x2 )(x x4 ) (x x1 )(x x2 )(x x3 )
+ f3 + f4
(x3 x1 )(x3 x2 )(x3 x4 ) (x4 x1 )(x4 x2 )(x4 x3 )
The formula of this type for general N is called the Lagrange interpolation
formula; it constructs an N th degree polynomial passing through N + 1
fixed data points (xn , fn ).
1 Technicallywe have here a set of N + 1 points, not N points as we stated above.
2 Do you see the pattern here? The general expression for PN includes one term for each
sample point xm . The numerator of this term is a product of N linear factors which are
constructed to ensure that the numerator vanishes whenever x equals one of the other sample
points (x = xn , n 6= m). The denominator of this term is just a constant chosen to replicate
the value of the numerator at x = xm , which ensures that the fraction evaluates to 1 at
x = xm . Then we just multiply by fm to obtain a term which yields fm at xm and vanishes
at all the other sample points. Summing all such terms for each sample point, we obtain an
N th degree polynomial which yields fn at each sample point xn .
Performance of the classical approach on general functions

How well does the classical approach work?
Integration: If we divide our interval into N subintervals and approximate

the integral over each subinterval using a pth-order Newton-Cotes quadra-
ture rule, then we saw in Unit 1 that for general functions the error decays
1
like N p+1 , i.e. algebraically with N (as opposed to exponentially with N ).
Differentiation: If we estimate derivative values via a pth-order finite-

difference stencil using function samples at points spaced by multiples of
, then the error decays like p , or like N1p . [For example, the forward
finite-difference approximation f 0 (x) f (x+)f

(x)
has error proportional
to , while the centered finite-difference f 0 (x) f (x+)f
2
(x)
has error
proportional to 2 .] Thus here again we find convergence algebraic in N ,
not exponential in N .
Interpolation: Polynomial interpolation in evenly-spaced sample points

is a notoriously badly-behaved procedure due to the Runge phenomenon
(we will discuss it briefly in an appendix). The Runge phenomenon is so
severe that, in some cases, the polynomial interpolant through N evenly-
spaced function samples points doesnt just converge slowly as N .
It doesnt converge at all!
To summarize the results of the classical approach,
Classical approach: To characterize a function over an in-

terval using N function choose the sample points to be evenly-
spaced points and construct polynomial interpolants. The ap-
proach in general yields convergence algebraic in N for integra-
tion and differentiation, but does not converge for interpolation
of some functions.
Performance of the classical approach on periodic functions

However, as we saw already in PSet 1, there is one exception to the general
rule of algebraic convergence: If the function we are integrating is periodic
over the interval in question, then simple Newton-Cotes using evenly-spaced
functions achieves convergence exponential in N (although differentiation and
interpolation continue to behave as above even for periodic functions). This
observation forms the basis of the modern approach, to which we now turn.
3 The modern answer for periodic functions

The classical approachto use evenly-spaced function samples and construct
polynomialsyields slow convergence in general and non-convergence (of the
polynomial interpolant) in some cases.
The modern approach, for periodic functions, retains the evenly-spaced sam-
ple points of the classical approach but throws out the idea of using polynomials
to interpolate them, choosing instead to construct trigonometric interpolants
consisting of linear combinations of sinusoids of various frequencies.3
The performance of the modern approach for periodic functions follows logi-
cally by aggregating a series of observations we made in our discussion of Fourier
analysis:
If a function f (t) is periodic with period T , then it has a Fourier-series
representation of the form

X
f (t) = fen ein0 t (1)
n=
If f (t) is also smoothin particular, if neither f nor its derivatives (of any
order) have discontinuitiesthen the Fourier-series coefficients fen decay
extremely rapidly with n, with typical decay rates looking something like
|fn | en (for some ) for large n.
This latter observation means that we can obtain a highly accurate charac-
terization of our function f (x) from just a few of its Fourier coefficients
more specifically, if we retain N coefficients, then the error in our approx-
imation should be on the order of eN .
Finally, from our discussion of discrete Fourier transforms we know that
the first N Fourier-series coefficients of a periodic function can be accu-
rately estimated from knowledge of its values at N sample points evenly
spaced throughout one period.
These observations motivate the modern strategy for how best to characterize
a periodic function based on a finite number of samples:
Modern approach, periodic functions: To characterize a

periodic function over an interval using N function samples,
choose the sample points to be evenly spaced throughout the
interval and construct a trigonometric interpolant consisting of
a sum of N sinusoids. The approach in general yields conver-
gence exponential in N for integration, differentiation, and in-
terpolation.
3 Linear
P
combinations of sinusoids like [an sin n0 t + bn cos n0 t] are sometimes called
trigonometric polynomials since they are in fact polynomials in the variable ei0 t , but I
personally find this terminology a little confusing.
4 The modern answer for non-periodic functions

The modern answer to the characterization problemsample at evenly-spaced
points and construct a trigonometric interpolantworks very well for periodic
functions. What do we do if we have a non-periodic function? Easy: we make
it into a periodic function. When you have such a powerful hammer, treat
everything like a nail!
The only tricky bit is that we cant simply take a non-periodic function
and construct a brute-force periodization by slicing out its behavior over some
finite interval, then repeating that behavior periodically. This destroys the
beautiful convergence properties discussed in the previous section by introducing
discontinuities at the endpoints of the interval; the Fourier-series coefficients fen
in (1) no longer decay exponentially, and the error incurred by retaining only
N of them now decreases like 1/N p for some power p.
However, as we saw in our discussion of Clenshaw-Curtis quadrature, there is
a wonderful way to transmogrify a non-periodic function into a periodic function
without introducing discontinuities. Lets review how this construction works.
Construct a smooth periodic version of f (x)

To construct a periodic function out of a non-periodic function f (x), we restrict
our attention to the interval x [1, 1] (if you need to consider a different
interval, just shift and scale variables accordingly) and define
g() = f (cos ).
This is a smooth4 periodic function. As varies from 0 to , g() traces out the
behavior of f (x) over the interval [1, 1] [that is, g() traces out f (x) backwards].
When crosses and continues on to 2, g() turns around and begins to retrace
its steps, going backwards over the same terrain it covered between = 0 and
. Figure 1 (which also appeared in our notes on Clenshaw-Curtis quadrature)
shows an example of a non-periodic function f (x) and the periodic function g()
that captures the behavior of f over the interval [1, 1].
Write down a Fourier cosine series for g()

Because g() is 2periodic and even, it has a a Fourier cosine series of the
form

a0 X
e
g() = + a cos()
e (2)
2 =1
with coefficients Z
2
a =
e g() cos() d. (3)
0
4 Assuming f is smooth. The construction of the g function doesnt do anything to smooth
out discontinuities in f or any of its derivatives; it only smoothes out the discontinuities arising
from the mismatch at the endpoints.
5 5
4 4
f(x)
3 3
2 2
1 1
y
0 0
-1 -1
-2 -2
-3 -3
-4 -4
-2 -1.5 -1 -0.5 0 0.5 1 1.5 2
x
(a)
5 5
c
4 4
3 3
2 2
1 1
y
0 0
-1 -1
-2 -2
-3 -3
-4 -4
(b)
Figure 1: (a) A function f (t) that we want to integrate over the interval [1, 1].
(b) The function g() = f (cos ). Note the following facts: (1) g() is periodic
with period 2. (2) g() is an even function of . (3) Over the interval 0
, g() traces out the behavior of f (t) as t varies from 1 1 [i.e. g()
traces out f (t) backwards.] However, (4) g() knows nothing about what f (t)
does outside the range 1 < t < 1, which can make it a little tricky to compare
the two plots. For example, g() has local minima at = 0, even though f (t)
does not have local minima at t = 1, 1.
Sample g() at N + 1 evenly-spaced points and construct an

interpolant
Now consider sampling the function g() at N + 1 evenly-spaced points dis-
tributed throughout the interval [0, ], including the endpoints:
n
gn g(n) = g , n = 0, 1, , N (4)
N
The discrete Fourier transform of the set of samples {gn } yields a set of Fourier
coefficients {e
g }:
DFT
{gn } {e g }
From the {e
g } coefficients we can reconstruct the original {gn } samples through
the magic of the inverse DFT:
IDFT
{e
g } {gn }
where the specific form of the reconstruction is

N
X
gn = ge ein . (5)
=0
Now proceeding exactly as in our discussion of trigonometric interpolation, we

continue equation (5) from the integer variable n to a real-valued variable :
N
X
g interp () = ge ei (6)
=0
Note that g interp () is (in general) not the same function as the original g();
the difference is that the sum in (7) is truncated at = N , whereas the Fourier
series for the full function g() will in general contain infinitely many terms.
The form of (6) may be simplified by noting that, because g() is an even
function of , its Fourier series includes only cosine terms:
N/2
a0 X
g interp () =
e
+ a cos()
e (7)
2 =1
where the e
an coefficients are related to the gen coefficients computed by the DFT
according to
a0 = 2e
e g0 , a = (e
e g + ge ) = 2eg .
[The last equality here follows from the fact that, for an even function g(), the
Fourier series coefficients for positive and negative are equal, ge = ge .]
The procedure we have outlined above uses general DFT techniques for
computing the numbers a . In this particular case, because g() is an even
function, it is possible to accelerate the calculation by a factor of 4 using the
discrete cosine transform, a specialized version of the discrete Fourier transform.
We wont elaborate on this detail here.
Express g interp () in terms of the variable x

Finally, lets now ask what equation (2) looks like in terms of the original variable
x. If we recall the original definition
g() f (cos ) (8)
we can manipulate this to read
f (x) = g(arccos x). (9)
Now plugging in the approximation (2) yields an approximation to f :
N/2
a0 X
f interp (x) =
e
+ a cos (n arccos x)
e (10)
2 =1
Equation (10) would appear at first blush to define a horribly ugly function of
x. It took the twisted5 genius of the Russian mathematician P. L. Chebyshev
to figure out that in fact equation (10) defines a polynomial function of x. To
understand how this could possibly be the case, we must now make a brief foray
in the world of the Chebyshev polynomials.
5 We intend this adjective in the most admiring possible sense.

5 Chebyshev polynomials
Trigonometric definition
The definition of the Chebyshev polynomials is inspired by the observation, from
high-school trigonometry, that cos(n) is a polynomial in cos for any n. For
example,
cos 2 = 2 cos2 1
cos 3 = 4 cos3 3 cos
cos 4 = 8 cos4 8 cos2 + 1
The polynomials on the RHS of these equations define the Chebyshev polyno-
mials for n = 2, 3, 4. More generally, the nth Chebyshev polynomial Tn (x) is
defined by the equation
cos n = Tn (cos )
and the first few Chebyshev polynomials are
T0 (x) = 1
T1 (x) = x
T2 (x) = 2x2 1
T3 (x) = 4x3 3x
T4 (x) = 8x4 8x2 + 1.
Figure 2 plots the first several Chebyshev polynomials. Notice the following
important fact: For all n and all x [1, 1], we have 1 Tn (x) 1. This
boundedness property of the Chebyshev polynomials turns out to be quite useful
in practice.
On the other hand, the Chebyshev polynomials are not bounded between
1 and 1 for values of x outside the interval [1, 1] (nor, being polynomials,
could they possibly be). Figure 3 shows what happens to T15 (x) as soon as we
get even the slightest little bit outside the range x [1, 1]: the polynomial
takes off to . In almost all situations involving Chebyshev polynomials we
will be interested in their behavior within the interval [1, 1].
Completeness and Orthogonality

The Chebyshev polynomials constitute our first example of an orthogonal family
of polynomials. We will have more to say about this idea later, but for the time
being the salient points are the following:
1. The Chebyshev polynomials are complete: Any N th-degree polynomial
can be expressed exactly (and uniquely) as a linear combination of T0 (x), T1 (x), , TN (x).
Thus the set of N + 1 functions {Tn } for n = 0, , N forms a basis of
the N + 1-dimensional vector space of N -th degree polynomials.
1 1.5
1 1.5
1 1
0.5 0.5
0.5 0.5
0 0 0 0
y
y
-0.5 -0.5
-0.5 -0.5
-1 -1
-1 -1.5
-1 -1.5
-1 -0.5 0
x 0.5 1 -1 -0.5 0
x 0.5 1
T0 (x) T1 (x)
1 1.5
1 1.5
1 1
0.5 0.5
0.5 0.5
0 0 0 0
y
-0.5 -0.5
-0.5 -0.5
-1 -1
-1 -1.5
-1 -1.5
-1 -0.5 0
x 0.5 1 -1 -0.5 0
x 0.5 1
T2 (x) T3 (x)
1 1.5
1 1.5
1 1
0.5 0.5
0.5 0.5
0 0 0 0
y
-0.5 -0.5
-0.5 -0.5
-1 -1
-1 -1.5
-1 -1.5
-1 -0.5 0
x 0.5 1 -1 -0.5 0
x 0.5 1
T4 (x) T15 (x)
Figure 2: The Chebyshev polynomials T04 (x) and T15 (x).

15 15
10 10
5 5
0 0
y
-5 -5
-10 -10
-15 -15
-1 -0.5 0 0.5 1
x
Figure 3: The Chebyshev polynomials Tn (x) take off to for values of x

outside the range [1 : 1]. Shown here is the case T15 (x).
2. The Chebyshev polynomials are orthogonal with respect to the following

inner product:6 Z 1
f (x)g(x)dx
hf, gi .
1 1 x2
Orthogonality means that if we insert Tn and Tm in the inner product we
get zero unless n = m:

hTn , Tm i = nm . (11)
2
Taken together, these two properties furnish a convenient way to represent ar-
bitrary functions as linear combinations of Chebyshev polynomials. The first
property tells us that, given any function f (x), we can write f (x) in the form

X
f (x) = Cn Tn (x). (12)
n=0
The second property gives us a convenient way to extract the Cn coefficients:

Just take the inner product of both sides of (12) with Tm (x). Because of orthog-
onality (equation 11), every term on the RHS dies except for the one involving
Cm , and we find

hf, Tm i = Cm
2
[where the /2 factor here comes from equation (11)]. In other words, the
Chebyshev expansion coefficients of a general function f (x) are
Z 1
2 f (x)Tm (x)
Cm = dx. (13)
1 1 x2
Equations (12) and (13) amount to form what we might refer to as the forward
and inverse discrete Chebyshev transforms of a function f (x).
6 An inner product on a vector space V is just a rule that assigns a real number to any pair
of elements in V . (Mathematicians would say it is a map V V R.) The rule has to be

linear (the inner product of a linear combination is a linear combination of the inner products)
and non-degenerate, meaning no non-zero element has vanishing inner product with itself.
6 Chebyshev spectral methods

Chebyshev spectral methods furnish the second half of the modern solution to
the problem we posed at the beginning of these notes, namely, how best to
characterize a function using samples of its value at N points.
Recall that the first half of the modern solution went like this:
Modern approach, periodic functions: To characterize a

periodic function over an interval using N function samples,
choose the sample points to be evenly spaced throughout the
interval and construct a trigonometric interpolant consisting of
a sum of N sinusoids. The approach in general yields conver-
gence exponential in N for integration, differentiation, and in-
terpolation.
The second half of the modern solution now reads like this:
Modern approach, non-periodic functions: To character-

ize a non-periodic function over an interval using N function
samples, map the interval into [1, 1], choose the sample points
to be Chebyshev points, and construct a polynomial interpolant
consisting of a sum of N Chebyshev polynomials. The approach
in general yields convergence exponential in N for integration,
differentiation, and interpolation.
Lets now investigate how Chebyshev spectral methods work for each of the
various aspects of the characterization problem we considered above.
Chebyshev approximation
As we saw previously, a function f (x) on the interval [1, 1] may be represented
exactly as a linear combination of Chebyshev polynomials:

X
f (x) = Cn Tn (x) (14)
n=0
One way to obtain a formula for the C coefficients in this expansion is to take
the inner product of both sides with Tm (x) and use the orthogonality of the T
functions:
hf, Tm i
Cm =
hTm , Tm
2 1 f (x)Tm (x)
Z
= dx. (15)
1 1 x2
However, there are better ways to compute these coefficients, as discussed below.
If we restrict the sum in (16) to include only its first N terms, we obtain an
approximate representation of f (x), the N th Chebyshev approximant:
N
X 1
f approx (x) = Cn Tn (x) (16)
n=0
Chebyshev interpolation
The coefficients Cn in formula (16) for the Chebyshev approximant may be
computed using the integral formula (14), but there are easier ways to get them.
These are based on the following alternative characterization of (16):
The N -th Chebyshev approximant (16) is the unique N -th de-

gree polynomial that agrees with f (x) at the N + 1 Chebyshev
points xn = cos n
N , n = 0, 1, , N.
Thus, when we construct (16), we are really constructing an interpolant that

smoothly connects N + 1 samples of f (x) evaluated at the Chebyshev points.
In particular, the values of f at the Chebyshev points are the only data we need
to construct f approx in (16). This is not obvious from expression (15), which
would seem to suggest that we need to know f throughout the interval [1, 1].
How do we use this characterization of (16) to compute the Chebyshev ex-
pansion coefficients {Cn } in (16)? There are at least two ways to proceed:
1. We could use the Lagrange interpolation formula to construct the unique
data points {xn , f (xn )} for
N -th degree polynomial running through the
the N + 1 Chebyshev points xn = cos nN , n = 0, 1, , N.
2. We could observe that the Cn coefficients are the coefficients in the Fourier
cosine series of the even 2-periodic function
g() = f (cos ). The samples
of g() at evenly-spaced points g n N are precisely just the samples
of f (x) at the Chebyshev points cos n N , and the Fourier cosine series
coefficients may be computed by computing the discrete cosine transform
of the set of numbers {fn }:
DCT
{fn } {Cn }
where n
fn = f cos , n = 0, 1, , N.
N
Option 1 here is discussed in Trefethen, Spectral Methods in MATLAB,
Chapter 6 (see particularly Exercise 6.1).
Here we will focus on option 2. The numbers Cn are just the Fourier cosine
series coefficients of g(), i.e. the numbers we called e
a in equation (3):
Z
2
Cn = f (cos ) cos(n)d.
0
We compute the integral using a simple (N + 1)-point trapezoidal rule:

2 h1 n 2n
Cn = f0 + f1 cos + f2 cos +
N 2 N N

(N 1)n 1 i
+ fN 1 cos + fN cos N (17)
N 2
where n
fn f cos
N
If we write out equation (17) for all of the Cn coefficients at once, we have an
(N + 1)-dimensional linear system relating the sets of numbers {fn } and {Cn }:

12 1 1 1 1 1
2
f0 C0

1
cos N cos 2 cos 3 1
2 cos f1 C1

2 N N

1
cos 2 cos 4 cos 6 1

2
2 N N N 2 cos 2
f2
= C2
N 1

cos 3 cos 6 cos 9 1
cos 3 f3 C3

2 N N N 2

. .. .. .. .. .. .. ..
..

. . . . .
.
.

1 1
2 cos cos 2 cos 3 2 cos N fN CN
which we could write in the form
f = C (18)
where f and C are the (N + 1)-dimensional vectors of function samples at

Chebyshev points and Chebyshev expansion coefficients, respectively, and the
elements of the matrix are

1
N ,
m=0
nm

nm = N2 cos , m = 1, , N 1

1 N
N cos n, m=N
where the n, m indices run from 0 to N .

Using equation (18) directly is actually not a good way to compute the C
coefficients from the f samples, because the computational cost of the matrix-
vector multiplication scales like N 2 , whereas FFT techniques (the fast cosine
transform) can perform the same computation with cost scaling like N log N .
However, the existence of the matrix is useful for deriving Clenshaw-Curtis
quadrature rules and Chebyshev differentiation matrices, as we will now see.
Chebyshev integration
The Chebyshev spectral approach to integrating a function f (x) goes like this:
1. Construct the N th Chebyshev approximant f approx (x) to f (x) [equation
(16)].
2. Integrate the approximant and take this as an approximation to the inte-

gral.
In symbols, we have
Z 1 Z 1
f (x) dx f approx (x) dx
1 1
Insert equation (16):
N
X Z 1
Cm Tm (x) dx. (19)
m=0 1
But the integrals of the Chebyshev polynomials can be evaluated in closed form,
with the result Z 1 (
2
2, m even
Tm (x) dx = 1m (20)
1 0, m odd.
Thus equation (19) reads
Z 1 N
X 2Cm
f (x) dx . (21)
1 m=0
1 m2
m even
Does this expression look familiar? It is exactly what we found in our discussion
of Clenshaw-Curtis quadrature, except there we interpreted the integral (20) in
the equivalent form
Z 1 Z
Tm (x) dx = cos(m) sin d.
1 0
Thus the Chebyshev spectral approach to integration is just Clenshaw-Curtis

quadrature. As we have observed, the Cm coefficients may be computed exactly
up to m = N using N + 1 samples of the function f (x) (where the samples
are taken at the Chebyshev points). Indeed, we can write (21) in the form of
a vector-vector product involving the vector C of Chebyshev expansion coeffi-
cients:
2

0

2
Z 1 122
f (x) dx WT C,

W=
0 .

1
2
142
..

.

2
1N 2
Now plugging in equation (18) yields

Z 1
f (x) dx WT f (22)
1
= wt f (23)
which just illustrates that the weights of the (N + 1)-point Clenshaw-Curtis

quadrature rule are the elements of the vector w = WT .
Chebyshev differentiation
In the first unit of our course we saw how to use finite-difference techniques to
approximate derivative values from function values. For example, if feven is a
vector of function samples taken at evenly-spaced points in an interval [a, b] i.e.
if
f (a)
f (a + )

feven = f (a + 2)

..
.
f (b)
then the vector of derivative values at the sample points may be represented in
the centered-finite-difference approximation as a matrix-vector product of the
form
0
feven = DCFD feven
where7

0 1 0 0 0 0

1 0 1 0 0 0

0 1 0 1 0 0
1 0 0 1 0 0 0

DCFD = .

2

.
..

0 0 0 0 0 1
0 0 0 0 1 0
As we saw in our discussion of finite-difference techniques, this approximation
will converge like 1/N 2 , i.e. the error between our approximate derivative and
the actual derivative will decay like 1/N 2 .
Now that we are equipped with Chebyshev spectral methods, we can write
a numerical differentiation stencil whose errors will decay exponentially 8 in N .
Indeed, following the general spirit of Chebyshev spectral methods, all we have
to do is
1. Construct the N th Chebyshev approximant f approx (x) to f (x) [equation
(16)].
2. Differentiate the approximant and take this as an approximation to the
derivative.
The N th Chebyshev approximant to f (x) is
N
X
fapprox (x) = Cm Tm (x)
m=0
Differentiating, we find
N
X
0 0
fapprox (x) = Cm Tm (x).
m=0
If we evaluate this formula at each of the (N + 1) Chebyshev points xn =

0
cos n

N , n = 0, 1, , N , we obtain a vector fcheb whose entries are approximate
values of the derivative of f at the Chebyshev points, and which is related to
the vector C of Chebyshev coefficients via a matrix-vector product relationship:
0 0
T0 (x0 ) T10 (x0 ) T20 (x0 ) TN0 (x0 )

f (x0 ) C0
f 0 (x1 ) T00 (x1 ) T10 (x1 ) T20 (x1 ) TN0 (x1 ) C1
0
f (x2 ) T00 (x2 ) T10 (x2 ) T20 (x2 ) TN0 (x2 ) C2

=
.. .. .. .. .. .. ..
. . . . . . .
f 0 (xN ) T00 (xN ) T10 (xN ) T20 (xN ) TN0 (xN ) CN
| {z } | {z } | {z }
f 0 T0 C
cheb
(24)
7 We are here assuming that f vanishes to the left and right of the endpoints; as we saw
earlier in the course, it is easy to generalize to arbitrary boundary values of f .
8 Technically: faster than any polynomial in N .
Lets abbreviate this equation by writing

0
fcheb = T0 C
where T0 is the (N + 1) (N + 1)-dimensional matrix in (24). If we now plug

in C = f cheb [equation 18], we get
0 0
fcheb = T
|{z} fcheb
Dcheb
This equation identifies the (N + 1) (N + 1) matrix

Dcheb = T0
as the matrix that operates on a vector of f samples at Chebyshev points to
yield a vector of f 0 samples at Chebyshev points.
Second derivatives
What if we need to compute second derivatives? Easy! Just go like this:
00 0
fcheb = Dcheb fcheb

= Dcheb Dcheb fcheb
2
= Dcheb fcheb .
This equation identifies the (N +1)(N +1) matrix (Dcheb )2 , i.e just the square
of the matrix Dcheb , as the matrix that operates on a vector of f samples at
Chebyshev points to yield a vector of f 00 samples at Chebyshev points.
Chebyshev Boundary-Value Problems

Earlier in the course we used finite-difference differentiation matrices to solve
boundary-value problems, with errors decaying like 1/N p where N is the number
of sample points and p is some integer power. Now that we are equipped with
Chebyshev spectral methods, we can use Chebyshev differentiation matrices
like Dcheb to solve boundary-value problems with errors decaying exponentially
rapidly with N .
Although this process is conceptually just as straightforward as was our
use of finite-difference stencils to solve boundary-value problems earlier in the
course, there are a couple of minor technical details to consider that slightly
complicate the story. To illustrate how these can be tamed, lets work out a
Chebyshev algorithm for solving a boundary-value problem of the form

f 00 (x) + f 0 (x) + 2 f (x) = g(x), f xL fL , f xR fR . (25)
In this equation, and are fixed parameters, g(x) is a known forcing function,
the x variable ranges over the interval [xL , xR ], and the boundary values of f at
the left and right endpoints are fL , fR . (The subscripts L and R stand for
left and right.)
Rescaling to [1, 1]
Our boundary-value problem is defined on the interval [xL , xR ], but Chebyshev
spectral methods are nicest when we are working on the interval [1, 1]. Thus,
before we do anything else, lets redefine our problem so that the independent
variable runs over [1, 1]. That is, we will write x as a linear function of a new
variable (i.e. x = A + B for constants A, B to be determined) such that x
runs from xL to xR as runs from 1 to 1. As you can easily check, the unique
choice that works is
xR xL xR + xL
x( ) = W + xM , W , xM .
2 2
(Note that W is just half the width of the interval, while xM is the midpoint of
the interval. Here W stands for width, while M stands for midpoint.)
I will use the symbols F( ) and G( ) to denote new functions of obtained
by evaluating the old functions f (x) and g(x) at the point x = x( ):

F( ) f x( ) = f W + xM G( ) g x( ) = g W + xM .
One consequence of the change of variables is that derivatives with respect to x

acquire factors9 of W when we write them in terms of derivatives with respect
to :
1 0 1
f (x) = F( ), f 0 (x) = F ( ), f 00 (x) = 2 F 0 ( ).
W W
Now we just rewrite the differential equation (25 in terms of the new variable
:
1 00 0
F ( ) + 2 F( ) = G( ),

2
F ( ) + F 1 fL , F + 1 fR .
W W
(26)
We now have a differential equation defined on the interval [1, 1], and we can
apply Chebyshev spectral methods.
Discretization
The next step is to discretize. Fix a value of N and consider the set of (N + 1)
Chebyshev points10 in the interval [1 : 1]:
n n o
n = cos , n = 0, 1, , N total of N + 1 sample points (27)
N
9 You can use dimensional analysis as a mnemonic device to help you remember where the
W factors go: Think of x as a quantity with units of length (so W , the width of an interval
in x, has units of length too), while is dimensionless. We know that x derivatives like df /dx
have units of inverse length [and d2 f /dx2 has units of (inverse length)2 ], but derivatives
like dF /d are dimensionless, so to recover a quantity like df /dx from a quantity like dF /d
we have to divide the latter by a quantity with units of length, i.e. by one factor of W .
Alternatively, you can think in terms of this symbolic identity:
d 1 d
= .
dx W d
10 As usual in Fourier and Chebyshev methods, there is some annoying confusion here over
precisely what N means, and failure to get this minor point straight can lead to annoying
Let F , F 0 , F 00 , and G be vectors of length N + 1 containing samples of F,

derivatives of F, and G at the Chebyshev points:
F00 F000

F0 G0
F1 F10 F100 G1
F20 F200

F2 G2
F = , F0 = , F 00 = , G=

.. .. .. ..

.

.

.

.

FN 1 F0 F 00 GN 1
N 1 N 1
0 00
FN FN FN GN
(28)
where
Fn F(n ), Fn0 F 0 (n ), Fn00 F 00 (n ), Gn G(n ).
The vectors F 0 and F 00 can be obtained by operating on F with the Chebyshev

differentiation matrices we constructed earlier:
F 0 = DF , F 00 = D2 F
where D is what we earlier called Dcheb .

In discretized form, the boundary-value problem (26) thus becomes a linear-
algebra problem involving some matrices and some vectors:
1
2 2
2
D + D + F =G (29)
|W {zW }
M
where M is just a convenient name that we have assigned to the (N +1)(N +1)
matrix in parentheses.
Handling of boundary values

The only remaining complication is to account for the boundary values. To do
this, note that the first and last entries in the vector F are actually known, not
errors that result from being off by 1.
The way we have written things (which is the conventional formulation of Chebyshev spec-
tral methods), N is the number of angular segments into which the upper-half-circle is split,
which means that the number of sample points is actually one larger than N ; this is because
the index n in equation (27) needs to runs from 0 to N inclusive because we need to include
sample points at both 1 = cos 0
N
and 1 = cos NN .
This is straightforward enough, but it differs from the convention typically used in
DFT/FFT methods, where N (not N + 1) is the number of sample points. This corresponds
to the fact that, in DFT/FFT methods, the index n only runs from 0 to N 1, not all the
way to N . One way to think about the distinction is that, in DFT/FFT methods, the point
n = N is equivalent to n = 0 (it corresponds to one full lap around the unit circle in the com-
plex plane), so including it would be redundant. On the other hand, in Chebyshev methods
[more broadly, in discrete sine/cosine transform methods (DST/DCT methods) as opposed
to discrete Fourier transform methods], the point n = N corresponds to one half lap around
the unit circle, taking us to = ; this is inequivalent to n = 0 and thus the corresponding
sample point must be retained.
unknown, quantities: they are simply11 given by the boundary conditions, i.e.
we have
F0 fR , FN fL . (30)
This means that equation (29), which consists of N + 1 simultaneous linear
equations, actually gives us more equations than we need; we want to eliminate
the first and last of those equations and solve a reduced (N 1)-dimensional
system for just the unknown (N 1) quantities F1 , , FN 1 .
To separate out what is known from what is unknown on the LHS of equation
(29), lets write the N +1 equations implicit in that statement in a {1, (N 1), 1}
block form:

M00 v1T M0N F0 G0

= . (31)

v2
M int v 3 F
int int
G

T
MN 0 v4 MN N FN GN
In this equation, F int and G int are the interior portions of the F and G
vectors, containing just the values of F and G at the N 1 interior Chebyshev
points:
F1 G1

F2 G2

F int = .. G int = ..
, .

. .
FN 2 GN 2

FN 1 GN 1
Also, in equation (31), v1,2,3,4 are (N 1)-dimensional vectors obtained by
slicing out chunks of the original matrix M, and Mint is the (N 1) (N 1)
interior chunk of M. In a high-level language like julia these may be extracted
from M using the following commands:
MInt = M[ 2:end-1, 2:end-1 ];

v1 = M[ 1, 2:end-1 ];
v2 = M[ 2:end-1, 1 ];
11 Careful! In Chebyshev spectral methods, the angle = n [the argument of the cosine
N
in equation (27)] runs from = 0 to = as the index n runs from 0 to N . This means that
the variable winds up running backwards from 1 to 1 as n runs from 0 to N , i.e.
n = 0 corresponds to = +1, n = N corresponds to = 1.
Looking at equation (29), this yields the at-first-surprising conclusion that the boundary value
at the right endpoint, fR , wants to go in the first slot of the vector F , while fL wants to go
in the last slot of the vector, as in (30).
v3 = M[ 2:end-1, end ];
v4 = M[ end, 2:end-1 ];
The portion of (31) that we now want to solve is the interior portion, i.e.
the innermost (N 1) (N 1) chunk of the system, which reads
F0 v2 + Mint F int + FN v3 = G int
or, swinging all known quantities over the RHS so that we have a linear system
relating unknowns to knowns,
Mint F int = G int F0 v2 FN v3
in terms of which our solution vector will be

h i
F int = M1
int G int F 0 v 2 F N v3 . (32)
This equation gives us only the innermost (N 1) entries in our solution vector
F ; the outer 2 entries are obtained by just plugging in the given boundary
conditions.
If this procedure seems complicated, its actually nothing more than what
we did earlier in our treatment of finite-difference solutions to boundary-value
problems. For example, in the section titled Finite-differencing as matrix-
vector multiplication in the Numerical Differentiation lecture notes, the RHS
of the boundary-value problem involved a vector we called , which depended
on the boundary values. This vector is equivalent to the vector F0 v2 +Fn v3 that
appears on the RHS of (32). The only difference is that in the finite-difference
case this vector is sparse (almost all of its entries are zero), whereas here the
vector is dense.
This reflects the fact that finite-differencing is essentially a local procedure,
which estimates derivatives from function samples only at immediately adja-
cent points; in contrast, Chebyshev differentiation is inherently global, with
each sample of the derivative needing to know information on the entire set of
function samples. This non-locality makes Chebyshev methods more costly for a
given number of samples, but is also responsible for the dramatically accelerated
convergence properties.

Cheb Polynomials

Hochgeladen von

Dokumentinformationen

Originaltitel

Copyright

Verfügbare Formate

Dieses Dokument teilen

Dokument teilen oder einbetten

Freigabeoptionen

Stufen Sie dieses Dokument als nützlich ein?

Sind diese Inhalte unangemessen?

Copyright:

Verfügbare Formate

Cheb Polynomials

Hochgeladen von

Copyright:

Verfügbare Formate

18.

330 Lecture Notes:

2 The classical answer 3

3 The modern answer for periodic functions 5

4 The modern answer for non-periodic functions 6

6 Chebyshev spectral methods 14

2. We may want to evaluate the derivative of f at each of our sample points

2 The classical answer

Performance of the classical approach on general functions

Integration: If we divide our interval into N subintervals and approximate

Differentiation: If we estimate derivative values via a pth-order finite-

Interpolation: Polynomial interpolation in evenly-spaced sample points

To summarize the results of the classical approach,

Classical approach: To characterize a function over an in-

Performance of the classical approach on periodic functions

3 The modern answer for periodic functions

Modern approach, periodic functions: To characterize a

4 The modern answer for non-periodic functions

Construct a smooth periodic version of f (x)

Write down a Fourier cosine series for g()

Sample g() at N + 1 evenly-spaced points and construct an

where the specific form of the reconstruction is

Now proceeding exactly as in our discussion of trigonometric interpolation, we

Express g interp () in terms of the variable x

g() f (cos ) (8)

we can manipulate this to read

f (x) = g(arccos x). (9)

Now plugging in the approximation (2) yields an approximation to f :

5 We intend this adjective in the most admiring possible sense.

Completeness and Orthogonality

T4 (x) T15 (x)

Figure 2: The Chebyshev polynomials T04 (x) and T15 (x).

Figure 3: The Chebyshev polynomials Tn (x) take off to for values of x

2. The Chebyshev polynomials are orthogonal with respect to the following

The second property gives us a convenient way to extract the Cn coefficients:

of elements in V . (Mathematicians would say it is a map V V R.) The rule has to be

6 Chebyshev spectral methods

Modern approach, periodic functions: To characterize a

Modern approach, non-periodic functions: To character-

The N -th Chebyshev approximant (16) is the unique N -th de-

Thus, when we construct (16), we are really constructing an interpolant that

We compute the integral using a simple (N + 1)-point trapezoidal rule:

which we could write in the form

where f and C are the (N + 1)-dimensional vectors of function samples at

where the n, m indices run from 0 to N .

2. Integrate the approximant and take this as an approximation to the inte-

Insert equation (16):

Thus the Chebyshev spectral approach to integration is just Clenshaw-Curtis

Now plugging in equation (18) yields

which just illustrates that the weights of the (N + 1)-point Clenshaw-Curtis

If we evaluate this formula at each of the (N + 1) Chebyshev points xn =

Lets abbreviate this equation by writing

where T0 is the (N + 1) (N + 1)-dimensional matrix in (24). If we now plug

This equation identifies the (N + 1) (N + 1) matrix

Chebyshev Boundary-Value Problems

One consequence of the change of variables is that derivatives with respect to x

Let F , F 0 , F 00 , and G be vectors of length N + 1 containing samples of F,

Fn F(n ), Fn0 F 0 (n ), Fn00 F 00 (n ), Gn G(n ).

The vectors F 0 and F 00 can be obtained by operating on F with the Chebyshev

where D is what we earlier called Dcheb .

Handling of boundary values

MInt = M[ 2:end-1, 2:end-1 ];

F0 v2 + Mint F int + FN v3 = G int