Contents

1 The question
5 Chebyshev polynomials
18.330 Lecture Notes
1 The question
In these notes we will concern ourselves with the following basic question: Given
a function f(x) on an interval $x \in [a, b]$,

1. How accurately can we characterize f using only samples of its value at N sample points $\{x_n\}$ in the interval [a, b]?

2. What is the optimal way to choose the N sample points $\{x_n\}$?
What does it mean to characterize a function f (x) over an interval [a, b]?
There are at least three possible answers:
1. We may want to evaluate the integral $\int_a^b f(x)\,dx$. In this case, the problem of characterizing f from N function samples is the problem of designing an N-point quadrature rule.
3. We may want to construct an interpolant $f^{\rm interp}(x)$ that agrees with f(x) at the sample points but smoothly interpolates between those points in a way that mimics the original function f(x) as closely as possible. For example, f(x) may be the result of an experimental measurement or the result of a costly numerical calculation, and we might want to accelerate calculation of f(x) at arbitrary values of x by precomputing $f(x_n)$ at just the sample points $\{x_n\}$ and then interpolating to get values at intermediate points x.
In a sense, the first half of our course was devoted to studying the answer
to this question furnished by classical numerical analysis, while the second half
has been focused on the modern answer. Let's begin by reviewing what the
classical approach had to offer.
If f(t) is also smooth (in particular, if neither f nor its derivatives of any order have discontinuities), then the Fourier-series coefficients $\tilde f_n$ decay extremely rapidly with n, with typical decay rates looking something like $|\tilde f_n| \sim e^{-\sigma n}$ (for some $\sigma > 0$) for large n.

This latter observation means that we can obtain a highly accurate characterization of our function f(x) from just a few of its Fourier coefficients; more specifically, if we retain N coefficients, then the error in our approximation should be on the order of $e^{-\sigma N}$.
Finally, from our discussion of discrete Fourier transforms we know that
the first N Fourier-series coefficients of a periodic function can be accu-
rately estimated from knowledge of its values at N sample points evenly
spaced throughout one period.
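This last observation is easy to check numerically. The quick sketch below (mine, not part of the original notes) samples $g(\theta) = \cos 3\theta$ at N evenly spaced points over one period and recovers its only nonzero positive-frequency Fourier coefficient, $c_3 = 1/2$, from a single DFT:

```python
import numpy as np

N = 16
theta = 2 * np.pi * np.arange(N) / N   # N evenly spaced samples over one period
g = np.cos(3 * theta)

# DFT estimate of the Fourier-series coefficients c_m of g
c = np.fft.fft(g) / N

# c[3] estimates the coefficient of e^{+3 i theta}, which is 1/2 for cos(3 theta);
# all other low-order coefficients (apart from c[N-3]) vanish
```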
These observations motivate the modern strategy for how best to characterize
a periodic function based on a finite number of samples:
3 Linear combinations of sinusoids like $\sum_n [a_n \sin n\omega_0 t + b_n \cos n\omega_0 t]$ are sometimes called trigonometric polynomials, since they are in fact polynomials in the variable $e^{i\omega_0 t}$, but I personally find this terminology a little confusing.
with coefficients

$$\tilde a_\nu = \frac{2}{\pi} \int_0^\pi g(\theta) \cos(\nu\theta)\, d\theta. \qquad (3)$$
4 Assuming f is smooth. The construction of the g function doesn't do anything to smooth out discontinuities in f or any of its derivatives; it only smooths out the discontinuities arising from the mismatch at the endpoints.
Figure 1: (a) A function f(t) that we want to integrate over the interval [−1, 1].
(b) The function g(θ) = f(cos θ). Note the following facts: (1) g(θ) is periodic
with period 2π. (2) g(θ) is an even function of θ. (3) Over the interval $0 \le \theta \le \pi$, g(θ) traces out the behavior of f(t) as t varies from 1 to −1 [i.e. g(θ)
traces out f(t) backwards]. However, (4) g(θ) knows nothing about what f(t)
does outside the range −1 < t < 1, which can make it a little tricky to compare
the two plots. For example, g(θ) has local minima at θ = 0, π, even though f(t)
does not have local minima at t = −1, 1.
Note that $g^{\rm interp}(\theta)$ is (in general) not the same function as the original g(θ);
the difference is that the sum in (7) is truncated at ν = N/2, whereas the Fourier
series for the full function g(θ) will in general contain infinitely many terms.

The form of (6) may be simplified by noting that, because g(θ) is an even
function of θ, its Fourier series includes only cosine terms:
$$g^{\rm interp}(\theta) = \frac{\tilde a_0}{2} + \sum_{\nu=1}^{N/2} \tilde a_\nu \cos(\nu\theta) \qquad (7)$$
where the $\tilde a_\nu$ coefficients are related to the $\tilde g_\nu$ coefficients computed by the DFT according to

$$\tilde a_0 = 2\tilde g_0, \qquad \tilde a_\nu = (\tilde g_\nu + \tilde g_{-\nu}) = 2\tilde g_\nu.$$

[The last equality here follows from the fact that, for an even function g(θ), the Fourier-series coefficients for positive and negative ν are equal, $\tilde g_{-\nu} = \tilde g_\nu$.]
The procedure we have outlined above uses general DFT techniques for
computing the numbers $\tilde a_\nu$. In this particular case, because g(θ) is an even
function, it is possible to accelerate the calculation by a factor of 4 using the
discrete cosine transform, a specialized version of the discrete Fourier transform.
We won't elaborate on this detail here.
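To make the connection concrete, here is a small sketch (mine, not from the original notes): for an even function sampled at $\theta_n = n\pi/N$, the trapezoid-rule cosine sums for the $\tilde a_\nu$ are exactly what a type-I DCT computes, and a type-I DCT is in turn one FFT of the even extension of the samples. The test function $e^{\cos\theta}$ is an arbitrary smooth, even, periodic choice:

```python
import numpy as np

N = 16
theta = np.pi * np.arange(N + 1) / N      # theta_n = n*pi/N, n = 0..N
g = np.exp(np.cos(theta))                 # smooth, even, 2*pi-periodic test function

# Direct trapezoid-rule evaluation of a_nu = (2/pi) int_0^pi g cos(nu theta) dtheta,
# with the endpoint samples given half weight
w = np.ones(N + 1); w[0] = w[-1] = 0.5
a_direct = (2.0 / N) * np.array([np.sum(w * g * np.cos(nu * theta))
                                 for nu in range(N + 1)])

# The same numbers from one FFT of the even extension of the samples
# (this is exactly the sum a type-I DCT computes)
ext = np.concatenate([g, g[-2:0:-1]])     # even extension, length 2N
a_fft = np.fft.fft(ext).real[:N + 1] / N
```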
$$f^{\rm interp}(x) = \frac{\tilde a_0}{2} + \sum_{n=1}^{N/2} \tilde a_n \cos(n \arccos x) \qquad (10)$$
Equation (10) would appear at first blush to define a horribly ugly function of
x. It took the twisted genius of the Russian mathematician P. L. Chebyshev
to figure out that in fact equation (10) defines a polynomial function of x. To
understand how this could possibly be the case, we must now make a brief foray
into the world of the Chebyshev polynomials.
5 Chebyshev polynomials
Trigonometric definition
The definition of the Chebyshev polynomials is inspired by the observation, from
high-school trigonometry, that cos(nθ) is a polynomial in cos θ for any n. For
example,

$$\cos 2\theta = 2\cos^2\theta - 1$$
$$\cos 3\theta = 4\cos^3\theta - 3\cos\theta$$
$$\cos 4\theta = 8\cos^4\theta - 8\cos^2\theta + 1$$
The polynomials on the RHS of these equations define the Chebyshev polynomials
for n = 2, 3, 4. More generally, the nth Chebyshev polynomial $T_n(x)$ is
defined by the equation

$$\cos n\theta = T_n(\cos\theta)$$
and the first few Chebyshev polynomials are

$$T_0(x) = 1$$
$$T_1(x) = x$$
$$T_2(x) = 2x^2 - 1$$
$$T_3(x) = 4x^3 - 3x$$
$$T_4(x) = 8x^4 - 8x^2 + 1.$$
Figure 2 plots the first several Chebyshev polynomials. Notice the following
important fact: For all n and all $x \in [-1, 1]$, we have $-1 \le T_n(x) \le 1$. This
boundedness property of the Chebyshev polynomials turns out to be quite useful
in practice.

On the other hand, the Chebyshev polynomials are not bounded between
−1 and 1 for values of x outside the interval [−1, 1] (nor, being polynomials,
could they possibly be). Figure 3 shows what happens to $T_{15}(x)$ as soon as we
get even the slightest little bit outside the range $x \in [-1, 1]$: the polynomial
takes off toward $\pm\infty$. In almost all situations involving Chebyshev polynomials we
will be interested in their behavior within the interval [−1, 1].
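Both the defining identity and the boundedness property are easy to verify numerically; here is a small sketch (mine, not from the notes) using the explicit polynomials listed above:

```python
import numpy as np

# First few Chebyshev polynomials, written out explicitly as in the text
T = [
    lambda x: np.ones_like(x),          # T0
    lambda x: x,                        # T1
    lambda x: 2 * x**2 - 1,             # T2
    lambda x: 4 * x**3 - 3 * x,         # T3
    lambda x: 8 * x**4 - 8 * x**2 + 1,  # T4
]

theta = np.linspace(0.0, np.pi, 201)
for n, Tn in enumerate(T):
    # Defining identity: T_n(cos theta) = cos(n theta)
    assert np.allclose(Tn(np.cos(theta)), np.cos(n * theta))
    # Boundedness on [-1, 1]: |T_n(x)| <= 1
    assert np.max(np.abs(Tn(np.cos(theta)))) <= 1.0 + 1e-12
```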
[Figure 2: plots of the first few Chebyshev polynomials $T_0(x)$, $T_1(x)$, $T_2(x)$, $T_3(x)$ on the interval [−1, 1].]

[Figure 3: plot of $T_{15}(x)$, which rapidly grows in magnitude just outside the interval [−1, 1].]
Equations (12) and (13) define what we might refer to as the forward
and inverse discrete Chebyshev transforms of a function f(x).
6 An inner product on a vector space V is just a rule that assigns a real number to any pair of vectors in V.
The second half of the modern solution now reads like this:
Let's now investigate how Chebyshev spectral methods work for each of the
various aspects of the characterization problem we considered above.
Chebyshev approximation
As we saw previously, a function f(x) on the interval [−1, 1] may be represented
exactly as a linear combination of Chebyshev polynomials:

$$f(x) = \sum_{n=0}^{\infty} C_n T_n(x) \qquad (14)$$
One way to obtain a formula for the $C_n$ coefficients in this expansion is to take
the inner product of both sides with $T_m(x)$ and use the orthogonality of the T
functions:

$$C_m = \frac{\langle f, T_m\rangle}{\langle T_m, T_m\rangle} = \frac{2}{\pi}\int_{-1}^{1} \frac{f(x)\,T_m(x)}{\sqrt{1-x^2}}\,dx. \qquad (15)$$
However, there are better ways to compute these coefficients, as discussed below.
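Before moving on, here is a quick numerical sanity check of (15) itself (my sketch, not from the notes). Substituting $x = \cos\theta$ turns (15) into $C_m = \frac{2}{\pi}\int_0^\pi f(\cos\theta)\cos(m\theta)\,d\theta$, which a simple midpoint rule evaluates essentially exactly for polynomial f. For $f(x) = 4x^3 - 3x = T_3(x)$ the only nonzero coefficient should be $C_3 = 1$:

```python
import numpy as np

def cheb_coeff(f, m, M=2000):
    # Midpoint rule for C_m = (2/pi) * int_0^pi f(cos t) cos(m t) dt,
    # i.e. equation (15) after the substitution x = cos(theta)
    t = (np.arange(M) + 0.5) * np.pi / M
    return (2.0 / M) * np.sum(f(np.cos(t)) * np.cos(m * t))

f = lambda x: 4 * x**3 - 3 * x          # f = T_3
coeffs = [cheb_coeff(f, m) for m in range(5)]
# expect coefficients close to [0, 0, 0, 1, 0]
```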
If we restrict the sum in (14) to include only the terms through n = N, we obtain an
approximate representation of f(x), the Nth Chebyshev approximant:

$$f^{\rm approx}(x) = \sum_{n=0}^{N} C_n T_n(x) \qquad (16)$$
Chebyshev interpolation
The coefficients $C_n$ in formula (16) for the Chebyshev approximant may be
computed using the integral formula (15), but there are easier ways to get them.
These are based on the following alternative characterization of (16):
2. We could observe that the $C_n$ coefficients are the coefficients in the Fourier cosine series of the even 2π-periodic function g(θ) = f(cos θ). The samples of g(θ) at the evenly spaced points $\theta_n = n\pi/N$ are precisely the samples of f(x) at the Chebyshev points $\cos(n\pi/N)$, and the Fourier cosine-series coefficients may be computed by taking the discrete cosine transform of the set of numbers $\{f_n\}$:

$$\{f_n\} \xrightarrow{\rm DCT} \{C_n\}$$

where

$$f_n = f\left(\cos\frac{n\pi}{N}\right), \qquad n = 0, 1, \ldots, N.$$
Option 1 here is discussed in Trefethen, Spectral Methods in MATLAB,
Chapter 6 (see particularly Exercise 6.1).
Here we will focus on option 2. The numbers $C_n$ are just the Fourier cosine-series
coefficients of g(θ), i.e. the numbers we called $\tilde a_\nu$ in equation (3):

$$C_n = \frac{2}{\pi}\int_0^\pi f(\cos\theta)\cos(n\theta)\,d\theta.$$
where

$$f_n \equiv f\left(\cos\frac{n\pi}{N}\right).$$
If we write out equation (17) for all of the $C_n$ coefficients at once, we have an
(N + 1)-dimensional linear system relating the sets of numbers $\{f_n\}$ and $\{C_n\}$:

$$\frac{2}{N}\begin{pmatrix}
\frac12 & 1 & 1 & 1 & \cdots & \frac12 \\
\frac12 & \cos\frac{\pi}{N} & \cos\frac{2\pi}{N} & \cos\frac{3\pi}{N} & \cdots & \frac12\cos\pi \\
\frac12 & \cos\frac{2\pi}{N} & \cos\frac{4\pi}{N} & \cos\frac{6\pi}{N} & \cdots & \frac12\cos 2\pi \\
\frac12 & \cos\frac{3\pi}{N} & \cos\frac{6\pi}{N} & \cos\frac{9\pi}{N} & \cdots & \frac12\cos 3\pi \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
\frac12 & \cos\pi & \cos 2\pi & \cos 3\pi & \cdots & \frac12\cos N\pi
\end{pmatrix}
\begin{pmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \\ \vdots \\ f_N \end{pmatrix}
=
\begin{pmatrix} C_0 \\ C_1 \\ C_2 \\ C_3 \\ \vdots \\ C_N \end{pmatrix}
\qquad (18)$$

The matrix here is just the discrete cosine transform written out explicitly.
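A sketch of this transform in code (mine, not from the notes; it uses the trapezoid/DCT-I endpoint convention of the matrix above, under which the n = 0 and n = N coefficients enter the truncated expansion with weight 1/2):

```python
import numpy as np

def cheb_coeffs(f, N):
    # Samples of f at the Chebyshev points x_n = cos(pi n / N), n = 0..N
    theta = np.pi * np.arange(N + 1) / N
    fn = f(np.cos(theta))
    # Apply the matrix of equation (18): trapezoid-rule cosine sums,
    # with the endpoint samples given half weight
    w = np.ones(N + 1); w[0] = w[-1] = 0.5
    return (2.0 / N) * np.array([np.sum(w * fn * np.cos(n * theta))
                                 for n in range(N + 1)])

def cheb_eval(C, x):
    # Evaluate the truncated expansion via T_n(cos t) = cos(n t);
    # with this discrete convention, C_0 and C_N enter with weight 1/2
    N = len(C) - 1
    t = np.arccos(np.clip(x, -1.0, 1.0))
    w = np.ones(N + 1); w[0] = w[-1] = 0.5
    return sum(w[n] * C[n] * np.cos(n * t) for n in range(N + 1))

C = cheb_coeffs(np.exp, 20)
err = abs(cheb_eval(C, 0.3) - np.exp(0.3))   # tiny for smooth f
```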
Chebyshev integration
The Chebyshev spectral approach to integrating a function f(x) goes like this:

1. Construct the Nth Chebyshev approximant $f^{\rm approx}(x)$ to f(x) [equation (16)].

2. Integrate the approximant, and take the result as an approximation to the integral of f:

$$\int_{-1}^{1} f(x)\,dx \approx \int_{-1}^{1} f^{\rm approx}(x)\,dx = \sum_{m=0}^{N} C_m \int_{-1}^{1} T_m(x)\,dx. \qquad (19)$$
But the integrals of the Chebyshev polynomials can be evaluated in closed form,
with the result

$$\int_{-1}^{1} T_m(x)\,dx = \begin{cases} \dfrac{2}{1-m^2}, & m \text{ even} \\[4pt] 0, & m \text{ odd.} \end{cases} \qquad (20)$$
Thus equation (19) reads

$$\int_{-1}^{1} f(x)\,dx \approx \sum_{\substack{m=0 \\ m\ \rm even}}^{N} \frac{2C_m}{1-m^2}. \qquad (21)$$
Does this expression look familiar? It is exactly what we found in our discussion
of Clenshaw-Curtis quadrature, except there we interpreted the integral (20) in
the equivalent form

$$\int_{-1}^{1} T_m(x)\,dx = \int_0^\pi \cos(m\theta)\sin\theta\,d\theta.$$
This approximation may be written compactly as an inner product between a vector of quadrature weights and the vector C of Chebyshev coefficients:

$$\int_{-1}^{1} f(x)\,dx \approx W^{\mathsf T} C, \qquad W = \begin{pmatrix} 2 \\ 0 \\ \frac{2}{1-2^2} \\ 0 \\ \frac{2}{1-4^2} \\ \vdots \\ \frac{2}{1-N^2} \end{pmatrix}$$
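Combining the discrete transform (18) with the weights (21) gives a complete quadrature scheme. Here is a sketch (mine, not from the notes; with the discrete trapezoid/DCT-I coefficients, the m = 0 and m = N terms pick up an extra factor of 1/2):

```python
import numpy as np

def cheb_integrate(f, N):
    # Approximate int_{-1}^{1} f(x) dx via Chebyshev coefficients, eq. (21)
    theta = np.pi * np.arange(N + 1) / N
    fn = f(np.cos(theta))                    # samples at the Chebyshev points
    w = np.ones(N + 1); w[0] = w[-1] = 0.5   # trapezoid/DCT-I endpoint weights
    C = (2.0 / N) * np.array([np.sum(w * fn * np.cos(m * theta))
                              for m in range(N + 1)])
    # Sum over even m of 2 C_m / (1 - m^2); C_0 and C_N carry weight 1/2
    return sum(w[m] * C[m] * 2.0 / (1.0 - m**2) for m in range(0, N + 1, 2))

val = cheb_integrate(np.exp, 16)   # exact answer: e - 1/e
```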
Chebyshev differentiation
In the first unit of our course we saw how to use finite-difference techniques to
approximate derivative values from function values. For example, if $\mathbf f^{\rm even}$ is a
vector of function samples taken at evenly spaced points (with spacing h) in an interval [a, b], i.e. if

$$\mathbf f^{\rm even} = \begin{pmatrix} f(a) \\ f(a+h) \\ f(a+2h) \\ \vdots \\ f(b) \end{pmatrix}$$

then the vector of derivative values at the sample points may be represented in
the centered-finite-difference approximation as a matrix-vector product of the
form

$$\mathbf f'^{\,\rm even} = \mathbf D^{\rm CFD}\, \mathbf f^{\rm even}$$
where
$$\mathbf D^{\rm CFD} = \frac{1}{2h}\begin{pmatrix}
0 & 1 & 0 & 0 & \cdots & 0 & 0 \\
-1 & 0 & 1 & 0 & \cdots & 0 & 0 \\
0 & -1 & 0 & 1 & \cdots & 0 & 0 \\
\vdots & & & \ddots & & & \vdots \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 \\
0 & 0 & 0 & 0 & \cdots & -1 & 0
\end{pmatrix}.$$
As we saw in our discussion of finite-difference techniques, this approximation
will converge like $1/N^2$, i.e. the error between our approximate derivative and
the actual derivative will decay like $1/N^2$.
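A quick demonstration of this $1/N^2$ convergence (my sketch, not from the notes; the endpoint rows are skipped, since the centered stencil is only valid at interior points):

```python
import numpy as np

def cfd_error(N):
    # Centered finite differences for d/dx sin(x) on [0, pi]
    h = np.pi / N
    x = np.linspace(0.0, np.pi, N + 1)
    f = np.sin(x)
    df = (f[2:] - f[:-2]) / (2 * h)       # centered stencil, interior points only
    return np.max(np.abs(df - np.cos(x[1:-1])))

# Doubling N (halving h) should cut the error by about a factor of 4
ratio = cfd_error(20) / cfd_error(40)
```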
Now that we are equipped with Chebyshev spectral methods, we can write
a numerical differentiation stencil whose errors will decay exponentially in N.
Indeed, following the general spirit of Chebyshev spectral methods, all we have
to do is

1. Construct the Nth Chebyshev approximant $f^{\rm approx}(x)$ to f(x) [equation (16)].

2. Differentiate the approximant and take this as an approximation to the derivative.
The Nth Chebyshev approximant to f(x) is

$$f^{\rm approx}(x) = \sum_{m=0}^{N} C_m T_m(x).$$

Differentiating, we find

$$f'^{\,\rm approx}(x) = \sum_{m=0}^{N} C_m T'_m(x).$$
Second derivatives
What if we need to compute second derivatives? Easy! Just go like this:

$$\mathbf f''_{\rm cheb} = \mathbf D^{\rm cheb}\,\mathbf f'_{\rm cheb} = \mathbf D^{\rm cheb}\,\mathbf D^{\rm cheb}\,\mathbf f_{\rm cheb} = \left(\mathbf D^{\rm cheb}\right)^2 \mathbf f_{\rm cheb}.$$

This equation identifies the (N + 1) × (N + 1) matrix $(\mathbf D^{\rm cheb})^2$, i.e. just the square
of the matrix $\mathbf D^{\rm cheb}$, as the matrix that operates on a vector of f samples at
Chebyshev points to yield a vector of f'' samples at Chebyshev points.
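The construction of $\mathbf D^{\rm cheb}$ itself is not spelled out in this excerpt; the sketch below uses the standard formula from Trefethen's Spectral Methods in MATLAB (the cheb.m routine), translated to Python, and checks that both D and D² are exact on a low-degree polynomial:

```python
import numpy as np

def cheb(N):
    # Chebyshev differentiation matrix D and points x_n = cos(pi n / N),
    # following the standard construction in Trefethen's cheb.m
    if N == 0:
        return np.zeros((1, 1)), np.array([1.0])
    x = np.cos(np.pi * np.arange(N + 1) / N)
    c = np.hstack([2.0, np.ones(N - 1), 2.0]) * (-1.0) ** np.arange(N + 1)
    dX = x[:, None] - x[None, :]
    D = np.outer(c, 1.0 / c) / (dX + np.eye(N + 1))   # off-diagonal entries
    D -= np.diag(D.sum(axis=1))                        # diagonal via row sums
    return D, x

D, x = cheb(8)
# D is exact for polynomials of degree <= N: check on f(x) = x^3
assert np.allclose(D @ x**3, 3 * x**2)
# and D @ D gives second derivatives: (x^3)'' = 6x
assert np.allclose(D @ D @ x**3, 6 * x)
```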
In this equation, the two coefficients are fixed parameters, g(x) is a known forcing function,
the x variable ranges over the interval $[x_L, x_R]$, and the boundary values of f at
the left and right endpoints are $f_L$, $f_R$. (The subscripts L and R stand for
"left" and "right.")
Rescaling to [−1, 1]

Our boundary-value problem is defined on the interval $[x_L, x_R]$, but Chebyshev
spectral methods are nicest when we are working on the interval [−1, 1]. Thus,
before we do anything else, let's redefine our problem so that the independent
variable runs over [−1, 1]. That is, we will write x as a linear function of a new
variable τ (i.e. $x = A\tau + B$ for constants A, B to be determined) such that x
runs from $x_L$ to $x_R$ as τ runs from −1 to 1. As you can easily check, the unique
choice that works is

$$x(\tau) = W\tau + x_M, \qquad W \equiv \frac{x_R - x_L}{2}, \quad x_M \equiv \frac{x_R + x_L}{2}.$$

(Note that W is just half the width of the interval, while $x_M$ is the midpoint of
the interval. Here "W" stands for "width," while "M" stands for "midpoint.")

I will use the symbols F(τ) and G(τ) to denote new functions of τ obtained
by evaluating the old functions f(x) and g(x) at the point x = x(τ):

$$F(\tau) \equiv f\big(x(\tau)\big) = f(W\tau + x_M), \qquad G(\tau) \equiv g\big(x(\tau)\big) = g(W\tau + x_M).$$
Discretization

The next step is to discretize. Fix a value of N and consider the set of (N + 1)
Chebyshev points10 in the interval [−1, 1]:

$$\tau_n = \cos\frac{n\pi}{N}, \qquad n = 0, 1, \ldots, N \qquad \text{(a total of } N+1 \text{ sample points)} \qquad (27)$$
9 You can use dimensional analysis as a mnemonic device to help you remember where the
W factors go: think of x as a quantity with units of length (so W, the width of an interval
in x, has units of length too), while τ is dimensionless. We know that x derivatives like df/dx
have units of inverse length [and $d^2f/dx^2$ has units of (inverse length)²], but τ derivatives
like dF/dτ are dimensionless, so to recover a quantity like df/dx from a quantity like dF/dτ
we have to divide the latter by a quantity with units of length, i.e. by one factor of W.
Alternatively, you can think in terms of this symbolic identity:

$$\frac{d}{dx} = \frac{1}{W}\frac{d}{d\tau}.$$
10 As usual in Fourier and Chebyshev methods, there is some annoying confusion here over
precisely what N means, and failure to get this minor point straight can lead to annoying
off-by-one errors.
$$\mathbf F = \begin{pmatrix} F_0 \\ F_1 \\ F_2 \\ \vdots \\ F_{N-1} \\ F_N \end{pmatrix}, \quad
\mathbf F' = \begin{pmatrix} F'_0 \\ F'_1 \\ F'_2 \\ \vdots \\ F'_{N-1} \\ F'_N \end{pmatrix}, \quad
\mathbf F'' = \begin{pmatrix} F''_0 \\ F''_1 \\ F''_2 \\ \vdots \\ F''_{N-1} \\ F''_N \end{pmatrix}, \quad
\mathbf G = \begin{pmatrix} G_0 \\ G_1 \\ G_2 \\ \vdots \\ G_{N-1} \\ G_N \end{pmatrix} \qquad (28)$$

where

$$\mathbf F' = \mathbf D \mathbf F, \qquad \mathbf F'' = \mathbf D^2 \mathbf F$$
where M is just a convenient name that we have assigned to the (N + 1) × (N + 1)
matrix in parentheses.

The first and last entries of $\mathbf F$ are known, not unknown, quantities: they are simply11 given by the boundary conditions, i.e.
we have

$$F_0 = f_R, \qquad F_N = f_L. \qquad (30)$$
This means that equation (29), which consists of N + 1 simultaneous linear
equations, actually gives us more equations than we need; we want to eliminate
the first and last of those equations and solve a reduced (N − 1)-dimensional
system for just the (N − 1) unknown quantities $F_1, \ldots, F_{N-1}$.
To separate out what is known from what is unknown on the LHS of equation
(29), let's write the N + 1 equations implicit in that statement in a {1, (N − 1), 1}
block form:

$$\begin{pmatrix} M_{00} & \mathbf v_1^{\mathsf T} & M_{0N} \\ \mathbf v_2 & \mathbf M^{\rm int} & \mathbf v_3 \\ M_{N0} & \mathbf v_4^{\mathsf T} & M_{NN} \end{pmatrix}
\begin{pmatrix} F_0 \\ \mathbf F^{\rm int} \\ F_N \end{pmatrix} = \begin{pmatrix} G_0 \\ \mathbf G^{\rm int} \\ G_N \end{pmatrix}. \qquad (31)$$
In this equation, $\mathbf F^{\rm int}$ and $\mathbf G^{\rm int}$ are the "interior" portions of the $\mathbf F$ and $\mathbf G$
vectors, containing just the values of F and G at the N − 1 interior Chebyshev
points:

$$\mathbf F^{\rm int} = \begin{pmatrix} F_1 \\ F_2 \\ \vdots \\ F_{N-2} \\ F_{N-1} \end{pmatrix}, \qquad
\mathbf G^{\rm int} = \begin{pmatrix} G_1 \\ G_2 \\ \vdots \\ G_{N-2} \\ G_{N-1} \end{pmatrix}$$
Also, in equation (31), $\mathbf v_{1,2,3,4}$ are (N − 1)-dimensional vectors obtained by
slicing out chunks of the original matrix M, and $\mathbf M^{\rm int}$ is the (N − 1) × (N − 1)
interior chunk of M. In a high-level language like julia these may be extracted
from M using the following commands:
11 Careful! In Chebyshev spectral methods, the angle $\theta = n\pi/N$ [the argument of the cosine
in equation (27)] runs from θ = 0 to θ = π as the index n runs from 0 to N. This means that
the τ variable winds up running backwards from 1 to −1 as n runs from 0 to N, i.e.

n = 0 corresponds to τ = +1, n = N corresponds to τ = −1.

Looking at equation (29), this yields the at-first-surprising conclusion that the boundary value
at the right endpoint, $f_R$, wants to go in the first slot of the vector $\mathbf F$, while $f_L$ wants to go
in the last slot of the vector, as in (30).
v1   = M[ 1, 2:end-1 ];
v2   = M[ 2:end-1, 1 ];
Mint = M[ 2:end-1, 2:end-1 ];
v3   = M[ 2:end-1, end ];
v4   = M[ end, 2:end-1 ];
The portion of (31) that we now want to solve is the interior portion, i.e.
the innermost (N − 1) × (N − 1) chunk of the system, which reads

$$\mathbf v_2 F_0 + \mathbf M^{\rm int}\,\mathbf F^{\rm int} + \mathbf v_3 F_N = \mathbf G^{\rm int}$$

or, swinging all known quantities over to the RHS so that we have a linear system
relating unknowns to knowns,

$$\mathbf M^{\rm int}\,\mathbf F^{\rm int} = \mathbf G^{\rm int} - \big(F_0 \mathbf v_2 + F_N \mathbf v_3\big). \qquad (32)$$
This equation gives us only the innermost (N − 1) entries in our solution vector
$\mathbf F$; the outer 2 entries are obtained by just plugging in the given boundary
conditions.
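This excerpt does not include the original ODE, so as an illustration (entirely my sketch) the code below applies the interior-block procedure to the model problem $f''(x) = g(x)$ on [−1, 1] with Dirichlet boundary data, taking M = D² with the standard Trefethen-style differentiation matrix; choosing $g(x) = e^x$ makes the exact solution $f(x) = e^x$, which serves as a check:

```python
import numpy as np

def cheb(N):
    # Standard Chebyshev differentiation matrix (Trefethen, cheb.m)
    x = np.cos(np.pi * np.arange(N + 1) / N)
    c = np.hstack([2.0, np.ones(N - 1), 2.0]) * (-1.0) ** np.arange(N + 1)
    dX = x[:, None] - x[None, :]
    D = np.outer(c, 1.0 / c) / (dX + np.eye(N + 1))
    D -= np.diag(D.sum(axis=1))
    return D, x

N = 16
D, x = cheb(N)            # note: x runs backwards from +1 (n=0) to -1 (n=N)
M = D @ D                 # here the operator is just d^2/dx^2
G = np.exp(x)             # forcing g(x) = e^x, so the exact solution is f = e^x
fR, fL = np.exp(1.0), np.exp(-1.0)   # boundary values: F_0 = f_R, F_N = f_L

# Slice out the interior blocks, as in equation (31)
Mint = M[1:-1, 1:-1]
v2, v3 = M[1:-1, 0], M[1:-1, -1]

# Solve the interior system (32) and reattach the boundary values
Fint = np.linalg.solve(Mint, G[1:-1] - (fR * v2 + fL * v3))
F = np.hstack([fR, Fint, fL])
```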
If this procedure seems complicated, it's actually nothing more than what
we did earlier in our treatment of finite-difference solutions to boundary-value
problems. For example, in the section titled "Finite-differencing as matrix-vector
multiplication" in the Numerical Differentiation lecture notes, the RHS
of the boundary-value problem involved a vector, depending on the boundary values,
that is equivalent to the vector $F_0\mathbf v_2 + F_N\mathbf v_3$ appearing
on the RHS of (32). The only difference is that in the finite-difference
case this vector is sparse (almost all of its entries are zero), whereas here the
vector is dense.
This reflects the fact that finite-differencing is essentially a local procedure,
which estimates derivatives from function samples only at immediately adjacent
points; in contrast, Chebyshev differentiation is inherently global, with
each sample of the derivative needing information about the entire set of
function samples. This non-locality makes Chebyshev methods more costly for a
given number of samples, but it is also responsible for their dramatically accelerated
convergence properties.