
Physics 129b

Integral Equations
051012 F. Porter
Revision 150928 F. Porter

Introduction

The integral equation problem is to find the solution to:

    h(x)f(x) = g(x) + λ ∫_a^b k(x,y) f(y) dy.    (1)

We are given functions h(x), g(x), k(x,y), and wish to determine f(x). The
quantity λ is a parameter, which may be complex in general. The bivariate
function k(x,y) is called the kernel of the integral equation.
We shall assume that h(x) and g(x) are defined and continuous on the
interval a ≤ x ≤ b, and that the kernel is defined and continuous on a ≤ x ≤ b
and a ≤ y ≤ b. Here we will concentrate on the problem for real variables
x and y. The functions may be complex-valued, although we will sometimes
simplify the discussion by considering real functions. However, many of the
results can be generalized in fairly obvious ways, such as relaxation to
piecewise continuous functions, and generalization to multiple dimensions.
There are many resources for further reading on this subject. Some of
the popular ones among physicists include the classic texts by Mathews
and Walker, Courant and Hilbert, Whittaker and Watson, and Margenau
and Murphy, as well as the newer texts by Arfken, and Riley, Hobson, and
Bence.

Integral Transforms

If h(x) = 0, we can take λ = −1 without loss of generality and obtain the
integral equation:

    g(x) = ∫_a^b k(x,y) f(y) dy.    (2)

This is called a Fredholm equation of the first kind or an integral


transform. Particularly important examples of integral transforms include
the Fourier transform and the Laplace transform, which we now discuss.

2.1

Fourier Transforms

A special case of a Fredholm equation of the first kind is

    a = −∞,    (3)
    b = +∞,    (4)
    k(x,y) = e^{ixy}/√(2π).    (5)

This is known as the Fourier transform:

    g(x) = (1/√(2π)) ∫_{−∞}^{∞} e^{ixy} f(y) dy.    (6)

Note that the kernel is complex in this case.


The solution to this equation is given by:

    f(y) = (1/√(2π)) ∫_{−∞}^{∞} e^{−ixy} g(x) dx.    (7)

We'll forgo rigor here and give the physicist's demonstration of this:

    g(x) = (1/√(2π)) ∫ e^{ixy} dy (1/√(2π)) ∫ e^{−ix′y} g(x′) dx′    (8)
         = (1/2π) ∫ g(x′) dx′ ∫ e^{i(x−x′)y} dy    (9)
         = ∫ g(x′) δ(x − x′) dx′    (10)
         = g(x).    (11)

Here, we have used the fact that the Dirac delta-function may be written

    δ(x) = (1/2π) ∫_{−∞}^{∞} e^{ixy} dy.    (12)

The reader is encouraged to demonstrate this, if s/he has not done so before.
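The transform pair of Eqs. 6 and 7 is easy to check numerically. The sketch below (the integration limits and grid are ad hoc choices, not from the text) applies Eq. 6 to the Gaussian e^{−y²/2}, which is its own Fourier transform:

```python
import math, cmath

def fourier(f, x, a=-40.0, b=40.0, n=4000):
    """Trapezoidal estimate of g(x) = (1/sqrt(2*pi)) * int_a^b exp(i*x*y) f(y) dy."""
    h = (b - a) / n
    s = 0.5 * (cmath.exp(1j * x * a) * f(a) + cmath.exp(1j * x * b) * f(b))
    for k in range(1, n):
        y = a + k * h
        s += cmath.exp(1j * x * y) * f(y)
    return s * h / math.sqrt(2 * math.pi)

gauss = lambda y: math.exp(-y * y / 2)  # its own Fourier transform
for x in (0.0, 0.5, 1.0, 2.0):
    assert abs(fourier(gauss, x) - math.exp(-x * x / 2)) < 1e-6
```

The same quadrature applied to Eq. 7 (with e^{−ixy}) recovers f, illustrating the pair numerically.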
It is instructive to notice that the Fourier transform may be regarded as
a limit of the Fourier series. Let f(x) be expanded in a Fourier series in a
box of size [−L/2, L/2]:

    f(x) = Σ_{n=−∞}^{∞} a_n e^{2πinx/L}.    (13)

We have chosen periodic boundary conditions here: f(−L/2) = f(L/2).


The a_n expansion coefficients may be determined for any given f(x) using
the orthogonality relations:

    (1/L) ∫_{−L/2}^{L/2} e^{2πinx/L} e^{−2πimx/L} dx = δ_mn.    (14)

Hence,

    a_n = (1/L) ∫_{−L/2}^{L/2} f(x) e^{−2πinx/L} dx.    (15)
Now consider taking the limit as L → ∞. In this limit, the summation
goes over to a continuous integral. Let y = 2πn/L and g(y) = L a_n/√(2π).
Then, using dn = (L/2π) dy,

    f(x) = lim_{L→∞} Σ_{n=−∞}^{∞} a_n e^{2πinx/L}    (16)
         = lim_{L→∞} Σ_{n=−∞}^{∞} (√(2π)/L) g(y) e^{ixy}    (17)
         = (1/√(2π)) ∫_{−∞}^{∞} e^{ixy} g(y) dy.    (18)

Furthermore:

    g(y) = L a_n/√(2π) = (1/√(2π)) ∫_{−∞}^{∞} f(x) e^{−ixy} dx.    (19)

We thus verify our earlier statements, including the δ-function equivalence,
assuming our limit procedure is acceptable.
Suppose now that f(y) is an even function, f(−y) = f(y). Then,

    g(x) = (1/√(2π)) [∫_{−∞}^0 e^{ixy} f(y) dy + ∫_0^{∞} e^{ixy} f(y) dy]    (20)
         = (1/√(2π)) ∫_0^{∞} [e^{ixy} + e^{−ixy}] f(y) dy    (21)
         = √(2/π) ∫_0^{∞} f(y) cos xy dy.    (22)

This is known as the Fourier cosine transform. It may be observed that
the transform g(x) will also be an even function, and the solution for f(y) is:

    f(y) = √(2/π) ∫_0^{∞} g(x) cos xy dx.    (23)

Similarly, if f(y) is an odd function, we have the Fourier sine transform:

    g(x) = √(2/π) ∫_0^{∞} f(y) sin xy dy,    (24)

where a factor of i has been absorbed. The solution for f(y) is

    f(y) = √(2/π) ∫_0^{∞} g(x) sin xy dx.    (25)
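As a numerical spot-check of Eq. 22: for f(y) = e^{−y} the cosine transform is known in closed form, g(x) = √(2/π)/(1 + x²). A sketch (truncation point and step count are arbitrary choices):

```python
import math

def cosine_transform(f, x, b=60.0, n=20000):
    """Trapezoidal estimate of sqrt(2/pi) * int_0^b f(y) cos(x*y) dy."""
    h = b / n
    s = 0.5 * (f(0.0) + f(b) * math.cos(x * b))
    for k in range(1, n):
        y = k * h
        s += f(y) * math.cos(x * y)
    return math.sqrt(2 / math.pi) * s * h

f = lambda y: math.exp(-y)
for x in (0.0, 1.0, 3.0):
    expected = math.sqrt(2 / math.pi) / (1 + x * x)
    assert abs(cosine_transform(f, x) - expected) < 1e-4
```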

Let us briefly make some observations concerning an approach to a more
rigorous discussion. Later we shall see that if the kernel k(x,y) satisfies
conditions such as square-integrability on [a,b], then convenient behavior is
achieved for the solutions of the integral equation. However, in the present
case, we not only have |a|, |b| → ∞, but the kernel e^{ixy} nowhere approaches
zero. Thus, great care is required to ensure valid results.
We may deal with this difficult situation by starting with a set of functions
which are themselves sufficiently well-behaved (e.g., approach zero rapidly
as |x| → ∞) that the behavior of the kernel is mitigated. For example, in
quantum mechanics we may construct our Hilbert space of acceptable wave
functions on R^3 by starting with a set S of functions f(x) where:

1. f(x) ∈ C^∞, that is, f(x) is an infinitely differentiable complex-valued
   function on R^3.

2. lim_{|x|→∞} |x|^n d(x) = 0, ∀n, where d(x) is any partial derivative of f.
   That is, f and its derivatives fall off faster than any power of |x|.
We could approach the proof of the Fourier inverse theorem with more
rigor than our limit of a series as follows: First, consider that subset of S
consisting of Gaussian functions. Argue that any function in S may be
approximated arbitrarily closely by a series of Gaussians. Then note that the S
functions form a pre-Hilbert space (also known as a Euclidean space). Add
the completion to get a Hilbert space, and show that the theorem remains
valid.
The Fourier transform appears in many physical situations via its connection
with waves, for example:

    Re(e^{ixy}) = cos xy.    (26)

In electronics we use the Fourier transform to translate time-domain
problems into frequency-domain problems, with xy → ωt. An LCR
circuit is just a complex impedance for a given frequency, hence the
integral-differential time-domain problem is translated into an algebraic
problem in the frequency domain. In quantum mechanics the position-space
wave functions are related to momentum-space wave functions via the
Fourier transform.
2.1.1

Example: RC circuit

Suppose we wish to determine the output voltage Vo(t) in the simple circuit
of Fig. 1. The time domain problem requires solving the equation:

    Vo(t) = (1/(R1 C)) ∫^t Vi(t′) dt′ − (1/C)(1/R1 + 1/R2) ∫^t Vo(t′) dt′.    (27)

This is an integral equation, which we will encounter in Section 5.2 as a
Volterra's equation of the second kind.

Figure 1: A simple RC circuit problem.


If Vi(t) is a sinusoidal waveform of a fixed frequency (ω), the circuit elements
may be replaced by complex impedances:

    R1 → Z1 = R1    (28)
    R2 → Z2 = R2    (29)
    C → ZC = 1/(iωC).    (30)

Then it is a simple matter to solve for Vo(t):

    Vo(t) = Vi(t) · 1/[1 + (R1/R2)(1 + iωR2C)],    (31)

if Vi(t) = sin(ωt + φ), and where it is understood that the real part is to be
taken.
Students usually learn how to obtain the result in Eqn. 31 long before
they know about the Fourier transform. However, it is really the result in
the frequency domain according to the Fourier transform. That is:

    V̂o(ω) = (1/√(2π)) ∫_{−∞}^{∞} Vo(t) e^{−iωt} dt    (32)
          = V̂i(ω) · 1/[1 + (R1/R2)(1 + iωR2C)].    (33)

We are here using the hat (ˆ) notation to indicate the integral transform of
the unhatted function. The answer to the problem for general (not necessarily
sinusoidal) input Vi(t) is then:

    Vo(t) = (1/√(2π)) ∫_{−∞}^{∞} V̂o(ω) e^{iωt} dω    (34)
          = (1/√(2π)) ∫_{−∞}^{∞} V̂i(ω) e^{iωt} / [1 + (R1/R2)(1 + iωR2C)] dω.    (35)

2.2

Laplace Transforms

The Laplace transform is an integral transform of the form:

    F(s) = ∫_0^{∞} f(x) e^{−sx} dx.    (36)

The solution for f(x) is:

    f(x) = (1/2πi) ∫_{c−i∞}^{c+i∞} F(s) e^{sx} ds,    (37)

where x > 0.
This transform can be useful for some functions where the Fourier transform
does not exist. Problems at x → +∞ are removed by multiplying by
e^{−cx}, where c is a positive real number. Then the problem at −∞ is
repaired by multiplying by the unit step function θ(x):

    θ(x) = { 1    if x > 0,
             1/2  if x = 0,
             0    if x < 0.    (38)

Thus, we have

    g(y) = ∫_{−∞}^{∞} f(x) θ(x) e^{−cx} e^{−ixy} dx    (39)
         = ∫_0^{∞} f(x) e^{−cx} e^{−ixy} dx,    (40)

where we have by convention also absorbed the 1/√(2π).


The inverse Fourier transform is just:

    e^{−cx} θ(x) f(x) = (1/2π) ∫_{−∞}^{∞} g(y) e^{ixy} dy.    (41)

If we let s = c + iy and define F(s) ≡ g(y) at s = c + iy, then

    F(s) = ∫_0^{∞} f(x) e^{−sx} dx,    (42)

and

    f(x)θ(x) = (1/2π) ∫_{−∞}^{∞} F(s) e^{x(c+iy)} dy    (43)
             = (1/2πi) ∫_{c−i∞}^{c+i∞} F(s) e^{xs} ds,    (44)

which is the above-asserted result.
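The forward transform (42) is easy to check numerically. A small sketch (the test function e^{−2x} and the quadrature settings are illustrative choices, not from the text) compares a trapezoidal estimate of F(s) against the known result L[e^{−2x}](s) = 1/(s + 2):

```python
import math

def laplace(f, s, b=50.0, n=5000):
    """Trapezoidal estimate of F(s) = int_0^b f(x) exp(-s*x) dx."""
    h = b / n
    total = 0.5 * (f(0.0) + f(b) * math.exp(-s * b))
    for k in range(1, n):
        x = k * h
        total += f(x) * math.exp(-s * x)
    return total * h

f = lambda x: math.exp(-2 * x)
for s in (0.5, 1.0, 3.0):
    assert abs(laplace(f, s) - 1 / (s + 2)) < 1e-4
```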


We group together here some useful theorems for Fourier and Laplace
transforms. First define some notation. Let

    (Ff)(y) = g(y) = (1/√(2π)) ∫_{−∞}^{∞} f(x) e^{−ixy} dx    (45)

be the Fourier transform of f, and

    (Lf)(s) = F(s) = ∫_0^{∞} f(x) e^{−sx} dx    (46)

be the Laplace transform of f. Finally, let T ("transform") stand for either
F or L.
The reader should immediately verify the following properties:

1. Linearity: For functions f and g and complex numbers α, β,

       T(αf + βg) = α(Tf) + β(Tg).    (47)

2. Transform of derivatives: Integrate by parts to show that

       (Ff′)(y) = iy(Ff)(y),    (48)

   assuming f(x) → 0 as x → ±∞, and

       (Lf′)(s) = s(Lf)(s) − f(0),    (49)

   assuming f(x)e^{−sx} → 0 as x → ∞ and defining f(0) ≡ lim_{x→0+} f(x).
   The procedure here may be iterated to obtain expressions for higher
   derivatives.
3. Transform of integrals:

       (F[∫^x f(t)dt])(y) = (1/iy)(Ff)(y) + Cδ(y),    (50)

   where C is an arbitrary constant arising from the arbitrary constant of
   integration of an indefinite integral;

       (L[∫_0^x f(t)dt])(s) = ∫_0^{∞} dx e^{−sx} ∫_0^x dt f(t)
                            = ∫_0^{∞} dt f(t) ∫_t^{∞} dx e^{−sx}
                            = (1/s)(Lf)(s).    (51)

4. Translation:

       [Ff(x + a)](y) = e^{iay}[Ff](y),    (52)
       [Lf(x + a)](s) = e^{as} [(Lf)(s) − ∫_0^a f(x) e^{−sx} dx].    (53)

5. Multiplication by an exponential:

       {F[e^{ax} f(x)]}(y) = (Ff)(y + ia),    (54)
       {L[e^{ax} f(x)]}(s) = (Lf)(s − a).    (55)

6. Multiplication by x:

       {F[xf(x)]}(y) = i (d/dy)(Ff)(y),    (56)
       {L[xf(x)]}(s) = −(d/ds)(Lf)(s).    (57)
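Property 2 for the Laplace transform can be spot-checked numerically, using f(x) = cos x (so f′(x) = −sin x and f(0) = 1); the quadrature settings below are arbitrary:

```python
import math

def laplace(f, s, b=40.0, n=4000):
    """Trapezoidal estimate of (Lf)(s) = int_0^b f(x) exp(-s*x) dx."""
    h = b / n
    tot = 0.5 * (f(0.0) + f(b) * math.exp(-s * b))
    for k in range(1, n):
        x = k * h
        tot += f(x) * math.exp(-s * x)
    return tot * h

# Check (Lf')(s) = s(Lf)(s) - f(0) with f = cos, f' = -sin, f(0) = 1.
for s in (1.0, 2.0):
    lhs = laplace(lambda x: -math.sin(x), s)
    rhs = s * laplace(math.cos, s) - 1.0
    assert abs(lhs - rhs) < 1e-3
    assert abs(lhs + 1.0 / (s * s + 1.0)) < 1e-3  # analytic value -1/(s^2+1)
```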

An important notion that arises in applications of integral transforms is
that of a convolution:

Definition (Convolution): Given two functions f1(x) and f2(x), and
constants a, b, the convolution of f1 and f2 is:

    g(x) = ∫_a^b f1(y) f2(x − y) dy.    (58)

In the case of Fourier transforms, we are interested in a = −∞, b = ∞. For
Laplace transforms, a = 0, b = ∞. We then have the celebrated convolution
theorem:

Theorem:

    (Fg)(y) = √(2π) (Ff1)(y) (Ff2)(y),    (59)
    (Lg)(s) = (Lf1)(s) (Lf2)(s).    (60)

The proof is left as an exercise.
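The Laplace half of the theorem can be spot-checked numerically. For f1(t) = e^{−t} and f2(t) = e^{−2t} the convolution works out analytically to g(t) = e^{−t} − e^{−2t}; the sketch below (quadrature settings are illustrative) compares (Lg)(s) with (Lf1)(s)(Lf2)(s):

```python
import math

def laplace(f, s, b=60.0, n=6000):
    """Trapezoidal estimate of (Lf)(s) = int_0^b f(x) exp(-s*x) dx."""
    h = b / n
    tot = 0.5 * (f(0.0) + f(b) * math.exp(-s * b))
    for k in range(1, n):
        x = k * h
        tot += f(x) * math.exp(-s * x)
    return tot * h

f1 = lambda t: math.exp(-t)
f2 = lambda t: math.exp(-2 * t)
g = lambda t: math.exp(-t) - math.exp(-2 * t)  # convolution of f1 and f2, done by hand

for s in (0.5, 1.0, 2.0):
    assert abs(laplace(g, s) - laplace(f1, s) * laplace(f2, s)) < 1e-4
```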


2.2.1

Laplace transform example: RC circuit

Let us return to the problem of determining the output voltage Vo(t) in
the simple circuit of Fig. 1. But now, suppose that we know that Vi(t) = 0
for t < 0. In this case, the Laplace transform is an appropriate method to
try.
There are, of course, many equivalent solution paths; let us think in terms
of the currents: i(t) is the total current (through R1), iC(t) is the current
through the capacitor, and iR(t) is the current through R2. We know that
i(t) = iC(t) + iR(t), and

    Vo(t) = Vi(t) − i(t)R1    (61)
          = [i(t) − iC(t)] R2    (62)
          = Q/C = (1/C) ∫_0^t iC(t′) dt′.    (63)

This gives us three equations relating the unknowns Vo(t), i(t), and iC(t),
which we could try to solve to obtain Vo(t). However, the integral in the last
equation complicates the solution. This is where the Laplace transform will
help us.

Corresponding to the first of the three equations we obtain (where the
hat now indicates the Laplace transform):

    V̂o(s) ≡ ∫_0^{∞} Vo(t) e^{−st} dt = V̂i(s) − î(s)R1.    (64)
Corresponding to the second, we have:

    V̂o(s) = [î(s) − îC(s)] R2.    (65)

For the third, we have:

    V̂o(s) = (1/C) ∫_0^{∞} e^{−st} ∫_0^t iC(t′) dt′ dt    (66)
          = (1/C) ∫_0^{∞} iC(t′) dt′ ∫_{t′}^{∞} e^{−st} dt
          = (1/sC) îC(s).

Now we have three simultaneous algebraic equations, which may be readily
solved for V̂o(s):

    V̂o(s) = V̂i(s) · 1/[1 + (R1/R2)(1 + sR2C)].    (67)

We note the similarity with Eqn. 33. Going back to the time domain, we
find:

    Vo(t) = (1/2πi) ∫_{a−i∞}^{a+i∞} V̂i(s) e^{st} / [1 + (R1/R2)(1 + sR2C)] ds.    (68)
For example, let's suppose that Vi(t) is a brief pulse at t = 0, of height V
and duration Δt, with area V Δt = A. Let's model this as:

    Vi(t) = Aδ(t − ε),    (69)

where ε is a small positive number, inserted to make sure we don't get into
trouble with the t = 0 boundary in the Laplace transform. Then:

    V̂i(s) = ∫_0^{∞} Aδ(t − ε) e^{−st} dt = Ae^{−sε}.    (70)

Inserting into Eqn. 68, we have

    Vo(t) = (1/2πi) (R2/(R1 + R2)) A ∫_{a−i∞}^{a+i∞} e^{s(t−ε)}/(1 + sτ) ds,    (71)

where

    τ ≡ (R1 R2/(R1 + R2)) C.    (72)

The integrand has a pole at s = −1/τ. We thus choose the contour of
integration as in Fig. 2. A contour of this form is known as a Bromwich

Figure 2: The Bromwich contour, for the RC circuit problem.


contour. In the limit R → ∞ the integral around the semicircle is zero. We
thus have:

    Vo(t) = (1/2πi) (R2/(R1 + R2)) A · 2πi × (residue at s = −1/τ)
          = A (R2/(R1 + R2)) lim_{s→−1/τ} (s + 1/τ) e^{s(t−ε)}/(1 + sτ)    (73)
          = A (1/(R1 C)) e^{−t/τ},

taking ε → 0 in the last step.

In this simple problem, we could have guessed this result: At t = 0,
we instantaneously put a voltage A/R1C on the capacitor. The time τ is
simply the time constant for the capacitor to discharge through the parallel
(R1, R2) combination. However, we may also treat more difficult problems
with this technique. The integral-differential equation in the t-domain
becomes a problem of finding the zeros of a polynomial in the s-domain, at
which the residues are evaluated. This translated problem lends itself to
numerical solution.
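The residue result Eq. 73 can be checked against a direct time-domain integration of the circuit. Differentiating Eq. 27 gives dVo/dt = Vi(t)/(R1C) − Vo/τ, which the sketch below steps with explicit Euler for a narrow pulse of area A (the component values are hypothetical, chosen only for the test):

```python
import math

# Hypothetical component values (assumptions, not from the text).
R1, R2, C, A = 1.0e3, 2.0e3, 1.0e-6, 1.0e-3  # ohms, ohms, farads, volt-seconds
tau = R1 * R2 * C / (R1 + R2)

# Represent Vi(t) = A*delta(t) by a tall, narrow rectangle of area A.
w = tau / 100.0
Vi = lambda t: A / w if 0.0 <= t < w else 0.0

# Differentiating Eq. 27: dVo/dt = Vi(t)/(R1*C) - Vo/tau; step it with Euler.
dt = tau / 20000.0
Vo, t = 0.0, 0.0
while t < 3 * tau:
    Vo += dt * (Vi(t) / (R1 * C) - Vo / tau)
    t += dt

expected = (A / (R1 * C)) * math.exp(-3.0)  # Eq. 73 evaluated at t = 3*tau
assert abs(Vo - expected) / expected < 0.02
```

The 2% tolerance covers the finite pulse width and the Euler discretization error.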

Laplace's Method for Ordinary Differential Equations

There are many other integral transforms that we could investigate, but the
Fourier and Laplace transforms are the most ubiquitous in physics
applications. Rather than pursue other transforms, we'll look at another example
that suggests the breadth of application of these ideas. This is Laplace's
Method for the solution of ordinary differential equations. This method
represents a sort of generalization of the Laplace transform, using the feature
of turning derivatives into powers.
Suppose we wish to solve the differential equation:

    Σ_{k=0}^{n} (a_k + b_k x) f^{(k)}(x) = 0,    (74)

where we use the notation

    f^{(k)}(x) ≡ (d^k f/dx^k)(x).    (75)

In the spirit of the Laplace transform, assume a solution in the integral
form

    f(x) = ∫_C F(s) e^{sx} ds,    (76)

where the contour will be chosen to suit the problem (not necessarily the
contour of the Laplace transform). We'll insert this proposed solution into
the differential equation. Notice that

    f′(x) = ∫_C F(s) s e^{sx} ds.    (77)

Thus,

    0 = ∫_C [U(s) + xV(s)] F(s) e^{sx} ds,    (78)

where

    U(s) = Σ_{k=0}^{n} a_k s^k,    (79)
    V(s) = Σ_{k=0}^{n} b_k s^k.    (80)

We may eliminate the x in the xV term of Eqn. 78 with an integration
by parts:

    ∫_C V(s)F(s) x e^{sx} ds = [V(s)F(s)e^{sx}]_{c1}^{c2}
                               − ∫_C (d/ds)[V(s)F(s)] e^{sx} ds,    (81)

where c1 and c2 are the endpoints of C. Hence

    0 = ∫_C {U(s)F(s) − (d/ds)[V(s)F(s)]} e^{sx} ds + [V(s)F(s)e^{sx}]_{c1}^{c2}.    (82)

We assume that we can choose C such that the integrated part vanishes.
Then we will have a solution to the differential equation if

    U(s)F(s) − (d/ds)[V(s)F(s)] = 0.    (83)

Note that we have transformed a problem with high-order derivatives (but
only first-order polynomial coefficients) to a problem with first-order
derivatives only, but with high-order polynomial coefficients.
Formally, we find a solution as:

    (d/ds)[V(s)F(s)] = U(s)F(s)    (84)
    V(s)(dF/ds)(s) = U(s)F(s) − F(s)(dV/ds)(s)    (85)
    (d ln F/ds) = U/V − (d ln V/ds)    (86)
    ln F = ∫^s [U/V − (d ln V/ds′)] ds′    (87)
         = ∫^s (U/V) ds′ − ln V + ln A,    (88)

where A is an arbitrary constant. Thus, the solution for F(s) is:

    F(s) = (A/V(s)) exp[∫^s (U(s′)/V(s′)) ds′].    (89)

3.1

Example of Laplace's method: Hermite Equation

The simple harmonic oscillator potential in the Schrödinger equation leads
to the Hermite differential equation:

    f″(x) − 2x f′(x) + 2λ f(x) = 0,    (90)

where λ is a constant. This is an equation that may lend itself to treatment
with Laplace's method; let's try it.
First, we determine U(s) and V(s):

    U(s) = a0 + a2 s² = 2λ + s²    (91)
    V(s) = b1 s = −2s.    (92)

Substituting these into the general formula Eqn. 89, we have

    F(s) ∝ (1/(−2s)) exp[−∫^s (s′² + 2λ)/(2s′) ds′]    (93)
         ∝ (1/s) exp(−s²/4 − λ ln s)    (94)
         ∝ e^{−s²/4}/s^{λ+1}.    (95)

Figure 3: A possible contour for the Hermite equation with non-integral
constant λ. The branch cut is along the negative real axis.
To find f(x) we substitute this result into Eqn. 76:

    f(x) = ∫_C F(s) e^{sx} ds    (96)
         = A ∫_C e^{−s²+2sx}/s^{λ+1} ds,    (97)

where A is an arbitrary constant, and where we have let s → 2s according
to convention for this problem.
Now we are faced with the question of the choice of contour C. We at
least must require that the integrated part in Eqn. 82 vanish:

    [V(s)F(s)e^{sx}]_{c1}^{c2} ∝ [e^{−s²+2sx}/s^λ]_{c1}^{c2} = 0.    (98)

We'll need to avoid s = 0 on our contour. If λ = n is a non-negative integer,
we can take a circle around the origin, since the integrand is then analytic
and single-valued everywhere except at s = 0. If λ ≠ n, then s = 0 is a
branch point, and we cannot choose C to circle the origin. We could in this
case take C to be the contour of Fig. 3.
Let's consider further the case with λ = n = 0, 1, 2, . . .. Take C to be a
circle around the origin. Pick by convention A = n!/2πi, and define:
    Hn(x) ≡ (n!/2πi) ∮_C e^{−s²+2sx}/s^{n+1} ds.    (99)

This is a powerful integral form for the Hermite polynomials (or, Hermite
functions in general with the branch cut contour of Fig. 3). For example,

    Hn(x) = (n!/2πi) · 2πi × (residue at s = 0 of e^{−s²+2sx}/s^{n+1}).    (100)

Recall that the residue is the coefficient of the 1/s term in the Laurent series
expansion. Hence,

    Hn(x) = n! × (coefficient of s^n in e^{−s²+2sx}).    (101)

That is,

    e^{−s²+2sx} = Σ_{n=0}^{∞} (Hn(x)/n!) s^n.    (102)

This is the generating function for the Hermite polynomials.


The term "generating function" is appropriate, since we have:

    Hn(x) = lim_{s→0} (d^n/ds^n) e^{−s²+2sx}    (103)
    H0(x) = lim_{s→0} e^{−s²+2sx} = 1    (104)
    H1(x) = lim_{s→0} (−2s + 2x) e^{−s²+2sx} = 2x,    (105)

and so forth.
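Eq. 101 gives a direct recipe for computing the polynomials: expanding e^{−s²+2sx} = e^{−s²} e^{2sx} as a product of series, the coefficient of s^n is Σ_m (−1)^m (2x)^{n−2m}/(m!(n−2m)!). A sketch of this, compared against H0 = 1 and H1 = 2x from the text and the standard H2 = 4x² − 2:

```python
import math

def hermite(n, x):
    """H_n(x) = n! * [coefficient of s^n in exp(-s^2 + 2*s*x)], Eq. 101."""
    total = 0.0
    for m in range(n // 2 + 1):
        total += (-1) ** m * (2 * x) ** (n - 2 * m) / (
            math.factorial(m) * math.factorial(n - 2 * m))
    return math.factorial(n) * total

for x in (-1.0, 0.0, 0.5, 2.0):
    assert hermite(0, x) == 1.0
    assert abs(hermite(1, x) - 2 * x) < 1e-12
    assert abs(hermite(2, x) - (4 * x * x - 2)) < 1e-12
```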

Integral Equations of the Second Kind

Referring back to Eq. 1, if h(x) ≠ 0 for a ≤ x ≤ b we may rewrite the
problem in a form with h(x) = 1:

    g(x) = f(x) − λ ∫_a^b k(x,y) f(y) dy.    (106)

This is referred to as a linear integral equation of the second kind or as a
Fredholm equation of the second kind. It defines a linear transformation
from function f to function g. To see this, let us denote this transformation
by the letter L:

    g(x) = (Lf)(x) = f(x) − λ ∫_a^b k(x,y) f(y) dy.    (107)

If

    Lf1 = g1 and    (108)
    Lf2 = g2,    (109)

then for arbitrary complex constants c1 and c2:

    L(c1 f1 + c2 f2) = c1 g1 + c2 g2.    (110)

Notice that we may sometimes find it convenient to use the notation:

    |g⟩ = L|f⟩ = |f⟩ − λK|f⟩,    (111)

where K|f⟩ indicates here the integral ∫_a^b k(x,y)f(y)dy. Our linear operator
is then written:

    L = I − λK,    (112)

where I is the identity operator.


We are interested in the problem of inverting this linear transformation:
given g, what is f? As it is a linear transformation, it should not be
surprising that the techniques are analogous with those familiar in matrix
equations. The difference is that we are now dealing with vector spaces that
are infinite-dimensional function spaces.

4.1

Homogeneous Equation, Eigenfunctions

It is especially useful in approaching this problem to first consider the
special case g(x) = 0:

    f(x) = λ ∫_a^b k(x,y) f(y) dy.    (113)

This is called the homogeneous integral equation. It has a trivial solution
f(x) = 0 for a ≤ x ≤ b.
If there exists a non-trivial solution f(x) to the homogeneous equation,
then cf(x) is also a solution, and we may assume that our solution is
normalized (at least up to some here-neglected questions of rigor related to
the specification of our function space):

    ∫_a^b |f(x)|² dx = 1.    (114)

If there are several solutions, f1, f2, f3, . . . , fn, then any linear combination
of these is also a solution. Hence, if we have several linearly independent
solutions, we can assume that they are orthogonal and normalized. If they
are not, we may use the Gram-Schmidt process to obtain such a set of
orthonormal solutions. We therefore assume, without loss of generality, that:

    ∫_a^b fi*(x) fj(x) dx = δij.    (115)

Alternatively, we may use the familiar shorthand:

    ⟨fi|fj⟩ = δij,    (116)

or even

    Σ_i |fi⟩⟨fi| = If,    (117)

where If is the identity matrix in the subspace spanned by {f}.
A value of λ for which the homogeneous equation has non-trivial solutions
is called an eigenvalue of the equation (or, of the kernel). Note that the use
of the term eigenvalue here is analogous with, but different in detail from,
the usage in matrices; our present eigenvalue is more similar to the inverse
of a matrix eigenvalue. The corresponding solutions are called eigenfunctions
of the kernel for eigenvalue λ. We have the following:

Theorem: There are a finite number of eigenfunctions fi corresponding to
a given eigenvalue λ.

Proof: We'll prove this for real functions, leaving the complex case as an
exercise. Given an eigenfunction fj corresponding to eigenvalue λ, let:

    pj(x) ≡ ∫_a^b k(x,y) fj(y) dy = (1/λ) fj(x).    (118)

Now consider, for some set of n eigenfunctions corresponding to eigenvalue
λ:

    D(x) ≡ λ² ∫_a^b [k(x,y) − Σ_{j=1}^{n} pj(x) fj(y)]² dy.    (119)

It must be that D(x) ≥ 0 because the integrand is nowhere negative for any
x. Note that the sum term may be regarded as an approximation to the
kernel, hence D(x) is a measure of the closeness of the approximation. With
some manipulation:

    D(x) = λ² ∫_a^b [k(x,y)]² dy − 2λ² Σ_{j=1}^{n} pj(x) ∫_a^b k(x,y) fj(y) dy
           + λ² ∫_a^b [Σ_{j=1}^{n} pj(x) fj(y)]² dy
         = λ² ∫_a^b [k(x,y)]² dy − 2λ² Σ_{j=1}^{n} [pj(x)]²
           + λ² Σ_{j=1}^{n} Σ_{k=1}^{n} pj(x) pk(x) ∫_a^b fj(y) fk(y) dy
         = λ² ∫_a^b [k(x,y)]² dy − λ² Σ_{j=1}^{n} [pj(x)]².    (120)

With D(x) ≥ 0, we have thus proved a form of Bessel's inequality. We may
rewrite the inequality, using λ pj(x) = fj(x), as:

    λ² ∫_a^b [k(x,y)]² dy ≥ Σ_{j=1}^{n} [fj(x)]².    (121)

If we integrate both sides over x, we obtain:

    λ² ∫_a^b ∫_a^b [k(x,y)]² dy dx ≥ Σ_{j=1}^{n} ∫_a^b [fj(x)]² dx = n,    (122)

using the normalization of the fj. As long as ∫∫ k² dx dy is bounded, we see
that n must be finite. For finite a and b, this is certainly satisfied, by our
continuity assumption for k. Otherwise, we may impose this as a requirement
on the kernel.
More generally, we regard "nice" kernels as those for which

    ∫_a^b ∫_a^b [k(x,y)]² dy dx < ∞,    (123)
    ∫_a^b [k(x,y)]² dx < U1,  ∀y ∈ [a,b],    (124)
    ∫_a^b [k(x,y)]² dy < U2,  ∀x ∈ [a,b],    (125)

where U1 and U2 are some fixed upper bounds. We will assume that these
conditions are satisfied in our following discussion. Note that the kernel may
actually be discontinuous and even become infinite in [a, b], as long as these
conditions are satisfied.

4.2

Degenerate Kernels

Definition (Degenerate Kernel): If we can write the kernel in the form:

    k(x,y) = Σ_{i=1}^{n} φi(x) ψi(y)    (126)

(or K = Σ_{i=1}^{n} |φi⟩⟨ψi|), then the kernel is called degenerate. We may
assume that the φi(x) are linearly independent. Otherwise we could reduce
the number of terms in the sum to use only independent functions. Likewise
we may assume that the ψi(y) are linearly independent.
The notion of a degenerate kernel is important due to two facts:

1. Any continuous function k(x,y) can be uniformly approximated by
   polynomials in a closed interval. That is, the polynomials are complete
   on a closed bounded interval.

2. The solution of the integral equation for degenerate kernels is easy (at
   least formally).

The first fact is known under the label "Weierstrass Approximation
Theorem". A proof by construction may be found in Courant and Hilbert.
We remind the reader of the notion of uniform convergence in the sense used
here:

Definition (Uniform Convergence): If S(z) = Σ_{n=0}^{∞} un(z) and
SN(z) = Σ_{n=0}^{N} un(z), then S(z) is said to be uniformly convergent over
the set of points A = {z|z ∈ A} if, given any ε > 0, there exists an integer N
such that

    |S(z) − SN+k(z)| < ε,    k = 0, 1, 2, . . . and ∀z ∈ A.    (127)

Note that this is a rather strong form of convergence: a series may converge
for all z ∈ A, but may not be uniformly convergent.
Let us now pursue the second fact asserted above. We wish to solve for
f:

    g(x) = f(x) − λ ∫_a^b k(x,y) f(y) dy.    (128)

If the kernel is degenerate, we have:

    g(x) = f(x) − λ Σ_{i=1}^{n} φi(x) ∫_a^b ψi(y) f(y) dy.    (129)

We define the numbers:

    gi ≡ ∫_a^b ψi(x) g(x) dx,    (130)
    fi ≡ ∫_a^b ψi(x) f(x) dx,    (131)
    cij ≡ ∫_a^b ψi(x) φj(x) dx.    (132)

Multiply Eq. 128 through by ψj(x) and integrate over x to obtain:

    gj = fj − λ Σ_{i=1}^{n} cji fi.    (133)

This is a system of n linear equations in the n unknowns fi. Suppose that
there is a unique solution f1, f2, . . . , fn to this system. It is readily verified
that a solution to the integral equation is:

    f(x) = g(x) + λ Σ_{i=1}^{n} fi φi(x).    (134)

Substituting in:

    g(x) = g(x) + λ Σ_{i=1}^{n} fi φi(x)
           − λ Σ_{i=1}^{n} φi(x) ∫_a^b ψi(y) [g(y) + λ Σ_{j=1}^{n} fj φj(y)] dy
         = g(x) + λ Σ_{i=1}^{n} φi(x) [fi − gi − λ Σ_{j=1}^{n} cij fj]
         = g(x).    (135)

Let us try an explicit example to illustrate how things work. We wish to
solve the equation:

    x² = f(x) − λ ∫_0^1 x(1 + y) f(y) dy.    (136)

In this case, n = 1, and it is clear that the solution is simply a quadratic
polynomial which can be determined directly. However, let us apply our new
method instead. We have g(x) = x² and k(x,y) = x(1 + y). The kernel is
degenerate, with φ1(x) = x and ψ1(y) = 1 + y. Our constants evaluate to:

    g1 = ∫_0^1 (1 + x) x² dx = 7/12,    (137)
    c11 = ∫_0^1 x(1 + x) dx = 5/6.    (138)
The linear equation we need to solve is then:

    7/12 = f1 − (5/6) λ f1,    (139)

giving

    f1 = (7/2) · 1/(6 − 5λ),    (140)

and

    f(x) = x² + [7λ/(2(6 − 5λ))] x.    (141)

The reader is encouraged to check that this is a solution to the original
equation, and that no solution exists if λ = 6/5.
To investigate this special value λ = 6/5, consider the homogeneous
equation:

    f(x) = λ ∫_0^1 x(1 + y) f(y) dy.    (142)

We may use the same procedure in this case, except now g1 = 0 and we find
that

    f1 (1 − (5/6) λ) = 0.    (143)

Either f1 = 0 or λ = 6/5. If f1 = 0, then f(x) = g(x) + λ f1 φ1(x) = 0. If
λ ≠ 6/5 the only solution to the homogeneous equation is the trivial one. But
if λ = 6/5 the solution to the homogeneous equation is f(x) = ax, where a
is arbitrary. The value λ = 6/5 is an (in this case the only) eigenvalue of the
integral equation, with corresponding normalized eigenfunction f(x) = √3 x.
This example suggests the plausibility of the important theorem in the
next section.
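The worked example is small enough to verify end to end numerically. The sketch below solves Eq. 139 for an illustrative value of λ, builds f(x) from Eq. 141, and substitutes it back into the original equation via the trapezoidal rule:

```python
lam = 0.5  # any lambda except the eigenvalue 6/5

# Constants of the worked example (Eqs. 137-138).
g1, c11 = 7.0 / 12.0, 5.0 / 6.0
f1 = g1 / (1 - lam * c11)            # solve Eq. 139
f = lambda x: x * x + lam * f1 * x   # Eq. 141

# Check that x^2 = f(x) - lam * int_0^1 x(1+y) f(y) dy, by trapezoidal rule.
n = 2000
h = 1.0 / n
for x in (0.25, 0.7, 1.0):
    integral = 0.5 * (x * (1 + 0.0) * f(0.0) + x * (1 + 1.0) * f(1.0))
    for k in range(1, n):
        y = k * h
        integral += x * (1 + y) * f(y)
    integral *= h
    assert abs(f(x) - lam * integral - x * x) < 1e-6
```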

4.3

Fredholm Alternative Theorem

Theorem: Either the integral equation

    f(x) = g(x) + λ ∫_a^b k(x,y) f(y) dy,    (144)

with given λ, possesses a unique continuous solution f(x) for each
continuous function g(x) (and in particular f(x) = 0 if g(x) = 0), or the
associated homogeneous equation

    f(x) = λ ∫_a^b k(x,y) f(y) dy    (145)

possesses a finite number of linearly independent solutions.


We'll give an abbreviated proof of this theorem to establish the ideas; the
reader may wish to fill in the rigorous details.
We have already demonstrated that there exists at most a finite number
of linearly independent solutions to the homogeneous equation. A good
approach to proving the remainder of the theorem is to first prove it for the
case of degenerate kernels. We'll use the Dirac notation for this, suggesting
the applicability for linear operators in general. Thus, let

    K = Σ_{i=1}^{n} |φi⟩⟨ψi|,    (146)

    |f⟩ = |g⟩ + λ Σ_{i=1}^{n} |φi⟩⟨ψi|f⟩,    (147)

and let

    gi ≡ ⟨ψi|g⟩,    (148)
    fi ≡ ⟨ψi|f⟩,    (149)
    cij ≡ ⟨ψi|φj⟩.    (150)
Then,

    fj = gj + λ Σ_{i=1}^{n} cji fi,    (151)

or, collecting the gi and fi into column vectors g and f,

    g = (I − λC) f,    (152)

where C is the matrix formed of the cij constants.
Thus, we have a system of n linear equations for the n unknowns {fi}.
Either the matrix I − λC is non-singular, in which case a unique solution f
exists for any given g (in particular f = 0 if g = 0), or I − λC is singular, in
which case the homogeneous equation f = λCf possesses a finite number of
linearly independent solutions. Up to some further considerations concerning
continuity, this proves the theorem for the case of a degenerate kernel.
We may extend the proof to arbitrary kernels by appealing to the fact
that any continuous function k(x,y) may be uniformly approximated by
degenerate kernels in a closed interval (for example, see Courant and Hilbert).
There is an additional useful theorem under Fredholm's name:

Theorem: If the integral equation:

    f(x) = g(x) + λ ∫_a^b k(x,y) f(y) dy    (153)

for given λ possesses a unique continuous solution for each continuous
g(x), then the transposed equation:

    t(x) = g(x) + λ ∫_a^b k(y,x) t(y) dy    (154)

also possesses a unique solution for each g. In the other case, if
the homogeneous equation possesses n linearly independent solutions
{f1, f2, . . . , fn}, then the transposed homogeneous equation

    t(x) = λ ∫_a^b k(y,x) t(y) dy    (155)

also has n linearly independent solutions {t1, t2, . . . , tn}. In this case,
the original inhomogeneous equation 153 has a solution if and only if
g(x) satisfies the conditions:

    ⟨g|ti⟩ = ∫_a^b g*(x) ti(x) dx = 0,    i = 1, 2, . . . , n.    (156)

That is, g must be orthogonal to all of the eigenvectors of the transposed
homogeneous equation. Furthermore, in this case, the solution is only
determined up to addition of an arbitrary linear combination of the
form:

    c1 f1 + c2 f2 + . . . + cn fn.    (157)
Again, a promising approach to proving this is to first consider the case of
degenerate kernels, and then generalize to arbitrary kernels.

Practical Approaches

We turn now to a discussion of some practical tools of the trade for solving
integral equations.

5.1

Degenerate Kernels

If the kernel is degenerate, we have shown that the solution may be obtained
by transforming the problem to that of solving a system of linear equations.


5.2

Volterra's Equations

Integral equations of the form:

    g(x) = ∫_a^x k(x,y) f(y) dy,    (158)

    f(x) = g(x) + λ ∫_a^x k(x,y) f(y) dy,    (159)

are called Volterra's equations of the first and second kind, respectively.
One situation where such equations arise is when k(x,y) = 0 for y > x:
k(x,y) = θ(x − y) ℓ(x,y). Thus,

    ∫_a^b k(x,y) f(y) dy = ∫_a^x ℓ(x,y) f(y) dy.    (160)

Consider Volterra's equation of the first kind. Recall the fundamental
theorem:

    (d/dx) ∫_{a(x)}^{b(x)} f(y,x) dy = f(b,x)(db/dx) − f(a,x)(da/dx)
                                       + ∫_a^b (∂f/∂x)(y,x) dy.    (161)
We may use this to transform the equation of the first kind to:

    (dg/dx)(x) = k(x,x) f(x) + ∫_a^x (∂k/∂x)(x,y) f(y) dy.    (162)

This is now a Volterra's equation of the second kind, and the approach to
solution may thus be similar.
Notice that if the kernel is independent of x, k(x,y) = k(y), then the
solution to the equation of the first kind is simply:

    f(x) = (1/k(x)) (dg/dx)(x).    (163)

Let us try a simple example. Suppose we wish to find f(x) in:

    x² = 1 + ∫_1^x xy f(y) dy.    (164)

This may be solved with various approaches. Let φ(x) ≡ ∫_1^x y f(y) dy. Then

    φ(x) = (x² − 1)/x.    (165)

Now take the derivative of both sides of the original equation:

    2x = x² f(x) + ∫_1^x y f(y) dy = x² f(x) + φ(x).    (166)

A bit of further algebra yields the answer:

    f(x) = (1/x²)(x + 1/x) = 1/x + 1/x³.    (167)

As always, especially when you have taken derivatives, it should be checked
that the result actually solves the original equation!
This was pretty easy, but it is even easier if we notice that this problem
is actually equivalent to one with an x-independent kernel. That is, we may
rewrite the equation as:

    (x² − 1)/x = ∫_1^x y f(y) dy.    (168)

Then we may use Eq. 163 to obtain the solution.
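A quick numerical substitution confirms the answer: with f(y) = 1/y + 1/y³, the right-hand side 1 + ∫_1^x xy f(y) dy should reproduce x² (the quadrature settings are arbitrary choices):

```python
f = lambda y: 1.0 / y + 1.0 / y ** 3

def rhs(x, n=4000):
    # Trapezoidal estimate of 1 + int_1^x x*y*f(y) dy.
    h = (x - 1.0) / n
    s = 0.5 * (x * 1.0 * f(1.0) + x * x * f(x))
    for k in range(1, n):
        y = 1.0 + k * h
        s += x * y * f(y)
    return 1.0 + s * h

for x in (1.5, 2.0, 3.0):
    assert abs(rhs(x) - x * x) < 1e-6
```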


5.2.1

Numerical Solution of Volterra's equation

The Volterra equation readily lends itself to a numerical approach to
solution on a grid (or mesh, or lattice). We note first that (absorbing the
factor λ into the definition of k for convenience):

    f(a) = g(a) + ∫_a^a k(a,y) f(y) dy = g(a).    (169)

This suggests building up a solution at arbitrary x by stepping along a grid
starting at x = a.
To carry out this program, we start by dividing the interval (a, x) into N
steps, and define:

    xn = a + nΔ,    n = 0, 1, . . . , N,    Δ ≡ (x − a)/N.    (170)

We have here defined a uniform grid, but that is not a requirement. Now let

    gn = g(xn),    (171)
    fn = f(xn),    (172)
    knm = k(xn, xm).    (173)

Note that f0 = g0.
We may pick various approaches to the numerical integration; for
example, the trapezoidal rule gives:

    ∫_a^{xn} k(xn,y) f(y) dy ≈ Δ [(1/2) kn0 f0 + Σ_{m=1}^{n−1} knm fm + (1/2) knn fn].    (174)

Substituting this into the Volterra equation yields, at x = xn:

    fn = gn + Δ [(1/2) kn0 f0 + Σ_{m=1}^{n−1} knm fm + (1/2) knn fn],    n = 1, 2, . . . , N.    (175)

Solving for f_n then gives:

    f_n = [ g_n + Δ ( (1/2)k_n0 f_0 + Σ_{m=1}^{n−1} k_nm f_m ) ] / [ 1 − (Δ/2)k_nn ],   n = 1, 2, . . . , N.   (176)

For example,

    f_0 = g_0,                                                   (177)
    f_1 = [ g_1 + (Δ/2)k_10 f_0 ] / [ 1 − (Δ/2)k_11 ],           (178)
    f_2 = [ g_2 + Δ ( (1/2)k_20 f_0 + k_21 f_1 ) ] / [ 1 − (Δ/2)k_22 ],   (179)

and so forth.
We note that we don't even have to explicitly solve a system of linear
equations, as we did for Fredholm's equation with a degenerate kernel. There
are of order

    O( Σ_{n=1}^N n ) = O(N²)                                     (180)

operations in this algorithm. The accuracy may be estimated by looking at
the change as additional grid points are added.
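The stepping algorithm of Eqs. 169–176 can be sketched in a few lines
(Python; the test equation f(x) = 1 + ∫_0^x f(y) dy, whose exact solution is
eˣ, is my own choice of check, not from the text):

```python
import math

def solve_volterra(g, k, a, x_max, N):
    """Trapezoidal-rule solution of f(x) = g(x) + ∫_a^x k(x,y) f(y) dy,
    stepping along a uniform grid (Eqs. 169-176; λ absorbed into k)."""
    dx = (x_max - a) / N
    xs = [a + n * dx for n in range(N + 1)]
    fs = [g(xs[0])]                                  # f_0 = g_0 (Eq. 169)
    for n in range(1, N + 1):
        s = 0.5 * k(xs[n], xs[0]) * fs[0]            # endpoint weight 1/2
        s += sum(k(xs[n], xs[m]) * fs[m] for m in range(1, n))
        # Eq. 176: the implicit k_nn term moves to the denominator.
        fs.append((g(xs[n]) + dx * s) / (1.0 - 0.5 * dx * k(xs[n], xs[n])))
    return xs, fs

# Test case: f(x) = 1 + ∫_0^x f(y) dy, exact solution f(x) = e^x.
xs, fs = solve_volterra(lambda x: 1.0, lambda x, y: 1.0, 0.0, 1.0, 100)
print(abs(fs[-1] - math.e))   # small O(Δ²) discretization error
```

Doubling N should shrink the error by roughly a factor of four, which is one
way to estimate the accuracy, as noted above.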

5.3 Neumann Series Solution

Often an exact closed solution is elusive, and we resort to approximate
methods. For example, one common approach is the iterative solution. We start
with the integral equation:

    f(x) = g(x) + λ ∫_a^b k(x, y) f(y) dy.                       (181)

The iterative approach begins with setting

    f_1(x) = g(x).                                               (182)

Substituting this into the integrand in the original equation gives:

    f_2(x) = g(x) + λ ∫_a^b k(x, y) g(y) dy.                     (183)

Substituting this yields:

    f_3(x) = g(x) + λ ∫_a^b k(x, y) [ g(y) + λ ∫_a^b k(y, y′) g(y′) dy′ ] dy.   (184)

This may be continued indefinitely, with the nth iterative solution given in
terms of the (n − 1)th:

    f_n(x) = g(x) + λ ∫_a^b k(x, y) f_{n−1}(y) dy                (185)
           = g(x) + λ ∫_a^b k(x, y) g(y) dy
             + λ² ∫_a^b ∫_a^b k(x, y) k(y, y′) g(y′) dy dy′
             + . . .
             + λ^{n−1} ∫_a^b · · · ∫_a^b k(x, y) · · · k(y^{(n−2)′}, y^{(n−1)′}) g(y^{(n−1)′}) dy . . . dy^{(n−1)′}.   (186)

If the method converges, then

    f(x) = lim_{n→∞} f_n(x).                                     (187)

This method is only useful if the series converges, and the faster the
better. It will converge if the kernel is bounded and λ is small enough.
We won't pursue this further here, except to note what happens if

    ∫_a^b k(x, y) g(y) dy = 0.                                   (188)

In this case, the series clearly converges, to the solution f(x) = g(x).
However, this solution is not necessarily unique, as we may add any linear
combination of solutions to the homogeneous equation.
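The iteration of Eq. 185 is easy to carry out on a grid. The sketch below
(assuming NumPy) uses the degenerate kernel k(x, y) = xy on [0, 1] with
g(x) = x, for which the exact answer turns out to be f(x) = 3x/(3 − λ), with
convergence for |λ| < 3:

```python
import numpy as np

def neumann_iterate(g, k, lam, a=0.0, b=1.0, n_iter=60, n_grid=201):
    """Iterate f_n(x) = g(x) + λ ∫_a^b k(x,y) f_{n-1}(y) dy (Eq. 185),
    using trapezoidal quadrature on a uniform grid."""
    x = np.linspace(a, b, n_grid)
    w = np.full(n_grid, (b - a) / (n_grid - 1))      # trapezoid weights
    w[0] *= 0.5
    w[-1] *= 0.5
    K = k(x[:, None], x[None, :])                    # K[i, j] = k(x_i, x_j)
    f = g(x)                                         # f_1 = g (Eq. 182)
    for _ in range(n_iter):
        f = g(x) + lam * (K @ (w * f))
    return x, f

lam = 1.0
x, f = neumann_iterate(lambda t: 1.0 * t, lambda u, v: u * v, lam)
print(abs(f[-1] - 3.0 / (3.0 - lam)))   # ≈ 0 up to quadrature error
```

Trying |λ| ≥ 3 here makes the iterates diverge, illustrating the limited
region of convergence worked out in the example of the next section.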

5.4 Fredholm Series

Better convergence properties are obtained with the Fredholm series. As
before, we wish to solve

    f(x) = g(x) + λ ∫_a^b k(x, y) f(y) dy.                       (189)

Let

    D(λ) = 1 − λ ∫_a^b k(x, x) dx
           + (λ²/2!) ∫∫ det[ k(x, x)   k(x, x′)
                             k(x′, x)  k(x′, x′) ] dx dx′
           − (λ³/3!) ∫∫∫ det[ k(x, x)    k(x, x′)    k(x, x″)
                              k(x′, x)   k(x′, x′)   k(x′, x″)
                              k(x″, x)   k(x″, x′)   k(x″, x″) ] dx dx′ dx″
           + . . . ,                                             (190)

and let

    D(x, y; λ) = λ k(x, y)
                 − λ² ∫ det[ k(x, y)  k(x, z)
                             k(z, y)  k(z, z) ] dz
                 + (λ³/2!) ∫∫ det[ k(x, y)    k(x, z)    k(x, z′)
                                   k(z, y)    k(z, z)    k(z, z′)
                                   k(z′, y)   k(z′, z)   k(z′, z′) ] dz dz′
                 − . . . .                                       (191)

Note that not everyone uses the same convention for this notation. For
example, Mathews and Walker define D(x, y; λ) to be 1/λ times the quantity
defined here.
We have the following:

Theorem: If D(λ) ≠ 0 and if the Fredholm equation has a solution, then
the solution is, uniquely:

    f(x) = g(x) + ∫_a^b [D(x, y; λ)/D(λ)] g(y) dy.               (192)

The homogeneous equation f(x) = λ ∫_a^b k(x, y) f(y) dy has no continuous,
non-trivial solutions unless D(λ) = 0.

A proof of this theorem may be found in Whittaker and Watson. The proof may
be approached as follows: Divide the range a < x < b into n equal intervals
and replace the original integral by a sum:

    f(x) = g(x) + λδ Σ_{i=1}^n k(x, x_i) f(x_i),                 (193)

where δ is the width of an interval, and x_i is a value of x within interval
i. This provides a system of linear equations for f(x_i), which we may solve,
taking the limit as n → ∞, δ → 0. In this limit, D(λ) is the limit of the
determinant of the matrix of coefficients, expanded in powers of λδ.
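This limiting determinant can be observed numerically (a sketch assuming
NumPy; the kernel k(x, y) = xy on [0, 1] anticipates the example below, for
which D(λ) = 1 − λ/3):

```python
import numpy as np

def fredholm_det(k, lam, a=0.0, b=1.0, n=400):
    """det(1 - λ δ K) on an n-point midpoint grid, approximating D(λ)."""
    delta = (b - a) / n
    x = a + (np.arange(n) + 0.5) * delta             # interval midpoints
    K = k(x[:, None], x[None, :])
    return np.linalg.det(np.eye(n) - lam * delta * K)

lam = 2.0
print(fredholm_det(lambda x, y: x * y, lam), 1.0 - lam / 3.0)  # nearly equal
```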
While the Fredholm series is cumbersome, it has the advantage over the
Neumann series that the series D(λ) and D(x, y; λ) are guaranteed to
converge.
There is a nice graphical representation of the Fredholm series; I'll
describe a variant here. We let a line segment or smooth arc represent the
kernel k(x, y); one end of the segment corresponds to variable x and the
other end to variable y. If the segment closes on itself smoothly (e.g., we
have a circle), then the variables at the two ends are the same: we have
k(x, x). The product of two kernels is represented by making two segments
meet at a point. The meeting ends correspond to the same variable, the first
variable in one kernel and the second variable in the other kernel. One may
think of the lines as directed, such that the second variable, say, is at the
starting end, and the first variable is at the finishing end. When two
segments meet, it is always the finish of one that is connected to the start
of the other. We could draw arrows to keep track of this, but it actually
isn't needed in this application, since in practice we'll always integrate
over the repeated variables. A heavy dot on a segment breaks the segment into
two meeting segments, according to the above rule, and furthermore means
integration over the repeated variable with a factor of λ.

[Figures illustrating these rules, and the diagrammatic expansions of
D(x, y; λ) (194) and D(λ) (195), are not reproducible in this text: an open
arc denotes k(x, y), a closed loop denotes k(x, x) or its trace
∫ k(x, x) dx, chained arcs denote products such as ∫ k(x, y) k(y, z) dy, and
the series (194) and (195) are sums of such diagrams with the coefficients
of Eqs. 190 and 191.]

Let us try a very simple example to see how things work. Suppose we wish to
solve:

    f(x) = x + λ ∫_0^1 xy f(y) dy.                               (196)

Of course, this may be readily solved by elementary means, but let us apply
our new techniques. We have:

    k(x, y) = xy,                                                (197)
    ∫_0^1 k(x, x) dx = 1/3,                                      (198)
    ∫_0^1 k(x, y) k(y, z) dy = xz/3,                             (199)
    ∫_0^1 ∫_0^1 k(x, y) k(y, x) dx dy = (1/3)².                  (200)
We thus notice that a closed loop with n dots contributes

    λⁿ ∫_0^1 · · · ∫_0^1 dx dx′ . . . dx^{(n−1)} [x x′ · · · x^{(n−1)}]² = (λ/3)ⁿ.   (201)

We may likewise show that an open segment from x to y with n dots
contributes

    λⁿ xy (1/3)ⁿ.                                                (202)

We find from this that all determinants of dimension ≥ 2 vanish. We have

    D(λ) = 1 − λ/3,                                              (203)
    D(x, y; λ) = λxy.                                            (204)

The solution in terms of the Fredholm series is then:

    f(x) = g(x) + ∫_0^1 [D(x, y; λ)/D(λ)] g(y) dy
         = x + [3λ/(3 − λ)] ∫_0^1 x y² dy
         = 3x/(3 − λ).                                           (205)

Generalizing from this example, we remark that if the kernel is degenerate,

    k(x, y) = Σ_{i=1}^n α_i(x) β_i(y),                           (206)

then D(λ) and D(x, y; λ) are polynomials of degree n in λ. The reader is
invited to attempt a graphical proof of this. This provides another
algorithm for solving the degenerate kernel problem.
Now suppose that we attempt to solve our example with a Neumann series. We
have

    f(x) = g(x) + λ ∫_0^1 k(x, y) g(y) dy + λ² ∫_0^1 ∫_0^1 k(x, y) k(y, y′) g(y′) dy dy′ + . . .
         = x + λx ∫_0^1 y² dy + λ²x ∫_0^1 ∫_0^1 y² (y′)² dy dy′ + . . .
         = x Σ_{n=0}^∞ (λ/3)ⁿ.                                   (207)

This series converges for |λ| < 3 to

    f(x) = 3x/(3 − λ).                                           (208)

This is the same result as the Fredholm solution above. However, the Neumann
solution is only valid for |λ| < 3, while the Fredholm solution is valid for
all λ ≠ 3. At the eigenvalue λ = 3, D(λ = 3) = 0.
At λ = 3, we expect a non-trivial solution to the homogeneous equation

    f(x) = 3 ∫_0^1 xy f(y) dy.                                   (209)

Indeed, f(x) = Ax solves this equation. The roots of D(λ) are the
eigenvalues of the kernel. If the kernel is degenerate, we only have a
finite number of eigenvalues.
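Both claims are easy to confirm on a grid (a sketch assuming NumPy; the grid
size and the value λ = 1.7 are arbitrary choices):

```python
import numpy as np

n = 501
y = np.linspace(0.0, 1.0, n)
w = np.full(n, 1.0 / (n - 1))          # trapezoid weights on [0, 1]
w[0] *= 0.5
w[-1] *= 0.5

# f(x) = 3x/(3 - λ) (Eq. 205) should satisfy f(x) = x + λ ∫_0^1 xy f(y) dy.
lam = 1.7
f = 3.0 * y / (3.0 - lam)
rhs = y + lam * y * np.sum(w * y * f)
print(np.max(np.abs(rhs - f)))         # ≈ 0

# At λ = 3, f(x) = Ax solves the homogeneous Eq. 209: 3 ∫_0^1 x·y·(Ay) dy = Ax.
A = 2.0
print(abs(3.0 * np.sum(w * y * (A * y)) - A))   # ≈ 0
```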

6 Symmetric Kernels

Definition: If k(x, y) = k(y, x) then the kernel is called symmetric. If
k(x, y) = k*(y, x) then the kernel is called Hermitian.

Note that a real, Hermitian kernel is symmetric. For simplicity, we'll
restrict ourselves to real symmetric kernels here,¹ but the generalization
to Hermitian kernels is readily accomplished (indeed is already done when we
use Dirac's notation). The study of such kernels via eigenfunctions is
referred to as Schmidt-Hilbert theory. We will assume that our kernels are
bounded in the sense:

    ∫_a^b [k(x, y)]² dy ≤ M,                                     (210)
    ∫_a^b [∂k/∂x (x, y)]² dy ≤ M′,                               (211)

where M and M′ are finite.

¹Note that, since we are assuming real functions in this section, we do not
put a complex conjugate in our scalar products. But don't forget to put in
the complex conjugate if you have a problem with complex functions!


Our approach to studying the symmetric kernel problem will be to analyze it
in terms of the solutions to the homogeneous equation. We have the
following:

Theorem: Every continuous symmetric kernel (not identically zero) possesses
eigenvalues. Their number is countably infinite if and only if the kernel is
not degenerate. All eigenvalues of a real symmetric kernel are real.
Proof: First, recall the Schwarz inequality, in Dirac notation:

    |⟨f|g⟩|² ≤ ⟨f|f⟩⟨g|g⟩.                                       (212)

Consider the quadratic integral form:

    J(φ, φ) = ⟨φ|K|φ⟩ ≡ ∫_a^b ∫_a^b k(x, y) φ(x) φ(y) dx dy,     (213)

where φ is any (piecewise) continuous function in [a, b]. We'll assume
|a|, |b| < ∞ for simplicity here; the reader may consider what additional
criteria must be satisfied if the interval is infinite.
Our quadratic integral form is analogous with the quadratic form for
systems of linear equations:

    A(x, x) = Σ_{i,j=1}^n a_ij x_i x_j = (x, Ax),                (214)

and this analogy persists in much of the discussion, lending an intuitive
perspective.
Notice that if we write:

    J(φ, φ) = ⟨u|v⟩ = ∫∫ u(x, y) v(x, y) dx dy,                  (215)

where

    u(x, y) ≡ k(x, y),                                           (216)
    v(x, y) ≡ φ(x) φ(y),                                         (217)

we have defined a scalar product between the vectors u and v. We are thus
led to consider its square,

    [J(φ, φ)]² = ∫ dx ∫ dy ∫ dx′ ∫ dy′ k(x, y) φ(x) φ(y) k(x′, y′) φ(x′) φ(y′),   (218)

to which we apply the Schwarz inequality:

    [J(φ, φ)]² = |⟨u|v⟩|² ≤ ⟨u|u⟩⟨v|v⟩
               = ∫∫ [φ(x) φ(y)]² dx dy ∫∫ [k(x, y)]² dx dy.      (219)

Thus, if we require φ to be a normalized function,

    ∫ [φ(x)]² dx = 1,                                            (220)

we see that |J(φ, φ)| is bounded, since the integral of the squared kernel
is bounded.
Furthermore, we can have J(φ, φ) = 0 for all φ if and only if k(x, y) = 0.
The "if" part is obviously true; let us deal with the "only if" part. This
statement depends on the symmetry of the kernel. Consider the bilinear
integral form:

    J(φ, ψ) = J(ψ, φ) ≡ ∫∫ k(x, y) φ(x) ψ(y) dx dy.              (221)

We have

    J(φ + ψ, φ + ψ) = J(φ, φ) + J(ψ, ψ) + 2J(φ, ψ),              (222)

for all φ, ψ piecewise continuous on [a, b]. We see that J(φ, φ) = 0 for all
φ only if it is also true that J(φ, ψ) = 0, ∀φ, ψ.
In particular, let us take

    ψ(y) = ∫ k(x, y) φ(x) dx.                                    (223)

Then

    0 = J(φ, ψ) = ∫ dx ∫ dy k(x, y) φ(x) ∫ dx′ k(x′, y) φ(x′)
                = ∫ [ ∫ k(x, y) φ(x) dx ]² dy.                   (224)

Thus, ∫ k(x, y) φ(x) dx = 0 ∀φ. In particular, take, for any given value of
y, φ(x) = k(x, y). Then

    ∫ [k(x, y)]² dx = 0,                                         (225)

and we find k(x, y) = 0.


We now assume that J(φ, φ) is not identically zero. Let us assume for
convenience that J(φ, φ) can take on positive values. If not, we could
repeat the following arguments for the case J(φ, φ) ≤ 0 ∀φ. We are
interested in finding the normalized φ for which J(φ, φ) attains its
greatest possible value. Since J(φ, φ) is bounded, there exists a least
upper bound:

    J(φ, φ) ≤ κ_1 = 1/λ_1,   such that ⟨φ|φ⟩ = 1.                (226)
We wish to show that this bound is actually achieved for a suitable φ(x).
Let us suppose that the kernel is uniformly approximated by a series of
degenerate symmetric kernels:

    A_n(x, y) = Σ_{i,j=1}^{a_n} c_ij^(n) ω_i(x) ω_j(y),          (227)

where c_ij^(n) = c_ji^(n) and ⟨ω_i|ω_j⟩ = δ_ij, and such that the
approximating kernels are uniformly bounded in the senses:

    ∫_a^b [A_n(x, y)]² dy ≤ M_A,                                 (228)
    ∫_a^b [∂A_n/∂x (x, y)]² dy ≤ M_A′,                           (229)

where M_A and M_A′ are finite and independent of n.

We consider the quadratic integral form for the approximating kernels:

    J_n(φ, φ) = ∫∫ A_n(x, y) φ(x) φ(y) dx dy
              = Σ_{i,j=1}^{a_n} c_ij^(n) ∫ ω_i(x) φ(x) dx ∫ ω_j(y) φ(y) dy
              = Σ_{i,j=1}^{a_n} c_ij^(n) u_i u_j,                (230)

where u_i ≡ ∫ ω_i(x) φ(x) dx. This is a quadratic form in the numbers
u_1, u_2, . . . , u_{a_n}.


Now,

    ∫ [ φ(x) − Σ_{i=1}^{a_n} u_i ω_i(x) ]² dx ≥ 0,               (231)

implies that (Bessel inequality):

    ⟨φ|φ⟩ = 1 ≥ Σ_{i=1}^{a_n} u_i².                              (232)

The maximum of J_n(φ, φ) is attained when

    Σ_{i=1}^{a_n} u_i² = 1.                                      (233)
More intuitively, note that

    φ(x) = Σ_{i=1}^{a_n} u_i ω_i(x),                             (234)

unless there is a component of φ orthogonal to all of the ω_i. By removing
that component we can make J_n(φ, φ) larger.
We wish to find a function φ_n(x) such that the maximum is attained. We know
that it must be of the form φ_n(x) = u_1 ω_1(x) + u_2 ω_2(x) + · · · +
u_{a_n} ω_{a_n}(x), where Σ u_i² = 1, since then ⟨φ_n|φ_n⟩ = 1. The problem
of finding max[J_n(φ, φ)] is thus one of finding the maximum of the
quadratic form subject to the constraint Σ u_i² = 1. We know that such a
maximum exists, because a continuous function of several variables,
restricted to a finite domain, assumes a maximum value in the domain.
Suppose that {u} is the appropriate vector. Then

    Σ_{i,j=1}^{a_n} c_ij^(n) u_i u_j = κ_1n                      (235)

is the maximum value that J_n(φ, φ) attains.


But the problem of finding the maximum of the quadratic form is just the
problem of finding its maximum eigenvalue and corresponding eigenvector.
That is,

    Σ_{j=1}^{a_n} c_ij^(n) u_j = κ_1n u_i,   i = 1, 2, . . . , a_n.   (236)

This is also called the "principal axis problem". Take

    φ_n(x) = u_1 ω_1(x) + u_2 ω_2(x) + · · · + u_{a_n} ω_{a_n}(x),   (237)

where {u} is now our (normalized) vector for which the quadratic form is
maximal. The normalization ⟨φ_n|φ_n⟩ = 1 still holds. Apply the approximate
kernel operator to this function:

    ∫ A_n(x, y) φ_n(y) dy = Σ_{i,j=1}^{a_n} c_ij^(n) ω_i(x) ∫ ω_j(y) φ_n(y) dy
                          = Σ_{i=1}^{a_n} ω_i(x) Σ_{j=1}^{a_n} c_ij^(n) u_j
                          = κ_1n Σ_{i=1}^{a_n} u_i ω_i(x)
                          = κ_1n φ_n(x).                         (238)

Therefore φ_n(x) is an eigenfunction of A_n(x, y) belonging to eigenvalue
κ_1n = 1/λ_1n.
Finally, it is left to argue that, as we let A_n converge on k, φ_n(x)
converges on eigenfunction φ(x), with eigenvalue κ_1. We'll let n → ∞. Since
A_n(x, y) is uniformly convergent on k(x, y), we have that, given any ε > 0,
there exists an N such that whenever n ≥ N:

    |k(x, y) − A_n(x, y)| < ε,   ∀x, y ∈ [a, b].                 (239)

Thus,

    [J(φ, φ) − J_n(φ, φ)]² = [ ∫∫ [k(x, y) − A_n(x, y)] φ(x) φ(y) dx dy ]²
                           ≤ |⟨φ|φ⟩|² ∫_a^b ∫_a^b [k(x, y) − A_n(x, y)]² dx dy   (Schwarz)
                           ≤ ε² ∫∫ dx dy
                           ≤ ε² (b − a)².                        (240)

Thus, the range of J_n may be made arbitrarily close to the range of J by
taking n large enough, and hence, the maximum of J_n may be made arbitrarily
close to that of J:

    lim_{n→∞} κ_1n = κ_1.                                        (241)

Now, by the Schwarz inequality, the functions φ_n(x) are uniformly bounded
for all n:

    [φ_n(x)]² = [ λ_1n ∫ A_n(x, y) φ_n(y) dy ]² ≤ λ_1n² ⟨φ_n|φ_n⟩ ∫ [A_n(x, y)]² dy.   (242)

As n → ∞, λ_1n → λ_1 and A_n(x, y) → k(x, y). Also, since A_n(x, y) is
piecewise continuous, φ_n(x) is continuous, since it is an integral
function.
The φ_n(x) form what is known as an equicontinuous set: For every ε > 0,
there exists δ(ε) > 0, independent of n, such that

    |φ_n(x + η) − φ_n(x)| < ε,                                   (243)

whenever |η| < δ. This may be seen as follows: First, we show that φ_n′(x)
is uniformly bounded:

    [φ_n′(x)]² = [ λ_1n ∫ ∂A_n/∂x (x, y) φ_n(y) dy ]²
               ≤ λ_1n² ∫ [∂A_n/∂x (x, y)]² dy   (Schwarz)
               ≤ λ_1n² M_A′.                                     (244)
Or, [φ_n′(x)]² ≤ M_A″, where M_A″ = M_A′ max_n λ_1n². With this, we find:

    |φ_n(x + η) − φ_n(x)|² = [ ∫_x^{x+η} φ_n′(y) dy ]²
                           = [ ∫_a^b [θ(y − x) − θ(y − x − η)] φ_n′(y) dy ]²
                           ≤ ∫_a^b [θ(y − x) − θ(y − x − η)]² dy ∫_a^b [φ_n′(y)]² dy
                           ≤ |η| (b − a) M_A″
                           < ε²,                                 (245)

for |η| ≤ δ ≡ ε²/[(b − a) M_A″].


For such sets of functions there is a theorem analogous to the
Bolzano-Weierstrass theorem on the existence of a limit point for a bounded
infinite sequence of numbers:

Theorem: (Arzelà) If f_1(x), f_2(x), . . . is a uniformly bounded
equicontinuous set of functions on a domain D, then it is possible to select
a subsequence that converges uniformly to a continuous limit function in
the domain D.
The proof of this is similar to the proof of the Bolzano-Weierstrass
theorem, which it relies on. We start by selecting a set of points
x_1, x_2, . . . that is everywhere dense in [a, b]. For example, we could
pick successive midpoints of intervals. By the Bolzano-Weierstrass theorem,
the sequence of numbers f_1(x_1), f_2(x_1), . . . contains a convergent
subsequence. So select an infinite sequence of functions (out of {f}),
a_1(x), a_2(x), . . ., whose values at x_1 form a convergent sequence, which
we may accomplish by the same reasoning. Similarly, select a sequence of
functions (out of {a}), b_1(x), b_2(x), . . ., whose values at x_2 form a
convergent sequence, and so on.
Now consider the diagonal sequence:

    q_1(x) = a_1(x)
    q_2(x) = b_2(x)
    q_3(x) = c_3(x)
    . . .                                                        (246)
We wish to show that the sequence {q} converges on the entire interval
[a, b]. Given ε > 0, take M large enough so that there exist values x_k with
k ≤ M such that |x − x_k| ≤ δ(ε) for every point x of the interval, where
δ(ε) is the δ in our definition of equicontinuity. Now choose N = N(ε) so
that for m, n > N

    |q_m(x_k) − q_n(x_k)| < ε,   k = 1, 2, . . . , M.            (247)

By equicontinuity, we have, for some k ≤ M:

    |q_m(x) − q_m(x_k)| < ε,                                     (248)
    |q_n(x) − q_n(x_k)| < ε.                                     (249)

Thus, for m, n > N:

    |q_m(x) − q_n(x)| = |q_m(x) − q_m(x_k) + q_m(x_k) − q_n(x_k) + q_n(x_k) − q_n(x)| < 3ε.   (250)

Thus, {q} is uniformly convergent for all x ∈ [a, b].
With this theorem, we can find a subsequence φ_{n_1}, φ_{n_2}, . . . that
converges uniformly to a continuous limit function φ_1(x) for a ≤ x ≤ b.
There may be more than one limit function, but there cannot be an infinite
number, as we know that the number of eigenfunctions for a given eigenvalue
is finite. Passing to the limit,

    ⟨φ_n|φ_n⟩ = 1 → ⟨φ_1|φ_1⟩ = 1,                               (251)
    J_n(φ_n, φ_n) = κ_1n → J(φ_1, φ_1) = κ_1,                    (252)
    φ_n(x) = λ_1n ∫ A_n(x, y) φ_n(y) dy → φ_1(x) = λ_1 ∫ k(x, y) φ_1(y) dy.   (253)

Thus we have proven the existence of an eigenvalue (λ_1).


Note that λ_1 ≠ ∞ since we assumed that J(φ, φ) could be positive:

    max[J(φ, φ)] = κ_1 = 1/λ_1 > 0.                              (254)

Note also that, just as in the principal axis problem, additional
eigenvalues (if any exist) can be found by repeating the procedure,
restricting to functions orthogonal to the first one. If k(x, y) is
degenerate, there can only be a finite number of them, as the reader may
demonstrate. This completes the proof of the theorem stated at the beginning
of the section.
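The variational construction in this proof has a direct numerical analogue
(a Nyström-style sketch, assuming NumPy): discretize J(φ, φ) on a grid and
take the extreme eigenvalue of the weighted kernel matrix. For the
degenerate kernel k(x, y) = xy of the earlier example, κ_1 = 1/3, i.e.
λ_1 = 3:

```python
import numpy as np

def kappa_extreme(k, a=0.0, b=1.0, n=300):
    """Largest-magnitude eigenvalue of the discretized kernel operator,
    approximating kappa_1 = 1/lambda_1 = max J(phi, phi) over normalized phi."""
    w = (b - a) / n                                  # midpoint-rule weight
    x = a + (np.arange(n) + 0.5) * w
    K = k(x[:, None], x[None, :])
    vals = np.linalg.eigvalsh(w * K)                 # symmetric: real spectrum
    return vals[np.argmax(np.abs(vals))]

print(kappa_extreme(lambda x, y: x * y))   # ≈ 1/3
```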
We'll conclude this section with some further properties of symmetric
kernels. Suppose that we have found all of the positive and negative
eigenvalues and ordered them by absolute value:

    |λ_1| ≤ |λ_2| ≤ . . .                                        (255)

Denote the corresponding eigenfunctions by

    φ_1, φ_2, . . .                                              (256)

They form an orthonormal set (e.g., if two independent eigenfunctions
corresponding to the same eigenvalue are not orthogonal, we use the
Gram-Schmidt procedure to obtain orthogonal functions).
We now note that if there are only a finite number of eigenvalues, then the
kernel k(x, y) must be degenerate:

    k(x, y) = Σ_{i=1}^n φ_i(x) φ_i(y)/λ_i.                       (257)

We may demonstrate this as follows: Consider the kernel

    k′(x, y) = k(x, y) − Σ_{i=1}^n φ_i(x) φ_i(y)/λ_i,            (258)

and its integral form

    J′(φ, φ) = ∫∫ k′(x, y) φ(x) φ(y) dx dy.                      (259)

The maximum (and minimum) of this form is zero, since the eigenvalues of
k(x, y) equal the eigenvalues of Σ_{i=1}^n φ_i(x) φ_i(y)/λ_i. Hence
k′(x, y) = 0.
We also have the following expansion theorem for integral transforms with a
symmetric kernel.

Theorem: Every continuous function g(x) that is an integral transform with
symmetric kernel k(x, y) of a piecewise continuous function f(y),

    g(x) = ∫ k(x, y) f(y) dy,                                    (260)

where k(y, x) = k(x, y), can be expanded in a uniformly and absolutely
convergent series in the eigenfunctions of k(x, y):

    g(x) = Σ_{i=1}^∞ g_i φ_i(x),                                 (261)

where g_i = ⟨φ_i|g⟩.
We notice that for series of the form:

    k(x, y) = Σ_{i=1}^∞ φ_i(x) φ_i(y)/λ_i,                       (262)

the theorem is plausible, since

    g(x) = Σ_{i=1}^∞ [φ_i(x)/λ_i] ∫ φ_i(y) f(y) dy = Σ_{i=1}^∞ g_i φ_i(x),   (263)

where g_i = ⟨φ_i|f⟩/λ_i = ⟨φ_i|g⟩, and we should properly justify the
interchange of the summation and the integral. We'll forego a proper proof
of the theorem and consider its application.
We wish to solve the inhomogeneous integral equation:

    g(x) = f(x) − λ ∫_a^b k(x, y) f(y) dy.                       (264)

Suppose that λ is not an eigenvalue, λ ≠ λ_i, i = 1, 2, . . .. Write

    f(x) − g(x) = λ ∫_a^b k(x, y) f(y) dy.                       (265)

Assuming f(y) is at least piecewise continuous (hence, f − g must be
continuous), the expansion theorem tells us that f(x) − g(x) may be expanded
in the absolutely convergent series:

    f(x) − g(x) = Σ_{i=1}^∞ a_i φ_i(x),                          (266)

where

    a_i = ⟨φ_i|f − g⟩
        = λ ∫∫ k(x, y) f(y) φ_i(x) dy dx
        = λ ∫ f(y) dy ∫ k(y, x) φ_i(x) dx
        = (λ/λ_i) ⟨φ_i|f⟩.                                       (267)

Using the first and final lines, we may eliminate ⟨φ_i|f⟩:

    ⟨φ_i|f⟩ = [λ_i/(λ_i − λ)] ⟨φ_i|g⟩,                           (268)

and arrive at the result for the expansion coefficients:

    a_i = [λ/(λ_i − λ)] ⟨φ_i|g⟩.                                 (269)

Thus, we have the solution to the integral equation:

    f(x) = g(x) + λ Σ_{i=1}^∞ φ_i(x) ⟨φ_i|g⟩/(λ_i − λ).          (270)

This solution fails only if λ = λ_i is an eigenvalue, except that it remains
valid even in this case if g(x) is orthogonal to all eigenfunctions
corresponding to λ_i, in which case any linear combination of such
eigenfunctions may be added to solution f.
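As a concrete illustration of Eq. 270 (a sketch assuming NumPy), take the
degenerate kernel k(x, y) = xy on [0, 1] from Section 5.4: its only
eigenfunction is φ_1(x) = √3·x with λ_1 = 3, so for g(x) = x the series has a
single term and must reproduce the closed form f(x) = 3x/(3 − λ):

```python
import numpy as np

lam = 1.2
x = np.linspace(0.0, 1.0, 201)
phi1 = np.sqrt(3.0) * x                 # normalized: ∫_0^1 φ_1² dx = 1
lam1 = 3.0                              # eigenvalue of k(x, y) = xy
g1 = np.sqrt(3.0) / 3.0                 # ⟨φ_1|g⟩ = ∫_0^1 (√3 y)·y dy
f = x + lam * phi1 * g1 / (lam1 - lam)  # Eq. 270, one-term series
print(np.max(np.abs(f - 3.0 * x / (3.0 - lam))))   # ≈ 0
```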

6.1 Resolvent Kernels and Formal Solutions

In the context of the preceding discussion, we may define a resolvent
kernel R(x, y; λ) by:

    f(x) = g(x) + λ ∫_a^b R(x, y; λ) g(y) dy.                    (271)

Then

    R(x, y; λ) = Σ_{i=1}^∞ φ_i(y) φ_i(x)/(λ_i − λ).              (272)

The Fredholm series:

    (1/λ) D(x, y; λ)/D(λ)                                        (273)

is an example of such a resolvent kernel.


Now look at the problem formally: We wish to solve the (operator) equation
f = g + λKf. The solution in terms of the resolvent is

    f = g + λRg = (1 + λR)g.                                     (274)

But we could also have obtained the formal solution:

    f = [1/(1 − λK)] g.                                          (275)

If |λK| < 1 then we have the series solution:

    f = g + λKg + λ²K²g + . . . ,                                (276)

which is just the Neumann series.


What do these formal operator equations mean? Well, they only have meaning
in the context of operating on the appropriate operands. For example,
consider the meaning of |λK| < 1. This might mean that for all possible
normalized functions φ we must have ‖λK‖ < 1, where the ‖ ‖ indicates an
operator norm, given by:

    ‖λK‖ ≡ max_φ | λ ∫∫ k(x, y) φ(x) φ(y) dx dy | < 1.           (277)

By the Schwarz inequality, we have that

    ‖λK‖² ≤ λ² ∫∫ [k(x, y)]² dx dy.                              (278)

The reader is invited to compare this notion with the condition for
convergence of the Neumann series in Whittaker and Watson:

    |λ(b − a)| max_{x,y} |k(x, y)| < 1.                          (279)

6.2 Example

Consider the problem:

    f(x) = sin²x + λ ∫_0^{2π} k(x, y) f(y) dy,                   (280)

with symmetric kernel

    k(x, y) = (1/2π) (1 − α²)/[1 − 2α cos(x − y) + α²],   |α| < 1.   (281)

We look for a solution of the form

    f(x) = g(x) + λ Σ_{i=1}^∞ ⟨φ_i|g⟩ φ_i(x)/(λ_i − λ),          (282)

where g(x) = sin²x. In order to accomplish this, we need to determine the
eigenfunctions of the kernel.

With some inspection, we realize that the constant 1/√(2π) is a (normalized)
eigenfunction. This is because the integral:

    I_0 ≡ ∫_0^{2π} dx/[1 − 2α cos(x − y) + α²]                   (283)

is simply a constant, with no dependence on y. In order to find the
corresponding eigenvalue, we must evaluate I_0. Since I_0 is independent of
y, evaluate it at y = 0:

    I_0 = ∫_0^{2π} dx/(1 − 2α cos x + α²).                       (284)

We turn this into a contour integral on the unit circle, letting z = e^{ix}.
Then dx = dz/iz and 2 cos x = z + 1/z. This leads to:

    I_0 = (i/α) ∮ dz/[z² − (1 + α²)z/α + 1].                     (285)

The roots of the quadratic in the denominator are at z = {α, 1/α}. Thus,

    I_0 = (i/α) ∮ dz/[(z − α)(z − 1/α)].                         (286)

Only the root at α is inside the contour; we evaluate the residue at this
pole, and hence determine that

    I_0 = 2π/(1 − α²).                                           (287)

We conclude that eigenfunction 1/√(2π) corresponds to eigenvalue 1.

We wish to find the rest of the eigenfunctions. Note that if we had not
taken y = 0 in evaluating I_0, we would have written:

    I_0 = (ie^{iy}/α) ∮ dz/[(z − αe^{iy})(z − e^{iy}/α)],        (288)

and the relevant pole is at αe^{iy}. We thence notice that we know a whole
class of integrals:

    [(1 − α²)/2π] (ie^{iy}/α) ∮ zⁿ dz/[(z − αe^{iy})(z − e^{iy}/α)] = αⁿ e^{iny},   n ≥ 0.   (289)

Since zⁿ = e^{inx}, we have found an infinite set of eigenfunctions, and
their eigenvalues. But we should investigate the negative powers as well; we
didn't include them here so far because they yield an additional pole, at
z = 0. We wish to evaluate:

    I_n ≡ (ie^{iy}/α) ∮ dz/[zⁿ (z − αe^{iy})(z − e^{iy}/α)],   n ≥ 0.   (290)

The residue at pole z = αe^{iy} is [α^{1−n}/(α² − 1)] e^{−iny}. We need also
the residue at z = 0. It is coefficient A_{−1} in the expansion:

    e^{iy}/[zⁿ (z − αe^{iy})(z − e^{iy}/α)] = Σ_{j=−∞}^∞ A_j z^j.   (291)

After some algebra, we find that

    e^{iy}/[zⁿ (z − αe^{iy})(z − e^{iy}/α)] = [α/(1 − α²)] Σ_{j=0}^∞ z^{j−n} e^{−i(j+1)y} [ (1/α)^{j+1} − α^{j+1} ].   (292)

The j = n − 1 term will give us the residue at z = 0:

    A_{−1} = [α/(1 − α²)] e^{−iny} [ (1/α)ⁿ − αⁿ ].              (293)

Thus,

    I_n = [2π/(1 − α²)] αⁿ e^{−iny}.                             (294)

We summarize the result: The normalized eigenfunctions are
φ_n(x) = e^{inx}/√(2π), with eigenvalues λ_n = α^{−|n|}, for
n = 0, ±1, ±2, . . ..
Finally, it remains to calculate:

    f(x) = sin²x + λ Σ_n ⟨φ_n|sin²x⟩ φ_n(x)/(λ_n − λ)
         = sin²x + λ [ (√(2π)/2) φ_0(x)/(1 − λ) − (√(2π)/4) (φ_2(x) + φ_{−2}(x))/(α^{−2} − λ) ]
         = sin²x + (λ/2) [ 1/(1 − λ) − α² cos 2x/(1 − λα²) ].    (295)

Note that if λ = 1 or λ = α^{−2} then there is no solution. On the other
hand, if λ = α^{−|n|} is one of the other eigenvalues (n ≠ 0, ±2), then the
above is still a solution, but it is not unique, since we can add any linear
combination of φ_n(x) and φ_{−n}(x) and still have a solution.
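The pieces of this example can all be checked numerically (a sketch assuming
NumPy; the grid size n and the values of α and λ are arbitrary choices, with
λ kept away from the eigenvalues):

```python
import numpy as np

alpha, lam, n = 0.5, 0.3, 512
x = np.arange(n) * 2.0 * np.pi / n           # periodic grid on [0, 2π)
w = 2.0 * np.pi / n                          # uniform quadrature weights
K = (1.0 - alpha**2) / (2.0 * np.pi) \
    / (1.0 - 2.0 * alpha * np.cos(x[:, None] - x[None, :]) + alpha**2)

# Eigenfunction check: K acting on e^{2ix} should give α² e^{2ix} (λ_2 = α^{-2}).
err1 = np.max(np.abs(w * (K @ np.exp(2j * x)) - alpha**2 * np.exp(2j * x)))
print(err1)   # ≈ 0

# Solution check: Eq. 295 should satisfy f = sin²x + λ ∫ k f dy (Eq. 280).
g = np.sin(x) ** 2
f = g + 0.5 * lam * (1.0 / (1.0 - lam)
                     - alpha**2 * np.cos(2.0 * x) / (1.0 - lam * alpha**2))
err2 = np.max(np.abs(g + lam * w * (K @ f) - f))
print(err2)   # ≈ 0
```

The uniform-grid quadrature is spectrally accurate for this smooth periodic
kernel, so both residuals are essentially at machine precision.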

Exercises
1. Given an abstract complex vector space (linear space), upon which we
   have defined a scalar product (inner product):

       ⟨a|b⟩                                                     (296)

   between any two vectors a and b, prove the Schwarz inequality:

       |⟨a|b⟩|² ≤ ⟨a|a⟩⟨b|b⟩.                                    (297)

   Give the condition for equality to hold.
   One way to approach the proof is to consider the fact that the projection
   of a onto the subspace which is orthogonal to b cannot have a negative
   length, where we define the length (norm) of a vector according to:

       ‖c‖ ≡ √⟨c|c⟩.                                             (298)

   Further, prove the triangle inequality:

       ‖a + b‖ ≤ ‖a‖ + ‖b‖.                                      (299)

2. Considering our RC circuit example, derive the results in Eqn. 31
   through Eqn. 35 using the Fourier transform.

3. Prove the convolution theorem.
4. We showed that the Fourier transform of a Gaussian was also a Gaussian
   shape. That is, let us denote a Gaussian of mean μ and standard
   deviation σ by:

       N(x; μ, σ) = [1/(σ√(2π))] exp[−(x − μ)²/(2σ²)].           (300)

   (a) In class we found (in an equivalent form) that the Fourier transform
       of a Gaussian of mean zero was:

           Ñ(y; 0, σ) = (1/√(2π)) exp(−σ²y²/2).                  (301)

       Generalize this result to find the Fourier transform of N(x; μ, σ).

   (b) The experimental resolution function of many measurements is
       approximately Gaussian in shape (in probability & statistics we'll
       prove the Central Limit Theorem). Often, there is more than one
       source of uncertainty contributing to the final result. For example,
       we might measure a distance in two independent pieces, with means
       μ_1, μ_2 and standard deviations σ_1, σ_2. The resolution function
       (sampling distribution) of the final result is then the convolution
       of the two pieces:

           P(x; μ_1, σ_1, μ_2, σ_2) = ∫ N(y; μ_1, σ_1) N(x − y; μ_2, σ_2) dy.   (302)

       Do this integral to find P(x; μ_1, σ_1, μ_2, σ_2). Note that it is
       possible to do so by straightforward means, though it is a bit
       tedious. You are asked here to instead use Fourier transforms to (I
       hope!) obtain the result much more easily.
5. The Gaussian integral is:

       [1/(σ√(2π))] ∫_{−∞}^∞ exp[−(x − μ)²/(2σ²)] dx = 1.        (303)

   Typically, the constants μ and σ² are real. However, we have encountered
   situations where they are complex. Determine the domain in (μ, σ²) for
   which this integral is valid. Try to do a careful and convincing
   demonstration of your answer.
6. In class we considered the three-dimensional Fourier transform of
   e^{−μr}/r, where r = √(x² + y² + z²). What would the Fourier transform
   be in two dimensions (i.e., in a two-dimensional space with
   r = √(x² + y²))?

7. The lowest P-wave hydrogen wave function in position space may be
   written:

       ψ(x) = [1/√(32π a_0⁵)] r cos θ exp(−r/2a_0),              (304)

   where r = √(x² + y² + z²), θ is the polar angle with respect to the z
   axis, and a_0 is a constant. Find the momentum-space wave function for
   this state (i.e., find the Fourier transform of this function).
   In this and all problems in this course, I urge you to avoid look-up
   tables (e.g., of integrals). If you do feel the need to resort to tables,
   however, be sure to state your source.
8. In section 2.2.1, we applied the Laplace transform method to determine
   the response of the RC circuit:

   [Figure: an RC circuit with input voltage V(t), resistors R_1 and R_2,
   and output voltage V_C(t) across the capacitor.]

   to an input voltage V(t) which was a delta function. Now determine
   V_C(t) for a pulse input. Model the pulse as the difference between two
   exponentials:

       V(t) = A ( e^{−t/τ_1} − e^{−t/τ_2} ).                     (305)
9. In considering the homogeneous integral equation, we stated the theorem
   that there are a finite number of eigenfunctions for any given
   eigenvalue. We proved this for real functions; now generalize the proof
   to complex functions.
10. Give a graphical proof that the series D(λ) and D(x, y; λ) in the
    Fredholm solution are polynomials of degree n if the kernel is of the
    degenerate form:

        k(x, y) = Σ_{i=1}^n α_i(x) β_i(y).                       (306)

11. Solve the following equation for u(t):

        d²u/dt² (t) + ∫_0^1 sin[k(s − t)] u(s) ds = a(t),        (307)

    with boundary conditions u(0) = u′(0) = 0, where a(t) is a given
    function.


12. Prove that an n-term degenerate kernel possesses at most n distinct
eigenvalues.
13. Solve the integral equation:

        f(x) = eˣ + λ ∫_0^x [(1 + y)/x] f(y) dy.                 (308)

    Hint: If you need help solving a differential equation, have a look at
    Mathews and Walker chapter 1.
14. In section 5.2.1 we developed an algorithm for the numerical solution
    of Volterra's equation. Apply this method to the equation:

        f(x) = x + ∫_0^x e^{x−y} f(y) dy.                        (309)

    In particular, estimate f(1), using one, two, and three intervals
    (i.e., N = 1, N = 2, and N = 3). [We're only doing some low values so
    you don't have to develop a lot of technology to do the computation,
    but going to high enough N to get a glimpse at the convergence.]
15. Another method we discussed in section 3 is the extension to the
    Laplace transform in Laplace's method for solving differential
    equations. I'll summarize here: We are given a differential equation of
    the form:

        Σ_{k=0}^n (a_k + b_k x) f^{(k)}(x) = 0.                  (310)

    We assume a solution of the form:

        f(x) = ∫_C F(s) e^{sx} ds,                               (311)

    where C is chosen depending on the problem. Letting

        U(s) = Σ_{k=0}^n a_k s^k,                                (312)
        V(s) = Σ_{k=0}^n b_k s^k,                                (313)

    the formal solution for F(s) is:

        F(s) = [A/V(s)] exp [ ∫^s U(s′)/V(s′) ds′ ],             (314)

    where A is an arbitrary constant.
    A differential equation that arises in the study of the hydrogen atom
    is the Laguerre equation:

        x f″(x) + (1 − x) f′(x) + λ f(x) = 0.                    (315)

    Let us attack the solution to this equation using Laplace's method.

    (a) Find F(s) for this differential equation.
    (b) Suppose that λ = n = 0, 1, 2, . . .. Pick an appropriate contour,
        and determine f_n(x).
16. Write the diagram, with coefficients, for the fifth-order numerator and
    denominator of the Fredholm expansion.

17. Solve the equation:

        f(x) = sin x + λ ∫ cos x sin y f(y) dy                   (316)

    for f(x). Find any eigenvalues and the corresponding eigenfunctions.
    Hint: This problem is trivial!
18. Find the eigenvalues and eigenfunctions of the kernel:

        k(x, y) = (1/2) log [ sin((x + y)/2) / sin(|x − y|/2) ]
                = Σ_{n=1}^∞ sin nx sin ny / n,   0 ≤ x, y ≤ π.   (317)

19. In the notes we considered the kernel:

        k(x, y) = (1/2π) (1 − α²)/[1 − 2α cos(x − y) + α²],      (318)

    where |α| < 1 and 0 ≤ x, y ≤ 2π. Solve the integral equation

        f(x) = eˣ + λ ∫_0^{2π} k(x, y) f(y) dy                   (319)

    with this kernel. What happens if λ is an eigenvalue? If your solution
    is in the form of a series, does it converge?
20. Solve for f(x):

        f(x) = x + ∫_0^x (y − x) f(y) dy.                        (320)

    This problem can be done in various ways. If you happen to obtain a
    series solution, be sure to sum the series.

21. We wish to solve the following integral equation for f(x):

        f(x) = g(x) − λ ∫_0^x f(y) dy,                           (321)

    where g(x) is a known, real, continuous function with continuous first
    derivative, and satisfies g(0) = 0.

    (a) Show that this problem may be re-expressed as a differential
        equation with suitable boundary condition, which may be written in
        operator form as Lf = g′. Give L explicitly and show that it is a
        linear operator.
    (b) Suppose that G(x, y) is the solution of LG = δ(x − y), where δ(x)
        is the Dirac δ function. Express the solution to the original
        problem in the form of an integral transform involving G and g′.
    (c) Find G(x, y) and write down the solution for f(x).
22. Some more Volterra equations: Solve for f(x) in the following two
    cases:

    (a) f(x) = sin x + cos x + ∫_0^x sin(x − y) f(y) dy,
    (b) f(x) = eˣ + 2x + ∫_0^x e^{y−x} f(y) dy.
23. Consider the LCR circuit in Fig. 4:

    [Figure 4: An LCR circuit, with input voltage V(t), inductance L,
    capacitance C, resistance R, and output voltage V_0(t) taken across R.]

    Use the Laplace transform to determine V_0(t) given

        V(t) = { V,  0 < t < T,
               { 0,  otherwise.                                  (322)

    Make a sketch of V_0(t) for (a) 2RC > √(LC); (b) 2RC < √(LC);
    (c) 2RC = √(LC).

24. The radioactive decay of a nucleus is a random process in which the
    probability of a decay in time interval (t, t + dt) is independent of
    t, if the decay has not already occurred. This leads to the familiar
    exponential decay law (as you may wish to convince yourself): If at
    time t there are N(t) nuclei, then the rate of decays is proportional
    to N(t):

        dN/dt (t) = −λ N(t).

    Integrating, we find N(t):

        N(t) = N(0) e^{−λt}.

    In practice, radioactive decays often occur in long chains. For (a
    simplified) example, ²³⁸U decays via α-emission to ²³⁴Th with a
    half-life of 4.5 × 10⁹ y; ²³⁴Th decays in a subchain with two
    β-emissions to ²³⁴U with a half-life of 24 d; ²³⁴U decays via
    α-emission to ²³⁰Th with a 2.4 × 10⁵ y half-life; etc. We may use the
    method of Laplace transforms to determine how the abundance of any
    species of nucleus in such a chain evolves with time.
    Thus, suppose that we have a decay chain A → B → C → D, where D is
    stable, and the decay rates for A, B, and C are λ_A, λ_B, and λ_C,
    respectively. Suppose N_B(0) = 0. Determine N_C(t) as a function of
    the rates and the initial abundances N_A(0) and N_C(0).
    You are supposed to approach this problem by setting up a system of
    differential equations for the abundances, and then using Laplace
    transforms to solve the equations. Note, in setting up your
    differential equations, that the rate of change in abundance for an
    intermediate nucleus in the chain gets a contribution from the nucleus
    decaying into it as well as from its own decay rate.
25. Solve the following integral equations for f(x):

    (a)
        f(x) = eˣ + λ ∫_0^{2π} xy f(y) dy.                       (323)

    (b)
        f(x) = λ ∫ sin(x − y) f(y) dy.                           (324)

    For both parts, what happens for different values of λ?


26. Consider the following simple integral equation:

        f(x) = x² + λ ∫_0^1 xy f(y) dy.                          (325)

    (a) Find the Neumann series solution, to order λ². For what values of
        λ do you expect the Neumann series to be convergent? If you aren't
        sure from what you have done so far, try doing the rest of the
        problem and come back to this.
    (b) Find the Fredholm series solution, to order λ².
    (c) This is a degenerate kernel, so find the solution according to our
        method for degenerate kernels.
