
Lecture Notes for

MAS204: CALCULUS III


M.A.H. MacCallum
These notes are based on independent previous versions by M.J.
Thompson, with figures by C.D. Murray, and by P. Saha.
School of Mathematical Sciences,
Queen Mary, University of London
September-December 2007
It is one of the most unnatural features of science that the abstract language of mathematics
should provide such a powerful tool for describing the behaviour of systems both inanimate, as
in physics, and living, as in biology. Why the world should conform to mathematical descriptions
is a deep question. Whatever the answer, it is astonishing.
Lewis Wolpert (1992)
Students on the course MAS204 Calculus III are welcome to download, print and photocopy these notes,
in whole or in part, for their personal use. The notes are intended to supplement rather than to replace the
lectures.
© M.A.H. MacCallum 2007, 2006, 2001, P. Saha 2005, M. J. Thompson 1999.
Chapter 1
Introductory material
This chapter gives a quick review of some of those parts of the prerequisite courses (Calculus I and II and
Geometry I) which we will actually use, adding some extra material. Those parts which are revision will be
done with few examples.
1.1 Curves and surfaces
We shall need a number of geometrical shapes for use in examples, so we need the equations for them. The
main ones are so-called conic sections in two dimensions, and related three-dimensional surfaces. First we
discuss curves in two dimensions.
You should already know that

    x^2 + y^2 = a^2                                          (1.1)

is the equation of a circle centred at the origin, (0, 0), with radius a (Thomas, 1.2). Given

    x^2 + 6x + y^2 + 8y = 0

we can carry out a process called completing the square to write it as

    (x + 3)^2 + (y + 4)^2 = 25

which we now recognize as a circle of radius 5, centre (−3, −4): this circle passes through the origin. Similar
methods can be used to recognize the other standard curves if they are given relative to origins different from
the ones used in the most standard forms below (cf. Thomas 1.5).
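Completing the square is mechanical enough to sketch in code. The following Python fragment is an illustrative aside, not part of the notes (the function name is ours); it recovers the centre and radius from the general form x^2 + Dx + y^2 + Ey + F = 0.

```python
import math

def circle_from_general(D, E, F):
    """Complete the square in x^2 + D*x + y^2 + E*y + F = 0,
    returning the centre and radius of the circle it describes."""
    # (x + D/2)^2 + (y + E/2)^2 = D^2/4 + E^2/4 - F
    cx, cy = -D / 2, -E / 2
    r2 = D * D / 4 + E * E / 4 - F
    if r2 < 0:
        raise ValueError("equation has no real solutions")
    return (cx, cy), math.sqrt(r2)

# The example in the text: x^2 + 6x + y^2 + 8y = 0
centre, radius = circle_from_general(6, 8, 0)
print(centre, radius)   # (-3.0, -4.0) 5.0
```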
The equation

    x^2/a^2 + y^2/b^2 = 1                                    (1.2)

is the equation of an ellipse (a sort of squashed circle), centred at (0, 0) and with semi-axes a and b
(Thomas 1.5).
The equation

    y = x^2                                                  (1.3)

is the simplest form of the equation of a parabola. A somewhat more general form (to which all others can
be transformed by change of coordinates) is

    y = ax^2 + b.
One standard form of the equation of a hyperbola is

    y^2 − x^2 = a^2,                                         (1.4)

or more generally

    cy^2 − kx^2 = a^2,                                       (1.5)

where c > 0, k > 0. Another is xy = b^2. One can see the relationship between xy = b^2 and (1.4) by noting
that (y + x)^2 − (y − x)^2 = 4xy, so xy = b^2 is using axes at 45° to those used in (1.4) (and 4b^2 = a^2).
Thomas 10.6 and 10.7 discuss a number of other curves.
A curve is a one-dimensional object, and we often want to be able to label points on it by a one-
dimensional variable, i.e. to write the points as (x(t), y(t)) for some parameter t (see Thomas 3.5). Later we
shall see examples where x(t) and y(t) are given explicitly. We shall especially need such parametrizations to
calculate integrals along curves. If a given x gives a unique y, x itself may do as a parameter but it may not be
the best or most convenient. For the circle (1.1) the usual parametrization is by the angular variable θ of polar
coordinates:

    (x, y) = (a cos θ, a sin θ).                             (1.6)

This relies on cos^2 θ + sin^2 θ = 1, and thus has a simple variant to deal with the ellipse (1.2):

    (x, y) = (a cos θ, b sin θ).                             (1.7)
Noting the analogy between (1.1) and the form (1.4), we may be able to guess that

    (x, y) = (a cosh θ, a sinh θ)

gives a parametrization of a hyperbola, using cosh^2 θ − sinh^2 θ = 1; it satisfies x^2 − y^2 = a^2, i.e. the
form (1.4) with the roles of x and y interchanged (this is not the only parametrization possible: see Thomas 10.4).
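These parametrizations are easy to verify numerically. The following Python check is an aside, not part of the notes; the sample values of a, b and θ are arbitrary.

```python
import math

a, b, theta = 2.0, 3.0, 0.7    # arbitrary sample values

# Circle (1.1): (a cos t, a sin t) satisfies x^2 + y^2 = a^2
x, y = a * math.cos(theta), a * math.sin(theta)
assert math.isclose(x**2 + y**2, a**2)

# Ellipse (1.2): (a cos t, b sin t) satisfies x^2/a^2 + y^2/b^2 = 1
x, y = a * math.cos(theta), b * math.sin(theta)
assert math.isclose(x**2 / a**2 + y**2 / b**2, 1.0)

# Hyperbola: (a cosh t, a sinh t) satisfies x^2 - y^2 = a^2 (one branch)
x, y = a * math.cosh(theta), a * math.sinh(theta)
assert math.isclose(x**2 - y**2, a**2)

print("all parametrizations check out")
```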
One of the simplest three-dimensional surfaces is that of a sphere of radius a centred at the origin:

    x^2 + y^2 + z^2 = a^2.                                   (1.8)

To parametrize a surface, which is two-dimensional, we need to write it in terms of two variables, e.g.
(x(u, v), y(u, v), z(u, v)) (see Thomas 16.6). The sphere has an angular variable parametrization, where
(u, v) = (θ, φ):

    (x, y, z) = (a sin θ cos φ, a sin θ sin φ, a cos θ)      (1.9)

which in fact is using spherical polars (which we shall meet again when we discuss curvilinear coordinates).
Example 1.1. What is the surface x^2 + y^2 + z^2 = a^2, x ≥ 0?
The hemisphere to the right of the plane x = 0.
A second important surface which is also based on generalizing the circle in fact has the same equation as
the circle, (1.1), but in three dimensions this means one circle for each value of z, so we have a cylinder. Very
often some bounding values of z are given, e.g. 0 ≤ z ≤ 2. The cylinder can be parametrized by two variables
θ and z, using (1.6) to give x and y: this uses cylindrical polars, which we shall also meet again later (see
Thomas 16.6).
We can also define an ellipsoid, by generalizing (1.2) to

    x^2/a^2 + y^2/b^2 + z^2/c^2 = 1                          (1.10)
(see Thomas, 12.6) and an (elliptic) paraboloid by generalizing (1.3) to

    z/c = x^2/a^2 + y^2/b^2.                                 (1.11)

Note that if we take the intercept of (1.11) with the plane z = 2 (say) we get an ellipse with varying x and
y, while if we choose the intercept with a plane y = constant we get a parabola with varying x and z. These
shapes, and the ones that follow, are shown in diagrams 12.48-12.52 in Thomas.
Similarly we can obtain an (elliptic) hyperboloid as

    1 = y^2/a^2 + x^2/b^2 − z^2/c^2.                         (1.12)

Note that in this case we have ellipses in planes z = constant, and hyperbolae in the planes x = 0 and y = 0.
This surface has just one piece (we say one sheet). The surface

    1 = −y^2/a^2 − x^2/b^2 + z^2/c^2                         (1.13)

has two parts (one in z ≤ −c and one in z ≥ c), so it has two sheets.
The elliptic paraboloid and hyperboloid have circular special cases where a = b. Note also that we can
swap x, y and z around in these forms so we have different choices of axes for the same shapes.
Finally note that in general one can describe a surface by an equation V(x, y, z) = 0.
1.2 Trigonometric functions
1.2.1 Values
(See Thomas 1.6)
We can quickly estimate the value of a trigonometric function for any argument in [0, 2π] by remembering a
few things. First we have the table

            0     30° = π/6     45° = π/4     60° = π/3     90° = π/2
    cos     1     √3/2          1/√2          1/2           0
    sin     0     1/2           1/√2          √3/2          1
Then we remember that sin and cos swap places when we shift by an odd multiple of π/2 (90°), and they
do not swap places but do swap sign when we shift by a multiple of π (180°). More precisely, we have the
equations

    cos(−x) = cos x,         sin(−x) = −sin x,
    cos(π/2 − x) = sin x,    sin(π/2 − x) = cos x,
    cos(π − x) = −cos x,     sin(π − x) = sin x.

These identities enable us to relate the value we want to a value in the first quadrant (i.e. the range [0, π/2]).
To get the sign we remember the table

    Radians       Degrees        sin   cos   tan   Positive functions
    0 – π/2       0° – 90°        +     +     +    All
    π/2 – π       90° – 180°      +     −     −    Sin
    π – 3π/2      180° – 270°     −     −     +    Tan
    3π/2 – 2π     270° – 360°     −     +     −    Cos

sometimes called the Add Sugar To Coffee rule; or use Thomas's variant, All Students Take Calculus.
(Note: to be entirely accurate we should have special rows in this table for the values π/2 etc. because at those
points one or more of the functions will be zero or unbounded.)
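The quadrant rule lends itself to a short function. This Python sketch is our own illustration, not part of the notes: it reduces an angle modulo 2π and reads off which functions are positive in that quadrant.

```python
import math

def positive_trig(theta):
    """Which of sin, cos, tan are positive for angle theta (radians),
    following the All-Sin-Tan-Cos quadrant rule from the table above."""
    t = theta % (2 * math.pi)                    # reduce to [0, 2*pi)
    quadrant = int(t // (math.pi / 2)) % 4       # 0..3 for the four quadrants
    return ["All", "Sin", "Tan", "Cos"][quadrant]

print(positive_trig(math.radians(30)))    # All
print(positive_trig(math.radians(120)))   # Sin
print(positive_trig(math.radians(200)))   # Tan
print(positive_trig(math.radians(300)))   # Cos
```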
1.2.2 Identities for the trigonometric functions
The most important formulae to remember are

    sin^2 A + cos^2 A = 1
    cos(A + B) = cos A cos B − sin A sin B
    sin(A + B) = sin A cos B + cos A sin B.

If you have trouble remembering which of the last two is which, and which has the minus in it, try substituting
some special values such as A = 0 or B = π/2 and checking the result.
The double/half angle cases

    sin 2x = 2 sin x cos x
    cos 2x = cos^2 x − sin^2 x = 2 cos^2 x − 1 = 1 − 2 sin^2 x
    cos^2 x = ½(1 + cos 2x)
    sin^2 x = ½(1 − cos 2x)

come up often.
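A quick numerical spot-check of these identities, at a few arbitrary sample angles (an aside, not part of the notes):

```python
import math

for x in [0.3, 1.1, 2.5, -0.8]:
    # sin 2x = 2 sin x cos x
    assert math.isclose(math.sin(2*x), 2*math.sin(x)*math.cos(x))
    # cos 2x = cos^2 x - sin^2 x = 2 cos^2 x - 1 = 1 - 2 sin^2 x
    assert math.isclose(math.cos(2*x), math.cos(x)**2 - math.sin(x)**2)
    assert math.isclose(math.cos(2*x), 2*math.cos(x)**2 - 1)
    assert math.isclose(math.cos(2*x), 1 - 2*math.sin(x)**2)
    # the half-angle forms
    assert math.isclose(math.cos(x)**2, (1 + math.cos(2*x)) / 2)
    assert math.isclose(math.sin(x)**2, (1 - math.cos(2*x)) / 2)

print("all identities hold at the sample points")
```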
Using the basic identities we can easily derive plenty more, such as

    sec^2 A = 1 + tan^2 A
    cos C + cos D = 2 cos ½(C + D) cos ½(C − D).
We should also note (see Thomas 3.4)

    d(sin x)/dx = cos x,      d(cos x)/dx = −sin x.

Both sin and cos therefore obey

    d^2 y/dx^2 = −y.

Those who have done applied maths at A-level or later may recognize this as an equation for simple harmonic
motion. Later in the course we shall see how we could start from this and get a definition of sin and cos, in a
way we then follow to define other functions.
1.3 Ln, or log_e, exp, and hyperbolic functions
(See Thomas section 7.2)
The natural logarithm ln x can be defined as

    ln x = ∫_1^x dt/t.

This implies ln 1 = 0. Note that this is not a good definition if x < 0, but it is easy to show that for negative u,
∫ du/u = ln |u| + constant. The number e is then defined by ln e = 1. From the definition it is obvious that

    d(ln x)/dx = 1/x.
One also finds:

    ln(f g) = ln f + ln g

(Thomas gives other results that follow from this, e.g. for f = 1/g etc.) and

    ln x^r = r ln x, for constant r.

Note that ln(a + b) ≠ ln a + ln b (unless a + b = ab).
We can define exp (see Thomas 7.3) to be the inverse function to ln, so that exp(ln x) = ln(exp x) = x.
Then exp 1 = e and exp r = e^r. Note that exp(a + b) = e^a e^b, NOT e^a + e^b. For any number a, a = e^(ln a) and
hence a^x = e^(x ln a). In particular, this enables us to relate the usual logarithms (base 10) to natural logarithms,
since if x = log y, then y = 10^x = e^(x ln 10), so ln y = x ln 10 and x = ln y/ln 10. For a general a we can define log_a y = x
to be such that y = a^x, so ln y = x ln a, i.e. log_a y = ln y/ln a.
One can show that

    d(exp x)/dx = exp x.
We can now use e^x to define the hyperbolic functions (see Thomas 7.8)

    cosh x = ½(e^x + e^(−x)),    sinh x = ½(e^x − e^(−x)).

These functions have identities and derivative properties that run closely parallel to those of sin and cos. If
you know the trigonometric identities, the identities for hyperbolic functions can be recovered by substituting
cosh for cos and i sinh for sin, where i^2 = −1.
From differentiating e^x we find

    d(sinh x)/dx = cosh x,    d(cosh x)/dx = sinh x.

Both sinh and cosh therefore obey

    d^2 y/dx^2 = y

(as do e^x and e^(−x)).
1.4 Vectors
(Note: this is in chapter 12 in Thomas but in Geometry I you used Hirst)
Vectors can be introduced as displacements in space, called position vectors. To describe a position vector,
we need to specify its direction and its length or magnitude (to say how far we go in the given direction).
This is a geometric definition. A vector is different from a scalar, a quantity which has only a magnitude.
One can draw a vector as an arrow of the appropriate length. Vectors are usually notated in print by
boldface type, e.g. a, and in handwriting by under- or over-lining, such as a̲ or a⃗.
To define a vector algebraically, i.e. in a formula, we can use the Cartesian coordinates of the point to
which it displaces the origin, e.g.

    r = (x, y, z).                                           (1.14)

(In this course we shall use only this row vector form.) x, y and z are then referred to as the components of r.
We may refer to (x, y, z) as the point r. From now on I shall use the notation r only for this vector.
The length of a vector v, which is a scalar, is denoted by |v| or sometimes just v. r has length

    √(x^2 + y^2 + z^2),

by Pythagoras' theorem in 3 dimensions.
To add vectors a and b we simply take the displacement obtained by displacing first by a and then by b
(the result can be defined as the diagonal of the parallelogram with sides a and b). In components this says
that v = (x_1, y_1, z_1) and w = (x_2, y_2, z_2) have the sum

    v + w = (x_1 + x_2, y_1 + y_2, z_1 + z_2).
Subtraction can then be defined similarly. The zero vector 0 is the one with zero magnitude (and no well-
defined direction!).
It is now easy to arrive at the following rules for addition and subtraction of displacements.
For any vectors a, b and c,

    a + b = b + a,    (a + b) + c = a + (b + c),    ∃ 0 such that a + 0 = a.
    Given a, ∃ (−a) such that a + (−a) = 0.

These rules are purely abstract and make no reference to displacements or three dimensions, and are part
of the general definition of a vector space which is given in Linear Algebra I. Those who have encountered
groups will recognise that they ensure that the space of vectors is an additive group under vector addition.
The displacement from a point r_1 to a point r_2 is r_2 − r_1.
We can multiply a vector by a scalar (a number) simply by multiplying its magnitude, preserving the
direction. In components we will have

    λv = (λx, λy, λz).

This operation also obeys some very simple rules. For any vectors a and b, and numbers λ and μ, we have

    λ(a + b) = λa + λb,    (λ + μ)a = λa + μa,    (λμ)a = λ(μa)

and 1a = a. For a general vector space, as defined in Linear Algebra I, the scalars are elements of a general
field, but here we shall only use the real numbers R. However, these rules again apply when λ and μ are
elements of a general field.
This gives us a way to define the unit vector (the vector of length 1) in the same direction as v, denoted by
v̂, by v̂ ≡ v/|v| (strictly we should write the number first, so we would have to write (1/|v|)v, but in practice
it's obvious what we mean).
These rules give us another common way of writing a vector. We note that we can arrive at the same total
displacement by first moving along the x-axis and then along y and z, and we can express this using the unit
length vectors i, j and k along the directions of the three axes by

    r = xi + yj + zk.

This way of writing (1.14) has the advantage of making it clearer how the components change if we change
our choice of axes, i.e. of i, j and k.
Note that all of these statements about position vectors in 3 dimensions can very simply be applied in 2
dimensions also, with obvious minor changes.
Although we have motivated vectors by introducing them as displacements, they can represent, or be
interpreted as, many other things: for example, a force, a velocity, inputs and outputs in an economic model,
and so on.
An expression of the type

    r = p + sq,    −∞ < s < ∞                                (1.15)

defines a line through p in direction q. In particular r = sk, −∞ < s < ∞, is the z axis, while a line going
through r_1 and r_2 is r = r_1 + s(r_2 − r_1).
Example 1.2. Medians of a triangle
Vectors can often be used to derive geometrical results very concisely, as this example shows.
Let a, b, c be the corners of a triangle. The midpoint of the side connecting b and c will be ½(b + c). A
line through this last point and a is

    r = a + s(½b + ½c − a),    −∞ < s < ∞,

and is called the median through a. Putting s = 2/3 (note: here this choice is a rabbit out of the hat but we can
find it by writing down a second median and solving for the intersection point) we get the point ⅓(a + b + c).
Since this point is symmetric in a, b, c, the medians through b and c will also pass through it. Hence the
medians of a triangle intersect.
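Example 1.2 can be verified numerically: for any sample triangle, the point two-thirds of the way along each median is the same point ⅓(a + b + c). A Python sketch (the corner values below are arbitrary choices of ours):

```python
import math

a = (0.0, 0.0, 0.0)
b = (4.0, 0.0, 0.0)
c = (1.0, 3.0, 0.0)

def point_on_median(corner, opp1, opp2, s):
    """r = corner + s*(midpoint(opp1, opp2) - corner), as in Example 1.2."""
    mid = tuple((p + q) / 2 for p, q in zip(opp1, opp2))
    return tuple(p + s * (m - p) for p, m in zip(corner, mid))

def close(u, v):
    return all(math.isclose(p, q, abs_tol=1e-12) for p, q in zip(u, v))

centroid = tuple((p + q + r) / 3 for p, q, r in zip(a, b, c))
# s = 2/3 on each of the three medians reaches the same point (a + b + c)/3
assert close(point_on_median(a, b, c, 2/3), centroid)
assert close(point_on_median(b, c, a, 2/3), centroid)
assert close(point_on_median(c, a, b, 2/3), centroid)
print("all three medians pass through", centroid)
```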
If we write out the components of (1.15), with notation r = (x, y, z), p = (p_1, p_2, p_3), q = (q_1, q_2, q_3), we
find

    x = p_1 + sq_1,    y = p_2 + sq_2,    z = p_3 + sq_3,

from which we can eliminate s to get

    (x − p_1)/q_1 = (y − p_2)/q_2 = (z − p_3)/q_3,

giving the two independent linear equations (e.g. for y and z in terms of x) needed for a line in three-
dimensional space.
We can now write functions of position as functions f(r). Equations of the form f(r) = constant define
surfaces, the constant surfaces of f. A simple example is r^2 = 1, which is a sphere of unit radius centred at
the origin. (Recall our notation allows r ≡ |r|.)
Example 1.3. A sphere
The geometrical interpretation of

    |r − k| = 1

as a sphere of unit radius centred at (0, 0, 1) is obvious. Equivalent expressions are x^2 + y^2 + (z − 1)^2 = 1 and
x^2 + y^2 + z^2 − 2z = 0.
Warning: One of the commonest errors made by students is to confuse vectors and scalars, in particular
to start adding together the components of a vector. The vector (3, 1, 2) is not the same as the scalar 6. This
may seem obvious now, but the mistake is more easily made when using basis vectors like i, j and k; then it
somehow seems to be easier to think 3i + j + 2k = 6.
1.5 Scalar and vector products
We have defined vector addition and subtraction, but not multiplication of vectors. This is more complicated
because to obtain another vector we need to define both a magnitude and a direction (and in general division
cannot be defined at all).
We first define a product whose result is not a vector but a scalar. For vectors v and w, this is defined by

    v.w ≡ |v||w| cos θ,                                      (1.16)

where θ is the angle between v and w. An alternative definition in terms of the components (v_1, v_2, v_3) and
(w_1, w_2, w_3) of v and w is

    v.w ≡ v_1 w_1 + v_2 w_2 + v_3 w_3 = Σ_{i=1}^{3} v_i w_i.
One can prove that the two definitions are the same by applying Pythagoras' theorem to a triangle con-
structed as follows. Take sides v, w and v + w. Draw the perpendicular from the endpoint of v + w to the line in direction v.
It has height |w| sin θ and meets the direction v at a distance |v| + |w| cos θ. Now write out Pythagoras with
the lengths in terms of |v|, |w| and θ, and again in terms of components, and compare the results. The details
are left as an exercise (if you have trouble, look in the online notes for MAS114 Geometry I or in A.E. Hirst,
Vectors in 2 or 3 dimensions, Arnold 1995, chapter 3).
Because of the notation used in (1.16) this is often called the dot product, and in a more abstract setting
(such as in Linear Algebra I) may also be called the inner product.
We note in particular that two vectors v and w are perpendicular (θ is a right angle) if and only if v.w = 0.
Example 1.4. (This example was used in Geometry I.)
Find cos θ where θ is the angle between v = (1, 3, −1) and w = (2, 2, 1).

    v.w = 1·2 + 3·2 + (−1)·1 = 7 = |v||w| cos θ,
    |v|^2 = 1^2 + 3^2 + (−1)^2 = 11,
    |w|^2 = 2^2 + 2^2 + 1^2 = 9, so

    cos θ = 7/(√11 √9) = 7/(3√11).
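The same calculation in code (an aside, not part of the notes; the helper names are ours):

```python
import math

def dot(v, w): return sum(a * b for a, b in zip(v, w))
def norm(v):   return math.sqrt(dot(v, v))

v = (1.0, 3.0, -1.0)
w = (2.0, 2.0, 1.0)
cos_theta = dot(v, w) / (norm(v) * norm(w))
print(dot(v, w))      # 7.0
print(cos_theta)      # 7/(3*sqrt(11)), roughly 0.7035
assert math.isclose(cos_theta, 7 / (3 * math.sqrt(11)))
```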
Note that from either form of the definition it is obvious that v.w = w.v. It is also easy to show that for
any vectors u, v and w and any scalar λ,

    v.(λw) = λ(v.w) = (λv).w,
    u.(v + w) = (u.v) + (u.w),
    (u + v).w = (u.w) + (v.w),
    v.v = |v|^2 ≥ 0,
    v.v = 0 ⟺ v = 0.
A geometrical application of the dot product is in giving the equation of a plane. The plane through a
point p perpendicular to a vector v is given by the points r which obey r.v = p.v, as is easily seen from a
sketch. If v = (a, b, c) and p.v = d the equation for a plane reads ax + by + cz = d. In practice people often
take a unit vector v when specifying a plane in this form, so that p.v becomes the distance of the plane from
the origin. The distance of a point r from the plane is given by (r − p).v/|v| (the sign here tells one which side of
the plane one is on).
To define a product which is a vector we need to define a direction from two vectors u and v. The only
way to do this which treats the two vectors equally is to take the perpendicular to the plane in which u and
v lie. However, this does not fully define a direction, because we need to know which way to go along the
perpendicular. For that the convention is to use the so-called right-hand rule: hold the fingers of your right
hand so they curl round from u to v and then take the direction your thumb points (see Thomas figures 12.27
and 12.28). If you do DIY, you may find it helpful to remember that this is the direction a normal screw
travels if you turn your screwdriver clockwise. Note that this definition only works in three dimensions: there
is no well-defined vector product in n dimensions for n ≠ 3.
The magnitude of v × w is defined to be |v||w| sin θ (θ as before). Geometrically this is the area of a
parallelogram with sides v and w. Note that for perpendicular vectors this rule implies that the magnitude is
|v||w|.
These rules have the consequences that for any vectors u, v and w and any scalar λ,

    v × w = −w × v,
    (λv) × w = λ(v × w) = v × (λw),
    u × (v + w) = (u × v) + (u × w),
    (u + v) × w = u × w + v × w,

and v × w = 0 if and only if v and w are parallel (in particular v × v = 0 for all v).
From the notation used, the vector product is often called the cross product. You will also find that in
some texts it is denoted v ∧ w, but I strongly advise against using this notation as it leads to confusion in more
general settings where v ∧ w is not a vector. [Aside: The reason for this misuse is that v ∧ w is what's called
a two-form, and there is an operation called the Hodge dual, denoted by ∗, such that in three dimensions
∗(v ∧ w) = v × w.]
To get the expressions for the cross product in terms of components, we can start by noting that the unit
vectors i, j and k are perpendicular to one another (so the vector product of any two distinct ones among them
has magnitude 1) and that the usual x, y and z axes, in that order, are a right-handed set. Thus,

    i × j = k,    j × k = i,    k × i = j,

while i × i = j × j = k × k = 0. Using these and the rules written down above we easily obtain

    (v_1 i + v_2 j + v_3 k) × (w_1 i + w_2 j + w_3 k)
        = (v_2 w_3 − v_3 w_2)i + (v_3 w_1 − v_1 w_3)j + (v_1 w_2 − v_2 w_1)k.    (1.17)

This can also be written as the formal determinant

    | i    j    k   |
    | v_1  v_2  v_3 |
    | w_1  w_2  w_3 |.
One geometrical use of the cross product is in forming the volume of a parallelepiped with sides u, v
and w. Thinking of (say) u and v as the base, and θ as the angle between u × v and w, so that the height is
|w| cos θ, we see that the volume is (u × v).w (positive if u, v and w are a right-handed set). This quantity is
called the triple scalar product and it is easy to show that

    u.(v × w) = v.(w × u) = w.(u × v)

(but this is −v.(u × w) etc., remember).
Some geometrical applications of the cross product are that:
- the distance of a point v from a line r = a + λb is |b × (v − a)|/|b|,
- the distance between two lines r = a + λv and r = b + μw is |(a − b).(v × w)|/|v × w|,
- two planes r.v = p.v and r.w = q.w meet in the line r = a + λ(v × w), where a is any point on both planes,
- the plane through p parallel to both v and w is r.(v × w) = p.(v × w).
Exercise 1.1. Prove from the definitions that, for all a, b and c,

    a × (b × c) = (a.c)b − (a.b)c.
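The component formula, the algebraic rules, and the identity of Exercise 1.1 can all be spot-checked numerically (a check at sample values, not a proof; the vectors below are arbitrary choices of ours):

```python
import math

def dot(v, w): return sum(a * b for a, b in zip(v, w))

def cross(v, w):
    """Component formula (1.17)."""
    return (v[1]*w[2] - v[2]*w[1],
            v[2]*w[0] - v[0]*w[2],
            v[0]*w[1] - v[1]*w[0])

a, b, c = (1.0, 2.0, 3.0), (-2.0, 0.5, 4.0), (3.0, -1.0, 2.0)

# antisymmetry: v x w = -(w x v)
assert cross(a, b) == tuple(-x for x in cross(b, a))
# the triple scalar product is cyclic: a.(b x c) = b.(c x a) = c.(a x b)
assert math.isclose(dot(a, cross(b, c)), dot(b, cross(c, a)))
assert math.isclose(dot(a, cross(b, c)), dot(c, cross(a, b)))
# Exercise 1.1: a x (b x c) = (a.c) b - (a.b) c
lhs = cross(a, cross(b, c))
rhs = tuple(dot(a, c)*bi - dot(a, b)*ci for bi, ci in zip(b, c))
assert all(math.isclose(l, r, abs_tol=1e-12) for l, r in zip(lhs, rhs))

print("cross-product identities verified")
```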
1.6 Gradients and directional derivatives
(See Thomas 14.5 [and 16.2])
In Calculus II you met functions of more than one variable. Those that were discussed there were scalar
functions, i.e. their values at a particular point were numbers. Such a scalar quantity (magnitude but no
direction) that depends on position in space is called a scalar field. An example would be the temperature in
a room: it has magnitude but not direction (so it is a scalar), and it is (in general) a function of position.
Suppose that V(x, y, z) is a scalar field defined in some region. Then we can define a vector, the gradient
of V, at each point, which we denote ∇V, as follows:

    ∇V = (∂V/∂x) i + (∂V/∂y) j + (∂V/∂z) k.

So the x-, y- and z-components of the new vector are ∂V/∂x, ∂V/∂y and ∂V/∂z. See Thomas 14.5 if
you need to revise this in more detail. Sometimes instead of ∇V we write grad V: the two notations are
interchangeable. Note also that we could write the scalar field as V(r).
Example 1.5. If V(x, y, z) = x^2 sin z, calculate ∇V.
In this example, ∂V/∂x = 2x sin z, ∂V/∂y = 0 and ∂V/∂z = x^2 cos z. Hence

    ∇V = 2x sin z i + x^2 cos z k.
Exercise 1.2. Evaluate the gradient ∇f of the following scalar fields.
(a) f = x + y + z,
(b) f = yx^2 + y^3 − y + 2x^2 z,
(c) f = a.r, where a is a constant vector.
∇V tells us how V changes if we move from one point to a nearby point. Consider the small change in V
if one moves from r = (x, y, z) to r + dr = (x + dx, y + dy, z + dz):

    dV ≡ V(x + dx, y + dy, z + dz) − V(x, y, z) = (∂V/∂x)dx + (∂V/∂y)dy + (∂V/∂z)dz

(using the Taylor series in more than one variable, to first order). But (dx, dy, dz) = dr, and so the right-hand
side is just ∇V.dr. Hence

    dV = ∇V.dr.                                              (1.18)

In our original definition of grad, it was implicitly assumed that we were working in terms of some
specified Cartesian coordinate system (x, y, z). Equation (1.18) is important, because we can use it as a more
fundamental definition of grad, which will enable us to write down ∇V in more general coordinate systems.
We shall return to this point later.
Consider a surface

    V(r) = constant,

and suppose that the point r is on that surface. If dr is a displacement on this surface, V(r + dr) = V(r). Thus
dV = ∇V.dr = 0. Since this applies for every small displacement dr in the surface, ∇V must be perpendicular
to the surface. This gives us a way of finding a normal to a surface: the unit normal n will be ∇V/|∇V|.
Suppose we now want a normal line to V = constant, or a tangent plane (see Thomas 14.6). As we know,
a line through point a in direction n can be written in parametric form as

    r = a + tn,    −∞ < t < ∞,

while the perpendicular plane is (r − a).n = 0. It is sometimes convenient to eliminate the parameter t for the
line, which we can do by taking the cross product with n:

    (r − a) × n = 0.

Note that the forms (r − a).n = 0 and (r − a) × n = 0 would give the same plane or line if we used ∇V instead
of n, so we need not calculate |∇V| in these cases (and in r = a + tn such a change only alters the values of
t).
Exercise 1.3. Find equations for the (i) tangent plane and (ii) normal line at the point P_0 on the given
surface:
(a) x^2 + 3yz + 4xy = 27, P_0 = (3, 1, 2).
(b) y^2 z + x^2 y = 7, P_0 = (2, 1, 3).
Suppose now that we want to calculate the rate of change of V(r) in a particular direction specified by the
unit vector t. Let s be the distance travelled in the direction of t; then dr = t ds. So dV = ∇V.t ds. Hence we
can conclude that the rate of change of V in the direction of t is

    dV/ds = ∇V.t = t.∇V.

t.∇V is called the directional derivative. Now

    ∇V.t = |∇V| |t| cos θ = |∇V| cos θ,

where θ is the angle between the vectors ∇V and t. This is maximized when cos θ = 1, i.e. when θ = 0. Thus
V changes most rapidly in the direction of ∇V, and |∇V| is this most rapid rate of change. It is this property,
in the two-dimensional case, that gave rise to the name gradient, because |∇V| is the gradient of the surface
given by z = f(x, y) in that case. Correspondingly the maximum decrease is when t is opposite to ∇V.
Example 1.6. Find the directions in which the function f(x, y, z) = (x/y) − yz increases and decreases
most rapidly at the point P, (4, 1, 1).
We can describe the directions in which f increases and decreases most rapidly by specifying the unit
vectors in those directions. Now

    ∇f = (1/y) i − (x/y^2 + z) j − y k = (1, −5, −1) at P.

The rate of change of f in the direction of unit vector t is ∇f.t. This has its maximum when t is in the same
direction as ∇f; so the direction t in which f increases most rapidly is

    ∇f/|∇f| = (1/√27)(1, −5, −1),

and the actual rate is √27. The rate of decrease of f is greatest, at −√27, when t is in the opposite direction,
i.e. −(1/√27)(1, −5, −1).
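A numerical check of Example 1.6 (an illustrative aside, not part of the notes):

```python
import math

def f(x, y, z): return x / y - y * z

def grad_f(x, y, z):
    return (1 / y, -x / y**2 - z, -y)    # computed by hand in Example 1.6

g = grad_f(4.0, 1.0, 1.0)
assert g == (1.0, -5.0, -1.0)

# sanity-check the x-component of the gradient by a central difference
h = 1e-6
num = (f(4.0 + h, 1.0, 1.0) - f(4.0 - h, 1.0, 1.0)) / (2 * h)
assert math.isclose(num, g[0], abs_tol=1e-6)

mag = math.sqrt(sum(c * c for c in g))
assert math.isclose(mag, math.sqrt(27))          # the maximum rate of increase

# directional derivative along the unit vector t: grad f . t
t = tuple(c / mag for c in g)                    # direction of fastest increase
ddir = sum(gc * tc for gc, tc in zip(g, t))
assert math.isclose(ddir, math.sqrt(27))
print("fastest increase at rate", ddir)
```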
Exercise 1.4. Find the directional derivative of φ at the point (1, 2, 3) in the direction of the vector
(1, 1, 1), where

    φ = x^2/3 + y^2/9 + z^2/27.
We can write ∇ on its own as

    ∇ = i ∂/∂x + j ∂/∂y + k ∂/∂z

and work with it like a vector field, although it is in fact not a vector field (since we cannot say what numerical
value its components have at a particular point) but a vector differential operator. The name of the symbol ∇
is nabla, but often in speech we say del. It is easy to see how to take a two-dimensional version of ∇. This
can then be used in formulae to differentiate vectors, as we shall soon see.
1.7 Double and triple integrals
(See Thomas 15.1 and 15.4)
First let us revise the idea of 2-D integration.
Example 1.7. Integrate the function f(x, y) = x^2 y^2 over the triangular area R: 0 ≤ x ≤ 1, 0 ≤ y ≤ x.
We can write this integral as

    ∫_R f(x, y) dA,

where dA is an area element. But the area of a little rectangle of length δx in the x-direction and length δy in
the y-direction is δA = δx δy; hence we can rewrite dA as dA = dx dy. Thus the integral we want (cf. Fig. 1.1)
Figure 1.1: Integrating over the triangular region R: 0 ≤ x ≤ 1, 0 ≤ y ≤ x.
is

    ∫∫ f(x, y) dx dy = ∫_0^1 ( ∫_0^x x^2 y^2 dy ) dx = ∫_0^1 x^2 (⅓x^3) dx
                     = [x^6/18]_0^1 = 1/18.
An area integral such as this is often called a double integral (because it can be rewritten as two 1-D
integrations). Some authors use two integration signs, to remind you that it is an area integral: thus they
would write ∫∫ f(x, y) dA. In this course, whenever it is obvious that an integral is over area, we shall
generally just write ∫ f(x, y) dA. Similarly some books write ∫∫∫ f(x, y, z) dV for a volume integral: where
no confusion will arise, we shall just write ∫ f(x, y, z) dV for a volume integral. We shall need to put in all the
integral signs when obtaining a value by doing the two or three integrations with respect to coordinates.
Exercise 1.5. Calculate ∫_R f(x, y) dA for

    f(x, y) = 1 − 6x^2 y    and    R: 0 ≤ x ≤ 2, −1 ≤ y ≤ 1.

[Answer: 4]
Note that in this exercise the limits of both the x- and y-integrations were constants, so one could do the
x- or the y-integration first: the answer will be the same. In the previous example, the upper limit of the
y-integral was x, so the y-integration had to be performed first with the limits as given. Otherwise the answer
would still have contained x, which is ridiculous as the answer is for a whole area, not some value of x. (Had
we done the integrations in the other order the limits on x would have been (y, 1) and the limits on y (0, 1).)
It is a straightforward step from double integrals to volume integrals (triple integrals) of the form ∫_V f(x, y, z) dV.
In Cartesian coordinates we have dV = dx dy dz (the volume of a 3-D rectangular box) and so

    ∫_V f(x, y, z) dV = ∫∫∫_V f(x, y, z) dx dy dz.
Sometimes the geometry of the volume will make other choices of coordinate system preferable. In
Thomas 15.3 and 15.6, which were studied in Calculus II, two-dimensional plane integrals in polar coordi-
nates, and triple integrals in spherical and cylindrical polars, are discussed: you will find it very useful to
revise those sections. For a general change of coordinate system from Cartesians (x, y, z) to (u, v, w),

    ∫∫∫_V f dx dy dz = ∫∫∫_V f J du dv dw

where J is the Jacobian determinant of (x, y, z) with respect to (u, v, w) (if you need to revise this in more
detail, see Thomas 15.7).
Exercise 1.6. Evaluate ∫∫∫ e^x dx dy dz over the volume V of the tetrahedron bounded by x = 0, y = 0,
z = 0 and x + y + z = a (a > 0).
Chapter 2
Vector differentiation and the vector differential operator
Syllabus topics covered:
1. Vector fields
2. Grad, div and curl operators in Cartesian coordinates. Grad, div, and curl of products etc.
Here we cover differentiation of vectors, whereas the gradient is a vector quantity which arises from
differentiating a scalar.
2.1 Vector functions of one or more variables
(See Thomas 13.1)
In many physical contexts one is interested in vectors that vary with position or time. For example, the
position of a point can be described by a vector r. Thus, if we consider a moving particle, its position can be
described as a function of time t by the vector r(t), and its rate of change, which we define next, will be the
velocity.
We can specify a vector function of some scalar u, say F(u), by specifying its components as functions of
u:

    F(u) = (f_1(u), f_2(u), f_3(u)).
The derivative dF/du of F with respect to u can be calculated by differentiating the components of F:

    dF/du = (df_1/du, df_2/du, df_3/du).

This is entirely equivalent to going back to the fundamental definition of a derivative:

    dF/du = lim_{δu→0} [F(u + δu) − F(u)]/δu.

Clearly one can compute higher derivatives, such as d^2 F/du^2, by differentiating the components of F the
required number of times.
Example 2.1. If r(t) is the position vector of a particle, as a function of time t, then dr/dt is the velocity
v of the particle. Also dv/dt ≡ d^2 r/dt^2 is the particle's acceleration.
Example 2.2. The continuous parameter t can take all real values. Write down the derivatives dr/dt and
d^2 r/dt^2 for the vector r = (sin t)i + tj. Also, sketch the curve whose parametric equation is r = r(t).
The first and second derivatives are

    dr/dt = (cos t)i + j,    d^2 r/dt^2 = (−sin t)i.

The sketch is shown in Fig. 2.1.
Figure 2.1: Sketch of the curve defined parametrically by r = (sin t)i + tj
It is easy to prove, by writing out the components and collecting terms, that if F and G are vector functions
of u, then

    d(F.G)/du = F.(dG/du) + (dF/du).G.
Proof:

    d(F.G)/du = d/du (f_1 g_1 + f_2 g_2 + f_3 g_3)
              = f_1 dg_1/du + f_2 dg_2/du + f_3 dg_3/du
                + (df_1/du) g_1 + (df_2/du) g_2 + (df_3/du) g_3
              = F.(dG/du) + (dF/du).G.    Q.E.D.
Exercise 2.1. Sketch the curves whose parametric equations are
(a) r = (3 sin t)i + (2 cos t)j
(b) r = (cos t)j
(c) r = ti + t^2 k
(−π ≤ t ≤ π), and write down the derivatives dr/dt and d^2 r/dt^2 where they are defined.
If F is a vector function of more than one variable, say F = F(t, u), then it is straightforward to define its
partial derivatives with respect to t or u, in terms of partial derivatives of its components. Thus, for example,
if F = (f_1(t, u), f_2(t, u), f_3(t, u)), then

    ∂F/∂t = (∂f_1/∂t, ∂f_2/∂t, ∂f_3/∂t).
2.2 Vector Fields
(See Thomas 16.2)
Henceforward we shall be concerned mostly with vectors (and scalars) which depend on position in three-dimensional space, i.e. which are functions of three variables x, y, z. Sometimes they may depend also on a fourth variable, such as time t, or we may only be interested in their values on a path r(s). A vector depending on position is said to constitute a vector field. We write a vector F that varies with position as
F = F(x, y, z) ≡ F(r)
An example is shown in Figure 2.2. We can also write F in terms of its components, which again depend on position:
F = (F_x(x, y, z), F_y(x, y, z), F_z(x, y, z)) .
A physical example of a vector field is the velocity in a flowing fluid (e.g. the water in the oceans, moving because of currents and tides; or the air in the atmosphere, moving because of winds; or the blood in your body, moving because it is pumped by the heart). The velocity at any point in the fluid is a vector quantity: it has magnitude and direction. If we attach a velocity vector to each point of the flowing fluid, we have a vector field defined in the region occupied by the fluid.
We could of course now differentiate the vector field with respect to each of the coordinates (x, y, z) in turn, in the manner described in the previous section. However, we will be more interested in forming scalar and vector quantities from these derivatives. [Aside: if we used all the terms we could form a quantity of a new kind, called a tensor. These are used in fluid mechanics, solid mechanics and relativity, for example.] We do this by forming, in a sense defined below, the dot and cross products of ∇ with F.
2.3 The Divergence Operator
(See Thomas 16.8)
Suppose F(x, y, z) = F_x i + F_y j + F_z k is a vector field. The divergence of F is defined to be
∇.F = ∂F_x/∂x + ∂F_y/∂y + ∂F_z/∂z .
Note that, given a scalar field V, we found a vector field ∇V. Now, given a vector field F, we produce a scalar field ∇.F.
Figure 2.2: Example of a flow. In this case the speed and direction at each point are a function of the position (x, y)
We can also write ∇.F as div F. These notations are completely interchangeable.
It is easy to show, by direct calculation, that
∇.(F + G) = ∇.F + ∇.G.
The geometrical meaning of the divergence is as follows. Loosely speaking, if at some point in space the divergence is positive, and one considers a small closed surface surrounding that point, then on balance the vector field is pointing away from the point and out of the surface. If the divergence is negative, then on balance the vector field is pointing towards the point and into the surface. (See Fig. 2.3.) This idea will be made more precise when we come to the Divergence Theorem.
A vector field F for which ∇.F = 0 everywhere is called divergence-free or solenoidal. The reason for the name solenoidal is that a solenoid is a coil that produces a magnetic field, and a magnetic field B is an example of a field that has ∇.B = 0 everywhere.
Example 2.3. If F = 3xy² i + e^z j + xy sin z k, calculate ∇.F.
∇.F = ∂(3xy²)/∂x + ∂(e^z)/∂y + ∂(xy sin z)/∂z = 3y² + xy cos z.
Exercise 2.2. If F = (y - x)i + (z - y)j + (x - z)k, calculate ∇.F. [Answer: -3]
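For those who like to check such calculations by machine, here is a small numerical sketch (an addition to the notes, using only standard-library Python): it approximates ∇.F by central differences and compares with the answer of Example 2.3 at an arbitrarily chosen point.

```python
import math

# Central-difference approximation to div F = dFx/dx + dFy/dy + dFz/dz
def div(F, x, y, z, h=1e-6):
    return ((F(x + h, y, z)[0] - F(x - h, y, z)[0])
            + (F(x, y + h, z)[1] - F(x, y - h, z)[1])
            + (F(x, y, z + h)[2] - F(x, y, z - h)[2])) / (2 * h)

# The field of Example 2.3: F = 3xy^2 i + e^z j + xy sin z k
F = lambda x, y, z: (3 * x * y**2, math.exp(z), x * y * math.sin(z))

x0, y0, z0 = 1.2, 0.7, 0.3
print(abs(div(F, x0, y0, z0) - (3 * y0**2 + x0 * y0 * math.cos(z0))) < 1e-6)  # True
```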
Figure 2.3: Example of a vector field with positive divergence (everywhere): F = x i + y j.
2.4 The Curl Operator
(See Thomas 16.7)
The curl of a vector field F is defined to be
∇×F = (∂F_z/∂y - ∂F_y/∂z) i + (∂F_x/∂z - ∂F_z/∂x) j + (∂F_y/∂x - ∂F_x/∂y) k.
Note that curl F is another vector field.
We can write ∇×F as curl F; again the two notations are interchangeable. It is convenient to remember ∇×F in terms of a determinant like the one for v×w:

∇×F = | i      j      k     |
      | ∂/∂x   ∂/∂y   ∂/∂z  |
      | F_x    F_y    F_z   |

It is easy to verify, by writing out the determinant in full, that this is equivalent to the original definition.
It is also easy to show, by writing out the components, that
∇×(F + G) = ∇×F + ∇×G.
The geometrical meaning of the curl is as follows. Loosely speaking, if at some point in space the component of the curl in the n direction is positive, it means that in the vicinity of the point and in a plane normal to n, the vector field tends to go round in an anticlockwise direction if one looks along vector n. If the component of the curl were negative, it would mean that the vector field tends to go round in a clockwise direction. (See Fig. 2.4.) This idea will be made more precise when we come to Stokes's Theorem.
A vector field F for which ∇×F = 0 everywhere is called curl-free or irrotational.
Figure 2.4: Example of a vector field with positive curl (in the z direction): F = x j - y i.
Example 2.4. The velocity in a fluid is v = y i - x j + 0k. Find ∇×v.

∇×v = | i      j      k     |
      | ∂/∂x   ∂/∂y   ∂/∂z  |
      | y      -x     0     |

= i(0 - 0) + j(0 - 0) + k(-1 - 1) = -2k.
Exercise 2.3. If F = (x² + y² + z²)i + (x⁴y²z²)j + xyz k, find ∇×F.
Exercise 2.4. Find the divergence (∇.F) and curl (∇×F) of the following vector fields:
F = x² i + xz j - 3z k
F = x² i - 2xy j + 3xz k
F = ∇(1/r) where r = (x² + y² + z²)^{1/2} ≠ 0.
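A similar numerical sketch (again an addition to the notes, standard-library Python only) approximates the curl by central differences; applied to the field of Example 2.4 it should reproduce -2k at any point.

```python
# Central-difference approximation to curl F at a point (x, y, z)
def curl(F, x, y, z, h=1e-6):
    # d(i, j) approximates the partial derivative of component i
    # with respect to coordinate j (0 = x, 1 = y, 2 = z)
    d = lambda i, j: (F(*[c + h * (k == j) for k, c in enumerate((x, y, z))])[i]
                      - F(*[c - h * (k == j) for k, c in enumerate((x, y, z))])[i]) / (2 * h)
    return (d(2, 1) - d(1, 2),   # dFz/dy - dFy/dz
            d(0, 2) - d(2, 0),   # dFx/dz - dFz/dx
            d(1, 0) - d(0, 1))   # dFy/dx - dFx/dy

# Example 2.4: v = y i - x j + 0 k has curl -2k everywhere
v = lambda x, y, z: (y, -x, 0.0)
print([round(c, 6) for c in curl(v, 0.5, -1.0, 2.0)])  # [0.0, 0.0, -2.0]
```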
2.5 Derivatives of products and repeated derivatives
(See Thomas 16.7 and the exercises to 16.8)
We can now consider the application of grad, div and curl to products.
To use grad, we have to have a product which is itself a scalar. That can be a product of two scalars, say φψ, or a scalar product of two vectors, F.G.
Div and curl can only be applied to vectors, so the possible products we could have are φF, where φ is a scalar field, or F×G.
If we were dealing with functions of a single variable, the derivative would just give
d(fg)/dx = f (dg/dx) + (df/dx) g.
Some of the vector cases are just like that, but most are more complicated. All of them can be proved by direct calculation: however, more succinct proofs are available in index notation, which we will study in Chapter 4, and we defer proofs until then. They are:
1. ∇(φψ) = φ∇ψ + ψ∇φ.
2. ∇(F.G) = F×(∇×G) + G×(∇×F) + (F.∇)G + (G.∇)F.
3. ∇.(φF) = φ∇.F + (∇φ).F.
4. ∇.(F×G) = G.(∇×F) - F.(∇×G).
5. ∇×(φF) = φ(∇×F) + (∇φ)×F = φ(∇×F) - F×(∇φ).
6. ∇×(F×G) = F(∇.G) + (G.∇)F - G(∇.F) - (F.∇)G.
Here the notation (G.∇)F is to be interpreted as (G.∇f_1, G.∇f_2, G.∇f_3), taking F = (f_1, f_2, f_3), where for a scalar φ
(G.∇)φ = (G_1 ∂/∂x + G_2 ∂/∂y + G_3 ∂/∂z)φ = G_1 ∂φ/∂x + G_2 ∂φ/∂y + G_3 ∂φ/∂z ,
where G = (G_1, G_2, G_3). (Warning: the form of this definition will not persist in curvilinear coordinates.)
Example 2.5. Let a be a constant vector. Then
∇×(ra) = r(∇×a) - a×∇r
= -(a×r)/r
(using ∇×a = 0 and ∇r = r/r).
Example 2.6. Let a be a constant vector. Then
∇×(a×r) = a(∇.r) + (r.∇)a - r(∇.a) - (a.∇)r
= 3a - a = 2a.
(The two middle terms on the right of the first line are both zero, and it is easy to check that ∇.r = 3 and (a.∇)r = a.)
We also have a second set of identities arising from applying two of the derivatives. Div produces a scalar, so we can apply grad to that, whereas grad and curl produce vectors, to which div and curl can be applied. ∇(∇.F) cannot be usefully simplified, and ∇.(∇φ) also does not have a simplified form but does have the special notation ∇²φ. Of the remaining three cases, two give zero identically, and the third can be expressed using combinations of grad and div. Both the zero ones (the first a vector, the second a scalar) can be memorized by the fact that they would be true if ∇ were replaced by the usual sort of vector.
1. curl(grad φ) = ∇×(∇φ) = 0.
2. div(curl F) = ∇.(∇×F) = 0.
3. ∇×(∇×F) = ∇(∇.F) - ∇²F.
Here ∇²F means apply ∇² to each component of F separately. These relations can also be proved by direct calculation, e.g.:
Proof of 1:

curl(∇φ) = | i       j       k      |
           | ∂/∂x    ∂/∂y    ∂/∂z   |
           | ∂φ/∂x   ∂φ/∂y   ∂φ/∂z  |

= ( ∂²φ/∂y∂z - ∂²φ/∂z∂y , ∂²φ/∂z∂x - ∂²φ/∂x∂z , ∂²φ/∂x∂y - ∂²φ/∂y∂x ) = 0.
Chapter 3
Vector integrals and integral theorems
Syllabus covered:
1. Line, surface and volume integrals.
2. Vector and scalar forms of the Divergence and Stokes's theorems. Conservative fields: equivalence to curl-free and existence of scalar potential. Green's theorem in the plane.
Calculus I and II covered integrals in one and two dimensions. We now have to adapt these to the case
of one and two-dimensional curved objects in three dimensions, i.e. curves and surfaces. We will also revisit
volume integrals (triple integrals). Then we show how these are related.
Before starting on this, we deal with a simpler case. One often wishes to integrate a vector function with respect to its scalar argument. The integral of F(u) with respect to u can be calculated simply by integrating the components (in Cartesian coordinates) of F = (f_1, f_2, f_3):
∫ F du = ( ∫ f_1 du, ∫ f_2 du, ∫ f_3 du )
From this point of view integration of a vector is just a set of three ordinary integrals. The restriction to Cartesian coordinates can be overcome by looking at the definition in vectorial terms: we go back to the basic definition of integration, which leads to a geometrical picture of G ≡ ∫_a^b F du (see Fig. 3.1):
G = ∫_a^b F du = lim_{Δu_p→0} Σ_{p=1}^N F(u) Δu_p .
Example 3.1. If v(t) ≡ dr/dt is the velocity of a particle, as a function of time t, then
∫_{t_1}^{t_2} v dt = ∫_{t_1}^{t_2} (dr/dt) dt = ∫_{r(t_1)}^{r(t_2)} dr = r(t_2) - r(t_1).
3.1 Line Integrals
(See Thomas 16.1 and 16.2: note that Thomas begins by defining a scalar integral ∫ f |dr|, in the notation below. I come back to this at the end of this section.)
Figure 3.1: Geometrical picture of G = ∫_a^b F du = lim_{Δu_p→0} Σ_{p=1}^N F(u) Δu_p, built up from the contributions F(u)Δu_1, F(u + Δu_1)Δu_2, . . .
Suppose F = (F_x, F_y, F_z) is a vector field defined in some region of space, and C is a path through that region from r_1 to r_2. Suppose further that C has parametrization:
r = (g(t), h(t), k(t))   (t_1 ≤ t ≤ t_2).
Then one can define the integral
∫_{r_1}^{r_2} F.dr
to be
∫_{t_1}^{t_2} F.(dr/dt) dt ≡ ∫_{t_1}^{t_2} ( F_x (dg/dt) + F_y (dh/dt) + F_z (dk/dt) ) dt
(Warning: do not forget to write the components of F in terms of the parameter t, so that t is the only variable that appears in the integral!) This is equivalent to going back to the fundamental definition of an integral as the limit of lots of small contributions, in this case the scalar products of F with small displacements along C:
∫_{r_1}^{r_2} F.dr = lim_{Δr_p→0} Σ_{p=1}^N F(r).Δr_p .
Similarly, but not as widely useful, in R³ one can define
∫_{r_1}^{r_2} F dr ≡ ∫_{t_1}^{t_2} F (dr/dt) dt
(note there is no dot here: it is not a scalar or a vector, but would in fact define a tensor) and
∫_{r_1}^{r_2} V(r) dr ≡ ∫_{t_1}^{t_2} V (dr/dt) dt,
where V(r) is a scalar field, which defines a vector.
In examples, you may be given the parametrization, or you may have to find one first.
Example 3.2. Evaluate the integral ∫ F.dr for the vector field F = -4xy i + 8y j + 2k, from the origin to (2, 4, 1)
1. along the path r = t i + t² j + (1/2)t k, 0 ≤ t ≤ 2,
2. from the origin to (2, 0, 0), then from there to (2, 4, 0), then to (2, 4, 1), along straight lines [Note that the answer will be the sum of the three parts: a path may have several pieces providing one ends where the next begins.]
3. on the surface 4x² + y² = 32z along a line with constant y/x.
Note that only for the first of these do we have the parametrization given.
1. In this case dr/dt = i + 2t j + (1/2)k, so
∫_{r(t=0)}^{r(t=2)} F.dr = ∫_0^2 F.(dr/dt) dt
= ∫_0^2 (-4(t)(t²) i + 8(t²) j + 2k).(i + 2t j + (1/2)k) dt
= ∫_0^2 ((-4t³)(1) + (8t²)(2t) + (2)(1/2)) dt
= ∫_0^2 (12t³ + 1) dt = [3t⁴ + t]_0^2 = 48 + 2 = 50.
2. The first section is from (0, 0, 0) to (2, 0, 0). The straight line is
r = x i, 0 ≤ x ≤ 2
so along it we have F = 2k and dr = i dx, so F.dr = 0 and hence this gives a zero integral. [Added explanation: to get the equation for the line we can use (1.15) for the case of a line through r_1 and r_2, i.e. r = r_1 + s(r_2 - r_1). Here r_1 = (0, 0, 0) = 0 and r_2 = (2, 0, 0) = 2i, so we get the line r = 2si. Then comparing with the usual position vector we see that x = 2s, y = z = 0, and we can write the line we need in the form above. To get the value of F we substitute y = z = 0 into the general form for F.]
In the second section, similarly,
r = 2i + y j, 0 ≤ y ≤ 4
so along it we have F = -8y i + 8y j + 2k and dr = j dy, so F.dr = 8y dy and hence this gives
∫_0^4 8y dy = [4y²]_0^4 = 64
In the last section
r = 2i + 4j + z k, 0 ≤ z ≤ 1
so along it we have F = -32i + 32j + 2k and dr = k dz, so F.dr = 2 dz and hence this gives ∫_0^1 2 dz = 2.
Adding the three bits together we get 66.
3. Since at the end point x = 2 and y = 4 we will need y = 2x. Then 4x² + y² = 32z gives 8x² = 32z, so x = 2√z. Using z as the parameter, we have r = 2√z i + 4√z j + z k, so dr = (i/√z + 2j/√z + k) dz, while F = -32z i + 32√z j + 2k. Hence we have
∫_0^1 (-32√z + 64 + 2) dz = ∫_0^1 (66 - 32√z) dz = [66z - 64z^{3/2}/3]_0^1 = 66 - 64/3,
which we could also write as 44 2/3.
Note that the result depends on the curve, not just on its endpoints. We shall return to this matter in
section 3.6.
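For readers who want to check such results by machine (an aside, not part of the course), a midpoint-rule sum approximates a line integral directly from the parametrization; applied to paths 1 and 3 of Example 3.2 it should reproduce 50 and 66 - 64/3.

```python
import math

# Midpoint-rule approximation to the line integral of F.dr along r(t), t0 <= t <= t1
def line_integral(F, r, t0, t1, n=20000):
    h = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * h
        p0, p1 = r(t - h / 2), r(t + h / 2)      # endpoints of this small step
        Fx, Fy, Fz = F(*r(t))                    # F evaluated at the midpoint
        total += Fx * (p1[0] - p0[0]) + Fy * (p1[1] - p0[1]) + Fz * (p1[2] - p0[2])
    return total

F = lambda x, y, z: (-4 * x * y, 8 * y, 2)           # the field of Example 3.2
r1 = lambda t: (t, t**2, t / 2)                      # path 1
r3 = lambda z: (2 * math.sqrt(z), 4 * math.sqrt(z), z)  # path 3, parameter z
print(abs(line_integral(F, r1, 0, 2) - 50) < 1e-3)            # True
print(abs(line_integral(F, r3, 0, 1) - (66 - 64 / 3)) < 1e-2)  # True
```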
Exercise 3.1. Calculate ∫_C F.dr, where F = 4yz i - 3z j + 2x² k, over each of the following curves from (0, 0, 0) to (1, 1, 1):
(a) C: r = t i + t j + t k, 0 ≤ t ≤ 1
(b) C: r = t² i + t j + t³ k, 0 ≤ t ≤ 1
If the vector field F represents a force (e.g. gravitational force), then
∫_{r_1}^{r_2} F.dr
is called a work integral and its value is the work done by the force, which equals the increase in energy of the body acted on. If instead of representing a force, F represents the velocity field in a fluid, and if C is some curve in the fluid, then ∫_C F.dr is called the flow along curve C. If C is a closed curve, the flow is called the circulation around C.
Finally, note that Thomas's form ∫ f |dr| is obtained if one assumes that F is parallel to the unit tangent vector to the curve, t̂ = (dr/dt)/|dr/dt|, at all points on the curve, since (dr/dt) dt = t̂ |dr|, and in this case, taking F = f t̂,
F.(dr/dt) dt = f t̂.t̂ |dr| = f |dr|.
Thus Thomas's starting point is simply a special case of the general line integral.
3.2 Surface and volume integrals
(See Thomas 16.5 and 16.6, but be aware that Thomas starts by defining the integral of a scalar, using what is, in the notation below, ∫ f |dS|.)
We now have to extend our ideas of double integrals to include integrals over surfaces. We take into account that a small area on a curved surface has both a magnitude and a direction (the normal to the surface) associated with it, so we can represent it as a vector.
Consider an area ΔS in a plane (see Fig. 3.2a). If n is a unit vector perpendicular to the plane, then the vector representing the area, ΔS, is defined to be
ΔS = ΔS n.
In the case of a surface in three dimensions (see 3.2b), we define the vector ΔS for an area element ΔS as
ΔS = ΔS n,
where n is a unit vector normal to the surface element ΔS. Note we are still using the convention that vectors are written in bold type and the same symbol in ordinary type means the length; thus ΔS = |ΔS|. In the limit we shall write dS rather than ΔS. (Thomas uses dσ for this dS.)
Note we still have a sign ambiguity in the definition, because either direction of the normal vector along a normal line could be used. One case where we can fix the sign is that for a closed surface, n is generally taken to be the outward-pointing normal vector.
Figure 3.2: (a) Normal n to a plane area ΔS. The vector area is ΔS = ΔS n. (b) Normal n to a more general surface. The vector area of the small surface element is ΔS = ΔS n, where ΔS is the magnitude of the area.
The double integrals we had before, ∬ f dx dy, can now be thought of as integrals of F.dS, where dS = (dx dy) k and F = f k.
Following this idea, we can now define the surface integral for a vector field F over some area S:
∫_S F.dS = ∫_S F.n dS.
Such an integral is also called the flux of F across area S.
We now take three examples of increasing difficulty: one is a simple plane case, the second a curved surface where the integral is easy, and the third gives us a pattern for the general case.
Example 3.3. If F = (3x, 2xz, 3), evaluate the flux of F across the surface S: z = 0, 0 ≤ x ≤ 1, 0 ≤ y ≤ 2 (where the normal is to be in the positive z direction).
The area is in the xy-plane, so the normal n is ±(0, 0, 1). We are told to take the plus sign.
∫_S F.n dS = ∫_0^1 ∫_0^2 (3x i + 0 j + 3k).(0i + 0j + 1k) dy dx = ∫_0^1 ∫_0^2 3 dy dx = ∫_0^1 6 dx = 6.
Example 3.4. If the velocity field of a fluid is v = (1/r²) e_r, where r is the distance from the origin O and e_r is a unit vector at position r pointing away from the origin, find the flux ∫ v.n dS across a sphere S of radius a whose centre is at the origin. (The outward normal should be taken.)
In this case, the outward normal and e_r are the same vector, so
v.n = (1/r²) e_r.e_r = 1/r²
(e_r.e_r = 1 because e_r is a unit vector). On the sphere of radius a, r = a, so
∫_S v.n dS = ∫_S (1/a²) dS = (1/a²) × area of sphere of radius a = (1/a²) × 4πa² = 4π.
Example 3.5. Find the flux of the field F = z k across the portion of the sphere x² + y² + z² = a² in the first octant, with normal taken in the direction away from the origin.
This example is easier in spherical polars but we can do it in Cartesians. Write the required part of the sphere as a surface z = √(a² - x² - y²) (note that for a whole sphere we would also need the points where z = -√(a² - x² - y²), the square root being understood to be the non-negative one). Consider the displacement vector for a small change dx, by taking the derivative of r = (x, y, √(a² - x² - y²)) as in section 2.1. It will be
(∂r/∂x) dx = (∂x/∂x, ∂y/∂x, ∂z/∂x) dx = (1, 0, -x/√(a² - x² - y²)) dx   (3.1)
and similarly a small change in y gives a displacement
(∂r/∂y) dy = (0, 1, -y/√(a² - x² - y²)) dy .   (3.2)
The magnitude of the corresponding area element is then given by the area of a parallelogram with sides (3.1) and (3.2), and the normal direction is perpendicular to them both, so we need their cross-product
dS = (1, 0, -x/√(a² - x² - y²)) dx × (0, 1, -y/√(a² - x² - y²)) dy
= ( (x/√(a² - x² - y²)) i + (y/√(a² - x² - y²)) j + k ) dx dy
Thus F.dS = z dx dy = √(a² - x² - y²) dx dy.
Now we need the limits on the variables. The first octant is above the first quadrant of the circle x² + y² = a², z = 0, so we will have
∫_0^a ∫_0^{√(a²-x²)} √(a² - x² - y²) dy dx .
The rest of the problem is just a double integral like those in Calculus II. We can do it by a substitution such as y = √(a² - x²) sin θ, which gives
∫_0^a (a² - x²) ∫_{θ=0}^{π/2} cos²θ dθ dx
and this can easily be shown to be πa³/6.
Note that parametrization by a pair of coordinates will not always give all the surface: for example, the
surface consisting of the square with a vertex at the origin and sides 1 along the x and y axes and the similar
square in the (x, z) plane cannot be covered by any pair of the Cartesian coordinates, though it can easily be
split into two pieces each of which separately can be handled that way.
The final part of the above example provides a general method for turning a surface integral into a double integral for the case where the surface is given, or can be found, in terms of two parameters (coordinates on the surface). See also Thomas 16.6 and diagrams 16.55 and 16.56.
r = x(u, v)i + y(u, v)j + z(u, v)k .
1. Calculate r_u du and r_v dv, where r_u ≡ ∂r/∂u and similarly for v.
2. Calculate dS = (r_u × r_v) du dv.
3. Form F.dS.
4. Work out from the geometry of the surface appropriate limits on the variables and perform the double integral with respect to u and v.
Note that here n = (r_u × r_v)/|r_u × r_v| and dS = |r_u × r_v| du dv. The possible snag is that the n thus found may not be in the required normal direction, so one may need to take its negative. Advice: both for this reason and for working out limits on the variables it is a good idea to draw a sketch first.
Note that if we use x and y as the parameters, the area element on the curved surface z = f(x, y) is dS = (-f_x, -f_y, 1) dx dy, where f_x ≡ ∂f/∂x etc. Thus the normal is
n = (1/√(f_x² + f_y² + 1)) (-f_x, -f_y, 1),
the angle γ this makes with the z axis is given by k.n = cos γ = 1/√(f_x² + f_y² + 1) and (since dS = n dS = n|dS|) the magnitude obeys
dS cos γ = dx dy .
This gives us a second way to find dS, which deals with a surface given by f(x, y, z) = constant even if we do not have a convenient parametrization.
1. Take ∇f.
2. Find the unit vector in that direction, n = ∇f/|∇f|.
3. Calculate n.k = cos γ.
4. Write dS = n dx dy/cos γ and use it to form F.dS.
Thus we can use (x, y) as the parameters, provided cos γ ≠ 0 and we can write F in terms of x and y. Note that Thomas gives an even more general version of this where he considers a plane with normal p and an area dA in the plane (in place of k and dx dy): because he is working with |dS| he uses |cos γ| and writes 1/|cos γ| as |∇f|/|∇f.p|. While one is unlikely to need to use a general p, this version has the advantage of covering the three cases p = i, p = j and p = k in one formula.
Exercise 3.2. If F = x i + y j, evaluate
∫_S F.n dS
where S is the rectangular box formed by the planes x = 0, a, y = 0, b, z = 0, c.
Exercise 3.3. If F = 3y² i - j + xz k, evaluate the integral ∫_S F.dS, where S is the surface z = 1, 0 ≤ x ≤ 1, 0 ≤ y ≤ x (take the normal pointing in the positive z direction).
[Answer: 1/3]
Exercise 3.4. If F = i + j + k, evaluate
∫_S F.n dS
over the hemispherical surface S given by z ≥ 0, x² + y² + z² = a², taking the normal outward from the origin.
Since we will not be considering curved three-dimensional objects in four-dimensional space, we do not have to think about a vectorial version of the volume element dV = dx dy dz. However, the fact that it is a volume element is an important way to look at it. If we use new coordinates (u, v, w) and calculate corresponding displacements by the method used in (3.1), the volume of the parallelepiped is given by the scalar triple product (see section 1.5) and this will give the Jacobian formula as in section 1.7.
Usually the integrand of a volume integral is a scalar. However, we could integrate vectors in R³, though this is not so often used. Given a vector field F = F_x i + F_y j + F_z k, one can define
∫_V F dV = ( ∫_V F_x dV ) i + ( ∫_V F_y dV ) j + ( ∫_V F_z dV ) k
For example, F might be the momentum vector field in a fluid, and the volume integral would then be the total net momentum of that volume of fluid.
The most useful integrals are the work integral ∫ F.dr, the flux across a surface, ∫ F.dS, and the integral of a scalar over a volume, ∫ f dV.
Finally, to link up with Thomas, his initial ∫ f |dS| is just ∫ F.dS for a vector field such that F = f n on the surface.
3.3 The Divergence Theorem
We now develop higher dimensional analogues of
∫_a^b (df/dx) dx = f(b) - f(a),   (3.3)
the fundamental theorem of calculus, which we use in the proofs. Stokes's theorem¹, the two dimensional version in a general surface, and Green's theorem, for the special case of a plane, relate the surface integral of a curl to a line integral (2 dimensions to 1). The Divergence Theorem² relates the volume integral of a divergence to a surface integral (3 dimensions to 2).
[Aside: All these are in fact special cases of the general Stokes's theorem which relates an n-1 dimensional integral of a field to the n dimensional integral of its derivative. Here the field is a generalization of a vector field called an (n-1)-form field.]
We will use these results to derive others including, later on, the forms of divergence and curl in curvilinear coordinates. These latter could be found, more laboriously, by direct calculation from the Cartesian definitions by applying the chain rule.
(See Thomas 16.8)
The Divergence Theorem says (following Thomas's wording) that under suitable conditions:
¹ A linguistic comment: for a name ending in s, it used to be customary to add only an apostrophe, e.g. Stokes'. Modern use also adds an s, e.g. Stokes's. Books, and people, vary in which they prefer.
² First discovered by Joseph Louis Lagrange in 1762, then independently rediscovered by Carl Friedrich Gauss in 1813, by George Green in 1825 and in 1831 by Mikhail Vasilievich Ostrogradsky, who also gave the first proof of the theorem. Thus the result may be called Gauss's Theorem, Green's theorem, or Ostrogradsky's theorem.
Theorem 3.1 The flux of a vector field F across a closed oriented surface S in the direction of the surface's outward unit normal vector field n equals the integral of ∇.F over the region D enclosed by the surface:
∫_D ∇.F dV = ∫_S F.n dS = ∫_S F.dS.   (3.4)
We have not spelt out here in detail the 'suitable conditions' required of F or the surface which are necessary. These, and a proof, are discussed in section 3.8.
The word 'oriented' could be omitted here and included under the 'suitable conditions'. Oriented means we assign an outward direction for the normal to S in a consistent and continuous way. An S for which this is possible is called orientable: the Möbius strip (see Thomas Fig. 16.46) is an example of a non-orientable surface.
Note that it is not required that S has a single connected piece. For instance, it could have two parts, one inside the other, and then D would be the volume in between.
The value of this theorem is that to calculate either of the integrals in it we can use the other if it is easier; it also helps interpretation in applications. However, in the first example we calculate both sides and verify they really are equal.
Example 3.6. Suppose f = xy. Find a vector field F such that ∇.F = f. Suppose V is the closed rectangular volume bounded by the planes x = 0, a, y = 0, b, z = 0, c, and S is the surface of the volume. Evaluate directly
∫_V f dV and ∫_S F.n dS
(where n is an outward normal), and show that they are equal, as they should be according to the Divergence Theorem.
The volume integral is straightforward.
∫_0^c ∫_0^b ∫_0^a xy dx dy dz = ∫_0^c ∫_0^b [x²y/2]_0^a dy dz = ∫_0^c ∫_0^b (a²y/2) dy dz
= ∫_0^c [a²y²/4]_0^b dz = ∫_0^c (a²b²/4) dz = (a²b²/4)[z]_0^c = a²b²c/4.
There are numerous ways to construct a vector field F of the required form, e.g. by integrating f with respect to x and making this the x-component of a vector F:
F = ( x²y/2, 0, 0 ) .
S has six faces, so we must evaluate F.n on each of the six and add the results. On two of the faces, n = ±i, on two n = ±j and on the last two n = ±k. Because F is wholly in the x-direction, F.n = 0 on the four faces where n = ±j, ±k. The remaining faces are the two where x = 0 and x = a. On the x = 0 face, F = 0 and so F.n = 0. This leaves only the face x = a. On that face F.n = (a²y/2) i.i = a²y/2. Integrating this over that face gives
∫_S F.n dS = ∫_0^b ∫_0^c (a²y/2) dz dy = (a²/2)(b²/2) c = a²b²c/4.
Example 3.7. A more typical example of the use of the Divergence Theorem is the following. Find the integral ∫_S A.dS for A = (x, z, 0) and the surface S of a sphere of radius a.
Here ∇.A = 1 and the integral is ∫ dV over a sphere, which is the volume 4πa³/3 of the sphere. Doing the integration directly over the surface is much more long-winded.
Exercise 3.5. State the Divergence Theorem. Evaluate both sides of the Divergence Theorem for the vector field F = xy²z k over a volume V which is the interior of the unit cube, i.e. the cube whose vertices are at (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 1, 1), (1, 0, 1), (1, 1, 0) and (1, 1, 1).
The Divergence Theorem equates two scalar values. However, one can derive from it vector identities. For example, we can obtain what is called the vector form of the theorem:
∫_S φ dS = ∫_V ∇φ dV,   (3.5)
both sides of which are vectors. This follows by considering F = φt for a constant t, and then taking the cases t = i, j and k in turn.
Green's identities
We now derive two corollaries of the divergence theorem.
Let f(r) and g(r) be two scalar fields. Then from the identities for vector differentiation of products we have
∇.(f ∇g) = (∇f).(∇g) + f ∇²g   (3.6)
Taking (3.6) and the similar identity with f and g swapped, and applying the divergence theorem gives
∫ (f ∇g - g∇f).dS = ∫ (f ∇²g - g∇²f) dV   (3.7)
where integrals are over a closed surface and the volume enclosed by it.
For another identity, we put g = f in (3.6) and apply the divergence theorem. We get
∫ (f ∇f).dS = ∫ ( (∇f)² + f ∇²f ) dV   (3.8)
where again the integrals are over a closed surface and the volume enclosed by it. We are already given that f is well-behaved; let us further assume that as r → ∞, f falls off faster than 1/√r. Then f ∇f will fall off faster than 1/r². Hence, if we take the integrals in (3.8) over a sphere of radius R, the LHS → 0 as R → ∞. Thus
∫ (∇f)² dV = -∫ f ∇²f dV   (3.9)
where the integrals are over all space.
The identities (3.7) and (3.9) are known as Green's identities.
3.4 Green's Theorem (in the plane)
(See Thomas 16.4: we take the statement he gives as Theorem 4, reworded. Note that the right side is a component of a curl.)
Theorem 3.2 If C is a simple closed curve in the x-y plane, traversed counterclockwise, and M and N are suitably differentiable functions of x and y, then
∮_C (M dx + N dy) = ∬_R (∂N/∂x - ∂M/∂y) dx dy,
where the area integral is over the region R enclosed by the curve C.
Proof: The proof is a simple application of the Divergence Theorem with a volume of height 1 in the z-direction above R. (Or, if one proves Stokes's theorem first, of that theorem.) Take F = (N, -M, 0): then
∫ (∇.F) dV = ∭ (∂N/∂x - ∂M/∂y) dx dy dz = ∬ (∂N/∂x - ∂M/∂y) dx dy
on integrating over z from 0 to 1. On the top and bottom of the volume, dS is in the ±k direction so F.dS = 0.
On the rest of the surface we have
∫ F.dS = ∬ N dS_x - M dS_y
where dS_x is the component of dS along the x-axis. Using dr_C = (dx, dy, 0) along C and dr_z = (0, 0, dz) in the z-direction, dS = dr_C × dr_z gives dS_x = dy dz and dS_y = -dx dz, so
∫ F.dS = ∬_S N dy dz + ∬_S M dx dz = ∮_C N dy + ∮_C M dx.
Now we have proved the two sides of the theorem are equal.
This is called Green's theorem in the plane. (Thomas's Theorem 3 is the same with N replaced by M and M replaced by -N. This version makes the right side look like a two-dimensional divergence.)
Example 3.8. Use Green's theorem to evaluate
∮ ( xy dy - y² dx )
around the unit square: straight path segments from the origin to (1, 0) to (1, 1) to (0, 1) and back to the origin.
In this case, M = -y² and N = xy; hence
∂N/∂x - ∂M/∂y = y + 2y = 3y
Thus the required integral is
∫_0^1 ∫_0^1 3y dy dx = ∫_0^1 (3/2) dx = 3/2.
Example 3.9. Area within a curve
A surprising expression for the area A inside a closed curve C bounding a region S in a plane is
nA = (1/2) ∮_C r × dr ,
where n is the normal to the plane. We can assume without loss of generality that the plane of the curve is the x, y plane. Then n = k and r × dr is parallel to k, so we only need the z component of the integral, which is
(1/2) ∮_C (x dy - y dx).
By Green's theorem in the plane this equals
∬_S dx dy = area inside C.
3.5 Stokes's Theorem
(See Thomas 16.7)
The other major theorem of similar character to the Divergence Theorem is Stokes's theorem, which follows. (Because both are versions of the n-dimensional Stokes's theorem we can prove Stokes's theorem from Green's and thence from the Divergence Theorem, which we do in section 3.8. It can also be proved directly.) We reword Thomas's version.
Theorem 3.3 [Stokes's theorem] If F is a (suitably differentiable) vector field, and C is a closed path bounding an oriented surface S:
∮_C F.dr = ∫_S (∇×F).n dS = ∫_S (∇×F).dS,   (3.10)
where C is travelled counterclockwise with respect to the unit normal n of S (i.e. counterclockwise as seen from the positive n side of S).
It is easy to see that Green's theorem is a planar version of this result.
Note that the result is the same for any surface S spanning C, so two surfaces with the same bounding curve give the same integral. This can simplify integration a lot if the bounding curve lies in a plane.
To emphasize the need for differentiability conditions, consider
F = (-y i + x j)/(x² + y²).
We can easily verify that ∇×F = 0 (except on the z axis). But we can also show that ∮ F.dr ≠ 0 if we go around the z axis: for example, going round a circle of radius ε using a parametrization (ε cos θ, ε sin θ) we would have
∮ F.dr = ∫_0^{2π} ε⁻² (-ε sin θ i + ε cos θ j).(-ε sin θ i + ε cos θ j) dθ = ∫_0^{2π} dθ = 2π.
Example 3.10. Use the surface integral in Stokes's theorem to calculate the circulation of the field

    F = x² i + 2x j + z² k

around the curve C, where C is the ellipse 4x² + y² = 4 in the x-y plane, taken counterclockwise when viewed from z > 0.

In Stokes's Theorem we can choose any surface that spans the curve C. The easiest one in this case is just the planar surface z = 0 contained inside the ellipse (so we could in fact use Green's theorem). Thus n will be purely in the z-direction: n = k, and so we only need to calculate the z-component of ∇×F:

    (∇×F)·k = ∂F₂/∂x − ∂F₁/∂y = ∂(2x)/∂x − ∂(x²)/∂y = 2.

Integrating this over the elliptical area is easy: the answer is just 2 times the area of the ellipse. The area of an ellipse is πab, where a and b are the lengths of its semi-axes (in this case 1 and 2). Hence the answer is 4π.
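As a sanity check (my own addition, not in the notes), the circulation can also be computed directly as a line integral using the parametrization x = cos t, y = 2 sin t, z = 0 of the ellipse; the sketch below confirms the value 4π.

```python
import math

def circulation_ellipse(n=20000):
    """Directly integrate F.dr for F = x^2 i + 2x j + z^2 k around the
    ellipse 4x^2 + y^2 = 4 (x = cos t, y = 2 sin t, z = 0)."""
    total = 0.0
    dt = 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        x = math.cos(t)
        dx, dy = -math.sin(t), 2 * math.cos(t)  # derivatives of x(t), y(t)
        total += (x * x * dx + 2 * x * dy) * dt  # F.dr with z = 0
    return total

print(circulation_ellipse())  # close to 4*pi
```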
As in the case of the Divergence Theorem, we can give a vector form of Stokes's Theorem. Let F = φt for some constant vector t. Then

    t·∮_C φ dr = ∬_S ((∇φ) × t)·dS = t·∬_S dS × ∇φ.

The first equality follows from the rules for differentiation of products and the second from the rules for the scalar triple product. Now taking t = i, j and k in turn we derive the vector equation

    ∮_C φ dr = ∬_S dS × ∇φ.
Exercise 3.6. State Stokes's theorem.
Evaluate both sides of the theorem for the vector field F = y i + z j + y k and the surface S of the hemisphere x² + y² + z² = 4, z ≥ 0, with normal in the positive z-direction. [You may find the expressions relating Cartesian and spherical polar coordinates useful.]

Exercise 3.7. Use the surface integral in Stokes's theorem to calculate the circulation of the field

    F = 2y i + 3x j − z² k

around the curve C, where C is the circle x² + y² = 9 in the x-y plane, counterclockwise when viewed from z > 0. [Answer: 9π.]
3.6 Conservative Fields and Scalar Potentials
(See Thomas 16.3)
Conservative fields play an important role in many applications. They can be defined by the property that the value of ∫_P^Q F·dr between points P and Q depends only on the endpoints and not on the path taken between them. An example of a field which is not conservative is the one in Example 3.2, where we explicitly found different answers depending on the path taken.

We first state and prove an important result. In its statement, "contractible" means we can continuously deform the region so it squashes to a point. (A torus, for example, is not contractible.)

Theorem 3.4 In a contractible region,

    ∇×F = 0  ⟺  there exists φ(r) such that F = ∇φ.        (3.11)
Proof:
(⇐): This was done at the end of Chapter 2.
(⇒): We proceed by defining

    φ(r) = ∫_a^r F·dr,        (3.12)

where a is an arbitrary but fixed point, and show that ∇φ = F as required. First, though, since we have not defined the path to be taken from a to r, we must show that the integral is independent of the path taken, i.e. that φ is well-defined.

Suppose that C₁ and C₂ are two different paths from a to r. We need to show that

    ∫_{C₁} F·dr = ∫_{C₂} F·dr.
Let C be the closed path formed by following C₁ from a to r and then taking C₂ backwards to get from r back to a. Let S be a surface spanning C. Then by Stokes's theorem:

    ∫_{C₁} F·dr − ∫_{C₂} F·dr = ∮_C F·dr = ∬_S (∇×F)·dS = 0.

Hence the value of φ does not depend on the path taken, and so φ is well-defined. [Note: Thomas gives a direct proof of the path-independence property for F = ∇V.]

Now

    φ(r + δr) − φ(r) = ∫_r^{r+δr} F·dr = F·δr (approx.)

and this is true for any (infinitesimal) vector δr. But by definition of ∇φ, δφ = ∇φ·δr. Hence

    ∇φ·δr = F·δr.

But this is true for all δr, so ∇φ = F, as we wanted to show. Q.E.D.
A vector field F with the property that ∇×F = 0 is called a conservative field. The name arises from the property, which we have just proved in passing (a simple application of Stokes's theorem), that for such a field the integral ∮ F·dr around any closed path is zero. (So if F is a force, for example, the net work in going round a path back to where one started is zero: energy is conserved. This leads to the mnemonic joke, with apologies to any reader whose political or religious views are conservative, that a conservative force is one that goes round and round without doing anything useful.) For a conservative field there exists a (scalar) potential φ such that F = ∇φ. In the case where F is a force, it is conventional to take φ(r) = −∫_a^r F·dr instead of (3.12), so that F = −∇φ, because φ can then be identified with the potential energy, which decreases when a body moves in the direction of the force. Note that the value of φ is only fixed up to an additive constant.
Example 3.11. Show that F = (z, z, x + y) satisfies ∇×F = 0, and find a φ such that F = ∇φ.

[Note that in answering questions of this sort, where you have to find φ, you might as well do that first, since then ∇×F = ∇×∇φ = 0 follows immediately.] We want

    (z, z, x + y) = (∂φ/∂x, ∂φ/∂y, ∂φ/∂z).

Equating the first components and integrating with respect to x gives

    ∂φ/∂x = z  ⟹  φ = xz + f(y, z),        (3.13)

where f is an (as yet) arbitrary function of y and z. Note that f is a "constant of integration" as far as differentiation with respect to x is concerned: when integrating partial derivatives we have to replace simple constants by functions of those variables not yet taken into account. The second components give

    z = ∂φ/∂y = ∂f/∂y,

where the second equality follows by substituting from (3.13). Hence

    ∂f/∂y = z  ⟹  f(y, z) = yz + g(z).

No x appears in g since we already know that f does not depend on x. So

    φ = xz + yz + g(z)

(g arbitrary as yet). Finally, the third components similarly give

    x + y = ∂φ/∂z = x + y + dg/dz.

Hence g has a zero derivative, i.e. is constant, and there is a φ given by

    φ = xz + yz + const.

(We could drop the constant here, as φ without it would still fulfil the conditions of the problem.) Since F = ∇φ, it follows that ∇×F = 0.
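A short numerical check (my own illustration, not in the notes) is to approximate ∇φ by central differences at an arbitrary point and compare with F = (z, z, x + y):

```python
def phi(x, y, z):
    """Candidate potential from Example 3.11 (constant dropped)."""
    return x * z + y * z

def grad(f, x, y, z, h=1e-6):
    """Central-difference approximation to the gradient of a scalar field."""
    return ((f(x + h, y, z) - f(x - h, y, z)) / (2 * h),
            (f(x, y + h, z) - f(x, y - h, z)) / (2 * h),
            (f(x, y, z + h) - f(x, y, z - h)) / (2 * h))

x, y, z = 1.3, -0.7, 2.1
print(grad(phi, x, y, z), (z, z, x + y))  # the two triples should agree
```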
An alternative derivation is to use the construction used in the proof above, taking the fixed point a at the origin 0 (which fixes the constant in φ):

    φ = ∫_0^r F·dr.

Since φ is independent of the path taken, let's choose one for our convenience: first from the origin 0 to (x, 0, 0) parallel to the x-axis; then from (x, 0, 0) to (x, y, 0) parallel to the y-axis; and finally from (x, y, 0) to r = (x, y, z) parallel to the z-axis. Then

    φ = ∫_0^x F_x(s, 0, 0) ds + ∫_0^y F_y(x, t, 0) dt + ∫_0^z F_z(x, y, u) du
      = 0 + 0 + ∫_0^z (x + y) du
      = xz + yz.
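The path construction itself can be mimicked numerically. The sketch below (an illustration of my own; the function names are invented) evaluates the work integral along the three axis-parallel segments and reproduces φ = xz + yz:

```python
def F(x, y, z):
    """The field of Example 3.11."""
    return (z, z, x + y)

def potential(X, Y, Z, n=10000):
    """phi(r) as the work integral of F along the axis-parallel path
    (0,0,0) -> (X,0,0) -> (X,Y,0) -> (X,Y,Z), by the midpoint rule."""
    total = 0.0
    for k in range(n):
        s = (k + 0.5) / n
        total += F(s * X, 0, 0)[0] * X / n   # segment parallel to x
        total += F(X, s * Y, 0)[1] * Y / n   # segment parallel to y
        total += F(X, Y, s * Z)[2] * Z / n   # segment parallel to z
    return total

print(potential(1.0, 2.0, 3.0), 1.0 * 3.0 + 2.0 * 3.0)  # both 9.0
```

The first two segments contribute nothing because z = 0 along them, exactly as in the calculation above.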
Example 3.12. The gravitational force on a ball of mass m is F = (0, 0, −mg). If the gravitational acceleration g can be assumed to be constant (which is an excellent approximation for everyday life: g ≈ 9.8 m s⁻²) then F = −∇φ where φ = mgz + const., z being measured, say, from the surface of the Earth. (We can measure z from wherever we wish, since a change of origin just changes the arbitrary constant in φ.) In this case φ is the gravitational potential energy.

Exercise 3.8. Show that F = (yz, zx, xy) is conservative and find a suitable potential φ such that F = ∇φ.
[Answer: φ = xyz + const.]
Exercise 3.9. For each of the following fields F, evaluate ∇×F and either find the general solution φ satisfying F = ∇φ everywhere, or show that no such φ exists:
(a) F = x² i + y² j + 2z k
(b) F = z² i + x² j + y² k
(c) F = 3z² i + 3y² j + 6xz k
(d) F = yz j − xy k.
3.7 Vector Potentials
(Note: this is not on the syllabus. It is included for completeness, for the sake of those who take later courses where it is used.)

We have seen that, if ∇×F = 0, then there exists a scalar potential φ such that F = ∇φ. There is a similar result if ∇·F = 0 instead:

Theorem 3.5 In a contractible domain,

    ∇·F = 0  ⟺  there exists A(r) such that F = ∇×A.

In the (⇐) direction, this is the identity ∇·(∇×A) = 0 discussed before. The proof in the other direction consists of writing down suitable integrals, in a way analogous to the proof of (3.11), and is messy, so we omit it.

The function A is called a vector potential. Note that one can always add an arbitrary function of the form ∇ψ to A and get another perfectly good vector potential for F, because ∇×(∇ψ) is zero for any ψ, and so

    ∇×(A + ∇ψ) = ∇×A + ∇×(∇ψ) = F + 0 = F.

In physical contexts this is referred to as a gauge transformation, and provides the basic example whose generalization gives all the modern gauge field theories of physics, the basis of our understanding of all microscopic physical processes.
Example 3.13. Any magnetic field B satisfies ∇·B = 0. So, for example, consider a constant magnetic field B = (0, 0, B₀) in the z-direction. A suitable vector potential A in this case is

    A = (−½B₀ y, ½B₀ x, 0),

since

    ∇×A = (∂A_z/∂y − ∂A_y/∂z, ∂A_x/∂z − ∂A_z/∂x, ∂A_y/∂x − ∂A_x/∂y)
        = (0 − 0, 0 − 0, ½B₀ − (−½B₀))
        = B.
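This too is easy to verify numerically; the hedged sketch below (helper names are my own) forms a central-difference curl of A and recovers (0, 0, B₀) at an arbitrary point:

```python
B0 = 2.5  # arbitrary field strength, chosen for illustration
Ax = lambda x, y, z: -0.5 * B0 * y
Ay = lambda x, y, z: 0.5 * B0 * x
Az = lambda x, y, z: 0.0

def curl(Fx, Fy, Fz, x, y, z, h=1e-6):
    """Central-difference curl of a vector field given by its components."""
    def d(f, axis):
        p = [x, y, z]
        m = [x, y, z]
        p[axis] += h
        m[axis] -= h
        return (f(*p) - f(*m)) / (2 * h)
    return (d(Fz, 1) - d(Fy, 2), d(Fx, 2) - d(Fz, 0), d(Fy, 0) - d(Fx, 1))

print(curl(Ax, Ay, Az, 0.3, -1.2, 0.8))  # close to (0, 0, B0)
```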
3.8 Derivations of the main theorems
(See Thomas 16.7 and 16.8)
[This section is not examinable]

We now return to the proofs of the Divergence and Stokes's Theorems.

Consider first the proof of the Divergence Theorem using rectangular boxes. Take a box [x₁, x₂] × [y₁, y₂] × [z₁, z₂]. Then for a vector A = A₁i + A₂j + A₃k,

    ∫ (∇·A) dV = ∭ (∂A₁/∂x + ∂A₂/∂y + ∂A₃/∂z) dx dy dz
               = ∬ [A₁]_{x₁}^{x₂} dy dz + ∬ [A₂]_{y₁}^{y₂} dx dz + ∬ [A₃]_{z₁}^{z₂} dx dy
               = ∬_front A₁ dy dz − ∬_back A₁ dy dz + ∬_right end A₂ dx dz − ∬_left end A₂ dx dz        (3.14)
                 + ∬_top A₃ dx dy − ∬_bottom A₃ dx dy.

On the front of the box (i.e. the surface x = x₂), dS = i dy dz, while on the back (x = x₁), dS = −i dy dz, so the first two terms in (3.14) are ∬ A·dS for the front and back. Similarly for the remaining terms.
One can complete a proof by decomposing a volume into such boxes and adding the results, noting that the surface integrals on a face common to two boxes will cancel one another. This overlooks the difficulty of proving that the surface integral for all the boxes gives a correct limit for the smooth surface (for the volume integral this just follows from the definition of such integrals).

Instead we can work towards a correct proof by first noting that the terms match up in the sense that

    ∭_D (∂A₃/∂z) dx dy dz = ∬_S A₃ (dS)_z        (3.15)

for the box. (What we thus really do is prove the theorem for F = A₃k and then add together three such results.)
We now have to cope with some technical points:
1. We must be able to integrate the derivatives of A once. A sufficient condition is that all first derivatives of A are piecewise continuous. If the derivatives have discontinuities we have to do the proof for each smooth piece separately and then add the results.
2. That first point implies A itself is piecewise continuous.
3. We need the surface to be bounded (so we have a finite area) and closed (so we have a finite volume).
4. We need to be able to integrate ∬ A·dS. So we want to be able to assign coordinates on pieces of the surface S, say (u, v), in such a way that (e_u × e_v) du dv can be defined and calculated, i.e. we want the map ℝ² → ℝ³ : (u, v) → (x(u, v), y(u, v), z(u, v)) to be (piecewise) sufficiently differentiable.

These assumptions ensure we can break D up into convex pieces. "Convex" means that any line cuts the surface at most twice. So now we have the form:
Theorem 3.6 If S is a bounded closed piecewise smooth orientable surface enclosing a volume D, and if F is a vector field all of whose first derivatives are continuous, then

    ∫_D ∇·F dV = ∬_S F·n dS = ∬_S F·dS,

where n is the normal outward-pointing from D.
Proof: [This proof, which I have been using for about 30 years, is more-or-less identical, with slight changes in notation, to the one given by Thomas.] We break D into convex pieces and first prove the result for a single convex piece (which we call D₁). In fact we need only prove (3.15). Consider lines parallel to the z-axis. Those which meet D₁ either meet it twice or touch it on a closed curve. Divide the surface into S⁺ and S⁻, the upper and lower halves (i.e. S⁻ is where the lines parallel to the z-axis first meet S: see Figure 3.3). Then, just using the fundamental theorem of calculus,

    ∭_{D₁} (∂A₃/∂z) dx dy dz = ∬_{S⁺} A₃(x, y, z₂) dx dy − ∬_{S⁻} A₃(x, y, z₁) dx dy.

On S⁺, (A₃k)·dS = A₃ |dS| cos θ = A₃ dx dy, and similarly on S⁻. Hence we have shown that

    ∭_{D₁} (∂A₃/∂z) dx dy dz = ∬_S (A₃k)·dS,

and adding similar results for A₁ and A₂ we get the Divergence Theorem for D₁. When we re-combine the convex pieces, the surfaces where they join appear twice in the surface integrals, once with each of the two possible signs for the normal, so these parts cancel one another and only the integral over the bounding surface remains. Q.E.D.

We showed above that the Divergence Theorem implies Green's theorem. We only have left to prove Stokes's theorem. The conditions are arrived at by similar considerations to those for the Divergence Theorem.
Figure 3.3: Convex surface used in the proof of the Divergence Theorem
Theorem 3.7 For any piecewise smooth surface S bounded by a piecewise smooth curve C, on which ∇×F is piecewise continuous,

    ∬_S (∇×F)·dS = ∮_C F·dr,

where the integral round C is taken in the direction which is counter-clockwise as seen from the side of S pointed to by dS.
Proof: The conditions imply that the surface can be decomposed in pieces which project to regions in one of the planes of Cartesian coordinates; without loss of generality say the (x, y) plane. We prove the result for one such region. Suppose we have coordinates (u, v) on this region. We also consider only the terms involving P, where F = (P, Q, R) (i.e. we prove the result for F = Pi first).

    ∮_C P dx = ∮_C P (∂x/∂u du + ∂x/∂v dv)
             = ∬ [∂/∂u (P ∂x/∂v) − ∂/∂v (P ∂x/∂u)] du dv        by Green's theorem
             = ∬ (∂P/∂u ∂x/∂v − ∂P/∂v ∂x/∂u) du dv
             = ∬ [(∂P/∂x ∂x/∂u + ∂P/∂y ∂y/∂u + ∂P/∂z ∂z/∂u) ∂x/∂v
                  − (∂P/∂x ∂x/∂v + ∂P/∂y ∂y/∂v + ∂P/∂z ∂z/∂v) ∂x/∂u] du dv        using the Chain Rule
             = ∬ ∂P/∂y (∂y/∂u ∂x/∂v − ∂y/∂v ∂x/∂u) du dv + ∬ ∂P/∂z (∂z/∂u ∂x/∂v − ∂z/∂v ∂x/∂u) du dv,

and taking the cross product of

    dr_u = (∂x/∂u i + ∂y/∂u j + ∂z/∂u k) du,
    dr_v = (∂x/∂v i + ∂y/∂v j + ∂z/∂v k) dv,

easily shows that the double integrals give

    ∬ [−∂P/∂y (dS)_z + ∂P/∂z (dS)_y],

which is the part of (∇×F)·dS involving P. To complete the proof we add the parts with Q and R and add together the results from the pieces into which a general S has to be split. Q.E.D.
Chapter 4
Index Notation and the Summation Convention

Syllabus covered:
3. Index notation and the Summation Convention; summation over repeated indices; Kronecker delta and ε_ijk; formula for ε_ijk ε_klm.
We now introduce a very useful notation, which makes proving identities such as those in Chapter 2 much simpler. An extended version of it is used in MAS 322 Relativity, and it can be widely used in linear algebra and its applications (e.g. input-output models in economics).

Index notation

In this notation we abbreviate a vector a = (a₁, a₂, a₃) as a_i (really, the notation means the i-th component of a). The same idea works equally well in any number of dimensions, but throughout this course we shall assume that indices run from 1 to 3.

The name of the index (here, i) is irrelevant, since it just provides a way of saying we are dealing with the vector a, except when we write equations such as a_i = b_i + c_i; here the index names must all match so that we know we actually mean the three equations a₁ = b₁ + c₁, a₂ = b₂ + c₂ and a₃ = b₃ + c₃. Note that this means that if we wish to mix the vector and index ways of writing things we must write (a)_i = a_i and so on. We can apply the same idea to matrices, e.g. C_ij is a notation for a 3 × 3, or more generally n × n, matrix C.

Einstein summation convention

One compression of notation which is very useful allows us to drop summation signs. For example, the dot product of a and b is

    a·b = Σ_{i=1}^{3} a_i b_i = a_i b_i.        (4.1)

The last expression uses Einstein's summation convention: if the same index (in this case i) occurs twice, it implies that you sum over its possible values. The name i is irrelevant: a_j b_j means exactly the same thing. Indices that are summed over are called dummy indices; the others are called free.
Using this convention, we can write the i-th component of the matrix product Cb as

    (Cb)_i = Σ_{j=1}^{3} C_ij b_j = C_ij b_j.

Example 4.1. We can easily prove the matrix identity (AB)ᵀ = BᵀAᵀ. Writing the matrix C = AB, we have

    Cᵀ_ij = C_ji = A_jk B_ki = Aᵀ_kj Bᵀ_ik = Bᵀ_ik Aᵀ_kj.
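The index-notation definition of the matrix product translates directly into nested sums. The sketch below (my own illustration; any language with lists would do) implements C_ij = A_ik B_kj with an explicit sum over the dummy index and confirms the transpose identity on a sample pair of matrices:

```python
def matmul(A, B):
    """C_ij = A_ik B_kj, with explicit summation over the dummy index k."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    n = len(A)
    return [[A[j][i] for j in range(n)] for i in range(n)]

A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[2, 1, 1], [0, 2, 0], [1, 0, 2]]
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))  # True
```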
The rules for the Einstein summation convention are:
1. In any term (which may be a product), an index can appear at most twice.
2. If an index appears twice in the same term, it means we are to sum over all allowed values (dummy index).
3. If an index appears once only, the same index must appear once only in all other terms in an equation (free index).

Note that the names of the indices (i.e. i, j and so on) have no significance in themselves, providing the rules are obeyed.

If we are using the index convention but wish to violate these rules, we can do so by going back to putting summation signs in explicitly, or by saying after an equation "no sum", or whatever other change is needed to indicate the violation.
Exercise 4.1. Which of the following have meaning in Einstein's summation convention:
1. (a_i b_i) c_j = (c_j d_j) b_i
2. (a_k b_k) c_j = (a_m b_m) d_j
3. a_i b_j c_k d_k e_f = m_i n_j p_f q_k
4. a_i b_j = a_j b_i
5. a_m = d_k b_m / (c_k b_k)
Kronecker delta

We now introduce an object called the Kronecker delta:

    δ_ij = 1 if i = j,    δ_ij = 0 if i ≠ j.

So for example,

    δ_{1j} a_j = Σ_{j=1}^{3} δ_{1j} a_j = a₁

and hence

    δ_ij a_j = Σ_{j=1}^{3} δ_ij a_j = a_i.        (4.2)

What is δ_ii? Don't forget that we must sum over a repeated index; so

    δ_ii = Σ_{i=1}^{3} δ_ii = δ₁₁ + δ₂₂ + δ₃₃ = 1 + 1 + 1 = 3.

Note that the matrix represented by δ_ij is the unit matrix and δ_ii is its trace.
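Both facts, the "index-picking" property (4.2) and δ_ii = 3, are immediate to check by brute force (a small illustration of my own, not part of the notes):

```python
def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

a = [5, -2, 7]
# delta_ij a_j = a_i: summing over the dummy index j just picks out a_i
picked = [sum(delta(i, j) * a[j] for j in range(3)) for i in range(3)]
trace = sum(delta(i, i) for i in range(3))  # delta_ii, summed over i
print(picked, trace)  # [5, -2, 7] 3
```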
Levi-Civita epsilon

We define another new object, the Levi-Civita epsilon, by

    ε_ijk =  1 if (i, j, k) is a cyclic permutation of (1, 2, 3),
    ε_ijk = −1 if (i, j, k) is an anticyclic permutation of (1, 2, 3),
    ε_ijk =  0 otherwise.

[Aside: in n dimensions, we would have to replace cyclic and anticyclic in the above definition by even and odd.] Thus

    ε₁₂₃ = ε₂₃₁ = ε₃₁₂ = 1,    ε₁₃₂ = ε₃₂₁ = ε₂₁₃ = −1,

and all other possibilities (e.g. ε₁₁₂, ε₃₃₃, ε₂₃₂) are zero.
Consider then

    ε_{1jk} a_j b_k = ε₁₂₃ a₂ b₃ + ε₁₃₂ a₃ b₂ = a₂b₃ − a₃b₂.

Likewise

    ε_{2jk} a_j b_k = a₃b₁ − a₁b₃,    ε_{3jk} a_j b_k = a₁b₂ − a₂b₁.

Thus

    ε_ijk a_j b_k = (a × b)_i.        (4.3)

The determinant of a matrix A can be written ε_ijk A_{1i} A_{2j} A_{3k}.
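Both formulae are easy to exercise on a computer. The sketch below (my own, with indices running 0–2 rather than 1–3, and using a closed-form expression for ε that I supply for convenience) recovers the cross product and a determinant:

```python
def eps(i, j, k):
    """Levi-Civita symbol, with indices running over 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def cross(a, b):
    """(a x b)_i = eps_ijk a_j b_k, cf. (4.3)."""
    return [sum(eps(i, j, k) * a[j] * b[k]
                for j in range(3) for k in range(3)) for i in range(3)]

def det3(A):
    """det A = eps_ijk A_0i A_1j A_2k (0-indexed rows)."""
    return sum(eps(i, j, k) * A[0][i] * A[1][j] * A[2][k]
               for i in range(3) for j in range(3) for k in range(3))

print(cross([1, 0, 0], [0, 1, 0]))               # [0, 0, 1]
print(det3([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))   # 24
```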
Example 4.2. Show that if T_jk = T_kj for all values of j and k, then ε_ijk T_jk must be zero.

    ε_ijk T_jk = −ε_ikj T_kj        swapping indices and using the symmetries
              = −ε_imn T_mn        renaming dummy indices k → m and j → n
              = −ε_ijk T_jk        again renaming dummy indices, now m → j and n → k.

So 2ε_ijk T_jk = 0.

Here ε_ijk could be replaced with anything skew in jk: the outcome is referred to as "skew summed with symmetric is zero".
This looks like a trick. To understand it better, it may help to look at a two-dimensional case with a skew object A_ij. Then

    A_ij T_ij = A₁₁T₁₁ + A₁₂T₁₂ + A₂₁T₂₁ + A₂₂T₂₂.

Since A is skew, A₁₁ = A₂₂ = 0. Then substituting A₂₁ = −A₁₂ and T₂₁ = T₁₂, the remaining terms cancel. This argument can be repeated in any number of dimensions but is very laborious to write out. The index form achieves the same much more economically.
An important identity is

    ε_ijk ε_ilm = δ_jl δ_km − δ_jm δ_kl.        (∗)

This can be proved by comparing both sides for all possible choices of (j, k, l, m): e.g. if (j, k, l, m) = (1, 2, 1, 2) the left-hand side of (∗) is

    ε_{i12} ε_{i12} = ε₃₁₂ ε₃₁₂ = 1,

and the right-hand side is

    δ₁₁ δ₂₂ − δ₁₂ δ₁₂ = 1 − 0 = 1.

A more abstract argument is to note that (a) the product of the epsilons is zero unless the pairs jk and lm contain the same pair of indices, (b) if the order in the pairs is the same then both epsilons are +1 or both are −1, and the right side is +1, and (c) a similar argument applies when jk and lm have opposite orders.
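Since there are only 81 choices of (j, k, l, m), the case-by-case proof can be delegated entirely to a few lines of code (my own check, with 0-based indices):

```python
def eps(i, j, k):
    """Levi-Civita symbol, indices 0..2."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    return 1 if i == j else 0

# Check eps_ijk eps_ilm = delta_jl delta_km - delta_jm delta_kl
# for every one of the 81 choices of (j, k, l, m).
ok = all(sum(eps(i, j, k) * eps(i, l, m) for i in range(3))
         == delta(j, l) * delta(k, m) - delta(j, m) * delta(k, l)
         for j in range(3) for k in range(3)
         for l in range(3) for m in range(3))
print(ok)  # True
```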
Exercise 4.2. Pick another set of values for (j, k, l, m) and verify that (∗) holds.

The previous lecturers on this course said: "Identity (∗) is important: write it on your bathroom mirror so you see it every morning!"

Note that due to the cyclic symmetry

    ε_ijk ε_ilm = ε_ijk ε_lmi = ε_ijk ε_mil = −ε_ijk ε_mli etc.,

so the key point is that one of the indices on the first epsilon is the same as one of the indices on the second epsilon: then it can always be rearranged to the standard form quoted. Moreover, the names of the indices themselves can be changed; only the pattern of their occurrence is significant. E.g. it also follows that

    ε_pik ε_mjp = δ_im δ_kj − δ_ij δ_km.
Example 4.3. Expand out (a × b)·(c × d) using index notation.

    (a × b)·(c × d) = ε_ijk a_j b_k ε_ilm c_l d_m                [using (4.1) and (4.3)]
                    = (δ_jl δ_km − δ_jm δ_kl) a_j b_k c_l d_m    [using (∗)]
                    = a_l c_l b_m d_m − a_m d_m b_k c_k          [using (4.2)]

so

    (a × b)·(c × d) = (a·c)(b·d) − (a·d)(b·c)

in index notation.
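A spot check on arbitrary integer vectors (my own illustration) confirms the final identity:

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

a, b, c, d = [1, 2, 3], [4, 5, 6], [7, 8, 10], [1, -1, 2]
lhs = dot(cross(a, b), cross(c, d))
rhs = dot(a, c) * dot(b, d) - dot(a, d) * dot(b, c)
print(lhs, rhs)  # the two sides agree
```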
We will from now on use the convention that x_i refers to the position vector, so that x₁ = x, x₂ = y and x₃ = z in Cartesian coordinates.

Since ∇V = (∂V/∂x₁, ∂V/∂x₂, ∂V/∂x₃), we can write the i-th component of ∇V as ∂V/∂x_i. For brevity we introduce the shorthand notation

    ∂_i = ∂/∂x_i

and write the gradient as

    (∇V)_i = ∂_i V.        (4.4)

Note that ∂_i x_j = δ_ij.
The divergence of a vector field F_i is

    ∂F₁/∂x₁ + ∂F₂/∂x₂ + ∂F₃/∂x₃,

so we can write

    ∇·F = ∂F_i/∂x_i = ∂_i F_i.        (4.5)

Likewise

    ε_ijk ∂F_k/∂x_j = ε_ijk ∂_j F_k        (4.6)

is the i-th component of the curl, ∇×F.
Example 4.4. Expand ∇×(F×G) using index notation.

    (∇×(F×G))_i = ε_ijk ∂_j (ε_klm F_l G_m)
                = (δ_il δ_jm − δ_im δ_jl) ∂_j (F_l G_m)
                = ∂_m (F_i G_m) − ∂_l (F_l G_i)
                = F_i (∂_m G_m) + G_m ∂_m F_i − G_i (∂_m F_m) − F_m ∂_m G_i

so ∇×(F×G) = F(∇·G) + (G·∇)F − G(∇·F) − (F·∇)G.

Explanation: Here we are substituting F×G for F in (4.6), using (∗), differentiating the products, and converting back to vector notation using (4.4) and (4.5).
Exercise 4.3. Using index notation, prove the identity

    ∇·(f v) = f ∇·v + (∇f)·v.

Now repeat this exercise, but without using index notation (i.e. writing out the left-hand side of the identity in full, and showing that it can be rearranged to give the right-hand side).

Exercise 4.4. Show that

    ∂_i (ε_ijk u_j v_k) = v_k ε_kij ∂_i u_j − u_j ε_jik ∂_i v_k.

Hence prove that

    ∇·(u × v) = v·(∇×u) − u·(∇×v).        (∗∗)

Now prove identity (∗∗) without using index notation (i.e. writing out the left-hand side of the identity in full and showing that this can be rearranged to give the right-hand side).
One can prove the remaining identities for derivatives of products or products of derivatives similarly. Even the harder standard product identities, e.g.

    ∇(u·v) = u×(∇×v) + v×(∇×u) + (u·∇)v + (v·∇)u        (4.7)

are not very difficult: for this one, we start with the first term on the right, using (4.3) and (4.6), and work analogously to Examples 4.3 and 4.4 to get

    ε_ijk u_j (ε_klm ∂_l v_m) = u_j ∂_i v_j − u_j ∂_j v_i        (4.8)

and then swap u and v and add.
Example 4.5. Use index notation to evaluate ∇×(∇×F).

    (∇×(∇×F))_i = ε_ijk ∂_j (ε_klm ∂_l F_m)
                = (δ_il δ_jm − δ_im δ_jl) ∂_j ∂_l F_m
                = ∂_i ∂_m F_m − ∂_l ∂_l F_i

so

    ∇×(∇×F) = ∇(∇·F) − ∇²F,

using (4.6) for the first equality; the second follows from (∗).
Chapter 5
Orthogonal Curvilinear Coordinates

Syllabus section:
4. Orthogonal curvilinear coordinates; length of line element; grad, div and curl in curvilinear coordinates; spherical and cylindrical polar coordinates as examples.

So far we have only used Cartesian coordinates. Often, because of the geometry of the problem, it is easier to work in some other coordinate system. Here we show how to do this, restricting the generality only by an orthogonality condition.

5.1 Plane Polar Coordinates

In Calculus II you met the simplest curvilinear coordinates in two dimensions, plane polars, defined by

    x = r cos θ,    y = r sin θ.

We can easily invert these relations to get

    r = √(x² + y²),    θ = arctan(y/x).
The Chain Rule enables us to relate partial derivatives with respect to x and y to those with respect to r and θ, and vice versa, e.g.

    ∂f/∂r = (∂x/∂r)(∂f/∂x) + (∂y/∂r)(∂f/∂y) = (x/r)(∂f/∂x) + (y/r)(∂f/∂y).        (5.1)

In Calculus II, the rule for changing coordinates in integrals is also given. The general rule is that if x = x(u, v), y = y(u, v), then the dx dy in the integral is replaced by

    |(∂x/∂u)(∂y/∂v) − (∂x/∂v)(∂y/∂u)| du dv.

This is just dS = |r_u × r_v| du dv, as derived in section 3.2. For polar coordinates this gives just dS = r dr dθ.
Example 5.1. The Gaussian integral (related to the Gaussian distribution in statistics).
Consider the integral

    ∬ e^{−(x²+y²)} dx dy = (∫_{−∞}^{∞} e^{−x²} dx)².

Transforming to polar coordinates gives

    ∫_0^∞ r e^{−r²} dr ∫_0^{2π} dθ = [−½ e^{−r²}]_0^∞ [θ]_0^{2π} = π

and hence (according to Dr. Saha "the most beautiful of all integrals")

    ∫_{−∞}^{∞} e^{−x²} dx = √π.
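Since e^{−x²} has no elementary antiderivative, this is a pleasing case where a numerical check (my own addition) is the only easy alternative; a simple midpoint rule over a truncated interval reproduces √π:

```python
import math

# Midpoint-rule approximation to the integral of exp(-x^2) over the real
# line; the tails beyond |x| = 10 contribute less than exp(-100).
n, L = 200000, 10.0
h = 2 * L / n
approx = sum(math.exp(-(-L + (k + 0.5) * h) ** 2) for k in range(n)) * h
print(approx, math.sqrt(math.pi))  # both about 1.7724539
```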
For later use, consider the unit vectors in the directions in which r and θ increase at a point, which we denote e_r and e_θ. These point along the coordinate lines, a "coordinate line" meaning a curve on which only one of the coordinates is varying. Coordinate lines are generalizations of coordinate axes. We know how to find the displacements arising from small changes in the coordinates, so all we have to do is divide the displacements by their lengths. Thus

    dr_r = r_r dr = (cos θ i + sin θ j) dr,    |r_r| = 1  ⟹  e_r = r_r = cos θ i + sin θ j,

while

    dr_θ = r_θ dθ = (−r sin θ i + r cos θ j) dθ,    |r_θ| = r  ⟹  e_θ = r_θ/r = −sin θ i + cos θ j.

Here we again use the notation introduced in section 3.2, i.e. r_u ≡ ∂r/∂u, although it looks a bit odd when the coordinate r is being used.

We are now going to consider three-dimensional versions of polar coordinates.
5.2 Cylindrical Polar Coordinates

For these, we turn the plane polars in the x, y plane into three-dimensional coordinates by simply using z as the third coordinate (see Fig. 5.1). To avoid confusion with other coordinate systems, we shall for clarity¹ rename r and θ as ρ and φ, but beware that in other courses, books, and applications of these ideas, r and θ will still be used. Thus we have

    x = ρ cos φ,    y = ρ sin φ,    z = z,

and quantities in the planes z = constant will be as in plane polars. A line of constant φ and z (a coordinate line for ρ, which is a straight line) and a line of constant ρ and z (a coordinate line for φ, which is a circle) are shown in the figure. Thomas's Fig. 15.37 shows a nice diagram of surfaces on which one of the coordinates is constant.

These coordinates are natural ones to use whenever there is a problem involving cylindrical geometry.

¹Unfortunately, for the same reasons of clarity, Thomas adopts the alternative solution of renaming two of the spherical polar coordinates. To avoid confusion with past years' exam papers I have kept to the choice used there, which is also the one used in most books. Thomas chooses (ρ, φ, θ) for the usual (r, θ, φ). The swap of θ and φ is particularly likely to be confusing.
Figure 5.1: Cylindrical polar coordinates relative to Cartesian, and with sample - and -curves shown.
To get partial derivatives in curvilinear coordinates we again use the chain rule (5.1), but now with three terms on the right. When doing volume integrals, we may need the volume element, which is dV = ρ dρ dφ dz.

Taking the plane polar results, changing variable names and appending e_z = k, the unit vectors along the coordinate lines are e_ρ = cos φ i + sin φ j, e_φ = −sin φ i + cos φ j and e_z = k respectively. We can write this in matrix form as

    ( e_ρ )   (  cos φ   sin φ   0 ) ( i )
    ( e_φ ) = ( −sin φ   cos φ   0 ) ( j )        (5.2)
    ( e_z )   (    0       0     1 ) ( k )

The matrix here, R say, is a rotation matrix, i.e. one such that R⁻¹ = Rᵀ. One can see this from the fact that it transforms one set of mutually orthogonal unit vectors (i, j, k) into another (e_ρ, e_φ, e_z).

We can use the unit vectors and the lengths of r_ρ, r_φ and r_z (respectively 1, ρ and 1) to find area elements, e.g. on the surfaces ρ = constant the outward area element is dS = (r_φ × r_z) dφ dz = ρ dφ dz e_ρ.
5.3 Spherical Polar Coordinates

These are coordinates (r, θ, φ), where r measures distance from the origin, θ measures angle from some chosen axis, called the polar axis, and φ measures angle around that axis (see Fig. 5.2). To relate them to Cartesian coordinates we usually assume that the z-axis is the polar axis, and then

    x = r sin θ cos φ,    y = r sin θ sin φ,    z = r cos θ.

The φ here is the same as that of cylindrical polars, which explains why we used the same letter. The inverse of these relations is

    r = √(x² + y² + z²),    θ = arctan(√(x² + y²)/z),    φ = arctan(y/x).

Coordinate lines of r, i.e. lines of constant θ and φ, are straight radial lines from the origin; coordinate lines of θ (constant r and φ) are meridional semicircles, i.e. semicircles centred at the origin and in a plane containing the polar axis; and coordinate lines of φ (constant r and θ) are latitudinal circles, i.e. circles centred at a point on the polar axis and in a plane perpendicular to it. Note however that while r runs from 0 to ∞ (like the r of plane polars and the ρ of cylindrical polars) and φ runs from 0 to 2π (like the θ of plane polars), θ only runs from 0 to π. The coordinate lines of θ are strictly semi-circles, rather than circles. To make a circle we have to take the coordinate lines of θ for two different φ, say φ₀ and φ₀ + π. Thomas's Fig. 15.42 shows a nice diagram of surfaces on which one of the coordinates is constant.

Figure 5.2: Spherical polar coordinates relative to Cartesian, and with sample r-, θ- and φ-curves shown.

You should beware of the fact that some authors, including Thomas, use different notation, in particular swapping the meanings of θ and φ in the definition of spherical polars. We shall consistently use the above notation for spherical polar coordinates, which is the most common one, throughout this course.

Note that these again generalize the plane polar coordinates, but this time the polars r, θ are in planes containing the z (or polar) axis, rather than in planes perpendicular to it.

These coordinates are of course the natural ones to use when we have a spherical geometry, or part of a sphere.
Calculating derivatives of r with respect to each of the coordinates in turn, we have

    r_r = sin θ cos φ i + sin θ sin φ j + cos θ k,
    r_θ = r cos θ cos φ i + r cos θ sin φ j − r sin θ k,
    r_φ = −r sin θ sin φ i + r sin θ cos φ j.

The lengths of these are respectively 1, r, and r sin θ. Dividing the derivatives by their lengths gives us the unit vectors e_r, e_θ and e_φ along the coordinate lines, which we can write as

    ( e_r )   ( sin θ cos φ   sin θ sin φ    cos θ ) ( i )
    ( e_θ ) = ( cos θ cos φ   cos θ sin φ   −sin θ ) ( j )        (5.3)
    ( e_φ )   (   −sin φ         cos φ         0   ) ( k )

The matrix here is again a rotation matrix.

It may be worth noting that r = r e_r.
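The rotation-matrix claim is simple to test numerically; the sketch below (my own check) builds the matrix of (5.3) at arbitrary angles and verifies R Rᵀ = I, i.e. that its rows, the unit vectors, are orthonormal:

```python
import math

def R(theta, phi):
    """The matrix of (5.3), mapping (i, j, k) to (e_r, e_theta, e_phi)."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [[st * cp, st * sp, ct],
            [ct * cp, ct * sp, -st],
            [-sp, cp, 0.0]]

theta, phi = 0.7, 2.1
M = R(theta, phi)
# P = R R^T; it should be the 3x3 identity matrix.
P = [[sum(M[i][k] * M[j][k] for k in range(3)) for j in range(3)]
     for i in range(3)]
print([[round(P[i][j], 10) for j in range(3)] for i in range(3)])
```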
The volume element is given by (r_r × r_θ)·r_φ dr dθ dφ = r² sin θ dr dθ dφ. To get the area element on a sphere (i.e. a surface of constant r) we can take r_θ × r_φ dθ dφ = r² sin θ dθ dφ e_r. Similar results hold for surfaces of constant θ and of constant φ.
Example 5.2. Earth polar coordinates
To define spherical polars on the Earth, let the polar axis be the Earth's rotation axis, with z increasing to the North, let the equator define the x, y plane, and let the prime meridian (the one through Greenwich) be φ = 0. Then any point on the Earth's surface can be referred to by the spherical polar angles (θ, φ). In navigation people use latitude and longitude. Longitude is measured East or West from the prime meridian and is in the range (0, 180°), so to get φ for a place with Westerly longitude we just subtract it from 2π = 360°. Latitude is defined to be 0 at the equator (whereas θ = 90° = π/2 there). Given a latitude, we need to subtract it from 90° if it is North and add it to 90° if it is South.

For example Buenos Aires, which has latitude 34°36′ S and longitude 58°22′ W, will have Earth polar coordinates θ = 125°, φ = 302° to the nearest degree.
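The conversion rule can be captured in a few lines (a sketch of my own; the function name and argument conventions are invented for illustration), which reproduces the Buenos Aires figures:

```python
def earth_polar(lat_deg, ns, lon_deg, ew):
    """Latitude/longitude to spherical polar angles (theta, phi) in
    degrees, following the prescription of Example 5.2."""
    theta = 90 - lat_deg if ns == 'N' else 90 + lat_deg
    phi = lon_deg if ew == 'E' else 360 - lon_deg
    return theta, phi

# Buenos Aires: latitude 34 deg 36' S, longitude 58 deg 22' W
theta, phi = earth_polar(34 + 36 / 60, 'S', 58 + 22 / 60, 'W')
print(round(theta), round(phi))  # 125 302
```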
5.4 Some applications of these polar coordinates

Using polar (or cylindrical) coordinates the area within a circle of radius R, ∫_0^R ∫_0^{2π} r dθ dr, comes out immediately as πR². Using spherical polar coordinates the volume of a sphere of radius R,

    ∫_0^R ∫_0^π ∫_0^{2π} r² sin θ dφ dθ dr,

comes to (4/3)πR³.

Example 5.3. Area of a cone
Consider the conical surface θ = θ₁ cut in a sphere of radius s. The area is given by integrating

    ∫_0^{2π} dφ ∫_0^s sin θ₁ r dr = πs² sin θ₁.

Here s is the slant height of the cone. The cone's base radius (say b) is going to be s sin θ₁. Hence we can express the sloping area of a cone neatly as πsb.
Example 5.4. We now reconsider Example 3.5.
Find the flux of the field F = zk across the portion of the sphere x² + y² + z² = a² in the first octant, with normal taken in the direction away from the origin.

Because of the geometry of the surface, it is easiest to work in spherical polar coordinates (r, θ, φ). The normal to the sphere that points away from the origin is e_r, the outward radial vector of unit length. Now

    F·e_r = zk·e_r = z cos θ = r cos²θ.

An area element on the surface of a sphere of radius r is (r dθ)(r sin θ dφ) = r² sin θ dθ dφ. Thus

    ∬ F·n dS = ∫_0^{π/2} ∫_0^{π/2} a cos²θ a² sin θ dθ dφ
             = a³ ∫_0^{π/2} cos²θ sin θ dθ ∫_0^{π/2} dφ
             = (π/2) a³ [−⅓ cos³θ]_0^{π/2}
             = (π/6) a³.
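A crude numerical quadrature (my own check, not part of the notes) over the two angles confirms the value πa³/6:

```python
import math

def flux(a, n=400):
    """Midpoint-rule evaluation of the double integral of Example 5.4:
    flux of F = z k over the first-octant part of a sphere of radius a."""
    h = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        f = a ** 3 * math.cos(theta) ** 2 * math.sin(theta)
        for j in range(n):          # the integrand is phi-independent
            total += f * h * h
    return total

a = 2.0
print(flux(a), math.pi * a ** 3 / 6)  # both about 4.18879
```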
Example 5.5. Cutting an apple
In his book, Matthews poses a good problem for illustrating integration using curved coordinates: "A cylindrical apple corer of radius $a$ cuts through a spherical apple of radius $b$. How much of the apple does it remove?"
We can reformulate the problem slightly, without losing generality, by letting the radius of the apple equal unity and introducing $\sin\theta_1 = a/b$ (i.e. we scale the problem by $b$). In our restated problem the corer cuts through the peel at $\theta = \theta_1$ and $\theta = \pi - \theta_1$ in spherical polars, i.e. in cylindrical polars at
$$\rho = \sin\theta_1, \qquad z = \cos\theta_1,$$
and, of course, at $z = -\cos\theta_1$.
We can now complete the solution of this problem in (at least) four different ways: three of these are relegated to an appendix, not given in lectures.²
The first way is to integrate over $z$ and then $\rho$:
$$4\pi \int_0^{\sin\theta_1} \rho\,d\rho \int_0^{\sqrt{1-\rho^2}} dz = 4\pi \int_0^{\sin\theta_1} \rho\,(1-\rho^2)^{1/2}\,d\rho = \frac{4\pi}{3}\left(1 - \cos^3\theta_1\right).$$
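The closed form $\frac{4\pi}{3}(1-\cos^3\theta_1)$ can be checked against a direct quadrature of the $\rho$ integral (a sketch of my own; `cored_volume` is an assumed name):

```python
import math

def cored_volume(theta1, n=4000):
    """Midpoint-rule evaluation of 4*pi * int_0^{sin(theta1)} rho*sqrt(1-rho^2) d rho,
    the volume removed by the corer for an apple of unit radius."""
    top = math.sin(theta1)
    d = top / n
    s = sum((i + 0.5) * d * math.sqrt(1 - ((i + 0.5) * d) ** 2)
            for i in range(n)) * d
    return 4 * math.pi * s

t1 = 0.6
print(cored_volume(t1), 4 * math.pi / 3 * (1 - math.cos(t1) ** 3))
```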
5.5 General Orthogonal Curvilinear Coordinates
The two sets of polar coordinates above have a feature in common: the three sets of coordinate lines are
orthogonal to one another at all points. The corresponding unit vectors are also orthogonal.
General orthogonal coordinates are coordinates for which these properties are true, i.e. the coordinate
lines are always mutually perpendicular. In general, coordinates need not be orthogonal. However, we shall
be concerned only with orthogonal curvilinear coordinates. Cylindrical polars and spherical polars are the
only non-Cartesian coordinate systems in which you will be expected to perform explicit calculations in this
course, apart from simple substitutions into the general formulae.
Suppose $(u_1, u_2, u_3)$ are a general set of coordinates. The displacement corresponding to a small change in $u_1$ is
$$d\mathbf{r}_1 = \frac{\partial \mathbf{r}}{\partial u_1}\,du_1 = \left(\frac{\partial x}{\partial u_1}\mathbf{i} + \frac{\partial y}{\partial u_1}\mathbf{j} + \frac{\partial z}{\partial u_1}\mathbf{k}\right) du_1.$$
If $|d\mathbf{r}_1| = h_1\,du_1$, then $h_1$ is called an arc-length parameter: it relates the actual length of an arc to the magnitude of the change in coordinate. The unit vector $\mathbf{e}_1$ along a line of constant $u_2$ and $u_3$ is then
$$\mathbf{e}_1 = \frac{d\mathbf{r}_1}{|d\mathbf{r}_1|} = \frac{1}{h_1}\frac{\partial \mathbf{r}}{\partial u_1},$$
and conversely
$$\frac{\partial \mathbf{r}}{\partial u_1} = h_1 \mathbf{e}_1.$$
It is easy to calculate that
$$h_1^2 = \left(\frac{\partial x}{\partial u_1}\right)^2 + \left(\frac{\partial y}{\partial u_1}\right)^2 + \left(\frac{\partial z}{\partial u_1}\right)^2.$$
²I give only the key steps. Some algebraic filling-in is needed. In each version we can shorten the calculations by ignoring the $\phi$ integration, which trivially gives a factor $2\pi$ in every term, and by doing the integrals only for $z \geq 0$, and then doubling.
Similarly one defines two more unit vectors $\mathbf{e}_2$, $\mathbf{e}_3$, along the coordinate lines of $u_2$ and $u_3$, which have associated with them arc-length parameters $h_2$ and $h_3$. For example, in cylindrical polar coordinates, $h_\rho = 1$ and $h_z = 1$, but a change $d\phi$ corresponds to moving a distance $\rho\,d\phi$, so $h_\phi = \rho$.
In spherical polar coordinates, $h_r = 1$ again, and $h_\theta = r$. A change $d\phi$ in $\phi$ corresponds to moving a distance $r\sin\theta\,d\phi$ ($r\sin\theta$ being the radius of the particular latitudinal circle), so $h_\phi = r\sin\theta$.
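These scale factors can be recovered numerically from the definition $h_i = |\partial\mathbf{r}/\partial u_i|$ by differencing the Cartesian position. This is my own illustrative check (function names `r_xyz` and `h` are assumptions, not from the notes):

```python
import math

def r_xyz(r, th, ph):
    """Cartesian position for spherical polar coordinates (r, theta, phi)."""
    return (r * math.sin(th) * math.cos(ph),
            r * math.sin(th) * math.sin(ph),
            r * math.cos(th))

def h(i, point, eps=1e-6):
    """Arc-length parameter h_i = |d r / d u_i|, by a central difference."""
    u = list(point)
    u[i] += eps
    p_plus = r_xyz(*u)
    u[i] -= 2 * eps
    p_minus = r_xyz(*u)
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(p_plus, p_minus)))
    return dist / (2 * eps)

pt = (2.0, 0.7, 1.1)  # (r, theta, phi)
print(h(0, pt), h(1, pt), h(2, pt))  # ≈ 1, r, r sin(theta)
```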
In orthogonal coordinates $\mathbf{e}_1$, $\mathbf{e}_2$ and $\mathbf{e}_3$ are mutually orthogonal everywhere. Coordinates $(u_1, u_2, u_3)$ will be orthogonal if and only if
$$\frac{\partial\mathbf{r}}{\partial u_1}.\frac{\partial\mathbf{r}}{\partial u_2} = \frac{\partial\mathbf{r}}{\partial u_2}.\frac{\partial\mathbf{r}}{\partial u_3} = \frac{\partial\mathbf{r}}{\partial u_3}.\frac{\partial\mathbf{r}}{\partial u_1} = 0.$$
For orthogonal coordinates, a general change $(du_1, du_2, du_3)$ in the coordinates means a displacement
$$h_1\,du_1\,\mathbf{e}_1 + h_2\,du_2\,\mathbf{e}_2 + h_3\,du_3\,\mathbf{e}_3,$$
which corresponds to a distance
$$\left(h_1^2\,du_1^2 + h_2^2\,du_2^2 + h_3^2\,du_3^2\right)^{1/2}.$$
Also, the matrix $R$ relating $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$ to $(\mathbf{i}, \mathbf{j}, \mathbf{k})$ will necessarily be orthogonal and so have the property that $R^T = R^{-1}$.
Cartesian coordinates are of course a special simple case of orthogonal curvilinear coordinates, in which all the coordinate lines are straight lines.
One reason that orthogonal coordinates are so useful is that in any orthogonal coordinate system $(u_1, u_2, u_3)$, small displacements along $u_1$ and $u_2$ define small rectangles, while small displacements along $u_1$, $u_2$, $u_3$ define small rectangular boxes. In other words, $h_1 h_2\,du_1\,du_2$ is an area element normal to $\mathbf{e}_3$ and $h_1 h_2 h_3\,du_1\,du_2\,du_3$ is a volume element.
5.6 Vector fields and vector algebra in curvilinear coordinates
Scalar fields can of course be expressed in (orthogonal) curvilinear coordinates: they are simply written as functions $f(u_1, u_2, u_3)$ or for brevity $f(u_i)$.
As you will know from Linear Algebra, vectors can be expressed using any basis of the vector space concerned. The same is true, at each point, of vector fields. However, when using curvilinear coordinates we will always use the orthogonal unit vectors along the coordinate lines, and write
$$\mathbf{F} = F_1\mathbf{e}_1 + F_2\mathbf{e}_2 + F_3\mathbf{e}_3.$$
We may also use the full coordinate names as subscripts for the components. Thus we may write
$$\mathbf{F} = F_x\mathbf{i} + F_y\mathbf{j} + F_z\mathbf{k}
= F_\rho\mathbf{e}_\rho + F_\phi\mathbf{e}_\phi + F_z\mathbf{e}_z
= F_r\mathbf{e}_r + F_\theta\mathbf{e}_\theta + F_\phi\mathbf{e}_\phi$$
to express the same vector in Cartesian, cylindrical polar and spherical polar coordinates.
In any orthogonal coordinate system, the scalar (dot) and vector (cross) products work just as in Cartesian coordinates:
$$\mathbf{w}.\mathbf{v} = w_1 v_1 + w_2 v_2 + w_3 v_3 \tag{5.4}$$
and
$$\mathbf{w}\times\mathbf{v} = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ w_1 & w_2 & w_3 \\ v_1 & v_2 & v_3 \end{vmatrix}. \tag{5.5}$$
Differentiation of these vectors with respect to a variable other than position (like the derivatives in Section 2.1) is straightforward. For example if position $\mathbf{r}$ depends on time, in cylindrical polars, where $\mathbf{r} = \rho\mathbf{e}_\rho + z\mathbf{e}_z$,
$$\dot{\mathbf{r}} = \dot{\rho}\,\mathbf{e}_\rho + \rho\,\dot{\mathbf{e}}_\rho + \dot{z}\,\mathbf{e}_z + z\,\dot{\mathbf{e}}_z.$$
Since $\mathbf{e}_\rho = \cos\phi\,\mathbf{i} + \sin\phi\,\mathbf{j}$ from (5.2),
$$\dot{\mathbf{e}}_\rho = \dot{\phi}\,(-\sin\phi\,\mathbf{i} + \cos\phi\,\mathbf{j}) = \dot{\phi}\,\mathbf{e}_\phi.$$
Similarly $\dot{\mathbf{e}}_z = 0$. Substituting into the first result we get
$$\dot{\mathbf{r}} = \dot{\rho}\,\mathbf{e}_\rho + \rho\dot{\phi}\,\mathbf{e}_\phi + \dot{z}\,\mathbf{e}_z.$$
Vector differentiation is more complicated, because the unit vectors are no longer constant. The simplest
way to work out the forms taken by gradient, divergence and curl is to use expressions for them which we
can apply in all coordinate systems. So far we only have this for gradient.
5.7 The Gradient Operator
To calculate the gradient of $V(u_1, u_2, u_3)$ in orthogonal curvilinear coordinates $(u_1, u_2, u_3)$, we go back to the definition
$$dV = \nabla V . d\mathbf{r}. \tag{$*$}$$
Substituting $d\mathbf{r} = \mathbf{e}_1 h_1\,du_1 + \mathbf{e}_2 h_2\,du_2 + \mathbf{e}_3 h_3\,du_3$ and writing $\nabla V = (\nabla V)_1\mathbf{e}_1 + (\nabla V)_2\mathbf{e}_2 + (\nabla V)_3\mathbf{e}_3$ gives for the right-hand side of $(*)$
$$\left((\nabla V)_1\mathbf{e}_1 + (\nabla V)_2\mathbf{e}_2 + (\nabla V)_3\mathbf{e}_3\right) . \left(\mathbf{e}_1 h_1\,du_1 + \mathbf{e}_2 h_2\,du_2 + \mathbf{e}_3 h_3\,du_3\right) = (\nabla V)_1 h_1\,du_1 + (\nabla V)_2 h_2\,du_2 + (\nabla V)_3 h_3\,du_3.$$
By Taylor's theorem, the first-order expansion of the left-hand side of $(*)$ gives
$$\frac{\partial V}{\partial u_1}\,du_1 + \frac{\partial V}{\partial u_2}\,du_2 + \frac{\partial V}{\partial u_3}\,du_3.$$
These two expressions must be equal for arbitrary changes $du_1$, $du_2$ and $du_3$. Hence we must have
$$\frac{\partial V}{\partial u_1} = (\nabla V)_1 h_1; \qquad \frac{\partial V}{\partial u_2} = (\nabla V)_2 h_2; \qquad \frac{\partial V}{\partial u_3} = (\nabla V)_3 h_3.$$
In other words, in orthogonal curvilinear coordinates,
$$\nabla V = \frac{1}{h_1}\frac{\partial V}{\partial u_1}\mathbf{e}_1 + \frac{1}{h_2}\frac{\partial V}{\partial u_2}\mathbf{e}_2 + \frac{1}{h_3}\frac{\partial V}{\partial u_3}\mathbf{e}_3.$$
Example 5.6. What is $\nabla V$ in spherical polar coordinates? Evaluate $\nabla V$ where $V = r\sin\theta\cos\phi$.
In spherical polars, $(u_1, u_2, u_3) = (r, \theta, \phi)$ and $h_1 = 1$, $h_2 = r$, $h_3 = r\sin\theta$. Hence
$$\nabla V = \frac{\partial V}{\partial r}\mathbf{e}_r + \frac{1}{r}\frac{\partial V}{\partial\theta}\mathbf{e}_\theta + \frac{1}{r\sin\theta}\frac{\partial V}{\partial\phi}\mathbf{e}_\phi.$$
For the given $V$, $\partial V/\partial r = \sin\theta\cos\phi$, $\partial V/\partial\theta = r\cos\theta\cos\phi$ and $\partial V/\partial\phi = -r\sin\theta\sin\phi$. Hence, using the result above,
$$\nabla V = \sin\theta\cos\phi\,\mathbf{e}_r + \cos\theta\cos\phi\,\mathbf{e}_\theta - \sin\phi\,\mathbf{e}_\phi.$$
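A useful consistency check on Example 5.6 (my own, not in the notes): $V = r\sin\theta\cos\phi$ is just the Cartesian coordinate $x$, so converting the spherical components of $\nabla V$ back to Cartesian should give exactly $(1, 0, 0)$.

```python
import math

def grad_spherical(r, th, ph):
    """Components of grad V for V = r sin(theta) cos(phi), from Example 5.6."""
    return (math.sin(th) * math.cos(ph),   # (grad V)_r
            math.cos(th) * math.cos(ph),   # (grad V)_theta
            -math.sin(ph))                 # (grad V)_phi

def to_cartesian(comp, th, ph):
    """Convert (F_r, F_theta, F_phi) to (F_x, F_y, F_z) via the unit-vector relations
    e_r = (st cp, st sp, ct), e_theta = (ct cp, ct sp, -st), e_phi = (-sp, cp, 0)."""
    fr, ft, fp = comp
    st, ct, sp, cp = math.sin(th), math.cos(th), math.sin(ph), math.cos(ph)
    return (fr * st * cp + ft * ct * cp - fp * sp,
            fr * st * sp + ft * ct * sp + fp * cp,
            fr * ct - ft * st)

# V is just x, so grad V should be (1, 0, 0) at any point
print(to_cartesian(grad_spherical(1.7, 0.8, 2.3), 0.8, 2.3))
```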
Exercise 5.1. What is $\nabla V$ in cylindrical polar coordinates $(\rho, \phi, z)$?
Exercise 5.2. Let $(r, \theta, \phi)$ be spherical polar coordinates. Evaluate $\nabla f$ where
(a) $f = \theta$; (b) $f = \phi$; (c) $f = r^n \sin m\phi$.
5.8 The Divergence Operator in curvilinear coordinates
We aim to compute $\nabla . \mathbf{F}$ in orthogonal curvilinear coordinates. Although we could directly calculate the divergence in any coordinates, using the Cartesian definition, the relations of basis unit vectors, and the chain rule, the results can be found with much less effort from the coordinate-independent definition of the divergence provided by the Divergence Theorem. The Divergence Theorem is true in all coordinates (since it equates scalars, whose value must be independent of the coordinates). Thus
$$\int_V \nabla . \mathbf{F}\,dV = \int_{\partial V} \mathbf{F}.d\mathbf{S},$$
where $\partial V$ is the surface of volume $V$. In particular, consider a small cuboid with edges corresponding to coordinate separations $\delta u_1$, $\delta u_2$, $\delta u_3$ along the coordinate lines. The volume of the cuboid is $\delta V = (h_1\delta u_1)(h_2\delta u_2)(h_3\delta u_3)$. For a sufficiently small volume, we can write the left-hand side as $(\nabla . \mathbf{F})\,\delta V$, i.e.
$$(\nabla . \mathbf{F})\,(h_1 h_2 h_3\,\delta u_1\,\delta u_2\,\delta u_3).$$
The integral of $\mathbf{F}.\mathbf{n}$ over the side of the cuboid where the first coordinate has value $u_1 + \delta u_1$, for example, is
$$(h_2\delta u_2\; h_3\delta u_3\; F_1) \quad\text{evaluated at } u_1 + \delta u_1.$$
In the limit as $\delta V \to 0$ we have
$$\nabla . \mathbf{F} = \lim_{\delta u_1,\,\delta u_2,\,\delta u_3 \to 0} \frac{1}{\delta V}\Big[
(h_2\delta u_2\, h_3\delta u_3\, F_1)\big|_{u_1+\delta u_1} - (h_2\delta u_2\, h_3\delta u_3\, F_1)\big|_{u_1}
+ (h_3\delta u_3\, h_1\delta u_1\, F_2)\big|_{u_2+\delta u_2} - (h_3\delta u_3\, h_1\delta u_1\, F_2)\big|_{u_2}
+ (h_1\delta u_1\, h_2\delta u_2\, F_3)\big|_{u_3+\delta u_3} - (h_1\delta u_1\, h_2\delta u_2\, F_3)\big|_{u_3}\Big].$$
Thus
$$\nabla . \mathbf{F} = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial(h_2 h_3 F_1)}{\partial u_1} + \frac{\partial(h_3 h_1 F_2)}{\partial u_2} + \frac{\partial(h_1 h_2 F_3)}{\partial u_3}\right].$$
Example 5.7. What is $\nabla . \mathbf{F}$ in cylindrical polar coordinates, where $\mathbf{F} = F_\rho\mathbf{e}_\rho + F_\phi\mathbf{e}_\phi + F_z\mathbf{e}_z$?
In cylindrical polars, $(u_1, u_2, u_3) = (\rho, \phi, z)$ and $h_1 = 1$, $h_2 = \rho$, $h_3 = 1$. Hence
$$\nabla . \mathbf{F} = \frac{1}{\rho}\left[\frac{\partial(\rho F_\rho)}{\partial\rho} + \frac{\partial F_\phi}{\partial\phi} + \frac{\partial(\rho F_z)}{\partial z}\right].$$
Note that since $\partial\rho/\partial z = 0$ this could also be written
$$\nabla . \mathbf{F} = \frac{1}{\rho}\frac{\partial(\rho F_\rho)}{\partial\rho} + \frac{1}{\rho}\frac{\partial F_\phi}{\partial\phi} + \frac{\partial F_z}{\partial z}.$$
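The cylindrical-polar divergence formula can be sanity-checked numerically (my own sketch; the function names are assumptions). The field $\mathbf{F} = \rho\,\mathbf{e}_\rho$ is the Cartesian field $(x, y, 0)$, whose divergence is $2$ everywhere.

```python
def div_cyl(F_rho, F_phi, F_z, rho, phi, z, eps=1e-5):
    """Divergence in cylindrical polars:
    (1/rho) d(rho F_rho)/d rho + (1/rho) d F_phi/d phi + d F_z/d z,
    with the derivatives taken by central differences."""
    d_rho = ((rho + eps) * F_rho(rho + eps, phi, z)
             - (rho - eps) * F_rho(rho - eps, phi, z)) / (2 * eps)
    d_phi = (F_phi(rho, phi + eps, z) - F_phi(rho, phi - eps, z)) / (2 * eps)
    d_z = (F_z(rho, phi, z + eps) - F_z(rho, phi, z - eps)) / (2 * eps)
    return (d_rho + d_phi) / rho + d_z

# F = rho e_rho, i.e. the Cartesian field (x, y, 0), whose divergence is 2
F_rho = lambda rho, phi, z: rho
zero = lambda rho, phi, z: 0.0
print(div_cyl(F_rho, zero, zero, 1.3, 0.4, 0.2))  # ≈ 2
```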
Exercise 5.3. What is $\nabla . \mathbf{F}$ in spherical polar coordinates, where $\mathbf{F} = F_r\mathbf{e}_r + F_\theta\mathbf{e}_\theta + F_\phi\mathbf{e}_\phi$?
5.9 The Curl Operator in curvilinear coordinates
In analogy with the previous section, we use Stokes's theorem to provide a coordinate-independent definition of $\nabla\times\mathbf{F}$:
$$\int_S \nabla\times\mathbf{F}.d\mathbf{S} = \oint_C \mathbf{F}.d\mathbf{r},$$
where $S$ is a surface spanning the closed curve $C$.
To calculate the first component, say, of $\nabla\times\mathbf{F}$ with respect to orthogonal curvilinear coordinates $(u_1, u_2, u_3)$, consider a planar curve (a "rectangle") whose normal is in the $\mathbf{e}_1$ direction, with sides of length $\delta u_2$ and $\delta u_3$ (parallel to the $\mathbf{e}_2$ and $\mathbf{e}_3$ directions respectively). The integral over the surface is approximately
$$(\nabla\times\mathbf{F})_1\; h_2\delta u_2\; h_3\delta u_3$$
(h
2
u
2
F
2
)
u
3
+(h
3
u
3
F
3
)
u
2
+u
2
(h
2
u
2
F
2
)
u
3
+u
3
(h
3
u
3
F
3
)
u
2
.
Taking the limit as $\delta u_2, \delta u_3 \to 0$,
$$(\nabla\times\mathbf{F})_1 = \frac{1}{h_2 h_3}\lim_{\delta u_2,\,\delta u_3\to 0}\left[
\frac{(h_3 F_3)\big|_{u_2+\delta u_2} - (h_3 F_3)\big|_{u_2}}{\delta u_2}
- \frac{(h_2 F_2)\big|_{u_3+\delta u_3} - (h_2 F_2)\big|_{u_3}}{\delta u_3}\right]
= \frac{1}{h_2 h_3}\left[\frac{\partial(h_3 F_3)}{\partial u_2} - \frac{\partial(h_2 F_2)}{\partial u_3}\right].$$
Similarly (just cycling through the indices) one can show that
$$(\nabla\times\mathbf{F})_2 = \frac{1}{h_3 h_1}\left[\frac{\partial(h_1 F_1)}{\partial u_3} - \frac{\partial(h_3 F_3)}{\partial u_1}\right],
\qquad
(\nabla\times\mathbf{F})_3 = \frac{1}{h_1 h_2}\left[\frac{\partial(h_2 F_2)}{\partial u_1} - \frac{\partial(h_1 F_1)}{\partial u_2}\right].$$
This result can be written in a compact (and more memorable) form as a determinant:
$$\nabla\times\mathbf{F} = \frac{1}{h_1 h_2 h_3}\begin{vmatrix} h_1\mathbf{e}_1 & h_2\mathbf{e}_2 & h_3\mathbf{e}_3 \\ \partial/\partial u_1 & \partial/\partial u_2 & \partial/\partial u_3 \\ h_1 F_1 & h_2 F_2 & h_3 F_3 \end{vmatrix}.$$
Example 5.8. What is $\nabla\times\mathbf{F}$ in spherical polar coordinates?
In spherical polar coordinates $(r, \theta, \phi)$ we have $h_1 = 1$, $h_2 = r$, $h_3 = r\sin\theta$. Hence, using the determinant form:
$$\nabla\times\mathbf{F} = \frac{1}{r^2\sin\theta}\begin{vmatrix} \mathbf{e}_r & r\mathbf{e}_\theta & r\sin\theta\,\mathbf{e}_\phi \\ \partial/\partial r & \partial/\partial\theta & \partial/\partial\phi \\ F_r & rF_\theta & r\sin\theta\,F_\phi \end{vmatrix},$$
or in expanded form
$$\nabla\times\mathbf{F} = \frac{1}{r^2\sin\theta}\left[\frac{\partial(r\sin\theta\,F_\phi)}{\partial\theta} - \frac{\partial(rF_\theta)}{\partial\phi}\right]\mathbf{e}_r
+ \frac{1}{r\sin\theta}\left[\frac{\partial F_r}{\partial\phi} - \frac{\partial(r\sin\theta\,F_\phi)}{\partial r}\right]\mathbf{e}_\theta
+ \frac{1}{r}\left[\frac{\partial(rF_\theta)}{\partial r} - \frac{\partial F_r}{\partial\theta}\right]\mathbf{e}_\phi.$$
Note that since $r$ is independent of $\theta$ and $\phi$, etc., we can for instance take the $r$ outside the differentiations in the $\mathbf{e}_r$ component and cancel it with an $r$ in the denominator. Be aware that the answer is a vector. Do not add the components together, forgetting the vectors $\mathbf{e}_r$ etc. (this is a common error).
Exercise 5.4. Show by expanding it that the determinant definition is equivalent to the full expressions for the individual components given above.
Exercise 5.5. What is $\nabla\times\mathbf{F}$ in cylindrical polar coordinates?
Note that if $\rho$ and $z$ have dimensions of length and $\phi$ is dimensionless, all the terms in the expression for $\nabla\times\mathbf{F}$ should have the same dimensions, namely the dimensions of $\mathbf{F}$ divided by length. This is a simple check that you should make.
Exercise 5.6. Use spherical polar coordinates to evaluate the divergence and curl of $\mathbf{r}/r^3$. [Hint: don't forget that in spherical polar coordinates, the position vector $\mathbf{r}$ is equal to $r\mathbf{e}_r$.]
Exercise 5.7. State Stokes's theorem, and verify it for the hemispherical surface $r = 1$, $z \geq 0$, with the vector field $\mathbf{A}(\mathbf{r}) = (-y, x, z)$.
Exercise 5.8. The vector field $\mathbf{B} = (0, \rho^{-1}, 0)$ in cylindrical polar coordinates $(\rho, \phi, z)$. Evaluate $\nabla\times\mathbf{B}$. Evaluate $\oint_C \mathbf{B}.d\mathbf{r}$, where $C$ is the unit circle $z = 0$, $\rho = 1$, $0 \leq \phi \leq 2\pi$. Does Stokes's theorem apply?
Appendix
Other ways of doing Example 5.5 are as follows.
The second method is to divide the volume removed into two parts: (i) a cylinder with radius $\sin\theta_1$ and height $\cos\theta_1$, and (ii) a top-slice. Volume (i), the cylinder, is easy: $2\pi\sin^2\theta_1\cos\theta_1$. To get volume (ii) we integrate over $\rho$ and then $z$:
$$4\pi\int_{\cos\theta_1}^1 dz \int_0^{\sqrt{1-z^2}} \rho\,d\rho = 2\pi\int_{\cos\theta_1}^1 (1 - z^2)\,dz = \frac{2\pi}{3}\left(2 + \cos^3\theta_1 - 3\cos\theta_1\right).$$
The sum of volumes (i) and (ii) is $\frac{4\pi}{3}(1 - \cos^3\theta_1)$ as expected.
A third way also divides the volume removed into two parts: (i) an "ice-cream cone", or cone with a spherical top, and (ii) a cylinder minus a cone. The volume of (i) is
$$4\pi\int_0^{\theta_1}\sin\theta\,d\theta\int_0^1 r^2\,dr = \frac{4\pi}{3}\left(1 - \cos\theta_1\right).$$
Volume (ii), a cylinder with a cone removed, is a bit harder:
$$4\pi\int_0^{\cos\theta_1} dz\int_{z\tan\theta_1}^{\sin\theta_1}\rho\,d\rho = 2\pi\int_0^{\cos\theta_1}\left(\sin^2\theta_1 - z^2\tan^2\theta_1\right)dz = \frac{4\pi}{3}\sin^2\theta_1\cos\theta_1$$
(which notice is $\frac{2}{3}$ of the volume of the cylinder). Again the sum of the volumes integrated is $\frac{4\pi}{3}(1 - \cos^3\theta_1)$.
Finally, a fourth possibility is to integrate for the volume remaining after coring, which is
$$4\pi\int_0^{\cos\theta_1} dz\int_{\sin\theta_1}^{\sqrt{1-z^2}}\rho\,d\rho = 2\pi\int_0^{\cos\theta_1}\left(1 - z^2 - \sin^2\theta_1\right)dz = \frac{4\pi}{3}\cos^3\theta_1.$$
SUMMARY OF ORTHOGONAL CURVILINEAR COORDINATES
In orthogonal curvilinear coordinates $(u_1, u_2, u_3)$, with corresponding unit vectors $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ and arc-length parameters $h_1, h_2, h_3$, the gradient of a scalar field $V$ is given by
$$\nabla V = \frac{1}{h_1}\frac{\partial V}{\partial u_1}\mathbf{e}_1 + \frac{1}{h_2}\frac{\partial V}{\partial u_2}\mathbf{e}_2 + \frac{1}{h_3}\frac{\partial V}{\partial u_3}\mathbf{e}_3;$$
the divergence of a vector field $\mathbf{F} = F_1\mathbf{e}_1 + F_2\mathbf{e}_2 + F_3\mathbf{e}_3$ is given by
$$\nabla . \mathbf{F} = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial u_1}(h_2 h_3 F_1) + \frac{\partial}{\partial u_2}(h_3 h_1 F_2) + \frac{\partial}{\partial u_3}(h_1 h_2 F_3)\right];$$
and the curl of the same vector field is given by
$$\nabla\times\mathbf{F} = \frac{1}{h_1 h_2 h_3}\begin{vmatrix} h_1\mathbf{e}_1 & h_2\mathbf{e}_2 & h_3\mathbf{e}_3 \\ \partial/\partial u_1 & \partial/\partial u_2 & \partial/\partial u_3 \\ h_1 F_1 & h_2 F_2 & h_3 F_3 \end{vmatrix}.$$
Cartesian coordinates
$(u_1, u_2, u_3) \equiv (x, y, z)$; arc-length parameters $h_1 = 1$, $h_2 = 1$, $h_3 = 1$.
Spherical polar coordinates
$(u_1, u_2, u_3) \equiv (r, \theta, \phi)$; arc-length parameters $h_1 = 1$, $h_2 = r$, $h_3 = r\sin\theta$.
Cylindrical polar coordinates
$(u_1, u_2, u_3) \equiv (\rho, \phi, z)$; arc-length parameters $h_1 = 1$, $h_2 = \rho$, $h_3 = 1$.
Chapter 6
Series solutions of ODEs and special
functions
Syllabus section:
5. Series solution of ODEs. Introduction to special functions, e.g., Legendre, Bessel, and Hermite functions;
orthogonality of special functions.
6.1 Context
[This section is not in itself examinable, although parts are necessary, for those who have not done Differential
Equations, to understand what follows.]
This chapter is about some methods for obtaining series solutions of ordinary differential equations, a particular application of which will be in solving Laplace's equation later in the course. The methods also relate to so-called special functions, which are discussed very fully in major texts¹ and which, in the days before computers, were even more important than they are now, because they provided a way to solve problems we could now tackle using numerical computer programs.
To understand where this fits in, we need to know some basic things about differential equations (not all of which are discussed in standard texts). That is what this section covers.
A (scalar) differential equation (DE) is some equation in a function, $u$ say, and its derivatives, which we want to solve for $u$.
This is a scalar equation because $u$ is a scalar function. One can also have DEs for vectors or matrices, which are equivalent to systems of DEs for the components, but we will not discuss these here.
It is an ordinary differential equation (ODE) if $u$ depends only on a single variable, say $x$. In an ODE we often write $u'$ for $du/dx$, and $u^{(k)} \equiv d^k u/dx^k$. If $u$ depends on more variables, the derivatives are partial derivatives and the equation is a partial differential equation (PDE).
The order of a differential equation is the order of the highest derivative that it contains, and the degree is the power to which that derivative appears when fractional powers have been eliminated. An equation is called linear if any term in it has at most one of $u, u', \ldots, u^{(n)}$, to degree one at most.
¹E.g. Handbook of Mathematical Functions by M. Abramowitz and I. Stegun, which has over 1000 pages, including extensive tables of numerical values.
Equation | Type | Order | Degree | Linear?
$\dfrac{du}{dx} + p(x)u + q(x)u^2 = r(x)$ | ODE | 1 | 1 | No
($p(x)$, $q(x)$ and $r(x)$ known functions. Riccati equation.)
$x^2\dfrac{d^2u}{dx^2} + x\dfrac{du}{dx} + (x^2 - n^2)u = 0$ | ODE | 2 | 1 | Yes
($n$ a constant. Bessel's equation.)
$\dfrac{d^2u}{dx^2}\left[1 + \left(\dfrac{du}{dx}\right)^2\right]^{-3/2} = g(x)$ | ODE | 2 | 2 | No
($g(x)$ some known function. Equation for the curvature of a curve.)
$\kappa\dfrac{\partial^2 u}{\partial x^2} = \dfrac{\partial u}{\partial t}$ | PDE | 2 | 1 | Yes
($\kappa$ constant, $u = u(x, t)$. Heat equation or diffusion equation.)
Table 6.1: Some examples of differential equations
We shall assume for simplicity that we always can write our ODEs in the form
$$\frac{d^n u}{dx^n} = w(x, u, u', \ldots, u^{(n-1)}) \tag{6.1}$$
where $w$ is some known function of its arguments, and that we always do this.
An ordinary differential equation of order $n$ has $n$ independent solutions. That this should be so is easy to see by considering giving $u, u', \ldots$ up to $u^{(n-1)}$ at an initial point $x_0$. Then the equation gives the $n$-th derivative and we can find the values at $x_0 + \delta x$, say. These ideas can be made more formal, and under appropriate conditions on the functions in the equations it can be shown that the solution for given initial values is unique and depends on the initial values in a continuous or even differentiable way. The basic method of proof is due to Picard and related to the series solution method described below.
Conversely if we have a function $u(x, c_1, \ldots, c_n)$ depending on $n$ arbitrary constants, we can eliminate the constants by taking the derivatives up to the $n$-th and forming an $n$-th order ODE.
Example 6.1. For constants $a$ and $b$,
$$u = ax + be^x \;\Longrightarrow\; u' = a + be^x, \quad u'' = be^x \;\Longrightarrow\; u - xu' = b(1 - x)e^x = (1 - x)u''$$
so $(1 - x)u'' + xu' - u = 0$. Conversely, given this differential equation the general solution is $u = ax + be^x$ with two arbitrary constants.
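Since Example 6.1 is a purely algebraic identity, it is easy to check numerically that $u = ax + be^x$ satisfies $(1-x)u'' + xu' - u = 0$ for any $a$, $b$ (my own check; the function name is an assumption):

```python
import math

def residual(a, b, x):
    """Evaluate (1 - x) u'' + x u' - u for u = a x + b e^x; should vanish."""
    u = a * x + b * math.exp(x)
    up = a + b * math.exp(x)       # u'
    upp = b * math.exp(x)          # u''
    return (1 - x) * upp + x * up - u

for a, b, x in [(1.0, 2.0, 0.3), (-0.5, 4.0, 1.7), (3.0, -1.0, -2.0)]:
    print(residual(a, b, x))  # each ≈ 0 up to rounding
```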
The situation with PDEs is more complicated. An equation where u depends on k variables typically
needs conditions on surfaces of dimension (k 1), but the exact nature of these initial or boundary conditions
depends on the equation. We will only deal in this course with the one example of Laplaces equation.
The simplest form of ordinary differential equation is
$$\frac{du}{dx} = f(x) \tag{6.2}$$
where $f(x)$ is some known function. We know the answer must be $u = \int f(x)\,dx$. This is only really an answer if we can in fact do the integration. An example for which we cannot is $f = \exp(-x^2)$. What do we mean by "do the integration", even?
Suppose we start with constants and a variable $x$. Then we can construct polynomials. From these we can build rational functions (ratios of two polynomials). If $f(x)$ is a rational function there is a complete algorithm for obtaining the integral, but the answer is not always rational. For example if $f(x) = 1/x$ we need $\ln x$. Liouville proved that integrals of rational functions give just rational functions and their logarithms. Methods for actually calculating the result in practice are due to Hermite (1872), Horowitz (1969, 1971), Rothstein (1976) and Trager (1976) and are used in systems like Maple.
Having logarithms, we will also need the inverse, $\exp x$, and from that we can get $e^x$, $e^{-x}$, $e^{ix}$ and $e^{-ix}$, which means we have all the trigonometric and hyperbolic functions. To these we also add algebraic functions, i.e. functions which obey a polynomial equation whose coefficients are polynomials in $x$: for example $g = \sqrt{1 - x^2}$ obeys $g^2 = 1 - x^2$. Functions which are not algebraic are called transcendental. Taking polynomials, rational functions, algebraics, $\ln$ and $\exp$, and allowing combinations such as the log of an algebraic function or a polynomial in $\ln$, defines the set of elementary functions. When we say we can "do" an integral we usually mean that the answer is an elementary function.
If the solution $u$ of (6.2) is not elementary, but $f$ is, it defines a Liouvillian function, for example the error function
$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt.$$
The special functions described in books include some Liouvillian ones like that. More are transcendental functions which are not Liouvillian but obey certain particular differential equations, and it is these we will discuss, but some important cases are just particular elementary functions or even polynomials. Most are defined as solutions of second-order linear differential equations (one exception is given by the elliptic integrals, defined by first order equations of the form
$$\left(\frac{du}{dx}\right)^2 = g(x, z)$$
where $g$ is a rational function of $x$ and $z$, and $z$ is a cubic or quartic polynomial in $x$).
If we can write down an exact answer for $u$, e.g. as an elementary function, we say we have an exact, or closed form, solution. In MAS118 Differential Equations, methods for finding closed form solutions of a number of special types of ODE appear, including first order separable, first order linear (homogeneous and inhomogeneous), exact, homogeneous, and linear second order with constant coefficients². We will avoid repeating almost all of this.
Instead we will discuss getting solutions in the form of series
$$u = \sum_{n=0}^{\infty} a_n f_n(x).$$
These are useful providing we can show that the series converges (preferably, converges uniformly) to $u$ for some required range of $x$. In this course we will quote one or two results on convergence but not prove them (you will know some of the methods for testing convergence from Calculus II). In particular a convergent series allows us to get good numerical values by taking some finite number of terms.
²Aside: there is a lot of interesting and recent theory on this which is not in standard texts or courses. For example, Risch in 1969 showed how to decide whether exponential or logarithmic functions have an elementary integral, Davenport (1984) and Bronstein (1990 and later) showed how to deal with algebraic and transcendental functions, Prelle and Singer (1983) gave a method which covers many of the usual first order ODEs (Man Y.-K. and I gave a small improvement of this in 1997), and Kovacic (1986) gave a method to decide if linear ODEs of order 2 or more have Liouvillian solutions. Kovacic's method has been improved several times, by Singer, Ulmer, Weil and others, and is still the subject of research.
Very often, we will take $f_n(x) = x^n$, giving power series, or $f_n(x) = x^{n+c}$ for some constant $c$, which in general gives Frobenius series, but we may also take $f_n$ to be $\sin nx$ or $\cos nx$, which gives Fourier series (see the next chapter), or some set of special functions (we discuss one example in the chapter on Laplace's equation).
If we know a series for the integrand, any Liouvillian function can of course be given as a series, by integration term by term.
6.2 Taylor and Picard series
The first series solution method is simply Taylor series. If we know a formula for the $n$-th derivative, as in (6.1), and we have initial values for $u$ and its derivatives at $x_0$, say $u_0, u'_0, \ldots, u^{(n-1)}_0$, we can use the ODE to find $u^{(n)}_0$ and differentiate it to find higher derivatives, and then put the values into
$$u(x) = \sum_{n=0}^{\infty} u^{(n)}_0 \frac{(x - x_0)^n}{n!}.$$
(We could only find an infinite number of terms if we have some formula for general $n$, of course, but we could always in principle calculate any finite number.)
Whereas Taylor series work by repeated differentiation, Picards method works by repeated integration.
For our purposes we stick to a single rst order equation
u

= f (x, u) .
with initial values given. (Picards method can be used for vector and matrix equations, and hence extends to
an n-th order equation written as a series of equations p = u

, q = p

(= u

) etc.)
The idea is that we must have
u = u
0
+
_
x
x
0
f (x, u)dx ,
where u = u
0
at x = x
0
is a given initial value. This is actually what is called an integral equation for u. It
writes u as an integral but the integral involves u itself.
What we do with the integral equation is to get successively better answers by taking
u
n
(x) = u
0
+
_
x
x
0
f (x, u
n1
)dx . (6.3)
Using the constant u
0
as the rst guess, we get u
1
, then u
2
and so on.
Example 6.2. Using Picard's method, solve $u' = 1 + u^2$ with $u = 0$ at $x = 0$.
Here
$$u_n = 0 + \int_0^x (1 + u_{n-1}^2)\,dx$$
so
$$u_1 = \int_0^x 1\,dx = x,$$
$$u_2 = \int_0^x (1 + x^2)\,dx = x + \frac{1}{3}x^3,$$
$$u_3 = \int_0^x \left(1 + \left(x + \tfrac{1}{3}x^3\right)^2\right)dx = \int_0^x \left(1 + x^2 + \tfrac{2}{3}x^4 + \tfrac{1}{9}x^6\right)dx = x + \frac{1}{3}x^3 + \frac{2}{15}x^5 + \frac{1}{63}x^7.$$
We have now gone far enough to see a pitfall, because in this case we know the answer actually must be $u = \tan x = x + \frac{1}{3}x^3 + \frac{2}{15}x^5 + \frac{17}{315}x^7 + \ldots$. The point is that at each step we should only add one new term to our series. At $u_3$ we tried to add two terms at once but the second one is not correct.
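The iteration (6.3) can be carried out exactly on polynomial coefficient lists using rational arithmetic, reproducing $u_1$, $u_2$, $u_3$ above (a sketch of my own; the helper name `picard_step` is an assumption):

```python
from fractions import Fraction

def picard_step(u):
    """One Picard step for u' = 1 + u^2, u(0) = 0, on a coefficient list
    (u[k] is the coefficient of x^k): square u, add 1, integrate term by term."""
    sq = [Fraction(0)] * (2 * len(u) - 1 if u else 1)
    for i, a in enumerate(u):
        for j, b in enumerate(u):
            sq[i + j] += a * b
    sq[0] += 1  # integrand is 1 + u^2
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(sq)]

u = []  # u_0 = 0
for _ in range(3):
    u = picard_step(u)
print(u)  # x + x^3/3 + 2 x^5/15 + x^7/63: the x^7 coefficient is not yet tan's 17/315
```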
6.3 Frobenius's method
This technique is also called the method of undetermined coefficients: Frobenius contributed an important part but not all of the method. It and its generalizations are a more important way to get series solutions than those above. It works for linear equations of any order, provided the coefficients in the equation³ can be written as convergent Taylor series at a general point (such functions are called analytic), and we assume this throughout. (Exact solutions for linear equations with constant coefficients can be given, so we do not use Frobenius's method for those.) For simplicity we stick, in the examples, to rational function coefficients and second order homogeneous ODEs only (extending the results to higher order is easy).
Thus we are going to deal with equations
$$u'' + f(x)u' + g(x)u = 0, \tag{6.4}$$
where $f$ and $g$ (or suitable multiples of them — see below) are analytic functions of $x$. The idea is to expand the solution in powers of $(x - x_0)$ where $x_0$ is some initial point. For simplicity in examples we will take $x_0 = 0$ (you can easily change the origin of the $x$ coordinate if necessary).
We first consider how the equation behaves at $x = x_0$. In particular, are $f$ and $g$ well-defined there?
Case 1. If $f$ and $g$ have finite limits as $x \to x_0$, $x = x_0$ is an ordinary or regular point. Otherwise it is singular.
Example 6.3. Legendre's equation is
$$(1 - x^2)u'' - 2xu' + \lambda(\lambda + 1)u = 0 \tag{6.5}$$
where $\lambda$ is an integer. We may note that the solutions are the same for $\lambda = k > 0$ and $\lambda = -k - 1 < 0$, since both give $\lambda(\lambda + 1) = k(k + 1)$, so we can take $\lambda \geq 0$.
Here $f = -2x/(1 - x^2)$ and $g = \lambda(\lambda + 1)/(1 - x^2)$. As $x \to 0$ these have limits $0$ and $\lambda(\lambda + 1)$, so $x = 0$ is an ordinary point.
Case 2. If $(x - x_0)f(x)$ and $(x - x_0)^2 g(x)$ have finite limits as $x \to x_0$, but $f$ and/or $g$ do not, $x = x_0$ is a regular singular point. (One intuitive way to explain this is as follows. Multiply our equation through by $(x - x_0)^2$ to get
$$(x - x_0)^2 u'' + (x - x_0)^2 f(x)u' + (x - x_0)^2 g(x)u = 0. \tag{6.6}$$
Since differentiating behaves a bit like division, as $x \to x_0$, $(x - x_0)^2 u''$ and $(x - x_0)u'$ will have finite limits, so to make all terms do so we need $(x - x_0)f(x)$ and $(x - x_0)^2 g(x)$ to stay finite. Of course this is not at all a rigorous argument.)
³What this means is more accurately defined below.
Example 6.4.
1. Returning to the Legendre example, as $x \to 1$, $f = -2x/(1 - x^2)$ and $g = \lambda(\lambda + 1)/(1 - x^2)$ diverge but $(x - 1)f = 2x/(x + 1)$ and $(x - 1)^2 g = -\lambda(\lambda + 1)(x - 1)/(x + 1)$ have limits $1$ and $0$ respectively. Thus $x = 1$ is a regular singular point. Similarly, so is $x = -1$.
2. If we take Bessel's equation
$$x^2 u'' + xu' + (x^2 - m^2)u = 0,$$
where $m$ is a constant, we see that $x = 0$ is a regular singular point ($f = 1/x$, $g = 1 - m^2/x^2$).
3. Mathieu's equation
$$4x(1 - x)u'' + 2(1 - 2x)u' + (a + bx)u = 0,$$
where $a$ and $b$ are constants, has regular singular points at $x = 0$ and $x = 1$ ($f = (1 - 2x)/2x(1 - x)$, $g = (a + bx)/4x(1 - x)$).
4. The equation
$$x^2 u'' + x^2 u' - 2u = 0,$$
like Bessel's equation, has a regular singular point at $x = 0$. I do not know who first considered this equation. As it is 2.191 in Kamke's book I will call it "Kamke 2.191".
We will take these 4 equations as our example set for the rest of the discussion.
Case 3. Otherwise $x = x_0$ is an irregular singular point. An example is given by $x^3 u'' + 2xu' - u = 0$ at $x = 0$.
We can now state two theorems, which we will not prove.
Theorem 6.1 If $x = x_0$ is an ordinary point, then all solutions of the ODE (6.4) can be written as power series
$$u = \sum_{n=0}^{\infty} a_n (x - x_0)^n. \tag{6.7}$$
Theorem 6.2 If $x = x_0$ is a regular singular point, then at least one solution of the ODE (6.4) can be written as a Frobenius series
$$u = \sum_{n=0}^{\infty} a_n (x - x_0)^{n+c} \tag{6.8}$$
where $c$ is a constant (not necessarily an integer) and $a_0 \neq 0$.
We will discuss later what the second solution at a regular singular point could be. Note that in the form (6.8) we can always take $a_0 = 1$ since any multiple of a solution is also a solution. We cannot do this in the form (6.7) as it may be that $a_0 = 0$.
At an irregular singular point, one needs a still more complicated form and only gets results valid in a sector of the complex plane — this too can be made into computer programs and evaluated numerically but is beyond this course.
Frobenius's method is simply to take the Frobenius form and substitute into the equation to find suitable values of the $c$ and $a_n$. Note that if $c$ is a non-negative integer, the form (6.8) is a power series (6.7) whose first non-zero term is a multiple of $(x - x_0)^c$.
A very helpful tool, which we will use repeatedly, is "shifting indices": for $k > 0$
$$\sum_{n=0}^{\infty} b_n (x - x_0)^{n+c+k} = \sum_{n=0}^{\infty} b_{n-k} (x - x_0)^{n+c} \tag{6.9}$$
provided that $b_{-p} = 0$ for all positive $p$. To see this is correct, just expand the summations on both sides. In practice, $b_n$ may be something like $(n + c)(n + c - 1)a_n$, which would become $(n + c - k)(n + c - k - 1)a_{n-k}$.
If we use the same idea for $k < 0$ we have to change the lower index so that the first term included is the one with coefficient $b_0$:
$$\sum_{n=0}^{\infty} b_n (x - x_0)^{n+c-p} = \sum_{n=-p}^{\infty} b_{n+p} (x - x_0)^{n+c}. \tag{6.10}$$
Important note: We now discuss what can happen at each stage. You do not need to memorize this or the formulae for the indicial equation and recurrence relation, though. All you need to do is remember the form (6.8), and how to do the shift (6.9), and then start equating coefficients of powers of $x$ and solving.
To show how the method works, let us take $x_0 = 0$ and suppose $xf = f_0 + f_1 x + f_2 x^2 + \ldots$, $x^2 g = g_0 + g_1 x + g_2 x^2 + \ldots$, which covers both ordinary and regular singular points (in the ordinary case we would have $f_0 = g_0 = g_1 = 0$).
A second important note: Although the following calculation and discussion is good for showing how things work in general, in practice $f$ and $g$ are often rational functions, and rather than expand these as Taylor series it may be better to multiply by the denominators of $f$ and $g$ so we get an equation of the form
$$a(x)u'' + b(x)u' + c(x)u = 0 \tag{6.11}$$
with polynomial coefficients. This does not affect the main lines of the argument, and I show the Legendre example later.
Step 1: Put $x_0 = 0$, the expansions of $f$ and $g$, and (6.8) into (6.6) to get
$$\begin{aligned}
0 ={}& x^2\Big(\sum_{n=0}^{\infty}(n+c)(n+c-1)a_n x^{n+c-2}\Big) + x(f_0 + f_1 x + f_2 x^2 + \ldots)\Big(\sum_{n=0}^{\infty}(n+c)a_n x^{n+c-1}\Big) + (g_0 + g_1 x + g_2 x^2 + \ldots)\Big(\sum_{n=0}^{\infty} a_n x^{n+c}\Big)\\
={}& \sum_{n=0}^{\infty}(n+c)(n+c-1)a_n x^{n+c} + f_0\sum_{n=0}^{\infty}(n+c)a_n x^{n+c} + f_1\sum_{n=0}^{\infty}(n+c)a_n x^{n+c+1}\\
&+ f_2\sum_{n=0}^{\infty}(n+c)a_n x^{n+c+2} + g_0\sum_{n=0}^{\infty} a_n x^{n+c} + g_1\sum_{n=0}^{\infty} a_n x^{n+c+1} + g_2\sum_{n=0}^{\infty} a_n x^{n+c+2} + \ldots\\
={}& \sum_{n=0}^{\infty}(n+c)(n+c-1)a_n x^{n+c} + f_0\sum_{n=0}^{\infty}(n+c)a_n x^{n+c} + f_1\sum_{n=0}^{\infty}(n+c-1)a_{n-1} x^{n+c}\\
&+ f_2\sum_{n=0}^{\infty}(n+c-2)a_{n-2} x^{n+c} + g_0\sum_{n=0}^{\infty} a_n x^{n+c} + g_1\sum_{n=0}^{\infty} a_{n-1} x^{n+c} + g_2\sum_{n=0}^{\infty} a_{n-2} x^{n+c} + \ldots\\
={}& \sum_{n=0}^{\infty}\Big\{[(n+c)(n+c-1) + f_0(n+c) + g_0]a_n + [f_1(n+c-1) + g_1]a_{n-1} + [f_2(n+c-2) + g_2]a_{n-2} + \ldots\Big\}x^{n+c}
\end{aligned}$$
using (6.9). Now we equate coefficients of the different powers of $x$.
Note: In the above calculation, the final series contains only $a_{n-k}$ for $k \geq 0$. If we use the form (6.11) of the equation, we may also end up with terms $a_{n+1}$ or $a_{n+2}$ depending on the lowest order powers of $x$ in $a$, $b$ and $c$. See the following examples.
Example 6.5. Continuing the examples above
1. For Legendres equation it is easier to use the original form (6.5) rather than expand 2x/(1 x
2
) as a
power series, etc. This gives us
0 =

n=0
(n +c)(n +c 1)a
n
x
(n+c2)
x
2
(

n=0
(n +c)(n +c 1)a
n
x
(n+c2)
) 2x(

n=0
(n +c)a
n
x
(n+c1)
)
+( +1)(

n=0
a
n
x
(n+c)
)
=

n=0
(n +c)(n +c 1)a
n
x
(n+c2)

n=0
(n +c)(n +c 1)a
n
x
(n+c)
2

n=0
(n +c)a
n
x
(n+c)
+

n=0
( +1)a
n
x
(n+c)
=

n=0
(n +c)(n +c 1)a
n
x
(n+c2)
+

n=0
[( +1) (n +c)(n+c +1)]a
n
x
(n+c)
Now in this case if we shift the index in the rst sum we need to ensure we keep the rst terms, i.e. we get

n=2
(n +c +2)(n +c +1)a
n+2
x
(n+c)
, so we end up with

n=2
{(n +c +2)(n +c +1)a
n+2
+[( +1) (n +c)(n+c +1)]a
n
}x
(n+c)
= 0. (6.12)
2. For Bessels equation we get the general form above with f
0
= 1, g
0
=m
2
, g
2
= 1 and all other f and g
coefcients zero.
3. For Mathieu's equation we again do better to use the original form than expand $f$ and $g$ as Taylor series. The result is
\[
\begin{aligned}
0 ={}& \sum_{n=-1}^{\infty}4(n+c+1)(n+c)a_{n+1}x^{n+c}
- \sum_{n=0}^{\infty}4(n+c)(n+c-1)a_n x^{n+c}
+ \sum_{n=-1}^{\infty}2(n+c+1)a_{n+1}x^{n+c}\\
&- \sum_{n=0}^{\infty}4(n+c)a_n x^{n+c}
+ \sum_{n=0}^{\infty}a\,a_n x^{n+c}
+ \sum_{n=0}^{\infty}b\,a_n x^{n+c+1}
\end{aligned}
\]
\[
0 = \sum_{n=-1}^{\infty}\big\{2(2n+2c+1)(n+c+1)a_{n+1} + (a - 4(n+c)^2)a_n + b\,a_{n-1}\big\}x^{n+c}
\]
4. For Kamke 2.191 we get the general form above with $f_0 = 0$, $f_1 = 1$, $g_0 = -2$, other coefficients 0.
Step 2: The terms arising from the lowest powers of $x$ give us the indicial equation which tells us what values of $c$ to use. In the general form the first terms come from $n = 0$ in the above sums, giving
\[ a_0\big(c(c-1) + f_0 c + g_0\big) = 0 \]
and since $a_0 \ne 0$ we can solve this quadratic for $c$. Let us call the roots $\alpha$ and $\beta$, with $\alpha \ge \beta$. Then
\[ f_0 - 1 = -(\alpha+\beta), \qquad g_0 = \alpha\beta. \]
Note that at an ordinary point we always have $f_0 = g_0 = 0$, so we always get $\alpha = 1$ and $\beta = 0$.
Example 6.6. Again continuing the previous examples.
1. For the Legendre equation, $x = 0$ is an ordinary point, so $c = 0$ or $c = 1$. Using (6.12) we need to take $n = -2$ to find this result.
2. For Bessel's equation, $f_0 = 1$, $g_0 = -m^2$, so $c(c-1) + c - m^2 = 0$, which gives $c = \pm m$.
3. For Mathieu's equation, we can either argue that in the general form $f_0 = \tfrac12$, $g_0 = 0$, so $c(c-1) + \tfrac12 c = 0$ and hence $c = 0$ or $c = 1/2$, or proceed from the sum involving $a_{n+1}$, taking $n = -1$, to get the same conclusion.
4. Kamke 2.191 has $f_0 = 0$, $f_1 = 1$, $g_0 = -2$, so $c(c-1) - 2 = 0$, whence $c = -1$ or $c = 2$.
Step 3: Now consider the higher powers of $x$. For $n = r > 0$ in the general form we get
\[
a_r[(r+c)(r+c-1) + f_0(r+c) + g_0] + a_{r-1}[(r+c-1)f_1 + g_1] + a_{r-2}[(r+c-2)f_2 + g_2] + \cdots + a_0[c f_r + g_r] = 0. \qquad(6.13)
\]
This is called the recurrence relation.
Example 6.7.
1. For Legendre's equation we get, for $r \ge -1$,
\[ (r+c+2)(r+c+1)a_{r+2} - [(r+c)(r+c+1) - \ell(\ell+1)]a_r = 0 \]
2. For Bessel's equation we get, for $r > 0$,
\[ [(r+c)^2 - m^2]a_r + a_{r-2} = 0 \]
3. For Mathieu's equation we get, for $r \ge 0$,
\[ 2(r+c+1)(2r+2c+1)a_{r+1} + (a - 4(r+c)^2)a_r + b\,a_{r-1} = 0 \]
4. Finally for Kamke 2.191 we get, for $r > 0$,
\[ (r+c+1)(r+c-2)a_r + (r+c-1)a_{r-1} = 0 \]
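Recurrences like these can be run forward on a computer to generate the series. As a spot check not in the notes, the Bessel recurrence with $m = 0$, $c = 0$ reads $r^2 a_r = -a_{r-2}$; summing the resulting series at $x = 1$ should reproduce the known value $J_0(1) \approx 0.76519769$.

```python
# Run the Bessel recurrence r^2 a_r = -a_{r-2} (the m = 0, c = 0 case of
# [(r+c)^2 - m^2] a_r + a_{r-2} = 0), then sum the series at x = 1.
a = {0: 1.0, 1: 0.0}          # a_0 is the free coefficient; odd terms vanish
for r in range(2, 21):
    a[r] = -a[r - 2] / r**2

x = 1.0
j0_at_1 = sum(a[r] * x**r for r in range(21))
# With a_0 = 1 this normalization gives the Bessel function J_0 directly.
```

The partial sum through $r = 20$ already agrees with $J_0(1)$ to machine precision, since the terms fall off factorially.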
In general the recurrence relation gives us $a_r$ from the preceding $a_i$, provided the coefficient of $a_r$ is not 0. This coefficient, in the general case, is
\[
(r+c)^2 + (f_0-1)(r+c) + g_0 = (r+c)^2 - (\alpha+\beta)(r+c) + \alpha\beta = (r+c-\alpha)(r+c-\beta).
\]
If $c = \alpha$ this is $(r+\alpha-\beta)r > 0$, so no problem arises. The case to consider further is $c = \beta$, where the coefficient is $r(r-\alpha+\beta)$.
1. If $\alpha-\beta$ is not an integer, we again have no problem, because $r(r-\alpha+\beta) \ne 0$. Example: Mathieu's equation.
2. If $\alpha-\beta$ is an integer, $k$ say, we may have a problem: three cases can arise.
2A: At an ordinary point, the problem would be with $c = 0$, $r = 1$, but then the coefficient of $a_0 = a_{r-1}$ is $(r+c-1)f_1 + g_1 = g_1 = 0$, so the recurrence relation reads $0 = 0$. This just tells us that $a_1$ can be chosen arbitrarily, which is a way of saying that to the series with $c = 0$ you can add any multiple of the series for $c = 1$, or that you can choose $u(0) = a_0$ and $u'(0) = a_1$ arbitrarily. Example: Legendre's equation.
2B: At a regular singular point, it may happen that the equation for $r = \alpha-\beta$ reduces to $0 = 0$. In this case, the series with $c = \beta$ becomes a polynomial multiple of $x^\beta$, together with, as in the ordinary point case, an arbitrary multiple $a_{\alpha-\beta}$ of the series with $c = \alpha$. Example: Kamke 2.191.
2C: If we have a regular singular point and the recurrence relation would tell us that $a_{\alpha-\beta} = \infty$ (or if $\alpha = \beta$), then the second solution is not of the form (6.8): instead one would have to add a multiple of the series for $c = \alpha$ multiplied by $\ln x$. You will not be asked to find any such second solution explicitly. Example: Bessel's equation for an integer $m$.
Summarizing: at a regular singular point, either the second solution can be found as a Frobenius series, possibly giving a polynomial multiple of $(x-x_0)^\beta$, or it involves a multiple of $\ln(x-x_0)$.
Step 4: Solve the recurrence relation and work out the series. There are methods for solving recurrence relations, in many cases, to give a formula for $a_r$ as a function of $r$. We do not have time in this course to include these. Instead we will either consider cases where it is easy to find the formula or restrict ourselves to working out the first few terms numerically. One further thing that can happen is that a series may turn out to be a polynomial, i.e. the recurrence may give us $0 = a_k = a_{k+1} = a_{k+2} = \cdots$ for some $k$.
Example 6.8. We will only continue the calculations for the Kamke and Legendre equations. For Mathieu's equation, and for Bessel's equation if $m$ is not an integer, both series can be found. For Bessel's equation when $m$ is an integer we need the form with $\ln x$ to get the complete solution for $c = -m$. The (appropriately normalized) solutions of Bessel's equation for $c = m$ are the Bessel functions (of the first kind) $J_m(x)$, while the solutions for $c = -m$ are the Neumann functions: because the Neumann functions contain $\ln x$ they diverge as $x \to 0$.
For Legendre's equation, if we take $c = 1$, we have
\[ (r+3)(r+2)a_{r+2} = [(r+2)(r+1) - \ell(\ell+1)]a_r \]
which in particular gives $a_1 = 0$, and we can find the whole series, e.g. if $\ell = 2$ we have $6a_2 = -4a_0$, $20a_4 = 6a_2$, $42a_6 = 24a_4$, so the first few terms of the series are
\[ a_0 x\Big(1 - \frac{2}{3}x^2 - \frac{1}{5}x^4 - \frac{4}{35}x^6 - \cdots\Big). \]
If $\ell$ is an odd integer,
\[ (\ell+2)(\ell+1)a_{\ell+1} = [\ell(\ell+1) - \ell(\ell+1)]a_{\ell-1} = 0 \]
so $a_{\ell+1} = 0$. In this case the series becomes just a polynomial in odd powers of $x$, with highest power $x^\ell$. For example for $\ell = 5$, $6a_2 = -28a_0$, $20a_4 = -18a_2$, $42a_6 = 0$ and
\[ u = a_0 x\Big(1 - \frac{14}{3}x^2 + \frac{21}{5}x^4\Big). \]
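The $\ell = 5$ numbers above can be verified by running the $c = 1$ recurrence with exact rational arithmetic; this is a small sketch, not part of the notes.

```python
# Check the c = 1 Legendre recurrence (r+3)(r+2) a_{r+2} = [(r+2)(r+1) - l(l+1)] a_r
# for l = 5: the series should terminate, giving u = a_0 x (1 - 14/3 x^2 + 21/5 x^4).
from fractions import Fraction

l = 5
a = {0: Fraction(1)}            # take a_0 = 1
for r in range(0, 8, 2):        # only even r matter since a_1 = 0
    a[r + 2] = Fraction((r + 2)*(r + 1) - l*(l + 1), (r + 3)*(r + 2)) * a[r]
```

The coefficients $a_6$, $a_8$ come out as exactly zero, confirming termination at $x^\ell$.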
For Legendre with $c = 0$, we have
\[ (r+2)(r+1)a_{r+2} = [(r+1)r - \ell(\ell+1)]a_r \]
and in particular $a_1$ ($r = -1$) can have any value, meaning we can add a multiple of the series with $c = 1$. Taking just the even terms, we see that if $\ell$ is a positive even integer, $a_{\ell+2} = 0$. So we again have a polynomial, this time of even powers of $x$, with highest power $x^\ell$.
Thus for the Legendre equation we will always get one solution which is a polynomial and another which is an infinite series. We may note for future reference that if we had started by taking $\lambda$ in place of $\ell(\ell+1)$ in the equation, we would find that the condition for the series to terminate at some $r$ would be $(r+2)(r+1) = \lambda$ or $r(r+1) = \lambda$ respectively, which gives back the condition $\lambda = \ell(\ell+1)$.
The polynomials are called the Legendre polynomials and denoted $P_\ell(x)$, where by convention we choose the overall factor $a_0$ or $a_1$ in them so that $P_\ell(1) = 1$. Using the above relations we can write down the first few of the polynomials as
\[ P_0 = 1, \quad P_1 = x, \quad P_2 = \tfrac{1}{2}(3x^2-1), \quad P_3 = \tfrac{1}{2}(5x^3-3x), \ldots \]
The infinite series have radius of convergence 1 by the ratio test. If we re-do the Frobenius method at $x = \pm 1$ we would find that $c = 0$ twice, so the second solutions contain logarithms which diverge at the singular points. Another way to find this is to use the method of variation of constants to find the second solution, which always has a term $P_\ell(x)\ln\dfrac{1+x}{1-x}$ in it.
For the Kamke equation we get a series for $c = 2$: $4a_1 = -2a_0$, $10a_2 = -3a_1$, $18a_3 = -4a_2$, \ldots, so the series is
\[ a_0 x^2\Big(1 - \frac{1}{2}x + \frac{3}{20}x^2 - \frac{1}{30}x^3 + \cdots\Big). \]
It can in fact be written as a Liouvillian function. For $c = -1$ we get $a_1 = -\tfrac{1}{2}a_0$ and $a_2 = 0$, so this gives a solution $u = a_0(2-x)/2x$. At $r = 3$ (where the $c = 2$ series would start) the recurrence reads $0 = 0$. This means we can add any multiple of the series for $c = 2$.
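The fractions in the $c = 2$ Kamke series can likewise be checked exactly; again this is an illustrative sketch rather than part of the notes.

```python
# Check the Kamke 2.191 recurrence (r+c+1)(r+c-2) a_r = -(r+c-1) a_{r-1}
# for c = 2, reproducing a_0 x^2 (1 - x/2 + 3x^2/20 - x^3/30 + ...).
from fractions import Fraction

c = 2
a = {0: Fraction(1)}            # take a_0 = 1
for r in range(1, 4):
    a[r] = Fraction(-(r + c - 1), (r + c + 1)*(r + c - 2)) * a[r - 1]
```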
Exercise 6.1. Use Frobenius's method to find series solutions to the following familiar differential equation:
\[ \frac{d^2y}{dx^2} + y = 0. \]
You should satisfy yourself that the series you obtain are indeed series for the well-known solutions of this equation.
Exercise 6.2. Obtain two independent series solutions of the following equation, using the Frobenius method:
\[ 2xy'' + y' - 2y = 0. \]
6.4 Special functions
We can do no more than scratch the surface of this topic. We will consider only special functions defined as solutions of a Sturm-Liouville system.
The first component of such a system is a self-adjoint second-order linear ODE, i.e. one of the form
\[ (p(x)u')' + (q(x) + \lambda w(x))u = 0, \qquad(6.14) \]
where $\lambda$ is an unknown constant and $p$, $q$ and $w$ are known, suitably defined$^4$ functions.
Example 6.9.
1. Legendre's equation has this form (6.14), since we can write it as
\[ [(1-x^2)u']' + \lambda u = 0, \]
where $\lambda = \ell(\ell+1)$, so $p = 1-x^2$, $w = 1$, $q = 0$.
2. Bessel's equation can be written as
\[ (xu')' + (x - m^2/x)u = 0 \]
(by changing the variable to $\sqrt{\lambda}\,x$). Here $p = w = x$, $q = -m^2/x$. (A different eigenvalue problem can be given by replacing $m^2$ by $-\lambda$.)
3. Hermite's equation
\[ u'' - 2xu' + \lambda u = 0 \]
is not in the form (6.14) as it stands.
4. We will not in this introduction spell out all the required conditions. For those who have done the Differential Equations course, I note that if one solution $u$ is known, variation of parameters gives the other as $u\big(\int dx/pu^2\big)$, which is easily checked.
If we have an equation
\[ a(x)u'' + b(x)u' + c(x)u = 0 \qquad(6.15) \]
which is not in the above form, we see (by comparing this with (6.14) written as
\[ p(x)u'' + p'u' + (q(x) + \lambda w(x))u = 0\;) \]
that to put it in the above form we need to multiply by a factor $p/a$ where $p'/p = b/a$, ending up with
\[ \Big[u'\exp\Big(\int\frac{b}{a}\,dx\Big)\Big]' + \frac{cu}{a}\exp\Big(\int\frac{b}{a}\,dx\Big) = 0. \]
We can also recast the equation (6.15) in the form $v'' + h(x)v = 0$ by changing the variable so that
\[ v = u\exp\Big(\int\frac{b}{2a}\,dx\Big) \quad\text{and}\quad h = \frac{c}{a} - \Big(\frac{b}{2a}\Big)' - \Big(\frac{b}{2a}\Big)^2. \]
Doing the second of these for Hermite's equation we get
\[ (ue^{-x^2/2})'' + (\lambda + 1 - x^2)ue^{-x^2/2} = 0, \]
which is an equation of the form (6.14) for $v = ue^{-x^2/2}$ with $p = w = 1$, $q = 1 - x^2$.
4. The equation for the trigonometric functions sin and cos:
\[ u'' + \lambda u = 0. \]
Here $p = w = 1$, $q = 0$.
5. Mathieu's equation can also be written in this form, but to do so it is best to change the independent variable to $v$ where $x = \cos^2 v$. We will not pursue this further. (Details are available in many books if you are interested.)
A Sturm-Liouville system consists of a differential equation (6.14) together with suitable boundary conditions. A regular S-L system is one with boundary conditions of the form
\[ a_1 u(a) + a_2 u'(a) = 0 = b_1 u(b) + b_2 u'(b) \qquad(6.16) \]
at some fixed points $x = a$ and $x = b$. We assume $p$ and $w$ are non-zero in $[a,b]$. Here $a_1$, $a_2$, $b_1$ and $b_2$ are constants and we do not allow $a_1 = a_2 = 0$ or $b_1 = b_2 = 0$.
Given an S-L system we want to find the allowed values of $\lambda$ and the corresponding solutions $u$. (There is always the trivial solution $u = 0$, but we ignore that as being of no interest.)
The allowed values $\lambda$ are called the eigenvalues and the corresponding $u$ are the eigenfunctions. (These are exactly analogous to eigenvalues and eigenvectors in linear algebra if we take suitable vector spaces of functions.)
It is known that:
1. All eigenvalues are real. (The proof of this result follows the logic of the proof of the similar result for eigenvalues of a symmetric matrix.)
2. There are a countably infinite number of eigenvalues and they can be ordered as $\lambda_1 < \lambda_2 < \lambda_3 < \cdots$.
3. $\lim_{n\to\infty}\lambda_n = \infty$.
4. For each eigenvalue there is an eigenfunction. (This is just a variant of the uniqueness theorem for differential equations.)
5. Eigenfunctions for different eigenvalues are linearly independent. (This is easy to prove by contradiction.)
The second and third of these have harder proofs, beyond both this course and your present knowledge from other courses. Accepting these results, we can easily prove a further important property of the eigenfunctions. Suppose we have two eigenvalues$^5$ $\lambda_m$ and $\lambda_n \ne \lambda_m$ with corresponding eigenfunctions $u_m$ and $u_n$.
Then
\[
\int_a^b [(pu_m')'u_n - (pu_n')'u_m]\,dx = \int_a^b [-(q+\lambda_m w)u_m u_n + (q+\lambda_n w)u_n u_m]\,dx = (\lambda_n - \lambda_m)\int_a^b w\,u_m u_n\,dx,
\]
but
\[
\int_a^b [(pu_m')'u_n - (pu_n')'u_m]\,dx = \big[(pu_m')u_n - (pu_n')u_m\big]_a^b = \big[p(u_m'u_n - u_n'u_m)\big]_a^b.
\]
Now at $a$, $(u_m'u_n - u_n'u_m) = -a_1(u_m u_n - u_n u_m)/a_2 = 0$ if $a_2 \ne 0$, while if $a_2 = 0$, $u_m = u_n = 0$ at $a$. Either way $(u_m'u_n - u_n'u_m) = 0$ at $a$. Similarly $(u_m'u_n - u_n'u_m)$ is zero at $b$. Thus
\[ 0 = (\lambda_n - \lambda_m)\int_a^b w\,u_m u_n\,dx \]
and since $\lambda_n \ne \lambda_m$ this implies
\[ 0 = \int_a^b w\,u_m u_n\,dx. \]
This is an orthogonality condition. Because of its role in this condition, $w$ is called the weight.
We can make the functions orthonormal (assuming, for instance, that $w > 0$ in $(a,b)$) by choosing their scale so that for any $\lambda$,
\[ 1 = \int_a^b w(u_\lambda)^2\,dx. \qquad(6.17) \]
Looking at the proof of orthogonality we see that what we really need is to be able to show that $[p(u_m'u_n - u_n'u_m)]_a^b = 0$. As well as regular S-L systems some other forms will work.
One type of these, collectively called singular S-L systems, arises when $p(a)p(b) = 0$. If $p(a) = 0 \ne p(b)$, we only need the boundary condition at $b$, and similarly if $p(b) = 0 \ne p(a)$ we only need the condition at $a$. If $p(a) = 0 = p(b)$ we just need to ensure that $u$ is bounded in $[a,b]$ (so that the integral is well-defined).
Lastly, if $p(a) = p(b)$ we can have periodic S-L systems, where we impose $u(a) = u(b)$ and $u'(a) = u'(b)$. (If $p$ itself is periodic this will lead to $u(x + (b-a)) = u(x)$, i.e. $u$ will be a periodic function.)
Example 6.10. Continuing the previous examples (except for Mathieu's equation).
1. For Legendre's equation, we take the range $[-1,1]$. The $p$ vanishes at the endpoints of the range, so the boundary conditions (6.16) are replaced by the requirement that $u$ is bounded on $[-1,1]$. The infinite series obtained from Frobenius's method blow up at $x = \pm 1$, so only the polynomial solutions are allowed, and it is for this reason that we must have $\lambda = \ell(\ell+1)$ for integer $\ell$. The orthogonality relation is
\[ \int_{-1}^1 P_k(x)P_\ell(x)\,dx = 0, \quad\text{if } k \ne \ell. \]
5. In the lectures I called these $\lambda_1$ and $\lambda_2$, but this caused confusion both with the first two of the sequence above and with the numerical values in examples, so it has been changed here.
There are interesting further properties of Legendre polynomials which we do not have time to discuss, e.g. generation by Rodrigues's formula.
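The Legendre orthogonality relation just stated is easy to confirm numerically; this is a quick midpoint-rule sketch (not from the notes), using the explicit $P_2$ and $P_3$ given earlier and the weight $w = 1$.

```python
# Numerically verify orthogonality of P_2 and P_3 on [-1, 1] (weight w = 1),
# and the normalization integral of P_2, using a simple midpoint rule.
P2 = lambda x: 0.5 * (3*x**2 - 1)
P3 = lambda x: 0.5 * (5*x**3 - 3*x)

steps = 20_000
h = 2.0 / steps
xs = [-1 + (i + 0.5) * h for i in range(steps)]
cross = sum(P2(x) * P3(x) for x in xs) * h     # should vanish (k != l)
norm2 = sum(P2(x)**2 for x in xs) * h          # equals 2/(2*2+1) = 0.4
```

The value $\int_{-1}^1 P_2^2\,dx = 2/5$ used in the check is the standard normalization of the Legendre polynomials, quoted here without proof.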
2. (Bessel's equation) We take an integer $m$, choose $[0,b]$ as the range and take the conditions that $u(0)$ is bounded and $u'(b)$ is zero. Then $u(x) = J_m(kx)$ where $J_m'(kb) = 0$ and $\lambda = k^2$. Taking two different values of $k$, say $k_1$ and $k_2$, we obtain
\[ \int_0^b x J_m(k_1 x)J_m(k_2 x)\,dx = 0. \]
3. (Hermite's equation in self-adjoint form.) In this case we take the range of $x$ as $(-\infty,\infty)$ and the boundary condition as $v = ue^{-x^2/2} \to 0$ as $x \to \pm\infty$ (i.e. the usual conditions with $a_1 = b_1 = 1$, $a_2 = b_2 = 0$ for $v$).
Applying the Frobenius method to Hermite's equation we see that $x = 0$ is an ordinary point. The recurrence relation (6.13) gives, for $c = 0$,
\[ r(r-1)a_r = [2(r-2) - \lambda]a_{r-2}. \]
This applies to the series starting at $a_0$ and the independent one starting at $a_1$. One of the series terminates if $\lambda = 2k$ is an even integer: these give the Hermite polynomials $H_k$.
Also from the recurrence relation we see that for large $r$ the successive terms in the infinite series have a ratio that is approximately $2x^2/r$, so the series behaves like the one for $e^{x^2}$. Hence we can (correctly) guess that requiring $ue^{-x^2/2} \to 0$ as $x \to \pm\infty$ will eliminate the infinite series and leave only the polynomials.
From the above theory we see that if $m \ne k$
\[ \int_{-\infty}^{\infty} H_k H_m e^{-x^2}\,dx = 0. \]
4. For $u'' + \lambda u = 0$ we can take periodic boundary conditions. (Note we can take other boundary conditions for the same equation!) Taking them at $-\pi$ and $\pi$: $u(-\pi) = u(\pi)$ and $u'(-\pi) = u'(\pi)$.
We know $\lambda$ is real. We now try the three possible cases in turn to see which gives solutions of the ODE which also fit the boundary conditions.
If $\lambda = -k^2 < 0$ the solution (cf. section 1.3) is $u_k = C_k\cosh kx + D_k\sinh kx$ and this cannot be periodic. So the eigenvalue cannot be negative.
If $\lambda = 0$ there is the special solution $u_0 = A_0 + B_0 x$ and we need $B_0 = 0$ for periodicity. So $\lambda = 0$ is possible, with eigenfunction $u = $ constant.
If $\lambda = k^2 > 0$ the solution (cf. section 1.2.2) is $u_k = A_k\cos kx + B_k\sin kx$ where $A_k$ and $B_k$ are constants, and this will be periodic with the given period if $k$ is an integer. (Note we could alternatively use $e^{\pm ikx}$, which would give a nice complex form of the Fourier series to be discussed in the next chapter.)
Combining the results for $\lambda \ge 0$, the eigenvalues are $\lambda = k^2$ for integer $k$.
Note that in this case there are 2 eigenfunctions for each positive eigenvalue. They can be split as the solutions with $u(\pm\pi) = 0$ and those with $u'(\pm\pi) = 0$, i.e. $\sin kx$ and $\cos kx$.
The general method used above shows that if $m$ and $n$ are non-negative integers,
\[ \int_{-\pi}^{\pi}\cos mx\cos nx\,dx = 0 \ \text{if } m \ne n, \qquad \int_{-\pi}^{\pi}\sin mx\sin nx\,dx = 0 \ \text{if } m \ne n. \qquad(6.18) \]
We also need to check that the two eigenfunctions for a given $\lambda$ are still orthogonal: in fact,
\[ \int_{-\pi}^{\pi}\cos mx\sin nx\,dx = 0 \ \text{for all } m, n. \qquad(6.19) \]
(You can do this by direct integration as in exercises 1, question 5. We also know from this exercise that for $k > 0$
\[ \int_{-\pi}^{\pi}\cos^2 kx\,dx = \int_{-\pi}^{\pi}\sin^2 kx\,dx = \pi. \qquad(6.20) \]
In fact we can easily see by the same method that for any interval $[a,b] = [p\pi/2, q\pi/2]$
\[ \int_a^b\cos^2 kx\,dx = \int_a^b\sin^2 kx\,dx = \tfrac{1}{2}(b-a), \]
i.e. on such intervals the average value of $\cos^2 kx$ or $\sin^2 kx$ is $\tfrac{1}{2}$, a useful result to remember.)
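The orthogonality relations (6.18)-(6.20) can be confirmed numerically for a sample pair of frequencies. This sketch (not part of the notes) uses a midpoint rule with $m = 2$, $n = 3$ as an arbitrary illustration.

```python
# Numerically confirm (6.18)-(6.20) for the sample pair m = 2, n = 3:
# the cross integrals vanish, and the cos^2 integral over a full period is pi.
import math

def midpoint(f, a, b, steps=20_000):
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

cc = midpoint(lambda x: math.cos(2*x) * math.cos(3*x), -math.pi, math.pi)  # (6.18)
cs = midpoint(lambda x: math.cos(2*x) * math.sin(3*x), -math.pi, math.pi)  # (6.19)
c2 = midpoint(lambda x: math.cos(3*x)**2, -math.pi, math.pi)               # (6.20)
```

For smooth periodic integrands over a full period the equally spaced rule is extremely accurate, so the agreement here is essentially to machine precision.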
This example leads to Fourier series (the next chapter).
There are many more things one can say about Sturm-Liouville systems, for example one can make statements about the numbers and locations of zeroes of $u_\lambda$ in the range $[a,b]$, but we will point out only one more.
This is that one can usually prove that any function $f(x)$ on the range $[a,b]$ which is continuous and piecewise differentiable can be written as a series
\[ f(x) = \sum_{n=1}^{\infty}c_n u_{\lambda_n}(x). \qquad(6.21) \]
(This property is known as completeness of the set of eigenfunctions.) Then because of orthogonality one has
\[ \int_a^b w f u_{\lambda_m}\,dx = c_m\int_a^b w u_{\lambda_m}^2\,dx \]
and the right hand side will be just $c_m$ if the eigenfunctions have been normalized as in (6.17). Hence we can write $f$ as a Sturm-Liouville series (6.21) with coefficients
\[ c_m = \int_a^b w f u_{\lambda_m}\,dx \Big/ \int_a^b w u_{\lambda_m}^2\,dx \]
or $c_m = \int_a^b w f u_{\lambda_m}\,dx$ if (6.17) holds.
We turn now to the Fourier series example.
Chapter 7
Fourier series
Syllabus section:
6. Fourier series: full, half and arbitrary range series. Parseval's Theorem.
Fourier series can be obtained for any function defined on a finite range, as in the S-L section above. In practice they provide a way to do various calculations with, and to analyse the behaviour of, functions which are periodic, i.e. repeat the same values in a regular pattern. Such a function with period $2\ell$ will obey an equation
\[ f(x + 2\ell) = f(x). \]
To begin with we will assume $\ell = \pi$. We know $\cos nx$ and $\sin nx$ for any integer $n$ have period $2\pi$. So, of course, do the other trigonometric functions such as $\tan x$, but these have the disadvantage of becoming unbounded at certain values (for $\tan x$, at $x = \pi/2$, for example).
Since $\cos nx$ and $\sin nx$ are the eigenfunctions of an S-L system studied in the previous chapter, we can write
\[ f = \sum_0^{\infty}(a_n\cos nx + b_n\sin nx) \]
for periodic piecewise differentiable $f$ (in fact, for any function defined on a range of length $2\pi$). We will modify this way of writing the series slightly soon.
Such a series splits $f$ into pieces of different frequency. Examples where this technique (or its generalizations for Fourier transforms, wavelets, etc.) is useful and makes sense include: resolution of sound waves into their different harmonics, optics, telecommunications, astronomy, climate variation, water waves, periodic behaviour of financial measures, etc. Fourier's original application (in his Théorie analytique de la chaleur in 1822) was related to solutions of the equations governing the propagation of heat in a solid; as an example they can be used to show why cellars maintain an almost constant temperature all year round, and we will discuss this application later.
We know from S-L system theory that $\sin nx$ and $\cos nx$ form a set of orthogonal functions: see (6.18) and (6.20).
7.1 Full range Fourier series
The idea is to write a function $f(x)$ defined for a range of values of $x$ of length $2\pi$, say $-\pi \le x \le \pi$, as a series of trigonometric functions
\[ f = \tfrac{1}{2}a_0 + \sum_1^{\infty}a_n\cos nx + \sum_1^{\infty}b_n\sin nx. \qquad(7.1) \]
Here the $\tfrac{1}{2}a_0$ is really a $\cos 0x = 1$ term (the eigenfunction for $\lambda = 0$): the reason for the half is that $\int_{-\pi}^{\pi}u_0^2\,dx = \int_{-\pi}^{\pi}1\,dx = 2\pi$, which is double the value of $\int_{-\pi}^{\pi}u_\lambda^2\,dx$ for the other values of $\lambda$. There is no point in including a $\sin 0x = 0$ term. Strictly, the use of the equals sign depends on convergence properties which we describe later.
From the S-L system results above we have that for all $m \ge 0$
\[ a_m = \frac{1}{\pi}\int_{-\pi}^{\pi}f\cos mx\,dx. \]
It was to provide this nice form that we included the $\tfrac{1}{2}$ with the $a_0$ term: $\tfrac{1}{2}a_0$ is in fact the average value of $f$ over the range. Similarly,
\[ b_m = \frac{1}{\pi}\int_{-\pi}^{\pi}f\sin mx\,dx. \]
Example 7.1. Find the Fourier series for
\[ f(x) = \begin{cases} 0 & \text{if } -\pi < x < 0 \\ x & \text{if } 0 < x < \pi. \end{cases} \]
Using the formulae above,
\[ a_m = \frac{1}{\pi}\int_{-\pi}^{\pi}f\cos mx\,dx = \frac{1}{\pi}\int_0^{\pi}x\cos mx\,dx, \qquad
b_m = \frac{1}{\pi}\int_{-\pi}^{\pi}f\sin mx\,dx = \frac{1}{\pi}\int_0^{\pi}x\sin mx\,dx \]
(the lower limit becomes 0 because $f = 0$ in $[-\pi,0]$, so we have no contribution from this range). Evaluating these, using integration by parts, we find that for $m > 0$
\[ a_m = \frac{1}{\pi}\Big\{\Big[\frac{x\sin mx}{m}\Big]_0^{\pi} - \int_0^{\pi}\frac{\sin mx}{m}\,dx\Big\}
= \frac{1}{\pi}\Big[\frac{\cos mx}{m^2}\Big]_0^{\pi}
= \frac{1}{\pi m^2}(\cos m\pi - 1) = \frac{1}{\pi m^2}((-1)^m - 1) \]
and this is $-2/\pi m^2$ for odd $m$ and 0 for even $m$. For $m = 0$ we have a special case
\[ a_0 = \frac{1}{\pi}\int_0^{\pi}x\,dx = \frac{1}{2\pi}[x^2]_0^{\pi} = \frac{\pi}{2}. \]
\[ b_m = \frac{1}{\pi}\Big\{-\Big[\frac{x\cos mx}{m}\Big]_0^{\pi} + \int_0^{\pi}\frac{\cos mx}{m}\,dx\Big\}
= \frac{1}{\pi}\Big(-\frac{\pi\cos m\pi}{m} + \Big[\frac{\sin mx}{m^2}\Big]_0^{\pi}\Big)
= \frac{1}{m}(-\cos m\pi) = \frac{-(-1)^m}{m} = \frac{(-1)^{m+1}}{m}. \]
Putting these back into the general form, the Fourier series we are asked for is
\[ \frac{\pi}{4} - \sum_{k=0}^{\infty}\frac{2}{\pi(2k+1)^2}\cos(2k+1)x + \sum_1^{\infty}\frac{(-1)^{n+1}}{n}\sin nx, \]
where we have dealt with the odd/even $m$ for $a_m$ by taking only $m = 2k+1$, which must be odd.
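The closed forms for $a_m$ and $b_m$ just derived can be cross-checked against direct numerical integration; this is a quick sketch (not part of the notes), again using a midpoint rule over $(0,\pi)$ where $f(x) = x$.

```python
# Cross-check the Example 7.1 coefficients against the closed forms
# a_m = ((-1)^m - 1)/(pi m^2), b_m = (-1)^(m+1)/m, for m = 1..5.
import math

def coeff(trig, m, steps=20_000):
    """(1/pi) * integral of x*trig(m x) over (0, pi), by the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x * trig(m * x) * h
    return total / math.pi

max_err_a = max(abs(coeff(math.cos, m) - ((-1)**m - 1) / (math.pi * m * m))
                for m in range(1, 6))
max_err_b = max(abs(coeff(math.sin, m) - (-1)**(m + 1) / m)
                for m in range(1, 6))
```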
Although this general method always works, we do not need to do it for functions we can put into the required form by other means.
Example 7.2. Find the Fourier series for $\sin^4 x$.
\[ \sin^4 x = \tfrac{1}{4}(1-\cos 2x)^2 = \tfrac{1}{4}(1 - 2\cos 2x + \cos^2 2x) = \tfrac{1}{4}\big(1 - 2\cos 2x + \tfrac{1}{2}[1 + \cos 4x]\big) \]
so $\sin^4 x = \tfrac{3}{8} - \tfrac{1}{2}\cos 2x + \tfrac{1}{8}\cos 4x$ is the required series.
Here $\tfrac{1}{2}a_0 = \tfrac{3}{8}$, $a_2 = -\tfrac{1}{2}$, $a_4 = \tfrac{1}{8}$ and all other coefficients are zero.
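Since the $\sin^4 x$ series is exact (a finite trigonometric identity, not a truncation), it can be spot-checked pointwise; a small sketch, not part of the notes:

```python
# Spot-check the identity sin^4 x = 3/8 - (1/2) cos 2x + (1/8) cos 4x
# on a grid of sample points.
import math

max_err = max(
    abs(math.sin(x)**4 - (3/8 - 0.5*math.cos(2*x) + 0.125*math.cos(4*x)))
    for x in [t * 0.1 for t in range(-60, 61)]
)
```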
We note that the whole series is periodic, i.e. if we take the same series for any $x$, rather than staying in the range $-\pi \le x \le \pi$, the series will obey $S(x+2\pi) = S(x)$. So this can also be used for functions defined on a range longer than $2\pi$ if those functions are periodic. Another way to look at this is that if we know the function on the range $[-\pi,\pi]$ we can define it for all $x$ by insisting that it be periodic.
We note that in full range series the range of $x$ could equally well be $[\alpha, \alpha+2\pi]$ for any $\alpha$, since all the quantities involved are periodic, so this will give integrals over exactly the same range of values of $f$. $\alpha = 0$ is often used.
Exercise 7.1. Find the Fourier series of $f(x)$ defined by $f(x) = 0$ in $-\pi < x < 0$ and $f = \cos x$ in $0 < x < \pi$. [Answer should be $\tfrac{1}{2}\cos x + \sum_{k=1}^{\infty}\dfrac{4k}{\pi(4k^2-1)}\sin 2kx$.]
7.2 Half range series; odd and even functions
We recall:
$f(x)$ is odd $\iff f(-x) = -f(x)$ for all $x$.
$f(x)$ is even $\iff f(-x) = f(x)$ for all $x$.
A general function can always be written as
\[ f(x) = \tfrac{1}{2}(f(x) + f(-x)) + \tfrac{1}{2}(f(x) - f(-x)) \]
in which the first part on the right is an even function and the second part is odd.
Since sine is odd and cosine is even we might suspect that for even functions only cosine terms appear in the Fourier series, and similarly for odd functions only sine terms. This is correct.
We can easily check this, e.g.
\[
\begin{aligned}
a_n &= \frac{1}{\pi}\int_{-\pi}^{\pi}f(x)\cos nx\,dx
= \frac{1}{\pi}\int_{-\pi}^{0}f(x)\cos nx\,dx + \frac{1}{\pi}\int_0^{\pi}f(x)\cos nx\,dx\\
&= \frac{1}{\pi}\int_{x=\pi}^{0}f(-x)\cos n(-x)\,d(-x) + \frac{1}{\pi}\int_0^{\pi}f(x)\cos nx\,dx\\
&= \frac{1}{\pi}\int_0^{\pi}f(-x)\cos nx\,dx + \frac{1}{\pi}\int_0^{\pi}f(x)\cos nx\,dx
= \frac{1}{\pi}\int_0^{\pi}(f(x) + f(-x))\cos nx\,dx
\end{aligned}
\]
which is 0 if $f$ is odd.
Similarly
\[ b_n = \frac{1}{\pi}\int_0^{\pi}(f(x) - f(-x))\sin nx\,dx. \]
Hence for an odd function, $b_n = \dfrac{2}{\pi}\displaystyle\int_0^{\pi}f\sin nx\,dx$.
And for an even function, $a_n = \dfrac{2}{\pi}\displaystyle\int_0^{\pi}f\cos nx\,dx$.
In the exercise 7.1 above, one can shorten the calculation by noting that the given function is $\tfrac{1}{2}(\cos x + g)$, where $g$ is the odd function defined as $\cos x$ if $x > 0$ and $-\cos x$ if $x < 0$. Then we only need the Fourier series for $g$, which must be a sine series.
If $f$ is odd, we have a Fourier sine series.
If $f$ is even, we have a Fourier cosine series.
(These names imply we are using half range series.)
Given a function $\phi(x)$ defined on $[0,\pi]$ (half range) we can define an even function $f$ such that $f = \phi$ in $(0,\pi)$ and an odd function $g$ such that $g = \phi$ in $(0,\pi)$. The even function gives a series with
\[ a_n = \frac{2}{\pi}\int_0^{\pi}\phi(x)\cos nx\,dx \]
(and $b_n = 0$) and the odd function gives a series with
\[ b_n = \frac{2}{\pi}\int_0^{\pi}\phi(x)\sin nx\,dx \]
(and $a_n = 0$). These are the two half-range series for $\phi$, and their values agree on the range $(0,\pi)$.
Example 7.3. $f(x)$ is such that $f(x) = f(x+2\pi)$ and $f(-x) = -f(x)$, and on $0 \le x \le \pi$, $f(x) = x(\pi - x)$. Find its Fourier series and prove that
\[ 1 - \frac{1}{3^3} + \frac{1}{5^3} - \cdots = \frac{\pi^3}{32}. \]
The function has period $2\pi$ and is odd, so we know the series is a sine series and
\[
\begin{aligned}
b_n &= \frac{2}{\pi}\int_0^{\pi}x(\pi-x)\sin nx\,dx
= \frac{2}{\pi}\Big\{\Big[-x(\pi-x)\frac{\cos nx}{n}\Big]_0^{\pi} + \int_0^{\pi}(\pi-2x)\frac{\cos nx}{n}\,dx\Big\}\\
&= \frac{2}{\pi}\Big\{\Big[(\pi-2x)\frac{\sin nx}{n^2}\Big]_0^{\pi} + 2\int_0^{\pi}\frac{\sin nx}{n^2}\,dx\Big\}
= \frac{4}{\pi}\Big[-\frac{\cos nx}{n^3}\Big]_0^{\pi}
= \begin{cases} 0 & \text{for } n = 2p,\\[4pt] \dfrac{8}{\pi(2p+1)^3} & \text{for } n = 2p+1. \end{cases}
\end{aligned}
\]
Thus
\[ f(x) = \frac{8}{\pi}\sum_{p=0}^{\infty}\frac{\sin(2p+1)x}{(2p+1)^3}. \qquad(7.2) \]
To get the series requested, we try evaluating (7.2) at some $x$ such that $\sin(2p+1)x = (-1)^p$. This occurs at $x = \pi/2$. Evaluating the series there gives
\[ \frac{\pi^2}{4} = f(\pi/2) = \frac{8}{\pi}\sum_{p=0}^{\infty}\frac{(-1)^p}{(2p+1)^3} \]
which on rearranging gives the required result.
A number of results of this sort, giving sums of numerical series, can be obtained by direct evaluation of equation (7.1) at some particular $x$. The only tricky point in using this is to guess where to evaluate: usually one of $\pm\pi$, $\pm\tfrac{1}{2}\pi$ or $\pm\pi/4$ is what is needed. Moreover it assumes a convergence we now need to discuss.
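The $\pi^3/32$ result can be sanity-checked by partial summation; for an alternating series with decreasing terms the error is below the first omitted term, which gives a sharp test. (A quick sketch, not from the notes.)

```python
# Partial sums of 1 - 1/3^3 + 1/5^3 - ... approach pi^3/32; for an alternating
# series the truncation error is below the first omitted term, here 1/2001^3.
import math

s = sum((-1)**p / (2*p + 1)**3 for p in range(1000))
err = abs(s - math.pi**3 / 32)
```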
7.3 Completeness and convergence of Fourier series
We now give answers to two questions: can every function with period $2\pi$ be written this way, and does the series always converge at all $x$? These ideas are referred to as completeness and convergence. To specify more fully, the series converges if, when we take the sum of the first $N$ terms and let $N \to \infty$ with $x$ fixed, the limit exists. Completeness amounts to asking if this limit is the value at $x$ of the function $f$. The proof of the relevant properties is not part of this course, but the result is. As always, the conditions in it are like small print in contracts: ignorable most of the time but important when things go wrong.
Theorem 7.1 (Fourier's theorem or Dirichlet's theorem) If $f(x)$ is periodic with period $2\pi$ for all $x$, and $f(x)$ is piecewise smooth in $(-\pi,\pi)$, then the Fourier series $S(x)$ with coefficients $a_n$ and $b_n$ as above converges to $\tfrac{1}{2}(f(x+) + f(x-))$ at every point.
Here piecewise smooth means sufficiently differentiable at all except isolated points, and $f(x+)$ means the limit of $f(x_0)$ for $x_0 > x$ as $x_0$ approaches $x$, called the upper limit or right limit (and similarly for $f(x-)$, the lower limit or left limit). Wherever $f$ is continuous, $S(x) = f(x) = \tfrac{1}{2}(f(x+) + f(x-))$. At discontinuities, $S(x) = \tfrac{1}{2}(f(x+) + f(x-))$, but this may or may not be the value of $f$ itself.
Typically, we will find that as $n \to \infty$, the coefficients $a_n$ and $b_n$ drop off like $1/n$ or faster.
Example 7.4. Taking the function and series of Example 7.1, we find that at $\pi$ the series converges to $\tfrac{1}{2}(f(\pi+) + f(\pi-)) = \tfrac{1}{2}(0 + \pi) = \tfrac{1}{2}\pi$. The series then gives
\[ \frac{\pi}{2} = \frac{\pi}{4} + \sum_{k=0}^{\infty}\frac{2}{\pi(2k+1)^2}, \]
since $\sin n\pi = 0$ and $\cos(2k+1)\pi = -1$. Hence
\[ \frac{\pi}{4} = \sum_{k=0}^{\infty}\frac{2}{\pi(2k+1)^2} = \frac{2}{\pi}\Big(1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots\Big), \quad\text{or}\quad \frac{\pi^2}{8} = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots. \]
Figure 7.1: Square wave (as in equation (7.3) but with the vertical direction stretched for better visibility) and Fourier partial sums: two terms and four terms.
There is a problem, however. The result tells us what happens in the limit. But if we take any finite number of terms we obviously cannot match a discontinuity exactly, since the series must then give a continuous function. It turns out that a finite sum overshoots the function on either side of the discontinuity: this curious effect is called Gibbs' phenomenon, and adding more terms does not reduce the overshoot, it just moves the overshoot closer to the discontinuity.
Example 7.5. The square wave.
Consider
\[ f(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x > 0 \end{cases} \qquad(7.3) \]
in the domain $[-\pi,\pi]$ and periodic with period $2\pi$. This gives
\[ a_0 = 1, \qquad a_{n>0} = 0, \qquad b_n = \frac{1 - \cos n\pi}{n\pi} \]
or equivalently
\[ f(x) = \frac{1}{2} + \frac{2}{\pi}\sum_{n\ \mathrm{odd}}\frac{\sin nx}{n}. \qquad(7.4) \]
Figure 7.1 shows the square wave and its approximations by its Fourier series (up to $n = 1$ and $n = 5$).
Several things are noticeable:
(i) even a square wave, which looks very unlike sines and cosines, can be approximated by them, to any desired accuracy;
(ii) although we only considered the domain $[-\pi,\pi]$, the Fourier series automatically extends the domain to all real $x$ by generating a periodic answer;
(iii) at discontinuities the Fourier series gives the mean value;
(iv) close to discontinuities the Fourier series overshoots.
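The overshoot in (iv) can be seen directly by evaluating a partial sum of (7.4) near the jump; the maximum stays roughly 9% above the function value however many terms are kept. (A numerical sketch, not part of the notes; the choice of 50 terms and the scan grid are arbitrary.)

```python
# Partial sums of the square-wave series 1/2 + (2/pi) sum_{n odd} sin(nx)/n
# overshoot near the jump at x = 0, illustrating Gibbs' phenomenon.
import math

def partial_sum(x, terms=50):
    """Sum of (7.4) using the first `terms` odd frequencies."""
    return 0.5 + (2/math.pi) * sum(math.sin(n*x)/n for n in range(1, 2*terms, 2))

# Scan just to the right of the discontinuity for the peak value.
peak = max(partial_sum(k * 1e-4) for k in range(1, 2001))
```

Adding more terms moves the peak closer to $x = 0$ but leaves its height near $1.09$, in line with the discussion above.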
Putting $x = \dfrac{\pi}{2}$ in (7.4) gives an unexpected identity:
\[ 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \frac{\pi}{4}. \]
A second result telling us in what sense we have a good approximation is Parseval's theorem.
Theorem 7.2 (Parseval) If $f$ has a Fourier series,
\[ \int_{-\pi}^{\pi}f^2\,dx = \pi\Big(\frac{1}{2}a_0^2 + \sum_{n=1}^{\infty}(a_n^2 + b_n^2)\Big). \]
The expression is obtained by direct calculation using the orthogonality properties, but for a proper proof one has to deal with convergence of the infinite sum. In a similar way one can show that for two functions $f$ and $g$ with Fourier series with coefficients $a_n$, $b_n$ and $A_n$, $B_n$ respectively we have
\[ \int_{-\pi}^{\pi}fg\,dx = \pi\Big(\frac{1}{2}a_0 A_0 + \sum_{n=1}^{\infty}(a_n A_n + b_n B_n)\Big). \]
Example 7.6. Consider the Fourier series for the square wave. Putting this into Parseval's theorem we have
\[ \int_0^{\pi}1\,dx = \frac{\pi}{2} + \frac{4}{\pi}\sum_{k=0}^{\infty}\frac{1}{(2k+1)^2} = \frac{\pi}{2} + \frac{4}{\pi}\Big(1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots\Big). \]
On rearranging we get
\[ \frac{\pi^2}{8} = \sum_{k=0}^{\infty}\frac{1}{(2k+1)^2} = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots \]
which we had already derived in another way in Example 7.4.
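Parseval's theorem for the square wave can also be checked by brute force: truncate the coefficient sum and verify it approaches the left-hand side from below. (A sketch, not from the notes.)

```python
# Parseval for the square wave: int f^2 dx over (-pi, pi) equals
# pi * ( (1/2) a_0^2 + sum b_n^2 ) with a_0 = 1 and b_n = 2/(n pi) for odd n.
import math

lhs = math.pi                  # f^2 = f, so the integral is just pi
rhs = math.pi * (0.5 + sum((2/(n*math.pi))**2 for n in range(1, 200_000, 2)))
gap = lhs - rhs                # positive and tiny: only the series tail is missing
```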
7.4 Arbitrary range series
If we have $f(x)$ defined in $-\ell \le x \le \ell$, we can define a new variable $y = \pi x/\ell$ so that $-\pi \le y \le \pi$, and write $f$ as a Fourier series in $y$:
\[ f(x) = \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty}(a_n\cos ny + b_n\sin ny)
= \tfrac{1}{2}a_0 + \sum_{n=1}^{\infty}\Big(a_n\cos\frac{n\pi x}{\ell} + b_n\sin\frac{n\pi x}{\ell}\Big), \]
where
\[ a_n = \frac{1}{\pi}\int_{y=-\pi}^{y=\pi}f\Big(\frac{\ell y}{\pi}\Big)\cos\frac{n\pi x}{\ell}\,d\Big(\frac{\pi x}{\ell}\Big)
= \frac{1}{\ell}\int_{-\ell}^{\ell}f(x)\cos\frac{n\pi x}{\ell}\,dx, \]
and similarly
\[ b_n = \frac{1}{\ell}\int_{-\ell}^{\ell}f(x)\sin\frac{n\pi x}{\ell}\,dx. \]
Example 7.7. [This is Fourier's application]
In the propagation of heat, the temperature $\theta$ obeys the equation
\[ k\frac{\partial^2\theta}{\partial x^2} = \frac{\partial\theta}{\partial t}. \]
As mentioned before, this is the simplest diffusion equation.
We introduce here a new idea which will run through the rest of the course. This is separation of variables: we can see that if we look for a solution in the form $X(x)T(t)$ we will find
\[ kT\frac{d^2X}{dx^2} = X\frac{dT}{dt} \quad\Longrightarrow\quad \frac{k}{X}\frac{d^2X}{dx^2} = \frac{1}{T}\frac{dT}{dt}. \]
Here the left side depends only on $x$ and the right side only on $t$: hence the two sides must both equal the same constant (only constants can depend only on $x$ and only on $t$ at the same time). We then have two equations
\[ \frac{k}{X}\frac{d^2X}{dx^2} = \lambda, \qquad \lambda = \frac{1}{T}\frac{dT}{dt}, \]
to solve, where $\lambda$ is a constant. When we have solved them, we multiply the answers together to solve the original equation. In general we assume (and indeed usually can prove) that the full solution is a sum of solutions of the separable type.
For Fourier's problem we proceed as follows:
At the earth's surface, the temperature $\theta$ is assumed to vary periodically over the year (for simplicity), so it has a Fourier series in time $t$ with period 1 year. We take $x$ to be the depth in the earth. Then at $x = 0$ we can write
\[ \theta = \frac{1}{2}a_0 + \sum_{n=1}^{\infty}\Big(a_n\cos\frac{2\pi nt}{T} + b_n\sin\frac{2\pi nt}{T}\Big) \]
with $T = 365\tfrac{1}{4}$ days.
Now at other $x$ we let $a_n$ and $b_n$ depend on $x$ and put these into the differential equation: this means we are writing the whole solution as a sum of separable solutions in which the $t$ dependence gives a Fourier series for each $x$. Plugging this into the original equation and equating coefficients in the Fourier series we get
\[ k\frac{\partial^2 a_n}{\partial x^2}\cos\frac{2\pi nt}{T} = \frac{2\pi n}{T}b_n\cos\frac{2\pi nt}{T}, \qquad
k\frac{\partial^2 b_n}{\partial x^2}\sin\frac{2\pi nt}{T} = -\frac{2\pi n}{T}a_n\sin\frac{2\pi nt}{T}. \]
These can be written as a single complex equation
\[ \frac{\partial^2(b_n + ia_n)}{\partial x^2} = \frac{2\pi ni}{kT}(b_n + ia_n). \]
This equation is easy to solve as it is a linear equation with constant coefficients. [For those who have done the Differential Equations course: the auxiliary equation has roots
\[ \pm\sqrt{\frac{n\pi}{kT}}(1+i) \]
and that gives the solutions. We need the solution with a negative real part (temperature decreases as we go into the earth).] The solution is
\[ b_n + ia_n = c\exp\Big(-\sqrt{\frac{n\pi}{kT}}(1+i)x\Big). \]
This means we have a solution which varies sinusoidally but whose amplitude decreases by a factor $e$ in a distance $\sqrt{kT/n\pi}$. Some realistic figures are $k = 2\times10^{-3}\ \mathrm{cm^2/s}$, $T = 365\tfrac14\times 24\times 3600$ secs, giving $1/\sqrt{\pi/kT} = 177$ cm for annual variation and roughly $1/19$ of this for daily variation. The amplitude of the annual variation halves in a distance $x$ such that $\ln 2 = x\sqrt{\pi/kT}$, about 123 cm. So in 5 metres the variation of temperature reduces by a factor $1/16$ (and at that depth the variation is out of phase, i.e. coolest in mid-summer).
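Two of the numerical claims above follow from the scaling alone and can be checked in a couple of lines (a sketch, not part of the notes): the penetration depth scales as $\sqrt{T}$, so the daily depth is smaller than the annual one by $\sqrt{365.25}$, and the halving distance is $\ln 2$ times the $e$-folding depth quoted above.

```python
# Check the scaling claims of the cellar example: the depth scales as sqrt(T),
# and the amplitude halves every ln(2) penetration depths.
import math

ratio = math.sqrt(365.25)      # annual depth / daily depth, "roughly 19"
halving = math.log(2) * 177    # cm, using the 177 cm annual figure from the text
```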
83
Chapter 8
Laplace's Equation
Syllabus section:
7. Laplace's equation. Uniqueness under suitable boundary conditions. Separation of variables. Two-dimensional solutions in Cartesian and polar coordinates. Axisymmetric spherical harmonic solutions.
Note: there is a key objective for this chapter, so although exam questions on this material tend to be on
Part B of the paper, it could appear in section A (and did in 2007).
8.1 The Laplace and Poisson equations
Laplace's equation is simply

  ∇²ψ(r) = 0

where, as before, ∇² = ∇·∇. It often occurs as an equation ∇·F = 0 for a conservative field F (so that F = ∇ψ for some ψ).
Laplace's equation is the simplest and most basic example of one of the three types¹ of second-order linear PDEs, known as elliptic. Laplace's equation is a linear homogeneous equation.
Going back to our gravitational and electromagnetic examples of conservative fields, and using the Divergence Theorem

  ∫_V ∇·F dV = ∫_{∂V} F·dS

to obtain Gauss's results, we see that if ∇·F = 0 everywhere there are no sources inside the volume, which for gravity means that there is no mass there and for the electric field means that there is no (net) charge. Hence Laplace's equation describes the gravitational potential in regions where there is no matter, and the electric potential in regions where there are no charges.
¹ The basic examples of the other types, the hyperbolic and parabolic equations, are the wave equation

  (1/c²) ∂²f/∂t² = ∇²f

and the heat equation or diffusion equation

  ∂f/∂t = ∇²f

respectively. We met the heat equation with a single spatial variable in Example 7.7 on Fourier series.
If instead there is a net charge density ρ, the electric field satisfies

  ε₀ ∇·E = ρ(r)

(where ε₀ is a constant). Combining this with E = −∇ψ gives

  ∇²ψ = −ρ(r)/ε₀.

This is known as Poisson's equation. Laplace's equation is of course a special case of Poisson's equation, in which the function on the right-hand side happens to be zero.
Laplace's and Poisson's equations are so important, both because of their occurrence in applications and because they are the basic examples of elliptic equations, that we are now going to spend some time considering their solutions. Solutions of Laplace's equation are known as harmonic functions. To find the solution in a particular case we need to know some boundary conditions. We will prove that under a variety of boundary conditions the solution of Poisson's (or Laplace's) equation is unique. We shall then investigate what the solutions actually are in some simple cases.
8.2 Uniqueness of Solutions to Poisson's (and Laplace's) Equation
Note that ∇² is a linear operator: that is,

  ∇²(ψ_1 + ψ_2) = ∇²ψ_1 + ∇²ψ_2.

Hence if ψ_1 and ψ_2 are both solutions of Laplace's equation, so is ψ_1 + ψ_2 (or indeed any linear combination of them). Also, if ψ is a solution of Poisson's equation and φ is a solution of Laplace's equation, ψ + φ is also a solution of Poisson's equation, for the same ρ(r).
Theorem 8.1 Suppose that ∇²u = f(r) throughout some closed volume V, f(r) being some specified function of r, and that the value of u is specified at each point on the boundary ∂V of V. Then if a solution u exists to this problem, it is unique.
Proof
Suppose that u_1 and u_2 are both solutions to this problem. Let v = u_1 − u_2. Then

  ∇²v = 0 in V,   v = 0 on ∂V.

Now

  ∫_V |∇v|² dV = ∫_V (∇v)·(∇v) dV
              = ∫_V [ ∇·(v∇v) − v∇²v ] dV
              = ∫_{∂V} v∇v·dS   (by the Divergence Theorem, since ∇²v = 0)
              = 0,

because v = 0 everywhere on ∂V. Hence ∇v = 0 everywhere in V. So v is constant. But v = 0 on the boundary and is constant, so v = 0 everywhere. Hence u_1 = u_2. So the solution is unique. Q.E.D.
It is fairly obvious that the final step in the displayed calculation also works if ∇v·n = 0, where n is the normal to the surface ∂V. This corresponds to being given values for ∇u·n. Moreover, it still works if at each point either u or its derivative normal to the boundary, n·∇u, is specified. The cases where u and where n·∇u are given are respectively called Dirichlet and Neumann boundary conditions. If we only have Neumann conditions, v is still a constant but not necessarily zero, so the solution is only unique up to addition of a constant.
The virtue of this theorem is that it gives us a licence to make whatever assumptions or guesses we like, provided we can justify them afterwards by showing the equation and boundary conditions are satisfied: if they are, the solution found, no matter how flaky the method, must be the right one.
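Uniqueness can also be seen numerically. In the Python sketch below (a minimal Jacobi relaxation; the grid size, boundary data and iteration count are arbitrary illustrative choices, not from the notes) the discrete Laplace equation is relaxed from two very different initial guesses with the same Dirichlet data, and both runs settle onto the same answer, as the theorem demands.

```python
N = 20            # grid is (N+1) x (N+1); the outermost points are boundary
STEPS = 4000      # Jacobi iterations; ample for this grid size

def relax(start):
    """Relax Laplace's equation with the value 1 on the top edge and 0 on
    the other three edges, starting from the interior value `start`."""
    phi = [[start if 0 < i < N and 0 < j < N else 0.0
            for j in range(N + 1)] for i in range(N + 1)]
    for j in range(N + 1):
        phi[N][j] = 1.0                       # top-edge Dirichlet data
    for _ in range(STEPS):
        new = [row[:] for row in phi]
        for i in range(1, N):
            for j in range(1, N):
                new[i][j] = 0.25 * (phi[i + 1][j] + phi[i - 1][j]
                                    + phi[i][j + 1] + phi[i][j - 1])
        phi = new
    return phi

a = relax(0.0)    # start from zero everywhere inside
b = relax(5.0)    # start from a wildly different interior guess
diff = max(abs(a[i][j] - b[i][j])
           for i in range(N + 1) for j in range(N + 1))
print(diff)       # tiny: both runs converge to the same harmonic function
```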
Having proved uniqueness, we now demonstrate how to actually find solutions of Laplace's equation in a variety of situations. In general ψ(r) can depend on all three coordinates, but we will confine ourselves to cases depending on two variables: ψ(x, y), ψ(ρ, φ) and ψ(r, θ).
The first two of these cases provide us with a nice simple interpretation. For ψ(x, y) or ψ(ρ, φ) the boundary conditions are on walls at fixed x, y or ρ, φ, on which only z varies. But we can choose to forget the z direction and think of a planar problem. Now imagine ψ as a varying height. Then solving Laplace's equation subject to boundary conditions is like holding a rubber sheet inside an arbitrarily shaped hoop. The hoop fixes the height at the boundary, while the rubber tries to minimize the total area.
For ψ(r, θ) though, we still need to think in three dimensions, and the simple interpretation no longer applies.
8.3 2-D solutions of Laplace's equation in Cartesian coordinates
We first develop a general method for finding solutions ψ = ψ(x, y) to Laplace's equation in a rectangular domain, with given boundary conditions. In Cartesian coordinates

  ∇·(∇ψ) = ∂/∂x(∂ψ/∂x) + ∂/∂y(∂ψ/∂y) + ∂/∂z(∂ψ/∂z) = 0.   (8.1)
Let us try looking for a solution of the form

  ψ(x, y) = X(x)Y(y).

Such a solution is called a separable solution.
Substituting this into (8.1) gives

  (d²X/dx²) Y + X (d²Y/dy²) = 0.

Hence

  (1/X) d²X/dx² = −(1/Y) d²Y/dy².

The left-hand side is a function of x only, and the right-hand side is a function of y only. The only way for this to be possible is if both sides are constant (say −λ, with λ real). Thus

  d²X/dx² + λX = 0   and   d²Y/dy² − λY = 0.

These equations are (with change of variables x → kx) the equations met in chapter 1 for trigonometric and hyperbolic functions, so we know their general solutions (as in section 6.4).
If λ is positive, let λ = k² and the solution is

  X = A cos kx + B sin kx,   Y = C cosh ky + D sinh ky,

where A, B, C, D are constants. Multiplying these together,

  ψ = (A cos kx + B sin kx)(C cosh ky + D sinh ky).   (8.2)
If λ = 0 we can integrate each equation twice to arrive at

  ψ = (A_0 x + B_0)(C_0 y + D_0),   (8.3)

with constants A_0, B_0, C_0 and D_0.
If λ is negative, let λ = −k² and then the solution is

  X = Ã cosh kx + B̃ sinh kx,   Y = C̃ cos ky + D̃ sin ky,

where Ã, B̃, C̃, D̃ are constants. Then

  ψ = (Ã cosh kx + B̃ sinh kx)(C̃ cos ky + D̃ sin ky).   (8.4)
In each of these solutions there are more constants than we really need. For example, if in (8.2) AC ≠ 0 we can write

  ψ = AC (cos kx + (B/A) sin kx)(cosh ky + (D/C) sinh ky)

using just three constants AC, B/A and D/C: this means that in examples one of the four constants can usually be set to 1. One way people sometimes use to take this into account is to write (8.2) as

  ψ = L sin(kx + M) sinh(ky + N)

for some constants L, M, and N. In general this works fine, but it does not cover the case where D = 0.
Now we have to satisfy the boundary conditions. If we are lucky, a particular one of the separable
solutions will do this.
Example 8.1. Find the solution of

  ∂²ψ/∂x² + ∂²ψ/∂y² = 0   (∗)

in the domain D: 0 ≤ x ≤ a, 0 ≤ y ≤ b, given boundary conditions ψ = 0 on x = 0, on y = 0 and on x = a, and ψ = sin(pπx/a) (for some integer p) on y = b.
Can we satisfy the boundary conditions in this case with one of the separable solutions? We consider them one by one. Clearly (8.3) will not work. For (8.2) or (8.4), ψ = 0 on x = 0 gives respectively

  ψ(0, y) = 0 for all y  ⇒  A = 0 or Ã = 0.

(This follows because at x = 0, (8.2) gives (A·1 + B·0)(C cosh ky + D sinh ky) = A(C cosh ky + D sinh ky), which will only be zero for all y if A = 0, and similarly for (8.4).) ψ = 0 on x = a gives

  ψ(a, y) = 0 for all y  ⇒  sin ka = 0 or sinh ka = 0.

(For (8.2) we get B sin ka (C cosh ky + D sinh ky) = 0, but we do not want B = 0 or we just get ψ = 0, so we need sin ka (C cosh ky + D sinh ky) = 0 for all y, and similarly for (8.4).) Now sinh ka = 0 only if ka = 0, i.e. k = 0, which is not an interesting solution (since it implies that ψ = 0 everywhere), so (8.4) is no use.

  sin ka = 0  ⇒  k = nπ/a

where n is a (positive) integer. Without loss of generality we can set B = 1 in (8.2), so now

  ψ = sin(nπx/a) [ C cosh(nπy/a) + D sinh(nπy/a) ].

We still have to satisfy ψ = 0 on y = 0:

  ψ(x, 0) = 0 for all x  ⇒  C = 0.

Finally, ψ = sin(pπx/a) (for some integer p) on y = b gives

  ψ(x, b) = sin(pπx/a) for all x  ⇒  D sinh(nπb/a) sin(nπx/a) = sin(pπx/a),   (∗∗)

which is true if p = n and D = 1/sinh(pπb/a), so finally

  ψ = sin(pπx/a) sinh(pπy/a) / sinh(pπb/a).
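The solution just found can be checked numerically. The Python sketch below (the values of a, b and p are arbitrary illustrative choices) applies a centred finite-difference Laplacian at an interior point and evaluates the solution on each boundary.

```python
import math

a, b, p = 2.0, 1.0, 3    # arbitrary domain and mode number

def psi(x, y):
    return (math.sin(p * math.pi * x / a)
            * math.sinh(p * math.pi * y / a) / math.sinh(p * math.pi * b / a))

def laplacian(f, x, y, h=1e-3):
    # centred second differences in x and y
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4 * f(x, y)) / h**2

print(laplacian(psi, 0.7, 0.4))                       # ~ 0: psi is harmonic
print(psi(0.0, 0.5), psi(a, 0.5), psi(0.7, 0.0))      # ~ 0 on the three zero sides
print(psi(0.7, b) - math.sin(p * math.pi * 0.7 / a))  # ~ 0: matches the data on y = b
```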
However, using just one separable solution will not in general work. In the above example we will not be able to satisfy the last equation (∗∗) if ψ = g(x) on y = b for a more general function g(x) than sin(pπx/a).
Since Laplace's equation is linear, we can add together separable solutions to get a more general solution. In many cases, including the Cartesian one, it is possible to prove that every solution can be written as a sum of separable solutions (this is called completeness of the separable solutions).
In the Cartesian case we would need to introduce different values of A for each p etc., which we typically would denote A_p. Since p can take any value, the sum of separable solutions would in general become an integral over p [Aside: this leads to the use of Fourier transforms, which we have not covered], but for the rectangular boundaries in the example above we typically take

  ψ = (A_0 x + B_0)(C_0 y + D_0)
    + Σ_{n=1}^∞ [ A_n cos(nπx/a) + B_n sin(nπx/a) ][ C_n cosh(nπy/a) + D_n sinh(nπy/a) ]
    + Σ_{n=1}^∞ [ a_n cosh(nπx/b) + b_n sinh(nπx/b) ][ c_n cos(nπy/b) + d_n sin(nπy/b) ].
In order to fit boundary conditions we now typically have to work out the Fourier series for the functions given on the boundaries. Again, since solutions of Laplace's equation can be added together, it is perfectly valid to work out solutions that have ψ = 0 on three of the four sides of the rectangle and fit the (non-zero) conditions on the fourth boundary, do this for each choice of fourth side in turn, and add the four resulting solutions together. That is why as examples we usually take cases where ψ = 0 on three of the four sides of a rectangle.
The problem where the derivative of ψ normal to the boundary is specified instead of ψ can be solved by a similar method.
Moreover, in practice conditions like ψ = 0 can be applied to each separable solution in turn, rather than considering the sum as a whole: this may look unjustified, but uniqueness allows us to justify it if the final answer works. Using these ideas let us look again at the generalized version of the previous example.
Example 8.2. Consider the previous example but with ψ = g(x) on y = b.
On y = b, we try a linear combination of solutions of the form found above (keeping the conditions derived from the other parts of the boundary):

  ψ(x, y) = Σ_{n=1}^∞ D_n sinh(nπy/a) sin(nπx/a).

Equation (∗∗) becomes

  g(x) = Σ_{n=1}^∞ E_n sin(nπx/a),   (∗∗∗)

where for convenience we have defined

  E_n = D_n sinh(nπb/a).

Finding the coefficients E_n in equation (∗∗∗) is a standard problem in (arbitrary range) Fourier series [Comment: now you see why we did section 7.5]. Multiplying both sides by sin(mπx/a) and using the result that

  ∫_0^a sin(mπx/a) sin(nπx/a) dx = a/2 if m = n, and 0 if m ≠ n,

gives

  E_m = (2/a) ∫_0^a g(x) sin(mπx/a) dx.

Evaluating this for all m finally gives us a solution to the original problem:

  ψ = Σ_{n=1}^∞ (2/a) [ ∫_0^a g(x′) sin(nπx′/a) dx′ ] ( sinh(nπy/a)/sinh(nπb/a) ) sin(nπx/a).

By uniqueness, we have found the solution.
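The formula for E_m is easy to try out on a concrete boundary function. In the Python sketch below the choice g(x) = x(a − x) and the midpoint-rule quadrature are illustrative assumptions, not from the notes; the computed sine series reproduces g closely.

```python
import math

a = 2.0

def g(x):
    return x * (a - x)        # arbitrary test function, zero at both corners

def E(m, pts=2000):
    # E_m = (2/a) * integral_0^a g(x) sin(m pi x / a) dx, by the midpoint rule
    h = a / pts
    s = sum(g((i + 0.5) * h) * math.sin(m * math.pi * (i + 0.5) * h / a)
            for i in range(pts))
    return 2.0 / a * h * s

def series(x, terms=50):
    return sum(E(m) * math.sin(m * math.pi * x / a)
               for m in range(1, terms + 1))

print(g(0.7), series(0.7))    # the partial sum reproduces g closely
```

For this particular g the exact coefficients are 8a²/(π³m³) for odd m and zero for even m, which the numerical quadrature confirms.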
We still have one more issue to deal with. If at the corners, e.g. (x, y) = (0, 0) and (x, y) = (a, 0), we had different values, say p and q, for ψ (which is allowed in the conditions of the uniqueness theorem), we could not fit them with a Fourier series in x, since the Fourier series would treat this as a discontinuity and take the value ½(p + q). However, if you compare the expressions above with the discussion of Fourier series in section 6.4 you will see that for λ = 0 we have kept the non-periodic solution x. This enables us to fit the values at the corners by choosing the constants in (8.3). Subtracting that from the original boundary conditions gives a new problem which can be solved by Fourier series without discontinuities at the endpoints.
Example 8.3. Consider a rectangle with 0 ≤ x ≤ 2, 0 ≤ y ≤ 1, and boundary values ψ(x, 0) = sin πx etc. as shown in the left diagram in Figure 8.1.

[Figure 8.1. Left: boundary conditions on ψ(x, y): sin πx on y = 0, x on y = 1, 2y on x = 2, sin πy on x = 0. Right: boundary conditions after subtracting off ψ_0 = xy: sin πx on y = 0, sin πy on x = 0, and 0 on the other two sides.]
First we solve for the coefficients in ψ_0(x, y) = α + βx + γy + δxy (which is just another way to write (8.3)) so as to fit the corners:

  ψ_0(0, 0) = 0 ⇒ α = 0,
  ψ_0(2, 0) = 0 ⇒ β = 0,
  ψ_0(0, 1) = 0 ⇒ γ = 0,
  ψ_0(2, 1) = 2 ⇒ δ = 1,

so ψ_0 = xy.
Subtracting this off leaves the boundary conditions at the right of Figure 8.1. We can now get ψ(x, 0) right with

  ψ_A(x, y) = sinh(π(1 − y)) sin(πx) / sinh(π)

(this is like Example 8.1) and get ψ(0, y) right with

  ψ_B(x, y) = sinh(π(2 − x)) sin(πy) / sinh(2π).

The full solution is

  ψ = ψ_0 + ψ_A + ψ_B.
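A quick numerical check (a Python sketch; the sample points on each edge are arbitrary) confirms that ψ_0 + ψ_A + ψ_B reproduces the boundary data of Figure 8.1.

```python
import math

def psi(x, y):
    psi0 = x * y
    psiA = math.sinh(math.pi * (1 - y)) * math.sin(math.pi * x) / math.sinh(math.pi)
    psiB = math.sinh(math.pi * (2 - x)) * math.sin(math.pi * y) / math.sinh(2 * math.pi)
    return psi0 + psiA + psiB

x, y = 1.3, 0.6   # arbitrary points on each edge
print(psi(x, 0) - math.sin(math.pi * x))   # ~ 0: bottom edge carries sin(pi x)
print(psi(x, 1) - x)                       # ~ 0: top edge carries x
print(psi(2, y) - 2 * y)                   # ~ 0: right edge carries 2y
print(psi(0, y) - math.sin(math.pi * y))   # ~ 0: left edge carries sin(pi y)
```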
You may be wondering why we only end up using the sin and not the cos terms. With the method described above, this is because we always want ψ = 0 except on one part of the boundary, whereas use of cos in the same way will normally give non-zero values on at least three parts. Only a few particular Dirichlet problems will fall out well with cos, but these give a messy form after the above method (in those cases the solutions would amount to writing out the Fourier sine series for cos). However, for Neumann problems cos has the same advantage over sin.
We should also note that there are 2-dimensional harmonics which are not of the above form but are rational functions of x and y. These are less useful for general problems with rectangular boundaries, but may be useful in other contexts. One way to obtain them is to rewrite the z-independent cylindrical harmonics given below using the relation of Cartesians and plane polars.
Exercise 8.1. Find ψ(x, y) in 0 < x < π, 0 < y < 1, satisfying the following conditions:

  ∇²ψ = 0 in 0 < x < π, 0 < y < 1,
  ψ = sin x on y = 0,

and ψ = 0 on the other three sides of the rectangle. Is the solution unique?
8.4 2-D solutions of Laplace's equation in cylindrical polar coordinates
Often the geometry of the domain of interest and its boundaries mean that it is much easier to work in coordinates other than Cartesians. For example, one may need to calculate the electrostatic potential outside a charged sphere. This would be very messy in Cartesian coordinates, and is much simpler if we use spherical polar coordinates instead. If the geometry is cylindrical (e.g. the flow in a cylinder) it may be preferable to use cylindrical polar coordinates.
As we discussed earlier, in cylindrical polar coordinates,

  ∇ψ = (∂ψ/∂ρ) e_ρ + (1/ρ)(∂ψ/∂φ) e_φ + (∂ψ/∂z) e_z

and the divergence of F = F_ρ e_ρ + F_φ e_φ + F_z e_z is

  ∇·F = (1/ρ) [ ∂(ρF_ρ)/∂ρ + ∂F_φ/∂φ + ∂(ρF_z)/∂z ].
Putting these together we obtain

  ∇²ψ ≡ ∇·(∇ψ) = (1/ρ) [ ∂/∂ρ( ρ ∂ψ/∂ρ ) + ∂/∂φ( (1/ρ) ∂ψ/∂φ ) + ∂/∂z( ρ ∂ψ/∂z ) ],

i.e.

  ∇²ψ = (1/ρ) ∂/∂ρ( ρ ∂ψ/∂ρ ) + (1/ρ²) ∂²ψ/∂φ² + ∂²ψ/∂z².
Consider the case when everything in the problem is independent of z, so ψ = ψ(ρ, φ). Once again we seek a separable solution, this time

  ψ(ρ, φ) = R(ρ)S(φ).

Collecting functions of ρ on one side and functions of φ on the other gives

  (ρ/R) d/dρ( ρ dR/dρ ) = −(1/S) d²S/dφ² = λ,

where λ is a constant (by the usual argument!).
The equation for S is then

  d²S/dφ² + λS = 0,   (8.5)

which is just the now familiar equation encountered in the Cartesian case. If λ > 0 it has solutions

  S(φ) = sin(√λ φ) or cos(√λ φ)

(or a linear combination). If λ < 0 we would similarly have

  S(φ) = sinh(√(−λ) φ) or cosh(√(−λ) φ),

while if λ = 0 we have S = A_0 + B_0 φ. But the solution must be the same if we change φ by 2π, since the two values of φ represent the same point in space, which means that we need λ ≥ 0 and √λ must be an integer. Thus λ = m², where m is a non-negative integer (without loss of generality), and the solution is

  S = A_m cos mφ + B_m sin mφ.
There are special cases where it is acceptable for ψ not to be single-valued, provided ∇ψ is. This happens in fluid dynamics, for example, where we are interested in the fluid velocity u = ∇ψ rather than the potential ψ itself. In this case we require that ∇ψ be single-valued, which allows us also to use the m = 0 solution

  S = A_0 + B_0 φ.
The radial equation gives

  ρ d/dρ( ρ dR/dρ ) = m² R.

Seeking a solution R ∝ ρ^p gives

  p² = m²,

so p = ±m. This gives two independent solutions, which means that together they give the general solution, and hence

  ψ(ρ, φ) = (A_m cos mφ + B_m sin mφ)( C_m ρ^m + D_m ρ^(−m) ).

The case m = 0 requires special consideration: then R = C_0 ln ρ + D_0. As in the Cartesian case each of these solutions has more constants than we need in any given example.
The general linear combination of all these solutions is

  ψ(ρ, φ) = (A_0 + B_0 φ)(C_0 ln ρ + D_0) + Σ_{m=1}^∞ (A_m cos mφ + B_m sin mφ)( C_m ρ^m + D_m ρ^(−m) ).
Note that this form implies that boundary conditions on a surface of fixed ρ lead to a Fourier series problem in φ (once the terms in A_0 have been found). However, in many cases we need only a finite number of terms and can use intelligent guesswork (essentially, including only terms of types which appear in the boundary conditions) to choose a suitable form for ψ.
Example 8.4. Consider ∇²ψ(ρ, φ) = 0 with the boundary conditions ψ(1, φ) = 2 sin²φ and ψ → ln ρ at large ρ.
Since 2 sin²φ = 1 − cos 2φ, we find that in the boundary conditions we have m = 0 terms and an m = 2 term, so we can (correctly) guess that the same is true of the solution. All we have to do is get the constant coefficients right, which is easy in this case, and we obtain

  ψ(ρ, φ) = 1 + ln ρ − (cos 2φ)/ρ².
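This solution can also be checked numerically. In the Python sketch below (the test points are arbitrary) the solution is rewritten in Cartesians, using ln ρ = ½ ln(x² + y²) and cos 2φ/ρ² = (x² − y²)/(x² + y²)², then checked with a finite-difference Laplacian and against the boundary value 2 sin²φ on ρ = 1.

```python
import math

def psi_cart(x, y):
    # 1 + ln(rho) - cos(2 phi)/rho^2 rewritten in Cartesians:
    # ln(rho) = 0.5 ln(x^2 + y^2), cos(2 phi)/rho^2 = (x^2 - y^2)/(x^2 + y^2)^2
    r2 = x * x + y * y
    return 1 + 0.5 * math.log(r2) - (x * x - y * y) / r2**2

def laplacian(f, x, y, h=1e-3):
    # centred second differences in x and y
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4 * f(x, y)) / h**2

print(laplacian(psi_cart, 1.3, -0.7))      # ~ 0: harmonic away from the origin

phi = 1.1                                  # arbitrary angle on the circle rho = 1
print(psi_cart(math.cos(phi), math.sin(phi)), 2 * math.sin(phi)**2)  # agree
```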
Exercise 8.2. Consider the region D defined by a ≤ ρ ≤ b, 0 ≤ φ ≤ π, −∞ < z < ∞. Sketch the region in a plane perpendicular to the z-axis which lies in D. On the boundaries ρ = a, φ = 0 and φ = π, ψ = 0, while on the boundary ρ = b, ψ = sin φ. Find the solution of Laplace's equation in D, independent of z, which satisfies these boundary conditions.
[You may assume that on 0 ≤ φ ≤ π

  (π/2 − φ) sin φ = Σ_{k=1}^∞ ( 16k / (π(4k² − 1)²) ) sin 2kφ.]
8.5 Axisymmetric solutions of Laplace's equation in spherical polar coordinates
Now we consider what to do in problems with a naturally spherical geometry. First we need to work out what ∇² is in spherical polar coordinates.
Recall that

  ∇²ψ ≡ ∇·(∇ψ).
Now in spherical polar coordinates,

  ∇ψ = (∂ψ/∂r) e_r + (1/r)(∂ψ/∂θ) e_θ + (1/(r sinθ))(∂ψ/∂φ) e_φ

and the divergence of F = F_r e_r + F_θ e_θ + F_φ e_φ is

  ∇·F = (1/(r² sinθ)) [ ∂(r² sinθ F_r)/∂r + ∂(r sinθ F_θ)/∂θ + ∂(r F_φ)/∂φ ].
Hence putting these together we obtain

  ∇²ψ = (1/(r² sinθ)) [ ∂/∂r( r² sinθ ∂ψ/∂r ) + ∂/∂θ( sinθ ∂ψ/∂θ ) + ∂/∂φ( (1/sinθ) ∂ψ/∂φ ) ],

i.e.

  ∇²ψ = (1/r²) [ ∂/∂r( r² ∂ψ/∂r ) + (1/sinθ) ∂/∂θ( sinθ ∂ψ/∂θ ) + (1/sin²θ) ∂²ψ/∂φ² ].
Many problems are axisymmetric, that is, there is no dependence on the azimuthal coordinate φ. In such cases ψ = ψ(r, θ) and ∂(anything)/∂φ = 0. As in the previous cases, we proceed by seeking a separable solution:

  ψ(r, θ) = R(r)S(θ)

[with different meanings from the R and S in the last section]. Thus ∇²ψ = 0 becomes

  (1/r²) [ ∂/∂r( r² ∂R/∂r ) S + (1/sinθ) ∂/∂θ( sinθ ∂S/∂θ ) R ] = 0,

which implies that

  (1/R(r)) d/dr( r² dR/dr ) = −(1/(S(θ) sinθ)) d/dθ( sinθ dS/dθ ).

The left-hand side is a function of r only, and the right-hand side is a function of θ only. But they are equal, and so they must both be a constant, say λ. Thus

  d/dr( r² dR/dr ) − λR = 0   (8.6)

and

  (1/sinθ) d/dθ( sinθ dS/dθ ) + λS = 0.   (8.7)
We consider equation (8.7) first. If we define w = cosθ, then

  d/dw = −(1/sinθ) d/dθ,

so equation (8.7) can be written in the Legendre form

  d/dw( (1 − w²) dS/dw ) + λS = 0.

Only the Legendre polynomial solutions P_ℓ(w) = P_ℓ(cosθ) (defined as in chapter 6) are allowed, because we want ψ to be finite at θ = 0 and θ = π, i.e. at w = ±1.
Going back to equation (8.6), if λ = ℓ(ℓ + 1) then R(r) satisfies

  d/dr( r² dR/dr ) − ℓ(ℓ + 1)R = 0.   (8.8)

We try looking for a solution of the form R = Ar^p: equation (8.8) gives two independent solutions of this form (and hence the general solution) as follows:

  p(p + 1)Ar^p = ℓ(ℓ + 1)Ar^p,

i.e. p(p + 1) = ℓ(ℓ + 1). Given ℓ, this is a quadratic equation for p. It has solutions p = ℓ and p = −(ℓ + 1). Hence the general solution for R is

  R = Ar^ℓ + B/r^(ℓ+1),

and so

  ψ(r, θ) = ( Ar^ℓ + B/r^(ℓ+1) ) P_ℓ(cosθ).
Because any linear combination of solutions of the equation is also a solution of the equation (∇² is a linear operator), we can find a more general solution of the axisymmetric problem:

  ψ(r, θ) = Σ_{n=0}^∞ ( A_n r^n + B_n/r^(n+1) ) P_n(cosθ).   (8.9)
The individual functions on the right are axisymmetric spherical harmonics, and they form a set of axisymmetric solutions of Laplace's equation which is complete, i.e. (8.9) can be shown to be the most general axisymmetric solution.
One can match such series to arbitrary boundary conditions using the orthogonality properties of the Legendre polynomials. However, in this course we will stick to problems where only a few terms are needed and we can see what they are by intelligent guesswork: the essential rule is only to put into the prospective answer those Legendre polynomials which appear in the boundary conditions.
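The first few terms of (8.9) can be verified directly. For ℓ = 2, r²P₂(cosθ) = (3z² − r²)/2 in Cartesians, and its exterior partner is P₂(cosθ)/r³ = (3z² − r²)/(2r⁵); the Python sketch below (the test point is an arbitrary choice) confirms both are harmonic with a finite-difference Laplacian.

```python
def lap3(f, x, y, z, h=1e-3):
    # centred 7-point finite-difference Laplacian in three dimensions
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h) - 6 * f(x, y, z)) / h**2

def interior(x, y, z):
    # r^2 P_2(cos theta) = (3 z^2 - r^2)/2
    return (3 * z * z - (x * x + y * y + z * z)) / 2

def exterior(x, y, z):
    # P_2(cos theta)/r^3 = (3 z^2 - r^2)/(2 r^5)
    r2 = x * x + y * y + z * z
    return (3 * z * z - r2) / (2 * r2**2.5)

print(lap3(interior, 0.4, -0.2, 0.9))   # ~ 0
print(lap3(exterior, 0.4, -0.2, 0.9))   # ~ 0 (valid away from the origin)
```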
Example 8.5. A perfectly spherical conductor, centre O, radius a, is placed in an otherwise uniform electric field E_0. (Mathematically, the condition for a conductor is that the electrostatic potential is constant on it.) What is the potential everywhere outside the conductor? And inside?
Outside the conductor (r > a), we want to solve ∇²ψ = 0. The boundary conditions are that ψ = constant on r = a and that far from the conductor E → E_0.
The unperturbed field (the one before the conductor was added) is

  E_0 = E_0 cosθ e_r − E_0 sinθ e_θ

(with axes chosen appropriately), which is what the field must look like as r → ∞: this has potential

  ψ_0 = −E_0 r cosθ + constant = −E_0 r P_1(cosθ) + constant.
(Note that this is a solution of Laplace's equation.) Now our potential

  ψ = Σ_{n=0}^∞ ( A_n r^n + B_n/r^(n+1) ) P_n(cosθ) → Σ_{n=0}^∞ A_n r^n P_n(cosθ)

as r → ∞. But this must equal ψ_0 = −E_0 r P_1(cosθ) + const., so we can assume that A_1 = −E_0, A_0 is an arbitrary constant, and A_n = 0 for all other n. On r = a we want ψ to be constant, i.e. it should not vary with θ. Now on r = a

  ψ = A_0 + B_0/a + ( −E_0 a + B_1/a² ) P_1(cosθ) + Σ_{n=2}^∞ ( B_n/a^(n+1) ) P_n(cosθ).

The potential on r = a will vary with θ unless the coefficients of P_n(cosθ) (n > 0) all vanish. Hence we must have B_1 = E_0 a³ and B_n = 0 (n ≥ 2). Hence finally the solution is

  ψ(r, θ) = A_0 + B_0/r + E_0 ( a³/r² − r ) cosθ.
Note that A_0 and B_0 are undetermined constants. To determine B_0 we need additional information, to ascertain the potential difference between the surface of the conductor and a point at infinity. The constant A_0 will always be arbitrary, because the absolute value of the potential has no physical meaning.
Inside, since ψ is constant on the boundary, it must be constant inside the conductor.
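The conductor solution can be checked in the same spirit. The Python sketch below takes E_0 = 1 and a = 1 and drops the arbitrary constants A_0 and B_0 (all illustrative choices); it confirms that the potential is harmonic outside the sphere, vanishes (i.e. is constant) on r = a, and approaches the uniform-field potential −E_0 r cosθ far away.

```python
import math

E0, a = 1.0, 1.0          # illustrative values for the field strength and radius

def psi(x, y, z):
    # E0 (a^3/r^2 - r) cos(theta), with cos(theta) = z/r; A_0, B_0 dropped
    r2 = x * x + y * y + z * z
    r = math.sqrt(r2)
    return E0 * (a**3 / r2 - r) * (z / r)

def lap3(f, x, y, z, h=1e-3):
    # centred 7-point finite-difference Laplacian
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h) - 6 * f(x, y, z)) / h**2

print(lap3(psi, 0.9, 0.5, 1.1))                  # ~ 0: harmonic outside the sphere
print(psi(1.0, 0.0, 0.0), psi(0.6, 0.0, 0.8))    # ~ 0: constant (zero) on r = a
print(psi(0.0, 0.0, 50.0) + E0 * 50.0)           # small: tends to -E0 r cos(theta)
```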
This last point has practical consequences. The voltage in space [in a static field] satisfies Laplace's equation. If you stand under an electricity pylon, there is a rather large voltage change (thousands of volts) between your head and your feet. But if you stand inside a wire cage (often called a Faraday cage), then the wire acts like a continuous conductor and equalizes the voltage over the cage, and hence inside the cage too. That is why a wire cage provides a refuge from lightning. Cages also provide screening from electronic surveillance, or, by putting equipment inside them, safety for the people outside.
Exercise 8.3. Show that at a general point the following are solutions of Laplace's equation ∇²ψ = 0.
1. ψ = ρ^n cos nφ, for an integer n, in cylindrical polar coordinates.
2. ψ = r sinθ cosφ, in spherical polar coordinates.