9321 Chap01

October 24, 2014
16:53
9321 - An Introduction to Lagrangian Mechanics
9789814623612
Chapter 1
The Calculus of Variations
A wide range of equations in physics, from quantum eld and superstring

theories to general relativity, from uid dynamics to plasma physics and
condensed-matter theory, are derived from action (variational) principles
[2, 17]. The purpose of this Chapter is to introduce the methods of the
Calculus of Variations that gure prominently in the formulation of action
principles in physics.
1.1
1.1.1
Foundations of the Calculus of Variations

A Simple Minimization Problem
It is a well-known fact that the shortest distance between two points in

(Euclidean) space is calculated along a straight line joining the two points.
Although this fact is intuitively obvious, we begin our discussion of the
problem of minimizing certain integrals in mathematics and physics with a
search for an explicit proof. In particular, we prove that the straight line
y0 (x) = mx yields a path of shortest distance between the two points (0, 0)
and (1, m) on the (x, y)-plane as follows.
First, we consider the length integral

L[y] =

1 + (y )2 dx,
(1.1)
where y = y (x) is the slope of the function y at point x. The notation L[y]
is used to denote the fact that the value of the integral (1.1) depends on the
choice we make for the function y(x); thus, L[y] is called a functional of y.
We insist, however, that the function y(x) satisfy the boundary conditions
y(0) = 0 and y(1) = m. For example, for y0 (x) = m x, we nd L[y0 ] =
1
page 1
October 24, 2014
16:53
9789814623612
An Introduction to Lagrangian Mechanics
1 + m2 , while for y1 (x) = m x2 , we nd

arcsinh 2m
1
1
L[y1 ] =
1 + 4 m2 x2 dx =
cosh2 z dz
2m
0
0

1
2
2m 1 + 4 m + arcsinh(2 m) ,
=
4m
where we used the hyperbolic-trigonometric substitution1 2m x = sinh z.
We can readily show that L[y0 ] < L[y1 ] for all non-vanishing m-values.
Next, we introduce the modied function
y(x; ) = y0 (x) + y(x),
where y0 (x) = mx (i.e., the solution to our problem) and the variation
function y(x) is required to satisfy the prescribed boundary conditions
y(0) = 0 = y(1). We thus dene the modied length integral
1
L[y0 + y] =
1 + (m + y )2 dx
0
as a function of and a functional of y. We now show that the function y0 (x) = mx minimizes the integral (1.1) by evaluating the following
derivatives

1
d
m
L[y0 + y]
=
y dx
2
d
1
+
m
0
=0
m
=
[ y(1) y(0)] = 0,
1 + m2
and

2
1
d
(y )2
L[y
+

y]
=
dx > 0,
0
d2
(1 + m2 )3/2
0
=0
which holds for a xed value of m and all variations y(x) that satisfy the
conditions y(0) = 0 = y(1). Hence, we have shown that y(x) = mx
minimizes the length integral (1.1) since the rst derivative (with respect
to ) vanishes at = 0, while its second derivative is positive at = 0.
We note, however, that our task was made easier by our knowledge of the
actual minimizing function y0 (x) = mx; without this knowledge, we would
be required to choose a trial function y0 (x) and test for all variations y(x)
that vanish at the integration boundaries.
Another way to tackle this minimization problem is to nd a way to
characterize the function y0 (x) that minimizes the length integral (1.1), for
1 The trigonometric and hyperbolic-trigonometric substitutions are used extensively in
this textbook and reviewed in App. A.
page 2
October 24, 2014
16:53
9789814623612
all variations y(x), without actually solving for y(x). For example, the
characteristic property of a straight line y(x) is that its second derivative
vanishes for all values of x. The methods of the Calculus of Variations
introduced in this Chapter present a mathematical procedure for transforming the problem of minimizing an integral to the problem of nding
the solution to an ordinary dierential equation for y(x). The mathematical foundations of the Calculus of Variations were developed by Leonhard
Euler (1707-1783) and Joseph-Louis Lagrange (1736-1813), who developed
the mathematical method for nding curves that minimize (or maximize)
certain integrals.
1.1.2
1.1.2.1
Methods of the Calculus of Variations

Eulers First Equation
The methods of the Calculus of Variations transform the problem of minimizing (or maximizing) an integral of the form

F [y] =
F (y, y ; x) dx
(1.2)
(with xed boundary points a and b) into the solution of a dierential equation for y(x) expressed in terms of derivatives of the integrand F (y, y ; x),
which is assumed to be a smooth function of y(x) and its rst derivative
y (x), with a possible explicit dependence on x. See problem 1 at the end
of this Chapter for a generalization of Eq. (1.2).
The problem of extremizing the integral (1.2) will be treated in analogy
with the problem of nding the extremal value of any (smooth) function
f (x), i.e., nding the value x0 such that
f (x0 ) = lim
0
1

d
1
f (x0 + h)
= 0,
f (x0 + ) f (x0 )
h d
=0
where h = 0 is an arbitrary constant factor.2 First, we introduce the rst2 An extremum point refers to either the minimum or maximum of a one-variable function. A critical point, on the other hand, refers to a point where the gradient of a
multi-variable function vanishes. Critical points include minima and maxima as well as
saddle points (where the function exhibits maxima in some directions and minima in
other directions). A function y(x) is said to be a stationary solution of the functional
(1.2) if the rst variation (1.3) vanishes for all variations y that satisfy the boundary
conditions.
page 3
October 24, 2014
16:53
9789814623612
order functional variation F [y; y] dened as

d
F [y + y]
F [y; y]
d
=0

b
d

=
F (y + y, y + y , x) dx
d
a
(1.3)
=0
where y(x) is an arbitrary smooth variation of the path y(x) subject to

the boundary conditions y(a) = 0 = y(b), i.e., the end points of the
path are not aected by the variation (see Fig. 1.1). By performing the
y
y(x) + y(x)
yb
y(x)
ya
a
Fig. 1.1
Virtual displacement y(x) for the functional variation (1.3).
-derivatives in the functional variation (1.3), which involves partial derivatives of F (y, y , x) with respect to y and y , we nd

b
F
F
+ y (x)
F [y; y] =
y(x)
dx.
y(x)
y (x)
a
When the second term is integrated by parts, we obtain

b
F
d F
y
dx
F [y; y] =
y
dx y
a

F
F
y
.
(1.4)
+ yb
a
y b
y a
Here, since the variation y(x) vanishes at the integration boundaries (yb =
0 = ya ), the last terms involving yb and ya vanish explicitly and Eq. (1.4)
becomes

b
b
d F
F
F
dx,
(1.5)
F [y; y] =
y
y
dx

y
dx
y
y
a
a
page 4
October 24, 2014
16:53
9789814623612
where F /y is called the functional derivative of F [y] with respect to the

function y. The stationarity condition
F [y; y] = 0
for all variations y yields Eulers First equation

2F
F
d F
2F
2F
,
+ y
+
=
y

dx y
y y
y y
x y
y
(1.6)
(1.7)
which represents a second-order ordinary dierential equation for y(x). According to the Calculus of Variations, the solution y(x) to this ordinary
dierential equation, subject to the boundary conditions y(a) = ya and
y(b) = yb , yields a solution to the problem of minimizing (or maximizing)
the integral (1.2). Lastly, we note that Lagranges variation operator ,
while analogous to the derivative operator d, commutes with the integral
operator, i.e.,
b
b
P (y(x)) dx =
y(x) P (y(x)) dx,
for any smooth function P .

1.1.2.2
Extremal Values of an Integral
Eulers First Equation (1.7), which results from the stationarity condition
(1.6), does not necessarily imply that the Euler path y(x), in fact, minimizes
the integral (1.2). To investigate whether the path y(x) actually minimizes
Eq. (1.2), we must evaluate the second-order functional variation
2

d
2
F [y; y]
F [y + y]
,
d2
=0
and investigate its sign. By following steps similar to the derivation of
Eq. (1.5), the second-order variation is expressed as
2

2
b
2
F
F
d
2 F
2 F [y; y] =
)
y 2
+
(y
dx,
y 2
dx yy
(y )2
a
(1.8)
after integration by parts was performed. The necessary and sucient
condition for a minimum is 2 F > 0 and, thus, the sucient conditions for
a minimal integral are
2
d
2F
2F
F
> 0,
(1.9)
>
0
and
y 2
dx yy
(y )2
page 5
October 24, 2014
16:53
9789814623612
for all smooth variations y(x). For a small enough interval (a, b), however, the (y )2 -term will normally dominate over the (y)2 -term and the
sucient condition becomes 2 F/(y )2 > 0 (Legendres Condition [6]).
Because variational problems often involve nding the minima or maxima of certain integrals, the methods of the Calculus of Variations enable
us to nd extremal solutions y0 (x) for which the integral F [y] is stationary
(i.e., F [y0 ] = 0), without specifying whether the second-order variation
is positive-denite (corresponding to a minimum), negative-denite (corresponding to a maximum), or with indenite sign (i.e., when the coecients
of (y)2 and (y )2 have opposite signs).
1.1.2.3
Jacobi Equation*
Warning: Material identied by an asterisk is not meant to be covered

in an undergraduate-level course and can be skipped without loss of
continuity.
Carl Gustav Jacobi (1804-1851) derived a useful dierential equation
describing the deviation u(x) = y(x) y(x) between two extremal curves
that solve Eulers First Equation (1.7) for a given function F (x, y, y ). Upon
Taylor expanding Eulers First Equation (1.7) for y = y + u and keeping
only linear terms in u (which is assumed to be small in the interval (a, b)),
we easily obtain the linear ordinary dierential equation

2
2
2F
2F
d
F
F
.
(1.10)
+
u
+
u
=
u
u
dx
(y )2
yy
y 2
y y
By performing the x-derivative on the second term on the left side, we
obtain a partial cancellation with the second term on the right side and
obtain the Jacobi equation [6]
2

2
2
F du
F
F
d
d
.
(1.11)
=
u
dx (y )2 dx
y 2
dx yy
We immediately see that the extremal properties (1.9) of the solutions of
Eulers First Equation (1.7) are intimately connected to the behavior of the
deviation u(x) between two nearby extremal curves.
We note that the dierential
equation (1.10) may be derived from the

variational principle J(u, u ) dx = 0 as the Jacobi-Euler equation

d
J
J
,
(1.12)
=
dx u
u
page 6
October 24, 2014
16:53
9789814623612
where the Jacobi function J(u, u ; x) is dened as

2
d
1

F (y + u, y + u )
J(u, u )
2 d2
=0
2
u2 2 F
u2 2 F
F
+
u
u
+
.
(1.13)
2 y 2
yy
2 (y )2

For example, for F (y, y ) = 1 + (y )2 ,we nd 2 F/y 2 = 0 = 2 F/yy
and 2 F/(y )2 = 3 , where 1 + m2 for the extremal solution
y(x) = m x. The Jacobi function (1.13) for this case is J(u, u ) = 12 (u )2 /3
and the Jacobi equation (1.11) becomes (3 u ) = 0, or u = 0 (i.e.,
deviations between straight lines diverge linearly).
Lastly, the second functional variation (1.8) can be combined with the
Jacobi equation (1.11) to yield the expression [6]

2
b
2F
u
y
dx,
(1.14)
y
2 F [y; y] =
2
u
a (y )
where u(x) is a solution of the Jacobi equation (1.11). We note that the
minimum condition 2 F > 0 is now clearly associated with the Legendre
condition 2 F/(y )2 > 0. Furthermore, we note that the Jacobi equation describing space-time geodesic deviations plays a fundamental role in
Einsteins Theory of General Relativity.3 We shall return to the Jacobi
equation (1.11) in Sec. 1.4, where we briey discuss Fermats Principle of
Least Time and its applications to the general theory of geometric optics.
1.1.2.4
Eulers Second Equation
Whenever the function F appearing in Eq. (1.2) satises the condition

F/x 0, we may obtain a partial solution to Eulers First Equation
(1.7) as follows. First, we write the exact x-derivative of F (y, y ; x) as
F
F
F
dF
=
+ y
+ y
,
dx
x
y
y
and substitute Eulers First Equation (1.7) for F/y in order to combine
the last two terms, so that we obtain Eulers Second equation

F
d
F
.
(1.15)
=
F y

dx
y
x
This equation is especially useful when the integrand F (y, y ) in Eq. (1.2) is
independent of x (i.e., F/x 0), for which Eq. (1.15) yields the solution
F
(y, y ) = ,
(1.16)
F (y, y ) y
y
3 See, for example, S. Weinberg, Gravitation and Cosmology: Principles and Applications of the General Theory of Relativity (Wiley 1972).
page 7
October 24, 2014
16:53
9789814623612
where the constant is determined from the conditions y(x0 ) = y0 and

y (x0 ) = y0 . Here, Eq. (1.16) is a partial solution (in some sense) of
Eq. (1.7), since we have reduced the derivative order from a second-order
derivative y (x) in Eq. (1.7) to a rst-order derivative y (x) in Eq. (1.16)
on the solution y(x). Hence, Eulers Second Equation has produced an
equation of the form
G(y, y ; ) F (y, y ) y
F
(y, y ) = 0,
y
which can often be integrated by quadrature (as we shall see later)by solving
for F (y, y ) = 1 + (y )2 ,
for y as a function of y and x. For example,

we ndF (y, y ) y F (y, y )/y = 1/ 1 + (y )2 = , which yields y =
1 1 2 .
1.1.3
Path of Shortest Distance and Geodesic Equation
We now return to the problem of minimizing

the length integral (1.1),

with the integrand written as F (y, y ) = 1 + (y )2 . Here, Eulers First
Equation (1.7) yields

d F
F
y
= 0,
=
=

2
3/2
dx y
y
[1 + (y ) ]
so that the function y(x) that minimizes the length integral (1.1) is the
solution of the dierential equation y (x) = 0 subject to the boundary
conditions y(0) = 0 and y(1) = m, i.e., the extremal solution is y(x) =
m x. Note that the integrand F (y, y ) also satises the sucient minimum
conditions (1.9) so that the path y(x) = m x is indeed the path of shortest
distance between two points on the plane.
1.1.3.1
Geodesic Equation*
We generalize the problem of nding the path of shortest distance on the

Euclidean plane (x, y) to the problem of nding geodesic paths in arbitrary
geometry because it introduces important geometric concepts in Classical
Mechanics needed in later chapters. For this purpose, let us consider a
path in n-dimensional space from point xA to point xB parameterized by
the continuous parameter : x() such that x(A) = xA and x(B) = xB .
The length integral from point A to B is
1/2
B
dxi dxj
L[x] =
d,
(1.17)
gij
d d
A
page 8
October 24, 2014
16:53
9789814623612
where the space metric gij is dened so that the squared innitesimal length
element is ds2 gij (x) dxi dxj (summation over repeated indices is implied
throughout the text).
Next, using the denition (1.3), the rst-order variation L[x] is given
as

B
i
j
gij
d
dxi dxj
1
k dx dx
+
2
g
x
L[x] =
ij
2 A
xk
d d
d d ds/d

b
i
j
gij
1
dxi dxj
k dx dx
+ 2 gij
x
=
ds,
2 a
xk
ds ds
ds ds
where a = s(A) and b = s(B) and we have performed a parameterization
change: x() x(s). By integrating the second term by parts (with x
vanishing at the end points), we obtain

b
d
1 gjk dxj dxk
dxj
(1.18)
gij
xi ds
L[x] =
ds
ds
2 xi ds ds
a

gij
d2 xj
1 gjk dxj dxk
=
+
gij
xi ds.
ds2
xk
2 xi
ds ds
a
We now note that, using symmetry properties under interchange of the j-k
indices, the second term in Eq. (1.18) can also be written as

gij
1 gij
1 gjk dxj dxk
gik
gjk dxj dxk
=
xk
2 xi
ds ds
2 xk
xj
xi
ds ds
dxj dxk
,
ds ds
using the denition of the Christoel symbol

gij
1
gik
gjk
jk = g i i|jk = g i
+
(1.19)
ikj ,
2
xk
xj
xi
where g ij denotes a component of the inverse metric (i.e., g ij gjk = ik ).
Hence, the rst-order variation (1.18) can be expressed as

b 2 i
j
k
d x
i dx dx
+
(1.20)
L[x] =
gi x ds.
jk
2
ds
ds
ds
a
The stationarity condition L = 0 for arbitrary variations x yields an
equation for the path x(s) of shortest distance known as the geodesic equation
j
k
d2 xi
i dx dx
= 0.
(1.21)
+
jk
ds2
ds ds
Returning to two-dimensional Euclidean geometry, where the components
of the metric tensor are constants (i.e., ds2 = dx2 + dy 2 ), the geodesic
equations are x (s) = 0 = y (s), which once again leads to a straight line.
= i|jk
page 9
October 24, 2014
16:53
9789814623612
10
1.1.3.2
Geodesic Equation on a Sphere
We now consider the example of geodesic curves on the surface of a sphere

of radius R, which are known to be great circles as we will now show.
According to Eq. (1.17), they are extremal curves of the length functional

2

d
d R
L( , ) d,
(1.22)
L[] =
R 1 + sin2
d
where the azimuthal angle () is an arbitrary function of the polar angle
. Since the function L( , ) in Eq. (1.22) is independent of the azimuthal
angle , its corresponding Eulers First Equation (1.7) yields the partial
solution
sin2
L

=
= sin ,

1 + sin2 ( )2
where is an arbitrary constant angle. Solving for we nd
() =
sin

,
sin sin2 sin2
which can, thus, be integrated to give

sin d

=
=
sin sin2 sin2
tan du
1 u2 tan2
where is another constant angle and we used the change of variable u =

cot . A simple trigonometric substitution nally yields
cos( ) = tan cot ,
(1.23)
which describes a great circle on the surface of the sphere. We easily verify
this statement by converting Eq. (1.23) into the equation for a plane that
passes through the origin:
z sin = x cos cos + y cos sin .
The intersection of this plane with the unit sphere is expressed in terms of
the coordinate functions x = sin cos , y = sin sin , and z = cos .
1.2
Classical Variational Problems
The development of the Calculus of Variations led to the resolution of several classical optimization problems in mathematics and physics [2, 17]. In
page 10
October 24, 2014
16:53
9789814623612
11
this Section, we present two classical variational problems that are connected to its original development. First, in the isoperimetric problem,
we show how Lagrange modied Eulers formulation of the Calculus of
Variations by allowing constraints to be imposed on the search for nding
extremal values of certain integrals. Next, in the brachistochrone problem,
we show how the Calculus of Variations is used to nd the path of quickest descent for a bead sliding along a frictionless wire under the action of
gravity.
1.2.1
Isoperimetric Problem
Isoperimetric problems represent some of the earliest applications of the

variational approach to solving mathematical optimization problems. Pappus (ca. 290-350) was among the rst to recognize that among all the
isoperimetric closed planar curves (i.e., closed curves that have the same
perimeter length), the circle encloses the greatest area.4 The variational
formulation of the (planar) isoperimetric
problem requires that we maxi
mize the area
integral A = y(x) dx while keeping the perimeter length
1 + (y )2 dx constant.
integral L =
The isoperimetric problem falls in a class of variational problems
called
constrained variational principles, where a certain functional

f (y, y , x) dx is to be optimized under the constraint that another functional g(y, y , x) dx be held constant (say at value G). The constrained
variational principle is then expressed in terms of the functional

f (y, y , x) dx + G
g(y, y , x) dx
F [y] =

=
f (y, y , x) g(y, y , x) dx + G,
(1.24)
where the parameter is called a Lagrange multiplier. Note that the functional F [y] is chosen, on the one hand, so that the derivative

dF [y]
= G
g(y, y , x) dx = 0
d
enforces the constraint for all curves y(x). On the other hand, the stationarity condition F = 0 for the functional (1.24) with respect to arbitrary
4 Such results are normally described in terms of the so-called isoperimetric inequalities
4 A L2 , where A denotes the area enclosed by a closed curve of perimeter length L;
here, equality is satised by the circle, where L = 2 a and A = a2 .
page 11
October 24, 2014
12
16:53
9789814623612
variations y(x) (which vanish at the integration boundaries) yields Eulers

First Equation:

f
g
d
g
f

.
(1.25)

=

dx y
y
y
y
Here, we assume that this second-order dierential equation is to be solved
(at xed ) subject to the conditions y(x0 ) = y0 and y (x0 ) = 0; the
solution y(x; ) of Eq. (1.25) is, however, parameterized by the Lagrange
multiplier .
If the integrands f (y, y ) and g(y, y ) in Eq. (1.24) are both explicitly
independent of x, then Eulers Second Equation (1.16) for the functional
(1.24) becomes

f
d
g
y
= 0.
(1.26)
f y
dx
y
y
By integrating this equation we obtain

f
g
y
= f0 g0 = 0,
f y
y
y
where the constant of integration on the right is chosen from the conditions
y(x0 ) = y0 and y (x0 ) 0 (i.e., x0 is an extremum point of y(x)), so
that the value of the constant Lagrange multiplier is now dened as =
f (y0 , 0)/g(y0 , 0) f0 /g0 . Hence, the solution y(x; ) of the constrained
variational problem (1.24) is now uniquely determined.
We return to the isoperimetric problem now represented in terms of the
constrained functional

2
y dx + L
1 + (y ) dx
A [y] =

y 1 + (y )2 dx + L,
(1.27)
=
where L denotes the value of the constant-length constraint. From
Eq. (1.25), the stationarity of the functional (1.27) with respect to arbitrary variations y(x) yields

y
d

= 1,
dx
1 + (y )2
which can be integrated to give
y

= x x0 ,
1 + (y )2
(1.28)
page 12
October 24, 2014
16:53
9789814623612
13
where x0 denotes a constant of integration associated

with y (x0 ) = 0. Since

the integrands f (y, y ) = y and g(y, y ) = 1 + (y )2 are both explicitly
independent of x, then Eulers Second Equation (1.26) applies, and we
obtain
d
= 0,
y
dx
1 + (y )2
which can be integrated to give

= y,
1 + (y )2
(1.29)
where we chose y(x0 ) = with y (x0 ) = 0.

Lastly, by combining Eqs. (1.28) and (1.29), we obtain y y +(xx0 ) = 0,
which can be integrated to give y 2 (x) = 2 (x x0 )2 . We immediately
recognize that the maximal isoperimetric curve y(x) is a circle of radius
r = , centered at (x, y) = (x0 , 0), with perimeter length L = 2 and
maximal enclosed area A = 2 = L2 /4.
1.2.2
Brachistochrone Problem
The brachistochrone problem is a least-time variational problem, which was

rst solved in 1696 by Jean (Johann) Bernoulli (1667-1748). The problem
can be stated as follows. A bead is released from rest (at the origin in
Fig. 1.2) and slides down a frictionless wire that connects the origin to a
given point (xf , yf ). The question posed by the brachistochrone problem
is to determine the shape y(x) of the wire for which the frictionless descent
of the bead under gravity takes the shortest amount of time.
Using the (x, y)-coordinates set up in Fig. 1.2, the speed of the bead
after it has fallen a vertical distance x along the wire is v = 2g x (where

g denotes the gravitational acceleration) and, thus, the time integral
xf
xf

1 + (y )2
ds
=
dx =
F (y, y , x) dx, (1.30)
T [y] =
v
2 gx
0
0
is a functional of the path y(x). Note that, in the absence of friction,
the beads mass does not enter into the problem. Since the integrand of
Eq. (1.30) is independent of the y-coordinate (F/y = 0), Eulers First
Equation (1.7) simply yields

y
F
d F

= ,
=
=
0
dx y
y
2 gx [1 + (y )2 ]
page 13
October 24, 2014
16:53
9789814623612
14
yf
x
y(x)
g
xf
x
Fig. 1.2
Brachistochrone problem.
where is a constant, which can be rewritten in terms of the scale length

= (22 g)1 as
x
x
(y )2
(y )2 =
.
=
1 + (y )2

( x)
Integration by quadrature yields the solution
x
s
y(x) =
ds,
s
0
subject to the initial condition y(x = 0) = 0. Using the trigonometric
substitution (with = 2a)
s = 2a sin2 (/2) = a (1 cos ),
we obtain the parametric solution
xB () = a (1 cos )
and
yB () =
0
1 cos
a sin d
1 + cos
(1 cos ) d = a ( sin ).
=a
(1.31)
(1.32)
This solution yields a parametric representation of the cycloid (Fig. 1.3)

where the bead is placed on a rolling hoop of radius a.
page 14
October 24, 2014
16:53
9789814623612
15
2a
2a
x
Fig. 1.3
Brachistochrone solution.
Using the parametric solution (1.31)-(1.32) for the brachistochrone

problem, we calculate the time taken to go from the origin (0, 0) to the
end point (2a, a) to be

2

)2
(xB ) + (yB
a2 [sin2 + (1 cos )2 ]
d
T [xB , yB ] =
d =
2g xB
2 ag (1 cos )
0
0

a
a
=
TB .
d =
(1.33)
g 0
g
By comparison, we note that the time taken by following a straight-line
path (yL = x/2) between these two points would be

2a

1 + (/2)2
a
dx =
> TB ,
2 + 4
T [yL ] =
2g x
g
0
which is longer than the minimal time (1.33).
1.3
Fermats Principle of Least Time
Several minimum principles have been invoked throughout the history of

Physics to explain the behavior of light and particles. In one of its earliest
form, Hero of Alexandria (ca. 75 AD) stated that light travels in a straight
line and that light follows a path of shortest distance when it is reected.
In 1657, Pierre de Fermat (1601-1665) stated the Principle of Least Time,
whereby light travels between two points along a path that minimizes the
travel time, to explain Snells Law (Willebrord Snell, 1591-1626) associated
with light refraction in a stratied medium. Using the index of refraction
n0 1 of the uniform medium, the speed of light in the medium is expressed
as v0 = c/n0 c, where c is the speed of light in vacuum. This straight-line
page 15
October 24, 2014
16:53
9789814623612
16
B+
A
n
h
B
Fig. 1.4 Reection path (AxB+ ) and refraction path (AxB ) for light propagating in
a stratied medium (n < n).
path in a uniform medium is not only a path of shortest distance but also
a path of least time.
The laws of reection and refraction as light propagates in uniform
media separated by sharp boundaries (see segments AxB+ and AxB in
Fig. 1.4) are easily formulated as minimization problems as follows. The
time taken by light to go from point A = (0, h) to point B = (L, h)
after being reected or refracted at point (x, 0) is given by

TAB (x) = c1 n x2 + h2 + n (L x)2 + h2 ,
where n and n denote the indices of refraction of the medium along path
Ax and xB , respectively. We easily evaluate the derivative of TAB (x) to
nd

(L
x)
x
dTAB (x)
= c1 n
n
dx
x2 + h2
(L x)2 + h2
c1 (n sin n sin ) ,
(1.34)
where the angles and are dened in Fig. 1.4. Here, the law of reection
(n = n and B = B+ in Fig. 1.4) is expressed in terms of the extremum

(x) = 0, which implies that the path of least time is obtained
condition TAB
when the reected angle is equal to the incidence angle (or x = L/2 in
Fig. 1.4).
page 16
October 24, 2014
16:53
9789814623612
17

Next, the extremum condition TAB
(x) = 0 for refraction (n = n and
B = B in Fig. 1.4) yields Snells law of refraction
n sin = n sin .
(1.35)
Note that Snells law implies that the refracted light ray bends toward the
medium with the largest index of refraction (n > n in Fig. 1.4). In what
follows, we generalize Snells law to describe the case of light refraction in
a continuous nonuniform medium.
Before proceeding with this general case, we note that the second derivative is strictly positive:

1
d2 TAB (x)
n cos3 + n cos3 > 0,
=
2
dx
hc
which proves that the paths of light reection and refraction in a at stratied medium are indeed paths of minimal optical lengths. For some curved
reecting surfaces, however, the reected path corresponds to a path of
maximum optical length (see problem 16). This example emphasizes the
fact that Fermats Principle is in fact a principle of stationary time.
1.3.1
Light Propagation in a Nonuniform Medium
According to Fermats Principle, light propagates in a nonuniform medium

by traveling along a path that minimizes the travel time between an initial
point A (where a light ray is launched) and a nal point B (where the light
ray is received). The time taken by a light ray following a path from
point A to point B (parameterized by ) is [3]

dx
1
(1.36)
T [x] =
c n(x) d = c1 Ln [x],
d
where Ln [x] represents the length of the optical path taken by light as it
travels in a nonuniform medium with refractive index n(x), and

2
2
2
dx
dy
dz
dx
=
+
+
.
(1.37)
d
d
d
d
Fermats Principle of Least Time states that light traveling in a nonuniform
medium follows an optical path x() that is a stationary solution of the
variational principle
T [x] 0.
(1.38)
page 17
October 24, 2014
18
16:53
9789814623612
We now consider ray propagation in two dimensions (x, y), with the index
of refraction n(y), so that the optical length
b

Ln [y] =
n(y) 1 + (y )2 dx
(1.39)
a
is a functional of y(x). We shall return to the general properties of ray

propagation in Sec. 1.4.

By
applying the variational principle (1.38) for the case where F (y, y ) =
n(y) 1 + (y )2 , we nd

n(y) y
F
F

=
n
=
and
(y)
1 + (y )2 ,
y
y
1 + (y )2
so that Eulers First Equation (1.7) becomes

n(y) y = n (y) 1 + (y )2 .
(1.40)
Although the solution of this (nonlinear) second-order ordinary dierential

equation is dicult to obtain for general functions n(y), we can nonetheless
obtain a qualitative picture of its solution by noting that y has the same
sign as n (y). Hence, when n (y) = 0 for all y (i.e., the medium is spatially
uniform), the solution y = 0 yields the straight line y(x; 0 ) = tan 0 x,
where 0 denotes the initial launch angle (as measured from the horizontal
axis). The case where n (y) > 0 (or < 0), on the other hand, yields a light
path which is concave upward, i.e., y > 0 (or downward, i.e., y < 0), as
will be shown below.
Note that the sucient conditions (1.9) for a minimal optical path are
expressed as
n
2F
=
> 0,
(y )2
[1 + (y )2 ]3/2
which is satised for all refractive media, and

2

2F
F
n
y
d
d
= n 1 + (y )2
y 2
dx yy
dx
1 + (y )2

n2 d2 ln n
1
n n (n )2 =
=
,
F
F dy 2
whose sign is indenite. Hence, the sucient condition for a minimal optical
length for light traveling in a nonuniform refractive medium is d2 ln n/dy 2 >
0; note, however, that only the stationarity of the optical path is physically
meaningful and, thus, we shall not discuss the minimal properties of light
paths in what follows.
page 18
October 24, 2014
16:53
Since the function F (y, y ) = n(y)

of x, Eulers Second Equation yields
F y
9789814623612
19

1 + (y )2 is explicitly independent
F
n(y)
=
= constant,

y
1 + (y )2
and, thus, the partial solution of Eq. (1.40) is

n(y) = N 1 + (y )2 ,
(1.41)
where N is a constant determined from the initial conditions of the light ray.
We note that Eq. (1.41) states that as a light ray enters a region of increased
(decreased) refractive index, the slope of its path also increases (decreases).
In particular, by substituting Eq. (1.40) into Eq. (1.41), we nd N 2 y =
n(y) n (y), and, hence, the path of a light ray is concave upward (downward)
where n (y) is positive (negative), as previously discussed. Eq. (1.41) can
be integrated by quadrature to give the integral solution
y
N ds

,
(1.42)
x(y) =
[n(s)]2 N 2
0
subject to the condition x(y = 0) = 0. From the explicit dependence of
the index of refraction n(y), we may be able to perform the integration
in Eq. (1.42) to obtain x(y) and, thus, obtain an explicit solution y(x) by
inverting x(y).
1.3.2
Snells Law
We now show that the partial solution (1.41) corresponds to Snells Law for
light refraction in a nonuniform medium. Consider a light ray traveling in
the (x, y)-plane launched from the initial position (0, 0) at an initial tangent
angle 0 /2 (measured from the x-axis) so that y (0) = tan 0 is the
slope at x = 0. The constant N is then simply determined from Eq. (1.41)
as N = n0 cos 0 , where n0 = n(0) is the refractive index at y(0) = 0.
Next, let y (x) = tan (x) be the slope of the light ray at (x, y(x)), so that

1 + (y )2 = sec and Eq. (1.41) becomes n(y) cos = n0 cos 0 . Lastly,
when we substitute the complementary angle = /2 (measured from
the vertical y-axis), we obtain the local form of Snells Law of refraction
n[y(x)] sin (x) = n0 sin 0 ,
(1.43)
properly generalized to include a light path in a nonuniform refractive

medium. Note that Snells Law (1.43) does not tell us anything about
the actual light path y(x); this solution must come from solving Eq. (1.42).
page 19
October 24, 2014
16:53
20
1.3.3
9789814623612
Application of Fermats Principle
In order to obtain an explicit solution (1.42) of Fermats Principle in two

dimensions, we consider the propagation of a light ray in a medium with
linear refractive index
n(y) = n0 (1 y)
(1.44)
exhibiting a constant gradient n (y) = n0 . Substituting this prole into

the optical-path solution (1.42), we nd
y
cos 0 ds

x(y) =
.
(1.45)
(1 s)2 cos2 0
0
Next, we use the trigonometric substitution

1
cos 0
y() =
1
,
cos
with = 0 at (x, y) = (0, 0), so that Eq. (1.45) becomes

sec + tan
cos 0
ln
x() =
.
sec 0 + tan 0
(1.46)
(1.47)
Fig. 1.5 Parametric solution (1.46)-(1.47) for 0 = /4 and = 1 (solid), = 1/2

(dashed), and = 0 (dotted). The plots are shown for (x, y) from (0, 0) to (x, y), except
for = 0, for which y = x tan 0 .
The parametric solution (1.46)-(1.47) for the optical path in a linear

medium (see Fig. 1.5) shows that the path reaches a maximum height y =
y(0) at a distance x = x(0) when the tangent angle is zero:
x =
cos 0
1 cos 0
ln(sec 0 + tan 0 ) and y =
.
page 20
October 24, 2014
16:53
9789814623612
21
The time taken for the light path parameterized by Eqs. (1.46)-(1.47) in
going from the origin (0, 0) to the end point (x, y) is a function of the initial
angle 0 :

n0 0
1 y()
T (0 ) =
(x )2 + (y )2 d
c 0
0
n0
2
cos 0
sec3 d
=
c
0

n0
=
(1.48)
sin 0 + cos2 0 ln (sec 0 + tan 0 ) .
2 c
Lastly, we note that by expressing y(x; ) as a function of x, Eq. (1.46)
becomes

1
1 cos 0 cosh sec 0 (x x) .
(1.49)
y(x; ) =
In the uniform limit ( = 0), we use LHospitals rule on Eq. (1.49) to nd

the straight-line equation y(x; 0) x tan 0 .
1.4
1.4.1
Geometric Formulation of Ray Optics

General Geometric Optics
We now return to the general formulation for light-ray propagation based

on the time integral (1.36), where the integrand is

dx
dx
F x,
= n(x) ,
d
d
and light rays are allowed to travel in a three-dimensional refractive medium
with a general index of refraction n(x). Eulers First equation in this case
is

F
d
F
,
(1.50)
=
d (dx/d)
x
where
n dx
F
F
=
and
= n,
(dx/d)
d
x
with = |dx/d| given by Eq. (1.37). Eulers First Equation (1.50), therefore, yields the Euler-Fermat equation

d n dx
= n.
(1.51)
d d
page 21
October 24, 2014
22
16:53
9789814623612
By choosing the ray parameterization = ds/d n, the Euler-Fermat

equation (1.51) becomes d2 x/d 2 = n n = 12 n2 , which implies that
the light ray is accelerated toward regions of higher index of refraction (see
Fig. 1.6).
Eulers Second Equation, on the other hand, states that
H() F

F
dx
dx
= 0
x,
d
d (dx/d)
is a constant of motion. Note that, while Eulers Second Equation (1.41)

proved very useful in providing an explicit solution (Snells Law) to nding
the optical path in a nonuniform medium with index of refraction n(y), it
appears that Eulers Second Equation H() 0 now reveals no information
about the optical path. Where did the information go? To answer this
question, we apply the Euler-Fermat
equation (1.51) to the two-dimensional

y. Hence, the Eulercase where = x and = 1 + (y )2 with n = n (y)
Fermat equation (1.51) becomes

d n
y ) = n
y,
(
x + y
dx
from which we immediately conclude that Eulers Second Equation (1.41),
n = N , now appears as a constant of the motion d(n/)/dx = 0 associated with a symmetry of the optical medium (i.e., the optical properties of
the medium are invariant under translation along the x-axis). The association of symmetries with constants of the motion will later be discussed in
terms of Noethers Theorem (see problem 17 and Sec. 2.5).
nk
dk
ds
k = dx
ds
Fig. 1.6 Light-ray curvature

n d
k/ds and the Frenet-Serret frame (
k,
n,
b) following
a light ray. The vectors (n,
k,
n) lie on the same plane (i.e., the surface of the page),
k is perpendicular to that plane (i.e., into the page).
while the vector
b 1 ln n
page 22
October 24, 2014
16:53
1.4.2
9789814623612
23
Light-ray Frenet-Serret Equations
Next, by choosing the ray parametrization d = ds (so that = 1), the

Euler-Fermat equation (1.51) becomes

d
dx
n
= n.
(1.52)
ds
ds
Since the ray velocity dx/ds =
k is a unit vector, which denes the direction
of the wave vector k, Eq. (1.52) yields the light-curvature equation [see
Eq. (A.24)]

d
k
=
k ln n
k
n,
ds
(1.53)
where
n denes the principal normal unit vector and the light-ray FrenetSerret curvature is
= | ln n
k| =
n ln n;
(1.54)
see App. A for a review of the Frenet-Serret formulas for a general spatial
curve. Note that for the one-dimensional problem discussed in Sec. 1.3.1,
the curvature = |n |/(n ) = |y |/3 is in agreement with the standard
Frenet-Serret curvature.
The light-ray Frenet-Serret torsion is calculated as follows. First, using
the denition of the binormal unit vector
b
k
n, we use Eq. (1.53) to
nd
d
k
= ln n
k,

b =
k
ds
(1.55)
which shows that

b n 0 (see Fig. 1.6). Next, we introduce the remaining Frenet-Serret equations [see Eqs. (A.28)-(A.30)]
d
n
d
b
=
b
k and
=
n,
ds
ds
(1.56)
where and denote the curvature and torsion of the light ray. By taking
the s-derivative of Eq. (1.55) and then taking the dot product with the
normal unit vector
n, we obtain the light-ray Frenet-Serret torsion

d
d
n
b
ln n
k =
ln n .
(1.57)
=
ds
ds
Lastly, a light wave is characterized by a polarization unit vector
e
(dened in terms of the waves electric eld E |E| e) that is perpendicular
page 23
October 24, 2014
16:53
24
9789814623612
to the wavevector direction

k. We may, thus, write the polarization vector
as

e cos
n + sin
b,
(1.58)
where denotes the polarization angle measured from the local normal

n-axis. Using the Frenet-Serret equations (1.56), we obtain the evolution
equation for the polarization unit vector along a light ray:

de
d
e

= cos k +
+
,
(1.59)
ds
ds
which is expressed in terms of the Frenet-Serret curvature and torsion (, ):

g cos denotes the geodesic curvature and r + d/ds denotes
the relative torsion, which are derived in Eq. (A.34).
1.4.3
Light Propagation in Spherical Geometry
We now explore the case where the medium is spherically symmetric. By

using the general ray-orbit equation (1.53), we can also show that, for a
spherically-symmetric nonuniform medium with index of refraction n(r),
the light-ray orbit r(s) satises the conservation law

dr
d
dr
d
r n(r)
= r
n(r)
= r n(r) = 0.
(1.60)
ds
ds
ds
ds
Next, we use the fact that the light-ray path is planar (i.e., the torsion is
zero, = 0) and, thus, we write

dr
= r sin cos cos sin z = r sin z,
(1.61)
r
ds
where denotes the angle between the position vector r =
r (cos
x + sin
y) and the tangent vector dr/ds = cos
x + sin
y
k

(see Fig. 1.7). The Frenet-Serret equation dk/ds = ( sin
x + cos
y)
yields the curvature d/ds.
Using Eq. (1.61), the conservation law (1.60) for ray orbits in a
spherically-symmetric medium can, therefore, be expressed as
n(r) r sin (r) = N a,
(1.62)
which is known as Bouguers formula (Pierre Bouguer, 1698-1758), where

N and a are constants (see Fig. 1.7); note that the condition n(r) r N a
must also be satised since sin (r) 1. This conservation law is analogous
to the conservation law of angular momentum for particles moving in a
central-force potential (see Chap. 4).
page 24
October 24, 2014
16:53
9789814623612
dr
ds
25
r
a
Fig. 1.7
Light path in a nonuniform medium with spherical symmetry.
An explicit expression for the ray orbit r() is then obtained as follows.
First, since dr/ds is a unit vector, we nd

d
r + (dr/d) r
dr
dr
=
r =
,
r +
ds
ds
d
r2 + (dr/d)2
so that
d
1
=
2
ds
r + (dr/d)2
and Eq. (1.61) yields
r
d
dr
z
= r sin z = r2
ds
ds
r
Na
sin =
,
=
2
2
nr
r + (dr/d)
where we made use of Bouguers formula (1.62). Next, integration by

quadrature yields
r
d

(r) = N a
,
2
n () 2 N 2 a2
r0
where we choose r0 so that (r0 ) = 0. Lastly, a change of integration
variable = N a/ yields
N a/r0
d

(r) =
,
(1.63)
2
n () 2
N a/r
where n() n(N a/). Hence, for a spherically-symmetric medium with
index of refraction n(r), we can compute the light-ray orbit r() by inverting
the integral (1.63) for (r).
Consider,
for example, the spherically-symmetric refractive index
n(r) = n0 2 (r/R)2 , where n0 = n(R) denotes the refractive index
page 25
October 24, 2014
16:53
9789814623612
26
r1
( )
r( )
r0
Fig. 1.8
Elliptical light path in a spherically-symmetric refractive medium.
at r = R. Introducing the dimensional parameter = a/R and the transformation = 2 , Eq. (1.63) becomes
N a/r0
d

(r) =
2
2
n0 (2 N 2 e2 ) 4
N a/r
(N a/r0 )2
d
1

=
,
4
2
2 (N a/r)2
n0 e ( n20 )2

where e =
1 N 2 2 /n20 (assuming that n0 > N ). Next, using the
trigonometric substitution = n20 (1 + e cos ), we nd (r) = 12 (r) or
r0 1 + e
r() =
,
(1.64)
1 + e cos 2
which represents an ellipse (see Fig. 1.8)

2
2
y
x
+
= 1
R 1e
R 1+e
with semi-major and semi-minor axes r1 = R (1+e)1/2 and r0 = R (1e)1/2 ,
respectively. This example shows that, surprisingly, it is possible to trap
light!
1.4.4
Geodesic Representation of Light Propagation
We now investigate the geodesic properties of light propagation in a nonuniform refractive medium. For this purpose, let us consider a path AB in
space from point A to point B parameterized by the continuous parameter
page 26
October 24, 2014
16:53
9789814623612
27
, i.e., x() such that x(A) = xA and x(B) = xB . The time taken by light
in propagating from A to B is

1/2
B
B
dt
n
dxi dxj
T [x] =
d =
d,
(1.65)
gij
d
c
d d
A
A
where dt = n ds/c denotes the innitesimal time interval taken by light in
moving an innitesimal distance ds in a medium with refractive index n
and the space metric is denoted by gij . The geodesic properties of light
propagation are investigated with the vacuum metric gij or the mediummodified metric gij = n2 gij .
1.4.4.1
Vacuum-metric Case
We begin with the vacuum-metric case and consider the light-curvature

equation (1.53). First, we dene the vacuum-metric tensor gij = ei ej in
terms of the basis vectors (e1 , e2 , e3 ), so that the ray velocity is
dxi
dx

=
ei .
k =
ds
ds
Second, using the denition for the Christoel symbol (1.19) and the relations
dxk
dej
ijk
ei ,
ds
ds
we nd
2 i

j
k
d2 x
d x
d2 xi
dxj dej
d
k
i dx dx
=
=
e
+
+
ei .
i
jk
ds
ds2
ds2
ds ds
ds2
ds ds
By combining these relations, the light-ray curvature equation (1.53) becomes

j
k
d2 xi
dxi dxj ln n
i dx dx
ij
=
g
+
.
(1.66)
jk
ds2
ds ds
ds ds
xj
This equation shows that the path of a light ray departs from a vacuum
geodesic line as a result of a refractive-index gradient projected along the
tensor
hij g ij
dxi dxj
ds ds
which, by construction, is perpendicular to the ray velocity dx/ds (i.e.,

hij dxj /ds = 0).
page 27
October 24, 2014
16:53
28
1.4.4.2
9789814623612
Medium-metric Case
Next, we investigate the geodesic propagation of a light ray associated

with the medium-modied (conformal) metric g ij = n2 gij , where c2 dt2 =
n2 ds2 = gij dxi dxj . The derivation follows a variational formulation similar to that found in Sec. 1.1.3. Hence, the rst-order variation T [x] is
expressed as

tB 2 i
j
k
d x
dt
i dx dx
+ jk
g i x 2 ,
(1.67)
T [x] =
2
dt
dt dt
c
tA
i
where the medium-modied Christoel symbol jk includes the eects of

the gradient in the refractive index n(x). We, therefore, nd that the light
path x(t) is a solution of the geodesic equation
j
k
d2 xi
i dx dx
= 0,
+
jk
dt2
dt dt
(1.68)
which is also the path of least time for which T [x] = 0. By using the
substitution d/dt = (c/n) d/ds in Eq. (1.68), we nd dx/dt = c
k/n and
k/ds
k d ln n/ds).
d2 x/dt2 = (c/n)2 (d
When using Cartesian coordinates (where g ij = n2 ij and g ij =
n2 ij ), for example, the medium-modied Christoel symbol
i
jk = ji k ln n + ki j ln n jk i ln n
(1.69)
is expressed in terms of gradient-components of the logarithm of the refraction index n. By inserting Eq. (1.69) into Eq. (1.68), we readily recover the
light-ray curvature equation (1.53).
1.4.4.3
Jacobi Equation for Light Propagation
Lastly, we point out that the Jacobi equation for the deviation () =
x() x() between two rays that satisfy the Euler-Fermat ray equation
(1.51) can be obtained from the Jacobi function

2
dx
d
d
1

+
J(, d/d)
(1.70)
n(x + )
2 d2
d
d
=0

2
n d
dx
n d dx
+
: n,
+

3
2 d
d
d d
2
where the Euler-Fermat ray equation (1.51) was taken into account and the
exact -derivative, which cancels out upon integration, is omitted. Hence,
page 28
October 24, 2014
16:53
9789814623612
29
the Jacobi equation describing light-ray deviation is expressed as the JacobiEuler-Fermat equation

J
d
J
,
=
d (d/d)
which yields

d
d
n dx
dx
= n I
d 3 d
d
d

d
( ln n)
+
d

1 dx dx
(1.71)
2 d d

dx
n dx
.
d
The Jacobi equation (1.71) describes the property of nearby rays to converge
or diverge in a nonuniform refractive medium. Note, here, that the terms
involving 1 n dx/d in Eq. (1.71) can be written in terms of the
Euler-Fermat ray equation (1.51) as
2

d x dx
1 d n dx
n
n dx
dx
= 2
= 3
d
d d
d
d 2
d
which, thus, involve the Frenet-Serret ray curvature [see Eq. (1.55)].
1.4.5
Wavefront Representation
The complementary picture of rays propagating in a nonuniform medium

was proposed by Christiaan Huygens (1629-1695) in terms of the picture of
propagating wavefronts. Here, a wavefront is dened as the surface that is
locally perpendicular to a ray. Hence, the index of refraction itself (for an
isotropic medium) can be written as
n = |S| =
dx
ck
ck
or S = n
=
,
ds
(1.72)
where S is called the eikonal function (i.e., a wavefront is dened by the

surface S = constant; see Fig. 1.9). To show that this denition is consistent
with Eq. (1.53), we easily check that

dx
1
d
dx
dS
=
S =
S S
n
=
ds
ds
ds
ds
n
1
1
=
|S|2 =
n2 = n.
2n
2n
This denition, therefore, implies that the wavevector k is curl-free:

S 0,
(1.73)
k =
c
page 29
October 24, 2014
30
16:53
9789814623612
where we used the fact that the wave frequency is unchanged by refraction. Hence, we nd that
k=
k ln n
b, from which we obtain
the light-curvature equation (1.53). Note also that because k is curl-free,
we
easily apply Stokes Theorem to nd that the closed contour integral
A k dx = 0 along
the boundary A of an open surface A vanishes, i.e.,
the path integral k dx is path-independent.
S(x)
Fig. 1.9
Wavefront surface.
Lastly, in the absence of sources and sinks, the light energy ux entering a nite volume bounded by a closed surface is equal to the light
energy ux leaving the volume and, thus, the intensity of light I satises
the conservation law
0 = (I S) = I 2 S + S I.
(1.74)
Using the denition S n /s, we nd the intensity evolution equation

ln I
= n1 2 S,
s
whose solution is expressed as
s

d
I = I0 exp
2 S
,
(1.75)
n
0
where I0 is the light intensity at position s = 0 along a ray. This equation, therefore, determines whether light intensity increases (2 S < 0) or
decreases (2 S > 0) along a ray depending on the sign of 2 S. In a rek = r, the
fractive medium with spherical symmetry, with S (r) = n(r) and
conservation law (1.74) becomes

1 d 2
r In ,
0 = 2
r dr
which implies that the light intensity satises the generalized inverse-square
law:
I(r)n(r) r2 = I0 n0 r02 .
(1.76)
page 30
October 24, 2014
16:53
9789814623612

Table 1.1
Summary of Chapter 1: The Calculus of Variations.
Topic
Equation
Eulers First Equation

Eulers Second Equation
Constrained Variational Principle
Brachistochrone Problem
Fermats Principle of Least Time
Euler-Fermat Equation
Frenet-Serret Formulas for Light Rays
1.5
31
(1.7)
(1.16)
(1.24)-(1.25)
(1.30)-(1.32)
(1.36)
(1.51)
(1.53)-(1.57)
Summary
Chapter 1 presented the mathematical foundations of the Calculus of Variations, which will form the basis upon which the Lagrangian method (introduced in Chapter 2) will be built. The brachistochrone problem and
Fermats Principle of Least Time are two examples that were discussed extensively. Table 1.1 presents a summary of the important topics of Chapter
1.
1.6
Problems
1. Consider the problem of nding the extremal solution y(x) of the integral
b
F (y, y , y ) dx,
F [y] =
a
where F (y, y , y ) is a smooth function of its arguments.

(a) Show that Eulers First Equation for this problem is

F
d F
F
d2
=
.
y
dx y
dx2 y
(b) Find Eulers Second Equation and state whether an additional set of
boundary conditions for y (a) and y (b) are necessary.
2. Find the curve joining two points (x1 , y1 ) and (x2 , y2 ) that yields a
surface of revolution (about the x-axis) of minimum area by minimizing
the integral
x2
y 1 + (y )2 dx.
A[y] =
x1
page 31
October 24, 2014
32
16:53
9789814623612
3. Use the Jacobi equation (1.11) to obtain Eq. (1.14) for 2 F .

4. This problem deals with nding the equation for geodesics on a cone
represented by z() = () cot , for which the innitesimal length element
ds is dened as

ds2 = d2 () + 2 () d2 + dz 2 () = 2 + csc2 ( )2 d2 .
The length integral is

2 + csc2 ( )2 d
F (, ) d,
L[] =
and Eulers First and Second equations are

d F
d
F
F
F
and
0.
=
=
F
d
d
(a) Show that Eulers Second equation for () can be written as

2 sin

= 0 ,
2 sin2 + ( )2
where 0 (0 ) and (0 ) 0.
(b) Solve Eulers Second equation for () and show that the equation for
geodesics on a cone is
() = 0 sec [sin ( 0 )] 0 sec().
(c) With the solution given in part (b), show that
F
d
F
sin()
= cos() and

sin
d
F

=
5. Show that the time required for a particle to move without friction
from the point (x0 , y0 ) parametrized by the angle 0 to the minimum point
( a, 2a) of the cycloid solution of the brachistochrone problem is

1 cos
a
d =
,
cos
cos
g
0
0

where we used x() = 2g [x() x0 ].
6. A thin rope of mass m (and uniform density) is attached to two vertical
poles of height H separated by a horizontal distance 2D; the coordinates
page 32
October 24, 2014
16:53
9789814623612
33
of the pole tops are set at ( D, H). If the length L of the rope is greater
than 2D, it will sag under the action of gravity and its lowest point (at
its midpoint) will be at a height y(x = 0) = y0 . The shape of the rope,
subject to the boundary conditions y( D) = H, is obtained by minimizing
the gravitational potential energy of the rope expressed in terms of the
functional
D

mg y 1 + (y )2 dx.
U[y] =
D
Show that the extremal curve y(x) (known as the catenary curve) for this
problem is

xb
y(x) = c cosh
,
c
where b = 0 and c = y0 .
7. Show that the parametric solution given by Eqs. (1.46)-(1.47) for the
linear refractive medium can be expressed as Eq. (1.49).
8. A light ray travels in a medium with refractive index
n(y) = n0 exp ( y),
where n0 is the refractive index at y = 0 and is a positive constant.
(a) Using Eq. (1.42), with the transformation exp( s) = cos 0 sec ,
show that the path of the light ray is expressed as

cos( x 0 )
1
ln
,
(1.77)
y(x; ) =
cos 0
where the light ray is initially traveling upwards from (x, y) = (0, 0) at an
angle 0 .
(b) Using the appropriate mathematical techniques, show that we recover
the expected result lim0 y(x; ) = (tan 0 ) x from Eq. (1.77).
(c) Show that the light ray reaches a maximum height y = 1 ln(sec 0 )
at x = 0 /.
9. Consider the path associated with the index of refraction n(y) = H/y,
where the height H is a constant and 0 < y < H 1 R to ensure that,
according to Eq. (1.41), n(y) > . Show that the light path has the simple
semi-circular form:

x (2R x).
(R x)2 + y 2 = R2 y(x) =
page 33
October 24, 2014
34
16:53
9789814623612
10. Using the parametric solutions (1.46)-(1.47) of the optical path in a

linear refractive medium, calculate the Frenet-Serret curvature coecient
() =
|r () r ()|
,
|r ()|3
and show that it is equal to |

k ln n|, where
1
dr
ds
x
x + y
y

= r ()
k =
=
,

2
ds
d
(x ) + (y )2
and ln n(y) =
y n (y)/n(y).
11. Assuming that the refractive index n(z) in a nonuniform medium is
a function of z only, show that the Euler-Fermat equations (1.53) for the
components (, , ) of the unit vector
k =
x+
y + z are
= n /n,
= n /n,
= (1 2 ) n /n = (2 + 2 ) n /n.
12. In Fig. 1.8, show that the angle () dened from Eq. (1.64) is expressed as

1 + e cos 2
r
,
= arcsin
() = arcsin
1 + e2 + 2 e cos 2
r2 + (dr/d)2
so that = /2 at = 0 and /2, as expected for an ellipse.
13. Consider the light-path trajectory r() for a spherically-symmetric
medium, with index of refraction n(r) = n0 (b/r)2 , where b is an arbitrary
constant and n0 = n(b).
(a) Using Eq. (1.63), show that the light-path trajectory is

r() = r0 cos + R2 r02 sin ,
where r0 = r( = 0) and R = (n0 /N ) a2 /r0 .
(b) Using the vector r r() (sin
x + cos z) r()r (), show that this
solution satises the Euler-Fermat equation

r
dr
1 d
n(r)
= n = 2 n0 b2 3 ,
R2 d
d
r
page 34
October 24, 2014
16:53
9789814623612
where ds = d
35

r2 + (r )2 R d.
14. Derive the Jacobi equation (1.71) for two-dimensional light propagation in a nonuniform medium with index of refraction n(y); Hint: choose
= x. Compare your Jacobi equation with that obtained from Eq. (1.11).
15. Lagrange showed in 1760 that a surface z(x, y) has minimal area if it
satises the partial dierential equation
2z
2z

2z
2
= 0,
1 + q2
+
1
+
p
2
p
q
x2
y 2
x y
(1.78)
where (p, q) (z/x, z/y).

(a) Derive Eq. (1.78) by minimizing the surface integral

I[z] =
1 + p2 + q 2 dx dy.

(b) Show that the surface z(x, y) = cosh1 ( x2 + y 2 ) has minimal area.
Fig. 1.10
Problem 16.
16. (a) Show that the optical

length followed by a light ray along the path
AP B in Fig. 1.10 is L() = 2 2 R cos(/2), where R is the radius of the
circle.
(b) Show that the optical length L() has a maximum for = 0.
17. We now consider light propagation in axially-symmetric cylindrical
geometry, where the index of refraction n() is a function of the cylindrical
radius (measured from the z-axis). If we use the z-coordinate as the ray
page 35
October 24, 2014
16:53
9789814623612
36
parameter, Fermats Principle of Least Time (1.38) becomes

b
b
n() (, , ) dz
n() 1 + ( )2 + 2 ( )2 dz = 0,
a
a
where = d/dz and = d/dz. Note that the integrand F n is

independent of z and and therefore
N F
F
F
,
F
,

are constants along the light path (i.e., dN/dz = 0 = dR/dz).
R
(1.79)
(1.80)
(a) Using the conservation law (1.80), show that, by solving for as a
function of and , we obtain

1 + ( )2
.
(1.81)
(, ) n()
2
n () 2 R2
(b) Using the conservation law (1.79), with Eq. (1.81), obtain the integral
solution

N d

z() za +
,
2
2
[n () N 2 ] R2
a
which can then be inverted to obtain (z; N, R).
page 36

9321 Chap01

Hochgeladen von

Dokumentinformationen

Originaltitel

Copyright

Verfügbare Formate

Dieses Dokument teilen

Dokument teilen oder einbetten

Freigabeoptionen

Stufen Sie dieses Dokument als nützlich ein?

Sind diese Inhalte unangemessen?

Copyright:

Verfügbare Formate

9321 Chap01

Hochgeladen von

Copyright:

Verfügbare Formate

October 24, 2014

9321 - An Introduction to Lagrangian Mechanics

The Calculus of Variations

A wide range of equations in physics, from quantum eld and superstring

Foundations of the Calculus of Variations

It is a well-known fact that the shortest distance between two points in

October 24, 2014

9321 - An Introduction to Lagrangian Mechanics

An Introduction to Lagrangian Mechanics

1 + m2 , while for y1 (x) = m x2 , we nd

October 24, 2014

9321 - An Introduction to Lagrangian Mechanics

The Calculus of Variations

Methods of the Calculus of Variations

October 24, 2014

9321 - An Introduction to Lagrangian Mechanics

An Introduction to Lagrangian Mechanics

order functional variation F [y; y] dened as

where y(x) is an arbitrary smooth variation of the path y(x) subject to

Virtual displacement y(x) for the functional variation (1.3).

October 24, 2014

9321 - An Introduction to Lagrangian Mechanics

The Calculus of Variations

where F /y is called the functional derivative of F [y] with respect to the

for any smooth function P .

Extremal Values of an Integral

October 24, 2014

9321 - An Introduction to Lagrangian Mechanics

An Introduction to Lagrangian Mechanics

Warning: Material identied by an asterisk is not meant to be covered

October 24, 2014

9321 - An Introduction to Lagrangian Mechanics

The Calculus of Variations

where the Jacobi function J(u, u ; x) is dened as

Eulers Second Equation

Whenever the function F appearing in Eq. (1.2) satises the condition

October 24, 2014

9321 - An Introduction to Lagrangian Mechanics

An Introduction to Lagrangian Mechanics

where the constant is determined from the conditions y(x0 ) = y0 and

Path of Shortest Distance and Geodesic Equation

We now return to the problem of minimizing

We generalize the problem of nding the path of shortest distance on the

October 24, 2014

9321 - An Introduction to Lagrangian Mechanics

The Calculus of Variations

October 24, 2014

9321 - An Introduction to Lagrangian Mechanics

An Introduction to Lagrangian Mechanics

Geodesic Equation on a Sphere

We now consider the example of geodesic curves on the surface of a sphere

which can, thus, be integrated to give

where is another constant angle and we used the change of variable u =

Classical Variational Problems

October 24, 2014

9321 - An Introduction to Lagrangian Mechanics

The Calculus of Variations

Isoperimetric problems represent some of the earliest applications of the

October 24, 2014

9321 - An Introduction to Lagrangian Mechanics

An Introduction to Lagrangian Mechanics

variations y(x) (which vanish at the integration boundaries) yields Eulers

October 24, 2014

which shows that

to the wavevector direction