
The Shortest Path -- A Derivation by Picture

The calculus of variations is fundamental to modern physics in general, and the Lagrangian
formulation of mechanics in particular. Unfortunately, the derivation of the conditions under which
the path over which we evaluate a line integral will produce a minimal value can seem obscure or
even opaque. My first encounter with it, in an old (first) edition of Goldstein, left me feeling
bewildered; the "trick" which was used to switch the order of differentiation around (swapping a
total derivative for a partial) didn't even seem valid, let alone being easy to picture.

On this page I present two "visual" derivations of the minimality condition for a path. The first
derivation is (in my own opinion!) delightfully easy to picture and understand; unfortunately it's not
really valid (as I explain below). The second derivation is only slightly less visual, and has the
advantage of being valid.

(For a more "traditional" approach, see the classic derivation. For an application of minimization
over a path, see Lagrangian mechanics.)

Consider a function, f, defined on all paths through $\mathbb{R}^N$. We wish to find a path such that the integral
of f along the path will be minimized. We would like to find the conditions which the path must
satisfy directly, just by examining a graph of the path, without using algebraic "tricks". Just so we
can say we've been rigorous about it, we've also provided a conventional algebraic derivation of the
necessary conditions on another page, but on this page we're going to stick with the proof-by-picture
approach.

The definition of a path and statement of the problem, given below, are the same here as those given
on the proof-by-parts page; if you've already seen them there, you might as well just skip down to
the derivation.

Definition of a Path

We define a path in $\mathbb{R}^N$ from point $X_a$ to point $X_b$ as a smooth mapping from the unit interval on the
real line [0,1] into $\mathbb{R}^N$:

$$X : [0,1] \to \mathbb{R}^N, \qquad X(0) = X_a, \quad X(1) = X_b$$

Figure 1 -- Some possible paths in $\mathbb{R}^2$:


Statement of the Problem

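We are given a function f which is defined along any path. Along any particular path, f is a function
of the $X_i$ and of their derivatives with respect to the path parameter t, which we show with an overdot. So, f maps a
particular path, and a point in $\mathbb{R}^N$ which lies on that path, into $\mathbb{R}$. Thus, we have:

$$f = f\left(X_1, \ldots, X_N, \dot{X}_1, \ldots, \dot{X}_N\right) \in \mathbb{R}$$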

Note that f is a function of the “shape” of the path, and a function of the speed at which we travel
along the path.

We wish to find a particular path which will minimize (or extremize) the integral of f over the path.
That integral we will call F:

$$F = \int_0^1 f\left(X(t), \dot{X}(t)\right)\, dt$$

More specifically, we wish to find a path such that the integral of f along any nearby path is at least
as large as the integral along our chosen path.
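
To make the integral F concrete, here is a minimal numerical sketch. Everything in it is an illustrative assumption rather than part of the derivation: the integrand is chosen to be the speed along the path (so that F is the path length), the two sample paths in the plane are made up, and the names `path_integral`, `speed`, `straight` and `bowed` are hypothetical.

```python
import math

def path_integral(f, path, n=1000):
    """Approximate F = integral over [0, 1] of f(X, Xdot) dt.

    The path is sampled at n+1 equally spaced parameter values; on each
    step X is taken at the chord midpoint and Xdot as the chord slope.
    """
    dt = 1.0 / n
    total = 0.0
    for k in range(n):
        x0 = path(k * dt)
        x1 = path((k + 1) * dt)
        x_mid = [(a + b) / 2 for a, b in zip(x0, x1)]
        x_dot = [(b - a) / dt for a, b in zip(x0, x1)]
        total += f(x_mid, x_dot) * dt
    return total

def speed(x, x_dot):
    # Assumed integrand: f = |Xdot|, so that F is the length of the path.
    return math.sqrt(sum(v * v for v in x_dot))

def straight(t):
    # Straight line from (0, 0) to (1, 1).
    return (t, t)

def bowed(t):
    # A nearby path with the same endpoints, bowed to one side.
    return (t, t + 0.2 * math.sin(math.pi * t))

F_straight = path_integral(speed, straight)
F_bowed = path_integral(speed, bowed)
print(F_straight, F_bowed)
```

With this choice of f, the straight path should give a value close to the square root of 2, and the bowed path a larger one, which is the sense in which a nearby path is "at least as large".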

Figure 2 -- Some "nearby" paths in $\mathbb{R}^2$:


Note, however, that something important isn't shown in figures 1 and 2: The function, f, is a
function of the location along the path, and is also a function of how fast we are moving along the
path. The derivatives of x and y with respect to t do not appear in those pictures, but they are
important nonetheless. (We will need to use a picture in which we can see the derivative with
respect to t to complete the derivation.)

The condition a path must satisfy, to be a minimum (or maximum) of F, is the familiar one: The
derivative of F (with respect to the path) must vanish. That means that any “infinitesimal” deviation
from the path will result in no change -- or, in other words, the path must be a stationary point for F
in the space of all possible paths. (If that were not true, then the path would not be minimal. If a
particular infinitesimal deviation produced a better result, obviously the path would not be minimal.
If a particular infinitesimal deviation produced a worse result, on the other hand, then an
infinitesimal deviation the opposite way would produce a better result and again the original path
was not minimal.)
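
The stationarity argument can be checked numerically. The sketch below is only an illustration under assumed choices: a one-dimensional path from 0 to 1, the integrand f = (dX/dt)^2, and a perturbation shaped like sin(pi t) that vanishes at both endpoints. The names `action`, `bump` and `change` are hypothetical. For this f the straight line X(t) = t is the stationary path, while X(t) = t^2 is not.

```python
import math

def action(f, X, n=2000):
    """Approximate F = integral over [0, 1] of f(X, Xdot) dt for a 1-D path X(t)."""
    dt = 1.0 / n
    total = 0.0
    for k in range(n):
        t0, t1 = k * dt, (k + 1) * dt
        x_mid = 0.5 * (X(t0) + X(t1))
        x_dot = (X(t1) - X(t0)) / dt
        total += f(x_mid, x_dot) * dt
    return total

def f(x, x_dot):
    # Assumed integrand for the illustration.
    return x_dot ** 2

def bump(t):
    # Perturbation shape; zero at t = 0 and t = 1 so the endpoints stay fixed.
    return math.sin(math.pi * t)

def change(X, eps):
    """Change in F when the path X is deviated by eps * bump."""
    return action(f, lambda t: X(t) + eps * bump(t)) - action(f, X)

eps = 1e-3
stationary_change = change(lambda t: t, eps)          # deviation from X(t) = t
non_stationary_change = change(lambda t: t * t, eps)  # deviation from X(t) = t^2

print(stationary_change / eps ** 2)   # stays finite: the change is second order in eps
print(non_stationary_change / eps)    # stays finite: the change is first order in eps
```

On the stationary path the change scales like eps squared, so there is no first-order change at all. On the other path the change is proportional to eps, so deviating one way or the other lowers F.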

From here we could proceed with a conventional rigorous derivation using integration by parts, and
we have done that on a separate page. But on this page, we will proceed to derive the minimization
conditions visually, if not entirely rigorously.

The Problem Re-Drawn in 1 Spatial Dimension

First, to do this visually, we're going to need a picture which lets us see how "fast" we're moving on
the path. So, we must show t, the path parameter, explicitly in the graph. Since we can't easily show
more than 2 dimensions, we'll have to limit ourselves to a path through $\mathbb{R}^1$. That's sufficient for
deriving the formula for a single dimension, however, and generalizing the result to N dimensions is
straightforward since the derivative is a linear function of the partial derivatives. Because it's linear,
the derivative with respect to a small deviation in N dimensions is just the sum of the derivatives of
the integral taken over each dimension separately.

So here is a 1-dimensional path (figure 3). It's just a function of one variable: the value of X at each
point is just X(t), and its “speed” along the path is just dX/dt -- the slope of the curve at each point.

Figure 3 -- A 1-dimensional path from X=a to X=b, parametrized by t:

If the path is minimal, then moving away from it “a little bit” will not change the value of the
integral (to first order).

The First Derivation -- A Clear Picture (that Doesn't Quite Work)

Moving from a path which is just a little to one side of our chosen path to a different path an equal
distance to the other side of our “chosen path” will result in no change (to second order). We will
now “zoom in” on a little piece of the path, which is so short we can treat it as being “straight”
(figure 4). We show two other paths, P1 and P2. Path P2 “races ahead” briefly at twice the rate of the
main path, and then holds a constant value until the main path catches up. Path P1 holds a constant
value while path P2 moves ahead, then races at twice the rate of the main path to catch up with path
P2. Since P1 and P2 are equal perturbations on either side of our “minimum” path, we would like
the integral of f to be equal (to second order) along P1 and P2.

Figure 4 -- The path segments we'll compare:


In this image, we've broken apart the effect of shifting the path along the X axis and changing the
speed at which we move along the path. In words, the difference in the “flat parts” of paths P1 and
P2 -- segments A and B in figure 4 -- is due solely to the rate of change in f as X changes. If f grows
larger as X increases, then the integral along path P2 will have a larger value over the “flat part” of
its course than path P1.

On the “racing” parts of paths P1 and P2 -- segments C and D in figure 4 -- the paths pass through
the same X values, so the derivative of f with respect to X doesn't matter there. They have the same
slopes, too -- but path P1 races ahead, along segment D, after path P2 finishes racing along segment
C. So, if the effect on f of the velocity along the curve changes as time goes by, then the integral
along segment D will be larger by the amount that the derivative of f with respect to the velocity
changed.

Again just in words, in order for the net integral along P1 to equal the integral along P2, the effect
due to the increase in X between segments A and B must exactly balance the effect of racing ahead
later along segment D versus segment C. The former effect is the partial derivative of f with respect
to X; the latter effect is the time derivative of the partial derivative of f with respect to the velocity.

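We'd like to extract the actual minimization condition from figure 4. Let's translate the origin to the
point marked $f_0$ and replace f with its first-order Maclaurin expansion:

$$f(X, \dot{X}) \approx f_0 + \frac{\partial f}{\partial X}\, X + \frac{\partial f}{\partial \dot{X}}\, \dot{X} \qquad (4)$$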

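Using formula (4), we can see from figure 4 that the difference in f between segment A and segment
B (which are separated in X by 2ΔX, the rise of one "racing" segment) must be:

$$f_B - f_A = 2\,\Delta X\, \frac{\partial f}{\partial X}$$

and, since each segment covers Δt, the difference in the integral of f between segment A and
segment B must be:

$$\left(f_B - f_A\right) \Delta t = 2\,\Delta X\, \Delta t\, \frac{\partial f}{\partial X}$$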

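Looking at segments C and D on the path, and applying (4), we can see that the average value of X
on each segment is the same as X at $f_0$, and the slope of each segment is 2(ΔX/Δt). So, the
average value of f on segment C must be:

$$f_C = f_0 + \frac{2\,\Delta X}{\Delta t} \left.\frac{\partial f}{\partial \dot{X}}\right|_C$$

(N.B. -- this step in the derivation is easy to visualize but it's not really valid; we'll have more to say
about that below.) Similarly, the average value of f on segment D must be:

$$f_D = f_0 + \frac{2\,\Delta X}{\Delta t} \left.\frac{\partial f}{\partial \dot{X}}\right|_D$$

Their difference, then, since segment D is traversed a time Δt later than segment C, will be:

$$f_D - f_C = \frac{2\,\Delta X}{\Delta t} \left( \left.\frac{\partial f}{\partial \dot{X}}\right|_D - \left.\frac{\partial f}{\partial \dot{X}}\right|_C \right) = 2\,\Delta X\, \frac{d}{dt} \frac{\partial f}{\partial \dot{X}}$$

And the difference in the path integrals across segment D versus segment C must be:

$$\left(f_D - f_C\right) \Delta t = 2\,\Delta X\, \Delta t\, \frac{d}{dt} \frac{\partial f}{\partial \dot{X}}$$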

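Finally, the difference in the integral over path P2 versus path P1 must be:

$$F(P_2) - F(P_1) = 2\,\Delta X\, \Delta t \left( \frac{\partial f}{\partial X} - \frac{d}{dt} \frac{\partial f}{\partial \dot{X}} \right)$$

and this difference can only be zero if

$$\frac{\partial f}{\partial X} - \frac{d}{dt} \frac{\partial f}{\partial \dot{X}} = 0$$

which is, indeed, the minimization condition we wanted to find.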


Oops -- That wasn't really valid.

Unfortunately, the derivation just given isn't valid, because the first-order expansion we used in (4) is
only accurate in the limit as we approach the origin. It's only valid for small values of x and for
small values of dx/dt. But dx/dt is actually on the order of Δx/Δt, which is in general not small (we
can make each of Δx and Δt as small as we like by restricting ourselves to a small region on the path,
but their ratio is unaffected by shrinking the neighborhood).

The use of "flat spots" in the alternate paths, where x is momentarily fixed while t continues to
advance, is very nice for visualizing the minimization conditions, because it completely splits the
effect of $\partial f/\partial x$ from the effect of $\partial f/\partial \dot{x}$. Unfortunately, the difference in speed between a "flat spot" and
the main path is typically not infinitesimal! So, we're no longer looking at an infinitesimal variation
in the path, and the derivation doesn't really work when done this way, even though we managed to
pull out the right answer.
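
The objection can be seen in numbers. The sketch below assumes, purely for illustration, an integrand that depends only on speed, f = (dx/dt)^2, and piecewise-straight paths described as lists of (slope, duration) pairs. The names `piecewise_integral`, `flat_ratios` and `wedge_ratios` are hypothetical. It compares a "flat spot" path (twice the main slope, then zero slope) with a gentle wedge whose slope differs from the main slope by a fixed small fraction.

```python
def f(x_dot):
    # Assumed integrand: depends on speed only, to isolate the effect of slope.
    return x_dot ** 2

def piecewise_integral(segments):
    """Integral of f over straight segments given as (slope, duration) pairs."""
    return sum(f(slope) * duration for slope, duration in segments)

v = 1.5  # slope of the main path (an arbitrary illustrative value)
flat_ratios = []
wedge_ratios = []
for dt in (1e-1, 1e-2, 1e-3):
    main = piecewise_integral([(v, dt), (v, dt)])
    # "Flat spot" path: race at twice the main slope, then hold still.
    flat = piecewise_integral([(2 * v, dt), (0.0, dt)])
    # Gentle wedge: slope changes by delta/dt, here 0.1% of the main slope.
    delta = 1e-3 * v * dt
    wedge = piecewise_integral([(v + delta / dt, dt), (v - delta / dt, dt)])
    flat_ratios.append(flat / main)
    wedge_ratios.append(wedge / main)

print(flat_ratios)   # does not approach 1 as dt shrinks
print(wedge_ratios)  # stays very close to 1
```

However small the region, the flat-spot path gives twice the integral of the main path for this f, so it is not a small variation. The wedge, whose slope stays close to the main slope, is.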

A Second Derivation: A Bit Less Clear, but More Correct

As observed above, when we look for an alternate path that's "close to" our selected path, we can't
have "flat spots" where the slope of the alternate path goes to 0. We must keep the slope similar to
the slope of the main path. So, we can't really split the effect of speed (slope of the path) completely
apart from the effect of traveling over different ground (value of x at each point). But we can
nonetheless derive the desired result from a very simple diagram.

In figure 5, we've shown a tiny piece of the main path (path "M"), and we've shown one alternate
path (path "P"). The alternate path goes a little faster than the main path on segment A, then
proceeds more slowly along segment B until path "M" once again catches up with it. Thus, path "P"
spends more time at larger values of X, which represents a cost if $\partial f/\partial X$ is positive. However, it does
more of its fast traveling earlier, and can idle along a bit later; this tradeoff represents a benefit if
$\frac{d}{dt}\left(\partial f/\partial \dot{X}\right)$ is positive (the price of gas is going up). Thus, if the overall tradeoff is to net to zero (which
must be true for very nearby paths, if the main path is minimal), then $\partial f/\partial X$ and $\frac{d}{dt}\left(\partial f/\partial \dot{X}\right)$ must again
balance. We will now derive that relationship more precisely.

Figure 5 -- The path segments to be compared, second time around:


Figure 5 is largely self explanatory but a few things (which can be read directly from the figure)
should, perhaps, be pointed out, if only because the print on the figure may be too small to read on
some screens!

Path M has slope Δx/Δt in the figure (assumed constant in this tiny region). The point f0, at location
(t0,X0), has the average value of f on path M (to first order), and the average t and X values on path
M are t0 and X0. Path P takes time Δt to traverse segment A, which it does with slope .
During that time, path M traverses segment M1; its average X value on segment M1 is Xa, and the
average X value on segment A is . Path P takes time Δt to traverse segment B, which
it does with slope . During that time, the average X value of segment M2 is Xb, and the
average X value on segment B is . Now, let us proceed.

The average value of f on segment A of path P, to first order, is:

The average value of f on segment B of path P, to first order, is:

Summing and dividing by 2, we get the average value of f on path P:


We can make the following simple substitutions, all accurate to first order:

Simplifying and multiplying by the total time, we obtain the integral of f over path P (to first order):

But the integral over path M is (to first order):

So, if F(M) = F(P), and keeping in mind that "to first order" is redundant when comparing first
derivatives, we must have:

$$\frac{\partial f}{\partial X} - \frac{d}{dt} \frac{\partial f}{\partial \dot{X}} = 0$$

which was to be shown.
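
As a cross-check, the algebra of this second derivation can be retraced in a few lines. This is only a sketch under assumed notation: the symbol $\delta$ (the amount by which path P leads path M at the moment segment A ends) and the labels $f_{M_1}$, $f_{M_2}$ (the average of f on the two halves of path M) are introduced here for illustration and are not taken from figure 5.

```latex
% Assumed notation (not from figure 5):
%   path P leads path M by \delta at time t_0, so P has slope
%   \Delta x/\Delta t + \delta/\Delta t on segment A and
%   \Delta x/\Delta t - \delta/\Delta t on segment B, and its average X
%   exceeds that of M by \delta/2 on each segment.
\begin{align*}
\bar{f}_A &\approx f_{M_1} + \frac{\delta}{2}\,\frac{\partial f}{\partial X}
             + \frac{\delta}{\Delta t}\left.\frac{\partial f}{\partial \dot{X}}\right|_{M_1} \\
\bar{f}_B &\approx f_{M_2} + \frac{\delta}{2}\,\frac{\partial f}{\partial X}
             - \frac{\delta}{\Delta t}\left.\frac{\partial f}{\partial \dot{X}}\right|_{M_2} \\
\bar{f}_P &= \tfrac{1}{2}\left(\bar{f}_A + \bar{f}_B\right)
           \approx f_0 + \frac{\delta}{2}\,\frac{\partial f}{\partial X}
             - \frac{\delta}{2\,\Delta t}\left(
               \left.\frac{\partial f}{\partial \dot{X}}\right|_{M_2}
             - \left.\frac{\partial f}{\partial \dot{X}}\right|_{M_1}\right) \\
          &\approx f_0 + \frac{\delta}{2}\left(\frac{\partial f}{\partial X}
             - \frac{d}{dt}\frac{\partial f}{\partial \dot{X}}\right) \\
F(P) &\approx 2\,\Delta t\,\bar{f}_P
      = 2\,\Delta t\, f_0 + \delta\,\Delta t\left(\frac{\partial f}{\partial X}
             - \frac{d}{dt}\frac{\partial f}{\partial \dot{X}}\right) \\
F(M) &\approx 2\,\Delta t\, f_0
\end{align*}
```

The two substitutions used are that the average of $f_{M_1}$ and $f_{M_2}$ is $f_0$, and that the midpoints of M1 and M2 are a time Δt apart, so the difference in $\partial f/\partial \dot{X}$ between them is Δt times its time derivative. Under these assumptions, F(P) = F(M) for every small $\delta$ only if the bracket vanishes.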

Explicit Dependency of f on t

f may depend explicitly on t as well. Do we need to modify the above derivations in that case?

In the first derivation we compare the integral over segments which are separated by time. The
differences would be affected if $\partial f/\partial t$ were nonzero. However, a brief inspection shows that, when we
subtract the difference between segments C and D from the difference between segments A and B,
the differences due to $\partial f/\partial t$ will cancel and the final result will be unaffected.

In the second derivation, we compared integrals only over paths with identical average times, so
nonzero $\partial f/\partial t$ will not affect the calculations, or the result, at all.
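
This claim about the second derivation can be tested numerically. The sketch below rests on several illustrative assumptions: a made-up integrand with explicit time dependence, f = (1/2)(dx/dt)^2 - (1/2)x^2 + t x, a straight main path, and a wedge-shaped alternate path that leads the main path by a small amount `delta` at the midpoint (the symbol is hypothetical, not taken from figure 5). It compares the actual change in the integral with delta * dt * (df/dx - d/dt df/d(xdot)) evaluated at the midpoint.

```python
def f(t, x, x_dot):
    # Assumed integrand with explicit dependence on t (df/dt = x is nonzero).
    return 0.5 * x_dot ** 2 - 0.5 * x ** 2 + t * x

def integrate_segment(t_start, x_start, slope, duration, n=2000):
    """Midpoint-rule integral of f along one straight segment of a path."""
    h = duration / n
    total = 0.0
    for k in range(n):
        t = t_start + (k + 0.5) * h
        x = x_start + slope * (t - t_start)
        total += f(t, x, slope) * h
    return total

# Illustrative values: midpoint (t0, X0), main slope v, half-duration dt.
t0, X0, v = 0.7, 0.2, 1.0
dt, delta = 0.1, 1e-5

x_begin = X0 - v * dt
# Main path M: slope v on both halves.
F_M = (integrate_segment(t0 - dt, x_begin, v, dt)
       + integrate_segment(t0, X0, v, dt))
# Alternate path P: slightly faster on the first half, slightly slower on the second.
F_P = (integrate_segment(t0 - dt, x_begin, v + delta / dt, dt)
       + integrate_segment(t0, X0 + delta, v - delta / dt, dt))

# On the straight main path the second derivative of x is zero, so for this f:
# df/dx - d/dt(df/dx_dot) = (-x + t) - 0, evaluated at the midpoint.
el_expression = -X0 + t0
predicted = delta * dt * el_expression
actual = F_P - F_M

print(actual, predicted)
```

The two numbers should agree to within a fraction of a percent even though f depends explicitly on t here, because path P and path M are compared over the same time interval.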

Page last modified 10/31/06. Reference to Lagrangian mechanics added on 11/13/06.
