
For anyone reading this: if you found this article useful, please email me at
Nicholas.fisk@kcl.ac.uk. If you have used this article for academic work, please cite it.
Functional derivatives: a happy coincidence? A first principles comprehensive review of
functional derivatives.
A function requires an x-value as its input and outputs a number. A functional $F[f]$ requires a
function as its input and outputs a number. Examples of functionals are stated below in order
of complexity.

$$1: \quad F[f] = f(x = x_0) = f(x_0)$$

Where $x_0$ is an arbitrary x-value.
$$2: \quad F[f] = f(a)\,f(b)$$

Where $a$ and $b$ are arbitrary x-values.

$$3: \quad F[f] = \int_a^b f(x)\,dx$$

In this example, the functional represents the area under the curve from $a$ to $b$.

$$4: \quad F[f] = \int_a^b f(y)\,f(x)\,dx$$

In this example $f(y)$, the function evaluated at a fixed point $y$, is not affected by the integration sign and can be taken out as a constant.

$$5: \quad F[f] = \int_a^b g(x)\,f(x)\,dx$$

In this example, the functional only depends on the function $f$, not $g$. The function $g$, whatever
it is, remains constant in this functional. Had the functional been written $F[g]$, the function $f$
would have remained constant, with only the function $g$ able to vary. In the case that the
functional is written $F[f, g]$, both functions $f$ and $g$ may vary (independently).
From the examples above, it is easily seen that functionals may have an integral sign in their
definition, but it should be noted that this is not always the case (examples 1 and 2).
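The picture of a functional as a map from a function to a number can be sketched in code. Below is a minimal Python illustration of Examples 1 and 3; the function names, the fixed point $x_0 = 2$, and the grid size are my own choices:

```python
import numpy as np

def F1(f, x0=2.0):
    """Example 1: evaluate the input function at a fixed point x0."""
    return f(x0)

def F3(f, a=0.0, b=1.0, n=100_000):
    """Example 3: the area under f between a and b (midpoint Riemann sum)."""
    dx = (b - a) / n
    x = a + (np.arange(n) + 0.5) * dx
    return np.sum(f(x)) * dx

# Each functional takes a *function* as its input and returns a number.
print(F1(np.sin))            # sin(2.0) ~ 0.909
print(F3(lambda x: x**2))    # ~ 1/3, the area under x^2 on [0, 1]
```

Note that `F1` never integrates: a functional need not contain an integral sign, exactly as Examples 1 and 2 show.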
Functionals themselves are assumed to be continuous; this means that a small change in
the varying input function leads only to a small change in the output number returned.
$$\delta F = \lim_{\delta f \to 0} \big( F[f + \delta f] - F[f] \big) \qquad \text{Eq. 1}$$

A change in a function is denoted $\delta f$, where the $\delta$ (not to be confused with the Kronecker
delta or the Dirac delta function) denotes a change in something. Taking the limit of $\delta f$ to 0
represents an infinitesimally small change in the function. In actuality, $\delta f$ has a similar definition
to the $\Delta x$ commonly seen in the definition of the derivative $\frac{dy}{dx}$, where $\Delta x$
represents a change in the value of $x$, and taking the limit $\Delta x \to 0$ represents an infinitesimally
small change. When using derivatives, the limits to 0 are always in effect, and hence we are
always looking at the change of something due to the infinitesimally small change of
something else. As an analogue of the common $\frac{dy}{dx}$, the derivative which operates on a
function, $\frac{\delta F}{\delta f}$ is defined (in many texts) as the derivative which operates on a functional.
Hence $\frac{\delta F}{\delta f}$ represents the change in a functional with respect to an infinitesimally small change
in a function.
It is reasonable to first concern ourselves with the general definition for the small change in a
multivariate function of n independent variables.

$$dF(x_1, x_2, \dots, x_n) = \frac{\partial F}{\partial x_1}\,dx_1 + \frac{\partial F}{\partial x_2}\,dx_2 + \cdots + \frac{\partial F}{\partial x_n}\,dx_n = \sum_{i=1}^{n} \frac{\partial F}{\partial x_i}\,dx_i \qquad \text{Eq. 2}$$

In words, think of a vector representing $dF$ (the slight change from one output number to
another). The equation above states that $dF$ is a linear combination of the $n$ vectors $dx_i$, where
the coefficient of each $dx_i$ is the partial derivative $\frac{\partial F}{\partial x_i}$.
Equivalently, each coefficient can be seen as the contribution each independent
variable makes towards the small change in the output value of the function (at a particular
point). The larger the coefficient, the larger the contribution of that
independent variable towards the (linear) small change in $F$; a larger contribution means that
a smaller change in that independent variable is required to incur a given change in the function (at
a particular point).
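The reading of Eq. 2 as a sum of per-variable contributions can be checked numerically. A minimal sketch, assuming an arbitrary smooth test function $F(x, y) = x^2 y$ of my own choosing:

```python
def F(x, y):
    return x**2 * y  # an arbitrary smooth two-variable function

x, y = 1.5, -0.5
dx, dy = 1e-6, 2e-6

# Partial derivatives of F at (x, y), worked out by hand.
dF_dx = 2 * x * y
dF_dy = x**2

exact_change  = F(x + dx, y + dy) - F(x, y)
linear_change = dF_dx * dx + dF_dy * dy   # Eq. 2

print(exact_change, linear_change)  # agree to first order in dx, dy
```

The two numbers differ only by second-order terms (products of two small increments), which is exactly what "to first order" means.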
Another commonly written definition for the small change in the functional is:

$$\delta F = \int_a^b \frac{\delta F}{\delta f(x)}\,\delta f(x)\,dx \qquad \text{Eq. 3}$$
And it is fair to say that this is an extension of (Eq. 2), except that the summation symbol is
replaced with an integral. Both Eq. 2 and Eq. 3 will be derived rigorously later in the text.
It is now a convenient point to confirm the accepted interpretation of the integral sign. While
the summation sign sums over a specified number of terms by discrete integer increases of
the index (in Eq. 2, the index is $i$), the integral sign sums over an infinite number of terms by
continuous increases of the variable (in Eq. 3, the x-value). This means that the
integral sign (in Eq. 3) not only sums over $x = a,\, a+1,\, \dots,\, b-1,\, b$ but also all the points in
between. It should also be noted that each term in the uncountably infinite sum is multiplied
by $dx$; so it is in reality a sum of an uncountably infinite number of infinitesimally small
areas. The formula (if correct) suggests that each value of $\delta f(x)$ (for all the x-values within
the limits $x = a$ up to $b$) may contribute to the small change in $F$. This in itself is not wholly
unrealistic; however, the formula makes a further assumption: that each point $f(x)$ (in the
range $x = a$ to $b$) varies independently.
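The sum-of-areas reading of the integral sign can be illustrated by refining a grid: as $dx$ shrinks, a discrete sum of $f(x)\,dx$ terms approaches the integral. A small Python sketch (the grid sizes and the choice of $e^x$ are mine):

```python
import numpy as np

def riemann_sum(f, a, b, n):
    """Sum f at n evenly spaced points, each term weighted by dx."""
    dx = (b - a) / n
    x = a + dx * np.arange(n)        # x = a, a+dx, a+2dx, ...
    return np.sum(f(x)) * dx

for n in (10, 100, 1000, 10000):
    print(n, riemann_sum(np.exp, 0.0, 1.0, n))  # -> e - 1 as dx -> 0
```

Each printed value is a finite stand-in for the integral; the integral sign is the limit of this process as the number of points becomes uncountably infinite.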
A small change in the value of a point would be, for example, $f(3) = 5$ to $f(3) = 5.01$.
Varying any number of points (by an infinitesimally small amount) in the function (for all the
x-values) can generate a new, infinitesimally similar function. Similarly, varying each point in
the range $x = a$ to $b$ can generate a new function for the range $x = a$ to $b$; since the
functional does not consider the points outside of the range, there is no need to vary points
outside of it.
However, the assumption is by no means intuitive. It is clearly incorrect to say that you can
vary each point in a function independently and always generate a new acceptable function.
For a function to be continuous, varying one point must affect neighbouring
points. For example, if you move the point at $x = c$ up and the point at $x = c + dx$ down by an
infinitesimal amount, you will generate a gap in the function; this is not an acceptable
function.

Arguably, integrals can in theory operate on a discontinuous function so long as an output
value exists at every point in the range of x-values concerned.
Take for example a functional in the form of Example 3. One might claim that it does not
matter if a discontinuous function is produced: addition (even an infinite sum) is commutative,
so you can add the points in any order and still produce the same answer. Hence, even if a
discontinuous function is produced from the operation of changing the value of $f(x)$ at each
and every point in the specified range, so long as the newly formed points can be reshuffled
into an acceptable function, it does not matter that the newly produced function is
discontinuous.
However, it can easily be shown that independently varying an acceptable function can
produce a discontinuous function that cannot be reshuffled into an acceptable function for
the range. An acceptable function, for example, is $f(x) = 2x$, for all x-values;
graphically, this is a straight line. It is possible to move each point upwards by an
infinitesimally small amount but one specific point, at $x = c$, downwards. This will clearly
create a gap in the function which no amount of shuffling can fix, and thus an
unacceptable function. Hence it seems a roadblock has been hit: it looks as though, for this
case (and many others), the functional will produce an output value due to a discontinuous
function. However, this is not a problem, because it can be shown that the output
value due to a discontinuous function is not unique: a continuous function can
always replace any discontinuous function and give the same output value. This is due to an
earlier, more intuitive assumption: functionals are continuous.
Evaluate the output value of a functional $F[f]$, where $f$ is a continuous function. Now
evaluate the functional $\lim_{\varepsilon \to 0} F[f + \varepsilon]$, where $\varepsilon$ is an arbitrary continuous function. This will
lead to an answer $F[f] + \delta F[f]$, as the new function is infinitesimally similar to the old
function. By varying $\varepsilon$ continuously, you vary the output value continuously. All the
discontinuous functions formed by infinitesimally varying each point in a continuous
function will clearly produce an output value infinitesimally close to that of the original continuous
function (this is due to how integration works). Since the discontinuous function produced an
output value, it should be possible to recreate that output value by using $f + \varepsilon$ (starting with
any expression for $\varepsilon$) and varying $\varepsilon$; note that $f + \varepsilon$ is clearly a continuous function. This
leads to the conclusion that any discontinuous function that gives an output value can be
represented by an analogous continuous function, and hence it is valid to vary each point in
the function independently. As stated before, you cannot normally infinitesimally vary the
points of a function independently and expect to get a new continuous function; it is only for
functionals that this is harmless, in that every discontinuous input function of a functional
has at least one continuous function analogue.
While the definition for the small change in a functional has been stated, the procedure to
calculate the derivative of a functional has not been outlined. The first step in the procedure
is to write out the Taylor series of $F[f + \varepsilon]$ to first order. The assumption that each point in
the function may be treated as an independent variable is a huge advance in the derivation
of this Taylor series.
We begin by considering the Taylor series of a one-dimensional function. Any acceptable
function $f(x)$ in mathematics can be written as a power series:

$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots = \sum_{n=0}^{\infty} a_n x^n \qquad \text{Eq. 4}$$

Where each $a_n$ is a constant.


This is reasonable as an infinite amount of information can be stored in the value of the
coefficients.
We can re-write this power series as:

$$f(x) = b_0 + b_1 (x - c) + b_2 (x - c)^2 + b_3 (x - c)^3 + \cdots = \sum_{n=0}^{\infty} b_n (x - c)^n \qquad \text{Eq. 5}$$

Where $c$ is a constant which can be any number, and where in general $b_n \neq a_n$.


This is reasonable as there is still an infinite amount of information stored in the coefficients,
and the $(x - c)$, which would ordinarily just shift the entire function by $c$ along the x-axis
(not losing any information about the function), is accounted for by the change in the values of
the coefficients. From this point onwards, anything mathematically similar in purpose to the
inclusion of $c$ in $(x - c)$ will be referred to as a perturbation. The procedure for working
out the coefficients is outlined below. The full derivation is written in Appendix A.
$$f(c) = b_0$$
$$f'(c) = b_1$$
$$f''(c) = 2 b_2$$
$$f'''(c) = 6 b_3$$
$$f^{(n)}(c) = n!\, b_n$$

Whereby $f^{(3)}(c)$ is equivalent to $f'''(c)$.

Hence, $\frac{1}{n!} f^{(n)}(c) = b_n$.
Substituting this result:

$$f(x) = \sum_{n=0}^{\infty} b_n (x - c)^n = \sum_{n=0}^{\infty} \frac{1}{n!} f^{(n)}(c)\,(x - c)^n$$

This is the Taylor series of $f(x)$.
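The coefficient formula $b_n = \frac{1}{n!} f^{(n)}(c)$ can be sanity-checked numerically. A minimal sketch using $f(x) = e^x$ about $c = 1$ (my choice), convenient because every derivative $f^{(n)}(c)$ equals $e^c$:

```python
import math

# Taylor coefficients b_n = f^(n)(c) / n! for f(x) = e^x about c = 1.
# Every derivative of e^x is e^x, so f^(n)(c) = e^c for all n.
c = 1.0
b = [math.exp(c) / math.factorial(n) for n in range(15)]

def taylor(x):
    """Partial sum of the Taylor series sum_n b_n (x - c)^n."""
    return sum(bn * (x - c)**n for n, bn in enumerate(b))

x = 1.7
print(taylor(x), math.exp(x))  # the partial sum closely matches e^x
```

Fifteen terms already reproduce $e^{1.7}$ to many decimal places, illustrating that the coefficients really do store all the information about the function near $c$.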


The power series of a multivariable function $F(x, y)$ is:

$$F(x, y) = a_{00} + a_{10}\,x + a_{01}\,y + a_{11}\,xy + \cdots = \sum_{i,j} a_{ij}\,x^i y^j$$

To first order this is:

$$F(x, y) = a_{00} + a_{10}\,x + a_{01}\,y + \cdots$$

Which can be re-written as:

$$F(x, y) = b_0 + b_1\,x + b_2\,y + \cdots$$

Where the coefficients have been re-written for convenience. This result can be generalised
to the Taylor series of an n-multivariate function to first order:

$$F(x_1, x_2, \dots, x_n) = b_0 + (b_1 x_1 + b_2 x_2 + \cdots + b_n x_n) + \cdots = b_0 + \sum_{i=1}^{n} b_i x_i + \cdots$$

By using the same trick as before for $f(x)$, this can be written as:

$$F(x_1, x_2, \dots, x_n) = b_0 + \sum_{i=1}^{n} b_i (x_i - c_i) + \cdots$$

And again, using the same trick as with the $f(x)$ derivation:

$$F(x_1, x_2, \dots, x_n) = b_0 + \sum_{i=1}^{n} \left( \frac{\partial F}{\partial x_i} \bigg|_{x = c} (x_i - c_i) \right) + \cdots$$

Where $\frac{\partial F}{\partial x_i} \big|_{x = c}$ means: take the partial derivative of $F$ with respect to $x_i$ and then set each $x_j = c_j$.

Although this may seem complicated, it is similar to how $f'(c) = b_1$ is worked out in the Taylor
series expansion of $f(x)$: take the derivative of $f$ with respect to $x$ and set $x = c$. The above
procedure for working out the first-order coefficients is used when dealing with functions of
more than one variable.
Similarly, the Taylor series of a multivariable function of an infinite number of independent
variables is, to first order:

$$F(x_1, x_2, \dots) = b_0 + \sum_{i=1}^{\infty} \left( \frac{\partial F}{\partial x_i} \bigg|_{x = c} (x_i - c_i) \right) + \cdots$$

Making an arbitrary change $\varepsilon_i$ in each of the independent variables, we obtain:

$$F(x_1 + \varepsilon_1, x_2 + \varepsilon_2, \dots) = b_0 + \sum_{i=1}^{\infty} b_i (x_i + \varepsilon_i - c_i) + \cdots$$

Which leads to,

$$F(x_1 + \varepsilon_1, x_2 + \varepsilon_2, \dots) = b_0 + \sum_{i=1}^{\infty} \left( \frac{\partial F}{\partial x_i} \bigg|_{x = c} (x_i + \varepsilon_i - c_i) \right) + \cdots$$

Setting each $c_i$ to match the value of $x_i$, we obtain:

$$F(x_1 + \varepsilon_1, x_2 + \varepsilon_2, \dots) = b_0 + \sum_{i=1}^{\infty} \left( \frac{\partial F}{\partial x_i} \bigg|_{x = c} (x_i + \varepsilon_i - x_i) \right) + \cdots$$

Which is the same as,

$$F(x_1 + \varepsilon_1, x_2 + \varepsilon_2, \dots) = b_0 + \sum_{i=1}^{\infty} \frac{\partial F}{\partial x_i}\,\varepsilon_i + \cdots$$

Following the method used to work out $b_0$ in the Taylor expansion of $f(x)$, namely

$$f(c) = b_0$$

the $b_0$ in the above expansion would be:

$$F(c_1, c_2, \dots) = b_0$$

However, since we are setting all of the constants $c_i$ equal to the independent
variables themselves, $b_0$ simply becomes:

$$b_0 = F(x_1, x_2, \dots) = F$$

Hence,

$$F(x_1 + \varepsilon_1, x_2 + \varepsilon_2, \dots) = F + \sum_{i=1}^{\infty} \frac{\partial F}{\partial x_i}\,\varepsilon_i + \cdots$$

A direct expansion of the above formula gives the Taylor series of the functional $F[f + \varepsilon]$ to
first order, with each point $f(x_i)$ playing the role of an independent variable and $\varepsilon(x_i)$ the
perturbation on it:

$$F[f + \varepsilon] = F + \sum_{i} \frac{\partial F}{\partial f(x_i)}\,\varepsilon(x_i) + \cdots$$
It should be noted that by moving $F$ to the other side and taking the limit as $\varepsilon$ goes to zero,
you recover (Eq. 1), as the higher-order terms become negligible and vanish.
As an example of how to use this relationship, take for instance the general functional of
Example 5, $F[f] = \int_a^b g(x)\,f(x)\,dx$.
We first begin by expanding the Taylor series as an infinite sum over the points $x = a,\, a + dx,\, \dots,\, b$:

$$F[f + \varepsilon] = F + \frac{\partial F}{\partial f(a)}\,\varepsilon(a) + \frac{\partial F}{\partial f(a + dx)}\,\varepsilon(a + dx) + \cdots + \frac{\partial F}{\partial f(b)}\,\varepsilon(b) + \cdots$$

Higher-order terms are defined as the terms that do not belong to the first-order part of the expansion.

Substituting $F[f] = \int_a^b g(x)\,f(x)\,dx$ where required:

$$F[f + \varepsilon] = F + \frac{\partial \left( \int_a^b g(x) f(x)\,dx \right)}{\partial f(a)}\,\varepsilon(a) + \frac{\partial \left( \int_a^b g(x) f(x)\,dx \right)}{\partial f(a + dx)}\,\varepsilon(a + dx) + \cdots$$

Expanding $\int_a^b g(x) f(x)\,dx$ as an infinite sum:

$$F[f + \varepsilon] = F + \frac{\partial \big( g(a) f(a)\,dx + g(a + dx) f(a + dx)\,dx + \cdots \big)}{\partial f(a)}\,\varepsilon(a) + \frac{\partial \big( g(a) f(a)\,dx + g(a + dx) f(a + dx)\,dx + \cdots \big)}{\partial f(a + dx)}\,\varepsilon(a + dx) + \cdots$$

Since the differentials act like derivatives with respect to independent variables, each partial
derivative treats every term of the sum except its own as a constant, and this expression
simplifies to:

$$F[f + \varepsilon] = F + \frac{\partial \big( g(a) f(a)\,dx \big)}{\partial f(a)}\,\varepsilon(a) + \frac{\partial \big( g(a + dx) f(a + dx)\,dx \big)}{\partial f(a + dx)}\,\varepsilon(a + dx) + \cdots$$

This will simplify to:

$$F[f + \varepsilon] = F + g(a)\,\varepsilon(a)\,dx + g(a + dx)\,\varepsilon(a + dx)\,dx + \cdots$$

Each partial derivative has produced a factor of $dx$, and this is exactly the $dx$ of a Riemann
sum over the points $x = a,\, a + dx,\, \dots,\, b$. The sum therefore collapses to:

$$F[f + \varepsilon] = F + \int_a^b g(x)\,\varepsilon(x)\,dx + \cdots$$
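Because $F[f] = \int_a^b g(x) f(x)\,dx$ is linear in $f$, the first-order relation here is in fact exact: $F[f + \varepsilon] - F[f] = \int_a^b g(x)\,\varepsilon(x)\,dx$, with no higher-order remainder. A grid sketch of this identity (the particular $g$, $f$ and $\varepsilon$ are my own choices):

```python
import numpy as np

a, b, n = 0.0, 1.0, 100_000
dx = (b - a) / n
x = a + dx * np.arange(n)

g   = np.cos(x)          # the fixed function g(x)
f   = x**2               # the input function f(x)
eps = 1e-4 * np.sin(x)   # a small change in f

F      = np.sum(g * f) * dx            # F[f]       = sum g f dx
F_pert = np.sum(g * (f + eps)) * dx    # F[f + eps]
first  = np.sum(g * eps) * dx          # the first-order term, sum g eps dx

print(F_pert - F, first)  # identical up to rounding: F is linear in f
```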

And by comparing this with the Taylor expansion of $F[f + \varepsilon]$ to first order, written in the form
of Eq. 3 (with $\delta f = \varepsilon$), it can be seen that the functional derivative $\frac{\delta F}{\delta f(x)}$
is $g(x)$. Since the integral sign keeps the general form of the expression when it is expanded as an
infinite sum, and the derivatives cancel all terms that involve any other x-value point (such
terms are seen by the derivative as constants, because the change in the value of one point
does not affect any other), the functional derivative is almost like a normal derivative: just
take the derivative of the integrand of the functional $F[f]$, treating the variable $f(x)$ just like
any other normal variable.

As a final example, the functional derivative of $F[f] = \int_a^b g(x)\,f(x)^2\,dx$ would be $2\,g(x)\,f(x)$.
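This closing example can be verified by finite differences on a grid: bump $f$ at a single grid point $x_k$ by a small $h$. By the Taylor argument above, the response is $\partial F / \partial f(x_k) = 2\,g(x_k)\,f(x_k)\,dx$, so dividing out the grid spacing $dx$ recovers the functional derivative. A sketch (the grid and the choices $g = e^x$, $f = \sin x$ are mine):

```python
import numpy as np

a, b, n = 0.0, 1.0, 1000
dx = (b - a) / n
x = a + dx * np.arange(n)

g = np.exp(x)
f = np.sin(x)

def F(fv):
    """F[f] = integral of g f^2 dx, evaluated on the grid."""
    return np.sum(g * fv**2) * dx

k, h = 500, 1e-6                 # perturb the single point f(x_k)
f_pert = f.copy()
f_pert[k] += h

# The partial derivative carries a factor dx, so divide it out
# to obtain the functional derivative at x_k.
numeric = (F(f_pert) - F(f)) / (h * dx)
exact   = 2 * g[k] * f[k]

print(numeric, exact)  # agree to roughly the size of h
```

The division by $dx$ is the numerical counterpart of the step in the derivation where the $dx$ produced by each partial derivative is absorbed into the Riemann sum.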
