When it does, we write $f'(a; v)$ for the limit. Note that this definition makes sense because $a$ is an interior point. Indeed, under this hypothesis, $D$ contains a basic open set $U$ containing $a$, and so $a + hv$ will, for small enough $h$, fall into $U$, allowing us to speak of $f(a+hv)$. This derivative behaves exactly like the one-variable derivative and has analogous properties. For example, we have the following
Mean Value Theorem Assume $f'(a+tv; v)$ exists for all $0 \le t \le 1$. Then there exists $t_0 \in [0,1]$ such that $f'(a+t_0 v; v) = f(a+v) - f(a)$.
Proof. Put $\phi(t) = f(a+tv)$. By hypothesis, $\phi$ is differentiable at every $t$ in $[0,1]$, and $\phi'(t) = f'(a+tv; v)$. By the one-variable mean value theorem, there exists a $t_0$ such that
$$\phi'(t_0) = \frac{\phi(1) - \phi(0)}{1 - 0} = \phi(1) - \phi(0) = f(a+v) - f(a).$$
Done.
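The notes give no numerical illustration, but the theorem is easy to probe. Below is a minimal Python sketch (the sample field, the grid search, and all names are mine, not from the notes) that locates a $t_0$ with $f'(a+t_0v; v) \approx f(a+v)-f(a)$ for a sample scalar field:

```python
f = lambda p: p[0] ** 3 + p[1] ** 2          # sample scalar field on R^2
a, v = (0.0, 1.0), (1.0, 1.0)

def dir_deriv(p, h=1e-6):
    """Central-difference approximation of f'(p; v)."""
    fp = f((p[0] + h * v[0], p[1] + h * v[1]))
    fm = f((p[0] - h * v[0], p[1] - h * v[1]))
    return (fp - fm) / (2 * h)

target = f((a[0] + v[0], a[1] + v[1])) - f(a)   # f(a+v) - f(a)

# Scan t in [0, 1] for a point where the directional derivative hits the target.
ts = [k / 1000 for k in range(1001)]
t0 = min(ts, key=lambda t: abs(dir_deriv((a[0] + t * v[0], a[1] + t * v[1])) - target))
gap = abs(dir_deriv((a[0] + t0 * v[0], a[1] + t0 * v[1])) - target)
```

For this field $\phi(t) = t^3 + (1+t)^2$, so the theorem guarantees such a $t_0$ exists; the scan finds it to grid accuracy.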
When $v$ is a unit vector, $f'(a; v)$ is called the directional derivative of $f$ at $a$ in the direction of $v$.
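A directional derivative is straightforward to approximate numerically; the following sketch (function names and the sample field are mine) compares a finite-difference estimate with the exact value $\nabla f(a)\cdot v$ for a smooth field:

```python
import math

def directional_derivative(f, a, v, h=1e-6):
    """Central-difference approximation of f'(a; v) = lim (f(a+hv) - f(a))/h."""
    fp = f([ai + h * vi for ai, vi in zip(a, v)])
    fm = f([ai - h * vi for ai, vi in zip(a, v)])
    return (fp - fm) / (2 * h)

# Sample scalar field: f(x, y) = x^2 + 3y, so grad f = (2x, 3).
f = lambda p: p[0] ** 2 + 3 * p[1]
a = (1.0, 2.0)
v = (1 / math.sqrt(2), 1 / math.sqrt(2))      # unit vector

approx = directional_derivative(f, a, v)
exact = 2 * a[0] * v[0] + 3 * v[1]            # grad f(a) . v
```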
The disadvantage of this construction is that it forces us to study the change of $f$ in one direction at a time. So we revisit the one-dimensional definition and note that the condition for differentiability there is equivalent to requiring that there exists a constant $c$ $(= f'(a))$ such that
$$\lim_{h\to 0} \frac{f(a+h) - f(a) - ch}{h} = 0.$$
If we put $L(h) = f'(a)h$, then $L : \mathbb{R} \to \mathbb{R}$ is clearly a linear map. We generalize this idea in higher dimensions as follows:
Definition. Let $f : D \to \mathbb{R}^m$ ($D \subseteq \mathbb{R}^n$) be a vector field and $a$ an interior point of $D$. Then $f$ is differentiable at $x = a$ if and only if there exists a linear map $L : \mathbb{R}^n \to \mathbb{R}^m$ such that
$$\lim_{u\to 0} \frac{\|f(a+u) - f(a) - L(u)\|}{\|u\|} = 0. \tag{$*$}$$
Note that the norm $\|\cdot\|$ denotes the length of vectors in $\mathbb{R}^m$ in the numerator and in $\mathbb{R}^n$ in the denominator; this should not lead to any confusion, however.
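One can watch the limit in ($*$) happen numerically: for a concrete field and its candidate total derivative, the quotient shrinks linearly as $u \to 0$. A small sketch (the field, base point, and names are mine):

```python
import math

def f(x, y):
    return (x * x + y, x * y)

a = (1.0, 3.0)

# Candidate linear map L(u) = Df(a) u, with Df(a) = [[2x, 1], [y, x]] at a.
def L(u1, u2):
    return (2 * a[0] * u1 + u2, a[1] * u1 + a[0] * u2)

def quotient(t):
    """The quotient in (*) for u = t(1, -1)."""
    u = (t, -t)
    fa = f(*a)
    fu = f(a[0] + u[0], a[1] + u[1])
    Lu = L(*u)
    num = math.hypot(fu[0] - fa[0] - Lu[0], fu[1] - fa[1] - Lu[1])
    return num / math.hypot(*u)

qs = [quotient(10.0 ** -k) for k in range(1, 6)]   # shrinking u
```

Each step shrinks $\|u\|$ by a factor of ten, and the quotient shrinks along with it, as ($*$) demands.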
Lemma 1 Such an L, if it exists, is unique.
Proof. Suppose we have $L, M : \mathbb{R}^n \to \mathbb{R}^m$ satisfying ($*$) at $x = a$. Then
$$\lim_{u\to 0} \frac{\|L(u) - M(u)\|}{\|u\|} = \lim_{u\to 0} \frac{\|L(u) + f(a) - f(a+u) + (f(a+u) - f(a) - M(u))\|}{\|u\|}$$
$$\le \lim_{u\to 0} \frac{\|L(u) + f(a) - f(a+u)\|}{\|u\|} + \lim_{u\to 0} \frac{\|f(a+u) - f(a) - M(u)\|}{\|u\|} = 0.$$
Pick any non-zero $v \in \mathbb{R}^n$, and set $u = tv$, with $t \in \mathbb{R}$. Then the linearity of $L, M$ implies that $L(tv) = tL(v)$ and $M(tv) = tM(v)$. Consequently, we have
$$\lim_{t\to 0} \frac{\|L(tv) - M(tv)\|}{\|tv\|} = \frac{\|L(v) - M(v)\|}{\|v\|} = 0,$$
so that $L(v) = M(v)$ for every $v$, i.e. $L = M$.
and this tends to zero as $u$ tends to zero. Identifying linear maps from $\mathbb{R}^n$ to $\mathbb{R}$ with row vectors, we get $T_a f = 2a^t$.
(4) Here is another variation on the theme that the derivative of $x^2$ is $2x$. Let $f(X) = X^2 = X \cdot X$, where $X$ is an $n \times n$ matrix and the $\cdot$ denotes matrix multiplication. So $f$ is a function from $\mathbb{R}^{n^2}$ to $\mathbb{R}^{n^2}$, where we view the space of $n \times n$ matrices as $\mathbb{R}^{n^2}$. Again, just using bilinearity of matrix multiplication, we find $f(A+U) - f(A) = A \cdot U + U \cdot A + U^2$. Using the fact (not proven in this class) that $\|XY\| \le \|X\|\,\|Y\|$ for matrices $X$ and $Y$, where $\|X\| = \big(\sum_{i,j} |X_{i,j}|^2\big)^{1/2}$, we find that $T_A f$ is the linear map $U \mapsto A \cdot U + U \cdot A$. But this time we cannot rewrite this as $2A \cdot U$, since matrix multiplication is not commutative.
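A quick numerical check of this example: the finite difference $(f(A+hU)-f(A))/h$ converges to $A\cdot U + U\cdot A$, and this differs from $2A\cdot U$ unless $A$ and $U$ commute. A sketch with plain lists (random matrices and all helper names are mine):

```python
import random

n = 3

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def madd(X, Y, s=1.0):
    return [[X[i][j] + s * Y[i][j] for j in range(n)] for i in range(n)]

random.seed(0)
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
U = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

h = 1e-6
F_Ah = matmul(madd(A, U, h), madd(A, U, h))       # f(A + hU)
F_A = matmul(A, A)                                # f(A)
FD = [[(F_Ah[i][j] - F_A[i][j]) / h for j in range(n)] for i in range(n)]

TAU = madd(matmul(A, U), matmul(U, A))            # T_A f (U) = A.U + U.A
twoAU = madd(matmul(A, U), matmul(A, U))          # 2 A.U, for comparison

err = max(abs(FD[i][j] - TAU[i][j]) for i in range(n) for j in range(n))
noncomm = max(abs(TAU[i][j] - twoAU[i][j]) for i in range(n) for j in range(n))
```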
This concludes our list of examples where we can show directly that $T_a f$ exists. Theorem 1 (d) below will give a powerful criterion for the existence of $T_a f$ in many more examples. Before we leave this section, it will be useful to take note of the following:
Lemma 2 Let $f_1, \ldots, f_m$ be the component (scalar) fields of $f$. Then $f$ is differentiable at $a$ iff each $f_i$ is differentiable at $a$.
An easy consequence of this lemma is that, when $n = 1$, $f$ is differentiable at $a$ iff the following familiar-looking limit exists in $\mathbb{R}^m$:
$$\lim_{h\to 0} \frac{f(a+h) - f(a)}{h},$$
allowing us to suggestively write $f'(a)$ instead of $T_a f$. Clearly, $f'(a)$ is given by the vector $(f_1'(a), \ldots, f_m'(a))$, so that $(T_a f)(h) = f'(a)h$, for any $h \in \mathbb{R}$.
Proof. Let $f$ be differentiable at $a$. For each $v \in \mathbb{R}^n$, write $L_i(v)$ for the $i$-th component of $(T_a f)(v)$. Then $L_i$ is clearly linear. Since $f_i(a+u) - f_i(a) - L_i(u)$ is the $i$-th component of $f(a+u) - f(a) - L(u)$, the norm of the former is less than or equal to that of the latter. This shows that ($*$) holds with $f$ replaced by $f_i$ and $L$ replaced by $L_i$. So $f_i$ is differentiable for any $i$. Conversely, suppose each $f_i$ is differentiable. Put $L(v) = ((T_a f_1)(v), \ldots, (T_a f_m)(v))$. Then $L$ is a linear map, and by the triangle inequality,
$$\|f(a+u) - f(a) - L(u)\| \le \sum_{i=1}^{m} |f_i(a+u) - f_i(a) - (T_a f_i)(u)|,$$
and each summand, divided by $\|u\|$, tends to $0$ as $u \to 0$. Done.
2.2 Partial Derivatives
Let $\{e_1, \ldots, e_n\}$ denote the standard basis of $\mathbb{R}^n$. The directional derivatives along the unit vectors $e_j$ are of special importance.
Definition. Let $j \le n$. The $j$th partial derivative of $f$ at $x = a$ is $f'(a; e_j)$, denoted by $\frac{\partial f}{\partial x_j}(a)$ or $D_j f(a)$.
Just as in the case of the total derivative, it can be shown that $\frac{\partial f}{\partial x_j}(a)$ exists iff $\frac{\partial f_i}{\partial x_j}(a)$ exists for each $i \le m$.
Define $f : \mathbb{R}^3 \to \mathbb{R}^2$ by
$$f(x, y, z) = (e^{x\sin(y)}, z\cos y).$$
All the partial derivatives exist at any $a = (x_0, y_0, z_0)$. We will show this for $\frac{\partial f}{\partial y}$ and leave the others to the reader.
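As a numerical companion to this example, the partial derivative $\frac{\partial f}{\partial y}$ can be checked against the analytic formula $\big(x\cos(y)\,e^{x\sin(y)},\ -z\sin y\big)$; the sketch below (all names mine) compares a central difference with it at a sample point:

```python
import math

def f(x, y, z):
    return (math.exp(x * math.sin(y)), z * math.cos(y))

def partial_y(x, y, z, h=1e-6):
    """Central-difference approximation of df/dy, componentwise."""
    fp = f(x, y + h, z)
    fm = f(x, y - h, z)
    return tuple((p - m) / (2 * h) for p, m in zip(fp, fm))

x0, y0, z0 = 0.5, 1.2, -2.0
approx = partial_y(x0, y0, z0)
exact = (x0 * math.cos(y0) * math.exp(x0 * math.sin(y0)), -z0 * math.sin(y0))
err = max(abs(a - e) for a, e in zip(approx, exact))
```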
Proof. For functions $\phi, \psi$ defined near $0$, write $\phi \sim \psi$ if
$$\lim_{h\to 0} \frac{\phi(h) - \psi(h)}{h} = 0.$$
Check that $\sim$ is an equivalence relation. Then by definition, we have, for all $a \in D^0$ and $u$ in $\mathbb{R}^n$,
$$f(a+hu) \sim f(a) + hf'(a; u).$$
Then $f(a + h(v+v'))$ is equivalent to $f(a) + hf'(a; v+v')$ on the one hand, and to
$$f(a+hv) + hf'(a+hv; v') \sim f(a) + h\big(f'(a; v) + f'(a+hv; v')\big),$$
on the other. Moreover, the continuity hypothesis shows that $f'(a+hv; v')$ tends to $f'(a; v')$ as $h$ goes to $0$. Consequently, we get the equivalence of $f'(a; v+v')$ with $f'(a; v) + f'(a; v')$. Since they are independent of $h$, they must in fact be equal.
Finally, since $\{e_j \mid j \le n\}$ is a basis of $\mathbb{R}^n$, we can write any $v$ as $\sum_j \alpha_j e_j$, and by what we have just shown, $f'(a; v)$ is determined as $\sum_j \alpha_j \frac{\partial f}{\partial x_j}(a)$.
In the next section we will show that the conclusion of this lemma remains valid without
the continuity hypothesis if we assume instead that f has a total derivative at a.
The gradient of a scalar field $g$ at an interior point $a$ of its domain in $\mathbb{R}^n$ is defined to be the following vector in $\mathbb{R}^n$:
$$\nabla g(a) = \operatorname{grad} g(a) = \left( \frac{\partial g}{\partial x_1}(a), \ldots, \frac{\partial g}{\partial x_n}(a) \right).$$
Given a vector field $f$ as above, we can then put together the gradients of its component fields $f_i$, $1 \le i \le m$, and form the following important matrix, called the Jacobian matrix at $a$:
$$Df(a) = \left( \frac{\partial f_i}{\partial x_j}(a) \right)_{1 \le i \le m,\ 1 \le j \le n} \in M_{m,n}(\mathbb{R}).$$
The $i$-th row is given by $\nabla f_i(a)$, while the $j$-th column is given by $\frac{\partial f}{\partial x_j}(a)$.

2.3
In this section we collect the main properties of the total and partial derivatives.
Consider
$$\mathbb{R}^n \xrightarrow{\ f\ } \mathbb{R}^m \xrightarrow{\ g\ } \mathbb{R}^k, \qquad a \mapsto b = f(a).$$
Suppose $f$ is differentiable at $a$ and $g$ is differentiable at $b = f(a)$. Then the composite function $h = g \circ f$ is differentiable at $a$ and moreover,
$$T_a h = T_b g \circ T_a f.$$
In terms of the Jacobian matrices, this reads as
$$Dh(a) = Dg(b) \cdot Df(a) \in M_{k,n}(\mathbb{R}),$$
where $\cdot$ indicates a matrix product.
(f) Assume $T_a f$ and $T_a g$ exist, with $f, g$ scalar fields. Then
$$T_a(f+g) = T_a f + T_a g \quad \text{(additivity)},$$
$$T_a(fg) = g(a)\,T_a f + f(a)\,T_a g \quad \text{(product rule)},$$
$$T_a(f/g) = \frac{g(a)\,T_a f - f(a)\,T_a g}{g(a)^2} \ \text{ if } g(a) \neq 0 \quad \text{(quotient rule)}.$$
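The chain rule $Dh(a) = Dg(b)\cdot Df(a)$ lends itself to a numerical sanity check: build all three Jacobians by finite differences and compare the product to the Jacobian of the composite. A sketch (sample maps and names are mine):

```python
import math

def jac(f, a, m, h=1e-6):
    """m x len(a) Jacobian of f at a, by central differences."""
    n = len(a)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        ap = list(a); ap[j] += h
        am = list(a); am[j] -= h
        fp, fm = f(*ap), f(*am)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

f = lambda x, y: (x * y, x - y)            # R^2 -> R^2
g = lambda u, v: (math.sin(u) + v,)        # R^2 -> R^1
comp = lambda x, y: g(*f(x, y))            # h = g o f

a = (0.7, 0.3)
b = f(*a)
Df, Dg, Dh = jac(f, a, 2), jac(g, b, 1), jac(comp, a, 1)

# Dg(b) . Df(a), a 1 x 2 matrix product.
prod = [[sum(Dg[i][k] * Df[k][j] for k in range(2)) for j in range(2)] for i in range(1)]
err = max(abs(Dh[i][j] - prod[i][j]) for i in range(1) for j in range(2))
```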
The following corollary is an immediate consequence of the theorem, which we will make
use of in the next chapter on normal vectors and extrema.
Corollary 1 Let $g$ be a scalar field, differentiable at an interior point $b$ of its domain $D$ in $\mathbb{R}^n$, and let $v$ be any vector in $\mathbb{R}^n$. Then we have
$$\nabla g(b) \cdot v = g'(b; v).$$
Furthermore, let $\phi$ be a function from a subset of $\mathbb{R}$ into $D \subseteq \mathbb{R}^n$, differentiable at an interior point $a$ mapping to $b$. Put $h = g \circ \phi$. Then $h$ is differentiable at $a$ with
$$h'(a) = \nabla g(b) \cdot \phi'(a).$$
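The identity $h'(a) = \nabla g(b)\cdot \phi'(a)$ can be illustrated with a scalar field along a curve; in the sketch below (the field, the curve, and all names are mine), a finite difference of $h = g\circ\phi$ is compared with the gradient dotted into the curve's velocity:

```python
import math

def grad(g, b, h=1e-6):
    """Central-difference gradient of a scalar field g at b."""
    out = []
    for j in range(len(b)):
        bp = list(b); bp[j] += h
        bm = list(b); bm[j] -= h
        out.append((g(bp) - g(bm)) / (2 * h))
    return out

g = lambda p: p[0] ** 2 * p[1]                 # scalar field on R^2
phi = lambda t: [math.cos(t), math.sin(t)]     # curve into R^2
hcomp = lambda t: g(phi(t))                    # h = g o phi

t0 = 0.6
b = phi(t0)
phi_dot = [-math.sin(t0), math.cos(t0)]        # phi'(t0), known analytically

lhs = (hcomp(t0 + 1e-6) - hcomp(t0 - 1e-6)) / 2e-6      # h'(t0)
rhs = sum(gi * vi for gi, vi in zip(grad(g, b), phi_dot))  # grad g(b) . phi'(t0)
```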
Here is a simple observation before we begin the proof. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be a vector field such that $f_1(x,y) = \phi(x)$, $f_2(x,y) = \psi(y)$, with $\phi, \psi$ differentiable everywhere. Then, clearly, the Jacobian matrix $Df(x,y)$ is the diagonal matrix
$$\begin{pmatrix} \phi'(x) & 0 \\ 0 & \psi'(y) \end{pmatrix}.$$
Conversely, suppose we know a priori that $Df$ is diagonal (at all points), say
$$Df(x,y) = \begin{pmatrix} \phi(x) & 0 \\ 0 & \psi(y) \end{pmatrix}.$$
Then $\frac{\partial f_1}{\partial x} = \phi(x)$, $\frac{\partial f_1}{\partial y} = 0 = \frac{\partial f_2}{\partial x}$, $\frac{\partial f_2}{\partial y} = \psi(y)$, so $f_1(x,y) = \int \phi(x)\,dx$ and $f_2(x,y) = \int \psi(y)\,dy$. So $f_1$ is independent of $y$ and $f_2$ is independent of $x$.
Proof of main theorem. (a) It suffices to show that $(T_a f_i)(v) = f_i'(a; v)$ for each $i \le m$, and this is clear if $v = 0$ (both sides are zero by definition). So assume $v \neq 0$. By definition,
$$\lim_{u\to 0} \frac{\|f_i(a+u) - f_i(a) - (T_a f_i)(u)\|}{\|u\|} = 0.$$
Taking $u = hv$ and letting $h \to 0$, we get
$$f_i'(a; v) = \lim_{h\to 0} \frac{f_i(a+hv) - f_i(a)}{h} = (T_a f_i)(v).$$
(b) By part (a), each partial derivative exists at $a$ (since $f$ is assumed to be differentiable at $a$). The matrix of the linear map $T_a f$ is determined by its effect on the standard basis vectors. Let $\{e_i' \mid 1 \le i \le m\}$ denote the standard basis in $\mathbb{R}^m$. Then we have, by definition,
$$(T_a f)(e_j) = \sum_{i=1}^{m} (T_a f_i)(e_j)\, e_i' = \sum_{i=1}^{m} \frac{\partial f_i}{\partial x_j}(a)\, e_i'.$$
We can write
$$f(a+u) - f(a) = \sum_{j=1}^{n} u_j \frac{\partial f}{\partial x_j}(a).$$
Lemma 4 Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear map. Then there exists $c > 0$ such that $\|Tv\| \le c\|v\|$ for any $v \in \mathbb{R}^n$.
Proof of Lemma. Let $A$ be the matrix of $T$ relative to the standard bases. Put $C = \max_j \{\|T(e_j)\|\}$. If $v = \sum_{j=1}^{n} \alpha_j e_j$, then, by the Cauchy–Schwarz inequality,
$$\|T(v)\| = \Big\| \sum_{j=1}^{n} \alpha_j T(e_j) \Big\| \le C \sum_{j=1}^{n} |\alpha_j| \cdot 1 \le C \Big( \sum_{j=1}^{n} |\alpha_j|^2 \Big)^{1/2} \Big( \sum_{j=1}^{n} 1 \Big)^{1/2} \le C\sqrt{n}\,\|v\|.$$
Put
$$c = \sup\Big\{ \frac{\|Tv\|}{\|v\|} \,\Big|\, v \in \mathbb{R}^n \setminus \{0\} \Big\} = \sup\{\, \|Tv\| \mid \|v\| = 1 \,\}.$$
Note that the Lemma implies that the first set is bounded, so the sup exists. The second set is even compact (the continuous image of the unit sphere), so the sup is attained, i.e. there is always a vector $v$ of norm one for which $\|Tv\|$ is the optimal constant $c$.
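Lemma 4's bound $\|Tv\| \le C\sqrt{n}\,\|v\|$ is usually far from tight, which is easy to see numerically: sampling many unit vectors estimates the operator norm from below, and it never exceeds $C\sqrt{n}$. A sketch (random matrix and names are mine):

```python
import math, random

random.seed(1)
m, n = 3, 4
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]   # matrix of T

def apply(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

# C = max_j ||T(e_j)|| = largest column norm of A.
C = max(norm([A[i][j] for i in range(m)]) for j in range(n))

# Estimate sup{ ||Tv|| : ||v|| = 1 } over random unit vectors.
best = 0.0
for _ in range(2000):
    v = [random.gauss(0, 1) for _ in range(n)]
    nv = norm(v)
    best = max(best, norm(apply(A, [x / nv for x in v])))
```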
Proof of (e) (contd.).
Write $L = T_a f$, $M = T_b g$, $N = M \circ L$. To show: $T_a h = N$.
Define $F(x) = f(x) - f(a) - L(x-a)$, $G(y) = g(y) - g(b) - M(y-b)$ and $H(x) = h(x) - h(a) - N(x-a)$. Then we have
$$\lim_{x\to a} \frac{\|F(x)\|}{\|x-a\|} = 0 = \lim_{y\to b} \frac{\|G(y)\|}{\|y-b\|}.$$
So we need to show:
$$\lim_{x\to a} \frac{\|H(x)\|}{\|x-a\|} = 0.$$
But
$$H(x) = g(f(x)) - g(b) - M(L(x-a)).$$
Since $L(x-a) = f(x) - f(a) - F(x)$, we get
$$H(x) = \big[\,g(f(x)) - g(b) - M(f(x) - f(a))\,\big] + M(F(x)) = G(f(x)) + M(F(x)).$$
Therefore it suffices to prove:
(i) $\displaystyle\lim_{x\to a} \frac{\|G(f(x))\|}{\|x-a\|} = 0$ and (ii) $\displaystyle\lim_{x\to a} \frac{\|M(F(x))\|}{\|x-a\|} = 0$.
By Lemma 4, we have $\|M(F(x))\| \le c\|F(x)\|$, for some $c > 0$. Then
$$\lim_{x\to a} \frac{\|M(F(x))\|}{\|x-a\|} \le c \lim_{x\to a} \frac{\|F(x)\|}{\|x-a\|} = 0,$$
yielding (ii).
On the other hand, we know $\displaystyle\lim_{y\to b} \frac{\|G(y)\|}{\|y-b\|} = 0$. So we can find, for every $\epsilon > 0$, a $\delta > 0$ such that $\|G(f(x))\| < \epsilon\|f(x) - b\|$ if $\|f(x) - b\| < \delta$. But since $f$ is continuous, $\|f(x) - b\| < \delta$ whenever $\|x-a\| < \delta_1$, for a small enough $\delta_1 > 0$. Hence
$$\|G(f(x))\| < \epsilon\|f(x) - b\| = \epsilon\|F(x) + L(x-a)\| \le \epsilon\|F(x)\| + \epsilon\|L(x-a)\|.$$
Since $\displaystyle\lim_{x\to a} \frac{\|F(x)\|}{\|x-a\|}$ is zero, we get
$$\lim_{x\to a} \frac{\|G(f(x))\|}{\|x-a\|} \le \epsilon \lim_{x\to a} \frac{\|L(x-a)\|}{\|x-a\|}.$$
Applying Lemma 4 again, we get $\|L(x-a)\| \le c'\|x-a\|$, for some $c' > 0$. Now (i) follows easily.
(f) (i) We can think of $f + g$ as the composite $h = s \circ (f,g)$, where $(f,g)(x) = (f(x), g(x))$ and $s(u,v) = u + v$ (sum). Set $b = (f(a), g(a))$. Applying (e), we get
$$T_a(f+g) = T_b(s) \circ T_a(f,g) = T_a f + T_a g.$$
Done. The proofs of (ii) and (iii) are similar and will be left to the reader.
QED.
Remark. It is important to take note of the fact that a vector field $f$ may be differentiable at $a$ without the partial derivatives being continuous. We have a counterexample already when $n = m = 1$, as seen by taking
$$f(x) = x^2 \sin\frac{1}{x} \quad \text{if } x \neq 0,$$
and $f(0) = 0$. This is differentiable everywhere. The only question is at $x = 0$, where the relevant limit $\lim_{h\to 0} \frac{f(h)}{h}$ is clearly zero, so that $f'(0) = 0$. But for $x \neq 0$, we have by the product rule,
$$f'(x) = 2x\sin\frac{1}{x} - \cos\frac{1}{x},$$
which has no limit as $x \to 0$; so $f'$ is not continuous at $0$.
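Both halves of this counterexample can be observed numerically: the difference quotients at $0$ shrink like $|h|$, while $f'(x)$ keeps hitting values near $\pm 1$ arbitrarily close to $0$ (at points where $\cos(1/x) = \pm 1$). A short sketch (sample points chosen by me):

```python
import math

def f(x):
    return x * x * math.sin(1 / x) if x != 0 else 0.0

# |f(h)/h| <= |h|, so the quotients at 0 tend to 0 and f'(0) = 0.
quotients = [abs(f(10.0 ** -k) / 10.0 ** -k) for k in range(1, 8)]

def fprime(x):
    """f'(x) for x != 0, from the product and chain rules."""
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)

# Near 0, f' oscillates: cos(1/x) = 1 gives f' close to -1, cos(1/x) = -1 gives +1.
x_plus = 1 / (2 * math.pi * 100)          # 1/x = 200*pi
x_minus = 1 / (math.pi * 201)             # 1/x = 201*pi
```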
2.4
Let $f$ be a scalar field, and $a$ an interior point in its domain $D \subseteq \mathbb{R}^n$. For $j, k \le n$, we may consider the second partial derivative
$$\frac{\partial^2 f}{\partial x_j \partial x_k}(a) = \frac{\partial}{\partial x_j}\left( \frac{\partial f}{\partial x_k} \right)(a),$$
when it exists. It is called the mixed partial derivative when $j \neq k$, in which case it is of interest to know whether we have the equality
$$\frac{\partial^2 f}{\partial x_j \partial x_k}(a) = \frac{\partial^2 f}{\partial x_k \partial x_j}(a). \tag{3.4.1}$$
Proposition 1 Suppose $\frac{\partial^2 f}{\partial x_j \partial x_k}$ and $\frac{\partial^2 f}{\partial x_k \partial x_j}$ both exist near $a$ and are continuous there. Then the equality (3.4.1) holds.
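For a smooth field the equality (3.4.1) is easy to confirm numerically by nesting central differences in both orders; a sketch (the field and the sample point are mine) also compares against the analytic mixed partial:

```python
import math

def f(x, y):
    return math.exp(x * y) + x * math.sin(y)

h = 1e-4

def d_dx(g, x, y):
    return (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, x, y):
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

x0, y0 = 0.4, -0.8
f_xy = d_dx(lambda x, y: d_dy(f, x, y), x0, y0)   # d/dx (df/dy)
f_yx = d_dy(lambda x, y: d_dx(f, x, y), x0, y0)   # d/dy (df/dx)

# Analytically, both orders give e^{xy}(1 + xy) + cos y.
exact = math.exp(x0 * y0) * (1 + x0 * y0) + math.cos(y0)
```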