2012/2013
Version 21 January 2013
Effort requirement
K 2.1.
Limits, continuity
K 2.1.
K 2.2.
K 2.2.
Discrete compounding
K.3.1.
Exponential function
K.2.3. K.3.2.
K.2.3.
K.3.2.
Continuous compounding
K.3.2.
NPV
K.3.2.
K.3.3.
Rules of logarithms
K.3.3.
Summations
Supplemental
material
3,
1
, , e
2
Example 2:
let $S = \{0, 1, 2, \ldots, 10\}$
An element may belong to a set (or: may be a member of a set). Thus
$x \in S$
Example 3:
integer $1 \in S$
integer 1
Irrational numbers: $\pi$ (= 3.141...) and $e$ (= 2.71...)
However, $0.5 \notin S$, and $\pi$ and $e \notin S$
Definition: Sub-set
Example 4: let T = {0,1}
T is a sub-set of S: $T \subset S$
$\subset$: inclusion symbol
or: $S \supset T$
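The membership and sub-set statements above can be tried out on finite sets. A minimal sketch, assuming Python's built-in `set` type as a stand-in for the sets S and T of Examples 2 and 4:

```python
# Sets from Examples 2 and 4 (finite sets only)
S = set(range(0, 11))   # S = {0, 1, 2, ..., 10}
T = {0, 1}

print(1 in S)     # membership: 1 is an element of S
print(0.5 in S)   # 0.5 is not an element of S
print(T <= S)     # T is a sub-set of S
print(S >= T)     # equivalently, S contains T
```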
3
x
Domain: (0, )
Range: (2, )
y= 2 +
Example 12:
y= +
Domain: (0, )
Range: ( , )
10
Example 14
$\lim_{x \to \infty} \frac{k}{mx + h} = 0$ if $m \neq 0$
11
x a
x a
x a
and the limit equals the value of the function at that point.
Example 15
Let $f(x) = \frac{1}{x - 8}$
The left-hand and right-hand limit are unequal:
$\lim_{x \to 8^-} \frac{1}{x - 8} = -\infty$ and $\lim_{x \to 8^+} \frac{1}{x - 8} = +\infty$
so that the function is discontinuous at x=8
Example 16
Let $f(x) = \frac{1}{(x - 8)^2}$
$\lim_{x \to 8^-} \frac{1}{(x - 8)^2} = \infty$ and $\lim_{x \to 8^+} \frac{1}{(x - 8)^2} = \infty$
So that the function has the same left-hand and right-hand limit at x=8.
However the function f(.) is not defined at x=8
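The behaviour in Examples 15 and 16 can be made visible by evaluating both functions ever closer to x = 8 from either side. This is only a numerical illustration, not a proof; the function names `f15` and `f16` are made up for this sketch.

```python
def f15(x):
    return 1 / (x - 8)        # Example 15

def f16(x):
    return 1 / (x - 8) ** 2   # Example 16

for h in (1e-1, 1e-3, 1e-6):
    # approach x = 8 from the left (8 - h) and from the right (8 + h)
    print(h, f15(8 - h), f15(8 + h), f16(8 - h), f16(8 + h))
```

The values of `f15` grow without bound with opposite signs on the two sides, while those of `f16` grow without bound with the same sign.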
12
$x = \frac{1}{3}(y - 3)$, or after relabelling $y = \frac{1}{3}(x - 3)$,
is the inverse function of $y = 3(x + 1)$
$y = 3(x + 1)$ and $y = \frac{1}{3}(x - 3)$ are inverse functions,
thus $3\left[\frac{1}{3}(x - 3) + 1\right] = x$
and $\frac{1}{3}\left[3(x + 1) - 3\right] = x$
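That the two functions undo each other can be checked numerically by composing them in both orders. A small sketch; the names `f` and `f_inv` are chosen here for illustration only.

```python
def f(x):
    return 3 * (x + 1)

def f_inv(y):
    return (y - 3) / 3

for x in (-2.0, 0.0, 1.5, 10.0):
    # both compositions should return the starting value
    print(x, f_inv(f(x)), f(f_inv(x)))
```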
14
Maximum:
$\max_{x} \; 3 - 2(x - 9)^2 = 3$
It says that we maximize the function $3 - 2(x - 9)^2$ with respect to x.
The function value (at x=9) is equal to 3.
15
Definition:
Secant line: line between the points $(x_A, y_A)$ and $(x_B, y_B)$
where $y_A = f(x_A)$ and $y_B = f(x_B)$:
$y' - y_A = \frac{f(x_B) - f(x_A)}{x_B - x_A}\,(x' - x_A)$
For any point $(x', y')$ on this line, $x'$ is within $[x_A, x_B]$ and $y'$ is
within $[y_A, y_B]$
16
Definition:
A function is strictly convex in an interval if for any distinct points
$x_A$ and $x_B$ in that interval, and for all values $\lambda$ in the open interval
(0,1)
$f(\lambda x_A + (1 - \lambda) x_B) < \lambda f(x_A) + (1 - \lambda) f(x_B)$
$\frac{x^a}{y^a} = \left(\frac{x}{y}\right)^a \quad (y \neq 0)$
21
and lim b x = 1 if | b |= 1
x
Example 30:
$\lim_{x \to -\infty} b^x = 0$ if $|b| > 1$
Which is equivalent to
$\lim_{x \to \infty} \frac{1}{b^x} = 0$ if $|b| > 1$
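$\sum_{i=1}^{5} x_i = x_1 + x_2 + x_3 + x_4 + x_5$
$\sum_{j=1}^{5} x_j = x_1 + x_2 + x_3 + x_4 + x_5$
Example 31:
$\sum_{i=1}^{6} i^2 = 1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2 = 91$
$\sum_{j=0}^{2} \frac{1}{(j + 1)(j + 3)} = \frac{1}{3} + \frac{1}{8} + \frac{1}{15} = \frac{21}{40}$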
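Both sums of Example 31 can be reproduced with a few lines of code. A minimal sketch, using exact fractions so that the second result is not rounded:

```python
from fractions import Fraction

squares = sum(i ** 2 for i in range(1, 7))   # i = 1, ..., 6
fractions_sum = sum(Fraction(1, (j + 1) * (j + 3)) for j in range(0, 3))  # j = 0, 1, 2

print(squares)        # 91
print(fractions_sum)  # 21/40
```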
$\sum_{i=1}^{n} (a_i + b_i) = \sum_{i=1}^{n} a_i + \sum_{i=1}^{n} b_i$
Homogeneity property:
$\sum_{i=1}^{n} c\,a_i = c \sum_{i=1}^{n} a_i$
So that:
$\sum_{i=1}^{n} c = nc$
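$\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} = \sum_{j=1}^{n} a_{1j} + \sum_{j=1}^{n} a_{2j} + \ldots + \sum_{j=1}^{n} a_{mj}$
or
$\sum_{j=1}^{n} \sum_{i=1}^{m} a_{ij} = \sum_{i=1}^{m} a_{i1} + \sum_{i=1}^{m} a_{i2} + \ldots + \sum_{i=1}^{m} a_{in}$
thus
$\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} = \sum_{j=1}^{n} \sum_{i=1}^{m} a_{ij}$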
X t +=
1
1 + X t
k
Definition:
Exponential e (irrational number)
$e = \lim_{k \to \infty} \left(1 + \frac{1}{k}\right)^k = 2.71828182845...$
Example 32:
With an annual interest rate of 100 percent, the value of $100 after one
year of continuously compounded interest is
$\lim_{k \to \infty} \left(1 + \frac{1}{k}\right)^k \cdot \$100 = \$271.83$
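The limit in Example 32 can be approached numerically by compounding more and more often within the year. A minimal sketch; the helper `compound` and its default arguments are assumptions made for this illustration.

```python
import math

def compound(k, r=1.0, principal=100.0):
    """Value after one year when interest r is compounded k times."""
    return principal * (1 + r / k) ** k

for k in (1, 12, 365, 1_000_000):
    print(k, round(compound(k), 2))

print(round(100 * math.e, 2))  # value under continuous compounding
```

Yearly compounding (k = 1) gives $200; as k grows the value rises towards 100e.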
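$\left(1 + \frac{r}{k}\right)^k = \left[\left(1 + \frac{r}{k}\right)^{k/r}\right]^r = \left[\left(1 + \frac{1}{k/r}\right)^{k/r}\right]^r$
$e^r = \lim_{k/r \to \infty} \left[\left(1 + \frac{1}{k/r}\right)^{k/r}\right]^r = \lim_{m \to \infty} \left[\left(1 + \frac{1}{m}\right)^m\right]^r = (2.71828...)^r$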
Thus:
$X(t + 1) = e^{r} X(t)$
t is denoted in parentheses: moments measured in continuous time.
$X(t + n) = e^{rn} X(t)$
n: any real number
$\ln(e^z) = z$
$e^{\ln x} = x$
$\ln(xy) = \ln(x) + \ln(y)$
$\ln(x / y) = \ln(x) - \ln(y)$
$\ln(x^z) = z \ln(x)$
29
Graphically assess the domain and the range of f, its limits for $x \to 0$, $x \to A$, $x \to B$, $x \to C$ and its
continuity at these points. By graphically assessing, we mean that the exact function values do
not matter, but the procedure followed should be clear.
Solution:
The domain is the set on which the function is defined. f appears to be defined everywhere,
except on the interval [-1,-0.5] and at x=0, where f has no values. So its domain is
$\mathbb{R} \setminus ([-1, -0.5] \cup \{0\})$ (this means the set $\mathbb{R}$, which denotes the real numbers, with the interval
[-1,-0.5] and the point x=0 cut out).
The range is the set of outcomes of the function. In this case all numbers from $\mathbb{R}$ seem to be
reached except for the small interval at about $[-2, -1]$. So the range of the function is
$\mathbb{R} \setminus [-2, -1]$. Note that we can't tell from the picture whether this is equal to the co-domain
of f, i.e., if $f: X \to Y$, whether $Y = \mathrm{range}(f)$. For instance, Y here could be the whole of $\mathbb{R}$,
or it could be $\mathbb{R} \setminus [-2, -1]$. In the latter case, the function would be called onto, or surjective.
A function which is onto or surjective has range equal to its codomain.
30
x C
We turn to point B. The function has a jump here, so it is not continuous at B. Furthermore,
from the left the function tends to a different value than from the right (from the right it goes
to, say, 2, whereas from the left it appears to go to something like 1.2). Therefore a limit at B
is not defined.
At point A the function also has a jump, so it is again discontinuous. However from the left
and from the right it tends towards the same value, so it does have a limit: lim f ( x) 1 . Do
x A
Finally, at point 0 something strange happens. The function is not defined at this point.
However, it does have a left-hand and a right-hand limit. From the left it goes to , from the
right to . These values are different, so the limit at 0 is not defined. Even if it were, we
could not decide whether f is continuous at 0, because the condition lim f ( x) = f (0) is not
x 0
$\lim_{n \to \infty} X_{t+n} = \lim_{n \to \infty} X_t (1 + r)^n$. Will the value of $X_{t+n}$ ever equal zero?
Solution:
$\lim_{n \to \infty} X_t (0.98)^n = 0$, however, 0 is never actually reached.
Exercise 5.
Exercise 3.1.6. from Klein.
Assume a firm's net profits are $50 million in 2000 and are expected to grow at a steady rate
of 6% per year through the end of the next decade. How much would you expect the firm to
earn in 2001? In 2003? Now assume that the firm's profits have been growing at 6% since
1997. If a negative value of n can be interpreted as the number of time periods before t, how
much did the company earn in 1998? Graph the path of income growth between 1998 and
2003 and explain why the curve gets steeper over time.
Solution:
In 2001: $1.06^{1} \cdot 50 = 53$
In 2003: $1.06^{3} \cdot 50 = 59.5508...$
In 1998: $1.06^{-2} \cdot 50 = 44.4998...$
Graph:
This actually looks very much like a straight line, so for sake of clarity, we also draw a graph
up to 2030:
The line gets steeper because at each point the yearly increase is 6% of profits, and profits are always
increasing. Therefore the absolute growth per year is always increasing.
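The three answers, and the whole path from 1998 to 2003, follow from one formula with n counted relative to the year 2000. A minimal sketch; the names `base_year`, `base_profit` and `profit` are introduced here for illustration.

```python
base_year, base_profit, growth = 2000, 50.0, 0.06

def profit(year):
    # n may be negative: years before the base year
    n = year - base_year
    return base_profit * (1 + growth) ** n

for year in range(1998, 2004):
    print(year, round(profit(year), 4))
```

The differences between consecutive years grow over time, which is why the plotted curve gets steeper.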
Exercise 6.
Exercises 3.2.1. and 3.2.5. from Klein.
3.2.1. Using the formula of multiple compounding in one period, $X_{t+1} = X_t \left(1 + \frac{r}{k}\right)^k$, calculate
the value of $X_{t+1}$ for the following values of r and k. Assume that $X_t = 20$.
a) r = 8%, k = 4
b) r = 0.5%, k = 2
c) r = 10%, k = 365
3.2.5. Assuming that X(t) = 75, determine the value of X(t + n) for the following values of r
and n using the continuous compounding formula:
$X(t + n) = X(t) e^{rn}$
a) n = 3, r = 9%
b) n = 0.5, r = 2.5%
c) n = -2, r = 11%
d) n = 0.25, r = 6%
e) n = 3, r = 9%
f) n = 0.75, r = -3%
Solution:
We do one from each, the point should be clear:
3.2.1. a) $X_{t+1} = X_t \left(1 + \frac{r}{k}\right)^k = 20 \cdot (1.02)^4 = 21.649$
3.2.5. a) $X(t + n) = X(t) e^{rn} = 75 \cdot e^{0.27} = 98.2473$
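The remaining parts of both exercises follow the same pattern as part a). A minimal sketch with two helper functions, `discrete` and `continuous` (names chosen here, not taken from Klein), that evaluate the two compounding formulas:

```python
import math

def discrete(x_t, r, k):
    """k compounding moments within one period."""
    return x_t * (1 + r / k) ** k

def continuous(x_t, r, n):
    """Continuous compounding over n periods (n may be negative)."""
    return x_t * math.exp(r * n)

print(round(discrete(20, 0.08, 4), 3))      # 3.2.1 a)
print(round(continuous(75, 0.09, 3), 4))    # 3.2.5 a)
```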
Exercise 7.
Exercise 3.3.1 from Klein.
a) $10^{\log_{10}(100)}$
b) $\ln e^x - e^{\ln(x)}$
c) $\log_{10}\left(\frac{1}{x^5}\right)$
33
e) ln 5 [ x y ]2
e
Solution:
a) $10^{\log_{10}(100)} = 100$
b) $\ln(e^x) - e^{\ln(x)} = x - x = 0$
c) $\log_{10}\left(\frac{1}{x^5}\right) = -5\log_{10}(x)$
d) Cannot be simplified further.
e) $\ln(e^{a + bx + cz}) = a + bx + cz$
f) log(3 x) 4 Unclear, has two interpretations:
Either:
log((4
=
x)3 ) 3log(4
=
x) 3(2 log(2) + log( x))
Or:
(log(3 x)) 4 Which cannot be simplified further.
1
ln( 5 [ x y ]2 ) =ln(e 5 ) + ln([ x y ]2 ) =5 + 2(ln( x ) + ln( y )) =
g)
e
5 + 2( ln( x) ln( y ))
* Exercise 8.
$\ln u = \sum_{i=1}^{n} \alpha_i \ln q_i$
where u is the index of utility, $q_i$ is the quantity of good i, and $0 < \alpha_i < 1$. Transform the
function back to its original form, where U is utility.
Solution:
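$U = e^{\log(U)} = e^{\sum_{i=1}^{n} \alpha_i \log(q_i)} = \prod_{i=1}^{n} e^{\alpha_i \log(q_i)} = \prod_{i=1}^{n} \left(e^{\log(q_i)}\right)^{\alpha_i} = \prod_{i=1}^{n} q_i^{\alpha_i}$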
34
* Exercise 1.
Where $f(x) = x^2$. Would your answer change if
$f(x) = \begin{cases} x^2 & \text{if } x \neq 4 \\ 0 & \text{if } x = 4 \end{cases}$
Solution:
Since f(.) is clearly continuous, we want to find that $\lim_{x \to 4} f(x) = 4^2 = 16$. Recall the formal
definition of a limit. It should hold simultaneously that $\lim_{x \to 4^+} f(x) = 4^2 = 16$ and
$\lim_{x \to 4^-} f(x) = 4^2 = 16$. We consider the left-hand limit, the right-hand is similar. The idea is now
that for any number close to 16 we can find numbers close to, but smaller than 4, such that
their square is even closer to 16. That is, given a number $\varepsilon > 0$, we have to find a $\delta > 0$ such
that for numbers x such that $4 - \delta < x < 4$, we find $|x^2 - 16| < \varepsilon$. This means that we have to
find $\delta$ as a function of $\varepsilon$.
If we can make this work for $\varepsilon < 1$, we can obviously also make it work for larger $\varepsilon$, so we
restrict our attention to that case.
Let's try $\delta = \frac{1}{8}\varepsilon$. Then, because $4 - \delta < x < 4$, we know that
$|x^2 - 16| < |(4 - \delta)^2 - 16| = |16 - 8\delta + \delta^2 - 16| = |-8\delta + \delta^2| < 8\delta = \varepsilon$   (1)
The first inequality follows by the monotonicity of f ($|x^2 - 16|$ is at its largest near $4 - \delta$), the
second inequality from the fact that $\delta < \varepsilon < 1$.
Now, comparing the left-hand side and the right-hand side of (1), we see that we have the
definition of the limit, so we are done.
Note that we have nowhere looked at f(4), since x<4. Therefore, our result will not change if
the function has a different value at 4.
Finally, a picture to elucidate $\varepsilon$ and $\delta$.
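The choice of a distance of one eighth of the tolerance can be spot-checked numerically for the left-hand limit of x² at 4. This is a sanity check on sample points, not a proof; the function names and the sampling scheme are assumptions of this sketch.

```python
def delta_for(eps):
    # choice made in the solution; it relies on eps < 1
    return eps / 8

def check_left_limit(eps, samples=1000):
    d = delta_for(eps)
    for s in range(1, samples + 1):
        x = 4 - d * s / (samples + 1)   # points with 4 - delta < x < 4
        if abs(x ** 2 - 16) >= eps:
            return False
    return True

print(all(check_left_limit(eps) for eps in (0.9, 0.1, 1e-3, 1e-6)))
```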
Exercise 2.
Show from the definitions that for $f(x) = -x^2$
a) f(.) is a concave function,
b) f(.) is a homogeneous function (of a degree to be determined by you).
Solution:
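2a. Recall the definition of concavity: f(.) is concave if and only if, for $0 < \lambda < 1$,
$\lambda f(x_A) + (1 - \lambda) f(x_B) \leq f(\lambda x_A + (1 - \lambda) x_B)$
or
$\lambda f(x_A) + (1 - \lambda) f(x_B) - f(\lambda x_A + (1 - \lambda) x_B) \leq 0$   (1)
For $f(x) = -x^2$ the three terms are:
$\lambda f(x_A) = -\lambda x_A^2$   (2a)
$(1 - \lambda) f(x_B) = -(1 - \lambda) x_B^2$   (2b)
$f(\lambda x_A + (1 - \lambda) x_B) = -(\lambda x_A + (1 - \lambda) x_B)^2$   (2c)
{so that $-f(\lambda x_A + (1 - \lambda) x_B) = +(\lambda x_A + (1 - \lambda) x_B)^2$}
Writing out equation (1), by substitution (2a), (2b) and (2c) in this equation, we get:
$-\lambda x_A^2 - (1 - \lambda) x_B^2 + (\lambda x_A + (1 - \lambda) x_B)^2 =$
$-\lambda x_A^2 - (1 - \lambda) x_B^2 + \lambda^2 x_A^2 + 2\lambda(1 - \lambda) x_A x_B + (1 - \lambda)^2 x_B^2 =$
$-\lambda (1 - \lambda)(x_A - x_B)^2$   (3)
The last term consists of a minus, two positive terms {$\lambda > 0$ and $1 - \lambda > 0$}, and a square
($(x_A - x_B)^2 \geq 0$), so that the last term is not positive,
which proves our result.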
2b. Recall the definition of homogeneity:
The function f(.) is homogeneous of degree k if, for $\lambda > 0$, $f(\lambda x) = \lambda^k f(x)$
We write out the definition:
$f(\lambda x) = -(\lambda x)^2 = -\lambda^2 x^2 = \lambda^2 f(x)$
So we find that f(.) is homogeneous of degree 2.
* Exercise 3.
Suppose f(x) and g(x) are both monotonous functions on the same domain. Show that
a. f(.) + g(.) is also monotonous.
b. And if f(x) and g(x) are both concave, can you show that f(.) + g(.) is concave?
c. And if f(x) and g(x) are both homogeneous, of degrees l and m respectively, can you then
show that f ( x) g ( x) is homogeneous, and of what degree?
Solution:
3a. Monotonicity:
We know that, if $x_A < x_B$, $f(x_A) < f(x_B)$ and $g(x_A) < g(x_B)$, for any
$x_A, x_B \in \mathrm{Dom}(f) \cap \mathrm{Dom}(g)$
(meaning that $x_A$ and $x_B$ are both in the domain of both f(.) and g(.)).
Now we have to prove that, if we let $h(x) = f(x) + g(x)$, then $h(x_A) < h(x_B)$.
Thus:
$h(x_A) = f(x_A) + g(x_A) < f(x_B) + g(x_B) = h(x_B)$
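3b. Concavity:
$\lambda f(x_A) + (1 - \lambda) f(x_B) \leq f(\lambda x_A + (1 - \lambda) x_B)$
$\lambda g(x_A) + (1 - \lambda) g(x_B) \leq g(\lambda x_A + (1 - \lambda) x_B)$
so that, with $h(x) = f(x) + g(x)$,
$\lambda h(x_A) + (1 - \lambda) h(x_B) = \lambda (f(x_A) + g(x_A)) + (1 - \lambda)(f(x_B) + g(x_B)) =$
$\lambda f(x_A) + (1 - \lambda) f(x_B) + \lambda g(x_A) + (1 - \lambda) g(x_B) \leq f(\lambda x_A + (1 - \lambda) x_B) + g(\lambda x_A + (1 - \lambda) x_B) =$
$h(\lambda x_A + (1 - \lambda) x_B)$
That proves what we want.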
3c. Homogeneity:
We know that $f(\lambda x) = \lambda^l f(x)$ and $g(\lambda x) = \lambda^m g(x)$.
Now, if we define $k(x) = f(x) g(x)$, then we want $k(\lambda x) = \lambda^p k(x)$ for some p
still to be determined.
$k(\lambda x) = f(\lambda x) g(\lambda x) = \lambda^l f(x) \lambda^m g(x) = \lambda^l \lambda^m f(x) g(x) = \lambda^{l + m} k(x)$.
So we see that the function k(.) is homogeneous of degree l+m.
Exercise 4.
Write out the following:
a) $\sum_{i=2}^{4} \sum_{j=1}^{4} i \cdot 2^i$
b) $\sum_{i=1}^{3} \sum_{j=1}^{2} (i + j)^2$
Solution:
a)
$\sum_{i=2}^{4} \sum_{j=1}^{4} i \cdot 2^i = \sum_{i=2}^{4} (i 2^i + i 2^i + i 2^i + i 2^i) = \sum_{i=2}^{4} 4 (i 2^i) = 4 \sum_{i=2}^{4} i 2^i = 4 (2 \cdot 2^2 + 3 \cdot 2^3 + 4 \cdot 2^4) = 4 \cdot 96 = 384$
Summing in the other order gives the same result:
$\sum_{j=1}^{4} \sum_{i=2}^{4} i \cdot 2^i = \sum_{j=1}^{4} (2 \cdot 2^2 + 3 \cdot 2^3 + 4 \cdot 2^4) = \sum_{j=1}^{4} 96 = 4 \cdot 96 = 384$
b)
$\sum_{i=1}^{3} \sum_{j=1}^{2} (i + j)^2 = \sum_{i=1}^{3} ((i + 1)^2 + (i + 2)^2) = ((1 + 1)^2 + (1 + 2)^2) + ((2 + 1)^2 + (2 + 2)^2) + ((3 + 1)^2 + (3 + 2)^2) = 4 + 9 + 9 + 16 + 16 + 25 = 79$
Again, one can show that
$\sum_{i=1}^{3} \sum_{j=1}^{2} (i + j)^2 = \sum_{j=1}^{2} \sum_{i=1}^{3} (i + j)^2$
Note that always the indices have dropped out after you have evaluated the sums. They are
only useful within the sum and for that reason are sometimes called dummies.
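Double sums like these are easy to check by brute force, in both orders of summation. A minimal sketch, with the index ranges as read from the solution (i = 2..4 and j = 1..4 in part a, i = 1..3 and j = 1..2 in part b); the six terms of part b, 4 + 9 + 9 + 16 + 16 + 25, add up to 79.

```python
# part a): the summand does not depend on j, so each term is counted four times
a_ij = sum(i * 2 ** i for i in range(2, 5) for j in range(1, 5))
a_ji = sum(i * 2 ** i for j in range(1, 5) for i in range(2, 5))

# part b)
b_ij = sum((i + j) ** 2 for i in range(1, 4) for j in range(1, 3))
b_ji = sum((i + j) ** 2 for j in range(1, 3) for i in range(1, 4))

print(a_ij, a_ji)   # 384 384
print(b_ij, b_ji)   # 79 79
```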
Exercise 5.
Let k be some constant and f(.) some function. Show, or at least make clear, that
$\sum_{i=1}^{n} k f(i) = k \sum_{i=1}^{n} f(i)$ and $\sum_{i=1}^{n} k = nk$.
Solution:
$\sum_{i=1}^{n} k f(i) = k f(1) + k f(2) + \ldots + k f(n) = k (f(1) + f(2) + \ldots + f(n)) = k \sum_{i=1}^{n} f(i)$
$\sum_{i=1}^{n} k = \underbrace{(k + k + \ldots + k)}_{n \text{ times}} = nk$
39
* Exercise 1.
The following are all graphs of functions $f: X \to Y$. Determine whether they are one-to-one
(i.e. injective) and whether they are onto (i.e. surjective).
a)
Solution:
We check for injectivity. Clearly the function does not take the same value twice, so the
function is injective.
We check for surjectivity. The range for the function is only about $[3, \infty)$, while it is given
that $Y = \mathbb{R}$. So the function is not surjective.
b)
Solution:
We check for injectivity. We see that, for instance, both at x=2 and at x=-2 f(x)=1, so the
function is not injective.
We check for surjectivity. The range for the function is only about $[3, \infty)$, while it is given
that $Y = \mathbb{R}$. So the function is not surjective.
c)
Solution:
We check for injectivity. We see that the function does not take the same value twice, so it is
injective.
We check for surjectivity. We see that the range of the function is $\mathbb{R} = Y$, so it is surjective.
d)
Solution:
We check for injectivity. We see that for instance at x=0 and at x=40 the function takes the
value f(x)=0, so it is not injective.
We check for surjectivity. We see that the range of the function is $\mathbb{R} = Y$, so it is surjective.
$\sum_{i=1}^{2} \sum_{j=1}^{4} (2i + 3j)$
Solution:
$\sum_{i=1}^{2} \sum_{j=1}^{4} (2i + 3j) = \sum_{i=1}^{2} ((2i + 3) + (2i + 6) + (2i + 9) + (2i + 12)) = \sum_{i=1}^{2} (8i + 30) = (8 + 30) + (16 + 30) = 84$
Of course, not all of these brackets are necessary, they are mostly to show what comes from
what.
* Exercise 3.
Determine whether the following function is homogeneous. If it is, determine the degree.
$f(x) = h(x^3)$, where $h(x)$ is homogeneous of degree 7. (Hint: if you find this confusing, first
try it with $h(x) = x^7$, which is a homogeneous function of degree 7.)
Solution:
First the general problem:
We check if $f(tx) = t^m f(x)$ for some m.
* Exercise 4.
Of course you all know intuitively what the derivative of a function f(x) is: it is the very small
change that occurs in f(x) when you very slightly change x. The picture illustrates this. The
blue line is the graph of the function f(x). If you take $\Delta x$ ever smaller, you will approach ever
more closely the slope of the red line of the derivative.
For this approaching ever more closely, we naturally think of the limit (in fact, it was in the
context of derivatives that the notion of limit was first developed).
We define: $\frac{d}{dx} f(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$. (Note that $f(x + \Delta x) - f(x) = \Delta f(x)$.)
Now you must prove that for $f(x) = x^2$ it indeed holds that $\frac{d}{dx} f(x) = 2x$ by writing out the
limit. Do this in three steps: first, before evaluating the limit, observe that
$f(x + \Delta x) - f(x) = 2(x \cdot \Delta x) + (\Delta x)^2$. Then, still before evaluating the limit, show that
$\frac{f(x + \Delta x) - f(x)}{\Delta x}$ simplifies to $2x + \Delta x$. Then evaluate the limit $\lim_{\Delta x \to 0} (2x + \Delta x)$ directly from
the definition (either doing it from the left or the right hand side is enough).
If you succeed, you have proved a rule that you've already known for a long time. Isn't that
fun!
Solution:
So, let the fun begin:
We first write out the definition. To emphasize that $\Delta x$ is a single number and not a
multiplication of $\Delta$ and x, I will now define $\Delta x = h$. (So this is just giving it a new name).
$\frac{d}{dx} x^2 = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h}$
Now, before we touch the limit, we just apply algebra to what is inside the limit. This is
allowed, because we're basically not changing the expression over which we take the limit.
$\lim_{h \to 0} \frac{(x + h)^2 - x^2}{h} = \lim_{h \to 0} \frac{x^2 + 2xh + h^2 - x^2}{h} = \lim_{h \to 0} \frac{2xh + h^2}{h} = \lim_{h \to 0} (2x + h)$
Strictly speaking, for our last step, we should observe that h
44
then f(x) will get ever closer to c, as x gets closer to a. This was formalised thus: if you say
how close to c you want to get, then I should be able to give a distance from a so that you will
indeed get that close or closer to c. You saying how close you want to get is setting an $\varepsilon$, me
providing you with this distance is picking the $\delta$.
Let's apply this to our case. The function over which we are taking a limit is 2x+h, where x is
now just some given number. Intuitively, we would expect that as h goes to zero, this function
will just go to 2x. So let's make that our guess for the limit.
Now, since this function is very simple, we can do the right hand limit and the left hand limit
at the same time.
You provide me with $\varepsilon > 0$ and I decide to pick $\delta = \frac{1}{2}\varepsilon$ (Why? It turns out that it works. This is
basically backward engineering.) Then if I only look at h whose distance to 0 is less than $\delta$,
i.e. $|0 - h| = |h| < \delta$, I hope to find that my distance to 2x is smaller than $\varepsilon$, i.e. $|f(h) - 2x| < \varepsilon$.
Let's see:
$|f(h) - 2x| = |2x + h - 2x| = |h| < \delta = \frac{1}{2}\varepsilon < \varepsilon$
Comparing the left hand side and the right hand side, we see that we indeed are close enough,
so the limit is as we specified. This proves that $\frac{d}{dx} f(x) = 2x$. Yay!
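The claim that the difference quotient of x² equals 2x + h, and therefore approaches 2x, can be watched numerically. A minimal sketch at the arbitrarily chosen point x = 3; the function names are made up for this illustration.

```python
def f(x):
    return x ** 2

def difference_quotient(x, h):
    return (f(x + h) - f(x)) / h

x = 3.0
for h in (1.0, 0.1, 1e-3, 1e-6, -1e-6):
    # each value should be close to 2x + h, and hence close to 2x = 6 for small h
    print(h, difference_quotient(x, h))
```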
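Two vectors $x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$ and $y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$, $x, y \in \mathbb{R}^n$, can be added:
$x + y = \begin{pmatrix} x_1 + y_1 \\ \vdots \\ x_n + y_n \end{pmatrix}$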
Example 3
$x, y \in \mathbb{R}^3$
$x = \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix}$ and $y = \begin{pmatrix} 1 \\ 0 \\ 5 \end{pmatrix}$ so that
46
47
Example 4
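If $x = \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix}$ then:
a) $3x = \begin{pmatrix} 6 \\ 3 \\ 3 \end{pmatrix}$
b) $-3x = \begin{pmatrix} -6 \\ -3 \\ -3 \end{pmatrix}$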
48
for which $a \in \mathbb{R}$,
is at a line through the origin $O = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ and the point $\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$
Example 5
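Any $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2a \\ 3a \end{pmatrix}$
for which $a \in \mathbb{R}$ is on a line that goes through the origin $O = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ and
the point $\begin{pmatrix} 2 \\ 3 \end{pmatrix}$
Thus, for instance $\begin{pmatrix} 6 \\ 9 \end{pmatrix}$ is at this line (a = 3).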
49
$\|x\| = \sqrt{x_1^2 + x_2^2}$
Implication:
The length of vector ax:
$\|ax\| = \sqrt{a^2 x_1^2 + a^2 x_2^2} = |a| \sqrt{x_1^2 + x_2^2} = |a| \, \|x\|$
Note that we take the absolute value of a, because the length cannot be
a negative number.
If $x = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$ then
a) $\|x\| = \sqrt{4 + 9} = \sqrt{13}$ (length of x)
b) $\|3x\| = 3\sqrt{4 + 9} = 3\sqrt{13}$ (length of 3x)
c) $\|-3x\| = 3\sqrt{4 + 9} = 3\sqrt{13}$ (length of -3x)
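The rule that scaling a vector by a scales its length by |a| can be checked on the example vector (2, 3). A minimal sketch with plain Python lists; `length` and `scale` are helper names chosen here.

```python
import math

def length(v):
    return math.sqrt(sum(c * c for c in v))

def scale(a, v):
    return [a * c for c in v]

x = [2, 3]
print(length(x), length(scale(3, x)), length(scale(-3, x)))
```

The last two numbers coincide: 3x and -3x point in opposite directions but are equally long.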
50
b 2 + 9= 5
b 2 + 9= 1
51
2
2
Definition:
In $\mathbb{R}^2$ the unit vectors $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ have a length of 1
52
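Vectors $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$
$x_1^2 + x_2^2 = 1$
Thus the locus of this circle is the origin $O = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$
Consequence:
In $\mathbb{R}^2$ the unit vectors $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ are at the unit circle.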
Definition:
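A vector $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ is at a circle with locus $c = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ and with a non-negative
radius ($r \geq 0$) if it satisfies the restriction:
$(x_1 - c_1)^2 + (x_2 - c_2)^2 = r^2$
Example 8
$(x_1 + 1)^2 + (x_2 - 2)^2 = 25$ describes a circle with locus $\begin{pmatrix} -1 \\ 2 \end{pmatrix}$ and radius 5.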
53
54
Example 9
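In $\mathbb{R}^2$ the unit vectors $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ are orthogonal:
$e_1 \cdot e_2 = 1 \cdot 0 + 0 \cdot 1 = 0$
Definition:
In $\mathbb{R}^2$ the unit vectors $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ are referred to as
orthonormal vectors (they are perpendicular and they have a length
of 1).
Example 10
In $\mathbb{R}^2$ the unit vector $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $a \, e_2 = \begin{pmatrix} 0 \\ a \end{pmatrix}$, $a \in \mathbb{R}$, are orthogonal.
Reason:
$e_1 \cdot (a \, e_2) = 1 \cdot 0 + 0 \cdot a = 0$
Consequence: $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ is orthogonal to any point at the line
$a \, e_2 = \begin{pmatrix} 0 \\ a \end{pmatrix}$ for $a \in \mathbb{R}$
In $\mathbb{R}^2$ any point at the line $a \, e_1 = \begin{pmatrix} a \\ 0 \end{pmatrix}$, $a \in \mathbb{R}$, and any point at the line $b \, e_2 = \begin{pmatrix} 0 \\ b \end{pmatrix}$, $b \in \mathbb{R}$, are orthogonal:
Reason:
$(a \, e_1) \cdot (b \, e_2) = a \cdot 0 + 0 \cdot b = 0$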
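Orthogonality is tested with the inner product, which takes one line to compute. A minimal sketch with plain lists; the helper `dot` and the sample values of a and b are chosen here for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

e1, e2 = [1, 0], [0, 1]
print(dot(e1, e2))   # 0: the unit vectors are orthogonal

# points on the two axes, a * e1 and b * e2, for a few sample values
for a in (-2, 0.5, 3):
    for b in (-1, 4):
        print(a, b, dot([a, 0], [0, b]))
```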
56
for k1 = 0 and k2 = 0
57
58
2 = k1 * (1),1 = k2 * 2
So that
k1 =
1/ 2, k2 =
1/ 2, k1 k2
Reason 2)
Reason 3)
2 (1) + 1 2 =0
=
x
1
22 +=
12 1
5
59
and
=
y
1
12 + =
22 1
5
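Thus: $z = k_1 \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + k_2 \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$
for which $k_1, k_2 \in \mathbb{R}$
Thus: $\mathbb{R}^2$ can be spanned by two linearly independent vectors!!!!!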
Consequence 2:
Let $x, y \in \mathbb{R}^2$. We write both vectors as $x = \begin{pmatrix} a \\ b \end{pmatrix}$ and $y = \begin{pmatrix} c \\ d \end{pmatrix}$
Both vectors are linearly dependent if
$ad - bc = 0$
Both vectors are linearly independent if
$ad - bc \neq 0$
Proof: both vectors x and y are linearly dependent if
1) $a = k_1 c$ so that $k_1 = \frac{a}{c}$
2) $b = k_2 d$ so that $k_2 = \frac{b}{d}$
3) Linear dependent so that $k_1 = k_2$ or $\frac{a}{c} = \frac{b}{d}$ or $ad - bc = 0$
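The criterion ad - bc = 0 translates directly into a test for two vectors in the plane. A minimal sketch with integer entries so that the comparison with zero is exact; the function name is an assumption of this example.

```python
def linearly_dependent(x, y):
    (a, b), (c, d) = x, y
    return a * d - b * c == 0

print(linearly_dependent((2, 3), (6, 9)))   # True: (6, 9) = 3 * (2, 3)
print(linearly_dependent((1, 0), (0, 1)))   # False: the unit vectors
```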
60
Definition:
The (2 X 2) matrix A:
$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
It consists of two rows and two columns.
Definitions:
$a_{ij}$ is an element of the matrix A.
The diagonal of the matrix consists of the elements $a_{11}$ and $a_{22}$.
The elements $a_{11}$ and $a_{22}$ are referred to as the diagonal elements
of the matrix A.
The elements $a_{21}$ and $a_{12}$ are referred to as the off-diagonal elements
of the matrix A.
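The (2 X 2) matrix A can be (post)multiplied by a (2 X 1)-vector x.
$Ax = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} a_{11} x_1 + a_{12} x_2 \\ a_{21} x_1 + a_{22} x_2 \end{pmatrix}$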
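a) $\begin{pmatrix} 1 & 3 \\ -2 & 5 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 11 \\ 11 \end{pmatrix}$
b) $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$
c) $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 3 \\ 2 \end{pmatrix}$
d) $\begin{pmatrix} 5 & 0 \\ 0 & 5 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 10 \\ 15 \end{pmatrix}$
e) $\begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 7 \\ 14 \end{pmatrix}$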
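$AB = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} b_{11} + a_{12} b_{21} & a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} & a_{21} b_{12} + a_{22} b_{22} \end{pmatrix}$
Note that
a) $AB \neq BA$ (in general)
b) $A(BC) = (AB)C$
$A = \begin{pmatrix} 1 & 3 \\ -2 & 5 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix}$
$A + B = \begin{pmatrix} 1 & 3 \\ -2 & 5 \end{pmatrix} + \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix} = \begin{pmatrix} 2 & 5 \\ -2 & 9 \end{pmatrix}$
The product of both matrices (number of columns of A equals the
number of rows of B):
$AB = \begin{pmatrix} 1 & 3 \\ -2 & 5 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 0 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 14 \\ -2 & 16 \end{pmatrix}$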
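The product rule and the remark that the order of multiplication matters can be tried out on the example matrices. A minimal sketch with nested lists; the entry -2 in A is an assumption (the minus sign is inferred from the printed results 11, 11 and 1, 14, 2, 16), and `matmul` is a helper written for this illustration.

```python
def matmul(A, B):
    # entry (i, j) is the inner product of row i of A and column j of B
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 3], [-2, 5]]   # sign of -2 assumed, see above
B = [[1, 2], [0, 4]]

print(matmul(A, B))   # [[1, 14], [-2, 16]]
print(matmul(B, A))   # [[-3, 13], [-8, 20]]
```

The two products differ, which illustrates note a).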
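$Ax = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = x_1 \begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix} + x_2 \begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix}$
Thus Ax is a linear combination of the vectors $\begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix}$ and $\begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix}$
Example 19:
a) $\begin{pmatrix} 1 & 3 \\ -2 & 5 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = 2 \begin{pmatrix} 1 \\ -2 \end{pmatrix} + 3 \begin{pmatrix} 3 \\ 5 \end{pmatrix}$
b) $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = 2 \begin{pmatrix} 1 \\ 0 \end{pmatrix} + 3 \begin{pmatrix} 0 \\ 1 \end{pmatrix}$
c) $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = 2 \begin{pmatrix} 0 \\ 1 \end{pmatrix} + 3 \begin{pmatrix} 1 \\ 0 \end{pmatrix}$
d) $\begin{pmatrix} 5 & 0 \\ 0 & 5 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = 2 \begin{pmatrix} 5 \\ 0 \end{pmatrix} + 3 \begin{pmatrix} 0 \\ 5 \end{pmatrix}$
e) $\begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = 2 \begin{pmatrix} 2 \\ 4 \end{pmatrix} + 3 \begin{pmatrix} 1 \\ 2 \end{pmatrix}$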
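$Ix = x$
Where $I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$
$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$
$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = 2 \begin{pmatrix} 1 \\ 0 \end{pmatrix} + 3 \begin{pmatrix} 0 \\ 1 \end{pmatrix}$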
$Ax = b$
can be rewritten as:
$A^{-1} A x = A^{-1} b$
or
$x = A^{-1} b$
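$A = \begin{pmatrix} a & c \\ b & d \end{pmatrix}$
equals
$A^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -c \\ -b & a \end{pmatrix}$
Proof:
$A^{-1} A = \frac{1}{ad - bc} \begin{pmatrix} d & -c \\ -b & a \end{pmatrix} \begin{pmatrix} a & c \\ b & d \end{pmatrix} = \frac{1}{ad - bc} \begin{pmatrix} da - bc & dc - dc \\ -ba + ab & -bc + ad \end{pmatrix} = \frac{1}{ad - bc} \begin{pmatrix} da - bc & 0 \\ 0 & -bc + ad \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$
Note: the inverse of matrix A does not exist if $ad - bc = 0$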
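The closed-form inverse is short enough to code directly. A minimal sketch using exact fractions and the layout of the notes (first column a, b; second column c, d); the function name and the sample matrix with entries -2, 4, 3, -1 are assumptions made for this illustration.

```python
from fractions import Fraction

def inverse_2x2(a, c, b, d):
    """Inverse of [[a, c], [b, d]], computed with exact arithmetic."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("ad - bc = 0: the inverse does not exist")
    f = Fraction(1, det)
    return [[f * d, -f * c], [-f * b, f * a]]

print(inverse_2x2(-2, 4, 3, -1))   # entries 1/10, 2/5, 3/10, 1/5
```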
$A = \begin{pmatrix} -2 & 4 \\ 3 & -1 \end{pmatrix}$
1 0
A=
0 1
0 1
3 1
3/ 2 1
0 5
3/10 1/ 5
0 1
Round 4: Row 1 new: row 1 - 2 times row 1
1 0
1/10 2 / 5
1
A=
A
=
3/10 1/ 5
0 1
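Check:
$A A^{-1} = \begin{pmatrix} -2 & 4 \\ 3 & -1 \end{pmatrix} \begin{pmatrix} 1/10 & 2/5 \\ 3/10 & 1/5 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$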
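$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ 1 & 2 & 3 \end{pmatrix}, \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}$
Solution:
$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ 1 & 2 & 3 \end{pmatrix} \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix} = \begin{pmatrix} 1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3 & 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 \\ 4 \cdot 1 + 5 \cdot 2 + 6 \cdot 3 & 4 \cdot 4 + 5 \cdot 5 + 6 \cdot 6 \\ 7 \cdot 1 + 8 \cdot 2 + 9 \cdot 3 & 7 \cdot 4 + 8 \cdot 5 + 9 \cdot 6 \\ 1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3 & 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 \end{pmatrix} = \begin{pmatrix} 14 & 32 \\ 32 & 77 \\ 50 & 122 \\ 14 & 32 \end{pmatrix}$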
Exercise 3.
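For the matrices $A = \begin{pmatrix} 1 & 4 \\ 4 & 5 \end{pmatrix}$, $B = \begin{pmatrix} 8 & 0 \\ 3 & 1 \end{pmatrix}$, show that $AB \neq BA$. Do the same, but without
calculation for $\begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \\ 5 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \\ 5 \end{pmatrix}^T = (1 \; 2 \; 3 \; 4 \; 5)$.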
Solution:
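Observe that A is a symmetric matrix. It doesn't help your calculations, but you should know
what it is.
$AB = \begin{pmatrix} 1 & 4 \\ 4 & 5 \end{pmatrix} \begin{pmatrix} 8 & 0 \\ 3 & 1 \end{pmatrix} = \begin{pmatrix} 20 & 4 \\ 47 & 5 \end{pmatrix} \neq \begin{pmatrix} 8 & 32 \\ 7 & 17 \end{pmatrix} = \begin{pmatrix} 8 & 0 \\ 3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 4 \\ 4 & 5 \end{pmatrix} = BA$
For the second pair of vectors, observe that one order of the multiplication gives rise to a 5x5
matrix, while the other leads to a 1x1 matrix, usually called a number. Clearly those cannot be
the same. In fact the version with the number as an outcome, $(1 \; 2 \; 3 \; 4 \; 5) \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \\ 5 \end{pmatrix}$,
is just another way of writing the inner product. (Check that it's the same thing!)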
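3b).
Show for the matrix $A = \begin{pmatrix} 4 & 3 \\ 1 & 5 \end{pmatrix}$, with inverse $A^{-1} = \begin{pmatrix} \frac{5}{17} & -\frac{3}{17} \\ -\frac{1}{17} & \frac{4}{17} \end{pmatrix}$, that
$(A^{-1})^T = (A^T)^{-1}$ holds.
Solution:
We compute directly $(A^{-1})^T = \begin{pmatrix} \frac{5}{17} & -\frac{3}{17} \\ -\frac{1}{17} & \frac{4}{17} \end{pmatrix}^T = \begin{pmatrix} \frac{5}{17} & -\frac{1}{17} \\ -\frac{3}{17} & \frac{4}{17} \end{pmatrix}$.
If this is equal to $(A^T)^{-1}$, then it must hold that $I = A^T (A^T)^{-1} = A^T (A^{-1})^T$. We check:
$A^T (A^{-1})^T = \begin{pmatrix} 4 & 1 \\ 3 & 5 \end{pmatrix} \begin{pmatrix} \frac{5}{17} & -\frac{1}{17} \\ -\frac{3}{17} & \frac{4}{17} \end{pmatrix} = \frac{1}{17} \begin{pmatrix} 20 - 3 & -4 + 4 \\ 15 - 15 & -3 + 20 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$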
Yippee.
Exercise 4.
Write out the following sets of equations in matrix form. Solve by sweeping.
a)
$x_1 + x_2 + 3x_3 + 7x_4 = 5$
$2x_1 + 2x_2 + 4x_3 + 6x_4 = 4$
$x_1 - x_2 + 3x_3 - 6x_4 = 0$
$x_1 - 2x_3 + 1x_4 = -4$
b)
$x_1 + 3x_2 - x_3 = 3$
$x_1 + 2x_2 - 5x_3 = 4$
$x_2 + 4x_3 = -1$
c)
$x_1 + 3x_2 - x_3 = 3$
$x_1 + 2x_2 - 5x_3 = 4$
$x_2 + 4x_3 = 0$
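$\begin{pmatrix} 1 & 1 & 3 & 7 \\ 2 & 2 & 4 & 6 \\ 1 & -1 & 3 & -6 \\ 1 & 0 & -2 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 5 \\ 4 \\ 0 \\ -4 \end{pmatrix}$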
If you multiply this out, you indeed get the equations back (try it!). This is of course why
matrix multiplication is defined the way it is: it makes it very easy to write sets of equations
compactly. However, for actual solving we will write things down slightly differently.
Suppose we subtract the first equation $x_1 + x_2 + 3x_3 + 7x_4 = 5$ from the second
$2x_1 + 2x_2 + 4x_3 + 6x_4 = 4$, then we get:
$2x_1 + 2x_2 + 4x_3 + 6x_4 = 4$
$x_1 + x_2 + 3x_3 + 7x_4 = 5$ (subtract)
$x_1 + x_2 + x_3 - x_4 = -1$
Notice that the x's don't change, we only have to look at the value in front of them. That's
why we write down the set as follows:
$\left(\begin{array}{cccc|c} 1 & 1 & 3 & 7 & 5 \\ 2 & 2 & 4 & 6 & 4 \\ 1 & -1 & 3 & -6 & 0 \\ 1 & 0 & -2 & 1 & -4 \end{array}\right)$
Now we rework this augmented matrix, as it is called, to get something that we can interpret
quickly. Each time we rewrite the matrix, we indicate this with the sign ~ rather than =, as the
matrices are not equal. Notice that we not only subtract, add and multiply, but also
interchange rows. We get:
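$\left(\begin{array}{cccc|c} 1 & 1 & 3 & 7 & 5 \\ 2 & 2 & 4 & 6 & 4 \\ 1 & -1 & 3 & -6 & 0 \\ 1 & 0 & -2 & 1 & -4 \end{array}\right) \sim \left(\begin{array}{cccc|c} 0 & 1 & 5 & 6 & 9 \\ 0 & 2 & 8 & 4 & 12 \\ 0 & -1 & 5 & -7 & 4 \\ 1 & 0 & -2 & 1 & -4 \end{array}\right) \sim \left(\begin{array}{cccc|c} 1 & 0 & -2 & 1 & -4 \\ 0 & 1 & 4 & 2 & 6 \\ 0 & 0 & 9 & -5 & 10 \\ 0 & 0 & 1 & 4 & 3 \end{array}\right)$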
Here in the first step we subtracted the last row from the first row once, twice from the second
row and once from the third. In the second we interchanged the first and the last row, then
divided the second row by 2 and then added it to the third row and subtracted it from the
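fourth. From now on we don't describe our steps, as this is very cumbersome and confusing.
We continue:
$\left(\begin{array}{cccc|c} 1 & 0 & -2 & 1 & -4 \\ 0 & 1 & 4 & 2 & 6 \\ 0 & 0 & 9 & -5 & 10 \\ 0 & 0 & 1 & 4 & 3 \end{array}\right) \sim \left(\begin{array}{cccc|c} 1 & 0 & -2 & 1 & -4 \\ 0 & 1 & 4 & 2 & 6 \\ 0 & 0 & 1 & 4 & 3 \\ 0 & 0 & 9 & -5 & 10 \end{array}\right) \sim \left(\begin{array}{cccc|c} 1 & 0 & -2 & 1 & -4 \\ 0 & 1 & 4 & 2 & 6 \\ 0 & 0 & 1 & 4 & 3 \\ 0 & 0 & 0 & 41 & 17 \end{array}\right) \sim \left(\begin{array}{cccc|c} 1 & 0 & -2 & 1 & -4 \\ 0 & 1 & 4 & 2 & 6 \\ 0 & 0 & 1 & 4 & 3 \\ 0 & 0 & 0 & 1 & \frac{17}{41} \end{array}\right)$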
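From the last row we read off $x_4 = \frac{17}{41}$. We
use this in the row above: $x_3 + 4\left(\frac{17}{41}\right) = 3 \Rightarrow x_3 = 3 - 4\left(\frac{17}{41}\right) = \frac{55}{41}$. We use this again in
the line above that: $x_2 + 4\left(\frac{55}{41}\right) + 2\left(\frac{17}{41}\right) = 6 \Rightarrow x_2 = 6 - 4\left(\frac{55}{41}\right) - 2\left(\frac{17}{41}\right) = -\frac{8}{41}$ and finally
$x_1 - 2\left(\frac{55}{41}\right) + \left(\frac{17}{41}\right) = -4 \Rightarrow x_1 = -4 + 2\left(\frac{55}{41}\right) - \left(\frac{17}{41}\right) = -\frac{71}{41}$.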
Note that once we have all zeros in the lower corner, life is quite easy. Of course, in general,
sweeping is just a systematic way of solving by substitution. The operations we used to
change the augmented matrix (addition, subtraction, etc.) are called elementary operations.
They are very useful in the study of linear algebra.
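Sweeping is mechanical enough to hand to a short program. The following is a minimal sketch of row reduction with exact fractions, applied to system a); the function name `sweep` is chosen here, and the minus signs in the coefficients are an assumption (they are inferred from the intermediate rows and the fractions in the worked solution).

```python
from fractions import Fraction

def sweep(aug):
    """Reduce an augmented matrix [A | b] to reduced row echelon form."""
    m = [[Fraction(v) for v in row] for row in aug]
    rows, cols = len(m), len(m[0]) - 1
    pivot_row = 0
    for col in range(cols):
        # find a row with a non-zero entry in this column and swap it up
        pivot = next((r for r in range(pivot_row, rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        # scale the pivot row so that the pivot equals 1
        p = m[pivot_row][col]
        m[pivot_row] = [v / p for v in m[pivot_row]]
        # subtract multiples of the pivot row from all other rows
        for r in range(rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return m

system_a = [
    [1, 1, 3, 7, 5],
    [2, 2, 4, 6, 4],
    [1, -1, 3, -6, 0],    # signs assumed, see above
    [1, 0, -2, 1, -4],
]
solution = [row[-1] for row in sweep(system_a)]
print(solution)   # x1, x2, x3, x4 as exact fractions
```

Unlike the hand calculation, this sketch also clears the entries above each pivot, so the solution can be read off directly from the last column.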
b)
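$\begin{pmatrix} 1 & 3 & -1 \\ 1 & 2 & -5 \\ 0 & 1 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 4 \\ -1 \end{pmatrix}$
$\left(\begin{array}{ccc|c} 1 & 3 & -1 & 3 \\ 1 & 2 & -5 & 4 \\ 0 & 1 & 4 & -1 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 3 & -1 & 3 \\ 0 & 1 & 4 & -1 \\ 0 & 1 & 4 & -1 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 3 & -1 & 3 \\ 0 & 1 & 4 & -1 \\ 0 & 0 & 0 & 0 \end{array}\right)$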
also: one for each x1 such that x1 = x2 .) We have one free variable. We could pick any of our
xs to be the free one and express the others in terms of it. We pick x3 . Then we get
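$\begin{pmatrix} 1 & 3 & -1 \\ 1 & 2 & -5 \\ 0 & 1 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 4 \\ 0 \end{pmatrix}$
$\left(\begin{array}{ccc|c} 1 & 3 & -1 & 3 \\ 1 & 2 & -5 & 4 \\ 0 & 1 & 4 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 3 & -1 & 3 \\ 0 & 1 & 4 & -1 \\ 0 & 1 & 4 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 3 & -1 & 3 \\ 0 & 1 & 4 & -1 \\ 0 & 0 & 0 & 1 \end{array}\right)$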
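Exercise 5. Find vectors b and c that pick out element $a_{ij}$ from matrix $A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$,
i.e. $a_{ij} = b \, A \, c$.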
Solution:
0
th spot. These es are called unit vectors. Then:
a1 j
0
a11 a1n
(0 1 0) 1 =
(0 1 0) aij =
b Ac =
aij
m1 amn
0
a
mj
In general: pre-multiplication with a unit vector gives you a row from the matrix, post-multiplication gives you a column.
* Exercise 6.
Find vectors b and c such that b A c gives the average of all elements of A.
Solution:
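Consider $b = (1 \; 1 \; \cdots \; 1)$, $c = \frac{1}{mn} \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}$, then:
$b \, A \, c = (1 \; 1 \; \cdots \; 1) \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \frac{1}{mn} \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}$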
This is the sum of all the elements in A divided by the number of elements in A, i.e. the
76
1
could also have been in front of the first vector, or shared between
mn
* Exercise 7.
Write out the matrix product A B in terms of their typical elements aij , bij , assuming A and B
are conformable. I.e., find the typical element cij of C= A B , where we write C = cij .
Solution:
$c_{ij} = (a_{i1} \; a_{i2} \; \cdots \; a_{in}) \begin{pmatrix} b_{1j} \\ \vdots \\ b_{nj} \end{pmatrix} = a_{i1} b_{1j} + \ldots + a_{in} b_{nj} = \sum_{k=1}^{n} a_{ik} b_{kj}$
* Exercise 8.
x
Show that the length formula =
Solution:
The picture shows a general vector x and its components $x_1, x_2$. Pythagoras' theorem now
states (as you hopefully remember from high school) that for a triangle with a right angle like
this, $\|x\|^2 = x_1^2 + x_2^2$. Taking the square root gives the result.
* Exercise 9.
Show that the circle formula $(x_1 - c_1)^2 + (x_2 - c_2)^2 = r^2$ indeed gives a circle with radius r and
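The figure tells the basic story. We start with point $\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ and we pick a point $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ such that
the condition $(x_1 - c_1)^2 + (x_2 - c_2)^2 = r^2$ holds. We did this in the figure. But then, by
Pythagoras' theorem, the distance between $\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ and $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ must be r. This must hold for all
$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ we can find this way, so the set of points $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ for which the condition holds is the set of
points that has distance r to $\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$. This is the circle drawn.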
Exercise 10.
1
Find all vectors orthogonal to 2 . Do the same for
2
Solution:
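For $\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$ we need $\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \cdot \begin{pmatrix} a \\ b \\ c \end{pmatrix} = 0$. We find
$1 \cdot a + 2 \cdot b + 3 \cdot c = a + 2b + 3c = 0$. This is a single equation in 3 unknowns, so we have two
free variables. Let's take b and c free. Then a=-2b-3c, so all vectors $\begin{pmatrix} -2b - 3c \\ b \\ c \end{pmatrix}$, $b, c \in \mathbb{R}$, are
orthogonal to $\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$. If you try to imagine this in space, it makes sense that a vector in $\mathbb{R}^3$ has
two free variables. If you find an orthogonal vector you can not only extend it, like in $\mathbb{R}^2$, but
also rotate it.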
* Exercise 11.
Bonus:
Show that the formula for orthogonality coincides with our intuitive notion of orthogonality.
Solution:
As a preliminary, convince yourself that the vector indicated in the following figure as B-A is
indeed B-A.
79
The easiest way to see this (much easier than I explained in class) is that in the figure A and
B-A add up to B, which they clearly also do algebraically.
Now look at the following picture:
80
Convince yourself that the vectors are as indicated. Now, from the figure we see that the angle
between A and B can only be orthogonal if A+B and A-B have the same length, like so:
81
A+B having the same length as B-A means $\|B - A\| = \|A + B\|$. We now manipulate this
For vectors $a = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}$, $b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$ from $\mathbb{R}^2$, draw them and $a + b$. Also calculate $a + b$.
Solution:
In the figure, the red lines denote the vectors a and b, while the green lines denote their
translations (by b and a respectively). The yellow line is the resultant vector a+b, which has
coordinates (a1 + b1 , a2 + b2 ) .
* Exercise 2.
a11
a1n
b1
Solution:
Let T (b) be the transformation indicated. Then we have to show that
$T(\lambda b) = \lambda T(b), \; \forall \lambda \in \mathbb{R}, b \in \mathbb{R}^n$ and $T(b_1 + b_2) = T(b_1) + T(b_2), \; \forall b_1, b_2 \in \mathbb{R}^n$. So:
( A B) 1 A =
A1 B 1 A1 ( A B
=
) 1 B 1 A1
Exercise 4.
Determine dimension of the span of the following vectors:
1 0 3 2
2 7 5 11
3 , 4 , 3 , 14
4 2 7 7
5 5 0 5
Solution:
The span of a set of vectors is the set of all multiples and sums of these vectors. Geometrically,
you can think of it as everywhere you can get by taking steps in the direction of these vectors.
The dimension of a span is the number of linearly independent vectors within the span. So we
have to determine how many linearly independent vectors there are in the set of four vectors
given.
Linear independence was defined as follows: $v_1, v_2, \ldots, v_n$ are linearly independent if
$\lambda_1 v_1 + \lambda_2 v_2 + \ldots + \lambda_n v_n = 0$ only if $\lambda_1 = \lambda_2 = \ldots = \lambda_n = 0$. We can write this in matrix notation
as follows:
$(v_1 \; v_2 \; \cdots \; v_n) \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_n \end{pmatrix} = 0$ only if $\lambda_1 = \lambda_2 = \ldots = \lambda_n = 0$, with augmented matrix $(v_1 \; v_2 \; \cdots \; v_n \mid 0)$
Let's think about this for a second. If we start sweeping this matrix, the right hand side will
never change, as we would be adding zeros.
Now, in general for a set of linear equations there were three possibilities: well-determined,
underdetermined, or inconsistent. However, in this case we already know that the system is
not inconsistent (that is called consistent), because $\lambda_1 = \lambda_2 = \ldots = \lambda_n = 0$ is certainly a solution.
So the question becomes if it is underdetermined and to what extent (how many free variables
are there). If there are no free variables, then all the vectors are linearly independent. If there
are free variables, then there are as many linearly dependent vectors as free variables.
So what this boils down to is just an ordinary matrix sweep, made simpler by the fact that the
right hand side contains only zeros, and then counting the number of lines that do not contain
only zeros.
It will be clearer after our example, so let's turn to that. We sweep:
84
2 7 5
2 4 3
5 11 5
7 15 2
2 0 1
11 0 2
14 0 2
23 0 3
37 0 2
2
0 1
1 15
1
0 0
7
7
59 111
0
0 0
7
7
0
0
0 0 0
0
0
0 0 0
0
7
3
5
2 0 1
11 0 0
4 3 14 0 0
7 8 9 0 1
4 3 14 0 0
0
1
7
0
0
0
0
0 3 2 0
7 1 15 0
4 9 18 0
0 3 2 0
0 0 0 0
15
0
7
111
0
59
0
0
0
0
We see that from the four $\lambda$'s we set out to find, one will be free, while the other three are
fixed in terms of the fourth. We don't care about their actual values, so we stop solving here.
The dimension of the span of the set of vectors is three.
We have determined the dimension of the span of the column vectors of a matrix (this span is
often called the column space of the matrix). However, some reflection will show that we also
determined the dimension of the span of the row vectors of our matrix (called the row space).
The reason is as follows: vectors are linearly independent if one of them cannot be obtained
through elementary operations (additions, subtraction etc.) on the others. That is precisely
what we check by sweeping. We therefore see that the row space also has dimension three. In
general the dimension of the row space is equal to the dimension of the column space. It is
also equal to the rank of the matrix, as we defined it in class. So now we have a few ways of
thinking about the rank of a matrix.
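The claim that row space and column space have the same dimension can be checked numerically. The following is a minimal sketch, not the method used in class: the function names `rank` and `transpose` and the matrix `M` are hypothetical and chosen only for illustration. It sweeps a matrix with exact fractions and counts the pivots, once for the matrix and once for its transpose.

```python
from fractions import Fraction

def rank(rows):
    """Rank via Gaussian elimination ("sweeping") with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    pivots = 0
    for col in range(n_cols):
        # Look for a row at or below the current pivot position with a nonzero entry.
        pivot = next((r for r in range(pivots, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivots], m[pivot] = m[pivot], m[pivots]
        # Sweep the column: make every other entry in this column zero.
        for r in range(len(m)):
            if r != pivots and m[r][col] != 0:
                factor = m[r][col] / m[pivots][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivots])]
        pivots += 1
    return pivots

def transpose(rows):
    return [list(col) for col in zip(*rows)]

# Hypothetical 3x4 matrix: the third row is the sum of the first two.
M = [[1, 2, 0, 1],
     [0, 1, 1, 2],
     [1, 3, 1, 3]]
print(rank(M), rank(transpose(M)))  # row rank and column rank coincide
```

Using exact fractions avoids having to decide when a floating-point entry counts as zero.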
Exercise 5.
$V_i$ is the $i$th column vector. Write the outcome of the map in terms of the column vectors and a
general vector $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$. What does this mean for the relation between the
column space (the space spanned by the column vectors of $A$) and the image of $T$?
Solution:
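\[ T : (x_1, x_2, \ldots, x_n) \mapsto A x = \begin{pmatrix} V_1 & V_2 & \cdots & V_n \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = x_1 V_1 + \cdots + x_n V_n \]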
So the image of T (the possible outcomes that T could give) is equal to the set of all linear
combinations of the column vectors of A. But that set is just the column space of A. So the
column space of A is the image of T.
* Exercise 6.
apply $T_2$: $T_2(T_1(x))$. This is meant by $T_2 \circ T_1$. It defines a new linear map $T_3 : \mathbb{R}^n \to \mathbb{R}^p$ which
also has an associated matrix. Let's call this matrix $C$ and derive what it is:
\[ T_3(x) = C x = T_2(T_1(x)) = T_2(A_1 x) = A_2 A_1 x \;\Rightarrow\; C = A_2 A_1 . \]
Make a system of orthonormal vectors based on the vector $x$ that span the entire space $\mathbb{R}^3$.
Solution
In a three dimensional space there are at maximum three independent vectors. First, we
1
0
0
consider the three unit vectors e1 = 0 , e2 = 1 , e3 = 0
0
0
1
12 + 02 + 02 =
1
3) The three vectors span the entire space $\mathbb{R}^3$. It implies that each vector in $\mathbb{R}^3$ can be written
as a linear combination of $e_1$, $e_2$ and $e_3$. For instance:
1
1
0
0
2 =1 0 2 1 + 0 0
0
0
0
1
1
Next, we consider x = 2 . Our strategy is the following. Step 1: we construct two vectors
0
that are perpendicular to x. Step 2, we normalize the length of both vectors to one.
2
1
87
a
b )
0
0
2
1
1
=
x
5
2
b) The length of y = 1 is
0
b
a
b and a )
0
0
12 + (2) 2 + 02 = 5
1
1
2 is
5
0
1 4 0
+ + =
5 5 5
5
= 1
5
(2) 2 + 12 + 02 =5
2
1
1
y=
1 is
5
5
0
4 1 0
+ + =
5 5 5
5
= 1
5
0
c) the length of z = 0 is 1.
1
1
5
2
Step 3. The orthonormal vectors ,
5
0
2
5
0
1 and 0 span the entire space 3
5
1
0
1
5
2
It means that each vector can be written as a linear combination of these vectors ,
5
0
88
Exercise 2.
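Given is the vector $x = \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix}$ and $A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 2 & 1 \\ -1 & 4 & 2 \end{pmatrix}$.
\[ x'Ax = \begin{pmatrix} 1 & -2 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 & 0 \\ 0 & 2 & 1 \\ -1 & 4 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 & -2 & 0 \end{pmatrix} \begin{pmatrix} 1 \cdot 1 + 1 \cdot (-2) + 0 \cdot 0 \\ 0 \cdot 1 + 2 \cdot (-2) + 1 \cdot 0 \\ (-1) \cdot 1 + 4 \cdot (-2) + 2 \cdot 0 \end{pmatrix} = \begin{pmatrix} 1 & -2 & 0 \end{pmatrix} \begin{pmatrix} -1 \\ -4 \\ -9 \end{pmatrix} = 1 \cdot (-1) + (-2) \cdot (-4) + 0 \cdot (-9) = 7 \]
Alternative solution:
\[ x'Ax = \begin{pmatrix} 1 \cdot 1 + (-2) \cdot 0 + 0 \cdot (-1) & 1 \cdot 1 + (-2) \cdot 2 + 0 \cdot 4 & 1 \cdot 0 + (-2) \cdot 1 + 0 \cdot 2 \end{pmatrix} \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 & -3 & -2 \end{pmatrix} \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix} = 1 \cdot 1 + (-3) \cdot (-2) + (-2) \cdot 0 = 7 \]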
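The two routes to the quadratic form (first $Ax$, then $x'(Ax)$; or first $x'A$, then $(x'A)x$) can be sketched in a few lines. The helper names below are hypothetical, and the minus signs in `x` and `A` are read off from the worked products in this exercise, so treat them as an assumption about the original statement.

```python
def mat_vec(A, x):
    """Matrix times column vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def vec_mat(x, A):
    """Row vector times matrix."""
    return [sum(xi * A[i][j] for i, xi in enumerate(x)) for j in range(len(A[0]))]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

x = [1, -2, 0]
A = [[1, 1, 0],
     [0, 2, 1],
     [-1, 4, 2]]

Ax = mat_vec(A, x)    # first solution: compute A x, then x'(A x)
xA = vec_mat(x, A)    # alternative solution: compute x'A, then (x'A) x
q1 = dot(x, Ax)
q2 = dot(xA, x)
print(Ax, xA, q1, q2)
```

Both orders give the same number because matrix multiplication is associative.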
89
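Let $A = \begin{pmatrix} 1 & 2 \\ 2 & t \end{pmatrix}$ and $x = \begin{pmatrix} 1 \\ -2 \end{pmatrix}$.
For which $t$ does the determinant of the matrix $A$ have a negative value?
For which $t$ is $x'Ax > 0$?
For which $t$ is $x'A^{-1}x > 0$?
Does it matter for these results that $A$ is a symmetric matrix, so that $A = A'$?
So, check whether the results are different for e.g. $B = \begin{pmatrix} 1 & -1 \\ 2 & t \end{pmatrix}$.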
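Solution:
a) $\det(A) = 1 \cdot t - 2 \cdot 2 = t - 4$
The determinant of $A$ is negative if $t < 4$.
b)
\[ x'Ax = \begin{pmatrix} 1 & -2 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 2 & t \end{pmatrix} \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \begin{pmatrix} -3 & 2 - 2t \end{pmatrix} \begin{pmatrix} 1 \\ -2 \end{pmatrix} = -3 - 2(2 - 2t) > 0 \]
Thus, $-7 + 4t > 0$, so that $t > 7/4$.
c) $A = \begin{pmatrix} 1 & 2 \\ 2 & t \end{pmatrix}$, $A^{-1} = \frac{1}{t-4} \begin{pmatrix} t & -2 \\ -2 & 1 \end{pmatrix}$
\[ x'A^{-1}x = \frac{1}{t-4} \begin{pmatrix} 1 & -2 \end{pmatrix} \begin{pmatrix} t & -2 \\ -2 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \frac{1}{t-4} \begin{pmatrix} t+4 & -4 \end{pmatrix} \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \frac{t+12}{t-4} \]
For $B = \begin{pmatrix} 1 & -1 \\ 2 & t \end{pmatrix}$: $B^{-1} = \frac{1}{t+2} \begin{pmatrix} t & 1 \\ -2 & 1 \end{pmatrix}$
\[ x'B^{-1}x = \frac{1}{t+2} \begin{pmatrix} 1 & -2 \end{pmatrix} \begin{pmatrix} t & 1 \\ -2 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \frac{1}{t+2} \begin{pmatrix} t+4 & -1 \end{pmatrix} \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \frac{t+6}{t+2} \]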
Section 5.1
Rank of matrix
Section 5.3
Diagonalization of a matrix
Section 5.3
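As we have seen last week, the matrix $\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$ can be
interpreted as a function $f(\cdot) : \mathbb{R}^n \to \mathbb{R}^m$, in particular:
\[ f\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}. \]
We now look at functions $T(\cdot) : \mathbb{R}^n \to \mathbb{R}^m$ such that, for
$x, y \in \mathbb{R}^n$, $c \in \mathbb{R}$: $T(x + y) = T(x) + T(y)$ and $T(cx) = cT(x)$.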
Functions that satisfy both criteria are called linear functions (or linear
maps or linear transformations).
Example 1:
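$T(\cdot) : \mathbb{R}^3 \to \mathbb{R}^2$, $T\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} x_1 + x_2 \\ x_1 - 2x_3 \end{pmatrix}$, then
\[ T\left( \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} \right) = T\begin{pmatrix} w_1 + y_1 \\ w_2 + y_2 \\ w_3 + y_3 \end{pmatrix} = \begin{pmatrix} w_1 + y_1 + w_2 + y_2 \\ w_1 + y_1 - 2(w_3 + y_3) \end{pmatrix} = \begin{pmatrix} w_1 + w_2 \\ w_1 - 2w_3 \end{pmatrix} + \begin{pmatrix} y_1 + y_2 \\ y_1 - 2y_3 \end{pmatrix} = T\begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} + T\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} \]
And
\[ T\left( c \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} \right) = T\begin{pmatrix} cy_1 \\ cy_2 \\ cy_3 \end{pmatrix} = \begin{pmatrix} cy_1 + cy_2 \\ cy_1 - 2cy_3 \end{pmatrix} = c \begin{pmatrix} y_1 + y_2 \\ y_1 - 2y_3 \end{pmatrix} = cT\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} \]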
\[ A = \begin{pmatrix} T(e_1) & T(e_2) & \cdots & T(e_n) \end{pmatrix} \]
This shows (if we proved it) that every linear map has a matrix
representation. The other way around (that every matrix represents a
linear map) is done in the tutorial.
Example 2
We represent the linear map from example 1 as a matrix:
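\[ T\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 + 0 \\ 1 - 2 \cdot 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad T\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 + 1 \\ 0 - 2 \cdot 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad T\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 + 0 \\ 0 - 2 \cdot 1 \end{pmatrix} = \begin{pmatrix} 0 \\ -2 \end{pmatrix} \]
So the map $T$ is represented by the matrix:
\[ \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & -2 \end{pmatrix} \]
Let's check:
\[ \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & -2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} x_1 + x_2 \\ x_1 - 2x_3 \end{pmatrix} \]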
This is indeed our original map T.
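The recipe "the columns of the matrix are the images of the unit vectors" can be written out as a short sketch. The function names `T`, `matrix_of` and `mat_vec` and the test vector are hypothetical; the map itself is the one from Example 1, with the minus sign in $x_1 - 2x_3$ assumed from the worked computation.

```python
def T(x):
    """The map from Example 1: (x1, x2, x3) -> (x1 + x2, x1 - 2*x3)."""
    x1, x2, x3 = x
    return [x1 + x2, x1 - 2 * x3]

def matrix_of(linear_map, n):
    """Build the matrix whose i-th column is the image of the i-th unit vector."""
    images = []
    for i in range(n):
        e_i = [1 if j == i else 0 for j in range(n)]
        images.append(linear_map(e_i))
    # images[i] is column i, so transpose to obtain the rows of the matrix.
    return [list(row) for row in zip(*images)]

def mat_vec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = matrix_of(T, 3)
x = [3, -1, 4]                # arbitrary test vector
print(A)
print(mat_vec(A, x), T(x))    # the matrix reproduces the map
```

The same `matrix_of` helper works for any linear map given as a function; for a non-linear function it still returns a matrix, but that matrix no longer reproduces the function.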
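The matrix $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ can be understood as follows:
First column of the matrix $A$: rotating the unit vector $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ counterclockwise we get $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$.
Second column of the matrix $A$: rotating the unit vector $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ counterclockwise we get $\begin{pmatrix} -1 \\ 0 \end{pmatrix}$.
Thus, the rotation is represented by $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$.
Thus the matrix can be used to rotate any vector counterclockwise.
For instance the vector $\begin{pmatrix} 2 \\ 1 \end{pmatrix}$:
\[ Ax = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 2 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 2 \end{pmatrix} = y \]
The rotation implies that the vector $x = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$ is perpendicular to its
mapping $y = \begin{pmatrix} -1 \\ 2 \end{pmatrix}$.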
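The matrix $B = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ represents a clockwise rotation. Thus
$B\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \end{pmatrix}$ and $B\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$.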
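$A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}$
Question: what is the meaning of $ABx$?
1) First multiplication $Bx$: implies a multiplication of both elements
of the vector $x$ by a factor 3.
\[ Bx = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} x, \quad Be_1 = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 3 \\ 0 \end{pmatrix} \quad \text{and} \quad Be_2 = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 3 \end{pmatrix} \]
\[ C = AB = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} = \begin{pmatrix} 0 & -3 \\ 3 & 0 \end{pmatrix} \]
Mapping $C$:
Step 1: $Bx$: three times larger length of the vector
Step 2: $ABx$: counterclockwise rotation by 90 degrees
We apply the mapping on $x = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$:
\[ \begin{pmatrix} 0 & -3 \\ 3 & 0 \end{pmatrix} \begin{pmatrix} 2 \\ 1 \end{pmatrix} = \begin{pmatrix} -3 \\ 6 \end{pmatrix} \]
$x = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$ is perpendicular to $y = \begin{pmatrix} -3 \\ 6 \end{pmatrix}$
Because:
1) $x \cdot y = 2 \cdot (-3) + 1 \cdot 6 = 0$
2) $\|x\| = \sqrt{5}$ and $\|y\| = \sqrt{45} = 3\sqrt{5}$
so that $\|y\| = 3\|x\|$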
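The composition of the scaling $B$ and the rotation $A$ can be checked with a small sketch. The helper names are hypothetical, and the entry $-1$ in the rotation matrix is assumed (it is what makes the rotation counterclockwise). The sketch also confirms that for this particular pair the order of multiplication does not matter.

```python
import math

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def mat_vec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[0, -1], [1, 0]]   # counterclockwise rotation by 90 degrees
B = [[3, 0], [0, 3]]    # scaling of both coordinates by a factor 3
x = [2, 1]

C = mat_mul(A, B)
y = mat_vec(C, x)
inner = sum(a * b for a, b in zip(x, y))            # zero means perpendicular
length_ratio = math.hypot(*y) / math.hypot(*x)      # how much longer y is than x
print(C, y, inner, length_ratio)
print(mat_mul(A, B) == mat_mul(B, A))               # scaling commutes with rotation
```

The commutation holds here because $B$ is a multiple of the identity; for two general matrices $AB$ and $BA$ differ.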
98
99
Also here, it does not matter whether we first multiply our vector with
a number and then rotate it or the other way around.
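\[ Ax = 3x: \quad \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 6 \\ 3 \\ 9 \end{pmatrix} \]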
Example 6:
3 3
Ax = x ,
0 0 2
2
0 0 =
1 =
1
0 0 3
3
2
3
101
0 0 3
The product of AA makes the length of a vector 9 times larger:
2
3 0 0 3 0 0 3
0 3 0 0 3 0 = 0
0 0 3 0 0 3 0
0
32
0
0
32
0 0 3 0 0 3 0 0 3 0
Et cetera
102
0
33
0
0
33
2
In 3 the line spanned by 1 has a dimension of 1. The line is
3
2
referred to as a subspace of 3 . It means that 1 , , is part of
3
this subspace.
Example 9:
2
1
In 3 the sub-space spanned by the vectors 1 and 0 has a
3
2
dimension of 2. A vector in this subspace can be characterized as
2 +
2
1
1 =
1 + 0 ,
3 + 2
3
2
103
3 2 0
2
1
vectors 1 and 0 . Thus, the image of each vector will be part of
3
2
this subspace. The matrix A is non-invertible. It has a rank of 2.
104
105
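We consider the size of the area spanned by the images of the four corners of the unit square under the matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$:
1) $\begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$
2) $\begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix}$
3) $\begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} a + c \\ b + d \end{pmatrix}$
4) $\begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} c \\ d \end{pmatrix}$
It can be shown that the area spanned by $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $\begin{pmatrix} a \\ b \end{pmatrix}$, $\begin{pmatrix} a + c \\ b + d \end{pmatrix}$, $\begin{pmatrix} c \\ d \end{pmatrix}$
is $ad - bc$, which is the determinant of the matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$.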
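The area spanned under the matrix $A = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}$
equals nine. Thus $\det(A) = 9$. (The area of the mapping spanned by the
four vectors $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $\begin{pmatrix} 3 \\ 0 \end{pmatrix}$, $\begin{pmatrix} 3 \\ 3 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 3 \end{pmatrix}$ equals 9.)
For the matrix
\[ B = \begin{pmatrix} 1/3 & 0 \\ 0 & 1/3 \end{pmatrix} \]
we have $\det(B) = 1/9$.
Properties of determinants:
1) It can be shown that in general $\det(AB) = \det(A)\det(B)$
2) It can be shown that $\det(A^{-1}) = \frac{1}{\det(A)}$
Note that for the particular example 11, $AB = I$:
\[ \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 1/3 & 0 \\ 0 & 1/3 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]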
108
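\[ |A| = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \]
For which $\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}$ is the minor of $a_{11}$ (the determinant of the sub-matrix of $a_{11}$).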
109
a11 a12
a
A=
a22
21
a31 a32
a13
a23
a33
a11 a12
A = 0 a22
0
0
a13
a23
a33
a11 0
A = 0 a22
0
0
0
0
a33
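The determinant of the matrix $A = \begin{pmatrix} 2 & 1 & 1 \\ 1 & 4 & -4 \\ 1 & 0 & 2 \end{pmatrix}$ equals 6.
For a 3x3 matrix, there are six possibilities to calculate the
determinant of the matrix:
First row of $A$:
\[ 2 \begin{vmatrix} 4 & -4 \\ 0 & 2 \end{vmatrix} - 1 \begin{vmatrix} 1 & -4 \\ 1 & 2 \end{vmatrix} + 1 \begin{vmatrix} 1 & 4 \\ 1 & 0 \end{vmatrix} = 16 - 6 - 4 = 6 \]
or (second row of $A$):
\[ -1 \begin{vmatrix} 1 & 1 \\ 0 & 2 \end{vmatrix} + 4 \begin{vmatrix} 2 & 1 \\ 1 & 2 \end{vmatrix} + 4 \begin{vmatrix} 2 & 1 \\ 1 & 0 \end{vmatrix} = -2 + 12 - 4 = 6 \]
or (third row of $A$):
\[ 1 \begin{vmatrix} 1 & 1 \\ 4 & -4 \end{vmatrix} + 2 \begin{vmatrix} 2 & 1 \\ 1 & 4 \end{vmatrix} = -8 + 14 = 6 \]
or (first column of $A$):
\[ 2 \begin{vmatrix} 4 & -4 \\ 0 & 2 \end{vmatrix} - 1 \begin{vmatrix} 1 & 1 \\ 0 & 2 \end{vmatrix} + 1 \begin{vmatrix} 1 & 1 \\ 4 & -4 \end{vmatrix} = 16 - 2 - 8 = 6 \]
or (second column of $A$):
\[ -1 \begin{vmatrix} 1 & -4 \\ 1 & 2 \end{vmatrix} + 4 \begin{vmatrix} 2 & 1 \\ 1 & 2 \end{vmatrix} = -6 + 12 = 6 \]
or (third column of $A$):
\[ 1 \begin{vmatrix} 1 & 4 \\ 1 & 0 \end{vmatrix} + 4 \begin{vmatrix} 2 & 1 \\ 1 & 0 \end{vmatrix} + 2 \begin{vmatrix} 2 & 1 \\ 1 & 4 \end{vmatrix} = -4 - 4 + 14 = 6 \]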
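That every row and every column expansion gives the same determinant can be verified with a short recursive sketch. The function names are hypothetical. The entry $a_{23} = -4$ is an assumption: the sign is not legible in the source, but it is the choice for which all six expansions give the stated value 6.

```python
def minor(A, i, j):
    """Sub-matrix of A with row i and column j removed."""
    return [[A[r][c] for c in range(len(A)) if c != j]
            for r in range(len(A)) if r != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def expand_row(A, i):
    return sum((-1) ** (i + j) * A[i][j] * det(minor(A, i, j)) for j in range(len(A)))

def expand_col(A, j):
    return sum((-1) ** (i + j) * A[i][j] * det(minor(A, i, j)) for i in range(len(A)))

# Assumed sign pattern: only the entry in row 2, column 3 is negative.
A = [[2, 1, 1],
     [1, 4, -4],
     [1, 0, 2]]
results = [expand_row(A, i) for i in range(3)] + [expand_col(A, j) for j in range(3)]
print(results)
```

In practice one expands along the row or column with the most zeros, here the third row or the second column, because the corresponding terms drop out.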
The determinant of the matrix $A = \begin{pmatrix} a & 1 & 0 \\ 2 & a & 2 \\ 0 & 1 & a \end{pmatrix}$
equals $a^3 - 4a$.
112
or
thus
\[ (A - \lambda I)^{-1} (A - \lambda I) x = (A - \lambda I)^{-1} 0 \]
\[ x = (A - \lambda I)^{-1} 0 \]
\[ x = 0. \]
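If $A$ is a 2 x 2 matrix: $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
Then $|A - \lambda I| = 0$ becomes
\[ \begin{vmatrix} a_{11} - \lambda & a_{12} \\ a_{21} & a_{22} - \lambda \end{vmatrix} = 0 \]
with solutions
\[ \lambda_{1,2} = \frac{\operatorname{tr} A \pm \sqrt{(\operatorname{tr} A)^2 - 4|A|}}{2} \]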
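The trace-determinant formula for the eigenvalues of a 2 x 2 matrix translates directly into code. This is a minimal sketch with a hypothetical function name; it handles only the case of real eigenvalues and raises an error otherwise. The test matrix is the symmetric matrix with rows (2, 2) and (2, 5) that appears in these notes with eigenvalues 1 and 6.

```python
import math

def eigenvalues_2x2(A):
    """Real eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a11, a12), (a21, a22) = A
    trace = a11 + a22
    determinant = a11 * a22 - a12 * a21
    discriminant = trace ** 2 - 4 * determinant
    if discriminant < 0:
        # Happens for instance for a rotation by 90 degrees.
        raise ValueError("no real eigenvalues")
    root = math.sqrt(discriminant)
    return (trace - root) / 2, (trace + root) / 2

print(eigenvalues_2x2([[2, 2], [2, 5]]))
```

Calling the function on the rotation matrix with rows (0, -1) and (1, 0) raises the error, in line with the observation that a rotation has no (real) eigenvectors.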
\[ AP = P\Lambda \]
For which $\Lambda$ is a diagonal matrix with the eigenvalues of $A$ on the
main diagonal. For a 2 x 2 matrix $A$, the diagonal matrix is
\[ \Lambda = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \]
$P$ is the matrix whose columns are the eigenvectors, $P = [p_1, p_2]$,
for which $\lambda_1$ corresponds to the vector $p_1$ and $\lambda_2$ corresponds to
the vector $p_2$.
Notation:
The diagonalization $AP = P\Lambda$ can also be written as
a) $A = P \Lambda P^{-1}$
b) $\Lambda = P^{-1} A P$
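The matrix $A = \begin{pmatrix} 2 & 2 \\ 2 & 5 \end{pmatrix}$ has the eigenvalues $\lambda_1 = 1$ and $\lambda_2 = 6$, so that
\[ \Lambda = \begin{pmatrix} 1 & 0 \\ 0 & 6 \end{pmatrix} \]
The eigenvectors are $p_1 = \begin{pmatrix} 2 \\ -1 \end{pmatrix}$, belonging to $\lambda_1 = 1$, and $p_2 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$, belonging to $\lambda_2 = 6$.
Note that the eigenvectors are orthogonal (because the matrix $A$ is
symmetric, which we will not prove here).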
Example 15
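The matrix $A = \begin{pmatrix} 2 & 4 \\ 1 & -1 \end{pmatrix}$ has the eigenvalues $\lambda_1 = 3$ and $\lambda_2 = -2$, so that
\[ \Lambda = \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix} \]
The eigenvectors are $p_1 = \begin{pmatrix} 4 \\ 1 \end{pmatrix}$ for $\lambda_1 = 3$ and $p_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$
for $\lambda_2 = -2$.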
2 1 1
The matrix A = 2 3 4 has the characteristic equation
1 1 2
1
2
| A I |= 2
3
1
1
1
4 = ( 1)( + 1)( 3) = 0
1
1 = 1: p1= 1
0
0
2 = 1 : p 2 = 1
1
2
2 = 3 : p3 = 3
1
117
118
119
Let A = 2 1 t
0 1 1
2 1 t =0
0 1 1
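Which gives
\[ 1 \cdot \begin{vmatrix} 1 & t \\ 1 & 1 \end{vmatrix} + t \cdot \begin{vmatrix} 2 & 1 \\ 0 & 1 \end{vmatrix} = 0 \]
So that
\[ (1 - t) + 2t = 0, \]
i.e. $t = -1$.
Thus the matrix $A$ has an inverse if $t \neq -1$.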
* Exercise 2.
a11 a1n
m1 amn
Solution:
We explained the procedure in class.
* Exercise 3.
Given a parallelepiped C of a certain volume and a linear map T with associated matrix A,
find Vol(T(C)).
Solution:
\[ \mathrm{Vol}(T(C)) = |\det(A)| \cdot \mathrm{Vol}(C) \]
* Exercise 4.
Consider the following maps and show that they are linear, without deriving their matrix
representation. Also derive and show their eigenvectors (if any).
a) Blow-up of a vector along the x-axis by 100%, while the y-axis remains unchanged.
120
In the figure we drew the transformation for a specific vector. Note that algebraically, this
transformation amounts to $T : (a_1, a_2) \mapsto (2a_1, a_2)$. In green and yellow are two eigenvectors
for this transformation, we return to them shortly. First we show that the map $T$ is linear. For
that we have to show that $T(a + b) = T(a) + T(b)$ and $T(ra) = rT(a)$. We do this both
graphically and algebraically.
In the figure we constructed T (a + b) and it can be seen to be equal to T (a) + T (b) (what does
\[ rT(a) = rT(a_1, a_2) = r(2a_1, a_2) = (2ra_1, ra_2) \]
We see again that they are equal.
For the eigenvectors, we return to the first figure. Two examples of them are drawn in green
and yellow. First consider a vector along the y-axis. What would happen to it under this
transformation? Absolutely nothing. So it is an eigenvector with eigenvalue 1.
T (0, y ) = 1 (0, y )
Now consider a vector along the x-axis. What will happen to it under T? It will get doubled.
So it is an eigenvector with eigenvalue 2.
T ( x, 0)= (2 x, 0)= 2 ( x, 0)
122
b)
To make my life easier, all in one picture this time. A projection simply takes any vector and
only keeps the y-part of it: $T : (x, y) \mapsto (0, y)$. From the picture it is again clear that the map
is a linear one. Note that this time we checked $T(ra) = rT(a)$ for $r<1$.
Algebraically we have:
\[ T(a + b) = T((a_1, a_2) + (b_1, b_2)) = T(a_1 + b_1, a_2 + b_2) = (0, a_2 + b_2) \]
123
c)
The figure shows that the first condition for linearity holds.
124
This figure shows that the second condition for linearity also holds. We drew it here for r<0.
It is a bit beyond the scope of this course to derive the map algebraically, so we leave that to
the interested reader (It is actually not very hard. Give it a try).
This map is interesting in that it has no eigenvectors at all. Because it is a rotation, there is no
vector that does not change direction under the map.
* Exercise 5.
Suppose $v$ is an eigenvector of a matrix $A$, with associated eigenvalue $\lambda$. Show that, for
$\mu \neq 0$, $\mu v$ is also an eigenvector with eigenvalue $\lambda$.
Solution:
We know from the fact that $v$ is an eigenvector with eigenvalue $\lambda$ that $A v = \lambda v$. Now we
use the linearity of a matrix: $A(\mu v) = \mu (A v) = \mu (\lambda v) = \lambda (\mu v)$. So $\mu v$ is also an eigenvector
with eigenvalue $\lambda$.
Exercise 6.
Calculate the eigenvectors and the associated eigenvalues of the following matrix:
2 0 0
A = 1 3 5
1 1 1
Solution:
We start by solving the characteristic equation:
125
1
1
3
1
0
5 = 0 = (2 )((3 )(1 ) 5) = (2 )( 2 2 8) =
1
(2 )(4 )(2 )
So we find $\lambda_1 = 2$, $\lambda_2 = 4$, $\lambda_3 = -2$. (You have to be lucky to be able to solve a cubic equation
this way. Don't worry; on an exam you will always be lucky.)
Now we find the associated eigenvectors $v_1, v_2, v_3$ by solving the equation:
\[ (A - \lambda_1 I) v_1 = 0 \]
We solve by sweeping:
0
0
0 0 0 0 0 1 1 5 0
22
3 2
5
0 =
1
1 1 5 0 ~ 0 0 1 0
1
1 2 0 1 1 3 0 0 0 0 0
1
p
So we find v1 = p for any p, with associated eigenvalue 1 = 2 .
0
We check our result:
2 0 0 p 2 p
p
1 3 5 p = p 3 p = 2 p
1 1 1 0 p p
0
Such a relief!
We move on to v 2 with associated eigenvalue 2 = 4
0
0
0 2 0 0 0 1 0 0
24
3 4
5
0 =
1
1 1 5 0 0 1 5
1
1 4 0 1 1 5 0 0 1 5
1
0
So we find v 2 = 5q for any q, with associated eigenvalue 2
q
We check our result:
2 0 0 0 0
0
=
5q =
20q 4 5q
1 3 5
1 1 1 q 4q
q
Hurrah!
We move on to v 3 and 3 = 2 .
0 1 0 0 0
0 0 1 5 0
0 0 0 0 0
=4
0
0
0 4 0 0 0 4 0 0 0 1 0 0 0
2 2
3 2
5
0 = 1 5 5 0 0 5 5 0 0 1 1 0
1
1
1
1 2 0 1 1 1 0 0 1 1 0 0 0 0 0
0
So we find v 3 = r for any r, with associated eigenvalue 3 = 2 .
r
126
2 r
1 3 5 r =
2r =
1 1 1 r 2r
r
Again it works out and we have found all our eigenvectors.
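\[ A = \begin{pmatrix} \frac{2}{3} & 0 & 0 \\ \frac{1}{6} & \frac{3}{4} & \frac{1}{3} \\ \frac{1}{6} & \frac{1}{4} & \frac{2}{3} \end{pmatrix} \]
the long term state of the population (no matter what the starting state was) by calculating
$\lim_{k \to \infty} A^k \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$ for general population $\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$.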
Solution:
Background
At the tutorial there was more explanation about the background of Markov transition
matrices. It describes transition in the labour market, for which there are three states (e.g. state
1: employment; state 2: unemployment; state 3: non-participation).
The matrix describes the probabilities in the transitions across the three states between period
t and period t+1.
Note that the numbers in the matrix should be read as conditional probabilities.
2/3 = Pr(employed in period t+1 | someone was employed in period t)
1/6 = Pr(unemployed in period t+1 | someone was employed in period t)
1/6 = Pr(non-participant in period t+1 | someone was employed in period t)
These probabilities add up to one exactly.
3/4 = Pr(unemployed in period t+1 | someone was unemployed in period t)
1/4 = Pr(non-participant in period t+1 | someone was unemployed in period t)
These probabilities add up to one exactly
2/3 = Pr(non participant in period t+1 | someone was non participant in period t)
1/3 = Pr(unemployed in period t+1 | someone was non participant in period t)
These probabilities add up to one exactly
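Note that $x_1 + x_2 + x_3 = 1$.
Thus
\[ Ax = \begin{pmatrix} \frac{2}{3} & 0 & 0 \\ \frac{1}{6} & \frac{3}{4} & \frac{1}{3} \\ \frac{1}{6} & \frac{1}{4} & \frac{2}{3} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \]
is informative about the states in period t+1.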
To diagonalize the transition-matrix A, we have to start by finding the eigenvectors and
128
3
1
3
1
2
3
2
1
2
1 17
1
P ( ) =
=( )(( )( ) ) =( )( + 2 ) =0
6
4
3
3
4
3
12
3
2 12
12
1
1
2
6
4
3
2
2
0 =( )(5 17 + 12 2 ) =( )(1 )(5 12 )
3
3
2
5
This gives us three eigenvalues: $\lambda_1 = 1$, $\lambda_2 = \frac{2}{3}$, $\lambda_3 = \frac{5}{12}$. (You have to be lucky to be able to
solve a cubic equation this way. Don't worry; on an exam you will always be lucky.)
To get the associated eigenvectors $v_1, v_2, v_3$, we use the equation:
\[ (A - \lambda_1 I) v_1 = 0 \]
We sweep:
2
1
1
0
0
0
0
0 0
0 0 0
3 1
3
3
4
3
1
1 1 1
1
1
0 = 0 0 1
0
6
6
4
3
4
3
3
1
2
1
1 1
0 0 0 0
1
0
1 0
4
3
4
3
6
6
So $v_1 = \begin{pmatrix} 0 \\ 4r \\ 3r \end{pmatrix}$ for any $r$. Of course, if $v_1$ is to represent a population, then $r = \frac{1}{7}$.
We check if $v_1$ is indeed an eigenvector with eigenvalue 1:
2
3
1
6
1
6
0
0
0 0
3 1
4r =
3r + r =
1 4r So it is as we wanted.
4 3
3r
r + 2r
3r
1 2
4 3
2
The eigenvector for 2 = :
3
2
2
0
0
0
1
33
0 0 0 0 1
2 0
3 2
1
1 1 1
1
0
0 0 1 2 0
=
6
6 12 3
4 3
3
0 0 0 0
1
1
2
2
1
1
0
0 0
4
3 3
6
6 4
0
129
3 0 0
2 p
2 p
3 p
3 p
1 3 1 2 p =
1 p+ 3 p+ 1 p =
4 p=
2p
2
6 4 3
2
3
3
3
p
1
1
2
1
1
2
2
p + p + p p
2
3 3
2
6 4 3
So again we made no error in calculation.
5
Finally we solve for the eigenvector associated with 3 =
12
2 5
1
0
0
0
0 0 0
3 12
1 0 0 0
4
1 0 0 0
3 5
1
1 1 1
1 1
0 =
0 0
0 0 1 1 0
6
6 3 3
4 12
3
3 3
0 0 0 0
1
1
2
5
1
1
1
1
1
0
0 0
0
4
3 12
4 4
6
6 4 4
0
5
So v 3 = q for any q. We check if v 3 is indeed an eigenvector with eigenvalue
.
12
q
2
3 0 0
0
0
0
1 3 1 q = 3 q 1 q = 5 q = 5 q
6 4 3 4
3 12 12
q
5
q
1 1 2
1 q 2 q q
3 12
4
6 4 3
Yippee.
Now were almost ready to diagonalize our matrix. Recall that we want to write
0 3 0
1 0 0
easy values for r,p and q, and D is the diagonal matrix of eigenvalues, so D = 0
0 .
5
0 0
12
We still need to find C 1 . Because we do not show how to find an inverse of a matrix in this
course its not hard, but we can only do so much we simply postulate that
130
1
1
C =
3
2
21
1
7
1
7
3 4
7 7
1 1 1
0 3 0 7 7 7 1 0 0
C C 1 4 2 1 =
0 0 0 1 0 How good of us.
=
3 1 1 2 3 4 0 0 1
21
7
7
1k
0
0
2
k 1
0 C 1
=
=
Ak CD
C
C 0
3
5
0
0
12
What we wanted was to take the limit of this for $k$ to infinity, to see what would happen after
infinitely many periods, i.e. in the long run. But, because two of our eigenvalues are smaller
than one, their powers tend to zero as $k$ tends to infinity. So:
1 1
1k
0
0
1 0 0
0 3 0 1 0 0 7 7
1
2
1
Ak lim C 0
C 0 0 0 C=
lim =
0 C=
4 2 1 0 0 0
0
k
k
3
0 0 0
3 1 1 0 0 0 3
2 3
k
0
0
21 7
12
1 1 1 0 0 0
0 3 0 7 7 7
4 4 4
4 2 1 0 0 0 = 7 7 7
3 1 1 0 0 0
3 3 3
7 7 7
So now we can calculate:
131
1
7
0=
0 0 0
0
0
x1
4
4 4 4 x=
4
( x1 + x2 + x3 )=
2
7 7 7 7
7
3 3 3 x3 3
3
( x1 + x2 + x3 )
7 7 7
7
7
The last equality follows because a population vector has $x_1 + x_2 + x_3 = 1$. So in the long run the
population will be divided over states 2 and 3 in the proportions 4:3, while nobody will be in
state 1. (Can you understand just by looking at matrix $A$ why that might be the case?)
Finally, note that the long run population vector that we found is also an eigenvector with
eigenvalue 1. That is no coincidence: it is almost always the case with Markov chains, in fact
always if the long-run state is well defined. The reason is as follows: In the long run, we
expect a steady state, so nothing changes anymore. So we want a vector such that, if $A$ works
on it, we get our vector back. But that is just an eigenvector with eigenvalue 1.
Don't get angry, we did not sweat for nothing. Although it is true that it is much easier to find
the long run state by looking for an eigenvector with eigenvalue 1, our method is the only way
(I know of) to fairly easily find $A^k$ for any large $k$.
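The long-run behaviour can also be seen by simply applying the transition matrix many times. The sketch below is an illustration only, not the diagonalization method of the solution; the starting population `start`, the number of periods, and the helper names are hypothetical. It uses exact fractions so that the steady-state check is not blurred by rounding.

```python
from fractions import Fraction as F

# Transition matrix of the exercise; each column sums to one.
A = [[F(2, 3), F(0),    F(0)],
     [F(1, 6), F(3, 4), F(1, 3)],
     [F(1, 6), F(1, 4), F(2, 3)]]

def mat_vec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def evolve(x, periods):
    """Apply the transition matrix `periods` times to the population x."""
    for _ in range(periods):
        x = mat_vec(A, x)
    return x

start = [F(1, 2), F(1, 4), F(1, 4)]    # hypothetical starting population
after = evolve(start, 200)
print([float(v) for v in after])        # close to (0, 4/7, 3/7)

steady = [F(0), F(4, 7), F(3, 7)]
print(mat_vec(A, steady) == steady)     # the steady state is an eigenvector with eigenvalue 1
```

Changing `start` to any other vector whose entries sum to one leads to the same limit, which is what "no matter what the starting state was" means.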
*Exercise 2.
We show that the determinant-volume formula holds in a special case and discuss the general
proof.
Solution:
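We start with a unit square $C$, characterized by the vectors $(1, 0)$, $(0, 1)$ and investigate what
happens under transformation $T : \mathbb{R}^2 \to \mathbb{R}^2$ with associated matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$. If we let $T$ work
on our two vectors, we get
\[ \begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix}, \qquad \begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} c \\ d \end{pmatrix} \]
So our transformation on the unit square looks like this: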
Now we know the area of the unit square is 1, so to calculate the determinant of the matrix, all
we have to do is calculate the area of the resulting parallelogram. To calculate this, we need
one geometric fact, which we now illustrate.
The parallelogram with the blue sides has the same area as the parallelogram with the red
sides. In general, if you keep one side of a parallelogram fixed and you move the opposing
side along a parallel line, the area of the resulting parallelogram is the same as that of the
original.
We now use this fact to transform our parallelogram given by (a,b) and (c,d) into a more
manageable one with the same area. We actually use it twice, to transform it into a rectangle:
133
So we see that the rectangle given by $(p,0)$ and $(0,q)$ has the same area. Clearly that area equals
$pq$. So what we still have to do is calculate $p$ and $q$. We start with $p$. What did we do in the
first step, the first shift of parallelograms? We took our point $(a,b)$ and went in the direction of
$(c,d)$ until we reached the x-axis, so we have $(a, b) - r(c, d) = (p, 0)$ for some $r$ to be
determined. We know $b - rd = 0 \Rightarrow r = \frac{b}{d}$, so $p = a - rc = a - \frac{bc}{d}$.
That's one out of the way. Actually, finding $q$ is easier. We see in the figure that going from
$(c,d)$ to $(0,q)$ is a horizontal shift, so the y-coordinate does not change: $q = d$.
So the area of our rectangle and therefore also our original parallelogram is
\[ pq = \left(a - \frac{bc}{d}\right) d = ad - bc \]
This is indeed the determinant formula for the two-dimensional case.
Of course, this is no proof of the formula in general. For that we would have to show that it
holds for all shapes we could start with, not just the unit square. The way to do that is not by
extending the argument we gave above (just imagine doing this for general parallelograms in
higher-dimensional spaces). Instead what mathematicians do is very different: they look at
volume as a function of a shape and show that it must have certain properties (for instance, if
you translate a shape, its volume does not change). Then they show that there can be only one
such function. And then they show that the determinant also has these properties. Then they
can conclude that the determinant indeed gives the volume of a transformation. We won't trace
their steps here, as that would take us much too far afield, but you might be interested to see
how you can handle such a seemingly awesome problem.
* Exercise 3.
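Prove that $\det(A^{-1}) = \frac{1}{\det(A)}$ (if $A^{-1}$ exists of course).
Solution:
If $A^{-1}$ exists, then we can say $A A^{-1} = I$, so $\det(A A^{-1}) = \det(I) = 1$. Now recall that
$\det(AB) = \det(A)\det(B)$, so $\det(A)\det(A^{-1}) = 1$ and therefore $\det(A^{-1}) = \frac{1}{\det(A)}$.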
* Exercise 4.
Prove that $\det(A) = \prod_i \lambda_i$, if $A$ is diagonalizable, where $\lambda_i$ are the eigenvalues of $A$.
Solution:
We first establish the intuition. Suppose $A$ is 2x2. Because $A$ is diagonalizable, we know it has
2 eigenvectors $v, w$ with associated eigenvalues $\lambda, \mu$. Now consider the parallelogram given
by $v, w$ and consider what would happen to it when multiplied by $A$. We call the original
parallelogram $P$ and the new one $Q$.
In the figure it is quite clear that $\mathrm{Vol}(Q) = \lambda \mu \, \mathrm{Vol}(P)$. Therefore we should find that indeed
$\det(A) = \prod_i \lambda_i$. We now proceed to prove this.
From class we know that for a diagonalizable matrix $A$ the following holds:
$AC = CD$, where $C$ is the matrix with for every column an eigenvector of $A$, and $D$ is a
diagonal matrix with the associated eigenvalues on the diagonal. Now we know:
\[ \det(AC) = \det(CD) \;\Rightarrow\; \det(A)\det(C) = \det(C)\det(D) \;\Rightarrow\; \det(A) = \det(D) \]
where the last step divides by $\det(C)$, which is nonzero because $C$ is invertible.
But $D$ is a diagonal matrix, so
\[ \det(A) = \det(D) = \begin{vmatrix} \lambda_1 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & \lambda_n \end{vmatrix} = \prod_i \lambda_i . \]
We have quite a powerful apparatus by now. This was not such an easy theorem to
understand, but the proof is just a few lines.
Consequence: if one of the eigenvalues of A equals zero, the determinant of the matrix A will
be zero. If one of the eigenvalues of A equals zero, the inverse of the matrix A does not exist.
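Compute the matrix decomposition $P^{-1}AP = \Lambda$ for $A = \begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix}$.
Using the decomposition, compute $A^4$.
Solution
Compute the eigenvalues of $A$:
\[ |A - \lambda I| = \begin{vmatrix} 1 - \lambda & 2 \\ 3 & -\lambda \end{vmatrix} = -\lambda(1 - \lambda) - 6 = 0 \]
So that the characteristic equation is $(\lambda - 3)(\lambda + 2) = 0$.
For $\lambda = 3$, the eigenvector follows from
\[ -2x_1 + 2x_2 = 0, \qquad 3x_1 - 3x_2 = 0 \]
so that $v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ is an eigenvector.
For $\lambda = -2$, the eigenvector follows from
\[ 3x_1 + 2x_2 = 0, \qquad 3x_1 + 2x_2 = 0 \]
so that $v_2 = \begin{pmatrix} 2 \\ -3 \end{pmatrix}$ is an eigenvector.
Thus:
\[ \Lambda = \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix}, \qquad P = \begin{pmatrix} 1 & 2 \\ 1 & -3 \end{pmatrix} \]
\[ P^{-1} = \begin{pmatrix} 1 & 2 \\ 1 & -3 \end{pmatrix}^{-1} = -\frac{1}{5} \begin{pmatrix} -3 & -2 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} \frac{3}{5} & \frac{2}{5} \\ \frac{1}{5} & -\frac{1}{5} \end{pmatrix} \]
Check:
\[ PP^{-1} = \begin{pmatrix} 1 & 2 \\ 1 & -3 \end{pmatrix} \begin{pmatrix} \frac{3}{5} & \frac{2}{5} \\ \frac{1}{5} & -\frac{1}{5} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]
Thus:
\[ P^{-1}AP = \begin{pmatrix} \frac{3}{5} & \frac{2}{5} \\ \frac{1}{5} & -\frac{1}{5} \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 1 & -3 \end{pmatrix} = \begin{pmatrix} \frac{9}{5} & \frac{6}{5} \\ -\frac{2}{5} & \frac{2}{5} \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 1 & -3 \end{pmatrix} = \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix} \]
\[ A^4 = P \Lambda^4 P^{-1} = \begin{pmatrix} 1 & 2 \\ 1 & -3 \end{pmatrix} \begin{pmatrix} 81 & 0 \\ 0 & 16 \end{pmatrix} \begin{pmatrix} \frac{3}{5} & \frac{2}{5} \\ \frac{1}{5} & -\frac{1}{5} \end{pmatrix} = \begin{pmatrix} 81 & 32 \\ 81 & -48 \end{pmatrix} \begin{pmatrix} \frac{3}{5} & \frac{2}{5} \\ \frac{1}{5} & -\frac{1}{5} \end{pmatrix} = \begin{pmatrix} \frac{275}{5} & \frac{130}{5} \\ \frac{195}{5} & \frac{210}{5} \end{pmatrix} = \begin{pmatrix} 55 & 26 \\ 39 & 42 \end{pmatrix} \]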
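The power computed through the decomposition can be compared with plain repeated multiplication. The sketch below assumes the matrices of this exercise, including the signs of $P$ and $P^{-1}$ as reconstructed from the eigenvector equations; the function names are hypothetical.

```python
from fractions import Fraction as F

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

A = [[F(1), F(2)], [F(3), F(0)]]
P = [[F(1), F(2)], [F(1), F(-3)]]               # columns are the eigenvectors
P_inv = [[F(3, 5), F(2, 5)], [F(1, 5), F(-1, 5)]]
eigenvalues = [F(3), F(-2)]

def power_via_diagonalization(k):
    """A^k = P * Lambda^k * P^(-1); only the diagonal entries are raised to the power k."""
    L_k = [[eigenvalues[0] ** k, F(0)], [F(0), eigenvalues[1] ** k]]
    return mat_mul(mat_mul(P, L_k), P_inv)

def power_direct(k):
    """A^k by multiplying A with itself k times."""
    result = [[F(1), F(0)], [F(0), F(1)]]
    for _ in range(k):
        result = mat_mul(result, A)
    return result

A4 = power_via_diagonalization(4)
print([[int(v) for v in row] for row in A4])
print(A4 == power_direct(4))
# The determinant equals the product of the eigenvalues:
print(A[0][0] * A[1][1] - A[0][1] * A[1][0] == eigenvalues[0] * eigenvalues[1])
```

For large $k$ the diagonalization route needs only two matrix products plus two scalar powers, whereas the direct route needs $k$ matrix products.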
137
solution =
0 . 1u + 2 v + 3 w= 1 0 + 2 2 + 3 1 = 0 .
=
=
1
2
3
1
1
0 0
We can rewrite this equation as:
1 1 1 1 0
0 2 1 2 = 0 . We sweep the matrix:
1 1 0 0
3
1 1 1 0 1 1 1 0 1 1 1 0
0 2 1 0 ~ 0 2 1 0 ~ 0 2 1 0
1 1 0 0 0 2 1 0 0 0 0 0
We can stop here, since we are not interested in the explicit solution and it is clear that we
have all the zero-rows that we will get. The number of unknowns (the three lambdas) minus
the number of non-zero rows is the number of free variables that we have, i.e. the number of
lambdas that we can pick freely; here that is $3 - 2 = 1$. This means that there is one linearly dependent vector in
the three and two linearly independent. So the dimension of the span of $u$, $v$ and $w$ is 2.
138
K.6.3.
Differentiability
K.6.3.
Differentials
K.6.4.
K.7.1. K.7.2.
K.7.3.
Multivariate functions
K.8.1.
Partial derivatives
K.8.2.
Young's rule
K.8.2.
K.8.3.
Total differentials
K.8.4.
Implicit differentiation
K.8.4.
K.6.3.
K.7.3.
Homogeneous functions
K.8.3.
139
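\[ \frac{\Delta y}{\Delta x} = \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \]
Example 1:
$y = a + bx + cx^2$
\[ \frac{\Delta y}{\Delta x} = \frac{a + bx_0 + b\Delta x + c(x_0 + \Delta x)^2 - (a + bx_0 + cx_0^2)}{\Delta x} = b + 2cx_0 + c\Delta x \]
Derivative:
Let $y = f(x)$
$x_0$: initial value
\[ \frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \]
Also denoted by: $f'(x_0)$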
Definition:
A function is differentiable in an interval if a derivative exists for each
point in that interval.
Requirement: the function must be continuous and smooth.
140
f ( x0 )
x0
Differential:
\[ dy = f'(x_0)\,dx \]
141
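\[ h'(x) = \lim_{\Delta x \to 0} \frac{h(x + \Delta x) - h(x)}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) \pm g(x + \Delta x) - (f(x) \pm g(x))}{\Delta x} \]
\[ = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \pm \lim_{\Delta x \to 0} \frac{g(x + \Delta x) - g(x)}{\Delta x} = f'(x) \pm g'(x) \]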
142
\[ g'(x) = \lim_{\Delta x \to 0} \frac{k f(x + \Delta x) - k f(x)}{\Delta x} = k \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} = k f'(x) \]
143
144
$y = f(x) = g(h(x))$
where $u = h(x)$, so that $g(h(x)) = g(u)$,
and both $h(x)$ and $g(u)$ are differentiable functions
\[ \frac{df(x)}{dx} = g'(h(x)) \, h'(x) \]
Or
\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} \]
Natural logarithmic function rule
$f(x) = \ln(x)$
\[ f'(x) = \frac{d \ln(x)}{dx} = \frac{1}{x} \]
Example:
\[ \log_b(x) = \frac{\ln(x)}{\ln(b)} \]
Example:
$f(x) = \ln(h(x))$
145
y = f ( x)
Second derivative
\[ \frac{d^2 y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) \]
Examples:
U (c ) = c
U (c) = ln(c)
Definition: A function is strictly concave over an interval if:
f ''( x) < 0
for all values of x in that interval.
Definition: A function is strictly convex over an interval if:
f ''( x) > 0
for all values of x in that interval.
146
147
148
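\[ \frac{\partial}{\partial x_j}\left(\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_i}\right) = \frac{\partial}{\partial x_i}\left(\frac{\partial f(x_1, x_2, \ldots, x_n)}{\partial x_j}\right) \]
or
\[ f_{ji}(x_1, x_2, \ldots, x_n) = f_{ij}(x_1, x_2, \ldots, x_n) \]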
149
150
151
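Euler's theorem
For any multivariate function $y = f(x_1, x_2, \ldots, x_n)$
that is homogeneous of degree $k$, i.e. $f(sx_1, sx_2, \ldots, sx_n) = s^k f(x_1, x_2, \ldots, x_n)$ for any number $s > 0$:
\[ ky = x_1 f_1(x_1, x_2, \ldots, x_n) + \cdots + x_n f_n(x_1, x_2, \ldots, x_n) \]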
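Euler's theorem can be checked numerically for a concrete homogeneous function. The sketch below uses a Cobb-Douglas function with a hypothetical exponent `a`; it is homogeneous of degree $k = 1$. The partial derivatives are approximated by central finite differences, so the two printed numbers agree only up to a small numerical error.

```python
def partial(f, point, i, h=1e-6):
    """Central finite-difference approximation of the i-th partial derivative."""
    up, down = list(point), list(point)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)

a = 0.3                                  # hypothetical Cobb-Douglas exponent

def cobb_douglas(x, y):
    return x ** a * y ** (1 - a)         # homogeneous of degree 1

point = (2.0, 5.0)
k = 1
euler_sum = sum(point[i] * partial(cobb_douglas, point, i) for i in range(2))
print(euler_sum, k * cobb_douglas(*point))

# Homogeneity itself: scaling both inputs by s scales the output by s**k.
s = 3.0
print(cobb_douglas(s * point[0], s * point[1]), s ** k * cobb_douglas(*point))
```

Replacing `cobb_douglas` by a function that is not homogeneous, such as $\log(x + y)$, makes the first comparison fail for every choice of $k$.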
152
153
f1 ( x10 , x20 ,, xn0 )dx1 + f 2 ( x10 , x20 ,, xn0 )dx2 + + f n ( x10 , x20 ,, xn0 )dxn
154
155
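a) $e^{5x^2 + \sqrt{2x}}$
Solution:
\[ \frac{d}{dx} e^{5x^2 + \sqrt{2x}} = e^{5x^2 + \sqrt{2x}} \cdot \frac{d}{dx}\left(5x^2 + \sqrt{2x}\right) = e^{5x^2 + \sqrt{2x}} \left(10x + \tfrac{1}{2}(2x)^{-\frac{1}{2}} \cdot 2\right) = e^{5x^2 + \sqrt{2x}} \left(10x + \frac{1}{\sqrt{2x}}\right) \]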
b) $\log\left(\frac{x+1}{x^2}\right)$
Solution:
\[ \frac{d}{dx} \log\left(\frac{x+1}{x^2}\right) = \frac{x^2}{x+1} \cdot \frac{d}{dx}\left(\frac{x+1}{x^2}\right) = \frac{x^2}{x+1} \cdot \frac{x^2 - (x+1)2x}{(x^2)^2} = \frac{x^2 - (x+1)2x}{(x+1)x^2} \]
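c) $e^{\log(x) + 2x}$
Solution:
\[ \frac{d}{dx} e^{\log(x) + 2x} = \frac{d}{dx}\left(e^{\log(x)} e^{2x}\right) = \frac{d}{dx}\left(x e^{2x}\right) = e^{2x} + 2x e^{2x} \]
Alternatively:
\[ \frac{d}{dx} e^{\log(x) + 2x} = e^{\log(x) + 2x} \cdot \frac{d}{dx}\left(\log(x) + 2x\right) = e^{\log(x) + 2x}\left(\frac{1}{x} + 2\right) = e^{\log(x)} e^{2x}\left(\frac{1}{x} + 2\right) = x e^{2x}\left(\frac{1}{x} + 2\right) = e^{2x} + 2x e^{2x} \]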
d) $\frac{3x^2 + 7\log(x)}{(x+1)^4}$
Solution:
\[ \frac{d}{dx} \frac{3x^2 + 7\log(x)}{(x+1)^4} = \frac{(x+1)^4 \left(6x + \frac{7}{x}\right) - (3x^2 + 7\log(x)) \, 4(x+1)^3}{(x+1)^8} \]
e) $e^{3\log(3x^4)}$
Solution:
\[ \frac{d}{dx} e^{3\log(3x^4)} = \frac{d}{dx} (3x^4)^3 = 3(3x^4)^2 \cdot 12x^3 = 324x^{11} \]
f) $\sqrt{x^5 + 3x^2}$
Solution:
\[ \frac{d}{dx} \sqrt{x^5 + 3x^2} = \frac{d}{dx} (x^5 + 3x^2)^{\frac{1}{2}} = \frac{1}{2}(x^5 + 3x^2)^{-\frac{1}{2}}(5x^4 + 6x) = \frac{5x^4 + 6x}{2\sqrt{x^5 + 3x^2}} \]
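g) $\frac{a^{x^3 - 3x} + 3x^2}{\sqrt{3x + 1}}$
Solution:
\[ \frac{d}{dx} \frac{a^{x^3 - 3x} + 3x^2}{\sqrt{3x + 1}} = \frac{\sqrt{3x + 1}\left(\log(a)\, a^{x^3 - 3x}(3x^2 - 3) + 6x\right) - \frac{3\left(a^{x^3 - 3x} + 3x^2\right)}{2\sqrt{3x + 1}}}{3x + 1} \]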
h) $e^{(3x^3 + 6)^3}$
Solution:
\[ \frac{d}{dx} e^{(3x^3 + 6)^3} = e^{(3x^3 + 6)^3} \cdot 3(3x^3 + 6)^2 \cdot 9x^2 = 27 e^{(3x^3 + 6)^3} (3x^3 + 6)^2 x^2 \]
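A quick way to catch slips in derivatives like these is to compare the analytic answer with a numerical approximation. The sketch below does this for parts c), e) and f), as read from the solutions above; the helper name, the dictionary layout and the test point `x0` are hypothetical.

```python
import math

def numeric_derivative(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Each entry: (function, claimed derivative).
checks = {
    "c": (lambda x: math.exp(math.log(x) + 2 * x),
          lambda x: math.exp(2 * x) + 2 * x * math.exp(2 * x)),
    "e": (lambda x: math.exp(3 * math.log(3 * x ** 4)),
          lambda x: 324 * x ** 11),
    "f": (lambda x: math.sqrt(x ** 5 + 3 * x ** 2),
          lambda x: (5 * x ** 4 + 6 * x) / (2 * math.sqrt(x ** 5 + 3 * x ** 2))),
}

x0 = 1.3   # arbitrary test point in the domain of all three functions
for name, (f, df) in checks.items():
    print(name, numeric_derivative(f, x0), df(x0))
```

Agreement at one point is not a proof, but a mismatch at any point in the domain shows that the analytic derivative contains an error.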
*Exercise 2.
Compute the partial derivative with respect to x and y of the following functions (they are
called the Cobb-Douglas and the Constant Elasticity of Substitution (CES) function
respectively, and you will see them often in Microeconomics as well as more mathematical
Macroeconomics):
a) $x^a y^{1-a}$
Solution:
\[ \frac{\partial}{\partial x} x^a y^{1-a} = a x^{a-1} y^{1-a} \]
\[ \frac{\partial}{\partial y} x^a y^{1-a} = (1-a) x^a y^{-a} \]
157
s 1
s
+ (1 ) y
s 1 s
s
s 1
Solution:
s 1
s 1 s
s 1
s 1 s
s 1
1 s 1
1
s
R ( x s + (1 =
)( x s + (1 ) y s ) s 1 =
) y s ) s 1 R(
x s
x
s 1
s
s 1
s 1 s
s 1
s 1 s
s 1
1 s 1
1
s
1
R ( x s + (1 ) y s ) s =
R(
)( x s + (1 ) y s ) s 1
x s =
x
s 1
s
s 1
s 1 s
s 1
s 1 s
s 1
1 s 1
1
s
1
R ( x s + (1 ) y s ) s =
R(
)( x s + (1 ) y s ) s 1
(1 ) y s =
=
s 1
s
y
= R( x
s 1
s
+ (1 ) y
s 1 1
s s 1
(1 ) y s
Exercise 3.
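Calculate the Hessian of the following function. Verify by computation that Young's rule
holds.
\[ f(x, y) = y e^{x+2y} + x^2 y \]
Solution:
\[ \frac{\partial}{\partial x}\left(y e^{x+2y} + x^2 y\right) = y e^{x+2y} + 2xy, \qquad \frac{\partial}{\partial y}\left(y e^{x+2y} + x^2 y\right) = 2y e^{x+2y} + e^{x+2y} + x^2 \]
\[ \frac{\partial^2}{\partial x^2}\left(y e^{x+2y} + x^2 y\right) = \frac{\partial}{\partial x}\left(y e^{x+2y} + 2xy\right) = y e^{x+2y} + 2y \]
\[ \frac{\partial^2}{\partial y \partial x}\left(y e^{x+2y} + x^2 y\right) = \frac{\partial}{\partial y}\left(y e^{x+2y} + 2xy\right) = (2y+1) e^{x+2y} + 2x \]
\[ \frac{\partial^2}{\partial x \partial y}\left(y e^{x+2y} + x^2 y\right) = \frac{\partial}{\partial x}\left((2y+1) e^{x+2y} + x^2\right) = (2y+1) e^{x+2y} + 2x \]
\[ \frac{\partial^2}{\partial y^2}\left(y e^{x+2y} + x^2 y\right) = \frac{\partial}{\partial y}\left((2y+1) e^{x+2y} + x^2\right) = (4y+4) e^{x+2y} \]
Note that $\frac{\partial^2}{\partial y \partial x} f(x, y) = \frac{\partial^2}{\partial x \partial y} f(x, y)$, verifying Young's rule.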
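Young's rule for this function can also be checked numerically. The following sketch approximates both cross partial derivatives with central finite differences at a hypothetical point `(x0, y0)` and compares them with the analytic expression $(2y+1)e^{x+2y} + 2x$; the helper name and step size are assumptions for illustration.

```python
import math

def f(x, y):
    return y * math.exp(x + 2 * y) + x ** 2 * y

def second_partial(f, x, y, first, second, h=1e-4):
    """Differentiate f with respect to `first`, then with respect to `second`."""
    def d_first(x, y):
        if first == "x":
            return (f(x + h, y) - f(x - h, y)) / (2 * h)
        return (f(x, y + h) - f(x, y - h)) / (2 * h)
    if second == "x":
        return (d_first(x + h, y) - d_first(x - h, y)) / (2 * h)
    return (d_first(x, y + h) - d_first(x, y - h)) / (2 * h)

x0, y0 = 0.5, 0.2                                   # hypothetical evaluation point
f_xy = second_partial(f, x0, y0, "x", "y")          # first x, then y
f_yx = second_partial(f, x0, y0, "y", "x")          # first y, then x
analytic = (2 * y0 + 1) * math.exp(x0 + 2 * y0) + 2 * x0
print(f_xy, f_yx, analytic)
```

The three printed numbers agree up to the finite-difference error, which is of the order of the squared step size.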
Exercise 4.
Verify that the second order derivative(s) of the following concave functions is/are indeed
negative (be careful: on the domain of the functions!):
a) $-x^2$
Solution:
\[ \frac{d^2}{dx^2}\left(-x^2\right) = \frac{d}{dx}\left(-2x\right) = -2 < 0 \]
158
$f(x, y) = x^2 y^2$
Solution:
\[ f(tx, ty) = (tx)^2 (ty)^2 = t^4 x^2 y^2 = t^4 f(x, y) \]
So the function is homogeneous of degree 4.
c) $f(x, y) = \sqrt{x + y}$
Solution:
\[ f(tx, ty) = \sqrt{tx + ty} = \sqrt{t(x + y)} = \sqrt{t}\sqrt{x + y} = t^{\frac{1}{2}} f(x, y) \]
So the function is homogeneous of degree $\frac{1}{2}$.
d) $f(x, y) = x^2 y^2 + \sqrt{x + y}$
Solution:
\[ f(tx, ty) = (tx)^2 (ty)^2 + \sqrt{tx + ty} = t^4 x^2 y^2 + \sqrt{t}\sqrt{x + y} \]
We can't go any further with this. The function is not homogeneous (even though it is the sum
of two homogeneous functions).
e) $f(x, y) = \log(x + y)$
Solution:
\[ f(tx, ty) = \log(tx + ty) = \log(t(x + y)) = \log(t) + \log(x + y) = \log(t) + f(x, y) \]
Clearly, this is not a homogeneous function either.
159
This graph sort of looks like the graph of a function, but it is not, because for a function we
want that every x-value gives only one y-value. Here, however, for every $x \in (-1, 1)$ there are
two corresponding y-values. But if we zoomed in on the graph, we would get something
that looks like a function:
160
The part in the zoom is perfectly well behaved: for every x-value there is just one y-value. So
in this part of the graph we can talk about a function y* = g ( x) .
Now let's revisit the conditions of the implicit function theorem. They are two: the partial
derivatives $\frac{\partial f(x_0, y_0)}{\partial x}$ and $\frac{\partial f(x_0, y_0)}{\partial y}$ must exist at the point $(x_0, y_0)$ (the point on which
we're zooming in) and $\frac{\partial f(x_0, y_0)}{\partial y} \neq 0$ (otherwise we would be dividing by zero in the formula
$\frac{dy^*}{dx} = \frac{dg(x)}{dx} = -\frac{\partial f(x, y) / \partial x}{\partial f(x, y) / \partial y}$). Let's check these conditions for $f(x, y) = x^2 + y^2 - 1$.
$\frac{\partial f(x_0, y_0)}{\partial x} = 2x_0$, $\frac{\partial f(x_0, y_0)}{\partial y} = 2y_0$. These both exist everywhere (we'll see in the third
example a case where this isn't so). However, for $y_0 = 0$, $\frac{\partial f(x_0, y_0)}{\partial y} = 0$, so our second
condition is violated. What points in the function are we talking about? Well, let's check:
$f(x, 0) = x^2 + 0^2 - 1 = x^2 - 1 = 0$, so $x = -1 \vee x = 1$. Let's look at these points (-1,0) and (1,0):
So what goes wrong here? At these points, no matter how far we zoom in, there will always
be two y-values. The problem is that at y=0 the graph goes straight up. A rough way of
161
162
Just looking at the graph, we see immediately that things will go wrong in three points:
(-1,0), (0,0) and (1,0). So we imagine that our conditions will fail there. Let's check:
\[ \frac{\partial f(x_0, y_0)}{\partial x} = \left. \frac{\partial}{\partial x}\left(4x^2(1 - x^2) - y^2\right) \right|_{x = x_0} = 8x_0(1 - x_0^2) - 8x_0^3 \]
\[ \frac{\partial f(x_0, y_0)}{\partial y} = \left. \frac{\partial}{\partial y}\left(4x^2(1 - x^2) - y^2\right) \right|_{y = y_0} = -2y_0 \]
Both partial derivatives are well-defined, but for $y_0 = 0$, $\frac{\partial f(x_0, y_0)}{\partial y} = 0$. For what values of $x$
does this hold? Well:
\[ f(x, 0) = 4x^2(1 - x^2) - 0^2 = 0 \;\Leftrightarrow\; 4x^2(1 - x^2) = 0 \;\Leftrightarrow\; x = -1 \vee x = 0 \vee x = 1 \]
So we indeed find that we have trouble at the points (-1,0), (0,0) and (1,0).
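The implicit differentiation formula can be tried out on the circle $f(x, y) = x^2 + y^2 - 1 = 0$ from the first example. The sketch below is an illustration under that assumption; the function names and the test point are hypothetical. It compares $-f_x / f_y$ with the slope of the explicit upper branch $g(x) = \sqrt{1 - x^2}$, and refuses to compute a slope where $f_y = 0$.

```python
import math

def f_x(x, y):
    return 2 * x          # partial derivative of x^2 + y^2 - 1 with respect to x

def f_y(x, y):
    return 2 * y          # partial derivative with respect to y

def implicit_slope(x, y):
    """dy/dx = -f_x / f_y; not defined where f_y = 0."""
    if f_y(x, y) == 0:
        raise ZeroDivisionError("f_y = 0: the implicit function theorem does not apply")
    return -f_x(x, y) / f_y(x, y)

def g(x):
    return math.sqrt(1 - x ** 2)   # explicit upper branch of the circle

x0 = 0.6
y0 = g(x0)                          # the point (0.6, 0.8) lies on the circle
h = 1e-6
numeric_slope = (g(x0 + h) - g(x0 - h)) / (2 * h)
print(implicit_slope(x0, y0), numeric_slope)
```

At the points (-1,0) and (1,0) the call `implicit_slope(1.0, 0.0)` raises the error, which mirrors the failure of the second condition discussed above.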
Now lets think about the derivative of this function at x=2. This should be the tangent line at
x=2, but because of the dent in the function, there is no clear-cut tangent line. Therefore, this
function does not have a derivative at x=2.
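Now consider the implicit relation $f(x, y) = |y + 2| - x^2 = 0$. Plotted, it looks like this:
Clearly, here we have trouble at (0,-2): no matter how far we zoom in, we never get a
function. So let's check our condition:
$$\frac{\partial f(x_0, y_0)}{\partial x} = \left.\frac{\partial}{\partial x}\left(|y + 2| - x^2\right)\right|_{x = x_0} = -2x_0$$
$$\frac{\partial f(x_0, y_0)}{\partial y} = \left.\frac{\partial}{\partial y}\left(|y + 2| - x^2\right)\right|_{y = y_0} = \begin{cases} 1 & \text{if } y_0 > -2 \\ \text{not defined} & \text{if } y_0 = -2 \\ -1 & \text{if } y_0 < -2 \end{cases}$$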
$$dU = \frac{\partial \sqrt{xy}}{\partial x}dx + \frac{\partial \sqrt{xy}}{\partial y}dy = \frac{1}{2}\sqrt{\frac{y}{x}}\,dx + \frac{1}{2}\sqrt{\frac{x}{y}}\,dy$$
Because we are interested in the MRS, we want to keep utility constant, so we impose dU=0.
Furthermore, since the MRS is the change in y for a given change in x, $MRS = \frac{dy}{dx}$. We solve
for that:
$$0 = \frac{1}{2}\sqrt{\frac{y}{x}}\,dx + \frac{1}{2}\sqrt{\frac{x}{y}}\,dy \;\Rightarrow\; dy = -\frac{\sqrt{y/x}}{\sqrt{x/y}}\,dx \;\Rightarrow\; \frac{dy}{dx} = -\frac{y}{x}$$
So what does this mean? Well, if you have, say, 5 units of good y and 10 units of good x, then,
if you had to give up an infinitesimal amount of x, you would require $\frac{5}{10} = \frac{1}{2}$ a unit of y per unit of x as
compensation to keep your utility unchanged. The minus indicates the opposite directions:
you receive one and relinquish the other.
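The result can be checked numerically. The sketch below assumes the utility function $U(x,y) = \sqrt{xy}$ used above; the helper names `utility` and `mrs` and the step size `h` are illustrative choices, not part of the original notes.

```python
import math

def utility(x, y):
    """Utility U(x, y) = sqrt(x * y), as assumed in the example above."""
    return math.sqrt(x * y)

def mrs(x, y, h=1e-6):
    """Approximate dy/dx along an indifference curve as -(dU/dx) / (dU/dy),
    using central finite differences for both partial derivatives."""
    du_dx = (utility(x + h, y) - utility(x - h, y)) / (2 * h)
    du_dy = (utility(x, y + h) - utility(x, y - h)) / (2 * h)
    return -du_dx / du_dy

# With 10 units of x and 5 units of y the formula -y/x gives -0.5.
print(mrs(10.0, 5.0))
```

The printed value should agree with $-y/x$ up to the finite-difference error.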
The implicit differentiation method to derive this is very similar. In fact, it is a bit more
precise, since differentials are not completely well-defined: it is not clear what exactly an
infinitesimal change is. However, intuitively the total differential is easier to grapple with.
Since the method is more precise anyway, we make one other change in the direction of
precision. It is slightly misleading to speak of $\frac{dy}{dx}$, since we appear not even to have defined
a relationship between x and y. And how could there be such a relationship: x and y are just
amounts of goods; you could have as many as you like. Of course the relationship between the
two comes from the fact that we impose that utility is fixed. In effect we imposed a relation
between x and y when we imposed dU=0 (keeping utility fixed). At that point we were no
longer speaking of general x and y, but of particular related values x and y , as we shall now
call them. We are thus interested in $\frac{dy}{dx}$.
For the implicit differentiation approach we start as follows. We impose again that utility is
fixed:
$U(x, y) = \sqrt{x}\sqrt{y} = \bar{U}$, where $\bar{U}$ is some constant, so that
$$\frac{dy}{dx} = -\frac{\frac{1}{2}\sqrt{y/x}}{\frac{1}{2}\sqrt{x/y}} = -\frac{y}{x}.$$
Lo and behold, the result is the same.
* Finally we derive a similar result for general utility functions. This is primarily to show that
more abstract calculations, although they may seem a bit more confusing, are often easier
than concrete examples.
For the total differential approach we again have:
$$0 = dU = \frac{\partial U}{\partial x}dx + \frac{\partial U}{\partial y}dy \;\Rightarrow\; dy = -\frac{\partial U/\partial x}{\partial U/\partial y}\,dx \;\Rightarrow\; \frac{dy}{dx} = -\frac{\partial U/\partial x}{\partial U/\partial y}$$
In fact this last line is just the implicit function theorem (the constants $\bar{U}$ drop out after
differentiation). Indeed the total differential approach is one way of proving the implicit
function theorem. It just remains to link this result to the marginal utilities, but that is easy.
The marginal utility $MU_x$ of x is just $\frac{\partial U}{\partial x}$ and similarly for y. So $MRS = \frac{dy}{dx} = -\frac{MU_x}{MU_y}$. That
result was easier to derive than the specific case! (In fact, since positive numbers are easier to
work with, MRS is often defined as $MRS := -\frac{dy}{dx} \left(= \frac{MU_x}{MU_y}\right)$. This is just a matter of notation.)
*Exercise 2.
Derive the derivative of log(x) by differentiating $e^{\log(x)} = x$ and using your knowledge of the
derivative of $e^y$ and the inverse of the exponential function.
Solution:
This may seem like a silly question, since we know the derivative of log(x) just as well as we
know the derivative of $e^x$. However, log(x) is actually defined as the inverse of $e^x$, so all the
information we have on log(x) comes from our knowledge of $e^x$. To work then:
$\frac{d}{dx}e^{\log(x)} = \frac{d}{dx}x$; working out the left hand side we get:
$$\frac{d}{dx}e^{\log(x)} = \frac{de^{\log(x)}}{d\log(x)}\,\frac{d\log(x)}{dx} = e^{\log(x)}\frac{d}{dx}\log(x) = x\,\frac{d}{dx}\log(x),$$
whilst the right hand side gives:
$$\frac{d}{dx}x = 1$$
So:
$$x\,\frac{d}{dx}\log(x) = 1 \;\Rightarrow\; \frac{d}{dx}\log(x) = \frac{1}{x}$$
So now we have actually proved that the derivative of log(x) is as we always assumed it was.
$\frac{d\log(y)}{d\log(x)}$, where $\varepsilon_{y,x}$ is the elasticity of y with respect to x, by using
Solution:
This is actually not that hard, but it turns out to be rather useful in many areas. We derive the
total differential of f(y)=log(y):
$$df(y) = d\log(y) = \frac{\partial \log(y)}{\partial y}dy = \frac{1}{y}dy.$$
Similarly $d\log(x) = \frac{1}{x}dx$. Dividing the two, we get:
$$\frac{d\log(y)}{d\log(x)} = \frac{\left(\frac{dy}{y}\right)}{\left(\frac{dx}{x}\right)} = \frac{dy}{dx}\,\frac{x}{y} = \varepsilon_{y,x}$$
Exercise 4.
Show that the demand function $Q = \alpha P^{-\beta}$ ($\alpha$ and $\beta$ constants) exhibits constant elasticity, as
well as the derived log-linear demand function $\log(Q) = \log(\alpha) - \beta\log(P)$. Next week we will
see that this demand function arises from Cobb-Douglas utility functions.
Solution:
$$\varepsilon_{Q,P} = \frac{dQ}{dP}\frac{P}{Q} = -\beta\alpha P^{-\beta-1}\,\frac{P}{\alpha P^{-\beta}} = -\beta,$$
which is constant (independent of price).
$$\varepsilon_{\log(Q),\log(P)} = \frac{d\log Q}{d\log P} = -\beta$$
* Exercise 5.
Estimate the effect of a change in x on f(x,y(x)), where:
a) x is ability, y is education and f(x,y(x)) is income.
b) $\pi(p, D(p)) = pD(p) - cD(p)$
Solution:
a) $\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}$. What does this mean? The total effect of ability on income is
composed of two separate effects: the direct effect of ability on income $\frac{\partial f}{\partial x}$, which is
positive (if you're smarter you'll generally earn more money) plus the effect of ability on
education $\frac{dy}{dx}$ (which is presumably also positive) times the effect of education on income
$\frac{\partial f}{\partial y}$, which is again positive. Since all terms are positive, the effect of ability on income
will also be positive. Should it be the case that very able people actually get less education
(say because they think it beneath them, or because they're so smart they don't function in
The point here is that writing down this equation allows you to see all the partial effects.
Your analysis will then be as convincing as your explanation of the signs of the
derivatives is.
b)
$\frac{d}{dp}\pi(p, D(p)) = D(p) + (p - c)D'(p)$ (Note that we sometimes denote the derivative of
functions of one variable as $\frac{d}{dx}f(x) = f'(x)$).
Here p is the price, D is demand, $\pi$ is profit and c is the constant unit cost of production.
So $\frac{d\pi(p, D(p))}{dp}$ is the effect of a change in price on the profit. The marginal effect of this is that it allows
you to get a little more money from the people you sell to (D(p)), while it costs you some
demand D'(p), which in turn costs you p-c per customer, as you don't get your money, but
you also don't incur the costs of production for them. In fact the total effect is
unambiguously positive if p - c < 0, which makes sense: if your price is so low that you
make a loss each time you sell, you can increase profits by increasing the price. Otherwise
the effect depends on whether you think the loss in customers will be outweighed by the
extra profit per customer you make.
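$$f(x, y) = xy^2 + e^{x^3 - 2xy} + \log x$$
Solution:
$$\frac{\partial f(x, y)}{\partial x} = y^2 + (3x^2 - 2y)e^{x^3 - 2xy} + \frac{1}{x}$$
$$\frac{\partial f(x, y)}{\partial y} = 2xy - 2xe^{x^3 - 2xy}$$
$$\frac{\partial^2 f(x, y)}{\partial x^2} = [6x + (3x^2 - 2y)^2]e^{x^3 - 2xy} - \frac{1}{x^2}$$
$$\frac{\partial^2 f(x, y)}{\partial y \partial x} = 2y - 2e^{x^3 - 2xy} - 2x(3x^2 - 2y)e^{x^3 - 2xy} = 2y - [2 + 2x(3x^2 - 2y)]e^{x^3 - 2xy}$$
$$\frac{\partial^2 f(x, y)}{\partial x \partial y} = 2y - 2e^{x^3 - 2xy} - 2x(3x^2 - 2y)e^{x^3 - 2xy} = 2y - [2 + 2x(3x^2 - 2y)]e^{x^3 - 2xy}$$
$$\frac{\partial^2 f(x, y)}{\partial y^2} = 2x + 4x^2 e^{x^3 - 2xy}$$
Collected in the Hessian matrix:
$$H = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial y \partial x} \\ \frac{\partial^2 f}{\partial x \partial y} & \frac{\partial^2 f}{\partial y^2} \end{pmatrix} = \begin{pmatrix} [6x + (3x^2 - 2y)^2]e^{x^3 - 2xy} - \frac{1}{x^2} & 2y - [2 + 2x(3x^2 - 2y)]e^{x^3 - 2xy} \\ 2y - [2 + 2x(3x^2 - 2y)]e^{x^3 - 2xy} & 2x + 4x^2 e^{x^3 - 2xy} \end{pmatrix}$$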
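Partial derivatives like these are easy to get wrong by a sign, so a finite-difference check can help. The following is a minimal sketch, assuming the function $f(x,y) = xy^2 + e^{x^3-2xy} + \log x$ from this exercise; the function names and the test point (1, 0.5) are illustrative choices.

```python
import math

def f(x, y):
    return x * y**2 + math.exp(x**3 - 2 * x * y) + math.log(x)

def f_x(x, y):
    """Analytic partial derivative with respect to x."""
    return y**2 + (3 * x**2 - 2 * y) * math.exp(x**3 - 2 * x * y) + 1 / x

def f_y(x, y):
    """Analytic partial derivative with respect to y."""
    return 2 * x * y - 2 * x * math.exp(x**3 - 2 * x * y)

def f_xy(x, y):
    """Analytic cross-partial derivative."""
    return 2 * y - (2 + 2 * x * (3 * x**2 - 2 * y)) * math.exp(x**3 - 2 * x * y)

def numeric_partial(func, x, y, wrt, h=1e-5):
    """Central finite difference of func with respect to 'x' or 'y'."""
    if wrt == "x":
        return (func(x + h, y) - func(x - h, y)) / (2 * h)
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

x0, y0 = 1.0, 0.5
print(f_x(x0, y0), numeric_partial(f, x0, y0, "x"))
print(f_y(x0, y0), numeric_partial(f, x0, y0, "y"))
# Young's theorem: differentiating f_x in y and f_y in x gives the same cross-partial.
print(f_xy(x0, y0), numeric_partial(f_x, x0, y0, "y"), numeric_partial(f_y, x0, y0, "x"))
```

Each line should print values that agree up to the finite-difference error.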
Exercise 2.
Suppose that $f(x_1, x_2)$ is a homogeneous function of degree 2 and $g(x_1, x_2)$ is a homogeneous
function of degree 6. Show that the function $h(x_1, x_2) = f(x_1, x_2)^3 + g(x_1, x_2)$ is homogeneous
and determine the degree.
$s^6 y = g(sx_1, sx_2)$
$$h(sx_1, sx_2) = f(sx_1, sx_2)^3 + g(sx_1, sx_2) = [s^2 f(x_1, x_2)]^3 + s^6 g(x_1, x_2) = s^6[f(x_1, x_2)^3 + g(x_1, x_2)] = s^6 h(x_1, x_2)$$
Hence, h(.) is homogeneous of degree 6.
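The scaling argument can be illustrated with concrete functions. In the sketch below, `f` and `g` are hypothetical examples chosen only because they are homogeneous of degree 2 and 6; they do not come from the exercise.

```python
def f(x1, x2):
    """Hypothetical example of a function homogeneous of degree 2."""
    return x1 * x2

def g(x1, x2):
    """Hypothetical example of a function homogeneous of degree 6."""
    return x1**6 + x2**6

def h(x1, x2):
    return f(x1, x2)**3 + g(x1, x2)

def scaling_ratio(func, x1, x2, s):
    """Return func(s*x1, s*x2) / func(x1, x2); for degree k this equals s**k."""
    return func(s * x1, s * x2) / func(x1, x2)

# For s = 2 a function homogeneous of degree 6 should scale by 2**6 = 64.
print(scaling_ratio(h, 1.5, 0.5, 2.0))
```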
* Exercise 3.
Consider the implicit relation between x and y defined by:
$$(x - 3)^2 + (y + 3)^2 = 9.$$
Use the implicit function theorem to find the derivative $\frac{dy}{dx}$.
You will get an outcome that depends on both x and y. Use the original relation between x and
y to determine the value of the derivative $\frac{dy}{dx}$ at x=3 and at x=1. For the latter case you will get
two possible outcomes.
Also find the two points where the relation cannot be represented as a function y(x).
Finally, draw a picture to elucidate your findings.
Solution:
The relation $(x - 3)^2 + (y + 3)^2 = 9$ describes a circle with centre (3, -3) and radius 3.
We construct an implicit function: $z = (x - 3)^2 + (y + 3)^2 - 9$
$dz = 2(x - 3)dx + 2(y + 3)dy$
dz = 0 gives $\frac{dy}{dx} = -\frac{x - 3}{y + 3}$
At x=3 we get y=0 and y=-6, and in both cases $\frac{dy}{dx} = 0$
At x=1 we get $y = -3 + \sqrt{5} \approx -0.764$ and $y = -3 - \sqrt{5} \approx -5.236$, so that $\frac{dy}{dx} = \frac{2}{\sqrt{5}} \approx 0.894$ and $\frac{dy}{dx} = -\frac{2}{\sqrt{5}} \approx -0.894$ respectively
$\frac{dy}{dx}$ does not exist at (0,-3) and (6,-3)
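The points on the circle and the slopes there can be computed directly. This is a small sketch assuming the circle $(x-3)^2 + (y+3)^2 = 9$ from the exercise; `y_values` and `slope` are illustrative helper names.

```python
import math

def y_values(x):
    """Both y-values on the circle (x-3)^2 + (y+3)^2 = 9 for a given x (upper branch first)."""
    disc = 9 - (x - 3)**2
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return [-3 + root, -3 - root]

def slope(x, y):
    """dy/dx = -(x - 3) / (y + 3) from the implicit function theorem; undefined when y = -3."""
    return -(x - 3) / (y + 3)

for x in (3.0, 1.0):
    for y in y_values(x):
        print(x, round(y, 3), round(slope(x, y), 3))
```

At x=0 and x=6 the only y-value is -3, where the denominator y+3 vanishes, which is why the derivative does not exist there.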
K.9.1.
Univariate Calculus
First order condition, stationary point
K.9.1.
K.9.1.
Concavity
K.9.1.
Multivariate calculus
First order condition, stationary/saddle point
K.10.1.
Hessian matrix
K.10.3.
K.10.3.
Constrained optimization
Substitution
K.11.1.
Lagrange
K.11.2.
Multipliers
K.11.2.
Value function
K.11.2.
Envelope theorem
K.11.2.
$y = f(x) = |x - 5|$
Note that f(5) = 0, but the derivative does not exist at x = 5. Thus
x = 5 is not a stationary point.
Note that $\lim_{x \to \infty} \frac{x}{x^2 + 4} = 0$ and $\lim_{x \to -\infty} \frac{x}{x^2 + 4} = 0$ so that both extrema are
global extrema.
$$f''(x) = \frac{2}{3}x - \frac{1}{3} = \frac{1}{3}(2x - 1)$$
The function has an inflection point at x=1/2, because $f''(\tfrac{1}{2}) = 0$,
$f''(x) > 0$ for $x > \tfrac{1}{2}$, and
$f''(x) < 0$ for $x < \tfrac{1}{2}$
For x=-1 the function has a local maximum:
$f'(-1) = 0$ and $f''(-1) < 0$
For x=2 the function has a local minimum:
$f'(2) = 0$ and $f''(2) > 0$
Example 9:
$f(x) = x^6 - 10x^4$
$f'(x) = 6x^5 - 40x^3$
Second derivative of f(x):
$f''(x) = 30x^4 - 120x^2 = 30x^2(x^2 - 4) = 30x^2(x + 2)(x - 2)$
The function has an inflection point at x=-2 and x=2. There is no inflection
point at x=0, because f''(x) does not change sign there.
$d^2y = f''(x)dx^2 = f_{11}(x)dx^2 > 0$ (minimum)
5) Note that $d^2y$ is the change in dy and that $dx^2$ is the square of dx, the
change in x
f 21 f 22
Remember: Young's theorem (lecture 4): $f_{12} = f_{21}$
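$$d^2y = \begin{bmatrix} dx_1 & dx_2 \end{bmatrix} \begin{bmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{bmatrix} \begin{bmatrix} dx_1 \\ dx_2 \end{bmatrix}$$
$$|H_2| = \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix} > 0$$
Minimum:
$$|H_1| = f_{11} > 0, \qquad |H_2| = \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix} > 0$$
Saddle point:
$$|H_2| = \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix} < 0$$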
A matrix $A = \begin{pmatrix} a & c \\ b & d \end{pmatrix}$ is positive definite if $x'Ax > 0$ for all $x \neq 0$.
A matrix $A = \begin{pmatrix} a & c \\ b & d \end{pmatrix}$ is negative definite if $x'Ax < 0$ for all $x \neq 0$.
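$$H = \begin{pmatrix} -2 & 0 \\ 0 & -2 \end{pmatrix}$$
The eigenvalues of the matrix at (0, 0) are $\lambda = -2$. Reason:
$$|H - \lambda I| = \begin{vmatrix} -2 - \lambda & 0 \\ 0 & -2 - \lambda \end{vmatrix} = (-2 - \lambda)^2$$
$|H - \lambda I|$ is zero for $\lambda = -2$. Because all eigenvalues are negative, the
matrix is negative definite, so that at (0, 0) the function reaches a
maximum.
Hessian at (2/3, 0):
$$A = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}$$
$$|A - \lambda I| = \begin{vmatrix} 2 - \lambda & 0 \\ 0 & -2 - \lambda \end{vmatrix} = (2 - \lambda)(-2 - \lambda)$$
Because the eigenvalues have a negative and a positive sign, the matrix
is indefinite, so that at (2/3, 0) the function reaches neither a maximum
nor a minimum.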
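The eigenvalue test for a 2x2 Hessian can be sketched in a few lines. The code below assumes a symmetric matrix with entries a, b, b, d (which holds for Hessians by Young's theorem); the function names are illustrative, and the closed-form eigenvalue formula is the standard one for symmetric 2x2 matrices.

```python
import math

def eigenvalues_2x2(a, b, d):
    """Eigenvalues of the symmetric matrix [[a, b], [b, d]], smallest first."""
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2)**2 + b**2)
    return mean - radius, mean + radius

def classify(a, b, d):
    """Classify a symmetric 2x2 matrix by the signs of its eigenvalues."""
    low, high = eigenvalues_2x2(a, b, d)
    if low > 0:
        return "positive definite"
    if high < 0:
        return "negative definite"
    if low < 0 < high:
        return "indefinite"
    return "semidefinite"

print(classify(-2, 0, -2))  # Hessian at (0, 0) in the example above
print(classify(2, 0, -2))   # Hessian at (2/3, 0) in the example above
```

A negative definite Hessian at a stationary point indicates a maximum, a positive definite one a minimum, and an indefinite one a saddle point.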
We can solve the problem by taking the first partial derivatives with
respect to x and $\lambda$. Both derivatives are set equal to zero.
L( x, )
so that
= 2ax = 0
(1)
x
L( x, )
and
= x2=0
(2)
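(1)
$$\frac{\partial g(x_1^*(c), x_2^*(c))}{\partial x_1}\frac{\partial x_1^*(c)}{\partial c} + \frac{\partial g(x_1^*(c), x_2^*(c))}{\partial x_2}\frac{\partial x_2^*(c)}{\partial c} = 1 \qquad (2)$$
Step 2 Take the derivative of the objective function $f(x_1, x_2)$ with
respect to c (chain rule):
$$\frac{df(x_1^*(c), x_2^*(c))}{dc} = \frac{\partial f(x_1^*(c), x_2^*(c))}{\partial x_1}\frac{dx_1^*(c)}{dc} + \frac{\partial f(x_1^*(c), x_2^*(c))}{\partial x_2}\frac{dx_2^*(c)}{dc} \qquad (3)$$
Step 3 We consider the first-order conditions of the Lagrangian
function (equation (1)):
$$\frac{\partial f(x_1^*(c), x_2^*(c))}{\partial x_1} = \lambda\frac{\partial g(x_1^*(c), x_2^*(c))}{\partial x_1} \qquad (4)$$
$$\frac{\partial f(x_1^*(c), x_2^*(c))}{\partial x_2} = \lambda\frac{\partial g(x_1^*(c), x_2^*(c))}{\partial x_2} \qquad (5)$$
Substituting (4) and (5) in (3) and using (2):
$$\frac{df(x_1^*(c), x_2^*(c))}{dc} = \lambda\frac{\partial g(x_1^*(c), x_2^*(c))}{\partial x_1}\frac{dx_1^*(c)}{dc} + \lambda\frac{\partial g(x_1^*(c), x_2^*(c))}{\partial x_2}\frac{dx_2^*(c)}{dc} = \lambda$$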
(1)
Step 1.
Take the first derivative of the constraint $x^*(c) = c$ with respect to c:
$$\frac{dx^*(c)}{dc} = 1 \qquad (2)$$
Step 2.
Take the first derivative of the unconstrained function (the objective
function) with respect to c:
$$\frac{df(x^*(c))}{dc} = 2ax^*(c)\frac{dx^*(c)}{dc} \qquad (3)$$
Step 3.
Take the partial derivative of the Lagrangian function (1) with respect
to x:
$$2ax^*(c) = \lambda \qquad (4)$$
Step 4.
Substituting equation (4) in equation (3):
$$\frac{df(x^*(c))}{dc} = 2ax^*(c)\frac{dx^*(c)}{dc} = \lambda\frac{dx^*(c)}{dc} \qquad (5)$$
ax (c)
dc
dc
$$L(x_1, \ldots, x_n, \lambda_1, \ldots, \lambda_m) = f(x_1, \ldots, x_n) - \sum_{i=1}^{m} \lambda_i [g^i(x_1, \ldots, x_n) - c_i]$$
i-th constraint: $g^i(x_1, \ldots, x_n) = c_i$
if n = 1
dx 2 x =0 2
So we have found a minimum for n=1. But for the other cases we don't know yet. We'll graph
them to see what is going on:
1
4
Here we graphed x , x , x . Clearly they all have a minimum at x=0. This shows that the
conditions we use for finding a minimum are only a sufficient requirement: always if we find
a critical point that obeys the second order condition, it is a minimum, but not all minima
obey the second order condition.
d) $e^x$
Solution:
$\frac{de^x}{dx} = e^x > 0$, so this function has no critical points. It is always increasing, or monotonically
increasing, as it is called.
e) $e^{x^2 + 3x - 2}$
Solution:
Given that we saw in the last exercise that $e^x$ is always increasing, we expect to find a
minimum here at the minimum of the exponent. Indeed, this is what comes out:
$$\frac{de^{x^2 + 3x - 2}}{dx} = (2x + 3)e^{x^2 + 3x - 2} = 0.$$
Since the exponential function is always positive, this implies:
$2x + 3 = 0 \Rightarrow x = -\frac{3}{2}$. This is indeed at the minimum of the exponent.
$$\frac{d^2 e^{x^2 + 3x - 2}}{dx^2} = (2x + 3)^2 e^{x^2 + 3x - 2} + 2e^{x^2 + 3x - 2} > 0.$$
We could plug in the optimal value of x, but we
see that only positive numbers (a square, twice an exponential function, 2) occur, so we see
immediately that this is positive and therefore the critical point is a minimum.
Solution:
$$\frac{d(e^x - 2x)}{dx} = e^x - 2 = 0 \Rightarrow e^x = 2 \Rightarrow x = \log(2)$$
$\frac{d^2(e^x - 2x)}{dx^2} = e^x > 0$, so we find a minimum. Now at the minimum:
$$\left. e^x - 2x \right|_{x = \log(2)} = e^{\log(2)} - 2\log(2) = 2 - 2\log(2) > 0,$$
as log(2) < 1. So, coming back to question
f), we find that even at its minimum, the first derivative of $e^x - x^2$ is positive, so it is always
increasing and has no critical points.
Exercise 2.
Determine whether the following matrices are positive definite, negative definite, or neither:
* a) $\begin{pmatrix} a & b \\ b & c \end{pmatrix}$
Solution:
Because the dimension of the matrix is very small, we can apply the definition of positive
definiteness directly. We focus on symmetric matrices, because it can be shown that if a
matrix A is positive definite, one can always find a symmetric positive definite matrix S that
gives the same outcomes, i.e.: $x^T A x = x^T S x$ for all vectors x.
For a positive definite matrix, it must hold that:
$$\begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} ax + by \\ bx + cy \end{pmatrix} = ax^2 + 2bxy + cy^2 > 0$$
for all possible x and y (not both zero).
Clearly, if we take either x=0 or y=0, we find that both a and c must be positive. To find the
final condition, we rewrite our expression:
$$ax^2 + 2bxy + cy^2 = a\left(x^2 + 2x\frac{b}{a}y\right) + cy^2 = a\left(x^2 + 2x\frac{b}{a}y + \frac{b^2}{a^2}y^2\right) + cy^2 - \frac{b^2}{a}y^2 = a\left(x + \frac{b}{a}y\right)^2 + \left(c - \frac{b^2}{a}\right)y^2 > 0$$
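b) $\begin{pmatrix} \frac{2}{3} & 0 & 0 \\ \frac{1}{6} & \frac{3}{4} & \frac{1}{3} \\ \frac{1}{6} & \frac{1}{4} & \frac{2}{3} \end{pmatrix}$
Solution:
We have seen this matrix before; it is the Markov chain example we analyzed in the tutorial of
week 3. There we saw that it had eigenvalues $\lambda_1 = 1$, $\lambda_2 = \frac{2}{3}$, $\lambda_3 = \frac{5}{12}$, so all eigenvalues are
positive. This is one characterization of a positive definite (PD) matrix. We double-check our
result by looking at the leading principal minors:
$$\det \begin{pmatrix} \frac{2}{3} & 0 & 0 \\ \frac{1}{6} & \frac{3}{4} & \frac{1}{3} \\ \frac{1}{6} & \frac{1}{4} & \frac{2}{3} \end{pmatrix} = \frac{2}{3}\left(\frac{3}{4} \cdot \frac{2}{3} - \frac{1}{4} \cdot \frac{1}{3}\right) = \frac{5}{18} > 0$$
$$\det \begin{pmatrix} \frac{2}{3} & 0 \\ \frac{1}{6} & \frac{3}{4} \end{pmatrix} = \frac{2}{3} \cdot \frac{3}{4} = \frac{1}{2} > 0$$
$$\frac{2}{3} > 0$$
They are all positive, confirming our results.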
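c) $\begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix}$
Solution:
This time we have to check by direct computation:
$$\det \begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix} = 1(2 \cdot 3 - 1 \cdot 1) + 1(0 \cdot 1 - 2 \cdot 1) = 3 > 0$$
$$\det \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} = 2 > 0$$
$$1 > 0$$
It is PD.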
*Exercise 3.
Find the critical points of the following functions and assess whether they are minima,
maxima or saddle points:
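a) $f(x, y) = \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} d & e \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$, a,b,c,d,e constants
Solution:
From 2a) we know that the function written out becomes:
$$f(x, y) = \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} d & e \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = ax^2 + 2bxy + cy^2 + dx + ey$$
The two first-order conditions become:
$$\frac{\partial f}{\partial x} = 2ax + 2by + d = 0$$
$$\frac{\partial f}{\partial y} = 2bx + 2cy + e = 0$$
Note that we can write this as:
$$2\begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} d \\ e \end{pmatrix} = 0$$
Basically, f is a quadratic function, but now in two dimensions (in fact functions of the form
$x^T A x + b^T x + c$ are called quadratic functions). The rule for its derivative is very similar to
the one dimensional case. We could solve for the critical point (x,y) by solving this matrix
equation, but we don't really care about the outcome, so we move on to the second order
condition.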
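$$H = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{pmatrix} = 2\begin{pmatrix} a & b \\ b & c \end{pmatrix}$$
Again, the Hessian of this function looks very much like the second derivative of a one
dimensional quadratic function. Furthermore, we have seen in question 2a) under what conditions this
matrix is positive definite ($ac - b^2 > 0$, a > 0 and c > 0) and when it is negative definite
($ac - b^2 > 0$, a < 0 and c < 0). The first case corresponds to a minimum, the second to a
maximum.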
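b) $f(x, y, z) = \begin{pmatrix} x & y & z \end{pmatrix} \begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} + \begin{pmatrix} 1 & 2 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} + 5$
Solution:
We can apply the same result that we saw in question 3a): the derivative of a quadratic
function is:
$$\nabla f(x, y, z) = 2\begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} + \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = 0$$
Here we write $\nabla f(x, y, z)$ for the column vector of partial derivatives of f(x,y,z). This is here
nothing more than a short-hand, although $\nabla f(x, y, z)$, called the gradient of f, is a useful thing
with interesting properties, which we do not study here.
Our equation leads to the matrix equation:
$$\begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = -\frac{1}{2}\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$$
which we can solve by sweeping to find: $x = -\frac{1}{6}$, $y = -\frac{1}{3}$, $z = -\frac{1}{3}$. So our critical point is
$\left(-\frac{1}{6}, -\frac{1}{3}, -\frac{1}{3}\right)$. We know that the Hessian is equal to: $2\begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix}$. We saw in question 2c) that this matrix is
positive definite, so we have a minimum here.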
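The "sweeping" (Gauss-Jordan elimination) step can be reproduced exactly with rational arithmetic. The sketch below is one possible implementation; the helper name `solve` is illustrative, and the system solved is the matrix equation from this exercise with right-hand side $-\frac{1}{2}(1, 2, 3)^T$.

```python
from fractions import Fraction

def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination with exact fractions."""
    n = len(b)
    # Augmented matrix [a | b] with exact rational entries.
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so the pivot equals 1.
        m[col] = [v / m[col][col] for v in m[col]]
        # Sweep the pivot column clean in all other rows.
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]

A = [[1, 0, 1], [0, 2, 1], [1, 1, 3]]
rhs = [Fraction(-1, 2), Fraction(-1), Fraction(-3, 2)]
print(solve(A, rhs))
```

Using `Fraction` avoids rounding, so the critical point comes out as exact fractions.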
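Solution:
We compute the partial derivatives and set them equal to zero:
$$\frac{\partial f(x, y, z)}{\partial x} = yz - 2x = 0$$
$$\frac{\partial f(x, y, z)}{\partial y} = xz - 6y = 0$$
$$\frac{\partial f(x, y, z)}{\partial z} = xy - \frac{1}{z} = 0 \Rightarrow xyz = 1,\; yz = \frac{1}{x},\; xz = \frac{1}{y}$$
Plugging these last two equalities back into the first partial derivatives, we get
$$2x = \frac{1}{x} \Rightarrow x^2 = \frac{1}{2} \Rightarrow x = \frac{1}{\sqrt{2}} \vee x = -\frac{1}{\sqrt{2}}$$
$$6y = \frac{1}{y} \Rightarrow y^2 = \frac{1}{6} \Rightarrow y = \frac{1}{\sqrt{6}} \vee y = -\frac{1}{\sqrt{6}}$$
Using now the fact that xyz = 1, we find $z = 2\sqrt{3} \vee z = -2\sqrt{3}$. If we make sure that the signs
work out correctly (for xyz to be positive, we must have an even number of negative
numbers), we find that there are 4 critical points:
$$\begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{6}} \\ 2\sqrt{3} \end{pmatrix}, \begin{pmatrix} -\frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{6}} \\ 2\sqrt{3} \end{pmatrix}, \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{6}} \\ -2\sqrt{3} \end{pmatrix}, \begin{pmatrix} -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{6}} \\ -2\sqrt{3} \end{pmatrix}.$$
To see whether these are maxima, minima or saddle points, we have to calculate the Hessian:
$$H = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} & \frac{\partial^2 f}{\partial x \partial z} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} & \frac{\partial^2 f}{\partial y \partial z} \\ \frac{\partial^2 f}{\partial z \partial x} & \frac{\partial^2 f}{\partial z \partial y} & \frac{\partial^2 f}{\partial z^2} \end{pmatrix} = \begin{pmatrix} -2 & z & y \\ z & -6 & x \\ y & x & \frac{1}{z^2} \end{pmatrix}$$
In general, it could be a lot of work to check for all four points whether this matrix will be
positive or negative definite. However, in this case we observe that the first two diagonal
entries are negative, while the third is positive (it is a square). Since positive definiteness
requires all diagonal elements positive, while negative definiteness requires all diagonal
elements negative, clearly these Hessians can be neither. Therefore, all critical points are
saddle points.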
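$$\frac{\partial f}{\partial x} = \frac{1}{x} + \frac{1}{y + x - 1}, \qquad \frac{\partial f}{\partial y} = \frac{1}{y} + \frac{1}{y + x - 1}$$
$$H = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{pmatrix} = \begin{pmatrix} -\frac{1}{x^2} - \frac{1}{(x + y - 1)^2} & -\frac{1}{(x + y - 1)^2} \\ -\frac{1}{(x + y - 1)^2} & -\frac{1}{y^2} - \frac{1}{(x + y - 1)^2} \end{pmatrix}$$
At $x = y = \frac{1}{3}$ the characteristic equation is:
$$\begin{vmatrix} -18 - \lambda & -9 \\ -9 & -18 - \lambda \end{vmatrix} = 0 \Rightarrow (-18 - \lambda)^2 = 81 \Rightarrow \lambda_1 = -27,\; \lambda_2 = -9$$
Both eigenvalues of H are negative, so that the matrix H is negative definite (ND).
Consequently, at $x = y = \frac{1}{3}$ the function reaches a maximum.
One could also argue that H is a negative definite matrix, because the diagonal entries are
negative (minus squares), while the determinant is:
$$\det \begin{pmatrix} -\frac{1}{x^2} - \frac{1}{(x + y - 1)^2} & -\frac{1}{(x + y - 1)^2} \\ -\frac{1}{(x + y - 1)^2} & -\frac{1}{y^2} - \frac{1}{(x + y - 1)^2} \end{pmatrix} = \det \begin{pmatrix} a + z & z \\ z & b + z \end{pmatrix} = (a + z)(b + z) - z^2 = ab + az + bz$$
Here I define a, b and z to simplify the expression. Because a, b and z are all negative, this
determinant must be positive. So we see the Hessian is negative definite everywhere and
therefore the function is concave. This means that whatever critical point we find, it will be a
maximum. Furthermore, for a strictly concave function, there is only 1 critical point.
Looking at the symmetry of the function, we might wish to guess what the critical point is. If
we do this successfully and find that all the partial derivatives are zero there, we are done.
We pick $x = y = \frac{1}{3}$. When we evaluate the partial derivatives we found earlier, we see that we
indeed get zero, so we are done.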
$$H = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{pmatrix} = \begin{pmatrix} 4(x^2 + y^2) + 8x^2 & 8xy \\ 8xy & 4(x^2 + y^2) + 8y^2 \end{pmatrix}$$
Evaluated at zero, this becomes:
$$H = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$
This matrix is not positive definite, but still the function is at a minimum. This is another
example of the fact (already observed in question 1c)) that the second-order condition is
sufficient, but not necessary for a minimum.
But we still have to show that the function is indeed at a minimum. Fortunately, this is not so hard
to do. Recall that for a point (x,y), $x^2 + y^2$ is the square of the distance to the origin, i.e. the
point (0,0). So what the function f(x,y) does is give us the square of that number. Clearly, for
any point but (0,0) itself, our critical point, this will give a positive number. So the function is
everywhere positive, except at the point (0,0), where it is 0. Thus (0,0) must be a minimum.
* Exercise 4.
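Maximize $U(x, y) = x^{\alpha}y^{\beta}$, $\alpha + \beta = 1$, subject to $px + y = I$. Do this first by substitution and
then by Lagrange multipliers.
Solution:
Substitution:
$px + y = I \Rightarrow y = I - px$, so
$$U(x, y) = x^{\alpha}y^{\beta} \Rightarrow U(x) = x^{\alpha}(I - px)^{\beta}$$
$$\frac{\partial U}{\partial x} = \alpha x^{\alpha - 1}(I - px)^{\beta} - \beta p x^{\alpha}(I - px)^{\beta - 1} = 0$$
$$\alpha x^{\alpha - 1}(I - px)^{\beta} = \beta p x^{\alpha}(I - px)^{\beta - 1}$$
$$\alpha(I - px) = \beta px \Rightarrow \alpha I = (\alpha + \beta)px$$
$$x = \frac{\alpha I}{p}, \text{ (remember } \alpha + \beta = 1)$$
$$y = I - px = (1 - \alpha)I = \beta I$$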
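Lagrange multipliers:
$$L(x, y, \lambda) = x^{\alpha}y^{\beta} - \lambda(px + y - I)$$
$$\frac{\partial L}{\partial x} = \alpha x^{\alpha - 1}y^{\beta} - \lambda p = 0 \Rightarrow \alpha x^{\alpha - 1}y^{\beta} = \lambda p$$
$$\frac{\partial L}{\partial y} = \beta x^{\alpha}y^{\beta - 1} - \lambda = 0 \Rightarrow \beta x^{\alpha}y^{\beta - 1} = \lambda$$
For the problems encountered in economics, it is almost always useful to divide the two
first-order conditions of the Lagrangian:
$$\frac{\alpha x^{\alpha - 1}y^{\beta}}{\beta x^{\alpha}y^{\beta - 1}} = \frac{\lambda p}{\lambda} \Rightarrow \frac{\alpha y}{\beta x} = p$$
We plug this into the budget constraint:
$$px + y = I \Rightarrow \frac{\alpha y}{\beta x}x + y = \frac{\alpha}{\beta}y + y = \frac{(\alpha + \beta)y}{\beta} = I \Rightarrow y = \beta I$$
$$px + y = I \Rightarrow px + \beta I = I \Rightarrow x = \frac{(1 - \beta)I}{p} = \frac{\alpha I}{p}$$
The results are the same.
However, the Lagrangian multiplier method does allow us to say one more thing: we can now
interpret the multiplier $\lambda$. Let's derive the marginal utility of income. That is the amount of
extra utility derived from more income. So if we call our optimal consumption solutions
$x^*(I) = \frac{\alpha I}{p}$, $y^*(I) = \beta I$, our utility at optimal consumption is $U^*(I) = U(x^*, y^*)$ and the
marginal utility of income is:
$$\frac{\partial U^*(I)}{\partial I} = \frac{\partial U(x^*, y^*)}{\partial x}\frac{\partial x^*}{\partial I} + \frac{\partial U(x^*, y^*)}{\partial y}\frac{\partial y^*}{\partial I}.$$
Now we know from the Lagrange first-order conditions that $\frac{\partial U(x^*, y^*)}{\partial x} = \lambda p$ and $\frac{\partial U(x^*, y^*)}{\partial y} = \lambda$, so we get:
$$\frac{\partial U(x^*, y^*)}{\partial x}\frac{\partial x^*}{\partial I} + \frac{\partial U(x^*, y^*)}{\partial y}\frac{\partial y^*}{\partial I} = \lambda p\frac{\partial x^*}{\partial I} + \lambda\frac{\partial y^*}{\partial I} = \lambda\left(p\frac{\partial x^*}{\partial I} + \frac{\partial y^*}{\partial I}\right).$$
Let us have a look at the term $p\frac{\partial x^*}{\partial I} + \frac{\partial y^*}{\partial I}$. It measures marginal expenditure as income increases. But since expenditure
and income are equal at optimal consumption (we don't leave money lying around), this term
must be 1 (this can also be seen by implicitly differentiating the budget constraint w.r.t.
income). So $\frac{\partial U^*(I)}{\partial I} = \lambda$.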
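The claim that the multiplier equals the marginal utility of income can be checked numerically. The sketch below assumes the Cobb-Douglas utility and budget constraint of this exercise; the parameter values (alpha = 0.3, p = 2, I = 10) and the function names are illustrative assumptions.

```python
def demand(alpha, p, income):
    """Optimal bundle for U = x^alpha * y^(1 - alpha) subject to p*x + y = income."""
    return alpha * income / p, (1 - alpha) * income

def indirect_utility(alpha, p, income):
    """Utility evaluated at the optimal bundle."""
    x, y = demand(alpha, p, income)
    return x**alpha * y**(1 - alpha)

def multiplier(alpha, p, income):
    """Lagrange multiplier from the first-order condition for y: beta * x^alpha * y^(beta - 1)."""
    x, y = demand(alpha, p, income)
    beta = 1 - alpha
    return beta * x**alpha * y**(beta - 1)

alpha, p, income, h = 0.3, 2.0, 10.0, 1e-6
# Central finite difference of the indirect utility with respect to income.
marginal_utility_of_income = (
    indirect_utility(alpha, p, income + h) - indirect_utility(alpha, p, income - h)
) / (2 * h)
print(marginal_utility_of_income, multiplier(alpha, p, income))
```

The two printed numbers should coincide up to the finite-difference error.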
Three additional exercises on the shadow price. See the lecture slides of this week for further
explanation of the shadow price. The three examples below show how to compute the
shadow price in specific cases.
Additional exercise 1 shadow price
Maximize $f(x, y) = x^2 + y^2$, s.t. $x + y = c$,
and calculate the shadow price.
$x^*(c), y^*(c)$:
$$f^*(c) = f(x^*(c), y^*(c)) = \left(\frac{c}{2}\right)^2 + \left(\frac{c}{2}\right)^2 = \frac{c^2}{2}$$
We can now take the derivative of this with respect to c. This shows us how much our optimal
value of f changes when we change the constraint c:
$$\frac{\partial f^*(c)}{\partial c} = \frac{\partial}{\partial c}\left(\frac{c^2}{2}\right) = c = \lambda$$
Hence, the interpretation of $\lambda$ is that it gives the marginal change of the objective
function, if the constraint is changed by one unit.
Additional exercise 2 shadow price
A slightly more complicated example to drive the point home:
Maximize $f(x, y) = x + y$, s.t. $x^2 + y = c$.
Construct the Lagrangian function:
$$L(x, y, \lambda) = x + y - \lambda(x^2 + y - c)$$
$$\frac{\partial L}{\partial x} = 1 - 2\lambda x = 0, \quad \frac{\partial L}{\partial y} = 1 - \lambda = 0 \Rightarrow \lambda = 1,\; x = \frac{1}{2},\; y = c - \frac{1}{4}$$
Now the value function becomes:
$$f^*(c) = f(x^*(c), y^*(c)) = \frac{1}{2} + c - \frac{1}{4} \quad \text{and} \quad \frac{\partial f^*(c)}{\partial c} = 1 = \lambda$$
Additional exercise 3 shadow price
In one variable it is a bit confusing, but it still works:
Maximize $f(x) = x^2$, s.t. $x = c$
Construction of the Lagrangian function:
$$L(x, \lambda) = x^2 - \lambda(x - c)$$
$$\frac{\partial L}{\partial x} = 2x - \lambda = 0 \Rightarrow \lambda = 2x, \quad x = c \Rightarrow \lambda = 2c$$
Our value function becomes:
$$f^*(c) = f(x^*(c)) = c^2 \quad \text{and} \quad \frac{\partial f^*(c)}{\partial c} = 2c = \lambda$$
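The three shadow prices can be verified by differentiating the value functions numerically. The sketch below assumes the optimal solutions derived in the three exercises above; the function names and the test value c = 3 are illustrative.

```python
def value_ex1(c):
    """Value function of exercise 1: x* = y* = c/2, f = x^2 + y^2."""
    x = y = c / 2
    return x**2 + y**2

def value_ex2(c):
    """Value function of exercise 2: x* = 1/2, y* = c - 1/4, f = x + y."""
    x, y = 0.5, c - 0.25
    return x + y

def value_ex3(c):
    """Value function of exercise 3: x* = c, f = x^2."""
    return c**2

def shadow_price(value, c, h=1e-6):
    """Central finite difference of a value function with respect to the constraint level c."""
    return (value(c + h) - value(c - h)) / (2 * h)

c = 3.0
print(shadow_price(value_ex1, c))  # compare with lambda = c
print(shadow_price(value_ex2, c))  # compare with lambda = 1
print(shadow_price(value_ex3, c))  # compare with lambda = 2c
```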
K.11.2.
Envelope theorem
K.11.2.
K.12.1.
K.12.1.
Simple rules
K.12.2.
Substitution
K.12.2.
Integration by parts
K.12.2.
Improper integrals
K.12.2.
=
Q*
(3 1)
= 1
2 * 0.5
$$Q^* = \frac{a - c}{2b}$$
$$\pi^*(Q^*, a, b, c) = \frac{(a - c)^2}{4b}$$
which becomes:
$$F(a, b, c) = \frac{(a - c)^2}{4b}$$
Thus the maximum value is a function of the parameters a, b, and c.
The interpretation of F is that it is an indirect objective function.
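$z^* = f(x^*(\alpha), y^*(\alpha); \alpha)$
Thus
$$\frac{\partial z^*}{\partial \alpha_i} = \frac{\partial F(\alpha)}{\partial \alpha_i} = \frac{\partial f(x^*(\alpha), y^*(\alpha); \alpha)}{\partial \alpha_i}$$
Reason:
$$\frac{\partial F(\alpha)}{\partial \alpha_i} = \frac{\partial f(x^*(\alpha), y^*(\alpha); \alpha)}{\partial x}\frac{\partial x^*}{\partial \alpha_i} + \frac{\partial f(x^*(\alpha), y^*(\alpha); \alpha)}{\partial y}\frac{\partial y^*}{\partial \alpha_i} + \frac{\partial f(x^*(\alpha), y^*(\alpha); \alpha)}{\partial \alpha_i}$$
$$= 0 \cdot \frac{\partial x^*}{\partial \alpha_i} + 0 \cdot \frac{\partial y^*}{\partial \alpha_i} + \frac{\partial f(x^*(\alpha), y^*(\alpha); \alpha)}{\partial \alpha_i} = \frac{\partial f(x^*(\alpha), y^*(\alpha); \alpha)}{\partial \alpha_i}$$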
Profit function: $\pi(K, L, p, r, w) = pf(K, L) - rK - wL$
Maximize profit with respect to K, L, keeping p, r, and w fixed.
Optimal K and L are: $K^* = K^*(p, r, w)$ and $L^* = L^*(p, r, w)$ so that
$\pi^*(K^*, L^*, p, r, w) = pf(K^*, L^*) - rK^* - wL^*$
Hence:
$\frac{\partial \pi^*}{\partial p} = f(K^*, L^*) = Q^* > 0$ and
$\frac{\partial \pi^*}{\partial r} = -K^* < 0$: how much profit is lost if the price of capital
increases by a small amount?
and $\frac{\partial \pi^*}{\partial w} = -L^* < 0$
Note that the optimization problem can also be written as (Page 322 of
Klein):
Profit function:
$\pi(K, L, Y, p, r, w, \lambda) = pY - rK - wL - \lambda(f(K, L) - Y)$
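$$\frac{\partial L}{\partial y} = \lambda$$
And
$$\frac{\partial L}{\partial p_i} = -\lambda x_i$$
Application of general envelope theorem:
Marginal utility of income is equal to $\lambda$:
$$\frac{\partial U^*(p_1, \ldots, p_n; y)}{\partial y} = \frac{\partial L(x_1, \ldots, x_n, \lambda, p_1, \ldots, p_n, y)}{\partial y} = \lambda$$
Marginal disutility of a price increase is the marginal utility of
income ($\lambda$) times the quantity demanded ($x_i^*$):
$$\frac{\partial U^*(p_1, \ldots, p_n; y)}{\partial p_i} = \frac{\partial L(x_1, \ldots, x_n, \lambda, p_1, \ldots, p_n, y)}{\partial p_i} = -\lambda x_i^*$$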
xn*
0
+ g i =
i
xn*
=
g i
i
x1*
xn*
= g xi
+ + g xi
+ f i
i
i
L
= 0 (slack condition)
and
$$\frac{\partial L(x_1, x_2, \lambda, c)}{\partial x_1} = f_{x_1}(x_1, x_2) - \lambda g_{x_1}(x_1, x_2) = 0$$
$$\frac{\partial L(x_1, x_2, \lambda, c)}{\partial x_2} = f_{x_2}(x_1, x_2) - \lambda g_{x_2}(x_1, x_2) = 0$$
Second, we introduce the complementary slackness conditions
$\lambda \geq 0$
$\lambda = 0$ if $g(x_1, x_2) < c$
Third, require $(x_1, x_2)$ to satisfy the constraint $g(x_1, x_2) \leq c$
Find all the points $(x_1, x_2)$ together with $\lambda$ that satisfy the three
conditions. Consider these candidates for optimality
The Lagrangian is
$$L(x_1, x_2, \lambda) = x_1^2 + x_2^2 + x_2 - 1 - \lambda(x_1^2 + x_2^2 - 1)$$
So that the first-order conditions are
$$\frac{\partial L(x_1, x_2, \lambda)}{\partial x_1} = 2x_1 - 2\lambda x_1 = 0$$
$$\frac{\partial L(x_1, x_2, \lambda)}{\partial x_2} = 2x_2 + 1 - 2\lambda x_2 = 0$$
Complementary slackness conditions are
$\lambda \geq 0$
$\lambda = 0$ if $x_1^2 + x_2^2 < 1$
1) First-order condition for $x_1$ gives $x_1 = 0$ or $\lambda = 1$
2) $\lambda = 1$ does not satisfy the first-order condition for $x_2$. Thus, $x_1 = 0$
3) Consider $x_1 = 0$ for $x_1^2 + x_2^2 = 1$:
a. If $x_2 = 1$ then $\lambda = 3/2$ (satisfies the slackness condition)
b. If $x_2 = -1$ then $\lambda = 1/2$ (satisfies the slackness condition)
c. If $x_1^2 + x_2^2 = 0 + x_2^2 < 1$ then $\lambda = 0$. According to the first-order
condition for $x_2$, if $\lambda = 0$ then $x_2 = -1/2$
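The three candidates can be checked mechanically against the conditions. The following is a minimal sketch for this specific example; `kt_candidate_ok` and `objective` are illustrative helper names, and the tolerance `tol` is an arbitrary choice.

```python
def objective(x1, x2):
    """Objective of the example: x1^2 + x2^2 + x2 - 1."""
    return x1**2 + x2**2 + x2 - 1

def kt_candidate_ok(x1, x2, lam, tol=1e-12):
    """Check first-order, feasibility and complementary slackness conditions
    for maximizing the objective subject to x1^2 + x2^2 <= 1."""
    foc_x1 = 2 * x1 - 2 * lam * x1       # dL/dx1
    foc_x2 = 2 * x2 + 1 - 2 * lam * x2   # dL/dx2
    g = x1**2 + x2**2                    # constraint function, must be <= 1
    stationary = abs(foc_x1) < tol and abs(foc_x2) < tol
    feasible = g <= 1 + tol
    # lambda >= 0, and lambda = 0 whenever the constraint is slack.
    slackness = lam >= 0 and (abs(g - 1) < tol or abs(lam) < tol)
    return stationary and feasible and slackness

candidates = [(0.0, 1.0, 1.5), (0.0, -1.0, 0.5), (0.0, -0.5, 0.0)]
for x1, x2, lam in candidates:
    print((x1, x2, lam), kt_candidate_ok(x1, x2, lam), objective(x1, x2))
```

Comparing the objective values of the candidates that pass the check then shows which one is the constrained maximum.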
L
= 0 (slack condition)
and
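$$\int_a^b f(x)dx$$
Note that:
1) $\int_a^c f(x)dx = \int_a^b f(x)dx + \int_b^c f(x)dx$
2) $\int_a^b \left(f(x) \pm g(x)\right)dx = \int_a^b f(x)dx \pm \int_a^b g(x)dx$
3) $\int_a^b f(x)dx = -\int_b^a f(x)dx$
Example:
$$\int_a^b 5x^2 dx = \left.\frac{5}{3}x^3\right|_{x=a}^{b} = \frac{5}{3}b^3 - \frac{5}{3}a^3 = \frac{5}{3}(b^3 - a^3)$$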
$$\int ax^n dx = \frac{a}{n + 1}x^{n+1} + C$$
Example:
$$\int_0^3 16x^{1/2}dx = \left.16 \cdot \frac{2}{3}x^{3/2}\right|_{x=0}^{3} = \frac{32}{3}\left(3\sqrt{3} - 0\right) = 32\sqrt{3}$$
3) Exponential function
$$\frac{de^{kx}}{dx} = ke^{kx}$$
Thus
$$\int ae^{kx}dx = \frac{a}{k}e^{kx} + C$$
4) Logarithmic function - I
$$\frac{d\ln(x)}{dx} = \frac{1}{x}$$
Thus
$$\int \frac{a}{x + k}dx = a\ln(x + k) + C$$
5) Logarithmic function - II
$$\frac{d\ln(f(x))}{dx} = \frac{f'(x)}{f(x)}$$
Thus
$$\int \frac{f'(x)}{f(x)}dx = \ln(f(x)) + C$$
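$$\int x^2\ln(x)dx = \int \ln(x)\,d\frac{x^3}{3} = \frac{x^3}{3}\ln(x) - \int \frac{x^3}{3}\,d\ln(x)$$
$$= \frac{x^3}{3}\ln(x) - \int \frac{x^3}{3}\frac{1}{x}dx$$
$$= \frac{x^3}{3}\ln(x) - \frac{1}{3}\int x^2 dx$$
$$= \frac{x^3}{3}\ln(x) - \frac{1}{3^2}x^3 + C$$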
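An integration-by-parts result can be sanity-checked against a numerical integral. The sketch below assumes the antiderivative $\frac{x^3}{3}\ln(x) - \frac{x^3}{9}$ of $x^2\ln(x)$ and uses the composite Simpson rule; the interval [1, 2] and the helper names are illustrative choices.

```python
import math

def integrand(x):
    return x**2 * math.log(x)

def antiderivative(x):
    """Candidate antiderivative obtained by integration by parts."""
    return x**3 / 3 * math.log(x) - x**3 / 9

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

numeric = simpson(integrand, 1.0, 2.0)
exact = antiderivative(2.0) - antiderivative(1.0)
print(numeric, exact)
```

Both numbers should agree to many decimal places if the antiderivative is correct.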
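$$\int (3x^2 + 1)6x\,dx = \int (3x^2 + 1)\,d(3x^2 + 1) = \frac{1}{2}(3x^2 + 1)^2 + C$$
Solution 2: with $u = 3x^2 + 1$, so that $du = 6x\,dx$:
$$\int (3x^2 + 1)6x\,dx = \int u\,du = \frac{1}{2}u^2 + C = \frac{1}{2}(3x^2 + 1)^2 + C$$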
Looking at the figures, we can expect to find four points of interest with the Kuhn-Tucker
method: the two points where the function becomes flat and the two points where one of the
restrictions becomes binding. Let's compute and see if it works out.
We construct the Lagrangian:
$$L(x) = x^3 - 3x^2 + x - \lambda_1(x - 6) - \lambda_2(-x - 10)$$
Note that we always construct the Lagrangian with constraints of the form $g(x) \leq c$, so we
had to rewrite $x \geq -10$ as $-x \leq 10$.
The first-order condition:
$$\frac{\partial L}{\partial x}(x) = 3x^2 - 6x + 1 - \lambda_1 + \lambda_2 = 0$$
Now there are two possibilities: either a constraint is binding, meaning it holds with equality,
or it is slack, meaning that it holds with strict inequality. If the constraint is slack, we want the
associated multiplier (lambda) to equal zero. For maximization, we want the multiplier to be
positive if the constraint is binding. Let's first try it with the multipliers equal to zero and see
if we end up with points in which the constraints are slack.
$$3x^2 - 6x + 1 = 0 \Rightarrow x = \frac{6 + \sqrt{24}}{6} \vee x = \frac{6 - \sqrt{24}}{6}$$
We get two critical points that both lie strictly between -10 and 6 (they are then called interior
points). So these are candidate optima and we have to check the second order conditions to
see if they are local minima or maxima or inflection points:
$\frac{\partial^2 f(x)}{\partial x^2} = 6x - 6$, which we evaluate at our critical points:
$$6 \cdot \frac{6 - \sqrt{24}}{6} - 6 = -\sqrt{24} < 0, \qquad 6 \cdot \frac{6 + \sqrt{24}}{6} - 6 = \sqrt{24} > 0$$
So the point $x = \frac{6 - \sqrt{24}}{6}$ is a maximum and the point $x = \frac{6 + \sqrt{24}}{6}$ a minimum.
So here the Kuhn-Tucker conditions do not hold and this is not a local maximum (it is in fact
a minimum).
We could now compare the two local maxima we found to see which is bigger and is the global
maximum, but we will not do so, as this is only cumbersome and the answer is obvious from
the graph: x=6 is the global maximum.
Now, where do these Kuhn-Tucker conditions come from?
Remember that the first order condition in general says:
$$\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} = 0$$
For a positive lambda (which the Kuhn-Tucker conditions demand) this means that the
derivative of the objective function and the derivative of the constraint must have the same
sign. Let's think about this. For a maximum at a constraint we want the function to be
increasing if the constraint is an upper bound; it's only a maximum if we could increase the function by
relaxing the constraint. But if the constraint is an upper bound, then the constraint function must also be increasing
(see the figure, where we zoomed in on our function at x=6 and scaled it down for clarity. The
figure also shows the constraint).
Clearly, though, this is a local constrained minimum. What Kuhn-Tucker, for all its
complexity, boils down to, is simply checking that the sign of the derivatives of the function
and the constraint(s) are equal for a maximum and opposite for a minimum.
* Exercise 2. Maximize $f(x, y) = x^{\alpha}y$, where $\alpha > 0$, subject to
The red area is the area where all the constraints hold. You could think of our function as a hill
landscape over the plane. When we optimize it subject to our constraints we restrict ourselves
to looking for peaks and valleys on the red area.
Now let's have a look at our objective function, $f(x, y) = x^{\alpha}y$. Because $x^{\alpha}$ is only defined for
negative x if $\alpha$ is an integer (a whole number), we restrict our attention to the case where x is
positive. Then f(x,y) is increasing in x and y as long as y is positive. So, because we are
maximizing, we can restrict our attention to the following area:
Because we want to maximize and f(x,y) is increasing in both our variables, we expect to end
up somewhere on the outer edge of the red area. But where? This will depend on $\alpha$. The
higher $\alpha$, the more x contributes to a higher outcome, so the more we want to move to a
L( x, y )
y =x
= x3 22 y 3
y 0.85
=
2.15,
x 2.15,
y 0.85
=
=
10 22 3=0
2
66
2 = > 0, 3 = > 0
7
7
So we do indeed find that the multipliers are positive, so the Kuhn-Tucker conditions hold, so
we have a local constrained maximum.
Exercise 3. Find the following integrals:
a) $\int \frac{x^3 - 2x^2 + 1}{x}dx$
Solution:
We can work out the fraction and split up the parts:
$$\int \frac{x^3 - 2x^2 + 1}{x}dx = \int \left(x^2 - 2x + \frac{1}{x}\right)dx = \int x^2 dx - 2\int x\,dx + \int \frac{1}{x}dx = \frac{1}{3}x^3 - x^2 + \log(x) + C$$
We check by taking the derivative:
$$\frac{d}{dx}\left(\frac{1}{3}x^3 - x^2 + \log(x) + C\right) = x^2 - 2x + \frac{1}{x}$$
This, as we saw, is what we started out with under the integral sign.
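b) $\int_2^5 \frac{x^3 - 2x^2 + 1}{x}\,dx$
Solution:
We already found the indefinite integral under a), so now we just use the fundamental theorem
of calculus to plug in:
$$\int_2^5 \frac{x^3 - 2x^2 + 1}{x}\,dx = \left[\frac{1}{3}x^3 - x^2 + \log(x)\right]_{2}^{5} = \left(\frac{125}{3} - 25 + \log(5)\right) - \left(\frac{8}{3} - 4 + \log(2)\right) = 18 + \log\left(\frac{5}{2}\right)$$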
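A quick numerical cross-check of the definite integral in b) can be sketched as follows. The helper `simpson` and the number of subintervals are illustrative choices, not part of the original exercise; the closed form $18 + \log(5/2)$ is the bracket $(125/3 - 25 + \log 5) - (8/3 - 4 + \log 2)$ simplified.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b]; n is rounded up to an even number."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # interior points alternate between weight 4 (odd i) and weight 2 (even i)
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    return (x**3 - 2 * x**2 + 1) / x

numeric = simpson(integrand, 2.0, 5.0)
closed_form = 18 + math.log(5 / 2)
print(numeric, closed_form)
```

The two printed numbers should agree to many decimal places, which is a cheap way to catch sign or arithmetic slips when plugging in limits.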
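c) $\int \frac{-3x^2 + 4x - 1}{2x^3 - 4x^2 + 2x - 6}\,dx$
Solution:
This is a lucky integral. We just so happen to be able to apply a substitution:
$y = g(x) = 2x^3 - 4x^2 + 2x - 6$, then we have $dy = g'(x)\,dx = (6x^2 - 8x + 2)\,dx$. If we
manipulate our numerator a little bit, it becomes exactly this:
$$\int \frac{-3x^2 + 4x - 1}{2x^3 - 4x^2 + 2x - 6}\,dx = -\frac{1}{2}\int \frac{6x^2 - 8x + 2}{2x^3 - 4x^2 + 2x - 6}\,dx = -\frac{1}{2}\int \frac{1}{y}\,dy$$
$$= -\frac{1}{2}\log(y) + C = -\frac{1}{2}\log(2x^3 - 4x^2 + 2x - 6) + C$$
We check:
$$\frac{d}{dx}\left(-\frac{1}{2}\log(2x^3 - 4x^2 + 2x - 6) + C\right) = -\frac{1}{2}\,\frac{6x^2 - 8x + 2}{2x^3 - 4x^2 + 2x - 6} = \frac{-3x^2 + 4x - 1}{2x^3 - 4x^2 + 2x - 6}$$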
This is exactly what we started out with. Notice how lucky we were. If we change the
denominator even slightly, say changing the -1 to -2, there is no easy way to find the solution
anymore.
d) $\int \sqrt{1 + 2x}\,dx$
Solution:
Here it makes sense to try a slightly complex substitution $y^2 \,(= h(y)) = g(x) = 1 + 2x$. The rule
for the differentials then becomes: $h'(y)\,dy = g'(x)\,dx \Rightarrow 2y\,dy = 2\,dx \Rightarrow y\,dy = dx$. Let's plug
this in:
$$\int \sqrt{1 + 2x}\,dx = \int y \cdot y\,dy = \int y^2\,dy = \frac{1}{3}y^3 + C$$
Now the last step is plugging back in our x. For this we have to rewrite our substitution rule a
little: $y^2 = 1 + 2x \Rightarrow y = \sqrt{1 + 2x}$. This we can plug in:
$$\int \sqrt{1 + 2x}\,dx = \frac{1}{3}y^3 + C = \frac{1}{3}(1 + 2x)^{3/2} + C$$
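e) $\int \left(\frac{3}{\sqrt{x - 12}} - 2x^2\right)dx$
Solution:
Here we see a square root to which we would like to apply a substitution, but there is the other
term there. But this is no problem; we can just split it off:
$$\int \left(\frac{3}{\sqrt{x - 12}} - 2x^2\right)dx = \int \frac{3}{\sqrt{x - 12}}\,dx - \int 2x^2\,dx = \int \frac{3}{\sqrt{x - 12}}\,dx - \frac{2}{3}x^3 + C$$
Now we can apply a similar substitution as last time to the remaining integral.
$$y^2 = x - 12 \Rightarrow 2y\,dy = dx, \quad y = \sqrt{x - 12}$$
$$\int \frac{3}{\sqrt{x - 12}}\,dx = \int \frac{3}{y}\,2y\,dy = \int 6\,dy = 6y + C = 6\sqrt{x - 12} + C$$
Now we can add the two solutions (notice that we just add the two integration constants into
one new constant. This doesn't matter, since the constants could be anything anyway. Strictly
speaking, we should give all these constants new names, but that would be very cumbersome.)
$$\int \left(\frac{3}{\sqrt{x - 12}} - 2x^2\right)dx = \int \frac{3}{\sqrt{x - 12}}\,dx - \frac{2}{3}x^3 + C = 6\sqrt{x - 12} - \frac{2}{3}x^3 + C$$
We check:
$$\frac{d}{dx}\left(6\sqrt{x - 12} - \frac{2}{3}x^3\right) = 6 \cdot \frac{1}{2}(x - 12)^{-1/2} - 2x^2 = \frac{3}{\sqrt{x - 12}} - 2x^2$$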
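f) $\int \log(3x - 7)\,dx$
Solution:
We try our luck with another substitution:
$$y = 3x - 7 \Rightarrow \frac{1}{3}\,dy = dx$$
$$\int \log(3x - 7)\,dx = \frac{1}{3}\int \log(y)\,dy$$
We now have to find an integral for the logarithmic function. We can do this by a tricky
application of integration by parts. Remember that integration by parts is the following:
$$\int f(x)\,g'(x)\,dx = f(x)\,g(x) - \int f'(x)\,g(x)\,dx$$
With $f(y) = \log(y)$ and $g'(y) = 1$ we get:
$$\frac{1}{3}\int \log(y)\,dy = \frac{1}{3}\left(y\log(y) - \int y\,\frac{1}{y}\,dy\right) = \frac{1}{3}\left(y\log(y) - y\right) + C$$
Now we substitute back:
$$\int \log(3x - 7)\,dx = \frac{1}{3}\left(y\log(y) - y\right) + C = \frac{1}{3}\left((3x - 7)\log(3x - 7) - 3x + 7\right) + C$$
We check:
$$\frac{d}{dx}\,\frac{1}{3}\left((3x - 7)\log(3x - 7) - 3x + 7\right) = \frac{1}{3}\left(3\log(3x - 7) + 3 - 3\right) = \log(3x - 7)$$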
g) $\int x^2 e^x\,dx$
Solution:
Here we apply partial integration twice. It works because taking the integral of the
exponential is so easy. We take $f(x) = x^2$, $g'(x) = e^x$:
$$\int x^2 e^x\,dx = x^2 e^x - \int 2x e^x\,dx$$
$$x^2 e^x - \int 2x e^x\,dx = x^2 e^x - \left[2x e^x - \int 2e^x\,dx\right] = (x^2 - 2x + 2)e^x + C$$
We check:
$$\frac{d}{dx}(x^2 - 2x + 2)e^x = (x^2 - 2x + 2)e^x + (2x - 2)e^x = x^2 e^x$$
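The "We check" step used throughout these exercises (differentiate the answer and compare with the integrand) can also be done numerically. The sketch below does this for exercise g), where the antiderivative is $(x^2 - 2x + 2)e^x$; the helper name `central_diff`, the step size and the sample points are assumptions made for illustration.

```python
import math

def F(x):
    """Candidate antiderivative of x^2 * e^x from exercise g)."""
    return (x**2 - 2 * x + 2) * math.exp(x)

def f(x):
    """The integrand x^2 * e^x."""
    return x**2 * math.exp(x)

def central_diff(func, x, h=1e-5):
    """Symmetric finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

points = [-2.0, -0.5, 0.0, 1.0, 2.5]
max_error = max(abs(central_diff(F, x) - f(x)) for x in points)
print(max_error)  # should be tiny if F really is an antiderivative of f
```

Swapping in another candidate pair `F`, `f` gives the same kind of sanity check for the other integrals in this exercise.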
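* h) $\int (x^2 + 1)e^{3x + 2}\,dx$
Solution:
This works exactly the same as above:
$$\int (x^2 + 1)e^{3x+2}\,dx = \frac{1}{3}(x^2 + 1)e^{3x+2} - \frac{1}{3}\int 2x\,e^{3x+2}\,dx = \frac{1}{3}(x^2 + 1)e^{3x+2} - \frac{1}{3}\left[\frac{1}{3}\,2x\,e^{3x+2} - \frac{1}{3}\int 2e^{3x+2}\,dx\right]$$
$$= \frac{1}{3}(x^2 + 1)e^{3x+2} - \frac{2}{9}x\,e^{3x+2} + \frac{2}{27}e^{3x+2} + C = \left(\frac{1}{3}(x^2 + 1) - \frac{2}{9}x + \frac{2}{27}\right)e^{3x+2} + C$$
We check:
$$\frac{d}{dx}\left(\frac{1}{3}(x^2 + 1) - \frac{2}{9}x + \frac{2}{27}\right)e^{3x+2} = 3\left(\frac{1}{3}(x^2 + 1) - \frac{2}{9}x + \frac{2}{27}\right)e^{3x+2} + \left(\frac{2}{3}x - \frac{2}{9}\right)e^{3x+2} = (x^2 + 1)e^{3x+2}$$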
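* i) $\int x^2 \log^2(x)\,dx$
Solution:
This also works by repeated integration by parts. We
take $f(x) = \log^2(x)$, $g'(x) = x^2$:
$$\int x^2\log^2(x)\,dx = \frac{1}{3}x^3\log^2(x) - \int \frac{1}{3}x^3\,\frac{2\log(x)}{x}\,dx = \frac{1}{3}x^3\log^2(x) - \frac{2}{3}\int x^2\log(x)\,dx$$
$$= \frac{1}{3}x^3\log^2(x) - \frac{2}{3}\left[\frac{1}{3}x^3\log(x) - \int \frac{1}{3}x^3\,\frac{1}{x}\,dx\right] = x^3\left(\frac{1}{3}\log^2(x) - \frac{2}{9}\log(x) + \frac{2}{27}\right) + C$$
We check:
$$\frac{d}{dx}\left(x^3\left(\frac{1}{3}\log^2(x) - \frac{2}{9}\log(x) + \frac{2}{27}\right)\right) = 3x^2\left(\frac{1}{3}\log^2(x) - \frac{2}{9}\log(x) + \frac{2}{27}\right) + x^3\left(\frac{2}{3}\,\frac{\log(x)}{x} - \frac{2}{9x}\right) = x^2\log^2(x)$$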
x
dx
+
x
1
2
2
* j)
Solution:
This is an example where we apply our familiar substitution, but we also have to take into
account the limits of integration. We take the substitution: y 2 = x + 2 2 ydy = dx .
For the limits we get: x = 2 y = 2, x = 7 y = 3 , so:
7
x
y2 2
2 y3 4 y
=
=
dx
ydy
2
2 1 2 + x 2 1 y
2 1 y dy
We can now apply long division of this polynomial (if you dont know what this is, you can
just believe me). Working it out, we get:
3
3
2 y3 4 y
2
2
=
dy
2 1 y
2 (2 y 2 y + 2 + y 1)dy =
3
2 y dy 2 ydy + 2 dy + 2
2
1
dy =
y 1
3
1
47
1
2 y 3 y 2 + y + log( y =
1) 2 log(2)
2
3
3
2
k) $\int x^2 (x - 3)^{12}\,dx$
Solution:
This is just a nice trick. We could manually compute this beast by just expanding the
expression raised to the power 12. That would be a lot of work. By applying the substitution
$y = x - 3$, $dy = dx$, we get:
$$\int x^2 (x - 3)^{12}\,dx = \int (y + 3)^2 y^{12}\,dy = \int \left(y^{14} + 6y^{13} + 9y^{12}\right)dy$$
$$= \frac{1}{15}y^{15} + \frac{6}{14}y^{14} + \frac{9}{13}y^{13} + C = \frac{1}{15}(x - 3)^{15} + \frac{6}{14}(x - 3)^{14} + \frac{9}{13}(x - 3)^{13} + C$$
Substituting $y = x + 1$, $dy = dx$:
$$\int 6x(x + 1)^4\,dx = \int 6(y - 1)y^4\,dy = \int (6y^5 - 6y^4)\,dy = y^6 - \frac{6}{5}y^5 + C = (x + 1)^6 - \frac{6}{5}(x + 1)^5 + C$$
xe
cx 2
dx
Solution 1:
xe
cx 2
2
1 cx2 2
1
dx =
e dx =
e cx d ( x 2 )
2
2
2
2
1
1
et
e cx
=
e cx d ( cx 2 ) =
et dt =
+C =
+C
2c
2c
2c
2c
Alternative solution 2:
dt
= 2cx or dt = 2cxdx
dx
2
et
e cx
1 t
cx 2
xe dx = 2c e dt = 2c + C = 2c + C
Substitution: t = cx 2 and
xe x dx
Solution:
xe x dx
Lets take
t = x 2 , dt = 2 xdx , For x = 0, we have t = 1. For x = 2, we have t = -4
1 4 t
1 4 0
1 4
=
(e e=
)
(e 1) =
e dx
2 0
2
2
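p) Apply the substitution method to solve
$$\int_1^e \frac{1 + \ln x}{x}\,dx$$
Solution:
$$\int_1^e \frac{1 + \ln x}{x}\,dx = \int_{t=1}^{2} t\,dt = \left[\frac{1}{2}t^2\right]_{t=1}^{2} = \frac{1}{2}(4 - 1) = \frac{3}{2}$$
Alternative solution:
Take $t = 1 + \ln(x)$, $dt = \frac{1}{x}\,dx$. Limits of integration: $t = 1$ (for $x = 1$; $t = 1 + \ln(1) = 1 + 0 = 1$) and
$t = 2$ (for $x = e$; $t = 1 + \ln(e) = 1 + 1 = 2$)
$$\int_1^e \frac{1 + \ln x}{x}\,dx = \int_1^2 t\,dt = \left[\frac{1}{2}t^2\right]_{t=1}^{2} = \frac{1}{2}(4 - 1) = \frac{3}{2}$$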
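$$\int_0^{10} (1 + 0.4t)\,e^{-0.05t}\,dt$$
Solution:
$$\int_0^{10} (1 + 0.4t)\,e^{-0.05t}\,dt = -\frac{1}{0.05}\int_0^{10} (1 + 0.4t)\,d e^{-0.05t} = -\frac{1}{0.05}\left[(1 + 0.4t)\,e^{-0.05t}\right]_{t=0}^{10} + \frac{1}{0.05}\int_0^{10} e^{-0.05t}\,d(1 + 0.4t)$$
$$= -100e^{-0.5} + 20 + 20 \cdot 0.4\int_0^{10} e^{-0.05t}\,dt = -100e^{-0.5} + 20 + 8\left[-\frac{1}{0.05}e^{-0.05t}\right]_{t=0}^{10}$$
$$= -100e^{-0.5} + 20 - 160\left(e^{-0.5} - 1\right) \approx 22.3$$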
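The value of roughly 22.3 can be cross-checked numerically. The sketch below compares a Simpson's-rule approximation of $\int_0^{10}(1 + 0.4t)e^{-0.05t}\,dt$ with the closed form $-100e^{-0.5} + 20 - 160(e^{-0.5} - 1)$ obtained by integration by parts; the helper `simpson` and its grid size are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b]; n is rounded up to an even number."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(t):
    return (1 + 0.4 * t) * math.exp(-0.05 * t)

numeric = simpson(integrand, 0.0, 10.0)
closed_form = -100 * math.exp(-0.5) + 20 - 160 * (math.exp(-0.5) - 1)
print(round(numeric, 4), round(closed_form, 4))
```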
t) Apply both the method of integration by parts and the substitution method to solve
$$\int 2x\ln(x^2 + b^2)\,dx$$
Solution:
$$\int 2x\ln(x^2 + b^2)\,dx = \int \ln(x^2 + b^2)\,d(x^2 + b^2) = (x^2 + b^2)\ln(x^2 + b^2) - \int (x^2 + b^2)\,d\ln(x^2 + b^2)$$
With $t = x^2 + b^2$:
$$= (x^2 + b^2)\ln(x^2 + b^2) - \int t\,d\ln(t) = (x^2 + b^2)\ln(x^2 + b^2) - \int \frac{t}{t}\,dt = (x^2 + b^2)\ln(x^2 + b^2) - t + C$$
$$= (x^2 + b^2)\ln(x^2 + b^2) - (x^2 + b^2) + C$$
L1 1
=
1 = 0
x1 x1
L1 1
1
= =0 =
y1
y1 y1
x1
1 x1 1
1
1=
=
y1 =
x1
x1
y1
1 x1
x1 + y1 = x1 +
x1
x (1 x1 ) + x1 (2 x1 ) x1
= 1
=
= 10
1 x1
1 x1
1 x1
230
Note that we get two completely symmetric sets of two equations: the first two and the last
two. We can just solve one of these; the other will be exactly the same.
Furthermore, the steps of the solution are almost the same as in the individual case, only a 1 is
changed to a 2 (because of the internalization of the effects of consumption). We race through
it:
LD 1
=
2 = 0
x1 x1
LD 1
1
= =0 =
y1 y1
y1
x1
1 2 x1 1
1
2=
=
y1 =
x1
x1
y1
1 2 x1
x1 + y1 = x1 +
x1
x (1 2 x1 ) + x1 (2 2 x1 ) x1
= 1
=
= 10
1 2 x1
1 2 x1
1 2 x1
=
x1
231
f *
by direct computation and by the
Solution:
We check the first-order condition:
df
=
2 x* + =
0 x* =
dx x = x*
2
f * ( ) = x*2 + x* + 1 =
We now compute
2
4
f *
directly:
2
2
+1 =
+1
f *
2
=
=
+1
4
2
Instead we could have used the envelope theorem, which says:
f * f
=
=
[ x 2 + x + 1]x = x* = x x = x* =
x = x*
2
The two methods give the same outcome. In this case, they are equally easy, but in general,
the envelope theorem saves you a lot of work, as we will see in the next exercise.
*Exercise 3.
Maximize f ( x, y ) =
x 2 y 2 + yx x . Find
f *
f *
and
by direct computation and by the
envelope theorem.
Solution:
From the first-order conditions we obtain:
f ( x, y )
0
=
2 x + y =
x
f ( x, y )
x
=
2 y + x =
0 y =
y
2
2 x + y =
2 x = x = 2
,y= 2
4
4
2
4 2
2 2 2
2 2
2 2
f * ( , ) =
2
=
( 4) 2 ( 2 4) ( 2 4) 2 2 4
( 2 4) 2
2 2
2
( 2 4) 2
2 4
2 4
We can now calculate
f *
directly:
f *
2
2
= =
2
4 2 4
232
=
= ( x 2 y 2 + yx x)
=x x*=
, y y*
=
x
, y y*
x*=
2
=
x* = 2
4
The results are the same. Please notice that for the envelope theorem, we did not have to
calculate f * at all. That saves a lot of time.
f *
Lets do
:
f *
2
2 2
= =
2 4 ( 2 4) 2
With the envelope theorem (dropping the cumbersome conditions x = x* etc. for ease of
writing):
f * f
2 2
=
=
( x 2 y 2 + yx x) = y* x* =
( 2 4) 2
Hurrah, they are the same!
Exercise 4.
10 . Find
Maximize f ( x) = x 2 subject to x + =
envelope theorem.
f *
by direct computation and by the
Solution:
Clearly this problem is a little silly if we view it as a maximization problem: the constraint
f *
x 10 . Still, for the computation of
already fixes x: =
it will be worthwhile to write
We could also have gotten this by the envelope theorem, which for a constrained optimization
problem is as follows:
f * L
=
= x*2 = (10 ) 2 2 x* = (10 ) 2 2 (10 )
x = x*
The outcome is once more the same.
233
I . Find
Maximize f ( x, y ) = x y1 a subject to x + py =
by the envelope theorem.
f *
both by direct computation and
p
Solution:
This optimization of a Cobb-Douglas function should look familiar by now:
L( x, y=
) x y1 a ( x + py I )
L
a
= x 1 y1=
0
x
L
=(1 ) x y a p =0
y
=
( x* ) 1 ( y* )1 a
We divide the constraints to get:
y 1
py
=
x=
1 x p
1
py
py
(1 ) I
+ py = I =
y=
, x = I
p
1
1
For the direct computation we have to plug this into f(x,y), to get something fiercely ugly:
(1 ) I 1
1 1
=
=
(
f * ( I ) (
)
)=
I (1 )1 p 1 I
p
p
This yields:
f *
=
p 2 (1 ) 2
( 1) p 2 (1 )1 I =
p
Lets see what the envelope theorem gives us.
f * L
=
=
y * =
( x* ) 1 ( y* )1 a y* =
( x* ) 1 ( y* ) 2 a =
p p=x x*=
, y y*
(1 ) I 2 a
1 2
(
) =
) I
p
p
Once more, they are the same.
( I ) 1 (
* Exercise 6.
A firm has the profit function ( p, A) = ( p c) Q( p, A) A , where p is price, Q demand, A
the amount of advertising, c the constant unit cost of production and the cost of advertising.
Find the effect of on pricing and on profits.
Solution:
This is now easy. We just apply the envelope theorem:
*
=
= A*
=p p=
*
, A A*
So a marginal increase in advertising costs decreases your profits by the (optimal) initial
amount of advertising. At the margin, we don't have to take into account the fact that you will
also change your amount of advertising.
$$\frac{dL}{dx} = \frac{df}{dx} = -2x + 7 = 0 \Rightarrow x = \frac{7}{2}$$
We see that x satisfies our constraints, so we just check the second-order conditions:
$\frac{d^2 f}{dx^2} = -2 < 0$, so this is a maximum.
Possibility 2, first constraint binding, x=4:
$$L(x) = -x^2 + 7x + 2 - \lambda_1 (x - 4)$$
$$\frac{dL}{dx} = -2x + 7 - \lambda_1 = -2 \cdot 4 + 7 - \lambda_1 = 0$$
$$\lambda_1 = -1 < 0$$
So this is not a maximum, but a minimum.
Possibility 3, second constraint binding, x=-2:
$$L(x) = -x^2 + 7x + 2 - \lambda_2 (-x - 2)$$
$$\frac{dL}{dx} = -2x + 7 + \lambda_2 = -2 \cdot (-2) + 7 + \lambda_2 = 0$$
$$\lambda_2 = -11 < 0$$
This is also a minimum. So the maximum is the unconstrained maximum: $x = \frac{7}{2}$.
Differential calculus
Solution concept (solution is a function)
K.14.1.
K.14.1.
Phase diagram
K.14.1.
Stability
K.14.1.
238
$$\int_0^{16} \frac{1}{\sqrt{x}}\,dx = \left[2\sqrt{x}\right]_{x=0}^{16} = 8 - \lim_{x \to 0^+} 2\sqrt{x} = 8 - 0 = 8$$
Example 2:
$$f(x) = \frac{1}{\sqrt{x}}$$
We can calculate an improper integral, in which the lower limit
approaches negative infinity or the upper limit approaches positive
infinity. The following integral has no finite limit. It diverges.
$$\int_0^{\infty} \frac{1}{\sqrt{x}}\,dx = \left[2\sqrt{x}\right]_{x=0}^{\infty} = \lim_{x \to \infty} 2\sqrt{x} - \lim_{x \to 0^+} 2\sqrt{x} = \infty - 0 = \infty$$
16
16
2
2
2
dx ==
lim+
=
0
x x
x x =0
x
16
Example 4:
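$$f(x) = \frac{1}{x\sqrt{x}}$$
The integral does not exist:
$$\int_0^{\infty} \frac{1}{x\sqrt{x}}\,dx = \left[-\frac{2}{\sqrt{x}}\right]_{x=0}^{\infty} = \lim_{x \to \infty}\left(-\frac{2}{\sqrt{x}}\right) - \lim_{x \to 0^+}\left(-\frac{2}{\sqrt{x}}\right) = 0 + \infty = \infty$$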
Example 5:
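$$f(x) = \frac{1}{x\sqrt{x}}$$
The integral does exist:
$$\int_1^{\infty} \frac{1}{x\sqrt{x}}\,dx = \left[-\frac{2}{\sqrt{x}}\right]_{x=1}^{\infty} = \lim_{x \to \infty}\left(-\frac{2}{\sqrt{x}}\right) + 2 = 2$$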
$$\int_0^{\infty} e^{-t}\,dt = \left[-e^{-t}\right]_{t=0}^{\infty} = \lim_{t \to \infty}\left(-e^{-t}\right) - (-1) = 0 + 1 = 1$$
f ( x) = 2e 2 x
(expected value is two days) what is the probability of having a
duration longer than one day?
2e 2t dt =e 2t
t =1
=lim e t (e 6 ) =0 + e 1 0.368
t
241
dx
(
;
)
=
a f ( x; )dx
a
Example 8:
Take the integral, which depends on the unknown
4
4 1
1 2
2
3
1 ( 3 t t )dt = t 2 t t =1 = (64 8 ) + (1 0.5 ) = 65 8.5
4 2
(3
t
t
)
dt
(65 8.5 ) =
=
8.5
1
4
4
4
1 2
1 2
1 ( 3 t t )dt =1 tdt = 2 t t =1 =8 0.5 =8.5
4
4 2
1 2
(3
t
t
)
=
dt
(
Thus,
1 3 t t )dt
1
242
$$F(x) = \int_1^x t^2\,dt = \frac{1}{3}x^3 - \frac{1}{3}$$
$$F'(x) = x^2$$
Example 10:
A statistical probability density function is $f(x)$. The cumulative
function is $F(x) = \int_{-\infty}^{x} f(t)\,dt$.
$$\int_{-\infty}^{\infty} f(t)\,dt = 1$$
$$F(-\infty) = 0$$
$$F(\infty) = 1$$
We can derive the density function from the cumulative function:
$$F'(x) = f(x)$$
0 1
( x 2 + xy )dydx =
1
( x 2 + xy )dy dx = x 2 y + xy 2
0
2
dx =
y =1
1
1
5
1
5
= (2 x 2 + 2 x) ( x 2 x) dx = ( x 2 + x)dx = x 3 + x 2
0
0
2
2
3
4
x =0
1 5 19
= + =
3 4 12
Theorem
Let f be a continuous function defined on the rectangle
$R = [a, b] \times [c, d]$
Then
$$\int_a^b \left(\int_c^d f(x, y)\,dy\right)dx = \int_c^d \left(\int_a^b f(x, y)\,dx\right)dy$$
Explanation:
On the left-hand side: we first integrate over [c, d ] with respect to
y; next we integrate over [a, b] with respect to x.
On the right-hand side: we first integrate over [a, b] with respect to
x; next we integrate over [c, d ] with respect to y.
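The interchange of the order of integration can be illustrated numerically. The sketch below approximates both iterated integrals with a midpoint rule; the integrand $g(x, y) = x y^2$ and the rectangle $[0, 2] \times [1, 3]$ are hypothetical choices made only for this illustration (the exact value is $2 \cdot 26/3 = 52/3$).

```python
def iterated_midpoint(f, a, b, c, d, n=200, y_first=True):
    """Iterated midpoint rule on [a, b] x [c, d].

    y_first=True integrates over y (inner) and then x (outer);
    y_first=False integrates over x (inner) and then y (outer).
    """
    hx = (b - a) / n
    hy = (d - c) / n
    xs = [a + (i + 0.5) * hx for i in range(n)]
    ys = [c + (j + 0.5) * hy for j in range(n)]
    if y_first:
        return sum(sum(f(x, y) for y in ys) * hy for x in xs) * hx
    return sum(sum(f(x, y) for x in xs) * hx for y in ys) * hy

def g(x, y):
    return x * y**2   # hypothetical continuous integrand

left = iterated_midpoint(g, 0.0, 2.0, 1.0, 3.0, y_first=True)
right = iterated_midpoint(g, 0.0, 2.0, 1.0, 3.0, y_first=False)
print(left, right)
```

Both orders give the same number up to rounding, and both are close to 52/3.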
Example 12:
2
(x
1
+ xy )dxdy =
1 3 1 2 1
0 ( x + xy)dx dy =1 3 x + 2 x y x=0 dy =
1 1
1
1
( + y )dy = y + y 2
3 2
3
4
2 4
1 1 19
= ( + )( + ) =
3 4
3 4 12
y =1
244
$$\frac{dx(t)}{dt} = aCe^{at} = ax(t)$$
Definite solution: $x(0) = Ce^{a \cdot 0} = Ce^{0} = C$
The solution is stable if a < 0:
$$\lim_{t \to \infty} x(t) = x(0)\lim_{t \to \infty} e^{at} = 0$$
$$\frac{dx(t)}{dt} + ax(t) = b, \quad a \neq 0$$
or
$$\dot{x}(t) + ax(t) = b$$
We can multiply this equation by $e^{at}$:
$$\frac{dx(t)}{dt}e^{at} + ax(t)e^{at} = be^{at}$$
Which can be rewritten as:
$$\frac{d}{dt}\left(x(t)e^{at}\right) = be^{at}$$
Thus:
$$x(t)e^{at} = \int be^{at}\,dt = \frac{b}{a}e^{at} + C$$
Thus:
$$x(t) = \frac{b}{a} + Ce^{-at}$$
$x(t)$ converges to $\frac{b}{a}$ as $t \to \infty$ (provided $a > 0$).
Example 15
Find the general solution of
$$\frac{dx(t)}{dt} + 4x(t) = 12$$
Solution: $x(t) = 3 + Ce^{-4t}$
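Example 15 can be checked by integrating the equation numerically and comparing with $x(t) = 3 + Ce^{-4t}$. The sketch uses a plain forward Euler scheme; the initial value `x0 = 10`, the horizon and the step count are hypothetical choices for illustration.

```python
import math

def euler(rhs, x0, t_end, steps):
    """Forward Euler integration of dx/dt = rhs(t, x) from t = 0 to t = t_end."""
    x, t = x0, 0.0
    h = t_end / steps
    for _ in range(steps):
        x += h * rhs(t, x)
        t += h
    return x

def rhs(t, x):
    # dx/dt + 4x = 12 rewritten as dx/dt = 12 - 4x
    return 12 - 4 * x

x0 = 10.0            # hypothetical initial value
C = x0 - 3           # constant fixed by x(0) = 3 + C
t_end = 2.0
numeric = euler(rhs, x0, t_end, steps=200_000)
exact = 3 + C * math.exp(-4 * t_end)
print(numeric, exact)
```

Both numbers are already very close to the steady state b/a = 3, which is the convergence described above for a positive a.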
246
$$\frac{dx(t)}{dt} + ax(t) = b(t), \quad a \neq 0$$
or
$$\dot{x}(t) + ax(t) = b(t)$$
We can multiply this equation by $e^{at}$:
$$\frac{dx(t)}{dt}e^{at} + ax(t)e^{at} = b(t)e^{at}$$
Which can be rewritten as:
$$\frac{d}{dt}\left(x(t)e^{at}\right) = b(t)e^{at}$$
Thus:
$$x(t)e^{at} = \int b(t)e^{at}\,dt + C$$
Thus:
$$x(t) = e^{-at}\int b(t)e^{at}\,dt + Ce^{-at}$$
247
248
x=
1
t2 + C
249
1 7
1
x + x= t 4 + C
7
4
250
(C is a constant)
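a) $\int_{-3}^{\infty} \frac{1}{\sqrt{t + 3}}\,dt$
Solution:
$$\int_{-3}^{\infty} \frac{1}{\sqrt{t + 3}}\,dt = \int_{-3}^{\infty} \frac{1}{\sqrt{t + 3}}\,d(t + 3)$$
We apply the substitution method, by substituting t+3=x.
For t = -3 (lower bound of integral), x = 0. The upper bound of the integral does not change.
$$\int_{-3}^{\infty} \frac{1}{\sqrt{t + 3}}\,d(t + 3) = \int_0^{\infty} \frac{1}{\sqrt{x}}\,dx = \left[2\sqrt{x}\right]_{x=0}^{\infty} = \lim_{x \to \infty} 2\sqrt{x} - \lim_{x \to 0^+} 2\sqrt{x} = \infty$$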
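b) $\int_{1/3}^{\infty} e^{-3t}\,dt$
Solution:
$$\int_{1/3}^{\infty} e^{-3t}\,dt = \left[-\frac{1}{3}e^{-3t}\right]_{t=1/3}^{\infty} = \lim_{t \to \infty}\left(-\frac{1}{3}e^{-3t}\right) + \frac{1}{3}e^{-3(1/3)} = \frac{1}{3e}$$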
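b, additional bonus exercise)
$$\int_0^{\infty} e^{-3t}\,dt = \left[-\frac{1}{3}e^{-3t}\right]_{t=0}^{\infty} = \lim_{t \to \infty}\left(-\frac{1}{3}e^{-3t}\right) + \frac{1}{3}e^{0} = 0 + \frac{1}{3} = \frac{1}{3}$$
c) $\int_1^{\infty} x e^{-x^2}\,dx$
Solution:
$$\int x e^{-x^2}\,dx = \frac{1}{2}\int e^{-x^2}\,d(x^2)$$
We apply the substitution method, by substituting $x^2 = t$.
For x=1 (lower bound of integral), t = 1. The upper bound of the integral does not change.
$$\int_1^{\infty} x e^{-x^2}\,dx = \frac{1}{2}\int_1^{\infty} e^{-x^2}\,d(x^2) = \frac{1}{2}\int_1^{\infty} e^{-t}\,dt = \left[-\frac{1}{2}e^{-t}\right]_{t=1}^{\infty} = \lim_{t \to \infty}\left(-\frac{1}{2}e^{-t}\right) + \frac{1}{2}e^{-1} = 0 + \frac{1}{2e} = \frac{1}{2e}$$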
d)
y
0 e ( xy + x )dxdy =
1
y
3
0 e ( xy + x )dx dy =
1
( xy + x )dydx }
3
1 2
0 2 x y + y ln( x) x=e dy =
1
1
9
= y e 2 y + y ln(3) y dy =
0 2
2
x =e
1
1 2
7 1
1
9 2 1 2 2 1 2
=
= e 2 + ln(3)
y e y + y ln(3) y
4
2
2 y =0 4 4
2
4
e)
( xy + x )dydx
Solution:
y
0 ( xy + x )dydx =
1
y
1
e 0 ( xy + x )dy dx =
3
1
1 y2
2
+
xy
e 2
2 x
dx =
y =0
3 1
1
1 2 1
9 1 2 1
1 7 1 2 1
=
e 2 x + 2 x dx =4 x + 2 ln( x) x=e =4 4 e + 2 ln(3) 2 =4 4 e + 2 ln(3)
f) $\int f'(t)\,dt$
Solution:
$$\int f'(t)\,dt = f(t) + C$$
C is a constant
Exercise 2.
Please compute the first derivative of the following functions
a. $F(x) = \int_1^x e^{-3t}\,dt$
Solution:
$F'(x) = e^{-3x}$
Alternative solution:
$$F(x) = \int_1^x e^{-3t}\,dt = \left[-\frac{1}{3}e^{-3t}\right]_{t=1}^{x} = -\frac{1}{3}e^{-3x} + \frac{1}{3}e^{-3}$$
$$F'(x) = -\frac{1}{3}(-3)e^{-3x} = e^{-3x}$$
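b. $F(x) = \int_1^{2x} e^{-3t}\,dt$
Solution:
$$F'(x) = e^{-3(2x)}\,\frac{d(2x)}{dx} = 2e^{-6x}$$
Alternative solution:
$$F(x) = \int_1^{2x} e^{-3t}\,dt = \left[-\frac{1}{3}e^{-3t}\right]_{t=1}^{2x} = -\frac{1}{3}e^{-6x} + \frac{1}{3}e^{-3}$$
$$F'(x) = -\frac{1}{3}(-6)e^{-6x} = 2e^{-6x}$$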
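The chain-rule result for a variable upper limit, $F'(x) = 2e^{-6x}$ for $F(x) = \int_1^{2x} e^{-3t}\,dt$, can be checked without using the antiderivative at all. In the sketch below `F` is evaluated by Simpson's rule and then differentiated by a central difference; helper names, step sizes and sample points are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule; also works when b < a (the step is then negative)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def F(x):
    """F(x) = integral of e^{-3t} from t = 1 to t = 2x, computed numerically."""
    return simpson(lambda t: math.exp(-3 * t), 1.0, 2 * x)

def F_prime_chain_rule(x):
    """Integrand at the upper limit times the derivative of the upper limit."""
    return math.exp(-3 * (2 * x)) * 2

def central_diff(func, x, h=1e-4):
    return (func(x + h) - func(x - h)) / (2 * h)

points = [0.0, 0.25, 0.5, 1.0]
gaps = [abs(central_diff(F, x) - F_prime_chain_rule(x)) for x in points]
print(gaps)
```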
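c. $F(x) = \int_x^3 e^{-3t}\,dt$
Solution
$$F(x) = \int_x^3 e^{-3t}\,dt = -\int_3^x e^{-3t}\,dt$$
$$F'(x) = -e^{-3x}$$
Alternative solution:
$$F(x) = \int_x^3 e^{-3t}\,dt = -\int_3^x e^{-3t}\,dt = -\left[-\frac{1}{3}e^{-3t}\right]_{t=3}^{x} = \frac{1}{3}e^{-3x} - \frac{1}{3}e^{-9}$$
$$F'(x) = -e^{-3x}$$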
x
3t
d. F ( x) = e dt
Solution:
F '( x) = e 3 x
*Alternative solution:
x
1 3t
1 3 x
1
=
F ( x) =
e dt
e=
e lim e 3t
t 3
3
3
t =
x
3t
F '( x) = e 3 x
e. F (t ) =
1
dx
( x + 1)
Solution:
F '(t ) =
f.
1
(t + 1)
1
dx
0 ( x + 1)
F ( ) =
* Solution:
We know that
1 1
1
=
F '( ) =
dx
d ( x )
d ( x + 1)
0
0 =
0
( x + 1) ( x + 1)
( x + 1)
254
1 1
dx
=
0 ( x + 1)
dx
=
( x + 1)
0
x
dx
( x + 1) 2
dt =
+ ln(t + 1) for positive t+1.
(t + 1)2 dt =
td (t + 1) =
(t + 1)
(t + 1)
(t + 1)
{However, there is one problem, it is not t, but ( x + 1) . Hence:
1
d
( x + 1)
1
1
1
=
, because
}
dx = d
2
dx
( x + 1) 2
( x + 1)
( x + 1)
1
1 x
1 x
1
x
1
1
x
1
x 1
0 ( x + 1)2 dx =0 d ( x + 1) =0 d ( x + 1) = ( x + 1) 0 ( x + 1) d =
x =0
1
1
1
1 1 1 1 1
=
=
2
dx
0
( + 1)
( + 1) ( x + 1)
1
1
=
2
( + 1)
+1
1
+ 1)
d ( x=
0 ( x + 1)
+1
1
1
1
1
1
dt =
2 ln(t ) =
2 ln( + 1)
t
( + 1)
( + 1)
t =1
Exercise 3.
Sketch the direction diagram and the phase diagram (if possible) for the following differential
equations. If possible, give an explicit solution.
a. $\dot{x}(t) = 3x(t)$
Solution:
$x(t) = C e^{3t}$
The directional diagram, with some solutions drawn in, looks thus:
What we see is a picture of widely diverging functions as time progresses. This can be
confirmed by looking at the phase diagram, which, for autonomous equations (those that do
not explicitly depend on time, see e.g. exercise 3e), plots the relation between x and $\dot{x}$.
As we can see in the phase diagram, if x is positive, then $\dot{x}$ is also positive. That means that if
x is positive, it will increase over time. Similarly, if it is negative, it will decrease over time.
Only if x is zero, the time-derivative of x is also zero, and x does not change over time. This is
what we saw in the first picture. The solutions are either always increasing or always
decreasing. Only the zero-solution does not change over time. It is called an unstable
equilibrium: if you are there, you stay there, but if you are near, no matter how close you are,
you'll never get there.
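The instability of the zero solution is easy to see by evaluating the explicit solution $x(t) = x_0 e^{3t}$ for starting values at and very near zero. The starting values and time points in this sketch are hypothetical.

```python
import math

def solution(x0, t, a=3.0):
    """Explicit solution of dx/dt = a*x with x(0) = x0."""
    return x0 * math.exp(a * t)

for x0 in (-1e-3, 0.0, 1e-3):
    path = [solution(x0, t) for t in (0.0, 1.0, 2.0, 3.0)]
    print(x0, path)
```

A start exactly at zero stays at zero, while starts a thousandth away on either side move further and further from zero as t grows.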
257
1
t
2
From the picture we see that these solutions do converge to 0. We can again confirm this in
the phase diagram.
This time, a positive x implies a negative $\dot{x}$ and vice versa, so a positive x means that x is
decreasing over time. Again zero is an equilibrium, as it does not change over time. However,
this time, it is a stable equilibrium: if you start somewhere near it, you'll get ever closer. In
fact, in this particular case, it doesn't matter where you start, you always get ever closer to
zero.
259
10
10
260
4
2
2
4
1
0.5 is an equilibrium, because x ( ) = 0 , and it is stable, because when x is smaller than 0.5,
2
261
We see -0.5 is an equilibrium, but it is unstable. Again, the phase diagram confirms:
262
2
e. $\dot{x}(t) - t\,x(t) = 0$
Solution:
This differential equation is of the form:
$\dot{x}(t) + f(t)\,x(t) = 0$, where $f(t) = -t$. These equations have the following general solution:
If $F(t)$ is such that $\frac{dF(t)}{dt} = f(t)$ on a certain interval, then:
$x(t) = C e^{-F(t)}$ on that interval. Here, the simplest F that works is $F(t) = -\frac{1}{2}t^2$
($F(t) = -\frac{1}{2}t^2 + 43$ would also work, but it would make life more complicated). We get:
$$x(t) = C e^{\frac{1}{2}t^2}$$
Because our differential equation also depends on t, we cannot draw a phase diagram; the
relation between x and $\dot{x}$ changes with time. We can observe however that x=0 is the only
equilibrium. We know that in equilibrium $\dot{x} = 0$ for all t; from this and our differential
equation it follows immediately that x=0.
f. $t\,\dot{x}(t) + x(t) = 0$
Solution:
This is not of the form $\dot{x}(t) + f(t)\,x(t) = 0$ that we discussed in 3e), but we can make it so by
dividing both sides by t. We obtain:
$\dot{x}(t) + \frac{x(t)}{t} = 0$, which is of the form we want, with $f(t) = \frac{1}{t}$. We can find antiderivatives for f on
the positive and the negative interval: $F(t) = \log(t)$ and $F(t) = \log(-t)$ respectively.
So we get on the two intervals:
$$x(t) = \begin{cases} C e^{-\log(t)} = \dfrac{C}{t} & \text{if } t > 0 \\[2mm] C e^{-\log(-t)} = -\dfrac{C}{t} & \text{if } t < 0 \end{cases}$$
To be sure, let's check this by plugging it into the original equation:
$t\,\dot{x}(t) + x(t) = -\frac{Ct}{t^2} + \frac{C}{t} = 0$, so it works for t>0. Similarly for t<0.
g. $\dot{x}(t) + x(t) = t$
Solution:
The nicest way to solve this is in 2 steps. This is a linear differential equation, meaning that
the x and its derivatives appear only linearly (their coefficients could be non-linear, although
that's not the case here). It is not homogeneous; it would be if it was $\dot{x}(t) + x(t) = 0$ instead. In
general, it is homogeneous if, when you write all the terms involving x and its derivatives on
one side, the other side is 0. Now the equation is called inhomogeneous. The solution to an
inhomogeneous linear differential equation can be obtained as follows. Find a simple
particular solution and find the general solution to the homogeneous counterpart. Add the two
265
$x(t) = C e^{-t}$.
So our total solution becomes:
$x(t) = C e^{-t} + t - 1$. We check our result:
$$\dot{x}(t) + x(t) = -C e^{-t} + 1 + C e^{-t} + t - 1 = t$$
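The same check can be run numerically for several values of the constant at once. The sketch below evaluates the residual $\dot{x}(t) + x(t) - t$ for $x(t) = Ce^{-t} + t - 1$ using a central difference for the derivative; the chosen constants, time points and step size are illustrative assumptions.

```python
import math

def x(t, C):
    """Candidate solution: homogeneous part C*e^{-t} plus particular part t - 1."""
    return C * math.exp(-t) + t - 1

def residual(t, C, h=1e-6):
    """x'(t) + x(t) - t, with x'(t) approximated by a central difference."""
    x_dot = (x(t + h, C) - x(t - h, C)) / (2 * h)
    return x_dot + x(t, C) - t

worst = max(abs(residual(t, C)) for C in (-2.0, 0.0, 5.0) for t in (0.0, 1.0, 4.0))
print(worst)  # close to zero for every C, as the general solution requires
```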
10
10
h. $\dot{x}(t) + x(t) = t^2$
Solution:
We apply the same trick. In fact, the homogeneous solution is the same as before:
$x(t) = C e^{-t}$
Notice that, although the solutions themselves do not converge, they do start to resemble the
function $x(t) = t^2 - 2t + 2$, the particular solution, more and more over time. The same thing
happened in exercise 3g); the solutions converged to the particular solution. This is because
the homogeneous solutions converge to 0.
i. $\dot{x}(t) + t\,x(t) = t$
Solution:
Again, we try to find a particular solution. It turns out to be particularly easy in this case:
$x(t) = 1$ works.
The solution to the homogeneous equation $\dot{x}(t) + t\,x(t) = 0$ we know to be (similarly to exercise 3e):
$$x(t) = C e^{-\frac{1}{2}t^2}$$
So we obtain:
$$x(t) = C e^{-\frac{1}{2}t^2} + 1$$
Let's check:
$$\dot{x}(t) + t\,x(t) = -t\,C e^{-\frac{1}{2}t^2} + t\left(C e^{-\frac{1}{2}t^2} + 1\right) = t$$
It worked!
15
10
10
15
* j. $\dot{x}(t) + t\,x(t) = t^2$
Solution:
Well, sort of. It turns out that there is no handy particular solution to this system. This is quite
a common occurrence with differential equations. However, we can still see what is going on
by drawing our regular pictures. Only, now the solutions were found numerically by a
computer.
Notice that the solutions converge to the line x=t. That line is unfortunately not an exact
solution: according to the differential equation $\dot{x} = 0$ along that line, whereas the line itself has slope $\dot{x} = 1$.
k. $\dot{x}(t) = t^3 e^{-x(t)}$
Solution:
This can be handled by separating the variables, as it is called. The right hand side is a
product of a term containing only x and a term containing only t. So we can rewrite it like
this:
$$\dot{x}(t) = \frac{dx}{dt} = t^3 e^{-x(t)} \Rightarrow e^{x(t)}\,dx = t^3\,dt$$
$$\int e^{x(t)}\,dx = \int t^3\,dt \Rightarrow e^{x(t)} = \frac{1}{4}t^4 + C$$
$$x(t) = \log\left(\frac{1}{4}t^4 + C\right)$$
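A separable equation's solution can be verified by plugging it back in. The sketch below assumes the reading $\dot{x} = t^3 e^{-x}$ of exercise k), for which separation of variables gives $x(t) = \log(t^4/4 + C)$, and evaluates the residual with a central difference; the constants (all positive so that the logarithm is defined) and time points are hypothetical.

```python
import math

def x(t, C):
    """Solution obtained by separating variables: e^x = t^4/4 + C."""
    return math.log(t**4 / 4 + C)

def residual(t, C, h=1e-6):
    """x'(t) - t^3 * e^{-x(t)}, with x'(t) approximated by a central difference."""
    x_dot = (x(t + h, C) - x(t - h, C)) / (2 * h)
    return x_dot - t**3 * math.exp(-x(t, C))

worst = max(abs(residual(t, C)) for C in (0.5, 1.0, 3.0) for t in (0.0, 0.7, 1.5, 2.0))
print(worst)
```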
269
l. x (t ) =
20
15
10
10
log(t )
n. x (t ) x(t ) =
Again we can only show the solution via a computer simulation:
270
15
10
10
o. $\dot{x}(t) + t^2 x(t) = t^3 + 2t^2 + 1$
Solution:
Particular: $x = t + 2$
Homogeneous: $x = C e^{-\frac{1}{3}t^3}$
p. $\dot{x}(t) + 3x(t) = 4e^{-t}$
Solution:
Particular: $x = 2e^{-t}$
Homogeneous: $x = C e^{-3t}$
q. $\dot{x}(t) + (2t + 2)\,x(t) = e^{-t^2}$
Solution:
Particular: $x = \frac{1}{2}e^{-t^2}$
Homogeneous: $x = C e^{-(t^2 + 2t)}$
271
xde x =
xe x
x e x dx =
x =0
1
1 1
=0 lim e x =
x
x
x =0
x
It can be shown that lim xe = 0
=lim( xe x + 0)
+ e x dx =
e x
e t dt
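Exercise 2.
The probability density of observing t is $f(t) = e^{-t}$ for $t \in [0, \infty)$.
Compute the probability of t larger than 1.
Solution:
$$\Pr(T > 1) = \int_1^{\infty} e^{-t}\,dt = \left[-e^{-t}\right]_{t=1}^{\infty} = \lim_{t \to \infty}\left(-e^{-t}\right) + \frac{1}{e} = \frac{1}{e}$$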
K = sX
(investment is proportional to output)
t
L = L0 e
(exponential growth of labour force)
where X = X (t ) is the national product, K = K (t ) is the capital stock and L = L(t ) is the
number of employees at time t. The model contains the following constants: A, , s, and L0 .
K K (0) > 0
Derive the differential equation to determine K = K (t ) when=
Solution:
dK
K = sAK 1 ( L0 et=
sA( L0 ) K 1 e t
=
)
dt
It is a separable differential equation. So we have on the left-hand side a function of dK and
K. On the right hand side we have a function of dt and t.
K 1dK = sA ( L0 et ) dt
dK = sA ( L0 et ) dt
which becomes
1
1
sA ( L0 ) et + C
K =
K =
sA ( L0 ) et + C1
if K = K 0 for t=0
sA
C
K 0 ( L0 )
=
1
K=
( L0 ) ( et 1)
K0 +
273
10 + 84
10 84
x = y =
x = y =
8
8
So we obtain the following four critical points (x,y):
10 + 68 10 + 68 10 68 10 68
(
,
), (
,
),
16
16
16
16
10 + 84 10 84 10 84 10 + 84
(
,
), (
,
)
8
8
8
8
To see whether these are maxima, minima or saddle points, we calculate the Hessian at each
of these points and check its definiteness. The Hessian is:
2 f
2 f
2
xy 12 x 10 + 2 y 2 x
x
=
2 f
2 y
2x
2 f
2
yx y
This we can now evaluate at our 4 points. For the first we obtain:
10 + 68
10 + 68
10 + 68
10 + 2
2
12
6
2.3
16
16
16
10 + 68
10 + 68 2.3 2.3
2
2
16
16
Ordinarily, we would check the determinants of the principal minors to determine the
274
10 -68
10
68 0.2 -0.2
2
-2
16
16
Since all the diagonal elements are negative, this might be a negative definite matrix, so we
check the determinants of the principal minors. There is of course only the full matrix in this
case:
8.5 0.2
= 0.2 8.5 0.22 = 1.66 > 0 .
0.2 0.2
So the principal minors alternate in sign: this is a negative definite matrix and this point is a
local maximum.
We check the third point:
10 + 84
10 84
10 + 84
10 + 2
2
12
14 4.8
8
8
8
10 + 84
10 84 4.8 4.8
2
2
8
8
This has all diagonal elements positive, so it might be positive definite, we check the principal
minor:
14 4.8
=14 4.8 4.82 =44.16 > 0 , so this is positive definite and we have found a
4.8 4.8
minimum.
Finally, we check the fourth point:
10 84
10 + 84
10 84
10 + 2
2
12
9 0.2
8
8
8
10 84
10 + 84 0.2 0.2
2
2
8
8
Here the diagonal elements again have different signs, so the matrix is indefinite and our point
is neither a minimum nor a maximum.
275
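$$L(x, y) = \log(x) + 4\log(3y) - \lambda(x + 6y - 10)$$
$$\frac{\partial L}{\partial x} = \frac{1}{x} - \lambda = 0$$
$$\frac{\partial L}{\partial y} = \frac{4}{y} - 6\lambda = 0$$
Because it almost always works in economics, we divide the two first order conditions:
$$\frac{y}{4x} = \frac{1}{6} \Rightarrow y = \frac{4}{6}x$$
We plug this into the constraint:
$$x + 6y = x + 6 \cdot \frac{4}{6}x = 5x = 10$$
$$x = 2, \quad y = \frac{4}{6} \cdot 2 = \frac{4}{3}$$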
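The Lagrangian solution x = 2, y = 4/3 can be confirmed by substituting the constraint into the objective and searching along it. The sketch assumes the objective $\log(x) + 4\log(3y)$ and the constraint $x + 6y = 10$ read off from the Lagrangian; the grid size is an arbitrary illustrative choice.

```python
import math

def objective_on_constraint(y):
    """Objective log(x) + 4*log(3y) after substituting x = 10 - 6y."""
    x = 10 - 6 * y
    return math.log(x) + 4 * math.log(3 * y)

n = 100_000
# y must lie strictly between 0 and 10/6 so that both logarithms are defined
ys = [(10 / 6) * i / n for i in range(1, n)]
y_best = max(ys, key=objective_on_constraint)
x_best = 10 - 6 * y_best
print(x_best, y_best)
```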
Exercise 3.
Solve the following integrals:
a) $\int \frac{t}{(t + 3)^2}\,dt$, for which you may assume that t is positive.
Solution:
We apply the method of integration by parts:
$$\int \frac{t}{(t + 3)^2}\,dt = -\int t\,d\left(\frac{1}{t + 3}\right) = -\frac{t}{t + 3} + \int \frac{1}{t + 3}\,dt = -\frac{t}{t + 3} + \ln(t + 3) + C$$
Two remarks:
1) We applied in the first step that
$$\frac{d}{dt}\,\frac{1}{t + 3} = -\frac{1}{(t + 3)^2} \Rightarrow \frac{1}{(t + 3)^2}\,dt = -d\left(\frac{1}{t + 3}\right)$$
2) We can check that:
$$\frac{d}{dt}\left(-\frac{t}{t + 3} + \ln(t + 3) + C\right) = \frac{d}{dt}\left(-t(t + 3)^{-1} + \ln(t + 3) + C\right) = -(t + 3)^{-1} + t(t + 3)^{-2} + (t + 3)^{-1} = t(t + 3)^{-2}$$
b)
0 1
1
( xy=
)dxdy
2
1 1
13
3 2
3
2 1
xy )=
dx dy x 2 y =
dy =
ydy =
y
0 1 ( 2 =
0 4
0 4
8 y0 8
x 1=
Alternative:
1
0 ( 2 xy)dydx =
1
2 1
1 1
2
0 ( xy )dy dx = 1 xy
2=
y
4
1
21
1 2
4 1 3
=
=
= =
dx
xdx
x
1 4
8 x1 8 8 8
0=
Exercise 4.
Sketch the direction diagram and the phase diagram (if possible) for the following differential
equations. If possible, give an explicit solution.
$\dot{x}(t) = 2x(t)$
Solution:
Let's first solve it exactly. We could either rewrite the equation in a form to which we know
the solution, or we could try to guess the solution, because the equation is so simple. We will
start with the guessing approach. $\dot{x}(t) = 2x(t)$ requires that we find functions whose
derivative is twice the function itself. We know that the exponential function is equal to its own derivative, so
it makes sense to see if we can manipulate it to get a solution to our equation. We could try
multiplying the exponential function, but then it would still be equal to its own derivative:
$\frac{d}{dt}C e^{t} = C e^{t}$. But if we multiply the power, we do get our desired result: $\frac{d}{dt}e^{2t} = 2e^{2t}$.
This gives us exactly what we want. But we know that it will also hold for any multiple of this
function, so we get as a general solution: $x(t) = C e^{2t}$.
We can also obtain this result in a more procedural fashion: if we rewrite the equation as
$\dot{x}(t) - 2x(t) = 0$, we see that it is a homogeneous linear equation, i.e. of the form
$\dot{x}(t) + f(t)\,x(t) = 0$, with $f(t) = -2$. This general form has the solution:
Finally, we draw a phase diagram. Here we set x on the horizontal axis and $\dot{x}$ on the vertical
axis.
From the phase diagram we can also infer the qualitative behaviour of our solutions. In
particular, we see that $\dot{x} = 0$ when x=0. This implies that x=0 is an equilibrium: when x=0, the
x value does not change over time. If x>0, then $\dot{x} > 0$, according to the phase diagram. This
means that if x is positive, it will grow over time. Similarly, if x<0, then $\dot{x} < 0$, so x will
decrease over time. We see that solutions will move away from the equilibrium-value 0 over
time, so that x=0, while an equilibrium, is not stable. All this is corroborated by our actual
solutions.
279
K.14.1.
K.14.3. Page 481
486 (Phase diagram
Difference equations
Introduction to first-order difference equations
280
K. 13.1
f (t )dt + C
C is some constant
281
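$$\dot{x} + A(t)\,x = B(t)$$
Solution:
Step 1
Find the general solution to the homogeneous equation (or reduced
equation)
$$\dot{x} + A(t)\,x = 0$$
$$\dot{x} = -A(t)\,x$$
$$\frac{\dot{x}}{x} = -A(t)$$
$$\int \frac{\dot{x}}{x}\,dt = -\int A(t)\,dt$$
$$\int \frac{1}{x}\,dx = -\int A(t)\,dt$$
$$\ln x = -\int A(t)\,dt + C$$
$$x = e^{-\int A(t)\,dt + C}$$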
Step 2
Find the particular solution for the non-homogenous differential
equation. This can be considered as a steady-state value
x + A(t ) x =
B (t )
Try for the particular solution of x :
If B (t ) is constant, a constant
If B (t ) is a polynomial of degree n, try for x a polynomial of degree
B (t )
e at
e at
282
283
$$\dot{x} + ax = b$$
Note that x is a function of t. We find the general solution by
following the steps above.
Step 1 (homogeneous equation: right-hand side of the differential
equation is zero). The homogeneous equation is a separable differential
equation.
$$\dot{x} + ax = 0$$
$$\dot{x} = -ax$$
$$\frac{dx}{dt} = -ax$$
$$\frac{dx}{x} = -a\,dt$$
$$\int \frac{dx}{x} = -\int a\,dt$$
$$\ln(x) = -\int a\,dt + C$$
(we assume a positive x; C is some real number)
$$x = e^{\ln(x)} = e^{-\int a\,dt + C} = e^{C}e^{-\int a\,dt} = C_1 e^{-at} \quad (C_1 = e^{C})$$
Step 2:
$x = \frac{b}{a}$ is the steady state
Step 3:
General solution is the solution of the homogeneous equation plus the
steady state:
$$x = C_1 e^{-at} + \frac{b}{a}$$
284
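Thus:
$$x(0) = C_1 \cdot 1 + \frac{b}{a} \Rightarrow C_1 = x(0) - \frac{b}{a}$$
Next we substitute $C_1$ in the equation:
$$x = \left(x(0) - \frac{b}{a}\right)e^{-at} + \frac{b}{a}$$
Step 5
We study the solution as t becomes infinitely large.
Hence, if the initial value x(0) is $\frac{b}{a}$, then the limit will be equal to
the initial value.
$$\lim_{t \to \infty} x = \left(\frac{b}{a} - \frac{b}{a}\right)e^{-at} + \frac{b}{a} = \frac{b}{a}$$
If the initial value x(0) is not equal to $\frac{b}{a}$ and a > 0 then
$$\lim_{t \to \infty} x = \lim_{t \to \infty}\left[\left(x(0) - \frac{b}{a}\right)e^{-at} + \frac{b}{a}\right] = \left(x(0) - \frac{b}{a}\right)\lim_{t \to \infty} e^{-at} + \frac{b}{a} = \left(x(0) - \frac{b}{a}\right) \cdot 0 + \frac{b}{a} = \frac{b}{a}$$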
285
b
and a < 0 then
a
$$\dot{x} + ax = b$$
Note that x is a function of t; so the outcome of variable x depends on
time. We calculated that the equilibrium is $x^* = \frac{b}{a}$.
We rewrite the differential equation as a function of equilibrium:
$$\dot{x} = -ax + b = -ax + a\,\frac{b}{a} = -ax + ax^* = a(-x + x^*)$$
Convergence:
If a is positive, and $x > x^*$, $\dot{x}$ becomes negative. It means that x is
too large (relative to equilibrium value). A negative $\dot{x}$ ensures
adjustment towards equilibrium, so that x becomes smaller.
If a is positive, and $x < x^*$, $\dot{x}$ is positive. Hence x is too small. The
positive $\dot{x}$ gives an adjustment towards equilibrium.
Finally, we can show that a more positive a gives a faster
convergence (more rapid adjustment) towards equilibrium. We
consider the solution of the differential equation. A larger a brings
$e^{-at}$ closer to zero, so that x is closer to $x^*$. It implies a more
rapid adjustment.
$$x = \left(x(0) - \frac{b}{a}\right)e^{-at} + \frac{b}{a} = \left(x(0) - x^*\right)e^{-at} + x^* = x(0)\,e^{-at} + x^*\left(1 - e^{-at}\right)$$
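The claim that a larger a means faster adjustment can be made concrete with the solution $x(t) = (x(0) - x^*)e^{-at} + x^*$, where $x^* = b/a$. The sketch below reports which share of the initial gap to equilibrium is still left after one unit of time; all numerical values are hypothetical.

```python
import math

def x_path(t, x0, a, b):
    """Solution of dx/dt + a*x = b with x(0) = x0."""
    x_star = b / a
    return (x0 - x_star) * math.exp(-a * t) + x_star

def remaining_share(t, x0, a, b):
    """Fraction of the initial distance to equilibrium that is left at time t."""
    x_star = b / a
    return (x_path(t, x0, a, b) - x_star) / (x0 - x_star)

x0, b, t = 10.0, 6.0, 1.0          # hypothetical numbers
shares = [remaining_share(t, x0, a, b) for a in (0.5, 1.0, 3.0)]
print(shares)
```

The remaining share equals $e^{-at}$, so it shrinks as a increases: the higher a, the more of the gap has been closed at any given time.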
287
288
290
291
c d
292
Thus:
$$x^* = ax^* + y$$
$$(1 - a)\,x^* = y$$
$$x^* = \frac{1}{1 - a}\,y$$
Convergence of the sequence to steady state (regardless of initial
value of $x_0$) if $|a| < 1$
The steady state is not well defined if a = 1
Divergence of the sequence if $|a| > 1$
293
lim b nut +n = 0
n
294
Step 1:
Solve the homogeneous equation (so, the right-hand side is zero):
$$x_t - ax_{t-1} = 0$$
Solution to this equation (we need to determine A and k):
$$x_t = Ak^t$$
which is substituted in the homogeneous equation:
$$Ak^t - aAk^{t-1} = 0$$
so that k = a
solution to the homogeneous equation
$$x_t = Aa^t$$
Step 2:
Find a particular solution of $x_t = ax_{t-1} + y$,
which is
$$x^* = \frac{1}{1 - a}\,y$$
Step 3:
General solution:
$$x_t = Aa^t + \frac{1}{1 - a}\,y$$
295
$$A = x_0 - \frac{1}{1 - a}\,y$$
$$x_t = \left(x_0 - \frac{1}{1 - a}\,y\right)a^t + \frac{1}{1 - a}\,y = x_0\,a^t + \frac{1 - a^t}{1 - a}\,y$$
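The closed-form solution $x_t = (x_0 - \frac{y}{1-a})a^t + \frac{y}{1-a}$ of $x_t = ax_{t-1} + y$ can be compared with direct iteration of the recursion. The parameter values in this sketch are hypothetical and chosen with $|a| < 1$, so that the sequence converges to the steady state.

```python
def iterate(x0, a, y, t):
    """Apply x_t = a * x_{t-1} + y a total of t times, starting from x0."""
    x = x0
    for _ in range(t):
        x = a * x + y
    return x

def closed_form(x0, a, y, t):
    """General solution with the constant A fixed by the initial value x0."""
    steady = y / (1 - a)
    return (x0 - steady) * a**t + steady

x0, a, y = 5.0, 0.8, 3.0   # hypothetical values with |a| < 1
pairs = [(iterate(x0, a, y, t), closed_form(x0, a, y, t)) for t in (0, 1, 5, 50)]
for direct, formula in pairs:
    print(direct, formula)
```

With these numbers the steady state is y/(1-a) = 15, and both columns approach it as t grows.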
296
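We solve $x_t = \frac{1}{2}x_{t-1} + 2$ explicitly by repeated substitution. Assume that at time 0, $x_0$ has
value C (since we don't specify C, this isn't really an assumption, but more of a definition).
Since our equation $x_t = \frac{1}{2}x_{t-1} + 2$ holds for all t, we can plug in $x_{t-1} = \frac{1}{2}x_{t-2} + 2$, to get:
$x_t = \frac{1}{2}\left(\frac{1}{2}x_{t-2} + 2\right) + 2$. We can keep repeating this process until we get to t=0:
$$x_t = \frac{1}{2}\left(\frac{1}{2}\left(\frac{1}{2}\left(\cdots\left(\frac{1}{2}x_0 + 2\right) + \cdots\right) + 2\right) + 2\right) + 2 = \left(\frac{1}{2}\right)^t x_0 + 2\left(1 + \frac{1}{2} + \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^3 + \cdots + \left(\frac{1}{2}\right)^{t-1}\right)$$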
This last line we can rewrite in an easier way by applying the following very neat trick. We do
it for a general constant , instead of 2.
We want to find the value of B =1 + + 2 + + t 1 . First observe that
B = + 2 + 3 + t 1 + t . If we compare the term of these two sequences of numbers,
we see that they are very similar: only the first term of the first and the last term of the second
are different. If we subtract them, we get therefore:
(1 + + 2 + + t 1 ) ( + 2 + + t 1 + t ) =1 t =B B =(1 ) B
=
xt
1 t
B=
1
If we apply this formula to our function we obtain:
1
1 ( )t
2 = ( 1 )t x + 4(1 ( 1 )t )
0
1
2
2
2
Finally, we can ask what will be an equilibrium of this process. In equilibrium, x does not
change over time, so we have xt = xt 1 . Plugging this into our equation, we find:
1
xt=
xt + 2 xt= 4 . We can also see that all our solutions will converge towards
2
1
1
( )t x0 + 4(1 ( )t ) goes to zero,
equilibrium over time. As t , the first term of x=
t
2
2
whereas the term in brackets goes to 1. SO the whole thing goes to 4, the equilibrium value.
1
1 1
1
1
1
( )t x0 + 2(1 + + ( ) 2 + ( )3 + + ( )t 1=
) ( )t x0 + 2
x=
t
2
2 2
2
2
2
297
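The geometric-sum trick and the resulting formula for $x_t = \frac{1}{2}x_{t-1} + 2$ can be verified with a few lines of exact arithmetic. This is only a sketch: the function names are made up for illustration, and the general constant (written as `alpha` here) is assumed to differ from 1.

```python
from fractions import Fraction

def geometric_sum(alpha, t):
    """B = 1 + alpha + ... + alpha**(t-1), added up term by term."""
    return sum(alpha**i for i in range(t))

def geometric_closed(alpha, t):
    """Closed form (1 - alpha**t) / (1 - alpha); requires alpha != 1."""
    return (1 - alpha**t) / (1 - alpha)

def x_closed(x0, t):
    """x_t = (1/2)**t * x0 + 4 * (1 - (1/2)**t)."""
    half = Fraction(1, 2)
    return half**t * x0 + 4 * (1 - half**t)

def x_recursive(x0, t):
    """Repeated substitution of x_t = x_{t-1} / 2 + 2, starting from x0."""
    x = Fraction(x0)
    for _ in range(t):
        x = Fraction(1, 2) * x + 2
    return x

print(geometric_sum(Fraction(1, 2), 10) == geometric_closed(Fraction(1, 2), 10))
print(float(x_closed(10, 30)))  # close to the equilibrium value 4
```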
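$x_t = \frac{7}{8}x_{t-1} + 30$ with $x_0 = 300$
In the steady state:
$x^* - \frac{7}{8}x^* = 30$
$x^* = \frac{1}{1 - \frac{7}{8}}\,30 = 240$
Step 3: General solution
$x_t = A\left(\frac{7}{8}\right)^t + 240$
Step 4: Solve A for $x_0$
$x_0 = A\left(\frac{7}{8}\right)^0 + 240$
$300 = A\left(\frac{7}{8}\right)^0 + 240$
$60 = A$
So the solution is $x_t = 60\left(\frac{7}{8}\right)^t + 240$
Repeated iteration:
$x_t = \frac{7}{8}x_{t-1} + 30$
$x_t = \frac{7}{8}\left(\frac{7}{8}x_{t-2} + 30\right) + 30 = \left(\frac{7}{8}\right)^2 x_{t-2} + 30\left(1 + \frac{7}{8}\right) = \cdots$
$= \left(\frac{7}{8}\right)^t x_0 + 30\left(1 + \frac{7}{8} + \left(\frac{7}{8}\right)^2 + \cdots + \left(\frac{7}{8}\right)^{t-1}\right) = \left(\frac{7}{8}\right)^t x_0 + 30\,\frac{1 - (7/8)^t}{1 - 7/8}$
For which in the final step we used the following equality:
$1 + \frac{7}{8} + \left(\frac{7}{8}\right)^2 + \cdots + \left(\frac{7}{8}\right)^{t-1} = \sum_{i=0}^{t-1}\left(\frac{7}{8}\right)^i = \frac{1 - (7/8)^t}{1 - 7/8}$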
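For this example, the closed form $x_t = 60\left(\frac{7}{8}\right)^t + 240$ can be compared with repeated iteration from $x_0 = 300$, and one can also ask how many periods it takes to get close to 240. The sketch below is illustrative; `first_period_within` is a hypothetical helper that is not part of the exercise.

```python
from fractions import Fraction

A = Fraction(7, 8)

def x_closed(t):
    """Closed-form solution 60 * (7/8)**t + 240."""
    return 60 * A**t + 240

def x_iterated(t, x0=300):
    """Iterate x_t = (7/8) * x_{t-1} + 30 for t periods, starting from x0."""
    x = Fraction(x0)
    for _ in range(t):
        x = A * x + 30
    return x

def first_period_within(tolerance):
    """Smallest t for which x_t is less than `tolerance` above 240."""
    t = 0
    while x_closed(t) - 240 >= tolerance:
        t += 1
    return t

print(x_closed(12) == x_iterated(12))
print(first_period_within(1))
```

Because $7/8$ is fairly close to one, the adjustment is slow: the gap shrinks by only one eighth per period.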
Exercise 2.
Compute the forward solution for the difference equation
$x_t = \frac{7}{8}x_{t+1} + 30$
Solution:
$x_t = \frac{7}{8}x_{t+1} + 30$
It is also valid for one period ahead: period t+1 (instead of t)
$x_{t+1} = \frac{7}{8}x_{t+2} + 30$
Substitute the equation for t+1 in the first equation. Repeat it n times. Consider the outcome for n very large. The first term becomes zero as n tends to infinity. The second term becomes 240.
$x_t = \frac{7}{8}\left(\frac{7}{8}x_{t+2} + 30\right) + 30 = \cdots = \lim_{n \to \infty}\left[\left(\frac{7}{8}\right)^n x_{t+n} + 30\left(1 + \frac{7}{8} + \left(\frac{7}{8}\right)^2 + \cdots\right)\right] = 0 + 30\,\frac{1}{1 - 7/8} = 240$
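The limit argument can be made concrete by stopping the forward substitution after n steps, which gives $x_t = \left(\frac{7}{8}\right)^n x_{t+n} + 30\sum_{i=0}^{n-1}\left(\frac{7}{8}\right)^i$. The sketch below assumes that the terminal value $x_{t+n}$ stays bounded; the function name and the terminal values tried are illustrative.

```python
from fractions import Fraction

def forward_value(terminal_value, n, a=Fraction(7, 8), y=30):
    """x_t after n forward substitutions: a**n * x_{t+n} + y * sum_{i<n} a**i."""
    return a**n * terminal_value + y * sum(a**i for i in range(n))

# Whatever bounded terminal value is plugged in, x_t approaches 240 as n grows.
for terminal in (-500, 240, 1000):
    print(terminal, float(forward_value(terminal, 200)))
```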
Exercise 3.
Solve the differential equations
$\dot{x} + 3x = 3$, $x(0) = 50$
$\dot{x} + 0.1x = 3$, $x(0) = 50$
and demonstrate that for the first equation there is a more rapid convergence towards the steady state.
Solution:
Let's solve the following differential equation first:
$\dot{x} + 3x = 3$, $x(0) = 50$
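Step 1: Solve the homogeneous equation:
$\dot{x} + 3x = 0$
$\frac{dx}{x} = -3\,dt$
$\ln x = -3t + C$
$x = C_1 e^{-3t}$, for which $C_1 = e^C$
Step 2: Particular solution (the steady state):
$x^* = \frac{3}{3} = 1$
Step 3: General solution:
$x = C_1 e^{-3t} + 1$
Step 4: We solve the outcome of step 3 for the unknown $C_1$, using the initial condition x(0) (so we take t=0)
$50 = C_1 \cdot 1 + 1$
So that the solution is:
$x = (x(0) - 1)e^{-3t} + 1 = 49e^{-3t} + 1$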
The convergence towards equilibrium:
$x = x(0)\,e^{-3t} + 1 \cdot (1 - e^{-3t})$
Hence, x is a weighted average of x(0) (the initial condition) and 1 (the steady state). The weights are $e^{-3t}$ for the initial condition x(0) and $1 - e^{-3t}$ for the steady state. Note that for $t > 0$ both weights are between 0 and 1, because $0 < e^{-3t} < 1$ and $0 < 1 - e^{-3t} < 1$ (which may be rewritten as $0 < \frac{1}{e^{3t}} < 1$ and $0 < 1 - \frac{1}{e^{3t}} < 1$). Both weights add up to one.
Differential equation 2
One can show that for the second differential equation
$\dot{x} + 0.1x = 3$, $x(0) = 50$
the steady state of step 2 is $x^* = \frac{3}{0.1} = 30$. The solution to the differential equation becomes:
$x = (x(0) - 30)e^{-0.1t} + 30 = 20e^{-0.1t} + 30$
The convergence from the initial condition x(0) towards the steady state (30) can be written as
$x = x(0)\,e^{-0.1t} + 30\,(1 - e^{-0.1t})$
We compare the weights for the second and the first equation. At any $t > 0$, the weight for the steady state of the previous differential equation, $(1 - e^{-3t})$, is closer to one than the corresponding weight of the current equation, $(1 - e^{-0.1t})$. Hence, there is a more rapid convergence towards the steady state for the first differential equation, since it has a larger weight attached to the steady state.
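One way to put a number on "more rapid convergence" is the half-life of the gap to the steady state. For $\dot{x} + ax = b$ the gap shrinks like $e^{-at}$, so it is halved after $t = \ln 2 / a$. The half-life is not used in the exercise itself; it is added here as an illustrative measure, and the function names are assumptions.

```python
import math

def weight_on_steady_state(a, t):
    """Weight 1 - exp(-a*t) attached to the steady state at time t."""
    return 1.0 - math.exp(-a * t)

def half_life(a):
    """Time at which the gap to the steady state is halved: ln(2) / a."""
    return math.log(2) / a

def x_path(t, x0, a, x_star):
    """Weighted average of the initial condition and the steady state."""
    w = weight_on_steady_state(a, t)
    return x0 * (1.0 - w) + x_star * w

# First equation: a = 3, x* = 1.  Second equation: a = 0.1, x* = 30.
print(half_life(3.0), half_life(0.1))
print(weight_on_steady_state(3.0, 1.0), weight_on_steady_state(0.1, 1.0))
```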
Exercise 4.
Consider the system of differential equations
$\dot{x} = ax + by + e_1$
$\dot{y} = cx + dy + e_2$
for which $a > 0$, $b < 0$, $c > 0$, and $d > 0$.
$\dot{y} < 0$. So that
$\dot{y} = dy + cx + e_2 < 0$
$dy < -cx - e_2$
$y < -\frac{c}{d}x - \frac{e_2}{d}$
Consequently, the area below the equation $y = -\frac{c}{d}x - \frac{e_2}{d}$ has a negative value for $\dot{y}$ (consequence: the vertical arrows in areas South and West point in downward direction).
$\dot{y} > 0$ implies that the areas North and East above the equation have a positive value for $\dot{y}$. The vertical arrows point in upward direction in both areas. Note that Figure 14.7b in Klein is wrong with respect to the vertical arrows in West (must be downward) and East (must be upward). All vertical arrows below the line $\dot{y} = 0$ must point in the same direction.
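The direction of the vertical arrows can be checked by evaluating $\dot{y} = cx + dy + e_2$ on either side of the line $y = -\frac{c}{d}x - \frac{e_2}{d}$. The parameter values below are hypothetical, chosen only to satisfy $c > 0$ and $d > 0$; they do not come from the exercise.

```python
def y_dot(x, y, c, d, e2):
    """Right-hand side of the second equation: c*x + d*y + e2."""
    return c * x + d * y + e2

def y_nullcline(x, c, d, e2):
    """Value of y on the line where y_dot = 0 (requires d != 0)."""
    return -(c / d) * x - e2 / d

c, d, e2 = 1.0, 2.0, -4.0           # hypothetical values with c > 0, d > 0
for x in (-3.0, 0.0, 5.0):
    on_line = y_nullcline(x, c, d, e2)
    below = y_dot(x, on_line - 1.0, c, d, e2)   # one unit below the line
    above = y_dot(x, on_line + 1.0, c, d, e2)   # one unit above the line
    print(x, below, above)
```

With $d > 0$, every point below the line gives a negative $\dot{y}$ and every point above it a positive one, regardless of x, which is why all arrows on the same side of the line point in the same direction.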
y=
y1 2 y2
2
Solution:
$a = -\frac{1}{3}$, $b = -\frac{2}{9}$, $c = \frac{16}{27}$
So our particular solution becomes:
$x(t) = -\frac{1}{3}t^2 - \frac{2}{9}t + \frac{16}{27}$
We proceed to find the solution to the homogeneous equation: $\dot{x} - 3x = 0$. We could observe the solution directly here, but we apply the general rule: if the equation is of the form $\dot{x} + f(t)x = 0$, then the solution is $x(t) = C e^{-F(t)}$, where $F'(t) = f(t)$. Here $f(t) = -3$, so we obtain: $x(t) = C e^{3t}$.
The general solution is now the sum of the particular and the homogeneous solution:
$x(t) = C e^{3t} + at^2 + bt + c$
b) $\dot{x} - 3x = e^t$
Solution:
Observe that the homogeneous equation is the same as before, so we only look for the particular solution. We first try the exponential function, as the inhomogeneous part of the equation is also an exponential function. However, we can see that this doesn't work (can we?), because we're off by a factor, so we try a factor in front of our function:
$x(t) = C e^t = \dot{x}(t)$. Our equation becomes:
$\dot{x} - 3x = C e^t - 3C e^t = -2C e^t = e^t \Rightarrow C = -\frac{1}{2}$
So our particular solution is: $x(t) = -\frac{1}{2}e^t$.
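A candidate solution of $\dot{x} - 3x = e^t$ can be tested by plugging it into the equation and looking at the residual. The sketch below approximates the derivative with a central difference, so the residual is only zero up to a small numerical error; the function names, the step size and the constant 2.0 in front of the homogeneous part are illustrative assumptions.

```python
import math

def candidate(t, homogeneous_constant=0.0):
    """Particular solution -(1/2) e^t plus an optional homogeneous part K e^(3t)."""
    return homogeneous_constant * math.exp(3.0 * t) - 0.5 * math.exp(t)

def residual(x, t, h=1e-5):
    """x'(t) - 3 x(t) - e^t, with x'(t) approximated by a central difference."""
    derivative = (x(t + h) - x(t - h)) / (2.0 * h)
    return derivative - 3.0 * x(t) - math.exp(t)

# Both the particular solution and a general solution leave (almost) no residual.
print(residual(candidate, 1.0))
print(residual(lambda t: candidate(t, homogeneous_constant=2.0), 1.0))
# The naive guess x(t) = e^t does not work: its residual is -3 e^t.
print(residual(math.exp, 1.0))
```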
c) $\dot{x} + t^2 x = 3t^4 - 4t^2 + 6t$
Solution:
We start with the particular solution. Since the inhomogeneous part is a fourth-degree polynomial, we could try a fourth-degree polynomial as our solution, but because of the $t^2$ in front of x, we see that it is wiser to try a polynomial of degree 4-2=2 (we could try the fourth-degree polynomial and it would give us the same result, with a lot more work). So we try $x = at^2 + bt + c$, $\dot{x} = 2at + b$. Our equation becomes:
$2at + b + t^2(at^2 + bt + c) = at^4 + bt^3 + ct^2 + 2at + b = 3t^4 - 4t^2 + 6t$
Matching coefficients gives $a = 3$, $b = 0$, $c = -4$, so that
$x = 3t^2 - 4$
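The coefficient matching for the trial solution $x = at^2 + bt + c$ in $\dot{x} + t^2 x = 3t^4 - 4t^2 + 6t$ can be written out mechanically. This is a small sketch; the list layout (lowest power first) and the function name are choices made here for illustration.

```python
def lhs_coefficients(a, b, c):
    """Coefficients [t^0, t^1, t^2, t^3, t^4] of x' + t^2 * x for x = a t^2 + b t + c."""
    # x' = 2 a t + b contributes to t^1 and t^0;
    # t^2 * x = a t^4 + b t^3 + c t^2 contributes to t^4, t^3 and t^2.
    return [b, 2 * a, c, b, a]

RHS = [0, 6, -4, 0, 3]   # 3 t^4 - 4 t^2 + 6 t, lowest power first

print(lhs_coefficients(3, 0, -4) == RHS)
```

There are five coefficients but only three unknowns, so the match works only because the right-hand side happens to be compatible with a quadratic trial solution.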
We move on to the homogeneous solution:
$\dot{x} + t^2 x = 0$, $f(t) = t^2 \Rightarrow F(t) = \frac{1}{3}t^3$, so $x = C e^{-\frac{1}{3}t^3}$ is the homogeneous solution and
$x(t) = C e^{-\frac{1}{3}t^3} + 3t^2 - 4$
is the general solution.