Advanced Mathematics 2012 2013 Week 1 - Week 8 - 21 Jan 2013

Advanced Mathematics
2012/2013
Version 21 January 2013
Utrecht University School of Economics

Prof. dr. Wolter Hassink
Mathijs Janssen
ADVANCED MATHEMATICS CONTENTS

General information

Week 1 Introductory material
    Lecture
    Technical tutorial
    Broad tutorial
    Take home assignments

Week 2 Linear algebra (I)
    Lecture
    Technical tutorial
    Broad tutorial

Week 3 Linear algebra (II)
    Lecture
    Technical tutorial
    Broad tutorial
    Additional exercises

Week 4 Calculus (II)
    Lecture
    Technical tutorial
    Broad tutorial

Week 5 Optimization (I)
    Lecture
    Tutorials

Week 6 Optimization (II) and integrals (I)
    Lecture
    Technical tutorial
    Broad tutorial

Week 7 Integrals (II) and dynamic analysis (I)
    Lecture
    Technical tutorial
    Broad tutorial

Week 8 Dynamic analysis (II)
    Lecture
    Technical tutorial
    Broad tutorial
ADVANCED MATHEMATICS GENERAL INFORMATION

Assignments
From week 2: hand in your work. At random, three of these assignments will be assessed.
From week 3: two group assignments on broad economic questions.
Assessment method
The average of the assessments of three randomly chosen individual assignments (24%);
Group assignment (16%);
o Assignment to teams will be done by the lecturers.
Individual end-term exam, closed book (60%). For more information about replacement and supplementary retake exams, see the course manual.
Effort requirement
Six out of seven assignments should be handed in.

Academic Skills
Problem solving: on average a sufficient grade (5.5 or higher) for individual assignments.
o The justification of the method chosen, the systematic presentation of both the steps in the problem-solving stage and the solution, and the interpretation of the results are all aspects that count heavily. This means that solutions that use the problem-solving skill extensively but contain calculation errors can still be assessed as sufficient.
Effective teamwork: 1) sufficient contribution to team assignments; 2) sufficient average grade (5.5 or higher) for the team assignment. For the assignment, the contribution of each team member must be shown in the paper.
ADVANCED MATHEMATICS LECTURE WEEK 1

ADVANCED MATHEMATICS SLIDES WEEK 1
Week 1 - Introductory material
Functional notation (domain, range etc.): Klein 2.1 [K 2.1]
Graphs of univariate and multivariate functions: K 2.1
Limits, continuity: K 2.1
Properties of functions (monotonous, convex, concave, injective, surjective, inverse, homogeneous): K 2.2
Necessary conditions, sufficient conditions: K 2.2
Discrete compounding: K 3.1
Exponential function: K 2.3, K 3.2
Rules of exponential functions: K 2.3
Multiple compounding per period: K 3.2
Continuous compounding: K 3.2
NPV: K 3.2
Logarithm (as inverse of exponential function): K 3.3
Rules of logarithms: K 3.3
Summations: supplemental material

Sets
Definition: Set: collection of elements
Example 1:
$\mathbb{N}$: the set of all non-negative integers, $\mathbb{N} = \{0, 1, 2, \ldots\}$
Set of positive integers: $\mathbb{N}^{+} = \{1, 2, \ldots\}$
$\mathbb{Z}$: the set of all integers, $\mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$
$\mathbb{Q}$: the set of all rational numbers of the form $a/b$, where $a$ and $b$ are both integers and $b \neq 0$
$\mathbb{R}$: the set of all real numbers (includes both rational and irrational numbers).
Definition: Irrational numbers, e.g.: $\sqrt{3}$, $1/\sqrt{2}$, $\pi$, $e$
Example 2:
let $S = \{0, 1, 2, \ldots, 10\}$
An element may belong to a set (or: may be a member of a set). Thus $x \in S$
Example 3:
integer $1 \in S$
integer $1 \in \mathbb{N}$
Irrational numbers $\pi$ (= 3.141...) and $e$ (= 2.71...): $\pi \in \mathbb{R}$ and $e \in \mathbb{R}$
However, $0.5 \notin S$, $\pi \notin S$ and $e \notin S$
Definition: Sub-set
Example 4: let $T = \{0, 1\}$
T is a sub-set of S: $T \subset S$
$\subset$: inclusion symbol
or: $S \supset T$

Definition: union of sets
Example 5: $S \cup T = S$
All elements that belong to S or T (or both).
Definition: intersection of sets
Example 6: $S \cap T = T$
Elements that belong to both S and T.
Definition: empty set $\emptyset$
Example 7: $V = \{9, 10\}$
$V \cap T = \emptyset$
The sets V and T have no elements in common. They are disjoint.
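The set operations of Examples 2 to 7 can be tried out directly. The following is a minimal Python sketch, not part of the course material; the helper name `describe` is hypothetical, while S, T and V are the sets from the examples.

```python
# Sets from Examples 2, 4 and 7 (the names S, T, V follow the text).
S = set(range(0, 11))   # {0, 1, ..., 10}
T = {0, 1}
V = {9, 10}

def describe(a, b):
    """Return (union, intersection, a is a sub-set of b, a and b are disjoint)."""
    return a | b, a & b, a <= b, a.isdisjoint(b)

union, intersection, t_subset_of_s, _ = describe(T, S)
print(union == S)          # the union of S and T is S itself
print(intersection == T)   # the intersection of S and T is T
print(t_subset_of_s)       # T is a sub-set of S
print(V & T == set())      # V and T have an empty intersection (disjoint)
print(0.5 in S)            # 0.5 is not an element of S
```

Python's built-in `set` type uses `|` for union, `&` for intersection and `<=` for the sub-set test, which maps one-to-one onto the notation above.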

Functions
Definition: Function, mapping (or transformation): relates each element of set X to an element of set Y
Function: $f : X \to Y$
Set X: domain
Set Y: range
Note: range can be broad.

Univariate functions
Definition: one member of the domain is related to one member of the range
$y = f(x)$
Definition: x is the argument of the function f(x)
Example 8:
$y = 2 + 3x$
Domain: $\mathbb{R}$
Range: $\mathbb{R}$
Example 9:
$y = 2 + 3|x|$
Domain: $\mathbb{R}$
Range: $[2, \infty)$
Example 10:
$y = 2 + 3\sqrt{x}$
Domain: $[0, \infty)$
Range: $[2, \infty)$
Example 11:
$y = 2 + \frac{3}{x}$
Domain: $(0, \infty)$
Range: $(2, \infty)$
Example 12:
y= +
Domain: (0, )
Range: ( , )
Definition: and are parameters


Multivariate functions
Definition: several independent variables and one dependent variable
$y = f(x_1, x_2, \ldots, x_n)$
The subscript of x identifies the variable.

Limits and continuity
Definition: Left-hand limit
$\lim_{x \to a^-} f(x)$
exists and is equal to $L^L$ if for any arbitrarily small number $\varepsilon > 0$ there exists a small number $\delta > 0$ such that
$|f(x) - L^L| < \varepsilon$ for $a - \delta < x < a$

Definition: Right-hand limit
$\lim_{x \to a^+} f(x)$
exists and is equal to $L^R$ if for any arbitrarily small number $\varepsilon > 0$ there exists a small number $\delta > 0$ such that
$|f(x) - L^R| < \varepsilon$ for $a < x < a + \delta$

Example 13
$\lim_{x \to 0} m(k + x) = mk$
Example 14
$\lim_{x \to \infty} \frac{k}{mx + h} = 0$ if $m \neq 0$

Continuity
Definition: a function $f(x)$ is continuous at $x = a$ if
$\lim_{x \to a} f(x) = \lim_{x \to a^-} f(x) = \lim_{x \to a^+} f(x)$
and the limit equals the value of the function at that point.
Example 15
Let $f(x) = \frac{1}{x - 8}$
The left-hand and right-hand limit are unequal:
$\lim_{x \to 8^-} \frac{1}{x - 8} = -\infty$ and $\lim_{x \to 8^+} \frac{1}{x - 8} = \infty$
so that the function is discontinuous at x=8.
Example 16
Let $f(x) = \frac{1}{(x - 8)^2}$
$\lim_{x \to 8^-} \frac{1}{(x - 8)^2} = \infty$ and $\lim_{x \to 8^+} \frac{1}{(x - 8)^2} = \infty$
so that the function has the same left-hand and right-hand limit at x=8.
However, the function f(.) is not defined at x=8.
Definition: the function has a vertical asymptote at x=8:
It is not defined at x=8.
The function value approaches plus infinity or minus infinity at that point.

Properties of functions
Definition:
Let $f(x)$ be a function and let $x_B > x_A$.
$f(x)$ is increasing if $f(x_B) \geq f(x_A)$
$f(x)$ is strictly increasing if $f(x_B) > f(x_A)$
$f(x)$ is decreasing if $f(x_B) \leq f(x_A)$
$f(x)$ is strictly decreasing if $f(x_B) < f(x_A)$
Definition: a monotone function is either increasing or decreasing.
Definition: a strictly monotone function is either strictly increasing or strictly decreasing.
Example 17
The function $f(x) = 3(x + 1)^2$ is not a monotone function. However, the function $f(x) = 3(x + 1)$ is a monotone function.
Definition: Strictly monotone functions are one-to-one functions (or injective functions).
If $f(x_A) = f(x_B)$ then $x_A = x_B$
Other formulation: $f(x_A) \neq f(x_B)$ whenever $x_A \neq x_B$
Definition:
Any strictly monotone function has an inverse function. Notation:
$y = f(x)$ has the inverse function $y = f^{-1}(x)$
Example 18
$y = 3(x + 1)$ can be rewritten as $x = \frac{1}{3}(y - 3)$
Thus the function $y = \frac{1}{3}(x - 3)$ is the inverse function of $y = 3(x + 1)$

Definition: Composite function
The argument x of the function $y = f(x)$ is itself a function $x = g(z)$, so that $y = f(g(z))$
Property
$f(f^{-1}(x)) = x$ and $f^{-1}(f(x)) = x$
Example 19
$y = 3(x + 1)$ and $y = \frac{1}{3}(x - 3)$ are inverse functions,
thus $3\left[\frac{1}{3}(x - 3) + 1\right] = x$
and $\frac{1}{3}\left[3(x + 1) - 3\right] = x$
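The inverse-function property can be checked numerically for the pair from Examples 18 and 19. This is a small Python sketch under the assumption that floating-point round-off is acceptable; the names `f` and `f_inv` are chosen here for illustration.

```python
def f(x):
    """The function y = 3(x + 1) from Example 18."""
    return 3 * (x + 1)

def f_inv(x):
    """Its inverse y = (1/3)(x - 3)."""
    return (x - 3) / 3

# Composing the two functions in either order should give back x
# (up to floating-point round-off).
for x in [-2.0, 0.0, 1.5, 10.0]:
    print(x, f(f_inv(x)), f_inv(f(x)))
```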

Extreme values
Definition:
Global maximum: largest value in the range of the function
Global minimum: smallest value in the range of the function
Example 20:
Minimum:
$\min_x \; 5 + (x - 8)^2 = 5$
It says that we minimize the function $5 + (x - 8)^2$ with respect to x. The minimum function value (at the argument x=8) is equal to 5.
The minimum function value is at the argument x=8:
$\arg\min_x \; 5 + (x - 8)^2 = 8$
Maximum:
$\max_x \; 3 - 2(x - 9)^2 = 3$
It says that we maximize the function $3 - 2(x - 9)^2$ with respect to x. The maximum function value (at x=9) is equal to 3.
The argument at which the function has a maximum:
$\arg\max_x \; 3 - 2(x - 9)^2 = 9$
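The difference between the minimum value (min) and the argument at which it is reached (arg min) can be illustrated with a crude grid search. This is only a sketch: the helper `grid_argmin`, the search intervals and the number of grid points are assumptions made here, not part of the course material.

```python
def grid_argmin(func, lo, hi, steps=10_000):
    """Return the grid point in [lo, hi] at which func is smallest."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(xs, key=func)

def f_min(x):
    return 5 + (x - 8) ** 2

def f_max(x):
    return 3 - 2 * (x - 9) ** 2

# arg min and min of 5 + (x - 8)^2
x_min = grid_argmin(f_min, 0.0, 16.0)
print(x_min, f_min(x_min))

# Maximizing f_max is the same as minimizing -f_max.
x_max = grid_argmin(lambda x: -f_max(x), 0.0, 18.0)
print(x_max, f_max(x_max))
```

A grid search only finds the best point on the grid; it works here because the true arguments 8 and 9 happen to be grid points.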

Average rate of change of function
Definition: The average rate of change of the function $y = f(x)$ over the closed interval $[x_A, x_B]$ is
$\frac{\Delta y}{\Delta x} = \frac{f(x_B) - f(x_A)}{x_B - x_A}$
Definition:
Secant line: line between the points $(x_A, y_A)$ and $(x_B, y_B)$, where $y_A = f(x_A)$ and $y_B = f(x_B)$:
$y' - y_A = \frac{f(x_B) - f(x_A)}{x_B - x_A}(x' - x_A)$
For any point $(x', y')$ on this line segment, $x'$ is within $[x_A, x_B]$ and $y'$ lies between $y_A$ and $y_B$.

Concavity and convexity
Definition:
A function is strictly concave in an interval if for any distinct points $x_A$ and $x_B$ in that interval, and for all values $\lambda$ in the open interval (0,1)
$f(\lambda x_A + (1 - \lambda) x_B) > \lambda f(x_A) + (1 - \lambda) f(x_B)$
Definition:
A function is strictly convex in an interval if for any distinct points $x_A$ and $x_B$ in that interval, and for all values $\lambda$ in the open interval (0,1)
$f(\lambda x_A + (1 - \lambda) x_B) < \lambda f(x_A) + (1 - \lambda) f(x_B)$
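The defining inequality of strict concavity can be spot-checked on a handful of points. The Python sketch below is only a numerical illustration, not a proof: the function name `strictly_concave_on_samples`, the sample points and the weights are assumptions chosen here.

```python
def strictly_concave_on_samples(func, points, weights):
    """Check f(w*xa + (1-w)*xb) > w*f(xa) + (1-w)*f(xb) on all sampled pairs."""
    for xa in points:
        for xb in points:
            if xa == xb:
                continue  # the definition only concerns distinct points
            for w in weights:
                lhs = func(w * xa + (1 - w) * xb)
                rhs = w * func(xa) + (1 - w) * func(xb)
                if not lhs > rhs:
                    return False
    return True

points = [-3.0, -1.0, 0.0, 2.0, 5.0]
weights = [0.1, 0.5, 0.9]

print(strictly_concave_on_samples(lambda x: -x ** 2, points, weights))  # concave parabola
print(strictly_concave_on_samples(lambda x: x ** 2, points, weights))   # convex parabola
```

Passing the check on a sample does not establish concavity on the whole interval, but a single failing pair is enough to rule it out.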

Necessary and sufficient conditions: some logic
Definition:
Whenever P is true, Q is necessarily true.
Q is a necessary condition for P;
P is a sufficient condition for Q:
$P \Rightarrow Q$
Read: if P then Q,
or Q is the consequence of P
Example 21:
X is a square $\Rightarrow$ X is a rectangle
A sufficient condition for X to be a rectangle is that X be a square.
or
A necessary condition for X to be a square is that X be a rectangle.
Example 22:
Person is healthy $\Rightarrow$ Person breathes without difficulty
Wrong implications (reverse implication of above):
A person who breathes without difficulty is necessarily healthy.
and
Breathing without difficulty is a sufficient condition for a person to be healthy.
Example 23:
$x > 5 \Rightarrow x^2 > 25$
$x < -5 \Rightarrow x^2 > 25$
Example 24:
$xy = 0 \Rightarrow x = 0$ or $y = 0$
Example 25:
$x = 0$ or $y = 0 \Rightarrow xy = 0$

Definition:
P is a necessary and sufficient condition for Q:
$P \Rightarrow Q$ and $Q \Rightarrow P$
Read: P if and only if Q
$P \Leftrightarrow Q$
Example 26:
$x = 0$ or $y = 0 \Leftrightarrow xy = 0$
Example 27:
$x < -5$ or $x > 5 \Leftrightarrow x^2 > 25$

More on logic: Structure of a mathematical proof
Method 1 (direct proof):
$P \Rightarrow Q$
P: set of propositions. Also referred to as premise (what we know)
Q: set of propositions. Also referred to as conclusion (what we want to know)
Method 2 (indirect proof):
Alternative structure of a proof:
$P \Rightarrow Q$ is equivalent to not Q $\Rightarrow$ not P
$P \Rightarrow Q$ is equivalent to $\neg Q \Rightarrow \neg P$
Example 28:
Direct structure: If it is raining, the grass is getting wet.
Indirect structure: If the grass is not getting wet, then it is not raining.

A menu of functions: power function
Definition: power function
$y = f(x) = k x^p$
for which p is referred to as the exponent of the function
Rules of exponents:
$x^0 = 1$
$x^1 = x$
$x^{-p} = \frac{1}{x^p}$
$x^{m/n} = \sqrt[n]{x^m}$
$x^{a+b} = x^a x^b$
$x^{a-b} = \frac{x^a}{x^b}$
$(x^a)^b = x^{ab}$
$x^a y^a = (xy)^a$
$\frac{x^a}{y^a} = \left(\frac{x}{y}\right)^a \quad (y \neq 0)$
Definition: polynomial function
$y = f(x) = a_0 + a_1 x + a_2 x^2 + \ldots + a_n x^n$
Degree of the polynomial function: highest exponent of the function (= n)

A menu of functions: exponential function
Definition: exponential function
$y = f(x) = k b^x$
b: base of the function
Example 29:
For a positive base b:
$\lim_{x \to \infty} b^x = 0$ if $0 < b < 1$
and $\lim_{x \to \infty} b^x = 1$ if $b = 1$
and $\lim_{x \to \infty} b^x = \infty$ if $b > 1$
Example 30:
$\lim_{x \to -\infty} b^x = 0$ if $b > 1$
Which is equivalent to
$\lim_{x \to \infty} \frac{1}{b^x} = 0$ if $b > 1$

Summations
Definition:
The sum from i = 1 to i = 5 of $x_i$:
$\sum_{i=1}^{5} x_i = x_1 + x_2 + x_3 + x_4 + x_5$
i: summation index (an integer).

This summation is the same as
$\sum_{j=1}^{5} x_j = x_1 + x_2 + x_3 + x_4 + x_5$
Example 31:
$\sum_{i=1}^{6} i^2 = 1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2 = 91$
$\sum_{j=0}^{2} \frac{1}{(j+1)(j+3)} = \frac{1}{3} + \frac{1}{8} + \frac{1}{15} = \frac{21}{40}$
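The two sums of Example 31 can be reproduced with Python's built-in `sum`. The sketch below uses `fractions.Fraction` so that the second sum is computed exactly rather than in floating point; the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Sum of i^2 for i = 1, ..., 6 (range excludes its upper bound, hence 7).
sum_of_squares = sum(i ** 2 for i in range(1, 7))

# Sum of 1 / ((j + 1)(j + 3)) for j = 0, 1, 2, kept as an exact fraction.
sum_of_fractions = sum(Fraction(1, (j + 1) * (j + 3)) for j in range(0, 3))

print(sum_of_squares)    # 91
print(sum_of_fractions)  # 21/40
```

Note that the summation index (i or j) only exists inside the generator expression, which mirrors the remark that the name of the index does not matter.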

Summations: properties
Additive property:
$\sum_{i=1}^{n} (a_i + b_i) = \sum_{i=1}^{n} a_i + \sum_{i=1}^{n} b_i$
Homogeneity property:
$\sum_{i=1}^{n} c\,a_i = c \sum_{i=1}^{n} a_i$
So that:
$\sum_{i=1}^{n} c = nc$

Double summations
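Property:
$\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} a_{ij} \right) = \sum_{j=1}^{n} a_{1j} + \sum_{j=1}^{n} a_{2j} + \ldots + \sum_{j=1}^{n} a_{mj}$
or
$\sum_{j=1}^{n} \sum_{i=1}^{m} a_{ij} = \sum_{j=1}^{n} \left( \sum_{i=1}^{m} a_{ij} \right) = \sum_{i=1}^{m} a_{i1} + \sum_{i=1}^{m} a_{i2} + \ldots + \sum_{i=1}^{m} a_{in}$
thus
$\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} = \sum_{j=1}^{n} \sum_{i=1}^{m} a_{ij}$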
Calculating growth (1)
Growth from period t to period t+1: $X_{t+1} = (1 + r) X_t$
Growth from period t to period t+n: $X_{t+n} = (1 + r)^n X_t$
Subscript: discrete time period
A variable X grows at the rate r compounded k times during a period:
$X_{t+1} = \left(1 + \frac{r}{k}\right)^k X_t$
Definition:
Exponential e (irrational number)
$e = \lim_{k \to \infty} \left(1 + \frac{1}{k}\right)^k = 2.71828182845...$
Example 32:
With an annual interest rate of 100 percent, the value of $100 after one year of continuously compounded interest is
$\lim_{k \to \infty} \left(1 + \frac{1}{k}\right)^k \cdot 100 = 271.83$ (dollars)
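How quickly multiple compounding approaches continuous compounding can be seen by increasing k in the formula of Example 32. The Python sketch below is illustrative only; the function name `compound` and the chosen values of k are assumptions made here.

```python
import math

def compound(x_t, r, k):
    """Value after one period at rate r, compounded k times during the period."""
    return x_t * (1 + r / k) ** k

# 100 dollars at 100 percent interest, compounded more and more often.
for k in [1, 2, 12, 365, 10 ** 6]:
    print(k, compound(100.0, 1.0, k))

# The limiting value under continuous compounding is 100 * e.
print(100.0 * math.e)
```

With k = 1 the value simply doubles; as k grows the values increase towards 100e, which is about 271.83.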

Calculating growth (2)
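Consequence:
$\left(1 + \frac{r}{k}\right)^k = \left(1 + \frac{1}{k/r}\right)^k = \left[\left(1 + \frac{1}{k/r}\right)^{k/r}\right]^r$
$\lim_{k \to \infty} \left(1 + \frac{r}{k}\right)^k = \left[\lim_{k/r \to \infty} \left(1 + \frac{1}{k/r}\right)^{k/r}\right]^r = \left[\lim_{m \to \infty} \left(1 + \frac{1}{m}\right)^m\right]^r = e^r = (2.71828...)^r$
Thus:
$X(t + 1) = e^r X(t)$
t is denoted in parentheses: moments measured in continuous time.
$X(t + n) = e^{rn} X(t)$
n: any real number
Present value: see book and tutorial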

Logarithmic functions
Definition:
$y = b^x$
has as its inverse the function in logarithmic form: $y = \log_b(x)$
Example 33
$b^{\log_b(x)} = x$
Rule 1:
$\log_b(xy) = \log_b(x) + \log_b(y)$
Rule 2:
$\log_b(x / y) = \log_b(x) - \log_b(y)$
Rule 3:
$\log_b(x^l) = l \log_b(x)$
Property:
$\log_W H = \log_W J \cdot \log_J H$
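The rules of logarithms can be verified numerically for particular values. In the Python sketch below, the helper `log_base` and the test values are assumptions chosen for illustration; the comparison uses a small tolerance because floating-point logarithms are not exact.

```python
import math

def log_base(b, x):
    """Logarithm of x to the base b, computed from natural logarithms."""
    return math.log(x) / math.log(b)

b, x, y, l = 2.0, 8.0, 4.0, 3.0

def close(p, q):
    return abs(p - q) < 1e-9

print(close(b ** log_base(b, x), x))                                   # Example 33
print(close(log_base(b, x * y), log_base(b, x) + log_base(b, y)))      # Rule 1
print(close(log_base(b, x / y), log_base(b, x) - log_base(b, y)))      # Rule 2
print(close(log_base(b, x ** l), l * log_base(b, x)))                  # Rule 3

# Change of base: log_W H = log_W J * log_J H, here with W = 2, J = 10, H = 50.
print(close(log_base(2, 50), log_base(2, 10) * log_base(10, 50)))
```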

Natural Logarithms
Rule 1: $\ln(e^z) = z$
Rule 2: $e^{\ln x} = x$
Rule 3: $\ln(xy) = \ln(x) + \ln(y)$
Rule 4: $\ln(x / y) = \ln(x) - \ln(y)$
Rule 5: $\ln(x^z) = z \ln(x)$
ADVANCED MATHEMATICS TUTORIALS WEEK 1

ADVANCED MATHEMATICS TUTORIAL WEEK 1
Advanced Mathematics Tutorials week 1 - Solutions week 1
Exercises with an asterisk (*) are meant to deepen the knowledge of the material of that
week. Hence, these exercises are less likely to be asked on the exam.
Technical tutorial (Wednesday)
Exercise 1.
Consider the following graph of a function f.
Graphically assess the domain and the range of f, its limits for $x \to 0$, $x \to A$, $x \to B$, $x \to C$ and its continuity at these points. By graphically assessing, we mean that the exact function values do not matter, but the procedure followed should be clear.
Solution:
The domain is the set on which the function is defined. f appears to be defined everywhere, except on the interval [-1,-0.5] and at x=0, where f has no values. So its domain is $\mathbb{R} \setminus ([-1, -0.5] \cup \{0\})$ (this means the set $\mathbb{R}$, which denotes the real numbers, with the interval [-1,-0.5] and the point x=0 cut out).
The range is the set of outcomes of the function. In this case all numbers from $\mathbb{R}$ seem to be reached except for the small interval at about $[-2, -1]$. So the range of the function is $\mathbb{R} \setminus [-2, -1]$. Note that we can't tell from the picture whether this is equal to the co-domain of f, i.e., if $f : X \to Y$, whether $Y = \text{range}(f)$. For instance, Y here could be the whole of $\mathbb{R}$, or it could be $\mathbb{R} \setminus [-2, -1]$. In the latter case, the function would be called onto, or surjective. A function which is onto or surjective has range equal to its co-domain.

For the limits and continuity we start with point C. Recall that a function is continuous if you can draw it without lifting your pencil from the paper. Clearly around point C this is possible, so f is continuous at C. Continuity at a point means that the function value at that point is equal to its limit, so $\lim_{x \to C} f(x) = f(C) \approx 0.9$, where the last approximate equality is our guess looking at the graph.
We turn to point B. The function has a jump here, so it is not continuous at B. Furthermore, from the left the function tends to a different value than from the right (from the right it goes to, say, 2, whereas from the left it appears to go to something like 1.2). Therefore a limit at B is not defined.
At point A the function also has a jump, so it is again discontinuous. However, from the left and from the right it tends towards the same value, so it does have a limit: $\lim_{x \to A} f(x) \approx 1$. Do note that $\lim_{x \to A} f(x) \neq f(A)$.
Finally, at point 0 something strange happens. The function is not defined at this point. However, it does have a left-hand and a right-hand limit, and these are infinite with opposite signs ($-\infty$ on one side, $+\infty$ on the other). These values are different, so the limit at 0 is not defined. Even if it were, we could not decide whether f is continuous at 0, because the condition $\lim_{x \to 0} f(x) = f(0)$ is not well-defined: f(0) does not exist.

Hopefully this rather convoluted example displays all the conceptual pitfalls of limits and continuity.
Exercise 2.
Let A and B be two sets and $A \subset B$. We wonder if $x \in B$. Would $x \in A$ be a sufficient condition for that? And a necessary condition? And if we knew $x \in B$ and wondered whether $x \in A$?
Solution:
If $x \in A$, then certainly $x \in B$, so it is sufficient. However, it is not necessary, because x could be in B without being in A.
If $x \in B$, then it might still be that $x \notin A$, so it is not a sufficient condition. However, it is a necessary condition, for if $x \notin B$, then certainly $x \notin A$.
Exercise 3.
Exercise 2.2.2. from Klein.
Which of the following functions are one-to-one:
a) A function relating countries to their citizens
b) A function relating street addresses to zip codes
c) A function relating library call numbers to books.
d) A function relating a student's identification number to a course grade in a specific
class.
Solution:
a) Not, because Pierre and Jacques are both from France, but not the same person.
b) Not
c) Don't know, I don't know what a library call number is. If there is only one call number per
book (and one book per library call number), then it will be one-to-one.
d) Not, if two different students obtain the same grade.
31

Exercise 4.
Suppose you are a farmer who stores grain in a leaky silo so that you lose 2% of your crop each year as a result of dampness and rot. To obtain the future value of X, you still use the same formula $X_{t+n} = X_t (1 + r)^n$.
If r = -2%, what is the value of $X_{t+n}$ as n approaches infinity? That is, what is the value of $\lim_{n \to \infty} X_{t+n} = \lim_{n \to \infty} X_t (1 + r)^n$? Will the value of $X_{t+n}$ ever equal zero?
Solution:
$\lim_{n \to \infty} X_t (0.98)^n = 0$; however, 0 is never actually reached.
Exercise 5.
Assume a firm's net profits are $50 million in 2000 and are expected to grow at a steady rate of 6% per year through the end of the next decade. How much would you expect the firm to earn in 2001? In 2003? Now assume that the firm's profits have been growing at 6% since 1997. If a negative value of n can be interpreted as the number of time periods before t, how much did the company earn in 1998? Graph the path of income growth between 1998 and 2003 and explain why the curve gets steeper over time.
Solution:
In 2001: 1.06^1 x 50 = 53
In 2003: 1.06^3 x 50 = 59.5508...
In 1998: 1.06^(-2) x 50 = 44.4998...
Graph:
This actually looks very much like a straight line, so for the sake of clarity, we also draw a graph up to 2030:
The line gets steeper, because at each point the yearly growth is 6% of profits, and profits are always increasing. Therefore the absolute growth per year is always increasing.
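The numbers in this solution, and the reason why the curve gets steeper, can be reproduced with a few lines of Python. This is a sketch; the function name `profit` and its default arguments are chosen here for illustration.

```python
def profit(year, base_year=2000, base=50.0, growth=0.06):
    """Profit in millions of dollars, growing at a constant rate from base_year."""
    return base * (1 + growth) ** (year - base_year)

for year in [1998, 2001, 2003]:
    print(year, round(profit(year), 4))

# Absolute yearly increase: 6% of an ever larger amount, so it keeps rising.
increases = [profit(y + 1) - profit(y) for y in range(1998, 2003)]
print(increases)
```

A negative exponent (as for 1998) simply discounts backwards in time.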
Exercise 6.
Exercises 3.2.1. and 3.2.5. from Klein.
3.2.1. Using the formula for multiple compounding in one period, $X_{t+1} = X_t \left(1 + \frac{r}{k}\right)^k$, calculate the value of $X_{t+1}$ for the following values of r and k. Assume that $X_t = 20$.
a) r = 8%, k = 4
b) r = 0.5%, k = 2
c) r = 10%, k = 365
3.2.5. Assuming that X(t) = 75, determine the value of X(t + n) for the following values of r and n using the continuous compounding formula:
$X(t + n) = X(t) e^{rn}$
a) n = 3, r = 9%
b) n = 0.5, r = 2.5%
c) n = -2, r = 11%
d) n = 0.25, r = 6%
e) n = 3, r = 9%
f) n = 0.75, r = -3%
Solution:
We do one from each; the point should be clear:
3.2.1. a) $X_{t+1} = X_t \left(1 + \frac{r}{k}\right)^k = 20 \cdot (1.02)^4 = 21.649$
3.2.5. a) $X(t + n) = X(t) e^{rn} = 75 \cdot e^{0.27} = 98.2473$
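The two worked answers can be checked, and the remaining items computed, with the following Python sketch. The function names are hypothetical and chosen here for illustration; rates are entered as decimals (8% becomes 0.08).

```python
import math

def multiple_compounding(x_t, r, k):
    """X_{t+1} when the rate r is compounded k times within one period."""
    return x_t * (1 + r / k) ** k

def continuous_compounding(x_t, r, n):
    """X(t + n) under continuous compounding at rate r over n periods."""
    return x_t * math.exp(r * n)

# Exercise 3.2.1 a): r = 8%, k = 4, X_t = 20
print(round(multiple_compounding(20.0, 0.08, 4), 3))

# Exercise 3.2.5 a): n = 3, r = 9%, X(t) = 75
print(round(continuous_compounding(75.0, 0.09, 3), 4))
```

The other items follow by changing the arguments, for example `continuous_compounding(75.0, 0.11, -2)` for item c) of 3.2.5.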
Exercise 7.
Exercise 3.3.1 from Klein.
a) $10^{\log_{10}(100)}$
b) $\ln e^x - e^{\ln(x)}$
c) $\log_{10}\left(\frac{1}{x^5}\right)$
d) $\log_2(a + b)$
e) $\ln(e^{a + bx + cz})$
f) $\ln(4x)^3$
1
e) ln 5 [ x y ]2
e
Solution:
a) $10^{\log_{10}(100)} = 100$
b) $\ln(e^x) - e^{\ln(x)} = x - x = 0$
c) $\log_{10}\left(\frac{1}{x^5}\right) = -5 \log_{10}(x)$
d) Cannot be simplified further.
e) $\ln(e^{a + bx + cz}) = a + bx + cz$
f) $\ln(4x)^3$ Unclear, has two interpretations:
Either:
$\ln((4x)^3) = 3\ln(4x) = 3(2\ln(2) + \ln(x))$
Or:
$(\ln(4x))^3$, which cannot be simplified further.
1
ln( 5 [ x y ]2 ) =ln(e 5 ) + ln([ x y ]2 ) =5 + 2(ln( x ) + ln( y )) =
g)
e
5 + 2( ln( x) ln( y ))
* Exercise 8.

The theory of consumer behaviour is one of the foundations of economic analysis. The linear logarithmic utility function is one of the original functions developed to measure consumer utility and is still widely used by economists. It is written as
$u = \ln U = \sum_{i=1}^{n} \alpha_i \ln q_i$
where u is the index of utility, $q_i$ is the quantity of good i, and $0 < \alpha_i < 1$. Transform the function back to its original form, where U is utility.
Solution:
$U = e^{\log(U)} = e^{\sum_{i=1}^{n} \alpha_i \log(q_i)} = \prod_{i=1}^{n} e^{\alpha_i \log(q_i)} = \prod_{i=1}^{n} \left(e^{\log(q_i)}\right)^{\alpha_i} = \prod_{i=1}^{n} q_i^{\alpha_i}$

Advanced Mathematics Week 1 - Broad tutorial (Friday of week 1)
* Exercise 1.
Evaluate the following limit directly from the definition:

$\lim_{x \to 4} f(x)$
where $f(x) = x^2$. Would your answer change if
$f(x) = x^2$ if $x \neq 4$ and $f(x) = 0$ if $x = 4$?
Solution:
Since f(.) is clearly continuous, we want to find that $\lim_{x \to 4} f(x) = 4^2 = 16$. Recall the formal definition of a limit. It should hold simultaneously that $\lim_{x \to 4^+} f(x) = 4^2 = 16$ and $\lim_{x \to 4^-} f(x) = 4^2 = 16$. We consider the left-hand limit; the right-hand limit is similar. The idea is now that for any number close to 16 we can find numbers close to, but smaller than 4, such that their square is even closer to 16. That is, given a number $\varepsilon > 0$, we have to find a $\delta > 0$ such that for numbers x such that $4 - \delta < x < 4$, we find $|x^2 - 16| < \varepsilon$. This means that we have to find $\delta$ as a function of $\varepsilon$.
If we can make this work for $\varepsilon < 1$, we can obviously also make it work for larger $\varepsilon$, so we restrict our attention to that case.
Let's try $\delta = \frac{1}{8}\varepsilon$. Then, because $4 - \delta < x < 4$, we know that
$|x^2 - 16| < |(4 - \delta)^2 - 16| = |16 - 8\delta + \delta^2 - 16| = |-8\delta + \delta^2| < 8\delta = \varepsilon$    (1)
The first inequality follows by the monotonicity of f ($|x^2 - 16|$ is at its largest near $4 - \delta$), the second inequality from the fact that $\delta < \varepsilon < 1$.
Now, comparing the left-hand side and the right-hand side of (1), we see that we have the definition of the limit, so we are done.
Note that we have nowhere looked at f(4), since x<4. Therefore, our result will not change if the function has a different value at 4.
Finally, a picture to elucidate $\varepsilon$ and $\delta$.
Exercise 2.
Show from the definitions that for $f(x) = -x^2$
a) f(.) is a concave function,
b) f(.) is a homogeneous function (of a degree to be determined by you).
Solution:
2a. Recall the definition of concavity: f(.) is concave if and only if, for $0 < \lambda < 1$,
$\lambda f(x_A) + (1 - \lambda) f(x_B) \leq f(\lambda x_A + (1 - \lambda) x_B)$
or
$\lambda f(x_A) + (1 - \lambda) f(x_B) - f(\lambda x_A + (1 - \lambda) x_B) \leq 0$    (1)
Given our function f(.), we know that:
$\lambda f(x_A) = -\lambda x_A^2$    (2a)
$(1 - \lambda) f(x_B) = -(1 - \lambda) x_B^2$    (2b)
$f(\lambda x_A + (1 - \lambda) x_B) = -(\lambda x_A + (1 - \lambda) x_B)^2$    (2c)
{so that $-f(\lambda x_A + (1 - \lambda) x_B) = +(\lambda x_A + (1 - \lambda) x_B)^2$}
Writing out equation (1), by substituting (2a), (2b) and (2c) in this equation, we get:
$-\lambda x_A^2 - (1 - \lambda) x_B^2 + (\lambda x_A + (1 - \lambda) x_B)^2 = -\lambda x_A^2 - (1 - \lambda) x_B^2 + \lambda^2 x_A^2 + 2\lambda(1 - \lambda) x_A x_B + (1 - \lambda)^2 x_B^2$    (3)
Next, we rearrange equation (3) in terms of $x_A^2$, $x_B^2$ and $x_A x_B$:
$-\lambda(1 - \lambda) x_A^2 - \lambda(1 - \lambda) x_B^2 + 2\lambda(1 - \lambda) x_A x_B = -\lambda(1 - \lambda)(x_A^2 + x_B^2 - 2 x_A x_B) = -\lambda(1 - \lambda)(x_A - x_B)^2 < 0$
The last term consists of a minus sign, two positive factors {$\lambda > 0$ and $1 - \lambda > 0$}, and a square ($(x_A - x_B)^2 > 0$ for $x_A \neq x_B$), so that the last term is negative,
which proves our result.
2b. Recall the definition of homogeneity:
The function f(.) is homogeneous of degree k if, for $\lambda > 0$, $f(\lambda x) = \lambda^k f(x)$
We write out the definition:
$f(\lambda x) = -(\lambda x)^2 = -\lambda^2 x^2 = \lambda^2 f(x)$
So we find that f(.) is homogeneous of degree 2.
* Exercise 3.
Suppose f(x) and g(x) are both monotonous functions on the same domain, both in the same direction. Show that
a. f(.) + g(.) is also monotonous.
b. And if f(x) and g(x) are both concave, can you show that f(.) + g(.) is concave?
c. And if f(x) and g(x) are both homogeneous, of degrees l and m respectively, can you then show that $f(x) \cdot g(x)$ is homogeneous, and of what degree?
Solution:
3a. Monotonicity:
We take both functions to be strictly increasing (the decreasing case is analogous; if one function increases and the other decreases, the sum need not be monotonous).
We know that, if $x_A < x_B$, then $f(x_A) < f(x_B)$ and $g(x_A) < g(x_B)$, for any $x_A, x_B \in Dom(f) \cap Dom(g)$
(meaning that $x_A$ and $x_B$ are both in the domain of both f(.) and g(.)).
Now we have to prove that, if we let $h(x) = f(x) + g(x)$, then $h(x_A) < h(x_B)$.
Thus:
$h(x_A) = f(x_A) + g(x_A) < f(x_B) + g(x_B) = h(x_B)$
This proves what we want.

3b. Concavity: We know that for any $0 < \lambda < 1$ and $x_A, x_B \in Dom(f) \cap Dom(g)$,
$\lambda f(x_A) + (1 - \lambda) f(x_B) \leq f(\lambda x_A + (1 - \lambda) x_B)$
and
$\lambda g(x_A) + (1 - \lambda) g(x_B) \leq g(\lambda x_A + (1 - \lambda) x_B)$
Now we want to show that, if we define again $h(x) = f(x) + g(x)$, then
$\lambda h(x_A) + (1 - \lambda) h(x_B) \leq h(\lambda x_A + (1 - \lambda) x_B)$.    (1)
Thus we start with the left-hand side of equation (1) and we rewrite it in terms of the functions f(.) and g(.):
$\lambda h(x_A) + (1 - \lambda) h(x_B) = \lambda (f(x_A) + g(x_A)) + (1 - \lambda)(f(x_B) + g(x_B))$
$= \lambda f(x_A) + (1 - \lambda) f(x_B) + \lambda g(x_A) + (1 - \lambda) g(x_B) \leq f(\lambda x_A + (1 - \lambda) x_B) + g(\lambda x_A + (1 - \lambda) x_B)$
$= h(\lambda x_A + (1 - \lambda) x_B)$
That proves what we want.
3c. Homogeneity:
We know that $f(\lambda x) = \lambda^l f(x)$ and $g(\lambda x) = \lambda^m g(x)$.
Now, if we define $k(x) = f(x) \cdot g(x)$, we want $k(\lambda x) = \lambda^p k(x)$ for some p still to be determined.
$k(\lambda x) = f(\lambda x) \cdot g(\lambda x) = \lambda^l f(x) \cdot \lambda^m g(x) = \lambda^l \lambda^m f(x) \cdot g(x) = \lambda^{l+m} k(x)$.
So we see that the function k(.) is homogeneous of degree l+m.
Exercise 4.
Write out the following:
a) $\sum_{i=2}^{4} \sum_{j=1}^{4} i \cdot 2^i$
b) $\sum_{i=1}^{3} \sum_{j=1}^{2} (i + j)^2$
Solution:
a)
$\sum_{i=2}^{4} \sum_{j=1}^{4} i \cdot 2^i = \sum_{i=2}^{4} (i \cdot 2^i + i \cdot 2^i + i \cdot 2^i + i \cdot 2^i) = \sum_{i=2}^{4} 4(i \cdot 2^i) = 4(2 \cdot 2^2 + 3 \cdot 2^3 + 4 \cdot 2^4) = 384$
We could also have shown:
$\sum_{j=1}^{4} \sum_{i=2}^{4} i \cdot 2^i = \sum_{j=1}^{4} (2 \cdot 2^2 + 3 \cdot 2^3 + 4 \cdot 2^4) = \sum_{j=1}^{4} 96 = 4 \cdot 96 = 384$
Thus both outcomes are equal:
$\sum_{i=2}^{4} \sum_{j=1}^{4} i \cdot 2^i = \sum_{j=1}^{4} \sum_{i=2}^{4} i \cdot 2^i$
b)
$\sum_{i=1}^{3} \sum_{j=1}^{2} (i + j)^2 = \sum_{i=1}^{3} \left((i + 1)^2 + (i + 2)^2\right)$
$= \left((1 + 1)^2 + (1 + 2)^2\right) + \left((2 + 1)^2 + (2 + 2)^2\right) + \left((3 + 1)^2 + (3 + 2)^2\right) = 4 + 9 + 9 + 16 + 16 + 25 = 79$
Again, one can show that
$\sum_{i=1}^{3} \sum_{j=1}^{2} (i + j)^2 = \sum_{j=1}^{2} \sum_{i=1}^{3} (i + j)^2$
Note that the indices always drop out once you have evaluated the sums. They are only useful within the sum and for that reason are sometimes called dummies.
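Both double sums of this exercise can be evaluated directly, in either order of summation, with nested generator expressions. The Python sketch below is an illustration; the variable names are chosen here.

```python
# a) sum over i = 2..4 and j = 1..4 of i * 2^i (the summand does not depend on j)
a_i_outer = sum(i * 2 ** i for i in range(2, 5) for j in range(1, 5))
a_j_outer = sum(i * 2 ** i for j in range(1, 5) for i in range(2, 5))

# b) sum over i = 1..3 and j = 1..2 of (i + j)^2
b_i_outer = sum((i + j) ** 2 for i in range(1, 4) for j in range(1, 3))
b_j_outer = sum((i + j) ** 2 for j in range(1, 3) for i in range(1, 4))

print(a_i_outer, a_j_outer)  # 384 384
print(b_i_outer, b_j_outer)  # 79 79
```

The six terms of b) are 4, 9, 9, 16, 16 and 25, which add up to 79.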
Exercise 5.
Let k be some constant and f(.) some function. Show, or at least make clear, that
$\sum_{i=1}^{n} k f(i) = k \sum_{i=1}^{n} f(i)$ and $\sum_{i=1}^{n} k = nk$.
Solution:
$\sum_{i=1}^{n} k f(i) = k f(1) + k f(2) + \ldots + k f(n) = k (f(1) + f(2) + \ldots + f(n)) = k \sum_{i=1}^{n} f(i)$
$\sum_{i=1}^{n} k = \underbrace{(k + k + \ldots + k)}_{n \text{ times}} = nk$
ADVANCED MATHEMATICS TAKE HOME WEEK 1

ADVANCED MATHEMATICS TAKE HOME ASSIGNMENT
MATERIAL OF WEEK 1
Advanced mathematics
Solutions take home assignments of week 2 (on material of week 1)
* Exercise 1.
The following are all graphs of functions $f : X \to Y$ with $Y = \mathbb{R}$. Determine whether they are one-to-one (i.e. injective) and whether they are onto (i.e. surjective).
a)
Solution:
We check for injectivity. Clearly the function does not take the same value twice, so the function is injective.
We check for surjectivity. The range of the function is only about $[3, \infty)$, while it is given that $Y = \mathbb{R}$. So the function is not surjective.
b)
Solution:
We check for injectivity. We see that, for instance, both at x=2 and at x=-2 f(x)=1, so the function is not injective.
We check for surjectivity. The range of the function is only about $[3, \infty)$, while it is given that $Y = \mathbb{R}$. So the function is not surjective.

c)
Solution:
We check for injectivity. We see that the function does not take the same value twice, so it is injective.
We check for surjectivity. We see that the range of the function is $\mathbb{R} = Y$, so it is surjective.
d)
Solution:
We check for injectivity. We see that for instance at x=0 and at x=40 the function takes the value f(x)=0, so it is not injective.
We check for surjectivity. We see that the range of the function is $\mathbb{R} = Y$, so it is surjective.

Exercise 2.
Write out:
$\sum_{i=1}^{2} \sum_{j=1}^{4} (2i + 3j)$
Solution:
$\sum_{i=1}^{2} \sum_{j=1}^{4} (2i + 3j) = \sum_{j=1}^{4} \left((2 + 3j) + (4 + 3j)\right) = \sum_{j=1}^{4} (6 + 6j) = (6 + 6) + (6 + 12) + (6 + 18) + (6 + 24) = 14 \cdot 6 = 84$
Or the other way around:
$\sum_{i=1}^{2} \sum_{j=1}^{4} (2i + 3j) = \sum_{i=1}^{2} \left((2i + 3) + (2i + 6) + (2i + 9) + (2i + 12)\right) = \sum_{i=1}^{2} (8i + 30) = (8 + 30) + (16 + 30) = 84$
Of course, not all of these brackets are necessary; they are mostly there to show what comes from what.
* Exercise 3.
Determine whether the following function is homogeneous. If it is, determine the degree.
$f(x) = h(x^3)$, where $h(x)$ is homogeneous of degree 7. (Hint: if you find this confusing, first try it with $h(x) = x^7$, which is a homogeneous function of degree 7.)
Solution:
First the general problem:
We check if $f(t \cdot x) = t^m f(x)$ for some m.
$f(t \cdot x) = h((t \cdot x)^3) = h(t^3 \cdot x^3)$

Now we know that h is homogeneous of degree 7, i.e. $h(r \cdot y) = r^7 h(y)$. Take $r = t^3$ and $y = x^3$ to find:
$h(t^3 \cdot x^3) = (t^3)^7 h(x^3) = t^{21} h(x^3) = t^{21} f(x)$
So f is homogeneous of degree 21.
The hint is solved similarly, but now we take $h(x) = x^7$:
$f(t \cdot x) = h((t \cdot x)^3) = (t^3 \cdot x^3)^7 = t^{21} (x^3)^7 = t^{21} h(x^3) = t^{21} f(x)$

Which is not really easier, I suppose.
* Exercise 4.
Of course you all know intuitively what the derivative of a function f(x) is: it is the very small change that occurs in f(x) when you very slightly change x. The picture illustrates this. The blue line is the graph of the function f(x). If you take $\Delta x$ ever smaller, you will approach ever more closely the slope of the red line of the derivative.
For this approaching ever more closely, we naturally think of the limit (in fact, it was in the context of derivatives that the notion of limit was first developed).
We define: $\frac{d}{dx} f(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$. (Note that $f(x + \Delta x) - f(x) = \Delta f(x)$.)
Now you must prove that for $f(x) = x^2$ it indeed holds that $\frac{d}{dx} f(x) = 2x$ by writing out the limit. Do this in three steps: first, before evaluating the limit, observe that $f(x + \Delta x) - f(x) = 2(x \cdot \Delta x) + (\Delta x)^2$. Then, still before evaluating the limit, show that $\frac{f(x + \Delta x) - f(x)}{\Delta x}$ simplifies to $2x + \Delta x$. Then evaluate the limit $\lim_{\Delta x \to 0} (2x + \Delta x)$ directly from the definition (either doing it from the left or the right hand side is enough).
If you succeed, you have proved a rule that you've already known for a long time. Isn't that fun!
Solution:
So, let the fun begin:
We first write out the definition. To emphasize that $\Delta x$ is a single number and not a multiplication of $\Delta$ and x, I will now define $\Delta x = h$. (So this is just giving it a new name).
$\frac{d}{dx} x^2 = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h}$
Now, before we touch the limit, we just apply algebra to what is inside the limit. This is allowed, because we're basically not changing the expression over which we take the limit.
$\lim_{h \to 0} \frac{(x + h)^2 - x^2}{h} = \lim_{h \to 0} \frac{x^2 + 2xh + h^2 - x^2}{h} = \lim_{h \to 0} \frac{2xh + h^2}{h} = \lim_{h \to 0} (2x + h)$
Strictly speaking, for our last step, we should observe that $h \neq 0$, because otherwise it would not be allowed to divide by h. However, if you recall the definition of a limit, you will see that h never actually takes the value to which it goes, i.e. 0 in this case. Therefore our last step is valid and is obtained by dividing both the denominator and the numerator by h. For this last expression we can now apply the definition of a limit.
How does it work again? In general, the idea is that, if
$c = \lim_{x \to a} f(x)$
then f(x) will get ever closer to c, as x gets closer to a. This was formalised thus: if you say how close to c you want to get, then I should be able to give a distance from a so that you will indeed get that close or closer to c. You saying how close you want to get is setting an $\varepsilon$, me providing you with this distance is picking the $\delta$.
Let's apply this to our case. The function over which we are taking a limit is 2x+h, where x is now just some given number. Intuitively, we would expect that as h goes to zero, this function will just go to 2x. So let's make that our guess for the limit.
Now, since this function is very simple, we can do the right hand limit and the left hand limit at the same time.
You provide me with $\varepsilon > 0$ and I decide to pick $\delta = \frac{1}{2}\varepsilon$ (Why? It turns out that it works. This is basically backward engineering.) Then if I only look at h whose distance to 0 is less than $\delta$, i.e. $|0 - h| = |h| < \delta$, I hope to find that my distance to 2x is smaller than $\varepsilon$, i.e. $|f(h) - 2x| < \varepsilon$.
Let's see:
$|f(h) - 2x| = |2x + h - 2x| = |h| < \delta = \frac{1}{2}\varepsilon < \varepsilon$
Comparing the left hand side and the right hand side, we see that we indeed are close enough, so the limit is as we specified. This proves that $\frac{d}{dx} f(x) = 2x$. Yay!
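The algebraic result that the difference quotient of $x^2$ equals $2x + h$ can be observed numerically by letting h shrink. The Python sketch below is an illustration only; the function name `difference_quotient` and the chosen point x = 3 are assumptions made here.

```python
def difference_quotient(func, x, h):
    """(f(x + h) - f(x)) / h for a step h that is not zero."""
    return (func(x + h) - func(x)) / h

def square(x):
    return x ** 2

x = 3.0
for h in [0.5, 0.1, 0.01, 0.001]:
    # For f(x) = x^2 the quotient equals 2x + h, so it approaches 2x = 6.
    print(h, difference_quotient(square, x, h), 2 * x + h)
```

Negative values of h approach the same limit from the other side.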

LINEAR ALGEBRA
Vectors
Definition:
A vector x, $x \in \mathbb{R}^n$, is defined as
\[ x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \]
The vector x has an (n X 1) format, which means that it is a column
vector with n elements $x_1, \dots, x_n$.
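Definition:
Two vectors $x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$ and $y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$, $x, y \in \mathbb{R}^n$, can be added:
\[ x + y = \begin{pmatrix} x_1 + y_1 \\ \vdots \\ x_n + y_n \end{pmatrix} \]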
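Example 3
$x, y \in \mathbb{R}^3$
\[ x = \begin{pmatrix} 2 \\ 1 \\ -1 \end{pmatrix} \text{ and } y = \begin{pmatrix} 1 \\ 0 \\ 5 \end{pmatrix} \text{ so that } x + y = \begin{pmatrix} 3 \\ 1 \\ 4 \end{pmatrix} \]
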
Multiplication of a vector by a real number
Definition:
A vector x can be multiplied by a real number a:
\[ ax = \begin{pmatrix} ax_1 \\ \vdots \\ ax_n \end{pmatrix} \]
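Example 4
If $x = \begin{pmatrix} 2 \\ 1 \\ -1 \end{pmatrix}$ then:
a) $3x = \begin{pmatrix} 6 \\ 3 \\ -3 \end{pmatrix}$ (multiplication of vector x by scalar 3)
b) $-3x = \begin{pmatrix} -6 \\ -3 \\ 3 \end{pmatrix}$ (multiplication of vector x by scalar -3)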
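The two definitions above (adding vectors and multiplying a vector by a real number) can be sketched in a few lines. The function names `add` and `scale` are hypothetical, and vectors are represented as plain Python lists for illustration only.

```python
def add(x, y):
    """Element-wise sum of two vectors of the same format."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of elements")
    return [xi + yi for xi, yi in zip(x, y)]


def scale(a, x):
    """Multiply every element of the vector x by the real number a."""
    return [a * xi for xi in x]


if __name__ == "__main__":
    print(add([1, 2, 3], [4, 0, -1]))   # element-wise sum
    print(scale(-3, [2, 1, 0]))         # every element times -3
```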

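Line through the origin
Definition
Every $x \in \mathbb{R}^2$ which satisfies
\[ x = a \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} ac_1 \\ ac_2 \end{pmatrix} \]
for some $a \in \mathbb{R}$ is on a line through the origin $O = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ and the point $\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$.
Example 5
Any $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2a \\ 3a \end{pmatrix}$ with $a \in \mathbb{R}$ is on a line that goes through the origin $O = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ and
the point $\begin{pmatrix} 2 \\ 3 \end{pmatrix}$.
Thus, for instance, $\begin{pmatrix} 6 \\ 9 \end{pmatrix}$ is on this line (a = 3).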

Length of a vector
Definition:
The length of vector x:
\[ \|x\| = \sqrt{x_1^2 + x_2^2} \]
Implication:
The length of vector ax:
\[ \|ax\| = \sqrt{a^2 x_1^2 + a^2 x_2^2} = |a| \sqrt{x_1^2 + x_2^2} = |a| \, \|x\| \]
Note that we take the absolute value of a, because the length cannot be
a negative number.
If $x = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$ then
a) $\|x\| = \sqrt{4 + 9} = \sqrt{13}$ (length of x)
b) $\|3x\| = 3\sqrt{4 + 9} = 3\sqrt{13}$ (length of 3x)
c) $\|-3x\| = 3\sqrt{4 + 9} = 3\sqrt{13}$ (length of -3x)
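A quick numerical check of the length formula and of the rule that the length of ax is |a| times the length of x. This is only a sketch; `length` and `scale` are names chosen for this illustration.

```python
import math


def length(x):
    """Length of a vector: square root of the sum of squared elements."""
    return math.sqrt(sum(xi * xi for xi in x))


def scale(a, x):
    """Multiply every element of x by the real number a."""
    return [a * xi for xi in x]


if __name__ == "__main__":
    x = [2, 3]
    print(length(x))             # sqrt(13)
    print(length(scale(-3, x)))  # same as 3 * sqrt(13)
    print(abs(-3) * length(x))
```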

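Example 6
If $x = \begin{pmatrix} b \\ 3 \end{pmatrix}$
a) For which real number b is the length of x equal to 5?
\[ \|x\| = \sqrt{b^2 + 9} = 5 \]
Thus for b = 4 or for b = -4

b) For which real number b is the length of x equal to 1?
\[ \|x\| = \sqrt{b^2 + 9} = 1 \]
Thus, there is no b available: it would require $b^2 = -8$, which has no real solution.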

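Example 7
If $x = \begin{pmatrix} b \\ 0.5 \end{pmatrix}$
For which real numbers b is the length of the vector x equal to 1?

\[ \|x\| = \sqrt{b^2 + 0.25} = 1 \]
Thus $b^2 = 3/4$. Which means that $b = \pm\tfrac{1}{2}\sqrt{3}$.
Thus the vectors $x = \begin{pmatrix} \tfrac{1}{2}\sqrt{3} \\ \tfrac{1}{2} \end{pmatrix}$ and $x = \begin{pmatrix} -\tfrac{1}{2}\sqrt{3} \\ \tfrac{1}{2} \end{pmatrix}$ have a length of 1.
Definition:
In $\mathbb{R}^2$ the unit vectors $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ have a length of 1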

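Circles
Definition:
Vectors $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ are at the unit circle if their length is equal to 1:

\[ \|x\| = \sqrt{x_1^2 + x_2^2} = 1 \]
Thus the locus (centre) of this circle is the origin $O = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$
Consequence:
In $\mathbb{R}^2$ the unit vectors $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ are at the unit circle.
Definition:
A vector $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ is at a circle with locus $c = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ and with a non-negative
radius ($r \geq 0$) if it satisfies the restriction:
\[ (x_1 - c_1)^2 + (x_2 - c_2)^2 = r^2 \]
Example 8
$(x_1 + 1)^2 + (x_2 - 2)^2 = 25$ describes a circle with locus $\begin{pmatrix} -1 \\ 2 \end{pmatrix}$ and radius 5.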

Inner product
Definition: inner product
The inner product of two vectors
$x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ and $y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$
is defined as
\[ x \cdot y = x_1 y_1 + x_2 y_2 \]

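Orthogonal vectors
Definition
Two vectors are orthogonal (perpendicular) if their inner product is
equal to zero:
\[ x \cdot y = 0 \]
Example 9
In $\mathbb{R}^2$ the unit vectors $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ are orthogonal:
\[ e_1 \cdot e_2 = 1 \cdot 0 + 0 \cdot 1 = 0 \]
Definition:
In $\mathbb{R}^2$ the unit vectors $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $e_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ are referred to as
orthonormal vectors (they are perpendicular and they have a length
of 1).
Example 10
In $\mathbb{R}^2$ the unit vector $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and any multiple $a e_2 = \begin{pmatrix} 0 \\ a \end{pmatrix}$, $a \in \mathbb{R}$, are orthogonal.
Reason:
\[ e_1 \cdot (a e_2) = 1 \cdot 0 + 0 \cdot a = 0 \]
Consequence: $e_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ is orthogonal to any point on the line
$a e_2 = \begin{pmatrix} 0 \\ a \end{pmatrix}$ for $a \in \mathbb{R}$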

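Example 11
In $\mathbb{R}^2$ any point on the line $a e_1 = \begin{pmatrix} a \\ 0 \end{pmatrix}$ and any point on the line $b e_2 = \begin{pmatrix} 0 \\ b \end{pmatrix}$ are orthogonal:
Reason:
\[ (a e_1) \cdot (b e_2) = a \cdot 0 + 0 \cdot b = 0 \]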
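The inner product and the orthogonality test translate directly into code. A minimal sketch, where `inner` and `is_orthogonal` are names invented for this illustration and the tolerance `tol` is an assumption to cope with rounding in non-integer examples:

```python
def inner(x, y):
    """Inner product: sum of the element-wise products."""
    return sum(xi * yi for xi, yi in zip(x, y))


def is_orthogonal(x, y, tol=1e-12):
    """Two vectors are orthogonal if their inner product is zero."""
    return abs(inner(x, y)) <= tol


if __name__ == "__main__":
    e1, e2 = [1, 0], [0, 1]
    print(inner(e1, e2))                  # 0
    print(is_orthogonal([2, 1], [-1, 2]))
    print(is_orthogonal([1, 1], [1, 0]))
```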

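Linear dependence of two vectors
(Informal) definition of dependence:
In $\mathbb{R}^2$ two vectors x and y are linearly dependent if the first vector is a
multiple of the second vector. Thus $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ and $y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$ are linearly dependent if
\[ x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} k y_1 \\ k y_2 \end{pmatrix} = k y \]
Formal definition of independence:
In $\mathbb{R}^2$ two vectors x and y are linearly independent if
\[ k_1 \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + k_2 \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \]
holds only for $k_1 = 0$ and $k_2 = 0$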

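Examples of linear dependence
Example 12
$x = \begin{pmatrix} 4 \\ -1 \end{pmatrix}$ and $y = \begin{pmatrix} 12 \\ -3 \end{pmatrix}$ are linearly dependent (k = 1/3)
Reason: $4 = k_1 \cdot 12$, $-1 = k_2 \cdot (-3)$, thus $k_1 = k_2 = 1/3$.
Example 13
$x = \begin{pmatrix} 10 \\ 0 \end{pmatrix}$ and $y = \begin{pmatrix} 5 \\ 0 \end{pmatrix}$ are linearly dependent (k = 2).
Reason: $10 = k_1 \cdot 5$, $0 = k_2 \cdot 0$, thus $k_1 = k_2 = 2$
Example 14
$x = \begin{pmatrix} 10 \\ 0 \end{pmatrix}$ and $y = \begin{pmatrix} 3 \\ 8 \end{pmatrix}$ are linearly independent.
Reason: $10 = k_1 \cdot 3$, $0 = k_2 \cdot 8$, thus $k_1 \neq k_2$
Example 15
$x = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$ and $y = \begin{pmatrix} -1 \\ 2 \end{pmatrix}$ are two linearly independent and orthogonal
vectors.
Reason (for independent vectors): $2 = k_1 \cdot (-1)$, $1 = k_2 \cdot 2$, thus $k_1 \neq k_2$
Reason (for orthogonal vectors): $2 \cdot (-1) + 1 \cdot 2 = 0$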

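Example 16
$x = \frac{1}{\sqrt{5}} \begin{pmatrix} 2 \\ 1 \end{pmatrix}$ and $y = \frac{1}{\sqrt{5}} \begin{pmatrix} -1 \\ 2 \end{pmatrix}$ are linearly independent and orthonormal
vectors.
Because:
Reason 1) (independent)
\[ 2 = k_1 \cdot (-1), \quad 1 = k_2 \cdot 2 \]
So that
\[ k_1 = -2, \quad k_2 = 1/2, \quad k_1 \neq k_2 \]
Reason 2) (orthogonal)
\[ \tfrac{1}{5} \left( 2 \cdot (-1) + 1 \cdot 2 \right) = 0 \]
Reason 3) (length of 1)
\[ \|x\| = \frac{1}{\sqrt{5}} \sqrt{2^2 + 1^2} = 1 \quad \text{and} \quad \|y\| = \frac{1}{\sqrt{5}} \sqrt{1^2 + 2^2} = 1 \]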

Linear independence of two vectors: important consequences
Consequence 1:
All vectors $z \in \mathbb{R}^2$ can be written as a linear combination of two
linearly independent vectors $x, y \in \mathbb{R}^2$
Thus:
\[ z = k_1 \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + k_2 \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \]
for which $k_1, k_2 \in \mathbb{R}$
Thus: $\mathbb{R}^2$ can be spanned by two linearly independent vectors!
Consequence 2:
Let $x, y \in \mathbb{R}^2$. We write both vectors as $x = \begin{pmatrix} a \\ b \end{pmatrix}$ and $y = \begin{pmatrix} c \\ d \end{pmatrix}$
Both vectors are linearly dependent if
\[ ad - bc = 0 \]
Both vectors are linearly independent if
\[ ad - bc \neq 0 \]
Proof: both vectors x and y are linearly dependent if
1) $a = k_1 c$ so that $k_1 = \frac{a}{c}$
2) $b = k_2 d$ so that $k_2 = \frac{b}{d}$
3) Linearly dependent so that $k_1 = k_2$ or $\frac{a}{c} = \frac{b}{d}$ or $ad - bc = 0$
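Consequence 2 gives a one-line test for two vectors in the plane. A small sketch, using the convention x = (a, b) and y = (c, d) from above; the function name `are_dependent` is hypothetical and the sample vectors are illustrative integer values.

```python
def are_dependent(x, y):
    """Two vectors x = (a, b), y = (c, d) are dependent iff ad - bc = 0."""
    a, b = x
    c, d = y
    return a * d - b * c == 0


if __name__ == "__main__":
    print(are_dependent((10, 0), (5, 0)))   # dependent: x = 2y
    print(are_dependent((10, 0), (3, 8)))   # independent
    print(are_dependent((2, 1), (-1, 2)))   # independent (and orthogonal)
```

With exact integers the comparison with zero is safe; for floating point entries one would compare against a small tolerance instead.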

Matrices
Definition:
The (2 X 2) matrix A:
\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \]
It consists of two rows and two columns.
Definitions:
$a_{ij}$ is an element of the matrix A.
The diagonal of the matrix consists of the elements $a_{11}$ and $a_{22}$.
The elements $a_{11}$ and $a_{22}$ are referred to as the diagonal elements
of the matrix A.
The elements $a_{21}$ and $a_{12}$ are referred to as the off-diagonal elements
of the matrix A.
The (2 X 2) matrix A can be (post)multiplied by a (2 X 1)-vector x.
\[ Ax = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \]
For the multiplication it is required that the number of columns of A is
equal to the number of rows of the vector x.

How to determine Ax ?
1) The first element of the product is row 1 of A times x:
\[ a_{11} x_1 + a_{12} x_2 \]
2) The second element of the product is row 2 of A times x:
\[ a_{21} x_1 + a_{22} x_2 \]
So that it can be combined as:
\[ Ax = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} a_{11} x_1 + a_{12} x_2 \\ a_{21} x_1 + a_{22} x_2 \end{pmatrix} \]

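Example 17
a) $\begin{pmatrix} 1 & 3 \\ -2 & 5 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 11 \\ 11 \end{pmatrix}$
b) $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$
c) $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 3 \\ 2 \end{pmatrix}$
d) $\begin{pmatrix} 5 & 0 \\ 0 & 5 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 10 \\ 15 \end{pmatrix}$
e) $\begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 7 \\ 14 \end{pmatrix}$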

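How to determine (A+B)x
Format of both matrices must be equal:
\[ A + B = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} + \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{pmatrix} \]

Note that A + B = B + A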

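How to calculate (AB)x ?
Requirement for matrix multiplication:
Number of columns of matrix A = Number of rows of matrix B
1) Row 1 times column 1: $(AB)_{11} = a_{11} b_{11} + a_{12} b_{21}$
2) Row 1 times column 2: $(AB)_{12} = a_{11} b_{12} + a_{12} b_{22}$
3) Row 2 times column 1: $(AB)_{21} = a_{21} b_{11} + a_{22} b_{21}$
4) Row 2 times column 2: $(AB)_{22} = a_{21} b_{12} + a_{22} b_{22}$
Thus all together:
\[ AB = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} b_{11} + a_{12} b_{21} & a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} & a_{21} b_{12} + a_{22} b_{22} \end{pmatrix} \]
Note that
a) $AB \neq BA$ (in general)
b) $A(BC) = (AB)C$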

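Example 18
If $A = \begin{pmatrix} 1 & 3 \\ -2 & 5 \end{pmatrix}$ and $B = \begin{pmatrix} -1 & 2 \\ 0 & 4 \end{pmatrix}$
The sum of both matrices (both have the same format)
\[ A + B = \begin{pmatrix} 1 & 3 \\ -2 & 5 \end{pmatrix} + \begin{pmatrix} -1 & 2 \\ 0 & 4 \end{pmatrix} = \begin{pmatrix} 0 & 5 \\ -2 & 9 \end{pmatrix} \]

The product of both matrices (number of columns of A equals the
number of rows of B):
\[ AB = \begin{pmatrix} 1 & 3 \\ -2 & 5 \end{pmatrix} \begin{pmatrix} -1 & 2 \\ 0 & 4 \end{pmatrix} = \begin{pmatrix} -1 & 14 \\ 2 & 16 \end{pmatrix} \]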
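The "row times column" recipe can be written as a short routine. This is a sketch for illustration: `matmul` is a made-up name, matrices are lists of rows, a vector is treated as a matrix with one column, and the entries of A and B below are illustrative values.

```python
def matmul(A, B):
    """Product AB: element (i, j) is row i of A times column j of B."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


if __name__ == "__main__":
    A = [[1, 3], [-2, 5]]
    B = [[-1, 2], [0, 4]]
    print(matmul(A, B))
    print(matmul(B, A))                            # differs from AB
    print(matmul([[2, 1], [4, 2]], [[2], [3]]))    # matrix times column vector
```

Comparing the first two printed matrices shows that the order of multiplication matters.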

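How to interpret Ax?
\[ Ax = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = x_1 \begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix} + x_2 \begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix} \]
Thus Ax is a linear combination of the vectors $\begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix}$ and $\begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix}$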
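Example 19:
a) $\begin{pmatrix} 1 & 3 \\ -2 & 5 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = 2 \begin{pmatrix} 1 \\ -2 \end{pmatrix} + 3 \begin{pmatrix} 3 \\ 5 \end{pmatrix}$
b) $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = 2 \begin{pmatrix} 1 \\ 0 \end{pmatrix} + 3 \begin{pmatrix} 0 \\ 1 \end{pmatrix}$
c) $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = 2 \begin{pmatrix} 0 \\ 1 \end{pmatrix} + 3 \begin{pmatrix} 1 \\ 0 \end{pmatrix}$
d) $\begin{pmatrix} 5 & 0 \\ 0 & 5 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = 2 \begin{pmatrix} 5 \\ 0 \end{pmatrix} + 3 \begin{pmatrix} 0 \\ 5 \end{pmatrix}$
e) $\begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = 2 \begin{pmatrix} 2 \\ 4 \end{pmatrix} + 3 \begin{pmatrix} 1 \\ 2 \end{pmatrix}$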

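How to interpret the identity matrix?
Definition:
The square matrix I is an identity matrix and it has the following
property
\[ Ix = x \]
Where
\[ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]
I is a diagonal matrix with ones on the diagonal (all off-diagonal
elements are zero).
For a 2 X 2 matrix A, we have the following consequence:
Consequence 1: IA = A
Furthermore, I times I equals I:
Consequence 2: II = I
Example 20:
See examples 17b and 19b:
\[ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 2 \\ 3 \end{pmatrix} \]
\[ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 2 \\ 3 \end{pmatrix} = 2 \begin{pmatrix} 1 \\ 0 \end{pmatrix} + 3 \begin{pmatrix} 0 \\ 1 \end{pmatrix} \]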

How to interpret the inverse matrix of A?
The inverse of the square matrix A is referred to as $A^{-1}$
It has the following properties:
1) $A A^{-1} = I$
2) $A^{-1} A = I$
Consequence: $I^{-1} I = I$
Why is it important to calculate the inverse matrix?
\[ Ax = b \]
can be rewritten as:
\[ A^{-1} A x = A^{-1} b \]
or
\[ x = A^{-1} b \]
Hence, $Ax = b$ can be solved as $x = A^{-1} b$

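How to calculate the inverse matrix A?
Definition:
The inverse of the square matrix A
\[ A = \begin{pmatrix} a & c \\ b & d \end{pmatrix} \]
equals
\[ A^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -c \\ -b & a \end{pmatrix} \]
Proof:
\[ A^{-1} A = \frac{1}{ad - bc} \begin{pmatrix} d & -c \\ -b & a \end{pmatrix} \begin{pmatrix} a & c \\ b & d \end{pmatrix}
= \frac{1}{ad - bc} \begin{pmatrix} da - bc & dc - dc \\ -ba + ab & -bc + ad \end{pmatrix}
= \frac{1}{ad - bc} \begin{pmatrix} da - bc & 0 \\ 0 & -bc + ad \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]
Note: the inverse of matrix A does not exist if $ad - bc = 0$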

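How to calculate the inverse matrix?
We sweep A and I side by side; once the left block has become I, the right block is $A^{-1}$.
\[ A = \begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix}, \quad I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]
Round 1: Divide row 1 by 2
\[ \left( \begin{array}{cc|cc} 1 & 2 & 1/2 & 0 \\ 3 & 1 & 0 & 1 \end{array} \right) \]
Round 2: Row 2 new: row 2 - 3 times row 1
\[ \left( \begin{array}{cc|cc} 1 & 2 & 1/2 & 0 \\ 0 & -5 & -3/2 & 1 \end{array} \right) \]
Round 3: Divide row 2 by (-5)
\[ \left( \begin{array}{cc|cc} 1 & 2 & 1/2 & 0 \\ 0 & 1 & 3/10 & -1/5 \end{array} \right) \]
Round 4: Row 1 new: row 1 - 2 times row 2
\[ \left( \begin{array}{cc|cc} 1 & 0 & -1/10 & 2/5 \\ 0 & 1 & 3/10 & -1/5 \end{array} \right), \quad A^{-1} = \begin{pmatrix} -1/10 & 2/5 \\ 3/10 & -1/5 \end{pmatrix} \]
Check:
\[ A A^{-1} = \begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix} \begin{pmatrix} -1/10 & 2/5 \\ 3/10 & -1/5 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]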
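The closed formula for the inverse of a 2 X 2 matrix can be checked against the sweeping example. A minimal sketch, assuming the matrix is given as two rows [a, c] and [b, d] (the convention used in the formula above); `inverse_2x2` is a hypothetical name, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def inverse_2x2(A):
    """Inverse of [[a, c], [b, d]] as 1/(ad - bc) * [[d, -c], [-b, a]]."""
    (a, c), (b, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("the inverse does not exist when ad - bc = 0")
    factor = Fraction(1) / Fraction(det)
    return [[d * factor, -c * factor], [-b * factor, a * factor]]


if __name__ == "__main__":
    for row in inverse_2x2([[2, 4], [3, 1]]):
        print([str(v) for v in row])
```

For the matrix with rows (2, 4) and (3, 1) this reproduces the result of the four sweeping rounds.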

Solutions Tutorials Week 2
Advanced Mathematics - Technical tutorial Week 2 (Wednesday)
These are the solutions of the exercises as we treated them in class (mostly). In the second
class I got a bit further than in the first. Pay heed.
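Exercise 1.
Exercise 2.
Determine in which order the following matrices can be multiplied and carry out the
multiplication.
\[ \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ 1 & 2 & 3 \end{pmatrix}, \quad \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix} \]
Solution:
The first matrix is 4 X 3 and the second is 3 X 2, so only the product of the first with the second is defined:
\[ \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ 1 & 2 & 3 \end{pmatrix} \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}
= \begin{pmatrix} 1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3 & 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 \\ 4 \cdot 1 + 5 \cdot 2 + 6 \cdot 3 & 4 \cdot 4 + 5 \cdot 5 + 6 \cdot 6 \\ 7 \cdot 1 + 8 \cdot 2 + 9 \cdot 3 & 7 \cdot 4 + 8 \cdot 5 + 9 \cdot 6 \\ 1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3 & 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 \end{pmatrix}
= \begin{pmatrix} 14 & 32 \\ 32 & 77 \\ 50 & 122 \\ 14 & 32 \end{pmatrix} \]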
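Exercise 3.
For the matrices $A = \begin{pmatrix} 1 & 4 \\ 4 & 5 \end{pmatrix}$, $B = \begin{pmatrix} 8 & 0 \\ 3 & 1 \end{pmatrix}$, show that $AB \neq BA$. Do the same, but without
calculation, for $\begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \\ 5 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \\ 5 \end{pmatrix}^T = (1 \; 2 \; 3 \; 4 \; 5)$.

Solution:
Observe that A is a symmetric matrix. It doesn't help your calculations, but you should know
what it is.
\[ AB = \begin{pmatrix} 1 & 4 \\ 4 & 5 \end{pmatrix} \begin{pmatrix} 8 & 0 \\ 3 & 1 \end{pmatrix} = \begin{pmatrix} 20 & 4 \\ 47 & 5 \end{pmatrix} \neq \begin{pmatrix} 8 & 32 \\ 7 & 17 \end{pmatrix} = \begin{pmatrix} 8 & 0 \\ 3 & 1 \end{pmatrix} \begin{pmatrix} 1 & 4 \\ 4 & 5 \end{pmatrix} = BA \]
For the second pair of vectors, observe that one order of the multiplication gives rise to a 5x5
matrix, while the other leads to a 1x1 matrix, usually called a number. Clearly those cannot be
the same. In fact the version with the number as an outcome,
\[ \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \\ 5 \end{pmatrix}^T \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \\ 5 \end{pmatrix}, \]
is just another way of writing the inner product. (Check that it's the same thing!)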
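3b).
Show for the matrix $A = \begin{pmatrix} 4 & 3 \\ 1 & 5 \end{pmatrix}$, with inverse $A^{-1} = \begin{pmatrix} \frac{5}{17} & -\frac{3}{17} \\ -\frac{1}{17} & \frac{4}{17} \end{pmatrix}$, that the general rule
$(A^{-1})^T = (A^T)^{-1}$ holds.
Solution:
We compute directly
\[ (A^{-1})^T = \begin{pmatrix} \frac{5}{17} & -\frac{3}{17} \\ -\frac{1}{17} & \frac{4}{17} \end{pmatrix}^T = \begin{pmatrix} \frac{5}{17} & -\frac{1}{17} \\ -\frac{3}{17} & \frac{4}{17} \end{pmatrix}. \]
If this is equal to $(A^T)^{-1}$, then it must hold that $I = A^T (A^T)^{-1} = A^T (A^{-1})^T$. We check:
\[ A^T (A^{-1})^T = \begin{pmatrix} 4 & 1 \\ 3 & 5 \end{pmatrix} \begin{pmatrix} \frac{5}{17} & -\frac{1}{17} \\ -\frac{3}{17} & \frac{4}{17} \end{pmatrix} = \frac{1}{17} \begin{pmatrix} 20 - 3 & -4 + 4 \\ 15 - 15 & -3 + 20 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]
Yippee.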
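Exercise 4.
Write out the following sets of equations in matrix form. Solve by sweeping.
a)
\[ \begin{aligned} x_1 + x_2 + 3x_3 + 7x_4 &= 5 \\ 2x_1 + 2x_2 + 4x_3 + 6x_4 &= 4 \\ x_1 - x_2 + 3x_3 - 6x_4 &= 0 \\ x_1 - 2x_3 + x_4 &= -4 \end{aligned} \]
b)
\[ \begin{aligned} x_1 + 3x_2 - x_3 &= 3 \\ x_1 + 2x_2 - 5x_3 &= 4 \\ x_2 + 4x_3 &= -1 \end{aligned} \]
c)
\[ \begin{aligned} x_1 + 3x_2 - x_3 &= 3 \\ x_1 + 2x_2 - 5x_3 &= 4 \\ x_2 + 4x_3 &= 0 \end{aligned} \]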

Solution:
a)
Sweeping is basically adding and subtracting and multiplying equations until you get
something from which you can easily read the result. Ideal would be the identity matrix, but
in practice we settle for something a bit messier. It is best seen in an example. We write the
equation in a matrix as follows:
\[ \begin{pmatrix} 1 & 1 & 3 & 7 \\ 2 & 2 & 4 & 6 \\ 1 & -1 & 3 & -6 \\ 1 & 0 & -2 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = \begin{pmatrix} 5 \\ 4 \\ 0 \\ -4 \end{pmatrix} \]
If you multiply this out, you indeed get the equations back (try it!). This is of course why
matrix multiplication is defined the way it is: it makes it very easy to write sets of equations
compactly. However, for actual solving we will write things down slightly differently.
Suppose we subtract the first equation $x_1 + x_2 + 3x_3 + 7x_4 = 5$ from the second
$2x_1 + 2x_2 + 4x_3 + 6x_4 = 4$, then we get:
\[ \begin{aligned} 2x_1 + 2x_2 + 4x_3 + 6x_4 &= 4 \\ x_1 + x_2 + 3x_3 + 7x_4 &= 5 \\ \hline x_1 + x_2 + x_3 - x_4 &= -1 \end{aligned} \]
Notice that the x's don't change, we only have to look at the value in front of them. That's
why we write down the set as follows:
\[ \left( \begin{array}{cccc|c} 1 & 1 & 3 & 7 & 5 \\ 2 & 2 & 4 & 6 & 4 \\ 1 & -1 & 3 & -6 & 0 \\ 1 & 0 & -2 & 1 & -4 \end{array} \right) \]
Now we rework this augmented matrix, as it is called, to get something that we can interpret
quickly. Each time we rewrite the matrix, we indicate this with the sign ~ rather than =, as the
matrices are not equal. Notice that we not only subtract, add and multiply, but also
interchange rows. We get:
\[ \left( \begin{array}{cccc|c} 1 & 1 & 3 & 7 & 5 \\ 2 & 2 & 4 & 6 & 4 \\ 1 & -1 & 3 & -6 & 0 \\ 1 & 0 & -2 & 1 & -4 \end{array} \right)
\sim \left( \begin{array}{cccc|c} 0 & 1 & 5 & 6 & 9 \\ 0 & 2 & 8 & 4 & 12 \\ 0 & -1 & 5 & -7 & 4 \\ 1 & 0 & -2 & 1 & -4 \end{array} \right)
\sim \left( \begin{array}{cccc|c} 1 & 0 & -2 & 1 & -4 \\ 0 & 1 & 4 & 2 & 6 \\ 0 & 0 & 9 & -5 & 10 \\ 0 & 0 & 1 & 4 & 3 \end{array} \right) \]
Here in the first step we subtracted the last row from the first row once, twice from the second
row and once from the third. In the second we interchanged the first and the last row, then
divided the second row by 2 and then added it to the third row and subtracted it from the
fourth. From now on we don't describe our steps, as this is very cumbersome and confusing.
We continue:
\[ \sim \left( \begin{array}{cccc|c} 1 & 0 & -2 & 1 & -4 \\ 0 & 1 & 4 & 2 & 6 \\ 0 & 0 & 1 & 4 & 3 \\ 0 & 0 & 9 & -5 & 10 \end{array} \right)
\sim \left( \begin{array}{cccc|c} 1 & 0 & -2 & 1 & -4 \\ 0 & 1 & 4 & 2 & 6 \\ 0 & 0 & 1 & 4 & 3 \\ 0 & 0 & 0 & -41 & -17 \end{array} \right)
\sim \left( \begin{array}{cccc|c} 1 & 0 & -2 & 1 & -4 \\ 0 & 1 & 4 & 2 & 6 \\ 0 & 0 & 1 & 4 & 3 \\ 0 & 0 & 0 & 1 & \frac{17}{41} \end{array} \right) \]

From the last matrix we can now easily solve the equations. The last line reads: $x_4 = \frac{17}{41}$. We
plug this in the line above to get: $x_3 + 4(\frac{17}{41}) = 3 \Rightarrow x_3 = 3 - 4(\frac{17}{41}) = \frac{55}{41}$. We use this again in
the line above that: $x_2 + 4(\frac{55}{41}) + 2(\frac{17}{41}) = 6 \Rightarrow x_2 = 6 - 4(\frac{55}{41}) - 2(\frac{17}{41}) = -\frac{8}{41}$ and finally
$x_1 - 2(\frac{55}{41}) + (\frac{17}{41}) = -4 \Rightarrow x_1 = -4 + 2(\frac{55}{41}) - (\frac{17}{41}) = -\frac{71}{41}$.
Note that once we have all zeros in the lower corner, life is quite easy. Of course, in general,
sweeping is just a systematic way of solving by substitution. The operations we used to
change the augmented matrix (addition, subtraction, etc.) are called elementary operations.
They are very useful in the study of linear algebra.
b)
\[ \begin{pmatrix} 1 & 3 & -1 \\ 1 & 2 & -5 \\ 0 & 1 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 4 \\ -1 \end{pmatrix} \]
\[ \left( \begin{array}{ccc|c} 1 & 3 & -1 & 3 \\ 1 & 2 & -5 & 4 \\ 0 & 1 & 4 & -1 \end{array} \right)
\sim \left( \begin{array}{ccc|c} 1 & 3 & -1 & 3 \\ 0 & 1 & 4 & -1 \\ 0 & 1 & 4 & -1 \end{array} \right)
\sim \left( \begin{array}{ccc|c} 1 & 3 & -1 & 3 \\ 0 & 1 & 4 & -1 \\ 0 & 0 & 0 & 0 \end{array} \right) \]
Hmm, so what's going on here? We have an equation saying $0x_1 + 0x_2 + 0x_3 = 0$, which
doesn't tell us anything we did not already know. And after that we have only two equations
left for three unknowns. So what we get is infinitely many solutions. (If you find this
confusing, think back to the simpler case $x_1 + x_2 = 0$, which has infinitely many solutions
also: one for each $x_1$ such that $x_1 = -x_2$.) We have one free variable. We could pick any of our
x's to be the free one and express the others in terms of it. We pick $x_3$. Then we get
$x_2 + 4x_3 = -1 \Rightarrow x_2 = -1 - 4x_3$ and $x_1 + 3(-1 - 4x_3) - x_3 = 3 \Rightarrow x_1 = 6 + 13x_3$.

A system of linear equations which allows multiple solutions like this is called
underdetermined. You get such an underdetermined system if and only if you get a row of all
zeros in your augmented matrix. The number of free variables you get is the number of
columns minus the number of non-zero rows you get in the end (Beware, I said this wrong in
class).
c)
\[ \begin{pmatrix} 1 & 3 & -1 \\ 1 & 2 & -5 \\ 0 & 1 & 4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 4 \\ 0 \end{pmatrix} \]
\[ \left( \begin{array}{ccc|c} 1 & 3 & -1 & 3 \\ 1 & 2 & -5 & 4 \\ 0 & 1 & 4 & 0 \end{array} \right)
\sim \left( \begin{array}{ccc|c} 1 & 3 & -1 & 3 \\ 0 & 1 & 4 & -1 \\ 0 & 1 & 4 & 0 \end{array} \right)
\sim \left( \begin{array}{ccc|c} 1 & 3 & -1 & 3 \\ 0 & 1 & 4 & -1 \\ 0 & 0 & 0 & 1 \end{array} \right) \]
Hmm, this seems especially problematic: $0x_1 + 0x_2 + 0x_3 = 1$. What we have here is an
inconsistent system: it has no solutions. It is no coincidence that the matrix (not augmented)
here is the same as under b). Non-augmented matrices which can get a row of all zeros give
rise to either an underdetermined or an inconsistent system. Only matrices such as that under
a), which cannot have a row of zeros under elementary operations, have one and only one
solution for every possible vector on the right hand side. The number of rows with not all
zeros in the end is called the rank of the matrix. Thus this matrix has rank 2, whereas that
under a) had rank 4.
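Sweeping is mechanical enough to hand to a computer. The sketch below is one possible implementation, not the procedure used in class: `sweep` and `classify` are hypothetical names, it always sweeps all the way to zeros above and below each leading 1, and the three augmented matrices are the systems a), b) and c) with the signs as read from the exercise (an assumption where the source is ambiguous).

```python
from fractions import Fraction


def sweep(aug):
    """Sweep an augmented matrix with elementary row operations."""
    rows = [[Fraction(v) for v in row] for row in aug]
    n_rows, n_unknowns = len(rows), len(rows[0]) - 1
    pivot_row = 0
    for col in range(n_unknowns):
        # Look for a row (at or below pivot_row) with a non-zero entry here.
        pivot = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        lead = rows[pivot_row][col]
        rows[pivot_row] = [v / lead for v in rows[pivot_row]]
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return rows


def classify(aug):
    """Return 'unique', 'underdetermined' or 'inconsistent'."""
    rows = sweep(aug)
    n_unknowns = len(aug[0]) - 1
    rank = sum(any(v != 0 for v in row[:-1]) for row in rows)
    rank_aug = sum(any(v != 0 for v in row) for row in rows)
    if rank_aug > rank:
        return "inconsistent"      # a row reading 0 = non-zero
    if rank < n_unknowns:
        return "underdetermined"   # at least one free variable
    return "unique"


SYSTEM_A = [[1, 1, 3, 7, 5], [2, 2, 4, 6, 4], [1, -1, 3, -6, 0], [1, 0, -2, 1, -4]]
SYSTEM_B = [[1, 3, -1, 3], [1, 2, -5, 4], [0, 1, 4, -1]]
SYSTEM_C = [[1, 3, -1, 3], [1, 2, -5, 4], [0, 1, 4, 0]]

if __name__ == "__main__":
    print([str(row[-1]) for row in sweep(SYSTEM_A)])
    for system in (SYSTEM_A, SYSTEM_B, SYSTEM_C):
        print(classify(system))
```

The count of rows that are not all zeros in the swept coefficient part is the rank, which is what `classify` compares with the number of unknowns.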
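Exercise 5. Find vectors b and c that pick out element $a_{ij}$ from matrix $A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$,
i.e. $a_{ij} = b \cdot A \cdot c$.
Solution:
Consider the vectors $b = e_i^T = (0 \cdots 1 \cdots 0)$ and $c = e_j = \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix}$, where $e_j$ has a 1 only on its j-th
spot. These e's are called unit vectors. Then:
\[ b \cdot A \cdot c = (0 \cdots 1 \cdots 0) \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{pmatrix} = (0 \cdots 1 \cdots 0) \begin{pmatrix} a_{1j} \\ \vdots \\ a_{ij} \\ \vdots \\ a_{mj} \end{pmatrix} = a_{ij} \]
In general: pre-multiplication with a unit vector gives you a row from the matrix, post-multiplication gives you a column.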
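* Exercise 6.
Find vectors b and c such that $b \cdot A \cdot c$ gives the average of all elements of A.
Solution:
Consider $b = (1 \; 1 \cdots 1)$, $c = \frac{1}{mn} \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix}$, then:
\[ b \cdot A \cdot c = (1 \; 1 \cdots 1) \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \frac{1}{mn} \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} \]
This is the sum of all the elements in A divided by the number of elements in A, i.e. the
average. The term $\frac{1}{mn}$ could also have been in front of the first vector, or shared between
them.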
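* Exercise 7.
Write out the matrix product $A \cdot B$ in terms of their typical elements $a_{ij}$, $b_{ij}$, assuming A and B
are conformable. I.e., find the typical element $c_{ij}$ of $C = A \cdot B$, where we write $C = [c_{ij}]$.
Solution:
With A of format m X n and B of format n X p, the element $c_{ij}$ is row i of A times column j of B:
\[ c_{ij} = (a_{i1} \cdots a_{in}) \begin{pmatrix} b_{1j} \\ \vdots \\ b_{nj} \end{pmatrix} = a_{i1} b_{1j} + \cdots + a_{in} b_{nj} = \sum_{k=1}^{n} a_{ik} b_{kj} \]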
* Exercise 8.
Show that the length formula $\|x\| = \sqrt{x_1^2 + x_2^2}$ holds in two dimensions.
Solution:
The picture shows a general vector x and its components $x_1$, $x_2$. Pythagoras' theorem now
states (as you hopefully remember from high school) that for a triangle with a right angle like
this, $\|x\|^2 = x_1^2 + x_2^2$. Taking the square root gives the result.
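* Exercise 9.
Show that the circle formula $(x_1 - c_1)^2 + (x_2 - c_2)^2 = r^2$ indeed gives a circle with radius r and
centre (or locus) $\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$.
Solution:
The figure tells the basic story. We start with point $\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ and we pick a point $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ such that
the condition $(x_1 - c_1)^2 + (x_2 - c_2)^2 = r^2$ holds. We did this in the figure. But then, by
Pythagoras' theorem, the distance between $\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$ and $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ must be r. This must hold for all
$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ we can find this way, so the set of points $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ for which the condition holds is the set of
points that has distance r to $\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$. This is the circle drawn.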
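Exercise 10.
Find all vectors orthogonal to $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$. Do the same for $\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$.
Solution:
For an orthogonal vector $\begin{pmatrix} a \\ b \end{pmatrix}$ we want the inner product between $\begin{pmatrix} a \\ b \end{pmatrix}$ and $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$, which is
$1 \cdot a + 2 \cdot b$, to be zero. We find $a + 2b = 0$. This is a single equation in two unknowns, so we have
one free variable. Let's take b free. Then a = -2b, so all vectors $\begin{pmatrix} -2b \\ b \end{pmatrix}$, $b \in \mathbb{R}$, are orthogonal to
$\begin{pmatrix} 1 \\ 2 \end{pmatrix}$.

For $\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$ we want the inner product $\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \cdot \begin{pmatrix} a \\ b \\ c \end{pmatrix} = 0$. We find
$1 \cdot a + 2 \cdot b + 3 \cdot c = a + 2b + 3c = 0$. This is a single equation in 3 unknowns, so we have two
free variables. Let's take b and c free. Then a = -2b-3c, so all vectors $\begin{pmatrix} -2b - 3c \\ b \\ c \end{pmatrix}$, $b, c \in \mathbb{R}$, are
orthogonal to $\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$. If you try to imagine this in space, it makes sense that a vector in $\mathbb{R}^3$ has
two free variables. If you find an orthogonal vector you can not only extend it, like in $\mathbb{R}^2$, but
also rotate it.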
* Exercise 11.
Bonus:
Show that the formula for orthogonality coincides with our intuitive notion of orthogonality.
Solution:
As a preliminary, convince yourself that the vector indicated in the following figure as B-A is
indeed B-A.
[Figure]
The easiest way to see this (much easier than I explained in class) is that in the figure A and
B-A add up to B, which they clearly also do algebraically.
Now look at the following picture:
[Figure]
Convince yourself that the vectors are as indicated. Now, from the figure we see that
A and B can only be orthogonal if A+B and A-B have the same length, like so:
[Figure]
A+B having the same length as B-A means $\|B - A\| = \|A + B\|$. We now manipulate this
expression to derive that then $A \cdot B = 0$, as we want. First note that for any vector C it holds
that $C \cdot C = \|C\|^2$. This can be verified easily by writing out the definitions.
Now, squaring our equation, we get:
\[ \|B - A\| = \|A + B\| \Rightarrow \|B - A\|^2 = \|A + B\|^2 \Rightarrow (B - A) \cdot (B - A) = (A + B) \cdot (A + B) \]
Working this out:
\[ (B - A) \cdot (B - A) = B \cdot B - 2 A \cdot B + A \cdot A = (A + B) \cdot (A + B) = B \cdot B + 2 A \cdot B + A \cdot A \]
\[ \Rightarrow -2 A \cdot B = 2 A \cdot B \Rightarrow 4 A \cdot B = 0 \Rightarrow A \cdot B = 0 \]
Which is what we wanted.

Advanced Mathematics - Broad tutorial Week 2 (Friday)
Exercise 1.
For vectors $a = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}$, $b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$ from $\mathbb{R}^2$, draw them and a + b. Also calculate a + b.
Solution:
In the figure, the red lines denote the vectors a and b, while the green lines denote their
translations (by b and a respectively). The yellow line is the resultant vector a+b, which has
coordinates $(a_1 + b_1, a_2 + b_2)$.
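* Exercise 2.
Consider the matrix $A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$ working on vector $b = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}$. Show that this
constitutes a linear map from $\mathbb{R}^n$ to $\mathbb{R}^m$.
Solution:
Let T(b) be the transformation indicated. Then we have to show that
$T(\lambda b) = \lambda T(b)$, $\forall \lambda \in \mathbb{R}, b \in \mathbb{R}^n$ and $T(b_1 + b_2) = T(b_1) + T(b_2)$, $\forall b_1, b_2 \in \mathbb{R}^n$. So:
\[ T(\lambda b) = A \cdot (\lambda b) = \lambda (A \cdot b) = \lambda T(b), \quad \forall \lambda \in \mathbb{R}, b \in \mathbb{R}^n \]
\[ T(b_1 + b_2) = A \cdot (b_1 + b_2) = A \cdot b_1 + A \cdot b_2 = T(b_1) + T(b_2), \quad \forall b_1, b_2 \in \mathbb{R}^n \]
This is so easy that it becomes confusing. Have a good look at it: you're gazing into the minds
of mathematicians. From the outset we did not know that a matrix function was always a
linear map, nor that the reverse holds (as we showed in class). Now however, we have proved
that this is the case and that matrix functions and linear maps are the same thing.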

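* Exercise 3.
Show that $(A \cdot B)^{-1} = B^{-1} \cdot A^{-1}$
Solution:
\[ (A \cdot B)^{-1} (A \cdot B) = I \Rightarrow (A \cdot B)^{-1} (A \cdot B) B^{-1} = I \cdot B^{-1} \Rightarrow (A \cdot B)^{-1} A = B^{-1} \]
\[ \Rightarrow (A \cdot B)^{-1} A \cdot A^{-1} = B^{-1} A^{-1} \Rightarrow (A \cdot B)^{-1} = B^{-1} A^{-1} \]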
Exercise 4.
Determine the dimension of the span of the following vectors:
1 0 3 2

2 7 5 11
3 , 4 , 3 , 14

4 2 7 7
5 5 0 5

Solution:
The span of a set of vectors is the set of all multiples and sums of these vectors. Geometrically,
you can think of it as everywhere you can get by taking steps in the direction of these vectors.
The dimension of a span is the maximum number of linearly independent vectors within the span. So we
have to determine how many linearly independent vectors there are in the set of four vectors
given.
Linear independence was defined as follows: $v_1, v_2, \dots, v_n$ are linearly independent if
$\lambda_1 v_1 + \lambda_2 v_2 + \cdots + \lambda_n v_n = 0$ only if $\lambda_1 = \lambda_2 = \cdots = \lambda_n = 0$. We can write this in matrix notation
as follows:
\[ \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_n \end{pmatrix} = 0 \text{ only if } \lambda_1 = \lambda_2 = \cdots = \lambda_n = 0 \]
This equation has the associated augmented matrix:
\[ \left( \begin{array}{cccc|c} v_1 & v_2 & \cdots & v_n & 0 \end{array} \right) \]
Let's think about this for a second. If we start sweeping this matrix, the right hand side will
never change, as we would be adding zeros.
Now, in general for a set of linear equations there were three possibilities: well-determined,
underdetermined, or inconsistent. However, in this case we already know that the system is
not inconsistent (that is called consistent), because $\lambda_1 = \lambda_2 = \cdots = \lambda_n = 0$ is certainly a solution.
So the question becomes if it is underdetermined and to what extent (how many free variables
are there). If there are no free variables, then all the vectors are linearly independent. If there
are free variables, then there are as many linearly dependent vectors as free variables.
So what this boils down to is just an ordinary matrix sweep, made simpler by the fact that the
right hand side contains only zeros, and then counting the number of lines that do not contain
only zeros.
It will be clearer after our example, so let's turn to that. We sweep:

[Sweep of the augmented matrix: the result has three rows that are not all zeros and two rows of zeros.]
We see that from the four $\lambda$'s we set out to find, one will be free, while the other three are
fixed in terms of the fourth. We don't care about their actual values, so we stop solving here.
The dimension of the span of the set of vectors is three.
We have determined the dimension of the span of the column vectors of a matrix (this span is
often called the column space of the matrix). However, some reflection will show that we also
determined the dimension of the span of the row vectors of our matrix (called the row space).
The reason is as follows: vectors are linearly independent if none of them can be obtained
through elementary operations (additions, subtraction etc.) on the others. That is precisely
what we check by sweeping. We therefore see that the row space also has dimension three. In
general the dimension of the row space is equal to the dimension of the column space. It is
also equal to the rank of the matrix, as we defined it in class. So now we have a few ways of
thinking about the rank of a matrix.
Exercise 5.
Consider the linear map $T: \mathbb{R}^n \to \mathbb{R}^m$ with associated matrix $A = \begin{pmatrix} V_1 & V_2 & \cdots & V_n \end{pmatrix}$, where
$V_i$ is the i-th column vector. Write the outcome of the map in terms of the column vectors and a
general vector $x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$. What does this mean for the relation between the
column space (the space spanned by the column vectors of A) and the image of T?
Solution:
\[ T: (x_1, x_2, \dots, x_n) \mapsto A \cdot x = \begin{pmatrix} V_1 & V_2 & \cdots & V_n \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = x_1 V_1 + \cdots + x_n V_n \]
So the image of T (the possible outcomes that T could give) is equal to the set of all linear
combinations of the column vectors of A. But that set is just the column space of A. So the
column space of A is the image of T.
* Exercise 6.
Consider linear maps $T_1: \mathbb{R}^n \to \mathbb{R}^m$ and $T_2: \mathbb{R}^m \to \mathbb{R}^p$, with associated matrices $A_1$, $A_2$. Show
that the matrix associated with the linear map $T_2 \circ T_1$ is $A_2 \cdot A_1$.
Solution:
What does $T_2 \circ T_1$ mean again? If $x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$, then $T_1(x) \in \mathbb{R}^m$ so that we can
apply $T_2$: $T_2(T_1(x))$. This is meant by $T_2 \circ T_1$. It defines a new linear map $T_3: \mathbb{R}^n \to \mathbb{R}^p$ which
has an associated matrix also. Let's call this matrix C and derive what it is:
\[ T_3(x) = C \cdot x = T_2(T_1(x)) = T_2(A_1 \cdot x) = A_2 \cdot A_1 \cdot x \Rightarrow C = A_2 \cdot A_1. \]

ADVANCED MATHEMATICS TAKE HOME ASSIGNMENTS
ON MATERIAL OF WEEK 2
Take home assignments of week 3 (on material of week 2)
* Exercise 1.
Given is the vector $x = \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix}$
Make a system of orthonormal vectors based on the vector x that span the entire space $\mathbb{R}^3$
Solution
In a three dimensional space there are at maximum three independent vectors. First, we
consider the three unit vectors $e_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$, $e_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$, $e_3 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$
They have the following features:

1) The vectors are perpendicular, which means that the inner product for a pair of these
vectors is zero. For instance for e1 and e2, the inner product is $e_1 \cdot e_2 = 1 \cdot 0 + 0 \cdot 1 + 0 \cdot 0 = 0$.
This also holds true for the other combinations (e1 and e3, e2 and e3).
2) The vectors have a length of one. For instance, the length of the vector e1 is
$\sqrt{1^2 + 0^2 + 0^2} = 1$
3) The three vectors span the entire space $\mathbb{R}^3$. It implies that each vector in $\mathbb{R}^3$ can be written
as a linear combination of e1, e2 and e3. For instance:
\[ \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix} = 1 \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} - 2 \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + 0 \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \]
Next, we consider $x = \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix}$. Our strategy is the following. Step 1: we construct two vectors
that are perpendicular to x. Step 2: we normalize the length of the vectors to one.
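Step 1: The vector $y = \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}$ is perpendicular to $x = \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix}$
Reason: $x \cdot y = 1 \cdot 2 - 2 \cdot 1 + 0 \cdot 0 = 0$
(Thus $\begin{pmatrix} b \\ a \\ 0 \end{pmatrix}$ is perpendicular to $\begin{pmatrix} a \\ -b \\ 0 \end{pmatrix}$)
The vector $z = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$ is perpendicular to $x = \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix}$ and $y = \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}$
Reason: $x \cdot z = 1 \cdot 0 - 2 \cdot 0 + 0 \cdot 1 = 0$ and $y \cdot z = 2 \cdot 0 + 1 \cdot 0 + 0 \cdot 1 = 0$
(Thus $\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$ is perpendicular to $\begin{pmatrix} b \\ a \\ 0 \end{pmatrix}$ and $\begin{pmatrix} a \\ -b \\ 0 \end{pmatrix}$)
Step 2:
a) The length of $x = \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix}$ is $\sqrt{1^2 + (-2)^2 + 0^2} = \sqrt{5}$
Hence, the length of $\frac{1}{\sqrt{5}} x = \frac{1}{\sqrt{5}} \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix}$ is $\sqrt{\frac{1}{5} + \frac{4}{5} + \frac{0}{5}} = \sqrt{\frac{5}{5}} = 1$
b) The length of $y = \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}$ is $\sqrt{2^2 + 1^2 + 0^2} = \sqrt{5}$
Hence, the length of $\frac{1}{\sqrt{5}} y = \frac{1}{\sqrt{5}} \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}$ is $\sqrt{\frac{4}{5} + \frac{1}{5} + \frac{0}{5}} = \sqrt{\frac{5}{5}} = 1$
c) the length of $z = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$ is 1.
Step 3. The orthonormal vectors $\begin{pmatrix} \frac{1}{\sqrt{5}} \\ -\frac{2}{\sqrt{5}} \\ 0 \end{pmatrix}$, $\begin{pmatrix} \frac{2}{\sqrt{5}} \\ \frac{1}{\sqrt{5}} \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}$ span the entire space $\mathbb{R}^3$
It means that each vector can be written as a linear combination of these three vectors.
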
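Exercise 2.
Given is the vector $x = \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix}$ and $A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 2 & 1 \\ -1 & 4 & 2 \end{pmatrix}$
Compute $x'x$ and $x'Ax$

Solution:
\[ x'x = [1 \; -2 \; 0] \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix} = 1 \cdot 1 + (-2) \cdot (-2) + 0 \cdot 0 = 5 \]
\[ x'Ax = [1 \; -2 \; 0] \begin{pmatrix} 1 & 1 & 0 \\ 0 & 2 & 1 \\ -1 & 4 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix}
= [1 \; -2 \; 0] \begin{pmatrix} 1 \cdot 1 + 1 \cdot (-2) + 0 \cdot 0 \\ 0 \cdot 1 + 2 \cdot (-2) + 1 \cdot 0 \\ (-1) \cdot 1 + 4 \cdot (-2) + 2 \cdot 0 \end{pmatrix}
= [1 \; -2 \; 0] \begin{pmatrix} -1 \\ -4 \\ -9 \end{pmatrix} \]
\[ = 1 \cdot (-1) + (-2) \cdot (-4) + 0 \cdot (-9) = 7 \]
Alternative solution:
\[ x'Ax = [1 \; -2 \; 0] \begin{pmatrix} 1 & 1 & 0 \\ 0 & 2 & 1 \\ -1 & 4 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix}
= [1 \cdot 1 + (-2) \cdot 0 + 0 \cdot (-1) \quad 1 \cdot 1 + (-2) \cdot 2 + 0 \cdot 4 \quad 1 \cdot 0 + (-2) \cdot 1 + 0 \cdot 2] \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix} \]
\[ = [1 \; -3 \; -2] \begin{pmatrix} 1 \\ -2 \\ 0 \end{pmatrix} = 1 \cdot 1 + (-3) \cdot (-2) + (-2) \cdot 0 = 7 \]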
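The two products in this exercise are easy to verify by machine. A small sketch, where `quadratic_form` is a hypothetical helper name and the entries of x and A are the ones used in the worked solution (signs as reconstructed there, which is an assumption):

```python
def quadratic_form(x, A):
    """Compute x'Ax: first the vector Ax, then its inner product with x."""
    Ax = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    return sum(xi * yi for xi, yi in zip(x, Ax))


IDENTITY_3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

if __name__ == "__main__":
    x = [1, -2, 0]
    A = [[1, 1, 0], [0, 2, 1], [-1, 4, 2]]
    print(quadratic_form(x, IDENTITY_3))  # x'x, since x'Ix = x'x
    print(quadratic_form(x, A))           # x'Ax
```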

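Exercise 3.
If $A = \begin{pmatrix} 1 & 2 \\ 2 & t \end{pmatrix}$ and $x = \begin{pmatrix} 1 \\ -2 \end{pmatrix}$
For which t has the determinant of the matrix A a negative value?
For which t is $x'Ax > 0$
For which t is $x'A^{-1}x > 0$
Does it matter for these results that A is a symmetric matrix? So that A = A'
So, check whether the results are different for e.g. $B = \begin{pmatrix} 1 & -1 \\ 2 & t \end{pmatrix}$?
Solution:
a) $\det(A) = 1 \cdot t - 2 \cdot 2 = t - 4$
The determinant of A is negative if t < 4
b)
\[ x'Ax = [1 \; -2] \begin{pmatrix} 1 & 2 \\ 2 & t \end{pmatrix} \begin{pmatrix} 1 \\ -2 \end{pmatrix} = [-3 \quad 2 - 2t] \begin{pmatrix} 1 \\ -2 \end{pmatrix} = -3 - 2(2 - 2t) > 0 \]
Thus, $-7 + 4t > 0$, so that t > 7/4
c) $A = \begin{pmatrix} 1 & 2 \\ 2 & t \end{pmatrix}$
\[ A^{-1} = \frac{1}{t - 4} \begin{pmatrix} t & -2 \\ -2 & 1 \end{pmatrix} \]
\[ x'A^{-1}x = \frac{1}{t - 4} [1 \; -2] \begin{pmatrix} t & -2 \\ -2 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \frac{1}{t - 4} [t + 4 \quad -4] \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \frac{t + 12}{t - 4} \]
It is positive if t > 4 or t < -12

We check the results for $B = \begin{pmatrix} 1 & -1 \\ 2 & t \end{pmatrix}$
\[ B^{-1} = \frac{1}{t + 2} \begin{pmatrix} t & 1 \\ -2 & 1 \end{pmatrix} \]
\[ x'B^{-1}x = \frac{1}{t + 2} [1 \; -2] \begin{pmatrix} t & 1 \\ -2 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \frac{1}{t + 2} [t + 4 \quad -1] \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \frac{t + 6}{t + 2} \]
It is positive if t > -2 or t < -6

We do not observe any major difference between A and B.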

Week 3 - Linear Algebra (II)

Klein: Chapter 5 (it excludes Cramer's rule)
Determinant of matrix: Section 5.1
Rank of matrix: See lecture slides
Eigenvalues and eigenvectors: Section 5.3
Diagonalization of a matrix: Section 5.3
Cramer's rule is not part of the material

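Linear Maps
As we have seen last week, the matrix $\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$ can be
interpreted as a function $f(.): \mathbb{R}^n \to \mathbb{R}^m$, in particular:
\[ f\left( \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \right) = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}. \]
We now look at functions $T(.): \mathbb{R}^n \to \mathbb{R}^m$ such that, for
$x, y \in \mathbb{R}^n$, $c \in \mathbb{R}$, $T(x + y) = T(x) + T(y)$ and $T(cx) = cT(x)$.
Functions that satisfy both criteria are called linear functions (or linear
maps or linear transformations).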
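Example 1:
$T(.): \mathbb{R}^3 \to \mathbb{R}^2$, $T\left( \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \right) = \begin{pmatrix} x_1 + x_2 \\ x_1 - 2x_3 \end{pmatrix}$, then
\[ T\left( \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} \right) = T\left( \begin{pmatrix} w_1 + y_1 \\ w_2 + y_2 \\ w_3 + y_3 \end{pmatrix} \right) = \begin{pmatrix} w_1 + y_1 + w_2 + y_2 \\ w_1 + y_1 - 2(w_3 + y_3) \end{pmatrix} \]
\[ = \begin{pmatrix} w_1 + w_2 \\ w_1 - 2w_3 \end{pmatrix} + \begin{pmatrix} y_1 + y_2 \\ y_1 - 2y_3 \end{pmatrix} = T\left( \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix} \right) + T\left( \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} \right) \]
And
\[ T\left( c \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} \right) = T\left( \begin{pmatrix} cy_1 \\ cy_2 \\ cy_3 \end{pmatrix} \right) = \begin{pmatrix} cy_1 + cy_2 \\ cy_1 - 2cy_3 \end{pmatrix} = c \begin{pmatrix} y_1 + y_2 \\ y_1 - 2y_3 \end{pmatrix} = cT\left( \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} \right) \]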

Mapping and matrices
Theorem:
Any linear map can be represented by a matrix and any matrix is a
linear map. That is, they are the same thing.
Matrix representation of a linear map:
Let $e_1, \dots, e_n$ be the unit vectors in $\mathbb{R}^n$, then a matrix representation of
a linear map $T(.): \mathbb{R}^n \to \mathbb{R}^m$ is:

\[ A = \begin{pmatrix} T(e_1) & T(e_2) & \cdots & T(e_n) \end{pmatrix} \]

This shows (if we proved it) that every linear map has a matrix
representation. The other way around (that every matrix represents a
linear map) is done in the tutorial.
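Example 2
We represent the linear map from example 1 as a matrix:
\[ T\left( \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \right) = \begin{pmatrix} 1 + 0 \\ 1 - 2 \cdot 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad T\left( \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \right) = \begin{pmatrix} 0 + 1 \\ 0 - 2 \cdot 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \]
\[ T\left( \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \right) = \begin{pmatrix} 0 + 0 \\ 0 - 2 \cdot 1 \end{pmatrix} = \begin{pmatrix} 0 \\ -2 \end{pmatrix} \]
So the map T is represented by the matrix:
\[ \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & -2 \end{pmatrix} \]
Let's check:
\[ \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & -2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} x_1 + x_2 \\ x_1 - 2x_3 \end{pmatrix} \]
This is indeed our original map T.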
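The recipe "put T(e1), ..., T(en) next to each other as columns" can be tried out directly. In this sketch `T` is the map of example 1 (with the minus sign in the second component, as used in example 2), and `matrix_of` is a hypothetical helper that only makes sense for maps that really are linear.

```python
def T(x):
    """The linear map of example 1: (x1, x2, x3) -> (x1 + x2, x1 - 2*x3)."""
    x1, x2, x3 = x
    return [x1 + x2, x1 - 2 * x3]


def matrix_of(linear_map, n):
    """Matrix whose i-th column is the image of the i-th unit vector."""
    columns = []
    for i in range(n):
        unit = [1 if k == i else 0 for k in range(n)]
        columns.append(linear_map(unit))
    # Turn the list of columns into a list of rows.
    return [list(row) for row in zip(*columns)]


if __name__ == "__main__":
    for row in matrix_of(T, 3):
        print(row)
```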

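Example 3 of linear mapping: a counter clockwise rotation of 90
degrees
We are interested in a matrix A: $\mathbb{R}^2 \to \mathbb{R}^2$, which represents a counter
clockwise rotation.
The matrix $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ can be understood as follows:
First column of the matrix A. Rotating the unit vector $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$
counter clockwise we get $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$.
Second column of the matrix A. Rotating the unit vector $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$
counter clockwise we get $\begin{pmatrix} -1 \\ 0 \end{pmatrix}$.
Thus, the rotation is represented by: $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$
Thus the matrix can be used to rotate any vector counter clockwise.
For instance the vector $\begin{pmatrix} 2 \\ 1 \end{pmatrix}$:
\[ Ax = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 2 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 2 \end{pmatrix} = y \]
The rotation implies that the vector $x = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$ is perpendicular to its
mapping $y = \begin{pmatrix} -1 \\ 2 \end{pmatrix}$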

Because:
1) $x \cdot y = 2 \cdot (-1) + 1 \cdot 2 = 0$ (the inner product of x and y is zero)
2) $\|x\| = \sqrt{5}$ and $\|y\| = \sqrt{5}$, so that both vectors x and y have equal
length.
Implication 1
The matrix $B = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ represents a clockwise rotation. Thus
$B \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ -1 \end{pmatrix}$ and $B \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$
Now we show graphically that $T(x + y) = T(x) + T(y)$. It can be seen
in the picture that it does not matter whether you first rotate your
vectors and then add them, or the other way around, i.e. first adding
them and only then rotating them.

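Example 4: Interpretation of matrix multiplication (I)
We have two matrices A and B:
\[ A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} \]
Question: what is the meaning of ABx?
1) First multiplication Bx: Implies a multiplication of both elements
of the vector x by a factor 3.
\[ Bx = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} x \]
\[ Be_1 = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 3 \\ 0 \end{pmatrix} \quad \text{and} \quad Be_2 = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 3 \end{pmatrix} \]
2) Second multiplication ABx:
Counter clockwise rotation
Thus:
\[ C = AB = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} = \begin{pmatrix} 0 & -3 \\ 3 & 0 \end{pmatrix} \]
Mapping C:
Step 1: Bx: three times larger length of the vector
Step 2: ABx: Counter clockwise rotation by 90 degrees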
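We apply the mapping on $x = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$
\[ \begin{pmatrix} 0 & -3 \\ 3 & 0 \end{pmatrix} \begin{pmatrix} 2 \\ 1 \end{pmatrix} = \begin{pmatrix} -3 \\ 6 \end{pmatrix} \]
$x = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$ is perpendicular to $y = \begin{pmatrix} -3 \\ 6 \end{pmatrix}$
Because:
1) $x \cdot y = 2 \cdot (-3) + 1 \cdot 6 = 0$
2) $\|x\| = \sqrt{5}$ and $\|y\| = \sqrt{45} = 3\sqrt{5}$
so that $\|y\| = 3\|x\|$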
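The composition "first scale by 3, then rotate counter clockwise by 90 degrees" can be replayed step by step. A sketch under the sign convention used above for the rotation matrix; `apply`, `ROTATE` and `SCALE3` are names made up for this illustration.

```python
ROTATE = [[0, -1], [1, 0]]   # counter clockwise rotation by 90 degrees
SCALE3 = [[3, 0], [0, 3]]    # makes every vector three times longer


def apply(M, x):
    """Matrix times vector: each element is a row of M times x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def inner(x, y):
    """Inner product of two vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


if __name__ == "__main__":
    x = [2, 1]
    y = apply(ROTATE, apply(SCALE3, x))   # ABx: B first, then A
    print(y)
    print(inner(x, y))                    # 0, so x and y are perpendicular
```

Applying the scaling first and the rotation second gives the same vector as multiplying x by the single matrix C = AB.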
Finally, we show that $T(cx) = cT(x)$.
Also here, it does not matter whether we first multiply our vector with
a number and then rotate it or the other way around.

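Multiplication by a common factor
Example 5:
$A: \mathbb{R}^3 \to \mathbb{R}^3$, $Ax = 3x$
\[ \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 6 \\ 3 \\ 9 \end{pmatrix} \]
Example 6:
$A: \mathbb{R}^3 \to \mathbb{R}^3$, $Ax = \lambda x$,
\[ \begin{pmatrix} \lambda & 0 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda \end{pmatrix} \begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix} = \lambda \begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 2\lambda \\ \lambda \\ 3\lambda \end{pmatrix} \]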

Interpretation of matrix multiplication (II)
Example 7:
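To make the length of a vector in $\mathbb{R}^3$ three times larger:
\[ A = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix} \]
The product AA makes the length of a vector 9 times larger:
\[ \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix} = \begin{pmatrix} 3^2 & 0 & 0 \\ 0 & 3^2 & 0 \\ 0 & 0 & 3^2 \end{pmatrix} \]
The product AAA makes the length of a vector 27 times larger:
\[ \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix} = \begin{pmatrix} 3^3 & 0 & 0 \\ 0 & 3^3 & 0 \\ 0 & 0 & 3^3 \end{pmatrix} \]
Et cetera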

Space and subspace
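$\mathbb{R}^3$ has a dimension of 3.
It means that it is spanned by three linearly independent vectors, and that at most three vectors in $\mathbb{R}^3$ can be linearly independent.
Example 8:
In $\mathbb{R}^3$ the line spanned by $\begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix}$ has a dimension of 1. The line is referred to as a subspace of $\mathbb{R}^3$. It means that $\lambda \begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix}$, $\lambda \in \mathbb{R}$, is part of this subspace.
Example 9:
In $\mathbb{R}^3$ the subspace spanned by the vectors $\begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix}$ has a dimension of 2. A vector in this subspace can be characterized as
\[ \lambda \begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix} + \mu \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix} = \begin{pmatrix} 2\lambda + \mu \\ \lambda \\ 3\lambda + 2\mu \end{pmatrix}, \qquad \lambda, \mu \in \mathbb{R} \]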

Example 10:
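$A = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 0 & 0 \\ 3 & 2 & 0 \end{pmatrix}$ will be a mapping into the subspace spanned by the vectors $\begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix}$. Thus, the image of each vector will be part of this subspace. The matrix A is non-invertible. It has a rank of 2.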
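The rank claimed in Example 10 can be checked by sweeping (Gaussian elimination). The sketch below uses exact fractions from the Python standard library; the function name `rank` is an illustrative helper, not something defined in the course.

```python
from fractions import Fraction


def rank(matrix):
    """Return the rank of a matrix (list of rows) by Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    pivots = 0
    for col in range(len(rows[0])):
        # look for a row at or below the current pivot row with a non-zero entry
        pivot = next((r for r in range(pivots, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        for r in range(pivots + 1, len(rows)):
            factor = rows[r][col] / rows[pivots][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivots])]
        pivots += 1
        if pivots == len(rows):
            break
    return pivots


A = [[2, 1, 0], [1, 0, 0], [3, 2, 0]]
print(rank(A))
```

A rank below 3 for a 3 x 3 matrix means the columns are linearly dependent, which is why A has no inverse.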

Interpretation of the determinant of a matrix
The determinant measures how much a linear map blows up the
image.
Consider the map in the figure. It maps the blue square into the
green one.
The determinant is the ratio between the area of the green square
and the blue one.
Since the blue one is the unit square, while the green has area 8, the
determinant of the matrix associated with this map is 8.
Negative determinants occur when the map reverses the orientation of the image, for example when it mirrors the image (a reflection).
Note that for any shape to which we apply a linear map, this
relation between area before and after the map is the same. In
higher dimensions we do not talk of the area but of the volume.

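Thus the interpretation of the determinant for $\mathbb{R}^2$ (no proof):
Let's consider the mapping $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$.
We consider the size of the area spanned by:
1) $\begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$
2) $\begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix}$
3) $\begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} a + c \\ b + d \end{pmatrix}$
4) $\begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} c \\ d \end{pmatrix}$

It can be shown that the area spanned by $\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} a \\ b \end{pmatrix}, \begin{pmatrix} a + c \\ b + d \end{pmatrix}, \begin{pmatrix} c \\ d \end{pmatrix}$ is $ad - bc$, which is the determinant of the matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$.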

Example 11:
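The determinant of
\[ A = \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} \]
equals nine. Thus det(A) = 9. (The area of the mapping spanned by the four vectors $\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 3 \\ 0 \end{pmatrix}, \begin{pmatrix} 3 \\ 3 \end{pmatrix}, \begin{pmatrix} 0 \\ 3 \end{pmatrix}$ equals 9.)
For the matrix:
\[ B = \begin{pmatrix} 1/3 & 0 \\ 0 & 1/3 \end{pmatrix} \]
we have det(B) = 1/9.
Properties of determinants:
1) It can be shown that in general $\det(AB) = \det(A)\det(B)$
2) It can be shown that $\det(A^{-1}) = \dfrac{1}{\det(A)}$
Note that for the particular example 11, AB = I:
\[ \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 1/3 & 0 \\ 0 & 1/3 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]
And that the determinant of the unit matrix equals 1.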
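The two determinant properties can be checked on the matrices of Example 11. This is a small sketch with exact fractions; `det2` and `matmul2` are hypothetical helper names introduced only for this illustration.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def matmul2(m, n):
    """Product of two 2 x 2 matrices."""
    return [[sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


A = [[Fraction(3), Fraction(0)], [Fraction(0), Fraction(3)]]
B = [[Fraction(1, 3), Fraction(0)], [Fraction(0), Fraction(1, 3)]]
AB = matmul2(A, B)

print(det2(A), det2(B), det2(AB))
```

Since B is the inverse of A here, det(B) = 1/det(A) and det(AB) = det(I) = 1, consistent with both properties.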

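Determinants in $\mathbb{R}^3$
For $\mathbb{R}^3$ the determinant of A
\[ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \]
can be calculated as follows:
The determinant can be calculated by expanding along any of the three rows or any of the three columns.
1) For the first row of the matrix:
\[ |A| = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \]
For which
$\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}$ is the minor of $a_{11}$ (the determinant of the sub-matrix of $a_{11}$),
$\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}$ is the minor of $a_{12}$,
$\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}$ is the minor of $a_{13}$.
2) One can also take the second row of the matrix A:
\[ |A| = -a_{21} \begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} + a_{22} \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} - a_{23} \begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix} \]
etc.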

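Thus the minor of each element $a_{ij}$ is multiplied by the sign $(-1)^{i+j}$, which gives the pattern:
\[ \begin{pmatrix} + & - & + \\ - & + & - \\ + & - & + \end{pmatrix} \]
Conclusion: The determinant of an upper-triangular matrix
\[ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ 0 & a_{22} & a_{23} \\ 0 & 0 & a_{33} \end{pmatrix} \]
equals $|A| = a_{11} a_{22} a_{33}$

And for a diagonal matrix:
\[ A = \begin{pmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{pmatrix}, \qquad |A| = a_{11} a_{22} a_{33} \]
Hence: the determinant of A is equal to zero if one of the diagonal elements equals zero.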

Example 12
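The determinant of the matrix
\[ A = \begin{pmatrix} 2 & 1 & 1 \\ 1 & 4 & -4 \\ 1 & 0 & 2 \end{pmatrix} \]
equals 6.
For a 3x3 matrix, there are six possibilities to calculate the determinant of the matrix:
First row of A:
\[ 2 \begin{vmatrix} 4 & -4 \\ 0 & 2 \end{vmatrix} - 1 \begin{vmatrix} 1 & -4 \\ 1 & 2 \end{vmatrix} + 1 \begin{vmatrix} 1 & 4 \\ 1 & 0 \end{vmatrix} = 16 - 6 - 4 = 6 \]
or (second row of A):
\[ -1 \begin{vmatrix} 1 & 1 \\ 0 & 2 \end{vmatrix} + 4 \begin{vmatrix} 2 & 1 \\ 1 & 2 \end{vmatrix} + 4 \begin{vmatrix} 2 & 1 \\ 1 & 0 \end{vmatrix} = -2 + 12 - 4 = 6 \]
or (third row of A; the term of the zero element drops out):
\[ 1 \begin{vmatrix} 1 & 1 \\ 4 & -4 \end{vmatrix} + 2 \begin{vmatrix} 2 & 1 \\ 1 & 4 \end{vmatrix} = -8 + 14 = 6 \]
or (first column of A):
\[ 2 \begin{vmatrix} 4 & -4 \\ 0 & 2 \end{vmatrix} - 1 \begin{vmatrix} 1 & 1 \\ 0 & 2 \end{vmatrix} + 1 \begin{vmatrix} 1 & 1 \\ 4 & -4 \end{vmatrix} = 16 - 2 - 8 = 6 \]
or (second column of A; the term of the zero element drops out):
\[ -1 \begin{vmatrix} 1 & -4 \\ 1 & 2 \end{vmatrix} + 4 \begin{vmatrix} 2 & 1 \\ 1 & 2 \end{vmatrix} = -6 + 12 = 6 \]
or (third column of A):
\[ 1 \begin{vmatrix} 1 & 4 \\ 1 & 0 \end{vmatrix} + 4 \begin{vmatrix} 2 & 1 \\ 1 & 0 \end{vmatrix} + 2 \begin{vmatrix} 2 & 1 \\ 1 & 4 \end{vmatrix} = -4 - 4 + 14 = 6 \]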
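The six expansions of Example 12 can be reproduced with a short cofactor-expansion routine. The sketch below assumes the entry in row 2, column 3 of A is -4 (the sign consistent with the stated determinant of 6); the function names are illustrative.

```python
def minor(m, i, j):
    """Sub-matrix of m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def expand_along_row(m, i):
    """Cofactor expansion along row i (0-based)."""
    return sum((-1) ** (i + j) * m[i][j] * det(minor(m, i, j)) for j in range(len(m)))


def expand_along_column(m, j):
    """Cofactor expansion along column j (0-based)."""
    return sum((-1) ** (i + j) * m[i][j] * det(minor(m, i, j)) for i in range(len(m)))


A = [[2, 1, 1], [1, 4, -4], [1, 0, 2]]  # assumed sign: a23 = -4
results = ([expand_along_row(A, i) for i in range(3)]
           + [expand_along_column(A, j) for j in range(3)])
print(results)
```

All six expansions give the same value, which is the point of the example: the choice of row or column only affects how much arithmetic is needed.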

Example 13
The determinant of the matrix $A = \begin{pmatrix} a & 1 & 0 \\ 2 & a & 2 \\ 0 & 1 & a \end{pmatrix}$ equals $a^3 - 4a$.

Eigenvalues and eigenvectors
Compute the eigenvalues of the n x n square matrix A as follows:
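\[ Ax = \lambda x \]
\[ (A - \lambda I)x = 0 \]
For which x is a non-zero vector, and I is an n x n identity matrix. In order for a non-zero vector x to satisfy this equation, $(A - \lambda I)$ must not be invertible. If $(A - \lambda I)$ has an inverse:
\[ (A - \lambda I)^{-1}(A - \lambda I)x = (A - \lambda I)^{-1} 0 \]
or
\[ x = (A - \lambda I)^{-1} 0 \]
thus $x = 0$.
The matrix is non-invertible if its determinant is zero:
\[ |A - \lambda I| = 0 \]
If A is a 2 x 2 matrix: $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$
Then $|A - \lambda I| = 0$ becomes
\[ \begin{vmatrix} a_{11} - \lambda & a_{12} \\ a_{21} & a_{22} - \lambda \end{vmatrix} = 0 \]
The characteristic equation is a second-order polynomial:
\[ (a_{11} - \lambda)(a_{22} - \lambda) - a_{21}a_{12} = 0 \]
or
\[ \lambda^2 - (a_{11} + a_{22})\lambda + (a_{11}a_{22} - a_{21}a_{12}) = 0 \]
For which: the trace of the matrix A is defined as the sum of the diagonal elements: $\mathrm{tr}(A) = a_{11} + a_{22}$, and $|A| = a_{11}a_{22} - a_{21}a_{12}$, so that
\[ \lambda_{1,2} = \frac{\mathrm{tr}A \pm \sqrt{(\mathrm{tr}A)^2 - 4|A|}}{2} \]
Thus two solutions: two real eigenvalues at maximum.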
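The trace-determinant formula for a 2 x 2 matrix translates directly into code. The following is a minimal sketch; the function name `eigenvalues_2x2` is an assumption made for illustration, and the function returns an empty tuple when the discriminant is negative (no real eigenvalues).

```python
import math


def eigenvalues_2x2(m):
    """Real eigenvalues of a 2 x 2 matrix via trace and determinant."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = trace ** 2 - 4 * det
    if disc < 0:
        return ()  # the characteristic equation has no real roots
    root = math.sqrt(disc)
    return ((trace - root) / 2, (trace + root) / 2)


print(eigenvalues_2x2([[2, 2], [2, 5]]))   # symmetric matrix: two real eigenvalues
print(eigenvalues_2x2([[0, -1], [1, 0]]))  # 90-degree rotation: no real eigenvalues
```

The second call illustrates why "two real eigenvalues at maximum" is the right wording: a rotation leaves no direction unchanged, so its characteristic equation has no real solution.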

A matrix is diagonalizable:
The square matrix can be diagonalized as follows:
\[ AP = P\Lambda \]
For which $\Lambda$ is a diagonal matrix with the eigenvalues of A on the main diagonal. For a 2 x 2 matrix A, the diagonal matrix is
\[ \Lambda = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \]
P is a matrix whose columns are the eigenvectors: $P = [p_1, p_2]$.
For which $\lambda_1$ corresponds to the vector $p_1$ and $\lambda_2$ corresponds to the vector $p_2$.
Notation:
The diagonalization $AP = P\Lambda$ can also be written as
a) $A = P \Lambda P^{-1}$
b) $\Lambda = P^{-1} A P$

Example 14
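The matrix $A = \begin{pmatrix} 2 & 2 \\ 2 & 5 \end{pmatrix}$ has the eigenvalues $\lambda_1 = 1$ and $\lambda_2 = 6$
\[ \Lambda = \begin{pmatrix} 1 & 0 \\ 0 & 6 \end{pmatrix} \]
The eigenvectors are $p_1 = \begin{pmatrix} 2 \\ -1 \end{pmatrix}$ for $\lambda_1 = 1$ and $p_2 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$ for $\lambda_2 = 6$.
Note that the eigenvectors are orthogonal (because the matrix A is symmetric, which we will not prove here).
Example 15
The matrix $A = \begin{pmatrix} 2 & 4 \\ 1 & -1 \end{pmatrix}$ has the eigenvalues $\lambda_1 = 3$ and $\lambda_2 = -2$
\[ \Lambda = \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix} \]
The eigenvectors are $p_1 = \begin{pmatrix} 4 \\ 1 \end{pmatrix}$ for $\lambda_1 = 3$ and $p_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}$ for $\lambda_2 = -2$.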

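Example 16 (a hard example because of the third-order characteristic polynomial)
The matrix $A = \begin{pmatrix} 2 & 1 & -1 \\ 2 & 3 & -4 \\ 1 & 1 & -2 \end{pmatrix}$ has the characteristic equation
\[ |A - \lambda I| = \begin{vmatrix} 2 - \lambda & 1 & -1 \\ 2 & 3 - \lambda & -4 \\ 1 & 1 & -2 - \lambda \end{vmatrix} = -(\lambda - 1)(\lambda + 1)(\lambda - 3) = 0 \]
The eigenvalues are 1, -1 and 3.

The associated eigenvectors are:
$\lambda_1 = 1$: $p_1 = \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}$
$\lambda_2 = -1$: $p_2 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}$
$\lambda_3 = 3$: $p_3 = \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix}$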
Wrapping up: under which conditions is a matrix A invertible?
Think of matrices as maps: they are invertible if they are both one-to-one (injective) and onto (surjective).
This map is not injective.
This map is not surjective.
This map is both injective and surjective, and is therefore invertible.

Thus: A square matrix A (n x n) is invertible if:
The matrix A is of full rank n;
The columns of the matrix A are linearly independent.
None of the eigenvalues is equal to zero.
The determinant of A is nonzero.

Technical tutorial of week 3 - Solutions
Exercise 1.
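Let $A = \begin{pmatrix} 1 & 0 & t \\ 2 & 1 & t \\ 0 & 1 & 1 \end{pmatrix}$
For what value of t does A have an inverse?

Solution:
The matrix has no inverse if
\[ \begin{vmatrix} 1 & 0 & t \\ 2 & 1 & t \\ 0 & 1 & 1 \end{vmatrix} = 0 \]
Which gives (expanding along the first row)
\[ 1 \begin{vmatrix} 1 & t \\ 1 & 1 \end{vmatrix} + t \begin{vmatrix} 2 & 1 \\ 0 & 1 \end{vmatrix} = 0 \]
So that
\[ (1 - t) + 2t = 0 \]
Thus the matrix A has an inverse if $t \neq -1$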
* Exercise 2.
Find the determinant of $\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$.
Solution:
We explained the procedure in class.
* Exercise 3.
Given a parallelepiped C of a certain volume and a linear map T with associated matrix A,
find Vol(T(C)).
Solution:
\[ \mathrm{Vol}(T(C)) = |\det(A)| \cdot \mathrm{Vol}(C) \]
* Exercise 4.
Consider the following maps and show that they are linear, without deriving their matrix
representation. Also derive and show their eigenvectors (if any).
a) Blow-up of a vector along the x-axis by 100%, while the y-axis remains unchanged.

b) Projection of a vector on the y-axis.
c) A counter-clockwise rotation of a vector by 90°.
Solution:
a)
In the figure we drew the transformation for a specific vector. Note that algebraically, this transformation amounts to $T: (a_1, a_2) \mapsto (2a_1, a_2)$. In green and yellow are two eigenvectors for this transformation; we return to them shortly. First we show that the map T is linear. For that we have to show that $T(a + b) = T(a) + T(b)$ and $T(ra) = rT(a)$. We do this both graphically and algebraically.
In the figure we constructed $T(a + b)$ and it can be seen to be equal to $T(a) + T(b)$ (what does this mean? Try to do the transformation T on $a + b$ and see that it gives the same result as adding $T(a) + T(b)$.) Algebraically, we have:
\[ T(a + b) = T((a_1, a_2) + (b_1, b_2)) = T(a_1 + b_1, a_2 + b_2) = (2(a_1 + b_1), a_2 + b_2) \]
\[ T(a) + T(b) = T(a_1, a_2) + T(b_1, b_2) = (2a_1, a_2) + (2b_1, b_2) = (2(a_1 + b_1), a_2 + b_2) \]
So they are equal.
Again, we see in the figure that $T(ra) = rT(a)$. Algebraically we have:
\[ T(ra) = T(r(a_1, a_2)) = T(ra_1, ra_2) = (2ra_1, ra_2) \]
\[ rT(a) = rT(a_1, a_2) = r(2a_1, a_2) = (2ra_1, ra_2) \]
We see again that they are equal.
For the eigenvectors, we return to the first figure. Two examples of them are drawn in green and yellow. First consider a vector along the y-axis. What would happen to it under this transformation? Absolutely nothing. So it is an eigenvector with eigenvalue 1.
\[ T(0, y) = 1 \cdot (0, y) \]
Now consider a vector along the x-axis. What will happen to it under T? It will get doubled. So it is an eigenvector with eigenvalue 2.
\[ T(x, 0) = (2x, 0) = 2 \cdot (x, 0) \]
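The two linearity conditions and the two eigenvector claims for the blow-up map of part a) can be spot-checked on sample vectors. This is only an illustration with arbitrarily chosen numbers (a numeric check is not a proof); the helper names `add` and `scale` are hypothetical.

```python
def T(v):
    """Blow-up along the x-axis by 100%: (a1, a2) -> (2*a1, a2)."""
    return (2 * v[0], v[1])


def add(a, b):
    """Component-wise sum of two vectors."""
    return (a[0] + b[0], a[1] + b[1])


def scale(r, a):
    """Multiply a vector by the number r."""
    return (r * a[0], r * a[1])


a, b, r = (1, 3), (-2, 5), 4  # arbitrary sample values

print(T(add(a, b)), add(T(a), T(b)))   # first linearity condition
print(T(scale(r, a)), scale(r, T(a)))  # second linearity condition
print(T((0, 7)), T((3, 0)))            # vectors along the y-axis and the x-axis
```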
b)
To make life easier, everything is in one picture this time. A projection simply takes any vector and only keeps the y-part of it: $T: (x, y) \mapsto (0, y)$. From the picture it is again clear that the map is a linear one. Note that this time we checked $T(ra) = rT(a)$ for r<1.
Algebraically we have:
\[ T(a + b) = T((a_1, a_2) + (b_1, b_2)) = T(a_1 + b_1, a_2 + b_2) = (0, a_2 + b_2) \]
\[ T(a) + T(b) = T(a_1, a_2) + T(b_1, b_2) = (0, a_2) + (0, b_2) = (0, a_2 + b_2) \]
and
\[ T(ra) = T(r(x, y)) = T(rx, ry) = (0, ry) \]
\[ rT(a) = rT(x, y) = r(0, y) = (0, ry) \]
Finally, the eigenvectors of this map: any vector along the y-axis is an eigenvector with eigenvalue 1, as it is unchanged by the map. Any vector along the x-axis is mapped to the zero vector, so it is an eigenvector with eigenvalue 0. No other vector is an eigenvector.
c)
The figure shows that the first condition for linearity holds.
This figure shows that the second condition for linearity also holds. We drew it here for r<0.
It is a bit beyond the scope of this course to derive the map algebraically, so we leave that to
the interested reader (It is actually not very hard. Give it a try).
This map is interesting in that it has no eigenvectors at all. Because it is a rotation, there is no
vector that does not change direction under the map.
* Exercise 5.
Suppose v is an eigenvector of a matrix A, with associated eigenvalue $\lambda$. Show that, for $\mu \neq 0$, $\mu v$ is also an eigenvector with eigenvalue $\lambda$.
Solution:
We know from the fact that v is an eigenvector with eigenvalue $\lambda$ that $Av = \lambda v$. Now we use the linearity of a matrix: $A(\mu v) = \mu (Av) = \mu (\lambda v) = \lambda (\mu v)$. So $\mu v$ is also an eigenvector with eigenvalue $\lambda$.
Exercise 6.
Calculate the eigenvectors and the associated eigenvalues of the following matrix:
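\[ A = \begin{pmatrix} 2 & 0 & 0 \\ -1 & 3 & 5 \\ -1 & 1 & -1 \end{pmatrix} \]
Solution:
We start by solving the characteristic equation:
\[ P(\lambda) = \begin{vmatrix} 2 - \lambda & 0 & 0 \\ -1 & 3 - \lambda & 5 \\ -1 & 1 & -1 - \lambda \end{vmatrix} = 0 = (2 - \lambda)((3 - \lambda)(-1 - \lambda) - 5) = (2 - \lambda)(\lambda^2 - 2\lambda - 8) = (2 - \lambda)(4 - \lambda)(-2 - \lambda) \]
So we find $\lambda_1 = 2, \lambda_2 = 4, \lambda_3 = -2$. (You have to be lucky to be able to solve a cubic equation this way. Don't worry; on an exam you will always be lucky.)
Now we find the associated eigenvectors $v_1, v_2, v_3$ by solving the equation:
\[ (A - \lambda_1 I) v_1 = 0 \]
We solve by sweeping:
\[ \left(\begin{array}{ccc|c} 2 - 2 & 0 & 0 & 0 \\ -1 & 3 - 2 & 5 & 0 \\ -1 & 1 & -1 - 2 & 0 \end{array}\right) = \left(\begin{array}{ccc|c} 0 & 0 & 0 & 0 \\ -1 & 1 & 5 & 0 \\ -1 & 1 & -3 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} -1 & 1 & 5 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \]
So we find $v_1 = \begin{pmatrix} p \\ p \\ 0 \end{pmatrix}$ for any p, with associated eigenvalue $\lambda_1 = 2$.

We check our result:
\[ \begin{pmatrix} 2 & 0 & 0 \\ -1 & 3 & 5 \\ -1 & 1 & -1 \end{pmatrix} \begin{pmatrix} p \\ p \\ 0 \end{pmatrix} = \begin{pmatrix} 2p \\ -p + 3p \\ -p + p \end{pmatrix} = 2 \begin{pmatrix} p \\ p \\ 0 \end{pmatrix} \]
Such a relief!
We move on to $v_2$ with associated eigenvalue $\lambda_2 = 4$:
\[ \left(\begin{array}{ccc|c} 2 - 4 & 0 & 0 & 0 \\ -1 & 3 - 4 & 5 & 0 \\ -1 & 1 & -1 - 4 & 0 \end{array}\right) = \left(\begin{array}{ccc|c} -2 & 0 & 0 & 0 \\ -1 & -1 & 5 & 0 \\ -1 & 1 & -5 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & -1 & 5 & 0 \\ 0 & 1 & -5 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & -5 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \]
So we find $v_2 = \begin{pmatrix} 0 \\ 5q \\ q \end{pmatrix}$ for any q, with associated eigenvalue $\lambda_2 = 4$.
\[ \begin{pmatrix} 2 & 0 & 0 \\ -1 & 3 & 5 \\ -1 & 1 & -1 \end{pmatrix} \begin{pmatrix} 0 \\ 5q \\ q \end{pmatrix} = \begin{pmatrix} 0 \\ 20q \\ 4q \end{pmatrix} = 4 \begin{pmatrix} 0 \\ 5q \\ q \end{pmatrix} \]
Hurrah!
We move on to $v_3$ and $\lambda_3 = -2$.
\[ \left(\begin{array}{ccc|c} 2 + 2 & 0 & 0 & 0 \\ -1 & 3 + 2 & 5 & 0 \\ -1 & 1 & -1 + 2 & 0 \end{array}\right) = \left(\begin{array}{ccc|c} 4 & 0 & 0 & 0 \\ -1 & 5 & 5 & 0 \\ -1 & 1 & 1 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 4 & 0 & 0 & 0 \\ 0 & 5 & 5 & 0 \\ 0 & 1 & 1 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \]
So we find $v_3 = \begin{pmatrix} 0 \\ -r \\ r \end{pmatrix}$ for any r, with associated eigenvalue $\lambda_3 = -2$.
\[ \begin{pmatrix} 2 & 0 & 0 \\ -1 & 3 & 5 \\ -1 & 1 & -1 \end{pmatrix} \begin{pmatrix} 0 \\ -r \\ r \end{pmatrix} = \begin{pmatrix} 0 \\ 2r \\ -2r \end{pmatrix} = -2 \begin{pmatrix} 0 \\ -r \\ r \end{pmatrix} \]
Again it works out and we have found all our eigenvectors.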
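The three checks of Exercise 6 follow one pattern: multiply A by the candidate vector and compare with the eigenvalue times the vector. The sketch below automates this; it assumes the signs of A as implied by the worked solution (negative entries in positions (2,1), (3,1) and (3,3)) and picks p = q = r = 1. The names `matvec` and `is_eigenpair` are illustrative.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]


def is_eigenpair(m, eigenvalue, v):
    """True if m v equals eigenvalue * v for a non-zero vector v."""
    return any(v) and matvec(m, v) == [eigenvalue * x for x in v]


A = [[2, 0, 0], [-1, 3, 5], [-1, 1, -1]]  # signs assumed from the solution
pairs = [(2, [1, 1, 0]), (4, [0, 5, 1]), (-2, [0, -1, 1])]

checks = [is_eigenpair(A, lam, v) for lam, v in pairs]
print(checks)
```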

Broad tutorial (Friday)
* Exercise 1.
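Consider the Markov matrix $A = \begin{pmatrix} \frac{2}{3} & 0 & 0 \\ \frac{1}{6} & \frac{3}{4} & \frac{1}{3} \\ \frac{1}{6} & \frac{1}{4} & \frac{2}{3} \end{pmatrix}$. Diagonalize it and use your result to determine the long term state of the population (no matter what the starting state was) by calculating
$\lim_{k \to \infty} A^k \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$ for general population $\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$.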
Solution:
Background
At the tutorial there was more explanation about the background of Markov transition
matrices. It describes transition in the labour market, for which there are three states (e.g. state
1: employment; state 2: unemployment; state 3: non-participation).
The matrix describes the probabilities in the transitions across the three states between period
t and period t+1.
Note that the numbers in the matrix should be read as conditional probabilities.
2/3 = Pr(employed in period t+1 | someone was employed in period t)
1/6 = Pr(unemployed in period t+1 | someone was employed in period t)
1/6 = Pr(non-participant in period t+1 | someone was employed in period t)
These probabilities add up to one exactly.
3/4 = Pr(unemployed in period t+1 | someone was unemployed in period t)
1/4 = Pr(non-participant in period t+1 | someone was unemployed in period t)
These probabilities add up to one exactly
2/3 = Pr(non participant in period t+1 | someone was non participant in period t)
1/3 = Pr(unemployed in period t+1 | someone was non participant in period t)
These probabilities add up to one exactly
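Note that $x_1 + x_2 + x_3 = 1$.
Thus $Ax = \begin{pmatrix} \frac{2}{3} & 0 & 0 \\ \frac{1}{6} & \frac{3}{4} & \frac{1}{3} \\ \frac{1}{6} & \frac{1}{4} & \frac{2}{3} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$ is informative about the states in period t+1.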
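To diagonalize the transition matrix A, we have to start by finding the eigenvectors and eigenvalues. We look at the characteristic equation (we showed in the lecture why this equation matters):
\[ P(\lambda) = \begin{vmatrix} \frac{2}{3} - \lambda & 0 & 0 \\ \frac{1}{6} & \frac{3}{4} - \lambda & \frac{1}{3} \\ \frac{1}{6} & \frac{1}{4} & \frac{2}{3} - \lambda \end{vmatrix} = \left(\tfrac{2}{3} - \lambda\right)\left(\left(\tfrac{3}{4} - \lambda\right)\left(\tfrac{2}{3} - \lambda\right) - \tfrac{1}{12}\right) = \left(\tfrac{2}{3} - \lambda\right)\left(\tfrac{5}{12} - \tfrac{17}{12}\lambda + \lambda^2\right) = 0 \]
\[ 0 = \left(\tfrac{2}{3} - \lambda\right)(5 - 17\lambda + 12\lambda^2) = \left(\tfrac{2}{3} - \lambda\right)(1 - \lambda)(5 - 12\lambda) \]
This gives us three eigenvalues: $\lambda_1 = 1$, $\lambda_2 = \frac{2}{3}$, $\lambda_3 = \frac{5}{12}$. (You have to be lucky to be able to solve a cubic equation this way. Don't worry; on an exam you will always be lucky.)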
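To get the associated eigenvectors $v_1, v_2, v_3$, we use the equation:
\[ (A - \lambda_1 I) v_1 = 0 \]
We sweep:
\[ \left(\begin{array}{ccc|c} \frac{2}{3} - 1 & 0 & 0 & 0 \\ \frac{1}{6} & \frac{3}{4} - 1 & \frac{1}{3} & 0 \\ \frac{1}{6} & \frac{1}{4} & \frac{2}{3} - 1 & 0 \end{array}\right) = \left(\begin{array}{ccc|c} -\frac{1}{3} & 0 & 0 & 0 \\ \frac{1}{6} & -\frac{1}{4} & \frac{1}{3} & 0 \\ \frac{1}{6} & \frac{1}{4} & -\frac{1}{3} & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & -\frac{1}{4} & \frac{1}{3} & 0 \\ 0 & \frac{1}{4} & -\frac{1}{3} & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 3 & -4 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \]
So $v_1 = \begin{pmatrix} 0 \\ 4r \\ 3r \end{pmatrix}$ for any r. Of course, if $v_1$ is to represent a population, then $r = \frac{1}{7}$.

We check if $v_1$ is indeed an eigenvector with eigenvalue 1:
\[ \begin{pmatrix} \frac{2}{3} & 0 & 0 \\ \frac{1}{6} & \frac{3}{4} & \frac{1}{3} \\ \frac{1}{6} & \frac{1}{4} & \frac{2}{3} \end{pmatrix} \begin{pmatrix} 0 \\ 4r \\ 3r \end{pmatrix} = \begin{pmatrix} 0 \\ 3r + r \\ r + 2r \end{pmatrix} = 1 \cdot \begin{pmatrix} 0 \\ 4r \\ 3r \end{pmatrix} \]
So it is as we wanted.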
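The eigenvector for $\lambda_2 = \frac{2}{3}$:
\[ \left(\begin{array}{ccc|c} \frac{2}{3} - \frac{2}{3} & 0 & 0 & 0 \\ \frac{1}{6} & \frac{3}{4} - \frac{2}{3} & \frac{1}{3} & 0 \\ \frac{1}{6} & \frac{1}{4} & \frac{2}{3} - \frac{2}{3} & 0 \end{array}\right) = \left(\begin{array}{ccc|c} 0 & 0 & 0 & 0 \\ \frac{1}{6} & \frac{1}{12} & \frac{1}{3} & 0 \\ \frac{1}{6} & \frac{1}{4} & 0 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 2 & 3 & 0 & 0 \\ 2 & 1 & 4 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 2 & 3 & 0 & 0 \\ 0 & 1 & -2 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \]
So $v_2 = \begin{pmatrix} -3p \\ 2p \\ p \end{pmatrix}$ for any p. We check if $v_2$ is indeed an eigenvector with eigenvalue $\frac{2}{3}$.
\[ \begin{pmatrix} \frac{2}{3} & 0 & 0 \\ \frac{1}{6} & \frac{3}{4} & \frac{1}{3} \\ \frac{1}{6} & \frac{1}{4} & \frac{2}{3} \end{pmatrix} \begin{pmatrix} -3p \\ 2p \\ p \end{pmatrix} = \begin{pmatrix} -2p \\ -\frac{1}{2}p + \frac{3}{2}p + \frac{1}{3}p \\ -\frac{1}{2}p + \frac{1}{2}p + \frac{2}{3}p \end{pmatrix} = \begin{pmatrix} -2p \\ \frac{4}{3}p \\ \frac{2}{3}p \end{pmatrix} = \frac{2}{3} \begin{pmatrix} -3p \\ 2p \\ p \end{pmatrix} \]
So again we made no error in calculation.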
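Finally we solve for the eigenvector associated with $\lambda_3 = \frac{5}{12}$:
\[ \left(\begin{array}{ccc|c} \frac{2}{3} - \frac{5}{12} & 0 & 0 & 0 \\ \frac{1}{6} & \frac{3}{4} - \frac{5}{12} & \frac{1}{3} & 0 \\ \frac{1}{6} & \frac{1}{4} & \frac{2}{3} - \frac{5}{12} & 0 \end{array}\right) = \left(\begin{array}{ccc|c} \frac{1}{4} & 0 & 0 & 0 \\ \frac{1}{6} & \frac{1}{3} & \frac{1}{3} & 0 \\ \frac{1}{6} & \frac{1}{4} & \frac{1}{4} & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & \frac{1}{3} & \frac{1}{3} & 0 \\ 0 & \frac{1}{4} & \frac{1}{4} & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \]
So $v_3 = \begin{pmatrix} 0 \\ -q \\ q \end{pmatrix}$ for any q. We check if $v_3$ is indeed an eigenvector with eigenvalue $\frac{5}{12}$.
\[ \begin{pmatrix} \frac{2}{3} & 0 & 0 \\ \frac{1}{6} & \frac{3}{4} & \frac{1}{3} \\ \frac{1}{6} & \frac{1}{4} & \frac{2}{3} \end{pmatrix} \begin{pmatrix} 0 \\ -q \\ q \end{pmatrix} = \begin{pmatrix} 0 \\ -\frac{3}{4}q + \frac{1}{3}q \\ -\frac{1}{4}q + \frac{2}{3}q \end{pmatrix} = \begin{pmatrix} 0 \\ -\frac{5}{12}q \\ \frac{5}{12}q \end{pmatrix} = \frac{5}{12} \begin{pmatrix} 0 \\ -q \\ q \end{pmatrix} \]
Yippee.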
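Now we're almost ready to diagonalize our matrix. Recall that we want to write $A = CDC^{-1}$, where C is the matrix of eigenvectors, so $C = \begin{pmatrix} 0 & -3 & 0 \\ 4 & 2 & -1 \\ 3 & 1 & 1 \end{pmatrix}$, where we picked easy values for r, p and q, and D is the diagonal matrix of eigenvalues, so $D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac{2}{3} & 0 \\ 0 & 0 & \frac{5}{12} \end{pmatrix}$.
We still need to find $C^{-1}$. Because we do not show how to find an inverse of a matrix in this course (it's not hard, but we can only do so much), we simply postulate that
\[ C^{-1} = \begin{pmatrix} \frac{1}{7} & \frac{1}{7} & \frac{1}{7} \\ -\frac{1}{3} & 0 & 0 \\ -\frac{2}{21} & -\frac{3}{7} & \frac{4}{7} \end{pmatrix} \]
and check if this is indeed true:
\[ C \cdot C^{-1} = \begin{pmatrix} 0 & -3 & 0 \\ 4 & 2 & -1 \\ 3 & 1 & 1 \end{pmatrix} \begin{pmatrix} \frac{1}{7} & \frac{1}{7} & \frac{1}{7} \\ -\frac{1}{3} & 0 & 0 \\ -\frac{2}{21} & -\frac{3}{7} & \frac{4}{7} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
How good of us.
Finally we have diagonalized A: $A = CDC^{-1}$.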
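Now we want to use this to make it easy to raise A to a certain power. Notice that:
\[ A^k = (CDC^{-1})^k = \underbrace{(CDC^{-1})(CDC^{-1}) \cdots (CDC^{-1})}_{k \text{ times}} = CD^kC^{-1} \]
because every inner product $C^{-1}C$ equals I. So we only have to raise D to the power k, and D is a diagonal matrix, so:
\[ A^k = CD^kC^{-1} = C \begin{pmatrix} 1^k & 0 & 0 \\ 0 & \left(\frac{2}{3}\right)^k & 0 \\ 0 & 0 & \left(\frac{5}{12}\right)^k \end{pmatrix} C^{-1} \]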
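What we wanted was to take the limit of this for k to infinity, to see what would happen after infinitely many periods, i.e. in the long run. But, because two of our eigenvalues are smaller than one, their powers tend to zero as k tends to infinity. So:
\[ \lim_{k \to \infty} A^k = \lim_{k \to \infty} C \begin{pmatrix} 1^k & 0 & 0 \\ 0 & \left(\frac{2}{3}\right)^k & 0 \\ 0 & 0 & \left(\frac{5}{12}\right)^k \end{pmatrix} C^{-1} = C \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} C^{-1} \]
\[ = \begin{pmatrix} 0 & -3 & 0 \\ 4 & 2 & -1 \\ 3 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \frac{1}{7} & \frac{1}{7} & \frac{1}{7} \\ -\frac{1}{3} & 0 & 0 \\ -\frac{2}{21} & -\frac{3}{7} & \frac{4}{7} \end{pmatrix} = \begin{pmatrix} 0 & -3 & 0 \\ 4 & 2 & -1 \\ 3 & 1 & 1 \end{pmatrix} \begin{pmatrix} \frac{1}{7} & \frac{1}{7} & \frac{1}{7} \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ \frac{4}{7} & \frac{4}{7} & \frac{4}{7} \\ \frac{3}{7} & \frac{3}{7} & \frac{3}{7} \end{pmatrix} \]
So now we can calculate:
\[ \lim_{k \to \infty} A^k \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 \\ \frac{4}{7} & \frac{4}{7} & \frac{4}{7} \\ \frac{3}{7} & \frac{3}{7} & \frac{3}{7} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ \frac{4}{7}(x_1 + x_2 + x_3) \\ \frac{3}{7}(x_1 + x_2 + x_3) \end{pmatrix} = \begin{pmatrix} 0 \\ \frac{4}{7} \\ \frac{3}{7} \end{pmatrix} \]
The last equality follows from the fact that a population vector has $x_1 + x_2 + x_3 = 1$. So in the long run the population will be divided over states 2 and 3 in the proportions 4:3, while nobody will be in state 1. (Can you understand just by looking at matrix A why that might be the case?)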
Finally, note that the long run population vector that we found is also an eigenvector with eigenvalue 1. That is no coincidence: it is almost always the case with Markov chains, in fact always if the long-run state is well defined. The reason is as follows: in the long run, we expect a steady state, so nothing changes anymore. So we want a vector such that, if A works on it, we get our vector back. But that is just an eigenvector with eigenvalue 1.
Don't get angry, we did not sweat for nothing. Although it is true that it is much easier to find the long run state by looking for an eigenvector with eigenvalue 1, our method is the only way (I know of) to fairly easily find $A^k$ for any large k.
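The long-run result can also be approached by brute force: apply A repeatedly to a starting population and watch it settle. The sketch below uses exact fractions; the starting vector (everyone employed) and the number of periods (100) are arbitrary choices for illustration, and the names `step` and `iterate` are hypothetical.

```python
from fractions import Fraction as F

A = [[F(2, 3), F(0), F(0)],
     [F(1, 6), F(3, 4), F(1, 3)],
     [F(1, 6), F(1, 4), F(2, 3)]]


def step(m, v):
    """One period: multiply the transition matrix by the population vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]


def iterate(m, v, periods):
    """Apply the transition matrix a given number of times."""
    for _ in range(periods):
        v = step(m, v)
    return v


start = [F(1), F(0), F(0)]          # assumed starting state: everyone in state 1
after_100 = iterate(A, start, 100)  # population after 100 periods
steady = [F(0), F(4, 7), F(3, 7)]   # long-run state derived in the solution

print([float(x) for x in after_100])
print(step(A, steady) == steady)    # the steady state is unchanged by A
```

The diagonalization gives the limit exactly and explains the speed of convergence (the slowest-decaying term shrinks by a factor 2/3 per period), whereas iteration only approximates it.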
*Exercise 2.
We show that the determinant-volume formula holds in a special case and discuss the general
proof.
Solution:
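We start with a unit square C, characterized by the vectors (1, 0), (0, 1) and investigate what happens under the transformation $T: \mathbb{R}^2 \to \mathbb{R}^2$ with associated matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$. If we let T work on our two vectors, we get
\[ \begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix}, \qquad \begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} c \\ d \end{pmatrix} \]
So our transformation on the unit square looks like this: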
Now we know the area of the unit square is 1, so to calculate the determinant of the matrix, all
we have to do is calculate the area of the resulting parallelogram. To calculate this, we need
one geometric fact, which we now illustrate.
The parallelogram with the blue sides has the same area as the parallelogram with the red
sides. In general, if you keep one side of a parallelogram fixed and you move the opposing
side along a parallel line, the area of the resulting parallelogram is the same as that of the
original.
We now use this fact to transform our parallelogram given by (a,b) and (c,d) into a more
manageable one with the same area. We actually use it twice, to transform it into a rectangle:
So we see that the rectangle given by (p,0) and (0,q) has the same area. Clearly that area equals pq. So what we still have to do is calculate p and q. We start with p. What did we do in the first step, the first shift of parallelograms? We took our point (a,b) and went in the direction of (c,d) until we reached the x-axis, so we have $(a, b) - r(c, d) = (p, 0)$ for some r to be determined. We know $b - rd = 0 \Rightarrow r = \frac{b}{d}$ (assuming $d \neq 0$), so $p = a - rc = a - \frac{bc}{d}$.
That's one out of the way. Actually, finding q is easier. We see in the figure that going from (c,d) to (0,q) is a horizontal shift, so the y-coordinate does not change: q = d.
So the area of our rectangle and therefore also our original parallelogram is
\[ pq = \left(a - \frac{bc}{d}\right) d = ad - bc \]
This is indeed the determinant formula for the two-dimensional case.
Of course, this is no proof of the formula in general. For that we would have to show that it holds for all shapes we could start with, not just the unit square. The way to do that is not by extending the argument we gave above (just imagine doing this for general parallelograms in higher-dimensional spaces). Instead what mathematicians do is very different: they look at volume as a function of a shape and show that it must have certain properties (for instance, if you translate a shape, its volume does not change). Then they show that there can be only one such function. And then they show that the determinant also has these properties. Then they can conclude that the determinant indeed gives the volume of a transformation. We won't trace their steps here, as that would take us much too far afield, but you might be interested to see how you can handle such a seemingly awesome problem.
* Exercise 3.
Prove that $\det(A^{-1}) = \dfrac{1}{\det(A)}$ (if $A^{-1}$ exists of course).
Solution:
If $A^{-1}$ exists, then we can say $A \cdot A^{-1} = I$, so $\det(A \cdot A^{-1}) = \det(I) = 1$. Now recall that $\det(C \cdot B) = \det(C)\det(B)$, so $1 = \det(A \cdot A^{-1}) = \det(A)\det(A^{-1}) \Rightarrow \det(A^{-1}) = \dfrac{1}{\det(A)}$.
* Exercise 4.
Prove that $\det(A) = \prod_i \lambda_i$, if A is diagonalizable, where $\lambda_i$ are the eigenvalues of A.
Solution:
We first establish the intuition. Suppose A is 2x2. Because A is diagonalizable, we know it has 2 eigenvectors v, w with associated eigenvalues $\lambda, \mu$. Now consider the parallelogram given by v, w and consider what would happen to it when multiplied by A. We call the original parallelogram P and the new one Q.
In the figure it is quite clear that $\mathrm{Vol}(Q) = \lambda\mu \, \mathrm{Vol}(P)$. Therefore we should find that indeed $\det(A) = \prod_i \lambda_i$. We now proceed to prove this.
From class we know that for a diagonalizable matrix A the following holds:
$AC = CD$, where C is the matrix with for every column an eigenvector of A, and D is a diagonal matrix with the associated eigenvalues on the diagonal. Now we know (using that C is invertible, so $\det(C) \neq 0$):
\[ \det(AC) = \det(CD) \Rightarrow \det(A)\det(C) = \det(C)\det(D) \Rightarrow \det(A) = \det(D) \]
But D is a diagonal matrix, so
\[ \det(A) = \det(D) = \begin{vmatrix} \lambda_1 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & \lambda_n \end{vmatrix} = \prod_i \lambda_i \]
We have quite a powerful apparatus by now. This was not such an easy theorem to understand, but the proof is just a few lines.
Consequence: if one of the eigenvalues of A equals zero, the determinant of the matrix A will be zero. If one of the eigenvalues of A equals zero, the inverse of the matrix A does not exist.

ADVANCED MATHEMATICS TAKE HOME ASSIGNMENT
MATERIAL OF WEEK 3
Question 1
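i) Compute the matrix decomposition $P^{-1}AP = \Lambda$ for $A = \begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix}$
ii) Using the decomposition, compute $A^4$
Solution
Compute the eigenvalues of A:
\[ |A - \lambda I| = \begin{vmatrix} 1 - \lambda & 2 \\ 3 & -\lambda \end{vmatrix} = (1 - \lambda)(-\lambda) - 6 = 0 \]
So that the characteristic equation is $(\lambda - 3)(\lambda + 2) = 0$
For $\lambda = 3$, the eigenvector satisfies
\[ -2x_1 + 2x_2 = 0, \qquad 3x_1 - 3x_2 = 0 \]
so that $v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ is an eigenvector.
For $\lambda = -2$, the eigenvector satisfies
\[ 3x_1 + 2x_2 = 0, \qquad 3x_1 + 2x_2 = 0 \]
so that $v_2 = \begin{pmatrix} 2 \\ -3 \end{pmatrix}$ is an eigenvector.
Thus:
\[ \Lambda = \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix}, \qquad P = \begin{pmatrix} 1 & 2 \\ 1 & -3 \end{pmatrix} \]
\[ P^{-1} = \begin{pmatrix} 1 & 2 \\ 1 & -3 \end{pmatrix}^{-1} = \frac{1}{5} \begin{pmatrix} 3 & 2 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} \frac{3}{5} & \frac{2}{5} \\ \frac{1}{5} & -\frac{1}{5} \end{pmatrix} \]
Check: $PP^{-1} = \begin{pmatrix} 1 & 2 \\ 1 & -3 \end{pmatrix} \begin{pmatrix} \frac{3}{5} & \frac{2}{5} \\ \frac{1}{5} & -\frac{1}{5} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$
Thus:
\[ P^{-1}AP = \begin{pmatrix} \frac{3}{5} & \frac{2}{5} \\ \frac{1}{5} & -\frac{1}{5} \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 1 & -3 \end{pmatrix} = \begin{pmatrix} \frac{9}{5} & \frac{6}{5} \\ -\frac{2}{5} & \frac{2}{5} \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 1 & -3 \end{pmatrix} = \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix} \]
Using the decomposition, compute $A^4$:
\[ A = P \Lambda P^{-1} \]
\[ A^4 = P \Lambda P^{-1} P \Lambda P^{-1} P \Lambda P^{-1} P \Lambda P^{-1} = P \Lambda^4 P^{-1} \]
\[ A^4 = \begin{pmatrix} 1 & 2 \\ 1 & -3 \end{pmatrix} \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix}^4 \begin{pmatrix} \frac{3}{5} & \frac{2}{5} \\ \frac{1}{5} & -\frac{1}{5} \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 1 & -3 \end{pmatrix} \begin{pmatrix} 81 & 0 \\ 0 & 16 \end{pmatrix} \begin{pmatrix} \frac{3}{5} & \frac{2}{5} \\ \frac{1}{5} & -\frac{1}{5} \end{pmatrix} = \begin{pmatrix} 81 & 32 \\ 81 & -48 \end{pmatrix} \begin{pmatrix} \frac{3}{5} & \frac{2}{5} \\ \frac{1}{5} & -\frac{1}{5} \end{pmatrix} \]
\[ = \begin{pmatrix} \frac{275}{5} & \frac{130}{5} \\ \frac{195}{5} & \frac{210}{5} \end{pmatrix} = \begin{pmatrix} 55 & 26 \\ 39 & 42 \end{pmatrix} \]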
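The result for A to the power 4 can be cross-checked in two ways: by multiplying A with itself four times, and through the decomposition. The sketch below does both with exact arithmetic; `matmul` and `power` are illustrative helper names, and the sign choices for P and its inverse follow the solution (second eigenvector taken as (2, -3)).

```python
from fractions import Fraction as F


def matmul(m, n):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*n))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in m]


def power(m, k):
    """Raise a square matrix to the integer power k >= 0 by repeated multiplication."""
    size = len(m)
    result = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    for _ in range(k):
        result = matmul(result, m)
    return result


A = [[1, 2], [3, 0]]
P = [[1, 2], [1, -3]]
P_inv = [[F(3, 5), F(2, 5)], [F(1, 5), F(-1, 5)]]
Lambda4 = [[3 ** 4, 0], [0, (-2) ** 4]]

direct = power(A, 4)
via_decomposition = matmul(matmul(P, Lambda4), P_inv)
print(direct, via_decomposition == direct)
```

For the power 4 direct multiplication is just as quick, but for a large power only the two diagonal entries need to be raised to that power, which is the advantage of the decomposition.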
ADVANCED MATHEMATICS ADDITIONAL EXERCISES

WEEK 3
One additional exercise of week 3
Exercise 1.
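Determine the dimension of the span of the following vectors:
\[ u = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \quad v = \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix}, \quad w = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}. \]
Solution:
The dimension of the span of a set of vectors is equal to the number of linearly independent vectors in the set. Vectors are linearly independent if the following equation has only the solution $\lambda_1 = \lambda_2 = \lambda_3 = 0$:
\[ \lambda_1 u + \lambda_2 v + \lambda_3 w = \lambda_1 \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} + \lambda_2 \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix} + \lambda_3 \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \]
We can rewrite this equation as:
\[ \begin{pmatrix} 1 & 1 & 1 \\ 0 & 2 & 1 \\ 1 & -1 & 0 \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \]
We sweep the matrix:
\[ \left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 2 & 1 & 0 \\ 1 & -1 & 0 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & -2 & -1 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 2 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \]
We can stop here, since we are not interested in the explicit solution and it is clear that we have all the zero-rows that we will get. The number of unknowns (the three lambdas) minus the number of zero rows is the number of non-zero rows, i.e. the number of linearly independent vectors; each zero row corresponds to one free variable, i.e. one lambda that we can pick non-zero. This means that there is one linearly dependent vector among the three and two linearly independent ones. So the dimension of the span of u, v and w is 2.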

Week 4 Calculus
Klein: Chapters 6, 7, and 8
Derivatives as limits: K.6.3.
Differentiability: K.6.3.
Differentials: K.6.4.
Rules of differentiation (up to chain rule): K.7.1. K.7.2.
Second order derivative: K.7.3.
Multivariate functions: K.8.1.
Partial derivatives: K.8.2.
Young's rule: K.8.2.
Chain rule again: K.8.3.
Total differentials: K.8.4.
Implicit differentiation: K.8.4.
Marginal concept vs derivative: K.6.3.
Concavity and second order derivatives: K.7.3.
Homogeneous functions: K.8.3.

Differential calculus
Difference quotient:
Let $y = f(x)$
$x_0$: initial value
\[ \frac{\Delta y}{\Delta x} = \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \]
Example 1:
$y = a + bx + cx^2$
\[ \frac{\Delta y}{\Delta x} = \frac{a + bx_0 + b\Delta x + c(x_0 + \Delta x)^2 - (a + bx_0 + cx_0^2)}{\Delta x} = b + 2cx_0 + c\Delta x \]
Derivative:
Let $y = f(x)$
$x_0$: initial value
\[ \frac{dy}{dx} = \lim_{\Delta x \to 0} \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} \]
Also denoted by: $f'(x_0)$
Definition:
A function is differentiable in an interval if a derivative exists for each point in that interval.
Requirement: the function must be continuous and smooth.
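The step from difference quotient to derivative can be made concrete by letting the step size shrink. The sketch below uses Example 1 with arbitrarily chosen coefficients (a = 1, b = 2, c = 3) and initial value 1.5; these numbers and the function name `difference_quotient` are assumptions made for illustration.

```python
def difference_quotient(f, x0, dx):
    """Average rate of change of f between x0 and x0 + dx."""
    return (f(x0 + dx) - f(x0)) / dx


a, b, c = 1.0, 2.0, 3.0  # illustrative coefficients


def f(x):
    """The quadratic of Example 1: y = a + b x + c x^2."""
    return a + b * x + c * x ** 2


x0 = 1.5
for dx in (1.0, 0.1, 0.001):
    # numerical quotient next to the closed form b + 2 c x0 + c dx
    print(dx, difference_quotient(f, x0, dx), b + 2 * c * x0 + c * dx)
```

As the step shrinks, the term with the step size vanishes and the quotient approaches b + 2c x0, the derivative at the initial value.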

Difference between marginal value and average value:
Compare $f'(x_0)$ and $\dfrac{f(x_0)}{x_0}$
Differential:
\[ dy = f'(x_0) \, dx \]

Rules of differentiation
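Sum-difference rule
For any two functions $f(x)$ and $g(x)$
\[ \frac{d(f(x) \pm g(x))}{dx} = f'(x) \pm g'(x) \]
Proof (with $h(x) = f(x) \pm g(x)$):
\[ h'(x) = \lim_{\Delta x \to 0} \frac{h(x + \Delta x) - h(x)}{\Delta x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) \pm g(x + \Delta x) - (f(x) \pm g(x))}{\Delta x} \]
\[ = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \pm \lim_{\Delta x \to 0} \frac{g(x + \Delta x) - g(x)}{\Delta x} = f'(x) \pm g'(x) \]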

Scalar rule
Let $g(x) = k f(x)$, $k \in \mathbb{R}$
\[ g'(x) = k f'(x) \]
Proof:
\[ g'(x) = \lim_{\Delta x \to 0} \frac{k f(x + \Delta x) - k f(x)}{\Delta x} = k \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} = k f'(x) \]

Product rule
For $f(x) = g(x) h(x)$
\[ f'(x) = g'(x) h(x) + h'(x) g(x) \]

Power rule
For $f(x) = k x^n$, $k, n \in \mathbb{R}$
\[ f'(x) = n k x^{n-1} \]
Exponential function rule
For $f(x) = e^{kx}$, $k \in \mathbb{R}$
\[ f'(x) = k e^{kx} \]
Otherwise stated:
For $f(x) = \exp(kx)$
\[ f'(x) = k \exp(kx) \]

Chain rule
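The derivative of the composite function
\[ y = f(x) = g(h(x)) \]
where $u = h(x)$, so that $g(h(x)) = g(u)$, and both $h(x)$ and $g(u)$ are differentiable functions:
\[ \frac{df(x)}{dx} = g'(h(x)) \, h'(x) \]
Or
\[ \frac{dy}{dx} = \frac{dy}{du} \frac{du}{dx} \]
Natural logarithmic function rule
$f(x) = \ln(x)$
\[ f'(x) = \frac{d \ln(x)}{dx} = \frac{1}{x} \]
Example:
$\log_b(x) = \dfrac{\ln(x)}{\ln(b)}$, so that $\dfrac{d \log_b(x)}{dx} = \dfrac{1}{x \ln(b)}$
Example:
$f(x) = \ln(h(x))$, so that by the chain rule $f'(x) = \dfrac{h'(x)}{h(x)}$
Second derivative
For $y = f(x)$:
\[ \frac{d^2 y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) \]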
Examples:
U (c ) = c
U (c) = ln(c)
Definition: A function is strictly concave over an interval if:
f ''( x) < 0
for all values of x in that interval.
Definition: A function is strictly convex over an interval if:
f ''( x) > 0
for all values of x in that interval.

Taylor Series expansion
The function $y = f(x)$ can be approximated around x = a by:
Linear approximation around x = a:
\[ h(x) = f(a) + b(x - a) \]
Take $b = f'(a)$:
\[ h(x) = f(a) + f'(a)(x - a) \]
Quadratic approximation around x = a:
\[ j(x) = f(a) + f'(a)(x - a) + c(x - a)^2 \]
Take c such that $j''(a) = f''(a)$. Since $j''(x) = 2c$, this gives $c = \frac{1}{2} f''(a)$:
\[ j(x) = f(a) + f'(a)(x - a) + \frac{1}{2} f''(a)(x - a)^2 \]
n-th degree approximation around x = a:
\[ m(x) = \frac{f(a)}{0!} + \frac{f'(a)}{1!}(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \dots + \frac{f^{(n)}(a)}{n!}(x - a)^n \]
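How the approximation improves with the degree n can be seen numerically. The sketch below takes the exponential function as an illustrative choice (not an example from the slides), because every derivative of exp at a equals exp(a), which keeps the code short; `taylor_exp` is a hypothetical name.

```python
import math


def taylor_exp(x, a, n):
    """n-th degree Taylor approximation of exp around a, evaluated at x."""
    # every derivative of exp at a equals exp(a)
    return sum(math.exp(a) * (x - a) ** k / math.factorial(k) for k in range(n + 1))


x, a = 0.5, 0.0
for n in (1, 2, 5, 10):
    approx = taylor_exp(x, a, n)
    print(n, approx, abs(approx - math.exp(x)))
```

The linear approximation (n = 1) gives 1 + x; each extra term reduces the error, and close to a only a few terms are needed.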

Partial derivative
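The partial derivative of $f(x_1, x_2, \dots, x_n)$ with respect to $x_i$ is
\[ \frac{\partial y}{\partial x_i} = \lim_{\Delta x_i \to 0} \frac{f(x_1, \dots, x_i + \Delta x_i, \dots, x_n) - f(x_1, \dots, x_i, \dots, x_n)}{\Delta x_i} \]
Notation: $f_i(x_1, x_2, \dots, x_n)$
Cross derivatives:
\[ f_{11}(x_1, x_2) = \frac{\partial^2 y}{\partial x_1^2} = \frac{\partial^2}{\partial x_1^2} f(x_1, x_2) \]
\[ f_{22}(x_1, x_2) = \frac{\partial^2 y}{\partial x_2^2} = \frac{\partial^2}{\partial x_2^2} f(x_1, x_2) \]
\[ f_{12}(x_1, x_2) = \frac{\partial^2 y}{\partial x_1 \partial x_2} = \frac{\partial^2}{\partial x_1 \partial x_2} f(x_1, x_2) \]
\[ f_{21}(x_1, x_2) = \frac{\partial^2 y}{\partial x_2 \partial x_1} = \frac{\partial^2}{\partial x_2 \partial x_1} f(x_1, x_2) \]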

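Young's theorem
If all the partial derivatives of the function $f(x_1, x_2, \dots, x_n)$ exist and are themselves differentiable with continuous derivatives then
\[ \frac{\partial}{\partial x_i}\left(\frac{\partial f(x_1, x_2, \dots, x_n)}{\partial x_j}\right) = \frac{\partial}{\partial x_j}\left(\frac{\partial f(x_1, x_2, \dots, x_n)}{\partial x_i}\right) \]
or
\[ f_{ji}(x_1, x_2, \dots, x_n) = f_{ij}(x_1, x_2, \dots, x_n) \]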

Composite functions
Multivariate chain rule (I)
If the arguments of the function
\[ y = f(x_1, x_2, \dots, x_n) \]
are themselves differentiable functions of the variable t, such that
\[ x_1 = g^1(t), \; x_2 = g^2(t), \; \dots, \; x_n = g^n(t) \]
Then:
\[ \frac{dy}{dt} = f_1 \frac{dx_1}{dt} + f_2 \frac{dx_2}{dt} + \dots + f_n \frac{dx_n}{dt} \]
with
\[ f_i = \frac{\partial y}{\partial x_i} \]

Multivariate chain rule (II)
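If the arguments of the function
\[ y = f(x_1, x_2, \dots, x_n) \]
are themselves differentiable functions of the variables $t_1, \dots, t_m$, such that
\[ x_1 = g^1(t_1, \dots, t_m), \; x_2 = g^2(t_1, \dots, t_m), \; \dots, \; x_n = g^n(t_1, \dots, t_m) \]
Then:
\[ \frac{\partial y}{\partial t_i} = f_1 \frac{\partial x_1}{\partial t_i} + f_2 \frac{\partial x_2}{\partial t_i} + \dots + f_n \frac{\partial x_n}{\partial t_i} \]
with
\[ f_i = \frac{\partial y}{\partial x_i} \]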

Homogeneous function
A multivariate function $y = f(x_1, x_2, \dots, x_n)$
is homogeneous of degree k if for any number s > 0:
\[ s^k y = f(sx_1, sx_2, \dots, sx_n) \]
Euler's theorem
For any multivariate function $y = f(x_1, x_2, \dots, x_n)$
that is homogeneous of degree k:
\[ ky = x_1 f_1(x_1, x_2, \dots, x_n) + \dots + x_n f_n(x_1, x_2, \dots, x_n) \]

Homothetic function:
A monotone transformation of a homogeneous function:
\[ y = f(x_1, x_2, \dots, x_n), \qquad z = g(y) \]
is a homothetic function if g(y) is strictly monotonic: $g'(y) > 0$ for all y or $g'(y) < 0$ for all y.
Example: the function
\[ y = x_1^4 x_2^5 \]
is homogeneous of degree 9.
The function
\[ z = \ln(y) = 4\ln(x_1) + 5\ln(x_2) \]
is a homothetic function.
The function z is not a homogeneous function.
Property: every homogeneous function is a homothetic function (take $g(y) = y$)

Total differential
The total differential of
\[ y = f(x_1, x_2, \dots, x_n) \]
evaluated at the point $(x_1^0, x_2^0, \dots, x_n^0)$ is
\[ dy = f_1(x_1^0, x_2^0, \dots, x_n^0) \, dx_1 + f_2(x_1^0, x_2^0, \dots, x_n^0) \, dx_2 + \dots + f_n(x_1^0, x_2^0, \dots, x_n^0) \, dx_n \]

Implicit functions
Explicit function:
y = f ( x1 , x2 ,, xn )
Implicit function:
F ( y, x1 , x2 ,, xn ) = k
Implicit function theorem
For $F(y, x_1, x_2, \dots, x_n) = k$
that is defined at $(y^0, x_1^0, x_2^0, \dots, x_n^0)$,
that has continuous derivatives at the point $(y^0, x_1^0, x_2^0, \dots, x_n^0)$, and for which $F_y(y^0, x_1^0, x_2^0, \dots, x_n^0) \neq 0$,
there is a function $y = f(x_1, x_2, \dots, x_n)$
such that:
1) $F(f(x_1^0, x_2^0, \dots, x_n^0), x_1^0, x_2^0, \dots, x_n^0) = k$
2) $y^0 = f(x_1^0, x_2^0, \dots, x_n^0)$
3) $f_i(x_1^0, x_2^0, \dots, x_n^0) = -\dfrac{F_{x_i}(y^0, x_1^0, x_2^0, \dots, x_n^0)}{F_y(y^0, x_1^0, x_2^0, \dots, x_n^0)}$
Example:
\[ 6y + 2x^2 = 10 \]
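For the example 6y + 2x² = 10 the implicit function can be written explicitly, so the slope formula of the theorem can be compared with a direct numerical derivative. The sketch below does this at the arbitrarily chosen point x = 1; all function names are illustrative.

```python
def F_x(y, x):
    """Partial derivative of F(y, x) = 6y + 2x^2 with respect to x."""
    return 4 * x


def F_y(y, x):
    """Partial derivative of F(y, x) = 6y + 2x^2 with respect to y."""
    return 6.0


def f(x):
    """Explicit solution of 6y + 2x^2 = 10 for y."""
    return (10 - 2 * x ** 2) / 6


def implicit_slope(y, x):
    """dy/dx from the implicit function theorem: -F_x / F_y."""
    return -F_x(y, x) / F_y(y, x)


x0 = 1.0
y0 = f(x0)
h = 1e-6
numeric = (f(x0 + h) - f(x0 - h)) / (2 * h)  # central difference on the explicit form

print(implicit_slope(y0, x0), numeric)
```

The value of the theorem lies in cases where no explicit solution for y exists; there the ratio of partial derivatives is the only practical route to the slope.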

Technical tutorial Advanced Mathematics, Week 4
Exercise 1.
Calculate the derivatives of the following functions:
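a) $e^{5x^2 + \sqrt{2x}}$
Solution:
\[ \frac{d}{dx} e^{5x^2 + \sqrt{2x}} = e^{5x^2 + \sqrt{2x}} \cdot \frac{d}{dx}\left(5x^2 + \sqrt{2x}\right) = e^{5x^2 + \sqrt{2x}} \left(10x + \tfrac{1}{2}(2x)^{-\frac{1}{2}} \cdot 2\right) = e^{5x^2 + \sqrt{2x}} \left(10x + \frac{1}{\sqrt{2x}}\right) \]
b) $\log\left(\dfrac{x + 1}{x^2}\right)$ (log: natural logarithm)
Solution:
\[ \frac{d}{dx} \log\left(\frac{x + 1}{x^2}\right) = \frac{x^2}{x + 1} \cdot \frac{d}{dx}\left(\frac{x + 1}{x^2}\right) = \frac{x^2}{x + 1} \cdot \frac{x^2 - (x + 1)2x}{(x^2)^2} = \frac{x^2 - (x + 1)2x}{(x + 1)x^2} \]
c) $e^{\log(x) + 2x}$
Solution:
\[ \frac{d}{dx} e^{\log(x) + 2x} = \frac{d}{dx}\left(e^{\log(x)} e^{2x}\right) = \frac{d}{dx}\left(x e^{2x}\right) = e^{2x} + 2x e^{2x} \]
Alternatively:
\[ \frac{d}{dx} e^{\log(x) + 2x} = e^{\log(x) + 2x} \cdot \frac{d}{dx}\left(\log(x) + 2x\right) = e^{\log(x) + 2x}\left(\frac{1}{x} + 2\right) = e^{\log(x)} e^{2x}\left(\frac{1}{x} + 2\right) = x e^{2x}\left(\frac{1}{x} + 2\right) = e^{2x} + 2x e^{2x} \]
d) $\dfrac{3x^2 + 7\log(x)}{(x + 1)^4}$
Solution:
\[ \frac{d}{dx} \frac{3x^2 + 7\log(x)}{(x + 1)^4} = \frac{(x + 1)^4\left(6x + \frac{7}{x}\right) - (3x^2 + 7\log(x)) \, 4(x + 1)^3}{(x + 1)^8} \]
e) $(e^3)^{\log(3x^4)}$
Solution:
\[ \frac{d}{dx} (e^3)^{\log(3x^4)} = \frac{d}{dx} e^{3\log(3x^4)} = \frac{d}{dx} (3x^4)^3 = 3(3x^4)^2 \cdot 12x^3 = 324x^{11} \]
f) $\sqrt{x^5 + 3x^2}$
Solution:
\[ \frac{d}{dx} \sqrt{x^5 + 3x^2} = \frac{d}{dx} (x^5 + 3x^2)^{\frac{1}{2}} = \frac{1}{2}(x^5 + 3x^2)^{-\frac{1}{2}}(5x^4 + 6x) = \frac{5x^4 + 6x}{2\sqrt{x^5 + 3x^2}} \]
g) $\dfrac{a^{x^3 - 3x} + 3x^2}{\sqrt{3x + 1}}$
Solution:
\[ \frac{d}{dx} \frac{a^{x^3 - 3x} + 3x^2}{\sqrt{3x + 1}} = \frac{\sqrt{3x + 1}\left(\log(a) \, a^{x^3 - 3x}(3x^2 - 3) + 6x\right) - \dfrac{3\left(a^{x^3 - 3x} + 3x^2\right)}{2\sqrt{3x + 1}}}{3x + 1} \]
h) $e^{(3x^3 + 6)^3}$
Solution:
\[ \frac{d}{dx} e^{(3x^3 + 6)^3} = e^{(3x^3 + 6)^3} \cdot 3(3x^3 + 6)^2 \cdot 9x^2 = 27 e^{(3x^3 + 6)^3}(3x^3 + 6)^2 x^2 \]
i) $\log(ax)$, where a > 0 is some constant.
Solution:
\[ \frac{d}{dx} \log(ax) = \frac{1}{ax} \cdot a = \frac{1}{x} \]
Alternatively:
\[ \frac{d}{dx} \log(ax) = \frac{d}{dx}\left(\log(a) + \log(x)\right) = \frac{1}{x} \]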
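Analytic derivatives like these can be sanity-checked against a numerical central difference. The sketch below checks parts a), c), e), f) and i) at the arbitrarily chosen point x = 0.7 (with a = 2.5 in part i)); the evaluation point, the constant and the helper `central_diff` are assumptions made for this illustration.

```python
import math


def central_diff(f, x, h=1e-6):
    """Numerical derivative of f at x by a symmetric difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


# each entry: (function, claimed derivative)
checks = {
    "a": (lambda x: math.exp(5 * x ** 2 + math.sqrt(2 * x)),
          lambda x: math.exp(5 * x ** 2 + math.sqrt(2 * x)) * (10 * x + 1 / math.sqrt(2 * x))),
    "c": (lambda x: math.exp(math.log(x) + 2 * x),
          lambda x: math.exp(2 * x) + 2 * x * math.exp(2 * x)),
    "e": (lambda x: (math.e ** 3) ** math.log(3 * x ** 4),
          lambda x: 324 * x ** 11),
    "f": (lambda x: math.sqrt(x ** 5 + 3 * x ** 2),
          lambda x: (5 * x ** 4 + 6 * x) / (2 * math.sqrt(x ** 5 + 3 * x ** 2))),
    "i": (lambda x: math.log(2.5 * x),
          lambda x: 1 / x),
}

x0 = 0.7
max_rel_error = max(abs(central_diff(f, x0) - d(x0)) / abs(d(x0))
                    for f, d in checks.values())
print(max_rel_error)
```

A small relative error does not prove a formula, but a large one reliably exposes a sign or chain-rule mistake.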
*Exercise 2.
Compute the partial derivatives with respect to x and y of the following functions (they are
called the Cobb-Douglas and the Constant Elasticity of Substitution (CES) function
respectively, and you will see them often in Microeconomics as well as in more mathematical
Macroeconomics):
a) $x^a y^{1-a}$
Solution:
$\frac{\partial}{\partial x} x^a y^{1-a} = a x^{a-1} y^{1-a}$
$\frac{\partial}{\partial y} x^a y^{1-a} = (1-a) x^a y^{-a}$

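b) $R\left(\delta x^{\frac{s-1}{s}} + (1-\delta) y^{\frac{s-1}{s}}\right)^{\frac{s}{s-1}}$
Solution:
$\frac{\partial}{\partial x} R\left(\delta x^{\frac{s-1}{s}} + (1-\delta) y^{\frac{s-1}{s}}\right)^{\frac{s}{s-1}} = R \frac{s}{s-1} \left(\delta x^{\frac{s-1}{s}} + (1-\delta) y^{\frac{s-1}{s}}\right)^{\frac{s}{s-1} - 1} \delta \frac{s-1}{s} x^{\frac{s-1}{s} - 1} = R \left(\delta x^{\frac{s-1}{s}} + (1-\delta) y^{\frac{s-1}{s}}\right)^{\frac{1}{s-1}} \delta x^{-\frac{1}{s}}$
$\frac{\partial}{\partial y} R\left(\delta x^{\frac{s-1}{s}} + (1-\delta) y^{\frac{s-1}{s}}\right)^{\frac{s}{s-1}} = R \frac{s}{s-1} \left(\delta x^{\frac{s-1}{s}} + (1-\delta) y^{\frac{s-1}{s}}\right)^{\frac{s}{s-1} - 1} (1-\delta) \frac{s-1}{s} y^{\frac{s-1}{s} - 1} = R \left(\delta x^{\frac{s-1}{s}} + (1-\delta) y^{\frac{s-1}{s}}\right)^{\frac{1}{s-1}} (1-\delta) y^{-\frac{1}{s}}$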
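Exercise 3.
Calculate the Hessian of the following function. Verify by computation that Young's theorem
holds.
$f(x, y) = y e^{x+2y} + x^2 y$
Solution:
$\frac{\partial}{\partial x}\left(y e^{x+2y} + x^2 y\right) = y e^{x+2y} + 2xy$, $\quad \frac{\partial}{\partial y}\left(y e^{x+2y} + x^2 y\right) = 2y e^{x+2y} + e^{x+2y} + x^2 = (2y+1) e^{x+2y} + x^2$
$\frac{\partial^2}{\partial x^2}\left(y e^{x+2y} + x^2 y\right) = \frac{\partial}{\partial x}\left(y e^{x+2y} + 2xy\right) = y e^{x+2y} + 2y$
$\frac{\partial^2}{\partial y \partial x}\left(y e^{x+2y} + x^2 y\right) = \frac{\partial}{\partial y}\left(y e^{x+2y} + 2xy\right) = (2y+1) e^{x+2y} + 2x$
$\frac{\partial^2}{\partial x \partial y}\left(y e^{x+2y} + x^2 y\right) = \frac{\partial}{\partial x}\left((2y+1) e^{x+2y} + x^2\right) = (2y+1) e^{x+2y} + 2x$
$\frac{\partial^2}{\partial y^2}\left(y e^{x+2y} + x^2 y\right) = \frac{\partial}{\partial y}\left((2y+1) e^{x+2y} + x^2\right) = (4y+4) e^{x+2y}$
Note that $\frac{\partial^2}{\partial y \partial x} f(x, y) = \frac{\partial^2}{\partial x \partial y} f(x, y)$, verifying Young's theorem.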
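The equality of the two cross derivatives can also be checked numerically. The following is a minimal sketch, not part of the original solution: it applies nested central differences to the function of this exercise and compares the result with the analytic cross derivative. The step size and helper names are assumptions.

```python
import math

def f(x, y):
    # Function from Exercise 3
    return y * math.exp(x + 2 * y) + x**2 * y

def d_dx(g, h=1e-4):
    """Return a function approximating the partial derivative of g with respect to x."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2 * h)

def d_dy(g, h=1e-4):
    """Return a function approximating the partial derivative of g with respect to y."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2 * h)

f_yx = d_dy(d_dx(f))  # differentiate with respect to x first, then y
f_xy = d_dx(d_dy(f))  # differentiate with respect to y first, then x

def analytic_mixed(x, y):
    # Cross derivative computed by hand above
    return (2 * y + 1) * math.exp(x + 2 * y) + 2 * x
```

At any test point, for example (0.3, 0.2), `f_yx`, `f_xy` and `analytic_mixed` should agree up to a small numerical error.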
Exercise 4.
Verify that the second order derivative(s) of the following concave functions is/are indeed
negative (be careful: on the domain of the functions!):
a) $-x^2$
Solution:
$\frac{d^2}{dx^2}\left(-x^2\right) = \frac{d}{dx}\left(-2x\right) = -2 < 0$
b) $\sqrt{x}$
Solution:
$\frac{d^2}{dx^2} \sqrt{x} = \frac{d}{dx} \frac{1}{2\sqrt{x}} = -\frac{1}{4x\sqrt{x}} < 0$, as on the domain of the function $x \geq 0$.
c) $\log(x)$
Solution:
$\frac{d^2}{dx^2} \log(x) = \frac{d}{dx} \frac{1}{x} = -\frac{1}{x^2} < 0$
d) $x^a y^{1-a}$, $0 < a < 1$
Solution:
$\frac{\partial^2}{\partial x^2} x^a y^{1-a} = \frac{\partial}{\partial x}\, a x^{a-1} y^{1-a} = (a-1) a x^{a-2} y^{1-a} < 0$
As a - 1 < 0, a > 0 and x, y > 0. The result is similar for y.
Exercise 5.
Establish if the following functions are homogeneous and, if so, of what degree:
a) $f(x, y) = x^2 + y^2$
Solution:
$f(tx, ty) = (tx)^2 + (ty)^2 = t^2 x^2 + t^2 y^2 = t^2 (x^2 + y^2) = t^2 f(x, y)$
So the function is homogeneous of degree 2.
b) $f(x, y) = x^2 y^2$
Solution:
$f(tx, ty) = (tx)^2 (ty)^2 = t^4 x^2 y^2 = t^4 f(x, y)$
So the function is homogeneous of degree 4.
c) $f(x, y) = \sqrt{x + y}$
Solution:
$f(tx, ty) = \sqrt{tx + ty} = \sqrt{t(x + y)} = \sqrt{t}\sqrt{x + y} = t^{\frac{1}{2}} f(x, y)$
So the function is homogeneous of degree $\frac{1}{2}$.
d) $f(x, y) = x^2 y^2 + \sqrt{x + y}$
Solution:
$f(tx, ty) = (tx)^2 (ty)^2 + \sqrt{tx + ty} = t^4 x^2 y^2 + \sqrt{t}\sqrt{x + y}$
We can't go any further with this. The function is not homogeneous (even though it is the sum
of two homogeneous functions).
e) $f(x, y) = \log(x + y)$
Solution:
$f(tx, ty) = \log(tx + ty) = \log(t(x + y)) = \log(t) + \log(x + y) = \log(t) + f(x, y)$
Clearly, this is not a homogeneous function either.
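The definition of homogeneity can be probed numerically as well. The sketch below is an illustration rather than a proof: it solves $f(tx, ty) = t^k f(x, y)$ for k at one point and for two scale factors, and flags the function as non-homogeneous when the two implied degrees differ. The test point (1, 2), the scale factors 2 and 3, and the function names are assumptions, and the check requires f to be positive at the points used.

```python
import math

def estimated_degree(f, x, y, t):
    """Solve f(tx, ty) = t^k f(x, y) for k at one point (needs f > 0 there)."""
    return math.log(f(t * x, t * y) / f(x, y)) / math.log(t)

def looks_homogeneous(f, x=1.0, y=2.0, tol=1e-9):
    """Compare the implied degree for two scale factors.

    Returns (flag, degree). A False flag rules homogeneity out;
    a True flag is only a necessary check at a single point.
    """
    k2 = estimated_degree(f, x, y, 2.0)
    k3 = estimated_degree(f, x, y, 3.0)
    return abs(k2 - k3) < tol, k2

def f_a(x, y):
    return x**2 + y**2

def f_c(x, y):
    return math.sqrt(x + y)

def f_d(x, y):
    return x**2 * y**2 + math.sqrt(x + y)

def f_e(x, y):
    return math.log(x + y)
```

For parts a) and c) the two implied degrees coincide (2 and 1/2), while for parts d) and e) they differ, in line with the answers above.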

Broad tutorial (Friday) - Some additional material on implicit functions
Suppose we are given a function $f: \mathbb{R}^2 \to \mathbb{R}$ and an equation $f(x, y) = 0$. Then this equation
implicitly defines a relation between x and y: for any particular x only certain y obey the
equation (perhaps one, perhaps a few, perhaps none, perhaps infinitely many; call them $y^*$).
The implicit function theorem states that under certain conditions this relation can be locally
represented as a function (so $y^* = g(x)$ for some function g) and it states what the derivative
of this function is, i.e.
$\frac{dy^*}{dx} = \frac{dg(x)}{dx} = -\frac{\partial f(x, y) / \partial x}{\partial f(x, y) / \partial y}$.
In this exercise we will see what "locally represented by a function" means, as well as three
examples of what may go wrong with this local representation when the conditions of the
implicit function theorem do not hold.
* Example 1
Consider first the function $f(x, y) = x^2 + y^2 - 1$. We know from the first week that the equation
$f(x, y) = 0$ now represents a circle with radius 1 and centre (0,0). That is, if we graph all the
points (x,y) for which the equation $f(x, y) = 0$ holds, we get that circle:
This graph sort of looks like the graph of a function, but it is not, because for a function we
want every x-value to give only one y-value. Here, however, for every $x \in (-1, 1)$ there are
two corresponding y-values. But if we zoomed in on the graph, we would get something that
looks like a function:
The part in the zoom is perfectly well behaved: for every x-value there is just one y-value. So
in this part of the graph we can talk about a function $y^* = g(x)$.
Now let's revisit the conditions of the implicit function theorem. There are two: the partial
derivatives $\frac{\partial f(x_0, y_0)}{\partial x}$ and $\frac{\partial f(x_0, y_0)}{\partial y}$ must exist at the point $(x_0, y_0)$ (the point on which
we're zooming in) and $\frac{\partial f(x_0, y_0)}{\partial y} \neq 0$ (otherwise we would be dividing by zero in the formula
$\frac{dy^*}{dx} = \frac{dg(x)}{dx} = -\frac{\partial f(x, y) / \partial x}{\partial f(x, y) / \partial y}$). Let's check these conditions for $f(x, y) = x^2 + y^2 - 1$:
$\frac{\partial f(x_0, y_0)}{\partial x} = 2x_0$, $\frac{\partial f(x_0, y_0)}{\partial y} = 2y_0$. These both exist everywhere (we'll see in the third
example a case where this isn't so). However, for $y_0 = 0$, $\frac{\partial f(x_0, y_0)}{\partial y} = 0$, so our second
condition is violated. What points on the graph are we talking about? Well, let's check:
$f(x, 0) = x^2 + 0^2 - 1 = x^2 - 1 = 0$, so $x = -1 \vee x = 1$. Let's look at these points (-1,0) and (1,0):
So what goes wrong here? At these points, no matter how far we zoom in, there will always
be two y-values. The problem is that at y=0 the graph goes straight up. A rough way of
thinking about this is that, as $\frac{\partial f(x_0, 0)}{\partial y} = 0$, $\frac{dg(x)}{dx} = -\frac{\partial f(x, 0) / \partial x}{\partial f(x, 0) / \partial y} = \pm\infty$, so the function goes
straight up or down, leading to two y-values for a particular x-value.
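The circle example can be turned into a small computation. The sketch below is an illustration under assumed names: it evaluates the implicit function theorem formula $-f_x / f_y$ for $f(x, y) = x^2 + y^2 - 1$, compares it with the slope of the explicit local solution $g(x) = \sqrt{1 - x^2}$ on the upper half of the circle, and reports that no slope exists where $f_y = 0$.

```python
import math

def implicit_slope(f_x, f_y, x, y):
    """Slope dy/dx = -f_x / f_y; returns None when f_y is zero (theorem not applicable)."""
    denom = f_y(x, y)
    if denom == 0:
        return None
    return -f_x(x, y) / denom

def circle_fx(x, y):
    # Partial derivative of x^2 + y^2 - 1 with respect to x
    return 2 * x

def circle_fy(x, y):
    # Partial derivative of x^2 + y^2 - 1 with respect to y
    return 2 * y

def upper_half_slope(x):
    """Derivative of the explicit local solution g(x) = sqrt(1 - x^2)."""
    return -x / math.sqrt(1 - x**2)
```

At the point (0.6, 0.8) both routes give a slope of about -0.75, while at (1, 0) the function `implicit_slope` returns `None`, matching the trouble spots found above.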

* Example 2
Another example is the equation $f(x, y) = 4x^2(1 - x^2) - y^2 = 0$. Its graph looks like this:
Just looking at the graph, we see immediately that things will go wrong in three points:
(-1,0), (0,0) and (1,0). So we imagine that our conditions will fail there. Let's check:
$\frac{\partial f(x_0, y_0)}{\partial x} = \left.\frac{\partial}{\partial x}\left(4x^2(1 - x^2) - y^2\right)\right|_{x = x_0} = 8x_0(1 - x_0^2) - 8x_0^3$
$\frac{\partial f(x_0, y_0)}{\partial y} = \left.\frac{\partial}{\partial y}\left(4x^2(1 - x^2) - y^2\right)\right|_{y = y_0} = -2y_0$
Both partial derivatives are well-defined, but for $y_0 = 0$, $\frac{\partial f(x_0, y_0)}{\partial y} = 0$. For what values of x
does this hold? Well:
$f(x, 0) = 4x^2(1 - x^2) - 0^2 = 0 \Leftrightarrow 4x^2(1 - x^2) = 0 \Leftrightarrow x = -1 \vee x = 0 \vee x = 1$
So we indeed find that we have trouble at the points (-1,0), (0,0) and (1,0).

* Example 3
As a final example, let's see what can go wrong if the partial derivative is not defined. Let's
first consider why a derivative might not be defined. Consider the function $f(x) = |x - 2|$,
that is the absolute value of x-2. Plotted, it looks thus:
Now let's think about the derivative of this function at x=2. This should be the slope of the
tangent line at x=2, but because of the kink in the function, there is no clear-cut tangent line.
Therefore, this function does not have a derivative at x=2.
Now consider the implicit relation $f(x, y) = |y - 2| - x^2 = 0$. Plotted, it looks like this:
Clearly, here we have trouble at (0,2): no matter how far we zoom in, we never get a
function. So let's check our conditions:
$\frac{\partial f(x_0, y_0)}{\partial x} = \left.\frac{\partial}{\partial x}\left(|y - 2| - x^2\right)\right|_{x = x_0} = -2x_0$
$\frac{\partial f(x_0, y_0)}{\partial y} = \left.\frac{\partial}{\partial y}\left(|y - 2| - x^2\right)\right|_{y = y_0} = \begin{cases} 1 & \text{if } y_0 > 2 \\ \text{not defined} & \text{if } y_0 = 2 \\ -1 & \text{if } y_0 < 2 \end{cases}$
So for $y_0 = 2$ we run into trouble. The associated x is: $f(x, y = 2) = |2 - 2| - x^2 = 0 \Leftrightarrow x = 0$, so
we find that it is indeed (0,2) that is causing us headaches.

Broad tutorial of Friday Week 4
Exercise 1.
Calculate the Marginal Rate of Substitution (MRS) by total differential and by implicit
differentiation and show its relation to marginal utility. Do it first for $U = \sqrt{xy}$ and then for
the general case (*)
Solution:
The MRS between two goods is the amount you have to receive of one good if you give up
the other good to keep utility constant. The total differential of a function is the very small
change (called infinitesimal change) in that function for infinitesimal changes in its variables.
Given that $U = \sqrt{xy}$,
$dU = \frac{\partial U}{\partial x} dx + \frac{\partial U}{\partial y} dy = \frac{\partial \sqrt{xy}}{\partial x} dx + \frac{\partial \sqrt{xy}}{\partial y} dy = \frac{1}{2}\sqrt{\frac{y}{x}}\, dx + \frac{1}{2}\sqrt{\frac{x}{y}}\, dy$
Because we are interested in the MRS, we want to keep utility constant, so we impose dU=0.
Furthermore, since the MRS is the change in y for a given change in x, $MRS = \frac{dy}{dx}$. We solve
for that:
$0 = \frac{1}{2}\sqrt{\frac{y}{x}}\, dx + \frac{1}{2}\sqrt{\frac{x}{y}}\, dy \Leftrightarrow dy = -\frac{\sqrt{y/x}}{\sqrt{x/y}}\, dx \Leftrightarrow \frac{dy}{dx} = -\frac{y}{x}$
So what does this mean? Well, if you have, say, 5 units of good y and 10 units of good x, then,
if you had to give up an infinitesimal amount of x, you would require $\frac{5}{10} = \frac{1}{2}$ a unit of y
per unit of x given up as compensation to keep your utility unchanged. The minus sign indicates
the opposite directions: you receive one and relinquish the other.
The implicit differentiation method to derive this is very similar. In fact, it is a bit more
precise, since differentials are not completely well-defined: it is not clear what exactly an
infinitesimal change is. However, intuitively the total differential is easier to grapple with.
Since the method is more precise anyway, we make one other change in the direction of
precision. It is slightly misleading to speak of $\frac{dy}{dx}$, since we appear not even to have defined
a relationship between x and y. And how could there be such a relationship: x and y are just
amounts of goods; you could have as many as you like. Of course the relationship between the
two comes from the fact that we impose that utility is fixed. In effect we imposed a relation
between x and y when we imposed dU=0 (keeping utility fixed). At that point we were no
longer speaking of general x and y, but of particular related values $\bar{x}$ and $\bar{y}$, as we shall now
call them. We are thus interested in $\frac{d\bar{y}}{d\bar{x}}$.
For the implicit differentiation approach we start as follows. We impose again that utility is
fixed:
$U(\bar{x}, \bar{y}) = \sqrt{\bar{x}\bar{y}} = \bar{U}$, where $\bar{U}$ is some constant, so that
$f(\bar{x}, \bar{y}) := U(\bar{x}, \bar{y}) - \bar{U} = \sqrt{\bar{x}\bar{y}} - \bar{U} = 0$ (:= means that you define what is on the left of the
sign as what is on the right of the sign, so we defined f here).
We can now apply the implicit function theorem to $f(\bar{x}, \bar{y})$:
$\frac{d\bar{y}}{d\bar{x}} = -\frac{\partial f / \partial \bar{x}}{\partial f / \partial \bar{y}} = -\frac{\frac{1}{2}\sqrt{\bar{y}/\bar{x}}}{\frac{1}{2}\sqrt{\bar{x}/\bar{y}}} = -\frac{\bar{y}}{\bar{x}}$. Lo and behold, the result is the same.
* Finally we derive a similar result for general utility functions. This is primarily to show that
more abstract calculations, although they may seem a bit more confusing, are often easier
than concrete examples.
For the total differential approach we again have:
$0 = dU = \frac{\partial U}{\partial x} dx + \frac{\partial U}{\partial y} dy \Leftrightarrow \frac{\partial U}{\partial y} dy = -\frac{\partial U}{\partial x} dx \Leftrightarrow \frac{dy}{dx} = -\frac{\partial U / \partial x}{\partial U / \partial y}$
In fact this last line is just the implicit function theorem (the constant $\bar{U}$ drops out after
differentiation). Indeed the total differential approach is one way of proving the implicit
function theorem. It just remains to link this result to the marginal utilities, but that is easy.
The marginal utility $MU_x$ of x is just $\frac{\partial U}{\partial x}$ and similarly for y. So $MRS = \frac{dy}{dx} = -\frac{MU_x}{MU_y}$. That
result was easier to derive than the specific case! (In fact, since positive numbers are easier to
work with, MRS is often defined as $MRS := -\frac{dy}{dx} \left(= \frac{MU_x}{MU_y}\right)$. This is just a matter of notation.)
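The result $\frac{dy}{dx} = -\frac{y}{x}$ for $U = \sqrt{xy}$ can be checked by actually moving along an indifference curve. The sketch below is an illustration with assumed helper names: it fixes the utility level at a bundle, solves the indifference curve for y, and differentiates it numerically.

```python
import math

def utility(x, y):
    # U = sqrt(xy), the specific utility function of this exercise
    return math.sqrt(x * y)

def y_on_indifference_curve(x, u):
    """Amount of y that keeps utility at level u when holding x units of good x."""
    return u**2 / x

def mrs_numeric(x, y, h=1e-6):
    """Slope dy/dx of the indifference curve through (x, y), by central differences."""
    u = utility(x, y)
    return (y_on_indifference_curve(x + h, u) - y_on_indifference_curve(x - h, u)) / (2 * h)

def mrs_formula(x, y):
    """Slope from the total differential: -MU_x / MU_y = -y / x."""
    return -y / x
```

For the bundle used in the text (10 units of x and 5 units of y) both functions return a slope of about -0.5, that is, half a unit of y per unit of x.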
*Exercise 2.
Derive the derivative of log(x) by differentiating $e^{\log(x)} = x$ and using your knowledge of the
derivative of $e^y$ and the inverse of the exponential function.
Solution:
This may seem like a silly question, since we know the derivative of log(x) just as well as we
know the derivative of $e^x$. However, log(x) is actually defined as the inverse of $e^x$, so all the
information we have on log(x) comes from our knowledge of $e^x$. To work then:
$\frac{d}{dx} e^{\log(x)} = \frac{d}{dx} x$, working out the left hand side we get:
$\frac{d}{dx} e^{\log(x)} = \frac{d\, e^{\log(x)}}{d \log(x)} \cdot \frac{d}{dx}\log(x) = e^{\log(x)} \frac{d}{dx}\log(x) = x \frac{d}{dx}\log(x)$,
whilst the right hand side gives:
$\frac{d}{dx} x = 1$
So:
$x \frac{d}{dx}\log(x) = 1 \Leftrightarrow \frac{d}{dx}\log(x) = \frac{1}{x}$
So now we have actually proved that the derivative of log(x) is as we always assumed it was.

Exercise 3.
Show that $\varepsilon_{y,x} = \frac{d \log(y)}{d \log(x)}$, where $\varepsilon_{y,x}$ is the elasticity of y with respect to x, by using
differentials.
Solution:
This is actually not that hard, but it turns out to be rather useful in many areas. We derive the
total differential of f(y)=log(y).
$df(y) = d\log(y) = \frac{\partial \log(y)}{\partial y} dy = \frac{1}{y} dy$. Similarly $d\log(x) = \frac{1}{x} dx$. Dividing the two, we get:
$\frac{d\log(y)}{d\log(x)} = \frac{\left(\frac{dy}{y}\right)}{\left(\frac{dx}{x}\right)} = \frac{dy}{y}\frac{x}{dx} = \frac{dy}{dx}\frac{x}{y} = \varepsilon_{y,x}$
Exercise 4.
Show that the demand function $Q = \alpha P^{-\beta}$ ($\alpha$ and $\beta$ constants) exhibits constant elasticity, as
well as the derived log-linear demand function $\log(Q) = \log(\alpha) - \beta\log(P)$. Next week we will
see that this demand function arises from Cobb-Douglas utility functions.
Solution:
$\varepsilon_{Q,P} = \frac{dQ}{dP}\frac{P}{Q} = -\beta\alpha P^{-\beta-1}\frac{P}{\alpha P^{-\beta}} = -\beta\frac{P^{-\beta-1} P}{P^{-\beta}} = -\beta$, which is constant (independent of
price).
$\varepsilon_{\log(Q),\log(P)} = \frac{d\log Q}{d\log P} = -\beta$
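That the elasticity does not depend on the price can be seen by evaluating it at several prices. The sketch below is illustrative only: the parameter values (a scale of 3 and a price exponent of -1.5) and the function names are assumptions, and the derivative is approximated by a central difference.

```python
def demand(p, alpha=3.0, beta=1.5):
    """Constant-elasticity demand Q = alpha * P^(-beta); parameter values are illustrative."""
    return alpha * p ** (-beta)

def elasticity(q, p, rel_step=1e-6):
    """Point elasticity (dQ/dP) * (P/Q), with dQ/dP from a central difference."""
    h = rel_step * p
    dq_dp = (q(p + h) - q(p - h)) / (2 * h)
    return dq_dp * p / q(p)

# Elasticities at three different prices; each should be close to -beta.
elasticities = [elasticity(demand, p) for p in (0.5, 2.0, 10.0)]
```

All three entries of `elasticities` come out close to -1.5, the (negative of the) exponent on price, whatever the price level.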
* Exercise 5.
Estimate the effect of a change in x on f(x,y(x)), where:
a) x is ability, y is education and f(x,y(x)) is income.
b) $\pi(p, D(p)) = pD(p) - cD(p)$
Solution:
a) $\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}$. What does this mean? The total effect of ability on income is
composed of two separate effects: the direct effect of ability on income $\frac{\partial f}{\partial x}$, which is
positive (if you're smarter you'll generally earn more money), plus the effect of ability on
education $\frac{dy}{dx}$ (which is presumably also positive) times the effect of education on income
$\frac{\partial f}{\partial y}$, which is again positive. Since all terms are positive, the effect of ability on income
will also be positive. Should it be the case that very able people actually get less education
(say because they think it beneath them, or because they're so smart they don't function in
a rigid system) then for these high values of x $\frac{dy}{dx}$ becomes negative, while $\frac{\partial f}{\partial y}$ is still
positive. Then for this range of ability the marginal effect of ability is unclear: on the one
hand it increases your income directly, on the other it decreases your education and
through that effect reduces income.
The point here is that writing down this equation allows you to see all the partial effects.
Your analysis will then be as convincing as your explanation of the signs of the
derivatives is.
b) $\frac{d}{dp}\pi(p, D(p)) = D(p) + (p - c)D'(p)$ (Note that we sometimes denote the derivative of
functions of one variable as $\frac{d}{dx}f(x) = f'(x)$).
Here p is the price, D is demand, $\pi$ is profit and c is the constant unit cost of production.
So $\frac{d\pi(p, D(p))}{dp}$ is the effect of a change in price on the profit. The marginal effect of a price
increase is that it allows you to get a little more money from the people you sell to (D(p)),
while it costs you some demand (D'(p)), which in turn costs you p-c per customer, as you
don't get your money, but you also don't incur the costs of production for them. In fact the
total effect is unambiguously positive if p - c < 0, which makes sense: if your price is so low
that you make a loss each time you sell, you can increase profits by increasing the price.
Otherwise the effect depends on whether you think the loss in customers will be outweighed
by the extra profit per customer you make.

ADVANCED MATHEMATICS TAKE HOME ASSIGNMENTS
WEEK 4
Exercise 1.
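Calculate the Hessian of the following function:
$f(x, y) = xy^2 + e^{x^3 - 2xy} + \log x$
Solution:
$\frac{\partial f(x, y)}{\partial x} = y^2 + (3x^2 - 2y)e^{x^3 - 2xy} + \frac{1}{x}$
$\frac{\partial f(x, y)}{\partial y} = 2xy - 2xe^{x^3 - 2xy}$
$\frac{\partial^2 f(x, y)}{\partial x^2} = [6x + (3x^2 - 2y)^2]e^{x^3 - 2xy} - \frac{1}{x^2}$
$\frac{\partial^2 f(x, y)}{\partial y \partial x} = 2y - 2e^{x^3 - 2xy} - 2x(3x^2 - 2y)e^{x^3 - 2xy} = 2y - [2 + 2x(3x^2 - 2y)]e^{x^3 - 2xy}$
$\frac{\partial^2 f(x, y)}{\partial x \partial y} = 2y - 2e^{x^3 - 2xy} - 2x(3x^2 - 2y)e^{x^3 - 2xy} = 2y - [2 + 2x(3x^2 - 2y)]e^{x^3 - 2xy}$
$\frac{\partial^2 f(x, y)}{\partial y^2} = 2x + 4x^2 e^{x^3 - 2xy}$
Thus the Hessian:
$\begin{pmatrix} \frac{\partial^2 f(x, y)}{\partial x^2} & \frac{\partial^2 f(x, y)}{\partial x \partial y} \\ \frac{\partial^2 f(x, y)}{\partial y \partial x} & \frac{\partial^2 f(x, y)}{\partial y^2} \end{pmatrix} = \begin{pmatrix} [6x + (3x^2 - 2y)^2]e^{x^3 - 2xy} - \frac{1}{x^2} & 2y - [2 + 2x(3x^2 - 2y)]e^{x^3 - 2xy} \\ 2y - [2 + 2x(3x^2 - 2y)]e^{x^3 - 2xy} & 2x + 4x^2 e^{x^3 - 2xy} \end{pmatrix}$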
Exercise 2.
Suppose that $f(x_1, x_2)$ is a homogeneous function of degree 2 and $g(x_1, x_2)$ is a homogeneous
function of degree 6. Show that the function $h(x_1, x_2) = f(x_1, x_2)^3 + g(x_1, x_2)$ is homogeneous
and determine the degree.
Solution:
It is given that
$f(sx_1, sx_2) = s^2 f(x_1, x_2)$
$g(sx_1, sx_2) = s^6 g(x_1, x_2)$
$h(sx_1, sx_2) = f(sx_1, sx_2)^3 + g(sx_1, sx_2) = [s^2 f(x_1, x_2)]^3 + s^6 g(x_1, x_2) = s^6 [f(x_1, x_2)^3 + g(x_1, x_2)] = s^6 h(x_1, x_2)$
Hence, h(.) is homogeneous of degree 6.
* Exercise 3.
Consider the implicit relation between x and y defined by:
$(x - 3)^2 + (y + 3)^2 = 9$. Use the implicit function theorem to find the derivative $\frac{dy}{dx}$.
You will get an outcome that depends on both x and y. Use the original relation between x and
y to determine the value of the derivative $\frac{dy}{dx}$ at x=3 and at x=1. For the latter case you will get
two possible outcomes.
Also find the two points where the relation cannot be represented as a function y(x).
Finally, draw a picture to elucidate your findings.
Solution:
The relation $(x - 3)^2 + (y + 3)^2 = 9$ describes a circle with centre (3, -3) and radius 3.
We construct an implicit function: $z = (x - 3)^2 + (y + 3)^2 - 9$
$dz = 2(x - 3)dx + 2(y + 3)dy$
$dz = 0$ gives $\frac{dy}{dx} = -\frac{(x - 3)}{y + 3}$
At x=3 we get y=0 and y=-6, and $\frac{dy}{dx} = 0$ at both points.
At x=1 we get $(y + 3)^2 = 5$, so $y = -3 + \sqrt{5} \approx -0.764$ with $\frac{dy}{dx} = \frac{2}{\sqrt{5}} \approx 0.894$, and $y = -3 - \sqrt{5} \approx -5.236$ with $\frac{dy}{dx} = -\frac{2}{\sqrt{5}} \approx -0.894$.
$\frac{dy}{dx}$ does not exist at (0,-3) and (6,-3)
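The numbers in this solution are easy to reproduce. The sketch below is an illustration with assumed function names: it solves the circle equation for the two y-values belonging to a given x and evaluates the slope formula $-\frac{x - 3}{y + 3}$, returning no slope where the tangent is vertical.

```python
import math

def y_values(x):
    """Both y on the circle (x-3)^2 + (y+3)^2 = 9 for a given x (empty list if none)."""
    rest = 9 - (x - 3) ** 2
    if rest < 0:
        return []
    root = math.sqrt(rest)
    return [-3 + root, -3 - root]

def slope(x, y):
    """dy/dx = -(x - 3) / (y + 3); None where y + 3 = 0 (vertical tangent)."""
    if y + 3 == 0:
        return None
    return -(x - 3) / (y + 3)

upper, lower = y_values(1.0)
slopes_at_1 = (slope(1.0, upper), slope(1.0, lower))
```

At x = 1 this gives y of about -0.764 and -5.236 with slopes of about 0.894 and -0.894, and `slope(0.0, -3.0)` and `slope(6.0, -3.0)` both return `None`.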

Week 5 + first half week 6 - Optimization
Klein: Chapters 9, 10, and 11
Local versus global optima, strict optimum (K.9.1)
Univariate calculus
  First-order condition, stationary point (K.9.1)
  Second-order condition (K.9.1)
  Concavity (K.9.1)
Multivariate calculus
  First-order condition, stationary/saddle point (K.10.1)
  Hessian matrix (K.10.3)
  Second-order condition in terms of semidefiniteness (K.10.3)
  Concavity, convexity and semidefiniteness (see lecture slides)
Constrained optimization
  Substitution (K.11.1)
  Lagrange (K.11.2)
  Multipliers (K.11.2)
  Value function (K.11.2)
  Envelope theorem (K.11.2)
  Convex constraints, multiple constraints, slackness (see lecture slides)

Chapter 9: Extreme values of univariate functions
Stationary point
For a differentiable function $f(x)$,
$x^*$ is a stationary point if $f'(x^*) = 0$
First-order condition
If the function is everywhere differentiable on an interval and reaches
a minimum or a maximum at the point $x^*$ then $x^*$ is a stationary
point.
It is a necessary but not sufficient condition for identifying a local
maximum or a local minimum.
Example 1:
$y = f(x) = \sqrt{x - 5} - 3$
For which $f(5) = -3$. On the interval $[5, \infty)$, the function reaches a
minimum at $x = 5$ ($f(x) > -3$ for $x > 5$). The function is
differentiable on $(5, \infty)$, but the derivative does not exist at $x = 5$;
under the wider definition below, $x = 5$ still counts as a stationary point.
Stationary point
$x^*$ is a stationary point if $f'(x^*) = 0$ or if the derivative does not exist
at $x^*$
Example 2:
$y = f(x) = 10 - (x - 5)^2$
Stationary point: $f'(x) = -2(x - 5) = 0$ at $x = 5$
Example 3:
$y = f(x) = |x - 5|$
Note that $f(5) = 0$, but the derivative does not exist at $x = 5$. Thus
$x = 5$ is a stationary point.

Global maximum
Let $f(x)$ be an everywhere differentiable function for which $x^*$ is a
stationary point. The stationary point is a global maximum if
$f'(x) \geq 0$ for all $x \leq x^*$ and $f'(x) \leq 0$ for all $x \geq x^*$
Global minimum
Let $f(x)$ be an everywhere differentiable function for which $x^*$ is a
stationary point. The stationary point is a global minimum if
$f'(x) \leq 0$ for all $x \leq x^*$ and $f'(x) \geq 0$ for all $x \geq x^*$
Local maximum
If a function $f(x)$ that is everywhere differentiable has an interior
maximum at $x^*$, then throughout some interval $(m, x^*)$ to the left of
the stationary point $f'(x) \geq 0$ and throughout some interval $(x^*, n)$
on the right of the stationary point $f'(x) \leq 0$
Local minimum
If a function $f(x)$ that is everywhere differentiable has an interior
minimum at $x^*$, then throughout some interval $(m, x^*)$ to the left of
the stationary point $f'(x) \leq 0$ and throughout some interval $(x^*, n)$
on the right of the stationary point $f'(x) \geq 0$
Example 4:
$f(x) = \frac{x}{x^2 + 4}$
$f'(x) = \frac{(x^2 + 4) - 2x^2}{(x^2 + 4)^2} = -\frac{(x^2 - 4)}{(x^2 + 4)^2} = 0$
There is a stationary point for $x = -2$ and $x = 2$
$f'(x) < 0$ for $x < -2$
$f'(x) > 0$ for $-2 < x < 2$
$f'(x) < 0$ for $x > 2$
$f(-2) = -\frac{1}{4}$ is a local minimum and $f(2) = \frac{1}{4}$ is a local maximum.
Note that $\lim_{x \to -\infty} \frac{x}{x^2 + 4} = 0$ and $\lim_{x \to \infty} \frac{x}{x^2 + 4} = 0$ so that both extrema are
global extrema.
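The sign pattern of the first derivative around a stationary point is easy to automate. The sketch below is an illustration for Example 4, with an assumed helper name and step size: it evaluates $f'$ slightly to the left and to the right of a candidate point and reports the type of extremum.

```python
def f_prime(x):
    """Derivative of f(x) = x / (x^2 + 4) from Example 4."""
    return (4 - x**2) / (x**2 + 4) ** 2

def classify(df, x_star, eps=1e-3):
    """Classify a stationary point by the sign of the derivative on either side."""
    left, right = df(x_star - eps), df(x_star + eps)
    if left > 0 > right:
        return "local maximum"
    if left < 0 < right:
        return "local minimum"
    return "no sign change"
```

With these definitions, `classify(f_prime, -2)` returns `"local minimum"` and `classify(f_prime, 2)` returns `"local maximum"`, in line with the example.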
Second-order condition for local maximum
If the second order derivative of the differentiable function $f(x)$ is
negative when evaluated at the stationary point $x^*$ (that is $f''(x^*) < 0$)
then that stationary point represents a local maximum.
Second-order condition for local minimum
If the second order derivative of the differentiable function $f(x)$ is
positive when evaluated at the stationary point $x^*$ (that is $f''(x^*) > 0$)
then that stationary point represents a local minimum.
Convex function
The function $f(x)$ is convex on $(m, n)$ if $f''(x) \geq 0$ for all x on $(m, n)$.
Example 5:
$f(x) = x^2 - 2x + 2$ is a convex function, because $f''(x) = 2$ for all x.
Concave function
The function $f(x)$ is concave on $(m, n)$ if $f''(x) \leq 0$ for all x on
$(m, n)$.
Stationary point of a strictly concave function
If the function $f(x)$ is strictly concave on the interval $(m, n)$ and has
the stationary point $x^*$, for which $m < x^* < n$, then $x^*$ is a local
maximum in that interval. If it is strictly concave everywhere, then it has at
most one stationary point (which is a global maximum).
Stationary point of a strictly convex function
If the function $f(x)$ is strictly convex on the interval $(m, n)$ and has
the stationary point $x^*$, for which $m < x^* < n$, then $x^*$ is a local
minimum in that interval. If it is strictly convex everywhere, then it has at
most one stationary point (which is a global minimum).

Example 6:
$f(x) = e^{x-1} - x$
$f'(x) = e^{x-1} - 1 = 0$ for $x = 1$
$f''(x) = e^{x-1} > 0$ for all x, so that $f(x)$ is convex and $x = 1$ is a minimum.
Inflection point: the twice-differentiable function $f(x)$ has an
inflection point at $\bar{x}$ if and only if the sign of the second derivative
switches from negative (positive) in some interval $(m, \bar{x})$ to positive
(negative) in some interval $(\bar{x}, n)$. Note that if the second derivative is
negative, the function is concave. If the second derivative is positive, it is
convex. Thus, at the inflection point, the curvature changes from
convex to concave (or from concave to convex).
Example 7:
$f(x) = x^4$ does not have an inflection point at x=0.
$f''(x) = 12x^2$
The second derivative does not change sign at x=0, because $f''(x) > 0$ for both
positive and negative x.

Example 8:
$f(x) = \frac{1}{9}x^3 - \frac{1}{6}x^2 - \frac{2}{3}x + 1$
$f'(x) = \frac{1}{3}x^2 - \frac{1}{3}x - \frac{2}{3} = \frac{1}{3}(x^2 - x - 2) = \frac{1}{3}(x - 2)(x + 1)$
$f''(x) = \frac{2}{3}x - \frac{1}{3} = \frac{1}{3}(2x - 1)$
The function has an inflection point at x=1/2 (because $f''(\frac{1}{2}) = 0$;
$f''(x) > 0$ for $x > \frac{1}{2}$, $f''(x) < 0$ for $x < \frac{1}{2}$)
For x=-1 the function has a local maximum:
$f'(-1) = 0$ and $f''(-1) < 0$
For x=2 the function has a local minimum:
$f'(2) = 0$ and $f''(2) > 0$
Example 9:
$f(x) = x^6 - 10x^4$
$f'(x) = 6x^5 - 40x^3$
Second derivative of f(x):
$f''(x) = 30x^4 - 120x^2 = 30x^2(x^2 - 4) = 30x^2(x + 2)(x - 2)$
The function has an inflection point at x=-2 and x=2. No inflection
point at x=0, because the second derivative does not change sign there.

Wrapping up: Procedure of one variable optimization
1) Start with $y = f(x)$ and determine the first-order necessary
condition: $\frac{dy}{dx} = f'(x) = 0$
2) Check second-order sufficient conditions:
$\frac{d^2 y}{dx^2} = f''(x) < 0$ (maximum)
$\frac{d^2 y}{dx^2} = f''(x) > 0$ (minimum)
3) We can rewrite the first-order condition as:
$dy = f'(x)dx = f_1(x)dx = 0$
4) The second-order conditions can be rewritten as:
$d^2 y = f''(x)dx^2 = f_{11}(x)dx^2 < 0$ (maximum)
$d^2 y = f''(x)dx^2 = f_{11}(x)dx^2 > 0$ (minimum)
5) Note that $d^2 y$ is the change in $dy$ and that $dx^2 = (dx)^2$ is the square of the
change $dx$

Chapter 10: Multivariable optimization without constraints
1) Start with $y = f(x_1, x_2)$ and determine the first-order necessary
condition: $dy = f_1 dx_1 + f_2 dx_2 = 0$
$dy = 0$ for all $dx_1$, $dx_2$ if and only if $f_1 = 0$ and $f_2 = 0$
2) Check second-order sufficient conditions:
$d^2 y = f_{11}dx_1^2 + f_{22}dx_2^2 + f_{21}dx_2 dx_1 + f_{12}dx_1 dx_2$
Which can be rewritten as
$d^2 y = \begin{pmatrix} dx_1 & dx_2 \end{pmatrix} \begin{pmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{pmatrix} \begin{pmatrix} dx_1 \\ dx_2 \end{pmatrix}$
Maximum: $d^2 y < 0$
Minimum: $d^2 y > 0$
Definition:
The Hessian of the function $y = f(x_1, x_2)$ is
$H = \begin{pmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{pmatrix}$
Remember: Young's theorem (lecture 4): $f_{12} = f_{21}$

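How to determine the sign of $d^2 y = \begin{pmatrix} dx_1 & dx_2 \end{pmatrix} \begin{pmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{pmatrix} \begin{pmatrix} dx_1 \\ dx_2 \end{pmatrix}$
Method 1 (for a 2 by 2 matrix):
Maximum:
$|H_1| = f_{11} < 0$ and $|H_2| = \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix} > 0$
Minimum:
$|H_1| = f_{11} > 0$ and $|H_2| = \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix} > 0$
Saddle point:
$|H_2| = \begin{vmatrix} f_{11} & f_{12} \\ f_{21} & f_{22} \end{vmatrix} < 0$
Method 2 (for a general matrix): Check the signs of the eigenvalues
of H (below).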

Positive definite matrices (and negative definite matrices)
Definition:
A matrix $A = \begin{pmatrix} a & c \\ b & d \end{pmatrix}$ is positive definite if $x'Ax > 0$ for all $x \neq 0$.
Important: A symmetric matrix A is positive definite if all eigenvalues of the
matrix A are positive.
Definition:
A matrix $A = \begin{pmatrix} a & c \\ b & d \end{pmatrix}$ is negative definite if $x'Ax < 0$ for all $x \neq 0$.
Important: A symmetric matrix A is negative definite if all eigenvalues of the
matrix A are negative.
Important: A matrix A is indefinite if it has both positive eigenvalues
and negative eigenvalues.
Note
that the determinant of A is equal to the product of its eigenvalues
(week 3). As a consequence, if the determinant of a 2 by 2 matrix A has a
negative sign, it must have both a positive and a negative eigenvalue. Hence, the
matrix A will be indefinite. The 2 by 2 matrix A will be either positive
definite or negative definite if it has a positive determinant.

Example 10:
$y = f(x_1, x_2) = x_1^3 - x_1^2 - x_2^2 + 8$
First-order necessary conditions:
$f_1(x_1, x_2) = 3x_1^2 - 2x_1 = 0$
$f_2(x_1, x_2) = -2x_2 = 0$
Thus stationary points are (0,0) and (2/3,0)
$f_{11}(x_1, x_2) = 6x_1 - 2$
$f_{21}(x_1, x_2) = f_{12}(x_1, x_2) = 0$
$f_{22}(x_1, x_2) = -2$
Hessian at (0, 0):
$H = \begin{pmatrix} -2 & 0 \\ 0 & -2 \end{pmatrix}$
The eigenvalues of the matrix at (0, 0) are $\lambda = -2$. Reason:
$|H - \lambda I| = \begin{vmatrix} -2 - \lambda & 0 \\ 0 & -2 - \lambda \end{vmatrix} = (-2 - \lambda)^2$
$|H - \lambda I|$ is zero for $\lambda = -2$. Because all eigenvalues are negative, the
matrix is negative definite, so that at (0, 0) the function reaches a
maximum.
Hessian at (2/3, 0):
$H = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}$
$|H - \lambda I| = \begin{vmatrix} 2 - \lambda & 0 \\ 0 & -2 - \lambda \end{vmatrix} = (2 - \lambda)(-2 - \lambda)$
Because the eigenvalues ($\lambda = 2$ and $\lambda = -2$) have a positive and a negative sign, the matrix
is indefinite, so that at (2/3, 0) the function reaches neither a maximum
nor a minimum.
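The eigenvalue test of Example 10 can be written out for any symmetric 2 by 2 Hessian. The sketch below is an illustration with assumed function names: it computes both eigenvalues from the characteristic polynomial and classifies the stationary point by their signs.

```python
import math

def eigenvalues_2x2_symmetric(a, b, c):
    """Eigenvalues of [[a, b], [b, c]], i.e. the roots of the characteristic polynomial."""
    mean = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b**2)
    return mean - radius, mean + radius

def classify_stationary_point(a, b, c):
    """Classify a stationary point from the Hessian entries f11 = a, f12 = b, f22 = c."""
    low, high = eigenvalues_2x2_symmetric(a, b, c)
    if high < 0:
        return "maximum"          # negative definite
    if low > 0:
        return "minimum"          # positive definite
    if low < 0 < high:
        return "saddle point"     # indefinite
    return "inconclusive"         # at least one zero eigenvalue

def hessian_example_10(x1, x2):
    """Entries (f11, f12, f22) for f = x1^3 - x1^2 - x2^2 + 8."""
    return 6 * x1 - 2, 0.0, -2.0
```

Applied to the two stationary points, `classify_stationary_point(*hessian_example_10(0, 0))` gives `"maximum"` and `classify_stationary_point(*hessian_example_10(2/3, 0))` gives `"saddle point"`.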

Informal interpretation of the Lagrange multiplier
The polynomial function is the following:
$f(x) = ax^2 + 3$
The unconstrained optimum of $f(x)$ is equal to 3 for $x = 0$. If a is
positive, the optimum is a minimum of the function ($f(x)$ is a convex
function). If a is negative, the optimum is a maximum ($f(x)$ is a
concave function).
Next, we introduce the constraint
$x = 2$
Optimizing the function $f(x)$ with respect to x seems to be a silly
exercise, because we know that x must equal 2. The Lagrangian
function is
$\max_{x, \lambda} L(x, \lambda) = ax^2 + 3 - \lambda(x - 2)$
We can solve the problem by taking the first partial derivatives with
respect to x and $\lambda$. Both derivatives are equal to zero, so that
$\frac{\partial L(x, \lambda)}{\partial x} = 2ax - \lambda = 0$ (1)
and
$\frac{\partial L(x, \lambda)}{\partial \lambda} = -(x - 2) = 0$ (2)
Equation (2) yields $x = 2$, and substituting it in equation (1) gives
$\lambda = 4a$
Let's change the constraint $x = 2$. Note that:
For the constraint $x = 0$ (which is equal to the unconstrained
optimum) the multiplier equals zero (so that the constraint
becomes irrelevant)
For the constraint $x = 3$: $\lambda = 6a$
In general, for the constraint $x = c$ we get $\lambda = 2ac$. Thus, for positive a the
multiplier is increasing in c.
Thus, for larger values of $\lambda$ (in absolute value) we are further away from the optimum
value of $f(x)$

Conclusion 1: From an economic perspective, this is an important
result. It means that the constraint becomes more important for
larger values of c!
Conclusion 2: We optimize the Lagrangian function by taking the
derivative of $f(x)$ at the constraint $x = c$. If the derivative at this
point is close to zero, we are not far away from the optimum of the
unconstrained function. This is reflected by the value of $\lambda$.
If $\lambda$ is zero: the constrained optimum is equal to the unconstrained
optimum.
If $\lambda$ is large: the constrained optimum is very different from the
unconstrained optimum. Thus, the constraint is important.

Formal interpretation of the Lagrange multiplier $\lambda$:
Objective function: $f(x_1, x_2)$ and the constraint: $g(x_1, x_2) = c$. The
optimal values of $(x_1, x_2)$ are $(x_1^*(c), x_2^*(c))$, which depend on c. The
Lagrangian function:
$\max f(x_1^*(c), x_2^*(c)) - \lambda[g(x_1^*(c), x_2^*(c)) - c]$ (1)
We consider separately the derivatives of the constraint and of the
objective function.
Step 1 Take the derivative of the constraint $g(x_1^*(c), x_2^*(c)) = c$ with
respect to c:
$\frac{\partial g(x_1^*(c), x_2^*(c))}{\partial x_1}\frac{dx_1^*(c)}{dc} + \frac{\partial g(x_1^*(c), x_2^*(c))}{\partial x_2}\frac{dx_2^*(c)}{dc} = 1$ (2)
Step 2 Take the derivative of the objective function $f(x_1, x_2)$ with
respect to c (chain rule):
$\frac{df(x_1^*(c), x_2^*(c))}{dc} = \frac{\partial f(x_1^*(c), x_2^*(c))}{\partial x_1}\frac{dx_1^*(c)}{dc} + \frac{\partial f(x_1^*(c), x_2^*(c))}{\partial x_2}\frac{dx_2^*(c)}{dc}$ (3)
Step 3 We consider the first-order conditions of the Lagrangian
function (equation (1)):
$\frac{\partial f(x_1^*(c), x_2^*(c))}{\partial x_1} = \lambda\frac{\partial g(x_1^*(c), x_2^*(c))}{\partial x_1}$ (4)
$\frac{\partial f(x_1^*(c), x_2^*(c))}{\partial x_2} = \lambda\frac{\partial g(x_1^*(c), x_2^*(c))}{\partial x_2}$ (5)
Step 4 Substituting the first-order conditions (4) and (5) in the
derivative of the objective function (equation (3)):
$\frac{df(x_1^*(c), x_2^*(c))}{dc} = \lambda\frac{\partial g(x_1^*(c), x_2^*(c))}{\partial x_1}\frac{dx_1^*(c)}{dc} + \lambda\frac{\partial g(x_1^*(c), x_2^*(c))}{\partial x_2}\frac{dx_2^*(c)}{dc}$ (6)
Step 5 Because of equation (2), equation (6) can be rewritten as:
$\frac{df(x_1^*(c), x_2^*(c))}{dc} = \lambda\left[\frac{\partial g(x_1^*(c), x_2^*(c))}{\partial x_1}\frac{dx_1^*(c)}{dc} + \frac{\partial g(x_1^*(c), x_2^*(c))}{\partial x_2}\frac{dx_2^*(c)}{dc}\right] = \lambda$
Thus: the interpretation of $\lambda$ is:
The amount by which the value of the objective function increases
when the constraint is relaxed by one unit.
The maximum amount that the economic agent would be willing to
pay (in units of the objective function) for a relaxation of the
constraint.
Shadow price of the constraint.

We apply this formal procedure to our simple problem:
The Lagrangian function of the objective function $y = ax^2 + 3$, subject to
the constraint $x = c$ is
$\max_{x^*, \lambda} L(x^*, \lambda) = ax^*(c)^2 + 3 - \lambda[x^*(c) - c]$ (1)
Step 1.
Take the first derivative of the constraint $x^*(c) = c$ with respect to c:
$\frac{dx^*(c)}{dc} = 1$ (2)
Step 2.
Take the first derivative of the unconstrained function (the objective
function) with respect to c:
$\frac{df(x^*(c))}{dc} = 2ax^*(c)\frac{dx^*(c)}{dc}$ (3)
Step 3.
Take the partial derivative of the Lagrangian function (1) with respect
to x:
$2ax^*(c) = \lambda$ (4)
Step 4.
Substituting equation (4) in equation (3):
$\frac{df(x^*(c))}{dc} = 2ax^*(c)\frac{dx^*(c)}{dc} = \lambda\frac{dx^*(c)}{dc}$ (5)
Step 5.
Because of equation (2), equation (5) can be rewritten as:
$\frac{df(x^*(c))}{dc} = 2ax^*(c)\frac{dx^*(c)}{dc} = \lambda$
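The shadow price interpretation can be verified directly for this simple problem. The sketch below is an illustration with an assumed value a = 1.5 and assumed function names: it compares the multiplier $\lambda = 2ac$ with a numerical derivative of the optimal value $f(x^*(c)) = ac^2 + 3$ with respect to c.

```python
def value_function(c, a=1.5):
    """Optimal value of f(x) = a x^2 + 3 under the constraint x = c (so x* = c)."""
    return a * c**2 + 3

def multiplier(c, a=1.5):
    """lambda = 2ac, from the first-order condition 2ax - lambda = 0 at x = c."""
    return 2 * a * c

def dvalue_dc(c, a=1.5, h=1e-6):
    """Numerical derivative of the optimal value with respect to the constraint level c."""
    return (value_function(c + h, a) - value_function(c - h, a)) / (2 * h)
```

For c = 2 both `multiplier(2.0)` and `dvalue_dc(2.0)` are (about) 6, which is 4a for a = 1.5, and for c = 0 the multiplier is zero because the constraint coincides with the unconstrained optimum.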

Lagrangian method with multiple constraints (section 11.2)
$L(x_1, \ldots, x_n, \lambda_1, \ldots, \lambda_m) = f(x_1, \ldots, x_n) - \sum_{i=1}^{m} \lambda_i [g^i(x_1, \ldots, x_n) - c_i]$
i-th constraint: $g^i(x_1, \ldots, x_n) = c_i$

ADVANCED MATHEMATICS TUTORIAL WEEK 5
Answers tutorial week 5 Advanced Mathematics
Exercise 1.
Find the critical points of the following functions and assess whether they are minima,
maxima or inflection points:
a) $ax^2 + bx + c$, where a, b, c are constants.
Solution:
$\frac{d(ax^2 + bx + c)}{dx} = 2ax + b = 0 \Leftrightarrow x = -\frac{b}{2a}$, so the function has one critical point.
$\frac{d^2(ax^2 + bx + c)}{dx^2} = 2a$. The second derivative is 2a, independent of x. If a < 0, the function
takes a maximum, if a > 0 it takes a minimum.
b) $\log(x) - 4x$
Solution:
$\frac{d(\log(x) - 4x)}{dx} = \frac{1}{x} - 4 = 0 \Leftrightarrow x = \frac{1}{4}$, so the function has one critical point.
$\frac{d^2(\log(x) - 4x)}{dx^2} = -\frac{1}{x^2}$. We can now plug in the optimal value for x:
$-\frac{1}{(\frac{1}{4})^2} = -16 < 0$, so the function takes a maximum. (We could also observe that $-\frac{1}{x^2}$ is negative
for all x).
* c) $x^{2n}$, $n \in \mathbb{N}$
Solution:
$\frac{dx^{2n}}{dx} = 2nx^{2n-1} = 0 \Leftrightarrow x = 0$.
$\frac{d^2 x^{2n}}{dx^2} = 2n(2n - 1)x^{2n-2} = \begin{cases} 2n(2n - 1)x^{2n-2} & \text{if } n \geq 2 \\ 2 & \text{if } n = 1 \end{cases}$
If we evaluate this at the optimal value for x, we find:
$\left.\frac{d^2 x^{2n}}{dx^2}\right|_{x=0} = \begin{cases} 0 & \text{if } n \geq 2 \\ 2 & \text{if } n = 1 \end{cases}$
So we have found a minimum for n=1. But for the other cases we don't know yet. We'll graph
them to see what is going on:
[Figure: graphs of $x^{2n}$ for several values of n]
Clearly they all have a minimum at x=0. This shows that the
conditions we use for finding a minimum are only a sufficient requirement: if we find
a critical point that obeys the second order condition, it is always a minimum, but not all minima
obey the second order condition.
d) $e^x$
Solution:
$\frac{de^x}{dx} = e^x > 0$, so this function has no critical points. It is always increasing, or monotonically
increasing, as it is called.
e) $e^{x^2 + 3x - 2}$
Solution:
Given that we saw in the last exercise that $e^x$ is always increasing, we expect to find a
minimum here at the minimum of the exponent. Indeed, this is what comes out:
$\frac{de^{x^2 + 3x - 2}}{dx} = (2x + 3)e^{x^2 + 3x - 2} = 0$, since the exponential function is always positive, this implies:
$2x + 3 = 0 \Leftrightarrow x = -\frac{3}{2}$. This is indeed the minimum of the exponent.
$\frac{d^2 e^{x^2 + 3x - 2}}{dx^2} = (2x + 3)^2 e^{x^2 + 3x - 2} + 2e^{x^2 + 3x - 2} > 0$. We could plug in the optimal value of x, but we
see that only positive numbers (a square, twice an exponential function, the constant 2) occur, so we see
immediately that this is positive and therefore the critical point is a minimum.

* f) $e^x - x^2$
Solution:
$$\frac{d(e^x - x^2)}{dx} = e^x - 2x = 0 \quad ?$$
We don't know how to solve this; in fact it is even unclear whether this has a solution. Is it possible that $e^x > 2x$ for all x? One way to check that is to see if $e^x - 2x > 0$ even at its minimum. As it happens, this is precisely the next exercise. (We'll come back to this one when we're done with it.)
g) $e^x - 2x$
Solution:
$$\frac{d(e^x - 2x)}{dx} = e^x - 2 = 0 \Rightarrow e^x = 2 \Rightarrow x = \log(2)$$
$$\frac{d^2(e^x - 2x)}{dx^2} = e^x > 0,$$
so we find a minimum. Now at the minimum:
$$\left.(e^x - 2x)\right|_{x = \log(2)} = e^{\log(2)} - 2\log(2) = 2 - 2\log(2) > 0,$$
as log(2) < 1. So, coming back to question f), we find that even at its minimum, the first derivative of $e^x - x^2$ is positive, so it is always increasing and has no critical points.
Exercise 2.
Determine whether the following matrices are positive definite, negative definite, or neither:
* a) $\begin{pmatrix} a & b \\ b & c \end{pmatrix}$
Solution:
Because the dimension of the matrix is very small, we can apply the definition of positive definiteness directly. We focus on symmetric matrices, because it can be shown that if a matrix A is positive definite, one can always find a symmetric positive definite matrix S that gives the same outcomes, i.e.: $x^T A x = x^T S x$ for all vectors x.
For a positive definite matrix, it must hold that:
$$(x \;\; y)\begin{pmatrix} a & b \\ b & c \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = (x \;\; y)\begin{pmatrix} ax + by \\ bx + cy \end{pmatrix} = ax^2 + 2bxy + cy^2 > 0$$
for all possible x and y that are not both zero.
Clearly, if we take either x = 0 or y = 0, we find that both a and c must be positive. To find the final condition, we rewrite our expression:
$$ax^2 + 2bxy + cy^2 = a\left(x^2 + 2\frac{b}{a}xy\right) + cy^2$$
$$= a\left(x^2 + 2\frac{b}{a}xy + \frac{b^2}{a^2}y^2\right) + cy^2 - \frac{b^2}{a}y^2$$
$$= a\left(x + \frac{b}{a}y\right)^2 + \left(c - \frac{b^2}{a}\right)y^2 > 0$$
The second and third step here are not at all obvious, but we take them because we end up with something nice. Indeed, in the last line we have two squares ($(x + \frac{b}{a}y)^2$ and $y^2$), which are never negative, and a, which we also demanded be positive. So the final term, $(c - \frac{b^2}{a})$, must also be positive. But $(c - \frac{b^2}{a}) > 0 \Leftrightarrow ac - b^2 > 0$. This means that the determinant of our matrix should also be positive. This is actually the familiar principal minor condition.
2
3
1
b)
6
1
6
1
3
0
3
4
1
4
Solution:
We have seen this matrix before: it is the Markov chain example we analyzed in the tutorial of week 3. There we saw that it had eigenvalues $\lambda_1 = 1$, $\lambda_2 = \frac{2}{3}$, $\lambda_3 = \frac{5}{12}$, so all eigenvalues are positive. This is one characterization of a positive definite (PD) matrix. We double-check our result by looking at the leading principal minors:
2
3
1
det
6
1
6
0
3
4
1
4
1 2 3 2 1 1 5
=
= >0
3 3 4 3 4 3 18
3 0 2 3 1
det
=
>0
=
1 3 3 4 2
6 4
2
>0
3
They are all positive, confirming our results.
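c) $\begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix}$
Solution:
This time we have to check by direct computation:
$$\det\begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix} = 1(2 \cdot 3 - 1 \cdot 1) - 0 + 1(0 \cdot 1 - 2 \cdot 1) = 5 - 2 = 3 > 0$$
$$\det\begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} = 2 > 0$$
$$1 > 0$$
It is PD.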
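The leading principal minor test used above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `det`, `leading_principal_minors` and `is_positive_definite` are made up for this sketch, and the test is only valid for symmetric matrices.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        # Delete row 0 and column j to obtain the minor.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def leading_principal_minors(m):
    """Determinants of the upper-left 1x1, 2x2, ..., nxn blocks."""
    return [det([row[:k] for row in m[:k]]) for k in range(1, len(m) + 1)]


def is_positive_definite(m):
    """Principal minor condition; assumes m is symmetric."""
    return all(d > 0 for d in leading_principal_minors(m))


A = [[1, 0, 1], [0, 2, 1], [1, 1, 3]]
print(leading_principal_minors(A))  # [1, 2, 3]
print(is_positive_definite(A))      # True
```

The three minors reproduce the values 1, 2 and 3 computed by hand.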
*Exercise 3.
Find the critical points of the following functions and assess whether they are minima,
maxima or saddle points:
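a) $f(x, y) = (x \;\; y)\begin{pmatrix} a & b \\ b & c \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} + (d \;\; e)\begin{pmatrix} x \\ y \end{pmatrix}$, where a, b, c, d, e are constants
Solution:
From 2a) we know that the function written out becomes:
$$f(x, y) = (x \;\; y)\begin{pmatrix} a & b \\ b & c \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} + (d \;\; e)\begin{pmatrix} x \\ y \end{pmatrix} = ax^2 + 2bxy + cy^2 + dx + ey$$
The two first-order conditions become:
$$\frac{\partial f}{\partial x} = 2ax + 2by + d = 0$$
$$\frac{\partial f}{\partial y} = 2bx + 2cy + e = 0$$
Note that we can write this as:
$$2\begin{pmatrix} a & b \\ b & c \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} d \\ e \end{pmatrix} = 0$$
Basically, f is a quadratic function, but now in two dimensions (in fact, functions of the form $x^T A x + b^T x + c$ are called quadratic functions). The rule for its derivative is very similar to the one-dimensional case. We could solve for the critical point (x, y) by solving this matrix equation, but we don't really care about the outcome, so we move on to the second-order condition.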

We have to check the Hessian:
$$H = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{pmatrix} = 2\begin{pmatrix} a & b \\ b & c \end{pmatrix}$$
Again, the Hessian of this function looks very much like the second derivative of a one-dimensional quadratic function. Furthermore, we have seen in question 2a) under what conditions this matrix is positive definite ($ac - b^2 > 0$, a > 0 and c > 0) and when it is negative definite ($ac - b^2 > 0$, a < 0 and c < 0). The first case corresponds to a minimum, the second to a maximum.
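b) $f(x, y, z) = (x \;\; y \;\; z)\begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} + (1 \;\; 2 \;\; 3)\begin{pmatrix} x \\ y \\ z \end{pmatrix} + 5$
Solution:
We can apply the same result that we saw in question 3a): the derivative of a quadratic function is:
$$\nabla f(x, y, z) = 2\begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} + \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = 0$$
Here we write $\nabla f(x, y, z)$ for the column vector of partial derivatives of f(x, y, z). This is here nothing more than a shorthand, although $\nabla f(x, y, z)$, called the gradient of f, is a useful thing with interesting properties, which we do not study here.
Our equation leads to the matrix equation:
$$\begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = -\frac{1}{2}\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$$
which we can solve by sweeping to find: $x = -\frac{1}{6}$, $y = -\frac{1}{3}$, $z = -\frac{1}{3}$. So our critical point is $\left(-\frac{1}{6}, -\frac{1}{3}, -\frac{1}{3}\right)$. We know that the Hessian is equal to $2\begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 1 & 3 \end{pmatrix}$. We saw in question 2c) that this matrix is positive definite (multiplying by 2 does not change that), so we have a minimum here.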
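As a cross-check of the sweeping step, the following sketch solves the same matrix equation with exact fractions. The function name `solve` is an assumption of this sketch, and the Gauss-Jordan routine is a generic textbook version, not necessarily the exact sequence of row operations used in the tutorial.

```python
from fractions import Fraction


def solve(a, b):
    """Solve a*v = b by Gauss-Jordan elimination ("sweeping") with exact fractions."""
    n = len(a)
    # Augmented matrix [a | b].
    aug = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot equals 1.
        aug[col] = [x / aug[col][col] for x in aug[col]]
        # Sweep the pivot column clean in all other rows.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


A = [[1, 0, 1], [0, 2, 1], [1, 1, 3]]
rhs = [Fraction(-1, 2), Fraction(-1), Fraction(-3, 2)]  # -(1/2) * (1, 2, 3)
critical_point = solve(A, rhs)
print(critical_point)  # [Fraction(-1, 6), Fraction(-1, 3), Fraction(-1, 3)]
```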

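* c) $f(x, y, z) = xyz - x^2 - 3y^2 - \log(z)$
Solution:
We compute the partial derivatives and set them equal to zero:
$$\frac{\partial f(x, y, z)}{\partial x} = yz - 2x = 0$$
$$\frac{\partial f(x, y, z)}{\partial y} = xz - 6y = 0$$
$$\frac{\partial f(x, y, z)}{\partial z} = xy - \frac{1}{z} = 0 \Rightarrow xyz = 1, \; yz = \frac{1}{x}, \; xz = \frac{1}{y}$$
Plugging these last two equalities back into the first two partial derivatives, we get
$$\frac{1}{x} - 2x = 0 \Rightarrow x^2 = \frac{1}{2} \Rightarrow x = \pm\frac{1}{\sqrt{2}}$$
$$\frac{1}{y} - 6y = 0 \Rightarrow y^2 = \frac{1}{6} \Rightarrow y = \pm\frac{1}{\sqrt{6}}$$
Using now the fact that xyz = 1, we find $|z| = 2\sqrt{3} \Rightarrow z = \pm 2\sqrt{3}$. If we make sure that the signs work out correctly (for xyz to be positive, we must have an even number of negative numbers), we find that there are 4 candidate critical points:
$$\left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{6}}, 2\sqrt{3}\right), \; \left(\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{6}}, -2\sqrt{3}\right), \; \left(-\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{6}}, -2\sqrt{3}\right), \; \left(-\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{6}}, 2\sqrt{3}\right).$$
Strictly speaking, log(z) is only defined for z > 0, which leaves only the two points with $z = 2\sqrt{3}$; the conclusion below holds for every candidate.
To see whether these are maxima, minima or saddle points, we have to calculate the Hessian:
$$H = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} & \frac{\partial^2 f}{\partial x \partial z} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} & \frac{\partial^2 f}{\partial y \partial z} \\ \frac{\partial^2 f}{\partial z \partial x} & \frac{\partial^2 f}{\partial z \partial y} & \frac{\partial^2 f}{\partial z^2} \end{pmatrix} = \begin{pmatrix} -2 & z & y \\ z & -6 & x \\ y & x & \frac{1}{z^2} \end{pmatrix}$$
In general, it could be a lot of work to check for all points whether this matrix will be positive or negative definite. However, in this case we observe that the first two diagonal entries are negative, while the third is positive (it is a square). Since positive definiteness requires all diagonal elements positive, while negative definiteness requires all diagonal elements negative, clearly these Hessians can be neither. Therefore, all critical points are saddle points.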

* d) $f(x, y) = \log(x) + \log(y) + \log(1 - y - x)$ (hint: check if it is concave first)
Solution:
As per the hint we check for concavity. We first compute the partial derivatives:
$$\frac{\partial f(x, y)}{\partial x} = \frac{1}{x} + \frac{1}{y + x - 1}$$
$$\frac{\partial f(x, y)}{\partial y} = \frac{1}{y} + \frac{1}{y + x - 1}$$
We can show that the partial derivatives are zero for $x = y = \frac{1}{3}$.
Next the Hessian of the function:
$$H = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{pmatrix} = \begin{pmatrix} -\frac{1}{x^2} - \frac{1}{(x + y - 1)^2} & -\frac{1}{(x + y - 1)^2} \\ -\frac{1}{(x + y - 1)^2} & -\frac{1}{y^2} - \frac{1}{(x + y - 1)^2} \end{pmatrix}$$
We compute the eigenvalues of the Hessian at $x = y = \frac{1}{3}$:
$$\det\begin{pmatrix} -18 - \lambda & -9 \\ -9 & -18 - \lambda \end{pmatrix} = 0 \Rightarrow (18 + \lambda)^2 = 81 \Rightarrow \lambda_1 = -27, \; \lambda_2 = -9$$
Both eigenvalues of H are negative, so that the matrix H is negative definite (ND). Consequently, at $x = y = \frac{1}{3}$ the function reaches a maximum.
One could also argue that H is a negative definite matrix, because the diagonal entries are negative (minus squares), while the determinant is:
$$\det\begin{pmatrix} -\frac{1}{x^2} - \frac{1}{(x + y - 1)^2} & -\frac{1}{(x + y - 1)^2} \\ -\frac{1}{(x + y - 1)^2} & -\frac{1}{y^2} - \frac{1}{(x + y - 1)^2} \end{pmatrix} = \det\begin{pmatrix} a + z & z \\ z & b + z \end{pmatrix} = (a + z)(b + z) - z^2 = ab + az + bz$$
Here I define $a = -\frac{1}{x^2}$, $b = -\frac{1}{y^2}$ and $z = -\frac{1}{(x + y - 1)^2}$ to simplify the expression. Because a, b and z are all negative, this determinant must be positive. So we see the Hessian is negative definite everywhere and therefore the function is concave. This means that whatever critical point we find, it will be a maximum. Furthermore, a strictly concave function has at most one critical point.
Looking at the symmetry of the function, we might wish to guess what the critical point is. If we do this successfully and find that all the partial derivatives are zero there, we are done. We pick $x = y = \frac{1}{3}$. When we evaluate the partial derivatives we found earlier, we see that we indeed get zero, so we are done.

e) $f(x, y) = x^4 + 2x^2y^2 + y^4$
Solution:
We first rewrite our function:
$$f(x, y) = x^4 + 2x^2y^2 + y^4 = (x^2 + y^2)^2$$
Next we compute the first-order conditions:
$$\frac{\partial f(x, y)}{\partial x} = 4x(x^2 + y^2) = 0$$
$$\frac{\partial f(x, y)}{\partial y} = 4y(x^2 + y^2) = 0$$
From these it follows that x = y = 0 is the only critical point. Now let us compute the Hessian at this point:
$$H = \begin{pmatrix} \frac{\partial^2 f}{\partial x^2} & \frac{\partial^2 f}{\partial x \partial y} \\ \frac{\partial^2 f}{\partial y \partial x} & \frac{\partial^2 f}{\partial y^2} \end{pmatrix} = \begin{pmatrix} 4(x^2 + y^2) + 8x^2 & 8xy \\ 8xy & 4(x^2 + y^2) + 8y^2 \end{pmatrix}$$
Evaluated at zero, this becomes:
$$H = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$
This matrix is not positive definite, but still the function is at a minimum. This is another example of the fact (already observed in question 1c)) that the second-order condition is sufficient, but not necessary for a minimum.
But we still have to show that the function is indeed at a minimum. Fortunately, this is not so hard to do. Recall that for a point (x, y), $x^2 + y^2$ is the square of the distance to the origin, i.e. the point (0,0). So what the function f(x, y) does is give us the square of that number. Clearly, for any point but (0,0) itself, our critical point, this will give a positive number. So the function is positive everywhere except at the point (0,0), where it is 0. Thus (0,0) must be a minimum.
* Exercise 4.
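Maximize $U(x, y) = x^{\alpha} y^{\beta}$, $\alpha + \beta = 1$, subject to $px + y = I$. Do this first by substitution and then by Lagrange multipliers.
Solution:
Substitution:
$px + y = I \Rightarrow y = I - px$, so
$$U(x, y) = x^{\alpha} y^{\beta} \Rightarrow U(x) = x^{\alpha}(I - px)^{\beta}$$
$$\frac{\partial U}{\partial x} = \alpha x^{\alpha - 1}(I - px)^{\beta} - \beta p x^{\alpha}(I - px)^{\beta - 1} = 0$$
$$\alpha x^{\alpha - 1}(I - px)^{\beta} = \beta p x^{\alpha}(I - px)^{\beta - 1}$$
$$\alpha(I - px) = \beta p x \Rightarrow \alpha I = (\alpha + \beta)px$$
$$x = \frac{\alpha I}{p} \quad \text{(remember } \alpha + \beta = 1\text{)}$$
$$y = I - px = (1 - \alpha)I = \beta I$$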

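Lagrange:
$$L(x, y, \lambda) = x^{\alpha} y^{\beta} - \lambda(px + y - I)$$
$$\frac{\partial L}{\partial x} = \alpha x^{\alpha - 1} y^{\beta} - \lambda p = 0 \Rightarrow \alpha x^{\alpha - 1} y^{\beta} = \lambda p$$
$$\frac{\partial L}{\partial y} = \beta x^{\alpha} y^{\beta - 1} - \lambda = 0 \Rightarrow \beta x^{\alpha} y^{\beta - 1} = \lambda$$
For the problems encountered in economics, it is almost always useful to divide the two first-order conditions of the Lagrangian:
$$\frac{\alpha x^{\alpha - 1} y^{\beta}}{\beta x^{\alpha} y^{\beta - 1}} = \frac{\lambda p}{\lambda} \Rightarrow \frac{\alpha y}{\beta x} = p$$
We plug this into the budget constraint:
$$px + y = I \Rightarrow \frac{\alpha y}{\beta x}x + y = \frac{\alpha y}{\beta} + y = \frac{(\alpha + \beta)y}{\beta} = \frac{y}{\beta} = I \Rightarrow y = \beta I$$
$$px + y = I \Rightarrow px + \beta I = I \Rightarrow x = \frac{(1 - \beta)I}{p} = \frac{\alpha I}{p}$$
The results are the same.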
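A brute-force numerical check can make the closed-form demands more tangible. The sketch below is an illustration only: the parameter values (alpha = 0.3, p = 2, I = 10) and the function names are assumptions chosen for this example, and the grid search is a crude stand-in for either solution method.

```python
def utility(x, y, alpha):
    """Cobb-Douglas utility with beta = 1 - alpha."""
    return x ** alpha * y ** (1 - alpha)


def closed_form_demand(alpha, p, income):
    """x* = alpha*I/p and y* = beta*I, as derived by substitution and by Lagrange."""
    return alpha * income / p, (1 - alpha) * income


def grid_search_demand(alpha, p, income, steps=100_000):
    """Walk along the budget line px + y = I and keep the best bundle."""
    best_x, best_u = None, float("-inf")
    for i in range(1, steps):
        x = (income / p) * i / steps
        y = income - p * x
        u = utility(x, y, alpha)
        if u > best_u:
            best_x, best_u = x, u
    return best_x, income - p * best_x


alpha, p, income = 0.3, 2.0, 10.0  # hypothetical values
x_star, y_star = closed_form_demand(alpha, p, income)
x_num, y_num = grid_search_demand(alpha, p, income)
print(x_star, y_star)
print(x_num, y_num)
```

Both lines should agree up to the grid resolution.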
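However, the Lagrangian multiplier method does allow us to say one more thing: we can now interpret the multiplier $\lambda$. Let's derive the marginal utility of income, that is, the amount of extra utility derived from more income. So if we call our optimal consumption solutions $x^*(I) = \frac{\alpha I}{p}$, $y^*(I) = \beta I$, our utility at optimal consumption is $U^*(I) = U(x^*, y^*)$ and the marginal utility of income is:
$$\frac{\partial U^*(I)}{\partial I} = \frac{\partial U(x^*, y^*)}{\partial x}\frac{\partial x^*}{\partial I} + \frac{\partial U(x^*, y^*)}{\partial y}\frac{\partial y^*}{\partial I}$$
Now we know from the Lagrange conditions that $\frac{\partial U(x^*, y^*)}{\partial x} = \lambda p$ and $\frac{\partial U(x^*, y^*)}{\partial y} = \lambda$, so we get:
$$\frac{\partial U(x^*, y^*)}{\partial x}\frac{\partial x^*}{\partial I} + \frac{\partial U(x^*, y^*)}{\partial y}\frac{\partial y^*}{\partial I} = \lambda p\frac{\partial x^*}{\partial I} + \lambda\frac{\partial y^*}{\partial I} = \lambda\left(p\frac{\partial x^*}{\partial I} + \frac{\partial y^*}{\partial I}\right)$$
Let us have a look at the term $p\frac{\partial x^*}{\partial I} + \frac{\partial y^*}{\partial I}$. It measures marginal expenditure as income increases. But since expenditure and income are equal at optimal consumption (we don't leave money lying around), this term must be 1 (this can also be seen by implicitly differentiating the budget constraint w.r.t. income). So
$$\frac{\partial U^*(I)}{\partial I} = \lambda.$$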
Three additional exercises on the shadow price. See the lecture slides of this week for further explanation of the shadow price. Below, three examples show how to compute it.
Additional exercise 1: shadow price
Minimize $f(x, y) = x^2 + y^2$, s.t. $x + y = c$, and calculate the shadow price. (On this line the function has no maximum; the critical point of the Lagrangian is the constrained minimum. The shadow price calculation is the same.)
$$L(x, y, \lambda) = x^2 + y^2 - \lambda(x + y - c)$$
$$\frac{\partial L}{\partial x} = 2x - \lambda = 0, \quad \frac{\partial L}{\partial y} = 2y - \lambda = 0 \Rightarrow x = y = \frac{c}{2}, \; \lambda = c$$
Now our value function $f^*(c)$ is the objective function f evaluated at the optimal values $x^*(c), y^*(c)$:
$$f^*(c) = f(x^*(c), y^*(c)) = \left(\frac{c}{2}\right)^2 + \left(\frac{c}{2}\right)^2 = \frac{c^2}{2}$$
We can now take the derivative of this with respect to c. This shows us how much our optimal value of f changes when we change the constraint c:
$$\frac{\partial f^*(c)}{\partial c} = \frac{\partial}{\partial c}\left(\frac{c^2}{2}\right) = c = \lambda$$
Hence, the interpretation of $\lambda$ is that it gives the marginal change of the objective function if the constraint is changed by one unit.
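A slightly more complicated example to drive the point home:
Maximize $f(x, y) = x + y$, s.t. $x^2 + y = c$.
Construct the Lagrangian function:
$$L(x, y, \lambda) = x + y - \lambda(x^2 + y - c)$$
$$\frac{\partial L}{\partial x} = 1 - 2\lambda x = 0, \quad \frac{\partial L}{\partial y} = 1 - \lambda = 0 \Rightarrow \lambda = 1, \; x = \frac{1}{2}, \; y = c - \frac{1}{4}$$
Now the value function becomes:
$$f^*(c) = f(x^*(c), y^*(c)) = \frac{1}{2} + c - \frac{1}{4}$$
and
$$\frac{\partial f^*(c)}{\partial c} = 1 = \lambda$$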
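In one variable it is a bit confusing, but it still works:
Maximize $f(x) = x^2$, s.t. $x = c$
Construction of the Lagrangian function:
$$L(x, \lambda) = x^2 - \lambda(x - c)$$
$$\frac{\partial L}{\partial x} = 2x - \lambda = 0 \Rightarrow \lambda = 2x, \quad x = c \Rightarrow \lambda = 2c$$
Our value function becomes:
$$f^*(c) = f(x^*(c)) = c^2$$
and
$$\frac{\partial f^*(c)}{\partial c} = 2c = \lambda$$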
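The shadow price interpretation can be checked numerically by differentiating each value function with a finite difference and comparing it with the multiplier. This is a small sketch under stated assumptions: the dictionary `examples` and the helper `derivative` are names invented here, and the value functions and multipliers are simply the closed forms from the three examples.

```python
def derivative(fn, c, h=1e-6):
    """Central finite-difference approximation of d fn / d c."""
    return (fn(c + h) - fn(c - h)) / (2 * h)


# (value function f*(c), multiplier lambda(c)) for the three examples.
examples = {
    "x^2 + y^2 s.t. x + y = c": (lambda c: c ** 2 / 2, lambda c: c),
    "x + y s.t. x^2 + y = c": (lambda c: 0.5 + c - 0.25, lambda c: 1.0),
    "x^2 s.t. x = c": (lambda c: c ** 2, lambda c: 2 * c),
}

c = 3.0  # arbitrary constraint level, chosen for illustration
for name, (value_fn, multiplier) in examples.items():
    print(name, derivative(value_fn, c), multiplier(c))
```

In each row the numerical derivative of the value function should match the multiplier.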

Week 6 Constrained optimization and Integral calculus

Klein: Chapter 11
Value function: K.11.2.
Envelope theorem: K.11.2.
Convex constraints, multiple constraints, slackness: see lecture slides

Klein: Chapter 12
Area under curve: K.12.1.
Anti-derivative, fundamental theorem of calculus: K.12.1.
Simple rules: K.12.2.
Substitution: K.12.2.
Integration by parts: K.12.2.
Improper integrals: K.12.2.

Envelope theorem
Take a profit function. Question: for which Q are the profits highest?
Demand function:
$$P = 3 - 0.5Q$$
Cost function:
$$C = 2Q$$
The objective function (profit function) becomes:
$$\pi(Q) = (3 - 2)Q - 0.5Q^2$$
$$Q^* = \frac{(3 - 2)}{2 \cdot 0.5} = 1$$
Next, we generalize this exercise. We introduce the parameters a, b, and c.
Demand function:
$$P = a - bQ$$
Cost function:
$$C = cQ$$
The objective function (profit function) becomes:
$$\pi(Q, a, b, c) = (a - c)Q - bQ^2$$
$$Q^* = \frac{(a - c)}{2b}$$
Substitute $Q^*$ in the objective function. The maximum value function is the following:
$$\pi(Q^*, a, b, c) = \frac{(a - c)^2}{4b}$$
which becomes:
$$F(a, b, c) = \frac{(a - c)^2}{4b}$$
Thus the maximum value is a function of the parameters a, b, and c. The interpretation of F is that it is an indirect objective function.
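To see what the envelope theorem claims for this profit example, one can compare two numerical derivatives with respect to the parameter a: the change in the maximum value function F, and the change in the profit function with Q held fixed at Q*. The sketch below is illustrative; the function names and the step size h are assumptions of this example.

```python
def profit(q, a, b, c):
    """pi(Q, a, b, c) = (a - c) Q - b Q^2."""
    return (a - c) * q - b * q ** 2


def q_star(a, b, c):
    """Profit-maximizing quantity Q* = (a - c) / (2b)."""
    return (a - c) / (2 * b)


def max_profit(a, b, c):
    """Maximum value function F(a, b, c) = (a - c)^2 / (4b)."""
    return profit(q_star(a, b, c), a, b, c)


def total_effect_of_a(a, b, c, h=1e-6):
    """dF/da: Q* is allowed to adjust when a changes."""
    return (max_profit(a + h, b, c) - max_profit(a - h, b, c)) / (2 * h)


def direct_effect_of_a(a, b, c, h=1e-6):
    """d pi/da with Q frozen at the original Q*."""
    q = q_star(a, b, c)
    return (profit(q, a + h, b, c) - profit(q, a - h, b, c)) / (2 * h)


a, b, c = 3.0, 0.5, 2.0  # the numbers of the first example
print(q_star(a, b, c), max_profit(a, b, c))
print(total_effect_of_a(a, b, c), direct_effect_of_a(a, b, c))
```

Both derivatives should coincide and equal Q*, since the indirect effect through Q* vanishes at the optimum.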

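Thus:
$z = f(x, y; \alpha)$ is the objective function; x and y are choice variables.
$x^* = x^*(\alpha)$ and $y^* = y^*(\alpha)$ are the optimal x and y.
$$z^* = f(x^*(\alpha), y^*(\alpha); \alpha) = F(\alpha)$$
Thus
$$\frac{\partial z^*}{\partial \alpha_i} = \frac{\partial F(\alpha)}{\partial \alpha_i} = \frac{\partial f(x^*(\alpha), y^*(\alpha); \alpha)}{\partial \alpha_i},$$
where the last derivative is taken with respect to the explicit argument $\alpha_i$ only, holding x and y fixed at their optimal values.
Reason:
$$\frac{\partial F(\alpha)}{\partial \alpha_i} = \frac{\partial f(x^*(\alpha), y^*(\alpha); \alpha)}{\partial x}\frac{\partial x^*}{\partial \alpha_i} + \frac{\partial f(x^*(\alpha), y^*(\alpha); \alpha)}{\partial y}\frac{\partial y^*}{\partial \alpha_i} + \frac{\partial f(x^*(\alpha), y^*(\alpha); \alpha)}{\partial \alpha_i}$$
$$= 0 \cdot \frac{\partial x^*}{\partial \alpha_i} + 0 \cdot \frac{\partial y^*}{\partial \alpha_i} + \frac{\partial f(x^*(\alpha), y^*(\alpha); \alpha)}{\partial \alpha_i} = \frac{\partial f(x^*(\alpha), y^*(\alpha); \alpha)}{\partial \alpha_i}$$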

Example (Hotelling's lemma)
Production function: $Q = f(K, L)$
where Q: output, K: capital and L: labour.
Profit function: $\pi(K, L, p, r, w) = pf(K, L) - rK - wL$
Maximize profit with respect to K, L, keeping p, r, and w fixed.
Optimal K and L are: $K^* = K^*(p, r, w)$ and $L^* = L^*(p, r, w)$, so that
$$\pi^*(K^*, L^*, p, r, w) = pf(K^*, L^*) - rK^* - wL^*$$
Hence:
$$\frac{\partial \pi^*}{\partial p} = f(K^*, L^*) = Q^* > 0$$
and
$$\frac{\partial \pi^*}{\partial r} = -K^* < 0:$$
how much profit is lost if the price of capital increases by a small amount?
and
$$\frac{\partial \pi^*}{\partial w} = -L^* < 0$$
Note that the optimization problem can also be written as (page 322 of Klein):
Profit function:
$$\pi(K, L, Y, p, r, w, \lambda) = pY - rK - wL - \lambda(f(K, L) - Y)$$

Next, for constrained optimization
Example (Roy's identity):
U: utility function that depends on quantities $x_1, ..., x_n$
Prices are $p_1, ..., p_n$
Income is y
Maximum utility (Lagrangian function):
$$L(x_1, ..., x_n, p_1, ..., p_n, y) = U(x_1, ..., x_n) - \lambda(p_1x_1 + ... + p_nx_n - y)$$
$$\frac{\partial L}{\partial y} = \lambda$$
And
$$\frac{\partial L}{\partial p_i} = -\lambda x_i$$
Application of the general envelope theorem:
The marginal utility of income is equal to
$$\frac{\partial U^*(p_1, ..., p_n; y)}{\partial y} = \frac{\partial L(x_1, ..., x_n, p_1, ..., p_n, y)}{\partial y} = \lambda$$
The marginal disutility of a price increase is the marginal utility of income ($\lambda$) times the quantity demanded ($x_i^*$):
$$\frac{\partial U^*(p_1, ..., p_n; y)}{\partial p_i} = \frac{\partial L(x_1, ..., x_n, p_1, ..., p_n, y)}{\partial p_i} = -\lambda x_i^*$$

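Maximize
$$f(x_1, ..., x_n; \alpha_1, ..., \alpha_m)$$
Subject to
$$g(x_1, ..., x_n; \alpha_1, ..., \alpha_m) = 0$$
Lagrangian function:
$$L(x_1, ..., x_n; \alpha_1, ..., \alpha_m) = f(x_1, ..., x_n; \alpha_1, ..., \alpha_m) - \lambda g(x_1, ..., x_n; \alpha_1, ..., \alpha_m)$$
Take the partial derivative of the Lagrangian with respect to $x_i$:
$$f_{x_i}(x_1, ..., x_n; \alpha_1, ..., \alpha_m) - \lambda g_{x_i}(x_1, ..., x_n; \alpha_1, ..., \alpha_m) = 0$$
Consequence 1:
$$f_{x_i}(x_1, ..., x_n; \alpha_1, ..., \alpha_m) = \lambda g_{x_i}(x_1, ..., x_n; \alpha_1, ..., \alpha_m) \quad (f_{x_i} = \lambda g_{x_i})$$
Take the partial derivative of the Lagrangian with respect to $\lambda$:
$$g(x_1, ..., x_n; \alpha_1, ..., \alpha_m) = 0$$
Consequence 2 (differentiate the constraint with respect to $\alpha_i$):
$$g_{x_1}\frac{\partial x_1^*}{\partial \alpha_i} + \dots + g_{x_n}\frac{\partial x_n^*}{\partial \alpha_i} + g_{\alpha_i} = 0$$
$$g_{x_1}\frac{\partial x_1^*}{\partial \alpha_i} + \dots + g_{x_n}\frac{\partial x_n^*}{\partial \alpha_i} = -g_{\alpha_i}$$
The maximum value function is:
$$F(\alpha_1, ..., \alpha_m) = f(x_1^*(\alpha_1, ..., \alpha_m), ..., x_n^*(\alpha_1, ..., \alpha_m); \alpha_1, ..., \alpha_m)$$
Partial derivative of the maximum value function with respect to $\alpha_i$:
$$F_{\alpha_i}(\alpha_1, ..., \alpha_m) = f_{x_1}\frac{\partial x_1^*}{\partial \alpha_i} + \dots + f_{x_n}\frac{\partial x_n^*}{\partial \alpha_i} + f_{\alpha_i}$$
Which is rewritten because of consequence 1:
$$= \lambda g_{x_1}\frac{\partial x_1^*}{\partial \alpha_i} + \dots + \lambda g_{x_n}\frac{\partial x_n^*}{\partial \alpha_i} + f_{\alpha_i} = \lambda\left(g_{x_1}\frac{\partial x_1^*}{\partial \alpha_i} + \dots + g_{x_n}\frac{\partial x_n^*}{\partial \alpha_i}\right) + f_{\alpha_i}$$
(Next, we make use of consequence 2):
$$= -\lambda g_{\alpha_i} + f_{\alpha_i}$$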

Kuhn-Tucker conditions (one variable; one constraint)
There is an objective function that depends on one variable x: $f(x)$.
We want to maximize $f(x)$.
We introduce the inequality constraint: $g(x) \le c$
The constraint is binding (active) if $g(x^*) = c$
The constraint is nonbinding (inactive/slack) if $g(x^*) < c$
The Lagrangian function becomes:
$$L(x, \lambda, c) = f(x) - \lambda(g(x) - c)$$
The values of x and $\lambda$ that maximize the objective function subject to the constraint must satisfy one of two conditions:
either $\lambda$ is positive and the constraint is binding
or $\lambda$ is zero and the constraint is nonbinding
We write these necessary conditions as:
$$\frac{\partial L}{\partial \lambda} \ge 0, \quad \lambda \ge 0 \quad \text{(nonnegativity condition)}$$
and
$$\lambda\frac{\partial L}{\partial \lambda} = 0 \quad \text{(slack condition)}$$
and we have the condition:
$$\frac{\partial L}{\partial x} \le 0 \quad \text{(with equality at an interior solution, } x > 0\text{)}$$
Hence, there are two possibilities:
First, the constraint is binding, so that it involves the Lagrangian analysis of before.
Second, the constraint is nonbinding, so that we have an unconstrained optimization.
Thus, the following cases must be checked:
1. x = 0 and $\lambda$ = 0 (border solution, non-binding constraint)
2. x = 0 and $\lambda$ > 0 (border solution, binding constraint)
3. x > 0 and $\lambda$ = 0 (interior solution and a non-binding constraint)
4. x > 0 and $\lambda$ > 0 (interior solution and a binding constraint)

Kuhn-Tucker conditions (two variables; one constraint)
We maximize $f(x_1, x_2)$
We introduce the inequality constraint: $g(x_1, x_2) \le c$
The Lagrangian function becomes:
$$L(x_1, x_2, \lambda, c) = f(x_1, x_2) - \lambda(g(x_1, x_2) - c)$$
There are three conditions:
First, the partial derivatives with respect to $x_1$ and $x_2$ are zero:
$$\frac{\partial L(x_1, x_2, \lambda, c)}{\partial x_1} = f_{x_1}(x_1, x_2) - \lambda g_{x_1}(x_1, x_2) = 0$$
$$\frac{\partial L(x_1, x_2, \lambda, c)}{\partial x_2} = f_{x_2}(x_1, x_2) - \lambda g_{x_2}(x_1, x_2) = 0$$
Second, we introduce the complementary slackness conditions:
$$\lambda \ge 0$$
$$\lambda = 0 \text{ if } g(x_1, x_2) < c$$
Third, require $(x_1, x_2)$ to satisfy the constraint $g(x_1, x_2) \le c$
Find all the points $(x_1, x_2)$ together with $\lambda$ that satisfy the three conditions. Consider these as candidates for optimality.

Example
Maximize $f(x_1, x_2) = x_1^2 + x_2^2 + x_2 - 1$ subject to $g(x_1, x_2) = x_1^2 + x_2^2 \le 1$
The Lagrangian is
$$L(x_1, x_2, \lambda) = x_1^2 + x_2^2 + x_2 - 1 - \lambda(x_1^2 + x_2^2 - 1)$$
so that the first-order conditions are
$$\frac{\partial L(x_1, x_2, \lambda)}{\partial x_1} = 2x_1 - 2\lambda x_1 = 0$$
$$\frac{\partial L(x_1, x_2, \lambda)}{\partial x_2} = 2x_2 + 1 - 2\lambda x_2 = 0$$
Complementary slackness conditions are
$$\lambda \ge 0$$
$$\lambda = 0 \text{ if } x_1^2 + x_2^2 < 1$$
1) The first-order condition for $x_1$ gives $x_1 = 0$ or $\lambda = 1$
2) $\lambda = 1$ does not satisfy the first-order condition for $x_2$. Thus, $x_1 = 0$
3) Consider $x_1 = 0$ for $x_1^2 + x_2^2 = 1$:
a. If $x_2 = 1$ then $\lambda = 3/2$ (satisfies the slackness condition)
b. If $x_2 = -1$ then $\lambda = 1/2$ (satisfies the slackness condition)
c. If $x_1^2 + x_2^2 = 0 + x_2^2 < 1$ then $\lambda = 0$. According to the first-order condition for $x_2$, if $\lambda = 0$ then $x_2 = -1/2$
There are three candidates for optimality:
$$f(0, 1) = 1$$
$$f(0, -1) = -1$$
$$f(0, -1/2) = -5/4$$
Thus $(x_1, x_2) = (0, 1)$ solves the maximization problem.
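The candidate check in this example can be mirrored in a short script. It is only a sketch: the candidate list is typed in from the three cases above, and the helper names (`multiplier_on_boundary`, `satisfies_kuhn_tucker`) are hypothetical.

```python
def f(x1, x2):
    return x1 ** 2 + x2 ** 2 + x2 - 1


def g(x1, x2):
    return x1 ** 2 + x2 ** 2


def multiplier_on_boundary(x2):
    """Solve 2*x2 + 1 - 2*lam*x2 = 0 for lam (with x1 = 0, x2 != 0)."""
    return (2 * x2 + 1) / (2 * x2)


def satisfies_kuhn_tucker(x1, x2, lam, tol=1e-12):
    stationary = (abs(2 * x1 - 2 * lam * x1) < tol
                  and abs(2 * x2 + 1 - 2 * lam * x2) < tol)
    feasible = g(x1, x2) <= 1 + tol
    # lam >= 0, and lam must be zero whenever the constraint is slack.
    slack_ok = lam >= 0 and (abs(g(x1, x2) - 1) < tol or abs(lam) < tol)
    return stationary and feasible and slack_ok


candidates = [
    (0.0, 1.0, multiplier_on_boundary(1.0)),    # case a: lam = 3/2
    (0.0, -1.0, multiplier_on_boundary(-1.0)),  # case b: lam = 1/2
    (0.0, -0.5, 0.0),                           # case c: interior, lam = 0
]

valid = [cand for cand in candidates if satisfies_kuhn_tucker(*cand)]
best = max(valid, key=lambda cand: f(cand[0], cand[1]))
print(best, f(best[0], best[1]))  # (0.0, 1.0, 1.5) 1.0
```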


Integral Calculus
We compute the area under the function between the lower limit a and the upper limit b:
$$\int_a^b f(x)dx$$
Note that:
1) $\int_a^c f(x)dx = \int_a^b f(x)dx + \int_b^c f(x)dx$
2) $\int_a^b (f(x) \pm g(x))dx = \int_a^b f(x)dx \pm \int_a^b g(x)dx$
3) $\int_a^b f(x)dx = -\int_b^a f(x)dx$
We compute the antiderivative of $f(x)$, which is $F(x)$, such that
$$\frac{dF(x)}{dx} = f(x)$$
Take $f(x) = 5x^2$
$$F(x) = \frac{5}{3}x^3 + C \quad \text{where C is a constant}$$
$$\int_a^b 5x^2 dx = \left[\frac{5}{3}x^3\right]_{x=a}^{b} = \frac{5}{3}b^3 - \frac{5}{3}a^3 = \frac{5}{3}(b^3 - a^3)$$

Rules of integration
1) Polynomial function
$$\int ax^n dx = \frac{a}{n+1}x^{n+1} + C$$
2) How to treat a constant in the integral?
$$\int kf(x)dx = k\int f(x)dx$$
Example:
$$\int_0^3 16x^{1/2}dx = 16\left[\frac{2}{3}x^{3/2}\right]_{x=0}^{3} = 16 \cdot \frac{2}{3}\left(3\sqrt{3} - 0\right) = 32\sqrt{3}$$
3) Exponential function
$$\frac{de^{kx}}{dx} = ke^{kx}$$
Thus
$$\int ae^{kx}dx = \frac{a}{k}e^{kx} + C$$
4) Logarithmic function - I
$$\frac{d\ln(x)}{dx} = \frac{1}{x}$$
Thus
$$\int \frac{a}{x + k}dx = a\ln(x + k) + C$$
5) Logarithmic function - II
$$\frac{d\ln(f(x))}{dx} = \frac{f'(x)}{f(x)}$$
Thus
$$\int \frac{f'(x)}{f(x)}dx = \ln(f(x)) + C$$

6) Integration by parts
Product rule for differentiation:
$$\frac{d(f(x)g(x))}{dx} = f'(x)g(x) + g'(x)f(x)$$
Thus the analogue for integration is:
$$f(x)g(x) + C = \int f'(x)g(x)dx + \int g'(x)f(x)dx$$
Thus: $\int u\,dv = uv - \int v\,du + C$
Example:
$$\int x^2\ln(x)dx = \int \ln(x)\,d\frac{x^3}{3} = \frac{x^3}{3}\ln(x) - \int \frac{x^3}{3}\,d\ln(x)$$
$$= \frac{x^3}{3}\ln(x) - \int \frac{x^3}{3x}dx$$
$$= \frac{x^3}{3}\ln(x) - \frac{1}{3}\int x^2 dx$$
$$= \frac{x^3}{3}\ln(x) - \frac{1}{3^2}x^3 + C$$

7) Substitution method:
$$g(h(x)) + C = \int g'(h(x))h'(x)dx$$
or
$$u = h(x)$$
$$du = h'(x)dx$$
$$\int g'(h(x))h'(x)dx = \int g'(u)du$$
Example: $\int (3x^2 + 1)6x\,dx$
Solution 1:
$$\int (3x^2 + 1)6x\,dx = \int (3x^2 + 1)\,d(3x^2 + 1) = \frac{1}{2}(3x^2 + 1)^2 + C$$
Solution 2:
$$\int (3x^2 + 1)6x\,dx = \int u\,du = \frac{1}{2}u^2 + C = \frac{1}{2}(3x^2 + 1)^2 + C$$

Advanced Mathematics Week 6 Technical tutorial (Wednesday)
Exercise 1.
Maximize $f(x) = x^3 - 3x^2 + x$ s.t. $x \le 6$ and $x \ge -10$
Solution:
Let's start by drawing the graph on the indicated domain:
[Figure: graph of f(x) on the interval from -10 to 6]
Let's also zoom in a bit on the action:
[Figure: close-up of f(x) around the two points where it becomes flat]
Looking at the figures, we can expect to find four points of interest with the Kuhn-Tucker method: the two points where the function becomes flat and the two points where one of the restrictions becomes binding. Let's compute and see if it works out.
We construct the Lagrangian:
$$L(x) = x^3 - 3x^2 + x - \lambda_1(x - 6) - \lambda_2(-x - 10)$$
Note that we always construct the Lagrangian with constraints of the form $g(x) \le c$, so we had to rewrite $x \ge -10$ as $-x \le 10$.
The first-order condition:
$$\frac{\partial L}{\partial x}(x) = 3x^2 - 6x + 1 - \lambda_1 + \lambda_2 = 0$$
Now there are two possibilities: either a constraint is binding, meaning it holds with equality, or it is slack, meaning that it holds with strict inequality. If the constraint is slack, we want the associated multiplier (lambda) to equal zero. For maximization, we want the multiplier to be positive if the constraint is binding. Let's first try it with the multipliers equal to zero and see if we end up with points at which the constraints are slack.
$$3x^2 - 6x + 1 = 0 \Rightarrow x = \frac{6 + \sqrt{24}}{6} \;\lor\; x = \frac{6 - \sqrt{24}}{6}$$
We get two critical points that both lie strictly between -10 and 6 (they are then called interior points). So these are candidate optima and we have to check the second-order conditions to see if they are local minima or maxima or inflection points:
$$\frac{\partial^2 f(x)}{\partial x^2} = 6x - 6,$$
which we evaluate at our critical points:
$$6 \cdot \frac{6 - \sqrt{24}}{6} - 6 = -\sqrt{24} < 0, \quad 6 \cdot \frac{6 + \sqrt{24}}{6} - 6 = \sqrt{24} > 0$$
So the first point is a maximum and the second a minimum.

Now let's check the solutions at which the constraints are binding. Since clearly x cannot equal -10 and 6 at the same time, only one of the constraints can bind.
Let's first try x = 6. Then it must hold that $\lambda_1 > 0$ if this is an actual constrained maximum (remember that the other multiplier should still be zero, as its constraint is not binding).
$$\left.3x^2 - 6x + 1 - \lambda_1\right|_{x=6} = 0 \Rightarrow 73 - \lambda_1 = 0 \Rightarrow \lambda_1 = 73 > 0$$
So this is in fact a maximum.
Now let's try the other constraint, x = -10:
$$\left.3x^2 - 6x + 1 + \lambda_2\right|_{x=-10} = 0 \Rightarrow 361 + \lambda_2 = 0 \Rightarrow \lambda_2 = -361 < 0$$
So here the Kuhn-Tucker conditions do not hold and this is not a local maximum (it is in fact a minimum).
We could now compare the two local maxima we found to see which is bigger and is the global maximum, but we will not do so, as this is only cumbersome and the answer is obvious from the graph. x = 6 is the global maximum.
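The comparison that the text skips is quick to do by machine. The sketch below recomputes the candidates and the multipliers and then picks the largest local maximum; the variable names are assumptions of this illustration, and the multiplier formulas simply rearrange the first-order condition $3x^2 - 6x + 1 - \lambda_1 + \lambda_2 = 0$ with the non-binding multiplier set to zero.

```python
import math


def f(x):
    return x ** 3 - 3 * x ** 2 + x


def f_prime(x):
    return 3 * x ** 2 - 6 * x + 1


# Interior candidates: roots of f'(x) = 0.
interior = [(6 - math.sqrt(24)) / 6, (6 + math.sqrt(24)) / 6]

# Multipliers at the two bounds (the other multiplier is zero there).
lambda_1 = f_prime(6)     # binding upper bound x <= 6
lambda_2 = -f_prime(-10)  # binding lower bound -x <= 10

# Keep interior points with negative second derivative (6x - 6 < 0) ...
local_maxima = [x for x in interior if 6 * x - 6 < 0]
# ... and bounds whose multiplier has the sign required for a maximum.
if lambda_1 >= 0:
    local_maxima.append(6)
if lambda_2 >= 0:
    local_maxima.append(-10)

global_max = max(local_maxima, key=f)
print(lambda_1, lambda_2)         # 73 -361
print(global_max, f(global_max))  # 6 114
```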
Now, where do these Kuhn-Tucker conditions come from?
Remember that the first-order condition in general says:
$$\frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} = 0$$
For a positive lambda (which the Kuhn-Tucker conditions demand) this means that the derivative of the objective function and the derivative of the constraint must have the same sign. Let's think about this. For a maximum at a constraint, we want the function to be increasing if the constraint is an upper bound; it's only a maximum if we could increase the value by relaxing the constraint. But if the constraint is an upper bound, then the constraint function must also be increasing. (See the figure, where we zoomed in on our function at x = 6 and scaled it down for clarity. The figure also shows the constraint.)
[Figure: f(x), scaled down, and the constraint around x = 6]
Similarly, if the constraint is a lower bound, we want our function to be decreasing there (so that by going below the lower bound, we would increase our value). But for a constraint to be a lower bound, the constraint function must be decreasing. In the case of our function and the lower bound constraint, this was not the case.
[Figure: f(x), scaled down, and the constraint around x = -10]
Clearly, though, this is a local constrained minimum. What Kuhn-Tucker, for all its complexity, boils down to is simply checking that the signs of the derivatives of the function and the constraint(s) are equal for a maximum and opposite for a minimum.
* Exercise 2. Maximize $f(x, y) = x^{\alpha}y$, where $\alpha > 0$, subject to $x^2 + 2y^2 \le 10$, $2x^2 + y^2 \le 10$, $x + y \le 3$. Determine which constraints are binding for which values of $\alpha$.
Solution:
Let's first get a picture of what these constraints mean. The ones with squares in them remind us of the circle formulas we have seen. In fact they are ellipses: elongated circles (in this particular case you could obtain them from circles by applying the linear map that we discussed in week 3, where you double your vectors in the x-direction and leave them unchanged in the y-direction). The third constraint is of course just a straight line. The whole thing then looks like this:
[Figure: the two ellipses and the line, with the feasible area in red]
The red area is the area where all the constraints hold. You could think of our function as a hill landscape over the plane. When we optimize it subject to our constraints, we restrict ourselves to looking for peaks and valleys on the red area.
Now let's have a look at our objective function, $f(x, y) = x^{\alpha}y$. Because $x^{\alpha}$ is only defined for negative x if $\alpha$ is an integer (a whole number), we restrict our attention to the case where x is positive. Then f(x, y) is increasing in x and y as long as y is positive. So, because we are maximizing, we can restrict our attention to the following area:
[Figure: the part of the feasible area with x and y positive]
Because we want to maximize and f(x, y) is increasing in both our variables, we expect to end up somewhere on the outer edge of the red area. But where? This will depend on $\alpha$. The higher $\alpha$, the more x contributes to a higher outcome, so the more we want to move to a higher x, at the expense of a lower y. So what we expect is that, as we increase $\alpha$ from 0 up to infinity, we will move along the outer edge of the red area, starting at the y-axis and moving towards the x-axis.
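For $\alpha = 3$, we obtain a case where two of the constraints are binding. We write down the Lagrangian:
$$L(x, y) = x^3y - \lambda_1(x^2 + 2y^2 - 10) - \lambda_2(2x^2 + y^2 - 10) - \lambda_3(x + y - 3)$$
From our calculations, we expect the second and third constraint to be binding here. As we saw, this happens at the point $(x, y) = \left(\frac{3 + 2\sqrt{3}}{3}, \frac{6 - 2\sqrt{3}}{3}\right) \approx (2.15, 0.85)$. What we have to check now is that both the associated lambdas are positive at this point. So let's calculate (with $\lambda_1 = 0$, since the first constraint is slack):
$$\left.\frac{\partial L(x, y)}{\partial x}\right|_{x = 2.15,\, y = 0.85} = \left.3x^2y - 4\lambda_2x - \lambda_3\right|_{x = 2.15,\, y = 0.85} \approx 12 - 9\lambda_2 - \lambda_3 = 0$$
$$\left.\frac{\partial L(x, y)}{\partial y}\right|_{x = 2.15,\, y = 0.85} = \left.x^3 - 2\lambda_2y - \lambda_3\right|_{x = 2.15,\, y = 0.85} \approx 10 - 2\lambda_2 - \lambda_3 = 0$$
$$\lambda_2 \approx \frac{2}{7} > 0, \quad \lambda_3 \approx \frac{66}{7} > 0$$
So we do indeed find that the multipliers are positive, so the Kuhn-Tucker conditions hold, so we have a local constrained maximum.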
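Because the multipliers in the text come from rounded coefficients, it can be reassuring to recompute them at the exact point. The sketch below solves the two first-order conditions for the second and third multipliers with the first multiplier set to zero; the variable names are assumptions of this illustration. The unrounded values differ slightly from the rounded fractions, but the signs are what matters.

```python
import math

alpha = 3
x = (3 + 2 * math.sqrt(3)) / 3
y = (6 - 2 * math.sqrt(3)) / 3

# First-order conditions with lambda_1 = 0:
#   3 x^2 y - 4 lambda_2 x - lambda_3 = 0
#   x^3     - 2 lambda_2 y - lambda_3 = 0
# Subtracting the second from the first eliminates lambda_3.
lambda_2 = (alpha * x ** 2 * y - x ** 3) / (4 * x - 2 * y)
lambda_3 = x ** 3 - 2 * lambda_2 * y

# Which constraints bind at this point?
first = x ** 2 + 2 * y ** 2   # should be strictly below 10 (slack)
second = 2 * x ** 2 + y ** 2  # should equal 10 (binding)
third = x + y                 # should equal 3 (binding)

print(round(x, 2), round(y, 2))
print(lambda_2, lambda_3)
print(first, second, third)
```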
Exercise 3. Find the following integrals:
a) $\int \frac{x^3 - 2x^2 + 1}{x}dx$
Solution:
We can work out the fraction and split up the parts:
$$\int \frac{x^3 - 2x^2 + 1}{x}dx = \int \left(x^2 - 2x + \frac{1}{x}\right)dx = \int x^2dx - 2\int x\,dx + \int \frac{1}{x}dx = \frac{1}{3}x^3 - x^2 + \log(x) + C$$
We check by taking the derivative:
$$\frac{d}{dx}\left(\frac{1}{3}x^3 - x^2 + \log(x) + C\right) = x^2 - 2x + \frac{1}{x}$$
This, as we saw, is what we started out with under the integral sign.
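b) $\int_2^5 \frac{x^3 - 2x^2 + 1}{x}dx$
Solution:
We already found the indefinite integral under a), so now we just use the fundamental theorem of calculus to plug in:
$$\int_2^5 \frac{x^3 - 2x^2 + 1}{x}dx = \left[\frac{1}{3}x^3 - x^2 + \log(x)\right]_2^5 = \left(\frac{1}{3}125 - 25 + \log(5)\right) - \left(\frac{1}{3}8 - 4 + \log(2)\right)$$
This simplifies to $18 + \log(\frac{5}{2})$, though the exact value is not important here.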
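A numerical quadrature gives an independent check of the fundamental theorem of calculus here. The sketch below uses composite Simpson's rule, which is an assumption of this illustration (the tutorial itself only uses antiderivatives); `simpson` is a hypothetical helper name.

```python
import math


def simpson(fn, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = fn(a) + fn(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and 2 (even i).
        total += (4 if i % 2 else 2) * fn(a + i * h)
    return total * h / 3


def integrand(x):
    return (x ** 3 - 2 * x ** 2 + 1) / x


def antiderivative(x):
    return x ** 3 / 3 - x ** 2 + math.log(x)


numeric = simpson(integrand, 2, 5)
exact = antiderivative(5) - antiderivative(2)
print(numeric, exact)
```

Both numbers should agree to many decimal places.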

c) $\int \frac{-3x^2 + 4x - 1}{2x^3 - 4x^2 + 2x - 6}dx$
Solution:
This is a lucky integral. We just so happen to be able to apply a substitution: $y = g(x) = 2x^3 - 4x^2 + 2x - 6$, then we have $dy = g'(x)dx = (6x^2 - 8x + 2)dx$. If we manipulate our numerator a little bit, it becomes exactly this:
$$\int \frac{-3x^2 + 4x - 1}{2x^3 - 4x^2 + 2x - 6}dx = -\frac{1}{2}\int \frac{6x^2 - 8x + 2}{2x^3 - 4x^2 + 2x - 6}dx = -\frac{1}{2}\int \frac{1}{y}dy$$
$$= -\frac{1}{2}\log(y) + C = -\frac{1}{2}\log(2x^3 - 4x^2 + 2x - 6) + C$$
We check:
$$\frac{d}{dx}\left(-\frac{1}{2}\log(2x^3 - 4x^2 + 2x - 6)\right) = -\frac{1}{2}\cdot\frac{6x^2 - 8x + 2}{2x^3 - 4x^2 + 2x - 6} = \frac{-3x^2 + 4x - 1}{2x^3 - 4x^2 + 2x - 6}$$
This is exactly what we started out with. Notice how lucky we were. If we change the integrand even slightly, say changing the -1 to -2, there is no easy way to find the solution anymore.
d) $\int \sqrt{1 + 2x}\,dx$
Solution:
Here it makes sense to try a slightly complex substitution $y^2 (= h(y)) = g(x) = 1 + 2x$. The rule for the differentials then becomes: $h'(y)dy = g'(x)dx \Rightarrow 2y\,dy = 2dx \Rightarrow y\,dy = dx$. Let's plug this in:
$$\int \sqrt{1 + 2x}\,dx = \int y \cdot y\,dy = \int y^2dy = \frac{1}{3}y^3 + C$$
Now the last step is plugging back in our x. For this we have to rewrite our substitution rule a little: $y^2 = 1 + 2x \Rightarrow y = \sqrt{1 + 2x}$. This we can plug in:
$$\int \sqrt{1 + 2x}\,dx = \frac{1}{3}y^3 + C = \frac{1}{3}(1 + 2x)^{\frac{3}{2}} + C$$
We check:
$$\frac{d}{dx}\frac{1}{3}(1 + 2x)^{\frac{3}{2}} = \frac{1}{3}\cdot\frac{3}{2}(1 + 2x)^{\frac{1}{2}}\cdot 2 = \sqrt{1 + 2x}$$
e) $\int \left(\frac{3}{\sqrt{x - 12}} - 2x^2\right)dx$
Solution:
Here we see a square root to which we would like to apply a substitution, but there is the other term there. But this is no problem; we can just split it off:
$$\int \left(\frac{3}{\sqrt{x - 12}} - 2x^2\right)dx = \int \frac{3}{\sqrt{x - 12}}dx - \int 2x^2dx = \int \frac{3}{\sqrt{x - 12}}dx - \frac{2}{3}x^3 + C$$
Now we can apply a similar substitution as last time to the remaining integral.
$$y^2 = x - 12 \Rightarrow 2y\,dy = dx, \quad y = \sqrt{x - 12}$$
$$\int \frac{3}{\sqrt{x - 12}}dx = \int \frac{3}{y}2y\,dy = \int 6\,dy = 6y + C = 6\sqrt{x - 12} + C$$
Now we can add the two solutions (notice that we just add the two integration constants into one new constant. This doesn't matter, since the constants could be anything anyway. Strictly speaking, we should give all these constants new names, but that would be very cumbersome.)
$$\int \left(\frac{3}{\sqrt{x - 12}} - 2x^2\right)dx = \int \frac{3}{\sqrt{x - 12}}dx - \frac{2}{3}x^3 + C = 6\sqrt{x - 12} - \frac{2}{3}x^3 + C$$
We check:
$$\frac{d}{dx}\left(6\sqrt{x - 12} - \frac{2}{3}x^3\right) = 6 \cdot \frac{1}{2}(x - 12)^{-\frac{1}{2}} - 2x^2 = \frac{3}{\sqrt{x - 12}} - 2x^2$$
f) $\int \log(3x - 7)dx$
Solution:
We try our luck with another substitution:
$$y = 3x - 7 \Rightarrow dy = 3dx \Rightarrow \frac{1}{3}dy = dx$$
$$\int \log(3x - 7)dx = \frac{1}{3}\int \log(y)dy$$
We now have to find an integral for the logarithmic function. We can do this by a tricky application of integration by parts. Remember that integration by parts is the following:
$$\int f(x)g'(x)dx = f(x)g(x) - \int f'(x)g(x)dx$$
We now take the following for our f and g:
$$f(x) = \log(x), \quad g'(x) = 1$$
$$f'(x) = \frac{1}{x}, \quad g(x) = x$$
Notice that we regard $\log(x)$ as $1 \cdot \log(x)$, which is a trick.
Applying partial integration, we get:
$$\frac{1}{3}\int 1 \cdot \log(y)dy = \frac{1}{3}y\log(y) - \frac{1}{3}\int y\frac{1}{y}dy$$
$$= \frac{1}{3}y\log(y) - \frac{1}{3}\int dy = \frac{1}{3}\left(y\log(y) - y\right) + C$$
Now we substitute back:
$$\int \log(3x - 7)dx = \frac{1}{3}\left(y\log(y) - y\right) + C = \frac{1}{3}\left((3x - 7)\log(3x - 7) - 3x + 7\right) + C$$
We check:
$$\frac{d}{dx}\left(\frac{1}{3}\left((3x - 7)\log(3x - 7) - 3x + 7\right)\right) = \frac{1}{3}\left(3\log(3x - 7) + 3\frac{3x - 7}{3x - 7} - 3\right) = \log(3x - 7)$$
g) $\int x^2e^xdx$
Solution:
Here we apply partial integration twice. It works because taking the integral of the exponential is so easy. We take $f(x) = x^2$, $g'(x) = e^x$:
$$\int x^2e^xdx = x^2e^x - \int 2xe^xdx$$
Now we take for the next integral $f(x) = 2x$, $g'(x) = e^x$:
$$x^2e^x - \int 2xe^xdx = x^2e^x - \left[2xe^x - \int 2e^xdx\right] = (x^2 - 2x + 2)e^x + C$$
We check:
$$\frac{d}{dx}(x^2 - 2x + 2)e^x = (x^2 - 2x + 2)e^x + (2x - 2)e^x = x^2e^x$$
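* h) $\int (x^2 + 1)e^{3x+2}dx$
Solution:
This works exactly the same as above:
$$\int (x^2 + 1)e^{3x+2}dx = \frac{1}{3}(x^2 + 1)e^{3x+2} - \frac{1}{3}\int 2xe^{3x+2}dx = \frac{1}{3}(x^2 + 1)e^{3x+2} - \frac{1}{3}\left[\frac{1}{3}2xe^{3x+2} - \frac{1}{3}\int 2e^{3x+2}dx\right]$$
$$= \frac{1}{3}(x^2 + 1)e^{3x+2} - \frac{2}{9}xe^{3x+2} + \frac{2}{27}e^{3x+2} + C = \left(\frac{1}{3}(x^2 + 1) - \frac{2}{9}x + \frac{2}{27}\right)e^{3x+2} + C$$
We check:
$$\frac{d}{dx}\left(\frac{1}{3}(x^2 + 1) - \frac{2}{9}x + \frac{2}{27}\right)e^{3x+2} = 3\left(\frac{1}{3}(x^2 + 1) - \frac{2}{9}x + \frac{2}{27}\right)e^{3x+2} + \left(\frac{2}{3}x - \frac{2}{9}\right)e^{3x+2} = (x^2 + 1)e^{3x+2}$$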
* i) $\int x^2\log^2(x)dx$
Solution:
This also works by repeated integration by parts. We take $f(x) = \log^2(x)$, $g'(x) = x^2$:
$$\int x^2\log^2(x)dx = \frac{1}{3}x^3\log^2(x) - \int \frac{1}{3}x^3 \cdot 2\log(x)\frac{1}{x}dx = \frac{1}{3}x^3\log^2(x) - \frac{2}{3}\int x^2\log(x)dx$$
$$= \frac{1}{3}x^3\log^2(x) - \frac{2}{3}\left[\frac{1}{3}x^3\log(x) - \int \frac{1}{3}x^3\frac{1}{x}dx\right] = x^3\left(\frac{1}{3}\log^2(x) - \frac{2}{9}\log(x) + \frac{2}{27}\right) + C$$
We check:
$$\frac{d}{dx}\left(x^3\left(\frac{1}{3}\log^2(x) - \frac{2}{9}\log(x) + \frac{2}{27}\right)\right) = 3x^2\left(\frac{1}{3}\log^2(x) - \frac{2}{9}\log(x) + \frac{2}{27}\right) + x^3\left(\frac{2}{3}\log(x)\frac{1}{x} - \frac{2}{9x}\right) = x^2\log^2(x)$$
Quite a relief, to be honest.
* j) $\int_2^7 \frac{x}{1-\sqrt{x+2}}\,dx$
Solution:
This is an example where we apply our familiar substitution, but we also have to take into account the limits of integration. We take the substitution $y^2 = x + 2 \Rightarrow 2y\,dy = dx$.
For the limits we get: $x = 2 \Rightarrow y = 2$, $x = 7 \Rightarrow y = 3$, so:
$$\int_2^7 \frac{x}{1-\sqrt{2+x}}\,dx = \int_2^3 \frac{y^2-2}{1-y}\,2y\,dy = \int_2^3 \frac{2y^3-4y}{1-y}\,dy$$
We can now apply long division of this polynomial (if you don't know what this is, you can just believe me). Working it out, we get:
$$\int_2^3 \frac{2y^3-4y}{1-y}\,dy = \int_2^3\left(-2y^2 - 2y + 2 + \frac{2}{y-1}\right)dy = -2\int_2^3 y^2\,dy - 2\int_2^3 y\,dy + 2\int_2^3 dy + 2\int_2^3 \frac{1}{y-1}\,dy$$
$$= 2\left[-\frac{1}{3}y^3 - \frac{1}{2}y^2 + y + \log(y-1)\right]_{y=2}^{3} = 2\log(2) - \frac{47}{3}$$
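A definite integral like the one in j) is easy to sanity-check numerically. The sketch below assumes the integrand is read as $x/(1-\sqrt{x+2})$ on $[2, 7]$ and uses a composite Simpson rule; the helper `simpson` and the number of subintervals are illustrative choices.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def integrand(x):
    return x / (1 - math.sqrt(x + 2))

numeric = simpson(integrand, 2.0, 7.0)
closed_form = 2 * math.log(2) - 47 / 3   # value obtained via y^2 = x + 2
print(numeric, closed_form)
```

If the two numbers differ, the most likely culprits are the transformed limits or a sign slip in the long division.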
k) $\int x^2 (x-3)^{12}\,dx$
Solution:
This is just a nice trick. We could manually compute this beast by just expanding the expression raised to the power 12. That would be a lot of work. By applying the substitution $y = x - 3$, $dy = dx$, we get:
$$\int x^2 (x-3)^{12}\,dx = \int (y+3)^2\,y^{12}\,dy$$
This is a lot easier to expand!
$$\int (y+3)^2\,y^{12}\,dy = \int (y^2 + 6y + 9)\,y^{12}\,dy = \int \left(y^{14} + 6y^{13} + 9y^{12}\right)dy$$
$$= \frac{1}{15}y^{15} + \frac{6}{14}y^{14} + \frac{9}{13}y^{13} + C = \frac{1}{15}(x-3)^{15} + \frac{6}{14}(x-3)^{14} + \frac{9}{13}(x-3)^{13} + C$$
Some additional integral exercises:

l) Apply the substitution method to solve
$$\int \frac{2x+4}{x^2+4x-8}\,dx$$
Substitution method:
$$\int f(g(x))\,g'(x)\,dx = \int f(y)\,dy \quad\text{with } y = g(x)$$
Usually, I take one step in between, to have a better understanding. It is as follows:
$$\int f(g(x))\,g'(x)\,dx = \int f(g(x))\,dg(x) = \int f(y)\,dy$$
Solution:
Let's take $y = x^2 + 4x - 8$, $dy = (2x+4)\,dx$:
$$\int \frac{2x+4}{x^2+4x-8}\,dx = \int \frac{1}{y}\,dy = \ln(y) + C \quad\text{for } y > 0.$$
Thus the solution: $\ln(x^2+4x-8) + C$ for $x^2 + 4x - 8 > 0$
m) Apply the substitution method to solve
$$\int 6x(x+1)^4\,dx$$
Solution:
$y = x + 1$, so that $x = y - 1$ and $dx = dy$:
$$\int 6x(x+1)^4\,dx = \int 6(y-1)\,y^4\,dy = \int (6y^5 - 6y^4)\,dy = y^6 - \frac{6}{5}y^5 + C = (x+1)^6 - \frac{6}{5}(x+1)^5 + C$$
n) Apply the substitution method to solve
$$\int x e^{cx^2}\,dx$$
Solution 1:
$$\int x e^{cx^2}\,dx = \frac{1}{2}\int e^{cx^2}\,d(x^2) = \frac{1}{2c}\int e^{cx^2}\,d(cx^2) = \frac{1}{2c}\int e^{t}\,dt = \frac{e^{t}}{2c} + C = \frac{e^{cx^2}}{2c} + C$$
Alternative solution 2:
Substitution: $t = cx^2$ and $\frac{dt}{dx} = 2cx$, or $dt = 2cx\,dx$:
$$\int x e^{cx^2}\,dx = \frac{1}{2c}\int e^{t}\,dt = \frac{e^{t}}{2c} + C = \frac{e^{cx^2}}{2c} + C$$
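o) Apply the substitution method to solve
$$\int_0^2 x e^{x^2}\,dx$$
Solution:
Let's take $t = x^2$, $dt = 2x\,dx$. For $x = 0$, we have $t = 0$. For $x = 2$, we have $t = 4$.
$$\int_0^2 x e^{x^2}\,dx = \frac{1}{2}\int_0^4 e^{t}\,dt = \frac{1}{2}\left(e^4 - e^0\right) = \frac{1}{2}\left(e^4 - 1\right)$$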
p) Apply the substitution method to solve
$$\int_1^e \frac{1+\ln x}{x}\,dx$$
Solution:
$$\int_1^e \frac{1+\ln x}{x}\,dx = \int_1^e (1+\ln(x))\,d\ln(x) = \int_1^e (1+\ln(x))\,d(1+\ln(x))$$
For the substitution method, we also need to change the limits of integration.
We will apply $t = 1 + \ln(x)$.
Lower bound of integral: $x = 1$, so that $t = 1 + \ln(1) = 1 + 0 = 1$
Upper bound of integral: $x = e$, so that $t = 1 + \ln(e) = 1 + 1 = 2$
$$\int_1^e (1+\ln(x))\,d(1+\ln(x)) = \int_1^2 t\,dt = \left[\frac{1}{2}t^2\right]_{t=1}^{2} = \frac{1}{2}(4-1) = \frac{3}{2}$$
The same computation in the usual substitution notation: take $t = 1 + \ln(x)$, $dt = \frac{1}{x}\,dx$, with limits of integration $t = 1$ (for $x = 1$) and $t = 2$ (for $x = e$):
$$\int_1^e \frac{1+\ln x}{x}\,dx = \int_1^2 t\,dt = \left[\frac{1}{2}t^2\right]_{t=1}^{2} = \frac{1}{2}(4-1) = \frac{3}{2}$$
q) Apply the method of integration by parts to solve
$$\int 4x e^{2x}\,dx$$
Solution:
$$\int 4x e^{2x}\,dx = \int 2x\,d(e^{2x}) = 2x e^{2x} - 2\int e^{2x}\,dx = 2x e^{2x} - e^{2x} + C$$
r) Apply the method of integration by parts to solve
$$\int \frac{2x}{(x-8)^3}\,dx$$
Solution:
$$\int \frac{2x}{(x-8)^3}\,dx = -\int x\,d(x-8)^{-2} = -x(x-8)^{-2} + \int (x-8)^{-2}\,dx = -x(x-8)^{-2} - (x-8)^{-1} + C$$
Check:
$$\frac{d}{dx}\left(-x(x-8)^{-2} - (x-8)^{-1} + C\right) = -(x-8)^{-2} + 2x(x-8)^{-3} + (x-8)^{-2} = 2x(x-8)^{-3}$$
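s) Apply the method of integration by parts to solve
$$\int_0^{10} (1+0.4t)\,e^{-0.05t}\,dt$$
Solution:
$$\int_0^{10} (1+0.4t)\,e^{-0.05t}\,dt = -\frac{1}{0.05}\int_0^{10} (1+0.4t)\,d e^{-0.05t} = -\frac{1}{0.05}\Big[(1+0.4t)\,e^{-0.05t}\Big]_{t=0}^{10} + \frac{1}{0.05}\int_0^{10} e^{-0.05t}\,d(1+0.4t)$$
$$= -100e^{-0.5} + 20 + 20\cdot 0.4\int_0^{10} e^{-0.05t}\,dt = -100e^{-0.5} + 20 + 8\left[-\frac{1}{0.05}\,e^{-0.05t}\right]_{t=0}^{10}$$
$$= -100e^{-0.5} + 20 - 160\left(e^{-0.5} - 1\right) \approx 22.3$$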
t) Apply both the method of integration by parts and the substitution method to solve
$$\int 2x\ln(x^2+b^2)\,dx$$
Solution:
$$\int 2x\ln(x^2+b^2)\,dx = \int \ln(x^2+b^2)\,d(x^2+b^2) = (x^2+b^2)\ln(x^2+b^2) - \int (x^2+b^2)\,d\ln(x^2+b^2)$$
With $t = x^2 + b^2$ in the remaining integral:
$$= (x^2+b^2)\ln(x^2+b^2) - \int t\,d\ln(t) = (x^2+b^2)\ln(x^2+b^2) - \int \frac{t}{t}\,dt = (x^2+b^2)\ln(x^2+b^2) - t + C$$
$$= (x^2+b^2)\ln(x^2+b^2) - (x^2+b^2) + C$$

Broad tutorial week 6 (Friday)
* Exercise 1.
Two individuals have the following utility function:
$$U_i(x_1, x_2, y_i) = \log(x_i) + \log(y_i) - x_1 - x_2,$$
subject to $x_i + y_i \le 10$. Find their utility-maximizing consumption pattern. Next suppose that a king wishes to maximize the sum of their utilities. Find the consumption that he would ordain.
Solution:
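Individual 1 simply maximizes:
$$L_1(x_1, y_1) = \log(x_1) + \log(y_1) - x_1 - x_2 - \lambda(x_1 + y_1 - 10)$$
$$\frac{\partial L_1}{\partial x_1} = \frac{1}{x_1} - 1 - \lambda = 0$$
$$\frac{\partial L_1}{\partial y_1} = \frac{1}{y_1} - \lambda = 0 \;\Rightarrow\; \lambda = \frac{1}{y_1}$$
$$\frac{1}{x_1} - 1 = \frac{1}{y_1} \;\Rightarrow\; \frac{1 - x_1}{x_1} = \frac{1}{y_1} \;\Rightarrow\; y_1 = \frac{x_1}{1 - x_1}$$
$$x_1 + y_1 = x_1 + \frac{x_1}{1 - x_1} = \frac{x_1(1 - x_1) + x_1}{1 - x_1} = \frac{x_1(2 - x_1)}{1 - x_1} = 10$$
$$-x_1^2 + 2x_1 = 10 - 10x_1 \;\Rightarrow\; -x_1^2 + 12x_1 - 10 = 0$$
$$x_1 = \frac{12 \pm \sqrt{144 - 40}}{2} = 6 \pm \tfrac{1}{2}\sqrt{104}$$
$$y_1 = 10 - x_1 = 4 \mp \tfrac{1}{2}\sqrt{104}$$
Since y cannot be negative, we are left with the single possibility:
$$x_1 = 6 - \tfrac{1}{2}\sqrt{104} \approx 0.9,\qquad y_1 = 4 + \tfrac{1}{2}\sqrt{104} \approx 9.1$$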
Individual 2 maximizes the exact same function, so he chooses the same consumption levels.
However, neither individual takes into account the fact that his consumption of x leads to a
lower utility for his neighbour. Each regards the consumption of his neighbour as a given and
neither cares directly about the utility of his neighbour.
As you are probably aware this situation is not Pareto-optimal; there exists a consumption
pattern that leaves both individuals better off. One way to see this is by introducing a
benevolent dictator. He will optimize total welfare, which we take here to be simply the sums
of the individual utilities, by choosing the consumption levels of the individuals for them,
while respecting their budget constraints (so without redistributing). We omit the possibility
of redistribution because we are interested in a Pareto-improvement: everybody must be made
better off, so we don't take anything away from anybody.
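This means that the dictator faces two constraints. His Lagrangian looks as follows:
$$L_D(x_1, x_2, y_1, y_2) = \log(x_1) + \log(y_1) + \log(x_2) + \log(y_2) - 2x_1 - 2x_2 - \lambda_1(x_1 + y_1 - 10) - \lambda_2(x_2 + y_2 - 10)$$
This gives first-order conditions:
$$\frac{\partial L_D(x_1, x_2, y_1, y_2)}{\partial x_1} = \frac{1}{x_1} - 2 - \lambda_1 = 0$$
$$\frac{\partial L_D(x_1, x_2, y_1, y_2)}{\partial y_1} = \frac{1}{y_1} - \lambda_1 = 0$$
$$\frac{\partial L_D(x_1, x_2, y_1, y_2)}{\partial x_2} = \frac{1}{x_2} - 2 - \lambda_2 = 0$$
$$\frac{\partial L_D(x_1, x_2, y_1, y_2)}{\partial y_2} = \frac{1}{y_2} - \lambda_2 = 0$$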
Note that we get two completely symmetric sets of two equations: the first two and the last
two. We can just solve one of these; the other will be exactly the same.
Furthermore, the steps of the solution are almost the same as in the individual case, only a 1 is
changed to a 2 (because of the internalization of the effects of consumption). We race through
it:
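$$\frac{\partial L_D}{\partial x_1} = \frac{1}{x_1} - 2 - \lambda_1 = 0$$
$$\frac{\partial L_D}{\partial y_1} = \frac{1}{y_1} - \lambda_1 = 0 \;\Rightarrow\; \lambda_1 = \frac{1}{y_1}$$
$$\frac{1}{x_1} - 2 = \frac{1}{y_1} \;\Rightarrow\; \frac{1 - 2x_1}{x_1} = \frac{1}{y_1} \;\Rightarrow\; y_1 = \frac{x_1}{1 - 2x_1}$$
$$x_1 + y_1 = x_1 + \frac{x_1}{1 - 2x_1} = \frac{x_1(1 - 2x_1) + x_1}{1 - 2x_1} = \frac{x_1(2 - 2x_1)}{1 - 2x_1} = 10$$
$$-2x_1^2 + 2x_1 = 10 - 20x_1 \;\Rightarrow\; -2x_1^2 + 22x_1 - 10 = 0$$
$$x_1 = \frac{22 \pm \sqrt{484 - 80}}{4} = \frac{22 \pm \sqrt{404}}{4}$$
$$y_1 = 10 - x_1 = \frac{18 \mp \sqrt{404}}{4}$$
Once again, because y cannot be negative, we are left with a single possibility:
$$x_1 = \frac{22 - \sqrt{404}}{4} \approx 0.48,\qquad y_1 = \frac{18 + \sqrt{404}}{4} \approx 9.52$$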
So we find that, as expected, consumption of x is reduced, because the externality is now
taken into account. Finally, we check that utility is indeed higher now.
Utility before was:
$$U_i(x_1, x_2, y_i) = \log(x_i) + \log(y_i) - x_1 - x_2 \approx \log(0.9) + \log(9.1) - 0.9 - 0.9 \approx 0.3$$
Utility afterwards is:
$$U_i(x_1, x_2, y_i) = \log(x_i) + \log(y_i) - x_1 - x_2 \approx \log(0.48) + \log(9.52) - 0.48 - 0.48 \approx 0.56$$
So our individuals are better off.
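The comparison between the individual choices and the king's choice can be reproduced with a few lines of code. This is a minimal sketch under the assumptions of the exercise (logarithmic utility, a budget of 10, and a marginal cost of x equal to 1 for the individual and 2 for the king); the function names are hypothetical.

```python
import math

def smaller_root(marginal_cost, budget=10.0):
    """Smaller root of m*x^2 - (m*budget + 2)*x + budget = 0.

    The quadratic follows from 1/x - m = 1/y together with x + y = budget.
    The larger root would make y negative, so it is discarded.
    """
    m = marginal_cost
    b = m * budget + 2
    disc = b * b - 4 * m * budget
    return (b - math.sqrt(disc)) / (2 * m)

def utility(x_own, x_other, y_own):
    """log(x_i) + log(y_i) - x_1 - x_2 for one individual."""
    return math.log(x_own) + math.log(y_own) - x_own - x_other

x_ind = smaller_root(1.0)    # each individual ignores the externality
x_king = smaller_root(2.0)   # the king counts the effect on both utilities
for label, x in (("individual", x_ind), ("king", x_king)):
    y = 10.0 - x
    print(f"{label}: x = {x:.3f}, y = {y:.3f}, utility = {utility(x, x, y):.3f}")
```

Because both individuals are symmetric, the same x is used for the own and the neighbour's consumption when evaluating utility.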

Exercise 2.
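Maximize $f(x) = -x^2 + \alpha x + 1$. Denote the solution by $x^*$ and the value function, or indirect objective function, by $f^*(\alpha) = f(x^*)$. Find $\frac{\partial f^*}{\partial \alpha}$ by direct computation and by the envelope theorem.
Solution:
We check the first-order condition:
$$\left.\frac{df}{dx}\right|_{x = x^*} = -2x^* + \alpha = 0 \;\Rightarrow\; x^* = \frac{\alpha}{2}$$
$$f^*(\alpha) = -x^{*2} + \alpha x^* + 1 = -\frac{\alpha^2}{4} + \frac{\alpha^2}{2} + 1 = \frac{\alpha^2}{4} + 1$$
We now compute $\frac{\partial f^*}{\partial \alpha}$ directly:
$$\frac{\partial f^*}{\partial \alpha} = \frac{\partial}{\partial \alpha}\left(\frac{\alpha^2}{4} + 1\right) = \frac{\alpha}{2}$$
Instead we could have used the envelope theorem, which says:
$$\frac{\partial f^*}{\partial \alpha} = \left.\frac{\partial f}{\partial \alpha}\right|_{x = x^*} = \left.\frac{\partial}{\partial \alpha}\left[-x^2 + \alpha x + 1\right]\right|_{x = x^*} = x\big|_{x = x^*} = \frac{\alpha}{2}$$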
The two methods give the same outcome. In this case, they are equally easy, but in general,
the envelope theorem saves you a lot of work, as we will see in the next exercise.
*Exercise 3.
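Maximize $f(x, y) = -x^2 - y^2 + \alpha yx - \beta x$. Find $\frac{\partial f^*}{\partial \beta}$ and $\frac{\partial f^*}{\partial \alpha}$ both by direct computation and by the envelope theorem.
Solution:
From the first-order conditions we obtain:
$$\frac{\partial f(x, y)}{\partial x} = -2x + \alpha y - \beta = 0$$
$$\frac{\partial f(x, y)}{\partial y} = -2y + \alpha x = 0 \;\Rightarrow\; y = \frac{\alpha x}{2}$$
$$-2x + \alpha y = -2x + \frac{\alpha^2}{2}x = \beta \;\Rightarrow\; x = \frac{2\beta}{\alpha^2 - 4},\qquad y = \frac{\alpha\beta}{\alpha^2 - 4}$$
$$f^*(\alpha, \beta) = -\frac{4\beta^2}{(\alpha^2 - 4)^2} - \frac{\alpha^2\beta^2}{(\alpha^2 - 4)^2} + \frac{2\alpha^2\beta^2}{(\alpha^2 - 4)^2} - \frac{2\beta^2}{\alpha^2 - 4} = -\frac{\beta^2}{\alpha^2 - 4}$$
We can now calculate $\frac{\partial f^*}{\partial \beta}$ directly:
$$\frac{\partial f^*}{\partial \beta} = -\frac{2\beta}{\alpha^2 - 4}$$
Let's compare this to the envelope theorem:
$$\frac{\partial f^*}{\partial \beta} = \left.\frac{\partial f}{\partial \beta}\right|_{x = x^*,\, y = y^*} = \left.\frac{\partial}{\partial \beta}\left(-x^2 - y^2 + \alpha yx - \beta x\right)\right|_{x = x^*,\, y = y^*} = -x^* = -\frac{2\beta}{\alpha^2 - 4}$$
The results are the same. Please notice that for the envelope theorem, we did not have to calculate $f^*$ at all. That saves a lot of time.
Let's do $\frac{\partial f^*}{\partial \alpha}$:
$$\frac{\partial f^*}{\partial \alpha} = \frac{\partial}{\partial \alpha}\left(-\frac{\beta^2}{\alpha^2 - 4}\right) = \frac{2\alpha\beta^2}{(\alpha^2 - 4)^2}$$
With the envelope theorem (dropping the cumbersome conditions $x = x^*$ etc. for ease of writing):
$$\frac{\partial f^*}{\partial \alpha} = \frac{\partial f}{\partial \alpha} = \frac{\partial}{\partial \alpha}\left(-x^2 - y^2 + \alpha yx - \beta x\right) = y^* x^* = \frac{2\alpha\beta^2}{(\alpha^2 - 4)^2}$$
Hurrah, they are the same!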
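The envelope theorem can also be checked numerically with finite differences. The sketch below assumes the objective is $f(x, y) = -x^2 - y^2 + \alpha yx - \beta x$, with the parameter names $\alpha$ and $\beta$ chosen here for illustration, and with $|\alpha| < 2$ so that the stationary point is a maximum; the sample values are arbitrary.

```python
def optimum(alpha, beta):
    """Stationary point of f from the two first-order conditions."""
    d = alpha ** 2 - 4
    return 2 * beta / d, alpha * beta / d

def value(alpha, beta):
    """Value function f*(alpha, beta) = f(x*, y*)."""
    x, y = optimum(alpha, beta)
    return -x * x - y * y + alpha * x * y - beta * x

alpha, beta, h = 1.0, 3.0, 1e-6
x_star, y_star = optimum(alpha, beta)

# Direct computation: differentiate the value function numerically.
d_beta = (value(alpha, beta + h) - value(alpha, beta - h)) / (2 * h)
d_alpha = (value(alpha + h, beta) - value(alpha - h, beta)) / (2 * h)

# Envelope theorem: partial derivatives of f at the optimum, x and y held fixed.
print(d_beta, -x_star)
print(d_alpha, x_star * y_star)
```

Each printed pair should agree up to the accuracy of the finite difference.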
Exercise 4.
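Maximize $f(x) = \alpha x^2$ subject to $x + \alpha = 10$. Find $\frac{\partial f^*}{\partial \alpha}$ by direct computation and by the envelope theorem.
Solution:
Clearly this problem is a little silly if we view it as a maximization problem: the constraint already fixes x: $x = 10 - \alpha$. Still, for the computation of $\frac{\partial f^*}{\partial \alpha}$ it will be worthwhile to write this down with a Lagrangian:
$$L(x) = \alpha x^2 - \lambda(x + \alpha - 10)$$
This gives us the first-order condition:
$$\frac{\partial L}{\partial x} = 2\alpha x - \lambda = 0 \;\Rightarrow\; \lambda = 2\alpha x$$
From the constraint we got $x = 10 - \alpha$, so:
$$f^*(\alpha) = \alpha(10 - \alpha)^2$$
From this we obtain directly:
$$\frac{\partial f^*}{\partial \alpha} = (10 - \alpha)^2 - 2\alpha(10 - \alpha)$$
We could also have gotten this by the envelope theorem, which for a constrained optimization problem is as follows:
$$\frac{\partial f^*}{\partial \alpha} = \left.\frac{\partial L}{\partial \alpha}\right|_{x = x^*} = x^{*2} - \lambda = (10 - \alpha)^2 - 2\alpha x^* = (10 - \alpha)^2 - 2\alpha(10 - \alpha)$$
The outcome is once more the same.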

* Exercise 5.
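Maximize $f(x, y) = x^{\alpha} y^{1-\alpha}$ subject to $x + py = I$. Find $\frac{\partial f^*}{\partial p}$ both by direct computation and by the envelope theorem.
Solution:
This optimization of a Cobb-Douglas function should look familiar by now:
$$L(x, y) = x^{\alpha} y^{1-\alpha} - \lambda(x + py - I)$$
$$\frac{\partial L}{\partial x} = \alpha x^{\alpha-1} y^{1-\alpha} - \lambda = 0$$
$$\frac{\partial L}{\partial y} = (1-\alpha)\,x^{\alpha} y^{-\alpha} - \lambda p = 0$$
$$\lambda = \alpha\,(x^*)^{\alpha-1}(y^*)^{1-\alpha}$$
We divide the two first-order conditions to get:
$$\frac{\alpha\,y}{(1-\alpha)\,x} = \frac{1}{p} \;\Rightarrow\; x = \frac{\alpha\,py}{1-\alpha}$$
$$\frac{\alpha\,py}{1-\alpha} + py = I \;\Rightarrow\; \frac{py}{1-\alpha} = I \;\Rightarrow\; y = \frac{(1-\alpha)I}{p},\qquad x = \alpha I$$
For the direct computation we have to plug this into f(x,y), to get something fiercely ugly:
$$f^*(p) = (\alpha I)^{\alpha}\left(\frac{(1-\alpha)I}{p}\right)^{1-\alpha} = \alpha^{\alpha}(1-\alpha)^{1-\alpha}\,p^{\alpha-1}\,I$$
This yields:
$$\frac{\partial f^*}{\partial p} = (\alpha-1)\,\alpha^{\alpha}\,p^{\alpha-2}(1-\alpha)^{1-\alpha}\,I = -\alpha^{\alpha}\,p^{\alpha-2}(1-\alpha)^{2-\alpha}\,I$$
Let's see what the envelope theorem gives us.
$$\frac{\partial f^*}{\partial p} = \left.\frac{\partial L}{\partial p}\right|_{x = x^*,\, y = y^*} = -\lambda y^* = -\alpha\,(x^*)^{\alpha-1}(y^*)^{1-\alpha}\,y^* = -\alpha\,(x^*)^{\alpha-1}(y^*)^{2-\alpha}$$
$$= -\alpha\,(\alpha I)^{\alpha-1}\left(\frac{(1-\alpha)I}{p}\right)^{2-\alpha} = -\alpha^{\alpha}\,p^{\alpha-2}(1-\alpha)^{2-\alpha}\,I$$
Once more, they are the same.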
* Exercise 6.
A firm has the profit function $\pi(p, A) = (p - c)\,Q(p, A) - \alpha A$, where p is price, Q demand, A the amount of advertising, c the constant unit cost of production and $\alpha$ the cost of advertising. Find the effect of $\alpha$ on pricing and on profits.
Solution:
This is now easy. We just apply the envelope theorem:
$$\frac{\partial \pi^*}{\partial \alpha} = \left.\frac{\partial \pi}{\partial \alpha}\right|_{p = p^*,\, A = A^*} = -A^*$$
So a marginal increase in advertising costs decreases your profits by the (optimal) initial amount of advertising. At the margin, we don't have to take into account the fact that you will also change your amount of advertising.
Since the problem is posed so abstractly, this is all we can really say; we cannot derive any meaningful result on $A^*$.

WEEK 6
Some additional exercises of week 6
Exercise 1.
a) Maximize $x^2 - 3x + 4$, s.t. $-2 \le x \le 4$.
Solution:
$$L(x) = x^2 - 3x + 4 - \lambda_1(x - 4) - \lambda_2(-x - 2)$$
There are three possibilities: either no constraint is binding, or the first is, or the second is (strictly speaking, there is a fourth case, where both constraints bind, but that is not possible here, as x=-2 and x=4 cannot hold at the same time).
Possibility 1, both constraints non-binding:
Kuhn-Tucker says that the multiplier ($\lambda$) associated with a non-binding constraint should be 0. That means that in this case the Lagrangian boils down to the ordinary function:
$$L(x) = x^2 - 3x + 4 - \lambda_1(x - 4) - \lambda_2(-x - 2) = x^2 - 3x + 4 = f(x)$$
We have an ordinary, unconstrained optimization problem.
$$\frac{dL}{dx} = \frac{df}{dx} = 2x - 3 = 0 \;\Rightarrow\; x = \frac{3}{2}$$
We see that this candidate maximum satisfies our constraints, so all we have to do is check the second-order conditions:
$\frac{d^2 f}{dx^2} = 2 > 0$, so this is in fact a minimum, not a maximum.
Possibility 2, constraint 1 binding, so x=4:
$$L(x) = x^2 - 3x + 4 - \lambda_1(x - 4)$$
The Kuhn-Tucker conditions now say that $\lambda_1$ should be positive here for a maximum.
$$\frac{dL}{dx} = 2x - 3 - \lambda_1 = 5 - \lambda_1 = 0 \;\Rightarrow\; \lambda_1 = 5 > 0$$
So this is a maximum. We also compute the value of f(4):
$$f(4) = 4^2 - 3\cdot 4 + 4 = 8$$
Possibility 3, last constraint binding, x=-2:
$$L(x) = x^2 - 3x + 4 - \lambda_2(-x - 2)$$
$$\frac{dL}{dx} = 2x - 3 + \lambda_2 = 2\cdot(-2) - 3 + \lambda_2 = 0 \;\Rightarrow\; \lambda_2 = 7 > 0$$
Again our multiplier is positive, so this is another local maximizer. To compare it to our other maximizer, x=4, we compute f(-2):
$$f(-2) = (-2)^2 - 3\cdot(-2) + 4 = 14$$
We see that f takes a larger value at -2 than at 4, so -2 is our global maximizer.
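The case-by-case Kuhn-Tucker reasoning can be mirrored in a short script. This is a minimal sketch for this particular one-dimensional problem, assuming the constraints are $-2 \le x \le 4$; it is not a general solver, and the variable names are illustrative.

```python
def f(x):
    return x * x - 3 * x + 4

def df(x):
    return 2 * x - 3

LOWER, UPPER = -2.0, 4.0

# Possibility 1: no constraint binding. df(x) = 0 gives x = 1.5,
# but f'' = 2 > 0 there, so it is a minimum and is not a candidate.
interior = 1.5

# Possibility 2: x = 4 binding, from 2x - 3 - lambda_1 = 0.
lambda_1 = df(UPPER)
# Possibility 3: x = -2 binding, from 2x - 3 + lambda_2 = 0.
lambda_2 = -df(LOWER)

# A boundary point is a local maximizer only if its multiplier is positive.
boundary_maxima = [x for x, lam in ((UPPER, lambda_1), (LOWER, lambda_2)) if lam > 0]
best = max(boundary_maxima, key=f)
print(lambda_1, lambda_2, best, f(best))
```

Comparing the objective at the surviving candidates is the last step, exactly as in the written solution.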
Exercise 2.
Maximize $-x^2 + 7x + 2$, s.t. $-2 \le x \le 4$
Solution:
This is very similar to the above, $L(x) = -x^2 + 7x + 2 - \lambda_1(x - 4) - \lambda_2(-x - 2)$; we check the three cases:
Possibility 1, no constraints binding:
$$L(x) = f(x) = -x^2 + 7x + 2$$
$$\frac{dL}{dx} = \frac{df}{dx} = -2x + 7 = 0 \;\Rightarrow\; x = \frac{7}{2}$$
We see that x satisfies our constraints, so we just check the second-order conditions:
$\frac{d^2 f}{dx^2} = -2 < 0$, so this is a maximum.
Possibility 2, first constraint binding, x=4:
$$L(x) = -x^2 + 7x + 2 - \lambda_1(x - 4)$$
$$\frac{dL}{dx} = -2x + 7 - \lambda_1 = -2\cdot 4 + 7 - \lambda_1 = 0 \;\Rightarrow\; \lambda_1 = -1 < 0$$
So this is not a maximum, but a minimum.
Possibility 3, second constraint binding, x=-2:
$$L(x) = -x^2 + 7x + 2 - \lambda_2(-x - 2)$$
$$\frac{dL}{dx} = -2x + 7 + \lambda_2 = -2\cdot(-2) + 7 + \lambda_2 = 0 \;\Rightarrow\; \lambda_2 = -11 < 0$$
This is also a minimum. So the maximum is the unconstrained maximum: $x = \frac{7}{2}$.

Week 7 Constrained optimization and Integral calculus
Integral calculus
Further issues of integration: see lecture slides
Differential calculus
Solution concept (solution is a function): K.14.1.
Monotonic, oscillatory (convergence, divergence): K.14.1.
Steady state: K.14.1.
Phase diagram: K.14.1.
Stability: K.14.1.

Integral Calculus
Remember from week 6 that we may calculate the area under a function. However, the function may not be well defined everywhere. E.g. take the following function, which is only defined for $x \in (0, \infty)$:
$$f(x) = \frac{1}{\sqrt{x}}$$
The antiderivative is $F(x) = 2\sqrt{x}$.
Example 1:
$$f(x) = \frac{1}{\sqrt{x}}$$
Note that f(x) has a vertical asymptote at x=0. Yet, we can compute the following area:
$$\int_0^{16} \frac{1}{\sqrt{x}}\,dx = \Big[2\sqrt{x}\Big]_{x=0}^{16} = 8 - \lim_{x \to 0^+} 2\sqrt{x} = 8 - 0 = 8$$
Example 2:
$$f(x) = \frac{1}{\sqrt{x}}$$
We can calculate an improper integral, in which the lower limit approaches negative infinity or the upper limit approaches positive infinity. The following integral has no finite limit. It diverges.
$$\int_0^{\infty} \frac{1}{\sqrt{x}}\,dx = \Big[2\sqrt{x}\Big]_{x=0}^{\infty} = \lim_{x \to \infty} 2\sqrt{x} - \lim_{x \to 0^+} 2\sqrt{x} = \infty - 0 = \infty$$

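Example 3:
$$f(x) = \frac{1}{x\sqrt{x}}$$
The integral does not exist:
$$\int_0^{16} \frac{1}{x\sqrt{x}}\,dx = \left[-\frac{2}{\sqrt{x}}\right]_{x=0}^{16} = -\frac{2}{\sqrt{16}} + \lim_{x \to 0^+} \frac{2}{\sqrt{x}} = \infty$$
Example 4:
$$f(x) = \frac{1}{x\sqrt{x}}$$
The integral does not exist:
$$\int_0^{\infty} \frac{1}{x\sqrt{x}}\,dx = \left[-\frac{2}{\sqrt{x}}\right]_{x=0}^{\infty} = -\lim_{x \to \infty} \frac{2}{\sqrt{x}} + \lim_{x \to 0^+} \frac{2}{\sqrt{x}} = -0 + \infty = \infty$$
Example 5:
$$f(x) = \frac{1}{x\sqrt{x}}$$
The integral does exist:
$$\int_1^{\infty} \frac{1}{x\sqrt{x}}\,dx = \left[-\frac{2}{\sqrt{x}}\right]_{x=1}^{\infty} = -\lim_{x \to \infty} \frac{2}{\sqrt{x}} + \frac{2}{\sqrt{1}} = 2$$
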
Example 6: exponential distribution
Consider the following function:
$$f(x) = \lambda e^{-\lambda x}$$
The integral does exist and the area is exactly equal to one:
$$\int_0^{\infty} \lambda e^{-\lambda t}\,dt = \Big[-e^{-\lambda t}\Big]_{t=0}^{\infty} = -\lim_{t \to \infty} e^{-\lambda t} - (-1) = 0 + 1 = 1$$
t may be considered as a random variable, which is the duration of a certain event. It can be shown that $\frac{1}{\lambda}$ is the expected duration of t.
Example 7: exponential distribution
For an exponential distribution of
$$f(x) = 2e^{-2x}$$
(expected value is half a day) what is the probability of having a duration longer than one day?
$$\int_1^{\infty} 2e^{-2t}\,dt = \Big[-e^{-2t}\Big]_{t=1}^{\infty} = -\lim_{t \to \infty} e^{-2t} - (-e^{-2}) = 0 + e^{-2} \approx 0.135$$
Thus, the probability is 13.5 percent.

Integration and differentiation
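The continuous function $f(x; \alpha)$ is a function of x, but it also depends on the parameter $\alpha$.
If the range of integration does not depend on $\alpha$, integration and differentiation are interchangeable:
$$\frac{\partial}{\partial \alpha}\int_a^b f(x; \alpha)\,dx = \int_a^b \frac{\partial f(x; \alpha)}{\partial \alpha}\,dx$$
Example 8:
Take the integral, which depends on the unknown $\alpha$:
$$\int_1^4 (3t^2 - \alpha t)\,dt = \left[t^3 - \frac{1}{2}\alpha t^2\right]_{t=1}^{4} = (64 - 8\alpha) - (1 - 0.5\alpha) = 63 - 7.5\alpha$$
$$\frac{\partial}{\partial \alpha}\int_1^4 (3t^2 - \alpha t)\,dt = \frac{\partial}{\partial \alpha}(63 - 7.5\alpha) = -7.5$$
$$\int_1^4 \frac{\partial}{\partial \alpha}(3t^2 - \alpha t)\,dt = -\int_1^4 t\,dt = \left[-\frac{1}{2}t^2\right]_{t=1}^{4} = -8 + 0.5 = -7.5$$
Thus,
$$\frac{\partial}{\partial \alpha}\int_1^4 (3t^2 - \alpha t)\,dt = \int_1^4 \frac{\partial}{\partial \alpha}(3t^2 - \alpha t)\,dt$$
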

Integration and differentiation
Fundamental theorem of calculus
Let $f(x)$ be a function that is integrable on $[a, x]$ for each x in $[a, b]$. Let c be such that $a \le c \le b$. Define $F(x)$ as follows:
$$F(x) = \int_c^x f(t)\,dt,$$
which is a function of the upper limit of the integral.
The derivative of $F(x)$ becomes: $F'(x) = f(x)$

Example 9:
$$F(x) = \int_1^x t^2\,dt = \frac{1}{3}x^3 - \frac{1}{3},\qquad F'(x) = x^2$$
Example 10:
A statistical probability density function is $f(x)$. The cumulative function is $F(x) = \int_{-\infty}^{x} f(t)\,dt$.
We have the following features:
$$\int_{-\infty}^{\infty} f(t)\,dt = 1,\qquad F(-\infty) = 0,\qquad F(\infty) = 1$$
We can derive the density function from the cumulative function:
$$F'(x) = f(x)$$

Multiple integrals
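We can also compute multiple integrals, integrating with respect to one variable at a time while keeping the other variable constant.
Example 11:
$$\int_0^1\!\!\int_1^2 (x^2 + xy)\,dy\,dx = \int_0^1\left(\int_1^2 (x^2 + xy)\,dy\right)dx = \int_0^1 \left[x^2 y + \frac{1}{2}xy^2\right]_{y=1}^{2} dx$$
$$= \int_0^1 \left((2x^2 + 2x) - \left(x^2 + \frac{1}{2}x\right)\right)dx = \int_0^1 \left(x^2 + \frac{3}{2}x\right)dx = \left[\frac{1}{3}x^3 + \frac{3}{4}x^2\right]_{x=0}^{1} = \frac{1}{3} + \frac{3}{4} = \frac{13}{12}$$
Theorem
Let f be a continuous function defined on the rectangle
$$R = [a, b] \times [c, d]$$
Then
$$\int_a^b\left(\int_c^d f(x, y)\,dy\right)dx = \int_c^d\left(\int_a^b f(x, y)\,dx\right)dy$$
Explanation:
On the left-hand side: we first integrate over $[c, d]$ with respect to y; next we integrate over $[a, b]$ with respect to x.
On the right-hand side: we first integrate over $[a, b]$ with respect to x; next we integrate over $[c, d]$ with respect to y.
Example 12:
$$\int_1^2\!\!\int_0^1 (x^2 + xy)\,dx\,dy = \int_1^2\left(\int_0^1 (x^2 + xy)\,dx\right)dy = \int_1^2 \left[\frac{1}{3}x^3 + \frac{1}{2}x^2 y\right]_{x=0}^{1} dy$$
$$= \int_1^2 \left(\frac{1}{3} + \frac{1}{2}y\right)dy = \left[\frac{1}{3}y + \frac{1}{4}y^2\right]_{y=1}^{2} = \left(\frac{2}{3} + \frac{4}{4}\right) - \left(\frac{1}{3} + \frac{1}{4}\right) = \frac{13}{12}$$
This outcome is equal to that of example 11.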
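The statement that the order of integration does not matter can be illustrated numerically by nesting a one-dimensional rule in both orders. The following is a minimal sketch for the integrand $x^2 + xy$ on $[0, 1] \times [1, 2]$; the helper `simpson` is an illustrative composite Simpson rule, which happens to be exact for polynomials of this degree.

```python
def simpson(func, a, b, n=20):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def f(x, y):
    return x * x + x * y

# First integrate over y in [1, 2], then over x in [0, 1].
dy_then_dx = simpson(lambda x: simpson(lambda y: f(x, y), 1.0, 2.0), 0.0, 1.0)
# First integrate over x in [0, 1], then over y in [1, 2].
dx_then_dy = simpson(lambda y: simpson(lambda x: f(x, y), 0.0, 1.0), 1.0, 2.0)

print(dy_then_dx, dx_then_dy)
```

Both orders should print the same number up to rounding, which is what the theorem asserts for continuous functions on a rectangle.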

Differential equations (Chapter 14)
In a differential equation, the unknown is a function, not a number
The equation contains one or more of the derivatives of the
function.
We consider ordinary differential equations: the unknown is a
function of only one variable
We focus on linear first-order differential equations.
Example 13
Autonomous equation (time does not appear explicitly).
The differential equation describes natural growth:
$$\frac{dx(t)}{dt} = a\,x(t)$$
Otherwise formulated: $\dot{x}(t) = a\,x(t)$ (the dot above x denotes the derivative with respect to time)
General solution: $x(t) = Ce^{at}$
Because:
$$\frac{dx(t)}{dt} = aCe^{at} = a\,x(t)$$
Definite solution: $x(0) = Ce^{a \cdot 0} = Ce^{0} = C$ (x(0) is the initial value)
The solution is stable if a < 0:
$$\lim_{t \to \infty} x(t) = x(0)\lim_{t \to \infty} e^{at} = 0$$
The solution is unstable if a > 0:
$$\lim_{t \to \infty} x(t) = x(0)\lim_{t \to \infty} e^{at} = \infty \quad(\text{if } x(0) > 0)$$

Example 14
Equation with a constant term. Next, we consider the general solution of
$$\frac{dx(t)}{dt} + a\,x(t) = b,\qquad a \ne 0$$
or
$$\dot{x}(t) + a\,x(t) = b$$
We can multiply this equation by $e^{at}$:
$$\frac{dx(t)}{dt}\,e^{at} + a\,x(t)\,e^{at} = b\,e^{at}$$
Which can be rewritten as:
$$\frac{d}{dt}\left(x(t)\,e^{at}\right) = b\,e^{at}$$
Thus:
$$x(t)\,e^{at} = \int b\,e^{at}\,dt = \frac{b}{a}\,e^{at} + C$$
Thus:
$$x(t) = \frac{b}{a} + Ce^{-at}$$
If $a > 0$, $x(t)$ converges to $\frac{b}{a}$ as $t \to \infty$.
Example 15
Find the general solution of
$$\frac{dx(t)}{dt} + 4x(t) = 12$$
Solution: $x(t) = 3 + Ce^{-4t}$
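A general solution such as $x(t) = 3 + Ce^{-4t}$ can be compared with a direct numerical integration of the differential equation. The sketch below uses a classical fourth-order Runge-Kutta scheme; the initial value `x0 = 5.0` (so that C = 2) and the step count are arbitrary illustrative choices.

```python
import math

def rk4(rhs, t0, x0, t_end, steps):
    """Classical fourth-order Runge-Kutta for dx/dt = rhs(t, x)."""
    h = (t_end - t0) / steps
    t, x = t0, x0
    for _ in range(steps):
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, x + h * k1 / 2)
        k3 = rhs(t + h / 2, x + h * k2 / 2)
        k4 = rhs(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

def rhs(t, x):
    # dx/dt + 4x = 12 rewritten as dx/dt = 12 - 4x
    return 12 - 4 * x

x0 = 5.0                                    # initial value, so C = x0 - 3
numeric = rk4(rhs, 0.0, x0, 1.0, 1000)
exact = 3 + (x0 - 3) * math.exp(-4 * 1.0)   # general solution with C = 2
print(numeric, exact)
```

Increasing the end time shows both values approaching the steady state b/a = 3.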

Example 16
Next, we consider the general solution of an equation that contains a time-varying term b(t):
$$\frac{dx(t)}{dt} + a\,x(t) = b(t),\qquad a \ne 0$$
or
$$\dot{x}(t) + a\,x(t) = b(t)$$
We can multiply this equation by $e^{at}$:
$$\frac{dx(t)}{dt}\,e^{at} + a\,x(t)\,e^{at} = b(t)\,e^{at}$$
Which can be rewritten as:
$$\frac{d}{dt}\left(x(t)\,e^{at}\right) = b(t)\,e^{at}$$
Thus:
$$x(t)\,e^{at} = \int b(t)\,e^{at}\,dt + C$$
Thus:
$$x(t) = e^{-at}\int b(t)\,e^{at}\,dt + Ce^{-at}$$

Separable equations
Let's suppose that
$$\frac{d}{dt}x(t) = F(t, x)\qquad(\text{or: } \dot{x}(t) = F(t, x))$$
for which $F(t, x)$ is the product of two functions:
$$F(t, x) = f(t)\,g(x)$$
Solution:
Step 1: write the differential equation as:
$$\frac{dx}{dt} = f(t)\,g(x)$$
Step 2: separate the variables:
$$\frac{dx}{g(x)} = f(t)\,dt$$
Step 3: integrate each side:
$$\int \frac{dx}{g(x)} = \int f(t)\,dt$$
Solve both integrals.

Example 17
Find the general solution of the differential equation:
$$\frac{dx}{dt} = -2t\,x^2$$
Step 2: Separate the equation.
$$\frac{dx}{x^2} = -2t\,dt$$
Step 3: Integrate
$$\int \frac{dx}{x^2} = -\int 2t\,dt$$
Which has the solution:
$$\frac{1}{x} = t^2 + C\qquad(\text{C is a constant})$$
or
$$x = \frac{1}{t^2 + C}$$

Example 18
Find the general solution of the differential equation:
$$\frac{dx}{dt} = \frac{t^3}{x^6 + 1}$$
Separate: $(x^6 + 1)\,dx = t^3\,dt$
Integrate: $\int (x^6 + 1)\,dx = \int t^3\,dt$
which has the solution:
$$\frac{1}{7}x^7 + x = \frac{1}{4}t^4 + C\qquad(\text{C is a constant})$$

Tutorial of Week 7 Advanced mathematics
Exercise 1.
Compute the following integrals:
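a) $\int_{-3}^{\infty} \frac{1}{\sqrt{t+3}}\,dt$
Solution:
$$\int_{-3}^{\infty} \frac{1}{\sqrt{t+3}}\,dt = \int_{-3}^{\infty} \frac{1}{\sqrt{t+3}}\,d(t+3)$$
We apply the substitution method, by substituting t+3=x.
For t = -3 (lower bound of integral), x = 0. The upper bound of the integral does not change.
$$\int_{-3}^{\infty} \frac{1}{\sqrt{t+3}}\,d(t+3) = \int_0^{\infty} \frac{1}{\sqrt{x}}\,dx = \Big[2\sqrt{x}\Big]_{x=0}^{\infty} = \lim_{x \to \infty} 2\sqrt{x} - \lim_{x \to 0^+} 2\sqrt{x} = \infty$$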
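b) $\int_{1/3}^{\infty} e^{-3t}\,dt$
Solution:
$$\int_{1/3}^{\infty} e^{-3t}\,dt = \left[-\frac{1}{3}e^{-3t}\right]_{t=1/3}^{\infty} = -\frac{1}{3}\lim_{t \to \infty} e^{-3t} + \frac{1}{3}e^{-3(1/3)} = \frac{1}{3e}$$
b additional bonus - exercise)
$$\int_0^{\infty} e^{-3t}\,dt = \left[-\frac{1}{3}e^{-3t}\right]_{t=0}^{\infty} = -\lim_{t \to \infty} \frac{1}{3}e^{-3t} + \frac{1}{3}e^{0} = 0 + \frac{1}{3} = \frac{1}{3}$$
c) $\int_1^{\infty} x e^{-x^2}\,dx$
Solution:
$$\int_1^{\infty} x e^{-x^2}\,dx = \frac{1}{2}\int_1^{\infty} e^{-x^2}\,dx^2$$
We apply the substitution method, by substituting $x^2 = t$.
For x=1 (lower bound of integral), t = 1. The upper bound of the integral does not change.
$$\int_1^{\infty} x e^{-x^2}\,dx = \frac{1}{2}\int_1^{\infty} e^{-x^2}\,dx^2 = \frac{1}{2}\int_1^{\infty} e^{-t}\,dt = \left[-\frac{1}{2}e^{-t}\right]_{t=1}^{\infty} = -\lim_{t \to \infty}\frac{1}{2}e^{-t} + \frac{1}{2}e^{-1} = 0 + \frac{1}{2e} = \frac{1}{2e}$$
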
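d) $\int_0^1\!\!\int_e^3 \left(xy + \frac{y}{x}\right)dx\,dy$ {and show that it is equal to $\int_e^3\!\!\int_0^1 \left(xy + \frac{y}{x}\right)dy\,dx$}
Solution:
$$\int_0^1\!\!\int_e^3 \left(xy + \frac{y}{x}\right)dx\,dy = \int_0^1\left(\int_e^3 \left(xy + \frac{y}{x}\right)dx\right)dy = \int_0^1 \left[\frac{1}{2}x^2 y + y\ln(x)\right]_{x=e}^{3} dy$$
$$= \int_0^1 \left(\frac{9}{2}y - \frac{1}{2}e^2 y + y\ln(3) - y\right)dy = \left[\frac{9}{4}y^2 - \frac{1}{4}e^2 y^2 + \frac{1}{2}y^2\ln(3) - \frac{1}{2}y^2\right]_{y=0}^{1} = \frac{7}{4} - \frac{1}{4}e^2 + \frac{1}{2}\ln(3)$$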
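e) $\int_e^3\!\!\int_0^1 \left(xy + \frac{y}{x}\right)dy\,dx$
Solution:
$$\int_e^3\!\!\int_0^1 \left(xy + \frac{y}{x}\right)dy\,dx = \int_e^3\left(\int_0^1 \left(xy + \frac{y}{x}\right)dy\right)dx = \int_e^3 \left[\frac{1}{2}xy^2 + \frac{y^2}{2x}\right]_{y=0}^{1} dx$$
$$= \int_e^3 \left(\frac{1}{2}x + \frac{1}{2x}\right)dx = \left[\frac{1}{4}x^2 + \frac{1}{2}\ln(x)\right]_{x=e}^{3} = \frac{9}{4} - \frac{1}{4}e^2 + \frac{1}{2}\ln(3) - \frac{1}{2} = \frac{7}{4} - \frac{1}{4}e^2 + \frac{1}{2}\ln(3)$$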
f) $\int f'(t)\,dt$
Solution:
$$\int f'(t)\,dt = f(t) + C$$
C is a constant
Exercise 2.
Please compute the first derivative of the following functions
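a. $F(x) = \int_1^x e^{-3t}\,dt$
Solution:
$$F'(x) = e^{-3x}$$
Alternatively, by computing the integral first:
$$F(x) = \int_1^x e^{-3t}\,dt = \left[-\frac{1}{3}e^{-3t}\right]_{t=1}^{x} = -\frac{1}{3}e^{-3x} + \frac{1}{3}e^{-3}$$
$$F'(x) = -\frac{1}{3}(-3)\,e^{-3x} = e^{-3x}$$
b. $F(x) = \int_1^{2x} e^{-3t}\,dt$
Solution (apply chain rule):
$$F'(x) = e^{-3(2x)}\,\frac{\partial (2x)}{\partial x} = 2e^{-6x}$$
Alternatively:
$$F(x) = \int_1^{2x} e^{-3t}\,dt = \left[-\frac{1}{3}e^{-3t}\right]_{t=1}^{2x} = -\frac{1}{3}e^{-6x} + \frac{1}{3}e^{-3}$$
$$F'(x) = -\frac{1}{3}(-6)\,e^{-6x} = 2e^{-6x}$$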
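c. $F(x) = \int_x^3 e^{-3t}\,dt$
Solution:
$$F(x) = \int_x^3 e^{-3t}\,dt = -\int_3^x e^{-3t}\,dt$$
$$F'(x) = -e^{-3x}$$
Alternatively:
$$F(x) = \int_x^3 e^{-3t}\,dt = \left[-\frac{1}{3}e^{-3t}\right]_{t=x}^{3} = \frac{1}{3}e^{-3x} - \frac{1}{3}e^{-9}$$
$$F'(x) = -e^{-3x}$$
d. $F(x) = \int_{-\infty}^{x} e^{3t}\,dt$
Solution:
$$F'(x) = e^{3x}$$
*Alternative solution:
$$F(x) = \int_{-\infty}^{x} e^{3t}\,dt = \left[\frac{1}{3}e^{3t}\right]_{t=-\infty}^{x} = \frac{1}{3}e^{3x} - \frac{1}{3}\lim_{t \to -\infty} e^{3t} = \frac{1}{3}e^{3x}$$
$$F'(x) = e^{3x}$$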
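e. $F(t) = \int_c^t \frac{1}{x+1}\,dx$ (with a constant lower limit c)
Solution:
$$F'(t) = \frac{1}{t+1}$$
f. $F(\alpha) = \int_0^1 \frac{1}{\alpha x + 1}\,dx$
* Solution:
We know that $\int \frac{1}{t}\,dt = \ln(t)$ for positive t. However the denominator is not t, but $\alpha x + 1$. So we need to rearrange the integral a bit:
$$F(\alpha) = \int_0^1 \frac{1}{\alpha x + 1}\,dx = \frac{1}{\alpha}\int_0^1 \frac{1}{\alpha x + 1}\,d(\alpha x) = \frac{1}{\alpha}\int_0^1 \frac{1}{\alpha x + 1}\,d(\alpha x + 1)$$
We apply the substitution method here, $\alpha x + 1 = t$, so that the upper bound (x=1) becomes $\alpha \cdot 1 + 1 = \alpha + 1$, and the lower bound (x=0) is $\alpha \cdot 0 + 1 = 1$:
$$\frac{1}{\alpha}\int_0^1 \frac{1}{\alpha x + 1}\,d(\alpha x + 1) = \frac{1}{\alpha}\int_1^{\alpha+1} \frac{1}{t}\,dt$$
Next, we apply the product rule of differentiation, $f'(x)g(x) + f(x)g'(x)$:
$$F'(\alpha) = \frac{\partial}{\partial \alpha}\left(\frac{1}{\alpha}\int_1^{\alpha+1} \frac{1}{t}\,dt\right) = \frac{1}{\alpha}\,\frac{1}{\alpha+1} - \frac{1}{\alpha^2}\int_1^{\alpha+1} \frac{1}{t}\,dt = \frac{1}{\alpha(\alpha+1)} - \frac{1}{\alpha^2}\Big[\ln(t)\Big]_{t=1}^{\alpha+1} = \frac{1}{\alpha(\alpha+1)} - \frac{1}{\alpha^2}\ln(\alpha+1)$$
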
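*Alternative:
$$F'(\alpha) = \frac{\partial}{\partial \alpha}\int_0^1 \frac{1}{\alpha x + 1}\,dx = \int_0^1 \frac{\partial}{\partial \alpha}\left(\frac{1}{\alpha x + 1}\right)dx = -\int_0^1 \frac{x}{(\alpha x + 1)^2}\,dx$$
At this stage, I decided to apply integration by parts, because I know that
$$\int \frac{t}{(t+1)^2}\,dt = -\int t\,d\left(\frac{1}{t+1}\right) = -\frac{t}{t+1} + \int \frac{1}{t+1}\,dt = -\frac{t}{t+1} + \ln(t+1)\quad\text{for positive } t+1.$$
{However, there is one problem, it is not t, but $(\alpha x + 1)$. Hence:
$$\frac{1}{(\alpha x + 1)^2}\,dx = -\frac{1}{\alpha}\,d\left(\frac{1}{\alpha x + 1}\right),\quad\text{because}\quad \frac{d}{dx}\left(\frac{1}{\alpha x + 1}\right) = -\frac{\alpha}{(\alpha x + 1)^2}\;\}$$
$$-\int_0^1 \frac{x}{(\alpha x + 1)^2}\,dx = \frac{1}{\alpha}\int_0^1 x\,d\left(\frac{1}{\alpha x + 1}\right) = \frac{1}{\alpha}\left[\frac{x}{\alpha x + 1}\right]_{x=0}^{1} - \frac{1}{\alpha}\int_0^1 \frac{1}{\alpha x + 1}\,dx$$
$$= \frac{1}{\alpha(\alpha+1)} - \frac{1}{\alpha^2}\int_0^1 \frac{1}{\alpha x + 1}\,d(\alpha x + 1) = \frac{1}{\alpha(\alpha+1)} - \frac{1}{\alpha^2}\int_1^{\alpha+1} \frac{1}{t}\,dt$$
$$= \frac{1}{\alpha(\alpha+1)} - \frac{1}{\alpha^2}\Big[\ln(t)\Big]_{t=1}^{\alpha+1} = \frac{1}{\alpha(\alpha+1)} - \frac{1}{\alpha^2}\ln(\alpha+1)$$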
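Both routes to the derivative can be compared numerically. The sketch below assumes the function is $F(\alpha) = \int_0^1 \frac{1}{\alpha x + 1}\,dx$ with a parameter called $\alpha$ here; it evaluates the closed form, the integral of the differentiated integrand, and a finite difference of F. The helper `simpson` and the value `alpha = 2.0` are illustrative choices.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def F(alpha):
    """F(alpha) as a numerical integral over x in [0, 1]."""
    return simpson(lambda x: 1 / (alpha * x + 1), 0.0, 1.0)

def F_prime_closed(alpha):
    """Closed form obtained with the product rule."""
    return 1 / (alpha * (alpha + 1)) - math.log(alpha + 1) / alpha ** 2

def F_prime_under_integral(alpha):
    """Differentiate the integrand with respect to alpha, then integrate."""
    return simpson(lambda x: -x / (alpha * x + 1) ** 2, 0.0, 1.0)

alpha, h = 2.0, 1e-5
finite_difference = (F(alpha + h) - F(alpha - h)) / (2 * h)
print(F_prime_closed(alpha), F_prime_under_integral(alpha), finite_difference)
```

All three numbers should coincide to several decimals, which illustrates that differentiation and integration may be interchanged when the limits do not depend on the parameter.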
Exercise 3.
Sketch the direction diagram and the phase diagram (if possible) for the following differential
equations. If possible, give an explicit solution.
a. $\dot{x}(t) = 3x(t)$
Solution:
$$x(t) = C e^{3t}$$
The directional diagram, with some solutions drawn in, looks thus:
[Figure: direction diagram with some solution curves]
What we see is a picture of widely diverging functions as time progresses. This can be confirmed by looking at the phase diagram, which, for autonomous equations (those that do not explicitly depend on time; for a non-autonomous equation see e.g. exercise 3e), plots the relation between $x$ and $\dot{x}$.
[Figure: phase diagram, $\dot{x}$ against $x$]
As we can see in the phase diagram, if $x$ is positive, then $\dot{x}$ is also positive. That means that if x is positive, it will increase over time. Similarly, if it is negative, it will decrease over time. Only if x is zero, the time-derivative of x is also zero, and x does not change over time. This is what we saw in the first picture. The solutions are either always increasing or always decreasing. Only the zero-solution does not change over time. It is called an unstable equilibrium: if you are there, you stay there, but if you are near, no matter how close you are, you'll never get there.

b. $\dot{x}(t) = -\frac{1}{2}x(t)$
Solution:
$$x(t) = C e^{-\frac{1}{2}t}$$
[Figure: direction diagram with some solution curves]
From the picture we see that these solutions do converge to 0. We can again confirm this in the phase diagram.
[Figure: phase diagram, $\dot{x}$ against $x$]
This time, a positive $x$ implies a negative $\dot{x}$ and vice versa, so a positive x means that x is decreasing over time. Again zero is an equilibrium, as it does not change over time. However, this time, it is a stable equilibrium: if you start somewhere near it, you'll get ever closer. In fact, in this particular case, it doesn't matter where you start, you always get ever closer to zero.

c. $\dot{x}(t) + 2x(t) = 1$
Solution:
From the lecture slides we obtain:
$$x = \frac{1}{2} + C e^{-2t}$$
[Figure: direction diagram with some solution curves]
We see all solutions converging to 0.5. The phase diagram confirms:
[Figure: phase diagram, $\dot{x}$ against $x$]
0.5 is an equilibrium, because $\dot{x} = 0$ at $x = \frac{1}{2}$, and it is stable, because when x is smaller than 0.5, it is increasing, $\dot{x} > 0$, and it is decreasing if x is larger than 0.5.

d. $\dot{x}(t) - 2x(t) = 1$
Solution:
We obtain from the lecture slides:
$$x = -\frac{1}{2} + C e^{2t}$$
We see -0.5 is an equilibrium, but it is unstable. Again, the phase diagram confirms:
[Figure: phase diagram, $\dot{x}$ against $x$]
e. $\dot{x}(t) - t\,x(t) = 0$
Solution:
This differential equation is of the form:
$\dot{x}(t) + f(t)\,x(t) = 0$, where $f(t) = -t$. These equations have the following general solution:
If $F(t)$ is such that $\frac{dF(t)}{dt} = f(t)$ on a certain interval, then:
$x(t) = C e^{-F(t)}$ on that interval. Here, the simplest F that works is $F(t) = -\frac{1}{2}t^2$ ($F(t) = -\frac{1}{2}t^2 + 43$ would also work, but it would make life more complicated). We get:
$$x(t) = C e^{\frac{1}{2}t^2}$$
[Figure: direction diagram with some solution curves]
Because our differential equation also depends on t, we cannot draw a phase diagram; the relation between $x$ and $\dot{x}$ changes with time. We can observe however that x=0 is the only equilibrium. We know that in equilibrium $\dot{x} = 0$ for all t; from this and our differential equation it follows immediately that x=0.
f. $t\,\dot{x}(t) + x(t) = 0$
Solution:
This is not of the form $\dot{x}(t) + f(t)\,x(t) = 0$ that we discussed in 3e), but we can make it so by dividing both sides by t. We obtain:
$\dot{x}(t) + \frac{x(t)}{t} = 0$, which is of the form we want, with $f(t) = \frac{1}{t}$. We can find antiderivatives for f on the positive and the negative interval: $F(t) = \log(t)$ and $F(t) = \log(-t)$ respectively.
So we get on the two intervals:
$$x(t) = \begin{cases} C e^{-\log(t)} & \text{if } t > 0 \\ C e^{-\log(-t)} & \text{if } t < 0 \end{cases} \;=\; \begin{cases} \dfrac{C}{t} & \text{if } t > 0 \\[2mm] -\dfrac{C}{t} & \text{if } t < 0 \end{cases}$$
To be sure, let's check this by plugging it into the original equation:
$t\,\dot{x}(t) + x(t) = -\frac{Ct}{t^2} + \frac{C}{t} = 0$, so it works for t>0. Similarly for t<0.
g. $\dot{x}(t) + x(t) = t$
Solution:
The nicest way to solve this is in 2 steps. This is a linear differential equation, meaning that x and its derivatives appear only linearly (their coefficients could be non-linear, although that's not the case here). It is not homogeneous; it would be if it was $\dot{x}(t) + x(t) = 0$ instead. In general, it is homogeneous if, when you write all the terms involving x and its derivatives on one side, the other side is 0. Now the equation is called inhomogeneous. The solution to an inhomogeneous linear differential equation can be obtained as follows. Find a simple particular solution and find the general solution to the homogeneous counterpart. Add the two and you have your general solution. It will become clear in a minute:
Finding the particular solution can be tricky, but here it's rather easy. If we try $x(t) = t$, we'd get: $\dot{x}(t) + x(t) = 1 + t \ne t$, but this gives us the hint we need: $x(t) = t - 1$ will work:
$\dot{x}(t) + x(t) = 1 + t - 1 = t$. By being clever like this, you can often find the particular solution.
The homogeneous equation is:
$\dot{x}(t) + x(t) = 0$, which has solution:
$x(t) = C e^{-t}$.
So our total solution becomes:
$x(t) = C e^{-t} + t - 1$. We check our result:
$$\dot{x}(t) + x(t) = -C e^{-t} + 1 + C e^{-t} + t - 1 = t$$
[Figure: direction diagram with some solution curves]
h. $\dot{x}(t) + x(t) = t^2$
Solution:
We apply the same trick. In fact, the homogeneous solution is the same as before:
$$x(t) = C e^{-t}$$
(Strictly speaking, we should not write $x(t)$, because this is not a solution to our equation. It would however be a bit formal and a bit too much work to introduce new notation for this.)
For the particular solution, we first try $x(t) = t^2$, only to see that it should be $x(t) = t^2 - 2t + 2$: indeed $\dot{x}(t) + x(t) = 2t - 2 + t^2 - 2t + 2 = t^2$.
So the general solution becomes:
$$x(t) = C e^{-t} + t^2 - 2t + 2$$
[Figure: direction diagram with some solution curves]
Notice that, although no individual solution converges, all solutions start to resemble the function $x(t) = t^2 - 2t + 2$, the particular solution, more and more over time. The same thing happened in exercise 3g); the solutions converged to the particular solution. This is because the homogeneous solutions converge to 0.
i. $\dot{x}(t) + t\,x(t) = t$
Solution:
Again, we try to find a particular solution. It turns out to be particularly easy in this case: $x(t) = 1$ works.
The solution to the homogeneous equation $\dot{x}(t) + t\,x(t) = 0$ we know to be (similarly to exercise 3e):
$$x(t) = C e^{-\frac{1}{2}t^2}$$
So we obtain:
$$x(t) = C e^{-\frac{1}{2}t^2} + 1$$
Let's check:
$\dot{x}(t) + t\,x(t) = -t\,C e^{-\frac{1}{2}t^2} + t\left(C e^{-\frac{1}{2}t^2} + 1\right) = t$, it worked!
[Figure: direction diagram with some solution curves]
* j. $\dot{x}(t) + t\,x(t) = t^2$
Solution:
Well, sort of. It turns out that there is no handy particular solution to this equation. This is quite a common occurrence with differential equations. However, we can still see what is going on by drawing our regular pictures. Only, now the solutions were found numerically by a computer.
[Figure: direction diagram with numerically computed solution curves]
Notice that the solutions approach the line x=t. According to the differential equation, $\dot{x} = t^2 - t \cdot t = 0$ along that line. Unfortunately the line itself is not a solution, because along x=t we have $\dot{x} = 1$.
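Finding such solutions "numerically by a computer" can be sketched with a standard integration scheme. The example below is a minimal illustration using a classical fourth-order Runge-Kutta method for $\dot{x} = t^2 - t\,x$; the initial values, the end time and the step count are arbitrary assumptions, not the settings used for the original pictures.

```python
def rk4(rhs, t0, x0, t_end, steps):
    """Classical fourth-order Runge-Kutta for dx/dt = rhs(t, x)."""
    h = (t_end - t0) / steps
    t, x = t0, x0
    for _ in range(steps):
        k1 = rhs(t, x)
        k2 = rhs(t + h / 2, x + h * k1 / 2)
        k3 = rhs(t + h / 2, x + h * k2 / 2)
        k4 = rhs(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

def rhs(t, x):
    # x'(t) + t*x(t) = t^2 rewritten as x'(t) = t^2 - t*x(t)
    return t * t - t * x

t_end = 10.0
values = [rk4(rhs, 0.0, x0, t_end, 2000) for x0 in (-5.0, 0.0, 5.0)]
for x0, x_end in zip((-5.0, 0.0, 5.0), values):
    print(f"x(0) = {x0:5.1f}  ->  x({t_end}) = {x_end:.4f},  x - t = {x_end - t_end:.4f}")
```

Different starting values end up on practically the same curve, and that curve stays close to, but slightly off, the line x = t.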
k. $\dot{x}(t) = t^3 e^{-x(t)}$
Solution:
This can be handled by separating the variables, as it is called. The right-hand side is a product of a term containing only x and a term containing only t. So we can rewrite it like this:
$$\dot{x}(t) = \frac{dx}{dt} = t^3 e^{-x(t)} \;\Rightarrow\; e^{x(t)}\,dx = t^3\,dt$$
$$\int e^{x(t)}\,dx = \int t^3\,dt \;\Rightarrow\; e^{x(t)} = \frac{1}{4}t^4 + C$$
$$x(t) = \log\left(\frac{1}{4}t^4 + C\right)$$

l. $\dot{x}(t) = \dfrac{t^2 + 1}{x^5 + 1}$
Solution:
This works in the same way, but the solution stays in implicit form.
$$\int (x^5 + 1)\,dx = \frac{1}{6}x^6 + x = \int (t^2 + 1)\,dt = \frac{1}{3}t^3 + t + C$$
So we end up with the implicit relation between x and t: $\frac{1}{6}x^6 + x = \frac{1}{3}t^3 + t + C$. We can't simply rewrite, but we know from week 5 how to handle implicit relations.
* m. $\dot{x}(t) - x^2(t) = t$
Solution:
No simple solution exists and no phase diagram can be drawn. But we can see the solution graphically.
[Figure: direction diagram with numerically computed solution curves]
n. $\dot{x}(t) - x(t) = \log(t)$
Again we can only show the solution via a computer simulation:
[Figure: direction diagram with numerically computed solution curves]
o. $\dot{x}(t) + t^2 x(t) = t^3 + 2t^2 + 1$
Solution:
Particular: $x = t + 2$
Homogeneous: $x = C e^{-\frac{1}{3}t^3}$
p. $\dot{x}(t) + 3x(t) = 4e^{-t}$
Solution:
Particular: $x = 2e^{-t}$
Homogeneous: $x = C e^{-3t}$
q. $\dot{x}(t) + (2t + 2)\,x(t) = e^{-t^2}$
Solution:
Particular: $x = \frac{1}{2}e^{-t^2}$
Homogeneous: $x = C e^{-(t^2 + 2t)}$

Friday broad tutorial
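* Exercise 1. Consider the following probability density function (the exponential distribution):
$$f(x) = \lambda e^{-\lambda x}\quad\text{for } x \in [0, \infty)$$
a. Please calculate the expected value of x.
b. Compute the cumulative distribution of x.
Solution:
a. We know that $EX = \int_0^{\infty} x f(x)\,dx$
Hence, we apply the method of integration by parts:
$$\int_0^{\infty} x\,\lambda e^{-\lambda x}\,dx = -\int_0^{\infty} x\,d e^{-\lambda x} = \Big[-x e^{-\lambda x}\Big]_{x=0}^{\infty} + \int_0^{\infty} e^{-\lambda x}\,dx$$
$$= -\lim_{x \to \infty}\left(x e^{-\lambda x}\right) + 0 + \left[-\frac{1}{\lambda}e^{-\lambda x}\right]_{x=0}^{\infty} = 0 - \frac{1}{\lambda}\lim_{x \to \infty} e^{-\lambda x} + \frac{1}{\lambda} = \frac{1}{\lambda}$$
It can be shown that $\lim_{x \to \infty} x e^{-\lambda x} = 0$.
b. The cumulative distribution is
$$F(x) = \int_0^{x} \lambda e^{-\lambda t}\,dt = 1 - e^{-\lambda x}$$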
Exercise 2.
The probability of observing a t is $f(t) = e^{-t}$ for $t \in [0, \infty)$
Compute the probability of t larger than 1.
Solution:
$$\Pr(T > 1) = \int_1^{\infty} e^{-t}\,dt = \Big[-e^{-t}\Big]_{t=1}^{\infty} = -\lim_{t \to \infty} e^{-t} + \frac{1}{e} = \frac{1}{e}$$

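* Exercise 3. Consider the Solow economic growth model:
$$X = A K^{1-\alpha} L^{\alpha}$$ (Cobb-Douglas production function)
$$\dot{K} = sX$$ (investment is proportional to output)
$$L = L_0 e^{\lambda t}$$ (exponential growth of labour force)
where $X = X(t)$ is the national product, $K = K(t)$ is the capital stock and $L = L(t)$ is the number of employees at time t. The model contains the following constants: $A$, $\alpha$, $s$, $\lambda$ and $L_0$.
Derive the differential equation to determine $K = K(t)$ when $K_0 = K(0) > 0$.
Solution:
$$\dot{K} = \frac{dK}{dt} = sA K^{1-\alpha}\left(L_0 e^{\lambda t}\right)^{\alpha} = sA (L_0)^{\alpha} K^{1-\alpha} e^{\alpha\lambda t}$$
It is a separable differential equation. So we have on the left-hand side a function of dK and K. On the right-hand side we have a function of dt and t.
$$K^{\alpha-1}\,dK = sA\left(L_0 e^{\lambda t}\right)^{\alpha}dt$$
We take the integral on both sides of this equation:
$$\int K^{\alpha-1}\,dK = \int sA\left(L_0 e^{\lambda t}\right)^{\alpha}dt$$
which becomes
$$\frac{1}{\alpha}K^{\alpha} = \frac{1}{\alpha\lambda}sA (L_0)^{\alpha} e^{\alpha\lambda t} + C$$
Next, we rewrite this equation, so that K is a function of t. Thus, we eliminate $\frac{1}{\alpha}$ (with $C_1 = \alpha C$):
$$K^{\alpha} = \frac{sA}{\lambda}(L_0)^{\alpha} e^{\alpha\lambda t} + C_1$$
If $K = K_0$ for t=0:
$$C_1 = K_0^{\alpha} - \frac{sA}{\lambda}(L_0)^{\alpha}$$
Thus the solution becomes:
$$K = \left[K_0^{\alpha} + \frac{sA}{\lambda}(L_0)^{\alpha}\left(e^{\alpha\lambda t} - 1\right)\right]^{1/\alpha}$$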
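As a sanity check on the Solow solution, the closed form can be compared with a numerical integration of $\dot{K} = sAK^{1-\alpha}(L_0 e^{\lambda t})^{\alpha}$. The sketch below is only an illustration: the parameter values (`s=0.2`, `A=1`, `alpha=0.5`, `lam=0.02`, `L0=100`, `K0=50`) are hypothetical, and the integrator is a standard fourth-order Runge-Kutta scheme, which is not part of the course material.

```python
import math

def solow_capital(t, K0, s, A, alpha, lam, L0):
    """Closed-form K(t) from the separation-of-variables derivation."""
    inside = K0**alpha + (s * A / lam) * L0**alpha * (math.exp(alpha * lam * t) - 1.0)
    return inside ** (1.0 / alpha)

def solow_rhs(t, K, s, A, alpha, lam, L0):
    """Right-hand side of dK/dt = s*A*K^(1-alpha)*(L0*e^(lam*t))^alpha."""
    return s * A * K ** (1.0 - alpha) * (L0 * math.exp(lam * t)) ** alpha

def integrate_rk4(K0, t_end, steps, **params):
    """Classical fourth-order Runge-Kutta integration from t=0 to t_end."""
    h = t_end / steps
    t, K = 0.0, K0
    for _ in range(steps):
        k1 = solow_rhs(t, K, **params)
        k2 = solow_rhs(t + h / 2, K + h * k1 / 2, **params)
        k3 = solow_rhs(t + h / 2, K + h * k2 / 2, **params)
        k4 = solow_rhs(t + h, K + h * k3, **params)
        K += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return K

# Hypothetical parameter values, chosen only for illustration.
params = dict(s=0.2, A=1.0, alpha=0.5, lam=0.02, L0=100.0)
K_closed = solow_capital(10.0, K0=50.0, **params)
K_numeric = integrate_rk4(50.0, 10.0, 1000, **params)
print(K_closed, K_numeric)
```

The two numbers should agree to many digits; a disagreement would point to a sign or exponent error in the closed form.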

Take home assignments of week 7 (on material of week 5, 6 and 7)
Exercise 1.
Find the critical points of the following function and determine whether they are minima,
maxima or saddlepoints.
$$f(x, y) = 2x^3 - 5x^2 + x + yx^2 - \frac{1}{3}y^3$$
(Hint: The first order conditions give you a system of 2 equations. First look at the condition
with respect to y, this will give you two possibilities. Plug what you find in the condition with
respect to x. This will give you a quadratic equation for both possibilities, so you end up with
four critical points.)
Solution:
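$$\frac{\partial f(x, y)}{\partial x} = 6x^2 - 10x + 1 + 2yx = 0$$
$$\frac{\partial f(x, y)}{\partial y} = x^2 - y^2 = 0 \;\Rightarrow\; y^2 = x^2$$
From the latter equality we see that $x = y$ or $x = -y$. If we plug the first case into the first condition, we get:
$$6x^2 - 10x + 1 + 2yx = 8x^2 - 10x + 1 = 0$$
$$x = y = \frac{10 + \sqrt{68}}{16} \quad \text{or} \quad x = y = \frac{10 - \sqrt{68}}{16}$$
If we plug the second case into the first condition, we get:
$$6x^2 - 10x + 1 + 2yx = 4x^2 - 10x + 1 = 0$$
$$x = -y = \frac{10 + \sqrt{84}}{8} \quad \text{or} \quad x = -y = \frac{10 - \sqrt{84}}{8}$$
So we obtain the following four critical points (x,y):
$$\left(\frac{10 + \sqrt{68}}{16}, \frac{10 + \sqrt{68}}{16}\right),\ \left(\frac{10 - \sqrt{68}}{16}, \frac{10 - \sqrt{68}}{16}\right),$$
$$\left(\frac{10 + \sqrt{84}}{8}, \frac{-10 - \sqrt{84}}{8}\right),\ \left(\frac{10 - \sqrt{84}}{8}, \frac{-10 + \sqrt{84}}{8}\right)$$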
To see whether these are maxima, minima or saddle points, we calculate the Hessian at each
of these points and check its definiteness. The Hessian is:
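$$\begin{pmatrix} \dfrac{\partial^2 f}{\partial x^2} & \dfrac{\partial^2 f}{\partial x \partial y} \\ \dfrac{\partial^2 f}{\partial y \partial x} & \dfrac{\partial^2 f}{\partial y^2} \end{pmatrix} = \begin{pmatrix} 12x - 10 + 2y & 2x \\ 2x & -2y \end{pmatrix}$$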
This we can now evaluate at our 4 points. For the first we obtain:
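$$\begin{pmatrix} 12\cdot\frac{10 + \sqrt{68}}{16} - 10 + 2\cdot\frac{10 + \sqrt{68}}{16} & 2\cdot\frac{10 + \sqrt{68}}{16} \\ 2\cdot\frac{10 + \sqrt{68}}{16} & -2\cdot\frac{10 + \sqrt{68}}{16} \end{pmatrix} \approx \begin{pmatrix} 6 & 2.3 \\ 2.3 & -2.3 \end{pmatrix}$$
Ordinarily, we would check the determinants of the principal minors to determine the definiteness, but here we see that the signs of the numbers on the diagonal are not all the same, so the matrix is indefinite. The point is a saddle point.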
For the second point we obtain:
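$$\begin{pmatrix} 12\cdot\frac{10 - \sqrt{68}}{16} - 10 + 2\cdot\frac{10 - \sqrt{68}}{16} & 2\cdot\frac{10 - \sqrt{68}}{16} \\ 2\cdot\frac{10 - \sqrt{68}}{16} & -2\cdot\frac{10 - \sqrt{68}}{16} \end{pmatrix} \approx \begin{pmatrix} -8.5 & 0.2 \\ 0.2 & -0.2 \end{pmatrix}$$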
Since all the diagonal elements are negative, this might be a negative definite matrix, so we
check the determinants of the principal minors. There is of course only the full matrix in this
case:
$$\begin{vmatrix} -8.5 & 0.2 \\ 0.2 & -0.2 \end{vmatrix} = 0.2 \cdot 8.5 - 0.2^2 = 1.66 > 0.$$
So the principal minors alternate in sign: this is a negative definite matrix and this point is a
local maximum.
We check the third point:
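$$\begin{pmatrix} 12\cdot\frac{10 + \sqrt{84}}{8} - 10 + 2\cdot\frac{-10 - \sqrt{84}}{8} & 2\cdot\frac{10 + \sqrt{84}}{8} \\ 2\cdot\frac{10 + \sqrt{84}}{8} & -2\cdot\frac{-10 - \sqrt{84}}{8} \end{pmatrix} \approx \begin{pmatrix} 14 & 4.8 \\ 4.8 & 4.8 \end{pmatrix}$$
This has all diagonal elements positive, so it might be positive definite. We check the principal minor:
$$\begin{vmatrix} 14 & 4.8 \\ 4.8 & 4.8 \end{vmatrix} = 14 \cdot 4.8 - 4.8^2 = 44.16 > 0,$$
so this is positive definite and we have found a minimum.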
Finally, we check the fourth point:
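$$\begin{pmatrix} 12\cdot\frac{10 - \sqrt{84}}{8} - 10 + 2\cdot\frac{-10 + \sqrt{84}}{8} & 2\cdot\frac{10 - \sqrt{84}}{8} \\ 2\cdot\frac{10 - \sqrt{84}}{8} & -2\cdot\frac{-10 + \sqrt{84}}{8} \end{pmatrix} \approx \begin{pmatrix} -9 & 0.2 \\ 0.2 & 0.2 \end{pmatrix}$$
Here the diagonal elements again have different signs, so the matrix is indefinite and our point is neither a minimum nor a maximum: it is a saddle point.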
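The classification of the four critical points can be reproduced with a few lines of code. The sketch below uses the function $f(x,y) = 2x^3 - 5x^2 + x + yx^2 - \frac{1}{3}y^3$ from the exercise and applies the usual second-derivative test for two variables through the sign of the Hessian determinant. This is a slightly different route than the diagonal-sign argument used in the solution, but it leads to the same conclusions. The function names are hypothetical.

```python
import math

def gradient(x, y):
    """First-order partial derivatives (f_x, f_y) of f(x, y)."""
    return (6 * x**2 - 10 * x + 1 + 2 * x * y, x**2 - y**2)

def hessian(x, y):
    """Second-order partial derivatives (f_xx, f_xy, f_yy)."""
    return (12 * x - 10 + 2 * y, 2 * x, -2 * y)

def classify(x, y):
    """Second-derivative test based on the Hessian determinant."""
    fxx, fxy, fyy = hessian(x, y)
    det = fxx * fyy - fxy**2
    if det < 0:
        return "saddle point"
    if det > 0:
        return "local minimum" if fxx > 0 else "local maximum"
    return "inconclusive"

r68, r84 = math.sqrt(68), math.sqrt(84)
critical_points = [
    ((10 + r68) / 16, (10 + r68) / 16),
    ((10 - r68) / 16, (10 - r68) / 16),
    ((10 + r84) / 8, -(10 + r84) / 8),
    ((10 - r84) / 8, -(10 - r84) / 8),
]
labels = [classify(x, y) for x, y in critical_points]
for point, label in zip(critical_points, labels):
    print(point, gradient(*point), label)
```

The printed gradients should be zero up to rounding, which confirms that the four points satisfy both first order conditions.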

Exercise 2.
Maximize $f(x, y) = \log(x) + 4\log(3y)$, subject to $x + 6y = 10$.
Solution:
We construct the Lagrangian and take the first derivatives:
$$L(x, y) = \log(x) + 4\log(3y) - \lambda(x + 6y - 10)$$
$$\frac{\partial L}{\partial x} = \frac{1}{x} - \lambda = 0$$
$$\frac{\partial L}{\partial y} = \frac{4}{y} - 6\lambda = 0$$
Because it almost always works in economics, we divide the two first order conditions:
$$\frac{y}{4x} = \frac{1}{6} \;\Rightarrow\; y = \frac{4}{6}x$$
We plug this into the constraint:
$$x + 6y = x + 6\cdot\frac{4}{6}x = 5x = 10$$
$$x = 2, \quad y = \frac{4}{6}\cdot 2 = \frac{4}{3}$$
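The Lagrangian result can be cross-checked by eliminating y with the constraint and searching over x directly. The sketch below is a brute-force grid search, not the method of the exercise; the grid step of 0.001 and the function name `objective` are assumptions made for illustration.

```python
import math

def objective(x):
    """f(x, y) with y eliminated via the constraint x + 6y = 10."""
    y = (10.0 - x) / 6.0
    return math.log(x) + 4.0 * math.log(3.0 * y)

# Grid over the feasible interval 0 < x < 10 (both logarithms need positive arguments).
grid = [i / 1000.0 for i in range(1, 10000)]
best_x = max(grid, key=objective)
best_y = (10.0 - best_x) / 6.0
print(best_x, best_y)
```

The maximiser on the grid should coincide with the point $x = 2$, $y = 4/3$ found with the first order conditions.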
Exercise 3.
Solve the following integrals:
a) $\int \frac{t}{(t + 3)^2}\,dt$, for which you may assume that t is positive.
Solution:
We apply the method of integration by parts:
$$\int \frac{t}{(t + 3)^2}\,dt = -\int t\,d\left(\frac{1}{t + 3}\right) = -\frac{t}{t + 3} + \int \frac{1}{t + 3}\,dt = -\frac{t}{t + 3} + \ln(t + 3) + C$$
Two remarks:
1) We applied in the first step that
$$\frac{d}{dt}\left(\frac{1}{t + 3}\right) = -\frac{1}{(t + 3)^2} \;\Rightarrow\; \frac{1}{(t + 3)^2}\,dt = -d\left(\frac{1}{t + 3}\right)$$
2) We can check that:
$$\frac{d}{dt}\left(-\frac{t}{t + 3} + \ln(t + 3) + C\right) = \frac{d}{dt}\left(-t(t + 3)^{-1} + \ln(t + 3) + C\right) = -(t + 3)^{-1} + t(t + 3)^{-2} + (t + 3)^{-1} = t(t + 3)^{-2}$$

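b) $\int_0^1 \int_1^2 \left(\frac{1}{2}xy\right)dx\,dy$
Solution:
$$\int_0^1 \int_1^2 \left(\tfrac{1}{2}xy\right)dx\,dy = \int_0^1 \left(\int_1^2 \tfrac{1}{2}xy\,dx\right)dy = \int_0^1 \tfrac{1}{4}x^2 y\Big|_{x=1}^{2}\,dy = \int_0^1 \tfrac{3}{4}y\,dy = \tfrac{3}{8}y^2\Big|_{y=0}^{1} = \tfrac{3}{8}$$
Alternative:
$$\int_1^2 \int_0^1 \left(\tfrac{1}{2}xy\right)dy\,dx = \int_1^2 \left(\int_0^1 \tfrac{1}{2}xy\,dy\right)dx = \int_1^2 \tfrac{1}{4}x y^2\Big|_{y=0}^{1}\,dx = \int_1^2 \tfrac{1}{4}x\,dx = \tfrac{1}{8}x^2\Big|_{x=1}^{2} = \tfrac{4}{8} - \tfrac{1}{8} = \tfrac{3}{8}$$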
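The value 3/8 of the double integral can be confirmed numerically. The sketch below uses a simple midpoint rule on a rectangular grid; the function name `double_integral` and the grid size `n=200` are assumptions made for illustration. Because the integrand $\frac{1}{2}xy$ is linear in each variable, the midpoint rule reproduces the integral up to rounding error.

```python
def double_integral(f, x_lo, x_hi, y_lo, y_hi, n=200):
    """Midpoint-rule approximation of the integral of f over a rectangle."""
    hx = (x_hi - x_lo) / n
    hy = (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * hx        # midpoint of the i-th x-cell
        for j in range(n):
            y = y_lo + (j + 0.5) * hy    # midpoint of the j-th y-cell
            total += f(x, y)
    return total * hx * hy

# x runs from 1 to 2 and y from 0 to 1, as in exercise 3b.
approx = double_integral(lambda x, y: 0.5 * x * y, 1.0, 2.0, 0.0, 1.0)
print(approx)
```

Swapping the roles of the two loops corresponds to the alternative order of integration and gives the same number.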
Exercise 4.
Sketch the direction diagram and the phase diagram (if possible) for the following differential
equations. If possible, give an explicit solution.
$$\dot{x}(t) = 2x(t)$$
Solution:
Let's first solve it exactly. We could either rewrite the equation in a form to which we know the solution, or we could try to guess the solution, because the equation is so simple. We will start with the guessing approach. $\dot{x}(t) = 2x(t)$ requires that we find functions whose derivative is twice the function itself. We know that the exponential function is equal to its own derivative, so it makes sense to see if we can manipulate it to get a solution to our equation. We could try multiplying the exponential function by a constant, but then it would still be equal to its own derivative: $\frac{d}{dt}Ce^{t} = Ce^{t}$. But if we multiply the exponent, we do get our desired result: $\frac{d}{dt}e^{2t} = 2e^{2t}$.
This gives us exactly what we want. But we know that it will also hold for any multiple of this function, so we get as a general solution: $x(t) = Ce^{2t}$.
We can also obtain this result in a more procedural fashion: if we rewrite the equation as $\dot{x}(t) - 2x(t) = 0$, we see that it is a homogeneous linear equation, i.e. of the form $\dot{x}(t) + f(t)x(t) = 0$, with $f(t) = -2$. This general form has the solution $x(t) = Ce^{-F(t)}$, where F is an antiderivative of f, $F'(t) = f(t)$. Here $F(t) = -2t$ and we obtain as our solution $x(t) = Ce^{2t}$, as before.
We draw the direction diagram (or vector field) and some solutions:
[Figure: direction diagram of $\dot{x} = 2x$ with some solution curves]
Finally, we draw a phase diagram. Here we set $x$ on the horizontal axis and $\dot{x}$ on the vertical axis.
[Figure: phase diagram, $\dot{x}$ against $x$]
From the phase diagram we can also infer the qualitative behaviour of our solutions. In particular, we see that $\dot{x} = 0$ when x=0. This implies that x=0 is an equilibrium: when x=0, the x value does not change over time. If x>0, then $\dot{x} > 0$, according to the phase diagram. This means that if x is positive, it will grow over time. Similarly, if x<0, then $\dot{x} < 0$, so x will decrease over time. We see that solutions will move away from the equilibrium value 0 over time, so that x=0, while an equilibrium, is not stable. All this is corroborated by our actual solutions.
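The qualitative conclusions (x = 0 is an equilibrium, and solutions on either side move away from it) can also be seen in a small simulation. The sketch below integrates $\dot{x} = 2x$ with the forward Euler method and compares the result with the exact solution $x(t) = Ce^{2t}$, where $C = x(0)$. The Euler scheme and the step count are assumptions chosen for illustration; Euler is only approximate, so a small difference with the exact value is expected.

```python
import math

def euler(f, x0, t_end, steps):
    """Forward Euler approximation of x(t_end) for dx/dt = f(t, x), x(0) = x0."""
    h = t_end / steps
    t, x = 0.0, x0
    for _ in range(steps):
        x += h * f(t, x)   # follow the direction field for one small step
        t += h
    return x

def rhs(t, x):
    """Right-hand side of the differential equation dx/dt = 2x."""
    return 2.0 * x

for x0 in (-1.0, 0.0, 1.0):
    exact = x0 * math.exp(2.0 * 1.0)      # x(t) = C e^{2t} with C = x(0), at t = 1
    approx = euler(rhs, x0, 1.0, 100000)
    print(x0, exact, approx)
```

Starting exactly at 0 the simulated value stays at 0, while a start above or below 0 drifts further away, in line with the phase diagram.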

Week 8 Dynamic analysis
Differential equations
Equilibrium of homogeneous equation (K.14.1.)
Stability of system of equations (K.14.3., pages 481-486, phase diagram)
Difference equations
Introduction to first-order difference equations (K.13.1.)

Again linear differential equations
We consider the following differential equation, which we solve as follows:
$$\dot{x} = f(x, t)$$
So, let's take a simple case:
$$\dot{x} = f(t)$$
$$\frac{dx}{dt} = f(t)$$
$$dx = f(t)\,dt$$
$$\int dx = \int f(t)\,dt$$
$$x = \int f(t)\,dt + C$$
C is some constant.

First-order linear differential equations
$$\dot{x} + A(t)x = B(t)$$
Solution:
Step 1
Find the general solution of the homogeneous equation (or reduced equation)
$$\dot{x} + A(t)x = 0$$
$$\dot{x} = -A(t)x$$
$$\frac{\dot{x}}{x} = -A(t)$$
$$\int \frac{\dot{x}}{x}\,dt = -\int A(t)\,dt$$
$$\int \frac{1}{x}\,dx = -\int A(t)\,dt$$
$$\ln x = -\int A(t)\,dt + C$$
$$x = e^{-\int A(t)\,dt + C}$$
Step 2
Find a particular solution of the non-homogeneous differential equation. This can be considered as a steady-state value.
$$\dot{x} + A(t)x = B(t)$$
Try for the particular solution of x:
If $B(t)$ is constant, try a constant.
If $B(t)$ is a polynomial of degree n, try for x a polynomial of degree n.
If $B(t)$ is $e^{at}$, try a multiple of $e^{at}$.
For linear combinations of the above, try linear combinations.
If this fails, try multiplying with a factor t, before you throw in the towel. Don't be upset if it doesn't work; this might be one of many insoluble differential equations.

Step 3
The general solution of the non-homogeneous equation is the sum of the solutions of step 1 and step 2.
Step 4
A definite solution fixes the constant C by means of the initial value $x_0 = x(0)$. Thus substitute t=0 in the general solution (step 3) and solve for C.
Step 5
Study the limit of the solution of step 4 as t gets infinitely large.

Example: First-order linear differential equation
We consider the following equation:
$$\dot{x} + ax = b$$
Note that x is a function of t. We find the general solution by following the steps above.
Step 1 (homogeneous equation: right-hand side of the differential equation is zero). The homogeneous equation is a separable differential equation.
$$\dot{x} + ax = 0$$
$$\dot{x} = -ax$$
$$\frac{dx}{dt} = -ax$$
$$\frac{dx}{x} = -a\,dt$$
$$\int \frac{dx}{x} = -\int a\,dt$$
$$\ln(x) = -\int a\,dt + C$$ (we assume a positive x; C is some real number)
$$x = e^{\ln(x)} = e^{-\int a\,dt + C} = e^{C}e^{-\int a\,dt} = C_1 e^{-at}$$ ($C_1 = e^{C}$)
Step 2:
$x = \frac{b}{a}$ is the steady state.
Step 3:
General solution is the solution of the homogeneous equation plus the steady state:
$$x = C_1 e^{-at} + \frac{b}{a}$$

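Step 4:
We solve for $C_1$ of step 3 by substituting the initial value, t=0, in the equation of step 3.
$$x(0) = C_1 e^{-a \cdot 0} + \frac{b}{a} = C_1 \cdot 1 + \frac{b}{a}$$
Thus:
$$C_1 = x(0) - \frac{b}{a}$$
Next we substitute $C_1$ in the equation. Thus the solution is:
$$x = \left(x(0) - \frac{b}{a}\right)e^{-at} + \frac{b}{a}$$
Step 5
We study the solution as t becomes infinitely large.
If the initial value x(0) is $\frac{b}{a}$, then the limit will be equal to the initial value:
$$\lim_{t\to\infty} x = \left(\frac{b}{a} - \frac{b}{a}\right)e^{-at} + \frac{b}{a} = \frac{b}{a}$$
If the initial value x(0) is not equal to $\frac{b}{a}$ and a > 0 then
$$\lim_{t\to\infty} x = \lim_{t\to\infty}\left[\left(x(0) - \frac{b}{a}\right)e^{-at} + \frac{b}{a}\right] = \left(x(0) - \frac{b}{a}\right)\lim_{t\to\infty} e^{-at} + \frac{b}{a} = \left(x(0) - \frac{b}{a}\right)\cdot 0 + \frac{b}{a} = \frac{b}{a}$$
If the initial value x(0) is not equal to $\frac{b}{a}$ and a < 0 then
$$\lim_{t\to\infty} x = \lim_{t\to\infty}\left[\left(x(0) - \frac{b}{a}\right)e^{-at} + \frac{b}{a}\right] = \pm\infty$$
so the solution diverges ($+\infty$ if $x(0) > \frac{b}{a}$, $-\infty$ if $x(0) < \frac{b}{a}$).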

Adjustment towards equilibrium
We consider the process of adjustment towards equilibrium. The major question here is whether there is rapid or slow adjustment.
We start again with the differential equation:
$$\dot{x} + ax = b$$
Note that x is a function of t; so the outcome of variable x depends on time. We calculated that the equilibrium is $x^* = \frac{b}{a}$.
We rewrite the differential equation as a function of the equilibrium:
$$\dot{x} = -ax + b = -ax + a\frac{b}{a} = -ax + ax^* = a(-x + x^*)$$
Convergence:
If a is positive, and $x > x^*$, $\dot{x}$ becomes negative. It means that x is too large (relative to the equilibrium value). A negative $\dot{x}$ ensures adjustment towards equilibrium, so that x becomes smaller.
If a is positive, and $x < x^*$, $\dot{x}$ is positive. Hence x is too small. The positive $\dot{x}$ gives an adjustment towards equilibrium.
Finally, we can show that a more positive a gives a faster convergence (more rapid adjustment) towards equilibrium. We consider the solution of the differential equation. For a given t > 0, a larger a brings $e^{-at}$ closer to zero, so that x is closer to $x^*$. It implies a more rapid adjustment.
$$x = \left(x(0) - \frac{b}{a}\right)e^{-at} + \frac{b}{a} = \left(x(0) - x^*\right)e^{-at} + x^* = x(0)e^{-at} + x^*\left(1 - e^{-at}\right)$$
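The effect of a on the speed of adjustment can be made concrete with a few numbers. The sketch below evaluates the solution $x(t) = (x(0) - x^*)e^{-at} + x^*$ and the share $e^{-at}$ of the initial gap that is still left at time t. The parameter values (`b = 3`, `x0 = 50`, and the three values of `a`) are hypothetical and serve only as an illustration.

```python
import math

def solution(t, x0, a, b):
    """x(t) = (x(0) - b/a) e^{-at} + b/a for dx/dt + a x = b, with a != 0."""
    x_star = b / a
    return (x0 - x_star) * math.exp(-a * t) + x_star

def gap_fraction(t, a):
    """Share of the initial gap x(0) - x* that remains at time t."""
    return math.exp(-a * t)

# Hypothetical values: the larger a, the smaller the remaining gap at t = 1.
for a in (0.1, 1.0, 3.0):
    print(a, gap_fraction(1.0, a), solution(1.0, x0=50.0, a=a, b=3.0))
```

Note that the steady state $b/a$ itself changes with a in this illustration; the comparison of adjustment speeds is carried by the remaining gap share, which depends on a and t only.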

System of two differential equations (two-variable phase
diagram)
Let's consider the system of two differential equations, in which $\dot{x}$ and $\dot{y}$ depend on x and y. The equations contain six parameters a, b, c, d, $e_1$ and $e_2$. Both of the variables x and y depend on t.
$$\dot{x} = ax + by + e_1$$
$$\dot{y} = cx + dy + e_2$$

Case 1 - Global stability of the system of equations (figure 14.7a).
For this case we assume that a < 0, b > 0, c < 0 and d < 0.
We consider a two-variable phase diagram, for which the x-variable is on the horizontal axis and the y-variable is on the vertical axis (see Figure 14.7 of Klein).
First: we consider the upward sloping $\dot{x} = 0$ equation:
For this equation, we take a < 0 and b > 0.
Consequence: the line $\dot{x} = 0$ is upward sloping in the (x, y) phase diagram.
Reason: we are interested in the combinations of x and y for which $\dot{x} = 0$. Thus:
$$0 = ax + by + e_1$$
$$y = -\frac{a}{b}x - \frac{e_1}{b}$$
Because a < 0, b > 0, the slope $-\frac{a}{b}$ of the equation $\dot{x} = 0$ becomes positive.
For any point below the line $\dot{x} = 0$, $\dot{x}$ is negative. A negative $\dot{x}$ implies that x is becoming smaller, thus the horizontal arrows are pointing in leftward direction in Figure 14.7a.
o Reason:
$\dot{x} < 0$ corresponds to
$$ax + by + e_1 < 0$$
$$by < -ax - e_1$$
$$y < -\frac{a}{b}x - \frac{e_1}{b}$$
(division by a positive number b, so that the inequality sign does not change).
For any point above the line $\dot{x} = 0$, $\dot{x}$ is positive. The positive $\dot{x}$ means that x is becoming larger. Thus, the horizontal arrows point in rightward direction.
o Reason:
$\dot{x} > 0$ corresponds to
$$ax + by + e_1 > 0$$
$$y > -\frac{a}{b}x - \frac{e_1}{b}$$
Second: we consider the downward sloping $\dot{y} = 0$ equation:
For this equation we take the parameters c < 0 and d < 0.
Consequence: the line $\dot{y} = 0$ is downward sloping in the (x, y) phase diagram.
o Reason:
$$\dot{y} = 0$$
$$0 = cx + dy + e_2$$
$$y = -\frac{c}{d}x - \frac{e_2}{d}$$
o Because c < 0, d < 0, the slope $-\frac{c}{d}$ of the equation $\dot{y} = 0$ becomes negative.
o For any point below the line $\dot{y} = 0$, $\dot{y}$ is positive (y is becoming larger, thus the vertical arrows point upward).
o Reason:
$$\dot{y} > 0$$
$$cx + dy + e_2 > 0$$
$$dy > -cx - e_2$$
$$y < -\frac{c}{d}x - \frac{e_2}{d}$$
(the inequality sign turns around because the left-hand side and the right-hand side of the inequality are divided by d, which is a negative number)
o For any point above the line $\dot{y} = 0$, $\dot{y}$ is negative (y is becoming smaller, thus the vertical arrows point downward).
o Reason:
$$\dot{y} < 0$$
$$cx + dy + e_2 < 0$$
$$dy < -cx - e_2$$
$$y > -\frac{c}{d}x - \frac{e_2}{d}$$
(the inequality sign turns around because the left-hand side and the right-hand side of the inequality are divided by d, which is a negative number)
Conclusion: as a result of the directions of the vertical and horizontal arrows, there will be convergence towards equilibrium, for all initial conditions of x and y.

Case 2 (saddlepath stability; Figure 14.7c):
For the $\dot{x} = 0$ equation we take a < 0 and b > 0, as in case 1 above:
o For any point below the line $\dot{x} = 0$, $\dot{x}$ is negative. A negative $\dot{x}$ implies that x is becoming smaller, thus the horizontal arrows are pointing in leftward direction in Figure 14.7c.
o For any point above the line $\dot{x} = 0$, $\dot{x}$ is positive. The positive $\dot{x}$ means that x is becoming larger. Thus, the horizontal arrows point in rightward direction.
For the $\dot{y} = 0$ equation we take c > 0 and d > 0. The line $\dot{y} = 0$ is downward sloping in the (x, y) phase diagram.
o Reason:
$$\dot{y} = 0$$
$$0 = cx + dy + e_2$$
$$y = -\frac{c}{d}x - \frac{e_2}{d}$$
o Because c > 0, d > 0, the slope $-\frac{c}{d}$ of the equation $\dot{y} = 0$ becomes negative.
For any point below the line $\dot{y} = 0$, $\dot{y}$ is negative (y becomes smaller, arrow down): $\dot{y} < 0$ means $dy < -cx - e_2$, and division by the positive number d gives $y < -\frac{c}{d}x - \frac{e_2}{d}$.
For any point above the line $\dot{y} = 0$, $\dot{y}$ is positive (y becomes larger, arrow up).
Consequence: only for particular initial conditions will there be convergence towards equilibrium. Starting in the regions north and south of the equilibrium (above both lines or below both lines), both arrows point away from the equilibrium, so there will be divergence. Starting in the regions east and west of the equilibrium (between the two lines), the arrows allow convergence, but only along the saddle path (see Figure 14.7c).

Stability and eigenvalues
For stability, we can consider the eigenvalues ($\lambda_1$ and $\lambda_2$) of the matrix
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
1) Globally stable if both eigenvalues of A are negative: $\lambda_1 < 0$, $\lambda_2 < 0$
2) Saddlepath stable if the eigenvalues of A have different signs: $\lambda_1 < 0$, $\lambda_2 > 0$
3) Globally unstable if both eigenvalues of A are positive: $\lambda_1 > 0$, $\lambda_2 > 0$
We will not pursue this matter further in this course.
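Although the course does not pursue the eigenvalue criterion, it is easy to try out. The sketch below computes the eigenvalues of the 2x2 matrix A from its trace and determinant and applies the three rules above. Two points are assumptions of this sketch and not part of the lecture: the numerical parameter values are hypothetical (only their signs follow the cases in the text), and when the eigenvalues are complex the rules are applied to their real parts.

```python
import cmath

def eigenvalues(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] from the characteristic polynomial
    lambda^2 - trace*lambda + det = 0."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + root) / 2.0, (trace - root) / 2.0

def classify(a, b, c, d):
    """Apply the stability rules to the real parts of the eigenvalues."""
    lam1, lam2 = eigenvalues(a, b, c, d)
    re1, re2 = lam1.real, lam2.real
    if re1 < 0 and re2 < 0:
        return "globally stable"
    if re1 > 0 and re2 > 0:
        return "globally unstable"
    if re1 * re2 < 0:
        return "saddlepath stable"
    return "borderline"

print(classify(-1, 1, -1, -1))   # sign pattern a<0, b>0, c<0, d<0
print(classify(-1, 1, 1, 1))     # sign pattern a<0, b>0, c>0, d>0
print(classify(1, -1, 1, 1))     # sign pattern a>0, b<0, c>0, d>0
```

The constants $e_1$ and $e_2$ do not appear: they shift the location of the equilibrium but not the eigenvalues of A.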

Difference equations (Chapter 13)
We now consider time in discrete periods. Consequently, we can have a sequence $\{x_t\}$ for x and a sequence $\{y_t\}$ for y:
$$x_1, x_2, x_3, \ldots$$
$$y_1, y_2, y_3, \ldots$$
First-order difference equation:
$$x_t = ax_{t-1} + y_t$$
Second-order difference equation:
$$x_t = ax_{t-1} + bx_{t-2} + y_t$$
There will be a monotonic sequence if a > 0.
There will be a sequence that alternates in sign if a < 0.
A sequence $\{x_t\}$ is bounded if there is a finite number $M$ such that for any t: $|x_t| < M$.
The sequence $x_t = ax_{t-1} + y_t$ converges if:
$$\lim_{t\to\infty} x_t = \bar{x}$$
The sequence diverges if:
$$\lim_{t\to\infty} x_t = \pm\infty$$
Thus, for a constant y, the steady state $\bar{x}$ satisfies:
$$\bar{x} = a\bar{x} + y$$
$$(1 - a)\bar{x} = y$$
$$\bar{x} = \frac{1}{1 - a}y$$
Convergence of the sequence to the steady state (regardless of the initial value $x_0$) if $|a| < 1$.
The steady state is not well defined if a = 1.
Divergence of the sequence if $|a| > 1$.

Solutions to first-order difference equations
1) Repeated iteration
$$x_t = ax_{t-1} + y$$
$$x_{t-1} = ax_{t-2} + y$$
so that
$$x_t = a(ax_{t-2} + y) + y = a^2 x_{t-2} + ay + y$$
thus
$$x_t = a^t x_0 + \sum_{i=0}^{t-1} a^i y$$
2) Forward solution
$$u_t = bu_{t+1} + v_t$$
$$u_{t+1} = bu_{t+2} + v_{t+1}$$
so that
$$u_t = b(bu_{t+2} + v_{t+1}) + v_t = b^2 u_{t+2} + bv_{t+1} + v_t$$
thus
$$u_t = \lim_{n\to\infty}\left[b^n u_{t+n} + \sum_{i=0}^{n-1} b^i v_{t+i}\right]$$
If $|b| < 1$ and $\{u_t\}$ is bounded
$$\lim_{n\to\infty} b^n u_{t+n} = 0$$
If $\{v_t\}$ is bounded then the solution to the difference equation is
$$u_t = \sum_{i=0}^{\infty} b^i v_{t+i}$$

General solution
We consider the difference equation below. The procedure resembles the one above for the differential equation.
$$x_t = ax_{t-1} + y$$
Step 1:
Solve the homogeneous equation (so, the right-hand side is zero):
$$x_t - ax_{t-1} = 0$$
Solution to this equation (we need to determine A and k):
$$x_t = Ak^t$$
which is substituted in the homogeneous equation:
$$Ak^t - aAk^{t-1} = 0$$
so that $k = a$
solution to the homogeneous equation
$$x_t = Aa^t$$
Step 2:
Find a particular solution of $x_t = ax_{t-1} + y$, which is
$$\bar{x} = \frac{1}{1 - a}y$$
Step 3:
General solution:
$$x_t = Aa^t + \frac{1}{1 - a}y$$
Step 4:
Determine A of the general solution
$$x_0 = Aa^0 + \frac{1}{1 - a}y$$
$$A = x_0 - \frac{1}{1 - a}y$$
So that we substitute A in the general solution
$$x_t = \left(x_0 - \frac{1}{1 - a}y\right)a^t + \frac{1}{1 - a}y = x_0 a^t + \frac{1 - a^t}{1 - a}y$$
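The four steps can be checked against plain iteration of the difference equation. The sketch below applies $x_t = ax_{t-1} + y$ step by step and compares the outcome with the general solution $x_t = \left(x_0 - \frac{y}{1-a}\right)a^t + \frac{y}{1-a}$. The function names and the parameter values are hypothetical; the three values of a are chosen to show a monotonic, an alternating and a divergent case.

```python
def iterate(x0, a, y, t):
    """Apply x_t = a*x_{t-1} + y exactly t times, starting from x_0."""
    x = x0
    for _ in range(t):
        x = a * x + y
    return x

def closed_form(x0, a, y, t):
    """General solution x_t = (x_0 - y/(1-a)) a^t + y/(1-a), valid for a != 1."""
    steady_state = y / (1.0 - a)
    return (x0 - steady_state) * a**t + steady_state

# Hypothetical parameter values chosen only for illustration.
for a in (0.5, -0.5, 1.5):
    print(a, iterate(10.0, a, 2.0, 8), closed_form(10.0, a, 2.0, 8))
```

For a = 1 the closed form divides by zero, which mirrors the remark that the steady state is not well defined in that case.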

Tutorial of Week 8 Advanced mathematics
Exercise 1.
Solve $x_t = \frac{1}{2}x_{t-1} + 2$ explicitly by repeated substitution.
Solution:
We solve $x_t = \frac{1}{2}x_{t-1} + 2$ explicitly by repeated substitution. Assume that at time 0, $x_0$ has value C (since we don't specify C, this isn't really an assumption, but more of a definition).
Since our equation $x_t = \frac{1}{2}x_{t-1} + 2$ holds for all t, we can plug in $x_{t-1} = \frac{1}{2}x_{t-2} + 2$, to get: $x_t = \frac{1}{2}\left(\frac{1}{2}x_{t-2} + 2\right) + 2$. We can keep repeating this process until we get to t=0:
$$x_t = \frac{1}{2}\left(\frac{1}{2}\left(\cdots\frac{1}{2}\left(\frac{1}{2}x_0 + 2\right) + 2\cdots\right) + 2\right) + 2 = \left(\frac{1}{2}\right)^t x_0 + 2\left(1 + \frac{1}{2} + \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^3 + \cdots + \left(\frac{1}{2}\right)^{t-1}\right)$$
This last line we can rewrite in an easier way by applying the following very neat trick. We do it for a general constant $\alpha$, instead of $\frac{1}{2}$.
We want to find the value of $B = 1 + \alpha + \alpha^2 + \cdots + \alpha^{t-1}$. First observe that $\alpha B = \alpha + \alpha^2 + \alpha^3 + \cdots + \alpha^{t-1} + \alpha^t$. If we compare the terms of these two sums, we see that they are very similar: only the first term of the first and the last term of the second are different. If we subtract them, we get therefore:
$$(1 + \alpha + \alpha^2 + \cdots + \alpha^{t-1}) - (\alpha + \alpha^2 + \cdots + \alpha^{t-1} + \alpha^t) = 1 - \alpha^t = B - \alpha B = (1 - \alpha)B$$
$$B = \frac{1 - \alpha^t}{1 - \alpha}$$
If we apply this formula to our function we obtain:
$$x_t = \left(\frac{1}{2}\right)^t x_0 + 2\left(1 + \frac{1}{2} + \left(\frac{1}{2}\right)^2 + \cdots + \left(\frac{1}{2}\right)^{t-1}\right) = \left(\frac{1}{2}\right)^t x_0 + 2\,\frac{1 - \left(\frac{1}{2}\right)^t}{1 - \frac{1}{2}} = \left(\frac{1}{2}\right)^t x_0 + 4\left(1 - \left(\frac{1}{2}\right)^t\right)$$
Finally, we can ask what will be an equilibrium of this process. In equilibrium, x does not change over time, so we have $x_t = x_{t-1}$. Plugging this into our equation, we find: $x_t = \frac{1}{2}x_t + 2 \Rightarrow x_t = 4$. We can also see that all our solutions will converge towards equilibrium over time. As $t \to \infty$, the first term of $x_t = \left(\frac{1}{2}\right)^t x_0 + 4\left(1 - \left(\frac{1}{2}\right)^t\right)$ goes to zero, whereas the term in brackets goes to 1. So the whole thing goes to 4, the equilibrium value.
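The geometric-sum trick and the resulting closed form can be verified in a few lines. The sketch below compares the term-by-term sum $1 + \alpha + \cdots + \alpha^{t-1}$ with the formula $\frac{1-\alpha^t}{1-\alpha}$, and compares repeated substitution of $x_t = \frac{1}{2}x_{t-1} + 2$ with the closed form. The function names and the starting value `x0 = 7` are hypothetical.

```python
def geometric_sum_direct(alpha, t):
    """B = 1 + alpha + alpha^2 + ... + alpha^(t-1), added term by term."""
    return sum(alpha**i for i in range(t))

def geometric_sum_formula(alpha, t):
    """B = (1 - alpha^t) / (1 - alpha), valid for alpha != 1."""
    return (1.0 - alpha**t) / (1.0 - alpha)

def x_by_substitution(x0, t):
    """Apply x_t = x_{t-1}/2 + 2 exactly t times, starting from x_0."""
    x = x0
    for _ in range(t):
        x = 0.5 * x + 2.0
    return x

def x_closed(x0, t):
    """x_t = (1/2)^t x_0 + 2 * B with alpha = 1/2."""
    return 0.5**t * x0 + 2.0 * geometric_sum_formula(0.5, t)

print(geometric_sum_direct(0.5, 10), geometric_sum_formula(0.5, 10))
for t in (0, 1, 5, 50):
    print(t, x_by_substitution(7.0, t), x_closed(7.0, t))
```

For large t both columns settle at the equilibrium value 4, whatever the starting value.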

Broad tutorial (Friday)
Exercise 1.
For the difference equation
$$x_t = \frac{7}{8}x_{t-1} + 30$$
with $x_0 = 300$
compute the general solution and the repeated iteration.

Solution:
For the general solution of the difference equation, take the four steps.
Step 1: Solve the homogeneous equation.
$$x_t - \frac{7}{8}x_{t-1} = 0$$
A solution has the form $x_t = Ak^t$
$$Ak^t - \frac{7}{8}Ak^{t-1} = 0$$
$$k = \frac{7}{8}$$
Step 2: Find a particular solution (this is the steady state)
$$\bar{x} = \frac{7}{8}\bar{x} + 30$$
$$\bar{x} - \frac{7}{8}\bar{x} = 30$$
$$\bar{x} = \frac{1}{1 - \frac{7}{8}}\cdot 30 = 240$$
Step 3: General solution
$$x_t = A\left(\frac{7}{8}\right)^t + 240$$
Step 4: Solve A for $x_0$
$$x_0 = A\left(\frac{7}{8}\right)^0 + 240$$
$$300 = A\left(\frac{7}{8}\right)^0 + 240$$
$$60 = A$$
So the solution is $x_t = 60\left(\frac{7}{8}\right)^t + 240$.
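Repeated iteration:
$$x_t = \frac{7}{8}x_{t-1} + 30$$
Rewrite this equation for t-1:
$$x_{t-1} = \frac{7}{8}x_{t-2} + 30$$
Substitute $x_{t-1}$ in the first equation. Bring it back towards $x_0$, by repeating this method of substitution (so $x_{t-2}$ will be substituted as a function of $x_{t-3}$, et cetera).
$$x_t = \frac{7}{8}\left(\frac{7}{8}x_{t-2} + 30\right) + 30 = \left(\frac{7}{8}\right)^2 x_{t-2} + 30\left(1 + \frac{7}{8}\right) = \cdots$$
$$= \left(\frac{7}{8}\right)^t x_0 + 30\left(1 + \frac{7}{8} + \left(\frac{7}{8}\right)^2 + \cdots + \left(\frac{7}{8}\right)^{t-1}\right) = \left(\frac{7}{8}\right)^t x_0 + 30\,\frac{1 - (7/8)^t}{1 - 7/8}$$
For which in the final step we used the following equality:
$$1 + \frac{7}{8} + \left(\frac{7}{8}\right)^2 + \cdots + \left(\frac{7}{8}\right)^{t-1} = \sum_{i=0}^{t-1}\left(\frac{7}{8}\right)^i = \frac{1 - (7/8)^t}{1 - 7/8}$$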
Exercise 2.
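Compute the forward solution for the difference equation
$$x_t = \frac{7}{8}x_{t+1} + 30$$
Solution:
$$x_t = \frac{7}{8}x_{t+1} + 30$$
It is also valid for one period ahead: period t+1 (instead of t)
$$x_{t+1} = \frac{7}{8}x_{t+2} + 30$$
Substitute the equation for t+1 in the first equation. Repeat it n times. Consider the outcome for n very large. The first term becomes zero as n tends to infinity (provided the sequence $\{x_t\}$ is bounded). The second term becomes 240.
$$x_t = \frac{7}{8}\left(\frac{7}{8}x_{t+2} + 30\right) + 30 = \cdots = \lim_{n\to\infty}\left[\left(\frac{7}{8}\right)^n x_{t+n} + 30\left(1 + \frac{7}{8} + \left(\frac{7}{8}\right)^2 + \cdots + \left(\frac{7}{8}\right)^{n-1}\right)\right]$$
$$= 0 + 30\cdot\frac{1}{1 - 7/8} = 240$$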
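The limit in the forward solution can be illustrated by truncating the infinite sum after n terms. The sketch below computes $\sum_{i=0}^{n-1} b^i v$ for $b = 7/8$ and the constant term $v = 30$; the function name and the chosen values of n are assumptions for illustration only.

```python
def forward_partial_sum(b, v, n):
    """Sum_{i=0}^{n-1} b^i * v: the forward solution truncated after n terms,
    for a constant forcing term v."""
    return sum(b**i * v for i in range(n))

# The partial sums approach the forward solution as n grows.
for n in (10, 50, 200):
    print(n, forward_partial_sum(7.0 / 8.0, 30.0, n))
```

The partial sums increase with n and level off, which is the numerical counterpart of the geometric series $30 \cdot \frac{1}{1 - 7/8}$.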
Exercise 3.
Solve the differential equations
$$\dot{x} + 3x = 3, \quad x(0) = 50$$
$$\dot{x} + 0.1x = 3, \quad x(0) = 50$$
and demonstrate that for the first equation there is a more rapid convergence towards the steady state.
Solution:
Let's solve the following differential equation first:
$$\dot{x} + 3x = 3, \quad x(0) = 50$$
Step 1:
$$\dot{x} + 3x = 0$$
$$\frac{dx}{x} = -3\,dt$$
$$\ln x = -3t + C$$
$$x = C_1 e^{-3t}$$ for which $C_1 = e^{C}$
Step 2: Particular solution (steady state): $x = \frac{3}{3} = 1$
Step 3: General solution: $x = C_1 e^{-3t} + 1$
Step 4: We solve the outcome of step 3 for the unknown $C_1$, using the initial condition x(0) (so we take t=0)
$$50 = C_1 \cdot 1 + 1$$
So that the solution is: $x = (x(0) - 1)e^{-3t} + 1 = 49e^{-3t} + 1$
The convergence towards equilibrium:
$$x = (x(0) - 1)e^{-3t} + 1 = x(0)e^{-3t} + 1\cdot(1 - e^{-3t})$$
Hence, x is a weighted average of x(0) (the initial condition) and 1 (the steady state). The weights are $e^{-3t}$ for the initial condition x(0) and $1 - e^{-3t}$ for the steady state. Note that for t > 0 both weights are between 0 and 1, because $0 < e^{-3t} < 1$ and $0 < 1 - e^{-3t} < 1$ (which may be rewritten as $0 < \frac{1}{e^{3t}} < 1$ and $0 < 1 - \frac{1}{e^{3t}} < 1$). Both weights add up to one.
Differential equation 2
One can show that for the second differential equation
$$\dot{x} + 0.1x = 3, \quad x(0) = 50$$
the steady state of step 2 is $x = \frac{3}{0.1} = 30$. The solution to the differential equation becomes:
$$x = (x(0) - 30)e^{-0.1t} + 30 = 20e^{-0.1t} + 30$$
The convergence from the initial condition x(0) towards the steady state (30) can be written as
$$x = (x(0) - 30)e^{-0.1t} + 30 = x(0)e^{-0.1t} + 30(1 - e^{-0.1t})$$
We compare both weights for the second and the first equation. At any t > 0, the weight for the steady state of the previous differential equation $(1 - e^{-3t})$ is closer to one than the corresponding weight of the current equation $(1 - e^{-0.1t})$. Hence, there is a more rapid convergence towards the steady state for the first differential equation, since it has a larger weight attached to the steady state.
Exercise 4.
Consider the system of differential equations.
$$\dot{x} = ax + by + e_1$$
$$\dot{y} = cx + dy + e_2$$
for which a > 0, b < 0, c > 0, and d > 0.

Show that this system is unstable as shown in Figure 14.7b of Klein.
Solution:
See Figure 14.7b of Klein for a two-variable phase diagram for a globally unstable equilibrium. It is partly wrong! See the solution below.
First, we consider the equation $\dot{x} = 0$.
$$\dot{x} = ax + by + e_1$$
$$\dot{x} = 0$$
$$0 = ax + by + e_1$$
$$y = -\frac{a}{b}x - \frac{e_1}{b}$$
Since we assume that a > 0 and b < 0, the equation $\dot{x} = 0$ is upward sloping.
$\dot{x} < 0$ if $by < -ax - e_1$, so that (because b is a negative number, the inequality sign changes) $y > -\frac{a}{b}x - \frac{e_1}{b}$.
So, the area above the equation $y = -\frac{a}{b}x - \frac{e_1}{b}$ has a negative value for $\dot{x}$ (consequence: horizontal arrows in areas North and West point in leftward direction).
In the same vein, $\dot{x} > 0$ implies that the areas South and East below the equation $\dot{x} = 0$ have a positive value for $\dot{x}$. The horizontal arrows point in rightward direction.
Second, we consider the equation $\dot{y} = 0$.
$$\dot{y} = cx + dy + e_2$$
$$\dot{y} = 0$$
$$0 = cx + dy + e_2$$
$$y = -\frac{c}{d}x - \frac{e_2}{d}$$
Since we assume that c > 0 and d > 0, the equation $\dot{y} = 0$ is downward sloping.
$\dot{y} < 0$, so that
$$\dot{y} = dy + cx + e_2 < 0$$
$$dy < -cx - e_2$$
$$y < -\frac{c}{d}x - \frac{e_2}{d}$$
Consequently, the area below the equation $y = -\frac{c}{d}x - \frac{e_2}{d}$ has a negative value for $\dot{y}$ (consequence: the vertical arrows in areas South and West point in downward direction).
$\dot{y} > 0$ implies that the areas North and East above the equation have a positive value for $\dot{y}$. The vertical arrows point in upward direction in both areas. Note that Figure 14.7b in Klein is wrong with respect to the vertical arrows in West (must be downward) and East (must be upward). All vertical arrows below the line $\dot{y} = 0$ must point in the same direction.

Exercise 5.
Solve the following systems of differential equations.
a)
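$$\dot{y}_1 = y_1$$
$$\dot{y}_2 = y_2$$
Solution:
[Figure: phase diagram with solution paths]
b)
$$\dot{y}_1 = 2y_1 + y_2$$
$$\dot{y}_2 = y_1 - 2y_2$$
Solution:
[Figure: phase diagram with solution paths]
c)
$$\dot{y}_1 = 3y_1 + 4y_2$$
$$\dot{y}_2 = 4y_1 + 3y_2$$
Solution:
[Figure: phase diagram with solution paths]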

WEEK 8
Some additional exercises of week 8
Exercise 1.
Solve the following differential equations:
a) $\dot{x} - 3x = t^2 - 2$
Solution:
We first find a particular solution. We postulate that it will be a polynomial of the form $x(t) = at^2 + bt + c$ (we do this because the inhomogeneous part is also a function of this form). Then $\dot{x} = 2at + b$ and our equation becomes:
$$\dot{x} - 3x = 2at + b - 3(at^2 + bt + c) = -3at^2 + (2a - 3b)t + (b - 3c) = t^2 - 2$$
From this we see:
$$-3a = 1, \quad 2a - 3b = 0, \quad b - 3c = -2$$
$$a = -\frac{1}{3}, \quad b = -\frac{2}{9}, \quad c = \frac{16}{27}$$
So our particular solution becomes:
$$x(t) = -\frac{1}{3}t^2 - \frac{2}{9}t + \frac{16}{27}$$
We proceed to find the solution to the homogeneous equation: $\dot{x} - 3x = 0$. We could observe the solution directly here, but we apply the general rule: if the equation is of the form $\dot{x} + f(t)x = 0$, then the solution is $x(t) = Ce^{-F(t)}$, where $F'(t) = f(t)$. Here $f(t) = -3$, so $F(t) = -3t$ and we obtain: $x(t) = Ce^{3t}$.
The general solution is now the sum of the particular and the homogeneous solution:
$$x(t) = Ce^{3t} + at^2 + bt + c = Ce^{3t} - \frac{1}{3}t^2 - \frac{2}{9}t + \frac{16}{27}$$
b) $\dot{x} - 3x = e^{t}$
Solution:
Observe that the homogeneous equation is the same as before, so we only look for the particular solution. We first try the exponential function, as the inhomogeneous part of the equation is also an exponential function. However, we can see that this doesn't work (can we?), because we're off by a factor, so we try a factor in front of our function: $x(t) = Ce^{t} = \dot{x}(t)$. Our equation becomes:
$$\dot{x} - 3x = Ce^{t} - 3Ce^{t} = -2Ce^{t} = e^{t} \;\Rightarrow\; C = -\frac{1}{2}$$
So our particular solution is: $x(t) = -\frac{1}{2}e^{t}$.
c) $\dot{x} + t^2 x = 3t^4 - 4t^2 + 6t$
Solution:
We start with the particular solution. Since the inhomogeneous part is a fourth-degree polynomial, we could try a fourth-degree polynomial as our solution, but because of the $t^2$ in front of x, we see that it is wiser to try a polynomial of degree 4-2=2 (we could try the fourth-degree polynomial and it would give us the same result, with a lot more work). So we try $x = at^2 + bt + c$, $\dot{x} = 2at + b$. Our equation becomes:
$$\dot{x} + t^2 x = 2at + b + t^2(at^2 + bt + c) = at^4 + bt^3 + ct^2 + 2at + b = 3t^4 - 4t^2 + 6t$$
This gives us five equations in three unknowns. In general, those will not have a solution, but here we are lucky:
$a = 3$, $b = 0$, $c = -4$, $2a = 6$, $b = 0$ is a consistent system. So we obtain the particular solution:
$$x = 3t^2 - 4$$
We move on to the homogeneous solution:
$\dot{x} + t^2 x = 0$, $f(t) = t^2 \Rightarrow F(t) = \frac{1}{3}t^3$, so $x = Ce^{-\frac{1}{3}t^3}$ is the homogeneous solution and
$$x(t) = Ce^{-\frac{1}{3}t^3} + 3t^2 - 4$$
is the total solution.
