
MASTER OF ECONOMICS AND FINANCE
School of Economics and Business Administration
Academic year 2015

Mathematics for Economists

Lecture notes

Ignacio Rodríguez Carreño
irodriguezc@unav.es
Economics Department

Contents
1 Linear Algebra
1.1 Matrix Algebra (Chapter 8 of Simon and Blume)
1.1.1 Operations with matrices
1.1.2 Laws of matrix algebra
1.1.3 Transposition of a matrix
1.1.4 Special kinds of matrices
1.2 Determinant of a matrix (Chapter 9 of Simon and Blume)
1.2.1 Properties of the determinant
1.3 Rank of a matrix (rk(A))
1.3.1 Using determinants
1.3.2 Using the Gauss-Jordan method
1.3.3 Using vectors
1.4 Invertible matrices
1.4.1 Inverse matrix using cofactors
1.4.2 Properties
1.5 Systems of linear equations
1.5.1 Rouché-Frobenius theorem
1.5.2 Solution of systems of linear equations
1.6 Eigenvalues and eigenvectors of a matrix
1.6.1 Diagonalization of a matrix
1.6.2 Properties of diagonalization
1.7 Application: linear difference equations (Chapter 23 of Simon and Blume)
1.7.1 One-dimensional equations
1.7.2 k-dimensional systems

2 Multivariate Calculus
2.1 Functions (Chapter 13 of Simon and Blume)
2.1.1 Special functions
2.1.2 Classification of functions
2.1.3 Composition of functions
2.2 Derivatives of multivariate functions
2.2.1 Partial derivatives
2.2.2 The total differential
2.2.3 The Chain Rule
2.2.4 Higher order derivatives
2.3 Taylor's series approximation (Chapter 30 of Simon and Blume)
2.3.1 Mean value theorem
2.3.2 Taylor's series approximation on R
2.3.3 Taylor's series approximation on Rk
2.4 Implicit function theorem (Chapter 15 of Simon and Blume)
2.4.1 The implicit function theorem for R2
2.4.2 The implicit function theorem for Rk
2.4.3 The implicit function theorem for systems of implicit functions

3 Optimization
3.1 Unconstrained optimization
3.1.1 Theorem 1
3.1.2 Theorem 2
3.1.3 Theorem 3
3.2 Optimization with equality constraints
3.2.1 Two variables and one equality constraint
3.2.2 Several equality constraints
3.3 Optimization with inequality constraints
3.3.1 One inequality constraint
3.3.2 Several inequality constraints
3.4 Kuhn-Tucker formulation
3.4.1 Optimization with mixed constraints
3.4.2 Envelope theorem
3.4.3 Special functions
3.4.4 Concave and quasiconcave functions (Chapter 21 of Simon and Blume)

4 Analysis
4.1 Sequences of real numbers
4.1.1 Convergent sequence
4.2 Open sets
4.3 Continuity of functions
4.3.1 Continuous function at x0
4.3.2 Uniformly continuous function

Chapter 1

Linear Algebra

1.1 Matrix Algebra (Chapter 8 of Simon and Blume)

A matrix is simply a rectangular array of numbers. So, any table of data is a matrix.

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$

The dimension of the matrix is dim(A) = m × n, where m is the number of rows and n the number of columns.
a_{ij}: element of the matrix in the i-th row and j-th column.
M_{m×n} = {set of matrices of dimension m × n}

1.1.1 Operations with matrices

Addition of matrices

Given A = (a_{ij}) and B = (b_{ij}) two matrices such that dim(A) = dim(B), the addition matrix of A and B is defined as:

$$A + B = (a_{ij} + b_{ij}) = \begin{pmatrix} a_{11}+b_{11} & a_{12}+b_{12} & \cdots & a_{1n}+b_{1n} \\ a_{21}+b_{21} & a_{22}+b_{22} & \cdots & a_{2n}+b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1}+b_{m1} & a_{m2}+b_{m2} & \cdots & a_{mn}+b_{mn} \end{pmatrix}$$

Example.

$$\begin{pmatrix} 1 & 2 & -4 \\ 0 & 7 & 8 \\ -1 & 5 & -5 \end{pmatrix} + \begin{pmatrix} 0 & 4 & 2 \\ -5 & 4 & 0 \\ -1 & 0 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 6 & -2 \\ -5 & 11 & 8 \\ -2 & 5 & -3 \end{pmatrix}$$

Scalar multiplication

Given A ∈ M_{m×n} and a real number λ ∈ R, λA is the matrix (λa_{ij}):

$$\lambda A = \lambda \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} = \begin{pmatrix} \lambda a_{11} & \lambda a_{12} & \cdots & \lambda a_{1n} \\ \vdots & \vdots & \ddots & \vdots \\ \lambda a_{m1} & \lambda a_{m2} & \cdots & \lambda a_{mn} \end{pmatrix}$$

Example.

$$5 \begin{pmatrix} 1 & 2 & -4 \\ 0 & 7 & 8 \\ -1 & 5 & -5 \end{pmatrix} = \begin{pmatrix} 5 & 10 & -20 \\ 0 & 35 & 40 \\ -5 & 25 & -25 \end{pmatrix}$$

Matrix multiplication

Definition 1: Given a row matrix A and a column matrix B,

$$A = (a_1, a_2, \ldots, a_n) \qquad B = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}$$

we define the multiplication A · B as the scalar:

$$A \cdot B = (a_1, a_2, \ldots, a_n)\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} = a_1 b_1 + \ldots + a_n b_n = \sum_{i=1}^{n} a_i b_i$$

Definition 2: Given a matrix A of dimension m × n and a matrix B of dimension n × p, we define C = A · B as the matrix (c_{ij}) in which the element c_{ij} is equal to:

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$

The dimension of the matrix C is dim(C) = m × p.

Example.

$$\begin{pmatrix} 1 & 2 & -4 \\ 0 & 7 & 8 \\ -1 & 5 & -5 \end{pmatrix}\begin{pmatrix} 0 & 4 & 2 \\ -5 & 4 & 0 \\ -1 & 0 & 2 \end{pmatrix} = \begin{pmatrix} -6 & 12 & -6 \\ -43 & 28 & 16 \\ -20 & 16 & -12 \end{pmatrix}$$

Therefore, if A is a matrix in M_{m×n} and B is a matrix in M_{r×s}:

A · B exists if n = r.
B · A exists if s = m.
In general, A · B ≠ B · A: matrix multiplication is not commutative!
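The multiplication rule is easy to check numerically. Below is a minimal Python/NumPy sketch (an added illustration, assuming NumPy is available) that computes C = A · B entry by entry from the formula for c_{ij} and confirms that the product is not commutative:

```python
import numpy as np

A = np.array([[1, 2, -4], [0, 7, 8], [-1, 5, -5]])
B = np.array([[0, 4, 2], [-5, 4, 0], [-1, 0, 2]])

m, n = A.shape
n2, p = B.shape
assert n == n2  # A.B exists only if the inner dimensions agree

# c_ij = sum_k a_ik * b_kj
C = np.zeros((m, p), dtype=int)
for i in range(m):
    for j in range(p):
        C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))

print(np.array_equal(C, A @ B))      # True: matches NumPy's matrix product
print(np.array_equal(A @ B, B @ A))  # False: multiplication is not commutative
```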

1.1.2 Laws of matrix algebra

Associative laws.
∀A, B, C ∈ M_{m×n}: (A + B) + C = A + (B + C)
∀A ∈ M_{m×n}, B ∈ M_{n×r} and C ∈ M_{r×p}: (A · B) · C = A · (B · C)

Commutative law for addition. ∀A, B ∈ M_{m×n}: A + B = B + A

Distributive laws.
∀A ∈ M_{n×m} and B, C ∈ M_{m×r}: A · (B + C) = A · B + A · C
∀B, C ∈ M_{n×m} and A ∈ M_{m×r}: (B + C) · A = B · A + C · A

1.1.3 Transposition of a matrix

A^t = (a_{ji}) is the transposed matrix, obtained by interchanging the rows and columns of A:

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \qquad A^t = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{pmatrix}$$

Example.

$$A = \begin{pmatrix} -1 & 0 & 3 & 7 \\ 1 & 5 & 5 & 9 \end{pmatrix} \qquad A^t = \begin{pmatrix} -1 & 1 \\ 0 & 5 \\ 3 & 5 \\ 7 & 9 \end{pmatrix}$$

1.1.4 Special kinds of matrices

Square matrix. A ∈ M_{m×n} such that m = n.
Column matrix. A ∈ M_{m×n} such that n = 1.
Row matrix. A ∈ M_{m×n} such that m = 1.
Diagonal matrix. A = (a_{ij}) such that a_{ij} = 0 if i ≠ j.

Example.

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 7 & 0 \\ 0 & 0 & 15 \end{pmatrix}$$

Upper (lower) triangular matrix. A = (a_{ij}) such that a_{ij} = 0 if i > j (if i < j).

Example.

$$\begin{pmatrix} 1 & 2 & 4 \\ 0 & 7 & 8 \\ 0 & 0 & 5 \end{pmatrix} \qquad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 7 & 0 \\ 1 & 5 & 5 \end{pmatrix}$$

Symmetric matrix. A ∈ M_{n×n} such that A = A^t, i.e. a_{ij} = a_{ji} ∀i ≠ j. Example.

$$A = \begin{pmatrix} 1 & -1 & 3 \\ -1 & 7 & 0 \\ 3 & 0 & 5 \end{pmatrix}$$

Skew-symmetric matrix. A ∈ M_{n×n} such that A = −A^t, i.e. a_{ji} = −a_{ij} ∀i ≠ j and a_{ii} = 0 ∀i. Example.

$$A = \begin{pmatrix} 0 & 1 & 3 \\ -1 & 0 & 2 \\ -3 & -2 & 0 \end{pmatrix}$$

Identity matrix. I_{n×n} = I_n:

$$I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$

1.2 Determinant of a matrix (Chapter 9 of Simon and Blume)

The determinant of a square matrix is a number. Determinants of 2 × 2 and 3 × 3 matrices are straightforward to compute using the rule of Sarrus.

2 × 2:

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$$

3 × 3:

$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - [(a_{31}a_{22}a_{13}) + (a_{32}a_{23}a_{11}) + (a_{33}a_{12}a_{21})]$$

Examples.

$$\begin{vmatrix} 3 & 2 \\ 6 & 4 \end{vmatrix} = 0$$

$$\begin{vmatrix} 0 & 4 & 2 \\ -5 & 4 & 0 \\ -1 & 0 & 2 \end{vmatrix} = 0 \cdot 4 \cdot 2 + 4 \cdot 0 \cdot (-1) + 2 \cdot (-5) \cdot 0 - 2 \cdot 4 \cdot (-1) - 0 \cdot 0 \cdot 0 - 2 \cdot (-5) \cdot 4 = 48$$

1.2.1 Properties of the determinant

1. |A| = |A^t|
2. |A · B| = |A| · |B|, ∀A, B ∈ M_{n×n}
3. In general, |A + B| ≠ |A| + |B|, ∀A, B ∈ M_{n×n}
4. If one forms a matrix B by interchanging 2 rows or 2 columns of A, then |B| = −|A|.
5. If two rows (columns) are equal, then |A| = 0.
6. If a common factor λ multiplies a row (column), it can be taken out of the determinant:

$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ \lambda a_{i1} & \lambda a_{i2} & \cdots & \lambda a_{in} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} = \lambda \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{in} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$

7. |λA| = λ^n |A|, for A ∈ M_{n×n}.

8. If a matrix A has an all-zero row (column), then |A| = 0.

9. The determinant of an upper (or lower) triangular matrix is the product of its diagonal entries. Example.

$$\begin{vmatrix} 3 & 0 & 0 & 0 \\ 1 & -2 & 0 & 0 \\ 3 & 2 & 6 & 0 \\ 2 & 1 & 2 & 4 \end{vmatrix} = 3 \cdot (-2) \cdot 6 \cdot 4$$


10. If a linear combination of several rows (columns) is added to one row (column), then |A| does not change.

Example.

$$\begin{vmatrix} 3 & 4 \\ 2 & -2 \end{vmatrix} \neq \begin{vmatrix} 3 & 4 \\ 0 & -14 \end{vmatrix} \quad (R_2' = 3R_2 - 2R_1 \text{ changes the determinant})$$

$$\begin{vmatrix} 3 & 4 \\ 2 & -2 \end{vmatrix} = \frac{1}{3}\begin{vmatrix} 3 & 4 \\ 0 & -14 \end{vmatrix} \qquad \begin{vmatrix} 3 & 4 \\ 2 & -2 \end{vmatrix} = \begin{vmatrix} 3 & 4 \\ 0 & -14/3 \end{vmatrix} \quad (R_2' = R_2 - \tfrac{2}{3}R_1)$$

For square matrices of order greater than 3, let us introduce two definitions:

Definition 1. The minor of a_{ij}, M_{ij}, is the determinant of the submatrix obtained by deleting row i and column j from A.
Definition 2. The cofactor of a_{ij}, C_{ij}, is defined as C_{ij} = (−1)^{i+j} M_{ij}.

Example.

$$A = \begin{pmatrix} 0 & 4 & 2 \\ -5 & 4 & 0 \\ -1 & 0 & 2 \end{pmatrix} \qquad M_{11} = \begin{vmatrix} 4 & 0 \\ 0 & 2 \end{vmatrix} = 8 \qquad C_{11} = (-1)^{1+1} M_{11} = 1 \cdot 8 = 8$$

The determinant of an n × n matrix A is given by:

|A| = a_{i1} C_{i1} + a_{i2} C_{i2} + ... + a_{in} C_{in}   (expansion along the i-th row)
|A| = a_{1j} C_{1j} + a_{2j} C_{2j} + ... + a_{nj} C_{nj}   (expansion along the j-th column)

Example.

Consider the determinant of a 4 × 4 matrix A whose second row is (1, 0, −3, 2). Expanding along that row:

$$|A| = 1 \cdot (-1)^{2+1} M_{21} + 0 \cdot (-1)^{2+2} M_{22} + (-3) \cdot (-1)^{2+3} M_{23} + 2 \cdot (-1)^{2+4} M_{24}$$

where each M_{2j} is the 3 × 3 minor obtained by deleting row 2 and column j. Evaluating the minors:

$$|A| = 1 \cdot (-1) \cdot (-29) + 0 + 3 \cdot 28 + 2 \cdot (-7) = 29 + 84 - 14 = 99$$

By using the last property: the column operations C_3' = C_3 + 3C_1 and C_4' = C_4 − 2C_1 turn the second row into (1, 0, 0, 0), so only one cofactor survives in the expansion:

$$|A| = 1 \cdot (-1)^{2+1} \begin{vmatrix} 2 & 10 & -5 \\ 2 & 9 & 1 \\ 1 & 8 & 0 \end{vmatrix} = (-1) \cdot (-99) = 99$$

1.3 Rank of a matrix (rk(A))

1.3.1 Using determinants

The rank of a matrix A ∈ M_{m×n} is the order of the greatest minor of A that is different from zero. We say that the rank of A is p, written rk(A) = p, if:
(i) A minor of order p exists and is nonzero.
(ii) All the minors of order greater than p do not exist or are zero.

Example. Given the matrix A = (1 4 2; 5 2 1; 1 0 2), we have that:
(a) Any element of the matrix is a minor of order 1 of the matrix. As at least one element of the matrix is nonzero (for instance, a_{11} = 1), the rank of the matrix is at least 1: rk(A) ≥ 1. a_{11} is the principal minor of order 1.
(b) The principal minor of order 2 and the remaining minors of order 2 of A are

$$\begin{vmatrix} 1 & 4 \\ 5 & 2 \end{vmatrix}, \quad \begin{vmatrix} 1 & 4 \\ 1 & 0 \end{vmatrix}, \quad \begin{vmatrix} 1 & 2 \\ 5 & 1 \end{vmatrix}, \quad \begin{vmatrix} 1 & 2 \\ 1 & 2 \end{vmatrix}, \quad \begin{vmatrix} 4 & 2 \\ 2 & 1 \end{vmatrix}, \quad \begin{vmatrix} 4 & 2 \\ 0 & 2 \end{vmatrix}, \ldots$$

As at least one of them is nonzero, the rank of the matrix is not 1 but at least 2: rk(A) ≥ 2.
(c) The unique minor of order 3 of the matrix is its determinant:

$$\begin{vmatrix} 1 & 4 & 2 \\ 5 & 2 & 1 \\ 1 & 0 & 2 \end{vmatrix} = -36 \neq 0$$

As it is different from zero, we finally conclude that the rank is 3: rk(A) = 3.

1.3.2 Using the Gauss-Jordan method

The Gauss-Jordan method is used to reduce a matrix to row echelon form. The rank of the matrix is the number of nonzero rows in its row echelon form.
To reduce a matrix to its row echelon form, we apply elementary row operations:
Interchange two rows of the matrix.
Change a row by adding to it a linear combination of the other rows.
A row of a matrix A has k leading zeros if the first k elements of the row are all zero and the (k + 1)-th element of the row is not zero. A matrix is in row echelon form if each row has more leading zeros than the row preceding it (unless the row contains only zeros, in which case the subsequent rows must contain only zeros).
Example.

$$A = \begin{pmatrix} 1 & 0 & 3 \\ 2 & 3 & 6 \\ 4 & 6 & 11 \end{pmatrix} \xrightarrow[R_3' = R_3 - 4R_1]{R_2' = R_2 - 2R_1} \begin{pmatrix} 1 & 0 & 3 \\ 0 & 3 & 0 \\ 0 & 6 & -1 \end{pmatrix} \xrightarrow{R_3'' = R_3' - 2R_2'} \begin{pmatrix} 1 & 0 & 3 \\ 0 & 3 & 0 \\ 0 & 0 & -1 \end{pmatrix}$$

The row echelon form has three nonzero rows, so rk(A) = 3.
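The same reduction can be reproduced numerically. The sketch below (an added Python/NumPy illustration) applies the elementary row operations and confirms the rank:

```python
import numpy as np

A = np.array([[1., 0., 3.], [2., 3., 6.], [4., 6., 11.]])

R = A.copy()
R[1] -= 2 * R[0]  # R2' = R2 - 2 R1
R[2] -= 4 * R[0]  # R3' = R3 - 4 R1
R[2] -= 2 * R[1]  # R3'' = R3' - 2 R2'
print(R)  # row echelon form with three nonzero rows

print(np.linalg.matrix_rank(A))  # 3
```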

1.3.3 Using vectors

Given A ∈ M_{m×n} with m row vectors r_1, r_2, ..., r_m and n column vectors c_1, c_2, ..., c_n, the rank of A is the maximum number of linearly independent column or row vectors.

Let us introduce the notion of a vector space (V, +, ·). For all u, v, w in V and all scalars α, β ∈ R, V is a vector space over R if:
1. u + v ∈ V
2. u + v = v + u
3. (u + v) + w = u + (v + w)
4. There is an element, the null vector 0, such that v + 0 = 0 + v = v.
5. ∀v ∈ V, v + (−v) = 0
6. αu ∈ V
7. α(u + v) = αu + αv
8. (α + β)u = αu + βu
9. (αβ)u = α(βu)
10. 1 · u = u, where 1 ∈ R.

If V = Rn (the set of n-tuples), with + the addition of vectors in Rn and · the scalar multiplication of vectors, then Rn satisfies all these properties and (Rn, +, ·) is a vector space.
A subset of Rn that satisfies the above properties is called a subspace of Rn.
Theorem. A subset S of Rn is a subspace of Rn if and only if ∀u, v ∈ S and ∀α, β ∈ R:

αu + βv ∈ S
Examples.
1. {0} is a subspace of Rn.
2. S = {(1, 0), (0, 0), (0, 1)} is a subset of R2 but not a subspace, as: (1, 0) + (1, 0) ∉ S.
3. T = {(x, y) ∈ R2 | x − y = 0} is a subspace of R2.
The concepts needed in Rn are vectors, addition of vectors and scalar multiplication.
Vectors u_1, u_2, ..., u_n in Rn are linearly independent if and only if (Chapter 11 of Simon and Blume):

$$\lambda_1 u_1 + \lambda_2 u_2 + \ldots + \lambda_n u_n = \sum_{i=1}^{n} \lambda_i u_i = 0 \implies \lambda_i = 0, \quad i = 1, \ldots, n$$

For scalars λ_i, the expression λ_1 u_1 + λ_2 u_2 + ... + λ_n u_n is called a linear combination of u_1, u_2, ..., u_n.

The set of linear combinations of u_1, u_2, ..., u_n is called the set generated or spanned by u_1, u_2, ..., u_n. It is always a vector subspace:

L[u_1, u_2, ..., u_n] = {λ_1 u_1 + λ_2 u_2 + ... + λ_n u_n | λ_i ∈ R, i = 1, ..., n}

Let u_1, u_2, ..., u_n be a collection of vectors in V (a vector space or subspace). Then u_1, u_2, ..., u_n form a basis of V if:
u_1, u_2, ..., u_n span V, and
u_1, u_2, ..., u_n are linearly independent.
The dimension of V is the number of vectors in any of its bases; here dim(V) = n.
Examples.
In the vector space R2, D = {(1, 0), (1, 1)} is a basis of R2, as the dimension of R2 is two and these two vectors are linearly independent:

$$|A| = \begin{vmatrix} 1 & 1 \\ 0 & 1 \end{vmatrix} \neq 0$$

Then rk(A) = 2 = number of linearly independent vectors, so the two vectors form a basis of R2.
In the vector space R3, D = {(1, 1, 0), (1, 2, 1), (3, 0, 5)} is a basis of R3, as it is formed by 3 linearly independent vectors.

1.4 Invertible matrices

Given A ∈ M_{n×n}, its inverse matrix A^{-1} is the matrix that satisfies:

A^{-1} A = A A^{-1} = I_n

(A has to be a nonsingular matrix, |A| ≠ 0.)

1.4.1 Inverse matrix using cofactors

$$A^{-1} = \frac{1}{|A|}(A^d)^t = \frac{1}{|A|}\begin{pmatrix} C_{11} & C_{21} & \cdots & C_{n1} \\ C_{12} & C_{22} & \cdots & C_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ C_{1n} & C_{2n} & \cdots & C_{nn} \end{pmatrix}$$

with C_{ij} the cofactor of a_{ij}.
Example.

$$A = \begin{pmatrix} 2 & 2 & 2 \\ -2 & 1 & 0 \\ 3 & 2 & 2 \end{pmatrix} \qquad |A| = -2 \qquad A^d = \begin{pmatrix} 2 & 4 & -7 \\ 0 & -2 & 2 \\ -2 & -4 & 6 \end{pmatrix} \qquad (A^d)^t = \begin{pmatrix} 2 & 0 & -2 \\ 4 & -2 & -4 \\ -7 & 2 & 6 \end{pmatrix}$$

$$A^{-1} = \frac{1}{|A|}(A^d)^t = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 1 & 2 \\ 7/2 & -1 & -3 \end{pmatrix}$$
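The cofactor construction can be verified numerically. A minimal Python/NumPy sketch (an added illustration) builds the cofactor matrix for the example above and checks that A · A^{-1} = I:

```python
import numpy as np

A = np.array([[2., 2., 2.], [-2., 1., 0.], [3., 2., 2.]])

# cofactor matrix: C_ij = (-1)^(i+j) * M_ij
C = np.zeros_like(A)
for i in range(3):
    for j in range(3):
        minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
        C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)

A_inv = C.T / np.linalg.det(A)  # A^{-1} = (A^d)^t / |A|
print(np.allclose(A @ A_inv, np.eye(3)))  # True
```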

1.4.2 Properties

The inverse matrix is unique.
A matrix A is invertible ⟺ |A| ≠ 0 (A nonsingular).
Given A, B invertible matrices: (A · B)^{-1} = B^{-1} A^{-1}
Given A nonsingular: (A^t)^{-1} = (A^{-1})^t and |A^{-1}| = 1/|A|
If A is nonsingular: A^m = A · A · ... · A (m times), A^{-m} = (A^{-1})^m, and A^m is invertible with (A^m)^{-1} = (A^{-1})^m.
For any scalar λ ≠ 0, λA is invertible and: (λA)^{-1} = (1/λ) A^{-1}

1.5 Systems of linear equations

A system of m linear equations with n unknowns or variables x_1, x_2, ..., x_n:

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n &= b_2 \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \ldots + a_{mn}x_n &= b_m \end{aligned}$$

whose matrix expression is:

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}$$

Therefore:

AX = B

A is the coefficient matrix of fixed numbers a_{ij}, with 1 ≤ i ≤ m and 1 ≤ j ≤ n. X is the matrix of the variables x_i, of dimension n × 1. B is the column matrix of the independent terms b_j, of dimension m × 1.
The augmented matrix of the system is:

$$(A|B) = \left(\begin{matrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{matrix}\;\middle|\;\begin{matrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{matrix}\right)$$
Some questions arise:


Does a solution exist?
How many solutions are there?
Is there an efficient algorithm that computes actual solutions?

1.5.1 Rouché-Frobenius theorem

A system of linear equations with m equations and n variables has a solution if and only if rk(A) = rk(A|B). Then:
If rk(A) = rk(A|B), the system has a solution:
(a) If rk(A) = rk(A|B) = n (n: number of variables), the system has a unique solution.
(b) If rk(A) = rk(A|B) < n, the system has infinitely many solutions.
Otherwise, if rk(A) ≠ rk(A|B), the system has no solution.

1.5.2 Solution of systems of linear equations

There are three different methods:

Substitution

Solve one equation of the system for one variable, say x_k, in terms of the other variables in that equation. Substitute this expression for x_k into the other m − 1 equations. Continue this process until reaching a system with just a single equation in one variable. Finally, using the previously derived expressions, find all the x_i's.

Example.
2x + y + z = 1
x + 2y + z = 2
x + y + 2z = 0

We calculate x in terms of y and z from the first equation: x = (1 − y − z)/2, and plug this result into the second and third equations. We then have:

x = (1 − y − z)/2
3y + z = 3
y + 3z = −1

From the second equation, y = (3 − z)/3. We plug this result into equation three and solve for z:

x = (1 − y − z)/2
y = (3 − z)/3
z = −3/4

so that z = −3/4, y = 5/4 and x = 1/4.

Elimination of variables

Use elementary equation operations:
Add a multiple of one equation to another.
Multiply both sides of one equation by a non-zero scalar.
Interchange two equations.
Fact: If one system of linear equations is derived from another by elementary equation operations, then both systems have the same solutions (the systems are equivalent).

Example.
2x + y + z = 1
x + 2y + z = 2
x + y + 2z = 0

Its matrix expression:

$$\begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}$$

An equivalent system is obtained with R_2' = 2R_2 − R_1 and R_3' = R_3 − R_2:

$$\begin{pmatrix} 2 & 1 & 1 \\ 0 & 3 & 1 \\ 0 & -1 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \\ -2 \end{pmatrix}$$

Another equivalent system, with R_3'' = 3R_3' + R_2':

$$\begin{pmatrix} 2 & 1 & 1 \\ 0 & 3 & 1 \\ 0 & 0 & 4 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \\ -3 \end{pmatrix}$$

Matrix methods

1. For systems with a unique solution. If in a system of linear equations rk(A) = rk(A|B) = n (n linearly independent equations), the unique solution can be found using:

(a) Cramer's rule. The unique solution X = (x_1, x_2, ..., x_n) of the system AX = B is:

$$x_i = \frac{|B_i|}{|A|}, \quad i = 1, \ldots, n$$

where B_i is the matrix A with the independent column matrix B replacing the i-th column of A:

$$x_i = \frac{\begin{vmatrix} a_{11} & \cdots & b_1 & \cdots & a_{1n} \\ a_{21} & \cdots & b_2 & \cdots & a_{2n} \\ \vdots & & \vdots & & \vdots \\ a_{n1} & \cdots & b_n & \cdots & a_{nn} \end{vmatrix}}{|A|}$$

Example.
3x_1 + 2x_2 + x_3 = 1
−2x_1 + x_2 − 2x_3 = −1
x_1 + x_2 − x_3 = −2

$$\begin{pmatrix} 3 & 2 & 1 \\ -2 & 1 & -2 \\ 1 & 1 & -1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \\ -2 \end{pmatrix}$$

Since $$|A| = \begin{vmatrix} 3 & 2 & 1 \\ -2 & 1 & -2 \\ 1 & 1 & -1 \end{vmatrix} = -8 \neq 0,$$ we can use Cramer's rule:

$$x_1 = \frac{\begin{vmatrix} 1 & 2 & 1 \\ -1 & 1 & -2 \\ -2 & 1 & -1 \end{vmatrix}}{-8} = -1 \qquad x_2 = \frac{\begin{vmatrix} 3 & 1 & 1 \\ -2 & -1 & -2 \\ 1 & -2 & -1 \end{vmatrix}}{-8} = 1 \qquad x_3 = \frac{\begin{vmatrix} 3 & 2 & 1 \\ -2 & 1 & -1 \\ 1 & 1 & -2 \end{vmatrix}}{-8} = 2$$

(b) Inverse matrix method. If rk(A) = n, then A is a nonsingular matrix, |A| ≠ 0. Then A^{-1} exists and:

A^{-1} A X = A^{-1} B
I_n X = A^{-1} B
X = A^{-1} B

Example. For the same system as above,

$$|A| = \begin{vmatrix} 3 & 2 & 1 \\ -2 & 1 & -2 \\ 1 & 1 & -1 \end{vmatrix} = -8 \neq 0,$$

so A^{-1} exists and the solution of the system is X = A^{-1}B:

$$A^{-1} = \begin{pmatrix} -1/8 & -3/8 & 5/8 \\ 1/2 & 1/2 & -1/2 \\ 3/8 & 1/8 & -7/8 \end{pmatrix} \qquad X = A^{-1}B = \begin{pmatrix} -1/8 & -3/8 & 5/8 \\ 1/2 & 1/2 & -1/2 \\ 3/8 & 1/8 & -7/8 \end{pmatrix}\begin{pmatrix} 1 \\ -1 \\ -2 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \\ 2 \end{pmatrix}$$

2. For systems with zero, one or infinitely many solutions. In any system of linear equations, the augmented matrix (A|B) can be reduced to row echelon form. From it, rk(A) and rk(A|B) are obtained straightforwardly, and the number of solutions of the system (0, 1 or ∞) can be determined.

(a) If the system has a unique solution (rk(A) = rk(A|B) = n): the substitution method, Cramer's rule or the inverse-matrix method is applied to the equivalent system given by the row echelon form of the augmented matrix, (A|B)_r.

Example. For the system above, the augmented matrix is:

$$(A|B) = \left(\begin{matrix} 3 & 2 & 1 \\ -2 & 1 & -2 \\ 1 & 1 & -1 \end{matrix}\;\middle|\;\begin{matrix} 1 \\ -1 \\ -2 \end{matrix}\right)$$

Applying elementary operations to obtain the row echelon form:

$$\left(\begin{matrix} 3 & 2 & 1 \\ -2 & 1 & -2 \\ 1 & 1 & -1 \end{matrix}\middle|\begin{matrix} 1 \\ -1 \\ -2 \end{matrix}\right) \xrightarrow{R_1 \leftrightarrow R_3} \left(\begin{matrix} 1 & 1 & -1 \\ -2 & 1 & -2 \\ 3 & 2 & 1 \end{matrix}\middle|\begin{matrix} -2 \\ -1 \\ 1 \end{matrix}\right) \xrightarrow[R_3' = R_3 - 3R_1]{R_2' = R_2 + 2R_1} \left(\begin{matrix} 1 & 1 & -1 \\ 0 & 3 & -4 \\ 0 & -1 & 4 \end{matrix}\middle|\begin{matrix} -2 \\ -5 \\ 7 \end{matrix}\right) \xrightarrow{R_3'' = 3R_3' + R_2'} \left(\begin{matrix} 1 & 1 & -1 \\ 0 & 3 & -4 \\ 0 & 0 & 8 \end{matrix}\middle|\begin{matrix} -2 \\ -5 \\ 16 \end{matrix}\right)$$

The solution of the system is X = (−1, 1, 2).
(b) If the system has infinitely many solutions (rk(A) = rk(A|B) < n): from the set of n variables, n − rk(A) variables are taken as parameters. A subsystem of rk(A) equations and rk(A) variables is thus obtained, and the remaining variables x_i are calculated in terms of the n − rk(A) parameters.

Example.
x_1 − x_2 + x_3 = 1
4x_1 + 5x_2 − 5x_3 = 4
2x_1 + x_2 − x_3 = 2
x_1 + 2x_2 − 2x_3 = 1

Reducing the augmented matrix of the system to row echelon form:

$$(A|B) = \left(\begin{matrix} 1 & -1 & 1 \\ 4 & 5 & -5 \\ 2 & 1 & -1 \\ 1 & 2 & -2 \end{matrix}\middle|\begin{matrix} 1 \\ 4 \\ 2 \\ 1 \end{matrix}\right) \xrightarrow[\substack{R_3' = R_3 - 2R_1 \\ R_4' = R_4 - R_1}]{R_2' = R_2 - 4R_1} \left(\begin{matrix} 1 & -1 & 1 \\ 0 & 9 & -9 \\ 0 & 3 & -3 \\ 0 & 3 & -3 \end{matrix}\middle|\begin{matrix} 1 \\ 0 \\ 0 \\ 0 \end{matrix}\right)$$

Then rk(A) = rk(A|B) = 2 < 3, with n − rk(A) = 3 − 2 = 1 parameter. If x_3 = λ, then:

x_1 − x_2 = 1 − λ
x_2 = λ

The solution of this system is X = (1, λ, λ), λ ∈ R.

3. Particular case: homogeneous systems. A homogeneous system is a system of the form:

AX = 0

with A ∈ M_{m×n}, X ∈ M_{n×1} with variables x_i, and B the null m × 1 column matrix. A homogeneous system ALWAYS has 1 or ∞ solutions, since rk(A) = rk(A|B). Then:
If rk(A) = n, the unique solution of the system is the zero or trivial solution: x_1 = x_2 = ... = x_n = 0.
If rk(A) < n, the system has infinitely many solutions.

Example.
3x_1 + 2x_2 + 4x_3 = 0
2x_1 + 2x_3 = 0
−x_1 + 2x_2 = 0

$$\begin{pmatrix} 3 & 2 & 4 \\ 2 & 0 & 2 \\ -1 & 2 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

As $$|A| = \begin{vmatrix} 3 & 2 & 4 \\ 2 & 0 & 2 \\ -1 & 2 & 0 \end{vmatrix} = 0,$$ we have rk(A) = 2 < 3, so the system has infinitely many solutions and we need 3 − 2 = 1 parameter. Making x_2 = λ, the solution of the system is x_1 = 2λ, x_3 = −2λ, λ ∈ R.

1.6 Eigenvalues and eigenvectors of a matrix

Let A be a square matrix. An eigenvalue of A is a number r which, when subtracted from each of the diagonal entries of A, converts A into a singular matrix. A square matrix is singular if and only if |A| = 0.
Thus r is an eigenvalue of A if and only if |A − rI| = 0. If r is a variable, |A − rI| = p(r) is the characteristic polynomial of A. The r values for which this characteristic polynomial is zero are the eigenvalues of the matrix A. The algebraic multiplicity α_i of an eigenvalue r_i is the exponent at which this root appears in the characteristic polynomial.

Example. A = (2 1 0; 0 0 1; 0 0 3).
The characteristic polynomial is:

$$p(r) = |A - rI| = \begin{vmatrix} 2-r & 1 & 0 \\ 0 & -r & 1 \\ 0 & 0 & 3-r \end{vmatrix} = -r(2-r)(3-r)$$

The eigenvalues are r_1 = 0, r_2 = 2, r_3 = 3, with algebraic multiplicities α_1 = 1, α_2 = 1, α_3 = 1.
Associated to each eigenvalue r_i such that |A − r_iI| = 0, there is a system of linear equations (A − r_iI)v_i = 0. From this system, solutions v_i ≠ 0 can be obtained. These vectors v_i are called the eigenvectors of A corresponding to the eigenvalue r_i. For each eigenvalue r_i, a basis of the subspace of the corresponding vectors v_i (the eigenspace of r_i) can be obtained.
Example.

A = (2 1 0; 0 0 1; 0 0 3)

Eigenspace associated to r_1 = 0:
E_{r_1} = {x ∈ R3 | (A − 0I)x = 0}

$$\begin{pmatrix} 2 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

rk(A) = 2 < 3 = number of variables, so the system has infinitely many solutions and we need 3 − 2 = 1 parameter. The solution of the system is x = λ, y = −2λ and z = 0, λ ∈ R. A basis of this subspace is:
B_{E_{r_1}} = {(1, −2, 0)}

Eigenspace associated to r_2 = 2:
E_{r_2} = {x ∈ R3 | (A − 2I)x = 0}

$$\begin{pmatrix} 0 & 1 & 0 \\ 0 & -2 & 1 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

rk(A − 2I) = 2 < 3 = number of variables, so we need 3 − 2 = 1 parameter. The solution of the system is x = λ, y = 0 and z = 0, λ ∈ R. A basis of this subspace is:
B_{E_{r_2}} = {(1, 0, 0)}

Eigenspace associated to r_3 = 3:
E_{r_3} = {x ∈ R3 | (A − 3I)x = 0}

$$\begin{pmatrix} -1 & 1 & 0 \\ 0 & -3 & 1 \\ 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

rk(A − 3I) = 2 < 3 = number of variables, so we need 3 − 2 = 1 parameter. The solution of the system is x = λ, y = λ and z = 3λ, λ ∈ R. A basis of this subspace is:
B_{E_{r_3}} = {(1, 1, 3)}
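These hand computations can be cross-checked with a numerical eigensolver. A Python/NumPy sketch (an added illustration; note that NumPy returns unit-length eigenvectors, i.e. scalar multiples of the basis vectors found above):

```python
import numpy as np

A = np.array([[2., 1., 0.], [0., 0., 1.], [0., 0., 3.]])
eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)  # 2, 0 and 3, in some order

# each column v of eigvecs satisfies (A - r I) v = 0
for r, v in zip(eigvals, eigvecs.T):
    print(np.allclose((A - r * np.eye(3)) @ v, 0))  # True for every eigenpair
```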

1.6.1 Diagonalization of a matrix (Chapter 23 of Simon and Blume)

The eigenvalues and eigenvectors are used to diagonalize a square matrix A, which means finding a diagonal matrix D and a nonsingular matrix P such that:

D = P^{-1} A P,   with A, D, P ∈ M_{n×n}

There are two different cases:
(a) Given A ∈ M_{n×n}, if r_1, r_2, ..., r_n are the eigenvalues of A and r_1 ≠ r_2 ≠ ... ≠ r_n, then A is diagonalizable and we can find P and D:

$$D = \begin{pmatrix} r_1 & 0 & \cdots & 0 \\ 0 & r_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & r_n \end{pmatrix} \qquad P = \begin{pmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_n \\ | & | & & | \end{pmatrix}$$

with v_i a basis vector of the eigenspace associated to the eigenvalue r_i.

Example. A = (2 1 0; 0 0 1; 0 0 3). All the eigenvalues are different and we had that:
B_{E_{r_1=0}} = {(1, −2, 0)}
B_{E_{r_2=2}} = {(1, 0, 0)}
B_{E_{r_3=3}} = {(1, 1, 3)}

Then A is diagonalizable, and D and P such that D = P^{-1}AP are:

$$D = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix} \qquad P = \begin{pmatrix} 1 & 1 & 1 \\ -2 & 0 & 1 \\ 0 & 0 & 3 \end{pmatrix}$$

(b) Given A ∈ M_{n×n}, let r_1, r_2, ..., r_p be the eigenvalues of A with algebraic multiplicities α_1, α_2, ..., α_p. Then A is diagonalizable if and only if:

dim(E_{r_i}) = α_i,   i = 1, 2, ..., p

In that case D is the diagonal matrix that carries each eigenvalue r_i on the diagonal, repeated α_i times:

$$D = \begin{pmatrix} r_1 & & & & & \\ & \ddots & & & & \\ & & r_1 & & & \\ & & & \ddots & & \\ & & & & r_p & \\ & & & & & \ddots \end{pmatrix}$$

The matrix P is formed by the eigenvectors in columns. The order of the eigenvectors in the matrix P has to correspond to the order of the eigenvalues in the matrix D.

Example. Given A = (1 1 1; 1 1 1; 1 1 1). A is diagonalizable if dim(E_{r_i}) = α_i for all i = 1, 2, ..., p.

$$p(r) = |A - rI| = \begin{vmatrix} 1-r & 1 & 1 \\ 1 & 1-r & 1 \\ 1 & 1 & 1-r \end{vmatrix} = (1-r)^3 + 2 - 3(1-r) = -r^2(r-3)$$

The eigenvalues are r_1 = 0 and r_2 = 3, with algebraic multiplicities α_1 = 2, α_2 = 1.

Calculating the basis of the eigenspace associated to r_1 = 0:
E_{r_1} = {x ∈ R3 | (A − 0I)x = 0}

$$\begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

rk(A) = 1 < 3 = number of variables, so we need 3 − 1 = 2 parameters. The solution of the system is x = −α − β, y = α and z = β, α, β ∈ R. Then a basis of this eigenspace is:
B_{E_{r_1=0}} = {(−1, 1, 0), (−1, 0, 1)}
We can see that dim(E_{r_1=0}) = 2 = α_1.

Calculating the basis of the eigenspace associated to r_2 = 3:
E_{r_2} = {x ∈ R3 | (A − 3I)x = 0}

$$\begin{pmatrix} -2 & 1 & 1 \\ 1 & -2 & 1 \\ 1 & 1 & -2 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

rk(A − 3I) = 2 < 3 = number of variables, so we need 3 − 2 = 1 parameter. The solution of the system is x = λ, y = λ and z = λ, λ ∈ R. Then a basis of this eigenspace is:
B_{E_{r_2=3}} = {(1, 1, 1)}

We can see that dim(E_{r_2=3}) = 1 = α_2.

As the dimensions of both eigenspaces coincide with the algebraic multiplicities of the eigenvalues, A is diagonalizable and we can find D and P such that D = P^{-1}AP:

$$D = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 3 \end{pmatrix} \qquad P = \begin{pmatrix} -1 & -1 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}$$
Symmetric matrices (optimization and statistics)

Let A be an n × n symmetric matrix. Then:
A is diagonalizable, with distinct or repeated eigenvalues.
Eigenvectors corresponding to different eigenvalues are orthogonal.
Note: two vectors u = (u_1, u_2, ..., u_n) and v = (v_1, v_2, ..., v_n) are orthogonal if their product is zero: u · v^t = u_1v_1 + u_2v_2 + ... + u_nv_n = 0.
An orthogonal matrix P can be found such that: D = P^t A P, with A, D, P ∈ M_{n×n}.

1.6.2 Properties of diagonalization

If A (n × n) is diagonalizable:
1. |A| = ∏_{i=1}^{n} r_i, where r_1, r_2, ..., r_n are its eigenvalues.
2. The eigenvalues of A^t are the eigenvalues of A.
3. If r_i is an eigenvalue of A with algebraic multiplicity α_i, then dim(E_{r_i}) = α_i ∀i.
4. There exist P, D such that D = P^{-1}AP; then A^n = P D^n P^{-1} ∀n ∈ N.
5. A^{-1} = P D^{-1} P^{-1}.

1.7 Application: linear difference equations (Chapter 23 of Simon and Blume)

1.7.1 One-dimensional equations

In the equation
y_{n+1} = a y_n
the variable y can represent, for example, the amount of money in a savings account whose principal is left untouched and whose interest is compounded once a year:
y_{n+1} = (1 + i) y_n
with i the interest rate. With this difference equation, in which the value of the variable at time n + 1 depends on its value at time n, we can solve for the amount of money at any time n:
y_1 = a y_0
y_2 = a y_1 = a² y_0
...
y_n = a^n y_0

1.7.2 k-dimensional systems

In general,
z_{n+1} = A z_n
where z_{n+1}, z_n are k × 1 vectors and A is a k × k matrix. To solve these systems we can face different situations:

1. A is diagonal. Then the solution of the system is the same as in the one-dimensional case: z_n = A^n z_0.

2. A has distinct real eigenvalues. In this case there is a nonsingular matrix P such that D = P^{-1}AP:

$$D = \begin{pmatrix} r_1 & 0 & \cdots & 0 \\ 0 & r_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & r_k \end{pmatrix}$$

In this case, we can solve the system as in the one-dimensional case:
z_1 = A z_0
z_2 = A z_1 = A² z_0
...
z_n = A^n z_0

As we know that A^n = P D^n P^{-1}, thus:

z_n = P D^n P^{-1} z_0
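The closed-form solution z_n = P D^n P^{-1} z_0 can be compared with direct iteration. A Python/NumPy sketch (an added illustration, reusing the diagonalization of the matrix A worked out in Section 1.6.1):

```python
import numpy as np

A = np.array([[2., 1., 0.], [0., 0., 1.], [0., 0., 3.]])
P = np.array([[1., 1., 1.], [-2., 0., 1.], [0., 0., 3.]])  # eigenvectors in columns
D = np.diag([0., 2., 3.])                                  # matching eigenvalues

z0 = np.array([1., 1., 1.])
n = 5
zn = P @ np.linalg.matrix_power(D, n) @ np.linalg.inv(P) @ z0  # z_n = P D^n P^{-1} z0
print(np.allclose(zn, np.linalg.matrix_power(A, n) @ z0))      # True
```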

Chapter 2

Multivariate Calculus

2.1 Functions (Chapter 13 of Simon and Blume)

A function from a set A to a set B is a rule that assigns to each object in A one object in B:
f : A → B
The set of elements of A for which the function f is defined is called the domain of f; the set B in which the function f takes its values is called the target of f.
y = f(x) is said to be the image of x under f. The set of all f(x)'s for x in the domain of f is called the image of f.
Examples:
If we consider the function f : R2 → R, f(x, y) = x² + y², the domain of the function is R2 and the image of the function is the set of all non-negative real values.
If we consider the function g : R → R, g(x) = 1/x, its domain is all the real numbers except 0 and its image is R − {0}.

But in economics we are mostly interested in functions that usually have two or more variables as input:
Demand function: f : R3 → R
q_1 = f(p_1, p_2, y) = K_1 p_1^{a_{11}} p_2^{a_{12}} y^{b_1}
Production function: f : R2 → R
q = f(x_1, x_2) = k x_1^{b_1} x_2^{b_2}
Production function of 2 outputs using 3 inputs: f : R3 → R2
Q(p_1, p_2, y) = (q_1(p_1, p_2, y), q_2(p_1, p_2, y)) = (K_1 p_1^{a_{11}} p_2^{a_{12}} y^{b_1}, K_2 p_1^{a_{21}} p_2^{a_{22}} y^{b_2})

2.1.1 Special functions

Linear functions (f : Rk → Rm)

Functions that preserve the vector space structure:
f(x + y) = f(x) + f(y), ∀x, y ∈ Rk
f(rx) = r f(x), ∀x ∈ Rk, r ∈ R

Usually these functions have the general form:
f(x_1, x_2, ..., x_k) = a_1x_1 + a_2x_2 + ... + a_kx_k

If f : Rk → Rm is a linear function, there exists an m × k matrix A such that:
f(x) = Ax, ∀x ∈ Rk
Quadratic forms (f : Rk → R)

A quadratic form on Rk is a real-valued function associated to a symmetric matrix A (A = A^t), of the form:

$$Q(x) = X^t A X = (x_1, x_2, \ldots, x_n)\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{12} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$$

Developing this form we get:

$$Q(x) = a_{11}x_1^2 + a_{22}x_2^2 + \ldots + a_{nn}x_n^2 + 2a_{12}x_1x_2 + 2a_{13}x_1x_3 + \ldots = \sum_{i=1}^{n} a_{ii}x_i^2 + 2\sum_{i=1}^{n}\sum_{j=i+1}^{n} a_{ij}x_ix_j$$

Example: Q(x, y, z) = x² − 5y² + 3z² − 2xy − 5yz.

$$Q(x, y, z) = (x, y, z)\begin{pmatrix} 1 & -1 & 0 \\ -1 & -5 & -5/2 \\ 0 & -5/2 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix}$$

Each cross coefficient is split between the two symmetric entries: the xy term between positions 12 and 21, the yz term between positions 23 and 32.

Definiteness of the quadratic forms

The definiteness of Q(x) is very important in the optimization of certain functions, to check second order conditions (distinguishing maxima from minima, convex from concave functions). Attending to the eigenvalues r_i of the matrix A, a quadratic form Q(x) is:
1. Q is positive definite ⟺ r_i > 0, i = 1, ..., n.
2. Q is positive semidefinite ⟺ r_i ≥ 0 ∀i, with some r_j = 0.
3. Q is negative definite ⟺ r_i < 0, i = 1, ..., n.
4. Q is negative semidefinite ⟺ r_i ≤ 0 ∀i, with some r_j = 0.
5. Q is indefinite ⟺ some r_i > 0 and some r_j < 0.

But sometimes the calculus of the eigenvalues is not easy, and we can classify the quadratic form by using the leading principal minors of the matrix A. Given Q(x) = X^t A X, let

$$|A_1| = a_{11}, \quad |A_2| = \begin{vmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{vmatrix}, \quad \ldots, \quad |A_n| = |A|$$

be the leading principal minors of A.
1. Q is positive definite ⟺ |A_i| > 0, i = 1, ..., n.
2. Q is positive semidefinite if |A_i| > 0 ∀i = 1, ..., n − 1 and |A_n| = |A| = 0.
3. Q is negative definite ⟺ (−1)^i |A_i| > 0, i = 1, ..., n.
4. Q is negative semidefinite if (−1)^i |A_i| > 0 ∀i = 1, ..., n − 1 and |A_n| = |A| = 0.
5. Q is indefinite if |A_n| = |A| ≠ 0 and the conditions of cases (1) and (3) are not satisfied.
6. Q is indefinite if |A_n| = |A| = 0 and |A_i| ≠ 0 ∀i = 1, ..., n − 1 and the conditions of cases (2) and (4) are not satisfied.
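The two definite cases of the leading-principal-minor test are straightforward to code. A Python/NumPy sketch (an added illustration; the semidefinite and indefinite cases from the list above would need the extra checks described there):

```python
import numpy as np

def classify(A, tol=1e-12):
    """Classify x^t A x from the leading principal minors of the symmetric matrix A."""
    n = A.shape[0]
    minors = [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]
    if all(m > tol for m in minors):
        return "positive definite"
    if all((-1) ** k * m > tol for k, m in enumerate(minors, start=1)):
        return "negative definite"
    return "not definite: check the semidefinite/indefinite cases"

# Q(x, y, z) = x^2 - 5y^2 + 3z^2 - 2xy - 5yz, the example above
A = np.array([[1., -1., 0.], [-1., -5., -2.5], [0., -2.5, 3.]])
print(classify(A))
```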
Monomials

A function f : Rk → R is a monomial if it can be written as:
f(x_1, ..., x_k) = c x_1^{a_1} x_2^{a_2} ... x_k^{a_k},   c ∈ R, a_i ≥ 0
The degree of the monomial is a_1 + ... + a_k.

Examples:
A constant function is a monomial of degree zero.
Each term of a linear function is a monomial of degree 1: f(x_1, x_2) = 3x_1 + 4x_2
Each term of a quadratic form is a monomial of degree 2: Q(x, y, z) = x² − 5y² + 3z² − 2xy − 5yz
Polynomials

A function f : Rk → R is called a polynomial function if f is a finite sum of monomials on Rk. The degree of the polynomial is the highest degree among its monomials.
Example: f(x, y, z) = x³ + 2xyz + y⁴ is a polynomial of degree 4.

2.1.2 Classification of functions

1. f : A → B is surjective if for each element b ∈ B there is an element a ∈ A such that b = f(a); the whole target space is the image of f.
Examples. Consider the real functions f(x) = x and g(x) = x². f(x) = x is surjective, as Im(f) = R. On the other hand, g is not surjective, as Im(g) = {y ∈ R | y ≥ 0}.

2. f : A → B is injective or one-to-one if: ∀x, y ∈ A, x ≠ y ⟹ f(x) ≠ f(y).
Examples. g(x) = x² is not one-to-one, as for x = 1 and x = −1 we have g(1) = g(−1) = 1. f(x) = x is one-to-one, as any two different values of x have different images.

f : R2 → R, f(x, y) = x² + y² is not surjective, as the image of f is R+ instead of all of R. Neither is it one-to-one, since f(1, 0) = f(0, 1) = 1.

2.1.3 Composition of functions

Let f : A → B and g : C → D be two functions, and suppose that B ⊆ C. Then the composition of f with g, g ∘ f : A → D, is the function:
(g ∘ f)(x) = g(f(x)), ∀x ∈ A

Example: f : R → R and g : R → R, f(x) = sin(x), g(x) = x².
(g ∘ f)(x) = g(f(x)) = g(sin(x)) = sin²(x)

[Figure 2.1: Derivative of a function]

2.2 Derivatives of multivariate functions

2.2.1 Partial derivatives

Let f : Rk → R, y = f(x_1, ..., x_k). We define the partial derivative of f with respect to x_i at the point x_0 = (x_{01}, x_{02}, ..., x_{0k}) ∈ Rk as:

$$\frac{\partial f}{\partial x_i}(x_{01}, x_{02}, \ldots, x_{0k}) = \lim_{h \to 0} \frac{f(x_{01}, \ldots, x_{0i} + h, \ldots, x_{0k}) - f(x_{01}, \ldots, x_{0i}, \ldots, x_{0k})}{h}$$

provided this limit exists. This means that for any ε > 0 there is a δ > 0 such that:

$$|h| < \delta \implies \left| \frac{f(x_{01}, \ldots, x_{0i} + h, \ldots, x_{0k}) - f(x_{01}, \ldots, x_{0i}, \ldots, x_{0k})}{h} - \frac{\partial f}{\partial x_i}(x_{01}, x_{02}, \ldots, x_{0k}) \right| < \varepsilon$$

Example. If f : R → R and lim_{h→0} [f(x_0 + h) − f(x_0)]/h tends to a particular value, then we have a derivative. Limit: ∀ε > 0, ∃δ > 0 such that:

$$|h| < \delta \implies \left| \frac{f(x_0 + h) - f(x_0)}{h} - f'(x_0) \right| < \varepsilon$$

In economic terms, partial derivatives can be interpreted, for example, as the MARGINAL PRODUCT: given the production function F(K, L), ∂F/∂K and ∂F/∂L are the marginal products of capital and labour, respectively. Or as an ELASTICITY: given a certain demand function Q_1(P_1, P_2, I),

(∂Q_1/∂P_1)(P_1/Q_1) ≈ (% change in demand)/(% change in own price)

is the own price elasticity of demand,
(∂Q_1/∂P_2)(P_2/Q_1) is the cross price elasticity of demand, and
(∂Q_1/∂I)(I/Q_1) is the income elasticity of demand.

2.2.2 The total differential

Intuitively, if F : R2 → R:

F(x* + Δx, y*) − F(x*, y*) ≈ (∂F/∂x)(x*, y*)Δx
F(x*, y* + Δy) − F(x*, y*) ≈ (∂F/∂y)(x*, y*)Δy

so we expect that:

F(x* + Δx, y* + Δy) − F(x*, y*) ≈ (∂F/∂x)(x*, y*)Δx + (∂F/∂y)(x*, y*)Δy

or, if F : Rk → R, with x* = (x_1*, ..., x_k*):

F(x_1* + Δx_1, ..., x_k* + Δx_k) − F(x_1*, ..., x_k*) ≈ (∂F/∂x_1)(x*)Δx_1 + ... + (∂F/∂x_k)(x*)Δx_k

Then, writing dF = F(x* + Δx) − F(x*):

dF = (∂F/∂x_1)(x*)dx_1 + ... + (∂F/∂x_k)(x*)dx_k

The JACOBIAN DERIVATIVE of F at x* is:

DF(x*) = ((∂F/∂x_1)(x*), ..., (∂F/∂x_k)(x*))

If F : Rk → Rm, F = (f_1(x_1, ..., x_k), f_2(x_1, ..., x_k), ..., f_m(x_1, ..., x_k)):

$$F(x^* + \Delta x) - F(x^*) \approx \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x^*) & \cdots & \frac{\partial f_1}{\partial x_k}(x^*) \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1}(x^*) & \cdots & \frac{\partial f_m}{\partial x_k}(x^*) \end{pmatrix}\begin{pmatrix} \Delta x_1 \\ \vdots \\ \Delta x_k \end{pmatrix}$$

So the Jacobian derivative of F at x* is:

$$DF(x^*) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x^*) & \cdots & \frac{\partial f_1}{\partial x_k}(x^*) \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1}(x^*) & \cdots & \frac{\partial f_m}{\partial x_k}(x^*) \end{pmatrix}$$

Example. Consider the pair of demand functions q_1 = 6p_1^{-2}p_2^{3/2}y and q_2 = 4p_1^{4/3}p_2^{-2/3}y^{2/3}. Calculate the total differential of these functions at the point p_1 = 6, p_2 = 9, y = 2.

dq_1 = (∂q_1/∂p_1)(x*)dp_1 + (∂q_1/∂p_2)(x*)dp_2 + (∂q_1/∂y)(x*)dy

dq_1 = −3dp_1 + 1.5dp_2 + 4.5dy

dq_2 = (32/9)dp_1 − (32/27)dp_2 + (16/3)dy
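The total differential is an approximation to the actual change in q_1, which a short numerical check makes concrete. A Python sketch (an added illustration, using the demand function q_1 with the exponents as reconstructed above):

```python
# q1 = 6 p1^(-2) p2^(3/2) y, evaluated near (p1, p2, y) = (6, 9, 2)
def q1(p1, p2, y):
    return 6 * p1 ** -2 * p2 ** 1.5 * y

dp1 = dp2 = dy = 0.01
actual = q1(6 + dp1, 9 + dp2, 2 + dy) - q1(6, 9, 2)
dq1 = -3 * dp1 + 1.5 * dp2 + 4.5 * dy  # the total differential with the partials above
print(actual, dq1)  # both close to 0.03
```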

2.2.3 The Chain Rule

Let g(t) = f(x(t)). The derivative of the composite function is the derivative of the outside function f(x) (evaluated at the inside function) times the derivative of the inside function:
g'(t) = f'(x(t)) · x'(t)

Let g(t) = f(x_1(t), ..., x_k(t)). Then:
g'(t) = (∂f/∂x_1)(x(t))x_1'(t) + ... + (∂f/∂x_k)(x(t))x_k'(t)

Let u(t) = (u_1(t_1, ..., t_s), u_2(t_1, ..., t_s), ..., u_n(t_1, ..., t_s)), u : Rs → Rn, with t = (t_1, ..., t_s), and f : Rn → R. Then the composite function g : Rs → R,
g = (f ∘ u)(t) = g(t_1, ..., t_s) = f(u(t)) = f(u_1(t), ..., u_n(t)),
satisfies
∂g/∂t_i(t) = (∂f/∂u_1)(u(t))(∂u_1/∂t_i)(t) + ... + (∂f/∂u_n)(u(t))(∂u_n/∂t_i)(t)

Let F : Rk → Rm and a : R → Rk. Then the composite function g = F ∘ a = F(a(t)) is a function g : R → Rm, and its derivative is:
g'(t) = DF(a(t)) · a'(t)

Let F : Rk → Rm and A : Rs → Rk. Then the composite function H = F ∘ A is a function H : Rs → Rm, and:
DH(s) = D(F ∘ A)(s) = DF(A(s)) · D(A(s))

2.2.4 Higher order derivatives

Let f : Rk → R. For this function we can define the matrix of second order derivatives as:

$$D^2 f(x) = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2}(x) & \frac{\partial^2 f}{\partial x_2 \partial x_1}(x) & \cdots & \frac{\partial^2 f}{\partial x_k \partial x_1}(x) \\ \frac{\partial^2 f}{\partial x_1 \partial x_2}(x) & \frac{\partial^2 f}{\partial x_2^2}(x) & \cdots & \frac{\partial^2 f}{\partial x_k \partial x_2}(x) \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_1 \partial x_k}(x) & \frac{\partial^2 f}{\partial x_2 \partial x_k}(x) & \cdots & \frac{\partial^2 f}{\partial x_k^2}(x) \end{pmatrix}$$

This matrix is called the Hessian matrix which, as you can see, is a symmetric matrix. Young's theorem shows that mixed partials of order n are equal if they are continuous.

2.3 Taylor's series approximation (Chapter 30 of Simon and Blume)

A function f : Rk → R is r-times continuously differentiable (C^r) if all the derivatives of f of order ≤ r exist and are continuous.
[Figure 2.2: Mean value]

2.3.1 Mean value theorem

Let f : U → R be a C¹ function on a (connected) interval U in R. For any points a, b ∈ U, there is a point c between a and b such that:
f(b) − f(a) = f'(c)(b − a)

Similarly, let F : U → R with U ⊆ Rk an open subset. Let a and b be two points in U such that the line segment from a to b lies in U. Then there is a point c on that segment such that:
F(b) − F(a) = DF(c)(b − a)

2.3.2 Taylor's series approximation on R

Let f : U → R be a C^{r+1} function defined on a (connected) interval U in R. For any a and a + h in U, there exists a point c between a and a + h such that:

$$f(a + h) = f(a) + f'(a)h + \ldots + \frac{1}{k!}f^{(k)}(a)h^k + \frac{1}{(k+1)!}f^{(k+1)}(c)h^{k+1}$$

In this expression, the k-th order Taylor polynomial of f at x = a is:

$$P_k(a + h) = f(a) + f'(a)h + \ldots + \frac{1}{k!}f^{(k)}(a)h^k$$

The difference R_k(h; a) between the actual value f(a + h) and its k-th order approximation P_k(a + h) is:

R_k(h; a) = f(a + h) − P_k(a + h)

and this difference satisfies R_k(h; a)/h^k → 0 as h → 0.

2.3.3 Taylor's series approximation on Rk

Suppose that F : U → R is a C² function on an open subset U of Rk. Let a ∈ U. Then there exists a continuous function R_2(h; a) such that for any point a + h ∈ U with the property that the line segment from a to a + h lies in U,

$$F(a + h) = F(a) + DF(a)h + \frac{1}{2!}h^t D^2F(a)h + R_2(h; a)$$

where R_2(h; a)/||h||² → 0 as ||h|| → 0, h^t = (h_1, ..., h_k) and:

$$\frac{1}{2!}h^t D^2F(a)h = \frac{1}{2}\sum_{i=1}^{k}\sum_{j=1}^{k} \frac{\partial^2 F}{\partial x_i \partial x_j}(a)h_i h_j$$

In coordinates on R2 (a = (a_1, a_2) and h = (h_1, h_2)):

$$F(a_1 + h_1, a_2 + h_2) = F(a_1, a_2) + \frac{\partial F}{\partial x_1}(a)h_1 + \frac{\partial F}{\partial x_2}(a)h_2 + \frac{1}{2}\frac{\partial^2 F}{\partial x_1^2}(a)h_1^2 + \frac{\partial^2 F}{\partial x_1 \partial x_2}(a)h_1h_2 + \frac{1}{2}\frac{\partial^2 F}{\partial x_2^2}(a)h_2^2 + R_2(h_1, h_2; a)$$
Example. Compute the Taylor approximation of order two of the Cobb-Douglas function F(x, y) = x^{1/4}y^{3/4} at (1, 1).

∂F/∂x = (1/4)x^{−3/4}y^{3/4}
∂F/∂y = (3/4)x^{1/4}y^{−1/4}
∂²F/∂x² = −(3/16)x^{−7/4}y^{3/4}
∂²F/∂y² = −(3/16)x^{1/4}y^{−5/4}
∂²F/∂x∂y = ∂²F/∂y∂x = (3/16)x^{−3/4}y^{−1/4}

Evaluating the partial derivatives at z* = (1, 1) we have:

∂F/∂x(z*) = 1/4, ∂F/∂y(z*) = 3/4, ∂²F/∂x²(z*) = −3/16, ∂²F/∂y²(z*) = −3/16, ∂²F/∂x∂y(z*) = 3/16

Therefore,

$$F(1 + h_1, 1 + h_2) = F(1, 1) + (1/4 \;\; 3/4)\begin{pmatrix} h_1 \\ h_2 \end{pmatrix} + \frac{1}{2}(h_1 \;\; h_2)\begin{pmatrix} -3/16 & 3/16 \\ 3/16 & -3/16 \end{pmatrix}\begin{pmatrix} h_1 \\ h_2 \end{pmatrix} + R(h_1, h_2)$$

$$= 1 + \frac{1}{4}h_1 + \frac{3}{4}h_2 - \frac{3}{32}h_1^2 + \frac{3}{16}h_1h_2 - \frac{3}{32}h_2^2 + R(h_1, h_2)$$

Note that if h^t = (0.1, −0.1), we have that F(1.1, 0.9) = 0.9463026.
The value of the Taylor approximation of order 1 is 0.95.
The value of the Taylor approximation of order 2 is 0.94625.
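A quick numerical check of those three numbers (an added Python illustration):

```python
F = lambda x, y: x ** 0.25 * y ** 0.75
h1, h2 = 0.1, -0.1

order1 = 1 + h1 / 4 + 3 * h2 / 4
order2 = order1 - 3 / 32 * h1 ** 2 + 3 / 16 * h1 * h2 - 3 / 32 * h2 ** 2
print(F(1 + h1, 1 + h2))  # 0.9463026... (the true value)
print(order1)             # 0.95
print(order2)             # 0.94625
```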

2.4 Implicit function theorem (Chapter 15 of Simon and Blume)

Usually the endogenous variable is an explicit function of the exogenous variables, that is:
y = F(x_1, ..., x_k)
but sometimes we face situations where both kinds of variables are mixed, as in:
G(x_1, ..., x_k, y) = 0
Depending on G, we usually cannot solve for y, but we still want to answer the basic question: how does a small change in one of the exogenous variables affect the value of the endogenous variable?

2.4.1 The implicit function theorem for R2

For a given function G(x, y) = C and a specific point (x_0, y_0), we want to answer the questions:
(a) Does G(x, y) = C determine y as a continuous function of x, near x_0 and y_0?
(b) If so, how do changes in x affect y?

Theorem: let G(x, y) be a C¹ function on a ball about (x_0, y_0) in R2. Suppose that G(x_0, y_0) = C and consider the expression G(x, y) = C. If ∂G/∂y(x_0, y_0) ≠ 0, then there exists a C¹ function y = y(x) defined on an interval I about x_0 such that:
1. G(x, y(x)) = C ∀x ∈ I,
2. y(x_0) = y_0,
3. y'(x_0) = −(∂G/∂x)(x_0, y_0) / (∂G/∂y)(x_0, y_0)
Example. Consider x² − 3xy + y³ − 7 = 0 around the solution point x_0 = 4, y_0 = 3. Suppose we can find a function y = y(x) such that:
x² − 3xy(x) + y³(x) − 7 = 0

If we differentiate with respect to x (using the Chain Rule), we have:
2x − 3y(x) − 3xy'(x) + 3y²(x)y'(x) = 0

y'(x) = −(2x − 3y)/(3y² − 3x) = −(∂G/∂x)/(∂G/∂y)

At the point x = 4, y = 3, we find that:
y'(4) = 1/15

We conclude that if there is a function which solves the equation, and if it is differentiable, then as x changes by Δx, y will change by approximately Δx/15.

Summarizing: we suppose that there exists a function y(x) that is a solution of the equation G(x, y) = C, i.e. G(x, y(x)) = C. Then we apply the Chain Rule to differentiate with respect to x at x_0, and we get y'(x_0).

In the example, we can get the Taylor series approximation of order 1:
y_1 ≈ y_0 + y'(x_0)Δx = 3 + (1/15) · 0.3 = 3.02
while the true value is y_1 = 3.01475.
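The value y_1 = 3.01475 can be reproduced by solving G(4.3, y) = 0 numerically. A Python sketch (an added illustration using Newton's method, started from the known solution y(4) = 3):

```python
# G(x, y) = x^2 - 3xy + y^3 - 7 = 0; solve for y at x = 4.3
x, y = 4.3, 3.0
for _ in range(20):
    G = x ** 2 - 3 * x * y + y ** 3 - 7
    dG_dy = -3 * x + 3 * y ** 2  # partial derivative of G with respect to y
    y -= G / dG_dy               # Newton step
print(y)                   # about 3.01475
print(3 + (1 / 15) * 0.3)  # 3.02, the first-order estimate above
```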

2.4.2 The implicit function theorem for Rk

Let G(x_1, ..., x_k, y) be a C¹ function around the point (x_1*, ..., x_k*, y*). Suppose that (x_1*, ..., x_k*, y*) satisfies:
G(x_1*, ..., x_k*, y*) = C and ∂G/∂y(x_1*, ..., x_k*, y*) ≠ 0

Then there is a C¹ function y = y(x_1, ..., x_k) defined on an open ball B about (x_1*, ..., x_k*) such that:
1. G(x_1, ..., x_k, y(x_1, ..., x_k)) = C ∀(x_1, ..., x_k) ∈ B,
2. y* = y(x_1*, ..., x_k*),
3. for each i = 1, ..., k:

∂y/∂x_i(x_1*, ..., x_k*) = −(∂G/∂x_i)(x_1*, ..., x_k*, y*) / (∂G/∂y)(x_1*, ..., x_k*, y*)

2.4.3 The implicit function theorem for systems of implicit functions

Let F_1, ..., F_m : R^{m+k} → R be C¹ functions. Consider the system of equations:

F_1(y_1, ..., y_m, x_1, ..., x_k) = C_1
...
F_m(y_1, ..., y_m, x_1, ..., x_k) = C_m

Defining y* = (y_1*, ..., y_m*) and x* = (x_1*, ..., x_k*), suppose that (y*, x*) is a solution of the system above. If the determinant of the m × m matrix

$$\begin{pmatrix} \frac{\partial F_1}{\partial y_1} & \cdots & \frac{\partial F_1}{\partial y_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial F_m}{\partial y_1} & \cdots & \frac{\partial F_m}{\partial y_m} \end{pmatrix}$$

evaluated at (y*, x*) is nonzero, then there exist functions

y_1 = f_1(x_1, ..., x_k), ..., y_m = f_m(x_1, ..., x_k)

defined on a ball B about x* such that:

F_1(f_1(x), ..., f_m(x), x_1, ..., x_k) = C_1
...
F_m(f_1(x), ..., f_m(x), x_1, ..., x_k) = C_m

∀x = (x_1, ..., x_k) ∈ B, and y_1* = f_1(x*), ..., y_m* = f_m(x*).

Furthermore, one can compute ∂y_i/∂x_j(y*, x*), i = 1, ..., m, by evaluating at (y*, x*):

$$\begin{pmatrix} \frac{\partial y_1}{\partial x_j} \\ \vdots \\ \frac{\partial y_m}{\partial x_j} \end{pmatrix} = -\begin{pmatrix} \frac{\partial F_1}{\partial y_1} & \cdots & \frac{\partial F_1}{\partial y_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial F_m}{\partial y_1} & \cdots & \frac{\partial F_m}{\partial y_m} \end{pmatrix}^{-1}\begin{pmatrix} \frac{\partial F_1}{\partial x_j} \\ \vdots \\ \frac{\partial F_m}{\partial x_j} \end{pmatrix}$$

Example. Consider:

F_1(x, y, a) = x² + axy + y² − 1 = 0
F_2(x, y, a) = x² + y² − a² + 3 = 0

around the point x* = 0, y* = 1, a* = 2. If we change a a little, to a' near a* = 2, can we find (x', y') near (0, 1) so that (x', y', a') satisfies these two equations?

First, defining z* = (x*, y*, a*),

$$\begin{pmatrix} \frac{\partial x}{\partial a}(z^*) \\ \frac{\partial y}{\partial a}(z^*) \end{pmatrix} = -\begin{pmatrix} \frac{\partial F_1}{\partial x}(z^*) & \frac{\partial F_1}{\partial y}(z^*) \\ \frac{\partial F_2}{\partial x}(z^*) & \frac{\partial F_2}{\partial y}(z^*) \end{pmatrix}^{-1}\begin{pmatrix} \frac{\partial F_1}{\partial a}(z^*) \\ \frac{\partial F_2}{\partial a}(z^*) \end{pmatrix} = -\begin{pmatrix} 2 & 2 \\ 0 & 2 \end{pmatrix}^{-1}\begin{pmatrix} 0 \\ -4 \end{pmatrix} = \begin{pmatrix} -2 \\ 2 \end{pmatrix}$$

So,
Δx = −2Δa
Δy = 2Δa

If a increases to 2.1, y will increase to about 1.2 and x will decrease to about −0.2.

Chapter 3

Optimization

3.1 Unconstrained optimization

Let F : U → R be a real-valued function whose domain is a subset U of Rk, U ⊆ Rk.
1. x* ∈ U is a maximum of F on U if F(x*) ≥ F(x) ∀x ∈ U.
2. x* ∈ U is a strict maximum of F on U if F(x*) > F(x) ∀x ≠ x* ∈ U.
3. x* ∈ U is a local (or relative) maximum of F if there is a ball B_ε(x*) about x* such that F(x*) ≥ F(x) ∀x ∈ B_ε(x*) ∩ U.
4. x* ∈ U is a strict local maximum of F if there is a ball B_ε(x*) about x* such that F(x*) > F(x) ∀x ≠ x* ∈ B_ε(x*) ∩ U.

Reversing the inequalities, we have the definitions for minimum, strict minimum, etc.

B_ε(x*) = {x | ||x − x*|| < ε}

Definition: x* is an interior point of a set U if there is a ball B_ε(x*) about x* contained in the set U (x* does not lie at the extremes of the interval, nor on the surface of the ball B).

3.1.1 Theorem 1

Let F : U → R be a C¹ function defined on a subset U of Rk. If x* is a local maximum or minimum of F in U and if x* is an interior point of U, then:

∂F/∂x_i(x*) = 0, i = 1, ..., k

A point x* which satisfies these conditions is called a critical point of F.

3.1.2 Theorem 2

Definition: a set S in Rk is open if ∀x ∈ S there exists an open ε-ball about x completely contained in S:
x ∈ S ⟹ ∃ε > 0 such that B_ε(x) ⊆ S

Let F : U → R be a C² function whose domain is an open set U in Rk. Suppose that x* is a critical point of F.

1. If the Hessian

$$D^2F(x^*) = \begin{pmatrix} \frac{\partial^2 F}{\partial x_1^2}(x^*) & \cdots & \frac{\partial^2 F}{\partial x_k \partial x_1}(x^*) \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 F}{\partial x_1 \partial x_k}(x^*) & \cdots & \frac{\partial^2 F}{\partial x_k^2}(x^*) \end{pmatrix}$$

is negative definite, then x* is a strict local maximum of F.
2. If D²F(x*) is positive definite, then x* is a strict local minimum of F.
3. If D²F(x*) is indefinite, then x* is a saddle point.

3.1.3 Theorem 3

Let F : U → R be a C² function whose domain is U in Rk. Suppose that x* is an interior point of U and that x* is a local maximum (minimum) of F. Then D²F(x*) is negative (positive) semidefinite.

3.2 Optimization with equality constraints

3.2.1 Two variables and one equality constraint

Let f and h be C¹ functions of two variables. Suppose that x* = (x_1*, x_2*) is a solution of the problem:

Maximize f(x_1, x_2) subject to h(x_1, x_2) = c

Suppose also that (x_1*, x_2*) is not a critical point of h. Then there is a real number μ* such that (x_1*, x_2*, μ*) is a critical point of the Lagrangian function
L(x_1, x_2, μ) = f(x_1, x_2) − μ[h(x_1, x_2) − c]
where μ is called the Lagrange multiplier.

Intuition

Level curves of f : R2 → R: for a point (x_0, y_0) we evaluate f(x_0, y_0) = z_0. Then we sketch the locus in the x-y plane of all other (x, y) pairs for which f has the same value z_0. This locus is a level curve of f.
Example: for f(x, y) = x² + y², the set of all points (x, y) at which f equals a is:
f^{-1}(a) = {(x, y) | x² + y² = a}

Two ideas:
The highest level curve of f must be tangent to the constraint set (curve) at the constrained maximum x*. This means that the slope of the level curve of f equals the slope of the constraint curve at x*.
How do we calculate those slopes? By the implicit function theorem it is straightforward to show that the slope of the level set of f at x* is:

−(∂f/∂x_1)(x*) / (∂f/∂x_2)(x*)

whereas the slope of the constraint set at x* is:

−(∂h/∂x_1)(x*) / (∂h/∂x_2)(x*)

[Figure 3.1: Level curves of a function]

Thus, by the optimality (tangency) condition we have that

(∂f/∂x_1)(x*) / (∂f/∂x_2)(x*) = (∂h/∂x_1)(x*) / (∂h/∂x_2)(x*)

or, equivalently,

(∂f/∂x_1)(x*) / (∂h/∂x_1)(x*) = (∂f/∂x_2)(x*) / (∂h/∂x_2)(x*) = μ

Rewriting these equations:

(∂f/∂x_1)(x*) − μ(∂h/∂x_1)(x*) = 0
(∂f/∂x_2)(x*) − μ(∂h/∂x_2)(x*) = 0

and including the constraint equation:

[Figure 3.2: Intuition]

h(x_1, x_2) = c

We obtain a system of 3 equations in 3 unknowns. We get exactly the same through the use of the Lagrangian function.
Of course, this method does not work at the critical points of h:

(∂h/∂x_1)(x*) = (∂h/∂x_2)(x*) = 0

which is the reason for the condition called the constraint qualification; it is automatically satisfied if the constraint is linear.

Example. Find the maximum of the function f(x_1, x_2) = x_1x_2 subject to:
h(x_1, x_2) = x_1 + 4x_2 = 16

First, we check the constraint qualification condition: whether there are critical points of h in the constraint region.
∂h/∂x_1 = 1, ∂h/∂x_2 = 4
h has no critical points, so the constraint qualification is satisfied.

The Lagrangian is:
L(x_1, x_2, μ) = x_1x_2 − μ(x_1 + 4x_2 − 16)

The partial derivatives are:
∂L/∂x_1 = x_2 − μ = 0
∂L/∂x_2 = x_1 − 4μ = 0
∂L/∂μ = −(x_1 + 4x_2 − 16) = 0

Setting the derivatives equal to zero and solving for the three unknowns, we have x_1 = 8, x_2 = 2 and μ = 2.
Example. Maximize the function f(x_1, x_2) = x_1²x_2 subject to (x_1, x_2) in the constraint set C = {(x_1, x_2) | 2x_1² + x_2² = 3}.
First, we have to compute the critical points of h:
∂h/∂x_1 = 4x_1, ∂h/∂x_2 = 2x_2
These derivatives are zero only at (x_1, x_2) = (0, 0). But as this point is not in the constraint set C, the constraint qualification is satisfied and we can form the Lagrangian:
L(x_1, x_2, μ) = x_1²x_2 − μ(2x_1² + x_2² − 3)
Derivatives of the Lagrangian with respect to the variables and the Lagrange multiplier:

∂L/∂x_1 = 2x_1x_2 − 4μx_1 = 2x_1(x_2 − 2μ) = 0
∂L/∂x_2 = x_1² − 2μx_2 = 0
∂L/∂μ = −2x_1² − x_2² + 3 = 0

Solving the system: as you can see, this is a nonlinear system of equations, so matrix methods cannot be applied; only substitution and elimination can be used. We have the following equations:

2x_1(x_2 − 2μ) = 0
x_1² − 2μx_2 = 0
2x_1² + x_2² − 3 = 0

From the first equation we have the following alternatives:
(a) x_1 = 0 and x_2 − 2μ ≠ 0. Substituting x_1 = 0 in the remaining equations we have:
−2μx_2 = 0
x_2² − 3 = 0
From the last equation x_2 = ±√3, and then μ = 0. Thus we have two possible solution points: (x_1, x_2, μ) = (0, √3, 0) and (0, −√3, 0).
(b) x_1 ≠ 0 and x_2 − 2μ = 0. Then x_2 = 2μ and, substituting this expression into the remaining equations:
x_1² − 4μ² = 0
2x_1² + 4μ² − 3 = 0
We have that μ = ±1/2 and x_1 = ±1. Thus we have another 4 possible points: (1, 1, 1/2), (−1, 1, 1/2), (1, −1, −1/2), (−1, −1, −1/2).
(c) x_1 = 0 and x_2 − 2μ = 0. Then x_1 = 0 and x_2 = 2μ. But if we substitute we have:
−4μ² = 0
4μ² − 3 = 0
and this system is incompatible; it has no solution.

Plugging the solutions into f, we conclude that the maximum of the function occurs at (x_1, x_2) = (1, 1) and (x_1, x_2) = (−1, 1).
The meaning of the multiplier

Theorem: given the problem
Max f(x_1, x_2) subject to h(x_1, x_2) = c,
let f and h be C¹ functions. For any fixed value of the parameter c, let (x_1*(c), x_2*(c)) be the solution of the problem, with corresponding multiplier μ*(c). Suppose that x_1*, x_2* and μ* are C¹ functions of c and that the constraint qualification holds at (x_1*(c), x_2*(c)). Then:

μ*(c) = (df/dc)(x_1*(c), x_2*(c))

so that μ*(c) measures the rate of change of the optimal value of f with respect to c.

Example. In the previous example, we found that a maximizer of f subject to 2x_1² + x_2² = 3 was x_1 = x_2 = 1, with μ = 0.5. We could redo the problem with a new constraint 2x_1² + x_2² = 3.3 to get x_1 = x_2 = √1.1 ≈ 1.049, with maximum value f = 1.1537, an increase of 0.1537 over the original value of f. But, by the previous theorem, we could get an approximation of this increase in f as:

Δf ≈ μ · Δc = 0.3 · 0.5 = 0.15

quite similar to the true increase in f.

3.2.2 Several equality constraints

Now we want to solve:

Max f(x_1, x_2, ..., x_k) subject to C = {x = (x_1, x_2, ..., x_k) | h_1(x) = c_1, ..., h_m(x) = c_m}

with m constraint functions. The generalization of the constraint qualification involves the Jacobian derivative of the constraints evaluated at x* (the optimal x):

$$Dh(x^*) = \begin{pmatrix} \frac{\partial h_1}{\partial x_1}(x^*) & \cdots & \frac{\partial h_1}{\partial x_k}(x^*) \\ \vdots & \ddots & \vdots \\ \frac{\partial h_m}{\partial x_1}(x^*) & \cdots & \frac{\partial h_m}{\partial x_k}(x^*) \end{pmatrix}$$

In general, a critical point x* of h = (h_1, ..., h_m) is a point where rk(Dh(x*)) < m. If rk(Dh(x*)) = m, we say that (h_1, ..., h_m) satisfies the nondegenerate constraint qualification (NDCQ) at x*.

Theorem

Let f, h_1, ..., h_m be C¹ functions of k variables. Consider the previous problem, and suppose that x* ∈ C is a local max or min of f on C and satisfies the NDCQ above. Then there exist μ_1*, ..., μ_m* such that:
(x*, μ*) = (x_1*, ..., x_k*, μ_1*, ..., μ_m*)
is a critical point of the Lagrangian L(x, μ) = f(x) − μ_1(h_1(x) − c_1) − ... − μ_m(h_m(x) − c_m).
Example. Find the maximum of the function f(x, y, z) = xyz with the constraints:
h_1(x, y, z) = x² + y² = 1
h_2(x, y, z) = x + z = 1

First, we check the NDCQ. The Jacobian matrix is:

$$Dh(x, y, z) = \begin{pmatrix} \frac{\partial h_1}{\partial x} & \frac{\partial h_1}{\partial y} & \frac{\partial h_1}{\partial z} \\ \frac{\partial h_2}{\partial x} & \frac{\partial h_2}{\partial y} & \frac{\partial h_2}{\partial z} \end{pmatrix} = \begin{pmatrix} 2x & 2y & 0 \\ 1 & 0 & 1 \end{pmatrix}$$

which has rank < 2 only if x = y = 0. However, such a point violates the first constraint, so all the points in the constraint set satisfy the NDCQ. Forming the Lagrangian,

L(x, y, z, μ_1, μ_2) = xyz − μ_1(x² + y² − 1) − μ_2(x + z − 1)

the derivatives of the Lagrangian with respect to the variables and the Lagrange multipliers are:

∂L/∂x = yz − 2μ_1x − μ_2 = 0
∂L/∂y = xz − 2μ_1y = 0
∂L/∂z = xy − μ_2 = 0
∂L/∂μ_1 = −(x² + y² − 1) = 0
∂L/∂μ_2 = −(x + z − 1) = 0

Solving the system: again this is a nonlinear system of equations, so matrix methods cannot be applied; only substitution and elimination can be used. Taking μ_1 and μ_2 in terms of x, y and z from the second and third equations and plugging them into the first, we have:

y²z − x²z − xy² = 0
x² + y² − 1 = 0
x + z − 1 = 0

Substituting z and y² in terms of x and plugging into the first equation, we obtain a polynomial of third order with a root x = 1. Solving for x in the remaining second order polynomial and obtaining y and z, we have another 4 possible solution points:

x ≈ 0.4343, y ≈ ±0.9008, z ≈ 0.5657
x ≈ −0.7676, y ≈ ±0.6409, z ≈ 1.7676

Evaluating the possible candidates in the objective function, we have that the maximizer is x ≈ −0.7676, y ≈ −0.6409 and z ≈ 1.7676.
The meaning of the multipliers

Let f, h_1, ..., h_m be C¹ functions of k variables. Let c = (c_1, ..., c_m) be an m-tuple of exogenous parameters and consider the problem above. Let x_1*(c), ..., x_k*(c) denote the solution, with multipliers μ_1*(c), ..., μ_m*(c). Suppose that all the x_i* and μ_j* are differentiable functions of (c_1, ..., c_m) and that the NDCQ holds. Then:

μ_j*(c_1, ..., c_m) = (∂f/∂c_j)(x_1*(c_1, ..., c_m), ..., x_k*(c_1, ..., c_m)), j = 1, ..., m

[Figure 3.3: The gradient is perpendicular to the tangent line at that point]

3.3 Optimization with inequality constraints

3.3.1 One inequality constraint

We face the problem:

Maximize f(x_1, x_2) subject to g(x_1, x_2) ≤ b

Intuition: let F : Rk → R be a C¹ function and x* ∈ Rk. Then the derivative

∇F(x*) = DF(x*) = ((∂F/∂x_1)(x*), ..., (∂F/∂x_k)(x*))^t

is called the gradient of F at x*. At any x ∈ Rk at which ∇F(x) ≠ 0, the gradient points at x into the direction in which F increases most rapidly. At the same time, the gradient is perpendicular to the tangent line to the level curve at x*.

[Figure 3.4: Gradient of Q]


Example. Consider $Q = 4K^{3/4}L^{1/4}$ with the input bundle $(10000, 625)$. We want to know in what proportions we should add K and L to $(10000, 625)$ to increase production most rapidly. Computing the gradient:


$$\nabla Q(10000, 625) = \begin{pmatrix} \dfrac{\partial Q}{\partial K}(10000, 625) \\[2mm] \dfrac{\partial Q}{\partial L}(10000, 625) \end{pmatrix} = \begin{pmatrix} 1.5 \\ 8 \end{pmatrix}$$

We deduce that we should add K and L at a ratio of 1.5 to 8.
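This gradient is easy to verify numerically. The sketch below (an added illustration, not part of the original notes; the step size h is an arbitrary choice) approximates the two partial derivatives by central finite differences:

```python
def Q(K, L):
    # Cobb-Douglas production function Q = 4 * K^(3/4) * L^(1/4)
    return 4 * K ** 0.75 * L ** 0.25

# Central finite differences approximate the partials at (10000, 625)
h = 1e-4
K0, L0 = 10000.0, 625.0
dQ_dK = (Q(K0 + h, L0) - Q(K0 - h, L0)) / (2 * h)  # ~1.5
dQ_dL = (Q(K0, L0 + h) - Q(K0, L0 - h)) / (2 * h)  # ~8.0
print(dQ_dK, dQ_dL)
```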

Figure 3.5: Constraint is binding

Constraint is binding (active, effective or tight)


The highest level curve which meets the constraint set meets it at p. Thus, $\nabla f(p)$ and $\nabla g(p)$ line up and therefore $\nabla f(p) = \lambda^* \nabla g(p)$. Note that each gradient points in the direction in which f (resp. g) increases most rapidly at p. Thus, both gradients should point in the same direction and $\lambda^* \ge 0$.
Then, we set,

$$L(x_1, x_2, \lambda) = f(x_1, x_2) - \lambda[g(x_1, x_2) - b]$$


and calculate

$$\frac{\partial L}{\partial x_1} = \frac{\partial f}{\partial x_1} - \lambda\frac{\partial g}{\partial x_1} \qquad \frac{\partial L}{\partial x_2} = \frac{\partial f}{\partial x_2} - \lambda\frac{\partial g}{\partial x_2} \qquad \frac{\partial L}{\partial \lambda} = -[g(x_1, x_2) - b]$$

and proceed as before. Note that we also require that the maximizer is not a critical
point of g.

Figure 3.6: Constraint is not binding

Constraint is not binding


Now, the max of f occurs at a point q where $g(x_1, x_2) < b$. As q is in the interior of the constraint set, we say that the constraint is not binding (inactive, ineffective) at q. Were the point r the max, the multiplier would be negative. In fact, q must be a local unconstrained max, so:

$$\frac{\partial f}{\partial x_1}(q) = 0, \qquad \frac{\partial f}{\partial x_2}(q) = 0$$

Therefore, the derivatives of g play no role in pinning down the point q, but we can still use the Lagrangian, provided we set $\lambda = 0$, which causes the constraint function to drop out of the analysis.


Thus, we cannot use $\frac{\partial L}{\partial \lambda} = 0$, so we have to use the complementary slackness condition, which can be summarized as:

$$\lambda[g(x_1, x_2) - b] = 0$$
Theorem
Let f and g be $C^1$ functions of two variables. Suppose that $x^* = (x_1^*, x_2^*)$ maximizes f on $g(x_1, x_2) \le b$. If $g(x_1^*, x_2^*) = b$, suppose that,

$$\frac{\partial g}{\partial x_1}(x_1^*, x_2^*) \neq 0 \quad \text{or} \quad \frac{\partial g}{\partial x_2}(x_1^*, x_2^*) \neq 0$$

Forming the Lagrangian function,

$$L(x_1, x_2, \lambda) = f(x_1, x_2) - \lambda[g(x_1, x_2) - b]$$

there is a multiplier $\lambda^*$ such that:
(a) $\frac{\partial L}{\partial x_1}(x_1^*, x_2^*, \lambda^*) = 0$
(b) $\frac{\partial L}{\partial x_2}(x_1^*, x_2^*, \lambda^*) = 0$
(c) $\lambda^*[g(x_1^*, x_2^*) - b] = 0$
(d) $\lambda^* \ge 0$
(e) $g(x_1^*, x_2^*) \le b$
Example. Find the maximum of the function $f(x, y) = xy$ subject to:

$$g(x, y) = x^2 + y^2 \le 1$$
First, we have to compute the critical points of g:

$$\frac{\partial g}{\partial x} = 2x \qquad \frac{\partial g}{\partial y} = 2y$$

These derivatives are zero only if $(x, y) = (0, 0)$, but as this point is not on the boundary of the region and does not satisfy $x^2 + y^2 = 1$, the constraint qualification is satisfied and we can form the Lagrangian.

Lagrangian:

$$L(x, y, \lambda) = xy - \lambda(x^2 + y^2 - 1)$$

We take the partial derivatives of the Lagrangian and the complementary slackness condition:
1. $\frac{\partial L}{\partial x} = y - 2\lambda x = 0$
2. $\frac{\partial L}{\partial y} = x - 2\lambda y = 0$
3. $\lambda(x^2 + y^2 - 1) = 0$
4. $\lambda \ge 0$
5. $x^2 + y^2 \le 1$
We have the following system of equations:

$$y - 2\lambda x = 0 \qquad x - 2\lambda y = 0 \qquad \lambda(x^2 + y^2 - 1) = 0$$

Solving the system. We have to analyze this system through the Lagrange multipliers. We have in this case only one multiplier, $\lambda$, thus we start the analysis from the third equation, supposing that either $\lambda = 0$ or $x^2 + y^2 - 1 = 0$:
(a) If $\lambda = 0$, we have:

$$y = 0 \qquad x = 0$$

Thus, the solution of the system is $(x, y) = (0, 0)$, and this point also satisfies the inequality of the constraint $x^2 + y^2 \le 1$. Therefore, $(0, 0)$ is a possible candidate for the solution.
(b) If $x^2 + y^2 - 1 = 0$, we have:

$$y - 2\lambda x = 0 \qquad x - 2\lambda y = 0 \qquad x^2 + y^2 - 1 = 0 \qquad \lambda \ge 0$$

From the first two equations we have:

$$2\lambda = \frac{y}{x} = \frac{x}{y} \;\Longrightarrow\; y^2 = x^2$$

Plugging this result into the third equation we have:

$$2x^2 = 1 \;\Longrightarrow\; x = \pm\frac{1}{\sqrt{2}}$$

Then, $y = \pm\frac{1}{\sqrt{2}}$, $\lambda = \pm\frac{1}{2}$ and we have four possible points $(x, y, \lambda)$:

$$\left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, \tfrac{1}{2}\right) \qquad \left(-\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}, \tfrac{1}{2}\right) \qquad \left(\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}, -\tfrac{1}{2}\right) \qquad \left(-\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, -\tfrac{1}{2}\right)$$
Notice that there are some possible candidate points with $\lambda < 0$. These points would be considered if we wanted to calculate the minimizer of the function. Right now we want to calculate the maximizer of the function, so we consider the points with $\lambda \ge 0$, and then we find that the maximizers of this function are $\left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right)$ and $\left(-\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right)$.
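The case analysis above can be mirrored with a numerical solver. A scipy-based sketch (an added illustration, not part of the notes; the starting point is an arbitrary choice) that maximizes xy on the unit disk and lands on one of the two maximizers:

```python
from scipy.optimize import minimize

# Maximize f(x, y) = xy on the disk x^2 + y^2 <= 1 by minimizing -f
objective = lambda v: -v[0] * v[1]
# 'ineq' constraints in scipy mean fun(v) >= 0, so g <= b becomes b - g >= 0
constraint = {'type': 'ineq', 'fun': lambda v: 1 - v[0]**2 - v[1]**2}

res = minimize(objective, x0=[0.5, 0.5], constraints=[constraint])
print(res.x, -res.fun)  # approx [0.7071, 0.7071] and 0.5
```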

3.3.2

Several inequality constraints

Suppose that $f, g_1, \ldots, g_m$ are $C^1$ functions of k variables. Suppose that $x^* \in \mathbb{R}^k$ is a local max of f on the constraint set defined by:

$$g_1(x_1, \ldots, x_k) \le b_1 \qquad \ldots \qquad g_m(x_1, \ldots, x_k) \le b_m$$

Without loss of generality, assume that the first $m_0$ constraints are binding at $x^*$ and the last $m - m_0$ are not binding. Suppose that the rank of:


$$Dg(x^*) = \begin{pmatrix} \dfrac{\partial g_1}{\partial x_1}(x^*) & \cdots & \dfrac{\partial g_1}{\partial x_k}(x^*) \\ \vdots & & \vdots \\ \dfrac{\partial g_{m_0}}{\partial x_1}(x^*) & \cdots & \dfrac{\partial g_{m_0}}{\partial x_k}(x^*) \end{pmatrix}$$

(the Jacobian of the binding constraints) is $m_0$. Then we can form the Lagrangian:


L(x1 , . . . , xk , 1 , . . . , m ) = f (x) 1 [g1 (x) b1 ] . . . m [gm (x) bm ]
Then, there exist multipliers = (1 , . . . , m ) such that:
(a)

L
L
(x , ) = 0 . . .
(x , ) = 0
x1
xk

(b) 1 [g1 (x ) b1 ] = 0, . . . , m [gm (x ) bm ] = 0


(c) 1 0 . . . m 0
(d) g1 (x ) b1 , . . . , gm (x ) bm
Example. Maximize $f(x, y, z) = xyz$ subject to $x + y + z \le 1$, $x \ge 0$, $y \ge 0$, $z \ge 0$.
First of all, we have to transform the inequalities into the $\le$ form: $x + y + z \le 1$, $-x \le 0$, $-y \le 0$, $-z \le 0$.
We check the NDCQ. Calculating the Jacobian matrix of the constraints, we have that:

$$Dg(x, y, z) = \begin{pmatrix} \dfrac{\partial g_1}{\partial x} & \dfrac{\partial g_1}{\partial y} & \dfrac{\partial g_1}{\partial z} \\[1mm] \dfrac{\partial g_2}{\partial x} & \dfrac{\partial g_2}{\partial y} & \dfrac{\partial g_2}{\partial z} \\[1mm] \dfrac{\partial g_3}{\partial x} & \dfrac{\partial g_3}{\partial y} & \dfrac{\partial g_3}{\partial z} \\[1mm] \dfrac{\partial g_4}{\partial x} & \dfrac{\partial g_4}{\partial y} & \dfrac{\partial g_4}{\partial z} \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$$
This matrix has rank 3, which means that at most 3 of the 4 constraints can be binding at the same time. Besides, the NDCQ holds at any solution candidate.
The Lagrangian is:

$$L(x, y, z, \lambda_1, \lambda_2, \lambda_3, \lambda_4) = xyz - \lambda_1(x + y + z - 1) + \lambda_2 x + \lambda_3 y + \lambda_4 z$$

We take the partial derivatives of the Lagrangian and the complementary slackness conditions:

1. $\frac{\partial L}{\partial x} = yz - \lambda_1 + \lambda_2 = 0$
2. $\frac{\partial L}{\partial y} = xz - \lambda_1 + \lambda_3 = 0$
3. $\frac{\partial L}{\partial z} = xy - \lambda_1 + \lambda_4 = 0$
4. $\lambda_1(x + y + z - 1) = 0$
5. $\lambda_2 x = 0$
6. $\lambda_3 y = 0$
7. $\lambda_4 z = 0$
8. $\lambda_i \ge 0$, $i = 1, \ldots, 4$
9. $x + y + z \le 1$
10. $x \ge 0$
11. $y \ge 0$
12. $z \ge 0$

Figure 3.7: Lagrangian multipliers tree.


Then, we have the following system of equations:

$$\begin{aligned}
yz - \lambda_1 + \lambda_2 &= 0 \qquad (1)\\
xz - \lambda_1 + \lambda_3 &= 0 \qquad (2)\\
xy - \lambda_1 + \lambda_4 &= 0 \qquad (3)\\
\lambda_1(x + y + z - 1) &= 0 \qquad (4)\\
\lambda_2 x &= 0 \qquad (5)\\
\lambda_3 y &= 0 \qquad (6)\\
\lambda_4 z &= 0 \qquad (7)
\end{aligned}$$

From the first 3 equations we have,

$$\lambda_1 = yz + \lambda_2 = xz + \lambda_3 = xy + \lambda_4 \qquad (13)$$

Solving the system. We have to analyze this system through the Lagrange multipliers.
We will form a tree with different suppositions about the values of the Lagrange multipliers in equations (4), (5), (6) and (7). As we have 4 different multipliers, we can have $2^4 = 16$ different cases.
If $\lambda_1 = 0$, equation (13) becomes:

$$0 = yz + \lambda_2 = xz + \lambda_3 = xy + \lambda_4$$

As $x, y, z, \lambda_2, \lambda_3$ and $\lambda_4$ are nonnegative (greater than or equal to zero from the inequalities), the only possible solution is that all the terms in those 3 equations are zero:

$$\lambda_2 = \lambda_3 = \lambda_4 = 0 \qquad yz = 0 \quad xz = 0 \quad xy = 0$$

From this system, we get infinitely many solutions in which two of the variables are 0 and the other is different from 0. From inequality (9), the value of this variable will lie in the interval [0, 1]. For all these possible candidates, $f(x, y, z) = 0$. Then, all the cases from 1 to 8 in the figure have been solved.
If $\lambda_1 \neq 0$, then from equation (4) we get $x + y + z - 1 = 0$ and the system becomes:

$$\begin{aligned}
\lambda_1 &= yz + \lambda_2 \qquad (1)\\
\lambda_1 &= xz + \lambda_3 \qquad (2)\\
\lambda_1 &= xy + \lambda_4 \qquad (3)\\
x + y + z - 1 &= 0 \qquad (4)\\
\lambda_2 x &= 0 \qquad (5)\\
\lambda_3 y &= 0 \qquad (6)\\
\lambda_4 z &= 0 \qquad (7)
\end{aligned}$$


To solve this system of equations we have to discuss it starting from equations (5), (6) and (7). We analyse the different cases in the figure, from 9 to 16:
Case 9. $\lambda_1 \neq 0$, $\lambda_2 = \lambda_3 = \lambda_4 = 0$ (so $x, y, z \neq 0$). Then the system will be:

$$\lambda_1 = yz = xz = xy \qquad x + y + z - 1 = 0$$

The solution is $x = y = z = 1/3$, with $\lambda_1 = 1/9$. We check that this solution satisfies inequalities (9), (10), (11) and (12). So this point is a possible candidate, with $f(1/3, 1/3, 1/3) = 1/27$.
Case 10. $\lambda_1 \neq 0$, $\lambda_2 = \lambda_3 = 0$, $\lambda_4 \neq 0$ (so $x, y \neq 0$ and $z = 0$):

$$\lambda_1 = yz = 0 \quad (1) \qquad \lambda_1 = xz = 0 \quad (2) \qquad \lambda_1 = xy + \lambda_4 \quad (3) \qquad x + y - 1 = 0 \quad (4)$$

As from the first two equations we get $\lambda_1 = 0$, this leads to a contradiction with the supposition that $\lambda_1 \neq 0$, so this system has no solution. Therefore, $z \neq 0$ and $\lambda_4$ has to be zero (Cases 12, 14 and 16 from the figure are excluded).
Case 11. $\lambda_1 \neq 0$, $\lambda_2 = \lambda_4 = 0$, $\lambda_3 \neq 0$ (so $x, z \neq 0$ and $y = 0$):

$$\lambda_1 = yz = 0 \quad (1) \qquad \lambda_1 = xz + \lambda_3 \quad (2) \qquad \lambda_1 = xy = 0 \quad (3) \qquad x + y + z - 1 = 0 \quad (4) \qquad \lambda_3 y = 0 \quad (6)$$

Again, $y = 0$ leads to a contradiction, and therefore y has to be greater than zero, $y > 0$, and $\lambda_3 = 0$. Then Cases 12 and 15 are excluded.
Case 13. $\lambda_1 \neq 0$, $\lambda_2 \neq 0$, $\lambda_3 = \lambda_4 = 0$ (so $y, z \neq 0$ and $x = 0$). In this case we again reach a contradiction, and therefore x has to be greater than zero, $x > 0$, and $\lambda_2 = 0$. Cases 14 and 16 are excluded too.
Then, the maximizer of the problem is (1/3, 1/3, 1/3).
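As a cross-check, the same problem can be handed to a numerical solver. The sketch below (scipy, an added illustration; the starting point is an arbitrary choice) recovers (1/3, 1/3, 1/3):

```python
from scipy.optimize import minimize

# Maximize xyz subject to x + y + z <= 1 and x, y, z >= 0
objective = lambda v: -v[0] * v[1] * v[2]
constraints = [{'type': 'ineq', 'fun': lambda v: 1 - sum(v)}]  # 1 - (x+y+z) >= 0
bounds = [(0, None)] * 3                                       # x, y, z >= 0

res = minimize(objective, x0=[0.2, 0.2, 0.2], bounds=bounds,
               constraints=constraints)
print(res.x, -res.fun)  # approx [1/3, 1/3, 1/3] and 1/27
```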


The meaning of the multipliers
Let $f, g_1, \ldots, g_m$ be $C^1$ functions of k variables. Let $b = (b_1, \ldots, b_m)$ be an m-tuple of exogenous parameters and consider the problem above. Let $x_1^*(b), \ldots, x_k^*(b)$ denote the solution with multipliers $\lambda_1^*(b), \ldots, \lambda_m^*(b)$.
Suppose that $x_i^*$ and $\lambda_j^*$ are differentiable functions of $(b_1, \ldots, b_m)$ and that NDCQ holds. Then:

$$\lambda_j^*(b_1, \ldots, b_m) = \frac{\partial f}{\partial b_j}\bigl(x_1^*(b), \ldots, x_k^*(b)\bigr) \qquad j = 1, \ldots, m$$

Example. If in the previous example we change the first constraint to $x + y + z \le 0.9$, the new solution is $x^* = y^* = z^* = 0.3$, where $f(x^*, y^*, z^*) = 0.027$.
By the previous theorem, we could get an approximation of this change in f as:

$$\Delta f \approx \lambda_1^* \, \Delta b_1 = \tfrac{1}{9} \cdot (-0.1) \approx -0.011$$

$$f(0.3, 0.3, 0.3) \approx f(1/3, 1/3, 1/3) - 0.011 \approx 0.0259$$

If we change the second constraint to $x \ge -0.1$, no change will be found in the solution, as the optimal point is still inside the new constraint set. This is consistent, as this constraint is not binding at the solution: $\lambda_2^* = 0$.
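This sensitivity interpretation can be verified numerically by re-solving the problem for both values of $b_1$ (a scipy sketch added as an illustration, reusing the setup from the previous example):

```python
from scipy.optimize import minimize

def max_xyz(b):
    # Maximize xyz subject to x + y + z <= b, x, y, z >= 0; return the optimal value
    res = minimize(lambda v: -v[0] * v[1] * v[2], x0=[b / 3] * 3,
                   bounds=[(0, None)] * 3,
                   constraints=[{'type': 'ineq', 'fun': lambda v: b - sum(v)}])
    return -res.fun

actual_change = max_xyz(0.9) - max_xyz(1.0)
predicted_change = (1 / 9) * (-0.1)     # lambda_1 * delta b_1
print(actual_change, predicted_change)  # approx -0.010 vs -0.011
```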

3.4

Kuhn-Tucker formulation

The most common problem in Economics is of the form:

$$\text{Maximize } f(x_1, \ldots, x_k)$$
$$\text{subject to } g_1(x_1, \ldots, x_k) \le b_1, \ldots, g_m(x_1, \ldots, x_k) \le b_m$$
$$x_1 \ge 0, \ldots, x_k \ge 0$$
We could set:

$$L(x, \lambda, v) = f(x) - \sum_{j=1}^{m} \lambda_j[g_j(x) - b_j] + \sum_{i=1}^{k} v_i x_i$$

The first order conditions, as we have learned, are:

(a) $\dfrac{\partial L}{\partial x_i} = \dfrac{\partial f}{\partial x_i} - \lambda_1\dfrac{\partial g_1}{\partial x_i} - \ldots - \lambda_m\dfrac{\partial g_m}{\partial x_i} + v_i = 0, \quad i = 1, \ldots, k$


(b) $\lambda_j[g_j(x) - b_j] = -\lambda_j\dfrac{\partial L}{\partial \lambda_j} = 0, \quad j = 1, \ldots, m$
(c) $v_i x_i = 0, \quad i = 1, \ldots, k$
(d) $\lambda_j \ge 0, \; v_i \ge 0, \quad i = 1, \ldots, k, \; j = 1, \ldots, m$
Kuhn and Tucker worked with a Lagrangian without including the nonnegativity constraints:

$$\tilde{L}(x, \lambda_1, \ldots, \lambda_m) = f(x) - \sum_{j=1}^{m} \lambda_j[g_j(x) - b_j]$$

which is called the Kuhn-Tucker Lagrangian. Note that:

$$L(x, \lambda, v) = \tilde{L}(x, \lambda) + \sum_{i=1}^{k} v_i x_i$$

Thus, for $i = 1, \ldots, k$,

$$\frac{\partial L}{\partial x_i} = \frac{\partial \tilde{L}}{\partial x_i} + v_i = 0 \qquad \text{or} \qquad \frac{\partial \tilde{L}}{\partial x_i} = -v_i$$

Now it is straightforward to show that:

$$\frac{\partial \tilde{L}}{\partial x_i} \le 0 \qquad \text{and} \qquad x_i\frac{\partial \tilde{L}}{\partial x_i} = 0$$

On the other hand, for $j = 1, \ldots, m$,

$$\frac{\partial \tilde{L}}{\partial \lambda_j} = \frac{\partial L}{\partial \lambda_j} = b_j - g_j(x) \ge 0$$

Thus, the first order conditions in terms of the Kuhn-Tucker Lagrangian are:

$$\frac{\partial \tilde{L}}{\partial x_1} \le 0, \; \ldots, \; \frac{\partial \tilde{L}}{\partial x_k} \le 0$$

$$\frac{\partial \tilde{L}}{\partial \lambda_1} \ge 0, \; \ldots, \; \frac{\partial \tilde{L}}{\partial \lambda_m} \ge 0$$

$$x_1\frac{\partial \tilde{L}}{\partial x_1} = 0, \; \ldots, \; x_k\frac{\partial \tilde{L}}{\partial x_k} = 0$$

$$\lambda_1\frac{\partial \tilde{L}}{\partial \lambda_1} = 0, \; \ldots, \; \lambda_m\frac{\partial \tilde{L}}{\partial \lambda_m} = 0$$
Two advantages over the previous formulation:

1. $k + m$ equations instead of $2k + m$.
2. Symmetry in the way the $x_i$'s and the $\lambda_j$'s enter the first order conditions.

3.4.1

Optimization with mixed constraints

Suppose $f, g_1, \ldots, g_p, h_1, \ldots, h_m$ are $C^1$ functions of k variables. Suppose that $x^* \in \mathbb{R}^k$ is a local maximizer of f in the constraint set defined by $g_i(x) \le b_i$, $i = 1, \ldots, p$ and $h_j(x) = c_j$, $j = 1, \ldots, m$. Without loss of generality assume that the first $p_0$ inequality constraints are binding at $x^*$ whereas $p - p_0$ are not.
Suppose that the rank of the Jacobian of the binding constraints is $p_0 + m$:

$$Dgh(x^*) = \begin{pmatrix} \dfrac{\partial g_1}{\partial x_1}(x^*) & \cdots & \dfrac{\partial g_1}{\partial x_k}(x^*) \\ \vdots & & \vdots \\ \dfrac{\partial g_{p_0}}{\partial x_1}(x^*) & \cdots & \dfrac{\partial g_{p_0}}{\partial x_k}(x^*) \\ \dfrac{\partial h_1}{\partial x_1}(x^*) & \cdots & \dfrac{\partial h_1}{\partial x_k}(x^*) \\ \vdots & & \vdots \\ \dfrac{\partial h_m}{\partial x_1}(x^*) & \cdots & \dfrac{\partial h_m}{\partial x_k}(x^*) \end{pmatrix}$$
Form the Lagrangian:

$$L(x, \lambda, \mu) = f(x) - \sum_{i=1}^{p} \lambda_i[g_i(x) - b_i] - \sum_{j=1}^{m} \mu_j[h_j(x) - c_j]$$

Then, there exist multipliers $\lambda_1^*, \ldots, \lambda_p^*, \mu_1^*, \ldots, \mu_m^*$ such that:
(a) $\frac{\partial L}{\partial x_1}(x^*, \lambda^*, \mu^*) = 0, \; \ldots, \; \frac{\partial L}{\partial x_k}(x^*, \lambda^*, \mu^*) = 0$
(b) $\lambda_1^*[g_1(x^*) - b_1] = 0, \; \ldots, \; \lambda_p^*[g_p(x^*) - b_p] = 0$
(c) $h_j(x^*) = c_j, \quad j = 1, \ldots, m$
(d) $\lambda_1^* \ge 0, \; \ldots, \; \lambda_p^* \ge 0$
(e) $g_1(x^*) \le b_1, \; \ldots, \; g_p(x^*) \le b_p$

3.4.2

Envelope theorem

Unconstrained problems
Let $f(x; a)$ be a $C^1$ function of $x \in \mathbb{R}^k$ and the scalar a. For each choice of a, consider the problem:

$$\text{Max } f(x; a) \text{ with respect to } x$$

Let $x^*(a)$ be the solution of this problem and suppose that $x^*(a)$ is a $C^1$ function of a. Then,

$$\frac{df}{da}(x^*(a); a) = \frac{\partial f}{\partial a}(x^*(a); a)$$

Example. Calculate the effect of a unit increase in a on the max value of $f(x; a) = -x^2 + 2ax + 4a^2$.

$$f'(x) = -2x + 2a = 0 \;\Longrightarrow\; x^*(a) = a$$

Now, $f(x^*(a); a) = f(a; a) = 5a^2$. Applying the envelope theorem,

$$\frac{df}{da}(x^*(a); a) = \frac{\partial f}{\partial a}(a; a) = 10a$$

which means f will increase at a rate of 10a as a increases one unit.
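A symbolic check of this computation (a sympy sketch added as an illustration, not part of the original notes):

```python
import sympy as sp

x, a = sp.symbols('x a', real=True)
f = -x**2 + 2*a*x + 4*a**2

x_star = sp.solve(sp.diff(f, x), x)[0]  # x*(a) = a
value = f.subs(x, x_star)               # optimal value 5a^2
print(sp.diff(value, a))                # 10*a: total derivative of the value
print(sp.diff(f, a).subs(x, x_star))    # 10*a: partial derivative at x*, as the theorem states
```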
Constrained problems
Suppose $f, h_1, \ldots, h_m$ are $C^1$ functions of k variables and a scalar a. Let $x^*(a) = (x_1^*(a), \ldots, x_k^*(a))$ denote the solution of the problem of maximizing $f(x; a)$ on the constraint set $h_j(x; a) = 0$, $j = 1, \ldots, m$.
For any fixed choice of a, suppose that $x^*(a)$ and $\mu_j^*(a)$ are $C^1$ functions of a and that NDCQ holds. Then,


$$\frac{df}{da}(x^*(a); a) = \frac{\partial L}{\partial a}(x^*(a), \mu^*(a); a)$$
Example. Find the maximum of the function $f(x, y) = xy$ subject to:

$$g(x, y) = x^2 + ay^2 \le 1$$

Lagrangian:

$$L(x, y, \lambda) = xy - \lambda(x^2 + ay^2 - 1)$$

A solution for $a = 1$ was already calculated before, and it is $x^* = y^* = 1/\sqrt{2}$ with $\lambda^* = 1/2$. Now, the envelope theorem says that as a changes from 1 to 1.1 (a positive increase of 0.1 in a), the optimal value of f changes at the rate

$$\frac{df}{da}(x^*(a); a) = \frac{\partial L}{\partial a}(x^*(a), \lambda^*(a); a) = -\lambda y^2$$

$$\frac{\partial L}{\partial a}\left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, \tfrac{1}{2}; 1\right) = -\frac{1}{4}$$

So the optimal value of f will decrease by approximately $0.25 \cdot 0.1 = 0.025$, to 0.475. One can calculate directly that the solution to the new problem is $x^* = \tfrac{1}{\sqrt{2}}$, $y^* = \tfrac{1}{\sqrt{2.2}}$, with maximum objective value of f approx. 0.4767.
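The envelope prediction can be compared with re-solving the perturbed problem directly (a scipy sketch added as an illustration; the starting point is an arbitrary choice):

```python
from scipy.optimize import minimize

def max_xy(a):
    # Maximize xy subject to x^2 + a*y^2 <= 1; return the optimal value
    res = minimize(lambda v: -v[0] * v[1], x0=[0.5, 0.5],
                   constraints=[{'type': 'ineq',
                                 'fun': lambda v: 1 - v[0]**2 - a * v[1]**2}])
    return -res.fun

print(max_xy(1.0) - 0.025)  # envelope approximation: ~0.475
print(max_xy(1.1))          # direct recomputation: ~0.4767
```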

3.4.3

Special functions

Homogeneous functions (Chapter 20 of Simon and Blume)

For a scalar $r \in \mathbb{R}$, a real-valued function $f(x_1, \ldots, x_k)$ is homogeneous of degree r if:

$$f(tx_1, \ldots, tx_k) = t^r f(x_1, \ldots, x_k) \qquad \forall (x_1, \ldots, x_k), \; \forall t > 0$$

$y = ax^k$ is a homogeneous function of degree k, while $y = x^3 + 4x^2$ is not homogeneous at all. A monomial $z = ax_1^{k_1}x_2^{k_2}x_3^{k_3}$ is a homogeneous function of degree $k_1 + k_2 + k_3$. A function that is a sum of monomials of the same degree is a homogeneous function. If a function is composed of monomials of different degrees, it is not a homogeneous function.
In Economics, production functions are usually homogeneous functions. Homogeneity of degree 1 is equivalent to constant returns to scale (double input, double output). If $r > 1$ (doubling the input multiplies the output by $2^r$), the firm exhibits increasing returns to scale, whereas if $r < 1$, decreasing returns to scale.
Properties


(a) If $f \in C^1$ is homogeneous of degree r, its partial derivatives are homogeneous of degree $r - 1$.
(b) Let $f \in C^1$ on $\mathbb{R}^k_+$. The tangent planes to the level sets of f have constant slope along each ray from the origin.
Example. Suppose that $u(x_1, x_2)$ is a homogeneous utility function. Fixing $(p_1, p_2, I_0)$, we want to maximize $u(x_1, x_2)$ subject to:

$$p_1 x_1 + p_2 x_2 \le I_0$$

$x^0$ is the solution. If we increase I to $I_1$, we get a new solution $x^1$. The optimal bundle demanded at different income levels is called the Income Expansion Path, and by (b) it is a ray from the origin for homogeneous utility functions.

Figure 3.8: Homogeneous utility function


(c) (Euler's theorem) If $f \in C^1$ is homogeneous of degree r on $\mathbb{R}^k_+$, then:

$$x_1\frac{\partial f}{\partial x_1}(x) + \ldots + x_k\frac{\partial f}{\partial x_k}(x) = rf(x), \qquad \text{i.e.,} \quad x \cdot \nabla f(x) = rf(x)$$
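Property (c) is easy to confirm symbolically for a concrete homogeneous function; the sympy sketch below (an added illustration) uses the Cobb-Douglas function $f = 4x_1^{3/4}x_2^{1/4}$, which is homogeneous of degree $r = 1$:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2', positive=True)
f = 4 * x1**sp.Rational(3, 4) * x2**sp.Rational(1, 4)  # homogeneous of degree 1

# Euler's formula: x1*f_x1 + x2*f_x2 should equal r*f with r = 1
euler_sum = x1 * sp.diff(f, x1) + x2 * sp.diff(f, x2)
print(sp.simplify(euler_sum - f))  # 0
```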


3.4.4

Concave and quasiconcave functions (Chapter 21 of Simon and Blume)

Convex set
A set U is a convex set if whenever x and y are points in U, the line segment from x to y,

$$l(x, y) = \{tx + (1 - t)y \mid 0 \le t \le 1\},$$

is also in U.

Figure 3.9: Convex and not convex sets

Concave function
A real-valued function f defined on a convex set U of $\mathbb{R}^k$ is concave if $\forall x, y \in U$ and $\forall t \in [0, 1]$,

$$f(tx + (1 - t)y) \ge tf(x) + (1 - t)f(y)$$

Convex function
A real-valued function g defined on a convex set U of $\mathbb{R}^k$ is convex if $\forall x, y \in U$ and $\forall t \in [0, 1]$,

$$g(tx + (1 - t)y) \le tg(x) + (1 - t)g(y)$$


Figure 3.10: Concave function


Note that if f is concave, then $-f$ is convex. Concavity and convexity are always defined on convex sets. In Economics, almost all functions, especially utility and production functions, have convex sets as their natural domains.

Theorems
Theorem 1. Let f be a $C^1$ function on an interval I in $\mathbb{R}$. Then, f is concave (convex) if and only if:

$$f(y) - f(x) \le (\ge) \; f'(x)(y - x) \qquad \forall x, y \in I$$

Theorem 2. Let f be a $C^1$ function on a convex subset U of $\mathbb{R}^k$. Then f is concave (convex) on U if and only if $\forall x, y \in U$:

$$f(y) - f(x) \le (\ge) \; Df(x)(y - x)$$

Theorem 3. Let f be a $C^2$ function on an open convex subset U of $\mathbb{R}^k$. Then, f is a concave (convex) function on U if and only if the Hessian matrix $D^2 f(x)$ is negative (positive) semidefinite $\forall x \in U$.
Theorem 4 (Global maxima and minima). Let f be a concave (convex) function defined on an open convex subset U of $\mathbb{R}^k$. If $x_0$ is a critical point of f, $Df(x_0) = 0$, then $x_0$ is a global maximizer (minimizer) of f on U.

Quasiconcave functions
A function f defined on a convex subset $U \subseteq \mathbb{R}^k$ is quasiconcave if $\forall x, y \in U$ and $\forall t \in [0, 1]$,

$$f(tx + (1 - t)y) \ge \min\{f(x), f(y)\}$$

Alternative definition: f is quasiconcave if $\forall a \in \mathbb{R}$, $C_a = \{x \in U \mid f(x) \ge a\}$ is a convex set.
Example. Every Cobb-Douglas function $F(x, y) = Ax^a y^b$ with $A, a, b > 0$ is quasiconcave.
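The defining inequality can be spot-checked numerically for a Cobb-Douglas function (a sketch added as an illustration; $A = 2$, $a = b = 0.5$ and the sampled points are arbitrary choices):

```python
import numpy as np

F = lambda v: 2.0 * v[0]**0.5 * v[1]**0.5  # Cobb-Douglas with A = 2, a = b = 0.5

rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.uniform(0.1, 10, size=2)
    y = rng.uniform(0.1, 10, size=2)
    t = rng.uniform()
    # Quasiconcavity: F(tx + (1-t)y) >= min(F(x), F(y)), up to rounding
    assert F(t * x + (1 - t) * y) >= min(F(x), F(y)) - 1e-9
print("quasiconcavity inequality holds on all sampled pairs")
```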

Chapter 4

Analysis
4.1

Sequences of real numbers

A sequence of real numbers is a function $f: \mathbb{N} \to \mathbb{R}$. This function relates a natural number to a real number. Usually, the sequence is considered to be not the function f but the images:

$$\{x_n\}_{n=1}^{\infty} = \{f(n)\}_{n=1}^{\infty}$$

4.1.1

Convergent sequence

A sequence $\{x_n\}_{n=1}^{\infty}$ converges to a limit x if $\forall \epsilon > 0$ there exists a number N such that,

$$|x_n - x| < \epsilon \qquad \forall n \ge N$$

$$\lim_{n \to \infty} x_n = x$$
Examples. Sequences converging to 0:

$$1, \, 0, \, \tfrac{1}{2}, \, 0, \, \tfrac{1}{3}, \, 0, \, \ldots$$

$$1, \, \tfrac{1}{2}, \, \tfrac{1}{3}, \, \tfrac{1}{4}, \, \ldots$$

$$1, \, \tfrac{3}{1}, \, \tfrac{1}{2}, \, \tfrac{3}{2}, \, \tfrac{1}{3}, \, \tfrac{3}{3}, \, \tfrac{1}{4}, \, \ldots$$


Cauchy sequence
It is a sequence $\{x_n\}_{n=1}^{\infty}$ such that $\forall \epsilon > 0$ there exists a number N such that,

$$|x_n - x_m| < \epsilon \qquad \forall n \ge N, \; \forall m \ge N$$

Proposition: Any convergent sequence is a Cauchy sequence.
Theorem. Let $\{x_n\}_{n=1}^{\infty}$, $\{y_n\}_{n=1}^{\infty}$ be sequences with limits x and y respectively. Then, the sequence $\{x_n + y_n\}_{n=1}^{\infty}$ converges to the limit $x + y$ and the sequence $\{x_n y_n\}_{n=1}^{\infty}$ converges to $xy$.

Monotone sequences
A sequence is monotone increasing (decreasing) if $x_{n+1} \ge (\le) \; x_n$ $\forall n \in \mathbb{N}$. It is monotone if it is either monotone increasing or monotone decreasing.
Bounded sequence: A sequence is bounded if there is a number B such that

$$|x_n| \le B \qquad \forall n$$

Theorem: Every bounded monotone sequence converges.


Example. Consider the sequence $\{a_n\}$:

$$a_n = p\left(1 + \frac{r}{12}\right)^n$$

in which p is the investment capital, $a_n$ is the accounting balance after n months, and r is the annual compound interest rate.
This sequence is divergent, as:

$$\lim_{n \to \infty} p\left(1 + \frac{r}{12}\right)^n = p \lim_{n \to \infty}\left(1 + \frac{r}{12}\right)^n = p \lim_{n \to \infty} k^n = \infty$$

Notice that the limit tends to infinity as $k = 1 + \frac{r}{12} > 1$.
Besides, this sequence is monotone increasing.
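A small numeric illustration of this sequence (a sketch added here; the values of p and r are arbitrary choices):

```python
# Monthly compound balances a_n = p * (1 + r/12)^n for sample parameters
p, r = 1000.0, 0.06  # initial capital and annual interest rate (assumed values)
k = 1 + r / 12       # monthly growth factor, k > 1

balances = [p * k**n for n in range(0, 361, 120)]
print(balances)      # strictly increasing and unbounded as n grows
```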


4.2

Open sets

Open ball: For a point $z \in \mathbb{R}^k$ and $\epsilon > 0$, the open $\epsilon$-ball about z is:

$$B_\epsilon(z) = \{x \in \mathbb{R}^k \mid \|x - z\| < \epsilon\}$$

Open set: A set S in $\mathbb{R}^k$ is open if for each $x \in S$, there exists an open $\epsilon$-ball about x completely contained in S:

$$\forall x \in S \;\; \exists \epsilon > 0 \mid B_\epsilon(x) \subseteq S$$

Examples. $(0, 1)$ is an open set in $\mathbb{R}$, but not in $\mathbb{R}^2$. $S = \{x \in \mathbb{R}^k \mid \|x\| \le 1\}$ is not open.
Theorem
The intersection of a finite number of open subsets of $\mathbb{R}^k$ is an open set.
The union of an arbitrary collection of open sets is an open set.
Interior of a set S: The union of all open sets contained in S (the largest open set contained in S).
Example. $S = \{(x, y) \in \mathbb{R}^2 \mid 0 < x \le 1\}$. Int(S) consists of the points of S having an open $\epsilon$-ball contained in S:

$$\operatorname{Int}(S) = \{(x, y) \in \mathbb{R}^2 \mid 0 < x < 1\}$$

Closed set. A set $S \subseteq \mathbb{R}^k$ is closed if whenever $\{x_n\}_{n=1}^{\infty}$ is a convergent sequence completely contained in S, its limit is also contained in S. Equivalently, a set S is closed if $S^c = \mathbb{R}^k \setminus S$ is open.
Theorem

Any intersection of closed sets is closed.

The finite union of closed sets is closed.

4.3

Continuity of functions

4.3.1

Continuous function at x0

Let $f: \mathbb{R}^k \to \mathbb{R}^m$ and $x_0 \in \mathbb{R}^k$. f is continuous at $x_0$ if $\forall \epsilon > 0$ $\exists \delta > 0$ such that $\|x - x_0\| < \delta$ implies that $\|f(x) - f(x_0)\| < \epsilon$. f is continuous if it is continuous at every point in its domain.
f is continuous if and only if for every convergent sequence in its domain, $x_n \to x_0$ implies $f(x_n) \to f(x_0)$.
Example.

$$f(x) = \begin{cases} 1 & x > 0 \\ 0 & x \le 0 \end{cases}$$

The sequence $1/n$ converges to 0, but $f(1/n) = 1$, which is not $f(0) = 0$.

4.3.2

Uniformly continuous function


Let $f: \mathbb{R}^k \to \mathbb{R}^m$ and $B \subseteq \mathbb{R}^k$. We say f is uniformly continuous on B if $\forall \epsilon > 0$ $\exists \delta > 0$ such that $\forall x, y \in B$, $\|x - y\| < \delta$ implies that $\|f(x) - f(y)\| < \epsilon$. Clearly, if f is uniformly continuous, then f is continuous.
Example. $f: \mathbb{R} \to \mathbb{R}$, $f(x) = x^2$ is continuous, but not uniformly continuous: given $\epsilon > 0$ and $x_0 > 0$, we want to choose $\delta$ such that

$$|x - x_0| < \delta \;\Longrightarrow\; |x^2 - x_0^2| < \epsilon$$

Clearly, as $x_0$ gets bigger, to keep the same $\epsilon$ the $\delta$ has to be smaller, implying that given $\epsilon > 0$, there is not a single $\delta > 0$ which works for every $x_0 \in \mathbb{R}$.
