
MASTER OF ECONOMICS AND FINANCE
School of Economics and Business Administration
Academic year 2015

Mathematics for Economists

Lecture notes

Ignacio Rodríguez Carreño
irodriguezc@unav.es
Economics Department

Contents
1 Linear Algebra
1.1 Matrix Algebra (Chapter 8 of Simon and Blume)
1.1.1 Operations with matrices
1.1.2 Laws of matrix algebra
1.1.3 Transposition of a matrix
1.1.4 Special kinds of matrices
1.2 Determinant of a matrix (Chapter 9 of Simon and Blume)
1.2.1 Properties of the determinant
1.3 Rank of a matrix (rk(A))
1.3.1 Using determinants
1.3.2 Using the Gauss-Jordan method
1.3.3 Using vectors
1.4 Invertible matrices
1.4.1 Inverse matrix using cofactors
1.4.2 Properties
1.5 Systems of linear equations
1.5.1 Rouché-Frobenius theorem
1.5.2 Solution of systems of linear equations
1.6 Eigenvalues and eigenvectors of a matrix
1.6.1 Diagonalization of a matrix
1.6.2 Properties of diagonalization
1.7 Application: linear difference equations (Chapter 23 of Simon and Blume)
1.7.1 One-dimensional equations
1.7.2 k-dimensional systems

2 Multivariate Calculus
2.1 Functions (Chapter 13 of Simon and Blume)
2.1.1 Special functions
2.1.2 Classification of functions
2.1.3 Composition of functions
2.2 Derivatives of multivariate functions
2.2.1 Partial derivatives
2.2.2 The total differential
2.2.3 The Chain Rule
2.2.4 Higher order derivatives
2.3 Taylor's series approximation (Chapter 30 of Simon and Blume)
2.3.1 Mean value theorem
2.3.2 Taylor's series approximation on R
2.3.3 Taylor's series approximation on Rk
2.4 Implicit function theorem (Chapter 15 of Simon and Blume)
2.4.1 The implicit function theorem for R2
2.4.2 The implicit function theorem for Rk
2.4.3 The implicit function theorem for systems of implicit functions

3 Optimization
3.1 Unconstrained optimization
3.1.1 Theorem 1
3.1.2 Theorem 2
3.1.3 Theorem 3
3.2 Optimization with equality constraints
3.2.1 Two variables and one equality constraint
3.2.2 Several equality constraints
3.3 Optimization with inequality constraints
3.3.1 One inequality constraint
3.3.2 Several inequality constraints
3.4 Kuhn-Tucker formulation
3.4.1 Optimization with mixed constraints
3.4.2 Envelope theorem
3.4.3 Special functions
3.4.4 Concave and quasiconcave functions (Chapter 21 of Simon and Blume)

4 Analysis
4.1 Sequences of real numbers
4.1.1 Convergent sequence
4.2 Open sets
4.3 Continuity of functions
4.3.1 Continuous function at x0
4.3.2 Uniformly continuous function

Chapter 1

Linear Algebra

1.1 Matrix Algebra (Chapter 8 of Simon and Blume)

A matrix is simply a rectangular array of numbers. So, any table of data is a matrix.

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}$$

The dimension of the matrix is dim(A) = m × n, where m is the number of rows and n the number of columns.
a_{ij}: element of the matrix in the i-th row and j-th column.
M_{m×n} = {set of matrices of dimension m × n}

1.1.1 Operations with matrices

Addition of matrices

Given A = (a_{ij}) and B = (b_{ij}) two matrices such that dim(A) = dim(B), the addition matrix of A and B is defined as:

$$A + B = (a_{ij} + b_{ij}) = \begin{pmatrix} a_{11}+b_{11} & a_{12}+b_{12} & \cdots & a_{1n}+b_{1n} \\ a_{21}+b_{21} & a_{22}+b_{22} & \cdots & a_{2n}+b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1}+b_{m1} & a_{m2}+b_{m2} & \cdots & a_{mn}+b_{mn} \end{pmatrix}$$

Example.

$$\begin{pmatrix} 1 & 2 & -4 \\ 0 & 7 & 8 \\ -1 & 5 & -5 \end{pmatrix} + \begin{pmatrix} 0 & 4 & 2 \\ -5 & 4 & 0 \\ -1 & 0 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 6 & -2 \\ -5 & 11 & 8 \\ -2 & 5 & -3 \end{pmatrix}$$

Scalar multiplication

Given A ∈ M_{m×n} and a real number λ ∈ R, λA is the matrix (λa_{ij}):

$$\lambda A = \lambda \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} = \begin{pmatrix} \lambda a_{11} & \lambda a_{12} & \cdots & \lambda a_{1n} \\ \vdots & \vdots & \ddots & \vdots \\ \lambda a_{m1} & \lambda a_{m2} & \cdots & \lambda a_{mn} \end{pmatrix}$$

Example.

$$5 \begin{pmatrix} 1 & 2 & -4 \\ 0 & 7 & 8 \\ -1 & 5 & -5 \end{pmatrix} = \begin{pmatrix} 5 & 10 & -20 \\ 0 & 35 & 40 \\ -5 & 25 & -25 \end{pmatrix}$$

Matrix multiplication

Definition 1: Given a row matrix A and a column matrix B,

$$A = (a_1, a_2, \ldots, a_n) \qquad B = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}$$

we define the multiplication A · B as the scalar:

$$A \cdot B = (a_1, a_2, \ldots, a_n)\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} = a_1 b_1 + \ldots + a_n b_n = \sum_{i=1}^{n} a_i b_i$$

Definition 2: Given a matrix A of dimension m × n and a matrix B of dimension n × p, we define C = A · B as the matrix (c_{ij}) in which the element c_{ij} is equal to:

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$

The dimension of the matrix C is dim(C) = m × p.

Example.

$$\begin{pmatrix} 1 & 2 & -4 \\ 0 & 7 & 8 \\ -1 & 5 & -5 \end{pmatrix}\begin{pmatrix} 0 & 4 & 2 \\ -5 & 4 & 0 \\ -1 & 0 & 2 \end{pmatrix} = \begin{pmatrix} -6 & 12 & -6 \\ -43 & 28 & 16 \\ -20 & 16 & -12 \end{pmatrix}$$

Therefore, if A is a matrix in M_{m×n} and B is a matrix in M_{r×s}:

A · B exists if n = r.
B · A exists if s = m.
In general, A · B ≠ B · A: matrix multiplication is not commutative!
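The multiplication rule is easy to check numerically. Below is a minimal Python/NumPy sketch (an added illustration, assuming NumPy is available) that computes C = A · B entry by entry from the formula for c_{ij} and confirms that the product is not commutative:

```python
import numpy as np

A = np.array([[1, 2, -4], [0, 7, 8], [-1, 5, -5]])
B = np.array([[0, 4, 2], [-5, 4, 0], [-1, 0, 2]])

m, n = A.shape
n2, p = B.shape
assert n == n2  # A.B exists only if the inner dimensions agree

# c_ij = sum_k a_ik * b_kj
C = np.zeros((m, p), dtype=int)
for i in range(m):
    for j in range(p):
        C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))

print(np.array_equal(C, A @ B))      # True: matches NumPy's matrix product
print(np.array_equal(A @ B, B @ A))  # False: multiplication is not commutative
```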

1.1.2 Laws of matrix algebra

Associative laws.
∀A, B, C ∈ M_{m×n}: (A + B) + C = A + (B + C)
∀A ∈ M_{m×n}, B ∈ M_{n×r} and C ∈ M_{r×p}: (A · B) · C = A · (B · C)

Commutative law for addition. ∀A, B ∈ M_{m×n}: A + B = B + A

Distributive laws.
∀A ∈ M_{n×m} and B, C ∈ M_{m×r}: A · (B + C) = A · B + A · C
∀B, C ∈ M_{n×m} and A ∈ M_{m×r}: (B + C) · A = B · A + C · A

1.1.3 Transposition of a matrix

A^t = (a_{ji}) is the transposed matrix, obtained by interchanging the rows and columns of A:

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \qquad A^t = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{pmatrix}$$

Example.

$$A = \begin{pmatrix} -1 & 0 & 3 & 7 \\ 1 & 5 & 5 & 9 \end{pmatrix} \qquad A^t = \begin{pmatrix} -1 & 1 \\ 0 & 5 \\ 3 & 5 \\ 7 & 9 \end{pmatrix}$$

1.1.4 Special kinds of matrices

Square matrix. A ∈ M_{m×n} such that m = n.
Column matrix. A ∈ M_{m×n} such that n = 1.
Row matrix. A ∈ M_{m×n} such that m = 1.
Diagonal matrix. A = (a_{ij}) such that a_{ij} = 0 if i ≠ j.

Example.

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 7 & 0 \\ 0 & 0 & 15 \end{pmatrix}$$

Upper (lower) triangular matrix. A = (a_{ij}) such that a_{ij} = 0 if i > j (if i < j).

Example.

$$\begin{pmatrix} 1 & 2 & 4 \\ 0 & 7 & 8 \\ 0 & 0 & 5 \end{pmatrix} \qquad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 7 & 0 \\ 1 & 5 & 5 \end{pmatrix}$$

Symmetric matrix. A ∈ M_{n×n} such that A = A^t, i.e. a_{ij} = a_{ji} ∀i ≠ j. Example.

$$A = \begin{pmatrix} 1 & -1 & 3 \\ -1 & 7 & 0 \\ 3 & 0 & 5 \end{pmatrix}$$

Skew-symmetric matrix. A ∈ M_{n×n} such that A = −A^t, i.e. a_{ji} = −a_{ij} ∀i ≠ j and a_{ii} = 0 ∀i. Example.

$$A = \begin{pmatrix} 0 & 1 & 3 \\ -1 & 0 & 2 \\ -3 & -2 & 0 \end{pmatrix}$$

Identity matrix. I_{n×n} = I_n:

$$I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$

1.2 Determinant of a matrix (Chapter 9 of Simon and Blume)

The determinant of a square matrix is a number. Determinants of 2 × 2 and 3 × 3 matrices are straightforward to compute using the rule of Sarrus.

2 × 2:

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$$

3 × 3:

$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - [(a_{31}a_{22}a_{13}) + (a_{32}a_{23}a_{11}) + (a_{33}a_{12}a_{21})]$$

Examples.

$$\begin{vmatrix} 3 & 2 \\ 6 & 4 \end{vmatrix} = 0$$

$$\begin{vmatrix} 0 & 4 & 2 \\ -5 & 4 & 0 \\ -1 & 0 & 2 \end{vmatrix} = 0 \cdot 4 \cdot 2 + 4 \cdot 0 \cdot (-1) + 2 \cdot (-5) \cdot 0 - 2 \cdot 4 \cdot (-1) - 0 \cdot 0 \cdot 0 - 2 \cdot (-5) \cdot 4 = 48$$

1.2.1 Properties of the determinant

1. |A| = |A^t|
2. |A · B| = |A| · |B|, ∀A, B ∈ M_{n×n}
3. In general, |A + B| ≠ |A| + |B|, ∀A, B ∈ M_{n×n}
4. If one forms a matrix B by interchanging 2 rows or 2 columns of A, then |B| = −|A|.
5. If two rows (columns) are equal, then |A| = 0.
6. If a common factor λ multiplies a row (column), it can be taken out of the determinant:

$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ \lambda a_{i1} & \lambda a_{i2} & \cdots & \lambda a_{in} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} = \lambda \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{in} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$

7. |λA| = λ^n |A|, for A ∈ M_{n×n}.

8. If a matrix A has an all-zero row (column), then |A| = 0.

9. The determinant of an upper (or lower) triangular matrix is the product of its diagonal entries. Example.

$$\begin{vmatrix} 3 & 0 & 0 & 0 \\ 1 & -2 & 0 & 0 \\ 3 & 2 & 6 & 0 \\ 2 & 1 & 2 & 4 \end{vmatrix} = 3 \cdot (-2) \cdot 6 \cdot 4$$


10. If a linear combination of several rows (columns) is added to one row (column), then |A| does not change.

Example.

$$\begin{vmatrix} 3 & 4 \\ 2 & -2 \end{vmatrix} \neq \begin{vmatrix} 3 & 4 \\ 0 & -14 \end{vmatrix} \quad (R_2' = 3R_2 - 2R_1 \text{ changes the determinant})$$

$$\begin{vmatrix} 3 & 4 \\ 2 & -2 \end{vmatrix} = \frac{1}{3}\begin{vmatrix} 3 & 4 \\ 0 & -14 \end{vmatrix} \qquad \begin{vmatrix} 3 & 4 \\ 2 & -2 \end{vmatrix} = \begin{vmatrix} 3 & 4 \\ 0 & -14/3 \end{vmatrix} \quad (R_2' = R_2 - \tfrac{2}{3}R_1)$$

For square matrices of order greater than 3, let us introduce two definitions:

Definition 1. The minor of a_{ij}, M_{ij}, is the determinant of the submatrix obtained by deleting row i and column j from A.
Definition 2. The cofactor of a_{ij}, C_{ij}, is defined as C_{ij} = (−1)^{i+j} M_{ij}.

Example.

$$A = \begin{pmatrix} 0 & 4 & 2 \\ -5 & 4 & 0 \\ -1 & 0 & 2 \end{pmatrix} \qquad M_{11} = \begin{vmatrix} 4 & 0 \\ 0 & 2 \end{vmatrix} = 8 \qquad C_{11} = (-1)^{1+1} M_{11} = 1 \cdot 8 = 8$$

The determinant of an n × n matrix A is given by:

|A| = a_{i1} C_{i1} + a_{i2} C_{i2} + ... + a_{in} C_{in}   (expansion along the i-th row)
|A| = a_{1j} C_{1j} + a_{2j} C_{2j} + ... + a_{nj} C_{nj}   (expansion along the j-th column)

Example.

Consider the determinant of a 4 × 4 matrix A whose second row is (1, 0, −3, 2). Expanding along that row:

$$|A| = 1 \cdot (-1)^{2+1} M_{21} + 0 \cdot (-1)^{2+2} M_{22} + (-3) \cdot (-1)^{2+3} M_{23} + 2 \cdot (-1)^{2+4} M_{24}$$

where each M_{2j} is the 3 × 3 minor obtained by deleting row 2 and column j. Evaluating the minors:

$$|A| = 1 \cdot (-1) \cdot (-29) + 0 + 3 \cdot 28 + 2 \cdot (-7) = 29 + 84 - 14 = 99$$

By using the last property: the column operations C_3' = C_3 + 3C_1 and C_4' = C_4 − 2C_1 turn the second row into (1, 0, 0, 0), so only one cofactor survives in the expansion:

$$|A| = 1 \cdot (-1)^{2+1} \begin{vmatrix} 2 & 10 & -5 \\ 2 & 9 & 1 \\ 1 & 8 & 0 \end{vmatrix} = (-1) \cdot (-99) = 99$$

1.3 Rank of a matrix (rk(A))

1.3.1 Using determinants

The rank of a matrix A ∈ M_{m×n} is the order of the greatest minor of A that is different from zero. We say that the rank of A is p, written rk(A) = p, if:
(i) A minor of order p exists and is nonzero.
(ii) All the minors of order greater than p do not exist or are zero.

Example. Given the matrix A = (1 4 2; 5 2 1; 1 0 2), we have that:
(a) Any element of the matrix is a minor of order 1 of the matrix. As at least one element of the matrix is nonzero (for instance, a_{11} = 1), the rank of the matrix is at least 1: rk(A) ≥ 1. a_{11} is the principal minor of order 1.
(b) The principal minor of order 2 and the remaining minors of order 2 of A are

$$\begin{vmatrix} 1 & 4 \\ 5 & 2 \end{vmatrix}, \quad \begin{vmatrix} 1 & 4 \\ 1 & 0 \end{vmatrix}, \quad \begin{vmatrix} 1 & 2 \\ 5 & 1 \end{vmatrix}, \quad \begin{vmatrix} 1 & 2 \\ 1 & 2 \end{vmatrix}, \quad \begin{vmatrix} 4 & 2 \\ 2 & 1 \end{vmatrix}, \quad \begin{vmatrix} 4 & 2 \\ 0 & 2 \end{vmatrix}, \ldots$$

As at least one of them is nonzero, the rank of the matrix is not 1 but at least 2: rk(A) ≥ 2.
(c) The unique minor of order 3 of the matrix is its determinant:

$$\begin{vmatrix} 1 & 4 & 2 \\ 5 & 2 & 1 \\ 1 & 0 & 2 \end{vmatrix} = -36 \neq 0$$

As it is different from zero, we finally conclude that the rank is 3: rk(A) = 3.

1.3.2 Using the Gauss-Jordan method

The Gauss-Jordan method is used to reduce a matrix to row echelon form. The rank of the matrix is the number of nonzero rows in its row echelon form.
To reduce a matrix to its row echelon form, we apply elementary row operations:
Interchange two rows of the matrix.
Change a row by adding to it a linear combination of the other rows.
A row of a matrix A has k leading zeros if the first k elements of the row are all zero and the (k + 1)-th element of the row is not zero. A matrix is in row echelon form if each row has more leading zeros than the row preceding it (unless the row contains only zeros, in which case the subsequent rows must contain only zeros).
Example.

$$A = \begin{pmatrix} 1 & 0 & 3 \\ 2 & 3 & 6 \\ 4 & 6 & 11 \end{pmatrix} \xrightarrow[R_3' = R_3 - 4R_1]{R_2' = R_2 - 2R_1} \begin{pmatrix} 1 & 0 & 3 \\ 0 & 3 & 0 \\ 0 & 6 & -1 \end{pmatrix} \xrightarrow{R_3'' = R_3' - 2R_2'} \begin{pmatrix} 1 & 0 & 3 \\ 0 & 3 & 0 \\ 0 & 0 & -1 \end{pmatrix}$$

The row echelon form has three nonzero rows, so rk(A) = 3.
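The same reduction can be reproduced numerically. The sketch below (an added Python/NumPy illustration) applies the elementary row operations and confirms the rank:

```python
import numpy as np

A = np.array([[1., 0., 3.], [2., 3., 6.], [4., 6., 11.]])

R = A.copy()
R[1] -= 2 * R[0]  # R2' = R2 - 2 R1
R[2] -= 4 * R[0]  # R3' = R3 - 4 R1
R[2] -= 2 * R[1]  # R3'' = R3' - 2 R2'
print(R)  # row echelon form with three nonzero rows

print(np.linalg.matrix_rank(A))  # 3
```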

1.3.3 Using vectors

Given A ∈ M_{m×n} with m row vectors r_1, r_2, ..., r_m and n column vectors c_1, c_2, ..., c_n, the rank of A is the maximum number of linearly independent column or row vectors.

Let us introduce the notion of a vector space (V, +, ·). For all u, v, w in V and all scalars α, β ∈ R, V is a vector space over R if:
1. u + v ∈ V
2. u + v = v + u
3. (u + v) + w = u + (v + w)
4. There is an element, the null vector 0, such that v + 0 = 0 + v = v.
5. ∀v ∈ V, v + (−v) = 0
6. αu ∈ V
7. α(u + v) = αu + αv
8. (α + β)u = αu + βu
9. (αβ)u = α(βu)
10. 1 · u = u, where 1 ∈ R.

If V = Rn (the set of n-tuples), with + the addition of vectors in Rn and · the scalar multiplication of vectors, then Rn satisfies all these properties and (Rn, +, ·) is a vector space.
A subset of Rn that satisfies the above properties is called a subspace of Rn.
Theorem. A subset S of Rn is a subspace of Rn if and only if ∀u, v ∈ S and ∀α, β ∈ R:

αu + βv ∈ S
Examples.
1. {0} is a subspace of Rn.
2. S = {(1, 0), (0, 0), (0, 1)} is a subset of R2 but not a subspace, as: (1, 0) + (1, 0) ∉ S.
3. T = {(x, y) ∈ R2 | x − y = 0} is a subspace of R2.
The concepts needed in Rn are vectors, addition of vectors and scalar multiplication.
Vectors u_1, u_2, ..., u_n in Rn are linearly independent if and only if (Chapter 11 of Simon and Blume):

$$\lambda_1 u_1 + \lambda_2 u_2 + \ldots + \lambda_n u_n = \sum_{i=1}^{n} \lambda_i u_i = 0 \implies \lambda_i = 0, \quad i = 1, \ldots, n$$

For scalars λ_i, the expression λ_1 u_1 + λ_2 u_2 + ... + λ_n u_n is called a linear combination of u_1, u_2, ..., u_n.

The set of linear combinations of u_1, u_2, ..., u_n is called the set generated or spanned by u_1, u_2, ..., u_n. It is always a vector subspace:

L[u_1, u_2, ..., u_n] = {λ_1 u_1 + λ_2 u_2 + ... + λ_n u_n | λ_i ∈ R, i = 1, ..., n}

Let u_1, u_2, ..., u_n be a collection of vectors in V (a vector space or subspace). Then u_1, u_2, ..., u_n form a basis of V if:
u_1, u_2, ..., u_n span V, and
u_1, u_2, ..., u_n are linearly independent.
The dimension of V is the number of vectors in any of its bases; here dim(V) = n.
Examples.
In the vector space R2, D = {(1, 0), (1, 1)} is a basis of R2, as the dimension of R2 is two and these two vectors are linearly independent:

$$|A| = \begin{vmatrix} 1 & 1 \\ 0 & 1 \end{vmatrix} \neq 0$$

Then rk(A) = 2 = number of linearly independent vectors, so the two vectors form a basis of R2.
In the vector space R3, D = {(1, 1, 0), (1, 2, 1), (3, 0, 5)} is a basis of R3, as it is formed by 3 linearly independent vectors.

1.4 Invertible matrices

Given A ∈ M_{n×n}, its inverse matrix A^{-1} is the matrix that satisfies:

A^{-1} A = A A^{-1} = I_n

(A has to be a nonsingular matrix, |A| ≠ 0.)

1.4.1 Inverse matrix using cofactors

$$A^{-1} = \frac{1}{|A|}(A^d)^t = \frac{1}{|A|}\begin{pmatrix} C_{11} & C_{21} & \cdots & C_{n1} \\ C_{12} & C_{22} & \cdots & C_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ C_{1n} & C_{2n} & \cdots & C_{nn} \end{pmatrix}$$

with C_{ij} the cofactor of a_{ij}.
Example.

$$A = \begin{pmatrix} 2 & 2 & 2 \\ -2 & 1 & 0 \\ 3 & 2 & 2 \end{pmatrix} \qquad |A| = -2 \qquad A^d = \begin{pmatrix} 2 & 4 & -7 \\ 0 & -2 & 2 \\ -2 & -4 & 6 \end{pmatrix} \qquad (A^d)^t = \begin{pmatrix} 2 & 0 & -2 \\ 4 & -2 & -4 \\ -7 & 2 & 6 \end{pmatrix}$$

$$A^{-1} = \frac{1}{|A|}(A^d)^t = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 1 & 2 \\ 7/2 & -1 & -3 \end{pmatrix}$$
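The cofactor construction can be verified numerically. A minimal Python/NumPy sketch (an added illustration) builds the cofactor matrix for the example above and checks that A · A^{-1} = I:

```python
import numpy as np

A = np.array([[2., 2., 2.], [-2., 1., 0.], [3., 2., 2.]])

# cofactor matrix: C_ij = (-1)^(i+j) * M_ij
C = np.zeros_like(A)
for i in range(3):
    for j in range(3):
        minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
        C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)

A_inv = C.T / np.linalg.det(A)  # A^{-1} = (A^d)^t / |A|
print(np.allclose(A @ A_inv, np.eye(3)))  # True
```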

1.4.2 Properties

The inverse matrix is unique.
A matrix A is invertible ⟺ |A| ≠ 0 (A nonsingular).
Given A, B invertible matrices: (A · B)^{-1} = B^{-1} A^{-1}
Given A nonsingular: (A^t)^{-1} = (A^{-1})^t and |A^{-1}| = 1/|A|
If A is nonsingular: A^m = A · A · ... · A (m times), A^{-m} = (A^{-1})^m, and A^m is invertible with (A^m)^{-1} = (A^{-1})^m.
For any scalar λ ≠ 0, λA is invertible and: (λA)^{-1} = (1/λ) A^{-1}

1.5 Systems of linear equations

A system of m linear equations with n unknowns or variables x_1, x_2, ..., x_n:

$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n &= b_2 \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \ldots + a_{mn}x_n &= b_m \end{aligned}$$

whose matrix expression is:

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}$$

Therefore:

AX = B

A is the coefficient matrix of fixed numbers a_{ij}, with 1 ≤ i ≤ m and 1 ≤ j ≤ n. X is the matrix of the variables x_i, of dimension n × 1. B is the column matrix of the independent terms b_j, of dimension m × 1.
The augmented matrix of the system is:

$$(A|B) = \left(\begin{matrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{matrix}\;\middle|\;\begin{matrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{matrix}\right)$$
Some questions arise:


Does a solution exist?
How many solutions are there?
Is there an efficient algorithm that computes actual solutions?

1.5.1 Rouché-Frobenius theorem

A system of linear equations with m equations and n variables has a solution if and only if rk(A) = rk(A|B). Then:
If rk(A) = rk(A|B), the system has a solution:
(a) If rk(A) = rk(A|B) = n (n: number of variables), the system has a unique solution.
(b) If rk(A) = rk(A|B) < n, the system has infinitely many solutions.
Otherwise, if rk(A) ≠ rk(A|B), the system has no solution.

1.5.2 Solution of systems of linear equations

There are three different methods:

Substitution

Solve one equation of the system for one variable, say x_k, in terms of the other variables in that equation. Substitute this expression for x_k into the other m − 1 equations. Continue this process until reaching a system with just a single equation in one variable. Finally, using the previously derived expressions, find all the x_i's.

Example.
2x + y + z = 1
x + 2y + z = 2
x + y + 2z = 0

We calculate x in terms of y and z from the first equation: x = (1 − y − z)/2, and plug this result into the second and third equations. We then have:

x = (1 − y − z)/2
3y + z = 3
y + 3z = −1

From the second equation, y = (3 − z)/3. We plug this result into equation three and solve for z:

x = (1 − y − z)/2
y = (3 − z)/3
z = −3/4

so that z = −3/4, y = 5/4 and x = 1/4.

Elimination of variables

Use elementary equation operations:
Add a multiple of one equation to another.
Multiply both sides of one equation by a non-zero scalar.
Interchange two equations.
Fact: If one system of linear equations is derived from another by elementary equation operations, then both systems have the same solutions (the systems are equivalent).

Example.
2x + y + z = 1
x + 2y + z = 2
x + y + 2z = 0

Its matrix expression:

$$\begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}$$

An equivalent system is obtained with R_2' = 2R_2 − R_1 and R_3' = R_3 − R_2:

$$\begin{pmatrix} 2 & 1 & 1 \\ 0 & 3 & 1 \\ 0 & -1 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \\ -2 \end{pmatrix}$$

Another equivalent system, with R_3'' = 3R_3' + R_2':

$$\begin{pmatrix} 2 & 1 & 1 \\ 0 & 3 & 1 \\ 0 & 0 & 4 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \\ -3 \end{pmatrix}$$

Matrix methods

1. For systems with a unique solution. If in a system of linear equations rk(A) = rk(A|B) = n (n linearly independent equations), the unique solution can be found using:

(a) Cramer's rule. The unique solution X = (x_1, x_2, ..., x_n) of the system AX = B is:

$$x_i = \frac{|B_i|}{|A|}, \quad i = 1, \ldots, n$$

where B_i is the matrix A with the independent column matrix B replacing the i-th column of A:

$$x_i = \frac{\begin{vmatrix} a_{11} & \cdots & b_1 & \cdots & a_{1n} \\ a_{21} & \cdots & b_2 & \cdots & a_{2n} \\ \vdots & & \vdots & & \vdots \\ a_{n1} & \cdots & b_n & \cdots & a_{nn} \end{vmatrix}}{|A|}$$

Example.
3x_1 + 2x_2 + x_3 = 1
−2x_1 + x_2 − 2x_3 = −1
x_1 + x_2 − x_3 = −2

$$\begin{pmatrix} 3 & 2 & 1 \\ -2 & 1 & -2 \\ 1 & 1 & -1 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \\ -2 \end{pmatrix}$$

Since $$|A| = \begin{vmatrix} 3 & 2 & 1 \\ -2 & 1 & -2 \\ 1 & 1 & -1 \end{vmatrix} = -8 \neq 0,$$ we can use Cramer's rule:

$$x_1 = \frac{\begin{vmatrix} 1 & 2 & 1 \\ -1 & 1 & -2 \\ -2 & 1 & -1 \end{vmatrix}}{-8} = -1 \qquad x_2 = \frac{\begin{vmatrix} 3 & 1 & 1 \\ -2 & -1 & -2 \\ 1 & -2 & -1 \end{vmatrix}}{-8} = 1 \qquad x_3 = \frac{\begin{vmatrix} 3 & 2 & 1 \\ -2 & 1 & -1 \\ 1 & 1 & -2 \end{vmatrix}}{-8} = 2$$

(b) Inverse matrix method. If rk(A) = n, then A is a nonsingular matrix, |A| ≠ 0. Then A^{-1} exists and:

A^{-1} A X = A^{-1} B
I_n X = A^{-1} B
X = A^{-1} B

Example. For the same system as above,

$$|A| = \begin{vmatrix} 3 & 2 & 1 \\ -2 & 1 & -2 \\ 1 & 1 & -1 \end{vmatrix} = -8 \neq 0,$$

so A^{-1} exists and the solution of the system is X = A^{-1}B:

$$A^{-1} = \begin{pmatrix} -1/8 & -3/8 & 5/8 \\ 1/2 & 1/2 & -1/2 \\ 3/8 & 1/8 & -7/8 \end{pmatrix} \qquad X = A^{-1}B = \begin{pmatrix} -1/8 & -3/8 & 5/8 \\ 1/2 & 1/2 & -1/2 \\ 3/8 & 1/8 & -7/8 \end{pmatrix}\begin{pmatrix} 1 \\ -1 \\ -2 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \\ 2 \end{pmatrix}$$

2. For systems with zero, one or infinitely many solutions. In any system of linear equations, the augmented matrix (A|B) can be reduced to row echelon form. From it, rk(A) and rk(A|B) are obtained straightforwardly, and the number of solutions of the system (0, 1 or ∞) can be determined.

(a) If the system has a unique solution (rk(A) = rk(A|B) = n): the substitution method, Cramer's rule or the inverse-matrix method is applied to the equivalent system given by the row echelon form of the augmented matrix, (A|B)_r.

Example. For the system above, the augmented matrix is:

$$(A|B) = \left(\begin{matrix} 3 & 2 & 1 \\ -2 & 1 & -2 \\ 1 & 1 & -1 \end{matrix}\;\middle|\;\begin{matrix} 1 \\ -1 \\ -2 \end{matrix}\right)$$

Applying elementary operations to obtain the row echelon form:

$$\left(\begin{matrix} 3 & 2 & 1 \\ -2 & 1 & -2 \\ 1 & 1 & -1 \end{matrix}\middle|\begin{matrix} 1 \\ -1 \\ -2 \end{matrix}\right) \xrightarrow{R_1 \leftrightarrow R_3} \left(\begin{matrix} 1 & 1 & -1 \\ -2 & 1 & -2 \\ 3 & 2 & 1 \end{matrix}\middle|\begin{matrix} -2 \\ -1 \\ 1 \end{matrix}\right) \xrightarrow[R_3' = R_3 - 3R_1]{R_2' = R_2 + 2R_1} \left(\begin{matrix} 1 & 1 & -1 \\ 0 & 3 & -4 \\ 0 & -1 & 4 \end{matrix}\middle|\begin{matrix} -2 \\ -5 \\ 7 \end{matrix}\right) \xrightarrow{R_3'' = 3R_3' + R_2'} \left(\begin{matrix} 1 & 1 & -1 \\ 0 & 3 & -4 \\ 0 & 0 & 8 \end{matrix}\middle|\begin{matrix} -2 \\ -5 \\ 16 \end{matrix}\right)$$

The solution of the system is X = (−1, 1, 2).
(b) If the system has infinitely many solutions (rk(A) = rk(A|B) < n): from the set of n variables, n − rk(A) variables are taken as parameters. A subsystem of rk(A) equations and rk(A) variables is thus obtained, and the remaining variables x_i are calculated in terms of the n − rk(A) parameters.

Example.
x_1 − x_2 + x_3 = 1
4x_1 + 5x_2 − 5x_3 = 4
2x_1 + x_2 − x_3 = 2
x_1 + 2x_2 − 2x_3 = 1

Reducing the augmented matrix of the system to row echelon form:

$$(A|B) = \left(\begin{matrix} 1 & -1 & 1 \\ 4 & 5 & -5 \\ 2 & 1 & -1 \\ 1 & 2 & -2 \end{matrix}\middle|\begin{matrix} 1 \\ 4 \\ 2 \\ 1 \end{matrix}\right) \xrightarrow[\substack{R_3' = R_3 - 2R_1 \\ R_4' = R_4 - R_1}]{R_2' = R_2 - 4R_1} \left(\begin{matrix} 1 & -1 & 1 \\ 0 & 9 & -9 \\ 0 & 3 & -3 \\ 0 & 3 & -3 \end{matrix}\middle|\begin{matrix} 1 \\ 0 \\ 0 \\ 0 \end{matrix}\right)$$

Then rk(A) = rk(A|B) = 2 < 3, with n − rk(A) = 3 − 2 = 1 parameter. If x_3 = λ, then:

x_1 − x_2 = 1 − λ
x_2 = λ

The solution of this system is X = (1, λ, λ), λ ∈ R.

3. Particular case: homogeneous systems. A homogeneous system is a system of the form:

AX = 0

with A ∈ M_{m×n}, X ∈ M_{n×1} with variables x_i, and B the null m × 1 column matrix. A homogeneous system ALWAYS has 1 or ∞ solutions, since rk(A) = rk(A|B). Then:
If rk(A) = n, the unique solution of the system is the zero or trivial solution: x_1 = x_2 = ... = x_n = 0.
If rk(A) < n, the system has infinitely many solutions.

Example.
3x_1 + 2x_2 + 4x_3 = 0
2x_1 + 2x_3 = 0
−x_1 + 2x_2 = 0

$$\begin{pmatrix} 3 & 2 & 4 \\ 2 & 0 & 2 \\ -1 & 2 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

As $$|A| = \begin{vmatrix} 3 & 2 & 4 \\ 2 & 0 & 2 \\ -1 & 2 & 0 \end{vmatrix} = 0,$$ we have rk(A) = 2 < 3, so the system has infinitely many solutions and we need 3 − 2 = 1 parameter. Making x_2 = λ, the solution of the system is x_1 = 2λ, x_3 = −2λ, λ ∈ R.

1.6 Eigenvalues and eigenvectors of a matrix

Let A be a square matrix. An eigenvalue of A is a number r which, when subtracted from each of the diagonal entries of A, converts A into a singular matrix. A square matrix is singular if and only if |A| = 0.
Thus r is an eigenvalue of A if and only if |A − rI| = 0. If r is a variable, |A − rI| = p(r) is the characteristic polynomial of A. The r values for which this characteristic polynomial is zero are the eigenvalues of the matrix A. The algebraic multiplicity α_i of an eigenvalue r_i is the exponent at which this root appears in the characteristic polynomial.

Example. A = (2 1 0; 0 0 1; 0 0 3).
The characteristic polynomial is:

$$p(r) = |A - rI| = \begin{vmatrix} 2-r & 1 & 0 \\ 0 & -r & 1 \\ 0 & 0 & 3-r \end{vmatrix} = -r(2-r)(3-r)$$

The eigenvalues are r_1 = 0, r_2 = 2, r_3 = 3, with algebraic multiplicities α_1 = 1, α_2 = 1, α_3 = 1.
Associated to each eigenvalue r_i such that |A − r_iI| = 0, there is a system of linear equations (A − r_iI)v_i = 0. From this system, solutions v_i ≠ 0 can be obtained. These vectors v_i are called the eigenvectors of A corresponding to the eigenvalue r_i. For each eigenvalue r_i, a basis of the subspace of the corresponding vectors v_i (the eigenspace of r_i) can be obtained.
Example.

A = (2 1 0; 0 0 1; 0 0 3)

Eigenspace associated to r_1 = 0:
E_{r_1} = {x ∈ R3 | (A − 0I)x = 0}

$$\begin{pmatrix} 2 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

rk(A) = 2 < 3 = number of variables, so the system has infinitely many solutions and we need 3 − 2 = 1 parameter. The solution of the system is x = λ, y = −2λ and z = 0, λ ∈ R. A basis of this subspace is:
B_{E_{r_1}} = {(1, −2, 0)}

Eigenspace associated to r_2 = 2:
E_{r_2} = {x ∈ R3 | (A − 2I)x = 0}

$$\begin{pmatrix} 0 & 1 & 0 \\ 0 & -2 & 1 \\ 0 & 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

rk(A − 2I) = 2 < 3 = number of variables, so we need 3 − 2 = 1 parameter. The solution of the system is x = λ, y = 0 and z = 0, λ ∈ R. A basis of this subspace is:
B_{E_{r_2}} = {(1, 0, 0)}

Eigenspace associated to r_3 = 3:
E_{r_3} = {x ∈ R3 | (A − 3I)x = 0}

$$\begin{pmatrix} -1 & 1 & 0 \\ 0 & -3 & 1 \\ 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

rk(A − 3I) = 2 < 3 = number of variables, so we need 3 − 2 = 1 parameter. The solution of the system is x = λ, y = λ and z = 3λ, λ ∈ R. A basis of this subspace is:
B_{E_{r_3}} = {(1, 1, 3)}
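These hand computations can be cross-checked with a numerical eigensolver. A Python/NumPy sketch (an added illustration; note that NumPy returns unit-length eigenvectors, i.e. scalar multiples of the basis vectors found above):

```python
import numpy as np

A = np.array([[2., 1., 0.], [0., 0., 1.], [0., 0., 3.]])
eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)  # 2, 0 and 3, in some order

# each column v of eigvecs satisfies (A - r I) v = 0
for r, v in zip(eigvals, eigvecs.T):
    print(np.allclose((A - r * np.eye(3)) @ v, 0))  # True for every eigenpair
```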

1.6.1 Diagonalization of a matrix (Chapter 23 of Simon and Blume)

The eigenvalues and eigenvectors are used to diagonalize a square matrix A, which means finding a diagonal matrix D and a nonsingular matrix P such that:

D = P^{-1} A P,   with A, D, P ∈ M_{n×n}

There are two different cases:
(a) Given A ∈ M_{n×n}, if r_1, r_2, ..., r_n are the eigenvalues of A and r_1 ≠ r_2 ≠ ... ≠ r_n, then A is diagonalizable and we can find P and D:

$$D = \begin{pmatrix} r_1 & 0 & \cdots & 0 \\ 0 & r_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & r_n \end{pmatrix} \qquad P = \begin{pmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_n \\ | & | & & | \end{pmatrix}$$

with v_i a basis vector of the eigenspace associated to the eigenvalue r_i.

Example. A = (2 1 0; 0 0 1; 0 0 3). All the eigenvalues are different and we had that:
B_{E_{r_1=0}} = {(1, −2, 0)}
B_{E_{r_2=2}} = {(1, 0, 0)}
B_{E_{r_3=3}} = {(1, 1, 3)}

Then A is diagonalizable, and D and P such that D = P^{-1}AP are:

$$D = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{pmatrix} \qquad P = \begin{pmatrix} 1 & 1 & 1 \\ -2 & 0 & 1 \\ 0 & 0 & 3 \end{pmatrix}$$

(b) Given A ∈ M_{n×n}, let r_1, r_2, ..., r_p be the eigenvalues of A with algebraic multiplicities α_1, α_2, ..., α_p. Then A is diagonalizable if and only if:

dim(E_{r_i}) = α_i,   i = 1, 2, ..., p

In that case D is the diagonal matrix that carries each eigenvalue r_i on the diagonal, repeated α_i times:

$$D = \begin{pmatrix} r_1 & & & & & \\ & \ddots & & & & \\ & & r_1 & & & \\ & & & \ddots & & \\ & & & & r_p & \\ & & & & & \ddots \end{pmatrix}$$

The matrix P is formed by the eigenvectors in columns. The order of the eigenvectors in the matrix P has to correspond to the order of the eigenvalues in the matrix D.

Example. Given A = (1 1 1; 1 1 1; 1 1 1). A is diagonalizable if dim(E_{r_i}) = α_i for all i = 1, 2, ..., p.

$$p(r) = |A - rI| = \begin{vmatrix} 1-r & 1 & 1 \\ 1 & 1-r & 1 \\ 1 & 1 & 1-r \end{vmatrix} = (1-r)^3 + 2 - 3(1-r) = -r^2(r-3)$$

The eigenvalues are r_1 = 0 and r_2 = 3, with algebraic multiplicities α_1 = 2, α_2 = 1.

Calculating the basis of the eigenspace associated to r_1 = 0:
E_{r_1} = {x ∈ R3 | (A − 0I)x = 0}

$$\begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

rk(A) = 1 < 3 = number of variables, so we need 3 − 1 = 2 parameters. The solution of the system is x = −α − β, y = α and z = β, α, β ∈ R. Then a basis of this eigenspace is:
B_{E_{r_1=0}} = {(−1, 1, 0), (−1, 0, 1)}
We can see that dim(E_{r_1=0}) = 2 = α_1.

Calculating the basis of the eigenspace associated to r_2 = 3:
E_{r_2} = {x ∈ R3 | (A − 3I)x = 0}

$$\begin{pmatrix} -2 & 1 & 1 \\ 1 & -2 & 1 \\ 1 & 1 & -2 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$

rk(A − 3I) = 2 < 3 = number of variables, so we need 3 − 2 = 1 parameter. The solution of the system is x = λ, y = λ and z = λ, λ ∈ R. Then a basis of this eigenspace is:
B_{E_{r_2=3}} = {(1, 1, 1)}

We can see that dim(E_{r_2=3}) = 1 = α_2.

As the dimensions of both eigenspaces coincide with the algebraic multiplicities of the eigenvalues, A is diagonalizable and we can find D and P such that D = P^{-1}AP:

$$D = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 3 \end{pmatrix} \qquad P = \begin{pmatrix} -1 & -1 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}$$
Symmetric matrices (optimization and statistics)

Let A be an n × n symmetric matrix. Then:
A is diagonalizable, with distinct or repeated eigenvalues.
Eigenvectors corresponding to different eigenvalues are orthogonal.
Note: two vectors u = (u_1, u_2, ..., u_n) and v = (v_1, v_2, ..., v_n) are orthogonal if their product is zero: u · v^t = u_1v_1 + u_2v_2 + ... + u_nv_n = 0.
An orthogonal matrix P can be found such that: D = P^t A P, with A, D, P ∈ M_{n×n}.

1.6.2 Properties of diagonalization

If A (n × n) is diagonalizable:
1. |A| = ∏_{i=1}^{n} r_i, where r_1, r_2, ..., r_n are its eigenvalues.
2. The eigenvalues of A^t are the eigenvalues of A.
3. If r_i is an eigenvalue of A with algebraic multiplicity α_i, then dim(E_{r_i}) = α_i ∀i.
4. There exist P, D such that D = P^{-1}AP; then A^n = P D^n P^{-1} ∀n ∈ N.
5. A^{-1} = P D^{-1} P^{-1}.

1.7 Application: linear difference equations (Chapter 23 of Simon and Blume)

1.7.1 One-dimensional equations

In the equation
y_{n+1} = a y_n
the variable y can represent, for example, the amount of money in a savings account whose principal is left untouched and whose interest is compounded once a year:
y_{n+1} = (1 + i) y_n
with i the interest rate. With this difference equation, in which the value of the variable at time n + 1 depends on its value at time n, we can solve for the amount of money at any time n:
y_1 = a y_0
y_2 = a y_1 = a² y_0
...
y_n = a^n y_0

1.7.2 k-dimensional systems

In general,
z_{n+1} = A z_n
where z_{n+1}, z_n are k × 1 vectors and A is a k × k matrix. To solve these systems we can face different situations:

1. A is diagonal. Then the solution of the system is the same as in the one-dimensional case: z_n = A^n z_0.

2. A has distinct real eigenvalues. In this case there is a nonsingular matrix P such that D = P^{-1}AP:

$$D = \begin{pmatrix} r_1 & 0 & \cdots & 0 \\ 0 & r_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & r_k \end{pmatrix}$$

In this case, we can solve the system as in the one-dimensional case:
z_1 = A z_0
z_2 = A z_1 = A² z_0
...
z_n = A^n z_0

As we know that A^n = P D^n P^{-1}, thus:

z_n = P D^n P^{-1} z_0
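The closed-form solution z_n = P D^n P^{-1} z_0 can be compared with direct iteration. A Python/NumPy sketch (an added illustration, reusing the diagonalization of the matrix A worked out in Section 1.6.1):

```python
import numpy as np

A = np.array([[2., 1., 0.], [0., 0., 1.], [0., 0., 3.]])
P = np.array([[1., 1., 1.], [-2., 0., 1.], [0., 0., 3.]])  # eigenvectors in columns
D = np.diag([0., 2., 3.])                                  # matching eigenvalues

z0 = np.array([1., 1., 1.])
n = 5
zn = P @ np.linalg.matrix_power(D, n) @ np.linalg.inv(P) @ z0  # z_n = P D^n P^{-1} z0
print(np.allclose(zn, np.linalg.matrix_power(A, n) @ z0))      # True
```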

Chapter 2

Multivariate Calculus

2.1 Functions (Chapter 13 of Simon and Blume)

A function from a set A to a set B is a rule that assigns to each object in A one object in B:
f : A → B
The set of elements of A for which the function f is defined is called the domain of f; the set B in which the function f takes its values is called the target of f.
y = f(x) is said to be the image of x under f. The set of all f(x)'s for x in the domain of f is called the image of f.
Examples:
If we consider the function f : R2 → R, f(x, y) = x² + y², the domain of the function is R2 and the image of the function is the set of all non-negative real values.
If we consider the function g : R → R, g(x) = 1/x, its domain is all the real numbers except 0 and its image is R − {0}.

But in economics we are mostly interested in functions that usually have two or more variables as input:
Demand function: f : R3 → R
q_1 = f(p_1, p_2, y) = K_1 p_1^{a_{11}} p_2^{a_{12}} y^{b_1}
Production function: f : R2 → R
q = f(x_1, x_2) = k x_1^{b_1} x_2^{b_2}
Production function of 2 outputs using 3 inputs: f : R3 → R2
Q(p_1, p_2, y) = (q_1(p_1, p_2, y), q_2(p_1, p_2, y)) = (K_1 p_1^{a_{11}} p_2^{a_{12}} y^{b_1}, K_2 p_1^{a_{21}} p_2^{a_{22}} y^{b_2})

2.1.1 Special functions

Linear functions (f : Rk → Rm)

Functions that preserve the vector space structure:
f(x + y) = f(x) + f(y), ∀x, y ∈ Rk
f(rx) = r f(x), ∀x ∈ Rk, r ∈ R

Usually these functions have the general form:
f(x_1, x_2, ..., x_k) = a_1x_1 + a_2x_2 + ... + a_kx_k

If f : Rk → Rm is a linear function, there exists an m × k matrix A such that:
f(x) = Ax, ∀x ∈ Rk
Quadratic forms (f : Rk → R)

A quadratic form on Rk is a real-valued function associated to a symmetric matrix A (A = A^t), of the form:

$$Q(x) = X^t A X = (x_1, x_2, \ldots, x_n)\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{12} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{nn} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$$

Developing this form we get:

$$Q(x) = a_{11}x_1^2 + a_{22}x_2^2 + \ldots + a_{nn}x_n^2 + 2a_{12}x_1x_2 + 2a_{13}x_1x_3 + \ldots = \sum_{i=1}^{n} a_{ii}x_i^2 + 2\sum_{i=1}^{n}\sum_{j=i+1}^{n} a_{ij}x_ix_j$$

Example: Q(x, y, z) = x² − 5y² + 3z² − 2xy − 5yz.

$$Q(x, y, z) = (x, y, z)\begin{pmatrix} 1 & -1 & 0 \\ -1 & -5 & -5/2 \\ 0 & -5/2 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix}$$

Each cross coefficient is split between the two symmetric entries: the xy term between positions 12 and 21, the yz term between positions 23 and 32.

Definiteness of the quadratic forms

The definiteness of Q(x) is very important in the optimization of certain functions, to check second order conditions (distinguishing maxima from minima, convex from concave functions). Attending to the eigenvalues r_i of the matrix A, a quadratic form Q(x) is:
1. Q is positive definite ⟺ r_i > 0, i = 1, ..., n.
2. Q is positive semidefinite ⟺ r_i ≥ 0 ∀i, with some r_j = 0.
3. Q is negative definite ⟺ r_i < 0, i = 1, ..., n.
4. Q is negative semidefinite ⟺ r_i ≤ 0 ∀i, with some r_j = 0.
5. Q is indefinite ⟺ some r_i > 0 and some r_j < 0.

But sometimes the calculus of the eigenvalues is not easy, and we can classify the quadratic form by using the leading principal minors of the matrix A. Given Q(x) = X^t A X, let

$$|A_1| = a_{11}, \quad |A_2| = \begin{vmatrix} a_{11} & a_{12} \\ a_{12} & a_{22} \end{vmatrix}, \quad \ldots, \quad |A_n| = |A|$$

be the leading principal minors of A.
1. Q is positive definite ⟺ |A_i| > 0, i = 1, ..., n.
2. Q is positive semidefinite if |A_i| > 0 ∀i = 1, ..., n − 1 and |A_n| = |A| = 0.
3. Q is negative definite ⟺ (−1)^i |A_i| > 0, i = 1, ..., n.
4. Q is negative semidefinite if (−1)^i |A_i| > 0 ∀i = 1, ..., n − 1 and |A_n| = |A| = 0.
5. Q is indefinite if |A_n| = |A| ≠ 0 and the conditions of cases (1) and (3) are not satisfied.
6. Q is indefinite if |A_n| = |A| = 0 and |A_i| ≠ 0 ∀i = 1, ..., n − 1 and the conditions of cases (2) and (4) are not satisfied.
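The two definite cases of the leading-principal-minor test are straightforward to code. A Python/NumPy sketch (an added illustration; the semidefinite and indefinite cases from the list above would need the extra checks described there):

```python
import numpy as np

def classify(A, tol=1e-12):
    """Classify x^t A x from the leading principal minors of the symmetric matrix A."""
    n = A.shape[0]
    minors = [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]
    if all(m > tol for m in minors):
        return "positive definite"
    if all((-1) ** k * m > tol for k, m in enumerate(minors, start=1)):
        return "negative definite"
    return "not definite: check the semidefinite/indefinite cases"

# Q(x, y, z) = x^2 - 5y^2 + 3z^2 - 2xy - 5yz, the example above
A = np.array([[1., -1., 0.], [-1., -5., -2.5], [0., -2.5, 3.]])
print(classify(A))
```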
Monomials

A function f : Rk → R is a monomial if it can be written as:
f(x_1, ..., x_k) = c x_1^{a_1} x_2^{a_2} ... x_k^{a_k},   c ∈ R, a_i ≥ 0
The degree of the monomial is a_1 + ... + a_k.

Examples:
A constant function is a monomial of degree zero.
Each term of a linear function is a monomial of degree 1: f(x_1, x_2) = 3x_1 + 4x_2
Each term of a quadratic form is a monomial of degree 2: Q(x, y, z) = x² − 5y² + 3z² − 2xy − 5yz
Polynomials

A function f : Rk → R is called a polynomial function if f is a finite sum of monomials on Rk. The degree of the polynomial is the highest degree among its monomials.
Example: f(x, y, z) = x³ + 2xyz + y⁴ is a polynomial of degree 4.

2.1.2 Classification of functions

1. f : A → B is surjective if for each element b ∈ B there is an element a ∈ A such that b = f(a); the whole target space is the image of f.
Examples. Consider the real functions f(x) = x and g(x) = x². f(x) = x is surjective, as Im(f) = R. On the other hand, g is not surjective, as Im(g) = {y ∈ R | y ≥ 0}.

2. f : A → B is injective or one-to-one if: ∀x, y ∈ A, x ≠ y ⟹ f(x) ≠ f(y).
Examples. g(x) = x² is not one-to-one, as for x = 1 and x = −1 we have g(1) = g(−1) = 1. f(x) = x is one-to-one, as any two different values of x have different images.

f : R2 → R, f(x, y) = x² + y² is not surjective, as the image of f is R+ instead of all of R. Neither is it one-to-one, since f(1, 0) = f(0, 1) = 1.

2.1.3 Composition of functions

Let f : A → B and g : C → D be two functions, and suppose that B ⊆ C. Then the composition of f with g, g ∘ f : A → D, is the function:
(g ∘ f)(x) = g(f(x)), ∀x ∈ A

Example: f : R → R and g : R → R, f(x) = sin(x), g(x) = x².
(g ∘ f)(x) = g(f(x)) = g(sin(x)) = sin²(x)

[Figure 2.1: Derivative of a function]

2.2 Derivatives of multivariate functions

2.2.1 Partial derivatives

Let f : Rk → R, y = f(x_1, ..., x_k). We define the partial derivative of f with respect to x_i at the point x_0 = (x_{01}, x_{02}, ..., x_{0k}) ∈ Rk as:

$$\frac{\partial f}{\partial x_i}(x_{01}, x_{02}, \ldots, x_{0k}) = \lim_{h \to 0} \frac{f(x_{01}, \ldots, x_{0i} + h, \ldots, x_{0k}) - f(x_{01}, \ldots, x_{0i}, \ldots, x_{0k})}{h}$$

provided this limit exists. This means that for any ε > 0 there is a δ > 0 such that:

$$|h| < \delta \implies \left| \frac{f(x_{01}, \ldots, x_{0i} + h, \ldots, x_{0k}) - f(x_{01}, \ldots, x_{0i}, \ldots, x_{0k})}{h} - \frac{\partial f}{\partial x_i}(x_{01}, x_{02}, \ldots, x_{0k}) \right| < \varepsilon$$

Example. If f : R → R and lim_{h→0} [f(x_0 + h) − f(x_0)]/h tends to a particular value, then we have a derivative. Limit: ∀ε > 0, ∃δ > 0 such that:

$$|h| < \delta \implies \left| \frac{f(x_0 + h) - f(x_0)}{h} - f'(x_0) \right| < \varepsilon$$

In economic terms, partial derivatives can be interpreted, for example, as the MARGINAL PRODUCT: given the production function F(K, L), ∂F/∂K and ∂F/∂L are the marginal products of capital and labour, respectively. Or as an ELASTICITY: given a certain demand function Q_1(P_1, P_2, I),

(∂Q_1/∂P_1)(P_1/Q_1) ≈ (% change in demand)/(% change in own price)

is the own price elasticity of demand,
(∂Q_1/∂P_2)(P_2/Q_1) is the cross price elasticity of demand, and
(∂Q_1/∂I)(I/Q_1) is the income elasticity of demand.

2.2.2 The total differential

Intuitively, if F : R2 → R:

F(x* + Δx, y*) − F(x*, y*) ≈ (∂F/∂x)(x*, y*)Δx
F(x*, y* + Δy) − F(x*, y*) ≈ (∂F/∂y)(x*, y*)Δy

so we expect that:

F(x* + Δx, y* + Δy) − F(x*, y*) ≈ (∂F/∂x)(x*, y*)Δx + (∂F/∂y)(x*, y*)Δy

or, if F : Rk → R, with x* = (x_1*, ..., x_k*):

F(x_1* + Δx_1, ..., x_k* + Δx_k) − F(x_1*, ..., x_k*) ≈ (∂F/∂x_1)(x*)Δx_1 + ... + (∂F/∂x_k)(x*)Δx_k

Then, writing dF = F(x* + Δx) − F(x*):

dF = (∂F/∂x_1)(x*)dx_1 + ... + (∂F/∂x_k)(x*)dx_k

The JACOBIAN DERIVATIVE of F at x* is:

DF(x*) = ((∂F/∂x_1)(x*), ..., (∂F/∂x_k)(x*))

If F : Rk → Rm, F = (f_1(x_1, ..., x_k), f_2(x_1, ..., x_k), ..., f_m(x_1, ..., x_k)):

$$F(x^* + \Delta x) - F(x^*) \approx \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x^*) & \cdots & \frac{\partial f_1}{\partial x_k}(x^*) \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1}(x^*) & \cdots & \frac{\partial f_m}{\partial x_k}(x^*) \end{pmatrix}\begin{pmatrix} \Delta x_1 \\ \vdots \\ \Delta x_k \end{pmatrix}$$

So the Jacobian derivative of F at x* is:

$$DF(x^*) = \begin{pmatrix} \frac{\partial f_1}{\partial x_1}(x^*) & \cdots & \frac{\partial f_1}{\partial x_k}(x^*) \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1}(x^*) & \cdots & \frac{\partial f_m}{\partial x_k}(x^*) \end{pmatrix}$$

Example. Consider the pair of demand functions q_1 = 6p_1^{-2}p_2^{3/2}y and q_2 = 4p_1^{4/3}p_2^{-2/3}y^{2/3}. Calculate the total differential of these functions at the point p_1 = 6, p_2 = 9, y = 2.

dq_1 = (∂q_1/∂p_1)(x*)dp_1 + (∂q_1/∂p_2)(x*)dp_2 + (∂q_1/∂y)(x*)dy

dq_1 = −3dp_1 + 1.5dp_2 + 4.5dy

dq_2 = (32/9)dp_1 − (32/27)dp_2 + (16/3)dy
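The total differential is an approximation to the actual change in q_1, which a short numerical check makes concrete. A Python sketch (an added illustration, using the demand function q_1 with the exponents as reconstructed above):

```python
# q1 = 6 p1^(-2) p2^(3/2) y, evaluated near (p1, p2, y) = (6, 9, 2)
def q1(p1, p2, y):
    return 6 * p1 ** -2 * p2 ** 1.5 * y

dp1 = dp2 = dy = 0.01
actual = q1(6 + dp1, 9 + dp2, 2 + dy) - q1(6, 9, 2)
dq1 = -3 * dp1 + 1.5 * dp2 + 4.5 * dy  # the total differential with the partials above
print(actual, dq1)  # both close to 0.03
```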

2.2.3 The Chain Rule

Let g(t) = f(x(t)). The derivative of the composite function is the derivative of the outside function f(x) (evaluated at the inside function) times the derivative of the inside function:
g'(t) = f'(x(t)) · x'(t)

Let g(t) = f(x_1(t), ..., x_k(t)). Then:
g'(t) = (∂f/∂x_1)(x(t))x_1'(t) + ... + (∂f/∂x_k)(x(t))x_k'(t)

Let u(t) = (u_1(t_1, ..., t_s), u_2(t_1, ..., t_s), ..., u_n(t_1, ..., t_s)), u : Rs → Rn, with t = (t_1, ..., t_s), and f : Rn → R. Then the composite function g : Rs → R,
g = (f ∘ u)(t) = g(t_1, ..., t_s) = f(u(t)) = f(u_1(t), ..., u_n(t)),
satisfies
∂g/∂t_i(t) = (∂f/∂u_1)(u(t))(∂u_1/∂t_i)(t) + ... + (∂f/∂u_n)(u(t))(∂u_n/∂t_i)(t)

Let F : Rk → Rm and a : R → Rk. Then the composite function g = F ∘ a = F(a(t)) is a function g : R → Rm, and its derivative is:
g'(t) = DF(a(t)) · a'(t)

Let F : Rk → Rm and A : Rs → Rk. Then the composite function H = F ∘ A is a function H : Rs → Rm, and:
DH(s) = D(F ∘ A)(s) = DF(A(s)) · D(A(s))

2.2.4 Higher order derivatives

Let f : Rk → R. For this function we can define the matrix of second order derivatives as:

$$D^2 f(x) = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2}(x) & \frac{\partial^2 f}{\partial x_2 \partial x_1}(x) & \cdots & \frac{\partial^2 f}{\partial x_k \partial x_1}(x) \\ \frac{\partial^2 f}{\partial x_1 \partial x_2}(x) & \frac{\partial^2 f}{\partial x_2^2}(x) & \cdots & \frac{\partial^2 f}{\partial x_k \partial x_2}(x) \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_1 \partial x_k}(x) & \frac{\partial^2 f}{\partial x_2 \partial x_k}(x) & \cdots & \frac{\partial^2 f}{\partial x_k^2}(x) \end{pmatrix}$$

This matrix is called the Hessian matrix which, as you can see, is a symmetric matrix. Young's theorem shows that mixed partials of order n are equal if they are continuous.

2.3 Taylor's series approximation (Chapter 30 of Simon and Blume)

A function f : Rk → R is r-times continuously differentiable (C^r) if all the derivatives of f of order ≤ r exist and are continuous.
[Figure 2.2: Mean value]

2.3.1 Mean value theorem

Let f : U → R be a C¹ function on a (connected) interval U in R. For any points a, b ∈ U, there is a point c between a and b such that:
f(b) − f(a) = f'(c)(b − a)

Similarly, let F : U → R with U ⊆ Rk an open subset. Let a and b be two points in U such that the line segment from a to b lies in U. Then there is a point c on that segment such that:
F(b) − F(a) = DF(c)(b − a)

2.3.2 Taylor's series approximation on R

Let f : U → R be a C^{r+1} function defined on a (connected) interval U in R. For any a and a + h in U, there exists a point c between a and a + h such that:

$$f(a + h) = f(a) + f'(a)h + \ldots + \frac{1}{k!}f^{(k)}(a)h^k + \frac{1}{(k+1)!}f^{(k+1)}(c)h^{k+1}$$

In this expression, the k-th order Taylor polynomial of f at x = a is:

$$P_k(a + h) = f(a) + f'(a)h + \ldots + \frac{1}{k!}f^{(k)}(a)h^k$$

The difference R_k(h; a) between the actual value f(a + h) and its k-th order approximation P_k(a + h) is:

R_k(h; a) = f(a + h) − P_k(a + h)

and this difference satisfies R_k(h; a)/h^k → 0 as h → 0.

2.3.3 Taylor's series approximation on Rk

Suppose that F : U → R is a C² function on an open subset U of Rk. Let a ∈ U. Then there exists a continuous function R_2(h; a) such that for any point a + h ∈ U with the property that the line segment from a to a + h lies in U,

$$F(a + h) = F(a) + DF(a)h + \frac{1}{2!}h^t D^2F(a)h + R_2(h; a)$$

where R_2(h; a)/||h||² → 0 as ||h|| → 0, h^t = (h_1, ..., h_k) and:

$$\frac{1}{2!}h^t D^2F(a)h = \frac{1}{2}\sum_{i=1}^{k}\sum_{j=1}^{k} \frac{\partial^2 F}{\partial x_i \partial x_j}(a)h_i h_j$$

In coordinates on R2 (a = (a_1, a_2) and h = (h_1, h_2)):

$$F(a_1 + h_1, a_2 + h_2) = F(a_1, a_2) + \frac{\partial F}{\partial x_1}(a)h_1 + \frac{\partial F}{\partial x_2}(a)h_2 + \frac{1}{2}\frac{\partial^2 F}{\partial x_1^2}(a)h_1^2 + \frac{\partial^2 F}{\partial x_1 \partial x_2}(a)h_1h_2 + \frac{1}{2}\frac{\partial^2 F}{\partial x_2^2}(a)h_2^2 + R_2(h_1, h_2; a)$$
Example. Compute the Taylor approximation of order two of the Cobb-Douglas function F(x, y) = x^{1/4}y^{3/4} at (1, 1).

∂F/∂x = (1/4)x^{−3/4}y^{3/4}
∂F/∂y = (3/4)x^{1/4}y^{−1/4}
∂²F/∂x² = −(3/16)x^{−7/4}y^{3/4}
∂²F/∂y² = −(3/16)x^{1/4}y^{−5/4}
∂²F/∂x∂y = ∂²F/∂y∂x = (3/16)x^{−3/4}y^{−1/4}

Evaluating the partial derivatives at z* = (1, 1) we have:

∂F/∂x(z*) = 1/4, ∂F/∂y(z*) = 3/4, ∂²F/∂x²(z*) = −3/16, ∂²F/∂y²(z*) = −3/16, ∂²F/∂x∂y(z*) = 3/16

Therefore,

$$F(1 + h_1, 1 + h_2) = F(1, 1) + (1/4 \;\; 3/4)\begin{pmatrix} h_1 \\ h_2 \end{pmatrix} + \frac{1}{2}(h_1 \;\; h_2)\begin{pmatrix} -3/16 & 3/16 \\ 3/16 & -3/16 \end{pmatrix}\begin{pmatrix} h_1 \\ h_2 \end{pmatrix} + R(h_1, h_2)$$

$$= 1 + \frac{1}{4}h_1 + \frac{3}{4}h_2 - \frac{3}{32}h_1^2 + \frac{3}{16}h_1h_2 - \frac{3}{32}h_2^2 + R(h_1, h_2)$$

Note that if h^t = (0.1, −0.1), we have that F(1.1, 0.9) = 0.9463026.
The value of the Taylor approximation of order 1 is 0.95.
The value of the Taylor approximation of order 2 is 0.94625.
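A quick numerical check of those three numbers (an added Python illustration):

```python
F = lambda x, y: x ** 0.25 * y ** 0.75
h1, h2 = 0.1, -0.1

order1 = 1 + h1 / 4 + 3 * h2 / 4
order2 = order1 - 3 / 32 * h1 ** 2 + 3 / 16 * h1 * h2 - 3 / 32 * h2 ** 2
print(F(1 + h1, 1 + h2))  # 0.9463026... (the true value)
print(order1)             # 0.95
print(order2)             # 0.94625
```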

2.4 Implicit function theorem (Chapter 15 of Simon and Blume)

Usually the endogenous variable is an explicit function of the exogenous variables, that is:
y = F(x_1, ..., x_k)
but sometimes we face situations where both kinds of variables are mixed, as in:
G(x_1, ..., x_k, y) = 0
Depending on G, we usually cannot solve for y, but we still want to answer the basic question: how does a small change in one of the exogenous variables affect the value of the endogenous variable?

2.4.1 The implicit function theorem for R2

For a given function G(x, y) = C and a specific point (x_0, y_0), we want to answer the questions:
(a) Does G(x, y) = C determine y as a continuous function of x, near x_0 and y_0?
(b) If so, how do changes in x affect y?

Theorem: let G(x, y) be a C¹ function on a ball about (x_0, y_0) in R2. Suppose that G(x_0, y_0) = C and consider the expression G(x, y) = C. If ∂G/∂y(x_0, y_0) ≠ 0, then there exists a C¹ function y = y(x) defined on an interval I about x_0 such that:
1. G(x, y(x)) = C ∀x ∈ I,
2. y(x_0) = y_0,
3. y'(x_0) = −(∂G/∂x)(x_0, y_0) / (∂G/∂y)(x_0, y_0)
Example. Consider x² − 3xy + y³ − 7 = 0 around the solution point x_0 = 4, y_0 = 3. Suppose we can find a function y = y(x) such that:
x² − 3xy(x) + y³(x) − 7 = 0

If we differentiate with respect to x (using the Chain Rule), we have:
2x − 3y(x) − 3xy'(x) + 3y²(x)y'(x) = 0

y'(x) = −(2x − 3y)/(3y² − 3x) = −(∂G/∂x)/(∂G/∂y)

At the point x = 4, y = 3, we find that:
y'(4) = 1/15

We conclude that if there is a function which solves the equation, and if it is differentiable, then as x changes by Δx, y will change by approximately Δx/15.

Summarizing: we suppose that there exists a function y(x) that is a solution of the equation G(x, y) = C, i.e. G(x, y(x)) = C. Then we apply the Chain Rule to differentiate with respect to x at x_0, and we get y'(x_0).

In the example, we can get the Taylor series approximation of order 1:
y_1 ≈ y_0 + y'(x_0)Δx = 3 + (1/15) · 0.3 = 3.02
while the true value is y_1 = 3.01475.
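The value y_1 = 3.01475 can be reproduced by solving G(4.3, y) = 0 numerically. A Python sketch (an added illustration using Newton's method, started from the known solution y(4) = 3):

```python
# G(x, y) = x^2 - 3xy + y^3 - 7 = 0; solve for y at x = 4.3
x, y = 4.3, 3.0
for _ in range(20):
    G = x ** 2 - 3 * x * y + y ** 3 - 7
    dG_dy = -3 * x + 3 * y ** 2  # partial derivative of G with respect to y
    y -= G / dG_dy               # Newton step
print(y)                   # about 3.01475
print(3 + (1 / 15) * 0.3)  # 3.02, the first-order estimate above
```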

2.4.2 The implicit function theorem for Rk

Let G(x_1, ..., x_k, y) be a C¹ function around the point (x_1*, ..., x_k*, y*). Suppose that (x_1*, ..., x_k*, y*) satisfies:
G(x_1*, ..., x_k*, y*) = C and ∂G/∂y(x_1*, ..., x_k*, y*) ≠ 0

Then there is a C¹ function y = y(x_1, ..., x_k) defined on an open ball B about (x_1*, ..., x_k*) such that:
1. G(x_1, ..., x_k, y(x_1, ..., x_k)) = C ∀(x_1, ..., x_k) ∈ B,
2. y* = y(x_1*, ..., x_k*),
3. for each i = 1, ..., k:

∂y/∂x_i(x_1*, ..., x_k*) = −(∂G/∂x_i)(x_1*, ..., x_k*, y*) / (∂G/∂y)(x_1*, ..., x_k*, y*)

2.4.3 The implicit function theorem for systems of implicit functions

Let F_1, ..., F_m : R^{m+k} → R be C¹ functions. Consider the system of equations:

F_1(y_1, ..., y_m, x_1, ..., x_k) = C_1
...
F_m(y_1, ..., y_m, x_1, ..., x_k) = C_m

Defining y* = (y_1*, ..., y_m*) and x* = (x_1*, ..., x_k*), suppose that (y*, x*) is a solution of the system above. If the determinant of the m × m matrix

$$\begin{pmatrix} \frac{\partial F_1}{\partial y_1} & \cdots & \frac{\partial F_1}{\partial y_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial F_m}{\partial y_1} & \cdots & \frac{\partial F_m}{\partial y_m} \end{pmatrix}$$

evaluated at (y*, x*) is nonzero, then there exist functions

y_1 = f_1(x_1, ..., x_k), ..., y_m = f_m(x_1, ..., x_k)

defined on a ball B about x* such that:

F_1(f_1(x), ..., f_m(x), x_1, ..., x_k) = C_1
...
F_m(f_1(x), ..., f_m(x), x_1, ..., x_k) = C_m

∀x = (x_1, ..., x_k) ∈ B, and y_1* = f_1(x*), ..., y_m* = f_m(x*).

Furthermore, one can compute ∂y_i/∂x_j(y*, x*), i = 1, ..., m, by evaluating at (y*, x*):

$$\begin{pmatrix} \frac{\partial y_1}{\partial x_j} \\ \vdots \\ \frac{\partial y_m}{\partial x_j} \end{pmatrix} = -\begin{pmatrix} \frac{\partial F_1}{\partial y_1} & \cdots & \frac{\partial F_1}{\partial y_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial F_m}{\partial y_1} & \cdots & \frac{\partial F_m}{\partial y_m} \end{pmatrix}^{-1}\begin{pmatrix} \frac{\partial F_1}{\partial x_j} \\ \vdots \\ \frac{\partial F_m}{\partial x_j} \end{pmatrix}$$

Example. Consider:

F_1(x, y, a) = x² + axy + y² − 1 = 0
F_2(x, y, a) = x² + y² − a² + 3 = 0

around the point x* = 0, y* = 1, a* = 2. If we change a a little, to a' near a* = 2, can we find (x', y') near (0, 1) so that (x', y', a') satisfies these two equations?

First, defining z* = (x*, y*, a*),

$$\begin{pmatrix} \frac{\partial x}{\partial a}(z^*) \\ \frac{\partial y}{\partial a}(z^*) \end{pmatrix} = -\begin{pmatrix} \frac{\partial F_1}{\partial x}(z^*) & \frac{\partial F_1}{\partial y}(z^*) \\ \frac{\partial F_2}{\partial x}(z^*) & \frac{\partial F_2}{\partial y}(z^*) \end{pmatrix}^{-1}\begin{pmatrix} \frac{\partial F_1}{\partial a}(z^*) \\ \frac{\partial F_2}{\partial a}(z^*) \end{pmatrix} = -\begin{pmatrix} 2 & 2 \\ 0 & 2 \end{pmatrix}^{-1}\begin{pmatrix} 0 \\ -4 \end{pmatrix} = \begin{pmatrix} -2 \\ 2 \end{pmatrix}$$

So,
Δx = −2Δa
Δy = 2Δa

If a increases to 2.1, y will increase to about 1.2 and x will decrease to about −0.2.

Chapter 3

Optimization

3.1 Unconstrained optimization

Let F : U → R be a real-valued function whose domain is a subset U of Rk, U ⊆ Rk.
1. x* ∈ U is a maximum of F on U if F(x*) ≥ F(x) ∀x ∈ U.
2. x* ∈ U is a strict maximum of F on U if F(x*) > F(x) ∀x ≠ x* ∈ U.
3. x* ∈ U is a local (or relative) maximum of F if there is a ball B_ε(x*) about x* such that F(x*) ≥ F(x) ∀x ∈ B_ε(x*) ∩ U.
4. x* ∈ U is a strict local maximum of F if there is a ball B_ε(x*) about x* such that F(x*) > F(x) ∀x ≠ x* ∈ B_ε(x*) ∩ U.

Reversing the inequalities, we have the definitions for minimum, strict minimum, etc.

B_ε(x*) = {x | ||x − x*|| < ε}

Definition: x* is an interior point of a set U if there is a ball B_ε(x*) about x* contained in the set U (x* does not lie at the extremes of the interval, nor on the surface of the ball B).

3.1.1 Theorem 1

Let F : U → R be a C¹ function defined on a subset U of Rk. If x* is a local maximum or minimum of F in U and if x* is an interior point of U, then:

∂F/∂x_i(x*) = 0, i = 1, ..., k

A point x* which satisfies these conditions is called a critical point of F.

3.1.2 Theorem 2

Definition: a set S in Rk is open if ∀x ∈ S there exists an open ε-ball about x completely contained in S:
x ∈ S ⟹ ∃ε > 0 such that B_ε(x) ⊆ S

Let F : U → R be a C² function whose domain is an open set U in Rk. Suppose that x* is a critical point of F.

1. If the Hessian

$$D^2F(x^*) = \begin{pmatrix} \frac{\partial^2 F}{\partial x_1^2}(x^*) & \cdots & \frac{\partial^2 F}{\partial x_k \partial x_1}(x^*) \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 F}{\partial x_1 \partial x_k}(x^*) & \cdots & \frac{\partial^2 F}{\partial x_k^2}(x^*) \end{pmatrix}$$

is negative definite, then x* is a strict local maximum of F.
2. If D²F(x*) is positive definite, then x* is a strict local minimum of F.
3. If D²F(x*) is indefinite, then x* is a saddle point.

3.1.3 Theorem 3

Let F : U → R be a C² function whose domain is U in Rk. Suppose that x* is an interior point of U and that x* is a local maximum (minimum) of F. Then D²F(x*) is negative (positive) semidefinite.

3.2 Optimization with equality constraints

3.2.1 Two variables and one equality constraint

Let f and h be C¹ functions of two variables. Suppose that x* = (x_1*, x_2*) is a solution of the problem:

Maximize f(x_1, x_2) subject to h(x_1, x_2) = c

Suppose also that (x_1*, x_2*) is not a critical point of h. Then there is a real number μ* such that (x_1*, x_2*, μ*) is a critical point of the Lagrangian function
L(x_1, x_2, μ) = f(x_1, x_2) − μ[h(x_1, x_2) − c]
where μ is called the Lagrange multiplier.

Intuition

Level curves of f : R2 → R: for a point (x_0, y_0) we evaluate f(x_0, y_0) = z_0. Then we sketch the locus in the x-y plane of all other (x, y) pairs for which f has the same value z_0. This locus is a level curve of f.
Example: for f(x, y) = x² + y², the set of all points (x, y) at which f equals a is:
f^{-1}(a) = {(x, y) | x² + y² = a}

Two ideas:
The highest level curve of f must be tangent to the constraint set (curve) at the constrained maximum x*. This means that the slope of the level curve of f equals the slope of the constraint curve at x*.
How do we calculate those slopes? By the implicit function theorem it is straightforward to show that the slope of the level set of f at x* is:

−(∂f/∂x_1)(x*) / (∂f/∂x_2)(x*)

whereas the slope of the constraint set at x* is:

−(∂h/∂x_1)(x*) / (∂h/∂x_2)(x*)

[Figure 3.1: Level curves of a function]

Thus, by the optimality (tangency) condition we have that

(∂f/∂x_1)(x*) / (∂f/∂x_2)(x*) = (∂h/∂x_1)(x*) / (∂h/∂x_2)(x*)

or, equivalently,

(∂f/∂x_1)(x*) / (∂h/∂x_1)(x*) = (∂f/∂x_2)(x*) / (∂h/∂x_2)(x*) = μ

Rewriting these equations:

(∂f/∂x_1)(x*) − μ(∂h/∂x_1)(x*) = 0
(∂f/∂x_2)(x*) − μ(∂h/∂x_2)(x*) = 0

and including the constraint equation:

[Figure 3.2: Intuition]

h(x_1, x_2) = c

We obtain a system of 3 equations in 3 unknowns. We get exactly the same through the use of the Lagrangian function.
Of course, this method does not work at the critical points of h:

(∂h/∂x_1)(x*) = (∂h/∂x_2)(x*) = 0

which is the reason for the condition called the constraint qualification; it is automatically satisfied if the constraint is linear.

Example. Find the maximum of the function f(x_1, x_2) = x_1x_2 subject to:
h(x_1, x_2) = x_1 + 4x_2 = 16

First, we check the constraint qualification condition: whether there are critical points of h in the constraint region.
∂h/∂x_1 = 1, ∂h/∂x_2 = 4
h has no critical points, so the constraint qualification is satisfied.

The Lagrangian is:
L(x_1, x_2, μ) = x_1x_2 − μ(x_1 + 4x_2 − 16)

The partial derivatives are:
∂L/∂x_1 = x_2 − μ = 0
∂L/∂x_2 = x_1 − 4μ = 0
∂L/∂μ = −(x_1 + 4x_2 − 16) = 0

Setting the derivatives equal to zero and solving for the three unknowns, we have x_1 = 8, x_2 = 2 and μ = 2.
Example. Maximize the function f(x_1, x_2) = x_1²x_2 subject to (x_1, x_2) in the constraint set C = {(x_1, x_2) | 2x_1² + x_2² = 3}.
First, we have to compute the critical points of h:
∂h/∂x_1 = 4x_1, ∂h/∂x_2 = 2x_2
These derivatives are zero only at (x_1, x_2) = (0, 0). But as this point is not in the constraint set C, the constraint qualification is satisfied and we can form the Lagrangian:
L(x_1, x_2, μ) = x_1²x_2 − μ(2x_1² + x_2² − 3)
Derivatives of the Lagrangian with respect to the variables and the Lagrange multiplier:

∂L/∂x_1 = 2x_1x_2 − 4μx_1 = 2x_1(x_2 − 2μ) = 0
∂L/∂x_2 = x_1² − 2μx_2 = 0
∂L/∂μ = −2x_1² − x_2² + 3 = 0

Solving the system: as you can see, this is a nonlinear system of equations, so matrix methods cannot be applied; only substitution and elimination can be used. We have the following equations:

2x_1(x_2 − 2μ) = 0
x_1² − 2μx_2 = 0
2x_1² + x_2² − 3 = 0

From the first equation we have the following alternatives:
(a) x_1 = 0 and x_2 − 2μ ≠ 0. Substituting x_1 = 0 in the remaining equations we have:
−2μx_2 = 0
x_2² − 3 = 0
From the last equation x_2 = ±√3, and then μ = 0. Thus we have two possible solution points: (x_1, x_2, μ) = (0, √3, 0) and (0, −√3, 0).
(b) x_1 ≠ 0 and x_2 − 2μ = 0. Then x_2 = 2μ and, substituting this expression into the remaining equations:
x_1² − 4μ² = 0
2x_1² + 4μ² − 3 = 0
We have that μ = ±1/2 and x_1 = ±1. Thus we have another 4 possible points: (1, 1, 1/2), (−1, 1, 1/2), (1, −1, −1/2), (−1, −1, −1/2).
(c) x_1 = 0 and x_2 − 2μ = 0. Then x_1 = 0 and x_2 = 2μ. But if we substitute we have:
−4μ² = 0
4μ² − 3 = 0
and this system is incompatible; it has no solution.

Plugging the solutions into f, we conclude that the maximum of the function occurs at (x_1, x_2) = (1, 1) and (x_1, x_2) = (−1, 1).
The meaning of the multiplier

Theorem: given the problem
Max f(x_1, x_2) subject to h(x_1, x_2) = c,
let f and h be C¹ functions. For any fixed value of the parameter c, let (x_1*(c), x_2*(c)) be the solution of the problem, with corresponding multiplier μ*(c). Suppose that x_1*, x_2* and μ* are C¹ functions of c and that the constraint qualification holds at (x_1*(c), x_2*(c)). Then:

μ*(c) = (df/dc)(x_1*(c), x_2*(c))

so that μ*(c) measures the rate of change of the optimal value of f with respect to c.

Example. In the previous example, we found that a maximizer of f subject to 2x_1² + x_2² = 3 was x_1 = x_2 = 1, with μ = 0.5. We could redo the problem with a new constraint 2x_1² + x_2² = 3.3 to get x_1 = x_2 = √1.1 ≈ 1.049, with maximum value f = 1.1537, an increase of 0.1537 over the original value of f. But, by the previous theorem, we could get an approximation of this increase in f as:

Δf ≈ μ · Δc = 0.3 · 0.5 = 0.15

quite similar to the true increase in f.

3.2.2 Several equality constraints

Now we want to solve:

Max f(x_1, x_2, ..., x_k) subject to C = {x = (x_1, x_2, ..., x_k) | h_1(x) = c_1, ..., h_m(x) = c_m}

with m constraint functions. The generalization of the constraint qualification involves the Jacobian derivative of the constraints evaluated at x* (the optimal x):

$$Dh(x^*) = \begin{pmatrix} \frac{\partial h_1}{\partial x_1}(x^*) & \cdots & \frac{\partial h_1}{\partial x_k}(x^*) \\ \vdots & \ddots & \vdots \\ \frac{\partial h_m}{\partial x_1}(x^*) & \cdots & \frac{\partial h_m}{\partial x_k}(x^*) \end{pmatrix}$$

In general, a critical point x* of h = (h_1, ..., h_m) is a point where rk(Dh(x*)) < m. If rk(Dh(x*)) = m, we say that (h_1, ..., h_m) satisfies the nondegenerate constraint qualification (NDCQ) at x*.

Theorem

Let f, h_1, ..., h_m be C¹ functions of k variables. Consider the previous problem, and suppose that x* ∈ C is a local max or min of f on C and satisfies the NDCQ above. Then there exist μ_1*, ..., μ_m* such that:
(x*, μ*) = (x_1*, ..., x_k*, μ_1*, ..., μ_m*)
is a critical point of the Lagrangian L(x, μ) = f(x) − μ_1(h_1(x) − c_1) − ... − μ_m(h_m(x) − c_m).
Example. Find the maximum of the function f(x, y, z) = xyz with the constraints:
h_1(x, y, z) = x² + y² = 1
h_2(x, y, z) = x + z = 1

First, we check the NDCQ. The Jacobian matrix is:

$$Dh(x, y, z) = \begin{pmatrix} \frac{\partial h_1}{\partial x} & \frac{\partial h_1}{\partial y} & \frac{\partial h_1}{\partial z} \\ \frac{\partial h_2}{\partial x} & \frac{\partial h_2}{\partial y} & \frac{\partial h_2}{\partial z} \end{pmatrix} = \begin{pmatrix} 2x & 2y & 0 \\ 1 & 0 & 1 \end{pmatrix}$$

which has rank < 2 only if x = y = 0. However, such a point violates the first constraint, so all the points in the constraint set satisfy the NDCQ. Forming the Lagrangian,

L(x, y, z, μ_1, μ_2) = xyz − μ_1(x² + y² − 1) − μ_2(x + z − 1)

the derivatives of the Lagrangian with respect to the variables and the Lagrange multipliers are:

∂L/∂x = yz − 2μ_1x − μ_2 = 0
∂L/∂y = xz − 2μ_1y = 0
∂L/∂z = xy − μ_2 = 0
∂L/∂μ_1 = −(x² + y² − 1) = 0
∂L/∂μ_2 = −(x + z − 1) = 0

Solving the system: again this is a nonlinear system of equations, so matrix methods cannot be applied; only substitution and elimination can be used. Taking μ_1 and μ_2 in terms of x, y and z from the second and third equations and plugging them into the first, we have:

y²z − x²z − xy² = 0
x² + y² − 1 = 0
x + z − 1 = 0

Substituting z and y² in terms of x and plugging into the first equation, we obtain a polynomial of third order with a root x = 1. Solving for x in the remaining second order polynomial and obtaining y and z, we have another 4 possible solution points:

x ≈ 0.4343, y ≈ ±0.9008, z ≈ 0.5657
x ≈ −0.7676, y ≈ ±0.6409, z ≈ 1.7676

Evaluating the possible candidates in the objective function, we have that the maximizer is x ≈ −0.7676, y ≈ −0.6409 and z ≈ 1.7676.
The meaning of the multipliers

Let f, h_1, ..., h_m be C¹ functions of k variables. Let c = (c_1, ..., c_m) be an m-tuple of exogenous parameters and consider the problem above. Let x_1*(c), ..., x_k*(c) denote the solution, with multipliers μ_1*(c), ..., μ_m*(c). Suppose that all the x_i* and μ_j* are differentiable functions of (c_1, ..., c_m) and that the NDCQ holds. Then:

μ_j*(c_1, ..., c_m) = (∂f/∂c_j)(x_1*(c_1, ..., c_m), ..., x_k*(c_1, ..., c_m)), j = 1, ..., m

[Figure 3.3: The gradient is perpendicular to the tangent line at that point]

3.3 Optimization with inequality constraints

3.3.1 One inequality constraint

We face the problem:

Maximize f(x_1, x_2) subject to g(x_1, x_2) ≤ b

Intuition: let F : Rk → R be a C¹ function and x* ∈ Rk. Then the derivative

∇F(x*) = DF(x*) = ((∂F/∂x_1)(x*), ..., (∂F/∂x_k)(x*))^t

is called the gradient of F at x*. At any x ∈ Rk at which ∇F(x) ≠ 0, the gradient points at x into the direction in which F increases most rapidly. At the same time, the gradient is perpendicular to the tangent line to the level curve at x*.

[Figure 3.4: Gradient of Q]


Example. Consider $Q = 4K^{3/4}L^{1/4}$ with the input bundle $(10000, 625)$. We want to know in what proportions we should add K and L to $(10000, 625)$ to increase production most rapidly. Computing the gradient:


$$\nabla Q(10000, 625) = \begin{pmatrix} \dfrac{\partial Q}{\partial K}(10000, 625) \\[2mm] \dfrac{\partial Q}{\partial L}(10000, 625) \end{pmatrix} = \begin{pmatrix} 1.5 \\ 8 \end{pmatrix}$$

We deduce that we should add K and L at a ratio of 1.5 to 8.
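This gradient is easy to verify numerically. The sketch below (an added illustration, not part of the original notes; the step size h is an arbitrary choice) approximates the two partial derivatives by central finite differences:

```python
def Q(K, L):
    # Cobb-Douglas production function Q = 4 * K^(3/4) * L^(1/4)
    return 4 * K ** 0.75 * L ** 0.25

# Central finite differences approximate the partials at (10000, 625)
h = 1e-4
K0, L0 = 10000.0, 625.0
dQ_dK = (Q(K0 + h, L0) - Q(K0 - h, L0)) / (2 * h)  # ~1.5
dQ_dL = (Q(K0, L0 + h) - Q(K0, L0 - h)) / (2 * h)  # ~8.0
print(dQ_dK, dQ_dL)
```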

Figure 3.5: Constraint is binding

Constraint is binding (active, effective or tight)


The highest level curve which meets the constraint set meets it at p. Thus, $\nabla f(p)$ and $\nabla g(p)$ line up and therefore $\nabla f(p) = \lambda^* \nabla g(p)$. Note that each gradient points in the direction in which f (resp. g) increases most rapidly at p. Thus, both gradients should point in the same direction and $\lambda^* \ge 0$.
Then, we set,

$$L(x_1, x_2, \lambda) = f(x_1, x_2) - \lambda[g(x_1, x_2) - b]$$


and calculate

$$\frac{\partial L}{\partial x_1} = \frac{\partial f}{\partial x_1} - \lambda\frac{\partial g}{\partial x_1} \qquad \frac{\partial L}{\partial x_2} = \frac{\partial f}{\partial x_2} - \lambda\frac{\partial g}{\partial x_2} \qquad \frac{\partial L}{\partial \lambda} = -[g(x_1, x_2) - b]$$

and proceed as before. Note that we also require that the maximizer is not a critical
point of g.

Figure 3.6: Constraint is not binding

Constraint is not binding


Now, the max of f occurs at a point q where $g(x_1, x_2) < b$. As q is in the interior of the constraint set, we say that the constraint is not binding (inactive, ineffective) at q. Were the point r the max, the multiplier would be negative. In fact, q must be a local unconstrained max, so:

$$\frac{\partial f}{\partial x_1}(q) = 0, \qquad \frac{\partial f}{\partial x_2}(q) = 0$$

Therefore, the derivatives of g play no role in pinning down the point q, but we can still use the Lagrangian, provided we set $\lambda = 0$, which causes the constraint function to drop out of the analysis.


Thus, we cannot use $\frac{\partial L}{\partial \lambda} = 0$, so we have to use the complementary slackness condition, which can be summarized as:

$$\lambda[g(x_1, x_2) - b] = 0$$
Theorem
Let f and g be $C^1$ functions of two variables. Suppose that $x^* = (x_1^*, x_2^*)$ maximizes f on $g(x_1, x_2) \le b$. If $g(x_1^*, x_2^*) = b$, suppose that,

$$\frac{\partial g}{\partial x_1}(x_1^*, x_2^*) \neq 0 \quad \text{or} \quad \frac{\partial g}{\partial x_2}(x_1^*, x_2^*) \neq 0$$

Forming the Lagrangian function,

$$L(x_1, x_2, \lambda) = f(x_1, x_2) - \lambda[g(x_1, x_2) - b]$$

there is a multiplier $\lambda^*$ such that:
(a) $\frac{\partial L}{\partial x_1}(x_1^*, x_2^*, \lambda^*) = 0$
(b) $\frac{\partial L}{\partial x_2}(x_1^*, x_2^*, \lambda^*) = 0$
(c) $\lambda^*[g(x_1^*, x_2^*) - b] = 0$
(d) $\lambda^* \ge 0$
(e) $g(x_1^*, x_2^*) \le b$
Example. Find the maximum of the function $f(x, y) = xy$ subject to:

$$g(x, y) = x^2 + y^2 \le 1$$
First, we have to compute the critical points of g:

$$\frac{\partial g}{\partial x} = 2x \qquad \frac{\partial g}{\partial y} = 2y$$

These derivatives are zero only if $(x, y) = (0, 0)$, but as this point is not on the boundary of the region and does not satisfy $x^2 + y^2 = 1$, the constraint qualification is satisfied and we can form the Lagrangian.

Lagrangian:

$$L(x, y, \lambda) = xy - \lambda(x^2 + y^2 - 1)$$

We take the partial derivatives of the Lagrangian and the complementary slackness condition:
1. $\frac{\partial L}{\partial x} = y - 2\lambda x = 0$
2. $\frac{\partial L}{\partial y} = x - 2\lambda y = 0$
3. $\lambda(x^2 + y^2 - 1) = 0$
4. $\lambda \ge 0$
5. $x^2 + y^2 \le 1$
We have the following system of equations:

$$y - 2\lambda x = 0 \qquad x - 2\lambda y = 0 \qquad \lambda(x^2 + y^2 - 1) = 0$$

Solving the system. We have to analyze this system through the Lagrange multipliers. We have in this case only one multiplier, $\lambda$, thus we start the analysis from the third equation, supposing that either $\lambda = 0$ or $x^2 + y^2 - 1 = 0$:
(a) If $\lambda = 0$, we have:

$$y = 0 \qquad x = 0$$

Thus, the solution of the system is $(x, y) = (0, 0)$, and this point also satisfies the inequality of the constraint $x^2 + y^2 \le 1$. Therefore, $(0, 0)$ is a possible candidate for the solution.
(b) If $x^2 + y^2 - 1 = 0$, we have:

$$y - 2\lambda x = 0 \qquad x - 2\lambda y = 0 \qquad x^2 + y^2 - 1 = 0 \qquad \lambda \ge 0$$

From the first two equations we have:

$$2\lambda = \frac{y}{x} = \frac{x}{y} \;\Longrightarrow\; y^2 = x^2$$

Plugging this result into the third equation we have:

$$2x^2 = 1 \;\Longrightarrow\; x = \pm\frac{1}{\sqrt{2}}$$

Then, $y = \pm\frac{1}{\sqrt{2}}$, $\lambda = \pm\frac{1}{2}$ and we have four possible points $(x, y, \lambda)$:

$$\left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, \tfrac{1}{2}\right) \qquad \left(-\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}, \tfrac{1}{2}\right) \qquad \left(\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}, -\tfrac{1}{2}\right) \qquad \left(-\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, -\tfrac{1}{2}\right)$$
Notice that there are some possible candidate points with $\lambda < 0$. These points would be considered if we wanted to calculate the minimizer of the function. Right now we want to calculate the maximizer of the function, so we consider the points with $\lambda \ge 0$, and then we find that the maximizers of this function are $\left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right)$ and $\left(-\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right)$.
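The case analysis above can be mirrored with a numerical solver. A scipy-based sketch (an added illustration, not part of the notes; the starting point is an arbitrary choice) that maximizes xy on the unit disk and lands on one of the two maximizers:

```python
from scipy.optimize import minimize

# Maximize f(x, y) = xy on the disk x^2 + y^2 <= 1 by minimizing -f
objective = lambda v: -v[0] * v[1]
# 'ineq' constraints in scipy mean fun(v) >= 0, so g <= b becomes b - g >= 0
constraint = {'type': 'ineq', 'fun': lambda v: 1 - v[0]**2 - v[1]**2}

res = minimize(objective, x0=[0.5, 0.5], constraints=[constraint])
print(res.x, -res.fun)  # approx [0.7071, 0.7071] and 0.5
```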

3.3.2

Several inequality constraints

Suppose that $f, g_1, \ldots, g_m$ are $C^1$ functions of k variables. Suppose that $x^* \in \mathbb{R}^k$ is a local max of f on the constraint set defined by:

$$g_1(x_1, \ldots, x_k) \le b_1 \qquad \ldots \qquad g_m(x_1, \ldots, x_k) \le b_m$$

Without loss of generality, assume that the first $m_0$ constraints are binding at $x^*$ and the last $m - m_0$ are not binding. Suppose that the rank of:


$$Dg(x^*) = \begin{pmatrix} \dfrac{\partial g_1}{\partial x_1}(x^*) & \cdots & \dfrac{\partial g_1}{\partial x_k}(x^*) \\ \vdots & & \vdots \\ \dfrac{\partial g_{m_0}}{\partial x_1}(x^*) & \cdots & \dfrac{\partial g_{m_0}}{\partial x_k}(x^*) \end{pmatrix}$$

(the Jacobian of the binding constraints) is $m_0$. Then we can form the Lagrangian:


L(x1 , . . . , xk , 1 , . . . , m ) = f (x) 1 [g1 (x) b1 ] . . . m [gm (x) bm ]
Then, there exist multipliers = (1 , . . . , m ) such that:
(a)

L
L
(x , ) = 0 . . .
(x , ) = 0
x1
xk

(b) 1 [g1 (x ) b1 ] = 0, . . . , m [gm (x ) bm ] = 0


(c) 1 0 . . . m 0
(d) g1 (x ) b1 , . . . , gm (x ) bm
Example. Maximize $f(x, y, z) = xyz$ subject to $x + y + z \le 1$, $x \ge 0$, $y \ge 0$, $z \ge 0$.
First of all, we have to transform the inequalities into the $\le$ form: $x + y + z \le 1$, $-x \le 0$, $-y \le 0$, $-z \le 0$.
We check the NDCQ. Calculating the Jacobian matrix of the constraints, we have that:

$$Dg(x, y, z) = \begin{pmatrix} \dfrac{\partial g_1}{\partial x} & \dfrac{\partial g_1}{\partial y} & \dfrac{\partial g_1}{\partial z} \\[1mm] \dfrac{\partial g_2}{\partial x} & \dfrac{\partial g_2}{\partial y} & \dfrac{\partial g_2}{\partial z} \\[1mm] \dfrac{\partial g_3}{\partial x} & \dfrac{\partial g_3}{\partial y} & \dfrac{\partial g_3}{\partial z} \\[1mm] \dfrac{\partial g_4}{\partial x} & \dfrac{\partial g_4}{\partial y} & \dfrac{\partial g_4}{\partial z} \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$$
This matrix has rank 3, which means that at most 3 of the 4 constraints can be binding at the same time. Besides, the NDCQ holds at any solution candidate.
The Lagrangian is:

$$L(x, y, z, \lambda_1, \lambda_2, \lambda_3, \lambda_4) = xyz - \lambda_1(x + y + z - 1) + \lambda_2 x + \lambda_3 y + \lambda_4 z$$

We take the partial derivatives of the Lagrangian and the complementary slackness conditions:

1. $\frac{\partial L}{\partial x} = yz - \lambda_1 + \lambda_2 = 0$
2. $\frac{\partial L}{\partial y} = xz - \lambda_1 + \lambda_3 = 0$
3. $\frac{\partial L}{\partial z} = xy - \lambda_1 + \lambda_4 = 0$
4. $\lambda_1(x + y + z - 1) = 0$
5. $\lambda_2 x = 0$
6. $\lambda_3 y = 0$
7. $\lambda_4 z = 0$
8. $\lambda_i \ge 0$, $i = 1, \ldots, 4$
9. $x + y + z \le 1$
10. $x \ge 0$
11. $y \ge 0$
12. $z \ge 0$

Figure 3.7: Lagrangian multipliers tree.


Then, we have the following system of equations:

$$\begin{aligned}
yz - \lambda_1 + \lambda_2 &= 0 \qquad (1)\\
xz - \lambda_1 + \lambda_3 &= 0 \qquad (2)\\
xy - \lambda_1 + \lambda_4 &= 0 \qquad (3)\\
\lambda_1(x + y + z - 1) &= 0 \qquad (4)\\
\lambda_2 x &= 0 \qquad (5)\\
\lambda_3 y &= 0 \qquad (6)\\
\lambda_4 z &= 0 \qquad (7)
\end{aligned}$$

From the first 3 equations we have,

$$\lambda_1 = yz + \lambda_2 = xz + \lambda_3 = xy + \lambda_4 \qquad (13)$$

Solving the system. We have to analyze this system through the Lagrange multipliers.
We will form a tree with different suppositions about the values of the Lagrange multipliers in equations (4), (5), (6) and (7). As we have 4 different multipliers, we can have $2^4 = 16$ different cases.
If $\lambda_1 = 0$, equation (13) becomes:

$$0 = yz + \lambda_2 = xz + \lambda_3 = xy + \lambda_4$$

As $x, y, z, \lambda_2, \lambda_3$ and $\lambda_4$ are nonnegative (greater than or equal to zero from the inequalities), the only possible solution is that all the terms in those 3 equations are zero:

$$\lambda_2 = \lambda_3 = \lambda_4 = 0 \qquad yz = 0 \quad xz = 0 \quad xy = 0$$

From this system, we get infinitely many solutions in which two of the variables are 0 and the other is different from 0. From inequality (9), the value of this variable will lie in the interval [0, 1]. For all these possible candidates, $f(x, y, z) = 0$. Then, all the cases from 1 to 8 in the figure have been solved.
If $\lambda_1 \neq 0$, then from equation (4) we get $x + y + z - 1 = 0$ and the system becomes:

$$\begin{aligned}
\lambda_1 &= yz + \lambda_2 \qquad (1)\\
\lambda_1 &= xz + \lambda_3 \qquad (2)\\
\lambda_1 &= xy + \lambda_4 \qquad (3)\\
x + y + z - 1 &= 0 \qquad (4)\\
\lambda_2 x &= 0 \qquad (5)\\
\lambda_3 y &= 0 \qquad (6)\\
\lambda_4 z &= 0 \qquad (7)
\end{aligned}$$


To solve this system of equations we have to discuss it starting from equations (5), (6) and (7). We analyse the different cases in the figure, from 9 to 16:
Case 9. $\lambda_1 \neq 0$, $\lambda_2 = \lambda_3 = \lambda_4 = 0$ (so $x, y, z \neq 0$). Then the system will be:

$$\lambda_1 = yz = xz = xy \qquad x + y + z - 1 = 0$$

The solution is $x = y = z = 1/3$, with $\lambda_1 = 1/9$. We check that this solution satisfies inequalities (9), (10), (11) and (12). So this point is a possible candidate, with $f(1/3, 1/3, 1/3) = 1/27$.
Case 10. $\lambda_1 \neq 0$, $\lambda_2 = \lambda_3 = 0$, $\lambda_4 \neq 0$ (so $x, y \neq 0$ and $z = 0$):

$$\lambda_1 = yz = 0 \quad (1) \qquad \lambda_1 = xz = 0 \quad (2) \qquad \lambda_1 = xy + \lambda_4 \quad (3) \qquad x + y - 1 = 0 \quad (4)$$

As from the first two equations we get $\lambda_1 = 0$, this leads to a contradiction with the supposition that $\lambda_1 \neq 0$, so this system has no solution. Therefore, $z \neq 0$ and $\lambda_4$ has to be zero (Cases 12, 14 and 16 from the figure are excluded).
Case 11. $\lambda_1 \neq 0$, $\lambda_2 = \lambda_4 = 0$, $\lambda_3 \neq 0$ (so $x, z \neq 0$ and $y = 0$):

$$\lambda_1 = yz = 0 \quad (1) \qquad \lambda_1 = xz + \lambda_3 \quad (2) \qquad \lambda_1 = xy = 0 \quad (3) \qquad x + y + z - 1 = 0 \quad (4) \qquad \lambda_3 y = 0 \quad (6)$$

Again, $y = 0$ leads to a contradiction, and therefore y has to be greater than zero, $y > 0$, and $\lambda_3 = 0$. Then Cases 12 and 15 are excluded.
Case 13. $\lambda_1 \neq 0$, $\lambda_2 \neq 0$, $\lambda_3 = \lambda_4 = 0$ (so $y, z \neq 0$ and $x = 0$). In this case we again reach a contradiction, and therefore x has to be greater than zero, $x > 0$, and $\lambda_2 = 0$. Cases 14 and 16 are excluded too.
Then, the maximizer of the problem is (1/3, 1/3, 1/3).
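As a cross-check, the same problem can be handed to a numerical solver. The sketch below (scipy, an added illustration; the starting point is an arbitrary choice) recovers (1/3, 1/3, 1/3):

```python
from scipy.optimize import minimize

# Maximize xyz subject to x + y + z <= 1 and x, y, z >= 0
objective = lambda v: -v[0] * v[1] * v[2]
constraints = [{'type': 'ineq', 'fun': lambda v: 1 - sum(v)}]  # 1 - (x+y+z) >= 0
bounds = [(0, None)] * 3                                       # x, y, z >= 0

res = minimize(objective, x0=[0.2, 0.2, 0.2], bounds=bounds,
               constraints=constraints)
print(res.x, -res.fun)  # approx [1/3, 1/3, 1/3] and 1/27
```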


The meaning of the multipliers
Let $f, g_1, \ldots, g_m$ be $C^1$ functions of k variables. Let $b = (b_1, \ldots, b_m)$ be an m-tuple of exogenous parameters and consider the problem above. Let $x_1^*(b), \ldots, x_k^*(b)$ denote the solution with multipliers $\lambda_1^*(b), \ldots, \lambda_m^*(b)$.
Suppose that $x_i^*$ and $\lambda_j^*$ are differentiable functions of $(b_1, \ldots, b_m)$ and that NDCQ holds. Then:

$$\lambda_j^*(b_1, \ldots, b_m) = \frac{\partial f}{\partial b_j}\bigl(x_1^*(b), \ldots, x_k^*(b)\bigr) \qquad j = 1, \ldots, m$$

Example. If in the previous example we change the first constraint to $x + y + z \le 0.9$, the new solution is $x^* = y^* = z^* = 0.3$, where $f(x^*, y^*, z^*) = 0.027$.
By the previous theorem, we could get an approximation of this change in f as:

$$\Delta f \approx \lambda_1^* \, \Delta b_1 = \tfrac{1}{9} \cdot (-0.1) \approx -0.011$$

$$f(0.3, 0.3, 0.3) \approx f(1/3, 1/3, 1/3) - 0.011 \approx 0.0259$$

If we change the second constraint to $x \ge -0.1$, no change will be found in the solution, as the optimal point is still inside the new constraint set. This is consistent, as this constraint is not binding at the solution: $\lambda_2^* = 0$.
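This sensitivity interpretation can be verified numerically by re-solving the problem for both values of $b_1$ (a scipy sketch added as an illustration, reusing the setup from the previous example):

```python
from scipy.optimize import minimize

def max_xyz(b):
    # Maximize xyz subject to x + y + z <= b, x, y, z >= 0; return the optimal value
    res = minimize(lambda v: -v[0] * v[1] * v[2], x0=[b / 3] * 3,
                   bounds=[(0, None)] * 3,
                   constraints=[{'type': 'ineq', 'fun': lambda v: b - sum(v)}])
    return -res.fun

actual_change = max_xyz(0.9) - max_xyz(1.0)
predicted_change = (1 / 9) * (-0.1)     # lambda_1 * delta b_1
print(actual_change, predicted_change)  # approx -0.010 vs -0.011
```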

3.4

Kuhn-Tucker formulation

The most common problem in Economics is of the form:

$$\text{Maximize } f(x_1, \ldots, x_k)$$
$$\text{subject to } g_1(x_1, \ldots, x_k) \le b_1, \ldots, g_m(x_1, \ldots, x_k) \le b_m$$
$$x_1 \ge 0, \ldots, x_k \ge 0$$
We could set:

$$L(x, \lambda, v) = f(x) - \sum_{j=1}^{m} \lambda_j[g_j(x) - b_j] + \sum_{i=1}^{k} v_i x_i$$

The first order conditions, as we have learned, are:

(a) $\dfrac{\partial L}{\partial x_i} = \dfrac{\partial f}{\partial x_i} - \lambda_1\dfrac{\partial g_1}{\partial x_i} - \ldots - \lambda_m\dfrac{\partial g_m}{\partial x_i} + v_i = 0, \quad i = 1, \ldots, k$


(b) $\lambda_j[g_j(x) - b_j] = -\lambda_j\dfrac{\partial L}{\partial \lambda_j} = 0, \quad j = 1, \ldots, m$
(c) $v_i x_i = 0, \quad i = 1, \ldots, k$
(d) $\lambda_j \ge 0, \; v_i \ge 0, \quad i = 1, \ldots, k, \; j = 1, \ldots, m$
Kuhn and Tucker worked with a Lagrangian without including the nonnegativity constraints:

$$\tilde{L}(x, \lambda_1, \ldots, \lambda_m) = f(x) - \sum_{j=1}^{m} \lambda_j[g_j(x) - b_j]$$

which is called the Kuhn-Tucker Lagrangian. Note that:

$$L(x, \lambda, v) = \tilde{L}(x, \lambda) + \sum_{i=1}^{k} v_i x_i$$

Thus, for $i = 1, \ldots, k$,

$$\frac{\partial L}{\partial x_i} = \frac{\partial \tilde{L}}{\partial x_i} + v_i = 0 \qquad \text{or} \qquad \frac{\partial \tilde{L}}{\partial x_i} = -v_i$$

Now it is straightforward to show that:

$$\frac{\partial \tilde{L}}{\partial x_i} \le 0 \qquad \text{and} \qquad x_i\frac{\partial \tilde{L}}{\partial x_i} = 0$$

On the other hand, for $j = 1, \ldots, m$,

$$\frac{\partial \tilde{L}}{\partial \lambda_j} = \frac{\partial L}{\partial \lambda_j} = b_j - g_j(x) \ge 0$$

Thus, the first order conditions in terms of the Kuhn-Tucker Lagrangian are:

$$\frac{\partial \tilde{L}}{\partial x_1} \le 0, \; \ldots, \; \frac{\partial \tilde{L}}{\partial x_k} \le 0$$

$$\frac{\partial \tilde{L}}{\partial \lambda_1} \ge 0, \; \ldots, \; \frac{\partial \tilde{L}}{\partial \lambda_m} \ge 0$$

$$x_1\frac{\partial \tilde{L}}{\partial x_1} = 0, \; \ldots, \; x_k\frac{\partial \tilde{L}}{\partial x_k} = 0$$

$$\lambda_1\frac{\partial \tilde{L}}{\partial \lambda_1} = 0, \; \ldots, \; \lambda_m\frac{\partial \tilde{L}}{\partial \lambda_m} = 0$$
Two advantages over the previous formulation:

1. $k + m$ equations instead of $2k + m$.
2. Symmetry in the way the $x_i$'s and the $\lambda_j$'s enter the first order conditions.

3.4.1

Optimization with mixed constraints

Suppose $f, g_1, \ldots, g_p, h_1, \ldots, h_m$ are $C^1$ functions of k variables. Suppose that $x^* \in \mathbb{R}^k$ is a local maximizer of f in the constraint set defined by $g_i(x) \le b_i$, $i = 1, \ldots, p$ and $h_j(x) = c_j$, $j = 1, \ldots, m$. Without loss of generality assume that the first $p_0$ inequality constraints are binding at $x^*$ whereas $p - p_0$ are not.
Suppose that the rank of the Jacobian of the binding constraints is $p_0 + m$:

$$Dgh(x^*) = \begin{pmatrix} \dfrac{\partial g_1}{\partial x_1}(x^*) & \cdots & \dfrac{\partial g_1}{\partial x_k}(x^*) \\ \vdots & & \vdots \\ \dfrac{\partial g_{p_0}}{\partial x_1}(x^*) & \cdots & \dfrac{\partial g_{p_0}}{\partial x_k}(x^*) \\ \dfrac{\partial h_1}{\partial x_1}(x^*) & \cdots & \dfrac{\partial h_1}{\partial x_k}(x^*) \\ \vdots & & \vdots \\ \dfrac{\partial h_m}{\partial x_1}(x^*) & \cdots & \dfrac{\partial h_m}{\partial x_k}(x^*) \end{pmatrix}$$
Form the Lagrangian:

$$L(x, \lambda, \mu) = f(x) - \sum_{i=1}^{p} \lambda_i[g_i(x) - b_i] - \sum_{j=1}^{m} \mu_j[h_j(x) - c_j]$$

Then, there exist multipliers $\lambda_1^*, \ldots, \lambda_p^*, \mu_1^*, \ldots, \mu_m^*$ such that:
(a) $\frac{\partial L}{\partial x_1}(x^*, \lambda^*, \mu^*) = 0, \; \ldots, \; \frac{\partial L}{\partial x_k}(x^*, \lambda^*, \mu^*) = 0$
(b) $\lambda_1^*[g_1(x^*) - b_1] = 0, \; \ldots, \; \lambda_p^*[g_p(x^*) - b_p] = 0$
(c) $h_j(x^*) = c_j, \quad j = 1, \ldots, m$
(d) $\lambda_1^* \ge 0, \; \ldots, \; \lambda_p^* \ge 0$
(e) $g_1(x^*) \le b_1, \; \ldots, \; g_p(x^*) \le b_p$

3.4.2

Envelope theorem

Unconstrained problems
Let $f(x; a)$ be a $C^1$ function of $x \in \mathbb{R}^k$ and the scalar a. For each choice of a, consider the problem:

$$\text{Max } f(x; a) \text{ with respect to } x$$

Let $x^*(a)$ be the solution of this problem and suppose that $x^*(a)$ is a $C^1$ function of a. Then,

$$\frac{df}{da}(x^*(a); a) = \frac{\partial f}{\partial a}(x^*(a); a)$$

Example. Calculate the effect of a unit increase in a on the max value of $f(x; a) = -x^2 + 2ax + 4a^2$.

$$f'(x) = -2x + 2a = 0 \;\Longrightarrow\; x^*(a) = a$$

Now, $f(x^*(a); a) = f(a; a) = 5a^2$. Applying the envelope theorem,

$$\frac{df}{da}(x^*(a); a) = \frac{\partial f}{\partial a}(a; a) = 10a$$

which means f will increase at a rate of 10a as a increases one unit.
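A symbolic check of this computation (a sympy sketch added as an illustration, not part of the original notes):

```python
import sympy as sp

x, a = sp.symbols('x a', real=True)
f = -x**2 + 2*a*x + 4*a**2

x_star = sp.solve(sp.diff(f, x), x)[0]  # x*(a) = a
value = f.subs(x, x_star)               # optimal value 5a^2
print(sp.diff(value, a))                # 10*a: total derivative of the value
print(sp.diff(f, a).subs(x, x_star))    # 10*a: partial derivative at x*, as the theorem states
```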
Constrained problems
Suppose $f, h_1, \ldots, h_m$ are $C^1$ functions of k variables and a scalar a. Let $x^*(a) = (x_1^*(a), \ldots, x_k^*(a))$ denote the solution of the problem of maximizing $f(x; a)$ on the constraint set $h_j(x; a) = 0$, $j = 1, \ldots, m$.
For any fixed choice of a, suppose that $x^*(a)$ and $\mu_j^*(a)$ are $C^1$ functions of a and that NDCQ holds. Then,


$$\frac{df}{da}(x^*(a); a) = \frac{\partial L}{\partial a}(x^*(a), \mu^*(a); a)$$
Example. Find the maximum of the function $f(x, y) = xy$ subject to:

$$g(x, y) = x^2 + ay^2 \le 1$$

Lagrangian:

$$L(x, y, \lambda) = xy - \lambda(x^2 + ay^2 - 1)$$

A solution for $a = 1$ was already calculated before, and it is $x^* = y^* = 1/\sqrt{2}$ with $\lambda^* = 1/2$. Now, the envelope theorem says that as a changes from 1 to 1.1 (a positive increase of 0.1 in a), the optimal value of f changes at the rate

$$\frac{df}{da}(x^*(a); a) = \frac{\partial L}{\partial a}(x^*(a), \lambda^*(a); a) = -\lambda y^2$$

$$\frac{\partial L}{\partial a}\left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, \tfrac{1}{2}; 1\right) = -\frac{1}{4}$$

So the optimal value of f will decrease by approximately $0.25 \cdot 0.1 = 0.025$, to 0.475. One can calculate directly that the solution to the new problem is $x^* = \tfrac{1}{\sqrt{2}}$, $y^* = \tfrac{1}{\sqrt{2.2}}$, with maximum objective value of f approx. 0.4767.
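The envelope prediction can be compared with re-solving the perturbed problem directly (a scipy sketch added as an illustration; the starting point is an arbitrary choice):

```python
from scipy.optimize import minimize

def max_xy(a):
    # Maximize xy subject to x^2 + a*y^2 <= 1; return the optimal value
    res = minimize(lambda v: -v[0] * v[1], x0=[0.5, 0.5],
                   constraints=[{'type': 'ineq',
                                 'fun': lambda v: 1 - v[0]**2 - a * v[1]**2}])
    return -res.fun

print(max_xy(1.0) - 0.025)  # envelope approximation: ~0.475
print(max_xy(1.1))          # direct recomputation: ~0.4767
```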

3.4.3

Special functions

Homogeneous functions (Chapter 20 of Simon and Blume)

For a scalar $r \in \mathbb{R}$, a real-valued function $f(x_1, \ldots, x_k)$ is homogeneous of degree r if:

$$f(tx_1, \ldots, tx_k) = t^r f(x_1, \ldots, x_k) \qquad \forall (x_1, \ldots, x_k), \; \forall t > 0$$

$y = ax^k$ is a homogeneous function of degree k, while $y = x^3 + 4x^2$ is not homogeneous at all. A monomial $z = ax_1^{k_1}x_2^{k_2}x_3^{k_3}$ is a homogeneous function of degree $k_1 + k_2 + k_3$. A function that is a sum of monomials of the same degree is a homogeneous function. If a function is composed of monomials of different degrees, it is not a homogeneous function.
In Economics, production functions are usually homogeneous functions. Homogeneity of degree 1 is equivalent to constant returns to scale (double input, double output). If $r > 1$ (doubling the input multiplies the output by $2^r$), the firm exhibits increasing returns to scale, whereas if $r < 1$, decreasing returns to scale.
Properties


(a) If $f \in C^1$ is homogeneous of degree r, its partial derivatives are homogeneous of degree $r - 1$.
(b) Let $f \in C^1$ on $\mathbb{R}^k_+$. The tangent planes to the level sets of f have constant slope along each ray from the origin.
Example. Suppose that $u(x_1, x_2)$ is a homogeneous utility function. Fixing $(p_1, p_2, I_0)$, we want to maximize $u(x_1, x_2)$ subject to:

$$p_1 x_1 + p_2 x_2 \le I_0$$

$x^0$ is the solution. If we increase I to $I_1$, we get a new solution $x^1$. The optimal bundle demanded at different income levels is called the Income Expansion Path, and by (b) it is a ray from the origin for homogeneous utility functions.

Figure 3.8: Homogeneous utility function


(c) (Euler's theorem) If $f \in C^1$ is homogeneous of degree r on $\mathbb{R}^k_+$, then:

$$x_1\frac{\partial f}{\partial x_1}(x) + \ldots + x_k\frac{\partial f}{\partial x_k}(x) = rf(x), \qquad \text{i.e.,} \quad x \cdot \nabla f(x) = rf(x)$$
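Property (c) is easy to confirm symbolically for a concrete homogeneous function; the sympy sketch below (an added illustration) uses the Cobb-Douglas function $f = 4x_1^{3/4}x_2^{1/4}$, which is homogeneous of degree $r = 1$:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2', positive=True)
f = 4 * x1**sp.Rational(3, 4) * x2**sp.Rational(1, 4)  # homogeneous of degree 1

# Euler's formula: x1*f_x1 + x2*f_x2 should equal r*f with r = 1
euler_sum = x1 * sp.diff(f, x1) + x2 * sp.diff(f, x2)
print(sp.simplify(euler_sum - f))  # 0
```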


3.4.4

Concave and quasiconcave functions (Chapter 21 of Simon and Blume)

Convex set
A set U is a convex set if whenever x and y are points in U, the line segment from x to y,

$$l(x, y) = \{tx + (1 - t)y \mid 0 \le t \le 1\},$$

is also in U.

Figure 3.9: Convex and not convex sets

Concave function
A real-valued function f defined on a convex set U of $\mathbb{R}^k$ is concave if $\forall x, y \in U$ and $\forall t \in [0, 1]$,

$$f(tx + (1 - t)y) \ge tf(x) + (1 - t)f(y)$$

Convex function
A real-valued function g defined on a convex set U of $\mathbb{R}^k$ is convex if $\forall x, y \in U$ and $\forall t \in [0, 1]$,

$$g(tx + (1 - t)y) \le tg(x) + (1 - t)g(y)$$


Figure 3.10: Concave function


Note that if f is concave, then $-f$ is convex. Concavity and convexity are always defined on convex sets. In Economics, almost all functions, especially utility and production functions, have convex sets as their natural domains.

Theorems
Theorem 1. Let f be a $C^1$ function on an interval I in $\mathbb{R}$. Then, f is concave (convex) if and only if:

$$f(y) - f(x) \le (\ge) \; f'(x)(y - x) \qquad \forall x, y \in I$$

Theorem 2. Let f be a $C^1$ function on a convex subset U of $\mathbb{R}^k$. Then f is concave (convex) on U if and only if $\forall x, y \in U$:

$$f(y) - f(x) \le (\ge) \; Df(x)(y - x)$$

Theorem 3. Let f be a $C^2$ function on an open convex subset U of $\mathbb{R}^k$. Then, f is a concave (convex) function on U if and only if the Hessian matrix $D^2 f(x)$ is negative (positive) semidefinite $\forall x \in U$.
Theorem 4 (Global maxima and minima). Let f be a concave (convex) function defined on an open convex subset U of $\mathbb{R}^k$. If $x_0$ is a critical point of f, $Df(x_0) = 0$, then $x_0$ is a global maximizer (minimizer) of f on U.

Quasiconcave functions
A function f defined on a convex subset $U \subseteq \mathbb{R}^k$ is quasiconcave if $\forall x, y \in U$ and $\forall t \in [0, 1]$,

$$f(tx + (1 - t)y) \ge \min\{f(x), f(y)\}$$

Alternative definition: f is quasiconcave if $\forall a \in \mathbb{R}$, $C_a = \{x \in U \mid f(x) \ge a\}$ is a convex set.
Example. Every Cobb-Douglas function $F(x, y) = Ax^a y^b$ with $A, a, b > 0$ is quasiconcave.
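The defining inequality can be spot-checked numerically for a Cobb-Douglas function (a sketch added as an illustration; $A = 2$, $a = b = 0.5$ and the sampled points are arbitrary choices):

```python
import numpy as np

F = lambda v: 2.0 * v[0]**0.5 * v[1]**0.5  # Cobb-Douglas with A = 2, a = b = 0.5

rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.uniform(0.1, 10, size=2)
    y = rng.uniform(0.1, 10, size=2)
    t = rng.uniform()
    # Quasiconcavity: F(tx + (1-t)y) >= min(F(x), F(y)), up to rounding
    assert F(t * x + (1 - t) * y) >= min(F(x), F(y)) - 1e-9
print("quasiconcavity inequality holds on all sampled pairs")
```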

Chapter 4

Analysis
4.1

Sequences of real numbers

A sequence of real numbers is a function $f: \mathbb{N} \to \mathbb{R}$. This function relates a natural number to a real number. Usually, the sequence is considered to be not the function f but the images:

$$\{x_n\}_{n=1}^{\infty} = \{f(n)\}_{n=1}^{\infty}$$

4.1.1

Convergent sequence

A sequence $\{x_n\}_{n=1}^{\infty}$ converges to a limit x if $\forall \epsilon > 0$ there exists a number N such that,

$$|x_n - x| < \epsilon \qquad \forall n \ge N$$

$$\lim_{n \to \infty} x_n = x$$
Examples. Sequences converging to 0:

$$1, \, 0, \, \tfrac{1}{2}, \, 0, \, \tfrac{1}{3}, \, 0, \, \ldots$$

$$1, \, \tfrac{1}{2}, \, \tfrac{1}{3}, \, \tfrac{1}{4}, \, \ldots$$

$$1, \, \tfrac{3}{1}, \, \tfrac{1}{2}, \, \tfrac{3}{2}, \, \tfrac{1}{3}, \, \tfrac{3}{3}, \, \tfrac{1}{4}, \, \ldots$$


Cauchy sequence
It is a sequence $\{x_n\}_{n=1}^{\infty}$ such that $\forall \epsilon > 0$ there exists a number N such that,

$$|x_n - x_m| < \epsilon \qquad \forall n \ge N, \; \forall m \ge N$$

Proposition: Any convergent sequence is a Cauchy sequence.
Theorem. Let $\{x_n\}_{n=1}^{\infty}$, $\{y_n\}_{n=1}^{\infty}$ be sequences with limits x and y respectively. Then, the sequence $\{x_n + y_n\}_{n=1}^{\infty}$ converges to the limit $x + y$ and the sequence $\{x_n y_n\}_{n=1}^{\infty}$ converges to $xy$.

Monotone sequences
A sequence is monotone increasing (decreasing) if $x_{n+1} \ge (\le) \; x_n$ $\forall n \in \mathbb{N}$. It is monotone if it is either monotone increasing or monotone decreasing.
Bounded sequence: A sequence is bounded if there is a number B such that

$$|x_n| \le B \qquad \forall n$$

Theorem: Every bounded monotone sequence converges.


Example. Consider the sequence $\{a_n\}$:

$$a_n = p\left(1 + \frac{r}{12}\right)^n$$

in which p is the investment capital, $a_n$ is the accounting balance after n months, and r is the annual compound interest rate.
This sequence is divergent, as:

$$\lim_{n \to \infty} p\left(1 + \frac{r}{12}\right)^n = p \lim_{n \to \infty}\left(1 + \frac{r}{12}\right)^n = p \lim_{n \to \infty} k^n = \infty$$

Notice that the limit tends to infinity as $k = 1 + \frac{r}{12} > 1$.
Besides, this sequence is monotone increasing.
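A small numeric illustration of this sequence (a sketch added here; the values of p and r are arbitrary choices):

```python
# Monthly compound balances a_n = p * (1 + r/12)^n for sample parameters
p, r = 1000.0, 0.06  # initial capital and annual interest rate (assumed values)
k = 1 + r / 12       # monthly growth factor, k > 1

balances = [p * k**n for n in range(0, 361, 120)]
print(balances)      # strictly increasing and unbounded as n grows
```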


4.2

Open sets

Open ball: For a point $z \in \mathbb{R}^k$ and $\epsilon > 0$, the open $\epsilon$-ball about z is:

$$B_\epsilon(z) = \{x \in \mathbb{R}^k \mid \|x - z\| < \epsilon\}$$

Open set: A set S in $\mathbb{R}^k$ is open if for each $x \in S$, there exists an open $\epsilon$-ball about x completely contained in S:

$$\forall x \in S \;\; \exists \epsilon > 0 \mid B_\epsilon(x) \subseteq S$$

Examples. $(0, 1)$ is an open set in $\mathbb{R}$, but not in $\mathbb{R}^2$. $S = \{x \in \mathbb{R}^k \mid \|x\| \le 1\}$ is not open.
Theorem
The intersection of a finite number of open subsets of $\mathbb{R}^k$ is an open set.
The union of an arbitrary collection of open sets is an open set.
Interior of a set S: The union of all open sets contained in S (the largest open set contained in S).
Example. $S = \{(x, y) \in \mathbb{R}^2 \mid 0 < x \le 1\}$. Int(S) consists of the points of S having an open $\epsilon$-ball contained in S:

$$\operatorname{Int}(S) = \{(x, y) \in \mathbb{R}^2 \mid 0 < x < 1\}$$

Closed set. A set $S \subseteq \mathbb{R}^k$ is closed if whenever $\{x_n\}_{n=1}^{\infty}$ is a convergent sequence completely contained in S, its limit is also contained in S. Equivalently, a set S is closed if $S^c = \mathbb{R}^k \setminus S$ is open.
Theorem

Any intersection of closed sets is closed.

The finite union of closed sets is closed.

4.3

Continuity of functions

4.3.1

Continuous function at x0

Let $f: \mathbb{R}^k \to \mathbb{R}^m$ and $x_0 \in \mathbb{R}^k$. f is continuous at $x_0$ if $\forall \epsilon > 0$ $\exists \delta > 0$ such that $\|x - x_0\| < \delta$ implies that $\|f(x) - f(x_0)\| < \epsilon$. f is continuous if it is continuous at every point in its domain.
f is continuous if and only if for every convergent sequence in its domain, $x_n \to x_0$ implies $f(x_n) \to f(x_0)$.
Example.

$$f(x) = \begin{cases} 1 & x > 0 \\ 0 & x \le 0 \end{cases}$$

The sequence $1/n$ converges to 0, but $f(1/n) = 1$, which is not $f(0) = 0$.

4.3.2

Uniformly continuous function


Let $f: \mathbb{R}^k \to \mathbb{R}^m$ and $B \subseteq \mathbb{R}^k$. We say f is uniformly continuous on B if $\forall \epsilon > 0$ $\exists \delta > 0$ such that $\forall x, y \in B$, $\|x - y\| < \delta$ implies that $\|f(x) - f(y)\| < \epsilon$. Clearly, if f is uniformly continuous, then f is continuous.
Example. $f: \mathbb{R} \to \mathbb{R}$, $f(x) = x^2$ is continuous, but not uniformly continuous: given $\epsilon > 0$ and $x_0 > 0$, we want to choose $\delta$ such that

$$|x - x_0| < \delta \;\Longrightarrow\; |x^2 - x_0^2| < \epsilon$$

Clearly, as $x_0$ gets bigger, to keep the same $\epsilon$ the $\delta$ has to be smaller, implying that given $\epsilon > 0$, there is not a single $\delta > 0$ which works for every $x_0 \in \mathbb{R}$.
