
G6215.001 - Recitation 3: Ramsey Growth model with technological progress; discrete time dynamic programming and applications

Contents
1 The Ramsey growth model with technological progress
1.1 A useful lemma
1.2 Household
1.2.1 Setup
1.2.2 Characterization of the optimal path for consumption and asset holdings
1.2.3 Euler equation and transversality condition
1.3 Firms
1.4 Equilibrium
1.5 The model equations in intensive form

2 Dynamic programming and applications
2.1 A first look at discrete time deterministic problems
2.1.1 Sequential problem and functional equation
2.1.2 The principle of optimality
2.1.3 Solving the functional equation


1 The Ramsey growth model with technological progress


Spoiler: the differences with the Ramsey growth model seen in class are minimal.
In class, we saw that the competitive equilibrium of the Ramsey economy was the unique pair of deterministic
processes $\{c(t), k(t)\}_{t \ge 0}$ solving the following system of differential equations:

$$\frac{\dot{c}(t)}{c(t)} = \frac{1}{\varepsilon_u(c(t))}\left(f'(k(t)) - \delta - \rho\right)$$
$$\dot{k}(t) = f(k(t)) - (n + \delta)k(t) - c(t)$$

for the initial and boundary conditions:

$$\lim_{T \to \infty} k(T) \exp\left(-\int_0^T \left(f'(k(s)) - \delta - n\right) ds\right) = 0$$
$$k(0) > 0 \text{ given}$$


(c(t), k(t)) were consumption per capita and capital per capita, respectively.
Now, assume we add in exogenously growing Harrod-neutral technology:

$$Y(t) = F(K(t), A(t)L(t))$$
$$\frac{\dot{A}(t)}{A(t)} = g$$
In this recitation, we will show that the equilibrium of the newly obtained Ramsey economy is the unique
sequence $\{\hat{c}(t), \hat{k}(t)\}_{t \ge 0}$ such that:

$$\frac{\dot{\hat{c}}(t)}{\hat{c}(t)} = \frac{1}{\sigma}\left(f'(\hat{k}(t)) - \delta - \rho - \sigma g\right)$$
$$\dot{\hat{k}}(t) = f(\hat{k}(t)) - (n + \delta + g)\hat{k}(t) - \hat{c}(t)$$

for the initial and boundary conditions:

$$\lim_{T \to \infty} \hat{k}(T) \exp\left(-\int_0^T \left(f'(\hat{k}(s)) - \delta - n - g\right) ds\right) = 0$$
$$\hat{k}(0) > 0 \text{ given}$$


where $(\hat{k}(t), \hat{c}(t))$ are now capital and consumption per unit of effective labour, i.e.
$$\hat{k}(t) \equiv \frac{K(t)}{A(t)L(t)}, \qquad \hat{c}(t) \equiv \frac{C(t)}{A(t)L(t)}$$
Notice the subtle difference: $\varepsilon_u(c(t)) \equiv -\frac{u''(c(t))c(t)}{u'(c(t))}$ in the first system has been replaced by σ in the
second. This is because, when there is technological progress, we will show that the only form of
instantaneous utility consistent with our Kaldor facts from the first recitation is the CRRA utility function.
Otherwise, the introduction of technological progress will merely mean that per capita variables now grow
at the constant rate g in the steady-state. In the differential equations governing the system, effective rates of
depreciation have to be modified accordingly. However:
• The household’s optimization problem remains completely unchanged, and in particular, one may rewrite
the new transversality condition in terms of capital per capita (that is, in terms of $k$, not $\hat{k}$) to see that it is
identical to the transversality condition in the first problem.
• The analysis of existence and stability of the steady-state will be very similar to that made in class. The
system is saddle-path stable, and transitional dynamics are as described in class.
Since the analysis of the model is very similar to what was done in class, we will only emphasize important
points.

1.1 A useful lemma


We use this repeatedly in the analysis of the Ramsey economy.
Result 1 (First-Order Differential Equation with variable coefficients). Given two continuous functions r(t) and
Y(t), the linear differential equation:
$$\dot{X}(t) = r(t)X(t) + Y(t)$$
with initial condition X(0) has the unique solution:
$$X(T) = X(0)\exp\left(\int_0^T r(s)\,ds\right) + \exp\left(\int_0^T r(s)\,ds\right)\int_0^T \exp\left(-\int_0^t r(s)\,ds\right) Y(t)\,dt$$
Proof. In class.
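As a sanity check on Result 1, the sketch below compares the closed-form expression with a direct numerical integration of the differential equation. The particular functions r(t) and Y(t), the initial value, the horizon and the helper names are illustrative assumptions, not part of the recitation.

```python
import math

def closed_form(x0, r, y, T, n=20000):
    """Evaluate X(T) = e^{R(T)} * (X(0) + int_0^T e^{-R(t)} Y(t) dt), with
    R(t) = int_0^t r(s) ds, using the trapezoid rule for both integrals."""
    h = T / n
    R = 0.0                      # running value of int_0^t r(s) ds
    integral = 0.0               # running value of int_0^t e^{-R(u)} Y(u) du
    prev_r, prev_g = r(0.0), y(0.0)   # e^{-R(0)} = 1 at the left endpoint
    for i in range(1, n + 1):
        t = i * h
        cur_r = r(t)
        R += 0.5 * h * (prev_r + cur_r)
        cur_g = math.exp(-R) * y(t)
        integral += 0.5 * h * (prev_g + cur_g)
        prev_r, prev_g = cur_r, cur_g
    return math.exp(R) * (x0 + integral)

def integrate_ode(x0, r, y, T, n=20000):
    """Integrate X'(t) = r(t) X(t) + Y(t) directly with a classical RK4 scheme."""
    h = T / n
    x, t = x0, 0.0
    f = lambda t, x: r(t) * x + y(t)
    for _ in range(n):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h / 2 * k1)
        k3 = f(t + h / 2, x + h / 2 * k2)
        k4 = f(t + h, x + h * k3)
        x += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return x

# Hypothetical time-varying coefficients, chosen only for illustration.
r = lambda t: 0.03 + 0.01 * math.sin(t)
y = lambda t: 1.0 - 0.5 * math.exp(-t)

closed = closed_form(2.0, r, y, 10.0)
numeric = integrate_ode(2.0, r, y, 10.0)
print(closed, numeric)   # the two values should agree up to discretization error
```

The two routines share no code beyond the coefficient functions, so agreement between them is a meaningful check of the formula.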


Application. In the Ramsey economy, assets of the representative household (i.e., aggregate assets) follow:

$$\dot{A}(t) = r(t)A(t) + w(t)L(t) - C(t)$$

This is simply the budget constraint of the household (here A(t) denotes aggregate assets, not the level of
technology). Using our lemma, along with the value of initial assets A(0), we can solve for A(T) as:
$$A(T) = A(0)\exp\left(\int_0^T r(s)\,ds\right) + \exp\left(\int_0^T r(s)\,ds\right)\int_0^T \exp\left(-\int_0^t r(s)\,ds\right)\left(w(t)L(t) - C(t)\right)dt$$

We can rewrite this as:

$$\int_0^T \exp\left(-\int_0^t r(s)\,ds\right) w(t)L(t)\,dt = \int_0^T \exp\left(-\int_0^t r(s)\,ds\right) C(t)\,dt + \exp\left(-\int_0^T r(s)\,ds\right) A(T) - A(0)$$

This is the budget constraint of the household (with population L(t)), which can easily be rewritten in per capita
terms. On the left hand side is the present discounted value of total income from time 0 to time T, discounted
from the point of view of period 0. The discount factor up to any time t ≤ T is $\exp\left(-\int_0^t r(s)\,ds\right)$. On the right hand
side is the present discounted value of everything I purchase between time 0 and time T: that is, consumption
plus the net change in my stock of capital.

1.2 Household
1.2.1 Setup
At time t, the infinitely lived household is populated by $L(t) = e^{nt}$ individuals who all derive the instantaneous
utility $u(c(t))$ from an identical amount of consumption $c(t) = C(t)/L(t)$. Instantaneous utility is assumed to be
twice continuously differentiable, strictly concave, and to satisfy the Inada-type condition $\lim_{c \to 0^+} u'(c) = +\infty$.
The lifetime utility of the household is given by:
$$U(c) = \int_0^{\infty} u(c(t)) \exp(-(\rho - n)t)\,dt$$

We always make the following assumption:

Assumption 1.1 (Discounting). The discount rate always exceeds the rate of population growth, that is: ρ > n


Intuitively, why do we need to make this restriction? If population were to grow faster than the rate
at which we discount the future, the population-adjusted discounted value of the utility of the guys living in
period t, from the point of view of period 0, $L(t)e^{-\rho t}u(c(t))$, would tend towards infinity in the very long run, if $u(c(t))$
stays bounded away from zero. The household’s problem would cease to be well defined: any plan is optimal if discounted
utility grows toward infinity as time increases.
Asset accumulation by the household follows:

Ȧ(t) = r(t)A(t) + w(t)L(t) − C(t)

or in per capita terms:

$$\dot{a}(t) = (r(t) - n)a(t) + w(t) - c(t)$$
The household seeks to pick a path (c(t), a(t)) that maximizes U (c) subject to the asset accumulation constraint.
As discussed in class, if the household were only solving this problem, then it would prefer to choose a path
of A(t) that diverges toward −∞. Looking at the integrated budget constraint in the previous section, we see
that this would allow the household to pursue a path where consumption diverges (in the sense that the present
discounted value of consumption could be infinite).
Thus, in order to force the household to have a well-defined optimization problem, we need to impose an
additional constraint. There are different possible choices for this additional constraint; for a discussion of this
(and in particular for why natural debt constraints are not sufficient here), see Acemoglu, 8.1.2. The constraint
that we choose to impose is the No-Ponzi scheme condition:
$$\lim_{T \to +\infty} A(T)\exp\left(-\int_0^T r(s)\,ds\right) \ge 0$$

Or, in per capita terms:
$$\lim_{T \to +\infty} a(T)\exp\left(-\int_0^T (r(s) - n)\,ds\right) \ge 0$$

Note that under the No-Ponzi scheme condition, taking limits in the integrated budget constraint as T → +∞,
we obtain:
$$A(0) + \int_0^{\infty} \exp\left(-\int_0^t r(s)\,ds\right) w(t)L(t)\,dt \ge \int_0^{\infty} \exp\left(-\int_0^t r(s)\,ds\right) C(t)\,dt$$

This is a "lifetime" budget constraint (for the infinitely lived household): the present discounted value of total
earnings cannot be lower than the present discounted value of total consumption. However, this constraint
does not, a priori, need to hold with equality.


1.2.2 Characterization of the optimal path for consumption and asset holdings
The household maximizes lifetime utility subject to the asset accumulation constraint and the No-Ponzi scheme
condition. To solve this problem, we set up its associated Hamiltonian, which we denote J(t):

$$J(t) = u(c(t))\exp(-(\rho - n)t) + \lambda(t)\left(w(t) + (r(t) - n)a(t) - c(t)\right)$$

Here, λ(t) represents the value, in utils at time 0, of an additional unit of income at time t, sometimes called the
shadow value of assets at time t.
Under the No-Ponzi scheme condition and a given initial condition for a(0), and because of the properties of
u, theorems 7.13 and 7.14 in Acemoglu tell us that the path of asset holdings and consumption that solves the
constrained maximization problem of the household is characterized by the following equations:

$$\frac{\partial J(t)}{\partial c(t)} = 0$$
$$\frac{\partial J(t)}{\partial a(t)} = -\dot{\lambda}(t)$$
$$\lim_{T \to \infty} \lambda(T)a(T) = 0$$

From the second equation and our initial lemma, the shadow value of income follows:
$$\lambda(t) = \lambda(0)\exp\left(-\int_0^t (r(s) - n)\,ds\right) = u'(c(0))\exp\left(-\int_0^t (r(s) - n)\,ds\right)$$
where the last equality uses the first condition evaluated at t = 0.

Eliminating the shadow value of income, we can rewrite the sufficient conditions as:

$$\frac{\dot{c}(t)}{c(t)} = \frac{1}{\varepsilon_u(c(t))}\left(r(t) - \rho\right)$$
$$\lim_{T \to \infty} \exp\left(-\int_0^T (r(s) - n)\,ds\right) a(T) = 0$$

The second condition is known as the transversality condition.
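For reference, here is a sketch of how the first two optimality conditions combine into the Euler equation. It only uses the function J(t) defined above and the definition of the elasticity of marginal utility; the intermediate steps are filled in here and are not taken from the recitation itself.

```latex
\begin{align*}
\frac{\partial J(t)}{\partial c(t)} = 0
  &\;\Rightarrow\; u'(c(t))\,e^{-(\rho - n)t} = \lambda(t) \\
\frac{\partial J(t)}{\partial a(t)} = -\dot{\lambda}(t)
  &\;\Rightarrow\; \frac{\dot{\lambda}(t)}{\lambda(t)} = -(r(t) - n) \\
\text{taking logs and differentiating the first line:}\quad
  & \frac{u''(c(t))\,c(t)}{u'(c(t))}\,\frac{\dot{c}(t)}{c(t)} - (\rho - n)
    = \frac{\dot{\lambda}(t)}{\lambda(t)} = -(r(t) - n) \\
  &\;\Rightarrow\; \frac{\dot{c}(t)}{c(t)} = \frac{1}{\varepsilon_u(c(t))}\,\bigl(r(t) - \rho\bigr),
    \qquad \varepsilon_u(c) \equiv -\frac{u''(c)\,c}{u'(c)}
\end{align*}
```

Note that the population growth rate n drops out: it enters both the effective discount rate and the effective return on assets.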


1.2.3 Euler equation and transversality condition


We will discuss two important intuitions about this solution during the recitation:

• First, the Euler equation. The term $\varepsilon_u(c(t))$ (the inverse of the intertemporal elasticity of substitution)
is key here. It is always positive, and it governs the intensity of the response of consumption growth
to the difference between interest and discount rates. Informally, when r(t) > ρ, the household wants
to consume less today in order to save and consume more tomorrow. However, when it does this, it also
"suffers" because its consumption profile becomes steeper. The resulting choice of consumption path is a
compromise between the two effects. To see why the elasticity of marginal utility governs the second effect,
we can show that (see footnote 1):
$$\varepsilon_u(c(t)) = \lim_{s \to t} -\frac{d \ln\left(u'(c(s))/u'(c(t))\right)}{d \ln\left(c(s)/c(t)\right)}$$
To interpret the term inside the limit, think of $s > t$, $c_s > c_t$, and think of the effect of consuming slightly
less at time t and slightly more at time s, that is, increasing slightly the ratio $c_s/c_t$. At time t, you lose
$u'(c_t)$ utils, but at time s, you gain $u'(c_s)$. Because u is concave, $u'(c_t) > u'(c_s)$, so that small move always
makes the agent worse off. How much worse off depends on how much $u'(c_t)$ increases relative to
$u'(c_s)$, that is, it depends on the elasticity in the limit above. The bigger that elasticity, the worse this
trade-off is for you. In the limit, the larger $\varepsilon_u(c(t))$, the worse it is for the household to increase the
slope of its consumption profile. From the Euler equation, one sees that this effect dampens the response
of consumption to changes in interest rates.

• Second, the optimality condition:
$$\lim_{T \to \infty} \exp\left(-\int_0^T (r(s) - n)\,ds\right) a(T) = 0$$
is actually the No-Ponzi scheme condition, holding with equality. Be careful not to get this the wrong way round.
We need to impose the No-Ponzi scheme condition (inequality) in order to ensure that the optimization
problem of the household is well defined. Once that is ensured, it turns out that the optimal plan involves
a path for assets that makes the No-Ponzi scheme condition bind. The pedestrian intuition for this is that
there should be no money left on the table by the household: roughly, if the optimal solution involved the
household "dying" with a strictly positive amount of wealth, one can think of deviating from the optimum
by saving slightly less over the life-cycle (in a way that the No-Ponzi condition still doesn’t bind) and
increasing the present discounted value of its consumption with the extra available funds. Because utility
is increasing in the PDV of lifetime consumption, this would violate optimality of the initial solution.

Footnote 1. Here is a dodgy proof. Let $k_{t,s} \equiv \ln\left(\frac{c(s)}{c(t)}\right)$, then up to first order:
$$\frac{u'(c(s))}{u'(c(t))} = 1 + \frac{u''(c(t))c(t)}{u'(c(t))}\,k_{t,s} + o(k_{t,s})$$
Thus,
$$\ln\left(\frac{u'(c(s))}{u'(c(t))}\right) \approx \frac{u''(c(t))c(t)}{u'(c(t))}\,k_{t,s}$$
As $k_{t,s} \equiv \ln\left(\frac{c(s)}{c(t)}\right)$, this roughly justifies the limiting equality.
Finally, note that consumption per capita follows:
$$\frac{\dot{c}(t)}{c(t)} = \frac{1}{\varepsilon_u(c(t))}\left(r(t) - \rho\right)$$
Remember the Kaldor facts: in the long-run, consumption per capita has a constant growth rate, and r(t) is
constant. Therefore, the only type of preferences allowing for long-run growth and a constant interest rate are
those such that:
$$\varepsilon_u(c(t)) = \text{constant} \equiv \sigma$$
It can be shown that this reduces us to the class of CRRA utility functions:
$$u(c) = \frac{c^{1-\sigma}}{1-\sigma}$$
Our interpretation of the coefficient $\varepsilon_u(c(t))$ still holds; a small σ means that the utility cost of choosing a
steeper consumption profile is relatively small (so that households will not care too much about variability of
consumption over their life-cycle), while a large σ means the opposite (households prefer smooth consumption
profiles).
As in class, we can use the transversality condition and our initial remark about the integrated budget
constraint to show that:
$$c(0) = \mu(0)\left(a(0) + \tilde{w}(0)\right)$$
where:
$$\mu(0)^{-1} = \int_0^{+\infty} \exp\left(\frac{1-\sigma}{\sigma}\int_0^t r(s)\,ds - \frac{\rho t}{\sigma} + nt\right) dt$$

and $\tilde{w}(0)$ is the discounted value of lifetime earnings.

We can use this expression to think about the effects of higher lifetime interest rates (an increase in $\int_0^t r(s)\,ds$)
on period 0 consumption. On the one hand, it increases our lifetime wealth (because all future savings will earn
a higher return); therefore, consumption should increase in all periods, and in particular c(0) should increase.
But on the other hand, higher interest rates also mean the household gains from postponing consumption to
the future, and so c(0) should decrease. Which of the two effects dominates for c(0) is again a function of how much
the household dislikes steep consumption profiles. We know from our previous discussion that if σ is high, then
the household dislikes steep consumption profiles, and so the second effect is probably small with respect to
the first effect; therefore we should expect consumption to rise. Indeed, if σ > 1, we see in the expression above
that $\mu(0)^{-1}$ decreases with the increase in interest rate, and so c(0) rises. With a small σ the increase in interest
rates would have the opposite effect: the substitution effect (towards future consumption) would dominate, and
lead to a fall in c(0). When σ = 1, both effects exactly cancel out, and consumption does not change. (This is
only approximately true, since $\tilde{w}(0)$ also changes with the interest rate).
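To make the role of σ concrete, the following sketch evaluates μ(0) in the special case of a constant interest rate r, an assumption made here purely for illustration. In that case the integral defining μ(0)^{-1} has the closed form 1/κ with κ = (ρ − (1 − σ)r)/σ − n, provided κ > 0. The function name and parameter values are hypothetical.

```python
def mu0_constant_rate(r, rho, n, sigma):
    """Propensity to consume out of total wealth, mu(0), when r(s) = r for all s.

    With a constant rate, mu(0)^{-1} = int_0^inf exp(-kappa * t) dt = 1 / kappa,
    where kappa = (rho - (1 - sigma) * r) / sigma - n. The integral only
    converges when kappa > 0."""
    kappa = (rho - (1.0 - sigma) * r) / sigma - n
    if kappa <= 0:
        raise ValueError("integral defining mu(0)^{-1} diverges for these parameters")
    return kappa

# Illustrative parameter values (assumptions, not taken from the recitation).
rho, n = 0.03, 0.01
for sigma in (0.5, 1.0, 2.0):
    low = mu0_constant_rate(0.02, rho, n, sigma)
    high = mu0_constant_rate(0.04, rho, n, sigma)
    print(f"sigma={sigma}: mu(0) at r=2% is {low:.4f}, at r=4% is {high:.4f}")
```

Holding wealth fixed, μ(0) rises with r when σ > 1, falls when σ < 1, and is unchanged when σ = 1, which mirrors the discussion above.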

1.3 Firms
Firms in the Ramsey model are exactly as in the Solow model. Namely, there is a representative firm with
production technology F(K(t), A(t)L(t)) that, taking input prices R(t) and w(t) as given, minimizes its costs. The
first order conditions of the cost minimization problem are:
$$R(t) = F_1(K(t), A(t)L(t))$$
$$w(t) = A(t)F_2(K(t), A(t)L(t))$$
Since we wrote the consumer’s problem in per capita terms, we can also write the optimality conditions of the
firm in per capita terms using homogeneity of degree 0 of $F_1$ and $F_2$:
$$R(t) = F_1(k(t), A(t))$$
$$w(t) = A(t)F_2(k(t), A(t))$$
That’s all.
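As a worked example, assume a Cobb-Douglas technology with capital share α. This functional form is an assumption made here for illustration; the recitation keeps F general. The factor prices then have simple expressions, and they exhaust output, as Euler's theorem for constant-returns functions requires.

```latex
\begin{align*}
F(K, AL) &= K^{\alpha}(AL)^{1-\alpha}, \qquad 0 < \alpha < 1 \\
R(t) &= F_1(k(t), A(t)) = \alpha \left(\frac{k(t)}{A(t)}\right)^{\alpha - 1} \\
w(t) &= A(t)\,F_2(k(t), A(t)) = (1-\alpha)\,A(t)\left(\frac{k(t)}{A(t)}\right)^{\alpha} \\
R(t)\,k(t) + w(t) &= A(t)\left(\frac{k(t)}{A(t)}\right)^{\alpha} = F(k(t), A(t))
\end{align*}
```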

1.4 Equilibrium
The definition below is written in per capita terms, so that we don’t need to make reference to the growth of the
population. Alternatively, we could write it in aggregate terms or (as we will do in the next section) in the form
that will allow us to study the steady-state, that is, in intensive units.


Definition 1.1 (Competitive equilibrium in the Ramsey economy). A competitive equilibrium of the Ramsey economy
is a per capita allocation $\{c(t), k(t), a(t)\}_{t \ge 0}$ and prices $\{r(t), R(t), w(t)\}_{t \ge 0}$ such that, given an exogenous process
for labour-augmenting technology $\{A(t)\}_{t \ge 0}$:

1. Taking prices as given, households maximize lifetime utility subject to the capital accumulation constraint and
the No-Ponzi scheme condition, and given initial per capita assets a(0). This results in a path for consumption
and assets fully characterized by the following conditions:

$$\frac{\dot{c}(t)}{c(t)} = \frac{1}{\sigma}\left(r(t) - \rho\right)$$
$$\lim_{T \to \infty} \exp\left(-\int_0^T (r(s) - n)\,ds\right) a(T) = 0$$
$$\dot{a}(t) = (r(t) - n)a(t) + w(t) - c(t)$$

2. Taking prices as given, firms minimize cost, resulting in the optimality conditions:

$$R(t) = F_1(k(t), A(t))$$
$$w(t) = A(t)F_2(k(t), A(t))$$

3. Asset (and implicitly labour) markets clear:

$$R(t) = r(t) + \delta$$
$$a(t) = k(t)$$

The only tricky part of this equilibrium definition is the clearing condition for the capital markets. It may
seem like there are two different prices for capital, R(t) and r(t). If there is no depreciation, then the two
coincide. If δ > 0, however, when renting out capital to the firm (through its asset purchases), the household
is also wearing it out; every unit rented out loses a share δ of its value each period. Thus, the effective return
to renting each unit of capital is R(t) − δ, the rental price offered by the firm, minus the depreciation rate of
capital. I tend to think of δ in this model not as a depreciation but as a transaction cost: in transforming the
household’s assets into usable capital, the household incurs a loss of δ per unit of capital lent - possibly
because some unspecified financial intermediary took it.


With this definition in hand, we can fully characterize the dynamics of the economy in terms of $\{c(t), k(t)\}_{t \ge 0}$
by the following system of differential equations:

$$\frac{\dot{c}(t)}{c(t)} = \frac{1}{\sigma}\left(F_1(k(t), A(t)) - \delta - \rho\right)$$
$$\dot{k}(t) = \left(F_1(k(t), A(t)) - \delta - n\right)k(t) + A(t)F_2(k(t), A(t)) - c(t)$$

for the initial and boundary conditions:

$$\lim_{T \to \infty} \exp\left(-\int_0^T \left(F_1(k(s), A(s)) - \delta - n\right) ds\right) k(T) = 0$$
$$k(0) > 0 \text{ given}$$

1.5 The model equations in intensive form


The system above is hard to study graphically, because it is not a system with constant coefficients: there is
an extra variable A(t) which, although exogenous, affects the dynamics of the per capita variables. In order to
study the system graphically, we need to find stationary variables. As usual, the stationary variables in this
economy will be consumption and capital per unit of effective labour:

$$\hat{c}(t) \equiv \frac{c(t)}{A(t)}, \qquad \hat{k}(t) \equiv \frac{k(t)}{A(t)}$$
Let us define:
$$f(\hat{k}(t)) \equiv F(\hat{k}(t), 1)$$
which, as in the previous recitation, implies that:

$$F_1(k(t), A(t)) = f'(\hat{k}(t))$$
$$F_2(k(t), A(t)) = f(\hat{k}(t)) - \hat{k}(t)f'(\hat{k}(t))$$


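Then, using the fact that:

$$\frac{\dot{c}(t)}{c(t)} = \frac{\dot{\hat{c}}(t)}{\hat{c}(t)} + g$$
$$\frac{\dot{k}(t)}{k(t)} = \frac{\dot{\hat{k}}(t)}{\hat{k}(t)} + g$$
we can rewrite the system of differential equations above as:

$$\frac{\dot{\hat{c}}(t)}{\hat{c}(t)} = \frac{1}{\sigma}\left(f'(\hat{k}(t)) - \delta - \rho - \sigma g\right)$$
$$\dot{\hat{k}}(t) = f(\hat{k}(t)) - (n + \delta + g)\hat{k}(t) - \hat{c}(t)$$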
The initial condition k(0) > 0 is unchanged, since the level of technology is normalized to 1 at time 0. To see
how the boundary condition is modified, simply replace $k(T) = A(T)\hat{k}(T) = e^{gT}\hat{k}(T)$:
$$\exp\left(-\int_0^T \left(F_1(k(s), A(s)) - \delta - n\right) ds\right) k(T) = \exp\left(-\int_0^T \left(f'(\hat{k}(s)) - \delta - n\right) ds\right) e^{gT}\hat{k}(T)$$
$$= \exp\left(-\int_0^T \left(f'(\hat{k}(s)) - \delta - n - g\right) ds\right) \hat{k}(T)$$

so that the boundary condition becomes:

$$\lim_{T \to \infty} \hat{k}(T) \exp\left(-\int_0^T \left(f'(\hat{k}(s)) - \delta - n - g\right) ds\right) = 0$$

The two differential equations and the initial and boundary conditions are now fully written in terms of
endogenous variables and constants, and we can apply our usual steady-state and graphical analysis.
We’ll draw the new (and almost unmodified) phase diagram in class.
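For completeness, here is a sketch of the algebra behind the intensive-form system. It relies on Euler's theorem for the constant-returns function F, a step that the text leaves implicit; everything else uses only the definitions above.

```latex
\begin{align*}
F_1(k, A)\,k + A\,F_2(k, A) &= F(k, A) = A\,f(\hat{k})
  && \text{(Euler's theorem, constant returns)} \\
\dot{k} &= A\,f(\hat{k}) - (\delta + n)\,k - c
  && \text{(substituting into the law of motion for } k\text{)} \\
\frac{\dot{k}}{k} &= \frac{f(\hat{k})}{\hat{k}} - (\delta + n) - \frac{\hat{c}}{\hat{k}}
  && \text{(dividing by } k = A\hat{k}\text{)} \\
\dot{\hat{k}} &= \hat{k}\left(\frac{\dot{k}}{k} - g\right)
  = f(\hat{k}) - (n + \delta + g)\,\hat{k} - \hat{c} \\
\frac{\dot{\hat{c}}}{\hat{c}} &= \frac{\dot{c}}{c} - g
  = \frac{1}{\sigma}\left(f'(\hat{k}) - \delta - \rho - \sigma g\right)
\end{align*}
```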
A key element in the analysis of the existence of a steady-state of this system is that we must impose the
following restriction on parameters:
Assumption 1.2 (Discounting with technological progress). The discount rate must always exceed the population
growth rate, plus the (adjusted) growth rate of technological progress:
ρ > n + (1 − σ)g


Analytically, we must impose this constraint because if there is to be a steady-state for capital in the long
run, the transversality condition can only hold if the steady-state discount rate $f'(\hat{k}^*) - \delta - n - g$ is strictly positive.
Using the Euler equation in steady-state, we find that this restriction is equivalent to ρ > n + (1 − σ)g.
More intuitively, one can notice that since the steady-state interest rate is r* = ρ + σg, this assumption is
equivalent to r* > n + g. To understand the meaning of this condition, remember that n + g is the growth rate
of capital and output in the steady-state of this economy. If n + g > r*, roughly speaking, the household could
borrow one unit of output today against a repayment of 1 + r* tomorrow; because the economy is growing at rate n + g,
the asset would be worth 1 + n + g tomorrow, and the household would be able to repay its debt and still obtain a
positive profit, doing nothing. Doing this with not one but infinitely many assets, the household would be able
to obtain infinite utility.
Why does σ come into play in this condition? Notice that given our initial assumption about discounting, the
additional restriction ρ > n + (1 − σ)g is redundant when σ > 1, that is, when the household really dislikes steep
consumption profiles. However, when it does not really care about steep consumption profiles, that is, when
σ < 1, this assumption is stronger than the initial discounting assumption ρ > n. In effect, the higher the σ, the
higher the interest rate r*: households that dislike steep consumption profiles need to be compensated more for
consuming less today, saving, and consuming more tomorrow.
Thus, there are two effects driving the real interest rate in the steady-state: it must be high enough that no
"arbitrage" of the products of exogenous technological progress and population growth is possible (the condition
r* > n + g); on the other hand, it must be lower, the less the household dislikes steep profiles. The condition
ρ > n + (1 − σ)g reflects both these effects.
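The steady-state of the intensive-form system can be computed in closed form once a production function is chosen. The sketch below assumes a Cobb-Douglas technology f(k̂) = k̂^α; this functional form, the parameter values and the function name are illustrative assumptions. It also enforces Assumption 1.2 before computing anything.

```python
def ramsey_steady_state(alpha, delta, rho, n, g, sigma):
    """Steady state (k_hat*, c_hat*, r*) of the intensive-form Ramsey system,
    assuming f(k_hat) = k_hat ** alpha (Cobb-Douglas, an illustrative choice)."""
    # Assumption 1.2: rho > n + (1 - sigma) g, equivalently r* > n + g.
    if rho <= n + (1.0 - sigma) * g:
        raise ValueError("Assumption 1.2 fails: need rho > n + (1 - sigma) * g")
    r_star = rho + sigma * g                      # from the Euler equation with zero growth of c_hat
    # f'(k_hat*) = alpha * k_hat*^(alpha - 1) = r* + delta
    k_hat = (alpha / (r_star + delta)) ** (1.0 / (1.0 - alpha))
    # zero growth of k_hat: c_hat* = f(k_hat*) - (n + delta + g) k_hat*
    c_hat = k_hat ** alpha - (n + delta + g) * k_hat
    return k_hat, c_hat, r_star

k_hat, c_hat, r_star = ramsey_steady_state(
    alpha=0.3, delta=0.05, rho=0.03, n=0.01, g=0.02, sigma=2.0
)
print(f"k_hat* = {k_hat:.3f}, c_hat* = {c_hat:.3f}, r* = {r_star:.3f}")
```

With these values r* = 0.07 exceeds n + g = 0.03, so the restriction holds comfortably; lowering σ below 1 while keeping ρ close to n is a quick way to see it fail.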

2 Dynamic programming and applications


For lack of time, we won’t study the (serious) theory of dynamic programming. What I will give you below should
be interpreted as non-rigorous guidelines on how you should approach a general discrete time deterministic
problem using the tools of dynamic programming.
The serious theory of discrete time, deterministic and stochastic dynamic programming is detailed in
"Recursive Methods in Economic Dynamics" by Stokey, Lucas and Prescott (1989), referred to below as SLP.


2.1 A first look at discrete time deterministic problems


2.1.1 Sequential problem and functional equation
Consider a general discrete time, infinite horizon deterministic optimization problem. The ingredients of the
problem are:
• Discount factor β ∈ ]0, 1[
• State variable $x_t$ in X, a subset of $\mathbb{R}^L$
• A non-empty correspondence Γ : X → X, such that $x_{t+1} \in \Gamma(x_t)$, which describes the constraints of the
problem.
• Instantaneous utility function $F(x_t, x_{t+1})$, with values in $\mathbb{R}$, and defined on $\{(x, y) \in X^2 : y \in \Gamma(x)\}$.
An example of such a problem is given on pages 20-21.
Given an initial value for the state, $x_0$, a feasible plan is a sequence $\{x_t\}$ such that $x_{t+1} \in \Gamma(x_t)$ for all t ≥ 0.
What we want to do is find a feasible plan that maximizes lifetime utility:
$$V(x_0) = \sum_{t=0}^{+\infty} \beta^t F(x_t, x_{t+1})$$

This is often called a "sequential problem" (SP):

$$\hat{V}(x_0) = \max_{\{x_t\}_{t \ge 0}} \sum_{t=0}^{+\infty} \beta^t F(x_t, x_{t+1}) \quad \text{s.t. } x_{t+1} \in \Gamma(x_t), \; x_0 \text{ given} \tag{1}$$
The sequence that attains this optimum (if it exists) is called an "optimal plan"; I will denote it by $\{\hat{x}_t\}_{t \ge 0}$.
Now, as we saw in class, the solutions $(\hat{V}, \hat{x})$ to this problem (for a given initial state) have some kind of
special relationship with the solutions to the problem:
$$J(x) = \max_{y \in \Gamma(x)} F(x, y) + \beta J(y) \tag{2}$$
We will call this problem the Functional Equation (FE) problem, or also (as in class) the Bellman equation.
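To fix ideas, here is how a standard one-sector growth model would fit this framework. This mapping is an illustration introduced here and is not necessarily the example the text refers to; the symbols u, f and the bound on capital are assumptions.

```latex
\begin{align*}
x_t &= k_t, \qquad X = [0, \bar{k}] \subset \mathbb{R} \\
\Gamma(k) &= [0, f(k)]
  && \text{(next period's capital cannot exceed available resources)} \\
F(k, k') &= u\bigl(f(k) - k'\bigr)
  && \text{(utility of what is consumed today)} \\
\text{(SP)}\quad \hat{V}(k_0) &= \max_{\{k_{t+1}\}_{t \ge 0}} \sum_{t=0}^{+\infty} \beta^t\, u\bigl(f(k_t) - k_{t+1}\bigr)
  \quad \text{s.t. } k_{t+1} \in [0, f(k_t)] \\
\text{(FE)}\quad J(k) &= \max_{k' \in [0, f(k)]} u\bigl(f(k) - k'\bigr) + \beta J(k')
\end{align*}
```

Here f(k) stands for total resources available at the start of the period, that is, output plus undepreciated capital.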


2.1.2 The principle of optimality


The main insight of dynamic programming is that value functions solving (1) also solve (2). This is known as the
"principle of optimality".

Result 2 (Principle of optimality). Suppose that $\hat{V}$ is well defined on X. Then, $\hat{V}$ satisfies the functional equation
(2), that is:
$$\forall x \in X, \quad \hat{V}(x) = \max_{y \in \Gamma(x)} F(x, y) + \beta \hat{V}(y)$$

Proof. Take some $x_0 \in X$, and let $\{\hat{x}_t\}$ be a corresponding optimal plan for (SP). By optimality of the plan, for any
other feasible plan $\{x_t\}$ starting at $x_0$,
$$\hat{V}(x_0) \ge F(x_0, x_1) + \beta \sum_{t=1}^{+\infty} \beta^{t-1} F(x_t, x_{t+1})$$

By definition,
$$\hat{V}(x_0) = F(x_0, \hat{x}_1) + \beta \sum_{t=1}^{+\infty} \beta^{t-1} F(\hat{x}_t, \hat{x}_{t+1})$$

so that:
$$F(x_0, \hat{x}_1) + \beta \sum_{t=1}^{+\infty} \beta^{t-1} F(\hat{x}_t, \hat{x}_{t+1}) \ge F(x_0, x_1) + \beta \sum_{t=1}^{+\infty} \beta^{t-1} F(x_t, x_{t+1})$$

Fix a particular $x_1 \in \Gamma(x_0)$, and let $\{\hat{\hat{x}}_t\}_{t \ge 1}$ be an optimal plan from $x_1$, then by definition of $\hat{V}$:
$$\hat{V}(x_1) = \sum_{t=1}^{+\infty} \beta^{t-1} F(\hat{\hat{x}}_t, \hat{\hat{x}}_{t+1})$$

so that:
$$\hat{V}(x_0) \ge F(x_0, x_1) + \beta \hat{V}(x_1)$$
Now, remember this holds for any $x_1 \in \Gamma(x_0)$. Picking $x_1 = \hat{x}_1$, that is, the second state for the optimal plan we
started with, and cancelling $F(x_0, \hat{x}_1)$ and β on both sides, we get:
$$\sum_{t=1}^{+\infty} \beta^{t-1} F(\hat{x}_t, \hat{x}_{t+1}) \ge \hat{V}(\hat{x}_1)$$

But by optimality of $\hat{V}(\hat{x}_1)$ over feasible plans starting at $\hat{x}_1$, we have that:
$$\hat{V}(\hat{x}_1) \ge \sum_{t=1}^{+\infty} \beta^{t-1} F(\hat{x}_t, \hat{x}_{t+1})$$

So that:
$$\hat{V}(\hat{x}_1) = \sum_{t=1}^{+\infty} \beta^{t-1} F(\hat{x}_t, \hat{x}_{t+1})$$
Now we can replace this into the initial relation, taking the continuation $\{x_t\}_{t \ge 1}$ to be an optimal plan from $x_1$, to get:
$$\hat{V}(x_0) = F(x_0, \hat{x}_1) + \beta \hat{V}(\hat{x}_1) \ge F(x_0, x_1) + \beta \sum_{t=1}^{+\infty} \beta^{t-1} F(x_t, x_{t+1}) = F(x_0, x_1) + \beta \hat{V}(x_1)$$

This holds for any $x_1 \in \Gamma(x_0)$, so $\hat{V}$ solves (FE).


We can use the computations in this proof to gain some insight into the optimal plan $\{\hat{x}_t\}_{t \ge 0}$.
Result 3 (Optimal plans are generated by the policy correspondence). Suppose $\hat{V}$ is well defined on X. Fix
$x_0 \in X$ and let $\{\hat{x}_t\}_{t \ge 0}$ be a corresponding optimal plan. Then, the optimal plan satisfies:
$$\forall t, \quad \hat{x}_{t+1} \in G(\hat{x}_t)$$
where G is defined as:
$$G(x) = \arg\max_{y \in \Gamma(x)} F(x, y) + \beta \hat{V}(y)$$

Proof. The previous computations showed that:
$$\hat{V}(x_0) = F(x_0, \hat{x}_1) + \beta \hat{V}(\hat{x}_1)$$
Because $\hat{V}$ solves the Bellman equation, this implies that:
$$\hat{x}_1 \in G(x_0)$$
Moreover, we showed that the plan $\{\hat{x}_t\}_{t \ge 1}$ is optimal starting from $\hat{x}_1$. The desired result follows by induction.


The function G is called the policy correspondence. This theorem tells you that if you happen to know the
solution to (SP), then you may derive a large set of sequences from the policy correspondence (sequences such
that $x_{t+1} \in G(x_t)$), among which you will find our optimal plan.

Why do we care? As you guessed from the formulation of the theorems, unless we add some additional
restrictions to the problem, there will be solutions to the FE that are not solutions to the SP. Similarly, there
may be plans that are generated by the policy correspondence, but that are not the optimal plans. (We will give
one such example in class).
However, there are partial converses to the theorem: that is, restrictions on the solutions to FE that ensure
they are also solutions to SP.

Result 4 (Optimality of plans generated by the policy correspondence). Let $\{\hat{x}_t\}_{t \ge 0}$ be generated by the policy
correspondence. It is an optimal plan for (SP) if and only if:
$$\lim_{t \to +\infty} \beta^t \hat{V}(\hat{x}_t) = 0$$

Result 5 (Partial converse to the principle of optimality). Suppose V solves (FE). Furthermore, assume that, for
each $x_0$, there is a plan $\{\hat{x}_t\}$ such that:
$$V(\hat{x}_t) = F(\hat{x}_t, \hat{x}_{t+1}) + \beta V(\hat{x}_{t+1})$$
and such that:
$$\lim_{t \to +\infty} \beta^t V(\hat{x}_t) = 0$$

Finally, suppose that:
$$\limsup_{t \to \infty} \beta^t V(x_t) \ge 0$$
for all feasible plans. Then, $V = \hat{V}$.

Given all this, our strategy to tackle a problem of the type (SP) will be the following:

1. Write down the corresponding problem (FE).

2. If you can solve this problem, solve it. The next section describes how to. If the problem has no solutions,
then you’re done: the principle of optimality then also tells you there are no solutions to the (SP) problem.

3. Once you have your solutions to the (FE) problem, apply the converse theorem: check that, for some solution
V to (FE), you can generate feasible plans that satisfy the conditions of the previous theorem; and check
that the solution you picked, V, itself satisfies the limsup condition.

4. If you find such a V among the solutions to (FE), you’re done: it’s your solution to (SP). Moreover, then you
can apply the converse theorem for the policy correspondence to generate a plan that is optimal for (SP).
In practice, what people really do is use some restrictions on X, F and Γ that ensure the existence and
uniqueness of a solution to (FE) that satisfies the conditions of the partial converse.
Some other restrictions, which we will not cover here, ensure that $\hat{V}$ is strictly concave, differentiable and
that the policy function is a single-valued correspondence.

2.1.3 Solving the functional equation


So how do we solve the functional equation? There are two ways. One is numerical: under some (very restrictive)
assumptions on the primitives of the problem, we can show that the solution to (FE) is unique and moreover, it is the limit of
a sequence of functions that are easy to approximate numerically. I will show a short example of Matlab code
on how to do this. The other method is to characterize the solution by using optimality conditions.

By applying the contraction mapping theorem. Make the following restrictions:

• X is a convex set

• F(x, y) is continuous and bounded

• Γ is continuous and compact-valued

The key is to think of the Bellman equation as a mapping. Define the mapping:
$$(Tf)(x) = \max_{y \in \Gamma(x)} F(x, y) + \beta f(y)$$

We define T over the space of continuous and bounded functions over X, with the sup norm. Because F
is continuous and Γ is compact-valued, the theorem of the maximum (see SLP, chapter 3) tells us that this
definition makes sense.
Notice that if we can find an f such that:
Tf = f


then we are done: this f, by construction, solves (FE). Remember that to get back to (SP), we would still need to
check that it satisfies some additional conditions (such as the limsup condition). In this case, because our solution
is in the space of continuous and bounded functions, it is very simple to check that the conditions for the converse
theorem actually hold. Therefore, if we show that this fixed point is unique, we are done: it is the solution to (SP).
So how do we show that the functional equation T f = f has a unique solution, and how do we find this
solution? The idea is to prove that T is a contraction.

Definition 2.1 (Contraction mapping). Let C be a complete metric space for the norm ‖·‖ and let T be a mapping
from C to C. Let β ∈ ]0, 1[. T is a contraction mapping if and only if, for all f, g in C,

$$\|T f - T g\| \le \beta \, \|f - g\|$$

This definition is fairly general; but since the space of continuous and bounded functions over X is a complete
metric space for the sup norm, we may hope that our own T is also a contraction.
But why do we want T to be a contraction? Because of the following theorem:

Result 6 (Contraction mapping theorem). Let C be a complete metric space for the norm ‖·‖, and let T be a
contraction on (C, ‖·‖). Then:

• T has a unique fixed point, f, inside C.

• For any f0 ∈ C, define the sequence $(f_n)_{n \ge 0}$ by $f_{n+1} = T f_n$. Then,

$$\lim_{n \to +\infty} f_n = f$$

The proof of this theorem is in SLP, chapter 3.
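To see the two conclusions of the theorem at work, here is a minimal Python sketch on a deliberately trivial example. The mapping T(x) = 0.5x + 1 on the real line is a hypothetical illustration (it is not from these notes): it is a contraction with modulus 0.5, and its unique fixed point is 2. The helper name `iterate_to_fixed_point` and the tolerance are likewise assumptions made for the example.

```python
def iterate_to_fixed_point(T, f0, tol=1e-10, max_iter=10_000):
    """Build the sequence f_{n+1} = T(f_n) until successive terms are within tol."""
    f = f0
    for n in range(1, max_iter + 1):
        f_next = T(f)
        if abs(f_next - f) < tol:
            return f_next, n
        f = f_next
    raise RuntimeError("the sequence did not converge within max_iter steps")


def T(x):
    # Hypothetical contraction on the real line with modulus 0.5:
    # |T(x) - T(z)| = 0.5 * |x - z|. Its unique fixed point solves x = 0.5 x + 1.
    return 0.5 * x + 1.0


# Two very different starting points lead to the same limit.
fixed_point_a, steps_a = iterate_to_fixed_point(T, 0.0)
fixed_point_b, steps_b = iterate_to_fixed_point(T, 100.0)
```

Both runs should end up (numerically) at the same point whatever the starting value, which is exactly what the second bullet of the theorem promises.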


This theorem is interesting to us because it tells us that if our mapping is a contraction, then we have
existence and uniqueness of the solution to our problem, and furthermore we know how to find it.
In order to check that our mapping is indeed a contraction, it suffices to check that Blackwell’s sufficiency
conditions hold. For a proof that these conditions are sufficient, see SLP, chapter 3.
They are as follows:

Result 7 (Blackwell’s sufficiency conditions). Suppose the mapping T , defined on the space of continuous bounded
functions on X endowed with the sup norm, has the following two properties:

• T is monotone: if f, g are such that f (x) ≥ g(x) for all x ∈ X, then T f (x) ≥ T g(x) for all x ∈ X.

• T satisfies a discounting inequality: there is 0 < δ < 1 such that for any a ≥ 0 and any g in this space, the
function f defined by f (x) = g(x) + a satisfies, for all x ∈ X:
$$T f(x) - T g(x) \le \delta a$$

Then, T is a contraction mapping over C.
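As a worked check (a sketch that uses only the definition of T given earlier), both conditions can be verified directly for the Bellman mapping, with δ = β:

```latex
% Monotonicity: suppose f(y) >= g(y) for all y in X. Then, for every x in X,
\[
T f(x) = \max_{y \in \Gamma(x)} \{ F(x, y) + \beta f(y) \}
\;\ge\; \max_{y \in \Gamma(x)} \{ F(x, y) + \beta g(y) \} = T g(x),
\]
% because the objective on the left is at least as large as the one on the
% right at every y in \Gamma(x).

% Discounting: for any constant a >= 0 and any x in X,
\begin{align*}
T(g + a)(x) &= \max_{y \in \Gamma(x)} \{ F(x, y) + \beta \, (g(y) + a) \} \\
            &= \max_{y \in \Gamma(x)} \{ F(x, y) + \beta g(y) \} + \beta a \\
            &= T g(x) + \beta a .
\end{align*}
% The discounting inequality therefore holds with equality for \delta = \beta < 1.
```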

An example of applying the contraction mapping theorem to numerically solve an FE. We will look at the
following, very simple capital accumulation problem in (SP) form:
$$\max_{\{c_t, b_{t+1}\}} \sum_{t=0}^{+\infty} \beta^t \ln(c_t)$$

$$c_t + b_{t+1} \le R b_t^{\alpha}$$
The only constraints here are:
$$c_t > 0$$
$$b_{t+1} > 0$$
We can put this in our canonical form with:
$$X = \mathbb{R}_+$$
$$x_t = b_t$$
$$\Gamma(x) = \{ y \in X : R x^{\alpha} - y > 0 \}$$
$$F(x, y) = \ln(R x^{\alpha} - y)$$
We can check that, except for the fact that F is not bounded, all other conditions that we reviewed in the
previous section apply. However, the boundedness of F is only useful because it lets us apply the
maximum theorem to obtain that our mapping T is well defined. Here, it is easy to check that the
mapping is well defined even though F is not bounded.
The mapping in question is:
$$T f(x) = \max_{y \in \Gamma(x)} \left\{ \ln(R x^{\alpha} - y) + \beta f(y) \right\}$$

As previously mentioned, this mapping is well defined on the space of continuous and bounded functions
for the sup norm (the space for f, not F!). Furthermore, checking Blackwell’s sufficiency conditions is trivial. We
can therefore apply the contraction mapping theorem: the solution to (SP) exists, is unique, and can be found
by constructing the sequence:
$$f_{n+1} = T f_n$$
for some initial f0.
How do we do this, in practice? Here is some rough pseudo-code that one can follow to solve our problem
numerically (using, for example, Matlab):

1. Choose some compact subset of X, for example Y = [0.1; 15].

2. Create a vector of n points, equally spaced between the boundaries of Y . This is your "grid".

3. Pick an initial function f0; for example, f0 = 1 (f0 is constant). Note that this choice of f0 ensures that
f0 is bounded and continuous, so we start within C, the space of continuous bounded functions. This
function is represented by a vector of length n inside the computer: each element of this
vector corresponds to the value of f0 at a point of the "discretized" grid of Y.

4. Apply T repeatedly to generate the sequence fn, until the fn almost don’t change, that is, until
$\max_{x \in Y} |f_{n+1}(x) - f_n(x)| < \varepsilon$ for some very small ε. In the computer, one needs to do the following:

• Given your current value for f, for each value x on the grid, compute $\ln(R x^{\alpha} - y) + \beta f(y)$
for all the y on the grid. Note that you compute the objective function for every grid point y, not only for
y ∈ Γ(x); afterwards, just throw out the values of y that are not in Γ(x).
• Among the remaining values of y, pick the one which maximizes $\ln(R x^{\alpha} - y) + \beta f(y)$. The
maximized value of the objective is your new T f (x), and the maximizing y is the policy at x.
• Once you have finished doing this for all x on the grid, compute the maximum over x of the distance
|T f (x) − f (x)|. This is the sup norm of T f − f. If it is smaller than ε, exit; if not, set f = T f and repeat the steps above.

The contraction mapping theorem ensures that this process will eventually converge, and when it does, we
will have obtained an approximation to the value function on our subset Y.
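The steps above can be sketched in a few lines of Python with NumPy (used here in place of Matlab). This is only an illustrative sketch: the function name `value_function_iteration` and the parameter values R = 5, α = 0.3, β = 0.9, n = 300 and ε = 1e-6 are assumptions chosen for the example, not values taken from these notes.

```python
import numpy as np


def value_function_iteration(R=5.0, alpha=0.3, beta=0.9, n=300,
                             lower=0.1, upper=15.0, eps=1e-6, max_iter=10_000):
    """Iterate f_{n+1} = T f_n on a grid; return (grid, f, policy, n_iter).

    All default parameter values are illustrative assumptions.
    """
    grid = np.linspace(lower, upper, n)   # steps 1-2: grid on Y = [lower, upper]
    f = np.ones(n)                        # step 3: f0 = 1 at every grid point

    # Consumption R x^alpha - y for every pair (x, y): rows are x, columns are y.
    c = R * grid[:, None] ** alpha - grid[None, :]
    feasible = c > 0                      # True exactly when y is in Gamma(x)
    # ln(c) where feasible; -inf elsewhere, so infeasible y never win the max.
    payoff = np.full((n, n), -np.inf)
    payoff[feasible] = np.log(c[feasible])

    for n_iter in range(1, max_iter + 1):  # step 4: apply T repeatedly
        objective = payoff + beta * f[None, :]
        Tf = objective.max(axis=1)          # new value T f(x) at each grid x
        distance = np.max(np.abs(Tf - f))   # sup norm of T f - f on the grid
        f = Tf
        if distance < eps:
            policy = grid[objective.argmax(axis=1)]  # maximizing y at each x
            return grid, f, policy, n_iter
    raise RuntimeError("value function iteration did not converge")


grid, f, policy, n_iter = value_function_iteration()
```

For this particular log and power specification, the exact policy is known in closed form, y = αβRx^α, so `policy` can be compared against `alpha * beta * R * grid ** alpha` as a sanity check. The agreement is limited by the grid spacing.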

By Euler equations and transversality conditions. This has already been done in class: assuming that the
value function is continuously differentiable and concave, and that the constraint set is convex, solutions to
(FE) are characterized by:
$$DV(x) = F_1(x, y)$$

$$F_2(x, y) = -\beta \, DV(y)$$


and y ∈ Γ(x). As you may remember from our initial discussion, we need to impose additional conditions in
order to ensure that this solution corresponds to the solution to our initial problem, (SP).
It turns out that, for instantaneous value functions F that are continuously differentiable, increasing in
x and strictly concave, it is easy to characterize some optimal plans directly (without resorting to (FE)).
Indeed, in this case, any feasible plan such that $\hat{x}_{t+1} \in \operatorname{int} \Gamma(\hat{x}_t)$ and such that:

$$F_2(\hat{x}_t, \hat{x}_{t+1}) + \beta F_1(\hat{x}_{t+1}, \hat{x}_{t+2}) = 0$$

$$\lim_{t \to +\infty} \beta^t F_1(\hat{x}_t, \hat{x}_{t+1}) = 0$$

is an optimal plan. When seeking to characterize the solutions to (SP) problems through Euler equations, it is
this theorem that we are most often using.
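As a worked illustration, here is how these two conditions play out in the capital accumulation example above, where $F(x, y) = \ln(R x^{\alpha} - y)$. The derivation is a sketch under two added assumptions that are not stated in the notes: 0 < α < 1 and x0 > 0. The constant savings rate s is a guess introduced for the example.

```latex
% Partial derivatives of F(x, y) = \ln(R x^{\alpha} - y):
\[
F_1(x, y) = \frac{\alpha R x^{\alpha - 1}}{R x^{\alpha} - y},
\qquad
F_2(x, y) = -\frac{1}{R x^{\alpha} - y}.
\]
% Euler equation, writing c_t = R x_t^{\alpha} - x_{t+1}:
\[
\frac{1}{c_t} = \beta \, \frac{\alpha R x_{t+1}^{\alpha - 1}}{c_{t+1}}.
\]
% Guess a constant savings rate s: x_{t+1} = s R x_t^{\alpha},
% so that c_t = (1 - s) R x_t^{\alpha}. Substituting into the Euler equation:
\[
\frac{1}{(1 - s) R x_t^{\alpha}}
= \frac{\alpha \beta}{(1 - s) \, x_{t+1}}
= \frac{\alpha \beta}{(1 - s) \, s R x_t^{\alpha}}
\quad \Longrightarrow \quad s = \alpha \beta .
\]
% Limit condition along this plan:
\[
\beta^t F_1(x_t, x_{t+1}) = \beta^t \, \frac{\alpha}{(1 - \alpha \beta) \, x_t}
\;\longrightarrow\; 0,
\]
% since, for 0 < \alpha < 1 and x_0 > 0, x_t converges to the positive steady
% state (\alpha \beta R)^{1 / (1 - \alpha)} while \beta^t goes to zero.
```

The plan $x_{t+1} = \alpha \beta R x_t^{\alpha}$ is feasible and interior (consumption is a fraction 1 − αβ of output), so under the assumptions above it satisfies both conditions.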
