A rough guide to recursive methods and numerical dynamic programming

Espen Henriksen

September 18, 2008

In its simplest form, the prototypical stochastic neoclassical growth model amounts to maximizing expected discounted, additively separable utility over consumption and leisure subject to the national income and product accounts, i.e.

\[
\max_{\{c_t, x_t, n_t\}_{t=0}^{\infty}} E_0 \sum_{t=0}^{\infty} \beta^t u(c_t, 1 - n_t)
\]

subject to

Product approach to output:
\[ y_t = z_t f(k_t, n_t). \tag{1} \]

Income approach to output:
\[ y_t = r_t k_t + w_t n_t. \tag{2} \]

Expenditure approach to output:
\[ y_t = c_t + x_t. \tag{3} \]

Law of motion for capital:
\[ k_{t+1} = (1 - \delta) k_t + x_t. \tag{4} \]

Total factor productivity:
\[ z_t = \rho z_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma^2), \tag{5} \]

where the exogenous stochastic process, specified in Equation (5), is computed, in finite samples, from \(\{\tilde z_t\}_{t=0}^{T}\), where each \(\tilde z_t\) is given by

\[ \ln \tilde z_t = \ln \tilde y_t - \alpha \ln \tilde k_t - (1 - \alpha) \ln \tilde n_t. \]

1 Competitive equilibrium

A competitive equilibrium consists of quantities \(\{c_t, n_t, x_t\}_{t=0}^{\infty}\) and prices \(\{r_t, w_t\}_{t=0}^{\infty}\) satisfying:

1. Household optimization. Households supply capital and labor, i.e. choose quantities \(\{x_t, n_t\}_{t=0}^{\infty}\), taking prices \(\{r_t, w_t\}_{t=0}^{\infty}\) as given.

2. Firm optimization. Firms rent capital and hire labor, i.e. choose quantities \(\{k_t, n_t\}_{t=0}^{\infty}\), taking prices \(\{r_t, w_t\}_{t=0}^{\infty}\) as given.

3. Market clearing. Prices \(\{r_t, w_t\}_{t=0}^{\infty}\) are set such that supply equals demand for capital and labor. By Walras' law, since the markets for capital and labor clear, the market for consumption goods will also clear.

2 First welfare theorem

This model economy satisfies the first welfare theorem – that is, any competitive equilibrium is Pareto efficient. This implies that we can solve the centralized planner's problem instead of the decentralized problem. The social planner's problem is much easier to solve since we get rid of the prices and the individuals' budget constraints and instead solve the optimization problem subject to the resource constraint. We can then find the prices that support these equilibrium allocations.

\[
\max_{\{c_t, x_t, n_t\}_{t=0}^{\infty}} E_0 \sum_{t=0}^{\infty} \beta^t u(c_t, 1 - n_t)
\]

subject to

\[ y_t = z_t f(k_t, n_t), \tag{6} \]
\[ y_t = c_t + x_t, \tag{7} \]
\[ k_{t+1} = (1 - \delta) k_t + x_t, \tag{8} \]
\[ z_t = \rho z_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim N(0, \sigma^2), \tag{9} \]
\[ k_0 > 0, \text{ given}. \tag{10} \]

3 Simplifying the problem

For pedagogical purposes, let us simplify the problem and for a while ignore the labor-leisure choice and the stochastic process for total factor productivity. We set \(z_t\) equal to its unconditional mean \(\bar z\) for all t, and, in addition, we eliminate \(y_t\) and \(x_t\) by combining Equations (6), (7) and (8). The problem simplifies to

\[
\max_{\{c_t, k_{t+1}\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t u(c_t)
\]

subject to

\[ c_t + k_{t+1} = \bar z f(k_t) + (1 - \delta) k_t, \qquad k_0 > 0, \text{ given}. \]

4 The sequence problem, solving for allocations

The mathematical strategy is to solve for a sequence of allocations by maximizing a function (total utility) of an infinite number of variables (consumption and capital at each date) subject to technological constraints (the production function and the law of motion for capital) from a given initial capital stock \(k_0\). The Lagrangian for the problem is

\[
\mathcal{L} = \sum_{t=0}^{\infty} \beta^t \left\{ u(c_t) - \lambda_t \left[ c_t + k_{t+1} - \bar z f(k_t) - (1 - \delta) k_t \right] \right\},
\]

with transversality condition

\[ \lim_{t \to \infty} \beta^t \lambda_t k_{t+1} = 0. \]

Given standard assumptions on the functional forms of f and u, the capital stock converges monotonically to the level that, if sustained, maximizes consumption per unit of time. For the deterministic problem, it is quick and easy to solve the sequence problem. However, if uncertainty is introduced the problem blows up and most likely becomes impossible to solve.
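To make the connection to the recursive derivation in Section 5.3 explicit (a step added here for completeness), the first-order conditions of the Lagrangian with respect to \(c_t\) and \(k_{t+1}\) are

\[
\frac{\partial u(c_t)}{\partial c_t} = \lambda_t
\qquad \text{and} \qquad
\lambda_t = \beta \lambda_{t+1} \left[ \bar z \frac{\partial f(k_{t+1})}{\partial k_{t+1}} + 1 - \delta \right],
\]

which combine to

\[
\frac{\partial u(c_t)}{\partial c_t} = \beta \frac{\partial u(c_{t+1})}{\partial c_{t+1}} \left[ \bar z \frac{\partial f(k_{t+1})}{\partial k_{t+1}} + 1 - \delta \right],
\]

the same Euler equation we will derive recursively in Section 5.3.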
5 The recursive problem, solving for functions

The mathematical strategy is to seek the optimal savings/investment function directly and then to use this function to compute the optimal sequence of investments from any initial capital stock. This way of looking at the problem – deciding on the immediate action to take as a function of the current situation – is called a recursive formulation. It exploits the observation that a decision problem of the same structure recurs every period. For the deterministic problem, this might seem like an unnecessarily hard way to solve the problem. However, as we shall see, if uncertainty is introduced the structure of the problem hardly changes and we can still solve it with relative ease.

Our problem is still

\[
\max_{\{c_t, k_{t+1}\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t u(c_t)
\]

subject to

\[ c_t + k_{t+1} = \bar z f(k_t) + (1 - \delta) k_t, \qquad k_0 > 0, \text{ given}. \]

Substituting in for \(c_t\), we have the following problem

\[
\max_{\{k_{t+1}\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t u\left[ \bar z f(k_t) + (1 - \delta) k_t - k_{t+1} \right],
\]

or

\[
\max_{\{k_{t+1}\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t u(k_t, k_{t+1}),
\]

both given \(k_0 > 0\). Starting from an arbitrary time t, the problem is

\[ \max_{\{k_{s+1}\}_{s=t}^{\infty}} \sum_{s=t}^{\infty} \beta^{s-t} u(k_s, k_{s+1}) \tag{11} \]

given

\[ c_s + k_{s+1} = \bar z f(k_s) + (1 - \delta) k_s \quad \forall s \ge t, \tag{12} \]
\[ k_t > 0, \text{ given}. \tag{13} \]

5.1 Value function

Let us define \(v(k_t)\) as the indirect utility function, i.e. the function that attains the value of the optimal program from period t onwards given an initial condition \(k_t\):

\[
v(k_t) \equiv \max_{\{k_{s+1}\}_{s=t}^{\infty}} \sum_{s=t}^{\infty} \beta^{s-t} u(k_s, k_{s+1}).
\]

Using the maximization-by-steps idea, we can write

\[
\begin{aligned}
v(k_t) &\equiv \max_{k_{t+1}} \left[ u(k_t, k_{t+1}) + \max_{\{k_{s+1}\}_{s=t+1}^{\infty}} \sum_{s=t+1}^{\infty} \beta^{s-t} u(k_s, k_{s+1}) \right] \\
&= \max_{k_{t+1}} \left[ u(k_t, k_{t+1}) + \beta \max_{\{k_{s+1}\}_{s=t+1}^{\infty}} \sum_{s=t+1}^{\infty} \beta^{s-(t+1)} u(k_s, k_{s+1}) \right] \\
&\equiv \max_{k_{t+1}} \left\{ u(k_t, k_{t+1}) + \beta v(k_{t+1}) \right\} \\
&= \max_{k_{t+1}} \left\{ u(c_t) + \beta v(k_{t+1}) \right\}.
\end{aligned}
\]

We have derived the Bellman equation

\[ v(k_t) = \max_{k_{t+1}} \left\{ u(c_t) + \beta v(k_{t+1}) \right\}, \tag{14} \]

where

\[ c_t = \bar z f(k_t) + (1 - \delta) k_t - k_{t+1}. \tag{15} \]

The Bellman equation captures a key element of most intertemporal decisions, which most of our decisions are, namely that they are a trade-off between instantaneous utility \(u(c_t)\) and discounted continuation utility \(\beta v(k_{t+1})\).

The only state variable of our recursive problem is \(k_t\). The variable \(k_t\) captures all the information relevant for making a decision. Since \(k_t\) depends on past decisions, we classify it as an endogenous state variable. The only decision or control variable of our recursive problem is \(k_{t+1}\). Equivalently, we could instead have defined \(c_t\) or \(x_t\) as our control/decision variable.

5.2 Recursive notation

We will call a problem stationary whenever the structure of the choice problem that a decision maker faces is identical at every point in time, i.e. only relative time, not absolute time, is relevant. In order to make sure we are not using notation that might indicate otherwise, we write the Bellman equation using recursive notation, where an unprimed variable denotes its current-period value and a prime denotes its next-period value:

\[ v(k) = \max_{k'} \left\{ u(c) + \beta v(k') \right\}, \tag{16} \]

where

\[ c = \bar z f(k) + (1 - \delta) k - k'. \tag{17} \]
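As a concrete illustration of the recursive notation (a worked example added here, not part of the original calibration): with log utility \(u(c) = \ln c\), Cobb-Douglas production \(f(k) = k^{\alpha}\), and full depreciation \(\delta = 1\), the Bellman equation

\[
v(k) = \max_{k'} \left\{ \ln\left( \bar z k^{\alpha} - k' \right) + \beta v(k') \right\}
\]

has the known closed-form solution \(v(k) = A + B \ln k\) with \(B = \alpha/(1 - \alpha\beta)\), and the decision rule is \(k' = \alpha \beta \bar z k^{\alpha}\). This is one of the "very few and very special cases", referred to in Section 5.5 below, where guessing and verifying the value function works.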
5.3 Properties of the decision rule

Given the true (but potentially unknown) value function \(v(\cdot)\), we can study the optimality conditions of the problem. The unique optimal time-invariant decision rule is defined by the first-order condition of the maximization problem on the right-hand side of the Bellman equation (16) with respect to the control variable, in this case \(k'\):

\[
\frac{\partial u(c)}{\partial c} \frac{\partial c}{\partial k'} + \beta \frac{\partial v(k')}{\partial k'} = 0,
\]

and since \(\partial c / \partial k' = -1\),

\[ \frac{\partial u(c)}{\partial c} = \beta \frac{\partial v(k')}{\partial k'}. \tag{18} \]

Application of the envelope theorem, also known as the Benveniste and Scheinkman (1979) condition, which generally holds off corners, amounts to taking the derivative of the problem with respect to the endogenous state variable, in this case \(k\):

\[
\frac{\partial v(k)}{\partial k} = \frac{\partial u(c)}{\partial c} \frac{\partial c}{\partial k}.
\]

Since \(\partial c / \partial k = \bar z\, \partial f(k)/\partial k + (1 - \delta)\),

\[ \frac{\partial v(k)}{\partial k} = \frac{\partial u(c)}{\partial c} \left[ \bar z \frac{\partial f(k)}{\partial k} + 1 - \delta \right]. \tag{19} \]

Since we have a stationary, recursive problem, given Equation (19) we also have

\[ \frac{\partial v(k')}{\partial k'} = \frac{\partial u(c')}{\partial c'} \left[ \bar z \frac{\partial f(k')}{\partial k'} + 1 - \delta \right]. \tag{20} \]

Combining Equations (18) and (20) and eliminating \(\partial v(k')/\partial k'\), we have

\[
\frac{1}{\beta} \frac{\partial u(c)}{\partial c} = \frac{\partial u(c')}{\partial c'} \left[ \bar z \frac{\partial f(k')}{\partial k'} + 1 - \delta \right],
\]

or equivalently

\[
\beta \frac{\partial u(c')/\partial c'}{\partial u(c)/\partial c} \left[ \bar z \frac{\partial f(k')}{\partial k'} + 1 - \delta \right] = 1.
\]

This is the intertemporal optimality condition, also known as the Euler equation.

5.4 Steady state

In steady state, \(k = k' = \bar k\), \(c = c' = \bar c\), etc. The Euler equation becomes

\[
\beta \left[ \bar z \frac{\partial f(\bar k)}{\partial \bar k} + 1 - \delta \right] = 1,
\]

from which we can solve for \(\bar k\) as a function of the structural parameters and \(\bar z\).

5.5 Contraction mapping theorem

In general, the function \(v(\cdot)\) is unknown, but the Bellman equation can be used to find it. In most of the cases we will deal with, the Bellman equation satisfies a contraction mapping theorem, which implies that:

1. There is a unique function \(v(\cdot)\) which satisfies the Bellman equation.

2. If we begin with any initial guess for the function v, let's call it \(v^0(k)\), and define

\[
v^{n+1}(k) = \max_{c,\, k'} \left[ u(c) + \beta v^n(k') \right]
\quad \text{subject to} \quad
c + k' = \bar z f(k) + (1 - \delta) k
\]

for n = 0, 1, 2, ..., then \(\lim_{n \to \infty} v^{n+1}(k) = v(k)\).

In other words, if we iterate on the Bellman equation and at each iteration update the guess for the function v, we will eventually converge to the true function v.

The above two implications give us two alternative means of uncovering the value function. First, given implication 1 above, if we are fortunate enough to correctly guess the value function \(v(\cdot)\), then we can simply plug \(v(k')\) into the right-hand side and verify that \(v(k)\) solves the Bellman equation. This procedure works only in a very few and very special cases. Second, implication 2 above is useful for doing numerical work. One approach is to find an approximation to the value function and iterate. It almost begs to be programmed.

6 Discrete deterministic value function iteration

Numerical deterministic value function iteration amounts to two choices of numerical methods: how to approximate the value function and how to optimize the object inside the curly brackets:

\[
\underbrace{v(k)}_{\text{approximation}} \;=\; \underbrace{\max_{k'}}_{\text{optimization}} \Big\{\, u(c) + \beta \, \underbrace{v(k')}_{\text{approximation}} \,\Big\}
\]

With discrete deterministic value function iteration, the value function is numerically approximated by a vector of discrete points, i.e. both the state variable k and the decision variable k' can only take a set of discrete values. Numerical optimization is then almost trivial and simply amounts to picking, for each k, the k' which gives the highest value of \(u(c) + \beta v(k')\).

6.1 Computational implementation

1. Initialize the algorithm.

(a) Define the calibrated structural parameters of the model economy.
(b) Compute the steady-state value of the capital stock \(k^*\).
(c) Set \(g_k\) (number of grid points), \(\underline{k}\) (lower bound of the state space), \(\bar k\) (upper bound of the state space), and \(\varepsilon\) (tolerance of error). \(g_k\) is determined by weighing the trade-off between speed and precision.

Pseudo-code:

α ← .35
β ← .98
δ ← .025
σ ← 2
z̄ ← 5
k* ← [(1/β − (1 − δ))/(α z̄)]^(1/(α−1))
k ← .95 k*
k̄ ← 1.05 k*
g_k ← 101
ε ← 10⁻⁸

Matlab implementation:

clear all; close all;
alpha   = .35;
beta    = .98;
delta   = .025;
sigma   = 2;
zbar    = 5;
kstar   = ((1/beta - 1 + delta)/(alpha*zbar))^(1/(alpha - 1));
kmin    = 0.95*kstar;
kmax    = 1.05*kstar;
gk      = 101;
epsilon = 1e-8;

Note that the values assigned to the variables in this example are arbitrary. They are not calibrated, but are simply chosen to illustrate the algorithm.
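To connect the kstar line above to the theory: with Cobb-Douglas production \(f(k) = k^{\alpha}\), which the consumption matrix in Step 3 below also assumes, the steady-state Euler equation of Section 5.4 gives

\[
\beta \left( \alpha \bar z \bar k^{\alpha - 1} + 1 - \delta \right) = 1
\quad \Longrightarrow \quad
\bar k = \left( \frac{1/\beta - 1 + \delta}{\alpha \bar z} \right)^{\frac{1}{\alpha - 1}} = k^*.
\]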
2. Given the bounds of the state space, set the grid points \(\{k_1, k_2, \ldots, k_{g_k}\}\). The default is equidistant grid points. The value function will be approximated as a \((g_k \times 1)\)-dimensional vector v.

Pseudo-code:

for i = 1 to g_k do
  k_i ← k + (k̄ − k)/(g_k − 1) · (i − 1)
end for

Matlab implementation:

for i = 1:gk
    k(i) = kmin + (kmax - kmin)/(gk - 1)*(i - 1);
end

An alternative Matlab implementation:

k = linspace(kmin, kmax, gk);

3. Construct consumption and welfare matrices. Compute a \((g_k \times g_k)\)-dimensional consumption matrix c with the value of consumption for all the \((g_k \times g_k)\) combinations of k and k'. Then compute a \((g_k \times g_k)\)-dimensional welfare matrix u with the utility of consumption for all the \((g_k \times g_k)\) combinations of k and k'.

Pseudo-code:

for i = 1 to g_k do
  for j = 1 to g_k do
    c_{i,j} ← z̄ k_i^α + (1 − δ) k_i − k_j
    if c_{i,j} < 0 then
      c_{i,j} ← 0
    end if
  end for
end for
for i = 1 to g_k do
  for j = 1 to g_k do
    if σ = 1 then
      u(c_{i,j}) ← ln(c_{i,j})
    else
      u(c_{i,j}) ← (c_{i,j}^{1−σ} − 1)/(1 − σ)
    end if
  end for
end for

Matlab implementation:

for i = 1:gk
    for j = 1:gk
        c(i,j) = zbar*k(i)^alpha + (1 - delta)*k(i) - k(j);
        if c(i,j) < 0
            c(i,j) = 0;   % infeasible choices get c = 0, hence utility -Inf below
        end
    end
end
clear i j
for i = 1:gk
    for j = 1:gk
        if sigma == 1
            u(i,j) = log(c(i,j));
        else
            u(i,j) = (c(i,j)^(1 - sigma) - 1)/(1 - sigma);
        end
    end
end
clear i j

4. Set an initial value of \(v^0 = \{v_i^0\}_{i=1}^{g_k}\). A trivial initial condition is \(v^0 = 0\). Also initialize Tv.

Pseudo-code:

for i = 1 to g_k do
  v_i ← 0
  Tv_i ← 0
end for

Matlab implementation:

v  = zeros(gk, 1);
Tv = zeros(gk, 1);

5. Update the value function and obtain Tv. More specifically, do the following steps for each i = 1, ..., g_k.

(a) Solve the following problem:

\[
d_i^k = \operatorname*{argmax}_{k' \in \{k_j\}_{j=1}^{g_k}} \left\{ u(k_i, k') + \beta v(k') \right\}
\]

For later use, also store

\[
d_i^g \leftarrow \operatorname*{argmax}_{j \in \{1, \ldots, g_k\}} \left\{ u(k_i, k_j) + \beta v(k_j) \right\}.
\]

(b) Once \(d_i^k\) is obtained, use it to update the value function. Specifically:

\[ Tv_i = u(k_i, d_i^k) + \beta v(d_i^k). \]

After implementing the procedure above for i = 1, ..., g_k, we have constructed a new (discretized) approximation of the value function, \(Tv = \{Tv_i\}_{i=1}^{g_k}\).

6. Compare \(\{v_i\}_{i=1}^{g_k}\) and \(\{Tv_i\}_{i=1}^{g_k}\) and compute the distance w. One way to define the error is to use the maximum distance:

\[ w = \max_i |v_i - Tv_i|. \]

If w > ε, the error is not small enough. Update the value function using v = Tv and go back to Step 5.

Pseudo-code:

w ← 2ε
while w > ε do
  for i = 1 to g_k do
    d_i^k ← argmax_{k' ∈ {k_j}} { u(k_i, k') + βv(k') }
    d_i^g ← argmax_{j ∈ {1,...,g_k}} { u(k_i, k_j) + βv(k_j) }
    Tv_i ← u(k_i, d_i^k) + βv(d_i^k)
  end for
  w ← max_i |v_i − Tv_i|
  v ← Tv
end while

Matlab implementation:

w = 2*epsilon;   % initialize w so that the loop runs at least once
while w > epsilon
    for i = 1:gk
        [y, j] = max(u(i,:) + beta*v');
        dk(i)  = k(j);
        dg(i)  = j;
        Tv(i)  = u(i, dg(i)) + beta*v(dg(i));
    end
    w = max(abs(Tv - v));
    v = Tv;
end

7. If w < ε, then we have found our optimal value function. The value function is approximated by the vector \(v = \{v_i\}_{i=1}^{g_k}\). The optimal decision rule is approximated by the vector \(d^k = \{d_i^k\}_{i=1}^{g_k}\). Given \(d^k\) we can compute the optimal decision rule for consumption, \(d^c = \{d_i^c\}_{i=1}^{g_k}\).

Pseudo-code:

for i = 1 to g_k do
  d_i^c ← z̄ k_i^α + (1 − δ) k_i − d_i^k
end for

Matlab implementation:

for i = 1:gk
    dc(i) = zbar*k(i)^alpha + (1 - delta)*k(i) - dk(i);
end
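As an aside, and a sketch that is not part of the original notes: the loop over i in Steps 5 and 6 above can be replaced by a single vectorized Matlab statement, which is typically much faster:

% Row i of the matrix holds u(k_i, k_j) + beta*v_j for all j; taking the
% max along the second dimension returns both Tv and the grid indices dg.
[Tv, dg] = max(u + beta*repmat(v', gk, 1), [], 2);
dk = k(dg);   % decision rule in terms of grid values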
6.2 Simulate the model

If we want to simulate the model for T periods from any given initial capital stock \(k_s \in \{k_j\}_{j=1}^{g_k}\), the easiest approach is to use the grid representation of the optimal decision rule for next-period capital, \(d^g\).

Pseudo-code:

Require: i (the grid index of the initial capital stock, k̂_1 = k_i) and d^g
Ensure: {k̂_t}_{t=1}^{T+1}, {ĉ_t}_{t=1}^{T}
for t = 1 to T do
  j ← d_i^g
  k̂_{t+1} ← k_j
  ĉ_t ← z̄ f(k̂_t) + (1 − δ) k̂_t − k̂_{t+1}
  i ← j
end for

7 Reintroducing uncertainty

An AR(1) process like the one specifying the exogenous law of motion for \(z_{t+1}\) in Equation (5) is an example of a continuous process which has the Markov property. Loosely speaking, having the Markov property means that the probability distribution of the next state depends only on the current state and is conditionally independent of the path of past states – i.e. no memory beyond the present. A discrete two-state Markov chain is an example of a simpler process with the Markov property.

We take the problem from Equations (11)–(13), but let \(z_s\) be stochastic:

\[ \max_{\{k_{s+1}\}_{s=t}^{\infty}} E_t \sum_{s=t}^{\infty} \beta^{s-t} u(c_s) \tag{21} \]

subject to

\[ c_s + k_{s+1} = z_s f(k_s) + (1 - \delta) k_s \quad \forall s \ge t, \tag{22} \]
\[ k_t > 0, \text{ given}, \tag{23} \]

where \(z_s\) can attain two values, \(z_H\) (high productivity) and \(z_L\) (low productivity). \(z_t\) follows a two-state Markov chain with \(\Pr(z_{s+1} = z_H \mid z_s = z_H) = p\) and \(\Pr(z_{s+1} = z_L \mid z_s = z_L) = q\). Denoting the Markov transition matrix \(\Pi\), we have

\[
\Pi = \begin{pmatrix} p & 1 - p \\ 1 - q & q \end{pmatrix}.
\]

7.1 The sequence problem, solving for allocations

With the introduction of even a very simple exogenous stochastic process, the entire problem "blows up".

7.2 The recursive problem, solving for functions

The introduction of exogenous uncertainty hardly changes the problem:

\[
v(k_t, z_t) = \max_{k_{t+1}} \left\{ u(c_t) + \beta E_{z_{t+1}|z_t} v(k_{t+1}, z_{t+1}) \right\},
\]

where

\[ c_t = z_t f(k_t) + (1 - \delta) k_t - k_{t+1}. \]

Our recursive problem now has two state variables: \(k_t\) and \(z_t\). Together these two variables capture all the information relevant for making a decision. Since \(z_t\) does not depend on past decisions, we classify it as an exogenous state variable. \(k_t\) is still an endogenous state variable. The only decision or control variable of our recursive problem is still \(k_{t+1}\) (or, equivalently, \(c_t\) or \(x_t\)).

7.3 Recursive notation

\[
v(k, z) = \max_{k'} \left\{ u(c) + \beta E_{z'|z} \left[ v(k', z') \right] \right\},
\]

where

\[ c = z f(k) + (1 - \delta) k - k'. \]

7.4 Properties of the decision rule

As before, we can take the first-order condition with respect to the control variable (in this case \(k'\)):

\[
\frac{\partial u(c)}{\partial c} \cdot (-1) + \beta E_{z'|z} \frac{\partial v(k', z')}{\partial k'} = 0,
\]

and the envelope condition with respect to the endogenous state variable (in this case \(k\)):

\[
\frac{\partial v(k, z)}{\partial k} = \frac{\partial u(c)}{\partial c} \left[ z \frac{\partial f(k)}{\partial k} + 1 - \delta \right].
\]

Updating the latter, combining these two equations, and eliminating \(\partial v(k', z')/\partial k'\), we have

\[
\frac{1}{\beta} \frac{\partial u(c)}{\partial c} = E_{z'|z} \left\{ \frac{\partial u(c')}{\partial c'} \left[ z' \frac{\partial f(k')}{\partial k'} + 1 - \delta \right] \right\},
\]

or equivalently

\[
\beta E_{z'|z} \left\{ \frac{\partial u(c')/\partial c'}{\partial u(c)/\partial c} \left[ z' \frac{\partial f(k')}{\partial k'} + 1 - \delta \right] \right\} = 1.
\]

7.5 Discrete stochastic value function iteration

Numerical stochastic value function iteration amounts to three choices of numerical methods: how to approximate the value function, how to integrate in order to compute the expectation, and how to optimize the object inside the curly brackets:

\[
\underbrace{v(k, z)}_{\text{approximation}} \;=\; \underbrace{\max_{k'}}_{\text{optimization}} \Big\{\, u(c) + \beta \, \underbrace{E_{z'|z}}_{\text{integration}} \underbrace{v(k', z')}_{\text{approximation}} \,\Big\}
\]

With discrete stochastic value function iteration, the value function is numerically approximated by a vector of discrete points, i.e. both the state variables k and z and the decision variable k' can only take a set of discrete values. Both numerical integration and numerical optimization are then almost trivial. Integration simply amounts to weighting discrete events by discrete probabilities, and optimization simply amounts to picking, for each combination (k, z), the k' which gives the highest value of the expression within the curly brackets.
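As a minimal Matlab sketch (our illustration, not code from the original notes), one update of the stochastic Bellman operator on the grid might look as follows, assuming a gk-by-2 value array v, the 2-by-2 transition matrix Pi built from p and q, and a gk-by-gk-by-2 welfare array u constructed as in Step 3 of Section 6.1 but with z taking the values zH and zL:

% One stochastic Bellman update on the grid.
for iz = 1:2
    Ev = v * Pi(iz,:)';   % gk-by-1 vector: E[ v(k', z') | z = z(iz) ]
    [Tv(:,iz), dg(:,iz)] = max(u(:,:,iz) + beta*repmat(Ev', gk, 1), [], 2);
end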
8 Reintroducing labor-leisure choice

\[ \max_{\{k_{t+1}, n_t\}_{t=s}^{\infty}} E_s \sum_{t=s}^{\infty} \beta^{t-s} u(c_t, 1 - n_t) \tag{24} \]

subject to

\[ c_t + k_{t+1} = z_t f(k_t, n_t) + (1 - \delta) k_t \quad \forall t \ge s, \tag{25} \]
\[ k_s > 0, \text{ given}, \tag{26} \]

where \(z_t\) follows a two-state Markov chain with \(\Pr(z_{t+1} = z_H \mid z_t = z_H) = p\) and \(\Pr(z_{t+1} = z_L \mid z_t = z_L) = q\).

8.1 The recursive problem, solving for functions

Again, the problem is hardly changed:

\[
v(k_t, z_t) = \max_{k_{t+1}, n_t} \left\{ u(c_t, 1 - n_t) + \beta E_{z_{t+1}|z_t} v(k_{t+1}, z_{t+1}) \right\},
\]

where

\[ c_t = z_t f(k_t, n_t) + (1 - \delta) k_t - k_{t+1}. \]

Our recursive problem still has two state variables: \(k_t\) and \(z_t\). No additional variables are necessary to capture all the information relevant for making a decision. However, we now have two decision or control variables: \(k_{t+1}\) and \(n_t\).

8.2 Recursive notation

\[
v(k, z) = \max_{k', n} \left\{ u(c, 1 - n) + \beta E_{z'|z} v(k', z') \right\},
\]

where

\[ c = z f(k, n) + (1 - \delta) k - k'. \]

8.3 Properties of the decision rule

As before, we can take the first-order conditions with respect to the control variables \(k'\):

\[
\frac{\partial u(c, 1-n)}{\partial c} \cdot (-1) + \beta E_{z'|z} \frac{\partial v(k', z')}{\partial k'} = 0, \tag{27}
\]

and \(n\):

\[
\frac{\partial u(c, 1-n)}{\partial c} \frac{\partial c}{\partial n} + \frac{\partial u(c, 1-n)}{\partial n} = 0, \tag{28}
\]

and the envelope condition with respect to the endogenous state variable \(k\):

\[
\frac{\partial v(k, z)}{\partial k} = \frac{\partial u(c, 1-n)}{\partial c} \left[ z \frac{\partial f(k, n)}{\partial k} + 1 - \delta \right]. \tag{29}
\]

Equation (28) is our intratemporal optimality condition. Rearranging it, with the marginal utility of leisure in the numerator, gives us

\[
\frac{\partial u(c, 1-n)/\partial (1-n)}{\partial u(c, 1-n)/\partial c} = z \frac{\partial f(k, n)}{\partial n}.
\]

Updating Equation (29), combining it with Equation (27), and eliminating \(\partial v(k', z')/\partial k'\) gives us, as before, the intertemporal optimality condition

\[
\beta E_{z'|z} \left\{ \frac{\partial u(c', 1-n')/\partial c'}{\partial u(c, 1-n)/\partial c} \left[ z' \frac{\partial f(k', n')}{\partial k'} + 1 - \delta \right] \right\} = 1.
\]

8.4 Discrete stochastic value function iteration

Again, everything follows through almost exactly as in the case without the labor-leisure choice... to be continued...

9 Piecewise-linear interpolation of the value function

As we saw in Section 6, numerical deterministic value function iteration amounts to two choices of numerical methods: how to approximate the value function and how to optimize the sum of the instantaneous return (utility) and the discounted continuation value. In this section, the value function is numerically approximated by piecewise-linear interpolation, i.e. whereas we still evaluate the function at a set of discrete knots for k, the decision variable k' can now take any value within the permissible range. For the numerical optimization part, we need a routine that can maximize a continuous, but not continuously differentiable, function. A natural choice is golden-section search.

9.1 Computational implementation

1. Initialize the algorithm.

(a) Define the calibrated structural parameters of the model economy.
(b) Compute the steady-state value of the capital stock \(k^*\).
(c) Set \(g_k\) (number of grid points), \(\underline{k}\) (lower bound of the state space), \(\bar k\) (upper bound of the state space), and \(\varepsilon\) (tolerance of error). \(g_k\) is determined by weighing the trade-off between speed and precision.

Pseudo-code:

α ← .35
β ← .98
δ ← .025
σ ← 2
z̄ ← 5
k* ← [(1/β − (1 − δ))/(α z̄)]^(1/(α−1))
k ← .95 k*
k̄ ← 1.05 k*
g_k ← 11
ε ← 10⁻⁸

Note that the values assigned to the variables in this example are arbitrary. They are not calibrated, but are simply chosen to illustrate the algorithm.

2. Given the bounds of the state space, set the knots \(\{k_i\}_{i=1}^{g_k}\). The default is equidistant grid points. Given the knots, the value function, which is approximated using piecewise-linear interpolation, can be stored as an array of length \(g_k\). Let us denote the value function \(\{v_i\}_{i=1}^{g_k}\).

3. (a) Initialize the algorithm that evaluates, at a point d, the piecewise-linear interpolation defined by \(\{k_i, v_i\}_{i=1}^{g_k}\) (a Matlab sketch of all of Step 3 follows after item (c)).

Pseudo-code:

Require: d and {k_i, v_i}_{i=1}^{g_k}
Ensure: y
for i = 1 to g_k − 1 do
  if d ≥ k_i and d < k_{i+1} then
    y ← v_i + (d − k_i)(v_{i+1} − v_i)/(k_{i+1} − k_i)
    break loop
  end if
end for

(b) Initialize the algorithm that evaluates \(u(k, k') + \beta v(k')\) given \(k' = d\). Denote by ṽ(d) the interpolated value function evaluated at d.

Pseudo-code:

Require: k, d and {k_i, v_i}_{i=1}^{g_k}
Ensure: v(k | k' = d)
ṽ(d) ← evaluate the interpolation at d given {k_i, v_i}_{i=1}^{g_k}
if σ = 1 then
  v(k | k' = d) ← ln[z̄ k^α + (1 − δ)k − d] + β ṽ(d)
else
  v(k | k' = d) ← ([z̄ k^α + (1 − δ)k − d]^{1−σ} − 1)/(1 − σ) + β ṽ(d)
end if

(c) Initialize the golden-section search algorithm to find, for given k,

\[
d = \operatorname*{argmax}_{k' \in [\underline{k}, \bar k]} \left\{ u(k, k') + \beta v(k') \right\}.
\]

Start of pseudo-code:

Require: k and {k_i, v_i}_{i=1}^{g_k}
Ensure: d
...
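A minimal Matlab sketch of Step 3 (our illustration, not code from the original notes; the function names eval_pwl, bellman_rhs, and golden_max are our own, and each would live in its own file or as a local function):

% (a) Evaluate the piecewise-linear interpolation defined by the knots
%     kgrid and values v at the point d (equivalent to interp1(kgrid,v,d)).
function y = eval_pwl(d, kgrid, v)
    i = find(kgrid <= d, 1, 'last');
    i = min(max(i, 1), length(kgrid) - 1);   % clamp to an interior bracket
    y = v(i) + (d - kgrid(i))*(v(i+1) - v(i))/(kgrid(i+1) - kgrid(i));
end

% (b) Evaluate u(k, k') + beta*v(k') at k' = d for a given k.
function val = bellman_rhs(k, d, kgrid, v, zbar, alpha, delta, beta, sigma)
    c = zbar*k^alpha + (1 - delta)*k - d;
    if c <= 0
        val = -inf;                          % infeasible choice of k'
        return
    end
    if sigma == 1
        val = log(c) + beta*eval_pwl(d, kgrid, v);
    else
        val = (c^(1 - sigma) - 1)/(1 - sigma) + beta*eval_pwl(d, kgrid, v);
    end
end

% (c) Golden-section search: maximize a continuous f on [a, b] to within
%     tol (re-evaluates both interior points each iteration for clarity).
function d = golden_max(f, a, b, tol)
    phi = (sqrt(5) - 1)/2;                   % inverse golden ratio
    while (b - a) > tol
        x1 = b - phi*(b - a);
        x2 = a + phi*(b - a);
        if f(x1) > f(x2)
            b = x2;                          % maximum lies in [a, x2]
        else
            a = x1;                          % maximum lies in [x1, b]
        end
    end
    d = (a + b)/2;
end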
4. Set an initial value of \(v^0 = \{v_i^0\}_{i=1}^{g_k}\). A trivial initial condition is \(v^0 = 0\). Also initialize Tv.

Pseudo-code:

for i = 1 to g_k do
  v_i ← 0
  Tv_i ← 0
end for

5. Update the value function and obtain Tv. More specifically, do the following steps for each knot in the state space \(\{k_i\}_{i=1}^{g_k}\).

(a) Solve the following problem using golden-section search:

\[
d_i^k = \operatorname*{argmax}_{k' \in [\underline{k}, \bar k]} \left\{ u(k_i, k') + \beta v(k') \right\}
\]

(b) Once \(d_i^k\) is obtained, use it to update the value function. Specifically:

\[ Tv_i = u(k_i, d_i^k) + \beta v(d_i^k). \]

After implementing the procedure above for i = 1, ..., g_k, we have constructed a new piecewise-linear interpolation of the value function, \(Tv = \{Tv_i\}_{i=1}^{g_k}\).

6. Compare \(\{v_i\}_{i=1}^{g_k}\) and \(\{Tv_i\}_{i=1}^{g_k}\) and compute the distance w. One way to define the error is to use the maximum distance:

\[ w = \max_i |v_i - Tv_i|. \]

If w > ε, the error is not small enough. Update the value function using v = Tv and go back to Step 5.

7. If w < ε, then we have found our optimal value function. The value function is approximated by the vector \(v = \{v_i\}_{i=1}^{g_k}\). The optimal decision rule is approximated by the vector \(d^k = \{d_i^k\}_{i=1}^{g_k}\). Given \(d^k\) we can compute the optimal decision rule for consumption, \(d^c = \{d_i^c\}_{i=1}^{g_k}\).
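To close the loop, a minimal Matlab sketch of Steps 4 through 6 (again our illustration, reusing the assumed helper functions bellman_rhs and golden_max from the Step 3 sketch above):

% Iterate on the Bellman operator, maximizing over a continuous k' by
% golden-section search at each knot.
v  = zeros(gk, 1);
Tv = zeros(gk, 1);
w  = 2*epsilon;                              % ensure at least one iteration
while w > epsilon
    for i = 1:gk
        obj   = @(d) bellman_rhs(k(i), d, k, v, zbar, alpha, delta, beta, sigma);
        dk(i) = golden_max(obj, kmin, kmax, 1e-10);
        Tv(i) = obj(dk(i));
    end
    w = max(abs(Tv - v));                    % sup-norm distance, as in Step 6
    v = Tv;
end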