11. Stochastic Dynamic Programming

Note

本章主题:随机动态规划。 把外生随机冲击 \(z\) 作为额外状态变量加入,设 \(\{z_t\}\) 为一阶马尔可夫过程。§11.1 随机冲击:离散状态空间——马尔可夫矩阵 \(Q=[q_{i,j}]\)(\(\sum_j q_{i,j}=1\),高阶可化为一阶);连续状态空间——转移密度 \(q(z,z')=\partial Q(z,z')/\partial z'\)、\(\int q\,dz'=1\)、定义 11.1 Feller 性质(算子 \(M\) 把连续函数映到连续函数)。§11.2 问题:状态 \(S=(x,z)\);贝尔曼方程 \(V(x,z)=\max_{y\in\Gamma(x,z)}\{F(x,z,y)+\beta\int V(y,z')q(z,z')dz'\}\);\(V\)/\(g\) 关于 \(x\) 与 \(z\) 的单调性、凹性、可微性(逐 \(z\) 固定);\(z\) i.i.d. 时 \(g\) 关于 \(z\) 递增;替代贝尔曼方程(\(x'=\phi(x,y,z)\));稳态 = \(S(x,z)\) 的平稳分布(依赖 \(g\) 与 \(Q\)、需遍历集);包络条件 \(V_x=F_x\)、欧拉方程 \(-F_y(x,y,z)=\beta\int F_y(y,y',z')q(z,z')dz'\)。

Note

Chapter theme: stochastic dynamic programming. Add an exogenous stochastic shock \(z\) as an additional state variable, assuming \(\{z_t\}\) is a first-order Markov process. §11.1 Stochastic shock: discrete state space — the Markov matrix \(Q=[q_{i,j}]\) (\(\sum_j q_{i,j}=1\), higher-order can be reduced to first-order); continuous state space — transition density \(q(z,z')=\partial Q(z,z')/\partial z'\), \(\int q\,dz'=1\), Definition 11.1 the Feller property (the operator \(M\) maps continuous functions to continuous functions). §11.2 The problem: state \(S=(x,z)\); the Bellman equation \(V(x,z)=\max_{y\in\Gamma(x,z)}\{F(x,z,y)+\beta\int V(y,z')q(z,z')dz'\}\); monotonicity, concavity, differentiability of \(V\)/\(g\) in \(x\) and in \(z\) (for each fixed \(z\)); \(g\) increasing in \(z\) when \(z\) is i.i.d.; the alternative Bellman equation (\(x'=\phi(x,y,z)\)); the steady state = the stationary distribution of \(S(x,z)\) (involving \(g\) and \(Q\), needs an ergodic set); the envelope condition \(V_x=F_x\), the Euler equation \(-F_y(x,y,z)=\beta\int F_y(y,y',z')q(z,z')dz'\).

在随机动态规划问题中,可把外生随机冲击 \(z\) 作为额外状态变量加入。这里设 \(\{z_t\}\) 是一阶马尔可夫过程。

11.1 Stochastic Shock

外生随机冲击 \(z\) 可以两种方式加入。

11.1.1 离散状态空间与马尔可夫矩阵

冲击 \(z\) 可取 \(n\) 个值,即 \(z\in Z=\{z_1,z_2,\dots,z_n\}\)。可把转移描述为马尔可夫矩阵 \(Q\): $$Q=\begin{bmatrix}q_{1,1}&q_{1,2}&\cdots&q_{1,n}\\q_{2,1}&q_{2,2}&\cdots&q_{2,n}\\\vdots&\vdots&\vdots&\vdots\\q_{n,1}&q_{n,2}&\cdots&q_{n,n}\end{bmatrix}$$ 其中 \(q_{i,j}\) 是当前期为 \(z_i\)、下一期为 \(z_j\) 的概率。注意 $$\sum_{j=1}^n q_{i,j}=1\quad\text{for }i=1,2,\dots,n$$

Tip

注记(高阶 → 一阶马尔可夫) 即便 \(\{z_t\}\) 是高阶马尔可夫过程,也仍可将其转换为一阶马尔可夫过程。考虑二阶马尔可夫过程 \(z_{t+2}=\theta_1 z_{t+1}+\theta_2 z_t+\varepsilon_{t+2}\),\(\varepsilon_{t+2}\sim\mathcal N(0,\sigma^2)\)。定义 \(A=\begin{bmatrix}\theta_1&\theta_2\\1&0\end{bmatrix}\)、\(x_t=\begin{pmatrix}z_{t+1}\\z_t\end{pmatrix}\)、\(\epsilon_t=\begin{pmatrix}\varepsilon_{t+1}\\0\end{pmatrix}\),则 $$\begin{pmatrix}z_{t+2}\\z_{t+1}\end{pmatrix}=\begin{bmatrix}\theta_1&\theta_2\\1&0\end{bmatrix}\begin{pmatrix}z_{t+1}\\z_t\end{pmatrix}+\begin{pmatrix}\varepsilon_{t+2}\\0\end{pmatrix}$$ 即 \(x_{t+1}=Ax_t+\epsilon_{t+1}\),是一阶马尔可夫过程(马尔可夫过程的正式定义见 §42)。

In the stochastic Dynamic Programming problem, we can add an exogenous stochastic shock \(z\) as an additional state variable. Here we assume \(\{z_t\}\) is a first-order Markov process.

11.1 Stochastic Shock

The exogenous stochastic shock \(z\) can be added in the following two ways.

11.1.1 Discrete state space and Markov matrix

The shock can take \(n\) possible values, i.e. \(z\in Z=\{z_1,z_2,\dots,z_n\}\). We can describe the transition as a Markov matrix \(Q\) such that $$Q=\begin{bmatrix}q_{1,1}&q_{1,2}&\cdots&q_{1,n}\\q_{2,1}&q_{2,2}&\cdots&q_{2,n}\\\vdots&\vdots&\vdots&\vdots\\q_{n,1}&q_{n,2}&\cdots&q_{n,n}\end{bmatrix}$$ where \(q_{i,j}\) is the probability of having \(z_j\) in the next period given \(z_i\) in this period. Note that $$\sum_{j=1}^n q_{i,j}=1\quad\text{for }i=1,2,\dots,n$$

Tip

Remark (higher-order → first-order Markov) Note that even if \(\{z_t\}\) is a higher-order Markov process, we can still transform it into a first-order Markov process. Consider a second-order Markov process \(z_{t+2}=\theta_1 z_{t+1}+\theta_2 z_t+\varepsilon_{t+2}\) with \(\varepsilon_{t+2}\sim\mathcal N(0,\sigma^2)\). Define \(A=\begin{bmatrix}\theta_1&\theta_2\\1&0\end{bmatrix}\), \(x_t=\begin{pmatrix}z_{t+1}\\z_t\end{pmatrix}\) and \(\epsilon_t=\begin{pmatrix}\varepsilon_{t+1}\\0\end{pmatrix}\), then $$\begin{pmatrix}z_{t+2}\\z_{t+1}\end{pmatrix}=\begin{bmatrix}\theta_1&\theta_2\\1&0\end{bmatrix}\begin{pmatrix}z_{t+1}\\z_t\end{pmatrix}+\begin{pmatrix}\varepsilon_{t+2}\\0\end{pmatrix}$$ i.e. \(x_{t+1}=Ax_t+\epsilon_{t+1}\), which is a first-order Markov process (see the formal definition of Markov process in §42).

11.1.2 连续状态空间与 Feller 性质

冲击取连续统的值,即 \(z\in Z=[\underline z,\bar z]\)。可把转移描述为函数 \(Q(z,z')\)、带密度 $$q(z,z')=\frac{\partial Q(z,z')}{\partial z'}$$ 其中 \(z\) 是当前期冲击、\(z'\) 是下一期冲击。\(q(z,z')\) 具有性质 $$\int_{\underline z}^{\bar z}q(z,z')dz'=1$$ (对应离散情形的 \(\sum_{j=1}^n q_{i,j}=1\))。设对每个 \(z'\in Z\),\(q(z,z')\) 关于第一个参数 \(z\) 连续。

Tip

注记 11.1 注意我们总会假设 \(q(z,z')\) 关于第二个参数 \(z'\) 连续(因为想要连续密度)。但这里还假设 \(q(z,z')\) 关于第一个参数 \(z\) 连续,意即若当前冲击 \(z\) 变一点点,则下一期冲击 \(z'\) 的整个密度函数也移动一点点。这对多数经济问题是合理的。

该假设保证:若 \(h(z)\) 关于 \(z\) 连续,则 $$(Mh)(z)=\int_{\underline z}^{\bar z}h(z')q(z,z')dz'$$ 关于 \(z\) 连续。若这对每个连续函数 \(h\) 都成立,则称之为 \(Q\) 的 Feller 性质

Important

定义 11.1(Feller 性质 Feller Property) 给定 \((Z,Q)\),对 \(z\in Z\)、\(q(z,z')=\dfrac{\partial Q(z,z')}{\partial z'}\),定义算子 \(M\) 使 $$(Mh)(z)=\mathbb E(h(z')\mid z)=\int_{\underline z}^{\bar z}h(z')q(z,z')dz'\quad\text{for }\forall z$$ 若 \(M\) 把任意连续函数 \(h\) 映到连续函数 \(Mh\),则称 \(Q\) 具有 Feller 性质

11.1.2 Continuous state space and the Feller property

The shock can have a continuum of values, i.e. \(z\in Z=[\underline z,\bar z]\). We can describe the transition as a function \(Q(z,z')\) with density $$q(z,z')=\frac{\partial Q(z,z')}{\partial z'}$$ where \(z\) is the current period shock and \(z'\) is the next period shock. \(q(z,z')\) has the property that $$\int_{\underline z}^{\bar z}q(z,z')dz'=1$$ (which corresponds to \(\sum_{j=1}^n q_{i,j}=1\) in the discrete case). Assume that for each \(z'\in Z\), \(q(z,z')\) is continuous in its first argument \(z\).

Tip

Remark 11.1 Note that we always assume that \(q(z,z')\) is continuous in its second argument \(z'\) because we want a continuous density. But here we also assume that \(q(z,z')\) is continuous in its first argument \(z\), which means that if the current shock \(z\) changes a little bit, then the whole density function for the next-period shock \(z'\) moves a little bit. Such an assumption is reasonable for most economic problems.

This assumption insures that if \(h(z)\) is continuous in \(z\), then $$(Mh)(z)=\int_{\underline z}^{\bar z}h(z')q(z,z')dz'$$ is continuous in \(z\). If this is true for every continuous function \(h\), then this is called the Feller property of \(Q\).

Important

Definition 11.1 (Feller Property) Given \((Z,Q)\), for \(z\in Z\) and \(q(z,z')=\dfrac{\partial Q(z,z')}{\partial z'}\), define an operator \(M\) such that $$(Mh)(z)=\mathbb E(h(z')\mid z)=\int_{\underline z}^{\bar z}h(z')q(z,z')dz'\quad\text{for }\forall z$$ If \(M\) maps any continuous function \(h\) to a continuous function \(Mh\), then \(Q\) is called to have the Feller property.

11.2 The Problem

现在状态变量为 \(S=(x,z)\)。

11.2.1 贝尔曼方程

这类问题的典型贝尔曼方程可设为 $$V(x,z)=\max_{y\in\Gamma(x,z)}\left\{F(x,z,y)+\beta\int_{\underline z}^{\bar z}V(y,z')q(z,z')dz'\right\}$$ 其中 \(\Gamma(x,z)\) 是 \(y\) 的可行集、\(x\) 是内生状态变量、\(z\) 是外生冲击状态变量。

11.2.2 假设与性质

\(V\) 良定(存在性与唯一性):与确定性情形相同的要求成立,注意要求需对每个固定 \(z\) 满足。

\(V\) 与 \(g\) 作为 \(x\) 的函数的性质: - (1) \(V\) 关于 \(x\) 的单调性:要求 1:\(\Gamma(x,z)\) 对每个固定 \(z\) 关于 \(x\) 单调(\(x'\ge x\Rightarrow\Gamma(x,z)\subseteq\Gamma(x',z)\))。要求 2:\(F(x,y,z)\) 对每个固定 \(z\) 关于 \(x\) 弱递增。若 \(F(x,y,z)\) 对每个固定 \(z\) 关于 \(x\) 严格递增,则 \(V\) 严格递增。 - (2) \(V\) 关于 \(x\) 的凹性:要求 1:\(F(x,y,z)\) 对每个固定 \(z\) 关于 \((x,y)\) 弱凹。若 \(F\) 对每个固定 \(z\) 关于 \(x\) 严格凹,则 \(V\) 严格凹。要求 2:\(\Gamma(x,z)\) 对每个固定 \(z\) 作为对应凸。 - (3) \(V\) 在 \(\hat x\) 处的可微性:要求 1:\(X\subseteq\mathbb R^n\) 凸集。要求 2:\(V\) 凹。要求 3:\(\hat x\in\text{int}(X)\)。要求 4:\(g(\hat x,z)\in\text{int}(\Gamma(\hat x,z))\)。要求 5:\(F\) 对每个固定 \(z\) 关于 \(x\) 可微。 - (4) \(g\) 的性质:若 \(V(x,z)\) 对每个固定 \(z\) 关于 \(x\) 严格凹,则 \(g(x,z)\) 对每个固定 \(z\) 是单值函数(非对应)且关于 \(x\) 连续。若 \(F(x,z,y)\) 对每个固定 \(z\) 关于 \((x,y)\)(严格)凹,则 \(g(x,z)\) 对每个固定 \(z\) 关于 \(x\)(严格)递增。

Now we have the state variable as \(S=(x,z)\).

11.2.1 Bellman equation

A typical Bellman equation for this type of problem can be set up as $$V(x,z)=\max_{y\in\Gamma(x,z)}\left\{F(x,z,y)+\beta\int_{\underline z}^{\bar z}V(y,z')q(z,z')dz'\right\}$$ where \(\Gamma(x,z)\) is the feasible set for \(y\), \(x\) is the endogenous state variable and \(z\) is the exogenous shock state variable.

11.2.2 Assumptions and Properties

\(V\) is well-defined (existence and uniqueness): the same requirements hold as in the deterministic case. Note that the requirements should be satisfied at each fixed \(z\).

Properties of \(V\) and \(g\) as a function of \(x\): - (1) Monotonicity of \(V\) in \(x\): Requirement 1: \(\Gamma(x,z)\) is monotone in \(x\) for each fixed \(z\) (i.e. \(x'\ge x\Rightarrow\Gamma(x,z)\subseteq\Gamma(x',z)\)). Requirement 2: \(F(x,y,z)\) weakly increasing in \(x\) for each fixed \(z\). If \(F(x,y,z)\) is strictly increasing in \(x\) for each fixed \(z\), then \(V\) is strictly increasing. - (2) Concavity of \(V\) in \(x\): Requirement 1: \(F(x,y,z)\) weakly concave in \((x,y)\) for each fixed \(z\). If \(F\) is strictly concave in \(x\) for each fixed \(z\), then \(V\) is strictly concave. Requirement 2: \(\Gamma(x,z)\) is convex as a correspondence for each fixed \(z\). - (3) Differentiability of \(V\) at \(\hat x\): Requirement 1: \(X\subseteq\mathbb R^n\) is a convex set. Requirement 2: \(V\) is concave. Requirement 3: \(\hat x\in\text{int}(X)\). Requirement 4: \(g(\hat x,z)\in\text{int}(\Gamma(\hat x,z))\). Requirement 5: \(F\) is differentiable in \(x\) for each fixed \(z\). - (4) Property of \(g\): If \(V(x,z)\) is strictly concave in \(x\) for each fixed \(z\), then \(g(x,z)\) is a single valued function (not a correspondence) and continuous in \(x\) for each fixed \(z\). If \(F(x,z,y)\) is (strictly) concave in \((x,y)\) for each fixed \(z\), then \(g(x,z)\) is (strictly) increasing in \(x\) for each fixed \(z\).

\(V\) 与 \(g\) 作为 \(z\) 的函数的性质: - (1) \(V\) 关于 \(z\) 的单调性:要求 1:\(\Gamma(x,z)\) 对所有 \(x\) 关于 \(z\) 单调(\(\hat z\ge z\Rightarrow\Gamma(x,z)\subseteq\Gamma(x,\hat z)\) 对 \(\forall x\))。要求 2:\(F(x,y,z)\) 关于 \(z\) 弱递增。若 \(F(x,y,z)\) 关于 \(z\) 严格递增,则 \(V\) 关于 \(z\) 严格递增。要求 3:算子 \(M\) 把(关于 \(z\))递增的函数 \(V(x,z)\) 映到(关于 \(z\))递增的函数 \((Mh)(x,z)=\int_{\underline z}^{\bar z}V(x,z')q(z,z')dz'\)。 - (2) \(g\) 关于 \(z\) 的单调性:若 \(z\) 是 i.i.d. 的,则在假设 \(F_y(x,z,y)\) 关于 \(z\) 递增时,\(g(x,z)\) 对所有 \(x\) 关于 \(z\) 递增。回忆贝尔曼方程 $$V(x,z)=\max_{y\in\Gamma(x,z)}\left\{F(x,z,y)+\beta\int_{\underline z}^{\bar z}V(y,z')q(z,z')dz'\right\}$$ 考虑关于 \(y\) 的一阶条件 $$-F_y(x,z,y)=\beta\int_{\underline z}^{\bar z}V_y(y,z')q(z,z')dz'$$ 若 \(z\) 是 i.i.d. 的,可把一阶条件重写为 $$-F_y(x,z,y)=\beta\int_{\underline z}^{\bar z}V_y(y,z')\pi(z')dz'$$ 其中 \(\pi(z')\) 是无条件密度。注意此时 LHS 随 \(z\) 减小、RHS 不随 \(z\) 变化,故与之前同样的逻辑成立、可用图示论证 \(g(x,z)\) 关于 \(z\) 递增。但若 \(z\) 不是 i.i.d. 的,则除非对 \(\beta\int V_y(y,z')q(z,z')dz'\) 关于 \(z\) 的变化方向做进一步假设,否则不会有此性质。

11.2.3 替代贝尔曼方程

也可把贝尔曼方程表述为 $$V(x,z)=\max_{y\in\Gamma(x,z)}\left\{F(x,y,z)+\beta\int_{\underline z}^{\bar z}V\big(\underbrace{\phi(x,y,z)}_{\equiv x'},z'\big)q(z,z')dz'\right\}$$ 其中 \(y\) 不是下一期状态变量本身,但可通过函数 \(\phi(x,y,z)\) 影响下一期状态变量 \(x'\) 的值。

11.2.4 稳态

随机规划的稳态与确定性情形不同。这里稳态是 \(S(x,z)\) 的平稳分布(stationary distribution),其转移函数涉及 \(g\) 与 \(Q\)。为使系统收敛到稳态,假设 \(\{z_t\}\) 过程有唯一遍历集、带平稳分布、且系统收敛到它。

11.2.5 包络条件与欧拉方程

包络条件。 对贝尔曼方程 $$V(x,z)=\max_{y\in\Gamma(x,z)}\left\{F(x,y,z)+\beta\int_{\underline z}^{\bar z}V(y,z')q(z,z')dz'\right\}$$ 每个固定 \(z\) 的包络条件为 $$V_x(x,z)=F_x(x,y,z)$$ 欧拉方程。 合并包络条件 $$V_y(y,z')=F_y(y,y',z')$$ 与一阶条件 $$-F_y(x,y,z)=\beta\int_{\underline z}^{\bar z}V_y(y,z')q(z,z')dz'$$ 得欧拉方程 $$-F_y(x,y,z)=\beta\int_{\underline z}^{\bar z}F_y(y,y',z')q(z,z')dz'$$

Properties of \(V\) and \(g\) as a function of \(z\): - (1) Monotonicity of \(V\) in \(z\): Requirement 1: \(\Gamma(x,z)\) is monotone in \(z\) for all \(x\) (i.e. \(\hat z\ge z\Rightarrow\Gamma(x,z)\subseteq\Gamma(x,\hat z)\) for \(\forall x\)). Requirement 2: \(F(x,y,z)\) weakly increasing in \(z\). If \(F(x,y,z)\) is strictly increasing in \(z\), then \(V\) is strictly increasing in \(z\). Requirement 3: the operator \(M\) maps an increasing (in \(z\)) function \(V(x,z)\) to an increasing (in \(z\)) function \((Mh)(x,z)=\int_{\underline z}^{\bar z}V(x,z')q(z,z')dz'\). - (2) Monotonicity of \(g\) in \(z\): If \(z\)'s are i.i.d., then \(g(x,z)\) is increasing in \(z\) for all \(x\) if we assume \(F_y(x,z,y)\) is increasing in \(z\). Recall the Bellman equation $$V(x,z)=\max_{y\in\Gamma(x,z)}\left\{F(x,z,y)+\beta\int_{\underline z}^{\bar z}V(y,z')q(z,z')dz'\right\}$$ Consider the f.o.c. w.r.t. \(y\): $$-F_y(x,z,y)=\beta\int_{\underline z}^{\bar z}V_y(y,z')q(z,z')dz'$$ If \(z\)'s are i.i.d., then we can rewrite the f.o.c. as $$-F_y(x,z,y)=\beta\int_{\underline z}^{\bar z}V_y(y,z')\pi(z')dz'$$ where \(\pi(z')\) is the unconditional density. Note that the LHS decreases in \(z\) and the RHS doesn't change with \(z\), so the same logic as before follows and we can use the graph argument to show that \(g(x,z)\) is increasing in \(z\). But if \(z\)'s are not i.i.d., then we won't have this property unless we make further assumptions on the direction of change of \(\beta\int V_y(y,z')q(z,z')dz'\) in \(z\).

11.2.3 Alternative Bellman equation

Instead, we can formulate the Bellman equation as follows $$V(x,z)=\max_{y\in\Gamma(x,z)}\left\{F(x,y,z)+\beta\int_{\underline z}^{\bar z}V\big(\underbrace{\phi(x,y,z)}_{\equiv x'},z'\big)q(z,z')dz'\right\}$$ where \(y\) is not the next-period state variable itself, but can affect the value of next-period state variable \(x'\) through the function \(\phi(x,y,z)\).

11.2.4 The steady state

The steady state for stochastic programming is not the same as the deterministic case. Here the steady state is a stationary distribution for \(S(x,z)\) whose transition function involves \(g\) and \(Q\). For the system to converge to the steady state, we assume the \(\{z_t\}\) process has a unique ergodic set with a stationary distribution and the system converges to it.

11.2.5 Envelop condition and Euler Equation

Envelop Condition. For the Bellman equation $$V(x,z)=\max_{y\in\Gamma(x,z)}\left\{F(x,y,z)+\beta\int_{\underline z}^{\bar z}V(y,z')q(z,z')dz'\right\}$$ the envelop condition for each fixed \(z\) is $$V_x(x,z)=F_x(x,y,z)$$ Euler Equation. Combining the Envelop Condition $$V_y(y,z')=F_y(y,y',z')$$ and the f.o.c. that $$-F_y(x,y,z)=\beta\int_{\underline z}^{\bar z}V_y(y,z')q(z,z')dz'$$ we have the Euler Equation $$-F_y(x,y,z)=\beta\int_{\underline z}^{\bar z}F_y(y,y',z')q(z,z')dz'$$