39. Mean and Variance Estimation with High Frequency Data

Note

This chapter (the last of the book) uses the simple model \(dX_t=\mu\,dt+\sigma\,dZ_t\) to illustrate a core result about high-frequency data: sampling more often lets us estimate the variance \(\sigma^2\) precisely, but does not help with the drift \(\mu\). Drift: the classical estimator (39.1), \(\tilde\mu=(X_T-X_0)/T\), uses only the two endpoints, and its variance \(\operatorname{Var}(\tilde\mu)=\sigma^2/T\) (39.2) does not depend on the number of observations \(N\). Even the alternative estimator (39.3), which uses the intermediate points, has a variance that converges to a positive constant of order \(\sigma^2/T\) rather than to zero as \(N\to\infty\), because averaging intermediate observations can only cancel observational error orthogonal to \(dZ_t\), not the randomness of \(dZ_t\) itself. Variance: since \(\big(\frac{X_j-X_{j-1}-\mu T/N}{\sigma\sqrt{T/N}}\big)^2\sim\chi_1^2\) (39.4) (mean 1, variance 2), the realized-variance estimator (39.5), \(\hat\sigma^2=\frac1T\sum(X_j-X_{j-1}-\hat\mu T/N)^2\), is unbiased when the true \(\mu\) is used in place of \(\hat\mu\), with \(\operatorname{Var}(\hat\sigma^2)=2\sigma^4/N\to0\), so a larger \(N\) gives a more precise variance estimate. Intuition: drift information sits only in the endpoints (low frequency), while variance information sits in the quadratic variation of every step (high frequency).

39.1 Setup

Suppose \(X_t\) satisfies \(dX_t=\mu\,dt+\sigma\,dZ_t\), where \(\{Z_t\}\) is a standard Brownian motion. We want to estimate the mean (drift) \(\mu\) and the variance \(\sigma^2\). Suppose we observe \(X_t\) on \(t\in[0,T]\) at evenly spaced times, so that adjacent observations are \(T/N\) apart and we obtain \(N\) increments; denote the observation times by \(0=t_0<t_1<\dots<t_N=T\).

39.2 Mean Estimation

39.2.1 Classical Estimator

One classical way of estimating \(\mu\) (39.1):

$$\tilde\mu\,\frac TN=\frac1N\sum_{j=1}^N(X_j-X_{j-1})\ \Rightarrow\ \tilde\mu=\frac1T\sum_{j=1}^N(X_j-X_{j-1})=\frac{X_T-X_0}{T}\tag{39.1}$$

where \(X_j\) is the observed \(X_t\) at \(t=t_j\). Note that \(X_j-X_{j-1}\sim\mathcal N\big(\mu\frac TN,\sigma^2\frac TN\big)\). The variance of this estimator (39.2):

$$\operatorname{Var}(\tilde\mu)=\operatorname{Var}\Big(\frac{X_T-X_0}{T}\Big)=\frac1{T^2}\operatorname{Var}(\sigma Z_T)=\frac1{T^2}\sigma^2 T=\frac{\sigma^2}{T}\tag{39.2}$$

From (39.1) we can see that a higher observation frequency (larger \(N\)) does not help estimate \(\mu\): the sum telescopes, so \(N\) drops out and only \(X_T-X_0\) remains. The estimator discards the information collected in between and looks only at the two ends \(t=0\) and \(t=T\); additional intermediate observations play no role and cannot remove the randomness induced by \(dZ_t\).
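The claim that \(N\) does not matter for \(\tilde\mu\) can be checked with a small Monte Carlo experiment. The following is a minimal sketch, not part of the original notes: the function name `classical_drift_estimates` and the parameter values (\(\mu=0.05\), \(\sigma=0.2\), \(T=10\)) are illustrative assumptions. It draws the increments from their exact normal distribution and compares the sample variance of \(\tilde\mu\) with \(\sigma^2/T\) for several \(N\).

```python
import numpy as np


def classical_drift_estimates(mu, sigma, T, N, n_paths, seed=0):
    """Return mu_tilde = (X_T - X_0) / T for n_paths simulated paths.

    Hypothetical helper for illustration; the increments X_j - X_{j-1}
    are drawn as independent N(mu*T/N, sigma^2*T/N) variables.
    """
    rng = np.random.default_rng(seed)
    dt = T / N
    increments = rng.normal(mu * dt, sigma * np.sqrt(dt), size=(n_paths, N))
    # The sum of the N increments telescopes to X_T - X_0, as in (39.1).
    return increments.sum(axis=1) / T


if __name__ == "__main__":
    mu, sigma, T = 0.05, 0.2, 10.0  # assumed values, chosen only for the demo
    for N in (10, 100, 1000):
        est = classical_drift_estimates(mu, sigma, T, N, n_paths=5000, seed=N)
        # Columns: N, sample mean, sample variance, theoretical sigma^2 / T
        print(N, est.mean(), est.var(), sigma**2 / T)
```

Under these assumptions the sample variance should stay close to \(\sigma^2/T=0.004\) for every \(N\), up to Monte Carlo noise.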

39.2.2 An Alternative Estimator

Consider the following alternative estimator (without loss of generality let \(N\) be even) (39.3):

$$\hat\mu\Big(\frac12T\Big)=\frac1{N/2}\sum_{j=1}^{N/2}\big(X_{N/2+j}-X_j\big)\ \Rightarrow\ \hat\mu=\frac4{NT}\sum_{j=1}^{N/2}\big(X_{N/2+j}-X_j\big)\tag{39.3}$$

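(39.3) uses all the intermediate observation points, but this only cancels observational error (if any, e.g. from misreading the data), which is a separate source of randomness orthogonal to \(dZ_t\). Although \(N\) appears in (39.3), averaging does not cancel the randomness introduced by \(dZ_t\) itself. The estimator is unbiased: \(\mathbb E[\hat\mu]=\frac4{NT}\cdot\frac N2\cdot\frac T2\mu=\mu\). For the variance, write each difference as a sum of increments, \(X_{N/2+j}-X_j=\sum_{k=j+1}^{N/2+j}(X_k-X_{k-1})\). Summing over \(j\), the increments \(X_k-X_{k-1}\), \(k=2,\dots,N\), are counted with weights \(1,2,\dots,\frac N2-1,\frac N2,\frac N2-1,\dots,2,1\). Since the increments are independent with variance \(\sigma^2\frac TN\),

$$\operatorname{Var}(\hat\mu)=\frac{16}{N^2T^2}\Big(2\sum_{i=1}^{N/2-1}i^2+\frac{N^2}4\Big)\sigma^2\frac TN=\frac{16}{N^2T^2}\cdot\frac{N(N^2+2)}{12}\cdot\sigma^2\frac TN=\frac{4(N^2+2)}{3N^2T}\sigma^2,$$

which tends to \(\frac{4\sigma^2}{3T}\) as \(N\to\infty\). This limit is not zero, and it is in fact larger than the variance \(\sigma^2/T\) in (39.2). So neither estimator of \(\mu\) has a variance that can be driven to zero by sampling at a higher frequency.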
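The variance of (39.3) can also be computed directly from its definition, without relying on a closed form. The sketch below is an illustration under stated assumptions, not the author's code: the helper names `alt_estimator_weights`, `alt_estimator_variance` and `alt_estimator_mc` are hypothetical. It counts how often each increment \(X_k-X_{k-1}\) enters \(\sum_j(X_{N/2+j}-X_j)\), uses the independence of the increments to obtain the exact variance, and includes a Monte Carlo version as a cross-check.

```python
import numpy as np


def alt_estimator_weights(N):
    """Weight of each increment d_k = X_k - X_{k-1} (k = 1..N) in
    sum_{j=1}^{N/2} (X_{N/2+j} - X_j). Entry k-1 of the result is d_k's weight."""
    if N % 2 != 0:
        raise ValueError("N must be even")
    half = N // 2
    w = np.zeros(N)
    for j in range(1, half + 1):
        # X_{half+j} - X_j = d_{j+1} + ... + d_{half+j}  -> indices j .. half+j-1
        w[j:half + j] += 1.0
    return w


def alt_estimator_variance(sigma, T, N):
    """Exact Var(mu_hat) for (39.3): independent increments with variance sigma^2*T/N."""
    w = alt_estimator_weights(N)
    return (4.0 / (N * T)) ** 2 * np.sum(w ** 2) * sigma ** 2 * T / N


def alt_estimator_mc(mu, sigma, T, N, n_paths, seed=0):
    """Monte Carlo draws of mu_hat from (39.3), with X_0 = 0."""
    rng = np.random.default_rng(seed)
    dt = T / N
    inc = rng.normal(mu * dt, sigma * np.sqrt(dt), size=(n_paths, N))
    X = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(inc, axis=1)], axis=1)
    half = N // 2
    # Columns half+1..N hold X_{N/2+j}; columns 1..half hold X_j, j = 1..N/2.
    s = (X[:, half + 1:N + 1] - X[:, 1:half + 1]).sum(axis=1)
    return 4.0 / (N * T) * s


if __name__ == "__main__":
    sigma, T = 0.2, 10.0  # assumed values, chosen only for the demo
    for N in (2, 4, 6, 100, 1000):
        # Ratio of Var(mu_hat) to the classical variance sigma^2 / T
        print(N, alt_estimator_variance(sigma, T, N) / (sigma ** 2 / T))
```

The printed ratio can be compared with the closed form stated in the text; it does not shrink toward zero as \(N\) grows, which is the point of this subsection.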

39.3 Variance Estimation

From \(X_j-X_{j-1}\sim\mathcal N\big(\mu\frac TN,\sigma^2\frac TN\big)\) we get (39.4):

$$\left(\frac{X_j-X_{j-1}-\mu\frac TN}{\sigma\sqrt{\frac TN}}\right)^2\sim\chi_1^2\tag{39.4}$$

A \(\chi_1^2\) random variable has mean 1 and variance 2. This suggests the following estimator of \(\sigma^2\) (39.5):

$$\hat\sigma^2\frac TN=\frac1N\sum_{j=1}^N\Big(X_j-X_{j-1}-\hat\mu\frac TN\Big)^2\ \Rightarrow\ \hat\sigma^2=\frac1T\sum_{j=1}^N\Big(X_j-X_{j-1}-\hat\mu\frac TN\Big)^2\tag{39.5}$$

where \(\hat\mu\) is obtained from (39.3). To illustrate the basic idea, first suppose \(\mu\) is known and use it in place of \(\hat\mu\). By the independence of Brownian increments, the summands \(\big(X_j-X_{j-1}-\mu\frac TN\big)^2\) in (39.5) are i.i.d. So by (39.4),

$$\mathbb E[\hat\sigma^2]=\frac1T\sigma^2\frac TN\sum_{j=1}^N\mathbb E\!\left[\left(\frac{X_j-X_{j-1}-\mu\frac TN}{\sigma\sqrt{\frac TN}}\right)^2\right]=\frac1T\sigma^2\frac TN\cdot N=\sigma^2,$$

so the estimator is unbiased. Its variance is

$$\operatorname{Var}(\hat\sigma^2)=\frac1{T^2}\sum_{j=1}^N\sigma^4\frac{T^2}{N^2}\underbrace{\operatorname{Var}\!\left[\left(\frac{X_j-X_{j-1}-\mu\frac TN}{\sigma\sqrt{\frac TN}}\right)^2\right]}_{=2}=\frac{\sigma^4}{N^2}\cdot2N=\frac{2\sigma^4}{N},$$

which goes to zero as \(N\to\infty\). (In practice \(\mu\) still has to be estimated, but the basic idea and conclusion do not change as long as an unbiased \(\hat\mu\) is used.)
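The \(2\sigma^4/N\) rate can be illustrated numerically. This is a minimal sketch under assumptions that are not in the original notes: the function name `realized_variance_estimates`, the flag `plug_in_mu`, and the parameter values are hypothetical. By default it uses the true \(\mu\), as in the derivation above. With `plug_in_mu=True` it substitutes the endpoint estimator (39.1) path by path, which is a simplification chosen here; the text itself uses \(\hat\mu\) from (39.3).

```python
import numpy as np


def realized_variance_estimates(mu, sigma, T, N, n_paths, seed=0, plug_in_mu=False):
    """Return sigma_hat^2 from (39.5) for n_paths simulated paths."""
    rng = np.random.default_rng(seed)
    dt = T / N
    inc = rng.normal(mu * dt, sigma * np.sqrt(dt), size=(n_paths, N))
    if plug_in_mu:
        # Assumption for this sketch: estimate the drift with (39.1) on each path.
        drift = inc.sum(axis=1, keepdims=True) / T
    else:
        drift = mu  # known-mu case, matching the derivation in the text
    return ((inc - drift * dt) ** 2).sum(axis=1) / T


if __name__ == "__main__":
    mu, sigma, T = 0.05, 0.2, 1.0  # assumed values, chosen only for the demo
    for N in (10, 100, 1000):
        est = realized_variance_estimates(mu, sigma, T, N, n_paths=5000, seed=N)
        # Columns: N, sample mean (target sigma^2), sample variance, 2*sigma^4/N
        print(N, est.mean(), est.var(), 2 * sigma ** 4 / N)
```

In the known-\(\mu\) case the sample variance of \(\hat\sigma^2\) should track \(2\sigma^4/N\) and shrink roughly tenfold each time \(N\) grows tenfold, up to Monte Carlo noise.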

Conclusion: even with the most natural estimator of \(\sigma^2\), higher-frequency data lets us estimate the variance more precisely, in sharp contrast to the case of the drift \(\mu\).
