A second-order iterated smoothing algorithm

Nguyen, Dao; Ionides, Edward L.

doi:10.1007/s11222-016-9711-9

Published: 15 October 2016

Volume 27, pages 1677–1692, (2017)

Statistics and Computing


Abstract

Simulation-based inference for partially observed stochastic dynamic models is currently receiving much attention due to the fact that direct computation of the likelihood is not possible in many practical situations. Iterated filtering methodologies enable maximization of the likelihood function using simulation-based sequential Monte Carlo filters. Doucet et al. (2013) developed an approximation for the first and second derivatives of the log likelihood via simulation-based sequential Monte Carlo smoothing and proved that the approximation has some attractive theoretical properties. We investigated an iterated smoothing algorithm carrying out likelihood maximization using these derivative approximations. Further, we developed a new iterated smoothing algorithm, using a modification of these derivative estimates, for which we establish both theoretical results and effective practical performance. On benchmark computational challenges, this method beat the first-order iterated filtering algorithm. The method’s performance was comparable to a recently developed iterated filtering algorithm based on an iterated Bayes map. Our iterated smoothing algorithm and its theoretical justification provide new directions for future developments in simulation-based inference for latent variable models such as partially observed Markov process models.


References

Andrieu, C., Doucet, A., Holenstein, R.: Particle Markov chain Monte Carlo methods. J. R. Stat. Soc. Ser. B 72(3), 269–342 (2010)
Bhadra, A., Ionides, E.L., Laneri, K., Pascual, M., Bouma, M., Dhiman, R.C.: Malaria in Northwest India: data analysis via partially observed stochastic differential equation models driven by Lévy noise. J. Am. Stat. Assoc. 106, 440–451 (2011)
Bjørnstad, O.N., Grenfell, B.T.: Noisy clockwork: time series analysis of population fluctuations in animals. Science 293, 638–643 (2001)
Blackwood, J.C., Cummings, D.A.T., Broutin, H., Iamsirithaworn, S., Rohani, P.: Deciphering the impacts of vaccination and immunity on pertussis epidemiology in Thailand. Proc. Natl. Acad. Sci. USA 110, 9595–9600 (2013a)
Blackwood, J.C., Streicker, D.G., Altizer, S., Rohani, P.: Resolving the roles of immunity, pathogenesis, and immigration for rabies persistence in vampire bats. Proc. Natl. Acad. Sci. USA 110, 20837–20842 (2013b)
Blake, I.M., Martin, R., Goel, A., Khetsuriani, N., Everts, J., Wolff, C., Wassilak, S., Aylward, R.B., Grassly, N.C.: The role of older children and adults in wild poliovirus transmission. Proc. Natl. Acad. Sci. USA 111(29), 10604–10609 (2014)
Bretó, C., He, D., Ionides, E.L., King, A.A.: Time series analysis via mechanistic models. Ann. Appl. Stat. 3, 319–348 (2009)
Camacho, A., Ballesteros, S., Graham, A.L., Carrat, F., Ratmann, O., Cazelles, B.: Explaining rapid reinfections in multiple-wave influenza outbreaks: Tristan da Cunha 1971 epidemic as a case study. Proc. R. Soc. Lond. Ser. B 278(1725), 3635–3643 (2011)
Chopin, N., Jacob, P.E., Papaspiliopoulos, O.: SMC$^2$: an efficient algorithm for sequential analysis of state space models. J. R. Stat. Soc. Ser. B 75(3), 397–426 (2013)
Dahlin, J., Lindsten, F., Schön, T.B.: Particle Metropolis-Hastings using gradient and Hessian information. Stat. Comput. 25(1), 81–92 (2015)
Douc, R., Cappé, O., Moulines, E.: Comparison of resampling schemes for particle filtering. In: Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, 2005, pp 64–69. IEEE, New York (2005)
Doucet, A., Jacob, P.E., Rubenthaler, S.: Derivative-free estimation of the score vector and observed information matrix with application to state-space models (version 2). arXiv:1304.5768v2 (2013)
Earn, D.J., He, D., Loeb, M.B., Fonseca, K., Lee, B.E., Dushoff, J.: Effects of school closure on incidence of pandemic influenza in Alberta. Ann. Int. Med. 156(3), 173–181 (2012)
Gordon, N.J., Salmond, D.J., Smith, A.F.: Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc. F. Radar Signal Process. 140, 107–113 (1993)
He, D., Dushoff, J., Day, T., Ma, J., Earn, D.J.D.: Inferring the causes of the three waves of the 1918 influenza pandemic in England and Wales. Proc. R. Soc. Lond. Ser. B 280(1766), 20131345 (2013)
He, D., Ionides, E.L., King, A.A.: Plug-and-play inference for disease dynamics: measles in large and small populations as a case study. J. R. Soc. Interface 7(43), 271–283 (2010)
Ionides, E.L., Bhadra, A., Atchadé, Y., King, A.: Iterated filtering. Ann. Stat. 39, 1776–1802 (2011)
Ionides, E.L., Bretó, C., King, A.A.: Inference for nonlinear dynamical systems. Proc. Natl. Acad. Sci. USA 103, 18438–18443 (2006)
Ionides, E.L., Nguyen, D., Atchadé, Y., Stoev, S., King, A.A.: Inference for dynamic and latent variable models via iterated, perturbed Bayes maps. Proc. Natl. Acad. Sci. USA 112(3), 719–724 (2015)
Kevrekidis, I.G., Gear, C.W., Hummer, G.: Equation-free: the computer-assisted analysis of complex, multiscale systems. Am. Inst. Chem. Eng. J. 50, 1346–1354 (2004)
King, A.A., Domenech de Celle, M., Magpantay, F.M.G., Rohani, P.: Avoidable errors in the modelling of outbreaks of emerging pathogens, with special reference to Ebola. Proc. R. Soc. Lond. Ser. B 282, 20150347 (2015)
King, A.A., Ionides, E.L., Pascual, M., Bouma, M.J.: Inapparent infections and cholera dynamics. Nature 454, 877–880 (2008)
King, A.A., Nguyen, D., Ionides, E.L.: Statistical inference for partially observed Markov processes via the R package pomp. J. Stat. Softw 69, 1–43 (2016)
Kloeden, P.E., Platen, E.: Numerical Solution of Stochastic Differential Equations, 3rd edn. Springer, New York (1999)
Kushner, H.J., Clark, D.S.: Stochastic Approximation Methods for Constrained and Unconstrained Systems. Springer, New York (1978)
Laneri, K., Bhadra, A., Ionides, E.L., Bouma, M., Dhiman, R.C., Yadav, R.S., Pascual, M.: Forcing versus feedback: epidemic malaria and monsoon rains in Northwest India. PLoS Comput. Biol. 6(9), e1000898 (2010)
Laneri, K., Paul, R.E., Tall, A., Faye, J., Diene-Sarr, F., Sokhna, C., Trape, J.-F., Rodó, X.: Dynamical malaria models reveal how immunity buffers effect of climate variability. Proc. Natl. Acad. Sci. USA 112(28), 8786–8791 (2015)
Lavine, J.S., King, A.A., Andreasen, V., Bjørnstad, O.N.: Immune boosting explains regime-shifts in prevaccine-era pertussis dynamics. PLoS ONE 8(8), e72086 (2013)
Lavine, J.S., Rohani, P.: Resolving pertussis immunity and vaccine effectiveness using incidence time series. Expert Rev. Vaccines 11, 1319–1329 (2012)
Macdonald, G.: The Epidemiology and Control of Malaria. Oxford University Press, Oxford (1957)
Martinez-Bakker, M., King, A.A., Rohani, P.: Unraveling the transmission ecology of polio. PLoS Biol. 13(6), e1002172 (2015)
Nemeth, C., Fearnhead, P., Mihaylova, L.: Particle approximations of the score and observed information matrix for parameter estimation in state space models with linear computational cost. arXiv:1306.0735 (2013)
Nguyen, D.: Iterated smoothing R package, is2. https://r-forge.r-project.org/projects/is2 (2015)
Olsson, J., Cappé, O., Douc, R., Moulines, E.: Sequential Monte Carlo smoothing with application to parameter estimation in nonlinear state space models. Bernoulli 14(1), 155–179 (2008)
Poyiadjis, G., Doucet, A., Singh, S.S.: Particle approximations of the score and observed information matrix in state space models with application to parameter estimation. Biometrika 98(1), 65–80 (2011)
Romero-Severson, E., Volz, E., Koopman, J., Leitner, T., Ionides, E.: Dynamic variation in sexual contact rates in a cohort of HIV-negative gay men. Am. J. Epidemiol. 182, 255–262 (2015)
Ross, R.: The Prevention of Malaria. Dutton, Boston (1910)
Roy, M., Bouma, M.J., Ionides, E.L., Dhiman, R.C., Pascual, M.: The potential elimination of plasmodium vivax malaria by relapse treatment: Insights from a transmission model and surveillance data from NW India. PLoS Negl. Trop. Dis. 7, e1979 (2013)
Shrestha, S., Foxman, B., Weinberger, D.M., Steiner, C., Viboud, C., Rohani, P.: Identifying the interaction between influenza and pneumococcal pneumonia using incidence data. Sci. Transl. Med. 5(191), 191ra84 (2013)
Shrestha, S., King, A.A., Rohani, P.: Statistical inference for multi-pathogen systems. PLoS Comput. Biol. 7(8), e1002135 (2011)
Sisson, S.A., Fan, Y., Tanaka, M.M.: Sequential Monte Carlo without likelihoods. Proc. Natl. Acad. Sci. USA 104(6), 1760–1765 (2007)
Spall, J.C.: Introduction to Stochastic Search and Optimization. Wiley, Hoboken (2003)
Toni, T., Welch, D., Strelkowa, N., Ipsen, A., Stumpf, M.P.: Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. J. R. Soc. Interface 6, 187–202 (2009)
Wood, S.N.: Statistical inference for noisy nonlinear ecological dynamic systems. Nature 466(7310), 1102–1104 (2010)
Yıldırım, S., Singh, S.S., Dean, T., Jasra, A.: Parameter estimation in hidden Markov models with intractable likelihoods using sequential Monte Carlo. J. Comput. Graph. Stat. 24, 846–865 (2015)


Acknowledgments

This research was funded in part by National Science Foundation Grant DMS-1308919 and National Institutes of Health Grants 1-U54-GM111274 and 1-U01-GM110712.

Author information

Authors and Affiliations

Department of Statistics, University of Michigan, Ann Arbor, MI, USA
Dao Nguyen & Edward L. Ionides


Corresponding author

Correspondence to Edward L. Ionides.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 603 KB)

Appendix: Proofs

1.1 Proof of Theorem 3

Let

$$\begin{aligned} R=\left[ \begin{array}{cccc} \tau _{0}I_{d\times d} &{}\quad 0_{d\times d} &{}\quad \cdots &{}\quad 0_{d\times d}\\ \tau _{0}I_{d\times d} &{}\quad \tau _{1}I_{d\times d} &{}\quad \ddots &{}\quad \vdots \\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ \tau _{0}I_{d\times d} &{}\quad \tau _{1}I_{d\times d} &{}\quad \cdots &{}\quad \tau _{N}I_{d\times d} \end{array}\right] , \end{aligned}$$

(8)

where $I_{d\times d}$ is the identity matrix of dimension $d$ and $0_{d\times d}$ is the zero matrix of dimension $d$. The random walk noise is then $R\tau Z_{0:N}$. From Assumption 5, $\breve{\ell }$ is four times continuously differentiable in $\theta ^{[N+1]}$. Since $N$ is fixed, we can apply Theorem 1, with $\Sigma =\mathrm {Cov}(RZ_{0:N})=\breve{\Psi }_N$, to obtain the existence of an $\eta $ and a $C_8$ such that for every $\tau < \eta $ we have

$$\begin{aligned}&\left| \breve{\mathbb {E}}\left( \breve{\Theta }_{0:N}-\theta ^{\left[ N+1\right] }\left| \breve{Y}_{1:N}=y^*_{1:N}\right. \right) -\tau ^{2}\breve{\Psi }_N\nabla {\breve{\ell }}\left( \theta ^{\left[ N+1\right] }\right) \right| \nonumber \\&\quad <C_8\tau ^{4}, \end{aligned}$$

(9)

where

$$\begin{aligned} \breve{\Psi }_N=\left[ \begin{array}{cccc} \tau _{0}^2\Psi &{}\quad \tau _{0}^2\Psi &{}\quad \cdots &{}\quad \tau _{0}^2\Psi \\ \tau _{0}^2\Psi &{}\quad (\tau _{0}^2+\tau _{1}^2)\Psi &{}\quad \cdots &{}\quad (\tau _{0}^2+\tau _{1}^2)\Psi \\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ \tau _{0}^2\Psi &{}\quad (\tau _{0}^2+\tau _{1}^2)\Psi &{}\quad \cdots &{}\quad \left( \sum _{i=0}^N\tau _{i}^2\right) \Psi \end{array}\right] . \end{aligned}$$

(10)
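The block structure of Eq. (10) can be checked numerically. The following is a minimal sketch, not part of the original proof: the values of `taus` and `Psi` are hypothetical, and it assumes that the increments $Z_0,\dots ,Z_N$ are independent with common covariance $\Psi$, so that $\mathrm{Cov}(Z_{0:N})=I\otimes \Psi$. Under that assumption, block $(s,n)$ of $\mathrm{Cov}(RZ_{0:N})$ should equal $\sum_{k=0}^{s\wedge n}\tau_k^2\Psi$.

```python
import numpy as np

# Hypothetical values chosen only for illustration.
taus = np.array([0.9, 0.5, 0.3, 0.2])        # tau_0, ..., tau_N
Psi = np.array([[2.0, 0.3], [0.3, 1.0]])     # d x d covariance of each Z_n
N1, d = len(taus), Psi.shape[0]

# R from Eq. (8): block (n, k) is tau_k * I_d for k <= n, zero otherwise.
L = np.tril(np.ones((N1, N1))) * taus[None, :]
R = np.kron(L, np.eye(d))

# Cov(R Z) under the assumption Cov(Z_{0:N}) = I (x) Psi.
cov_RZ = R @ np.kron(np.eye(N1), Psi) @ R.T

# Eq. (10): block (s, n) is (sum_{k <= min(s, n)} tau_k^2) * Psi.
cum = np.cumsum(taus**2)
idx = np.minimum.outer(np.arange(N1), np.arange(N1))
Psi_breve = np.kron(cum[idx], Psi)

matches = np.allclose(cov_RZ, Psi_breve)
print("Cov(RZ) has the block form of Eq. (10):", matches)
```

The Kronecker product is used here only as a compact way to assemble the $d\times d$ blocks.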

Note that Assumptions 1 and 2 are automatically satisfied for the multivariate normal distribution with mean zero and variance $\breve{\Psi }_N$, corresponding to the random variable $RZ_{0:N}$. As a result, for fixed $\tau _{0},\dots ,\tau _{N}$ and for a random walk noise, we have

$$\begin{aligned}&\left| \nabla {\breve{\ell }}\left( \theta ^{\left[ N+1\right] }\right) -\tau ^{-2}\breve{\Psi }_N^{-1}\breve{\mathbb {E}}\left( \breve{\Theta }_{0:N}-\theta ^{\left[ N+1\right] }\left| \breve{Y}_{1:N}=y^*_{1:N}\right. \right) \right| \\&\quad <C_9\tau ^{2}. \end{aligned}$$

An application of the Gauss-Jordan inversion method gives

$$\begin{aligned}&\breve{\Psi }_N^{-1}=\\&\quad \left[ \begin{array}{cccc} (\tau _{0}^{-2}+\tau _{1}^{-2})\Psi ^{-1} &{}\quad -\tau _{1}^{-2}\Psi ^{-1} &{}\quad \cdots &{}\quad 0\\ -\tau _{1}^{-2}\Psi ^{-1} &{}\quad (\tau _{1}^{-2}+\tau _{2}^{-2})\Psi ^{-1} &{}\quad \cdots &{}\quad \vdots \\ 0 &{}\quad -\tau _{2}^{-2}\Psi ^{-1}&{}\quad \cdots &{}\quad \vdots \\ \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots \\ 0 &{}\quad 0 &{}\quad (\tau _{N-1}^{-2}+\tau _{N}^{-2})\Psi ^{-1} &{}\quad -\tau _{N}^{-2}\Psi ^{-1}\\ 0 &{}\quad 0 &{}\quad -\tau _{N}^{-2}\Psi ^{-1} &{}\quad \tau _{N}^{-2}\Psi ^{-1} \end{array}\right] . \end{aligned}$$
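As a sanity check on this block-tridiagonal form, the sketch below multiplies $\breve{\Psi }_N$ by the claimed inverse and compares the product with the identity. The numerical values of `taus` and `Psi` are hypothetical, and the helper names are ours, not the paper's.

```python
import numpy as np

# Hypothetical values chosen only for illustration.
taus = np.array([0.9, 0.5, 0.3, 0.2])        # tau_0, ..., tau_N
Psi = np.array([[2.0, 0.3], [0.3, 1.0]])     # d x d covariance
N1, d = len(taus), Psi.shape[0]

# Psi_breve from Eq. (10): block (s, n) is (sum_{k <= min(s, n)} tau_k^2) * Psi.
cum = np.cumsum(taus**2)
idx = np.minimum.outer(np.arange(N1), np.arange(N1))
Psi_breve = np.kron(cum[idx], Psi)

# Scalar tridiagonal pattern of the claimed inverse:
# diagonal tau_n^{-2} + tau_{n+1}^{-2} (last entry tau_N^{-2}),
# off-diagonal -tau_{n+1}^{-2}.
a = taus**-2.0
T = np.zeros((N1, N1))
for n in range(N1):
    nxt = a[n + 1] if n + 1 < N1 else 0.0
    T[n, n] = a[n] + nxt
    if n + 1 < N1:
        T[n, n + 1] = T[n + 1, n] = -a[n + 1]
claimed_inverse = np.kron(T, np.linalg.inv(Psi))

is_inverse = np.allclose(Psi_breve @ claimed_inverse, np.eye(N1 * d))
print("claimed inverse matches:", is_inverse)
```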

We write $\nabla _{n} \breve{\ell }(\theta ^{[N+1]})$ for the d-dimensional vector of partial derivatives of $\breve{\ell }(\theta ^{[N+1]})$ with respect to each of the d components of $\theta _{n}$. An application of the chain rule gives the identity

$$\begin{aligned} \nabla {{\ell }}\left( \theta \right) =\sum _{n=0}^{N}\nabla _{n} \breve{\ell }\left( \theta ^{[N+1]}\right) , \end{aligned}$$

giving rise to an inequality,

$$\begin{aligned}&\left| \nabla {{\ell }}\left( \theta \right) -\tau ^{-2}\sum _{n=0}^{N}\left\{ \breve{\Psi }_N^{-1}\breve{\mathbb {E}}\left( \breve{\Theta }_{0:N}-\theta ^{[N+1]}|\breve{Y}_{1:N}=y^*_{1:N}\right) \right\} _{n}\right| \\&\quad <C_{10}\tau ^{2}, \end{aligned}$$

where $\left\{ s\right\} _{n}$ denotes the entries $\left\{ dn+1,\ldots ,d(n+1)\right\} $ of a vector $s\in \mathbb {R}^{d(N+1)}$. Decomposing the matrix multiplication by $\breve{\Psi }_N^{-1}$ into $d\times d$ blocks, we have

$$\begin{aligned}&\tau ^{-2}\sum _{n=0}^{N}\left\{ \breve{\Psi }_N^{-1}\breve{\mathbb {E}}\left( \breve{\Theta }_{0:N}-\theta ^{[N+1]}|\breve{Y}_{1:N}=y^*_{1:N}\right) \right\} _{n} \nonumber \\&\quad =\tau ^{-2}\sum _{n=0}^{N}\mathrm {SumCol}_{n}\left( \breve{\Psi }_N^{-1}\right) \breve{\mathbb {E}}\left( \breve{\Theta }_{n}-\theta |\breve{Y}_{1:N}=y^*_{1:N}\right) ,\nonumber \\ \end{aligned}$$

(11)

where $\mathrm {SumCol}_{n}$ is the sum of the $n$th column in the $d\times d$ block construction of $\breve{\Psi }_N^{-1}$. Every column of $\breve{\Psi }_N^{-1}$ except the first sums to 0, and this special structure of $\breve{\Psi }_N^{-1}$ gives a simple form,

$$\begin{aligned}&\left| \sum _{n=0}^{N}\nabla _{n}{\breve{\ell }}\left( \theta ^{[N+1]}\right) -\tau ^{-2}\Psi ^{-1}\tau _{0}^{-2}\breve{\mathbb {E}}\left( \breve{\Theta }_{0}-\theta |\breve{Y}_{1:N}=y^*_{1:N}\right) \right| \\&\quad <C_{11}\tau ^{2}. \end{aligned}$$

This can be written as

$$\begin{aligned} \left| \nabla {\ell }\left( \theta \right) -\tau ^{-2}\Psi ^{-1}\tau _{0}^{-2}\breve{\mathbb {E}}\left( \breve{\Theta }_{0}-\theta |\breve{Y}_{1:N}=y^*_{1:N}\right) \right| <C_{11}\tau ^{2}. \end{aligned}$$
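The column-sum argument used in Eq. (11) can be illustrated numerically. In the sketch below, `v` is an arbitrary random vector standing in for the conditional expectation $\breve{\mathbb {E}}(\breve{\Theta }_{0:N}-\theta ^{[N+1]}\,|\,\breve{Y}_{1:N}=y^*_{1:N})$, and `taus` and `Psi` are hypothetical values. Summing the $d$-dimensional blocks of $\breve{\Psi }_N^{-1}v$ should leave only the contribution $\tau_0^{-2}\Psi^{-1}v_0$ of the first block.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical values chosen only for illustration.
taus = np.array([0.9, 0.5, 0.3, 0.2])        # tau_0, ..., tau_N
Psi = np.array([[2.0, 0.3], [0.3, 1.0]])     # d x d covariance
N1, d = len(taus), Psi.shape[0]

# Psi_breve from Eq. (10).
cum = np.cumsum(taus**2)
idx = np.minimum.outer(np.arange(N1), np.arange(N1))
Psi_breve = np.kron(cum[idx], Psi)

# Arbitrary stand-in for the stacked conditional expectation.
v = rng.normal(size=N1 * d)

# Sum over n of the n-th d-block of Psi_breve^{-1} v.
w = np.linalg.solve(Psi_breve, v)
block_sum = w.reshape(N1, d).sum(axis=0)

# Only the first block survives: tau_0^{-2} Psi^{-1} v_0.
first_block_term = np.linalg.solve(Psi, v[:d]) / taus[0] ** 2

collapses = np.allclose(block_sum, first_block_term)
print("block sum collapses to the first-block term:", collapses)
```

The check solves the linear system directly rather than using the tridiagonal formula, so it does not presuppose the form of the inverse.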

1.2 Proof of Theorem 4

Using a similar setup to that above, let the random walk noise be $R\tau Z_{0:N}$ with $R$ defined as in Eq. (8). By selecting $p_{\Theta _{0:N}}$ to follow a multivariate normal distribution, Assumption 4 is also satisfied. From Theorem 2, for fixed $\tau _{0}, \dots , \tau _{N}$, there exist $\eta $ and $C_{12}$ such that for $0<\tau <\eta $,

$$\begin{aligned}&\left| \nabla ^2{\breve{\ell }}\left( \theta ^{[N+1]}\right) -\tau ^{-4}\right. \nonumber \\&\quad \left. \left[ \breve{\Psi }_N^{-1}\left( {\breve{\mathrm{C}}\mathrm {ov}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_{0:N}|\breve{Y}_{1:N}=y^*_{1:N}\right) -\tau ^{2}\breve{\Psi }_N\right) \breve{\Psi }_N^{-1}\right] \right| \nonumber \\&\qquad <C_{12}\tau ^{2}. \end{aligned}$$

(12)
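To see the shape of this covariance-based Hessian approximation in the simplest setting, consider a scalar toy model that is our own assumption and not taken from the paper: a single parameter with perturbation $\breve{\Theta }\sim N(\theta ,\tau ^2\Sigma )$ and observation $\breve{Y}\,|\,\breve{\Theta }\sim N(\breve{\Theta },\sigma ^2)$. The posterior variance is then available in closed form, and $\nabla ^2\ell (\theta )=-1/\sigma ^2$. The sketch evaluates $\tau ^{-4}\Sigma ^{-1}(\mathrm{Var}-\tau ^2\Sigma )\Sigma ^{-1}$ and reports how the error scales with $\tau ^2$.

```python
def hessian_estimate(tau, Sigma, sigma2):
    """tau^{-4} Sigma^{-1} (posterior variance - tau^2 Sigma) Sigma^{-1}, scalar case.

    Assumes the conjugate toy model Theta ~ N(theta, tau^2 Sigma),
    Y | Theta ~ N(Theta, sigma2); this model is a hypothetical example.
    """
    s = tau**2 * Sigma
    post_var = s * sigma2 / (s + sigma2)   # exact Gaussian posterior variance
    return (post_var - s) / (tau**4 * Sigma**2)

Sigma, sigma2 = 1.5, 2.0
true_hessian = -1.0 / sigma2               # second derivative of log N(y; theta, sigma2)

errors, ratios = [], []
for tau in [0.4, 0.2, 0.1, 0.05]:
    err = abs(hessian_estimate(tau, Sigma, sigma2) - true_hessian)
    errors.append(err)
    ratios.append(err / tau**2)
    print(f"tau={tau:<5} error={err:.3e} error/tau^2={ratios[-1]:.4f}")

# In this toy model the ratio error/tau^2 tends to Sigma / sigma2**2 as tau -> 0.
```

The bounded ratio is consistent with an $O(\tau ^2)$ error of the kind stated in Eq. (12), although the toy model has no latent dynamics and only one perturbed parameter.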

Define $\nabla ^2_{s,n}{\breve{\ell }}\left( \theta ^{[N+1]}\right) $ as

$$\begin{aligned} \nabla ^{2}_{s,n}\breve{\ell }\left( \theta ^{[N+1]}\right) =\frac{{{\partial }}^{2}\breve{\ell }\left( \theta ^{[N+1]}\right) }{{{\partial \theta }}_{s}{{\partial \theta }}_{n}}. \end{aligned}$$

Applying the chain rule, we have

$$\begin{aligned} \nabla ^2{\ell }\left( \theta \right) =\sum _{s=0}^{N}\sum _{n=0}^{N}\nabla _{s,n}^2{\breve{\ell }}\left( \theta ^{[N+1]}\right) . \end{aligned}$$

Summing the terms in Eq. (12) over all blocks, we get

$$\begin{aligned}&\left| \sum _{s=0}^{N}\sum _{n=0}^{N}\nabla _{s,n}^{2}{\breve{\ell }}(\theta ^{[N+1]})-\tau ^{-4}\right. \\&\left. \sum _{s=0}^{N}\sum _{n=0}^{N}\left[ \breve{\Psi }_N^{-1}\left( {\breve{\mathrm{C}}\mathrm {ov}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_{0:N}|\breve{Y}_{1:N}=y^*_{1:N}\right) -\tau ^{2}\breve{\Psi }_N\right) \breve{\Psi }_N^{-1}\right] _{s,n}\right| \\&\quad <C_{14}\tau ^{2}, \end{aligned}$$

where $\left[ A \right] _{s,n}$ denotes the entries in rows $\left\{ ds+1,\ldots ,d(s+1)\right\} $ and columns $\left\{ dn+1,\ldots ,d(n+1)\right\} $ of a matrix $A\in \mathbb {R}^{d(N+1)\times d(N+1)}$. Therefore,

$$\begin{aligned}&\Bigg |\nabla ^2{\ell }\left( \theta \right) -\tau ^{-4}\Bigg .\\&\left. \sum _{s=0}^{N}\sum _{n=0}^{N}\left[ \breve{\Psi }_N^{-1}\left( {\breve{\mathrm{C}}\mathrm {ov}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_{0:N}|\breve{Y}_{1:N}=y^*_{1:N}\right) -\tau ^{2}\breve{\Psi }_N\right) \breve{\Psi }_N^{-1}\right] _{s,n}\right| \\&\quad <C_{14}\tau ^{2}. \end{aligned}$$

Defining $\mathrm {SumCol}_n$ as in Eq. (11), we have

$$\begin{aligned}&\sum _{s=0}^{N}\sum _{n=0}^{N}\left[ \breve{\Psi }_N^{-1}\left( {\breve{\mathrm{C}}\mathrm {ov}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_{0:N}|\breve{Y}_{1:N}=y^*_{1:N}\right) -\tau ^{2}\breve{\Psi }_N\right) \breve{\Psi }_N^{-1}\right] _{s,n}\\&\quad =\sum _{s=0}^{N}\sum _{n=0}^{N}\mathrm{SumCol}_{\mathrm {s}}(\breve{\Psi }_N^{-1})\mathrm {SumCol}_{n}(\breve{\Psi }_N^{-1})\\&\qquad \times \left( {\breve{\mathrm{C}}\mathrm {ov}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_{s},\breve{\Theta }_{n}|\breve{Y}_{1:N}=y^*_{1:N}\right) -\sum _{k=0}^{s\wedge n}\tau _k^{2}\tau ^2\Psi \right) \Psi ^{-1}\\&\quad =\left( {\breve{\mathrm{V}}\mathrm {ar}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_0|\breve{Y}_{1:N}=y^*_{1:N}\right) -\tau _0^{2}\tau ^2\Psi \right) . \end{aligned}$$

The last equality follows since $\breve{\Psi }_N^{-1}$ is a symmetric matrix of $d\times d$ blocks in which each block column except the first sums to 0. Thus, we obtain

$$\begin{aligned}&\left| \nabla ^2{\ell }\left( \theta \right) -\tau ^{-4}\Psi ^{-1}\left( {\breve{\mathrm{V}}\mathrm {ar}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_{0}|\breve{Y}_{1:N}=y^*_{1:N}\right) -\tau _0^{2}\tau ^2\Psi \right) \Psi ^{-1}\right| \\&\quad <C_{15}\tau ^{2}. \end{aligned}$$

1.3 Proof of Theorem 5

In order to prove Theorem 5, we use the following corollary to Theorem 1.

Corollary 1

Suppose the perturbation kernel takes a value $\kappa \in \mathcal {K}$ satisfying Assumptions 6 and 7. Suppose also that Assumption 3 holds. There exist an $\eta $ and a constant $C_{16}$ such that for every $0<\tau <\eta $ and every $\kappa \in \mathcal {K}$,

$$\begin{aligned} \left| \breve{\mathbb {E}}\left( \breve{\Theta }-\theta \left| \breve{Y}=y^*\right. \right) -\tau ^{2}\Sigma \nabla \ell \left( \theta \right) \right| <C_{16}\tau ^{4}. \end{aligned}$$

(13)
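The $O(\tau ^4)$ rate in Eq. (13) can be seen explicitly in a scalar conjugate Gaussian example. The model below is an assumption made for illustration and is not one of the paper's examples: $\breve{\Theta }\sim N(\theta ,\tau ^2\Sigma )$ and $\breve{Y}\,|\,\breve{\Theta }\sim N(\breve{\Theta },\sigma ^2)$, for which $\nabla \ell (\theta )=(y^*-\theta )/\sigma ^2$ and the posterior mean is known exactly.

```python
def posterior_mean_shift(tau, theta, y, Sigma, sigma2):
    """Exact E[Theta - theta | Y = y] in the hypothetical conjugate toy model
    Theta ~ N(theta, tau^2 Sigma), Y | Theta ~ N(Theta, sigma2)."""
    s = tau**2 * Sigma
    return s * (y - theta) / (sigma2 + s)

theta, y, Sigma, sigma2 = 0.0, 1.0, 1.5, 2.0
score = (y - theta) / sigma2   # gradient in theta of log N(y; theta, sigma2)

ratios = []
for tau in [0.4, 0.2, 0.1, 0.05]:
    gap = abs(posterior_mean_shift(tau, theta, y, Sigma, sigma2)
              - tau**2 * Sigma * score)
    ratios.append(gap / tau**4)
    print(f"tau={tau:<5} gap={gap:.3e} gap/tau^4={ratios[-1]:.4f}")

# In this toy model gap/tau^4 tends to Sigma**2 * |y - theta| / sigma2**2 as tau -> 0.
```

A bounded ratio `gap / tau**4` plays the role of the constant $C_{16}$ for this one model; the corollary itself concerns uniformity of that constant over the kernel set $\mathcal {K}$, which a single example cannot demonstrate.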

Corollary 1 follows directly from applying Theorem 1, noting that Assumptions 6 and 7 imply a uniform bound on $C_2$ in Theorem 1. Applying Corollary 1, we obtain the existence of an $\eta $ and $C_{18}$ such that for every $\tau < \eta $ we have

$$\begin{aligned}&\left| \tau ^{2}\breve{\Psi }_N\nabla {\breve{\ell }}\left( \theta ^{\left[ N+1\right] }\right) -\breve{\mathbb {E}}\left( \breve{\Theta }_{0:N}-\theta ^{\left[ N+1\right] }\left| \breve{Y}_{1:N}=y^*_{1:N}\right. \right) \right| \nonumber \\&\quad <C_{18}\tau ^{4}. \end{aligned}$$

(14)

For compactness of notation, we write $E_n=\breve{\mathbb {E}}\left( \breve{\Theta }_{n}-\theta \left| \breve{Y}_{1:N}=y^*_{1:N}\right. \right) $ and $D_n=\Psi \nabla _n{\breve{\ell }}\left( \theta ^{\left[ N+1\right] }\right) $. With $\breve{\Psi }_N$ as in Eq. (10), writing out terms of the vector equation in (14) gives

$$\begin{aligned}&E_0 = \tau ^2 \tau _0^2 \sum _{n=0}^N D_n +O(\tau ^4), \end{aligned}$$

(15)

$$\begin{aligned}&E_1 = \tau ^2 \tau _0^2 \sum _{n=0}^N D_n + \tau ^2 \tau _1^2 \sum _{n=1}^N D_n+O(\tau ^4), \end{aligned}$$

(16)

$$\begin{aligned}&\vdots \end{aligned}$$

(17)

$$\begin{aligned}&E_{N-1} = \tau ^2 \tau _0^2 \sum _{n=0}^N D_n +\cdots + \tau ^2 \tau _{N-1}^2 \sum _{n=N-1}^N D_n +O(\tau ^4), \nonumber \\ \end{aligned}$$

(18)

$$\begin{aligned}&E_N = \tau ^2 \tau _0^2 \sum _{n=0}^N D_n +\cdots + \tau ^2 \tau _{N}^2 D_N +O(\tau ^4). \end{aligned}$$

(19)
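The leading terms of Eqs. (15) through (19) are the rows of $\tau ^2\breve{\Psi }_N\nabla \breve{\ell }$ written out block by block. The sketch below checks that identity with randomly drawn stand-ins for the block gradients $\nabla _n\breve{\ell }$; the arrays `grads`, `taus` and `Psi` are hypothetical values, and the common factor $\tau ^2$ is omitted on both sides.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical values chosen only for illustration.
taus = np.array([1.0, 0.04, 0.04, 0.04])     # tau_0, ..., tau_N
Psi = np.array([[2.0, 0.3], [0.3, 1.0]])     # d x d symmetric covariance
N1, d = len(taus), Psi.shape[0]

# Psi_breve from Eq. (10).
cum = np.cumsum(taus**2)
idx = np.minimum.outer(np.arange(N1), np.arange(N1))
Psi_breve = np.kron(cum[idx], Psi)

# Stand-ins for the block gradients; row n plays the role of nabla_n.
grads = rng.normal(size=(N1, d))
D = grads @ Psi.T                            # row n is D_n = Psi @ grads[n]

# Left side: the n-th d-block of Psi_breve @ (stacked gradient).
lhs = (Psi_breve @ grads.reshape(-1)).reshape(N1, d)

# Right side: sum_{k <= n} tau_k^2 * sum_{m >= k} D_m, as in Eqs. (15)-(19).
tail = D[::-1].cumsum(axis=0)[::-1]          # tail[k] = sum_{m >= k} D_m
rhs = np.cumsum((taus**2)[:, None] * tail, axis=0)

rows_match = np.allclose(lhs, rhs)
print("rows of Psi_breve @ grad match Eqs. (15)-(19):", rows_match)
```

With `taus[1:]` small relative to `taus[0]`, the rows of `rhs` differ from the first row only through the terms weighted by $\tau _1^2,\dots ,\tau _N^2$.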

Using our assumption that for all $n=1,\ldots ,N$, $\tau _n=O(\tau ^2)$, we get that $E_n=E_0+O(\tau ^{4})$, from which we can conclude that

$$\begin{aligned} \frac{1}{N+1}\sum _{n=0}^NE_n=E_0+O(\tau ^{4}). \end{aligned}$$

From Eq. (14) and the chain rule, for fixed $\tau _0$ we have

$$\begin{aligned} \left| \nabla {\ell }\left( \theta \right) -\tau ^{-2}\Psi ^{-1}\tau _{0}^{-2}\breve{\mathbb {E}}\left( \breve{\Theta }_{0}-\theta |\breve{Y}_{1:N}=y^*_{1:N}\right) \right| <C_{19}\tau ^{2}, \end{aligned}$$

which then completes the proof.

1.4 Proof of Theorem 6

As for Theorem 5, we need a corollary to Theorem 2 over the kernel set $\mathcal {K}$, making use of Assumptions 6 and 7.

Corollary 2

Suppose that Assumptions 3, 6, and 7 hold. Suppose also that every kernel in $\mathcal {K}$ satisfies the mesokurtic condition in Assumption 3. There exist an $\eta $ and a constant $C_{20}$ such that for every $0<\tau <\eta $ and every $\kappa \in \mathcal {K}$,

$$\begin{aligned}&\left| \breve{\mathbb {E}}\left[ \left( \breve{\Theta }-\theta \right) \left( \breve{\Theta }-\theta \right) ^{\top }\left| \breve{Y}=y^*\right. \right] -\tau ^{2}\Sigma -\tau ^{4}\Sigma \left( \nabla ^{2}\ell (\theta )\right) \Sigma \right| \\&\quad < C_{20}\tau ^{6}. \end{aligned}$$

Applying Corollary 2, we have

$$\begin{aligned}&\left| \tau ^{4}\breve{\Psi }_N\nabla ^2{\breve{\ell }}\left( \theta ^{[N+1]}\right) \breve{\Psi }_N\right. \\&\quad \left. -\left( {\breve{\mathrm{C}}\mathrm {ov}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_{0:N}|\breve{Y}_{1:N}{=}y^*_{1:N}\right) {-}\tau ^{2}\breve{\Psi }_N\right) \right| <C_{21}\tau ^{6}. \end{aligned}$$

For compact notation, we write

$$\begin{aligned} \breve{\mathrm{C}}\mathrm {ov}_{s,n}=\breve{\mathrm{C}}{\mathrm {ov}}\left( \breve{\Theta }_{s},\breve{\Theta }_{n}|\breve{Y}_{1:N}=y^*_{1:N}\right) -\sum _{k=0}^{s\wedge {n}}\tau _k^{2}\tau ^2\Psi \end{aligned}$$

and

$$\begin{aligned} \nabla ^2_{s,n}\breve{\ell }= \nabla ^2_{s,n}\breve{\ell }\left( \theta ^{[N+1]}\right) . \end{aligned}$$

From the diagonal terms of the above matrix norm inequality, we derive $N+1$ equations,

$$\begin{aligned}&\breve{\mathrm{C}}{\mathrm {ov}}_{0,0}=\tau ^4\left[ \breve{\Psi }_N\nabla ^2\breve{\ell }\breve{\Psi }_N\right] _{0,0}+O(\tau ^6), \end{aligned}$$

(20)

$$\begin{aligned}&{\breve{\mathrm{C}}\mathrm {ov}}_{1,1}=\tau ^4\left[ \breve{\Psi }_N\nabla ^2\breve{\ell }\breve{\Psi }_N\right] _{1,1}+O(\tau ^6), \end{aligned}$$

(21)

$$\begin{aligned}&\vdots \end{aligned}$$

(22)

$$\begin{aligned}&{\breve{\mathrm{C}}\mathrm {ov}}_{N-1,N-1}=\tau ^4\left[ \breve{\Psi }_N\nabla ^2\breve{\ell }\breve{\Psi }_N\right] _{N-1,N-1}+O(\tau ^6), \end{aligned}$$

(23)

$$\begin{aligned}&{\breve{\mathrm{C}}\mathrm {ov}}_{N,N}=\tau ^4\left[ \breve{\Psi }_N\nabla ^2\breve{\ell }\breve{\Psi }_N\right] _{N,N}+O(\tau ^6). \end{aligned}$$

(24)

Using (20) through (24), and expanding out a matrix multiplication, we get

$$\begin{aligned}&\left[ \breve{\Psi }_N\nabla ^2\breve{\ell }\breve{\Psi }_N\right] _{n,n} = \Psi ^2\sum _{j=0}^n\\&\quad \left( \sum _{k=0}^j\tau _k^2\right) \left[ \sum _{i=0}^n\left( \sum _{k=0}^i\tau _k^2\right) \nabla ^2_{i,j}\breve{\ell }+\sum _{i=n+1}^N\left( \sum _{k=0}^n\tau _k^2\right) \nabla ^2_{i,j}\breve{\ell }\right] \\&\qquad + \Psi ^2\sum _{j=n+1}^N\left( \sum _{k=0}^n\tau _k^2\right) \\&\quad \left[ \sum _{i=0}^n\left( \sum _{k=0}^i\tau _k^2\right) \nabla ^2_{i,j}\breve{\ell }+\sum _{i=n+1}^N\left( \sum _{k=0}^n\tau _k^2\right) \nabla ^2_{i,j}\breve{\ell }\right] . \end{aligned}$$

Using our assumption that for all $n=1,\ldots ,N$, $\tau _n=O(\tau ^2)$, we get that

$$\begin{aligned} \breve{\mathrm{C}}\mathrm {ov}_{n,n}=\breve{\mathrm{C}}\mathrm {ov}_{0,0}+O(\tau ^{6}), \end{aligned}$$

from which we can conclude that

$$\begin{aligned} \frac{1}{N+1}\sum _{n=0}^N\breve{\mathrm{C}}\mathrm {ov}_{n,n}=\breve{\mathrm{C}}\mathrm {ov}_{0,0}+O(\tau ^{6}). \end{aligned}$$

For $n=0$, we have

$$\begin{aligned} \left[ \breve{\Psi }_N\nabla ^2\breve{\ell }\breve{\Psi }_N\right] _{0,0}=\Psi ^2\tau _0^4\sum _{j=0}^N\sum _{i=0}^N\nabla ^2_{i,j}\breve{\ell }, \end{aligned}$$

(25)

from which applying the chain rule completes the proof.
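The $(0,0)$ block identity in Eq. (25) follows because every block in the first block row and first block column of $\breve{\Psi }_N$ equals $\tau _0^2\Psi $. A minimal numerical sketch, using a random symmetric matrix `H` as a hypothetical stand-in for $\nabla ^2\breve{\ell }$, is given below. For general $d$ the check is written as $\tau _0^4\,\Psi \left( \sum _{i,j}\nabla ^2_{i,j}\breve{\ell }\right) \Psi $, which for $d=1$ is the $\Psi ^2$ form of Eq. (25).

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical values chosen only for illustration.
taus = np.array([0.9, 0.5, 0.3, 0.2])        # tau_0, ..., tau_N
Psi = np.array([[2.0, 0.3], [0.3, 1.0]])     # d x d covariance
N1, d = len(taus), Psi.shape[0]

# Psi_breve from Eq. (10).
cum = np.cumsum(taus**2)
idx = np.minimum.outer(np.arange(N1), np.arange(N1))
Psi_breve = np.kron(cum[idx], Psi)

# Random symmetric stand-in for the Hessian of the extended log likelihood.
A = rng.normal(size=(N1 * d, N1 * d))
H = (A + A.T) / 2.0

# (0, 0) block of Psi_breve @ H @ Psi_breve.
block_00 = (Psi_breve @ H @ Psi_breve)[:d, :d]

# Sum of all d x d blocks of H, i.e. sum over i, j of nabla^2_{i,j}.
H_block_sum = H.reshape(N1, d, N1, d).sum(axis=(0, 2))

expected_00 = taus[0] ** 4 * Psi @ H_block_sum @ Psi
block_matches = np.allclose(block_00, expected_00)
print("(0,0) block matches tau_0^4 * Psi * (sum of blocks) * Psi:", block_matches)
```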


About this article

Cite this article

Nguyen, D., Ionides, E.L. A second-order iterated smoothing algorithm. Stat Comput 27, 1677–1692 (2017). https://doi.org/10.1007/s11222-016-9711-9


Received: 19 August 2015
Accepted: 03 October 2016
Published: 15 October 2016
Issue Date: November 2017
DOI: https://doi.org/10.1007/s11222-016-9711-9
