Abstract
Simulation-based inference for partially observed stochastic dynamic models is currently receiving much attention because direct computation of the likelihood is not possible in many practical situations. Iterated filtering methodologies enable maximization of the likelihood function using simulation-based sequential Monte Carlo filters. Doucet et al. (2013) developed an approximation for the first and second derivatives of the log likelihood via simulation-based sequential Monte Carlo smoothing and proved that the approximation has some attractive theoretical properties. We investigated an iterated smoothing algorithm that carries out likelihood maximization using these derivative approximations. Further, we developed a new iterated smoothing algorithm, using a modification of these derivative estimates, for which we establish both theoretical results and effective practical performance. On benchmark computational challenges, this method outperformed the first-order iterated filtering algorithm, and its performance was comparable to that of a recently developed iterated filtering algorithm based on an iterated Bayes map. Our iterated smoothing algorithm and its theoretical justification provide new directions for future developments in simulation-based inference for latent variable models such as partially observed Markov process models.
References
Andrieu, C., Doucet, A., Holenstein, R.: Particle Markov chain Monte Carlo methods. J. R. Stat. Soc. Ser. B 72(3), 269–342 (2010)
Bhadra, A., Ionides, E.L., Laneri, K., Pascual, M., Bouma, M., Dhiman, R.C.: Malaria in Northwest India: data analysis via partially observed stochastic differential equation models driven by Lévy noise. J. Am. Stat. Assoc. 106, 440–451 (2011)
Bjørnstad, O.N., Grenfell, B.T.: Noisy clockwork: time series analysis of population fluctuations in animals. Science 293, 638–643 (2001)
Blackwood, J.C., Cummings, D.A.T., Broutin, H., Iamsirithaworn, S., Rohani, P.: Deciphering the impacts of vaccination and immunity on pertussis epidemiology in Thailand. Proc. Natl. Acad. Sci. USA 110, 9595–9600 (2013a)
Blackwood, J.C., Streicker, D.G., Altizer, S., Rohani, P.: Resolving the roles of immunity, pathogenesis, and immigration for rabies persistence in vampire bats. Proc. Natl. Acad. Sci. USA 110, 20837–20842 (2013b)
Blake, I.M., Martin, R., Goel, A., Khetsuriani, N., Everts, J., Wolff, C., Wassilak, S., Aylward, R.B., Grassly, N.C.: The role of older children and adults in wild poliovirus transmission. Proc. Natl. Acad. Sci. USA 111(29), 10604–10609 (2014)
Bretó, C., He, D., Ionides, E.L., King, A.A.: Time series analysis via mechanistic models. Ann. Appl. Stat. 3, 319–348 (2009)
Camacho, A., Ballesteros, S., Graham, A.L., Carrat, F., Ratmann, O., Cazelles, B.: Explaining rapid reinfections in multiple-wave influenza outbreaks: Tristan da Cunha 1971 epidemic as a case study. Proc. R. Soc. Lond. Ser. B 278(1725), 3635–3643 (2011)
Chopin, N., Jacob, P.E., Papaspiliopoulos, O.: SMC\(^2\): an efficient algorithm for sequential analysis of state space models. J. R. Stat. Soc. Ser. B 75(3), 397–426 (2013)
Dahlin, J., Lindsten, F., Schön, T.B.: Particle Metropolis-Hastings using gradient and Hessian information. Stat. Comput. 25(1), 81–92 (2015)
Douc, R., Cappé, O., Moulines, E.: Comparison of resampling schemes for particle filtering. In: Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, 2005, pp. 64–69. IEEE, New York (2005)
Doucet, A., Jacob, P.E., Rubenthaler, S.: Derivative-free estimation of the score vector and observed information matrix with application to state-space models (version 2). arXiv:1304.5768v2 (2013)
Earn, D.J., He, D., Loeb, M.B., Fonseca, K., Lee, B.E., Dushoff, J.: Effects of school closure on incidence of pandemic influenza in Alberta. Ann. Int. Med. 156(3), 173–181 (2012)
Gordon, N.J., Salmond, D.J., Smith, A.F.: Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc. F. Radar Signal Process. 140, 107–113 (1993)
He, D., Dushoff, J., Day, T., Ma, J., Earn, D.J.D.: Inferring the causes of the three waves of the 1918 influenza pandemic in England and Wales. Proc. R. Soc. Lond. Ser. B 280(1766), 20131345 (2013)
He, D., Ionides, E.L., King, A.A.: Plug-and-play inference for disease dynamics: measles in large and small populations as a case study. J. R. Soc. Interface 7(43), 271–283 (2010)
Ionides, E.L., Bhadra, A., Atchadé, Y., King, A.: Iterated filtering. Ann. Stat. 39, 1776–1802 (2011)
Ionides, E.L., Bretó, C., King, A.A.: Inference for nonlinear dynamical systems. Proc. Natl. Acad. Sci. USA 103, 18438–18443 (2006)
Ionides, E.L., Nguyen, D., Atchadé, Y., Stoev, S., King, A.A.: Inference for dynamic and latent variable models via iterated, perturbed Bayes maps. Proc. Natl. Acad. Sci. USA 112(3), 719–724 (2015)
Kevrekidis, I.G., Gear, C.W., Hummer, G.: Equation-free: the computer-assisted analysis of complex, multiscale systems. Am. Inst. Chem. Eng. J. 50, 1346–1354 (2004)
King, A.A., Domenech de Cellès, M., Magpantay, F.M.G., Rohani, P.: Avoidable errors in the modelling of outbreaks of emerging pathogens, with special reference to Ebola. Proc. R. Soc. Lond. Ser. B 282, 20150347 (2015)
King, A.A., Ionides, E.L., Pascual, M., Bouma, M.J.: Inapparent infections and cholera dynamics. Nature 454, 877–880 (2008)
King, A.A., Nguyen, D., Ionides, E.L.: Statistical inference for partially observed Markov processes via the R package pomp. J. Stat. Softw. 69, 1–43 (2016)
Kloeden, P.E., Platen, E.: Numerical Solution of Stochastic Differential Equations, 3rd edn. Springer, New York (1999)
Kushner, H.J., Clark, D.S.: Stochastic Approximation Methods for Constrained and Unconstrained Systems. Springer, New York (1978)
Laneri, K., Bhadra, A., Ionides, E.L., Bouma, M., Dhiman, R.C., Yadav, R.S., Pascual, M.: Forcing versus feedback: epidemic malaria and monsoon rains in Northwest India. PLoS Comput. Biol. 6(9), e1000898 (2010)
Laneri, K., Paul, R.E., Tall, A., Faye, J., Diene-Sarr, F., Sokhna, C., Trape, J.-F., Rodó, X.: Dynamical malaria models reveal how immunity buffers effect of climate variability. Proc. Natl. Acad. Sci. USA 112(28), 8786–8791 (2015)
Lavine, J.S., King, A.A., Andreasen, V., Bjørnstad, O.N.: Immune boosting explains regime-shifts in prevaccine-era pertussis dynamics. PLoS ONE 8(8), e72086 (2013)
Lavine, J.S., Rohani, P.: Resolving pertussis immunity and vaccine effectiveness using incidence time series. Expert Rev. Vaccines 11, 1319–1329 (2012)
Macdonald, G.: The Epidemiology and Control of Malaria. Oxford University Press, Oxford (1957)
Martinez-Bakker, M., King, A.A., Rohani, P.: Unraveling the transmission ecology of polio. PLoS Biol. 13(6), e1002172 (2015)
Nemeth, C., Fearnhead, P., Mihaylova, L.: Particle approximations of the score and observed information matrix for parameter estimation in state space models with linear computational cost. arXiv:1306.0735 (2013)
Nguyen, D.: Iterated smoothing R package, is2. https://r-forge.r-project.org/projects/is2 (2015)
Olsson, J., Cappé, O., Douc, R., Moulines, E.: Sequential Monte Carlo smoothing with application to parameter estimation in nonlinear state space models. Bernoulli 14(1), 155–179 (2008)
Poyiadjis, G., Doucet, A., Singh, S.S.: Particle approximations of the score and observed information matrix in state space models with application to parameter estimation. Biometrika 98(1), 65–80 (2011)
Romero-Severson, E., Volz, E., Koopman, J., Leitner, T., Ionides, E.: Dynamic variation in sexual contact rates in a cohort of HIV-negative gay men. Am. J. Epidemiol. 182, 255–262 (2015)
Ross, R.: The Prevention of Malaria. Dutton, Boston (1910)
Roy, M., Bouma, M.J., Ionides, E.L., Dhiman, R.C., Pascual, M.: The potential elimination of Plasmodium vivax malaria by relapse treatment: insights from a transmission model and surveillance data from NW India. PLoS Negl. Trop. Dis. 7, e1979 (2013)
Shrestha, S., Foxman, B., Weinberger, D.M., Steiner, C., Viboud, C., Rohani, P.: Identifying the interaction between influenza and pneumococcal pneumonia using incidence data. Sci. Transl. Med. 5(191), 191ra84 (2013)
Shrestha, S., King, A.A., Rohani, P.: Statistical inference for multi-pathogen systems. PLoS Comput. Biol. 7(8), e1002135 (2011)
Sisson, S.A., Fan, Y., Tanaka, M.M.: Sequential Monte Carlo without likelihoods. Proc. Natl. Acad. Sci. USA 104(6), 1760–1765 (2007)
Spall, J.C.: Introduction to Stochastic Search and Optimization. Wiley, Hoboken (2003)
Toni, T., Welch, D., Strelkowa, N., Ipsen, A., Stumpf, M.P.: Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. J. R. Soc. Interface 6, 187–202 (2009)
Wood, S.N.: Statistical inference for noisy nonlinear ecological dynamic systems. Nature 466(7310), 1102–1104 (2010)
Yıldırım, S., Singh, S.S., Dean, T., Jasra, A.: Parameter estimation in hidden Markov models with intractable likelihoods using sequential Monte Carlo. J. Comput. Graph. Stat. 24, 846–865 (2015)
Acknowledgments
This research was funded in part by National Science Foundation Grant DMS-1308919 and National Institutes of Health Grants 1-U54-GM111274 and 1-U01-GM110712.
Appendix: Proofs
1.1 Proof of Theorem 3
Let
where \(I_{d\times d}\) is the identity matrix of dimension d and \(0_{d\times d}\) is the zero matrix of dimension d. A random walk noise is then given by \(R\tau Z_{0:N}\). From Assumption 5, \(\breve{\ell }\) is four times continuously differentiable for \(\theta ^{[N+1]}\). Since N is fixed, we can apply Theorem 1, with \(\Sigma =\mathrm {Cov}(RZ_{0:N})=\breve{\Psi }_N\), to obtain the existence of an \(\eta \) and a \(C_8\) such that for every \(\tau < \eta \) we have,
where
Note that Assumptions 1 and 2 are automatically satisfied for the multivariate normal distribution with mean zero and variance \(\breve{\Psi }_N\), corresponding to the random variable \(RZ_{0:N}\). As a result, for fixed \(\tau _{0},\dots ,\tau _{N}\) and for a random walk noise, we have
An application of the Gauss-Jordan inversion method gives
We write \(\nabla _{n} \breve{\ell }(\theta ^{[N+1]})\) for the d-dimensional vector of partial derivatives of \(\breve{\ell }(\theta ^{[N+1]})\) with respect to each of the d components of \(\theta _{n}\). An application of the chain rule gives the identity
giving rise to an inequality,
where \(\left\{ s\right\} _{n}\) denotes the entries \(\left\{ dn+1,...,d(n+1)\right\} \) of a vector \(s\in R^{d(N+1)}\). Decomposing the matrix multiplication by \(\breve{\Psi }_N^{-1}\) into \(d\times d\) blocks, we have
where \(\mathrm {SumCol}_{n}\) is the sum of the nth column in the \(d\times d\) block construction of \(\breve{\Psi }_N^{-1}\). Every column of \(\breve{\Psi }_N^{-1}\) except the first sums to 0, and this special structure of \(\breve{\Psi }_N^{-1}\) gives a simple form,
This can be written as
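The column-sum property of \(\breve{\Psi }_N^{-1}\) used in this proof can be checked numerically. The following is a minimal sketch, not the authors' code, for the scalar case \(d=1\). It assumes that the perturbation is a random walk with independent increments, so that the covariance between times s and n is the cumulative increment variance up to \(\min (s,n)\). The increment variances and all function names are hypothetical choices made for illustration.

```python
from fractions import Fraction


def random_walk_cov(variances):
    """Covariance of partial sums S_n = X_0 + ... + X_n of independent
    increments with Var(X_k) = variances[k] (assumed model, d = 1):
    Cov(S_s, S_n) = sum of variances[k] for k <= min(s, n)."""
    cum, total = [], Fraction(0)
    for v in variances:
        total += Fraction(v)
        cum.append(total)
    m = len(variances)
    return [[cum[min(s, n)] for n in range(m)] for s in range(m)]


def gauss_jordan_inverse(a):
    """Invert a square matrix exactly by Gauss-Jordan elimination on [A | I]."""
    m = len(a)
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(m)]
        for i, row in enumerate(a)
    ]
    for col in range(m):
        # Find a nonzero pivot and move it onto the diagonal.
        pivot = next(r for r in range(col, m) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Eliminate the pivot column from every other row.
        for r in range(m):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[m:] for row in aug]


def column_sums(mat):
    return [sum(col) for col in zip(*mat)]


# Hypothetical increment variances, playing the role of tau_n^2 for n = 0..3.
variances = [Fraction(1, 4), Fraction(1, 9), Fraction(1, 16), Fraction(1, 25)]
cov = random_walk_cov(variances)
precision = gauss_jordan_inverse(cov)
sums = column_sums(precision)
print(sums)  # first entry equals 1 / variances[0]; all other entries are zero
```

Under these assumptions the inverse is tridiagonal, and only its first column has a nonzero sum, which is why only the time-0 term survives when the blocks are added up.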
1.2 Proof of Theorem 4
Using a similar setup to the one above, let the random walk noise be \(R\tau Z_{0:N}\) with R defined as in Eq. (8). By selecting \(p_{\Theta _{0:N}}\) to follow a multivariate normal distribution, Assumption 4 is also satisfied. From Theorem 2, for fixed \(\tau _{0}, \dots , \tau _{N}\), there exist \(\eta \) and \(C_{12}\) such that for \(0<\tau <\eta \),
Define \(\nabla ^2_{s,n}{\breve{\ell }}\left( \theta ^{[N+1]}\right) \) as
Applying the chain rule, we have
Adding up the terms in Eq. (12), we get
where \(\left\{ A \right\} _{s,n}\) denotes the entries in rows \(\left\{ ds+1,...,d(s+1)\right\} \) and columns \(\left\{ dn+1,...,d(n+1)\right\} \) of a matrix \(A\in R^{d(N+1)\times d(N+1)}\). Therefore,
Defining \(\mathrm {SumCol}_n\) as in Eq. (11), we have
The last equality follows since \(\breve{\Psi }_N^{-1}\) is a symmetric matrix of \(d\times d\) blocks in which each column except the first sums to 0. Thus, we obtain
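The way symmetry and the column-sum property collapse the double sum over blocks can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the paper's computation: it takes \(d=1\), assumes a random walk with independent increments, and assumes that the second-derivative approximation has the sandwich form \(\breve{\Psi }_N^{-1} M \breve{\Psi }_N^{-1}\) for some matrix M. The test matrix, the variances, and the function names are hypothetical.

```python
from fractions import Fraction


def random_walk_precision(variances):
    """Inverse covariance of a scalar random walk with independent increments,
    built from the factorization B^T D^{-1} B, where B takes first differences
    and D is the diagonal matrix of increment variances (assumed model)."""
    m = len(variances)
    inv = [Fraction(1) / Fraction(v) for v in variances]
    prec = [[Fraction(0)] * m for _ in range(m)]
    for k in range(m):
        prec[k][k] += inv[k]
        if k > 0:
            # The increment at step k couples times k - 1 and k.
            prec[k - 1][k - 1] += inv[k]
            prec[k - 1][k] -= inv[k]
            prec[k][k - 1] -= inv[k]
    return prec


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Hypothetical increment variances, playing the role of tau_n^2 for n = 0..2.
variances = [Fraction(1, 4), Fraction(1, 9), Fraction(1, 16)]
prec = random_walk_precision(variances)

# Arbitrary symmetric matrix standing in for the covariance-type term M.
m_test = [
    [Fraction(2), Fraction(-1), Fraction(3)],
    [Fraction(-1), Fraction(5), Fraction(7)],
    [Fraction(3), Fraction(7), Fraction(11)],
]

sandwich = matmul(matmul(prec, m_test), prec)
total = sum(sum(row) for row in sandwich)          # sum over all (s, n) entries
expected = m_test[0][0] / variances[0] ** 2        # only the (0, 0) entry survives
print(total == expected)
```

Because the precision matrix is symmetric and every column but the first sums to zero, summing all entries of the sandwich leaves only the time-0 entry of M, scaled by the squared inverse of the first increment variance.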
1.3 Proof of Theorem 5
In order to prove Theorem 5, we use the following corollary to Theorem 1.
Corollary 1
Suppose the perturbation kernel takes a value \(\kappa \in \mathcal {K}\) satisfying Assumptions 6 and 7. Suppose also that Assumption 3 holds. There exist an \(\eta \) and a constant \(C_{16}\) such that for every \(0<\tau <\eta \) and every \(\kappa \in \mathcal {K}\),
Corollary 1 follows directly from applying Theorem 1, noting that Assumptions 6 and 7 imply a uniform bound on \(C_2\) in Theorem 1. Applying Corollary 1, we obtain the existence of an \(\eta \) and \(C_{18}\) such that for every \(\tau < \eta \) we have
For compactness of notation, we write \(E_n=\breve{\mathbb {E}}\left( \breve{\Theta }_{n}-\theta \left| \breve{Y}_{1:N}=y^*_{1:N}\right. \right) \) and \(D_n=\Psi \nabla _n{\breve{\ell }}\left( \theta ^{\left[ N+1\right] }\right) \). With \(\breve{\Psi }_N\) as in Eq. (10), writing out terms of the vector equation in (14) gives
Using our assumption that \(\tau _n=O(\tau ^2)\) for all \(n=1,\ldots ,N\), we get that \(E_n=E_0+O(\tau ^{4})\), from which we can conclude that
From Eq. (14) and the chain rule, for fixed \(\tau _0\) we have
which then completes the proof.
1.4 Proof of Theorem 6
As for Theorem 5, we need a corollary to Theorem 2 over the kernel set \(\mathcal {K}\), making use of Assumptions 6 and 7.
Corollary 2
Suppose that Assumptions 3, 6, and 7 hold. Suppose also that every kernel in \(\mathcal {K}\) satisfies the mesokurtic condition in Assumption 3. There exist an \(\eta \) and a constant \(C_{20}\) such that for every \(0<\tau <\eta \) and every \(\kappa \in \mathcal {K}\),
Applying Corollary 2, we have
For compact notation, we write
and
From the diagonal terms of the above matrix norm inequality, we derive \(N+1\) equations,
Using (20) through (24), and expanding out a matrix multiplication, we get
Using our assumption that \(\tau _n=O(\tau ^2)\) for all \(n=1,\ldots ,N\), we get that
from which we can conclude that
For \(n=0\), we have
from which applying the chain rule completes the proof.
Cite this article
Nguyen, D., Ionides, E.L. A second-order iterated smoothing algorithm. Stat Comput 27, 1677–1692 (2017). https://doi.org/10.1007/s11222-016-9711-9
DOI: https://doi.org/10.1007/s11222-016-9711-9