Gradient Free Parameter Estimation for Hidden Markov Models with Intractable Likelihoods



In this article we focus on maximum likelihood estimation (MLE) for the static model parameters of hidden Markov models (HMMs). We consider the case where one cannot, or does not want to, compute the conditional likelihood density of the observation given the hidden state, because of increased computational complexity or analytical intractability. Instead, we assume that one may obtain samples from this conditional likelihood, and hence use approximate Bayesian computation (ABC) approximations of the original HMM. Although these ABC approximations induce a bias, this can be controlled to arbitrary precision via a positive parameter ϵ, so that the bias decreases with decreasing ϵ. We first establish that, when using an ABC approximation of the HMM for a fixed batch of data, the bias of the resulting log-marginal likelihood and its gradient is no worse than \(\mathcal{O}(n\epsilon)\), where n is the total number of data-points. Therefore, when using gradient methods to perform MLE for the ABC approximation of the HMM, one may expect parameter estimates of reasonable accuracy. To compute an estimate of the unknown and fixed model parameters, we propose a gradient approach based on simultaneous perturbation stochastic approximation (SPSA) and sequential Monte Carlo (SMC) for the ABC approximation of the HMM. The performance of this method is illustrated using two numerical examples.
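As an illustration only (not code from the paper), the following Python sketch shows how the two ingredients described above can be combined: an ABC approximation of the HMM likelihood estimated by a bootstrap particle filter with a uniform ϵ-kernel, driven by an SPSA update that requires only two likelihood evaluations per iteration. The toy AR(1)-type model, the function names, and all tuning constants are hypothetical assumptions made for the sketch.

```python
import numpy as np

def abc_smc_log_likelihood(theta, y, epsilon, n_particles, rng):
    """Hypothetical ABC-SMC (bootstrap particle filter) estimate of the
    log-marginal likelihood of an ABC approximation of an HMM.

    Assumed toy model (not from the paper): latent state
    x_t = theta[0] * x_{t-1} + N(0, 1), observation simulated as
    y_t = x_t + theta[1] * N(0, 1).  The ABC kernel is the indicator of
    an epsilon-ball around each observation, so g(y|x) is never evaluated.
    """
    x = rng.normal(size=n_particles)              # initial particles
    log_like = 0.0
    for yt in y:
        x = theta[0] * x + rng.normal(size=n_particles)      # propagate hidden state
        y_sim = x + theta[1] * rng.normal(size=n_particles)  # simulate pseudo-observations
        w = (np.abs(y_sim - yt) < epsilon).astype(float)     # uniform ABC kernel weights
        if w.sum() == 0.0:                        # guard against total rejection
            return -np.inf
        log_like += np.log(w.mean())
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        x = x[idx]                                # multinomial resampling
    return log_like

def spsa_mle(y, theta0, epsilon, n_particles=500, n_iter=200, a=0.1, c=0.1, seed=0):
    """SPSA ascent on the ABC log-likelihood: two noisy likelihood
    evaluations per iteration, regardless of the parameter dimension."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    for k in range(1, n_iter + 1):
        a_k = a / k                               # simple decreasing gain sequences;
        c_k = c / k ** 0.101                      # see Spall (2003) for recommended choices
        delta = rng.choice([-1.0, 1.0], size=theta.shape)     # Rademacher perturbation
        lp = abc_smc_log_likelihood(theta + c_k * delta, y, epsilon, n_particles, rng)
        lm = abc_smc_log_likelihood(theta - c_k * delta, y, epsilon, n_particles, rng)
        grad_hat = (lp - lm) / (2.0 * c_k * delta)            # simultaneous-perturbation gradient estimate
        theta = theta + a_k * grad_hat                        # ascend the (noisy) log-likelihood
    return theta
```

In this kind of scheme the cost per SPSA iteration is two particle-filter runs whatever the dimension of the parameter, which is what makes the approach attractive when the likelihood can only be estimated by simulation; in practice one would typically also reuse common random numbers across the two perturbed evaluations to reduce the variance of the gradient estimate.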


Approximate Bayesian computation · Hidden Markov models · Parameter estimation · Sequential Monte Carlo

AMS 2000 Subject Classification

65C05 · 62F10





Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Department of Mathematics, Imperial College London, London, UK
  2. Department of Statistics & Applied Probability, National University of Singapore, Singapore
  3. Department of Statistical Science, University College London, London, UK
