Environmental and Ecological Statistics, Volume 22, Issue 2, pp 247–274

Combining principal component analysis with parameter line-searches to improve the efficacy of Metropolis–Hastings MCMC

  • David A. Kennedy
  • Vanja Dukic
  • Greg Dwyer


When Markov chain Monte Carlo (MCMC) algorithms are used with complex mechanistic models, convergence times are often severely compromised by poor mixing rates and a lack of computational power. Methods such as adaptive algorithms have been developed to improve mixing, but these algorithms are typically highly sophisticated, both mathematically and computationally. Here we present a nonadaptive MCMC algorithm, which we term line-search MCMC, that can be used for efficient tuning of proposal distributions in a highly parallel computing environment, but that nevertheless requires minimal skill in parallel computing to implement. We apply this algorithm to make inferences about dynamical models of the growth of a pathogen (baculovirus) population inside a host (gypsy moth, Lymantria dispar). The appeal of line-search MCMC rests on its ease of implementation and its potential for efficiency improvements over classical MCMC in a highly parallel setting, which together make it especially useful for ecological models.
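The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea suggested by the title and abstract, not the authors' implementation. It assumes a toy two-dimensional Gaussian target, a short pilot run, principal component analysis of the pilot samples, and a line search along each principal component that sets the proposal scale to the half-width of the region where the log-density stays within 0.5 of its best value on the line. The names `log_target`, `metropolis`, `line_search_width`, and `run_sketch`, the grid, and the 0.5 threshold are all hypothetical choices made for this example.

```python
import numpy as np

# Hypothetical target (an assumption for this sketch): a strongly
# correlated 2-D Gaussian, log-density known up to a constant.
PRECISION = np.linalg.inv(np.array([[1.0, 0.95], [0.95, 1.0]]))


def log_target(x):
    return -0.5 * x @ PRECISION @ x


def metropolis(log_p, x0, propose, n_steps, rng):
    """Metropolis sampler with a symmetric proposal.

    Returns the chain and the fraction of accepted proposals.
    """
    x = np.asarray(x0, dtype=float)
    lp = log_p(x)
    samples = np.empty((n_steps, x.size))
    accepted = 0
    for i in range(n_steps):
        y = propose(x, rng)
        lp_y = log_p(y)
        if np.log(rng.uniform()) < lp_y - lp:
            x, lp = y, lp_y
            accepted += 1
        samples[i] = x
    return samples, accepted / n_steps


def line_search_width(log_p, center, direction, grid, drop=0.5):
    """Half-width of the grid region whose log-density is within `drop`
    of the best value found along one direction.

    Every grid evaluation is independent of the others, so in principle
    they could be farmed out to separate processors.
    """
    values = np.array([log_p(center + t * direction) for t in grid])
    near_peak = grid[values >= values.max() - drop]
    return 0.5 * (near_peak.max() - near_peak.min())


def run_sketch(seed=0, n_pilot=2000, n_main=5000):
    rng = np.random.default_rng(seed)

    # 1. Short pilot run with a naive isotropic proposal.
    pilot, _ = metropolis(
        log_target, np.zeros(2),
        lambda x, r: x + 0.5 * r.standard_normal(2),
        n_pilot, rng,
    )

    # 2. PCA of the pilot samples: eigenvectors of the sample covariance.
    center = pilot.mean(axis=0)
    _, directions = np.linalg.eigh(np.cov(pilot.T))

    # 3. One line search per principal component sets that proposal scale.
    grid = np.linspace(-5.0, 5.0, 201)
    scales = np.array([
        line_search_width(log_target, center, directions[:, k], grid)
        for k in range(directions.shape[1])
    ])

    # 4. Main run: the proposal is fixed from here on (no adaptation).
    def propose(x, r):
        return x + directions @ (scales * r.standard_normal(scales.size))

    samples, acceptance = metropolis(log_target, center, propose, n_main, rng)
    return scales, samples, acceptance


scales, samples, acceptance = run_sketch()
```

In this sketch the proposal is tuned once, before the main chain starts, and is then held fixed, which is what keeps the sampler nonadaptive. The list comprehension in step 3 is the part that would be distributed across processors in a parallel setting.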


Birth–death model; MCMC; Parameter line-search; Survival-time data; Within-host model



DAK was supported by an ARCS fellowship, a GAANN training grant while at the University of Chicago, and the RAPIDD program of the Science and Technology Directorate, Department of Homeland Security and Fogarty International Center, National Institutes of Health (NIH). GD and VD were supported by NIH Grant R01GM096655. VD was also supported by Grants NSF-DEB 1316334 and NSF-GEO 1211668. We thank two anonymous reviewers for comments that substantially improved the manuscript.

Supplementary material

  1. 10651_2014_297_MOESM1_ESM.txt (txt, 137 Bytes)
  2. 10651_2014_297_MOESM2_ESM.txt (txt, 2.86 KB)
  3. 10651_2014_297_MOESM3_ESM.txt (txt, 14.9 KB)
  4. 10651_2014_297_MOESM4_ESM.txt (txt, 18.1 KB)
  5. 10651_2014_297_MOESM5_ESM.txt (txt, 15.7 KB)
  6. 10651_2014_297_MOESM6_ESM.txt (txt, 16.1 KB)
  7. 10651_2014_297_MOESM7_ESM.txt (txt, 1008 Bytes)
  8. 10651_2014_297_MOESM8_ESM.txt (txt, 127 Bytes)
  9. 10651_2014_297_MOESM9_ESM.txt (txt, 18.8 KB)
  10. 10651_2014_297_MOESM10_ESM.txt (txt, 19.5 KB)
  11. 10651_2014_297_MOESM11_ESM.txt (txt, 793 Bytes)



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Department of Ecology and Evolution, University of Chicago, Chicago, USA
  2. Center for Infectious Disease Dynamics, Pennsylvania State University, University Park, USA
  3. Fogarty International Center, National Institutes of Health, Bethesda, USA
  4. Department of Applied Mathematics, University of Colorado - Boulder, Boulder, USA
