# Combining principal component analysis with parameter line-searches to improve the efficacy of Metropolis–Hastings MCMC

## Abstract

When Markov chain Monte Carlo (MCMC) algorithms are used with complex mechanistic models, convergence times are often severely compromised by poor mixing rates and a lack of computational power. Methods such as adaptive algorithms have been developed to improve mixing, but these algorithms are typically highly sophisticated, both mathematically and computationally. Here we present a nonadaptive MCMC algorithm, which we term line-search MCMC, that can be used for efficient tuning of proposal distributions in a highly parallel computing environment, but that nevertheless requires minimal skill in parallel computing to implement. We apply this algorithm to make inferences about dynamical models of the growth of a pathogen (baculovirus) population inside a host (gypsy moth, *Lymantria dispar*). The appeal of line-search MCMC rests on its ease of implementation and its potential for efficiency improvements over classical MCMC in a highly parallel setting, which makes it especially useful for ecological models.
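The abstract does not spell out the steps of line-search MCMC, so the following is only a minimal sketch of one plausible reading of the title: find principal axes from a pilot sample, run a line search along each axis to pick a proposal scale, then run Metropolis with that fixed proposal. It is not the authors' implementation. The target `log_post`, the pilot sample, the grid, and all function names are hypothetical stand-ins, and a symmetric random-walk proposal is assumed for simplicity.

```python
import numpy as np

def log_post(theta):
    # Hypothetical target: a correlated 2-D Gaussian standing in for a model posterior.
    cov = np.array([[1.0, 0.9], [0.9, 1.0]])
    return -0.5 * theta @ np.linalg.solve(cov, theta)

def pca_axes(samples):
    # Principal axes of a pilot sample: eigenvectors of its covariance matrix.
    center = samples.mean(axis=0)
    _, eigvecs = np.linalg.eigh(np.cov(samples - center, rowvar=False))
    return center, eigvecs  # columns are principal directions, ascending variance

def line_search_scale(log_post, center, direction, grid):
    # Evaluate the log posterior at grid points along one axis. The evaluations
    # are independent of each other, so this is the step that could run in parallel.
    logp = np.array([log_post(center + t * direction) for t in grid])
    w = np.exp(logp - logp.max())
    w /= w.sum()
    mean = np.sum(w * grid)
    return np.sqrt(np.sum(w * (grid - mean) ** 2))  # spread of the profile on this axis

def metropolis(log_post, start, axes, scales, n_iter, rng):
    # Random-walk Metropolis with a fixed (nonadaptive) proposal aligned with the axes.
    chain = np.empty((n_iter, start.size))
    theta, lp = start.copy(), log_post(start)
    accepted = 0
    for i in range(n_iter):
        prop = theta + axes @ (scales * rng.standard_normal(start.size))
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
            accepted += 1
        chain[i] = theta
    return chain, accepted / n_iter

rng = np.random.default_rng(0)
# Hypothetical pilot sample; in practice this would come from a short preliminary run.
pilot = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.9], [0.9, 1.0]], size=500)
center, axes = pca_axes(pilot)
grid = np.linspace(-6.0, 6.0, 121)
scales = np.array([line_search_scale(log_post, center, axes[:, k], grid)
                   for k in range(axes.shape[1])])
chain, acc_rate = metropolis(log_post, center, axes, scales, 5000, rng)
```

In this sketch the proposal is tuned once, before sampling, and never updated during the run, which is what keeps it nonadaptive. The only parallel work is the batch of independent posterior evaluations inside `line_search_scale`.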

## Keywords

- Birth–death model
- MCMC
- Parameter line-search
- Survival-time data
- Within-host model

## Notes

### Acknowledgments

DAK was supported by an ARCS fellowship, a GAANN training grant while at the University of Chicago, and the RAPIDD program of the Science and Technology Directorate, Department of Homeland Security and Fogarty International Center, National Institutes of Health (NIH). GD and VD were supported by NIH Grant R01GM096655. VD was also supported by Grants NSF-DEB 1316334 and NSF-GEO 1211668. We thank two anonymous reviewers for comments that substantially improved the manuscript.
