European Journal of Epidemiology, Volume 34, Issue 10, pp 927–938

Bias from self selection and loss to follow-up in prospective cohort studies

  • Guido Biele
  • Kristin Gustavson
  • Nikolai Olavi Czajkowski
  • Roy Miodini Nilsen
  • Ted Reichborn-Kjennerud
  • Per Minor Magnus
  • Camilla Stoltenberg
  • Heidi Aase


Self-selection into prospective cohort studies and loss to follow-up can bias estimates of exposure-outcome associations. Previous investigations have illustrated that such biases can be small in large prospective cohort studies. The structural approach to selection bias shows that general statements about bias are not possible for studies that investigate multiple exposures and outcomes, and that inverse probability of participation weighting (IPPW), but not adjustment for participation predictors, generally reduces bias from self-selection and loss to follow-up. We propose to substantiate assumptions in structural models of selection bias by calculating genetic correlation coefficients between participation predictors, outcome, and exposure, and to estimate a lower bound for bias due to self-selection and loss to follow-up by comparing effect estimates from IPP-weighted and unweighted analyses. This study used data from the Norwegian Mother and Child Cohort Study and the Medical Birth Registry of Norway. Using the example of risk factors for ADHD, we find that genetic correlations between participation predictors, exposures, and outcome suggest the presence of bias. The comparison of exposure-outcome associations from regressions with and without IPPW revealed meaningful deviations. Selection bias cannot be assessed for an entire multi-exposure, multi-outcome cohort study. Instead, it has to be assessed and controlled on a case-by-case basis.
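The comparison of weighted and unweighted estimates described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under invented assumptions, not the authors' analysis: the data-generating model, the variable names (`A`, `U`, `L`, `S`), the participation rates of 0.8 and 0.2, and the use of simple least squares instead of Bayesian estimation are all hypothetical. It assumes a participation predictor `L` that is affected by both the exposure and an unmeasured cause of the outcome, and that is known for the whole population (for example from a registry).

```python
import numpy as np

rng = np.random.default_rng(2019)
n = 200_000
true_effect = 0.5  # assumed effect of exposure A on outcome Y

# Hypothetical population: exposure A, unmeasured outcome cause U, outcome Y.
A = rng.normal(size=n)
U = rng.normal(size=n)
Y = true_effect * A + U + rng.normal(size=n)

# Participation predictor L (binary), influenced by both A and U.
L = (A + U + rng.normal(size=n) > 0).astype(int)

# Participation S depends only on L (assumed rates: 0.8 vs 0.2).
S = rng.random(n) < np.where(L == 1, 0.8, 0.2)

# Estimate participation probabilities within strata of L,
# then form inverse probability of participation weights.
p_hat = np.where(L == 1, S[L == 1].mean(), S[L == 0].mean())
w = 1.0 / p_hat[S]


def weighted_slope(x, y, weights):
    """Slope of a weighted simple linear regression of y on x."""
    xm = np.average(x, weights=weights)
    ym = np.average(y, weights=weights)
    return np.sum(weights * (x - xm) * (y - ym)) / np.sum(weights * (x - xm) ** 2)


a, y, l = A[S], Y[S], L[S]  # participants only

slope_unweighted = weighted_slope(a, y, np.ones_like(a))
slope_ippw = weighted_slope(a, y, w)

# Alternative: adjust for L as a covariate instead of weighting.
X = np.column_stack([np.ones_like(a), a, l])
slope_adjusted = np.linalg.lstsq(X, y, rcond=None)[0][1]

print(f"unweighted: {slope_unweighted:.3f}")
print(f"IPPW:       {slope_ippw:.3f}")
print(f"adjusted:   {slope_adjusted:.3f}")
```

In this particular setup the weighted estimate should land close to the assumed effect of 0.5, while the unweighted and the covariate-adjusted estimates fall below it, because `L` acts as a collider between the exposure and the unmeasured outcome cause. Other causal structures can behave differently, which is consistent with the abstract's point that bias has to be assessed case by case.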


Keywords: Bias; Self selection; Loss to follow up; Cohort study; Inverse probability weighting; Bayesian estimation; Directed acyclic graphs; ADHD



The Norwegian Mother and Child Cohort Study is supported by the Norwegian Ministry of Health and Care Services and the Ministry of Education and Research. We are grateful to all the participating families in Norway who take part in this on-going cohort study. The authors thank Eivind Ystrøm for discussing an earlier version of the research and the International Cannabis Consortium for providing GWAS summary statistics.


This study was funded by the Norwegian Institute of Public Health.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Supplementary material

10654_2019_550_MOESM1_ESM.pdf (353 kb)
Supplementary material 1 (PDF 354 kb)



Copyright information

© Springer Nature B.V. 2019

Authors and Affiliations

  • Guido Biele (1)
  • Kristin Gustavson (1)
  • Nikolai Olavi Czajkowski (1)
  • Roy Miodini Nilsen (1)
  • Ted Reichborn-Kjennerud (1)
  • Per Minor Magnus (1)
  • Camilla Stoltenberg (1)
  • Heidi Aase (1)

  1. Norwegian Institute of Public Health, Oslo, Norway
