Avoiding selection bias in metabolomics studies: a tutorial
Metabolomics techniques are increasingly applied in epidemiologic research. Many available assays are still relatively expensive, so measurements are often performed in small studies of patient populations, such as case series or case–control designs with strong participant selection criteria. Subsequently, metabolomics data are frequently used to assess secondary associations for which the original study was not explicitly designed. Especially in these secondary analyses, there is a risk that the original selection criteria, and the conditioning that results from this selection, are not properly accounted for, which can lead to selection bias.
Aim of review
In this tutorial, we start with a brief theoretical introduction to selection bias. We then demonstrate how selection bias can occur in metabolomics studies by investigating associations of metabolites with total body fat in a nested case–control study that was originally designed to study effects of elevated fasting glucose.
Key scientific concepts of review
We demonstrate that standard analytical methods, such as stratification or adjustment in regression analyses, are not suited to deal with selection bias and may even induce bias when analysing metabolite–phenotype relationships in selected groups. Finally, we show that inverse probability weighting, also known as survey weighting, can be used in some situations to obtain unbiased estimates of the outcomes.
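The mechanism described above can be illustrated with a small simulation. The sketch below is not the analysis of the tutorial or of the NEO data. It is a hypothetical example in which all variable names, effect sizes, the glucose cutoff and the sampling fraction are assumptions chosen for illustration. Body fat and a metabolite both raise fasting glucose, participants with elevated glucose are all included, and the remainder are sampled with a known small probability. An unweighted regression in the selected sample is then compared with a regression weighted by the inverse of the (assumed known) selection probability.

```python
import random


def weighted_slope(x, y, w):
    """Slope of the weighted least-squares regression of y on x."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


def simulate(n=200_000, true_slope=0.3, cutoff=1.5, p_control=0.05, seed=1):
    """Simulate a source population and a glucose-based selection (all values hypothetical)."""
    rng = random.Random(seed)
    fat, metabolite, weights = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)                    # total body fat (standardised)
        m = true_slope * x + rng.gauss(0, 1)   # metabolite, truly associated with fat
        g = x + m + rng.gauss(0, 0.5)          # fasting glucose, affected by both (collider)
        # Selection: everyone with elevated glucose, a small random sample of the rest.
        p_selected = 1.0 if g > cutoff else p_control
        if rng.random() < p_selected:
            fat.append(x)
            metabolite.append(m)
            weights.append(1.0 / p_selected)   # inverse probability weight
    naive = weighted_slope(fat, metabolite, [1.0] * len(fat))  # ignores the selection
    ipw = weighted_slope(fat, metabolite, weights)             # reweights to the source population
    return naive, ipw


TRUE_SLOPE = 0.3
naive_slope, ipw_slope = simulate(true_slope=TRUE_SLOPE)
print(f"true slope: {TRUE_SLOPE:.2f}")
print(f"unweighted slope in selected sample: {naive_slope:.2f}")
print(f"inverse-probability-weighted slope:  {ipw_slope:.2f}")
```

Under these assumptions the unweighted slope is pulled well below the simulated value, while the weighted slope comes close to it. The sketch yields point estimates only. Valid standard errors would additionally require a design-based or robust variance estimator, and in real data the selection probabilities must be known or estimable for every included participant.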
Keywords
Metabolomics, Selection bias, Collider bias, Inverse probability weighting, Epidemiology
We express our gratitude to all individuals who participate in the Netherlands Epidemiology of Obesity study. We are grateful to all participating general practitioners for inviting eligible participants. We furthermore thank P. van Beelen and all research nurses for collecting the data and P. Noordijk and her team for sample handling and storage, and I. de Jonge, MSc for data management of the NEO study.
SCB: performed analyses, wrote manuscript and conceived idea; SLC: conceived idea, contributed to techniques for analysis, read and approved manuscript; KWD: helped in formulation of text aimed at target audience, read and approved manuscript; RDM: read and approved manuscript; DOM: conceived idea, read and approved manuscript.
The NEO study is supported by the participating Departments, the Division and the Board of Directors of the Leiden University Medical Centre, and by Leiden University, Research Profile Area ‘Vascular and Regenerative Medicine’. Dennis Mook-Kanamori and the metabolomics measurements are supported by the Dutch Science Organization (ZonMW-VENI Grant 916.14.023).
Compliance with ethical standards
Conflict of interest
All contributing authors declare that there is no conflict of interest involved in the creation of this manuscript.
The Netherlands Epidemiology of Obesity study was approved by the medical ethical committee of the Leiden University Medical Center (LUMC).
All participants gave written informed consent.