Data Imputation in Merged Isobaric Labeling-Based Relative Quantification Datasets

Palstrøm, Nicolai Bjødstrup; Matthiesen, Rune; Beck, Hans Christian

doi:10.1007/978-1-4939-9744-2_13

Part of the book series: Methods in Molecular Biology (MIMB, volume 2051)

Abstract

Data-dependent acquisition in mass spectrometry-based proteomics, combined with quantitative analysis using isobaric labeling (iTRAQ and TMT), inevitably introduces missing values when several LC runs are combined. The problem is especially relevant in the growing field of shotgun clinical proteomics, where protein profiles from several hundred patient samples are compared and correlated with clinical traits, such as a specific disease or disease treatment, in order to link specific outcomes to one or more proteins. In clinical research, missing values in such datasets reduce the power of the downstream statistical analysis and may therefore hamper linking the expression of disease traits to the expression of specific proteins that may be useful for prognostic, diagnostic, or predictive purposes. In our study, we tested three data imputation approaches, initially developed for microarray data, for imputing missing values in datasets generated by several runs of shotgun proteomic experiments in which the data were relative protein abundances based on isobaric tags (iTRAQ and TMT). We conclude that imputation methods based on k-nearest neighbors successfully impute missing values in datasets with up to 50% missing values.
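To make the k-nearest-neighbors idea concrete, the following is a minimal sketch of row-wise kNN imputation on a small protein-by-sample matrix of relative abundances. It is not the authors' implementation, and the abstract does not name the three methods tested. The function name `knn_impute`, the parameter `k`, the use of `None` for missing values, and the example ratios are all assumptions made for illustration.

```python
import math


def knn_impute(matrix, k=3):
    """Fill missing entries (None) in a protein x sample matrix.

    Hypothetical helper for illustration only. Each missing value is
    replaced by the mean of that column's values in the k most similar
    rows (proteins) that have the column observed.
    """

    def distance(a, b):
        # Root-mean-square difference over columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # no overlap, rows cannot be compared
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(row) for row in matrix]  # leave the input untouched
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbors: other rows with column j observed.
            candidates = []
            for m, other in enumerate(matrix):
                if m == i or other[j] is None:
                    continue
                d = distance(row, other)
                if math.isfinite(d):
                    candidates.append((d, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:  # stays None if no usable neighbor exists
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


# Illustrative log-ratios (made-up values): rows are proteins,
# columns are samples merged from several labeling runs.
ratios = [
    [1.0, 1.1, 0.9, 1.0],
    [1.0, 1.1, None, 1.0],
    [-1.0, -0.9, -1.1, -1.0],
    [1.1, 1.0, 1.0, 0.9],
]
filled = knn_impute(ratios, k=2)
```

In this toy input, the missing value in the second row is filled from the two rows with the most similar profiles, while the anti-correlated third row is ignored. Real merged datasets would also need a choice of `k` and a rule for rows with too few observed values, which this sketch leaves out.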

Acknowledgments

Odense University Hospital Research Fund (Grant R22-A1187-B615) is acknowledged for financial support.

Author information

Authors and Affiliations

Department of Clinical Biochemistry and Pharmacology, Odense University Hospital, Odense C, Denmark
Nicolai Bjødstrup Palstrøm & Hans Christian Beck
Computational and Experimental Biology Group, CEDOC, Chronic Diseases Research Centre, NOVA Medical School, Faculdade de Ciências Médicas, Universidade NOVA de Lisboa, Lisbon, Portugal
Rune Matthiesen

Corresponding author

Correspondence to Hans Christian Beck.

Editor information

Editors and Affiliations

Computational and Experimental Biology Group, CEDOC, Chronic Diseases Research Centre, NOVA Medical School, Faculdade de Ciências Médicas, Universidade NOVA de Lisboa, Lisboa, Portugal
Rune Matthiesen

About this protocol

Cite this protocol

Palstrøm, N.B., Matthiesen, R., Beck, H.C. (2020). Data Imputation in Merged Isobaric Labeling-Based Relative Quantification Datasets. In: Matthiesen, R. (eds) Mass Spectrometry Data Analysis in Proteomics. Methods in Molecular Biology, vol 2051. Humana, New York, NY. https://doi.org/10.1007/978-1-4939-9744-2_13

DOI: https://doi.org/10.1007/978-1-4939-9744-2_13
Published: 25 September 2019
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-4939-9743-5
Online ISBN: 978-1-4939-9744-2
eBook Packages: Springer Protocols
