Multi-omics Classification on Kidney Samples Exploiting Uncertainty-Aware Models

  • Conference paper
Intelligent Computing Theories and Application (ICIC 2020)

Abstract

Because of the large volume of available omic data, classifying samples across several omics is a complex process. A common approach trains one classifier per omic and then builds a consensus that assigns each sample the class receiving the most votes across the individual omic classifiers.
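The per-omic voting scheme described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `majority_vote`, the class labels "A", "B", "C" and the toy predictions are all hypothetical placeholders.

```python
from collections import Counter


def majority_vote(predictions_per_omic):
    """Return the most voted class per sample.

    predictions_per_omic maps an omic name to a list of predicted labels,
    one label per sample, with samples in the same order for every omic.
    """
    omics = list(predictions_per_omic)
    n_samples = len(predictions_per_omic[omics[0]])
    consensus = []
    for i in range(n_samples):
        # Count the votes that the omic-specific classifiers cast for sample i.
        votes = Counter(predictions_per_omic[omic][i] for omic in omics)
        # Ties are resolved in favour of the label encountered first.
        consensus.append(votes.most_common(1)[0][0])
    return consensus


# Hypothetical outputs of three per-omic classifiers on three samples.
preds = {
    "mRNA": ["A", "B", "A"],
    "miRNA": ["A", "B", "C"],
    "meth": ["B", "B", "C"],
}
consensus = majority_vote(preds)
print(consensus)
```

Every vote counts equally here, regardless of how confident each classifier was in its prediction.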

However, this approach ignores the confidence of each prediction, even though the biological information coming from one omic may be more reliable than that coming from another. This work therefore proposes a method based on a tree-based multi-layer perceptron (MLP) that estimates the class-membership probabilities for classification. This makes it possible not only to give relevance to all the omics, but also to label as Unknown those samples for which the classifier is uncertain in its prediction. The method was applied to a dataset of 909 kidney cancer samples for which three omics were available: gene expression (mRNA), microRNA expression (miRNA) and methylation profiles (meth). The method is also applicable to other tissues and other omics (e.g. proteomics, copy number alterations, single nucleotide polymorphisms). The accuracy and the weighted average F1-score of the model both exceed 95%. The tool can therefore be particularly useful in clinical practice, allowing physicians to focus on the most interesting and challenging samples.
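One way to picture a probability-based consensus with an Unknown label is sketched below. The abstract does not state how the probabilities are combined or how uncertainty is measured, so everything here is an assumption made for illustration: the per-omic probabilities are simply averaged, and a hypothetical `threshold` of 0.8 decides when a sample is labelled Unknown. The function name, class labels and numbers are placeholders, not the authors' implementation.

```python
def confident_consensus(probs_per_omic, classes, threshold=0.8):
    """Combine per-omic class probabilities and abstain when unsure.

    probs_per_omic maps an omic name to a list of probability vectors,
    one vector per sample, ordered like `classes`.
    Assumption: probabilities are averaged over omics, and a sample is
    labelled "Unknown" when the best averaged probability is below
    `threshold`.
    """
    omics = list(probs_per_omic)
    n_samples = len(probs_per_omic[omics[0]])
    labels = []
    for i in range(n_samples):
        # Average the class-membership probabilities over all omics.
        mean = [
            sum(probs_per_omic[omic][i][k] for omic in omics) / len(omics)
            for k in range(len(classes))
        ]
        best = max(range(len(classes)), key=mean.__getitem__)
        labels.append(classes[best] if mean[best] >= threshold else "Unknown")
    return labels


# Hypothetical probabilities for two samples and two classes.
classes = ["A", "B"]
probs = {
    "mRNA": [[0.95, 0.05], [0.55, 0.45]],
    "miRNA": [[0.90, 0.10], [0.40, 0.60]],
    "meth": [[0.85, 0.15], [0.50, 0.50]],
}
labels = confident_consensus(probs, classes, threshold=0.8)
print(labels)
```

In this toy input the three omics agree strongly on the first sample, while the second is labelled Unknown because they disagree and no class reaches the threshold. Samples of that kind are the ones a physician would review by hand.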

Data availability: the code is freely accessible at https://github.com/Bontempogianpaolo1/Consunsus-on-multi-omics, while mRNA, miRNA and meth data can be obtained from the GDC database [2] or upon request to the authors.

References

  1. Weinstein, J.N., et al.: The cancer genome atlas pan-cancer analysis project. Nat. Genet. 45(10), 1113 (2013)

  2. Grossman, R.L., et al.: Toward a shared vision for cancer genomic data. N. Engl. J. Med. 375(12), 1109–1112 (2016)

  3. Leinonen, R., Sugawara, H., Shumway, M.: International nucleotide sequence database collaboration: the sequence read archive. Nucleic Acids Res. 39(suppl_1), D19–D21 (2010)

  4. Pochet, N., De Smet, F., Suykens, J.A., De Moor, B.L.: Systematic benchmarking of microarray data classification: assessing the role of non-linearity and dimensionality reduction. Bioinformatics 20(17), 3185–3195 (2004)

  5. Lee, G., Rodriguez, C., Madabhushi, A.: An empirical comparison of dimensionality reduction methods for classifying gene and protein expression datasets. In: Măndoiu, I., Zelikovsky, A. (eds.) ISBRA 2007. LNCS, vol. 4463, pp. 170–181. Springer, Heidelberg (2007). https://doi.org/10.1007/978-3-540-72031-7_16

  6. Kim, P.M., Tidor, B.: Subsystem identification through dimensionality reduction of large-scale gene expression data. Genome Res. 13(7), 1706–1718 (2003)

  7. Lu, M., Zhan, X.: The crucial role of multiomic approach in cancer research and clinically relevant outcomes. EPMA J. 9(1), 77–102 (2018)

  8. Wang, B., Mezlini, A.M., Demir, F., Fiume, M., Tu, Z., Brudno, M., et al.: Similarity network fusion for aggregating data types on a genomic scale. Nat. Meth. 11(3), 333 (2014)

  9. Argelaguet, R., et al.: Multi-omics factor analysis—a framework for unsupervised integration of multi-omics data sets. Mol. Syst. Biol. 14(6), e8124 (2018)

  10. Robles, A.I., Arai, E., Mathé, E.A., Okayama, H., Schetter, A.J., Brown, D., et al.: An integrated prognostic classifier for stage I lung adenocarcinoma based on mRNA, microRNA, and DNA methylation biomarkers. J. Thorac. Oncol. 10(7), 1037–1048 (2015)

  11. Tang, W., Wan, S., Yang, Z., Teschendorff, A.E., Zou, Q.: Tumor origin detection with tissue-specific miRNA and DNA methylation markers. Bioinformatics 34(3), 398–406 (2018)

  12. Cantini, L., Medico, E., Fortunato, S., Caselle, M.: Detection of gene communities in multi-networks reveals cancer drivers. Sci. Rep. 5, 17386 (2015)

  13. Barabási, A.L., Gulbahce, N., Loscalzo, J.: Network medicine: a network-based approach to human disease. Nat. Rev. Genet. 12(1), 56–68 (2011)

  14. Fuchs, M., Beißbarth, T., Wingender, E., Jung, K.: Connecting high-dimensional mRNA and miRNA expression data for binary medical classification problems. Comput. Meth. Programs Biomed. 111(3), 592–601 (2013)

  15. Mallik, S., Mukhopadhyay, A., Maulik, U., Bandyopadhyay, S.: Integrated analysis of gene expression and genome-wide DNA methylation for tumor prediction: an association rule mining-based approach. In: 2013 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), pp. 120–127. IEEE (2013)

  16. Huber, W., Von Heydebreck, A., Sültmann, H., Poustka, A., Vingron, M.: Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics 18(suppl_1), S96–S104 (2002)

  17. Love, M.I., Huber, W., Anders, S.: Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15(12), 550 (2014)

  18. Wold, S., Esbensen, K., Geladi, P.: Principal component analysis. Chemometr. Intell. Lab. Syst. 2(1-3), 37–52 (1987)

  19. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)

  20. Ramchoun, H., Idrissi, M.A.J., Ghanou, Y., Ettaouil, M.: Multilayer perceptron: architecture optimization and training. IJIMAI 4(1), 26–30 (2016)

  21. Bishop, C.M.: Pattern Recognition and Machine Learning (Information Science and Statistics). Springer-Verlag, Berlin, Heidelberg (2006)

  22. Paszke, A., et al.: Automatic differentiation in PyTorch (2017)

  23. Bingham, E., et al.: Pyro: deep universal probabilistic programming. J. Mach. Learn. Res. 20(1), 973–978 (2019)

Author information

Corresponding author

Correspondence to Marta Lovino.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Lovino, M., Bontempo, G., Cirrincione, G., Ficarra, E. (2020). Multi-omics Classification on Kidney Samples Exploiting Uncertainty-Aware Models. In: Huang, DS., Jo, KH. (eds) Intelligent Computing Theories and Application. ICIC 2020. Lecture Notes in Computer Science, vol 12464. Springer, Cham. https://doi.org/10.1007/978-3-030-60802-6_4

  • DOI: https://doi.org/10.1007/978-3-030-60802-6_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-60801-9

  • Online ISBN: 978-3-030-60802-6

  • eBook Packages: Computer Science, Computer Science (R0)
