Disease Prediction Using Metagenomic Data Visualizations Based on Manifold Learning and Convolutional Neural Network

  • Conference paper
Future Data and Security Engineering (FDSE 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11814)

Abstract

Deep learning algorithms have achieved notable results in image classification, speech recognition, and video processing. Visualizing metagenomic data is challenging because the data are complex and high-dimensional. In this paper, we introduce several approaches, based on dimensionality reduction algorithms and data density, to visualize features that reflect species abundance. The methods used in this study are unsupervised: they perform dimensionality reduction and map the data into a 2-dimensional space. Deep learning techniques are then applied to the resulting visualizations to improve prediction performance for colorectal cancer. Experiments on five metagenome-based colorectal cancer datasets from Chinese, Austrian, American, German, and French cohorts show that the proposed visualizations reveal biomedical signatures and improve prediction performance compared with classical machine learning.
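The mapping of abundance features into a 2-dimensional space can be illustrated with a minimal sketch. It is not the authors' implementation: a Gaussian random projection stands in for the manifold learning algorithms, and the toy abundance table, the function names (`embed_features_2d`, `to_pixels`, `sample_to_image`), and the 8 x 8 image size are all assumptions made for illustration.

```python
import random


def embed_features_2d(abundance, seed=0):
    """Map each feature (column) to a 2D point with a Gaussian random projection.

    abundance: list of samples, each a list of per-species abundances.
    Random projection is a stand-in here, not the paper's manifold method.
    """
    rng = random.Random(seed)
    n_samples = len(abundance)
    n_features = len(abundance[0])
    # One random direction in sample space for each of the two output axes.
    axes = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(2)]
    points = []
    for j in range(n_features):
        column = [abundance[i][j] for i in range(n_samples)]
        points.append(tuple(sum(a * v for a, v in zip(axis, column)) for axis in axes))
    return points


def to_pixels(points, size):
    """Rescale 2D points to integer pixel coordinates on a size x size grid."""
    per_axis = []
    for dim in range(2):
        values = [p[dim] for p in points]
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # avoid division by zero when all values coincide
        per_axis.append([min(size - 1, int((v - lo) / span * size)) for v in values])
    return list(zip(per_axis[0], per_axis[1]))


def sample_to_image(sample, pixels, size):
    """Render one sample: each pixel keeps the largest abundance mapped to it."""
    image = [[0.0] * size for _ in range(size)]
    for value, (x, y) in zip(sample, pixels):
        image[y][x] = max(image[y][x], value)
    return image


# Hypothetical toy table: 3 samples x 4 species, relative abundances.
abundance = [
    [0.50, 0.30, 0.20, 0.00],
    [0.10, 0.10, 0.40, 0.40],
    [0.25, 0.25, 0.25, 0.25],
]
pixels = to_pixels(embed_features_2d(abundance), size=8)
images = [sample_to_image(sample, pixels, size=8) for sample in abundance]
```

Each entry of `images` is a small grid in which pixel position comes from the feature embedding and pixel intensity from that sample's abundance; grids of this kind are what an image classifier such as a convolutional neural network could take as input.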

Author information

Corresponding author

Correspondence to Thanh Hai Nguyen.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Nguyen, T.H., Nguyen, TN. (2019). Disease Prediction Using Metagenomic Data Visualizations Based on Manifold Learning and Convolutional Neural Network. In: Dang, T., Küng, J., Takizawa, M., Bui, S. (eds) Future Data and Security Engineering. FDSE 2019. Lecture Notes in Computer Science, vol 11814. Springer, Cham. https://doi.org/10.1007/978-3-030-35653-8_9

  • DOI: https://doi.org/10.1007/978-3-030-35653-8_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-35652-1

  • Online ISBN: 978-3-030-35653-8

  • eBook Packages: Computer Science, Computer Science (R0)
