IT in Biology & Medical Informatics: On the Challenge of Understanding the Data Ecosystem

  • Conference paper
Information Technology in Bio- and Medical Informatics (ITBAM 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10443)

Abstract

Data-intensive disciplines such as the life sciences and medicine are driving vigorous research activity in data science. Modern technologies, such as high-throughput mass spectrometry and sequencing, microarrays, and high-resolution imaging, produce enormous and continuously growing amounts of data. Large public databases provide access to aggregated and consolidated data on genome and protein sequences, biological pathways, diseases, anatomy atlases, and the scientific literature. Never before has so much data been potentially available for studying biomedical systems, from single cells to complete organisms. However, transforming this vast amount of biomedical data into actionable, useful, and usable information that triggers scientific progress and supports patient management is a non-trivial task.

Acknowledgments

We are grateful for the great support of the ITBAM program committee, and particularly for the excellent work of Gabriela Wagner.

Author information

Corresponding author

Correspondence to Andreas Holzinger.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Holzinger, A., Bursa, M., Khuri, S., Renda, M.E. (2017). IT in Biology & Medical Informatics: On the Challenge of Understanding the Data Ecosystem. In: Bursa, M., Holzinger, A., Renda, M., Khuri, S. (eds) Information Technology in Bio- and Medical Informatics. ITBAM 2017. Lecture Notes in Computer Science, vol 10443. Springer, Cham. https://doi.org/10.1007/978-3-319-64265-9_1

  • DOI: https://doi.org/10.1007/978-3-319-64265-9_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-64264-2

  • Online ISBN: 978-3-319-64265-9

  • eBook Packages: Computer Science, Computer Science (R0)
