Biomedical Informatics

  • Living reference work entry
Encyclopedia of Machine Learning and Data Mining

Introduction

Recent years have witnessed a tremendous increase in the use of machine learning for biomedical applications. This surge in interest has several causes. One is the successful application of machine learning technologies in other fields such as web search, speech and handwriting recognition, agent design, and spatial modeling. Another is the development of technologies that produce large amounts of data in the time it used to take to generate a single data point (that is, to run a single experiment). A third, more recent development is the advent of electronic medical/health records (EMRs/EHRs). The drastic increase in the amount of data generated has led biologists and clinical researchers to adopt algorithms that can construct predictive models from large amounts of data. Naturally, machine learning is emerging as a tool of choice.

In this entry, we will present a few data types and tasks involving such large-scale biological data, where machine learning...

Author information

Correspondence to C. David Page.

Copyright information

© 2014 Springer Science+Business Media New York

About this entry

Cite this entry

Page, C.D., Natarajan, S. (2014). Biomedical Informatics. In: Sammut, C., Webb, G. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7502-7_30-1

  • DOI: https://doi.org/10.1007/978-1-4899-7502-7_30-1

  • Publisher Name: Springer, Boston, MA

  • Online ISBN: 978-1-4899-7502-7

  • eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering
