Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5547)

Abstract

Systems modeling and quantitative analysis of large amounts of complex clinical and biological data may help to identify discriminatory patterns that can uncover health risks, detect early disease formation, monitor treatment and prognosis, and predict treatment outcome. In this talk, we describe a machine-learning framework for classification in medicine and biology. It consists of a pattern recognition module, a feature selection module, and a classification modeler and solver. The pattern recognition module involves automatic image analysis, genomic pattern recognition, and spectrum pattern extraction. The feature selection module consists of a combinatorial selection algorithm in which discriminatory patterns are extracted from a large set of pattern attributes. These modules are wrapped around the classification modeler and solver to form a machine-learning framework. The classification modeler and solver consist of novel optimization-based predictive models that maximize correct classification while constraining inter-group misclassifications. The classification/predictive models 1) have the ability to classify any number of distinct groups; 2) allow incorporation of heterogeneous and continuous/time-dependent types of attributes as input; 3) utilize a high-dimensional data transformation that minimizes noise and errors in biological and clinical data; 4) incorporate a reserved-judgement region that provides a safeguard against over-training; and 5) have successive multi-stage classification capability. Successful applications of our model to developing rules for gene silencing in cancer cells, predicting the immunity of vaccines, identifying the cognitive status of individuals, and predicting metabolite concentrations in humans will be discussed. We acknowledge our clinical/biological collaborators: Dr. Vertino (Winship Cancer Institute, Emory), Drs. Pulendran and Ahmed (Emory Vaccine Center), Dr. Levey (Neurodegenerative Disease and Alzheimer’s Disease), and Dr. Jones (Clinical Biomarkers, Emory).
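
To make the reserved-judgement idea in item 4 concrete, the following is a minimal sketch and not the optimization model described in the talk. It assumes that each entity already has one numeric score per group; the scores, the function names `classify_with_reserve` and `summarize`, and the zero threshold are hypothetical choices made only for illustration. An entity is assigned to its highest-scoring group when that score exceeds the threshold, and is otherwise placed in the reserved-judgement region rather than being forced into a group.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical marker for "no group assigned" (the reserved-judgement region).
RESERVED = None


def classify_with_reserve(scores: Sequence[float], threshold: float = 0.0) -> Optional[int]:
    """Return the index of the best-scoring group, or RESERVED
    when no group's score exceeds the threshold."""
    if not scores:
        raise ValueError("scores must contain at least one group")
    best = max(range(len(scores)), key=lambda g: scores[g])
    return best if scores[best] > threshold else RESERVED


def summarize(labels: Sequence[int], predictions: Sequence[Optional[int]]) -> Dict[str, int]:
    """Count correctly classified, misclassified, and reserved entities."""
    counts = {"correct": 0, "misclassified": 0, "reserved": 0}
    for true_group, predicted in zip(labels, predictions):
        if predicted is RESERVED:
            counts["reserved"] += 1
        elif predicted == true_group:
            counts["correct"] += 1
        else:
            counts["misclassified"] += 1
    return counts


if __name__ == "__main__":
    # Toy scores for three entities and three groups (made-up numbers).
    toy_scores: List[List[float]] = [
        [0.7, -0.2, -0.5],
        [-0.1, -0.3, -0.2],
        [0.1, 0.4, -0.6],
    ]
    toy_labels = [0, 2, 0]
    predictions = [classify_with_reserve(s) for s in toy_scores]
    print(predictions)
    print(summarize(toy_labels, predictions))
```

In this toy setting, the second entity has no positive score and is therefore left unclassified, which trades a potential misclassification for a deferred decision. In the framework described above, the trade-off between correct classification and inter-group misclassification is set by the optimization model itself, not by a hand-picked threshold.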

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, E.K. (2009). Machine Learning Framework for Classification in Medicine and Biology. In: van Hoeve, WJ., Hooker, J.N. (eds) Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems. CPAIOR 2009. Lecture Notes in Computer Science, vol 5547. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01929-6_1

  • DOI: https://doi.org/10.1007/978-3-642-01929-6_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01928-9

  • Online ISBN: 978-3-642-01929-6

  • eBook Packages: Computer Science, Computer Science (R0)
