Abstract
Systems modeling and quantitative analysis of large amounts of complex clinical and biological data may help to identify discriminatory patterns that can uncover health risks, detect early disease formation, monitor treatment and prognosis, and predict treatment outcome. In this talk, we describe a machine-learning framework for classification in medicine and biology. It consists of a pattern recognition module, a feature selection module, and a classification modeler and solver. The pattern recognition module involves automatic image analysis, genomic pattern recognition, and spectrum pattern extraction. The feature selection module consists of a combinatorial selection algorithm that extracts discriminatory patterns from a large set of pattern attributes. These two modules are wrapped around the classification modeler and solver to form the machine-learning framework. The classification modeler and solver consist of novel optimization-based predictive models that maximize the number of correct classifications while constraining inter-group misclassifications. The classification/predictive models 1) can classify any number of distinct groups; 2) allow incorporation of heterogeneous and continuous/time-dependent types of attributes as input; 3) utilize a high-dimensional data transformation that minimizes noise and errors in biological and clinical data; 4) incorporate a reserved-judgement region that provides a safeguard against over-training; and 5) have successive multi-stage classification capability. Successful applications of our model to developing rules for gene silencing in cancer cells, predicting the immunity of vaccines, identifying the cognitive status of individuals, and predicting metabolite concentrations in humans will be discussed. We acknowledge our clinical/biological collaborators: Dr. Vertino (Winship Cancer Institute, Emory), Drs. Pulendran and Ahmed (Emory Vaccine Center), Dr. Levey (Neurodegenerative Disease and Alzheimer’s Disease), and Dr. Jones (Clinical Biomarkers, Emory).
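To make the idea of a reserved-judgement region concrete, the following is a minimal illustrative sketch, not the optimization-based model described in the abstract. It assumes that some upstream classifier has already produced a score per group, and it withholds a decision when the best score is too low or too close to the runner-up. The function name `classify_with_reserve` and the thresholds `min_score` and `min_margin` are hypothetical and chosen only for illustration.

```python
from typing import Dict, Optional


def classify_with_reserve(
    scores: Dict[str, float],
    min_score: float = 0.6,    # hypothetical threshold, not from the paper
    min_margin: float = 0.1,   # hypothetical threshold, not from the paper
) -> Optional[str]:
    """Return the top-scoring group, or None to signal reserved judgement.

    A decision is withheld when the best score is below `min_score` or when
    it leads the second-best score by less than `min_margin`.
    """
    if not scores:
        raise ValueError("scores must contain at least one group")

    # Rank groups from highest to lowest score.
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    best_group, best = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else float("-inf")

    # Weak or ambiguous evidence falls into the reserved-judgement region.
    if best < min_score or best - runner_up < min_margin:
        return None
    return best_group
```

Calling `classify_with_reserve({"responder": 0.85, "non_responder": 0.15})` assigns the sample to `"responder"`, while a near tie such as `{"a": 0.52, "b": 0.48}` returns `None`, so the sample is left unclassified rather than forced into a group. In the framework described above, the region is instead determined by the optimization model itself; fixed thresholds are used here only to keep the sketch short.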
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, E.K. (2009). Machine Learning Framework for Classification in Medicine and Biology. In: van Hoeve, WJ., Hooker, J.N. (eds) Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems. CPAIOR 2009. Lecture Notes in Computer Science, vol 5547. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01929-6_1
DOI: https://doi.org/10.1007/978-3-642-01929-6_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01928-9
Online ISBN: 978-3-642-01929-6
eBook Packages: Computer Science (R0)