Abstract
A major goal of clinical proteomics is to identify protein biomarkers through mass spectral analysis of readily obtainable patient samples such as blood serum, urine, or cerebrospinal fluid. Such biomarkers could support early detection of disease and be examined further for potential therapeutic use. In this paper, we present the process by which we discovered biomarkers that indicate amyotrophic lateral sclerosis (ALS), a chronic neurodegenerative disease of motor neurons, by applying rule learning to proteomic mass spectra from cerebrospinal fluid samples. We have implemented a wrapper-based rule learning framework in which the very large number of features produced by mass spectral analysis of clinical samples can be evaluated by repeatedly invoking a rule learner. As this case study indicates, the framework facilitates evidence gathering and can speed up disease-specific biomarker discovery from clinical proteomic mass spectra.
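The wrapper idea in the abstract, scoring candidate feature subsets by repeatedly invoking a learner, can be illustrated with a minimal sketch. This is not the authors' framework or their rule learner: the threshold-rule learner, the greedy forward search, the function names (`learn_rule`, `cv_accuracy`, `wrapper_select`), and the toy intensity data are all assumptions made for illustration.

```python
# Toy wrapper-based feature selection around a very simple rule learner.
# Everything here is hypothetical: the learner, the search, and the data.

def learn_rule(X, y, f):
    """Learn one rule on feature f: predict 1 if sign * (x[f] - t) > 0."""
    values = sorted({row[f] for row in X})
    # Candidate thresholds: midpoints between consecutive distinct values.
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
    best = None
    for t in thresholds:
        for sign in (1, -1):
            correct = sum(
                (sign * (row[f] - t) > 0) == bool(label)
                for row, label in zip(X, y)
            )
            if best is None or correct > best[0]:
                best = (correct, f, t, sign)
    return best[1:]  # (feature, threshold, sign)


def predict(rules, row):
    """Majority vote over the rules; ties go to class 0."""
    votes = sum(1 for f, t, sign in rules if sign * (row[f] - t) > 0)
    return 1 if 2 * votes > len(rules) else 0


def cv_accuracy(X, y, features, folds=3):
    """Cross-validated accuracy of the rule learner on a feature subset."""
    correct = 0
    for k in range(folds):
        train = [i for i in range(len(X)) if i % folds != k]
        test = [i for i in range(len(X)) if i % folds == k]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        rules = [learn_rule(X_tr, y_tr, f) for f in features]
        correct += sum(predict(rules, X[i]) == y[i] for i in test)
    return correct / len(X)


def wrapper_select(X, y, max_features=3):
    """Greedy forward selection: each candidate subset is scored by
    re-running the rule learner under cross-validation."""
    selected, best_score = [], 0.0
    remaining = set(range(len(X[0])))
    while remaining and len(selected) < max_features:
        # Ties on score are broken in favour of the lowest feature index.
        score, _, f = max(
            (cv_accuracy(X, y, selected + [f]), -f, f) for f in remaining
        )
        if score <= best_score:
            break  # no candidate improves the current subset
        selected.append(f)
        remaining.remove(f)
        best_score = score
    return selected, best_score


# Made-up peak intensities: column 0 separates the classes, 1 and 2 do not.
X = [
    [0.90, 0.2, 0.5], [0.80, 0.7, 0.1], [0.85, 0.4, 0.9], [0.95, 0.6, 0.3],
    [0.10, 0.3, 0.6], [0.20, 0.8, 0.2], [0.15, 0.5, 0.8], [0.05, 0.1, 0.4],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]

selected, score = wrapper_select(X, y)
print(selected, score)
```

The design point is that the learner sits inside the evaluation loop, so the cost of selection grows with the number of candidate subsets. That is why the search strategy matters when spectra yield very many features.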
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gopalakrishnan, V., Ganchev, P., Ranganathan, S., Bowser, R. (2006). Rule Learning for Disease-Specific Biomarker Discovery from Clinical Proteomic Mass Spectra. In: Li, J., Yang, Q., Tan, AH. (eds) Data Mining for Biomedical Applications. BioDM 2006. Lecture Notes in Computer Science, vol 3916. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11691730_10
DOI: https://doi.org/10.1007/11691730_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33104-9
Online ISBN: 978-3-540-33105-6
eBook Packages: Computer Science, Computer Science (R0)