Deriving Biomedical Diagnostics from Spectroscopic Data

Conference paper


Biomedical spectroscopic experiments generate large volumes of data. To yield accurate, robust diagnostic tools, the data must be reduced to only a few characteristic observations per subject, and a large number of subjects must be studied. We describe some of the current mathematical methods applied to this problem: Principal Component Analysis, Partial Least Squares, and the Statistical Classification Strategy. We demonstrate these methods with three examples of their use in analyzing ¹H NMR spectra: screening for colon cancer, characterization of thyroid cancer, and distinguishing cancer from cholangitis in the biliary tract.
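
As an illustration of reducing each spectrum to a few characteristic values, the following is a minimal sketch of Principal Component Analysis on synthetic data. It is not the authors' implementation. The data, the function names (`make_spectra`, `center`, `first_principal_component`), the peak positions and the noise level are all assumptions made for this example. Power iteration is used here in place of a library PCA routine so that the sketch needs only the Python standard library.

```python
import math
import random


def make_spectra(rng, n_per_class=10, n_points=8):
    """Build synthetic 'spectra' (hypothetical data, not real NMR measurements).

    Class 1 differs from class 0 by an extra peak at points 3 and 4.
    """
    spectra, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 0.1) for _ in range(n_points)]
            if label == 1:
                row[3] += 2.0
                row[4] += 2.0
            spectra.append(row)
            labels.append(label)
    return spectra, labels


def center(rows):
    """Subtract the per-column mean so every spectral point has zero mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def first_principal_component(rows, rng, iterations=100):
    """Estimate the leading principal component of centered data.

    Uses power iteration on X^T X; the sign of the result is arbitrary.
    """
    dim = len(rows[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for _ in range(iterations):
        # scores = X v, then w = X^T scores, i.e. w = (X^T X) v
        scores = [sum(x * c for x, c in zip(row, v)) for row in rows]
        w = [sum(s * row[j] for s, row in zip(scores, rows)) for j in range(dim)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


rng = random.Random(0)
spectra, labels = make_spectra(rng)
centered = center(spectra)
pc1 = first_principal_component(centered, rng)

# One number per subject: the projection of its spectrum onto the first PC.
scores = [sum(x * c for x, c in zip(row, pc1)) for row in centered]
```

In this toy setting each eight-point spectrum is summarized by a single score, and the two synthetic classes fall on opposite sides of zero along that axis. Real spectra have thousands of points and far subtler class differences, so a practical analysis would retain several components and validate any classifier on independent subjects.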


Biomedical spectroscopy; Multivariate methods; Classifiers; PC; PCA; SIMCA; Cancer



Fisher’s linear discriminant
Fecal occult blood test
Nuclear magnetic resonance
Principal component
Principal component analysis
Principal component regression
Partial least squares
Primary sclerosing cholangitis
Statistical classification strategy
Soft independent modelling of class analogies
Weighted cross validated bootstrap



Copyright information

© Springer Science+Business Media B.V. 2011

Authors and Affiliations

  1. Institute for Biodiagnostics, National Research Council, Winnipeg, Canada
