Classification of High-Resolution NMR Spectra Based on Complex Wavelet Domain Feature Selection and Kernel-Induced Random Forest

  • Guangzhe Fan
  • Zhou Wang
  • Seoung Bum Kim
  • Chivalai Temiyasathit
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6134)


High-resolution nuclear magnetic resonance (NMR) spectra contain important biomarkers that have potential for the early diagnosis of disease and the subsequent monitoring of its progression. Traditional feature extraction and analysis methods operate in the original frequency spectrum domain. In this study, we conduct feature selection based on a complex wavelet transform, making use of its energy shift-insensitive property in a multiresolution signal decomposition. A false discovery rate based multiple testing procedure is employed to identify important metabolite features. Furthermore, a novel kernel-induced random forest algorithm is used to classify the NMR spectra based on the selected features. Our experiments with real NMR spectra showed that the proposed method leads to a significant reduction in the misclassification rate.
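The false discovery rate step mentioned in the abstract can be illustrated with a minimal sketch of the Benjamini-Hochberg step-up procedure. This is only an illustration under stated assumptions: the abstract does not say which test statistic is computed per wavelet coefficient or which FDR level is used, so the function name `benjamini_hochberg`, the level `q=0.05`, and the p-values below are hypothetical and are not taken from the paper.

```python
import numpy as np


def benjamini_hochberg(p_values, q=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level q.

    Hypothetical helper: one p-value per candidate feature
    (for example, one per wavelet coefficient).
    """
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                      # indices that sort p ascending
    ranked = p[order]
    thresholds = q * np.arange(1, m + 1) / m   # step-up bounds i * q / m
    below = ranked <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])       # largest rank meeting its bound
        rejected[order[:k + 1]] = True         # reject that rank and all smaller p
    return rejected


# Made-up p-values, one per candidate feature (illustration only).
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
selected = benjamini_hochberg(p_vals, q=0.05)
print(np.nonzero(selected)[0])  # indices of the features kept for classification
```

Features whose mask entry is true would then be passed on to the classifier; how the paper combines this step with the complex wavelet coefficients is described in the full text rather than in the abstract.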


High-resolution NMR spectrum; Metabolomics; Classification tree; Random forest; Complex wavelet transforms; False discovery rate; Kernel



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Guangzhe Fan (1)
  • Zhou Wang (2)
  • Seoung Bum Kim (3)
  • Chivalai Temiyasathit (4)
  1. Dept. of Statistics & Actuarial Science, University of Waterloo, Waterloo, Canada
  2. Dept. of Electrical & Computer Engineering, University of Waterloo, Waterloo, Canada
  3. Dept. of Industrial Systems & Information Engineering, Korea University, Seoul, Korea
  4. International College, King Mongkut’s Inst. of Technology Ladkrabang, Bangkok, Thailand
