Selecting Features with Group-Sparse Nonnegative Supervised Canonical Correlation Analysis: Multimodal Prostate Cancer Prognosis

  • Haibo Wang
  • Asha Singanamalli
  • Shoshana Ginsburg
  • Anant Madabhushi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8675)


This paper presents Group-sparse Nonnegative supervised Canonical Correlation Analysis (GNCCA), a novel methodology for identifying discriminative features from multiple feature views. Existing correlation-based methods do not guarantee positive correlations of the selected features and often need a preliminary feature selection step to reduce redundant features in each feature view. GNCCA attempts to overcome these issues by incorporating (1) a nonnegativity constraint that guarantees positive correlations in the reduced representation and (2) a group-sparsity constraint that allows simultaneous between-view and within-view feature selection. In particular, GNCCA is designed to emphasize correlations between feature views and class labels so that the selected features guarantee better class separability. GNCCA was evaluated on three prostate cancer (CaP) prognosis tasks: (i) distinguishing CaP patients with and without 5-year biochemical recurrence following radical prostatectomy, in a cohort of 40 patients, by fusing quantitative features extracted from digitized pathology and proteomics; (ii) predicting in vivo prostate cancer grade for 16 CaP patients by fusing T2w and DCE MRI; and (iii) localizing CaP and benign regions on MR spectroscopy and MRI for 36 patients. For the three tasks, GNCCA identifies feature subsets comprising 2%, 1% and 22%, respectively, of the originally extracted features. With the same Support Vector Machine (SVM) classifier, these selected features achieve improved or comparable results relative to using all features. In addition, GNCCA consistently outperforms 5 state-of-the-art feature selection methods across all three datasets.
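
To make the two constraints named in the abstract concrete, the following is a minimal sketch of one generic way to combine nonnegativity with group sparsity: clip a weight vector to be nonnegative, then apply group-wise soft-thresholding so that weak groups are removed entirely. It is not the authors' optimization procedure, which the abstract does not describe. The function name `group_sparse_nonneg_prox`, the `groups` layout, and the `lam` parameter are hypothetical and chosen only for illustration.

```python
import math


def group_sparse_nonneg_prox(weights, groups, lam):
    """Clip weights to be nonnegative, then shrink each group toward zero.

    weights: list of floats, one loading per feature.
    groups:  list of index lists; each inner list is one feature group
             (hypothetical layout, e.g. one group per feature family).
    lam:     shrinkage strength (hypothetical parameter); groups whose
             clipped Euclidean norm is <= lam are zeroed out completely.
    """
    # Nonnegativity: negative loadings are set to zero.
    clipped = [max(w, 0.0) for w in weights]
    out = [0.0] * len(weights)
    for idx in groups:
        norm = math.sqrt(sum(clipped[i] ** 2 for i in idx))
        if norm <= lam:
            continue  # the whole group is discarded (group sparsity)
        scale = 1.0 - lam / norm  # shrink the surviving group uniformly
        for i in idx:
            out[i] = scale * clipped[i]
    return out


# Toy example with made-up numbers: six features in three groups.
weights = [3.0, 4.0, -1.0, 0.2, 0.1, -0.3]
groups = [[0, 1], [2, 3], [4, 5]]
shrunk = group_sparse_nonneg_prox(weights, groups, lam=1.0)

# Indices of features that keep a positive loading after shrinkage.
selected = [i for i, w in enumerate(shrunk) if w > 0]
print(selected)
```

In this toy run only the first group survives, so feature selection here amounts to keeping the features whose loadings stay strictly positive. A full method would embed such a step inside an iterative solver for the correlation objective, which is beyond what the abstract specifies.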


Feature Selection; Canonical Correlation Analysis; Feature Selection Method; Nonnegative Matrix Factorization; Linear Support Vector Machine



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Haibo Wang (1)
  • Asha Singanamalli (1)
  • Shoshana Ginsburg (1)
  • Anant Madabhushi (1)

  1. Department of Biomedical Engineering, Case Western Reserve University, USA
