Performance Analysis of Non-negative Matrix Factorization Methods on TCGA Data
Non-negative Matrix Factorization (NMF) is recognized as one of the fundamental and most popular methods for clustering and feature selection, and many related methods have been proposed. Nevertheless, their performance, especially on real data, remains unclear because few studies have compared them. This study assesses several representative methods for clustering and feature selection, including NMF, GNMF, MD-NMF, L2,1NMF, LNMF, Convex-NMF and Semi-NMF, on data from The Cancer Genome Atlas (TCGA), a current research hotspot in bioinformatics. Specifically, three data types from four cancers are decomposed, either separately or jointly, into coefficient matrices and basis matrices by these NMF methods. The coefficient matrices are evaluated by the accuracy of the resulting sample clusters, and the basis matrices are assessed by the p-values of the selected genes. The experimental results not only show the merits and limitations of the compared NMF methods, which may provide guidelines for applying them and for proposing novel NMF methods, but also reveal several clues for the exploration of the related cancers.
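To make the decomposition described above concrete, the following is a minimal sketch of basic NMF with multiplicative updates for the Frobenius objective. It is an illustration only, not the study's pipeline: the genes-by-samples orientation of X, the synthetic random data, the function name nmf_multiplicative, the argmax rule for assigning samples to clusters, and the row-sum rule for ranking genes are all assumptions made for this example.

```python
import numpy as np

def nmf_multiplicative(X, k, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix X (genes x samples) as X ~ W @ H.

    W (genes x k) plays the role of the basis matrix and
    H (k x samples) the role of the coefficient matrix.
    Hypothetical helper written for this illustration.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for an expression matrix (assumption: 50 genes, 12 samples).
rng = np.random.default_rng(1)
X = rng.random((50, 12))

W, H = nmf_multiplicative(X, k=3)

# Assumed clustering rule: each sample joins the component with the largest coefficient.
sample_clusters = H.argmax(axis=0)

# Assumed feature-selection rule: rank genes by their total weight in the basis matrix.
top_genes = np.argsort(-W.sum(axis=1))[:5]
```

In a real evaluation, sample_clusters would be compared against known sample labels to obtain an accuracy, and the selected genes would be scored separately; neither step is shown here.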
Keywords: Non-negative Matrix Factorization; Clustering; Genomic data; Dimensionality reduction
This work was supported in part by the NSFC under grant Nos. 61572284, 61502272 and 61702299.