Abstract
In this paper, a novel feature selection technique based on mutual dependency modelling between genes is proposed for multiclass microarray gene expression classification. Several studies analysing gene expression data have shown that genes (whether or not they belong to the same gene group) are co-expressed via a variety of pathways. Further, a gene may participate in multiple pathways that may or may not be co-active for all samples. It is therefore biologically meaningful to simultaneously divide genes into functional groups and samples into co-active categories. This can be done by modelling gene profiles for multiclass microarray gene data sets with mutual dependency models, which capture complex gene interactions. Most current work in multiclass microarray gene expression studies is based on statistical models with little or no consideration of gene interactions. This has led to a lack of robustness and to overly optimistic estimates of accuracy and noise reduction. In this paper, we propose multivariate analysis techniques which model the mutual dependency between the features and take complex interactions into account when extracting a subset of genes. The two techniques, cross-modal factor analysis (CFA) and canonical correlation analysis (CCA), show a significant reduction in dimensionality and class-prediction error, and an improvement in classification accuracy, for multiclass microarray gene expression datasets.
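As a rough illustration of the two techniques named in the abstract, the sketch below computes CCA and CFA projections with NumPy. It is not the authors' implementation. The abstract does not say how the two variable sets are formed, so `X` and `Y` are assumed here to be two blocks of expression features measured on the same samples. The function names `cca` and `cfa` and the ridge term `reg` are hypothetical choices made for this example. CCA is computed by whitening each block and taking an SVD of the whitened cross-covariance. CFA is taken in its usual formulation: orthonormal transforms that minimise the Frobenius distance between the projected blocks, obtained from an SVD of the cross-product matrix.

```python
import numpy as np


def _inv_sqrt(S):
    """Inverse matrix square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T


def cca(X, Y, reg=1e-6):
    """Canonical correlation analysis (sketch).

    X: (n_samples, p) and Y: (n_samples, q) feature blocks.
    reg: small ridge term (an assumption of this sketch) that keeps the
         covariance matrices invertible when features outnumber samples.
    Returns projection matrices A (p, k), B (q, k) and the k singular
    values, which approximate the canonical correlations.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    Kx, Ky = _inv_sqrt(Sxx), _inv_sqrt(Syy)
    # SVD of the whitened cross-covariance gives the canonical directions.
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky, full_matrices=False)
    return Kx @ U, Ky @ Vt.T, s


def cfa(X, Y):
    """Cross-modal factor analysis (sketch).

    Finds orthonormal A, B minimising ||Xc A - Yc B||_F, which is
    solved by the singular vectors of Xc^T Yc.
    Returns A (p, k), B (q, k) and the singular values.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    return U, Vt.T, s
```

In both cases the centred blocks can be projected onto the leading columns of the returned matrices to obtain a lower-dimensional representation, which could then be passed to any multiclass classifier.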
Keywords
- Classification Accuracy
- Canonical Correlation Analysis
- Mutual Dependency
- Class Accuracy
- Feature Selection Technique
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chetty, G., Chetty, M. (2009). Multiclass Microarray Gene Expression Analysis Based on Mutual Dependency Models. In: Kadirkamanathan, V., Sanguinetti, G., Girolami, M., Niranjan, M., Noirel, J. (eds) Pattern Recognition in Bioinformatics. PRIB 2009. Lecture Notes in Computer Science, vol 5780. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04031-3_5
DOI: https://doi.org/10.1007/978-3-642-04031-3_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04030-6
Online ISBN: 978-3-642-04031-3
eBook Packages: Computer Science, Computer Science (R0)