Abstract
Incorporating pathway data into microarray analysis has led to a new era in the understanding of biological processes. However, this advance is limited by two issues in the quality of pathway data. First, pathway data are usually compiled without regard to biological context, so for a specific cellular process (e.g. lung cancer development) only a few genes within a pathway may actually be responsible for that process. Second, pathway data are commonly curated from the literature, so some pathways may include uninformative genes while excluding informative ones. In this paper, we propose a hybrid of the support vector machine and smoothly clipped absolute deviation with group-specific tuning parameters (gSVM-SCAD) to select informative genes within pathways before the pathway evaluation process. Our experiments on lung cancer and gender data sets show that gSVM-SCAD obtains significant results in classification accuracy and in selecting informative genes and pathways.
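To make the idea of a group-specific tuning parameter concrete, the following is a minimal sketch, not the authors' implementation. It evaluates the standard SCAD penalty of Fan and Li (2001) and a hinge-loss-plus-SCAD objective for a linear SVM, with each gene's tuning parameter looked up from the pathway (group) it belongs to. The function names, the `group_of_gene` and `lam_by_group` arrays, and the default `a = 3.7` are assumptions chosen for illustration; the paper's own choice of tuning parameters and its optimisation procedure are not reproduced here.

```python
import numpy as np

def scad_penalty(w, lam, a=3.7):
    """Elementwise SCAD penalty (Fan and Li, 2001).

    w   : array of coefficients
    lam : scalar or array of tuning parameters (one per coefficient)
    a   : shape parameter, must be > 2 (3.7 is the commonly used default)
    """
    w = np.abs(np.asarray(w, dtype=float))
    lam = np.broadcast_to(np.asarray(lam, dtype=float), w.shape)
    linear = lam * w                                   # |w| <= lam
    quadratic = -(w**2 - 2 * a * lam * w + lam**2) / (2 * (a - 1))  # lam < |w| <= a*lam
    constant = (a + 1) * lam**2 / 2                    # |w| > a*lam
    return np.where(w <= lam, linear,
                    np.where(w <= a * lam, quadratic, constant))

def group_svm_scad_objective(w, b, X, y, group_of_gene, lam_by_group, a=3.7):
    """Mean hinge loss plus SCAD penalty with one lambda per group.

    group_of_gene : integer array, group index of each gene (hypothetical encoding)
    lam_by_group  : array of tuning parameters, one per group (hypothetical values)
    """
    lam_per_gene = np.asarray(lam_by_group, dtype=float)[np.asarray(group_of_gene)]
    margins = y * (X @ w + b)
    hinge = np.maximum(0.0, 1.0 - margins).mean()
    return hinge + scad_penalty(w, lam_per_gene, a).sum()

# Toy usage: four genes in two groups, with a stricter penalty on group 1.
X = np.array([[1.0, 0.0, 0.5, 0.0],
              [-1.0, 0.0, -0.5, 0.0]])
y = np.array([1.0, -1.0])
w = np.array([2.0, 0.0, 0.5, 0.0])
value = group_svm_scad_objective(w, 0.0, X, y,
                                 group_of_gene=[0, 0, 1, 1],
                                 lam_by_group=[1.0, 2.0])
```

Because the penalty is flat beyond `a * lam`, large coefficients are not shrunk further, while coefficients below `lam` are penalised linearly as in the lasso. Giving each group its own `lam` lets the amount of shrinkage differ between groups.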
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Misman, M.F., Mohamad, M.S., Deris, S., Mohamad, R.N.M.R., Hashim, S.Z.M., Omatu, S. (2012). A Hybrid of SVM and SCAD with Group-Specific Tuning Parameter for Pathway-Based Microarray Analysis. In: Omatu, S., De Paz Santana, J., González, S., Molina, J., Bernardos, A., Rodríguez, J. (eds) Distributed Computing and Artificial Intelligence. Advances in Intelligent and Soft Computing, vol 151. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28765-7_46
DOI: https://doi.org/10.1007/978-3-642-28765-7_46
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28764-0
Online ISBN: 978-3-642-28765-7
eBook Packages: Engineering, Engineering (R0)