Abstract
Over the last few years, many studies have investigated the involvement of microRNAs (miRNAs) in the pathways of different types of cancer. This research shows that miRNA expression profiles help distinguish cancerous tissue from normal tissue, and different subtypes of cancer from one another. In this article, miRNA expression data of different cancer types are analyzed using a novel multiobjective genetic algorithm-based feature selection method to find a reduced, non-redundant set of miRNA markers. Three objectives, viz. classification accuracy, a cluster validity index called the Davies–Bouldin (DB) index, and the number of miRNAs encoded in a chromosome of the genetic algorithm, are optimized simultaneously. The classification accuracy is maximized to obtain the most relevant set of miRNAs. The DB index is optimized for clustering the miRNAs and choosing a representative miRNA from each cluster, in order to obtain a non-redundant set of miRNA markers. Finally, the number of miRNAs is minimized to yield a reduced set of selected miRNAs. The performance of the proposed genetic algorithm-based method is compared with that of existing feature selection techniques. The proposed technique is found to perform better than the other methods with respect to most of the performance metrics. Lastly, the obtained miRNA markers are reported together with their associated diseases and numbers of target mRNAs.
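The three objectives described above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: a chromosome is taken to be a list of indices of representative miRNAs, every miRNA is assigned to its nearest representative by Euclidean distance between expression profiles, a leave-one-out 1-nearest-neighbour classifier stands in for whatever classifier the paper uses, and a lower DB index is treated as better. The function names and the toy expression matrix are hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def davies_bouldin(profiles, representatives):
    """DB index of the miRNA clustering induced by the representatives.

    Assumes at least two representatives with distinct profiles.
    """
    # Assign every miRNA (row of the expression matrix) to its nearest representative.
    clusters = {r: [] for r in representatives}
    for i, p in enumerate(profiles):
        nearest = min(representatives, key=lambda r: euclidean(p, profiles[r]))
        clusters[nearest].append(i)
    dim = len(profiles[0])
    centroids, scatter = {}, {}
    for r, members in clusters.items():
        c = [sum(profiles[m][d] for m in members) / len(members) for d in range(dim)]
        centroids[r] = c
        # Within-cluster scatter: mean distance of members to the centroid.
        scatter[r] = sum(euclidean(profiles[m], c) for m in members) / len(members)
    total = 0.0
    for r in representatives:
        # Worst-case ratio of combined scatter to centroid separation.
        total += max(
            (scatter[r] + scatter[s]) / euclidean(centroids[r], centroids[s])
            for s in representatives if s != r
        )
    return total / len(representatives)

def loo_accuracy(profiles, labels, selected):
    """Leave-one-out 1-NN accuracy using only the selected miRNAs (a stand-in classifier)."""
    n = len(labels)
    samples = [[profiles[i][j] for i in selected] for j in range(n)]
    correct = 0
    for j in range(n):
        nearest = min(
            (k for k in range(n) if k != j),
            key=lambda k: euclidean(samples[j], samples[k]),
        )
        correct += labels[nearest] == labels[j]
    return correct / n

def objectives(profiles, labels, chromosome):
    """Return (accuracy to maximize, DB index to minimize, miRNA count to minimize)."""
    return (
        loo_accuracy(profiles, labels, chromosome),
        davies_bouldin(profiles, chromosome),
        len(chromosome),
    )

# Hypothetical toy data: rows are miRNAs, columns are samples.
profiles = [
    [1.0, 1.2, 5.0, 5.1],  # miRNA 0: separates the two classes
    [1.1, 1.0, 5.2, 5.0],  # miRNA 1: redundant with miRNA 0
    [3.0, 3.1, 3.0, 2.9],  # miRNA 2: uninformative
    [2.9, 3.0, 3.1, 3.0],  # miRNA 3: redundant with miRNA 2
]
labels = [0, 0, 1, 1]  # class of each sample (e.g. normal vs. tumour)

acc, db, count = objectives(profiles, labels, [0, 2])
print(acc, db, count)
```

In a multiobjective genetic algorithm, such objective vectors would be compared by Pareto dominance rather than combined into a single score, so a chromosome with fewer miRNAs and one with higher accuracy can both survive in the final non-dominated set.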
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Mandal, M., Mukhopadhyay, A., Maulik, U. (2018). A Genetic Algorithm-Based Clustering Approach for Selecting Non-redundant MicroRNA Markers from Microarray Expression Data. In: Kar, S., Maulik, U., Li, X. (eds) Operations Research and Optimization. FOTA 2016. Springer Proceedings in Mathematics & Statistics, vol 225. Springer, Singapore. https://doi.org/10.1007/978-981-10-7814-9_12
DOI: https://doi.org/10.1007/978-981-10-7814-9_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7813-2
Online ISBN: 978-981-10-7814-9
eBook Packages: Mathematics and Statistics (R0)