Abstract
Microarray technology makes it possible to measure the expression profiles of thousands of genes simultaneously, which can help identify subgroups of a specific disease or reveal hidden relationships between genes. One computational method often used to this end is clustering. In this paper, we propose a parallel distributed system for gene expression profiling (PDS-GEF), which provides a useful basis for individualized treatment of diseases such as cancer. The proposed approach is based on two major techniques: the Generalized Island Model (GIM) and clustering ensembles. GIMs are used to generate good-quality clusterings, which a consensus function then refines into a single high-quality clustering. The PDS-GEF system is implemented using MATLAB®'s Parallel Computing Toolbox™ (PCT), runs on a desktop computer, and was tested on 34 publicly available gene expression data sets. The results obtained compete with, and even outperform, existing methods.
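To make the role of a consensus function concrete, the following is a minimal Python sketch of one common choice, evidence accumulation over a co-association matrix. It is an illustration only: the abstract does not state which consensus function PDS-GEF uses, the sketch does not cover the island-model stage that produces the base clusterings, and the names `co_association`, `consensus`, and `threshold` are hypothetical.

```python
from itertools import combinations


def co_association(partitions):
    """Return the fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    m = len(partitions)
    co = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    co[i][j] += 1.0 / m
    return co


def consensus(partitions, threshold=0.5):
    """Assumed consensus rule: link items co-clustered in more than
    `threshold` of the partitions, then take connected components."""
    n = len(partitions[0])
    co = co_association(partitions)
    parent = list(range(n))  # union-find forest over the items

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three base clusterings of six samples; label names differ between runs.
base = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 2, 2],
]
print(consensus(base))
```

Because the co-association matrix only records whether two samples share a cluster, the base clusterings may use different label names and even different numbers of clusters.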
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Benmounah, Z., Batouche, M. (2013). A Parallel Distributed System for Gene Expression Profiling Based on Clustering Ensemble and Distributed Optimization. In: Kołodziej, J., Di Martino, B., Talia, D., Xiong, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2013. Lecture Notes in Computer Science, vol 8285. Springer, Cham. https://doi.org/10.1007/978-3-319-03859-9_14
DOI: https://doi.org/10.1007/978-3-319-03859-9_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03858-2
Online ISBN: 978-3-319-03859-9
eBook Packages: Computer Science, Computer Science (R0)