A Parallel Distributed System for Gene Expression Profiling Based on Clustering Ensemble and Distributed Optimization

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8285)

Abstract

Microarray technology makes it possible to measure the expression profiles of thousands of genes simultaneously, which can help identify subgroups of a specific disease or uncover hidden relationships between genes. Clustering is a computational method often used to this end. In this paper, we propose a parallel distributed system for gene expression profiling (PDS-GEF) that provides a useful basis for individualized treatment of diseases such as cancer. The proposed approach is based on two major techniques: the Generalized Island Model (GIM) and clustering ensembles. GIMs are used to generate good-quality clusterings, which are then refined by a consensus function to obtain a high-quality clustering. The PDS-GEF system is implemented using Matlab®'s Parallel Computing Toolbox™ (PCT), runs on a desktop computer, and was tested on 34 different publicly available gene expression data sets. The obtained results are competitive with, and even outperform, existing methods.
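To make the idea of a consensus function concrete, the following is a minimal sketch in Python of one common style of clustering ensemble, based on a co-association matrix. It is not the consensus function used in PDS-GEF, which the abstract does not specify. The function names, the 0.5 threshold, and the toy base clusterings (standing in for the clusterings that the islands would produce) are all assumptions made for illustration.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of base clusterings in which each pair of items shares a cluster."""
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus(partitions, threshold=0.5):
    """Group items whose co-association exceeds `threshold` (an assumed rule)."""
    n = len(partitions[0])
    ca = co_association(partitions)
    parent = list(range(n))  # union-find forest over the items

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if ca[i][j] > threshold:
            parent[find(j)] = find(i)

    # Relabel the roots as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Hypothetical base clusterings of six samples.
base = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same grouping, different label names
    [0, 0, 1, 1, 2, 2],  # a noisier clustering
]
print(consensus(base))  # → [0, 0, 0, 1, 1, 1]
```

Working on pairwise co-occurrence rather than raw labels means the base clusterings do not need matching label names or even the same number of clusters, which is why the second and third clusterings above can be combined with the first.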

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Benmounah, Z., Batouche, M. (2013). A Parallel Distributed System for Gene Expression Profiling Based on Clustering Ensemble and Distributed Optimization. In: Kołodziej, J., Di Martino, B., Talia, D., Xiong, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2013. Lecture Notes in Computer Science, vol 8285. Springer, Cham. https://doi.org/10.1007/978-3-319-03859-9_14

  • DOI: https://doi.org/10.1007/978-3-319-03859-9_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-03858-2

  • Online ISBN: 978-3-319-03859-9

  • eBook Packages: Computer Science, Computer Science (R0)
