A Parallel Consensus Clustering Algorithm

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 9432))

Abstract

Consensus clustering is a stability-based algorithm with predictive power far better than that of other internal measures. Unfortunately, the method is reported to be slow and hard to scale. We present a consensus clustering algorithm optimized for multi-core processors. We show that scalable performance can be obtained for this compute-intensive algorithm on high-dimensional data such as gene expression microarrays.

The research was financed by a Wroclaw University of Technology statutory grant.
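The authors' implementation is not part of this preview. As a rough illustration of why the method suits multi-core hardware, the sketch below follows the general consensus clustering recipe: draw many random subsamples, cluster each one, and record how often each pair of items ends up in the same cluster. The resamples are independent of one another, so they can be handed to separate workers. Everything here is an assumption made for illustration: the function names (`kmeans_labels`, `one_resample`, `consensus_matrix`), the use of plain k-means as the base clusterer, and the default parameters.

```python
import random
from itertools import combinations


def kmeans_labels(points, k, rng, n_iter=20):
    """Plain Lloyd's k-means (assumed base clusterer); one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; keep it if the cluster is empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def one_resample(task):
    """Cluster one random subsample; return its indices and co-clustered pairs."""
    data, k, fraction, seed = task
    rng = random.Random(seed)
    size = max(k, int(fraction * len(data)))
    idx = sorted(rng.sample(range(len(data)), size))
    labels = kmeans_labels([data[i] for i in idx], k, rng)
    pairs = [(idx[a], idx[b])
             for a, b in combinations(range(len(idx)), 2)
             if labels[a] == labels[b]]
    return idx, pairs


def consensus_matrix(data, k, n_resamples=50, fraction=0.8, seed=0, map_fn=map):
    """Fraction of resamples in which items i and j share a cluster,
    among the resamples that contain both of them."""
    n = len(data)
    sampled = [[0] * n for _ in range(n)]   # times i and j were drawn together
    together = [[0] * n for _ in range(n)]  # times i and j shared a cluster
    tasks = [(data, k, fraction, seed + h) for h in range(n_resamples)]
    # Each task is independent, so map_fn may be a parallel map.
    for idx, pairs in map_fn(one_resample, tasks):
        for i, j in combinations(idx, 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
        for i, j in pairs:
            together[i][j] += 1
            together[j][i] += 1
    return [
        [1.0 if i == j
         else (together[i][j] / sampled[i][j] if sampled[i][j] else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
```

By default the resamples run sequentially through the built-in `map`. To spread them over several cores, one option is to pass the `map` method of a `concurrent.futures.ProcessPoolExecutor` as `map_fn`, called from under an `if __name__ == "__main__":` guard. Because each task carries its own seed, the result should not depend on how the work is scheduled.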

Author information

Corresponding author

Correspondence to Olgierd Unold.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Unold, O., Tagowski, T. (2015). A Parallel Consensus Clustering Algorithm. In: Pardalos, P., Pavone, M., Farinella, G., Cutello, V. (eds) Machine Learning, Optimization, and Big Data. MOD 2015. Lecture Notes in Computer Science, vol 9432. Springer, Cham. https://doi.org/10.1007/978-3-319-27926-8_28

  • DOI: https://doi.org/10.1007/978-3-319-27926-8_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27925-1

  • Online ISBN: 978-3-319-27926-8

  • eBook Packages: Computer Science, Computer Science (R0)
