A Parallel Consensus Clustering Algorithm

Unold, Olgierd; Tagowski, Tadeusz

doi:10.1007/978-3-319-27926-8_28

Conference paper
First Online: 06 January 2016

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9432)

Abstract

Consensus clustering is a stability-based algorithm with predictive power far better than that of other internal measures. Unfortunately, the method is reported to be slow and difficult to scale. We present a consensus clustering algorithm optimized for multi-core processors. We show that scalable performance can be obtained for this compute-intensive algorithm on high-dimensional data such as gene expression microarrays.
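To make the idea concrete, the following is a minimal sketch of resampling-based consensus clustering with the resampling runs distributed over worker processes. It is not the authors' implementation: the function names (`kmeans_labels`, `one_resample`, `consensus_matrix`), the choice of k-means as the base clusterer, the 80% subsampling fraction, and the use of Python's `ProcessPoolExecutor` are all assumptions made for illustration. Each entry of the returned matrix is the fraction of resamples in which two items were clustered together, out of the resamples in which both were drawn.

```python
import random
from concurrent.futures import ProcessPoolExecutor
from itertools import combinations


def kmeans_labels(points, k, rng, iters=20):
    """Plain Lloyd's k-means on a list of tuples; returns one label per point.

    Assumption: k-means is used as the base clusterer for illustration only.
    """
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def one_resample(args):
    """Cluster one random subsample; returns (sampled indices, labels)."""
    data, k, fraction, seed = args
    rng = random.Random(seed)
    size = min(len(data), max(k, int(fraction * len(data))))
    idx = sorted(rng.sample(range(len(data)), size))
    return idx, kmeans_labels([data[i] for i in idx], k, rng)


def consensus_matrix(data, k, n_resamples=50, fraction=0.8, workers=1, seed=0):
    """Build the consensus matrix; resamples run in parallel when workers > 1."""
    n = len(data)
    tasks = [(data, k, fraction, seed + r) for r in range(n_resamples)]
    if workers > 1:
        # The resamples are independent, so they can be farmed out to processes.
        with ProcessPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(one_resample, tasks))
    else:
        results = [one_resample(t) for t in tasks]

    together = [[0] * n for _ in range(n)]  # times i and j shared a cluster
    sampled = [[0] * n for _ in range(n)]   # times i and j were both drawn
    for idx, labels in results:
        for (i, li), (j, lj) in combinations(zip(idx, labels), 2):
            sampled[i][j] += 1
            sampled[j][i] += 1
            if li == lj:
                together[i][j] += 1
                together[j][i] += 1

    # Pairs never drawn together get 0.0; the diagonal is 1.0 by convention.
    return [
        [
            1.0 if i == j else (together[i][j] / sampled[i][j] if sampled[i][j] else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
```

A call such as `consensus_matrix(data, k=2, workers=4)` would run the resamples on four processes; on platforms that spawn rather than fork worker processes, such a call belongs under an `if __name__ == "__main__":` guard. In this sketch only the resampling and clustering step is parallel, while the accumulation of the matrix remains serial.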

This research was financed by a Wroclaw University of Technology statutory grant.

Author information

Authors and Affiliations

Department of Computer Engineering, Faculty of Electronics, Wroclaw University of Technology, Wyb. Wyspianskiego 25, 50-370, Wroclaw, Poland
Olgierd Unold & Tadeusz Tagowski

Corresponding author

Correspondence to Olgierd Unold.

Editor information

Editors and Affiliations

University of Florida, Gainesville, Florida, USA
Panos Pardalos
University of Catania, Catania, Italy
Mario Pavone
University of Catania, Catania, Italy
Giovanni Maria Farinella
University of Catania, Catania, Italy
Vincenzo Cutello

Cite this paper

Unold, O., Tagowski, T. (2015). A Parallel Consensus Clustering Algorithm. In: Pardalos, P., Pavone, M., Farinella, G., Cutello, V. (eds) Machine Learning, Optimization, and Big Data. MOD 2015. Lecture Notes in Computer Science, vol 9432. Springer, Cham. https://doi.org/10.1007/978-3-319-27926-8_28

DOI: https://doi.org/10.1007/978-3-319-27926-8_28
Published: 06 January 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-27925-1
Online ISBN: 978-3-319-27926-8
eBook Packages: Computer Science, Computer Science (R0)
