Skip to main content

Heterogeneous Clustering Ensemble Method for Combining Different Cluster Results

  • Conference paper
Data Mining for Biomedical Applications (BioDM 2006)

Part of the book series: Lecture Notes in Computer Science ((LNBI,volume 3916))

Included in the following conference series:

Abstract

Biological data set sizes have been growing rapidly with the technological advances that have occurred in bioinformatics. Data mining techniques have been used extensively as approaches to detect interesting patterns in large databases. In bioinformatics, clustering algorithm technique for data mining can be applied to find underlying genetic and biological interactions, without considering prior information from datasets. However, many clustering algorithms are practically available, and different clustering algorithms may generate dissimilar clustering results due to bio-data characteristics and experimental assumptions. In this paper, we propose a novel heterogeneous clustering ensemble scheme that uses a genetic algorithm to generate high quality and robust clustering results with characteristics of bio-data. The proposed method combines results of various clustering algorithms and crossover operation of genetic algorithm, and is founded on the concept of using the evolutionary processes to select the most commonly-inherited characteristics. Our framework proved to be available on real data set and the optimal clustering results generated by means of our proposed method are detailed in this paper. Experimental results demonstrate that the proposed method yields better clustering results than applying a single best clustering algorithm.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Alexander, P.T., Behrouz, M.-B., Anil, K.J., William, F.P.: Adaptive clustering ensembles. In: Proceedings of the International Conference on Pattern Recognition, vol. 1, pp. 272–275 (2004)

    Google Scholar 

  2. Banerjee, A., Krumpelman, C., Basu, S., Mooney, R., Ghosh, J.: Model-based overlapping clustering. In: Proceedings of the International Conference on Knowledge Discovery and Data Mining, pp. 532–537 (2005)

    Google Scholar 

  3. Greene, D., Tsymbal, A., Bolshakova, N., Cunningham, P.: Ensemble clustering in medical diagnostics. In: Proceedings of the 17th IEEE Symposium on Computer-Based Medical Systems, pp. 576–581 (2004)

    Google Scholar 

  4. Jaewoo, K., Jiong, Y., Wanhong, X., Pankaj, C.: Integrating heterogeneous microarray data sources using correlation signatures. In: Ludäscher, B., Raschid, L. (eds.) DILS 2005. LNCS (LNBI), vol. 3615, pp. 105–120. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  5. Jouve, P.E., Nicoloyannis, N.: A new method for combining partitions, applications for distributed clustering. In: Proceedings of the International Workshop on Parallel and Distributed Machine Learning and Data Mining (2003)

    Google Scholar 

  6. Kasturi, J., Acharya, R.: Clustering of diverse genomic data using information fusion. Bioinformatics 21, 423–429 (2005)

    Article  Google Scholar 

  7. Kenneth, J.R., Suzanne, D.V., Ellen, B., William, C.R.: The economic impact of chronic fatigue syndrome. Cost Effectiveness and Resource Allocation, 2 (2004)

    Google Scholar 

  8. Liu, J.J., Cutler, G., Li, W., Pan, Z., Peng, S., Hoey, T., Chen, L., Ling, X.B.: Multiclass cancer classification and biomarker discovery using GA-based algorithms. Bioinformatics 21, 2691–2697 (2005)

    Article  Google Scholar 

  9. Patrick, C.H.M., Keith, C.C.C.: Discovering clusters in gene expression data using evolutionary approach. In: Proceedings of the 15th IEEE International Conference on Tools with Artificial Intelligence, pp. 459–466 (2003)

    Google Scholar 

  10. Qiu, P., Wang, Z.J., Liu, K.J.: Ensemble dependence model for classification and prediction of cancer and normal gene expression data. Bioinformatics 21, 3114–3121 (2005)

    Article  Google Scholar 

  11. Whistler, T., Unger, E.R., Nisenbaum, R., Vernon, S.D.: Integration of gene expression, clinical, and epidemiologic data to characterize Chronic Fatigue Syndrome. Journal of Translational Medicine 1 (2003)

    Google Scholar 

  12. Xiaohua, H.: Integration of cluster ensemble and text summarization for gene. In: Proceedings of the 4th IEEE Symposium on Bioinformatics and Bioengineering, pp. 251–258 (2004)

    Google Scholar 

  13. Xiaohua, H., Illhoi, Y.: Cluster ensemble and its applications in gene expression. In: Proceedings of the Asia-Pacific Bioinformatics Conference, vol. 29, pp. 297–302 (2004)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yoon, HS., Ahn, SY., Lee, SH., Cho, SB., Kim, J.H. (2006). Heterogeneous Clustering Ensemble Method for Combining Different Cluster Results. In: Li, J., Yang, Q., Tan, AH. (eds) Data Mining for Biomedical Applications. BioDM 2006. Lecture Notes in Computer Science(), vol 3916. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11691730_9

Download citation

  • DOI: https://doi.org/10.1007/11691730_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33104-9

  • Online ISBN: 978-3-540-33105-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics