A Fuzzy Threshold Based Modified Clustering Algorithm for Natural Data Exploration

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 6122)

Abstract

Traditional supervised clustering methods require the user to provide the number of clusters before any data exploration begins. The data engineer also has to select the initial cluster seeds. In the c-means clustering method, the performance of the algorithm depends mainly on the initial choice of the number of clusters and the cluster seeds. With real-world data, choosing the initial cluster count and centroids becomes a tedious task. In this paper we propose a modified clustering algorithm that works on the principles of fuzzy clustering. The proposed method uses a modified form of the popular fuzzy c-means algorithm for membership calculation. The algorithm begins with the assumption that all the data points are initial centroids. The clusters are then merged repeatedly, based on a threshold value, until the optimum number of clusters is reached. The algorithm is also capable of detecting outliers. The algorithm was tested with data from the Gross National Happiness (GNH) program of Bhutan and was found to be highly efficient in segmenting natural data sets.
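
The abstract gives no formulas, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes Euclidean distance, the standard (unmodified) fuzzy c-means membership formula with fuzzifier `m`, a closest-pair merge rule with a size-weighted mean, and a hypothetical `min_size` rule for flagging outliers. The function names and all parameters are illustrative.

```python
import math


def fcm_memberships(points, centroids, m=2.0):
    """Standard fuzzy c-means membership of each point in each centroid.

    Assumption: this is the textbook formula, not the paper's modified form.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centroids]
        if 0.0 in dists:
            # The point coincides with a centroid: give it full membership
            # there (shared equally if several centroids coincide).
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        row = [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
               for d_i in dists]
        memberships.append(row)
    return memberships


def merge_by_threshold(points, threshold):
    """Treat every point as a centroid, then merge the closest pair of
    centroids until no two are nearer than `threshold`.

    Returns a list of (centroid, size) pairs. The merge rule is an assumption.
    """
    clusters = [(tuple(p), 1) for p in points]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(clusters[i][0], clusters[j][0])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d >= threshold:
            break
        (ci, ni), (cj, nj) = clusters[i], clusters[j]
        # Size-weighted mean keeps the merged centroid at the mean of all
        # points absorbed so far.
        merged = tuple((a * ni + b * nj) / (ni + nj) for a, b in zip(ci, cj))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((merged, ni + nj))
    return clusters


# Toy data (made up for illustration): two tight groups and one far point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
clusters = merge_by_threshold(points, threshold=3.0)
centroids = [c for c, _ in clusters]
memberships = fcm_memberships(points, centroids)

# Hypothetical outlier rule: clusters that never absorbed another point.
min_size = 2
outliers = [c for c, n in clusters if n < min_size]
```

On this toy input the isolated point (50, 50) is never merged, so the size-based rule flags it. How the paper actually chooses the threshold and defines outliers is not stated in the abstract.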


Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Thomas, B., Raju, G. (2010). A Fuzzy Threshold Based Modified Clustering Algorithm for Natural Data Exploration. In: Chen, H., Chau, M., Li, Sh., Urs, S., Srinivasa, S., Wang, G.A. (eds) Intelligence and Security Informatics. PAISI 2010. Lecture Notes in Computer Science, vol 6122. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13601-6_18

  • DOI: https://doi.org/10.1007/978-3-642-13601-6_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13600-9

  • Online ISBN: 978-3-642-13601-6

  • eBook Packages: Computer Science, Computer Science (R0)
