Abstract
Clustering is the most important task in unsupervised learning, and clustering validity is a major issue in cluster analysis. In this paper, a new strategy called Clustering Algorithm Based on Histogram Threshold (HTCA) is proposed to reduce execution time. The HTCA method combines a hierarchical clustering method with Otsu’s method. Compared with traditional clustering algorithms, the proposed method reduces execution time by a factor of at least ten without losing accuracy. Experiments show that HTCA runs much faster than traditional methods.
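The abstract does not describe how HTCA applies Otsu’s method, so the following is only a minimal sketch of Otsu’s threshold selection on a one-dimensional histogram, not the authors’ algorithm. The function name `otsu_threshold` and the sample histogram are assumptions made for illustration.

```python
def otsu_threshold(hist):
    """Return the bin index t that maximizes the between-class variance.

    Bins 0..t form the lower class; bins t+1.. form the upper class.
    Illustrative sketch of Otsu's method only, not the HTCA procedure.
    """
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram is empty")
    weighted_total = sum(i * count for i, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0       # number of samples in the lower class
    sum0 = 0.0   # sum of (bin index * count) over the lower class
    for t, count in enumerate(hist[:-1]):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so no valid split here
        mu0 = sum0 / w0
        mu1 = (weighted_total - sum0) / w1
        # Proportional to the between-class variance; the constant
        # factor 1 / total**2 does not change which t is the maximum.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Hypothetical bimodal histogram: two groups of bins separated by a gap.
example_hist = [5, 8, 2, 0, 0, 1, 7, 6]
threshold = otsu_threshold(example_hist)
print("threshold bin:", threshold)
```

Because the search is a single pass over the histogram bins, its cost depends on the number of bins rather than on the number of data points, which is one plausible reason a histogram-based step can be cheap.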
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shieh, SL., Lin, TC., Szu, YC. (2012). An Efficient Clustering Algorithm Based on Histogram Threshold. In: Pan, JS., Chen, SM., Nguyen, N.T. (eds) Intelligent Information and Database Systems. ACIIDS 2012. Lecture Notes in Computer Science, vol 7197. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28490-8_4
DOI: https://doi.org/10.1007/978-3-642-28490-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28489-2
Online ISBN: 978-3-642-28490-8
eBook Packages: Computer Science (R0)