Abstract
This paper presents a new clustering algorithm, based on k-means, that addresses the difficulties k-means has with clusters of differing variances. The algorithm uses a line segment as the cluster prototype, which captures the axis along which the cluster has the greatest variance. The line segment iteratively adjusts its length and direction as the data are classified. To perform the classification, a border region that approximates the boundary of the cluster is built from a geometric model that depends on the central line segment. The data are then classified according to their proximity to the different border regions. The process is repeated until the parameters of the border regions associated with all clusters remain constant.
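The loop described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify the geometric model of the border region, so this sketch substitutes the plain Euclidean distance from a point to the segment. The function names, the use of the largest absolute projection as the segment's half-length, and the random initialization are all assumptions made for illustration.

```python
import numpy as np

def fit_segment(points):
    """Fit a segment along the axis of greatest variance of `points`.

    Returns (center, unit direction, half_length). Taking the half-length
    as the largest absolute projection is an assumption of this sketch.
    """
    center = points.mean(axis=0)
    centered = points - center
    # The first right-singular vector is the direction of greatest variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    half_length = np.abs(centered @ direction).max()
    return center, direction, half_length

def distance_to_segment(points, center, direction, half_length):
    """Euclidean distance from each point to the closest point on the segment."""
    centered = points - center
    # Project onto the axis, then clamp the projection to the segment's extent.
    t = np.clip(centered @ direction, -half_length, half_length)
    closest = center + t[:, None] * direction
    return np.linalg.norm(points - closest, axis=1)

def segment_kmeans(data, k, max_iter=100, seed=0):
    """k-means-style loop with line-segment prototypes (illustrative only)."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k randomly chosen points treated as zero-length segments.
    start = data[rng.choice(len(data), size=k, replace=False)]
    axis0 = np.eye(data.shape[1])[0]
    segments = [(c, axis0, 0.0) for c in start]
    labels = None
    for _ in range(max_iter):
        # Classification step: assign each point to its nearest segment.
        dists = np.stack(
            [distance_to_segment(data, *s) for s in segments], axis=1
        )
        new_labels = dists.argmin(axis=1)
        # Stop once the assignment, and hence every segment, is unchanged.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        # Update step: refit each segment; an empty cluster keeps its old one.
        segments = [
            fit_segment(data[labels == j]) if np.any(labels == j) else segments[j]
            for j in range(k)
        ]
    return labels, segments
```

As with plain k-means, the result of this sketch depends on the initial prototypes, and the simple distance rule used here will not reproduce the behaviour of the border regions described in the abstract.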
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Thomas, J.C.R. (2011). A New Clustering Algorithm Based on K-Means Using a Line Segment as Prototype. In: San Martin, C., Kim, SW. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2011. Lecture Notes in Computer Science, vol 7042. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25085-9_76
DOI: https://doi.org/10.1007/978-3-642-25085-9_76
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25084-2
Online ISBN: 978-3-642-25085-9
eBook Packages: Computer Science (R0)