Skip to main content
Log in

Convex Decomposition Based Cluster Labeling Method for Support Vector Clustering

  • Regular Paper
  • Published:
Journal of Computer Science and Technology Aims and scope Submit manuscript

Abstract

Support vector clustering (SVC) is an important boundary-based clustering algorithm in multiple applications for its capability of handling arbitrary cluster shapes. However, SVC’s popularity is degraded by its highly intensive time complexity and poor label performance. To overcome such problems, we present a novel efficient and robust convex decomposition based cluster labeling (CDCL) method based on the topological property of dataset. The CDCL decomposes the implicit cluster into convex hulls and each one is comprised by a subset of support vectors (SVs). According to a robust algorithm applied in the nearest neighboring convex hulls, the adjacency matrix of convex hulls is built up for finding the connected components; and the remaining data points would be assigned the label of the nearest convex hull appropriately. The approach's validation is guaranteed by geometric proofs. Time complexity analysis and comparative experiments suggest that CDCL improves both the efficiency and clustering quality significantly.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Xu R, Wunsch D C. Hierarchical clustering. In Clustering, Hoboken: John Wiley&Sons, 2008.

    Chapter  Google Scholar 

  2. Ben-Hur A, Horn D, Siegelmann H T, Vapnik V N. Support vector clustering. Journal of Machine Learning Research, 2001, 2: 125–137.

    Google Scholar 

  3. Schölkopf B, Platt J C, Shawe-Taylor J C, Smola A J, Williamson R C. Estimating the support of a high-dimensional distribution. Neural Computation, 2001, 13(7): 1443–1472.

    Article  MATH  Google Scholar 

  4. Tax D M J, Duin R P W. Support vector domain description. Pattern Recognition Letters, 1999, 20(11–13): 1191–1199.

    Article  Google Scholar 

  5. Burges C J C. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 1998, 2(2): 121–167.

    Article  Google Scholar 

  6. Yang J H, Estivill-Castro V, Chalup S K. Support vector clustering through proximity graph modelling. In Proc. the 9th International Conference on Neural Information Processing (ICONIP 2002), Orchid Country Club, Singapore, Nov. 18–22, 2002, pp.898–903.

  7. Lee J, Lee D. An improved cluster labeling method for support vector clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(3): 461–464.

    Article  Google Scholar 

  8. Lee J, Lee D. Dynamic characterization of cluster structures for robust and inductive support vector clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(11): 1869–1874.

    Article  Google Scholar 

  9. Lee D, Lee J. Equilibrium-based support vector machine for semisupervised classi¯cation. IEEE Transactions on Neural Networks, 2007, 18(2): 578–583.

    Article  Google Scholar 

  10. Lee S-H, Daniels K M. Cone cluster labeling for support vector clustering. In Proc. the 6th SIAM Conference on Data Mining, Bethesda, Maryland, Apr. 20–22, 2006, pp.484–488.

  11. Varma C M B S, Asharaf S, Murty M N. Rough core vector clustering. In Proc. the 2nd International Conference on Pattern Recognition and Machine Intelligence, Kolkata, India, Dec. 18–22, 2007, pp.304–310.

  12. Hsieh T W, Taur J S, Tao C W, Kung S Y. A kernel-based core growing clustering method. International Journal of Intelligent Systems, 2009, 24(4): 441–458.

    Article  MATH  Google Scholar 

  13. Ling P, Zhou C G, Zhou X. Improved support vector clustering. Eng. Applicat. Artificial Intelligence, 2010, 23(4): 552–559.

    Article  Google Scholar 

  14. Ping Y, Zhou Y J, Yang Y X. A novel scheme for accelerating support vector clustering. Computing and Infomatics, 2012, 31(2): in press.

  15. Jung J H, Kim N, Lee J. Dynamic pattern denoising method using multi-basin system with kernels. Pattern Recognition, 2011, 44(8): 1698–1707.

    Article  MATH  Google Scholar 

  16. Vapnik V. The Nature of Statistical Learning Theory. New York: Springer-Verlag, 1995.

    MATH  Google Scholar 

  17. Jung K H, Lee D, Lee J. Fast support-based clustering method for large-scale problems. Pattern Recognition, 2010, 43(5): 1975–1983.

    Article  MATH  Google Scholar 

  18. Lee D, Lee J. Dynamic dissimilarity measure for support-based clustering. IEEE Transactions on Knowledge and Data Engineering, 2010, 22(6): 900–905.

    Article  Google Scholar 

  19. Ben-Hur A, Horn D, Siegelmann H T, Vapnik V. A support vector cluster method. In Proc. the 15th International Conference on Pattern Recognition, Barcelona, Spain, Sep. 3–7, 2000, pp.724–727.

  20. Estivill-Castro V, Lee I. AMOEBA: Hierarchical clustering based on spatial proximity using Delaunay diagram. In Proc. the 9th Int. Symposium on Spatial Data Handling, Beijing, China, Aug. 10–12, 2000, pp.7a.26–7a.41.

  21. Estivill-Castro V, Lee I, Murray A T. Criteria on proximity graphs for boundary extraction and spatial clustering. In Proc. the 5th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD2001), Hong Kong, China, Apr. 16–18, 2001, pp.348–357.

  22. Ban T, Abe S. Spatially chunking support vector clustering algorithm. In Proc. International Joint Conference on Neural Networks, Budapest, Hungary, July 25–29, 2004, pp.413–418.

  23. Puma-Villanueva W J, Bezerra G B, Lima C A M, Zuben F J V. Improving support vector clustering with ensembles. In Proc. International Joint Conference on Neural Networks, Montreal, Quebec, Canada, Jul. 31–Aug. 4, 2005, pp.13–15.

  24. Chiang J H, Hao P Y. A new kernel-based fuzzy clustering approach: Support vector clustering with cell growing. IEEE Transactions on Fuzzy Systems, 2003, 11(4): 518–527.

    Article  Google Scholar 

  25. Wang F, Zhao B, Zhang C S. Linear time maximum margin clustering. IEEE Transactions on Neural Networks, 2010, 21(2): 319–332.

    Article  Google Scholar 

  26. Peng J M, Mukherjee L, Singh V, Schuurmans D, Xu L L. An efficient algorithm for maximal margin clustering. Journal of Global Optimization, 2011, 2: 1–15. doi:10.1007/s10898-011-9691-4.

    Google Scholar 

  27. Lee S H, Daniels K. Gaussian kernel width selection and fast cluster labeling for support vector clustering. University of Massachusetts, Lowell, Technical Report No. 2005–009, 2005.

  28. Lee C H, Yang H C. Construction of supervised and unsupervised learning systems for multilingual text categorization. Expert Systems with Applications, 2009, 36(2): 2400–2410.

    Article  MathSciNet  Google Scholar 

  29. Lee D, Jung K H, Lee J. Constructing sparse kernel machines using attractors. IEEE Transactions on Neural Networks, 2009, 20(4): 721–729.

    Article  Google Scholar 

  30. Hao P Y, Chiang J H, Tu Y K. Hierarchically SVM classification based on support vector clustering method and its application to document categorization. Expert Systems with Applications, 2007, 33(3): 627-635.

    Article  Google Scholar 

  31. Hocking T D, Joulin A, Bach F, Vert J P. Clusterpath: An algorithm for clustering using convex fusion penalties. In Proc. the 28th International Conference on Machine Learning (ICML), Bellevue, WA, USA, Jun. 28–Jul. 2, 2011, pp.1–8.

  32. Shamir O, Tishby N. Stability and model selection in k-means clustering. Machine Learning, 2010, 80(2–3): 213–243.

    Article  Google Scholar 

  33. Lee S H, Daniels K. Gaussian kernel width generator for support vector clustering. In Proc. International Conference on Bioinformatics and Its Applications, Fort Lauderdale, Florida, USA, Dec. 16–19, 2004, pp.151–162.

  34. Boyd S, Vandenberghe L. Convex Optimization. Cambridge, 7th edition, New York: Cambridge University Press, 2009.

    Google Scholar 

  35. Li Y H, Maguire L. Selecting critical patterns based on local geometrical and statistical information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(6): 1189–1201.

    Article  Google Scholar 

  36. Kpotufe S, von Luxburg U. Pruning nearest neighbor cluster trees. In Proc. the 28th International Conference on Machine Learning (ICML 2011), Bellevue, USA, Jun. 28–Jul. 2, 2011, pp.225–232.

  37. Kim H C, Lee J. Clustering based on gaussian processes. Neural Computation, 2007, 19(11): 3088–3107.

    Article  MathSciNet  MATH  Google Scholar 

  38. Camastra F, Verri A. A novel kernel method for clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005, 27(5): 801–805.

    Article  Google Scholar 

  39. Frank A, Asuncion A. UCI machine learning repository. http://archive.ics.uci.edu/ml, 2010.

  40. Hubert L, Arabie P. Comparing partitions. Journal of Classification, 1985, 2(1): 193-218.

    Article  Google Scholar 

  41. Wu M, Scholkopf B. A local learning approach for clustering. In Proc. the 20th Annual Conference on Advances Neural Information Processing Systems (NIPS 2007), Vancouver, B.C., Canada, Dec. 4–7, 2006, pp.1529–1536.

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yuan Ping.

Additional information

This work was supported by the National Natural Science Foundation of China under Grant No. 60972077 and partially under Grant No. 70921061, the National Science and Technology Major Program under Grant No. 2010ZX03003-003-01, the Natural Science Foundation of Beijing under Grant No. 9092009, and the Fundamental Research Funds for the Central Universities under Grant No. 2011RC0212.

Electronic Supplementary Material

Below is the link to the electronic supplementary material.

(PDF 102 kb)

Rights and permissions

Reprints and permissions

About this article

Cite this article

Ping, Y., Tian, YJ., Zhou, YJ. et al. Convex Decomposition Based Cluster Labeling Method for Support Vector Clustering. J. Comput. Sci. Technol. 27, 428–442 (2012). https://doi.org/10.1007/s11390-012-1232-1

Download citation

  • Received:

  • Revised:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11390-012-1232-1

Keywords

Navigation