MMSVC: An Efficient Unsupervised Learning Approach for Large-Scale Datasets

  • Conference paper
Life System Modeling and Intelligent Computing (ICSEE 2010, LSMS 2010)

Abstract

This paper presents a multi-scale, hierarchical framework that extends the scalability of support vector clustering (SVC). The clustering algorithm in this framework, multi-scale multi-sphere support vector clustering (MMSVC), builds on multi-sphere support vector clustering and works in a coarse-to-fine, top-down manner. Given a parent cluster, the next learning scale is generated by a secant-like numerical algorithm. A local quantity called spherical support vector density (sSVD) is proposed as a cluster validity measure that describes the compactness of a cluster; it serves as the termination criterion in the framework. On large-scale datasets, the method benefits from online learning, easy parameter tuning, and efficient learning. The method was evaluated on 1.5 million tiny images. Experimental results demonstrate that it greatly improves the scalability and learning efficiency of support vector clustering.
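The coarse-to-fine, top-down control flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' MMSVC implementation: the abstract gives no formulas, so `split_at_largest_gap` is a hypothetical stand-in for one multi-sphere clustering pass at the next scale, and `spread_below` is a hypothetical stand-in for the sSVD validity check. Only the recursion pattern (split a parent cluster, stop refining once a child is judged compact) is taken from the text.

```python
from typing import Callable, List, Sequence


def top_down_cluster(
    points: Sequence[float],
    split: Callable[[Sequence[float]], List[List[float]]],
    is_compact: Callable[[Sequence[float]], bool],
    max_depth: int = 10,
) -> List[List[float]]:
    """Refine a parent cluster recursively until every leaf passes is_compact."""
    # Termination: depth budget spent, nothing left to split, or cluster is compact.
    if max_depth == 0 or len(points) < 2 or is_compact(points):
        return [list(points)]
    children = split(points)
    if len(children) < 2:  # the split made no progress, keep the parent
        return [list(points)]
    leaves: List[List[float]] = []
    for child in children:
        leaves.extend(top_down_cluster(child, split, is_compact, max_depth - 1))
    return leaves


def split_at_largest_gap(points: Sequence[float]) -> List[List[float]]:
    """Hypothetical stand-in for a clustering pass: cut 1-D data at its widest gap."""
    ordered = sorted(points)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return [ordered[:cut], ordered[cut:]]


def spread_below(threshold: float) -> Callable[[Sequence[float]], bool]:
    """Hypothetical stand-in for the validity measure: compact if the range is small."""
    return lambda pts: max(pts) - min(pts) <= threshold


data = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
clusters = top_down_cluster(data, split_at_largest_gap, spread_below(0.5))
print(clusters)
```

Passing the split step and the validity check as functions keeps the recursion independent of how either is computed, so a real SVC pass or a density-based criterion could be substituted without changing the driver.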

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gu, H., Zhao, G., Zhang, J. (2010). MMSVC: An Efficient Unsupervised Learning Approach for Large-Scale Datasets. In: Li, K., Jia, L., Sun, X., Fei, M., Irwin, G.W. (eds) Life System Modeling and Intelligent Computing. ICSEE 2010, LSMS 2010. Lecture Notes in Computer Science, vol 6330. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15615-1_1

  • DOI: https://doi.org/10.1007/978-3-642-15615-1_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15614-4

  • Online ISBN: 978-3-642-15615-1

  • eBook Packages: Computer Science, Computer Science (R0)
