On Unsupervised Training of Multi-Class Regularized Least-Squares Classifiers

  • Regular Paper
  • Journal of Computer Science and Technology

Abstract

In this work we present the first efficient algorithm for unsupervised training of multi-class regularized least-squares classifiers. The approach is closely related to maximum margin clustering, the unsupervised extension of the support vector machine classifier, which has recently received considerable attention, though mostly for the binary classification case. We present a combinatorial search scheme that combines steepest descent strategies with powerful meta-heuristics for avoiding bad local optima. The regularized least-squares formulation of the problem allows us to use matrix-algebraic optimization, enabling constant-time checks of the intermediate candidate solutions during the search. Our experimental evaluation indicates the potential of the novel method and demonstrates its superior clustering performance over a variety of competing methods on real-world datasets. Both time complexity analysis and experimental comparisons show that the method can scale well to practical-sized problems.
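To make the idea of a combinatorial search over label assignments concrete, the following is a minimal sketch and not the algorithm of the paper. It relies on a standard identity for kernel regularized least-squares: for a fixed one-vs-all label matrix Y (entries +1/-1), a kernel matrix K and a regularization parameter λ, the optimal coefficients are A = (K + λI)^{-1} Y, and the objective ||KA - Y||² + λ tr(AᵀKA) equals λ tr(Yᵀ(K + λI)^{-1}Y). Everything else is an assumption made for illustration: the Gaussian kernel, the function names, the `min_size` cluster-size constraint (which rules out the trivial one-cluster solution), and plain random restarts standing in for the meta-heuristics. Each candidate is also re-evaluated naively here, rather than with the constant-time checks described above.

```python
import numpy as np


def gaussian_kernel(X, gamma=0.5):
    """Gaussian kernel matrix (the kernel choice is an assumption of this sketch)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def rls_objective(G, lam, labels, k):
    """Optimal RLS objective for a labeling: lam * trace(Y^T G Y),
    with G = (K + lam*I)^{-1} and Y the one-vs-all (+1/-1) label matrix."""
    Y = -np.ones((len(labels), k))
    Y[np.arange(len(labels)), labels] = 1.0
    return lam * float(np.sum(Y * (G @ Y)))


def steepest_descent_clustering(K, k, lam=1.0, min_size=1, restarts=5, seed=0):
    """Hypothetical helper: steepest descent over single-point relabelings,
    restarted from several balanced random labelings (naive re-evaluation)."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    G = np.linalg.inv(K + lam * np.eye(n))
    best_labels, best_val = None, np.inf
    for _ in range(restarts):
        labels = rng.permutation(n) % k  # balanced random start
        val = rls_objective(G, lam, labels, k)
        while True:
            best_move = None
            for i in range(n):
                old = labels[i]
                # Assumed size constraint: never shrink a cluster below min_size.
                if np.sum(labels == old) <= min_size:
                    continue
                for c in range(k):
                    if c == old:
                        continue
                    labels[i] = c
                    cand = rls_objective(G, lam, labels, k)
                    if cand < val - 1e-12 and (best_move is None or cand < best_move[0]):
                        best_move = (cand, i, c)
                labels[i] = old  # undo the trial relabeling
            if best_move is None:
                break  # local optimum: no single relabeling improves the objective
            val, i, c = best_move
            labels[i] = c  # apply the steepest improving move
        if val < best_val:
            best_val, best_labels = val, labels.copy()
    return best_labels, best_val


# Usage on two synthetic blobs (illustrative data, not from the paper).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.3, (6, 2)), rng.normal(4.0, 0.3, (6, 2))])
labels, value = steepest_descent_clustering(gaussian_kernel(X), k=2, min_size=5)
print(labels, value)
```

Each sweep of this sketch costs far more than the search described above, since every candidate relabeling recomputes the full trace. It is meant only to show the shape of the search space and the role of the closed-form objective.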

Acknowledgments

We would like to thank the anonymous reviewers for their comments.

Author information

Corresponding author

Correspondence to Tapio Pahikkala.

Additional information

Tapio Pahikkala is supported by the Academy of Finland under Grant No. 134020 and Fabian Gieseke by the German Academic Exchange Service (DAAD).

A preliminary version of the paper was published in the Proceedings of ICDM 2012.

Electronic supplementary material

ESM 1 (DOC, 28 kB)

About this article

Cite this article

Pahikkala, T., Airola, A., Gieseke, F. et al. On Unsupervised Training of Multi-Class Regularized Least-Squares Classifiers. J. Comput. Sci. Technol. 29, 90–104 (2014). https://doi.org/10.1007/s11390-014-1414-0

  • DOI: https://doi.org/10.1007/s11390-014-1414-0
