On Unsupervised Training of Multi-Class Regularized Least-Squares Classifiers

Pahikkala, Tapio; Airola, Antti; Gieseke, Fabian; Kramer, Oliver

doi:10.1007/s11390-014-1414-0

Regular Paper
Published: 10 January 2014

Volume 29, pages 90–104, (2014)
Journal of Computer Science and Technology

Tapio Pahikkala¹, Antti Airola¹, Fabian Gieseke² & Oliver Kramer³

Abstract

In this work, we present the first efficient algorithm for unsupervised training of multi-class regularized least-squares classifiers. The approach is closely related to maximum margin clustering, the unsupervised extension of the support vector machine classifier, which has recently received considerable attention, though mostly for the binary classification case. We present a combinatorial search scheme that combines steepest descent strategies with powerful meta-heuristics for avoiding bad local optima. The regularized least-squares formulation of the problem allows us to use matrix-algebraic optimization, enabling constant-time checks of the intermediate candidate solutions during the search. Our experimental evaluation indicates the potential of the novel method and demonstrates its superior clustering performance over a variety of competing methods on real-world datasets. Both time complexity analysis and experimental comparisons show that the method can scale well to practical-sized problems.
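
The abstract does not spell out the formulation, so the following is only a rough sketch of how a steepest-descent label search with constant-time candidate checks might look, not the authors' algorithm (the meta-heuristics are omitted entirely). It assumes a one-vs-all ±1 label matrix Y and relies on the standard regularized least-squares identity that, for a kernel matrix K and regularization parameter λ, the optimal objective value equals λ·tr(Yᵀ G Y) with G = (K + λI)⁻¹. The function names, the `min_size` guard against empty clusters, and the toy data are all hypothetical choices made for illustration.

```python
def invert(M):
    """Gauss-Jordan inverse of a small square matrix given as a list of lists."""
    n = len(M)
    A = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [v / piv for v in A[c]]
        for r in range(n):
            if r != c and A[r][c] != 0.0:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [row[n:] for row in A]


def regularized_inverse(K, lam):
    """G = (K + lam * I)^(-1); invertible when K is PSD and lam > 0."""
    n = len(K)
    return invert([[K[i][j] + (lam if i == j else 0.0) for j in range(n)]
                   for i in range(n)])


def objective(G, labels, k, lam):
    """J(Y) = lam * trace(Y^T G Y) for one-vs-all +/-1 label columns."""
    n = len(labels)
    total = 0.0
    for c in range(k):
        y = [1.0 if labels[i] == c else -1.0 for i in range(n)]
        total += sum(y[i] * G[i][j] * y[j] for i in range(n) for j in range(n))
    return lam * total


def local_search(K, k, lam, labels, min_size=1):
    """Greedy steepest descent over single-point label changes (sketch)."""
    n = len(K)
    G = regularized_inverse(K, lam)
    labels = list(labels)
    # Cache R[i][c] = (G y_c)[i] so that each candidate move is scored in O(1).
    R = [[sum(G[i][j] * (1.0 if labels[j] == c else -1.0) for j in range(n))
          for c in range(k)] for i in range(n)]
    sizes = [labels.count(c) for c in range(k)]
    while True:
        best_delta, best_i, best_b = 0.0, None, None
        for i in range(n):
            a = labels[i]
            if sizes[a] <= min_size:  # assumed guard: keep every class populated
                continue
            for b in range(k):
                if b == a:
                    continue
                # y_a[i] goes +1 -> -1 and y_b[i] goes -1 -> +1.
                delta = lam * (-4.0 * R[i][a] + 4.0 * R[i][b] + 8.0 * G[i][i])
                if delta < best_delta - 1e-12:
                    best_delta, best_i, best_b = delta, i, b
        if best_i is None:
            return labels  # no single move improves the objective
        a = labels[best_i]
        for j in range(n):  # O(n) cache update after an accepted move
            R[j][a] -= 2.0 * G[j][best_i]
            R[j][best_b] += 2.0 * G[j][best_i]
        labels[best_i] = best_b
        sizes[a] -= 1
        sizes[best_b] += 1


# Toy usage: two 1-D clumps, linear kernel, alternating initial labels.
X = [-2.1, -1.9, -2.0, 1.9, 2.0, 2.1]
K = [[a * b for b in X] for a in X]
print(local_search(K, k=2, lam=1.0, labels=[0, 1, 0, 1, 0, 1]))
```

Scoring a move costs O(1) because only two entries of Y change, and the cached products G·y_c absorb the rest; the price is an O(n) cache update for each accepted move. A search like this stops at the first local optimum, which is presumably where the meta-heuristics mentioned in the abstract come in.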

Acknowledgments

We would like to thank the anonymous reviewers for their comments.

Author information

Authors and Affiliations

Department of Information Technology, University of Turku, Turku, 20520, Finland
Tapio Pahikkala & Antti Airola
Department of Computer Science, University of Copenhagen, Copenhagen, K 1017, Denmark
Fabian Gieseke
Computer Science Department, Carl von Ossietzky University of Oldenburg, Oldenburg, 26111, Germany
Oliver Kramer

Corresponding author

Correspondence to Tapio Pahikkala.

Additional information

Tapio Pahikkala is supported by the Academy of Finland under Grant No. 134020 and Fabian Gieseke by the German Academic Exchange Service (DAAD).

A preliminary version of the paper was published in the Proceedings of ICDM 2012.

Electronic supplementary material

Below is the link to the electronic supplementary material.

ESM 1 (DOC 28 kb)

About this article

Cite this article

Pahikkala, T., Airola, A., Gieseke, F. et al. On Unsupervised Training of Multi-Class Regularized Least-Squares Classifiers. J. Comput. Sci. Technol. 29, 90–104 (2014). https://doi.org/10.1007/s11390-014-1414-0

Received: 01 September 2013
Revised: 28 October 2013
Published: 10 January 2014
Issue Date: January 2014
DOI: https://doi.org/10.1007/s11390-014-1414-0
