Abstract
Dimensionality reduction is a crucial ingredient of machine learning and data mining, as it can boost classification accuracy by isolating patterns and omitting noise. Nevertheless, recent studies have shown that dimensionality reduction can benefit from label information, via a joint estimation of predictors and target variables from a low-rank representation. Inspired by these findings, we propose a novel dimensionality reduction method that simultaneously reconstructs the predictors using matrix factorization and estimates the target variable via a dual-form maximum-margin classifier operating in the latent space. In contrast to existing studies, which supervise the decomposition linearly through the targets, our method reconstructs the labels using nonlinear functions. If the hyperplane separating the class regions in the original data space is nonlinear, a nonlinear dimensionality reduction helps improve generalization over the test instances. The joint optimization function is learned through a coordinate descent algorithm with stochastic updates. Empirical results demonstrate the superiority of the proposed method over classification in the original space (no reduction), classification after unsupervised reduction, and classification using a linearly supervised projection.
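The joint objective described above can be illustrated with a minimal NumPy sketch. This is not the paper's actual algorithm: the dual-form kernel classifier is replaced here by a simple nonlinear predictor f(u) = tanh(u · w) with a squared hinge loss, and all function names, hyperparameters, and the ridge-regression fold-in for unseen instances are illustrative assumptions. It shows only the general pattern of coordinate descent with stochastic per-instance updates over a shared latent representation.

```python
import numpy as np

def fit_supervised_mf(X, y, rank=2, lam=1e-3, beta=1.0, lr=0.01,
                      epochs=150, seed=0):
    """Jointly learn a low-rank representation U (n x rank), a factor
    matrix V (rank x d) reconstructing X ~= U V, and a simple nonlinear
    classifier f(u) = tanh(u . w) trained with a squared hinge loss.
    Coordinate descent with stochastic (per-instance) updates; a toy
    stand-in for the paper's dual-form maximum-margin formulation."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((rank, d))
    w = 0.1 * rng.standard_normal(rank)
    for _ in range(epochs):
        for i in rng.permutation(n):
            # reconstruction term: ||U[i] @ V - X[i]||^2
            err = U[i] @ V - X[i]                 # shape (d,)
            gU = err @ V.T + lam * U[i]
            gV = np.outer(U[i], err) + lam * V
            # classification term: squared hinge on the margin y_i * f(u_i)
            t = np.tanh(U[i] @ w)
            slack = max(0.0, 1.0 - y[i] * t)      # hinge violation
            g = -2.0 * slack * y[i] * (1.0 - t ** 2)
            gU += beta * g * w
            gw = beta * g * U[i] + lam * w
            # stochastic updates of all coordinates for instance i
            U[i] -= lr * gU
            V -= lr * gV
            w -= lr * gw
    return U, V, w

def fold_in(x, V, lam=1e-3):
    """Project an unseen instance into the latent space by ridge
    regression against the learned factor matrix V."""
    k = V.shape[0]
    return np.linalg.solve(V @ V.T + lam * np.eye(k), V @ x)
```

At test time, an unseen instance is first folded into the latent space using only the reconstruction factor `V`, and then classified there; this mirrors the idea that the supervised latent space, not the original feature space, carries the decision boundary.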
Acknowledgment
This study was funded by the Seventh Framework Programme (FP7) of the European Commission, through the projects REDUCTION (www.reduction-project.eu) and iTalk2Learn (www.italk2learn.eu).
In addition, the authors express their gratitude to Lucas Rego Drumond (University of Hildesheim) for his assistance on formalizing the linearly supervised decomposition.
Copyright information
© 2015 Springer-Verlag Berlin Heidelberg
Cite this chapter
Grabocka, J., Schmidt-Thieme, L. (2015). Learning Through Non-linearly Supervised Dimensionality Reduction. In: Hameurlain, A., Küng, J., Wagner, R., Bellatreche, L., Mohania, M. (eds.) Transactions on Large-Scale Data- and Knowledge-Centered Systems XVII. Lecture Notes in Computer Science, vol. 8970. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46335-2_4
Print ISBN: 978-3-662-46334-5
Online ISBN: 978-3-662-46335-2