Learning Through Non-linearly Supervised Dimensionality Reduction

Chapter in Transactions on Large-Scale Data- and Knowledge-Centered Systems XVII

Part of the book series: Lecture Notes in Computer Science (TLDKS, volume 8970)


Abstract

Dimensionality reduction is a crucial ingredient of machine learning and data mining, boosting classification accuracy by isolating patterns and omitting noise. Recent studies have shown that dimensionality reduction can benefit from label information, via a joint estimation of predictors and target variables from a low-rank representation. Inspired by this line of work, we propose a novel dimensionality reduction method which simultaneously reconstructs the predictors using matrix factorization and estimates the target variable via a dual-form maximum margin classifier in the latent space. In contrast to existing studies, which supervise the decomposition through linear models of the targets, our method reconstructs the labels using nonlinear functions. If the boundary separating the class regions in the original data space is nonlinear, then a nonlinear dimensionality reduction helps improve generalization over the test instances. The joint optimization function is learned through a coordinate descent algorithm with stochastic updates. Empirical results demonstrate the superiority of the proposed method over classification in the original space (no reduction), classification after unsupervised reduction, and classification using a linearly supervised projection.
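To make the optimization concrete, the following is a minimal runnable sketch of such a joint objective, not the chapter's exact algorithm: the dual-form maximum margin classifier is stood in for by a primal hinge loss over random Fourier features (a standard approximation of an RBF kernel), and the function name fit_nsdr, the feature count n_rff, and all hyperparameters are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_nsdr(X, y, rank=5, n_rff=200, gamma=1.0, lam=0.1, mu=1.0,
             lr=0.01, epochs=100):
    """Hedged sketch of jointly learning a low-rank factorization X ~= U @ V
    and a max-margin classifier on the latent rows of U, via stochastic
    coordinate descent. Labels y must be in {-1, +1}."""
    n, d = X.shape
    U = 0.1 * rng.standard_normal((n, rank))   # latent instance factors
    V = 0.1 * rng.standard_normal((rank, d))   # latent feature factors
    # Random Fourier features phi(u) = sqrt(2/m) * cos(u @ W + b) approximate
    # an RBF kernel; they replace the paper's exact dual formulation here.
    W = np.sqrt(2.0 * gamma) * rng.standard_normal((rank, n_rff))
    b = rng.uniform(0.0, 2.0 * np.pi, n_rff)
    w = np.zeros(n_rff)                        # max-margin weight vector
    for _ in range(epochs):
        for i in rng.permutation(n):
            # Reconstruction term: stochastic gradient of ||x_i - u_i V||^2
            err = U[i] @ V - X[i]
            grad_u = 2.0 * err @ V.T
            V -= lr * 2.0 * np.outer(U[i], err)
            # Classification term: hinge loss on the nonlinear features,
            # weighted by mu, with L2 regularization lam on w
            z = U[i] @ W + b
            phi = np.sqrt(2.0 / n_rff) * np.cos(z)
            if y[i] * (w @ phi) < 1.0:         # margin violated
                w -= lr * (lam * w - mu * y[i] * phi)
                # Chain rule through cos(.) pushes label information into u_i
                grad_u += mu * y[i] * np.sqrt(2.0 / n_rff) * (W @ (w * np.sin(z)))
            else:
                w -= lr * lam * w
            U[i] -= lr * grad_u
    return U, V, W, b, w
```

At test time, under the same assumptions, a new instance x would be folded into the latent space by minimizing the reconstruction term ||x - u V||^2 over u alone (with V fixed), and then classified as sign(w @ phi(u)).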



Acknowledgment

This study was funded by the Seventh Framework Programme (FP7) of the European Commission, through the projects REDUCTION (www.reduction-project.eu) and iTalk2Learn (www.italk2learn.eu).

In addition, the authors express their gratitude to Lucas Rego Drumond (University of Hildesheim) for his assistance in formalizing the linearly supervised decomposition.

Author information


Corresponding author

Correspondence to Josif Grabocka.



Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Grabocka, J., Schmidt-Thieme, L. (2015). Learning Through Non-linearly Supervised Dimensionality Reduction. In: Hameurlain, A., Küng, J., Wagner, R., Bellatreche, L., Mohania, M. (eds.) Transactions on Large-Scale Data- and Knowledge-Centered Systems XVII. Lecture Notes in Computer Science, vol. 8970. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46335-2_4


  • DOI: https://doi.org/10.1007/978-3-662-46335-2_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-46334-5

  • Online ISBN: 978-3-662-46335-2

  • eBook Packages: Computer Science (R0)
