On Applying Dimension Reduction for Multi-labeled Problems

  • Conference paper
Machine Learning and Data Mining in Pattern Recognition (MLDM 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4571)

Abstract

The traditional classification problem assumes that a data sample belongs to exactly one of the predefined classes. In a multi-labeled problem such as text categorization, by contrast, data samples can belong to multiple classes, and the task is to output the set of class labels associated with a new, unseen data sample. As is common in text categorization, learning a classifier in a high-dimensional space can be difficult, a phenomenon known as the curse of dimensionality. It has been shown that performing dimension reduction as a preprocessing step can greatly improve classification performance. In particular, linear discriminant analysis (LDA) is one of the most popular dimension reduction methods and is optimized for classification tasks. However, applying LDA to a multi-labeled problem can give rise to ambiguities and difficulties. In this paper, we study the application of LDA to multi-labeled problems and analyze how the objective function of LDA can be interpreted in a multi-labeled setting. We also propose an LDA algorithm that is effective for multi-labeled problems. Experimental results demonstrate that, by taking multi-labeled structure into account, LDA can achieve computational efficiency and also greatly improve classification performance.
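To make the ambiguity concrete, the following is a minimal sketch of one possible way to define the LDA scatter matrices when a sample carries several labels: a sample is simply counted once in every class it belongs to. This is an illustrative assumption, not the algorithm proposed in the paper, whose details are not given in the abstract. The function name `multilabel_lda`, the binary indicator matrix `Y`, and the regularization parameter `reg` are hypothetical choices made for this example.

```python
import numpy as np


def multilabel_lda(X, Y, n_components, reg=1e-3):
    """Sketch of LDA where a multi-labeled sample counts once per label.

    X: (n_samples, n_features) data matrix.
    Y: (n_samples, n_classes) binary indicator matrix; Y[i, c] = 1 if
       sample i carries label c. Every class is assumed to be non-empty.
    reg: ridge term added to the within-class scatter (an assumption,
         used here only to keep the matrix invertible).
    Returns a (n_features, n_components) projection matrix.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n_features = X.shape[1]

    counts = Y.sum(axis=0)                      # class sizes, one count per label
    centroids = (Y.T @ X) / counts[:, None]     # per-class centroids
    global_mean = (counts @ centroids) / counts.sum()

    # Between-class scatter: weighted spread of centroids around the global mean.
    diff = centroids - global_mean
    Sb = (diff * counts[:, None]).T @ diff

    # Within-class scatter: spread of each class's members around its centroid.
    Sw = reg * np.eye(n_features)
    for c in range(Y.shape[1]):
        members = X[Y[:, c] > 0] - centroids[c]
        Sw += members.T @ members

    # Solve the generalized eigenproblem Sb w = lambda Sw w by whitening
    # with the Cholesky factor of Sw, so that a symmetric solver can be used.
    L = np.linalg.cholesky(Sw)
    A = np.linalg.solve(L, Sb)                  # L^{-1} Sb
    M = np.linalg.solve(L, A.T)                 # L^{-1} Sb L^{-T}
    M = (M + M.T) / 2.0                         # remove round-off asymmetry
    evals, V = np.linalg.eigh(M)
    order = np.argsort(-evals)[:n_components]   # largest eigenvalues first
    return np.linalg.solve(L.T, V[:, order])    # map back: W = L^{-T} V
```

Projecting data with `X @ W` then gives the reduced representation. Counting a sample once per label is only one reading of the LDA objective; other weightings of multi-labeled samples would change both scatter matrices.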

This work was supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD) (KRF-2006-331-D00510).

Editor information

Petra Perner

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lee, M., Park, C.H. (2007). On Applying Dimension Reduction for Multi-labeled Problems. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2007. Lecture Notes in Computer Science, vol 4571. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73499-4_11

  • DOI: https://doi.org/10.1007/978-3-540-73499-4_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73498-7

  • Online ISBN: 978-3-540-73499-4

  • eBook Packages: Computer Science, Computer Science (R0)
