Multi-label Ranking with LSTM\(^2\) for Document Classification

  • Conference paper
Pattern Recognition (CCPR 2016)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 663))

Abstract

Multi-label document classification is an important challenge with many real-world applications, and multi-label ranking is a common approach to it. However, existing works usually suffer from incomplete, context-free document representations and from non-automatic, part-based model implementations. To address these problems, we propose an LSTM\(^2\) (long short-term memory) model for document classification. The model consists of two steps. The first is the repLSTM process, a supervised LSTM that uses the document labels to learn the document representation. The second is the rankLSTM process, in which the labels of each document are reordered according to a semantic tree so that they better exploit the strength of the LSTM on sequences. Because the labels are predicted serially, the model can be trained as a whole. In addition, Connectionist Temporal Classification is used in this process to handle error propagation for variable-length output (the number of labels differs from document to document). Experiments on three general datasets achieve good results.
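The two sequence ideas mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the label tree, the use of a depth-first pre-order as the "semantic tree" ordering, and the names `LABEL_TREE`, `order_labels`, and `ctc_collapse` are all assumptions made for illustration. The first function turns an unordered label set into a sequence that follows the tree; the second applies the standard CTC collapsing rule (merge repeated symbols, then drop blanks), which is how a fixed-length output path maps to a variable-length label sequence.

```python
from itertools import groupby

# Hypothetical label hierarchy (parent -> children); not taken from the paper.
LABEL_TREE = {
    "root": ["science", "business"],
    "science": ["biology", "physics"],
    "business": ["finance"],
    "biology": [],
    "physics": [],
    "finance": [],
}

BLANK = "-"  # assumed symbol for the CTC blank


def preorder(tree, node="root"):
    """Return labels in depth-first pre-order, skipping the artificial root."""
    order = [] if node == "root" else [node]
    for child in tree.get(node, []):
        order.extend(preorder(tree, child))
    return order


def order_labels(labels, tree=LABEL_TREE):
    """Sort a document's label set by its position in the tree traversal."""
    rank = {label: i for i, label in enumerate(preorder(tree))}
    return sorted(labels, key=rank.__getitem__)


def ctc_collapse(path, blank=BLANK):
    """Standard CTC collapsing: merge adjacent repeats, then remove blanks."""
    merged = [symbol for symbol, _ in groupby(path)]
    return [symbol for symbol in merged if symbol != blank]


if __name__ == "__main__":
    print(order_labels({"finance", "biology", "science"}))
    print(ctc_collapse(["-", "science", "science", "-", "biology", "-", "finance"]))
```

Under these assumptions, documents with different numbers of labels yield target sequences of different lengths, which is the situation the CTC step is described as handling.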

Notes

  1. http://www.bioasq.org/participate/challenges.

  2. http://www.d.umn.edu/~tpederse/enron.html.

  3. http://www.daviddlewis.com/resources/testcollections/rcv1/.

  4. https://www.csie.ntu.edu.tw/~cjlin/libsvm/.

  5. http://lamda.nju.edu.cn/code_MLkNN.ashx.

  6. http://www.vlfeat.org/matconvnet/.

  7. http://www.fit.vutbr.cz/~imikolov/rnnlm/.

  8. http://deeplearning.net/tutorial/lstm.html.

Corresponding author

Correspondence to Xu-Cheng Yin.

Copyright information

© 2016 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Yan, Y., Yin, XC., Yang, C., Zhang, BW., Hao, HW. (2016). Multi-label Ranking with LSTM\(^2\) for Document Classification. In: Tan, T., Li, X., Chen, X., Zhou, J., Yang, J., Cheng, H. (eds) Pattern Recognition. CCPR 2016. Communications in Computer and Information Science, vol 663. Springer, Singapore. https://doi.org/10.1007/978-981-10-3005-5_29

  • DOI: https://doi.org/10.1007/978-981-10-3005-5_29

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-3004-8

  • Online ISBN: 978-981-10-3005-5

  • eBook Packages: Computer Science, Computer Science (R0)
