Incorporating Task-Oriented Representation in Text Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11447)

Abstract

Text classification (TC) is an important task in natural language processing. Recently, neural networks have been applied to TC and have achieved significant performance improvements. Because some documents are short and ambiguous, recent research enriches document representations with word concepts extracted from an external knowledge base. However, this approach may incorporate task-irrelevant or coarse-grained concepts that cannot discriminate between the classes of a TC task, adding noise to the document representation and degrading TC performance. To tackle this problem, we propose a task-oriented representation that captures word-class relevance as task-relevant information. We integrate the task-oriented representation into a CNN classification model to perform TC. Experimental results on widely used datasets show that our approach outperforms comparison models.
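The abstract describes the architecture only at a high level. As a purely illustrative sketch (not the authors' released code), the following shows one way a word-class relevance signal could be concatenated with ordinary word embeddings and fed into a Kim-style CNN text classifier. The class `TaskOrientedCNN`, the helper `init_relevance_from_counts`, the dimensions, and the count-based initialization of word-class relevance are all assumptions for illustration, not the paper's exact formulation.

```python
# Hypothetical sketch of a task-oriented representation in a CNN classifier.
# Assumed design: each word carries a per-class relevance vector, concatenated
# with its word embedding before convolution. Not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TaskOrientedCNN(nn.Module):
    def __init__(self, vocab_size, num_classes, embed_dim=300,
                 num_filters=100, kernel_sizes=(3, 4, 5)):
        super().__init__()
        # Standard word embeddings (e.g. initialized from word2vec/GloVe).
        self.word_embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Task-oriented representation: one relevance weight per (word, class),
        # seeded from corpus statistics and fine-tuned with the model.
        self.word_class_rel = nn.Embedding(vocab_size, num_classes, padding_idx=0)
        in_channels = embed_dim + num_classes  # concatenate the two views
        self.convs = nn.ModuleList(
            nn.Conv1d(in_channels, num_filters, k) for k in kernel_sizes
        )
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):           # token_ids: (batch, seq_len)
        e = self.word_embed(token_ids)      # (batch, seq_len, embed_dim)
        r = self.word_class_rel(token_ids)  # (batch, seq_len, num_classes)
        x = torch.cat([e, r], dim=-1).transpose(1, 2)  # (batch, channels, seq_len)
        # Convolution + max-over-time pooling per kernel size, as in Kim (2014).
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        h = self.dropout(torch.cat(pooled, dim=1))
        return self.fc(h)                   # unnormalized class scores


def init_relevance_from_counts(model, token_class_counts, smoothing=1.0):
    """Seed word-class relevance with smoothed P(class | word) estimated from
    training-set co-occurrence counts (token_class_counts: vocab x classes)."""
    probs = token_class_counts.float() + smoothing
    probs = probs / probs.sum(dim=1, keepdim=True)
    with torch.no_grad():
        model.word_class_rel.weight.copy_(probs)


# Example usage on random data, e.g. a 6-class setting such as TREC:
# model = TaskOrientedCNN(vocab_size=20000, num_classes=6)
# logits = model(torch.randint(1, 20000, (32, 40)))  # (32, 6)
```

In this sketch the relevance embedding is seeded from smoothed class-conditional word statistics on the training set and then trained jointly with the classifier; the paper's actual definition of word-class relevance and its integration into the CNN may differ.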

Notes

  1. http://cogcomp.cs.illinois.edu/Data/QA/QC/.

  2. https://nlp.stanford.edu/sentiment/.

  3. https://code.google.com/archive/p/word2vec/.

Acknowledgement

This work was supported by the Fundamental Research Funds for the Central Universities, SCUT (No. 2017ZD048, D2182480), the Tiptop Scientific and Technical Innovative Youth Talents of Guangdong Special Support Program (No. 2015-TQ01X633), the Science and Technology Planning Project of Guangdong Province (No. 2017B050506004), and the Science and Technology Program of Guangzhou International Science & Technology Cooperation Program (No. 201704030076). The research described in this paper has been supported by a collaborative research grant from the Hong Kong Research Grants Council (Project No. C1031-18G).

Author information

Corresponding author

Correspondence to Yi Cai.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Lei, X., Cai, Y., Xu, J., Ren, D., Li, Q., Leung, Hf. (2019). Incorporating Task-Oriented Representation in Text Classification. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds) Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science(), vol 11447. Springer, Cham. https://doi.org/10.1007/978-3-030-18579-4_24

  • DOI: https://doi.org/10.1007/978-3-030-18579-4_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-18578-7

  • Online ISBN: 978-3-030-18579-4

  • eBook Packages: Computer Science, Computer Science (R0)
