Incorporating Task-Oriented Representation in Text Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11447)

Abstract

Text classification (TC) is an important task in natural language processing. Recently, neural networks have been applied to TC and have achieved significant performance improvements. Because some documents are short and ambiguous, recent research enriches document representations with word concepts extracted from an external knowledge base. However, this approach may incorporate task-irrelevant or coarse-grained concepts that cannot discriminate between the classes of a TC task, adding noise to the document representation and degrading TC performance. To tackle this problem, we propose a task-oriented representation that captures word-class relevance as task-relevant information. We integrate the task-oriented representation into a CNN classification model to perform TC. Experimental results on widely used datasets show that our approach outperforms comparison models.
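The abstract describes the architecture only at a high level. As a purely illustrative sketch (not the authors' released code), the following shows one way a word-class relevance signal could be concatenated with ordinary word embeddings and fed into a Kim-style CNN text classifier. The class `TaskOrientedCNN`, the helper `init_relevance_from_counts`, the dimensions, and the count-based initialization of word-class relevance are all assumptions for illustration, not the paper's exact formulation.

```python
# Hypothetical sketch of a task-oriented representation in a CNN classifier.
# Assumed design: each word carries a per-class relevance vector, concatenated
# with its word embedding before convolution. Not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TaskOrientedCNN(nn.Module):
    def __init__(self, vocab_size, num_classes, embed_dim=300,
                 num_filters=100, kernel_sizes=(3, 4, 5)):
        super().__init__()
        # Standard word embeddings (e.g. initialized from word2vec/GloVe).
        self.word_embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Task-oriented representation: one relevance weight per (word, class),
        # seeded from corpus statistics and fine-tuned with the model.
        self.word_class_rel = nn.Embedding(vocab_size, num_classes, padding_idx=0)
        in_channels = embed_dim + num_classes  # concatenate the two views
        self.convs = nn.ModuleList(
            nn.Conv1d(in_channels, num_filters, k) for k in kernel_sizes
        )
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):           # token_ids: (batch, seq_len)
        e = self.word_embed(token_ids)      # (batch, seq_len, embed_dim)
        r = self.word_class_rel(token_ids)  # (batch, seq_len, num_classes)
        x = torch.cat([e, r], dim=-1).transpose(1, 2)  # (batch, channels, seq_len)
        # Convolution + max-over-time pooling per kernel size, as in Kim (2014).
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        h = self.dropout(torch.cat(pooled, dim=1))
        return self.fc(h)                   # unnormalized class scores


def init_relevance_from_counts(model, token_class_counts, smoothing=1.0):
    """Seed word-class relevance with smoothed P(class | word) estimated from
    training-set co-occurrence counts (token_class_counts: vocab x classes)."""
    probs = token_class_counts.float() + smoothing
    probs = probs / probs.sum(dim=1, keepdim=True)
    with torch.no_grad():
        model.word_class_rel.weight.copy_(probs)


# Example usage on random data, e.g. a 6-class setting such as TREC:
# model = TaskOrientedCNN(vocab_size=20000, num_classes=6)
# logits = model(torch.randint(1, 20000, (32, 40)))  # (32, 6)
```

In this sketch the relevance embedding is seeded from smoothed class-conditional word statistics on the training set and then trained jointly with the classifier; the paper's actual definition of word-class relevance and its integration into the CNN may differ.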

Notes

  1. http://cogcomp.cs.illinois.edu/Data/QA/QC/.

  2. https://nlp.stanford.edu/sentiment/.

  3. https://code.google.com/archive/p/word2vec/.

Acknowledgement

This work was supported by the Fundamental Research Funds for the Central Universities, SCUT (No. 2017ZD048, D2182480), the Tiptop Scientific and Technical Innovative Youth Talents of Guangdong Special Support Program (No. 2015-TQ01X633), the Science and Technology Planning Project of Guangdong Province (No. 2017B050506004), and the Science and Technology Program of Guangzhou International Science & Technology Cooperation Program (No. 201704030076). The research described in this paper has been supported by a collaborative research grant from the Hong Kong Research Grants Council (Project No. C1031-18G).

Author information

Corresponding author

Correspondence to Yi Cai.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Lei, X., Cai, Y., Xu, J., Ren, D., Li, Q., Leung, Hf. (2019). Incorporating Task-Oriented Representation in Text Classification. In: Li, G., Yang, J., Gama, J., Natwichai, J., Tong, Y. (eds) Database Systems for Advanced Applications. DASFAA 2019. Lecture Notes in Computer Science(), vol 11447. Springer, Cham. https://doi.org/10.1007/978-3-030-18579-4_24

  • DOI: https://doi.org/10.1007/978-3-030-18579-4_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-18578-7

  • Online ISBN: 978-3-030-18579-4

  • eBook Packages: Computer Science, Computer Science (R0)
