Abstract
Text categorization is the task of assigning documents to a fixed number of predefined categories. A concept is a grouping of semantically related items under a unique name. Concept generation can reduce the dimensionality and sparsity of the document representation, and a conceptual representation of a text can be generated using WordNet. In this paper, an empirical evaluation of Convolutional Neural Networks (CNN) for text categorization is performed. The CNNs exploit the one-dimensional structure of text, represented as words, concepts, word embeddings, and concept embeddings, to improve category label prediction. The Reuters dataset is evaluated with CNNs on four categories of data. Representing a text with word embeddings and concept embeddings together results in better classification performance with a CNN than using word embeddings or concept embeddings individually.
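The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the word-to-concept table is a hypothetical stand-in for a WordNet lookup, the embeddings and the filter are random toy values, and the helper names (to_concepts, make_embeddings, conv1d_max_pool) are invented for illustration. The sketch shows how mapping words to concepts shrinks the vocabulary, how word and concept vectors can be concatenated per position, and how a one-dimensional convolution with max-over-time pooling turns the sequence into a single feature.

```python
import random

# Hypothetical word-to-concept table standing in for a WordNet lookup
# (for example, mapping each noun to a hypernym). Not taken from the paper.
WORD_TO_CONCEPT = {
    "wheat": "grain",
    "corn": "grain",
    "barley": "grain",
    "oil": "fuel",
    "gas": "fuel",
}

def to_concepts(tokens):
    """Replace each token by its concept; unknown tokens map to themselves."""
    return [WORD_TO_CONCEPT.get(t, t) for t in tokens]

def make_embeddings(vocab, dim, seed):
    """Assign a fixed random vector to every vocabulary item (toy embeddings)."""
    rng = random.Random(seed)
    return {v: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for v in sorted(vocab)}

def conv1d_max_pool(rows, filt):
    """Slide a filter spanning len(filt) consecutive rows over the sequence,
    apply ReLU, and return the maximum over all positions."""
    width = len(filt)
    best = 0.0  # ReLU output is never below zero
    for start in range(len(rows) - width + 1):
        window = rows[start:start + width]
        score = sum(
            w * x
            for f_row, row in zip(filt, window)
            for w, x in zip(f_row, row)
        )
        best = max(best, max(0.0, score))
    return best

tokens = ["wheat", "corn", "barley", "prices", "rose", "oil", "gas"]
concepts = to_concepts(tokens)

word_emb = make_embeddings(set(tokens), dim=3, seed=0)
concept_emb = make_embeddings(set(concepts), dim=3, seed=1)

# Each position is the word vector concatenated with its concept vector.
combined = [word_emb[t] + concept_emb[c] for t, c in zip(tokens, concepts)]

# One filter covering 2 consecutive positions of the 6-dimensional input.
rng = random.Random(2)
filt = [[rng.uniform(-1.0, 1.0) for _ in range(6)] for _ in range(2)]
feature = conv1d_max_pool(combined, filt)

print(len(set(tokens)), "distinct words ->", len(set(concepts)), "distinct concepts")
print("combined vector length per position:", len(combined[0]))
print("pooled feature:", feature)
```

In this toy example the seven distinct words collapse to four distinct concepts, which is the kind of dimensionality reduction the abstract attributes to concept generation. A real model would use many filters of several widths and learn both the embeddings and the filter weights from labeled data.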
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Premchander, K., Sarma, S.S.V.N., Vaishali, K., Vijaypal Reddy, P., Anjaneyulu, M., Nagaprasad, S. (2018). WordNet-Based Text Categorization Using Convolutional Neural Networks. In: Tiwari, B., Tiwari, V., Das, K., Mishra, D., Bansal, J. (eds) Proceedings of International Conference on Recent Advancement on Computer and Communication. Lecture Notes in Networks and Systems, vol 34. Springer, Singapore. https://doi.org/10.1007/978-981-10-8198-9_25
DOI: https://doi.org/10.1007/978-981-10-8198-9_25
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8197-2
Online ISBN: 978-981-10-8198-9
eBook Packages: Engineering, Engineering (R0)