Convolutional Neural Networks

  • Uday Kamath
  • John Liu
  • James Whitaker
Chapter

Abstract

In the last few years, convolutional neural networks (CNNs), along with recurrent neural networks (RNNs), have become basic building blocks for constructing complex deep learning solutions to various NLP, speech, and time series tasks. LeCun first introduced certain basic parts of the CNN framework as a general neural network framework for solving high-dimensional data problems in computer vision, speech, and time series. Krizhevsky et al. applied convolutions to recognize objects in ImageNet images; by improving substantially on the state of the art, their work revived interest in deep learning and CNNs. Collobert et al. pioneered the application of CNNs to NLP tasks such as POS tagging, chunking, named entity recognition, and semantic role labeling. Many changes to CNNs, including input representation, number of layers, types of pooling, optimization techniques, and applications to various NLP tasks, have been active subjects of research in the last decade.

References

  1. [AS17a]
    Heike Adel and Hinrich Schütze. “Global Normalization of Convolutional Neural Networks for Joint Entity and Relation Classification”. In: EMNLP. Association for Computational Linguistics, 2017, pp. 1723–1729.
  2. [Bro+94]
    Jane Bromley et al. “Signature Verification using a “Siamese” Time Delay Neural Network”. In: Advances in Neural Information Processing Systems 6. Ed. by J. D. Cowan, G. Tesauro, and J. Alspector. Morgan-Kaufmann, 1994, pp. 737–744.
  3. [Che+15]
    Yubo Chen et al. “Event Extraction via Dynamic Multi-Pooling Convolutional Neural Networks”. In: ACL (1). The Association for Computer Linguistics, 2015, pp. 167–176.
  4. [CL16]
    Jianpeng Cheng and Mirella Lapata. “Neural Summarization by Extracting Sentences and Words”. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2016, pp. 484–494.
  5. [Col+11]
    R. Collobert et al. “Natural Language Processing (Almost) from Scratch”. In: Journal of Machine Learning Research 12 (2011), pp. 2493–2537.
  6. [CW08b]
    Ronan Collobert and Jason Weston. “A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning”. In: Proceedings of the 25th International Conference on Machine Learning. ICML ’08. 2008.
  7. [Con+16]
    Alexis Conneau et al. “Very Deep Convolutional Networks for Natural Language Processing”. In: CoRR abs/1606.01781 (2016).
  8. [Den+14]
    Misha Denil et al. “Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network.” In: CoRR abs/1406.3830 (2014).
  9. [Don+15b]
    Li Dong et al. “Question Answering over Freebase with Multi-Column Convolutional Neural Networks”. In: Proceedings of the International Joint Conference on Natural Language Processing. Association for Computational Linguistics, 2015, pp. 260–269.
  10. [DSZ14]
    Cícero Nogueira Dos Santos and Bianca Zadrozny. “Learning Character-level Representations for Part-of-speech Tagging”. In: Proceedings of the 31st International Conference on International Conference on Machine Learning - Volume 32. ICML’14. 2014, pp. II–1818–II–1826.
  11. [Geh+17b]
    Jonas Gehring et al. “Convolutional Sequence to Sequence Learning”. In: Proceedings of the 34th International Conference on Machine Learning. Ed. by Doina Precup and Yee Whye Teh. Vol. 70. Proceedings of Machine Learning Research. 2017, pp. 1243–1252.
  12. [Hu+15]
    Baotian Hu et al. “Context-Dependent Translation Selection Using Convolutional Neural Network”. In: ACL (2). The Association for Computer Linguistics, 2015, pp. 536–541.
  13. [JZ15]
    Rie Johnson and Tong Zhang. “Semi-supervised Convolutional Neural Networks for Text Categorization via Region Embedding”. In: Advances in Neural Information Processing Systems 28. Ed. by C. Cortes et al. 2015, pp. 919–927.
  14. [KGB14b]
    Nal Kalchbrenner, Edward Grefenstette, and Phil Blunsom. “A Convolutional Neural Network for Modelling Sentences”. In: CoRR abs/1404.2188 (2014).
  15. [Kim14b]
    Yoon Kim. “Convolutional Neural Networks for Sentence Classification”. In: CoRR abs/1408.5882 (2014).
  16. [KSH12a]
    Alex Krizhevsky, I Sutskever, and G. E Hinton. “ImageNet Classification with Deep Convolutional Neural Networks”. In: Advances in Neural Information Processing Systems (NIPS 2012). 2012, p. 4.
  17. [LBC17]
    Jey Han Lau, Timothy Baldwin, and Trevor Cohn. “Topically Driven Neural Language Model”. In: ACL (1). Association for Computational Linguistics, 2017, pp. 355–365.
  18. [LG16]
    Andrew Lavin and Scott Gray. “Fast Algorithms for Convolutional Neural Networks”. In: CVPR. IEEE Computer Society, 2016, pp. 4013–4021.
  19. [LB95]
    Y. LeCun and Y. Bengio. “Convolutional Networks for Images, Speech, and Time-Series”. In: The Handbook of Brain Theory and Neural Networks. 1995.
  20. [LeC+98]
    Yann LeCun et al. “Gradient-Based Learning Applied to Document Recognition”. In: Proceedings of the IEEE. Vol. 86. 1998, pp. 2278–2324.
  21. [Li+15]
    Yujia Li et al. “Gated Graph Sequence Neural Networks”. In: CoRR abs/1511.05493 (2015).
  22. [LZ16]
    Depeng Liang and Yongdong Zhang. “AC-BLSTM: Asymmetric Convolutional Bidirectional LSTM Networks for Text Classification”. In: CoRR abs/1611.01884 (2016).
  23. [Ma+15]
    Mingbo Ma et al. “Tree-based Convolution for Sentence Modeling”. In: CoRR abs/1507.01839 (2015).
  24. [Men+15]
    Fandong Meng et al. “Encoding Source Language with Convolutional Neural Network for Machine Translation”. In: ACL (1). The Association for Computer Linguistics, 2015, pp. 20–30.
  25. [Mou+14]
    Lili Mou et al. “TBCNN: A Tree-Based Convolutional Neural Network for Programming Language Processing”. In: CoRR abs/1409.5718 (2014).
  26. [NG15b]
    Thien Huu Nguyen and Ralph Grishman. “Relation Extraction: Perspective from Convolutional Neural Networks”. In: Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing. Association for Computational Linguistics, 2015, pp. 39–48.
  27. [RSA15]
    Oren Rippel, Jasper Snoek, and Ryan P. Adams. “Spectral Representations for Convolutional Neural Networks”. In: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2. NIPS’15. 2015, pp. 2449–2457.
  28. [SFH17]
    Sara Sabour, Nicholas Frosst, and Geoffrey E Hinton. “Dynamic Routing Between Capsules”. In: 2017, pp. 3856–3866.
  29. [SG14]
    Cicero dos Santos and Maira Gatti. “Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts”. In: Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. 2014.
  30. [SM15]
    Aliaksei Severyn and Alessandro Moschitti. “Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks”. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval. SIGIR ’15. 2015, pp. 373–382.
  31. [SZ14]
    Karen Simonyan and Andrew Zisserman. “Very Deep Convolutional Networks for Large-Scale Image Recognition”. In: 2014.
  32. [Sze+17]
    Christian Szegedy et al. “Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning”. In: AAAI. AAAI Press, 2017, pp. 4278–4284.
  33. [Wan+15a]
    Peng Wang et al. “Semantic Clustering and Convolutional Neural Network for Short Text Categorization”. In: Proceedings of the 7th International Joint Conference on Natural Language Processing. 2015.
  34. [XC16]
    Yijun Xiao and Kyunghyun Cho. “Efficient Character-level Document Classification by Combining Convolution and Recurrent Layers”. In: CoRR abs/1602.00367 (2016).
  35. [Xu+17]
    Jiaming Xu et al. “Self-Taught convolutional neural networks for short text clustering”. In: Neural Networks 88 (2017), pp. 22–31.
  36. [YS16]
    Wenpeng Yin and Hinrich Schütze. “Multichannel Variable-Size Convolution for Sentence Classification”. In: CoRR abs/1603.04513 (2016).
  37. [Yin+16a]
    Wenpeng Yin et al. “ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs”. In: Transactions of the Association for Computational Linguistics 4 (2016), pp. 259–272.
  38. [Yin+16b]
    Wenpeng Yin et al. “Simple Question Answering by Attentive Convolutional Neural Network”. In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. The COLING 2016 Organizing Committee, 2016, pp. 1746–1756.
  39. [YK15]
    Fisher Yu and Vladlen Koltun. “Multi-Scale Context Aggregation by Dilated Convolutions”. In: CoRR abs/1511.07122 (2015).
  40. [Yu+14]
    Lei Yu et al. “Deep Learning for Answer Sentence Selection”. In: CoRR abs/1412.1632 (2014).
  41. [ZF13b]
    Matthew D. Zeiler and Rob Fergus. “Stochastic Pooling for Regularization of Deep Convolutional Neural Networks”. In: CoRR abs/1301.3557 (2013).
  42. [ZZL15]
    Xiang Zhang, Junbo Jake Zhao, and Yann LeCun. “Character level Convolutional Networks for Text Classification”. In: CoRR abs/1509.01626 (2015).
  43. [ZRW16]
    Ye Zhang, Stephen Roller, and Byron C. Wallace. “MGNC-CNN: A Simple Approach to Exploiting Multiple Word Embeddings for Sentence Classification”. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 2016, pp. 1522–1527.
  44. [ZW17]
    Ye Zhang and Byron Wallace. “A Sensitivity Analysis of (and Practitioners’ Guide to) Convolutional Neural Networks for Sentence Classification”. In: Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers). Asian Federation of Natural Language Processing, 2017, pp. 253–263.
  45. [Zhe+15]
    Xiaoqing Zheng et al. “Character-based Parsing with Convolutional Neural Network”. In: Proceedings of the 24th International Conference on Artificial Intelligence. IJCAI’15. 2015, pp. 1054–1060.
  46. [Zho+15]
    Chunting Zhou et al. In: CoRR abs/1511.08630 (2015).
  47. [Zhu+15]
    Chenxi Zhu et al. “A Re-ranking Model for Dependency Parser with Recursive Convolutional Neural Network”. In: Proceedings of International Joint Conference on Natural Language Processing. 2015, pp. 1159–1168.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Uday Kamath (1)
  • John Liu (2)
  • James Whitaker (1)

  1. Digital Reasoning Systems Inc., McLean, USA
  2. Intelluron Corporation, Nashville, USA