Automatic Extraction of Cognitive Features from Gaze Data
Cognitive NLP systems, i.e., NLP systems that make use of behavioral data, augment traditional text-based features with cognitive features extracted from eye-movement patterns, EEG signals, brain imaging, etc. As seen in the previous chapter, such feature extraction has typically been manual. We now contend that manual feature extraction is not sufficient to capture the textual subtleties that characterize complex classification tasks such as sentiment analysis and sarcasm detection, and that even the extraction and choice of features should be delegated to the learning system. We introduce a framework that automatically extracts cognitive features from the eye-movement data of human readers reading the text and uses them, along with textual features, for the tasks of sentiment polarity and sarcasm detection. The proposed framework is based on a Convolutional Neural Network (CNN), which learns features from both gaze and text and uses them to classify the input text. We test our technique on published sentiment- and sarcasm-labeled datasets enriched with gaze information, and show that a combination of automatically learned text and gaze features yields better classification performance than (i) CNN-based systems that rely on text input alone and (ii) existing systems that rely on handcrafted gaze and textual features.
Keywords: Gaze Data; Sarcasm Detection; Convolutional Neural Network (CNN); Gaze Information; Sentiment Polarity
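The idea of learning features jointly from text and gaze can be illustrated with a minimal sketch. This is not the chapter's actual architecture: it assumes one convolution-plus-max-pooling branch per modality, a single sigmoid output, and a per-word gaze representation (for example, fixation duration per word). All function names, filter sizes, and feature choices below are hypothetical and for illustration only.

```python
import math
import random


def conv1d_max_pool(seq, filters):
    """Slide each filter over the sequence and keep its largest ReLU response.

    seq:     list of T vectors, each of length d (one vector per word)
    filters: list of filters; each filter is a list of k weight vectors of length d
    Returns one pooled value per filter (max-over-time pooling).
    """
    pooled = []
    for filt in filters:
        k = len(filt)
        best = 0.0  # ReLU floor; also the result if the sequence is shorter than the filter
        for start in range(len(seq) - k + 1):
            window = seq[start:start + k]
            act = sum(w * x
                      for w_row, x_row in zip(filt, window)
                      for w, x in zip(w_row, x_row))
            best = max(best, act)
        pooled.append(best)
    return pooled


def classify(text_emb, gaze_feats, text_filters, gaze_filters, out_weights, out_bias):
    """Concatenate pooled text and gaze features and apply a sigmoid output unit."""
    features = (conv1d_max_pool(text_emb, text_filters)
                + conv1d_max_pool(gaze_feats, gaze_filters))
    logit = sum(w * f for w, f in zip(out_weights, features)) + out_bias
    return 1.0 / (1.0 + math.exp(-logit)), features


def random_filters(n, k, d, rng):
    """Hypothetical helper: n untrained filters of width k over d-dimensional inputs."""
    return [[[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(k)]
            for _ in range(n)]


rng = random.Random(0)
# Toy input: 6 words, 4-dim word embeddings, 1-dim gaze feature per word (assumed).
text_emb = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(6)]
gaze_feats = [[rng.uniform(0, 1)] for _ in range(6)]
text_filters = random_filters(n=3, k=2, d=4, rng=rng)
gaze_filters = random_filters(n=2, k=3, d=1, rng=rng)
out_weights = [rng.uniform(-0.5, 0.5) for _ in range(5)]

prob, features = classify(text_emb, gaze_feats, text_filters, gaze_filters,
                          out_weights, out_bias=0.0)
```

In a trained system the filter and output weights would be learned from labeled data rather than drawn at random; the sketch only shows how the two modalities could be convolved separately and merged before classification.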