Estimating Annotation Complexities of Text Using Gaze and Textual Information

  • Abhijit Mishra
  • Pushpak Bhattacharyya
Part of the Cognitive Intelligence and Robotics book series (CIR)


Supervised, data-driven methods for NLP tasks such as part-of-speech tagging, dependency parsing, and machine translation fundamentally require large-scale annotated data. As statistical methods have taken the place of rule/heuristic-based methods over the years, text annotation has become an essential part of NLP research. Annotation refers to the task of manually labeling text, images, or other data with comments, explanations, tags, or markup; in NLP, it is often carried out by linguists who label raw text. While the outcome of the annotation process, i.e., the labeled data, is valuable, capturing user activities during annotation may also help in understanding the cognitive subprocesses underlying text annotation.





Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. India Research Lab, IBM Research, Bangalore, India
  2. Indian Institute of Technology Patna, Patna, India
