Estimating Annotation Complexities of Text Using Gaze and Textual Information

Mishra, Abhijit; Bhattacharyya, Pushpak

doi:10.1007/978-981-13-1516-9_3

Part of the book series: Cognitive Intelligence and Robotics (CIR)

Abstract

Supervised data-driven methods for NLP tasks such as part-of-speech tagging, dependency parsing, and machine translation fundamentally require large-scale annotated data. Since statistical methods have taken the place of rule-based/heuristic methods over the years, text annotation has become an essential part of NLP research. Annotation refers to the task of manually labeling text, images, or other data with comments, explanations, tags, or markup; in NLP, it is often carried out by linguists to label raw text. While the outcome of the annotation process, i.e., the labeled data, is valuable, capturing user activities may help in understanding the cognitive subprocesses underlying text annotation.

Declaration: Consent of the subjects participating in the eye-tracking experiments for collecting data used for the work reported in this chapter has been obtained.

Notes

1. http://en.wikipedia.org/wiki/Annotation.
2. http://www.translog.dk.
3. 20% of the translation sessions were discarded because the gaze logs for these sessions were difficult to rectify.
4. Anything beyond the upper bound is hard to translate and can be assigned the maximum score.
5. http://jbauman.com/gsl.html.
6. http://www.victoria.ac.nz/lals/resources/academicwordlist/.
7. http://wordnet.princeton.edu.
8. http://nlp.stanford.edu/software/corenlp.shtml.
9. The MSE values are absolute, as opposed to the percentage values presented in the paper. Also, the results reported here differ slightly from the paper because an updated version of the TPR dataset was used for this experiment.
10. The online version that was active in 2013.
11. BLEU, another popular metric, was not used, as techniques to measure sentence-wise BLEU scores did not exist at the time of this experiment. Moreover, BLEU may not be the most appropriate metric for evaluating English–Indian language translation, as shown by Ananthakrishnan et al. (2007).
12. The fixation duration per word is calculated for each sentence, and an average is taken.
13. The complete eye-tracking data (with recorded values of fixations, saccades, eye regression patterns, pupil dilation, and gaze-to-word mapping) are available for academic use at http://www.cfilt.iitb.ac.in/~cognitive-nlp.
14. http://scikit-learn.org/stable/.
15. In the case of SVM, the probability of the predicted class is computed as given in Platt (1999).
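
The computation described in Note 12 can be illustrated with a minimal sketch. It assumes that each sentence is summarized by its total fixation duration and its word count, and that the average is taken over sentences; the notes do not state the averaging unit, so this is only one possible reading. The function names and the sample values are hypothetical and are not taken from the chapter's data.

```python
from statistics import mean


def fixation_duration_per_word(total_fixation_ms: float, n_words: int) -> float:
    """Total fixation time on one sentence divided by its word count."""
    if n_words <= 0:
        raise ValueError("a sentence must contain at least one word")
    return total_fixation_ms / n_words


def average_fixation_per_word(sentences):
    """Average the per-word fixation duration over (total_ms, n_words) pairs."""
    return mean(fixation_duration_per_word(ms, n) for ms, n in sentences)


# Hypothetical sample: (total fixation duration in ms, number of words)
sample = [(2400.0, 12), (1500.0, 5), (3200.0, 16)]
avg = average_fixation_per_word(sample)
```

Normalizing by word count before averaging keeps long sentences from dominating the result purely because of their length.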

References

Ananthakrishnan, R., Bhattacharyya, P., Sasikumar, M., & Shah, R. M. (2007). Some issues in automatic evaluation of English-Hindi MT: More blues for BLEU. In ICON.
Balahur, A., Hermida, J. M., & Montoyo, A. (2011). Detecting implicit expressions of sentiment in text based on commonsense knowledge. In Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (pp. 53–60). Association for Computational Linguistics.
Bird, S. (2006). NLTK: The natural language toolkit. In Proceedings of the COLING/ACL on Interactive Presentation Sessions (pp. 69–72). Association for Computational Linguistics.
Campbell, S., & Hale, S. (1999). What makes a text difficult to translate? In Refereed Proceedings of the 23rd Annual ALAA Congress.
Carl, M. (2012a). The CRITT TPR-DB 1.0: A database for empirical human translation process research. In AMTA 2012 Workshop on Post-Editing Technology and Practice (WPTP-2012).
Carl, M. (2012b). Translog-II: A program for recording user activity data for empirical reading and writing research. In LREC (pp. 4108–4112).
Chall, J. S., & Dale, E. (1995). Readability revisited: The new Dale-Chall readability formula. Cambridge: Brookline Books.
Denkowski, M., & Lavie, A. (2011). Meteor 1.3: Automatic metric for reliable optimization and evaluation of machine translation systems. In Proceedings of the Sixth Workshop on Statistical Machine Translation (pp. 85–91). Association for Computational Linguistics.
Dragsted, B. (2010). Coordination of reading and writing processes in translation. Translation and Cognition, 15, 41.
Esuli, A. & Sebastiani, F. (2006). Sentiwordnet: A publicly available lexical resource for opinion mining. In Proceedings of LREC (Vol. 6, pp. 417–422).
Fellbaum, C. (1998). WordNet. Wiley Online Library.
Fort, K., Nazarenko, A., & Rosset, S. (2012). Modeling the complexity of manual annotation tasks: a grid of analysis. In International Conference on Computational Linguistics (pp. 895–910).
Ganapathibhotla, G. & Liu, B. (2008). Identifying preferred entities in comparative sentences. In Proceedings of the International Conference on Computational Linguistics, COLING.
Gunning, R. (1969). The fog index after twenty years. Journal of Business Communication, 6(2), 3–13.
Hornof, A. J., & Halverson, T. (2002). Cleaning up systematic error in eye-tracking data by using required fixation locations. Behavior Research Methods, Instruments, & Computers, 34(4), 592–604.
Joachims, T. (2006). Training linear SVMs in linear time. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 217–226). ACM.
Joshi, S., Kanojia, D., & Bhattacharyya, P. (2013). More than meets the eye: Study of human cognition in sense annotation. In NAACL HLT 2013. Atlanta, USA.
Kincaid, J. P., Fishburne, R. P. Jr., Rogers, R. L., & Chissom, B. S. (1975). Derivation of new readability formulas (automated readability index, fog count and flesch reading ease formula) for navy enlisted personnel. Technical report, DTIC Document.
Lin, D. (1996). On the structural complexity of natural language sentences. In Proceedings of the 16th Conference on Computational Linguistics (Vol. 2, pp. 729–733). Association for Computational Linguistics.
Martínez-Gómez, P., & Aizawa, A. (2013). Diagnosing causes of reading difficulty using Bayesian networks. In IJCNLP.
McAuley, J. J. & Leskovec, J. (2013). From amateurs to connoisseurs: Modeling the evolution of user expertise through online reviews. In Proceedings of the 22nd International Conference on World Wide Web (pp. 897–908). International World Wide Web Conferences Steering Committee.
Mishra, A., Bhattacharyya, P., & Carl, M. (2013). Automatically predicting sentence translation difficulty. In ACL (Vol. 2, pp. 346–351).
Mishra, A., Carl, M., & Bhattacharyya, P. (2012). A heuristic-based approach for systematic error correction of gaze data for reading. In Proceedings of the First Workshop on Eyetracking and Natural Language Processing. Mumbai, India.
Pang, B., & Lee, L. (2005). Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics (pp. 115–124). Association for Computational Linguistics.
Platt, J. C. (1999). Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. In Advances in large margin classifiers. Citeseer.
Ramteke, A., Malu, A., Bhattacharyya, P., & Nath, J. S. (2013). Detecting turnarounds in sentiment analysis: Thwarting. In ACL (Vol. 2, pp. 860–865).
Riloff, E., Qadir, A., Surve, P., De Silva, L., Gilbert, N., & Huang, R. (2013). Sarcasm as contrast between a positive sentiment and negative situation. In Proceedings of Empirical Methods in Natural Language Processing (pp. 704–714).
Scott, G. G., O’Donnell, P. J., & Sereno, S. C. (2012). Emotion words affect eye fixations during reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 38(3), 783.
Snover, M., Dorr, B., Schwartz, R., Micciulla, L., & Makhoul, J. (2006). A study of translation edit rate with targeted human annotation. In Proceedings of Association for Machine Translation in the Americas (Vol. 200).
Von der Malsburg, T., & Vasishth, S. (2011). What is the scanpath signature of syntactic reanalysis? Journal of Memory and Language, 65(2), 109–127.

Author information

Authors and Affiliations

India Research Lab, IBM Research, Bangalore, Karnataka, India
Abhijit Mishra
Indian Institute of Technology Patna, Patna, Bihar, India
Pushpak Bhattacharyya

Corresponding author

Correspondence to Abhijit Mishra.

Cite this chapter

Mishra, A., Bhattacharyya, P. (2018). Estimating Annotation Complexities of Text Using Gaze and Textual Information. In: Cognitively Inspired Natural Language Processing. Cognitive Intelligence and Robotics. Springer, Singapore. https://doi.org/10.1007/978-981-13-1516-9_3

DOI: https://doi.org/10.1007/978-981-13-1516-9_3
Published: 02 August 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1515-2
Online ISBN: 978-981-13-1516-9
eBook Packages: Computer Science (R0)
