Abstract
Manually annotated data forms the basis of a large number of tasks in natural language processing, serving as either evaluation or training data. Annotating large amounts of data with dedicated full-time annotators can be expensive and may exceed the budgets of many research projects. An alternative is crowd-sourcing, in which the annotation work is split among many part-time annotators. This paper presents a freely available, open-source platform for crowd-sourcing manual annotation tasks and describes its application to annotating causative relations.
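The full text is not reproduced here, but the abstract's two central mechanics invite a small illustration: splitting an annotation workload across many part-time annotators, and checking the quality of the resulting labels with an inter-annotator agreement measure such as Cohen's kappa, a standard choice for this kind of task. The sketch below is a hypothetical Python illustration; the function names, the round-robin assignment scheme, and the toy causative-relation labels are assumptions made for demonstration, not the API or method of the platform described in the paper.

from collections import Counter
from itertools import cycle, islice

def assign_items(items, annotators, per_item=2):
    # Hypothetical round-robin scheme: route each item to `per_item`
    # annotators so that labels overlap and agreement can be measured.
    # This is an illustration, not the assignment logic of the actual tool.
    pool = cycle(annotators)
    return {item: list(islice(pool, per_item)) for item in items}

def cohens_kappa(labels_a, labels_b):
    # Cohen's kappa (Cohen, 1960): chance-corrected agreement between
    # two annotators who labelled the same items in the same order.
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return 1.0 if expected == 1 else (observed - expected) / (1 - expected)

if __name__ == "__main__":
    sentences = ["s1", "s2", "s3", "s4"]
    workers = ["ann1", "ann2", "ann3"]
    print(assign_items(sentences, workers))  # each sentence gets 2 annotators
    # Two annotators deciding whether each sentence expresses a causative
    # relation (toy labels, for illustration only):
    a = ["causal", "causal", "none", "causal", "none", "none"]
    b = ["causal", "none",   "none", "causal", "none", "causal"]
    print(round(cohens_kappa(a, b), 3))  # 0.333: "fair" agreement

Pairwise kappa over the overlapping items gives a cheap screen for unreliable part-time annotators before their labels are used as evaluation or training data.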
About this paper
Cite this paper
Drury, B., Cardoso, P.C.F., Valverde-Rebaza, J., Valejo, A., Pereira, F., de Andrade Lopes, A. (2014). An Open Source Tool for Crowd-Sourcing the Manual Annotation of Texts. In: Baptista, J., Mamede, N., Candeias, S., Paraboni, I., Pardo, T.A.S., Volpe Nunes, M.d.G. (eds.) Computational Processing of the Portuguese Language. PROPOR 2014. Lecture Notes in Computer Science, vol. 8775. Springer, Cham. https://doi.org/10.1007/978-3-319-09761-9_31
Print ISBN: 978-3-319-09760-2
Online ISBN: 978-3-319-09761-9