
An Open Source Tool for Crowd-Sourcing the Manual Annotation of Texts

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8775)

Abstract

Manually annotated data is the basis for a large number of tasks in natural language processing, serving as either evaluation or training data. Annotating large amounts of data with dedicated full-time annotators can be expensive and may be beyond the budgets of many research projects. An alternative is crowd-sourcing, where the annotation work is split among many part-time annotators. This paper presents a freely available, open-source platform for crowd-sourcing manual annotation tasks and describes its application to annotating causative relations.
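The abstract does not describe how the platform combines the labels produced by its part-time annotators. The sketch below is only an illustration of the general crowd-sourcing workflow it refers to, assuming majority-vote aggregation and pairwise Cohen's kappa for agreement; the function names and the toy causative-relation labels are hypothetical and not taken from the paper.

# Illustrative sketch only: aggregate crowd-sourced labels by majority vote
# and measure pairwise inter-annotator agreement with Cohen's kappa.
from collections import Counter
from itertools import combinations

def majority_label(labels):
    """Return the most frequent label for one item (ties broken arbitrarily)."""
    return Counter(labels).most_common(1)[0][0]

def cohens_kappa(a, b):
    """Cohen's kappa between two annotators' label sequences of equal length."""
    assert len(a) == len(b)
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    expected = sum(freq_a[l] * freq_b[l] for l in set(a) | set(b)) / (n * n)
    return 1.0 if expected == 1 else (observed - expected) / (1 - expected)

# Hypothetical annotations: three annotators mark whether each sentence
# expresses a causative relation ("causal") or not ("none").
annotations = {
    "ann1": ["causal", "none", "causal", "causal"],
    "ann2": ["causal", "none", "none", "causal"],
    "ann3": ["causal", "causal", "causal", "causal"],
}

# Aggregate each item's labels by majority vote.
items = list(zip(*annotations.values()))
gold = [majority_label(labels) for labels in items]
print("aggregated labels:", gold)

# Report pairwise agreement between annotators.
for (n1, l1), (n2, l2) in combinations(annotations.items(), 2):
    print(f"kappa({n1}, {n2}) = {cohens_kappa(l1, l2):.2f}")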





Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Drury, B., Cardoso, P.C.F., Valverde-Rebaza, J., Valejo, A., Pereira, F., de Andrade Lopes, A. (2014). An Open Source Tool for Crowd-Sourcing the Manual Annotation of Texts. In: Baptista, J., Mamede, N., Candeias, S., Paraboni, I., Pardo, T.A.S., Volpe Nunes, M.d.G. (eds) Computational Processing of the Portuguese Language. PROPOR 2014. Lecture Notes in Computer Science, vol 8775. Springer, Cham. https://doi.org/10.1007/978-3-319-09761-9_31


  • DOI: https://doi.org/10.1007/978-3-319-09761-9_31

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09760-2

  • Online ISBN: 978-3-319-09761-9

  • eBook Packages: Computer Science, Computer Science (R0)
