Abstract
Annotated corpora are an important resource for evaluating and comparing competing methods, and for training supervised learning approaches. When creating a new corpus with the help of human annotators, annotation practitioners pursue two goals: minimizing the required resources (efficiency) and maximizing the resulting annotation quality (effectiveness). Optimizing both criteria is a challenging problem, especially in specialized domains (e.g., medical, legal). The aim of my PhD thesis is to create novel annotation methods for efficient and effective data acquisition. In this paper, methods and preliminary results are described for two ongoing annotation projects: medical information extraction and question answering.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Zlabinger, M. (2019). Improving the Annotation Efficiency and Effectiveness in the Text Domain. In: Azzopardi, L., Stein, B., Fuhr, N., Mayr, P., Hauff, C., Hiemstra, D. (eds) Advances in Information Retrieval. ECIR 2019. Lecture Notes in Computer Science(), vol 11438. Springer, Cham. https://doi.org/10.1007/978-3-030-15719-7_46
DOI: https://doi.org/10.1007/978-3-030-15719-7_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-15718-0
Online ISBN: 978-3-030-15719-7