Abstract
We present a probabilistic framework for inferring coreference relations among person names in a news collection. The approach does not assume any prior knowledge (e.g. an ontology) about the persons mentioned in the collection, and requires only basic linguistic processing (named entity recognition) and resources (a dictionary of person names). The system parameters were estimated on a 5K corpus of Italian news documents. An evaluation over a sample of four days of news documents shows that the error rate of the system (1.4%) is lower than the baseline (5.4%) for the task. Finally, we discuss alternative approaches to evaluation.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Popescu, O., Magnini, B. (2007). Inferring Coreferences Among Person Names in a Large Corpus of News Collections. In: Basili, R., Pazienza, M.T. (eds) AI*IA 2007: Artificial Intelligence and Human-Oriented Computing. AI*IA 2007. Lecture Notes in Computer Science, vol 4733. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74782-6_32
DOI: https://doi.org/10.1007/978-3-540-74782-6_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74781-9
Online ISBN: 978-3-540-74782-6
eBook Packages: Computer Science, Computer Science (R0)