Abstract
We are developing a reference disambiguation system called the NAYOSE System. To handle cases where the same person name or place name appears on two or more Web pages, we propose a system that classifies each page into a cluster corresponding to a single real-world entity. For this purpose, we propose two new methods for classifying these pages. In our evaluation, the combination of local text matching and named entity matching outperformed the previous baseline algorithm, which is used in a simple document classification method, by 0.22 in overall F-measure.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ono, S., Yoshida, M., Nakagawa, H. (2006). NAYOSE: A System for Reference Disambiguation of Proper Nouns Appearing on Web Pages. In: Ng, H.T., Leong, MK., Kan, MY., Ji, D. (eds) Information Retrieval Technology. AIRS 2006. Lecture Notes in Computer Science, vol 4182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11880592_26
DOI: https://doi.org/10.1007/11880592_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45780-0
Online ISBN: 978-3-540-46237-8
eBook Packages: Computer Science, Computer Science (R0)