Abstract
In this paper we present a system for the Web People Search task, which consists of clustering together the web pages that refer to the same person. We modify the vector space model approach in order to obtain a more flexible clustering technique. We have implemented a dynamic weighting procedure for the attributes common to different clusters, in order to maximize the between-cluster variance with respect to the within-cluster variance. We show that in this way undesired collateral effects, such as superposition and masking, are alleviated. The system obtains results similar to those reported by the top three systems at the SemEval 2007 competition.
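To make the variance criterion concrete, the following is a minimal sketch of one way attribute weights could be derived from a ratio of between-cluster to within-cluster variance. It is an illustration only and not the procedure implemented in the paper: the function name `variance_ratio_weights`, the `eps` smoothing term, the normalization to a unit sum, and the toy data are all assumptions introduced here.

```python
from typing import List

Vector = List[float]


def variance_ratio_weights(clusters: List[List[Vector]], eps: float = 1e-9) -> List[float]:
    """Weight each attribute by between-cluster / within-cluster scatter.

    Hypothetical helper for illustration; not taken from the paper.
    `clusters` is a list of clusters, each a list of equal-length vectors.
    """
    all_points = [p for cluster in clusters for p in cluster]
    n_attrs = len(all_points[0])
    weights = []
    for j in range(n_attrs):
        overall_mean = sum(p[j] for p in all_points) / len(all_points)
        between = 0.0
        within = 0.0
        for cluster in clusters:
            if not cluster:
                continue  # an empty cluster contributes nothing
            mean = sum(p[j] for p in cluster) / len(cluster)
            # Scatter of the cluster mean around the overall mean
            between += len(cluster) * (mean - overall_mean) ** 2
            # Scatter of the cluster members around their own mean
            within += sum((p[j] - mean) ** 2 for p in cluster)
        # eps (an assumption) avoids division by zero for constant attributes
        weights.append(between / (within + eps))
    total = sum(weights)
    if total == 0.0:
        # No attribute separates the clusters: fall back to uniform weights
        return [1.0 / n_attrs] * n_attrs
    return [w / total for w in weights]


# Toy data: attribute 0 separates the two clusters, attribute 1 does not.
toy_clusters = [
    [[1.0, 5.0], [1.2, 1.0]],
    [[3.0, 4.9], [3.2, 1.1]],
]
toy_weights = variance_ratio_weights(toy_clusters)
```

In this toy setting the attribute that distinguishes the clusters receives almost all of the weight, while the attribute that varies mostly inside each cluster is down-weighted. A dynamic procedure would recompute such weights as the clusters change.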
© 2009 Springer-Verlag Berlin Heidelberg
Cite this paper
Popescu, O., Magnini, B. (2009). Alleviating the Problem of Wrong Coreferences in Web Person Search. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2009. Lecture Notes in Computer Science, vol 5449. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00382-0_23
DOI: https://doi.org/10.1007/978-3-642-00382-0_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00381-3
Online ISBN: 978-3-642-00382-0
eBook Packages: Computer Science, Computer Science (R0)