Alleviating the Problem of Wrong Coreferences in Web Person Search

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5449)

Abstract

In this paper we present a system for the Web People Search task, which is the task of clustering together the pages referring to the same person. The vector space model approach is modified in order to develop a more flexible clustering technique. We have implemented a dynamic weighting procedure for the attributes common to different clusters in order to maximize the between-cluster variance with respect to the within-cluster variance. We show that in this way undesired collateral effects such as superposition and masking are alleviated. The system we present obtains results similar to those reported by the top three systems presented at the SEMEVAL 2007 competition.
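
The abstract states only the goal of the weighting procedure, not its formula. As a rough illustration of the idea, the sketch below scores each attribute by the ratio of between-cluster to within-cluster variance (a Fisher-style criterion, chosen here as an assumption) and uses the scores in a weighted cosine similarity. The function names `feature_weights` and `weighted_cosine`, the toy vectors, and the attribute labels are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def feature_weights(clusters, eps=1e-9):
    """Score each attribute by between-cluster / within-cluster variance.

    `clusters` is a list of clusters; each cluster is a list of
    equal-length term-frequency vectors. This ratio is an assumed
    stand-in for the paper's dynamic weighting, not its actual formula.
    """
    dim = len(clusters[0][0])
    all_vecs = [v for c in clusters for v in c]
    global_mean = [sum(v[j] for v in all_vecs) / len(all_vecs) for j in range(dim)]
    weights = []
    for j in range(dim):
        between = 0.0
        within = 0.0
        for c in clusters:
            mean_j = sum(v[j] for v in c) / len(c)
            between += len(c) * (mean_j - global_mean[j]) ** 2
            within += sum((v[j] - mean_j) ** 2 for v in c)
        weights.append(between / (within + eps))
    total = sum(weights) or 1.0  # avoid division by zero if all scores are 0
    return [w / total for w in weights]


def weighted_cosine(a, b, w):
    """Cosine similarity in which attribute j contributes with weight w[j]."""
    dot = sum(wj * x * y for wj, x, y in zip(w, a, b))
    norm_a = sqrt(sum(wj * x * x for wj, x in zip(w, a)))
    norm_b = sqrt(sum(wj * y * y for wj, y in zip(w, b)))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy data: attribute 0 is shared by both clusters (e.g. a common word),
# attributes 1 and 2 each occur in only one cluster.
cluster_a = [[3, 2, 0], [2, 3, 0]]
cluster_b = [[3, 0, 2], [2, 0, 3]]

w = feature_weights([cluster_a, cluster_b])
uniform = [1.0, 1.0, 1.0]

print("weights:", w)
print("cross-cluster, unweighted:", weighted_cosine(cluster_a[0], cluster_b[0], uniform))
print("cross-cluster, weighted:  ", weighted_cosine(cluster_a[0], cluster_b[0], w))
```

In this toy setting the shared attribute receives a weight close to zero, so two pages from different clusters no longer look similar merely because they share it. This is one plausible reading of how down-weighting common attributes could reduce wrong coreferences.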

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Popescu, O., Magnini, B. (2009). Alleviating the Problem of Wrong Coreferences in Web Person Search. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2009. Lecture Notes in Computer Science, vol 5449. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00382-0_23

  • DOI: https://doi.org/10.1007/978-3-642-00382-0_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00381-3

  • Online ISBN: 978-3-642-00382-0

  • eBook Packages: Computer Science, Computer Science (R0)
