Insights into Entity Name Evolution on Wikipedia

  • Conference paper
Web Information Systems Engineering – WISE 2014 (WISE 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8787)

Abstract

Working with Web archives raises a number of issues caused by their temporal characteristics. Depending on the age of the content, additional knowledge might be needed to find and understand older texts. Facts about entities are especially subject to change, and name changes are the most severe in terms of information retrieval. To find entities that have changed their name over time, search engines need to be aware of this evolution. We tackle this problem by analyzing Wikipedia in terms of entity evolutions mentioned in articles, regardless of structural elements. We gathered statistics and automatically extracted minimum excerpts covering name changes by incorporating lists dedicated to that subject. In future work, these excerpts will be used to discover patterns and detect changes in other sources. In this work, we investigate whether Wikipedia is a suitable source for extracting the required knowledge.
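To make the idea of a "minimum excerpt covering a name change" concrete, the following is a minimal sketch and not the authors' implementation, whose details are not given in this preview. It assumes an excerpt is the shortest run of consecutive sentences in an article that mentions both the former and the current name. The function names, the naive sentence splitter, and the sample text are all hypothetical and chosen for illustration only.

```python
import re


def split_sentences(text):
    # Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def minimum_excerpt(text, old_name, new_name):
    """Return the shortest run of consecutive sentences that mentions
    both names (case-insensitive), or None if either name is missing."""
    sentences = split_sentences(text)
    old_idx = [i for i, s in enumerate(sentences) if old_name.lower() in s.lower()]
    new_idx = [i for i, s in enumerate(sentences) if new_name.lower() in s.lower()]

    best = None  # (first sentence index, last sentence index)
    for i in old_idx:
        for j in new_idx:
            lo, hi = min(i, j), max(i, j)
            if best is None or hi - lo < best[1] - best[0]:
                best = (lo, hi)

    if best is None:
        return None
    return " ".join(sentences[best[0]:best[1] + 1])


# Invented sample text; "Acme Corp" and "Apex Inc" are placeholder entity names.
sample = (
    "Acme Corp was founded in 1990. "
    "In 2005, Acme Corp was renamed Apex Inc. "
    "Apex Inc now sells software."
)
print(minimum_excerpt(sample, "Acme Corp", "Apex Inc"))
```

In a setting like the one the abstract describes, the name pairs would come from the lists dedicated to name changes, and a proper sentence splitter would replace the regular expression used here.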

This work is partly funded by the European Research Council under ALEXANDRIA (ERC 339233).

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Holzmann, H., Risse, T. (2014). Insights into Entity Name Evolution on Wikipedia. In: Benatallah, B., Bestavros, A., Manolopoulos, Y., Vakali, A., Zhang, Y. (eds) Web Information Systems Engineering – WISE 2014. WISE 2014. Lecture Notes in Computer Science, vol 8787. Springer, Cham. https://doi.org/10.1007/978-3-319-11746-1_4

  • DOI: https://doi.org/10.1007/978-3-319-11746-1_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11745-4

  • Online ISBN: 978-3-319-11746-1

  • eBook Packages: Computer Science, Computer Science (R0)
