Keyword-Driven Resource Disambiguation over RDF Knowledge Bases

  • Saeedeh Shekarpour
  • Axel-Cyrille Ngonga Ngomo
  • Sören Auer
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7774)


Keyword search is the most popular way to access information. In this paper we introduce a novel approach, based on a hidden Markov model, for determining the correct resources for user-supplied queries. In our approach, the user-supplied query is modeled as the observed data, and the background knowledge is used for parameter estimation. We leverage the semantic relationships between resources to compute the parameter estimates. Query segmentation and resource disambiguation are tightly interwoven in this approach: first, an initial set of potential segments is obtained by leveraging the underlying knowledge base; then, the final set of segments is determined once the most likely resource mapping has been computed. While linguistic analysis (e.g., named entity recognition, multi-word unit recognition, and POS tagging) fails in the case of keyword-based queries, we show that our statistical approach is robust with regard to query expression variance. Our experiments yield very promising results.
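To make the modeling idea concrete, the following is a minimal sketch of how a most likely resource mapping could be decoded once a hidden Markov model has been set up, with query segments as observations and candidate resources as hidden states. It uses standard Viterbi decoding, which is an assumption here, since the abstract does not name the decoding algorithm. The resource names (the `ex:` identifiers), the query segments, and all probabilities are hypothetical toy values; in the paper the parameters are estimated from the semantic relationships between resources, which this sketch does not reproduce.

```python
from typing import Dict, List, Tuple

Prob = Dict[str, Dict[str, float]]


def viterbi(observations: List[str], states: List[str],
            start_p: Dict[str, float], trans_p: Prob,
            emit_p: Prob) -> Tuple[float, List[str]]:
    """Return (probability, path) of the most likely hidden-state sequence."""
    # best[t][s] = (probability of the best path ending in s at step t,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s].get(observations[0], 0.0), None)
             for s in states}]
    for t in range(1, len(observations)):
        column = {}
        for s in states:
            column[s] = max(
                (best[t - 1][p][0]
                 * trans_p[p].get(s, 0.0)
                 * emit_p[s].get(observations[t], 0.0), p)
                for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return prob, path


# Hypothetical candidate resources (hidden states) for a two-segment query.
states = ["ex:Berlin_City", "ex:Berlin_Band", "ex:populationTotal", "ex:genre"]
segments = ["berlin", "population"]  # observed query segments

start_p = {s: 0.25 for s in states}  # assumed uniform start distribution
# Assumed emissions: which segment each resource label can produce.
emit_p = {
    "ex:Berlin_City": {"berlin": 1.0},
    "ex:Berlin_Band": {"berlin": 1.0},
    "ex:populationTotal": {"population": 1.0},
    "ex:genre": {"genre": 1.0},
}
# Assumed transitions: stand-ins for semantic relatedness between resources.
trans_p = {
    "ex:Berlin_City": {"ex:populationTotal": 0.8, "ex:genre": 0.1, "ex:Berlin_Band": 0.1},
    "ex:Berlin_Band": {"ex:genre": 0.8, "ex:populationTotal": 0.1, "ex:Berlin_City": 0.1},
    "ex:populationTotal": {"ex:Berlin_City": 0.5, "ex:Berlin_Band": 0.5},
    "ex:genre": {"ex:Berlin_City": 0.5, "ex:Berlin_Band": 0.5},
}

prob, path = viterbi(segments, states, start_p, trans_p, emit_p)
print(path, prob)
```

In this toy setup the ambiguous segment "berlin" resolves to the city rather than the band because the city is more strongly connected to the resource matched by the following segment, which illustrates why disambiguation benefits from considering the whole query rather than each keyword in isolation.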


Hidden Markov Model; Noun Phrase; Natural Language Processing; Query Expansion; SPARQL Query
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Saeedeh Shekarpour (1)
  • Axel-Cyrille Ngonga Ngomo (1)
  • Sören Auer (1)
  1. Department of Computer Science, University of Leipzig, Leipzig, Germany
