
Context-Sensitive Ranking Using Cross-Domain Knowledge for Chemical Digital Libraries

  • Conference paper
Research and Advanced Technology for Digital Libraries (TPDL 2013)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8092)

Abstract

Entity-centric search is now a common way to gather information. Given the huge amount of available information, however, the entity alone is often not sufficient for finding suitable results. Users usually search for entities within a specific search context, and that context matters for their relevance assessment. Digital library providers therefore need to take the search context into account to enable high-quality retrieval. In this paper we present an approach that enables context searches for chemical entities. Chemical entities play a major role in many domains, ranging from biomedicine and biology to materials science. Since most domain-specific documents lack suitable context annotations, we present a similarity measure that uses cross-domain knowledge gathered from Wikipedia. We show that structure-based similarity measures are not suitable for chemical context searches and introduce a similarity measure that combines entity similarity and context similarity. Our experiments show that our measure outperforms structure-based similarity measures for chemical entities. We compare against two baseline approaches: a Boolean retrieval model and a model that applies statistical query expansion to the context term. We compare the measures by computing mean average precision (MAP) over a set of queries with manual relevance assessments from domain experts. We achieve a total MAP increase of 30 percentage points (from 31% to 61%). Furthermore, we present a personalized retrieval system that yields a further increase of around 10%.
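The abstract evaluates rankings by mean average precision (MAP). As a reminder of how that metric is usually computed, here is a minimal sketch using the standard textbook definition. It is not the authors' evaluation code; the function names, query identifiers, and document identifiers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def average_precision(ranking: List[str], relevant: Set[str]) -> float:
    """Average precision of one ranked result list.

    `ranking` is the ordered list of returned document ids and `relevant`
    is the set of ids judged relevant for the query (both hypothetical).
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at the rank where a relevant document appears.
            precision_sum += hits / rank
    # Normalize by the number of relevant documents, so relevant
    # documents that were never retrieved count as misses.
    return precision_sum / len(relevant)


def mean_average_precision(
    rankings: Dict[str, List[str]],
    judgments: Dict[str, Set[str]],
) -> float:
    """Mean of the per-query average precision values."""
    if not rankings:
        return 0.0
    total = sum(
        average_precision(ranking, judgments.get(query, set()))
        for query, ranking in rankings.items()
    )
    return total / len(rankings)


if __name__ == "__main__":
    # Two made-up queries with made-up relevance assessments.
    rankings = {"q1": ["d1", "d2", "d3", "d4"], "q2": ["d5", "d6"]}
    judgments = {"q1": {"d1", "d3"}, "q2": {"d6"}}
    print(round(mean_average_precision(rankings, judgments), 4))
```

For the first made-up query the relevant documents appear at ranks 1 and 3, so its average precision is (1/1 + 2/3) / 2; for the second it is 1/2. MAP is the mean of those two values.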




Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Köhncke, B., Balke, WT. (2013). Context-Sensitive Ranking Using Cross-Domain Knowledge for Chemical Digital Libraries. In: Aalberg, T., Papatheodorou, C., Dobreva, M., Tsakonas, G., Farrugia, C.J. (eds) Research and Advanced Technology for Digital Libraries. TPDL 2013. Lecture Notes in Computer Science, vol 8092. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40501-3_29

  • DOI: https://doi.org/10.1007/978-3-642-40501-3_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40500-6

  • Online ISBN: 978-3-642-40501-3

  • eBook Packages: Computer Science, Computer Science (R0)
