Search Result Caching in Peer-to-Peer Information Retrieval Networks

  • Almer S. Tigelaar
  • Djoerd Hiemstra
  • Dolf Trieschnigg
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6653)

Abstract

For peer-to-peer web search engines it is important to quickly process queries and return search results. How to keep the perceived latency low is an open challenge. In this paper we explore the solution potential of search result caching in large-scale peer-to-peer information retrieval networks by simulating such networks with increasing levels of realism. We find that a small bounded cache offers performance comparable to an unbounded cache. Furthermore, we explore partially centralised and fully distributed scenarios, and find that in the most realistic distributed case caching can reduce the query load by thirty-three percent. With optimisations this can be boosted to nearly seventy percent.

Keywords

Search Result, Distribute Hash Table, Cache Size, Share Ratio, Cache Policy
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Almer S. Tigelaar (1)
  • Djoerd Hiemstra (1)
  • Dolf Trieschnigg (1)
  1. University of Twente, Enschede, The Netherlands
