Abstract
Efficient full-text search is a major challenge in Peer-to-Peer (P2P) systems. Recently, the Distributed Hash Table (DHT) has become one of the reliable communication schemes for P2P. Some research efforts perform keyword searching and result intersection on a DHT substrate. In these approaches, two or more search requests must be issued for a multi-keyword query. This article proposes a Sliding Window improved Multi-keyword Searching method (SWMS) to index and search full text for short queries on a DHT. The main assumptions behind SWMS are: (1) the query overhead of standard inverted-list intersection is prohibitive in a distributed P2P system; (2) most of the documents relevant to a multi-keyword query have those keywords appearing near each other. The experimental results demonstrate that our method guarantees search quality while reducing the cost of communication.
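The abstract does not spell out the SWMS data structures, so the following is only a minimal sketch of the general idea under stated assumptions: each term, and each pair of terms co-occurring within a fixed-size sliding window, is hashed to a DHT-style key, so that a two-keyword query needs one lookup instead of two lookups plus an intersection. The names `dht_key`, `build_index`, and `search`, the window size, and the use of a local dictionary in place of a real DHT are all hypothetical choices made for illustration, not the authors' implementation.

```python
import hashlib
from collections import defaultdict


def dht_key(terms):
    """Hash a set of terms to a SHA-1 hex key (stand-in for DHT placement)."""
    canonical = " ".join(sorted(terms))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def build_index(docs, window=3):
    """Index single terms and term pairs that co-occur within `window` tokens.

    Assumption: a local dict simulates the DHT; each key would live on the
    peer responsible for that hash in a real deployment.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        for i, term in enumerate(tokens):
            index[dht_key([term])].add(doc_id)
            # Pair the current term with the following terms in the window.
            for other in tokens[i + 1:i + window]:
                if other != term:
                    index[dht_key([term, other])].add(doc_id)
    return index


def search(index, query):
    """Answer a one- or two-keyword query with a single key lookup."""
    terms = sorted(set(query.lower().split()))
    if not 1 <= len(terms) <= 2:
        raise ValueError("this sketch only handles one or two keywords")
    return set(index.get(dht_key(terms), set()))
```

In this sketch a document is returned for a two-keyword query only if both keywords fall inside the same window, which mirrors assumption (2) above: documents where the keywords are far apart are traded away in exchange for avoiding the intersection step.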
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Huang, S., Xue, GR., Zhu, X., Ge, YF., Yu, Y. (2004). DHT Based Searching Improved by Sliding Window. In: Li, Q., Wang, G., Feng, L. (eds) Advances in Web-Age Information Management. WAIM 2004. Lecture Notes in Computer Science, vol 3129. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27772-9_22
DOI: https://doi.org/10.1007/978-3-540-27772-9_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22418-1
Online ISBN: 978-3-540-27772-9