A Semantic Searching Scheme in Heterogeneous Unstructured P2P Networks

Journal of Computer Science and Technology

Abstract

Semantic-based searching in peer-to-peer (P2P) networks has drawn significant attention recently. A number of semantic searching schemes, such as GES proposed by Zhu Y et al., employ search models from Information Retrieval (IR). All of these IR-based schemes use a single vector to summarize the semantic content of all documents on a node. For example, GES derives a node vector from the Vector Space Model (VSM), an IR model, and then designs a topology adaptation algorithm and a search protocol around the similarity between the node vectors of different nodes. A single semantic vector is suitable when the distribution of documents on each node is uniform, but it may not be efficient when the distribution is diverse: when a node holds many categories of documents, its node vector may represent them inaccurately. We extend the idea of GES and present a new class-based semantic searching scheme (CSS) designed specifically for unstructured P2P networks with heterogeneous single-node document collections. CSS uses a state-of-the-art data clustering algorithm, online spherical k-means clustering (OSKM), to cluster the documents on a node into several classes. Each class can be viewed as a virtual node, and virtual nodes are connected through virtual links. The class vector thus replaces the node vector in both the class-based topology adaptation and the search process, which is the source of the efficiency of CSS. Our simulation using the IR benchmark TREC collection demonstrates that CSS outperforms GES, achieving higher recall, higher precision, and lower search cost.
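
The contrast between a single node vector and per-class vectors can be illustrated with a small sketch. It is not the authors' implementation: the four-term vocabulary, the toy document vectors, the farthest-first seeding, and the exponentially decaying learning rate are all assumptions chosen to keep the example short and deterministic. The clustering step only follows the general shape of online spherical k-means (unit-length vectors, winner-take-all update, renormalization).

```python
import math

def normalize(v):
    """Scale v to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(a, b):
    """Dot product; equals cosine similarity when both inputs are unit vectors."""
    return sum(x * y for x, y in zip(a, b))

def summary_vector(unit_docs):
    """Sum unit document vectors and renormalize: one vector for the whole set."""
    return normalize([sum(column) for column in zip(*unit_docs)])

def online_spherical_kmeans(unit_docs, k, epochs=5, eta_start=1.0, eta_end=0.01):
    """Simplified online spherical k-means (illustrative, not the published OSKM)."""
    # Farthest-first seeding: an assumption made to keep the demo deterministic.
    centroids = [list(unit_docs[0])]
    while len(centroids) < k:
        farthest = min(unit_docs, key=lambda d: max(dot(d, c) for c in centroids))
        centroids.append(list(farthest))
    total_steps = epochs * len(unit_docs)
    step = 0
    for _ in range(epochs):
        for d in unit_docs:
            # Learning rate decays exponentially from eta_start to eta_end (assumed schedule).
            eta = eta_start * (eta_end / eta_start) ** (step / total_steps)
            winner = max(range(k), key=lambda j: dot(d, centroids[j]))
            # Move only the winning centroid toward the document, then renormalize.
            moved = [c + eta * x for c, x in zip(centroids[winner], d)]
            centroids[winner] = normalize(moved)
            step += 1
    labels = [max(range(k), key=lambda j: dot(d, centroids[j])) for d in unit_docs]
    return centroids, labels

# Hypothetical term-frequency vectors over the terms (goal, match, stock, bond).
docs = [
    [3, 2, 0, 0], [2, 3, 0, 0], [4, 1, 0, 0],   # sports documents
    [0, 0, 3, 2], [0, 0, 2, 4], [0, 0, 1, 3],   # finance documents
]
unit_docs = [normalize(d) for d in docs]
query = normalize([1, 1, 0, 0])                  # a sports query

node_vector = summary_vector(unit_docs)          # single vector for the whole node
class_vectors, labels = online_spherical_kmeans(unit_docs, k=2)

node_sim = dot(query, node_vector)
best_class_sim = max(dot(query, c) for c in class_vectors)
print("labels:", labels)
print("node vector similarity:", round(node_sim, 3))
print("best class vector similarity:", round(best_class_sim, 3))
```

On this toy node the single vector blends both topics, so its similarity to the sports query is lower than that of the sports class vector. This is the effect the class-based scheme is meant to exploit when a node holds several categories of documents.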

References

  1. Li X, Wu J. Searching techniques in peer-to-peer networks. Handbook of Theoretical and Algorithmic Aspects of Sensor, Ad Hoc Wireless, and Peer-to-Peer Networks, CRC Press, 2005, http://www.kiv.zcn.lz/~ledvina/DHT/p2psurvey.pdf.

  2. Ratnasamy S, Francis P, Handley M, Karp R, Shenker S. A scalable content addressable network. In Proc. the 2001 Conf. Applications, Technologies, Architecture, and Protocols for Computer Communications (SIGCOMM 2001), San Diego, USA, August 27–31, 2001, pp.161-172.

  3. Rowstron A, Druschel P. Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems. In Proc. the 18th IFIP/ACM International Conference on Distributed System Platforms (Middleware 2001), Heidelberg, Germany, November 12–16, 2001, pp.329-350.

  4. Stoica I, Morris R, Liben-Nowell D, Karger D, Kaashoek M, Dabek F, Balakrishnan H. Chord: A scalable peer-to-peer lookup protocol for internet applications. IEEE/ACM Transactions on Networking, 2003, 11(1): 17–32.

  5. Yu D, Chen X, Chang Y. An improved P2P model based on Chord. In Proc. the 6th International Conference on Parallel and Distributed Computing Applications and Technologies (PDCAT2005), Dalian, China, December 5–8, 2005, pp.807-811.

  6. Xue K, Hong P, Li J. FS-chord: A new P2P model with fractional steps joining. In Proc. Advanced International Conference on Telecommunications and International Conference on Internet and Web Applications and Services (AICT-ICIW 2006), Guadeloupe, French Caribbean, February 19–25, 2006.

  7. Cai J, Shao X, Ma W. Ontology driven semantic search over structured P2P network. In Proc. the 9th International Conference on Hybrid Intelligent Systems (HIS 2009), Shenyang, China, August 12–14, 2009, pp.29-34.

  8. Dragan F, Gardarin G, Yeh L. A semantic layer for publishing and localizing XML data for a P2P XQuery mediator. In Proc. the 17th International World Wide Web Conference (WWW2008), Beijing, China, April 21–25, 2008, pp.1105-1106.

  9. Zhu Y, Hu Y. Efficient semantic search on DHT overlays. Journal of Parallel and Distributed Computing, 2007, 67(5): 604–616.

  10. Clarke I, Sandberg O, Wiley B, Hong T W. Freenet: A distributed anonymous information storage and retrieval system. In Proc. the 2000 Workshop on Design Issues in Anonymity and Unobservability, Berkeley, USA, July 25–26, 2000, pp.46-66.

  11. Manku G S, Bawa M, Raghavan P. Symphony: Distributed hashing in a small world. In Proc. the 4th USENIX Symposium on Internet Technology and Systems (USITS 2003), Seattle, USA, March 26–28, 2003.

  12. The Gnutella Protocol Specification V0.4. http://www.stanford.edu/class/cs244b/gnutella_protocol_0.4.pdf.

  13. Lv Q, Cao P, Cohen E et al. Search and replication in unstructured peer-to-peer networks. In Proc. the 16th ACM International Conference on Supercomputing (ACM ICS 2002), New York, USA, June 22–26, 2002, pp.84-95.

  14. Yang B, Garcia-Molina H. Improving search in peer-to-peer networks. In Proc. the 22nd IEEE International Conference on Distributed Computing (IEEE ICDCS 2002), Vienna, Austria, July 2–5, 2002.

  15. Crespo A, Garcia-Molina H. Routing indices for peer-to-peer systems. In Proc. the 22nd International Conference on Distributed Computing Systems (IEEE ICDCS 2002), Vienna, Austria, July 2–5, 2002.

  16. Zhu Y, Yang X, Hu Y. Making search efficient on Gnutella-like P2P systems. In Proc. the 19th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2005), Denver, USA, April 3–8, 2005.

  17. Faye D, Nachouki G, Valduriez P. Semantic query routing in SenPeer, a P2P data management system. In Lecture Notes in Computer Science 4658, Enokido T et al. (eds.), Springer-Verlag, 2007, pp.365-374.

  18. Dohnal V, Sedmidubsky J. Query routing mechanisms in self-organizing search systems. In Proc. the 2nd International Workshop on Similarity Search and Applications (SISAP 2009), Prague, Czech Republic, August 29–30, 2009, pp.132-139.

  19. Haase P, Siebes R, Harmelen F V. Expertise-based peer selection in Peer-to-Peer networks. Knowledge and Information Systems, 2008, 15(1): 75–107.

  20. Pirro G, Ruffolo M, Talia D. Advanced semantic search and retrieval in a collaborative peer-to-peer system. In Proc. the 2008 International Workshop on Content Management and Delivery in Large-Scale Networks (UPGRADE-CN2008), Boston, USA, June 23–27, 2008, pp.65-72.

  21. Bawa M, Manku G, Raghavan P. SETS: Search enhanced by topic segmentation. In Proc. the 26th Annual International ACM SIGIR Conference (SIGIR 2003), Toronto, Canada, July 28-August 1, 2003, pp.306-313.

  22. Shen H T, Shu Y, Yu B. Efficient semantic-based content search in P2P network. IEEE Transactions on Knowledge and Data Engineering, 2004, 16(7): 813–826.

  23. Zhou Y, Croft W B, Levine B N. Content-based search in peer-to-peer networks. Technical Report, University of Massachusetts, 2004.

  24. Witschel H F. Content-oriented topology restructuring for search in P2P networks. Technical Report, University of Leipzig, Germany, 2005.

  25. Zhu Y, Hu Y. Enhancing search performance on Gnutella-like P2P systems. IEEE Transactions on Parallel and Distributed Systems, 2006, 17(12): 1482–1495.

  26. Yang X, Hu Y. SEIF: Search enhanced by intelligent feedback in unstructured P2P networks. In Proc. International Conference on Parallel Processing, Vienna, Austria, September 22–25, 2009, pp.494-501.

  27. Zhong S. Efficient online spherical K-means clustering. In Proc. IEEE Int. Joint Conf. Neural Networks (IJCNN 2005), Montreal, Canada, July 31-August 4, 2005, pp.3180-3185.

  28. Wang Q, Li R, Chen L, Lian J, Özsu M T. Speed up semantic search in P2P networks. In Proc. the ACM 17th Conference on Information and Knowledge Management (CIKM 2008), Napa Valley, USA, October 26–30, 2008, pp.1341-1342.

  29. Kacimi M, Yetongnon K. Similarity search in a hybrid overlay P2P network. In Proc. the 11th IEEE Symposium on Computers and Communications (ISCC 2006), Cagliari, Italy, June 26–29, 2006.

  30. Comito C, Patarin S, Talia D. A semantic overlay network for P2P schema-based data integration. In Proc. the 11th IEEE Symposium on Computers and Communications (ISCC 2006), Cagliari, Italy, June 26–29, 2006.

  31. Yang X, Hu Y. Search enhanced by distributed semantic clustering in Gnutella-like P2P systems. In Proc. the 15th International Symposium on Modelling, Analysis, and Simulation of Computer and Telecommunication (MASCOTS 2007), Istanbul, Turkey, October 24–26, 2007, pp.318-324.

  32. Huang J, Li X, Wu J. A class-based search system in unstructured P2P networks. In Proc. the 21st International Conference on Advanced Networking and Applications, Niagara Falls, Canada, May 21–23, 2007, pp.76-83.

  33. Ng C H, Sia K C. Peer clustering and firework query model. In Proc. the 11th International World Wide Web Conference (WWW 2002), Honolulu, Hawaii, USA, May 7–11, 2002.

  34. Crespo A, Garcia-Molina H. Semantic overlay networks for P2P systems. In Lecture Notes in Computer Science 3601, Carbonell J G, Siekmann J (eds.), Springer-Verlag, 2005, pp.1-13.

  35. Lin K, Wang C, Chou C, Golubchik L. SocioNet: A social-based multimedia access system for unstructured P2P networks. IEEE Transactions on Parallel and Distributed Systems, 2010, 21(7): 1027–1041.

  36. Deconinck G, Vanthournout K. Agora: A semantic overlay network. International Journal of Critical Infrastructures, 2009, 5(1/2): 175–195.

  37. Chawathe Y, Ratnasamy S, Breslau L, Lanham N, Shenker S. Making Gnutella-like P2P systems scalable. In Proc. the 2003 Conf. Applications, Technologies, Architecture, and Protocols for Computer Communications (SIGCOMM 2003), Karlsruhe, Germany, August 25–29, 2003, pp.407-418.

  38. Berry M W, Drmac Z, Jessup E R. Matrices, vector spaces, and information retrieval. SIAM Review, 1999, 41(2): 335–362.

  39. Text REtrieval Conference (TREC). http://trec.nist.gov, May, 2010.

  40. McCallum A K. Rainbow toolkit. http://www.cs.cmu.edu/~mccallum/bow/, May, 2010.

Author information

Corresponding author

Correspondence to Jun-Cheng Huang.

Additional information

This work was supported in part by the National Science Foundation of USA under Grant Nos. ANI 0073736, EIA 0130806, CCR 0329741, CNS 0422762, CNS 0434533, CNS 0531410, CNS 0626240, CCF 0830289, and CNS 0948184.

About this article

Cite this article

Huang, JC., Li, XQ. & Wu, J. A Semantic Searching Scheme in Heterogeneous Unstructured P2P Networks. J. Comput. Sci. Technol. 26, 925–941 (2011). https://doi.org/10.1007/s11390-011-1190-z

  • DOI: https://doi.org/10.1007/s11390-011-1190-z
