Abstract
Peer-to-Peer (P2P) data integration combines the P2P infrastructure with traditional schema-based data integration techniques. Among the primary problems in this research area are the techniques used for querying, indexing, and distributing documents among the peers of a network, especially when the documents are in XML format. To address this problem, we describe an XML P2P system that efficiently distributes a set of clustered XML documents in a P2P network in order to speed up user queries. The novelty of the proposed system lies in the efficient distribution of the XML documents and the construction of an appropriate virtual index on top of the network peers.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Antonellis, P., Makris, C., Tsirakis, N. (2009). Utilizing XML Clustering for Efficient XML Data Management on P2P Networks. In: Bhowmick, S.S., Küng, J., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2009. Lecture Notes in Computer Science, vol 5690. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03573-9_6
DOI: https://doi.org/10.1007/978-3-642-03573-9_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03572-2
Online ISBN: 978-3-642-03573-9
eBook Packages: Computer Science (R0)