Abstract
In this paper we present HOPI, a new connection index for XML documents based on the concept of the 2-hop cover of a directed graph introduced by Cohen et al. In contrast to most prior work on XML indexing, we consider not only paths with child or parent relationships between the nodes, but also provide space- and time-efficient reachability tests along the ancestor, descendant, and link axes to support path expressions with wildcards in our XXL search engine. We improve the theoretical concept of a 2-hop cover by developing scalable methods for index creation on very large XML data collections with long paths and extensive cross-linkage. Our experiments show that the HOPI index substantially improves query performance over previously proposed index structures while keeping space requirements low.
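To make the reachability test behind a 2-hop cover concrete, here is a minimal sketch in Python. It is not the construction used by HOPI, nor the approximation algorithm of Cohen et al.: it builds the labels naively from a full transitive closure, which would not scale to large collections. The function names (`reachable_from`, `build_two_hop_labels`, `is_reachable`) and the toy graph are assumptions made for illustration. The idea shown is only the query side: every node u carries a set Lout(u) of nodes it reaches and a set Lin(u) of nodes that reach it, and u reaches v exactly when Lout(u) and Lin(v) share a node.

```python
from collections import deque


def reachable_from(graph, start):
    """Return every node reachable from start (start included), via BFS."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def build_two_hop_labels(graph):
    """Naive 2-hop labeling (illustrative only, not HOPI's scalable method).

    Returns (l_in, l_out) such that u reaches v iff l_out[u] and l_in[v]
    have a common element.
    """
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    desc = {n: reachable_from(graph, n) for n in nodes}          # descendants
    anc = {n: {u for u in nodes if n in desc[u]} for n in nodes}  # ancestors

    # Every node is implicitly its own center.
    l_in = {n: {n} for n in nodes}
    l_out = {n: {n} for n in nodes}

    # Connections (u, v) that are not yet covered by a common center.
    uncovered = {(u, v) for u in nodes for v in desc[u] if u != v}

    # Heuristic order: try centers that lie on many connections first.
    order = sorted(nodes, key=lambda n: len(anc[n]) * len(desc[n]), reverse=True)
    for center in order:
        if not uncovered:
            break
        for u in anc[center]:
            for v in desc[center]:
                if (u, v) in uncovered:
                    l_out[u].add(center)
                    l_in[v].add(center)
                    uncovered.discard((u, v))
    return l_in, l_out


def is_reachable(l_in, l_out, u, v):
    """2-hop reachability test: one set intersection, no graph traversal."""
    return not l_out[u].isdisjoint(l_in[v])


# Toy element graph: child edges plus one cross-document link (f -> d).
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "f": ["d"]}
l_in, l_out = build_two_hop_labels(graph)
print(is_reachable(l_in, l_out, "a", "e"))  # True
print(is_reachable(l_in, l_out, "e", "a"))  # False
```

The point of the labeling is that a query touches only the two label sets of its endpoints, so the cost depends on label size rather than on path length or the amount of cross-linkage.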
References
Abiteboul, S., et al.: Compact labeling schemes for ancestor queries. In: SODA 2001, pp. 547–556 (2001)
Alstrup, S., Rauhe, T.: Improved labeling scheme for ancestor queries. In: SODA 2002, pp. 947–953 (2002)
Bancilhon, F., Ramakrishnan, R.: An amateur’s introduction to recursive query processing strategies. In: SIGMOD 1986, pp. 16–52 (1986)
Blanken, H., Grabs, T., Schek, H.-J., Schenkel, R., Weikum, G. (eds.): Intelligent Search on XML Data. LNCS, vol. 2818. Springer, Heidelberg (2003)
Böhme, T., Rahm, E.: Multi-user evaluation of XML data management systems with XMach-1. In: EEXTT 2002, pp. 148–158 (2003)
Chung, C.-W., Min, J.-K., Shim, K.: APEX: An adaptive path index for XML data. In: SIGMOD 2002, pp. 121–132 (2002)
Ciarlet Jr, P., Lamour, F.: On the validity of a front oriented approach to partitioning large sparse graphs with a connectivity constraint. Numerical Algorithms 12(1,2), 193–214 (1996)
Cohen, E., et al.: Labeling dynamic XML trees. In: PODS 2002, pp. 271–281 (2002)
Cohen, E., et al.: Reachability and distance queries via 2-hop labels. In: SODA 2002, pp. 937–946 (2002)
Cooper, B., et al.: A fast index for semistructured data. In: VLDB 2001, pp. 341–350 (2001)
Cormen, T.H., Leiserson, C.E., Rivest, R.L.: Introduction to Algorithms, 1st edn. MIT Press, Cambridge (1990)
DeRose, S., et al.: XML linking language (XLink), version 1.0. W3C recommendation (2001)
Farhat, C.: A simple and efficient automatic FEM domain decomposer. Computers and Structures 28(5), 579–602 (1988)
Goldman, R., Widom, J.: DataGuides: Enabling query formulation and optimization in semistructured databases. In: VLDB 1997, pp. 436–445 (1997)
Grust, T.: Accelerating XPath location steps. In: SIGMOD 2002, pp. 109–120 (2002)
Grust, T., van Keulen, M.: Tree awareness for relational DBMS kernels: Staircase join. In: Blanken et al. [4]
Kaplan, H., et al.: A comparison of labeling schemes for ancestor queries. In: SODA 2002, pp. 954–963 (2002)
Kaplan, H., Milo, T.: Short and simple labels for small distances and other functions. In: Dehne, F., Sack, J.-R., Tamassia, R. (eds.) WADS 2001. LNCS, vol. 2125, pp. 246–257. Springer, Heidelberg (2001)
Kaushik, R., et al.: Covering indexes for branching path queries. In: SIGMOD 2002, pp. 133–144 (2002)
Ley, M.: DBLP XML Records. Downloaded September 1 (2003)
Milo, T., Suciu, D.: Index structures for path expressions. In: Beeri, C., Buneman, P. (eds.) ICDT 1999. LNCS, vol. 1540, pp. 277–295. Springer, Heidelberg (1998)
Qun, C., et al.: D(k)-index: An adaptive structural summary for graph-structured data. In: SIGMOD 2003, pp. 134–144 (2003)
Schenkel, R., Theobald, A., Weikum, G.: Ontology-enabled XML search. In: Blanken et al. [4]
Theobald, A., Weikum, G.: The index-based XXL search engine for querying XML data with relevance ranking. In: Jensen, C.S., Jeffery, K., Pokorný, J., Šaltenis, S., Bertino, E., Böhm, K., Jarke, M. (eds.) EDBT 2002. LNCS, vol. 2287, pp. 477–495. Springer, Heidelberg (2002)
Theobald, A., Weikum, G.: The XXL search engine: Ranked retrieval of XML data using indexes and ontologies. In: SIGMOD 2002 (2002)
Zezula, P., Amato, G., Rabitti, F.: Processing XML queries with tree signatures. In: Blanken et al. [4]
Zezula, P., et al.: Tree signatures for XML querying and navigation. In: 1st Int. XML Database Symposium, pp. 149–163 (2003)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schenkel, R., Theobald, A., Weikum, G. (2004). HOPI: An Efficient Connection Index for Complex XML Document Collections. In: Bertino, E., et al. Advances in Database Technology - EDBT 2004. EDBT 2004. Lecture Notes in Computer Science, vol 2992. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24741-8_15
DOI: https://doi.org/10.1007/978-3-540-24741-8_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21200-3
Online ISBN: 978-3-540-24741-8
eBook Packages: Springer Book Archive