Abstract
The emergence of the Web has increased interest in XML data. XML query languages such as XQuery and XPath use label paths to traverse this irregularly structured data. Without a structural summary and an efficient index, query processing can be quite inefficient because it requires an exhaustive traversal of the XML data. To overcome this inefficiency, several path indexes have been proposed in the research community. DataGuides and the 1-Index can be viewed as covering indexes for simple path expressions over tree- or graph-structured XML data. In another line of work, both XML documents and XML queries are represented as structure-encoded sequences, so that querying XML data becomes equivalent to finding subsequence matches. This chapter introduces these index structures.
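As a rough illustration of the path-summary idea behind indexes such as DataGuides, the following minimal sketch maps every distinct label path in a small tree-structured document to the elements it reaches, so that a simple path expression can be answered by a single lookup instead of a traversal. It assumes tree-structured data only; the function names `build_path_summary` and `lookup` and the sample document are hypothetical and are not taken from the chapter.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_summary(root):
    """Map each distinct root-to-node label path to the elements it reaches."""
    summary = defaultdict(list)
    stack = [(root, (root.tag,))]
    while stack:
        node, path = stack.pop()
        summary[path].append(node)
        # Push children in reverse so they are popped in document order.
        for child in reversed(list(node)):
            stack.append((child, path + (child.tag,)))
    return summary


def lookup(summary, label_path):
    """Answer a simple path expression such as 'db/movie/title' from the summary."""
    return summary.get(tuple(label_path.split("/")), [])


# Hypothetical sample document, invented for this sketch.
doc = ET.fromstring(
    "<db>"
    "<movie><title>A</title><actor>X</actor></movie>"
    "<movie><title>B</title></movie>"
    "<book><title>C</title></book>"
    "</db>"
)
summary = build_path_summary(doc)
titles = [e.text for e in lookup(summary, "db/movie/title")]
```

Each distinct label path appears exactly once in the summary, however many elements share it, which is why such a structure can serve as a covering index for simple path expressions. Graph-structured data and branching queries need the more elaborate constructions that the chapter covers.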
Copyright information
© 2013 Tsinghua University Press, Beijing and Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Lu, J. (2013). XML Data Indexing. In: An Introduction to XML Query Processing and Keyword Search. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34555-5_3
DOI: https://doi.org/10.1007/978-3-642-34555-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34554-8
Online ISBN: 978-3-642-34555-5
eBook Packages: Computer Science