Synonyms
XML cardinality estimation
Definition
Selectivity estimation in database systems refers to the task of estimating the number of results that will be output for a given query. Selectivity estimates are crucial in query optimization, since they enable optimizers to select efficient query plans. They are also employed in interactive data exploration as timely feedback about the expected outcome of user queries, and can even serve as approximate answers for count queries.
Selectivity estimators apply an estimation procedure to a synopsis of the data. Because query optimization operates under stringent time and space constraints, and selectivity estimation is only one of its steps, selectivity estimators face two often conflicting requirements: they must estimate the cardinality of queries accurately and efficiently while keeping the synopsis size to a minimum.
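To make the synopsis/estimator split concrete, the following is a minimal sketch, not a method taken from this entry. It assumes a very small synopsis that stores only label counts and parent/child label-pair counts, and it estimates the cardinality of a simple child-step path with a first-order Markov assumption. The names `build_synopsis`, `estimate`, and `DOC` are hypothetical and chosen for illustration only.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample document, used only for illustration.
DOC = """
<lib>
  <book><title><sub/></title><author/><author/></book>
  <book><title/></book>
  <journal><title/></journal>
</lib>
"""

def build_synopsis(xml_text):
    """Build a tiny synopsis: per-label counts and parent/child pair counts."""
    root = ET.fromstring(xml_text)
    label_counts = Counter()
    pair_counts = Counter()
    stack = [root]
    while stack:
        node = stack.pop()
        label_counts[node.tag] += 1
        for child in node:
            # Each child contributes one occurrence of the pair (parent, child).
            pair_counts[(node.tag, child.tag)] += 1
            stack.append(child)
    return label_counts, pair_counts

def estimate(path, label_counts, pair_counts):
    """Estimate how many elements match a path like 'a/b/c' (child steps only).

    Assumption: the children of an element depend only on its own label,
    so f(a/b/c) is approximated by f(a/b) * f(b/c) / f(b).
    """
    labels = path.strip("/").split("/")
    if len(labels) == 1:
        return float(label_counts[labels[0]])
    est = float(pair_counts[(labels[0], labels[1])])
    for parent, child in zip(labels[1:], labels[2:]):
        if label_counts[parent] == 0:
            return 0.0
        est *= pair_counts[(parent, child)] / label_counts[parent]
    return est

labels, pairs = build_synopsis(DOC)
print(estimate("lib/book/author", labels, pairs))  # matches the true count of 2
print(estimate("book/title/sub", labels, pairs))   # about 0.67; the true count is 1
```

The second query shows the trade-off described above: the synopsis is small because it forgets which `title` elements sit under `book` rather than `journal`, and the estimate loses accuracy as a result.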
While there is a large body of literature on selectivity estimation in the context of relational databases, the...
Recommended Reading
Aboulnaga A, Alameldeen AR, Naughton J. Estimating the selectivity of XML path expressions for internet scale applications. In: Proceedings of the 27th International Conference on Very Large Data Bases; 2001. p. 591–600.
Chen Z, Jagadish HV, Korn F, Koudas N, Muthukrishnan S, Ng RT, Srivastava D. Counting twig matches in a tree. In: Proceedings of the 17th International Conference on Data Engineering; 2001. p. 453–62.
Freire J, Haritsa J, Ramanath M, Roy P, Siméon J. StatiX: making XML count. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2002. p. 181–91.
Goldman R, Widom J. DataGuides: enabling query formulation and optimization in semistructured databases. In: Proceedings of the 23rd International Conference on Very Large Data Bases; 1997. p. 436–45.
Lim L, Wang M, Padmanabhan S, Vitter J, Parr R. XPathLearner: an on-line self-tuning Markov histogram for XML path selectivity estimation. In: Proceedings of the 28th International Conference on Very Large Data Bases; 2002. p. 442–53.
Lim L, Wang M, Vitter J. CXHist: an on-line classification-based histogram for XML string selectivity estimation. In: Proceedings of the 31st International Conference on Very Large Data Bases; 2005. p. 1187–98.
McHugh J, Abiteboul S, Goldman R, Quass D, Widom J. A database management system for semistructured data. ACM SIGMOD Rec. 1997;26(3):54–66.
Milo T, Suciu D. Index structures for path expressions. In: Proceedings of the 7th International Conference on Database Theory; 1999. p. 277–95.
Nestorov S, Ullman J, Wiener J, Chawathe S. Representative objects: concise representations of semistructured, hierarchical data. In: Proceedings of the 13th International Conference on Data Engineering; 1997. p. 79–90.
Polyzotis N, Garofalakis M. XCluster synopses for structured XML content. In: Proceedings of the 22nd International Conference on Data Engineering; 2006. p. 63.
Polyzotis N, Garofalakis M. XSketch synopses for XML data graphs. ACM Trans Database Syst. 2006;31(3):1014–63.
Polyzotis N, Garofalakis M, Ioannidis Y. Approximate XML query answers. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2004. p. 263–74.
Ramanath M, Zhang L, Freire J, Haritsa J. IMAX: incremental maintenance of schema-based XML statistics. In: Proceedings of the 21st International Conference on Data Engineering; 2005. p. 273–84.
Rao P, Moon B. SketchTree: approximate tree pattern counts over streaming labeled trees. In: Proceedings of the 22nd International Conference on Data Engineering; 2006. p. 80.
Sartiani C. A framework for estimating XML query cardinality. In: Proceedings of the 6th International Workshop on the World Wide Web and Databases; 2003. p. 43–48.
Wang W, Jiang H, Lu H, Yu JX. Containment join size estimation: models and methods. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2003. p. 145–56.
Wang W, Jiang H, Lu H, Yu JX. Bloom histogram: path selectivity estimation for XML data with updates. In: Proceedings of the 30th International Conference on Very Large Data Bases; 2004. p. 240–51.
Wu Y, Patel JM, Jagadish HV. Estimating answer sizes for XML queries. In: Advances in database technology, Proceedings of the 8th International Conference on Extending Database Technology; 2002. p. 590–608.
Zhang N, Özsu MT, Aboulnaga A, Ilyas IF. XSEED: accurate and fast cardinality estimation for XPath queries. In: Proceedings of the 22nd International Conference on Data Engineering; 2006. p. 61.
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Ramanath, M., Freire, J., Polyzotis, N. (2018). XML Selectivity Estimation. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_801
DOI: https://doi.org/10.1007/978-1-4614-8265-9_801
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering