Abstract
In query optimization, a cost-based query optimizer uses a selectivity parameter to estimate, at an early stage, the size of the data that satisfies a query condition. Selectivity estimation requires some representation of the distribution of attribute values. Many approximate representations of a multidimensional (m-d) distribution exist, and the copula-based one is among the newest. This approach makes it possible to account for a time-varying m-d distribution by predicting both the copula and the one-dimensional (1-d) marginal distributions. In this paper we propose a method of forecasting the trajectories of both the copula parameters and the marginals' quantiles using time series prediction models. The method is mainly intended for updating an outdated distribution representation by prediction, which may improve the accuracy of selectivity estimation based on that representation. It may also be used for predicting a varying query workload in order to forecast important regions of the data domain. Having detected such regions, we may increase the resolution of the distribution representation there.
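The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the Clayton copula family, the AR(1) model fitted by least squares, the piecewise-linear marginal CDF, and all names and sample values below are assumptions chosen only to keep the example short and dependency-free.

```python
from bisect import bisect_right

def forecast_next(series):
    """One-step AR(1) forecast, x[t] = c + phi * x[t-1], fitted by least squares."""
    xs, ys = series[:-1], series[1:]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    var = sum((x - mx) ** 2 for x in xs)
    if var == 0.0:                      # constant history: repeat the mean
        return my
    phi = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return (my - phi * mx) + phi * series[-1]

def forecast_quantiles(history):
    """Forecast each quantile trajectory independently (history = list of snapshots)."""
    n = len(history[0])
    # Sorting is an assumed guard that keeps the forecast quantiles monotone.
    return sorted(forecast_next([snap[i] for snap in history]) for i in range(n))

def cdf_from_quantiles(levels, values, x):
    """Piecewise-linear marginal CDF through the points (values[i], levels[i])."""
    if x <= values[0]:
        return levels[0]
    if x >= values[-1]:
        return levels[-1]
    i = bisect_right(values, x) - 1
    t = (x - values[i]) / (values[i + 1] - values[i])
    return levels[i] + t * (levels[i + 1] - levels[i])

def clayton(u, v, theta):
    """Clayton copula C(u, v); valid for theta > 0 (assumed family)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)

def selectivity(lo, hi, levels, qx, qy, theta):
    """Estimated fraction of rows with lo[0] <= X <= hi[0] and lo[1] <= Y <= hi[1]."""
    ua, ub = (cdf_from_quantiles(levels, qx, p) for p in (lo[0], hi[0]))
    va, vb = (cdf_from_quantiles(levels, qy, p) for p in (lo[1], hi[1]))
    return (clayton(ub, vb, theta) - clayton(ua, vb, theta)
            - clayton(ub, va, theta) + clayton(ua, va, theta))

# Hypothetical history: three past snapshots of the quantiles and of theta.
levels = [0.0, 0.25, 0.5, 0.75, 1.0]
history_x = [[0, 10, 20, 30, 40], [0, 11, 22, 33, 44], [0, 12, 24, 36, 48]]
history_y = [[0, 5, 10, 15, 20], [0, 5, 10, 15, 20], [0, 5, 10, 15, 20]]
theta_history = [1.0, 1.2, 1.4]

qx = forecast_quantiles(history_x)      # predicted quantiles of attribute X
qy = forecast_quantiles(history_y)      # predicted quantiles of attribute Y
theta = forecast_next(theta_history)    # predicted copula parameter
sel = selectivity((10, 5), (30, 15), levels, qx, qy, theta)
print(f"predicted selectivity: {sel:.4f}")
```

The range-query selectivity is obtained from the copula by inclusion-exclusion over the four corners of the query rectangle, so only the forecast quantiles and the forecast copula parameter are needed at estimation time.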
Notes
1. Estimate AR and ARMA Models – Matlab and Simulink (2019) https://www.mathworks.com/help/ident/ug/estimating-ar-and-arma-models.html
2. Matlab Sample Data Sets (2016) https://www.mathworks.com/help/stats/_bq9uxn4.html
Acknowledgements
This work was supported by the Statutory Research funds of the Institute of Informatics, Silesian University of Technology (grant No BK/204/RAU2/2019).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Augustyn, D.R. (2020). Using Copula and Quantiles Evolution in Prediction of Multidimensional Distributions for Better Query Selectivity Estimation. In: Gruca, A., Czachórski, T., Deorowicz, S., Harężlak, K., Piotrowska, A. (eds) Man-Machine Interactions 6. ICMMI 2019. Advances in Intelligent Systems and Computing, vol 1061. Springer, Cham. https://doi.org/10.1007/978-3-030-31964-9_20
DOI: https://doi.org/10.1007/978-3-030-31964-9_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31963-2
Online ISBN: 978-3-030-31964-9
eBook Packages: Intelligent Technologies and Robotics (R0)