A Higher-Fidelity Frugal Quantile Estimator

Yazidi, Anis; Hammer, Hugo Lewi; John Oommen, B.

doi:10.1007/978-3-319-69179-4_6

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10604)

Included in the following conference series: International Conference on Advanced Data Mining and Applications

Abstract

The estimation of quantiles is pertinent when one is mining data streams. However, quantile estimation is considerably more complex than the corresponding estimation of the mean and variance, and this added complexity matters more as the size of the data increases. Clearly, in the context of “infinite” data streams, a computational and space complexity that is linear in the size of the data is not affordable. To alleviate this complexity, a very limited number of recent studies have devised incremental quantile estimators [7, 12]. Estimators within this class update the quantile estimates based on the most recent observation(s), which yields updating schemes with a very small computational footprint: a constant-time (i.e., O(1)) complexity per update. In this article, we pursue this research direction and present an estimator that we refer to as a Higher-Fidelity Frugal quantile estimator. It constitutes a substantial advancement of the family of Frugal estimators introduced in [7]. The highlight of the present scheme is that it works in the discretized space, and it is thus a pioneering algorithm within the theory of discretized algorithms. The fact that discretized Learning Automata schemes are superior to their continuous counterparts has been clearly demonstrated in the literature; to our knowledge, this is the first paper that proves the advantages of discretization within the domain of quantile estimation. Comprehensive simulation results show that our estimator outperforms the original Frugal algorithm in terms of accuracy.
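To illustrate the kind of constant-time, single-observation update that the abstract refers to, the following is a minimal Python sketch of a one-memory-unit update in the spirit of the original Frugal family of [7] (the variant usually called Frugal-1U). It is not the Higher-Fidelity estimator proposed in this chapter, whose details are not given here. The function name `frugal_1u`, the unit step size, the initial estimate, and the synthetic test stream are assumptions made for the illustration.

```python
import random


def frugal_1u(stream, q, initial=0, rng=None):
    """Track the q-th quantile of `stream` with a single stored value.

    Sketch of a Frugal-1U-style update (assumed form, after [7]):
    the estimate moves up by one unit with probability q when an
    observation lies above it, and down by one unit with probability
    1 - q when an observation lies below it.
    """
    rng = rng or random.Random()
    m = initial  # the only state kept between observations
    for s in stream:
        r = rng.random()  # uniform in [0, 1)
        if s > m and r > 1.0 - q:
            m += 1  # upward step, taken with probability q
        elif s < m and r > q:
            m -= 1  # downward step, taken with probability 1 - q
    return m


if __name__ == "__main__":
    # Hypothetical usage: integers drawn uniformly from 0..1000.
    data_rng = random.Random(1)
    stream = [data_rng.randint(0, 1000) for _ in range(20000)]
    print("median estimate:", frugal_1u(stream, 0.5, rng=random.Random(2)))
    print("0.9-quantile estimate:", frugal_1u(stream, 0.9, rng=random.Random(3)))
```

Each observation costs one comparison and at most one increment or decrement, so both time per update and memory are O(1). The estimate balances where the upward and downward step probabilities cancel, which happens when a fraction q of the observations falls below it.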

B. John Oommen is a Chancellor’s Professor, a Fellow of the IEEE, and a Fellow of the IAPR. This author is also an Adjunct Professor with the University of Agder in Grimstad, Norway.

Notes

1. With some insight, one sees that this elegant median estimation procedure is similar to the Boyer and Moore algorithm [2] for computing the majority item in a stream, using only a single pass.
2. Clearly, though, such an approach would not be able to handle the case of non-stationary quantile estimation as the positions of the markers would be affected by stale data points.
3. Throughout this paper, there is an implicit assumption that the true quantile lies in [a, b]. However, this is not a limitation of our scheme; the proof is valid for any bounded and probably non-bounded function.
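For readers unfamiliar with the algorithm mentioned in Note 1, here is a minimal Python sketch of the Boyer and Moore majority vote [2]. Like a frugal median estimator, it makes a single pass and keeps only a tiny state that each new item nudges up or down. The function name `majority_candidate` is an assumption for the illustration, and the returned value is guaranteed to be the majority item only if some item occupies more than half of the stream.

```python
def majority_candidate(stream):
    """Return the Boyer-Moore majority candidate of `stream` in one pass.

    State is one stored item plus one counter. The candidate is the true
    majority item only if such an item exists; otherwise a second pass
    is needed to verify it.
    """
    candidate, count = None, 0
    for item in stream:
        if count == 0:
            candidate, count = item, 1  # adopt a new candidate
        elif item == candidate:
            count += 1  # a vote for the current candidate
        else:
            count -= 1  # a vote against cancels one vote for
    return candidate


if __name__ == "__main__":
    # Hypothetical usage: 1 occurs in 4 of the 7 positions.
    print(majority_candidate([1, 2, 1, 3, 1, 1, 2]))
```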

References

1. Arandjelovic, O., Pham, D.S., Venkatesh, S.: Two maximum entropy-based algorithms for running quantile estimation in nonstationary data streams. IEEE Trans. Circuits Syst. Video Technol. 25(9), 1469–1479 (2015)
2. Boyer, R.S., Moore, J.S.: MJRTY-a fast majority vote algorithm. In: Boyer, R.S. (ed.) Automated Reasoning: Essays in Honor of Woody Bledsoe, pp. 105–117. Springer, Netherlands (1991). doi:10.1007/978-94-011-3488-0_5
3. Cao, J., Li, L.E., Chen, A., Bu, T.: Incremental tracking of multiple quantiles for network monitoring in cellular networks. In: Proceedings of the 1st ACM Workshop on Mobile Internet Through Cellular Networks, pp. 7–12. ACM (2009)
4. Chambers, J.M., James, D.A., Lambert, D., Wiel, S.V.: Monitoring networked applications with incremental quantile estimation. Stat. Sci. 21(4), 463–475 (2006)
5. Chen, F., Lambert, D., Pinheiro, J.C.: Incremental quantile estimation for massive tracking. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 516–522. ACM (2000)
6. Jain, R., Chlamtac, I.: The P2 algorithm for dynamic calculation of quantiles and histograms without storing observations. Commun. ACM 28(10), 1076–1085 (1985)
7. Ma, Q., Muthukrishnan, S., Sandler, M.: Frugal streaming for estimating quantiles. In: Brodnik, A., López-Ortiz, A., Raman, V., Viola, A. (eds.) Space-Efficient Data Structures, Streams, and Algorithms. LNCS, vol. 8066, pp. 77–96. Springer, Heidelberg (2013). doi:10.1007/978-3-642-40273-9_7
8. Oommen, B.J.: Stochastic searching on the line and its applications to parameter learning in nonlinear optimization. IEEE Trans. Syst. Man Cybern. Part B 27(4), 733–739 (1997)
9. Schmeiser, B.W., Deutsch, S.J.: Quantile estimation from grouped data: the cell midpoint. Commun. Stat. Simul. Comput. 6(3), 221–234 (1977)
10. Tierney, L.: A space-efficient recursive procedure for estimating a quantile of an unknown distribution. SIAM J. Sci. Stat. Comput. 4(4), 706–711 (1983)
11. Yazidi, A., Granmo, O.-C., Oommen, B.J.: A stochastic search on the line-based solution to discretized estimation. In: Jiang, H., Ding, W., Ali, M., Wu, X. (eds.) IEA/AIE 2012. LNCS, vol. 7345, pp. 764–773. Springer, Heidelberg (2012). doi:10.1007/978-3-642-31087-4_77
12. Yazidi, A., Hammer, H.: Quantile estimation using the theory of stochastic learning. In: Proceedings of the 2015 Conference on Research in Adaptive and Convergent Systems, pp. 7–14. ACM (2015)
13. Yazidi, A., Oommen, B.J.: Novel discretized weak estimators based on the principles of the stochastic search on the line problem. IEEE Trans. Cybern. 46(12), 2732–2744 (2016)
14. Yazidi, A., Oommen, B.J., Horn, G., Granmo, O.C.: Stochastic discretized learning-based weak estimation: a novel estimation method for non-stationary environments. Pattern Recognit. 60(C), 430–443 (2016)
15. Yazidi, A., Hammer, H.L., Oommen, B.J.: Higher-fidelity frugal and accurate quantile estimation using a novel incremental (2017, to be submitted for publication). Journal version

Author information

Authors and Affiliations

Department of Computer Science, Oslo and Akershus University College of Applied Sciences, Oslo, Norway
Anis Yazidi & Hugo Lewi Hammer
School of Computer Science, Carleton University, Ottawa, Canada
B. John Oommen

Corresponding author

Correspondence to Anis Yazidi.

Editor information

Editors and Affiliations

Nanyang Technological University, Singapore, Singapore
Gao Cong
National Chiao Tung University, Hsinchu, Taiwan
Wen-Chih Peng
Macquarie University, Sydney, New South Wales, Australia
Wei Emma Zhang
Wuhan University, Wuhan, China
Chengliang Li
Nanyang Technological University, Singapore, Singapore
Aixin Sun

About this paper

Cite this paper

Yazidi, A., Hammer, H.L., John Oommen, B. (2017). A Higher-Fidelity Frugal Quantile Estimator. In: Cong, G., Peng, WC., Zhang, W., Li, C., Sun, A. (eds) Advanced Data Mining and Applications. ADMA 2017. Lecture Notes in Computer Science, vol 10604. Springer, Cham. https://doi.org/10.1007/978-3-319-69179-4_6

DOI: https://doi.org/10.1007/978-3-319-69179-4_6
Published: 14 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69178-7
Online ISBN: 978-3-319-69179-4
eBook Packages: Computer Science (R0)