A Higher-Fidelity Frugal Quantile Estimator

  • Conference paper
Advanced Data Mining and Applications (ADMA 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10604)

Abstract

Quantile estimation is pertinent when one is mining data streams. However, estimating quantiles is considerably more complex than estimating the mean and variance, and this added complexity becomes more significant as the size of the data increases. Clearly, in the context of “infinite” data streams, a computational and space complexity that is linear in the size of the data is not affordable. To alleviate this complexity, a very limited number of recent studies have devised incremental quantile estimators [7, 12]. Estimators within this class update the quantile estimate based on the most recent observation(s), which yields updating schemes with a very small computational footprint: a constant-time (i.e., O(1)) complexity per update. In this article, we pursue this research direction and present an estimator that we refer to as a Higher-Fidelity Frugal quantile estimator. It constitutes a substantial advancement of the family of Frugal estimators introduced in [7]. The highlight of the present scheme is that it works in the discretized space, and it is thus a pioneering algorithm within the theory of discretized algorithms. (The fact that discretized Learning Automata schemes are superior to their continuous counterparts has been clearly demonstrated in the literature. This is the first paper, to our knowledge, that proves the advantages of discretization within the domain of quantile estimation.) Comprehensive simulation results show that our estimator outperforms the original Frugal algorithm in terms of accuracy.
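To make the idea of a constant-time incremental update concrete, the following is a minimal sketch of the baseline one-unit-of-memory Frugal scheme of [7], as we understand it. It is not the Higher-Fidelity estimator presented in this paper. The function name `frugal_1u`, its parameters, and the fixed unit step size are illustrative assumptions.

```python
import random


def frugal_1u(stream, q, initial=0, rng=None):
    """Track the q-quantile of an integer stream with a single stored value.

    Sketch of the baseline Frugal update from [7]; names are hypothetical.
    """
    rng = rng or random.Random()
    estimate = initial
    for x in stream:
        u = rng.random()  # uniform draw in [0, 1)
        if x > estimate and u > 1.0 - q:
            estimate += 1  # observation above: step up with probability q
        elif x < estimate and u > q:
            estimate -= 1  # observation below: step down with probability 1 - q
    return estimate


if __name__ == "__main__":
    source = random.Random(0)
    data = [source.randint(0, 1000) for _ in range(50000)]
    # The median estimate should settle in the neighbourhood of 500.
    print(frugal_1u(data, q=0.5, rng=random.Random(1)))
```

Each observation triggers at most one comparison, one random draw, and one increment or decrement, which is why the per-update cost is O(1) and the memory footprint is a single number.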

B. John Oommen—Chancellor’s Professor; Fellow: IEEE and Fellow: IAPR. This author is also an Adjunct Professor with the University of Agder in Grimstad, Norway.

Notes

  1. With some insight, one sees that this elegant median estimation procedure is similar to the Boyer and Moore algorithm [2] for computing the majority item in a stream, using only a single pass.

  2. Clearly, though, such an approach would not be able to handle the case of non-stationary quantile estimation as the positions of the markers would be affected by stale data points.

  3. Throughout this paper, there is an implicit assumption that the true quantile lies in [a, b]. However, this is not a limitation of our scheme; the proof is valid for any bounded and probably non-bounded function.
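For readers unfamiliar with the Boyer and Moore algorithm [2] mentioned in the first note, a minimal single-pass sketch follows. The function name `majority_candidate` is an illustrative assumption. The returned value is guaranteed to be the majority item only when some item occupies more than half of the stream; otherwise a second pass would be needed to verify it.

```python
def majority_candidate(stream):
    """Return the Boyer-Moore majority-vote candidate, or None if the stream is empty."""
    candidate, count = None, 0
    for x in stream:
        if count == 0:
            candidate, count = x, 1  # adopt a new candidate
        elif x == candidate:
            count += 1  # a vote for the current candidate
        else:
            count -= 1  # a vote against it
    return candidate


if __name__ == "__main__":
    print(majority_candidate([1, 2, 1, 3, 1, 1, 2]))
```

As with the frugal median update, the state is a single stored item plus a counter that moves up or down by one per observation.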

References

  1. Arandjelovic, O., Pham, D.S., Venkatesh, S.: Two maximum entropy-based algorithms for running quantile estimation in nonstationary data streams. IEEE Trans. Circuits Syst. Video Technol. 25(9), 1469–1479 (2015)

  2. Boyer, R.S., Moore, J.S.: MJRTY-a fast majority vote algorithm. In: Boyer, R.S. (ed.) Automated Reasoning: Essays in Honor of Woody Bledsoe, pp. 105–117. Springer, Netherlands (1991). doi:10.1007/978-94-011-3488-0_5

  3. Cao, J., Li, L.E., Chen, A., Bu, T.: Incremental tracking of multiple quantiles for network monitoring in cellular networks. In: Proceedings of the 1st ACM Workshop on Mobile Internet Through Cellular Networks, pp. 7–12. ACM (2009)

  4. Chambers, J.M., James, D.A., Lambert, D., Wiel, S.V.: Monitoring networked applications with incremental quantile estimation. Stat. Sci. 21(4), 463–475 (2006)

  5. Chen, F., Lambert, D., Pinheiro, J.C.: Incremental quantile estimation for massive tracking. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 516–522. ACM (2000)

  6. Jain, R., Chlamtac, I.: The P2 algorithm for dynamic calculation of quantiles and histograms without storing observations. Commun. ACM 28(10), 1076–1085 (1985)

  7. Ma, Q., Muthukrishnan, S., Sandler, M.: Frugal streaming for estimating quantiles. In: Brodnik, A., López-Ortiz, A., Raman, V., Viola, A. (eds.) Space-Efficient Data Structures, Streams, and Algorithms. LNCS, vol. 8066, pp. 77–96. Springer, Heidelberg (2013). doi:10.1007/978-3-642-40273-9_7

  8. Oommen, B.J.: Stochastic searching on the line and its applications to parameter learning in nonlinear optimization. IEEE Trans. Syst. Man Cybern. Part B 27(4), 733–739 (1997)

  9. Schmeiser, B.W., Deutsch, S.J.: Quantile estimation from grouped data: the cell midpoint. Commun. Stat. Simul. Comput. 6(3), 221–234 (1977)

  10. Tierney, L.: A space-efficient recursive procedure for estimating a quantile of an unknown distribution. SIAM J. Sci. Stat. Comput. 4(4), 706–711 (1983)

  11. Yazidi, A., Granmo, O.-C., Oommen, B.J.: A stochastic search on the line-based solution to discretized estimation. In: Jiang, H., Ding, W., Ali, M., Wu, X. (eds.) IEA/AIE 2012. LNCS, vol. 7345, pp. 764–773. Springer, Heidelberg (2012). doi:10.1007/978-3-642-31087-4_77

  12. Yazidi, A., Hammer, H.: Quantile estimation using the theory of stochastic learning. In: Proceedings of the 2015 Conference on Research in Adaptive and Convergent Systems, pp. 7–14. ACM (2015)

  13. Yazidi, A., Oommen, B.J.: Novel discretized weak estimators based on the principles of the stochastic search on the line problem. IEEE Trans. Cybern. 46(12), 2732–2744 (2016)

  14. Yazidi, A., Oommen, B.J., Horn, G., Granmo, O.C.: Stochastic discretized learning-based weak estimation: a novel estimation method for non-stationary environments. Pattern Recognit. 60(C), 430–443 (2016)

  15. Yazidi, A., Hammer, H.L., Oommen, B.J.: Higher-fidelity frugal and accurate quantile estimation using a novel incremental (2017, to be submitted for publication). Journal version

Corresponding author

Correspondence to Anis Yazidi.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Yazidi, A., Hammer, H.L., Oommen, B.J. (2017). A Higher-Fidelity Frugal Quantile Estimator. In: Cong, G., Peng, WC., Zhang, W., Li, C., Sun, A. (eds) Advanced Data Mining and Applications. ADMA 2017. Lecture Notes in Computer Science, vol 10604. Springer, Cham. https://doi.org/10.1007/978-3-319-69179-4_6

  • DOI: https://doi.org/10.1007/978-3-319-69179-4_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-69178-7

  • Online ISBN: 978-3-319-69179-4

  • eBook Packages: Computer Science, Computer Science (R0)
