Smooth distribution function estimation for lifetime distributions using Szasz–Mirakyan operators

Abstract

In this paper, we introduce a new smooth estimator for continuous distribution functions on the positive real half-line using Szasz–Mirakyan operators, in a manner similar to Bernstein’s approximation theorem. We show that the proposed estimator outperforms the empirical distribution function in terms of asymptotic (integrated) mean-squared error and generally compares favorably with other competitors in theoretical comparisons. We also conduct simulations to demonstrate the finite-sample performance of the proposed estimator.
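As an illustration of the kind of estimator described above, the following is a minimal sketch that assumes the estimator is obtained by applying the Szasz–Mirakyan operator to the empirical distribution function \(F_n\), that is, \(\sum_{k\ge 0} F_n(k/m)\,e^{-mx}(mx)^k/k!\). The function names, the truncation rule for the infinite sum, and the choice of \(m\) are illustrative assumptions and are not taken from the paper.

```python
import math
from bisect import bisect_right


def poisson_weight(k, lam):
    """Poisson weight e^{-lam} lam^k / k!, computed on the log scale."""
    if lam == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def szasz_edf_estimate(sample, x, m):
    """Sketch of sum_k F_n(k/m) * V_{k,m}(x) for x >= 0 (assumed form).

    The infinite sum is truncated about 12 standard deviations above
    the Poisson mean m*x, where the remaining weight is negligible.
    """
    data = sorted(sample)
    n = len(data)
    lam = m * x
    k_max = int(lam + 12.0 * math.sqrt(lam) + 30)  # heuristic cut-off
    total = 0.0
    for k in range(k_max + 1):
        edf = bisect_right(data, k / m) / n  # empirical CDF at k/m
        total += edf * poisson_weight(k, lam)
    return total
```

Under these assumptions, calling `szasz_edf_estimate(sample, x, m)` on a grid of `x` values gives a smooth, nondecreasing curve that starts at \(F_n(0)\) for \(x=0\).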

References

  1. Altman, N., Léger, C. (1995). Bandwidth selection for kernel distribution function estimation. Journal of Statistical Planning and Inference, 46(2), 195–214.

  2. Babu, G. J., Canty, A. J., Chaubey, Y. P. (2002). Application of Bernstein polynomials for smooth estimation of a distribution and density function. Journal of Statistical Planning and Inference, 105(2), 377–392.

  3. Bowman, A., Hall, P., Prvan, T. (1998). Bandwidth selection for the smoothing of distribution functions. Biometrika, 85(4), 799–808.

  4. Duin, R. (1976). On the choice of smoothing parameters for Parzen estimators of probability density functions. IEEE Transactions on Computers, C-25(11), 1175–1179.

  5. Falk, M. (1983). Relative efficiency and deficiency of kernel type estimators of smooth distribution functions. Statistica Neerlandica, 37(2), 73–83.

  6. Gramacki, A. (2018). Nonparametric kernel density estimation and its computational aspects. Studies in Big Data. New York: Springer International Publishing.

  7. Hanebeck, A. (2020). Nonparametric distribution function estimation. Master’s thesis, Karlsruher Institut für Technologie (KIT).

  8. Hanebeck, A., Klar, B. (2020). Smooth distribution function estimation for lifetime distributions using Szasz–Mirakyan operators. arXiv:2005.09994 [math, stat].

  9. Helali, S., Slaoui, Y. (2020). Estimation of a distribution function using Lagrange polynomials with Tchebychev-Gauss points. Statistics and Its Interface, 13(3), 399–410.

  10. Hogg, R. V., Klugman, S. A. (1984). Loss distributions. New York: Wiley-Interscience.

  11. Jmaei, A., Slaoui, Y., Dellagi, W. (2017). Recursive distribution estimator defined by stochastic approximation method using Bernstein polynomials. Journal of Nonparametric Statistics, 29, 792–805.

  12. Johnson, N. L., Kotz, S., Balakrishnan, N. (1994). Continuous univariate distributions (2nd ed., Vol. 1). New York: Wiley-Interscience.

  13. Johnson, N. L., Kotz, S., Balakrishnan, N. (1995). Continuous univariate distributions (2nd ed., Vol. 2). New York: Wiley-Interscience.

  14. Kim, C., Kim, S., Park, M., Lee, H. (2006). A bias reducing technique in kernel distribution function estimation. Computational Statistics, 21(3), 589–601.

  15. Leblanc, A. (2012). On estimating distribution functions using Bernstein polynomials. Annals of the Institute of Statistical Mathematics, 64(5), 919–943.

  16. Lockhart, R. (2013). The basics of nonparametric models. http://people.stat.sfu.ca/~lockhart/richard/830/13_3/lectures/nonparametric_basics/.

  17. Lorentz, G. G. (1986). Bernstein polynomials (2nd ed.). New York, N.Y.: Chelsea Pub. Co.

  18. Marshall, A. W., Olkin, I. (2007). Life distributions: Structure of nonparametric, semiparametric, and parametric families. Springer Series in Statistics, New York: Springer.

  19. Mokkadem, A., Pelletier, M., Slaoui, Y. (2009). The stochastic approximation method for the estimation of a multivariate probability density. arXiv:0807.2960 [math, stat].

  20. Ouimet, F. (2020). A local limit theorem for the Poisson distribution and its application to the Le Cam distance between Poisson and Gaussian experiments and asymptotic properties of Szasz estimators. arXiv:2010.05146 [math, stat].

  21. Parzen, E. (1962). On estimation of a probability density function and mode. The Annals of Mathematical Statistics, 33(3), 1065–1076.

  22. Polansky, A. M., Baker, E. R. (2000). Multistage plug-in bandwidth selection for kernel distribution function estimates. Journal of Statistical Computation and Simulation, 65(1–4), 63–80.

  23. Rosenblatt, M. (1956). Remarks on some nonparametric estimates of a density function. The Annals of Mathematical Statistics, 27(3), 832–837.

  24. Rudemo, M. (1982). Empirical choice of histograms and kernel density estimators. Scandinavian Journal of Statistics, 9(2), 65–78.

  25. Schwartz, S. C. (1967). Estimation of probability density by an orthogonal series. The Annals of Mathematical Statistics, 38(4), 1261–1265.

  26. Slaoui, Y. (2014). Bandwidth selection for recursive kernel density estimators defined by stochastic approximation method. https://www.hindawi.com/journals/jps/2014/739640/.

  27. Stephanou, M., Varughese, M., Macdonald, I. (2017). Sequential quantiles via Hermite series density estimation. Electronic Journal of Statistics, 11(1), 570–607.

  28. Szasz, O. (1950). Generalization of S. Bernstein’s polynomials to the infinite interval. Journal of Research of the National Bureau of Standards, 45, 239–245.

  29. Tenreiro, C. (2006). Asymptotic behaviour of multistage plug-in bandwidth selections for kernel distribution function estimators. Journal of Nonparametric Statistics, 18(1), 101–116.

  30. Watson, G. S., Leadbetter, M. R. (1964). Hazard analysis II. Sankhyā: The Indian Journal of Statistics, Series A (1961–2002), 26(1), 101–116.

  31. Yamato, H. (1973). Uniform convergence of an estimator of a distribution function. Bulletin of Mathematical Statistics, 15(3), 69–78.

  32. Zhang, S., Li, Z., Zhang, Z. (2020). Estimating a distribution function at the boundary. Austrian Journal of Statistics, 49(1), 1–23.

Acknowledgements

The authors are grateful to two reviewers and the editors for their helpful remarks and comments on an earlier version of this manuscript. They are also sincerely grateful to Frédéric Ouimet for pointing out an error in a previous version of Lemma 3, for helpful discussions and for sharing his preprint Ouimet (2020).

Author information

Corresponding author

Correspondence to Ariane Hanebeck.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

The following theorem can be found in Ouimet (2020), who pointed out a mistake in Leblanc (2012) that also affects this paper. Accordingly, the asymptotic behavior of \(R_{1,m}^S\) in Lemma 3 has been corrected relative to Lemma 3 in Hanebeck and Klar (2020, arXiv v1).

Theorem 8

We define

$$\begin{aligned} V_{k,m}(x)=\frac{(mx)^k}{k!}e^{-mx}, \; \phi (x)=\frac{1}{\sqrt{2\pi }}e^{-x^2/2}, \; {\text {and}} \; \delta _k=\frac{k-mx}{\sqrt{mx}}. \end{aligned}$$

Pick any \(\eta \in (0,1)\). Then, we have uniformly for \(k \in {\mathbb {N}}_0\) with \(\left| \frac{\delta _k}{\sqrt{mx}}\right| \le \eta\) that

$$\begin{aligned} \frac{V_{k,m}(x)}{\frac{1}{\sqrt{mx}}\phi (\delta _k)}&=1+m^{-1/2}\frac{1}{\sqrt{x}}\left( \frac{1}{6}\delta _k^3-\frac{1}{2}\delta _k\right) \\&+m^{-1}\frac{1}{x}\left( \frac{1}{72}\delta _k^6-\frac{1}{6}\delta _k^4+\frac{3}{8}\delta _k^2-\frac{1}{12}\right) +O_{x,\eta }\left( \frac{|1+\delta _k|^9}{m^{3/2}}\right) \end{aligned}$$

as \(n \rightarrow \infty\).
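The expansion in Theorem 8 can be checked numerically. The sketch below evaluates the exact ratio \(V_{k,m}(x)/\bigl(\phi(\delta_k)/\sqrt{mx}\bigr)\) and the two-term expansion without the remainder; the function names and the particular values of \(k\), \(m\) and \(x\) one might try are illustrative assumptions, not part of the paper.

```python
import math


def V(k, m, x):
    """V_{k,m}(x) = (mx)^k e^{-mx} / k!, on the log scale (requires x > 0)."""
    lam = m * x
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def ratio_exact(k, m, x):
    """Left-hand side of the expansion in Theorem 8."""
    lam = m * x
    delta = (k - lam) / math.sqrt(lam)
    return V(k, m, x) / (phi(delta) / math.sqrt(lam))


def ratio_expansion(k, m, x):
    """Right-hand side of Theorem 8 with the O(.) remainder dropped."""
    lam = m * x
    d = (k - lam) / math.sqrt(lam)
    # m^{-1/2} / sqrt(x) equals 1 / sqrt(mx), and m^{-1} / x equals 1 / (mx)
    first = (d**3 / 6.0 - d / 2.0) / math.sqrt(lam)
    second = (d**6 / 72.0 - d**4 / 6.0 + 3.0 * d**2 / 8.0 - 1.0 / 12.0) / lam
    return 1.0 + first + second
```

For moderate \(\delta_k\) and large \(m\), `ratio_exact` and `ratio_expansion` should agree much more closely than `ratio_exact` and the leading term 1, which is consistent with the stated remainder of order \(m^{-3/2}\).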

We now present various properties of \(V_{k,m}\) that are needed for the proofs. The following lemma and its proof are similar to Lemma 2 and Lemma 3 in Leblanc (2012). As mentioned before, parts (e) and (h) take the suggestions in Ouimet (2020) into account. The proofs for these parts are adjusted accordingly.

Lemma 3

Define

$$\begin{aligned}&L_m^S(x)=\sum _{k=0}^{\infty }V_{k,m}^2(x), \\&R^S_{j,m}(x)=m^{-j}\mathop {\sum \sum }_{0\le k < l \le \infty }(k-mx)^jV_{k,m}(x)V_{l,m}(x) {\text { for }} j\in \{0,1,2\}, \end{aligned}$$

and

$$\begin{aligned} \tilde{R}_{1,m}^S(x)=m^{1/2}\sum _{k,l=0}^{\infty }\left( \frac{k\wedge l}{m}-x\right) V_{k,m}(x)V_{l,m}(x), \end{aligned}$$

and \(V_{k,m}(x)=e^{-mx}\frac{(mx)^k}{k!}\). It trivially holds that \(0 \le L_m^S(x) \le 1\) for \(x\in [0,\infty )\). In addition, the following properties hold.

  (a) \(L_m^S(0)=1\) and \(\displaystyle \lim _{x\rightarrow \infty } L_m^S(x)=0\),

  (b) \(R_{j,m}^S(0)=0\) for \(j\in \{0,1,2\}\),

  (c) \(0 \le R_{2,m}^S(x) \le \frac{x}{m} {\text { for }} x \in (0,\infty )\),

  (d) \(L_m^S(x)=m^{-1/2}\left[ (4\pi x)^{-1/2}+o_x(1)\right] {\text { for }} x\in (0,\infty )\),

  (e) \(\tilde{R}_{1,m}^S(x)= -\sqrt{\frac{x}{\pi }}+o_x(1) {\text { for }} x\in (0,\infty )\) and \(R_{1,m}^S(x)=m^{-1/2}\left[ -\sqrt{\frac{x}{4\pi }}+o_x(1)\right]\),

  (f) \(m^{1/2} \displaystyle \int _0^{\infty } L_m^S(x)e^{-ax}{\mathrm {d}}x =\frac{1}{2\sqrt{a}}+o(1)\) for \(a \in (0,\infty )\),

  (g) \(m^{1/2} \displaystyle \int _0^{\infty } x L_m^S(x)e^{-ax}{\mathrm {d}}x =\frac{1}{4a^{3/2}}+o(1)\) for \(a \in (0,\infty )\),

  (h) For any continuous and bounded function g on \([0,\infty )\), \(m^{1/2} \displaystyle \int _0^{\infty } g(x)R_{1,m}^S(x)e^{-ax}{\mathrm {d}}x = -\displaystyle \int _0^{\infty } g(x)\frac{\sqrt{x}}{\sqrt{4\pi }}e^{-ax}{\mathrm {d}}x+o(1)\) for \(a \in (0,\infty )\) and \(\displaystyle \int _0^{\infty } g(x)\tilde{R}_{1,m}^S(x)e^{-ax}{\mathrm {d}}x = -\displaystyle \int _0^{\infty } g(x)\frac{\sqrt{x}}{\sqrt{\pi }}e^{-ax}{\mathrm {d}}x+o(1)\).
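Parts (d) and (e) of Lemma 3 lend themselves to a quick numerical sanity check. The sketch below computes truncated versions of \(L_m^S(x)\) and \(R_{1,m}^S(x)\) for \(x>0\) so that \(m^{1/2}L_m^S(x)\) can be compared with \((4\pi x)^{-1/2}\) and \(m^{1/2}R_{1,m}^S(x)\) with \(-\sqrt{x/(4\pi)}\). The function names and the truncation point of the infinite sums are illustrative assumptions.

```python
import math


def poisson_weights(lam, k_max):
    """List of V_{k,m}(x) for k = 0..k_max, with lam = m*x > 0."""
    return [
        math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
        for k in range(k_max + 1)
    ]


def lemma3_quantities(m, x):
    """Return truncated (L_m^S(x), R_{1,m}^S(x)) for x > 0."""
    lam = m * x
    # Heuristic cut-off: about 12 standard deviations above the mean.
    k_max = int(lam + 12.0 * math.sqrt(lam) + 30)
    w = poisson_weights(lam, k_max)

    # L_m^S(x) = sum_k V_{k,m}(x)^2
    L = sum(v * v for v in w)

    # R_{1,m}^S(x) = m^{-1} * sum_{k<l} (k - mx) V_k V_l.
    # Walk k downwards so that `tail` always holds sum_{l>k} V_l.
    tail = 0.0
    R1 = 0.0
    for k in range(k_max, -1, -1):
        R1 += (k - lam) * w[k] * tail
        tail += w[k]
    return L, R1 / m
```

Multiplying both returned values by `math.sqrt(m)` for increasing `m` should show them approaching \((4\pi x)^{-1/2}\) and \(-\sqrt{x/(4\pi)}\), respectively, in line with the corrected constant in part (e).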

About this article

Cite this article

Hanebeck, A., Klar, B. Smooth distribution function estimation for lifetime distributions using Szasz–Mirakyan operators. Ann Inst Stat Math (2021). https://doi.org/10.1007/s10463-020-00783-y

Keywords

  • Distribution function estimation
  • Nonparametric
  • Szasz–Mirakyan operator
  • Hermite estimator
  • Mean squared error