Abstract
Kernel density estimators (KDEs) are ubiquitous tools for the nonparametric estimation of probability density functions (PDFs) when data are obtained from unknown data-generating processes. The KDEs typically available in software packages are defined and designed for real-valued data. When applied to positive data, these KDEs do not yield bona fide PDFs, since they place probability mass outside the positive support. A log-transformation methodology can be applied to produce a nonparametric estimator that yields proper PDFs over positive supports. We call the KDEs obtained via this transformation log-KDEs. We derive expressions for the pointwise biases, variances, and mean squared errors of the log-KDEs obtained via various kernel functions. Mean integrated squared error (MISE) and asymptotic MISE results are also provided, and a plug-in rule for log-KDE bandwidths is derived. We demonstrate the log-KDE methodology via our R package, logKDE, and illustrate the approach with real-data case studies.
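The log-transformation idea can be illustrated with a minimal sketch. It is written in plain Python rather than with the logKDE package, and it is not the authors' implementation. It assumes a Gaussian kernel and applies an ordinary KDE to the log-transformed data, mapping the result back to the positive half-line with the Jacobian factor 1/x. The function names (`log_kde`, `normal_reference_bandwidth`, `integrate_density`) are hypothetical, and the bandwidth is a generic normal-reference rule applied to the log-data, used here only as a stand-in for the plug-in rule derived in the paper.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used as the kernel function."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def log_kde(x, data, h):
    """Log-transformed KDE evaluated at x.

    A standard KDE is computed on log(data) at log(x), then multiplied by
    the Jacobian 1/x so that the result is a density on (0, inf).
    """
    if x <= 0:
        return 0.0  # no mass is assigned outside the positive support
    log_x = math.log(x)
    total = sum(gaussian_kernel((log_x - math.log(xi)) / h) for xi in data)
    return total / (len(data) * h * x)


def normal_reference_bandwidth(data):
    """Assumption: a generic normal-reference rule on the log-data.

    This is NOT the plug-in rule derived in the paper.
    """
    logs = [math.log(xi) for xi in data]
    n = len(logs)
    mean = sum(logs) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in logs) / (n - 1))
    return 1.06 * sd * n ** (-0.2)


def integrate_density(data, h, lo=-10.0, hi=10.0, steps=2000):
    """Midpoint-rule integral of the log-KDE over (e^lo, e^hi).

    Uses the substitution x = exp(t), so that f(x) dx = f(e^t) e^t dt.
    """
    dt = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = math.exp(lo + (k + 0.5) * dt)
        total += log_kde(x, data, h) * x * dt
    return total


if __name__ == "__main__":
    random.seed(0)
    # Simulated positive data (log-normal), purely for illustration.
    sample = [random.lognormvariate(0.0, 0.5) for _ in range(200)]
    h = normal_reference_bandwidth(sample)
    print("bandwidth on the log scale:", h)
    print("estimated density at x = 1:", log_kde(1.0, sample, h))
    print("numerical integral over the positive half-line:",
          integrate_density(sample, h))
```

Because the estimator is zero for non-positive arguments and integrates to approximately one over the positive half-line, it behaves as a proper PDF on a positive support, which is the property that motivates the log-KDE construction.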
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Nguyen, H.D., Jones, A.T., McLachlan, G.J. (2019). Positive Data Kernel Density Estimation via the LogKDE Package for R. In: Islam, R., et al. Data Mining. AusDM 2018. Communications in Computer and Information Science, vol 996. Springer, Singapore. https://doi.org/10.1007/978-981-13-6661-1_21
DOI: https://doi.org/10.1007/978-981-13-6661-1_21
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6660-4
Online ISBN: 978-981-13-6661-1
eBook Packages: Computer Science, Computer Science (R0)