Annals of the Institute of Statistical Mathematics, Volume 32, Issue 2, pp 223–240

# Nonparametric estimation of an affinity measure between two absolutely continuous distributions with hypotheses testing applications

## Abstract

Let F and G denote two distribution functions defined on the same probability space that are absolutely continuous with respect to the Lebesgue measure, with probability density functions f and g, respectively. A measure of the closeness between F and G is defined by $$\lambda = \lambda (F,G) = \frac{2\int f(x)g(x)\,dx}{\int f^2 (x)\,dx + \int g^2 (x)\,dx}$$. Based on two independent samples, it is proposed to estimate λ by $$\hat \lambda = \frac{\int \hat f(x)\,dG_n (x) + \int \hat g(x)\,dF_n (x)}{\int \hat f^2 (x)\,dx + \int \hat g^2 (x)\,dx}$$, where $$F_n (x)$$ and $$G_n (x)$$ are the empirical distribution functions corresponding to F(x) and G(x), respectively, and $$\hat f(x)$$ and $$\hat g(x)$$ are the kernel estimates of f(x) and g(x), respectively, as defined by Parzen [16]. Large sample theory for $$\hat \lambda$$ is presented, and a two-sample goodness-of-fit test based on $$\hat \lambda$$ is proposed. Also discussed are estimates of certain modifications of λ, which lead to test statistics for the one-sample case, i.e., when $$g(x) = f_0 (x)$$ with $$f_0 (x)$$ completely known, and for testing symmetry, i.e., testing $$H_0 : f(x) = f( - x)$$.
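The estimator described above can be sketched numerically. The following is a minimal illustration and not the authors' implementation: it assumes a Gaussian kernel (one admissible choice of Parzen-type kernel), and the function names `gaussian_pdf`, `mean_kernel`, `affinity_estimate` and the bandwidth parameters `h_x`, `h_y` are hypothetical. With a Gaussian kernel, the integral of a squared kernel estimate has a closed form: it equals the average over all sample pairs of a normal density with standard deviation √2 times the bandwidth.

```python
import math
import random


def gaussian_pdf(u, scale):
    """Density of N(0, scale^2) evaluated at u."""
    return math.exp(-0.5 * (u / scale) ** 2) / (scale * math.sqrt(2.0 * math.pi))


def mean_kernel(a, b, scale):
    """Average of gaussian_pdf(a_i - b_j, scale) over all pairs (i, j)."""
    total = sum(gaussian_pdf(ai - bj, scale) for ai in a for bj in b)
    return total / (len(a) * len(b))


def affinity_estimate(x, y, h_x, h_y):
    """Sketch of lambda-hat for samples x (from F) and y (from G).

    Assumption: Gaussian kernel with bandwidths h_x and h_y.
    """
    # Integral of f-hat against G_n is the mean of f-hat over the y sample;
    # integral of g-hat against F_n is the mean of g-hat over the x sample.
    numerator = mean_kernel(y, x, h_x) + mean_kernel(x, y, h_y)
    # Closed form of the integral of a squared Gaussian kernel estimate:
    # two N(0, h^2) kernels convolve to an N(0, 2 h^2) kernel.
    denominator = (mean_kernel(x, x, math.sqrt(2.0) * h_x)
                   + mean_kernel(y, y, math.sqrt(2.0) * h_y))
    return numerator / denominator


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.gauss(0.0, 1.0) for _ in range(200)]
    y = [rng.gauss(0.0, 1.0) for _ in range(200)]
    print(affinity_estimate(x, y, h_x=0.3, h_y=0.3))
```

For two samples drawn from the same distribution the value should be close to 1, and for well-separated samples it should be close to 0. Because the numerator and denominator are smoothed differently, the finite-sample value is not guaranteed to lie in [0, 1]. The bandwidths here are fixed by hand purely for illustration.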

## Keywords

Probability Density Function · Central Limit Theorem · Bounded Variation · Kernel Estimate · Lebesgue Dominated Convergence Theorem

## References

1. Ahmad, I. A. and Lin, P. E. (1977). Non parametric density estimation for dependent variables with application, under revision.
2. Ahmad, I. A. and Van Belle, G. (1974). Measuring affinity of distributions, Reliability and Biometry, Statistical Analysis of Life Testing, (eds., Proschan and R. J. Serfling), SIAM, Philadelphia, 651–668.
3. Bhattacharyya, G. K. and Roussas, G. (1969). Estimation of certain functional of probability density function, Skand. Aktuarietidskr., 52, 203–206.
4. Billingsley, P. (1968). Convergence of Probability Measures, John Wiley and Sons, New York.
5. Chernoff, H. and Savage, I. R. (1958). Asymptotic normality and efficiency of certain nonparametric test statistics, Ann. Math. Statist., 29, 972–994.
6. Dvoretzky, A., Kiefer, J. and Wolfowitz, J. (1956). Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator, Ann. Math. Statist., 27, 642–669.
7. Matusita, K. (1955). Decision rules based on the distance for the problems of fit, two samples, and estimation, Ann. Math. Statist., 26, 631–640.
8. Matusita, K. (1964). Distance and decision rules, Ann. Inst. Statist. Math., 16, 305–315.
9. Matusita, K. (1966). A distance and related statistics in multivariate analysis, Multivariate Analysis I, (ed. P. R. Krishnaiah), Academic Press, New York, 187–200.
10. Matusita, K. (1967a). Classification based on distance in multivariate Gaussian cases, Proc. Fifth Berkeley Symp. Math. Statist. Prob., Vol. I, 299–304.
11. Matusita, K. (1967b). On the notion of affinity of several distributions and some of its applications, Ann. Inst. Statist. Math., 19, 181–192.
12. Matusita, K. (1971). Some properties of affinity and applications, Ann. Inst. Statist. Math., 23, 137–155.
13. Matusita, K. (1973). Correlation and affinity in Gaussian cases, Multivariate Analysis III, (ed. P. R. Krishnaiah), Academic Press, New York, 345–349.
14. Matusita, K. and Akaike, H. (1956). Decision rules based on the distance for the problem of independence, invariance, and two samples, Ann. Inst. Statist. Math., 7, 67–80.
15. Nadaraya, E. A. (1965). On nonparametric estimation of density function and regression curve, Theory Prob. Appl., 10, 186–190.
16. Parzen, E. (1962). On the estimation of a probability density function and mode, Ann. Math. Statist., 33, 1065–1076.
17. Philipp, W. (1969). The central limit theorem for mixing sequences of random variables, Z. Wahrscheinlichkeitsth., 12, 155–171.
18. Rosenblatt, M. (1956a). Remarks on some nonparametric estimates of a density function, Ann. Math. Statist., 27, 832–837.
19. Rosenblatt, M. (1956b). A central limit theorem and a strong mixing condition, Proc. Nat. Acad. Sci. USA, 42, 43–47.
20. Royden, H. L. (1968). Real Analysis (Second Edition), Macmillan, New York.