# An optimal test for the additive model with discrete or categorical predictors

## Abstract

In multivariate nonparametric regression, additive models are very useful when a suitable parametric model is difficult to find. The backfitting algorithm is a powerful tool for estimating the additive components. However, due to the complexity of the estimators, the asymptotic *p* value of the associated test is difficult to calculate without a Monte Carlo simulation. Moreover, conventional tests assume that the predictor variables are strictly continuous. In this paper, a new test is introduced for the additive components with discrete or categorical predictors, where the model may also contain continuous covariates. The method is also applied to semiparametric regression to test the goodness of fit of the model. These tests are asymptotically optimal in terms of the rate of convergence, as they can detect a specific class of contiguous alternatives at a rate of \(n^{-1/2}\). An extensive simulation study and a real data example are presented to support the theoretical results.
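To make the setting concrete, the following is a minimal sketch of generic backfitting for an additive model with one continuous and one categorical predictor. It is not the test proposed in the paper, and it does not reproduce the author's estimator: the Nadaraya-Watson smoother with a Gaussian kernel, the fixed bandwidth, the simulated data, and all function names (`kernel_smooth`, `group_means`, `backfit`) are assumptions chosen for illustration.

```python
import numpy as np


def kernel_smooth(x, r, bandwidth):
    """Nadaraya-Watson smooth of the partial residuals r, evaluated at the sample points x."""
    u = (x[:, None] - x[None, :]) / bandwidth
    w = np.exp(-0.5 * u**2)  # Gaussian kernel weights (an illustrative choice)
    return (w @ r) / w.sum(axis=1)


def group_means(z, r):
    """Smoother for a categorical predictor: mean partial residual within each level."""
    _, idx = np.unique(z, return_inverse=True)
    means = np.bincount(idx, weights=r) / np.bincount(idx)
    return means[idx]


def backfit(y, x, z, bandwidth=0.05, n_iter=50, tol=1e-8):
    """Fit y = alpha + f_x(x) + f_z(z) + error by cycling over the two components."""
    alpha = y.mean()
    f_x = np.zeros_like(y)
    f_z = np.zeros_like(y)
    for _ in range(n_iter):
        # Update each component on the residuals left by the other, then centre it
        # so that the intercept alpha stays identifiable.
        f_x_new = kernel_smooth(x, y - alpha - f_z, bandwidth)
        f_x_new -= f_x_new.mean()
        f_z_new = group_means(z, y - alpha - f_x_new)
        f_z_new -= f_z_new.mean()
        change = max(np.abs(f_x_new - f_x).max(), np.abs(f_z_new - f_z).max())
        f_x, f_z = f_x_new, f_z_new
        if change < tol:
            break
    return alpha, f_x, f_z


# Simulated data (hypothetical): a smooth effect of x plus a three-level factor z.
rng = np.random.default_rng(0)
n = 400
x = rng.uniform(0, 1, n)
z = rng.integers(0, 3, n)
effects = np.array([-1.0, 0.0, 1.0])
y = 2.0 + np.sin(2 * np.pi * x) + effects[z] + rng.normal(0, 0.3, n)

alpha, f_x, f_z = backfit(y, x, z)
level_gap = f_z[z == 2].mean() - f_z[z == 0].mean()  # estimated contrast between levels 2 and 0
```

In this sketch the categorical component is fitted by within-level means, so no bandwidth is needed for `z`. A test of the kind described in the abstract would ask whether the fitted categorical component differs from zero by more than chance, which the sketch does not attempt.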

## Keywords

Additive model, Categorical data analysis, Backfitting algorithm, Generalized likelihood ratio test, Semiparametric model, Local polynomial regression

## Notes

### Acknowledgements

The author is very grateful to Kuchibhotla Arun Kumar for carefully reading the paper, including all proofs, and for providing helpful comments and suggestions. The author would also like to thank two anonymous referees whose comments significantly improved the presentation of the paper.
