Nonparametric regression under double-sampling designs

Journal of Systems Science and Complexity

Abstract

This paper studies nonparametric estimation of the regression function with surrogate outcome data under double-sampling designs, where a proxy response is observed for the full sample and the true response is observed only on a validation subsample. A two-step estimation approach is proposed. The authors first estimate the regression function with a kernel smoother based on the validation subsample, and then improve the estimate by using the incomplete observations from the non-validation subsample together with the surrogate response from the full sample. Asymptotic normality of the proposed estimator is derived, and the effectiveness of the method is demonstrated via simulations.
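The two-step idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' estimator, so everything below is an assumption made for illustration: the first step uses a Nadaraya-Watson smoother with a Gaussian kernel, and the second step uses a simple difference-type correction (validation-only fit of the surrogate minus full-sample fit of the surrogate) as a stand-in for the paper's improvement step. The function names `nw_smoother` and `surrogate_adjusted`, the simulated model, and the bandwidth are all hypothetical.

```python
import numpy as np


def nw_smoother(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson kernel regression with a Gaussian kernel."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    # Scaled distances between every evaluation point and every training point.
    u = (x_eval[:, None] - x_train[None, :]) / bandwidth
    w = np.exp(-0.5 * u**2)
    # Locally weighted average of the responses.
    return (w @ y_train) / w.sum(axis=1)


def surrogate_adjusted(x, y_surr, y_true_val, val_idx, x_eval, bandwidth):
    """Illustrative correction only; NOT claimed to be the paper's estimator.

    Step 1: smooth the true response on the validation subsample.
    Step 2: subtract the gap between the validation-only and full-sample
            smooths of the surrogate response.
    """
    m_val = nw_smoother(x[val_idx], y_true_val, x_eval, bandwidth)
    g_val = nw_smoother(x[val_idx], y_surr[val_idx], x_eval, bandwidth)
    g_full = nw_smoother(x, y_surr, x_eval, bandwidth)
    return m_val - (g_val - g_full)


# Simulated double-sampling design (assumed model, for illustration only).
rng = np.random.default_rng(0)
n, n_val = 1000, 200
x = rng.uniform(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, n)   # true response
y_surr = y + rng.normal(0.0, 0.1, n)                  # proxy seen for everyone
val_idx = rng.choice(n, size=n_val, replace=False)    # validation subsample

x_eval = np.linspace(0.1, 0.9, 9)
bandwidth = 0.08                                      # hypothetical choice
m_val_only = nw_smoother(x[val_idx], y[val_idx], x_eval, bandwidth)
m_adjusted = surrogate_adjusted(x, y_surr, y[val_idx], val_idx, x_eval, bandwidth)
```

In this sketch the true response enters only through `y[val_idx]`, while the surrogate `y_surr` is used for all `n` observations, which mirrors the data structure the abstract describes. The correction term has a fixed weight of one here; a weight chosen to minimize variance would be a natural refinement, but that choice is not specified in the abstract.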

Additional information

This research is supported by the US National Science Foundation under Grant No. DMS-0906482.

This paper was recommended for publication by Editor Guohua ZOU.

About this article

Cite this article

Jiang, X., Jiang, J. & Liu, Y. Nonparametric regression under double-sampling designs. J Syst Sci Complex 24, 167–175 (2011). https://doi.org/10.1007/s11424-011-8129-x
