
M-based simultaneous inference for the mean function of functional data

Annals of the Institute of Statistical Mathematics

Abstract

Estimating and constructing a simultaneous confidence band for the mean function in the presence of outliers is an important problem in the framework of functional data analysis. In this paper, we propose a robust estimator and a robust simultaneous confidence band for the mean function of functional data using M-estimation and B-splines. The robust simultaneous confidence band is also extended to the difference of mean functions of two populations. Further, the asymptotic properties of the M-based mean function estimator, such as the asymptotic consistency and asymptotic normality, are studied. The performance of the proposed robust methods and their robustness are demonstrated with an extensive simulation study and two real data examples.



Author information

Corresponding author

Correspondence to Guanqun Cao.

Additional information

Cao’s research is supported in part by the Simons Foundation under Grant #354917 and the National Science Foundation under Grant DMS 1736470. We thank the Associate Editor and two anonymous referees for their helpful and constructive comments, which led to significant improvements in this paper.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 249 KB)

Appendix: Variance of pseudo-data


To evaluate the efficiency of the pseudo-data method, we compare the sample variance of the pseudo-data with the true variance of the uncontaminated model defined in Sect. 3.1. To highlight the influence of outliers on the variance calculation, we also compare both with the sample variance of the outlier-contaminated dataset, using the least-squares method as the estimator of the mean function. We generate a functional dataset from the model in Sect. 3.1 with sample size \(n = 200\), and with \(s_l = 5\) and \(s_u = 7\) for the peak outlier case. Each simulation is repeated 500 times. The results are presented in Fig. 4. We present only the peak outlier case, as all other cases give similar results. The variance of the pseudo-data is very close to the true sample variance computed from the clean dataset, whereas the non-robust variance estimate from the contaminated dataset is strongly affected by the outlier curves.

Fig. 4 Comparison of the true variance (solid), the sample variance of the pseudo-data (dotted) and the sample variance of the contaminated data (dashed). Peak outliers with 5% contamination
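The comparison described in this appendix can be illustrated with a minimal sketch. It is not the authors' code and rests on several assumptions: the toy data model below stands in for the model of Sect. 3.1 (which is not reproduced here), the peak outliers are taken to be shifts of magnitude between \(s_l = 5\) and \(s_u = 7\) on a short interval, the mean is estimated pointwise with a Huber M-estimator rather than with B-splines, and the pseudo-data are formed with one common construction, \(\hat{m} + s\,\psi(r/s)/\overline{\psi'}\). The function names `huber_location` and `pseudo_data` are hypothetical.

```python
import numpy as np

def huber_location(y, k=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted means.
    The scale is fixed at the normalized MAD (an assumption of this sketch)."""
    mu = np.median(y)
    scale = 1.4826 * np.median(np.abs(y - mu))
    for _ in range(max_iter):
        u = np.abs(y - mu) / scale
        w = np.minimum(1.0, k / np.maximum(u, 1e-12))  # Huber weights
        new_mu = np.sum(w * y) / np.sum(w)
        converged = abs(new_mu - mu) < tol
        mu = new_mu
        if converged:
            break
    return mu, scale

def pseudo_data(y, k=1.345):
    """Pseudo-observations: fit + clipped residual / average of psi'."""
    mu, scale = huber_location(y, k)
    u = (y - mu) / scale
    psi = np.clip(u, -k, k)               # Huber psi function
    psi_prime = np.mean(np.abs(u) <= k)   # sample average of psi'
    return mu + scale * psi / psi_prime

rng = np.random.default_rng(0)
n, n_grid = 200, 50
t = np.linspace(0.0, 1.0, n_grid)

# Toy model (assumed): mean + one random component + measurement error.
mean_fn = np.sin(2 * np.pi * t)
scores = rng.normal(size=(n, 1))
clean = mean_fn + scores * np.sin(np.pi * t) + 0.5 * rng.normal(size=(n, n_grid))

# Contaminate 5% of the curves with a peak on [0.4, 0.6] (assumed form).
contaminated = clean.copy()
window = (t >= 0.4) & (t <= 0.6)
idx = rng.choice(n, size=int(0.05 * n), replace=False)
peaks = rng.choice([-1.0, 1.0], size=idx.size) * rng.uniform(5.0, 7.0, size=idx.size)
contaminated[np.ix_(idx, np.flatnonzero(window))] += peaks[:, None]

# Pointwise pseudo-data and the three variance curves.
pseudo = np.column_stack([pseudo_data(contaminated[:, j]) for j in range(n_grid)])
var_clean = clean.var(axis=0, ddof=1)
var_pseudo = pseudo.var(axis=0, ddof=1)
var_contam = contaminated.var(axis=0, ddof=1)

print("mean variance on the peak interval:")
print("  clean       ", var_clean[window].mean())
print("  pseudo-data ", var_pseudo[window].mean())
print("  contaminated", var_contam[window].mean())
```

Dividing the clipped residual by the average of \(\psi'\) compensates for the variance lost to clipping. On the contaminated interval the pseudo-data variance should therefore stay near the clean-data variance, while the raw sample variance is inflated by the outlying curves.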

About this article


Cite this article

Lima, I.R., Cao, G. & Billor, N. M-based simultaneous inference for the mean function of functional data. Ann Inst Stat Math 71, 577–598 (2019). https://doi.org/10.1007/s10463-018-0656-y
