Abstract
In this paper, we study how to construct prediction models, estimate change point locations, and build confidence intervals for generalized linear models whose coefficients differ from segment to segment. The hierarchical splitting algorithm is widely used as a standard approach to multiple change point analysis. However, it carries a high risk that the standard errors of the change point estimators become large, which in turn lowers the prediction accuracy of the estimated model. To address this problem, we consider a bootstrap method based on the hierarchical splitting algorithm. Through simulation studies, we compare the algorithms in terms of the prediction accuracy of the estimated model, the bias and variance of the change point estimators, and the accuracy of the confidence intervals for the change points. The results confirm the utility of the bootstrap-based methods for change point analysis, in particular the increased prediction accuracy of the obtained model, the decreased standard error of the change point estimators, and the construction of better confidence intervals, depending on the situation. We also present a simple example to demonstrate the utility of the method.
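To make the idea of bootstrapping a split-based change point estimator concrete, here is a minimal sketch. It is not the authors' algorithm. It assumes the simplest possible setting: a Gaussian response with identity link, a single change point in the mean, one binary split chosen by least squares, and a nonparametric bootstrap that resamples (x, y) pairs. The function names `best_split` and `bootstrap_change_point`, the minimum segment size, and the simulated data are all hypothetical choices made for illustration.

```python
import random


def best_split(xs, ys, min_seg=5):
    """One binary split: threshold on x minimising the two-segment residual sum of squares."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    y = [p[1] for p in pairs]
    total = sum(y)
    total_sq = sum(v * v for v in y)
    best_rss, best_thr = float("inf"), None
    left_sum = left_sq = 0.0
    for i in range(1, n):
        # Running sums over the first i sorted observations (the left segment).
        left_sum += y[i - 1]
        left_sq += y[i - 1] ** 2
        if i < min_seg or n - i < min_seg:
            continue  # keep at least min_seg observations on each side
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between tied x values
        right_sum = total - left_sum
        right_sq = total_sq - left_sq
        rss = (left_sq - left_sum ** 2 / i) + (right_sq - right_sum ** 2 / (n - i))
        if rss < best_rss:
            best_rss = rss
            best_thr = 0.5 * (pairs[i - 1][0] + pairs[i][0])
    return best_thr


def bootstrap_change_point(xs, ys, n_boot=200, alpha=0.05, seed=0):
    """Resample (x, y) pairs, re-estimate the split, and summarise the estimates."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        thr = best_split([xs[i] for i in idx], [ys[i] for i in idx])
        if thr is not None:
            estimates.append(thr)
    if not estimates:
        raise ValueError("no valid split found in any bootstrap sample")
    estimates.sort()
    m = len(estimates) - 1
    lower = estimates[int((alpha / 2) * m)]      # percentile interval, lower end
    upper = estimates[int((1 - alpha / 2) * m)]  # percentile interval, upper end
    averaged = sum(estimates) / len(estimates)   # bootstrap-averaged estimate
    return averaged, (lower, upper)


# Simulated data (assumption): the mean jumps from 0 to 2 at x = 0.5, noise sd 0.5.
data_rng = random.Random(1)
n_obs = 200
xs = [(i + 0.5) / n_obs for i in range(n_obs)]
ys = [(2.0 if x >= 0.5 else 0.0) + data_rng.gauss(0.0, 0.5) for x in xs]

single = best_split(xs, ys)
averaged, (lower, upper) = bootstrap_change_point(xs, ys)
print("single-split estimate:", single)
print("bootstrap average:", averaged, "percentile interval:", (lower, upper))
```

Extending the sketch to multiple change points would mean applying `best_split` recursively to each resulting segment with a stopping rule, and replacing the sum-of-squares criterion with a deviance suited to the chosen generalized linear model.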
Cite this article
Shimokawa, A., Miyaoka, E. Application of the bootstrap method for change points analysis in generalized linear models. Jpn J Stat Data Sci 1, 413–433 (2018). https://doi.org/10.1007/s42081-018-0023-5
DOI: https://doi.org/10.1007/s42081-018-0023-5