Regression-Based Predictive Analytics

  • Y. Z. Ma


Regression is one of the most commonly used multivariate statistical methods. Multivariate linear regression can integrate many explanatory variables to predict the target variable. However, collinearity caused by intercorrelations among the explanatory variables can lead to many surprising results in multivariate regression. This chapter presents both basic and advanced regression methods, including standard least-squares linear regression, ridge regression and principal component regression. Pitfalls in using these methods for geoscience applications are also discussed.
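As a rough illustration of the three methods named above, the following sketch fits least-squares, ridge and principal component regression to two nearly identical explanatory variables. It is not taken from the chapter: the function names (fit_ols, fit_ridge, fit_pcr, make_collinear_data), the synthetic data, the noise level and the ridge penalty are all assumptions chosen for illustration, and only NumPy is used.

```python
import numpy as np


def _center(X, y):
    """Remove column means so no intercept term is needed in the fits."""
    return X - X.mean(axis=0), y - y.mean()


def fit_ols(X, y):
    """Least-squares slope coefficients on centered data."""
    Xc, yc = _center(X, y)
    coef, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    return coef


def fit_ridge(X, y, lam):
    """Ridge coefficients: solve (X'X + lam * I) b = X'y on centered data."""
    Xc, yc = _center(X, y)
    p = Xc.shape[1]
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)


def fit_pcr(X, y, n_components):
    """Principal component regression on the leading components.

    The response is regressed on the component scores, and the result is
    mapped back to coefficients on the original variables.
    """
    Xc, yc = _center(X, y)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T
    scores = Xc @ V
    gamma, *_ = np.linalg.lstsq(scores, yc, rcond=None)
    return V @ gamma


def make_collinear_data(n=200, noise=0.5, seed=0):
    """Hypothetical synthetic data: x2 is almost a copy of x1.

    The assumed true relation is y = x1 + x2 + noise.
    """
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = x1 + 0.01 * rng.normal(size=n)
    X = np.column_stack([x1, x2])
    y = x1 + x2 + noise * rng.normal(size=n)
    return X, y


if __name__ == "__main__":
    X, y = make_collinear_data()
    print("OLS:  ", fit_ols(X, y))
    print("Ridge:", fit_ridge(X, y, lam=1.0))
    print("PCR:  ", fit_pcr(X, y, n_components=1))
```

With predictors this strongly correlated, the individual least-squares coefficients are poorly determined even though their sum is well constrained. The ridge penalty and the dropped minor component both trade a small bias for coefficients that are far more stable.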



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Y. Z. Ma (Schlumberger, Denver, USA)
