Weak Signals in High-Dimensional Logistic Regression Models

  • Orawan Reangsephet
  • Supranee Lisawadi
  • Syed Ejaz Ahmed
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1001)

Abstract

In this work, we addressed parameter estimation and prediction in the high-dimensional sparse logistic regression model through both Monte Carlo simulations and an application to real data. We applied two well-known penalized maximum likelihood (ML) methods, LASSO and adaptive LASSO (aLASSO), for variable screening. LASSO may overfit and aLASSO may underfit, which makes ML estimators based on the selected models inefficient. Hence, after performing variable selection, we proposed post-selection improved estimators based on linear shrinkage, pretest, and James-Stein shrinkage strategies, which efficiently combine the overfitted and underfitted ML estimators. Regardless of whether the variable selection stage was correct, the proposed estimators were shown to be more efficient than the classical ML estimators, which were severely affected by inappropriate variable selection.
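
The abstract does not state the estimator formulas, so the following is only a minimal sketch of how an overfitted and an underfitted estimate might be combined. It assumes the standard textbook forms of the linear shrinkage, pretest, and James-Stein rules. The function names, the coefficient vectors, the test statistic `t_stat`, and the critical value are all hypothetical illustrations, not values or code from the paper.

```python
import numpy as np

def linear_shrinkage(beta_over, beta_under, pi=0.5):
    # Convex combination: pi = 0 keeps the overfitted fit, pi = 1 the underfitted one.
    return beta_over - pi * (beta_over - beta_under)

def pretest(beta_over, beta_under, t_stat, critical_value):
    # Keep the underfitted fit unless the test rejects "extra coefficients are zero".
    return beta_under if t_stat <= critical_value else beta_over

def james_stein(beta_over, beta_under, t_stat, k):
    # Shrink the overfitted fit toward the underfitted one by (k - 2) / t_stat,
    # where k is the number of extra coefficients in the overfitted model.
    return beta_under + (1.0 - (k - 2) / t_stat) * (beta_over - beta_under)

# Hypothetical ML estimates on the two selected supports (zeros = not selected).
beta_over = np.array([1.2, -0.8, 0.3, 0.1, -0.2, 0.0])   # e.g. after LASSO screening
beta_under = np.array([1.0, -0.9, 0.0, 0.0, 0.0, 0.0])   # e.g. after aLASSO screening

k = 3            # coefficients present in the overfitted model only
t_stat = 4.0     # assumed test statistic for "those k coefficients are zero"
crit = 7.815     # chi-square 0.95 quantile with 3 degrees of freedom

ls = linear_shrinkage(beta_over, beta_under, pi=0.5)
pt = pretest(beta_over, beta_under, t_stat, crit)
js = james_stein(beta_over, beta_under, t_stat, k)
print(ls, pt, js, sep="\n")
```

In this sketch the pretest rule makes an all-or-nothing choice between the two fits, whereas the James-Stein rule moves smoothly between them as the evidence for the extra coefficients grows.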

Keywords

High-dimensional sparse logistic; Monte Carlo simulation; Penalized maximum likelihood; Linear shrinkage; Pretest; James-Stein shrinkage

Notes

Acknowledgments

The research of Professor S. Ejaz Ahmed was partially supported by the Natural Sciences and Engineering Research Council of Canada.


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Orawan Reangsephet (1)
  • Supranee Lisawadi (1)
  • Syed Ejaz Ahmed (2)
  1. Department of Mathematics and Statistics, Thammasat University, Bangkok, Thailand
  2. Faculty of Mathematics and Science, Brock University, Ontario, Canada
