Abstract
In this era of big data and computational innovation, advanced analysis and modelling strategies are essential in data science for understanding the individual activities that occur within very complex behavioral, socio-economic and ecological systems. However, the scales at which models can be developed, and the problems they can subsequently inform, are often limited by the difficulty of effectively understanding data that mimic interactions at the finest spatial, temporal, or organizational resolutions. Linear regression analysis is one of the most widely used methods for investigating relationships between variables. Multicollinearity is one of the major problems in regression analysis, and it can be reduced by using appropriate regularized regression methods. This study aims to measure the robustness of regularized regression models, such as ridge and Lasso type models, designed for high-dimensional data with multicollinearity problems. Empirical results show that the Lasso and ridge models have smaller residual sum of squares values. The findings also demonstrate improved accuracy of the estimated parameters for the best model.
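As a rough illustration of how regularization can temper multicollinearity, the following minimal Python sketch fits ordinary least squares, ridge, and Lasso models to a small synthetic data set with two nearly identical predictors. The synthetic data, the penalty values, and all function names are assumptions made for illustration only; they are not the data or code used in this study. Ridge is computed from its closed form, and the Lasso from a plain cyclic coordinate descent loop.

```python
import numpy as np

def make_collinear_data(n=200, seed=0):
    """Synthetic data (an assumption): x2 is almost a copy of x1."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = x1 + 0.01 * rng.normal(size=n)   # near-duplicate predictor
    x3 = rng.normal(size=n)               # irrelevant predictor
    X = np.column_stack([x1, x2, x3])
    y = x1 + x2 + rng.normal(scale=0.5, size=n)
    return X, y

def standardize(X, y):
    """Center and scale the columns of X, and center y."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    return Xs, y - y.mean()

def ridge_fit(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam*I) beta = X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def soft_threshold(z, t):
    """Soft-thresholding operator used in the Lasso coordinate update."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_fit(X, y, lam, n_iter=500):
    """Cyclic coordinate descent for (1/(2n))*||y - X beta||^2 + lam*||beta||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual that leaves out the contribution of predictor j
            r_j = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r_j / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta

def rss(X, y, beta):
    """Residual sum of squares of a fitted coefficient vector."""
    resid = y - X @ beta
    return float(resid @ resid)

X, y = make_collinear_data()
Xs, yc = standardize(X, y)

beta_ols = np.linalg.lstsq(Xs, yc, rcond=None)[0]
beta_ridge = ridge_fit(Xs, yc, lam=10.0)   # penalty value is an arbitrary choice
beta_lasso = lasso_fit(Xs, yc, lam=0.05)   # penalty value is an arbitrary choice

# The ridge penalty improves the conditioning of the normal equations
cond_ols = np.linalg.cond(Xs.T @ Xs)
cond_ridge = np.linalg.cond(Xs.T @ Xs + 10.0 * np.eye(Xs.shape[1]))

print("OLS  :", beta_ols, "RSS =", rss(Xs, yc, beta_ols))
print("Ridge:", beta_ridge, "RSS =", rss(Xs, yc, beta_ridge))
print("Lasso:", beta_lasso, "RSS =", rss(Xs, yc, beta_lasso))
print("Condition numbers (OLS, ridge):", cond_ols, cond_ridge)
```

On training data, ordinary least squares always attains the smallest residual sum of squares, so a comparison between regularized models is usually more informative on held-out data, with the penalty chosen by cross-validation.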
Acknowledgements
The authors would like to thank the Data Science Research Unit (DSRU) in the School of Computing and Mathematics at Charles Sturt University for its support in working on this paper.
Appendix: Programming Codes
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Thevaraja, M., Rahman, A. (2020). Assessing Robustness of Regularized Regression Models with Applications. In: Xu, J., Ahmed, S., Cooke, F., Duca, G. (eds) Proceedings of the Thirteenth International Conference on Management Science and Engineering Management. ICMSEM 2019. Advances in Intelligent Systems and Computing, vol 1001. Springer, Cham. https://doi.org/10.1007/978-3-030-21248-3_30
DOI: https://doi.org/10.1007/978-3-030-21248-3_30
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21247-6
Online ISBN: 978-3-030-21248-3
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)