Abstract
This study identifies an efficient method for finding the main interactions between categorical variables in multi-dimensional contingency tables. LASSO and stepwise algorithms are the model selection methods considered, with cross-validation used to find the penalty coefficient. The methods use the Akaike Information Criterion (AIC) and the p-value as indicators for selecting the parameters to be included in the model. The aims of the study are to review the literature on multi-dimensional contingency tables with log-linear models and on high-dimensional tables; to analyse, using these models, the obesity dataset from the Locksmith (GSK) GP Research Database from around the year 2000, which comprises p = 10 binary comorbidities in n = 5000 patients; and, lastly, to compare the results obtained from the models. The stepwise algorithm is an appropriate method for finding the parsimonious interaction structure between the categorical variables. The LASSO defines a continuous shrinking operation that can produce coefficients that are exactly zero.
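The continuous shrinking operation mentioned in the abstract can be illustrated with the soft-thresholding operator, which is the form the LASSO estimate takes in the special case of an orthonormal design. The following is a minimal sketch under that assumption, not the authors' implementation: the function name `soft_threshold`, the input values, and the penalty `lam` are all hypothetical and chosen only for illustration.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; return exactly 0.0 when |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalised estimates of log-linear interaction terms.
unpenalised = [2.5, -0.25, 0.5, -1.5]

# Hypothetical penalty coefficient; in practice it would be chosen
# by cross-validation, as the abstract describes.
lam = 0.5

shrunk = [soft_threshold(z, lam) for z in unpenalised]
print(shrunk)
```

Estimates whose magnitude does not exceed the penalty are set to exactly zero, which drops the corresponding terms from the model, while larger estimates are pulled toward zero by the same amount. A stepwise procedure, by contrast, adds or removes whole terms in discrete steps.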
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Md Shahri, N.H.N., Conde, S. (2019). Modelling Multi-dimensional Contingency Tables: LASSO and Stepwise Algorithms. In: Kor, LK., Ahmad, AR., Idrus, Z., Mansor, K. (eds) Proceedings of the Third International Conference on Computing, Mathematics and Statistics (iCMS2017). Springer, Singapore. https://doi.org/10.1007/978-981-13-7279-7_70
DOI: https://doi.org/10.1007/978-981-13-7279-7_70
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-7278-0
Online ISBN: 978-981-13-7279-7
eBook Packages: Computer Science, Computer Science (R0)