Cross Tabulation and Categorical Data Analysis
We often have questions about whether events or variables are associated or correlated with each other. In pathology, for example, a common question is whether a test result is associated with a disease status. In statistics, such questions are addressed through hypothesis testing. If the variables are categorical (i.e., they can assume only a finite number of discrete values), a common approach to hypothesis testing is to employ cross tabulation.
Cross tabulation is the summarization of categorical data into a table in which each cell contains the frequency (either raw or proportional) of the observations that fit the categories represented by that cell. The summary data presented in cross-tabulated form can then be used for many statistical tests, most of which rely on a test statistic that follows the chi-squared distribution.
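As a minimal sketch of the idea, the following Python snippet cross-tabulates a test result against a disease status and computes the Pearson chi-squared statistic for the resulting 2x2 table. The data, the category labels, and the function names `cross_tabulate` and `chi_squared_statistic` are hypothetical and chosen only for illustration. No continuity correction is applied, and the p-value shortcut is valid only for one degree of freedom.

```python
from collections import Counter
from math import erfc, sqrt

# Hypothetical data: (test result, disease status) for 100 patients.
observations = (
    [("positive", "diseased")] * 30
    + [("positive", "healthy")] * 10
    + [("negative", "diseased")] * 20
    + [("negative", "healthy")] * 40
)


def cross_tabulate(pairs):
    """Count how many observations fall into each (row, column) cell."""
    return Counter(pairs)


def chi_squared_statistic(table):
    """Pearson chi-squared statistic, without continuity correction."""
    rows = sorted({r for r, _ in table})
    cols = sorted({c for _, c in table})
    n = sum(table.values())
    row_totals = {r: sum(table[(r, c)] for c in cols) for r in rows}
    col_totals = {c: sum(table[(r, c)] for r in rows) for c in cols}
    stat = 0.0
    for r in rows:
        for c in cols:
            # Expected count if row and column variables were independent.
            expected = row_totals[r] * col_totals[c] / n
            stat += (table[(r, c)] - expected) ** 2 / expected
    return stat


table = cross_tabulate(observations)
chi2 = chi_squared_statistic(table)

# With 1 degree of freedom (a 2x2 table), the upper-tail probability of the
# chi-squared distribution equals erfc(sqrt(chi2 / 2)).
p_value = erfc(sqrt(chi2 / 2))

print(f"chi-squared = {chi2:.3f}, p = {p_value:.2e}")
```

In this made-up table the observed counts differ markedly from the counts expected under independence, so the statistic is large and the p-value is small. For larger tables, the degrees of freedom change and the chi-squared tail probability would need a general routine rather than the one-degree-of-freedom shortcut used here.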
In this chapter, we explain the concept of hypothesis testing and introduce the most common statistical tests used in hypothesis testing of categorical data.
Keywords: Categorical data; Hypothesis testing; Cross tabulation; Chi-squared distribution; Chi-squared tests; Fisher’s exact test; Agreement measures