Quantification of Ordinal Variables: A Critical Inquiry into Polychoric and Canonical Correlation

  • Shizuhiko Nishisato
  • David Hemsworth
Conference paper


“Scaling” or “quantification” was re-examined with respect to its main objectives and requirements. Particular attention was paid to the condition that the variance-covariance matrix of the scaled quantities must be positive definite or positive semi-definite for the variables to be mapped in Euclidean space. Dual scaling was chosen as the framework for identifying problems, understanding their basic aspects, and finding practical remedies. In particular, the pair-wise and the global quantification approaches to multivariate analysis were used to identify some difficult theoretical problems associated with the failure to identify coordinates of variables in Euclidean hyperspace. One problem arising from pair-wise quantification was the lack of a geometric definition of correlation between two sets of categorical variables. This was attributed to the lack of a single data matrix, which often leads to negative eigenvalues of the correlation matrix. Attention then shifted to the calculation of polychoric correlation and canonical correlation for categorized ordinal variables, a practice often seen in structural equation modeling (SEM). Of particular interest were the problems associated with the pair-wise determination of thresholds (polychoric correlation), the univariate determination of thresholds (polychoric correlation), and the pair-wise determination of category weights (canonical correlation). The study identified two possible causes for the failure to map variables in Euclidean space: the pair-wise determination of thresholds or category weights, and the lack of underlying multivariate normality of the distribution. The degree of this failure was noted to be an increasing function of the number of variables in the data set.
It was then highlighted that dual scaling could mitigate the problems arising from these causes, which the current SEM practice of using polychoric correlation and canonical correlation would constantly encounter. Numerical examples were provided to show what is at stake when scaling is not properly carried out. It was stressed that, when the assumption of a latent multivariate normal distribution cannot reasonably be made, dual scaling offers an excellent alternative to canonical correlation and polychoric correlation as used in SEM, because dual scaling transforms the data towards the categorized normal distribution.
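
The positive semi-definiteness condition discussed above can be checked numerically. The following is a minimal sketch, not the authors' procedure: it assumes NumPy is available, the function names `smallest_eigenvalue` and `is_positive_semidefinite` are hypothetical, and the 3 × 3 "pair-wise" matrix is made-up illustrative data rather than a result from the paper. It contrasts a matrix whose entries are mutually inconsistent, as can happen when each correlation is estimated separately, with a correlation matrix computed from a single data matrix.

```python
import numpy as np


def smallest_eigenvalue(r):
    """Return the smallest eigenvalue of a square symmetric matrix."""
    r = np.asarray(r, dtype=float)
    if r.ndim != 2 or r.shape[0] != r.shape[1] or not np.allclose(r, r.T):
        raise ValueError("expected a square symmetric matrix")
    # eigvalsh is meant for symmetric matrices and returns eigenvalues
    # in ascending order, so the first one is the smallest.
    return float(np.linalg.eigvalsh(r)[0])


def is_positive_semidefinite(r, tol=1e-10):
    """True if no eigenvalue falls below -tol (tol absorbs rounding error)."""
    return smallest_eigenvalue(r) >= -tol


# Hypothetical matrix with unit diagonal whose off-diagonal entries were
# chosen independently of one another (illustrative values only).
pairwise = [
    [1.0, 0.9, 0.9],
    [0.9, 1.0, -0.9],
    [0.9, -0.9, 1.0],
]

# Correlations computed from one data matrix (simulated scores here)
# are positive semi-definite by construction.
rng = np.random.default_rng(0)
scores = rng.normal(size=(50, 3))
global_r = np.corrcoef(scores, rowvar=False)

print(is_positive_semidefinite(pairwise))   # inconsistent entries
print(is_positive_semidefinite(global_r))   # single data matrix
```

A matrix that fails this check cannot be the correlation matrix of any set of variables located in Euclidean space, which is the failure the abstract attributes to pair-wise determination of thresholds or category weights.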


Keywords: Canonical Correlation; Ordinal Variable; Negative Eigenvalue; Category Weight; Multivariate Normal Distribution





Copyright information

© The Institute of Statistical Mathematics 2002

Authors and Affiliations

  • Shizuhiko Nishisato (1)
  • David Hemsworth (2)
  1. The Ontario Institute for Studies in Education of the University of Toronto, Toronto, Canada
  2. Wilfrid Laurier University, Waterloo, Canada
