Canonical Discriminant Analysis of Multinomial Samples with Applications to Textual Data

  • Myung-Hoe Huh
  • Kyung-Sook Yang
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)


We develop canonical discriminant analysis for G-group data consisting of $n_1, \ldots, n_G$ multinomial samples within each group, on a scaled Euclidean space with chi-square distance. Our discriminant analysis produces quantification plots showing both the q observation categories and the N (= $n_1 + \cdots + n_G$) multinomial sample units, as well as the G group centroids. We apply the proposed method to Korean text analysis to extract statistical characteristics of the Korean language by genre.
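The general idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given here; it is a standard Fisher-type canonical discriminant analysis applied to row profiles that have been rescaled so that ordinary Euclidean distance equals chi-square distance. The function names, the equal weighting of sample units, the pseudo-inverse treatment of the singular within-group scatter, and the toy counts are all assumptions made for illustration. The sketch returns scores for sample units and group centroids only; it does not attempt the category coordinates of the quantification plot.

```python
import numpy as np

def chi_square_scaled_profiles(counts):
    """Row profiles divided by sqrt(column mass), so that Euclidean
    distance between rows equals their chi-square distance.
    Assumes every category (column) has a positive total count."""
    counts = np.asarray(counts, dtype=float)
    profiles = counts / counts.sum(axis=1, keepdims=True)  # N x q
    col_mass = counts.sum(axis=0) / counts.sum()           # length q
    return profiles / np.sqrt(col_mass)

def canonical_discriminant(counts, groups, n_axes=2):
    """Hypothetical helper: Fisher-type canonical discriminant analysis
    on chi-square scaled profiles (an assumption, not the paper's method)."""
    X = chi_square_scaled_profiles(counts)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    grand_mean = X.mean(axis=0)  # assumption: sample units weighted equally
    q = X.shape[1]
    W = np.zeros((q, q))  # within-group scatter
    B = np.zeros((q, q))  # between-group scatter
    for g in labels:
        Xg = X[groups == g]
        mg = Xg.mean(axis=0)
        W += (Xg - mg).T @ (Xg - mg)
        d = (mg - grand_mean)[:, None]
        B += len(Xg) * (d @ d.T)
    # Profiles sum to one, so W is singular; whiten only along the
    # directions in which W has non-negligible variance.
    w_vals, w_vecs = np.linalg.eigh(W)
    keep = w_vals > 1e-10 * w_vals.max()
    W_inv_sqrt = w_vecs[:, keep] / np.sqrt(w_vals[keep])
    # Symmetric eigenproblem for the whitened between-group scatter.
    vals, vecs = np.linalg.eigh(W_inv_sqrt.T @ B @ W_inv_sqrt)
    order = np.argsort(vals)[::-1][:n_axes]
    axes = W_inv_sqrt @ vecs[:, order]
    scores = (X - grand_mean) @ axes
    centroids = np.array([scores[groups == g].mean(axis=0) for g in labels])
    return scores, centroids, vals[order]

# Toy data (invented): 3 groups, 4 sample units each, q = 4 categories.
counts = np.array([
    [30, 10, 5, 5], [28, 12, 6, 4], [32, 8, 4, 6], [29, 11, 7, 3],
    [10, 30, 5, 5], [12, 28, 4, 6], [8, 32, 6, 4], [11, 29, 3, 7],
    [5, 5, 30, 10], [6, 4, 28, 12], [4, 6, 32, 8], [7, 3, 29, 11],
])
groups = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
scores, centroids, eigenvalues = canonical_discriminant(counts, groups)
```

Plotting the rows of `scores` together with `centroids` gives a rough analogue of the sample-unit and centroid parts of the plots described above. A different weighting of sample units (for example, by sample size) would change the scatter matrices and is an equally plausible choice.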


Canonical Discriminant Analysis, Multinomial Samples





Copyright information

© Springer-Verlag Berlin · Heidelberg 1998

Authors and Affiliations

  • Myung-Hoe Huh (1)
  • Kyung-Sook Yang (1)
  1. Dept. of Statistics, Korea University, Seoul 136-701, Korea
