Canonical Discriminant Analysis of Multinomial Samples with Applications to Textual Data
We develop a canonical discriminant analysis for data in G groups, where the groups contain n1, …, nG multinomial samples respectively, in a scaled Euclidean space equipped with the chi-square distance. The analysis produces quantification plots that display the q observation categories and the N (= n1 + … + nG) multinomial sample units together with the G group centroids. We apply the proposed method to Korean text analysis to extract the statistical characteristics of the Korean language by genre.
Keywords: Canonical Discriminant Analysis, Multinomial Samples
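The abstract does not give formulas, so the following is only a minimal sketch of the two ingredients it names: a rescaling of multinomial profiles under which ordinary Euclidean distance equals the chi-square distance, and the between-group and within-group scatter that a canonical discriminant analysis works with. The function names, the toy counts, the use of each sample's total count as its weight, and the assumption that every category occurs at least once are illustrative choices, not taken from the paper. The extraction of discriminant axes is omitted.

```python
import numpy as np

def chi_square_scale(counts):
    """Rescale row profiles so Euclidean distance equals chi-square distance.

    Assumes every column (category) has a positive total.
    """
    counts = np.asarray(counts, dtype=float)
    row_totals = counts.sum(axis=1)               # size of each multinomial sample
    profiles = counts / row_totals[:, None]       # relative frequencies per sample
    col_mass = counts.sum(axis=0) / counts.sum()  # overall category proportions
    return profiles / np.sqrt(col_mass), row_totals

def scatter_decomposition(z, weights, groups):
    """Weighted group centroids plus between- and within-group scatter matrices.

    Weighting each sample by its total count is an assumption of this sketch.
    """
    z = np.asarray(z, dtype=float)
    groups = np.asarray(groups)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    grand = w @ z                                 # weighted grand centroid
    q = z.shape[1]
    between = np.zeros((q, q))
    within = np.zeros((q, q))
    centroids = {}
    for g in np.unique(groups):
        mask = groups == g
        wg = w[mask].sum()
        cg = (w[mask] @ z[mask]) / wg             # weighted centroid of group g
        centroids[int(g)] = cg
        d = cg - grand
        between += wg * np.outer(d, d)
        r = z[mask] - cg                          # deviations from the group centroid
        within += (r * w[mask][:, None]).T @ r
    return centroids, between, within

# Hypothetical toy data: 4 samples, q = 3 categories, G = 2 groups.
counts = np.array([[10, 5, 5],
                   [8, 6, 6],
                   [2, 9, 9],
                   [3, 10, 7]])
groups = np.array([0, 0, 1, 1])

z, n = chi_square_scale(counts)
centroids, between, within = scatter_decomposition(z, n, groups)
```

Plotting the rows of `z` and the entries of `centroids` on axes that maximize between-group scatter relative to within-group scatter would give a display in the spirit of the quantification plots described above, though the paper's exact construction may differ.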