Structural Representation of Categorical Data and Cluster Analysis Through Filters

Nishisato, Shizuhiko

doi:10.1007/978-3-319-01264-3_7

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

Representing categorical data by nominal measurement preserves all the information in the data, which is not the case with widely used numerical or pseudo-numerical representations such as Likert-type scoring. We first explain this point and then turn to the analysis of nominally represented data. When the number of variables is large, one typically resorts to dimension reduction, and the need for it is often greater with categorical data than with continuous data. In spite of this, Nishisato S, Clavel JG (Behaviormetrika 57:15–32, 2010) proposed an approach diametrically opposed to dimension reduction: they advocate a doubled hyper-space that accommodates both the row variables and the column variables of two-way data in a common space. The rationale for doubled space can also be used to support the validity of the Carroll-Green-Schaffer scaling (Carroll JD, Green PE, Schaffer CM (1986) J Mark Res 23(3):271–280). The paper then introduces a simple procedure, called cluster analysis through filters, for analyzing a hyper-dimensional configuration of data. A numerical example shows a clear contrast between the dimension-reduction approach and total information analysis by cluster analysis. We regard our approach as preferable to dimension reduction on two grounds: its results are a factual summary of the multidimensional data configuration, and the procedure is simple and practical.
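
The contrast between Likert-type scoring and nominal representation can be illustrated with a minimal sketch. It is not taken from the chapter: the item, its three categories, the five responses, the alternative weights, and all function names are made up for illustration. It only shows that integer scoring is one fixed weighting of the category indicator columns, whereas the indicator (nominal) representation leaves the weights open.

```python
# Hypothetical data: five subjects answering one three-category item.
CATEGORIES = ["disagree", "neutral", "agree"]
responses = ["agree", "disagree", "neutral", "agree", "disagree"]

def likert_scores(resp, categories=CATEGORIES):
    """Likert-type scoring: the k-th category (0-based) gets the integer k + 1."""
    return [categories.index(r) + 1 for r in resp]

def indicator_matrix(resp, categories=CATEGORIES):
    """Nominal representation: one 0/1 column per category."""
    return [[1 if r == c else 0 for c in categories] for r in resp]

def apply_weights(Z, weights):
    """Score each subject as the weighted sum of its indicator row."""
    return [sum(z * w for z, w in zip(row, weights)) for row in Z]

Z = indicator_matrix(responses)

# Likert scoring is the special case with category weights fixed at 1, 2, 3.
assert apply_weights(Z, [1, 2, 3]) == likert_scores(responses)

# Any other weights (illustrative values only) can be applied to Z,
# including non-monotone ones that integer scoring rules out.
alt = apply_weights(Z, [-1.0, 1.5, -0.5])
```

The indicator matrix `Z` can reproduce the Likert scores, but the Likert scores alone commit to equal spacing and a fixed order of the categories.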

References

Carroll JD, Green PE, Schaffer CM (1986) Interpoint distance comparisons in correspondence analysis. J Mark Res 23(3):271–280
Greenacre MJ (1989) The Carroll-Green-Schaffer scaling in correspondence analysis: a theoretical and empirical appraisal. J Mark Res 26(3):358–365
Heuer G (1979) Selbstmord bei Kindern und Jugendlichen: ein Beitrag zur Suizidprophylaxe aus pädagogischer Sicht. Klett-Cotta, Stuttgart
Likert R (1932) A technique for the measurement of attitudes. Arch Psychol 22(140):44–53
Nishisato S (1980) Analysis of categorical data: dual scaling and its applications. University of Toronto Press, Toronto
Nishisato S (1984) Forced classification: a simple application of a quantification method. Psychometrika 49:25–36
Nishisato S (1994) Elements of dual scaling: an introduction to practical data analysis. Lawrence Erlbaum Associates, Hillsdale
Nishisato S (1999) Data types and information: beyond the current practice of data analysis. In: Decker R, Gaul W (eds) Classification and information processing at the turn of the Millennium. Springer, Berlin/Heidelberg, pp 40–51
Nishisato S (2006) Correlational structure of multiple-choice data as viewed from dual scaling. In: Greenacre MJ, Blasius J (eds) Multiple correspondence analysis and related methods. Chapman and Hall/CRC, Boca Raton, chap 6, pp 161–178
Nishisato S (2007) Multidimensional nonlinear descriptive analysis. Chapman and Hall/CRC, Boca Raton
Nishisato S (2012a) Optimal quantities for analysis through regression of measurement on data. Bull Data Anal Jpn Classif Soc 1:1–10
Nishisato S (2012b) Reminiscence and a step forward. In: Gaul W, Geyer-Schulz A, Schmidt-Thieme L, Kunze J (eds) Classification, data analysis, and knowledge organization. Springer, Heidelberg, pp 109–119
Nishisato S, Baba Y (1999) On contingency, projection and forced classification of dual scaling. Behaviormetrika 26:207–219
Nishisato S, Clavel JG (2003) A note on between-set distances in dual scaling and correspondence analysis. Behaviormetrika 30(1):87–98
Nishisato S, Clavel JG (2010) Total information analysis: comprehensive dual scaling. Behaviormetrika 57:15–32
Van der Heijden PGM, De Leeuw J (1985) Correspondence analysis used complementary to loglinear analysis. Psychometrika 50(4):429–447

Acknowledgements

Thanks are due to José Garcia Clavel for the calculation of between-set distances of Heuer’s data.

Author information

Authors and Affiliations

University of Toronto, Toronto, Canada
Shizuhiko Nishisato

Corresponding author

Correspondence to Shizuhiko Nishisato.

Editor information

Editors and Affiliations

Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Wolfgang Gaul
Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Andreas Geyer-Schulz
The Institute of Statistical Mathematics, Tokyo, Japan
Yasumasa Baba
Graduate School of Management and Information Systems, Tama University, Tokyo, Japan
Akinori Okada

Cite this paper

Nishisato, S. (2014). Structural Representation of Categorical Data and Cluster Analysis Through Filters. In: Gaul, W., Geyer-Schulz, A., Baba, Y., Okada, A. (eds) German-Japanese Interchange of Data Analysis Results. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-01264-3_7

DOI: https://doi.org/10.1007/978-3-319-01264-3_7
Published: 10 October 2013
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01263-6
Online ISBN: 978-3-319-01264-3
eBook Packages: Mathematics and Statistics (R0)