Structural Representation of Categorical Data and Cluster Analysis Through Filters

  • Conference paper
German-Japanese Interchange of Data Analysis Results

Abstract

Representing categorical data by nominal measurement keeps all of the information in the data intact, which is not the case with widely used numerical or pseudo-numerical representations such as Likert-type scoring. This point is explained first, and we then turn to the analysis of nominally represented data. To analyze a large number of variables, one typically resorts to dimension reduction, and the need for it is often greater with categorical data than with continuous data. Nevertheless, Nishisato and Clavel (Behaviormetrika 57:15–32, 2010) proposed an approach diametrically opposed to dimension reduction: they advocate a doubled hyper-space that accommodates both the row variables and the column variables of two-way data in a common space. The rationale of doubled space can be used to vindicate the Carroll-Green-Schaffer scaling (Carroll, Green and Schaffer, J Mark Res 23(3):271–280, 1986). The paper then introduces a simple procedure for analyzing a hyper-dimensional configuration of data, called cluster analysis through filters. A numerical example shows a clear contrast between the dimension-reduction approach and total information analysis by cluster analysis. We argue that our approach is preferable to dimension reduction on two grounds: its results are a factual summary of a multidimensional data configuration, and the procedure is simple and practical.
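The contrast between nominal representation and Likert-type scoring can be illustrated with a small sketch. It is not taken from the paper: the toy responses `x` and `y` and the helper names `indicator`, `pearson` and `contingency` are assumptions made for illustration only. Here `y` is completely determined by `x`, yet the relation is non-monotone, so the correlation between the integer scores is zero, while the 0/1 indicator (nominal) coding still reproduces the full cross-tabulation.

```python
def indicator(responses, categories):
    """Nominal coding: one 0/1 column per response category."""
    return [[1 if r == c else 0 for c in categories] for r in responses]


def pearson(x, y):
    """Pearson correlation of two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def contingency(fx, fy):
    """Cross-tabulation obtained as the product Fx' Fy of two indicator matrices."""
    return [
        [sum(rx[i] * ry[j] for rx, ry in zip(fx, fy)) for j in range(len(fy[0]))]
        for i in range(len(fx[0]))
    ]


# Hypothetical three-category items answered by six respondents.
# y is a function of x (1 -> 3, 2 -> 1, 3 -> 3), but not a monotone one.
x = [1, 2, 3, 1, 2, 3]
y = [3, 1, 3, 3, 1, 3]
cats = [1, 2, 3]

r = pearson(x, y)  # treats the category labels as Likert-type integer scores
table = contingency(indicator(x, cats), indicator(y, cats))

print("correlation of integer scores:", r)
for row in table:
    print(row)
```

In this toy case each row of `table` has a single nonzero cell, so the dependence of `y` on `x` is fully visible in the nominal coding even though the integer scores are uncorrelated.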

References

  • Carroll JD, Green PE, Schaffer CM (1986) Interpoint distance comparisons in correspondence analysis. J Mark Res 23(3):271–280

  • Greenacre MJ (1989) The Carroll-Green-Schaffer scaling in correspondence analysis: a theoretical and empirical appraisal. J Mark Res 26(3):358–365

  • Heuer G (1979) Selbstmord bei Kindern und Jugendlichen: ein Beitrag zur Suizidprophylaxe aus pädagogischer Sicht. Klett-Cotta, Stuttgart

  • Likert R (1932) A technique for the measurement of attitudes. Arch Psychol 22(140):44–53

  • Nishisato S (1980) Analysis of categorical data: dual scaling and its applications. University of Toronto Press, Toronto

  • Nishisato S (1984) Forced classification: a simple application of a quantification method. Psychometrika 49:25–36

  • Nishisato S (1994) Elements of dual scaling: an introduction to practical data analysis. Lawrence Erlbaum Associates, Hillsdale

  • Nishisato S (1999) Data types and information: beyond the current practice of data analysis. In: Decker R, Gaul W (eds) Classification and information processing at the turn of the Millennium. Springer, Berlin/Heidelberg, pp 40–51

  • Nishisato S (2006) Correlational structure of multiple-choice data as viewed from dual scaling. In: Greenacre MJ, Blasius J (eds) Multiple correspondence analysis and related methods. Chapman and Hall/CRC, Boca Raton, chap 6, pp 161–178

  • Nishisato S (2007) Multidimensional nonlinear descriptive analysis. Chapman and Hall/CRC, Boca Raton

  • Nishisato S (2012a) Optimal quantities for analysis through regression of measurement on data. Bull Data Anal Jpn Classif Soc 1:1–10

  • Nishisato S (2012b) Reminiscence and a step forward. In: Gaul W, Geyer-Schulz A, Schmidt-Thieme L, Kunze J (eds) Classification, data analysis, and knowledge organization. Springer, Heidelberg, pp 109–119

  • Nishisato S, Baba Y (1999) On contingency, projection and forced classification of dual scaling. Behaviormetrika 26:207–219

  • Nishisato S, Clavel JG (2003) A note on between-set distances in dual scaling and correspondence analysis. Behaviormetrika 30(1):87–98

  • Nishisato S, Clavel JG (2010) Total information analysis: comprehensive dual scaling. Behaviormetrika 57:15–32

  • Van der Heijden PGM, De Leeuw J (1985) Correspondence analysis used complementary to loglinear analysis. Psychometrika 50(4):429–447

Acknowledgements

Thanks are due to José Garcia Clavel for the calculation of between-set distances of Heuer’s data.

Author information

Corresponding author

Correspondence to Shizuhiko Nishisato.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Nishisato, S. (2014). Structural Representation of Categorical Data and Cluster Analysis Through Filters. In: Gaul, W., Geyer-Schulz, A., Baba, Y., Okada, A. (eds) German-Japanese Interchange of Data Analysis Results. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-01264-3_7
