Classification, Clustering, and Visualisation Based on Dual Scaling

Mucha, Hans-Joachim

doi:10.1007/978-3-319-01264-3_5

Conference paper
First Online: 10 October 2013

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

In practice, the statistician often has to work with data that are already available, and these data are often of mixed type. The task is then to draw the best possible statistical conclusions using the most sophisticated methods. But are the variables scaled optimally? And what about missing data? Without loss of generality, we restrict ourselves here to binary classification/clustering. A very simple but general approach is outlined that is applicable to such data for both classification and clustering. It is based on data preparation (i.e., a down-grading step such as binning of each quantitative variable) followed by dual scaling (the up-grading step: scoring). As a byproduct, the quantitative scores can be used for multivariate visualisation of both data and classes/clusters. For illustrative purposes, a real-data application to optical character recognition (OCR) is considered throughout the paper. Moreover, the proposed approach is compared with other multivariate methods such as the simple Bayesian classifier.
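
The two-step idea in the abstract (down-grading by binning, then up-grading by scoring) can be illustrated with a minimal Python sketch. It is not the paper's implementation and rests on several assumptions that do not come from the source: quantile-based bins, dual scaling computed as the first axis of a correspondence analysis of each bin-by-class table, an additive total score per observation, and a cut-off at zero. The function names (`quantile_edges`, `bin_variable`, `dual_scaling_scores`, `fit_scores`, `total_score`) are hypothetical, and missing values are not handled.

```python
import numpy as np


def quantile_edges(x, n_bins=4):
    """Interior cut points so that bins hold roughly equal counts (assumed binning rule)."""
    return np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])


def bin_variable(x, edges):
    """Down-grading step: replace each value by the index of its bin (0 .. len(edges))."""
    return np.searchsorted(edges, x, side="right")


def dual_scaling_scores(cats, labels, n_cats):
    """Up-grading step: one score per category from the bin-by-class table.

    Uses the first axis of correspondence analysis (SVD of the standardized
    residuals). Assumes binary labels coded 0/1 with both classes present.
    """
    table = np.zeros((n_cats, 2))
    np.add.at(table, (cats, labels), 1.0)      # contingency table of counts
    P = table / table.sum()                    # correspondence matrix
    r, c = P.sum(axis=1), P.sum(axis=0)        # row and column masses
    keep = r > 0                               # skip empty bins
    expected = np.outer(r[keep], c)
    S = (P[keep] - expected) / np.sqrt(expected)
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    scores = np.zeros(n_cats)                  # empty bins keep a neutral score of 0
    scores[keep] = U[:, 0] * s[0] / np.sqrt(r[keep])
    if Vt[0, 1] < 0:                           # orient the axis: high score means class 1
        scores = -scores
    return scores


def fit_scores(X, y, n_bins=4):
    """Learn bin edges and category scores for every column of X."""
    model = []
    for j in range(X.shape[1]):
        edges = quantile_edges(X[:, j], n_bins)
        cats = bin_variable(X[:, j], edges)
        model.append((edges, dual_scaling_scores(cats, y, n_bins)))
    return model


def total_score(X, model):
    """Sum the category scores over all variables (assumed aggregation rule)."""
    total = np.zeros(X.shape[0])
    for j, (edges, scores) in enumerate(model):
        total += scores[bin_variable(X[:, j], edges)]
    return total


if __name__ == "__main__":
    # Synthetic two-class data, for illustration only.
    rng = np.random.default_rng(0)
    y = np.repeat([0, 1], 200)
    X = rng.normal(size=(400, 2)) + 2.0 * y[:, None]
    model = fit_scores(X, y)
    predicted = (total_score(X, model) > 0).astype(int)
    print("training accuracy:", (predicted == y).mean())
```

In the binary case each category score reduces to the difference between the class-1 proportion in the bin and the overall class-1 proportion, divided by the square root of the product of the two overall class proportions. The per-variable scores can also be plotted against each other, which is one simple way to obtain the kind of multivariate visualisation the abstract mentions.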

Author information

Authors and Affiliations

Weierstrass Institute for Applied Analysis and Stochastics (WIAS), D-10117 Berlin, Germany
Hans-Joachim Mucha

Editor information

Editors and Affiliations

Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Wolfgang Gaul
Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
Andreas Geyer-Schulz
The Institute of Statistical Mathematics, Tokyo, Japan
Yasumasa Baba
Graduate School of Management and Information Systems, Tama University, Tokyo, Japan
Akinori Okada

About this paper

Cite this paper

Mucha, HJ. (2014). Classification, Clustering, and Visualisation Based on Dual Scaling. In: Gaul, W., Geyer-Schulz, A., Baba, Y., Okada, A. (eds) German-Japanese Interchange of Data Analysis Results. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-01264-3_5

DOI: https://doi.org/10.1007/978-3-319-01264-3_5
Published: 10 October 2013
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01263-6
Online ISBN: 978-3-319-01264-3
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
