Abstract
We consider binary classification based on the dual scaling technique. When there are more than two classes, several binary classifiers can be considered. The proposed approach goes back to Mucha (An intelligent clustering technique based on dual scaling. In: S. Nishisato, Y. Baba, H. Bozdogan, K. Kanefuji (eds.) Measurement and multivariate analysis, pp. 37–46. Springer, Tokyo, 2002) and builds on the pioneering book of Nishisato (Analysis of categorical data: Dual scaling and its applications. The University of Toronto Press, Toronto, 1980). It is applicable to mixed data, which statisticians often face. First, as a preprocessing step, numerical variables are discretized into bins so that they become ordinal variables. Second, the ordinal variables are converted into categorical ones. The data are then ready for dual scaling of each individual variable with respect to the two given classes: each category is transformed into a score. A classifier is then derived simply by adding the scores over all variables. It is compared with the simple Bayesian classifier (SBC). Examples and applications to archaeometry (provenance studies of Roman ceramics) are presented.
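The pipeline outlined in the abstract (binning, per-variable category scores, additive classifier) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes equal-width bins and, for each category, a score equal to the share of one class within that category minus that class's overall share. A one-dimensional dual scaling of a categories-by-two-classes table reduces to this up to a scale factor, but the normalization used in the paper may differ. The function names, the toy data, and the cutoff of 0 are all hypothetical.

```python
from collections import Counter

def discretize(values, n_bins):
    """Equal-width binning (an assumption): map each number to a bin index."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant columns
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def category_scores(categories, labels, positive):
    """Score of a category: share of `positive` in it minus the overall share."""
    total = Counter(categories)
    pos = Counter(c for c, y in zip(categories, labels) if y == positive)
    overall = sum(pos.values()) / len(labels)
    return {c: pos[c] / total[c] - overall for c in total}

def fit(columns, labels, positive):
    """One score table per variable; `columns` holds categorical columns."""
    return [category_scores(col, labels, positive) for col in columns]

def predict(score_tables, row, positive, negative, cutoff=0.0):
    """Add the scores over all variables; unseen categories contribute 0."""
    total = sum(tbl.get(cat, 0.0) for tbl, cat in zip(score_tables, row))
    return positive if total > cutoff else negative

# Toy mixed data: one numerical and one categorical variable, two classes.
thickness = [1.0, 1.2, 1.1, 3.0, 3.2, 2.9]
colour = ["red", "red", "grey", "grey", "grey", "red"]
labels = ["A", "A", "A", "B", "B", "B"]

bins = discretize(thickness, n_bins=2)
tables = fit([bins, colour], labels, positive="A")
print(predict(tables, (0, "grey"), "A", "B"))
print(predict(tables, (1, "red"), "A", "B"))
```

In this sketch a new observation must be passed to `predict` already in categorical form, so a numerical value has to be binned with the same bin edges that were used for the training data.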
References
Fahrmeir, L., & Hamerle, A. (1984). Multivariate statistische Verfahren. Berlin: Walter de Gruyter.
Frank, A., & Asuncion, A. (2010). UCI Machine Learning Repository. http://archive.ics.uci.edu/ml (data set: http://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits). Irvine: University of California, School of Information and Computer Science.
Gebelein, M. J. (1941). Das statistische Problem der Korrelation als Variations- und Eigenwertproblem und sein Zusammenhang mit der Ausgleichsrechnung. Zeitschrift für angewandte Mathematik und Mechanik, 21, 365–375.
Giacomini, F. (2005). The Roman stamped tiles of Vindonissa (1st century AD, Northern Switzerland). Provenance and technology of production: An archaeometric study. BAR international series (Vol. 1449). Oxford: Archaeopress.
Good, I. J. (1965). The estimation of probabilities: An essay on modern Bayesian methods. Cambridge: MIT Press.
Greenacre, M. J. (1989). Theory and applications of correspondence analysis (3rd edn.). London: Academic Press.
Kauderer, H., & Mucha, H. J. (1998). Supervised learning with qualitative and mixed attributes. In I. Balderjahn, R. Mathar, & M. Schader (Eds.), Classification, data analysis, and data highways (pp. 374–382). Berlin: Springer.
Mucha, H. J. (2002). An intelligent clustering technique based on dual scaling. In S. Nishisato, Y. Baba, H. Bozdogan, & K. Kanefuji (Eds.), Measurement and multivariate analysis (pp. 37–46). Tokyo: Springer.
Mucha, H. J. (2009). ClusCorr98 for Excel 2007: Clustering, multivariate visualization, and validation. In H. J. Mucha, & G. Ritter (Eds.), Classification and clustering: Models, software and applications (pp. 14–40). WIAS, Berlin, Report No. 26.
Nishisato, S. (1980). Analysis of categorical data: Dual scaling and its applications. Toronto: The University of Toronto Press.
Nishisato, S. (1994). Elements of dual scaling: An introduction to practical data analysis. Hillsdale: Lawrence Erlbaum Associates Publishers.
Pölz, W. (1988). Ein dispositionsstatistisches Verfahren zur optimalen Informationsausschöpfung aus Datensystemen mit unterschiedlichem Ausprägungsniveau der Merkmale. Frankfurt: Peter Lang.
Pölz, W. (1995). Optimal scaling for ordered categories. Computational Statistics, 10, 37–41.
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Mucha, HJ., Bartel, HG., Dolata, J. (2014). Dual Scaling Classification and Its Application in Archaeometry. In: Spiliopoulou, M., Schmidt-Thieme, L., Janning, R. (eds) Data Analysis, Machine Learning and Knowledge Discovery. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-01595-8_12
DOI: https://doi.org/10.1007/978-3-319-01595-8_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01594-1
Online ISBN: 978-3-319-01595-8
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)