Abstract
In a statistical survey, the treatment of missing data requires particular care, since each decision affects the results of the analysis. In this paper we propose a strategy based on Classification and Discrimination methods conducted on symbolic data, which enables us both to extract compatibility rules and to impute data in order to reconstruct the missing information. The strategy uses tools developed in statistics for the analysis of complex structures called symbolic objects. The starting point is the use of Symbolic Marking to determine the rules (complex units) for constructing the Edit plan. The next phase is the construction of the symbolic matrix, and the last phase is the reconstruction of the missing data by comparing symbolic objects through a suitable dissimilarity measure based on the weighted "Minkowski L1" distance. The proposed strategy has been applied to a real case of 100 manufacturing enterprises located in Southern Italy.
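The final phase, reconstructing missing values by comparing units through a weighted Minkowski L1 dissimilarity, can be illustrated with a minimal sketch. It is not the paper's procedure: it assumes plain numeric records rather than symbolic objects, marks missing values with `None`, computes the distance only over variables observed in both records, and normalises by the sum of the weights used. The function names, the weights, and the toy data are all hypothetical.

```python
from math import inf


def weighted_l1(a, b, weights):
    """Weighted Minkowski distance of order 1 between two records.

    Assumption: variables missing (None) in either record are skipped,
    and the result is divided by the sum of the weights actually used.
    """
    total = 0.0
    weight_sum = 0.0
    for x, y, w in zip(a, b, weights):
        if x is None or y is None:
            continue
        total += w * abs(x - y)
        weight_sum += w
    # No shared observed variable: treat the pair as infinitely distant.
    return total / weight_sum if weight_sum > 0 else inf


def impute_from_nearest(record, donors, weights):
    """Fill the missing entries of `record` from its nearest complete donor."""
    best = min(donors, key=lambda d: weighted_l1(record, d, weights))
    return [d if r is None else r for r, d in zip(record, best)]


# Hypothetical toy data: two complete donor records and one incomplete record.
donors = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
weights = [0.5, 0.3, 0.2]
incomplete = [3.8, None, 6.5]

completed = impute_from_nearest(incomplete, donors, weights)
print(completed)
```

In this toy case the second donor is closer on the two observed variables, so its value fills the gap. A version faithful to the paper would replace `weighted_l1` with a dissimilarity defined on symbolic objects and would restrict donors to those consistent with the extracted compatibility rules.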
Copyright information
© 2005 Springer-Verlag Berlin · Heidelberg
About this paper
Cite this paper
Grassia, M.G. (2005). A Classification and Discrimination Integrated Strategy Conducted on Symbolic Data for Missing Data Treatment in Questionnaire Survey. In: Bock, HH., et al. New Developments in Classification and Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-27373-5_6
DOI: https://doi.org/10.1007/3-540-27373-5_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23809-6
Online ISBN: 978-3-540-27373-8
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)