Visualization of the Critical Patterns of Missing Values in Classification Data

Wang, Hai; Wang, Shouhong

doi:10.1007/978-3-540-76414-4_27

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4781)

Included in the following conference series:

International Conference on Advances in Visual Information Systems

Abstract

The patterns of missing values are important for assessing the quality of a classification data set and for validating classification results. This paper discusses three critical patterns of missing values in a classification data set: missing at random, uneven symmetric missing, and uneven asymmetric missing. It proposes a cluster analysis method based on self-organizing maps (SOM) to visualize these patterns in classification data.
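The abstract does not give the details of the method, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each record is reduced to a 0/1 missingness indicator vector, that a small SOM is trained on those vectors, and that the class labels of the records landing on each map node are then tallied. All function names, the grid size, the decay schedule, and the toy data are hypothetical choices made for illustration.

```python
import math
import random


def missing_indicators(rows):
    """Map each record to a 0/1 vector: 1.0 where the value is missing (None)."""
    return [[1.0 if v is None else 0.0 for v in row] for row in rows]


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is nearest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda k: sum((w - xi) ** 2 for w, xi in zip(weights[k], x)),
    )


def train_som(data, grid_h=3, grid_w=3, epochs=40, lr0=0.5, seed=0):
    """Train a small rectangular SOM with a Gaussian neighbourhood (assumed setup)."""
    rng = random.Random(seed)
    dim = len(data[0])
    coords = [(i, j) for i in range(grid_h) for j in range(grid_w)]
    weights = [[rng.random() for _ in range(dim)] for _ in coords]
    sigma0 = max(grid_h, grid_w) / 2.0
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs        # linear decay from 1 towards 0
        lr = lr0 * decay
        sigma = max(sigma0 * decay, 0.5)    # keep the neighbourhood from collapsing
        order = list(range(len(data)))
        rng.shuffle(order)
        for idx in order:
            x = data[idx]
            bi, bj = coords[best_matching_unit(weights, x)]
            for k, (i, j) in enumerate(coords):
                h = math.exp(-((i - bi) ** 2 + (j - bj) ** 2) / (2.0 * sigma ** 2))
                weights[k] = [w + lr * h * (xi - w) for w, xi in zip(weights[k], x)]
    return coords, weights


def class_counts_per_node(coords, weights, data, labels):
    """Tally, for every map node that wins at least one record, the class labels it receives."""
    counts = {}
    for x, y in zip(data, labels):
        node = coords[best_matching_unit(weights, x)]
        per_class = counts.setdefault(node, {})
        per_class[y] = per_class.get(y, 0) + 1
    return counts


# Hypothetical toy data: class "A" tends to lack the first attribute.
rows = [(None, 1.2, 0.7), (None, 0.9, None), (None, 1.1, 0.4)] * 2 \
     + [(0.4, 1.0, 0.9), (0.5, 0.8, 0.6), (0.3, None, 0.8)] * 2
labels = ["A"] * 6 + ["B"] * 6

indicators = missing_indicators(rows)
coords, weights = train_som(indicators)
counts = class_counts_per_node(coords, weights, indicators, labels)

for node in sorted(counts):
    print(node, counts[node])
```

In a sketch like this, a node populated mostly by one class suggests that the corresponding missingness pattern is tied to class membership, while nodes with mixed classes suggest it is not. How such a map corresponds to the three named patterns is defined in the paper itself, not in the abstract.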

Author information

Authors and Affiliations

Sobey School of Business, Saint Mary’s University, Canada
Hai Wang
Charlton College of Business, University of Massachusetts Dartmouth, USA
Shouhong Wang

Editor information

Guoping Qiu, Clement Leung, Xiangyang Xue, Robert Laurini

Cite this paper

Wang, H., Wang, S. (2007). Visualization of the Critical Patterns of Missing Values in Classification Data. In: Qiu, G., Leung, C., Xue, X., Laurini, R. (eds) Advances in Visual Information Systems. VISUAL 2007. Lecture Notes in Computer Science, vol 4781. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76414-4_27

DOI: https://doi.org/10.1007/978-3-540-76414-4_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-76413-7
Online ISBN: 978-3-540-76414-4
eBook Packages: Computer Science, Computer Science (R0)
