Visualization of the Critical Patterns of Missing Values in Classification Data

  • Conference paper
Advances in Visual Information Systems (VISUAL 2007)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 4781)

Abstract

The patterns of missing values are important for assessing the quality of a classification data set and for validating classification results. The paper discusses three critical patterns of missing values in a classification data set: missing at random, uneven symmetric missing, and uneven asymmetric missing. It proposes a cluster analysis method based on self-organizing maps (SOM) to visualize the patterns of missing values in classification data.
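
The abstract does not give the details of the proposed method, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that each record is reduced to a 0/1 missingness vector (1 where a value is absent) and that a small self-organizing map is trained on those vectors, so that records with similar missing-value patterns land on nearby map nodes. The function names (missing_mask, train_som, best_matching_unit, hit_map), the grid size, and the learning-rate and neighbourhood schedules are all hypothetical choices made for this example.

```python
import math
import random


def missing_mask(rows):
    """Map each record to a 0/1 vector: 1.0 where the value is missing (None)."""
    return [[1.0 if value is None else 0.0 for value in row] for row in rows]


def best_matching_unit(weights, vector):
    """Return the grid coordinate whose weight vector is closest to `vector`."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], vector)),
    )


def train_som(vectors, grid_w=3, grid_h=3, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM (assumed settings, not taken from the paper)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # One weight vector per map node, initialised at random in [0, 1).
    weights = {
        (x, y): [rng.random() for _ in range(dim)]
        for x in range(grid_w)
        for y in range(grid_h)
    }
    sigma0 = max(grid_w, grid_h) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
        order = vectors[:]
        rng.shuffle(order)
        for vector in order:
            bx, by = best_matching_unit(weights, vector)
            for (x, y), w in weights.items():
                dist2 = (x - bx) ** 2 + (y - by) ** 2
                h = math.exp(-dist2 / (2.0 * sigma * sigma))
                # Pull the node's weights toward the input, scaled by h.
                for i in range(dim):
                    w[i] += lr * h * (vector[i] - w[i])
    return weights


def hit_map(weights, vectors, grid_w, grid_h):
    """Count how many records fall on each node; rows are y, columns are x."""
    hits = [[0] * grid_w for _ in range(grid_h)]
    for vector in vectors:
        x, y = best_matching_unit(weights, vector)
        hits[y][x] += 1
    return hits


if __name__ == "__main__":
    # Toy records with None marking a missing feature value.
    records = [
        [5.1, None, 1.4],
        [4.9, None, 1.3],
        [None, 3.0, 1.5],
        [6.0, 2.9, 4.5],
        [None, 3.1, 4.7],
    ]
    mask = missing_mask(records)
    som = train_som(mask, grid_w=3, grid_h=3)
    for row in hit_map(som, mask, 3, 3):
        print(row)
```

Printing the hit counts as a grid gives a crude text rendering of the map. To relate the missingness patterns to class membership, one possible extension is to count hits per class and compare the resulting grids, but that is an assumption about usage rather than something stated in the abstract.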

Editor information

Guoping Qiu, Clement Leung, Xiangyang Xue, Robert Laurini

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

Cite this paper

Wang, H., Wang, S. (2007). Visualization of the Critical Patterns of Missing Values in Classification Data. In: Qiu, G., Leung, C., Xue, X., Laurini, R. (eds) Advances in Visual Information Systems. VISUAL 2007. Lecture Notes in Computer Science, vol 4781. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76414-4_27

  • DOI: https://doi.org/10.1007/978-3-540-76414-4_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-76413-7

  • Online ISBN: 978-3-540-76414-4

  • eBook Packages: Computer Science (R0)
