Dealing with Missing Values

Data Preprocessing in Data Mining

Part of the book series: Intelligent Systems Reference Library (ISRL, volume 72)

Abstract

This chapter introduces the approaches used in the literature to tackle the presence of Missing Values (MVs). In real-life data, information is frequently lost because attributes contain missing values. Several schemes have been studied to overcome the drawbacks that missing values produce in data mining tasks; one of the best known is based on preprocessing and is formally known as imputation. After the introduction in Sect. 4.1, the chapter begins with the theoretical background, which analyzes the underlying distribution of the missingness, in Sect. 4.2. From this point on, the successive sections go from the simplest approaches in Sect. 4.3 to the most advanced proposals, focusing on the imputation of the MVs. The scope of such advanced methods includes the classic maximum likelihood procedures, like Expectation-Maximization or Multiple Imputation (Sect. 4.4), and the latest Machine Learning based approaches, which use classification or regression algorithms to accomplish the imputation (Sect. 4.5). Finally, a comparative experimental study is carried out in Sect. 4.6.

Author information

Corresponding author

Correspondence to Salvador García.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

García, S., Luengo, J., Herrera, F. (2015). Dealing with Missing Values. In: Data Preprocessing in Data Mining. Intelligent Systems Reference Library, vol 72. Springer, Cham. https://doi.org/10.1007/978-3-319-10247-4_4

  • DOI: https://doi.org/10.1007/978-3-319-10247-4_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10246-7

  • Online ISBN: 978-3-319-10247-4

  • eBook Packages: Engineering, Engineering (R0)
