Missing Data Imputation for Machine Learning

Wang, Shaoqian; Li, Bo; Yang, Mao; Yan, Zhongjiang

doi:10.1007/978-3-030-14657-3_7

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 271)

Included in the following conference series:

International Conference on Internet of Things as a Service

Abstract

Imputing missing values is an important step in data preprocessing. Datasets often contain missing values for a variety of reasons arising during data collection, and a good imputation algorithm can increase the reliability of the dataset and reduce the impact of the missing values on it. In this paper, we propose a missing data imputation method for classification datasets based on an Artificial Neural Network (ANN). For each record that contains missing values, we build a list of candidate replacement values drawn from the complete records. Our ANN model is trained on the complete records and selects the most appropriate value from the list according to the label category of the record with missing data. In our experiments, we compare our algorithm with the traditional single-value imputation and mean-value imputation methods on the Pima dataset. The results show that the proposed algorithm achieves better classification results when the dataset contains more missing values.
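
The abstract gives only an outline of the method, so the following is a minimal sketch of one possible reading, not the authors' implementation. It assumes binary 0/1 labels, takes the candidate list for a missing feature to be the distinct values of that feature in the complete records, and picks the candidate for which a small network trained on the complete records assigns the highest probability to the record's own label. The names `train_mlp` and `impute_by_label`, the network size, and the training settings are all hypothetical.

```python
import numpy as np

def train_mlp(X, y, hidden=8, epochs=500, lr=0.5, seed=0):
    """Train a one-hidden-layer network on binary labels.

    Returns a function mapping rows of features to P(label == 1).
    Architecture and hyperparameters are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    mu, sigma = X.mean(axis=0), X.std(axis=0) + 1e-9
    Xs = (X - mu) / sigma                      # standardise the inputs
    W1 = rng.normal(0.0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(Xs @ W1 + b1)
        p = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))
        g = (p - y) / len(y)                   # gradient of mean cross-entropy w.r.t. logits
        gH = np.outer(g, W2) * (1.0 - H ** 2)  # back-propagate through tanh
        W2 -= lr * (H.T @ g)
        b2 -= lr * g.sum()
        W1 -= lr * (Xs.T @ gH)
        b1 -= lr * gH.sum(axis=0)

    def predict_proba(Z):
        H = np.tanh(((Z - mu) / sigma) @ W1 + b1)
        return 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))

    return predict_proba

def impute_by_label(X, y):
    """Fill NaNs in X with candidate values that best support each record's label."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    complete = ~np.isnan(X).any(axis=1)
    predict_proba = train_mlp(X[complete], y[complete])
    col_means = X[complete].mean(axis=0)
    filled = X.copy()
    for i in np.flatnonzero(~complete):
        for j in np.flatnonzero(np.isnan(X[i])):
            # Candidate list: distinct values of feature j among complete records.
            candidates = np.unique(X[complete, j])
            # Other still-missing features are held at the column mean for scoring.
            base = np.where(np.isnan(filled[i]), col_means, filled[i])
            trials = np.tile(base, (len(candidates), 1))
            trials[:, j] = candidates
            p1 = predict_proba(trials)
            score = p1 if y[i] == 1 else 1.0 - p1
            filled[i, j] = candidates[np.argmax(score)]
    return filled
```

Calling `impute_by_label(X, y)` on a float array with `NaN` entries returns a copy in which only the missing cells have changed. Because every candidate is scored by a forward pass, the cost grows with the number of distinct values per feature; for a small dataset such as Pima this is unlikely to matter.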

Acknowledgement

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61771390, 61771392, 61501373, and 61271279), the National Science and Technology Major Project (Grant Nos. 2016ZX03001018-004 and 2015ZX03002006-004), and the Fundamental Research Funds for the Central Universities (Grant No. 3102017ZY018).

Author information

Authors and Affiliations

School of Electronics and Information, Northwestern Polytechnical University, Xi’an, China
Shaoqian Wang, Bo Li, Mao Yang & Zhongjiang Yan

Corresponding author

Correspondence to Mao Yang.

Editor information

Editors and Affiliations

Northwestern Polytechnical University, Xi'an, China
Bo Li
Northwestern Polytechnical University, Xi'an, China
Mao Yang
Shandong University, Jinan, China
Hui Yuan
Northwestern Polytechnical University, Xi'an, Shaanxi, China
Zhongjiang Yan

About this paper

Cite this paper

Wang, S., Li, B., Yang, M., Yan, Z. (2019). Missing Data Imputation for Machine Learning. In: Li, B., Yang, M., Yuan, H., Yan, Z. (eds) IoT as a Service. IoTaaS 2018. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 271. Springer, Cham. https://doi.org/10.1007/978-3-030-14657-3_7

DOI: https://doi.org/10.1007/978-3-030-14657-3_7
Published: 07 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14656-6
Online ISBN: 978-3-030-14657-3
eBook Packages: Computer Science, Computer Science (R0)
