Abstract
Decision making is now a primary motive of data analytics. Before analysis, raw data must be freed from noise by applying data preprocessing techniques. Missing value imputation is one of the data cleaning methods used in data preprocessing. This article presents a novel data imputation technique based on the concepts of rough set theory, and develops an imputation algorithm, Rough Set Missing Value Imputation (RSMVI). The performance of the proposed algorithm is evaluated by comparing the classification accuracy obtained after missing value imputation is performed, using the C4.5 classifier. The Cleveland heart data set is used for the evaluation.
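The abstract does not spell out the steps of RSMVI, so the following is not the authors' algorithm. It is only a minimal sketch of one generic rough-set-flavoured idea, assuming categorical attributes: a missing value is filled with the most frequent value among records that are indiscernible from the incomplete record on every attribute both have observed (a tolerance relation), with a fall-back to the attribute's overall mode. The names `tolerant`, `impute`, `MISSING` and the toy attributes `age`, `chol`, `cp` are hypothetical and chosen for illustration.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing attribute value


def tolerant(x, y, attrs):
    """True if x and y agree on every attribute where both have a value."""
    return all(
        x[a] == y[a]
        for a in attrs
        if x[a] is not MISSING and y[a] is not MISSING
    )


def impute(records, attrs):
    """Return a copy of records with missing values filled in.

    Illustrative only: donors are the other records that are tolerant
    with the incomplete record on the remaining attributes.
    """
    filled = [dict(r) for r in records]
    for i, rec in enumerate(records):
        for a in attrs:
            if rec[a] is not MISSING:
                continue
            others = [b for b in attrs if b != a]
            donors = [
                r[a]
                for j, r in enumerate(records)
                if j != i and r[a] is not MISSING and tolerant(rec, r, others)
            ]
            if not donors:
                # Fall back to every observed value of the attribute.
                donors = [r[a] for r in records if r[a] is not MISSING]
            if donors:
                filled[i][a] = Counter(donors).most_common(1)[0][0]
    return filled


ATTRS = ["age", "chol", "cp"]
records = [
    {"age": "old", "chol": "high", "cp": "typical"},
    {"age": "old", "chol": "high", "cp": "typical"},
    {"age": "young", "chol": "low", "cp": "none"},
    {"age": "old", "chol": MISSING, "cp": "typical"},
]
filled = impute(records, ATTRS)
```

In this toy table the last record matches the first two on `age` and `cp`, so its missing `chol` is filled from them. An evaluation in the spirit of the abstract would then train a classifier on the imputed table and compare accuracy against other imputation strategies.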
Copyright information
© 2018 The Author(s)
About this chapter
Cite this chapter
Sujatha, M., Lavanya Devi, G., Srinivasa Rao, K., Ramesh, N. (2018). Rough Set Theory Based Missing Value Imputation. In: Cognitive Science and Health Bioinformatics. SpringerBriefs in Applied Sciences and Technology. Springer, Singapore. https://doi.org/10.1007/978-981-10-6653-5_9
DOI: https://doi.org/10.1007/978-981-10-6653-5_9
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6652-8
Online ISBN: 978-981-10-6653-5
eBook Packages: Engineering, Engineering (R0)