Rough Set Theory Based Missing Value Imputation

Sujatha, M.; Lavanya Devi, G.; Srinivasa Rao, K.; Ramesh, N.

doi:10.1007/978-981-10-6653-5_9

Part of the book series: SpringerBriefs in Applied Sciences and Technology (BRIEFSFOMEBI)

Abstract

Decision making has become a primary goal of data analytics. Before analysis, noise must be removed from the raw data by applying data preprocessing techniques. Missing value imputation is one of the data cleaning methods used in data preprocessing. This article presents a novel data imputation technique based on the concepts of rough set theory. An imputation algorithm, Rough Set Missing Value Imputation (RSMVI), is developed. The performance of the proposed algorithm is evaluated by comparing the classification accuracy obtained after missing value imputation is performed, with the C4.5 classifier chosen for this purpose. The Cleveland heart data set is used to evaluate the proposed algorithm.
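
The abstract does not describe the steps of RSMVI, so the following is only a minimal sketch of the general idea of rough-set-style imputation, not the authors' algorithm. It assumes categorical attributes, uses None as a hypothetical missing-value marker, and treats two records as similar (a tolerance relation) when they agree on every attribute both have observed. A missing value is then filled with the most frequent value found among similar records. The function names tolerant and impute, and the sample records, are illustrative assumptions.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing attribute value


def tolerant(a, b):
    """True when two records agree on every attribute both have observed."""
    return all(x == y for x, y in zip(a, b)
               if x is not MISSING and y is not MISSING)


def impute(records):
    """Fill each missing value with the most frequent value among tolerant records."""
    completed = []
    for i, rec in enumerate(records):
        filled = list(rec)
        for j, value in enumerate(rec):
            if value is not MISSING:
                continue
            # Values of attribute j observed in other records tolerant with rec.
            candidates = [other[j] for k, other in enumerate(records)
                          if k != i and other[j] is not MISSING
                          and tolerant(rec, other)]
            if not candidates:
                # Fallback (an assumption): use all observed values of attribute j.
                candidates = [other[j] for other in records
                              if other[j] is not MISSING]
            if candidates:
                filled[j] = Counter(candidates).most_common(1)[0][0]
        completed.append(filled)
    return completed


# Hypothetical toy records, not taken from the Cleveland heart data set.
records = [
    ["high", "yes", "sick"],
    ["high", None, "sick"],
    ["low", "no", "healthy"],
    ["low", "no", None],
]
completed = impute(records)
```

In this toy input, the second record is similar only to the first, and the fourth only to the third, so each gap is filled from a single neighbor. An evaluation in the spirit of the abstract would then train a classifier on the completed records and compare its accuracy against other imputation strategies.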

Author information

Authors and Affiliations

Department of Computer Science and Systems Engineering, Andhra University, Visakhapatnam, Andhra Pradesh, India
M. Sujatha, G. Lavanya Devi, K. Srinivasa Rao & N. Ramesh

Corresponding author

Correspondence to M. Sujatha.

About this chapter

Cite this chapter

Sujatha, M., Lavanya Devi, G., Srinivasa Rao, K., Ramesh, N. (2018). Rough Set Theory Based Missing Value Imputation. In: Cognitive Science and Health Bioinformatics. SpringerBriefs in Applied Sciences and Technology. Springer, Singapore. https://doi.org/10.1007/978-981-10-6653-5_9

DOI: https://doi.org/10.1007/978-981-10-6653-5_9
Published: 27 December 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6652-8
Online ISBN: 978-981-10-6653-5
eBook Packages: Engineering, Engineering (R0)