Rough Set Theory Based Missing Value Imputation

  • Chapter
Cognitive Science and Health Bioinformatics

Part of the book series: SpringerBriefs in Applied Sciences and Technology (BRIEFSFOMEBI)

Abstract

Decision making is a primary goal of data analytics. Before analysis, raw data must be cleaned of noise through data preprocessing. Missing value imputation is one of the data cleaning methods used in preprocessing. This chapter presents a novel data imputation technique based on the concepts of rough set theory and develops an imputation algorithm, Rough Set Missing Value Imputation (RSMVI). The performance of the proposed algorithm is evaluated by comparing the classification accuracy obtained after missing value imputation is performed, using the C4.5 classifier. The Cleveland heart data set is used for the evaluation.
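
The abstract does not spell out the steps of RSMVI, so the following is only a minimal sketch of one generic rough-set-flavoured idea, not the authors' algorithm: a missing cell is filled from other records that agree with the incomplete record on every attribute known in both. The names `MISSING`, `compatible`, and `impute`, the majority-vote rule, and the toy records are all assumptions made for illustration.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing value (hypothetical convention)

def compatible(a, b):
    """True if rows a and b agree on every attribute that is known in both."""
    return all(x == y for x, y in zip(a, b)
               if x is not MISSING and y is not MISSING)

def impute(rows):
    """Fill each missing cell with the most frequent value among compatible rows.

    Illustrative only: this is NOT the RSMVI algorithm from the chapter.
    Cells with no compatible donor are left missing.
    """
    result = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not MISSING:
                continue
            # Donors: other rows that know attribute j and are compatible with this row.
            donors = [r[j] for k, r in enumerate(rows)
                      if k != i and r[j] is not MISSING and compatible(row, r)]
            if donors:
                result[i][j] = Counter(donors).most_common(1)[0][0]
    return result

# Toy records (made up): (blood_pressure, chest_pain, diagnosis)
data = [
    ("high", "yes", "sick"),
    ("high", None, "sick"),
    ("low", "no", "healthy"),
    ("low", "no", None),
]
filled = impute(data)
print(filled)
```

In an evaluation like the one the abstract describes, the filled table would then be passed to a classifier and the resulting accuracy compared; that step is omitted here.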

Author information

Corresponding author

Correspondence to M. Sujatha.

Copyright information

© 2018 The Author(s)

About this chapter

Cite this chapter

Sujatha, M., Lavanya Devi, G., Srinivasa Rao, K., Ramesh, N. (2018). Rough Set Theory Based Missing Value Imputation. In: Cognitive Science and Health Bioinformatics. SpringerBriefs in Applied Sciences and Technology. Springer, Singapore. https://doi.org/10.1007/978-981-10-6653-5_9

  • DOI: https://doi.org/10.1007/978-981-10-6653-5_9

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6652-8

  • Online ISBN: 978-981-10-6653-5

  • eBook Packages: Engineering, Engineering (R0)
