Analysis of Missing Data Using Matrix-Characterized Approximations

Software Engineering Research, Management and Applications (SERA 2019)

Part of the book series: Studies in Computational Intelligence (SCI, volume 845)

Abstract

Data veracity issues, that is, data quality problems such as incomplete, inconsistent, vague, or noisy data, pose a major challenge to data mining and data analysis. Rough set theory provides a tool for handling incomplete and imprecise data in information systems. In this paper, rough-set-based, matrix-represented approximations are used to compute lower and upper approximations. The resulting approximations serve as inputs to the data analysis system LERS (Learning from Examples based on Rough Sets), used with the LEM2 (Learning from Examples Module, version 2) rule induction algorithm. Analyses are performed on incomplete datasets in which missing values are interpreted as “do not care” conditions and on datasets in which they are interpreted as lost values. Experiments on datasets with different percentages of missing values, using different thresholds, are also reported. The experimental results show that the system performs better when missing data are characterized as “do not care” conditions than when they are represented as lost values.
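The abstract does not spell out the matrix construction, so the following is only a minimal sketch of how lower and upper approximations could be computed from a relation matrix under the two interpretations of missing values. It assumes the usual characteristic-set convention from the rough set literature ("?" for a lost value, "*" for a "do not care" condition) and singleton approximations; the function names `relation_matrix` and `approximations` and the toy table are hypothetical and are not taken from the chapter.

```python
# Illustrative sketch only; not the authors' implementation.
LOST, DONT_CARE = "?", "*"  # assumed markers for the two kinds of missing values


def relation_matrix(table):
    """Return the 0/1 matrix R with R[i][j] = 1 iff case j is in the characteristic set of case i."""
    def matches(xi, xj):
        # A missing value in case i imposes no constraint on that attribute.
        # A specified value in case i is matched by the same value or by "*" in case j;
        # a lost value "?" in case j never matches a specified value.
        return all(
            a in (LOST, DONT_CARE) or b == a or b == DONT_CARE
            for a, b in zip(xi, xj)
        )

    n = len(table)
    return [[1 if matches(table[i], table[j]) else 0 for j in range(n)] for i in range(n)]


def approximations(R, concept):
    """Singleton lower and upper approximations of a concept (a set of case indices)."""
    g = [1 if j in concept else 0 for j in range(len(R))]  # characteristic vector of the concept
    lower, upper = set(), set()
    for i, row in enumerate(R):
        hits = sum(r * x for r, x in zip(row, g))  # i-th entry of the product R * g
        if hits == sum(row):   # every related case lies inside the concept
            lower.add(i)
        if hits > 0:           # at least one related case lies inside the concept
            upper.add(i)
    return lower, upper


def make_table(missing):
    """Toy table with two attributes; the same cells are missing in both readings."""
    return [
        ("high", "yes"),
        ("high", missing),
        ("low", "no"),
        (missing, "yes"),
    ]


concept = {0, 1}
lower_lost, upper_lost = approximations(relation_matrix(make_table(LOST)), concept)
lower_dc, upper_dc = approximations(relation_matrix(make_table(DONT_CARE)), concept)
```

On this toy table the lost-value reading gives a lower approximation of {0, 1}, while the "do not care" reading gives an empty lower approximation; both readings give the upper approximation {0, 1, 3}. The example only illustrates that the two interpretations can yield different approximations and says nothing about which one leads to better rules.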



Corresponding author

Correspondence to Thin Thin Soe.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Soe, T.T., Min, M.M. (2020). Analysis of Missing Data Using Matrix-Characterized Approximations. In: Lee, R. (eds) Software Engineering Research, Management and Applications. SERA 2019. Studies in Computational Intelligence, vol 845. Springer, Cham. https://doi.org/10.1007/978-3-030-24344-9_7
