Studies on the Necessary Data Size for Rule Induction by STRIM

  • Conference paper
Rough Sets and Knowledge Technology (RSKT 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8171)

Abstract

STRIM (Statistical Test Rule Induction Method) has been proposed as a method for effectively inducing if-then rules from a decision table, which is regarded as a sample set drawn from the population of interest. Its usefulness has been confirmed by simulation experiments in which the rules are specified in advance, and by comparison with conventional methods. However, there remains scope for further study. One aspect that needs examination is the size of the dataset needed to induce the true rules in such simulation experiments, since finding statistically significant rules is the core of the method. This paper examines the theoretical dataset size that STRIM needs in order to induce the true rules with probability w [%], as a function of the rule length, and confirms the validity of the analysis by a simulation experiment at rule length 2. The results provide useful guidelines for analyzing real-world datasets.
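The abstract does not state the formula for the necessary dataset size, so the following is only a rough Monte Carlo sketch of the underlying question, not the authors' derivation. It plants one hypothetical rule of length `rule_len` in randomly generated rows, applies a one-sided normal-approximation test to the rows matching the rule's condition, and estimates how often the rule comes out significant for a given number of rows. Every name and parameter here (`detection_rate`, `smallest_size`, `n_values`, `n_classes`, `strength`, `alpha`) is an illustrative assumption rather than something taken from the paper.

```python
import math
import random
from statistics import NormalDist


def detection_rate(n_rows, rule_len=2, n_values=6, n_classes=6,
                   strength=0.5, alpha=0.01, trials=200, seed=0):
    """Estimate how often a planted rule is statistically significant.

    Assumed setup (hypothetical, not from the paper): condition attributes
    are uniform over n_values values, the planted rule requires the first
    rule_len attributes to equal value 0, and a matching row gets decision
    0 with probability `strength`, otherwise a uniformly random decision.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha)  # one-sided critical value
    p0 = 1.0 / n_classes  # null hypothesis: decision independent of condition
    hits = 0
    for _ in range(trials):
        n_match = 0  # rows satisfying the rule's condition part
        n_hit = 0    # matching rows whose decision agrees with the rule
        for _ in range(n_rows):
            # Skip rows where any of the rule_len attributes is not value 0.
            if any(rng.randrange(n_values) != 0 for _ in range(rule_len)):
                continue
            n_match += 1
            if rng.random() < strength:
                decision = 0
            else:
                decision = rng.randrange(n_classes)
            n_hit += decision == 0
        if n_match == 0:
            continue  # no matching rows, so the rule cannot be detected
        # Normal approximation to the binomial; crude when n_match is small.
        z = (n_hit - n_match * p0) / math.sqrt(n_match * p0 * (1.0 - p0))
        hits += z >= z_crit
    return hits / trials


def smallest_size(w, sizes, **kwargs):
    """Return the first size in `sizes` detected in at least w [%] of trials."""
    for n in sizes:
        if detection_rate(n, **kwargs) >= w / 100.0:
            return n
    return None


if __name__ == "__main__":
    for n in (100, 200, 500, 1000, 2000):
        print(n, detection_rate(n))
```

Because the fraction of rows matching a condition of length `rule_len` shrinks as `(1 / n_values) ** rule_len` under these assumptions, longer rules need more rows before the test has enough matching samples, which is the kind of dependence on rule length that the abstract refers to.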

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kato, Y., Saeki, T., Mizuno, S. (2013). Studies on the Necessary Data Size for Rule Induction by STRIM. In: Lingras, P., Wolski, M., Cornelis, C., Mitra, S., Wasilewski, P. (eds) Rough Sets and Knowledge Technology. RSKT 2013. Lecture Notes in Computer Science, vol 8171. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41299-8_20

  • DOI: https://doi.org/10.1007/978-3-642-41299-8_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41298-1

  • Online ISBN: 978-3-642-41299-8

  • eBook Packages: Computer Science, Computer Science (R0)
