Abstract
STRIM (Statistical Test Rule Induction Method) has been proposed as a method for effectively inducing if-then rules from a decision table, which is regarded as a sample set obtained from the population of interest. Its usefulness has been confirmed by a simulation experiment in which the rules are specified in advance, and by comparison with conventional methods. However, there remains scope for further study. One aspect that needs examination is how large a dataset must be for the true rules to be induced in simulation experiments, since finding statistically significant rules is the core of the method. This paper examines the theoretically necessary dataset size for STRIM to induce the true rules with probability w [%] as a function of the rule length, and confirms the validity of the analysis by a simulation experiment with rule length 2. The results provide useful guidelines for analyzing real-world datasets.
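The link between data size, detection probability, and rule length can be illustrated with a generic power calculation. The sketch below is not the derivation used in the paper. It assumes a one-sided z-test on a proportion (normal approximation), a null hypothesis in which the decision value occurs with probability `p0`, a true rule that yields it with probability `p1`, and condition attributes that are independent and uniformly distributed over `values_per_attribute` values. All function and parameter names are hypothetical, and `w` is given as a fraction rather than a percentage.

```python
import math
from statistics import NormalDist


def matching_samples_needed(p0, p1, alpha=0.01, w=0.9):
    """Number of samples matching a rule's condition part needed so that a
    one-sided z-test at level alpha rejects H0 (proportion = p0) with
    probability w when the true proportion is p1.
    Assumption: normal approximation to the binomial; 0 < p0 < p1 < 1."""
    if not (0.0 < p0 < p1 < 1.0):
        raise ValueError("require 0 < p0 < p1 < 1")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha)  # critical value under H0
    z_w = std_normal.inv_cdf(w)                # quantile for the target power
    numerator = (z_alpha * math.sqrt(p0 * (1.0 - p0))
                 + z_w * math.sqrt(p1 * (1.0 - p1)))
    return math.ceil((numerator / (p1 - p0)) ** 2)


def dataset_size_needed(n_match, values_per_attribute, rule_length):
    """Total dataset size so that, on average, n_match samples satisfy a
    condition part of the given length.
    Assumption: independent, uniformly distributed condition attributes,
    so a fraction values_per_attribute ** (-rule_length) of samples match."""
    return n_match * values_per_attribute ** rule_length


if __name__ == "__main__":
    # Hypothetical setting: 6 values per attribute, rule length 2.
    n = matching_samples_needed(p0=1 / 6, p1=0.5, alpha=0.01, w=0.9)
    print(n, dataset_size_needed(n, values_per_attribute=6, rule_length=2))
```

Under these assumptions the required total size grows geometrically with the rule length, because each extra condition attribute divides the share of matching samples by the number of attribute values.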
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kato, Y., Saeki, T., Mizuno, S. (2013). Studies on the Necessary Data Size for Rule Induction by STRIM. In: Lingras, P., Wolski, M., Cornelis, C., Mitra, S., Wasilewski, P. (eds) Rough Sets and Knowledge Technology. RSKT 2013. Lecture Notes in Computer Science, vol 8171. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41299-8_20
DOI: https://doi.org/10.1007/978-3-642-41299-8_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41298-1
Online ISBN: 978-3-642-41299-8
eBook Packages: Computer Science (R0)