Data Mining: A Probabilistic Rough Set Approach

Zhong, Ning; Dong, Ju-Zhen; Ohsuga, Setsuo

doi:10.1007/978-3-7908-1883-3_7

Part of the book series: Studies in Fuzziness and Soft Computing ((STUDFUZZ,volume 19))

Abstract

This paper introduces a new approach for mining if-then rules in databases with uncertainty and incompleteness. The approach combines the Generalization Distribution Table (GDT) with rough set methodology. A GDT is a table that represents the probabilistic relationships between concepts and instances over discrete domains. By using a GDT as a hypothesis search space and combining it with rough set methodology, noise and unseen instances can be handled, biases can be flexibly selected, background knowledge can be used to constrain rule generation, and if-then rules with strengths can be effectively acquired from large, complex databases in an incremental, bottom-up mode. In this paper, we focus on the basic concepts and an implementation of our methodology.
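
To make the idea of a GDT more concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give the table's definition, so the sketch assumes one common reading: each row is a generalization in which at least one attribute is replaced by a wildcard, each column is a possible instance, and the probability mass of a generalization is spread uniformly over the instances it covers. The names `WILDCARD`, `covers`, and `build_gdt`, the toy attributes `a` and `b`, and the uniform prior are all assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

WILDCARD = "*"  # hypothetical marker for "any value of this attribute"


def covers(generalization, instance):
    """True if every non-wildcard position of the generalization matches."""
    return all(g == WILDCARD or g == v for g, v in zip(generalization, instance))


def build_gdt(domains):
    """Build a toy GDT from a dict mapping attribute name -> non-empty value list.

    Returns (instances, table), where table[generalization][instance] is the
    assumed probability p(instance | generalization).
    """
    value_lists = [list(values) for values in domains.values()]
    instances = list(product(*value_lists))
    # Candidate generalizations replace at least one attribute by the wildcard.
    candidates = product(*(values + [WILDCARD] for values in value_lists))
    generalizations = [g for g in candidates if WILDCARD in g]

    table = {}
    for g in generalizations:
        covered = [inst for inst in instances if covers(g, inst)]
        share = Fraction(1, len(covered))  # assumed uniform prior over covered instances
        table[g] = {
            inst: share if covers(g, inst) else Fraction(0) for inst in instances
        }
    return instances, table


# Toy example: attribute a has 2 values, attribute b has 3 values.
domains = {"a": ["a0", "a1"], "b": ["b0", "b1", "b2"]}
instances, gdt = build_gdt(domains)
print(gdt[("a0", "*")][("a0", "b1")])  # 1/3
```

In this toy table the generalization `("a0", "*")` covers three instances, so each receives probability 1/3 and every other instance receives 0. A non-uniform prior or background knowledge would change these entries, which the sketch does not attempt to model.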

Author information

Authors and Affiliations

Department of Computer Science and Systems Engineering, Faculty of Engineering, Yamaguchi University, Tokiwa-Dai, 2557, Ube 755, Japan
Ju-Zhen Dong
Department of Information and Computer Science, School of Science and Engineering, Waseda University, 3-4-1 Okubo Shinjuku-Ku, Tokyo 169, Japan
Setsuo Ohsuga

Editor information

Editors and Affiliations

Institute of Mathematics, Warsaw University of Technology, Pl. Politechniki 1, 00-665, Warsaw, Poland
Lech Polkowski
Polish-Japanese Institute of Computer Techniques, Koszykowa 86, 02-008, Warsaw, Poland
Lech Polkowski
Institute of Mathematics, Warsaw University, ul. Banacha 2, 02-097, Warsaw, Poland
Andrzej Skowron

About this chapter

Cite this chapter

Zhong, N., Dong, JZ., Ohsuga, S. (1998). Data Mining: A Probabilistic Rough Set Approach. In: Polkowski, L., Skowron, A. (eds) Rough Sets in Knowledge Discovery 2. Studies in Fuzziness and Soft Computing, vol 19. Physica, Heidelberg. https://doi.org/10.1007/978-3-7908-1883-3_7

DOI: https://doi.org/10.1007/978-3-7908-1883-3_7
Publisher Name: Physica, Heidelberg
Print ISBN: 978-3-7908-2459-9
Online ISBN: 978-3-7908-1883-3
eBook Packages: Springer Book Archive
