
Data Mining: A Probabilistic Rough Set Approach

Chapter in: Rough Sets in Knowledge Discovery 2

Part of the book series: Studies in Fuzziness and Soft Computing (STUDFUZZ, volume 19)

Abstract

This paper introduces a new approach to mining if-then rules from databases with uncertainty and incompleteness. The approach combines the Generalization Distribution Table (GDT) with rough set methodology. A GDT is a table that represents the probabilistic relationships between concepts and instances over discrete domains. Using a GDT as the hypothesis search space and combining it with rough set methodology makes it possible to handle noise and unseen instances, select biases flexibly, use background knowledge to constrain rule generation, and effectively acquire if-then rules with strengths from large, complex databases in an incremental, bottom-up mode. In this paper, we focus on the basic concepts and an implementation of our methodology.
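The abstract does not spell out how a GDT is filled in, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a generalization is an instance pattern in which some attribute values are replaced by a wildcard, and that, with no prior knowledge, every instance covered by a generalization is equally likely. The function name `build_gdt`, the wildcard symbol `*`, and the toy attribute domains are all hypothetical.

```python
from itertools import product

WILDCARD = "*"  # hypothetical marker for "any value of this attribute"


def covers(generalization, instance):
    """True if every non-wildcard value of the generalization matches the instance."""
    return all(g == WILDCARD or g == v for g, v in zip(generalization, instance))


def build_gdt(domains):
    """Build a toy generalization distribution table.

    `domains` maps each attribute name to its list of discrete values.
    Returns {generalization: {instance: probability}}, where the probability
    is 1 / (number of covered instances). This uniform distribution is an
    assumption of the sketch, not something the abstract specifies.
    """
    names = list(domains)
    # All possible instances: one concrete value per attribute.
    instances = list(product(*(domains[n] for n in names)))
    # All possible generalizations: patterns with at least one wildcard.
    patterns = product(*([*domains[n], WILDCARD] for n in names))
    generalizations = [p for p in patterns if WILDCARD in p]

    gdt = {}
    for g in generalizations:
        covered = [i for i in instances if covers(g, i)]
        gdt[g] = {i: 1.0 / len(covered) for i in covered}
    return gdt


if __name__ == "__main__":
    toy_domains = {"a": ["a0", "a1"], "b": ["b0", "b1", "b2"]}
    table = build_gdt(toy_domains)
    for generalization, row in table.items():
        print(generalization, row)
```

Each row of the resulting table is a probability distribution over the instances that one generalization covers. Instances that never appear in the database still receive probability mass, which is one way to read the abstract's remark that unseen instances can be handled.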




Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Zhong, N., Dong, JZ., Ohsuga, S. (1998). Data Mining: A Probabilistic Rough Set Approach. In: Polkowski, L., Skowron, A. (eds) Rough Sets in Knowledge Discovery 2. Studies in Fuzziness and Soft Computing, vol 19. Physica, Heidelberg. https://doi.org/10.1007/978-3-7908-1883-3_7


  • DOI: https://doi.org/10.1007/978-3-7908-1883-3_7

  • Publisher Name: Physica, Heidelberg

  • Print ISBN: 978-3-7908-2459-9

  • Online ISBN: 978-3-7908-1883-3

  • eBook Packages: Springer Book Archive
