Rule Generation Based on Rough Set Theory for Text classification

Bi, Yaxin; Anderson, Terry; McClean, Sally

doi:10.1007/978-1-4471-0269-4_12

Abstract

In this paper we describe an approach based on rough set techniques for decision rule generation applied to text classification. A minimal discriminating set of attributes (a reduct) for the original data set is obtained by analyzing the degree of dependency among attributes. To speed up the search for reducts, the information gain criterion is used both to reduce the number of attributes considered and to rank the remaining attributes in decreasing order of gain, and heuristic functions are incorporated into a range of rule generation algorithms.
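The two ingredients named in the abstract, the degree of dependency and the information gain ranking, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names, the greedy forward/backward search, and the toy term-presence data are all assumptions made for illustration. The dependency degree is taken in the standard rough set sense, as the fraction of objects whose equivalence class under the chosen attributes carries a single decision label.

```python
from collections import defaultdict
from math import log2


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, labels, attrs):
    """Fraction of rows lying in blocks that carry exactly one label."""
    consistent = sum(
        len(block)
        for block in partition(rows, attrs)
        if len({labels[i] for i in block}) == 1
    )
    return consistent / len(rows)


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    counts = defaultdict(int)
    for y in labels:
        counts[y] += 1
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in counts.values())


def information_gain(rows, labels, attr):
    """Reduction in label entropy after splitting on a single attribute."""
    n = len(rows)
    remainder = sum(
        len(block) / n * entropy([labels[i] for i in block])
        for block in partition(rows, [attr])
    )
    return entropy(labels) - remainder


def greedy_reduct(rows, labels, attrs, keep=None):
    """Hypothetical heuristic: rank by gain, add until dependency is preserved."""
    ranked = sorted(
        attrs, key=lambda a: information_gain(rows, labels, a), reverse=True
    )
    if keep is not None:
        ranked = ranked[:keep]  # discard low-gain attributes up front
    target = dependency(rows, labels, ranked)
    chosen = []
    for a in ranked:
        if dependency(rows, labels, chosen) >= target:
            break
        chosen.append(a)
    # Backward pass: drop any attribute whose removal keeps the dependency.
    for a in list(chosen):
        rest = [b for b in chosen if b != a]
        if dependency(rows, labels, rest) >= target:
            chosen = rest
    return chosen


# Toy data (invented): binary term presence per document, one label each.
rows = [
    {"wheat": 1, "price": 1, "bank": 0},
    {"wheat": 1, "price": 0, "bank": 0},
    {"wheat": 0, "price": 1, "bank": 1},
    {"wheat": 0, "price": 0, "bank": 1},
    {"wheat": 0, "price": 1, "bank": 0},
]
labels = ["grain", "grain", "money", "money", "money"]

print(greedy_reduct(rows, labels, ["wheat", "price", "bank"]))
```

The result of such a greedy search is minimal only in the sense that no chosen attribute can be dropped without lowering the dependency; it is not guaranteed to be the smallest reduct overall.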

Author information

Authors and Affiliations

Faculty of Informatics, University of Ulster at Jordanstown, Newtownabbey, Co. Antrim, BT37 0QB, N. Ireland, UK
Yaxin Bi, Terry Anderson & Sally McClean

Editor information

Editors and Affiliations

Faculty of Technology, University of Portsmouth, Portsmouth, UK
Max Bramer BSc, PhD, CEng
Department of Computer Science, University of Aberdeen, Aberdeen, UK
Alun Preece BSc, PhD
Department of Computer Science, University of Liverpool, Liverpool, UK
Frans Coenen PhD

About this paper

Cite this paper

Bi, Y., Anderson, T., McClean, S. (2001). Rule Generation Based on Rough Set Theory for Text classification. In: Bramer, M., Preece, A., Coenen, F. (eds) Research and Development in Intelligent Systems XVII. Springer, London. https://doi.org/10.1007/978-1-4471-0269-4_12

DOI: https://doi.org/10.1007/978-1-4471-0269-4_12
Publisher Name: Springer, London
Print ISBN: 978-1-85233-403-1
Online ISBN: 978-1-4471-0269-4
eBook Packages: Springer Book Archive
