A Rough Set Model with Ontological Information for Discovering Maximal Association Rules in Document Collections

Bi, Yaxin; Anderson, Terry; McClean, Sally

doi:10.1007/978-1-4471-0651-7_2

Abstract

In this paper we investigate the applicability of a Rough Set model and method for discovering maximal associations in a collection of text documents, and compare it with the maximal association method. Both methods are based on computing co-occurrences of sets of keywords, but it is shown that the rules discovered by the Rough Set method are similar to maximal association rules, and that the Rough Set method is much simpler than the maximal association method. We also present an alternative strategy for obtaining the taxonomies that both methods require: instead of building taxonomies from the labelled document collections themselves, the strategy makes effective use of ontologies, which will increasingly be deployed on the Internet.
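
The abstract gives no formal definitions, so the following is only a minimal sketch of the general idea of keyword co-occurrence within taxonomy categories. It assumes a simplified reading in which a document supports a keyword set "maximally" when that set is exactly the document's keywords inside one category, and it groups documents into classes that agree on a category, in the spirit of a rough-set partition. The taxonomy, the toy documents, and the function names are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical two-category taxonomy (illustrative names only).
TAXONOMY = {
    "countries": {"usa", "canada", "france"},
    "topics": {"corn", "fish", "wheat"},
}

# Toy document collection: each document is a set of keywords.
DOCS = [
    {"usa", "canada", "corn"},
    {"usa", "canada", "corn"},
    {"usa", "canada", "france", "fish"},
    {"usa", "corn"},
]


def restrict(doc, category):
    """Return the keywords of `doc` that fall inside one taxonomy category."""
    return frozenset(doc & TAXONOMY[category])


def partition(docs, category):
    """Count documents per class, where a class holds all documents
    with the same keyword set inside `category`."""
    classes = Counter()
    for doc in docs:
        classes[restrict(doc, category)] += 1
    return classes


def plain_confidence(docs, lhs, rhs):
    """Ordinary rule confidence: of the documents containing `lhs`,
    the fraction that also contain `rhs`."""
    lhs, rhs = set(lhs), set(rhs)
    matching = [d for d in docs if lhs <= d]
    if not matching:
        return 0.0
    return sum(1 for d in matching if rhs <= d) / len(matching)


def exact_confidence(docs, lhs_cat, lhs, rhs_cat, rhs):
    """Simplified 'maximal' confidence (an assumption of this sketch):
    only documents whose keywords in `lhs_cat` are exactly `lhs` count,
    and a hit requires the keywords in `rhs_cat` to be exactly `rhs`."""
    lhs, rhs = frozenset(lhs), frozenset(rhs)
    matching = [d for d in docs if restrict(d, lhs_cat) == lhs]
    if not matching:
        return 0.0
    hits = sum(1 for d in matching if restrict(d, rhs_cat) == rhs)
    return hits / len(matching)
```

On the toy data, the document that also mentions "france" lowers the ordinary confidence of the rule {usa, canada} -> {corn}, whereas the exact-match variant ignores it because its country set is not exactly {usa, canada}. This is the kind of association that plain co-occurrence counting can understate.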

Author information

Authors and Affiliations

Faculty of Informatics, University of Ulster at Jordanstown, Newtownabbey, Co. Antrim, N. Ireland, UK
Yaxin Bi, Terry Anderson & Sally McClean

Editor information

Editors and Affiliations

Faculty of Technology, University of Portsmouth, Portsmouth, UK
Max Bramer BSc, PhD, CEng, FBCS, FIEE, FRSA (Technical Programme Chair)
Dept of Computer Science, University of Aberdeen, Aberdeen, UK
Alun Preece (Deputy Technical Programme Chair)
Department of Computer Science, University of Liverpool, Liverpool, UK
Frans Coenen (Conference Chairman)

About this paper

Cite this paper

Bi, Y., Anderson, T., McClean, S. (2003). A Rough Set Model with Ontological Information for Discovering Maximal Association Rules in Document Collections. In: Bramer, M., Preece, A., Coenen, F. (eds) Research and Development in Intelligent Systems XIX. Springer, London. https://doi.org/10.1007/978-1-4471-0651-7_2

DOI: https://doi.org/10.1007/978-1-4471-0651-7_2
Publisher Name: Springer, London
Print ISBN: 978-1-85233-674-5
Online ISBN: 978-1-4471-0651-7
eBook Packages: Springer Book Archive
