Mining Rough Association from Text Documents

Li, Yuefeng; Zhong, Ning

doi:10.1007/11908029_39

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4259)

Included in the following conference series:

International Conference on Rough Sets and Current Trends in Computing

Abstract

Guaranteeing the quality of association rules is a major challenge in some application areas (e.g., Web information gathering) because of duplications and ambiguities in data values (e.g., terms). This paper presents a novel concept, rough association rules, to improve the quality of discovered knowledge in these application areas. The premise of a rough association rule consists of a set of terms (items) and a weight distribution over those terms (items). The distinct advantage of rough association rules is that they contain more specific information than normal association rules. It is also feasible to update rough association rules dynamically to produce effective results. The experimental results verify that the proposed approach is promising.
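
As a rough illustration of the premise structure described in the abstract, the sketch below pairs a term set with a weight distribution. It is not the authors' algorithm: the `RoughPremise` class, the `build_premise` helper, and the count-normalization scheme are all hypothetical and chosen only to show why a weighted premise can carry more specific information than a bare term set.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RoughPremise:
    """Hypothetical container: a term set plus a weight distribution over it."""
    terms: frozenset
    weights: dict  # term -> weight; the weights sum to 1


def build_premise(term_counts):
    """Normalize raw term counts into weights (an assumed scheme, not the paper's)."""
    total = sum(term_counts.values())
    if total <= 0:
        raise ValueError("term_counts must contain at least one positive count")
    weights = {t: c / total for t, c in term_counts.items() if c > 0}
    return RoughPremise(frozenset(weights), weights)


# Two made-up documents that use the same terms with different emphasis.
premise_a = build_premise(Counter({"rough": 3, "association": 1}))
premise_b = build_premise(Counter({"rough": 1, "association": 3}))

# A plain term-set premise cannot tell the two documents apart ...
print(premise_a.terms == premise_b.terms)
# ... while the weight distributions can.
print(premise_a.weights == premise_b.weights)
```

Under these assumptions, the two premises share one term set but differ in their weights, which is the kind of extra specificity the abstract attributes to rough association rules.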

Author information

Authors and Affiliations

School of Software Engineering and Data Communications, Queensland University of Technology, Brisbane, QLD, 4001, Australia
Yuefeng Li
Department of Systems and Information Engineering, Maebashi Institute of Technology, Japan
Ning Zhong

Editor information

Editors and Affiliations

Faculty of Economics, University of Catania, Corso Italia, 55, 95129, Catania, Italy
Salvatore Greco
Graduate School of Engineering, Department of Electrical Engineering and Computer Sciences, University of Hyogo, 2167 Shosha, 671-2280, Himeji, Hyogo, Japan
Yutaka Hata
Department of Medical Informatics, Faculty of Medicine, Shimane University, 89-1 Enya-cho, Izumo, 693-8501, Shimane, Japan
Shoji Hirano
Department of Systems Innovation, Graduate School of Engineering Science, Osaka University, 1-3, Machikaneyama, Toyonaka, 560-8531, Osaka, Japan
Masahiro Inuiguchi
Department of Risk Engineering, School of Systems and Information Engineering, University of Tsukuba, 305-8573, Ibaraki, Japan
Sadaaki Miyamoto
Institute of Mathematics, Warsaw University, Banacha 2, 02-097, Warsaw, Poland
Hung Son Nguyen
Systems Research Institute, Polish Academy of Sciences, 01-447, Warsaw, Poland
Roman Słowiński

Cite this paper

Li, Y., Zhong, N. (2006). Mining Rough Association from Text Documents. In: Greco, S., et al. Rough Sets and Current Trends in Computing. RSCTC 2006. Lecture Notes in Computer Science, vol 4259. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11908029_39

DOI: https://doi.org/10.1007/11908029_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-47693-1
Online ISBN: 978-3-540-49842-1
eBook Packages: Computer Science, Computer Science (R0)