Abstract
Guaranteeing the quality of association rules is a major challenge in some application areas (e.g., Web information gathering) because of duplications and ambiguities in data values (e.g., terms). This paper presents a novel concept of rough association rules to improve the quality of discovered knowledge in these application areas. The premise of a rough association rule consists of a set of terms (items) and a weight distribution over those terms (items). The distinct advantage of rough association rules is that they contain more specific information than normal association rules. Rough association rules can also be updated dynamically to produce effective results. The experimental results verify that the proposed approach is promising.
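To make the notion of a premise concrete, the following minimal sketch pairs a set of terms with a weight distribution over them. It is an illustration under stated assumptions, not the algorithm of the paper: the function name `rough_rule_premises`, the grouping of documents that share the same term set, and the use of normalized term frequencies as weights are all hypothetical choices made for the example.

```python
from collections import Counter, defaultdict


def rough_rule_premises(documents):
    """Build (term set, weight distribution) pairs from tokenized documents.

    Assumption for illustration only: documents with the same set of
    distinct terms are merged, and the weights are their pooled term
    frequencies normalized to sum to 1.
    """
    grouped = defaultdict(Counter)
    for tokens in documents:
        counts = Counter(tokens)
        # Documents that share the same distinct terms fall into one group.
        grouped[frozenset(counts)].update(counts)

    premises = []
    for termset, counts in grouped.items():
        total = sum(counts.values())
        weights = {term: counts[term] / total for term in sorted(termset)}
        premises.append((termset, weights))
    return premises


if __name__ == "__main__":
    docs = [
        ["rough", "set", "rough"],
        ["set", "rough"],
        ["web", "mining"],
    ]
    for termset, weights in rough_rule_premises(docs):
        print(sorted(termset), weights)
```

Under these assumptions, two premises can share the same term set yet differ in their weights, which is one way to read the claim that a rough association rule carries more specific information than a rule defined by the term set alone.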
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, Y., Zhong, N. (2006). Mining Rough Association from Text Documents. In: Greco, S., et al. Rough Sets and Current Trends in Computing. RSCTC 2006. Lecture Notes in Computer Science, vol 4259. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11908029_39
DOI: https://doi.org/10.1007/11908029_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-47693-1
Online ISBN: 978-3-540-49842-1
eBook Packages: Computer Science, Computer Science (R0)