An Improvement of Text Association Classification Using Rules Weights

Chen, Xiao-Yun; Chen, Yi; Li, Rong-Lu; Hu, Yun-Fa

doi:10.1007/11527503_43

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3584)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications

Abstract

Categorization methods based on association rules have recently received much attention. In general, association classification achieves high accuracy and good performance. However, its classification accuracy drops rapidly when the feature words are unevenly distributed in the training set. This paper therefore proposes a text categorization algorithm, Weighted Association Rules Categorization (WARC). In this method, association rules are used to classify the training samples, and rule intensity is defined according to the number of misclassified training samples. Each strong rule is multiplied by a factor less than 1 to reduce its weight, while each weak rule is multiplied by a factor greater than 1 to increase its weight. The results show that regulating rule weights in this way can remarkably improve the accuracy of association classification algorithms.
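
The weight regulation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact definition of rule intensity, so the sketch assumes that a rule is "strong" when it fires for the wrongly predicted category of misclassified training samples more often than for their true category, and "weak" in the opposite case. The names `Rule`, `classify`, `adjust_weights` and the factors `down=0.8` and `up=1.25` are hypothetical choices made only for this example.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    terms: frozenset      # antecedent: feature words that must all occur
    label: str            # consequent: the category the rule votes for
    weight: float = 1.0   # rule weight, regulated on the training set


def classify(rules, doc_terms):
    """Return the category with the largest summed weight of matching rules."""
    totals = {}
    for r in rules:
        if r.terms <= doc_terms:
            totals[r.label] = totals.get(r.label, 0.0) + r.weight
    return max(totals, key=totals.get) if totals else None


def adjust_weights(rules, training, down=0.8, up=1.25):
    """One pass of weight regulation (assumed scheme, see lead-in).

    For every misclassified training sample, count the matching rules that
    voted for the wrong predicted category (too strong) and those that voted
    for the true category (too weak). Then scale each rule once.
    """
    over = [0] * len(rules)
    under = [0] * len(rules)
    for doc_terms, true_label in training:
        predicted = classify(rules, doc_terms)
        if predicted == true_label:
            continue
        for i, r in enumerate(rules):
            if not r.terms <= doc_terms:
                continue
            if r.label == predicted:
                over[i] += 1
            elif r.label == true_label:
                under[i] += 1
    for i, r in enumerate(rules):
        if over[i] > under[i]:
            r.weight *= down   # strong rule: factor less than 1
        elif under[i] > over[i]:
            r.weight *= up     # weak rule: factor greater than 1
    return rules


# Toy data (invented for illustration only).
rules = [
    Rule(frozenset({"game"}), "sports", 1.5),
    Rule(frozenset({"stock"}), "finance", 1.0),
]
training = [
    (frozenset({"game", "team"}), "sports"),
    (frozenset({"game", "stock", "price"}), "finance"),
]

doc = frozenset({"game", "stock", "price"})
before = classify(rules, doc)
adjust_weights(rules, training)
after = classify(rules, doc)
print(before, after)  # prints: sports finance
```

In this toy run the over-weighted "game" rule is scaled down and the "stock" rule is scaled up, so the second training sample moves from the wrong category to the right one. A practical version would probably repeat the pass until the training error stops improving, which is again an assumption rather than something the abstract states.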

Author information

Authors and Affiliations

School of Mathematics and Computer Science, Fuzhou University, Fuzhou, 350002, China
Xiao-Yun Chen
Department of Computer and Information Technology, Fudan University, Shanghai, 200433, China
Xiao-Yun Chen, Yi Chen, Rong-Lu Li & Yun-Fa Hu

Editor information

Editors and Affiliations

School of Information Technology and Electrical Engineering, The University of Queensland, 4072, Brisbane, Queensland, Australia
Xue Li
The State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 430072, Wuhan, China
Shuliang Wang
School of ITEE, The Univ of Queensland, St. Lucia, 4072, QLD, Australia
Zhao Yang Dong

About this paper

Cite this paper

Chen, XY., Chen, Y., Li, RL., Hu, YF. (2005). An Improvement of Text Association Classification Using Rules Weights. In: Li, X., Wang, S., Dong, Z.Y. (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture Notes in Computer Science, vol 3584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527503_43

DOI: https://doi.org/10.1007/11527503_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27894-8
Online ISBN: 978-3-540-31877-4
eBook Packages: Computer Science, Computer Science (R0)
