An Improvement of Text Association Classification Using Rules Weights

  • Conference paper
Advanced Data Mining and Applications (ADMA 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3584)

Abstract

Recently, categorization methods based on association rules have received much attention. In general, association classification achieves high accuracy and good performance. However, its classification accuracy drops rapidly when the distribution of feature words in the training set is uneven. This paper therefore proposes a text categorization algorithm, Weighted Association Rules Categorization (WARC). In this method, association rules are used to classify the training samples, and rule intensity is defined according to the number of misclassified training samples. Each strong rule is multiplied by a factor less than 1 to reduce its weight, while each weak rule is multiplied by a factor greater than 1 to increase its weight. Experimental results show that adjusting rule weights in this way remarkably improves the accuracy of association classification algorithms.
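
The weight adjustment described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the exact definition of rule intensity or the factor values, so the sketch assumes, purely for illustration, that a rule counts as "strong" when it misclassifies at least `threshold` training samples and as "weak" otherwise. The names `Rule`, `count_misclassified`, `adjust_weights`, `classify`, and the parameters `threshold`, `down`, and `up` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    terms: frozenset      # feature words that must all appear in a document
    label: str            # category predicted when the rule fires
    weight: float = 1.0   # initial weight (assumed, not from the paper)


def count_misclassified(rule, samples):
    """Count training samples the rule fires on but labels incorrectly."""
    return sum(1 for words, label in samples
               if rule.terms <= words and label != rule.label)


def adjust_weights(rules, samples, threshold=1, down=0.5, up=1.5):
    """Scale rule weights: factor < 1 for strong rules, > 1 for weak rules.

    Assumption: "strong" means at least `threshold` misclassified samples.
    """
    for rule in rules:
        errors = count_misclassified(rule, samples)
        rule.weight *= down if errors >= threshold else up
    return rules


def classify(rules, words):
    """Pick the category with the largest total weight of matching rules."""
    scores = {}
    for rule in rules:
        if rule.terms <= words:
            scores[rule.label] = scores.get(rule.label, 0.0) + rule.weight
    return max(scores, key=scores.get) if scores else None


if __name__ == "__main__":
    # Toy training set: (set of feature words, category).
    samples = [
        ({"ball", "goal"}, "sports"),
        ({"ball", "market"}, "finance"),
        ({"market", "stock"}, "finance"),
    ]
    rules = [
        Rule(frozenset({"ball"}), "sports"),
        Rule(frozenset({"market"}), "finance"),
    ]
    adjust_weights(rules, samples)
    print(classify(rules, {"ball", "market"}))  # prints "finance"
```

In this toy example the rule on "ball" misclassifies one training sample, so its weight is reduced to 0.5, while the rule on "market" makes no errors and its weight rises to 1.5; the document containing both words is then assigned to "finance" rather than ending in a tie.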

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, XY., Chen, Y., Li, RL., Hu, YF. (2005). An Improvement of Text Association Classification Using Rules Weights. In: Li, X., Wang, S., Dong, Z.Y. (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture Notes in Computer Science, vol 3584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527503_43

  • DOI: https://doi.org/10.1007/11527503_43

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-27894-8

  • Online ISBN: 978-3-540-31877-4

  • eBook Packages: Computer Science, Computer Science (R0)