Combining Multiple Classifiers Using Dempster’s Rule of Combination for Text Categorization

  • Yaxin Bi
  • David Bell
  • Hui Wang
  • Gongde Guo
  • Kieran Greer
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3131)

Abstract

In this paper, we investigate the combination of four classification methods for text categorization using Dempster’s rule of combination. The methods are the Support Vector Machine (SVM), k-nearest neighbours (kNN), the kNN model-based approach (kNNM), and Rocchio. We first present an approach for effectively combining the different classification methods. We then apply the methods, individually and in combination, to the 20-newsgroup benchmark data collection. Our experimental results show that the best combination of classifiers achieves 91.07% classification accuracy on the 10 groups of the benchmark data, which is on average 2.68% better than the best individual method, SVM.
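As a rough illustration of Dempster’s rule of combination, the sketch below merges two mass functions defined over the same set of categories. It is a minimal sketch only: how the paper derives mass functions from classifier outputs is not described in this abstract, so the category names, the mass values, and the function name `dempster_combine` are all hypothetical.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of categories
    (a focal element) to its mass; the masses in each dict sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the common subset.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += mass_a * mass_b
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("Sources are in total conflict; the rule is undefined.")
    # Renormalise so the combined masses sum to 1.
    return {subset: mass / norm for subset, mass in combined.items()}


# Hypothetical frame of discernment and made-up masses for two classifiers.
FRAME = frozenset({"sci", "rec", "talk"})
SCI = frozenset({"sci"})
REC = frozenset({"rec"})

m_svm = {SCI: 0.6, REC: 0.1, FRAME: 0.3}  # mass on FRAME models ignorance
m_knn = {SCI: 0.5, REC: 0.2, FRAME: 0.3}

result = dempster_combine(m_svm, m_knn)
best = max(result, key=result.get)
```

In this made-up example both sources favour the same category, so the combined mass on that category is higher than either input assigned to it, while the mass left on the whole frame shrinks.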

Keywords

Mass Function; Test Document; Text Categorization; Multiple Classifier; Individual Classifier
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Yaxin Bi (1)
  • David Bell (1)
  • Hui Wang (2)
  • Gongde Guo (2)
  • Kieran Greer (2)

  1. School of Computer Science, Queen’s University of Belfast, Belfast, UK
  2. School of Computing and Mathematics, University of Ulster, Newtownabbey, Co. Antrim, UK
