Abstract
In this paper we examine a way of combining two or more different classification methods for text categorization. The methods we have been experimenting with include the Support Vector Machine (SVM), k-nearest neighbours (kNN), the kNN model-based approach (kNNM), and Rocchio. We then describe our method for combining these classifiers. A previous study suggested that combining the best and the second-best classifiers using evidential operations [1] can achieve better performance than other combinations. We assess some aspects of this finding from an evidential reasoning perspective and suggest a refinement of the approach.
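As a rough illustration of what an evidential combination of two classifiers can look like, the sketch below applies Dempster's rule of combination to two mass functions. It is a minimal example under stated assumptions, not the chapter's own procedure: the function name `dempster_combine`, the category labels, and the mass values are all hypothetical, and the way classifier scores are turned into mass functions is not shown.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function is a dict mapping a frozenset of category labels
    (a focal element) to its mass; the masses are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence supports the intersection of the two sets.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("The two mass functions are in total conflict.")
    # Renormalise so that the combined masses sum to 1.
    return {s: v / (1.0 - conflict) for s, v in combined.items()}


# Hypothetical frame of discernment (category labels are illustrative only).
frame = frozenset({"earn", "acq", "grain"})

# Hypothetical mass functions for two classifiers; the values are made up.
m_svm = {frozenset({"earn"}): 0.6, frozenset({"acq"}): 0.1, frame: 0.3}
m_knn = {frozenset({"earn"}): 0.5, frozenset({"grain"}): 0.2, frame: 0.3}

combined = dempster_combine(m_svm, m_knn)
best = max(combined, key=combined.get)
print(sorted(best), round(combined[best], 4))
```

In this sketch, mass left on the whole frame represents each classifier's ignorance, and mass assigned to contradictory categories is discarded through the normalisation step.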
References
Bi, Y., Bell, D., Wang, H., Guo, G. and Greer, K. Combining Classification Decisions for Text Categorization: An Experimental Study. 15th International Conference on Database and Expert Systems Applications (DEXA’04), Lecture Notes in Computer Science, Springer-Verlag, pp. 222–231, 2004.
Sebastiani, F. (2002). Machine Learning in Automated Text Categorization. ACM Computing Surveys, Vol. 34(1).
Larkey, L.S. and Croft, W.B. (1996) Combining classifiers in text categorization. In Proceedings of SIGIR-96, 19th ACM International Conference on Research and Development in Information Retrieval, pp. 289–297.
Li, Y.H. and Jain, A.K. (1998). Classification of Text Documents. The Computer Journal, Vol 41(8), pp. 537–546.
Yang, Y., Ault, T. and Pierce, T. (2000). Combining multiple learning strategies for effective cross validation. The Seventeenth International Conference on Machine Learning (ICML’00), pp. 1167–1182.
Bi, Y., Bell, D., Wang, H., Guo, G. and Greer, K. Combining Multiple Classifiers Using Dempster’s Rule of Combination for Text Categorization. Proceedings of Modelling Decision for Artificial Intelligence Conference. Lecture Notes in Artificial Intelligence, Springer-Verlag, pp. 127–138, 2004.
Bell, D., Guan, J. and Bi, Y. On Combining Classifier Mass Functions for Text Categorisation. IEEE Transactions on Knowledge and Data Engineering (to appear).
Ittner, D. J., Lewis, D. D. and Ahn, D. D. (1995). Text categorization of low quality images. In Symposium on Document Analysis and Information Retrieval, pp. 301–315.
Yang, Y. (2001). A study on thresholding strategies for text categorization. Proceedings of ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR’01), pp. 137–145.
Guo, G., Wang, H., Bell, D., Bi, Y. and Greer, K. (2003). kNN model-based approach in classification. Cooperative Information Systems (CoopIS) International Conference. Lecture Notes in Computer Science, pp. 986–996.
Joachims, T. (1997). A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization. The Fourteenth International Conference on Machine Learning (ICML’97).
Chang, C. C. and Lin, C. J. (2001). LIBSVM: a library for support vector machines (http://www.csie.ntu.edu.tw/~cjlin/libsvm).
Guan, J. and Bell, D. A. (1991). Evidence Theory and its Applications. North-Holland.
Mitchell, T. (1997). Machine Learning. McGraw-Hill.
Shafer, G. (1976). A Mathematical Theory of Evidence, Princeton University Press, Princeton, New Jersey.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bell, D.A., Guan, J.W., Bi, Y.X. (2005). An Evidential Approach to Classification Combination for Text Categorisation. In: Sirmakessis, S. (eds) Knowledge Mining. Studies in Fuzziness and Soft Computing, vol 185. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-32394-5_2
DOI: https://doi.org/10.1007/3-540-32394-5_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25070-8
Online ISBN: 978-3-540-32394-5
eBook Packages: Engineering, Engineering (R0)