Email Categorization with Tournament Methods

Xia, Yunqing; Liu, Wei; Guthrie, Louise

doi:10.1007/11428817_14

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3513)

Included in the following conference series:

International Conference on Application of Natural Language to Information Systems

Abstract

This article proposes tournament methods for email categorization, in which the multi-class categorization process is broken down into a set of binary classification tasks. Two methods, the elimination tournament and the Round Robin tournament, are implemented and applied to classify emails among 15 folders. Extensive experiments compare the effectiveness and robustness of the tournament methods against the n-way classification method. The experimental results show that the tournament methods outperform the n-way method by 11.7% in precision, and that the Round Robin tournament performs slightly better than the elimination tournament on average.
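The two tournament schemes named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `duel` callback (a binary classifier that picks the winner between two folders for a given email), the bye rule for an odd number of folders, the tie-breaking rule, and the toy scores are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Callable, Dict, Sequence

# Hypothetical binary classifier: given two folders, return the one that wins.
Duel = Callable[[str, str], str]


def elimination_tournament(folders: Sequence[str], duel: Duel) -> str:
    """Single elimination: winners advance until one folder remains.

    Assumes `folders` is non-empty. An unpaired last folder gets a bye
    (an assumption; the abstract does not specify the pairing rule).
    """
    remaining = list(folders)
    while len(remaining) > 1:
        next_round = []
        for i in range(0, len(remaining) - 1, 2):
            next_round.append(duel(remaining[i], remaining[i + 1]))
        if len(remaining) % 2 == 1:
            next_round.append(remaining[-1])  # bye for the unpaired folder
        remaining = next_round
    return remaining[0]


def round_robin_tournament(folders: Sequence[str], duel: Duel) -> str:
    """Round Robin: every pair of folders competes once; most wins takes it.

    Ties are broken by the order of `folders` (an assumption).
    """
    wins: Dict[str, int] = {folder: 0 for folder in folders}
    for a, b in combinations(folders, 2):
        wins[duel(a, b)] += 1
    return max(folders, key=lambda folder: wins[folder])


# Toy stand-in for a trained pairwise classifier: fixed per-folder scores.
toy_scores = {"work": 0.2, "family": 0.7, "spam": 0.1}


def toy_duel(a: str, b: str) -> str:
    return a if toy_scores[a] >= toy_scores[b] else b


folders = list(toy_scores)
print(elimination_tournament(folders, toy_duel))   # family
print(round_robin_tournament(folders, toy_duel))   # family
```

With N folders, this elimination scheme runs N - 1 binary decisions per email, while Round Robin runs N(N - 1)/2. For the 15 folders mentioned in the abstract, that is 14 versus 105.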

Author information

Authors and Affiliations

Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, Hong Kong
Yunqing Xia
NLP Research Group, Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, Sheffield, S10 4DP
Wei Liu & Louise Guthrie

Editor information

Editors and Affiliations

Department of Software and Computing Systems, University of Alicante, Spain
Andrés Montoyo
Grupo de investigación del Procesamiento del Lenguaje y Sistemas de Información, Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante, Alicante, Spain
Rafael Muñoz
Lab. CEDRIC, CNAM, Paris, France
Elisabeth Métais

About this paper

Cite this paper

Xia, Y., Liu, W., Guthrie, L. (2005). Email Categorization with Tournament Methods. In: Montoyo, A., Muñoz, R., Métais, E. (eds) Natural Language Processing and Information Systems. NLDB 2005. Lecture Notes in Computer Science, vol 3513. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11428817_14

DOI: https://doi.org/10.1007/11428817_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26031-8
Online ISBN: 978-3-540-32110-1
eBook Packages: Computer Science, Computer Science (R0)
