Abstract
We improve a high-accuracy maximum entropy classifier by combining an ensemble of such classifiers with neural network voting. Our experiments demonstrate significantly better performance than both a single classifier and the traditional weighted-sum voting approach. Specifically, we apply this method to maximum entropy classifiers on a large-scale multi-class text categorization task: the online job directory Flipdog, with over half a million jobs in 65 categories.
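The voting scheme described above can be sketched as a stacked ensemble: each base classifier emits a class-probability vector, and a small neural network learns to combine them. The following is a minimal illustrative sketch under assumed conditions, not the paper's implementation; the simulated "base classifier" outputs, network size, and training schedule are all hypothetical stand-ins.

```python
# Hedged sketch of neural-network voting over an ensemble of classifiers.
# All data here is simulated; real inputs would be MaxEnt probability outputs.
import numpy as np

rng = np.random.default_rng(0)
n_classes, n_members, n_samples = 3, 4, 500

# Simulated base-classifier outputs: each ensemble member's probability
# vector is a noisy, renormalized version of the true one-hot label.
y = rng.integers(0, n_classes, n_samples)
one_hot = np.eye(n_classes)[y]
member_probs = [
    np.abs(one_hot + rng.normal(0, 0.8, one_hot.shape)) for _ in range(n_members)
]
member_probs = [p / p.sum(axis=1, keepdims=True) for p in member_probs]
X = np.hstack(member_probs)  # shape: (n_samples, n_members * n_classes)

# One-hidden-layer voting network trained with softmax cross-entropy
# via plain batch gradient descent (sizes and rates are illustrative).
hidden, lr = 16, 0.5
W1 = rng.normal(0, 0.1, (X.shape[1], hidden)); b1 = np.zeros(hidden)
W2 = rng.normal(0, 0.1, (hidden, n_classes)); b2 = np.zeros(n_classes)
for _ in range(300):
    h = np.tanh(X @ W1 + b1)                      # hidden activations
    logits = h @ W2 + b2
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)             # softmax probabilities
    g = (p - one_hot) / n_samples                 # dLoss/dlogits
    W2 -= lr * h.T @ g; b2 -= lr * g.sum(axis=0)
    gh = (g @ W2.T) * (1 - h ** 2)                # backprop through tanh
    W1 -= lr * X.T @ gh; b1 -= lr * gh.sum(axis=0)

voted = p.argmax(axis=1)                          # ensemble decision
baseline = member_probs[0].argmax(axis=1)         # single-classifier decision
print("single member accuracy:", (baseline == y).mean())
print("voting network accuracy:", (voted == y).mean())
```

In contrast to weighted-sum voting, which learns one scalar weight per classifier, the hidden layer here can in principle learn class-dependent and nonlinear combinations of the members' outputs.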
© 2002 Springer-Verlag Berlin Heidelberg
Koehn, P. (2002). Combining Multiclass Maximum Entropy Text Classifiers with Neural Network Voting. In: Ranchhod, E., Mamede, N.J. (eds) Advances in Natural Language Processing. PorTAL 2002. Lecture Notes in Computer Science, vol 2389. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45433-0_19
Print ISBN: 978-3-540-43829-8
Online ISBN: 978-3-540-45433-5