The Effectiveness of the Max Entropy Classifier for Feature Selection

  • Martin Schnöll
  • Cornelia Ferner
  • Stefan Wegenkittl
Conference paper

Abstract

Feature selection is the task of systematically reducing the number of input features for a classification task. In natural language processing, basic feature selection is often achieved by removing common stop words. In order to reduce the number of input features more drastically, dedicated feature selection methods such as Mutual Information or Chi-Squared are applied to a count-based input representation. We suggest a task-oriented approach that selects features based on the weights learned by a Max Entropy classifier trained on the classification task. The remaining features can then be used by other classifiers to perform the actual classification. Experiments on different natural language processing tasks confirm that the weight-based method is comparable to count-based methods. The number of input features can be reduced considerably while maintaining the classification performance.

Index Terms

feature selection, natural language processing, maximum entropy classification

Copyright information

© Springer Fachmedien Wiesbaden GmbH, ein Teil von Springer Nature 2019

Authors and Affiliations

  • Martin Schnöll
    • 1
  • Cornelia Ferner
    • 2
  • Stefan Wegenkittl
    • 2
  1. Fact AI GmbH, Salzburg, Austria
  2. Salzburg University of Applied Sciences, Salzburg, Austria