Classifier-independent feature selection for two-stage feature selection

  • Mineichi Kudo
  • Jack Sklansky
Feature Selection and Extraction
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1451)


The effectiveness of classifier-independent feature selection is described. The aim is to remove garbage features and to improve the classification accuracy of all the practical classifiers compared with the situation where all the given features are used. Two algorithms of classifier-independent feature selection and two other conventional classifier-specific algorithms are compared on three sets of real data. In addition, two-stage feature selection is proposed.
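The abstract does not spell out the two algorithms or the two-stage procedure, so the following is only a minimal sketch of the general idea, under stated assumptions: stage one scores each feature without reference to any classifier and discards low-scoring ("garbage") features, and stage two runs a classifier-specific search over the survivors. The Fisher-style score, the nearest-mean classifier, the greedy forward search, the threshold, the toy data, and all function names are illustrative choices, not the authors' method.

```python
from statistics import mean, pvariance


def fisher_score(values_a, values_b):
    """Assumed classifier-independent score: squared mean gap over summed variance."""
    spread = pvariance(values_a) + pvariance(values_b)
    gap = (mean(values_a) - mean(values_b)) ** 2
    if spread == 0:
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def stage_one(X, y, threshold):
    """Stage 1: keep feature indices whose score reaches the (hypothetical) threshold."""
    keep = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        if fisher_score(a, b) >= threshold:
            keep.append(j)
    return keep


def sq_dist(row, centroid, features):
    """Squared Euclidean distance restricted to the chosen features."""
    return sum((row[j] - m) ** 2 for j, m in zip(features, centroid))


def nearest_mean_accuracy(X, y, features):
    """Resubstitution accuracy of a nearest-mean classifier on the chosen features."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [row for row, l in zip(X, y) if l == label]
        centroids[label] = [mean(r[j] for r in rows) for j in features]
    correct = 0
    for row, label in zip(X, y):
        predicted = min(centroids, key=lambda l: sq_dist(row, centroids[l], features))
        if predicted == label:
            correct += 1
    return correct / len(X)


def stage_two(X, y, candidates):
    """Stage 2: greedy forward selection, scored by the specific classifier."""
    selected, best = [], 0.0
    remaining = list(candidates)
    while remaining:
        scored = [(nearest_mean_accuracy(X, y, selected + [j]), j) for j in remaining]
        # Prefer higher accuracy; break ties in favour of the lower feature index.
        acc, j = max(scored, key=lambda t: (t[0], -t[1]))
        if acc <= best:
            break  # no candidate improves accuracy, so stop
        selected.append(j)
        remaining.remove(j)
        best = acc
    return selected, best


# Toy two-class data (made up for illustration): feature 1 carries no class information.
X = [
    [0.0, 5.0, 1.0],
    [0.2, 1.0, 1.2],
    [0.1, 3.0, 0.8],
    [1.0, 4.0, 1.9],
    [1.2, 2.0, 2.1],
    [1.1, 3.0, 2.0],
]
y = [0, 0, 0, 1, 1, 1]

survivors = stage_one(X, y, threshold=1.0)
selected, accuracy = stage_two(X, y, survivors)
```

On this toy data, stage one drops feature 1 and keeps features 0 and 2; stage two then settles on feature 0 alone, since adding feature 2 does not raise the nearest-mean accuracy. The point of the split is that the first stage is computed once and can be shared by any classifier, while the second stage is repeated per classifier on a smaller candidate set.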


Keywords: Feature Selection, Classification Accuracy, Feature Subset, Discrimination Rate, Correct Recognition Rate



Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Mineichi Kudo (1)
  • Jack Sklansky (2)
  1. Division of Systems and Information Engineering, Graduate School of Engineering, Hokkaido University, Sapporo, Japan
  2. Department of Electrical Engineering, University of California, Irvine, USA
