Preprocessing by a Cost-Sensitive Literal Reduction Algorithm: Reduce

  • N. Lavrac
  • D. Gamberger
  • P. Turney
Conference paper
Part of the International Centre for Mechanical Sciences book series (CISM, volume 382)

Abstract

This study is concerned with whether it is possible to detect what information contained in the training data and background knowledge is relevant for solving the learning problem, and whether irrelevant information can be eliminated in preprocessing before starting the learning process. A case study of data preprocessing for a hybrid genetic algorithm shows that the elimination of irrelevant features can substantially improve the efficiency of learning. In addition, cost-sensitive feature elimination can be effective for reducing costs of induced hypotheses.

Keywords

Decision Tree, Average Cost, Feature Reduction, Hybrid Genetic Algorithm, Inductive Logic Programming
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Wien 1997

Authors and Affiliations

  • N. Lavrac (1)
  • D. Gamberger (2)
  • P. Turney (3)
  1. J. Stefan Institute, Ljubljana, Slovenia
  2. R. Boskovic Institute, Zagreb, Croatia
  3. National Research Council Canada, Ottawa, Canada