Computational Optimization and Applications, Volume 34, Issue 2, pp 273–294

A Dual-Objective Evolutionary Algorithm for Rules Extraction in Data Mining

  • K. C. Tan
  • Q. Yu
  • J. H. Ang

Abstract

This paper presents a dual-objective evolutionary algorithm (DOEA) for extracting multiple decision rule lists in data mining, which aims to satisfy the classification criteria of high accuracy and ease of user comprehension. Unlike existing approaches, the algorithm incorporates the concept of Pareto dominance to evolve a set of non-dominated decision rule lists, each having a different classification accuracy and number of rules over a specified range. The classification results of DOEA are analyzed and compared with existing rule-based and non-rule-based classifiers on 8 test problems obtained from the UCI Machine Learning Repository. It is shown that the DOEA produces comprehensible rules with competitive classification accuracy compared with many methods in the literature. Results obtained from box plots and t-tests further examine its invariance to random partitioning of the datasets.
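
To make the Pareto dominance idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes each candidate rule list is summarized by two numbers, classification accuracy (to be maximized) and number of rules (to be minimized). The names `dominates` and `non_dominated` and the sample values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical summary of a rule list: (accuracy, number_of_rules).
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a Pareto-dominates b.

    a dominates b when it is no worse on both objectives
    (accuracy is maximized, rule count is minimized) and
    strictly better on at least one of them.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def non_dominated(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Made-up values for illustration only; not results from the paper.
candidates = [(0.95, 6), (0.93, 3), (0.90, 3), (0.95, 8), (0.85, 1)]
front = non_dominated(candidates)
print(front)
```

In this made-up example, the surviving rule lists trade accuracy against size: none of them can be improved on one objective without getting worse on the other, which is what lets a user pick a rule list at a preferred level of comprehensibility.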

Keywords

data mining, evolutionary algorithm, classification, rules extraction

Copyright information

© Springer Science + Business Media, Inc. 2006

Authors and Affiliations

  1. Department of Electrical and Computer Engineering, National University of Singapore, Singapore