A Non-ordered Rule Induction Algorithm through Multi-Objective Particle Swarm Optimization: Issues and Applications

  • André B. de Carvalho
  • Aurora Pozo
  • Silvia Vergilio
Part of the Studies in Computational Intelligence book series (SCI, volume 261)

Abstract

Multi-objective metaheuristics enable a completely novel approach to inducing classifiers: the desired properties of the rules are expressed as separate objectives, and the algorithm finds the rules in a single run by exploiting Pareto dominance concepts. Furthermore, the rules can be used as an unordered classifier, which makes them more intuitive and easier to understand because each rule can be interpreted independently of the others. Unlike in traditional rule induction approaches, the dataset is not modified during learning, so the quality of the learned rules is not affected by the learning process. Following this philosophy, this chapter describes a Multi-Objective Particle Swarm Optimization (MOPSO) algorithm. One reason for choosing the Particle Swarm Optimization metaheuristic is its recognized ability to work in numerical domains. This property allows the described algorithm to deal with both numerical and discrete attributes. The algorithm is evaluated using the area under the ROC curve and by comparing the performance of the induced classifiers with that of classifiers obtained with well-known rule induction algorithms. The coverage of the Pareto front produced by the algorithm is also analyzed following a multi-objective methodology. In addition, some application results in the Software Engineering domain are described, specifically in the context of software testing. Software testing is a fundamental Software Engineering activity for quality assurance and is traditionally very expensive. The algorithm is used to induce fault-prediction rules that can help reduce testing effort. The empirical evaluation and the comparison show the effectiveness and scalability of this new approach.
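
The Pareto dominance idea mentioned above can be illustrated with a minimal sketch. It is not the chapter's algorithm: it assumes, purely for illustration, that each rule is scored on two objectives to be maximized (here called sensitivity and specificity), and the rule names and scores are hypothetical. It only shows how a set of non-dominated rules can be filtered from candidates.

```python
from typing import List, Tuple

# Hypothetical rule record: (name, sensitivity, specificity).
# The two objectives are an assumption made for this sketch.
Rule = Tuple[str, float, float]


def dominates(a: Rule, b: Rule) -> bool:
    """Return True if rule a is at least as good as b on both
    objectives and strictly better on at least one."""
    at_least_as_good = a[1] >= b[1] and a[2] >= b[2]
    strictly_better = a[1] > b[1] or a[2] > b[2]
    return at_least_as_good and strictly_better


def pareto_front(rules: List[Rule]) -> List[Rule]:
    """Keep only the rules that no other rule dominates."""
    return [r for r in rules if not any(dominates(o, r) for o in rules)]


# Illustrative candidate rules with made-up scores.
rules: List[Rule] = [
    ("r1", 0.90, 0.40),
    ("r2", 0.70, 0.70),
    ("r3", 0.60, 0.65),  # dominated by r2 on both objectives
    ("r4", 0.30, 0.95),
]

front = pareto_front(rules)
print([name for name, _, _ in front])
```

Because every rule on the front is kept on its own merits, the resulting set can be read as an unordered collection, with no rule depending on the position of another.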

Keywords

Area Under Curve; Pareto Front; Rule Induction; Discrete Attribute; Rule Induction Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • André B. de Carvalho (1)
  • Aurora Pozo (1)
  • Silvia Vergilio (1)
  1. Computer Sciences Department, Federal University of Paraná, Curitiba, Brazil
