
Abstract

Ensemble methods have recently received much attention for their significant improvements in classification accuracy. However, ensemble algorithms do not provide any information about how the final decision is made; that is, they improve classification accuracy at the expense of interpretability. In this chapter, we investigate the possibility of using ensemble methods to generate useful rules that help in understanding the data set as well as the decision. An extensive review of three ensemble algorithms (bagging, boosting, and CHEM) is presented, and an algorithm for rule generation with CHEM is proposed. The proposed rule generation algorithm is illustrated with a real data set.
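Since the abstract names bagging, boosting, and CHEM without defining them, a minimal sketch of the simplest of the three, bagging, may help fix ideas: each base classifier is fit on a bootstrap resample of the training data, and the ensemble classifies by majority vote. The Python code below is an illustrative sketch under stated assumptions (NumPy feature arrays, integer class labels 0..K-1, scikit-learn decision trees as the base learner); the function name bagging_predict and its parameters are inventions for this example, not the chapter's algorithm.

    # A minimal bagging sketch, assuming numeric NumPy arrays and integer
    # class labels 0..K-1. bagging_predict and its defaults are illustrative
    # names, not the chapter's implementation.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def bagging_predict(X_train, y_train, X_test, n_estimators=25, seed=0):
        """Fit trees on bootstrap resamples; predict by majority vote."""
        rng = np.random.default_rng(seed)
        n = len(X_train)
        votes = []
        for _ in range(n_estimators):
            # Bootstrap: draw n training points with replacement.
            idx = rng.integers(0, n, size=n)
            tree = DecisionTreeClassifier().fit(X_train[idx], y_train[idx])
            votes.append(tree.predict(X_test))
        # Majority vote over the ensemble for each test point.
        votes = np.stack(votes)  # shape (n_estimators, n_test)
        return np.array([np.bincount(col).argmax() for col in votes.T])

On a public data set such as scikit-learn's breast cancer data, bagging_predict(X[:400], y[:400], X[400:]) will typically vote its way to higher accuracy than a single tree, but the combined vote of many trees is no longer readable as a rule set, which is exactly the accuracy-versus-interpretability trade-off the abstract describes.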




Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Kim, Y., Kim, J., Jeon, J. (2004). Ensemble Methods and Rule Generation. In: Intelligent Technologies for Information Analysis. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-07952-2_4


  • DOI: https://doi.org/10.1007/978-3-662-07952-2_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-07378-6

  • Online ISBN: 978-3-662-07952-2

  • eBook Packages: Springer Book Archive
