Abstract
Ensemble methods have recently received much attention for their significant improvements in classification accuracy. However, ensemble algorithms do not provide any information about how the final decision is made; that is, they improve classification accuracy at the expense of interpretability. In this chapter, we investigate the possibility of using ensemble methods to generate useful rules that help in understanding the data set as well as the decision. An extensive review of three ensemble algorithms, bagging, boosting, and CHEM, is presented, and a rule generation algorithm based on CHEM is proposed. The proposed rule generation algorithm is illustrated with a real data set.
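For readers who want a concrete picture of the aggregation step the abstract alludes to, the sketch below shows a minimal bagging ensemble in Python: decision trees are fit on bootstrap resamples of the training data and combined by majority vote. This is an illustrative sketch only; scikit-learn, the chosen data set, and all names here are assumptions for the example, not the chapter's code or the proposed CHEM rule generator.

```python
# Minimal bagging sketch (illustrative only; not the chapter's algorithm).
# Assumes scikit-learn is available; the breast-cancer data set is just a
# convenient stand-in for "a real data set".
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Fit each tree on a bootstrap resample (sampling with replacement).
trees = []
for _ in range(25):
    idx = rng.integers(0, len(X_tr), size=len(X_tr))
    trees.append(DecisionTreeClassifier(random_state=0).fit(X_tr[idx], y_tr[idx]))

# Aggregate by majority vote: each tree casts one 0/1 vote per test point.
votes = np.stack([t.predict(X_te) for t in trees])
pred = (votes.mean(axis=0) >= 0.5).astype(int)
print("bagged accuracy:", (pred == y_te).mean())
```

The aggregated vote is typically more accurate than any single tree, but, as the abstract notes, the 25 trees together no longer yield a readable set of rules; recovering such rules from an ensemble is the problem the chapter addresses.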
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this chapter
Kim, Y., Kim, J., Jeon, J. (2004). Ensemble Methods and Rule Generation. In: Intelligent Technologies for Information Analysis. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-07952-2_4
DOI: https://doi.org/10.1007/978-3-662-07952-2_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-07378-6
Online ISBN: 978-3-662-07952-2