Abstract
Ensemble methods have recently received much attention for their significant improvements in classification accuracy. However, ensemble algorithms do not provide any information about how the final decision is made; that is, they improve classification accuracy at the expense of interpretability. In this chapter, we investigate the possibility of using ensemble methods to generate useful rules that help in understanding the data set as well as the decision. An extensive review of three ensemble algorithms, bagging, boosting, and CHEM, is presented, and a rule generation algorithm based on CHEM is proposed. The proposed rule generation algorithm is illustrated with a real data set.
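For readers who want a concrete picture of the aggregation step the abstract alludes to, the sketch below shows a minimal bagging ensemble in Python: decision trees are fit on bootstrap resamples of the training data and combined by majority vote. This is an illustrative sketch only; scikit-learn, the chosen data set, and all names here are assumptions for the example, not the chapter's code or the proposed CHEM rule generator.

```python
# Minimal bagging sketch (illustrative only; not the chapter's algorithm).
# Assumes scikit-learn is available; the breast-cancer data set is just a
# convenient stand-in for "a real data set".
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Fit each tree on a bootstrap resample (sampling with replacement).
trees = []
for _ in range(25):
    idx = rng.integers(0, len(X_tr), size=len(X_tr))
    trees.append(DecisionTreeClassifier(random_state=0).fit(X_tr[idx], y_tr[idx]))

# Aggregate by majority vote: each tree casts one 0/1 vote per test point.
votes = np.stack([t.predict(X_te) for t in trees])
pred = (votes.mean(axis=0) >= 0.5).astype(int)
print("bagged accuracy:", (pred == y_te).mean())
```

The aggregated vote is typically more accurate than any single tree, but, as the abstract notes, the 25 trees together no longer yield a readable set of rules; recovering such rules from an ensemble is the problem the chapter addresses.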
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
Cite this chapter
Kim, Y., Kim, J., Jeon, J. (2004). Ensemble Methods and Rule Generation. In: Intelligent Technologies for Information Analysis. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-07952-2_4
DOI: https://doi.org/10.1007/978-3-662-07952-2_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-07378-6
Online ISBN: 978-3-662-07952-2