Abstract
We describe a new way to handle feature selection in which boosting is used to assess the relevance of feature subsets. In the context of wrapper models, accuracy is replaced as the performance function by a particular exponential criterion, the one usually optimized by boosting algorithms. A first experimental study shows the relevance of our approach. However, this new "boosted" strategy requires building many learners at each step, leading to high computational costs.
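To make the criterion concrete, here is a minimal sketch, not the authors' code, of scoring a candidate feature subset with the exponential loss minimized by AdaBoost. The weak learners are one-level decision stumps, and the function name exponential_criterion, the number of rounds T, and all variable names are illustrative assumptions.

import numpy as np


def exponential_criterion(X, y, subset, T=20):
    """Run T rounds of AdaBoost on the columns in `subset` and return the
    exponential loss (1/n) * sum_i exp(-y_i * f(x_i)), where f is the additive
    combination of the weak hypotheses.  Labels y must be in {-1, +1}."""
    n = X.shape[0]
    Xs = X[:, subset]                        # restrict data to the candidate subset
    D = np.full(n, 1.0 / n)                  # instance distribution
    f = np.zeros(n)                          # additive score of the ensemble

    for _ in range(T):
        # Weak learner: best decision stump (feature, threshold, sign)
        # under the current distribution D.
        best_err, best_pred = 0.5, None
        for j in range(Xs.shape[1]):
            for thr in np.unique(Xs[:, j]):
                for sign in (1, -1):
                    pred = np.where(Xs[:, j] <= thr, sign, -sign)
                    err = np.sum(D[pred != y])
                    if err < best_err:
                        best_err, best_pred = err, pred
        if best_pred is None:                # no stump better than chance: stop early
            break
        eps = max(best_err, 1e-10)
        alpha = 0.5 * np.log((1.0 - eps) / eps)
        f += alpha * best_pred
        D *= np.exp(-alpha * y * best_pred)  # exponential re-weighting of instances
        D /= D.sum()

    return np.mean(np.exp(-y * f))           # the criterion to be minimized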
In a second part, we therefore focus on speeding up boosting convergence to reduce this complexity. We propose a new update of the instance distribution, which is the core of a boosting algorithm. We exploit this result to implement a new forward selection algorithm that converges much faster by using overbiased distributions over the learning instances. The speed-up comes from reducing the number of weak hypotheses needed when many identical observations are shared by different classes. A second experimental study on the UCI repository shows significant speed improvements with our new update, without altering the selected feature subsets.
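Below is a minimal sketch of the surrounding wrapper loop, again with illustrative names. It is plain greedy forward selection driven by a subset score such as exponential_criterion above; it does not reproduce the paper's accelerated, overbiased distribution update, which is only described in the full text.

import numpy as np


def forward_selection(X, y, score, max_features=None):
    """Greedily add the feature whose inclusion most decreases `score`.
    `score(X, y, subset)` returns a value to minimize (e.g. the exponential loss)."""
    n_features = X.shape[1]
    max_features = max_features or n_features
    selected = []
    best_score = np.inf

    while len(selected) < max_features:
        candidates = [j for j in range(n_features) if j not in selected]
        if not candidates:
            break
        trials = [(score(X, y, selected + [j]), j) for j in candidates]
        new_score, best_j = min(trials)
        if new_score >= best_score:          # no further improvement: stop
            break
        best_score = new_score
        selected = selected + [best_j]

    return selected, best_score


# Example use (with the sketch above):
#   subset, loss = forward_selection(X, y, exponential_criterion)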
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
Cite this paper
Sebban, M., Nock, R. (1999). Contribution of Boosting in Wrapper Models. In: Żytkow, J.M., Rauch, J. (eds.) Principles of Data Mining and Knowledge Discovery. PKDD 1999. Lecture Notes in Computer Science, vol. 1704. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-48247-5_23
DOI: https://doi.org/10.1007/978-3-540-48247-5_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66490-1
Online ISBN: 978-3-540-48247-5