
Disturbing Neighbors Diversity for Decision Forests

Chapter

Part of the book series: Studies in Computational Intelligence (SCI, volume 245)

Abstract

Ensemble methods take their output from a set of base predictors. The ensemble accuracy depends on two factors: the accuracy of the base classifiers and their diversity (how different the base classifiers' outputs are from each other). This paper presents an approach for increasing the diversity of the base classifiers. The method builds new features that are added to the training dataset of each base classifier. These features are computed using a Nearest Neighbor (NN) classifier built from a few randomly selected instances. The NN classifier returns: (i) an indicator pointing to the nearest neighbor and (ii) the class this nearest neighbor predicts for the instance. We tested this idea using decision trees as base classifiers. An experimental validation on 62 UCI datasets is provided for traditional ensemble methods, showing that ensemble accuracy and base-classifier diversity are usually improved.
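The following is a minimal sketch, in Python with scikit-learn, of how such a feature construction could look. It assumes Euclidean distance, numeric class labels, and a standard decision tree as the base learner; the class name DisturbingNeighborsTree, the parameter m, and the helper _dn_features are illustrative names, not taken from the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class DisturbingNeighborsTree:
    """Sketch: decision tree trained on data augmented with features derived
    from a 1-NN classifier built on m randomly selected instances.
    Assumes numeric features and numeric class labels."""

    def __init__(self, m=10, random_state=None):
        self.m = m  # number of randomly selected instances acting as neighbors
        self.rng = np.random.default_rng(random_state)

    def _dn_features(self, X):
        # Euclidean distance from every instance to each of the m selected instances
        d = np.linalg.norm(X[:, None, :] - self.neighbors_[None, :, :], axis=2)
        nearest = d.argmin(axis=1)
        # (i) one-hot indicator pointing to the nearest of the m instances
        indicator = np.eye(self.m)[nearest]
        # (ii) class that the 1-NN built on those m instances predicts
        nn_class = self.neighbor_labels_[nearest].reshape(-1, 1)
        return np.hstack([X, indicator, nn_class])

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        idx = self.rng.choice(len(X), size=self.m, replace=False)
        self.neighbors_, self.neighbor_labels_ = X[idx], y[idx]
        self.tree_ = DecisionTreeClassifier().fit(self._dn_features(X), y)
        return self

    def predict(self, X):
        return self.tree_.predict(self._dn_features(np.asarray(X, dtype=float)))
```

In an ensemble, each base tree would receive its own random selection of the m instances, so the added features (and hence the trees) differ from one member to the next; the members' predictions are then combined, e.g. by majority vote.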




Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Maudes, J., Rodríguez, J.J., García-Osorio, C. (2009). Disturbing Neighbors Diversity for Decision Forests. In: Okun, O., Valentini, G. (eds) Applications of Supervised and Unsupervised Ensemble Methods. Studies in Computational Intelligence, vol 245. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03999-7_7


  • DOI: https://doi.org/10.1007/978-3-642-03999-7_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03998-0

  • Online ISBN: 978-3-642-03999-7

  • eBook Packages: Engineering
