
Disturbing Neighbors Diversity for Decision Forests

Chapter

Part of the book series: Studies in Computational Intelligence (SCI, volume 245)

Abstract

Ensemble methods take their output from a set of base predictors. The ensemble accuracy depends on two factors: the accuracy of the base classifiers and their diversity (how different the base classifiers' outputs are from each other). This paper presents an approach for increasing the diversity of the base classifiers. The method builds new features that are added to the training dataset of each base classifier. These features are computed using a Nearest Neighbor (NN) classifier built from a few randomly selected instances. The NN classifier returns: (i) an indicator pointing to the nearest neighbor and (ii) the class this nearest neighbor predicts for the instance. We tested this idea using decision trees as base classifiers. An experimental validation on 62 UCI datasets is provided for traditional ensemble methods, showing that ensemble accuracy and base-classifier diversity are usually improved.
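The following is a minimal sketch, in Python with scikit-learn, of how such a feature construction could look. It assumes Euclidean distance, numeric class labels, and a standard decision tree as the base learner; the class name DisturbingNeighborsTree, the parameter m, and the helper _dn_features are illustrative names, not taken from the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class DisturbingNeighborsTree:
    """Sketch: decision tree trained on data augmented with features derived
    from a 1-NN classifier built on m randomly selected instances.
    Assumes numeric features and numeric class labels."""

    def __init__(self, m=10, random_state=None):
        self.m = m  # number of randomly selected instances acting as neighbors
        self.rng = np.random.default_rng(random_state)

    def _dn_features(self, X):
        # Euclidean distance from every instance to each of the m selected instances
        d = np.linalg.norm(X[:, None, :] - self.neighbors_[None, :, :], axis=2)
        nearest = d.argmin(axis=1)
        # (i) one-hot indicator pointing to the nearest of the m instances
        indicator = np.eye(self.m)[nearest]
        # (ii) class that the 1-NN built on those m instances predicts
        nn_class = self.neighbor_labels_[nearest].reshape(-1, 1)
        return np.hstack([X, indicator, nn_class])

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        idx = self.rng.choice(len(X), size=self.m, replace=False)
        self.neighbors_, self.neighbor_labels_ = X[idx], y[idx]
        self.tree_ = DecisionTreeClassifier().fit(self._dn_features(X), y)
        return self

    def predict(self, X):
        return self.tree_.predict(self._dn_features(np.asarray(X, dtype=float)))
```

In an ensemble, each base tree would receive its own random selection of the m instances, so the added features (and hence the trees) differ from one member to the next; the members' predictions are then combined, e.g. by majority vote.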




Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Maudes, J., Rodríguez, J.J., García-Osorio, C. (2009). Disturbing Neighbors Diversity for Decision Forests. In: Okun, O., Valentini, G. (eds) Applications of Supervised and Unsupervised Ensemble Methods. Studies in Computational Intelligence, vol 245. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03999-7_7


  • DOI: https://doi.org/10.1007/978-3-642-03999-7_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03998-0

  • Online ISBN: 978-3-642-03999-7

  • eBook Packages: Engineering
