
A Scalable Boosting Learner Using Adaptive Sampling

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9384)

Abstract

Sampling is an important technique for parameter estimation and hypothesis testing, widely used in statistical analysis, machine learning, and knowledge discovery. It is particularly useful in data mining when the training data set is huge. In this paper, we present a new sampling-based method for learning by boosting. We show how to utilize the adaptive sampling method of [2] for estimating classifier accuracy when building an efficient ensemble learning method by boosting. We provide a preliminary theoretical analysis of the proposed sampling-based boosting method. Empirical studies on four datasets from the UC Irvine Machine Learning Repository show that our method typically uses a much smaller sample size (and is thus much more efficient) while maintaining prediction accuracy competitive with the sampling-based boosting learner MadaBoost of Domingo and Watanabe [6].
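
The abstract describes the approach only at a high level, so the following is a minimal, hypothetical sketch rather than the authors' algorithm: it estimates the weighted advantage of a weak hypothesis by drawing training examples one at a time, in proportion to the current boosting weights, and stopping as soon as a Hoeffding-style confidence bound resolves the estimate. The function name `adaptive_advantage_estimate`, the parameters `delta` and `eps_min`, and the particular bound are illustrative assumptions; the stopping rule of [2] used in the paper may differ.

```python
import math
import random


def adaptive_advantage_estimate(hypothesis, data, labels, weights,
                                delta=0.05, eps_min=1e-3):
    """Sequentially estimate the weighted advantage (accuracy - 1/2) of
    `hypothesis`, sampling examples in proportion to the boosting weights
    and stopping early once a Hoeffding-style bound resolves the sign of
    the advantage (or the bound falls below `eps_min`).

    Illustrative sketch only; not the exact rule from the paper or [2].
    """
    n = len(data)
    correct, seen = 0, 0
    while True:
        # draw one example index with probability proportional to its weight
        i = random.choices(range(n), weights=weights, k=1)[0]
        seen += 1
        if hypothesis(data[i]) == labels[i]:
            correct += 1
        # anytime-valid Hoeffding half-width (union bound over sample sizes)
        eps = math.sqrt(math.log(2.0 * seen * (seen + 1) / delta) / (2.0 * seen))
        adv = correct / seen - 0.5
        # stop once the sign of the advantage is statistically resolved,
        # or the interval is already tighter than the requested precision
        if abs(adv) > eps or eps <= eps_min:
            return adv, seen


if __name__ == "__main__":
    # toy usage: a threshold stump on 1-D data with uniform boosting weights
    data = [random.gauss(0, 1) for _ in range(10000)]
    labels = [1 if x > 0 else -1 for x in data]
    stump = lambda x: 1 if x > 0.1 else -1
    adv, used = adaptive_advantage_estimate(stump, data, labels, [1.0] * len(data))
    print(f"estimated advantage {adv:.3f} after {used} samples")
```

In a boosting round, an advantage estimate gamma obtained this way could set the weak hypothesis weight in the usual AdaBoost/MadaBoost style, e.g. alpha = 0.5 * ln((0.5 + gamma) / (0.5 - gamma)), so that a cheap but reliable estimate translates directly into fewer examined examples per round.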


References

  1. Chernoff, H.: A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Statist. 23, 493–507 (1952)

  2. Chen, J., Chen, X.: A new method for adaptive sequential sampling for learning and parameter estimation. In: Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 220–229, Warsaw, Poland, June 2011

  3. Chen, J.: Scalable ensemble learning by adaptive sampling. In: Proceedings of the International Conference on Machine Learning and Applications (ICMLA), pp. 622–625, December 2012

  4. Chen, X.: A new framework of multistage estimation. arXiv:0809.1241v20 [math.ST]

  5. Chen, X.: A new framework of multistage parametric inference. In: Proceedings of the SPIE Conference, Orlando, Florida, USA, April 2010

  6. Domingo, C., Watanabe, O.: Scaling up a boosting-based learner via adaptive sampling. In: Terano, T., Liu, H., Chen, A.L.P. (eds.) PAKDD 2000. LNCS, vol. 1805, pp. 317–328. Springer, Heidelberg (2000)

  7. Domingo, C., Watanabe, O.: Adaptive sampling methods for scaling up knowledge discovery algorithms. In: Proceedings of the 2nd International Conference on Discovery Science, Japan, December 1999

  8. Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55(1), 119–139 (1997)

  9. Ghosh, M., Mukhopadhyay, N., Sen, P.K.: Sequential Estimation. Wiley, New York (1997)

  10. Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58, 13–30 (1963)

  11. Lipton, R., Naughton, J., Schneider, D.A., Seshadri, S.: Efficient sampling strategies for relational database operations. Theoret. Comput. Sci. 116, 195–226 (1993)

  12. Lipton, R., Naughton, J.: Query size estimation by adaptive sampling. J. Comput. Syst. Sci. 51, 18–25 (1995)

  13. Lynch, J.F.: Analysis and application of adaptive sampling. J. Comput. Syst. Sci. 66, 2–19 (2003)

  14. Mnih, V., Szepesvári, C., Audibert, J.-Y.: Empirical Bernstein stopping. In: Proceedings of the 25th International Conference on Machine Learning (ICML 2008), Helsinki, Finland, pp. 672–679 (2008)

  15. Watanabe, O.: Simple sampling techniques for discovery sciences. IEICE Trans. Inf. Syst. E83-D(1), 19–26 (2000)

  16. Watanabe, O.: Sequential sampling techniques for algorithmic learning theory. Theoret. Comput. Sci. 348, 3–14 (2005)


Author information

Correspondence to Jianhua Chen.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Chen, J., Burleigh, S., Chennupati, N., Gudapati, B.K. (2015). A Scalable Boosting Learner Using Adaptive Sampling. In: Esposito, F., Pivert, O., Hacid, M.-S., Raś, Z.W., Ferilli, S. (eds) Foundations of Intelligent Systems. ISMIS 2015. Lecture Notes in Computer Science, vol 9384. Springer, Cham. https://doi.org/10.1007/978-3-319-25252-0_11


  • DOI: https://doi.org/10.1007/978-3-319-25252-0_11


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25251-3

  • Online ISBN: 978-3-319-25252-0

  • eBook Packages: Computer Science (R0)
