
A Scalable Boosting Learner Using Adaptive Sampling

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9384)

Abstract

Sampling is an important technique for parameter estimation and hypothesis testing, widely used in statistical analysis, machine learning, and knowledge discovery. It is particularly useful in data mining when the training data set is huge. In this paper, we present a new sampling-based method for learning by boosting. We show how to utilize the adaptive sampling method of [2] for estimating classifier accuracy when building an efficient ensemble learning method by boosting. We provide a preliminary theoretical analysis of the proposed sampling-based boosting method. Empirical studies on four datasets from the UC Irvine Machine Learning Repository show that our method typically uses a much smaller sample size (and is thus much more efficient) while maintaining prediction accuracy competitive with the sampling-based boosting learner MadaBoost of Domingo and Watanabe [6].
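
The abstract describes the approach only at a high level, so the following is a minimal, hypothetical sketch rather than the authors' algorithm: it estimates the weighted advantage of a weak hypothesis by drawing training examples one at a time, in proportion to the current boosting weights, and stopping as soon as a Hoeffding-style confidence bound resolves the estimate. The function name `adaptive_advantage_estimate`, the parameters `delta` and `eps_min`, and the particular bound are illustrative assumptions; the stopping rule of [2] used in the paper may differ.

```python
import math
import random


def adaptive_advantage_estimate(hypothesis, data, labels, weights,
                                delta=0.05, eps_min=1e-3):
    """Sequentially estimate the weighted advantage (accuracy - 1/2) of
    `hypothesis`, sampling examples in proportion to the boosting weights
    and stopping early once a Hoeffding-style bound resolves the sign of
    the advantage (or the bound falls below `eps_min`).

    Illustrative sketch only; not the exact rule from the paper or [2].
    """
    n = len(data)
    correct, seen = 0, 0
    while True:
        # draw one example index with probability proportional to its weight
        i = random.choices(range(n), weights=weights, k=1)[0]
        seen += 1
        if hypothesis(data[i]) == labels[i]:
            correct += 1
        # anytime-valid Hoeffding half-width (union bound over sample sizes)
        eps = math.sqrt(math.log(2.0 * seen * (seen + 1) / delta) / (2.0 * seen))
        adv = correct / seen - 0.5
        # stop once the sign of the advantage is statistically resolved,
        # or the interval is already tighter than the requested precision
        if abs(adv) > eps or eps <= eps_min:
            return adv, seen


if __name__ == "__main__":
    # toy usage: a threshold stump on 1-D data with uniform boosting weights
    data = [random.gauss(0, 1) for _ in range(10000)]
    labels = [1 if x > 0 else -1 for x in data]
    stump = lambda x: 1 if x > 0.1 else -1
    adv, used = adaptive_advantage_estimate(stump, data, labels, [1.0] * len(data))
    print(f"estimated advantage {adv:.3f} after {used} samples")
```

In a boosting round, an advantage estimate gamma obtained this way could set the weak hypothesis weight in the usual AdaBoost/MadaBoost style, e.g. alpha = 0.5 * ln((0.5 + gamma) / (0.5 - gamma)), so that a cheap but reliable estimate translates directly into fewer examined examples per round.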


References

  1. Chernoff, H.: A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. Ann. Math. Statist. 23, 493–507 (1952)

  2. Chen, J., Chen, X.: A new method for adaptive sequential sampling for learning and parameter estimation. In: Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 220–229, Warsaw, Poland, June 2011

  3. Chen, J.: Scalable ensemble learning by adaptive sampling. In: Proceedings of the International Conference on Machine Learning and Applications (ICMLA), pp. 622–625, December 2012

  4. Chen, X.: A new framework of multistage estimation. arXiv:0809.1241v20 [math.ST]

  5. Chen, X.: A new framework of multistage parametric inference. In: Proceedings of the SPIE Conference, Orlando, Florida, USA, April 2010

  6. Domingo, C., Watanabe, O.: Scaling up a boosting-based learner via adaptive sampling. In: Terano, T., Liu, H., Chen, A.L.P. (eds.) PAKDD 2000. LNCS, vol. 1805, pp. 317–328. Springer, Heidelberg (2000)

  7. Domingo, C., Watanabe, O.: Adaptive sampling methods for scaling up knowledge discovery algorithms. In: Proceedings of the 2nd International Conference on Discovery Science, Japan, December 1999

  8. Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55(1), 119–139 (1997)

  9. Ghosh, M., Mukhopadhyay, N., Sen, P.K.: Sequential Estimation. Wiley, New York (1997)

  10. Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58, 13–30 (1963)

  11. Lipton, R., Naughton, J., Schneider, D.A., Seshadri, S.: Efficient sampling strategies for relational database operations. Theoret. Comput. Sci. 116, 195–226 (1993)

  12. Lipton, R., Naughton, J.: Query size estimation by adaptive sampling. J. Comput. Syst. Sci. 51, 18–25 (1995)

  13. Lynch, J.F.: Analysis and application of adaptive sampling. J. Comput. Syst. Sci. 66, 2–19 (2003)

  14. Mnih, V., Szepesvári, C., Audibert, J.-Y.: Empirical Bernstein stopping. In: Proceedings of the 25th International Conference on Machine Learning (ICML 2008), Helsinki, Finland, pp. 672–679 (2008)

  15. Watanabe, O.: Simple sampling techniques for discovery sciences. IEICE Trans. Inf. Syst. E83-D(1), 19–26 (2000)

  16. Watanabe, O.: Sequential sampling techniques for algorithmic learning theory. Theoret. Comput. Sci. 348, 3–14 (2005)


Author information

Correspondence to Jianhua Chen.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Chen, J., Burleigh, S., Chennupati, N., Gudapati, B.K. (2015). A Scalable Boosting Learner Using Adaptive Sampling. In: Esposito, F., Pivert, O., Hacid, M.-S., Raś, Z.W., Ferilli, S. (eds) Foundations of Intelligent Systems. ISMIS 2015. Lecture Notes in Computer Science, vol 9384. Springer, Cham. https://doi.org/10.1007/978-3-319-25252-0_11


  • DOI: https://doi.org/10.1007/978-3-319-25252-0_11


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25251-3

  • Online ISBN: 978-3-319-25252-0

  • eBook Packages: Computer Science (R0)
