Probabilistic Active Learning: Towards Combining Versatility, Optimality and Efficiency

Krempl, Georg; Kottke, Daniel; Spiliopoulou, Myra

doi:10.1007/978-3-319-11812-3_15


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8777)

Included in the following conference series:

International Conference on Discovery Science


Abstract

Mining data with minimal annotation costs requires efficient active learning approaches that ideally select the optimal candidate for labelling under a user-specified classification performance measure. Common generic approaches, which are usable with any classifier and any performance measure, are either slow, like error reduction, or heuristic, like uncertainty sampling. In contrast, our Probabilistic Active Learning (PAL) approach offers versatility, direct optimisation of a performance measure, and computational efficiency. Given a labelling candidate from a pool, PAL models both the candidate’s label and the true posterior in its neighbourhood as random variables. By computing the expectation of the gain in classification performance over both random variables, PAL then selects the candidate that, in expectation, will improve the classification performance the most. Extending our recent poster, we discuss the properties of PAL and perform a thorough experimental evaluation on several synthetic and real-world data sets of different sizes. Results show comparable or better classification performance than error reduction and uncertainty sampling, yet PAL has the same asymptotic time complexity as uncertainty sampling and is faster than error reduction.
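
The selection principle described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions that the abstract does not state: a binary problem, accuracy as the performance measure, a neighbourhood summarised by a label count n and an observed positive share p_hat, and a Beta(n*p_hat + 1, n*(1 - p_hat) + 1) distribution over the true posterior. All function names are hypothetical, and any further factors the full method may use are omitted.

```python
import math


def beta_pdf(p, a, b):
    """Density of Beta(a, b) at p, via log-gamma for numerical stability."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))


def accuracy(p_true, p_hat):
    """Local accuracy when the classifier predicts the majority class of p_hat
    and the true positive posterior is p_true (ties count as a coin flip)."""
    if p_hat > 0.5:
        return p_true
    if p_hat < 0.5:
        return 1.0 - p_true
    return 0.5


def expected_gain(n, p_hat, steps=2000):
    """Expected accuracy gain from one more label in the neighbourhood.

    The expectation runs over the true posterior p (assumed Beta-distributed)
    and over the candidate's label y (Bernoulli with parameter p)."""
    pos = n * p_hat                      # observed positives so far
    a, b = pos + 1.0, n - pos + 1.0      # assumed Beta parameters
    total = 0.0
    for i in range(steps):               # midpoint rule on (0, 1)
        p = (i + 0.5) / steps
        before = accuracy(p, p_hat)
        after_pos = accuracy(p, (pos + 1) / (n + 1))   # label turns out positive
        after_neg = accuracy(p, pos / (n + 1))         # label turns out negative
        gain = p * (after_pos - before) + (1 - p) * (after_neg - before)
        total += beta_pdf(p, a, b) * gain / steps
    return total


def select_candidate(candidates):
    """Return the index of the (n, p_hat) pair with the largest expected gain."""
    gains = [expected_gain(n, p_hat) for n, p_hat in candidates]
    return max(range(len(gains)), key=gains.__getitem__)
```

Under these assumptions, a candidate in an unlabelled neighbourhood (n = 0) has an expected gain of about 1/6, a candidate with two conflicting labels (n = 2, p_hat = 0.5) about 0.1, and a candidate surrounded by ten identical labels has a gain of zero, so `select_candidate([(10, 1.0), (2, 0.5), (0, 0.5)])` picks the last entry.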



Author information

Authors and Affiliations

Knowledge Management and Discovery Lab, University Magdeburg, Germany
Georg Krempl, Daniel Kottke & Myra Spiliopoulou


Editor information

Editors and Affiliations

Department of Knowledge Technologies, Jožef Stefan Institute, Jamova cesta 39, 1000, Ljubljana, Slovenia
Sašo Džeroski, Panče Panov & Dragi Kocev
Faculty of Administration, University of Ljubljana, Gosarjeva 5, 1000, Ljubljana, Slovenia
Ljupčo Todorovski


About this paper

Cite this paper

Krempl, G., Kottke, D., Spiliopoulou, M. (2014). Probabilistic Active Learning: Towards Combining Versatility, Optimality and Efficiency. In: Džeroski, S., Panov, P., Kocev, D., Todorovski, L. (eds) Discovery Science. DS 2014. Lecture Notes in Computer Science, vol 8777. Springer, Cham. https://doi.org/10.1007/978-3-319-11812-3_15


DOI: https://doi.org/10.1007/978-3-319-11812-3_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-11811-6
Online ISBN: 978-3-319-11812-3
eBook Packages: Computer Science, Computer Science (R0)
