Finding Rare Classes: Adapting Generative and Discriminative Models in Active Learning

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6635)

Abstract

Discovering rare categories and classifying new instances of them is an important data mining problem in many fields, but fully supervised learning of a rare-class classifier is prohibitively costly. There has therefore been increasing interest both in active discovery, which aims to identify new classes quickly, and in active learning, which aims to train classifiers with minimal supervision. Very few studies have attempted to solve these two interrelated tasks jointly, although they occur together in practice. Optimizing rare class discovery and classification simultaneously with active learning is challenging because the two tasks place conflicting requirements on the query criterion. In this paper we address these issues with two contributions: a unified active learning model that jointly discovers new categories and learns to classify them, and a classifier combination algorithm that switches between generative and discriminative classifiers as learning progresses. Extensive evaluation on several standard datasets demonstrates the superiority of our approach over existing methods.
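The conflict between query criteria mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's model: it assumes two commonly used criteria, lowest likelihood under the current model as a stand-in for discovery and highest posterior entropy as a stand-in for classifier refinement. The function names and the three-point pool are hypothetical, chosen only to show that the two criteria can select different points.

```python
import math

def entropy(probs):
    """Shannon entropy (natural log) of a discrete class posterior."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def pick_for_discovery(likelihoods):
    """Assumed discovery criterion: the point the current model explains worst."""
    return min(range(len(likelihoods)), key=lambda i: likelihoods[i])

def pick_for_classification(posteriors):
    """Assumed refinement criterion: the point with the most uncertain posterior."""
    return max(range(len(posteriors)), key=lambda i: entropy(posteriors[i]))

# Hypothetical pool of three unlabeled points (values are made up).
likelihoods = [0.40, 0.35, 0.001]                   # p(x) under the current model
posteriors = [[0.95, 0.05], [0.5, 0.5], [0.9, 0.1]]  # p(y | x) over known classes

discovery_choice = pick_for_discovery(likelihoods)
classification_choice = pick_for_classification(posteriors)
print(discovery_choice, classification_choice)
```

In this pool the outlier (index 2) is the best discovery query, while the point on the decision boundary (index 1) is the best refinement query, so a single fixed criterion cannot serve both goals on every iteration.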

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hospedales, T.M., Gong, S., Xiang, T. (2011). Finding Rare Classes: Adapting Generative and Discriminative Models in Active Learning. In: Huang, J.Z., Cao, L., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2011. Lecture Notes in Computer Science, vol 6635. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20847-8_25

  • DOI: https://doi.org/10.1007/978-3-642-20847-8_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20846-1

  • Online ISBN: 978-3-642-20847-8

  • eBook Packages: Computer Science (R0)