On-Line Classification of Data Streams with Missing Values Based on Reinforcement Learning

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2011)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6669)

Abstract

In some applications, data arrive sequentially and are not available in batch form, which makes traditional classification systems difficult to use. In addition, some attributes may be missing because of real-world conditions. In this setting, a number of decisions have to be made about how to handle each incomplete and unlabeled incoming object: how to estimate its missing attribute values, how to classify it, whether to include it in the training set, and when to ask an expert for its class label. Unfortunately, no single decision works well for all data sets. This data dependency motivates our formulation of the problem in terms of elements of reinforcement learning. To the best of our knowledge, the application of this learning paradigm to this problem is novel. The empirical results are encouraging: the proposed framework performs better and more generally than many strategies used in isolation, and makes efficient use of human effort (requests to an expert for the class label) and computer memory (growth of the training set).
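The abstract does not spell out the states, actions, or rewards used in the paper, so the following is only an illustrative sketch of the general idea: an agent that learns, from feedback, which handling strategy to apply to incoming objects. It assumes a stateless epsilon-greedy action-value learner, and the names `StrategySelector`, `impute_mean`, the action labels, and the reward values are all hypothetical, not taken from the paper.

```python
import random


class StrategySelector:
    """Epsilon-greedy action-value learner over handling strategies.

    Hypothetical illustration only: the paper's actual formulation
    (states, actions, rewards) is not given in the abstract.
    """

    def __init__(self, actions, epsilon=0.1, step_size=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon          # probability of exploring
        self.step_size = step_size      # constant learning rate
        self.q = {a: 0.0 for a in self.actions}  # estimated action values
        self.rng = random.Random(seed)

    def choose(self):
        """Pick a random action with probability epsilon, else a greedy one."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        best = max(self.q.values())
        # Break ties among equally valued actions at random.
        return self.rng.choice([a for a in self.actions if self.q[a] == best])

    def update(self, action, reward):
        """Move the estimate for `action` a fraction of the way toward `reward`."""
        self.q[action] += self.step_size * (reward - self.q[action])


def impute_mean(x, means):
    """Replace missing attributes (None) with the given per-attribute means."""
    return [m if v is None else v for v, m in zip(x, means)]


# Assumed action set: classify after mean imputation, or query the expert.
selector = StrategySelector(["impute_and_classify", "ask_expert"], epsilon=0.0)
selector.update("impute_and_classify", 1.0)   # assumed reward for a correct label
selector.update("ask_expert", -0.2)           # assumed cost of a label request
chosen = selector.choose()
filled = impute_mean([1.0, None], [0.5, 2.0])
```

In this sketch a negative reward for `ask_expert` stands in for the cost of human effort mentioned in the abstract. A fuller version would also need actions for whether to add the object to the training set, which is omitted here.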

This work has been supported in part by the Spanish Ministry of Education and Science under grants CSD2007–00018 (Consolider Ingenio 2010) and TIN2009–14205, and by Bancaixa under grant P1–1B2009–04.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Millán-Giraldo, M., Traver, V.J., Sánchez, J.S. (2011). On-Line Classification of Data Streams with Missing Values Based on Reinforcement Learning. In: Vitrià, J., Sanches, J.M., Hernández, M. (eds) Pattern Recognition and Image Analysis. IbPRIA 2011. Lecture Notes in Computer Science, vol 6669. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21257-4_44

  • DOI: https://doi.org/10.1007/978-3-642-21257-4_44

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21256-7

  • Online ISBN: 978-3-642-21257-4

  • eBook Packages: Computer Science, Computer Science (R0)
