
The Good, the Bad and the Incorrectly Classified: Profiling Cases for Case-Base Editing

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5650)

Abstract

Case-based approaches to classification, as instance-based learning techniques, have a particular reliance on their training examples that other supervised learning techniques do not. In this paper we present the RDCL case profiling technique, which categorises each case in a case-base according to whether the case-base classifies it correctly and to the benefit it contributes and/or the damage it causes by its inclusion in the case-base. We show how these case profiles can identify the cases that should be removed from a case-base in order to improve generalisation accuracy, and we show which aspects of existing noise reduction algorithms contribute to good performance and which do not.
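The abstract describes the profiling only at a high level. The sketch below, in Python, illustrates one way the bookkeeping it mentions could be computed for a nearest-neighbour case-base: whether the rest of the case-base classifies each case correctly, and how many other cases that case helps (benefit) or hurts (damage) by being present. The 1-NN set-up, the function names, and the benefit/damage counting are illustrative assumptions for this sketch, not the paper's exact RDCL definitions.

```python
# Illustrative sketch only: profile each case by (a) whether the rest of the
# case-base classifies it correctly and (b) the benefit/damage its presence
# causes to the classification of the other cases. Not the paper's exact
# RDCL procedure; a hypothetical 1-NN reading of the abstract.

from collections import namedtuple

Case = namedtuple("Case", ["features", "label"])
Profile = namedtuple("Profile", ["classified_correctly", "benefit", "damage"])


def distance(a, b):
    # Euclidean distance over numeric feature vectors of equal length.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def nearest_label(target, case_base):
    # 1-NN prediction for `target` using `case_base` (which must not contain it).
    nearest = min(case_base, key=lambda c: distance(c.features, target.features))
    return nearest.label


def profile_case_base(case_base):
    profiles = {}
    for i, case in enumerate(case_base):
        others = case_base[:i] + case_base[i + 1:]

        # (a) Is this case classified correctly by the rest of the case-base?
        correct = nearest_label(case, others) == case.label

        # (b) For every other case, does including `case` turn its
        # classification from wrong to right (benefit) or right to wrong (damage)?
        benefit = damage = 0
        for j, other in enumerate(others):
            rest = [c for k, c in enumerate(others) if k != j]
            with_case = nearest_label(other, rest + [case]) == other.label
            without_case = nearest_label(other, rest) == other.label
            if with_case and not without_case:
                benefit += 1
            elif without_case and not with_case:
                damage += 1

        profiles[i] = Profile(correct, benefit, damage)
    return profiles
```

Under this illustrative reading, cases that the rest of the case-base misclassifies and that cause damage without compensating benefit would be natural deletion candidates, which is the kind of removal the abstract says the case profiles are used to identify.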





Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Delany, S.J. (2009). The Good, the Bad and the Incorrectly Classified: Profiling Cases for Case-Base Editing. In: McGinty, L., Wilson, D.C. (eds) Case-Based Reasoning Research and Development. ICCBR 2009. Lecture Notes in Computer Science (LNAI), vol. 5650. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02998-1_11


  • DOI: https://doi.org/10.1007/978-3-642-02998-1_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-02997-4

  • Online ISBN: 978-3-642-02998-1

  • eBook Packages: Computer Science, Computer Science (R0)
