The Good, the Bad and the Incorrectly Classified: Profiling Cases for Case-Base Editing

Delany, Sarah Jane

doi:10.1007/978-3-642-02998-1_11

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5650)

Abstract

Case-based approaches to classification, as instance-based learning techniques, have a particular reliance on training examples that other supervised learning techniques do not have. In this paper we present the RDCL case profiling technique that categorises each case in a case-base based on its classification by the case-base, the benefit it has and/or the damage it causes by its inclusion in the case-base. We show how these case profiles can identify the cases that should be removed from a case-base in order to improve generalisation accuracy and we show what aspects of existing noise reduction algorithms contribute to good performance and what do not.

Author information

Authors and Affiliations

Dublin Institute of Technology, Dublin, Ireland
Sarah Jane Delany

Editor information

Editors and Affiliations

Adaptive Information Cluster, School of Computer Science & Informatics, University College Dublin, Computer Science Building, Belfield, Dublin 4, Ireland
Lorraine McGinty
Department of Software and Information Systems, College of Computing and Informatics, University of North Carolina at Charlotte, 9201 University City Boulevard, Charlotte, NC 28223-0001, USA
David C. Wilson

Cite this paper

Delany, S.J. (2009). The Good, the Bad and the Incorrectly Classified: Profiling Cases for Case-Base Editing. In: McGinty, L., Wilson, D.C. (eds) Case-Based Reasoning Research and Development. ICCBR 2009. Lecture Notes in Computer Science, vol 5650. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02998-1_11

DOI: https://doi.org/10.1007/978-3-642-02998-1_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02997-4
Online ISBN: 978-3-642-02998-1
eBook Packages: Computer Science, Computer Science (R0)