Abstract
Case-based approaches to classification, as instance-based learning techniques, rely on training examples in a way that other supervised learning techniques do not. In this paper we present the RDCL case profiling technique, which categorises each case in a case-base according to how it is classified by the case-base and the benefit and/or damage that its inclusion in the case-base causes. We show how these case profiles can identify the cases that should be removed from a case-base to improve generalisation accuracy, and we show which aspects of existing noise reduction algorithms contribute to good performance and which do not.
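The abstract does not spell out how a profile is computed, so the following is only a minimal sketch of the general idea, not the paper's RDCL definition. It assumes a leave-one-out 1-nearest-neighbour classifier with Euclidean distance, and the names `CaseProfile`, `nearest_neighbour` and `profile_cases` are hypothetical. Each case is given three flags that mirror the three criteria named above: how the rest of the case-base classifies it, whether it helps classify another case correctly, and whether it causes another case to be misclassified.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class CaseProfile:
    # Hypothetical flags; the paper's own profile categories may differ.
    correctly_classified: bool  # classified correctly by the rest of the case-base
    beneficial: bool            # helps classify at least one other case correctly
    damaging: bool              # causes at least one other case to be misclassified


def nearest_neighbour(index, cases):
    """Return the index of the closest other case (leave-one-out 1-NN)."""
    features = cases[index][0]
    others = (j for j in range(len(cases)) if j != index)
    return min(others, key=lambda j: dist(features, cases[j][0]))


def profile_cases(cases):
    """cases is a list of (feature_tuple, label); returns one CaseProfile per case."""
    profiles = [CaseProfile(False, False, False) for _ in cases]
    for i, (_, label) in enumerate(cases):
        j = nearest_neighbour(i, cases)
        if cases[j][1] == label:
            # Case j classifies case i correctly, so j is of benefit.
            profiles[i].correctly_classified = True
            profiles[j].beneficial = True
        else:
            # Case j causes case i to be misclassified, so j does damage.
            profiles[j].damaging = True
    return profiles


# Toy case-base: three "a" cases, one "b" case placed among them, two "b" cases.
cases = [((0.0,), "a"), ((1.0,), "a"), ((2.0,), "a"),
         ((1.1,), "b"), ((5.0,), "b"), ((6.0,), "b")]
for case, profile in zip(cases, profile_cases(cases)):
    print(case, profile)
```

In this toy case-base the "b" case at 1.1 is misclassified, benefits no other case, and damages its neighbours, which is the kind of case an editing step would consider removing. The "a" case at 1.0 is also flagged as damaging, but only because it misclassifies that suspect case, which suggests why damage alone may be too blunt a removal criterion.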
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Delany, S.J. (2009). The Good, the Bad and the Incorrectly Classified: Profiling Cases for Case-Base Editing. In: McGinty, L., Wilson, D.C. (eds) Case-Based Reasoning Research and Development. ICCBR 2009. Lecture Notes in Computer Science(), vol 5650. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02998-1_11
DOI: https://doi.org/10.1007/978-3-642-02998-1_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02997-4
Online ISBN: 978-3-642-02998-1
eBook Packages: Computer Science (R0)