Abstract
Developing a Computer-Assisted Detection (CAD) system for the automatic diagnosis of pulmonary nodules in thoracic CT is a highly challenging research area in the medical domain. It requires the successful application of sophisticated, state-of-the-art image processing and pattern recognition technologies. The object recognition and feature extraction phase of such a system generates a huge, imbalanced training set, as is the case in many learning problems in the medical domain. The performance of concept learning systems is traditionally assessed by the percentage of test examples classified correctly, termed accuracy. This measure becomes inappropriate for imbalanced training sets such as this one, where non-nodule (negative) examples outnumber nodule (positive) examples. This paper introduces a mechanism developed for filtering negative examples in the training set so as to remove ‘obvious’ ones, and discusses alternative evaluation criteria.
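The point about accuracy can be illustrated with a small worked example. The sketch below is not taken from the paper: the class counts (10 nodules among 1000 candidates) are hypothetical, the helper names `confusion_counts` and `evaluate` are made up for illustration, and the geometric mean of sensitivity and specificity is only one commonly used alternative criterion, assumed here because the abstract does not name the criteria it discusses.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, fn, tn, fp) for binary labels: 1 = nodule, 0 = non-nodule."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp


def evaluate(y_true, y_pred):
    """Compute accuracy alongside per-class rates and their geometric mean."""
    tp, fn, tn, fp = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "g_mean": g_mean,
    }


# Hypothetical imbalanced set: 10 positives (nodules), 990 negatives.
y_true = [1] * 10 + [0] * 990

# A trivial classifier that labels every candidate as a non-nodule.
always_negative = [0] * len(y_true)

result = evaluate(y_true, always_negative)
print(result)
```

Under these assumed counts, the trivial classifier scores high on accuracy while detecting no nodules at all, whereas a criterion that combines the per-class rates drops to zero. This is the kind of discrepancy that motivates evaluating on something other than accuracy.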
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dehmeshki, J., Karaköy, M., Casique, M.V. (2003). A Rule-Based Scheme for Filtering Examples from Majority Class in an Imbalanced Training Set. In: Perner, P., Rosenfeld, A. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2003. Lecture Notes in Computer Science, vol 2734. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45065-3_19
DOI: https://doi.org/10.1007/3-540-45065-3_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40504-7
Online ISBN: 978-3-540-45065-8
eBook Packages: Springer Book Archive