Abstract
In the development of a computer-aided detection (CAD) system, a large database of training samples is of major importance. However, because obtaining training samples can be costly, it is useful to evaluate how training sample size affects the performance of mass detection and classification. In this paper we investigate the effect of both the number of masses and the number of normal cases in the training database, with particular interest in the performance of the CAD system at high specificity. We use a combination of databases comprising over 5000 cases. Each mammogram is classified multiple times, using neural networks trained with different numbers of training samples. To measure performance, free-response receiver operating characteristic (FROC) curves are computed, and the mean sensitivity in the interval between 0.05 and 0.5 false positive (FP) marks per image is taken as the performance measure. Performance increased steadily as masses were added to the training database; even with 555 mass cases a plateau had not been reached. For normal cases, however, a large number was not needed: maximal performance was reached with around 700 cases. These results indicate that optimal training requires a large number of malignant cases, whereas the number of normal cases has less influence.
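The performance measure described in the abstract, mean sensitivity over the interval from 0.05 to 0.5 FP marks per image, can be illustrated with a short sketch. This is not the authors' implementation: the function name `mean_sensitivity`, the representation of the FROC curve as a list of (FP/image, sensitivity) points, the linear interpolation between points, and the averaging on a linear FP axis are all assumptions made for illustration. The operating points in the usage example are made up and are not results from the paper.

```python
def mean_sensitivity(froc_points, fp_min=0.05, fp_max=0.5):
    """Average sensitivity of an FROC curve over [fp_min, fp_max] FP marks/image.

    froc_points: iterable of (fp_per_image, sensitivity) pairs (hypothetical format).
    The curve is assumed piecewise linear between the given points, and the
    average is taken on a linear FP axis (both are assumptions of this sketch).
    """
    if fp_max <= fp_min:
        raise ValueError("fp_max must be larger than fp_min")
    pts = sorted(froc_points)
    if not pts or pts[0][0] > fp_min or pts[-1][0] < fp_max:
        raise ValueError("FROC curve does not cover the requested interval")

    def sens_at(x):
        # Linear interpolation between the two neighbouring operating points.
        for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
            if x0 <= x <= x1:
                if x1 == x0:
                    return y0
                return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
        raise ValueError("x lies outside the FROC curve")

    # Breakpoints: interval ends plus every operating point strictly inside.
    xs = [fp_min] + [x for x, _ in pts if fp_min < x < fp_max] + [fp_max]
    ys = [sens_at(x) for x in xs]

    # Trapezoidal rule, exact for a piecewise-linear curve.
    area = sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2
               for i in range(len(xs) - 1))
    return area / (fp_max - fp_min)


if __name__ == "__main__":
    # Made-up operating points, for illustration only.
    example_curve = [(0.01, 0.30), (0.05, 0.50), (0.1, 0.60),
                     (0.25, 0.72), (0.5, 0.80), (1.0, 0.86)]
    print(f"mean sensitivity: {mean_sensitivity(example_curve):.3f}")
```

Computing this single number for networks trained on progressively larger subsets would give one point per training-set size, which is the kind of comparison the abstract summarises.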
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kallenberg, M., Karssemeijer, N. (2008). The Effect of Training Sample Size on Performance of Mass Detection. In: Krupinski, E.A. (eds) Digital Mammography. IWDM 2008. Lecture Notes in Computer Science, vol 5116. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70538-3_48
DOI: https://doi.org/10.1007/978-3-540-70538-3_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70537-6
Online ISBN: 978-3-540-70538-3
eBook Packages: Computer Science (R0)