The Effect of Training Sample Size on Performance of Mass Detection

Kallenberg, Michiel; Karssemeijer, Nico

doi:10.1007/978-3-540-70538-3_48

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5116)

Included in the following conference series:

International Workshop on Digital Mammography

Abstract

In the development of a computer-aided detection (CAD) system, a large database of training samples is of major importance. However, because obtaining training samples can be costly, it is useful to evaluate the effect of training sample size on the performance of mass detection and classification. In this paper we investigate the effect of both the number of masses and the number of normals in the training database. In particular, we are interested in the performance of the CAD system operating at high specificity. We use a combination of databases comprising over 5000 cases. Each mammogram is classified multiple times, using neural networks trained with different numbers of training samples. To measure performance, free-response receiver operating characteristic (FROC) curves are computed. The mean sensitivity in the interval between 0.05 and 0.5 false positive (FP) marks/image is taken as the performance measure. Performance was found to increase steadily as masses were added to the training database; even with 555 mass cases a plateau was not yet reached. For normal cases, however, a large number was not needed: maximal performance was reached with around 700 cases. These results show that optimal training requires a large number of malignant cases, whereas the number of normal cases has less influence.
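
The performance measure named in the abstract, mean sensitivity over the interval from 0.05 to 0.5 FP marks/image, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the FROC curve is interpolated or whether the average is taken on a linear or a logarithmic FP axis, so the piecewise-linear interpolation, the trapezoidal integration, and the `log_scale` switch below are all assumptions. The function names and the operating points in the usage lines are hypothetical.

```python
import math
from bisect import bisect_right


def interpolate(x, xs, ys):
    """Linearly interpolate y at x on a piecewise-linear curve (xs ascending)."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)  # guarantees xs[i - 1] <= x < xs[i]
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def mean_sensitivity(fp_rates, sensitivities, lo=0.05, hi=0.5,
                     log_scale=False, steps=1000):
    """Mean sensitivity of an FROC curve over [lo, hi] FP marks/image.

    fp_rates must be sorted in ascending order. With log_scale=True the
    average is taken over log(FP) instead of FP (an assumed alternative).
    """
    to_axis = math.log if log_scale else (lambda v: v)
    from_axis = math.exp if log_scale else (lambda v: v)
    a, b = to_axis(lo), to_axis(hi)
    h = (b - a) / steps
    # Sample the curve at evenly spaced points on the chosen axis.
    samples = [
        interpolate(from_axis(a + k * h), fp_rates, sensitivities)
        for k in range(steps + 1)
    ]
    # Trapezoidal rule; dividing the area by the interval width gives the mean.
    area = h * (sum(samples) - 0.5 * (samples[0] + samples[-1]))
    return area / (b - a)


# Hypothetical FROC operating points, for illustration only (not from the paper).
fp = [0.01, 0.05, 0.1, 0.2, 0.5, 1.0]
sens = [0.30, 0.50, 0.60, 0.70, 0.80, 0.85]
score = mean_sensitivity(fp, sens)
```

Restricting the average to low FP rates matches the abstract's stated interest in operation at high specificity: differences between classifiers at several false positives per image do not affect the score.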

Author information

Authors and Affiliations

Department of Radiology, Radboud University Nijmegen Medical Centre, Geert Grooteplein Zuid 18, 6525 GA, Nijmegen, The Netherlands
Michiel Kallenberg & Nico Karssemeijer

Editor information

Elizabeth A. Krupinski

About this paper

Cite this paper

Kallenberg, M., Karssemeijer, N. (2008). The Effect of Training Sample Size on Performance of Mass Detection. In: Krupinski, E.A. (eds) Digital Mammography. IWDM 2008. Lecture Notes in Computer Science, vol 5116. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70538-3_48

DOI: https://doi.org/10.1007/978-3-540-70538-3_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70537-6
Online ISBN: 978-3-540-70538-3
eBook Packages: Computer Science, Computer Science (R0)
