The Effect of Training Sample Size on Performance of Mass Detection

  • Conference paper
Digital Mammography (IWDM 2008)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5116)

Abstract

In the development of a computer-aided detection (CAD) system, a large database of training samples is of major importance. However, because obtaining training samples may be costly, it is useful to evaluate the effect of training sample size on the performance of mass detection and classification. In this paper we investigate the effect of the number of masses, as well as the number of normals, in the training database. In particular, we are interested in the performance of the CAD system when operating at high specificity. We use a combination of databases comprising over 5000 cases. Each mammogram is classified multiple times, using neural networks trained with different numbers of training samples. To measure performance, free-response receiver operating characteristic (FROC) curves are computed. The mean sensitivity in the interval between 0.05 and 0.5 false positive (FP) marks/image is taken as the performance measure. We found that performance increases steadily as masses are added to the training database; even with 555 mass cases a plateau was not yet reached. For normal cases, however, a large number was not needed: maximal performance was reached with around 700 cases. These results show that optimal training requires a large number of malignant cases, whereas the number of normal cases has less influence.
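The performance measure described in the abstract, mean sensitivity between 0.05 and 0.5 FP marks/image, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `interp` and `mean_sensitivity` are hypothetical, the FROC points are made up, and averaging on a linear FP axis is an assumption, since the abstract does not say whether the mean is taken on a linear or a logarithmic scale.

```python
from bisect import bisect_left


def interp(x, xs, ys):
    """Linearly interpolate y at x; xs must be strictly increasing.

    Values outside the range of xs are clamped to the end points.
    """
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, x)  # first index with xs[i] >= x, so i >= 1 here
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def mean_sensitivity(fp_per_image, sensitivity, lo=0.05, hi=0.5, n=200):
    """Mean sensitivity of an FROC curve over [lo, hi] FP marks/image.

    Assumption: the mean is taken on a linear FP axis, using the
    trapezoidal rule on n equal sub-intervals.
    """
    step = (hi - lo) / n
    grid = [lo + k * step for k in range(n + 1)]
    vals = [interp(x, fp_per_image, sensitivity) for x in grid]
    area = sum((vals[k] + vals[k + 1]) / 2 * step for k in range(n))
    return area / (hi - lo)


# Made-up FROC operating points, for illustration only.
fp = [0.01, 0.05, 0.1, 0.2, 0.5, 1.0]
sens = [0.30, 0.50, 0.60, 0.70, 0.80, 0.85]
score = mean_sensitivity(fp, sens)
```

Computing `score` once per training-set size gives a single number per configuration, which is what makes it possible to compare networks trained with different numbers of mass and normal cases.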

Editor information

Elizabeth A. Krupinski

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kallenberg, M., Karssemeijer, N. (2008). The Effect of Training Sample Size on Performance of Mass Detection. In: Krupinski, E.A. (ed.) Digital Mammography. IWDM 2008. Lecture Notes in Computer Science, vol 5116. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70538-3_48

  • DOI: https://doi.org/10.1007/978-3-540-70538-3_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70537-6

  • Online ISBN: 978-3-540-70538-3

  • eBook Packages: Computer Science, Computer Science (R0)
