Mammographic Image Classification System via Active Learning

  • Original Article
Journal of Medical and Biological Engineering

Abstract

Training an accurate prediction model for mammographic image classification usually requires a large number of labeled images. However, manually acquiring rich and reliable annotations is known to be a tedious and time-consuming process, especially for medical images. Advances in machine learning have yielded a branch of techniques termed active learning (AL), which was proposed to address the problems of limited training samples and expensive labeling, and which has been applied with great success in many pattern recognition tasks such as image processing and speech recognition. In this article, we compare mammographic image classification systems based on traditional supervised learning, unsupervised learning, and AL, with the aim of obtaining a system with low labeling cost. Experiments on the Digital Database for Screening Mammography (DDSM) demonstrate that AL is able to minimize the labeling cost of mammographic images without sacrificing the accuracy of the final classification system. In addition, two characteristics specific to mammographic images, file information and spatial features, which are not available to traditional AL methods, are found to decrease the labeling cost further. In conclusion, we suggest that AL is a reasonable alternative to supervised learning for researchers in the field of medical image classification who work under limited experimental conditions.
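As a rough illustration of the general AL idea described above (and not the authors' system), the following minimal sketch runs pool-based active learning with margin-based uncertainty sampling. Everything in it is an assumption made for illustration: the synthetic 2-D points stand in for image feature vectors, a nearest-centroid rule stands in for the classifier, and the names `make_pool`, `margin`, and `active_learning` are hypothetical.

```python
import random


def make_pool(n_per_class=100, seed=0):
    """Synthetic 2-D stand-in for image features (assumption: not DDSM data)."""
    rng = random.Random(seed)
    pool = []
    for label, center in ((0, (0.0, 0.0)), (1, (4.0, 4.0))):
        for _ in range(n_per_class):
            x = (rng.gauss(center[0], 1.0), rng.gauss(center[1], 1.0))
            pool.append((x, label))
    return pool


def centroids(labeled):
    """Mean feature vector per class, computed from labeled samples only."""
    sums = {}
    for x, y in labeled:
        sx, sy, n = sums.get(y, (0.0, 0.0, 0))
        sums[y] = (sx + x[0], sy + x[1], n + 1)
    return {y: (sx / n, sy / n) for y, (sx, sy, n) in sums.items()}


def dist2(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def predict(cents, x):
    """Nearest-centroid prediction (a simple stand-in classifier)."""
    return min(cents, key=lambda y: dist2(cents[y], x))


def margin(cents, x):
    """Gap between the two nearest centroids; a small gap means high uncertainty."""
    d = sorted(dist2(c, x) for c in cents.values())
    return d[1] - d[0]


def active_learning(pool, budget=10):
    """Query the most uncertain unlabeled sample `budget` times."""
    # Seed the labeled set with the first sample of each class.
    labeled_idx = [
        next(i for i, (_, y) in enumerate(pool) if y == c) for c in (0, 1)
    ]
    for _ in range(budget):
        cents = centroids([pool[i] for i in labeled_idx])
        unlabeled = [i for i in range(len(pool)) if i not in labeled_idx]
        query = min(unlabeled, key=lambda i: margin(cents, pool[i][0]))
        # The "oracle" (here, the stored label) annotates the queried sample.
        labeled_idx.append(query)
    cents = centroids([pool[i] for i in labeled_idx])
    accuracy = sum(predict(cents, x) == y for x, y in pool) / len(pool)
    return labeled_idx, accuracy


if __name__ == "__main__":
    pool = make_pool()
    idx, acc = active_learning(pool, budget=10)
    print(f"labels used: {len(idx)} of {len(pool)}, pool accuracy: {acc:.3f}")
```

The point of the sketch is only the loop structure: the model is retrained after each query, and annotation effort is spent on the samples the current model is least sure about rather than on the whole pool.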

Acknowledgements

This research is partially supported by the National Key Research Program of China (2016YFC0106200), the 863 national research fund of China (2015AA043203), the National Natural Science Foundation of China (81301283, 61190124 and 61271318), and the special funding of capital health research and development (No. 2016-1-4011). The authors are grateful to the Massachusetts General Hospital, the University of South Florida, and Sandia National Laboratories, which provide DDSM as a resource for our experimental data. We also express our sincere gratitude to the Department of Computer Science at the University of North Carolina at Charlotte for their free technical support.

Author information

Corresponding authors

Correspondence to Hongzhi Xie or Lixu Gu.

Ethics declarations

Conflict of interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Ethical Approval

The study does not involve human or animal subjects.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (XLSX 1001 kb)

About this article

Cite this article

Zhao, Y., Chen, D., Xie, H. et al. Mammographic Image Classification System via Active Learning. J. Med. Biol. Eng. 39, 569–582 (2019). https://doi.org/10.1007/s40846-018-0437-3

  • DOI: https://doi.org/10.1007/s40846-018-0437-3
