Abstract
Cervical cancer is one of the most common types of cancer in women worldwide, and most deaths from cervical cancer occur in less developed areas of the world. In this work, we introduce a new image dataset, along with ground-truth diagnoses, for evaluating image-based cervical disease classification algorithms. We collect a large number of cervigram images from a database provided by the US National Cancer Institute. From these images, we extract three types of complementary image features: Pyramid histogram in L*A*B* color space (PLAB), Pyramid Histogram of Oriented Gradients (PHOG), and Pyramid histogram of Local Binary Patterns (PLBP). PLAB captures color information, PHOG encodes edge and gradient information, and PLBP extracts texture information. Using these features, we run seven classic machine-learning algorithms to differentiate images of high-risk patient visits from those of low-risk patient visits. Extensive experiments are conducted on both balanced and imbalanced subsets of the data to compare the seven classifiers. These results can serve as a baseline for future research in image-based cervical dysplasia classification. The image-based classifiers also outperform several other screening tests on the same datasets.
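The three features named above share a spatial-pyramid structure: a histogram is computed over the whole image and then over progressively finer grids of cells, and the histograms are concatenated. The following is a minimal sketch of that shared structure only, not the authors' implementation. The function names, the number of pyramid levels, the bin count, the value range, and the L1 normalization are all illustrative assumptions; the input stands in for one per-pixel channel (for example a color channel for PLAB, or a per-pixel code for PLBP).

```python
import random


def cell_histogram(values, bins, lo, hi):
    """Count values into `bins` equal-width bins over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so that v == hi (or a slightly out-of-range value) stays in a valid bin.
        index = max(0, min(int((v - lo) / width), bins - 1))
        counts[index] += 1
    return counts


def pyramid_histogram(channel, levels=3, bins=8, lo=0.0, hi=1.0):
    """Concatenate per-cell histograms over a spatial pyramid.

    Level l splits the image into 2**l x 2**l cells. `levels`, `bins`, and
    the final L1 normalization are assumptions chosen for illustration.
    """
    height, width = len(channel), len(channel[0])
    feature = []
    for level in range(levels):
        cells = 2 ** level
        # Integer cell boundaries that cover the image for any size.
        row_edges = [height * k // cells for k in range(cells + 1)]
        col_edges = [width * k // cells for k in range(cells + 1)]
        for i in range(cells):
            for j in range(cells):
                values = [channel[r][c]
                          for r in range(row_edges[i], row_edges[i + 1])
                          for c in range(col_edges[j], col_edges[j + 1])]
                feature.extend(cell_histogram(values, bins, lo, hi))
    total = sum(feature)
    return [count / total for count in feature] if total else feature


random.seed(0)
# Synthetic stand-in for one image channel with values in [0, 1).
image = [[random.random() for _ in range(48)] for _ in range(64)]
feature = pyramid_histogram(image)
print(len(feature))  # (1 + 4 + 16) cells * 8 bins = 168
```

With three levels the descriptor covers 1 + 4 + 16 = 21 cells, so coarse cells summarize global appearance while fine cells retain some spatial layout. Feature vectors of this kind could then be passed to any standard classifier.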
C. Xin—Co-first author
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Xu, T. et al. (2015). A New Image Data Set and Benchmark for Cervical Dysplasia Classification Evaluation. In: Zhou, L., Wang, L., Wang, Q., Shi, Y. (eds) Machine Learning in Medical Imaging. MLMI 2015. Lecture Notes in Computer Science, vol 9352. Springer, Cham. https://doi.org/10.1007/978-3-319-24888-2_4
DOI: https://doi.org/10.1007/978-3-319-24888-2_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24887-5
Online ISBN: 978-3-319-24888-2
eBook Packages: Computer Science, Computer Science (R0)