A New Image Data Set and Benchmark for Cervical Dysplasia Classification Evaluation

Xu, Tao; Xin, Cheng; Rodney Long, L.; Antani, Sameer; Xue, Zhiyun; Kim, Edward; Huang, Xiaolei

doi:10.1007/978-3-319-24888-2_4

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9352)

Included in the following conference series:

International Workshop on Machine Learning in Medical Imaging

Abstract

Cervical cancer is one of the most common types of cancer in women worldwide, and most deaths from cervical cancer occur in less developed areas of the world. In this work, we introduce a new image dataset, along with ground-truth diagnoses, for evaluating image-based cervical disease classification algorithms. We collect a large number of cervigram images from a database provided by the US National Cancer Institute. From these images, we extract three types of complementary image features: Pyramid Histogram in L*A*B* color space (PLAB), Pyramid Histogram of Oriented Gradients (PHOG), and Pyramid Histogram of Local Binary Patterns (PLBP). PLAB captures color information, PHOG encodes edge and gradient information, and PLBP extracts texture information. Using these features, we run seven classic machine-learning algorithms to differentiate images of high-risk patient visits from those of low-risk patient visits. Extensive experiments are conducted on both balanced and imbalanced subsets of the data to compare the seven classifiers. These results can serve as a baseline for future research in image-based cervical dysplasia classification. The image-based classifiers also outperform several other screening tests on the same datasets.
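
The three features named in the abstract share a pyramid-histogram structure: histograms are computed over progressively finer grids of image cells and then concatenated. The following is a minimal sketch of that general idea only, not the authors' implementation. The function names, the number of levels, the bin count, and the use of a single integer-valued channel are all assumptions made for illustration; in a PLAB-, PHOG-, or PLBP-style feature the channel would hold color values, gradient orientations, or LBP codes respectively.

```python
from typing import List, Sequence

# A single-channel image as rows of integer values in [0, max_value).
Image = Sequence[Sequence[int]]


def cell_histogram(image: Image, top: int, bottom: int, left: int, right: int,
                   bins: int, max_value: int) -> List[float]:
    """Return the L1-normalized histogram of one rectangular cell."""
    counts = [0] * bins
    for row in image[top:bottom]:
        for value in row[left:right]:
            # Map a value in [0, max_value) to a bin index in [0, bins).
            counts[min(value * bins // max_value, bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins


def pyramid_histogram(image: Image, levels: int = 3, bins: int = 8,
                      max_value: int = 256) -> List[float]:
    """Concatenate cell histograms over a 1x1, 2x2, 4x4, ... grid pyramid.

    Hypothetical helper: level l splits the image into 2**l by 2**l cells,
    so the output length is bins * sum(4**l for l in range(levels)).
    """
    height, width = len(image), len(image[0])
    feature: List[float] = []
    for level in range(levels):
        cells = 2 ** level
        for i in range(cells):
            top, bottom = i * height // cells, (i + 1) * height // cells
            for j in range(cells):
                left, right = j * width // cells, (j + 1) * width // cells
                feature.extend(
                    cell_histogram(image, top, bottom, left, right,
                                   bins, max_value))
    return feature
```

With the defaults assumed here (three levels, eight bins), one channel yields a vector of 8 × (1 + 4 + 16) = 168 values. Coarse levels summarize the whole image, while finer levels retain where in the image each value distribution occurs.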

C. Xin is co-first author.

Author information

Authors and Affiliations

Computer Science & Engineering Department, Lehigh University, Bethlehem, PA, USA
Tao Xu, Cheng Xin & Xiaolei Huang
Communications Engineering Branch, NLM, Bethesda, MD, USA
L. Rodney Long, Sameer Antani & Zhiyun Xue
Computing Sciences Department, Villanova University, Villanova, PA, USA
Edward Kim

Corresponding author

Correspondence to Tao Xu.

Editor information

Editors and Affiliations

School of Computing and Information Technology, University of Wollongong, Wollongong, NSW, Australia
Luping Zhou
Radiology and BRIC, University of North Carolina, Chapel Hill, NC, USA
Li Wang
Biomedical Engineering, Shanghai Jiaotong University, Shanghai, China
Qian Wang
Computer Science, Nanjing University, Nanjing, China
Yinghuan Shi

About this paper

Cite this paper

Xu, T. et al. (2015). A New Image Data Set and Benchmark for Cervical Dysplasia Classification Evaluation. In: Zhou, L., Wang, L., Wang, Q., Shi, Y. (eds) Machine Learning in Medical Imaging. MLMI 2015. Lecture Notes in Computer Science, vol 9352. Springer, Cham. https://doi.org/10.1007/978-3-319-24888-2_4

DOI: https://doi.org/10.1007/978-3-319-24888-2_4
Published: 02 October 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24887-5
Online ISBN: 978-3-319-24888-2
eBook Packages: Computer Science, Computer Science (R0)
