Mining the Risk Types of Human Papillomavirus (HPV) by AdaCost

  • S.-B. Park
  • S. Hwang
  • B.-T. Zhang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2736)


Human papillomavirus (HPV) infection is known to be the main factor in cervical cancer, a leading cause of cancer deaths in women worldwide. Because there are more than 100 HPV types, it is critical to discriminate the types related to cervical cancer from those that are not. In this paper, we classify the risk type of HPVs using their textual descriptions. The important issue in this problem is that false negatives and false positives must be treated differently: we must find the high-risk HPVs, even if some low-risk HPVs are misclassified as a result. For this purpose, AdaCost, a cost-sensitive learner, is adopted to account for the different misclassification costs of the training examples. The experimental results on the HPV sequence database show that considering costs gives higher performance. The F-score is higher than the accuracy, which implies that most high-risk HPVs are found.
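To make the cost-sensitive idea concrete, the following is a minimal sketch of one AdaCost-style boosting round. It assumes the cost-adjustment functions β₊ = −0.5c + 0.5 (correctly classified) and β₋ = 0.5c + 0.5 (misclassified) usually given for AdaCost, with costs c in [0, 1]. The function names, the cost values, the labels, and the weak-learner predictions are all hypothetical and chosen for illustration; they are not the paper's data or implementation.

```python
import math

def cost_adjustment(cost, correct):
    """Assumed AdaCost cost adjustment: small for correct, large for wrong."""
    return (-0.5 * cost + 0.5) if correct else (0.5 * cost + 0.5)

def adacost_round(weights, costs, labels, preds):
    """One reweighting round; labels and preds are +1 (high-risk) or -1 (low-risk)."""
    betas = [cost_adjustment(c, y == h) for c, y, h in zip(costs, labels, preds)]
    # Cost-adjusted weighted agreement between the weak learner and the labels.
    r = sum(w * y * h * b for w, y, h, b in zip(weights, labels, preds, betas))
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    alpha = 0.5 * math.log((1.0 + r) / (1.0 - r))
    raw = [w * math.exp(-alpha * y * h * b)
           for w, y, h, b in zip(weights, labels, preds, betas)]
    z = sum(raw)  # normalizer so the new weights sum to 1
    return alpha, [v / z for v in raw]

# Hypothetical toy data: 4 high-risk (+1) and 4 low-risk (-1) examples.
labels = [1, 1, 1, 1, -1, -1, -1, -1]
preds = [1, 1, 1, -1, -1, -1, -1, 1]   # index 3: missed high-risk, index 7: false alarm
costs = [0.6] * 4 + [0.2] * 4          # assumed: missing a high-risk type costs more
weights = [1.0 / len(labels)] * len(labels)

alpha, new_weights = adacost_round(weights, costs, labels, preds)
print(f"alpha = {alpha:.4f}")
for i, w in enumerate(new_weights):
    print(i, labels[i], preds[i], f"{w:.4f}")
```

In this toy round the missed high-risk example gains more weight than the false alarm on a low-risk example, so the next weak learner is pushed hardest toward recovering high-risk types, which matches the asymmetry described in the abstract.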


Cervical Cancer; Human Papilloma Virus; Weak Learner; Human Papilloma Virus Type; Risk Type
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • S.-B. Park (1)
  • S. Hwang (1)
  • B.-T. Zhang (1)
  1. School of Computer Science and Engineering, Seoul National University, Seoul, Korea
