Mining the Risk Types of Human Papillomavirus (HPV) by AdaCost

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2736))

Abstract

Human Papillomavirus (HPV) infection is known as the main cause of cervical cancer, a leading cause of cancer deaths in women worldwide. Because there are more than 100 types of HPV, it is critical to discriminate the HPVs related to cervical cancer from those that are not. In this paper, we classify the risk type of HPVs using their textual descriptions. The key issue in this problem is that false negatives and false positives are not equally costly: we must find the high-risk HPVs, even if we misclassify some low-risk HPVs as a result. For this purpose, AdaCost, a cost-sensitive learner, is adopted to take into account the different misclassification costs of the training examples. The experimental results on the HPV sequence database show that considering costs yields higher performance. The F-score is higher than the accuracy, which implies that most high-risk HPVs are found.
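To make the cost-sensitive idea concrete, the following is a minimal sketch of a single AdaCost-style weight update, not the implementation used in the paper. It assumes the cost-adjustment function commonly attributed to Fan et al. (1999), with costs scaled to [0, 1]. The function names, the toy labels and predictions, the costs of 0.8 (high-risk) and 0.2 (low-risk), and the uniform starting weights are all hypothetical choices made for illustration.

```python
import math

def cost_adjustment(correct, cost):
    """Cost-adjustment factor beta (assumed form, costs in [0, 1]).

    Correct predictions are rewarded less when the example is costly;
    wrong predictions are penalised more when the example is costly.
    """
    return 0.5 - 0.5 * cost if correct else 0.5 + 0.5 * cost

def adacost_round(weights, labels, preds, costs):
    """One boosting round: return (alpha, renormalised weights).

    labels and preds take values in {+1, -1}; weights must sum to 1.
    """
    betas = [cost_adjustment(y == h, c) for y, h, c in zip(labels, preds, costs)]
    # Cost-adjusted agreement between the weak hypothesis and the labels.
    r = sum(w * y * h * b for w, y, h, b in zip(weights, labels, preds, betas))
    if abs(r) >= 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    alpha = 0.5 * math.log((1.0 + r) / (1.0 - r))
    unnormalised = [
        w * math.exp(-alpha * y * h * b)
        for w, y, h, b in zip(weights, labels, preds, betas)
    ]
    z = sum(unnormalised)
    return alpha, [u / z for u in unnormalised]

# Hypothetical toy data: +1 = high-risk HPV, -1 = low-risk HPV.
labels = [1, 1, 1, 1, 1, -1, -1, -1, -1, -1]
preds = [1, 1, 1, 1, -1, -1, -1, -1, -1, 1]   # one miss in each class
costs = [0.8] * 5 + [0.2] * 5                  # assumed: missing high-risk costs more
weights = [0.1] * 10                           # uniform start, for simplicity

alpha, new_weights = adacost_round(weights, labels, preds, costs)
missed_high, missed_low, correct_low = new_weights[4], new_weights[9], new_weights[5]
print(alpha, missed_high, missed_low, correct_low)
```

In this toy round the missed high-risk example ends up with a larger weight than the missed low-risk example, so the next weak learner is pushed harder toward recovering high-risk types, which is the behaviour the abstract motivates.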


Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Park, S.B., Hwang, S., Zhang, B.T. (2003). Mining the Risk Types of Human Papillomavirus (HPV) by AdaCost. In: Mařík, V., Retschitzegger, W., Štěpánková, O. (eds) Database and Expert Systems Applications. DEXA 2003. Lecture Notes in Computer Science, vol 2736. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45227-0_40

  • DOI: https://doi.org/10.1007/978-3-540-45227-0_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40806-2

  • Online ISBN: 978-3-540-45227-0

  • eBook Packages: Springer Book Archive
