Integration of Instance-Based Learning and Text Mining for Identification of Potential Virus/Bacterium as Bio-terrorism Weapons

  • Xiaohua Hu
  • Xiaodan Zhang
  • Daniel Wu
  • Xiaohua Zhou
  • Peter Rumm
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3975)


Some viruses and bacteria have already been identified as bioterrorism weapons. However, many other viruses and bacteria could also be potential bioterrorism weapons. A system that can automatically suggest potential bioterrorism weapons will help laypeople discover these suspicious viruses and bacteria. In this paper we apply an instance-based learning and text mining approach to identify candidate viruses and bacteria as potential bioterrorism weapons from the biomedical literature. We first take a text mining approach to identify topical terms of existing viruses (bacteria) from PubMed, handling viruses and bacteria separately. Then, we use the term lists as instances to build matrices with the remaining viruses (bacteria) and measure how well the term lists describe them. Next, we build an algorithm to rank all remaining viruses (bacteria). We suspect that the higher a virus (bacterium) ranks, the more likely it is to be a potential bioterrorism weapon. Our findings are intended as a guide to the virus and bacterium literature to support further studies that might then lead to appropriate defense and public health measures.
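The three steps in the abstract (topical terms, term-list matrices, ranking) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify how terms are weighted or how scores are combined, so the document-frequency term selection, the coverage fraction, and the "best-matching instance" score below are all assumptions, and the agent names and terms (`known_A`, `cand_X`, `t1`, ...) are invented placeholders rather than real data.

```python
from collections import Counter


def topical_terms(docs, top_k=2):
    """Pick the top_k terms by document frequency (assumed selection rule)."""
    df = Counter()
    for terms in docs:
        df.update(set(terms))
    # Sort by frequency (descending), then alphabetically for determinism.
    ranked = sorted(df.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]


def coverage_matrix(term_lists, candidates):
    """One row per candidate, one column per known-agent term list.

    Each cell is the fraction of the term list that appears anywhere in the
    candidate's literature (an assumed measure of "how well it is described").
    """
    matrix = {}
    for name, docs in candidates.items():
        vocab = set().union(*docs) if docs else set()
        matrix[name] = [
            len(vocab & set(terms)) / len(terms) if terms else 0.0
            for terms in term_lists.values()
        ]
    return matrix


def rank_candidates(matrix):
    """Score each candidate by its best-matching instance, highest first."""
    scores = {name: max(row, default=0.0) for name, row in matrix.items()}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy literature: each agent maps to a list of documents (sets of terms).
known = {
    "known_A": [{"t1", "t2", "t3"}, {"t1", "t2"}, {"t1", "t4"}],
    "known_B": [{"t5", "t6"}, {"t5", "t6", "t7"}],
}
candidates = {
    "cand_X": [{"t1", "t2", "t9"}],
    "cand_Y": [{"t5", "t8"}],
    "cand_Z": [{"t8", "t9"}],
}

term_lists = {name: topical_terms(docs) for name, docs in known.items()}
ranking = rank_candidates(coverage_matrix(term_lists, candidates))
print(ranking)
```

Taking the maximum over instances mirrors nearest-neighbour style instance-based learning, where a candidate is judged by its closest known example; averaging across instances would be an equally plausible choice.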


Keywords: MeSH Term; Encephalitis Virus; Biomedical Literature; Rift Valley Fever; Rift Valley Fever Virus
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Xiaohua Hu (1)
  • Xiaodan Zhang (1)
  • Daniel Wu (1)
  • Xiaohua Zhou (1)
  • Peter Rumm (2)

  1. College of Information Science and Technology, Drexel University, Philadelphia
  2. School of Public Health, Drexel University, Philadelphia
