Cleavage Site Analysis Using Rule Extraction from Neural Networks

  • Yeun-Jin Cho
  • Hyeoncheol Kim
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3610)


In this paper, we demonstrate that rule extraction from a trained neural network, a machine learning approach, can be successfully applied to SARS-coronavirus cleavage site analysis. The extracted rules predict cleavage sites better than consensus patterns. Empirical experiments are also presented.


Keywords: Neural Network, Severe Acute Respiratory Syndrome, Feedforward Neural Network, Rule Extraction



Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Yeun-Jin Cho (1)
  • Hyeoncheol Kim (1)
  1. Department of Computer Science Education, Korea University, Seoul, Korea
