Bio-kernel Self-organizing Map for HIV Drug Resistance Classification

  • Zheng Rong Yang
  • Natasha Young
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3610)


The kernel self-organizing map has recently been studied by Fyfe and his colleagues [1]. This paper investigates the use of a novel bio-kernel function for the kernel self-organizing map. For verification, the proposed kernel self-organizing map is applied to HIV drug resistance classification using mutation patterns in protease sequences. It is compared with the original self-organizing map combined with the distributed encoding method. The kernel self-organizing map with the novel bio-kernel function is found to give better classification and a faster convergence rate ...
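The contrast drawn in the abstract, between a self-organizing map fed with distributed encoding and one driven by a sequence kernel, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not define the bio-kernel, so `toy_bio_kernel`, `TOY_SCORES`, the four-residue fragments and the choice to represent prototypes as plain sequences are all assumptions made for illustration. The reference list cites the Dayhoff substitution matrices [36] and the kernel trick for distances [33], so the sketch scores residue pairs by a made-up similarity table and derives distances from kernel values alone.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def distributed_encode(seq):
    """Distributed (one-hot) encoding: 20 binary inputs per residue."""
    vec = np.zeros((len(seq), len(AMINO_ACIDS)))
    for pos, residue in enumerate(seq):
        vec[pos, AMINO_ACIDS.index(residue)] = 1.0
    return vec.ravel()


def match_kernel(x, z):
    """Linear kernel on the distributed encoding (counts identical positions)."""
    return float(np.dot(distributed_encode(x), distributed_encode(z)))


# ASSUMPTION: made-up similarity scores, for illustration only. A real
# bio-kernel would presumably draw on a substitution matrix instead.
TOY_SCORES = {("I", "L"): 0.8, ("D", "L"): 0.1}


def toy_similarity(a, b):
    """Hypothetical residue similarity: 1 for identity, table lookup otherwise."""
    if a == b:
        return 1.0
    return TOY_SCORES.get(tuple(sorted((a, b))), 0.0)


def toy_bio_kernel(x, z):
    """Hypothetical sequence kernel: sum of per-position residue similarities."""
    return sum(toy_similarity(a, b) for a, b in zip(x, z))


def kernel_sq_distance(kernel, x, z):
    """Squared feature-space distance computed from kernel values only."""
    return kernel(x, x) - 2.0 * kernel(x, z) + kernel(z, z)


def best_matching_unit(kernel, x, prototypes):
    """Index of the prototype closest to x under the kernel-induced distance."""
    distances = [kernel_sq_distance(kernel, x, p) for p in prototypes]
    return int(np.argmin(distances))


# Hypothetical fragments: one conservative and one radical substitution.
wild, conservative, radical = "PQLT", "PQIT", "PQDT"
prototypes = [radical, conservative]
bmu_match = best_matching_unit(match_kernel, wild, prototypes)
bmu_bio = best_matching_unit(toy_bio_kernel, wild, prototypes)
```

Under the distributed encoding both substitutions sit at the same squared distance (2.0) from the unmodified fragment, so the winning unit is decided by a tie. Under the toy kernel the conservative substitution is much closer (about 0.4 against 1.8), and the second prototype wins. A full kernel self-organizing map would keep its prototypes in feature space and update them during training; this sketch covers only the winner-selection step.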


Protease Cleavage Site; Support Vector Machine Approach; Signal Peptide Cleavage Site; Regularization Factor; Protein Secondary Structure Prediction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


References

  1. Corchado, E., Fyfe, C.: Relevance and kernel self-organising maps. In: International Conference on Artificial Neural Networks (2003)
  2. Qian, N., Sejnowski, T.J.: Predicting the secondary structure of globular proteins using neural network models. J. Mol. Biol. 202, 865–884 (1988)
  3. Thompson, T.B., Chou, K.C., Zhang, C.: Neural network prediction of the HIV-1 protease cleavage sites. Journal of Theoretical Biology 177, 369–379 (1995)
  4. Nielsen, M., Lundegaard, C., Worning, P., Lauemoller, S.L., Lamberth, K., Buus, S., Brunak, S., Lund, O.: Reliable prediction of T-cell epitopes using neural networks with novel sequence representations. Protein Science 12, 1007–1017 (2003)
  5. Hansen, J.E., Lund, O., Engelbrecht, J., Bohr, H., Nielsen, J.O.: Prediction of O-glycosylation of mammalian proteins: specificity patterns of UDP-GalNAc: polypeptide N-acetylgalactosaminyltransferase. Biochem. J. 30, 801–813 (1995)
  6. Gutteridge, A., Bartlett, G.J., Thornton, J.M.: Using a neural network and spatial clustering to predict the location of active sites in enzymes. Journal of Molecular Biology 330, 719–734 (2003)
  7. Blom, N., Gammeltoft, S., Brunak, S.: Sequence and structure based prediction of eukaryotic protein phosphorylation sites. J. Mol. Biol. 294, 1351–1362 (1999)
  8. Ehrlich, L., Reczko, M., Bohr, H., Wade, R.C.: Prediction of protein hydration sites from sequence by modular neural networks. Protein Eng. 11, 11–19 (1998)
  9. Thomson, R., Hodgman, T.C., Yang, Z.R., Doyle, A.K.: Characterising proteolytic cleavage site activity using bio-basis function neural networks. Bioinformatics 19, 1741–1747 (2003)
  10. Yang, Z.R., Thomson, R.: A novel neural network method in mining molecular sequence data. IEEE Trans. on Neural Networks 16, 263–274 (2005)
  11. Yang, Z.R.: Orthogonal kernel machine in prediction of functional sites in proteins. IEEE Trans. on Systems, Man and Cybernetics 35, 100–106 (2005)
  12. Cai, Y.D., Ricardo, P.W., Jen, C.H., Chou, K.C.: Application of SVMs to predict membrane protein types. Journal of Theoretical Biology 226, 373–376 (2004)
  13. Cai, Y.D., Lin, X.J., Xu, X.B., Chou, K.C.: Prediction of protein structural classes by support vector machines. Computers & Chemistry 26, 293–296 (2002)
  14. Hua, S., Sun, Z.: Support vector machine approach for protein subcellular localization prediction. Bioinformatics 17, 721–728 (2001)
  15. Chu, F., Jin, G., Wang, L.: Cancer diagnosis and protein secondary structure prediction using support vector machines. In: Wang, L. (ed.) Support Vector Machines. Springer, Heidelberg (2004)
  16. Park, K., Kanehisa, M.: Prediction of protein subcellular locations by support vector machines using compositions of amino acids and amino acid pairs. Bioinformatics 19, 1656–1663 (2003)
  17. Carter, R.J., Dubchak, I., Holbrook, S.R.: A computational approach to identify genes for functional RNAs in genomic sequences. Nucleic Acids Res. 29, 3928–3938 (2001)
  18. Ding, C.H.Q., Dubchak, I.: Multi-class protein fold recognition using support vector machines and neural networks. Bioinformatics 17, 349–358 (2001)
  19. Cai, C.Z., Wang, W.L., Sun, L.Z., Chen, Y.Z.: Protein function classification via support vector machine approach. Mathematical Biosciences 185, 111–122 (2003)
  20. Cai, Y.D., Lin, S.L.: Support vector machines for predicting rRNA-, RNA-, and DNA-binding proteins from amino acid sequence. Biochimica et Biophysica Acta (BBA) - Proteins & Proteomics 1648, 127–133 (2003)
  21. Lin, K., Kuang, Y., Joseph, J.S., Kolatkar, P.R.: Conserved codon composition of ribosomal protein coding genes in Escherichia coli, Mycobacterium tuberculosis and Saccharomyces cerevisiae: lessons from supervised machine learning in functional genomics. Nucleic Acids Res. 30, 2599–2607 (2002)
  22. Jaakkola, T., Diekhans, M., Haussler, D.: Using the Fisher kernel method to detect remote protein homologies. In: Proceedings of the 7th International Conference on Intelligent Systems for Molecular Biology, pp. 149–158 (1999)
  23. Jaakkola, T., Diekhans, M., Haussler, D.: A Discriminative Framework for Detecting Remote Protein Homologies. Journal of Computational Biology 7, 95–114 (2000)
  24. Karchin, R., Karplus, K., Haussler, D.: Classifying G-protein coupled receptors with support vector machines. Bioinformatics 18, 147–159 (2002)
  25. Guermeur, Y., Pollastri, G., Elisseeff, A., Zelus, D., Paugam-Moisy, H., Baldi, P.: Combining protein secondary structure prediction models with ensemble methods of optimal complexity. Neurocomputing 56, 305–327 (2004)
  26. Kohonen, T.: Self-Organization and Associative Memory, 3rd edn. Springer, Berlin (1989)
  27. Arrigo, P., Giuliano, F., Scalia, F., Rapallo, A., Damiani, G.: Identification of a new motif on nucleic acid sequence data using Kohonen’s self-organising map. In: CABIOS, vol. 7, pp. 353–357 (1991)
  28. Bengio, Y., Pouliot, Y.: Efficient recognition of immunoglobulin domains from amino acid sequences using a neural network. In: CABIOS, vol. 6, pp. 319–324 (1990)
  29. Ferran, E.A., Ferrara, P.: Topological maps of protein sequences. Biological Cybernetics 65, 451–458 (1991)
  30. Wang, H.C., Dopazo, J., Carazo, J.M.: Self-organising tree growing network for classifying amino acids. Bioinformatics 14, 376–377 (1998)
  31. Ferran, E.A., Pflugfelder, B.: A hybrid method to cluster protein sequences based on statistics and artificial neural networks. In: CABIOS, vol. 9, pp. 671–680 (1993)
  32. Tamayo, P., Slonim, D., Mesirov, J., Zhu, Q., Kitareewan, S., Dmitrovsky, E., Lander, E.S., Golub, T.R.: Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation. In: PNAS 1999, vol. 96, pp. 2907–2912 (1999)
  33. Schölkopf, B.: The kernel trick for distances, Technical Report. Microsoft Research (May 2000)
  34. MacDonald, D., Koetsier, J., Corchado, E., Fyfe, C.: A kernel method for classification. In: Monroy, R., Arroyo-Figueroa, G., Sucar, L.E., Sossa, H. (eds.) MICAI 2004. LNCS (LNAI), vol. 2972. Springer, Heidelberg (2004)
  35. Fyfe, C., MacDonald, D.: Epsilon-insensitive Hebbian learning. Neurocomputing 47, 35–57 (2002)
  36. Dayhoff, M.O., Schwartz, R.M., Orcutt, B.C.: A model of evolutionary change in proteins. Matrices for detecting distant relationships. Atlas of Protein Sequence and Structure 5, 345–358 (1978)
  37. Johnson, M.S., Overington, J.P.: A structural basis for sequence comparisons-an evaluation of scoring methodologies. J. Molec. Biol. 233, 716–738 (1993)
  38. Yang, Z.R., Berry, E.: Reduced bio-basis function neural networks for protease cleavage site prediction. Journal of Computational Biology and Bioinformatics 2, 511–531 (2004)
  39. Thomson, R., Esnouf, R.: Predict disordered proteins using bio-basis function neural networks. In: Yang, Z.R., Yin, H., Everson, R.M. (eds.) IDEAL 2004. LNCS, vol. 3177, pp. 19–27. Springer, Heidelberg (2004)
  40. Yang, Z.R., Thomson, R., Esnouf, R.: RONN: use of the bio-basis function neural network technique for the detection of natively disordered regions in proteins. Bioinformatics (accepted)
  41. Berry, E., Dalby, A., Yang, Z.R.: Reduced bio basis function neural network for identification of protein phosphorylation sites: Comparison with pattern recognition algorithms. Computational Biology and Chemistry 28, 75–85 (2004)
  42. Yang, Z.R., Chou, K.C.: Bio-basis function neural networks for the prediction of the O-linkage sites in glyco-proteins. Bioinformatics 20, 903–908 (2004)
  43. Yang, Z.R.: Prediction of Caspase Cleavage Sites Using Bayesian Bio-Basis Function Neural Networks. Bioinformatics (in press)
  44. Yang, Z.R.: Mining SARS-CoV protease cleavage data using decision trees, a novel method for decisive template searching. Bioinformatics (accepted)
  45. Sidhu, A., Yang, Z.R.: Predict signal peptides using bio-basis function neural networks. Applied Bioinformatics (accepted)
  46. Draghici, S., Potter, R.B.: Predicting HIV drug resistance with neural networks. Bioinformatics 19, 98–107 (2003)
  47. Beerenwinkel, N., Daumer, M., Oette, M., Korn, K., Hoffmann, D., Kaiser, R., Lengauer, T., Selbig, J., Walter, H.: Geno2pheno: estimating phenotypic drug resistance from HIV-1 genotypes. NAR 31, 3850–3855 (2003)
  48. Beerenwinkel, N., Schmidt, B., Walter, H., Kaiser, R., Lengauer, T., Hoffmann, D., Korn, K., Selbig, J.: Diversity and complexity of HIV-1 drug resistance: a bioinformatics approach to predicting phenotype from genotype. PNAS 99, 8271–8276 (2002)
  49. Zazzi, M., Romano, L., Giulietta, V., Shafer, R.W., Reid, C., Bello, F., Parolin, C., Palu, G., Valensin, P.: Comparative evaluation of three computerized algorithms for prediction of antiretroviral susceptibility from HIV type 1 genotype. Journal of Antimicrobial Chemotherapy 53, 356–360 (2004)
  50. Sa-Filho, D.J., Costa, L.J., de Oliveira, C.F., Guimaraes, A.P.C., Accetturi, C.A., Tanuri, A., Diaz, R.S.: Analysis of the protease sequences of HIV-1 infected individuals after Indinavir monotherapy. Journal of Clinical Virology 28, 186–202 (2003)

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Zheng Rong Yang (1)
  • Natasha Young (1)
  1. Department of Computer Science, University of Exeter, Exeter, UK
