Complementing Kernel-Based Visualization of Protein Sequences with Their Phylogenetic Tree

  • Martha Ivón Cárdenas
  • Alfredo Vellido
  • Iván Olier
  • Xavier Rovira
  • Jesús Giraldo
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7548)


Pharmacology is becoming increasingly dependent on advances in genomics and proteomics. This dependency brings the challenge of finding robust methods to analyze the complex data these fields generate. In this brief paper, we focus on the analysis of a specific type of protein, the G protein-coupled receptors (GPCRs), which are the target for over 15% of current drugs. We describe a kernel method of the manifold learning family for the analysis and intuitive visualization of their symbolic amino acid sequences. This method is shown to reveal the grouping structure of the sequences in a way that closely resembles the corresponding phylogenetic trees.


Keywords: Kernel GTM, Phylogenetic Tree, Data Visualization, GPCR, Protein Sequence





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Martha Ivón Cárdenas (1)
  • Alfredo Vellido (1)
  • Iván Olier (2)
  • Xavier Rovira (3)
  • Jesús Giraldo (4)

  1. Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, Barcelona, Spain
  2. School of Psychological Sciences, The University of Manchester, Manchester, United Kingdom
  3. Department of Molecular Pharmacology, Institute of Functional Genomics (IGF), CNRS UMR5203, INSERM U661, University of Montpellier, Montpellier cedex 5, France
  4. Institut de Neurociències, Unitat de Bioestadística, Universitat Autònoma de Barcelona, Barcelona, Spain
