SVM-Based Classification of Class C GPCRs from Alignment-Free Physicochemical Transformations of Their Sequences

  • Caroline König
  • Raúl Cruz-Barbosa
  • René Alquézar
  • Alfredo Vellido
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8158)

Abstract

G protein-coupled receptors (GPCRs) play a key role in regulating cell function because of their ability to transmit extracellular signals. Given that the 3D structure and the functionality of most GPCRs are unknown, robust classification models based on the analysis of their amino acid sequences are needed for protein homology detection. In this paper, we describe the supervised classification of the different subtypes of class C GPCRs using support vector machines (SVMs). These models are built on different transformations of the amino acid sequences based on their physicochemical properties. Previous research using semi-supervised methods on the same data has shown the usefulness of such transformations. The resulting classification models perform robustly, with a Matthews correlation coefficient close to 0.91 and a prediction accuracy close to 0.93.
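
The two figures of merit quoted above can be computed from true and predicted subtype labels. The following is a minimal sketch, not the authors' evaluation code: it assumes the multi-class generalization of the Matthews correlation coefficient (which reduces to the usual binary formula for two classes), and the function names and the toy labels "A", "B", "C" are hypothetical placeholders rather than data from the paper.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient.

    Assumption: the generalized (multi-class) form is used; the abstract
    does not say which variant the reported value corresponds to.
    """
    s = len(y_true)                                      # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))      # correctly predicted
    true_counts = Counter(y_true)                        # occurrences per true class
    pred_counts = Counter(y_pred)                        # occurrences per predicted class
    labels = set(true_counts) | set(pred_counts)

    numerator = c * s - sum(true_counts[k] * pred_counts[k] for k in labels)
    spread_true = s * s - sum(true_counts[k] ** 2 for k in labels)
    spread_pred = s * s - sum(pred_counts[k] ** 2 for k in labels)
    denominator = sqrt(spread_true * spread_pred)
    # By convention, return 0.0 when the coefficient is undefined
    # (e.g. every sample is assigned to a single class).
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical toy labels, purely illustrative.
    y_true = ["A", "A", "B", "B", "C", "C"]
    y_pred = ["A", "B", "B", "B", "C", "C"]
    print("accuracy:", accuracy(y_true, y_pred))
    print("MCC:", multiclass_mcc(y_true, y_pred))
```

Unlike accuracy, the MCC takes the full distribution of true and predicted classes into account, so it is less easily inflated when the subtypes are unevenly represented.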

Keywords

pharmaco-proteomics; G-protein coupled receptors; homology; transformation; supervised learning; support vector machines


Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Caroline König (1)
  • Raúl Cruz-Barbosa (2, 3)
  • René Alquézar (1)
  • Alfredo Vellido (1)

  1. Univ. Politècnica de Catalunya, Barcelona Tech, Barcelona, Spain
  2. Univ. Tecnológica de la Mixteca, Huajuapan, México
  3. Institut de Neurociències, Univ. Autònoma de Barcelona, Barcelona, Spain
