Identification of Defensins Employing Recurrence Quantification Analysis and Random Forest Classifiers

  • Shreyas Karnik
  • Ajay Prasad
  • Alok Diwevedi
  • V. Sundararajan
  • V. K. Jayaraman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5909)

Abstract

Defensins are a class of antimicrobial peptides synthesized in the body that act against various microbes. In this paper, we study defensins using a non-linear signal analysis method, Recurrence Quantification Analysis (RQA). We used the descriptors calculated with RQA to classify defensins with a Random Forest classifier. The RQA descriptors captured patterns peculiar to defensins, yielding an accuracy of 78.12% under 10-fold cross-validation.
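
As a rough illustration of what RQA descriptors look like for a peptide, the sketch below computes two common ones, recurrence rate (REC) and determinism (DET), from a sequence. It is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not name the numeric encoding, embedding dimension, delay, or radius, so the Kyte-Doolittle hydropathy encoding, the function names, and all default parameters here are hypothetical choices made for illustration.

```python
from math import sqrt

# Kyte-Doolittle hydropathy scale. ASSUMPTION: the abstract does not say
# which numeric encoding of residues was used; this one is illustrative.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def embed(series, dim, delay=1):
    """Time-delay embedding: one vector of `dim` samples per start index."""
    n = len(series) - (dim - 1) * delay
    return [[series[i + k * delay] for k in range(dim)] for i in range(n)]


def recurrence_matrix(vectors, radius):
    """rm[i][j] is 1 when vectors i and j lie within `radius` (Euclidean)."""
    def dist(u, v):
        return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    n = len(vectors)
    return [[1 if dist(vectors[i], vectors[j]) <= radius else 0
             for j in range(n)] for i in range(n)]


def diagonal_line_lengths(rm):
    """Lengths of diagonal runs of 1s above the main diagonal."""
    n = len(rm)
    lengths = []
    for offset in range(1, n):
        run = 0
        for i in range(n - offset):
            if rm[i][i + offset]:
                run += 1
            else:
                if run:
                    lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return lengths


def rqa_descriptors(sequence, dim=3, radius=1.0, min_line=2):
    """Return REC and DET (both in percent) for a peptide sequence.

    dim, radius and min_line are placeholder defaults, not values
    taken from the paper.
    """
    series = [KD[aa] for aa in sequence]
    rm = recurrence_matrix(embed(series, dim), radius)
    n = len(rm)
    lengths = diagonal_line_lengths(rm)
    recurrent = sum(lengths)              # recurrent points, upper triangle
    possible = n * (n - 1) // 2           # all upper-triangle points
    in_lines = sum(l for l in lengths if l >= min_line)
    rec = 100.0 * recurrent / possible if possible else 0.0
    det = 100.0 * in_lines / recurrent if recurrent else 0.0
    return {"REC": rec, "DET": det}


if __name__ == "__main__":
    # A strictly alternating toy sequence, chosen only to show the call.
    print(rqa_descriptors("ALALALALAL", dim=2, radius=0.1))
```

Descriptors of this kind, computed per sequence, would form the feature vector handed to a random forest classifier and scored with 10-fold cross-validation; that step is omitted here because it depends on a labeled dataset that this page does not provide.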

Keywords

  • Random Forest
  • Antimicrobial Peptide
  • Recurrence Plot
  • Random Forest Algorithm
  • Cross Validation Accuracy
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Shreyas Karnik (1, 3)
  • Ajay Prasad (1)
  • Alok Diwevedi (1)
  • V. Sundararajan (2)
  • V. K. Jayaraman (2)

  1. Chemical Engineering and Process Development Division, National Chemical Laboratory, Pune, India
  2. Center for Development of Advanced Computing, Pune, India
  3. School of Informatics, Indiana University, Indianapolis, USA
