Diversity and motif conservation in protein 3D structural landscape: exploration by a new multivariate simulation method

  • Original Paper
Journal of Molecular Modeling

Abstract

This paper explores diversity and conservation in the ‘landscape’ of random variation of protein tertiary structures, using quantitative feature-vector models of the major types of functionally important 3D structural motifs. For this purpose, I have deployed a recently developed multidimensional copula simulation method based on nonparametric regression (NPR). Apart from improving the accuracy of multidimensional random sample generation, the simulation provides additional insight into diversity in the protein structural landscape in terms of random variation in the feature vector. It shows the relative importance of several features, with biological implications, in the conservation of motifs. Mapping this landscape into a distance-preserving 2D eigenspace also shows consistent demarcation of the different motif classes and preservation of their characteristic patterns in this 2D space.

Acknowledgements

The author would like to thank Srijit Chakrabarty for implementing the initial version of the author’s algorithm as part of his MSc project. The version used in this work has been developed further with significant modifications.

Author information

Corresponding author

Correspondence to Rajani R. Joshi.

Appendix

(Re: Legend of Figs. 1, 2, 3, 4, 5 and 6)

Let θij(v) denote the coefficient of correlation (rank or partial, as the case may be) between Xi and Xj in the validation sample, and let θij(g) denote the same coefficient in the generated sample. Define η(i,j) = |θij(v) - θij(g)| for i, j = 1, ..., 11.

As the correlation matrices are symmetric, so is the matrix of their absolute differences, i.e., η(i,j) = η(j,i). Each diagonal element of a correlation matrix equals 1 (and the estimates for both samples agree with this up to the 7th decimal place), so η(i,i) = 0 for each i. These properties are satisfied in our results for both the methods. Therefore only the values below the diagonal (i.e., η(i,j) for i = j+1, ..., 11 and j = 1, ..., 10) are shown in Figs. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 and 12.
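
The computation of η(i,j) described above can be illustrated with a minimal Python sketch. It is not the author's implementation. It assumes that the rank correlation is Spearman's coefficient (the text only says "Rank"), it does not cover the partial-correlation case, and it uses independent Gaussian toy data in place of the real validation and generated samples. All function names are hypothetical.

```python
import random
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (fails on constant input)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def rank_correlation_matrix(sample):
    """Spearman matrix of a sample given as a list of feature-vector rows."""
    columns = [average_ranks(list(col)) for col in zip(*sample)]
    p = len(columns)
    return [[pearson(columns[i], columns[j]) for j in range(p)] for i in range(p)]

def eta_matrix(validation, generated):
    """eta(i,j) = |theta_ij(v) - theta_ij(g)| for every feature pair."""
    tv = rank_correlation_matrix(validation)
    tg = rank_correlation_matrix(generated)
    p = len(tv)
    return [[abs(tv[i][j] - tg[i][j]) for j in range(p)] for i in range(p)]

# Toy data only: 200 vectors with 11 features each (assumed sizes).
random.seed(0)
validation = [[random.gauss(0, 1) for _ in range(11)] for _ in range(200)]
generated = [[random.gauss(0, 1) for _ in range(11)] for _ in range(200)]
eta = eta_matrix(validation, generated)

# Keep only the below-diagonal values, taken column by column.
below_diagonal = [eta[i][j] for j in range(11) for i in range(j + 1, 11)]
print(len(below_diagonal))  # 55 distinct feature pairs
```

Because η is symmetric with a zero diagonal, the 55 below-diagonal values carry all the information in the 11 × 11 matrix.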

In these bar diagrams, the integers 1 to 10 along the X-axis indicate the suffix labels (i, j) = (2,1), ..., (11,1), respectively; the integers 11 to 19 indicate the suffix labels (3,2), ..., (11,2), respectively; and so on, until the integers 53 and 54 indicate the suffix labels (10,9) and (11,9), respectively, and the integer 55 indicates the suffix label (11,10).

The numerical values of the corresponding η(i,j) are shown along the Y-axis.
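
The correspondence between X-axis positions and suffix labels can be written as a pair of small helper functions. This is a sketch under the column-by-column ordering stated in the appendix; the names `bar_position` and `pair_at` are hypothetical, and the default p = 11 is the number of features used there.

```python
def bar_position(i, j, p=11):
    """1-based X-axis position of the below-diagonal pair (i, j), with j < i <= p."""
    if not (1 <= j < i <= p):
        raise ValueError("need 1 <= j < i <= p")
    # Pairs contributed by columns 1 .. j-1: sum of (p - c) for c = 1 .. j-1.
    before = (j - 1) * p - (j - 1) * j // 2
    return before + (i - j)

def pair_at(position, p=11):
    """Inverse mapping: the suffix label (i, j) shown at a given X-axis position."""
    if not (1 <= position <= p * (p - 1) // 2):
        raise ValueError("position out of range")
    j = 1
    # Column j holds p - j pairs; skip whole columns until the position fits.
    while position > p - j:
        position -= p - j
        j += 1
    return (j + position, j)

print(bar_position(3, 2))  # 11
print(pair_at(55))         # (11, 10)
```

For example, position 53 corresponds to the pair (10,9), so the bar at 53 shows η(10,9).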

Cite this article

Joshi, R.R. Diversity and motif conservation in protein 3D structural landscape: exploration by a new multivariate simulation method. J Mol Model 24, 76 (2018). https://doi.org/10.1007/s00894-018-3614-y
