Finding Largest Well-Predicted Subset of Protein Structure Models

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5029)

Abstract

Evaluating the quality of models is a basic problem in protein structure prediction. Numerous evaluation criteria have been proposed, and one of the most intuitive requires finding a largest well-predicted subset: a maximum subset of the model that matches the native structure [12]. The problem is solvable in $O(n^7)$ time, which is too slow for practical use. We present a $(1+\varepsilon)d$ distance approximation algorithm that runs in $O(n^3 \log n / \varepsilon^5)$ time for general protein structures. For globular proteins, this result can be improved to a randomized $O(n \log^2 n)$ time algorithm with probability at least $1 - O(1/n)$. In addition, we propose a $(1+\varepsilon)$-approximation algorithm that computes the minimum distance needed to fit all the points of a model to its native structure in $O(n(\log\log n + \log 1/\varepsilon)/\varepsilon^5)$ time. We have implemented our algorithms, and the results indicate that our program finds many more matched pairs in less running time than TMScore, one of the most popular tools for assessing the quality of predicted models.
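
To make the objective concrete, the following minimal sketch evaluates it for a single, fixed rigid motion. It is not the algorithm of the paper, which searches over all rigid motions. It assumes that the model and the native structure are given as equal-length lists of 3D points with a one-to-one correspondence, and that a pair counts as matched when its distance after the motion is at most a threshold d. The names `rigid_motion_z`, `well_predicted_subset`, and `min_fit_distance` are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float, float]
Motion = Callable[[Point], Point]


def rigid_motion_z(angle: float, shift: Point) -> Motion:
    """Rotate by `angle` radians about the z-axis, then translate by `shift`.

    A deliberately simple family of rigid motions, assumed for illustration.
    """
    c, s = math.cos(angle), math.sin(angle)

    def apply(p: Point) -> Point:
        x, y, z = p
        return (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])

    return apply


def well_predicted_subset(
    model: Sequence[Point], native: Sequence[Point], motion: Motion, d: float
) -> List[int]:
    """Indices i whose moved model point lies within distance d of native[i]."""
    if len(model) != len(native):
        raise ValueError("model and native must have the same number of points")
    return [
        i
        for i, (p, q) in enumerate(zip(model, native))
        if math.dist(motion(p), q) <= d
    ]


def min_fit_distance(
    model: Sequence[Point], native: Sequence[Point], motion: Motion
) -> float:
    """Smallest d for which every pair matches under this one motion."""
    if len(model) != len(native):
        raise ValueError("model and native must have the same number of points")
    return max(math.dist(motion(p), q) for p, q in zip(model, native))
```

For example, if the model equals the native structure shifted by one unit along z, except for one badly placed point, then the motion `rigid_motion_z(0.0, (0.0, 0.0, -1.0))` matches every point but the outlier for a small d. The largest well-predicted subset is the best such index set over all rigid motions, which is what makes the exact problem expensive.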

References

  1. Agarwal, P.K., Matoušek, J., Suri, S.: Farthest neighbors, maximum spanning trees and related problems in higher dimensions. Comput. Geom. Theory Appl. 1(4), 189–201 (1992)

  2. Alt, H., Mehlhorn, K., Wagener, H., Welzl, E.: Congruence, similarity, and symmetries of geometric objects. In: SCG 1987: Proceedings of the third annual symposium on Computational geometry, pp. 308–315. ACM Press, New York (1987)

  3. Ambühl, C., Chakraborty, S., Gärtner, B.: Computing largest common point sets under approximate congruence. In: Paterson, M. (ed.) ESA 2000. LNCS, vol. 1879, pp. 52–64. Springer, Heidelberg (2000)

  4. Arun, K.S., Huang, T.S., Blostein, S.D.: Least-squares fitting of two 3-d point sets. IEEE Trans. Pattern Anal. Mach. Intell. 9(5), 698–700 (1987)

  5. Choi, V., Goyal, N.: A combinatorial shape matching algorithm for rigid protein docking. In: Sahinalp, S.C., Muthukrishnan, S.M., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109, pp. 285–296. Springer, Heidelberg (2004)

  6. Choi, V., Goyal, N.: An efficient approximation algorithm for point pattern matching under noise. In: Correa, J.R., Hevia, A., Kiwi, M. (eds.) LATIN 2006. LNCS, vol. 3887, pp. 298–310. Springer, Heidelberg (2006)

  7. Hamelryck, T., Kent, J.T., Krogh, A.: Sampling Realistic Protein Conformations Using Local Structural Bias. PLoS Computational Biology 2(9), e131 (2006)

  8. Kolodny, R., Koehl, P., Guibas, L., Levitt, M.: Small libraries of protein fragments model native protein structures accurately. J. Mol. Biol. 323, 297–307 (2002)

  9. Kolodny, R., Linial, N.: Approximate protein structural alignment in polynomial time. Proc. Natl. Acad. Sci. 101, 12201–12206 (2004)

  10. Lancia, G., Istrail, S.: Protein structure comparison: Algorithms and applications. In: Mathematical Methods for Protein Structure Analysis and Design, pp. 1–33 (2003)

  11. Moult, J., Fidelis, K., Rost, B., Hubbard, T., Tramontano, A.: Critical assessment of methods of protein structure prediction (CASP): round 6. Proteins: Struct. Funct. Genet. 61, 3–7 (2005)

  12. Siew, N., Elofsson, A., Rychlewski, L., Fischer, D.: MaxSub: an automated measure for the assessment of protein structure prediction quality. Bioinformatics 16(9), 776–785 (2000)

  13. Simons, K.T., Kooperberg, C., Huang, E., Baker, D.: Assembly of Protein Tertiary Structures from Fragments with Similar Local Sequences using Simulated Annealing and Bayesian Scoring Functions. J. Mol. Biol. 268 (1997)

  14. Zemla, A.: LGA: a method for finding 3D similarities in protein structures. Nucl. Acids Res. 31(13), 3370–3374 (2003)

  15. Zhang, Y., Skolnick, J.: Scoring function for automated assessment of protein structure template quality. Proteins: Structure, Function, and Bioinformatics 57(4), 702–710 (2004)

Editor information

Paolo Ferragina, Gad M. Landau

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, S.C., Bu, D., Xu, J., Li, M. (2008). Finding Largest Well-Predicted Subset of Protein Structure Models. In: Ferragina, P., Landau, G.M. (eds) Combinatorial Pattern Matching. CPM 2008. Lecture Notes in Computer Science, vol 5029. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69068-9_7

  • DOI: https://doi.org/10.1007/978-3-540-69068-9_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69066-5

  • Online ISBN: 978-3-540-69068-9

  • eBook Packages: Computer Science, Computer Science (R0)
