Finding Largest Well-Predicted Subset of Protein Structure Models

Li, Shuai Cheng; Bu, Dongbo; Xu, Jinbo; Li, Ming

doi:10.1007/978-3-540-69068-9_7

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5029)

Abstract

Evaluating the quality of models is a basic problem in protein structure prediction. Numerous evaluation criteria have been proposed; one of the most intuitive requires finding a largest well-predicted subset, that is, a maximum subset of the model that matches the native structure [12]. The problem is solvable in O(n⁷) time, which is too slow for practical use. We present a (1 + ε)d distance-approximation algorithm that runs in O(n³ log n / ε⁵) time for general protein structures. For globular proteins, this result can be improved to a randomized O(n log² n)-time algorithm with probability at least 1 − O(1/n). In addition, we propose a (1 + ε)-approximation algorithm that computes the minimum distance needed to fit all the points of a model to its native structure in O(n(log log n + log 1/ε)/ε⁵) time. We have implemented our algorithms, and the results indicate that our program finds many more matched pairs in less running time than TMScore, one of the most popular tools for assessing the quality of predicted models.
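
To make the two quantities in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes that residue i of the model corresponds to residue i of the native structure, and that a rigid transformation (a 3×3 rotation matrix plus a translation vector) has already been chosen. For that fixed transformation, the well-predicted subset for a threshold d is the set of residues whose transformed model point lies within d of its native counterpart, and the distance needed to fit all points is the largest such pair distance. The function names `apply_rigid`, `matched_subset`, and `fit_all_distance` are hypothetical; the search over transformations, which is what the algorithms in the paper address, is deliberately left out.

```python
import math

def apply_rigid(point, rotation, translation):
    """Return rotation * point + translation for a 3D point (hypothetical helper)."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )

def matched_subset(model, native, rotation, translation, d):
    """Indices i whose transformed model point is within d of native[i].

    Assumes model[i] and native[i] are corresponding residues.
    """
    matched = []
    for i, (p, q) in enumerate(zip(model, native)):
        if math.dist(apply_rigid(p, rotation, translation), q) <= d:
            matched.append(i)
    return matched

def fit_all_distance(model, native, rotation, translation):
    """Smallest d for which every pair matches under this fixed transformation."""
    return max(
        math.dist(apply_rigid(p, rotation, translation), q)
        for p, q in zip(model, native)
    )

# Toy data (made up for illustration): four corresponding points.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
no_shift = (0, 0, 0)
model = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
native = [(0, 0, 0), (1, 0.5, 0), (2, 0, 0), (3, 4, 0)]

subset = matched_subset(model, native, identity, no_shift, d=1.0)
worst = fit_all_distance(model, native, identity, no_shift)
```

With this toy data, `subset` holds the indices of the first three residues and `worst` is the 4.0 gap of the last pair. A different transformation generally gives a different subset, which is why the optimization over transformations dominates the cost.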

References

1. Agarwal, P.K., Matoušek, J., Suri, S.: Farthest neighbors, maximum spanning trees and related problems in higher dimensions. Comput. Geom. Theory Appl. 1(4), 189–201 (1992)
2. Alt, H., Mehlhorn, K., Wagener, H., Welzl, E.: Congruence, similarity, and symmetries of geometric objects. In: SCG 1987: Proceedings of the third annual symposium on Computational geometry, pp. 308–315. ACM Press, New York (1987)
3. Ambühl, C., Chakraborty, S., Gärtner, B.: Computing largest common point sets under approximate congruence. In: Paterson, M. (ed.) ESA 2000. LNCS, vol. 1879, pp. 52–64. Springer, Heidelberg (2000)
4. Arun, K.S., Huang, T.S., Blostein, S.D.: Least-squares fitting of two 3-d point sets. IEEE Trans. Pattern Anal. Mach. Intell. 9(5), 698–700 (1987)
5. Choi, V., Goyal, N.: A combinatorial shape matching algorithm for rigid protein docking. In: Sahinalp, S.C., Muthukrishnan, S.M., Dogrusoz, U. (eds.) CPM 2004. LNCS, vol. 3109, pp. 285–296. Springer, Heidelberg (2004)
6. Choi, V., Goyal, N.: An efficient approximation algorithm for point pattern matching under noise. In: Correa, J.R., Hevia, A., Kiwi, M. (eds.) LATIN 2006. LNCS, vol. 3887, pp. 298–310. Springer, Heidelberg (2006)
7. Hamelryck, T., Kent, J.T., Krogh, A.: Sampling Realistic Protein Conformations Using Local Structural Bias. PLoS Computational Biology 2(9), e131 (2006)
8. Kolodny, R., Koehl, P., Guibas, L., Levitt, M.: Small libraries of protein fragments model native protein structures accurately. J. Mol. Biol. 323, 297–307 (2002)
9. Kolodny, R., Linial, N.: Approximate protein structural alignment in polynomial time. Proc. Natl. Acad. Sci. 101, 12201–12206 (2004)
10. Lancia, G., Istrail, S.: Protein structure comparison: Algorithms and applications. In: Mathematical Methods for Protein Structure Analysis and Design, pp. 1–33 (2003)
11. Moult, J., Fidelis, K., Rost, B., Hubbard, T., Tramontano, A.: Critical assessment of methods of protein structure prediction (CASP): round 6. Proteins: Struct. Funct. Genet. 61, 3–7 (2005)
12. Siew, N., Elofsson, A., Rychlewski, L., Fischer, D.: Maxsub: an automated measure for the assessment of protein structure prediction quality. Bioinformatics 16(9), 776–785 (2000)
13. Simons, K.T., Kooperberg, C., Huang, E., Baker, D.: Assembly of Protein Tertiary Structures from Fragments with Similar Local Sequences using Simulated Annealing and Bayesian Scoring Functions. J. Mol. Biol. 268 (1997)
14. Zemla, A.: LGA: a method for finding 3D similarities in protein structures. Nucl. Acids Res. 31(13), 3370–3374 (2003)
15. Zhang, Y., Skolnick, J.: Scoring function for automated assessment of protein structure template quality. Proteins: Structure, Function, and Bioinformatics 57(4), 702–710 (2004)

Author information

Authors and Affiliations

David R. Cheriton School of Computer Science, University of Waterloo, Canada: Shuai Cheng Li, Dongbo Bu & Ming Li
Toyota Technological Institute at Chicago, USA: Jinbo Xu
Institute of Computing Technology, Chinese Academy of Sciences, China: Dongbo Bu

Editor information

Paolo Ferragina, Gad M. Landau

Cite this paper

Li, S.C., Bu, D., Xu, J., Li, M. (2008). Finding Largest Well-Predicted Subset of Protein Structure Models. In: Ferragina, P., Landau, G.M. (eds) Combinatorial Pattern Matching. CPM 2008. Lecture Notes in Computer Science, vol 5029. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69068-9_7

DOI: https://doi.org/10.1007/978-3-540-69068-9_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69066-5
Online ISBN: 978-3-540-69068-9
eBook Packages: Computer Science, Computer Science (R0)
