The Use of a Conformational Alphabet for Fast Alignment of Protein Structures
A protein conformational alphabet refers to the discretized states of the three-dimensional segmental structure of protein backbones. Here a letter corresponds to a cluster of combinations of three angles formed by the Cα pseudobonds of four contiguous residues, and our alphabet consists of 17 letters obtained by clustering based on the probability distribution of these angles. A substitution matrix called CLESUM has been derived from an alignment database of representative structures to measure both evolutionary and geometrical similarity between any two such letters. A structural fragment is then mapped to a string, and two strings whose CLESUM score exceeds a preset threshold form a similar fragment pair (SFP). The search for SFPs by string comparison is fast. Furthermore, CLESUM scores reflect the importance of SFPs to structure alignment, so the search space can be significantly reduced. A fast tool for pairwise alignment called CLePAPS is developed by collecting as many spatially consistent SFPs as possible. Extending the concept of SFPs to that of similar fragment blocks leads to a fast tool for multiple structure alignment called BLOMAPS. Both CLePAPS and BLOMAPS are tested on ensembles of various structures. They are reliable, and about two or three orders of magnitude faster than some well-known algorithms.
Keywords: Structure Alignment, Pairwise Alignment, Substitution Matrix, Fast Tool, Protein Structure Alignment
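The SFP search described in the abstract can be illustrated with a minimal sketch. It assumes that each structure has already been encoded as a string of conformational letters, and it replaces CLESUM with a hypothetical toy scoring function (+2 for identical letters, -1 otherwise), since neither the 17-letter alphabet nor the CLESUM entries are given here. The names `toy_score` and `find_sfps`, the fixed window length, and the threshold values are illustrative assumptions, not part of CLePAPS.

```python
from typing import List, Tuple


def toy_score(a: str, b: str) -> int:
    """Hypothetical stand-in for a CLESUM entry (NOT the real matrix):
    +2 for identical letters, -1 for any mismatch."""
    return 2 if a == b else -1


def find_sfps(s1: str, s2: str, length: int, threshold: int) -> List[Tuple[int, int, int]]:
    """Compare every ungapped window of `length` letters in s1 with every
    such window in s2, and keep the pairs whose summed score reaches
    `threshold`. Returns (start_in_s1, start_in_s2, score) tuples,
    highest score first."""
    hits = []
    for i in range(len(s1) - length + 1):
        for j in range(len(s2) - length + 1):
            score = sum(toy_score(s1[i + k], s2[j + k]) for k in range(length))
            if score >= threshold:
                hits.append((i, j, score))
    # Higher-scoring pairs first, so a later step could examine them first.
    hits.sort(key=lambda h: -h[2])
    return hits


# Two made-up letter strings sharing the fragment "CDEF".
hits = find_sfps("ABCDEFGH", "XXCDEFYY", length=4, threshold=8)
print(hits)
```

Sorting by score reflects the abstract's remark that the scores indicate how important an SFP is to the alignment: a pairwise aligner could try the top-scoring pairs first as seeds and discard the rest, which is one way the search space might shrink.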