Simultaneous Alignment and Folding of Protein Sequences
Developing accurate comparative analysis tools for low-homology proteins remains a difficult challenge in computational biology, especially for sequence alignment and consensus folding problems. We present partiFold-Align, the first algorithm for simultaneous alignment and consensus folding of unaligned protein sequences; the algorithm’s complexity is polynomial in time and space. Algorithmically, partiFold-Align exploits sparsity in the set of super-secondary structure pairings and alignment candidates to achieve an effectively cubic running time for simultaneous pairwise alignment and folding. We demonstrate the efficacy of these techniques on transmembrane β-barrel proteins, an important yet difficult class of proteins with few known three-dimensional structures. Tested against structurally derived sequence alignments, partiFold-Align significantly outperforms state-of-the-art pairwise sequence alignment tools in the most difficult low-sequence-homology case and improves secondary structure prediction where current approaches fail. Importantly, partiFold-Align requires no prior training. These general techniques are widely applicable to many more protein families. partiFold-Align is available at http://partiFold.csail.mit.edu.
Keywords: Structure Prediction; Secondary Structure Prediction; Consensus Structure; Folding Energy; Pairwise Sequence Alignment
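The abstract attributes the effectively cubic running time to sparsity in the sets of structure pairings and alignment candidates. The sketch below is not the partiFold-Align algorithm; it only illustrates the general idea of restricting a pairwise alignment dynamic program to a sparse set of candidate cells. The function names, the scoring parameters, and the diagonal-band filter are all assumptions chosen for illustration. The abstract does not say how the candidates are actually selected.

```python
from math import inf


def banded_candidates(n, m, width):
    """Hypothetical candidate filter: keep DP cells (i, j) within `width` of the diagonal."""
    return {(i, j) for i in range(n + 1) for j in range(m + 1) if abs(i - j) <= width}


def sparse_alignment_score(a, b, cells, match=1, mismatch=-1, gap=-1):
    """Global alignment score computed only over the DP cells listed in `cells`.

    Cells outside `cells` are treated as unreachable (-inf), so the work done
    is proportional to the number of candidate cells, not to len(a) * len(b).
    """
    n, m = len(a), len(b)
    score = {(0, 0): 0}  # empty prefix aligned to empty prefix
    # Sorted (row-major) order guarantees all three predecessors are visited first.
    for i, j in sorted(cells):
        if (i, j) == (0, 0):
            continue
        best = -inf
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            best = max(best, score.get((i - 1, j - 1), -inf) + sub)
        if i > 0:  # gap in b
            best = max(best, score.get((i - 1, j), -inf) + gap)
        if j > 0:  # gap in a
            best = max(best, score.get((i, j - 1), -inf) + gap)
        score[(i, j)] = best
    # -inf means the candidate set contains no complete alignment path.
    return score.get((n, m), -inf)
```

For example, `sparse_alignment_score("GATTACA", "GATACA", banded_candidates(7, 6, 1))` evaluates only the cells within one step of the diagonal. If the filter is too aggressive and excludes every complete path, the function returns `-inf`, which shows the trade-off any sparsification scheme has to manage.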