PAAA: A Progressive Iterative Alignment Algorithm Based on Anchors

  • Ahmed Mokaddem
  • Mourad Elloumi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7036)

Abstract

In this paper, we present a new iterative progressive algorithm for Multiple Sequence Alignment (MSA), called Progressive Iterative Alignment Algorithm Based on Anchors (PAAA). Our algorithm adopts a new distance, called the anchor distance, to compute the distance between two sequences, and uses a variant of the UPGMA method to construct a guide tree.
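To make the guide-tree step concrete, here is a minimal sketch of textbook UPGMA clustering. It is not the authors' variant, which the abstract does not describe, and the input distances stand in for anchor distances, whose definition is likewise not given here. The function name `upgma` and the dictionary-based input format are assumptions made for illustration.

```python
from itertools import combinations


def upgma(names, dist):
    """Textbook UPGMA (illustrative; not the PAAA variant).

    names: list of sequence labels.
    dist:  dict mapping a pair (a, b) of labels to their distance.
    Returns the guide tree as nested tuples.
    """
    # Each cluster is keyed by its (sub)tree and stores its number of leaves.
    clusters = {name: 1 for name in names}
    d = {frozenset(pair): value for pair, value in dist.items()}

    while len(clusters) > 1:
        # Pick the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        # Distance to the merged cluster is the size-weighted average.
        for c in clusters:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b

    return next(iter(clusters))


# Hypothetical pairwise distances between four sequences.
example = {
    ("A", "B"): 2, ("A", "C"): 6, ("A", "D"): 10,
    ("B", "C"): 6, ("B", "D"): 10, ("C", "D"): 8,
}
tree = upgma(["A", "B", "C", "D"], example)
```

In a progressive aligner, the resulting tree fixes the order in which sequences and partial alignments are merged, starting with the closest pair.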

PAAA has a time complexity of O(N^4 + N*L^2), where N is the number of sequences and L is the length of the longest sequence.

We benchmarked PAAA using different benchmarks, e.g., BALIBASE, HOMSTRAD, OXBENCH and BRALIBASE, and we compared the obtained results to those obtained with other alignment algorithms, e.g., CLUSTALW, MUSCLE, MAFFT and PROBCONS, using as criteria the Column Score (CS) and the Sum of Pairs Score (SPS). We obtained good results for protein sequences.
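The two criteria can be illustrated with a short sketch. It follows the usual definitions: SPS is the fraction of residue pairs aligned in the reference that are also aligned in the test alignment, and CS is the fraction of reference columns reproduced exactly. The abstract does not state how the scores were computed for PAAA, so the handling of gapped columns and the function names here are assumptions.

```python
from itertools import combinations


def columns(alignment):
    """Convert gapped rows into columns of residue indices (None marks a gap)."""
    counters = [0] * len(alignment)
    cols = []
    for j in range(len(alignment[0])):
        col = []
        for i, row in enumerate(alignment):
            if row[j] == "-":
                col.append(None)
            else:
                col.append(counters[i])
                counters[i] += 1
        cols.append(tuple(col))
    return cols


def aligned_pairs(alignment):
    """Set of (seq_i, residue_i, seq_k, residue_k) pairs sharing a column."""
    pairs = set()
    for col in columns(alignment):
        for (i, a), (k, b) in combinations(enumerate(col), 2):
            if a is not None and b is not None:
                pairs.add((i, a, k, b))
    return pairs


def sps(test, reference):
    """Sum of Pairs Score: share of reference residue pairs found in test."""
    ref = aligned_pairs(reference)
    return len(ref & aligned_pairs(test)) / len(ref)


def cs(test, reference):
    """Column Score: share of reference columns reproduced exactly in test."""
    ref_cols = columns(reference)
    test_cols = set(columns(test))
    return sum(c in test_cols for c in ref_cols) / len(ref_cols)


# Hypothetical two-sequence example; rows are in the same order in both.
reference = ["ACG-T", "A-GCT"]
test = ["ACG-T", "AG-CT"]
sps_value = sps(test, reference)
cs_value = cs(test, reference)
```

Both scores range from 0 to 1, and an alignment identical to the reference scores 1 on each. CS is the stricter of the two, since a single misplaced residue invalidates a whole column.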

Keywords

Multiple sequence alignment; progressive alignment; algorithms; complexities; distances

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Ahmed Mokaddem (1)
  • Mourad Elloumi (1)
  1. Research Unit of Technologies of Information and Communication, Higher School of Sciences and Technologies of Tunis, Tunisia
