Chaining Algorithms for Alignment of Draft Sequence

  • Mukund Sundararajan
  • Michael Brudno
  • Kerrin Small
  • Arend Sidow
  • Serafim Batzoglou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3240)

Abstract

In this paper we propose a chaining method that can align a draft genomic sequence against a finished genome. We introduce the use of an overlap tree to enhance the state information available to the chaining procedure in the context of sparse dynamic programming, and demonstrate that the resulting procedure more accurately penalizes the various biological rearrangements. The algorithm is tested on a whole genome alignment of seven yeast species. We also demonstrate a variation on the algorithm that can be used for co-assembly of two genomes and show how it can improve the current assembly of the Ciona savignyi (sea squirt) genome.
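
To make the notion of chaining concrete, the following is a minimal sketch of generic collinear chaining over local alignments. It is not the authors' algorithm: it uses a plain quadratic dynamic program rather than sparse dynamic programming, it has no overlap tree, and it handles no rearrangements. The `Anchor` type, the linear `gap_cost` penalty, and the function name `best_chain` are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Anchor:
    """A local alignment between two sequences (hypothetical type).

    Coordinates are half-open intervals [start, end) on each sequence.
    """
    x_start: int
    x_end: int
    y_start: int
    y_end: int
    score: float


def best_chain(anchors: List[Anchor], gap_cost: float = 0.1) -> Tuple[float, List[Anchor]]:
    """Return the highest-scoring collinear chain of anchors.

    Assumed scoring: sum of anchor scores minus gap_cost times the
    distance skipped on both sequences between consecutive anchors.
    Runs in O(n^2) time; sparse dynamic programming would replace the
    inner loop with a search-tree query.
    """
    if not anchors:
        return 0.0, []

    # Sorting by start on the first sequence guarantees that any valid
    # predecessor of an anchor appears before it in the list
    # (assuming anchors have positive length).
    order = sorted(anchors, key=lambda a: (a.x_start, a.y_start))
    n = len(order)
    best = [a.score for a in order]  # best chain score ending at each anchor
    prev = [-1] * n                  # back-pointers for chain recovery

    for j in range(n):
        b = order[j]
        for i in range(j):
            a = order[i]
            # a may precede b only if it ends before b starts on both sequences.
            if a.x_end <= b.x_start and a.y_end <= b.y_start:
                gap = (b.x_start - a.x_end) + (b.y_start - a.y_end)
                candidate = best[i] + b.score - gap_cost * gap
                if candidate > best[j]:
                    best[j] = candidate
                    prev[j] = i

    # Trace back from the best-scoring chain end.
    end = max(range(n), key=best.__getitem__)
    top_score = best[end]
    chain = []
    while end != -1:
        chain.append(order[end])
        end = prev[end]
    chain.reverse()
    return top_score, chain


if __name__ == "__main__":
    a = Anchor(0, 10, 0, 10, 10.0)
    b = Anchor(20, 30, 20, 30, 10.0)
    c = Anchor(12, 18, 50, 60, 5.0)  # far off the diagonal, so chaining it is costly
    score, chain = best_chain([c, b, a])
    print(score, [(x.x_start, x.y_start) for x in chain])
```

In this toy input the anchor `c` cannot be followed by `b` (it ends after `b` starts on the second sequence), so the best chain consists of `a` followed by `b`. A draft-sequence aligner would additionally need to consider anchors that appear in a different order or orientation across contigs, which this sketch does not attempt.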

Keywords

Local Alignment; Global Alignment; Genome Alignment; Draft Sequence; Binary Search Tree

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Mukund Sundararajan (1)
  • Michael Brudno (1)
  • Kerrin Small (2)
  • Arend Sidow (2, 3)
  • Serafim Batzoglou (1)

  1. Department of Computer Science, Stanford University, Stanford, USA
  2. Department of Genetics, Stanford University, Stanford, USA
  3. Department of Pathology, Stanford University, Stanford, USA