Counting, Generating and Sampling Tree Alignments

  • Conference paper
Algorithms for Computational Biology (AlCoB 2016)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 9702)

Abstract

Pairwise ordered tree alignments are combinatorial objects that appear in RNA secondary structure comparison. However, the usual representation of tree alignments as supertrees is ambiguous, i.e., two distinct supertrees may induce identical sets of matches between identical pairs of trees. This ambiguity is uninformative, and detrimental to any probabilistic analysis. In this work, we consider tree alignments up to equivalence. Our first result is a precise asymptotic enumeration of tree alignments, obtained from a context-free grammar by means of basic analytic combinatorics. Our second result focuses on alignments between two given ordered trees. By refining our grammar to align specific trees, we obtain a decomposition scheme for the space of alignments, and use it to design an efficient dynamic programming algorithm for sampling alignments under the Gibbs-Boltzmann probability distribution. This generalizes existing tree alignment algorithms, and opens the door to a probabilistic analysis of the space of suboptimal RNA secondary structure alignments.
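The pipeline summarized above (unambiguous decomposition, then counting, then Gibbs-Boltzmann sampling by dynamic programming) can be illustrated on a much simpler case. The following is a minimal sketch for alignments of two sequences rather than two trees; it is not the algorithm of the paper. An alignment is identified with its set of matches, so each alignment is counted exactly once. The names `partition_table`, `sample_matches` and `match_weight` are hypothetical, and `match_weight(i, k)` is assumed to return a positive Boltzmann factor such as exp(-E/kT) for matching position i of the first sequence with position k of the second.

```python
import math
import random


def partition_table(n, m, match_weight):
    """Z[i][j] = total weight of all match sets between the first i
    positions of sequence A and the first j positions of sequence B."""
    # With an empty prefix on either side, only the empty match set exists.
    Z = [[1.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Case 1: position i of A is left unmatched.
            total = Z[i - 1][j]
            # Case 2: position i of A is matched with position k of B;
            # positions k+1..j of B are then necessarily unmatched.
            for k in range(1, j + 1):
                total += match_weight(i, k) * Z[i - 1][k - 1]
            Z[i][j] = total
    return Z


def sample_matches(n, m, match_weight, Z, rng):
    """Stochastic backtrack: draw a match set with probability
    proportional to the product of its match weights."""
    matches = []
    i, j = n, m
    while i > 0 and j > 0:
        r = rng.random() * Z[i][j]
        chosen = None  # None means position i of A stays unmatched
        r -= Z[i - 1][j]
        if r >= 0:
            for k in range(1, j + 1):
                r -= match_weight(i, k) * Z[i - 1][k - 1]
                if r < 0:
                    chosen = k
                    break
            # If rounding lets the loop fall through, position i is
            # simply left unmatched, which is always a valid outcome.
        if chosen is None:
            i -= 1
        else:
            matches.append((i, chosen))
            i, j = i - 1, chosen - 1
    matches.reverse()
    return matches


if __name__ == "__main__":
    # Assumed toy energy model: every match contributes energy -1 at kT = 1.
    weight = lambda i, k: math.exp(1.0)
    table = partition_table(3, 4, weight)
    print(sample_matches(3, 4, weight, table, random.Random()))
```

With all weights set to 1, `Z[n][m]` counts the match sets between sequences of lengths n and m, which equals the binomial coefficient C(n+m, n). This gives a quick sanity check that the decomposition counts each alignment once.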

Notes

  1. In this work, unless explicitly specified, all trees will be rooted and ordered.

  2. The present results can be trivially extended to any edit scoring system that is a positive linear combination of the numbers of insertions, deletions and matches.

Corresponding author

Correspondence to Yann Ponty.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Chauve, C., Courtiel, J., Ponty, Y. (2016). Counting, Generating and Sampling Tree Alignments. In: Botón-Fernández, M., Martín-Vide, C., Santander-Jiménez, S., Vega-Rodríguez, M.A. (eds) Algorithms for Computational Biology. AlCoB 2016. Lecture Notes in Computer Science, vol 9702. Springer, Cham. https://doi.org/10.1007/978-3-319-38827-4_5

  • DOI: https://doi.org/10.1007/978-3-319-38827-4_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-38826-7

  • Online ISBN: 978-3-319-38827-4

  • eBook Packages: Computer Science, Computer Science (R0)
