On the Variance of the Optimal Alignments Score for Binary Random Words and an Asymmetric Scoring Function

Journal of Statistical Physics

Abstract

We investigate the order of the variance of the optimal alignments (OA) score of two independent i.i.d. binary random words of the same length. The letters are equiprobable, but the scoring function is asymmetric: one letter has a larger score than the other. In this setting, we prove that the variance is of linear order in the common length. OAs generalize longest common subsequences and can be represented as optimal paths in a two-dimensional last passage percolation setting with dependent weights.
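
As an illustration of the objects in the abstract, the following is a minimal sketch of how an OA score can be computed by dynamic programming and how its variance could be estimated by simulation. It is not the authors' method or proof technique. The function names, the match scores (2 for the letter 1, 1 for the letter 0), the zero score for mismatches, and the zero gap penalty are all assumptions chosen for illustration; the abstract does not specify the scoring function beyond one letter scoring more than the other.

```python
import random


def oa_score(x, y, score, gap=0.0):
    """Optimal global alignment score of words x and y by dynamic programming.

    `score(a, b)` is the score for aligning letter a with letter b and
    `gap` is the score for aligning a letter with a gap (assumed 0 here).
    """
    m = len(y)
    prev = [j * gap for j in range(m + 1)]  # row for the empty prefix of x
    for i in range(1, len(x) + 1):
        cur = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            cur[j] = max(
                prev[j - 1] + score(x[i - 1], y[j - 1]),  # align two letters
                prev[j] + gap,                            # gap in y
                cur[j - 1] + gap,                         # gap in x
            )
        prev = cur
    return prev[m]


def asymmetric(a, b, high=2.0, low=1.0):
    """Hypothetical asymmetric scoring: matching 1s score more than matching 0s."""
    if a != b:
        return 0.0  # assumption: mismatches score nothing
    return high if a == 1 else low


def lcs_score(a, b):
    """With this scoring and gap=0, the OA score is the LCS length."""
    return 1.0 if a == b else 0.0


def empirical_variance(n, trials, seed=0):
    """Sample variance of the OA score over independent equiprobable binary words."""
    rng = random.Random(seed)
    scores = []
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(n)]
        y = [rng.randint(0, 1) for _ in range(n)]
        scores.append(oa_score(x, y, asymmetric))
    mean = sum(scores) / trials
    return sum((s - mean) ** 2 for s in scores) / (trials - 1)
```

Calling `empirical_variance(n, trials)` for several values of `n` gives a rough numerical impression of how the variance grows with the common length; a simulation of this kind can only suggest a growth rate and does not substitute for the result proved in the paper.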


Acknowledgments

Research supported in part by the Simons Foundation Grant #246283.

Author information

Corresponding author

Correspondence to Heinrich Matzinger.

Cite this article

Houdré, C., Matzinger, H. On the Variance of the Optimal Alignments Score for Binary Random Words and an Asymmetric Scoring Function. J Stat Phys 164, 693–734 (2016). https://doi.org/10.1007/s10955-016-1549-1

  • DOI: https://doi.org/10.1007/s10955-016-1549-1
