Inferring Strings from Lyndon Factorization

  • Conference paper
Mathematical Foundations of Computer Science 2014 (MFCS 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8635)

Abstract

The Lyndon factorization of a string \(w\) is the unique factorization \(\ell_1^{p_1}, \ldots, \ell_m^{p_m}\) of \(w\) such that \(\ell_1, \ldots, \ell_m\) is a sequence of Lyndon words that is monotonically decreasing in lexicographic order. In this paper, we consider the reverse-engineering problem on Lyndon factorization: given a sequence \(S = ((s_1, p_1), \ldots, (s_m, p_m))\) of ordered pairs of positive integers, find a string \(w\) whose Lyndon factorization corresponds to the input sequence \(S\), i.e., the Lyndon factorization of \(w\) has the form \(\ell_1^{p_1}, \ldots, \ell_m^{p_m}\) with \(|\ell_i| = s_i\) for all \(1 \le i \le m\). Firstly, we show that there exists a simple \(O(n)\)-time algorithm if the size of the alphabet is unbounded, where \(n\) is the length of the output string. Secondly, we present an \(O(n)\)-time algorithm to compute a string over an alphabet of the smallest size. Thirdly, we show how to compute only the size of the smallest alphabet in \(O(m)\) time. Fourthly, we give an \(O(m)\)-time algorithm to compute an \(O(m)\)-size representation of a string over an alphabet of the smallest size. Finally, we propose an efficient algorithm to enumerate all strings whose Lyndon factorizations correspond to \(S\).
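To make the problem statement concrete, the following is a minimal Python sketch, not the algorithms of the paper. It computes a Lyndon factorization with Duval's well-known linear-time algorithm, checks whether a string corresponds to a given sequence \(S\), and then searches for a string over the smallest alphabet by exhaustive enumeration. The function names (`lyndon_factorization`, `corresponds_to`, `brute_force_infer`) and the `max_sigma` bound are hypothetical and chosen only for illustration; the exhaustive search takes exponential time and is usable only on tiny inputs, unlike the \(O(n)\)- and \(O(m)\)-time methods the abstract announces.

```python
from itertools import product
from string import ascii_lowercase


def lyndon_factorization(w):
    """Return [(lyndon_word, power), ...] for w using Duval's algorithm."""
    n, i, result = len(w), 0, []
    while i < n:
        j, k = i + 1, i
        # Extend the current run while it stays a prefix of a power
        # of a Lyndon word (or of a longer Lyndon word).
        while j < n and w[k] <= w[j]:
            k = i if w[k] < w[j] else k + 1
            j += 1
        period = j - k          # length of the Lyndon word found
        start, power = i, 0
        while i <= k:           # emit every full repetition of it
            i += period
            power += 1
        result.append((w[start:start + period], power))
    return result


def corresponds_to(w, S):
    """True if the factorization of w has lengths and powers given by S."""
    return [(len(f), p) for f, p in lyndon_factorization(w)] == list(S)


def brute_force_infer(S, max_sigma=4):
    """Exhaustive search (illustrative only, exponential time).

    Tries alphabet sizes 1, 2, ... up to the hypothetical bound max_sigma
    and returns (string, alphabet_size) for the first match, else None.
    """
    n = sum(s * p for s, p in S)
    for sigma in range(1, max_sigma + 1):
        for letters in product(ascii_lowercase[:sigma], repeat=n):
            w = "".join(letters)
            if corresponds_to(w, S):
                return w, sigma
    return None


if __name__ == "__main__":
    print(lyndon_factorization("banana"))
    print(brute_force_infer([(1, 1), (2, 2), (1, 1)]))
```

For example, "banana" factorizes as b, (an)^2, a, so it corresponds to \(S = ((1,1), (2,2), (1,1))\); the exhaustive search shows that two letters already suffice for this \(S\).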

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nakashima, Y., Okabe, T., I, T., Inenaga, S., Bannai, H., Takeda, M. (2014). Inferring Strings from Lyndon Factorization. In: Csuhaj-Varjú, E., Dietzfelbinger, M., Ésik, Z. (eds) Mathematical Foundations of Computer Science 2014. MFCS 2014. Lecture Notes in Computer Science, vol 8635. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44465-8_48

  • DOI: https://doi.org/10.1007/978-3-662-44465-8_48

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-44464-1

  • Online ISBN: 978-3-662-44465-8

  • eBook Packages: Computer Science, Computer Science (R0)