On Prefix Normal Words

  • Gabriele Fici
  • Zsuzsanna Lipták
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6795)


We present a new class of binary words: the prefix normal words. They are defined by the property that for any given length k, no factor of length k has more a’s than the prefix of the same length. These words arise in the context of indexing for jumbled pattern matching (a.k.a. permutation matching or Parikh vector matching), where the aim is to decide whether a string has a factor with a given multiplicity of characters, i.e., with a given Parikh vector. Using prefix normal words, we give the first non-trivial characterization of binary words having the same set of Parikh vectors of factors. We prove that the language of prefix normal words is not context-free and is strictly contained in the language of pre-necklaces, which are prefixes of powers of Lyndon words. We discuss further properties and state open problems.
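The defining property can be checked directly from the definition. The following is a minimal sketch, not the authors' algorithm: it assumes the word is given as a Python string over the alphabet {a, b}, and the function name `is_prefix_normal` and its `letter` parameter are hypothetical names chosen for illustration. It compares, for every length k, the maximum number of a's in any factor of length k with the number of a's in the prefix of length k, using prefix sums, in quadratic time.

```python
def is_prefix_normal(w: str, letter: str = "a") -> bool:
    """Return True if no factor of w has more `letter`s than the prefix of the same length.

    Brute-force check straight from the definition (O(n^2) time); the names
    here are illustrative assumptions, not taken from the paper.
    """
    n = len(w)
    # prefix[i] = number of occurrences of `letter` in w[:i]
    prefix = [0] * (n + 1)
    for i, ch in enumerate(w):
        prefix[i + 1] = prefix[i] + (1 if ch == letter else 0)

    for k in range(1, n + 1):
        # Maximum count of `letter` over all factors (substrings) of length k.
        max_in_factor = max(prefix[i + k] - prefix[i] for i in range(n - k + 1))
        # The prefix of length k must contain at least that many.
        if max_in_factor > prefix[k]:
            return False
    return True


if __name__ == "__main__":
    for word in ["aabab", "abaa", "ba"]:
        print(word, is_prefix_normal(word))
```

For example, `abaa` is not prefix normal under this check, because its factor `aa` has two a's while its prefix of length 2, `ab`, has only one.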


Parikh vectors, pre-necklaces, Lyndon words, context-free languages, jumbled pattern matching, permutation matching, non-standard pattern matching, indexing





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Gabriele Fici (1)
  • Zsuzsanna Lipták (2)
  1. I3S, CNRS & Université de Nice-Sophia Antipolis, France
  2. AG Genominformatik, Technische Fakultät, Bielefeld University, Germany
