The Maximum Equality-Free String Factorization Problem: Gaps vs. No Gaps

  • Radu Stefan Mincu
  • Alexandru Popa
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12011)


A factorization of a string w is a partition of w into substrings \(u_1,\dots ,u_k\) such that \(w=u_1 u_2 \cdots u_k\). Such a partition is called equality-free if no two factors are equal: \(u_i \ne u_j, \forall i,j\) with \(i \ne j\). The maximum equality-free factorization problem is to decide, for a given string w and integer k, whether w admits an equality-free factorization with k factors.
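The definition above can be made concrete with a small exhaustive search. The following is only an illustrative sketch, not one of the algorithms from the paper: the function names `is_equality_free` and `max_equality_free` are hypothetical, and the search takes exponential time, so it is suitable only for very short strings.

```python
def is_equality_free(factors):
    """Return True if no two factors in the list are equal."""
    return len(set(factors)) == len(factors)


def max_equality_free(w):
    """Largest k such that w has an equality-free factorization with k factors.

    Plain backtracking over all cut positions (illustrative only).
    """
    best = 0
    used = set()  # factors chosen so far on the current branch

    def extend(start):
        nonlocal best
        if start == len(w):
            best = max(best, len(used))
            return
        for end in range(start + 1, len(w) + 1):
            factor = w[start:end]
            if factor not in used:
                used.add(factor)
                extend(end)
                used.remove(factor)

    extend(0)
    return best


print(is_equality_free(["a", "b", "ab"]))  # True
print(max_equality_free("abab"))           # 3, e.g. a | b | ab
print(max_equality_free("aaaa"))           # 2, e.g. a | aaa
```

For a unary string such as "aaaa" the factors must have pairwise distinct lengths, which is why only two factors fit into four characters.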

Equality-free factorizations have lately received attention because of their application in DNA self-assembly. Condon et al. (CPM 2012) study a version of the problem and show that it is \(\mathcal {NP}\)-complete to decide if there exists an equality-free factorization with an upper bound on the length of the factors. At STACS 2015, Fernau et al. show that the maximum equality-free factorization problem with a lower bound on the number of factors is \(\mathcal {NP}\)-complete. Shortly after, Schmid (CiE 2015) presents results concerning the fixed-parameter tractability of the problems.

In this paper we approach equality-free factorizations from a practical point of view, i.e., we wish to obtain good solutions on given instances. To this end, we provide approximation algorithms, heuristics, Integer Programming models, and an improved FPT algorithm, and we conduct experiments to analyze the performance of the proposed algorithms.

Additionally, we study a relaxed version of the problem where gaps are allowed between factors, and we design a constant-factor approximation algorithm for this case. Surprisingly, after extensive experiments, we conjecture that the relaxed problem has the same optimum as the original.
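To illustrate the comparison between the two variants, here is a minimal brute-force sketch. It assumes that the relaxed problem asks for the maximum number of pairwise distinct, non-overlapping substrings of w, with uncovered positions acting as gaps; this reading, the function name `max_factors`, and the `allow_gaps` flag are assumptions made for illustration and are not taken from the paper.

```python
def max_factors(w, allow_gaps):
    """Maximum number of pairwise distinct factors chosen left to right in w.

    allow_gaps=False: the factors must cover w exactly (original problem).
    allow_gaps=True:  characters may be left uncovered (assumed relaxed variant).
    Exponential-time backtracking, intended for short strings only.
    """
    best = 0
    used = set()  # factors chosen so far on the current branch

    def extend(start):
        nonlocal best
        if start == len(w):
            best = max(best, len(used))
            return
        if allow_gaps:
            extend(start + 1)  # leave w[start] uncovered
        for end in range(start + 1, len(w) + 1):
            factor = w[start:end]
            if factor not in used:
                used.add(factor)
                extend(end)
                used.remove(factor)

    extend(0)
    return best


for w in ["aaaa", "abab", "aab"]:
    print(w, max_factors(w, allow_gaps=False), max_factors(w, allow_gaps=True))
```

Every factorization without gaps is also a valid solution with gaps, so the relaxed optimum can never be smaller. On these three short strings the two values coincide, which matches the conjecture but is of course no evidence for it in general.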


Keywords: String factorization · Equality-free · String algorithms · Heuristics


  1. Bulteau, L., Hüffner, F., Komusiewicz, C., Niedermeier, R.: Multivariate algorithmics for NP-hard string problems. Bull. EATCS 114, 295–301 (2014)
  2. Clifford, R., Harrow, A.W., Popa, A., Sach, B.: Generalised matching. In: Karlgren, J., Tarhio, J., Hyyrö, H. (eds.) SPIRE 2009. LNCS, vol. 5721, pp. 295–301. Springer, Heidelberg (2009)
  3. Condon, A., Maňuch, J., Thachuk, C.: The complexity of string partitioning. J. Discrete Algorithms 32, 24–43 (2015)
  4. Fernau, H., Manea, F., Mercas, R., Schmid, M.L.: Pattern matching with variables: fast algorithms and new hardness results. In: 32nd International Symposium on Theoretical Aspects of Computer Science, 4–7 March 2015, Garching, Germany, pp. 302–315 (2015)
  5. Schmid, M.L.: Computing equality-free and repetitive string factorisations. Theor. Comput. Sci. 618, 42–51 (2016)
  6. Spieksma, F.: On the approximability of an interval scheduling problem. J. Sched. 2(5), 215–227 (1999)
  7. Ziv, J., Lempel, A.: Compression of individual sequences via variable-rate coding. IEEE Trans. Inf. Theory 24, 530–536 (1978)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Computer Science, University of Bucharest, Bucharest, Romania
  2. National Institute for Research and Development in Informatics, Bucharest, Romania
