# Duplication-correcting codes

## Abstract

In this work, we propose code constructions that correct duplications of multiple consecutive symbols. Such errors are known as tandem duplications when a sequence of symbols is repeated, and as palindromic duplications when the sequence is repeated in reversed order. We compare the redundancies of these constructions with upper bounds on the code size obtained from sphere packing arguments. We prove that an upper bound on the code cardinality for tandem *deletions* is also an upper bound for *inserting* tandem duplications, and we derive our bounds from this tandem *deletion* error because doing so yields tighter bounds. Our upper bounds on the cardinality directly imply lower bounds on the redundancy, which we compare with the redundancy of the best known construction correcting arbitrary burst insertions. Our results indicate that correcting palindromic duplications requires more redundancy than correcting tandem duplications, and that both require significantly less redundancy than correcting arbitrary burst insertions.
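To make the two error types concrete, the following minimal Python sketch applies a tandem duplication, a palindromic duplication, and a tandem deletion to a short string. The function names, the `start`/`length` parameterization, and the DNA-style example word are illustrative assumptions and are not taken from the paper; the sketch only models the errors and is not one of the proposed code constructions.

```python
def tandem_duplicate(word: str, start: int, length: int) -> str:
    """Insert a copy of word[start:start+length] directly after that segment."""
    if start < 0 or length <= 0 or start + length > len(word):
        raise ValueError("segment must lie inside the word")
    segment = word[start:start + length]
    return word[:start + length] + segment + word[start + length:]


def palindromic_duplicate(word: str, start: int, length: int) -> str:
    """Insert the reversed copy of word[start:start+length] directly after it."""
    if start < 0 or length <= 0 or start + length > len(word):
        raise ValueError("segment must lie inside the word")
    segment = word[start:start + length]
    return word[:start + length] + segment[::-1] + word[start + length:]


def tandem_delete(word: str, start: int, length: int) -> str:
    """Remove the second copy of a tandem repeat that begins at `start`.

    This is the inverse operation of tandem_duplicate (a tandem deletion).
    """
    first = word[start:start + length]
    second = word[start + length:start + 2 * length]
    if start < 0 or length <= 0 or len(second) != length or first != second:
        raise ValueError("no tandem repeat of this length at this position")
    return word[:start + length] + word[start + 2 * length:]


if __name__ == "__main__":
    original = "ACGT"  # hypothetical example word
    print(tandem_duplicate(original, 1, 2))       # duplicates "CG"
    print(palindromic_duplicate(original, 1, 2))  # appends "GC" after "CG"
    print(tandem_delete(tandem_duplicate(original, 1, 2), 1, 2))
```

A tandem deletion undoes a tandem duplication at the same position, which is the relationship between the two error models that the bounds in the abstract rely on.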

## Keywords

Error-correcting codes, Duplication errors, Generalized sphere packing bound, DNA storage, Combinatorial channel, Burst insertions/deletions

## Mathematics Subject Classification

94B20, 94B65, 94B60

## Notes

### Acknowledgements

This work was supported by the Institute for Advanced Study (IAS), Technische Universität München (TUM), with funds from the German Excellence Initiative and the European Union’s Seventh Framework Program (FP7) under Grant Agreement No. 291763. Parts of this work have been presented at the 2017 Workshop on Coding and Cryptography (WCC), St. Petersburg [7].

## References

- 1. Dolecek L., Anantharam V.: Repetition error correcting sets: explicit constructions and prefixing methods. SIAM J. Discret. Math. **23**(4), 2120–2146 (2010).
- 2. Fazeli A., Vardy A., Yaakobi E.: Generalized sphere packing bound. IEEE Trans. Inf. Theory **61**(5), 2313–2334 (2015).
- 3. Hansen P.: Studies on Graphs and Discrete Programming, vol. 11. North Holland, New York (1981).
- 4. Jain S., Farnoud F., Schwartz M., Bruck J.: Duplication-correcting codes for data storage in the DNA of living organisms. In: IEEE International Symposium on Information Theory (ISIT), Barcelona, pp. 1028–1032 (2016).
- 5. Kulkarni A.A., Kiyavash N.: Nonasymptotic upper bounds for deletion correcting codes. IEEE Trans. Inf. Theory **59**(8), 5115–5130 (2013).
- 6. Kurmaev O.F.: Constant-weight and constant-charge binary run-length limited codes. IEEE Trans. Inf. Theory **57**(7), 4497–4515 (2011).
- 7. Lenz A., Wachter-Zeh A., Yaakobi E.: Bounds on codes correcting tandem and palindromic duplications. In: Workshop on Coding and Cryptography (WCC) (2017).
- 8. Levenshtein V.: Binary codes capable of correcting spurious insertions and deletions of ones. Probl. Pereda. Inform. **1**(1), 12–25 (1965).
- 9. Levenshtein V.: Binary codes capable of correcting deletions, insertions and reversals. Sov. Phys. Dokl. **10**, 707–710 (1966).
- 10. Mahdavifar H., Vardy A.: Asymptotically optimal sticky-insertion-correcting codes with efficient encoding and decoding. In: IEEE International Symposium on Information Theory (ISIT), Aachen, pp. 2688–2692 (2017).
- 11. Roth R., Siegel P.: Lee-metric BCH codes and their application to constrained and partial-response channels. IEEE Trans. Inf. Theory **40**(4), 1083–1096 (1994).
- 12. Schoeny C., Wachter-Zeh A., Gabrys R., Yaakobi E.: Codes correcting a burst of deletions or insertions. IEEE Trans. Inf. Theory **63**(4), 1971–1985 (2017).
- 13. Varshamov R.R., Tenengolts G.M.: Codes which correct single asymmetric errors. Autom. Remote Control **26**(2), 286–290 (1965).