Enhancing a Genetic Algorithm with a Solution Archive to Reconstruct Cross Cut Shredded Text Documents
In this work, a trie-based complete solution archive is combined with a genetic algorithm and applied to the Reconstruction of Cross-Cut Shredded Text Documents (RCCSTD) problem. The archive detects duplicate solutions and converts them into new, not yet visited solutions. Cross-cut shredded documents are documents that have been cut into rectangular pieces of equal size and shape; reconstructing such documents can be of high interest in forensic science. Two types of tries are compared as the underlying data structure: an indexed trie and a linked trie. Experiments indicate that the latter needs considerably less memory without affecting the run-time. While the archive-enhanced genetic algorithm yields better results for runs with a fixed number of iterations, the advantages diminish when run-time is considered, due to the additional overhead of the archive.
Keywords: genetic algorithm, solution archive, reconstruction
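To make the idea of a complete solution archive more concrete, the following is a minimal sketch of a trie that stores every visited solution, detects duplicates, and converts a duplicate into a solution that has not been visited yet. It rests on simplifying assumptions that are not taken from the paper: solutions are fixed-length sequences over a small integer alphabet (the actual RCCSTD encoding is more involved), child pointers are kept in a Python dict (not claimed to match either trie variant compared in the paper), and the names `TrieArchive`, `insert`, `convert` and `add_or_convert` are hypothetical.

```python
import random


class Node:
    """One trie node; `complete` means every solution below it was visited."""
    __slots__ = ("children", "complete")

    def __init__(self):
        self.children = {}
        self.complete = False


class TrieArchive:
    """Illustrative archive for sequences of `length` genes in range(alphabet_size)."""

    def __init__(self, length, alphabet_size):
        self.length = length
        self.k = alphabet_size
        self.root = Node()

    def insert(self, solution):
        """Store `solution`; return True if it was new, False if it is a duplicate."""
        if len(solution) != self.length:
            raise ValueError("solution has the wrong length")
        node, path = self.root, []
        for gene in solution:
            path.append(node)
            node = node.children.setdefault(gene, Node())
        if node.complete:          # leaf already present: duplicate
            return False
        node.complete = True
        # Propagate completeness upwards while whole subtrees are exhausted.
        for parent in reversed(path):
            if len(parent.children) == self.k and all(
                c.complete for c in parent.children.values()
            ):
                parent.complete = True
            else:
                break
        return True

    @staticmethod
    def _is_open(node, gene):
        """True if the branch `gene` below `node` still holds unvisited solutions."""
        child = node.children.get(gene) if node is not None else None
        return child is None or not child.complete

    def convert(self, solution, rng=random):
        """Turn an already archived solution into an unvisited one."""
        if self.root.complete:
            raise RuntimeError("every solution has already been visited")
        node, path = self.root, []
        for gene in solution:
            path.append(node)
            node = node.children[gene]
        # Pick a random depth on the path that still has an open branch.
        start = rng.choice([d for d, n in enumerate(path) if not n.complete])
        result, node = list(solution[:start]), path[start]
        for depth in range(start, self.length):
            options = [g for g in range(self.k) if self._is_open(node, g)]
            # Below the deviation point, keep the original gene when possible.
            if depth > start and solution[depth] in options:
                gene = solution[depth]
            else:
                gene = rng.choice(options)
            result.append(gene)
            node = node.children.get(gene) if node is not None else None
        return tuple(result)

    def add_or_convert(self, solution, rng=random):
        """Archive `solution`, or archive and return a fresh replacement for it."""
        solution = tuple(solution)
        if self.insert(solution):
            return solution
        fresh = self.convert(solution, rng)
        self.insert(fresh)
        return fresh
```

In a genetic algorithm, each newly created offspring would be passed through something like `add_or_convert` before evaluation, so that no candidate is evaluated twice. The bookkeeping on every insertion is also where the additional overhead mentioned in the abstract would come from in a sketch like this.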