Abstract
We define the directed acyclic subsequence graph of a text as the smallest deterministic partial finite automaton that recognizes all possible subsequences of that text. We define the size of the automaton as the size of the transition function, not the number of states. We show that this automaton can be built in O(n log n) time and O(n) space for a text of size n. With this structure, we can search for a subsequence in logarithmic time. We extend this construction to the case of multiple strings, obtaining an O(n^2 log n) time and O(n^2) space algorithm, where n is the size of the set of strings. For the latter case, we discuss its application to the longest common subsequence problem, improving on previous solutions.
(Preliminary version)
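As a rough illustration of the kind of query the abstract describes, the following Python sketch tests whether a string is a subsequence of a fixed text. It is not the construction from the paper: it swaps in a simpler, well-known technique (one sorted list of positions per symbol, searched with binary search) that behaves like a partial deterministic automaton whose states are text positions. Under these assumptions the index takes O(n) space and each transition costs O(log n). The names `SubsequenceIndex`, `step`, and `accepts` are hypothetical and chosen only for this example.

```python
from bisect import bisect_left
from collections import defaultdict


class SubsequenceIndex:
    """Hypothetical helper (not from the paper): subsequence tests on one text."""

    def __init__(self, text):
        # positions[ch] is the increasing list of indices where ch occurs.
        self.positions = defaultdict(list)
        for i, ch in enumerate(text):
            self.positions[ch].append(i)

    def step(self, state, ch):
        """Follow the transition labelled ch from state.

        A state is the index of the first text position not yet consumed.
        Returns the next state, or None if the transition is undefined
        (the automaton is partial).
        """
        occurrences = self.positions.get(ch)
        if not occurrences:
            return None
        # First occurrence of ch at an index >= state, found by binary search.
        k = bisect_left(occurrences, state)
        if k == len(occurrences):
            return None
        return occurrences[k] + 1

    def accepts(self, query):
        """Return True if query is a subsequence of the indexed text."""
        state = 0
        for ch in query:
            state = self.step(state, ch)
            if state is None:
                return False
        return True


index = SubsequenceIndex("abcab")
print(index.accepts("acb"))  # "a", "c", "b" occur in this order in "abcab"
print(index.accepts("cc"))   # "abcab" contains only one "c"
```

Every state is accepting here, since any prefix of a subsequence is itself a subsequence; a query is rejected only when a transition is missing.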
This work was supported by the Institute of Computer Research of the University of Waterloo and by the University of Chile.
Copyright information
© 1989 Springer-Verlag Berlin Heidelberg
Cite this paper
Baeza-Yates, R.A. (1989). The subsequence graph of a text. In: Díaz, J., Orejas, F. (eds) TAPSOFT '89. CAAP 1989. Lecture Notes in Computer Science, vol 351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-50939-9_127
DOI: https://doi.org/10.1007/3-540-50939-9_127
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-50939-4
Online ISBN: 978-3-540-46116-6
eBook Packages: Springer Book Archive