Abstract
A K-indexing is a mapping from an alphabet Σ to a set Γ of K symbols; it extends to a homomorphism that transforms strings over Σ into strings over Γ. Given Σ, two disjoint sets of strings P and Q over Σ, and Γ = {1, ..., K}, Alphabet Indexing is the problem of finding a K-indexing that transforms no string in P and string in Q into the same string. Although this problem is NP-complete, applying a K-indexing to input data brings remarkable advantages in practical applications. In this paper, we introduce Max K-Indexing, a maximization version of Alphabet Indexing that seeks to maximize the number of pairs in P×Q whose two strings are transformed into different strings. We show that the problem is MAX SNP-hard; that is, it is unlikely to have a polynomial-time algorithm achieving an arbitrarily small error ratio. We then propose a simple polynomial-time greedy algorithm and show that it attains the constant error ratio 1/K for K-indexing. We also define the M-Tuple K-Indexing problem by extending the pairs of strings in Max K-Indexing to tuples of strings drawn from more than two sets, and show that a natural extension of the algorithm also achieves a constant error bound.
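To make the objective concrete, the following is a minimal Python sketch of applying a K-indexing to strings and counting the separated pairs in P×Q, the quantity that Max K-Indexing maximizes. The function names (`apply_indexing`, `separated_pairs`, `greedy_indexing`) and the toy instance are hypothetical and not taken from the paper. The greedy routine is only an illustrative symbol-by-symbol heuristic; it is not claimed to be the algorithm analyzed in the paper, nor to attain its error ratio.

```python
def apply_indexing(indexing, s):
    """Transform string s symbol by symbol (the homomorphism induced by the indexing)."""
    return tuple(indexing[c] for c in s)


def separated_pairs(indexing, P, Q):
    """Count pairs (p, q) in P x Q whose images under the indexing differ."""
    images_p = [apply_indexing(indexing, p) for p in P]
    images_q = [apply_indexing(indexing, q) for q in Q]
    return sum(1 for ip in images_p for iq in images_q if ip != iq)


def greedy_indexing(sigma, P, Q, K):
    """Illustrative heuristic (an assumption, not the paper's algorithm):
    start with every symbol mapped to 1, then visit symbols in sorted order
    and move each to the label in {1, ..., K} that strictly improves the
    number of separated pairs the most."""
    indexing = {c: 1 for c in sigma}
    for c in sorted(sigma):
        best_label = indexing[c]
        best_score = separated_pairs(indexing, P, Q)
        for k in range(1, K + 1):
            indexing[c] = k
            score = separated_pairs(indexing, P, Q)
            if score > best_score:
                best_label, best_score = k, score
        indexing[c] = best_label
    return indexing


# Toy instance (hypothetical data): three symbols compressed to K = 2 labels.
sigma = {"a", "b", "c"}
P = ["ab", "ca"]
Q = ["ac", "ba"]
result = greedy_indexing(sigma, P, Q, K=2)
print(result, separated_pairs(result, P, Q))
```

With K = 2 at least two of the three symbols must share a label, so some pairs may collapse to the same image; the count returned by `separated_pairs` measures how many pairs remain distinguishable after the transformation.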
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shimozono, S. (1995). An approximation algorithm for alphabet indexing problem. In: Staples, J., Eades, P., Katoh, N., Moffat, A. (eds) Algorithms and Computations. ISAAC 1995. Lecture Notes in Computer Science, vol 1004. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0015403
DOI: https://doi.org/10.1007/BFb0015403
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60573-7
Online ISBN: 978-3-540-47766-2
eBook Packages: Springer Book Archive