Sharper Upper and Lower Bounds for an Approximation Scheme for Consensus-Pattern
We present sharper upper and lower bounds for a known polynomial-time approximation scheme (PTAS) due to Li, Ma and Wang for the Consensus-Pattern problem. This NP-hard problem is an abstraction of motif finding, a common bioinformatics discovery task. The PTAS due to Li et al. is simple, and a preliminary implementation gave reasonable results in practice. However, the previously known bounds on its performance are too weak to be informative at parameter settings for which the runtime is actually manageable. Here, we present much sharper lower and upper bounds on the performance of this algorithm that partially explain why its behavior is so much better in practice than what was previously predicted in theory. We also give specific examples of instances of the problem for which the PTAS performs poorly in practice, and show that the asymptotic performance bound given in the original proof matches the behavior of a simple variant of the algorithm on a particularly bad instance of the problem.
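The abstract does not restate the problem or the scheme, so the following is only a minimal sketch under the usual formulation: given a set of strings and a length L, find a pattern of length L together with one length-L substring per input string so that the total Hamming distance to the pattern is minimized. The scheme is sketched here as trying every choice of r length-L substrings of the input (repetition allowed), taking their column-wise majority string as a candidate, and keeping the cheapest candidate. All function names and the toy input are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import combinations_with_replacement


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def best_match_cost(pattern, s):
    """Smallest Hamming distance between pattern and any window of s."""
    L = len(pattern)
    return min(hamming(pattern, s[i:i + L]) for i in range(len(s) - L + 1))


def total_cost(pattern, strings):
    """Consensus-Pattern objective: sum of best-match costs over all strings."""
    return sum(best_match_cost(pattern, s) for s in strings)


def column_majority(substrings):
    """Most frequent letter in each column of equal-length substrings."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*substrings))


def consensus_pattern_sketch(strings, L, r):
    """Try every multiset of r length-L windows; return (pattern, cost).

    Assumption: this mirrors the sampling idea of the scheme only in outline.
    The running time grows roughly like (number of windows) ** r.
    """
    windows = sorted({s[i:i + L] for s in strings for i in range(len(s) - L + 1)})
    best = None
    for chosen in combinations_with_replacement(windows, r):
        candidate = column_majority(chosen)
        cost = total_cost(candidate, strings)
        if best is None or cost < best[1]:
            best = (candidate, cost)
    return best


if __name__ == "__main__":
    toy = ["ACGTAC", "TTACGT", "GACGTT"]  # hypothetical input; ACGT occurs in each
    print(consensus_pattern_sketch(toy, L=4, r=1))
```

Larger values of r give more candidates and hence a better guarantee, at a rapidly growing cost in runtime; this is the trade-off that the bounds discussed above quantify.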
Keywords: Approximation Scheme, Approximation Ratio, Input String, Alphabet Size, Binary Alphabet