A Faster Reliable Algorithm to Estimate the p-Value of the Multinomial llr Statistic
The problem of estimating the p-value of the log-likelihood ratio (llr) statistic for the multinomial distribution has been studied extensively in the statistical literature. Bioinformatics, however, has posed new challenges for this research because it often focuses on the “thin tail” of the distribution, where classical statistical approximations typically fail. Hence, some of the more recent developments in this area have come from the bioinformatics community.
Since algorithms for computing the exact p-value have exponential complexity, the only generally applicable algorithms for reliably estimating the p-value are lattice-based. In particular, Hertz and Stormo have a dynamic programming algorithm whose complexity is O(QKN^2), where Q is the size of the lattice, K is the size of the alphabet, and N is the size of the sample. We present a new algorithm that is practically as reliable as Hertz and Stormo’s and has a complexity of O(QKN log N). An interesting feature of our algorithm is that it can guarantee the quality of its estimated p-value.
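To make the O(QKN^2) cost concrete, the following is a minimal sketch of a generic lattice-based dynamic program for this p-value. It is not Hertz and Stormo's implementation, and it is not the O(QKN log N) algorithm announced above. It assumes the usual form of the statistic, I(n) = Σ_k n_k log(n_k / (N π_k)), where π is the null distribution. The function names (`llr_term`, `llr`, `lattice_pvalue_bounds`) and the default lattice size are hypothetical, chosen only for illustration. Because every per-letter term is rounded to the lattice, the sketch returns a lower and an upper bound on the p-value rather than a single number. It works in plain floating point with factorials, so it is only suitable for small N.

```python
import math


def llr_term(m, N, p):
    """Contribution m*log(m/(N*p)) of one letter with count m (0 when m == 0)."""
    return 0.0 if m == 0 else m * math.log(m / (N * p))


def llr(counts, pi):
    """llr statistic of a count vector against null frequencies pi (assumed form)."""
    N = sum(counts)
    return sum(llr_term(m, N, p) for m, p in zip(counts, pi))


def lattice_pvalue_bounds(s0, N, pi, Q=1024):
    """Bracket P(llr >= s0) under Multinomial(N, pi) on a lattice of about Q points."""
    K = len(pi)
    terms = [[llr_term(m, N, p) for m in range(N + 1)] for p in pi]
    lo = sum(min(t) for t in terms)  # smallest attainable sum of terms
    hi = sum(max(t) for t in terms)  # largest attainable sum of terms
    delta = (hi - lo) / (Q - 1)      # lattice step
    # Shift each letter's terms to be non-negative and round to a lattice index.
    idx = [[round((x - min(t)) / delta) for x in t] for t in terms]
    size = sum(max(row) for row in idx) + 1
    # f[n][j]: accumulated weight of placing n symbols so far with lattice score j.
    f = [[0.0] * size for _ in range(N + 1)]
    f[0][0] = 1.0
    for k in range(K):                  # K letters
        g = [[0.0] * size for _ in range(N + 1)]
        for n in range(N + 1):          # N + 1 running totals
            for m in range(n + 1):      # letter k occurs m times
                w = pi[k] ** m / math.factorial(m)
                j0 = idx[k][m]
                src, dst = f[n - m], g[n]
                for j in range(size - j0):  # about Q lattice points
                    if src[j]:
                        dst[j + j0] += w * src[j]
        f = g
    # Multiplying by N! turns the weights into the pmf of the lattice score.
    dist = [math.factorial(N) * x for x in f[N]]
    # Each term is off by at most delta/2, so the total is off by at most K*delta/2;
    # the extra 1e-9 is slack for floating-point comparison.
    half_band = K * delta / 2 + 1e-9
    lower = sum(p for j, p in enumerate(dist) if lo + j * delta >= s0 + half_band)
    upper = sum(p for j, p in enumerate(dist) if lo + j * delta >= s0 - half_band)
    return lower, upper


# Usage: bracket the p-value of an observed count vector.
pi = [0.1, 0.2, 0.3, 0.4]
counts = [4, 1, 2, 1]
print(lattice_pvalue_bounds(llr(counts, pi), sum(counts), pi))
```

The four nested loops run over the K letters, the N + 1 running totals, the up to N + 1 counts of the current letter, and the roughly Q lattice points, which is where the O(QKN^2) bound comes from. Increasing Q narrows the gap between the two bounds at a proportional cost in time and memory.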
Keywords: Numerical Error; Dynamic Programming Algorithm; Multinomial Distribution; Roundoff Error; Runtime Comparison
- 2. Bailey, T.L., Elkan, C.: Fitting a mixture model by expectation maximization to discover motifs in biopolymers. In: Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, Menlo Park, California, pp. 28–36 (1994)
- 3. Bejerano, G.: Efficient exact value computation and applications to biosequence analysis. In: Vingron, M., Istrail, S., Pevzner, P.A., Waterman, M.S. (eds.) Proceedings of the Seventh Annual International Conference on Computational Molecular Biology (RECOMB 2003), Berlin, Germany, pp. 38–47. ACM Press, New York (2003)
- 9. Keich, U.: Efficiently computing the p-value of the entropy score. Journal of Computational Biology (in press)