Abstract
Motif finding is one of the NP-complete problems in computational biology. Existing nondeterministic algorithms for motif finding do not guarantee globally optimal results and are sensitive to their initial parameters. To address this, the PROJECTION algorithm provides a good initial estimate that can be further refined using local optimization algorithms such as EM, MEME, or Gibbs sampling. For large inputs (600 to 1000 base pairs per sequence) or for challenging motif finding problems, however, PROJECTION may take an inordinate amount of time to run. In this paper we present a parallel implementation of the PROJECTION algorithm on Graphics Processing Units (GPUs) using CUDA. We also discuss several major issues we encountered, including the space optimizations required by the GPU's limited memory.
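To make the "initial estimate" step concrete, here is a minimal sequential sketch of the random-projection idea that PROJECTION is built on: every l-mer in the input is hashed by the letters at k randomly chosen positions, and buckets that collect many l-mers become candidate starting points for refinement. This is not the GPU implementation presented in the paper. The function names, the parameters (l, k, threshold), and the choice of Python are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def project_lmers(sequences, l, k, rng):
    """Bucket every l-mer by the letters at k randomly chosen positions.

    Hypothetical helper for illustration; returns the chosen positions and
    a dict mapping each projected key to a list of (sequence index, offset).
    """
    # One random projection: k distinct positions out of the l in an l-mer.
    positions = sorted(rng.sample(range(l), k))
    buckets = defaultdict(list)
    for seq_idx, seq in enumerate(sequences):
        for start in range(len(seq) - l + 1):
            lmer = seq[start:start + l]
            # The bucket key keeps only the letters at the sampled positions,
            # so l-mers that differ elsewhere still land in the same bucket.
            key = "".join(lmer[p] for p in positions)
            buckets[key].append((seq_idx, start))
    return positions, buckets


def enriched_buckets(buckets, threshold):
    """Keep only buckets with at least `threshold` l-mers (assumed cutoff)."""
    return {key: hits for key, hits in buckets.items() if len(hits) >= threshold}


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so repeated runs pick the same positions
    seqs = ["ACGTACGTAA", "TTACGTACGG", "GGACGAACGT"]
    positions, buckets = project_lmers(seqs, l=6, k=3, rng=rng)
    print("projected positions:", positions)
    for key, hits in enriched_buckets(buckets, threshold=2).items():
        print(key, hits)
```

In a full pipeline this bucketing would be repeated for several independent projections, and the l-mers in each enriched bucket would seed a local refinement such as EM. The work per l-mer is independent of every other l-mer, which is the property that makes the step a natural candidate for a data-parallel GPU mapping.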
Copyright information
© 2012 Springer Tokyo
About this paper
Cite this paper
Clemente, J.B., Cabarle, F.G.C., Adorna, H.N. (2012). PROJECTION Algorithm for Motif Finding on GPUs. In: Nishizaki, Sy., Numao, M., Caro, J., Suarez, M.T. (eds) Theory and Practice of Computation. Proceedings in Information and Communications Technology, vol 5. Springer, Tokyo. https://doi.org/10.1007/978-4-431-54106-6_9
DOI: https://doi.org/10.1007/978-4-431-54106-6_9
Publisher Name: Springer, Tokyo
Print ISBN: 978-4-431-54105-9
Online ISBN: 978-4-431-54106-6
eBook Packages: Computer Science, Computer Science (R0)