
PROJECTION Algorithm for Motif Finding on GPUs

  • Conference paper
Theory and Practice of Computation

Abstract

The motif finding problem is one of the NP-complete problems in computational biology. Existing nondeterministic algorithms for motif finding do not guarantee globally optimal results and are sensitive to their initial parameters. To address this, the PROJECTION algorithm provides a good initial estimate that can be further refined using local optimization algorithms such as EM, MEME, or Gibbs sampling. For sufficiently large inputs (600-1000 base pairs per sequence) or for challenging motif finding problems, the PROJECTION algorithm may take an inordinate amount of time to run. In this paper we present a parallel implementation of the PROJECTION algorithm on Graphics Processing Units (GPUs) using CUDA. We also list several major issues we encountered, including the space optimizations required by the GPU's limited memory.
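The bucketing step at the core of random projection can be illustrated with a minimal sequential sketch in Python. This is not the GPU implementation described in the paper. It only shows the general idea from the random projection literature: every l-mer is hashed by the characters at k randomly chosen positions, and buckets that collect many l-mers become candidate starting points for refinement. The function names `project` and `enriched_buckets`, the parameters, and the toy sequences are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def project(sequences, l, k, rng):
    """Hash every l-mer into a bucket keyed by k randomly chosen positions.

    Hypothetical helper for illustration; returns the chosen positions and
    a dict mapping each key to a list of (sequence index, start offset).
    """
    positions = sorted(rng.sample(range(l), k))
    buckets = defaultdict(list)
    for seq_idx, seq in enumerate(sequences):
        for start in range(len(seq) - l + 1):
            lmer = seq[start:start + l]
            # The bucket key keeps only the characters at the sampled positions.
            key = "".join(lmer[p] for p in positions)
            buckets[key].append((seq_idx, start))
    return positions, dict(buckets)


def enriched_buckets(buckets, threshold):
    """Keep buckets with at least `threshold` l-mers (candidates for refinement)."""
    return {key: hits for key, hits in buckets.items() if len(hits) >= threshold}


# Toy input (assumed, not from the paper): three short DNA sequences.
sequences = ["ACGTACGTTT", "TTACGAACGT", "GGACGTACGA"]
positions, buckets = project(sequences, l=6, k=3, rng=random.Random(0))
candidates = enriched_buckets(buckets, threshold=2)
```

Each l-mer is hashed independently of the others, which is what makes this step a natural target for data-parallel hardware. In the full algorithm the projection is repeated over many random position sets, and each enriched bucket seeds a local refinement such as EM.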




Copyright information

© 2012 Springer Tokyo

About this paper

Cite this paper

Clemente, J.B., Cabarle, F.G.C., Adorna, H.N. (2012). PROJECTION Algorithm for Motif Finding on GPUs. In: Nishizaki, Sy., Numao, M., Caro, J., Suarez, M.T. (eds) Theory and Practice of Computation. Proceedings in Information and Communications Technology, vol 5. Springer, Tokyo. https://doi.org/10.1007/978-4-431-54106-6_9


  • DOI: https://doi.org/10.1007/978-4-431-54106-6_9

  • Publisher Name: Springer, Tokyo

  • Print ISBN: 978-4-431-54105-9

  • Online ISBN: 978-4-431-54106-6

  • eBook Packages: Computer Science, Computer Science (R0)
