
Large Scale Protein Sequence Alignment Using FPGA Reprogrammable Logic Devices

  • Conference paper
Field Programmable Logic and Application (FPL 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3203)

Abstract

In this paper we show how to significantly accelerate the Smith-Waterman protein sequence alignment algorithm using reprogrammable logic devices, FPGAs (Field Programmable Gate Arrays). Because of its perfect sensitivity, the Smith-Waterman algorithm is important in the field of computational biology, but its computational complexity makes it impractical for large database searches on general-purpose computers.

The approach presented here supports amino acid sequence alignment with a full substitution matrix, which leads to a more complex formula than the one used in DNA alignment and is much more memory demanding. We propose a parallelization scheme different from the commonly used systolic arrays, leading to full utilization of the PUs (Processing Units) regardless of sequence length. An FPGA-based implementation of the Smith-Waterman algorithm can accelerate sequence alignment on a Pentium desktop computer by two orders of magnitude compared to the standard OSEARCH program from the FASTA package.
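To make the role of the substitution matrix concrete, the following is a minimal software sketch of Smith-Waterman local alignment scoring. It is not the FPGA design described in the paper: the linear gap penalty, the toy match/mismatch matrix, and the names `smith_waterman_score` and `toy_matrix` are assumptions chosen for illustration only. A real protein search would use a full 20x20 matrix such as BLOSUM, which is where the extra memory demand relative to DNA alignment comes from.

```python
def toy_matrix(alphabet, match=5, mismatch=-3):
    """Build a toy substitution matrix (assumed values, not a real BLOSUM table)."""
    return {(a, b): (match if a == b else mismatch)
            for a in alphabet for b in alphabet}


def smith_waterman_score(query, target, subst, gap=4):
    """Return the best local alignment score of query against target.

    Assumes a linear gap penalty `gap`; only two rows of the dynamic
    programming matrix are kept, so memory is O(len(target)).
    """
    prev = [0] * (len(target) + 1)
    best = 0
    for a in query:
        curr = [0]  # column 0 is always 0 for local alignment
        for j, b in enumerate(target, start=1):
            score = max(
                0,                           # start a new local alignment
                prev[j - 1] + subst[(a, b)],  # align residue a with residue b
                prev[j] - gap,               # gap in target
                curr[j - 1] - gap,           # gap in query
            )
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


if __name__ == "__main__":
    m = toy_matrix("ACDEW")
    print(smith_waterman_score("ACDE", "ACWDE", m))
```

Every cell depends only on its upper, left, and upper-left neighbours, so cells on the same anti-diagonal are independent. This is the property that hardware implementations exploit, whatever parallelization scheme they choose.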




Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dydel, S., Bała, P. (2004). Large Scale Protein Sequence Alignment Using FPGA Reprogrammable Logic Devices. In: Becker, J., Platzner, M., Vernalde, S. (eds) Field Programmable Logic and Application. FPL 2004. Lecture Notes in Computer Science, vol 3203. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30117-2_5


  • DOI: https://doi.org/10.1007/978-3-540-30117-2_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22989-6

  • Online ISBN: 978-3-540-30117-2

  • eBook Packages: Springer Book Archive
