Skip to main content

A Parameterized Algorithm for Predicting Transcription Factor Binding Sites

  • Conference paper
Intelligent Computing in Bioinformatics (ICIC 2014)

Part of the book series: Lecture Notes in Computer Science ((LNBI,volume 8590))

Included in the following conference series:

Abstract

In this paper, we study the Transcription Factor Binding Sites (TFBS) prediction problem in bioinformatics. We develop a novel parameterized approach that can efficiently explore the space of all possible locations of TFBSs in a set of homologous sequences with high accuracy. The exploration is performed by an ensemble of a few Hidden Markov Models (HMM), where the size of the ensemble is the parameter of the algorithm. The ensemble is initially constructed through the local alignments between two sequences that have the lowest similarity value in the sequence set, the parameters of each HMM in the ensemble are revised when the remaining sequences in the set are scanned through by it one by one. A list of possible TFBSs are generated when all sequences in the set have been processed by the ensemble. Testing results showed that this approach can accurately handle the cases where a single sequence may contain multiple binding sites and thus has advantages over most of the existing approaches when a sequence may contain multiple binding sites.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Bailey, T.L., Elkan, C.: Unsupervised Learning of Multiple Motifs In Biopolymers Using Expectation Maximization. Technical Report CS93-302, Department of Computer Science, University of California, San Diego (August 1993)

    Google Scholar 

  2. Bailey, T.L., Elkan, C.: Fitting a Mixture Model by Expectation Maximization to Discover Motifs in Biopolymers. In: Proceedings of the Second International Conference on Intelligient Systems for Molecular Biology, pp. 28–36 (1994)

    Google Scholar 

  3. Brent, M.M., Anand, R., Marmorstein, R.: Structural Basis for DNA Recognition by Foxo1 and its Regulation by Posttranslational Modification. Structure 16, 1407–1416 (2008)

    Article  Google Scholar 

  4. Che, D., Song, Y., Rasheed, K.: MDGA: Motif Discovery Using a Genetic Algorithm. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 447–452 (2005)

    Google Scholar 

  5. Durbin, R., Eddy, S.R., Krogh, A., Mitchison, G.: Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press (1998)

    Google Scholar 

  6. Galas, D.J., Schmitz, A.: A DNA Footprinting: A Simple Method for the Detection of Protein-DNA Binding Specificity. Nucleic Acids Research 5(9), 3157–3170 (1978)

    Article  Google Scholar 

  7. Garner, M.M., Revzin, A.: A Gel Electrophoresis Method for Quantifying He Binding of Proteins to Specific DNA Regions: Application to Components of the Escherichia Coli Lactose Operon Regulatory Systems. Nucleic Acids Research 9(13), 3047–3060 (1981)

    Article  Google Scholar 

  8. Hertz, G.Z., Stormo, G.D.: Identifying DNA and Protein Patterns with Statistically Significant Alignments of Multiple Sequences. Bioinformatcs 15(7), 53–577 (1999)

    Google Scholar 

  9. Hu, T.C., et al.: Snail Associates with EGR-1 and SP-1 to Upregulate Transcriptional Activation of P15ink4b. The FEBS Journal 277, 1202–1218 (2010)

    Article  Google Scholar 

  10. Liu, C., Song, Y., Shapiro, L.W.: RNA Folding Including Pseudoknots: A New Parameterized Algorithm and Improved Upper Bound. In: Proceedings of the 7th Workshop on Algorithms in Bioinformatics, pp. 310–322 (2007)

    Google Scholar 

  11. Liu, C., Song, Y., Burge III, L.L.: Parameterized Lower Bound and Inapproximability of Polylogarithmic String Barcoding. Journal of Combinatorial Optimization 16(1), 39–49 (2008)

    Article  MATH  MathSciNet  Google Scholar 

  12. Liu, C., Song, Y.: Parameterized Dominating Set Problem in Chordal Graphs: Complexity and Lower Bound. Journal of Combinatorial Optimization 18(1), 87–97 (2009)

    Article  MATH  MathSciNet  Google Scholar 

  13. Liu, C., Song, Y.: Parameterized Complexity and Inapproximability of Dominating Set Problem in Chordal and Near Chordal Graphs. Journal of Combinatorial Optimization 22(4), 684–698 (2011)

    Article  MATH  MathSciNet  Google Scholar 

  14. Liu, F.F.M., Tsai, J.J.P., Chen, R.M., Chen, S.N., Shih, S.H.: FMGA: Finding Motifs by Genetic Algorithm. In: IEEE Fourth Symposium on Bioinformatics And Bioengineering, pp. 459–466 (2004)

    Google Scholar 

  15. Liu, J.S., Neuwald, A.F., Lawrence, C.E.: Bayesian Models for Multiple Local Sequence Alignment and Gibbs Sampling Strategies. J. Am. Stat. Assoc. 90(432), 1156–1170 (1995)

    Article  MATH  Google Scholar 

  16. Liu, X., Brutlag, D.L., Liu, J.S.: Bioprospector: Discovering Conserved DNA Motifs in Upstream Regulatory Regions of Co-Expressed Genes. In: Pacific Symposium of Biocomputing, vol. 6, pp. 127–1138 (2001)

    Google Scholar 

  17. Quigley, M., et al.: Transcriptional Analysis of HIV-Specific CD8+ T Cells Shows That PD-1 Inhibits T Cell Function by Upregulating BATF. Nature Medicine 16, 1147–1151 (2010)

    Article  Google Scholar 

  18. Rigbolt, K.T., et al.: System-Wide Temporal Characterization of the Proteome and Phosphoproteome of Human Embryonic Stem Cell Differentiation. Science Signaling 4, RS3–RS3 (2011)

    Google Scholar 

  19. Roth, F.R., Hughes, J.D., Estep, P.E., Church, G.M.: Finding DNA Regulatory Motifs Within Unaligned Non-Coding Sequences Clustered by Whole-Genome Mrna Quantitation. Nature Biotechnology 16(10), 939–945 (1998)

    Article  Google Scholar 

  20. Smith, T.F., Waterman, M.S.: Identification of Common Molecular Subsequences. Journal of Molecular Biology 147, 195–197 (1981)

    Article  Google Scholar 

  21. Song, J., Liu, C., Song, Y., Qu, J., Hura, G.: Alignment of Multiple Proteins With an Ensemble of Hidden Markov Models. International Journal of Bioinformatics and Data Mining 4(1), 60–71 (2010)

    Article  Google Scholar 

  22. Song, Y., Liu, C., Huang, X., Malmberg, R.L., Xu, Y., Cai, L.: Efficient Parameterized Algorithm for Biopolymer Structure-Sequence Alignment. In: Casadio, R., Myers, G. (eds.) WABI 2005. LNCS (LNBI), vol. 3692, pp. 376–388. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  23. Song, Y., Liu, C., Malmberg, R.L., Pan, F., Cai, L.: Tree Decomposition Based Fast Search of RNA Structures Including Pseudoknots in Genomes. In: Proceedings of IEEE 2005 Computational Systems Bioinformatics Conference, pp. 223–234 (2005)

    Google Scholar 

  24. Song, Y., Zhao, J., Liu, C., Liu, K., Malmberg, R.L., Cai, L.: RNA Structural Homology Search with a Succinct Stochastic Grammar Model. Journal of Computer Science and Technology 20(4), 454–464 (2005)

    Article  MathSciNet  Google Scholar 

  25. Song, Y., Liu, C., Huang, X., Malmberg, R.L., Xu, Y., Cai, L.: Efficient Parameterized Algorithms for Biopolymer Structure-Sequence Alignment. IEEE/ACM Transactions on Computational Biology and Bioinformatics 3(4), 423–432 (2006)

    Article  Google Scholar 

  26. Song, Y., Liu, C., Malmberg, R.L., He, C., Cai, L.: Memory Efficient Alignment Between RNA Sequences and Stochastic Grammar Models of Pseudoknots. International Journal on Bioinformatics Research and Applications 2(3), 289–304 (2006)

    Google Scholar 

  27. Song, Y.: A New Parameterized Algorithm for Rapid Peptide Sequencing. Plos ONE 9(2), E87476 (2014)

    Google Scholar 

  28. Song, Y., Chi, A.Y.: A New Approach for Parameter Estimation in the Sequence-Structure Alignment of Non-Coding Rnas. Journal of Information Science and Engineering (in press, 2014)

    Google Scholar 

  29. Song, Y.: An Improved Parameterized Algorithm for the Independent Feedback Vertex Set Problem. Theoretical Computer Science (2014), doi:10.1016/J.Tcs.2014.03.031

    Google Scholar 

  30. Stormo, G.D.: Computer Methods for Analyzing Sequence Recognition of Nucleic Acids. Annu. Rev. Biochem. 17, 241–263 (1988)

    Google Scholar 

  31. Stormo, G.D., Hartzell, G.W.: Identifying Protein-Binding Sites from Unaligned DNA Fragments. Proc. of Nat. Acad. Sci. 86(4), 1183–1187 (1989)

    Article  Google Scholar 

  32. https://Www.Genome.Gov/Encode/

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Song, Y., Wang, C., Qu, J. (2014). A Parameterized Algorithm for Predicting Transcription Factor Binding Sites. In: Huang, DS., Han, K., Gromiha, M. (eds) Intelligent Computing in Bioinformatics. ICIC 2014. Lecture Notes in Computer Science(), vol 8590. Springer, Cham. https://doi.org/10.1007/978-3-319-09330-7_41

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-09330-7_41

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09329-1

  • Online ISBN: 978-3-319-09330-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics