Abstract
Computational identification of transcription factor binding sites (TFBSs) is important for understanding gene regulatory networks. Although many methods have been developed to identify TFBSs, they generally have relatively low accuracy, especially when the positions within a TFBS are dependent on one another. Motivated by this challenge, we develop an efficient algorithm, IBSS, for identifying TFBSs. Our results indicate that IBSS outperforms other approaches, with relatively high accuracy.
Copyright information
© 2007 Springer
About this chapter
Cite this chapter
Feng, X., Wan, L., Deng, M., Sun, F., Qian, M. (2007). An Efficient Algorithm for Deciphering Regulatory Motifs. In: Feng, J., Jost, J., Qian, M. (eds) Networks: From Biology to Theory. Springer, London. https://doi.org/10.1007/978-1-84628-780-0_12
DOI: https://doi.org/10.1007/978-1-84628-780-0_12
Publisher Name: Springer, London
Print ISBN: 978-1-84628-485-4
Online ISBN: 978-1-84628-780-0
eBook Packages: Computer Science (R0)