Abstract
Reconstructing how transcriptional networks function requires determining which promoters are affected by which transcription factors. Searching a genome for functional regulatory sites bound by particular transcription factors is therefore of great importance. This chapter discusses efforts to build classifiers that separate promoters targeted by a particular transcription factor from those that are not. We start with simple sequence classifiers based on Support Vector Machines and go on to discuss how to integrate different kinds of data into the analysis.
Copyright information
© 2007 Humana Press Inc.
About this chapter
Cite this chapter
Nagaraj, V.H., Sengupta, A.M. (2007). Dissecting Transcriptional Control Networks. In: Choi, S. (eds) Introduction to Systems Biology. Humana Press. https://doi.org/10.1007/978-1-59745-531-2_6
DOI: https://doi.org/10.1007/978-1-59745-531-2_6
Publisher Name: Humana Press
Print ISBN: 978-1-58829-706-8
Online ISBN: 978-1-59745-531-2
eBook Packages: Biomedical and Life Sciences (R0)