The Analyses of Global Gene Expression and Transcription Factor Regulation

Transcriptomics and Gene Regulation

Part of the book series: Translational Bioinformatics (TRBIO, volume 9)

Abstract

A major challenge in molecular cell biology lies in understanding how the same genome can give rise to different cell types and how gene expression is regulated. Gene expression and regulation studies focus on the abundance and structure of transcripts as well as on how RNA production is controlled. High-throughput sequencing technologies such as RNA sequencing have allowed more accurate profiling of the transcriptome and the rapid identification of differentially expressed genes among samples. The regulation of gene expression is orchestrated by transcription factors. The development of the ChIP sequencing assay has made it possible to comprehensively identify transcription factor-binding sites in vivo, allowing rapid unraveling of signaling pathways. This chapter describes the common methods used in studying global gene expression and transcription factor regulation, with a special emphasis on bioinformatic analyses. The final section presents an example of an integrated gene expression and regulation study for identifying key factors regulating self-renewal and differentiation in hematopoietic precursor cells.


References

  1. Liang F, Holt I, Pertea G, Karamycheva S, Salzberg SL, Quackenbush J. Gene index analysis of the human genome estimates approximately 120,000 genes. Nat Genet. 2000;25(2):239–40.

    Article  CAS  PubMed  Google Scholar 

  2. Adams MD, Kelley JM, Gocayne JD, Dubnick M, Polymeropoulos MH, Xiao H, et al. Complementary DNA sequencing: expressed sequence tags and human genome project. Science. 1991;252(5013):1651–6.

    Article  CAS  PubMed  Google Scholar 

  3. Wolfsberg TG, Landsman D. A comparison of expressed sequence tags (ESTs) to human genomic sequences. Nucleic Acids Res. 1997;25(8):1626–32.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  4. Bailey LC Jr, Searls DB, Overton GC. Analysis of EST-driven gene annotation in human genomic sequence. Genome Res. 1998;8(4):362–76.

    CAS  PubMed  Google Scholar 

  5. Das M, Burge CB, Park E, Colinas J, Pelletier J. Assessment of the total number of human transcription units. Genomics. 2001;77(1–2):71–8.

    Article  CAS  PubMed  Google Scholar 

  6. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. Serial analysis of gene expression. Science. 1995;270(5235):484–7.

    Article  CAS  PubMed  Google Scholar 

  7. Saha S, Sparks AB, Rago C, Akmaev V, Wang CJ, Vogelstein B, et al. Using the transcriptome to annotate the genome. Nat Biotechnol. 2002;20(5):508–12.

    Article  CAS  PubMed  Google Scholar 

  8. Wei CL, Ng P, Chiu KP, Wong CH, Ang CC, Lipovich L, et al. 5′ Long serial analysis of gene expression (LongSAGE) and 3′ LongSAGE for transcriptome characterization and genome annotation. Proc Natl Acad Sci USA. 2004;101(32):11701–6 (Epub 2004/07/24).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  9. Kawai J, Shinagawa A, Shibata K, Yoshino M, Itoh M, Ishii Y, et al. Functional annotation of a full-length mouse cDNA collection. Nature. 2001;409(6821):685–90.

    Article  PubMed  Google Scholar 

  10. Clark MD, Hennig S, Herwig R, Clifton SW, Marra MA, Lehrach H, et al. An oligonucleotide fingerprint normalized and expressed sequence tag characterized zebrafish cDNA library. Genome Res. 2001;11(9):1594–602.

    Article  PubMed Central  PubMed  Google Scholar 

  11. Okazaki Y, Furuno M, Kasukawa T, Adachi J, Bono H, Kondo S, et al. Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs. Nature. 2002;420(6915):563–73.

    Article  PubMed  Google Scholar 

  12. Stapleton M, Carlson J, Brokstein P, Yu C, Champe M, George R, et al. A Drosophila full-length cDNA resource. Genome Biol. 2002;3(12):RESEARCH0080.

    Google Scholar 

  13. Reboul J, Vaglio P, Rual JF, Lamesch P, Martinez M, Armstrong CM, et al. C. elegans ORFeome version 1.1: experimental verification of the genome annotation and resource for proteome-scale protein expression. Nat Genet. 2003;34(1):35–41.

    Article  PubMed  Google Scholar 

  14. Wiemann S, Weil B, Wellenreuther R, Gassenhuber J, Glassl S, Ansorge W, et al. Toward a catalog of human genes and proteins: sequencing and analysis of 500 novel complete protein coding human cDNAs. Genome Res. 2001;11(3):422–35.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  15. Ota T, Suzuki Y, Nishikawa T, Otsuki T, Sugiyama T, Irie R, et al. Complete sequencing and characterization of 21,243 full-length human cDNAs. Nat Genet. 2004;36(1):40–5.

    Article  PubMed  Google Scholar 

  16. Gerhard DS, Wagner L, Feingold EA, Shenmen CM, Grouse LH, Schuler G, et al. The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC). Genome Res. 2004;14(10B):2121–7.

    Article  PubMed  Google Scholar 

  17. Strausberg RL, Feingold EA, Grouse LH, Derge JG, Klausner RD, Collins FS, et al. Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences. Proc Natl Acad Sci USA. 2002;99(26):16899–903.

    Article  PubMed  Google Scholar 

  18. Temple G, Gerhard DS, Rasooly R, Feingold EA, Good PJ, Robinson C, et al. The completion of the Mammalian Gene Collection (MGC). Genome Res. 2009;19(12):2324–33.

    Article  PubMed Central  PubMed  CAS  Google Scholar 

  19. Bareyre FM, Schwab ME. Inflammation, degeneration and regeneration in the injured spinal cord: insights from DNA microarrays. Trends Neurosci. 2003;26:555–63.

    Article  CAS  PubMed  Google Scholar 

  20. Carmel JB, Galante a, Soteropoulos P, Tolias P, Recce M, Young W, et al. Gene expression profiling of acute spinal cord injury reveals spreading inflammatory signals and neuron loss. Physiol Genomics 2001;7:201–13.

    Google Scholar 

  21. Velardo MJ, Burger C, Williams PR, Baker HV, López MC, Mareci TH, et al. Patterns of gene expression reveal a temporally orchestrated wound healing response in the injured spinal cord. J Neurosci. 2004;24:8562–76.

    Article  CAS  PubMed  Google Scholar 

  22. Liu CL, Jin AM, Tong BH. Detection of gene expression pattern in the early stage after spinal cord injury by gene chip. Chin J Traumatol. 2003;6(1):18–22 (Epub 2003/01/25).

    CAS  PubMed  Google Scholar 

  23. Tachibana T, Noguchi K, Ruda MA. Analysis of gene expression following spinal cord injury in rat using complementary DNA microarray. Neurosci Lett. 2002;327(2):133–7 (Epub 2002/07/06).

    Article  CAS  PubMed  Google Scholar 

  24. Jaerve A, Kruse F, Malik K, Hartung HP, Muller HW. Age-dependent modulation of cortical transcriptomes in spinal cord injury and repair. PLoS One. 2012;7(12):e49812 (Epub 2012/12/14).

    Google Scholar 

  25. Kahvejian A, Quackenbush J, Thompson JF. What would you do if you could sequence everything? Nat Biotechnol. 2008;26:1125–33.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  26. Wu JQ, Habegger L, Noisa P, Szekely A, Qiu C, Hutchison S, et al. Dynamic transcriptomes during neural differentiation of human embryonic stem cells revealed by short, long, and paired-end sequencing. Proc Natl Acad Sci USA. 2010;107(11):5254–9 (Epub 2010/03/03).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  27. Wu JQ, Seay M, Schulz, V., Hariharan, M., Tuck, D., Lian, J., Du, J., Shi, M., Ye, Z.J, Gerstein M, Snyder M, Weissman S. TCF7 is a key regulator of the self-renewal and differentiation switch in a multipotential hematopoietic cell line. PLoS Genet. 2012;8(3):e1002565 (Epub 2012).

    Google Scholar 

  28. Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009;10(1):57–63 (Epub 2008/11/19).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  29. Torres-Garcia W, Zheng S, Sivachenko A, Vegesna R, Wang Q, Yao R, et al. PRADA: pipeline for RNA sequencing data analysis. Bioinformatics. 2014;30(15):2224–6 (Epub 2014/04/04).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  30. Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc. 2012;7(3):562–78 (Epub 2012/03/03).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  31. Kalari KR, Nair AA, Bhavsar JD, O’Brien DR, Davila JI, Bockol MA, et al. MAP-RSeq: mayo analysis pipeline for RNA sequencing. BMC Bioinf. 2014;15:224 (Epub 2014/06/29).

    Article  CAS  Google Scholar 

  32. Cumbie JS, Kimbrel JA, Di Y, Schafer DW, Wilhelm LJ, Fox SE, et al. GENE-counter: a computational pipeline for the analysis of RNA-Seq data for gene expression differences. PLoS ONE. 2011;6(10):e25279 (Epub 2011/10/15).

    Google Scholar 

  33. Fonseca NA, Marioni J, Brazma A. RNA-Seq gene profiling–a systematic empirical comparison. PLoS ONE. 2014;9(9):e107026 (Epub 2014/10/01).

    Google Scholar 

  34. Schmieder R, Edwards R. Quality control and preprocessing of metagenomic datasets. Bioinformatics. 2011;27(6):863–4 (Epub 2011/02/01).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  35. Cock PJ, Fields CJ, Goto N, Heuer ML, Rice PM. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res. 2010;38(6):1767–71 (Epub 2009/12/18).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  36. Ewing B, Green P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 1998;8(3):186–94 (Epub 1998/05/16).

    Article  CAS  PubMed  Google Scholar 

  37. Del Fabbro C, Scalabrin S, Morgante M, Giorgi FM. An extensive evaluation of read trimming effects on Illumina NGS data analysis. PLoS ONE. 2013;8(12):e85024 (Epub 2014/01/01).

    Google Scholar 

  38. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20 (Epub 2014/04/04).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  39. Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 2011;17:10–2.

    Google Scholar 

  40. Smeds L, Kunstner A. ConDeTri–a content dependent read trimmer for Illumina data. PLoS ONE. 2011;6(10):e26314 (Epub 2011/11/01).

    Google Scholar 

  41. Hansen KD, Irizarry RA, Wu Z. Removing technical variability in RNA-seq data using conditional quantile normalization. Biostatistics. 2012;13(2):204–16 (Epub 2012/01/31).

    Article  PubMed Central  PubMed  Google Scholar 

  42. Bohnert R, Ratsch G. rQuant.web: a tool for RNA-Seq-based transcript quantitation. Nucleic Acids Res. 2010;38(Web Server issue):W348-51 (Epub 2010/06/17).

    Google Scholar 

  43. Srivastava S, Chen L. A two-parameter generalized Poisson model to improve the analysis of RNA-seq data. Nucleic Acids Res. 2010;38(17):e170 (Epub 2010/07/31).

    Google Scholar 

  44. Hansen KD, Brenner SE, Dudoit S. Biases in Illumina transcriptome sequencing caused by random hexamer priming. Nucleic Acids Res. 2010;38(12):e131 (Epub 2010/04/17).

    Google Scholar 

  45. Oshlack A, Wakefield MJ. Transcript length bias in RNA-seq data confounds systems biology. Biol Direct. 2009;4:14 (Epub 2009/04/18).

    Article  PubMed Central  PubMed  CAS  Google Scholar 

  46. Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28(5):511–5 (Epub 2010/05/04).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  47. Roberts A, Trapnell C, Donaghey J, Rinn JL, Pachter L. Improving RNA-Seq expression estimates by correcting for fragment bias. Genome Biol. 2011;12(3):R22 (Epub 2011/03/18).

    Google Scholar 

  48. Fonseca NA, Rung J, Brazma A, Marioni JC. Tools for mapping high-throughput sequencing data. Bioinformatics. 2012;28(24):3169–77 (Epub 2012/10/13).

    Article  CAS  PubMed  Google Scholar 

  49. Updated listing of mappers. Available from: http://wwwdev.ebi.ac.uk/fg/hts_mappers/.

  50. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60 (Epub 2009/05/20).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  51. Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10(3):R25 (Epub 2009/03/06).

    Google Scholar 

  52. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357–9 (Epub 2012/03/06).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  53. Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25(9):1105–11 (Epub 2009/03/18).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  54. Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14(4):R36 (Epub 2013/04/27).

    Google Scholar 

  55. Jean G, Kahles A, Sreedharan VT, De Bona F, Ratsch G. RNA-Seq read alignments with PALMapper. Current protocols in bioinformatics/editoral board, Andreas D Baxevanis [et al]. 2010;Chapter 11:Unit 11 6 (Epub 2010/12/15).

    Google Scholar 

  56. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29(1):15–21 (Epub 2012/10/30).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  57. Homer N, Merriman B, Nelson SF. BFAST: an alignment tool for large scale genome resequencing. PLoS ONE. 2009;4(11):e7767 (Epub 2009/11/13).

    Google Scholar 

  58. Schneeberger K, Hagmann J, Ossowski S, Warthmann N, Gesing S, Kohlbacher O, et al. Simultaneous alignment of short reads against multiple genomes. Genome Biol. 2009;10(9):R98 (Epub 2009/09/19).

    Google Scholar 

  59. Novocraft. 2010. Available from: http://www.novocraft.com/.

  60. David M, Dzamba M, Lister D, Ilie L, Brudno M. SHRiMP2: sensitive yet practical SHort read mapping. Bioinformatics. 2011;27(7):1011–2 (Epub 2011/02/01).

    Article  CAS  PubMed  Google Scholar 

  61. Li R, Yu C, Li Y, Lam TW, Yiu SM, Kristiansen K, et al. SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics. 2009;25(15):1966–7 (Epub 2009/06/06).

    Article  CAS  PubMed  Google Scholar 

  62. Alkan C, Kidd JM, Marques-Bonet T, Aksay G, Antonacci F, Hormozdiari F, et al. Personalized copy number and segmental duplication maps using next-generation sequencing. Nat Genet. 2009;41(10):1061–7 (Epub 2009/09/01).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  63. Clement NL, Clement MJ, Snell Q, Johnson WE. Parallel mapping approaches for GNUMAP. IPDPS. 2011;435–43 (Epub 2011/01/01).

    Google Scholar 

  64. Smith AD, Chung WY, Hodges E, Kendall J, Hannon G, Hicks J, et al. Updates to the RMAP short-read mapping software. Bioinformatics. 2009;25(21):2841–2 (Epub 2009/09/09).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  65. Li H, Ruan J, Durbin R. Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 2008;18(11):1851–8 (Epub 2008/08/21).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  66. Maher MC, Hernandez RD. Rock, paper, scissors: harnessing complementarity in ortholog detection methods improves comparative genomic inference. G3 (Bethesda). 2015;5(4):629–38 (Epub 2015/02/26).

    Google Scholar 

  67. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10 (Epub 1990/10/05).

    Article  CAS  PubMed  Google Scholar 

  68. Smith TF, Waterman MS. Identification of common molecular subsequences. J Mol Biol. 1981;147(1):195–7 (Epub 1981/03/25).

    Article  CAS  PubMed  Google Scholar 

  69. Needleman SB, Wunsch CD. A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol. 1970;48(3):443–53 (Epub 1970/03/01).

    Article  CAS  PubMed  Google Scholar 

  70. Rumble SM, Lacroute P, Dalca AV, Fiume M, Sidow A, Brudno M. SHRiMP: accurate mapping of short color-space reads. PLoS Comput Biol. 2009;5(5):e1000386 (Epub 2009/05/23).

    Google Scholar 

  71. Barsky M, Stege U, Thomo A, Upton C, editors. Suffix trees for very large genomic sequences. CIKM ’09: Proceedings of the 18th ACM Conference on Information and Knowledge Management; 2009; New York, NY, USA.

    Google Scholar 

  72. Ferragina P, Manzini G, editors. Opportunistic data structures with applications. Proceedings of the 41st Symposium on Foundations of Computer Science (FOCS 2000); 2000; Redondo Beach, CA.

    Google Scholar 

  73. Burrows M, Wheeler D. A block sorting lossless data compression algorithm. Palo Alto, CA: Digital Equipment Corporation; 1994.

    Google Scholar 

  74. Cunningham F, Amode MR, Barrell D, Beal K, Billis K, Brent S, et al. Ensembl 2015. Nucleic Acids Res. 2015;43(Database issue):D662-9 (Epub 2014/10/30).

    Google Scholar 

  75. iGenomes. Available from: https://support.illumina.com/sequencing/sequencing_software/igenome.html.

  76. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9 (Epub 2009/06/10).

    Article  PubMed Central  PubMed  CAS  Google Scholar 

  77. Thorvaldsdottir H, Robinson JT, Mesirov JP. Integrative genomics viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform. 2013;14(2):178–92 (Epub 2012/04/21).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  78. Fiume M, Smith EJ, Brook A, Strbenac D, Turner B, Mezlini AM, et al. Savant Genome Browser 2: visualization and analysis for population-scale genomics. Nucleic Acids Res. 2012;40(Web Server issue):W615-21 (Epub 2012/05/29).

    Google Scholar 

  79. Nicol JW, Helt GA, Blanchard SG Jr, Raja A, Loraine AE. The integrated genome browser: free software for distribution and exploration of genome-scale datasets. Bioinformatics. 2009;25(20):2730–1 (Epub 2009/08/06).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  80. Surget-Groba Y, Montoya-Burgos JI. Optimization of de novo transcriptome assembly from next-generation sequencing data. Genome Res. 2010;20(10):1432–40 (Epub 2010/08/10).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  81. De Bruijn NG. A combinatorial problem. Koninklijke Nederlandse Akademie v Wetenschappen. 1946;46(6).

    Google Scholar 

  82. Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008;18(5):821–9 (Epub 2008/03/20).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  83. Robertson G, Schein J, Chiu R, Corbett R, Field M, Jackman SD, et al. De novo assembly and analysis of RNA-seq data. Nat Methods. 2010;7(11):909–12 (Epub 2010/10/12).

    Article  CAS  PubMed  Google Scholar 

  84. Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I. ABySS: a parallel assembler for short read sequence data. Genome Res. 2009;19(6):1117–23 (Epub 2009/03/03).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  85. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29(7):644–52 (Epub 2011/05/17).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  86. Schulz MH, Zerbino DR, Vingron M, Birney E. Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics. 2012;28(8):1086–92 (Epub 2012/03/01).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  87. Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y. RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 2008;18(9):1509–17 (Epub 2008/06/14).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  88. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008;5(7):621–8 (Epub 2008/06/03).

    Article  CAS  PubMed  Google Scholar 

  89. Jiang H, Wong WH. Statistical inferences for isoform expression in RNA-Seq. Bioinformatics. 2009;25(8):1026–32 (Epub 2009/02/27).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  90. Li B, Ruotti V, Stewart RM, Thomson JA, Dewey CN. RNA-Seq gene expression estimation with read mapping uncertainty. Bioinformatics. 2010;26(4):493–500 (Epub 2009/12/22).

    Article  PubMed Central  PubMed  CAS  Google Scholar 

  91. Katz Y, Wang ET, Airoldi EM, Burge CB. Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat Methods. 2010;7(12):1009–15 (Epub 2010/11/09).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  92. Consortium TE. Standards, Guideline and Best Practices for RNA-Seq. 2011; V1.0. Available from: https://www.encodeproject.org/.

  93. Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol. 2010;11(10):R106 (Epub 2010/10/29).

    Google Scholar 

  94. Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26(1):139–40 (Epub 2009/11/17).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  95. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):e47 (Epub 2015/01/22).

    Google Scholar 

  96. da Huang W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009;4(1):44–57 (Epub 2009/01/10).

    Article  CAS  Google Scholar 

  97. Liberzon A, Subramanian A, Pinchback R, Thorvaldsdottir H, Tamayo P, Mesirov JP. Molecular signatures database (MSigDB) 3.0. Bioinformatics. 2011;27(12):1739–40 (Epub 2011/05/07).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  98. Hardcastle TJ, Kelly KA. baySeq: empirical Bayesian methods for identifying differential expression in sequence count data. BMC Bioinformatics. 2010;11:422 (Epub 2010/08/12).

    Article  PubMed Central  PubMed  Google Scholar 

  99. Boyer LA, Lee TI, Cole MF, Johnstone SE, Levine SS, Zucker JP, et al. Core transcriptional regulatory circuitry in human embryonic stem cells. Cell. 2005;122(6):947–56.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  100. Iyer VR, Horak CE, Scafe CS, Botstein D, Snyder M, Brown PO. Genomic binding sites of the yeast cell-cycle transcription factors SBF and MBF. Nature. 2001;409(6819):533–8.

    Article  CAS  PubMed  Google Scholar 

  101. Wu JQ, Seay M, Schulz, V., Hariharan, M., Tuck, D., Lian, J., Du, J., Shi, M., Ye, Z. J.,, Gerstein M, Snyder M, Weissman S. TCF7 is a key regulator of the self-renewal and differentiation switch in a multipotential hematopoietic cell line. PLoS Genetics. 2012;In Press.

    Google Scholar 

  102. Barski A, Cuddapah S, Cui K, Roh TY, Schones DE, Wang Z, et al. High-resolution profiling of histone methylations in the human genome. Cell. 2007;129(4):823–37 (Epub 2007/05/22).

    Article  CAS  PubMed  Google Scholar 

  103. Johnson DS, Mortazavi A, Myers RM, Wold B. Genome-wide mapping of in vivo protein-DNA interactions. Science. 2007;316(5830):1497–502 (Epub 2007/06/02).

    Article  CAS  PubMed  Google Scholar 

  104. Park PJ. ChIP-seq: advantages and challenges of a maturing technology. Nat Rev Genet. 2009;10(10):669–80 (Epub 2009/09/09).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  105. Visel A, Blow MJ, Li Z, Zhang T, Akiyama JA, Holt A, et al. ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature. 2009;457(7231):854–8 (Epub 2009/02/13).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  106. Chen Y, Negre N, Li Q, Mieczkowska JO, Slattery M, Liu T, et al. Systematic evaluation of factors influencing ChIP-seq fidelity. Nat Methods. 2012;9(6):609–14 (Epub 2012/04/24).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  107. Landt SG, Marinov GK, Kundaje A, Kheradpour P, Pauli F, Batzoglou S, et al. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 2012;22(9):1813–31 (Epub 2012/09/08).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  108. Kharchenko PV, Tolstorukov MY, Park PJ. Design and analysis of ChIP-seq experiments for DNA-binding proteins. Nat Biotechnol. 2008;26(12):1351–9 (Epub 2008/11/26).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  109. Li R, Li Y, Kristiansen K, Wang J. SOAP: short oligonucleotide alignment program. Bioinformatics. 2008;24(5):713–4 (Epub 2008/01/30).

    Article  CAS  PubMed  Google Scholar 

  110. Bailey T, Krajewski P, Ladunga I, Lefebvre C, Li Q, Liu T, et al. Practical guidelines for the comprehensive analysis of ChIP-seq data. PLoS computational biology. 2013;9(11):e1003326 (Epub 2013/11/19).

    Google Scholar 

  111. Jung LY, Kharchenko P, Wold B, Sidow A, Batzoglou S, Park P. Assessment of ChIP-seq data quality using cross-correlation analysis.

    Google Scholar 

  112. Muino JM, Kaufmann K, van Ham RC, Angenent GC, Krajewski P. ChIP-seq analysis in R (CSAR): an R package for the statistical detection of protein-bound genomic regions. Plant Methods. 2011;7:11 (Epub 2011/05/11).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  113. Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 2008;9(9):R137 (Epub 2008/09/19).

    Google Scholar 

  114. Ji H, Jiang H, Ma W, Johnson DS, Myers RM, Wong WH. An integrated software system for analyzing ChIP-chip and ChIP-seq data. Nat Biotechnol. 2008;26(11):1293–300 (Epub 2008/11/04).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  115. Qin ZS, Yu J, Shen J, Maher CA, Hu M, Kalyana-Sundaram S, et al. HPeak: an HMM-based algorithm for defining read-enriched regions in ChIP-Seq data. BMC Bioinf. 2010;11:369 (Epub 2010/07/06).

    Article  CAS  Google Scholar 

  116. Bardet AF, He Q, Zeitlinger J, Stark A. A computational pipeline for comparative ChIP-seq analyses. Nat Protoc. 2012;7(1):45–61 (Epub 2011/12/20).

    Article  CAS  Google Scholar 

  117. Tompa M, Li N, Bailey TL, Church GM, De Moor B, Eskin E, et al. Assessing computational tools for the discovery of transcription factor binding sites. Nat Biotechnol. 2005;23(1):137–44 (Epub 2005/01/08).

    Article  CAS  PubMed  Google Scholar 

  118. Ma W, Noble WS, Bailey TL. Motif-based analysis of large nucleotide data sets using MEME-ChIP. Nat Protoc. 2014;9(6):1428–50 (Epub 2014/05/24).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  119. Rozowsky J, Euskirchen G, Auerbach RK, Zhang ZD, Gibson T, Bjornson R, et al. PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls. Nat Biotechnol. 2009;27(1):66–75.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  120. Valouev A, Johnson DS, Sundquist A, Medina C, Anton E, Batzoglou S, et al. Genome-wide analysis of transcription factor binding sites based on ChIP-Seq data. Nat Methods. 2008;5(9):829–34 (Epub 2009/01/23).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  121. Fejes AP, Robertson G, Bilenky M, Varhol R, Bainbridge M, Jones SJ. FindPeaks 3.1: a tool for identifying areas of enrichment from massively parallel short-read sequencing technology. Bioinformatics. 2008;24(15):1729–30 (Epub 2008/07/05).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  122. Boyle AP, Guinney J, Crawford GE, Furey TS. F-Seq: a feature density estimator for high-throughput sequence tags. Bioinformatics. 2008;24(21):2537–8 (Epub 2008/09/12).

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  123. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA. 2005;102(43):15545–50.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  124. Bryder D, Rossi DJ, Weissman IL. Hematopoietic stem cells: the paradigmatic tissue-specific stem cell. Am J Pathol. 2006;169(2):338–46.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  125. Shizuru JA, Negrin RS, Weissman IL. Hematopoietic stem and progenitor cells: clinical and preclinical regeneration of the hematolymphoid system. Annu Rev Med. 2005;56:509–38.

    Article  CAS  PubMed  Google Scholar 

  126. Faubert A, Lessard J, Sauvageau G. Are genetic determinants of asymmetric stem cell division active in hematopoietic stem cells? Oncogene. 2004;23(43):7247–55.

    Article  CAS  PubMed  Google Scholar 

  127. Zhou JX, Huang S. Understanding gene circuits at cell-fate branch points for rational cell reprogramming. Trends Genet. 2011;27(2):55–62.

    Article  CAS  PubMed  Google Scholar 

  128. Waltzer L, Gobert V, Osman D, Haenlin M. Transcription factor interplay during Drosophila haematopoiesis. Int J Dev Biol. 2010;54(6–7):1107–15.

    Article  CAS  PubMed  Google Scholar 

  129. Bertrand V, Hobert O. Lineage programming: navigating through transient regulatory states via binary decisions. Curr Opin Genet Dev. 2010;20(4):362–8.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  130. Jukam D, Desplan C. Binary fate decisions in differentiating neurons. Curr Opin Neurobiol. 2010;20(1):6–13.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  131. Moore KA, Lemischka IR. “Tie-ing” down the hematopoietic niche. Cell. 2004;118(2):139–40.

    Article  CAS  PubMed  Google Scholar 

  132. Tsai S, Bartelmez S, Sitnicka E, Collins S. Lymphohematopoietic progenitors immortalized by a retroviral vector harboring a dominant-negative retinoic acid receptor can recapitulate lymphoid, myeloid, and erythroid development. Genes Dev. 1994;8(23):2831–41.

    Article  CAS  PubMed  Google Scholar 

  133. Pinto do OP. Kolterud A, Carlsson L. Expression of the LIM-homeobox gene LH2 generates immortalized steel factor-dependent multipotent hematopoietic precursors. EMBO J. 1998;17(19):5744–56.

    Article  Google Scholar 

  134. Yu WM, Hawley TS, Hawley RG, Qu CK. Immortalization of yolk sac-derived precursor cells. Blood. 2002;100(10):3828–31.

    Article  PubMed  Google Scholar 

  135. Sauvageau G, Iscove NN, Humphries RK. In vitro and in vivo expansion of hematopoietic stem cells. Oncogene. 2004;23(43):7223–32.

    Article  CAS  PubMed  Google Scholar 

  136. Ye ZJ, Kluger Y, Lian Z, Weissman SM. Two types of precursor cells in a multipotential hematopoietic cell line. Proc Natl Acad Sci USA. 2005;102(51):18461–6.

  137. Raich N, Clegg CH, Grofti J, Romeo PH, Stamatoyannopoulos G. GATA1 and YY1 are developmental repressors of the human epsilon-globin gene. EMBO J. 1995;14(4):801–9.

  138. Breitkreutz BJ, Stark C, Tyers M. Osprey: a network visualization system. Genome Biol. 2003;4(3):R22.

  139. Horak CE, Luscombe NM, Qian J, Bertone P, Piccirillo S, Gerstein M, et al. Complex transcriptional circuitry at the G1/S transition in Saccharomyces cerevisiae. Genes Dev. 2002;16(23):3017–33.

  140. Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, Gerber GK, et al. Transcriptional regulatory networks in Saccharomyces cerevisiae. Science. 2002;298(5594):799–804.

  141. Harbison CT, Gordon DB, Lee TI, Rinaldi NJ, Macisaac KD, Danford TW, et al. Transcriptional regulatory code of a eukaryotic genome. Nature. 2004;431(7004):99–104.

  142. Borneman AR, Yu H, Bertone P, Gerstein M, Snyder M. The transcription factors Mga1 and Phd1 are master regulators of a complex transcriptional network controlling pseudohyphal growth. Cell, submitted. 2005.

  143. Weintraub H, Tapscott SJ, Davis RL, Thayer MJ, Adam MA, Lassar AB, et al. Activation of muscle-specific genes in pigment, nerve, fat, liver, and fibroblast cell lines by forced expression of MyoD. Proc Natl Acad Sci USA. 1989;86(14):5434–8.

  144. Tapscott SJ. The circuitry of a master switch: Myod and the regulation of skeletal muscle gene transcription. Development. 2005;132(12):2685–95.

  145. Asakura A, Lyons GE, Tapscott SJ. The regulation of MyoD gene expression: conserved elements mediate expression in embryonic axial muscle. Dev Biol. 1995;171(2):386–98.

  146. Goldhamer DJ, Brunk BP, Faerman A, King A, Shani M, Emerson CP Jr. Embryonic activation of the myoD gene is regulated by a highly conserved distal control element. Development. 1995;121(3):637–49.

  147. Kurokawa M, Hirai H. Role of AML1/Runx1 in the pathogenesis of hematological malignancies. Cancer Sci. 2003;94(10):841–6.

  148. Friedman AD. Cell cycle and developmental control of hematopoiesis by Runx1. J Cell Physiol. 2009;219(3):520–4.

  149. Coelho PS, Bryan AC, Kumar A, Shadel GS, Snyder M. A novel mitochondrial protein, Tar1p, is encoded on the antisense strand of the nuclear 25S rDNA. Genes Dev. 2002;16(21):2755–60.

  150. Tycowski KT, Shu MD, Steitz JA. A mammalian gene with introns instead of exons generating stable RNA products. Nature. 1996;379(6564):464–6.

  151. Zhang Z, Harrison P, Gerstein M. Identification and analysis of over 2000 ribosomal protein pseudogenes in the human genome. Genome Res. 2002;12(10):1466–82.

  152. Snyder M, Gerstein M. Genomics. Defining genes in the genomics era. Science. 2003;300(5617):258–60.

Acknowledgments

We thank Dr. Eva Zsigmond for reading and editing our manuscript. JQW, RCDD, and SM are supported by grants from the National Institutes of Health R01 NS088353 and R00 HL093213, the Staman Ogilvie Fund—Memorial Hermann Foundation, Mission Connect—a program of the TIRR Foundation, the Senator Lloyd & B.A. Bentsen Center for Stroke Research, UTHealth BRAIN Initiative and CTSA UL1 TR000371, and a grant from the University of Texas System Neuroscience and Neurotechnology Research Institute (Grant #362469).

Author information

Corresponding author

Correspondence to Jiaqian Wu.

Copyright information

© 2016 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Cuevas Diaz Duran, R., Menon, S., Wu, J. (2016). The Analyses of Global Gene Expression and Transcription Factor Regulation. In: Wu, J. (eds) Transcriptomics and Gene Regulation. Translational Bioinformatics, vol 9. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-7450-5_1
