Handling and Interpreting Gene Groups

Introduction to Systems Biology

Abstract

Systems biologists often have to deal with large gene groups obtained from high-throughput experiments, genome-wide predictions, and literature searches. Handling these gene groups and interpreting them functionally is challenging. Problems arise from redundancies in databases, where a single gene is given several names or identifiers, and from genes that were falsely assigned to the list. Moreover, gene groups obtained by different methods are often represented by different types of identifiers, or even consist of genes from other model organisms. Research in systems biology therefore requires software tools that help to handle and interpret gene groups.

This chapter reviews tools to store and compare gene groups represented by various identifiers. We introduce software that uses Gene Ontology (GO) annotations to infer the biological processes associated with a gene group. Additionally, we review approaches for analyzing the transcriptional regulation of gene groups by retrieving and analyzing their putative promoter regions.
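
As an illustration of how GO annotations can point to biological processes associated with a gene group, the following minimal sketch tests each annotation term for over-representation in the group relative to a background gene set. It is not taken from the chapter: the use of a one-sided hypergeometric test with Bonferroni correction is an assumption, and the function names, gene names, and term labels are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 for impossible draws, so no extra bounds are needed.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def term_enrichment(gene_group, annotations, background):
    """Rank annotation terms by over-representation in gene_group.

    annotations maps a term label to the set of genes annotated with it.
    Returns (term, hits, raw_p, bonferroni_p) tuples sorted by raw_p.
    """
    background = set(background)
    group = set(gene_group) & background
    N, n = len(background), len(group)
    rows = []
    for term, genes in annotations.items():
        annotated = set(genes) & background
        K = len(annotated)
        if K == 0:
            continue  # term not represented in the background
        k = len(group & annotated)
        rows.append((term, k, hypergeom_sf(k, N, K, n)))
    m = len(rows)  # number of tests, used for the Bonferroni correction
    ranked = sorted(rows, key=lambda r: r[2])
    return [(t, k, p, min(1.0, p * m)) for t, k, p in ranked]


# Hypothetical toy data: 10 background genes, two made-up term labels.
background = {f"g{i}" for i in range(10)}
annotations = {
    "term_A": {"g0", "g1", "g2"},
    "term_B": {"g3", "g4", "g5", "g6", "g7"},
}
result = term_enrichment({"g0", "g1", "g2"}, annotations, background)
```

In this toy case all three genes of the group carry `term_A`, so that term ranks first. With real data, the choice of background set and of multiple-testing correction strongly affects which terms appear significant.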

Copyright information

© 2007 Humana Press Inc.

About this chapter

Cite this chapter

Blüthgen, N., Kielbasa, S.M., Beule, D. (2007). Handling and Interpreting Gene Groups. In: Choi, S. (ed.) Introduction to Systems Biology. Humana Press. https://doi.org/10.1007/978-1-59745-531-2_4
