Functional Annotation of Plant Genomes

Cereal Genomics II

Abstract

The recent introduction of highly efficient next-generation sequencing platforms (Roche 454, Illumina, PacBio, Life Technologies SOLiD, etc.) has led to an increased number of sequenced plant genomes.

Acknowledgments

The authors VA, PD, JE and PJ are supported by the Gramene project award (# IOS:0703908) and the Plant Ontology project (# IOS:0822201) from the National Science Foundation (NSF) of the USA. The Jaiswal lab is also supported by startup funds provided to PJ by Oregon State University (OSU), Corvallis, OR, USA. The authors also thank Rajani Raja of OSU for the InterProScan tabular output (Fig. 7.4); Justin Preece of OSU for editorial comments; and Sarah Hunter, the InterPro team and the European Bioinformatics Institute for permission to use screenshots of the InterProScan web interface (Figs. 7.2, 7.3).

Author information

Corresponding author

Correspondence to Pankaj Jaiswal.

Copyright information

© 2013 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Amarasinghe, V., Dharmawardhana, P., Elser, J., Jaiswal, P. (2013). Functional Annotation of Plant Genomes. In: Gupta, P., Varshney, R. (eds) Cereal Genomics II. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-6401-9_7
