Practical Guide for Fungal Gene Prediction from Genome Assembly and RNA-Seq Reads by FunGAP

  • Protocol
Gene Prediction

Part of the book series: Methods in Molecular Biology (MIMB, volume 1962)

Abstract

FunGAP is a Python-wrapped fungal genome annotation pipeline that runs under Linux/Unix operating systems. The annotation procedure requires two inputs: a genome assembly and RNA-seq reads. FunGAP aims to predict the most feasible gene model from all plausible gene models produced by several gene prediction programs that use ab initio, EST-based, and/or homology-based strategies. This guide covers how to run FunGAP from the command line and how to use its options for practical gene prediction. Users can choose options for quality control of the input sequences, selection of the model database, filtering of predicted gene models, and post-processing steps such as checking genome completeness and transposable elements. With FunGAP, users can obtain a high-quality fungal gene prediction for post-genome-sequencing analysis.
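
The idea of choosing one gene model among several overlapping candidates can be illustrated with a small sketch. This is not FunGAP's actual implementation: the `GeneModel` class, the `evidence` score, the predictor labels, and the greedy selection rule are all hypothetical and serve only to make the concept concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneModel:
    source: str      # label of the predictor that proposed the model (hypothetical)
    start: int       # 1-based inclusive start on the contig
    end: int         # 1-based inclusive end on the contig
    evidence: float  # hypothetical combined evidence score (higher is better)


def overlaps(a: GeneModel, b: GeneModel) -> bool:
    """Return True when the two models share at least one base."""
    return a.start <= b.end and b.start <= a.end


def pick_models(candidates):
    """Greedy selection: visit models from highest to lowest evidence and
    keep a model only if it overlaps none of the models already kept."""
    kept = []
    for model in sorted(candidates, key=lambda m: m.evidence, reverse=True):
        if not any(overlaps(model, k) for k in kept):
            kept.append(model)
    return sorted(kept, key=lambda m: m.start)


# Three candidate models on one contig; the first two compete for one locus.
candidates = [
    GeneModel("predictor_a", 100, 900, 2.5),
    GeneModel("predictor_b", 120, 950, 3.1),
    GeneModel("predictor_c", 1200, 1800, 1.7),
]
selected = pick_models(candidates)
for model in selected:
    print(model.source, model.start, model.end, model.evidence)
```

In this toy input, the two models competing for the first locus are resolved in favor of the higher-scoring one, and the non-overlapping third model is kept as is. A greedy rule is used here only for brevity; it does not guarantee the best total score across a contig.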

Acknowledgments

This work was supported by the Cooperative Research Program for Agriculture Science & Technology Development (Project Nos. PJ01044003 and PJ01337602) of the Rural Development Administration, Republic of Korea. Dr. Byoungnam Min was supported by a Korea University grant.

Author information

Corresponding author

Correspondence to In-Geol Choi.

Copyright information

© 2019 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Cite this protocol

Min, B., Choi, IG. (2019). Practical Guide for Fungal Gene Prediction from Genome Assembly and RNA-Seq Reads by FunGAP. In: Kollmar, M. (eds) Gene Prediction. Methods in Molecular Biology, vol 1962. Humana, New York, NY. https://doi.org/10.1007/978-1-4939-9173-0_4

  • DOI: https://doi.org/10.1007/978-1-4939-9173-0_4

  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-4939-9172-3

  • Online ISBN: 978-1-4939-9173-0

  • eBook Packages: Springer Protocols
