The GFF3toolkit: QC and Merge Pipeline for Genome Annotation

  • Mei-Ju May Chen
  • Han Lin
  • Li-Mei Chiang
  • Christopher P. Childers
  • Monica F. Poelchau
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 1858)

Abstract

The GFF3toolkit (https://github.com/NAL-i5K/GFF3toolkit), supported by the i5k Workspace@NAL, provides a suite of tools for handling gene annotations in GFF3 format from arthropod genome projects and their research communities. To improve the GFF3 formatting of gene annotations, a quality control and merge procedure is proposed along with the GFF3toolkit. In particular, the toolkit provides functions to sort a GFF3 file, detect GFF3 format errors, merge two GFF3 files, and generate biological sequences from a GFF3 file. This chapter explains when and how to use the provided tools to obtain high-quality, nonredundant arthropod gene sets.
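
As a rough illustration of what detecting GFF3 format errors can involve, the following Python sketch checks a single feature line against a few rules from the GFF3 specification (nine tab-separated columns, 1-based coordinates with start no greater than end, a valid strand, and a phase for CDS features). It is not the GFF3toolkit's implementation: the function name `check_gff3_line` and the selection of checks are assumptions made purely for illustration.

```python
from typing import List

VALID_STRANDS = {"+", "-", ".", "?"}
VALID_PHASES = {"0", "1", "2", "."}


def _is_plain_int(text: str) -> bool:
    """True if text consists only of ASCII digits (e.g. '100')."""
    return text.isascii() and text.isdigit()


def check_gff3_line(line: str) -> List[str]:
    """Return a list of format problems found in one GFF3 feature line.

    Hypothetical helper for illustration; it covers only a small subset
    of the checks a full GFF3 validator would perform.
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return [f"expected 9 tab-separated columns, found {len(cols)}"]

    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    errors = []

    # Coordinates are 1-based integers with start <= end.
    if not (_is_plain_int(start) and _is_plain_int(end)):
        errors.append("start and end must be positive integers")
    elif int(start) < 1 or int(start) > int(end):
        errors.append("start must be >= 1 and <= end")

    if strand not in VALID_STRANDS:
        errors.append(f"invalid strand: {strand!r}")

    # Phase is 0, 1, or 2; '.' is allowed only for non-CDS features.
    if phase not in VALID_PHASES:
        errors.append(f"invalid phase: {phase!r}")
    elif ftype == "CDS" and phase == ".":
        errors.append("CDS features require a phase of 0, 1, or 2")

    return errors
```

Calling `check_gff3_line` on a well-formed gene line returns an empty list, while a line with swapped coordinates or an unknown strand symbol returns one message per problem found.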

Key words

i5k, Arthropods, Insects, Genomics, Community annotation, Gene annotations, GFF3

Notes

Acknowledgments

We would like to thank Chien-Yueh Lee and Yu-Yu Lin for their suggestions on the early development of the program suite. Dan Hughes and Stephen (fringy) Richards had the initial idea for the “replace” tag for the merge program. Funding for this project was provided by the United States Department of Agriculture–Agricultural Research Service and the USDA-ARS Bee Research Laboratory.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, Taiwan
  2. Agricultural Research Service, National Agricultural Library, USDA, Beltsville, USA