Abstract
Background
Most eukaryotic genomes include a substantial repeat-rich fraction termed heterochromatin, which is concentrated in centric and telomeric regions. The repetitive nature of heterochromatic sequence makes it difficult to assemble and analyze. To better understand the heterochromatic component of the Drosophila melanogaster genome, we characterized and annotated portions of a whole-genome shotgun sequence assembly.
Results
WGS3, an improved whole-genome shotgun assembly, includes 20.7 Mb of draft-quality sequence not represented in the Release 3 sequence spanning the euchromatin. We annotated this sequence using the methods employed in the re-annotation of the Release 3 euchromatic sequence. This analysis predicted 297 protein-coding genes and six non-protein-coding genes, including known heterochromatic genes, and regions of similarity to known transposable elements. Bacterial artificial chromosome (BAC)-based fluorescence in situ hybridization analysis was used to correlate the genomic sequence with the cytogenetic map in order to refine the genomic definition of the centric heterochromatin; on the basis of our cytological definition, the annotated Release 3 euchromatic sequence extends into the centric heterochromatin on each chromosome arm.
Conclusions
Whole-genome shotgun assembly produced a reliable draft-quality sequence of a significant part of the Drosophila heterochromatin. Annotation of this sequence defined the intron-exon structures of 30 known protein-coding genes and 267 protein-coding gene models. The cytogenetic mapping suggests that an additional 150 predicted genes are located in heterochromatin at the base of the Release 3 euchromatic sequence. Our analysis suggests strategies for improving the sequence and annotation of the heterochromatic portions of the Drosophila and other complex genomes.
Acknowledgements
We thank Sima Misra and Casey Bergman for helpful discussions; Andrew Skora, Christopher Yan, and David Acevedo for assistance with FISH experiments; Robert Svirskas, John Tupy, Pavel Hradecky, Colin Wiel, Bruno Ribeiro, Marcelo Alvim and Maria Vibranovski for assistance with informatics; Erwin Frise, Eric Smith and Dave Hurley for computer systems support; and Catherine Nelson for editing the manuscript. This work was supported by Celera Genomics, the Howard Hughes Medical Institute, NSF grant MCB0213163 to B.T.W., fellowships from the CNPq and the Pew Latin American Fellows Program to A.B.C., NIH grant R01 HG00747 to G.H.K. and NIH grant P50 HG00750 to G.M.R. The work supported by P50-HG00750 was carried out under Department of Energy Contract DE-AC0376SF00098, University of California.
Cite this article
Hoskins, R.A., Smith, C.D., Carlson, J.W. et al. Heterochromatic sequences in a Drosophila whole-genome shotgun assembly. Genome Biol 3, research0085.1 (2002). https://doi.org/10.1186/gb-2002-3-12-research0085
DOI: https://doi.org/10.1186/gb-2002-3-12-research0085