Transcription of the human and rodent SPAM1 / PH-20 genes initiates within an ancient endogenous retrovirus
Sperm adhesion molecule 1 (SPAM1) is the major mammalian testicular hyaluronidase and is expressed at high levels in sperm cells. SPAM1 protein is important for penetration of the cumulus cell layer surrounding the ovum, and is also involved in zona pellucida binding and sperm intracellular signalling. A previous study had identified SPAM1 as one of the many human genes that initiate within a transposable element.
Examination of the human, mouse and rat SPAM1 loci revealed that transcripts initiate within the pol gene of an endogenous retrovirus (ERV) element. This is highly unusual, as all previously identified ERV-initiated cellular gene transcripts initiate within the viral long terminal repeat promoter. The SPAM1 locus therefore represents an example of the evolution of a promoter from protein-coding sequence. We have identified novel alternative promoter and splicing variants of human and murine SPAM1. We show that all transcript variants are expressed primarily in the testis and are predicted to encode identical proteins.
The testis-specific promoters of the human and mouse SPAM1 genes are derived from sequence that was originally part of an ERV pol gene. This represents the first known example of an ERV-derived promoter acting in a gender-specific manner.
Keywords: Long Terminal Repeat; Transposable Element Insertion; Total Gene Expression; Functional Transcription Factor Binding Site; SPAM1 Gene
List of abbreviations
ARE: androgen response element
CRE: cAMP response element
ERE: estrogen response element
LINE: long interspersed nuclear element
LTR: long terminal repeat
MAPK: mitogen activated protein kinase
ORF: open reading frame
RACE: rapid amplification of cDNA ends
SINE: short interspersed nuclear element
SPAM1: sperm adhesion molecule 1
UCSC: University of California at Santa Cruz
Sperm adhesion molecule 1 (SPAM1, also known as PH-20) is a member of a family of at least six mammalian hyaluronidases. The genes encoding these enzymes cluster in two groups of three – SPAM1, HYAL4 and HYALP1 (a pseudogene) on human chromosome 7q31, and HYAL1, HYAL2 and HYAL3 on human chromosome 3p21 [1, 2]. The orthologous mouse genes form similar clusters at syntenic chromosomal locations . This suggests that two single-gene duplications, followed by a small segmental duplication, occurred before the divergence of human and mouse approximately 80 million years ago.
HYAL4 exclusively degrades chondroitin. In contrast, HYAL1, HYAL2, HYAL3 and SPAM1 hydrolyze hyaluronic acid, with different substrate size preferences and tissue specificities [1, 2, 3]. Expression of SPAM1 has been consistently reported in the testis in various species (reviewed in [1, 4]). Expression has also been detected in the human epididymis, vas deferens, prostate and placenta [2, 5] and the murine epididymis, kidney, uterus, vagina and oviduct [6, 7, 8]. Expression of SPAM1 has not been detected in the human female reproductive tract [2, 9].
SPAM1 has various functions in fertilization. A catalytic domain has been shown to degrade hyaluronic acid [10, 11]. This molecule is a major extracellular matrix component of the cumulus cell layer that surrounds the ovum, and SPAM1 has been shown to remove this cumulus layer in vitro . SPAM1 has hyaluronic acid and zona pellucida binding regions that are distinct from its catalytic domain [13, 14] and is also involved in an intracellular signalling pathway in sperm cells upon binding to the zona pellucida [4, 15, 16].
The role of murine SPAM1 in fertilization has been investigated using a knockout mouse line. Sperm from Spam1 -/- mice showed a delay in the removal of the cumulus cell layer and fertilization in vitro. Surprisingly, however, Spam1 -/- males showed normal in vivo fertility rates and sired normal-sized litters . Sperm from Spam1 -/- mice maintained 40% of the wild-type level of hyaluronidase activity, while protein expression assays indicated the presence of a second hyaluronidase in these cells . This was unexpected, as SPAM1 was thought to be the only testicular hyaluronidase. These results may be explained by recent evidence that the murine orthologue of the human HYALP1 pseudogene has an intact ORF and is expressed in mouse testis [1, 3, 18], and that a seventh hyaluronidase, Hyal5, may exist in mouse, but not in human [3, 18]. There may therefore be some redundancy among murine testicular hyaluronidases that explains the fertility of Spam1 -/- mice. In this case, it remains likely that SPAM1 is an essential protein in human fertilization.
Little is known about the transcriptional regulation of SPAM1. A non-consensus cAMP response element (CRE) in the murine Spam1 promoter bound the testis-specific CRE modulator (CREM) protein and was involved in activation of Spam1 transcription in vitro . In addition, Spam1 expression was abolished in CREM-deficient mice . Various other putative transcription factor binding sites have been identified in the human, mouse and rat SPAM1 promoters [5, 6, 7, 19, 20]; however, the sites are generally non-consensus and have not yet been shown to be functional. The restricted developmental and spatial expression of SPAM1 [5, 7, 21], as well as the unique transcriptional mechanisms employed during spermatogenesis (reviewed in ), may render SPAM1 unamenable to traditional methods of transcription and promoter analysis.
In a previous study by our group, SPAM1 was identified as one of the many human transcripts that contain transposable element (TE) sequence . TEs include long and short interspersed nuclear elements (LINEs and SINEs), DNA transposons, and endogenous retroviruses (ERVs). TEs are extremely common in the human and mouse genomes, and together contribute 45% and 40% of the total sequence, respectively [24, 25]. Many human and mouse gene transcripts contain TE sequences in their untranslated regions (UTRs) [23, 26, 27]. TEs also contribute to the transcriptional regulation of many genes. The antisense LINE1 promoter and the long terminal repeat (LTR) promoters of ERVs are known to participate in the tissue-specific expression of various host genes [28, 29, 30]. Through bioinformatic analysis, human and mouse SPAM1 transcripts were predicted to initiate within an antisense ERV , indicating that this gene may represent another example of transcriptional regulation by a TE.
In this study, we show that the first exons and proximal promoter regions of the human and rodent SPAM1 genes are derived from an ERV1 pol coding region, and identify novel alternative promoters and splicing variants of the gene. We show that the human and mouse ERV-derived promoters are largely testis-specific, and discuss the implications of ERV insertion on the evolution of transcriptional regulation at this locus.
The human SPAM1 gene initiates within an ERV1 pol region
A recent study by our group used bioinformatic methods to investigate the contribution of TEs to human and mouse gene transcripts . That study determined that 3.1% of human RefSeq genes initiate within a TE sequence, indicating that these genes are candidates for transcriptional regulation by TEs. One example identified in this way was the SPAM1 gene, where the 5'-terminus was found to map within an antisense ERV element. We have now investigated this locus in more detail.
The alternative promoters and splicing variants of SPAM1
We performed 5'-rapid amplification of cDNA ends (RACE) to confirm the position of the SPAM1 transcriptional start site. Since expression of SPAM1 is confined largely to the testis, we used human testis RNA for this analysis. Sequencing of 5'-RACE clones identified two alternative first exons of SPAM1 (Figures 1 and 2). We have designated the upstream, previously identified first exon as exon 1A, and the novel downstream first exon as exon 1B. Exon 1A is wholly derived from the antisense ERV1 pol region. Exon 1B initiates within a different fragment of the same pol gene, but terminates within a sense-orientation LTR of the ERV1 MER4C family (Figures 1B and 2). Transcripts containing either alternative first exon spliced into the same downstream exons; the SPAM1 ORF begins in exon 4, and is therefore not affected by alternative promoter usage (Figure 1A).
Multiple transcription start sites were identified within exon 1A, at positions +1, +6, +20 and +51 (Figure 2). We also identified a splicing variant of exon 1A, with variant 1A2 using a splice donor site at position +118. Use of this alternative splice site resulted in a truncated 117 bp first exon, as opposed to a full-length size of 296 bp for variant 1A1 (Figures 1A and 2). In contrast, a single transcription start site and no splicing variants were observed for exon 1B. However, some transcripts initiating within exon 1B contained a novel alternatively spliced 85 bp exon (Figure 1A). The sequences of all human and murine SPAM1 splicing variants have been deposited in GenBank with accession numbers AY920278 – AY920283.
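The first-exon sizes reported for exon 1A follow from simple coordinate arithmetic. As a minimal sketch in Python, assuming that positions are numbered relative to the most upstream start site (+1) and that an exon ends one base before its splice donor position, the predicted first-exon length for each reported start site can be tabulated as follows. The function names are hypothetical, and the donor position for variant 1A1 (+297) is an assumption inferred from its 296 bp length rather than a value stated in the text.

```python
# Transcription start sites reported for human exon 1A (relative to +1).
START_SITES = [1, 6, 20, 51]

# Splice donor positions: +118 is reported for variant 1A2; +297 is an
# assumption inferred from the 296 bp full-length exon of variant 1A1.
DONOR_SITES = {"1A1": 297, "1A2": 118}


def first_exon_length(start: int, donor: int) -> int:
    """Length of an exon running from `start` to the base before `donor`."""
    if start >= donor:
        raise ValueError("start site must lie upstream of the splice donor")
    return donor - start


def exon_length_table() -> dict:
    """Map (variant, start site) to the predicted first-exon length in bp."""
    return {
        (variant, start): first_exon_length(start, donor)
        for variant, donor in DONOR_SITES.items()
        for start in START_SITES
    }


if __name__ == "__main__":
    for (variant, start), length in sorted(exon_length_table().items()):
        print(f"variant {variant}, start +{start}: {length} bp")
```

For transcripts starting at +1, this reproduces the 117 bp and 296 bp sizes given above; the lengths for the downstream start sites are derived values, not measurements.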
Both ERV-derived promoters are male-specific
Table 1: Primer positions and sequences
The results obtained by non-quantitative RT-PCR (Figure 3) and quantitative, real-time RT-PCR (Figure 4) were generally similar. However, transcripts containing SPAM1 ORF sequence were detected in the small intestine and muscle by the former method, but not the latter. The bands amplified from these tissues by non-quantitative RT-PCR were sequenced and were confirmed to correspond to the predicted SPAM1 ORF transcript. 5'-RACE analysis performed on human muscle total RNA identified a low level of transcripts initiating within promoter 1B, but no other SPAM1-specific transcripts (data not shown). These results suggest that the 35 cycles used for non-quantitative RT-PCR analysis amplified transcripts present at levels too low to be detected by real-time RT-PCR.
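The effect of cycle number on end-point detection can be illustrated with the standard idealized model of PCR, in which the amount of product grows by a factor of (1 + E) per cycle for an efficiency E between 0 and 1. The Python sketch below is an illustration under those assumptions and not a description of the assays performed here: it ignores the plateau phase, and the function name is hypothetical.

```python
def fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Idealized fold increase in product after `cycles` PCR cycles.

    Assumes a constant per-cycle efficiency (1.0 = perfect doubling) and
    no plateau, which real reactions do not satisfy at high cycle numbers.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # 35 cycles is the number used for non-quantitative RT-PCR in this study.
    print(f"{fold_amplification(35):.3g}-fold at perfect efficiency")
```

Under perfect doubling, 35 cycles correspond to a 2^35-fold increase (roughly 3.4 × 10^10), which is consistent with the suggestion that very rare templates can still yield a visible band by end-point RT-PCR.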
ERV1-derived promoter 1A is conserved in the mouse genome
The sequence annotated as the ERV1 pol region in the human genome corresponds to nucleotides 246 – 1296 in Figure 6A (nucleotides -754 to +296 in Figure 2). The positions of the ERV1 pol region and the exon 1A transcriptional start site are shown below the lower horizontal axis. The mouse genomic sequence from approximately nucleotide 100 – 800 in Figure 6A shows some sequence similarity to nucleotides 300 – 1050 of the human sequence. Therefore, the region of the mouse Spam1 locus derived from the ERV1 pol region is considerably larger than that annotated by RepeatMasker, extending approximately 700 bp upstream of the transcriptional start site. The positions of the annotated and extended ERV1 pol regions are represented by solid and dashed boxes, respectively, on the right-hand side of Figure 6A. A similar DOTTER result was observed upon comparison of the corresponding rat and human genomic sequences (data not shown).
The level of sequence similarity between the human and mouse SPAM1 promoter regions is highest at position 900 – 950 in the human sequence and 650 – 700 in the mouse (Figure 6A, region marked with asterisk). A sequence comparison revealed that this conserved region contains the functional CRE identified in the murine Spam1 promoter (Figure 6B, reference ). The relatively high level of primate – rodent conservation of this element and the surrounding sequence indicates that this region may be functionally important.
We performed 5'-RACE on mouse testis RNA to identify the transcriptional start site(s) and to search for alternative promoters of Spam1. As shown in Figure 5, a single first exon with multiple transcriptional start sites was identified for Spam1. This exon is orthologous to exon 1A of the human gene (Figure 6A). No sequence equivalent to human exon 1B was detected in mouse Spam1 transcripts. Two splicing variants were identified for the mouse Spam1 gene. Variant 2 utilized an alternative transcription start site and splice donor site within exon 1 to generate a truncated first exon, and spliced into a short (35 bp) novel downstream exon (Figure 5). As with human SPAM1, the murine splicing variants affect only the 5'-UTR, leaving the downstream ORF intact (Figure 5A).
Expression of the mouse Spam1 gene is largely testis-specific
In this study we have experimentally confirmed a previous in silico observation  that transcription of the human and murine SPAM1 genes initiates within an antisense ERV common to both species. SPAM1 is the only hyaluronidase gene to initiate within an ERV (data not shown); this TE insertion therefore took place after the small segmental duplication of three ancestral hyaluronidase genes, but before the divergence of primates and rodents. Interestingly, human HYAL4, but not its mouse orthologue, appears to initiate within an antisense LINE1 element ( and our unpublished observations). This element therefore inserted after the primate-rodent split, indicating an ongoing contribution by TEs to human hyaluronidase transcriptional regulation.
A previous study by our group determined that TE insertions were more common in transcripts with a high Ka/Ks value . The Ka/Ks ratio for the human-Old World monkey SPAM1 orthologous pair is high at 0.57 . This is in line with our hypothesis that TE insertions are more likely to be tolerated by rapidly-evolving genes . High Ka/Ks ratios are a common characteristic of primate genes that are involved in male reproduction. This may be due to positive selection, driven by competition between the sperm of individual males of the more promiscuous primate species . In the case of SPAM1, the requirement for species-specific sperm-zona pellucida recognition may also have contributed to the high inter-species divergence of the protein sequence.
We have identified two closely spaced ERV-derived promoters for human SPAM1. Both were active primarily in the testis, albeit with an approximately 10-fold difference in promoter activity. This close physical proximity and similar tissue specificity suggest that the two promoters may be regulated by a shared testis-specific enhancer element, rather than by individual tissue-specific proximal promoter regions. We have also identified alternative splicing variants for the human and mouse genes. Alternatively spliced transcripts of HYAL1 and HYAL3 have been described that cause an in-frame deletion of the putative catalytic site and abolish hyaluronidase function . Evidence from the NCBI database suggests that an alternative splicing event in SPAM1 exon 6 generates an extended 3'-transcript, which encodes a C-terminally truncated SPAM1 isoform. However, the presence of this splicing variant has yet to be confirmed. In contrast, the alternatively spliced transcripts of SPAM1 and Spam1 described in this study differ only in the sequence and length of the 5'-UTR, and are not predicted to affect enzyme function. Changes in the 5'-UTR sequence may, however, alter the stability and/or translation efficiency of the transcripts (reviewed in ), and hence impact indirectly on SPAM1 expression.
We have shown that all SPAM1 / Spam1 alternative promoter and splicing variants are expressed primarily in the testis. Lower levels of expression were also observed in the human prostate and murine kidney. This contradicts previous reports that human SPAM1 is expressed in the placenta  and that murine Spam1 is expressed in tissues of the female reproductive tract . Expression of SPAM1 is confined to a subset of specialized cells in some tissues [5, 7], which may explain these contradictory results.
In contrast to all known examples of host gene transcriptional regulation by ERVs, SPAM1 and Spam1 initiate not within an LTR, but rather within a fragment of the pol coding region. While the SPAM1 / Spam1 promoters have not yet been fully analyzed, a non-consensus CRE at position -39 has been shown to be important for activity of the murine Spam1 promoter in an in vitro testis system . This site, and a similar sequence in the human promoter, are clearly derived from the ERV1 pol region and are well conserved between the two species (Figures 2, 5B, and 6). Various lines of evidence suggest that SPAM1 expression is regulated by sex hormones: the expression of SPAM1 in the male and female reproductive organs; the increased expression of Spam1 in male kidney compared to female ; the seasonal variation in SPAM1 expression in red fox testis ; and the variations in murine female SPAM1 expression at different stages of estrus . Indeed, various groups have identified putative androgen response elements (AREs) in the SPAM1 and Spam1 promoters [5, 6, 19], and estrogen response elements (EREs) in the Spam1 promoter . Many of these predicted sites also map within the ERV pol region. However, none of these sequences represents a consensus binding site, and none has yet been shown to bind its cognate transcription factor or to be required for SPAM1 expression.
Alternatively, hormonal regulation may be mediated through the CRE. Androgen treatment of Sertoli cells was recently shown to rapidly induce phosphorylation of a CRE binding protein and activate transcription of target genes via the MAPK pathway . This pathway was postulated to represent a common mechanism for activation of testis-specific promoters that do not contain a consensus ARE. Much work remains to be done to elucidate the mechanisms of transcriptional regulation of SPAM1 and Spam1. However, it is clear that at least one functional transcription factor binding site is derived from the ERV1 pol region.
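To make the notion of a non-consensus CRE concrete, the following Python sketch scans a DNA sequence for 8 bp windows that differ from the palindromic consensus CRE (TGACGTCA) by at most a chosen number of mismatches. It is a minimal illustration only: the function names, the two-mismatch threshold and the toy sequence are assumptions and are not taken from the SPAM1 locus or from the analyses reported here, and a sequence match of this kind says nothing about whether a site is functional.

```python
CRE_CONSENSUS = "TGACGTCA"  # palindromic consensus cAMP response element


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_cre_like(sequence: str, max_mismatches: int = 2):
    """Return (0-based offset, site, mismatch count) for CRE-like windows.

    Only one strand is scanned: the consensus is its own reverse
    complement, so a window and its reverse complement score the same.
    """
    seq = sequence.upper()
    width = len(CRE_CONSENSUS)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        n = mismatches(window, CRE_CONSENSUS)
        if n <= max_mismatches:
            hits.append((i, window, n))
    return hits


if __name__ == "__main__":
    toy = "AATGACGCAATT"  # hypothetical sequence, not from the SPAM1 locus
    print(find_cre_like(toy))
```

Applied to a promoter sequence, a scan of this kind would only nominate candidate sites; functional evidence such as protein binding or reporter assays is still required.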
ERV LTRs contain the regulatory signals necessary for transcription of the retroviral genes. Insertion of an LTR sequence near a host gene could therefore provide a novel, pre-formed regulatory unit and be rapidly adopted by the gene for use as an alternative promoter. It is less clear how a retroviral protein coding region, which has no known function in transcriptional regulation, could be adopted for use as a promoter by a host gene. We suggest the following scenario.
Prior to the primate-rodent divergence, an ERV inserted upstream of the ancestral SPAM1 gene, in the antisense orientation. By chance, the antisense pol coding region contained sequences that were similar to a CRE, and possibly to other transcription factor binding sites necessary for testis-specific transcription. The region of the human SPAM1 promoter that contains the CRE is quite divergent from the MER34 consensus sequence (Figure 6B). It is therefore unlikely that the CRE was functional, and hence preserved by purifying selection, from the time of ERV insertion. The CRE present in the modern SPAM1 and Spam1 promoters is more likely to have evolved by random nucleotide substitution from a similar sequence in the original antisense pol gene. The ~50 bp of genomic sequence that contains the CRE is relatively well conserved between human and rodents (Figure 6), indicating that purifying selection of this sequence probably occurred at some time after the creation of the functional CRE. The evolutionary origins of other functional transcription factor binding sites in the modern SPAM1 / Spam1 promoters remain to be determined.
The selective processes driving the evolution of a promoter from a protein coding sequence, and the fate of the original ancestral SPAM1 promoter, remain unknown. This gene therefore represents an extremely intriguing example of how the host genome can adopt "parasitic" ERV sequences for its own purposes.
We have shown that transcription of the human and mouse SPAM1 genes initiates within an antisense ERV pol gene. The first exons and proximal promoters of both genes are derived from this ancient ERV pol sequence. Expression of the human and mouse SPAM1 genes is largely testis-specific, and we have provided evidence that testis-specific transcription factor binding sites are derived from conserved ERV sequence in both species. SPAM1 can therefore be added to the growing list of mammalian genes that are regulated by TEs. This gene represents the first known example of the evolution of promoter function from an ERV coding sequence, and of gender-specific transcription from an ERV-derived promoter.
The human, mouse and rat SPAM1 / Spam1 loci were examined using the University of California, Santa Cruz genome browser . Homology searches were performed using the Basic Local Alignment Search Tool (BLAST, ). The SPIDEY alignment program  was used to compare cDNA and genomic DNA sequences for all splicing variants and for 5'-RACE clones. Human and mouse genomic DNA sequences were compared using the DOTTER program .
Reverse transcription and RT-PCR
C57BL/6 mouse testis total RNA and all human total RNAs were purchased from Clontech. All other mouse RNAs were extracted from C57BL/6 mouse tissues using TRIzol (Invitrogen) according to the manufacturer's instructions. 5 μg of each RNA was treated with DNase I and reverse transcribed as described . 35 cycles of RT-PCR were performed using Taq DNA polymerase with 2 ng/μl of each primer in 4 mM MgCl2. Primer pairs were as follows: GAPDH, HGF1 & HGR1; SPAM1 ORF, HSF1 & HSR1; SPAM1 Exon 1A, HSF2 & HSR2; SPAM1 Exon 1B, HSF3 & HSR2; Gapdh, MGF & MGR; Spam1 ORF, MSF1 & MSR1; Spam1 ERV, MSF2 & MSR2. All primer positions and sequences are given in Table 1.
5'-RACE analysis of human or mouse testis total RNA was carried out using the FirstChoice RLM-RACE kit (Ambion) as described . HSR3 and HSR2 were used as the outer and inner primers, respectively, for nested RT-PCR amplification of SPAM1 5'-RACE products. MSR3 and MSR2 were used as the equivalent mouse primers.
Real-time quantification of transcript levels was carried out as described . Dissociation curves demonstrated that each primer pair amplified a single product. Standard curves were prepared for each primer pair using serial dilutions of human testis cDNA to enable calculation of the relative abundance of each transcript type. The level of SPAM1 ORF transcripts for each tissue was normalized to GAPDH and expressed relative to the level detected in heart cDNA. The relative amounts of SPAM1 ERV1A and ERV1B transcripts were assessed only in testis cDNA. The level of each transcript was divided by the amount of ORF transcript detected in testis. This value was then multiplied by the GAPDH- and heart-normalized level of ORF transcripts to determine the contribution of each ERV promoter to total SPAM1 expression. Primer pairs were as follows: GAPDH, HGF2 & HGR2; Total SPAM1, HSF4 & HSR4; SPAM1 Exon 1A, HSF5 & HSR5; SPAM1 Exon 1B, HSF6 & HSR5.
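The normalization steps described above can be written out as a short calculation. The Python sketch below is one reading of that procedure, assuming that each input is a relative quantity already interpolated from the appropriate standard curve; the function and parameter names are hypothetical, and the numbers in the usage example are invented for illustration and are not data from this study.

```python
def relative_orf_level(orf: float, gapdh: float,
                       heart_orf: float, heart_gapdh: float) -> float:
    """ORF abundance normalized to GAPDH and expressed relative to heart."""
    if min(gapdh, heart_orf, heart_gapdh) <= 0:
        raise ValueError("reference quantities must be positive")
    return (orf / gapdh) / (heart_orf / heart_gapdh)


def promoter_contribution(promoter_testis: float, orf_testis: float,
                          normalized_orf: float) -> float:
    """Scale the normalized ORF level by a promoter's share in testis.

    `promoter_testis` and `orf_testis` are the exon 1A (or 1B) and ORF
    quantities measured in testis cDNA; `normalized_orf` is the GAPDH-
    and heart-normalized ORF level returned by `relative_orf_level`.
    """
    if orf_testis <= 0:
        raise ValueError("testis ORF quantity must be positive")
    return (promoter_testis / orf_testis) * normalized_orf


if __name__ == "__main__":
    # Invented quantities, for illustration only.
    testis = relative_orf_level(orf=500.0, gapdh=10.0,
                                heart_orf=2.0, heart_gapdh=8.0)
    print("normalized ORF level in testis:", testis)
    print("exon 1A contribution:", promoter_contribution(256.0, 512.0, testis))
    print("exon 1B contribution:", promoter_contribution(32.0, 512.0, testis))
```

Expressing each promoter's transcripts as a fraction of the testis ORF signal before scaling keeps the two promoter-specific values on the same GAPDH- and heart-normalized scale as total SPAM1 expression.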
The authors wish to thank the following members of the group for their help: Louie van de Lagemaat for assistance with the DOTTER program and other bioinformatic methods; Mark Romanish, Arefeh Rouhi and Brian Wilhelm for obtaining mouse tissue samples and sharing RNA stocks; Leanne Gutierrez for assisting in the preparation of Figure 1A; Greg Baillie for assistance with the DOTTER program, and for helpful comments on the manuscript. This work was supported by a grant from the Canadian Institutes of Health Research.
- 19.Zheng Y, Martin-Deleon PA: Characterization of the genomic structure of the murine Spam1 gene and its promoter: evidence for transcriptional regulation by a cAMP-responsive element. Mol Reprod Dev. 1999, 54: 8-16. 10.1002/(SICI)1098-2795(199909)54:1<8::AID-MRD2>3.0.CO;2-D.
- 24.Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, Agarwal P, Agarwala R, Ainscough R, Alexandersson M, An P, Antonarakis SE, Attwood J, Baertsch R, Bailey J, Barlow K, Beck S, Berry E, Birren B, Bloom T, Bork P, Botcherby M, Bray N, Brent MR, Brown DG, Brown SD, Bult C, Burton J, Butler J, Campbell RD, Carninci P, Cawley S, Chiaromonte F, Chinwalla AT, Church DM, Clamp M, Clee C, Collins FS, Cook LL, Copley RR, Coulson A, Couronne O, Cuff J, Curwen V, Cutts T, Daly M, David R, Davies J, Delehaunty KD, Deri J, Dermitzakis ET, Dewey C, Dickens NJ, Diekhans M, Dodge S, Dubchak I, Dunn DM, Eddy SR, Elnitski L, Emes RD, Eswara P, Eyras E, Felsenfeld A, Fewell GA, Flicek P, Foley K, Frankel WN, Fulton LA, Fulton RS, Furey TS, Gage D, Gibbs RA, Glusman G, Gnerre S, Goldman N, Goodstadt L, Grafham D, Graves TA, Green ED, Gregory S, Guigo R, Guyer M, Hardison RC, Haussler D, Hayashizaki Y, Hillier LW, Hinrichs A, Hlavina W, Holzer T, Hsu F, Hua A, Hubbard T, Hunt A, Jackson I, Jaffe DB, Johnson LS, Jones M, Jones TA, Joy A, Kamal M, Karlsson EK, Karolchik D, Kasprzyk A, Kawai J, Keibler E, Kells C, Kent WJ, Kirby A, Kolbe DL, Korf I, Kucherlapati RS, Kulbokas EJ, Kulp D, Landers T, Leger JP, Leonard S, Letunic I, Levine R, Li J, Li M, Lloyd C, Lucas S, Ma B, Maglott DR, Mardis ER, Matthews L, Mauceli E, Mayer JH, McCarthy M, McCombie WR, McLaren S, McLay K, McPherson JD, Meldrim J, Meredith B, Mesirov JP, Miller W, Miner TL, Mongin E, Montgomery KT, Morgan M, Mott R, Mullikin JC, Muzny DM, Nash WE, Nelson JO, Nhan MN, Nicol R, Ning Z, Nusbaum C, O'Connor MJ, Okazaki Y, Oliver K, Overton-Larty E, Pachter L, Parra G, Pepin KH, Peterson J, Pevzner P, Plumb R, Pohl CS, Poliakov A, Ponce TC, Ponting CP, Potter S, Quail M, Reymond A, Roe BA, Roskin KM, Rubin EM, Rust AG, Santos R, Sapojnikov V, Schultz B, Schultz J, Schwartz MS, Schwartz S, Scott C, Seaman S, Searle S, Sharpe T, Sheridan A, Shownkeen R, Sims S, Singer JB, Slater G, Smit A, Smith DR, Spencer B, 
Stabenau A, Stange-Thomann N, Sugnet C, Suyama M, Tesler G, Thompson J, Torrents D, Trevaskis E, Tromp J, Ucla C, Ureta-Vidal A, Vinson JP, Von Niederhausern AC, Wade CM, Wall M, Weber RJ, Weiss RB, Wendl MC, West AP, Wetterstrand K, Wheeler R, Whelan S, Wierzbowski J, Willey D, Williams S, Wilson RK, Winter E, Worley KC, Wyman D, Yang S, Yang SP, Zdobnov EM, Zody MC, Lander ES: Initial sequencing and comparative analysis of the mouse genome. Nature. 2002, 420: 520-562. 10.1038/nature01262.
- 25.Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W, Funke R, Gage D, Harris K, Heaford A, Howland J, Kann L, Lehoczky J, LeVine R, McEwan P, McKernan K, Meldrim J, Mesirov JP, Miranda C, Morris W, Naylor J, Raymond C, Rosetti M, Santos R, Sheridan A, Sougnez C, Stange-Thomann N, Stojanovic N, Subramanian A, Wyman D, Rogers J, Sulston J, Ainscough R, Beck S, Bentley D, Burton J, Clee C, Carter N, Coulson A, Deadman R, Deloukas P, Dunham A, Dunham I, Durbin R, French L, Grafham D, Gregory S, Hubbard T, Humphray S, Hunt A, Jones M, Lloyd C, McMurray A, Matthews L, Mercer S, Milne S, Mullikin JC, Mungall A, Plumb R, Ross M, Shownkeen R, Sims S, Waterston RH, Wilson RK, Hillier LW, McPherson JD, Marra MA, Mardis ER, Fulton LA, Chinwalla AT, Pepin KH, Gish WR, Chissoe SL, Wendl MC, Delehaunty KD, Miner TL, Delehaunty A, Kramer JB, Cook LL, Fulton RS, Johnson DL, Minx PJ, Clifton SW, Hawkins T, Branscomb E, Predki P, Richardson P, Wenning S, Slezak T, Doggett N, Cheng JF, Olsen A, Lucas S, Elkin C, Uberbacher E, Frazier M, Gibbs RA, Muzny DM, Scherer SE, Bouck JB, Sodergren EJ, Worley KC, Rives CM, Gorrell JH, Metzker ML, Naylor SL, Kucherlapati RS, Nelson DL, Weinstock GM, Sakaki Y, Fujiyama A, Hattori M, Yada T, Toyoda A, Itoh T, Kawagoe C, Watanabe H, Totoki Y, Taylor T, Weissenbach J, Heilig R, Saurin W, Artiguenave F, Brottier P, Bruls T, Pelletier E, Robert C, Wincker P, Smith DR, Doucette-Stamm L, Rubenfield M, Weinstock K, Lee HM, Dubois J, Rosenthal A, Platzer M, Nyakatura G, Taudien S, Rump A, Yang H, Yu J, Wang J, Huang G, Gu J, Hood L, Rowen L, Madan A, Qin S, Davis RW, Federspiel NA, Abola AP, Proctor MJ, Myers RM, Schmutz J, Dickson M, Grimwood J, Cox DR, Olson MV, Kaul R, Shimizu N, Kawasaki K, Minoshima S, Evans GA, Athanasiou M, Schultz R, Roe BA, Chen F, Pan H, Ramser J, Lehrach H, Reinhardt R, McCombie WR, de la Bastide M, Dedhia N, Blocker H, Hornischer K, Nordsiek G, Agarwala R, Aravind L, Bailey 
JA, Bateman A, Batzoglou S, Birney E, Bork P, Brown DG, Burge CB, Cerutti L, Chen HC, Church D, Clamp M, Copley RR, Doerks T, Eddy SR, Eichler EE, Furey TS, Galagan J, Gilbert JG, Harmon C, Hayashizaki Y, Haussler D, Hermjakob H, Hokamp K, Jang W, Johnson LS, Jones TA, Kasif S, Kaspryzk A, Kennedy S, Kent WJ, Kitts P, Koonin EV, Korf I, Kulp D, Lancet D, Lowe TM, McLysaght A, Mikkelsen T, Moran JV, Mulder N, Pollara VJ, Ponting CP, Schuler G, Schultz J, Slater G, Smit AF, Stupka E, Szustakowski J, Thierry-Mieg D, Thierry-Mieg J, Wagner L, Wallis J, Wheeler R, Williams A, Wolf YI, Wolfe KH, Yang SP, Yeh RF, Collins F, Guyer MS, Peterson J, Felsenfeld A, Wetterstrand KA, Patrinos A, Morgan MJ, Szustakowki J, de Jong P, Catanese JJ, Osoegawa K, Shizuya H, Choi S, Chen YJ: Initial sequencing and analysis of the human genome. Nature. 2001, 409: 860-921. 10.1038/35057062.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.