Phylogenetic analysis of bacterial and archaeal arsC gene sequences suggests an ancient, common origin for arsenate reductase
- 13k Downloads
The ars gene system provides arsenic resistance for a variety of microorganisms and can be chromosomal or plasmid-borne. The arsC gene, which codes for an arsenate reductase is essential for arsenate resistance and transforms arsenate into arsenite, which is extruded from the cell. A survey of GenBank shows that arsC appears to be phylogenetically widespread both in organisms with known arsenic resistance and those organisms that have been sequenced as part of whole genome projects.
Phylogenetic analysis of aligned arsC sequences shows broad similarities to the established 16S rRNA phylogeny, with separation of bacterial, archaeal, and subsequently eukaryotic arsC genes. However, inconsistencies between arsC and 16S rRNA are apparent for some taxa. Cyanobacteria and some of the γ-Proteobacteria appear to possess arsC genes that are similar to those of Low GC Gram-positive Bacteria, and other isolated taxa possess arsC genes that would not be expected based on known evolutionary relationships. There is no clear separation of plasmid-borne and chromosomal arsC genes, although a number of the Enterobacteriales (γ-Proteobacteria) possess similar plasmid-encoded arsC sequences.
The overall phylogeny of the arsenate reductases suggests a single, early origin of the arsC gene and subsequent sequence divergence to give the distinct arsC classes that exist today. Discrepancies between 16S rRNA and arsC phylogenies support the role of horizontal gene transfer (HGT) in the evolution of arsenate reductases, with a number of instances of HGT early in bacterial arsC evolution. Plasmid-borne arsC genes are not monophyletic suggesting multiple cases of chromosomal-plasmid exchange and subsequent HGT. Overall, arsC phylogeny is complex and is likely the result of a number of evolutionary mechanisms.
KeywordsArsenate Horizontal Gene Transfer Arsenite Horizontal Gene Transfer Event Arsenate Reductase
Arsenic is a toxic element that is found in both natural environments such as geothermal springs, and in sites contaminated by a number of industries. Inorganic arsenic exists primarily in two valence states: arsenite (AsIII) and arsenate (AsV, or the arsenate ion AsO43-). Both forms are toxic to microorganisms, with arsenite disrupting enzyme function, and arsenate behaving as a phosphate analog and interfering with phosphate uptake and utilization . Microorganisms have evolved a variety of mechanisms for coping with arsenic toxicity, including minimizing the amount of arsenic that enters the cell (e.g. through increased specificity of phosphate uptake, ), arsenite oxidation through the activity of arsenite oxidase [3, 4], or peroxidation reactions with membrane lipids [5, 6]. Other microorganisms utilize arsenic in metabolism, either as a terminal electron acceptor in dissimilatory arsenate respiration [7, 8, 9, 10] or as an electron donor in chemoautotrophic arsenite oxidation [11, 12]. However, the most well-characterized microbial arsenic detoxification pathway involves the ars operon [2, 13].
The ars operon consists of a group of genes coding for a transmembrane pump and an arsenate reductase (arsC). The operon includes a regulatory gene (arsR) and a gene coding for an arsenite-specific pump (arsB) as well as arsC . Arsenite is pumped directly out of the cell by the arsB protein; however arsenate must first be reduced to arsenite by the cytoplasmic arsenate reductase coded from arsC. Some bacteria also possess other ars genes: arsA produces an arsenite-stimulated ATPase  that results in more efficient arsenite extrusion; arsD encodes for a regulatory protein that controls the upper level of ars expression ; arsH has been identified but has an uncertain function . The ars operon was initially recognized in plamids of Staphylococcus aureus and S. xylosus [17, 18] but has subsequently been found in other microorganisms (e.g. Escherichia coli [19, 20], Acidiphilum multivorum , Bacillus subtilis , Pseudomonas aeruginosa ). The genes can be plasmid-borne or chromosomal, and genome-sequencing projects have identified putative ars genes in both Bacteria and Archaea that have not been specifically characterized in terms of arsenic resistance. An analogous genetic system (Arr or ACR) has also been described in the eukaryote Saccharomyces cervisiae . Thus the ars operon, or similar arsenic resistance systems, appears to be relatively widespread throughout microorganisms.
The arsC gene is of particular interest in that its product, the soluble enzyme arsenate reductase, catalyzes the reduction of arsenate to arsenite. Arsenate is the thermodynamically favorable form of arsenic under aerobic conditions [25, 26], so it is likely to be the most common form of arsenic in many environments. Thus, the presence and expression of arsC is likely to be required for microorganisms inhabiting such areas. Furthermore, arsenite is generally more labile and toxic than arsenate , so that expression of arsC, and the ensuing reduction of arsenate to arsenite, might increase the toxicity of arsenic in the environment. Despite the potential importance of the arsC gene at both a physiological and environmental level, a thorough study of the phylogenetic distribution of different arsC genes has not been performed.
A simple phylogenetic tree showing the relationships between the arsenate reductases of seven Bacteria (those that had been confirmed to possess the gene at that time) was presented as part of a study that identified the arsenic resistance genes of Thiobacillus ferrooxidans (now Acidithiobacillus ferroxidans) . Saltikov and Olson  used probes/primers based on the E. coli ars operon to detect ars genes in natural environments, and presented a phylogenetic analysis of these genes, however their analysis was largely confined to enteric bacteria (those that were similar enough to E. coli to be detected) and they emphasized the arsB gene rather than arsC. As part of a review of microbial arsenic transformations, we recently determined a preliminary arsC phylogeny based on 19 sequences . These three studies reported phylogenies that suggested a common origin for the ars genes examined [27, 28, 29]. An alternative view suggested by Mukhopadhyay et al.  is that three distinct classes of arsenate reductases exist, which have developed similar mechanisms and reaction centers through convergent evolution. Two of these classes are bacterial (one typified by the arsC found in enteric bacteria such as E. coli; the other typified by the arsC found in Staphylococcus plasmids and other Gram-positive Bacteria); the third is the Arr2 gene of S. cerevisiae. Phylogenies of these gene families based on 18 arsC sequences were recently reported . However, as well as microorganisms that have been shown to have demonstrable arsenate resistance, whole genome analyses of various microorganisms have revealed an increasing number of open reading frames (ORFs) that have homology to arsC. Inclusion of these putative arsC genes in phylogenetic analyses might resolve the basic question: Did arsenate resistance systems evolve from a common origin or develop convergently in multiple taxa? In this study we attempt to answer this question through a phylogenetic analysis of both confirmed arsC genes and those inferred from whole genome analyses. We compare this phylogeny to one derived from 16S rRNA, and suggest possible implications for the evolutionary history of arsenic resistance in microorganisms.
Results and Discussion
A number of taxa do show arsC phylogenies that are inconsistent with the established 16S rRNA evolutionary tree, perhaps the most dramatic being the separation of some of the non-enteric γ-Proteobacteria (Pseudomonas putida, P. aeruginosa, and A. ferrooxidans) from the Enterobacteriales. The non-enteric γ-Proteobacteria actually appear to be paraphyletic with regards to arsC, in that the two Xanthomonas species group with the α-Proteobacteria, whereas the others form a distinct group with the sole representative of the β-Proteobacteria (R. solanacearum). The phylogenetic affiliation of the latter clade is uncertain: ED analysis placed it as a deep branch in the Low GC Gram positive Bacteria group (Figure 2), whereas MP placed it loosely with the large Enterobacteriales/α-Proteobacteria group (Figure 3). Either placement had poor bootstrap support, and given the relatively low sequence homology of this clade to either of the other major bacterial arsC groups (50% similarity to the Low GC Gram positives, 40% similarity to the Enterobacteriales/α-Proteobacteria group) we suspect that this group might represent a fourth distinct group of arsenate reductases, making its placement in a binary tree difficult. A second distinct difference between the arsC and 16s rRNA trees is the tight grouping of the Cyanobacteria with the arsC sequences from Low GC Gram-positive Bacteria. Other discrepancies are placement of the three Streptococcus sequences (Low GC Gram-positive Bacteria) with the Actinobacteria and the placement of Aquifex aeolicus (basal in the 16S rRNA-defined bacterial clade) and F. nucleatum within the Low GC Gram-positives (Figures 2 and 3). The latter two sequence types were the only arsC sequences reported for the Aquificales and Fusobacteria, respectively, and as with some of the 16S rRNA tree branches, might represent the limited data set rather than true evolutionary relationships. However, A. aeolicus is known to have exchanged numerous genes with other microorganisms , and the phylogeny of its arsenate reductase gene might represent another example of this.
Regardless of the particular phylogenetic methods used, the arsC trees clearly show a number of non-orthologous arsC genes, assuming that the 16S rRNA tree reflects true phylogeny. One explanation for these inconsistencies could be the existence of arsC paralogs that have diverged after arsC gene duplication events. While this might be the case for those organisms with arsC sequences that were not closely related to other sequences (e.g. D. radiodurans or C. tepidum), it is an unreasonable explanation for the similar arsC sequences (e.g. those of the two Cyanobacteria) that group within unexpected larger clades. Rather than multiple gene duplication events or convergent evolution, this suggests horizontal gene transfer (HGT) of arsC genes. Recent studies suggest that HGT events may have been much more widespread during prokaryotic evolution than had been previously thought, with genetic exchange even occurring between Bacteria and Archaea [35, 36, 37]. While we see no evidence of cross-domain transfer of arsC genes, a number of HGT events could explain some of the discrepancies between 16S rRNA-based and arsC-based phylogenies. HGT of arsC from an ancestral low GC Gram-positive bacterium to an ancestral cyanobacterium seems likely, and a similar HGT event from ancestral Actinobacteria to the Streptococci would explain the unexpected phylogeny of the Streptococcus arsC sequences. Similar gene transfer events from the low GC Gram-positives to F. nucleatum could also have occurred, but with an arsC sequence only available from a single member of the Fusobacteria it is difficult to draw conclusions. The paraphyletic arsC sequences in the non-enteric γ-Proteobacteria likewise may have arisen from HGT events, although the difficulty in resolving the affiliation of some of these sequences makes speculation about their evolution premature. Indeed, the exact placement of many of the non-orthologous arsC genes was difficult and poor bootstrap support for some nodes suggests that refinement of arsC phylogeny as more sequences become available should be an ongoing process. Regardless, a number of bacteria possess arsC genes that are inconsistent with established phylogenies, and HGT events are one possible explanation. It should also be noted that some of the non-orthologous sequences represent putative arsC genes identified from ORF's, rather than confirmed arsenate reductases. These ORF's show homology to identified arsC genes, but may not necessarily code for a functional enzyme. Thus, some of the non-orthologous sequences might represent elevated genotypic variation if the organism has no selective pressure to maintain a functional arsC gene.
We were hoping that patterns in the phylogeny of plasmid-borne arsC genes might indicate whether HGT of arsC genes has occurred (or is still occurring) in recent times. Specifically, if the plasmid-borne arsenate reductases are similar, this would suggest a common origin for these genes, which must ultimately be chromosomal. However, the plasmid-borne arsC genes appear to be paraphyletic and clearly separate into very different arsC types (Figures 2 and 3). The two major groups of plasmid-borne genes are those of the Staphylococci (which were the first arsC genes to be recognized [17, 18]) and those of the Enterobacteriales. Both of these groups of plasmid-borne genes are similar to the chromosomal genes found in related organisms. The only exception is the arsC found on the pKW301 plasmid of A. multivorum, which shows high sequence similarity to the plasmid-borne arsC sequences found in the Enterobacteriales, strongly suggesting relatively recent plasmid transfer between a member of this group and A. multivorum. Indeed, the ars operon found in A. multivorum is expressed and confers arsenate resistance when transferred to E. coli . Three other types of plasmid-borne arsC genes are found in R. solanacearum, Clostridium acetobutylicum, and Halobacterium halobium, demonstrating that plasmid-borne arsC genes are phylogenetically widespread and suggesting multiple incidences of chromosomal-plasmid transfer. The arsC sequence of C. acetobutylicum is typical of the clostridial arsC genes, and also shows some similarity to the arsenate reductase sequence in L. interrogans (a Spirochete). Thus, it is possible that ancestral clostridial plasmids may have transferred ars genes to the Spirochetes. As well as C. acetobutylicum, two other bacteria were represented by both chromosomal and plasmid-borne arsC sequences. The two arsC genes of E. coli K12 are similar, but they are equally similar to the plasmid-borne genes of other enteric bacteria. The plasmid-borne arsC gene from Salmonella typhimurium, shows little similarity to it's chromosomal equivalent, and appears to be related to the other arsC genes found in plasmids in the Enterobacteriales. The high similarity (at least 70%) between all the enteric plasmids suggests a common source, which from this analysis appears to be very similar to the E. coli chromosomal arsC gene. However, it is clear that within the enterics, relationships based on arsC do not parallel relationships based on 16S rRNA, and significant genetic exchange within this group (possibly inside a host organism) has likely occurred.
Despite the suggestion of HGT and duplication events, the arsC phylogeny suggests that arsenate reductase is an evolutionarily old enzyme, a hypothesis that was recently suggested for arsenite oxidase . This is the first phylogenetic analysis of an arsenic redox active gene to include all three domains and the Archaea and Bacteria were clearly separated, with S. cerevisiae subsequently branching from Archaea. Thus, the arsC gene, and by extension, arsenic resistance mechanisms, must have been present in early organisms, either in the last universal ancestor or after the divergence of the two prokaryotic domains (i.e. in an ancestral bacterium or archaean). In the latter case, early HGT event(s) must have transferred arsenate resistance to the other domain, followed by subsequent divergence to the phylogeny seen today. Such gene exchange between Archaea and primitive Bacteria has certainly taken place  and makes absolute statements about the initial origin of genes difficult to make. Regardless, it seems likely that arsenate resistance developed early in the evolution of microorganisms. Recent ideas on the origin of life suggest that early cellular structures were abiotic iron-sulfur formations at submarine hydrothermal vents in the Hadean ocean, in which basic biochemical and metabolic processes could have developed before the formation of cell membranes [39, 40]. As well as iron and sulfur, hydrothermal areas are often are often characterized by high levels of arsenic  and may even contain arsenic redox active microbial communities . Given the integral role of phosphate in early (and current) cellular metabolism , an ability to reduce the levels of arsenate (a phosphate analog) might have been an essential biochemical process. There would have been selective pressure to develop an arsenate reductase that could change this arsenate to arsenite, and this early evolution of an arsC-like ancestor coupled with sequence divergence and HGT would result in the diversity and widespread distribution of arsC genes seen in microorganisms today.
Sequence acquisition and alignment
Sequences of arsC genes were obtained from GenBank  and were characterized into two broad types: confirmed arsC genes (i.e. those sequences that were identified in studies that explicitly tested arsenate reduction and/or resistance in that given organism), and putative arsC genes (i.e. open reading frames (ORFs) obtained from genome projects that showed homology to known arsC genes). Genes were also noted as being either chromosomal or plasmid-borne. A total of 60 sequences were analyzed consisting of 54 bacterial, 5 archaeal, and the Arr2 gene of Saccharomyces cerevisae. Accession numbers for these sequences are given in Table 1 [see Additional file 1]. The Arr2 gene (or ACR2 ) has been identified as coding for a protein necessary for arsenate resistance in yeast, and appears to have some homology to prokaryotic arsC genes. Sequences were imported into the ARB software package (distributed by W. Ludwig and O. Strunk, Technical University of Munich, Germany; http://www.arb-home.de/) running on a SuSE Linux 6.2 platform.
In ARB, amino acid sequences were derived from nucleotide sequences using the Translate DNA to Protein function and the standard genetic code. Because some arsC genes were incomplete, all three possible reading frames were examined to ensure correct translation. Amino acid sequences of the confirmed bacterial arsenate reductases (12 total) were initially aligned automatically using the Fast Aligner function of ARB Edit 4.1, and then manually adjusted to ensure that secondary structure considerations of the arsenate reductase protein were met. The secondary structure of E. coli arsenate reductase (17978 in the NCBI Molecular Modeling Database, ) was visualized using Cn3D 3.0 and One-D Viewer 1.0 (available from http://www.ncbi.nih.gov) and used as a model for arsenate reductase secondary structure. Nucleotide sequences of these confirmed arsC genes were aligned according to the amino acid alignment, and the putative arsC ORFs subsequently aligned to these confirmed sequences. Multiple alignments of all arsC genes used in this study are presented as an additional ARB file [see Additional file 2]. Aligned 16S rRNA sequences of all organisms used in the arsC analysis (with the exception of an arsC sequence obtained from an unknown Proteobacterium) were obtained from the Ribosomal Database Project  and imported into ARB. Because three arsC genes have been reported for different strains/plasmids of E. coli, and two arsC genes have been identified for both Salmonella typhimurium and Clostridium acetobutylicum (plasmid and chromosomal for each; Table 1 [see Additional file 1]) only one 16S rRNA sequence was used for each to avoid potential biases and inaccurate attractions due to sequence identity.
The aligned 16S rRNA sequences (55) were used to construct a baseline phylogenetic tree to serve as a comparison for arsC sequences. 1326 informative positions (corresponding to E. coli positions 140–1475) were compared using evolutionary distance (ED) methods. An ED tree was constructed in ARB for the 56 16S rRNA sequences using neighbor joining  with 1000 bootstraps. Distance and maximum parsimony (MP) methods were used to construct arsC trees using aligned nucleotide sequences. Both methods used all 60 aligned arsC nucleotide sequences. The ED tree was constructed in ARB using neighbor joining following a simple Jukes and Cantor model with the Olsen correction. Because visual examination of the aligned sequences revealed a number of gaps suggesting insertion/deletion events within different taxa, gaps were treated as a fifth base for construction. Thus, ED phylogeny was based on a total of 408 informative positions with 1000 bootstraps performed as part of the procedure. MP phylogeny used the same informative positions and number of bootstraps, and was performed using the PHYLIP package included with ARB.
Funding for this research was provided by award LEQSF(2002-05)-RD-A-25 from the Louisiana Board of Regents Support Fund, Research and Development Program, to CRJ. Rick Miller, Evelyn F. Jackson and two anonymous reviewers provided useful comments on an earlier version of this manuscript.
- 1.Tamaki S, Frankenberger WT: Environmental biochemistry of arsenic. Rev Environ Cont Toxicol. 1992, 124: 79-110.Google Scholar
- 5.Abdrashitova SA, Abdullina GG, Ilyaletdinov AN: Role of arsenites in lipid peroxidation in Pseudomonas putida cells oxidizing arsenite. Mikrobiologiya. 1986, 55: 212-216.Google Scholar
- 6.Abdrashitova SA, Abdullina GG, Mynbaeva BN, Ilyaletdinov AN: Oxidation of iron and manganese by arsenic-oxidizing bacteria. Mikrobiologiya. 1990, 59: 85-89.Google Scholar
- 11.Ilyaletdinov AN, Abdrashitova SA: Autotrophic oxidation of arsenic by a culture of Pseudomonas arsenitoxidans. Mikrobiologiya. 1981, 50: 197-204.Google Scholar
- 13.Ji G, Silver S, Garber EAE, Ohtake H, Cervantes C, Corbisier P: Bacterial molecular genetics and enzymatic transformations of arsenate, arsenite, and chromate. In Biohydrometallurgical Technologies. Edited by: Torma AE, Apel ML, Brierly CL. 1993, Warrendale, PA: The Minerals, Metals & Materials Society, 529-539.Google Scholar
- 17.Ji G, Silver S: Regulation and expression of the arsenic resistance operon from Staphylococcus aureus plasmid pI258. J Bacterio. 1992, 174: 3684-3694.Google Scholar
- 24.Bobrowicz P, Wysocki R, Owsianik G, Goffeau A, Ulaszewski S: Isolation of three contiguous genes, ACR1, ACR2 and ACR3, involved in resistance to arsenic compounds in the yeast Saccharomyces cerevisiae. Yeast. 1997, 13: 819-828. 10.1002/(SICI)1097-0061(199707)13:9<819::AID-YEA142>3.0.CO;2-Y.CrossRefPubMedGoogle Scholar
- 25.Cullen WR, Reimer KJ: Arsenic speciation in the environment. Rev. 1989, 89: 713-764.Google Scholar
- 27.Butcher BG, Deane SM, Rawlings DE: The chromosomal arsenic resistance genes of Thiobacillus ferrooxidans have an unusual arrangement and confer increased arsenic and antimony resistance to Escherichia coli. Appl Environ Microbiol. 2000, 66: 1826-1833. 10.1128/AEM.66.5.1826-1833.2000.PubMedCentralCrossRefPubMedGoogle Scholar
- 29.Jackson CR, Jackson EF, Dugas SL, Gamble K, Williams SE: Microbial transformations of arsenite and arsenate in natural environments. Recent Res Devel Microbiol. 2003,Google Scholar
- 34.Nei M, Kumar S: Molecular Evolution and Phylogenetics. 2000, New York: Oxford University PressGoogle Scholar
- 37.Mirkin BG, Fenner TI, Galperin MY, Koonin EV: Algorithms for computing parsimonious evolutionary scenarios for genome evolution, the last universal common ancestor and dominance of horizontal gene transfer in the evolution of prokaryotes. BMC Evolutionary Biology. 2003, 3: 2-10.1186/1471-2148-3-2.PubMedCentralCrossRefPubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.