Unexpected complexity of the Reef-Building Coral Acropora millepora transcription factor network
Coral reefs are disturbed on a global scale by environmental changes including rising sea surface temperatures and ocean acidification. Little is known about how corals respond or adapt to these environmental changes, especially at the molecular level. This is mostly because of the paucity of genome-wide studies on corals and of systems approaches that build on such studies. As in any other organism, the response of corals to stress is tightly controlled by the coordinated interplay of many transcription factors.
Here, we develop and apply a new system-wide approach to infer combinatorial transcription factor networks of the reef-building coral Acropora millepora. By integrating sequencing-derived transcriptome measurements, a network of physically interacting transcription factors, and phylogenetic network footprinting, we were able to infer such a network. Analysis of the network across a phylogenetically broad sample of five species, including human, reveals that despite the apparent simplicity of corals, their transcription factor repertoire and interaction networks seem to be largely conserved. In addition, we were able to identify interactions among transcription factors that appear to be species-specific, lending strength to the novel concept of "Taxonomically Restricted Interactions".
This study provides the first look at transcription factor networks in corals. We identified a transcription factor repertoire encoded by the coral genome and found consistencies of the domain architectures of transcription factors and conserved regulatory subnetworks across eumetazoan species, providing insight into how regulatory networks have evolved.
Keywords: Ocean Acidification; Transcriptional Network; Pfam Domain; Arcsine Transformation; Domain Composition
List of abbreviations
ORF: open reading frame
TRGs: Taxonomically Restricted Genes
TRIs: Taxonomically Restricted Interactions
Deciphering and predicting transcriptional regulatory networks is of considerable importance in understanding how organisms function, adapt, and respond to changes in their environment. Much effort has been devoted to elucidating these regulatory networks in several model organisms. For instance, global combinatorial interaction maps of transcription factors (TFs) were built in human and mouse, and developmental gene regulatory circuits were elucidated in the sea urchin embryo [2, 3]. However, little effort has been made so far to understand the structure, function, and conservation of transcriptional networks in non-model organisms, e.g. corals, despite their ecological importance.
Corals are members of the phylum Cnidaria, which includes such diverse forms as jellyfish, hydra, and sea anemones. Reef-building corals (Cnidaria: Hexacorallia: Scleractinia) in symbiosis with their unicellular photosynthetic dinoflagellate symbionts (Alveolata: Dinophyceae: Symbiodinium) provide the foundation of coral reef ecosystems, and are well known for providing biodiversity to marine ecosystems. The sensitivity of corals to environmental stresses such as temperature, salinity, and nutrient loading makes them highly vulnerable to climate change, ocean acidification, and other anthropogenic impacts. As a consequence, coral cover has continuously declined in recent decades. A more detailed understanding of how scleractinian corals and their associated microbes will respond to environmental changes is needed in order to eventually establish effective management policies that are able to conserve and sustain coral reef ecosystems. So far only a few studies have looked at the mechanisms on a molecular level [7, 8, 9, 10, 11, 12] in ways that go beyond transcriptome annotation and ortholog identification [13, 14]. The emerging picture from these studies is that corals are complex organisms, as revealed by a diverse set of receptors and a comprehensive innate immunity repertoire, both of which are important for responses to the environment.
In this study, we developed and applied a systems-wide integrative approach to assess the complexity of the Acropora millepora transcription factor (TF) network by reconstructing a TF interaction map from known interactions and comparing it to those of four model organisms (fruit fly, sea urchin, mouse, and human). The A. millepora TF repertoire was identified using sequence-specific DNA binding domains from DBD. Subsequently, conserved combinatorial TF interactions as well as species-specific TF interactions were inferred by integrating the fly, mouse, and human TF interaction networks [1, 17], followed by phylogenetic network footprinting analysis. Our study provides not only the first comprehensive catalog of A. millepora TFs but also a first assessment of how these are organized to form transcriptional networks. Evidence of conservation and divergence across the phylogenetic tree can also be inferred from this analysis. The analysis presented here can be considered a starting point for a more comprehensive study of regulatory networks in corals based on a coral reef genomics- and systems biology-based interpretive framework.
Results and Discussion
The A. millepora TF repertoire
First, all possible open reading frames (ORFs) for each contig and singleton were identified and translated using GetORF. To identify coral proteins whose orthologs exist in other species, all coding sequences were queried against the NCBI non-redundant (nr) database using BLASTp, and the best-hit sequence for each contig or singleton was chosen. If no hit was found, the longest coding sequence of the contig/singleton was chosen. The latter cases are most likely coral-specific proteins or new, currently uncharacterized proteins. Of the 104,005 contigs/singletons in the coral transcriptome data, 19,840 and 49,320 protein sequences were chosen by these two approaches, respectively.
To identify the transcription factor repertoire we used 147 sequence-specific DNA binding domains that have recently been manually curated. Protein sequences of the five analyzed species (Acropora millepora, Drosophila melanogaster, Strongylocentrotus purpuratus, Mus musculus, and Homo sapiens) were searched with InterProScan to identify those with DNA binding domain signatures. Any protein was regarded as a TF if it had at least one such domain. This approach yielded 359 TFs in coral, 1,047 in fruit fly, 839 in sea urchin, 1,462 in mouse, and 1,885 in human (see Additional files 1, 2, 3, 4, and 5).
Evolutionary signatures of the A. millepora transcription factors repertoire
In order to define the evolutionary signatures of the A. millepora TF repertoire, we analyzed the protein domain composition of the TFs across the five species. We scanned the full-length protein sequences of all the TFs using the Pfam database. The 359 TFs identified in A. millepora contained 60 Pfam domains. We also compared the domain composition of A. millepora with that of the fruit fly (1,047 TFs with 123 domains), sea urchin (839 TFs with 157 domains), mouse (1,462 TFs with 166 domains), and human (1,885 TFs with 168 domains) (see Additional file 6).
Inferring A. millepora transcriptional network using phylogenetic network footprinting
A. millepora transcriptional network shows properties of those of Bilaterian organisms
A close analysis of the conserved transcriptional network depicted in Figure 3A reveals surprising and intriguing properties. The organization of A. millepora TFs into regulatory networks is similar to that of bilaterian organisms. For example, A. millepora has a homeobox gene regulatory sub-network, comprising several Hox genes, that is conserved across evolution. In general, syntenic occurrence of these Hox genes was considered a prerequisite for the correct function of these transcription factors in animals. However, disintegration of Hox clusters has been observed in diverse taxa including several cnidarians, and thus the evolutionary significance of syntenic regions is in question [36, 37, 38, 39, 40]. We wonder whether retention of interactions between Hox genes might be more important than their close physical linkage alone. More specifically, one of the larger and most conserved transcriptional networks in A. millepora was identified among Hox gene regulatory sub-networks (Figure 3A), and this network is enriched for genes that are implicated in developmental processes such as neural tube formation in mammals. Furthermore, there is evidence that this network expanded during mammalian divergence. This probably represents a net gain in modularity and plasticity for the network in more complex organisms (Figure 3A), which might reflect an increase in transcriptional plasticity and control.
Evidence for Taxonomically Restricted Interactions (TRIs)
In evolutionary biology, the concept of Taxonomically Restricted Genes (TRGs) is associated with those genes that are restricted to particular species and are responsible for species-specific phenotypes and/or lineage-specific adaptations. These genes usually evolve and adapt as a consequence of the specific environment and lifestyle of the organism and are more common in organisms that live in extreme environmental conditions. Although we do not debate the existence of such TRGs, we identified cases where the transcriptional network differs between specific organisms. Most importantly, this does not correlate with the complexity of the organism but rather with its adaptation to a particular environment. Consequently, we propose that modulation of transcriptional networks might be a prime mechanism for species-specific adaptations. For example, in Figure 3B we report cases where specific interactions were gained or lost in specific species during evolution. In other words, we can identify what we like to call "Taxonomically Restricted Interactions" (TRIs). This new concept is also supported by the network expansion that we see in the mammalian lineage (Figure 3A). Overall, the concept might give a better explanation of how an organism adapts to specific environments than merely considering a set of TRGs. Note that we anticipate plentiful examples of TRIs if whole protein interactomes were to be compared. Our results also support the notion that transcriptional networks evolve by gene duplication followed by gain or loss of specific interactions. For example, the subnetworks in Figure 3B are typical examples of lineage-specific network expansions by gene duplication and interaction retention.
We also identified an interesting subnetwork that is composed of the oncogene MAX (Figure 3B network 7). In mammals, this network is composed of the homodimer of the oncogene MAX. In contrast, A. millepora retained a paralog forming a redundant system that is composed of two homodimers (Figure 3B). We argue that this paralog diverged from its ortholog sister as inferred from the alignment of the HLH domain and the evolutionary tree (Figure 3C and 3D).
The analysis presented here only scratches the surface of the complex structure of the transcriptional network in A. millepora. This study is a first attempt toward understanding the structure of the transcription factor networks in corals despite the paucity of -omics-level datasets, including genome sequences. Contrary to the apparent simplicity of A. millepora, its gene repertoire and, more importantly, its transcriptional network are not so different from those of higher organisms and are not less complex than those of other model organisms such as D. melanogaster, making scleractinian corals a candidate for a new model organism.
Coral protein sequence identification
To identify A. millepora TFs, ORFs were first searched in the A. millepora transcriptome data, and then all possible coding sequences were translated using GetORF with default parameters. Coding sequences from each contig/singleton were queried against the NCBI non-redundant (nr) protein database using the BLASTP program, and the best hits were chosen. If there was no hit to the nr database, the longest coding sequence (CDS) of the contig/singleton was chosen as the final CDS. Among 104,005 contigs/singletons from the transcriptome data, 19,840 and 49,320 sequences were chosen by these two approaches, respectively.
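The selection rule described above (prefer the BLAST best hit, otherwise fall back to the longest CDS) can be sketched as follows. This is only an illustrative sketch: the function name `choose_final_cds` and the toy sequences are hypothetical, and the code assumes the BLAST search has already been run and its result reduced to either one best-hit CDS or `None`.

```python
from typing import Optional, Sequence


def choose_final_cds(candidates: Sequence[str], best_hit: Optional[str]) -> str:
    """Pick the final CDS for one contig/singleton.

    candidates: all translated coding sequences found for the contig.
    best_hit:   the candidate with the best hit against nr, or None if
                no candidate had any hit (hypothetical pre-computed input).
    """
    if not candidates:
        raise ValueError("contig has no candidate coding sequences")
    if best_hit is not None:
        return best_hit
    # No database hit: fall back to the longest coding sequence.
    return max(candidates, key=len)


# Toy example with made-up peptide strings.
orfs = ["MKT", "MKTAAAGLV", "MPL"]
print(choose_final_cds(orfs, best_hit="MPL"))  # hit available
print(choose_final_cds(orfs, best_hit=None))   # no hit, longest wins
```

Under this rule, the first branch would correspond to the 19,840 sequences with a database hit and the second to the 49,320 sequences without one.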
Identification of TFs with DNA binding domains
To identify the transcription factors of the five analyzed species, we used 147 Pfam domains that bind DNA in a sequence-specific manner. InterProScan was used to search for Pfam domains in the protein sequences of the five organisms, and a protein was regarded as a TF if it contained at least one DNA binding domain. This resulted in 359, 1,047, 839, 1,462, and 1,885 TFs for A. millepora (AM), D. melanogaster (DM), S. purpuratus (SP), M. musculus (MM), and H. sapiens (HS), respectively. To assess the accuracy of the prediction, a 'known' human TF set was downloaded. After converting the TF lists to Entrez gene IDs, 1,443 known TFs and 1,405 predicted TFs were obtained. Among the known TFs, 1,226 were predicted correctly and 217 did not have any sequence-specific DNA binding domain. A further 179 predicted TFs did not overlap with known TFs. The positive predictive value, i.e. the ratio of true positives to the sum of true and false positives, was 87.26% (1,226 of 1,405 predicted TFs).
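As a minimal sketch of the two steps described above, the following code classifies a protein as a TF when its domain set overlaps a list of DNA binding domains, and recomputes the positive predictive value from the reported counts. The domain names and helper names are illustrative assumptions; the actual analysis used 147 curated Pfam domains and InterProScan output.

```python
def is_tf(protein_domains: set, dna_binding_domains: set) -> bool:
    """A protein counts as a TF if it has at least one DNA binding domain."""
    return bool(protein_domains & dna_binding_domains)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """PPV = TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)


# Illustrative domain names only, standing in for the curated list.
dbd_list = {"Homeobox", "HLH", "zf-C2H2"}
print(is_tf({"Homeobox", "Pkinase"}, dbd_list))  # has a DNA binding domain
print(is_tf({"Pkinase"}, dbd_list))              # has none

# Counts reported in the text: 1,226 correctly predicted known TFs and
# 179 predicted TFs that are absent from the known set.
ppv = positive_predictive_value(true_pos=1226, false_pos=179)
print(f"{ppv:.2%}")  # 87.26%
```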
Analysis of domain architecture
We searched the Pfam domain annotation for these 359 coral transcription factors, and recorded all types of domains they contained. The results showed that the total of all identified TFs consisted of 60 types of protein family domains.
To compare across species, we applied an arcsine transformation to the above data. The arcsine transformation converts a binomial random variable into one that is nearly normal and whose variance depends very little on the probability parameter, by taking the arcsine of the square root of the abundance value for each functional family domain.
If X is a binomial random variable with parameters n and p, then $\arcsin\sqrt{X/n}$ is the arcsine transformation of X.
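A small sketch of this transformation is given below. It assumes the "abundance value" is the proportion X/n (a domain count divided by a total) and takes the arcsine of its square root, as the text describes; the function name and the example counts are hypothetical.

```python
import math


def arcsine_transform(count: int, total: int) -> float:
    """Arcsine square-root transform of the proportion count/total (radians)."""
    if total <= 0 or not 0 <= count <= total:
        raise ValueError("need total > 0 and 0 <= count <= total")
    return math.asin(math.sqrt(count / total))


# Made-up counts: a domain found in 5 of 10 proteins gives a proportion
# of 0.5, which maps to pi/4.
print(arcsine_transform(5, 10))
print(math.pi / 4)
```

The transformed values range from 0 (proportion 0) to pi/2 (proportion 1).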
We further applied a Fisher's statistical test to measure how significantly different the five organisms are, based on their protein family domain composition. The contingency table compiled data from the top ten Pfam domains. In addition, we conducted pairwise comparisons to cluster organisms based on their domain composition. We constructed a phylogenetic tree based on the top ten most abundant protein domain families (Figure 2).
Identification of orthologous TFs is an essential part of our approach. We therefore employed the Inparanoid algorithm, which was specifically designed for identifying true orthologs solely on the basis of protein sequence [29, 30]. Protein sequences of the analyzed species were downloaded from NCBI (SP, MM, and HS) and Ensembl (DM). Coral protein sequences were obtained as described above. The Inparanoid algorithm was applied to each species pair and orthologous proteins were retrieved.
Inference of interlogs
Based on the assumption that if two proteins interact in one organism, their orthologs in other organisms will also interact with each other, we inferred interacting orthologs across species, i.e. interlogs, using available TF interaction data. A total of 5,238 human and 1,145 mouse TF interactions were obtained from mammalian two-hybrid assays. An additional 45 fruit fly TF interactions were downloaded from DIP. For each known interacting TF pair (A-B), we sought all orthologs of A (Ai1, Ai2, ···, Aip, ···, AiP, where 1≤p≤P) and B (Bi1, Bi2, ···, Biq, ···, BiQ, where 1≤q≤Q) in organism i using the Inparanoid results. All possible pairs of Aip and Biq were then regarded as interlogs. If the proteins in an interaction did not include at least one DNA binding domain, the interaction was not considered. Finally, 3,985 interlogs were inferred using this approach, and the total numbers of interactions, including source and inferred interactions, were 5,509; 2,323; 524; 599 and 134 for human, mouse, sea urchin, fruit fly, and coral, respectively (see Additional files 7, 8, 9, 10, and 11).
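The pairing step described above can be illustrated with a short sketch that forms every combination of an ortholog of A with an ortholog of B in the target organism. The function name, the input layout (a dictionary from source protein to its target-organism orthologs), and the toy identifiers are assumptions made for illustration; the DNA binding domain filter is omitted here.

```python
from itertools import product


def infer_interlogs(interactions, orthologs):
    """Transfer interactions to a target organism through orthology.

    interactions: iterable of (A, B) protein pairs known to interact
                  in the source organism.
    orthologs:    dict mapping a source protein to a list of its
                  orthologs in the target organism (hypothetical format).
    Returns a set of unordered pairs, each stored as a frozenset.
    """
    interlogs = set()
    for a, b in interactions:
        # All possible pairs (A_ip, B_iq); empty if either has no ortholog.
        for a_ortholog, b_ortholog in product(orthologs.get(a, []),
                                              orthologs.get(b, [])):
            interlogs.add(frozenset((a_ortholog, b_ortholog)))
    return interlogs


# Toy data: A has two orthologs, B has one, C has none.
known = [("A", "B"), ("A", "C")]
ortho = {"A": ["a1", "a2"], "B": ["b1"]}
for pair in sorted(tuple(sorted(p)) for p in infer_interlogs(known, ortho)):
    print(pair)
```

Storing pairs as frozensets treats interactions as undirected, so a homodimer maps to a one-element set rather than being counted twice.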
Conserved TF interactions
Essential proteins and interactions are usually conserved across many species [27, 28]. We therefore aligned the protein networks of the five organisms analyzed here in order to find such elements, and sought interactions conserved across all five organisms (see Additional files 12, 13, 14, 15, and 16).
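One simple way to look for interactions shared by all organisms is sketched below: each protein is mapped to an ortholog group identifier, and the resulting group-level edge sets are intersected. This is an assumed simplification for illustration, not the network alignment procedure used in the study; the function name, data layout, and group labels are hypothetical.

```python
def conserved_interactions(networks, group_of):
    """Return group-level interactions present in every species.

    networks: dict species -> iterable of (protein, protein) interactions.
    group_of: dict species -> dict protein -> ortholog group id
              (hypothetical pre-computed orthology assignment).
    """
    per_species = []
    for species, edges in networks.items():
        mapping = group_of[species]
        # Keep only edges whose two proteins both belong to a group.
        mapped = {frozenset((mapping[p], mapping[q]))
                  for p, q in edges if p in mapping and q in mapping}
        per_species.append(mapped)
    return set.intersection(*per_species) if per_species else set()


# Toy example with two species and made-up identifiers.
nets = {"coral": [("c1", "c2")],
        "human": [("h1", "h2"), ("h1", "h3")]}
groups = {"coral": {"c1": "G1", "c2": "G2"},
          "human": {"h1": "G1", "h2": "G2", "h3": "G3"}}
print(conserved_interactions(nets, groups))
```

In the toy data only the G1-G2 edge appears in both species, so it is the single conserved interaction returned.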
Acknowledgements and Funding
The authors thank Dr. Christopher Gehring for critical reading of the manuscript. All authors are supported by the King Abdullah University of Science and Technology (KAUST).
- 1. Ravasi T, Suzuki H, Cannistraci CV, Katayama S, Bajic VB, Tan K, Akalin A, Schmeier S, Kanamori-Katayama M, Bertin N, Carninci P, Daub CO, Forrest AR, Gough J, Grimmond S, Han JH, Hashimoto T, Hide W, Hofmann O, Kamburov A, Kaur M, Kawaji H, Kubosaki A, Lassmann T, van Nimwegen E, MacPherson CR, Ogawa C, Radovanovic A, Schwartz A, Teasdale RD, et al.: An atlas of combinatorial transcriptional regulation in mouse and man. Cell. 2010, 140 (5): 744-752. 10.1016/j.cell.2010.01.044
- 5. Hoegh-Guldberg O, Mumby PJ, Hooten AJ, Steneck RS, Greenfield P, Gomez E, Harvell CD, Sale PF, Edwards AJ, Caldeira K, Knowlton N, Eakin CM, Iglesias-Prieto R, Muthiga N, Bradbury RH, Dubi A, Hatziolos ME: Coral reefs under rapid climate change and ocean acidification. Science. 2007, 318 (5857): 1737-1742. 10.1126/science.1152509
- 6. Hughes TP, Rodrigues MJ, Bellwood DR, Ceccarelli D, Hoegh-Guldberg O, McCook L, Moltschaniwskyj N, Pratchett MS, Steneck RS, Willis B: Phase shifts, herbivory, and the resilience of coral reefs to climate change. Curr Biol. 2007, 17 (4): 360-365. 10.1016/j.cub.2006.12.049
- 7. DeSalvo MK, Voolstra CR, Sunagawa S, Schwarz JA, Stillman JH, Coffroth MA, Szmant AM, Medina M: Differential gene expression during thermal stress and bleaching in the Caribbean coral Montastraea faveolata. Mol Ecol. 2008, 17 (17): 3952-3971. 10.1111/j.1365-294X.2008.03879.x
- 14. Schwarz JA, Brokstein PB, Voolstra C, Terry AY, Manohar CF, Miller DJ, Szmant AM, Coffroth MA, Medina M: Coral life history and symbiosis: functional genomic resources for two reef building Caribbean corals, Acropora palmata and Montastraea faveolata. BMC Genomics. 2008, 9: 97. 10.1186/1471-2164-9-97
- 16. Wilson D, Charoensawan V, Kummerfeld SK, Teichmann SA: DBD--taxonomically broad transcription factor predictions: new content and functionality. Nucleic Acids Res. 2008, 36 (Database): D88-92.
- 22. Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P, Ceric G, Forslund K, Holm L, Sonnhammer EL, Eddy SR, Bateman A: The Pfam protein families database. Nucleic Acids Res. 2010, 38 (Database): D211-222.
- 25. McDonald J: Handbook of Biological Statistics. 2009, Baltimore, Maryland: Sparky House Publishing, 2nd edition.
- 29. O'Brien KP, Remm M, Sonnhammer EL: Inparanoid: a comprehensive database of eukaryotic orthologs. Nucleic Acids Res. 2005, 33 (Database): D476-480.
- 31. Ostlund G, Schmitt T, Forslund K, Kostler T, Messina DN, Roopra S, Frings O, Sonnhammer EL: InParanoid 7: new algorithms and tools for eukaryotic orthology analysis. Nucleic Acids Res. 2010, 38 (Database): D196-203.
- 38. Putnam NH, Srivastava M, Hellsten U, Dirks B, Chapman J, Salamov A, Terry A, Shapiro H, Lindquist E, Kapitonov VV, Jurka J, Genikhovich G, Grigoriev IV, Lucas SM, Steele RE, Finnerty JR, Technau U, Martindale MQ, Rokhsar DS: Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. Science. 2007, 317 (5834): 86-94. 10.1126/science.1139158
- 39. Seo HC, Edvardsen RB, Maeland AD, Bjordal M, Jensen MF, Hansen A, Flaat M, Weissenbach J, Lehrach H, Wincker P, Reinhardt R, Chourrout D: Hox cluster disintegration with persistent anteroposterior order of expression in Oikopleura dioica. Nature. 2004, 431 (7004): 67-71. 10.1038/nature02709
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.