Phylogenetic prediction of cis-acting elements: a cre-like sequence in Norovirus genome?
Discrete RNA structures such as cis-acting replication elements (cre) in the coding region of RNA virus genomes create characteristic suppression of synonymous site variability (SSSV). Different phylogenetic methods have been developed to predict secondary structures in RNA viruses, for high-resolution thermodynamic scanning and for detecting SSSV. These approaches have been successful in predicting cis-acting signals in different members of the families Picornaviridae and Caliciviridae. In order to gain insight into the identification of cis-acting signals in viruses whose mechanisms of replication are currently unknown, we performed a phylogenetic analysis of complete genome sequences from 49 Human Norovirus (NoV) strains.
The complete coding sequences of NoV ORF1 were obtained from the DDBJ database and aligned. Shannon entropy calculations and RNAalifold consensus RNA structure prediction identified a discrete, conserved, invariant sequence region with a characteristic AAACG cre motif at positions 240 through 291 of the RNA-dependent RNA polymerase (RdRp) sequence (relative to strain [EMBL:EU794713]). This sequence region has a high probability of forming a stem-loop.
A new predicted stem-loop has been identified near the 5' end of the RdRp coding region of the Human NoV genome. This is the same location recently reported for the Hepatovirus cre stem-loop.
Keywords: Shannon Entropy, Sapovirus, NS5B Code Region, Internal Base Pairing, RdRp Code Region
Internal base pairing that creates stem-loops and other RNA structures places constraints on sequence variability at the bases required for structure formation in the genomes of RNA viruses. For instance, the Hepatitis C virus (HCV) genome shows a marked suppression of synonymous codon variability within several evolutionarily conserved stem-loops in the core and NS5B coding regions, which points to their role in virus replication [1, 2, 3]. Discrete RNA structures such as cis-acting replication elements (cre) in the coding region of human enteroviruses (HEVs) and other viruses also create characteristic suppression of synonymous site variability (SSSV), similar to that observed in HCV [5, 6]. Different phylogenetic methods have been developed to predict secondary structures in RNA viruses, like PFOLD [2, 7] or Alifold, for high-resolution thermodynamic scanning, and like UNAFold for detecting SSSV. These methods have made it possible to identify genome regions suitable for in-depth experimental analysis, which in turn can establish the role of the identified secondary RNA structures in translation or replication. This approach has led to the hypothesis that when SSSV (i.e. highly conserved synonymous sites in an RNA virus genome sequence alignment) occurs in a sequence region with a high probability of forming a secondary structure (i.e. high probability of base pairing to generate a stable stem-loop), a cis-acting signal can be identified. This hypothesis has been successfully tested in different members of the family Picornaviridae, like Hepatitis A virus (HAV), Avian Encephalitis virus (AEV) and Rhinovirus [10, 11], and in members of the family Caliciviridae, like Norovirus, Sapovirus, Vesivirus and Lagovirus.
In order to gain insight into the identification of cis-acting signals in viruses whose mechanisms of replication are currently unknown, we tested the above hypothesis for a group of 49 Human Noroviruses (NoV) for which complete genome sequences have recently been obtained.
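To make the idea of SSSV concrete, the following is a minimal Python sketch of a simplified proxy for synonymous site variability. It is not the scanning method used in the cited studies: the scoring rule (number of distinct codons minus one at codon columns where every sequence encodes the same amino acid), the function names and the toy sequences are all assumptions introduced here for illustration. Low windowed scores would mark candidate regions of suppressed synonymous variability.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_variability(alignment):
    """Score each codon column of an in-frame, gap-free alignment.

    Returns, per codon column, the number of distinct codons minus one when
    all sequences encode the same amino acid, and None otherwise
    (hypothetical simplified score, not a published SSSV statistic).
    """
    n_codons = len(alignment[0]) // 3
    scores = []
    for k in range(n_codons):
        codons = {seq[3 * k:3 * k + 3].upper().replace("U", "T")
                  for seq in alignment}
        amino_acids = {CODON_TABLE.get(c) for c in codons}
        if len(amino_acids) == 1 and None not in amino_acids:
            scores.append(len(codons) - 1)
        else:
            scores.append(None)
    return scores


def windowed_mean(scores, window):
    """Mean score over each sliding window, ignoring non-synonymous columns."""
    means = []
    for i in range(len(scores) - window + 1):
        values = [s for s in scores[i:i + window] if s is not None]
        means.append(sum(values) / len(values) if values else None)
    return means


# Toy in-frame alignment (hypothetical sequences, four codons each).
toy = ["GCUAAACGUGGA", "GCCAAACGUGGG", "GCAAAACGUGAA"]
scores = synonymous_variability(toy)
print(scores)
print(windowed_mean(scores, 2))
```

In this toy case the first codon column is synonymous but variable, the second and third are invariant, and the last one is excluded because it carries an amino acid change.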
Only a few discrete genome regions in the ORF1 of Human NoV have a Shannon entropy of zero, indicating that they are invariant among all NoV sequences included in this analysis (Fig. 1). Interestingly, one of these discrete regions has an AAACG cre sequence motif at positions 3948 to 3952 of the alignment. These positions correspond to positions 240 through 291 of the RNA-dependent RNA polymerase (RdRp) sequence (relative to strain [EU794713]) (Fig. 1).
Using the RNAalifold program, we have identified the presence of a unique and conserved stem-loop near the 5' end of the RdRp coding region for the 49 Human NoV genomes analyzed (Figs. 1 and 3). This predicted structure contains a cre sequence motif (Fig. 3). Interestingly, the stem-loop predicted for Human NoV is situated near the 5' end of the RdRp (Figs. 2 and 3). This is the same location recently reported for the Hepatovirus cre stem-loop (Fig. 1).
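As an illustration of the entropy calculation, the Python sketch below computes the Shannon entropy of each alignment column, H = -sum(p_i * log2(p_i)) over the residues observed in that column, and reports runs of consecutive zero-entropy (invariant) columns. The function names, the minimum run length and the toy alignment are assumptions made for this example; the software actually used for the analysis is not reproduced here.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(alignment):
    """Entropy of every column of an alignment of equal-length sequences."""
    return [column_entropy(col) for col in zip(*alignment)]


def invariant_runs(profile, min_len=5):
    """Return 0-based half-open (start, end) runs of zero-entropy columns."""
    runs, start = [], None
    # A non-zero sentinel at the end closes a run that reaches the last column.
    for i, h in enumerate(list(profile) + [1.0]):
        if h == 0 and start is None:
            start = i
        elif h != 0 and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


# Toy alignment (hypothetical sequences, not NoV data).
toy = ["GCAAACGUA", "GUAAACGUC", "GAAAACGUG"]
profile = entropy_profile(toy)
for start, end in invariant_runs(profile, min_len=5):
    print(start, end, toy[0][start:end])
```

On the toy alignment the only invariant run of at least five columns spans the AAACG motif, which mirrors the kind of region described above.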
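A consensus structure is only meaningful if the individual sequences can actually form the proposed base pairs. The sketch below is a simple consistency check, not RNAalifold itself: given a dot-bracket structure and an ungapped RNA alignment, it reports for each base pair the fraction of sequences that can form a Watson-Crick or G-U pair and the number of distinct pair types observed (more than one type suggests covariation). The structure, sequences and function names are hypothetical.

```python
# Watson-Crick pairs plus G-U wobble pairs.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}


def parse_pairs(structure):
    """Convert a dot-bracket string into a sorted list of (i, j) index pairs."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.append((stack.pop(), i))
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return sorted(pairs)


def pair_support(alignment, structure):
    """Map each base pair to (fraction of compatible sequences, pair types)."""
    support = {}
    for i, j in parse_pairs(structure):
        observed = [(seq[i], seq[j]) for seq in alignment]
        compatible = [p for p in observed if p in CANONICAL]
        support[(i, j)] = (len(compatible) / len(alignment),
                           len(set(compatible)))
    return support


# Hypothetical stem-loop with an AAACG motif in the loop.
structure = "(((.......)))"
toy = ["GCACAAACGUUGC", "GUACAAACGUUAC", "GCGCAAACGUUGC"]
for pair, (fraction, n_types) in pair_support(toy, structure).items():
    print(pair, fraction, n_types)
```

Here every sequence supports all three pairs, and two of the pairs are maintained by different nucleotide combinations across sequences.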
RNA structure predictions are consistent with previous analyses based on the thermodynamic folding of individual sequences [12, 17, 18]. Although RNA structure is clearly not the only cause of SSSV, which also occurs, for example, in overlapping gene sequences, there is an impressive co-localization of the major sites of SSSV and thermodynamically predicted secondary structures [4, 11, 12].
NoV belong to the family Caliciviridae and are non-enveloped viruses with positive, single-stranded RNA genomes. They also share other important features with picornaviruses, such as a VPg protein covalently linked to the 5' end of the genomic RNA. Nevertheless, in contrast with picornaviruses, NoV express a downstream sub-genomic (sg) transcript encoding structural genes. NoV are the leading cause of outbreaks of acute gastroenteritis in humans worldwide.
Despite the importance of these outbreaks, our understanding of the RNA structures or sequences required for NoV replication has been limited. Previous reports have shown that the poly-pyrimidine tract-binding protein (PTB), poly-A binding protein (PAB) and La autoantigen interact with the 3' untranslated region of the Norwalk virus genome. Very recent studies have identified cis-acting signals in the 5' and 3' regions as well as at the start of the sg RNA transcript of NoV.
As members of the family Caliciviridae, NoV are thought to replicate in a manner typical of positive-stranded RNA viruses, through the synthesis of a full-length anti-genomic strand (reverse complement copy) using the viral RdRp translated initially from the RNA genome entering the cell. The minus strand then acts as a template for the synthesis of full-length genomic RNA from which non-structural proteins are translated, including the RdRp. Features of the RdRp common to all positive-sense RNA viruses support this idea.
Although the putative cis-acting signal predicted in this study has not yet been investigated in vitro, owing to the lack of a standard cell culture system in which to grow these viruses, the probability that this predicted structure acts as a functional element may open new avenues to our understanding of the molecular mechanisms of NoV replication.
Extensive mutagenesis studies performed in members of the family Picornaviridae, like Poliovirus (PV) and Human Rhinovirus 14 (HRV-14), revealed a critical conserved AAACA/G cre sequence motif in the 5' half of the loop sequence that is essential for its function. Similar conserved motifs are present within the loops of the cre elements of other picornaviruses and are important for RNA replication [10, 24, 25].
Paul and colleagues (2003) have shown that the PV cre acts as the template for VPg uridylylation through a "slide-back" mechanism catalyzed by the 3Dpol (RdRp) [24, 26, 27]. The uridylylation of VPg leads to the production of VPg-pUpU, which serves as the protein primer for new RNA synthesis.
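Locating candidate AAACA/G motifs in a sequence is a simple pattern search. The following Python sketch is one way to do it; the function name and example sequences are hypothetical, and a match on its own says nothing about whether the motif sits in the loop of a functional stem-loop. A lookahead is used because two motif occurrences can overlap by one base (for example in AAACAAACG).

```python
import re

# AAACA or AAACG, wrapped in a lookahead so overlapping matches are reported.
CRE_MOTIF = re.compile(r"(?=(AAAC[AG]))")


def find_cre_motifs(sequence):
    """Return (0-based start, matched motif) for each AAACA/G occurrence.

    DNA-style input is accepted: T is converted to U before searching.
    """
    rna = sequence.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in CRE_MOTIF.finditer(rna)]


# Hypothetical example sequences.
print(find_cre_motifs("GCACAAACGUUGC"))
print(find_cre_motifs("AAACAAACG"))
```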
Interestingly, recent studies have shown that incubation of VPg with NoV 3Dpol (RdRp) generates VPg-poly(U) and that this uridylylated VPg can prime the replication of polyadenylated RNA. In contrast, replication of antigenomic RNA was not primer dependent. Moreover, on nonpolyadenylated RNA, NoV RdRp initiated RNA synthesis de novo. These findings clearly show that initiation of replication of the NoV genome by the RdRp requires VPg-protein-primed initiation on polyadenylated genomic RNA and de novo initiation on antigenomic RNA. In addition, very recent studies revealed that the NoV RdRp is a typical template-dependent RNA polymerase.
It is possible that the predicted stem-loop identified near the 5' end of the NoV RdRp coding region, which shares a cre-like sequence motif with members of the family Picornaviridae, is capable of supporting the uridylylation of VPg. If that is the case, it would permit VPg to act as a primer for the synthesis of the minus-strand RNA, in agreement with the results outlined above.
We acknowledge support by CAPES (Brazil) and Universidad de la República (Uruguay) through project No. 018/08.
- 12. Simmonds P, Karakasiliotis I, Bailey D, Chaudhry Y, Evans DJ, Goodfellow IG: Bioinformatic and functional analysis of RNA secondary structure elements among different genera of human and animal caliciviruses. Nucleic Acids Res. 2008, 36: 2530-2546. 10.1093/nar/gkn096.
- 14. Korber BT, Kuntsman K, Patterson B, Furtado M, McEvilly M, Levy R, Wolinsky S: Genetic differences between blood- and brain-derived viral sequences from human immunodeficiency virus type 1-infected patients: evidence of conserved elements in the V3 region of the envelope protein of brain derived sequences. J Virol. 1994, 68: 7467-7481.
- 16. Gruber AR, Lorenz R, Bernhart SH, Neubock R, Hofacker IL: The Vienna RNA websuite. Nucleic Acids Res. 2008, 36 (Web Server issue): W70-74. 10.1093/nar/gkn188.
- 25. Yin J, Paul AV, Wimmer E, Rieder E: Functional dissection of a poliovirus cis-acting replication element [PV-cre(2C)]: analysis of single and dual-cre viral genomes and proteins that bind specifically to PV-cre RNA. J Virol. 2003, 77: 5152-5166. 10.1128/JVI.77.9.5152-5166.2003.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.