Enteropathogenic porcine coronaviruses affect pig herds around the world, leading to significant financial losses. Porcine epidemic diarrhea (PED) is a highly contagious viral disease of pigs, caused by an RNA virus belonging to the family Coronaviridae. PED is characterized by debilitating diarrhea, dehydration, and high mortality. The disease affects pigs of all age groups, but newborn piglets (up to two weeks old) are the most susceptible, with mortality ranging from 50 to 100% [1, 2].

PED is common in the United States, Canada, China, Korea, Japan, Thailand and Vietnam, as well as European countries, with the exception of Ireland, Denmark and Sweden [3,4,5,6,7]. PED was first introduced into large pig farms in the Russian Federation in 2006. At present, the prevalence of PED is increasing in regions of Russia with a high concentration of pigs. However, limited information is available about the genetic characteristics of the porcine epidemic diarrhea virus (PEDV) strains currently circulating in Russia. All PEDV isolates form one serotype but have different degrees of virulence in the field [8].

The spike (S) protein of PEDV is the viral protein that is subject to the greatest immunological pressure and shows the greatest variability. Deletions (S-INDELs) or small insertions have been observed in the S gene nucleotide sequences of many PEDV isolates [9]. The PEDV strains that are currently circulating in the European Union are similar to the American S-INDEL strains [10,11,12]. The phylogenetic classification of PEDV strains is based on the analysis of complete genome sequences obtained worldwide [13] or of individual genes such as S, M, N, or ORF3 [9, 11, 14].

In this study, we analyzed the genome sequence of a recently sequenced PEDV isolate, PEDV/Belgorod/dom/2008 (GenBank accession number MF577027) [15].

Pathological samples from the intestine and stomach were taken from one-month-old sick piglets from the Belgorod region of Russia in 2008 [15]. Total RNA was extracted from a 10% organ suspension using TRIzol Reagent (ThermoFisher Scientific) according to the manufacturer’s instructions. Next-generation sequencing was performed on an Illumina MiSeq instrument with a MiSeq reagent kit v3 in 2 × 300-bp paired-end (PE) mode (Illumina, San Diego, CA, USA) [15]. The PEDV/Belgorod/dom/2008 isolate was subsequently isolated from small-intestine tissue in Vero cell culture.

Prediction of homologous recombination events was carried out using RDP4 (Recombination Detection Program) and SIMPLOT [16, 17]. Pairwise identity analysis was performed using SDT v1.2 software [18] with 18 whole-genome PEDV sequences, three transmissible gastroenteritis virus (TGEV) sequences, and one swine enteric coronavirus sequence from the GenBank database. Multiple alignment was performed using MUSCLE software [19]. Phylogenetic trees were constructed based on PEDV M and S gene sequences using the maximum-likelihood method in MEGA 6.0 [20]. Bootstrap values were estimated from 1000 replicates.
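To illustrate what a pairwise identity analysis measures, the following minimal sketch computes percent identity between sequences that have already been aligned, counting only columns in which neither sequence has a gap. It is not the SDT implementation; the function names and the toy sequences are hypothetical and are not real PEDV or TGEV data.

```python
from itertools import combinations

GAP = "-"


def pairwise_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == GAP or b == GAP:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """Percent identity for every unordered pair of aligned sequences."""
    return {
        (x, y): pairwise_identity(aligned[x], aligned[y])
        for x, y in combinations(aligned, 2)
    }


# Toy aligned fragments (hypothetical, for illustration only)
toy = {
    "isolate_A": "ATGAAGTCTT",
    "isolate_B": "ATGAAGTCTA",
    "isolate_C": "ATG--GTCAA",
}

matrix = identity_matrix(toy)
for (x, y), value in matrix.items():
    print(f"{x} vs {y}: {value:.1f}%")
```

The same function can be applied to aligned amino acid sequences, since it only compares characters column by column.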

The complete coding sequence of PEDV/Belgorod/dom/2008 is 28,315 nucleotides (nt) in length (GenBank accession number MF577027) [15]. Two putative recombination breakpoints were detected in the genome of PEDV/Belgorod/dom/2008, at nt 20476 in ORF1b and nt 24403 in the S gene (Fig. 1). PEDV strain LZC (EF185992) and PEDV strain SLO/JH-11/2015 (KU297956) were identified as the major and minor parental viruses, respectively. The recombination event was identified by six detection methods (RDP, MaxChi, Chimaera, Geneconv, Bootscan, SiScan) with high confidence (average p-value, 2.77 × 10⁻²³).
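As a rough worked calculation, the size of the region delimited by the two predicted breakpoints can be related to the total sequence length. The sketch below assumes that the breakpoint positions and the 28,315-nt length refer to the same coordinate system, which the text does not state explicitly; the variable and function names are illustrative.

```python
# Values reported in the text
GENOME_LENGTH = 28_315      # nt, complete coding sequence
BREAKPOINT_START = 20_476   # nt, predicted breakpoint in ORF1b
BREAKPOINT_END = 24_403     # nt, predicted breakpoint in the S gene


def region_span(start: int, end: int) -> int:
    """Distance in nucleotides between two breakpoint positions."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start


span = region_span(BREAKPOINT_START, BREAKPOINT_END)
fraction = 100.0 * span / GENOME_LENGTH
print(f"{span} nt between breakpoints ({fraction:.1f}% of the sequence)")
```

Under that assumption, the breakpoints delimit a region of roughly 3.9 kb, or about 14% of the sequence.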

Fig. 1. Recombination breakpoints in the genome of the PEDV/Belgorod/dom/2008 isolate predicted by RDP4. The potential parental strains and the recombinant isolate are shown in teal (major), purple (minor) and yellow (recombinant). Arrows indicate recombination breakpoints. UTR, untranslated region; ORF, open reading frame; S, spike; E, envelope; M, membrane; N, nucleocapsid.

A similarity plot showed high overall sequence similarity between the PEDV/Belgorod/dom/2008 strain and the parental PEDV strains, but with a marked drop in the nucleotide sequence similarity in the S gene region (Fig. 1).
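The idea behind a similarity plot can be sketched as a sliding-window identity profile along a pairwise alignment: a recombinant region appears as a local drop in identity to one parental sequence. This is a simplified illustration using raw identity, not the SIMPLOT implementation; the window and step sizes, the function name, and the toy sequences are assumptions made for the example.

```python
def window_similarity(query: str, reference: str, window: int = 200, step: int = 20):
    """Return (window_midpoint, percent_identity) pairs along an alignment."""
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    points = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        # Compare only columns without a gap in either sequence
        pairs = [(a, b) for a, b in zip(q, r) if a != "-" and b != "-"]
        if not pairs:
            continue  # window consists entirely of gapped columns
        identity = 100.0 * sum(a == b for a, b in pairs) / len(pairs)
        points.append((start + window // 2, identity))
    return points


# Toy example (hypothetical sequences): a divergent block in the middle
reference = "ACGT" * 10
query = reference[:16] + "TTTTTTTT" + reference[24:]

profile = window_similarity(query, reference, window=8, step=8)
for midpoint, identity in profile:
    print(f"position {midpoint}: {identity:.0f}%")
```

In the toy profile, identity stays at 100% except in the window covering the divergent block, which mirrors the kind of local drop described for the S gene region.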

Phylogenetic analysis of the complete genomes showed that PEDV/Belgorod/dom/2008 is only distantly related to known PEDV strains. The isolate does not belong to any of the groups formed by the American or Chinese strains and instead forms a separate cluster together with the recombinant strain SeCoV-ITA09, isolated in Italy (Fig. 2).

Fig. 2. Phylogenetic tree of the PEDV/Belgorod/dom/2008 isolate (highlighted in black) and other PEDV, TGEV and SeCoV strains of different geographical origin, based on an alignment of predicted amino acid sequences derived from complete genome sequences. The isolation year of LZC is unknown but should be before 2006 according to the GenBank submission date.

Since only M gene sequences are available in the GenBank database for the Russian PEDV isolates, we rebuilt the phylogenetic tree using M gene sequences to refine the analysis. Based on the phylogenetic analysis of the M gene, the PEDV/Belgorod/dom/2008 isolate belongs to the same clade as other virulent Russian PEDV strains, indicating a high degree of sequence homogeneity in the M gene (Fig. 3a). Interestingly, PEDV/Belgorod/dom/2008 carries a significant number of nucleotide substitutions relative to the PEDV isolate Belgorod/05/07 (EU179730), which was isolated earlier from the same region.

Fig. 3. Maximum-likelihood phylogenies of the PEDV isolates and closely related coronaviruses (TGEV and swine enteric coronavirus strain) based on predicted amino acid sequences. Phylogenetic trees based on the M gene (a) and the S gene (b) are presented. Bootstrap values of 60 or greater are shown at the nodes. The trees show robust incongruence for the PEDV/Belgorod/dom/2008 topology between the M and S genes. PEDV/Belgorod/dom/2008 is indicated by a black circle.

The S gene phylogeny of PEDV and related coronaviruses demonstrates that the PEDV/Belgorod/dom/2008 isolate is genetically distinct and does not belong to any group (Fig. 3b). This robust incongruence between the M- and S-gene-based trees may be explained by a recombination event within the genome of the PEDV/Belgorod/dom/2008 isolate. Such variability can lead to dramatic changes in viral virulence, pathogenicity, and antigenicity.

Pairwise identity analysis based on the spike amino acid sequences revealed that PEDV/Belgorod/dom/2008 is intermediate between PEDV and TGEV and is only distantly related to other PEDV strains (Fig. 4).

Fig. 4. Genome-wide pairwise identity matrix of PEDV/Belgorod/dom/2008 and representative spike amino acid sequences of PEDV and TGEV. The PEDV/Belgorod/dom/2008 isolate is indicated by a black circle.

PEDV/Belgorod/dom/2008 has a unique spike protein sequence with low similarity to those of other PEDV isolates. Changes in the S glycoprotein gene are of particular interest because this protein is important for tissue tropism and virulence [5]. A preliminary animal study with PEDV/Belgorod/dom/2008 demonstrated that this recombinant is highly virulent in unvaccinated suckling piglets [21, 22].

Recombination events can occur, and are sometimes observed, when pigs have been vaccinated or infected with a mixture of TGEV and PEDV. Such recombination events can potentially result in a loss of vaccine efficacy.

Boniotti et al. reported a virus, SeCoV-ITA09, possessing a TGEV genome sequence in which the S protein sequence was identical to that of a PEDV isolate [23]. This chimeric virus was probably generated by recombination between TGEV and PEDV. Similar chimeric viruses have been found by other research groups in Germany [24] and Eastern Europe [25].

The genome sequence of one PEDV isolate from China (CH/HNQX-3/14) shows that this strain arose through naturally occurring recombination of the attenuated strains CV777 and DR13 with the circulating field strain CH/ZMDZY/11. The recombination events occurred in the regions encoding the S, ORF3, and N proteins and in the replicase ORF1a region [26].

The results of phylogenetic and recombination analysis revealed a discrepancy between the S gene sequence of PEDV/Belgorod/dom/2008 and the sequences of other isolates available in the GenBank database. Our results indicate that PEDV/Belgorod/dom/2008 is a new recombinant strain. Interestingly, PEDV/Belgorod/dom/2008 and SeCoV-ITA09 (a recombinant strain from Italy) form a unique phylogenetic group.

Pairwise identity analysis demonstrated that the deduced amino acid sequence of the S protein of PEDV/Belgorod/dom/2008 is 60% identical to those of other PEDV strains and 50% identical to those of TGEV strains. These data suggest that PEDV/Belgorod/dom/2008 occupies an intermediate position between TGEV and PEDV.

The identification of recombinant regions in PEDV/Belgorod/dom/2008 can be useful for further analysis of evolutionary variability and epidemiology, and for the development of new gene-based diagnostic assays for porcine epidemic diarrhea virus.