Introduction

Group A streptococci (Streptococcus pyogenes) are beta-hemolytic Gram-positive bacteria that are non-motile, facultatively anaerobic, and fermentative. They grow in short chains or pairs, and more than 200 serotypes of group A streptococci have been reported [1]. In 1933, Rebecca Lancefield introduced a method for their classification based on the specific group A carbohydrate of the cell wall [2]. S. pyogenes is a human pathogen that colonizes the skin or throat, resulting in diseases that vary in presentation and clinical severity. S. pyogenes infection is responsible for bacterial pharyngitis, cellulitis, rheumatic fever (RF), rheumatic heart disease (RHD), and post-streptococcal glomerulonephritis (PSGN). Furthermore, S. pyogenes is the causative agent of severe invasive diseases including toxic shock syndrome and necrotizing fasciitis [3]. Diseases associated with S. pyogenes are estimated to kill half a million people each year [4]. The annual global burden of group A streptococcal (GAS) disease is estimated to be 616 million cases of pharyngitis, 111 million cases of pyoderma, and at least 517,000 deaths due to severe invasive diseases and sequelae. These figures show the major impact of GAS on global morbidity and mortality [5]. The majority of these deaths occur in developing nations and are attributable to RHD, the autoimmune consequence of S. pyogenes infection in which immune cells and molecules directed against epitopes of bacterial proteins attack host proteins and tissues [6]. An effective option for treatment in the primary steps of RHD is antibiotic therapy. The ideal approach for prevention of RHD and other S. pyogenes infections would be a preventive vaccine [7].

It is now well established that, during the primary steps of infection, firm and specific adherence of the bacteria to host cells and tissues is of utmost importance [8]. Bacterial surface proteins are involved in the interaction with host cells. Hence, surface proteins are good potential targets for vaccines aimed at preventing bacterial diseases and infections [9]. Investigations have shown that surface proteins of many bacterial species have been successfully employed in immunotherapeutic approaches [10].

Development of vaccines by the traditional method, though successful in many cases, is a complex process that involves culturing living organisms, inactivating them and injecting them into subjects to check for immune responses, and isolating an antigen that is specific to the organism [11]. Hence, the process is costly and time consuming, has a very low success rate, and is difficult for organisms that cannot be cultured in vitro [12]. The epitope-driven vaccine is an attractive concept that is being successfully pursued by a large number of research groups, especially for the design of vaccines targeting conserved epitopes in rapidly mutating pathogens [13]. Studies show that using epitope prediction methods in combination with molecular docking improves prediction accuracy significantly. Molecular docking is applied to estimate the free-binding energy of selected epitopes. The free-binding energy depends on the conformations of the protein and the ligand before and after binding. The specific cellular immune response is based on the recognition by cytotoxic T lymphocytes of immunogenic peptides presented in the context of class I major histocompatibility complex (MHC) molecules. Because of the central role of the T lymphocyte in the immune response, the molecular basis of the interaction between the peptide-MHC (p-MHC) complex and the T cell receptor (TCR) is of general interest for medicine and immunology, as well as for understanding the factors that contribute to stability and specificity in the formation of protein–protein and protein–peptide complexes [14]. The selected epitopes in a vaccine should be conserved across different stages of the pathogen. Furthermore, the desired immune response [15] should be taken into consideration. It is almost impossible to identify and develop a universal vaccine candidate by conventional methods for variable organisms such as S. pyogenes, which has many strains.

In this study, we present a reverse vaccinology method for the prediction of epitopes. The epitope prediction strategy is based on the identification of 9-mer antigenic epitopes selected for their ability to stimulate both humoral and cell-mediated immunity. Molecular docking was used to model the interaction of the selected epitopes with MHC alleles and the TCR, which allows identification of peptides with a negative free-binding energy for peptide-MHC-TCR binding. Finally, the conservation of the peptides among different S. pyogenes types was checked.

Materials and Methods

Selection of Surface Protein Sequences

Sequences of eight different surface proteins of S. pyogenes with a high degree of conservation were selected from the NCBI protein database. The surface proteins and their accession numbers are shown in Table 1.

Table 1 Selected surface proteins of S. pyogenes with their accession numbers

Protein Location and Antigenicity Testing

To determine whether the proteins are antigenic, whole protein sequences were tested with VaxiJen (www.ddg-pharmfac.net/vaxijen), which classifies antigens solely on the basis of the physicochemical properties of proteins, without recourse to sequence alignment [16]. The TMHMM server (www.cbs.dtu.dk) was used for topology prediction to confirm surface exposure [17].

Prediction of B Cell Epitopes

T cell epitopes that lie within B cell epitopes can help evoke a stronger immune response. As the first step in epitope design, linear non-overlapping 20-mer B cell epitopes were predicted with ABCpred, which is freely available at www.imtech.res.in/raghava/abcpred [18]. Epitopes with a score > 0.8 were accepted and then checked for antigenicity and exo-membrane location using the VaxiJen and TMHMM servers.
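The score-based acceptance step can be illustrated with a minimal sketch. It assumes the predictor output has already been parsed into (start position, 20-mer sequence, score) records; the `Prediction` type, the function name, and the greedy overlap rule are hypothetical illustrations and are not part of ABCpred, which applies its own overlap filter.

```python
from typing import List, NamedTuple


class Prediction(NamedTuple):
    start: int      # 1-based start position in the protein (assumed convention)
    sequence: str   # predicted 20-mer
    score: float    # predictor score between 0 and 1


def select_b_cell_epitopes(preds: List[Prediction],
                           threshold: float = 0.8) -> List[Prediction]:
    """Keep predictions scoring above the threshold, then drop overlaps greedily.

    Higher-scoring windows are considered first; a window is discarded if it
    overlaps one that has already been kept (an illustrative rule, not ABCpred's).
    """
    passing = sorted((p for p in preds if p.score > threshold),
                     key=lambda p: p.score, reverse=True)
    chosen: List[Prediction] = []
    for p in passing:
        p_end = p.start + len(p.sequence) - 1
        overlaps = any(
            p.start <= c.start + len(c.sequence) - 1 and c.start <= p_end
            for c in chosen
        )
        if not overlaps:
            chosen.append(p)
    # Report the kept epitopes in sequence order.
    return sorted(chosen, key=lambda p: p.start)
```

The accepted windows would then be passed on to the antigenicity and topology checks described above.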

Prediction of T Cell Epitopes

ProPred-1, which covers 47 alleles, was used as the MHC class I epitope prediction server [19], and the ProPred server was used to predict MHC class II (51 alleles) binding sites [20]. These servers are useful tools for locating promiscuous binding regions that can bind to several HLA alleles. Promiscuous T cell epitopes are those that can bind to different MHC class I and II alleles. Because of the significant polymorphism in the peptide-binding groove of the MHC molecule, it is important to identify promiscuous epitopes [21]. In this study, epitopes that bind to at least seven MHC alleles, both class I and II, were selected. MHCPred, which uses a partial least squares-based approach to predict binding affinity to MHC molecules, was used as a cross-check to confirm that the epitopes selected with the two previous servers were also predicted as epitopes [22]. The VaxiJen server was used to analyze the selected epitopes for antigenicity.
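The promiscuity criterion can be sketched as a simple counting filter. The sketch assumes that the server output has been collected into two dictionaries mapping each peptide to the set of alleles it is predicted to bind, and it counts class I and class II alleles together; this combined count is one reading of the criterion, and a stricter reading would apply the cutoff to each class separately. The function name and the input layout are hypothetical.

```python
from typing import Dict, List, Set


def promiscuous_epitopes(class_i: Dict[str, Set[str]],
                         class_ii: Dict[str, Set[str]],
                         min_alleles: int = 7) -> List[str]:
    """Return peptides whose combined class I + class II binder count meets the cutoff.

    Assumption: alleles from both classes are pooled before counting.
    """
    selected = []
    for peptide in sorted(set(class_i) | set(class_ii)):
        alleles = class_i.get(peptide, set()) | class_ii.get(peptide, set())
        if len(alleles) >= min_alleles:
            selected.append(peptide)
    return selected
```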

Binding Affinity Analysis

HLA-A1, HLA-A2, DRB1, and DRB4 are among the most frequent MHC alleles in the human population [23]. MHCPred was used to predict the binding affinity between these alleles and the selected epitopes. The results are given as half-maximal inhibitory concentration (IC50) values [22]. Epitopes with an IC50 value below 500 nM for at least three alleles were selected.
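The affinity screen amounts to counting, for each peptide, the alleles with a predicted IC50 below the cutoff. A minimal sketch follows, assuming the predicted values have been tabulated as a nested dictionary (peptide, then allele, then IC50 in nM); the function name and data layout are hypothetical, and any values used with it here are illustrative rather than results from this study.

```python
from typing import Dict, List


def high_affinity_epitopes(ic50_nm: Dict[str, Dict[str, float]],
                           cutoff_nm: float = 500.0,
                           min_alleles: int = 3) -> List[str]:
    """Keep peptides with IC50 below `cutoff_nm` for at least `min_alleles` alleles."""
    selected = []
    for peptide, by_allele in ic50_nm.items():
        # Strictly below the cutoff, matching "less than 500 nM".
        n_strong = sum(1 for value in by_allele.values() if value < cutoff_nm)
        if n_strong >= min_alleles:
            selected.append(peptide)
    return selected
```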

Molecular Docking of the Promiscuous Epitopes Binding to MHC Class I and II and TCR

For molecular docking, the 3D structures of the selected peptides, alleles, and TCR were needed. The 3D structures of the alleles and TCR were retrieved from the Protein Data Bank (www.rcsb.org/pdb/home/home.do): HLA-A1 (PDB ID: 4NQV), HLA-A2 (PDB ID: 1HLA), DRB1 (PDB ID: 1DLH), DRB4 (PDB ID: 2SEB), and TCR (PDB ID: 5C07). Extra atoms and water molecules were removed from the PDB files. The tertiary structure of the selected peptides was predicted using the PEPstr server, which predicts the structure of a peptide from its input sequence [24]. The ClusPro server is a widely used tool for protein–protein docking [25]. First, we docked the peptides into the MHC molecules using ClusPro. Most of the central residues of the peptides were exposed in the MHC complexes and were recognized by TCRs [26]. Therefore, the MHC-peptide complex was subsequently docked onto the TCR molecule using ClusPro. For each docking step, we selected the conformation with the minimum free-binding energy score.
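Two of the bookkeeping steps in this procedure, cleaning the structure files and picking the lowest-energy conformation, can be sketched in a few lines. The sketch assumes that "extra atoms" means every HETATM record (waters, ligands, ions), which is an interpretation rather than something the text specifies, and that docking scores are available as a mapping from a model label to an energy score. Both function names are hypothetical, and no docking server is called.

```python
from typing import Dict, Iterable, List, Tuple


def strip_waters_and_hetero(pdb_lines: Iterable[str]) -> List[str]:
    """Keep polymer ATOM records plus TER/END markers from PDB-format lines.

    Dropping all HETATM records removes waters (HOH) and other hetero atoms.
    Note that modified residues stored as HETATM would also be dropped.
    """
    keep = ("ATOM  ", "TER", "END")
    return [line for line in pdb_lines if line.startswith(keep)]


def best_pose(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the (label, score) pair with the minimum energy score."""
    if not scores:
        raise ValueError("no docking scores supplied")
    return min(scores.items(), key=lambda item: item[1])
```

Applied once to the peptide-MHC docking output and once to the complex-TCR output, `best_pose` would mirror the choice of the minimum-energy conformation at each step.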

Selected T Cell Epitopes Conservation

In the last step, the sequences of the selected T cell epitopes were analyzed to determine whether they were conserved among different Streptococcus types. Conservation was checked with Protein BLAST [27].
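For short peptides, 100% conservation can be approximated by exact substring matching against each strain's protein sequence. The sketch below is a simplified stand-in for a BLAST search, not a reimplementation of it; the function name, the input dictionary (strain label to protein sequence), and any sequences used with it are hypothetical.

```python
from typing import Dict


def conservation_fraction(epitope: str, strain_sequences: Dict[str, str]) -> float:
    """Fraction of strains whose sequence contains the epitope exactly."""
    if not strain_sequences:
        raise ValueError("no strain sequences supplied")
    hits = sum(1 for seq in strain_sequences.values() if epitope in seq)
    return hits / len(strain_sequences)
```

A value of 1.0 would correspond to an epitope found unchanged in every strain examined.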

Results

Protein Location and Antigenicity Testing

Antigenicity analysis of the full-length proteins gave scores ranging from 0.47 for capsule synthesis protein to 0.7 for M protein. TMHMM indicated that all the proteins have an exo-membrane location; therefore, all the proteins were suitable for epitope prediction.

Prediction of B Cell Epitopes

B cell epitopes for each surface protein were identified using ABCpred. Twenty-mer non-overlapping B cell epitopes with score > 0.8 were selected. The selected epitopes were analyzed for antigenicity and exo-membrane topology using VaxiJen and TMHMM. The number of detected and accepted epitopes is shown in Table 2.

Table 2 B cell epitopes of surface proteins of S. pyogenes

Prediction of T Cell Epitopes

B cell epitopes of each surface protein were further analyzed to detect promiscuous T cell epitopes within them. ProPred and ProPred-1 were used to detect 9-mer promiscuous T cell epitopes that share their sequence with the selected B cell epitopes. All epitopes selected with these two servers were also predicted as epitopes by the MHCPred server, which confirmed the previous results. Subsequently, the promiscuous T cell epitopes were analyzed for antigenicity using VaxiJen. M protein had no promiscuous epitopes able to bind at least seven alleles. For the other seven proteins, there were 30 promiscuous epitopes that fulfilled all the mentioned criteria. The identified epitopes are listed in Table 3.
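Checking whether a 9-mer T cell epitope lies inside a 20-mer B cell epitope is a sliding-window comparison; a 20-mer contains 12 contiguous 9-mers. A minimal sketch, with a hypothetical function name and placeholder sequences supplied by the caller:

```python
from typing import Iterable, List


def nested_t_cell_epitopes(b_epitope: str,
                           t_candidates: Iterable[str],
                           k: int = 9) -> List[str]:
    """Return the k-mer candidates that occur as contiguous windows of b_epitope."""
    windows = {b_epitope[i:i + k] for i in range(len(b_epitope) - k + 1)}
    return [t for t in t_candidates if len(t) == k and t in windows]
```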

Table 3 Identification of promiscuous T cell epitopes

Binding Affinity Analysis

In this investigation, HLA-A1, HLA-A2, DRB1, and DRB4 were used as common MHC I and II alleles in the human population. The binding affinity of the peptides to these alleles was predicted by MHCPred. The next screening step was to search for peptides able to bind at least three alleles with an IC50 below 500 nM.

Capsule synthesis protein, laminin-binding surface protein LmB, lipoteichoic acid synthase LtaS type IIc, and trypsin-resistant surface T6 protein had no promiscuous T cell epitope binding to at least three of the mentioned alleles with an IC50 below 500 nM. Table 4 shows the epitopes that fulfill all the mentioned criteria.

Table 4 Selection of T cell epitopes through affinity for most frequent MHC alleles

Molecular Docking of the Promiscuous Epitopes Binding to MHC Class I and II and TCR

Molecular docking of the final peptides was performed with the candidate MHC I and II alleles. Subsequently, the peptide-MHC complex was docked onto the TCR molecule using ClusPro. All peptides had low free-binding energy in docking with the MHC I and II alleles and the TCR, which suggests that the selected epitopes are good vaccine candidates. The P3 epitope from C5a peptidase had the lowest free-binding energy for all four alleles. The free-binding energies of the interactions between the peptides and the MHC I and II alleles are listed in Table 5, and those of the interactions between the peptide-MHC complexes and the TCR are listed in Table 6.

Table 5 Free-binding energy of peptide-MHC I and II allele interaction
Table 6 Free-binding energy of peptide-MHC complex and TCR interaction

Selected T Cell Epitopes Conservation

The selected T cell epitopes were checked for conservation among different Streptococcus types and different S. pyogenes strains. Protein BLAST showed that P1, P2, and P3 were 100% conserved among four different S. pyogenes strains (GenBank: BAC64654.1, AAM78905.1, ESA59261.1, and ESA52643.1); P4 and P5 were 100% conserved among four different S. pyogenes strains (GenBank: AAT86292.1, ESA46406.1, ESA48928.1, and EQL82046.1) and in the fibronectin-binding protein of two Streptococcus dysgalactiae strains (GenBank: WP_053042319.1 and KKC18439.1); and P6 and P7 were 100% conserved among seven different S. pyogenes strains (GenBank: ESU86208.1, AAT86292.1, AAX71219.1, AIG49791.1, ESA52483.1, EPZ45792.1, and ESA46406.1), four different S. dysgalactiae strains (GenBank: BAM60328.1, CAE53872.1, WP_053042319.1, and KKC18439.1), and Streptococcus sp. “group G” (GenBank: AAB06623.1).

Discussion

Vaccination is the administration of antigenic compounds that elicit immune responses conferring a remarkable degree of protection against a disease [28]. Prediction of effective epitopes by computational approaches is key to developing a successful vaccine [29]. Computational epitope prediction can reduce experimental effort and does not require living cultures [30]. Fastidious bacteria can therefore be analyzed easily and at reduced cost [12]. In addition, molecular docking is a relatively new and efficient technique for predicting the potential ligand binding site on the whole protein target [31].

S. pyogenes is a serious cause of mortality worldwide. Vaccination against S. pyogenes infections and their immunological complications has been a goal of researchers for decades [32]. Surface proteins contribute significantly to the adhesion of S. pyogenes to host cells and tissues. Surface proteins are responsible for pathogenicity and are mostly antigenic. Hence, surface proteins are the favored targets of vaccine design [33]. Yet, there are no practical human peptide-based vaccines on the market. This stems primarily from the difficulties associated with peptide stability and delivery, and from the challenge posed by the diversity of human immunogenetics. Nevertheless, peptide-based vaccines offer a means for safe immune intervention; more than 30 peptide-based vaccines are currently under development, including several in phase III clinical trials [34]. Attempts to immunize humans against GAS started in the 1930s. For many years, the development of vaccines based on the M protein was hindered by presumed “molecular mimicry”: antibodies raised against purified M protein cross-reacted with human tissues, including the heart, skeletal muscle, brain, and glomerular basement membrane, which raised concern about the potential for induction of autoimmunity [35]. So far, efforts to develop a practical vaccine against S. pyogenes have not been completely successful. Development of a vaccine that confers protection from infection by multiple S. pyogenes types is a major challenge [7]. In vaccine design, selected epitopes should ideally be conserved across pathogen variants and across the different stages of the pathogen [36]. Although research directed toward the development of effective vaccines started many years ago, a commercial vaccine is still not available.

In 2000, reverse vaccinology was applied to identify a potential vaccine candidate for serogroup B meningococcus [37]. In 2006 and 2011, a universal vaccine with promising results for serogroup B meningococcus was identified [33, 34]. Besides bacteria, this approach has been tested against a variety of pathogenic organisms such as the protozoan Leishmania [35, 36], HIV [38], influenza virus, Iranian HPV [39], and SARS-CoV [40]. In another study, B and T cell epitopes from the HER2 extracellular domain (HER2 ECD) were predicted for breast cancer [41]. A reverse vaccinology approach was used to predict potential candidates in ten surface proteins of Staphylococcus aureus causing endocarditis. The selected epitope stimulated both B cell- and T cell-mediated immunity and was conserved across the various bacterial strains [23]. In a 2007 study, two vaccine candidates for group A streptococcus were compared using bioinformatics approaches; a combination of epitope prediction, bioinformatics, and immunoinformatics was applied. However, research on bioinformatics-designed peptide vaccines is still at an early stage. In our previous study, we focused on finding conserved epitopes among different M proteins of streptococcal bacteria. That work considered only the HLA-A1 allele, and we found highly conserved epitopes among streptococcal bacteria. For example, one epitope (ALEEANSKL) was conserved among six different types of M proteins of streptococcal bacteria. These studies show that it is possible to find practical conserved peptide-based vaccine candidates among streptococcal bacteria [42].

In this study, we focused on the detection of antigenic epitopes in eight surface proteins that are conserved and representative among S. pyogenes strains. To maximize immune stimulation, we predicted epitopes able to stimulate both B cell and T cell immunity. We preferred promiscuous epitopes, which have binding affinity to at least seven major histocompatibility complex (MHC) alleles and thus cover a large part of the population. Among the different MHC I and II alleles, we focused on HLA-A1, HLA-A2, DRB1, and DRB4 in order to detect T cell epitopes matching a large HLA population. Using epitope prediction methods in combination with molecular docking greatly improves prediction accuracy. Protection against infection by multiple S. pyogenes types is one of the most important features of effective vaccine development. Hence, the conservation of the selected epitopes across different strains and variants of Streptococcus types was checked.

In this study, different filters were applied to ensure that the final selected T cell epitopes are appropriate candidates for vaccine design. Experimental validation is required before these epitopes can be used in a peptide vaccine against S. pyogenes infection.