Constraint-based analysis of metabolic capacity of Salmonella typhimurium during host-pathogen interaction
- 13k Downloads
Infections with Salmonella cause significant morbidity and mortality worldwide. Replication of Salmonella typhimurium inside its host cell is a model system for studying the pathogenesis of intracellular bacterial infections. Genome-scale modeling of bacterial metabolic networks provides a powerful tool to identify and analyze pathways required for successful intracellular replication during host-pathogen interaction.
We have developed and validated a genome-scale metabolic network of Salmonella typhimurium LT2 (iRR1083). This model accounts for 1,083 genes that encode proteins catalyzing 1,087 unique metabolic and transport reactions in the bacterium. We employed flux balance analysis and in silico gene essentiality analysis to investigate growth under a wide range of conditions that mimic in vitro and host cell environments. Gene expression profiling of S. typhimurium isolated from macrophage cell lines was used to constrain the model to predict metabolic pathways that are likely to be operational during infection.
Our analysis suggests that there is a robust minimal set of metabolic pathways that is required for successful replication of Salmonella inside the host cell. This model also serves as platform for the integration of high-throughput data. Its computational power allows identification of networked metabolic pathways and generation of hypotheses about metabolism during infection, which might be used for the rational design of novel antibiotics or vaccine strains.
KeywordsMetabolic Network Flux Balance Analysis Metabolic Reconstruction Glucose Minimal Medium Flux Variability Analysis
Salmonellae are Gram-negative bacterial pathogens with a wide host range. The ability of Salmonella enterica to cause disease is closely correlated with its ability to survive inside its host cells, which include epithelial cells and macrophages . Infection of mice with Salmonella enterica serovar Typhimurium (S. typhimurium) serves as model for the systemic disease typhoid fever. Salmonella strains that are deficient for intra-macrophage survival cannot elicit disease in the mouse model. Many genomic and proteomic datasets exist for Salmonella as a pathogen [2, 3, 4, 5, 6]. Most experiments have focused on identifying 'virulence' factors that might drive the disease process, such as surface structures or secreted products. Differential expression of metabolic gene-products has largely been left unexplored in terms of their role in pathogenesis. However, it has now gained more attention that the metabolism of intracellular bacterial pathogens such as Salmonella may hold important clues to understanding their intracellular life-style as well as to understanding host defense mechanisms [7, 8]. Genome-scale metabolic in silico models that can integrate and exploit genomic and proteomic data may provide a first step towards a quantitative analysis of the metabolism of intracellular bacteria.
Constraint-based reconstruction and analysis (COBRA) of cellular metabolism has been employed successfully to develop organism-specific genome-scale in silico models across all the three major domains of the tree of life [9, 10, 11, 12]. The COBRA approach seeks to differentiate between those network states that are achievable by the organism from those that are not, based on a detailed reconstruction of metabolism and incorporation of physiological parameters that are consistent with known experimental information . The solutions this analysis provides are based on the successive imposition of governing physicochemical constraints on genome-scale reconstructions.
COBRA integrates and represents our current knowledge of network components and interactions and has been applied to several prokaryotic and eukaryotic organisms, including Escherichia coli , Helicobacter pylori , Haemophilus Influenzae , Saccharomyces cerevisiae  and Geobacter sulfurreducens . By systematizing and providing context for transcriptomic, proteomic, and metabolomic data, these models allow for the imposition of constraints, which together define the possible phenotypic behavior allowed by these in silico organisms. The use of constrained based modeling has significant potential to identify new antibacterial targets . Modeling allows simulation of single gene knockouts and, more importantly, the effect of combinatorial deletions, which might allow the rational development of combination drug therapy. The use of such a modeling approach has been demonstrated for H. influenza  and for Mycobacterium tuberculosis . As the number of pathogens with such models grows [19, 20, 21, 22, 23, 24, 8], one might expect that more basic principles of bacterial metabolism needed for pathogenesis can be derived.
Here we describe the reconstruction of a detailed and validated genome-scale metabolic network for S. typhimurium, that reconciles established genomic and high-throughput phenotyping data and that is augmented by experimental data to facilitate iterative model building efforts. Furthermore, we demonstrate how the model can be used to reconcile existing virulence data and to resolve gaps in our understanding of pathogen metabolism and host-cell nutrient environment. We also use gene expression data obtained for this study in combination with established microarray and proteomic data in the context of the model to understand pathogen metabolism in macrophage cell lines.
Characteristics of the Reconstructed Metabolic Network
The reconstruction accounts for central metabolism, carbohydrate utilization, amino acid biosynthesis and degradation, nucleotide biosynthesis and salvage, and cofactor biosynthesis (see Additional file 1). Also included were anaerobic and aerobic electron transport chain reactions, transport reactions for various minerals and small molecules, and biosynthesis reactions for cell wall constituents and precursors, such as phospholipids and peptidoglycan. Lipid biosynthesis in bacteria involves (i) fatty acid biosynthesis, (ii) attachment of the complete fatty acids to sn-glycerol-3-phosphate, and (iii) addition and modification of head groups to yield the major cellular phospholipids. Fatty acids are important constituents of lipid A and the cell wall of bacteria. In contrast to published models of other microbes that used compositional data of E. coli, we have directly measured amino acid, fatty acid and lipid profiles for Salmonella LT2 (see Additional file 2) and incorporated them into our reconstruction in i RR1083. Since fatty acid synthesis has also become a recent target of new antimicrobial drugs, it is critical to represent fatty acids and lipids accurately for different pathogens.
In addition to accounting for these major aspects of cell metabolism and biosynthesis, the reconstruction also includes reactions required for drug efflux, proton pumps, and inhibitory effects of antibiotics on bacterial metabolism. We have included the AcrAB-TolC multi-component drug efflux pump which has been characterized in S. typhimurium. This efflux pump has been shown to be involved in the resistance to β-lactams (ampicillin and penicillin G), fluoroquinolone (ciprofloxacin and fluoroquinolone), cefotaxime, and chloramphenicol [27, 28, 29]. An increase in the expression levels of this pump has been recognized as a significant determinant of antibiotic resistance in bacterial pathogens. The inclusion of this efflux pump allows modeling of the reaction of Salmonella to antibiotics, by varying efflux pump activity and/or limiting metabolic flux through the respective pathways. For example, β-lactam antibiotics result in incomplete peptidoglycan synthesis (D-Ala-D-Ala addition), which can be modeled by restricting flux through peptidoglycan synthesis. Similarly, dihydropteroate synthase inhibition by sulfonamides and dihydrofolate reductase inhibition by trimethoprim can be simulated by constraining fluxes through their corresponding reactions.
Another unique aspect of our reconstruction are the reactions involved in free radical quenching that are generally used by Salmonella to counteract the host cell response to infection [30, 31]. In addition to the classically known mechanisms mediated by superoxide dismutases, catalases and the enzymes that make and recycle glutathione and thioredoxin, our reconstruction includes mechanisms for reactive nitrogen species (RNS) and reactive oxygen species (ROS) resistance and removal. Some of those include: a soluble flavohemoglobin (Hmp) with a NO dioxygenase activity that is induced solely by RNS; and an alkyl hydroperoxide reductase (AhpC) belonging to the peroxiredoxin family that is involved in dissipating both ROS and RNS [31, 32, 33, 34]. The inclusion of reactions for ROS (superoxide, peroxide) and RNS (nitrate, nitrite, nitric oxide, and nitrous oxide) species make this model a suitable platform to analyze their role in infection and pathogenesis.
Flux balance analysis (FBA) was used to predict growth rates on six different carbon sources (glucose, fructose, arabinose, xylose, ribose and glycerol) using their measured substrate uptake rates. These in silico predicted growth rates were then compared to experimental results of aerobic batch cultivation of S. typhimurium LT2 in M9 minimal media. The calculated maximum specific growth rate (hr-1) of Salmonella for a given carbon source uptake rate was close to the experimentally determined values for S. typhimurium under similar conditions (Figure 1C). Quantitative differences between calculated and observed growth rates could be attributed to differences in ATP maintenance parameters (values for E. coli were used in the simulations ) or secretion of by-products, which were not predicted in any of the simulations since carbon sources were assumed to be the limiting nutrient.
S. typhimurium is a mixed acid fermenter that metabolizes glucose via the Embden-Meyerhoff-Parnas pathway . Many of the predicted non-zero fluxes (as calculated using FBA) corresponded to genes whose mRNA transcript expression levels were measurable during growth of Salmonella (see Additional file 8) on glucose minimal media. Most of the transcripts detected in our study concur with previous reports in literature. During growth on glucose the iRR1083 model predicts that oxaloacetate can be produced by phosphoenolpyruvate carboxylase (PepC), which is known to be the only enzyme serving the anaplerotic role of replenishing oxaloacetate that is withdrawn for amino acid biosynthesis. In agreement with our flux calculations, our experimental results also indicate a high abundance of the PepC mRNA. The data also indicate that glucose is converted to biomass and some part of the carbon is diverted into stoichiometric formation of formate, acetate, ethanol and succinate under anaerobic conditions .
Growth of S. typhimurium on 3 pentoses (D-arabinose, D-ribose and D-xylose) was also tested experimentally. Growth was only observed on D-xylose and D-ribose, which is consistent with literature and model predictions . Xylose is catabolized, as most other aldopentoses, by transport into the cell, isomerization and phosphorylation of ketopentose (xylulose). The phosphorylated derivative is then converted to xylulose-5-phosphate by an epimerase and channeled into the pentose phosphate pathway. D-ribose is transported into the cell, concomitant with the disruption of a high-energy phosphate bond and enters the pentose phosphate pathway following epimerization. These results highlight the utility of using modeling to predict growth on substrates a priori to experimentation and can help understand the different pathways that are operational during growth on specific carbon and nitrogen sources.
Qualitative Growth Phenotypes
In silico predictions and experimental data.
Experimental No Growth
No Growth (Carbon)
No Growth (Nitrogen)
The model predicted that 113 out of 163 potential carbon sources could support growth as the sole carbon source, and 50 out of 98 potential nitrogen sources could support growth as the sole nitrogen source. Comparisons with experimental data could be made for 125 of the potential carbon sources and for 74 of the potential nitrogen sources. Overall, the model predicted carbon and nitrogen sources with 82% accuracy for carbon sources and 81% accuracy for nitrogen sources. The relationship between model predictions and experimental growth phenotypes is statistically significant (chi-squared test statistic yields p < 0.0001 for both carbon and nitrogen sources). Most of the inaccurately predicted carbon sources are compounds that can also serve as sole nitrogen sources experimentally (13 of 21). This suggests that regulation or enzyme kinetic limitations likely prevent these compounds from being used by S. typhimurium LT2 as sole carbon sources.
S. typhimurium and E. coli are closely related microorganisms. Comparison of their physiological data and metabolic network structures suggest differences in carbon and nitrogen source utilization. For example, E. coli can utilize D-alanine, L-asparagine, L-aspartate, L-glutamine, and D-lactate as sole carbon sources experimentally, while S. typhimurium can not, even though S. typhimurium has the necessary enzymes and transporters. These phenotypic differences are likely to be caused by differences in how the organisms control metabolism through regulation and kinetics rather than by differences in metabolic network structure. The same explanation can be made for carbon sources that only S. typhimurium can use even though both organisms can catalyze the required reactions (1,2-propanediol, citrate, galactitol, L-glutamate, L-proline, and L-tartarate).S. typhimurium is also predicted to be able to utilize additional carbon sources due to the presence of unique transporters and/or enzymes (4-hydroxyphenylacetate, deoxyribose, ethanolamine, myo-inositol, L-lyxose, D-tartrate, D-glycerate, 2-phosphate-D-glycerate, 3-Phospho-D-glycerate, L-cysteine, isocitrate, phosphoenolpyruvate and tricarballylate).
Analysis of gene deletions, essentiality and lethal genes
The impact of single gene deletions on growth of S. typhimurium LT2 was analyzed by simulating its growth on rich medium in the absence of the reactions catalyzed by the corresponding gene. The results were compared for consistency with a compiled list of 255 essential genes from a large-scale gene essentiality screen. Of the 255 essential genes, only 74 were included in iRR1083 and could be compared with experimental results. The total accuracy of essentiality predictions was around 40% (30 out of 74). The accuracy of predicting in vivo essential genes for growth in rich media is within the range reported for models of Pseudomonas aeroginosa (40%) , H. pylori (57%), and Bacillus subtilis (69%). Discrepancies between model predictions and experimental results can be used to better identify what metabolites are present in the rich media and lead to model improvements. For example, inclusion of charged tRNAs rather than free amino acids into the biomass reaction would lead to the correct prediction of amino acid tRNA synthetases as essential genes.
To further explore the metabolic differences between E. coli and S. typhimurium we compared gene essentiality predictions for in silico aerobic growth in glucose minimal media using the most recent E. coli metabolic reconstruction, i AF1261  and i RR1083 (see Additional file 5). The number of predicted essential genes was fairly similar for the two organisms, with E. coli having 237 (out of 1261) essential metabolic genes and S. typhimurium having 200 (out of 1083) essential metabolic genes. The two models have 900 overlapping orthologs, meaning there are 183 genes included in i RR1083 which do not have an ortholog included in i AF1261 (142 of these do not have orthologs in E. coli and another 41 of the E. coli orthologs were not included in i AF1261).
For 667 of the 900 overlapping orthologs both models predict the genes to be non-essential for in silico growth in a glucose minimal medium under aerobic conditions. The models also agree on 136 of the 900 orthologs being essential. Comparing the predictions to experimental E. coli knock-out mutant phenotypes (in rich and glucose minimal media [38, 39], the conserved non-essential predictions and essential predictions were both 89% accurate (595/667 and 122/136, respectively).
The discrepancies between essentiality predictions made by the two models can be useful in identifying model differences (biomass formulation and reaction reversibility) and metabolic network differences (isozymes and unique reactions) for the two microorganisms. In 45 of the 900 overlapping orthologs, the E. coli model predicted genes as essential and the S. typhimurium model predicted those genes as non-essential; in 52 cases the S. typhimurium model predicted genes as essential and the E. coli model predicted those as non-essential. Most of these discrepancies (~74%) can be attributed to differences in the biomass formulations that were used for the two models. The S. typhimurium model either has additional compounds (such as a complete lipopolysaccharide, glycogen, and spermidine) that cannot be produced in a gene's absence (contributing to 35 out of 52 discrepancies), or the E. coli model has additional compounds (such as thiamine diphosphate, pyridoxine 5'-phosphate, ubiquinone, and protoheme) that cannot be produced in a gene's absence (contributing to 37 out of 45 discrepancies).
Reaction reversibility differences are also responsible for some of the discrepancies. In the S. typhimurium model threonine aldolase is reversible so that threonine can be produced from glycine and acetaldehyde; however, this reaction was changed to irreversible in the updated E. coli model, i AF1261. As a result the biosynthetic genes for threonine production encoding homoserine kinase (thrB) and threonine synthase (thrC) are predicted to be essential by the E. coli model. In contrast, the S. typhimurium model predicts that in their absence threonine can be synthesized from glycine and acetaldehyde. Since both thrB and thrC are essential for E. coli growth in minimal media conditions  it is likely that threonine aldolase does not produce threonine. Glycine hydroxymethyltransferase also differs in its reversibility between the two models, where it is reversible in i AF1261 and irreversible in i RR1083. The reversible reaction provides another pathway in the E. coli model for serine production in the absence of phosphoserine phosphatase (serA) or 3-phosphoglycerate dehydrogenase (serB), and explains why these two enzymes are predicted to be essential for S. typhimurium growth and non-essential for E. coli growth. Since both serA and serB are essential experimentally in E. coli , it is likely that neither organism uses glycine hydroxymethyltransferase for serine production, which is consistent with the S. typhimurium model predictions.
Gene essentiality discrepanices caused by metabolic network differences, such as additional reactions and/or isozymes, provide interesting insights. For example, S. typhimurium has genes encoding for two separate 3-isopropylmalate dehydratases (encoded by STM0110-STM0111 and STM0329-STM0330) while E. coli only has genes for one (encoded by b0071-b0072). As a result, when STM0110 or STM0111 are deleted the i RR1083 model predicts that STM0329 and STM0330 can be used instead to produce leucine, which is not possible in E. coli since it lacks the second isozyme. S. typhimurium also carries out reactions that E. coli can not. For example, an NADPH dependent sulfite reductase (unique to S. typhimurium) is predicted to be used for hydrogen sulfide production (needed for cysteine biosynthesis) when the NADH dependent sulfite reductase (present in both organisms) is absent.
There are also cases where E. coli has additional isozymes or metabolic reactions, that explain why genes are predicted to be non-essential in E. coli but essential in S. typhimurium. For example, there are multiple carbonic anhydrases and dihydrofolate reductases in E. coli while S. typhimurium has only one copy of each of these essential enzymes. E. coli also contains a gene encoding ribose 1,5-bisphosphokinase, for which there is no ortholog in S. typhimurium. This E. coli specific enzyme is predicted by the iAF1261 model to compensate for ribose-phosphate pyrophosphokinase in 5-phospho-alpha-D-ribose 1-diphosphate production, which is needed for histidine and nucleotide biosynthesis in the absence of ribose-phosphate pyrophosphokinase . As shown here, comparisons between gene essentiality predictions can be useful in identifying model discrepancies which can lead to updated and improved models, as well as identification of unique metabolic features, which allow for discovery of possible phenotypic differences between two organisms.
Metabolic gene essentiality for virulence
The i RR1083 reconstruction as described earlier only contains metabolic genes, so there are additional non-metabolic essential genes in vivo that were not included in our virulence prediction study. The in silico virulence predictions matched in vivo virulence characterization for 92% (22/24) of the in vivo virulent mutants and 56% (9/16) of the in vivo avirulent mutants. Most attenuated strains were also correctly predicted by the model to still be able to grow (13/15). Again a statistically significant relationship between model predictions and experimental observations was found (χ2 = 10.9 and p = 0.001). The model predicted incorrectly that mutants defective in pmrF and udg were avirulent [42, 43]. Both of these genes are involved in the biosynthesis of the aminoarabinose component of lipopolysaccharide (which is attached to Lipid A). However, this aminoarabinose component of the LPS is not always present. If alternate LPS structures were also included in the model then the model would accurately predict the virulence of pmrF and udg mutant strains. This would increase the final virulence prediction accuracy to 100%. The consistencies and discrepancies between in silico and in vivo mutant phenotypes deserve further discussion and may generate potential hypotheses that can subsequently be tested.
Some genes that are involved in major central metabolic pathways and are non-essential for intracellular survival (i.e. mutants defective in these genes are virulent) are discussed below. AceA encodes isocitrate lyase, which catalyzes the first step in the glyoxylate shunt, an anaplerotic carbon assimilatory pathway that synthesizes C4 dicarboxylates from C2 compounds like acetate or short chain fatty acids. AceA is required only for chronic Salmonella infections and not for acute infections . This enzyme is generally used when substrates such as glucose or pyruvate are absent and acetate or fatty acids are present. AckA and Pta are two enzymes in salmonella that lead to acetyl phosphate formation and the double deletion strain is virulent. Acetyl-phosphate is thought to activate expression of invasion genes, which can also be activated via a sensory kinase BarA. However, virulence is decreased only when Pta, AckA, and BarA are deleted .
Phagocytic cells are able to generate superoxide, hydrogen peroxide and other reactive oxygen species (ROS) and pathogens possess multiple defenses to such oxidative stresses. The model also accurately predicted that catalase (katE and katG) mutants  and super oxide dismutase mutants (sodA and sodCII) [47, 48] are virulent. Mutants lacking a third superoxide dismutase (sodCI) are attenuated and the model incorrectly predicts that a sodCI mutant would be virulent; however, if the host-cell environment included superoxide and none of the other superoxide dismutases were expressed or active (sodA, sodB, and sodCII) then the bacterium would have no way of dissipating superoxide and the model would predict an avirulent phenotype.
One of the incorrect predictions for the experimentally determined avirulent strains was for the nagA mutant. NagA is a N-acetylglucosamine-6-phosphate deacetylase, and it's substrate acts as an inducer of the nag operon, which relieves repression by NagC. It has been proposed that nagA mutants indirectly lead to reduction in aminosugar biosynthesis, which is needed in membrane structures . The model's inability to identify this mutant correctly as being avirulent could be due to the absence of any regulatory mechanisms not accounted for in the model. Inclusion of simple regulatory rules as has been done for the E. coli model, would allow us to address the role of regulation of metabolism during Salmonella-host interaction.
Of the experimentally determined attenuated mutants, the model predicted that mutants defective in aromatic amino acid biosynthesis are avirulent (aroABCD). Aro Salmonella strains have been used as oral vaccines . The attenuation of this strain, which is accomplished by interrupting pathways for biosynthesis of aromatic amino acids, is consistent with our in silico analysis. Strains of S. typhimurium with complete non-reverting blocks in aromatic biosynthesis are avirulent  as predicted by our model.
Pseudogenes and their role in infection among serovars of Salmonella
Sixty-five genes that are included in iRR1083, and thus present in Salmonella serovar typhimurium, are missing or are pseudogenes in other Salmonella serovars (see Additional file 7) . This implies that these genes could be dispensable for virulence. Based on the in silico analyses of the effect of single gene deletions on growth of S. typhimurium in both rich media and host-cell environments, all 65 genes are predicted to be dispensable. Only one gene out of these 65, panD, was predicted to be required for growth in minimal media. PanD is required for panthothenate synthesis, an intermediate in the biosynthesis of CoA. In E. coli this enzyme is not required for growth in LB but is required for growth in minimal media [38, 39]. Presumably then, serovars lacking panD would require panthothenate from its host. Compounds that affect pantothenate transport could potentially be used as antibiotics.
Salmonella Metabolism in infection
We sought to identify genes (Figure 4B) and select pathways that may be crucial for Salmonella's infection of and growth inside macrophage cell lines by quantitatively monitoring the expression of 100 selected metabolic genes (see Additional file 8) that are involved in different key pathways. Since only a small percentage of metabolic genes were hypothesized to change mRNA abundance levels during infection, we use the Genomelab Expression profiler (GEXP) for estimating gene expression quantitatively. This technique combines RT-PCR and capillary gel electrophoresis in a high-throughput format to obtain accurate, consistent mRNA abundance levels for expressed transcripts of chosen genes rapidly. Consistent with metabolic gene essentiality hypothesis suggested in literature, only 24% of the genes chosen in this study were up-regulated 24 hrs post infection. Our data indicates that isocitrate lyase (aceA), and fatty acid degradation genes fadA, fadD/fadE are not expressed. Although these genes have been implicated for virulence in chronic Salmonella infection, they are known to be unchanged during short-term infection experiments . Our results suggest the presence of glucose or pyruvate and not acetate or short chain fatty acids in vivo during the early stages of infection as suggested in literature . This also explains why genes from the oxidative branch of the TCA cycle (gltA, sdh) are over expressed.
The simultaneous up regulation of succinate dehydrogenase and electron transport chain genes cyoE and cyoA is suggestive of an aerobic environment in vivo, at least during the early stages of infection . Increased levels of F1F0 ATP synthase subunits measured within our data set also support this. ATP synthase is primarily responsible for the oxidative phosphorylation of ADP to ATP. It also plays a role in maintaining the proton gradient to maintain intracellular pH and drive efflux pumps. The high levels of mRNA transcripts for succinate dehydrogenase and protoheme farnesyl transferase genes also suggest heme biosynthesis occurs in S. typhimurium during infection. These are in concurrence with other existing data sets .
Genes belonging to the histidine biosynthesis cluster (hisC, hisD, hisG) were up regulated and those involved in aromatic amino acid synthesis pathways (aroA, pheA) were down regulated. These findings are consistent with the literature . Genes involved in nucleotide (purF) and NAD biosynthesis (nadE) were down regulated or not detected. The down regulated genes also included those involved in glycolysis and carbon metabolite repression, certain amino acid decarboxylases and some transporter genes including drug efflux genes. These results suggest that Salmonella acquires a significant amount of essential nutrients directly from the host cell, thus shutting down several biosynthetic pathways when inside the host. The genes that cannot be detected include some genes involved in glycolysis/gluconeogenesis (glk, pps) and anaplerotic reactions (ppc). Other genes are detectable, but do not change in expression level over the course of infection, including those in the pentose phosphate pathway (zwf), Entner Duoduroff pathway (edd) and glutathione cycle (ggt, gst). Fatty acid biosynthesis and degradation genes are also not detected during the monitored early stages of infection.
Metabolic capacity of S. typhimurium in the host cell
We compiled the list of all proteins that comprise the in silico reactome and compared that to previously determined proteome of S. typhimurium extracted from macrophage cells after infection (Figure 5B). Of the 315 proteins identified with a high level of confidence in S. typhimurium [5, 57], 129 have an annotated metabolic function (metabolic proteome) and in our model i RR1083. Eighty of these proteins were associated with reactions present in the reactome calculated by FVA analysis during optimal growth, while only 34 were identified in the host-cell reactome when growth was sub-optimal. In another experiment, where the proteome of S. typhimurium was probed under different in vitro environmental conditions, a total of 507 proteins were identified , of which 127 were accounted for in our metabolic reconstruction. Out of these 127 proteins, 83 and 22 were associated with the optimal and suboptimal host cell reactome, respectively, as calculated using FVA. Most of the proteins that were found in both proteomic data sets and associated with blocked reactions, as calculated by FVA, were tRNA synthetases (7 out of 15  and 13 out of 22 ). Incorporating charged tRNAS instead of free amino acids in the biomass reaction would change the tRNA synthetase reactions from blocked to part of the optimal reactome and would result in higher agreement between the optimal reactome and identified proteins.
Using a similar strategy we compared micro-array gene expression data obtained from S. typhimurium extracted from macrophage cells . Of the 919 genes that were differentially expressed, 250 genes had a metabolic function, thus comprising the metabolic transcriptome. When we compared those with the gene set represented by our FVA analysis reactome, we identified 80% of the genes in the metabolic transcriptome were associated with reactions in the reactome. Of the remaining 49 that were not present in our in silico reactome, 65% of the genes were down-regulated and formed a subset of the blocked reactions from our FVA analysis (Figure 5B). Although the accuracy of prediction is very good, it could be increased by improving our knowledge of existing nutrients in the macrophage or incorporating regulation into the model. By identifying preferred metabolic pathways we can better understand how the pathogen responds to its microenvironment during invasion and infection.
This study describes a genome-scale metabolic model, i RR1083, of the Gram-negative intracellular pathogen S. typhimurium. The model has been reconstructed by integration of existing genomic, proteomic, and phenotypic experimental data. Comparison between model predictions and experimental data demonstrated an accuracy of more than 80% for growth and virulence phenotypes. The model allowed us to use flux balance analysis to analyze growth and virulence genes given the environment the intracellular pathogen encounters. Further, gene expression experiments and computational analysis allowed us to identify metabolic enzymes and pathways that are potentially active during infection. Model-development is an iterative process that moves from a genome annotation and biochemistry of the organism to the development of a genome-scale metabolic network that can be used as a fully functional in silico organism to compute physiological phenotypes. Our model iRR1083 serves also as a framework to integrate further gene expression and proteomic data and explain the expression level of specific gene products during infection.
Computing cellular functions from reconstructed networks allowed us to understand the relationship between environmental factors (nutrient availability) and genetic factors (mutations or deletions) and their integrated function to orchestrate cell phenotypes (virulence or attenuation). Knowledge of the functional status of metabolic genes under defined constraints allows us to interpret the role of critical nodes in a cellular metabolic network. The function of redundant pathways under internally perturbed conditions, such as in the host cell during infection, provides important insights into metabolic network structure. Although this internal robustness has been experimentally demonstrated, building a complete catalog of genes essential for growth under a myriad of conditions for any organism is time-consuming and laborious. A computational approach that predicts gene essentiality with high accuracy is thus of great value. Model predictions of virulence for different mutants improved from 80% to 91% upon altering host-cell environment components (by adding glutamine, and removing thymidine, guanosine, adenosine, and deoxyadenosine) and this shows how the model can be used as a tool for providing insights into the role of metabolism in infection and pathogenesis. To date, constraint-based models cannot predict effects of toxic intermediates and methods are needed that allow for prediction of concentrations within constraint-based models.
A biomass reaction (representing the conversion of biomass components into one gram dry weight of biomass) is needed to accurately predict the effects of gene deletions on growth and viability. Although predictions have been made in the past with approximations of the biomass function to that of E. coli, there are some major differences in fatty acid and amino acid composition that require experimental measurements to determine the drain of different metabolites and the cost of their production on the function of the organism. Our study has addressed this issue by directly determining the components of Salmonella biomass, thus increasing the validity and predictive accuracy of model iRR1083.
S. typhimurium is a pathogen that is the leading cause of human gastroenteritis and is used as a mouse model for human typhoid fever. Although more than 2000 serovars exist, the LT2 strain, whose genome sequence has been used here for building model iRR1083, is one of the principal strains for cellular and molecular biology of Salmonella. Attenuated mutants have been used not only as live oral vaccines against Salmonella infection but also to express antigens from other pathogens and to deliver proteins to solid tumors. One reason for attenuation or avirulence is the deletion or functional silencing of a genetic element in the bacterium resulting in the impairment of the intracellular survival capacity of S. typhimurium. Our in silico analysis shows the ability to predict such effects with good accuracy. At this point the host cell metabolic environment has been inferred indirectly from gene-expression data. Improved accuracy in directly determining what metabolites are available in the host cell environment and incorporation of transcriptional regulatory rules into i RR1083 will allow a more precise analysis of metabolic selective pressures a pathogen encounters during infection.
In addition to pathogenicity and virulence analysis, the S. typhimurium metabolic reconstruction can be a valuable tool to interpret several 'omics' data types. Although there is not a direct correlation between gene or protein expression levels and metabolic fluxes, the genome-scale model serves as a platform to visualize these large data sets and interpret them in the context of operational, redundant or incomplete metabolic pathways. This provides a better understanding of cellular phenotypes and offers insights into how system level properties affect survival or sensitivity of pathogens to factors in the environment provided by the host cell. For example, the model prediction of fad genes not being operational during early stages of infection (optimal growth) and operational during sub-optimal growth can be explained by the fact that fatty acids are not utilized in the initial stages of infection by S. typhimurium . This observation along with the calculated host cell reactome from i RR1083 could suggest a switch from optimal to sub-optimal growth or a change in nutrients available for Salmonella inside host cell tissue. This switch is concomitant with a marked decrease in growth during persistence or chronic infection. Growth rates are presumably higher (optimal) during the initial stage of infection. The entire reactome of Salmonella as predicted by the model provides many hypotheses that need to be further investigated to identify the implications of metabolic pathways and functions in pathogenesis. Metabolomic data have become more readily available and easier to measure. They can be used to apply additional constraints to the model , thus improving model predictions of cell function and global phenotypes.
The genome-scale model of S. typhimurium we have developed here also presents us with the opportunity to interrogate several other serovars or species that belong to the Salmonella genus. For example, panD is the only gene that is non-functional in Salmonella typhi, and is predicted to be required for virulence in specific conditions. The presence of pseudogenes in other serovars can be analyzed in the context of the genome-scale model to present us with systemic similarities and differences that may or may not play a role in the pathogenic potential of all the four species. Serovars lacking panD would require panthothenate from its host. Thus, compounds that affect pantothenate transport could potentially be used as antibiotics. Such analyses across different serovars require genome-scale models and might accelerate the identification of rational drug targets.
Although many of the reactions in the i RR1083 model for S. typhimurium are conserved in models for E. coli MG1655, there are unique metabolic and genetic differences that affect growth phenotypes in response to different environmental or genetic perturbations. The two organisms are each capable of utilizing unique carbon sources, which are related to differences in metabolic networks (unique transporters or enzymes) or differences in regulation or kinetics that affect flow through shared reactions common to both organisms (reactions are present to catabolize carbon sources in both metabolic networks but one organism does not use them as sole carbon sources). We also found that there are differences in the number of isozymes or alternative reactions/pathways that enable one organism to be more robust against gene deletions in a glucose minimal media condition. We anticipate that comparisons of model-predicted essential genes in more environmental conditions will lead to the identification of more metabolic differences between the two organisms, as the analysis was only performed for aerobic growth in glucose minimal media. These metabolic differences might shed light onto the evolutionary differences between the two organisms and the nutrients they are or were exposed to.
The genome-scale metabolic network reconstruction presented here supports the notion of a robust minimal set of pathways that Salmonella requires for intracellular survival during infection . Our model allows in silico simulation of different nutritional requirements and constraints as they may be present in different host cell types or tissues that are difficult to directly address experimentally. Our model can generate hypotheses that can be validated by using mutant strains, high-throughput data, and selective analysis of pathways. Such an endeavor has the potential to lead to an improved understanding of how intracellular pathogens utilize the existing microenvironment of the host. This in turn might lead to the identification of conserved metabolic pathways that can be targeted by novel interfering compounds.
We have presented here extensive datasets within a reconstruction of a genome-scale metabolic model for S. typhimurium (iRR1083). This model has been validated by experimental data and by legacy data. It demonstrates an accuracy of 80% for growth and virulence phenotypes. Thus the model iRR1083 provides a suitable platform for computing cellular phenotypes for Salmonella and for further integration of high-throughput data.
Biomass composition of Salmonella was determined experimentally, which increased the accuracy of our model building and also showed distinctive difference to that of E. coli. Constraint-based analysis using the metabolic reconstructions iRR1083 for S. typhimurium and MG1655 for E. coli identified unique metabolic differences between these two bacteria.
Expression profiling of selective key metabolic enzymes in Salmonella during infection of macrophages identified operational and non-operational metabolic pathways. Flux variabity analysis of metabolic pathways identified a minimal set of reactions required for Salmonella intracellular survival.
This study thus provides another high-resolution reconstruction of microbial metabolism of an intracellular pathogen next to that of M. tuberculosis and demonstrates its suitability for analyzing bacterial metabolic networks during infection. From further validation and testing of hypotheses generated from these reconstruction efforts we expect to arrive at a conserved set of metabolic reactions that interdigitate with those of the host cell and might thus provide novel targets for therapeutic intervention.
Metabolic Network Reconstruction
The reconstruction process for S. typhimurium is outlined schematically in the Additional file 9. The process followed the approach outlined by Edwards and Palsson (Edwards and Palsson, 2000). The reconstruction was carried out in SimPheny™ (Genomatica, Inc., San Diego, CA), a platform for development of cellular models. Lists of genes, proteins and reactions, metabolites and exchange fluxes that are included in the resulting final reconstruction can be found in the Additional file 1.
The initial draft reconstruction was built from the annotated genome  of Salmonella enterica serovar typhimurium LT2 (S. typhimurium LT2), which is available at TIGR. Genes to proteins to reaction associations (GPR) were confirmed by comparing homology of S. typhimurium gene sequences to E. coli gene sequences and biological information databases such as EXPASY, KEGG, and BioCyc. These GPR associations connect genetic data to reactions in the metabolic network  and allow for subsequent evaluation of how genetic perturbations affect metabolic phenotypes.
Biomass Composition and Reaction
A reaction that represents biomass production is included in the model to account for the drain of precursors and macromolecular building blocks into biomass. Biomass was represented as a linear combination of all the macromolecular components (lipid, glycogen, lipopolysaccharide, and peptidoglycan) or monomers of macromolecules (amino acids and nucleotides). Amino acid and fatty acid composition were experimentally measured using standard methods described in manual of bacteriological methods . The relative fatty acid composition was used to specify the average fatty acid content on the phospholipds and diacylglyerol. S. typhimurium LT2 data from the literature was then used to determine the amount of lipids (and their phospholipid composition), lipopolysaccharide, peptidoglycan, and glycogen. The growth associated ATP maintenance and soluble metabolite pools were assumed to be the same as reported for E. coli . Complete details of the cellular composition and references used for the S. typhimurium biomass reaction are provided in the Additional file 2. The biomass reaction thus represents the weighted combination of components forming the dry weight of the cell and the amount of ATP hydrolysis, needed for energy during growth and cellular maintenance.
Flux Balance Analysis
Fluxes through metabolic reactions across the network can be calculated using flux balance analysis (FBA) . With FBA, the biological system is assumed to be at steady-state and so all intracellular metabolite concentrations and fluxes are assumed to be constant. The steady-state assumption makes it possible to compare the simulation results directly to data obtained from cells growing at a fixed growth rate. FBA is formulated as an optimization problem, where constraints are imposed that limit flux values (steady-state mass balance constraint and upper and lower bounds for fluxes based on thermodynamics and substrate uptake constraints). These constraints define the range of values that fluxes can take. An objective function is also used to compare flux distributions (v) that satisfy all the constraints in order to find optimal flux distributions. Flux through the biomass production reaction has successfully been used as an objective function for E. coli (along with other objective functions [60, 61], and was used for FBA performed in this study. Thus, FBA predicts an optimal growth yield and a flux distribution(s), which correspond to this maximal growth yield. Alternate optimal solutions exist, and flux variability analysis  was additionally used to identify the ranges individual fluxes can take while still achieving the maximal growth yield.
In silico constraints and media composition
Three different environments were used to provide constraints for simulations: (i) M9 minimal medium (contains salts, phosphate, sulfate, ammonium minerals) with an added carbon source; (ii) LB-media (composition was based on a typical analysis of the yeast extract and tryptone provide by the manufacturer); (iii) host-cell nutrient environment (representing nutrient conditions inside a host-cell during Salmonella infection). While the host-cell nutrient environment was difficult to ascertain, it was based on an extensive literature review to identify a possible composition. Details of all simulated media conditions are provided in the Additional file 6. For simulation of aerobic growth, the following external metabolites were allowed to freely enter and leave the network: phosphate, ammonium, sulfate, water, oxygen, and proton. All metabolites that were not media components (potential products) were only allowed to leave the system.
Validation of the model against Biolog data for S. typhimurium LT2 (available from Biolog website, http://www.biolog.com/accesstoPMTech.html) was performed by optimizing biomass production with different carbon and nitrogen sources. A maximum uptake rate of 5 mmol per gram of dry weight per hour (mmol/gDW cell/h) was used for the different carbon and nitrogen sources. Different nitrogen sources were simulated with pyruvate as a carbon source and different carbon sources were simulated with ammonium as a nitrogen source.
Genetic changes, such as gene deletions and mutations, can be simulated by adding additional constraints to individual fluxes . For example, deletion of an enzyme subunit would force flux values through associated reactions to be zero, unless other isozymes are present. The essentiality of a given gene was determined by calculating the maximal growth rate using FBA when fluxes through associated reactions were constrained to be zero. This type of analysis identifies lethal gene deletions (where maximum growth rate is zero) and non-lethal gene deletions (where maximum growth rate is greater than zero). Gene essentiality is dependent on media composition, and all three growth environments (M9 minimal media with glucose, LB media, and host-cell nutrient environment) were analyzed separately.
Growth and Infection Experiments
All growth experiments were carried out using the strain S. enterica serovar typhimurium LT2. Aerobic batch cultivation was done in an incubator shaker at 37°C and 100 rpm. Minimal M9 medium was supplemented with 6 different carbon sources (glucose, fructose, glycerol, arabinose, ribose and xylose). The cell culture optical density (OD) was monitored at regular intervals using a Beckman DU spectrophotometer at 600 nm. The OD was converted into g dry weight using a previously determined biomass calibration standard curve using the same spectrophotometer (gDW = 0.7396·OD600). Samples were also withdrawn every 2 hours, filtered and the supernatants were frozen and stored at -80°C for HPLC analysis to analyze substrate utilization and product formation profiles.
Organic acid and sugar analysis was done using an ion exchange Aminex 87H+ column (Biorad). A 5 mM Sulfuric acid mobile phase at a flow rate of 0.5 ml/min at 65°C was used for separation. Both UV and RI (refractive index) detector were employed. Small molecules were quantified using a previously determined calibration curve.
Quantitative Gene expression using GeXP™
Infection of the macrophage like cell line (RAW246.7) was carried out as described . RNA was purified from harvested bacterial cells using the protocol from RNeasy Kit (QIAGEN) and quantitated using UV detection at 260 nm and 280 nm. The GenomeLab™ GeXP Genetic Analysis System (Beckman) was used for quantitative gene expression analysis of the selected metabolic transcriptome of S. typhimurum. GeXP employs eXpress Profiling (XP-PCR), a patented technology for multiplex gene expression profiling analysis by which up to 30 genes can be easily multiplexed in the same reaction. We used a protocol that comes with GeXP Reagent kits that involved five basic steps: 1) Primer design; 2) cDNAsynthesis; 3) PCR; 4) Separation on the GenomeLab GeXP Genetic Analysis System; 5) Fragment Analysis and Expression Profiling. The data was then analyzed using the GeXP analysis proprietary software (Beckman, Inc.), Excel (Microsoft, Inc.) and Matlab (Mathworks, Inc.). Primers were designed using proprietary software provided by Beckman (see Additional file 10).
This work was supported by funds from the Mount Sinai School of Medicine, New York. JR received financial support in the early stage of the project as a Faculty Fellow at the University of California, San Diego.
- 5.Shi L, Adkins JN, Coleman JR, Schepmoes AA, Dohnkova A, Mottaz HM, Norbeck AD, Purvine SO, Manes NP, Smallwood HS, Wang H, Forbes J, Gros P, Uzzau S, Rodland KD, Heffron F, Smith RD, Squier TC: Proteomic analysis of Salmonella enterica serovar typhimurium isolated from RAW 264.7 macrophages: identification of a novel protein that contributes to the replication of serovar typhimurium inside macrophages. J Biol Chem. 2006, 281: 29131-29140. 10.1074/jbc.M604640200CrossRefPubMedGoogle Scholar
- 9.Feist AM, Henry CS, Reed JL, Krummenacker M, Joyce AR, Karp PD, Broadbelt LJ, Hatzimanikatis V, Palsson BO: A genome-scale metabolic reconstruction for Escherichia coli K-12 MG1655 that accounts for 1260 ORFs and thermodynamic information. Mol Syst Biol. 2007, 3: 121- 10.1038/msb4100155PubMedCentralCrossRefPubMedGoogle Scholar
- 15.Raghunathan A, Price ND, Galperin MY, Makarova KS, Purvine S, Picone AF, Cherny T, Xie T, Reilly TJ, Munson RJ, Tyler RE, Akerley BJ, Smith AL, Palsson BO, Kolker E: In Silico Metabolic Model and Protein Expression of Haemophilus influenzae Strain Rd KW20 in Rich Medium. OMICS. 2004, 8: 25-41. 10.1089/153623104773547471CrossRefPubMedGoogle Scholar
- 17.Mahadevan R, Bond DR, Butler JE, Esteve-Nunez A, Coppi MV, Palsson BO, Schilling CH, Lovley DR: Characterization of metabolism in the Fe(III)-reducing organism Geobacter sulfurreducens by constraint-based modeling. Appl Environ Microbiol. 2006, 72: 1558-1568. 10.1128/AEM.72.2.1558-1568.2006PubMedCentralCrossRefPubMedGoogle Scholar
- 19.Thiele I, Vo TD, Price ND, Palsson BO: Expanded metabolic reconstruction of Helicobacter pylori (iIT341 GSM/GPR): an in silico genome-scale characterization of single- and double-deletion mutants. J Bacteriol. 2005, 187: 5818-5830. 10.1128/JB.187.16.5818-5830.2005PubMedCentralCrossRefPubMedGoogle Scholar
- 25.McClelland M, Sanderson KE, Spieth J, Clifton SW, Latreille P, Courtney L, Porwollik S, Ali J, Dante M, Du F, Hou S, Layman D, Leonard S, Nguyen C, Scott K, Holmes A, Grewal N, Mulvaney E, Ryan E, Sun H, Florea L, Miller W, Stoneking T, Nhan M, Waterston R, Wilson RK: Complete genome sequence of Salmonella enterica serovar Typhimurium LT2. Nature. 2001, 413: 852-856. 10.1038/35101614CrossRefPubMedGoogle Scholar
- 27.Quinn T, O'Mahony R, Baird AW, Drudy D, Whyte P, Fanning S: Multi-drug resistance in Salmonella enterica: efflux mechanisms and their relationships with the development of chromosomal resistance gene clusters. Curr Drug Targets. 2006, 7: 849-860. 10.2174/138945006777709548CrossRefPubMedGoogle Scholar
- 42.Gunn JS, Ryan SS, Van Velkinburgh JC, Ernst RK, Miller SI: Genetic and functional analysis of a PmrA-PmrB-regulated locus necessary for lipopolysaccharide modification, antimicrobial peptide resistance, and oral virulence of Salmonella enterica serovar typhimurium. Infect Immun. 2000, 68: 6139-6146. 10.1128/IAI.68.11.6139-6146.2000PubMedCentralCrossRefPubMedGoogle Scholar
- 48.Krishnakumar R, Craig M, Imlay JA, Slauch JM: Differences in enzymatic properties allow SodCI but not SodCII to contribute to virulence in Salmonella enterica serovar Typhimurium strain 14028. J Bacteriol. 2004, 186: 5230-5238. 10.1128/JB.186.16.5230-5238.2004PubMedCentralCrossRefPubMedGoogle Scholar
- 54.Karatzas KA, Randall LP, Webber M, Piddock LJ, Humphrey TJ, Woodward MJ, Coldham NG: Phenotypic and proteomic characterization of multiply antibiotic-resistant variants of Salmonella enterica serovar Typhimurium selected following exposure to disinfectants. Appl Environ Microbiol. 2008, 74: 1508-1516. 10.1128/AEM.01931-07PubMedCentralCrossRefPubMedGoogle Scholar
- 57.Adkins JN, Mottaz HM, Norbeck AD, Gustin JK, Rue J, Clauss TR, Purvine SO, Rodland KD, Heffron F, Smith RD: Analysis of the Salmonella typhimurium proteome through environmental response toward infectious conditions. Mol Cell Proteomics. 2006, 5: 1450-1461. 10.1074/mcp.M600139-MCP200CrossRefPubMedGoogle Scholar
- 59.Van Dien SJ, Lidstrom ME: Stoichiometric model for evaluating the metabolic capabilities of the facultative methylotroph Methylobacterium extorquens AM1, with application to reconstruction of C(3) and C(4) metabolism. Biotechnol Bioeng. 2002, 78: 296-312. 10.1002/bit.10200CrossRefPubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.