Taxonomic profiling and populational patterns of bacterial bile salt hydrolase (BSH) genes based on worldwide human gut microbiome
Bile salt hydrolase plays an important role in bile acid-mediated signaling pathways, which regulate lipid absorption, glucose metabolism, and energy homeostasis. Several reports suggest that changes in the composition of bile acids are found in many diseases caused by dysbacteriosis.
Here, we present the taxonomic identification of bile salt hydrolase (BSH) in human microbiota and elucidate the abundance and activity differences of various bacterial BSH among 11 different populations from six continents. For the first time, we revealed that bile salt hydrolase protein sequences (BSHs) are distributed in 591 intestinal bacterial strains within 117 genera in human microbiota, and that 27.52% of these bacterial strains contain BSH paralogs. Significant variations are observed in BSH distribution patterns among different populations. Based on phylogenetic analysis, we reclassified these BSHs into eight phylotypes and investigated the abundance patterns of these phylotypes among different populations. From the inspection of enzyme activity among different BSH phylotypes, BSH-T3 showed the highest enzyme activity and was found only in Lactobacillus. The phylotypes BSH-T5 and BSH-T6, which come mainly from Bacteroides and have a high percentage of paralogs, exhibit different enzyme and deconjugation activities. Furthermore, we found significant differences between healthy individuals and patients with atherosclerosis and diabetes in some BSH phylotypes, although the correlations were pleiotropic.
This study revealed the taxonomic and abundance profiling of BSH in human gut microbiome and provided a phylogenetic-based system to assess BSHs activity by classifying the target sequence into specific phylotype. Furthermore, the present work disclosed the variation patterns of BSHs among different populations of geographical regions and health/disease cohorts, which is essential to understand the role of BSH in the development and progression of related diseases.
Keywords: Bile salt hydrolase (BSH), Gut microbiota, Taxonomic identification, Bile acids, Number of paralogs
Abbreviations
BSH: Bile salt hydrolase
BSHs: BSH protein sequences
GDCA: Glycine deoxycholic acid
HMP: Human Microbiome Project
HZ: Hadza ethnic group of Tanzania
IGT: Impaired glucose tolerance
MetaHIT: META genomics of the Human Intestinal Tract
T2D: Type 2 diabetes
US: The United States
WHO: World Health Organization
Bile acids (BAs) are well known for regulating cholesterol balance, and disorders of BA enterohepatic circulation can cause gallbladder or gastrointestinal diseases. The metabolism of BAs is also known to be associated with diabetes, obesity, and cardiovascular diseases. BAs are synthesized from cholesterol in hepatocytes, after which they are conjugated with the amino acids glycine or taurine to form bile salts and transferred to the intestine (Additional file 1: Figure S1). Notably, the amphiphilic character of bile salts is essential to the absorption of fat in the intestine. However, excessive bile salts are toxic to intestinal bacteria.
Bile salt hydrolase (BSH, EC 3.5.1.24), also designated as choloylglycine hydrolase, is present in the gut microbiome and catalyzes the hydrolysis of conjugated bile salts into deconjugated BAs, preserving the balance of BA metabolism. Moreover, deconjugated BAs serve as signaling molecules that facilitate the secretion of GLP-1, activate multiple receptors [8, 9, 10], and influence different metabolic processes that contribute to a variety of diseases [11, 12].
BSH has already been identified in several microbial genera, including Lactobacillus [13, 14], Bifidobacterium, Enterococcus, Clostridium spp., and Bacteroides. Interestingly, L. johnsonii PF01 was reported to have three distinct BSHs, and two BSH enzymes (BSH1 and BSH2) from Lactobacillus salivarius LMG14476 were found to have strikingly different properties with respect to their catalytic efficiency and substrate preference. The above studies indicate that, in some strains, the number of BSH genes can be variable, as can their properties.
Given the important role of gut microbiota in bile acid metabolism, it is essential to systematically identify which bacterial strains hold BSH genes, as well as how their abundance and activity vary in the gut. Because of the rapid development of next-generation sequencing technologies, some public databases now provide the baselines and variances of human gut metagenomic data (e.g., the Human Microbiome Project (HMP) and META genomics of the Human Intestinal Tract (MetaHIT)). Additionally, different populations exhibit considerable variations in gut microbiota because of different genetic backgrounds, environments, and especially large differences in dietary habits, which is also helpful to the investigation of the variation patterns of BSH in human gut microbiota.
The HMP reference genome database demonstrates the taxonomic distribution of bacteria containing bile salt hydrolase protein sequences (BSHs), while the public gut metagenome databases can provide the abundance of BSHs in various populations of geographical regions and health/disease cohorts. Thus, in the present study, we investigated the taxonomic, populational, and functional patterns of BSH in human gut microbiota using computational and biological approaches with the following underlying aims: (1) to identify bacteria encoding BSHs in human gut microbiota; (2) to investigate the variations in sequence, structure, and biological activity of BSHs from different bacteria; (3) to explore the factors that might affect the patterns of BSHs in human microbiota; and (4) to discover the associations between the relative abundance of BSHs and several diseases.
Sequence and structure comparisons of BSHs
The molecular weights of the BSH subunits ranged from 28 to 50 kDa. Comparison of sequences revealed that the numbers of amino acids in BSHs from Clostridium, Bifidobacterium, Lactobacillus, and Enterococcus were 329, 316, 325, and 326, respectively (Additional file 1: Figure S4a). Pairwise amino acid sequence alignments of the BSHs showed that the highest identity was 54.29% (comparison of BSHs from Lactobacillus and Enterococcus), and the lowest identity was 34.91% (comparison of BSHs from Bifidobacterium and Enterococcus) (Additional file 1: Figure S4b). Overall, their average sequence identity was 40.41% (Additional file 1: Figure S4a).
The secondary structures (the pattern of hydrogen bonds between the amino hydrogen and carboxyl oxygen atoms in the peptide backbone) of these BSHs showed that the six reported active site residues were conserved (Fig. 1a, highlighted by red frames), although the numbers of α-helices and β-strands differed. The topological diagram of BSH showed that the domain had a four-layered structure of composition αββα, and the core of BSH was composed of two sandwiched antiparallel β-sheets, which conformed to the structural characteristics of the Ntn-hydrolase family [29, 31] (Fig. 1b).
According to previous studies, the conserved and functionally important residues in these four BSHs were Cys2, Arg18/18/16/16, Asp21/21/19/19, Asn82/82/79/79, Asn175/173/170/171, and Arg228/226/223/224 [26, 27, 28, 29]. These six residues lie close together in the 3D structure to form the core active site of BSH, although they are far apart in the primary sequence (Fig. 1c). It is noteworthy that the functional sites in BSHs are not limited to these six residues. For instance, a total of 30 residues make contact with various substrates in the BSH of 2BJF (Additional file 1: Figure S5, Additional file 2: Table S1).
These results indicated that although there are substantial sequence differences among different BSHs, the functional residues, the αββα fold, and the core active site are conserved.
Taxonomic identification and paralogs investigation of BSHs
The sequence identities of BSHs between genera were mainly distributed in two intervals, one at about 35% and the other at about 50% (Fig. 2e). However, identities as low as about 50% were also observed between BSHs within a genus (Fig. 2e). These results indicated that distinguishing BSHs by genus might not be a rational approach because of the BSH paralogs harbored in bacterial genomes.
Populational comparison and multivariable-adjusted analyses of BSHs
The BSHs obtained from the HMP database were further screened in the gut microbiome of worldwide populations (Additional file 2: Table S3). Among the 591 BSHs we obtained from the HMP database, only 156 were present in the gut microbiomes of 581 people from 11 different countries on six continents (Additional file 2: Table S4). The drastic decrease in the number of strains (from 591 to 156, i.e., 26.40% retained) indicates that many BSH-expressing bacteria are very low in abundance in the human gut microbiome and undetectable at the current sequencing depth.
Based on the cumulative RA distribution of BSHs in different populations (Fig. 3d, Additional file 2: Table S4), the United States (US) population exhibited the highest cumulative RA of BSHs (3.74 × 10−4), while the Hadza ethnic group of Tanzania (HZ) exhibited the lowest (6.20 × 10−5). The lower cumulative RA of BSHs in the HZ population may be because they live in habitats almost completely isolated from other humans and have had relatively little modification to their basic way of life for hundreds of years. Moreover, the majority of BSHs were identified from five genera, Bacteroides, Blautia, Eubacterium, Clostridium, and Roseburia, which represented about 71.31% of the total abundance of BSHs in human gut metagenomes (Fig. 3d, Additional file 2: Table S5).
Multivariable regression analysis was implemented to evaluate the relationship of BSH abundance with gender, age, BMI, and population (Fig. 3). Gender, age, and BMI were not significantly correlated with the abundance of BSHs after adjustment (Fig. 3a, b, and c). However, significant associations were found between the cumulative RA of BSHs and populations from different geographical regions (pb = 1.2e-28, Fig. 3d). The populational factor explained the largest share of the variation (35.15%) in BSH abundance among the four factors (Fig. 3e). These results were consistent with the observation that disparity of the gut microbiota composition in different populations was mainly caused by geography.
Reclassification and variation patterns of BSHs
Among the genera distributions (middle panel of Fig. 4, Additional file 2: Table S4) and population patterns (right panel of Fig. 4, Additional file 2: Table S4) of each phylotype, BSH-T0 contained only 0.25% of the total RA of BSHs; it was not detected in the gut microbiome of the HZ population and was distributed in Clostridium, Intestinibacter, Lactobacillus, and Enterococcus. BSH-T1 contained 38.04% of the total RA of BSHs, was mainly distributed in Eubacterium, Blautia, Clostridium, Roseburia, and Ruminococcus, and could be found in the gut microbiota of all 11 populations. BSH-T2 contained 1.01% of the total RA of BSHs, and the major genera of this phylotype were Streptococcus and Enterococcus. Notably, this phylotype was not found in the gut microbiota of the population of China (CN; right panel of Fig. 4, Additional file 2: Table S6). Sequences of BSH-T3 were all from Lactobacillus; they were found only in the CN, Japan (JP), Austria (AT), and France (FR) populations and showed higher RA in the CN and AT populations (Fig. 4, Additional file 2: Table S6). BSH-T4 represented 2.74% of the total RA of BSHs, was mainly distributed in Bifidobacterium and Collinsella, and could not be found in the HZ population (Fig. 4, Additional file 2: Table S6). The RA of BSH-T4 in the gut microbiota of the JP population was higher than that of other populations (Fig. 4, Additional file 2: Table S6). BSH-T5 contained 23.63% of the total RA of BSHs and was mainly distributed in Bacteroides (Fig. 4). BSH-T6 contained 32.2% of the total RA of BSHs and was also mainly distributed in Bacteroides (Fig. 4). However, no BSHs from this phylotype could be found in the HZ and Peru (PE) populations (Fig. 4, Additional file 2: Table S6). BSH-T7 contained 1.98% of the total RA of BSHs, was mainly distributed in Blautia, and could be found in the populations from all countries (Fig. 4, Additional file 2: Table S6).
Among 120 strains, 28 harbored BSH paralogs, and the BSHs from Bacteroides, Clostridium, Lactobacillus, Ruminococcus, and Marvinbryantia showed varied phylotype distributions. Overall, nine strains had paralogs distributed within the same phylotype (Additional file 1: Figure S8, marked by black triangles), and 15 strains had paralogs distributed across two phylotypes (Additional file 1: Figure S8, marked by red triangles). Specifically, most of the strains with paralogs were from Bacteroides, and their paralogs were mainly distributed in two phylotypes, i.e., BSH-T5 and BSH-T6 (right panel of Fig. 4, Additional file 1: Figure S8). Four strains had paralogs distributed both within and between phylotypes (Additional file 1: Figure S8). Thus, different paralogs of BSHs within a genus could have sequence dissimilarity, which might lead to variable functional roles of these genera in bile acid metabolism.
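The three paralog patterns described above (within one phylotype, between phylotypes, or both) can be expressed as a small classification rule. The sketch below is only an illustration: the strain names and phylotype assignments are hypothetical, and it assumes that "between" means every BSH copy of a strain falls in a different phylotype.

```python
from collections import Counter
from typing import Dict, List


def classify_paralog_pattern(phylotypes: List[str]) -> str:
    """Classify one strain by how its BSH copies spread over phylotypes."""
    if len(phylotypes) < 2:
        return "no paralogs"  # a single BSH copy has no paralog
    distinct = len(set(phylotypes))
    if distinct == 1:
        return "within one phylotype"
    if distinct == len(phylotypes):
        return "between phylotypes"
    # Some copies share a phylotype and at least one copy sits elsewhere.
    return "within and between phylotypes"


# Hypothetical strains and phylotype assignments, for illustration only.
strains: Dict[str, List[str]] = {
    "strain_A": ["BSH-T5", "BSH-T6"],
    "strain_B": ["BSH-T1", "BSH-T1"],
    "strain_C": ["BSH-T5", "BSH-T5", "BSH-T6"],
    "strain_D": ["BSH-T3"],
}
summary = Counter(classify_paralog_pattern(p) for p in strains.values())
```

Applied to the real assignments, the three paralog categories would be expected to sum to the 28 strains reported here (9 + 15 + 4).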
Molecular docking and enzyme activity comparisons between eight BSH phylotypes
Representative BSHs of the eight phylotypes were synthesized and purified (Additional file 1: Figure S9a) as described in the “Methods” section. Deconjugation by BSH at the lowest substrate concentration (0.1 mM) was confirmed by LC-MS/MS (right panel of Fig. 5, Additional file 2: Table S9). The results of the enzyme activity assay showed that BSH-T0, BSH-T1, BSH-T3, BSH-T4, and BSH-T7 exhibited higher specific activity (the activity of an enzyme per milligram of total protein, using the highest enzyme activity as 100% for each substrate) across the seven bile salts (middle and right panel of Fig. 5, Additional file 2: Table S8). However, different BSH phylotypes displayed selective deconjugation activity depending on the substrate. The specific activity of BSH-T5 was lower when the substrate was glycocholic acid (GCA), and that of BSH-T6 was lower with both GCA and taurocholic acid (TCA) (middle and right panel of Fig. 5, Additional file 1: Figure S18). In particular, BSH-T1, BSH-T3, and BSH-T4 showed the highest specific activity with GCA, while BSH-T0 and BSH-T2 showed the highest specific activity with glycochenodeoxycholic acid (GCDCA), and BSH-T0 showed the highest specific activity when taurochenodeoxycholic acid (TCDCA) was the substrate (Additional file 1: Figure S18). It is worth noting that BSH-T1, which had the highest abundance of BSHs in the human gut microbiota, exhibited higher deconjugation activity (middle and right panel of Fig. 5). BSH-T2 showed lower enzyme activity both in silico and in vitro (Fig. 5, Additional file 1: Figure S12, S18). In contrast, BSH-T7 showed higher deconjugation activity in vitro but not in silico (middle and right panel of Fig. 5, Additional file 1: Figure S17, S18). This discrepancy was likely because the computational molecular docking only partially reflected the actual enzyme activities. Nevertheless, computational work is helpful for understanding biological activity at the molecular level.
Variation in patterns of BSHs in health/disease cohorts
To further investigate the functional implications of BSH, we analyzed the relationship between the RA of BSHs and disease status of different groups, including populations of geographical regions and health/disease cohorts.
First, we analyzed the relationship between the RA of BSHs in our target populations and the phenotypes released by the World Health Organization (WHO), namely, death rate of diabetes, death rate of cardiovascular diseases (CVD), mean blood cholesterol, and BMI of obesity (Additional file 1: Figure S19, Additional file 2: Table S10). We found that the RA of all BSHs was significantly correlated with death rate of diabetes (r = − 0.65, p = 0.03) (Additional file 1: Figure S19a, entry 7 of Additional file 1: Figure S19c), and the RA of BSH-T0 was significantly correlated with death rate of CVD (r = 0.92, p = 0.0041) (Additional file 1: Figure S19b, entry 7 of Additional file 1: Figure S19c). These findings might indicate that the RA of BSHs was relevant to diabetes and CVD risk among different populations. However, it should be noted that the metagenome data and epidemiological data did not originate from the same cohort in this association study, which may be a limitation of this investigation. Nevertheless, these results still shed light on the relationship of BSH with various diseases.
In our study, BSH protein sequences (BSHs) were used for BLASTP searches to determine their sensitivity relative to other gene sequences. The average sequence identity of BSHs between genera was 44.46%, while it was 82.13% within genera (Additional file 1: Figure S3b). Thus, the BLOSUM45 matrix and a cutoff of 45% identity were used to build the reference BSHs while ensuring that all BSHs would be screened from the HMP database. Additionally, the BLOSUM62 matrix with an e-value of 1e-5 and an identity of 62% as cutoffs was used to identify BSH sequences in each individual, to ensure accurate taxonomic assignment of BSHs in the human gut microbiota.
To the best of our knowledge, there have been few studies of paralogs of the BSH gene in bacterial genomes. Based on our results, the taxonomic characteristics of reference BSHs showed that 27.52% of the BSH-encoding bacteria harbored paralogs (Fig. 2d). The presence of paralogs may cause differences in the BSHs and their functions within the same bacterial strain. Thus, we reclassified the 156 BSHs into eight phylotypes based on the phylogenetic tree, and the taxonomic characteristics of different phylotypes were diverse. Interestingly, most of the paralogs distributed between phylotypes belonged to BSH-T5 and BSH-T6 and were found in Bacteroides (Additional file 1: Figure S8). At the bacterial level, Yao et al. performed a screen of the BSH activity of 20 Bacteroidetes strains found in the human gut, and the majority of these strains displayed some degree of selectivity for conjugated bile acid substrates. At the genetic level, our results have demonstrated that different BSHs in the same Bacteroides strain also differ in deconjugation ability. Moreover, BSHs from the same strain that belonged to different phylotypes, i.e., representative BSHs of BSH-T5 and BSH-T6 from Bacteroides uniformis ATCC 8492, could also exhibit different deconjugation ability (Fig. 5).
Because of their importance in human metabolism, there have been several studies investigating BSH. However, most previous studies focused on cloning BSH from a limited range of bacteria, such as Bacteroidetes strains, Lactobacillus [28, 36, 37, 38, 39], Bifidobacterium [40, 41, 42], Listeria [43, 44], Enterococcus [45, 46], and Clostridium. Fang et al. revealed the complexity of bile resistance level determination in commensal L. salivarius strains. Sun et al. presented a robust phylogenomic framework of existing species and for classifying new species of Lactobacillus. Liang et al. divided BSH sequences from Lactobacillaceae into five groups based on phylogenetic analysis, while Jones et al. divided BSH genes identified in fecal microbiota of 15 humans into three clusters and performed a functional analysis of BSH activity. The present study benefited from the use of next-generation sequencing technology and the abundance of public metagenomic databases, enabling a comprehensive investigation of BSH. First, we presented the taxonomic characteristics of BSHs in the human gut microbiome as widely and comprehensively as possible. Then, we proposed a new classification framework of BSHs based on phylogenetic relationship and functionally separated and characterized them both in silico and in vitro. Moreover, we disclosed the relationship between the abundance of BSHs and diseases from public gut metagenome cohort data. As a result, the present study of BSH is more comprehensive than prior studies and provides a phylogenetic-based system to evaluate BSH activity by classifying the target sequence into a specific phylotype.
Several previous studies have reported that changes in bile acids are closely connected with blood cholesterol management, diabetes [50, 51, 52, 53, 54], obesity, inflammatory bowel disease (IBD), Crohn’s disease, and cardiovascular diseases (CVD) [5, 58, 59, 60, 61, 62, 63]. Furthermore, Labbé et al. investigated the change of bile acid modification genes from IBD and T2D patients with bacterial taxonomic analysis in the gut microbiome. The reported association between bile acids and various diseases is pleiotropic, and both extremely low and extremely high bile acid levels may be related to disease. In this paper, we investigated the relationship between the relative abundance (RA) of BSHs and the above-mentioned diseases. Specifically, the RA of the phylotype with the highest BSH activity (BSH-T3) was higher in patients with AS, while the RA of the phylotypes with intermediate activity (BSH-T5, T6) was higher in healthy people (Fig. 6). The results suggested that BSH usually has a beneficial effect on metabolism, but that a higher RA of a high-activity BSH phylotype may have adverse effects and promote the occurrence and development of diseases. It also demonstrates the importance of BSH classification: exploring the relationship between the RA of BSH and disease should first distinguish specific BSHs by activity assessment rather than using the crude RA of total BSHs. Taken together, BSH indeed plays a role in some diseases, but its signals could be pleiotropic, and causal analysis of the relationship between BSH and bile acid metabolism-related phenotypes still requires subsequent biological studies.
It should be noted that this study has several limitations: (1) the data quality among different metagenome databases is not consistent (e.g., sample sizes, sequencing depth, individuals selected, and standards differed); (2) the metagenome data lack individual dietary and other information such as data describing climate, local habitats, and lifestyle; (3) the individuals described by the metagenome data and epidemiological data of the WHO were not directly related in the populational-level correlation analysis; (4) the results of BSH activity comparison in silico and in vitro were not exactly identical; and (5) the in vivo effects of different BSHs on various bile acids were not evaluated. Despite the limitations, the results are still valid for two reasons: the activity of different BSHs is evaluated by substantial enzyme kinetics and confirmed by LC-MS/MS approach; the relationship of BSHs and diseases is assessed by both populational association and cohort study with big sample sizes.
In this study, we describe the distribution of bacteria that express BSH and discuss their distribution and abundance in worldwide populations. Based on the above series of analyses, we propose a new method to reclassify the BSHs and compare the enzyme activities between phylotypes. Moreover, we found that the RA of some BSH phylotypes is significantly correlated with T2D and AS, but the effects are pleiotropic, which highlights the importance of BSH classification in future studies of BA metabolism-related diseases.
Sequences and structures comparison
The known 3D structures of BSHs were searched in the Protein Data Bank (PDB) using the keywords “bile salt hydrolase” and “Choloylglycine hydrolase”. Only 12 of the 25 search results were the structures we were looking for (Additional file 1: Figure S2b). Pairwise amino acid sequence alignments of the BSHs were performed by BLASTP (v 2.2.29+), and multiple alignments were conducted using DNAMAN (v 8.0). The secondary structures of these BSHs were downloaded from PDBsum. Comparison of 3D structures was conducted using MOE (v 2014).
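To make the notion of pairwise sequence identity concrete, the following minimal sketch computes percent identity from two already-aligned sequences. It is not the BLASTP implementation: it assumes identity is the number of identical columns divided by the number of alignment columns, and the two short sequences are invented for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # a column of two gaps carries no information
        columns += 1
        if a == b:
            matches += 1  # identical residues (a gap never matches a residue)
    if columns == 0:
        return 0.0
    return 100.0 * matches / columns


# Invented toy alignment: 4 identical columns out of 6.
identity = percent_identity("MCTA-G", "MCSAQG")
```

Other tools may use a different denominator (for example, the length of the shorter sequence), so values from this sketch are not guaranteed to match reported identities exactly.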
The reference genomes of 1751 bacterial strains covering 1253 species were obtained from the HMP database in September 2014. The genes and related proteins from these bacterial genomes were predicted by MetaGeneMark (v 2.8), and the taxonomic information regarding these genes and proteins was directly extracted from the strain names.
The query BSH protein sequences were collected from the NCBI RefSeq database using the keywords “bile salt hydrolase” and “Choloylglycine hydrolase”. Then, the sequences were restricted to those with a total length between 300 and 400 amino acids. The sequence identity intervals of the BSHs between and within genera were calculated from the resulting 765 sequences to define the thresholds. Finally, 591 reference BSH sequences (also restricted to a total length of 300 to 400 amino acids) were identified from the HMP database by taking the initial 765 BSHs as queries and using BLASTP with an e-value of 1e-5 and a sequence identity of 45% as cutoffs (Additional file 1: Figure S2a).
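The screening thresholds described above (e-value 1e-5, identity 45%, length 300 to 400 residues) can be sketched as a filter over BLASTP tabular output. This is a minimal illustration rather than the pipeline used in the study: it assumes hits were exported in the standard 12-column tabular format (`-outfmt 6`), treats the length bounds as inclusive, and uses a hypothetical function name, hypothetical sequence IDs, and invented hit rows.

```python
from typing import Dict, Iterable, Set


def select_bsh_candidates(
    blast_rows: Iterable[str],
    subject_lengths: Dict[str, int],
    min_identity: float = 45.0,
    max_evalue: float = 1e-5,
    min_len: int = 300,
    max_len: int = 400,
) -> Set[str]:
    """Return subject IDs with at least one hit passing all thresholds."""
    kept: Set[str] = set()
    for row in blast_rows:
        fields = row.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed or comment lines
        subject_id = fields[1]
        identity = float(fields[2])  # column 3: percent identity
        evalue = float(fields[10])   # column 11: e-value
        length = subject_lengths.get(subject_id, 0)
        if (
            identity >= min_identity
            and evalue <= max_evalue
            and min_len <= length <= max_len
        ):
            kept.add(subject_id)
    return kept


# Invented hits: only protA passes identity, e-value, and length filters.
rows = [
    "q1\tprotA\t52.3\t320\t150\t2\t1\t320\t1\t322\t1e-90\t300",
    "q1\tprotB\t38.0\t310\t190\t3\t1\t310\t1\t312\t1e-40\t150",
    "q2\tprotC\t60.1\t330\t130\t1\t1\t330\t1\t331\t1e-3\t40",
    "q2\tprotD\t70.0\t500\t150\t0\t1\t500\t1\t500\t1e-120\t600",
]
lengths = {"protA": 322, "protB": 312, "protC": 331, "protD": 520}
candidates = select_bsh_candidates(rows, lengths)
```

The same function could be reused for the per-individual search by passing `min_identity=62.0`, the stricter cutoff mentioned elsewhere in the paper.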
Publicly available metagenomic sequence data
The metagenomic sequence data of individuals were collected from 11 populations on six continents, including the Hadza ethnic group of Tanzania (HZ) of Africa [33, 67], China (CN) of Asia [50, 68], Japan (JP) of Asia, South Korea (KR) of Asia, Denmark (DK) of Europe, Sweden (SE) of Europe, Austria (AT) of Europe, France (FR) of Europe, Australia (AU, PRJEB6092) of Oceania, the United States (US) of North America (N-America) [21, 32], and Peru (PE) of South America (S-America).
To construct metagenomic datasets of healthy individuals from each country, we selected individuals with 18.5 < BMI < 29.9 kg/m2 [69, 75] and 12 < age < 75 [69, 76], and excluded those designated with diseases. Although we could not access the metadata for individuals from the United States, we used all data from healthy individuals with an average BMI of 24 ± 4 kg/m2 for this cohort. Finally, a total of 581 healthy individuals were selected from these 11 countries for analysis (Additional file 2: Table S3).
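The inclusion criteria above amount to a simple metadata filter. The sketch below assumes per-individual records with BMI, age, and an optional disease label; the `Individual` class, its field names, and the example records are hypothetical and only illustrate the stated cutoffs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Individual:
    sample_id: str
    bmi: float                     # kg/m2
    age: float                     # years
    disease: Optional[str] = None  # None means no recorded diagnosis


def is_healthy_control(person: Individual) -> bool:
    """Apply the stated cutoffs: 18.5 < BMI < 29.9, 12 < age < 75, no disease."""
    return (
        person.disease is None
        and 18.5 < person.bmi < 29.9
        and 12 < person.age < 75
    )


# Invented records for illustration only.
cohort: List[Individual] = [
    Individual("S1", bmi=22.4, age=35),
    Individual("S2", bmi=31.0, age=40),                 # BMI too high
    Individual("S3", bmi=24.0, age=80),                 # too old
    Individual("S4", bmi=23.0, age=50, disease="T2D"),  # diagnosed
]
healthy_ids = [p.sample_id for p in cohort if is_healthy_control(p)]
```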
To study the differences in the BSH relative abundance (RA) between healthy individuals and patients, we used metagenomic datasets of individuals from different countries who were identified with diseases expected to be related to deviations in bile acids, such as colorectal cancer (CRC), adenoma, type 2 diabetes (T2D), impaired glucose tolerance (IGT), and atherosclerosis (AS) (Additional file 2: Table S3).
Metagenomic analysis based on different populations
All raw sequencing reads were assessed and filtered using the FASTX-Toolkit, and the high-quality microbiome sequencing reads were assembled with the SOAPdenovo2 (v 2.04) package. After assembly, contigs of at least 500 bp were used to predict genes with MetaGeneMark (v 2.8), after which a non-redundant protein set was constructed by pairwise comparison of all protein sequences within populations using BLAT (v 35x1) with 95% identity and 90% overlap thresholds. The relative abundance (RA) of each protein sequence in each individual was calculated as the number of read pairs mapped to the gene divided by the length of the protein sequence, and then divided by the sum of these sequence abundances per individual. The cumulative RA was calculated as the sum of the RA of each genus in each population. The BSHs were identified in the above population database using BLASTP with an e-value of 1e-5 and 62% sequence identity as the cutoff values.
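The relative abundance calculation described above can be written compactly: each sequence's read-pair count is divided by its length, and the result is normalized by the sum over all sequences in that individual. The sketch below follows that description under the assumption that counts and lengths are available as dictionaries; the sequence names, genus labels, and numbers are invented.

```python
from collections import defaultdict
from typing import Dict


def relative_abundance(
    read_pairs: Dict[str, int], lengths: Dict[str, int]
) -> Dict[str, float]:
    """Length-normalized abundance of each sequence, scaled to sum to 1."""
    density = {seq: read_pairs[seq] / lengths[seq] for seq in read_pairs}
    total = sum(density.values())
    if total == 0:
        return {seq: 0.0 for seq in density}
    return {seq: value / total for seq, value in density.items()}


def cumulative_ra_by_genus(
    ra: Dict[str, float], genus_of: Dict[str, str]
) -> Dict[str, float]:
    """Sum the RA of all BSH sequences assigned to the same genus."""
    totals: Dict[str, float] = defaultdict(float)
    for seq, value in ra.items():
        genus = genus_of.get(seq)
        if genus is not None:  # sequences without a BSH genus label are skipped
            totals[genus] += value
    return dict(totals)


# Invented counts for one individual.
read_pairs = {"bsh_1": 30, "bsh_2": 10, "other": 960}
lengths = {"bsh_1": 300, "bsh_2": 100, "other": 320}
genus_of = {"bsh_1": "Bacteroides", "bsh_2": "Bacteroides"}

ra = relative_abundance(read_pairs, lengths)
by_genus = cumulative_ra_by_genus(ra, genus_of)
```

In this toy case both BSH sequences have the same length-normalized density, so they receive equal RA even though their raw read counts differ threefold.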
A phylogenetic tree was built using the maximum likelihood method in the MEGA software (v 7.0). Dendroscope (v 3.4.7) was used to embellish the phylogenetic tree by adjusting the labels and filling the colors as needed.
Related physiological/diseases data
Nationwide blood cholesterol (2008), obese BMI (2014), diabetes death rate (per 100,000 individuals; 2012), and cardiovascular disease death rate (per 100,000 individuals; 2012) data by country were downloaded from the World Health Organization (WHO) (Additional file 2: Table S10).
Homology modeling and molecular docking
The online software Protein Homology/analogY Recognition Engine (Phyre2, V 2.04) was employed to predict the homologous structure of BSHs using intensive mode. Ligands including glycocholic acid (GCA, C26H43NO6), glycochenodeoxycholic acid (GCDCA, C26H43NO5), glycine deoxycholic acid (GDCA, C26H43NO5), taurocholic acid (TCA, C26H45NO7S), taurochenodeoxycholic acid (TCDCA, C26H45NO6S), taurodeoxycholic acid (TDCA, C26H45NO6S), and tauroursodeoxycholic acid (TUDCA, C26H45NO6S) were obtained from the ZINC database. The AutoDock program (v 4.2.6) was employed to generate an ensemble of docked conformations for each ligand bound to its target. Four units of BSH were found to bind single bile acid substrate molecules by forming a tetramer. We then used the genetic algorithm (GA) for conformation searches and conducted 100 individual GA runs to generate 100 docked conformations for each ligand.
Materials, bacterial strains, and vectors
Sodium glycocholate hydrate (CAS: 863-57-0), sodium glycochenodeoxycholate (CAS: 16564-43-5), sodium glycodeoxycholate (CAS: 16409-34-0), sodium taurocholate hydrate (CAS: 345909-26-4), sodium taurochenodeoxycholate (CAS: 6009-98-9), sodium taurodeoxycholate hydrate (CAS: 207737-97-1), sodium tauroursodeoxycholate (CAS: 35807-85-3), CA-d4 (CAS: 116380-66-6), DCA-d4 (CAS: 112076-61-6), CDCA-d4 (CAS: 99102-69-9), UDCA-d4 (CAS: 347841-46-7), glycine (CAS: 56-40-6), and taurine (CAS: 107-35-7) were used in the enzyme assay. pET28a (+) was used for the expression of His-tagged (6x) recombinant BSH (Genscript, Nanjing, China). The target BSH sequences were ligated into the NcoI/XhoI-digested vector pET28a (+) (T7 promoter), and the resulting eight recombinant pET28a (+)-bsh plasmids were transformed into E. coli BL21 (DE3) competent cells.
Expression and purification of BSH
The eight bsh-recombinant bacteria were inoculated into Luria-Bertani (LB) broth containing 50 μg·ml−1 kanamycin, and expression of the BSH genes was induced by the addition of isopropyl β-D-thiogalactoside (IPTG, 0.5 mM). The cell pellet obtained by harvesting was subsequently resuspended (1:10) in buffer solution (0.05 M Tris, 0.05 M NaCl, 0.5 mM EDTA, 5% Glycerol, pH 7.9) and disrupted by sonication (alternating pulses: on for 3 s, off for 3 s; 60% amplitude) until clear solutions were obtained.
The BSH was purified on a nickel-nitrilotriacetic acid (Ni2+-NTA) agarose column (HisTrap™ HP, GE Healthcare, USA). The presence and purity of BSH in each sample were confirmed by SDS-PAGE. The pure BSH proteins were then lyophilized and stored at − 80 °C.
Kinetics of BSH reaction
The BSH specific activity was determined by measuring the release of amino acids (glycine and taurine) from conjugated bile salts. The amounts of amino acids liberated by BSH were determined by ninhydrin assay. Protein concentrations were determined using a Bradford Protein Assay kit (Beyotime, Nanjing, China). The purified, lyophilized BSH powders were diluted with 20 mM phosphate buffer (pH 6.5) to a BSH concentration of 0.65 mg·ml−1. Next, 10 μl of BSH solution, 10 μl of bile salts, and 180 μl of reaction buffer (20 mM phosphate buffer, pH 6.5) were mixed, and 10 μl of liquid paraffin was added. The samples were subsequently incubated at 37 °C for 30 min, after which 200 μl of 15% (w/v) trichloroacetic acid was added to terminate the reaction, and the sample was centrifuged to remove the precipitates. Then, 10 μl of reaction supernatant was mixed with 190 μl of ninhydrin reagent and kept in a boiling water bath for 15 min. The absorbance at 570 nm was measured in a 96-well plate. Standard curves were prepared for glycine and taurine (Additional file 1: Figure S11b). One unit of BSH activity was defined as the amount of enzyme that released 1 μmol of amino acid from the substrate per min.
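The unit definition above translates directly into a specific activity calculation. The sketch below assumes that the amount of released amino acid (in μmol) has already been read from the ninhydrin standard curve; the default enzyme volume, enzyme concentration, and incubation time are taken from the protocol, while the function names and example values are hypothetical.

```python
from typing import Dict


def specific_activity(
    released_umol: float,
    minutes: float = 30.0,
    enzyme_volume_ul: float = 10.0,
    enzyme_conc_mg_per_ml: float = 0.65,
) -> float:
    """Specific activity in U/mg, where 1 U releases 1 umol amino acid per min."""
    enzyme_mg = enzyme_conc_mg_per_ml * enzyme_volume_ul / 1000.0  # ul -> ml
    units = released_umol / minutes
    return units / enzyme_mg


def relative_activity(activities: Dict[str, float]) -> Dict[str, float]:
    """Express each activity as a percentage of the highest one (set to 100%)."""
    top = max(activities.values())
    if top == 0:
        return {name: 0.0 for name in activities}
    return {name: 100.0 * value / top for name, value in activities.items()}


# Invented amounts of released amino acid (umol) for two phylotypes.
example = {
    "BSH-Tx": specific_activity(0.0195),
    "BSH-Ty": specific_activity(0.00975),
}
percentages = relative_activity(example)
```

With these defaults the reaction contains 0.0065 mg of enzyme, so 0.0195 μmol released in 30 min corresponds to about 0.1 U/mg.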
Analysis was performed on a triple quadrupole liquid chromatography-tandem mass spectrometry (LC-MS/MS) system (LCMS8050, Shimadzu). Chromatographic separation was performed on an ACQUITY UPLC BEH C18 (2.1 × 100 mm, 1.7 μm) column. The mobile phase consisted of 0.1% formic acid in water (mobile phase A) and 0.1% formic acid in acetonitrile (mobile phase B) running at a flow rate of 0.4 ml/min. The gradient elution program was 5% B at 0–1 min, 5–42% B at 1–2.5 min, 42–45% B at 2.5–5.5 min, 45–60% B at 5.5–8 min, 60–95% B at 8–9 min, and 95% B at 9–9.5 min, followed by a return to initial conditions and 2 min of equilibration. The column was maintained at 55 °C and the injection volume of each sample was 1 μl.
LC-MS/MS system control and data analysis were performed with LabSolutions software (version 5.65). The ion source parameters were set as follows: nebulizing gas flow of 2 l/min, heating gas flow of 10 l/min, interface temperature of 300 °C, DL temperature of 250 °C, heat block temperature of 400 °C, and drying gas flow of 10 l/min. The data were collected by multiple reaction monitoring (MRM) in negative mode.
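For readers who want to reproduce the gradient on another system, the program can be written as a table of breakpoints with linear interpolation between them. The sketch below is illustrative only: it assumes linear ramps within each stated interval, covers only the 0–9.5 min window (the duration of the return to initial conditions is not given), and uses hypothetical names (`GRADIENT`, `percent_b`).

```python
from bisect import bisect_right

# (time in min, % mobile phase B) breakpoints taken from the gradient program.
GRADIENT = [(0.0, 5.0), (1.0, 5.0), (2.5, 42.0), (5.5, 45.0),
            (8.0, 60.0), (9.0, 95.0), (9.5, 95.0)]


def percent_b(t_min):
    """Return % B at time t_min, assuming linear ramps between breakpoints."""
    times = [t for t, _ in GRADIENT]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time is outside the tabulated 0-9.5 min window")
    # Index of the breakpoint that closes the segment containing t_min.
    i = min(bisect_right(times, t_min), len(times) - 1)
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0.5, 1.75, 4.0, 9.25):
    print(t, percent_b(t))
```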
An isotope-labeled standard calibration approach with standard addition was used to avoid matrix effects and ensure the accuracy of measurement. Calibration curves were constructed each day using seven calibrators prepared from pooled plasma spiked with CA, DCA, CDCA, and UDCA over a concentration range of 0.00625–0.2 mM and with CA-d4, DCA-d4, CDCA-d4, and UDCA-d4 at 0.003125 mM (Additional file 1: Figure S21, S22).
All values were expressed as mean ± SEM. Statistically significant differences between two groups were determined by the Mann-Whitney U test followed by false discovery rate correction. Correlations were assessed with Spearman rank correlation. Multiple comparisons were performed by multivariable-adjusted analysis using a linear regression model. All analyses were performed using R (v 3.3.2), and p < 0.05 was considered statistically significant.
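The quantification step implied by this calibration can be sketched as a linear fit of the analyte-to-labeled-standard peak-area ratio against the spiked concentration. Everything numeric below is a placeholder: the seven calibrator levels are not listed in the text, so the values are hypothetical points within the stated 0.00625–0.2 mM range, and the area ratios are idealized. The function names are likewise illustrative.

```python
def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(analyte_area, standard_area, slope, intercept):
    """Concentration (mM) from the analyte / labeled-standard area ratio."""
    ratio = analyte_area / standard_area
    return (ratio - intercept) / slope


IS_CONC_MM = 0.003125  # labeled standard concentration stated in the text

# Hypothetical calibrator levels (mM) and idealized area ratios.
calibrators_mm = [0.00625, 0.0125, 0.025, 0.05, 0.1, 0.15, 0.2]
area_ratios = [c / IS_CONC_MM for c in calibrators_mm]

slope, intercept = fit_line(calibrators_mm, area_ratios)
print(round(quantify(1600.0, 100.0, slope, intercept), 4))
```

One curve would be fitted per analyte and per day (CA against CA-d4, DCA against DCA-d4, and so on). With calibrators prepared in pooled plasma, a non-zero intercept would be expected from endogenous bile acids.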
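Two of these steps are simple enough to illustrate directly. The text does not name the false discovery rate procedure, so the sketch below assumes the Benjamini-Hochberg method (the "BH" option of R's `p.adjust`). The input values are made up for illustration, and the function names are hypothetical.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a sample."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down, enforcing monotonicity.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = m - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


print(mean_sem([1.0, 2.0, 3.0, 4.0, 5.0]))
print(bh_adjust([0.01, 0.04, 0.03, 0.20]))
```

In practice, the raw p-values would come from the per-feature Mann-Whitney U tests and the adjusted values would be compared with the 0.05 threshold.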
The authors are grateful to all members of the State Key Laboratory of Natural Medicines in China Pharmaceutical University. The high-performance computing resources and services used in this work were supported by the High-Performance Computing Center of China Pharmaceutical University.
This work was strategically funded by the National Natural Science Foundation of China (Grant Nos. 31670495 and 81421005), a Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), and the Top-notch Academic Programs Project of Jiangsu Higher Education Institutions (TAPP). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Availability of data and materials
All data generated in this manuscript are included in additional files.
JL conceived and designed the study. ZS, YC, XL, XW, XL, YC, JL, LJ, JS and JL collected the data and performed the analyses. JL and ZS interpreted the data and wrote the manuscript. PK polished the manuscript. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.