1 Introduction

The pathogenic bacterium Vibrio cholerae can increase or decrease the expression of its virulence genes through specific and global transcriptional regulators in response to environmental conditions. Two bacterial virulence factors, the toxin-co-regulated pilus (TCP) and cholera toxin (CT), are responsible for the severe watery diarrhea in individuals infected with Vibrio cholerae. The expression of these virulence genes is regulated mainly by a transcription factor called Integration Host Factor (IHF), which is a positive regulator of transcription of the virulence genes. The transcriptional regulatory activity of IHF, however, depends on the presence of other specific and global transcriptional regulators, notably ToxT, a specific regulator, and H-NS, a global regulator [1,2,3,4]. Earlier studies revealed that H-NS acts as a homo-dimer and binds to DNA with the consensus motif TCGATAAATT. IHF, on the other hand, is a hetero-dimer of the subunits IHFα and IHFβ which, after binding to the DNA, can bend it by 180° [1, 5, 6]. It has been proposed that the activation of virulence genes by IHF occurs in the presence of H-NS [1]. H-NS is a repressor of transcription of the virulence genes, and the binding of H-NS to the DNA prevents their transcription. IHF initiates the transcription process by displacing H-NS from the DNA. However, the detailed molecular mechanism of this process of virulence gene activation has not been reported to date. In the present work, an attempt has been made to analyze the molecular biochemistry of the virulence gene activation process in Vibrio cholerae by IHF from a structural point of view. No structures of IHFα, IHFβ, and H-NS from Vibrio cholerae are available to date. Therefore, homology models of these proteins were built using a structural bioinformatics approach.
The models of the proteins were docked with the corresponding DNA regions, and the docked protein–DNA complexes were then subjected to molecular dynamics (MD) simulations. The binding energies of the complexes were calculated from the MD simulations, and the results of the MD studies were analyzed to predict the probable mode of binding. The analyses reveal the details of the DNA-IHF and H-NS-DNA interactions and, finally, the DNA-H-NS-IHF interactions reveal the mechanism by which IHF displaces H-NS from the DNA. This is the first bioinformatics approach towards understanding the molecular mechanism of virulence gene expression in Vibrio cholerae from a structural point of view. This study may therefore be useful for future genetic studies that analyze the roles of the various amino acid residues of IHF and H-NS in DNA binding and in their mutual interactions. The interacting amino acids from these proteins may be targeted to develop new drugs to prevent the spread of Vibrio cholerae infections.

2 Materials and Methods

2.1 Sequence Analysis and Homology Modeling of IHFα, IHFβ and H-NS

The amino acid sequences of IHFα and H-NS were extracted from GenBank (id: 147673436 for IHFα and 147673645 for H-NS) and that of IHFβ from Swissprot (id: A5F6Y4). These sequences were used to search the Brookhaven Protein Data Bank (PDB) [7] with the software BLAST [8] to find suitable templates for homology modeling. For IHFα, the BLAST search picked up the X-ray crystal structure of the single chain Integration Host Factor protein (scIHF2) in complex with DNA from E. coli (PDB id: 2IIE, Chain A) with 75% sequence identity. For IHFβ, the BLAST search chose the X-ray crystal structure of mutant IHF (BetaE44A) complexed with the native H′ site from E. coli (PDB id: 1OWF, Chain B), again with 75% sequence identity. For H-NS, the BLAST result showed 73% sequence similarity with H-NS (DNA BINDING DOMAIN) (PDB id: 1HNR, Chain A). Homology models of the proteins were then built with Modeler [9] using the corresponding templates (2IIE, Chain A for IHFα; 1OWF, Chain B for IHFβ; and 1HNR, Chain A for H-NS). Since 1HNR is an NMR structure, it was first energy minimized using the CHARMM [10] force field before being used as the template to build the model of H-NS. The modeled structures of the proteins were then structurally aligned with their templates. The root mean squared deviations (RMSD) for the superimpositions were found to be 0.5 Å (for IHFα and IHFβ on to their corresponding crystal templates) and 0.8 Å for H-NS on 1HNR, Chain A. The models of the proteins were then energy minimized in two steps. In the first step, the modeled structures were minimized without fixing the backbones. In the second step, the energy minimizations were done with the backbones of the modeled proteins fixed to ensure proper interactions. All energy minimizations were carried out with the conjugate gradient (CG) method and the CHARMM force field until the structures reached a final derivative of 0.01 kcal/mole.
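The RMSD values quoted for the superimpositions can be illustrated with a minimal sketch of an optimal rigid-body superposition. The Kabsch algorithm used here is an assumption for illustration; the text does not state which superposition routine was applied, and the function name kabsch_rmsd is hypothetical. The inputs are assumed to be matched coordinates (for example, equivalent Cα atoms of a model and its template).

```python
import numpy as np

def kabsch_rmsd(P, Q):
    """RMSD (same units as the input) between two matched (N, 3)
    coordinate sets after optimal rigid-body superposition of P onto Q."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")
    # Remove the translational component by centering on the centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # The SVD of the covariance matrix yields the optimal rotation.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    P_rot = Pc @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Qc) ** 2, axis=1))))
```

In practice the two coordinate arrays would be read from the model and template structure files, which is outside the scope of this sketch.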

2.2 Validation of the Models

The main chain properties of the modeled proteins were found to be good, with no considerable bad contacts, Cα tetrahedron distortions, or hydrogen bond (H-bond) energy problems. The side chain properties were also accurately predicted, as observed from the side chain torsion angles. The Z-scores calculated with PROSA (https://prosa.services.came.sbg.ac.at/prosa.php) indicated that the predicted structures were good homology models. The residue profiles of the three-dimensional models were further checked with VERIFY3D [11]. PROCHECK [12] analyses were performed to assess the stereo-chemical qualities of the models, and Ramachandran plots [13] were drawn. No residues were found in the disallowed regions of the Ramachandran plots of the modeled proteins.
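A Ramachandran plot is a scatter of the backbone torsion angles φ and ψ of each residue. The sketch below shows how these two angles can be computed from backbone atom coordinates; it is an illustration only and not the PROCHECK implementation, and the function names are hypothetical.

```python
import numpy as np

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    # Unit vector along the central bond.
    b1 = b1 / np.linalg.norm(b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))

def phi_deg(c_prev, n, ca, c):
    """Backbone phi: C(i-1) - N(i) - CA(i) - C(i)."""
    return dihedral_deg(c_prev, n, ca, c)

def psi_deg(n, ca, c, n_next):
    """Backbone psi: N(i) - CA(i) - C(i) - N(i+1)."""
    return dihedral_deg(n, ca, c, n_next)
```

Each residue with both neighbours present contributes one (φ, ψ) point; whether that point falls in an allowed or disallowed region depends on the region definitions used by the validation program.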

2.3 Building the Model of the Promoter DNA

In order to find the interactions between promoter DNA and the proteins, the nucleotide sequences of the promoter regions from Vibrio cholerae were extracted. The nucleotide sequences of the binding regions for IHF and H-NS were taken from [1]. These nucleotide sequences were used separately to build two models of the corresponding DNA regions using the CHARMM software tool and then subjected to energy minimizations. The resulting energy minimized structures were used for docking studies.

2.4 Molecular Docking Simulation of IHF with the Corresponding DNA Region

It is known that IHF binds to DNA as a hetero-dimer of IHFα and IHFβ [1]. Thus, a model of the IHF hetero-dimer complex was built by docking IHFα with IHFβ using the software GRAMM [14]. The hetero-dimeric model of IHF obtained after docking was subjected to energy minimization as per the protocol described in Sect. 2.1. In order to elucidate the mode of binding between the DNA and the IHF hetero-dimer, the model of the IHF hetero-dimer and the DNA were docked using the software PatchDock [15]. The docked structure of the DNA–protein complex that yielded the best score was selected and analyzed visually using the DS modeling software suite. The docked complex was then energy minimized as per the protocol described in Sect. 2.1.

2.5 Molecular Docking Simulation of H-NS with the Corresponding DNA Region

H-NS binds to DNA as a homo-dimer. Therefore, a dimeric model of H-NS was built by docking the individual monomeric units together using the software GRAMM. The homo-dimeric model of H-NS obtained after docking was subjected to energy minimization as per the protocol described in Sect. 2.1. In order to elucidate the mode of binding between the DNA and the H-NS protein, the homo-dimeric model of H-NS and the DNA were docked using the software PatchDock. The docked structure of the DNA–protein complex that yielded the best score was selected and analyzed visually using the DS modeling software suite. The docked complex was then energy minimized as per the protocol described in Sect. 2.1.

2.6 Molecular Dynamics (MD) Simulation of H-NS-DNA Complex

The MD simulation of the DNA-H-NS protein complex was performed with the CHARMM module of the DS modeling software suite. The initial coordinates were extracted from the energy-minimized structure of the DNA-H-NS docked complex. The complex was then placed in an orthorhombic box with dimensions large enough to prevent self-interactions. The system was solvated with an adequate number of water molecules at the typical density of water at 298 K and 1.0 atm using the single point charge (SPC) model. The whole system was energy minimized, and the temperature was held constant at the body temperature of 310 K using the NPT dynamics protocol. A 100 ns dynamics run was then performed with the DNA–protein complex. The modes of interaction between H-NS and the corresponding DNA were then analyzed using the DS modeling software suite.
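As a rough illustration of the solvation step, the number of water molecules needed to fill an orthorhombic box at the density of liquid water can be estimated as follows. The box dimensions are not reported in the text, so the edge lengths passed to the function are hypothetical, and the density of about 0.997 g/cm³ at 298 K and 1 atm is an approximate literature value. The simulation software places the water molecules itself; this sketch only gives an order-of-magnitude check.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
WATER_MOLAR_MASS = 18.015     # g/mol
WATER_DENSITY_298K = 0.997    # g/cm^3, approximate value at 298 K and 1 atm

def estimate_water_count(box_a, box_b, box_c, solute_volume=0.0):
    """Estimate how many water molecules fill an orthorhombic box.

    box_a, box_b, box_c: box edge lengths in angstroms (hypothetical inputs).
    solute_volume: volume in cubic angstroms excluded by the solute.
    """
    free_volume_a3 = box_a * box_b * box_c - solute_volume
    if free_volume_a3 < 0:
        raise ValueError("solute volume exceeds box volume")
    free_volume_cm3 = free_volume_a3 * 1e-24  # 1 cubic angstrom = 1e-24 cm^3
    moles = WATER_DENSITY_298K * free_volume_cm3 / WATER_MOLAR_MASS
    return round(moles * AVOGADRO)

# A 10 x 10 x 10 angstrom cube of pure water holds about 33 molecules.
print(estimate_water_count(10.0, 10.0, 10.0))
```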

2.7 Calculation of Binding Free Energy of the H-NS-DNA Complex

In order to obtain a quantitative estimate of the interactions between H-NS and DNA, the H-NS-DNA complex was analyzed by the FoldX server [16]. The average solvated coordinates of the H-NS-DNA complex generated from the MD simulation were used as input to FoldX. FoldX calculates the free energy as the sum of the different energetic contributions along with entropy and temperature factors [16].

2.8 Molecular Dynamics (MD) Simulation of IHF-DNA Complex

The MD simulation of the DNA-IHF protein complex was performed using the CHARMM module of the DS modeling software suite. The initial coordinates were extracted from the energy-minimized structure of the DNA-IHF docked complex. The complex was then placed in an orthorhombic box with dimensions large enough to prevent self-interactions. The system was solvated with an adequate number of water molecules at the typical density of water at 298 K and 1.0 atm using the single point charge (SPC) model. The whole system was energy minimized, and the temperature was held constant at the body temperature of 310 K using the NPT dynamics protocol. A 100 ns dynamics run was then performed for the DNA–protein complex. The modes of interaction between IHF and the corresponding DNA were then analyzed using the DS modeling software suite.

2.9 Calculation of Binding Free Energy of the IHF-DNA Complex

To compare the binding energies of H-NS with DNA and IHF with DNA, the complex of IHF with DNA was also analyzed by the FoldX server. The average solvated coordinates of the IHF-DNA complex generated from the MD simulation were used as input to FoldX. To analyze the DNA bending capability of the IHF protein, the cationic residue density (Cpc) and DNA phosphate crowding (Cpp) were determined as in [17].
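The bending criterion relies on a correlation between Cpc and Cpp. The sketch below assumes that this correlation is a Pearson coefficient over paired Cpc and Cpp values and that 0.25 is the cutoff, as suggested by the "(> 0.25)" comparison reported in the Results; the exact definitions of Cpc and Cpp are those of [17] and are not reproduced here. The function names and any input arrays are hypothetical.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length 1-D series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.ndim != 1 or x.size < 2:
        raise ValueError("x and y must be 1-D arrays of equal length >= 2")
    dx = x - x.mean()
    dy = y - y.mean()
    denom = np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant series")
    return float(np.sum(dx * dy) / denom)

def suggests_bending(correlation, threshold=0.25):
    """Apply the assumed cutoff: a correlation above it suggests DNA bending."""
    return correlation > threshold
```

With paired Cpc and Cpp values in hand, `suggests_bending(pearson_r(cpc, cpp))` would reproduce the kind of comparison described in the text.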

2.10 Molecular Docking and Dynamics (MD) Simulation of the Docked Complex of H-NS-DNA with IHF

It was proposed that IHF hetero-dimer can displace H-NS bound to DNA [1]. Therefore, the model of the IHF hetero-dimeric complex was used to dock with the DNA-H-NS complex using the software GRAMM as per the protocol mentioned in Sect. 2.4. The docked structure of the DNA–protein complexes that yielded the best score was selected and analyzed visually using DS modeling software suite. The docked complex was then energy minimized as per the protocol previously mentioned in Sect. 2.1. The resulting energy-minimized structure was subjected to MD simulation using the CHARMM module of DS modeling software suite as per the steps mentioned before in Sects. 2.6 and 2.8.

2.11 Calculation of Binding Free Energy of the H-NS DNA Complex with IHF

It is known that the IHF hetero-dimer can displace H-NS from the H-NS-DNA complex [1]. Thus, to quantify the interactions, the average solvated coordinates of the H-NS-DNA-IHF complex generated from the MD simulation were used as input to FoldX. The binding free energies of all the aforementioned complexes were compared to justify the mechanism of the interactions.

3 Results

3.1 Structures of IHF and H-NS Proteins from Vibrio Cholerae

The IHF protein is a hetero-dimeric protein having two components IHFα and IHFβ. The IHFα and IHFβ components have only 35% sequence identity [1]. The sequences of the two proteins were therefore used separately to build their corresponding models. IHFα was built on the template of 2IIE, Chain A. The protein was an alpha beta protein with the N-terminal having two helices followed by a beta sheet having five antiparallel beta strands culminating in another helix (Fig. 1).

Fig. 1

Cartoon representation of the model of IHFα protein. The helices are shown in red. The sheets are shown in yellow. The remaining portions are coil regions

The IHFβ protein was built on the template 1OWF, Chain B. This protein was also an alpha beta protein with the N-terminal having two helices followed by a beta sheet having five antiparallel beta strands culminating in another helix (Fig. 2). The difference between the structures of IHFα and IHFβ lay in the number of type II turn regions: IHFα had two such turns whereas IHFβ had four. Moreover, when the backbone atoms of the models of IHFα and IHFβ were superimposed on to each other, the RMSD was found to be 2.21 Å. This large RMSD between the backbone atoms of the IHFα and IHFβ proteins implied that, though the two proteins had the same sets of secondary structural elements, their structural organizations were different.

Fig. 2

Cartoon representation of the model of IHFβ protein. The helices are shown in red. The sheets are shown in yellow. The remaining portions are coil regions

The structure of H-NS was built using 1HNR, Chain A as the template. The protein was mainly alpha helical, with the helices joined by loops. The C-terminal region of the protein had the signature sequence of a global transcriptional regulator. This region had an abundance of charged residues, making H-NS capable of binding to the DNA. The surface charge distribution of the H-NS protein is presented in Fig. 3.

Fig. 3

The surface charge distribution of H-NS. The electrostatically positive surface (blue) of the protein binds to the electrostatically negatively charged DNA

3.2 Interaction of IHF with DNA

The model of the IHF hetero-dimer was docked onto the model of the promoter DNA region. The dimerization of the two components of the IHF protein (i.e., IHFα and IHFβ) led to the formation of a compact globular domain with an extension of the beta sheet region from IHFα. The beta sheet region was found to contact the DNA in its minor groove. Analysis of the IHF hetero-dimeric structure with the PDBePISA server revealed that IHFα had comparatively less exposed surface area than IHFβ; the accessible surface area of IHFβ was 8116 Å² whereas that of IHFα was 7867 Å². The docked complex of the IHF hetero-dimer and DNA was subjected to MD simulations after energy minimization. The MD simulation results revealed that the IHFβ protein was mainly responsible for binding to the DNA and that the binding occurred mainly via hydrophobic stacking interactions between the DNA bases and the nonpolar side chains of the IHFβ protein. The residues from IHFβ that were involved in the DNA binding were Leu6, Ile7, Leu17, Leu30, Ile43, Val60, and Val70. In addition, there were interactions between the phosphate backbone of the DNA and the charged amino acid residues of IHFβ. The amino acids from IHFβ that were involved in ionic interactions as well as hydrogen bonding with the phosphate backbone of the DNA were Lys3, Arg9, Lys20, Lys27, and Lys90. Figure 4 represents the electrostatic surface charge distribution of the protein bound to the DNA. The nonpolar region on the protein surface (colored green) was predominantly bound to the hydrophobic regions on the DNA bases. Interestingly, as the MD run progressed, the DNA was observed to bend. The bending of the DNA allowed the IHFα protein to come in contact with the DNA, leading to the formation of DNA-IHFα interactions. The side chains of the amino acid residues Arg21, Lys24, Arg60, Arg76, and Arg77 of IHFα came near the phosphate backbone of the DNA after the DNA was bent.
This led to stronger interactions between the DNA and the IHF protein as a whole. The binding energy of the DNA-IHF complex, as calculated by the FoldX server, was 15.732 kcal/mole. The DNA bending capacity of the IHF protein was verified by measuring the correlation between Cpc and Cpp as in [17], and the correlation was found to be 0.31 (> 0.25). This means that IHF can indeed bend the DNA as per [17].
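One simple way to put a number on the bending seen in a trajectory frame is to compare the directions of the two halves of the DNA helix axis. The sketch below is an illustrative assumption and not the analysis reported here: it takes an ordered list of points along the helix axis (for example, base-pair centers, which are hypothetical inputs), fits a direction to each half, and returns the angle between the two directions.

```python
import numpy as np

def principal_direction(points):
    """Best-fit unit direction through ordered points, oriented first-to-last."""
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)
    # The first right-singular vector is the direction of largest variance.
    _, _, vt = np.linalg.svd(centered)
    direction = vt[0]
    if np.dot(direction, pts[-1] - pts[0]) < 0:
        direction = -direction
    return direction

def bend_angle_deg(axis_points):
    """Angle in degrees between the fitted directions of the two halves
    of an ordered set of helix-axis points (0 for straight DNA)."""
    pts = np.asarray(axis_points, dtype=float)
    if pts.ndim != 2 or pts.shape[1] != 3 or len(pts) < 4:
        raise ValueError("need at least four (x, y, z) axis points")
    half = len(pts) // 2
    d1 = principal_direction(pts[:half])
    d2 = principal_direction(pts[half:])
    cos_angle = np.clip(np.dot(d1, d2), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_angle)))
```

On this definition a straight duplex gives an angle near 0°, and a U-turn of the kind attributed to IHF in the Introduction gives an angle near 180°.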

Fig. 4

Interactions between DNA and the IHF protein. The hydrophobic protein surface (colored green) was attached to the hydrophobic DNA bases

3.3 Interaction of H-NS with DNA

The model of the H-NS-DNA complex was obtained by docking H-NS with DNA followed by MD simulation of the complex. Analysis of the complex obtained after MD simulation revealed that mainly the C-terminal helical region of the H-NS protein was involved in DNA binding. The interactions were mainly phosphate backbone interactions involving the side chains of Lys110, Gln115, Gln123, and Lys130 from H-NS. However, no DNA bending was observed in this case. The binding energy of the DNA-H-NS complex, as calculated by the FoldX server, was 11.156 kcal/mole. Compared with the value of 15.732 kcal/mole obtained for the DNA-IHF complex, this suggests that IHF interacts more strongly with DNA than H-NS does.

3.4 Interaction of IHF Hetero-Dimer with H-NS-DNA Complex

It is known that the IHF hetero-dimer can displace H-NS from the H-NS-DNA complex [1]. In order to study the details of the interactions, the model of the IHF hetero-dimer was docked on to the model of the H-NS-DNA complex, followed by MD simulation. When IHF was docked with the H-NS-DNA complex, IHFα started interacting with H-NS more strongly than H-NS was interacting with the DNA. The MD simulation run revealed that the amino acid residues of the C-terminal helical region of H-NS became involved in interactions with IHFα. The amino acid residues Asp104, Lys110, Gln115, Arg116, Lys124, Asp127, Glu128, and Lys130 from H-NS were bound to Lys5, Glu10, Lys20, Glu32, and Glu33 of IHFα. This led to the displacement of H-NS from its DNA bound state, facilitating the binding of IHFβ to the DNA, as a larger number of amino acid residues from IHFβ started interacting with the DNA (12 residues from IHFβ as compared to 4 residues from H-NS). In other words, the interactions between IHFα and H-NS became stronger than the interactions of H-NS with the DNA and, consequently, the interactions between IHFβ and DNA became stronger than the corresponding interactions of H-NS with DNA. In this way, the repression of virulence genes by H-NS is abolished and gene expression ensues. The IHF protein thus acts as an activator of transcription of the virulence genes. The overall binding energy of the H-NS-DNA-IHF complex, as calculated by the FoldX server, was 18.596 kcal/mole. This also signifies that the interactions between H-NS and IHFα are stronger than the interactions between H-NS and DNA. Thereby, IHF can displace H-NS from its DNA bound state.
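Residue counts such as the 12 IHFβ residues versus 4 H-NS residues in contact with the DNA are typically obtained with a distance criterion. The sketch below is an assumption about how such a count could be made, not the procedure of the DS modeling software suite: a residue is counted as contacting the DNA if any of its atoms lies within a cutoff distance of any DNA atom. The 4.0 Å cutoff, the function name, and the input layout are hypothetical.

```python
import numpy as np

def contacting_residues(protein_atoms, dna_coords, cutoff=4.0):
    """Return the sorted labels of residues with any atom within `cutoff`
    angstroms of any DNA atom.

    protein_atoms: iterable of (residue_label, (x, y, z)) pairs, one per atom.
    dna_coords: iterable of (x, y, z) DNA atom coordinates.
    """
    dna = np.asarray(list(dna_coords), dtype=float).reshape(-1, 3)
    contacts = set()
    if dna.shape[0] == 0:
        return []
    for label, xyz in protein_atoms:
        # Distance from this protein atom to every DNA atom.
        distances = np.linalg.norm(dna - np.asarray(xyz, dtype=float), axis=1)
        if distances.min() <= cutoff:
            contacts.add(label)
    return sorted(contacts)
```

Running the same function on the protein and DNA atoms of each complex, frame by frame, would give contact counts that can be compared between IHFβ and H-NS.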

4 Discussions

In the present work, an attempt has been made to analyze the probable molecular details of the interactions of IHF and H-NS with the corresponding DNA regions to predict the mode of virulence gene expression. It is known that the expression of the virulence genes by Vibrio cholerae occurs through a complex interplay between IHF, H-NS, and the corresponding promoter DNA [1]. Since there were no previous reports regarding the structural aspects of the mode of interactions, a molecular simulation approach was employed. The molecular simulation study involving these proteins revealed that the binding of hetero-dimeric IHF to DNA indeed bent the DNA. The interactions were mainly mediated by IHFβ with the minor groove of the DNA, facilitated by IHFα. Together, IHFα and IHFβ bent the DNA in such a way that the mutual interactions between the DNA and the IHF protein were enhanced. The DNA bending was further verified by calculating the correlation between Cpc and Cpp as in [17]. The correlation value of 0.31 indicates DNA bending by the IHF protein as per [17]. This phenomenon was again established when the H-NS-DNA complex was docked with the IHF hetero-dimer and the resulting complex was simulated. The complex obtained after MD simulation showed that IHFα formed stronger interactions with H-NS than H-NS did with the DNA. It was also apparent from the simulation results that IHFβ formed stronger interactions with DNA than H-NS did. The DNA bending by IHF helped the protein create more interactions with DNA than H-NS, as the bending made the DNA more accessible to IHFα, which initially could not interact with it. Calculations of the binding free energies of these complexes also supported the aforementioned observations. The hetero-dimeric IHF is known to bind to DNA in order to transcribe the virulence genes [1]. The MD simulation results also pointed towards the same.
The progress and completion of the MD simulations were monitored by plotting the RMSD of the backbone atoms of the docked complexes against simulation time. Since no previous reports have dealt with the mechanistic details of the DNA–protein interactions in the expression of virulence genes by Vibrio cholerae, the present work may be useful for analyzing the hitherto unknown molecular mechanism of virulence gene expression in this organism. The results from this study may be used in future genetic and mutational work to identify the functions of the amino acid residues involved in the interactions. This may further be utilized to develop new drugs against Vibrio cholerae infection.