Introduction

Helicases are motor proteins that use the free energy of NTP hydrolysis to catalyze the unwinding of duplex nucleic acids. Helicases participate in almost all processes involving nucleic acids. Their action is critical for replication, recombination, repair, transcription, translation, splicing, mRNA editing, chromatin remodeling, transport, and degradation (Matson and Kaiser-Rogers 1990; Matson et al. 1994; Mendonca et al. 1995; Luking et al. 1998).

A significant number of genes in all organisms encode helicases. A study in Saccharomyces cerevisiae revealed 134 ORFs encoding helicases (Shiratori et al. 1999), which would account for about 2% of the yeast genome. Similarly, more than 12 DNA helicases and about 17 RNA helicases have been identified in Escherichia coli (Matson 1991; Schmid and Linder 1992; Bird et al. 1998; Egelman 1998; Dreyfus 2006). This is not unexpected, considering that these enzymes are ubiquitous and are involved in such diverse metabolic roles. As is the case with bacteria and eukaryotes, most viruses also encode proteins with conserved helicase motifs.

Table 20.1 Viral genuses and their associated helicases

For viruses whose genomes consist of double-stranded DNA or RNA, the presence of a virus-encoded helicase is to be expected. This is indeed the case for most double-stranded DNA and RNA viruses for which the genome sequence has been reported (Gorbalenya et al. 1988a,b; Gorbalenya and Koonin 1989). Many positive-strand RNA viruses also encode their own helicases, presumably to remove any partial duplexes that might exist within the genome and to facilitate viral replication either directly or indirectly (Jeang and Yedavalli 2006). Some viruses encode more than one helicase, indicating a role for helicases in other viral processes such as packaging (Kadare and Haenni 1997; Luking et al. 1998). Table 20.1 gives a comprehensive list of viral genuses with their corresponding hosts. The table also indicates whether the virus encodes a helicase and, if so, the superfamily to which it has been assigned.

Classification of Helicases

Helicases can be broadly classified into two groups based on their substrate requirements: DNA helicases and RNA helicases. This, however, is not a very stringent classification, as many DNA helicases can unwind RNA and vice versa (Matson and Kaiser-Rogers 1990; Kadare and Haenni 1997; Luking et al. 1998). Helicases are also classified by the polarity of their translocation as 3ʹ→5ʹ helicases or 5ʹ→3ʹ helicases. A 3ʹ→5ʹ helicase requires a 3ʹ single-stranded tail to load onto the nucleic-acid substrate and moves unidirectionally toward the 5ʹ end of the substrate. Examples of 3ʹ→5ʹ helicases include HCV NS3 (Gwack et al. 1996), E. coli UvrD (Matson 1986), and the RNA helicase NPH-II from Vaccinia virus (Shuman 1993). A 5ʹ→3ʹ helicase requires a 5ʹ single-stranded tail to load and move along the nucleic acid. Examples of 5ʹ→3ʹ helicases include the T7 gp4 helicase–primase (Matson et al. 1983), E. coli DnaB (LeBowitz and McMacken 1986), and the phage T4 Dda helicase (Jongeneel et al. 1984). Though most helicases require a single-stranded tail to initiate unwinding, some enzymes like E. coli RecBCD can initiate the strand-separation reaction from a blunt-ended duplex (Braedt and Smith 1989).

Based on their oligomeric structure, helicases can be either ring-shaped or non-ring shaped. Although most of the non-ring shaped helicases are 3ʹ→5ʹ helicases and the ring-shaped helicases are predominantly 5ʹ→3ʹ helicases (Lohman 1993; Hall and Matson 1999; Patel and Picha 2000), there are a few exceptions to the rule; the hexameric E1 helicase from the Papilloma virus – a 3ʹ→5ʹ helicase (Hughes and Romanos 1993), and the monomeric Dda helicase from bacteriophage T4 – a 5ʹ→3ʹ helicase (Jongeneel et al. 1984) are a couple of examples.

Fig. 20.1

Superfamilial classification of helicases. Helicases can be broadly classified into families and superfamilies based on sequence similarities. The diagram includes both viral and non-viral helicases; the list is not exhaustive and does not include many of the plant helicases (Gorbalenya and Koonin 1993; Levin 2002).

Fig. 20.2

Conserved motifs in superfamilies. Primary sequence analysis has led to the identification of certain sequence motifs that are conserved between many helicases. These sequence similarities have resulted in the classification of the helicases into different superfamilies. Represented here are the consensus sequences for the different superfamilies in the N-terminus to C-terminus orientation. The spaces between the motifs are arbitrary (Gorbalenya and Koonin 1989; Gorbalenya et al. 1989; Ilyina et al. 1992; Hall and Matson 1999).

The broadest classification of helicases is based on their primary structure. The earliest such classification, by Gorbalenya and Koonin, was based on amino acid sequence similarities and revealed several conserved sequence motifs. Based on the extent of sequence similarity, they classified helicases into three large superfamilies, SF1, SF2, and SF3 (SF standing for superfamily), and two smaller families (Fig. 20.1) (Gorbalenya and Koonin 1993). SF1 and SF2 constitute the largest of these superfamilies. The proteins of these superfamilies share seven conserved motifs (Fig. 20.2). Two of these motifs, designated the Walker A and Walker B motifs, are conserved among all helicases and other nucleotide hydrolases, as they are implicated in NTP binding and hydrolysis (Gorbalenya et al. 1988a,b). The SF3 superfamily of helicases contains only three conserved motifs, including the Walker A and Walker B sequences (Fig. 20.2). This superfamily includes the majority of viral helicases (Gorbalenya et al. 1990). Of the two smaller families, one contains helicases related to the E. coli DnaB helicase. These proteins share three distinct conserved motifs in addition to the Walker A and Walker B sequences (Fig. 20.2). Only bacterial and bacteriophage members of this family have been identified so far, and all the (putative) helicases have been shown to have a functional and/or physical association with a primase function as well (Ilyina et al. 1992). The last group of proteins shows extensive sequence similarity to the transcription terminator Rho. This group also includes the AAA+ family of ATPases, demonstrating an apparent evolutionary relationship between helicases and non-helicase NTPases (Iyer et al. 2004).

In the past decade, new helicases have been identified and characterized from a variety of viruses, bacteria, archaebacteria, and plants. Novel protein motifs have been discovered that are conserved among helicases across different species. This called for a revision in the classification of the helicase superfamilies. In the current system of classification suggested by Singleton et al., the helicases are classified into six superfamilies, SF1 through SF6. SF1 and SF2 remain the largest of the superfamilies; SF3 continues to constitute the viral helicases. The DnaB-like family has been renamed superfamily 4 (SF4), and the Rho family SF5. The AAA+ ATPases are now classified into a stand-alone superfamily of their own, SF6 (Singleton et al. 2007). The conserved sequence motifs among SF1 and SF2 family members have been extended to include the TxGx motif (Pause and Sonenberg 1992), Q-motif (Tanner et al. 2003), motif-4a (Korolev et al. 1998), and TRG motif (Mahdi et al. 2003), some of which are specific to each superfamily or subfamilies therein (Singleton et al. 2007). Singleton et al. also propose to include the directionality of translocation as a criterion for classifying the members of a superfamily into subfamilies. In their classification, subfamily A represents 3ʹ→5ʹ helicases within the superfamily, while subfamily B represents 5ʹ→3ʹ helicases (Singleton et al. 2007).

Recent discoveries have identified helicases that either do not translocate (e.g., Swi/Snf) or translocate along duplex DNA (e.g., EcoR124I). The latter class of helicases has been referred to as translocases in order to differentiate them from bona fide helicases, which translocate along single-stranded substrates and bring about duplex unwinding. In the new system of classification, classic helicases are referred to as sub-type α, while the translocases are referred to as sub-type β. Under this system, Dda helicase from bacteriophage T4, a 5ʹ→3ʹ SF1 helicase, is classified as SF1Bα, while the NS3 helicase of HCV is classified as SF2Aα.
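The naming rule described above can be sketched as a small lookup. This is only an illustration of how the three criteria (superfamily, polarity, and translocation substrate) combine into a label; the function name, argument names, and string encodings are assumptions made for this sketch and do not come from Singleton et al.

```python
# Compose a helicase class label in the style described in the text:
# superfamily + polarity subfamily (A/B) + translocation sub-type (α/β).
# All names and string encodings here are illustrative assumptions.

POLARITY_TO_SUBFAMILY = {"3'->5'": "A", "5'->3'": "B"}
STRAND_TO_SUBTYPE = {"single": "α", "double": "β"}


def helicase_label(superfamily: int, polarity: str, translocates_on: str) -> str:
    """Return a label such as 'SF1Bα' from the three classification criteria."""
    if not 1 <= superfamily <= 6:
        raise ValueError("superfamily must be between 1 and 6")
    subfamily = POLARITY_TO_SUBFAMILY[polarity]
    subtype = STRAND_TO_SUBTYPE[translocates_on]
    return f"SF{superfamily}{subfamily}{subtype}"


print(helicase_label(1, "5'->3'", "single"))  # T4 Dda, as classified in the text
print(helicase_label(2, "3'->5'", "single"))  # HCV NS3, as classified in the text
```

The two calls reproduce the examples given in the text. Enzymes that do not translocate at all are not covered by this sketch.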

It is clear that classification by sequence homology does not correlate with other helicase taxonomies and that helicases with different substrate specificities or directionalities could still be classified under the same superfamily (Fig. 20.1). What this means is that minor changes in the amino acid sequence could result in changes in substrate specificity or polarity of the enzyme, but the overall mechanism of enzyme action is more conserved. Thus, sequence similarities between helicases could directly reflect on their conserved enzyme mechanisms.

Fig. 20.3

Crystal structures of viral helicases. Many viral helicases have been crystallized either in their apo-forms or bound to their substrates (NTP/nucleic acid). X-ray structures often reveal important information about substrate binding and/or catalysis. Panel (a) shows the E1 helicase of Papilloma virus in complex with ssDNA and ADP (PDB ID: 2GXA). In E1 helicase, the ssDNA is bound in the central channel of the hexameric helicase and the ADP is co-ordinated at the interface between the subunits. Panel (b) is the structure of the bi-functional primase–helicase, also a hexameric helicase, from bacteriophage T7 (PDB ID: 1Q57). In this protein, the primase domain trails behind the helicase domain, giving the protein a distinct two-domain organization. It should also be noted that the primase domain of one subunit interacts with the helicase domain of the neighbouring subunit. Panel (c) is the X-ray structure of the helicase domain of the Hepatitis C virus NS3 protein co-crystallized with ssDNA (PDB ID: 1A1V). Unlike the E1 and T7 gene4 helicases, NS3 is a monomeric helicase. It binds the ssDNA in the cleft between its subdomains. Panel (d) is the crystal structure of the full-length NS3 protease-helicase (PDB ID: 1CU1). The protein is a recombinant construct in which the 4A peptide, the co-factor for the protease domain (Howe et al. 1999), has been covalently attached to the N-terminus of the protease domain of the full-length protease-helicase. All the crystal structures in the figure were rendered in 3D using Expasy’s Swiss PDB Viewer (Guex and Peitsch 1997).

Structure and Function of Helicases

The classification of helicases into superfamilies laid the groundwork for most of the structural studies on helicases. With the emergence of crystal structure information, the signature helicase motifs have been extensively characterized not only at the amino acid level but also at the three-dimensional structure level. Figure 20.3 shows the crystal structures of some representative viral helicases. A list of all helicase structures along with their PDB IDs is given in Table 20.2.

Table 20.2 Viral helicases: Crystal and solution structures

Given that the conserved sequence motifs are short stretches of 4–10 amino acids interspersed with non-conserved segments, it has been hypothesized that the divergent regions are responsible for the individual protein functions while the highly conserved regions are involved in nucleotide binding and/or hydrolysis. Though separated in sequence, the structural data indicate that these conserved regions are close together in space and form one large functional domain (Hall and Matson 1999). Some of these motifs have been biochemically characterized. Given below is a brief structural description of each of the major superfamilies, with special reference to these conserved motifs.

Superfamily 1 and Superfamily 2 Helicases

SF1 and SF2 helicases are among the most extensively studied and structurally well-characterized helicases. Though there is currently no high-resolution structural information available for the SF1 viral helicases, their bacterial counterparts such as PcrA (Velankar et al. 1999) and Rep (Korolev et al. 1997) have been well studied. Among the SF2 viral helicases, crystal structures are available for the helicases of many members of the Flaviviridae family: HCV, Yellow fever virus, Kunjin virus, and Dengue virus (see Table 20.1). Sequence alignment among the members of the SF1 and SF2 superfamilies reveals up to 40% identity within the family members, with ~90% lying in the conserved domains (Gorbalenya and Koonin 1993). This close relationship between the sequences allows X-ray data from one enzyme to be extrapolated to the other members of the superfamily.

Structural studies on the SF1 and SF2 helicases indicate that these enzymes share extensive similarities in their structure. The representative structures of SF1 helicases indicate a four-domain structure, with all the conserved sequence motifs concentrated in two of these domains: domain 1A and domain 2A (Subramanya et al. 1996; Velankar et al. 1999). In the case of the SF2 helicases, these sequences reside within two large domains, domain 1 and domain 2, which are homologous to domains 1A and 2A of SF1 helicases (Yao et al. 1997; Kim et al. 1998).

Motif I, also referred to as the Walker A motif, has a consensus of XGXAGXGKT in SF1 helicases and a consensus of XGXGKT/S in SF2 helicases (Blinov et al. 1989; Tuteja and Tuteja 2004). The conserved lysine residue in both families is responsible for binding to the β- and γ-phosphates of the NTP–Mg complex. Mutation of this lysine residue results in deficiency of the ATPase activity (Hall and Matson 1999). However, it has no effect on nucleic-acid binding (Levin and Patel 2002). Motif Ia of both superfamilies is involved in ssDNA binding (Kim et al. 1998; Lee and Yang 2006). In a recent study on HSV-1 UL9 helicase, mutation of the motif Ia residues implicated in DNA binding caused moderate to severe defects in single-stranded nucleic-acid (ssNA) binding and ssNA-stimulated ATPase activity, while the intrinsic ATPase activity remained similar to that of the wild-type enzyme (Marintcheva and Weller 2003). Motif II has a conserved sequence of XXDEXD/H and is referred to as the Walker B motif (Linder et al. 1989). Proteins carrying a conserved D-E-A-D sequence, also referred to as the DEAD-box proteins, are predominantly RNA helicases (Koonin 1991; Koonin 1992; Linder and Daugeron 2000; Cordin et al. 2006), while proteins carrying variants of the DEAD sequence like DEAH/DEXH are usually DNA helicases (Subramanya et al. 1996; Linder 2000; Linder and Daugeron 2000). The conserved D of this motif has been shown to interact with the catalytic Mg2+ and is important for the NTPase activity. A mutation in this residue affects both the NTPase and the helicase function (Pause and Sonenberg 1992).
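Consensus strings of this kind translate directly into regular expressions, which is one simple way to scan a protein sequence for candidate motifs. The sketch below assumes that X stands for any residue and that T/S or D/H denotes alternatives at one position; the peptide used is a made-up toy sequence, not a real helicase. Patterns this short will also match many unrelated proteins, so a hit is only a starting point for further analysis.

```python
import re

# Consensus motifs as quoted in the text, with X (any residue) written as '.'
# and alternatives such as T/S written as a character class.
MOTIFS = {
    "Walker A (SF2, XGXGKT/S)": re.compile(r".G.GK[TS]"),
    "Walker B (XXDEXD/H)": re.compile(r"..DE.[DH]"),
}


def find_motifs(sequence: str) -> dict:
    """Map each motif name to a list of (0-based start, matched text) hits."""
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [(m.start(), m.group()) for m in pattern.finditer(sequence)]
    return hits


# Made-up toy peptide for illustration only, not a real helicase sequence.
toy = "MKLAPTGSGKTTVLDECHAAT"
for name, found in find_motifs(toy).items():
    print(name, found)
```

Note that `finditer` reports non-overlapping matches only, which is sufficient for a first pass over a sequence.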

Motifs III and VI of the SF1 and SF2 helicases, though not equivalent in sequence or structure, are implicated in coupling ATPase activity to the helicase function. Mutations in the SAT domain of motif III of the SF2 helicases resulted in loss of helicase activity with no effect on the ATPase activity (Pause and Sonenberg 1992; Graves-Woodward et al. 1997), while mutations in motif VI resulted in loss of both ATPase and RNA helicase activity. An invariant arginine in this domain has been shown to be extremely important for the RNA helicase activity of HCV NS3 helicase (Kim et al. 1997). In SF1 helicases, residues in motif III are also involved in nucleic-acid binding through hydrogen-bonding and stacking interactions with the nucleic-acid bases (Hall and Matson 1999). Motifs I, IV, and V of the SF1 helicases have been shown either to contact the nucleotide directly in the enzyme-NTP complexes or to interact with the nucleic acid through the sugar–phosphate backbone (Korolev et al. 1997; Hall and Matson 1999; Velankar et al. 1999). In the SF2 family, a newly discovered motif, called the Q-motif owing to its conserved Gln residue, has been implicated in adenine recognition by these enzymes (Tanner 2003; Tanner et al. 2003; Tuteja and Tuteja 2004; Killoran and Keck 2006).

Superfamily 3

All SF3 helicases contain the Walker A and Walker B sequences, which are important for nucleotide binding and/or hydrolysis. In addition to these, they contain the conserved motif C (Bork and Koonin 1993; Gorbalenya and Koonin 1993; Iyer et al. 2004) and a newly discovered motif Bʹ (Yoon-Robarts et al. 2004). The Walker B motif is atypical and carries a consensus of XXXXEE, while motif C carries the consensus XXX(S/T)(S/T)N (Hall and Matson 1999; James et al. 2003). Motif C of SF3 helicases is implicated in distinguishing ATP from ADP. The conserved Asn hydrogen bonds to the γ-phosphate of ATP to facilitate this function (James et al. 2003). The Bʹ motif is a 14-residue stretch with a central, highly conserved glycine and positively charged residues at either end. This motif has been established to be involved in nucleotide binding and unwinding. Mutation of the Lys at one end of the motif abolishes both helicase and ATPase activity, while mutation of the other Lys eliminates helicase but not ATPase activity (Walker et al. 1997).

DnaB-like Family

This family of hexameric helicases possesses five conserved sequence motifs. Two of the five are the Walker A and Walker B motifs common to all helicases and NTPases, while the other three are specific to this helicase family. Bacteriophage T7 helicase, the viral representative of this family, has been extensively studied, not only in terms of its structure but also biochemically and mechanistically (Matson et al. 1983; Rosenberg et al. 1992; Hingorani and Patel 1993; Patel and Hingorani 1993; Patel et al. 1994; Egelman et al. 1995; Patel and Hingorani 1995; Hingorani and Patel 1996; Washington et al. 1996; Yu et al. 1996; Ahnert and Patel 1997; Hingorani et al. 1997; Picha and Patel 1998; Sawaya et al. 1999; Ahnert et al. 2000; Patel and Picha 2000; Singleton et al. 2000; Kim et al. 2002; Toth et al. 2003; Jeong et al. 2004). The crystal structures show that four of the five conserved motifs (1, 1a, 2, and 3) lie in the conserved C-terminal domain of the helicase and are involved in nucleotide binding and hydrolysis. Motif 4 is part of the DNA-binding surface of the helicase and lines the region that forms the central channel when the hexamer is assembled (Sawaya et al. 1999; Singleton et al. 2000; Toth et al. 2003).

Helicases and the RecA Fold

The crystal structures of the helicases from the different superfamilies discussed above reveal an interesting fact: all these proteins share a common fold, the RecA fold (Bird et al. 1998). The basic structural unit of the helicases from the SF1 and SF2 superfamilies is the RecA-like subdomain (Yao et al. 1997; Kim et al. 1998; Velankar et al. 1999; Lee and Yang 2006). The T7 gene 4 helicase and the SF3 helicase from Adeno-associated virus type-2 also possess a RecA-like fold in their helicase domains (Sawaya et al. 1999; James et al. 2003), as is the case with SV40 T-antigen (Seif 1982). It has also been shown that the ATP-binding domain of RecA and the F1-ATPase superimpose with a root mean squared deviation of less than 2 Å (Story et al. 1992; Abrahams et al. 1994) (Fig. 20.4).
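For readers unfamiliar with the measure, the root mean squared deviation (RMSD) quoted for such comparisons is the square root of the mean squared distance between paired atoms of two structures. The following is a minimal sketch of that arithmetic, assuming the two coordinate sets have already been optimally superimposed and paired atom by atom; the coordinates are invented and the function name is an assumption of this sketch.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two equal-length lists of (x, y, z)
    points that are already superimposed and paired atom by atom."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up coordinates in Å, for illustration only.
a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
b = [(0.0, 1.0, 0.0), (3.8, 0.0, 1.0), (7.6, 1.0, 0.0)]
print(round(rmsd(a, b), 3))
```

In practice, structure-comparison programs first find the rotation and translation that minimize this value; that superposition step is omitted here.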

Fig. 20.4

Topology diagrams of viral helicase structures. Viral helicases are structurally homologous to the E. coli RecA protein. Panel (a) is a schematic representation of the RecA fold with the corresponding conserved motifs; (b) is a topology diagram of the E. coli RecA protein; (c) is the topological representation of domain 1 of the HCV NS3 helicase, an SF2 helicase; (d) is the representation of the T7 gene4 helicase, a member of the hexameric DnaB family. No structures of any SF1 viral helicase are available so far. The topology diagrams were adapted from EMBL’s PDBSum database.

The conservation of this structural motif in all helicases could mean that this fold is the minimal requirement for all helicases (including the generic NTPases) for NTP binding and hydrolysis. For all the other diverse functions that helicases carry out, this minimal domain needs to be supplemented with additional domains. This observation is exemplified by the eukaryotic translation initiation factor eIF4A, an SF2 DEAD-box helicase (Rogers et al. 2002). The eIF4A protein, which is essentially just the RecA-like motor with no additional domains, is a very poor helicase (Du et al. 2002). However, its helicase activity is considerably enhanced in the presence of other factors like 4B, 4H, etc. (Rogers et al. 2001).

The commonality in the motor domain of all these proteins could mean that all these proteins couple binding and hydrolysis of nucleotides to conformational changes that in turn affect the affinity of these enzymes for different forms of nucleic acids. However, the disparities between the helicases in terms of their polarity, substrate specificity, oligomeric nature, etc., could be derived from the associated domains and/or proteins. Thus, it is often necessary to study these proteins as a part of the macromolecular complex in which they form the central functional component, rather than in isolation. This is especially true of viral helicases, where the helicase not only plays a central role in genome replication but also in other functions like mRNA capping, recombination, packaging, etc.

In viruses, the minimal replisome consists of a helicase, a polymerase, and a single-strand binding protein (SSB). The virus hijacks the host machinery for all other accessory proteins. However, the virus-specific activities lie within the minimal replisome. Thus, many of the virus-encoded proteins are known to be multi-domain proteins with multiple functions. Figure 20.5 gives a few examples of the accessory activities associated with some viral helicases along with their role in viral replication. In addition to the multiple functions that many viral helicases possess, they act in concert with many other proteins to carry out functions that aid viral replication. In some viruses, such multi-protein interactions are required for basic functions that might reside within a single polypeptide in other viruses. For example, in herpes-simplex virus type-1, the association of three proteins, UL8, UL5, and UL52, constitutes the helicase–primase activity (Crute et al. 1991). Similarly, in papillomaviruses, the E1 protein has to associate with the E2 protein for origin binding and initiation of replication (Seo et al. 1993; Masterson et al. 1998; Gillitzer et al. 2000). Helicases also interact functionally and physically with the polymerases within the replication complex. These polymerases could be either host-derived or virus-derived (Smale and Tjian 1986; Gannon and Lane 1990; Park et al. 1994; Notarnicola et al. 1997; Delagoutte and von Hippel 2001; Kato et al. 2001; Piccininni et al. 2002). These interactions could be either direct protein–protein interactions or mediated through an intermediary scaffolding protein. Single-strand binding proteins have also been shown to interact with the helicases both in vitro and in vivo. The SSB could once again be host-derived or virus-derived (for examples see Nakai and Richardson 1988; Hamatake et al. 1997; Kong and Richardson 1998; Lefebvre et al. 1999).

Biochemical studies on hepatitis C virus showed that the eukaryotic RNA helicase p68 and the poly-pyrimidine tract binding protein (PTB) are essential for viral replication (Goh et al. 2004; Zhang et al. 2004; Aizaki et al. 2006; Chang and Luo 2006; Lim et al. 2006). Similar observations have been made for many of the tumor-inducing viruses, which show interactions with the cellular factors which are important in apoptosis and other related metabolic pathways (Barber 2001; Schattner 2002; Lavia et al. 2003; Brechot 2004; Ledwaba et al. 2004; Zhang et al. 2004; Levrero 2006; Strath and Blair 2006). Table 20.3 gives a list of some of the proteins that the viral helicases interact with along with their corresponding function in viral replication and/or infection.

Table 20.3 Helicase associated proteins and functions

Why are these interactions with the cellular factors important? Viruses are obligate intracellular parasites. They have evolved to have some of the smallest genomes, encoding only those functions that are most essential and are specific to their replication. Thus, from the point of view of viral evolution and host-virus specificity, hijacking host proteins for viral replication maximizes viral perpetuation by re-routing all or most of the cellular metabolism towards virus-directed processes. Protein–protein interactions between helicases and other accessory proteins also have kinetic and thermodynamic implications. The function of most helicases within such assemblies is not merely to catalyze the opening of a dsNA segment, but also to drive rearrangements in which one or both of the ssNA products end up bound to another macromolecular component. Often the inclusion of loading or trapping factors can improve helicase activity. A loading factor facilitates initiation of the helicase reaction, while a trapping component (e.g., ssNA binding protein) facilitates elongation by stabilizing ssNA intermediates in the reaction as they are formed. In the context of replication, the ssNA thus stabilized can be used by the polymerase for genome replication. Thus, a simplified replisome can be thought of as a combination of at least two motor proteins: the helicase and the polymerase.

In a mathematical treatment by Stukalin et al., it is apparent that when there is coupling between two motor proteins, it results in a much more efficient motor when compared to the individual motors (Stukalin et al. 2005). The increased efficiency could be reflected as an increase in the overall rate of the reaction and/or the processivity of the enzyme(s) (Jarvis et al. 1991; von Hippel and Delagoutte 2001; Delagoutte and von Hippel 2002; Stano et al. 2005). Such a behaviour has also been reported for isolated helicases of the SF1 and SF2 superfamilies, where functional oligomerization of the enzymes resulted in an increase in the processivity of the enzyme without change in reaction rates (Levin and Patel 1999; Byrd and Raney 2005; Tackett et al. 2005).

Mechanism of Helicase Action

The unwinding activity of a helicase can be considered an outcome of two fundamental activities of all helicases: (1) unidirectional translocation along single-stranded nucleic acid and (2) strand separation. To carry out these reactions, a helicase must cycle through a series of energy states driven by NTP binding and/or hydrolysis and subsequent product release. Thus, in order to understand the mechanism of helicase-catalyzed unwinding reactions, it is important to understand each of the individual steps, namely nucleic-acid binding, NTP binding and hydrolysis, single-stranded translocation, and finally strand separation. In the following section, each of these aspects of helicase mechanism will be dealt with in detail, with respect to two viral helicases that have been extensively characterized: the Hepatitis C Virus NS3 helicase and the bacteriophage T7 gene4 helicase.

Nucleic-Acid Binding

The binding of the helicase to the nucleic acid forms the first critical step toward unwinding the duplex substrate. Understanding DNA/RNA binding by the enzyme could help answer questions like: does the enzyme require a single-stranded region to initiate unwinding? Does the enzyme interact with only one strand of the nucleic acid or both? Does NTP binding alter the enzyme’s affinity for nucleic-acid binding?

Most helicases have been shown to require a short single-stranded tail to load onto the duplex substrate to carry out the unwinding efficiently. The polarity of the single-strand almost always depends on the polarity of the helicase translocation, i.e., a 3ʹ→5ʹ helicase uses a short single-stranded 3ʹ-tail, while a 5ʹ→3ʹ helicase uses a 5ʹ-tail. Many of the ring-helicases like T7 gene4 helicase, DnaB helicase of E. coli, etc., require a Y-shaped substrate, having both 3ʹ and 5ʹ-tails. Interestingly, SF3 helicases like E1 helicase of Papillomaviruses and T-antigen from SV40 can initiate unwinding from completely dsDNA by binding to a site-specific region (origin of replication), causing duplex melting and entry of the helicase onto the single-stranded region. The site-specific DNA binding is mediated by the DNA-binding domain, while the unwinding is mediated by the ATPase/helicase domain (Wu et al. 1998; Wu et al. 2001; Enemark and Joshua-Tor 2006).

The directional translocation of the helicase on its nucleic-acid substrate entails that its binding site for the nucleic acid is also polarized with respect to the sugar-phosphate backbone. Direct evidence for this came from Rep helicase binding to single-stranded dT16 containing the fluorescent base etheno-adenosine at either the 5ʹ end or the 3ʹ end. The enzyme showed different extents of fluorescence enhancement depending on the position of the label (Bjornson et al. 1998). A similar observation was made with the HCV NS3 helicase domain using duplex substrates with either a 5ʹ- or a 3ʹ-overhang. The enzyme bound to the 3ʹ-overhang with a 45-fold higher affinity than to the 5ʹ-overhang (Levin et al. 2005). Another parameter that can be obtained from nucleic-acid binding experiments is the occlusion site size, defined as the number of bases or base pairs the enzyme protects when it binds to the nucleic-acid substrate. Nuclease foot-printing is often used to measure the occlusion site. Occlusion site measurements also give an idea of the enzyme’s interaction with the duplex region or the displaced strand.

Fig. 20.5

Domain organization of viral helicases. Many viral helicases have been shown to have other enzymatic activities in addition to their helicase function. These associated functions play an important role in replication and/or packaging of the mature virions. The domain organization was recreated based on the data from the Pfam database (Finn et al. 2006)

X-ray structures of the oligonucleotide-bound HCV NS3 helicase domain (Fig. 20.3) showed that the enzyme bound the oligonucleotide in a cleft that separated domain 3 (all-helix domain) from domains 1 and 2 (RecA homology domains). The binding polarity of the oligonucleotide was consistent with the biochemical assays: the 3ʹ-end was positioned away from the enzyme and the 5ʹ-end was oriented between domains 2 and 3 of the enzyme. The enzyme made predominantly backbone interactions with the DNA, with very few base-specific interactions. Trp501 and Val432, both highly conserved among HCV NS3 sequences, interact with the nucleic-acid bases, limiting the central binding cavity to five nucleotides (Kim et al. 1998).

Nucleic-acid binding has been biochemically characterized using equilibrium-binding experiments, which are the preferred method of studying nucleic-acid binding. These studies have provided valuable information about binding constants and the stoichiometry of binding, and at times can also give insights into the oligomeric state of the enzyme. DNA-binding studies on the HCV NS3 helicase domain (NS3h), using fluorimetric titrations and nitrocellulose filter-binding assays, showed that the enzyme bound ssDNA with a very high affinity (Kd ∼2–10 nM) in the absence of NTP. NS3h binding occluded about 8.3 bases and had a stoichiometry of 1:1, enzyme:ssDNA (Levin and Patel 2002). However, the binding affinity dropped 80-fold in the presence of NTP (Levin et al. 2003). Though the enzyme did not bind to blunt-ended duplexes (Levin and Patel 2002; Levin et al. 2005), it showed a high affinity for partial duplexes with a 3ʹ-single-stranded tail (Levin et al. 2005).
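To give a feel for what an 80-fold change in affinity means, the sketch below evaluates a simple 1:1 binding isotherm, fraction bound = [L] / (Kd + [L]), and the equivalent free-energy difference, RT ln(fold change). It assumes that ssDNA is in excess over enzyme so that free and total ssDNA are approximately equal; the Kd of 5 nM is an arbitrary value chosen from within the reported 2–10 nM range, and the ssDNA concentrations and temperature are illustrative, not taken from the cited studies.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fraction_bound(ligand_nM, kd_nM):
    """Fraction of enzyme bound for simple 1:1 binding with ligand in excess."""
    return ligand_nM / (kd_nM + ligand_nM)


def ddg_kcal(fold_change, temp_K=298.15):
    """Free-energy difference equivalent to a fold change in Kd: RT ln(fold)."""
    return R_KCAL * temp_K * math.log(fold_change)


kd_no_ntp = 5.0               # nM, assumed value inside the reported 2-10 nM range
kd_with_ntp = 80 * kd_no_ntp  # 80-fold weaker binding, as reported with NTP

for dna in (5.0, 50.0, 500.0):  # nM ssDNA, arbitrary illustrative concentrations
    print(dna,
          round(fraction_bound(dna, kd_no_ntp), 3),
          round(fraction_bound(dna, kd_with_ntp), 3))

print(round(ddg_kcal(80), 2), "kcal/mol")
```

Under these assumptions the enzyme is half-saturated at 5 nM ssDNA without NTP, but would need roughly 400 nM ssDNA to reach the same occupancy with NTP bound.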

Notably, the enzyme’s DNA-binding behavior is vastly different from its RNA-binding behavior. NS3h binds ssRNA with a 10-fold lower affinity, at neutral pH, than it binds DNA (Levin and Patel 2002). However, the enzyme seems to have maximal RNA-binding capacity at pH 6.5 (Gwack et al. 1996). Also, the affinity of the ATP-bound enzyme for RNA is increased at low pH (Lam et al. 2004). Though NS3 helicase can bind to any single-stranded RNA, it exhibits a preference for its genomic RNA sequences. Banerjee and Dasgupta have shown that NS3 binds to the 3ʹ-UTR of the genomic RNA with a much higher affinity than to the 5ʹ-UTR sequence, and they attribute the differential affinities to the secondary structures associated with each of these sequences (Banerjee and Dasgupta 2001). The specificity of the HCV helicase for the 3ʹ-UTR could implicate the helicase in viral replication, since negative-strand synthesis would have to initiate at this terminus. Chang et al. have demonstrated that the arginine-rich motif (motif VI) of SF2 helicases is important for RNA binding (Chang et al. 2000). This motif is both structurally and functionally conserved in many flaviviruses including the HGV (Gwack et al. 1999). The conserved arginine is critical for nucleic-acid binding and helicase activity. Studies of the dynamics of subdomain 2, which contains this conserved motif, revealed that this domain could be responsible for the conformational change associated with ATP binding and hydrolysis, thereby driving the helicase reaction (Liu et al. 2001). Subdomain 2 has also been implicated in dsDNA binding. Motifs IV and V of subdomain 2 have been shown to undergo local unfolding in order to accommodate the dsDNA in their DNA-binding site (Liu et al. 2003).

Electrostatic analysis of the HCV NS3 helicase by Multi-Conformation Continuum Electrostatics (MCCE) identified two residues crucial for nucleic-acid binding: H369 and E493. H369 and E493 lie 3 and 6 Å, respectively, from the reported DNA-binding site. Mutation of either residue resulted in a drastic decrease in nucleic-acid-binding affinity, indicating the importance of these two residues in nucleic-acid binding. Based on these results, Frick et al. propose a model in which changes in the intrinsic pKa of these residues, arising from ATP and DNA binding, explain both the modulation of nucleic-acid-binding affinity by ATP and the activation of the enzyme at low pH (Frick et al. 2004).

A high-resolution structure of the ring-shaped bacteriophage T7 gp4 helicase bound to DNA is not yet available. Mutational studies have indicated that the conserved motif H4 is involved in DNA binding. The X-ray structure of the helicase domain revealed that the residues of this motif line up near the center of the hexamer, consistent with the enzyme binding the ssDNA in its central channel (Egelman et al. 1995). It has been proposed that nucleotide binding induces a conformational change in the H4 motif, causing the region around it to fold into a helical structure. Two residues, R487 and G488, have been implicated in contacting DNA (Washington et al. 1996). In the unliganded state, this region is still disordered, implying that nucleotide binding is coupled to the conformational changes important for DNA binding by the enzyme (Sawaya et al. 1999). Recently, it was also shown that mutation of three lysines to alanines (K467, K471, and K473) abolishes DNA binding (Crampton et al. 2006).

The DNA-bound structure of the ring-shaped E1 helicase of papillomavirus has been solved (Fig. 20.5), and it shows that all the residues seen to interact with the DNA are within the AAA+ domain. These residues mediate mostly backbone interactions with the DNA: H507 and K506 form hydrogen bonds with the backbone phosphates, and all three residues F464, K506, and H507 make van der Waals interactions with the sugar residue linking the two hydrogen-bonded phosphates. The 5ʹ-end of the ssDNA is directed toward the N-terminal oligomerization domains, whereas the 3ʹ-end is directed toward the C terminus, consistent with the translocation polarity of the enzyme (Enemark and Joshua-Tor 2006).

Unlike the HCV NS3 protein, the ring-shaped T7 helicase requires a forked DNA substrate to initiate unwinding. The protein makes contact with both the 3ʹ- and the 5ʹ-strands, and these contacts are important not only for initiating the reaction, but throughout the unwinding reaction (Hingorani and Patel 1993). For optimal unwinding the enzyme requires a 35-nt 5ʹ-tail and a 15-nt 3ʹ-tail (Ahnert and Patel 1997). Nuclease protection assays indicate a 25–30 base occlusion site on the 5ʹ-strand (Hingorani and Patel 1993), which is consistent with the enzyme requiring a 35-nt 5ʹ-tail for optimal unwinding. In contrast to the HCV helicase, T7 helicase binds to ssDNA tightly only in the presence of dTTP (Hingorani and Patel 1993). Therefore, the NTPase activity of helicases partly serves to modulate interactions with the nucleic acid. Some helicases bind tightly to nucleic acid in their NTP-liganded form, while others bind tightly in the nucleotide-free or NDP-liganded form.

The single-stranded DNA is bound in the central cavity of the hexamer (Egelman et al. 1995). The binding of the helicase to ssDNA is a multistep process that does not utilize NTP (Picha et al. 2000). At a given time only one or two subunits of the hexameric helicase contact the DNA (Yu et al. 1996). The enzyme binds ssDNA with a Kd of ∼10 nM and a stoichiometry of one strand per hexamer (Hingorani and Patel 1993). The T7 gene4 helicase also exhibits dsDNA-binding activity. However, the enzyme has a 50-fold lower affinity for dsDNA than for ssDNA (Hingorani and Patel 1993).

The enzyme binds the 5ʹ-strand in its central cavity and excludes the 3ʹ-strand from its active binding site. Replacing the 3ʹ-strand with a biotin–streptavidin complex results in the same outcome, implying that the 3ʹ-strand of the fork provides steric hindrance to the enzyme, thereby preventing it from binding the duplex (Hacker and Johnson 1997). At the replication fork of the T7 genome, the enzyme is thought to bind transiently at the primase site, followed by a conformational change accompanied by ring opening, ssDNA binding, and ring closure (Ahnert et al. 2000). In the absence of the 3ʹ-tail, the enzyme can bind and translocate along the duplex DNA (Jeong and Patel, unpublished data).

Experiments involving synthetic substrates also give information about which strands are contacted by the helicase at the unwinding junction. Different helicases show different levels of tolerance to changes in the chemical nature of the loading strand, breaks along the unwinding track, abasic sites, electrostatic disruptions, etc. HCV NS3 helicase is extremely sensitive to the nature of the displaced strand (Tackett et al. 2001a), while the Dda helicase of bacteriophage T4 and NPH-II helicase of Vaccinia virus show little or no sensitivity (Tackett et al. 2001b; Kawaoka et al. 2004). The T7 helicase stalls with disruptions on the loading strand (Yong and Romano 1995), while replacing the displaced strand with a morpholino substrate increases the unwinding rate of the enzyme (Jeong and Patel, unpublished data).

Unidirectional Translocation

Translocation of the helicase along the single-stranded nucleic acid is considered to be one of the two key activities of the helicase that is required for its unwinding function. The translocation function is coupled to NTP hydrolysis, and though no one has so far demonstrated this translocation to be strictly unidirectional, the overall movement of the protein is biased to a single direction.

Different approaches have been used to study the translocation of the protein along the single-stranded nucleic acid and the coupling of this action to NTP hydrolysis. One of the earliest approaches was to study the steady-state kinetics of NTP hydrolysis as a function of ssDNA length (Liu and Alberts 1981; Matson and Richardson 1983; Raney and Benkovic 1995). The steady-state kinetics of NTP hydrolysis has also been used to differentiate the ssDNA translocation activity of PriA protein from its unwinding activity (Lee and Marians 1990). A more recent approach involves biotin labeling the oligonucleotide at either the 3ʹ- or the 5ʹ-end and observing the disruption of the biotin–streptavidin complex by the helicase (Morris et al. 2001). This approach has been used to demonstrate both the polarity and the unidirectional translocation of bacteriophage T4 Dda helicase (Byrd and Raney 2004), HCV NS3 helicase, and SV40 T-antigen (Morris et al. 2002). Kim and co-workers studied not only the pre-steady-state kinetics of dTTP hydrolysis by bacteriophage T7 helicase as a function of ssDNA length, but also the energy coupling of the process, using a coupled enzyme assay that measured the amount of inorganic phosphate (Pi) released with phosphate-binding protein (PBP) labeled with MDCC (Kim et al. 2002). This approach, originally developed by the Webb lab (Hirshberg et al. 1998), has been used to obtain stepping rates and energy efficiencies (coupling constants) of other enzymes including PcrA and UvrD (Raney and Benkovic 1995; Dillingham et al. 1999; Dillingham et al. 2000; Soultanas and Wigley 2000). Extensive modeling of the pre-steady-state kinetics of NTP-dependent translocation of motor proteins has been done by Fischer and Lohman (Fischer and Lohman 2004) and demonstrated on the E. coli protein UvrD (Fischer et al. 2004; Tomko et al. 2007).

Fig. 20.6

Mechanisms of translocation of helicases on single-stranded nucleic acids. Many mechanisms of single-stranded translocation have been reported, based on structural and biochemical data. Panels (a) and (b) describe the stepping mechanisms of helicase translocation. Panel (a) represents the “Inchworming” model, which is often used to describe the translocation of monomeric helicases with two nucleic-acid-binding sites. The two sites cycle between tightly bound and weakly bound (mobile) states as dictated by their NTP ligation states and the associated conformational changes. One cycle of inchworming is typically completed in a set of six conformational changes, with the two binding sites (or domains) always retaining their relative order on the nucleic acid. Panel (b) describes the “Rolling” mechanism, which is often used to describe the translocation of dimeric helicases. In this model, the two subunits alternate their positions on the nucleic acid as they change their NTP ligation states. Panel (c) describes the “Brownian Ratchet” model reported for the translocation of the Hepatitis C virus NS3 helicase domain. In this model, the enzyme’s binding affinity for the single-stranded nucleic acid is modulated by its ATP ligation state. ATP binding weakens its affinity, while ATP hydrolysis results in tight binding. In its weakly bound state, the enzyme can fluctuate back and forth on the single-stranded substrate. Panel (d) represents one of the many possible NTP ligation and nucleic-acid occupancy states for the “Sequential hydrolysis” mechanism proposed for T7 gene4 helicase. Here, the enzyme contacts the ssNA two subunits at a time. NTP hydrolysis results in translocation of the helicase and the transfer of the ssNA substrate to the adjacent subunits.

Models of Unidirectional Translocation

Different mechanisms have been proposed for translocation of helicases along single-stranded nucleic acids. All mechanisms involve NTP hydrolysis, with a coupled conformation change to explain the biased movement.

Stepping mechanism - The stepping mechanism requires that the helicase possesses two DNA-binding sites. The two sites have differential affinities for the nucleic acid, which are modulated by the different NTP ligation states. In the “inchworm” type stepping model (Fig. 20.6a), one site is bound to the nucleic acid tightly (H), while the other site is weakly bound (T). NTP hydrolysis results in a power stroke, causing the weak site T to dissociate, move away from the tight site H and bind ahead of it. At the new position, the weak site T initiates tight interactions and becomes the new H site, in the process weakening the interactions of the previous H site to generate the new T site. In another power stroke, the sites undergo another round of nucleic-acid affinity changes to obtain their original starting affinities. Thus, one cycle in an inchworm stepping mechanism is completed with six conformational changes (Hill and Tsuchiya 1981; Lohman 1993; Patel and Donmez 2006).

Another stepping model is used to describe the translocation of dimeric helicases. In the inchworming model, both nucleic-acid-binding sites could be present on the same polypeptide chain. In the “rolling” model (Fig. 20.6b), each monomer contributes one nucleic-acid-binding site, and the two subunits of the helicase alternate their binding to the single-stranded nucleic acid depending on the changes in their NTP ligation states (Wong et al. 1992; Lohman 1993).

Brownian motor mechanism – This mechanism was proposed as an alternative mechanism to the stepping mechanism (Fig. 20.6c). This mechanism involves a single nucleic-acid binding site modulated by NTP binding and hydrolysis. In the absence of any bound NTP, the enzyme mediates very tight interactions with the ssNA substrate. In this state, the energy profile of the helicase is deep and saw-tooth shaped. Thus, in the tight state, the helicase is unable to mediate any motion along the ssNA. On NTP binding, its affinity for ssNA drops several fold, resulting in a shallow energy profile. The enzyme is now capable of moving either forward or backward (Brownian motion). In a power stroke coupled with NTP hydrolysis, the enzyme moves forward, going back to its original tight state (Levin et al. 2005; Patel and Donmez 2006).
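The qualitative behavior of such a motor can be illustrated with a toy "flashing ratchet" simulation. The sketch below is a generic textbook-style model, not the model fitted in the cited studies: it assumes wells of a saw-tooth potential at integer lattice sites, a barrier placed a fraction `asymmetry` ahead of each well, and Gaussian diffusion of width `sigma` during the weak (NTP-bound) state. All names and parameter values are hypothetical, and no explicit power stroke is modeled; the forward bias arises only from the asymmetry of the assumed potential.

```python
import math
import random


def relock(x: float, asymmetry: float) -> int:
    """Lattice site the enzyme slides to when tight binding resumes.

    Wells sit at integers; each barrier lies `asymmetry` (<= 0.5) ahead of
    its well, so a small forward excursion is captured by the next site,
    while a backward slip needs a much larger excursion.
    """
    site = math.floor(x)
    return site + 1 if (x - site) > asymmetry else site


def simulate(cycles: int, sigma: float = 0.3, asymmetry: float = 0.2,
             seed: int = 0) -> int:
    """Run `cycles` NTP cycles and return the final lattice position."""
    rng = random.Random(seed)
    position = 0
    for _ in range(cycles):
        # NTP binding: weak state, the enzyme diffuses freely along the ssNA.
        x = position + rng.gauss(0.0, sigma)
        # Hydrolysis: tight state restored, enzyme falls into the nearest
        # downhill well of the saw-tooth potential.
        position = relock(x, asymmetry)
    return position


if __name__ == "__main__":
    print("asymmetric potential:", simulate(2000))
    print("symmetric potential: ", simulate(2000, asymmetry=0.5))
```

With an asymmetric potential the simulated enzyme drifts steadily in one direction, whereas the symmetric control shows no systematic drift, which is the essential point of a ratchet: random fluctuations are rectified by cycling between tight and weak binding states.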

Sequential “subunit rotation” mechanism of hexameric helicases – This mechanism has been proposed to explain the translocation of hexameric helicases like the bacteriophage T7 helicase on ssDNA (Fig. 20.6d). In this mechanism, three cooperative steps of sequential DNA binding and release are required for processive translocation along ssDNA. DNA is translocated by power strokes driven by NTP binding to the catalytic site. First, the empty NTP site becomes occupied to generate the weak DNA-binding site T*. The DNA-binding step in the next subunit, T*→N·T*, commences when the previous subunit in the sequence has completed its power stroke and is in the N·T state. Geometrically, this is possible if the power stroke of the previous subunit brings the DNA strand into a position where it can quickly fluctuate to the next subunit. Since hydrolysis enables release of the DNA strand, high processivity requires that the unbinding of DNA from one subunit take place after the binding of DNA to the next subunit. Thus, the transition N·T→DP in one catalytic site must follow the binding of DNA to the next site, i.e., the transition T*→N·T*. Finally, the power stroke N·T*→N·T results in translocation (Liao et al. 2005).

Base Pair Separation Mechanisms

Helicases couple the energy of NTP hydrolysis to single-stranded translocation and base-pair separation. Translocation of helicases can take place by any of the above-mentioned mechanisms. Mechanisms of base-pair separation can be in general classified into “active” or “passive” depending on the extent to which the enzyme is involved in the strand-separation function (Lohman 1993; Lohman and Bjornson 1996; von Hippel and Delagoutte 2001; Betterton and Julicher 2005).

In a “passive” mechanism (Fig. 20.7a) of strand separation, the enzyme translocates along the single-stranded nucleic acid until it reaches the duplex junction. The enzyme then waits for the two strands to open through thermal fraying. Once a base pair opens, the enzyme moves ahead, and this cycle continues until the duplex has completely separated. For a passive helicase, the unwinding step size is likely to be one base pair, since it is extremely unlikely for more than one base pair to open at once by thermal fluctuations. In the “active” mechanism of helicase action (Fig. 20.7b), the enzyme destabilizes the junction, thereby altering the energy profile of the base pairs at the junction and making them easier to melt. An active mechanism can account for the larger step sizes reported for many helicases (Serebrov and Pyle 2004; Spurling et al. 2006; Myong et al. 2007). Force-dependence and stability-dependence studies have revealed that both T7 gene 4 helicase and the HCV NS3 helicase unwind by an active mechanism (Cheng et al. 2007; Donmez et al. 2007; Johnson et al. 2007).
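The argument about step size can be made semi-quantitative with a two-state Boltzmann estimate. The sketch below is a rough illustration under stated assumptions: base pairs at the junction are treated as independent, each costing a hypothetical 2 kBT to open, and an active helicase is represented by a hypothetical 1 kBT destabilization per base pair. Neither number is taken from the cited studies, and real base-pair stabilities depend on sequence and conditions.

```python
import math


def p_open(n_bp: int, dg_per_bp_kt: float, destabilization_kt: float = 0.0) -> float:
    """Two-state probability that n_bp junction base pairs are all open.

    Energies are in units of kBT. The helicase is assumed to lower the
    opening cost of each base pair by `destabilization_kt` (never below 0).
    """
    effective = max(dg_per_bp_kt - destabilization_kt, 0.0)
    return 1.0 / (1.0 + math.exp(n_bp * effective))


if __name__ == "__main__":
    DG_BP = 2.0      # assumed opening cost per base pair (kBT)
    ACTIVE_DG = 1.0  # assumed destabilization by an active helicase (kBT)
    for n in (1, 2, 3, 4):
        passive = p_open(n, DG_BP)
        active = p_open(n, DG_BP, ACTIVE_DG)
        print(f"{n} bp open: passive {passive:.4f}, active {active:.4f}")
```

In this simplified picture the chance of finding several base pairs open simultaneously falls off roughly exponentially with their number, so a purely passive enzyme would rarely have the opportunity to advance by more than one base pair, while even a modest destabilization makes multi-base-pair openings far more frequent.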

Fig. 20.7

Passive and active mechanisms of nucleic-acid unwinding. Helicases convert the chemical energy of NTP hydrolysis into mechanical work in terms of translocation on the single strand and separation of the two strands of the duplex DNA or RNA. The energy from NTP hydrolysis can be used by the helicase just for single-stranded translocation, with the enzyme relying on thermal fraying to break the base pair. This mechanism is referred to as the passive mechanism (a). Alternatively, the energy from NTP hydrolysis can be used for translocation as well as for destabilization of the base pairs (shown as a gray cloud) of the duplex, to enable strand separation. This mechanism is referred to as the active mechanism (b).

A mechanism of strand separation reported for many of the helicases involves excluding the complementary strand, preventing reannealing. The enzyme could use specific residues in the nucleic-acid-binding cleft as a wedge to separate the two strands (Tackett et al. 2001a; Kawaoka et al. 2004), or the entire helicase molecule could assemble in such a way so as to exclude the other strand thereby keeping the two strands separated (Ahnert and Patel 1997; Hacker and Johnson 1997; Donmez and Patel 2006).

Some of the hexameric helicases, like the papillomavirus E1 helicase and SV40 T-antigen, are involved in viral replication and hence are required to bind sequence-specifically to the origin of replication and melt the base pairs to initiate replication. A looping model was suggested for the strand-separation mechanism of T-antigen, in which the double hexamer carries out bidirectional unwinding, looping out the separated single strands through the middle (Li et al. 2003). However, this mechanism has now been refined. According to the new model, the separated strands no longer loop out of the double hexamer; instead, an alternative conformation is proposed in which the ssDNA exits through an exit channel on the helicase domain of the double-hexameric enzyme (Gai et al. 2004). A few other mechanisms have been used to describe strand separation by hexameric helicases, including the torsional model and the plough-share model (Takahashi et al. 2005; Patel and Donmez 2006).

Table 20.4 Helicase inhibitors

Helicases as Antiviral Drug Targets

Viruses are obligate parasites. They direct the host cellular metabolism for their replication. Although a multitude of viral infections can be warded off through vaccination, there still exist many viral pathogens against which vaccination is not yet available. These include the agents of diseases like hepatitis C and acquired immunodeficiency syndrome (AIDS). The current strategy for handling these conditions has been chemotherapy, which includes immune system boosters like interferon-α and -γ and a host of antiviral drugs.

Target the Host or the Virus?

When choosing antiviral targets, one could consider two broad strategies: targeting a cellular factor involved in viral replication or targeting a virus-specific gene product. Targeting host factors could result in drastic side effects, since the targeted protein would also be inhibited in normal non-infected cells. The latter strategy, on the other hand, could confer a higher virus-specific activity and a low toxicity to the host. One caveat is that, if the targeted protein has metabolic functions, there would be a smaller window of specificity, since viral and cellular enzymes catalyze similar enzymatic reactions. However, since the viral and cellular proteins are not identical, structure-based drug design can often exploit the differences between the host and the viral enzymes to generate drugs specific for the virus.

Currently, the most targeted virus-specific factors are the polymerases, which are essential for the replication of viruses. The reverse transcriptase (RT) of the retroviruses and the hepadnaviruses is the sole viral enzyme required for the synthesis of DNA from viral RNA. Viral polymerases are therefore an extremely favorable target for the development of antiviral therapy (De Clercq 2004). Another virus-specific target is the viral helicase. Most viral helicases have multiple enzyme activities associated with the unwinding function. Thus drug design against helicases could involve several general strategies.

Helicase Inhibitors: Strategies and Prospects

All helicases are fuelled by NTP hydrolysis for their unwinding function. Thus, small-molecule inhibitors could be used to inhibit the NTPase function in a number of ways. These inhibitor molecules, usually nucleotide analogs, could directly compete for NTP binding, inhibit nucleic-acid binding, inhibit NTP hydrolysis or NDP release, or uncouple NTP hydrolysis and translocation (Borowski et al. 2000; Borowski et al. 2002; Xu et al. 2006).

Another strategy used in helicase inhibition involves disruption of the protein–protein interfaces between the helicase and other proteins of the replication complex. This strategy has been deployed for the inhibition of the HPV E1 helicase, whereby its interaction with the E2 protein has been disrupted using inhibitors (White et al. 2003). The HSV helicase–primase complex is inhibited by aminothiazolylphenyl-containing drugs and thiazole urea derivatives. These compounds appear to act by enhancing the binding of the complex to ssDNA in the replication bubble, preventing DNA polymerization (Crumpacker and Schaffer 2002; Crute et al. 2002; Kleymann 2004; Biswas et al. 2007).

In a more recent approach, Xue et al. have developed a new strategy for the inhibition of HCV replication. They use siRNAs to knock down cellular host factors that are important for HCV replication (Zhang et al. 2004; Xue et al. 2007). However, this approach cannot be used as the sole approach for anti-HCV therapy, since the host factors involved are important not only for HCV replication but also for a host of other functions related to cellular RNA metabolism. Table 20.4 gives a list of all small-molecule inhibitors against helicases that have been developed so far. (For a more comprehensive study on helicases as antiviral targets see reviews by Yao and Weber 1998; Frick 2003; Kleymann 2004; Kwong et al. 2005; Maga et al. 2005; Frick and Lam 2006; Frick 2007.)