Structural mimicry in the phage φ21 N peptide–boxB RNA complex
Abstract
We determined the solution structure of a 22-amino-acid peptide from the amino-terminal domain of the bacteriophage φ21 N protein in complex with its cognate 24-mer boxB RNA hairpin using heteronuclear magnetic resonance spectroscopy. The N peptide binds as an α-helix and interacts predominately with the major groove side of the 5′ half of the boxB RNA stem-loop. This binding interface is defined by surface complementarity of polar and nonpolar interactions, with little sequence-specific recognition. The φ21 boxB loop (CUAACC) has hydrogen bonds and backbone torsions typical of the “U-turn” motif, as well as base stacking of the last 4 nt, and a hydrogen bonded C:C pair closing the loop. The exposed face of the φ21 boxB loop, in complex with the N peptide, is strikingly similar to the GNRA tetraloop-like folds of the related λ and P22 bacteriophage N peptide–boxB RNA complexes. The N peptide–boxB complexes of the various phage, while individually distinct, provide similar structural features for interactions with the Escherichia coli host factors to enable antitermination.
INTRODUCTION
The Lambda family of bacteriophage (λ, φ21, and P22) use a temporally programmed sequence of gene expression during infection that is controlled, in part, by termination and antitermination of transcription. These bacteriophage all use N proteins, N-utilization (nut) sites in the nascent mRNA, and transcription terminators, to regulate early transcription. The interaction between the phage-encoded N protein and the RNA nut site plays a central role in the organization of viral and host factors (Nus proteins) required for formation of a termination resistant form of the RNA polymerase (RNAP) transcription complex (Das 1993; Greenblatt et al. 1993; Friedman and Court 1995).
The nut site is composed of a single-stranded 5′-boxA site and a 3′-boxB RNA hairpin, separated by a variable length linker region of 8–14 nt (Fig. 1A,B; Szybalski et al. 1986). The organization of the genome is conserved between phage types, but the sequences of the RNA nut sites and N proteins are diverse (Franklin 1985a). The boxA sequences are highly homologous between the related phage and have been shown to bind host-encoded NusB and ribosomal protein S10 (Olson et al. 1984; Horwitz et al. 1987; Nodwell and Greenblatt 1993). The boxB hairpins consist of short stems (5–7 bp) and 5- or 6-nt loops and have little sequence homology between the three phages. Point mutations in λ boxB loop sequences have been shown to abolish antitermination in vivo (Doelling and Franklin 1989; Chattopadhyay et al. 1995) and N binding in vitro (Cilley and Williamson 1997).
The three related phage N proteins are all small (96–107 amino acids), basic proteins that only function with their cognate nut-boxB RNAs (Lazinski et al. 1989). Amino acid sequence alignment of the N proteins from λ, φ21, and P22 shows an amino-terminal, arginine-rich, 18-amino-acid region of homology with four invariant residues (Fig. 1C; Franklin 1985b). In vivo and in vitro studies show peptides from this region specifically bind the appropriate boxB hairpins (Lazinski et al. 1989; Tan and Frankel 1995), with affinities similar to full-length N protein (Cilley and Williamson 1997). The N protein binds to one face of the boxB hairpin (Chattopadhyay et al. 1995), whereas Escherichia coli elongation factor NusA binds the opposite face (Mogridge et al. 1995) and interacts directly with N (Van Gilst and von Hippel 1997).
The solution structures of the N peptide–boxB RNA complex have been solved for the phages λ (Legault et al. 1998) and P22 (Cai et al. 1998). Both structures show a bent α-helical peptide bound to one face of an RNA hairpin loop. A number of polar and nonpolar interactions are observed with few, if any, involved in sequence-specific contacts in either of the structures. Each hairpin has a pentanucleotide loop that adopts a GNRA-type tetraloop fold with the fourth or third nucleotide flipped out of the loop in λ and P22, respectively.
Here, we describe the solution structure of the φ21 N–boxB complex, consisting of an amino-terminal 22-amino-acid N peptide bound to a 24-nt boxB RNA, determined by NMR methods. The boxB hairpin has an A-form helical stem with a hexanucleotide loop (5′-CUAACC-3′) with the last four bases continuously stacking on the 3′ half of the hairpin stem. The loop residues U11-A12-A13 form a “U-turn,” first described in yeast tRNAPhe (Quigley and Rich 1976) and also observed in ribosomal RNA loops (Fountain et al. 1996; Huang et al. 1996; Conn et al. 1999; Zhang et al. 2001), and the hammerhead ribozyme (Pley et al. 1994). The U-turn consensus sequence, UNR, is characterized by a hydrogen bond between the U imino proton and the 5′ phosphate of the nucleotide following the UNR, and a second hydrogen bond between the U 2′ hydroxyl and the N7 of the R nucleotide. U-turns are commonly flanked by noncanonical Y:Y base pairs (Auffinger and Westhof 1999; Gutell et al. 2000). A core 13-amino-acid region of Nφ21 peptide binds as an α-helix and interacts predominately with both the major groove side of the 5′ half of the ascending upper stem, and the loop of the boxB hairpin. Electrostatic and hydrogen bond interactions exist all along the phosphodiester backbone of the RNA.
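As a small illustration of the UNR consensus described above, the following Python sketch scans a loop sequence for U-N-R triplets (N = any nucleotide, R = A or G). The function name and the numbering offset (the first loop residue taken as position 10, matching C10 in this construct) are assumptions made for illustration. A sequence match only flags a candidate U-turn; it says nothing about whether the characteristic hydrogen bonds are actually formed.

```python
import re

# Hypothetical helper (not from the paper): find candidate U-turn (UNR)
# triplets in an RNA sequence. N is any nucleotide, R is a purine (A or G).
# A lookahead is used so that overlapping candidates are all reported.
UNR = re.compile(r"(?=(U[ACGU][AG]))")


def find_unr(seq, first_residue=1):
    """Return (residue number of the U, triplet) for every UNR match."""
    seq = seq.upper().replace("T", "U")
    return [(m.start() + first_residue, m.group(1)) for m in UNR.finditer(seq)]


# φ21 boxB loop, numbered from C10 as in the text
print(find_unr("CUAACC", first_residue=10))  # [(11, 'UAA')]
```

The single hit corresponds to U11-A12-A13, the triplet assigned to the U-turn in the text.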
This structure completes the family of three λ-related phage N-boxB protein–RNA structures that have been genetically characterized. All three peptides bind the RNA helical stems using polar and nonpolar contacts that are almost superimposable. Peptide interactions with their respective RNA hairpin loops diverge from this structural similarity, and all three peptides show loop-specific interactions. The pentanucleotide loops of λ and P22 adopt GNRA-type tetraloop structures with exclusion of 1 of the 5 nt. The φ21 boxB hexanucleotide loop contains a U-turn that is structurally related to the GNRA tetraloop (Jucker and Pardi 1995), a mismatched C10:C15 base pair, and stacking of C14, giving the φ21 complex a structure similar in shape and exposed surface groups to the λ N peptide–boxB complex. Comparison of the φ21 complex to those of λ and P22 reveals how phage type-specificity is maintained in the presence of conserved host factor interactions, through similar peptide-RNA stem binding motifs coupled with loop-specific interactions that yield similar RNA loop structures.
RESULTS
Minimal peptide/RNA complex
The N-nut complexes from λ and P22 were minimized to short N peptides containing the amino-terminal basic domain and short RNA hairpins corresponding to boxB (Tan and Frankel 1995). As a starting point for defining a similar minimal system for φ21, we began with the 41 amino-terminal residues of φ21 N protein and a 49-nt nut site RNA. The affinity of this interaction was determined using a polyacrylamide coelectrophoresis (PACE) assay (Cilley and Williamson 1999) with a Kd,app of 135 nM. The φ21 nut RNA was minimized to just the boxB hairpin, and this minimal 24-nt hairpin binds to the 41-mer peptide with a similar affinity, 82 nM (Fig. 2A).
To identify the minimal Nφ21 peptide, a series of N-terminal and C-terminal deletion peptides were prepared (Fig. 2B) and assayed. The resulting 22-amino-acid peptide spans the basic region homologous to those of λ and P22. Bound to the boxB RNA hairpin, it has a Kd,app of 200 nM, less than twofold weaker than the φ21 N(1–41)–nut RNA complex.
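The comparison above is a ratio of apparent dissociation constants. The sketch below reproduces that arithmetic with the values quoted in the text. The single-site binding expression is a standard simplification assumed here for illustration and is not necessarily the model used to fit the PACE data. The function names are hypothetical.

```python
def fold_change(kd, kd_ref):
    """Ratio of two dissociation constants given in the same units."""
    return kd / kd_ref


def fraction_bound(peptide_nM, kd_nM):
    """Fraction of RNA bound for simple 1:1 binding with peptide in excess
    (assumed model, not taken from the paper)."""
    return peptide_nM / (peptide_nM + kd_nM)


kd_41mer_nut = 135.0   # nM, N(1-41) with the 49-nt nut RNA (from the text)
kd_22mer_boxB = 200.0  # nM, 22-residue peptide with boxB RNA (from the text)

print(round(fold_change(kd_22mer_boxB, kd_41mer_nut), 2))  # 1.48
print(fraction_bound(200.0, kd_22mer_boxB))                # 0.5
```

A ratio of about 1.5 is consistent with the statement that the minimal peptide binds less than twofold more weakly than the longer construct.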
The three N-nut complexes exhibit phage type specificity in vivo. Each N protein has a strong preference for its cognate RNA, and it is the amino terminus of the N protein that confers this type specificity (Lazinski et al. 1989). We determined the complete matrix of in vitro binding affinities of each N peptide for each boxB RNA using the PACE assay (Fig. 3; Table 1). Binding affinity in vitro reflects type specificity, consistent with the in vivo experiments. The three cognate N peptide-boxB RNA complexes have a 20-fold range in binding affinities. The individual peptides and RNAs demonstrate a wide range of binding specificity for cognate versus noncognate complexes. The Nλ peptide and the boxBλ RNA each show about a 300-fold range in binding affinities (Fig. 3A). In contrast, the NP22 peptide and boxBP22 RNA are the least discriminating, showing at most a fourfold preference for cognate sequences (Fig. 3B), whereas Nφ21 peptide and boxBφ21 RNA have a more intermediate 15-fold specificity (Fig. 3C).
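To put the fold preferences above on an energy scale, a ratio of dissociation constants can be converted to a free energy difference with ΔΔG = RT ln(fold). The sketch below applies this to the approximate fold values quoted in the text. The temperature of 298.15 K is an assumption, since the assay temperature is not given in this section, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ddg_from_fold(fold, temp_k=298.15):
    """Free energy difference (kcal/mol) for a given ratio of Kd values.
    temp_k defaults to 298.15 K, an assumed temperature."""
    return R_KCAL * temp_k * math.log(fold)


# Approximate cognate vs noncognate ranges quoted in the text
for label, fold in [("lambda", 300), ("phi21", 15), ("P22", 4)]:
    print(f"{label}: {ddg_from_fold(fold):.1f} kcal/mol")
# lambda: 3.4 kcal/mol
# phi21: 1.6 kcal/mol
# P22: 0.8 kcal/mol
```

On this scale, the difference between the most and least discriminating systems is roughly 2.6 kcal/mol under the stated temperature assumption.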
NMR structure determination
The solution structure of the φ21 N peptide–boxB RNA complex was solved with standard heteronuclear NMR experiments, using both 15N and 13C, 15N-labeled peptide and RNA (see Materials and Methods; Table 2). Complex formation was monitored using 15N- or 13C-HSQC experiments. The 15N-HSQC spectrum contains a single set of peaks for a 1:1 complex of 15N-peptide and unlabeled boxB RNA, distinct from the set of peaks for the 15N-peptide only (Fig. 4A). Sequential and unambiguous assignment of all of the peptide 1H, 13C, and 15N backbone and side chain chemical shifts was achieved using triple resonance through-bond correlation experiments (Table 2). The resonances of the backbone amides were well dispersed in both the 1H and 15N dimensions, and there were few problems with spectral overlap in these experiments. The boxB ribose spin systems were assigned using three-dimensional HCCH-COSY, HCCH-TOCSY, and HCCH-COSY-TOCSY experiments (Hu et al. 1998). These through-bond experiments were combined with sequential assignment of the RNA from two-dimensional (2D) NOESY experiments in both H2O and D2O to allow for the complete assignment of the nucleotide spin systems. A pair of 2D HCN spectra were collected to directly establish ribose-to-base connectivities (Table 2). These experiments correlate a base N1/N9 to the ribose H1′ or the base H6/H8 to the N1/N9. These connectivities served as a check on the sequential assignments from NOESY analysis. Finally, an HNN-COSY experiment (Dingley and Grzesiek 1998) provided direct evidence for the base pairing in the stem.
Distance restraints for structure calculations were derived from a variety of NOESY experiments (Table 2). Use of labeled peptide and RNA samples allowed for the determination of a number of NOEs (Table 3, Panel A), often with the same NOE present in both peptide-only and RNA-only spectra (Fig. 4B). The N peptide showed sequential NOE connectivities in 15N-NOESY spectra typical of an α-helical structure. Similarly, the boxB RNA had intra- and internucleotide NOE patterns indicative of A-form helical regions of a stem-loop structure. Peptide backbone torsion angle restraints and RNA ribose sugar pucker restraints were determined from HNHA and COSY experiments, respectively (Tables 2, 3, Panel A). A schematic of the RNA–RNA and peptide–RNA NOEs shows the density of distance restraints, especially in the loop region (Fig. 5A). The core, ordered region of the peptide–RNA complex is evident in the superposition of the 14 low energy structures (Fig. 5B). The average structure of this low energy ensemble was minimized to obtain the final solution structure (Fig. 6A,B; Table 3, Panel B).
Peptide structure in complex
The Nφ21 peptide in complex with the boxB RNA forms an α-helix spanning residues Ala13–Ala26, with the remaining residues in an extended conformation. The measured φ and ψ angles for the amino acids Ala13–Ala26 fall within the α-helical regions of the Ramachandran plot (Morris et al. 1992). The conserved residues Ala13, Arg16, Arg20, and Arg21 of the N peptide family do not appear to lie on a single face of the α-helix, but they are all oriented towards the binding surface of the boxB RNA (Fig. 6A). The slight curvature of the helical axis does not disrupt the helical connectivities. No apparent hydrogen bonds between side chains can be resolved in the N peptide structure. In the α-helical portion of the peptide, all side chains with potential hydrogen bond donor moieties are oriented on one face of the helix, whereas the potential hydrogen bond acceptor side chains lie mostly on the opposite face. All of the peptide side chains that gave rise to NOE distance restraints (Fig. 5A) lie along the 5′-side of the major groove of the upper 4 bp of the stem and behind the loop of boxB RNA (Fig. 6B).
The nonhelical regions of the N peptide, Glu8 to Thr12 and Glu27 to Arg29, are randomly oriented away from the boxB RNA. These residues had few interresidue NOEs and no NOEs to the boxB RNA. Deletion analysis showed that a peptide truncated at Thr12, just before the conserved Ala13, bound the boxB RNA with less affinity than a peptide truncated at Glu8 (Fig. 2B). The first seven amino acids of full-length Nφ21 contain a number of hydrophobic residues, including two valines, an isoleucine, and a tryptophan (Fig. 1C). Although not important for RNA binding, these residues may play a role in subsequent N protein folding, or contribute to protein–protein interactions in the complete antitermination complex.
boxB structure in complex
The φ21 boxB RNA in complex with the N peptide adopts a stem-loop structure (Fig. 6A). The stem of the boxB RNA contains seven Watson–Crick base pairs and two canonical wobble U:G base pairs, U3:G22 and U9:G16. The stem is essentially A-form helical, with no significant perturbations of ribose sugar puckers or base glycosidic torsion angles.
In the hexanucleotide loop of boxB RNA (CUAACC), the four bases on the 3′ end stack upon the 3′ strand of the helical stem. Cytidine-15, located at the 3′ end of the hexanucleotide loop, stacks on G16, the 3′ base in the U9:G16 base pair at the top of the stem. This stacking in the hexanucleotide loop continues upward with nucleotides C14, A13, and A12, extending the topology of an A-form helix up to the tip of the loop. The U11-A12-A13 U-turn facilitates reversal of the direction of the phosphate backbone. Although the imino proton of U11 is not observed in the NMR experiments, 11 of the 14 structures show hydrogen bonds between this imino and the phosphate of C14, a hallmark of the U-turn motif. Similarly, a hydrogen bond between the 2′ hydroxyl of U11 and the N7 of A13 is observed in 13 of the 14 low energy structures. In 12 of the 14 low energy structures, the N4-amino group of C15 can hydrogen bond to either the N3 or O2 of C10. The C10:C15 pair stacks on the U9:G16 base pair. This C10:C15 pairing can only be inferred indirectly because there are no observed resonances from any of the N4-amino protons, indicating the lack of a stable hydrogen bond.
Electrostatic and hydrogen bond interactions
One-third of all residues in the Nφ21 peptide are either lysine or arginine. The Nφ21 peptide belongs to a loose class of protein sequences containing the “arginine-rich motif” (Weiss and Narayana 1998). Members of this “class” have ultimately shown little sequence or structural-specific homology, but all use arginine side chains in binding and recognition of the RNA major groove. In the Nφ21 peptide–boxB RNA complex, all but one of the five arginines and one of the three lysines make extensive contacts with the phosphate backbone (Fig. 7A). The guanidinium groups of the conserved arginines, Arg16, Arg20, and Arg21, are within 5 Å of the phosphate backbone on the 5′-ascending stem of the boxB RNA. The guanidinium group of Arg28 is more distant from the phosphate of A12, but could still contribute a favorable electrostatic interaction in the complex. The amino group of Lys18 is within 5 Å of the phosphate group of C14 on the opposite side of the groove. Analysis of the family of low energy structures indicates that some of these guanidinium and amino groups may be making stronger ionic interactions, or even hydrogen bonds, to the phosphate backbone. Although not as apparent in the average structure (Fig. 7A), in all 14 structures the guanidinium group of Arg16 is <2.4 Å from the nonbridging phosphate oxygens of C7 and C8, in many cases close to both phosphate groups. Similarly, the amino group of Lys18 is close to the phosphate of C14 in 13 of the 14 structures. In 12 of the 14 structures, the Tyr17-Hη is <2.4 Å from the C10 phosphate group. Tyrosine has been shown to have a high propensity for forming hydrogen bonds in protein–RNA complexes (Jones et al. 2001).
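Statements such as "in 12 of the 14 structures" amount to counting the ensemble members in which a given atom pair falls below a distance cutoff. The sketch below shows that bookkeeping in Python. The function name is hypothetical and the coordinates are made-up toy values used only to exercise the code, not data from the φ21 ensemble. The 2.4 Å cutoff mirrors the threshold quoted in the text.

```python
import math


def count_within(models, cutoff):
    """Count ensemble members whose atom pair is closer than cutoff (in Å).

    models: list of (xyz_a, xyz_b) coordinate pairs, one per structure.
    """
    return sum(1 for a, b in models if math.dist(a, b) < cutoff)


# Toy three-member "ensemble" with invented coordinates (illustration only)
toy_models = [
    ((0.0, 0.0, 0.0), (0.0, 0.0, 2.0)),
    ((0.0, 0.0, 0.0), (0.0, 0.0, 2.3)),
    ((0.0, 0.0, 0.0), (0.0, 0.0, 3.1)),
]

n_close = count_within(toy_models, cutoff=2.4)
print(f"{n_close} of {len(toy_models)} structures")  # 2 of 3 structures
```

In practice the coordinate pairs would be read from each model of the deposited ensemble. That parsing step is omitted here.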
No clear sequence-specific base interactions are seen in the peptide–RNA complex. In all 14 of the low energy structures, the amino group of Lys14 penetrates into the major groove and makes close contacts to one or more of the Hoogsteen faces of G16, G17, and G18. The lack of precision in the placement of Lys14 from the NOE data makes it impossible to definitively assign its role in any sequence-specific recognition of the boxB RNA. Collectively, these arginine and lysine side chains, together with the contribution of Tyr17, create a unique positive surface on one side of the Nφ21 peptide α-helix that interacts with the negatively charged RNA phosphodiester backbone and contributes to recognition of boxB RNA.
All of the conserved arginines in the Nφ21 peptide are within electrostatic and hydrogen bonding distances to the boxB RNA. From examination of the structure of the complex, we believe the conserved Arg21 plays a role in specific recognition of boxB RNA. Arginine-21 rests in the curve of the phosphate backbone from C10 to A13 at the top of the boxB loop. The Hε and Hη hydrogens are all close enough to the phosphate backbone to provide a framework around which the backbone can organize. All of the other arginines that make contacts to the phosphate backbone do so by simply extending out towards a particular phosphate or pair of phosphates without any other contact with the RNA. The 15N-HSQC of the peptide in the complex shows that the Arg21-Hε is shifted downfield in both the 1H and 15N dimensions from all of the other Arg-Hε resonances, indicative of a unique structural or chemical environment. In addition to its extensive ionic interactions, the aliphatic part of the Arg21 side chain makes van der Waals contacts with the base H5/H6 protons of U11. The ability of arginine to make a variety of polar and nonpolar interactions to accommodate a particular binding surface on an RNA is a common motif (Legault et al. 1998; Weiss and Narayana 1998).
van der Waals contacts
With ∼1100 Å² of buried surface area between the peptide and the RNA in the Nφ21 peptide–boxB RNA complex, polar and nonpolar interactions play an important role in binding. Beginning with Ala13, with each four-residue turn of the α-helix to Ile25, the peptide side chains make van der Waals contact to the 5′ bases and riboses of the boxB RNA (Fig. 7B). The methyl group of Ala13 lies between C7 and C8, packing against both the ribose and base H5/H6 protons of these nucleotides. This alanine is conserved among the family of N proteins, and mutagenesis of this position in the Nλ system showed that only alanine and serine maintained wild-type binding affinity for the boxB RNA (Su et al. 1997). Tyrosine-17 is one turn up the α-helix and lies between U9 and C10, the next two bases up the stem from the Ala13 position. In addition to the hydrogen bond interaction of the Tyr17 hydroxyl group, the aromatic ring lies against the edges of the H5/H6 base protons of U9 and C10, and the ribose of U9. Further along the α-helix, Arg21 is positioned to make van der Waals contacts with the U11 base H5/H6 protons of the boxB loop, as discussed above. Finally, Ile25 makes extensive van der Waals contacts with the ribose of A12, at the very top of the boxB hairpin (Fig. 7B). Although the interactions of Arg21 with U11 may play a role in supporting the shape of the RNA backbone in the loop, the close contact of Ile25 may support the back side of the boxB loop on the 3′ face. Lysine-18 may also play a role in recognition and stabilization of the boxB loop, through potentially important van der Waals contacts between the Lys18-Hγ and Hδ groups and the ribose of A13.
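The i, i+4 spacing of Ala13, Tyr17, Arg21, and Ile25 can be pictured on a helical wheel. The sketch below assumes ideal α-helix geometry (3.6 residues per turn, so 100° of rotation per residue), which is only an approximation for the slightly curved helix described in this work. The function name is hypothetical.

```python
def wheel_angle(residue, reference, degrees_per_residue=100.0):
    """Angular position (degrees) of a residue on a helical wheel,
    relative to a reference residue, for an ideal alpha-helix."""
    return ((residue - reference) * degrees_per_residue) % 360.0


# Residues reported to contact the 5' bases and riboses of boxB
for name, number in [("Ala13", 13), ("Tyr17", 17), ("Arg21", 21), ("Ile25", 25)]:
    print(name, wheel_angle(number, reference=13))
# Ala13 0.0
# Tyr17 40.0
# Arg21 80.0
# Ile25 120.0
```

Under this idealization each successive residue drifts by 40° around the helix axis, so the four side chains stay within a 120° arc and can track the RNA surface from the stem up to the loop.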
DISCUSSION
Overview of φ21, λ, and P22 N peptide–boxB structures
The NMR solution structures of the N peptide–boxB RNA complexes for phage λ (Legault et al. 1998; Scharpf et al. 2000) and phage P22 (Cai et al. 1998) have been previously determined. All three peptides bind as α-helices. The Nλ peptide has a pronounced bend in the α-helix of ∼120° as a result of deviations of the backbone φ and ψ angles for Arg11. This bend has been shown to be important for Nλ binding (Tan and Frankel 1995; Su et al. 1997) and antitermination in vivo (Franklin 1993; Su et al. 1997), and is required for bringing Trp18λ into position for stacking onto the boxBλ tetraloop. The NP22 peptide α-helix is also bent, but not as dramatically as that of Nλ. In contrast, the Nφ21 peptide α-helix is almost linear. This result is consistent with the NMR-derived NOE connectivities and backbone torsion angles. In all three complexes, the conserved alanine and conserved arginines play important roles in binding their cognate boxB RNAs, though not always identical in each structure. In addition to the conserved residues, each peptide has a unique subset of residues that participate in the peptide–RNA interaction.
All three boxB RNAs form hairpin structures, and the stems of each are essentially A-form helical RNA. The sequences of the pentanucleotide loops of λ (GAAGA) and P22 (GACAA) are similar, and, in fact, both the λ and P22 boxB loops adopt a fold that is almost identical to a GAAA-type GNRA tetraloop (Cai et al. 1998; Legault et al. 1998). In each case, one of the five nucleotides in the loop is excluded from the tetraloop fold. The boxBλ loop excludes the fourth nucleotide, G, whereas in P22 it is the third nucleotide, C. In a GNRA tetraloop, the first G and last A form a sheared G-A base pair (type XI; Saenger 1984) resulting in a severe change in direction of the phosphate backbone between the first G and the second nucleotide (N) in the loop. The last three nucleotides in the GNRA sequence stack sequentially on the 3′ stem below the loop. The structure of a GAAA tetraloop has been solved by NMR (Heus and Pardi 1991) and X-ray crystallography (Scott et al. 1995; Cate et al. 1996). The r.m.s.d. values of superpositions of the λ and P22 GAAA tetraloop folds with GAAA tetraloops from crystal structures were 1.4 and 0.8 Å, respectively. The Nλ peptide makes no contacts with the bulged G (fourth nucleotide in the loop). It was shown that mutation of the fourth position had no effect on N binding, and mutations that abolish the “GNRA” sequence motif diminish Nλ binding (Doelling and Franklin 1989; Cilley and Williamson 1997). The bulged C (third nucleotide in the loop) in the boxBP22 of the structure makes extensive hydrophobic interactions with the peptide (Cai et al. 1998). In the φ21 complex, only Arg21φ21 and Ile25φ21 have any significant contacts with the loop. The side chain of Arg21φ21 binds in the sharp turn in the phosphate backbone of nucleotides C10, U11, A12, and A13 (Fig. 7A), and makes hydrophobic contact with U11 (Fig. 7B). Just above Arg21φ21, Ile25φ21 packs against the ribose of A12 at the top of the loop. The φ21 peptide has few interactions with nucleotide bases in the boxBφ21 RNA loop, in contrast to the base stacking and extensive base hydrophobic packing seen in the λ and P22 structures.
Stem binding motif
The binding of Nφ21 peptide to its cognate boxB hairpin has a bipartite character where the residues in the lower half of the α-helix interact with the boxB stem, and the upper part of the α-helix interacts with the boxB loop. A similar division can be made in the Nλ and NP22 peptide–boxB complexes. The bend in the Nλ peptide α-helix occurs at Arg11λ (Legault et al. 1998). The slight bend in the NP22 peptide α-helix is centered between Arg6P22 and Glu8P22 (Cai et al. 1998). All three N peptides interact with their cognate boxB stems in these complexes in very similar ways (Fig. 8). Interestingly, a sequence alignment of the three N peptides shows that the greatest degree of homology lies in the first half of the amino acid sequences. Between the conserved alanine and the last of the three conserved arginines, a total of nine residues, there is only one position that is not the same in at least two of the three sequences (Fig. 8). The high degree of sequence homology, combined with similar α-helical secondary structure, results in the three N peptides being almost identical in their binding and recognition of the boxB stem, with a pairwise r.m.s.d. of important side chains and nucleotides of 1.41 ± 0.22 Å (detailed in the Fig. 8 legend).
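A pairwise r.m.s.d. quoted as a mean with a spread over three complexes implies three pairwise comparisons (λ/φ21, λ/P22, φ21/P22). The sketch below shows one way such a summary could be computed, assuming the matched atoms have already been superposed (the fitting step, for example a Kabsch superposition, is not shown). The function names are hypothetical, the coordinates are invented toy values, and the use of a sample standard deviation is an assumption, since the text does not say how the spread was calculated.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched, pre-superposed
    lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def pairwise_rmsd(structures):
    """Return {(name_i, name_j): rmsd} for every pair of structures."""
    return {
        (i, j): rmsd(structures[i], structures[j])
        for i, j in combinations(sorted(structures), 2)
    }


# Toy coordinates (invented, two atoms per "structure") to exercise the code
toy = {
    "A": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "B": [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
    "C": [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
}
values = pairwise_rmsd(toy)
print(values)  # {('A', 'B'): 1.0, ('A', 'C'): 2.0, ('B', 'C'): 1.0}
print(f"{mean(values.values()):.2f} +/- {stdev(values.values()):.2f}")
```

With real data, each list would hold the equivalent side chain and nucleotide atoms selected for the comparison, in the same order for every complex.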
The conserved alanine, Ala13φ21, packs against two 5′-nucleotides (C7 and C8) that are in the same relative position in the boxB stems of all three phages (cyan coloring in Fig. 8). This is a 5′C-G step in λ and φ21, and 5′G-C in P22. Mutation of the λ 5′C-G step to 5′G-C only diminished in vivo antitermination activity approximately sixfold (Chattopadhyay et al. 1995). This would imply that although the interaction of this conserved alanine with the boxB stem at this position contributes to N binding, it is more likely a nonspecific than a sequence-specific interaction.
The first of the three conserved arginines, Arg16φ21, makes electrostatic interactions with the phosphate backbone (blue in Fig. 8). Like the conserved alanine, these interactions occur in the same position along the boxB stems. In both the λ and φ21 complexes, the guanidino groups of this arginine make close contact with the phosphate backbone. In the P22 structure, as reported (Cai et al. 1998), this same arginine is not as strongly oriented towards the phosphate backbone. An Arg6λ → Ala mutation at this position in Nλ reduces binding by >20-fold (Su et al. 1997) and abolishes antitermination activity in vivo (Franklin 1993).
The next residue, Tyr17φ21, is instead an arginine in both λ and P22. However, similar contacts are observed in each structure with the pair of 5′-nucleotides (U9 and C10) above the alanine step. The Tyr17φ21, Arg7λ, and Arg6P22 all make hydrogen bond or electrostatic interactions with the phosphate backbone, and all three have van der Waals contacts with the bases and ribose sugars (green in Fig. 8). The arginines in λ and P22 at this position can also make additional electrostatic interactions through the Hε hydrogen that Tyr17φ21, with its single polar group, is unable to duplicate.
Finally, the second conserved arginine, Arg20φ21, makes electrostatic contacts with the phosphate backbone in the same region in the three structures. In all three structures, this arginine appears to have a weaker, more distant interaction than any of the other arginines. In only 10 of the 14 low energy structures does Arg20φ21 hydrogen bond to the phosphate of C8 or U9. It is important to note that alanine substitutions at this position in Nλ had deleterious effects on binding and antitermination activity (Franklin 1993).
These striking similarities end as each of the N peptides transitions from binding the stem to binding the loop. The three N peptides differ in their recognition of and interactions with their cognate boxB loops. It was proposed that the more conserved parts of the three N peptides would have similar modes of binding the boxB RNA stem (Cai et al. 1998; Legault et al. 1998). This hypothesis is strengthened by the addition of the Nφ21 peptide–boxB structure. The amino acid sequences of λ and P22 in the stem binding part of the peptide are phylogenetically conserved. Nφ21 helps to identify the conserved position and types of interactions that are important for boxB stem binding.
Structural mimicry within the boxB loops
Binding interactions between the N peptides and boxB loops diverge sharply from the structural homology evident in the peptide-stem interactions (Fig. 8). The third conserved arginine in all three complexes binds the boxB loop, but there is not the same degree of overlap seen for the side chains involved in the stem binding part of the peptide (highlighted in red, orange, and yellow in Fig. 8). The pronounced bend in the Nλ α-helix accommodates the stacking of Trp18λ on top of the AAA stack of the tetraloop. This interaction is a critical determinant in λ recognition because neither of the other N peptides has an aromatic residue near this position in its primary sequence, and mutation of this position diminishes binding of the Nλ peptide (Su et al. 1997) and abolishes antitermination in vivo (Franklin 1993). The majority of residues between Arg11λ and Lys19λ are involved in electrostatic and van der Waals interactions with the GAAA tetraloop formed by boxBλ. The NP22 peptide interacts with its cognate boxB through extensive van der Waals interactions with a cytidine nucleotide that is excluded from the GAAA tetraloop structure of boxBP22 (Cai et al. 1998). The NP22 peptide has far fewer interactions with the tetraloop structure of its boxB than does the Nλ peptide. Despite the differences in peptide–loop interactions, the NP22 and Nλ peptide–boxB complexes have essentially the same GAAA tetraloop structure, with the bases of the stacked adenosines on the opposite face from the N peptides.
The structure determined in this work shows that the boxBφ21 loop, in complex with the Nφ21 peptide, does have structural homology with the GNRA-type tetraloops of boxBλ and boxBP22. The GNRA-like fold is mimicked in boxBφ21 by formation of a U-turn and stacking of the last four nucleotides in the loop (AACC) onto the 3′ stem, similar to the AAA stacking in tetraloops of λ and P22. The degree of structural similarity is evident in a superposition of the stacked AACC of boxBφ21 and the stacked Trp18:AAA of boxBλ (Fig. 9). It is striking that the Trp18λ amino acid occupies a similar position to the A12φ21 nucleotide in the boxBφ21 complex and has a similar structural geometry. The faces of the two loops that would be presented to NusA or other bacterial host factors are very similar in both the geometry and the constellation of hydrogen bond donors and acceptors. This is consistent with the hypothesis that the N–boxB complex interacts with some other factor or factors in the antitermination complex. This degree of structural similarity is a surprising coincidence given the disparate RNA loop sequences and the very different ways in which they bind to their cognate N proteins, as discussed above. Despite having very different sets of interactions with their boxB loops, these three peptides ultimately organize the 3′-half of their boxB hairpins to be structurally and chemically similar.
The three peptide–RNA structures help explain the binding specificity and affinity of the N peptide–boxB RNAs (Table 1). The λ and P22 boxB RNAs contain the requisite sequences for forming a GNRA tetraloop, each flipping out one base in the process. The bulged C (third nucleotide) in the boxBP22 may have steric clashes with the Nλ peptide stacking of the critical Trp18λ onto the tetraloop, a critical binding interaction. In contrast, the boxBλ could substitute for the boxBP22 by flipping out the third nucleotide (A instead of C) in the context of the extensive hydrophobic packing of the NP22 peptide. Two factors could contribute to the weak binding of Nλ peptide to boxBφ21 RNA. First, the top of the stacked nucleotides in boxBφ21 is further away from the stem and thus not as accessible for Trp18λ stacking. Second, A12 in boxBφ21 occupies the position where Trp18λ would stack, resulting in steric clash.
N–boxB in a biological context
The boxB hairpin does more than simply anchor N protein to the RNA transcript and help deliver the N protein to RNA polymerase. Although the mechanism of N-mediated RNA binding by NusA is not known, there is evidence that NusA interacts with both the 3′-half of the boxB hairpin (Mah et al. 2000) and directly with N protein (Kd ∼ 70 nM), and that the NusA-binding and RNA-binding domains of N are distinct (Van Gilst and von Hippel 1997). A mutation of guanosine to cytidine at the fourth position in the boxBλ loop or substitution of the GAAGA with a GAAA tetraloop sequence has no effect on Nλ binding in vitro, but abolishes binding of NusA to an Nλ–boxB complex (Mogridge et al. 1995; Legault et al. 1998). Similar mutations abolish antitermination activity in vivo and with a minimal N, NusA, RNAP antitermination system, and indicate that it is the 3′ half of the boxB stem that is being recognized (Chattopadhyay et al. 1995). This evidence suggests that boxB interacts with other elements of the transcription antitermination complex. NusA has an S1 homology region, which is found in proteins that can nonspecifically bind to RNA (Bycroft et al. 1997). This S1 domain is activated to bind RNA by both N protein and the α subunit of E. coli RNA polymerase (Mah et al. 2000). Though there has been no direct evidence of a NusA–boxB interaction in the absence of N protein, NusA is a likely candidate for interaction with the boxB hairpin.
Normally, NusA plays a role in pausing of the transcription complex. Interaction of NusA with N and boxB RNA reverses that role (Mah et al. 2000), diminishing pausing and increasing processivity of the transcription complex. Evidence suggests that NusA binds termination hairpins on the nascent transcript and allosterically affects the RNA polymerase active site (Toulokhonov et al. 2001). What roles do N and boxB play in potentially sequestering this function of NusA? Additionally, it has been shown that NusA cannot bind the λ N peptide–boxB complex if the fourth nucleotide (A or G) is removed or is mutated to a C or U, and supershifts with NusA are sensitive to the identity at this position. In the λ boxB–N structure, this nucleotide is extruded out of the tetraloop and has no apparent interactions with either the boxB RNA or the N peptide. Presumably this bulged nucleotide is available for interactions with NusA or other factors. Neither the [phis]21 nor the P22 boxBs have a purine base bulged out of the side of the loop. How is this seemingly critical determinant of the Nλ–boxBλ–NusA interaction provided for in the other two phage?
Addition of a third member to the family of λ phage boxB–N structures has helped clarify general schemes for α-helix:hairpin complex formation. It is clear that these phage N proteins use very similar modes of interaction to bind the hairpin stem, but diverge significantly in binding and recognition of the hairpin loop, easily discriminating cognate from noncognate. As the [phis]21 N–boxB structure so elegantly shows, these RNA loops and α-helical peptides form complexes using different strategies while still presenting almost identical faces with which to interact with their host.
MATERIALS AND METHODS
Sample preparation
Peptides for these studies were obtained by using a bacterial expression system in which the coding sequence for the peptide was fused to the carboxy terminus of a poly-His-tagged TrpLE leader polypeptide (Schumacher et al. 1996). This system allowed for rapid, high-level expression and purification of both unlabeled and uniformly isotopically enriched (15N, 13C) peptides. Fusion peptides were purified from inclusion bodies from E. coli (JM109) and subsequently subjected to cyanogen bromide cleavage in 70% formic acid to remove the TrpLE leader sequence, followed by reverse phase HPLC purification. The final product was dialyzed extensively against water and lyophilized.
All RNAs were prepared by transcription from synthetic DNA templates using T7 RNA polymerase (Wyatt et al. 1991). The template strand, or bottom strand, of the RNA used for NMR analysis was 5′-GGCTCACCCGGTTAGAGGTGAACCTATAGTGAGTCGTATTA-3′. Use of 4-methyl-indole at the 5′-end of the template DNA causes a dramatic decrease in the amount of N + 1 and higher add-on transcripts (Moran et al. 1996). The optimized transcription conditions used for the N[phis]21 boxB RNA were: 40 mM Tris-HCl (pH 8.1), 1 mM spermidine, 0.01% Triton X-100, 15 mM DTT, 80 mg/mL PEG-8000, 300 nM top strand, 300 nM template strand, 10 mM NTPs (total), 7.8 mM MgCl2, and 0.4 μL T7 RNA polymerase (optimized amount) per 50 μL of transcription volume. Transcription reactions were run for 4 h at 37°C, then extracted with phenol to remove the T7 RNA polymerase protein and ethanol precipitated. The crude RNA was purified using denaturing preparative gel electrophoresis, electroeluted using an Elutrap (Schleicher and Schuell), and ethanol precipitated. The purified RNA was extensively dialyzed against ddH2O, and lyophilized. For the most accurate determination of the RNA concentration, a small sample was hydrolyzed in NaOH (pH 12), neutralized with HCl, and the UV absorbance at 260 nm was used to calculate the concentration.
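As a consistency check on the template sequence given above, the expected run-off transcript can be derived by reverse-complementing the template strand and discarding the promoter. The following is a minimal sketch, not part of the original protocol: it assumes the standard 18-nt T7 promoter (TAATACGACTCACTATAG, with transcription starting at its final G), and the function names are illustrative.

```python
# Assumption (not stated in the text): standard T7 promoter, top strand,
# with transcription starting at the final G (+1).
T7_PROMOTER = "TAATACGACTCACTATAG"

# Template (bottom) strand as given in the text, written 5'->3'.
TEMPLATE = "GGCTCACCCGGTTAGAGGTGAACCTATAGTGAGTCGTATTA"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def runoff_transcript(template, promoter=T7_PROMOTER):
    """Derive the run-off RNA encoded by a bottom-strand T7 template."""
    top = reverse_complement(template)
    if not top.startswith(promoter):
        raise ValueError("template does not begin with the assumed promoter")
    # Keep the promoter's last base (+1 G), which is transcribed.
    coding = top[len(promoter) - 1:]
    return coding.replace("T", "U")


rna = runoff_transcript(TEMPLATE)
print(rna, len(rna))  # GGUUCACCUCUAACCGGGUGAGCC 24
print(rna[9:15])      # CUAACC (nucleotides 10-15)
```

Under these assumptions the product is a 24-mer whose nucleotides 10 to 15 read CUAACC, in agreement with the loop sequence and the A12, A13, C14, and C15 numbering used elsewhere in the text.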
Peptide and RNA stocks were stored at −20°C in a lyophilized state until used. Peptide and RNA samples for NMR were resuspended with 500 μL of ddH2O and dialyzed in a microdialysis chamber (BRL) with a 500 MWCO membrane for 12 to 24 h each against 0.5 M NaCl, 10 mM EDTA, then 0.1 mM EDTA, and finally two changes of ddH2O. Samples were lyophilized and then resuspended in N[phis]21 NMR buffer: 25 mM D6(98%)-succinate (pH 6.0; Cambridge Isotopes), 2 mM NaCl, 0.2 mM EDTA (pH 8.0), 0.05 mM Na-azide, 10% D2O. NMR sample volumes were 600 μL. Samples were exchanged into D2O as needed by lyophilization and resuspension in 99.9% D2O (Cambridge Isotopes). This was repeated three times, and after the third lyophilization, the sample was resuspended in “100%” D2O to 600 μL and transferred to an NMR sample tube.
NMR assignments
A number of heteronuclear NMR experiments were run to determine peptide and RNA 1H, 15N, and 13C resonance assignments, as well as generate restraints for molecular modeling (Table 2). These included NOE distance restraints, torsion angle restraints, sugar pucker and glycosidic bond angle restraints, RNA base pairing, and peptide backbone hydrogen bond restraints (Table 3, Panel A).
Molecular modeling
The molecular modeling of the N[phis]21 peptide–boxB RNA complex was done in three steps. First, a complete intramolecular restraint set was generated for each molecule, and then peptide and RNA structures were generated separately using ab initio simulated annealing (SA) starting from a random extended structure in CNSsolve (Brunger et al. 1998). For both the peptide and the RNA, constrained (torsion) dynamics was used at 50,000 K. Though this approach is more computer intensive than restrained (Cartesian) dynamics, it demonstrates a better rate of convergence of the calculated structures (Rice and Brunger 1994). A total of 100 structures each of the peptide and RNA were generated.
In the second step, each of the 20 lowest energy peptide and 20 lowest energy RNA structures were combined in single PDB files, in all 400 possible combinations. The RNA was held at the origin and the peptide was randomly rotated and moved 100 Å away in a random direction from the origin. These 400 possible “complexes” were docked using CNSsolve. The objective of the docking was to bring the peptide and RNA together without dramatically perturbing their folded structures from the first round of SA, so the temperatures for the docking were set much lower than in the initial calculations (1000 versus 50,000 K).
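The setup of the docking starting points described above can be sketched as follows. This is an illustration only, not the CNSsolve protocol that was actually used: the function names, the pose dictionary layout, and the choice of a random unit quaternion to draw the rotation are all assumptions, and the peptide coordinates are assumed to be centered on the origin before placement.

```python
import itertools
import math
import random

N_BEST = 20         # lowest-energy structures kept per molecule (from the text)
SEPARATION = 100.0  # angstroms between the RNA at the origin and the peptide


def random_unit_vector(rng):
    """Uniform random direction: a normalized triple of Gaussian deviates."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]


def random_rotation(rng):
    """Uniform random rotation matrix built from a random unit quaternion."""
    while True:
        q = [rng.gauss(0.0, 1.0) for _ in range(4)]
        norm = math.sqrt(sum(c * c for c in q))
        if norm > 1e-12:
            break
    w, x, y, z = (c / norm for c in q)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ]


def starting_poses(rng, n_best=N_BEST, separation=SEPARATION):
    """One random peptide placement for every peptide/RNA pairing."""
    poses = []
    for pep, rna in itertools.product(range(n_best), repeat=2):
        direction = random_unit_vector(rng)
        poses.append({
            "peptide": pep,
            "rna": rna,
            "rotation": random_rotation(rng),
            "translation": [separation * c for c in direction],
        })
    return poses


def place(coords, pose):
    """Rotate origin-centered peptide coordinates, then translate them."""
    rot, shift = pose["rotation"], pose["translation"]
    return [
        tuple(sum(rot[i][j] * atom[j] for j in range(3)) + shift[i]
              for i in range(3))
        for atom in coords
    ]


poses = starting_poses(random.Random(0))
print(len(poses))  # 400
```

Each of the 20 × 20 = 400 pairings receives its own random orientation and direction, so the docking does not start from a preferred approach geometry.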
The 100 lowest energy docked structures were minimized by two rounds of low-temperature annealing using Sander, a module of AMBER (Pearlman et al. 1995). As with the docking, the temperature was kept low (1000 K). Our experience has been that structural minimization using AMBER yields better and more consistent results for nucleic acids and protein–nucleic acid complexes than Xplor or Discover. We also wanted to take advantage of the recent improvements in the Generalized Born model for solvation that is incorporated into AMBER. Structure refinement using implicit solvent is computationally expensive, but results in both faster convergence and higher quality structures than in vacuo calculations (Bashford and Case 2000). The 14 lowest energy structures were used for generating an average structure, which was energy minimized using the conjugate gradient method. The structural statistics for the ensemble and the minimized, average structure indicate that, at least within the core peptide-RNA-binding region, the solution structure converges well (Table 3, Panel B).
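The selection and averaging step can be illustrated with a short sketch. It is not the AMBER workflow itself: it assumes each structure is available as an (energy, coordinates) pair, that all coordinate sets list the same atoms in the same order, and that they have already been superimposed. The function names and the toy data are hypothetical.

```python
import math


def lowest_energy(structures, k):
    """Return the k structures with the lowest energy.

    Each structure is an (energy, coords) pair, where coords is a list of
    (x, y, z) tuples in a fixed atom order.
    """
    return sorted(structures, key=lambda s: s[0])[:k]


def average_coordinates(ensemble):
    """Per-atom mean position over pre-superimposed structures."""
    n = len(ensemble)
    n_atoms = len(ensemble[0][1])
    return [
        tuple(sum(s[1][i][d] for s in ensemble) / n for d in range(3))
        for i in range(n_atoms)
    ]


def rmsd(a, b):
    """Root-mean-square deviation between two matched coordinate lists."""
    total = sum((p[d] - q[d]) ** 2 for p, q in zip(a, b) for d in range(3))
    return math.sqrt(total / len(a))


# Toy ensemble: three two-atom "structures" with made-up energies.
structures = [
    (-10.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (-12.0, [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]),
    (-5.0, [(9.0, 9.0, 9.0), (9.0, 9.0, 9.0)]),
]
best = lowest_energy(structures, 2)
mean = average_coordinates(best)
print(mean)                    # [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(best[0][1], mean))  # 1.0
```

Because a simple coordinate average generally does not preserve ideal bond lengths and angles, an averaged structure is typically energy minimized afterward, as was done here.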
The atomic coordinates for the ensemble of low energy structures, as well as that for the minimized, average structure have been deposited with the Protein Data Bank, and assigned PDB ID code 1NYB.
The N peptides and boxB RNAs discriminate among noncognate partners.
Data acquisition parameters for N[phis]21 peptide–boxB RNA heteronuclear NMR experiments.
Summary of restraints used for molecular modeling and the resulting structural statistics for the N[phis]21 peptide–boxB RNA complex.
Alignments of the three phage nut RNAs and N proteins. (A) Schematic of the secondary structure of the nut site RNA highlighting the boxA (red), linker (black), and boxB (blue) domains. (B) Sequence alignment of the three phage nut RNAs. The boxB loop nucleotides are shown in blue italics. (C) Sequence alignment of the amino-terminal domains of the three phage N proteins. The four conserved residues are highlighted in green.
boxB RNA and N peptide minimization for NMR structural studies. (A) The full nut site was truncated and tested for binding against the full-length 41-amino-acid N[phis]21 peptide. The binding affinities of the RNAs were directly compared against one another in a single PACE gel. (B) Deletions were made from both ends of the 41-amino-acid peptide. The relative binding affinity of all of the peptides for boxB[phis]21 were compared in a single PACE gel in which each peptide was cast in the gel at 300 nM concentration. The migration distance of boxB[phis]21 RNA in each lane was compared to that of the N[phis]21 41-amino-acid peptide to evaluate loss of binding affinity.
Binding data from PACE experiments for each of the three phage N peptides with the three boxB RNAs. The Nλ(2–19), NP22(2–30), and N[phis]21(1–41) experiments are graphed in A, B, and C, respectively. The boxBλ, boxBP22, and boxB[phis]21 RNAs in all three graphs are represented by an open circle, a triangle, and a square, respectively. The y-axis, Θ, is the normalized fraction bound. The error bars are estimates from the χ2 of the least-squares fit to the data. All of these PACE binding assays were run at 25°C, with no salt and no detergent.
Peptide and RNA NMR data. All experiments were carried out at 25°C. (A) 15N-HSQC on labeled peptide both free (black) and bound (red) to unlabeled boxB RNA. Peaks are labeled with their amino acid and sequence number for the backbone amide (black), arginine epsilon (blue), and arginine eta (green) protons. The epsilon and eta protons are folded in the 15N dimension by 32.9 ppm. (B) Identification of intermolecular NOEs. An ω1-13C-Filtered, ω2-13C-Edited NOESY-HSQC spectrum recorded on an unlabeled boxB RNA in complex with a fully 13C,15N-labeled N peptide. Strong crosspeaks between the peptide (blue) and RNA (red) are noted.
The NMR derived restraints and resulting low energy structures. (A) Schematic of distance restraints. Internucleotide RNA–RNA restraints are indicated by dashed lines between protons (colored dots). Peptide–RNA restraints (solid arrows) indicate one or more NOEs between the peptide side chains and the RNA ribose (pentagon) or base (rectangle). (B) Superposition of the 14 lowest energy structures of the [phis]21 N peptide–boxB RNA complex, drawn in red and blue, respectively. The ordered regions of both the RNA (C7 to G18) and the peptide (Ala13 to Ile25) are shown and the alignment was calculated using all of the heavy atoms in this region. For clarity, only the backbone atoms of the peptide are shown.
(A) Stereo view of the minimized, average structure of the complex. The heavy atoms and phosphate backbone ribbon of the RNA (blue) are shown in complex with the peptide (red). The conserved residues (Ala13, Arg16, Arg20, and Arg21) are highlighted (green). (B) View of minimized, average structure looking down the α-helical axis. RNA bases and backbone phosphates along the groove contacted by the peptide are highlighted (cyan), as are side chains that had NOEs (distance restraints) to the RNA (green).
A Connolly surface over the boxB RNA helps illustrate the close contacts within the peptide–RNA complex. In both panels, the RNA phosphate and oxygen atoms are colored white with the rest of the nucleotide atoms in blue. The peptide backbone is shown as a red ribbon. Peptide side chains are labeled using the single-letter code for the amino acid and the peptide sequence number. The RNA phosphates are numbered along the phosphate backbone. (A) Electrostatic and hydrogen bond interactions with the phosphate backbone. Ball and stick representations of amino acid side chains with potential phosphate interactions. (B) van der Waals contacts in the complex. Yellow CPK representations are used for the side chains that pack against the boxB RNA.
Two views of the superposition of the boxB stem-peptide recognition regions from all three phage. An amino acid sequence alignment of the three N peptides is at the bottom. In the molecular models, the three boxB RNA hairpins (blue ribbon for the phosphate backbone and purple spheres for the actual phosphate atoms) and the three peptide backbones (shown as colored ribbons) have been superimposed using the phosphate backbone and heavy atoms of the cyan and green colored nucleotides along with the backbone and side chain heavy atoms from the conserved alanine (cyan) and arginines (light blue with violet Nη spheres). The pairwise r.m.s.d. is 1.41 ± 0.22 Å. The coloring of the side chains and bases in the molecular models corresponds to the coloring in the sequence alignment at bottom. The nucleotides on the 3′ half of the boxB stem have been removed for greater clarity.
Two views of the superposition of the λ and [phis]21 boxB loops. The peptides are shown as ribbons with only the λ-Trp18 side chain drawn in. The boxB loops of λ and [phis]21 (cyan and blue, respectively) are shown using heavy atoms and a phosphate backbone ribbon. The λ and [phis]21 structures were superimposed using the heavy atoms of [phis]21 A12, A13, C14 and C15 with the λ Trp18, A9, A10, and A12 with a resultant r.m.s.d. of 1.9 Å (106 atoms). Potential hydrogen bond donors and acceptors in both structures are highlighted as spheres (green and purple, respectively).
Acknowledgments
This work has been supported by grants from the NIH (GM-53320) and the Skaggs Institute for Chemical Biology. The authors gratefully acknowledge the assistance of John Chung of the Buddy Taub NMR center at The Scripps Research Institute.
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.
Footnotes
Article and publication are at http://www.rnajournal.org/cgi/doi/10.1261/rna.2189203.
- Accepted February 14, 2003.
- Received December 3, 2002.
- Copyright 2003 by RNA Society











