Nuclear export receptor CRM1 binds highly variable nuclear export signals (NESs) in hundreds of different cargoes. Previously we have shown that CRM1 binds NESs in both polypeptide orientations (Fung et al., 2015). Here, we show crystal structures of CRM1 bound to eight additional NESs, which reveal diverse conformations that range from loop-like to all-helix and occupy different extents of the invariant NES-binding groove. Analysis of all NES structures shows 5-6 distinct backbone conformations where the only conserved secondary structural element is one turn of helix that binds the central portion of the CRM1 groove. All NESs also participate in main chain hydrogen bonding with the human CRM1 Lys568 side chain, which acts as a specificity filter that prevents binding of non-NES peptides. The large conformational range of NES backbones explains the lack of a fixed pattern for their 3-5 hydrophobic anchor residues, which in turn explains the large array of peptide sequences that can function as NESs.
- Some nuclear export signals (NESs) bind to the transport receptor CRM1 in the opposite orientation to those previously studied. eLife 2015;4:e10034. DOI: 10.7554/eLife.10034
The chromosome region maintenance 1 protein (CRM1) or Exportin-1 (XPO1) binds 8–15-residue-long nuclear export signals (NESs) in hundreds of different protein cargoes (Fornerod et al., 1997; Fukuda et al., 1997; Stade et al., 1997; Ossareh-Nazari et al., 1997; Xu et al., 2012a; Kırlı et al., 2015; Thakar et al., 2013). NES sequences are very diverse but each usually has 4–5 hydrophobic residues (often Leu/Val/Ile/Phe/Met; labeled Ф0–Ф4) that bind hydrophobic pockets (labeled P0–P4) in a hydrophobic groove formed by HEAT repeats 11 and 12 of CRM1 (9–16). The hydrophobic anchor residues are arranged in many ways, currently described by ten consensus patterns for corresponding NES classes 1a, 1b, 1c, 1d, 2, 3, 1a-R, 1b-R, 1c-R and 1d-R (Figure 1A) (Fung et al., 2015).
Previous structures of CRM1 bound to five different NESs showed virtually identical NES-binding grooves (Fung et al., 2015; Monecke et al., 2009; Dong et al., 2009; Güttler et al., 2010). NESs from Snurportin-1 (SNUPNNES; class 1c) and Protein Kinase A Inhibitor (PKINES; class 1a) bind CRM1 as α-helix followed by a short β-strand, while the proline-rich NES from HIV-1 REV (RevNES; class 2) adopts mostly extended conformation (Figure 1B) (Güttler et al., 2010). The majority of CRM1-NES interactions involve NES hydrophobic anchor side chains, with very few polar and main chain interactions. Previously, we studied NESs with the Φ1XXΦ2XXXΦ3XXΦ4 (class 3) pattern where the i, i + 3, i + 7, i + 10 Φ positions suggested a single long amphipathic helix. However, it is perplexing that a long all-helical peptide could fit in the narrow tapering CRM1 groove. Structures of such NESs from kinase RIO2 and cytoplasmic polyadenylation element-binding protein 4 (hRio2NES, CPEB4NES) showed that they do not adopt all-helical conformations but unexpectedly adopt helix-strand conformations that bind CRM1 in the opposite or minus (−) polypeptide direction to that of SNUPNNES, PKINES and RevNES ((+) NESs) (Fung et al., 2015). hRio2NES and CPEB4NES were hence reclassified as class 1a-R NESs (Figure 1A).
We do not understand the extent of NES structural diversity nor how NESs with different hydrophobic patterns that presumably reflect different secondary structures all bind to the seemingly invariant and three-dimensionally constrained CRM1 groove. Here, eight new structures of CRM1 bound to diverse NESs show several different and unexpected NES backbone conformations that share only a common one-turn helix element. All NESs also participate in hydrogen bonding with Lys568 of HsCRM1, and mutagenic/structural analysis identifies Lys568 as a selectivity filter that blocks binding of non-NES peptides.
CRM1-bound NESs adopt diverse conformations
We study three NESs that uniquely match the all-helical class 3 pattern (Ф1XXФ2XXXФ3XXФ4). Because most previously studied NESs have substantial helical content, we also study five NESs that match class 2 (Ф1XФ2XXФ3XФ4) and class 1b (Ф1XXФ2XXФ3XФ4) patterns, where the hydrophobic residue positions do not suggest an amphipathic helix (Figure 1A). All eight NESs bind HsCRM1 in the presence of RanGTP with dissociation constants (KDs) of 670 nM-20 μM, and were crystallized bound to the previously described engineered ScCRM1-RanGppNHp-Yrb1p complex (Figure 1—figure supplement 1, Figure 1—source data 1, 2 and 3) (Fung et al., 2015). Details of how the NES peptides were modelled can be found in methods section and in Figure 1—figure supplements 2 and 3. NES-bound CRM1 grooves in the structures (2.1–2.4 Å resolution) resemble the PKINES-bound MmCRM1 groove (Cα/all-atom rmsds 0.5 Å/1.1 Å for 85 groove residues), and all NESs use 4–5 hydrophobic anchor residues to bind P0-P4 hydrophobic pockets of CRM1 (Figures 1D–F and 2) (Güttler et al., 2010).
Class 3 NESs from mouse diaphanous homolog 3 (mDia2NES: 1179SVPEVEALLARLRAL1193) and the cell division cycle 7-related protein kinase (CDC7NES: 456QDLRKLCERLRGMDSSTP473) are indeed all-helix peptides, both forming 3-turn α-helices that occupy only the wide part of the CRM1 groove (Figure 1D,E, Figure 1—figure supplement 4). The last residue of the mDia2 protein (Leu1193) binds the CRM1 P3 pocket, leaving the P4 pocket empty. CDC7NES is far from the protein C-terminus, but structures of longer peptides suggest that CDC7NES exits the groove after Met468 or Φ3 (Figure 1—figure supplement 5). The mDia2NES and CDC7NES sequence patterns should thus be Φ0XXΦ1XXXΦ2XXΦ3. The Φ4 anchor position is clearly not used in mDia2NES and CDC7NES even though Φ4 is key for activities of several other NESs (Wen et al., 1995; Meyer et al., 1996; Richards et al., 1996; Scott et al., 2002; Tsukahara and Maru, 2004). The number of Φ anchor residues necessary for CRM1 binding can vary from 3 to 5 (Figure 1—figure supplement 6). A third class 3-matching NES from beta-amyloid binding protein X11L2 (X11L2NES: 55SSLQELVQQFEALPGDLV72) binds differently (Figure 1F, Figure 1—figure supplement 4). 57LQELVQQFEAL67 forms a 3-turn α-helix, 68PGDL71 forms a type I β-turn, and X11L2NES therefore exhibits a new Φ0XXΦ1XXXΦ2XXΦ3XXXΦ4 (class 4) pattern.
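The consensus patterns quoted above can be written as regular expressions and scanned against a peptide. The following is a minimal sketch, not the procedure used in this study. It assumes that Φ is limited to Leu/Val/Ile/Phe/Met (the text says these are only the most frequent anchors) and that X is any residue, both of which are simplifications. The names `CLASS_PATTERNS`, `to_regex` and `find_consensus_matches` are hypothetical, and only the patterns spelled out in the text (classes 1b, 2, 3 and 4) are included.

```python
import re

# Assumption: PHI anchors are restricted to the five residues named in the text.
HYDROPHOBIC = "LVIFM"

# Patterns as written in the text: 'h' = PHI anchor, 'x' = any residue (assumption).
CLASS_PATTERNS = {
    "1b": "hxxhxxhxh",
    "2": "hxhxxhxh",
    "3": "hxxhxxxhxxh",
    "4": "hxxhxxxhxxhxxxh",
}


def to_regex(pattern):
    """Translate an 'h'/'x' pattern string into a compiled regular expression."""
    body = "".join(f"[{HYDROPHOBIC}]" if c == "h" else "." for c in pattern)
    # A lookahead is used so that overlapping matches are all reported.
    return re.compile(f"(?=({body}))")


def find_consensus_matches(sequence):
    """Return (class, zero-based offset, matched segment) for every pattern hit."""
    hits = []
    for nes_class, pattern in CLASS_PATTERNS.items():
        for m in to_regex(pattern).finditer(sequence):
            hits.append((nes_class, m.start(), m.group(1)))
    return hits


if __name__ == "__main__":
    # mDia2NES peptide sequence from the text (residues 1179-1193).
    for hit in find_consensus_matches("SVPEVEALLARLRAL"):
        print(hit)
```

For the mDia2NES sequence this sketch reports two overlapping class 3 hits, at offsets 1 and 4. The hit at offset 4 spans Val1183 to Leu1193, which agrees with the structure placing Leu1193 in the P3 pocket. A consensus match by itself does not establish CRM1 binding, so hits of this kind are candidates only.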
Structures of NESs with non-helical patterns are also informative. The previous structure of RevNES (class 2) suggested that its three prolines may constrain against a helical conformation (Figure 1B) (Güttler et al., 2010). Class 2 NESs in the Mothers against decapentaplegic homolog 4 protein (SMAD4NES: 133YERVVSPGIDLSGLTLQ149) and the fragile X mental retardation protein (FMRPNES: 423YLKEVDQLRLERLQI437) have few to no prolines but still bind CRM1 with mostly loop-like structures (Figure 2A,B, Figure 2—figure supplement 1). Histone deacetylase 5 (HDAC5NES: 1082EAETVSAMALLSVG1095) and Paxillin (PaxNES: 264RELDELMASLSDFKFMAQ281) have NESs with the non-α-helical class 1b pattern (Ф1XXФ2XXФ3XФ4), but the peptides bind according to the class 1a pattern instead (Figure 2C,D, Figure 2—figure supplement 1). This left no other experimentally verified NES in the databases that unambiguously matches the class 1b pattern (Kosugi et al., 2008; Xu et al., 2012c). We engineered a class 1b NES by adding an alanine to FMRPNES (FMRP-1bNES; YLKEVDQLRALERLQID), which forms a short 1.5-turn 3₁₀ helix followed by a 3-residue β-strand (Figure 2E, Figure 2—figure supplements 1 and 2). The 3₁₀ helix favorably presents Φ1 and Φ2 into the CRM1 groove and natural class 1b NESs are likely to bind similarly.
Structural requirements for an NES
Structures of >13 different CRM1-bound NESs are now available, and may be sorted into five or six groups according to peptide backbone conformations (Figure 3A). Class 1 NESs are helix-strand peptides with either α-helices (class 1a, 1c) or 3₁₀ helices (class 1b). Class 1-R NESs are strand-helix peptides, class 2 NESs are mostly loop-like and class 3 NESs are all-helix peptides. The helix-β-turn X11L2NES structure revealed a new Φ0XXΦ1XXXΦ2XXΦ3XXXΦ4 (class 4) pattern.
The only common secondary structural element in the NES structures is one turn of NES helix at Φ2X2-3Φ3 (grey box, Figure 3A). This conserved turn of helix is flanked on one side by additional turns of helix (classes 1, 1-R) or by loops (class 2), and on the other side by β-strands (classes 1, 1-R, 2) or β-turn (class 4), or the helix ends as the chain terminates or exits the groove (class 3) (Figure 3A). Dihedral (psi) angles in the 1-turn of helix gradually increase in progression from helical to β-strand conformations (Figure 3B).
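The psi angle referred to above is the backbone dihedral defined by the atoms N(i), Cα(i), C(i) and N(i+1). As an illustration of how such an angle can be computed from atomic coordinates, the following is a minimal pure-Python sketch. The function names are hypothetical and no coordinates from the deposited structures are used. For orientation, psi values near −45° are typical of α-helix and values of roughly +120° to +140° are typical of β-strand.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points (IUPAC sign convention)."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    b2_len = math.sqrt(_dot(b2, b2))
    y = _dot(b2, _cross(n1, n2)) / b2_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def psi_deg(n_i, ca_i, c_i, n_next):
    """Backbone psi of residue i from N(i), CA(i), C(i) and N(i+1) coordinates."""
    return dihedral_deg(n_i, ca_i, c_i, n_next)


if __name__ == "__main__":
    # Idealized test geometry (not real protein coordinates).
    print(psi_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applied residue by residue along a bound peptide, a calculation of this kind would give the progression of psi values that distinguishes the helical turn from the strand that follows it.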
In all (+) NESs, main chain carbonyls of Φ2+1 and Φ3 residues in the 1-turn helix element hydrogen bond with the ScCRM1 Lys579 (or MmCRM1/HsCRM1 Lys568) side chain, much like niche3/4 motifs where carbonyls of residues i and i + 2 or i + 3 coordinate a cationic group (Torrance et al., 2009). The Φ3-Lys579 hydrogen bond is possible only because the β-strand psi angle turns the Φ3 carbonyl towards Lys579 (Figure 3C). NES helix-Lys579 hydrogen bonds are absent in (−) NESs as backbone carbonyls point in the opposite direction. Here, carbonyls of the N-terminal β-strand hydrogen bond with Lys579 (Figure 3—figure supplement 1). Therefore, another common structural feature of NESs is hydrogen bonding between the NES backbone and ScCRM1 Lys579 (HsCRM1 Lys568). Mutations of HsCRM1 Lys568 impair NES binding. Mutants HsCRM1(K568A) and HsCRM1(K568M) bind FITC-PKINES two to three orders of magnitude more weakly than wild type HsCRM1, supporting the importance of Lys568-NES interactions (Figure 3—figure supplement 2).
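To put the loss of affinity on an energy scale, a fold change in KD can be converted to a difference in binding free energy using ΔΔG = RT ln(KD,mutant/KD,wild type). The sketch below is illustrative only. It assumes a temperature of 25 °C, which is not stated in the text, and the function name `ddg_from_fold_change` is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold_change(fold_change, temperature_k=298.15):
    """Binding free energy difference (kcal/mol) for a given KD ratio.

    fold_change is KD(mutant) / KD(wild type); the default temperature
    of 298.15 K is an assumption, not a value taken from the study.
    """
    return R_KCAL_PER_MOL_K * temperature_k * math.log(fold_change)


if __name__ == "__main__":
    for fold in (100, 1000):  # two and three orders of magnitude
        print(fold, round(ddg_from_fold_change(fold), 2))
```

Under this assumption, a 100-fold to 1000-fold weakening corresponds to roughly 2.7 to 4.1 kcal/mol of lost binding energy.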
In summary, an active NES (1) can use many different backbone conformations to present 3–5 hydrophobic anchor residues into 3–5 CRM1 hydrophobic pockets (P0 and/or P4 are sometimes not used), (2) has one turn of helix with helix-strand transition that binds the central portion of the CRM1 groove and (3) has backbone conformation that can hydrogen bond with Lys568 of HsCRM1.
HsCRM1 Lys568 is a selectivity filter for NES recognition
What, then, are the CRM1 groove features that selectively recognize the key NES features? Arrangement of hydrophobic pockets in the groove likely selects NESs with suitably placed Φ residues. Groove shape, tapering and most constricted at ScCRM1 Lys579 (HsCRM1 Lys568), likely selects for NES helices that transition to strands or NES helices that end (Figure 3C). Is groove-constricting HsCRM1 Lys568 perhaps key for differentiating active from false positive NES sequences? We tested mutants HsCRM1(K568A) and HsCRM1(K568M) for interactions with three previously identified false positive NESs that match NES consensus but do not bind CRM1: peptides from Hexokinase-2 (Hxk2pep: 18DVPKELMQQIENFEKIFTV36; class 3 match), Deformed Epidermal Autoregulatory Factor 1 homolog (DEAF1pep: 452SWLYLEEMVNSLLNTAQQ469; class 1a-R match) and COMM domain-containing protein 1 (COMMD1pep: 173KTLSEVEESISTLISQPN190; class 3 match) (Figure 4A) (Xu et al., 2012c, 2015). Wild type HsCRM1 does not bind the peptides but HsCRM1(K568A) binds Hxk2pep and DEAF1pep, and HsCRM1(K568M) binds DEAF1pep but not Hxk2pep, suggesting that Lys568 is important in filtering out false positive NESs (Figure 4A).
Both ScCRM1(K579A)-bound Hxk2pep and DEAF1pep are all-helix peptides (Figure 4B,C, Figure 4—figure supplement 1, Figure 4—source data 1). The fourth turn of the Hxk2pep helix packs into hydrophobic space widened by removal and rearrangement of the Lys579 and Glu582 side chains, respectively (Figure 4B). The 2.5-turn α-helix of DEAF1pep binds in the (−) direction and is slightly longer than helices in true (−) NESs (Figure 4C). Superpositions of Hxk2pep and DEAF1pep onto wild type CRM1 grooves show the fourth turn of the Hxk2pep helix and the N-terminus of the DEAF1pep helix clashing with ScCRM1 Lys579/MmCRM1 Lys568 side chains (Figure 4B,C, Figure 4—figure supplement 2). The rest of the mutant ScCRM1(K579A) groove is highly similar to the wild type groove. Therefore, the key feature of the wild type groove that prevents Hxk2pep and DEAF1pep binding is Lys568, which is not only a critical hydrogen bond donor for binding NESs, but its long side chain also blocks binding of sequences that do not meet NES structural requirements.
Class 1a, 1b, 1c, 2, 3, 4 and 1a-R NES structures show 5–6 distinct backbone conformations that match their respective hydrophobic sequence patterns. We can infer structures of remaining NES classes: class 1d NESs (Ф1XXФ2XXXФ3XФ4) are likely α-helix-strand and other (−) NES classes are likely the reverse of their (+) counterparts. Symmetrical class 2, 3 and 4 patterns are identical in both (+)/(−) directions but (−) class 3 and 4 NESs, lacking β-strands to hydrogen bond with HsCRM1 Lys568, may not be ideal NESs.
Structures of many diverse NES sequences suggest how one unchanging peptide-bound CRM1 groove can recognize up to a thousand different peptides. Dependence on 3–5 hydrophobic residues in 8–15-residue-long NESs arises from the substantial binding energy of anchor hydrophobic side chains interacting with 3–5 CRM1 hydrophobic pockets. However, lack of contact with the NES backbone allows anchor side chains to be presented in many conformations including both N- to C-terminal orientations, explaining broad specificity defined by highly variable spacings between anchors. Interestingly, NES conformation is not entirely unrestrained, as CRM1 groove constriction imposes either exit/termination of the NES chain or its continuation in extended configuration. Solutions for the broadly specific NES recognition contrast with those of analogous systems. MHC I and II proteins, each recognizing at least hundreds of different peptide antigens, use many peptide main chain contacts for affinity with only a few supplementary peptide side chain interactions (Zhang et al., 1998; Madden, 1995). The result in the MHC case is a conformational selection of particular lengths of extended peptides binding in conserved N- to C-terminal orientation, with little sequence restriction. The Calmodulin-helical peptide system is yet another contrast, as the binding domain uses its flexible fold to adapt to various helical ligands (Tidow and Nissen, 2013; Hoeflich and Ikura, 2002).
CRM1-NES structures expanded the six NES patterns derived from peptide library studies to the eleven patterns shown in Figures 1A and 3A. The ever-expanding set of NES patterns suggests that no fixed hydrophobic pattern likely describes the NES. Furthermore, only ~50% of consensus-matching previously reported NESs that were tested actually bound CRM1, contributing to the inefficiency of available NES predictors (with precision of 50% at 20% recall rate) (Xu et al., 2012b, 2015; Kosugi et al., 2014; Fu et al., 2011). The many available NES structures, diversity of NES conformations and the structurally conserved one-turn helix NES element revealed here will enable development of structure- rather than sequence-based NES predictors (Raveh et al., 2011; Schindler et al., 2015; Trellet et al., 2013; Yan et al., 2016). There is a need to identify many more CRM1 cargoes as apoptosis of different cancer cells upon CRM1 inhibition by the drug Selinexor (in clinical trials for a variety of cancers) and other inhibitors (Parikh et al., 2014; Mendonca et al., 2014; Das et al., 2015; Alexander et al., 2016; Gounder et al., 2016; Abdul Razak et al., 2016; Lapalombella et al., 2012; Inoue et al., 2013; Etchin et al., 2013; Tai et al., 2014; Cheng et al., 2014; Kim et al., 2016; Hing et al., 2016; Vercruysse et al., 2016) appears to be driven by nuclear accumulation of different sets of NES-containing cargoes, but identities of most of these apoptosis-causing cargoes are still unknown.
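The precision and recall figures quoted for NES predictors follow the standard definitions: precision is the fraction of predicted NESs that are true NESs, and recall is the fraction of true NESs that are recovered. The following is a minimal sketch with made-up counts, chosen only so that they reproduce the 50% precision and 20% recall quoted above. The counts are not data from any cited predictor, and the function name `precision_recall` is hypothetical.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from confusion-matrix counts."""
    predicted = true_positives + false_positives  # everything the predictor called an NES
    actual = true_positives + false_negatives     # every real NES in the test set
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical test set: 100 real NESs; the predictor proposes 40
    # candidates, of which 20 are real.
    print(precision_recall(true_positives=20, false_positives=20, false_negatives=80))
```

In this hypothetical case, half of the proposed candidates are wrong and four out of five real NESs are missed, which illustrates why the text describes available predictors as inefficient.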
Finally, we find that the HsCRM1 Lys568 side chain acts as a filter that physically selects for NESs with helices that transition to strands or end at the narrow part of the CRM1 groove. Interestingly, Lys568 interacts electrostatically with HsCRM1 Glu571, which is mutated to glycine or lysine in chronic lymphocytic leukemia and lymphomas with poor prognosis (Puente et al., 2011; Jardin et al., 2016; Camus et al., 2016). Disease mutations will abolish Glu571-Lys568 contacts and possibly affect NES binding and selectivity.
Materials and methods
Crystallization of CRM1-Ran-RanBP1-NES complexes
CRM1 (ScCRM1 residues 1–1058, △377–413, 537DLTVK541 to GLCEQ, V441D), Yrb1p (yeast RanBP1 residues 62–201), human Ran full length and various NESs were expressed and purified as described in Fung et al. (Fung et al., 2015). CRM1 (K579A) mutant was expressed in pGex-TEV and purified like CRM1. Crystallization, data collection and processing, and solving of the structures were also performed in the same manner as previously described. X-ray/stereochemistry weight and X-ray/ADP weight were both optimized by phenix.refine in PHENIX (RRID:SCR_014224).
NES peptides are modeled according to the positive difference density (2mFo-DFc map) at the binding groove after refinement of the CRM1-Ran-RanBP1 model. In all structures, there are good electron densities for the NES main chain and directions of side chain density in the helical portion of the peptides allow unambiguous determination that they are all oriented in the positive (+) NES orientation. Side chain assignments of the NES peptides are guided by (1) densities of Ф side chains that point into the binding groove, (2) densities for long non-Ф side chains such as arginine, phenylalanine and methionine and (3) physical considerations such as steric clashes.
For example, to model the bound mDia2NES peptide (sequence: GGSY-1179SVPEVEALLARLRAL1193), we made use of the obvious electron densities (mFo-DFc map) for long side chains to guide sequence assignment. There is a strong side chain density suitable for an arginine side chain on the peptide (white dashed circle in Figure 1—figure supplement 2A). There are only two arginine residues in the mDia2NES peptide, Arg1189 and Arg1191. If the long side chain density is assigned to Arg1189, then Arg1191 would end up pointing into the binding groove, a very energetically unfavorable and unlikely situation. Furthermore, Ala1188 would end up in the P2 pocket of CRM1 where there is an obvious density for a longer hydrophobic side chain (left panel, Figure 1—figure supplement 2A). On the other hand, when Arg1191 is assigned to the long and continuous side chain density (adjacent to helix H12A of CRM1), the remaining side chains in the NES end up in positions that are consistent with the electron densities.
For the FMRP-1bNES (sequence: 1GGS-YLKEVDQLRALERLQID20), there are no obvious long side chain densities that could help with modeling. There are however obvious densities for several side chains in the first two turns of the NES helix. These side chain densities are consistent with two possible sequence assignments: 5LKEVDQLRAL14 or the more C-terminal 11LRALERLQID20. We tested modeling of FMRP-1bNES by refining both peptide models and by testing a mutant peptide that should distinguish between the two models. Ten cycles of PHENIX refinement of the 5LKEVDQLRAL14 model resulted in positive and negative difference densities (mFo-DFc map) at several NES side chains, which suggested an incorrect assignment (left panels, Figure 1—figure supplement 2B). In contrast, difference densities are absent when the 11LRALERLQID20 model is refined (right panels, Figure 1—figure supplement 2B). The final FMRP-1bNES structure was therefore modelled as 11LRALERLQID20. The sequence assignment of FMRP-1bNES was also tested using a mutant FMRP-1bNES that has the sequence YLKEVDQLRALER. If the NES is 5LKEVDQLRAL14, FMRP-1bNES mutant YLKEVDQLRALER should bind well to CRM1. However, if 11LRALERLQID20 is the FMRP-1bNES, mutant YLKEVDQLRALER should not bind CRM1 as the C-terminal half of the NES or 17LQID20, which includes Ф3 and Ф4, is missing. Results in Figure 1—figure supplement 3 show that FMRP-1bNES mutant YLKEVDQLRALER does not bind CRM1, providing further support that the NES is indeed 11LRALERLQID20 as currently assigned.
NES activity assays
Pull-down binding assays, in vivo NES activity assay and differential bleaching experiments for determining binding affinities were all performed the same way as described in Fung et al. (2015). The data were analyzed in PALMIST (Scheuermann et al., 2016) and plotted with GUSSI (Brautigam, 2015).
Structures and crystallographic data have been deposited at the PDB: 5UWI (CRM1-HDAC5NES), 5UWH (CRM1-PaxNES), 5UWO (CRM1-FMRP-1bNES), 5UWJ (CRM1-FMRPNES), 5UWU (CRM1-SMAD4NES), 5UWP (CRM1-mDia2NES), 5UWQ (CRM1-CDC7NES), 5UWR (CRM1-CDC7NES ext), 5UWS (CRM1-X11L2NES), 5UWT (CRM1(K579A)-Hxk2pep), 5UWW (CRM1(K579A)-DEAF1pep).
We thank the Structural Biology Laboratory and Macromolecular Biophysics Resource at UTSW for their assistance with crystallographic and biophysical data collection. Results shown in this report are derived from work performed at Argonne National Laboratory, Structural Biology Center at the Advanced Photon Source. Argonne is operated by UChicago Argonne, LLC, for the US Department of Energy, Office of Biological and Environmental Research under contract DE-AC02-06CH11357. Ho Yee Joyce Fung is a Howard Hughes Medical Institute International Student Research fellow. This work was funded by the Cancer Prevention Research Institute of Texas (CPRIT) Grants RP120352 and RP150053 (YMC), R01 GM069909 (YMC), the Welch Foundation Grant I-1532 (YMC), Leukemia and Lymphoma Society Scholar Award (YMC), the University of Texas Southwestern Endowed Scholars Program (YMC), and a Croucher Foundation Scholarship (HYJF).
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your article "CRM1 recognizes diverse Nuclear Export Signals conformations" for consideration by eLife. Your article has been reviewed by two peer reviewers, and the evaluation has been overseen by a Reviewing Editor and John Kuriyan as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Gino Cingolani (Reviewer #1) and Douglas Barrick (Reviewer #2).
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
This "Research Advances" manuscript describes continued structural analysis of NES sequences bound to CRM1, a nuclear export receptor. In the previous 2015 eLife paper, the authors published two crystal structures of bound sequences and showed that they can interact in different ways, reversing chain orientation. In the present study, they solve eight additional structures from diverse sequences, and characterize additional variation in the mode of binding. The results demonstrate that a 'folded' secondary structure is not a prerequisite to bear a functional NES. Instead, sidechains from the NES sequences arranged in conformations that match the tight and restrained groove generated at the HEAT:HEAT interface are the key determinants. The results help to define different sequence motifs for binding, and how they translate into structure. The present study is certainly an advance from the previous work. As such, the manuscript is favorable for the format of "Research Advances" in eLife.
1) One major comment to be addressed in revision is that the authors should do a better job in Figure 1 showing the sequences in each NES subfamily, their expected structures, and their actual structures (perhaps a string of h's, s's, and t's). And importantly, indicate which sequences are from the current study, and which are from previous studies. Without a better guide, most readers won't remember which NES sequence has which structure, sequence motif, and orientation.
2) Although all of the crystal structures are solved at a resolution ranging from 2.0 to 2.5 Å, the electron densities for some of the NES peptides may not have sufficient quality for reliable model building (some of the sidechain densities for the critical hydrophobic residues are missing, e.g. FMRP-1b, mDia-2, etc.). The overall statistics (R factor and R-free factor) cannot reflect this because the peptide only constitutes a small portion of the entire structure. Moreover, the B factors for some of the peptides are extremely high compared to the CRM1*-Ran-RanBP1 protein, suggesting overfitting of the peptides to some extent. The authors need to address this point by stating the rationales for modelling the peptides as they reported.