Regulation of gene expression in the cytoplasm
Peter Josef Lukavsky, Research Group Leader
Brno, 2019

RNA plays a central role in biology
DNA → RNA → Protein
RNA polymerase II (Cramer et al., Science 2001)
70S ribosome (Weixlbaumer et al., Science 2008)

RNAs function both in the nucleus and cytoplasm
(Courtney Hodges/UC Berkeley)

The life of an RNA starts with transcription in the nucleus...

Post-transcriptional regulation of gene expression
- Alternative splicing of pre-mRNA
- mRNA processing (5’-/3’-end formation)
- Nuclear export
- Cytoplasmic RNA transport
- Translational control
- mRNA decay
Regulated by multiple RNA signals in complex with proteins

Why regulate gene expression post-transcriptionally?
- Greater diversity of gene products
- Rapid response to stimuli
- Spatial and temporal control of protein synthesis
- Flexible control allows fine-tuning of protein synthesis

~70% of transcripts could be asymmetrically localized (Lécuyer et al., Cell 2007)
Dendritic mRNA transport for memory formation (Sutton et al., Cell 2006)
Anterior-posterior axis formation during development (Martin et al., Cell 2009)

mRNPs regulate gene expression: fertilization, development, cell cycle, stress response
Deregulation leads to disease: cancer, diabetes, neurological disorders
=> RNA-protein interactions are a valuable drug target!

mRNAs are usually transcribed as precursors
• Exon: any nucleotide sequence encoded by a gene that remains present within the final RNA product.
• Intron: any nucleotide sequence encoded by a gene that is removed by RNA splicing from the final RNA product.
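The exon and intron definitions above can be made concrete with a toy example. The following Python sketch is an illustration only: the sequence, the coordinates and the function names `splice` and `introns` are made up for this example, and nothing here models the spliceosome itself.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 0-based start, end-exclusive


def _check(exons: List[Interval], length: int) -> None:
    """Reject exon lists that are unsorted, overlapping or out of range."""
    previous_end = 0
    for start, end in exons:
        if not (previous_end <= start < end <= length):
            raise ValueError(f"bad exon interval: {(start, end)}")
        previous_end = end


def splice(pre_mrna: str, exons: List[Interval]) -> str:
    """Return the final RNA product: the exons joined in order."""
    _check(exons, len(pre_mrna))
    return "".join(pre_mrna[start:end] for start, end in exons)


def introns(pre_mrna: str, exons: List[Interval]) -> List[str]:
    """Return the sequences between consecutive exons (removed by splicing)."""
    _check(exons, len(pre_mrna))
    return [pre_mrna[a_end:b_start]
            for (_, a_end), (b_start, _) in zip(exons, exons[1:])]


# Hypothetical pre-mRNA: exon 1 + intron + exon 2 (not a real gene)
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "GAAUAA"
exons = [(0, 6), (18, 24)]

print(splice(pre_mrna, exons))   # exons only
print(introns(pre_mrna, exons))  # the single intron
```

Passing a different exon list for the same precursor gives a different product, which is the sense in which one gene can yield several mRNAs.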
Exons are similar in size; introns are highly variable in size

mRNA splicing diversifies gene expression

5’ and 3’ splice sites (SS) in vertebrate pre-mRNAs
- Donor splice site: GU
- Acceptor splice site: AG
- Branch point: A
- Pyrimidine-rich region (~15 bases) is located upstream of the 3’ SS

The human spliceosomal snRNPs (Will & Lührmann, 2011)

snRNA base pairing with pre-mRNA
[Figure labels: U1 snRNA, donor, acceptor]

Definition of exon boundaries
Spliceosomal E complex

Introns are removed by two consecutive transesterification reactions

Spliceosomal assembly and disassembly pathway (Will & Lührmann, 2011)

Remodelling of the spliceosome by RNA helicases and auxiliary proteins
- Brr2 unwinds the U4/U6 snRNA duplex and displaces U4; Prp28 displaces U1
- Series of conformational changes => catalytically active B* complex formed
- Prp16 releases factors, Prp22 binds => catalytically active C* complex formed
- Prp22 dissociates and the ligated exons leave the spliceosome, followed by disassembly

Structural views of the spliceosome
[Figure labels: U1 snRNP, U2, U4, U5, U6, U2 snRNA, intron, exon]
- Extensive remodeling of the RNA
- Major RNA remodeling is required to form the active sites of the spliceosome

Alternative splicing is regulated by exonic and intronic splicing enhancer and silencer sequences
- Alternative splicing is often associated with weak splice sites
- Sequences surrounding alternative exons are often more evolutionarily conserved than sequences flanking constitutive exons
- Specific exonic and intronic sequences can enhance or suppress splice site selection
- Four cis-regulatory RNA elements influence exon definition during splicing:
    exonic splicing enhancers (ESE): SR protein family
    exonic splicing silencers (ESS): hnRNP protein family
    intronic splicing enhancers (ISE): hnRNP F/H, NOVA, or FOX proteins
    intronic splicing silencers (ISS): hnRNP protein family
- SR protein-ESE interactions facilitate assembly of the E complex and recognition of the frequently found weak 5’ss

RRM domain and interaction with RNA
- RRM: RNA-binding domain containing about 80 amino acid residues
- The RRM domain contains two highly conserved sequences, RNP1 (β3) and RNP2 (β1), which mediate specific RNA recognition

mRNA splicing is regulated through multiple RNA-protein interactions
Spliceosomal E complex, CFTR exon 9

Cystic fibrosis (CF):
- most frequent genetic disease in newborns (1 in 2000-3000)
- 1 in 25 Caucasians carries one allele with CF mutations
- most frequent mutations in the CFTR gene
- abnormal transport of chloride and sodium ions
- accumulation of mucus in several organs

Aberrant splicing of cystic fibrosis transmembrane conductance regulator (CFTR) exon 9
[Figure: CFTR mRNA with exons 1-24, ATG and TAG marked; pre-mRNA exons 8, 9, 10 with normal splicing (8-9-10) and aberrant splicing (8-10); CFTR protein in the plasma membrane with domains MSD1, MSD2, NBD1, NBD2 and the R domain, NH2 and COOH termini]

TDP-43 is a dominant inhibitor of CFTR exon 9 splicing (Buratti et al., EMBO J 2001)
Spliceosomal E complex
[Figure: siRNA and add-back experiment, WT and mut, Ex 9+ bands]

Questions…

G5, A6, A7 interact at the interface of both RRMs (RNA: GUGUGAAUGAAU)

More processing of eukaryotic pre-mRNA

Structure of the 5’ cap
Methyl groups from S-adenosylmethionine are added to the N7 position of the cap G and to the 2’-OH of the first two riboses of the nascent mRNA

Functions of the 5’ cap
- In prokaryotes, the Shine-Dalgarno sequence (located about 10 bases upstream of the AUG of the mRNA) binds to 16S rRNA to initiate translation.
- In eukaryotes, the 5’ end of the mRNA is capped. The 5’ cap binds to the cap-binding complex (CBC) => protects mRNA from degradation and facilitates nuclear export
- After nuclear export, the CBC is replaced with eIF4E, and the complex recruits other eIFs and the 40S ribosomal subunit to initiate translation.
- The AUG is located within the consensus sequence GCC(A/G)CCAUGG (Kozak sequence), and the 40S subunit needs to scan the 5’UTR to reach the start codon.
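The scanning step and the Kozak context described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a model of initiation: it takes the first AUG from the 5’ end as the start codon (leaky scanning and reinitiation are ignored), and it grades the context only by the two positions usually treated as most important, a purine at -3 and a G at +4. The function names, the three-level labels and the example sequences are hypothetical.

```python
from typing import Optional


def first_aug(mrna: str) -> Optional[int]:
    """Index of the first AUG met when reading from the 5' end, or None."""
    index = mrna.find("AUG")
    return index if index != -1 else None


def kozak_context(mrna: str, aug_index: int) -> str:
    """Classify the AUG context from positions -3 and +4 (A of AUG = +1)."""
    minus3 = mrna[aug_index - 3] if aug_index >= 3 else ""
    plus4 = mrna[aug_index + 3] if aug_index + 3 < len(mrna) else ""
    purine_at_minus3 = minus3 in ("A", "G")
    g_at_plus4 = plus4 == "G"
    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "adequate"
    return "weak"


# Hypothetical 5' ends of two mRNAs (illustration only)
for mrna in ("GGCUCGCCACCAUGGCU", "CCUUUUAUGCAA"):
    start = first_aug(mrna)
    if start is not None:
        print(mrna, start, kozak_context(mrna, start))
```

The first toy sequence embeds the full consensus around its AUG; the second has a pyrimidine at -3 and no G at +4.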
Polyadenylation of eukaryotic mRNA
- Cleavage of the mRNA transcript at a site downstream of AAUAAA and upstream of a G/U-rich sequence.
- These mRNA sequences are recognized by CPSF (cleavage and polyadenylation specificity factor) and CstF (cleavage stimulation factor)
- Endonucleolytic cleavage comes first, followed by polyadenylation of the mRNA and degradation of the 3’ fragment.
- The poly(A) tail is important for RNA stability, export and translational control in the cytoplasm

Alternative polyadenylation regulates gene expression

Splicing is connected to mRNA export and stability control.
- Exon junction complex (EJC) – protein complex that assembles 20-24 nt upstream of exon–exon junctions during splicing.
- The EJC assists in RNA export, localization, and degradation.

Splicing facilitates nuclear export of mRNA to the cytoplasm (Popp & Maquat, Ann. Rev. Genet. 2013)
- The EJC recruits a protein complex for mRNA export
- REF (Aly) is a key protein mediating mRNA export by interacting with TAP (Mex67p)

Splicing in the nucleus can influence mRNA translation in the cytoplasm.
Nonsense-mediated mRNA decay (NMD) – a pathway that degrades an mRNA that has a nonsense mutation prior to the last exon.
- Up-frameshift proteins: UPF1-3
- Eukaryotic release factors: eRF1 + eRF3
- Serine/threonine kinase: SMG1
- ATP-dependent helicase: UPF1, activated by UPF2
- Silencers of SMG1: SMG8 + SMG9

EJC couples splicing with NMD (Popp & Maquat, Ann. Rev. Genet. 2013)
- Phosphorylated UPF1 recruits the endonuclease SMG6
- 5’ cleavage product: decapping and 3’-to-5’ decay
- 3’ cleavage product: UPF1 removes proteins, followed by 5’-to-3’ decay

Eukaryotic mRNA in the cytoplasm: target for post-transcriptional regulation of gene expression
[Figure: eukaryotic mRNA with length labels "up to 3 kb" and "up to 10 kb"; CAP-binding complex (4E, 4G, 4A, 4B); PABP, the poly(A)-binding protein]

Regulatory RNA stem-loops in 5’- and 3’-UTR: IRES, SBS, TLS

Complex network of RNA-protein interactions within mRNA regulates gene expression
[Figure: mRNA with CAP-binding complex (4E, 4G, 4A, 4B), PABP, the RNA elements IRES, SBS and TLS, and the factors ITAF, TAF, CPEB, miRNA and further RBPs]
Figure callouts:
• Translational control, mRNA stability
• Translational control, mRNA localization, mRNA stability
Many other RNA elements and RNA-binding proteins (RBPs)?

Translation – from RNA to proteins
• In prokaryotes: the primary mRNA transcript is used directly as template for protein synthesis.
• In eukaryotes: the primary mRNA transcript is processed in the nucleus and exported into the cytoplasm for translation.

What do we need for translation?
• Template mRNA
• Transfer RNAs (tRNAs) – charged with amino acids
• Ribosomes
• Many accessory proteins – especially in eukaryotes!
• Lots of energy (ATP and GTP hydrolysis)

mRNA
• Prokaryotic: often codes for two or more proteins
  – Shine-Dalgarno sequence in 5’UTR
• Eukaryotic: codes for one protein
  – 5’ cap (7-methylguanosine)
  – 3’ poly(A) tail (50-200 adenines)
    • Protect from degradation by nucleases
    • Allow for regulation of protein synthesis

Genetic code
• Codons specify the type of amino acid
• Initiation (start) codon
  – AUG codes for methionine
  – Every nascent protein chain starts with methionine
• Termination (stop) codons
  – UAA, UGA, UAG

tRNA
• Delivers amino acid to the ribosome
• Adaptor between codons in mRNA and amino acid
• 4 stems and 3 loops
  – Complex fold
  – Anticodon in hairpin loop

tRNA
• L-shaped structure
• Anticodon and amino acid at opposite ends
• First structure by Aaron Klug (LMB) and Alex Rich (MIT) in 1974

Ribosomes
• Composed of a small and a large subunit
• Subunits form a complex for translation
• E. coli: 20,000 ribosomes/cell
• Ribosomes self-assemble without additional factors => assembly needs to be controlled

Eukaryotic ribosomes (numbers for mammalian ribosomes)
Subunit   RNA          Nucleotides   Proteins
60S       28S rRNA     4718          49 polypeptides
          5.8S rRNA    160
          5S rRNA      120
40S       18S rRNA     1874          33 polypeptides

Comparison of prokaryotic and eukaryotic ribosomes
~200 Å diameter (prokaryotic), ~250 Å diameter (eukaryotic)

Eukaryotic ribosomes
• Cytosolic (free)
• Bound to ER
• Also found in mitochondria and chloroplasts of eukaryotic cells

Free ribosomes
• In the cytosol (not in the nucleus!)
• Often found in groups termed polysomes
  – Multiple ribosomes on one mRNA in the cytosol

Bound ribosomes
• Bound to the rough endoplasmic reticulum
• Secretory proteins, post-translational modifications
  – Single ribosome on the translocon channel

Secondary structure of E. coli 16S rRNA (Woese et al., NAR 1980)

X-ray structures of the prokaryotic ribosome
[Figure colors: 23S rRNA (yellow), 5S rRNA (orange) + proteins (red); 16S rRNA (green) + proteins (blue)]

3D structure of the ribosome
• rRNAs determine the overall shape of the subunits
• rRNAs help to bind and position mRNA and tRNAs
• 16S rRNA controls decoding of mRNA
• 23S rRNA catalyzes peptide bond formation
  – A2451 of 23S rRNA acts as acid/base catalyst (like histidine in serine proteases)
  – Conformational catalysis (proper positioning of activated substrates)
  – 2’OH of the terminal adenine of the tRNA catalyzes the reaction

The 70S initiation complex
[Figure labels: initiator fMet-tRNA, AUG start codon, aminoacyl (A) site, peptidyl (P) site, exit (E) site]

Prokaryotic initiation factors (IFs) (numbers for E. coli)
Factor   Amino acids   Function(s)
IF1      71            stimulates activity of IF2 and IF3; prevents tRNA binding in A site
IF2      889           favors fMet-tRNA binding to 30S subunit; mediates subunit joining (GTPase activity)
IF3      181           helps correct positioning of mRNA; promotes codon-anticodon base pairing in P site

https://www2.mrc-lmb.cam.ac.uk/groups/ribo/resources/videos/

Eukaryotic translation initiation (Jackson, Hellen, Pestova, Nature Reviews 2010)
• Cap-dependent initiation is the major translation initiation pathway in eukaryotes
• Eukaryotic mRNAs are monocistronic, capped at the 5' end and polyadenylated at the 3' end
• Ribosomes consist of 40S and 60S subunits
• 40S subunits locate the initiator AUG codon by scanning
• At the AUG codon, the 60S ribosomal subunit joins to form an 80S ribosome competent for translation elongation
• Assisted by eukaryotic initiation factors (eIFs)

Eukaryotic initiation factors (eIFs)
Factor   kDa   Function(s)
eIF1     15    stimulates mRNA binding and scanning; negative regulator of AUG recognition; binds near E site
eIF1A    17    stimulates mRNA binding and scanning; assists AUG recognition; binds in A site
eIF2     130   3 subunits (α, β, γ); delivers Met-tRNAi(Met) to 40S subunit (ternary complex, TC); GTPase activity
eIF5     48    GTPase-activating protein (GAP of eIF2γ)
eIF4E    25    mRNA binding; 5’ cap-binding protein
eIF4G    154   mRNA binding; large scaffold protein for binding of eIF4E, eIF4A, eIF3
eIF4A    44    scanning of 5’UTR; ATP-dependent RNA helicase
eIF4B    70    scanning of 5’UTR; co-factor of eIF4A
eIF3     800   13 subunits (a-m) in mammals; shares conserved eIF3abcgij core with yeast; ribosome dissociation; stimulates mRNA binding; stabilizes TC binding; assists scanning and AUG recognition
eIF5B    112   GTPase activity; subunit joining and eIF release

Internal ribosome entry site (IRES)-mediated translation initiation (Jackson, Hellen, Pestova, Nature Reviews 2010)
mRNA transport across the Kingdoms
[Figure panels: Drosophila embryo, bcd (green); mammalian neuron, Actb (green), Mtap2 (red); E. coli, BglF RNA (green), protein (red)]
(Nevo-Dinur et al., Science 2011; Raj et al., Nat. Methods 2011; Lécuyer et al., Cell 2007)

mRNA transport across the Kingdoms (Bullock and Lukavsky, Nat. Struct. Mol. Biol. 2010)

mRNA transport
- mRNA localization is essential for spatial and temporal control of gene expression of hundreds of transcripts in dendrites (Martin and Zukin, J. Neurosci. 2006)
- Targeting elements are located in the 3’UTR and recognized by common, shared RNA-binding proteins: PURα, Staufen, hnRNP A2, ZBP1, etc.
  > Sequence-specific:
      11-nt A2RE recognized by hnRNP A2
      bipartite zipcode recognized by ZBP1
  > Structure-specific:
      Drosophila TLS (K10, etc.) – A’-form helices
      dendritic targeting elements (DTE) – ???
(Martin & Ephrussi, Cell 2009)
https://www.youtube.com/watch?v=y-uuk4Pr2i8
https://youtu.be/-7AQVbrmzFw

Cytoplasmic polyadenylation regulates protein synthesis
- Some mRNAs contain a CPE sequence to regulate gene expression
- The CPE is recognized by CPEB (2 RRMs)
- PARN shortens the poly(A) tail and Maskin blocks access to eIF4E => mRNA is dormant
- CPEB phosphorylation eliminates PARN from the assembly, Gld2 extends the poly(A) tail, and Maskin phosphorylation breaks its interaction with eIF4E => mRNA can be translated
- Regulation of:
    cell cycle mRNAs
    maternal mRNAs during development
    dendritic mRNAs for memory formation

Regulation of gene expression by microRNAs
- Discovered in the nematode C. elegans during analysis of the lin-4 and let-7 genes (Victor Ambros et al., Cell 1993)
- Cloning and nucleotide sequence analysis revealed that lin-4 and let-7 do not encode any protein product.
- lin-4 RNA hybridizes to the 3’-untranslated region of lin-14 mRNA and lin-28 mRNA, which leads to degradation of the mRNA.
- miRNA precursors are transcribed by Pol II and processed into mature miRNAs of 21-22 bases.
- Base pairing between the miRNA and the 3’ UTR of an mRNA does not have to be 100% complementary. This differentiates it from RNA interference.
- In humans, more than 1,000 different miRNAs are produced.
- Target sites in the 3’UTR can be regulated through alternative polyadenylation!
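The last two points above (partial complementarity, and target sites that depend on the poly(A) site used) can be illustrated with a short Python sketch. It rests on assumptions not stated on the slides: target recognition is reduced to a perfect match to the miRNA "seed" (nucleotides 2-8), a common simplification, while the rest of the miRNA is allowed to mismatch. The miRNA, the two 3’UTR isoforms and the function names are hypothetical toy data, not real sequences.

```python
from typing import List

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(mirna: str, utr: str) -> List[int]:
    """Start positions in the UTR that pair perfectly with miRNA nt 2-8."""
    seed = mirna[1:8]                # nucleotides 2-8, 1-based
    site = reverse_complement(seed)  # what the mRNA must contain
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)
    return positions


# Hypothetical miRNA and 3'UTR isoforms (illustration only)
mirna = "UACGGAUCCAGUUGACGUAACU"
proximal = "GCUUCAGGCAAUAAAGCUUC"     # ends after a proximal AAUAAA signal
distal = "CCUUGAUCCGUAAGGAAUAAAUU"    # only present in the long isoform
long_utr = proximal + distal
short_utr = proximal

print(seed_sites(mirna, long_utr))   # site lies in the distal part
print(seed_sites(mirna, short_utr))  # short isoform escapes this miRNA
```

In this toy case the only seed match sits downstream of the proximal poly(A) signal, so the shorter isoform carries no site for this miRNA.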
miRNA processing
- Pri-miRNA is transcribed by RNA polymerase II
- The nuclear double-strand-specific endonuclease Drosha (RNase III), with its partner dsRNA-binding protein DGCR8, cleaves the pri-miRNA to generate a 70-nucleotide pre-miRNA
- The 70-nt pre-miRNA is exported from the nucleus to the cytoplasm
- In the cytoplasm, the pre-miRNA is processed by Dicer to form the miRNA
- One of the strands of the miRNA is incorporated into a RISC complex with an Argonaute protein

Regulation of gene expression by microRNAs (Fabian & Sonenberg, NSMB 2012)

RNA-based regulation of gene expression
- Sequence-specific recognition of single-stranded RNA (ssRNA) by RNA-binding proteins (RBPs)
- Structure-specific recognition of double-stranded RNA (dsRNA) by RNA-binding proteins

dsRNA binding domain (dsRBD)

Topology of a dsRBD
• ~65-70 amino acid domain
• Found in eukaryotic, prokaryotic and viral proteins
• Conserved αβββα topology
• α-helices are packed against an antiparallel β-sheet
• 30 dsRBD structures (X-ray and NMR)
• Many biological functions: antiviral response, RNA processing, RNA transport, RNA silencing, mRNA degradation

dsRNA-binding regions of a dsRBD
Three distinct binding regions:
- Region I (minor groove): α1-helix; conserved residues: QE
- Region II (minor groove): β1β2 (loop 2); conserved residues: GPxH
- Region III (major groove): N-terminus of α2-helix; conserved residues: KKxAK
[Figure: dsRBD of A. aeolicus RNase III, N and C termini marked] (Gleghorn & Maquat, 2014)

Sequence preference: binding register of dsRBD on dsRNA (Masliah et al., 2013)
- ADAR2 – dsRNA binding (Stefl et al., 2010)
- RNase III – dsRNA binding (Gan et al., 2008)

Staufen1 and Staufen2 proteins
➢ Staufen1 contains multiple dsRBDs.
➢ The STAU1 gene encodes two isoforms
➢ dsRBDs 3 and 4 bind dsRNA
➢ Staufen1 has many biological functions:
   ➢ microtubule-dependent transport of RNAs: development and higher brain functions
   ➢ translational control: associated with translating ribosomes
   ➢ Staufen-mediated mRNA decay (SMD)

How does Staufen1 target certain mRNAs?
hiCLIP revealed multiple interactions with 3’UTRs and the ribosome

dmStaufen dsRBD3 + dsRNA (Ramos et al., 2000)
[Figure: amino acids that bind to dsRNA]

Structure of dmStaufen dsRBD3 with artificial dsRNA
➢ non-cellular dsRNA
➢ three contact regions (helix α1 - tetraloop; β1β2 loop - minor groove of dsRNA; helix α2 lysines - major groove of dsRNA)
➢ number of intermolecular NOEs was only 10

How does Staufen1 recognize its cellular dsRNA targets?

Staufen mRNA target recognition
Staufen recruits UPF1 for mRNA decay

ADP-ribosylation factor 1 (ARF1) mRNA (Kim et al., 2005, 2007)
- Staufen1 binds in vivo to a complex structure within the ARF1 SBS
- The apical part of the ARF1 dsRNA is crucial for Staufen binding and mRNA decay
- Encodes the ADP-ribosylation factor 1 protein
- Stimulates the ADP-ribosyltransferase activity of cholera toxin and has a role in vesicular trafficking
- Known and validated SMD target

Questions…
- How does human Staufen1 recognize dsRNA targets?
- Is there any sequence specificity for dsRNA?
Design of the ARF1 SBS – STAU1 dsRBD3+4 complex
- STAU1 dsRBD3+4 – long ARF1 stem loop (long ARF1 SBS)
- STAU1 dsRBD3+4 – short ARF1 stem loop (short ARF1 SBS)

Strategy to assign intermolecular NOEs
- Isotope labeling used across the samples: protein 13C,15N or 15N; dsRNA 13C or unlabeled
- 3D 13C-edited NOESY (RNA): to assign RNA-RNA and RNA-protein NOEs
- 3D F1-filtered, F2-edited NOESY: to assign intermolecular NOEs from protein side chains to RNA

Comparison of NMR and X-ray structure (Lazzaretti et al., 2018)

Base contacts from minor groove side
[Figure labels: S208, Q215, S106, I207, I105, A216, L2; base pairs U-A, A-U, C-G, G-C; minor groove, major groove]
Drop in affinity:
- S208A: 1.4x
- Q212A, Q215A: 1.6x
- Combined: 5.2x
- I105A, S106A: 3.6x

ARF1 and XBP1 mRNA levels in HeLa cells
[Bar chart: conditions control, Stau1 WT, Stau1 RBD3-mut, Stau1 RBD4-mut, Stau1 RBD34-mut; series ARF1 and XBP1; data labels in extraction order: 1, 0.73, 0.84, 0.73, 2.24 and 1, 0.89, 0.87, 1.12, 3.26]
Mutations in helix α1: dsRBD3 (I105A+S106A), dsRBD4 (Q212A+Q215A)

Structure determination of large Staufen – 3’UTR – ribosome complexes
How does Staufen influence mRNA stability?
Integrative structural biology: NMR (PRE, RDC), SAXS, cryo-EM, etc.
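The fold drops in affinity quoted for the minor-groove contacts (1.4x, 1.6x, 5.2x, 3.6x) can be put on an energy scale. The sketch below is a back-of-the-envelope conversion under assumptions that are not on the slides: each "drop in affinity" is read as the ratio Kd(mutant)/Kd(wild type), the temperature is taken as 298.15 K, and "Combined" is read as S208A together with Q212A and Q215A. The relation used is ΔΔG = RT ln(fold); the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold(fold: float, temperature_k: float = 298.15) -> float:
    """Binding free energy change (kcal/mol) for a fold change in Kd."""
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL * temperature_k * math.log(fold)


# Fold drops in affinity as listed on the slide
folds = {
    "S208A": 1.4,
    "Q212A+Q215A": 1.6,
    "combined": 5.2,      # assumed: S208A + Q212A + Q215A
    "I105A+S106A": 3.6,
}

for name, fold in folds.items():
    print(f"{name}: {ddg_from_fold(fold):.2f} kcal/mol")

# If the two dsRBD4 mutant sets acted independently, their fold changes
# would multiply (their free energies would add).
independent = folds["S208A"] * folds["Q212A+Q215A"]
print(f"expected if independent: {independent:.2f}x, observed: {folds['combined']}x")
```

Under these assumptions the combined mutant loses more affinity than the product of the individual effects, and every value stays at or below roughly 1 kcal/mol, which fits the picture of several weak base contacts acting together.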
Conclusions:
• dsRBD3+4 bind dsRNA in the expected three distinct regions
• dsRBD4 arginines in the β1β2 loop “anchor” dsRBD4 at the end of the helix
• dsRBD3 binds to dsRNA in the opposite orientation
• dsRBD3 lysine in the β1β2 loop “anchors” dsRBD3 at the other end of the helix
• dsRBD3+4 lysines of helix α2 insert into the major groove
• dsRBD4 glutamines and a serine of helix α1 recognize 2 pyrimidines and 1 adenine
• dsRBD3 serine of helix α1 recognizes 1 guanine
• no protein-protein interactions between the domains
• Binding register or neighboring residues determine recognition of bases from the minor groove

Post-transcriptional regulation of gene expression
- Alternative splicing of pre-mRNA
- mRNA processing (5’-/3’-end formation)
- Nuclear export
- Cytoplasmic RNA transport
- Translational control
- mRNA decay
Regulated by multiple RNA signals in complex with proteins