Liam.Keegan@ceitec.muni.cz
Lewin, GENES XI
Chapter 2 Genes Encode RNAs and Polypeptides
Lewin’s GENES XII (PDF) https://pdfroom.com/books/lewins-genes-xii/Wx5aDYKl2BJ
biotecher.ir https://biotecher.ir/wp-content/uploads/2018/06/bio...

Figure 02.01: Each chromosome has a single long molecule of DNA within which are the sequences of individual genes.

‘Compare and contrast’ transcription in prokaryotes and eukaryotes

2.2 Most Genes Encode Polypeptides
• Ribosomal RNAs, tRNAs and ribosomal proteins are the most highly expressed gene products.
• The one gene : one enzyme hypothesis summarizes the basis of modern genetics: that a gene is a stretch of DNA encoding one or more isoforms of a single polypeptide chain. (Beadle and Tatum)
• one gene : one polypeptide hypothesis – A modified version of the not generally correct one gene : one enzyme hypothesis; the hypothesis that a gene is responsible for the production of a single polypeptide. (Vernon Ingram, 1957)

2.10 Bacterial Genes Are Colinear with Their Proteins
• A bacterial gene consists of a continuous length of 3N nucleotides that encodes N amino acids.
• The gene is colinear with both its mRNA and polypeptide product, i.e. it has no introns.
Figure 02.12: The recombination map of the tryptophan synthetase gene corresponds with the amino acid sequence of the polypeptide.

2.11 Transcription and Translation Are Required to Express the Product of a Gene
• Each mRNA consists of an untranslated 5′ region (5′ UTR or leader), a coding region, and an untranslated 3′ region (3′ UTR or trailer).
Eukaryotes have introns and pre-mRNA processing to make longer-lasting mRNAs.
Figure 02.14: The gene is usually longer than the sequence encoding the polypeptide.

26.1 Introduction
• coupled transcription/translation – The phenomenon in bacteria where translation of the mRNA occurs simultaneously with its transcription (40 nucleotides/sec, 2.4 kb/minute). In eukaryotes transcription and translation are separated between nucleus and cytoplasm.
• Bacterial mRNA has the bare 5′ triphosphate from transcription initiation, and ribonucleases (5′→3′ exonucleases) rapidly degrade mRNA behind translating ribosomes.

26.2 Operons, in Prokaryotes Only, Are Structural Gene Clusters That Are Coordinately Controlled
• Genes coding for proteins that function in the same pathway may be located adjacent to one another and controlled as a single unit that is transcribed into a polycistronic mRNA.
Figure 26.05: The lac operon occupies ~6000 bp of DNA.

Lewin XI Chapter 19 Prokaryotic Transcription

19.1 Introduction
• Transcription is 5′ to 3′ on a template that is 3′ to 5′.
• coding (nontemplate) strand – The DNA strand that has the same sequence as the mRNA and is related by the genetic code to the protein sequence that it represents.
• RNA polymerase – An enzyme that synthesizes RNA using a DNA template (formally described as a DNA-dependent RNA polymerase).
Figure 19.01: The function of RNA polymerase is to copy one strand of duplex DNA into RNA.
• promoter – A region of DNA where RNA polymerase binds to initiate transcription.
• start point – The position on DNA corresponding to the first base incorporated into RNA.
• terminator – A sequence of DNA that causes RNA polymerase to terminate transcription.
• transcription unit – The sequence between sites of initiation and termination by RNA polymerase; it may include more than one gene.
Figure 19.02: A transcription unit is a sequence of DNA transcribed into a single RNA, starting at the promoter and ending at the terminator.

19.2 Transcription Occurs by Base Pairing in a “Bubble” of Unpaired DNA
Figure 19.03: DNA strands separate to form a transcription bubble. RNA is synthesized by complementary base pairing with one of the DNA strands.
Figure 19.05: During transcription, the bubble is maintained within bacterial RNA polymerase.
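
The strand relationships stated in 19.1 and 19.2 (RNA is made by base pairing with the template strand and has the same sequence as the coding strand) can be illustrated with a minimal Python sketch. The function names and the example sequence are hypothetical, chosen only for illustration; the sketch models base pairing alone and ignores promoters, terminators and the polymerase itself.

```python
# Base pairing used when RNA is copied from a DNA template strand.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
# Base pairing between the two DNA strands of the duplex.
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (written 5'->3') copied from a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())


def coding_strand(template_3to5: str) -> str:
    """Return the coding (nontemplate) DNA strand, written 5'->3'."""
    return "".join(DNA_TO_DNA[base] for base in template_3to5.upper())


template = "TACGGCATT"  # hypothetical template strand, written 3'->5'
rna = transcribe(template)
coding = coding_strand(template)

# The RNA matches the coding strand except that U replaces T.
assert rna == coding.replace("T", "U")
print(coding, rna)
```

Because the template is written 3′→5′, reading it left to right yields the RNA in the 5′→3′ direction, which is the direction of synthesis described above.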
19.3 The Transcription Reaction Has Three Stages
• Transcription initiation: RNA polymerase binds to a promoter site on DNA to form a closed complex.
• RNA polymerase initiates transcription (initiation) after opening the DNA duplex to form a transcription bubble (the open complex).
• During elongation the transcription bubble moves along DNA and the RNA chain is extended in the 5′→3′ direction by adding nucleotides to the 3′ end.
• Transcription stops (termination) and the DNA duplex reforms when RNA polymerase dissociates at a terminator site.
Figure 19.06: Transcription has three stages.

19.17 Negative Supercoiling Facilitates Transcription and Transcription Affects Supercoiling
• Negative supercoiling increases the efficiency of some promoters by assisting the melting reaction.
• Transcription generates positive supercoils ahead of the enzyme and negative supercoils behind it, and these must be removed by gyrase and topoisomerase.
Figure 19.34: Transcription generates more tightly wound DNA ahead of RNA polymerase.

19.4 Bacterial RNA Polymerase Consists of Multiple Subunits
• holoenzyme – The RNA polymerase form that is competent to initiate transcription. It consists of the five subunits of the core enzyme and σ factor.
• Bacterial RNA core polymerases are ~400 kD multisubunit complexes with the general structure α2ββ′.
• Catalysis derives from the β and β′ subunits.
• CTD (C-terminal domain) – The domain of RNA polymerase that is involved in stimulating transcription by contact with regulatory proteins.
Figure 19.07: Eubacterial RNA polymerases have five types of subunits.
Figure 19.08: The upstream face of the core RNA polymerase, illustrating the ‘crab claw’ shape of the enzyme.
Figure 19.09: The structure of the RNA polymerase core enzyme for the bacterium Thermus aquaticus. Adapted from K. M. Geszvain and R. Landick (ed. N. P. Higgins). The Bacterial Chromosome. American Society for Microbiology, 2004. Structure from Protein Data Bank 1HQM. L. Minakhin, et al., Proc. Natl. Acad. Sci. USA 98 (2001): 892-897.

19.5 RNA Polymerase Holoenzyme Consists of the Core Enzyme and Sigma Factor
• Bacterial RNA polymerase can be divided into the α2ββ′ core enzyme that catalyzes transcription and the σ subunit that is required only for initiation.
• Sigma factor changes the DNA-binding properties of RNA polymerase so that its affinity for general DNA is reduced and its affinity for promoters is increased.
Figure 19.10: Core enzyme binds indiscriminately to any DNA.

19.7 The Holoenzyme Goes Through Transitions in the Process of Recognizing and Escaping from Promoters
Figure 19.13: RNA polymerase initially contacts the region from –55 to +20.
Figure 19.12: RNA polymerase passes through several steps prior to elongation. Adapted from S. P. Haugen, W. Ross, and R. L. Gourse, Nat. Rev. Microbiol. 6 (2008): 507-519.
• ternary complex – The complex in initiation of transcription that consists of RNA polymerase and DNA as well as a dinucleotide that represents the first two bases in the RNA product.
• There may be a cycle of abortive initiations before the enzyme moves to the next phase.
• Sigma factor is usually released from RNA polymerase when the nascent RNA chain reaches ~10 bases in length.

19.8 Sigma Factor Controls Binding to DNA by Recognizing Specific Sequences in Promoters
• conserved sequence – Sequences in which many examples of a particular nucleic acid or protein are compared and the same individual bases or amino acids are always found at particular locations.
• A promoter is defined by the presence of short consensus sequences at specific locations.
• The promoter consensus sequences usually consist of a purine at the start point, a hexamer with a sequence close to TATAAT centered at ~–10 (–10 element or Pribnow box), and another hexamer with a sequence similar to TTGACA centered at ~–35 (–35 element).
• Individual promoters usually differ from the consensus at one or more positions.
• The α subunit also contributes to promoter recognition.
• Promoter efficiency can be affected by additional elements as well.
• UP element – A sequence in bacteria adjacent to the promoter, upstream of the –35 element, that enhances transcription.
Figure 19.14: DNA elements and RNA polymerase modules that contribute to promoter recognition by sigma factor. Adapted from S. P. Haugen, W. Ross, and R. L. Gourse, Nat. Rev. Microbiol. 6 (2008): 507-519.

19.9 Promoter Efficiencies Can Be Increased or Decreased by Mutation
• Down mutations to decrease promoter efficiency usually decrease conformance to the consensus sequences, whereas up mutations have the opposite effect.
• Mutations in the –35 sequence can affect initial binding of RNA polymerase.
• Mutations in the –10 sequence can affect binding or the melting reaction that converts a closed to an open complex.

19.10 Multiple Regions in RNA Polymerase Directly Contact Promoter DNA
• The structure of σ70 changes when it associates with core enzyme, allowing its DNA-binding regions to interact with the promoter.
• Multiple regions in σ70 interact with the promoter.
• The α subunit also contributes to promoter recognition.
Figure 19.17: Amino acids in the 2.4 α-helix of σ70 contact specific bases in the coding strand of the –10 promoter sequence.
Figure 19.18: The N-terminus of sigma blocks the DNA-binding regions from binding to DNA. Much greater discrimination between base pairs is possible from the major groove side.
Figure 19.16: The structure of sigma factor in the context of the holoenzyme: –10 and –35 interactions. Structure from Protein Data Bank 1IW7. D. G. Vassylyev, et al., Nature 417 (2002): 712-719. Illustration adapted from D. G. Vassylyev, et al., Nature 417 (2002): 712-719.

19.11 RNA Polymerase–Promoter and DNA–Protein Interactions Are the Same for Promoter Recognition and DNA Melting
• footprinting – A technique for identifying the site on DNA bound by some protein by virtue of the protection of bonds in this region against attack by nucleases.
Figure 19.21: Footprinting identifies DNA-binding sites for proteins by their protection against nicking.
• The consensus sequences at –35 and –10 provide most of the contact points for RNA polymerase in the promoter.
• The points of contact lie primarily on one face of the DNA.
• Melting the double helix begins with base flipping within the promoter.
Figure 19.22: One face of the promoter contains the contact points for RNA polymerase.

19.12 Interactions Between Sigma Factor and Core RNA Polymerase Change During Promoter Escape
Figure 19.24: Sigma factor and core enzyme recycle at different points in transcription.
• A domain in sigma occupies the RNA exit channel and must be displaced to accommodate RNA synthesis.
• Abortive initiations usually occur before the enzyme forms a true elongation complex.
• Sigma factor is usually released from RNA polymerase by the time the nascent RNA chain reaches ~10 nt in length.

26.8 lac Repressor Binding to the Operator Represses Transcription Simply by Blocking RNA Polymerase Binding to the lac Promoter
• lac repressor protein binds to the double-stranded DNA sequence of the operator.
• The operator is a palindromic sequence of 26 bp.
• Each inverted repeat of the operator binds to the DNA-binding site of one repressor subunit.
Figure 26.17: The lac operator has a symmetrical sequence.

19.19 Replacement of/Competition for Sigma Factors Can Regulate Initiation
• E. coli has seven sigma factors, each of which causes RNA polymerase to initiate at a set of promoters defined by specific –35 and –10 sequences.
Figure 19.36: The sigma factor associated with core enzyme determines the set of promoters at which transcription is initiated.
• The activities of the different sigma factors are regulated by different mechanisms.
• anti-sigma factor – A protein that binds to a sigma factor to inhibit its ability to utilize specific promoters.
Figure 19.37: In addition to σ70, E. coli has several sigma factors that are induced by particular environmental conditions.
• Heat-shock response – A set of loci that is activated in response to an increase in temperature that causes proteins to denature (and other abuses to the cell).
– All organisms have this response.
– The gene products usually include chaperones that act on denatured proteins.

19.15 Bacterial RNA Polymerase Terminates at Discrete Sites
• There are two classes of terminators:
– Those recognized solely by RNA polymerase itself without the requirement for any cellular factors are usually referred to as “intrinsic terminators.”
– Others require a cellular protein called rho and are referred to as “rho-dependent terminators.”
Figure 19.28: The DNA sequences required for termination are located upstream of the terminator sequence.
• Intrinsic termination requires recognition of a terminator sequence in DNA that codes for a hairpin structure in the RNA product. The highly transcribed rRNA transcripts (rrn operon) and tRNA operons terminate this way too.
• The signals for termination lie mostly within sequences already transcribed by RNA polymerase, and thus termination relies on scrutiny of the template and/or the RNA product that the polymerase is transcribing.
• Termination may require NusA protein, which contacts the alpha subunit, replacing sigma. NusA is the only other factor conserved in eukaryotes.
Figure 19.29: Intrinsic terminators include palindromic regions that form hairpins varying in length from 7 to 20 bp.
• readthrough – Occurs at transcription or translation when RNA polymerase or the ribosome, respectively, ignores a termination signal because of a mutation of the template or the behavior of an accessory factor.
• antitermination – A mechanism of transcriptional control in which termination is prevented at a specific terminator site, allowing RNA polymerase to read into the genes beyond it. Examples: lambda phage N and Q proteins.

19.16 How Does Rho Factor Work?
Figure 19.30: Rho factor binds to RNA at a rut site and translocates along RNA until it reaches the RNA–DNA hybrid in RNA polymerase.
• Rho factor is a protein that binds to nascent RNA and tracks along the RNA to interact with RNA polymerase and release it from the elongation complex.
• rut – An acronym for rho utilization site, the sequence of RNA that is recognized by the rho termination factor.
• Mutation polarity – The effect of a mutation in one gene in influencing the expression (at transcription or translation) of subsequent genes in the same transcription unit.
Figure 19.33: The action of rho factor may create a link between transcription and translation.

Molecular Genetics 3 Lecture 5 4.10.11
Transcription in eukaryotes.
Eukaryotes, RNA polymerases and RNA pol. II promoters. Transcription initiation by RNA pol. II.
Liam Keegan
MRC Human Genetics Unit, Western General Hospital, Edinburgh

Eukaryotic total RNA. Ribosomal RNAs and tRNAs are major bands; mRNA is a smear on a denaturing gel stained with ethidium bromide.

Eukaryotic RNA Polymerases
• Three nuclear RNA polymerases (E. coli has only one).
• RNA pol I (in nucleolus)
– transcribes rRNA genes → 50–70% of the cell’s RNA synthesis
– resistant to > 500 µg/ml α-amanitin, an octapeptide from Amanita phalloides (Death Cap mushroom), which grows near oak trees
• RNA pol II (in nucleoplasm)
– transcribes all protein-encoding genes and most small nuclear RNAs → 20–40% of the cell’s RNA synthesis
– inhibited by low (~0.03 µg/ml) α-amanitin
• RNA pol III (in nucleoplasm)
– transcribes 5S, tRNA genes and some small nuclear RNAs → < 10% of the cell’s RNA synthesis
– inhibited by 20 µg/ml α-amanitin in animal cells
– resistant to α-amanitin in yeast and insects

Eukaryotic RNA polymerases are similar to that of E. coli but have 12 subunits.

20.2 Eukaryotic RNA Polymerases Consist of Many Subunits
• All eukaryotic RNA polymerases have ~12 subunits and are complexes of ~500 kD.
• Some subunits are common to all three RNA polymerases.
• The largest subunit in RNA polymerase II has a CTD (carboxy-terminal domain) consisting of multiple repeats of a heptamer.
Figure 20.02: Some subunits are common to all classes of eukaryotic RNA polymerases and some are related to bacterial RNA polymerase.

Archaebacteria have RNA polymerase with eukaryote-like subunits.
J Biol Chem. 2006 Oct 13;281(41):30581-92. Epub 2006 Aug 1. Protein-protein interactions in the archaeal transcriptional machinery: binding studies of isolated RNA polymerase subunits and transcription factors. Goede B, Naji S, von Kampen O, Ilg K, Thomm M.
Of the eukaryotic genes that can be traced back to such ancient origins, 70% come from Eubacteria and 30% come from Archaea. The 30% that are most related to Archaea include proteins involved in nuclear processes. Eukaryotic RNA polymerase subunits, TATA-binding protein (TBP), TFIIB and histones have archaeal homologs.

20.3 RNA Polymerase I Has a Bipartite Promoter
Figure 20.03: Transcription units for RNA polymerase I have a core promoter separated by ~70 bp from the upstream promoter element.
• SL1 includes the factor TATA-binding protein (TBP) that is involved in initiation by all three RNA polymerases.
• RNA polymerase I binds to the UBF1-SL1 complex at the core promoter, or arrives already associated with it.

20.4 RNA Polymerase III Uses Downstream and Upstream Promoters
• RNA polymerase III has two types of promoters.
• Internal promoters have short consensus sequences located within the transcription unit and cause initiation to occur at a fixed distance upstream.
• Upstream promoters contain three short consensus sequences upstream of the start point that are bound by transcription factors.
Figure 20.04: Promoters for RNA polymerase III.
5S RNA, tRNAs, snRNAs

20.4 RNA Polymerase III Uses Downstream and Upstream Promoters
• assembly factors – Proteins that are required for formation of a macromolecular structure but are not themselves part of that structure.
• TFIIIA and TFIIIC bind to the consensus sequences and enable TFIIIB to bind at the startpoint.
Figure 20.05: Internal type 2 pol III promoters.
Figure 20.06: Internal type 1 pol III promoters.
• TFIIIB has TBP as one subunit and enables RNA polymerase to bind.
• preinitiation complex – The assembly of transcription factors at the promoter before RNA polymerase binds in eukaryotic transcription.

Transcription of protein-coding mRNAs by RNA polymerase II. Defining RNA polymerase II promoters.
• The TATA box is a common component of RNA polymerase II promoters.
– It consists of an A-T-rich octamer located ~25 bp upstream of the startpoint.
• The DPE is a common component of RNA polymerase II promoters that do not contain a TATA box.
• A core promoter for RNA polymerase II includes:
– the InR
– either a TATA box or a DPE
Figure 24.10: Promoters for protein-coding transcripts.

hnRNA/pre-mRNA processing
A typical eukaryotic mRNA, from 5′ to 3′:
• 5′ cap
• nontranslated leader, <300 nt
• coding region
• long 3′ trailer, often >1000 nt
• poly(A) tail, (A)n with n = 100–200
The 7-methylguanosine cap and poly(A) tail protect eukaryotic mRNA against exonucleases and make it much more long-lived than prokaryotic mRNA (Lecture 7).

Poly(A)+ RNA purification and old-fashioned cDNA synthesis
Second-strand initiation depended on random hairpin ends on the first-strand cDNA, which was inefficient. Newer methods of cDNA synthesis use random primers to initiate second-strand synthesis.

Mapping transcripts and transcription start sites. Poly(A) and cap uses in cDNA cloning.
• Especially in the early days, cDNA synthesis using an oligo-dT primer and reverse transcriptase allowed cDNA clones to be obtained that covered most of the transcript sequence, but cDNA clones often failed to reach the 5′ end of the transcript.
• Ribodepletion is now used instead of poly(A)+ RNA isolation. Ribodepletion uses beads with probes homologous to rRNA to remove ribosomal RNA from total RNA for sequencing. Nowadays, recovery of full-length cDNA is sometimes improved by using an antibody to the cap to enrich for full-length mRNAs first. (Newer cap-selected cDNA libraries for genome sequencing projects.)
• Next Generation Sequencing is now completing analysis of transcriptomes (total genome-wide RNA sets). To see where RNA-binding proteins are located, RNA immunoprecipitation sequencing (RIP-seq, like ChIP-seq for DNA-binding proteins) is used.

General transcription factors are required by RNA polymerase II to recognize promoters.

24.7 Accurate Initiation by RNA Polymerase II in S100 Nuclear Extracts in Vitro
• Purified RNA polymerase would not initiate transcription correctly at a promoter on a DNA fragment without additional factors. It would start at the end of a linearized DNA fragment and transcribe from there. Adding an S100 nuclear extract allowed accurate initiation at promoters. First, nuclei are purified from cultured human HeLa cells on a sucrose density gradient. Then the nuclei are lysed and a clear supernatant is obtained after centrifugation at 100,000 × g.
• RNA polymerase II requires general transcription factors (named TFIIX, where X is a letter) to recognize promoters and to initiate transcription at the correct start site. Radiolabeled runoff transcripts show accurate initiation of new transcripts at the promoter in nuclear extract in vitro.
• Primer extension analysis was also used to check the accuracy of the start site and the levels of transcription in transfected cells in vivo.
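
The reasoning behind the runoff assay can be written out as a small arithmetic sketch. The 1-based coordinates, the fragment length, the start-site position and the function name below are all assumptions made for illustration, not values from the lecture. The point is only that promoter-initiated transcription on a linearized template gives one transcript of predictable length, whereas transcription that starts at the fragment end gives a transcript spanning the whole fragment.

```python
def runoff_length(start_site: int, fragment_end: int) -> int:
    """Length of a runoff transcript on a linearized template.

    Positions are 1-based coordinates along the template (an assumed
    convention); the polymerase runs off the DNA at fragment_end.
    """
    if not 1 <= start_site <= fragment_end:
        raise ValueError("start site must lie within the fragment")
    return fragment_end - start_site + 1


fragment_length = 1000  # hypothetical linearized fragment, in bp
promoter_start = 601    # hypothetical transcription start site (+1)

# Accurate initiation at the promoter: one discrete band of known size.
accurate = runoff_length(promoter_start, fragment_length)

# Initiation at the fragment end, as with purified polymerase alone.
end_initiated = runoff_length(1, fragment_length)

print(accurate, end_initiated)
```

With these assumed numbers the promoter-initiated product is 400 nt and the end-initiated product is 1000 nt, so the two outcomes are easily distinguished by size on a gel.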

A nuclear extract from HeLa cells that gave accurate initiation at Pol II promoters was fractionated by ion-exchange column chromatography.
[Figure: fractionation scheme. Columns labeled “– charge on column” and “+ charge on column”, eluted with increasing salt concentration.]
TATA-binding protein (TBP) and 12 TBP-associated factors (TAFs) comprise TFIID.
TFIIB binds asymmetrically and sets the direction of transcription.
Transcription start.

24.10 The Basal Apparatus Assembles at the Promoter. General Transcription Factors.
• Binding of TFIID to the TATA box is the first step in initiation.
• Other transcription factors bind to the complex in a defined order.
– This extends the length of the protected region on DNA (by DNase I footprinting).
• When RNA polymerase II binds to the complex, it may initiate transcription.
• The key step is promoter clearance by pol II after initiation.
Figure 24.14

24.11 Initiation Is Followed by Promoter Clearance of RNA Polymerase II
• TFIIE and TFIIH are required to melt DNA and move with polymerase.
• Capping and Ser2 and Ser5 phosphorylation of the CTD may be required for elongation to begin.
• Many promoters have polymerase II paused, waiting for a further signal (e.g. Drosophila Hsp70 promoters).
• The CTD may coordinate processing of RNA with transcription. It recruits 3′ end processing and splicing factors.
Figure 24.17

Figure 02.01: Each chromosome has a single long molecule of DNA within which are the sequences of individual genes.

‘Compare and contrast’ transcription in prokaryotes and eukaryotes

2.2 Most Genes Encode Polypeptides
• heteromultimer – A molecular complex (such as a protein) composed of different subunits.
• homomultimer – A molecular complex (such as a protein) in which the subunits are identical.
• AlphaFold2 (Google DeepMind, 2021) is a neural-network-based machine learning algorithm trained on known protein structures to predict new protein structures accurately.
Multiple sequence alignments (MSAs) show evolutionary covariation of amino acid changes and allow a protein sequence to be assigned to one of over a thousand domain structure families. Further structure refinement is based on side chain packing.
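
The idea that an MSA reveals covariation between positions can be illustrated with a minimal Python sketch. It scores pairs of alignment columns by mutual information, a classical covariation measure. This is not how AlphaFold2 itself processes alignments, and the toy alignment and function names are invented purely for illustration.

```python
from collections import Counter
from math import log2


def column(msa, i):
    """Return the residues found at alignment position i."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    col_i, col_j = column(msa, i), column(msa, j)
    count_i, count_j = Counter(col_i), Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi


# Toy alignment (hypothetical): positions 1 and 3 change together
# (K pairs with E, R pairs with D); positions 0 and 2 never change.
msa = ["AKLE", "AKLE", "ARLD", "ARLD"]

covarying = mutual_information(msa, 1, 3)
invariant = mutual_information(msa, 0, 1)
print(covarying, invariant)
```

In this toy case the covarying pair scores 1 bit and the pair that includes an invariant column scores 0, which is the kind of signal that suggests two residues may be in contact in the folded protein.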