The structure and function of biopolymers during the transitions of genetic information
Daniel Renciuk
VFU, Brno, 05.10.2016

Genetic information is coded by DNA
The experiment combining two strains of Streptococcus pneumoniae bacteria.
(A)
• S strain - smooth pathogenic bacterium, causes pneumonia
• RANDOM MUTATION produces the R strain - rough nonpathogenic mutant bacterium
• live R strain cells are grown in the presence of either heat-killed S strain cells or cell-free extract of S strain cells
• TRANSFORMATION - some R strain cells are transformed to S strain cells, whose daughters are pathogenic and cause pneumonia
CONCLUSION: Molecules that can carry heritable information are present in S strain cells.
(B)
• the cell-free extract of S strain cells is fractionated into classes of purified molecules: RNA, protein, DNA, lipid, carbohydrate
• each class of molecules is tested for transformation of R strain cells: only the DNA fraction yields S strain cells; with RNA, protein, lipid and carbohydrate the cells remain R strain
CONCLUSION: The molecule that carries the heritable information is DNA.
Molecular Biology of the Cell (© Garland Science 2008)

O. Avery, C. MacLeod, M. McCarty
[Facsimile of the paper's first page]
STUDIES ON THE CHEMICAL NATURE OF THE SUBSTANCE INDUCING TRANSFORMATION OF PNEUMOCOCCAL TYPES
Induction of Transformation by a Desoxyribonucleic Acid Fraction Isolated from Pneumococcus Type III
By Oswald T. Avery, M.D., Colin M. MacLeod, M.D., and Maclyn McCarty, M.D.
(From the Hospital of The Rockefeller Institute for Medical Research)
(Received for publication, November 1, 1943)
Biologists have long attempted by chemical means to induce in higher organisms predictable and specific changes which thereafter could be transmitted in series as hereditary characters.
Among microorganisms the most striking example of inheritable and specific alterations in cell structure and function that can be experimentally induced and are reproducible under well defined and adequately controlled conditions is the transformation of specific types of Pneumococcus. This phenomenon was first described by Griffith (1) who succeeded in transforming an attenuated and non-encapsulated (R) variant derived from one specific type into fully encapsulated and virulent (S) cells of a heterologous specific type. [...]

Nuclein
Nuclein - acidic substance rich in nitrogen and phosphorus
J. F. Miescher - 1869
Isolated from white blood cells in the pus of wounded patients; the proteins were removed by digestion with pepsin (a proteolytic enzyme)

Roles of genetic material
• Genotype role - storage of genetic information and its transmission to the offspring
• Phenotype role - expression of genetic information into particular properties of an individual
• Evolutionary role - adaptation of an organism/species to the environment through changes in genetic information

Terminology
Gene - several "definitions" depending on the point of view:
• classic genetics (Mendel) - elementary unit of hereditary genetic information
• molecular genetics - part of DNA coding for RNA (and, as a consequence, for some property of the individual)
  - structural genes coding for mRNA/protein (+ regulatory regions)
  - genes coding for functional RNA (miRNA, ...)
• strict sense - structural gene: part of DNA that codes for a protein sequence
Allele - particular variant of the respective gene
Genome - complete DNA of an organism (molecular) vs. complete genetic information of an organism vs. sum of genes (classic)
Genotype - the combination of particular alleles of all genes in an individual
Phenotype - the sum of the actual properties of an individual (the result of expression of a particular genotype in the respective environment)
Genophore - the carrier of genetic information, usually a molecule of DNA (the term is often used for bacteria)

Information biopolymers
Deoxyribonucleic acid (DNA)
• linear heteropolymer composed of 2'-deoxyribonucleotides connected by phosphodiester bonds
• usually a stable and resistant double helix
• serves as the storage of genetic information, as a template for its reproduction (replication) and as a template for the expression of genetic information into the phenotype (transcription)
Ribonucleic acid (RNA)
• linear heteropolymer composed of ribonucleotides connected by phosphodiester bonds
• usually single-stranded, with variable length, structure and reactivity
• many functions depending on the type of RNA (see later)
Protein
• linear heteropolymer composed of 20 (21) amino acids connected by peptide bonds
• highly variable structures, properties and functions

Central dogma of molecular biology
Nucleic acids:
• DNA to DNA: replication
• DNA to RNA: transcription; RNA to DNA: reverse transcription (retroviruses, ...)
• RNA to RNA: replication
• RNA to protein: translation

Gene
POU5F1 - POU class 5 homeobox 1 [Homo sapiens (human)]
Gene ID: 5460, updated on 17-Sep-2016
Official Symbol: POU5F1 (provided by HGNC)
Official Full Name: POU class 5 homeobox 1 (provided by HGNC)
Primary source: HGNC:HGNC:9221
See related: Ensembl:ENSG00000204531; HPRD:01252; MIM:164177; Vega:OTTHUMG00000031206
Gene type: protein coding
RefSeq status: REVIEWED
Organism: Homo sapiens
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; ...
Also known as: OCT3; OCT4; OTF3; OTF4; OTF-3; Oct-3; Oct-4
Summary: This gene encodes a transcription factor containing a POU homeodomain that plays a ...
[Screenshot residue: RefSeq protein accessions, including NP_001167002.1, NP_976034.4, NP_001272916.1 and NP_001272915.1]

Genes
Most eucaryotic genes contain introns, which are transcribed into the primary RNA transcript and subsequently removed by splicing on the spliceosome to form the final mRNA.
Intron = noncoding region
Exon = coding region
Procaryotic genes do not contain introns and are transcribed directly into mRNA, which serves as a template for translation into protein.

Nucleic acids
[Figure: a NUCLEOSIDE consists of a 5-carbon sugar and a nitrogenous base; a NUCLEOTIDE is a nucleoside carrying one or more phosphate groups]

5-carbon sugar - pentose
[Figure: sugar rings with carbons numbered 1' to 5'. DNA: β-D-2-deoxyribose (H at the 2' position). RNA: β-D-ribose (OH at the 2' position)]

Base
• Purines: adenine, guanine
• Pyrimidines: cytosine, thymine,
uracil

Nucleic acid nomenclature
Base | Nucleoside | Nucleotide (NTP)
G Guanine | Guanosine | Guanosine triphosphate
A Adenine | Adenosine | Adenosine triphosphate
T Thymine | Thymidine | Thymidine triphosphate
C Cytosine | Cytidine | Cytidine triphosphate
U Uracil | Uridine | Uridine triphosphate

Modified bases
• tRNA + mRNA: pseudouracil, dihydrouracil, methylguanine, methylinosine, methyladenine
• Oxidative damage: 8-oxoadenine, 8-oxoguanine, 5-hydroxymethyluracil
• Metabolism: xanthine, hypoxanthine, inosine
• Epigenetics: 5-methylcytosine, 5-hydroxymethylcytosine

Conformation of N-glycosidic bond
[Figure: syn and anti orientations of the base and the ranges of the torsion angle χ]
Figure 2-8 The glycosidic torsion angle χ is defined by O4'-C1'-N9-C4 for purines and O4'-C1'-N1-C2 for pyrimidines. When χ = 0° the O4'-C1' bond is eclipsed by the N9-C4 bond in purines and by the N1-C2 bond in pyrimidines. The syn conformations correspond to 0° ± 90°; anti conformations correspond to 180° ± 90°. In nucleotides steric hindrance limits the conformations actually found to a much narrower range of angles that depend on sugar pucker and base. The syn conformations are usually found with χ = 45° ± 45°; anti conformations are usually found with χ = -135° ± 45°.

Formation of sugar-phosphate backbone
[Figure: a nucleoside triphosphate is joined to the growing chain by a phosphodiester bond and pyrophosphate is released. Labels: 5' end (phosphate), phosphodiester bond]
• REPLICATION - DNA-dependent DNA polymerase
• TRANSCRIPTION - DNA-dependent RNA polymerase
• REVERSE TRANSCRIPTION - RNA-dependent DNA polymerase
Base reactivity - hydrogen bonds
Hydrogen bond - weak electrostatic interaction of two polar groups: one carries a covalently bound hydrogen (DONOR - usually -N-H or -O-H); the second (ACCEPTOR) is usually N or O
Length: ~ 2.8 Å (2 - 3.4 Å)
Energy: < 1 kcal/mol
Both depend on the particular atoms and on [...]
[Figure: Watson-Crick base pairing; labelled bases include guanine and thymine]

DNA double helix
[Figure: double helix with the sugar-phosphate backbone and the 5' and 3' ends labelled; 0.34 nm = 3.4 Å between adjacent base pairs; helix diameter 2 nm = 20 Å]

DNA double helix
• two molecules (strands) of DNA
• the helix is right-handed
• the strands are antiparallel - their 5'-3' directions are opposite along the double helix
• similar content of purines and pyrimidines; content of A = T, G = C (Chargaff rules)
• result - the strands are complementary, i.e. according to the Watson-Crick base pairing rules the sequence of one strand can be predicted/created from the sequence of the other
• on average the double helix contains 10.5 base pairs per turn of the helix, which is about 3.4 nm in length

DNA structure - Watson and Crick model
F. Crick, J.D. Watson, M. Wilkins, R. Franklin, E. Chargaff
[Facsimile: No. 4356, April 25, 1953, NATURE, p. 737]
[...] the salt of deoxyribose nucleic acid (D.N.A.). This structure has novel features which are of considerable biological interest. A structure for nucleic acid has already been proposed by Pauling and Corey. They kindly made their manuscript available to us in advance of publication. Their model consists of three intertwined chains,
with the phosphates near the fibre axis, and the bases on the outside. In our opinion, this structure is unsatisfactory for two reasons: (1) We believe that the material which gives the X-ray diagrams is the salt, not the free acid. Without the acidic hydrogen atoms it is not clear what forces would hold the structure together, especially as the negatively charged phosphates near the axis will repel each other. (2) Some of the van der Waals distances appear to be too small.
Another three-chain structure has also been suggested by Fraser (in the press). In his model the phosphates are on the outside and the bases on the inside, linked together by hydrogen bonds. This structure as described is rather ill-defined, and for this reason we shall not comment on it.
We wish to put forward a radically different structure for the salt of deoxyribose nucleic acid. This structure has two helical chains each coiled round the same axis (see diagram). We have made the usual chemical assumptions, namely, that each chain consists of phosphate di-ester groups joining β-D-deoxyribofuranose [...] Furberg's model No. 1; that is, the bases are on the inside of the helix and the phosphates on the outside. The configuration of the sugar and the atoms near it is close to Furberg's 'standard configuration', the sugar being roughly perpendicular to the attached base. There is a residue on each chain every 3.4 A. in the z-direction. We have assumed an angle of 36° between adjacent residues in the same chain, so that the structure repeats after 10 residues [...] perpendicular to the [...] together in pairs, a single [...] hydrogen-bonded to a [...] chain, so that the two [...] z-co-ordinates. One of the [...] the other a pyrimidine [...] hydrogen bonds are made [...] to pyrimidine position [...] If it is [...] structure in the most [...] (that is,
with the keto [...] (purine) with thymine [...] (purine) with cytosine [...]
[Figure caption: Franklin's X-ray photograph shows DNA's B-form (1952)]
In other words, if an adenine forms one member of a pair, on either chain, then on these assumptions the other member must be thymine; similarly for guanine and cytosine. The sequence of bases on a single chain does not appear to be restricted in any way. However, if only specific pairs of bases can be formed, it follows that if the sequence of bases on one chain is given, then the sequence on the other chain is automatically determined.
It has been found experimentally that the ratio of the amounts of adenine to thymine, and the ratio of guanine to cytosine, are always very close to unity for deoxyribose nucleic acid.
It is probably impossible to build this structure with a ribose sugar in place of the deoxyribose, as the extra oxygen atom would make too close a van der Waals contact.
The previously published X-ray data on deoxyribose nucleic acid are insufficient for a rigorous test of our structure. So far as we can tell, it is roughly compatible with the experimental data, but it must be regarded as unproved until it has been checked against more exact results. Some of these are given in the following communications. We were not aware of the details of the results presented there when we devised our structure, which rests mainly though not entirely on published experimental data and stereochemical arguments.
It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.
Full details of the structure, including the conditions assumed in building it, together with a set of co-ordinates for the atoms, will be published elsewhere.
We are much indebted to Dr. Jerry Donohue for constant advice and criticism, especially on interatomic [...] Dr. R. E.
Franklin and their co-workers at [...]

DNA structure - Pauling model
Linus Pauling
[Facsimile: CHEMISTRY: PAULING AND COREY, Proc. N. A. S.]
A PROPOSED STRUCTURE FOR THE NUCLEIC ACIDS
By Linus Pauling and Robert B. Corey
Gates and Crellin Laboratories of Chemistry, California Institute of Technology
Communicated December 31, 1952
The nucleic acids, as constituents of living organisms, are comparable in importance to the proteins. There is evidence that they are involved in the processes of cell division and growth, that they participate in the transmission of hereditary characters, and that they are important constituents of viruses. An understanding of the molecular structure of the nucleic acids should be of value in the effort to understand the fundamental phenomena of life.
[...] which are involved in ester linkages. This distortion of the phosphate group from the regular tetrahedral configuration is not supported by direct experimental evidence; unfortunately no precise structure determinations have been made of any phosphate di-esters. The distortion, which corresponds to a larger amount of double bond character for the inner oxygen atoms than for the oxygen atoms involved in the ester linkages, is a reasonable one, and the assumed distances are those indicated by the observed values for somewhat similar substances, especially the ring compound S3O9, in which each sulfur atom is surrounded by a tetrahedron of four oxygen atoms, two of which are shared with adjacent tetrahedra, and two unshared. The O-O distances within the phosphate tetrahedron are 2.32 A (between the two inner oxygen atoms), 2.40 A, 2.55 A, and 2.60 A.
[Figure caption: Plan of the nucleic acid structure, showing several nucleotide residues.]
Various types of double helix
• B-DNA - DNA in water/salt solutions
• A-DNA - DNA in crowding solutions; RNA
• Z-DNA - CpG sequences in crowding conditions

Reversed Watson-Crick pairing

Base protonation
[Figure: protonated cytidine]
• base protonation might alter the base reactivity
• free bases have pK far from physiological pH
• pK of bases in DNA might be closer to pH 7.4
• cytosine in Cn sequences has pK ~ 7 - cytosine i-motif

DNA double helix and ions / water
• phosphates in the DNA backbone are negatively charged - repulsion
• this is compensated by interaction with ions (Na+, K+, Mg2+, ...) or water (hydrogen bonds)

Stability of DNA double helix
Tm = melting temperature
Contribution | Strength | Effect on Tm
hydrogen bonds | AT = 2 vs. GC = 3 | Tm increases with GC content and length
base stacking | various | Tm increases with length and ions
repulsion of backbone phosphates | Mg2+ > Na+ | Tm increases with ions

Base-pair parameters in double helix
[Figure: coordinate frame and base-pair parameters Stagger (Sz), Buckle (κ), Opening (σ), Shift (Dx), Rise (Dz), Tilt (τ), Roll (ρ), Twist (Ω). Lu et al., 2003, NAR]

Types of nucleic acids
• linear (human chromosome) vs. circular (bacterial genome)
• single-stranded (most RNAs) vs. double-stranded (human DNA)

Superhelicity
• Overwound topological domains form compact large-scale chromatin structures
• Supercoiling influences higher levels of chromatin organisation
• Underwound topological domains have a decompacted large-scale structure
Superhelicity arises mostly from the passage of the polymerase complex and the unwinding of DNA (helicase, ...) during replication and transcription.

Topoisomerases
• Enzymes that relax the superhelicity
• Topo I - acts on one DNA strand (transient single-strand break)
• Topo II - acts on both DNA strands (transient double-strand break)

Reactivity of bases with amino acids
Double-stranded NA: Interaction of Hoogsteen side with amino acid in major groove.
[Figure panels: asparagine (or glutamine), serine (or threonine), arginine]
Figure 2-16 Interactions involving two hydrogen bonds between amino acids and bases that can occur through the major groove of a double helix.

Reactivity of bases with amino acids
Single-stranded NA: Interaction of Watson-Crick side with amino acid.
[Figure panels: aspartic (or glutamic) acid, asparagine (or glutamine), arginine]
Figure 2-17 Interactions involving two hydrogen bonds between amino acids and bases that take the place of Watson-Crick base pairing.

Genome composition
[Figure: composition of the genome on a percentage scale from 0 to 100. Categories: LINEs, SINEs, retroviral-like elements, DNA-only transposon 'fossils', simple sequence repeats, segmental duplications, GENES (introns, protein-coding regions), non-repetitive DNA that is neither in introns nor codons]
Molecular Biology of the Cell (© Garland Science 2008)

Repetitive sequences - repeats
Some sequences in the genome are unique, usually the gene sequences (both coding and non-coding). In contrast, other sequences exist in many copies - repetitive sequences (repeats). The length of the repeat unit (microsatellites 2-6 bp vs. LINEs 6000-7000 bp) and the number of copies (from several up to 1.5M for SINEs in human) are highly variable.
Structure:
• direct repeats
  5'...AGTC...AGTC...3'
  3'...TCAG...TCAG...5'
• mirror repeats
  5'...AGTC...CTGA...3'
  3'...TCAG...GACT...5'
• inverted repeats + palindromes
  5'...AGTC...GACT...3'
  3'...TCAG...CTGA...5'
Position:
• Tandem repeats
• Interspersed repeats

Inverted repeats
• Hairpin - the palindrome 5'-AGTCGACT-3' folds back on itself; stem base pairs A-T, G-C, T-A, C-G
• Hairpin with loop - the inverted repeat 5'-AGTCTGAGCTGACT-3' forms a stem from AGTC and GACT; the bases between them form the loop

Inverted repeats
• Cruciform - in double-stranded DNA the inverted repeat forms a hairpin on each strand:
  5'-AGTCTGAGCTGACT-3'
  3'-TCAGACTCGACTGA-5'

Special types of repetitions - transposons
Interspersed repetitions with various lengths and numbers of copies.
• LTR - long terminal repeats - ~100 bp - 5 kbp - variant of retrotransposons
• LINE - long interspersed nuclear elements - up to 6 kbp - human > 500k copies - 3 types (L1, L2, L3) - only some L1 are able to transpose
• SINE - (Alu, ...) short interspersed nuclear elements - up to 500 bp - human ~ 1.5M copies

Loops and hairpins in RNA
[Figure: secondary structures of 16S RNA, tRNA (Lys) with the D-loop labelled, and pre-tRNA]
• tRNA - translation; amino acid carrier
• snRNA, snoRNA, ... - splicing/modification of RNA
• siRNA, miRNA, ... - gene expression regulation

Hoogsteen pairing - triplexes
• gene expression regulation
• therapy
[Figure: triplex strands and base triplets, including C+*G C]

Hoogsteen pairing - quadruplexes

Guanine quadruplexes
Sequence motif: GGG-Nn-GGG-Nn-GGG-Nn-GGG
• gene expression regulation
• telomere structure
[Figures: transcriptional activation and altered transcription; examples C-Myc and KRAS]
(Huppert J.L., Chem Soc Rev, 2008)
(Biffi G., et al., Nat Chem, 2013)
(Brooks T.A., et al., FEBS J, 2010)
(Maizels N.
and Gray L.T., PLoS Genet., 2013)

Base reactivity
Hydrophobic bases with a high ability to form hydrogen bonds are reluctant to be freely exposed to the surrounding water. Whenever base pairing or base stacking of any type can lower this exposure, the bases tend to form a structure. Even "single-stranded" RNA or DNA in fact forms a compact structure with a number of base pairs.

Packing of DNA into chromosome
• At the simplest level, chromatin is a double-stranded helical structure of DNA.
• The 300-nm fibers are compressed and folded to produce a 250-nm-wide fiber.
• Tight coiling of the 250-nm fiber produces the chromatid of a chromosome (1400 nm).
(Nature Education, 2013)

Binding of DNA to a histone octamer
• "beads-on-a-string" form of chromatin: core histones of the nucleosome connected by linker DNA; a nucleosome includes ~200 nucleotide pairs of DNA
• NUCLEASE DIGESTS LINKER DNA - released nucleosome core particle (11 nm)
• DISSOCIATION WITH HIGH CONCENTRATION OF SALT - octameric histone core + 147-nucleotide-pair DNA double helix (50 nm)
• DISSOCIATION - histones H2A, H2B, H3, H4
Molecular Biology of the Cell (© Garland Science 2008)

Folding of nucleosomes into 30 nm fiber
Molecular Biology of the Cell (© Garland Science 2008)

30 nm fiber binds to protein scaffold
Molecular Biology of the Cell (© Garland Science 2008)

Chromosome
[Figure: one chromosome (two identical chromatids) with the short arm (p), long arm (q), chromatids and telomeres labelled]
Molecular Biology of the Cell (© Garland Science 2008)
Centromere - the site where the chromosome attaches to the cellular microtubule system; important for chromosome segregation during cell division
Telomere - terminal part of a chromatid that protects the end from being recognised as a double-strand break by the DNA repair machinery
[Figure label: T-loop]
[Figure labels: hnRNPs A2-B1, D-loop. Nature Reviews | Cancer]

Chromosome
Fully condensed chromosomes are present only during cell division; otherwise they are more or less decondensed to lower levels of structure, especially at transcriptionally active sites (euchromatin). Transcriptionally inactive parts of DNA, as well as repetitive regions and telomeres, are much more condensed (heterochromatin). The types of chromatin differ in epigenetic markers on both DNA (5-methylcytosine) and histones (methylation and acetylation).

[Figure 1-37: number of nucleotide pairs per haploid genome, axis from 10^5 to 10^12. Molecular Biology of the Cell, Fifth Edition (© Garland Science 2008)]

Table 1-1 Some Genomes That Have Been Completely Sequenced
SPECIES | SPECIAL FEATURES | HABITAT | GENOME SIZE (1000s OF NUCLEOTIDE PAIRS PER HAPLOID GENOME) | ESTIMATED NUMBER OF GENES CODING FOR PROTEINS
ARCHAEA
Methanococcus jannaschii | lithotrophic, anaerobic, methane-producing | hydrothermal vents | 1664 | 1750
Archaeoglobus fulgidus | lithotrophic or organotrophic, anaerobic, sulfate-reducing | hydrothermal vents | 2178 | 2493
Nanoarchaeum equitans | smallest known archaean; anaerobic; parasitic on another, larger archaean | hydrothermal and volcanic hot vents | 491 | 552
EUCARYOTES
Saccharomyces cerevisiae (budding yeast) | minimal model eucaryote | grape skins, beer | 12,069 | ~6300
Arabidopsis thaliana (Thale cress) | model organism for flowering plants | soil and air | ~142,000 | ~26,000
Caenorhabditis elegans (nematode worm) | simple animal with perfectly predictable development | soil | ~97,000 | ~20,000
Drosophila melanogaster (fruit fly) | key to the genetics of animal development | rotting fruit | ~137,000 | ~14,000
Homo sapiens (human) | most intensively studied mammal | houses | ~3,200,000 | ~24,000
Genome size and gene number vary between strains of a single species, especially for bacteria and archaea.
The table shows data for particular strains that have been sequenced. For eucaryotes, many genes can give rise to several alternative variant proteins, so that the total number of proteins specified by the genome is substantially greater than the number of genes.
Table 1-1 (part 2 of 2) Molecular Biology of the Cell, Fifth Edition (© Garland Science 2008)

Levels of structure of biopolymers
DNA, RNA, Protein

Genetic code
Set of rules that assigns a sequence of amino acids in a protein to the sequence of nucleotides in DNA or RNA.
DNA to RNA: transcription; RNA to protein: translation

[Facsimile of the paper's first page]
RNA CODEWORDS AND PROTEIN SYNTHESIS, III. ON THE NUCLEOTIDE SEQUENCE OF A CYSTEINE AND A LEUCINE RNA CODEWORD
By Philip Leder and Marshall W. Nirenberg
NATIONAL HEART INSTITUTE, NATIONAL INSTITUTES OF HEALTH
Communicated by Richard B. Roberts, October 1, 1964
Previous studies utilizing randomly ordered synthetic polynucleotides to direct amino acid incorporation into protein in E. coli extracts indicated that RNA codewords corresponding to valine, leucine, and cysteine contain the bases (UUG). The activity of chemically defined trinucleotides in stimulating the binding of a specific C14-aminoacyl-sRNA to ribosomes, prior to peptide bond formation, provided a means of investigating base sequence of RNA codewords and showed that the sequence of a valine RNA codeword is GpUpU.
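The excerpt above distinguishes the base composition of a codeword, (UUG), from its base sequence. A minimal sketch of that distinction follows, assuming the three codon assignments from the standard genetic code; the function names are hypothetical and chosen only for illustration. It lists every distinct ordering of the bases U, U and G and the amino acid each ordering specifies.

```python
from itertools import permutations

# Assumption: assignments for these three codons are taken from the
# standard genetic code; they are not derived from the excerpt itself.
STANDARD_CODE_SUBSET = {"GUU": "Val", "UGU": "Cys", "UUG": "Leu"}


def distinct_orderings(bases: str) -> list[str]:
    """Return every distinct sequence that can be built from the given bases."""
    return sorted({"".join(p) for p in permutations(bases)})


def to_p_notation(codon: str) -> str:
    """Write a triplet in the 'GpUpU' style used in the excerpt."""
    return "p".join(codon)


# One composition (U, U, G) gives three different sequences,
# and each sequence specifies a different amino acid.
for codon in distinct_orderings("UUG"):
    print(to_p_notation(codon), "->", STANDARD_CODE_SUBSET[codon])
```

The loop prints one line per ordering, and GpUpU maps to valine, in agreement with the excerpt. Knowing only the composition cannot separate valine, cysteine and leucine, which is why the sequence of each codeword had to be determined.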
Properties of genetic code
• genetic code is based on triplets - one amino acid in a protein is coded by a sequence of three nucleotides in DNA (RNA)
  Triplet = codon; anticodon = complementary sequence on the particular tRNA that carries the respective amino acid
  mRNA:    CGU GGU ACG AUU GGA UGU
  Protein: Arg Gly Thr Ile Gly Cys
• genetic code is universal - individual triplets code for the same amino acid in almost all organisms (exception: mitochondria)
  CGU = Arginine in each of the organisms shown
• genetic code is degenerate - one amino acid might be coded by several different triplets (but the opposite is not true)
  e.g. Arginine: CGC, AGA

Genetic code
Columns U, C, A, G give the second nucleotide.
First nt | U | C | A | G | Third nt
U | Phe | Ser | Tyr | Cys | U
U | Phe | Ser | Tyr | Cys | C
U | Leu | Ser | STOP | STOP/Sec | A
U | Leu | Ser | STOP | Trp | G
C | Leu | Pro | His | Arg | U
C | Leu | Pro | His | Arg | C
C | Leu | Pro | Gln | Arg | A
C | Leu | Pro | Gln | Arg | G
A | Ile | Thr | Asn | Ser | U
A | Ile | Thr | Asn | Ser | C
A | Ile | Thr | Lys | Arg | A
A | Met/START | Thr | Lys | Arg | G
G | Val | Ala | Asp | Gly | U
G | Val | Ala | Asp | Gly | C
G | Val | Ala | Glu | Gly | A
G | Val | Ala | Glu | Gly | G

Reading frames
Genetic code is based on triplets - there are three possible ways of reading (reading frames), but only one is correct.
mRNA:     CGU GGU ACG AUU GGA UGU
Protein 1: Arg Gly Thr Ile Gly Cys
mRNA:     C GUG GUA CGA UUG GAU GU
Protein 2: Val Val Arg Leu Asp
mRNA:     CG UGG UAC GAU UGG AUG U
Protein 3: Trp Tyr Asp Trp Met

Genetic code
Although the genetic code is universal, the usage of particular codons, as well as the amounts of particular tRNAs and aminoacyl-tRNA synthetases, differ between organisms.
Optimization of synthetic genes for recombinant protein production according to the expression system used (bacteria, human, ...) might be highly beneficial.
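The three reading frames shown above can be reproduced with a short script. This is a minimal sketch, assuming the standard genetic code written as a compact 64-letter string in U, C, A, G order; the names `CODON_TABLE` and `translate` are hypothetical, and UGA is treated simply as STOP (the selenocysteine reading is ignored).

```python
# Standard genetic code; index = 16*first + 4*second + third, bases ordered U, C, A, G.
BASES = "UCAG"
AMINO = ("FFLLSSSSYY**CC*W"   # first nucleotide U
         "LLLLPPPPHHQQRRRR"   # first nucleotide C
         "IIIMTTTTNNKKSSRR"   # first nucleotide A
         "VVVVAAAADDEEGGGG")  # first nucleotide G

CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "STOP",
}


def translate(mrna: str, frame: int) -> list[str]:
    """Translate complete codons of mrna, starting at offset frame (0, 1 or 2)."""
    codons = [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]
    return [ONE_TO_THREE[CODON_TABLE[codon]] for codon in codons]


mrna = "CGUGGUACGAUUGGAUGU"  # the example sequence from the slide
for frame in range(3):
    print(f"frame {frame + 1}:", " ".join(translate(mrna, frame)))
```

Shifting the start by a single nucleotide changes every codon downstream, so the same mRNA gives three unrelated peptides. Incomplete codons at the 3' end are skipped, which is why frames 2 and 3 yield five residues instead of six.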