NUCLEIC ACIDS
Basic terms and notions
Presentation by Eva Fadrná, adapted by Radovan Fiala

Books
Saenger, W., Principles of Nucleic Acid Structure, Springer, 1984.
Bloomfield, V. A., Crothers, D. M., Tinoco, I., Nucleic Acids: Structures, Properties, and Functions, Univ. Sci. Books, 2000.
Wüthrich, K., NMR of Proteins and Nucleic Acids, Wiley, 1986.

Review articles
Can be downloaded from https://web.ncbr.muni.cz/~fiala/
Bowater, R. P., Waller, Z. A. E., DNA Structure, In: eLS. John Wiley & Sons, Chichester, 2014.
Wijmenga, S. S., van Buuren, B. N. M., The use of NMR methods for conformational studies of nucleic acids, Progr. NMR Spect. 32 (1998), 287-387.
Fürtig, B. et al., NMR of RNA, ChemBioChem 4 (2003), 936-962.

April 25, 1953 NATURE 737

MOLECULAR STRUCTURE OF NUCLEIC ACIDS
A Structure for Deoxyribose Nucleic Acid

WE wish to suggest a structure for the salt of deoxyribose nucleic acid (D.N.A.). This structure has novel features which are of considerable biological interest.

A structure for nucleic acid has already been proposed by Pauling and Corey¹. They kindly made their manuscript available to us in advance of publication. Their model consists of three intertwined chains, with the phosphates near the fibre axis, and the bases on the outside. In our opinion, this structure is unsatisfactory for two reasons: (1) We believe that the material which gives the X-ray diagrams is the salt, not the free acid. Without the acidic hydrogen atoms it is not
clear what forces would hold the structure together, especially as the negatively charged phosphates near the axis will repel each other. (2) Some of the van der Waals distances appear to be too small.

Another three-chain structure has also been suggested by Fraser (in the press). In his model the phosphates are on the outside and the bases on the inside, linked together by hydrogen bonds. This structure as described is rather ill-defined, and for this reason we shall not comment on it.

[Figure caption] This figure is purely diagrammatic. The two ribbons symbolize the two phosphate-sugar chains, and the horizontal rods the pairs of bases holding the chains together. The vertical line marks the fibre axis.

We wish to put forward a radically different structure for the salt of deoxyribose nucleic acid. This structure has two helical chains each coiled round the same axis (see diagram). We have made the usual chemical assumptions, namely, that each chain consists of phosphate di-ester groups joining β-D-deoxyribofuranose residues with 3',5' linkages. The two chains (but not their bases) are related by a dyad perpendicular to the fibre axis. Both chains follow right-handed helices, but owing to the dyad the sequences of the atoms in the two chains run in opposite directions. Each chain loosely resembles Furberg's² model No. 1; that is, the bases are on the inside of the helix and the phosphates on the outside. The configuration of the sugar and the atoms near it is close to Furberg's 'standard configuration', the sugar being roughly perpendicular to the attached base. There is a residue on each chain every 3.4 Å in the z-direction. We have assumed an angle of 36° between adjacent residues in the same chain, so that the structure repeats after 10 residues on each chain, that is, after 34 Å. The distance of a phosphorus atom from the fibre axis is 10 Å. As the phosphates are on the outside, cations have easy access to them.
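The helical parameters quoted above (3.4 Å rise, 36° between adjacent residues, phosphorus at 10 Å from the axis) fix the repeat of the structure by simple arithmetic. The following is a minimal sketch of that arithmetic; the function names are illustrative and not from the paper, and the coordinates describe an idealised single chain only, not the published model.

```python
import math

# Values taken from the text; everything else is an illustrative assumption.
RISE_A = 3.4        # axial rise per residue, in Å
TWIST_DEG = 36.0    # rotation between adjacent residues, in degrees
P_RADIUS_A = 10.0   # distance of a phosphorus atom from the fibre axis, in Å


def residues_per_turn(twist_deg=TWIST_DEG):
    """Number of residues needed for one full 360° turn."""
    return 360.0 / twist_deg


def pitch(rise=RISE_A, twist_deg=TWIST_DEG):
    """Axial length of one full turn, in Å."""
    return rise * residues_per_turn(twist_deg)


def phosphorus_positions(n, rise=RISE_A, twist_deg=TWIST_DEG, radius=P_RADIUS_A):
    """Idealised (x, y, z) of n successive P atoms on one right-handed chain."""
    coords = []
    for i in range(n):
        phi = math.radians(i * twist_deg)
        coords.append((radius * math.cos(phi), radius * math.sin(phi), i * rise))
    return coords


if __name__ == "__main__":
    print(residues_per_turn(), "residues per turn")
    print(pitch(), "Å per turn")
```

With these inputs the eleventh phosphorus (index 10) sits directly above the first, one full repeat further along the axis.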
The structure is an open one, and its water content is rather high. At lower water contents we would expect the bases to tilt so that the structure could become more compact.

The novel feature of the structure is the manner in which the two chains are held together by the purine and pyrimidine bases. The planes of the bases are perpendicular to the fibre axis. They are joined together in pairs, a single base from one chain being hydrogen-bonded to a single base from the other chain, so that the two lie side by side with identical z-co-ordinates. One of the pair must be a purine and the other a pyrimidine for bonding to occur. The hydrogen bonds are made as follows: purine position 1 to pyrimidine position 1; purine position 6 to pyrimidine position 6.

If it is assumed that the bases only occur in the structure in the most plausible tautomeric forms (that is, with the keto rather than the enol configurations) it is found that only specific pairs of bases can bond together. These pairs are: adenine (purine) with thymine (pyrimidine), and guanine (purine) with cytosine (pyrimidine).

In other words, if an adenine forms one member of a pair, on either chain, then on these assumptions the other member must be thymine; similarly for guanine and cytosine. The sequence of bases on a single chain does not appear to be restricted in any way. However, if only specific pairs of bases can be formed, it follows that if the sequence of bases on one chain is given, then the sequence on the other chain is automatically determined.

It has been found experimentally³,⁴ that the ratio of the amounts of adenine to thymine, and the ratio of guanine to cytosine, are always very close to unity for deoxyribose nucleic acid.

It is probably impossible to build this structure with a ribose sugar in place of the deoxyribose, as the extra oxygen atom would make too close a van der Waals contact.
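The statement that one chain's sequence determines the other can be illustrated in a few lines. This is a sketch under the pairing rules stated above (A with T, G with C) and the antiparallel arrangement of the two chains; the function names are hypothetical, and the example sequence is the dodecamer d(CGCGAATTCGCG) that appears in the NMR slides of this presentation.

```python
# Pairing rules from the text: adenine-thymine and guanine-cytosine.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(seq):
    """Partner chain, written in its own 5'->3' direction.

    The chains run in opposite directions, so the partner is the
    complement of the input read backwards.
    """
    return "".join(PAIR[base] for base in reversed(seq.upper()))


def duplex_base_counts(seq):
    """Count each base over both chains of the duplex."""
    duplex = seq.upper() + complementary_strand(seq)
    return {base: duplex.count(base) for base in "ATGC"}


if __name__ == "__main__":
    strand = "CGCGAATTCGCG"
    print(complementary_strand(strand))
    print(duplex_base_counts(strand))
```

In any duplex built this way the counts of A and T are equal, as are those of G and C, which is the unit ratio reported experimentally. The dodecamer is self-complementary, so its partner strand has the same sequence.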
The previously published X-ray data⁵,⁶ on deoxyribose nucleic acid are insufficient for a rigorous test of our structure. So far as we can tell, it is roughly compatible with the experimental data, but it must be regarded as unproved until it has been checked against more exact results. Some of these are given in the following communications. We were not aware of the details of the results presented there when we devised our structure, which rests mainly though not entirely on published experimental data and stereochemical arguments.

It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.

Full details of the structure, including the conditions assumed in building it, together with a set of co-ordinates for the atoms, will be published elsewhere.

We are much indebted to Dr. Jerry Donohue for constant advice and criticism, especially on interatomic distances. We have also been stimulated by a knowledge of the general nature of the unpublished experimental results and ideas of Dr. M. H. F. Wilkins, Dr. R. E. Franklin and their co-workers at King's College, London. One of us (J. D. W.) has been aided by a fellowship from the National Foundation for Infantile Paralysis.

J. D. Watson
F. H. C. Crick
Medical Research Council Unit for the Study of the Molecular Structure of Biological Systems, Cavendish Laboratory, Cambridge. April 2.

April 25, 1953, vol. 171

¹ Pauling, L., and Corey, R. B., Nature, 171, 346 (1953); Proc. U.S. Nat. Acad. Sci., 39, 84 (1953).
² Furberg, S., Acta Chem. Scand., 6, 634 (1952).
³ Chargaff, E., for references see Zamenhof, S., Brawerman, G., and Chargaff, E., Biochim. et Biophys. Acta, 9, 402 (1952).
⁴ Wyatt, G. R., J. Gen. Physiol., 36, 201 (1952).
⁵ Astbury, W. T., Symp. Soc. Exp. Biol. 1, Nucleic Acid, 66 (Camb. Univ. Press, 1947).
⁶ Wilkins, M. H. F., and Randall, J. T., Biochim. et Biophys. Acta, 10, 192 (1953).
Molecular Structure of Deoxypentose Nucleic Acids

While the biological properties of deoxypentose nucleic acid suggest a molecular structure containing great complexity, X-ray diffraction studies described here (cf. Astbury¹) show the basic molecular configuration has great simplicity. The purpose of this communication is to describe, in a preliminary way, some of the experimental evidence for the polynucleotide chain configuration being helical, and existing in this form when in the natural state. A fuller account of the work will be published shortly.

The structure of deoxypentose nucleic acid is the same in all species (although the nitrogen base ratios alter considerably) in nucleoprotein, extracted or in cells, and in purified nucleate. The same linear group of polynucleotide chains may pack together parallel in different ways to give crystalline¹⁻³, semi-crystalline or paracrystalline material. In all cases the X-ray diffraction photograph consists of two regions, one determined largely by the regular spacing of nucleotides along the chain, and the other by the longer spacings of the chain configuration. The sequence of different nitrogen bases along the chain is not made visible.

Oriented paracrystalline deoxypentose nucleic acid ('structure B' in the following communication by Franklin and Gosling) gives a fibre diagram as shown in Fig. 1 (cf. ref. 4). Astbury suggested that the strong 3.4 Å reflexion corresponded to the inter-nucleotide repeat along the fibre axis. The ~34 Å layer lines, however, are not due to a repeat of a polynucleotide composition, but to the chain configuration repeat, which causes strong diffraction as the nucleotide chains have higher density than the interstitial water. The absence of reflexions on or near the meridian immediately suggests a helical structure with axis parallel to fibre length.
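The relation between the two spacings mentioned above can be made concrete with a short calculation. A repeat of length P along the fibre axis produces layer lines at axial reciprocal coordinates n/P, so a 3.4 Å axial repeat within a 34 Å configuration repeat falls on the tenth layer line. The sketch below assumes exactly these two values from the text; the function names are illustrative.

```python
def layer_line_positions(repeat_a, n_max):
    """Axial reciprocal coordinate (in 1/Å) of layer lines 0..n_max.

    repeat_a is the axial repeat of the chain configuration in Å.
    """
    return [n / repeat_a for n in range(n_max + 1)]


def layer_index_of_spacing(repeat_a, spacing_a):
    """Layer line on which an axial spacing of spacing_a Å appears."""
    return round(repeat_a / spacing_a)


if __name__ == "__main__":
    for n, zeta in enumerate(layer_line_positions(34.0, 10)):
        print(f"layer line {n:2d}: {zeta:.4f} 1/Å")
    print("3.4 Å spacing lies on layer line", layer_index_of_spacing(34.0, 3.4))
```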
Diffraction by Helices

It may be shown⁵ (also Stokes, unpublished) that the intensity distribution in the diffraction pattern of a series of points equally spaced along a helix is given by the squares of Bessel functions. A uniform continuous helix gives a series of layer lines of spacing corresponding to the helix pitch, the intensity distribution along the nth layer line being proportional to the square of Jn, the nth order Bessel function. A straight line may be drawn approximately through the innermost maxima of each Bessel function and the origin. The angle this line makes with the equator is roughly equal to the angle between an element of the helix and the helix axis. If a unit repeats n times along the helix there will be a meridional reflexion (J₀²) on the nth layer line. The helical configuration produces side-bands on this fundamental frequency, the effect⁵ being to reproduce the intensity distribution about the origin around the new origin, on the nth layer line, corresponding to C in Fig. 2.

Fig. 1. Fibre diagram of deoxypentose nucleic acid from B. coli. Fibre axis vertical

We will now briefly analyse in physical terms some of the effects of the shape and size of the repeat unit or nucleotide on the diffraction pattern. First, if the nucleotide consists of a unit having circular symmetry about an axis parallel to the helix axis, the whole diffraction pattern is modified by the form factor of the nucleotide. Second, if the nucleotide consists of a series of points on a radius at right-angles to the helix axis, the phases of radiation scattered by the helices of different diameter passing through each point are the same. Summation of the corresponding Bessel functions gives reinforcement for the inner- [...]

Fig. 2. Diffraction pattern of system of helices corresponding to structure of deoxypentose nucleic acid.
The squares of Bessel functions are plotted about 0 on the equator and on the first, second, third and fifth layer lines for half of the nucleotide mass at 20 Å diameter and remainder distributed along a radius, the mass at a given radius being proportional to the radius. About C on the tenth layer line similar functions are plotted for an outer diameter of 12 Å.

Single strand, A-RNA, B-DNA duplex

Length of NA
Total length of DNA in a human cell
DNA in typical human chromosome
DNA from bacterial chromosome
Diameter of typical human cell
Diameter of folded DNA
Diameter of DNA fiber
Diameter of atom
Scale: 1 cm, 1 mm, 0.01 mm, 0.1 µm, 1 nm, 1 Å
=> 1 chromosome would be 10 km long with a fiber diameter of 1 mm, and it would fold into a 10 cm diameter
=> extraordinary

DNA

RNA vs DNA
deoxythymidine, uridine

Nucleotide/nucleoside
base + sugar (ribose/deoxyribose) = nucleoside
nucleoside + phosphate = nucleotide

Base numbering

Base tautomerism
physiological conditions
enamine <-> imine
[Figure: tautomeric forms of the bases]

Sugar - pentoses
RNA: β-D-ribose
DNA: 2-deoxy-β-D-ribose
hemiacetal hydroxyl group + base = nucleoside
C1'-N1 in pyrimidines, C1'-N9 in purines

Nucleosides

Deoxyribonucleosides
deoxythymidine = dT
deoxycytidine = dC
deoxyadenosine = dA
deoxyguanosine = dG

Phosphate group
orthophosphoric acid (H3PO4) + adenosine = adenosine (mono)phosphate (AMP)

Nucleotides

Ribonucleotides
uridylic acid = uridine 5'-monophosphate = UMP, pU
cytidylic acid = cytidine 5'-monophosphate = CMP, pC
adenylic acid = adenosine 5'-monophosphate = AMP, pA
guanylic acid = guanosine 5'-monophosphate = GMP, pG

Deoxyribonucleotides
deoxythymidylic acid = 2'-deoxythymidine 5'-monophosphate = dTMP, pdT
deoxycytidylic acid = 2'-deoxycytidine 5'-monophosphate = dCMP, pdC
deoxyadenylic acid = 2'-deoxyadenosine 5'-monophosphate = dAMP, pdA
deoxyguanylic acid = 2'-deoxyguanosine 5'-monophosphate = dGMP, pdG

Phosphate
acid (ester)
nucleoside-nucleotide diester: ApA

Nucleotide chain
[Figure: nucleotide chain with backbone and sugar atoms labelled O5', C5', C4', O4', C3', O3', C2', O2', C1']
Torsion angle
synperiplanar (sp), synclinal (sc), anticlinal (ac), antiperiplanar (ap)

Torsion angles in NA
chain direction; nucleotide unit i
α(i) = O3'(i-1) - P(i) - O5'(i) - C5'(i)
β(i) = P(i) - O5'(i) - C5'(i) - C4'(i)
γ(i) = O5'(i) - C5'(i) - C4'(i) - C3'(i)
δ(i) = C5'(i) - C4'(i) - C3'(i) - O3'(i)
ε(i) = C4'(i) - C3'(i) - O3'(i) - P(i+1)
ζ(i) = C3'(i) - O3'(i) - P(i+1) - O5'(i+1)

Torsion angle χ
Orientation around the C1'-N glycosidic bond
SYN: [figure]
Torsion χ - border intervals: <270°, 360°>

Torsion angles in DNA
Angle   B-DNA    A-DNA
α       -40.7    -74.8
β       -135.6   -179.1
γ       -37.4    58.9
δ       139.5    78.2
ε       -133.2   -155.0
ζ       -156.9   -67.1
χ       -101.9   -158.9

Sugar conformation
"Puckering" of the sugar ring

Definition of the puckering modes
The sugar ring is not planar.
With respect to C5': endo (atom displaced to the same side of the ring as C5')
Envelope C3'-endo, ³E (prevalent in RNA)
Envelope C2'-endo, ²E (prevalent in DNA)
Symmetric twist C2'-exo-C3'-endo, ³₂T
Non-symmetric twist C3'-endo-C2'-exo, ³T₂

Pseudorotation cycle
Theoretically there is an infinite number of conformations; each can be characterized by the maximum torsion angle (degree of pucker) and the pseudorotation phase angle.
Torsion angles are not independent (the ring is closed).
ν_max: amplitude, maximum out-of-plane pucker
ν_max = ν_0 / cos(P)
P, ν_j relation

P in nucleic acids
0° < P < 36°: north (prevalent in RNA)
144° < P < 190°: south, C2'-endo (prevalent in DNA)

Helical parameters

Base pairing
Watson-Crick pairs
[Figure: Watson-Crick base pairs with the trans-hydrogen-bond couplings 2hJNN and 1hJHN labelled]

Hoogsteen and reverse Hoogsteen pairs
[Figure: Hoogsteen and reverse Hoogsteen base pairs]

A and B double helix
A and B helices
A-RNA with bulge

Nuclear properties of selected isotopes
Isotope (I = 1/2)   γ × 10⁻⁷ (rad T⁻¹ s⁻¹)   ν at 11.74 T (MHz)   Natural abundance (%)   Sensitivity Rel.(a)   Sensitivity Abs.(b)
1H                  26.75                    500.0                99.98                   1.00                  1.00
13C                 6.73                     125.7                1.11                    1.6×10⁻²              1.8×10⁻⁴
15N                 -2.71                    50.7                 0.37                    1.0×10⁻³              3.8×10⁻⁶
31P                 10.83                    202.4                100                     6.6×10⁻²              6.6×10⁻²
(a) Relative sensitivity at constant field for equal number of nuclei.
(b) Product of relative sensitivity and natural abundance.

Spin systems in ribose and deoxyribose
β-D-Ribose: XWTPMA
2'-Deoxy-β-D-Ribose: XAMWTNP

Spin systems in nucleic acid bases
Cytosine, C: AX
Thymine, T: A3X
Uracil, U: AX
Adenine, A: A + A
Guanine, G: A

1H chemical shift ranges in DNA and RNA
Code     δ (ppm)    Comments
2'       1.8-3.0    2'H, 2''H in DNA
4', 5'   3.7-4.5    4'H, 5'H, 5''H in DNA
3'       4.4-5.2    3'H in DNA
         3.7-5.2    2'H, 3'H, 4'H, 5'H, 5''H in RNA
1'       5.3-6.3    1'H
CH3      1.2-1.6    CH3 of T
5        5.3-6.0    5H of C and U
6        7.1-7.6    6H of C, T and U
2, 8     7.3-8.4    8H of A and G, 2H of A
NH2*     6.6-9.0    NH2 of A, C and G
NH*      10-15      Ring NH of G, T and U

1H NMR spectrum of d(CGCGAATTCGCG)

1H NMR spectra in H2O

1H COSY spectrum of DNA
d(CGCGAATTCGCG)2
[Figure: COSY spectrum, both axes in ppm. Labelled cross-peak regions: a H2'-H2''; b H4'-H5',5'' and H5'-H5''; c H3'-H4'; d H2',2''-H3'; e H1'-H2',2''; f H5-H6 (Cyt); g CH3-H6 (Thy)]

1H TOCSY spectrum of DNA

1H NOESY spectrum of DNA in D2O

1H NOESY spectrum of DNA in H2O
[Figure: NOESY spectrum in H2O; labelled signals include CytH5 and CytH4]
[Figure, continued: labelled signals AdeH2 and CytH4(HB); axis in ppm]
d(CGCGAATTCGCG)2
a H imino - H imino
b H imino - H amino; H imino - AdeH2, CytH5
c H amino - H amino; H amino - AdeH2, CytH5, H6
d H imino - TCH3

Water Suppression
The presence of an intense solvent resonance necessitates an impractically high dynamic range.
110 M vs ...

How do we get distance information?
Nuclear Overhauser effect (< 6 Å)

α and ζ pose problems
Determinants of 31P chemical shift; ε and ζ correlate:
ζ = -317 - 1.23 ε

Scalar couplings used for torsion angles
3J(P5'-H5'/H5''), 3J(P5'-C4')
3J(H4'-H5'/H5''), 3J(C3'-H5'/H5'')
3J(P3'-H3'), 3J(P3'-C2'), 3J(P3'-C4')
3J(H1'-C6), 3J(H1'-C2) (C, T, U)
3J(H1'-C8), 3J(H1'-C4) (A, G)

Structure Determination:
I) Assignment
II) Local Analysis
• glycosidic torsion angle, sugar puckering, backbone conformation, base pairing
III) Global Analysis
• sequential, inter strand/cross strand, dipolar coupling

Nucleic acids have few protons...
• NOE accuracy > account for spin diffusion
• Backbone may be difficult to fully characterize > especially α and ζ
• Dipolar couplings

What do we know?
• Distance, torsion, H-bond constraints
What do we want?
• Low energy structures
Methods
• Distance geometry
• Simulated annealing, rMD
• Torsion angle dynamics (DYANA)
• Mardigras/IRMA/Morass

Structure determination workflow
• Optimize conditions: pH, I, T (1D NMR)
• Assignments: spin system, sequential, long range (NOESY, TOCSY, COSY)
• Distance constraints, torsion constraints (NOESY, COSY)
• Distance geometry / simulated annealing: use constraints to calculate structure
• Initial structure(s)
• Refine structure(s): identify additional constraints (side chains, additional long range contacts etc.), rMD calculations
• Structures
• Additional experiments: dynamics, mutants, interaction with target/drug
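The distance constraints that feed the workflow above come from NOE intensities. A common first estimate, sketched here, uses the isolated spin-pair approximation, in which the NOE intensity scales as r⁻⁶ and an unknown distance is calibrated against a cross-peak of known distance. This is an assumption, not the method of the slides, which note that spin diffusion must be accounted for. The reference distance of 2.45 Å (cytosine H5-H6) and the function names are likewise illustrative assumptions.

```python
NOE_LIMIT_A = 6.0  # approximate upper limit for observable NOEs, from the slides


def noe_distance(intensity, ref_intensity, ref_distance_a=2.45):
    """Estimate a proton-proton distance in Å from an NOE cross-peak.

    Isolated spin-pair approximation: intensity is proportional to r**-6,
    so r = r_ref * (I_ref / I) ** (1/6). The default reference distance is
    an assumed value for the fixed cytosine H5-H6 pair.
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance_a * (ref_intensity / intensity) ** (1.0 / 6.0)


def within_noe_range(distance_a, limit_a=NOE_LIMIT_A):
    """True if the distance is short enough to give an observable NOE."""
    return distance_a <= limit_a


if __name__ == "__main__":
    r = noe_distance(intensity=1.0, ref_intensity=64.0)
    print(f"estimated distance: {r:.2f} Å, observable: {within_noe_range(r)}")
```

Because of the sixth-root dependence, a 64-fold weaker cross-peak corresponds to only twice the distance, which is why NOE-derived distances are usually applied as loose bounds rather than exact values.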