Critical Reviews in Biochemistry and Molecular Biology, 26(2):151-226 (1991)

Local Supercoil-Stabilized DNA Structures

Max-Planck Institut für Biophysikalische Chemie, Göttingen, BRD and Institute of Biophysics, Czechoslovak Academy of Sciences, 61265 Brno, CSFR

Referee: James E. Dahlberg, Dept. of Physiological Chemistry, 587 Med. Sci. Bldg., University of Wisconsin, 1300 University Ave., Madison, WI 53706

ABSTRACT: The DNA double helix exhibits local sequence-dependent polymorphism at the level of the single base pair and dinucleotide step. Curvature of the DNA molecule occurs in DNA regions with a specific type of nucleotide sequence periodicities. Negative supercoiling induces in vitro local nucleotide sequence-dependent DNA structures such as cruciforms, left-handed DNA, multistranded structures, etc. Techniques based on chemical probes have been proposed that make it possible to study DNA local structures in cells. Recent results suggest that the local DNA structures observed in vitro exist in the cell, but their occurrence and structural details are dependent on the DNA superhelical density in the cell and can be related to some cellular processes.

KEY WORDS: supercoil-stabilized DNA structures, DNA double helix polymorphy, probing of DNA structure, DNA structure in cells.

I. INTRODUCTION

Until the end of the 1970s, it was generally accepted that the DNA double helix is very regular and independent of the nucleotide sequence. This conclusion was based mainly on data obtained by means of the X-ray fiber diffraction technique that had been used to study DNA structure for more than 2 decades. During the 1960s and 1970s, evidence based chiefly on the results of empirical techniques gradually mounted, suggesting that the structure of the DNA double helix is sequence dependent and influenced by environmental conditions.10 In the early 1970s Bram11,12 reached a similar conclusion based on his studies using X-ray fiber diffraction.
Due to its limited resolution, this technique yields only an averaged DNA conformation; it cannot detect local variations in the double helix induced by the particular nucleotide sequence.13 Using this technique and DNA samples with extremes of base composition, however, Bram12 was able to predict an almost infinite polymorphy of DNA in the B state. At about the same time, Pohl and Jovin obtained circular dichroism (CD) spectra of poly(dG-dC)·poly(dG-dC), which suggested that this polynucleotide at high salt concentrations assumes a structure differing from B-DNA and possibly left-handed. The untenability of the single DNA structure conception became obvious in the mid-1970s. Based on results obtained with various techniques, it was suggested that the DNA double helix is polymorphic, depending on the duplex nucleotide sequence and its anomalies as well as on environmental conditions.10 This conclusion, however, received little attention at the time of its publication.

1040-9238/91/$.50 © 1991 by CRC Press, Inc.

The situation changed dramatically by the end of the 1970s, when the first results from single-crystal X-ray analysis of short deoxyoligonucleotides were reported. Unlike fibers, crystals are ordered in three dimensions and can diffract X-rays at or near atomic resolution, providing substantially more data. Due to these factors, minute details of the DNA double helix can be observed. The crystal structure of d(pATAT) with different sugar-phosphate conformation at adenine and thymine residues reported by Viswamitra et al.16 in 1978 led to the proposal of an alternating B-DNA structure for poly(dA-dT)·poly(dA-dT) with a dinucleotide repeat unit. The left-handed Z-DNA structure of d(CGCGCG)19 and d(CGCG) crystals solved at high resolution came as a surprise and immediately transformed DNA structure studies into a flourishing field.
Shortly afterward it was shown that the structure of a DNA dodecamer d(CGCGAATTCGCG) was of the B type; its structure differed, however, from that deduced by studying DNA fibers. Particularly interesting was the dependence of local structure on the nucleotide sequence. At the present time close to 80 deoxyoligonucleotides containing four or more base pairs have been studied by single-crystal X-ray analysis, conclusively demonstrating the nucleotide sequence-dependent polymorphy of the DNA double helix.

Further aspects of DNA structure polymorphy were uncovered soon after the discovery of Z-DNA. These included sequence-directed DNA curvature and local DNA structures stabilized by supercoiling. In 1980, correlation analysis of chromatin DNA nucleotide sequence revealed that certain dinucleotides, e.g., AA or TT, occur at regular intervals correlated with the pitch of the DNA double helix. The regular in-phase occurrence of these dinucleotides was supposed to be responsible for the unidirectional curvature of DNA in chromatin. A few years later the anomalously slow electrophoretic mobility of a DNA fragment from Leishmania tarentolae kinetoplast was explained by DNA curvature. The determination of the nucleotide sequence of this fragment revealed runs of adenines and thymines in phase with turns of the DNA helix.

Cruciforms28-31 and left-handed DNA segments were among the earliest discovered supercoil-stabilized local DNA structures. A necessary condition for the formation of this type of structure is a suitable nucleotide sequence and a superhelix density sufficiently negative to stabilize the given structure. In the past decade great progress has been made in the elucidation of the relations between DNA supercoiling, local structures, and their dynamics, interactions, extrusion kinetics, and other properties (reviewed in References 31 and 34 to 36).
On the other hand, understanding the biological role of local structures has lagged considerably. Greater progress can be expected in this area within the next few years due to the recent development of research techniques for the study of DNA structure in the cell. This article deals mainly with local DNA structures stabilized by supercoiling in vitro and in the cell; other aspects of DNA structure polymorphy are briefly summarized.

II. MICROHETEROGENEITY OF THE DNA DOUBLE HELIX FORMS

Studies of the detailed relationships between nucleotide sequence and DNA structure became feasible by the end of the 1970s, when organic synthesis had been developed to the point that deoxyoligonucleotides could be produced in the purity and quantity necessary for the preparation of single crystals for X-ray diffraction (and nuclear magnetic resonance [NMR]) studies. Three main families of DNA forms were identified by crystallographic analysis of deoxyoligonucleotides (for review see References 13, 25, and 45 to 47): right-handed A- and B-forms and the left-handed Z-form.

A. A-, B-, and Z-Helices

The A-, B-, and Z-helices have distinctly different shapes that are due to the specific positioning and orientation of the bases with respect to the helix axis. In A-DNA the base pairs are displaced (0.4 nm) from the helix axis, the major groove is very deep, and the minor groove is very shallow. In B-DNA the major and minor grooves are of similar depths and the helix axis is close to the base pair center. In Z-DNA the minor groove is deep and the major groove is convex. In A- and B-DNA a single nucleotide can be considered as the repeat unit, while in Z-DNA the repeat unit is a dinucleotide. In A-duplexes base pairs are heavily tilted, in contrast to base pairs in B-duplexes, which are almost perpendicular to the helical axis. Definitions and nomenclature of nucleic acid structure parameters were published recently.48 The distinguishing averaged helical parameters of the DNA forms are given in Table 1.
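As a rough consistency check on the averaged parameters quoted in Table 1, one can assume an ideal, perfectly regular helix, in which the mean twist per base pair multiplied by the number of base pairs per turn gives one full turn (360°, negative for a left-handed helix), and the rise per base pair multiplied by the base pairs per turn gives the pitch. The short Python sketch below does this arithmetic. The numbers are those of Table 1; the names `TABLE_1`, `turn_angle`, and `pitch_from_rise` are illustrative and do not come from the original article.

```python
# Averaged helical parameters as quoted in Table 1 (rise and pitch in angstroms).
# For Z-DNA the repeat unit is a dinucleotide, so two twist values are listed
# (one per step type); negative twist denotes a left-handed helix.
TABLE_1 = {
    "A-DNA": {"bp_per_turn": 11, "rise": 2.9, "twist_deg": (32.7,), "pitch": 32.0},
    "B-DNA": {"bp_per_turn": 10, "rise": 3.4, "twist_deg": (36.0,), "pitch": 34.0},
    "Z-DNA": {"bp_per_turn": 12, "rise": 3.7, "twist_deg": (-10.0, -50.0), "pitch": 45.0},
}


def turn_angle(form):
    """Rotation (degrees) accumulated over one helical turn, assuming a regular helix."""
    p = TABLE_1[form]
    mean_twist = sum(p["twist_deg"]) / len(p["twist_deg"])
    return mean_twist * p["bp_per_turn"]


def pitch_from_rise(form):
    """Pitch (angstroms) implied by the rise per base pair and the base pairs per turn."""
    p = TABLE_1[form]
    return p["bp_per_turn"] * p["rise"]


if __name__ == "__main__":
    for form, p in TABLE_1.items():
        print(form, round(turn_angle(form), 1), round(pitch_from_rise(form), 1), p["pitch"])
```

Under this idealization the tabulated values are mutually consistent to within rounding: each form accumulates close to one full turn, and the computed pitch lies within about 1 Å of the tabulated pitch. Real oligonucleotide structures deviate locally from these averages, which is the subject of the following subsection.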
Many of the structural differences between the helices arise from the puckering of the sugar ring; C3'-endo is typical for A-DNA, while in Z-DNA C3'-endo alternates with C2'-endo. In B-DNA sugar pucker tends to favor the C2'-endo or C1'-exo, but the distribution of conformations is much broader than in A- and Z-DNA. The right-handed A- and B-forms have the anti glycosidic bond, whereas in the left-handed Z-helix the orientation alternates between syn (for purines) and anti (for pyrimidines). In the latter structure the orientation around the C4'-C5' bond with respect to the C3' atom alternates between gauche+ and trans conformations for cytidine and guanosine, respectively. The alternating features of Z-DNA result in the zigzag shape of its sugar-phosphate backbone, from which the name was derived. The changes in the backbone and glycosidic-bond conformations are accompanied by substantial variations in the stacking interactions between successive base pairs in Z-DNA. Methylation or bromination of cytosines at position 5 (studied mainly in oligonucleotides with alternating C-G sequence) stabilizes Z-DNA. Under certain conditions even nonalternating sequences of purines and pyrimidines can assume the Z conformation, with thymines in a syn orientation. The outer surface features of such a Z-helix are different at the nonalternating sites, but the backbone is similar to that observed with alternating sequences.

B. Local DNA Structure and Nucleotide Sequence

Average helix parameters for some right-handed structures are given in Table 2. The significant variations in some of these global parameters dependent on nucleotide sequence result in local changes along the DNA double helix. Such relations have been analyzed in detail by several authors and reviewed by Shakked and Rabinovich.13 In A- and B-DNA these variations seem to be determined mainly by the specific interactions between the stacked base pairs and also to some extent by neighboring bases.
In particular, homopolymer dinucleotide steps show a wide spectrum of stacking characteristics that are markedly neighbor dependent. On the other hand, pyrimidine-purine steps in A-DNA (especially the C-G steps) often display a low twist and high slide that are only slightly dependent on neighboring steps. In Z-DNA the shape of the helix surface changes significantly due to deviations in the regular alternation of the purine-pyrimidine sequence, while the sugar-phosphate backbone does not change.

TABLE 1
Comparison of A-, B-, and Z-DNA

                                 A-DNA(a)      B-DNA(a)      B1-DNA(b)     Z-DNA(c)
Helix sense                      right-handed  right-handed  right-handed  left-handed
Base pairs per turn              11            10            10            12 (6 dimers)
Helix twist (°)                  32.7          36.0          34.1, 36.8    -10, -50
Rise per base pair (Å)           2.9           3.4           3.5, 3.3      3.7
Helix pitch (Å)                  32            34            34            45
Base pair tilt (°)               13            0             0             -7
P distance from helix axis (Å)   9.5           9.3           9.1           6.9, 8.0
Glycosidic orientation           anti          anti          anti          anti, syn
Sugar conformation               C3'-endo      Wide range    C2'-endo      C2'-endo, C3'-endo

a Numerical values for each form were obtained by averaging the global parameters of the corresponding double-helix fragments.
b B1-DNA values are for a double helix backbone conformation alternating between conformational states I and II.
c The two values given correspond to CpG and GpC steps for the twist and P distance values, to cytosine and guanosine for the others.
d Two values correspond to the two conformational states.

From Kennard, O. and Hunter, W. N., Q. Rev. Biophys., 22, 327, 1989. With permission.

TABLE 2
Average Helical Parameters for Selected Right-Handed Structures

Column headings: Helix twist (°); Rise per base pair (Å); Base pair tilt (°); Propeller twist (°); Groove width (Å), minor and major; Displacement Da (Å).
A-form entries: d(GGTATACC), d(GGGCGCCC), d(CTCTAGAG), r(GCG)d(TATACGC), r(UUAUAUAUAUAUAA), Fiber A-DNA.
B-form entries: d(CGCGAATTCGCG), d(CGCGAATTBrCGCG), Fiber B-DNA.
(Only the headings and structure names of this table are reproduced here.)

BrC = 5-bromocytosine. Adapted from Kennard, O. and Hunter, W. N., Q. Rev. Biophys., 22, 327, 1989. With permission.
The effect of the nucleotide sequence on the fine geometrical features of each DNA form has been clearly demonstrated but not fully elucidated. The emerging rules, however, should be considered as tentative since they were based on a relatively small number of examples. The well-known "Calladine's rules"51 are now perceived to be incomplete and to neglect important factors other than the steric clash of purine rings.

C. DNA Hydration

Information about the organization of water molecules in DNA forms has recently been gained (reviewed in References 25, 52, and 53) from X-ray diffraction analysis of crystals. Distinct hydration patterns were observed in the major and minor grooves and around the sugar-phosphate backbone. It was proposed that in DNA with a mixed nucleotide sequence hydration of the backbone is related to global conformation. In A- and Z-DNA a chain of water molecules can bridge the phosphate oxygens along the backbone. There are more water molecules around each phosphate group in B-DNA, but almost no water bridges between the phosphate oxygens, as the distances between phosphate oxygens in this DNA form are too great to be spanned by a single water molecule. It appears that specific nucleotide sequences that create local changes in the DNA double helix may also affect the backbone hydration pattern.54,55 Even greater dependence of the hydration patterns on nucleotide sequence has been found in the DNA grooves. In A-DNA specific hydration patterns occur in the major grooves.25 A string of well-ordered water molecules hydrogen bonded to oxygen and nitrogen atoms in the minor groove has been found in the central AATT sequence of the B-DNA dodecamer d(CGCGAATTCGCG).56,57 This specific hydration of B-DNA, the "spine of hydration", significantly contributes to DNA stability.
Studies of further B-DNA helices (Table 3) revealed two ribbons of water molecules along the walls of wide regions of the minor groove, while narrow regions of the minor groove contained an ordered zigzag spine of hydration. It appears that the interdependence between nucleic acid structure and the solvent represents one of the [...]

TABLE 3
Summary Comparison of Properties of B-DNA Helices

Column headings: Nucleotide sequence; I.D. code(a); Resolution (Å); Minor groove width(b); Mean propeller twist(c); Minor groove hydration(d); Helix bend at (if present).
Sequences listed include CGCGAATTCGCG, CGCAAATTTGCG, CGCAAAAAAGCG, CGCAAAAATGCG, CGCGAATTBrCGCG, CGCATATATGCG, CGCGATATCGCG, GpsCGpsCGpsC, CCAAGATTGG, CCAACGTTGG, CGATCGATCG, and CCAGGCCTGG.
(Only the headings, sequences, and footnotes of this table are reproduced here.)

a Identification code usually from initials of first author: HD = Horace Drew; MC = Miquel Coll; HN = Hillary Nelson; AD = Anna DiGabriele; MK = Mary Kopka; CY = Chun Yoon; ZS = Zippora Shakked (personal communication); WC = William Cruse; GA, CG = Gilbert Privé; KK = Kazunori Yanagi and Kazimierz Grzeskowiak; UH = Udo Heinemann.
b W = wide minor groove (>12 Å), I = intermediate minor groove, N = narrow minor groove (<10 Å).
c H = high propeller twist (>15°), I = intermediate propeller twist, L = low propeller twist [...]

[...] Lk, was recently found in archaebacteria Sulfolobus acidocaldarius. In E. coli and other prokaryotes, positively supercoiled DNA can be generated during transcription in front of the transcription ensemble.145

C. Physical Properties of Supercoiled DNA Molecules

Properties of supercoiled DNA and DNA interactions with single-strand binding proteins as well as with other proteins were studied by means of physical and physicochemical techniques. The recent results of Raman spectroscopy measurements of supercoiled and nicked ColE1 DNA suggest that accommodation of supercoiling occurs mainly in AT base pairs and backbone moieties.161 Premelting effects might account for these changes, including a slight change in the band known to be responsible for base pair breakage. The major part of the alteration of the backbone geometry takes place in the C-O linkage between the C5' and adjacent phosphate group. Raman spectra obtained with supercoiled pBR322159 and its derivative pFb100 did not completely agree with those of ColE1 DNA; the reasons for this disagreement are not known. It is tempting to suggest that the presence of AT-rich C-inducing sequences31 in ColE1 (see Section VI.A) might be connected with the observed Raman spectra of this DNA. Positively supercoiled pBR322 induced a negative contribution to CD of the main bands at 270 and 187 nm and showed a higher electrophoretic mobility when compared with the negatively supercoiled topoisomeric sample (σ = 0.07 and -0.07, respectively). As expected, cruciform structure did not extrude in the positively supercoiled DNA.

D. Shape of Supercoiled DNA Molecules

Negatively supercoiled DNA can exist in two basic forms: solenoidal (toroidal) and plectonemic (interwound). The latter form has been assumed to be the most probable form of negatively supercoiled DNA free in solution. Solenoidal supercoiling can be exemplified by the wrapping of DNA around histones in nucleosomes. There are some important differences between these two forms. The solenoidal form is left-handed, while the plectonemic (interwound) form is right-handed.
Plectonemic supercoiled DNA can be naturally branched, bringing together nucleotides that are distant from one another in the primary structure; this is not the case with solenoidal supercoils. It is probable that these two forms are in a dynamic equilibrium in eukaryotic cells. Not all protein-bound supercoils must necessarily be solenoidal, as the supercoil form is dictated by the winding surface; a well-characterized plectonemic DNA/protein complex is represented by the synaptic intermediate of the Tn3 resolvase.128,146

Recent computer statistical-mechanical simulations of moderately and highly supercoiled DNA molecules (treating supercoiled DNA within the wormlike model with excluded volume) showed irregularly shaped molecules with characteristics of branched interwound helices at higher superhelix densities. The calculations showed that the quadratic dependence of the superhelical free energy on Lk (Equation 4) is valid for a variety of conditions but is not universal. Significant deviations can be expected at high superhelix density under ionic conditions where the effective diameter of DNA is small.

Experimental data concerning the shape of supercoiled DNA in solution are not free of ambiguity. The results of small-angle X-ray scattering are consistent with the toroidal shape but not with an interwound shape,149,150 while dynamic light scattering and neutron diffraction in liquid crystalline solution158 favor the interwound superhelical structure. Most electron microscopy work displayed images consistent with the interwound model, but changes in the DNA shape due to DNA adsorption and drying on the supporting film were not excluded. Quite recent observations by Adrian et al.164 made by cryoelectron microscopy of vitrified specimens demonstrated the interwound form of the protein-free negatively supercoiled DNA.
These authors also showed that the shape of the supercoiled DNA molecules was strongly affected by Mg2+ ions; in 10 mM Tris the diameter of the pUC18 DNA molecules was about 12 nm, while it was reduced to 4 nm in 10 mM MgCl2. Approximate values of the partition of the linking difference (ΔLk) between the changes of writhe (ΔWr) and twist (ΔTw) were calculated from a single projection of a pUC18 topoisomer with ΔLk = -12 (in Tris solution with no added MgCl2). The partition between ΔTw and ΔWr was estimated to be between 1:3 and 1:4. These are the first data on ΔTw and ΔWr partitions based on direct measurements.

E. Topoisomerases

Supercoiling is controlled by enzymes called topoisomerases (reviewed in References 166 to 168) that may be divided into two classes. The type I enzymes do not require an external source of energy; the transient interruption of a DNA phosphodiester bond is accompanied by the formation of an enzyme-DNA covalent intermediate that conserves the free energy of the bond. This energy is then used for resealing the broken strand after its rotation around the unbroken strand. Type II enzymes transiently break both strands, forming two covalent bonds between the DNA duplex and the enzyme molecule. The latter type of enzymes usually require ATP energy. Besides their basic ability to modify DNA linking number, topoisomerases can perform a number of topological reactions involving both double- and single-stranded DNAs, folding them into various kinds of topological knots and catenanes.

V. METHODS OF ANALYSIS OF LOCAL DNA STRUCTURES

In natural supercoiled DNA molecules such as plasmids, local structures represent only a very small part of the whole molecule, usually around 1%. Application of conventional physical techniques to the study of these molecules is thus very limited.
In the last decade new methods were developed that have made it possible not only to detect a local structure in the DNA molecule, but also to provide its exact location and information about its chemical reactivity at single-nucleotide resolution. In the early studies, analysis of DNA electrophoretic mobility, antibodies, and enzymatic probes represented the main tools in local DNA structure research. In the past 5 to 7 years chemical probes have been increasingly applied, partially replacing the enzymatic probes.

A. Analysis of DNA Electrophoretic Mobility

Individual topoisomers can be resolved on agarose gels (see Section IV). It was shown169 that the electrophoretic mobility of circular DNA molecules containing long, inverted repeats increased regularly with supercoiling to a certain threshold level and then collapsed back. This collapse corresponded to the extrusion of the cruciform structure, presumably reflecting the decrease in writhing that accompanies the extrusion. In this way gel electrophoresis can yield information about cruciform extrusion and other transitions in supercoiled DNA molecules.

In 2-D gel electrophoresis a family of topoisomers in which ΔLk ranges from about zero to relatively large negative values is prepared first. Then electrophoresis is performed in the first dimension, followed by electrophoresis in the direction perpendicular to that of the first dimension. The electrophoresis in the second dimension is carried out in the presence of an intercalator, chloroquine, which reduces the twisting number Tw in all topoisomers. This decreases the supercoil-induced stress, so that any DNA segment that might have turned into a cruciform or another unusual structure reverts to its original B-form DNA. Thus, in the second dimension the mobility is governed only by the Lk of the topoisomers.
As a result of this procedure an arc of topoisomers is formed in which a discontinuity is indicative of the local structural transition. From the results of 2-D gel electrophoresis, the superhelix density necessary for the structural transition as well as the change in Tw induced by this transition can be calculated. In an experiment performed by Wang et al.,171 for example, due to the flipping of a (dG-dC)16 segment from a 10.5-fold helix into a left-handed 12-fold Z-helix, the observed ΔTw was equal to 6, corresponding well to the expected decrease by (32/10.5) + (32/12), or 5.7. In addition, coupling between the free energy changes and local structural transitions in supercoiled DNA molecules can be studied by 2-D gel electrophoresis, which has been applied to studies of cruciforms, Z-DNA,171,174 and triplexes175 in supercoiled plasmids. 2-D gel electrophoresis performed at various first dimension temperatures provided a sensitive assay for the detection of local structural transitions in closed duplex DNAs.

The possibility of applying electric fields in two orientations was utilized in a different way in pulsed field electrophoresis, developed over the past several years for the resolution of very large DNA molecules (ranging from 20 to 2000 kb). This technique makes possible the separation of intact chromosomal DNA molecules from lower eukaryotes as well as large human chromosomal DNA fragments (References 179 to 181 and references therein) and can also be applied to studies of closed circular DNA molecules and their [...] Another method introduced recently is denaturing gradient gel electrophoresis, using solvents184 (urea, formamide), temperature, or their combination as denaturing agents.
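Returning to the 2-D gel example of Wang et al., the expected change in twist for the B-to-Z flip can be reproduced with a few lines of arithmetic. The Python sketch below is illustrative only. It assumes a 32-bp segment (as implied by the numerators in the expression quoted in the text), ideal helical repeats of 10.5 bp per turn for the right-handed B-helix and 12 bp per turn for the left-handed Z-helix, and it ignores any contribution from the B-Z junctions. The function name `expected_delta_tw` is hypothetical.

```python
def expected_delta_tw(n_bp, bp_per_turn_b=10.5, bp_per_turn_z=12.0):
    """Change in twist (in helical turns) when n_bp flip from B-DNA to Z-DNA.

    Right-handed turns are counted as positive and left-handed turns as
    negative. Junction effects are deliberately neglected (an assumption).
    """
    tw_b = n_bp / bp_per_turn_b    # right-handed turns lost from the B segment
    tw_z = -n_bp / bp_per_turn_z   # left-handed turns gained in the Z segment
    return tw_z - tw_b


if __name__ == "__main__":
    delta = expected_delta_tw(32)
    print(round(delta, 1))  # prints -5.7
```

The magnitude, about 5.7 turns, matches the expected decrease quoted in the text and is close to the observed value of 6.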
This technique was applied to the study of early melting of supercoiled DNAs.188 Increased application of denaturing gradient gel electrophoresis can be expected in the near future, as this technique has many assets, including the possibility of studying DNA structural transitions even in partially purified samples and in cell extracts containing small amounts of DNA.

The procedure based on the difference in electrophoretic mobility between a specific DNA-protein complex and free DNA is widely known as gel retardation analysis.189,190 In the last decade this technique has been increasingly applied in DNA-protein interaction studies, including interactions with antibodies, RNA polymerase, etc. Practical protocols for gel retardation experiments and reviews of recent applications and of factors affecting the lifetime and mobilities of protein-DNA complexes were published within the last 2 years. The gel retardation technique has been widely used in studies of DNA curvature (see Section III). The combination of curved stretches in phase with noncurved segments makes it possible to detect small structural changes involving changes in the DNA helix axis.198

B. Antibodies Recognizing Local DNA Structures

Nucleic acid immunochemistry provides valuable reagents for studies of complex biological materials and of purified nucleic acids (reviewed in References 199 to 201). Most abundant nucleic acids are only weakly immunogenic, whereas chemically damaged nucleic acids and less usual DNA structures are more effective in inducing antibody formation. The low immunogenicity of B-DNA may be due to its rapid degradation by serum nucleases. On the other hand, antibodies reacting with B-DNA were found in sera of patients with systemic lupus erythematosus. Monoclonal antibodies have been isolated from hybridomas derived from mice with a disease similar to human lupus erythematosus.
Some of them showed an ability to recognize B-DNAs with different nucleotide sequences. Local DNA denaturation can be recognized by antibodies induced by insoluble complexes of denatured DNA with methylated bovine serum albumin or by conjugates of bases, nucleosides, or nucleotides with proteins.

The most prominent subject of DNA structure immunochemical studies has been left-handed Z-DNA (reviewed in References 34, 199 to 201, and 205). Antibodies have been induced mainly by brominated or methylated poly(dG-dC)·poly(dG-dC) in Z-form. Monoclonal antibodies recognizing various nucleotide sequences or a single sequence have been generated.

Antibodies to triplex structures have been induced by triple-stranded polyribonucleotides and mixed polyribo- and polydeoxyribonucleotides.205 Lee et al.206 recently generated a monoclonal antibody against poly(dT-d5mC)·poly(dG-dA)·poly(dT-d5mC) capable of forming a triplex at neutral pH.207 By use of this antibody, the presence of triplex structures in supercoiled plasmids206,208 at pH 5 as well as in cells and [...] was demonstrated. Triplex structures were also sensitively detected by immunoblotting. Monoclonal antibodies against cruciform structures have also become available.210

Antibodies represent a very useful tool in local DNA structure research due to their high specificity and sensitivity and their applicability in complex biological media. On the other hand, at higher concentrations of antibodies, the equilibrium between a B-form and a specific local DNA structure recognized by the antibody (e.g., Z-DNA) can be strongly shifted in favor of the latter structure. In cytological studies antibodies are applied after fixation, which may significantly influence the presence of the DNA structure detected by the antibody. These facts should be kept in mind when using DNA structure-specific antibodies to report the presence of local DNA structures in various biological materials (see Section VII).

C. Enzymatic Probes

The application of enzymes for probing local DNA structures is based on the ability of some nucleases to recognize and cleave DNA at sites with a more or less single-stranded character without cleaving the intact B-DNA. Properties of some single-strand selective nucleases were previously summarized.10 Among these enzymes, nuclease S1 has been most frequently used for local DNA structure studies.211,212 This enzyme cleaves DNA (at acid pH in the presence of Zn2+ ions) in the cruciform loops, at the B-Z junctions,213 at triplex sites,175 etc. It appears that the enzyme recognizes distortions in the sugar-phosphate backbone, but the exact structure of the sites is not known. It has been shown that nuclease S1 does not cleave at a single base mismatch, but two mismatches are sufficient for cleavage. Mung bean nuclease displayed properties similar to those of S1 nuclease; however, it has been used in local DNA structure studies to a lesser extent. In contrast to these two enzymes, endonuclease from Neurospora crassa and P1 nuclease can be applied at neutral pH. Interest has increased in the latter enzyme, which is commercially available.226-228

Enzymes do not seem to be ideal DNA probes for many reasons, including the limitation of their application to the conditions of optimum enzyme activity, their mostly unknown mechanism of action, the possibility of induction of secondary structural changes in DNA, etc. For these reasons, attempts have been made to find other DNA probes.

D. Chemical Probes of the DNA Structure

A common feature of the local DNA structures studied to date is the presence of bases that, compared with the bases in the B-form, have better accessibility for interaction with the environment, e.g., bases in the cruciform loop, at the B-Z junction, etc. One may thus expect that these bases will display enhanced reactivity toward some chemicals.
In the beginning of the 1980s229,230 we looked for a chemical probe of the DNA structure that could (1) bind specifically to single- and distorted double-stranded DNA regions under nearly physiological conditions, (2) form a covalent bond sufficiently stable even after the removal of the unreacted agent, (3) form a DNA adduct that is easily detectable even if only a very small fraction of bases in the DNA molecule undergo the reaction, and (4) be potentially applicable for studies of DNA structure in situ. We have shown that osmium tetroxide, pyridine reagent (Os,py) fulfills the above requirements.

In addition to single-strand selective probes, chemicals that react with double-stranded B-DNA, such as dimethylsulfate (see Section V.D.2) and agents with nuclease activity, can be applied to DNA structure studies using mainly the footprinting approach (comparing the reactivities of DNA itself and its complex with a protein molecule or some other ligand). Photochemical reagents represent another class of DNA structure probes. We confine ourselves here mainly to the single-strand selective probes that have been demonstrated to have specific advantages in the study of local supercoil-stabilized DNA structures. Other probes are mentioned only briefly.

1. Single-Strand Selective Probes

Table 4 shows the most important probes that react preferentially with single-stranded and some non-B DNA regions. These probes can be subdivided into two groups: probes reacting with sites involved in Watson-Crick hydrogen bonding and probes reacting with bases at other sites. The former group includes chloroacetaldehyde (CAA) (Figure 1c), bromoacetaldehyde (BAA), glyoxal (Figure 1d), and N-cyclohexyl-N'-β-(4-methylmorpholinium)ethylcarbodiimide-p-toluene (CMC) (Figure 1e); these probes cannot react within an intact B-DNA as they require hydrogen bond breakage (Figures 1c to e).
The latter group includes osmium tetroxide complexes, KMnO4, diethylpyrocarbonate (DEPC), hydroxylamine, and methoxylamine. With the exception of DEPC, their reactions involve mainly C5 and C6 of the pyrimidine ring. Nonreactivity of bases in B-DNA is probably due to steric reasons. All of the above-mentioned probes are highly specific, showing almost no reaction with B-DNA, provided the reaction conditions are properly chosen.

The reaction sites in the DNA chains can be detected in various ways. Polyclonal antibodies were elicited in rabbits for the reaction products of the following probes: bisulfite, O-methylhydroxylamine mixture,235 CMC,236 osmium tetroxide, pyridine (Os,py), and osmium tetroxide, 2,2'-bipyridine. Monoclonals against the latter DNA adduct were generated in 1990.237 The availability of these antibodies opens up new possibilities in DNA structure studies, especially with respect to the detection of open local structures in situ, and makes possible the determination of specific DNA adducts at high sensitivity.

Thus far, osmium tetroxide complexes, DEPC, and BAA (CAA) have been applied to the largest extent to research local DNA structures. Properties of these probes are discussed in greater detail in the following paragraphs.

TABLE 4
Chemical Probes of the DNA Structure Reacting Preferentially with Single-Stranded and Non-B-DNA Regions

Probe             Base specificity    Ref.
A. Probes reacting with sites involved in Watson-Crick hydrogen bonding
BAA, CAA          C, A                268, 286, 287, 288, 290, 291, 308, 309
Glyoxal           G                   264, 293, 310
CMC               T, G                294
B. Probes reacting in other sites
OsO4 (alone)
DEPC
Hydroxylamine
Methoxylamine
NaHSO3
Ozone

Note: Bold numbers refer to papers containing important methodological aspects; others represent examples of the probe application. Further references are given in text.

a. Osmium Tetroxide and Its Complexes
Osmium(VIII) tetroxide is a versatile electron acceptor and an effective reagent for the cis hydroxylation of alkenes under stoichiometric conditions (reviewed in Reference 238). In DNA, osmium(VI) esters are formed (Figure 2) via addition to the 5,6 double bond of the pyrimidine ring.37,239 In the 1970s this reaction was utilized for the introduction of a heavy metal stain with the intention of developing an electron microscopic method of DNA sequencing.240 Ligands profoundly alter the nature of the osmium tetroxide reaction, changing the structures, kinetics of formation, and hydrolytic stability of the products. The reactions of Os,py with monomeric nucleic acid components have been studied in detail, chiefly by Behrman et al.239,241 Os,py reacts most readily with thymine moieties in single-stranded DNA.38 The structure of the thymine-Os,py cis ester (Figure 2) was determined by single-crystal X-ray diffraction analysis; osmium in this ester is approximately octahedral.

FIGURE 1. Formation of the adducts between DNA bases and single-strand selective probes: (a) osmium tetroxide (alone), (b) diethyl pyrocarbonate, (c) chloroacetaldehyde, (d) glyoxal, (e) N-cyclohexyl-N'-β-(4-methylmorpholinium)ethylcarbodiimide-p-toluene (CMC); ..., hydrogen bonding in the Watson-Crick base pair.

Osmium binding sites in DNA chains can be detected in various ways (Figure 3), including detection at single-nucleotide resolution based either on the labilization of the sugar-phosphate backbone (at the site of thymine osmate ester formation) to piperidine cleavage,38,244-246 on the ability of the adduct to terminate transcription, or on DNA primer extension in vitro.
Adducts of DNA and RNA with various osmium tetroxide complexes are reducible at the mercury electrode, producing catalytic currents at negative potentials; these adducts can be determined at low concentrations by means of polarographic (voltammetric) techniques.

In the B-DNA double helix the target C5-C6 double bond of thymine is located in the major groove, where it is not accessible to the bulky electrophilic osmium probe.231 Local changes in helix geometry may render the C5-C6 double bond accessible to the out-of-plane attack of the Os,py molecule on the base π-orbitals. Such changes include base unstacking in the four-way junction at low ionic strengths, single-base mismatches and bulges, local changes in twist in (A-T)n sequences,260 helix distortions in the vicinity of single-strand interruptions, premelting of AT-rich sequences in supercoiled DNA,261 etc. Os,py has shown an ability to recognize minute local changes in the DNA molecule that are not detectable by chemical probes such as BAA, CAA (both requiring rupture of Watson-Crick hydrogen bonds)261 (Figure 1c), or DEPC. Within a short time Os,py has proven to be a useful probe of DNA structure in vitro.

FIGURE 2. (a) Formation of the adduct between thymine and osmium tetroxide, pyridine; ..., hydrogen bonding in the Watson-Crick base pair; (b) some ligands which can replace pyridine in the osmium complex: I, tetramethylethylenediamine (TEMED); II, 2,2'-bipyridine (bipy); III, 1,10-phenanthroline (phe); IV, bathophenanthroline (4,7-diphenyl-1,10-phenanthroline) disulfonic acid (bpds).

FIGURE 3. Scheme of treatment of supercoiled DNA with osmium tetroxide, pyridine (Os,py), or 2,2'-bipyridine (Os,bipy) and mapping of the osmium binding sites. To map the osmium binding sites, DNA is cleaved by restrictase (RI) followed (1) by cleavage with hot piperidine and sequencing using the principles of the Maxam-Gilbert method; (2) by primer extension or transcription in vitro (terminated at modified bases in the template) and nucleotide sequencing; (3) by digestion with single-strand selective nuclease and (nondenaturing) gel electrophoresis of the resulting DNA fragments; or (4) if a structural distortion (e.g., in the B-Z junction) is expected to occur within the recognition site of a particular restrictase (RII), the site-specific modification of the distortion can be manifested by inhibition of the restriction cleavage; (5) osmium binding in plasmids and DNA fragments can be determined by immunoassay (e.g., by DNA gel retardation, immunoblotting, or ELISA). Modification of large DNA regions may result in partial or full relaxation of the molecule (accompanied by changes in the electrophoretic mobility) and formation of a denaturation "bubble" visible by electron microscopy; the amount of bound osmium can be determined electrochemically.

At high pyridine concentrations in the Os,py reagent, low ionic strengths, and long reaction times, the initial attack of the probe may be followed by the formation and propagation of a "denaturation bubble" manifested by changes in the DNA electrophoretic mobility.231,233,234 Large "bubbles" can be visualized in the electron microscope. Under properly chosen reaction conditions (e.g., 1 mM osmium tetroxide, 2% pyridine, in 0.2 M NaCl, 25 mM Tris, 5 mM EDTA, pH 7.8, 15 min at 26°C), secondarily induced changes are negligible; under these conditions, site-specific modification at the B-Z junctions,245,246,263 cruciform loops, protonated triplexes, etc. was detected without any sign of changes induced by the probe.
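To make the example reaction conditions more concrete, the following minimal sketch converts "2% pyridine" into a molar concentration and compares it with 1 mM osmium tetroxide. It assumes that the percentage is volume per volume and uses handbook values for the density and molar mass of pyridine; neither assumption comes from the text, and the function names are illustrative only.

```python
# Assumed handbook values for pyridine (not stated in the text).
PYRIDINE_DENSITY_G_PER_ML = 0.982      # near room temperature
PYRIDINE_MOLAR_MASS_G_PER_MOL = 79.10


def pyridine_molarity(percent_v_v: float) -> float:
    """Molar concentration of pyridine for a given % (v/v), assuming ideal mixing."""
    ml_per_litre = percent_v_v / 100.0 * 1000.0
    grams_per_litre = ml_per_litre * PYRIDINE_DENSITY_G_PER_ML
    return grams_per_litre / PYRIDINE_MOLAR_MASS_G_PER_MOL


def molar_excess(ligand_molar: float, osmium_millimolar: float) -> float:
    """Ratio of ligand concentration to osmium tetroxide concentration."""
    return ligand_molar / (osmium_millimolar / 1000.0)


pyridine_m = pyridine_molarity(2.0)          # 2% pyridine from the example conditions
excess = molar_excess(pyridine_m, 1.0)       # 1 mM osmium tetroxide
print(f"pyridine: {pyridine_m:.3f} M, excess over OsO4: {excess:.0f}-fold")
```

Under these assumptions the ligand is present at roughly a quarter molar, i.e., in a several-hundred-fold molar excess over the osmium tetroxide.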
CD measurements showed no changes in the CD band in the initial stage of the Os,py reaction with calf thymus DNA.231 Marked changes in CD and a steep increase in the amount of modified bases occurred under the given conditions only after more than 34 h of the reaction.231 It may thus be concluded that Os,py is suitable as a probe of DNA structure, but the reaction conditions must be carefully chosen if reliable data are to be obtained. On the other hand, the ability of pyridine to destabilize the DNA double helix at higher concentrations may help to detect some very small changes in DNA structure.38,230,231,266,267

In the Os,py reagent osmium tetroxide is usually applied at 1 to 2 mM concentrations, while the concentration of pyridine is higher by two orders of magnitude. Replacing monodentate pyridine in the Os,py reagent by bidentate 2,2'-bipyridine (bipy) results in more stable adducts with DNA and makes it possible to work with osmium tetroxide and bipy at equimolar concentrations; even submillimolar concentrations of Os,bipy are sufficient for the detection of local open structures such as the B-Z junctions. In contrast to Os,py, no secondary effects were observed with Os,bipy. Os,bipy can be used to probe the DNA structure in E. coli cells (see Section VII).

To improve the versatility of the osmium probes we recently tested different ligands for their ability to site-specifically modify the B-Z junction and the cruciform loop in supercoiled plasmids. In addition to Os,bipy (Figure 2), complexes of osmium tetroxide with bathophenanthrolinedisulfonic acid (bpds) and tetramethylethylenediamine (TEMED) can site-specifically modify the above-mentioned structures at millimolar and submillimolar concentrations. Under the same conditions, a complex of osmium tetroxide with 1,10-phenanthroline (phe) displayed a lower specificity and also reacted at other sites of the supercoiled and linear DNA molecules.
Os,phe may thus become useful in DNA footprinting in vitro and in situ. Differences in size and other properties of Os,bipy, Os,bpds, Os,TEMED, and Os,phe molecules may come into play in the selectivity of these probes for specific DNA structures in vitro and in the probe penetration into the cell and transport to its target DNA structure. The great diversity in the properties of osmium(VI) esters offers a number of possibilities (so far largely unexploited) that are not available in other chemical probes.

b. Diethylpyrocarbonate (DEPC)

DEPC, an enzyme inhibitor and bactericidal agent, has been applied in nucleic acid research (reviewed in Reference 271) and protein research, due mainly to its ability to react chemically with some amino acids and nucleic acid bases. DEPC carbethoxylates N7 of purines (Figure 1b) in DNA, but under certain conditions reactions at other sites and with other bases may occur. In RNA, purines in single-stranded regions are accessible to this reagent, while those contained in double-stranded regions do not react. It has been shown that purines in syn conformation in left-handed Z-DNA react with DEPC much faster than those in B-DNA.277,278 Enhanced reactivity of Z-DNA toward DEPC provided a new approach to the studies of the formation and distribution of Z-DNA segments within a DNA molecule. DEPC also reacts with single-stranded DNA, showing a specific reaction with bases in the cruciform loop.278,280 As a result of carbethoxylation and ring opening (Figure 1b) the N-9 glycosidic bond is labilized, and alkali treatment causes chain scission at the modified base site. This makes it possible to detect modified bases (as with Os,py and Os,bipy) at single-nucleotide resolution using DNA sequencing techniques. Other ways of detecting the reaction sites in DNA strands have been little exploited.
DEPC has been used for footprinting DNA-protein complexes;281 however, this type of experiment should be interpreted with caution, as DEPC reacts much faster with proteins than with single-stranded nucleic acids. DNA footprinting is referred to again in Section V.D.2.

c. Bromo- (BAA) and Chloroacetaldehyde (CAA)

BAA and CAA react with adenine and cytosine,283-285 forming fluorescent cyclic etheno derivatives (Figure 1c). BAA was applied to study local changes in the DNA structure286-288 using nuclease S1 to cleave DNA at the BAA-modified sites. Both BAA and CAA were used to detect unpaired bases in cruciform loops,287 B-Z junctions,268,289 and triplex structures. For several years it was difficult to obtain a pattern at single-nucleotide resolution by means of chemical cleavage at the modified sites. Only recently has a chemical cleavage procedure been developed that gives a resolution comparable to that obtained with DEPC and Os,py. It has been generally accepted that CAA and BAA react only with bases not included in Watson-Crick base pairing. Recent results showing CAA reactivity within certain stretches of Z-DNA291 might be explained by a CAA reaction with a specific "open" state of the (dC-dA)n segments related to the dynamic equilibrium between B- and Z-DNA (see Section VI.B.2).

d. Advantages and Disadvantages

Chemical probes (Table 4) have some advantages over enzymatic probes:

1. They usually can be applied under a wide range of conditions, including pH, temperature and ionic strength values, presence of nonaqueous solvents, etc.
2. Their molecules are smaller and can diffuse to various parts of the DNA structure.
3. Their structures and reaction mechanisms are known and induction of secondary changes in DNA structure is less probable than with large protein molecules. Chemical probes differ in their specificity toward bases (Table 4, Figures 1 and 2) and their functional groups; proper combination of the chemical probes can thus provide detailed information about spatial organization of the local DNA structure.
4. They do not induce DNA chain scissions, making possible the occurrence of simultaneous reactions at several sites of the supercoiled DNA molecule.
5. The reaction sites in DNA strands can be detected in various ways (Figure 3).
6. They are potentially applicable to probe DNA structure in vivo.

Chemical probes also have some disadvantages: all chemical probes reacting with DNA bases are potentially mutagenic, and working with these reagents may represent a health hazard. CAA is carcinogenic;284,285 DEPC reacts with ammonia, producing the carcinogenic urethane; OsO4 vapors irritate eyes and mucous membranes; etc. The reagents should thus be handled with care.

2. Probes Reacting with Double-Stranded DNA

There are a number of chemicals capable of reacting with double-stranded DNA (reviewed in References 318 to 324), but only those that have found significant application in the study of local DNA structure are mentioned here.

a. Dimethylsulfate (DMS)

In B-DNA, DMS methylates N-7 of guanine and N-3 of adenine, i.e., the sites not involved in Watson-Crick hydrogen bonding. In single-stranded DNA, N-1 of adenine and to a lesser extent N-3 of cytosine are also methylated. In Hoogsteen pairing N-7 is involved in hydrogen bonding; it does not react with DMS.
This makes DMS suitable for probing structures involving Hoogsteen pairing, such as protonated triplexes. DMS also yields information about ligand-guanine contacts in the major groove of B-DNA, where N-7 of guanine is located.329-331

N-Ethyl-N-nitrosourea (ENU) reacts with DNA backbone phosphates, forming triesters. The reaction sites can be recognized by alkaline hydrolysis of the triesters.332 Ethylation of phosphates has been used for interference studies of ligand interactions with DNA phosphates.332,333 The technique does not detect direct contacts in the DNA complex, but rather those phosphates that are located too close to the ligand to accommodate the ethyl group. The conditions of the reaction are far from physiological, thus significantly limiting the use of ENU in DNA studies.332 Modification of DNA with N-methyl-N-nitrosourea (MNU) under physiological conditions showed that the initial attack of MNU is strongly dependent on DNA conformation, suggesting that in addition to phosphates, bases might be involved in the reaction.

c. Probes with "Nuclease" Activities

Some chemicals are capable of inducing DNA chain scission, thus simulating to a certain extent the behavior of natural nucleases. Nucleases such as DNase I, DNase II, and micrococcal nuclease were applied in DNA footprinting, i.e., the technique used in the analysis of specific protein and drug binding to DNA. In footprinting, DNA fragments are cleaved with a nuclease and the products are analyzed using the Maxam and Gilbert technique,335 in which the sites protected with specifically bound protein or drug molecules are observed as gaps in the DNA cleavage pattern.
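The idea of reading a footprint as a gap in the cleavage pattern can be illustrated with a small sketch. It assumes that band intensities have already been quantified per nucleotide position for free DNA and for the complex; the function name, the 0.3 ratio threshold, the minimum run length, and the example numbers are all hypothetical choices made for illustration and are not taken from the text.

```python
from typing import List, Tuple


def protected_regions(free: List[float], bound: List[float],
                      max_ratio: float = 0.3,
                      min_length: int = 3) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs of consecutive positions where
    cleavage in the complex drops to max_ratio (or less) of that in free DNA."""
    if len(free) != len(bound):
        raise ValueError("both lanes must cover the same positions")
    regions = []
    start = None
    for i, (f, b) in enumerate(zip(free, bound)):
        is_protected = f > 0 and b / f <= max_ratio
        if is_protected and start is None:
            start = i                      # a gap in the pattern begins here
        elif not is_protected and start is not None:
            if i - start >= min_length:    # ignore very short, noisy dips
                regions.append((start, i - 1))
            start = None
    if start is not None and len(free) - start >= min_length:
        regions.append((start, len(free) - 1))
    return regions


# Hypothetical band intensities (arbitrary units) along ten positions.
free_lane = [1.0, 0.9, 1.1, 1.0, 0.8, 1.0, 1.2, 0.9, 1.0, 1.1]
bound_lane = [0.9, 1.0, 0.2, 0.1, 0.1, 0.2, 1.1, 1.0, 0.9, 1.0]
print(protected_regions(free_lane, bound_lane))   # positions 2 to 5 are protected
```

Real footprinting data would need lane normalization and background correction before such a comparison; the sketch only shows the logic of locating the protected stretch.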
In addition to natural nucleases, chemicals have been applied recently to cleave DNA, including 1,10-phenanthroline-copper ion,322 methidiumpropyl-EDTA, and iron(II)·EDTA.320 The two latter chemicals generate the hydroxyl radical (·OH), which induces the chain cleavage. In contrast to methidiumpropyl-EDTA, iron(II)·EDTA does not bind to DNA, which is cut by freely diffusible ·OH. With iron(II)·EDTA a footprint at very high resolution can be obtained. On the other hand, specific binding of phenanthroline-copper to B-DNA can be utilized in DNA structure studies.322 This chemical cleaves A-DNA more slowly than B-DNA, while Z-DNA and single-stranded DNA are not cleaved under the same conditions. More details can be obtained in References 332 and 336 to 341.

d. Photochemical Probes

Psoralens intercalate within the double-helical DNA and on irradiation they are covalently added to the thymine 5,6 double bond.342 One psoralen molecule can photoreact with two thymines at nucleotide sequences containing adjacent thymines in opposite strands, creating a DNA interstrand crosslink. The crosslinking sites can be analyzed by electron microscopy. To improve the limited resolution of this technique, enzyme-based procedures were recently developed that make it possible to analyze sites of psoralen adducts at single-nucleotide resolution.

Barton et al.324,347-350 developed a series of transition metal complexes that bind specifically to local DNA structures; their photoactivation results in a site-specific DNA strand cleavage. For example, the tris(phenanthroline) metal complexes are favored for intercalation into right-handed helices, while the bulkier isomer of tris(4,7-diphenylphenanthroline)ruthenium binds to left-handed Z-DNA or to any other conformation that is unwound sufficiently to accommodate the bulky complex.
A rhodium(III) complex binds to the cruciform and upon photoactivation it cleaves DNA at and near the cruciform. A probe for A-DNA was developed by matching the shape of the ruthenium(II) complex to that of the shallow minor groove of the A-DNA form.

e. Complementary Addressed Modification and Cleavage of DNA

The ability of DNA strands to renature and hybridize was recognized more than 30 years ago.351-353 Utilization of this principle has become one of the main driving forces in the development of specific molecular biological techniques, one of which is the technique of complementary addressed modification and cleavage of DNA. Here, a relatively short oligonucleotide equipped with a chemically reactive group, intercalating agent, and/or nucleic acid cleaving agent binds to its complementary target sequence in single-stranded DNA or RNA (reviewed in References 354 to 359). In double-stranded DNA, sequence-specific modification with alkylating oligonucleotides was observed in a negatively supercoiled plasmid but not in its relaxed form.360 The double helix in relaxed DNA molecules can, however, be a target for pyrimidine (or purine) oligonucleotides that recognize homopurine·homopyrimidine (homopu·py) sequences via the major groove and form triplex structures. The oligonucleotide is bound in a parallel orientation to the purine strand of the DNA double helix. An oligonucleotide of about 15 to 19 nucleotides should be sufficiently long to recognize a single specific sequence in the human genome.356 This technique may be of great use for mapping genomes over large distances and for studying local DNA structures, namely, DNA triplexes, both in vitro and in vivo. While in experiments in vitro the usual deoxyoligonucleotides can be applied, for experiments in vivo several modifications have been introduced into oligonucleotides to prevent their degradation by nucleases.
The phosphate group was replaced by methylphosphonate and phosphothionate, and the α-nucleotide anomers were substituted for the natural β-anomers. The covalently linked intercalating agents include acridine, phenanthroline, proflavin, and furocoumarin derivatives.357,366-368 Proflavin and furocoumarins are photoactive and can be attached to the oligonucleotides by irradiation.356 As nucleic acid-cleaving agents, metal complexes such as iron-EDTA, iron-porphyrin, and 1,10-phenanthroline-copper have been used.356 Oligonucleotide binding can be irreversible when a covalent bond is formed. An irreversible and well-detectable change is a site-specific DNA chain scission induced by a DNA-cleaving agent such as iron(II)·EDTA attached to the oligonucleotide. Recently, an irradiation-induced cleavage was reported with an 11-residue homopyrimidine oligonucleotide covalently linked to an ellipticine derivative that introduced cleavage of the two strands of the target homopu·py sequence.369 Development of sequence-specific artificial photonucleases of this type represents a new approach in which the cleavage reaction can be very well controlled by light.

Oligonucleotide conjugates that bind specifically to mRNA inhibit its translation in cell-free extracts. The use of antisense oligonucleotides to selectively control gene expression is a very promising approach of potentially great significance that may provide the basis for the development of highly specific therapeutic agents. At present, however, some basic questions remain unanswered.370

Site-specific DNA chain scissions have also been accomplished by incorporation of DNA-cleaving agents into sequence-specific DNA binding peptides and proteins, e.g., the E. coli catabolite gene activator, a helix-turn-helix motif sequence-specific binding protein.371

f. Ultraviolet Irradiation

The formation of UV light-induced cyclobutane pyrimidine dimers in DNA requires a proper alignment of the 5,6 double bonds in the two reacting pyrimidines. This requirement is not fulfilled in B-DNA; thus, dimer formation can serve as a DNA structural probe.320,372,373 Formation of interstrand crosslinking dimers was used more than 20 years ago to study DNA premelting,10,375,376 but in the years since, this approach was little exploited. Only recently was a renewed interest in this technique shown. It has been demonstrated that peaks of pyrimidine dimer formation occur in nucleosome core particles with an average periodicity of 10.3 bases, and that the rate of thymine dimer formation is affected by the direction and degree of DNA bending.379 Homopyrimidine inserts [with (dT-dC)n or (dC)n sequences] in plasmid DNA are good targets for UV-induced [6-4]-pyrimidine dimer formation,380 as demonstrated by photoprints of fragments produced by DNA piperidine cleavage. The dimerization in these sequences was almost completely abolished when homopyrimidine oligonucleotides with (T-C)n or (C)n sequences were added to form the triplex structure. This technique may become very useful in the research of DNA triplex structures, both in vitro and in vivo.

UV irradiation has also been applied as a photocrosslinking and footprinting agent in studies of protein-DNA interactions. Pulsed laser (nanosecond pulses) and flash irradiation (microsecond pulses) have been increasingly applied for time-resolved studies. It appears that thymines photoreact with primary amines, including lysine residues, in proteins, but the photochemistry of protein-DNA crosslinking is not yet fully understood.

3. Conclusions

In contrast to the early 1980s, when the arsenal of methods suitable for the analysis of local supercoil-stabilized structures was rather limited, a large number of methods are currently available.
Among them chemical probes and gel electrophoresis are perhaps the most important. In studies of oligonucleotides modeling a specific DNA structure, physical techniques and particularly NMR386 can yield detailed structural information. It can be expected that scanning tunneling microscopy, whose usefulness in DNA structure studies was recently demonstrated, will soon become a very useful tool in the research of local DNA structures and their interactions. Attempts to improve our knowledge of DNA structure inside the cell will require the development of new techniques. The first results in this respect have been obtained with chemical probes and specific molecular genetic approaches (see Section VII). It is anticipated that rapid development of these techniques will continue in the near future.

VI. LOCAL CHANGES IN DNA SECONDARY STRUCTURE STABILIZED BY SUPERCOILING

According to Equation 1, supercoiled DNA with ΔLk ≠ 0 must either twist or writhe, or both, away from its relaxed form. Changes in Tw may involve torsional deformations of the existing secondary structure and/or transitions of B-DNA segments to supercoil-stabilized specific local structures depending on nucleotide sequence. These transitions include B-Z transitions at alternating purine-pyrimidine sequences, triplex formation at homopu·py tracts, cruciform extrusions at inverted repeat sequences, etc.; changes in Wr involve bending and the formation of superhelical turns. It should be noted that local changes resulting from conformational transitions must be compensated for elsewhere in the molecule (supercoiled domain) to preserve the linking number. Thus far the best understood structural change in a negatively supercoiled DNA molecule may be the cruciform extrusion, discussed in detail in Section VI.A; here, it is mentioned as an example of formation of a local structure assisted by the free energy of negative supercoiling.
Inverted repeat sequences can, in principle, undergo a transition to a structure in which the regions with intrastrand complementarity pair, forming the cruciform structure. Such a structure is energetically unfavorable compared with a perfectly base-paired continuous DNA duplex. This energetic disadvantage of cruciform formation, however, can be overcome if the inverted repeat is placed in a negatively supercoiled molecule, in which the free energy of superhelix formation can provide the driving energy required to facilitate the cruciform extrusion. An inverted repeat of, for example, 21 bp will in its unperturbed state contribute to the total Tw by 21/10.5, or 2. The inverted repeat contribution will decrease to zero after the cruciform extrusion, provided the backbone conformation adopted within the hairpin itself is irrelevant to the topological properties of the whole molecule.122 A similar decrease in Tw can be expected if the same sequence forms a melted region, while formation of Z-DNA would result in a DNA unwinding almost twofold greater.

In natural supercoiled DNAs usually more than one nucleotide sequence in the molecule is able to undergo transition to a specific local supercoil-stabilized structure. Once the first transition occurs, the probability of a further transition is changed due to the partial relaxation of the molecule induced by the first transition. Moreover, a single nucleotide sequence may itself be able to form several alternative structures, depending on various factors, including environmental conditions, superhelix density, etc. Thus, complex multistate equilibria may arise in supercoiled DNA molecules; such equilibria may take part in DNA-controlled regulatory processes in vivo.

Presence of the local structures in supercoiled DNAs complicates their geometric and topological analysis. Equation 1 was formulated in terms of winding of either DNA strand about the duplex axis, which is often difficult to define.
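As a side check on the twist bookkeeping for the 21-bp inverted repeat discussed above, the arithmetic can be sketched as follows. The sketch assumes that Equation 1 is the usual relation Lk = Tw + Wr, takes the 10.5 bp per turn used in the text for B-DNA, and assumes a left-handed helical repeat of 12 bp per turn for Z-DNA (a textbook value that is not given in this section). Function and variable names are illustrative.

```python
B_DNA_BP_PER_TURN = 10.5   # right-handed helical repeat used in the text
Z_DNA_BP_PER_TURN = 12.0   # assumed textbook value; Z-DNA is left-handed


def twist(n_bp: float, bp_per_turn: float, handedness: int = +1) -> float:
    """Twist (in turns) contributed by n_bp of a regular helix."""
    return handedness * n_bp / bp_per_turn


def delta_twist(n_bp: float, final_twist: float) -> float:
    """Change in Tw when n_bp of B-DNA adopt a structure with the given twist."""
    return final_twist - twist(n_bp, B_DNA_BP_PER_TURN)


n = 21
cruciform = delta_twist(n, 0.0)   # extruded (or melted) region contributes no twist
z_form = delta_twist(n, twist(n, Z_DNA_BP_PER_TURN, handedness=-1))

print(f"cruciform or melted region: dTw = {cruciform:.2f} turns")
print(f"Z-DNA:                      dTw = {z_form:.2f} turns")
print(f"ratio Z-DNA / cruciform:    {z_form / cruciform:.3f}")
```

With these assumptions the cruciform removes 2 turns of twist and the B-to-Z transition removes 3.75 turns, a ratio of 1.875, in line with the "almost twofold greater" unwinding quoted above. At constant Lk, each of these local changes must be balanced by an opposite change in twist or writhe elsewhere in the domain, which is how the transition relaxes negative supercoils.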
To overcome this difficulty, new models were proposed (see Section IV) capable of evaluating quantitatively complex phenomena connected with the presence of local structures such as cruciforms in supercoiled DNA molecules. In this section conformations and properties of cruciforms, left-handed segments, triplexes, and other local structures in vitro are discussed. The question of the existence of these structures in vivo and their biological relevance is the subject of Section VII.

A. Cruciforms

Cruciform structure was predicted 35 years ago,393 but the importance of negative supercoiling for its formation was suggested and theoretically derived much later. The first experimental evidence of the cruciform structure was obtained about 10 years ago with the observation of very large cruciforms in the electron microscope and by nuclease S1 probing of supercoiled plasmids and phage DNAs. The possibility that the structure might be artificially induced by acid pH and Zn2+ ions of the enzyme probe buffer and/or by interaction of the enzyme molecule with DNA was soon excluded by chemical probing234 and 2-D gel electrophoresis at neutral pH in the absence of Zn2+ ions. The cruciform is now well established and various aspects of its structure were reviewed in a number of papers.31,35,122,123,138,390,391,400

The formation of a cruciform requires a nucleotide sequence with a local twofold symmetry represented usually by a sufficiently long inverted repeat. Two structural features are characteristic for the cruciform, i.e., the formally unpaired loop and the four-way junction.

1. Loop

Optimal size of the formally single-stranded loop lies between four and six bases.79,401,402 The loop adopts some structure involving probably base stacking and even non-Watson-Crick base pairing.402 Hairpin loops may affect the stem conformation close to the loop, with the stem perturbation stronger for smaller loops.404

2. Four-Way Junction

The four-way junction is normally fully base paired,79,279 asymmetric, and X-shaped. It introduces a bend into the molecule whose nature differs from the intrinsic sequence-directed curvature. In the absence of ions thymines at the junction are reactive to Os,py, probably due to base unstacking, and the helical arms of the junction are fully extended in a square configuration. The four-way junction is formally equivalent to the Holliday junction, a fundamental structure in genetic recombination. A series of papers was published in 1989 and 1990 by Lilley et al. demonstrating for the first time the structural details of the Holliday junction. Their experimental approach was based on hybridization of four oligonucleotides (each of 80 nucleotides) to assemble a four-way helical junction. This complex and its variants (containing small changes in nucleotide sequence or substitution of electrically neutral methylphosphonate at the central phosphodiester linkages) were investigated using various techniques, including gel electrophoresis, chemical probes, and nuclease cleavage, and the resulting model was confirmed and refined on the grounds of fluorescence energy transfer and model building studies.411 The model is called the stacked X-structure, in which the four arms are associated in pairs, stacked in a coaxial manner to generate quasi-continuous helices. The nucleotide sequence at the junction center determines the choice of the stacking partners. Binding of metal ions is important for the stability of the X-structure. Resolvase (from T4 phage) binds at one side of the X-structure, cleaving the exchanging strands two or three nucleotides from the junction.
More details can be found in Lilley and co-workers' recent publications.

Information about the details of the cruciform loop, stem, and four-way junction was obtained not only by investigation of supercoiled DNAs by techniques specific for local supercoil-stabilized structures (this section), but also by studying oligonucleotide constructs modeling the cruciform structure or its specific parts401,402,404,405,407-416 using techniques such as NMR. Using this approach, a crystal structure of the hexadecanucleotide CGCGCGTTTTCGCGCG was solved at 2.1 Å resolution, showing a hairpin configuration with a Z-DNA hexamer stem.416 In contrast to this type of research, studies of the cruciform extrusion had to be done with supercoiled DNA molecules, and the applied techniques were thus limited mainly to those described in Section V.

3. Cruciform Extrusion

Extrusion of the cruciform requires significant rearrangement of the DNA structure, including complete reorganization of the base pairing. Compared with the regular DNA duplex, the cruciform is thermodynamically unstable due to the energetic cost of base unpairing in the loop and formation of the four-way junction. Free energy of the cruciform formation was determined to be 15 to 20 kcal/mol.138,172,399,417 The first studies demonstrated a significant kinetic barrier and showed that the cruciform extrusion could be a rather slow process. Further work with other DNAs revealed, however, that extrusion could proceed under some conditions more easily. Detailed studies showed two types of cruciform extrusion kinetics. Cruciforms requiring the presence of salt for their extrusion (showing maximum extrusion rates at about 50 mM NaCl) are termed S-type (related to salt dependency). S-type cruciforms are more common than C-type cruciforms, so far represented only by ColE1 as the only natural member of the C-type (C for ColE1) class.
C-type cruciforms extrude maximally in the absence of salt, showing a marked dependence on temperature, in contrast to the rather low temperature dependence of the extrusion of S-type cruciforms. The striking differences in the kinetic properties of S- and C-type cruciforms suggested the possibility of two different mechanistic pathways for the extrusion process (Figure 4). The S-type extrusion is initiated by a helix opening at the center of the inverted repeat, followed by the formation of a small protocruciform and branch migration, which elongates the cruciform stems up to complete extrusion. Such extrusion was explained theoretically by Vologodskii and Frank-Kamenetskii and documented by a wealth of experimental evidence (Reference 391 and references therein). Recent results suggest that the initial opening may correspond to about 10 bp and that the nucleotide sequence at the center of the inverted repeat significantly influences the rate of cruciform extrusion. High A+T content of this region or destabilization of the helix by methylation of adenine at N6 enhanced the extrusion rate, while an increase in the G+C content or stabilization of the helix by cytosine methylation decreased it. These results, as well as cation binding analysis, support the concept of the S-type cruciform extrusion.426

The C-type extrusion (Figure 4) is supposed to start with coordinate unpairing of a much larger duplex segment, followed by intrastrand reassociation to form the fully extruded cruciform. This pathway would be expected to be facilitated by low ionic strength and characterized by a large activation energy, i.e., the characteristics experimentally observed. But what are the reasons for the two different extrusion mechanisms? Comparison of the nucleotide sequence of the ColE1 cruciform with that of a typical S-type cruciform does not show differences that may help answer this question.
The explanation of the C-type extrusion offered recently by Lilley et al.31,261,424 is based on the presence of very (A+T)-rich sequences flanking the ColE1 inverted repeat. Replacing the ColE1 cruciform with another inverted repeat (originally S-type) resulted in a typical C-type extrusion.429 Similarly, replacing an S-type cruciform in its natural environment with a ColE1 inverted repeat resulted in S-type extrusion. This clearly demonstrated that the sequences flanking the ColE1 repeat are responsible for the C-type extrusion. It was further shown that

1. The transition energy of the cruciform extrusion decreases with the length of the C-type inducing sequence431
2. The polarity of the sequence can be unimportant and its length can be reduced to 12 bp427,430
3. If the inverted repeat is flanked by both S- and C-type sequences, the salt concentration dominates the type of kinetics (i.e., C-type in the absence of NaCl)
4. The effect may be manifested at a significant distance (around 100 bp)
5. Insertion of a (G+C) segment between the C-type inducing sequence and the inverted repeat may block the C-type extrusion

How can the (A+T)-rich C-type inducing sequence exert remote control over the cruciform extrusion? It was proposed that these sequences are responsible for a coordinate destabilization of a large domain in the supercoiled DNA, thus increasing the probability of large-scale base opening in the inverted repeat.31,261,390,427 In agreement with this proposal, duplex stabilization in (A+T)-rich sequences with distamycin and NaCl resulted in S-type extrusion. Conversely, helix-destabilizing agents induced quasi-C-type extrusion kinetics in DNA with S-type extrusion. Thus, changing the helical stability results in interconversion between S- and C-type mechanisms.

FIGURE 4. Mechanism of cruciform extrusion. The inverted repeat, represented by the thicker line, is shown in the unextruded form on the left. C-type cruciforms (top) initiate the extrusion process with a coordinate opening of many base pairs to form a large bubble. An intrastrand reassociation then forms the mature cruciform structure. The extrusion of S-type cruciforms (bottom) is initiated by a smaller opening event. Intrastrand pairing generates a smaller protocruciform, which may undergo branch migration. Base pairing is transferred from unextruded sequence to the growing cruciform stem in a multistep process to form the fully extruded structure. The principal differences between the two mechanisms lie in the initial opening and the degree of tertiary folding in the transition state. (From Lilley, D. M. J., Chem. Soc. Rev., 18, 53, 1989. With permission.)

Bases in the C-type inducing sequence in supercoiled DNA are hyperreactive toward Os,py, BAA, glyoxal, NaHSO3,261 and DEPC.431 This hyperreactivity requires a threshold level of negative supercoiling, and changes in the chemical reactivity induced by various agents correlate with those in cruciform extrusion.261 BAA hyperreactivity was detected at temperatures above 26°C (with a maximum around 40°C), while that of Os,py was observed above 0°C, reaching a maximum at 20°C. The BAA hyperreactivity was explained by a cooperative but transient helix opening. Os,py hyperreactivity at much lower temperatures might be due at least in part to helix destabilization by 3% pyridine at low ionic strength. Hypersensitivity toward BAA, DEPC, and KMnO4 was observed in a "random sequence" region centered 10 to 30 bases away from the junction of the (A-T)n cruciform;432 this hypersensitivity was pH-dependent.
(A-T)n sequences adopt the cruciform structure at low energies of formation without a detectable kinetic barrier.173,288,433 The unusual extrusion kinetics of the (A-T)n cruciform may be due to specific properties of (A-T)n duplexes, which are very easily deformable and denaturable and whose structure differs from regular B-DNA.260 It has been suggested that these (A-T)n sequences may simultaneously represent the inverted repeat and an effective inducing sequence, thus being a special subclass of the C-type extrusion. The hypersensitivity of the region in the vicinity of the (A-T)n cruciform was related to the cruciform extrusion, but its relation to the C-type extrusion remained unclear.

The C-type cruciform extrusion is a very interesting phenomenon, and some of its aspects may have more general consequences. It was shown quite recently that AT-rich flanking sequences may influence the structure of left-handed DNA (see Section VI.B.2).435 At present, more detailed information is needed on the way in which the inducing sequence exerts its remote control over cruciform extrusion. The suggested31 telestability effect would at least require some considerations concerning DNA supercoiling, and the explanation based on soliton-like states in DNA seems to be even less acceptable. There is no doubt that the ability of some DNA segments to induce local changes in DNA conformation far away from the segment may be of great biological importance. A better understanding of such effects, however, requires further work.

4. Multiple Structures in Inverted Repeats

Some inverted repeats may be composed of alternating purine-pyrimidine sequences and can (in principle) form either a cruciform or a Z-DNA structure. It was shown that the (CATG)n·(CATG)n insert selectively forms a cruciform structure when integrated in a negatively supercoiled plasmid.
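Whether an insert is, in principle, a candidate for both structures can be checked at the sequence level. The following is a minimal sketch in Python and not a method taken from the cited studies: the function names are hypothetical, and the two criteria (the strand equals its own reverse complement; purines and pyrimidines strictly alternate) describe sequence potential only. They say nothing about the supercoiling, ionic strength, or free energies that decide which structure actually forms.

```python
# Illustrative sketch only: sequence-level screening, no energetics.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
PURINES = {"A", "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA strand written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_inverted_repeat(seq):
    """True if the strand equals its own reverse complement (a perfect
    inverted repeat with no central spacer), the motif that can extrude
    a cruciform."""
    s = seq.upper()
    return s == reverse_complement(s)


def is_alternating_pu_py(seq):
    """True if purines and pyrimidines strictly alternate along the strand."""
    s = seq.upper()
    return all((a in PURINES) != (b in PURINES) for a, b in zip(s, s[1:]))


def candidate_structures(seq):
    """List the non-B structures the sequence could form in principle."""
    candidates = []
    if is_inverted_repeat(seq):
        candidates.append("cruciform")
    if is_alternating_pu_py(seq):
        candidates.append("Z-DNA")
    return candidates


print(candidate_structures("CATG" * 4))  # inverted repeat and alternating
print(candidate_structures("TG" * 6))    # alternating only
```

A (CATG)n block satisfies both criteria, whereas a (TG)n strand is alternating but is not its own reverse complement; the experiments discussed in this section address which of the possible structures is actually adopted.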
Similarly, (TCGA)n·(TCGA)n and (TG)n·(CA)n inserts preferentially adopted the cruciform rather than Z-DNA at low ionic strength. However, the latter two sequences were induced to form left-handed DNA (after removal of local structures by ethidium bromide) when the negative supercoiling necessary for the transition was generated at higher ionic strength (e.g., 0.2 M NaCl). Although Z-DNA formation in (C-G) inserts requires more free energy than cruciform formation, these sequences preferred the left-handed form.226 (GTAC)n flanked with two (G-C)n sequences adopted left-handed DNA.441 (A-T) sequences showed a strong preference for cruciform formation, but under specific conditions (in the presence of Ni2+ ions) a left-handed structure was observed in a supercoiled plasmid (see Section VI.B). These results suggest that equilibria between B, Z, and cruciform structures exist in alternating pu-pyr sequences, depending on the superhelicity and environmental conditions. The possibility that a given sequence may undergo more than one structural transition may be of great biological significance, as these transitions provide specific recognition sites (loops and four-way junctions of the cruciform; B-Z junctions and the altered helical sense of Z-DNA) for DNA binding proteins and cause different levels of DNA relaxation, which may transmit long-range structural effects influencing regulatory processes.260,392,440

B. Left-Handed Z-DNA

About 8 years after publication of the CD spectra of the left-handed DNA in poly(dG-dC)·poly(dG-dC) by Pohl and Jovin,
the Z-DNA structure was solved by single-crystal X-ray diffraction studies (see Section II).19-21 This discovery stimulated enormous scientific effort, resulting in hundreds of papers that have been surveyed in numerous reviews.138,191,442-447 It would be useless to try to present a more or less comprehensive review of the broad field of Z-DNA in this article; instead I shall touch on some problems studied in recent years that have not been fully covered in the preceding reviews. These include the formation of left-handed DNA within (dA-dT)n sequences and the B-Z junctions.

1. Left-Handed DNA in (dA-dT)n Sequences

Regular alternation of anti and syn nucleoside conformations is one of the main characteristics of Z-DNA (see Section II). Such alternations are formed most easily in alternating pu-pyr nucleotide sequences (with purines in syn conformation). (dC-dG)n sequences easily adopt the Z-DNA structure, as has been shown by both solution and crystal studies.442 The surprising reluctance of the corresponding (dG-dC)n oligonucleotides to adopt the Z-DNA structure has not yet been fully explained. The ability of (dA-dC)n·(dT-dG)n sequences to form left-handed helices in solution, in both linear and supercoiled DNAs, is well known, but no crystal structure has been reported up to now. Consecutive AT pairs can be incorporated into the Z helix (reviewed in References 442 and 445), and a maximum of six AT pairs can be tolerated, including the (T-A)n sequence. In longer (dA-dT) sequences no left-handed structure was observed under conditions where (dC-dG)n and (dT-dG)n·(dA-dC)n sequences adopted this structure. Inversion of the CD spectra of poly(dA-dT)·poly(dA-dT) in concentrated CsF solutions was not due to the B-Z transition but to the so-called B-X transition.

The first evidence of the B-Z transition was obtained in 1986 in Taillandier's laboratory by investigating poly(dA-dT)·poly(dA-dT) in films by IR spectroscopy in the presence of different counterions and a wide range of water contents. IR spectra observed in the presence of Ni2+ at high polynucleotide concentration and low water activity were assigned to a Z-type structure of poly(dA-dT)·poly(dA-dT). CD of a poly(dA-dT)·poly(dA-dT) solution at high NaCl concentration (5 M) in the presence of NiCl2 (90 mM) displayed almost total spectrum inversion with no negative band around 250 nm, suggesting a Z conformation of the polynucleotide. The nickel-induced Z-form of poly(dA-dT)·poly(dA-dT) obtained further support from UV absorption456 and especially from Raman457 and resonance Raman measurements.458 It was proposed that poly(dA-dT)·poly(dA-dT) forms Z-DNA due to interaction of the nickel ion with adenine N7.457,458 This interaction is possible thanks to the screening of negatively charged phosphates by the high sodium concentration that results in stabilization of adenosine in the syn conformation and reorganization of the water distribution along the molecule that stimulates the B-Z transition. At NiCl2 concentrations close to 0.1 M the pH cannot be neutral; it must be weakly acidic. The role of acid pH in the formation of Z-DNA in (dA-dT) sequences, however, was not considered.

Recently it was shown392 that a (dA-dT)n insert can adopt a left-handed structure in a supercoiled plasmid under conditions (2 M NaCl, 0.2 M NiCl2, or 1 M NiCl2 alone) not sufficient to induce Z-DNA in the linear plasmid or in poly(dA-dT)·poly(dA-dT). A (dA-dT)n block placed in the center of a 32 bp complementary alternating purine-pyrimidine insert adopted a left-handed structure459 at substantially lower NiCl2 concentrations (10 mM NiCl2, 0.2 M NaCl), as detected by DEPC probing.
These results suggest, in agreement with recent predictions, that all simple alternating purine-pyrimidine sequences such as G-C, A-C, and A-T may adopt a left-handed conformation under some environmental conditions in linear DNAs; in supercoiled DNAs these conditions must be combined with a proper superhelix density.

2. B-Z Junctions

Segments of left-handed Z-DNA can be contiguous with B-DNA both in vitro and in vivo (see Section VII). The boundary between these two structures is referred to as the B-Z junction (reviewed in References 34, 36, 442, 444 to 446, and 460). Structural models of the B-Z junction suggest that at least one base pair at the junction must differ from the B or Z conformations. Experimental data obtained mainly with supercoiled plasmids show that the B-Z junction may be a rather polymorphic structure. Cleavage of the B-Z junctions in supercoiled plasmids with single-strand selective nucleases (such as nuclease S1) as well as site-specific modification with single-strand selective chemical probes (see Sections V.C and V.D.1), including Os,py245,268,393,466-468,470 and hydroxylamine,245,468 indicated open distorted structures, but the data could not be unambiguously interpreted in terms of the presence of unpaired bases (not required in the structural models), as the above nucleases and chemicals can interact even with distorted base-paired regions. To detect unpaired bases in the B-Z junction, chemicals reacting specifically with sites involved in Watson-Crick hydrogen bonding were applied.289,291,308,466,471 In 1987 it was shown268,293,308,470,471 that such chemicals, including BAA, CAA, and glyoxal, site-specifically modify the boundary between B- and Z-DNA formed in supercoiled plasmids either by (G-C)n (References 268, 293, 308, and 471) or by (T-G)n (Reference 471) sequences. This implies that the B-Z junctions in supercoiled DNA contain unpaired bases.
On the other hand, further conclusions regarding, for example, the exact number of unpaired bases must be considered with caution, because the nuclease S1 applied for the detection of BAA- or CAA-modified sites may not recognize solitary nucleotides modified by the probe.268,293 To obtain more accurate data, the recently developed method of chemical cleavage at the sites of BAA modification should be applied.290

No crystallographic data describing the structure of the B-Z junction have thus far been published. Recently, however, other physical techniques were applied to study the B-Z junction in a linear DNA fragment472 and in oligonucleotides. It was shown that the hydrodynamic dimensions of the 153 bp fragment are not affected by the transition of two (C-G)n segments to Z-DNA, induced by 15 mM [Co(NH3)6]3+. This may imply that the B-Z junctions were, under the given conditions, neither strongly bent nor particularly flexible. On the other hand, the results of IR spectroscopy of nucleotide films suggest that flexibility in the junction region (resulting from the presence of one no-base residue) stimulates the B-Z transition in six nucleotide residues of a double-stranded tridecamer.

Recently, a 16 bp oligonucleotide

     mC  G mC  G mC  G mC  G  A  C  T  G  A  C  T  G
      G mC  G mC  G mC  G mC  T  G  A  C  T  G  A  C
      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16

was synthesized with 5-methylcytosine (mC) incorporated in the (C-G)n segment, which adopted a Z-conformation at high (>3 M) NaCl concentrations. Studies of this oligonucleotide by UV absorption and by CD and NMR spectroscopy revealed three base pairs involved in the B-Z junction, with only one of them being dramatically distorted. The proton resonances for base pairs 7, 8, and possibly 9 became observable only at temperatures higher than approximately 30°C, suggesting that at lower temperatures the base pairs are accessible to exchange with solvent. At 30 to 50°C, all internal hydrogen bonds in base pairs 2 to 14 were intact. It has been shown that methylation of a plasmid with HhaI methylase in vitro decreases unwinding in the B-Z junction by half as compared to the unmethylated plasmid. The question thus arises to what extent the structural features of the B-Z junction observed by NMR in the tridecamer are influenced by cytosine methylation.474

The results of chemical probing of the B-Z junctions in supercoiled plasmids246,263,268,289,293,308,471 suggest that the B-Z junctions adjacent to (C-G)n segments are narrow, in good agreement with the results of the oligonucleotide proton NMR.474 B-Z junctions adjacent to (G-T)n segments in supercoiled plasmids may contain more bases hypersensitive to the chemical probe and might be structurally different.268,467-470 It was shown that in a (dC-dA)n sequence around σ = -0.06, only about half of the 62 bp tract exists in the Z-form; this Z-form region can lie anywhere along the segment and is probably constantly in motion. Under these conditions the hypersensitivity toward chemical probes associated with B-Z junctions can be distributed over many bases. At more negative σ the junction hypersensitivity begins to appear at one end of the Z-forming sequence, suggesting that the structure of a B-Z junction can be assumed at a lower energy cost at one boundary than at the other.

Do these data suggest that the left-handed structure of (T-G)n·(A-C)n segments is different from the Z-DNA observed in (C-G)n sequences? The available experimental data do not exclude such a possibility; some of them [e.g., the free energy of formation of Z-helices in (T-G)n·(C-A)n sequences is higher than for (C-G)n stretches of the same length] might even support it.445 Quite recent results of Rajagopalan et al.435 make the existence of various left-handed structures even more probable.
These authors showed that flanking Z-forming (C-G) sequences with AT-rich ColE1 sequences (see Section VI.A.2) resulted in site-specific Os,bipy and BAA modifications within the left-handed (C-G) tracts of 36 and 40 bp in length. To explain this unexpected result, it was suggested that the (C-G) tract adopted two conformations that created a new junction.

Regarding the B-Z junctions, at present we may only conclude that their structure is polymorphic, depending on nucleotide sequence, superhelical density, and environmental conditions. Bases in the junction not only show hypersensitivity to single-strand selective chemicals (Table 4), but also represent sites of enhanced intercalation of the psoralen probe.447 The structure and properties of the B-Z junctions may profoundly influence the B-Z transition, and with their open structures differing markedly from those of the adjacent helices, these junctions themselves may play a significant biological role (representing, for instance, a binding site for single-strand binding proteins). Further work, including X-ray crystallography, NMR, and other physical techniques as well as chemical probing, will be needed to understand better the structural features of the B-Z junctions.

C. Triplexes

In only a few years triplexes have become one of the most studied aspects of DNA polymorphic structure. Intramolecular triplexes at weakly acidic and neutral pH values require supercoiling, while intermolecular triplexes from duplex DNAs and oligonucleotides can be formed under the same conditions in relaxed DNA molecules. As this article deals with supercoil-stabilized structures, I shall focus on intramolecular triplexes in an attempt to present a more detailed review. Intermolecular complexes are only briefly summarized.

1. Early Studies of Multistranded Structures in Synthetic Polynucleotides

More than 30 years ago poly(A) and poly(U) were shown to form a triple-stranded structure (at neutral pH, in the presence of MgCl2) with the stoichiometry 1A:2U, where the second poly(U) strand was bound via Hoogsteen pairing.480-482 Studies using poly(C) and poly(G) or poly(I) showed triplex formation at weakly acidic pH values (with one homopurine and two homopyrimidine strands). Later, the possibility of formation of poly(dG)·poly(dG)·poly(dC) at neutral pH was demonstrated. It was shown that base triplets T-A-T and C-G-C+ can be formed in both the polyribo- and polydeoxyribonucleotide series.487-492 Triplex structures were observed not only in homopolymers, but also in copolymers with alternating sequence containing all purines or all pyrimidines in each strand; for example, poly(dT-dC)·poly(dA-dG)·poly(dT-dC) was formed below pH 6 in the presence of MgCl2. Recently it was shown that these polynucleotides can adopt a triple-stranded structure even at neutral pH if the cytosine residues are methylated. It was proposed that in the triplex the third strand is associated with the duplex through Hoogsteen pairing in the major groove, with parallel orientation to the homopurine strand.485,492,493 Polynucleotides with mixed purine-pyrimidine sequences such as poly(dA-dT)·poly(dA-dT) did not form triplex structures.

Guanosine and its derivatives are known to form tetrameric aggregates at high concentrations. Poly(G) and poly(I) form quadruple helices. The ability to form four-stranded structures was also observed in guanine-rich polypurine copolymers.504

DNA and RNA multistranded structures have long been considered mainly an interesting property of some synthetic polymers with no biological relevance.
Recent discoveries of supercoil-stabilized triplex structures in plasmid DNAs and the occurrence of quadruplex structures in G-rich telomeric sequences (see Section VI.D.1) turned the multistranded structures into the latest "hits" of DNA structure research.

2. Intramolecular Triplexes

The discovery of Z-DNA in synthetic poly- and oligonucleotides by means of physical techniques (see Sections I, II, and VI.B) induced a search for Z-forming nucleotide sequences in various genomes and studies of relations between their location, structure, and biological function. The history of the discovery of the supercoil-stabilized triplex DNA followed a different path. First, a great number of papers were published in the first half of the 1980s describing the location and properties of the polypurine·polypyrimidine (polypu·py) sequences (reviewed in Reference 506; see Section VII.C). Such sequences were found in a variety of eukaryotic organisms and tissues, located mainly within the regulatory regions of active genes. Many of these sequences displayed a marked hypersensitivity to the single-strand selective nuclease S1, suggesting the presence of a non-B DNA structure. To explain this hypersensitivity, several models were postulated (reviewed in Reference 328), including a slipped structure, left-handed non-Z DNA,509 a structure in which A·T Watson-Crick pairs alternate with Hoogsteen G·C pairs (with G in syn conformation), and a heteronomous structure with a dinucleotide repeat.

a. H-DNA Model

In 1985 Lyamichev et al., using 2-D gel electrophoresis, observed a sharp structural transition within the insert (from the histone gene unit of sea urchin) of the recombinant plasmid pEJ4 containing (A-G)n in the polypurine strand.
This transition depended on pH and on the degree of negative supercoiling, while the mobility drop was pH-independent (within the given pH range), consistent with the presence of nonintertwined complementary strands throughout the (dA-dG)n·(dT-dC)n stretch but not with a cruciform or Z-DNA. On the basis of their results they suggested a new structure called the H form, stabilized by hydrogen ions and including a hairpin. Considering the results of Lee et al.510 obtained with poly(dT-dC)·poly(dG-dA) at weakly acid pH, they proposed a protonated triplex, H-DNA. Starting from different methodological approaches, Christophe et al. suggested a protonated triplex structure. In the H-DNA proposed by Lyamichev, the Watson-Crick duplex extends to the center of the (dT-dC)n·(dG-dA)n tract (Figure 5), and the second half of the homopyrimidine strand folds back upon itself, winding down the major groove of the helix. This returning strand forms Hoogsteen pairs with the purines of the Watson-Crick duplex, the cytosines being protonated. The second half of the homopurine strand also folds back, but is probably unstructured. Neither the experimental data of Christophe et al., based on mapping of the nuclease S1 hypersensitive sites, nor those of Lyamichev et al.175,512 represented sufficient proof of the H-DNA triplex. Such proof was obtained 2 years later by means of chemical probes.247,264,265,290,313,326,513

b. Chemical Probing: Strong Evidence for H-DNA Model

In 1987 we probed supercoiled plasmid pEJ4 (constructed in the laboratory of M. Frank-Kamenetskii) using Os,py as a probe for the homopyrimidine strand and glyoxal for the homopurine strand, in combination with nuclease S1 to cleave DNA at the sites of the modified bases.264 We demonstrated the dependence of the site-specific modification on pH, NaCl concentration (this dependence was observed at pH 6, but not at pH 4), and supercoiling.
At pH 5.6 a major site-specific modification was observed in the middle of the homopyrimidine strand and a minor one close to the end of the tract. On the basis of these results we concluded that under the given conditions, protonated triplex H-DNA is present in the supercoiled pEJ4 DNA. At pH 4.0 similar site-specific modification was observed even in linear DNA molecules. Employing basically the same approach, several laboratories, including ours, published in 1988 modification patterns at single-nucleotide resolution (using chemical cleavage of DNA or termination of transcription at modified bases instead of nuclease S1) of several recombinant plasmids containing (dT-dC)n·(dA-dG)n tracts. These patterns displayed strong Os,py modification in the middle of the homopyrimidine strand (Figure 5) (corresponding to the hairpin loop in H-DNA) and a minor modification at the 3'-end of the (dT-dC)n tract (consistent with reaction at the B-H junction), plus a strong DEPC modification of the 5'-half of the homopurine strand, corresponding to the 5'-half of the unpaired sequence not included in the triplex. This hypersensitivity of the 5'-half was observed regardless of the insert orientation within the plasmid. Other single-strand selective chemical probes such as hydroxylamine and methoxylamine yielded more or less the same modification patterns, showing that the results are independent of the nature of the chemical probe. Modification with DMS showed that guanines of the 3'-half of the purine strand are protected against alkylation, in agreement with their assumed involvement in the Hoogsteen pairing.215,313,326 The interconversion between duplex and triplex occurred within a few minutes, as detected by nuclease P1 site-specific cleavage.514 The energy parameters of the B-H transition in supercoiled DNA were obtained by Lyamichev et al.515 for (dA-dG)n and (G)n sequences. The energy of nucleation of the H form in (A-G)n sequences was 18 kcal/mol. Because of these results, Lyamichev et al.515 considered the possibility of H-form extrusion in homopu·py tracts shorter than 15 bp as very improbable. This is in agreement with the recent results of Kohwi, who demonstrated triplex formation in (G16)·(C16) but not in (G14)·(C14).

FIGURE 5. Structure of H-DNA. (A) Schematic representation of H-DNA in (TC·AG)n, with the 3'-half of the pyrimidine (dT-dC) repeat donated to the triplex, forming the H-y3 conformer. The 5'-half of this repeat, plus the complementary 3'-half of the (dA-dG)n polypurine repeat, act as the acceptor helix in this conformation. The two halves of the polypyrimidine strand in the triplex are antiparallel. Watson-Crick base pairs are shown as lines, Hoogsteen base pairs as dots. (B) Hydrogen bonding schemes for base triplets in H-DNA. (C) Alternative use of the 3'- or 5'-half of the (dT-dC) repeat as the donated strand, to form H-y3 or H-y5 conformers of H-DNA. Watson-Crick base pairs are shown as lines; Hoogsteen base pairs between the acceptor purines and the pyrimidines are marked separately for uncharged T and for protonated C+ (the latter by +). Boxed letters and letters in circles indicate nucleotides that are reactive to single-strand selective probes (e.g., Os,bipy or DEPC) when the DNA is in the H conformation.
Triplex formation in both (C-T)n and (C)n sequences effectively protected the DNA duplex from UV-induced pyrimidine dimerization. The degree of protection depended on the degree of supercoiling and acidity.516

Displacement of the single-stranded region, a characteristic feature of H-DNA, was questioned by some authors.328 Htun and Dahlberg517 demonstrated an ability of the (dT-dC)n sequence cloned into circular M13 phage DNA to form a complex with H-DNA in a supercoiled plasmid containing a (dT-dC)n·(dA-dG)n sequence. The complex was studied by electron microscopy, 2-D gel electrophoresis, and chemical probing. The latter technique showed Os,py reactivity in the pyrimidine strand similar to that of uncomplexed supercoiled DNA; the DEPC reactivity of the purine strand was significantly reduced, however, suggesting the single-strandedness of the 5'-half of the (A-G)n stretch in H-DNA, a result not consistent with other hypothetical models.328 A similar approach has been applied to reveal the presence of intramolecular triplexes in (dA-dG) stretches in the initiation region of a dihydrofolate reductase replicon of Chinese hamster cells. Formation of the complex with the (T-C)n stretch in M13 DNA requires the presence of the triplex structure, in contrast to complexes between short (T-C)n oligonucleotides and (A-G)·(T-C) sequences in duplex DNA.363,517

FIGURE 5C

The results summarized above (1) provided strong evidence in favor of the intramolecular triplex H-DNA and (2) suggested that of the two possible isomers of H-DNA (Figure 5), only that in which the donated strand comes from the 3'-half of the pyrimidine strand prevailed under the given conditions.

c. Nucleotide Sequence Requirements

The homopu·py nature of the sequence suitable for the formation of H-DNA is determined by the necessity of forming base triads TAT and CGC+ or, alternatively, ATA and GCG (see below). The sequence must contain a mirror repeat (i.e., it must read the same in the 3'-5' as in the 5'-3' direction along a single strand), which is also termed an H-palindrome to distinguish it from normal palindromes (inverted repeats), which extrude the cruciform structure.506,514,519-521 An H-palindrome may contain a nonpalindromic segment (which may even disturb the homopu·py character of the sequence) in its center.519 In triplexes with identical 12-base triads in the stem, 4 to 10 base interruptions in the center were tolerated.521 Longer triplex loops, however, required higher supercoil energy for triplex formation and were less thermostable than triplexes with shorter loops. For example, the thermal stability of the triplex with a 10-base loop was lower by 3 to 4°C and the free energy was about 5 kcal/mol higher than in the most stable triplex with a 4-base loop. A 12-base nonhomopurine spacer between two (dA-dG)n segments did not prevent the formation of H-DNA (stable over a broad pH range), while the presence of a 46-base spacer with a random sequence completely abolished H-DNA formation under the same conditions. If, however, the spacer was composed of a regular (dA-dT)n tract, a complex of H-DNA with a cruciform structure formed by the spacer was observed.

The presence of bases disturbing the H-palindrome in its noncentral region was not so well tolerated. Mirkin et al. constructed sequences

     5'-AAGGGAGAAXGGCGTATAGGCGYAA-3'

in which X and Y were either A or G. When X = Y, this sequence (inserted in pUC19 DNA) exhibited a facile transition to H-DNA. For X ≠ Y this transition was much more difficult or impossible, as detected by 2-D gel electrophoresis. A more detailed picture was obtained with Os,py and DEPC modifications.
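The sequence requirements above (a homopurine or homopyrimidine strand, mirror symmetry, and a tolerated central interruption) can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than a published algorithm: the function names, the `max_spacer` parameter, and the test sequences are hypothetical, and the check ignores supercoiling, pH, and loop energetics entirely.

```python
# Illustrative sketch only: sequence symmetry, no thermodynamics.
PURINES = set("AG")
PYRIMIDINES = set("CT")


def is_homopurine_or_homopyrimidine(seq):
    """True if the strand contains only purines or only pyrimidines."""
    bases = set(seq.upper())
    return bases <= PURINES or bases <= PYRIMIDINES


def is_mirror_repeat(seq, max_spacer=0):
    """True if the strand is a mirror repeat (H-palindrome): its left arm
    equals its right arm read backwards on the same strand.

    max_spacer is the largest central segment (in bases) that is allowed
    to break the symmetry. A single central base of an odd-length
    sequence is always allowed, because it is its own mirror image.
    """
    s = seq.upper()
    n = len(s)
    arm = n // 2
    while arm >= 1:
        central = n - 2 * arm
        if central > max(max_spacer, 1):
            break
        if s[:arm] == s[n - arm:][::-1]:
            return True
        arm -= 1
    return False


# Hypothetical sequences, chosen only to exercise the checks.
print(is_mirror_repeat("AGAGGAGA"))                      # perfect mirror repeat
print(is_mirror_repeat("AAGGATATAAGGAA", max_spacer=4))  # symmetric arms, TATA center
print(is_mirror_repeat("AAGAGGTATAGGGGAA", max_spacer=4))  # arms differ
print(is_mirror_repeat("GAATTC"))                        # inverted, not mirror, repeat
```

Note the contrast with a normal palindrome: GAATTC equals its own reverse complement, so it is an inverted repeat, but it does not read the same in both directions along one strand and therefore fails the mirror-repeat test.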
With Os,py and DEPC probing, for example, a 42-bp pu·py sequence with three consecutive interruptions formed a mixture of two smaller triplexes instead of a large triplex containing unpaired bases in the stem. A shorter sequence with a single interruption formed a triplex with one unpaired base in the stem. The presence of this interruption resulted in a decrease of thermostability by 7°C and the requirement of a higher energy of supercoiling for triplex formation. Lower requirements for supercoiling and increased thermostability were observed with increasing G+C content of a homopu·py sequence, suggesting that the presence of CGC triads is very important for triplex formation. No triplex formation was observed in (dA)20·(dT)20 even at superhelix densities more negative than -0.09.513 It was shown in 1990 that much longer (T)n·(A)n segments adopted the triplex form.524

d. Hinged H-DNA

Htun and Dahlberg247 proposed a three-dimensional model for H-DNA, from which they predicted that H-DNA would introduce a severe kink in DNA molecules. Studying the electrophoretic mobility of DNA fragments containing H-DNA in an (A-G)n sequence (at pH 4) at different sites of the molecule, they showed that such a kink does exist and that it possesses a limited flexibility; thus, it can also be termed a hinge. The presence of a kink or a bend in supercoiled pEJ4 plasmid containing H-DNA was demonstrated at pH 5.6 by electron microscopy.

e. Conformers of H-DNA

Htun and Dahlberg517,526 proposed a standard nomenclature for describing H-DNA structures. According to this nomenclature, H-DNA refers to conformations with triple-stranded and displaced single-stranded regions (not considering the nature of the base pairing). To distinguish between the two possible conformers, H-DNAs in which the donated polypyrimidine strands come from the 5'-half were termed H-y5 and those from the 3'-half H-y3. The number of base triplets within H-DNA can be indicated in brackets, e.g., H-y5(16).
In H-DNAs where the polypurine strand is donated, y (for pyrimidines) is replaced by r (for purines), e.g., H-r5. Once formed, H-DNA may be trapped in a local energy minimum; the initial nucleation event can thus determine which conformer is formed. To form the nucleation site, rotation of the helices flanking the donated pyrimidines must occur. Formation of the H-y3 conformer from (dT-dC)n·(dA-dG)n requires the relaxation of about one more negative supercoil than does formation of H-y5, due to the necessity of additional rotation prior to nucleation. To form the H-y5 conformer, the donor duplex only folds back on the acceptor, while H-y3 nucleation requires one complete turn of the two strands around each other. Thus, H-y3 nucleation would be promoted at higher levels of negative supercoiling when compared with nucleation of H-y5. DEPC modification of topoisomers containing H-DNA confirmed this assumption, which can be used to explain the reported bias toward utilization of the 3'-half of the polypyrimidine strand as a donor.247,265,313,513

The relative energetic cost of H-DNA formation decreases with increasing length of the (dT-dC)·(dA-dG) repeats.526 Longer repeats support generation of multiple conformers and relax a larger amount of negative supercoiling upon H-DNA formation. Multiple forms may result from long repeats in which nucleation did not occur at the repeat center.526 The pPP1 plasmid containing the (dA-dG)nATCGATATATATCG(dA-dG)n sequence, which is not very long by itself but contains a long spacer region, displayed at pH 4.5 DEPC modification patterns consistent with the presence of both conformers over a wide range of -σ values (0.02 to 0.1).522 With shorter pur·pyr sequences of about 25 to 30 bp, only the H-y3 conformer was observed at pH 5 over a wide range of superhelix densities.
Similarly, the (GAA)nTCC(GAA)n sequence yielded only the H-y3 conformer at σ from -0.001 to -0.09 at pH 6 and 7.526 On the other hand, increasing the length of the regular (A-G)n sequence resulted in the formation of multiple conformers.526b Their presence was manifested by complex modification patterns of two (G-A)n segments of different length and an (AGGAG)n segment at pH 5.5 and below at moderate and high superhelix densities. It thus appears that the mechanism suggested by Htun and Dahlberg526 might be limited to certain sequences, while in other pur·pyr sequences different mechanisms may come into play and give rise to different triplex structures.526a

f. Effect of pH, Ions, Supercoiling, Nucleotide Sequence, and Length of the Homopu·py Tracts

Low pH, negative supercoiling, and increasing length of the homopu·py sequences act interdependently to stabilize H-DNA.264,433,526b,527 With increasing lengths of the homopu·py tract (at pH 5.2), less negative superhelix densities were required to induce triplex formation. As the pH was increased (at a constant insert length), more supercoiling was necessary to drive the B-H transition.

The homopu·py sequences capable of adopting H-form triplexes can be composed either of mixed sequences forming H-palindromes or of homopolynucleotide stretches such as (G)n·(C)n.

Mixed Homopu·py Sequences: With highly supercoiled DNA, modification patterns characteristic of H-DNA were observed even at neutral pH.247 On the other hand, in plasmids (containing segments of regions flanking the C2b and C2a immunoglobulin constant region genes) where (dC-dT)n tracts were adjacent to alternating pur·pyr segments, increasing the superhelix density did not result in triplex formation at neutral pH.528 This was explained by the ability of the Z-DNA flanking sequence to prevent triplex formation.
The absence of a triplex in pyrimidine sequences at pH 8 and high superhelix density, combined with the formation of left-handed DNA in an alternating purine-pyrimidine region located 76 bp to the 5'-side of the CT segment, was observed by Johnston. Using nuclease P1, site-specific cleavage was observed starting from a mean superhelix density of about -0.04 at pH 4.6 and about -0.06 at pH 7.5 with both synthetic (dT-dC)12·(dA-dG)12 and (dG)n·(dC)n inserts. Similar results were obtained using Os,py modification. At pH >7.8 the reactivity of the homopyrimidine strand was lost, but the homopurine strand remained reactive to DEPC even above pH 9. This result was explained by the presence of a new conformation (J-DNA)247 and alternatively by the different conditions under which the DEPC and Os,py reactions were conducted.526b It should not be difficult to resolve this question experimentally using different chemical probes.

Evans and Efstratiadis showed that nuclease S1 cleavage of (dG-dA)n·(dC-dT)n sequences is length dependent; they observed hypersensitivity of (dG-dA)n·(dC-dT)n sequences to venom phosphodiesterase from Crotalus adamanteus at pH 9 (this enzyme possesses an intrinsic single-strand selective endonuclease activity). Recently it was shown526b by Os,py and DEPC probing that increasing the length of the insert decreases the dependence on acid pH for triplex formation. The (dG-dA)n sequence (constructed by Evans and Efstratiadis) adopted a triplex structure at neutral pH and a moderate level of supercoiling (σ = -0.049). The pK of the C+GC triad is between 7 and 8; thus, at more alkaline pH values unprotonated CGC triads may contribute to triplex stability.

An interesting observation was recently made by Bernues et al.,530,530a who showed that the d(GA·CT)n sequence cloned in SV40 can adopt an altered structure at neutral pH in the presence of Zn2+ ions (at millimolar concentrations).
On the basis of Os,py and DEPC modification patterns the authors proposed a triplex structure, denoted *H-DNA, stabilized by GGC and AAT triads (Table 5). GGC and AAU triads were shown to be stable at neutral pH.531,532 At pH 4.5 the d(GA·CT)n stretch assumed the usual H-form. It thus appears that this sequence can assume two different structures, depending on pH and ions present in solution. *H-DNA was observed also in the presence of Mn2+ and Co2+ at both neutral and acid pH.526b

TABLE 5
Effect of pH and Ions on Formation of Protonated and Unprotonated H-DNA(a)

Sequence          Condition   py·py·pu form   pu·pu·py form
(A-G)n·(C-T)n     pH          Acid            Neutral
                  Me2+        No              Zn2+, Mn2+, Co2+
(G)n·(C)n         pH          Acid            Neutral
                  Me2+        No              Mg2+, Mn2+, Ca2+
(A)n·(T)n(b)      pH          Neutral         Not observed
                  Me2+        Mg2+

(a) See text for details.
(b) In n = 69, but not in n = 33.

The switch region of IgA immunoglobulin in mice, cloned into a recombinant plasmid, showed hypersensitivity in the (AGGAG)n direct repeat to nucleases S1 (at acid pH) and P1 (at neutral pH) at σ more negative than -0.002.527 The nuclease S1 hypersensitivity was retained for a shorter (AGGAG)n repeat, which displayed at acid pH Os,py and DEPC modification patterns characteristic of the H-y3 conformer of H-DNA. The long mirror repeats (GAA)nTCC(GAA)n and (GGA)nTCC(GGA)n form H triplexes at pH 6.0 and 7.0 in plasmids at -σ as isolated from E. coli. With an increase of -σ and/or lowering of the pH, Os,py and DEPC modification patterns were observed that were inconsistent with the formation of a usual H triplex. It has been suggested that a structure is formed that simultaneously contains two triplexes in the given sequence. One possibility is that H-y5 forms in one (GAA)n segment and H-y3 in the other, but other triplex models are possible. Similarly, (GGA)nTCC(GAA)n, which is not a mirror repeat, formed a non-B conformation at acid pH. The energy required for the transition was about the same as with the H triplex formed by the H-palindrome sequence, but the Os,py and DEPC modification patterns of the former structure were not consistent with H triplex DNA. A non-B structure hypersensitive to single-strand selective nucleases was observed in a long AT-rich polypu·py tract (found downstream of the chicken myosin heavy chain gene) at pH 4.5 and 7.5.533 Recent CD studies of poly(dA-dG)·poly(dT-dC) demonstrated six different conformational states of this polymer at pH values between 8.0 and 2.5. These results suggest that local structures other than H triplexes may be formed in pur·pyr sequences at various pH values.

(dG)n·(dC)n Sequences: After Lyamichev et al. demonstrated the existence of a protonated H triplex in (G)n·(C)n sequences by 2-D gel electrophoresis, Kohwi and Kohwi-Shigematsu showed that the (dG)n·(dC)n sequence can adopt another triplex structure at neutral pH in the presence of Mg2+ ions (at millimolar concentrations), formed by unprotonated GGC base triads (Table 5). A (dG)n·(dC)n tract within the 5'-flanking region of the adult chicken βA globin gene displayed similar properties. In addition to Mg2+ ions, Ca2+ and Mn2+ also induced the (G)n·(C)n sequence into a GGC triplex. Recent results suggest that the potential of poly(dG)·poly(dC) to form the dG·dG·dC triplex structure depends on the length of the (dG)·(dC) tract.535 Zn2+, Cu2+, and Co2+ ions did not induce a triplex structure in the (G)n·(C)n sequence, but did influence the structure of the direct repeat sequence adjacent to this sequence.

d[(G)24C(G)n]·d[(C)nG(C)24] in pG46C plasmid displayed a structural transition at neutral pH in the presence of Mg2+ at ΔLk of -15, accompanied by a release of five turns as detected by 2-D gel electrophoresis.537 A 64-bp GC-rich polypu·py tract from the rat long interspersed DNA element displayed two classes of supercoil-dependent reactivity toward chemical probes.538 One class consisted of highly sensitive bases (whose sensitivity was strongly affected by pH and Mg2+ ions), probably contained in H-DNA triplexes observed earlier in (dG)n·(dC)n sequences. The other class comprised moderately sensitive bases independent of reaction conditions. This reactivity suggests the presence of non-B-DNA, but its structural basis is not known.

(A)n·(T)n Sequences: Fox524 showed that (dT)69·(dA)69 sequences adopt an intramolecular triplex structure in supercoiled plasmid (at native σ) in the presence of Mg2+ at pH 8. The structure displayed a characteristic Os,py modification in the center of the (T)n segment and a DEPC modification of the purine strand characteristic of the H-y3 conformer. No chemical modification resembling the triplex structure was observed in shorter (A)n·(T)n segments, including n = 33 (Table 5). The nuclease S1 cleavage and DEPC modification patterns of (A)n·(T)n suggested that the displaced half of the purine strand might weakly interact with the triplex. 2-D gel electrophoresis failed to detect any structural change; this was said to be due to the relatively small fraction of molecules containing the triplex. Compared with (G)n·(C)n and (A-G)n·(T-C)n sequences, formation of a triplex in (A)n·(T)n segments requires much longer homopu·py stretches.
It was suggested that the requirement for long (A)·(T) tracts may be due to the high stability and rigidity of the propeller-twisted (A)n·(T)n helix containing bifurcated hydrogen bonds (see Section III). If this explanation is correct, why did (A15T15) segments show facile C-type cruciform extrusion253 and why did thermally more stable (G)n·(C)n sequences with n < 69 form triplexes? Further work is necessary to better understand the recently discovered triplex structure in long (A)·(T) segments.

3. Intermolecular Triplexes

a. Complexes of Oligonucleotides with DNA

The possibility of complexing oligonucleotides with DNA has long been anticipated. Formation of intermolecular triplexes resulting from the interaction of oligonucleotides with the complementary homopurine sequence in duplex DNA was demonstrated only recently.359,363,540 Interest in intermolecular triplexes has recently greatly intensified (see Section V.D.2.e), mainly due to their potential application as inhibitors of gene expression and as recognition and cleaving elements in chromosome mapping. It has been shown that a 27-base-long oligonucleotide probe binds to duplex DNA at a single site within the 5'-end of the human c-myc gene, forming a colinear triplex with the duplex binding site. The triplex formation correlates with repression of c-myc transcription in vitro. A shorter 11-mer oligopyrimidine d(TTTCCTCCTCT) formed a complex with a fragment of double-stranded DNA containing a complementary sequence. The complex was stabilized by additional binding energy resulting from intercalation of the aromatic ring system (acridine, phenanthroline) attached to the oligonucleotide 5'-end. The oligomer was bound in the major groove of DNA in a parallel orientation with respect to the purine strand. This kind of highly sequence-specific interaction can be used to control DNA expression.
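The sequence specificity of oligonucleotide recognition discussed in this section can be illustrated with a back-of-the-envelope estimate. The sketch below is not taken from the cited work; it assumes a random genome with equal base frequencies, a human genome size of about 3 × 10^9 bp, and a 6-bp recognition site for a typical restriction nuclease. All three are simplifying assumptions, and the function names are hypothetical.

```python
HUMAN_GENOME_BP = 3e9      # assumed haploid genome size, in base pairs
RESTRICTION_SITE_BP = 6    # assumed length of a typical restriction site


def expected_sites(oligo_length: int, genome_bp: float) -> float:
    """Expected number of exact matches of one specific n-mer.

    Assumes a random sequence with equal frequencies of the four bases,
    so any given n-mer occurs with probability 4**-n at each position.
    """
    return genome_bp / 4 ** oligo_length


def shortest_unique_length(genome_bp: float) -> int:
    """Smallest n for which fewer than one chance match is expected."""
    n = 1
    while expected_sites(n, genome_bp) >= 1.0:
        n += 1
    return n


def resolution_gain(oligo_length: int, site_length: int) -> int:
    """Factor by which an n-mer site is rarer than a shorter site."""
    return 4 ** (oligo_length - site_length)
```

Under these assumptions a 15-mer is still expected to occur a few times by chance, while a 16-mer is expected to occur less than once, so `shortest_unique_length(HUMAN_GENOME_BP)` is 16. A 16-base site is rarer than a 6-bp site by a factor of 4^10 = 1,048,576, i.e., of the order of 10^6.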
A similar approach has been adopted to develop a new strategy of genomic DNA mapping.359 Oligonucleotides with the cleaving agent EDTA·iron(II) attached at the 5'-end can induce sequence-specific double-strand breaks in DNA. An oligonucleotide of about 15 to 19 nucleotides should be sufficiently long to recognize a specific sequence in the human genome,356,359 providing in a formal sense 10^6 times better resolution than natural restriction nucleases. More details can be found in recent reviews and also in Section V.D.2.e.356,359,541

b. Physical Studies of Triplexes

In addition to the studies mentioned above, structural information on triple helices has been derived from poly- and oligonucleotide studies by means of physical techniques. Early X-ray fiber diffraction studies of homopolynucleotides resulted in a triplex model in which the third strand lies in the major groove, interacting with the duplex via Hoogsteen pairing.493 The duplex portion of the structure adopts an A-like conformation with a C3'-endo sugar pucker. The third strand can be homopurine or homopyrimidine, depending on experimental conditions. If the third strand is homopyrimidine, it is parallel to the homopurine strand in the triplex.

Recent studies of oligonucleotides with various nucleotide sequences by means of NMR and other techniques confirmed in principle the triplex model derived from the X-ray fiber diffraction measurements. It has been shown by NMR, CD, and other measurements that at neutral pH in the presence of MgCl2, triple-helical (dA)n·2(dT)n is formed.546 In this triplex the thymine N3-H imino protons are involved in both Watson-Crick and Hoogsteen base pairing. d(GA)4 and d(TC)4 octamers are able to form a B-DNA duplex as well as triplexes, depending on experimental conditions. A pyr·pu·pyr triplex was observed at low pH and in an excess of the pyrimidine strand.
The results unambiguously confirmed that the second pyrimidine strand binds via Hoogsteen pairing (with cytosines protonated at N3) in the major groove of a Watson-Crick duplex. The conformation of the pyrimidine sugars was A-DNA-like (see Section II), while purines had a B-DNA-type sugar pucker. Under the given conditions the TA Hoogsteen base pairs appeared more stable than the GC+ Hoogsteen base pairs. Results of independent 2-D NMR studies of 11-mers were in good agreement with those of the octamer studies, showing an A-helical base stacking conformation in the oligopurine strand of the 11-mer.

Quite recently, Sklenar and Feigon548 constructed a 28-base DNA oligomer with a sequence that could potentially form a triplex containing C+GC and TAT triads. Their 2-D NMR experiment showed that this oligonucleotide forms an intramolecular triplex at pH 5.5 and that a significant amount of triplex remains at neutral pH. The 26-mer d(GAAGGAGGAGA…CTCCTCCTTC) formed a hairpin in solution.549 When this oligonucleotide was mixed with d(TCTCCTCCTTC) at pH 5, a triplex was formed as detected by gel electrophoresis and CD measurements. The triple-stranded structure melted with a biphasic profile. The duplex-to-triplex transition was accompanied by an average change in enthalpy of -73 (±5) kcal/mol. The equimolar mixture of d(CTCTTCTTTCTTTTCTTTCTTCTC) and d(GAGAAGAAAGA) formed a triplex at pH 5 as detected by gel electrophoresis, ethidium bromide interactions, DNase I digestion, and CD spectra. A cooperative thermal transition was observed that was attributed to the disruption of the H-DNA-like structure into single strands.

X-ray fiber diffraction studies of poly(dG)·poly(dC) complexes with N-α-acetyl-L-arginine ethylamide displayed a B-form pattern, although these polynucleotides favor (in the absence of arginine) the A-DNA form. Upon dehydration a triplex was observed, most likely formed by poly(dC+)·poly(dG)·poly(dC).
4. Conclusions

The triplex H-DNA structure suggested by the Frank-Kamenetskii group is now well established. The structure can be formed in any homopu·py sequence containing a mirror repeat. Strong evidence for the protonated triplex structure (Figure 5) has been supplied by chemical probing.264,265,290,313,326,513 Important structural information has been extracted from recent NMR studies on synthetic oligonucleotides. In spite of a large amount of data, it is still difficult to ascertain the structure of the half of the strand not involved in the triplex base pairing. Is it truly single-stranded and free of any interactions? This does not seem very probable. At least some results of chemical probing suggest that bases in this part of H-DNA in supercoiled plasmids are involved in interactions that are most probably substantially weaker than those within the triplex. Do these interactions involve the triplex or are they limited to, for example, stacking interactions within the displaced single strand itself? In the former case we would have to consider the possibility of the formation of a loosely bound tetraplex.522

D. Other Structural Changes

The structures of cruciforms, left-handed DNA, and triplexes, as well as their relations to DNA supercoiling and other factors, were well established in the 1980s. These structures are undoubtedly not the only ones stabilized by negative supercoiling. Recent experimental data suggest that a number of other structural changes may occur. Presently, some data cannot be interpreted in terms of a well-defined structure, while others do not reveal the relation of a structural change to DNA supercoiling. It is to be expected that these points will be clarified and new local DNA structures discovered soon. Among the DNA local structural changes that have recently attracted the greatest attention are base-unpaired regions, structures of telomeric sequences, and parallel-stranded DNA. These are briefly discussed in the following paragraphs.

1. Structure of Telomeric Sequences

Telomeres (reviewed in References 505 and 556 to 558), the ends of eukaryotic chromosomes, all have a similar type of nucleotide sequence, i.e., tandemly repeated GC-rich sequences with a pronounced strand-specific base composition asymmetry such as d(C4A2)·d(T2G4) (repeated at least 50 times) in the ciliated protozoan Tetrahymena or d(C4A4)·d(T4G4) of Oxytricha. Telomeres are involved in the replication and maintenance of the chromosomal ends, and a higher order structural theme shared by all telomeric sequences has been suggested to be critical for their function.559 In the last 2 to 3 years evidence has accumulated showing specific local structures in telomeric sequences.

a. (C,A) Hairpin

Budarf and Blackburn demonstrated that the telomeric sequence poly d(C4A2)·d(T2G4) of Tetrahymena inserted in a plasmid is hypersensitive to nuclease S1 under conditions of negative supercoiling. No such hypersensitivity was observed in linear DNA. As in the case of homopu·py sequences, mapping of nuclease S1 hypersensitive sites did not yield a basis sufficient to deduce the higher structure of the given sequence. Using 2-D gel electrophoresis and chemical probing, Lyamichev et al.561,561a proposed a novel protonated DNA conformation, the (C,A) hairpin. In this structure two independent hairpins are formed that are stabilized by C·C+ and A·A+ base pairs. The model of the (C,A) hairpin structure is based primarily on the results of chemical probing. DMS, Os,py, and KMnO4 showed no protection of G's and T's from modification, suggesting that the G4T2 strand is virtually unstructured. In the C4A2 strand the strongest reaction was observed in the central A's, while other adenine residues were less reactive.
The nonequivalence of the two strands in the (C,A) hairpin is supported by the binding of an oligonucleotide complementary to the G-rich strand, which occurred only at acid pH. (G4T2)n, complementary to the C-rich strand, showed no binding. The data presented by Lyamichev et al. may also be consistent with triplex H-DNA carrying CGA and ATC triads (in addition to TAT and C+GC), if two conformers are present in comparable amounts. Further studies, including modification of C in various topoisomers, may help to solve this problem.

b. Quadruplexes

In the last few years studies of synthetic oligonucleotides containing telomeric sequence motifs have resulted in new models of DNA structures based on non-Watson-Crick base pairing. In 1987 Henderson et al.559 investigated the structure of the G-rich strands of several telomeric sequences and demonstrated the formation of novel intramolecular structures containing G·G pairs with guanosine residues in the syn conformation, as determined by NMR spectroscopy. Sen and Gilbert562 used DMS to probe the structure of self-associated single-stranded DNA oligonucleotides (which displayed a decreased electrophoretic mobility when compared with single strands) and concluded that four-stranded structures were formed in which the strands run in parallel fashion and guanines are bonded to each other by Hoogsteen pairing.

Quite recently compelling evidence was gathered independently in three laboratories showing that G-rich oligonucleotides may form antiparallel quadruplexes containing cyclic guanine base tetrads.563-565 Sundquist and Klug prepared a series of oligonucleotides containing duplex sections with a TTGGGG repeating sequence (found in Tetrahymena telomeres) followed by a single-stranded 3'-terminal overhang of two repeats and showed that these oligonucleotides dimerize to form stable complexes in solution. The complexes were uniquely stabilized by potassium ion. The dimerization was mediated by the 3'-terminal overhang.
In these complexes the N7 of every guanine in the 3'-terminal overhang was inaccessible to DMS and DEPC, while both pairs of thymines were accessible to osmium tetroxide. The authors proposed that telomeric DNA dimerizes by hydrogen bonding between two intramolecular hairpin loops, forming antiparallel quadruplexes. In contrast to Sundquist and Klug, Williamson et al. used single-stranded oligonucleotides composed of the telomeric sequence repeats from Oxytricha and Tetrahymena, while Panyutin et al.565 used fragments of DNA containing (dG)n tracts of three different lengths. The results obtained by different methods (chemical probing, UV crosslinking, electrophoretic mobility) led to proposals of remarkably similar structures with the glycosidic bonds of guanosines in adjacent chains in opposite conformations: syn vs. anti. Thus, in any tetrad two guanosine residues have anti and two have syn conformations. The suggested559,562-565 Hoogsteen base pairing (Figure 5) received further support from experiments in which dG was replaced with dI at various positions of the telomeric sequence. The oligonucleotide complexes prepared by Sundquist and Klug differed from those of Williamson et al. in the effect of monovalent ions on their stabilization, suggesting differences in the sizes of the cavities created within the structures.

Do these results mean that the original model of a parallel-stranded quadruplex (G4-DNA; Figure 6) proposed by Sen and Gilbert562 should be abandoned? The recent work of these authors suggests that this should not be the case.567 Sen and Gilbert567 showed that both parallel-stranded G4-DNA (with all of its guanosine residues in the anti conformation) and antiparallel quadruplexes can be formed. They expected sequences with four or more separated runs of G's to yield four-stranded intramolecular fold-back structures, sequences with two runs to produce dimer fold-back structures, and sequences with single runs to yield G4-DNA.
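The expectation attributed above to Sen and Gilbert, which relates the number of separated G runs to the type of quadruplex, can be written as a simple lookup. The sketch below is only an illustration of that stated rule: the function names are hypothetical, the minimum run length of three G's is an assumption chosen for the example, and the case of exactly three runs is left unclassified because the text does not address it.

```python
import re


def count_g_runs(sequence: str, min_run: int = 3) -> int:
    """Count separated runs of at least min_run consecutive G's.

    min_run = 3 is an assumption made for this illustration.
    """
    pattern = "G{%d,}" % min_run
    return len(re.findall(pattern, sequence.upper()))


def expected_quadruplex(sequence: str, min_run: int = 3) -> str:
    """Map the number of G runs onto the structure expected by the rule."""
    runs = count_g_runs(sequence, min_run)
    if runs >= 4:
        return "intramolecular fold-back structure"
    if runs == 2:
        return "dimer fold-back structure"
    if runs == 1:
        return "parallel-stranded G4-DNA"
    return "not covered by the stated rule"
```

For example, four Tetrahymena-type repeats (`TTGGGGTTGGGGTTGGGGTTGGGG`) contain four G runs and fall in the intramolecular class, two repeats fall in the dimer class, and a single `TTGGGG` repeat falls in the G4-DNA class. The lookup ignores the ionic conditions, which the text shows can change which structure actually forms.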
Complex sequences formed parallel-stranded G4-DNA in the presence of sodium and rubidium but not in the presence of potassium ions. This anomaly (which was not found in oligonucleotides containing short, single runs of three or more G's) arose because potassium ions stabilized quadruplex structures so strongly that transient intermediates were trapped, preventing formation of the G4 structure. Formation of G4-DNA was very strongly dependent on the ratio of Na+/K+. Two phases of G4 formation were shown: at low molar ratios of potassium, a progressive increase in the G4 formation rate was observed, while at higher potassium ratios G4-DNA formation was inhibited.

Telomere DNA sequences were observed in phylogenetically diverse organisms including protozoa, fungi, and even higher eukaryotes. It now seems that telomeres of all eukaryotic nuclear chromosomes may have basically similar structures that can undergo transitions depending on their ionic environment. We can, however, only speculate about the relations between their spatial organization and biological function.

2. Base Unpaired Regions

Early melting was perhaps the first recognized structural change assisted by the free energy of negative supercoiling. Local base pair disruption was observed under various conditions far from melting but differing from physiological conditions. Hypersensitivity to single-strand selective nucleases (such as mung bean and P1 nucleases) was reported at physiological conditions in AT-rich regions of supercoiled plasmids.223,568,569 The hypersensitive sites did not correlate strictly with AT content and appeared in replication origins and transcriptional regulatory regions in both pro- and eukaryotic DNAs. Until recently it was not known whether this local DNA hypersensitivity was due to permanent or transient unwinding of the given DNA region.
Recent experiments with 2-D gel electrophoresis showed that the unwound structure is thermodynamically stable (prevailing at equilibrium over the B-form).570,571 The transition to stable unwinding occurred at about -σ = 0.05, and the average extent of unwinding was sufficient to completely unwind the region recognized by the nuclease at -σ = 0.067.570 The formation of completely unwound DNA segments in replication origins and transcriptional regulatory regions is a very attractive suggestion, since DNA must unwind to initiate replication and transcription. The evidence based on nuclease hypersensitivity and 2-D gel electrophoresis, however, is not sufficient to rule out other alternatives.569-571

FIGURE 6. (a) Scheme for the formation of G4-DNA. The formation of the dimer structure G2 must be rate-limiting, and it must be rapidly converted into G4. Only three possible structures (including K and G2') of fold-back intermediates are shown, but other structures may also be formed. (b) Structures of product K most compatible with its methylation-protection pattern. The methylation-enhanced guanine is circled. The arrows indicate the 5'-3' direction of the sugar-phosphate backbone. (Reprinted by permission from Sen, D. and Gilbert, W., Nature, 344, 410, 1990.)

Kohwi-Shigematsu showed that the yeast replication origin sequence is reactive to CAA;569,572,573 this result suggests that the reactive region is truly unpaired. It also has been shown that stable unpaired regions hypersensitive to CAA modification surround the immunoglobulin heavy chain (IgH) enhancer. These regions are AT-rich and contain negative regulatory elements. Unpaired CAA-hypersensitive, AT-rich regions are also included within matrix association regions. Similar to mung bean nuclease hypersensitive sites of the E.
coli replication origin, the structural transition in the IgH enhancer region occurs at σ of about -0.05, and increasing the negative superhelix density results in extension of the unpaired region.573 A local high AT content is not sufficient to cause the DNA segment to adopt the unpaired structure. Mutation of three adenines (to either G or C) in the sequence ATATAT in the CAA-hypersensitive region resulted in a marked reduction of its sensitivity to CAA. It was proposed that the ATATAT motif may be kinked, serving as a nucleation site for base unpairing. Various sequences required different environmental conditions to form unpaired regions; for example, at 22°C the 3'-sequence of the IgH enhancer was very sensitive to CAA, while the autonomously replicating sequence of the yeast origin was unreactive, showing some CAA hypersensitivity at 37°C.573 Base unpairing is involved in C-type cruciform extrusion (see Section VI.A.2). Hypersensitivity to single-strand chemical probes under superhelical stress was observed also in other AT-rich sequences and in the curvature-inducing ATATATTTTTTAGAGATTTTT sequence.

The coincidence of the base unpaired regions with various functional elements such as negative regulatory elements573 (repressing IgH enhancer activity in fibroblasts but not in B cells), nuclear matrix association regions, replication origins,572 and transcriptional regulatory regions suggests that DNA structure may play an active role in replication, transcription, and other biological functions rather than serving only as a passive source of nucleotide sequence information.

3. Parallel-Stranded DNA

Multistranded structures containing parallel strands were discussed in preceding paragraphs.
Recently it has been shown that parallel-stranded duplexes can be formed by oligonucleotides consisting of AT pairs under physiological conditions. Parallel-stranded DNA was formed in hairpin molecules (1) with stems stabilized in a parallel orientation by 5'-5' or 3'-3' phosphodiester linkages in the hairpin loop574,577-579 and (2) with crosslinked ends.580-582 From the biological point of view the most interesting parallel-stranded DNA appears to be that formed by hybridization of two complementary oligonucleotides with appropriate sequences (partially homooligomeric A-T sequences). Such parallel-stranded molecules were recently designed, synthesized, and characterized. The conformational constraints in these oligonucleotides were obtained by using overhangs or by an appropriate combination of block and mixed sequences. Runs of alternating (AT)n segments579 as well as interspersed GC pairs are compatible with the parallel-stranded helix but destabilize it.

a. Structural Features of Parallel-Stranded DNA

The structure of parallel-stranded DNA has been studied in solution using various physical and chemical techniques. (Crystallographic data are not yet available.) The results of gel electrophoresis, UV absorption, and CD spectra measurements, as well as thermal melting and chemical probing, suggested that parallel-stranded DNA contains a distinct secondary structure.574-587 Differences between antiparallel- and parallel-stranded DNA behavior were observed upon binding of intercalating drugs, chemical probing, and nuclease cleavage.574-578,583-585,587 According to theoretical force field calculations, reversed (trans) Watson-Crick AT base pairs are formed in parallel-stranded DNA, in which the adenine 6-amino group is hydrogen bonded to thymine via the O2 instead of the O4 keto group [as in the normal (cis) Watson-Crick pair].
The reverse Watson-Crick base pairing has been supported by 1H-NMR,578 DMS probing,574 and Raman spectra.579 The Raman spectra, NOESY measurements, and molecular modeling suggested that the furanose rings are mainly in the C2'-endo conformation and the bases are in the anti orientation. 31P-NMR of the intramolecular parallel-stranded hairpin [containing an (A)n·(T)n stem and four cytosines in the loop] and of a 25 bp parallel-stranded duplex composed of two complementary strands [with an (A)n·(T)n block and sequences containing TA and AT steps] produced different results. In the former parallel-stranded hairpin no drastic differences between parallel- and antiparallel-stranded DNAs were observed,578 whereas the backbone structure of the latter parallel-stranded duplex differed from that of the antiparallel-stranded duplex.587 No clear explanation of these results has yet been offered. Measurements of fluorescence resonance energy transfer of 5'-fluorescence-labeled oligonucleotides confirmed that the strands in parallel-stranded DNA indeed have the same polarity. Parallel-stranded DNA inserted in a supercoiled plasmid underwent a transition to a (pu)n·(pyr)n triplex as detected by 2-D gel electrophoresis and chemical probing.591 The data confirmed that parallel-stranded DNA forms a right-handed duplex. Parallel-stranded duplexes formed between unnatural α-anomers and natural β-oligonucleotides, as well as duplexes containing T·T pairs and a phosphate-methylated backbone, have been reported.593

b. Biological Role

Does parallel-stranded DNA occur in vivo, and, if so, what is its biological role? This question cannot yet be answered, but the stability of parallel-stranded DNA at physiological conditions suggests that such a structure might be formed in vivo. It has been suggested that parallel-stranded DNA would arise most readily in the course of exchange reactions involving interactions between separated segments of DNA, including recombination and genomic rearrangements.
In principle, parallel-stranded DNA can be formed in several ways, the simplest of which involves intrastrand loop formation.576 Parallel complementary sequences were found in the Drosophila genome,593a and the ability of very similar sequences to form parallel-stranded DNA in vitro was demonstrated. Thus, the possibility of the existence of local parallel-stranded DNA regions in vivo does not seem to be unrealistic. More details concerning parallel-stranded DNA structure and properties can be found in recent reviews.

E. Conclusions

It has been shown that negative supercoiling stabilizes left-handed DNA segments, cruciform, triplex, and C,A-hairpin structures, and base unpaired regions. These structures are frequently called "unusual structures". Are these structures unusual only to us, because we recognized them much later than B- and A-DNA, or are they really unusual in nature? Does, for example, Z-DNA or the triplex structure occur in vivo substantially less frequently than A-DNA? Until now we have not been able to answer such questions. A common feature of cruciforms, triplexes, C,A hairpins, and base unpaired regions is the accessibility of some bases for interactions with the environment (in contrast to bases hidden inside the B- or A-DNA double helices). The accessibility of bases in Z-DNA is less marked than in other unusual structures, but the results of chemical modification experiments suggest that at least in (A-C)·(G-T) sequences left-handed DNA segments (in supercoiled plasmids) may contain a number of exposed bases.467-470 In addition, accessible bases were found at B-Z junctions regardless of the nucleotide sequence of Z-DNA. Local accessibility of a small fraction of bases was observed earlier even in nonsupercoiled DNAs far below melting conditions.10 Regions containing accessible bases have been called "open" DNA regions. Similarly, cruciforms, triplexes, hairpins, and B-Z junctions may be called open DNA structures.
It is not clear whether the base unpaired regions are completely structureless (this does not seem to be probable) or whether they adopt some structure. In the latter case they can be included among open DNA structures. If the open DNA structures play some biological role, it would be surprising if the accessibility of their bases did not take part in biological processes. Exposed bases in open DNA structures may represent targets for specific interactions with other DNA and RNA strands; they can facilitate or prevent recognition of a given sequence by specific proteins and other substances interacting with DNA. Many mutagenic chemicals would react preferentially with exposed bases in a way similar to that of chemical probes. In addition to the open structures discussed above, other open structures and complexes can be transiently formed during various biological processes, including DNA replication, transcription, and recombination. An example of such structures is an open complex of RNA polymerase with promoter sequences. This complex contains a so-called melted region 12 to 16 bases in length. Recently DEPC, DMS, and Os,py have been applied to characterize this region in an open complex of the lac UV5 promoter. It has been shown that bases in the template strand from -10 to +3 react with DEPC (A residues) and DMS (C residues).259 Thymines at positions -8, -9, and -11 reacted with Os,py, in contrast to those at +1 and +2 (assumed to be in a single-stranded region), which did not react. On the other hand, thymine at -11 showed hypersensitivity toward Os,py even though this base is expected to be outside the locally "melted" region. These results suggest that the locally melted region is not completely structureless. Its detailed characterization by means of chemical probes may provide important information, including information about the strength of the given promoter.
Work with this goal in mind is underway. Similar approaches can be applied to study other open structures and complexes both in vitro and in situ.

VII. DNA STRUCTURE IN THE CELL

Much information has been gained concerning DNA structure in solution, fibers, and crystals, although information about DNA structure in its natural environment, i.e., in the cells, is scarce. Such information is, however, vital for a better understanding of the biological role of DNA. The preceding sections showed that DNA supercoiling stabilizes various local structures such as cruciforms, triplexes, left-handed DNA, etc., and that extrusion of these structures in turn changes the DNA superhelix density. Extrusion and absorption of the local structure may represent an efficient way of regulating the biological processes involving DNA, but do these structures really exist in vivo? Early experiments suggested that a certain portion of DNA isolated from various organisms had a single-stranded character (reviewed in Reference 10). It was shown that the amount of DNA cleaved with single-strand selective nuclease S1 increased during the period of DNA synthesis in human diploid fibroblasts.595 In experiments carried out with isolated DNA it was difficult to exclude the possibility that regions with a single-stranded character were formed secondarily (e.g., due to nuclease cleavage after the cell disruption). Techniques have become available that make possible the study of DNA structure inside the cell.

A. Methods of Analysis of DNA Structure in the Cell

Both indirect and direct ways of demonstrating the formation of open local structures in the cells have been employed.
The former include (1) methods of studying certain genetic consequences of the formation of such a structure, including the modulation of transcription by Z-DNA596 and cruciform,597 and the susceptibility of potential Z-forming598 and inverted repeat sequences to deletions; and (2) methods based on changes in DNA topology induced by the formation of a local structure and on compensation of the simultaneous partial DNA relaxation by intracellular topoisomerases (linking number assay). The indirect methods do not significantly disturb cell life, but interpretation of the results may not be free of ambiguity; for instance, the linking number assay of the cruciform extrusion in vivo relies upon the accurate regulation of superhelix density in the cell. It is expected that the change in the superhelix density induced by the structural transition will be compensated for by the regulatory system, which reduces the DNA linking number so that the level of supercoiling is reestablished. Unfortunately, other changes in DNA superhelix density (e.g., due to a specific protein binding to the inverted repeat) may be compensated for in a similar way. Perhaps the most efficient way to demonstrate the formation of a local DNA structure in the cell is by probing DNA in situ. This has been achieved by molecular genetic and chemical approaches.39 In the molecular genetic method an enzyme is induced in the cell and the consequences of its specific interaction with the local DNA structure in the cell are determined. For example, cleavage of the cruciform loop by T7 endonuclease was used to demonstrate cruciform structure in E. coli cells.603 Single-strand selective chemical probes have been applied successfully in DNA structure studies in vitro (Section V). We have shown39 that Os,bipy enters E.
coli cells without disturbing their integrity and site-specifically reacts with bases in open regions of the cellular DNA such as B-Z junctions or triplex structures.604 Testing of other osmium tetroxide complexes revealed that in addition to Os,bipy, Os,TEMED (Figure 2) can be applied to probe DNA structure in situ. KMnO4 was recently introduced as another single-strand selective probe of the DNA structure in E. coli cells. Chemicals reacting with double-stranded DNA such as DMS606 and copper-1,10-phenanthroline were used to obtain footprints of intracellular DNAs. DNA footprinting can also be carried out by irradiating cells with short exposures of UV light.372,608-609 With the use of UV irradiation it may also be possible to obtain information about local DNA structures in vivo. After the treatment of cells with a chemical probe (or UV light), DNA is usually isolated and the probe reaction sites determined as in experiments in vitro (Figure 3). It is also possible to detect the probe binding in the cell by means of immunofluorescence techniques without isolating DNA.610 By using Os,bipy it has been shown that Z-DNA,39,40,248,447 cruciform, and triplex structures can exist in E. coli cells.

B. Cruciform Structures

Inverted repeats, the potential cruciform sequences, are frequently found within genetic regulatory regions.260,612,613 Can all of these sequences extrude cruciforms in vivo? Reports published in 1983 suggested that in bacterial cells cruciform may be rare because of its slow formation. On the other hand, the tendency of some inverted repeats to undergo deletion was observed, suggesting the possible presence of cruciforms in the cell.

1. Evidence of Cruciform Structure in Bacterial Cells

Using the linking number assay, Hanniford and Pulleyblank concluded that the (A-T)n sequence extruded cruciform in E. coli cells under conditions of blocked protein synthesis.
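The bookkeeping behind the linking number assay can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure used in the cited studies: the helical repeat of 10.5 bp per turn, the plasmid size, the inverted repeat length, and the intracellular superhelix density are all hypothetical values, and the regulatory system is assumed to remove exactly the number of turns relaxed by the extrusion.

```python
# Minimal sketch of the bookkeeping behind the linking number assay.
# Assumptions (not taken from the text): 10.5 bp per helical turn of B-DNA,
# the plasmid size, the inverted repeat length, and the homeostatic sigma.

HELICAL_REPEAT_BP = 10.5  # assumed bp per turn of relaxed B-DNA


def relaxed_linking_number(n_bp):
    """Lk0 of a relaxed closed circle of n_bp base pairs."""
    return n_bp / HELICAL_REPEAT_BP


def superhelix_density(lk, n_bp):
    """sigma = (Lk - Lk0) / Lk0."""
    lk0 = relaxed_linking_number(n_bp)
    return (lk - lk0) / lk0


def turns_absorbed_by_cruciform(repeat_bp):
    """Duplex turns unwound when an inverted repeat of repeat_bp extrudes."""
    return repeat_bp / HELICAL_REPEAT_BP


def effective_sigma_after_extrusion(lk, n_bp, repeat_bp):
    """Superhelix density felt by the rest of the molecule after extrusion.

    Lk is fixed in a closed circle, so unwinding the repeat relaxes part of
    the negative supercoiling.
    """
    lk0 = relaxed_linking_number(n_bp)
    deficit = (lk - lk0) + turns_absorbed_by_cruciform(repeat_bp)
    return deficit / lk0


def compensated_linking_number(lk, repeat_bp):
    """Lk after the cell reestablishes the original level of supercoiling.

    Assumes the regulatory system removes exactly the relaxed turns.
    """
    return lk - turns_absorbed_by_cruciform(repeat_bp)


if __name__ == "__main__":
    n_bp = 4200          # hypothetical plasmid size
    repeat_bp = 42       # hypothetical inverted repeat length
    sigma_cell = -0.05   # hypothetical intracellular superhelix density

    lk0 = relaxed_linking_number(n_bp)
    lk = lk0 * (1 + sigma_cell)
    print("sigma before extrusion:", round(superhelix_density(lk, n_bp), 4))
    print("effective sigma after extrusion:",
          round(effective_sigma_after_extrusion(lk, n_bp, repeat_bp), 4))
    lk_new = compensated_linking_number(lk, repeat_bp)
    print("additional negative turns after compensation:", round(lk - lk_new, 2))
```

In this picture, a plasmid carrying an extruded cruciform is isolated with a lower linking number than a control plasmid, which is the signal the assay reads. Any other event that relaxes the DNA and is compensated in the same way would give the same signal.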
In 1987, Panayotatos and Fontaine603 showed cleavage of the ColE1 inverted repeat in the intracellular pLAT75 plasmid by T7 endonuclease induced in E. coli cells. The digestion site coincided with the cruciform loop cleavage by T7 and S1 nuclease in vitro. The ColE1 inverted repeat in pLAT75 was placed in its native environment within the coding sequence of the colicin resistance gene and was actively transcribed from its natural promoter. The sequence surrounding the palindrome was highly AT-rich. Thus, the combined effects of active transcription and the presence of adjacent AT-rich sequences (see Section VI.A) could significantly contribute to cruciform extrusion in the cell. By means of the Os,bipy probe, the presence of cruciform in E. coli cells was demonstrated recently in (A-T)n inserts of different lengths. In a system not undergoing active transcription and biased in favor of cruciform formation (using salt shock or a topoisomerase mutation to increase the superhelix density), cruciform was detected in three of the (A-T)n inserts tested but not in a fourth. These experiments made it possible to calculate the effective DNA superhelix density inside the cell that responded directly to genetic and environmental influences. The above data seem to be sufficient to warrant the conclusion that at least some inverted repeats can form cruciform in E. coli cells. The presence of an AT-rich C-type-inducing sequence and a superhelix density increased above the average intracellular level appear to be important for cruciform extrusion in the cell.

2. Biological Role

If we admit that cruciforms can exist in the cell, a question concerning their biological function arises. It has been suggested that one of the signals involved in the initiation of DNA synthesis is a specific local DNA structure that is recognized by the enzyme necessary for replication.
Features common to many replication origin sequences include inverted repeats and AT-rich regions. Thus, cruciform structures may be one of the candidates that might be involved in signaling the initiation of DNA synthesis. Quite recently Ward et al.616 observed some correlation between the distribution of activated origins of replication and the distribution of cruciforms using immunofluorescence of eukaryotic nuclei labeled with a monoclonal anticruciform antibody. By means of fluorescence flow cytometry they determined the number of cruciforms to be around 10^5 per nucleus. It was shown that human and rat cells contain a cruciform-binding protein that is structure-specific and sequence-independent.617,618 This protein was identified as nuclear HMG1.619 HMG1 is an abundant component of the nucleus involved in transcription and DNA replication; its interaction with the cruciform structure points to an important biological role of this structure. It has been shown that a synthetic E. coli promoter containing a cruciform in the -10 region may regulate transcription in a supercoil-dependent manner. Transcription from this promoter in vitro was repressed as the cruciform [50 bp highly (88%) AT-rich inverted repeat] was extruded. Transcription in vivo was induced as supercoiling was relaxed due to DNA gyrase inhibition. A cruciform in the -35 promoter region [which was less (64%) AT-rich] behaved similarly in vitro; however, the same inverted repeat had little effect on the transcription in vivo.
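The two sequence properties compared here, perfect inverted repeat symmetry and AT content, are easy to check for a candidate sequence. The following is a minimal sketch; the function names and the example sequences in the usage note are hypothetical and are not the inserts used in the cited work.

```python
# Minimal sketch: test a sequence for perfect inverted repeat (palindromic)
# symmetry and compute its AT content. Function names are illustrative only.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_perfect_inverted_repeat(seq):
    """True if the sequence equals its own reverse complement.

    Such a sequence is an inverted repeat without a spacer; an odd-length
    sequence can never satisfy this condition.
    """
    s = seq.upper()
    return len(s) % 2 == 0 and s == reverse_complement(s)


def at_content(seq):
    """Fraction of A and T residues in the sequence."""
    s = seq.upper()
    return sum(base in "AT" for base in s) / len(s)
```

For example, `is_perfect_inverted_repeat("ATATATAT")` holds and `at_content("GAATTC")` is 4/6, whereas `"AAAA"` is not an inverted repeat because its reverse complement is `"TTTT"`.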
A 48-bp inverted repeat placed in the J-F intercistronic region of φX174 replicative form DNA gave identical transcripts in vitro with extruded and unextruded cruciform.621 Unusual local structure, most probably cruciform, prevented both transcriptional initiation and elongation in form V of pBR322 DNA.622 It has been shown that transcription induces the generation of positive supercoils in front of the transcription complex and formation of negative supercoils behind it.145,623-625 Thus, cruciform structures can be absorbed or extruded during transcription. S-type extrusions in vivo are rather improbable, while the occurrence of C-type extrusion in vivo, especially in highly AT-rich inverted repeats, appears probable. Therefore, the difference observed in the effect of inverted repeats contained in the -10 and -35 regions on transcription in vivo may be due to different extrusion pathways of the two cruciforms. In vitro transcription proceeds in a less complex milieu (e.g., no membrane attachment sites are present), which may result in the formation of supercoil domains differing from those created in vivo by their size, degree of supercoiling, etc. It is thus not surprising that the effect of an inverted repeat on the transcription in vitro and in vivo is not always the same. The recent data suggest that cruciform structures can exist in the cell and might be involved in the regulation of DNA replication and transcription. The possibility of involvement in other biological processes cannot be excluded.626,627

C. Left-Handed DNA

The presence of Z-DNA in fixed eukaryotic chromosomes was reported in the early 1980s.628-630 By means of anti-Z-DNA antibodies, patterns of bands of different intensity in polytene chromosomes of Drosophila and Chironomus were observed by indirect immunofluorescence.
Shortly afterward it was shown that Z-DNA could not be detected in unfixed chromosomes isolated by micromanipulation and that solvents used in fixation procedures induced different stable DNA tracts, resulting in the reproducible patterns. Therefore, antibody binding results can only be taken as evidence for the presence of potential Z-DNA sequences in the eukaryotic genomes. Attempts to demonstrate the existence of Z-DNA in (G-C)n·(G-C)n segments of plasmid DNA in E. coli by the linking number assay635 did not solve the problem of the existence of Z-DNA in vivo since the results of this assay cannot be interpreted unambiguously (see above).

1. Existence of Left-Handed DNA in Prokaryotic Cells

In 1987 we used Os,bipy to search for left-handed DNA in (C-G)n·(C-G)n segments of an intracellular plasmid. We showed that Os,bipy reacts at the boundaries of these segments, recognizing site-specifically the B-Z junction inside the E. coli cell. At the same time Jaworski et al.,42 using a special molecular genetic technique, showed independently that left-handed DNA exists in the plasmid (C-G)n·(C-G)n stretches and elicits a biological response in E. coli cells. Their in vivo assay was based on the in vitro observation that an EcoRI recognition site was not methylated when it was near or in the Z-helix. In the in vivo assay a plasmid encoding the gene for a temperature-sensitive EcoRI methylase (MEcoRI) was cotransformed with any one of several plasmids containing (C-G)n or (T-G)n blocks of different lengths with target EcoRI sites in the center or at the end of the blocks. Inhibition of methylation by the MEcoRI was observed for the inserts with the longest (G-C)n segments, long enough to form 56 bp left-handed helices. These results provided evidence that left-handed DNA can exist in the living cell; however, the possibility that the observed inhibition of methylation was due to a specific protein binding cannot be completely excluded.
On the other hand, the site-specific Os,bipy modification of the B-Z junction in the cell could not be due to protein binding, although this assay did not allow the study of the biological response of left-handed DNA.40,139 Considering the results of the genetic42 and chemical35 studies performed in 1987, it may be concluded that strong evidence has been obtained showing that left-handed DNA can exist in the cell. This conclusion is supported by further studies of the inhibition of MEcoRI in vivo and a thorough linking number assay602,636 using systems similar to those used by Jaworski et al.42 as well as by the insert deletion analysis. Using the linking number assay and MEcoRI inhibition assay, Zacharias et al.636 showed that cytosine methylation stabilized left-handed DNA in E. coli cells just as it did in the in vitro experiments. Insert deletion analysis showed that sequences capable of adopting Z-DNA were generally stable when cloned into an untranslated site of pBR322 (EcoRI), but suffered deletions when cloned into a site (BamHI) located in the tetracycline resistance structural gene.598 Using Os,bipy, Rahmouni and Wells248 recently showed that (C-G) segments as short as 12 bp adopted left-handed DNA when cloned upstream from the tet gene, whereas no left-handed DNA was found when the (C-G) blocks (up to 74 bp long) were cloned downstream. These studies strongly support the notion of domains with varying degrees of supercoiling in E. coli cells, perhaps related to transcriptional activity. The important contribution of R. D. Wells' laboratory to studies of Z-DNA in bacterial cells was reviewed recently. The progress made in the last few years allows us to conclude that left-handed DNA can exist in bacterial cells [at least in (G-C)n segments], and that its existence is dependent on the level of intracellular supercoiling.

2. Left-Handed DNA in Eukaryotic Cells

Studies of Z-DNA in eukaryotic cells by means of antibodies are more difficult because of the possibility of the secondary induction of Z-DNA by (1) direct effect of fixatives, (2) removal of proteins during fixation procedures, which results in changes in DNA superhelix density, and (3) perturbation of the B-Z equilibrium by the anti-Z-DNA antibody. To overcome these difficulties, Wittig et al.638,639 encapsulated permeabilized myeloma nuclei (that were active in transcription and replication) in agarose microbeads and probed the extent of Z-DNA formation as a function of supercoiling. Upon binding Z-DNA-specific antibody, they observed a broad plateau of constant binding that was taken as a measure of preexisting Z-DNA in the nuclei. They calculated that about 0.04% of the base pairs were in the Z conformation.639 Inhibition of topoisomerase I with camptothecin resulted in a higher Z-DNA content, while cleavage with DNase I induced the complete loss of preexisting Z-DNA in the nuclei. Soyer-Gobillard et al.639a localized Z-DNA in limited areas inside the chromosomes of the dinoflagellate Prorocentrum micans by immunocytochemistry on squashed fixed, unfixed, and frozen cells. This organism is a primitive eukaryote whose chromosomes show a permanently well-organized DNA structure with neither histones nor a nucleosomal system that would modulate DNA supercoiling. This makes the dinoflagellate chromosome a highly suitable model for studying Z-DNA and other local structures in vivo. Z-DNA was often localized at the periphery or near the segregation fork of dividing chromosomes. In the nucleolus, Z-DNA was observed only in the nucleolus organizer region and never in the fibrillogranular area. Positive results obtained with unfixed and frozen cells provide strong evidence for the existence of Z-DNA in eukaryotic cells.
These results cannot be simply correlated with the previous data obtained from eukaryotic chromosomes because of substantial differences in their chromosome organization. (C-G)n and (C-A)n sequences were cloned into SV40.620,641 (C-G)n was highly unstable compared with (C-A)n.620 This instability, however, was not related to the formation of Z-DNA in eukaryotic cells, as no signs of this structure in (C-G)n and (C-A)n·(T-G)n inserts in vivo were detected by the linking number assay.641 It has been pointed out in this article that the results of this assay cannot be unambiguously interpreted; the possibility that changes induced by Z-DNA formation were compensated for by, for example, dissociation of some protein molecule cannot be excluded. Studies based on the dinoflagellate model introduced by Soyer-Gobillard639a as well as on the technique developed by Wittig et al.638,639 represent a substantial advance in the study of Z-DNA in eukaryotic cells. Further techniques are, however, needed to elucidate the question of the presence, distribution, and biological function of left-handed DNA and other local structures in eukaryotic cells. Application of chemical probes and antibodies specific to DNA-probe adducts may represent a new and useful approach to these studies. An even more interesting approach can be seen in novel confocal Raman microspectrometry; by means of this technique the Raman spectra of a single intact cell, a chromosome, or a polytene chromosome band and interband can be obtained, providing information about DNA structure and DNA/protein ratio. This technique brings new perspectives for future DNA structure research in vivo.

3. Biological Role of Z-DNA

While the possibility of the existence of left-handed DNA in vivo has been reliably established, the biological role of the structure remains unclear.
Long (dC-dG)n tracts are not widely found in biological systems, but (dT-dG)n·(dA-dC)n sequences were found in both pro- and eukaryotic genomes (reviewed in References 34 and 643 to 647) associated with many genes (reviewed in References 34, 36, and 648 to 653). Proteins that bind to Z-DNA in vitro were isolated (reviewed in References 34, 36, and 648 to 653). Up to the present time, however, no Z-DNA-dependent in vivo functions of these proteins have been identified. Among biological roles suggested for Z-DNA are its participation in transcription, recombination, chromatin structure, etc. It was shown that E. coli RNA polymerase could transcribe through a (C-G)n sequence in the B-form. When this sequence was flipped to the left-handed form, the RNA polymerase together with its nascent transcript was blocked at the boundary of the (C-G)n tract. In vivo, however, the entire sequence was transcribed, suggesting some mechanism that removes or prevents the blocking of transcriptional elongation in the cell. It was recently shown that HMG1 protein removes in vitro the transcriptional block caused by the (C-G)n sequence in the left-handed form.656 Insertion of (C-G)n in the lacZ gene of E. coli inhibited expression of β-galactosidase in vivo. One (C-G)n insert showed a 34-fold decrease of β-galactosidase synthesis when inserted instead of a lac operator and a 24-fold decrease when inserted between codons 5 and 6 of the lacZ gene. With shorter (C-G)n sequences the decrease was substantially smaller. It might be interesting to compare the observed inhibition with the actual presence of left-handed DNA in the cell.
In contrast to (C-G)n, which caused a transcriptional block in negatively supercoiled plasmid in vitro, the (C-A)n·(T-G)n sequence induced no strong hindrance to transcription. Longer (C-A)n·(T-G)n sequences (60 and 179 bp, located upstream of the rat prolactin gene) formed left-handed DNA and inhibited gene transcription in vitro. Nucleosome assembly at the (G-C)n insert was prevented when the insert adopted Z-DNA in a supercoiled plasmid. These results suggest that left-handed DNA may be involved in important biological processes. Further work is, however, necessary to elucidate the role of this DNA form in transcription, recombination, and other biological processes.

D. Triplexes

1. Occurrence of Polypu·py Tracts

The occurrence of polypu·py sequences in genomes of various organisms has been studied intensively in the past few years. (G-A)n·(T-C)n tracts constitute 0.4, 0.3, and 0.4%, respectively, of the rat, hamster, and mouse genomes, but only 0.7 and 0.5% of the human and monkey genomes. These tracts were also found in other organisms. (T)n·(A)n and (G)n·(C)n sequences are present in the human genome with 0.3 and 0.0002% frequencies, respectively.663 In mice and rats the transcription units are flanked by (A)n·(T)n and (G)n·(C)n sequences.664 Tracts of (C)n and (G)n blocks have been found in the vicinity of mouse immunoglobulin light chain genes665 and within the 5'-flanking region of the adult chicken βA-globin gene. Such tracts are found 5' upstream of the human A-globin gene, intermingled with (T-G)n. (T-C)n and (T-G)n blocks are adjacent in the third intron of the apolipoprotein CII gene.667 In Drosophila chromosomes (T-C)n and (T-G)n blocks show a nonrandom distribution, with the highest occurrence on the X chromosome.668-670 Blocks containing (GAA)n sequence units are present on all human chromosomes except the Y chromosome.671 The (GGA)n sequence family may be ubiquitous in the genomes of higher eukaryotes.
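Tract frequencies of the kind quoted above come down to locating runs of purines or pyrimidines in a sequence and summing their lengths. The following is a minimal sketch of such a scan, not the method of the cited surveys; the minimum tract length of 10 and the function names are assumptions made for illustration. Only one strand needs to be scanned, because a purine run on one strand is a pyrimidine run on the complementary strand.

```python
# Minimal sketch: locate homopurine or homopyrimidine runs on one strand and
# report the fraction of the sequence they cover.
# The default minimum length (10) and the function names are assumptions.
import re


def find_pu_py_tracts(seq, min_len=10):
    """Return (start, end, kind) for each run of only A/G or only C/T.

    Coordinates are 0-based, with end exclusive.
    """
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_len, min_len))
    tracts = []
    for match in pattern.finditer(seq.upper()):
        kind = "purine" if match.group()[0] in "AG" else "pyrimidine"
        tracts.append((match.start(), match.end(), kind))
    return tracts


def tract_fraction(seq, min_len=10):
    """Fraction of the sequence covered by polypu/polypy tracts."""
    if not seq:
        return 0.0
    covered = sum(end - start for start, end, _ in find_pu_py_tracts(seq, min_len))
    return covered / len(seq)
```

A scan like this only flags candidate tracts; whether a tract can adopt a triplex also depends on its internal symmetry, the pH, and the superhelix density.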
Pentamers (T)5, (T)4C, TTCTT, TTTCT, and TCTTT are represented with over 125 copies in the ovalbumin gene. (CCT)n sequences occur at recombination sites of the complex satellite DNA of the Bermuda land crab.673 Homopu·py sequences are also found in numerous virus genomes. Homopu·py regions ≥10 bp in length were found in the human β-globin region and six other human genes with an average of about one string per 170 to 250 bp.675 A high bias in favor of homopu·py strings in the human genome may affect nucleosome stability and placement.663,675-677 Nucleosomes were reconstituted with all homopu·homopyr sequences tested,675-678 with the exception of poly(dA)·poly(dT), which was not able to form nucleosomes.679-680 Further details on the occurrence of homopu·homopyr sequences can be found in recent reviews.

2. Existence of H-DNA In Vivo

In contrast to cruciform and left-handed Z-DNA, whose supercoil-stabilized structures were uncovered in the early 1980s, the H-DNA structure was proposed only a few years ago; therefore, attempts to identify this structure in vivo have only a short history. In 1987 Lee et al.206 generated a monoclonal antibody specific to triplex DNA. They reported binding of this antibody to mouse metaphase chromosomes and interphase nuclei. In fixed mouse and human chromosomes a positive correlation between immunofluorescent staining patterns, G- and/or C-banding patterns, and Hoechst 33258 banding was observed.681 Unfixed isolated mouse chromosomes were only weakly fluorescent. These results are interesting, but they do not represent unequivocal evidence of triplex structure in the cell, as their interpretation suffers from drawbacks similar to those of Z-DNA immunofluorescent staining. Using Os,bipy we did not observe any sign of triplex formation in pEJ4 intracellular plasmid at neutral pH.38 If E. coli cells were preincubated
in pH 4.5 or 5.0, a modification pattern characteristic of H-DNA was obtained. The shift of the intracellular superhelix density to more negative values (by cultivating the cells in media supplemented with 0.35 M NaCl) resulted in a stronger site-specific modification. A more detailed analysis of the pL153 plasmid (containing the homopu·py sequence from pEJ4, not undergoing active transcription) showed differences between the Os,bipy modification in vitro and in situ. In situ, more bases were modified in the triplex loop, and modification of two thymines at the 3'-end of the (C-T)n sequence (forming the B-H junction of the H-y3 conformer) was weaker (Figures 5 and 7). On the other hand, modification of three cytosines at the 5'-end of the (C-T)n sequence was observed; these cytosines were unmodified in vitro. The modification of these cytosines can be expected in the H-y5 conformer, which is formed at lower superhelix density than the H-y3 conformer observed mainly in isolated plasmids. It was therefore tentatively suggested that the H-y5 conformer prevails in the cell. Differences in triplex loop modification, which might be due to some intracellular interactions, suggest that the triplex structure in the cell may differ in some details from that observed in vitro. Basically the same modification patterns (Figure 7) were found in the pH range 4.5 to 5.2; at pH 5.4 no characteristic triplex modification was observed. When compared with extracellular pH, the intracellular pHs were higher by about 0.5 as determined by the fluorescein method. These results were obtained at pH values that are not fully physiological for E. coli, but E. coli can grow at pH 5.0 and 5.2, and while a transfer of cells from pH 6.9 to 4.3 results in an induction of acid shock proteins, no such induction occurs due to the transfer to pH 5.682 Intracellular pH depends on the life cycle of the cells (low pH is common to resting cells), and in eukaryotic cells pH values in different cell compartments can vary significantly.683 In the cell, requirements for protonation might be decreased by specific protein binding and/or more negative superhelix density in the given DNA domain. It may thus be expected that triplex structures will be detected at higher pH values, especially with longer polypu·py tracts, which have demonstrated the ability to form triplexes at neutral pH.

FIGURE 7. Modification of the polypyrimidine strand of the insert in the pL153 plasmid with Os,bipy (A) in vitro and (B) in situ. The length of the vertical lines in the nucleotide sequence represents the relative intensities of the bands on the sequencing gel obtained by densitometric tracing (A) after treatment of supercoiled pL153 DNA in vitro at pH 5.0 and (B) after Os,bipy treatment of E. coli cells harboring the pL153 plasmid (at external pH 5.0). Two conformers of the H-DNA triplex are shown, in which the direction of the donated strand is indicated by full triangles. The arrows show the most strongly modified base in the triplex loop and the different modification of bases at the potential B-H junctions, consistent with the presence of the H-y5 conformer in the cell and the H-y3 conformer in vitro.

The possibility of the existence of H-DNA in E. coli cells under physiological conditions is supported by the results of the recent insert deletion analysis of mixed polypu·py sequences capable of forming H-DNA in vitro. Similar to Z-forming sequences, these sequences were stable when cloned into a nontranslated region, while inserts located in the tetracycline resistance structural gene were deleted. Parniewski et al.523 showed that polypu·py sequences capable of forming H-DNA in vitro are undermethylated in vivo within the potential triplex loop when grown in the JM dam+ strain. dam undermethylation was suppressed by the administration of chloramphenicol to the cells. The results were explained by H-DNA formation in vivo and protection of the triplex loop from methylation by interaction with a specific protein.
This explanation is very attractive, but formation of H-DNA with the given plasmids at neutral pH requires a σ of -0.06; so far, only less negative values in vivo have been reported. Other interpretations are possible: presence of a partially extruded H-DNA trapped specifically in vivo, or formation of a different structure which is reabsorbed due to chloramphenicol treatment. Application of chemical probes in situ might help to clarify this problem.

It has been shown that H-DNA can exist in a bacterial cell at acid intracellular pH values. Further work is necessary to establish whether this structure does exist under physiological conditions and in eukaryotic cells, where the location and high occurrence of polypu·py sequences suggest their biological importance.

3. Structural-Functional Relations

Polypyrimidine sequences in vitro were observed in a number of naturally occurring sequences, some of which are involved in known biological functions. Carcinogen-induced amplification of the integrated polyomavirus DNA was arrested within a specific cell DNA segment containing a (G-A)n·(C-T)n tract.684 Single-stranded (T-C)n and (G-A)n tracts of various lengths were cloned into M13 phage and replicated by extension of the M13 primer to determine whether these tracts act as stop signals for DNA replication in vitro. Specific arrest of replication was detected around the middle of (T-C)n and (G-A)n with n > 16. In (T-C)n tracts the arrests were more prominent at pH 6.5 to 7.5 than at pH 8.0. It was concluded that the arrests are due to triplex formation between partially replicated (T-C)n or (G-A)n tracts and the unreplicated portion of these sequences. Cooney et al. showed that a 27-base-long purine-rich oligonucleotide binds to duplex DNA at a single site within the 5'-end of the human c-myc gene, 115 bp upstream from the transcription origin P1.
A correlation between the triplex formation at -115 bp and repression of c-myc transcription in vitro was shown. If such a triplex formation can also occur in vivo, it may represent an alternative method of gene control in the cell. Using Os,py, DEPC, and DMS probes, Kinniburgh identified H-DNA in vitro in a mixed polypu·py sequence from a positive cis-acting transcription element of the human c-myc gene. This sequence binds several transcription factors. Kinniburgh speculated that hybridization of the RNA component of one of these factors may represent the first step in H-DNA formation in vivo and that H-DNA would increase the transcriptional activity of the c-myc gene. An unusual structure, probably H-DNA, was formed by the (G),.ATT(G), sequence in a supercoiled plasmid containing the genome of the human immunodeficiency virus type 1.687 This sequence is located at the integration site of a human immunodeficiency virus (HIV) provirus.

Kohwi-Shigematsu showed quite recently that (G)n·(C)n sequences enhance CAT gene expression in eukaryotic cells. The level of enhancement was highest for n = 28 to 30, comparable to the polyoma enhancer. Any shorter or longer tracts were less active. An in vivo competition assay suggested the existence of a trans-acting factor that interacts with the (G)n·(C)n sequence. In Drosophila nuclei a protein binding to multiple GAGA DNA sequence motifs was found. This protein activates the transcription of the Ubx promoter in a binding-site-dependent manner.688 Further proteins from the same source bind to regions of (C-T)n in the promoters of heat shock and histone genes.689 Interaction of the purified protein fraction with the intergenic region located between promoters occurred on linear DNA fragments, indicating that supercoiling was not required. Interaction of these proteins with supercoiled DNA in vitro was not studied.
In principle, specific protein interaction with the polypu·py region might stabilize the duplex, preventing triplex formation, and vice versa. On the other hand, formation of a triplex can prevent specific protein-DNA interactions normally occurring in the duplex. Oligonucleotide-directed triplex formation can inhibit recognition of the DNA double helix by prokaryotic restriction/modification enzymes and by a eukaryotic transcription factor at homopurine target sites.

E. Conclusions

Among the local DNA structures discussed in this section, triplexes appear to be the best candidates to play a role in gene expression. Intermolecular triplexes seem to have a better chance to form in vivo, as their requirements for specific environmental conditions and negative supercoiling are less stringent. Moreover, they can be formed not only between DNA molecules, but also between DNA and RNA, including various nucleoproteins. Involvement of intermolecular triplexes in biological processes in vitro such as DNA replication, transcription, and restriction/modification has been demonstrated.

The formation of intramolecular triplexes, Z-DNA, and cruciforms in vivo is strongly dependent on the intracellular DNA superhelix density. Until recently the probability of the formation of these structures in vivo was considered to be rather low, as the (average) effective level of (unconstrained) supercoiling is about one half that of purified DNA.173,5%,599,602,692 The recent discovery of transcriptional waves of supercoiling makes the extrusion of local DNA structures much more probable.'45-623-625*693 In fact, if a suitable sequence is present in a sufficiently negatively supercoiled DNA domain, a corresponding local structure should be formed in vivo unless its extrusion is prevented by some other factor (e.g., protein binding, kinetic barrier). Detection of a local structure in vivo can be exploited to study the level of effective supercoiling in a given DNA domain.
Therefore, in addition to a question frequently asked in recent years, "Do these (local, unusual) structures exist in vivo?", we may now also ask, "Why is this structure (e.g., Z-DNA) not formed in the given nucleotide sequence in vivo?" Asking questions is usually easier than answering them.

To answer the above questions, further development of techniques suitable for DNA structure studies in the cell is necessary. We recently applied osmium tetroxide complexes to study the presence of open local DNA structures in eukaryotic cells by means of immunofluorescence.610 As the chemical probe can be applied to cells and even glands prior to fixation, the adverse effects of fixation are eliminated. Using this technique we observed selective binding of the osmium probe to DNA in the cells, suggesting a wide occurrence of open local structures in eukaryotic cells. These structures are probably not limited to cruciforms, B-Z junctions, and triplexes. They may include further known and unknown structures and their junctions, as well as open transcription complexes and other structures connected with DNA replication, recombination, and other biological processes. The permanent or transient availability of bases contained in these structures for interaction with the environment (which is important for their detection) may also play a significant biological role.

VIII. PERSPECTIVES

The question of recognition of DNA nucleotide sequences by specific proteins is one of the most important problems of contemporary molecular biology. There is no doubt that the specific proteins do not read the nucleotide sequence as such; rather, they recognize the three-dimensional structure of DNA. Nussinov believes that DNA regions may be distinguished by the thermodynamics or flexibility of the DNA double helix and not by different local structures, because they might not be trapped as such.
Evidence has been presented suggesting that different protein motifs might interact with the DNA grooves and/or backbone, recognizing their specific sequence-dependent spatial arrangements. In fact, the amount of data about the relations between nucleotide sequences and their locations in pro- and eukaryotic genomes is much larger than the amount of data concerning the presence and location of DNA local structures. This difference has been due principally to the difficulties in obtaining the latter data. The results summarized in this review indicate that local DNA structures can be trapped both in vitro and in vivo, and that sequences from which these structures may be extruded are frequently located in biologically significant sites of the genomes. It would thus be rather surprising if the local DNA structures did not represent any signal in DNA recognition. I believe that the recent progress in the development of techniques for probing the DNA structure in cells will soon result in a great advance in our knowledge of the relation between local DNA structures and their biological function. The present situation seems to resemble that close to the end of the 1970s, when the suggested polymorphy of the DNA double helix was rather reluctantly accepted.10 This was at a time when the crystals of the Z-DNA were probably already growing.
List of Abbreviations

Chemical Probes
Os,py: osmium tetroxide, pyridine reagent
Os,bipy: osmium tetroxide complexed with 2,2'-bipyridine
phe: 1,10-phenanthroline
bpds: bathophenanthroline disulfonic acid
TEMED: tetramethylethylenediamine
DEPC: diethylpyrocarbonate
BAA (CAA): bromo(chloro)acetaldehyde
DMS: dimethylsulfate
ENU: ethylnitrosourea
MNU: methylnitrosourea
CMC: N-cyclohexyl-N'-β-(4-methylmorpholinium)ethylcarbodiimide p-toluenesulfonate

Other Abbreviations
homopu·py: homopurine·homopyrimidine
polypu·py: polypurine·polypyrimidine
2-D: two dimensional

DEDICATION

This article is dedicated to Professor Julius Marmur on the occasion of his 65th birthday.

ACKNOWLEDGMENTS

This paper was written during my sabbatical leave at the Max-Planck Institute for Biophysical Chemistry, Göttingen, and supported in part by grant 436 CSR-113/13/0 of the Deutsche Forschungsgemeinschaft (DFG). This review could not have been written without the help of Dr. Thomas Jovin, to whom I am greatly indebted for his hospitality, stimulating discussions, advice, and critical reading of the manuscript. I am also very grateful to Drs. Donna Arndt-Jovin, Stephan Diekmann, Udo Heinemann, Michel Robert-Nicoud, and other colleagues for their valuable comments on sections of this article. My thanks are also due to Drs. Jim Dahlberg, Maxim Frank-Kamenetskii, Udo Heinemann, Terumi Kohwi-Shigematsu, Jan Klysik, David Lilley, Andrzej Stasiak, Ed Trifonov, Robert D. Wells, and other colleagues for allowing me to quote their unpublished data. Last, but not least, I would like to express my deep gratitude to Eleanor Mann and Dr. Frantisek Jelen for their help with preparation of the manuscript.

REFERENCES

1. Wilkins, M. H. F., Physical studies of the molecular structure of deoxyribose nucleic acid and nucleoprotein, Cold Spring Harbor Symp. Quant. Biol., 21, 75, 1956.
2. Crick, F. H. C., The double helix: a personal view, Nature, 248, 766, 1974.
3. Zimmerman, S. B., The three-dimensional structure of DNA, Annu. Rev. Biochem., 51, 395, 1982.
4. Jovin, T. M., Recognition mechanisms of DNA-specific enzymes, Annu. Rev. Biochem., 45, 889, 1976.
5. Palecek, E., Changes in oscillopolarographic behaviour of deoxyribonucleic acids at temperatures below denaturation temperature, J. Mol. Biol., 11, 839, 1965.
6. Palecek, E., Polarographic behaviour of native and denatured deoxyribonucleic acid, J. Mol. Biol., 20, 263, 1966.
7. Palecek, E., Deoxyribonucleic acid conformational changes at temperatures below melting temperature, Arch. Biochem. Biophys., 125, 142, 1968.
8. Johnson, P. H. and Laskowski, M., Sugar-unspe-