Promoter melting triggered by bacterial RNA polymerase occurs in three steps

Jie Chenᵃ, Seth A. Darstᵇ, and D. Thirumalaiᵃ,ᶜ,¹

ᵃBiophysics Program, Institute for Physical Science and Technology, ᶜDepartment of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742; and ᵇThe Rockefeller University, 1230 York Avenue, New York, NY 10065

Edited* by Jose N. Onuchic, University of California at San Diego, La Jolla, CA, and approved May 28, 2010 (received for review March 18, 2010)

RNA synthesis, carried out by DNA-dependent RNA polymerase (RNAP) in a process called transcription, involves several stages. In bacteria, transcription initiation starts with promoter recognition and binding of RNAP holoenzyme, resulting in the formation of the closed (R·Pc) RNAP-promoter DNA complex. Subsequently, a transition to the open R·Po complex occurs, characterized by separation of the promoter DNA strands in an approximately 12 base-pair region to form the transcription bubble. Using coarse-grained self-organized polymer models of Thermus aquaticus RNAP holoenzyme and promoter DNA complexes, we performed Brownian dynamics simulations of the R·Pc → R·Po transition. In the fast trajectories, unwinding of the promoter DNA begins by local melting around the −10 element, which is followed by sequential unzipping of DNA up to the +2 site. The R·Pc → R·Po transition occurs in three steps. In step I, dsDNA melts and the nontemplate strand makes stable interactions with RNAP. In step II, DNA scrunches into RNA polymerase and the downstream base pairs sequentially open to form the transcription bubble, which results in strain buildup. Subsequently, downstream dsDNA bending relieves the strain as R·Po forms. Entry of the dsDNA into the active-site channel of RNAP requires widening of the channel, which occurs by a swing mechanism involving transient movements of a subdomain of the β subunit caused by steric repulsion with the DNA template strand.
If premature local melting away from the −10 element occurs first, then transcription bubble formation is slow, involving reformation of the opened base pairs and subsequent sequential unzipping as in the fast trajectories.

DNA scrunching | transcription initiation | self-organized polymer model | molecular simulations | sequential DNA unzipping

The DNA-dependent RNA polymerase (RNAP), whose sequence, structure, and functions are universally conserved from bacteria to man (1, 2), is the key enzyme in the transcription of the genetic information in all organisms (3-5). There are three major stages in the transcription cycle (3): (i) initiation, which first requires binding of an initiation-specific σ factor to the catalytically competent core RNAP to form the holoenzyme, followed by recognition of the promoter DNA to form the closed (R·Pc) complex and subsequent spontaneous transition to the open (R·Po) complex; (ii) elongation of the transcript by nucleotide addition; and (iii) termination, involving cessation of transcription and disassembly of the RNAP elongation complex. Among these highly regulated stages, the most complicated may be initiation because it involves promoter recognition, DNA unwinding, and the formation of the transcription bubble inside the RNAP active-site channel, where RNA synthesis occurs.

A simplified transcription initiation pathway is (6, 7) R + P → R·Pc → R·Po → IC. RNAP is composed of subunits (Fig. 1B) that are assembled like a crab claw. The two "pincers," formed from the large β and β′ subunits, hold the promoter DNA in the active-site channel between the pincers (Fig. 1). The structural model of RNAP holoenzyme complexes with fork-junction DNA (10) has given insights into the mechanism of the R·Pc → R·Po transition.
A variety of structures were pieced together to construct detailed models for R·Pc and R·Po, which led to the following mechanism of R·Po formation (3, 10). Local melting of the promoter −10 element allows binding of the exposed nontemplate strand to conserved region 2.3 of σ (σ2.3) (Fig. 1B), stabilizing the melted region. Melting also renders the promoter DNA flexible, thus facilitating its entry into the active channel. The R·Po structure further suggests that the promoter DNA bends into the RNAP active channel to form the transcription bubble (11-13). Although the structural models provide a plausible hypothesis for transcription bubble formation, dynamical studies are required to describe the conformational changes that accompany the R·Pc → R·Po transition.

Author contributions: D.T. designed research; J.C. performed research; D.T. contributed new reagents/analytic tools; J.C., S.A.D., and D.T. analyzed data; and J.C., S.A.D., and D.T. wrote the paper.
The authors declare no conflict of interest.
*This Direct Submission article had a prearranged editor.
¹To whom correspondence should be addressed. E-mail: thirum@umd.edu.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1003533107/-/DCSupplemental.
www.pnas.org/cgi/doi/10.1073/pnas.1003533107 PNAS Early Edition | 12523 of 12528

Here, we use the structural models for R·Pc and R·Po (updated from (10)) of T. aquaticus to address the following questions: (a) What are the steps in transcription bubble formation, and (b) what is the nature of the RNAP dynamics that enables the downstream dsDNA to enter the enzyme? To answer these questions, we performed Brownian dynamics simulations of the R·Pc → R·Po transition using a coarse-grained self-organized polymer (SOP) model (14). The simulations of the kinetics of the R·Pc → R·Po transition show that the transcription bubble forms in three distinct steps. In step I, a region near the −10 element on the promoter DNA melts.
In step II, the promoter DNA scrunches into the RNAP active channel, forming the transcription bubble, and in step III, the downstream DNA bends. DNA bending is a result of downstream DNA relaxation and occurs only after unwinding of the dsDNA. Widening of the channel, which is needed to accommodate entry of the dsDNA into the active channel, requires transient expansion of key structural elements in the β subunit of RNAP. This implies that the internal enzyme dynamics plays an important role in the R·Pc → R·Po transition.

Results

A Network of Contacts Triggers the Promoter Melting. The structures of the RNAP-DNA complexes used in this work (10) have 3,122 residues and 150 nucleotides. The holoenzyme has six subunits: αI (Ala6-Glu229), αII (Ala6-Phe225), β (Ala2-Ala1116), β′ (Ala3-Ala1499), ω (Ala2-Ala93), and σ (Ala93-Ala438) (Fig. 1B). The promoter DNA has 75 base pairs, −54 to +21, labeled with respect to the transcription start site +1 (Fig. 1A). In R·Pc, the promoter DNA "sits" on the top of RNAP (Fig. 1B) and forms stable interactions between the σ subunit and the −10 and −35 regions. Despite the low-resolution nature of the models, the dynamics reveal key structural changes that occur during the R·Pc → R·Po transition. Several contacts between the DNA −10 element and the σ subunit of RNAP rupture, in particular the contacts involving nucleotides −12 to −8 on the template strand (Fig. S1). Formation of contacts between nucleotides −3 to +5 on the nontemplate strand and the β subunit (Fig. S1), nucleotides −9 to +1 on the template strand and the β subunit (Fig. S1), and nucleotides +1 to +7 on the template strand and the β′ subunit (Fig. S1) stabilizes the R·Po state.

R·Pc → R·Po Transition Trajectories Partition into Fast (Efficient) and Slow (Inefficient) Tracks.
The global nature of the R·Pc → R·Po transition is monitored by the time-dependent changes of the root mean square deviation $\Delta_c(t)$ ($\Delta_o(t)$) of the DNA with respect to the closed (open) structure. The transition times, $\tau_m$, calculated using $|\Delta_c(\tau_m) - \Delta_o(\tau_m)| < \epsilon = 0.1$ Å, were used to partition the set of 30 trajectories into fast and slow processes (Fig. S2). In the fast routes, $\Delta_c(t)$ ($\Delta_o(t)$) increases (decreases) very rapidly (Fig. S2), which indicates (see below) that upon opening of the DNA base pairs the transcription bubble forms efficiently. In contrast, in the slow routes, long-lived metastable states, which are indicated by a plateau in $\Delta_c(t)$ ($\Delta_o(t)$) (Fig. S2), are populated. The base pairs open by a complicated pathway, resulting in decreased efficiency of transcription bubble formation. The conclusions do not depend on the value of the dielectric constant used in the treatment of the electrostatic interactions (see Eq. S1 and Figs. S2 and S3).

What is the origin of the differences between the fast and slow trajectories? To answer this question, we examined the time-dependent conformational changes of the promoter DNA, which we describe using the distances, $d_i(t)$, between the two complementary nucleotides of each base pair in the promoter DNA. Here, $d_i(t) = |\vec{r}_i^{\,T}(t) - \vec{r}_i^{\,NT}(t)|$, where $\vec{r}_i^{\,T}(t)$ and $\vec{r}_i^{\,NT}(t)$ are the positions of the $i$th nucleotide on the template and the nontemplate strands, respectively. Since the transcription initiation site is at $i = +1$ and the transcription bubble forms between $i = -12$ and $+2$ (6), we computed $d_i(t)$ for four representative base pairs, −11, −7, −3, and +2, to describe the transcription bubble formation. In the R·Pc state, $d_i(t) \approx 11$ Å for all four base pairs, but in the R·Po state, $d_i(t) \approx$ 18 Å, 48 Å, 53 Å, and 28 Å for the −11, −7, −3, and +2 base pairs, respectively.

Promoter DNA Unzips Sequentially from the −10 Element in the Fast Track.
Analysis of $d_i(t)$ in all the fast trajectories shows a consensus sequence of events during the R·Pc → R·Po transition. Base-pair −11 opens first at t ~ 10 µs, which is followed by disruption of interactions in base-pair −7 at t ~ 16 µs. In both cases the equilibrium values corresponding to the structure in R·Po were reached rapidly (Fig. 2 and Fig. S4). At t ~ 24 µs, the −3 pair rips and the distance between the nucleotides attains the value in the R·Po state. The distance $d_{+2}(t)$ fluctuates between 11 Å and 30 Å, and reaches 30 Å shortly after base-pair −3 opens. The opening of the upstream base pairs favors the rupture of their downstream neighbors, which establishes that the base pairs from −12 to +2 rupture abruptly in a sequential manner by an unzipping mechanism. Complete analysis of all the $d_i(t)$ shows that sequential unzipping starting from the −10 site leads to rapid transcription bubble formation.

Initial Base Pair Opening Away from the −10 Element Results in Slow R·Pc → R·Po Transition. Although promoter recognition sequences are localized in the −35 and −10 regions, melting of a base pair can occur stochastically (see also Fig. S5). However, as seen in the fast trajectories, rupture must start from the −10 element for efficient transcription bubble formation. In some trajectories, multiple base pairs melt nearly simultaneously in a nonsequential process, which greatly impedes transcription bubble formation. Fig. 2 shows that in one of the slow trajectories, rupture of base-pairs −3 and +2 competes with the opening of base-pair −7. At t ~ 40 µs, base-pair −7 remains intact while the downstream base-pairs −3 and +2 rip, as seen in the increase of $d_{-3}$ and $d_{+2}$ at t ~ 40 µs from 11 Å to 40 Å and from 11 Å to 50 Å, respectively (Fig. 2). At longer times, base-pairs −3 and +2 reform, as shown by the decrease in $d_{-3}$ and $d_{+2}$ to 11 Å, the value in the R·Pc state, resulting in the "resetting" of the initial state.
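The base-pair analysis described here lends itself to a short sketch. Assuming a trajectory has already been reduced to distance traces $d_i(t)$ sampled at common times, the Python below (not the authors' code) estimates when each base pair opens for good, counts open-then-reclose events of the kind seen in the slow track, and checks whether opening proceeds from upstream to downstream. The function names, the 15 Å opening threshold, and the toy traces are hypothetical choices made for illustration; only the closed-state distance of about 11 Å and the approximate R·Po distances are taken from the text.

```python
import math

# Hypothetical cutoff (in Angstrom) between the closed value (~11) and the open values.
OPEN_THRESHOLD = 15.0


def pair_distance(r_template, r_nontemplate):
    """d_i for one frame: distance between the two nucleotides of base pair i."""
    return math.dist(r_template, r_nontemplate)


def opening_time(times, distances, threshold=OPEN_THRESHOLD):
    """Time of the first frame after which d_i stays above the threshold.

    Returns None if the pair is closed in the last frame (or no frames exist).
    """
    last_closed = -1
    for k, d in enumerate(distances):
        if d <= threshold:
            last_closed = k
    if last_closed == len(distances) - 1:
        return None
    return times[last_closed + 1]


def count_reclosures(distances, threshold=OPEN_THRESHOLD):
    """Number of open-to-closed events, i.e. premature openings that reset."""
    reclosures = 0
    for prev, cur in zip(distances, distances[1:]):
        if prev > threshold and cur <= threshold:
            reclosures += 1
    return reclosures


def opens_sequentially(opening_times):
    """True if pairs open in order of position, most upstream (most negative) first."""
    ordered = [opening_times[pos] for pos in sorted(opening_times)]
    if any(t is None for t in ordered):
        return False
    return all(a <= b for a, b in zip(ordered, ordered[1:]))


# Synthetic traces (not simulation data), sampled every 8 microseconds.
times = [0, 8, 16, 24, 32, 40]
fast = {
    -11: [11, 18, 18, 18, 18, 18],
    -7: [11, 11, 48, 48, 48, 48],
    -3: [11, 11, 11, 53, 53, 53],
    +2: [11, 11, 11, 11, 28, 28],
}
slow = dict(fast)
slow[-3] = [11, 40, 11, 11, 53, 53]  # opens early, recloses, reopens later

fast_times = {pos: opening_time(times, d) for pos, d in fast.items()}
slow_times = {pos: opening_time(times, d) for pos, d in slow.items()}
print(fast_times, opens_sequentially(fast_times))
print(slow_times, count_reclosures(slow[-3]))
```

Because the opening time is defined here as the last upward crossing, a slow trace that resets and then unzips in order still passes the sequential check, while its reclosure count flags the detour. A threshold this close to the closed-state value would also count the fluctuations of $d_{+2}(t)$ between 11 Å and 30 Å as reclosures, so some smoothing or a hysteresis band would likely be needed on real traces.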
Subsequently, melting of the base pairs occurs sequentially, which is manifested in the increase of $d_i(t)$ to the values in the R·Po state.

Transcription Bubble Formation in Fast Trajectories Occurs in Three Steps. We used the time-dependent changes in the distance between the $i$th ($j$th) nucleotide on the promoter sequence and the $j$th residue on RNAP, $d_{ij}(t) = |\vec{r}_i(t) - \vec{r}_j(t)|$, where $\vec{r}_i$ ($\vec{r}_j$) is the position of the $i$th nucleotide ($j$th residue), to identify three steps in the transcription bubble formation process. The overall dynamics associated with bubble formation is assessed using $d_{-5}(t)$; on average (black line in Fig. 3A), bubble formation takes about 30-40 µs. Dissection of the events leading to the increase in $d_{-5}(t)$ from about 11 Å at t = 0 to 55 Å at t ≈ 35 µs shows that transcription bubble formation occurs in three major steps.

Step I: The −10 element melts locally. The decrease in $d_{-5',\mathrm{Mg}^{2+}}(t)$ (the prime refers to the base-pair number on the nontemplate (NT) strand) (Fig. 3B) shows that the nontemplate strand moves towards the active site of the enzyme (amino acids around the Mg2+ ion). In this stage, $d_{-12,+21}$ also decreases (Fig. 3C), but in contrast to $d_{-5',\mathrm{Mg}^{2+}}(t)$, the conformation of the +21 base pair relative to RNAP continues to evolve throughout the transcription bubble formation process (see below). Compared to the time needed for bubble opening (Fig. 3A), the characteristic time associated with the decrease of $d_{-5',\mathrm{Mg}^{2+}}(t)$ (t ~ 16 µs) is short; this decrease is the major feature of step I. Although the template strand of the promoter DNA also moves synchronously in this step, its movement lasts over the entire duration of bubble formation (t ~ 35 µs). Therefore, we do not consider the movement of the template strand as a major feature of step I.
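As a rough illustration of how a characteristic time might be read off a distance trace of this kind, the sketch below averages several $d_{ij}(t)$ traces frame by frame and reports the first time at which the average has covered half of its total change. This is not the authors' procedure: the half-change criterion, the function names, and the synthetic numbers are all hypothetical, and the text does not specify how its characteristic times were extracted.

```python
import math


def bead_distance(r_i, r_j):
    """d_ij for one frame: distance between nucleotide bead i and bead j."""
    return math.dist(r_i, r_j)


def ensemble_average(traces):
    """Average equally sampled d_ij(t) traces frame by frame."""
    n = len(traces)
    return [sum(frame) / n for frame in zip(*traces)]


def half_change_time(times, trace):
    """First time at which the trace has covered half of its net change.

    The net change is measured from the first to the last frame, so the
    function works for both decreasing and increasing distances.
    """
    start, end = trace[0], trace[-1]
    midpoint = 0.5 * (start + end)
    for t, d in zip(times, trace):
        if (end >= start and d >= midpoint) or (end < start and d <= midpoint):
            return t
    return None


# Synthetic traces in Angstrom (not simulation data), sampled every 8 microseconds.
times = [0, 8, 16, 24, 32]
traces = [
    [100, 90, 70, 62, 60],
    [100, 86, 66, 58, 60],
]
average = ensemble_average(traces)
print(average)
print(half_change_time(times, average))
```

Other definitions, such as fitting an exponential relaxation, would give different numbers; the half-change time is used here only because it needs no fitting and no extra dependencies.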
The correlated movement of the nontemplate and the template strands is consistent with the assumption that local melting of DNA in the −10 element renders it flexible. In addition, the observed stabilization of the nontemplate strand in step I agrees with the experimental finding that the conserved aromatic residues of σ are positioned to recognize and stabilize the exposed nontemplate strand (15-17). Loss of base-pair interaction upon melting of the −10 base pairs (TATAAT) results in the formation of favorable interactions of adenine or thymine nucleotides with the aromatic side chains (Phe248, Tyr253, and Trp256) of the σ2.3 region, as well as electrostatic interactions between DNA and the enzyme.

Fig. 2. Dynamics of transcription bubble formation. (A) Distance between complementary nucleotides of the base pairs in the bubble region as functions of time for a representative fast trajectory. Cyan, red, black, and blue lines correspond to $d_i(t)$ for base pairs −11, −7, −3, and +2, respectively. Structures of the RNAP complex at t = 12 µs, 16 µs, and 24 µs during the development of the transcription bubble are shown. (B) Same as (A) except the results are for a slow trajectory. The reformation of the prematurely opened base pairs −3 and +2 (black and blue curves, respectively) is highlighted in the shaded region. The structures sampled during the R·Pc → R·Po transition are highlighted. Arrows indicate the starting and ending states.

Step II: DNA scrunches into the RNAP active-site channel and forms a bubble. The transcription bubble starts to grow from the −10 to the +2 base pair as the promoter sequence unzips. The bubble opening, quantified by the base-pair distance at its center, $d_{-5}(t)$, increases from 11 Å to 55 Å, the value in the R·Po state, which implies that in this step the promoter DNA unwinds and the strands separate. Meanwhile, the distance between the −10 element and the downstream edge of the promoter DNA (base-pair +21) decreases from 110 Å to 55 Å (see the V-shape of $d_{-12,+21}$ in Fig.
3C). The observed decrease in Fig. 3C during step II is reminiscent of the scrunching

[Fig. 3 plot: panels A-D with steps I-III shaded; panel B shows $d_{-12',\mathrm{Mg}^{2+}}$, panel C shows $d_{-12,+21}$; horizontal axis is time t.]

Fig. 3. Three steps (in shaded colors) in the transcription bubble formation in the fast trajectories. (A) The time-dependent increase in $d_{-5}(t)$ shows the growth of the transcription bubble. The black line shows $d_{-5}(t)$ averaged over multiple trajectories, and the individual traces for a few trajectories are shown for comparison. (B) Average changes in the distance between the nucleotide −12 on the nontemplate strand and Mg2+ as a function of time. (C) Ensemble-average changes in $d_{ij}(t)$ for $i = -12$ and $j = +21$ as a function of $t$. (D) The time-dependent change in the angle