Protocells and RNA Self-Replication

Gerald F. Joyce¹ and Jack W. Szostak²

¹ The Salk Institute for Biological Studies, La Jolla, California 92037
² Howard Hughes Medical Institute and Department of Molecular Biology, Massachusetts General Hospital, Boston, Massachusetts 02114

Correspondence: gjoyce@salk.edu; szostak@molbio.mgh.harvard.edu

SUMMARY

The general notion of an “RNA world” is that, in the early development of life on the Earth, genetic continuity was assured by the replication of RNA, and RNA molecules were the chief agents of catalytic function. Assuming that all of the components of RNA were available in some prebiotic locale, these components could have assembled into activated nucleotides that condensed to form RNA polymers, setting the stage for the chemical replication of polynucleotides through RNA-templated RNA polymerization. If a sufficient diversity of RNAs could be copied with reasonable rate and fidelity, then Darwinian evolution would begin with RNAs that facilitated their own reproduction enjoying a selective advantage. The concept of a “protocell” refers to a compartment where replication of the primitive genetic material took place and where primitive catalysts gave rise to products that accumulated locally for the benefit of the replicating cellular entity. Replication of both the protocell and its encapsulated genetic material would have enabled natural selection to operate based on the differential fitness of competing cellular entities, ultimately giving rise to modern cellular life.

Outline

1 Introduction
2 Compartmentalized systems
3 Nature of the first genetic material
4 Chemical RNA replication systems
5 RNA-catalyzed replication systems
6 Compartmented genetics
7 Summary
References

Editors: Thomas R. Cech, Joan A. Steitz, and John F.
Atkins
Additional Perspectives on RNA Worlds available at www.cshperspectives.org
Copyright © 2018 Cold Spring Harbor Laboratory Press; all rights reserved; doi: 10.1101/cshperspect.a034801
Cite this article as Cold Spring Harb Perspect Biol 2018;10:a034801
Downloaded from http://cshperspectives.cshlp.org/ at Masaryk University on September 19, 2019 - Published by Cold Spring Harbor Laboratory Press

1 INTRODUCTION

The first cells must have been simple in comparison with modern highly evolved cells and therefore are often referred to as “protocells.” Protocells must have been simple enough to self-assemble spontaneously in a chemically rich environment under appropriate physical conditions but sufficiently complex that they were poised to evolve to greater complexity, ultimately giving rise to all of modern biology (Szostak et al. 2001). What kind of cell structure and composition would have been consistent with the dual requirements of original simplicity and future complexity? Given that all modern cells are bounded by a semipermeable cell membrane, it is reasonable to postulate that protocells were also membrane-bound structures, albeit with membranes very different from those of modern cells (Deamer 1997). Similarly, all modern cells contain diverse RNA molecules that encode information and fulfill many biochemical functions. The central role of RNA in modern cellular biochemistry, and especially the role of the large ribosomal RNA as the catalyst of protein synthesis (Steitz and Moore 2003), provides a powerful argument in support of RNA as the dominant genetic and functional biopolymer at an early stage in the evolution of life. Whether RNA, a close relative of RNA, or some much simpler genetic material formed the original basis of protocell heredity remains controversial.
However, recent advances in prebiotic chemistry suggest that RNA itself may have emerged directly from the chemistry of the young Earth and may therefore have been the genetic polymer of the first protocells (Sutherland 2016).

Many aspects of modern biology are universal, including the translation apparatus and instructed protein synthesis, enzyme-catalyzed metabolism, and protein-mediated membrane transport. However, all of these complex processes are the result of extensive evolution and therefore could not have been present in protocells. In contrast, short noncoded peptides and a wide range of other products of prebiotic chemistry likely were present and may have played important roles in the complex environment required to nurture the assembly, growth, and division of protocells.

Protocells may be viewed as the products of a series of self-assembly processes that took place in specific early Earth environments that provided the necessary chemical building blocks and sources of energy (Fig. 1). The classical bilayer membrane structure results from the spontaneous self-assembly of separate amphiphilic molecules into a lower energy aggregate state. The self-assembly of RNA is more complex, but given suitably activated ribonucleotides, RNA polymers can form spontaneously on mineral surfaces (Ferris et al. 1996) or simply upon freezing because of the concentration of ribonucleotides in the eutectic phase between ice crystals (Kanavarioti et al. 2001). As membrane sheets form, the high energy of the membrane edge drives closure into vesicles, which will trap molecules such as RNA that are present in the surrounding solution. Seen in this way, the assembly of protocells appears as a straightforward and perhaps inevitable process, given the right environment. The real problem is not protocell formation, but protocell replication, including growth and division of the protocell membrane and replication of the encapsulated genetic material. Furthermore, these processes must be mutually compatible, and the sources of energy required to drive protocell growth and division must also be compatible with the integrity of the protocell components. The nature of these replication processes and the way that protocell replication led to the evolution of functional RNAs and thus to the complexity of the RNA world is the subject of this review.

Figure 1. Schematic diagram of a protocell. Protocells bounded by multiple bilayer membranes composed of simple amphiphilic molecules would have been permeable to nucleotides and metal ions complexed with citrate or other ligands. Larger genomic and functional RNA molecules would have been trapped within the protocell interior. Protocell replication would involve growth and division of the membrane boundary as well as replication of the genomic RNA.

2 COMPARTMENTALIZED SYSTEMS

What evolutionary advantages does compartmentalization provide compared with the replication of informational molecules in solution? Simple processes of natural selection can occur in solution. Even without replication, chemical and physical factors will control and alter the sequences that are generated and persist (Chen and Nowak 2012). In an environment that allows for nonenzymatic replication, sequences that are more efficiently replicated will have a selective advantage. Repeated cycles of replication would therefore be expected to shape the structure of a population by influencing base composition, template length, secondary structure, and other properties.
But what about the evolution of functional properties such as molecular recognition and catalysis? Numerous in vitro selection experiments have shown that, even without compartmentalization, RNAs can be obtained that bind a ligand or catalyze a self-modification reaction. In nature, selective pressures might favor binding to mineral surfaces, for example, to prevent RNAs from being washed out of a favorable local environment. However, for more complex functions, such as RNA-catalyzed replicative or metabolic reactions, some form of compartmentalization becomes essential. Consider an RNA that catalyzes formation of a metabolite that contributes to the replication of that RNA, for example, an activated nucleotide. In solution, the products of that reaction will diffuse away from the ribozyme that generated them, thus failing to benefit the RNA itself and perhaps benefiting other neighboring RNAs. A similar argument can be made with respect to an RNA replicase, which in free solution might replicate other surrounding RNAs, thus failing to benefit itself. Compartmentalization enables the positive feedback required for Darwinian evolution by keeping useful products close to the catalysts that generated them and by keeping molecules related by descent physically closer, on average, compared with more distantly related molecules. Compartmentalization is also essential to prevent system crashes caused by the evolution of parasites. The activity of a ribozyme RNA replicase, or indeed any ribozyme, relies on its folded structure, but the formation of a stable folded structure is at odds with the properties of an ideal template, which is unfolded and readily available to be copied. In free solution, parasitic RNAs that are better templates will increase at the expense of functional replicase RNAs, leading to a population crash. 
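The free-solution argument can be made concrete with a toy calculation. The sketch below is illustrative only and is not taken from the studies cited here: it assumes a well-mixed pool containing just two species, a replicase and a parasite, and the relative template efficiencies (`replicase_template_eff`, `parasite_template_eff`) are hypothetical parameters chosen for illustration. In each round, strands are copied in proportion to how good a template they are, so the better template gains ground.

```python
def free_solution_dynamics(replicase_frac=0.99,
                           replicase_template_eff=1.0,
                           parasite_template_eff=2.0,
                           generations=30):
    """Toy model of a well-mixed replicase/parasite pool.

    All parameter values are hypothetical; returns the replicase
    fraction at the start of each generation.
    """
    history = []
    r = replicase_frac
    for _ in range(generations):
        history.append(r)
        p = 1.0 - r  # parasite fraction
        # Each species is copied in proportion to its abundance and to
        # how readily it serves as a template.
        copied_r = r * replicase_template_eff
        copied_p = p * parasite_template_eff
        # Renormalize to get the composition of the next generation.
        r = copied_r / (copied_r + copied_p)
    return history


if __name__ == "__main__":
    for gen, frac in enumerate(free_solution_dynamics()):
        if gen % 5 == 0:
            print(f"generation {gen:2d}: replicase fraction = {frac:.6f}")
```

In this sketch the replicase fraction falls every generation whenever the parasite is the better template, and because only the replicase does the copying, its loss corresponds to the crash described above. Setting the two efficiencies equal leaves the composition unchanged.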
In a population of small compartments, however, the emergence of a parasitic sequence in one compartment will not affect the population as a whole, even if it dooms the descendants of that compartment. Experimentally, even transient compartmentalization is sufficient to prevent extinction caused by parasites (Bansho et al. 2016; Matsumura et al. 2016). These considerations imply that compartmentalization is an essential step on the path to biology. But of the many ways that compartmentalization can be achieved, which is most relevant to the origin of life?

2.1 Pros and Cons of Membrane Systems

The formation of compartments through the assembly of bilayer membranes has a major conceptual advantage over other types of compartments, namely, its continuity with known biology. If the first protocells were membrane-bound compartments that could grow and divide, together with the replication of their encapsulated genetic material, then there could be a continuous line of descent from the first protocells to all subsequent cellular biology. Other forms of compartmentalization would have had to undergo a transition to membrane-based compartments. Such a transition would have become increasingly difficult as the system of replicating and functional RNAs became more highly adapted to any prior form of compartmentalization. For example, compartments that are more open to the environment might enable the evolution of ribozymes that use large, polar, or charged substrates that would become inaccessible during a transition to membrane encapsulation.

There are, however, several conceptual challenges with a membrane-based protocell structure. Chief among these is that bilayer membranes are a barrier to free access to the substrates needed for genome replication, most notably charged nucleotides.
Modern cell membranes, composed of phospholipids, sterols, and other complex lipids, are nearly complete barriers to the permeation of nucleotides and even divalent cations such as Mg²⁺ (Deamer 1997). Because primitive cells would have lacked the sophisticated transport machinery that controls molecular trafficking in modern cells, protocells must have been bounded by simpler, more permeable membranes. Membranes formed from single-chain amphiphiles, such as fatty acids and their glycerol esters as well as fatty alcohols and related compounds, tend to be much more permeable than membranes composed of two-chain lipids. Shorter alkyl chains also contribute to increased permeability but require a higher critical concentration for bilayer membrane stability.

The prebiotic synthesis of membrane-forming amphiphilic molecules is poorly studied. Highly unsaturated carbon chains are abundant in interstellar molecular clouds (Duley and Hu 2009), and these molecules may be among the source materials for the heterogeneous collections of amphiphiles present in organic-rich meteorites. Lipids have been extracted from carbonaceous chondrites, such as Murchison, and have been shown to assemble into vesicle-like compartments (Deamer 1985). Whether linear fatty acids could be synthesized by robust chemical pathways on the early Earth is less clear. The best studied pathways involve Fischer–Tropsch-type chemistry, in which carbon monoxide and hydrogen react under high temperature and pressure on the surface of transition metal minerals to generate terminally oxygenated, saturated linear chains (Rushdi and Simoneit 2001). However, the relevance of this chemistry to early Earth conditions is debated.
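Fischer–Tropsch product distributions are commonly described by the Anderson–Schulz–Flory chain-growth model, in which a growing chain either adds one more carbon with probability α or terminates. As a rough sketch of what such a model implies for the supply of long chains, the snippet below computes the mole fraction of chains of each length. The value of `alpha` and the chain lengths printed are arbitrary illustrative assumptions, not measured values for any prebiotic setting.

```python
def asf_mole_fraction(n, alpha):
    """Anderson-Schulz-Flory mole fraction of chains with exactly n carbons."""
    if n < 1 or not 0.0 < alpha < 1.0:
        raise ValueError("need n >= 1 and 0 < alpha < 1")
    # (n - 1) growth steps, each with probability alpha, then termination.
    return (1.0 - alpha) * alpha ** (n - 1)


def fraction_at_least(n_min, alpha):
    """Mole fraction of chains with n_min or more carbons (geometric tail sum)."""
    if n_min < 1 or not 0.0 < alpha < 1.0:
        raise ValueError("need n_min >= 1 and 0 < alpha < 1")
    return alpha ** (n_min - 1)


if __name__ == "__main__":
    alpha = 0.8  # hypothetical chain-growth probability
    for n in (4, 8, 12, 16):
        print(n, asf_mole_fraction(n, alpha), fraction_at_least(n, alpha))
```

Each additional carbon multiplies the abundance by the same factor α, so under this model long chains are always a minority relative to short ones.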
Such reactions tend to give chain length distributions that fall off exponentially with increasing length, suggesting that the resulting membranes would be dominated by short-chain fatty acids and alcohols, but stabilized by a small fraction of long-chain species (Budin et al. 2014).

Protocell membranes must be compatible with the physical and chemical conditions necessary for RNA chemistry. Because fatty acid–based membranes, nonenzymatic RNA copying, and ribozyme activity all tend to be optimal at moderate salt concentrations and neutral to mildly alkaline pH, this requirement might not seem to be a major concern. However, nonenzymatic copying of RNA templates requires very high concentrations of divalent cations, typically Mg²⁺, and most ribozymes require high concentrations of Mg²⁺ for optimal activity. In contrast, fatty acid–based membranes become unstable in ≥4 mM Mg²⁺, indicating an apparent incompatibility between the two systems. One potential resolution of this conflict involves chelation of Mg²⁺ by citrate, which protects membranes from the disruptive effects of free Mg²⁺ while allowing nonenzymatic RNA copying chemistry to proceed with only a moderate reduction of activity (Adamala and Szostak 2013a). As an added benefit, the Mg²⁺-catalyzed degradation of RNA is prevented by citrate. However, it is unlikely that sufficient citrate was present in the prebiotic environment, and thus alternatives must be sought. A short acidic peptide in the active site of cellular RNA polymerase binds Mg²⁺ and positions it for catalysis (Zhang et al. 1999; Cramer et al. 2001). Could acidic peptides synthesized prebiotically have played a similar role in early RNA copying? The condensation of acidic amino acids is catalyzed by positively charged mineral surfaces (Ferris et al. 1996), suggesting that such peptides might have been available on the early Earth. However, catalysis of templated RNA polymerization by acidic peptides has not been shown.
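To see how a chelator could, in principle, lower the free Mg²⁺ concentration below the level that destabilizes fatty acid membranes, consider idealized 1:1 binding between Mg²⁺ and a ligand such as citrate. The sketch below solves that equilibrium for free Mg²⁺. It is a simplification: real citrate binding depends on pH and ionic strength, and the total concentrations and the dissociation constant `kd_mM` in the example are placeholder values chosen for illustration, not figures from the cited work. Only the roughly 4 mM stability threshold comes from the text above.

```python
import math


def free_mg_mM(mg_total_mM, ligand_total_mM, kd_mM):
    """Free Mg2+ (mM) for idealized 1:1 binding, Mg + L <-> MgL.

    Solves Kd = [Mg][L]/[MgL] together with the two mass balances.
    """
    if mg_total_mM < 0 or ligand_total_mM < 0 or kd_mM <= 0:
        raise ValueError("concentrations must be >= 0 and kd_mM > 0")
    b = mg_total_mM + ligand_total_mM + kd_mM
    # Smaller root of x^2 - b*x + Mg_tot*L_tot = 0 is the complex concentration.
    complex_mM = (b - math.sqrt(b * b - 4.0 * mg_total_mM * ligand_total_mM)) / 2.0
    return mg_total_mM - complex_mM


if __name__ == "__main__":
    MEMBRANE_LIMIT_mM = 4.0  # instability threshold quoted in the text
    # Hypothetical totals and Kd, for illustration only.
    free = free_mg_mM(mg_total_mM=50.0, ligand_total_mM=200.0, kd_mM=0.5)
    print(f"free Mg2+ = {free:.3f} mM; below limit: {free < MEMBRANE_LIMIT_mM}")
```

The same function can be used to explore how much excess ligand is needed for a given binding strength, which is one way to frame the question of whether weaker prebiotic chelators could substitute for citrate.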
Alternatively, RNA copying strategies that do not require a high concentration of divalent cations would circumvent the problem.

2.2 Alternatives to Membrane Systems

A fundamental requirement for the evolution of complex adaptive traits is to restrict the free diffusion of genomic and functional polymers and their metabolic products so that mutations that lead to improved activity have the opportunity to benefit the mutant more than nearby wild-type individuals. In addition, compartmentalization is essential to prevent a fatal increase in parasitic sequences. Although these requirements could be met by a system of growing and dividing vesicles, other mechanisms of spatial isolation could, in principle, provide a solution. Such alternatives include porous rocks, particulate mineral surfaces, noncovalent assemblies of oligonucleotides, emulsion droplets, and phase-separated droplets, including coacervates.

The porous rocks seen in hydrothermal vents have long been proposed as a possible site for the origin of life (Lane and Martin 2012). As the heated vent fluids cool, dissolved minerals form precipitates and build up a network of interconnected microscale channels. Transverse temperature gradients across such channels have been proposed to support a combination of thermophoresis and convection, resulting in very strong concentration gradients (Baaske et al. 2007). Laboratory experiments using glass capillaries as analogs of porous rock channels have shown the simultaneous concentration of dilute fatty acids and DNA, leading to the assembly of vesicles with encapsulated DNA near the bottom of the capillary tube (Budin et al. 2009). Subsequent experiments have shown that temperature gradients across narrow channels can concentrate oligonucleotides, facilitating the assembly of longer DNA strands (Mast et al. 2013).
Concentration as a result of thermophoresis can also preferentially enrich longer products relative to shorter ones, providing a means to overcome the replication bias that favors shorter templates (Kreysing et al. 2015). Because thermophoresis is strongly inhibited by high salt concentrations, this concentration mechanism would be operative only in freshwater hydrothermal vents, such as those currently found in Lake Yellowstone (Inskeep et al. 2015). How nucleic acids that evolved within rock compartments would transition to membrane compartments is not clear. Nevertheless, the remarkable concentration effects achieved by thermophoresis deserve further study.

Another scenario for spatial isolation involves the assembly and replication of RNA on mineral surfaces. The ability of montmorillonite clay to catalyze the polymerization of activated mononucleotides is well known (Ferris et al. 1996; Ferris 2002). The resulting RNAs are tightly bound to the mineral surface, suggesting that mineral particles might function like compartments, with each particle bearing a spatially isolated population of RNA molecules. If RNAs could replicate while remaining at least partially bound to the mineral surface, the system could in principle support the Darwinian evolution of functional RNAs. Although some early work supported the idea that RNA templates can be copied while adsorbed onto a mineral surface (Gibbs et al. 1980; Schwartz and Orgel 1985), further work is needed to confirm the template-directed synthesis of mineral-immobilized RNA.

Compartments based on liquid–liquid phase separation, including coacervate droplets, are a third alternative to membrane vesicles.
The strong electrostatic interactions between polyanionic nucleic acids and various polycations lead to the formation of complex coacervates, which, depending on their composition and the ambient conditions, may remain as liquid droplets. The contents of these droplets are highly concentrated relative to the bulk solution, and there is little barrier to the uptake of nutrients from the environment into the coacervate droplets. However, phase-separated droplets are subject to Ostwald ripening, leading to droplet growth, and to fusion events, leading to coalescence. Fusion events can be blocked by a surfactant coating, but this would eliminate the presumed advantage of enhanced access to Mg²⁺ and other charged species (Tang et al. 2014; Aumiller et al. 2016). Furthermore, in some coacervates, the RNA components exchange rapidly between droplets, so there may not be the degree of isolation required for Darwinian behavior (Jia et al. 2014). Further study is needed to determine whether phase-separated droplets can act as isolated compartments and whether RNA replication can be performed in aqueous two-phase systems.

Emulsion droplets provide a fourth alternative to membrane vesicles. The local accumulation of hydrocarbon oils on the early Earth is not implausible. Given sufficient turbulent mixing and the presence of stabilizing surfactants, water-in-oil emulsion droplets would be likely to form. Such droplets could grow through fusion or Ostwald ripening, followed by shear-induced division, leading to a cycle of growth and division. Although attractive as a simple model of protocell growth and division, the limited accessibility of aqueous emulsion droplets to sources of nutrients would be a major problem. However, even if emulsion droplets are not directly relevant to origin of life scenarios, they provide an interesting model system for the laboratory study of evolutionary phenomena (Ichihashi et al. 2013).
Finally, nucleic acids might provide the basis for their own compartmentalization. The growing field of DNA (and RNA) origami provides many examples of compartments assembled from nucleic acids (for reviews, see Pinheiro et al. 2011; Hong et al. 2017). An early example was the assembly of an octahedral structure from a defined single-stranded DNA (Shih et al. 2004). In principle, genomic strands that replicate within such a compartment could encode the compartment itself. There are examples of DNA origami that undergo self-replication based on growth and division processes analogous to those of membrane systems (Schulman et al. 2012). Thus far, however, the properties of encapsulation and replication have not been combined within a single system.

2.3 Pathways for Vesicle Division

Early work by Luisi and colleagues explored processes that could potentially lead to the growth and division of micelles and vesicles. Electron cryomicroscopy of preformed 1-palmitoyl-2-oleoyl-sn-glycerol-3-phosphatidylcholine (POPC) vesicles containing encapsulated ferritin particles suggested that the addition of oleate to POPC vesicles led to vesicle growth and subsequent division (Berclaz et al. 2001). Experiments using fluorescent markers of membrane area and vesicle content confirmed that oleate addition to preformed vesicles led to growth, but division was not observed unless the vesicles were forced through small pores (Hanczyc et al. 2003). However, the addition of excess alkaline oleate micelles to preformed oleate vesicles, buffered at a lower pH, led to an increase in vesicle number. Strikingly, the size of the newly formed vesicles was close to that of the original preformed vesicles (Rasi et al. 2004). More recently, vesicle growth has been observed directly by video microscopy, following the addition of oleate to larger multilamellar vesicles ∼4 µm in diameter (Zhu and Szostak 2009).
Surprisingly, vesicle growth proceeded through the extrusion of a thin filament from the outermost bilayer. Over time, this filament grew in length, and equilibration with inner bilayers resulted in a filamentous multilamellar vesicle under conditions in which surface area growth is much faster than volume growth owing to osmotic constraints. Filamentous vesicles are subject to a pearling instability that likely contributes to the ease with which they are fragmented under mild shear stress. Gentle agitation is sufficient to bring about vesicle division without loss of the vesicle contents. Alternatively, in an environment rich in organic thiols, photochemical oxidative processes that lead to disulfide formation can induce extreme pearling in such vesicles, resulting in spontaneous division into a large number of small daughter vesicles (Zhu et al. 2012). In contrast to the behavior of multilamellar vesicles, large unilamellar vesicles are fragile and tend to rupture with extensive loss of contents under shear stress. These features combine to make multilamellar vesicles and filamentous growth attractive as a means of simple, environmentally driven cycles of growth and division, requiring only episodic delivery of additional amphiphiles and a moderately turbulent environment.

Competition between vesicles for limiting membrane components could provide a simple scenario for protocell growth that obviates the need for periodic input of new amphiphilic material. If some vesicles in a population are able to grow by obtaining membrane components from surrounding vesicles, many generations of growth and division could occur within a given locale.
Furthermore, the resulting competition for resources could drive natural selection and thus Darwinian evolution.

The first experimental demonstration of competition between protocells arose from studies of osmotically driven growth (Chen et al. 2004). Vesicles with a high concentration of encapsulated RNA are osmotically swollen and the membrane is under tension. This high-energy state can be relaxed by the incorporation of additional fatty acid molecules into the membrane. Because single-chain amphiphiles exchange rapidly between vesicles, fatty acid vesicles containing RNA grow in the presence of empty (relaxed) fatty acid vesicles. This simple physical mechanism could lead to a coupling between RNA replication and membrane growth and to competition between protocells. However, it is energetically expensive to cause osmotically swollen vesicles to divide, making it difficult to couple competitive growth to a faster overall cell cycle. Subsequent studies revealed that a small fraction of phospholipid in a largely fatty acid membrane can also lead to competitive growth (Budin and Szostak 2011). In this case, competition results from the slower dissociation of fatty acid molecules from membranes that contain phospholipids, which results in growth of phospholipid-containing vesicles at the expense of vesicles that do not contain (or contain less) phospholipid. Thus, any vesicle that also contained a heritable catalyst of phospholipid synthesis would experience a strong selective advantage. Other molecules that have a similar effect in slowing fatty acid dissociation, such as hydrophobic peptides, also confer a competitive growth advantage, implying that a heritable RNA that catalyzes the synthesis of hydrophobic peptides could also provide a selective advantage through competition at the cellular level (Adamala and Szostak 2013b).

An alternative to protocell growth through molecular accretion is stepwise growth through vesicle fusion.
Cycles of fusion and division could allow replicating RNAs to spread throughout a population of vesicles, with strong selection for replication efficiency. Under this scenario, selection for increased efficiency of RNA-catalyzed RNA replication would come first, followed by the evolution of additional ribozymes because of the competition between protocells. Wet/dry cycles transform vesicles into lamellar stacks that upon rehydration again form vesicles. Whether wet/dry cycles could lead to the spread of replicating RNAs through a vesicle population remains unclear. Further studies of the fate of encapsulated RNAs following wet/dry cycles may shed light on the potential role of such processes in early RNA evolution.

3 NATURE OF THE FIRST GENETIC MATERIAL

3.1 RNA as the First Genetic Material

The prebiotic synthesis of nucleotides could have occurred in several ways. The simplest, conceptually, would be to synthesize a nucleoside base, couple it to ribose, and finally phosphorylate the resulting nucleoside. Recent studies, however, suggest that other routes may be more feasible, most notably those that involve the coassembly of the base and sugar phosphate.

Many sugars, including the four pentoses, react readily with cyanamide to form stable bicyclic amino-oxazolines (Fig. 2A) (Sanchez and Orgel 1970). The free sugars are unstable (Larralde et al. 1995) and cannot be synthesized directly in the presence of cyanamide. However, the sugars could be synthesized in one locale through the “formose” reaction (Mizuno and Weiss 1974; Müller et al. 1990; Ricardo et al. 2004; Kim et al. 2011), then combined with cyanamide in a different locale to give the sugar amino-oxazolines. When this reaction is performed with a complex mixture of sugars, the ribose derivative strikingly crystallizes from aqueous solution, segregating it from the other sugars (Springsteen and Joyce 2004). Sutherland and colleagues (Ingar et al. 2003; Anastasi et al. 2007; Powner et al.
2009) avoided the complication of the separate synthesis of sugars by starting simply with glycolaldehyde and cyanamide, which in the presence of 1 M phosphate at neutral pH gives 2-amino-oxazole in excellent yield (Fig. 2B). The phosphate both buffers and catalyzes the reaction, directing glycolaldehyde toward 2-amino-oxazole, rather than a complex mixture of products. Glyceraldehyde is then added, resulting in formation of the various pentose amino-oxazolines, including the arabinose compound. Arabinose amino-oxazoline in turn can react with cyanoacetylene, also in phosphate buffer, to form cytosine 2′,3′-cyclic phosphate as the major product. This approach still requires sequential reactions. However, glycolaldehyde and glyceraldehyde can react with 2-aminothiazole to give the corresponding aminals, which selectively precipitate to give crystalline reservoirs of these key intermediates (Islam et al. 2017).

A potential route from arabinose amino-oxazoline to pyrimidine nucleotides was first explored nearly 50 years ago (Tapiero and Nagyvary 1971) and has been reexamined in light of alternative routes to the pentose amino-oxazolines (Ingar et al. 2003). Like arabinose, arabinose 3-phosphate reacts with cyanamide to give the corresponding amino-oxazoline (Fig. 2C). This in turn reacts with cyanoacetylene to form a tricyclic intermediate that hydrolyzes to produce a mixture of cytosine arabinoside-3′-phosphate and cytosine 2′,3′-cyclic phosphate.
Figure 2. Potential prebiotic synthesis of pyrimidine nucleosides. (A) Reaction of ribose with cyanamide to form a bicyclic product, with cyanamide joined at both the anomeric carbon and 2-hydroxyl. (B) Reaction of glycolaldehyde with cyanamide in neutral phosphate buffer, followed by addition of glyceraldehyde, to form ribose and arabinose amino-oxazoline (and lesser amounts of the xylose and lyxose compounds). Arabinose amino-oxazoline then reacts with cyanoacetylene to give cytosine 2′,3′-cyclic phosphate as the major product. (C) Analogous reaction of arabinose-3-phosphate to form a bicyclic product, which then reacts with cyanoacetylene to form a tricyclic intermediate that hydrolyzes to give a mixture of cytosine arabinoside-3′-phosphate and cytosine 2′,3′-cyclic phosphate.

Ribose amino-oxazoline (RAO) also reacts with cyanoacetylene, generating the 2,2′-anhydro-ribonucleoside, but in this case as the α-anomer. However, ring opening of this compound with sulfide generates α-2-thio-C, which readily photoanomerizes to give the β-anomer, which in turn desulfurizes to give C or deaminates to give 2-thio-U and U (Fig. 3) (Xu et al. 2017). A similar approach has been applied to the chemoselective assembly of purine nucleotide precursors (Powner et al. 2010).
It has long been known for the purines, but not the pyrimidines, that simply heating the nucleobase with ribose or ribose-phosphate results in the synthesis of the corresponding ribonucleoside or ribonucleotide, but in low yield (Fuller et al. 1972). Two alternative approaches that have been explored recently involve either ribosylation of formamidopyrimidine, which is selective for the N9 position but gives a mixture of α/β-furanosides and pyranosides (Becker et al. 2016), or a route from 2-thiooxazole to the pentose amino-oxazolidinone thiones and ultimately to the corresponding 8-oxopurine nucleotides (Stairs et al. 2017). The story remains incomplete, however, because these syntheses still require temporally separated reactions using high concentrations of just the right reactants and would be disrupted by the presence of other closely related compounds. The reactions channel material toward the desired products, but other fractionation processes are required to provide the correct starting materials at the requisite time and place. In addition to the selective crystallization processes described above, it has been proposed that precipitation of ferrocyanide salts could generate a concentrated reservoir of starting materials that can be liberated by geothermal activity (Patel et al. 2015). Other potential fractionation processes, involving either selective synthesis or selective degradation, are actively being pursued. Even chemical fractionation could not achieve on a macroscopic scale one desirable separation, the resolution of D-ribonucleotides from their L-enantiomers. This is a serious problem because experiments on the nonenzymatic template-directed polymerization of imidazole-activated mononucleotides suggest that the polymerization of the D-enantiomer is strongly inhibited by the L-enantiomer (Joyce et al. 1984). This difficulty may not be insuperable; perhaps with a different mode of phosphate activation, the inhibition would be less severe.
However, enantiomeric cross-inhibition is certainly a serious problem if life arose in a racemic environment. It is possible that the locale for life’s origins was not racemic, although the global chemical environment contained nearly equal amounts of each pair of stereoisomers. There likely were biases in the inventory of compounds delivered to the Earth by comets and meteorites (Cronin and Pizzarello 1997; Engel and Macko 1997; Pizzarello et al. 2003; Glavin and Dworkin 2009). These in turn could have biased terrestrial syntheses, although the level of enantiomeric enrichment generally declines with successive chemical reactions. A special exception involves a remarkable set of reactions and fractionation processes that amplify a slight chiral imbalance, even to the level of local homochirality (Kondepudi et al. 1990; Soai et al. 1995; Viedma 2005; Klussmann et al. 2006; Noorduin et al. 2008; Viedma et al. 2008). As discussed elsewhere in this collection (Blackmond 2018), these systems have in common both a catalytic process for amplification of same-handed molecules and an inhibition process for suppression of opposite-handed molecules.
Figure 3. Potential prebiotic synthesis of 2-thio-C starting from ribose amino-oxazoline (RAO). RAO, which is generated by the reaction of glyceraldehyde with 2-amino-oxazole, crystallizes readily, leading to its accumulation as a reservoir of purified material. Subsequent reaction with cyanoacetylene generates an anhydronucleoside intermediate, which upon attack by hydrosulfide gives the α-anomer of 2-thio-C. Exposure to ultraviolet light results in photoanomerization to give the β-anomer of 2-thio-C, which then desulfurizes to form cytosine.
3.2 Potential Alternatives to RNA
The problems that arise when one tries to understand how an RNA world could have arisen de novo on the primitive Earth are sufficiently challenging that one must consider other possibilities. What kind of alternative genetic systems might have preceded the RNA world? How could they have “invented” the RNA world? These topics have generated a good deal of speculative interest and some relevant experimental data. Eschenmoser and colleagues undertook a systematic study of the properties of analogs of nucleic acids in which ribose is replaced by some other sugar, or in which the furanose form of ribose is replaced by the pyranose form (Fig. 4) (Eschenmoser 1999). Strikingly, polynucleotides based on the pyranosyl analog of ribose (p-RNA) form Watson–Crick-paired double helices that are more stable than RNA, and p-RNAs are less likely than the corresponding RNAs to form multiple-strand competing structures (Pitsch et al. 1993, 1995, 2003). Furthermore, the helices twist much more gradually than those of standard nucleic acids, which should make it easier to separate strands of p-RNA during replication. p-RNA appears to be an excellent choice as a genetic system; in some ways it seems an improvement compared with the standard nucleic acids. However, p-RNA does not interact with normal RNA to form base-paired double helices. Most double-helical structures reported in the literature are characterized by a backbone with a six-atom repeat. Eschenmoser and colleagues made the surprising discovery that an RNA-like structure based on threose nucleotide analogs (TNA; Fig.
4C), although it involves a five-atom repeat, can still form a stable double-helical structure with standard RNA (Schöning et al. 2000). This provides an example of a pairing system based on a sugar that could be formed more readily than ribose. Tetroses are the unique products of the dimerization of glycolaldehyde, whereas pentoses are formed along with tetroses and hexoses from glycolaldehyde and glyceraldehyde. A structural simplification of Eschenmoser’s TNA has been achieved by Meggers and colleagues (Zhang et al. 2005). They replaced threose by its open chain analog, glycol, in the backbone of TNA, resulting in glycol nucleic acid (GNA; Fig. 4D). Complementary oligomers of GNA form antiparallel, double helices with surprisingly high duplex stabilities. The activated monomer of GNA is susceptible to intramolecular cyclization, but this might be overcome by using dimeric or short oligomeric building blocks. These and similar studies suggest that there are many ways of linking together the nucleotide bases into chains that are capable of forming base-paired double helices. It is not clear that it is much easier to synthesize the monomers of p-RNA, TNA, or GNA than to synthesize the standard nucleotides. However, it is possible that a base-paired structure of this kind will be discovered that can be synthesized readily under prebiotic conditions. It also may be fruitful to explore a broader range of potential precursors to RNA, changing the recognition elements as well as the backbone. A strong candidate for the first genetic material would be any informational macromolecule that is replicable in a sequence-general manner and derives from compounds that would have been abundant on the primitive Earth, and preferably has the ability to cross-pair with RNA. 
Figure 4. The structures of (A) RNA; (B) p-RNA; (C) TNA; and (D) GNA.
4 CHEMICAL RNA REPLICATION SYSTEMS
4.1 Templated Polymerization of Mononucleotides
The nonenzymatic template-directed polymerization of activated nucleotides has been studied for decades as a model for the emergence of replicating informational systems during the origin of life. Rather than recount the detailed history of this field, which has been extensively reviewed (Joyce 2002; Orgel 2004), this review focuses on recent developments that have led to significant improvements in the extent and generality of RNA-templated synthesis of RNA, concluding with a discussion of the remaining barriers to the development of prebiotically plausible replication chemistry. Template-directed primer extension with imidazole-activated monomers is attractive as a prebiotic means of copying RNA templates, in part for its similarity to the universal biological processes of enzymatic RNA and DNA replication. Since the discovery in 1981 that 2-methylimidazole (2MeI) is an especially good nucleotide activating agent (Inoue and Orgel 1981, 1983), nucleoside 5′-phosphoro-2-methylimidazolides have been widely used for primer extension experiments. The mechanism of the reaction between a growing primer chain and an incoming imidazole-activated monomer was assumed to parallel that of enzymatic primer extension, which proceeds via in-line SN2 nucleophilic substitution.
However, the very slow rate at which the last nucleotide of a template is copied relative to internal sites (Wu and Orgel 1992a) suggested that other factors influenced the reaction. Indeed, the same investigators showed that the reaction of the primer with an activated nucleotide is catalyzed by a downstream, activated nucleotide that is base-paired to the template at the primer +2 position. Furthermore, the identity of the leaving group on the 5′-phosphate of the downstream nucleotide has a large effect on the rate of the reaction between the primer and the adjacent monomer (Wu and Orgel 1992b). Only recently has the mechanism behind these effects become clear. In the course of comparing the rates of oligonucleotide ligation and primer extension, Prywes and Szostak rediscovered the catalytic effect of a downstream, activated nucleotide and showed that this effect is much stronger for an oligonucleotide than a mononucleotide at the downstream position (Prywes et al. 2016). Initially, catalysis was thought to result from noncovalent interactions between the leaving groups of adjacent activated nucleotides, but subsequent experiments suggested that the relevant interaction was covalent and involved the off-template formation of a reactive intermediate. Nuclear magnetic resonance experiments identified this intermediate as a 5′,5′-imidazolium-bridged dinucleotide, generated by an attack of the nucleophilic nitrogen of an unprotonated phosphoroimidazolide on the phosphate of a protonated imidazolide (Fig. 5). The template specificity of primer extension suggested that the intermediate could bind to the template through two adjacent Watson–Crick base pairs (Walton and Szostak 2016). The Richert laboratory also noted the formation of an imidazolium-bridged dinucleotide and its high reactivity in primer extension reactions (Kervio et al. 2016).
More recent kinetic studies of primer extension with 2-aminoimidazole (2AI)-activated monomers showed that all detectable primer extension occurs via the 2-aminoimidazolium intermediate and not through the reaction of a primer with an activated monomer (Walton and Szostak 2017).
Figure 5. Formation of a 5′,5′-linked imidazolium-bridged dinucleotide, which is the reactive intermediate in template-directed polymerization of imidazole-activated nucleotides.
4.2 Alternative Means of Monomer Activation
The Richert laboratory has explored a wide range of phosphate-activating groups in an effort to improve the rate and generality of primer extension chemistry. One of the problems with nonenzymatic primer extension is that the 3′-hydroxyl of the primer must be deprotonated during nucleophilic attack on the phosphate of the incoming monomer. Conversely, an imidazole-based activating group must be protonated to function as a leaving group, and this typically requires a lower pH because the pKa of this group is in the range of 6–8. Therefore, a leaving group that departs as an anion (with low pKa) could provide a much faster reaction rate at elevated pH. 1-Hydroxy-7-azabenzotriazole (HOAt) functions in this manner and results in rapid primer extension attributable to direct reaction of the primer with the HOAt-activated monomer (Kervio et al. 2016). A related approach is to use an N-alkyl-imidazole as an activating agent because it can leave as a neutral species that does not require protonation.
Following this approach, the Richert laboratory used N-ethyl-imidazole (NEI) as an organocatalyst to facilitate in situ nucleotide activation with 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC). The nucleoside 5′-phosphate reacts first with EDC to form a reactive isourea intermediate, then displacement by NEI yields NEI-activated monomers that can then react with the primer (Jauker et al. 2015). Although the highly reactive activated nucleotides are labile to hydrolysis at elevated pH, the continuous addition of EDC could in principle lead to continuing primer extension. An efficient and prebiotically plausible means of performing nucleotide activation and primer extension in the same reaction mixture would greatly simplify the conditions required for protocell replication. In view of the catalytic effect on primer extension by an activated downstream nucleotide, a series of substituted imidazoles were tested for their influence on the rate of reaction, in all cases with 2MeI-activated guanylate as the incoming monomer. The magnitude of the catalytic effect correlated positively with the pKa of the imidazole and with a smaller size of the imidazole substituent. 2AI has both of these properties and proved to be a superior activating group (Li et al. 2017). This effect is now understood to reflect, at least in part, the greater stability of the 2-aminoimidazolium-bridged dinucleotide intermediate compared with the relatively unstable 2-methylimidazolium intermediate. This allows the 2AI intermediate to build up to higher levels, significantly enhancing the rate and extent of primer extension (Walton and Szostak 2017). The potential prebiotic synthesis of 2AI is intriguing because it can be generated in the same reaction mixture as the related heterocycle 2-aminooxazole, a precursor of the pyrimidine nucleotides (Fahrenbach et al. 2017).
4.3 Alternative Modes of Strand Elongation
Primer extension with activated monomers is a biologically inspired mode of template copying. In principle, however, template copying could also be accomplished by the ligation of activated oligonucleotides. Unfortunately, nonenzymatic RNA ligation reactions tend to have low yield and poor fidelity. An interesting exception comes from a study of template-directed ligation of DNA oligonucleotides mediated by cyanogen bromide (James and Ellington 1997). Although the reaction was still low yielding, when random hexamers were annealed to a defined template, most of the ligation products had perfect complementarity to the template. This study suggests that oligonucleotide ligation could play a significant role in nonenzymatic RNA-templated synthesis, if more efficient ligation chemistry could be defined. The ligation of oligonucleotides activated at the 2′(3′) end has traditionally been hampered by the fact that activation of a 2′- or 3′-phosphate, for example, by reaction with cyanoimidazole or EDC, leads to rapid formation of a 2′,3′-cyclic phosphate, which is only weakly activated and yields predominantly 2′,5′-linked ligation products. This outcome can be partially avoided if the 3′-phosphate is first acetylated by reaction with N-acetyl-imidazole or other acetylating reagents, in which case the acetyl group is preferentially transferred to the 2′-hydroxyl (Bowler et al. 2013). Subsequent activation of the 3′-phosphate allows for ligation and formation of a 3′,5′-phosphodiester. In a final step, the 2′-hydroxyl is regenerated by slow hydrolytic loss of the acetyl group. This process can be iterated to replace labile 2′,5′ linkages, which lead to strand cleavage by transesterification, with a progressively higher proportion of the more stable 3′,5′ linkages (Mariani and Sutherland 2017).
Surprisingly, the rate of ligation of a primer to a 5′-imidazole-activated oligonucleotide is very slow compared with the rate of primer extension with similarly activated monomers (Prywes et al. 2016). This is now explained by the fact that primer extension with monomers proceeds through the highly reactive imidazolium-bridged dinucleotide intermediate discussed above (Walton and Szostak 2017). Because the ligation reaction is slow, the final yield is determined by competition with hydrolysis of the activated phosphate. This suggests that chemistry that would reactivate hydrolyzed substrates, such as the combination of EDC and an N-alkyl imidazole explored by the Richert laboratory (Jauker et al. 2015), could lead to a high ligation yield. Because oligonucleotides can have a long residence time on the template, a slow rate of ligation is not necessarily a problem, so long as the reaction can go to completion. An intriguing possibility is that a hybrid process, in which short oligonucleotides act as primers at multiple sites on a template, followed by gap filling by primer extension with monomers, and a final stage of oligonucleotide ligation, could result in effective copying of long RNA templates (Szostak 2011). Such a hybrid process would avoid some of the difficulties inherent to template copying solely by ligation of oligonucleotide substrates, such as the formation of gaps or overhangs.
4.4 Fidelity and the Error Threshold
The concept of an error threshold, that is, an upper limit to the frequency of copying errors that can be tolerated by a replicating macromolecule, was first introduced by Eigen (1971).
Eigen’s model envisaged a population of replicating polynucleotides that draw on a limited supply of activated mononucleotides to produce additional copies of themselves. In this model, the rate of synthesis of new copies of a particular replicating RNA is proportional to its concentration, resulting in autocatalytic growth. The net rate of production is the difference between the rate of formation of error-free copies and the rate of decomposition of existing copies of the RNA. For an advantageous RNA to outgrow its competitors, its net rate of production must exceed the mean rate of production of all other RNAs in the population. Only the error-free copies of the advantageous RNA contribute to its net rate of production, but all the copies of the other RNAs contribute to their collective production. Thus, the relative advantage enjoyed by the advantageous individual compared with the rest of the population (often referred to as the “superiority” of the advantageous individual) must exceed the probability of producing an error copy of that advantageous individual. The proportion of copies of an RNA that are error free is determined by the fidelity of the component condensation reactions that are required to produce a complete copy. For simplicity, consider a self-replicating RNA that is formed by n condensation reactions, each having mean fidelity q. The probability of obtaining a completely error-free copy is given by q^n, which is the product of the fidelity of the component condensation reactions. If an advantageous individual is to outgrow its competitors, q^n must exceed the reciprocal of the superiority, 1/s, of that individual. Expressed in terms of the number of reactions required to produce the advantageous individual, n < |ln s| / |ln q|. For s > 1 and q > 0.9, this equation simplifies to n < ln s / (1 – q).
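As a rough numerical illustration of the bound above, the following Python sketch evaluates n < |ln s| / |ln q| alongside the high-fidelity approximation n < ln s / (1 – q). The function names and the example values of q and s are illustrative assumptions and are not taken from Eigen's model or from any experiment discussed here.

```python
import math


def max_reactions_exact(q: float, s: float) -> float:
    """Upper bound on the number of condensation reactions, n < ln(s) / |ln(q)|.

    q is the mean fidelity per condensation reaction (0 < q < 1) and
    s is the superiority of the advantageous individual (s > 1).
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if s <= 1.0:
        raise ValueError("s must exceed 1")
    return math.log(s) / abs(math.log(q))


def max_reactions_approx(q: float, s: float) -> float:
    """High-fidelity approximation of the same bound, n < ln(s) / (1 - q)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if s <= 1.0:
        raise ValueError("s must exceed 1")
    return math.log(s) / (1.0 - q)


if __name__ == "__main__":
    # Hypothetical inputs: a few fidelities and two superiorities.
    for q in (0.90, 0.98, 0.999):
        for s in (math.e, 10.0):
            exact = max_reactions_exact(q, s)
            approx = max_reactions_approx(q, s)
            print(f"q={q:<6} s={s:<6.3f} exact n<{exact:8.1f}  approx n<{approx:8.1f}")
```

With q = 0.98 and s = e (so that ln s = 1), the approximate bound is 50 condensation reactions and the exact bound is slightly smaller. Changing q from 0.98 to 0.999 raises the bound roughly 20-fold, whereas changing s enters only through its logarithm.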
The inequality n < ln s / (1 – q) is the “error threshold,” which describes the inverse relationship between the fidelity of replication, q, and the maximum allowable number of component condensation reactions, n. The maximum number of component reactions is highly sensitive to the fidelity of replication but depends only weakly on the superiority of the advantageous individual. For a self-replicating RNA that is formed by the template-directed condensation of activated mononucleotides, a total of 2n – 2 condensation reactions are required to produce a complete copy. This takes into account the synthesis of both complementary strands. When chemical replication was first established, fidelity was likely to be poor and there would have been strong selection pressure favoring improvement of fidelity. As fidelity improves, a larger genome can be maintained. This allows exploration of a larger number of possible sequences, some of which may lead to further improvement in fidelity, which in turn allows a still larger genome size, and so on. If one assumes that the first genomes encoded a small but relatively efficient ribozyme, for which the identity of roughly 25 nucleotides must be specified, then a fidelity of ∼0.98 would be required to enable inheritance of the corresponding information. An analysis of error rates in the RNA-templated polymerization of imidazole-activated nucleotides indicated that most errors result from G•U wobble pairing (Rajamani et al. 2010). Calculated overall error rates from that study were 5%–15%, which would severely limit the size of a primitive genome. However, primer extension following a mismatch was generally very slow, depending on the specific mismatch. This stalling effect slows template copying, but as a consequence, the first copies to be completed are more likely to be accurate compared with those completed at later times.
If full-length copies can serve as templates for the next round of copying as soon as they are completed, then there will be enrichment for accurate copies during the course of exponential amplification. Template copying fidelity can be improved by the use of nucleotide analogs. The weak base-pairing of U with A allows for the misincorporation of C opposite A, and the similar strength of the U•A and U•G base pairs allows for the misincorporation of U opposite G. Remarkably, replacing U with 2-thio-U reduces both of these sources of errors because the 2-thio-U•A base pair is almost as strong as a C•G base pair. Template copying experiments with 2MeI-activated monomers suggest that this substitution reduces wobble-pairing errors from 4%–5% to ∼2% at template G positions, suggesting that an overall error rate of 1%–2% may be possible (Heuberger et al. 2015). Replacement of U by 2-thio-U also results in faster and more efficient copying of templates that contain A residues. The combined effect of faster and more accurate copying makes the 2-thio substitution attractive and raises the question of the prebiotic plausibility of 2-thio-U. Recent advances in the prebiotic chemistry of pyrimidine nucleotides suggest that a mixture of U and 2-thio-U might be obtained through the synthesis of 2-thio-C, followed by spontaneous deamination (Fig. 3) (Xu et al. 2017). The 2-thio-U then would be preferentially incorporated into RNA. Although speculative at this point, the hypothesis that primordial RNA contained 2-thio-U rather than U is worth considering.
4.5 Complete Cycles of Replication
The inheritance of RNA-based genetic information requires cycles of both plus- and minus-strand synthesis.
Because the product of template copying is a double-stranded RNA, there must be some means of either strand separation or strand displacement synthesis. Transient temperature fluctuations could lead to thermal strand separation, but long RNA duplexes (≥30 base pairs) are difficult to denature thermally. Many environmental factors affect the melting temperature of RNA duplexes and could facilitate thermal strand separation, such as low ionic strength or the presence of high concentrations of formamide or urea. A more difficult problem is that strand reannealing is very fast relative to the time scale of nonenzymatic template copying. Strand reannealing can be slowed by dilution, but achieving a reannealing time of hours would require strand concentrations in the picomolar range, which likely would be too low to maintain RNA replication within a protocell. Thus, environmental conditions that might drastically slow reannealing kinetics are of great interest. Recent work from Hud and colleagues has shown that the combination of template secondary structure and high solvent viscosity can significantly slow the reannealing of long templates (300–500 nucleotides), while still allowing shorter oligonucleotides (∼30 nucleotides) to diffuse rapidly and anneal to complementary sites on the template where ligation can occur (He et al. 2017). Whether this approach can be extended to shorter templates and copying with monomers or short oligonucleotides remains to be seen. Recently, studies of liquid–liquid phase separation suggest that this phenomenon might be relevant to protocell structure and function (Koga et al. 2011; Frankel et al. 2016). The high viscosity of phase-separated droplets could slow reannealing, whereas the colocalization of RNA templates and substrates could facilitate template copying (Aumiller et al. 2016). The recent observation of phase-separated droplets containing only RNA is intriguing in this regard (Jain and Vale 2017).
These droplets form when the RNAs contain certain triplet repeats, which enable the formation of a highly branched interconnected network. The droplets seem to be liquid-like upon initial phase separation, but rapidly mature to a gel-like state. It is possible that template strands trapped within such a matrix are unable to diffuse and therefore reanneal but could still be copied by short oligonucleotide or monomer substrates. Any physical means of immobilizing RNA strands after strand separation could prevent reannealing, so binding of RNA to surfaces such as membranes, mineral particles, or self-assembling peptide filaments could potentially facilitate replication. The alternative solution of strand displacement synthesis is appealing because this is the strategy that is universally used in biology. However, biological strand displacement relies on helicases and/or polymerases that can transduce some of the chemical energy used to drive primer extension into the mechanical energy needed for strand separation. It is unclear how this could be achieved in a prebiotic setting devoid of enzymes.
4.6 Current Status of Laboratory Protocell Replication Systems
The experimental realization of a complete protocell model will require replication of mixed-sequence templates long enough to encode functional ribozymes within replicating vesicles. This goal has been gradually approached over the past decade. The first demonstration of nonenzymatic template copying within fatty acid vesicles was achieved by copying an RNA template with 5′-activated 2′-aminonucleotides. This model had the advantage of rapid primer extension in the absence of Mg2+ (Mansy et al. 2008). Five years later, chelation of Mg2+ with citrate was found to protect fatty acid membranes, while still allowing templated RNA polymerization to proceed. Addition of citrate enabled the copying of an RNA homopolymer template inside fatty acid vesicles.
Subsequently, the use of the 2AI leaving group combined with activated downstream trinucleotides enabled the efficient copying of mixed-sequence templates containing up to seven nucleotides. This approach was recently adapted to template copying within vesicles by increasing the permeability of fatty acid membranes using Mg2+-citrate. As a result, it has been possible to copy short RNA templates inside fatty acid vesicles following the addition of activated monomers and trimers to the outside of the vesicles (O’Flaherty et al. 2018).
5 RNA-CATALYZED REPLICATION SYSTEMS
5.1 The Replicase Model
The notion of the RNA world places emphasis on an RNA molecule that catalyzes its own replication. Such a molecule must function as an RNA-dependent RNA polymerase, acting on itself (or copies of itself) to produce complementary RNAs, and acting on the complementary RNAs to produce additional copies of itself. The efficiency and fidelity of this process must be sufficient to produce viable “progeny” RNA molecules at a rate that exceeds the rate of decomposition of the “parents.” Beyond these requirements, the details of the replication process are not highly constrained. The RNA-first view of the origin of life assumes that a supply of activated nucleotides was available through some set of abiotic processes. Furthermore, it assumes that a means existed to convert the activated nucleotides to an ensemble of random sequence polynucleotides, a subset of which had the ability to undergo nonenzymatic replication. From this population of replicating RNAs, particular RNAs would have gained in relative abundance if they enjoyed a selective advantage because of a higher net rate of production.
There would have been selection for the more favorable template sequences and for RNAs that promote the replication process. The most rudimentary form of replicase function would involve assistance of chemical replication to improve the rate, accuracy, or sequence generality of RNA synthesis. With ongoing selective pressure, this chemical replication process would become supplanted by a biochemical replication process. The latter likely used a less-reactive, but energetically favorable, activating group on the mononucleotides, such as a polyphosphate, so that the RNA-catalyzed reaction dominated over spontaneous polymerization. Although it is unclear how the first RNA replicase ribozyme arose, it is not difficult to imagine how such a molecule, once developed, would function. The polymerization of activated nucleotides proceeds via nucleophilic attack by the 3′-hydroxyl of a template-bound oligonucleotide at the α-phosphorus of an adjacent template-bound nucleotide derivative. This reaction could be assisted by favorable orientation of the reactive groups, deprotonation of the nucleophilic 3′-hydroxyl, stabilization of the trigonal-bipyramidal transition state, or charge neutralization of the leaving group. All of these tasks might be performed by RNA (Narlikar and Herschlag 1997; Emilsson et al. 2003), acting either alone (Strobel and Ortoleva-Donnelly 1999) or with the help of a suitably positioned metal cation or other cofactor (Shan et al. 1999, 2001).
5.2 From Ligase to Polymerase
The possibility that an RNA replicase ribozyme could have existed has been made abundantly clear by work involving ribozymes that have been developed through in vitro evolution (Bartel and Szostak 1993; Ekland et al. 1995; Ekland and Bartel 1996; Jaeger et al. 1999; Robertson and Ellington 1999; Johnston et al. 2001; Rogers and Joyce 2001; McGinness and Joyce 2002; Ikawa et al. 2004; Wochner et al. 2011; Sczepanski and Joyce 2014; Horning and Joyce 2016).
Bartel and Szostak (1993), for example, began with a large population of random sequence RNAs and evolved the “class I” RNA ligase ribozyme, an optimized version of which is about 100 nucleotides in length and catalyzes the joining of two template-bound oligonucleotides. Condensation occurs between the 3′-hydroxyl of one oligonucleotide and the 5′-triphosphate of another, forming a 3′,5′-phosphodiester linkage and releasing inorganic pyrophosphate. This reaction is classified as ligation owing to the nature of the oligonucleotide substrates but involves the same chemical transformation as is catalyzed by modern RNA polymerase enzymes. X-ray crystal structures of two RNA ligase ribozymes, the L1 and class I ligases, provide a glimpse into mechanistic strategies a ribozyme could use to catalyze RNA polymerization (Fig. 6) (Robertson and Scott 2007; Shechner et al. 2009). Both ligases are dependent on Mg2+ ions for their activity. The L1 structure (Fig. 6A) shows a bound metal ion in the active site, coordinated by three nonbridging phosphate oxygens, one of which belongs to the newly formed phosphodiester linkage. This Mg2+ ion is favorably positioned to help neutralize the increased negative charge of the transition state and, potentially, to activate the 3′-hydroxyl nucleophile and to help orient the α-phosphate for a more optimal in-line alignment. In the class I structure (Fig. 6B), no catalytic metal ion is seen in the vicinity of the active site, although there is what appears to be an empty metal-binding site formed by two nonbridging phosphate oxygens, positioned directly opposite the ligation junction in a manner similar to that observed for the L1 ligase. These structures point to a universal catalytic strategy, very similar to that used by modern protein-based RNA polymerases.
Subsequent to its isolation as a ligase, the class I ribozyme was shown to catalyze a polymerization reaction in which the 5′-triphosphate-bearing oligonucleotide is replaced by one or more nucleoside 5′-triphosphates (NTPs) (Ekland and Bartel 1996). This reaction proceeds with high fidelity (q = 0.92), although the reaction rate drops sharply with successive nucleotide additions. Bartel and colleagues performed further in vitro evolution experiments to convert the class I ligase to a bona fide RNA polymerase that operates on a separate RNA template (Johnston et al. 2001). To the 3′ end of the class I ligase, they added 76 random sequence nucleotides that were evolved to form an accessory domain that assists in the polymerization of template-bound NTPs. The polymerization reaction is applicable to a variety of template sequences, and for well-behaved sequences proceeds with an average fidelity of 0.97. This would be sufficient to support a genome length of 20–30 nucleotides, although the ribozyme itself contains about 190 nucleotides. The ribozyme has a catalytic rate for NTP addition of at least 1.5 min−1, but its Km is so high that, even in the presence of micromolar concentrations of oligonucleotides and millimolar concentrations of NTPs, it requires about 2 h to complete each NTP addition (Lawrence and Bartel 2003). The ribozyme operates best under conditions of high Mg2+ concentration but becomes degraded under those conditions after 24 h, by which time it has added no more than 14 NTPs (Johnston et al. 2001). Further evolutionary optimization of the class I polymerase ribozyme has led to substantial improvement of its biochemical properties.
By directly selecting for extension of an external primer on a separate template, Zaher and Unrau (2007) were able to improve the maximum length of template-dependent polymerization to >20 nucleotides, at a rate approximately threefold faster than that of the parent for the first nine monomer additions and up to 75-fold faster for additions beyond 10 nucleotides. In addition, this ribozyme variant displays significantly improved fidelity, particularly with respect to discrimination against G•U wobble pairs. This improved fidelity appears to be the underlying source for the improvements in the maximum length of extension and the rate of polymerization.

5.3 From Polymerase to Replicase

Using a sophisticated emulsion-based in vitro selection technique, the activity of the class I polymerase was greatly improved so that it can synthesize long RNA products, in some cases exceeding 100 nucleotides in length (Wochner et al. 2011). The improved ribozyme contains a 5′-terminal “processivity tag” that binds to a complementary region of the template, thus overcoming the problem of a high Km through pseudointramolecularity. It continues to show a good average fidelity of 0.97. When the reaction is performed in the eutectic phase of water ice, even longer polymers can be obtained because of the increased concentration of reactants and reduced rate of RNA degradation under these near-frozen conditions (Attwater et al. 2013). However, the optimized polymerase strongly prefers cytidine-rich templates that lack any secondary structure, which precludes the synthesis of most functional RNAs. The class I polymerase was further optimized by selecting for its ability to react quickly and to copy “difficult” templates, including those that are cytidine-poor or contain regions of stable secondary structure (Horning and Joyce 2016). This resulted in a polymerase ribozyme that requires less than a minute to complete each NTP addition and can synthesize a variety of complex structured RNAs.
It still struggles with highly structured templates or runs of consecutive template A residues. It also has a reduced fidelity of 0.92, primarily because of an increased frequency of G•U wobble mutations. Nonetheless, because of its much greater sequence generality, this optimized polymerase can catalyze the reciprocal synthesis of both an RNA and its complement, enabling the exponential amplification of short RNA templates in an all-RNA form of the polymerase chain reaction.

Figure 6. X-ray crystal structure of the (A) L1 ligase, and (B) class I ligase ribozymes. Insets show the putative magnesium ion-binding sites at the respective ligation junctions. The structures are rendered in rainbow continuum, with the 5′-triphosphate-bearing end of the ribozyme colored violet and the 3′-hydroxyl-bearing end of the substrate colored red. The phosphate at the ligation junction is shown in white, and the proximate magnesium ion (modeled for the class I ligase) is shown as a yellow sphere, with dashed lines indicating coordination contacts.

This most highly optimized class I polymerase also has the ability to synthesize DNA polymers on an RNA template (Samanta and Joyce 2017). The ribozyme can incorporate all four deoxynucleoside 5′-triphosphates at a rate of about one addition per minute and can generate products containing up to 32 deoxynucleotides. Reverse transcriptase activity likely was necessary for the transition from RNA to DNA genomes during the early history of life on Earth. The fact that an RNA polymerase ribozyme also can function as a reverse transcriptase suggests that a bridge may have existed between RNA and DNA as a selective adaptation of the RNA world.
DNA is much more stable than RNA and thus provides a larger and more secure repository for genetic information. Generalized RNA-catalyzed RNA replication has not yet been achieved and it is not possible for the class I polymerase ribozyme to synthesize additional copies of itself. The most advanced form of the polymerase cannot replicate long RNA sequences because attempts to do so are thwarted by the emergence of shorter amplicons that are copied more efficiently. There must be a selective advantage in maintaining the full-length amplicon and that advantage must exceed the probability of producing an error copy. It is reasonable to expect that a general RNA replicase will eventually be developed in the laboratory, thus providing a working model of RNA-based life. However, like the search for life elsewhere in the universe, such expectations must be met through direct observation rather than speculation. Despite falling short of the goal of generalized RNA-catalyzed RNA replication, there is one example of an RNA enzyme that catalyzes its own amplification and can undergo limited Darwinian evolution. This replication system uses a pair of ligase ribozymes that each catalyze the formation of the other, using a mixture of four different substrate oligonucleotides (Lincoln and Joyce 2009). In reaction mixtures containing only these RNA substrates, MgCl2, and buffer, a small starting number of ribozymes gives rise to many additional ribozymes through RNA-catalyzed exponential amplification. The replication process can be sustained indefinitely by replenishing the supply of substrates, in one case achieving an overall amplification factor of 10^100-fold in 37.5 h (Robertson and Joyce 2014).
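The quantitative claims in this section (a fidelity of 0.97 supporting a genome of 20–30 nucleotides, and 10^100-fold amplification in 37.5 h) can be checked with a short back-of-the-envelope calculation. The sketch below is illustrative only. It assumes independent per-nucleotide errors and uses the simplified error-threshold relation associated with Eigen (1971), n_max ≈ ln(s)/(1 − q). The superiority parameter s (the replication advantage of the full-length sequence over its error copies) is not given in the text and is an assumption here, as are all function names.

```python
import math


def error_free_fraction(fidelity: float, length: int) -> float:
    """Probability that a copy of `length` nucleotides contains no errors,
    assuming every position is copied independently with the same fidelity."""
    return fidelity ** length


def max_genome_length(fidelity: float, superiority: float) -> float:
    """Simplified error threshold: n_max is roughly ln(s) / (1 - q).

    `superiority` (s) is an assumed selective advantage of the full-length
    sequence over its error copies; the text does not specify a value.
    """
    return math.log(superiority) / (1.0 - fidelity)


def mean_doubling_time_min(log10_fold: float, hours: float) -> float:
    """Average doubling time implied by an overall amplification of
    10**log10_fold in `hours`, assuming steady exponential growth."""
    doublings = log10_fold * math.log2(10.0)
    return hours * 60.0 / doublings


if __name__ == "__main__":
    # Fidelities quoted in the text for two polymerase variants.
    for q in (0.97, 0.92):
        print(
            f"q={q}: error-free 30-mers = {error_free_fraction(q, 30):.3f}, "
            f"n_max (s=2) = {max_genome_length(q, 2.0):.1f}"
        )
    # Overall amplification reported for the cross-replicating ligases.
    print(f"mean doubling time = {mean_doubling_time_min(100, 37.5):.2f} min")
```

Under these assumptions, a fidelity of 0.97 with s = 2 gives a threshold of about 23 nucleotides, within the 20–30 range quoted above, and the reported amplification corresponds to an average doubling time of roughly 7 min. The second figure is an average over the whole experiment, not a measured rate constant.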
6 COMPARTMENTED GENETICS

6.1 Integration of Genetic and Compartment Systems

Assuming that the chemistry of RNA replication can be fully realized in a manner compatible with protocell replication, what other challenges must be addressed for the system to be capable of undergoing Darwinian evolution? Because functional RNAs are likely to provide significant selective advantage, it is helpful to understand the conditions necessary to obtain robust ribozyme function within model protocells. Such studies have generally used simple ribozymes, such as the self-cleaving hammerhead ribozyme, for which activity within fatty acid–based vesicles has been shown (Chen et al. 2005). Because most ribozymes require high concentrations of Mg2+ for optimal activity, it is necessary to use a lipid composition that maintains membrane stability in the presence of Mg2+. Alternatively, it is possible that the first ribozymes did not require high concentrations of Mg2+, instead relying on monovalent cations or other cofactors. New questions arise when considering the integration of replicating nucleic acids with replicating vesicles. For example, what mechanisms might serve to keep the membrane and RNA systems in balance so that neither out-replicates the other? One potential answer to this problem of homeostasis is the presence of short oligonucleotides that are complementary to a replicase ribozyme, generated as either replication intermediates or degradation fragments, that inactivate the replicase when present in high concentrations. Upon dilution, as would occur upon vesicle growth (Engelhart et al. 2016), this inhibition would be relieved and RNA replication could continue.

6.2 Membrane Permeability, Trafficking, and the Origins of Metabolism

Assuming that the first protocells were heterotrophic, the protocell membranes must have been permeable to nutrients in the external environment.
Fatty acid–based membranes show high permeability to ions, including divalent cations, as well as to larger polar and charged molecules, such as nucleotides. Environmental factors, including elevated temperature and the presence of divalent cations, can further enhance membrane permeability. High membrane permeability is likely to be necessary for protocell growth but is not a demanding physical requirement. However, high permeability can also limit the ability of a protocell to evolve metabolic functions because any metabolites synthesized within the protocell could leak out and be lost to the environment or captured by competing protocells. One possibility is that protocells did not evolve a metabolism until membrane permeability had decreased, perhaps driven by increasing phospholipid content resulting from the selective growth advantage provided by those phospholipids (Budin and Szostak 2011). Many compounds of biological metabolism are charged carboxylates or phosphate esters, which may reflect the evolution of metabolism at a time when cellular membranes were more permeable than in modern cells.

6.3 Remaining Challenges

The experiments described above bring the field tantalizingly close to a replicating and evolving protocell, but several additional problems must be solved to reach that goal. Divalent metal cations appear to be essential for efficient RNA copying, but the poor affinity of the catalytic metal for the reaction center means that very high concentrations of these ions are required, which causes problems for both the RNA (degradation, hydrolysis of activated monomers) and for the fatty acid–based membranes.
RNA polymerase enzymes solve these problems by binding and precisely positioning the metal ion for catalysis (Zhang et al. 1999; Cramer et al. 2001; Robertson and Scott 2007; Shechner et al. 2009). A prebiotically plausible means of achieving effective metal ion catalysis at low ambient concentration would greatly simplify the development of model protocells. Another key problem is the mobilization of chemical energy to activate (and reactivate) nucleotides for polymerization in a specific manner and under mild conditions. It also is not clear how to achieve multiple generations of nucleic acid replication within the context of a protocell. If and only if all of these challenges can be met will it be possible to synthesize an evolving protocell.

6.4 Implications for Synthetic Biology and Biotechnology

The main purpose of research into both nonenzymatic and RNA-catalyzed RNA replication is to discern plausible pathways leading from chemistry to biology, and thus to provide potential explanations for the origin of life on Earth. However, the development of replicating protocells also will provide new approaches for the compartmentalized evolution of novel functional RNAs. One of the longstanding limitations of conventional in vitro evolution methods is that selection depends on self-modification of the catalytic RNA and thus does not enrich for high catalytic turnover. Replicating protocells may provide a new way to obtain efficient ribozymes, so long as their function can be linked to the selective advantage of the protocell. Many research groups are involved in the construction of artificial cells, with the goal of reconstituting some (or all) of the complex subsystems of microbial life. Most such systems therefore include the protein synthesis machinery, as either a crude extract or a reconstituted translation system.
Reconstituting a fully functional and replicating cell from purified biological components is a major challenge and has inspired efforts to simplify aspects of cellular metabolism that seem gratuitous. For example, many chemical modifications of ribosomal RNA (rRNA) are essential for ribosome assembly, but recently Ichihashi and colleagues were able to evolve a variant small subunit rRNA that lacks modifications and can still be efficiently assembled into ribosomes (Murase et al. 2018). If this approach can be extended to the large subunit rRNA, ribosome assembly will be greatly simplified. Similar approaches are aimed at evolving a simpler ribosome, comparable with evolutionary intermediates in the path from the first peptidyl transferase ribozyme to the modern ribosome. Such approaches may contribute to the development of a comprehensive picture of possible paths from simple protocells to complex RNA world cells, and ultimately to the evolution of modern cells that contain DNA genomes, instructed protein synthesis, and a complex metabolism.

7 SUMMARY

The constraints that must have been met to originate a self-sustained evolving system are reasonably well understood. One can sketch out a logical order of events, beginning with prebiotic chemistry and ending with protocellular life. However, it must be said that many details of this process remain obscure, and even for reactions that have been convincingly shown in the laboratory, it is difficult to know whether those same reactions were operable in the historical origins of life. The presumed RNA world should be viewed as a milestone, a plateau in the early history of life on Earth. So, too, the concept of an RNA world has been a milestone in the scientific study of life’s origins. Although this concept does not fully explain how life originated, it has helped to guide scientific thinking and has served to focus experimental efforts.
Further progress will depend primarily on new experimental results as chemists, biochemists, and molecular biologists work together to address problems concerning molecular replication, ribozyme enzymology, and RNA-based cellular processes.

ACKNOWLEDGMENTS

J.W.S. is an Investigator of the Howard Hughes Medical Institute. This work was supported by research grants 287624 (G.F.J.) and 290363 (J.W.S.) from the Simons Foundation, research grants 80NSSC17K0462 (G.F.J.) and NNX15AL18G (J.W.S.) from the National Aeronautics and Space Administration (NASA), and research grant CHE-1607034 (J.W.S.) from the National Science Foundation (NSF). Some portions of the text were published in prior editions of The RNA World. The contributions of Leslie Orgel to those prior versions and to the scientific literature of the origins of life are gratefully acknowledged.

REFERENCES

∗ Reference is also in this collection.

Adamala K, Szostak JW. 2013a. Nonenzymatic template-directed RNA synthesis inside model protocells. Science 342: 1098–1100. Adamala K, Szostak JW. 2013b. Competition between model protocells driven by an encapsulated catalyst. Nat Chem 5: 495–501. Anastasi C, Crowe MA, Sutherland JD. 2007. Two-step potentially prebiotic synthesis of α-D-cytidine-5′-phosphate from D-glyceraldehyde-3-phosphate. J Am Chem Soc 129: 24–25. Attwater J, Wochner A, Holliger P. 2013. In-ice evolution of RNA polymerase ribozyme activity. Nat Chem 5: 1011–1018. Aumiller WM Jr, Pir Cakmak F, Davis BW, Keating CD. 2016. RNA-based coacervates as a model for membraneless organelles: Formation, properties, and interfacial liposome assembly. Langmuir 32: 10042–10053. Baaske P, Weinert FM, Duhr S, Lemke KH, Russell MJ, Braun D. 2007.
Extreme accumulation of nucleotides in simulated hydrothermal pore systems. Proc Natl Acad Sci 104: 9346–9351. Bansho Y, Furubayashi T, Ichihashi N, Yomo T. 2016. Host-parasite oscillation dynamics and evolution in a compartmentalized RNA replication system. Proc Natl Acad Sci 113: 4045–4050. Bartel DP, Szostak JW. 1993. Isolation of new ribozymes from a large pool of random sequences. Science 261: 1411–1418. Becker S, Thoma I, Amrei Deutsch A, Gehrke T, Mayer P, Zipse H, Carell T. 2016. A high-yielding, strictly regioselective prebiotic purine nucleoside formation pathway. Science 352: 833–836. Berclaz N, Muller M, Walde P, Luisi PL. 2001. Growth and transformation of vesicles studied by ferritin labeling and cryotransmission electron microscopy. J Phys Chem B 105: 1056–1064. ∗ Blackmond DG. 2018. The origin of biological homochirality. Cold Spring Harb Perspect Biol 11: a032540. Bowler FR, Chan CK, Duffy CD, Gerland B, Islam S, Powner MW, Sutherland JD, Xu J. 2013. Prebiotically plausible oligoribonucleotide ligation facilitated by chemoselective acetylation. Nat Chem 5: 383–389. Budin I, Szostak JW. 2011. Physical effects underlying the transition from primitive to modern cell membranes. Proc Natl Acad Sci 108: 5249–5254. Budin I, Bruckner RJ, Szostak JW. 2009. Formation of protocell-like vesicles in a thermal diffusion column. J Am Chem Soc 131: 9628–9629. Budin I, Prywes N, Zhang N, Szostak JW. 2014. Chain-length heterogeneity allows for the assembly of fatty acid vesicles in dilute solutions. Biophys J 107: 1582–1590. Chen IA, Nowak MA. 2012. From prelife to life: How chemical kinetics become evolutionary dynamics. Acc Chem Res 45: 2088–2096. Chen IA, Roberts RW, Szostak JW. 2004. The emergence of competition between model protocells. Science 305: 1474–1476. Chen IA, Salehi-Ashtiani K, Szostak JW. 2005. RNA catalysis in model protocell vesicles. J Am Chem Soc 127: 13213–13219. Cramer P, Bushnell DA, Kornberg RD. 2001.
Structural basis of transcription: RNA polymerase II at 2.8 Å resolution. Science 292: 1863–1876. Cronin JR, Pizzarello S. 1997. Enantiomeric excesses in meteoritic amino acids. Science 275: 951–955. Deamer DW. 1985. Boundary structures are formed by organic components of the Murchison carbonaceous chondrite. Nature 317: 792–794. Deamer DW. 1997. The first living systems: A bioenergetic perspective. Microbiol Mol Biol Rev 61: 239–261. Duley WW, Hu A. 2009. Polyynes and interstellar carbon nanoparticles. Astrophys J 698: 808–811. Eigen M. 1971. Selforganization of matter and the evolution of biological macromolecules. Naturwiss 58: 465–523. Ekland EH, Bartel DP. 1996. RNA-catalysed RNA polymerization using nucleoside triphosphates. Nature 382: 373–376. Ekland EH, Szostak JW, Bartel DP. 1995. Structurally complex and highly active RNA ligases derived from random RNA sequences. Science 269: 364–370. Emilsson GM, Nakamura S, Roth A, Breaker RR. 2003. Ribozyme speed limits. RNA 9: 907–918. Engel M, Macko S. 1997. Isotopic evidence for extraterrestrial non-racemic amino acids in the Murchison meteorite. Nature 389: 265–268. Engelhart AE, Adamala KP, Szostak JW. 2016. A simple physical mechanism enables homeostasis in primitive cells. Nat Chem 8: 448–453. Eschenmoser A. 1999. Chemical etiology of nucleic acid structure. Science 284: 2118–2124. Fahrenbach C, Giurgiu C, Tam CP, Li L, Hongo Y, Aono M, Szostak JW. 2017. Common and potentially prebiotic origin for precursors of nucleotide synthesis and activation. J Am Chem Soc 139: 8780–8783. Ferris JP. 2002. Montmorillonite catalysis of 30–50 mer oligonucleotides: Laboratory demonstration of potential steps in the origin of the RNA world. Orig Life Evol Biosph 32: 311–332. Ferris JP, Hill AR Jr, Liu R, Orgel LE. 1996. Synthesis of long prebiotic oligomers on mineral surfaces. Nature 381: 59–61. Frankel EA, Bevilacqua PC, Keating CD. 2016. 
Polyamine/nucleotide coacervates provide strong compartmentalization of Mg2+, nucleotides, and RNA. Langmuir 32: 2041–2049. Fuller WD, Sanchez RA, Orgel LE. 1972. Studies in prebiotic synthesis. VI. Synthesis of purine nucleosides. J Mol Biol 67: 25–33. Gibbs D, Lohrmann R, Orgel LE. 1980. Template-directed synthesis and selective adsorption of oligoadenylates on hydroxyapatite. J Mol Evol 15: 347–354. Glavin DP, Dworkin JP. 2009. Enrichment of the amino acid L-isovaline by aqueous alteration on CI and CM meteorite parent bodies. Proc Natl Acad Sci 106: 5487–5492. Hanczyc MM, Fujikawa SM, Szostak JW. 2003. Experimental models of primitive cellular compartments: Encapsulation, growth, and division. Science 302: 618–622. He C, Gállego I, Laughlin B, Grover MA, Hud NV. 2017. A viscous solvent enables information transfer from gene-length nucleic acids in a model prebiotic replication cycle. Nat Chem 9: 318–324. Heuberger BD, Pal A, Del Frate F, Topkar VV, Szostak JW. 2015. Replacing uridine with 2-thiouridine enhances the rate and fidelity of nonenzymatic RNA primer extension. J Am Chem Soc 137: 2769–2775. Hong F, Zhang F, Liu Y, Yan H. 2017. DNA origami: Scaffolds for creating higher order structures. Chem Rev 117: 12584–12640. Horning D, Joyce GF. 2016. Amplification of RNA by an RNA polymerase ribozyme. Proc Natl Acad Sci 113: 9786–9791. Ichihashi N, Usui K, Kazuta Y, Sunami T, Matsuura T, Yomo T. 2013. Darwinian evolution in a translation-coupled RNA replication system within a cell-like compartment. Nat Commun 4: 2494. Ikawa Y, Tsuda K, Matsumura S, Inoue T. 2004. De novo synthesis and development of an RNA enzyme. Proc Natl Acad Sci 101: 13750–13755. Ingar A-A, Luke RWA, Hayter BR, Sutherland JD. 2003. Synthesis of a cytidine ribonucleotide by stepwise assembly of the heterocycle on a sugar phosphate. Chembiochem 4: 504–507. Inoue T, Orgel LE. 1981. Substituent control of the poly(C)-directed oligomerization of guanosine 5′-phosphoroimidazolide.
J Am Chem Soc 103: 7666–7667. Inoue T, Orgel LE. 1983. A nonenzymatic RNA polymerase model. Science 219: 859–862. Inskeep WP, Jay ZJ, Macur RE, Clingenpeel S, Tenney A, Lovalvo D, Beam JP, Kozubal MA, Shanks WC, Morgan LA, et al. 2015. Geomicrobiology of sublacustrine thermal vents in Yellowstone Lake: Geochemical controls on microbial community structure and function. Front Microbiol 6: 1044. Islam S, Dejan-Krešimir B, Powner MW. 2017. Prebiotic selection and assembly of proteinogenic amino acids and natural nucleotides from complex mixtures. Nat Chem 9: 584–589. Jaeger L, Wright MC, Joyce GF. 1999. A complex ligase ribozyme evolved in vitro from a group I ribozyme domain. Proc Natl Acad Sci 96: 14712–14717. Jain A, Vale RD. 2017. RNA phase transitions in repeat expansion disorders. Nature 546: 243–247. James KD, Ellington AD. 1997. Surprising fidelity of template-directed chemical ligation of oligonucleotides. Chem Biol 4: 595–605. Jauker M, Griesser H, Richert C. 2015. Copying of RNA sequences without pre-activation. Angew Chemie 54: 14559–14563. Jia TZ, Hentrich C, Szostak JW. 2014. Rapid RNA exchange in aqueous two-phase system and coacervate droplets. Orig Life Evol Biosph 44: 1–12. Johnston WK, Unrau PJ, Lawrence MS, Glasner ME, Bartel DP. 2001. RNA-catalyzed RNA polymerization: Accurate and general RNA-templated primer extension. Science 292: 1319–1325. Joyce GF. 2002. The antiquity of RNA-based evolution. Nature 418: 214–221. Joyce GF, Visser GM, van Boeckel CAA, van Boom JH, Orgel LE, van Westrenen J. 1984. Chiral selection in poly(C)-directed synthesis of oligo(G). Nature 310: 602–604. Kanavarioti A, Monnard PA, Deamer DW. 2001. Eutectic phases in ice facilitate nonenzymatic nucleic acid synthesis.
Astrobiology 1: 271–281. Kervio E, Sosson M, Richert C. 2016. The effect of leaving groups on binding and reactivity in enzyme-free copying of DNA and RNA. Nucleic Acids Res 44: 5504–5514. Kim HJ, Ricardo A, Illangkoon H, Kim MJ, Carrigan MA, Frye F, Benner SA. 2011. Synthesis of carbohydrates in mineral-guided prebiotic cycles. J Am Chem Soc 133: 9457–9468. Klussmann M, Iwamura H, Mathew SP, Wells DH Jr, Pandya U, Armstrong A, Blackmond DG. 2006. Thermodynamic control of asymmetric amplification in amino acid catalysis. Nature 441: 621–623. Koga S, Williams DS, Perriman AW, Mann S. 2011. Peptide-nucleotide microdroplets as a step towards a membrane-free protocell model. Nat Chem 3: 720–724. Kondepudi DK, Kaufman RJ, Singh N. 1990. Chiral symmetry breaking in sodium chlorate crystallization. Science 250: 975–976. Kreysing M, Keil L, Lanzmich S, Braun D. 2015. Heat flux across an open pore enables the continuous replication and selection of oligonucleotides towards increasing length. Nat Chem 7: 203–208. Lane N, Martin WF. 2012. The origin of membrane bioenergetics. Cell 151: 1406–1416. Larralde R, Robertson MP, Miller SL. 1995. Rates of decomposition of ribose and other sugars: Implications for chemical evolution. Proc Natl Acad Sci 92: 8158–8160. Lawrence MS, Bartel DP. 2003. Processivity of ribozyme-catalyzed RNA polymerization. Biochemistry 42: 8748–8755. Li L, Prywes N, Tam CP, O’Flaherty DK, Lelyveld VS, Izgu EC, Pal A, Szostak JW. 2017. Enhanced nonenzymatic RNA copying with 2-aminoimidazole activated nucleotides. J Am Chem Soc 139: 1810–1813. Lincoln TA, Joyce GF. 2009. Self-sustained replication of an RNA enzyme. Science 323: 1229–1232. Mansy SS, Schrum JP, Krishnamurthy M, Tobé S, Treco DA, Szostak JW. 2008. Template-directed synthesis of a genetic polymer in a model protocell. Nature 454: 122–125. Mariani A, Sutherland JD. 2017. Non-enzymatic RNA backbone proofreading through energy-dissipative recycling. Angew Chemie 56: 6563– 6566. 
Mast CB, Schink S, Gerland U, Braun D. 2013. Escalation of polymerization in a thermal gradient. Proc Natl Acad Sci 110: 8030–8035. Matsumura S, Kun Á, Ryckelynck M, Coldren F, Szilágyi A, Jossinet F, Rick C, Nghe P, Szathmáry E, Griffiths AD. 2016. Transient compartmentalization of RNA replicators prevents extinction due to parasites. Science 354: 1293–1296. McGinness KE, Joyce GF. 2002. RNA-catalyzed RNA ligation on an external RNA template. Chem Biol 9: 297–307. Mizuno T, Weiss AH. 1974. Synthesis and utilization of formose sugars. Adv Carbohyd Chem Biochem 29: 173–227. Müller D, Pitsch S, Kittaka A, Wagner E, Wintner CE, Eschenmoser A. 1990. Chemie von α-aminonitrilen. Aldomerisierung von Glykolaldehydphosphat zu racemischen hexose-2,4,6-triphosphaten und (in gegenwart von formaldehyd). Racemischen pentose-2,4-diphosphaten: Rac-allose-2,4,6-triphosphat und rac-ribose-2,4-diphosphat sind die reaktionshauptprodukte. Helv Chim Acta 73: 1410–1468. Murase Y, Nakanishi H, Tsuji G, Sunami T, Ichihashi N. 2018. In vitro evolution of unmodified 16S rRNA for simple ribosome reconstitution. ACS Synth Biol 7: 576–583. Narlikar GJ, Herschlag D. 1997. Mechanistic aspects of enzymatic catalysis: Lessons from comparison of RNA and protein enzymes. Annu Rev Biochem 66: 19–59. Noorduin WL, Izumi T, Millemaggi A, Leeman M, Meekes H, Van Enckevort WJP, Kellogg RM, Kaptein B, Vlieg E, Blackmond DG. 2008. Emergence of a single solid chiral state from a nearly racemic amino acid derivative. J Am Chem Soc 130: 1158–1159. O’Flaherty DK, Kamat NP, Mirza FN, Li L, Prywes N, Szostak JW. 2018. Copying of mixed-sequence RNA templates inside model protocells. J Am Chem Soc 140: 5171–5178. Orgel LE. 2004. Prebiotic chemistry and the origin of the RNA world. Crit Rev Biochem Mol Biol 39: 99–123. Patel BH, Percivalle C, Ritson DJ, Duffy CD, Sutherland JD. 2015. Common origins of RNA, protein and lipid precursors in a cyanosulfidic protometabolism. Nat Chem 7: 301–307.
Pinheiro AV, Han D, Shih WM, Yan H. 2011. Challenges and opportunities for structural DNA nanotechnology. Nat Nanotechnol 6: 763–772. Pitsch S, Wendeborn S, Jaun B, Eschenmoser A. 1993. Why pentose- and not hexose-nucleic acids? Pyranosyl-RNA (“p-RNA”). Helv Chim Acta 76: 2161–2183. Pitsch S, Krishnamurthy R, Bolli M, Wendeborn S, Holzner A, Minton M, Lesueur C, Schlönvogt I, Jaun B, Eschenmoser A. 1995. Pyranosyl-RNA (“p-RNA”): Base-pairing selectivity and potential to replicate. Helv Chim Acta 78: 1621–1635. Pitsch S, Wendeborn S, Krishnamurthy R, Holzner A, Minton M, Bolli M, Miculka C, Windhab N, Micura R, Stanek M, et al. 2003. Pentopyranosyl oligonucleotide systems. 9th communication. The β-D-ribopyranosyl-(4′→2′)-oligonucleotide system (“pyranosyl-RNA”): Synthesis and resumé of base-pairing properties. Helv Chim Acta 86: 4270–4363. Pizzarello S, Zolensky M, Turk KA. 2003. Nonracemic isovaline in the Murchison meteorite: Chiral distribution and mineral association. Geochim Cosmochim Acta 67: 1589–1595. Powner MW, Gerland B, Sutherland JD. 2009. Synthesis of activated pyrimidine ribonucleotides in prebiotically plausible conditions. Nature 459: 239–242. Powner MW, Sutherland JD, Szostak JW. 2010. Chemoselective multicomponent one-pot assembly of purine precursors in water. J Am Chem Soc 132: 16677–16688. Prywes N, Blain JC, Del Frate F, Szostak JW. 2016. Nonenzymatic copying of RNA templates containing all four letters is catalyzed by activated oligonucleotides. eLife 5: e17756. Rajamani S, Ichida JK, Antal T, Treco DA, Leu K, Nowak MA, Szostak JW, Chen IA. 2010. Effect of stalling after mismatches on the error catastrophe in nonenzymatic nucleic acid replication. J Am Chem Soc 132: 5880–5885. Rasi S, Mavelli F, Luisi PL. 2004. Matrix effect in oleate micelles–vesicles transformation. Orig Life Evol Biosph 34: 215–224. Ricardo A, Carrigan MA, Olcott AN, Benner SA. 2004. Borate minerals stabilize ribose. Science 303: 196. Rushdi AI, Simoneit BR. 2001.
Lipid formation by aqueous Fischer–Tropsch-type synthesis over a temperature range of 100 to 400 degrees C. Orig Life Evol Biosph 31: 103–118. Robertson MP, Ellington AD. 1999. In vitro selection of an allosteric ribozyme that transduces analytes into amplicons. Nat Biotechnol 17: 62–66. Robertson MP, Joyce GF. 2014. Highly efficient self-replicating RNA enzymes. Chem Biol 21: 238–245. Robertson MP, Scott WG. 2007. The structural basis of ribozyme-catalyzed RNA assembly. Science 315: 1549–1553. Rogers J, Joyce GF. 2001. The effect of cytidine on the structure and function of an RNA ligase ribozyme. RNA 7: 395–404. Samanta B, Joyce GF. 2017. A reverse transcriptase ribozyme. eLife 6: e31153. Sanchez RA, Orgel LE. 1970. Studies in prebiotic synthesis. V. Synthesis and photoanomerization of pyrimidine nucleosides. J Mol Biol 47: 531–543. Shechner DM, Grant RA, Bagby SC, Koldobskaya Y, Piccirilli JA, Bartel DP. 2009. Crystal structure of the catalytic core of an RNA polymerase ribozyme. Science 326: 1271–1275. Schöning K, Scholz P, Guntha S, Wu X, Krishnamurthy R, Eschenmoser A. 2000. Chemical etiology of nucleic acid structure: The α-threofuranosyl-(3′→2′) oligonucleotide system. Science 290: 1347–1351. Schulman R, Yurke B, Winfree E. 2012. Robust self-replication of combinatorial information via crystal growth and scission. Proc Natl Acad Sci 109: 6405–6410. Schwartz AW, Orgel LE. 1985. Template-directed polynucleotide synthesis on mineral surfaces. J Mol Evol 21: 299–300. Sczepanski JT, Joyce GF. 2014. A cross-chiral RNA polymerase ribozyme. Nature 515: 440–442. Shan S, Yoshida A, Sun S, Piccirilli JA, Herschlag D. 1999. Three metal ions at the active site of the Tetrahymena group I ribozyme. 
Proc Natl Acad Sci 96: 12299–12304. Shan S, Kravchuk AV, Piccirilli JA, Herschlag D. 2001. Defining the catalytic metal ion interactions in the Tetrahymena ribozyme reaction. Biochemistry 40: 5161–5171. Shih WM, Quispe JD, Joyce GF. 2004. A 1.7-kilobase single-stranded DNA that folds into a nanoscale octahedron. Nature 427: 618–621. Soai K, Shibata T, Morioka H, Choji K. 1995. Asymmetric autocatalysis and amplification of enantiomeric excess of a chiral molecule. Nature 378: 767–768. Springsteen G, Joyce GF. 2004. Selective derivatization and sequestration of ribose from a prebiotic mix. J Am Chem Soc 126: 9578–9583. Stairs S, Nikmal A, Bučar DK, Zheng SL, Szostak JW, Powner MW. 2017. Divergent prebiotic synthesis of pyrimidine and 8-oxo-purine ribonucleotides. Nat Commun 8: 15270. Steitz TA, Moore PB. 2003. RNA, the first macromolecular catalyst: The ribosome is a ribozyme. Trends Biochem Sci 28: 411–418. Strobel SA, Ortoleva-Donnelly L. 1999. A hydrogen-bonding triad stabilizes the chemical transition state of a group I ribozyme. Chem Biol 6: 153–165. Sutherland JD. 2016. The origin of life—Out of the blue. Angew Chemie 55: 104–121. Szostak JW. 2011. An optimal degree of physical and chemical heterogeneity for the origin of life? Philos Trans R Soc Lond B Biol Sci 366: 2894–2901. Szostak JW, Bartel DP, Luisi PL. 2001. Synthesizing life. Nature 409: 387–390. Tang TYD, Hak CRC, Thompson AJ, Kuimova MK, Williams DS, Perriman AW, Mann S. 2014. Fatty acid membrane assembly on coacervate microdroplets as a step towards a hybrid protocell model. Nat Chem 6: 527–533. Tapiero CM, Nagyvary J. 1971. Prebiotic formation of cytidine nucleotides. Nature 231: 42–43. Viedma C. 2005. Chiral symmetry breaking during crystallization: Complete chiral purity induced by nonlinear autocatalysis and recycling. Phys Rev Lett 94: 065504. Viedma C, Ortiz JE, de Torres T, Izumi T, Blackmond DG. 2008. Evolution of solid phase homochirality for a proteinogenic amino acid. 
J Am Chem Soc 130: 15274–15275. Walton T, Szostak JW. 2016. A highly reactive imidazolium-bridged dinucleotide intermediate in nonenzymatic RNA primer extension. J Am Chem Soc 138: 11996–12002. Walton T, Szostak JW. 2017. A kinetic model of nonenzymatic RNA polymerization by cytidine-5′-phosphoro-2-aminoimidazolide. Biochemistry 56: 5739–5747. Wochner A, Attwater J, Coulson A, Holliger P. 2011. Ribozyme-catalyzed transcription of an active ribozyme. Science 332: 209–212. Wu T, Orgel LE. 1992a. Nonenzymatic template-directed synthesis on oligodeoxycytidylate sequences in hairpin oligonucleotides. J Am Chem Soc 114: 317–322. Wu T, Orgel LE. 1992b. Nonenzymatic template-directed synthesis on hairpin oligonucleotides. II. Templates containing cytidine and guanosine residues. J Am Chem Soc 114: 5496–5501. Xu J, Tsanakopoulou M, Magnani CJ, Szabla R, Šponer JE, Šponer J, Góra RW, Sutherland JD. 2017. A prebiotically plausible synthesis of pyrimidine β-ribonucleosides and their phosphate derivatives involving photoanomerization. Nat Chem 9: 303–309. Zaher HS, Unrau PJ. 2007. Selection of an improved RNA polymerase ribozyme with superior extension and fidelity. RNA 13: 1017–1026. Zhang G, Campbell EA, Minakhin L, Richter C, Severinov K, Darst SA. 1999. Crystal structure of Thermus aquaticus core RNA polymerase at 3.3 Å resolution. Cell 98: 811–824. Zhang L, Peritz A, Meggers E. 2005. A simple glycol nucleic acid. J Am Chem Soc 127: 4174–4175. Zhu TF, Szostak JW. 2009. Coupled growth and division of model protocell membranes. J Am Chem Soc 131: 5705–5713. Zhu TF, Adamala K, Zhang N, Szostak JW. 2012. Photochemically driven redox chemistry induces protocell membrane pearling and division. Proc Natl Acad Sci 109: 9828–9832.