Introduction to Soft Matter Models of Membranes and Proteins
by Robert Vácha

Fundamental constants:

Constant               Symbol              Value [SI]
Boltzmann constant     k, kB               1.381 × 10⁻²³ J K⁻¹
Avogadro's number      Na                  6.022 × 10²³ mol⁻¹
Gas constant           R = k × Na          8.314 J K⁻¹ mol⁻¹
Electron charge        e                   1.602 × 10⁻¹⁹ C
Speed of light         c                   2.998 × 10⁸ m s⁻¹
Vacuum permittivity    ε₀ = 10⁷/(4πc²)     8.854 × 10⁻¹² C² J⁻¹ m⁻¹

Unit symbols and relations:

Quantity        SI       Relations
Length          m        1 m = 100 cm = 10⁹ nm = 10¹⁰ Å
                         1 nm = 10 Å = 10⁻⁹ m = 10⁻⁷ cm
Energy          J        1 J = 0.239 cal = 6.242 × 10¹⁸ eV = 2.43 × 10²⁰ kT (25 °C)
                         1 kT (25 °C) = 0.593 kcal mol⁻¹ = 2.479 kJ mol⁻¹
                         1 kT = 4.114 × 10⁻²¹ J (25 °C) = 4.045 × 10⁻²¹ J (20 °C)
                         1 kcal mol⁻¹ = 4.184 kJ mol⁻¹ (1 cal = 4.184 J)
                         1 eV = 1.602 × 10⁻¹⁹ J = 23.06 kcal mol⁻¹
                         1 cm⁻¹ = 1.986 × 10⁻²³ J
Surf. tension   J m⁻²    1 dyne cm⁻¹ = 1 mN m⁻¹ = 1 mJ m⁻²

Contents
1. Introduction to coarse-grained models
2. Ideal chain approximation
3. Ideal thin membrane
4. Interaction of uncharged membrane and proteins
5. Interaction of charged membrane and proteins
6. Towards realistic interactions of membranes and proteins
7. Thermodynamic principles of self-assembly
8. Langmuir adsorption
9. Transport phenomena
10. Advanced Monte Carlo
Test questions
Appendices
  Basics of thermodynamics
  Basics of statistical mechanics
  Derivation of Boltzmann distribution
  Virial coefficients and interactions
  Ideal chain - radius of gyration Rg
  Ideal chain - probability of the end-to-end distance
  Dispersion interactions with thin membrane
  Depletion interactions in solution of ideal chains
  Dissociation constants for amino acids
  Derivation of equilibrium equation between aggregates of different sizes
  Distribution of finite size aggregates
  Fluctuation-dissipation theorem
  Introduction to Metropolis Monte Carlo method

1. Introduction to coarse-grained models
(Coarse-graining levels: from all-atom to single particle models and continuum approximation)

Soft matter is quite a general term, which covers a wide variety of different systems and materials such as polymers, colloids, surfactants, proteins, membranes, cells, tissues, and other biological matter. The common feature of all soft matter substances is their 'softness', which can be characterized by an elastic bulk modulus in the range from 1 Pa to 100 MPa. This means that upon a pressure change of one atmosphere, the relative volume change ΔV/V ≈ ΔP/K ranges from about 10⁵ (the material deforms completely) down to 0.1 %. For comparison, the bulk moduli of water and steel are 2 GPa and 200 GPa, respectively, while air has a bulk modulus of about 0.1 MPa.
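To get a feel for these numbers, the relative volume change for a one-atmosphere pressure step follows directly from the definition of the bulk modulus, ΔV/V ≈ ΔP/K. A minimal sketch (the moduli are the representative values quoted above):

```python
# Relative volume change dV/V ~ dP/K for a pressure change of 1 atm,
# using the representative bulk moduli quoted in the text.
dP = 1.013e5  # 1 atm in Pa

moduli_Pa = {
    "softest soft matter": 1.0,
    "stiffest soft matter": 100e6,
    "air": 0.1e6,
    "water": 2e9,
    "steel": 200e9,
}

for material, K in moduli_Pa.items():
    print(f"{material:>21}: dV/V ~ {100 * dP / K:.1e} %")
```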
These soft matter species usually form very complex fluids with interactions on the order of a few kT, which are comparable to thermal fluctuations and entropic forces. As a result, soft matter structures are dynamic, and changes in the solution composition or temperature can lead to significant structural changes (no need to break chemical bonds). Soft matter processes are mostly about diffusion and conformational changes of large molecules in solution. It is therefore important to understand the organization and changes at the "mesoscopic" scale rather than to investigate atomistic details. The characteristic length scale is on the order of 1-100 nm, and the characteristic times are on the order of milliseconds to seconds, rather than the angstroms and femto- to picosecond times relevant for the movement of individual atoms and small molecules. Such large scales are a typical domain of coarse-grained simulations, which are widely used and still being developed.

Coarse-grained model is a term used for non-atomistic models, in which some of the degrees of freedom are averaged out. Strictly speaking, atomistic models are also coarse-grained, as they neglect or average out quantum effects and the wave description of particles. Nevertheless, the term coarse-grained is typically used for models in which a single particle consists of several atoms. Naturally, the first question is how to coarse-grain the system of interest, i.e. which model to use for it. This can sometimes be a challenging task, as we only know that the right model has to contain all the physically important details of the studied phenomena/system, while the irrelevant degrees of freedom are averaged out and described effectively. This means that the effective interaction is a function of temperature, pH, and salt concentration (for instance, the solvent could be described as a dielectric medium with a certain viscosity, both of which are temperature dependent). Therefore, different phenomena often require different coarse-grained models, and a whole library of models has been developed. To gain a better idea of the current library of coarse-grained models, we outline a classification system below. However, this classification serves only as a simple overview, as it is not widely recognized and is incomplete (e.g. it does not contain mean-field models). We list some well-known examples of protein and membrane models.

Classification based on the size of elementary building units:

1. Finite element models – molecules or their parts are represented as finite particles; we can divide this group further based on the number of heavy atoms included in one particle
• First order – contains 2-5 heavy atoms; examples are the MARTINI force field, the Klein and Shinoda model, Voth's model, Smit's model, the Tube model, ELBA, PRIMO, PaLaCe, ...
• Second order – usually includes 5-20 heavy atoms; a typical example is an amino acid model (1 bead per amino acid) for proteins: UNRES, VAMM, OPEP, the Bereau and Deserno model, the SCORPION model, the McCammon model, ...
• Third order – coarse-grains 20-100 heavy atoms into one particle. A typical example is the three-bead lipid model by Lipowsky, which was further developed by Deserno; the Frank model, the HCG model, etc.
• Single particle – the whole molecule is represented as a single particle; the HAS model for lipids and many patchy particle models for proteins.

2. Lattice models – particles or regions of the system are represented as points on a lattice with specific interactions between them
• Ising model – interaction points with two possible states
• Density Functional Theory – points contribute to the free energy of the system through their local density (entropy) and interactions
3. Continuum models – molecules or parts of the system are described uniformly by their macroscopic properties, based on which we can further categorize this group
• Electrostatic – the solution is described as a dielectric screening of the Coulomb interactions
• Elastic – the membrane as an elastic thin sheet
• Density

Classification based on parameterization of the model:

1. Approaches to coarse-graining
• TOP DOWN approach – the parameters of the interactions are fitted to macroscopic values such as partition coefficients, density, elasticity, binding constants, etc., which are commonly known from experiments
• BOTTOM UP approach – model parameters are fitted to reproduce microscopic properties such as radial distribution functions, molecular forces, etc.; these are usually obtained from more detailed simulations

2. Type of interaction between elementary building blocks
• Isotropic – acts between two particles independently of their orientation; the most common type
• Anisotropic – orientation-dependent interaction between two particles, e.g. interactions of dipoles or patchy particles
• Three- and more-body – the potential depends on the positions of more than two particles, e.g. single-particle water models

Note that different models can be combined in one calculation, and there are also models in which the top-down and bottom-up approaches were combined. The species of interest, or the more flexible parts of the system, are often described in more detail than the rest of the system. For example, the MARTINI force field uses a 1:3-4 mapping with explicit water, which can be polarizable or non-polarizable. In contrast, the UNRES (united residue) model uses one particle for each amino acid, two particles for the 4 backbone heavy atoms, and an implicit solvent. The more detailed backbone representation was found to be necessary to describe protein folding, where hydrogen bonds within the protein backbone play an important role.

A knowledge-based approach is sometimes recognized separately from the top-down approach. This approach is based on protein structures from the PDB database. The statistics of molecular contacts and their distance dependence are used in the interaction parameterization. The resulting Go, network, and native-centric models were used to study protein folding, fluctuations, and protein-protein interactions. Despite the success of these models, it has been shown that they are not transferable to new protein structures. This casts a shadow on the physical significance of results from these models for non-native (untrained) structures.

Why is there a need for coarse-grained models, and why is their popularity growing rapidly? Because human imagination in model development enables us to study systems that might not be feasible for all-atom simulations for decades. Let's estimate how long it will take for a larger system to become feasible for all-atom simulation. Say we want to simulate a system that is 10 times larger (10³ times in volume) than what is feasible by atomistic simulations today. We could naively think that the computational cost (time required) will grow with the number of interactions, i.e. with N², where N is the number of particles in the system (we would need to calculate the forces between all pairs of atoms). Using more efficient algorithms we can achieve better scaling, so the computational cost increases roughly linearly with N. Still, this gives us a time that is 10³ times longer. Moreover, the time required for equilibration and sampling of such a system would also increase. We can make a simple estimate based on diffusion, which dominates such large systems. If we take a single particle in the center of the simulation box and calculate the time required for it to diffuse to the edge of the box, it takes 10² times longer in our larger system. So we can estimate that one would need computers 10⁵ times faster (possibly much more than that). Computer speed increases exponentially, doubling every 1.5 years, according to Moore's law. This gives us 25 years until we can perform such a simulation.
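The arithmetic behind this estimate is compact enough to check directly; a minimal sketch (the 10³ and 10² factors are those derived above):

```python
import math

# A system 10x larger in linear dimension needs ~10^3 more force
# evaluations per step (O(N) algorithms) and ~10^2 longer sampling
# (the diffusion time across the box grows as L^2).
speedup_needed = 1e3 * 1e2

# Moore's law: computer speed doubles roughly every 1.5 years.
doubling_time_years = 1.5
years = math.log2(speedup_needed) * doubling_time_years
print(f"required speed-up: {speedup_needed:.0e}")
print(f"estimated wait:    {years:.0f} years")  # ~25 years
```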
Note that thanks to massive parallelization we can nowadays simulate all-atom systems with millions of atoms; however, reaching the time scales required to study such large systems remains challenging. Moreover, from the models we gain insights into the key aspects of the studied systems, so we can understand the microscopic origin of a given macroscopic phenomenon. Sometimes the models are so simple that they may look very unrealistic, for example the ideal gas, the ideal chain, etc. Nevertheless, the main property of a good model is its predictive power, and simple models have predicted a lot of phenomena and are still used as first approximations. Most coarse-grained simulations of proteins and phospholipid membranes today employ finite element models with the solvent described as a continuum or by finite elements. We will therefore focus on those. We start with a description of the simplest models covering conformations, entropy, and the changes in them for proteins and membranes. This is followed by a discussion of enthalpic contributions to the intra- and intermolecular interactions, focusing on their strength and distance dependence, information necessary for the development of models using a top-down approach. The parameterization of coarse-grained models is then completed with a description of the methodology for the bottom-up approach. We then focus on the principles of self-assembly, surface association, and motion in solutions, basic phenomena in the field of coarse-grained matter. Last but not least, some advanced methods commonly applied in these areas are explained.

We will now look at two examples that demonstrate how much we can learn from the simplest coarse-grained models. In the early fifties, John G. Kirkwood made the prediction that hard spheres with no attractive interaction can freeze. This counter-intuitive idea was based on a theoretical approximation and was not well accepted at the time. Only later computer simulations by Alder, Wainwright, Wood, and Jacobson confirmed the existence of such a fluid-to-crystal transition (interestingly, it was only recently shown that the face-centered cubic phase is more stable than the hexagonal close-packed phase). This is a first-order transition, and it is purely entropic, as there is no attractive interaction (ΔH = 0). It is common that entropically driven processes are somewhat counter-intuitive, since most processes in our everyday experience are dominated by enthalpy. Some examples of such processes are described in the following chapters; one should not neglect entropic effects in soft matter.

The second example we mention now is the existence of the liquid phase. This may sound strange, since we see liquids every day, but from a molecular point of view it is hard to make any prediction about material phases based purely on the interaction potential of atoms. Let's look at hard spheres again, but this time with an isotropic attraction. Would this lead to any other phase than gas and solid? Think about its phase diagram.
If the range of attraction is zero, there is no liquid phase. The imaginary critical temperature lies below the gas/solid coexistence curve, and we have no knowledge of it. As the attraction range expands, the critical temperature (below which a liquid exists) increases, and once it crosses the solid/gas coexistence line, the liquid phase begins to exist. Recent simulations and experiments suggest that this point is reached when the range of attraction is about one third of the particle diameter. Once we know about the liquid phase, it is easy to explain it. Nevertheless, this demonstrates that there might be other phases whose critical temperatures lie below the coexistence line and of which we therefore have no knowledge yet. Finding such materials or interaction potentials remains an open task.

Sources and further readings:
1. W. G. Noid; Perspective: Coarse-grained models for biomolecular systems; The Journal of Chemical Physics 2013, 139 (9), 090901; DOI: 10.1063/1.4818908
2. H. I. Ingólfsson, C. A. Lopez, J. J. Uusitalo, D. H. de Jong, S. M. Gopal, X. Periole, S. J. Marrink; The power of coarse graining in biomolecular simulations; Wiley Interdisciplinary Reviews: Computational Molecular Science 2013; DOI: 10.1002/wcms.

2. Ideal chain approximation
(Random coil, Kuhn length, end-to-end distance)

Proteins are essential biological macromolecules with a variety of roles in all organisms. They are built from 20 standard amino acids attached together by peptide bonds, forming a single linear polymer. Proteins can fold into a specific structure or remain unstructured in a conformation called a random coil. In this conformation the elementary building blocks are oriented randomly, and the overall shape is dictated by the statistical distribution of all possible conformations. This can be described theoretically by the ideal chain approximation, a basic theory for ideal polymers. The ideal chain is a polymer of N independent, non-interacting segments, each of length l. All segments can freely rotate around the joining points. This model might sound like an oversimplification; however, there are many proteins and fully flexible polymers that behave like this when the segment length is large enough (it can contain several amino acids). The ideal chain is the simplest polymer model, and in a way it is similar to the ideal gas approximation in the theory of molecular gases. Note that there is an important similarity between an ideal chain and a random walk, which will help us in our calculations and understanding. In both cases the direction of each segment/step of length l is random and independent of the other segments/steps. An example of a random walk is Brownian motion, where the movement of a large particle in solution can be described by diffusion.

Figure 2.1: Illustration of an ideal chain (random walk), where all segments have length l and RN is the vector from the beginning to the end of the N-th segment.

The key parameter in the ideal chain model is the length l of the segment. It is an effective length over which the protein can be treated as a rigid body. Moreover, the individual segments of this length must be independent. Such an effective length is called the Kuhn length, and the protein parts of this length are called Kuhn segments, after the Swiss chemist Werner Kuhn, who first suggested the existence of these effective segments in polymers. In proteins, the Kuhn length can vary from about 1 nm to 100 nm, depending on the protein sequence, charges, secondary structure, etc.
Naturally, the Kuhn length is directly connected to the flexibility of the protein. If we take a point on the protein and study how much the direction at a distance d deviates from the direction at the starting point, we find that the correlation decays exponentially as exp(−d/l_p), where l_p is a characteristic length called the persistence length. For short distances, d ≪ l_p, the protein can be treated as a rigid rod, and at large distances, d ≫ l_p, the direction of the protein is independent of its initial direction. Since the protein can be considered a rigid rod for +l_p and −l_p from the initial point, the Kuhn length can be calculated as:

$$l = 2l_p$$

The persistence length l_p can also be expressed using the bending stiffness B as:

$$l_p = \frac{B}{kT},$$

where k is the Boltzmann constant, 1.381 × 10⁻²³ J K⁻¹. Other frequently used constants are Avogadro's number Na = 6.022 × 10²³ mol⁻¹ and the ideal gas constant R = 8.314 J K⁻¹ mol⁻¹. In soft matter, kT is commonly used as a unit of energy, since it corresponds to the thermal energy. Its relation to other units at 25 °C is kT = 4.11 × 10⁻²¹ J = 2.479 kJ mol⁻¹ = 0.593 kcal mol⁻¹.

Now that we have defined our basic parameters, we can calculate the end-to-end distance RN of an ideal chain as a measure of its size. In analogy with a random walk, we can start by calculating how far you get after walking lost in a deep forest. Let's assume that we will walk for 10 hours and, because we are relatively fit, with an average speed of 4 km/h. The total length of our track will thus be 40 km; however, if we lose our direction after a few hundred meters, i.e. l = 400 m, we will actually not get very far from the starting point. 400 meters may not sound like very far, but in experiments with blindfolded people, the subjects usually lost their initial direction within the first 100-200 m. Therefore, we take this distance as an estimate of the persistence length of our walk. As we know, a random walk can be described by diffusion (if not, see Chapter 9), so the distance from our origin is given by the equation:

$$\langle R^2 \rangle = 2nDt, \qquad (2.2)$$

where D is the diffusion coefficient and n is the dimensionality of the process. We can calculate the diffusion coefficient from our speed and the uninterrupted path in one direction as 2nD = l²/t_l = lv, where t_l = l/v is the time of one uninterrupted step. Hence we obtain ⟨R²⟩ = lvt, which can be rewritten as |R| = (lvt)^(1/2) = 4 km. This means that after walking 40 km we will actually get only 4 km away from our starting point. Moreover, it is highly likely that our path will cross itself, so we will have the impression of walking in circles, which many people lost in dense mist or a blizzard experience. Thus, it is better to go slowly and concentrate on your direction than to walk fast. To obtain the dependence of the distance on the number of segments, we simply divide the total path into segments of length l (vt = Nl) and obtain:

$$|R_N| = lN^{1/2} \qquad (2.3)$$
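The scaling |R_N| = lN^(1/2) is easy to verify with a short Monte Carlo experiment; a minimal sketch using the numbers of the forest example (l = 0.4 km and N = 100 segments, since vt = Nl):

```python
import numpy as np

rng = np.random.default_rng(0)

# 2D random walk with the 'lost in the forest' numbers: segments of
# l = 0.4 km, total path 40 km => N = 100 randomly oriented steps.
l, N, walkers = 0.4, 100, 10_000
angles = rng.uniform(0.0, 2.0 * np.pi, size=(walkers, N))
steps = l * np.stack([np.cos(angles), np.sin(angles)], axis=-1)
R = np.linalg.norm(steps.sum(axis=1), axis=-1)  # end-to-end distances

print(f"sqrt(<R^2>) = {np.sqrt(np.mean(R**2)):.2f} km")  # ~4 km
print(f"l*sqrt(N)   = {l * np.sqrt(N):.2f} km")
```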
Now we will calculate the same RN for a protein using the ideal chain approximation. We start with the definition of the end-to-end distance as a sum of N segments, each defined by its vector ri of size l:

$$\vec{R}_N = \sum_{i=1}^{N}\vec{r}_i \qquad (2.4)$$

As all the segments are independent, we can write the averaged value as:

$$\langle\vec{R}_N\rangle = \sum_{i=1}^{N}\langle\vec{r}_i\rangle \qquad (2.5)$$

All orientations of ri have the same probability, i.e. in 1D +l is equally likely as −l, hence the averaged value is zero. Therefore, we end up with ⟨RN⟩ = 0. This does not mean that the size of the end-to-end distance is zero, but that RN has no preferred orientation. The size of the end-to-end distance has to be calculated as the square root of ⟨RN²⟩:

$$\langle R_N^2\rangle = \Big\langle\Big(\sum_{i=1}^{N}\vec{r}_i\Big)^2\Big\rangle \qquad (2.6)$$
$$= \Big\langle\sum_{i=1}^{N}\vec{r}_i \cdot \sum_{j=1}^{N}\vec{r}_j\Big\rangle \qquad (2.7)$$
$$= \sum_{i=1}^{N}\sum_{j=1}^{N}\langle\vec{r}_i\cdot\vec{r}_j\rangle \qquad (2.8)$$
$$= \sum_{i=1}^{N}\sum_{j=1}^{N}\langle l_i l_j\cos\theta_{ij}\rangle \qquad (2.9)$$
$$= \sum_{i=1}^{N}\sum_{j=1}^{N} l^2\langle\cos\theta_{ij}\rangle \qquad (2.10)$$

Since all orientations have the same probability, all angles from 0 to 180° are equally likely and ⟨cos θij⟩ = 0, apart from the cases i = j, for which cos θii = 1. Therefore, the sum simplifies to:

$$\langle R_N^2\rangle = l^2N \qquad (2.11)$$

and the average end-to-end distance is:

$$|R_N| \equiv \sqrt{\langle R_N^2\rangle} = lN^{1/2} \qquad (2.12)$$

Not surprisingly, we have obtained the same result as for the random walk (Eq. 2.3), where we used the diffusion equation. If we want to compare with the measured size of proteins, we can calculate the radius of gyration Rg. It is defined as the mean squared distance between the monomers and the center of mass:

$$R_g^2 \equiv \frac{1}{N}\sum_{i=1}^{N}\big\langle(\vec{r}_i - \vec{R}_{CM})^2\big\rangle \qquad (2.13)$$

This can be related (see Appendix) to the end-to-end distance as:

$$\langle R_g^2\rangle = \frac{1}{6}\langle R_N^2\rangle \qquad (2.14)$$

We can also calculate the probability distribution of the end-to-end distance. After a little bit of math (shown in the Appendix) we end up with the 1D probability:

$$P(R_N) = \frac{1}{\sqrt{2\pi Nl^2}}\exp\left(-\frac{R_N^2}{2Nl^2}\right) \qquad (2.15)$$

and for 3D:

$$P(\vec{R}_N) = \left(\frac{3}{2\pi Nl^2}\right)^{3/2}\exp\left(-\frac{3R_N^2}{2Nl^2}\right) \qquad (2.16)$$

The end-to-end distribution is a Gaussian function with a variance corresponding to the mean squared end-to-end distance. This distribution could have been anticipated, since the central limit theorem of statistics states that a large number of random events has a Gaussian distribution. The assumptions used in the above derivation are a large number of segments N and an end-to-end distance smaller than that of the fully stretched chain, RN ≪ Nl.

From the probability distribution we can calculate the force that keeps the end-to-end distance in equilibrium. Our force doing the work can be calculated as:

$$\langle F_W(R_N)\rangle = \frac{dW}{dR_N} \qquad (2.17)$$

The chain force is of the same size but opposite direction:

$$\langle F(R_N)\rangle = -\frac{dW}{dR_N} \qquad (2.18)$$

Since there is no interaction, the internal energy stays constant and all the work done in a change of the end-to-end distance is transferred into heat:

$$dU = dW + dQ = 0 \qquad (2.19)$$

Furthermore, when we only make small changes, we can assume that the change in our system is reversible:

$$dQ = TdS \qquad (2.20)$$

Hence, the chain force is:

$$\langle F(R_N)\rangle = -\frac{dW}{dR_N} = \frac{dQ}{dR_N} = T\frac{dS}{dR_N} \qquad (2.21)$$

Thus, to calculate the force we need to know how the entropy depends on the end-to-end distance. The entropy is defined as:

$$S = k\ln\Omega \qquad (2.22)$$

where k is the Boltzmann constant and Ω is the number of micro-states corresponding to the evaluated macro-state (end-to-end distance). Because we already know the probability distribution of the end-to-end distances, we know the number of micro-states up to a multiplicative constant C:

$$\Omega(R_N) = CP(R_N) \qquad (2.23)$$

and so

$$S = k\ln\Omega = k\ln P(R_N) + k\ln C \qquad (2.24)$$

Substituting this into 2.21 we obtain:

$$\langle F(R_N)\rangle = kT\frac{d\ln P(R_N)}{dR_N} \qquad (2.25)$$

Finally, using the probability of the end-to-end distance (Eq. 2.16) leads to the equation of state for an ideal chain:

$$\langle F(R_N)\rangle = -\frac{3kT}{Nl^2}R_N, \qquad (2.26)$$

where RN is the end-to-end distance. We should note that this force is purely entropic, since there are no interactions (ΔH = 0). The linear increase of the force with the end-to-end distance is similar to Hooke's law used for small deformations in materials. Importantly, the force increases linearly with temperature, so it will be harder to stretch or shrink the ideal chain at higher temperatures. This might be counter-intuitive, but it is a well-known fact in polymer science and we can see demonstrations of it every day. Try to heat up a rubber band and it will become stiffer.
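To get a feeling for the magnitudes involved, Eq. 2.26 can be evaluated for a typical chain. A minimal sketch; the chain parameters below are illustrative assumptions, not values from the text:

```python
# Entropic spring force of an ideal chain, Eq. (2.26): <F> = -(3kT/(N l^2)) R_N.
# Assumed example parameters: N = 100 Kuhn segments of l = 1 nm,
# chain held at an end-to-end distance of 10 nm, at 25 C.
kT = 4.11e-21   # J
l = 1e-9        # m
N = 100
R_N = 10e-9     # m

F = 3 * kT * R_N / (N * l**2)
print(f"restoring force: {F * 1e12:.2f} pN")  # ~1.2 pN, a typical AFM-scale force
```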
Be aware that at large extensions the chain is not ideal and Equation 2.26 fails. When RN → Nl, an alternative model has to be employed, leading to a force that diverges at full extension (Nl = RN):

$$\langle F(R_N)\rangle \sim -\frac{kTN}{Nl - |R_N|} \qquad (2.27)$$

A more realistic model is the worm-like chain, where the flexibility is distributed uniformly along the chain. This continuous model is constructed as a limit where the number of segments goes to infinity and the segment length goes to zero, while the total length is kept constant (L = Nl). The derivation is similar to the one above, but summations become integrations:

$$\langle R_N^2\rangle = \Big\langle\Big(\int_0^L \mathrm{d}\vec{r}\Big)^2\Big\rangle \qquad (2.28)$$
$$= \int_0^L\!\!\int_0^L \langle\cos\theta_{ij}\rangle\,\mathrm{d}r_i\,\mathrm{d}r_j \qquad (2.29)$$
$$= \int_0^L\!\!\int_0^L \exp\left(-\frac{|r_i - r_j|}{l_p}\right)\mathrm{d}r_i\,\mathrm{d}r_j \qquad (2.30)$$
$$= 2l_p\left[L - l_p\left(1 - e^{-L/l_p}\right)\right] \qquad (2.31)$$

In the limit of a long chain, L ≫ l_p, the squared end-to-end distance converges to 2Ll_p, which is equal to that of an ideal chain with segment length l = 2l_p. In addition, the finite volume of the chain can be added as an excluded volume, which changes the scaling of the squared end-to-end distance to N^(6/5). These models have successfully predicted the stretching of proteins and their behavior in free or confined environments. Note that there are more chain models, such as the bead-spring model, where successive bead particles along a coarse-grained chain are connected by a harmonic potential.

Exercise: Let's consider proteins with no tertiary structure, such as denatured proteins or intrinsically disordered proteins. Calculate the ideal chain end-to-end distance and radius of gyration of the denatured Ubiquitin protein, which has 76 amino acids. Compare the calculated radius of gyration with the experimental value of 25.2 nm. Then calculate the excluded-volume radius of gyration of the 129-amino-acid-long protein Lysozyme and compare it with the experimental value of 35.8 nm. Comment on your results. The average persistence length of disordered Ubiquitin is 4.00 nm; assume Lysozyme to be intrinsically disordered as well.
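One possible way to evaluate the exercise numerically is sketched below. The contour length of 0.38 nm per amino acid is an assumption not given in the text, and the ideal-chain relation Rg² = ⟨RN²⟩/6 (Eq. 2.14) is reused for the excluded-volume chain, where it is only approximate:

```python
import math

def ideal_chain_rg(n_residues, lp_nm, res_len_nm=0.38):
    """Ideal chain: Rg^2 = N l^2 / 6 with Kuhn length l = 2 lp (Eqs. 2.11, 2.14)."""
    L = n_residues * res_len_nm   # contour length, nm (0.38 nm/residue assumed)
    l = 2.0 * lp_nm               # Kuhn length, nm
    N = L / l                     # number of Kuhn segments
    return math.sqrt(N * l**2 / 6.0)

def excluded_volume_rg(n_residues, lp_nm, res_len_nm=0.38):
    """Excluded-volume chain: <R^2> ~ l^2 N^(6/5); the 1/6 factor kept as an approximation."""
    L = n_residues * res_len_nm
    l = 2.0 * lp_nm
    N = L / l
    return math.sqrt(l**2 * N**1.2 / 6.0)

print(f"Ubiquitin (ideal chain):    Rg = {ideal_chain_rg(76, 4.0):.1f} nm")
print(f"Lysozyme (excluded volume): Rg = {excluded_volume_rg(129, 4.0):.1f} nm")
```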
Sources and further readings:
1. A. Y. Grosberg and A. R. Khokhlov; Giant Molecules: Here, There, and Everywhere (2nd Edition); World Scientific Publishing Company, 2010
2. J. R. C. Van der Maarel; Introduction To Biopolymer Physics; World Scientific Publishing Company, 2007

3. Ideal thin membrane
(Elastic stretching, bending, Helfrich's theory, endocytosis)

The biological membrane is a vital component of all cells. It separates the content of the cell and its organelles from the surrounding fluids. Its main function is to act as a selective filter that controls the traffic and communication across these open systems. Our current view of membranes dates back to 1925, when Gorter and Grendel showed that the membrane is a thin double layer of lipid molecules. Proteins were added later, and in 1972 Singer and Nicolson introduced the "fluid mosaic" model, in which the membrane is described as a fluid bilayer composed of a lipid mixture with embedded and attached proteins. This model has been updated to its current form by the addition of the complex interplay of all the membrane components, which can lead to spontaneous organization such as phase separation or the formation of more rigid regions called rafts. The rafts are speculated to have various important biological functions. The lipids in rafts are in a gel phase, where the lipid hydrocarbon tails are highly ordered. The rest of the membrane is typically in the fluid phase, where the tails are disordered. The lateral diffusion of lipids is on the order of 10⁻³ nm² ns⁻¹ in the fluid phase and 2-3 orders of magnitude smaller in the gel phase.

From a macroscopic view, membranes are flexible thin sheets with thicknesses of about 5 nm. The shape and mechanical response of a membrane can be understood from the elastic theory of an idealized plate or shell. We will explain this theory on a small square of homogeneous material. When the square is stretched, it expands from the original area A₀ to a new size, denoted A. In the first approximation (similar to a harmonic oscillator) we can write the energy change as:

$$\Delta E = E_{stretch} = \frac{1}{2}K(A - A_0)^2, \qquad (3.2)$$

where K is a constant representing the membrane resistance to the increase in area. As a matter of convention we define a new constant K_stretch = KA₀ and obtain:

$$E_{stretch} = \frac{1}{2}K_{stretch}\frac{(A - A_0)^2}{A_0} \qquad (3.3)$$

The convention for the stretching constant (modulus) is chosen based on Hooke's law, which states that the stretch (A − A₀)/A₀ is directly proportional to the lateral tension (stress) σ ≡ dE/dA. The integral form of Equation 3.3 is:

$$E_{stretch} = \int_{membrane} K_{stretch}\frac{(A - A_0)}{A_0}\,\mathrm{d}A = \int_{membrane}\sigma\,\mathrm{d}A \qquad (3.4)$$

The typical value of K_stretch for a phospholipid membrane is in the range of 0.15-0.30 N/m. This value is similar to that of rubber; however, unlike rubber, a phospholipid membrane is not made of long polymers. As a result, a stretch of more than just a few percent leads to membrane rupture. Note that K_stretch can be calculated for homogeneous materials from the Young's modulus Y as K_stretch = Y t, where t is the thickness of the stretched sheet, assuming it does not change during the deformation.

The second type of deformation we will investigate is bending. We start again from a square of homogeneous material, but this time we bend the square along one axis, as shown in Figure 3.2. As with stretching, we use only the first term of the expansion:

$$E_{bend} = \frac{1}{2}K_{bend}\frac{(V - V_0)^2}{V_0} \qquad (3.5)$$

To calculate the bending change in energy, we have to integrate the change of an elementary volume dV over the sheet thickness.

Figure 3.2: Side view illustration of bending of a homogeneous material of thickness t. The bending radius R is defined by the curvature of the mid-plane (dashed). At distance z from the mid-plane, a small volume of length l and thickness dz is displayed.

The upper half of the material becomes stretched, while the lower part is compressed by the bending. Since K_bend corresponds to a deformation of volume, it is equal to the Young's modulus Y, so:

$$E_{bend} = \frac{1}{2}Y\int_V \frac{(\mathrm{d}V' - \mathrm{d}V)^2}{\mathrm{d}V} \qquad (3.6)$$
$$= \frac{1}{2}Y\int_A\int_{-t/2}^{t/2}\frac{(\mathrm{d}x'\mathrm{d}y'\mathrm{d}z' - \mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z)^2}{\mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z}, \qquad (3.7)$$

where dx′dy′dz′ is the size of the elementary volume after deformation. Because we have bending around a single axis, dy′ = dy, and we assume that the thickness of the bent material has not changed, dz′ = dz. Eq. 3.7 then simplifies to:

$$E_{bend} = \frac{1}{2}Y\int_A\int_{-t/2}^{t/2}\frac{(\mathrm{d}x' - \mathrm{d}x)^2}{\mathrm{d}x}\,\mathrm{d}y\,\mathrm{d}z \qquad (3.8)$$

To evaluate the integral, we use a substitution derived from Figure 3.2. We can see that the length of the elementary volume dx expands to dx′ at distance z from the mid-plane. Using the expression for the angle β (the ratio of arc length to radius) we get:

$$\frac{\mathrm{d}x'}{R + z} = \frac{\mathrm{d}x}{R} \qquad (3.10)$$
$$\frac{\mathrm{d}x'}{\mathrm{d}x} = \frac{R + z}{R} \qquad (3.11)$$

Now we subtract one from both sides to get:

$$\frac{\mathrm{d}x' - \mathrm{d}x}{\mathrm{d}x} = \frac{R + z - R}{R} \qquad (3.12)$$
$$\frac{\mathrm{d}x' - \mathrm{d}x}{\mathrm{d}x} = \frac{z}{R} \qquad (3.13)$$
$$\frac{(\mathrm{d}x' - \mathrm{d}x)^2}{\mathrm{d}x^2} = \frac{z^2}{R^2} \qquad (3.14)$$
$$\frac{(\mathrm{d}x' - \mathrm{d}x)^2}{\mathrm{d}x} = \frac{z^2}{R^2}\mathrm{d}x \qquad (3.15)$$

We can substitute this into Eq. 3.8, leading to:

$$E_{bend} = \frac{1}{2}Y\int_A\int_{-t/2}^{t/2}\frac{z^2}{R^2}\,\mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z \qquad (3.16)$$
$$= \frac{1}{2}YA\left[\frac{z^3}{3R^2}\right]_{-t/2}^{t/2} \qquad (3.17)$$
$$= \frac{1}{24}Y\frac{t^3}{R^2}A, \qquad (3.18)$$

where A is the area of our square.
Now we can compare the bending and stretching energies. Employing K_stretch = Y t we get:

$$E_{bend} = \frac{1}{24}K_{stretch}\frac{t^2}{R^2}A \qquad (3.19)$$

This means that the lateral stress dE/dA for bending is ∼ t²/R², while for stretching it is ∼ (A − A₀)/A₀. Thus bending is less energy demanding than stretching, as we see for other materials in everyday life. Try, for example, to stretch or compress a sheet of paper without bending it. The energy necessary to bend the membrane to a radius of 1 mm is roughly equal to that of a relative change in area of 10⁻¹¹. Note that the bending energy is commonly expressed per unit area:

$$\epsilon_{bend} = \frac{1}{24}K_{bend}\frac{t^3}{R^2} \qquad (3.20)$$

This tells us that the energy density depends on the square of the curvature 1/R, as this is the only variable for a sheet of fixed size and thickness. For convenience we define a new constant κ ≡ t³Y/12, which is the bending modulus for a given thickness. The bending energy per unit area then becomes:

$$\epsilon_{bend} = \frac{1}{2}\kappa\frac{1}{R^2} \qquad (3.21)$$

A typical value of κ for a phospholipid membrane is in the range 6-8 × 10⁻²⁰ J = 60-80 pN nm ≈ 15-20 kT at ambient conditions. This is an important value, which perfectly suits the needs of cells. It is just slightly larger than the thermal energy (a few kT), which means that the thermal energy is enough to gently bend the membrane, but not enough to break it. Moreover, it also means that the membrane can be deformed by proteins, by their self-assembly, or by ATP hydrolysis, which provides ∼ 15 kT per single molecule. We should note that a phospholipid membrane has zero shear modulus. The reason for this is that the membrane is fluid, so shear strain does not cause any change in energy.

In nature, the membrane is usually bent around more than one axis. In fact, the membrane can have various shapes and the curvature is not the same everywhere. In order to describe such bending correctly, we have to look at the bending locally and then integrate the local bending energy over the whole membrane area. The correct description of local curvature can be obtained from differential geometry, and we will use its general results and implications (the derivation can be found elsewhere). At each point of the deformed surface we can define a curvature c ≡ 1/R in each direction. Nevertheless, there are always two directions in which the directional curvatures are extremal (see Figure 3.3). These curvatures are called the principal curvatures (c₁ and c₂); they are orthogonal, but do not have to be unique. One way to think about this is to imagine the normal vector to the surface at a given point. The normal vector defines a plane called the normal section plane. All vectors within this plane define directions on the surface with varying curvature. By rotating the plane around the normal vector we can find the minimum and maximum curvatures, which are the principal curvatures. There are two important characteristics of the curvature that can be calculated from the principal curvatures. The first is the mean curvature:

$$H \equiv \frac{1}{2}(c_1 + c_2) = \frac{1}{2}\left(\frac{1}{R_1} + \frac{1}{R_2}\right) \qquad (3.22)$$

and the second is the Gaussian curvature:

$$K \equiv c_1c_2 = \frac{1}{R_1R_2} \qquad (3.23)$$

Figure 3.3: Schematic visualization of a locally curved surface with two principal curvatures.

The mean curvature H describes how much the surface is curved, while the Gaussian curvature K tells us how the surface can be unfolded, i.e. what shape it would have if the mean curvature were zero. With these two characteristics, we can properly describe the bending energy of the surface with terms up to second order in the curvature, neglecting contributions of order 1/R³.
In general, the bending energy depends on a combination of the linear and quadratic invariants c₁² + c₂², (c₁ − c₂)², (c₁ + c₂)², and c₁c₂. However, these are not independent, and the first two expressions can be written as combinations of the last two. Thus, we can construct the expression for the bending energy purely from the mean and Gaussian curvatures. The following formula for the bending energy density is called the Helfrich theory/equation after its main author, Wolfgang Helfrich, though two other authors, Canham and Evans, who participated in its development, are sometimes added. It is:

$$\epsilon_{bend} = \frac{1}{2}\kappa(c_1 + c_2 - c_0)^2 + \kappa_g c_1c_2 \qquad (3.24)$$
$$= \frac{1}{2}\kappa(2H - c_0)^2 + \kappa_g K \qquad (3.25)$$

with the integral form:

$$E_{bend} = \int_{membrane}\left[\frac{1}{2}\kappa(c_1 + c_2 - c_0)^2 + \kappa_g c_1c_2\right]\mathrm{d}A, \qquad (3.26)$$

where κ_g is the bending modulus associated with the Gaussian curvature. Note that the Gaussian curvature is sometimes labeled K_G, because instead of the mean curvature H one can use the extrinsic curvature K ≡ 2H. The spontaneous curvature c₀ is the curvature that the membrane would spontaneously adopt (without any bending energy cost). A non-zero value reflects a membrane asymmetry, either due to the lipid composition or due to a different chemical environment on each side of the membrane in a non-equilibrium situation. If we are interested in bending the membrane without changing its topology, we can use the Gauss-Bonnet theorem, which states that the integral of the Gaussian curvature over a surface can be expressed as an integral over its boundary. Therefore, if the boundary conditions remain the same, the Gaussian curvature contribution is constant and the change in bending energy is:

$$\Delta E_{bend} = \int_{membrane}\frac{1}{2}\kappa(c_1 + c_2 - c_0)^2\,\mathrm{d}A \qquad (3.27)$$

Now we can combine the stretching and bending contributions together to obtain the total elastic energy of a membrane deformation:

$$E_{tot} = \int_{membrane}\left[\sigma + \frac{1}{2}\kappa(c_1 + c_2 - c_0)^2 + \kappa_g c_1c_2\right]\mathrm{d}A \qquad (3.28)$$

If there are no other factors in the system, the total elastic energy will be at its minimum in equilibrium (the elastic energy represents the free energy of the system). By varying Equation 3.28 we can obtain a general shape equation; however, the resulting differential equation and its solutions go beyond the scope of this chapter. Nevertheless, we should mention that this theory is very fruitful and various interesting phenomena have been explained with it. To name just a few: the shapes of red blood and other cells, budding of the membrane, the contact angle and shape of vesicles in contact, and the changes in the fluctuations and phase transitions of membranes in contact.
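As a quick application of Eq. 3.26, the bending energy of a closed spherical vesicle (c₁ = c₂ = 1/R, c₀ = 0) integrates to 8πκ + 4πκ_g, independent of the vesicle radius. A minimal sketch using moduli values quoted in this chapter:

```python
import math

# Bending a sphere: c1 = c2 = 1/R and dA integrates to 4*pi*R^2, so
# E = 0.5*kappa*(2/R)^2 * 4*pi*R^2 + kappa_g*(1/R^2) * 4*pi*R^2
#   = 8*pi*kappa + 4*pi*kappa_g   (the radius cancels out)
kT = 4.11e-21          # J
kappa = 20 * kT        # typical membrane bending modulus (from the text)
kappa_g = -15 * kT     # Gaussian modulus estimate (quoted in the exercise below)

E_bend = 8 * math.pi * kappa + 4 * math.pi * kappa_g
print(f"E_bend = {E_bend / kT:.0f} kT")  # ~314 kT
```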
One analogy to the membrane and its shapes is appealing: that of soap bubbles, which are also made of a thin layer with two water/hydrophobic interfaces at the boundary. However, it can be misleading, as there are significant differences. For instance, if one makes a hole in a soap bubble, it collapses, while a membrane heals itself by closing the hole.

When two membranes come close to each other, their thermal fluctuations start to influence each other. The suppression of these fluctuation modes gives rise to repulsive forces. There are two types of fluctuations: undulation and protrusion. Undulation is a fluctuation of the membrane plane and can be well understood in elastic terms. The corresponding undulation interaction of two membranes at distance d is equal to 3π²(kT)²/(128κd²). Since the bending modulus κ is highly dependent on the membrane composition and the surrounding solution, the undulation interaction is highly system dependent. Importantly, the suppression of the thermal fluctuations can lead to a membrane phase transition, and so a fluid membrane can become gel-like when it is adsorbed at or pressed against a solid interface. Protrusion is a movement of a single phospholipid out of the membrane plane. It is a local distortion, which corresponds to the membrane roughness. In addition, there is water depolarization and membrane dehydration as two membranes get very close to each other. All of these short-range effects add together to a repulsive interaction on the order of kT, which decreases exponentially with a decay length of 1-2 nm; it was shown in 2012 that its origin is both enthalpic and entropic.

Exercise: We have a cell, and a virus is trying to get inside it via endocytosis (Fig. 3.4). Endocytosis is a process during which a particle gets internalized into the cell. There are different types of endocytosis, involving different proteins and amounts of ATP. We will be interested in endocytosis without ATP, i.e. a thermodynamically spontaneous process. For simplicity we will assume that the virus particle is a sphere with radius R and that its surface is homogeneously covered by cell-attractive proteins (ligands). These attractive proteins interact with membrane receptors (membrane-embedded proteins) that are homogeneously distributed in the membrane. Due to the assumption of homogeneity, we can define an effective attraction (per unit area) w between the virus and the membrane. The total attractive interaction E_atr can then be written as:

$$E_{atr} = wA, \qquad (3.29)$$

where A is the contact area between the virus shell and the membrane. At the beginning the virus particle is far from the membrane. Upon interaction between the membrane and the virus, the elastic membrane can bend and wrap around the virus particle. This happens only if the attraction density is large enough. Calculate the size at which the wrapping becomes a spontaneous process, and apply the formula to the Semliki Forest virus. Compare the threshold radius with the size of the virus, whose radius is about R = 30 nm. Note that the virus contains about 80 attractive sites, each binding with a few kT, leading to w = 0.035 kT nm⁻². For the membrane deformation, use a typical bending modulus κ of 20 kT; its κ_g was recently estimated at −15 kT.

Figure 3.4: Schematic visualization of endocytosis.
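A sketch of one possible route through this exercise: for a fully wrapped sphere, the bending cost per unit area is ½κ(2/R)² + κ_g/R² (Eq. 3.25 with c₀ = 0), and wrapping becomes spontaneous when the adhesion w outweighs it. Treating the Gaussian term this way assumes the topology change of full engulfment, so take this as an estimate; the numbers are those given in the exercise:

```python
import math

# Spontaneous wrapping condition: w >= (2*kappa + kappa_g) / R^2
# => threshold radius R = sqrt((2*kappa + kappa_g) / w)
kappa = 20.0     # bending modulus, kT
kappa_g = -15.0  # Gaussian bending modulus, kT
w = 0.035        # adhesion energy density, kT/nm^2

R_threshold = math.sqrt((2 * kappa + kappa_g) / w)
print(f"threshold radius: {R_threshold:.1f} nm")  # ~26.7 nm
print(f"Semliki Forest virus (R = 30 nm) can be wrapped: {30 > R_threshold}")
```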
Sources and further readings:

4. Interaction of uncharged membrane and proteins
(Dispersion forces, Hamaker constant, depletion forces, Derjaguin approximation)

When we go beyond the ideal systems, we have to consider the interactions between molecules. In the following chapters we will discuss these interactions, with a focus on their strength and distance dependence. This is crucial for coarse-graining, since we need to know which interactions are the dominant ones and what parameterization (functional profile) should be used for them. We start with the interaction of uncharged molecules. Do they interact at all, if we neglect gravitational forces? The answer is yes, they do, and a common example is the noble gases, which can become liquid at low temperatures and/or high pressures. The first explanation was given by Fritz London, hence the forces are sometimes called London forces. The physical origin of the interaction is the polarization of the electron cloud around an atom when another atom gets close. This is why the term induced dipole-induced dipole forces is sometimes used. However, the most frequent name is dispersion forces, which originates from the similarity to the quantum mechanical theory of light dispersion. The derivation is based on quantum mechanics (a perturbation method using the multipole expansion of the 1/r interaction), leading to:

$$E_{London} = -\frac{3}{2(4\pi\epsilon_0)^2}\frac{I_iI_j}{I_i + I_j}\frac{\alpha_i\alpha_j}{r^6}, \qquad (4.2)$$

where I_i is the first ionization potential of the i-th atom, α is the polarizability, and r is the distance between the atoms. Note that the total dispersion force between two molecules (particles) depends on the polarizability of the whole molecule, which is usually orientation dependent. If one of the uncharged molecules has a permanent dipole, then there is also a dipole-induced dipole interaction, which is sometimes called the Debye interaction. Similarly to the London interaction, it depends on 1/r⁶:

$$E_{Debye} = -\frac{1}{(4\pi\epsilon_0)^2}\frac{p_i^2\alpha_j}{r^6}, \qquad (4.3)$$

where p is the permanent dipole of the molecule. When both molecules have permanent dipoles, there is also a dipole-dipole interaction. This commonly orients the molecules and is sometimes called the Keesom interaction after its discoverer. The orientation-averaged interaction free energy is:

$$E_{Keesom} = -\frac{1}{(4\pi\epsilon_0)^2}\frac{1}{3kT}\frac{p_i^2p_j^2}{r^6} \qquad (4.4)$$

All of the above interactions add together to form the van der Waals interaction, which is the first-order approximation of the interaction of uncharged molecules. Naturally, there are higher-order terms of the multipole expansion, such as quadrupole-quadrupole etc., but they are typically weaker and more short-ranged.

$$E_{VDW} = E_{Keesom} + E_{Debye} + E_{London} \qquad (4.5)$$

The van der Waals interactions of atoms are equal to the London dispersion forces, since atoms do not have permanent dipoles. Common values of the van der Waals interaction are on the order of a few kT. For comparison, the strength of a hydrogen bond is about 10 kT, and covalent bonds are another order of magnitude stronger (on the order of 100 kT). However, numerous van der Waals interactions together allow the gecko lizard to hold onto walls or ceilings.
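Eq. 4.2 can be checked against a concrete pair of atoms. A minimal sketch for two argon atoms; the ionization energy and polarizability volume below are standard tabulated values, not numbers from this text:

```python
# London dispersion energy between two identical atoms, Eq. (4.2).
# With the polarizability volume a' = alpha/(4*pi*eps0), the formula
# reduces to E = -(3/4) * I * a'^2 / r^6.
I_Ar = 15.76 * 1.602e-19   # first ionization energy of argon, J
a_Ar = 1.64e-30            # polarizability volume of argon, m^3
kT = 4.11e-21              # J at 25 C

def E_london(r):
    """Dispersion energy (J) of two identical atoms at separation r (m)."""
    return -0.75 * I_Ar * a_Ar**2 / r**6

r = 3.8e-10  # roughly the Ar-Ar contact distance, m
print(f"E_London = {E_london(r) / kT:.2f} kT")  # ~ -0.4 kT, close to the Ar-Ar well depth
```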
A more general description of van der Waals interactions, in which all the above contributions are naturally included, is given by Lifshitz theory. It is a quantum electrodynamics derivation of the Casimir effect for interactions in different dielectrics. According to this theory, the polarizabilities are frequency dependent, and the above results correspond to the zero-frequency approximation. The advantage of this approach is that we can also calculate dispersion interactions in a continuum medium, i.e. a solvent. The particle interaction then depends on the excess polarizabilities of the molecules compared to the solvent:

$$E_{VDW} \sim -\frac{(\alpha_i - \alpha_s)(\alpha_j - \alpha_s)}{r^6}, \qquad (4.6)$$

where the subscript s stands for the solvent. We can see that the van der Waals interaction is always attractive (negative) for similar molecules. However, the interaction can become repulsive if one molecule has a larger polarizability than the solvent and the other molecule a lower polarizability than the solvent. If we do not know the polarizabilities of the species of interest, we can take advantage of the Clausius-Mossotti expression for the refractive index n:

$$\frac{4}{3}\pi\rho\alpha = \frac{n^2 - 1}{n^2 + 2}, \qquad (4.7)$$

where ρ is the number density. Thus, we can estimate which particles/materials will be miscible and how much they will tend to aggregate. The refractive index of water is 1.333. Most proteins, as well as the inner part of membranes, have a refractive index of about 1.5, which is similar to hydrocarbons. This is not surprising, since there are hydrocarbons in the hydrophobic core of these objects. In fact, hydrophobic attraction is the dominant interaction in self-assembly, but we will discuss this later.

Proteins and membranes are not small atoms, hence if we want to calculate their van der Waals interactions, we have to integrate over their volumes. For simplicity we will assume here that the interaction is additive. We start with a small molecule next to a flat surface at distance d. The surface is made of molecules with number density ρ, and each molecule interacts with energy −C/r⁶. The easiest integration is over small rings axially aligned with the surface normal going through the molecule, as depicted in Figure 4.5. All points on a ring of radius x have the same distance from the molecule, r = √(x² + z²), where z is the distance from the middle of the ring to the molecule. The number of surface molecules in such a ring is 2πxρ dx dz.

$$E_{\cdot|} = -\int_{z=d}^{\infty}\int_{0}^{\infty}\frac{2\pi x\rho C}{(x^2 + z^2)^3}\,\mathrm{d}x\,\mathrm{d}z = -2\pi\rho C\int_{d}^{\infty}\frac{1}{4z^4}\,\mathrm{d}z = -\frac{\pi\rho C}{6d^3} \qquad (4.8)$$

Importantly, this means that the van der Waals interaction between the molecule and the surface is fairly long-ranged, ∼ 1/d³. If the surface is replaced with a wall of finite thickness t, the integration leads to:

$$E_{\cdot\Box} = -\frac{1}{6}\pi\rho C\left[\frac{1}{d^3} - \frac{1}{(d + t)^3}\right], \qquad (4.9)$$

which for small thicknesses t → 0 decreases as ∼ 1/d⁴. For more interactions with a wall of finite thickness that represents a phospholipid membrane, see the Appendix.

Figure 4.5: Schematic picture of the molecule next to the wall for the calculation of the dispersion interaction, integrating over the axial rings within the wall.

Having the expression for the interaction of a single molecule with the surface, we can calculate the interaction between larger objects and the surface by simply integrating over all the molecules of the object. An example is a large sphere of radius R interacting with the surface, as depicted in Figure 4.6. All sphere molecules at distance z from the surface form a circular slice of volume πx² dz, where x is the radius of the given slice. From geometry we can calculate the slice radius as x² = R² − (R − (z − d))² = (z − d)(2R − z + d), where d is the distance between the surfaces.

Figure 4.6: Schematic picture of a sphere next to the wall for the calculation of the dispersion interaction, using integration over the axial disks within the sphere.

Consequently, we integrate:

$$E_{\bullet|} = -\int_d^{d+2R}\rho_{sph}\pi(z - d)(2R - z + d)\,\frac{\pi\rho_{surf}C}{6z^3}\,\mathrm{d}z = -\frac{1}{6}\pi^2\rho_{sph}\rho_{surf}C\int_d^{d+2R}\frac{(z - d)(2R - z + d)}{z^3}\,\mathrm{d}z \qquad (4.10)$$

For close distances, R ≫ z − d, this can be simplified to:

$$E_{\bullet|} = -\frac{1}{6}\pi^2\rho_{sph}\rho_{surf}C\int_d^{\infty}\frac{(z - d)2R}{z^3}\,\mathrm{d}z \qquad (4.11)$$
$$= -\frac{1}{6}\pi^2\rho_{sph}\rho_{surf}CR\int_d^{\infty}\left(\frac{2}{z^2} - \frac{2d}{z^3}\right)\mathrm{d}z \qquad (4.12)$$
$$= -\frac{1}{6}\pi^2\rho_{sph}\rho_{surf}CR\left[-\frac{2}{z} + \frac{d}{z^2}\right]_d^{\infty} \qquad (4.13)$$
$$= -\frac{\pi^2\rho_{sph}\rho_{surf}CR}{6d} \qquad (4.14)$$

Note that the interaction decreases with 1/d. We could also solve Eq. 4.10 directly and obtain the full solution for all distances:

$$E_{\bullet|} = -\frac{1}{6}\pi^2\rho_{sph}\rho_{surf}C\left[\frac{d^2 + 2Rd - 4dz - 4Rz}{2z^2} - \ln z\right]_d^{d+2R} = -\frac{1}{6}\pi^2\rho_{sph}\rho_{surf}C\left[\frac{2R(d + R)}{d(d + 2R)} - \ln\left(1 + \frac{2R}{d}\right)\right]$$

In the limiting cases we obtain a 1/d dependence for small d and 1/d³ for large distances, which confirms the above results. Similarly, we can calculate the interaction between two surfaces, where we integrate over squares of unit area. The energy per unit area is then:

$$E_{||} = -\frac{1}{6}\pi\rho_{surf1}\rho_{surf2}C\int_d^{\infty}\frac{1}{z^3}\,\mathrm{d}z \qquad (4.15)$$
$$= -\frac{\pi\rho_{surf1}\rho_{surf2}C}{12d^2} \qquad (4.16)$$
We can see that it is convenient to define a new constant, which contains all the material properties such as the densities ρ and polarizabilities (C ∼ α₁α₂). This constant is called the Hamaker constant, A:

$$A \equiv \pi^2\rho_1\rho_2 C \qquad (4.17)$$

The advantage of the Hamaker constant is that it is similar for all materials, being in the range of 1-100 kT = 0.4-40.0 × 10⁻²⁰ J (water: 4 kT, hydrocarbon: 1 kT). The reason for this is that the polarizabilities of molecules scale with their volume, while their number densities scale with the inverse of the volume. Hence, A ∼ ρ₁ρ₂α₁α₂ ∼ (1/v₁)(1/v₂)v₁v₂ = const. Using this constant, we can express the interactions for species of different shapes as follows:

Two atoms with distance r between their centers:
$$E_{\cdot\cdot} = -\frac{C}{r^6} \qquad (4.18)$$

Atom and surface separated by distance d:
$$E_{\cdot|} = -\frac{\pi\rho C}{6d^3} \qquad (4.19)$$

Atom and membrane (thickness t):
$$E_{\cdot\Box} = -\frac{1}{6}\pi\rho C\left[\frac{1}{d^3} - \frac{1}{(d + t)^3}\right] \qquad (4.20)$$

Atom and sphere:
$$E_{\cdot\bullet} = -\frac{4}{3}\pi\rho C\frac{R^3}{d^3(d + 2R)^3} \qquad (4.21)$$

Sphere and surface:
$$E_{\bullet|} = -\frac{A}{6}\left[\frac{2R(d + R)}{d(d + 2R)} - \ln\left(1 + \frac{2R}{d}\right)\right] \qquad (4.22)$$

Sphere and surface, d ≪ R:
$$E_{\bullet|} = -\frac{AR}{6d} \qquad (4.23)$$

Two spheres:
$$E_{\bullet\bullet} = -\frac{A}{6}\left[\frac{2R_1R_2}{d(d + 2R_1 + 2R_2)} + \frac{2R_1R_2}{(d + 2R_1)(d + 2R_2)} + \ln\frac{d(d + 2R_1 + 2R_2)}{(d + 2R_1)(d + 2R_2)}\right] \qquad (4.24)$$

Two spheres, d ≪ R₁, R₂:
$$E_{\bullet\bullet} = -\frac{R_1R_2}{R_1 + R_2}\frac{A}{6d} \qquad (4.25)$$

Sphere and membrane, d ≪ R:
$$E_{\bullet\Box} = \frac{1}{6}AR\left[\frac{1}{d + t} - \frac{1}{d}\right] \qquad (4.26)$$

Two surfaces (per unit area):
$$E_{||} = -\frac{A}{12\pi d^2} \qquad (4.27)$$

Two membranes (per unit area):
$$E_{\Box\Box} = \frac{A}{12\pi}\left[\frac{2}{(d + t)^2} - \frac{1}{(d + 2t)^2} - \frac{1}{d^2}\right] \qquad (4.28)$$

Cylinder of unit length parallel to the surface:
$$E_{\circ|} = -\frac{A\sqrt{R}}{12\sqrt{2}\,d^{3/2}} \qquad (4.29)$$

Two parallel cylinders of unit length:
$$E_{\circ\circ} = -\sqrt{\frac{R_1R_2}{R_1 + R_2}}\frac{A}{12\sqrt{2}\,d^{3/2}} \qquad (4.30)$$

Two cylinders perpendicular to each other:
$$E_{\times} = -\sqrt{R_1R_2}\frac{A}{6d} \qquad (4.31)$$

Now we can estimate the dispersion interactions between proteins and membranes. However, we should keep in mind that we used two important assumptions. The first was the additivity of pair interactions, and the second was the absence of retardation of the dispersion. In reality the interactions are not additive (e.g. in aromatic rings), and electromagnetic forces do not propagate instantaneously, which can change the distance dependence of the interaction from r⁻⁶ to roughly r⁻⁷. This retardation becomes important when the separation distance d is large, on the order of c/f, where c is the speed of light and f is the frequency of the atomic dipole fluctuations. In practice, it plays a role for separations larger than 5 nm.
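These formulas are straightforward to evaluate numerically. A minimal sketch comparing the full sphere-surface expression (4.22) with its close-distance limit (4.23); the Hamaker constant and radius are assumed illustrative values:

```python
import math

def sphere_wall_full(A, R, d):
    """Sphere-wall dispersion energy, Eq. (4.22), valid at all separations."""
    return -(A / 6) * (2 * R * (d + R) / (d * (d + 2 * R)) - math.log(1 + 2 * R / d))

def sphere_wall_close(A, R, d):
    """Close-distance limit d << R, Eq. (4.23)."""
    return -A * R / (6 * d)

A = 1e-20   # Hamaker constant, J (typical order of magnitude)
R = 20e-9   # sphere radius, m
for d in (0.5e-9, 2e-9, 10e-9):
    print(f"d = {d * 1e9:4.1f} nm: full = {sphere_wall_full(A, R, d):.2e} J, "
          f"limit = {sphere_wall_close(A, R, d):.2e} J")
```

As expected, the simple 1/d limit is accurate for d ≪ R and overestimates the attraction at larger separations.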
The interaction between curved surfaces can be calculated from the interaction of flat surfaces using the Derjaguin approximation. This is a very useful approximation, which is valid for short-ranged interactions without any assumption about the shape (profile) of the interaction. It states that the interaction of curved objects can be obtained by integrating over the two facing surfaces, if their separation is smaller than their curvature radii:

$$E_{tot}(d) = \int_d^{\infty}w(z)\,\mathrm{d}A, \qquad (4.32)$$

where d is the separation of the objects, A is the cross-section area, and w(z) is the interaction per unit area between two flat surfaces separated by distance z. For rotationally symmetric objects, we can use cylindrical coordinates to get:

$$E_{tot}(d) = 2\pi\int_d^{\infty}w(z(r))\,r\,\mathrm{d}r \qquad (4.33)$$

We can demonstrate this with the simple case of two spheres of radii R₁ and R₂ separated by distance d. We need to integrate the interaction over the two facing surfaces, i.e. over 2πx dx, where x is the radius of a ring in the cross-section of the spheres. The total interaction is:

$$E_{tot} = \int_d^{\infty}2\pi x\,w(z)\,\mathrm{d}x, \qquad (4.34)$$

where w(z) is the interaction density between the two flat surfaces separated by distance z. In order to be able to integrate, we have to relate the distance between the two cross-sections, z = d + z₁ + z₂, and the radius of the rings, x.

Figure 4.7: Schematic picture of two spheres for the Derjaguin approximation.

Using geometry and the fact that R₁ ≫ z₁, we simplify the relation (R₁ − z₁)² + x² = R₁² to z₁ = x²/2R₁. Similarly, we obtain the expression for z₂. Combining these two, we can express the distance between the rings as z = d + x²/2R₁ + x²/2R₂. Thus dz = (1/R₁ + 1/R₂)x dx, and the total interaction energy can be expressed as:

$$E_{tot} = \int_d^{\infty}2\pi\frac{R_1R_2}{R_1 + R_2}w(z)\,\mathrm{d}z \qquad (4.35)$$

From here we can directly see that the force between the two curved objects will be:

$$F_{tot} = -\frac{\partial E_{tot}}{\partial d} = 2\pi\frac{R_1R_2}{R_1 + R_2}w(d) \qquad (4.36)$$

This relation has proved to be fruitful for comparing experiments and theory, because it is much easier to do such a calculation for two flat surfaces, while in experiments it is easier to measure the force between two spheres by atomic force microscopy or optical tweezers. To test the relation, we can use the above expressions for dispersion interactions. The interaction between two spheres E•• evaluated from the interaction between two surfaces E|| is:

$$E_{\bullet\bullet} = \int_d^{\infty}2\pi\frac{R_1R_2}{R_1 + R_2}E_{||}\,\mathrm{d}z \qquad (4.37)$$
$$= 2\pi\frac{R_1R_2}{R_1 + R_2}\int_d^{\infty}-\frac{A}{12\pi z^2}\,\mathrm{d}z \qquad (4.38)$$
$$= -\frac{R_1R_2}{R_1 + R_2}\frac{A}{6d} \qquad (4.39)$$

If our objects of interest are in solution, which proteins and membranes usually are, there is an additional interaction called the depletion interaction. The origin of the interaction is in the osmotic pressure of the solvent in the bulk and the depleted volume between close objects (see Figure 4.8). The depletion interaction thus has an entropic character. We demonstrate this interaction on two flat surfaces of unit area with zero direct interaction. The surfaces are immersed in a solvent made of small non-interacting spheres of radius r. Clearly, at separations smaller than 2r there is no solvent between the surfaces, and so the force is constant in this region. The force can be calculated from the osmotic pressure, i.e. the pressure of an ideal gas, p_id = (n/V)RT = ρkT, times the unit area:

$$E_{depletion} = -\int_{2r}^{d}(-\rho kT)\,\mathrm{d}z = \rho kT(d - 2r) \qquad (4.40)$$

At separations larger than 2r the force is zero, so the energy of separating the surfaces remains constant. This idea was proposed by Asakura and Oosawa in 1954 to explain the unexpected attractive forces between colloid particles in polymer solutions. In the Appendix, you can find a more advanced model for a solution of ideal chains.

Figure 4.8: Illustrative figure of the depletion interaction.

Exercise: Calculate the van der Waals interaction energy between two lysozyme proteins, and between a lysozyme protein and a membrane, as a function of distance, and plot the dependence. Assume that lysozyme has a spherical shape with a radius of about 20 nm and that the membrane core has a thickness of 4 nm. The Hamaker constant for lysozyme interacting with itself in water is 2.87 × 10⁻²⁰ J, while for a typical hydrocarbon oil it is 0.54 × 10⁻²⁰ J.
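A sketch of how the exercise might be set up, using the close-distance limits (4.25) and (4.26). The lysozyme-membrane Hamaker constant is estimated with the geometric-mean combining rule A₁₂ ≈ √(A₁₁A₂₂), which is an assumption, not a value given in the text:

```python
import math

kT = 4.11e-21     # J
A_ll = 2.87e-20   # lysozyme-lysozyme across water, J (from the exercise)
A_hh = 0.54e-20   # hydrocarbon-hydrocarbon across water, J (from the exercise)
A_lm = math.sqrt(A_ll * A_hh)  # combining-rule estimate for lysozyme-membrane

R, t = 20.0, 4.0  # protein radius and membrane core thickness, nm

def E_protein_protein(d):
    """Two identical spheres, Eq. (4.25): E = -(R/2) * A / (6 d)."""
    return -(R * R / (R + R)) * A_ll / (6 * d)

def E_protein_membrane(d):
    """Sphere and membrane, Eq. (4.26): E = (A R / 6) (1/(d + t) - 1/d)."""
    return (A_lm * R / 6) * (1 / (d + t) - 1 / d)

for d in (0.5, 1.0, 2.0, 5.0):  # separations in nm
    print(f"d = {d:3.1f} nm: protein-protein = {E_protein_protein(d) / kT:6.1f} kT, "
          f"protein-membrane = {E_protein_membrane(d) / kT:6.1f} kT")
```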
Sources and further readings:
1. V. Adrian Parsegian; Van der Waals Forces: A Handbook for Biologists, Chemists, Engineers, and Physicists; Cambridge University Press, 2005
2. Jacob N. Israelachvili; Intermolecular and Surface Forces (3rd Edition); Academic Press, 2011
3. Ken A. Dill and Sarina Bromberg; Molecular Driving Forces: Statistical Thermodynamics in Biology, Chemistry, Physics, and Nanoscience (2nd Edition); Garland Science, 2010

5. Interaction of charged membrane and proteins
(Poisson-Boltzmann, Debye length, strong coupling limit, Hofmeister series)

In addition to the interactions of uncharged objects described above, there can be a Coulomb interaction in the system of interest. The Coulomb interaction between charged objects is a typical pairwise additive interaction, which is fairly long-ranged (∼ 1/r) and usually stronger than dispersion. Its formula is simply Coulomb's law:

$$F_{Coulomb}(r) = \frac{1}{4\pi\epsilon}\frac{q_1q_2}{r^2} = q_1E_2 \qquad (5.2)$$
$$V_{Coulomb}(r) = \frac{1}{4\pi\epsilon}\frac{q_1q_2}{r} = q_1\psi_2 \qquad (5.3)$$

where F is the force, E is the electric field, V is the potential energy, and ψ is the electrostatic potential. ε is the dielectric permittivity of the given medium, and q₁, q₂ are the charges of the interacting objects denoted in the subscripts. Charges in solutions usually originate in the dissociation of salts, acidic, or basic groups, in ionization, or in the adsorption of already dissociated molecules onto the object. This means that there is usually a nearby counter-charge to every charge present in soft matter systems.

To investigate the interactions, we start with a homogeneously charged object and mobile counter-ions in the surrounding solution. The charged object creates an electrostatic potential ψ, which influences the distribution of the counter-ions. Their relationship is given by the Poisson equation:

$$\Delta\psi = -\frac{q\rho}{\epsilon}, \qquad (5.4)$$

where q is the charge of a single counter-ion, ρ is the number density of counter-ions, and Δ is the Laplace operator. We are interested in the equilibrium situation, where the chemical potential (free energy per particle) μ is constant over the whole system:

$$\mu = q\psi + kT\ln\rho = const. \qquad (5.5)$$

This equation provides us with the Boltzmann distribution of counter-ions, which can be written as:

$$\rho = \rho_0\exp\left(-\frac{q\psi}{kT}\right) \qquad (5.6)$$

where ρ₀ is a constant representing the density at ψ = 0, in our case at infinite distance. By combining Eq. 5.4 and Eq. 5.6 we obtain the Poisson-Boltzmann equation:

$$\Delta\psi = -\frac{q\rho_0}{\epsilon}\exp\left(-\frac{q\psi}{kT}\right) \qquad (5.7)$$

Proteins and membranes are usually surrounded not only by counter-ions but also by additional salts. The total number density is then a sum of the individual densities:

$$\rho = \sum_i\rho_{0i}\exp\left(-\frac{q_i\psi}{kT}\right) \qquad (5.8)$$

where q_i is the charge of the individual species, which is usually expressed as a multiple of the elementary charge e and the valency z_i: q_i ≡ ez_i. Therefore, the general Poisson-Boltzmann equation can be rewritten as:

$$\Delta\psi = -\frac{1}{\epsilon}\sum_i ez_i\rho_{0i}\exp\left(-\frac{ez_i\psi}{kT}\right) \qquad (5.9)$$

This is a nonlinear second-order differential equation, and apart from a few simple examples it does not have an analytical solution. It is usually solved numerically and/or by approximations. The exact solution is determined by the boundary conditions of the system, which are typically the charge neutrality of the system and/or a known electrostatic potential at infinity.

If we know the electrostatic potential at a certain point (from the presence of an electrode or from a solution of the Poisson-Boltzmann equation), we can calculate the density distribution of ions at that point. The derivation starts by differentiating the Boltzmann distribution:

$$\frac{\mathrm{d}\rho}{\mathrm{d}x} = -\frac{q\rho_0}{kT}\exp\left(-\frac{q\psi}{kT}\right)\frac{\mathrm{d}\psi}{\mathrm{d}x} \qquad (5.10)$$

and by substituting the Poisson equation into this result:

$$\frac{\mathrm{d}\rho}{\mathrm{d}x} = \frac{\epsilon}{kT}\frac{\mathrm{d}^2\psi}{\mathrm{d}x^2}\frac{\mathrm{d}\psi}{\mathrm{d}x} \qquad (5.11)$$
$$= \frac{\epsilon}{2kT}\frac{\mathrm{d}}{\mathrm{d}x}\left(\frac{\mathrm{d}\psi}{\mathrm{d}x}\right)^2 \qquad (5.12)$$

By integration from infinity we obtain:

$$\rho = \rho_0 + \frac{\epsilon}{2kT}\left(\frac{\mathrm{d}\psi}{\mathrm{d}x}\right)^2 \qquad (5.13)$$

A more general expression, where more ion species in solution are involved, is:

$$\sum_i\rho_i = \sum_i\rho_{0i} + \frac{\epsilon}{2kT}\left(\frac{\mathrm{d}\psi}{\mathrm{d}x}\right)^2 = \sum_i\rho_{0i} + \frac{\epsilon E^2}{2kT} \qquad (5.14)$$

Note that here we work with the ion density instead of individual ions; the Poisson-Boltzmann theory is a mean-field theory using averaged density values.
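Even without solving the full equation, Eq. 5.6 allows quick estimates of how strongly ions are enriched or depleted at a point of known potential. A minimal sketch for an assumed, illustrative potential of −25 mV:

```python
import math

# Ion densities from the Boltzmann distribution, Eq. (5.6):
# rho = rho0 * exp(-z e psi / kT), here for psi = -25 mV (assumed value).
e, kT = 1.602e-19, 4.11e-21
psi = -25e-3  # V

for name, z in (("monovalent cation", +1),
                ("monovalent anion", -1),
                ("divalent cation", +2)):
    ratio = math.exp(-z * e * psi / kT)
    print(f"{name:18}: rho/rho0 = {ratio:.2f}")
```

A potential of only one kT/e ≈ 25 mV thus already changes the local ion densities severalfold, which is why the linearization discussed next is limited to small potentials.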
The most common approximation to the Poisson-Boltzmann equation is its 'linearization', in which the exponential function is approximated by a first-order expansion, e^x ≈ 1 + x, leading to the Debye-Hückel approximation:

\nabla^2\psi = \sum_i \frac{e^2 z_i^2 \rho_{0i}}{\varepsilon kT}\, \psi    (5.15)

We also used the electroneutrality condition \sum_i e z_i \rho_{0i} = 0, so the zeroth-order term of the expansion vanishes. Eq. 5.15 is a linear differential equation, which is much easier to solve; however, it is only valid for e z_i ψ << kT. In practice, this usually holds below roughly 100 mM salt concentration. The solution to this equation in 1D is an exponential function:

\psi(x) = \psi(0) \exp(-x/\lambda)    (5.16)

where \lambda \equiv \sqrt{\varepsilon kT / \sum_i e^2 z_i^2 \rho_{0i}} is a characteristic distance called the Debye screening length; it represents the distance over which the compensating charge is distributed around the object. The Debye length can be written using the ionic strength I as \lambda \equiv \sqrt{\varepsilon kT / 2e^2 N_a I}, where I = \frac{1}{2}\sum_i c_i z_i^2 and N_a is Avogadro's number. Naturally, it decreases with increasing ion concentration. For a solution of a monovalent equimolar (1:1) salt, e.g. NaCl, the Debye length is 0.3 nm for a 1 M solution, 1.0 nm for a 0.1 M solution, and 10 nm for a 1 mM solution (roughly 0.304/\sqrt{c} nm). For divalent salts such as MgSO4 it is half of the above values. As a result, the charge of membranes and proteins is well screened beyond about 1 nm at physiological conditions (~150 mM salt concentration). For convenience we define the Bjerrum length λ_b, the distance at which two elementary charges interact with the same strength as the thermal energy:

\lambda_b = \frac{e^2}{4\pi\varepsilon kT}

(λ_b = 56 nm in vacuum and λ_b = 0.71 nm in pure water.)

Figure 5.9: Schematic profiles of ion density distributions next to a charged wall.

The above solution with exponential decay is very important and frequently used. It is also the main idea of the Gouy-Chapman theory, in which large objects in solution are surrounded by an electric double layer. The first layer is thin and is formed by dissociated or adsorbed ionic groups. The second layer is diffuse and is formed by counter-ions, whose distribution decays exponentially with the distance from the object surface. At large distances, i.e. distances larger than the Debye screening length, the objects can be viewed as surrounded by a double layer.

Using the electric double layer assumption together with the Derjaguin approximation and the weak overlap assumption, we can derive the following expressions for the interaction energies between species at distance d.

Two small molecules of radius R << d:

E_{\cdot\cdot} = \frac{Z_1 Z_2 e^2}{4\pi\varepsilon} \frac{\exp(-\kappa d + 2\kappa R)}{d(1 + 2\kappa R)}    (5.17)

Two spheres (R >> d):

E_{\bullet\bullet} = \frac{R_1 R_2}{R_1 + R_2} Z \exp(-\kappa d)    (5.18)

Sphere and surface:

E_{\bullet|} = R Z \exp(-\kappa d)    (5.19)

Two surfaces (per unit area):

E_{||} = \frac{\kappa}{2\pi} Z \exp(-\kappa d)    (5.20)

where κ ≡ 1/λ is the inverse Debye length and Z is an interaction constant with a role similar to that of the Hamaker constant A for van der Waals interactions. Z is defined by the surface potential ψ_0 of an isolated surface and is related to the surface charge density:

Z = 64\pi\varepsilon \left(\frac{kT}{e}\right)^2 \tanh^2\left(\frac{ze\psi_0}{4kT}\right)    (5.21)

The hyperbolic tangent is often written separately as γ = tanh(zeψ_0/4kT), which at ambient conditions, with ψ_0 in mV, is γ = tanh(zψ_0/103).
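The Debye lengths quoted above are easy to verify numerically. A minimal sketch, using the 25 °C constants from the table at the beginning of the text:

```python
import numpy as np

e = 1.602e-19              # elementary charge, C
kT = 4.114e-21             # J at 25 C
eps = 78.4 * 8.854e-12     # permittivity of water, C^2 J^-1 m^-1
Na = 6.022e23              # mol^-1

def debye_length(ions):
    """ions: list of (concentration in mol/L, valency) pairs."""
    I = 0.5 * sum(c * z**2 for c, z in ions)   # ionic strength, mol/L
    rho = 2.0 * Na * I * 1e3                   # sum of rho0i * zi^2, m^-3
    return np.sqrt(eps * kT / (e**2 * rho))

for c in (1.0, 0.1, 0.001):                    # NaCl solutions
    print(c, "M ->", debye_length([(c, 1), (c, -1)]) * 1e9, "nm")
# prints roughly 0.30, 0.96, and 9.6 nm, matching 0.304/sqrt(c)
```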
For small surface potentials (below 25 mV), these expressions can be simplified using the surface charge density σ = εκψ_0 as:

Two spheres:

E_{\bullet\bullet} = \frac{2\pi\sigma^2}{\varepsilon\kappa^2} \frac{R_1 R_2}{R_1 + R_2} \exp(-\kappa d)    (5.22)

Sphere and surface:

E_{\bullet|} = \frac{2\pi\sigma^2 R}{\varepsilon\kappa^2} \exp(-\kappa d)    (5.23)

Two surfaces (per unit area):

E_{||} = \frac{2\sigma^2}{\varepsilon\kappa} \exp(-\kappa d)    (5.24)

While the above expressions are very helpful for estimating the interaction strength between membranes and proteins, we have to keep in mind the approximations we made. For example, at close distances ion-correlation effects start to be important. The diffuse counter-ion layer is a highly polarizable layer, and its fluctuations give rise to attractive forces between the particles (van der Waals forces have a similar origin). Indeed, it has been shown that ion correlations can lead to an attraction between walls or spheres of the same charge. Moreover, once the separation between the particles approaches the size of the ions, the Poisson-Boltzmann description becomes inaccurate, as the ions form a two-dimensional layer. This regime is called the strong coupling regime and was investigated only recently, with significant help from computer simulations. Simulations were not only able to test the strong coupling theory, but also to investigate the gap between the strong coupling and Poisson-Boltzmann descriptions.

In addition, ions are characterized only by their charges in the Poisson-Boltzmann equation, so all finite-size effects of the ions are missed. Unfortunately, there are plenty of examples where the ion size is crucial. Consider, for instance, the vital biological difference between sodium and potassium, two very common alkali metals of very similar size. While sodium can pass through ion pumps in the membrane, potassium cannot. This leads to an ionic imbalance, with potassium inside the cells and sodium outside, which is used for the transport of larger molecules and the correct functioning of proteins (there is an approximately 150 mM concentration of sodium outside and 5 mM inside the cells, while potassium has a 5 mM concentration outside and 150 mM inside the cell). The Poisson-Boltzmann theory likewise fails to explain the Hofmeister series (also sometimes called lyotropic series). These are series of salt cations and anions, originally ordered by the salts' effect on the salting out of egg-white proteins. Since their discovery in 1888 by Franz Hofmeister, they have been found useful in many protein phenomena where ions follow the same trend. The explanation lies in the pairing between salt ions and proteins, which is related to the ions' hydration free energy (roughly corresponding to their size), which determines the strength of the pairing. Once an ion is large enough, it can also pair with uncharged groups and adhere there. Note that modified versions of the Poisson-Boltzmann theory, which take the finite size of ions into account, have been developed recently and are solved numerically.

Be aware that the charge distribution on proteins is not static: the presence of other charged proteins, a membrane, or an external field can induce protonation or deprotonation of amino acids. The new charge distribution can be described by induced multipoles (charge, dipole, quadrupole, etc.), whose magnitude depends on the protein capacitance, C.
The capacitance is defined as the change of the mean charge upon a small change of the electrostatic potential:

C = -\frac{\partial \langle q \rangle}{\partial \psi} = \frac{\langle q^2 \rangle - \langle q \rangle^2}{e^2}    (5.25)

The induced interaction of two molecules can then be written as:

E = -\frac{e^2}{2kT(4\pi\varepsilon)^2} \frac{C_1 C_2}{R^2} - \frac{1}{2kT(4\pi\varepsilon)^2} \frac{Q_2^2 C_1 + Q_1^2 C_2}{R^2} - \dots    (5.26)

where Q is the mean charge of the molecule. For a single amino acid, the capacitance can be related to a change of pH:

C = -\frac{1}{\ln 10} \frac{\partial \langle q \rangle}{\partial\, pH}    (5.27)

The largest capacitance is thus at the pKa, where the molecular charge changes fastest with pH. Histidine has the pKa closest to neutral pH and is thus the most relevant amino acid to consider in biological applications (see the appendix for a pKa table of free-standing amino acids). For proteins and larger molecules the interaction usually needs to be calculated numerically, since the pKa of a residue is influenced by the presence of charged residues in its vicinity.

Exercise: Calculate the electrostatic repulsion of two lysozyme proteins in blood at pH 7.4, where lysozyme has a charge of +7. Assume the radius of lysozyme to be 20 nm.

Exercise: What is the pKa of a histidine on the surface of lysozyme, and close to a negatively charged membrane, at physiological conditions and neutral pH? Consider a standard histidine pKa of 6.0 and a membrane surface charge density of -0.3 e/nm^2. Calculate also the apparent equilibrium constant pKapp, defined with respect to the bulk concentration of protons, [H^+]_∞:

pK_{app}(x) = -\log\frac{[X^-][H^+]_\infty}{[HX]}    (5.28)

Sources and further readings:
1. Jacob N. Israelachvili; Intermolecular and Surface Forces (3rd Edition); Academic Press, 2011
2. Ken A. Dill and Sarina Bromberg; Molecular Driving Forces: Statistical Thermodynamics in Biology, Chemistry, Physics, and Nanoscience (2nd Edition); Garland Science, 2010

6. Towards realistic interactions of membranes and proteins
(DLVO, hydrophobic effect, parametrization of potentials)

By combining the electrostatic and dispersion interactions we obtain the DLVO potential, which describes the interaction between charged proteins and membranes at a coarse-grained level. Originally, the DLVO potential was developed for charged colloids by Derjaguin and Landau and, independently, by Verwey and Overbeek (the name DLVO is constructed from the first letters of the surnames of these authors). The interaction profiles are found to be in very good agreement with the profiles between two colloidal particles measured by Atomic Force Microscopy (AFM). Typically, the dispersion interaction is dominant at short distances, while the electrostatic interaction is the determining factor at large separations. The DLVO interaction between two highly charged particles (of the same sign) is dominated by electrostatic repulsion, with a single van der Waals minimum at contact. When the particle charges are not very large or the solution contains enough electrolyte, a second minimum at a larger separation occurs (see Figure 6.10). The two minima are separated by a barrier which is pH- and electrolyte-dependent. If the barrier is much higher than kT, the solution is kinetically stable, with only the secondary minimum populated. Upon an increase in electrolyte concentration, the height of the barrier decreases, which leads to slow aggregation. Once the concentration reaches a critical value, the barrier height diminishes to zero and the particles undergo rapid aggregation. The usual position of the barrier is at separations in the range of 1 to 4 nm.
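The shape of the DLVO profile is easy to explore numerically by adding the double-layer repulsion of Eq. 5.18 to the dispersion attraction of Eq. 4.39. A minimal sketch for two identical spheres; all parameter values are illustrative assumptions, not fitted to any real system:

```python
import numpy as np

kT   = 4.114e-21           # J (25 C)
e    = 1.602e-19           # C
eps  = 78.4 * 8.854e-12    # permittivity of water
R    = 20e-9               # sphere radius, m
A    = 2.87e-20            # Hamaker constant, J
lam  = 1e-9                # Debye length, m (about 100 mM 1:1 salt)
psi0 = 0.05                # surface potential, V (assumed)
kappa = 1.0 / lam
Reff  = R * R / (R + R)

# interaction constant of Eq. 5.21
Z = 64 * np.pi * eps * (kT / e)**2 * np.tanh(e * psi0 / (4 * kT))**2

d = np.linspace(0.2e-9, 10e-9, 500)        # surface separation, m
E_el  =  Reff * Z * np.exp(-kappa * d)     # double-layer repulsion (5.18)
E_vdw = -Reff * A / (6 * d)                # dispersion attraction (4.39)
E = (E_el + E_vdw) / kT                    # total, in units of kT

i = np.argmax(E)
print("barrier of %.1f kT at d = %.2f nm" % (E[i], d[i] * 1e9))
```

With these assumed values the profile shows the primary minimum at contact, a barrier of a few kT below 1 nm, and a shallow secondary minimum at a few nm, exactly the features discussed above.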
Note that for protein crystallization slow aggregation is desirable, because rapid aggregation often leads to random orientations.

Figure 6.10: Schematic profile of the DLVO potential displayed together with its van der Waals and electrostatic contributions.

One of the successes of the DLVO potential is its explanation of the empirical Schulze-Hardy rule, which states that the critical coagulation concentration varies with the inverse sixth power of the ion valency. We will start the derivation from the DLVO potential for two spherical particles with the same radius R:

E_{\bullet\bullet} = \frac{64\pi kT R \rho_\infty \gamma^2}{\kappa^2} \exp(-\kappa d) - \frac{AR}{12d}    (6.2)

where γ = tanh(zeψ_0/4kT). At the critical concentration the barrier is zero, hence E_{\bullet\bullet}(d_b) = 0, leading to:

\rho_\infty = \frac{A\kappa^2}{768\pi kT \gamma^2 d_b} \exp(\kappa d_b)    (6.3)

The position of the barrier, d_b, can be calculated from the maximum condition dE_{\bullet\bullet}(d_b)/dd = 0, which together with the above equation leads to d_b = 1/κ. Substituting the position of the barrier gives:

\rho_\infty = \frac{A\kappa^3}{768\pi kT \gamma^2} \exp(1)    (6.4)

Inserting the definition \kappa^2 = \frac{z^2 e^2 \rho_\infty}{\varepsilon kT}, we obtain:

\rho_\infty = \frac{768^2 \pi^2 \varepsilon^3 k^5 T^5 \gamma^4}{\exp(2)\, A^2 e^6 z^6}    (6.5)

For high potentials (ψ_0 > 100 mV), γ = tanh(zeψ_0/4kT) ≈ 1, and the critical concentration approximates to ρ ~ 1/z^6, which is the Schulze-Hardy rule. However, for low constant potentials γ = tanh(zeψ_0/4kT) ≈ zeψ_0/4kT, leading to the modified rule ρ ~ ψ_0^4/z^2.

DLVO theory is also successful in explaining the stability of protein solutions with various electrolytes. An example of such a solution is milk, a stable solution of charged proteins that can become unstable upon the addition of vinegar or salt, leading to protein aggregation. However, there are many well-known examples where DLVO fails. The main limitation is due to the Poisson-Boltzmann approximations: Poisson-Boltzmann is a mean-field continuum theory, which includes neither ion-size effects nor ion-ion correlation effects.

It has been shown that combining van der Waals attraction with electrostatic repulsion results in a rich phase diagram, which can also include finite-size aggregates such as clusters or fibers. The reason is that for a few particles, the short-range van der Waals attraction can overcompensate the weak charge repulsion. When more particles aggregate, the total charge of the cluster increases significantly, leading to a strong repulsion of additional particles, while the van der Waals attraction increases only slightly with aggregate size. If we want to consider more complicated objects than spheres, calculating the phase diagram becomes quite difficult, and computer simulations become beneficial.

To complete the interactions of membranes and proteins, we have to discuss the hydrophobic effect, which we have not yet covered. It is one of the most important driving forces for protein folding and molecular self-assembly. The hydrophobic parts of molecules are not exposed to the solution and become mostly buried in the assembled structures. The residual hydrophobic amino acids at protein surfaces are responsible for the hydrophobic interaction between proteins. The origin of the hydrophobic interaction is connected to hydrogen bonding in water, which is responsible for many of water's anomalies. When water is in contact with a hydrophobic molecule, its hydrogen-bond network is disrupted. Indeed, at large interfaces, such as a water/oil interface, the interfacial water molecules have fewer hydrogen bonds and the hydrophobic effect is enthalpic in origin.
However, the complete picture is not so simple. Small hydrophobic molecules only slightly disrupt the hydrogen bonds: when the curvature is large enough, the number of hydrogen bonds remains roughly constant, but the number of possible arrangements of these hydrogen bonds decreases. The smaller number of microstates results in an entropic character of the hydrophobic effect for small molecules. An extreme example of this behavior are clathrates, the relatively stable cages around small hydrophobic molecules formed at high pressures. The crossover radius between the entropic (small solutes) and enthalpic (large interfaces) regimes is around 1 nm.

The hydrophobic free energy, dG, can be calculated from the water-accessible surface area of the nonpolar part of the molecule (dG = γ dA). The water-accessible surface can be constructed by rolling a sphere of radius 0.14 nm, the approximate size of a water molecule, over the surface of the studied molecule. For large hydrophobic particles (larger than 1 nm in radius) the proportionality constant γ is a macroscopic surface tension, e.g. for a water/oil interface it is 4.7 kcal/mol/nm^2 = 7.9 kT/nm^2 = 33 dynes/cm. For smaller molecules such as alkanes it was shown that the effective surface tension is about 4 kT/nm^2. For mid-range molecules/curvatures one should interpolate between these two values.

The strength and distance dependence of the above interactions provide us with solid grounds for estimating and evaluating the most important interactions in a system of interest. Based on this chemo-physical intuition, we can develop a coarse-grained model via a top-down approach. In detailed top-down models, the interaction sites are related to specific chemical groups. The potential is then typically fitted to known thermodynamic properties such as partition coefficients, density, permittivity, binding constants, persistence length, etc. Large-scale top-down models do not necessarily describe any particular system, but rather capture phenomenological features such as the shape and distribution of interaction sites. The behavior of such models is typically investigated by the systematic variation of the model parameters. The information obtained can help to predict the results of experiments under different conditions, as well as to interpret the actual results of those experiments. We finish the top-down approach with a brief summary of the typical strength and distance dependence of the important interactions between small molecules (for dipoles we use thermally averaged interactions):

Interaction      Range         Energy /kT (kJ/mol)
Covalent bond    1 - 2 Å       40 - 350 (100 - 900)
Hydrogen bond    2.5 - 3.5 Å   2 - 12 (5 - 30)
Charge-charge    r^{-1}        16 - 30 (40 - 70)
Charge-dipole    r^{-4}        4 - 12 (10 - 30)
Dipole-dipole    r^{-6}        1 - 3 (2 - 8)
van der Waals    r^{-6}        0.5 - 2 (1 - 5)

Bottom-up

The bottom-up approach is based on a mapping M, a mathematical connection (M : r → R) between a detailed (typically all-atom) configuration r and the coarse-grained configuration R; the choice of mapping is related to the accuracy, efficiency, and transferability of the coarse-grained model.

R_i = M_i(r)    (6.6)

The mapping is usually a linear combination of the Cartesian coordinates of a number of atoms for each coarse-grained building block. The degrees of freedom that will be averaged out in the mapping should be: 1. not essential to the process or property of interest; 2. numerous, so that the computational gain of the coarse-grained model is significant; 3.
largely uncorrelated with the essential degrees of freedom, or else the essential degrees of freedom need to be represented in an effective way. Typically these include high-frequency motions and uncorrelated slow motions; however, it might not be trivial to determine which slow motions are uncorrelated.

Note that the more coarse-grained the model, the more specific (restricted) it is to the system it was parameterized for. This non-transferability of the model is the price we often pay for its speed. For example, if a protein is parameterized with an implicit water model, it will not be valid for a system in a mixed solvent such as water/ethanol. The reason is the surface tension, which is different in the two systems; hence the hydrophobic parts of the protein should interact differently to take this into account.

The mapping defines how we split the molecule of interest into coarse-grained blocks. For small molecules, we usually think about the desired number of blocks and then geometrically split the molecule into parts of equal size. For non-homogeneous large molecules (proteins of 100 nm) it is beneficial to perform a normal mode analysis, a motion analysis, or experiments to identify the rigid and flexible parts of the molecule. Rigid parts can be replaced by one or a few blocks that represent their shape correctly. In contrast, the flexible parts should be divided into a larger number of blocks that can properly describe their motion and fit their elastic parameters, such as the persistence length. Importantly, if the chosen model does not describe the molecule properly, a refined model might be necessary.

Once we have our building blocks, we can start to parameterize their interactions. Statistical mechanics provides us with a formal framework to determine the interactions. It is based on the fact that the probability of each coarse-grained configuration should be equal to the sum of all the corresponding probabilities in the detailed model:

e^{-\frac{W(R)}{kT}} = \int dr\, e^{-\frac{u(r)}{kT}} \prod_{i=1}^{N} \delta(M_i(r) - R_i)    (6.7)

where u is the potential in the detailed model and M_i is the mapping of the i-th interaction site. W(R) is a general many-body potential at the coarse-grained level; it is a free energy profile, in other words the potential of mean force averaged over the detailed-model configurations. The solution of this equation is called the inverse problem of statistical mechanics, and Henderson's theorem states that there is a unique solution, i.e. a one-to-one correspondence between the potential and the distribution function. Unfortunately, W(R) is impossible to calculate for most systems, and its many-body character makes it very inconvenient in practical use. The challenge of the bottom-up approach is thus to find an approximation to W(R) that is computationally efficient and yet accurate. We typically limit ourselves to isotropic two-body potentials, for which the closest fit to W(R) is not unique and depends on the fitting procedure. Naturally, the approximation is also related to the transferability of the obtained potentials to thermodynamic states different from the ones for which they were parameterized. Note that even though the mapping determines the mass of each interaction site, which reproduces the equilibrium momentum distribution, this does not imply that the dynamic properties are reproduced in the coarse-grained model.

The simplest bottom-up method is called the Boltzmann Inversion or Direct Boltzmann Inversion.
It simply converts the distribution of populated configurations from a detailed model into a coarse-grained potential U:

U(R) = -kT \ln\frac{p(r)}{J(r)}    (6.8)

where p is the probability and J is the corresponding Jacobian factor. In many cases this is simplified to spherically isotropic potentials calculated from the radial distribution function ρ:

U(R) = -kT \ln \rho(r)    (6.9)

There are two common ways to obtain the necessary data from the detailed model. The first is to do one simulation with the whole system of interest, while the second uses dilute simulations of the individual building blocks. This method does not include the effect of the environment and of the presence of other coarse-grained sites in the vicinity, which is very important in dense systems.

The correlation among the different coarse-grained sites is captured by the Iterative Boltzmann Inversion method. As the name suggests, this method is iterative, and a coarse-grained simulation is carried out in each step. The coarse-grained potential is modified based on the difference between the probabilities from the detailed and coarse-grained simulations:

U(R)_{new} = U(R)_{old} + kT \ln\frac{P(R|U_{old})}{p(r)}    (6.10)

The Iterative Boltzmann Inversion thus aims to reproduce the pair histograms from the detailed model as accurately as possible. Inverse Monte Carlo is also an iterative scheme, but in addition to the radial distribution functions, their cross-correlation distributions are also used. This includes the fact that modifying the potential at one distance can affect the radial distribution function at other distances. In general, Inverse Monte Carlo should be faster and more likely to converge (not get trapped in local minima) for multi-component systems than the Iterative Boltzmann Inversion. Note that some of the radial distribution functions can be provided from experiments, or there can be additional restrictions during the fitting, such as pressure or compressibility values. Such coarse-grained models are a hybrid between the bottom-up and top-down approaches.

An alternative variational approach is Force Matching (FM), where the goal is to provide a potential that fits the mean forces calculated with the more detailed model. Mathematically, one minimizes the function:

\chi^2(F) = \frac{1}{3N}\left\langle \sum_{s=1}^{N} |F_s(M(r)) - f_s(r)|^2 \right\rangle    (6.11)

where f_s(r) is the net force on a coarse-grained site s in configuration r, and F_s(M(r)) is the net force on the same interaction site using the coarse-grained model in the mapped configuration M(r). The brackets correspond to an average over the canonical ensemble. FM thus focuses on the dynamic rather than the structural properties. In the extended ensemble version of FM, higher-order distribution functions (correlations) are also necessary, and the optimized potentials have an increased transferability. This method is called Multiscale coarse-graining. The ability of the FM model to reproduce the radial distribution functions has been suggested as a means of checking the completeness of the coarse-grained model, showing that the mapping is fine enough. Similarly, net forces can be used as a check for the structure-based methods.

All the above methods can be understood in terms of minimizing the information loss during the coarse-graining process. The information lost during coarse-graining is:

\Phi(R) = \ln\frac{\int p_r(r)\, \delta(R - M(r))\, dr}{P_R(R)}    (6.12)

where P_R is the probability of configuration R in the coarse-grained configuration space and p_r is the probability of configuration r in the detailed configuration space.
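The iterative update of Eq. 6.10 is compact in code. A minimal sketch, assuming the target (mapped detailed) and current coarse-grained radial distribution functions are tabulated on a common grid; the damping factor is a common practical stabilization, not part of Eq. 6.10 itself:

```python
import numpy as np

kT = 2.478  # kJ/mol at 25 C

def ibi_update(U_old, g_cg, g_target, damping=0.2):
    """One Iterative Boltzmann Inversion step (Eq. 6.10).

    U_old, g_cg, g_target: arrays on the same distance grid.
    Bins where either histogram is empty are left unchanged.
    """
    U_new = U_old.copy()
    mask = (g_cg > 0) & (g_target > 0)
    U_new[mask] += damping * kT * np.log(g_cg[mask] / g_target[mask])
    return U_new

# usage: U = ibi_update(U, g_from_cg_run, g_from_atomistic_run)
```

Each call is followed by a new coarse-grained simulation that produces the next g_cg, and the loop is repeated until the two distributions agree.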
The relative entropy associated with this information loss is the negative log-likelihood of the coarse-grained model relative to the detailed one:

S_{rel} = \int p_R(R)\, \Phi(R)\, dR    (6.13)

The relative entropy is positive or zero, and it vanishes when the coarse-grained model perfectly reproduces the sampling of the configuration space by the detailed model. The task of coarse-graining is then to find the interaction potentials which minimize S_{rel} for all configurations λ:

\frac{\delta S_{rel}}{\delta U} = \frac{\delta S_{rel}}{\delta \sum_\lambda U(R_\lambda)} = 0    (6.14)

If we limit ourselves to pair potentials, this can be reformulated as a set of linear equations whose solution leads to the desired coarse-grained potential. The numerical solution then corresponds to the methods described above. The relative entropy may also be helpful in the optimization of the mapping. Iterative Boltzmann Inversion and Inverse Monte Carlo minimize the average of Φ(R) and involve a nonlinear optimization problem that is solved by iterative simulations. Force Matching and Multiscale coarse-graining methods have the advantage of minimizing |Φ(R)|^2, which leads to a linear optimization problem that can be solved directly.

Exercise: Calculate how much you have to lower the pH of milk to make a yoghurt (you can verify your result experimentally using milk and vinegar). Milk is a protein suspension stable at neutral pH. The most abundant milk protein, casein, forms micelles 100 nm in size and with a density 6 times lower than lysozyme. The zeta potential of these micelles was measured to be -8 mV at neutral pH. The Debye screening length of milk is about 1 nm. Assume that the slip plane, where the zeta potential is measured, is 2.5 nm from the micelle surface, as the micelle is fluffy. For lower pH levels, use these values of the zeta potential: -2.5 mV at pH 5.0, -1.2 mV at pH 4.8, -0.5 mV at pH 4.6, 0.0 mV at pH 4.5, and +0.5 mV at pH 4.4.

Exercise: Calculate the hydrophobic interaction and its distance dependence for two hydrophobic spherical particles and for two rods in a parallel orientation. Could this correspond to a globular protein and an α-helix?

Sources and further readings:
1. W. G. Noid; Perspective: Coarse-grained models for biomolecular systems; The Journal of Chemical Physics 2013, 139 (9), 090901; DOI: 10.1063/1.4818908
2. Jacob N. Israelachvili; Intermolecular and Surface Forces (3rd Edition); Academic Press, 2011
3. Ken A. Dill and Sarina Bromberg; Molecular Driving Forces: Statistical Thermodynamics in Biology, Chemistry, Physics, and Nanoscience (2nd Edition); Garland Science, 2010

7. Thermodynamic principles of self-assembly
(Critical micelle concentration, size distribution of aggregates, shape argument)

In this chapter we look at the thermodynamics of the self-assembly of molecules and particles, a vital cellular process. This usually means the association of hydrophobic or amphiphilic (partly hydrophobic and partly hydrophilic) species in aqueous solutions. The resulting molecular aggregates can have various geometries and sizes, depending on the shape and the type of interactions of the molecules of interest. The interactions between the molecules consist of van der Waals interactions, hydrogen bonds, and electrostatic interactions. There are no covalent bonds, which means that the strengths of the interactions are on the order of kT to dozens of kT; therefore the association can be modified by the temperature and the composition of the solvent. The size distribution of the aggregates is determined by thermodynamics.
At equilibrium, the chemical potential has to be the same over the entire system. If we have aggregates of various sizes, their chemical potential can be written as:

\mu_N = \mu_N^o + RT \ln(a_N)    (7.2)

where a_N is the activity (also called relative activity) of an aggregate composed of N molecules. The chemical potential of a single molecule in such an aggregate is:

\mu_{1N} = \mu_{1N}^o + \frac{RT}{N} \ln\frac{a_{1N}}{N}    (7.3)

where a_{1N} = N a_N is the activity of a single molecule in the N-aggregate and \mu_{1N}^o = \mu_N^o / N is the standard chemical potential of a single molecule. Thus, at equilibrium we obtain the relationship between different aggregates:

\mu_{1N} = \mu_{1M}    (7.4)

\mu_{1N}^o + \frac{RT}{N} \ln\frac{a_{1N}}{N} = \mu_{1M}^o + \frac{RT}{M} \ln\frac{a_{1M}}{M}    (7.5)

For an alternative derivation of this equation, see the appendix. The relation between the N-aggregate and the monomers is obtained by simply setting M = 1:

\mu_1^o + RT \ln(a_1) = \mu_{1N}^o + \frac{RT}{N} \ln\frac{a_{1N}}{N}    (7.6)

and the mole fraction of molecules in the N-aggregate can thus be expressed as:

x_{1N} = N \left[ x_1 \exp\left(-\frac{\mu_{1N}^o - \mu_1^o}{RT}\right) \right]^N    (7.7)

assuming the activity coefficients to be 1 (a_i = γ_i x_i). Remember that the total molar fraction of solutes, x, can never exceed 1:

x = \sum_{N=1}^{\infty} x_{1N} \leq 1    (7.8)

Similarly, the concentration of molecules in the N-aggregate can be written as:

c_{1N}/c^o = N \left[ (c_1/c^o) \exp\left(-\frac{\mu_{1N}^o - \mu_1^o}{RT}\right) \right]^N    (7.9)

where c^o is the standard concentration. Note that deriving from molar fractions (c_N = x_N \sum_i c_i) one would obtain the above expression with the total concentration instead of the standard concentration.

The system equilibrium can also be written as a set of chemical reactions. For the equilibrium between monomers and aggregates of size N we can write:

NA \rightleftharpoons A_N    (7.10)

The equilibrium constant is then:

K = \frac{c_N/c^o}{(c_1/c^o)^N} = \frac{c_{1N} (c^o)^{N-1}}{N c_1^N}    (7.11)

= \exp\left(-\frac{\Delta G}{RT}\right) = \exp\left(-\frac{N(\mu_{1N}^o - \mu_1^o)}{RT}\right)    (7.12)

From this we again obtain Eq. 7.9.

Eq. 7.9 has important implications, as the total mole fraction can be expressed as a polynomial in x_1 with small coefficients. As a result, the total concentration is dominated by the linear term at small concentrations. This is caused by the large translational entropy that monomers would lose upon association. The monomer concentration thus increases linearly with the total concentration (lim_{c→0} c_1 = c). Naturally, this does not continue forever. When the loss of monomer translational entropy is comparable to the association energy, the solution saturates and all additional molecules join the aggregates. This threshold concentration is called the Critical Micelle Concentration (CMC) or Critical Aggregation Concentration (CAC). The idealized profile is shown in Figure 7.11. We can express this behavior from Eq. 7.7. Let us consider a system where the molecules prefer to aggregate into micelles of size N. We can then write:

x_{1N} \leq 1    (7.13)

N \left[ x_1 \exp\left(-\frac{\mu_{1N}^o - \mu_1^o}{RT}\right) \right]^N \leq 1    (7.14)

x_1 \leq \frac{1}{N^{1/N}} \exp\left(\frac{\mu_{1N}^o - \mu_1^o}{RT}\right)    (7.15)

The right-hand side then corresponds to the CMC of the given system.

Figure 7.11: Schematic profiles of molar fractions.

The dependence of \mu_{1N}^o on N determines the aggregate size distribution. In general, there could be several peaks; however, this is not the case in simple systems, where the standard chemical potential can be approximated by:

\Delta\mu_{1N}^o = \Delta\mu_{1\infty}^o + \frac{A g}{N^{B/d}}    (7.16)

where g is the interaction free energy of two neighboring particles, d is a dimensionality factor, and Δ denotes the chemical potential difference from the monomer. A and B are constants specific to the given system. The validity of Eq.
7.16 can be demonstrated in the simple case of linear aggregates, where only the nearest neighbors interact. The total chemical potential of the aggregate, N\Delta\mu_{1N}^o, is equal to the number of contacts times the molar free energy of a contact, g. So for all N we can write:

N \Delta\mu_{1N}^o = -(N-1)g    (7.17)

\Delta\mu_{1N}^o = -(1 - 1/N)g    (7.18)

This can be reformulated using the limit (\Delta\mu_{1\infty}^o = \lim_{N\to\infty} -(1 - 1/N)g = -g) to:

\Delta\mu_{1N}^o = \Delta\mu_{1\infty}^o + \frac{g}{N}    (7.19)

Similarly, we can estimate the number of contacts using a square arrangement in 2D. The side of the square, a, is equal to N^{1/2}. All interior particles have 4 neighbors and edge particles have 3, with the exception of the 4 corner particles, which have only 2. The free energy of the aggregate can then be estimated from the contacts as:

4g(a-2)(a-2) + 3g \cdot 4(a-2) + 2g \cdot 4 = 4ga^2 - 4ga    (7.20)

In terms of the number of particles, this is 4gN - 4gN^{1/2}. The total free energy of the system can be written as:

N \Delta\mu_{1N}^o = -(4gN - 4gN^{1/2})    (7.21)

\Delta\mu_{1N}^o = -4g + \frac{4g}{N^{1/2}}    (7.22)

Using the limit for an infinite aggregate, we get \Delta\mu_{1\infty}^o = -4g, so we obtain:

\Delta\mu_{1N}^o = \Delta\mu_{1\infty}^o + \frac{4g}{N^{1/2}}    (7.23)

The factor 4 is given by the square arrangement; a circular one would give a different value. In general, the number of interactions is given by the number of neighbors times N, minus the boundary contribution, which scales as N^{(d-1)/d} in total, i.e. as N^{-1/d} per particle.

Using the standard chemical potential of the monomer, Eq. 7.16 can be reformulated as:

\mu_{1N}^o - \mu_1^o = (\mu_{1\infty}^o - \mu_1^o) + \frac{A g}{N^{B/d}} = (\mu_{1\infty}^o - \mu_1^o)\left(1 - \frac{A}{N^{B/d}}\right)    (7.24)

since we know that g = -\Delta\mu_{1\infty}^o from the derivations above. Substituting Eq. 7.24 into Eq. 7.7, we obtain the expression for the aggregate distribution:

x_{1N} = N \left[ x_1 \exp\left(-\frac{\mu_{1\infty}^o - \mu_1^o}{RT}\left(1 - \frac{A}{N^{B/d}}\right)\right) \right]^N    (7.25)

= N x_1^N \exp\left(-\frac{\mu_{1\infty}^o - \mu_1^o}{RT}(N - AN^{1-B/d})\right)    (7.26)

For convenience, we define a new constant C \equiv \exp\left(-\frac{\mu_{1\infty}^o - \mu_1^o}{RT}\right) and rewrite the above formula:

x_{1N} = N [x_1 C]^N C^{-AN^{1-B/d}}    (7.27)

Above the CMC, we can approximate this as:

x_{1N} = N C^{-AN^{1-B/d}}    (7.28)

Because C > 1, for B/d < 1 the x_{1N} decreases for larger N, and there will be no large aggregates. On the other hand, B/d > 1 leads to strong aggregation, where x_{1N} increases with N; this results in phase separation (an infinite aggregate). B/d = 1 is the threshold value, which can lead to finite-size aggregates with the distribution x_{1N} \approx N \exp(-N/\sqrt{xC}) well above the CMC (see appendix). With this distribution, the majority of the molecules are in aggregates of size N_{max} = \sqrt{xC}, and the distribution of aggregates is quite broad; the solution is therefore polydisperse. However, keep in mind that these results are only valid for simple systems such as spheres with isotropic interactions, where Eq. 7.16 holds. In real systems the interactions are anisotropic, and micelles of finite size appear for geometrical reasons.

The effect of the molecular shape on the aggregate structure and size can be nicely explained on the well-studied surfactants with hydrocarbon tails. The core of an aggregated micelle is composed of hydrocarbon tails surrounded by hydrophilic headgroups, which are in contact with the solution. The size of the hydrophobic core is dictated by the volume of the surfactant hydrocarbon tails, v_t, while the area in contact with water is dictated by the area of the headgroup, a_0. The last parameter describing the surfactant molecule is the critical length of the hydrocarbon tails, l_c. It is defined as the length beyond which the hydrocarbons can no longer be considered an incompressible fluid, and it is slightly smaller than the length of the fully extended tails.
For saturated hydrocarbons, we can use the estimate (in Å):

l_c \leq 1.54 + 1.265\,n    (7.29)

where n is the number of carbons in the chain. Similarly, the estimate for the volume (in Å^3) is:

v_t \approx 27 + 26.9\,n    (7.30)

Both l_c and v_t grow linearly with the length of the tail, so their ratio is roughly constant, v_t/l_c ≈ 21 Å^2, which is the cross-sectional area of a hydrocarbon chain.

Using the v_t, l_c, and a_0 parameters we can calculate the packing parameter p:

p = \frac{v_t}{l_c a_0}    (7.31)

This parameter characterizes the shape of the surfactant and determines the morphology of the aggregate (Israelachvili 1976). Let us start with a spherical micelle of radius R. The number of molecules in the micelle can be calculated from the volume or from the surface as:

N = \frac{4}{3}\pi R^3 / v_t = 4\pi R^2 / a_0    (7.32)

thus, for the radius we obtain:

R = \frac{3 v_t}{a_0}    (7.33)

At the same time, we know that the radius has to satisfy R ≤ l_c. As a result we obtain the formula:

\frac{1}{3} \geq \frac{v_t}{l_c a_0} = p    (7.34)

Thus the packing parameter for a spherical micelle is p ≤ 1/3, meaning that the molecules have a conical shape. A well-known example of such a molecule is sodium dodecyl sulfate. Similarly, cylindrical micelles (hexagonal phase) are formed by surfactants with a packing parameter 1/3 ≤ p ≤ 1/2 and the shape of a truncated cone. An example is the cationic surfactant with a single alkyl chain, CTAB (hexadecyl trimethylammonium bromide). Surfactants with the shape of even more truncated cones, with packing parameter 1/2 ≤ p ≤ 1, self-assemble into the cubic phase or vesicles (examples are lipids with two hydrocarbon chains, such as phosphatidylcholines). Bilayers are composed of cylindrical surfactants with p ≈ 1, represented by phosphatidylethanolamine. Surfactants with p ≥ 1 have the shape of a wedge (an inverted cone, truncated up to a value of p = 3). Wedge-shaped surfactants form inverted phases. Vesicles or the inverted cubic phase are formed by surfactants with packing parameter 1 < p ≤ 2. For 2 < p ≤ 3, surfactants such as ceramide aggregate into an inverted hexagonal phase (inverse cylindrical micelles), and for surfactants with p > 3 the equilibrium structure is inverse micelles. These results are summarized in Table 7.1.

p       1/3        1/2               1           2                    3
shape   cone       truncated cone    cylinder    truncated wedge      wedge
phase   micellar   hexagonal         lamellar    inverted hexagonal   inverted micellar

Table 7.1: Summary of the packing parameter, the surfactant shape, and the formed phase.

The exact packing parameter and the phase formed depend not only on the surfactant headgroup and the length of the hydrocarbon tail, but also on the composition of the solution, the number of unsaturated bonds in the tails, and the temperature. The composition of the solution mostly affects the headgroup area a_0, which can be decreased by increasing the salt concentration or increased by charging the headgroup via a change in pH. Unsaturated or branched tails increase the v_t/l_c ratio, which can also be increased by the addition of hydrophobic molecules into the solution. Temperature effects are more complex, as they change both the headgroup area and the tail volume at the same time. Naturally, mixtures of surfactants behave somewhere in between the pure compounds, and the phase diagram can be quite rich and complex. This is utilized by living cells, which can synthesize various lipid types and thus control the composition and properties of the cellular membranes. We can describe aggregation analytically only for very simple systems; more realistic systems are studied in experiments or computer simulations.
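The packing-parameter argument of Eqs. 7.29-7.31 lends itself to a short numerical illustration; the head-group area used below is an assumed, illustrative value:

```python
# Packing parameter of a single-tail surfactant from the Tanford
# estimates (Eqs. 7.29 and 7.30).
def packing_parameter(n_carbons, a0):
    """a0: head-group area in A^2."""
    lc = 1.54 + 1.265 * n_carbons    # critical tail length, A (Eq. 7.29)
    vt = 27.0 + 26.9 * n_carbons     # tail volume, A^3 (Eq. 7.30)
    return vt / (lc * a0)

# SDS-like surfactant: 12 carbons, head-group area about 60 A^2 (assumed)
p = packing_parameter(12, 60.0)
print(p)   # about 0.35, close to the spherical-micelle limit of 1/3
```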
Nevertheless, it is good to keep in mind the basic thermodynamic principles of aggregation.

Sources and further readings:
1. Jacob N. Israelachvili; Intermolecular and Surface Forces (3rd Edition); Academic Press, 2011

8. Langmuir adsorption
(Adsorption and Grand canonical ensemble)

In the previous chapter, we studied the aggregation of molecules in the bulk. In this chapter we continue with the association of molecules with surfaces, which can be a model for protein adsorption on a membrane or small-molecule adsorption on a large protein with many interaction sites. The most used and best-known model is the Langmuir adsorption model (see Figure 8.12). It considers a surface with M binding sites. Each site can either be empty or occupied by one adsorbed molecule. An adsorbed molecule interacts with the site with a potential energy ε. Moreover, the model assumes that the adsorbed molecules do not interact with each other.

Let us suppose that N molecules are adsorbed on M sites. The total interaction energy is E(N) = -Nε, and the number of possible realizations of such a state is:

\Omega(N) = \binom{M}{N} = \frac{M!}{N!(M-N)!}    (8.2)

As a result, the canonical partition function is:

Q(N, M, T) = \sum_i e^{-\frac{E_i}{kT}} = \frac{M!}{N!(M-N)!} e^{\frac{N\varepsilon}{RT}}    (8.3)

Figure 8.12: Langmuir model of adsorption.

where i denotes a microstate (one realization) with energy E_i. The chemical potential is thus:

\mu = \frac{\partial F}{\partial N} = -RT \frac{\partial \ln(Q)}{\partial N}    (8.4)

= -RT \frac{\partial}{\partial N} \ln\frac{M!}{N!(M-N)!} - \varepsilon    (8.5)

For large M and N we can use the Stirling approximation, \ln(n!) \approx n\ln(n) - n, and its derivative, \frac{d\ln(n!)}{dn} \approx \ln(n) + 1 - 1 = \ln(n), to get:

\mu = -RT \left[ -\ln(N) + \ln(M-N) \right] - \varepsilon    (8.6)

= -\varepsilon + RT \ln\frac{N}{M-N}    (8.7)

To compare with experiments we have to calculate the fractional coverage, defined as x_c ≡ N/M. Eq. 8.7 can be rewritten as:

\mu = -\varepsilon + RT \ln\frac{x_c}{1-x_c}    (8.8)

\frac{x_c}{1-x_c} = \exp\left(\frac{\mu+\varepsilon}{RT}\right)    (8.9)

x_c = \frac{1}{1 + \exp\left(-\frac{\mu+\varepsilon}{RT}\right)}    (8.10)

more commonly expressed in the form:

x_c = \frac{\exp\left(\frac{\mu+\varepsilon}{RT}\right)}{1 + \exp\left(\frac{\mu+\varepsilon}{RT}\right)}    (8.11)

A typical profile obtained with this relationship is shown in Figure 8.13. When the concentration of adsorbed molecules is high, the molecules commonly start to interact in some way (directly or indirectly), which leads to deviations from the Langmuir model. Note that in the derivation above one can use kT instead of RT, since kT is a pure unit of energy while RT is energy per mole.

Now we derive the same equation in the Grand Canonical ensemble, where the number of particles can change. The system has a constant chemical potential, volume, and temperature (µVT). The Grand Canonical partition function is:

\Xi \equiv \sum_N Q(N, V, T) e^{\frac{\mu N}{RT}}    (8.12)

Thus, using Eq. 8.3, we obtain for our system:

\Xi = \sum_{N=0}^{M} \frac{M!}{N!(M-N)!} e^{\frac{N(\varepsilon+\mu)}{RT}}    (8.13)

We use the binomial theorem, (1+x)^M = \sum_{N=0}^{M} \frac{M!}{N!(M-N)!} x^N, to simplify the above expression to:

\Xi = \left(1 + e^{\frac{\varepsilon+\mu}{RT}}\right)^M    (8.14)

The average number of molecules in the system (adsorbed molecules) can be calculated from the expression:

\langle N \rangle \equiv \sum_N N\, p(N)    (8.15)

where p(N) is the probability of a state with N adsorbed molecules. Therefore, we obtain:

\langle N \rangle = \frac{\sum_N N Q(N,V,T) e^{\frac{\mu N}{RT}}}{\sum_N Q(N,V,T) e^{\frac{\mu N}{RT}}} = \frac{\partial \ln\Xi}{\partial \frac{\mu}{RT}}    (8.16)

= M \frac{\exp\left(\frac{\varepsilon+\mu}{RT}\right)}{1 + \exp\left(\frac{\varepsilon+\mu}{RT}\right)}    (8.17)

and for the fractional coverage x_c we obtain the same result as above:

x_c = \frac{\exp\left(\frac{\mu+\varepsilon}{RT}\right)}{1 + \exp\left(\frac{\mu+\varepsilon}{RT}\right)}    (8.18)

Figure 8.13: Typical profile of Langmuir adsorption.

The derivation in the Grand Canonical ensemble is easier and more elegant. It demonstrates that the Grand Canonical ensemble can be used to great benefit, especially in open and more complex systems.
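The resulting isotherm is easy to evaluate. A minimal sketch of Eq. 8.18 with an illustrative adsorption energy; note that the coverage crosses 1/2 exactly at µ = -ε:

```python
import numpy as np

RT = 2.478      # kJ/mol at 25 C
eps = 10.0      # adsorption energy, kJ/mol (illustrative assumption)

mu = np.linspace(-30.0, 10.0, 200)   # chemical potential, kJ/mol
x = np.exp((mu + eps) / RT)
coverage = x / (1.0 + x)             # Eq. 8.18
# the coverage rises sigmoidally and equals 1/2 at mu = -eps
```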
Note that the Langmuir adsorption model can also be derived phenomenologically from a reaction description. We start by considering the reaction N_F + M_F \rightleftharpoons N, where N_F represents a free molecule, M_F a free binding site, and N the adsorbed molecule in the binding site. The equilibrium constant can be expressed as:

K = \frac{a_N}{a_{N_F} a_{M_F}}    (8.19)

and the fractional coverage as:

x_c = \frac{N}{M} = \frac{N}{N + M_F} = \frac{1}{1 + \frac{M_F}{N}}    (8.20)

The activity coefficients are 1 in the Langmuir model, so we can reformulate the fractional coverage as:

x_c = \frac{1}{1 + \frac{a_{M_F}}{a_N}} = \frac{1}{1 + \frac{1}{K a_{N_F}}}    (8.21)

Since we also know that the activity is related to the chemical potential (a_{N_F} = \exp(\frac{\mu}{RT})) and the energy of this reaction is related to the equilibrium constant (K = \exp(\frac{\varepsilon}{RT})), we can express the fractional coverage as:

x_c = \frac{1}{1 + \exp\left(-\frac{\mu+\varepsilon}{RT}\right)}    (8.22)

This is commonly expressed in the form:

x_c = \frac{\exp\left(\frac{\mu+\varepsilon}{RT}\right)}{1 + \exp\left(\frac{\mu+\varepsilon}{RT}\right)}    (8.23)

If we want to study adsorption in computer simulations, we can either perform a simulation in the canonical ensemble with a system large enough to represent the infinite bulk, or we can employ a Grand Canonical ensemble simulation. A simulation in the Grand Canonical ensemble only needs to include the surface; the system is therefore much smaller and faster to compute. The bulk reservoir with constant temperature and pressure (chemical potential) is replaced by a fluctuating number of particles in the system. To perform such simulations, we have to implement insertion and deletion moves. During an insertion move we take a molecule from a hypothetical thermalized reservoir and place it at a random position in the system. The insertion is accepted with the probability:

p(N \to N+1) = \min\left[1, \frac{V}{\Lambda^3(N+1)} \exp\left(\frac{\mu - E(N+1) + E(N)}{kT}\right)\right]    (8.24)

In a deletion move we randomly select a molecule and remove it with the probability:

p(N \to N-1) = \min\left[1, \frac{\Lambda^3 N}{V} \exp\left(\frac{-\mu - E(N-1) + E(N)}{kT}\right)\right]    (8.25)

where E is the energy of the system with the given number of molecules, µ stands for the chemical potential, and Λ is the de Broglie wavelength, \Lambda \equiv \frac{h}{\sqrt{2\pi m kT}}. The exponential pre-factor corresponds to the changed number of possible realizations of the system, and it originates from the non-interacting part of the partition function, \frac{V^N}{\Lambda^{3N} N!}. In more detail, we can follow the detailed balance condition of the Monte Carlo method (for the basics of the Monte Carlo method, including the detailed balance condition, see the appendix). The Grand Canonical partition function is:

\Xi = \sum_{N=0}^{\infty} \frac{V^N}{\Lambda^{3N} N!} e^{\frac{\mu N}{RT}} \int e^{-\frac{E(r^N)}{kT}} dr^N    (8.26)

and the probability of a configuration with N particles is:

P(N) \sim \frac{V^N}{\Lambda^{3N} N!} e^{\frac{\mu N}{RT}} e^{-\frac{E(r^N)}{kT}}    (8.27)

thus, if we are randomly selecting particles, we can calculate the transition probability ratio to be:

\frac{p(N \to N+1)}{p(N+1 \to N)} = \frac{P(N+1)}{P(N)}    (8.28)

= \frac{V}{\Lambda^3(N+1)} e^{\frac{\mu - E(N+1) + E(N)}{kT}}    (8.29)

This probability ratio is satisfied by the above acceptance probabilities of the single moves. In most cases we know the pressure of the system rather than its chemical potential. In order to obtain the chemical potential for the simulation, we could perform simulations without the surface and calculate the pressure for a given chemical potential until we find the one we want. Or we could do a simulation in the NPT ensemble at the desired pressure and calculate the corresponding chemical potential. Alternatively, we could perform the simulation at standard pressure and use the following equation:

\mu = \mu^o + RT \ln a = \mu^o + RT \ln\left(\phi \frac{p}{p^o}\right)    (8.30)

where φ is the fugacity coefficient.
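The acceptance rules of Eqs. 8.24 and 8.25 translate directly into code. A minimal sketch of the two acceptance tests only; the energy evaluation and the rest of the simulation loop are left to the caller:

```python
import numpy as np

rng = np.random.default_rng()

def accept_insertion(N, V, Lambda, mu, dE, kT):
    """Eq. 8.24; dE = E(N+1) - E(N) for the trial insertion."""
    arg = V / (Lambda**3 * (N + 1)) * np.exp((mu - dE) / kT)
    return rng.random() < min(1.0, arg)

def accept_deletion(N, V, Lambda, mu, dE, kT):
    """Eq. 8.25; dE = E(N-1) - E(N) for the trial deletion."""
    arg = Lambda**3 * N / V * np.exp((-mu - dE) / kT)
    return rng.random() < min(1.0, arg)
```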
For optimal performance of simulations in the Grand Canonical ensemble, the acceptance probability of insertions and deletions should not be very low.

Exercise: Show that with Langmuir adsorption you can obtain the typical sigmoidal titration curve with a midpoint at the pKa. Assume that you titrate with a strong acid/base, so you can focus on the dissociation curve.

Sources and further readings:
1. Daan Frenkel and Berend Smit; Understanding Molecular Simulation: From Algorithms to Applications (2nd Edition); Academic Press, 2001

9. Transport phenomena
(Diffusion, Navier-Stokes, Stokes-Einstein equation, DPD)

In the previous chapters we discussed the thermodynamic properties and structures of systems in equilibrium. In this chapter we focus on the kinetics of proteins. As mentioned in the first chapter, diffusion dominates the processes in soft matter. Proteins and big molecules in solution undergo a large number of collisions with the surrounding molecules of the solution, leading to their Brownian motion. This motion is named after Robert Brown, who studied the motion of pollen grains in aqueous solution. However, the molecular explanation and derivation were given by Einstein (1905) and Smoluchowski (1906).

The derivation starts from macroscopic measurements. If we put a lot of grains in one part of the solution, we will observe their diffusion to the regions of low concentration. This motion is described by Fick's 1st law:

j = -D \frac{\partial c}{\partial x}    (9.2)

which states that the flux j is linearly proportional to the concentration gradient, with the proportionality constant D called the diffusion coefficient. From the conservation of mass, the spatial derivative of the flux is related to the time derivative of the concentration c:

\frac{\partial j}{\partial x} = -\frac{\partial c}{\partial t}    (9.3)

Putting these together, we obtain Fick's 2nd law:

\frac{\partial c}{\partial t} = D \frac{\partial^2 c}{\partial x^2}    (9.4)

If we assume a constant amount of material, N = \int_{-\infty}^{\infty} c\, dx, released at the origin at time zero (c(x,0) = N\delta(x)), the solution of the differential equation is:

c(x,t) = \frac{N}{\sqrt{4\pi Dt}} \exp\left(-\frac{x^2}{4Dt}\right)    (9.5)

Figure 9.14: Concentration distribution change via diffusion.

Naturally, the mean particle position is ⟨x⟩ = 0, as the particles diffuse in all directions with the same diffusion coefficient. The second moment, the mean square displacement, is:

\langle x^2 \rangle = 2Dt    (9.6)

The macroscopic change in the concentration profile can thus be related to the microscopic motion of the particles. We can do the same derivation in n dimensions to obtain the more general expression:

\langle |r|^2 \rangle = 2nDt    (9.7)

Before we look at how to estimate the diffusion coefficient of molecules, let us have a look at the molecular origin of Fick's law. In other words, how do particles know in which direction to go? And how do we get time-irreversible behavior from Newton's time-reversible laws (a drop of ink in water disperses over time but not the other way round)? Consider two boxes with different numbers of ink molecules, where a molecule can jump from one box to the other with probability p and stay with probability s. The average number of particles that jump from a box with N molecules is:

\langle n_j \rangle = \sum_{i=0}^{N} i\, P(i,N)    (9.8)

where P(i,N) is the probability of i molecules jumping out of N. As we know that each molecule either jumps or stays, we can calculate the probabilities:

P(i,N) = \frac{N!}{i!(N-i)!} p^i s^{N-i}    (9.9)

This is a binomial distribution, where the combinatorial factor counts the ways of choosing the i jumping molecules.
The sum can then be rewritten with a derivative as:

\langle n_j \rangle = \sum_{i=0}^{N} p \frac{\partial P(i,N)}{\partial p} = p \frac{\partial \sum_{i=0}^{N} P(i,N)}{\partial p}    (9.10)

Now we use the fact that the binomial distribution sums to (p+s)^N. Hence:

\langle n_j \rangle = p \frac{\partial (p+s)^N}{\partial p} = pN(p+s)^{N-1} = Np    (9.11)

where we used the assumption that each molecule either jumps or stays, i.e. p + s = 1. This means that the flux of molecules from the first box to the second one, per area A and time dt, can be calculated as:

\langle j \rangle = \frac{\langle n_1 \rangle - \langle n_2 \rangle}{A\, dt} = \frac{p(N_1 - N_2)}{A\, dt} = \frac{p \Delta N}{A\, dt}    (9.12)

To make the result more general, we move to boxes of infinitesimally small volume, where ΔN = -Δc dV = -Δc A dx. The minus sign comes from the opposite directions in the definitions of the flux and concentration differences, Δj = j_1 - j_2 while Δc = c_2 - c_1.

\langle j \rangle = -\frac{p\, dx\, \Delta c}{dt} = -\frac{p\, dx^2}{dt} \frac{\Delta c}{dx} = -D \frac{\partial c}{\partial x}    (9.13)

The flow of particles along the concentration gradient (Fick's law) thus originates from the simple fact that more molecules will randomly move in that direction than in the opposite one. Similarly to the mixing of two gases, molecules do not 'know' where to move; they move independently towards the states with more possible realizations.

To obtain an estimate of the diffusion coefficient, we start with its connection to the velocity autocorrelation function:

\langle x^2 \rangle = \left\langle \left(\int_0^t v\, dt'\right)^2 \right\rangle    (9.14)

= \int_0^t \int_0^t \langle v(t')v(t'') \rangle\, dt'\, dt''    (9.15)

= \int_0^t \int_0^{t'} \langle v(t')v(t'') \rangle\, dt''\, dt' + \int_0^t \int_{t'}^{t} \langle v(t')v(t'') \rangle\, dt''\, dt'    (9.16)

Using the substitutions z = t' - t'' and y = t'' - t' in the second integral leads to:

\langle x^2 \rangle = \int_0^t \int_0^{t'} \langle v(t')v(t'') \rangle\, dt''\, dt' + \int_0^t \int_0^{y} \langle v(y)v(z) \rangle\, dz\, dy    (9.17)

= 2 \int_0^t \int_0^{t'} \langle v(t')v(t'') \rangle\, dt''\, dt'    (9.18)

From this we can evaluate the diffusion coefficient:

D = \lim_{t\to\infty} \frac{\partial \langle x^2 \rangle}{2\, \partial t}    (9.19)

D = \int_0^{\infty} \langle v(\tau)v(0) \rangle\, d\tau    (9.20)

where τ = t' - t''. Equation 9.20 is called the Green-Kubo relation. The motion of an individual particle of mass m in diffusion is described by the Langevin equation:

m \frac{d^2 x}{dt^2} = f_c - \gamma \frac{dx}{dt} + f_s    (9.21)

where f_c is a conservative force originating in the inter-particle potential, γ stands for the friction coefficient, related to the fluid viscosity, and f_s is a stochastic random force representing collisions with the surrounding fluid molecules (typically represented by white noise). This equation is an essential part of Langevin dynamics. Molecular friction is related to macroscopic dissipation, which is related to the stochastic force by the fluctuation-dissipation theorem. When the theorem is satisfied, the friction and the stochastic force act as a thermostat with the correct thermal fluctuations.

When the Langevin equation is combined with the Green-Kubo relation, we can derive an equation for the diffusion coefficient from the motion of a single particle. The first step is to multiply the Langevin equation by v_x(0) and average it over time or over the ensemble. For a non-interacting particle we get:

0 = m v_x(0)\frac{dv_x}{dt} + \gamma v_x(0)v_x(t) - v_x(0)f_s(t)    (9.22)

0 = m \left\langle v_x(0)\frac{dv_x}{dt} \right\rangle + \gamma \langle v_x(0)v_x(t) \rangle - \langle v_x(0)f_s(t) \rangle    (9.23)

0 = m \left\langle v_x(0)\frac{dv_x}{dt} \right\rangle + \gamma \langle v_x(0)v_x(t) \rangle    (9.24)

since the stochastic force is uncorrelated with the initial velocity. The solution of the above equation is:

\langle v_x(0)v_x(t) \rangle = \langle v_x^2(0) \rangle e^{-\frac{\gamma}{m}t}    (9.25)

The mean squared velocity \langle v_x^2(0) \rangle is equal to kT/m from the Maxwell velocity distribution or the equipartition theorem. Hence:

\langle v_x(0)v_x(t) \rangle = \frac{kT}{m} e^{-\frac{\gamma}{m}t}    (9.26)

The second step is to use this solution in the Green-Kubo expression:

D = \int_0^{\infty} \frac{kT}{m} e^{-\frac{\gamma}{m}t}\, dt    (9.27)

= \frac{kT}{m} \frac{m}{\gamma}    (9.28)

= \frac{kT}{\gamma}    (9.29)

To finish the calculation we need to know the friction coefficient, which is not easy to obtain, as it is determined by hydrodynamics.
Below we show the main laws of hydrodynamics and the calculation of the friction coefficient for a spherical particle. The hydrodynamic flow of a solution is described by the Navier-Stokes equation, which is a nonlinear partial differential equation. For incompressible liquids it reads:

\rho \left(\frac{\partial v}{\partial t} + v \cdot \nabla v\right) = -\nabla p + \eta \nabla^2 v + f    (9.30)

where ρ is the density, v stands for the velocity vector, p represents the hydrostatic pressure, η is the viscosity, and f is the external force acting on the studied elementary volume dV. The equation is derived from Newton's second law, d(mv)/dt = F, applied to a small volume of fluid. On the left side, the total derivative (d(mv)/dt = \partial(mv)/\partial t + v \cdot \nabla(mv)) is simplified using the continuity equation and the assumption of fluid incompressibility. The force on the right side of Newton's law is split into the stress acting on a cubic volume and the force f acting homogeneously on the given volume. The stress tensor is then further decomposed into diagonal terms (pressure) and off-diagonal terms (shear), which give rise to the second term on the right side of the Navier-Stokes equation. Note that the Navier-Stokes equation does not take into account the molecular structure of the fluid, which is described as a homogeneous continuum.

There is no general solution of the Navier-Stokes equation, its nonlinear character being the main obstacle. However, in soft matter the typical motion is so slow that it is in the low Reynolds number regime (no turbulence). Thus, we can neglect the term v \cdot \nabla v, and Eq. 9.30 becomes linear:

\rho \frac{\partial v}{\partial t} = -\nabla p + \eta \nabla^2 v + f    (9.31)

and much easier to solve. Of course, the exact solution still depends on the boundary conditions, but several simple cases can be solved analytically. For example, for a hard sphere moving at steady state in solution, the Navier-Stokes equation simplifies to 0 = -\nabla p + \eta \nabla^2 v. This can be solved in spherical coordinates, leading to the velocity transferred from the particle to the solution, v_t ~ v_s(R/r), where R is the sphere's radius, v_s is the sphere's velocity, and r is the distance from the sphere. From this velocity dependence, we can calculate the friction force and derive the well-known Stokes' law:

F_f = -6\pi\eta R v_s    (9.32)

This result might be counter-intuitive, since in turbulent flow the friction force depends on R^2 v_s^2. However, in laminar flow it depends on R v_s, which demonstrates the importance of correctly describing the hydrodynamics. Importantly, the velocity of the solution decays from the sphere as 1/r, which means that the hydrodynamic interaction between two particles can be long-ranged.

From Stokes' law we now have the friction coefficient, γ = 6πηR, and we can insert it into Eq. 9.29 to end up with the Einstein-Stokes relation:

D = \frac{kT}{\gamma} = \frac{kT}{6\pi\eta R}    (9.33)

which describes the diffusion coefficient of spherical particles in a liquid under laminar flow. This can be used to estimate the diffusion coefficient of roughly spherical proteins. However, when a molecule has a more complicated shape or non-homogeneous interactions, the Navier-Stokes equation cannot be solved analytically, and we need computer simulations to calculate the diffusion coefficient. For completeness, the water viscosity at ambient conditions is about 10^{-3} Pa·s. Note that the diffusion coefficient is a function of temperature, but not of the concentration of solutes. Based on the Einstein-Stokes relation, we can get an idea of the typical time scales in soft matter.
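A quick numerical check of Eq. 9.33 for a few illustrative protein radii:

```python
import math

kT = 4.114e-21      # J at 25 C
eta = 1.0e-3        # water viscosity, Pa s

for R in (2e-9, 10e-9, 100e-9):              # radii in m (illustrative)
    D = kT / (6 * math.pi * eta * R)         # Einstein-Stokes, Eq. 9.33
    print("R = %5.0f nm   D = %.2e m^2/s" % (R * 1e9, D))
# R = 2 nm gives about 1e-10 m^2/s, a typical value for small proteins
```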
Proteins approximated by a sphere will diffuse a distance equal to their radius in the following time (combining Eq. 9.6 and Eq. 9.29):

R^2 = 2nDt = 6 \frac{kT}{6\pi\eta R} t    (9.34)

t = \frac{\pi\eta R^3}{kT}    (9.35)

As a result we get times on the order of µs for proteins with a radius of 10 nm and milliseconds for proteins with a radius of 100 nm. The relevant timescales are therefore not easily accessible by all-atom simulations. Fortunately, there are coarse-grained techniques which can reach such long time scales; some of them even include hydrodynamics, through which we can calculate the diffusion coefficient.

Probably the most widely used coarse-grained technique that includes solvent hydrodynamics is Dissipative Particle Dynamics (DPD). Note, however, that there are other methods, such as the lattice Boltzmann method, multi-particle collision dynamics, fluid particle dynamics, or fluctuating hydrodynamics. In the DPD method, several solvent molecules are coarse-grained into one particle, which has an effective friction and fluctuations similar to the all-atom solvent. As in standard Molecular Dynamics (MD), time is discretized and the movement in each step is computed from the forces acting on the particles. The difference is that in DPD the total force contains not only the conservative forces F_c(r_{ij}) (e.g. forces originating from inter-particle potentials), but also dissipative forces F_d(r_{ij}, v_{ij}) and stochastic forces F_s(r_{ij}):

m_i \frac{d^2 r_i}{dt^2} = F_i = \sum_{j \neq i} \left[ F_c(r_{ij}) + F_d(r_{ij}, v_{ij}) + F_s(r_{ij}) \right]    (9.36)

The dissipative force depends on both the interparticle distance r_{ij} and the velocity difference v_{ij}:

F_d(r_{ij}, v_{ij}) = -\gamma\, \omega_d(|r_{ij}|) \left(v_{ij} \cdot \frac{r_{ij}}{|r_{ij}|}\right) \frac{r_{ij}}{|r_{ij}|}    (9.37)

where γ is the friction coefficient and ω_d represents the variation of the friction with distance. The ω_d distance dependence is usually limited by a cut-off distance, beyond which the friction is zero. Below the cut-off r_c it can be a constant or a decaying function such as (1 - r_{ij}/r_c). The stochastic force has the form:

F_s(r_{ij}) = -\sigma\, \omega_s(|r_{ij}|)\, g\, \frac{r_{ij}}{|r_{ij}|}    (9.38)

where σ is the magnitude of the random pair force and ω_s is again the distance variation. g is a random number from a Gaussian distribution with unit variance. This formula guarantees that the stochastic force between two particles is antisymmetric, F_s(r_{ij}) = -F_s(r_{ji}), which is needed for the conservation of momentum. Moreover, the antisymmetry saves computer time, since only one value has to be calculated for each pair. This also distinguishes DPD from Brownian dynamics, in which each particle experiences a random force independently. Brownian dynamics thus does not conserve the local momentum, which is a crucial property for correct hydrodynamics. The dissipative and stochastic forces act together as a thermostat, keeping the system at the desired temperature in the canonical ensemble (NVT). However, to recover the canonical ensemble there is a necessary condition:

\omega_d = \omega_s^2    (9.39)

Exercise: Calculate the maximum displacement for a Monte Carlo displacement move. Assume the acceptance probability to be 0.2. The target diffusion coefficient is 10^{-5} cm^2 s^{-1}. The 1D case is trivial, so consider 3D moves, where a particle is displaced with the same probability in all directions (and over all distances within the maximum displacement).

Sources and further readings:
1. Daan Frenkel and Berend Smit; Understanding Molecular Simulation: From Algorithms to Applications (2nd Edition); Academic Press, 2001
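Before moving on to Monte Carlo methods, the DPD forces of Eqs. 9.37-9.39 can be summarized in a short sketch; the linear weight function and the parameter handling are common practical choices, not the only possible ones:

```python
import numpy as np

def dpd_pair_force(rij, vij, gamma, sigma, rc, rng):
    """Dissipative + stochastic DPD pair force (Eqs. 9.37-9.38).

    For correct canonical sampling the magnitudes must additionally
    satisfy the fluctuation-dissipation relation sigma^2 = 2*gamma*kT.
    """
    r = np.linalg.norm(rij)
    if r >= rc:
        return np.zeros(3)
    e = rij / r                        # unit vector between the pair
    ws = 1.0 - r / rc                  # omega_s, a common decaying choice
    wd = ws**2                         # omega_d = omega_s^2 (Eq. 9.39)
    Fd = -gamma * wd * np.dot(vij, e) * e           # dissipative force
    Fs = -sigma * ws * rng.standard_normal() * e    # stochastic force
    return Fd + Fs
```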
10. Advanced Monte Carlo

(NPT, Parallel tempering, Cluster moves, Configuration bias, Widom insertion, Wang-Landau method, Kinetic Monte Carlo, Dynamic Monte Carlo)

In this chapter we describe some of the methods that go beyond the simple Monte Carlo (MC) method with Metropolis sampling of the configuration space in the canonical ensemble, which is described in the appendix. We will only cover a few advanced methods that are most commonly used. There is a large number of different advanced methods, and new ones are still being developed. For example, in 2011 a Hybridization move was developed, which leads to the formation and breakage of bond constraints, as well as a Non-equilibrium candidate method, which enhances sampling by moving particles out of equilibrium with subsequent relaxation of the system. A large proportion of the advanced methods take advantage of non-physical moves in order to enhance sampling of the equilibrium properties. This naturally leads to a loss of the dynamics, for which Molecular Dynamics (MD) is frequently used. However, there are also MC methods which can provide the dynamics of the system, such as Kinetic and Dynamic Monte Carlo. On the other hand, if we are only interested in the equilibrium properties, MD can be used and understood as one possible, though usually not the most efficient, method to sample the configuration space. Note that the implementation of complicated and highly specialized moves is usually system-dependent. This is why there is no universal MC package with all the methods, although there have been several attempts to develop packages that are reasonably general.

We start with a simple simulation of an NPT (isobaric-isothermal) ensemble. Similarly to Molecular Dynamics, the desired pressure of the system and its fluctuations are provided by changes of the simulation box size. Here we will discuss an isotropic change; however, an anisotropic change of the box is very similar. In contrast to Molecular Dynamics, there are no forces calculated in Monte Carlo; instead, the change of the box is chosen randomly and the pressure is maintained via the acceptance rule. In a canonical ensemble we used the acceptance criterion exp(−ΔE/kT), because the canonical partition function is:

Q(N, V, T) = \frac{1}{\Lambda^{3N} N!} \int_{V^N} e^{-E/kT}\, d\mathbf{r}^N    (10.2)

and the MC method samples this integral. In the same way we will use an MC scheme to sample the integral of the partition function of the isobaric ensemble, which is:

Z(N, p, T) = \frac{p}{\Lambda^{3N} N! kT} \int \int_{V^N} e^{-pV/kT} e^{-E/kT}\, d\mathbf{r}^N\, dV    (10.3)

In other words, the density of states (the probability that a system of N particles will have volume V at pressure p) is:

P(N, V) = V^N e^{-pV/kT} e^{-E/kT}    (10.4)
        = \exp\left( -\frac{E + pV - NkT \ln V}{kT} \right)    (10.5)

and the probability criterion for accepting or rejecting the volume change is:

prob(old \to new) = \min\left[ 1, \exp\left( -\frac{E_{new} - E_{old} + p(V_{new} - V_{old}) - NkT \ln(V_{new}/V_{old})}{kT} \right) \right]    (10.6)

We use prob for probability here to avoid confusion with the pressure p. Usually it is more efficient to make random changes in the logarithm of the volume, ln(V), than in the volume itself. The acceptance criterion then has to be modified to:

prob(old \to new) = \min\left[ 1, \exp\left( -\frac{E_{new} - E_{old} + p(V_{new} - V_{old}) - (N+1)kT \ln(V_{new}/V_{old})}{kT} \right) \right]    (10.7)

since V^N dV = V^{N+1} d(\ln V).

Parallel tempering (also called replica exchange) is a method used for systems with many local minima. In order to avoid the system getting 'stuck' in one minimum, a simultaneous run of many simulations is used, each at a different temperature, to overcome barriers.
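Before deriving the swap acceptance rule, here is a minimal Python sketch of the isotropic ln(V) volume move with the acceptance rule of Eq. 10.7. The energy function, its signature, and the coordinate rescaling are schematic assumptions:

```python
import math, random

def volume_move_lnV(positions, box_L, energy_fn, p, kT, dlnV_max=0.02):
    """One isotropic NPT volume move performed in ln(V) (Eq. 10.7)."""
    N = len(positions)
    V_old = box_L ** 3
    E_old = energy_fn(positions, box_L)
    V_new = math.exp(math.log(V_old) + random.uniform(-dlnV_max, dlnV_max))
    scale = (V_new / V_old) ** (1.0 / 3.0)
    trial = [[c * scale for c in r] for r in positions]  # rescale all coordinates
    E_new = energy_fn(trial, box_L * scale)
    # exponent of Eq. 10.7; note the (N+1) ln(V) term from sampling in ln(V)
    arg = -(E_new - E_old + p * (V_new - V_old)
            - (N + 1) * kT * math.log(V_new / V_old)) / kT
    if arg >= 0 or random.random() < math.exp(arg):
        return trial, box_L * scale   # accepted
    return positions, box_L           # rejected
```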
The configurations between the different runs are swapped based on an MC swap move, and the probability of accepting such a move depends on the temperature difference (the smaller the temperature difference, the more likely the swap is accepted). The acceptance criterion for a swap move between systems i, j in NVT ensembles can be derived from the detailed balance condition:

\frac{p(i(T_i)j(T_j) \to i(T_j)j(T_i))}{p(i(T_j)j(T_i) \to i(T_i)j(T_j))} = \frac{P_i(T_j) P_j(T_i)}{P_i(T_i) P_j(T_j)}    (10.8)

= \frac{e^{-E_i/kT_j}\, e^{-E_j/kT_i}}{e^{-E_i/kT_i}\, e^{-E_j/kT_j}}    (10.9)

= e^{\left( \frac{1}{kT_i} - \frac{1}{kT_j} \right)(E_i - E_j)}    (10.10)

Hence the acceptance criterion is:

p(i(T_i)j(T_j) \to i(T_j)j(T_i)) = \min\left[ 1, e^{\frac{T_j - T_i}{kT_i T_j}(E_i - E_j)} \right]    (10.11)

Temperature is not the only property that can be varied, and similar methods have been developed for simulations of systems with different densities, pH, salt concentrations, etc. The method is general, and the exchanges can be done between simulations with different Hamiltonians.

In cluster moves, the positions or configurations of several molecules are changed at once. We typically attempt to move particles whose separate movements would result in a low acceptance. Therefore the most common criteria for the cluster definition are distance proximity or strong interaction. The acceptance rules have to be carefully devised to satisfy the detailed balance condition, as cluster moves could include the joining or separation of existing clusters. The simplest way to deal with this issue is to automatically reject any move where the number of clusters changes. The acceptance criterion is then simply:

p(old \to new) = \min\left[ 1, \frac{\prod_{ij \in C} p^{new}_{ij} \prod_{i \in C, j \notin C} (1 - p^{new}_{ij})}{\prod_{ij \in C} p^{old}_{ij} \prod_{i \in C, j \notin C} (1 - p^{old}_{ij})} \exp\left( -\frac{E_{new} - E_{old}}{kT} \right) \right]    (10.12)

where C stands for the cluster and p_ij is the probability of two particles i and j belonging to the same cluster. By multiplying the probabilities for all particles we get the total probability of a given cluster and the rest of the system. If the probability of the cluster before and after the move is the same, the criterion simplifies to:

p(old \to new) = \min\left[ 1, \frac{\prod_{i \in C, j \notin C} (1 - p^{new}_{ij})}{\prod_{i \in C, j \notin C} (1 - p^{old}_{ij})} \exp\left( -\frac{E_{new} - E_{old}}{kT} \right) \right]    (10.13)

The probabilities p_ij could be defined, for instance, using a simple distance cutoff r_c, resulting in: p_ij = 1 for r_ij < r_c and p_ij = 0 for r_ij ≥ r_c.

The Configuration Bias MC (CBMC) method is used to bias our selection of configurations towards those that are probable. We will demonstrate this method on insertions or deletions of large chain molecules such as proteins and polymers. The basic idea is based on the growth or removal of parts of the molecule (for example amino acids) depending on their Boltzmann weights. It originates from the Rosenbluth method, which is used for polymers on a lattice, where the acceptance of the move is given by the ratio of Rosenbluth weights. In growth, a randomly chosen position of the next chain segment would very likely end up in a rejection due to an overlap with another segment. Thus, we randomly generate several trial positions and choose one according to its Boltzmann weight. This means that we bias the growth towards the positions that are most likely. The probability of the i-th trial position is given by:

P_i = \frac{\exp(-E_i/kT)}{\sum_{j=1}^{l} \exp(-E_j/kT)},    (10.14)

where l is the number of trial positions and E_j is the potential energy of the system with the given trial position.
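A minimal Python sketch of this Boltzmann-weighted selection of a trial position is shown below; the trial_positions list and the trial_energy helper are illustrative assumptions, and the returned sum w is the per-segment Rosenbluth weight used in the acceptance rule derived next:

```python
import math, random

def select_trial_position(trial_positions, trial_energy, kT):
    """Pick one of l trial positions with probability given by Eq. 10.14.

    Returns the chosen position together with the Rosenbluth weight
    w = sum_j exp(-E_j/kT) for this segment (Eq. 10.15).
    """
    boltz = [math.exp(-trial_energy(pos) / kT) for pos in trial_positions]
    w = sum(boltz)
    chosen = random.choices(trial_positions, weights=boltz, k=1)[0]
    return chosen, w
```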
However, as we introduce bias, we also have to modify the acceptance probability so that detailed balance remains satisfied (otherwise we would generate conformations with a Rosenbluth distribution and not a Boltzmann distribution). To correct the acceptance probability we have to calculate the Rosenbluth weight w(i) for the i-th segment:

w(i) = \sum_{j=1}^{l} \exp\left( -\frac{E_j(i)}{kT} \right)    (10.15)

The Rosenbluth weight for the whole new chain of n segments is:

W_{new} = \prod_{i=1}^{n} w(i) = \prod_{i=1}^{n} \sum_{j=1}^{l} \exp\left( -\frac{E_j(i)}{kT} \right)    (10.16)

Now we take into account the conformations of the chains in the current system, which follow the Boltzmann distribution. We randomly select one of the existing chains in the system and regrow it with l trial positions for each segment, as for a new chain; however, the first trial position i = 1 is not random, but taken from the existing chain e. The Rosenbluth weight for the i-th segment is:

w_{old}(i) = \exp\left( -\frac{E_e(i)}{kT} \right) + \sum_{j=2}^{l} \exp\left( -\frac{E_j(i)}{kT} \right)    (10.17)

The Rosenbluth weight for the whole "old" existing chain of n segments is:

W_{old} = \prod_{i=1}^{n} \left[ \exp\left( -\frac{E_e(i)}{kT} \right) + \sum_{j=2}^{l} \exp\left( -\frac{E_j(i)}{kT} \right) \right]    (10.18)

The acceptance criterion of the move is:

p(old \to new) = \frac{W_{new}}{W_{old}} \quad \text{if } W_{new} < W_{old}; \qquad = 1 \quad \text{if } W_{new} \geq W_{old}    (10.19)

There are modifications of Configuration Bias Monte Carlo which were developed to enhance sampling. One example is the Prune-Enrichment method, in which partial configurations are discarded ('pruned') if their probability is low, whereas very likely early configurations are copied ('enriched') to improve sampling. Another example is the Recoil-growth method, where chains leading to a dead end are shortened and regrown in a new conformation. For chain molecules, Reptation moves are also frequently used. In this move one end of the chain is marked as the head; then, as the head moves, the body of the chain follows the movement like a snake.

In general, if we have a bias for generating trial configurations and the bias depends on the potential energy of such a configuration (α(old → new) = f(E_new)), then the acceptance criterion in the canonical ensemble is:

p(old \to new) = \min\left[ 1, \frac{f(E_{old})}{f(E_{new})}\, e^{-\frac{E_{new} - E_{old}}{kT}} \right]    (10.20)

Now we will look at a few methods used for free energy calculations. Widom insertion, or simply particle insertion, is a method to calculate the free energy of a particle inserted into the system. For a large system, the free energy per particle µ can be calculated as:

\mu = F(N+1, V, T) - F(N, V, T)    (10.21)

where F is the Helmholtz free energy. From statistical mechanics we get:

\mu = -kT \ln Q(N+1, V, T) + kT \ln Q(N, V, T)    (10.22)
    = -kT \ln \frac{Q(N+1, V, T)}{Q(N, V, T)}    (10.23)

where Q is the canonical partition function:

Q(N, V, T) = \frac{1}{\Lambda^{3N} N!} \int_{V^N} e^{-E/kT}\, d\mathbf{r}^N    (10.24)

After substitution we obtain:

\mu = -kT \ln \frac{V}{\Lambda^3 (N+1)} - kT \ln \frac{\int e^{-E_{N+1}/kT}\, d\mathbf{r}^{N+1}}{V \int e^{-E_N/kT}\, d\mathbf{r}^N} = \mu_{id} + \mu_{ex}    (10.25)

The first term is an ideal gas contribution, while the second term is the excess free energy. The integral can be calculated as an average over the configurations of the N-particle system:

\mu_{ex} = -kT \ln \left\langle \int e^{-\frac{E_{N+1} - E_N}{kT}}\, \frac{d\mathbf{r}_{N+1}}{V} \right\rangle_N    (10.26)

The ensemble average over N particles, ⟨..⟩_N, can be sampled by the Monte Carlo method by simulating the system in the canonical ensemble with N particles, where we randomly insert an additional particle and calculate the energy difference E_{N+1} − E_N. Note that the insertions are only virtual and the system continues to evolve with N particles.
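A minimal Python sketch of such virtual insertions is given below. The stored snapshots and the insertion_energy helper (returning E_{N+1} − E_N for a trial position) are hypothetical names, assumed for illustration:

```python
import math, random

def widom_mu_ex(snapshots, box_L, insertion_energy, kT, n_insert=100):
    """Excess chemical potential by Widom insertion (Eq. 10.26):
    mu_ex = -kT ln < exp(-(E_{N+1} - E_N)/kT) >, averaged over stored
    N-particle configurations and uniformly random virtual insertions."""
    boltz_sum, count = 0.0, 0
    for config in snapshots:              # configurations from an N-particle run
        for _ in range(n_insert):
            trial = [random.random() * box_L for _ in range(3)]
            dE = insertion_energy(config, trial)   # E_{N+1} - E_N
            boltz_sum += math.exp(-dE / kT)
            count += 1
    return -kT * math.log(boltz_sum / count)
```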
In an isobaric ensemble we can follow a similar derivation to obtain:

\mu = -kT \ln \frac{kT}{p\Lambda^3} - kT \ln \left\langle \frac{pV}{(N+1)kT} \int e^{-\frac{E_{N+1} - E_N}{kT}}\, \frac{d\mathbf{r}_{N+1}}{V} \right\rangle_N    (10.27)

In very dense systems, Widom insertion is not efficient, as the configurations with low energy (inserting into a hole) are poorly sampled. It is then advantageous to combine insertions with deletions. Two systems are simulated: the first one has N particles and an additional particle is virtually inserted; the second simulation consists of N + 1 particles and virtual removals of random particles are carried out. From each system, we obtain the probability distribution p of ΔE = E_{N+1} − E_N. If there is an overlap of these two distributions, we can calculate the free energy of the insertion:

p_{N+1}(\Delta E) = \frac{\int e^{-E_{N+1}/kT}\, \delta(E_{N+1} - E_N - \Delta E)\, d\mathbf{r}^{N+1}}{\int e^{-E_{N+1}/kT}\, d\mathbf{r}^{N+1}}

= \frac{\int e^{-(E_N + \Delta E)/kT}\, \delta(E_{N+1} - E_N - \Delta E)\, d\mathbf{r}^{N+1}}{\int e^{-E_{N+1}/kT}\, d\mathbf{r}^{N+1}}

= e^{-\Delta E/kT}\, \frac{\int e^{-E_N/kT}\, \delta(E_{N+1} - E_N - \Delta E)\, d\mathbf{r}^{N+1}}{\int e^{-E_N/kT}\, d\mathbf{r}^N} \cdot \frac{\int e^{-E_N/kT}\, d\mathbf{r}^N}{\int e^{-E_{N+1}/kT}\, d\mathbf{r}^{N+1}}

= e^{-\Delta E/kT}\, p_N(\Delta E)\, e^{\mu_{ex}/kT}    (10.28)

This leads to:

\mu_{ex} = \Delta E - kT \ln \frac{p_N(\Delta E)}{p_{N+1}(\Delta E)} = \Delta E + kT \ln\left( p_{N+1}(\Delta E) \right) - kT \ln\left( p_N(\Delta E) \right)    (10.29)

This method is called the Overlapping distribution method and is more general than the above case. A related approach, more commonly used in MD, is thermodynamic integration.

The Wang-Landau (WL) method calculates the density of states in any system that can be characterized by a cost function. In particle simulations, the cost function is an energy function along a reaction coordinate. The reaction coordinate can be a general combination of coordinates, such as a particle distance, a distance from the center of mass, axes, volume, etc. This coordinate is sampled (usually from a minimum to a maximum value with a certain bin size) and the penalty potential along this coordinate is adaptively modified. In the end we obtain flat sampling along the reaction coordinate, and the cost function is related to the free energy. As such, the WL method is similar to Metadynamics in MD. Let g be a penalty function along the reaction coordinate and h the histogram. At the beginning both functions have zero values in all their bins. In each MC move, the histogram of the current bin is increased by 1 and the penalty function is increased by a constant modification factor f. (Note that in the original WL method ln(g) and ln f were used instead of g and f, as it was made to directly correspond to the density of states, not its logarithm.) The acceptance criterion for the canonical ensemble is:

p(old \to new) = \min\left[ 1, \exp\left( -\frac{E_{new} - E_{old} + g_{new} - g_{old}}{kT} \right) \right]    (10.30)

The sampling continues until the histogram indicates an accurate estimate of the density of states with the given modification factor f. A common criterion is that the difference between the maximum and minimum value of the histogram is smaller than a certain threshold. The original algorithm proposed that each histogram value should be within 20 % of the mean value. When such a criterion is satisfied, the modification factor f is decreased and the histogram is reset to 0. Originally, f was decreased by simply multiplying by 1/2. Later, it was suggested to use a multiplication proportional to 1/t in order to improve convergence (t is the simulation time). The procedure continues until f decreases below a preselected value (typically 10^{-7}). The function g is then related to the free energy of the system along the reaction coordinate.
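A minimal Python sketch of this adaptive loop is shown below. The move proposal, the reaction-coordinate binning, and the energy function are assumed helpers, and the flat-histogram test uses the 20 % criterion mentioned above:

```python
import math, random

def wang_landau(state, n_bins, propose, bin_of, energy_of, kT,
                f=1.0, f_min=1e-7, flatness=0.2):
    """Adaptive penalty g along a binned reaction coordinate (Eq. 10.30)."""
    g = [0.0] * n_bins          # penalty function, grows at visited bins
    h = [0] * n_bins            # histogram of visits
    while f > f_min:
        trial = propose(state)
        dE = energy_of(trial) - energy_of(state)
        arg = -(dE + g[bin_of(trial)] - g[bin_of(state)]) / kT
        if arg >= 0 or random.random() < math.exp(arg):
            state = trial
        b = bin_of(state)
        h[b] += 1
        g[b] += f               # penalize the visited bin by the factor f
        if min(h) > 0 and max(h) - min(h) < flatness * (sum(h) / n_bins):
            f *= 0.5            # original 1/2 rule; a 1/t schedule also works
            h = [0] * n_bins    # reset the histogram for the refined f
    return g                    # related to the free energy along the coordinate
```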
To capture dynamics, Kinetic Monte Carlo or Dynamic Monte Carlo are frequently used. Naturally, no unphysical moves are allowed in these techniques, and we need to know some of the dynamic parameters of the system a priori. In Dynamic Monte Carlo, we typically know the diffusion coefficients for all of the moves employed, as all the moves are coupled to the same time constant. The method was shown to lead to the same results as Brownian dynamics when the moves and their probabilities are calculated from the infinite dilution limit. In Kinetic Monte Carlo, the rates of all the moves responsible for the studied process are known, and the trial moves are attempted according to these rates. Naturally, it is crucial that the individual (uncorrelated) rates r_i are correct. At the beginning of each step, we calculate the cumulative rate function R_i = \sum_{j=1}^{i} r_j for i = 1, 2, ..., N, where N is the number of possible processes at the given time. Consequently, we generate a random number u between 0 and R_N and carry out the process (move) m for which R_{m-1} < u \leq R_m. At the end of each step, the simulation time is updated by \Delta t = -\ln x / R_N, where x is a new random number, x \in (0, 1]. This algorithm is also called the residence-time algorithm or the Bortz-Kalos-Lebowitz algorithm. A more efficient modification of this algorithm has been suggested, in which transitions of the same kind are grouped.

The final method we will mention is Green's Function Reaction Dynamics, which is an efficient method for simulating diffusion-dependent systems. The many-body problem of the system is split into one- and two-body problems, which can be solved analytically using Green's Functions. For every particle and pair of particles we calculate the time at which they would interact. The events are then simulated chronologically, and after each simulated interaction the event list is updated.

Sources and further readings:
1. Daan Frenkel and Berend Smit; Understanding Molecular Simulation: From Algorithms to Applications (2nd Edition); Academic Press, 2001

Test questions

1. What assumptions are used for the segments of the ideal chain model?
2. What is the end-to-end distance in the ideal chain model (random walk)?
3. What is the sum of the individual segments in the ideal chain model, \sum_{i=1}^{N} \mathbf{r}_i?
4. How does the equilibrium size of the ideal chain depend on temperature?
5. How does the total bending energy of a vesicle depend on its radius?
6. What is the mean curvature?
7. What is the Gaussian curvature?
8. What is included in the van der Waals interactions?
9. What is the distance dependence of the dispersion interaction between two atoms?
10. What is the name of the material constant for the dispersion interaction?
11. What is the distance dependence of the dispersion interaction for large spherical particles?
12. What is the origin of the depletion interaction?
13. Whose approximation can be used to calculate a short-range interaction of objects with various shapes?
14. What approximations are used in the Poisson-Boltzmann theory?
15. What is the Debye screening length at physiological concentrations (nm)?
16. What is the name of the double layer theory of screened charged surfaces?
17. What interactions are included in the DLVO theory?
18. What is a rough range of the strength of hydrogen bonds (kT)?
19. What is a rough range of the strength of the van der Waals interaction (kT)?
20. What is valid for the hydrophobic interaction (r_c is the crossover radius)?
21. What is used in the bottom-up approach?
22. Coarse-grained models are:
23. What is true well below the CMC (c_i is the concentration of the i-mer aggregate)?
24. What is the packing parameter?
25. What is in the Langmuir model?
26. What is the number of ways to select N out of M indistinguishable particles?
27. What formula represents the diffusion coefficient (n is the number of degrees of motion)?
28. What is the name of the formula for calculating the diffusion coefficient from the velocity autocorrelation function?
29. How long (in order of magnitude) does it take for a protein of radius 1 nm to diffuse from the brain to an arm? Assume that the whole motion is within one axon with the viscosity of water, 10^{-3} Pa·s, and that the total length is about 1 m.

Appendices

Basics of thermodynamics

Thermodynamics is a phenomenological science providing us with a series of general relations that are valid for any substance. The main relations on which thermodynamics is based are known as the laws of thermodynamics, which can be formulated in various ways.

The First Law of thermodynamics states that the energy E of a closed system is conserved and can be exchanged in two forms: heat Q and work W:

dE = \delta Q + \delta W    (A.2)

The Second Law says that heat can never spontaneously flow from a cold reservoir to a warmer reservoir. In other words, it is impossible to make an engine which completely converts heat from a single heat bath into work. Yet another formulation is that any spontaneous process cannot decrease entropy, where the entropy change dS is, for a reversible change, given by the formula:

dS = \frac{\delta Q_{rev}}{T}    (A.3)

Entropy is an extensive quantity, so the entropy of the whole system equals the sum of the entropies of its individual parts.

The Third Law states that the entropy of a perfect crystal at absolute zero is equal to zero:

S_{ideal}(T = 0\,\mathrm{K}) = 0

The Zeroth Law of thermodynamics is sometimes automatically assumed, but it should be added for completeness. It postulates that if two systems are in thermal equilibrium with a third system, they must be in thermal equilibrium with each other:

T_A = T_C,\ T_B = T_C \ \Rightarrow\ T_A = T_B

Energy E can be related to the enthalpy H = E + pV, the Helmholtz free energy F = E − TS, and the Gibbs free energy G = E − TS + pV. Each of these quantities becomes a thermodynamic potential which the system minimizes in a different ensemble, i.e. under different constant conditions:

dE = T\,dS - p\,dV + \sum_i \mu_i\, dN_i    (NVE)
dH = T\,dS + V\,dp + \sum_i \mu_i\, dN_i    (NpS)
dF = -S\,dT - p\,dV + \sum_i \mu_i\, dN_i    (NVT)
dG = -S\,dT + V\,dp + \sum_i \mu_i\, dN_i    (NpT)

Maxwell relations (connecting useful but non-measurable quantities to measurable ones):

(\partial T/\partial V)_{S,N} = -(\partial p/\partial S)_{V,N}
(\partial V/\partial T)_{p,N} = -(\partial S/\partial p)_{T,N}
(\partial T/\partial p)_{S,N} = (\partial V/\partial S)_{p,N}
(\partial p/\partial T)_{V,N} = (\partial S/\partial V)_{T,N}
(\partial T/\partial N)_{V,S} = (\partial \mu/\partial S)_{V,N}

Gibbs-Duhem relation:

-S\,dT + V\,dp = \sum_i N_i\, d\mu_i

Basics of statistical mechanics

Statistical mechanics provides us with a connection between the microscopic details of a system and its thermodynamic quantities. We use the term microscopic details to mean all microscopic states in which the system can exist. This information is contained in a partition function, which evaluates the number of states with the same probability under the given conditions. Naturally, each ensemble has a different partition function. In a micro-canonical ensemble, we assume that all states with the same energy are equally likely (the ergodic hypothesis). The partition function Ω is then simply the number of microstates at given energy E, volume V, and number of particles N. In a canonical ensemble the probability of each state is given by the Boltzmann factor exp(−E_i/kT); thus the partition function is Q = \sum_i \exp(-E_i/kT).
This can be written in a more general way using the Hamiltonian of the system, H. In a continuous space of interacting particles, we can write Q = \int \exp(-H/kT)\, d\mathbf{p}\, d\mathbf{r}. Moreover, if the Hamiltonian is composed of kinetic energy and interaction potential energy, the momenta can be integrated out. In a 3D system of N particles, this leads to:

Q = \frac{1}{\Lambda^{3N} N!} \int \exp\left( -\frac{U}{kT} \right) d\mathbf{r},    (A.4)

where U is the potential energy of the system in a given configuration r. The integral prefactor originates from the number of permutations of indistinguishable particles, N!, the number of degenerate states under the given conditions (here 1), and the de Broglie wavelength \Lambda = h/\sqrt{2\pi m kT}. The fact that the particles are indistinguishable does not have to have its origin in quantum mechanics (i.e. all electrons are the same), but in the number of (in)distinguishable macrostates with these particles. Using the general statistical formula \langle x \rangle = \sum_i x_i p_i / \sum_i p_i, one can then calculate the various properties of the system from the partition function.

Summary of useful relations:

NVE (microcanonical): \Omega;  S = k \ln \Omega;  1/T = \partial S/\partial E;  p = T\,\partial S/\partial V;  \mu = -T\,\partial S/\partial N
NVT (canonical): Q = \sum e^{-E/kT};  F = -kT \ln Q;  E = \partial(F/T)/\partial(1/T);  p = -\partial F/\partial V;  \mu = \partial F/\partial N
NpT (isothermal-isobaric): Z = \int Q\, e^{-pV/kT}\, dV;  G = -kT \ln Z;  H = \partial(G/T)/\partial(1/T);  V = \partial G/\partial p;  \mu = \partial G/\partial N
µVT (grandcanonical): \Xi = \sum_{N=0}^{\infty} Q\, e^{\mu N/kT};  pV = kT \ln \Xi;  N = \partial(pV)/\partial\mu;  E = T\,\partial(pV)/\partial T

An example is the ideal gas (non-interacting point particles), where the canonical partition function simplifies to Q = \frac{V^N}{\Lambda^{3N} N!}. Thus, F = -kT \ln \frac{V^N}{\Lambda^{3N} N!}. The pressure of the ideal gas is p = -\left( \frac{\partial F}{\partial V} \right)_{N,T} = \frac{NkT}{V}, which leads to the ideal gas equation of state.

Derivation of Boltzmann distribution

In a canonical (NVT) ensemble, the Helmholtz free energy is at a minimum:

dF = dU - T\,dS = 0,    (A.5)

which we can use to calculate the probability of individual states i with energy E_i. The internal energy is given by

U = \langle E \rangle = \sum_i p_i E_i,    (A.6)

and the derivative is

dU = \sum_i (E_i\, dp_i + p_i\, dE_i)    (A.7)

The energy of any particular state normally depends on V or N but not on S or T. Using this fundamental principle from quantum mechanics and the fact that temperature is constant in a canonical ensemble, we obtain:

dU = \sum_i E_i\, dp_i    (A.8)

The entropy is defined as

S = -k \sum_i p_i \ln p_i,    (A.9)

so the derivative is

dS = -k \sum_i (1 + \ln p_i)\, dp_i    (A.10)

We can now substitute Eq. A.8 and A.10 into Eq. A.5:

dF = \sum_i [E_i + kT(1 + \ln p_i)]\, dp_i = 0    (A.11)

The solution to this equation is:

p_i \propto e^{-E_i/kT}    (A.12)

After normalization, we obtain the Boltzmann distribution:

p_i = \frac{e^{-E_i/kT}}{\sum_i e^{-E_i/kT}} = \frac{e^{-E_i/kT}}{Q},    (A.13)

where Q is the canonical partition function.

Virial coefficients and interactions

The equation of state of a very dilute gas (solution) is given by the ideal gas equation:

pV = nRT,    (A.14)

which can be rewritten as:

\frac{p}{kT} = \frac{N}{V} = \rho    (A.15)

At higher concentrations the real gas deviates from the ideal gas, and the deviation can be expressed in a density expansion of the equation of state:

\frac{p}{kT} = \rho + B_2(T)\rho^2 + B_3(T)\rho^3 + ...    (A.16)

This is known as the virial expansion, and the B_n(T) are called virial coefficients. The coefficients can be viewed as corrections to the ideal gas, and their importance increases with increasing density. They are specifically related to n-body intermolecular interactions: B_2 is related to two-body interactions, B_3 is related to three-body interactions, etc. From statistical mechanics it can be shown that:

B_2 = V\left( \frac{1}{2} - \frac{Q_2}{Q_1^2} \right)    (A.17)

B_3 = V^2 \left[ \frac{2Q_2}{Q_1^2}\left( \frac{2Q_2}{Q_1^2} - 1 \right) - \frac{1}{3}\left( \frac{6Q_3}{Q_1^3} - 1 \right) \right]    (A.18)

where Q_n is the canonical partition function of a system with n particles.
Using the pairwise additivity of interactions we can write:

Q_1 = \frac{1}{\Lambda^3} \int_V d\mathbf{r}_1 = \frac{V}{\Lambda^3}    (A.20)

Q_2 = \frac{1}{\Lambda^6\, 2!} \int_V \exp\left( -\frac{U_{12}}{kT} \right) d\mathbf{r}_1\, d\mathbf{r}_2    (A.21)

Q_3 = \frac{1}{\Lambda^9\, 3!} \int_V \exp\left( -\frac{U_{12} + U_{13} + U_{23}}{kT} \right) d\mathbf{r}_1\, d\mathbf{r}_2\, d\mathbf{r}_3    (A.22)

The second virial coefficient for a monoatomic classical gas can be related to u(r), the potential of mean force (spherically averaged free energy) between particles, as:

B_2(T) = -\frac{1}{2} \int_0^{\infty} \left( e^{-u(r)/kT} - 1 \right) 4\pi r^2\, dr    (A.24)

Ideal chain - radius of gyration Rg

We start from the definition of the radius of gyration

R_g^2 \equiv \frac{1}{N} \sum_{i=1}^{N} (|\mathbf{r}_i - \mathbf{R}_{CM}|)^2    (A.25)

where R_CM is the center of mass defined as:

\mathbf{R}_{CM} \equiv \frac{1}{N} \sum_{i=1}^{N} \mathbf{r}_i    (A.26)

To derive the relation of the radius of gyration to the end-to-end distance, we will use the following identity together with the definition of the center of mass:

\sum_{i,j=1}^{N} (|\mathbf{r}_i - \mathbf{r}_j|)^2 = \sum_{i,j=1}^{N} [(\mathbf{r}_i - \mathbf{R}_{CM}) - (\mathbf{r}_j - \mathbf{R}_{CM})]^2
= \sum_{i,j=1}^{N} (\mathbf{r}_i - \mathbf{R}_{CM})^2 - 2\sum_{i,j=1}^{N} (\mathbf{r}_i - \mathbf{R}_{CM})\cdot(\mathbf{r}_j - \mathbf{R}_{CM}) + \sum_{i,j=1}^{N} (\mathbf{r}_j - \mathbf{R}_{CM})^2
= N\sum_{i=1}^{N} (\mathbf{r}_i - \mathbf{R}_{CM})^2 + 0 + N\sum_{j=1}^{N} (\mathbf{r}_j - \mathbf{R}_{CM})^2
= 2N \sum_{i=1}^{N} (|\mathbf{r}_i - \mathbf{R}_{CM}|)^2    (A.27)

Therefore, the radius of gyration can be rewritten as:

R_g^2 = \frac{1}{2N^2} \sum_{i,j=1}^{N} (|\mathbf{r}_i - \mathbf{r}_j|)^2    (A.28)

The vector r_i − r_j is an end-to-end vector of the chain between the i-th and j-th monomers. Employing the average end-to-end distance, we obtain:

\langle R_g^2 \rangle = \frac{1}{2N^2} \sum_{i,j=1}^{N} l^2 |i - j|    (A.29)
= \frac{l^2}{2N^2} \sum_{i=1}^{N} 2\sum_{j=1}^{i} (i - j)    (A.30)

Now we carry out the summation using \sum_{k=1}^{a} k = a(a+1)/2 and \sum_{k=1}^{a} k^2 = (2a^3 + 3a^2 + a)/6:

\langle R_g^2 \rangle = \frac{l^2}{2N^2} \sum_{i=1}^{N} 2\left( i \cdot i - \frac{i(i+1)}{2} \right)    (A.31)
= \frac{l^2}{2N^2} \sum_{i=1}^{N} (i^2 - i)    (A.32)
= \frac{l^2}{2N^2} \left( \frac{2N^3 + 3N^2 + N}{6} - \frac{N^2 + N}{2} \right)    (A.33)
= \frac{1}{6} l^2 N - \frac{l^2}{6N}    (A.34)
= \frac{1}{6} \langle R_N^2 \rangle - \frac{l^2}{6N}    (A.35)

The mean-square radius of gyration is thus equal to one sixth of the mean-square end-to-end distance for very large N.

Ideal chain - probability of the end-to-end distance

We will calculate the probability distribution of the end-to-end distance R_N of an ideal chain made of N Kuhn segments, each of length l. For simplicity we start with a 1D chain, where segments can only be oriented left or right. The end-to-end distance can be expressed as R_N = (N_R - N_L)l, where N_i is the number of segments oriented right (i = R) or left (i = L). The number of realizations of such an R_N is given by the combinatorial expression:

\Omega(N_R, N_L) = \frac{N!}{N_R! N_L!}    (A.36)

Since we know that the total number of segments stays constant, we can substitute N_L = N − N_R:

\Omega(N_R) = \frac{N!}{N_R!(N - N_R)!}    (A.37)

With this substitution R_N = (N_R - N + N_R)l = (2N_R - N)l. To simplify the above expression we will use x = N_R - N_L, so R_N = xl and

\Omega(x) = \frac{N!}{\left( \frac{N+x}{2} \right)! \left( \frac{N-x}{2} \right)!}    (A.38)

The probability of the distance R_N is the number of these micro-states divided by the total number of states, which is 2^N, since every segment can be oriented left or right. However, Eq. A.38 is restricted to values of x for which both N + x and N − x are even numbers. By doing so we have doubled our probability, since the probability of odd combinations is zero. To correct for this we have to add a factor of 1/2. Another possibility would be to obtain the probability without this factor and then normalize the probability to 1.

2P(x) = \frac{1}{2^N} \frac{N!}{\left( \frac{N+x}{2} \right)! \left( \frac{N-x}{2} \right)!}    (A.39)

We will now use the Stirling approximation \ln x! \approx x \ln x - x + \ln\sqrt{2\pi x} in its exponential form,
x! = x^x e^{-x} \sqrt{2\pi x}:

2P(x) = \frac{1}{2^N} \frac{N^N e^{-N + \frac{N+x}{2} + \frac{N-x}{2}} \sqrt{2\pi N}}{\left( \frac{N+x}{2} \right)^{\frac{N+x}{2}} \left( \frac{N-x}{2} \right)^{\frac{N-x}{2}} \sqrt{2\pi \frac{N+x}{2}} \sqrt{2\pi \frac{N-x}{2}}}

= \frac{\left( \frac{N}{2} \right)^N e^0}{\left( \frac{N+x}{2} \right)^{\frac{N+x}{2}} \left( \frac{N-x}{2} \right)^{\frac{N-x}{2}}} \sqrt{\frac{2N}{\pi (N^2 - x^2)}}

= \frac{\left( \frac{N}{2} \right)^N}{\left( \frac{N^2 - x^2}{4} \right)^{\frac{N}{2}} \left( \frac{N+x}{2} \right)^{\frac{x}{2}} \left( \frac{N-x}{2} \right)^{-\frac{x}{2}}} \sqrt{\frac{2N}{\pi (N^2 - x^2)}}

= \left( 1 - \frac{x^2}{N^2} \right)^{-\frac{N}{2}} \left( \frac{1 + x/N}{1 - x/N} \right)^{-\frac{x}{2}} \sqrt{\frac{2}{\pi N (1 - (x/N)^2)}}    (A.40)

Assuming that the number of segments is huge, N_R \ll N, or in our substitution x \ll N, and using the exponential limit for large N (e^t \approx (1 + t/N)^N), we can simplify the formula to:

2P(x) = \exp\left( \frac{x^2}{2N} - \frac{x^2}{2N} - \frac{x^2}{2N} \right) \sqrt{\frac{2}{\pi N}}    (A.41)
= \exp\left( -\frac{x^2}{2N} \right) \sqrt{\frac{2}{\pi N}}    (A.42)

P(x) = \frac{1}{\sqrt{2\pi N}} \exp\left( -\frac{x^2}{2N} \right)    (A.43)

We can double-check that the distribution is normalized. By substituting x back, we obtain the end-to-end distance probability:

P(R_N) = \frac{1}{\sqrt{2\pi N l^2}} \exp\left( -\frac{R_N^2}{2Nl^2} \right)    (A.45)

Similarly, we can obtain the distribution in three dimensions:

P(R_N) = \left( \frac{3}{2\pi N l^2} \right)^{3/2} \exp\left( -\frac{3R_N^2}{2Nl^2} \right)    (A.46)

Dispersion interactions with thin membrane

The dispersion interaction with a thin membrane is calculated similarly to the interaction with an infinite wall, only the upper integration limit is not infinite, but the membrane thickness t. Therefore, the interaction of a small molecule with the membrane is:

E_{\cdot} = -\int_{z=d}^{d+t} \int_0^{\infty} 2\pi x \rho \frac{C}{(x^2 + z^2)^3}\, dx\, dz    (A.47)
= \pi\rho C \int_d^{d+t} \left[ \frac{1}{2(x^2 + z^2)^2} \right]_0^{\infty} dz    (A.48)
= -\pi\rho C \int_d^{d+t} \frac{1}{2z^4}\, dz    (A.49)
= -\frac{1}{6}\pi\rho C \left( \frac{1}{d^3} - \frac{1}{(d+t)^3} \right)    (A.50)

The interaction distance dependence is \sim 1/d^4 for small membrane thickness t \to 0. The distance scaling is thus weaker than for the interaction with an infinite wall. The interaction of a sphere and a membrane can be calculated using the above result by simply integrating over all the molecules within the sphere:

E_{\bullet} = \int_d^{d+2R} \rho_{sph}\, \pi (z - d)(2R - z + d)\, E_{\cdot}(z)\, dz    (A.51)

As for the wall derivation, we will use the close-distance approximation R \gg z - d, leading to:

E_{\bullet} = \frac{CR}{3}\pi^2 \rho_{sph}\rho_{mem} \int_d^{\infty} \left[ \frac{z - d}{(z + t)^3} - \frac{z - d}{z^3} \right] dz    (A.52)

The first term in the integral is calculated by employing the substitution s = z + t. The second term is the same as for the wall.

E_{\bullet} = \frac{1}{3}\pi^2 \rho_{sph}\rho_{mem} C R \left[ -\frac{1}{z + t} + \frac{d + t}{2(z + t)^2} + \frac{1}{z} - \frac{d}{2z^2} \right]_d^{\infty} = \frac{1}{6}\pi^2 \rho_{sph}\rho_{mem} C R \left( \frac{1}{d + t} - \frac{1}{d} \right)    (A.53)

Again, for small membrane thickness t \to 0, the interaction distance dependence is \sim 1/d^2, which is again 1/d weaker (more short-ranged) than the interaction with an infinite wall. The last interaction we will calculate is the interaction between two membranes, where the energy per unit area is:

E = \frac{1}{6}\pi \rho_{mem1}\rho_{mem2} C \int_d^{d+t} \left[ \frac{1}{(z + t)^3} - \frac{1}{z^3} \right] dz    (A.54)
= \frac{\pi \rho_{mem1}\rho_{mem2} C}{12} \left[ \frac{2}{(d + t)^2} - \frac{1}{(d + 2t)^2} - \frac{1}{d^2} \right]    (A.55)

The interaction energy in the limit of small membrane thickness is \sim 1/d^4.

Depletion interactions in solution of ideal chains

Let's have a system consisting of hard spheres in a solution of ideal chains. A solution of ideal chains behaves like an ideal gas with additional internal conformational freedom. The number of chain conformations is restricted in the vicinity of a hard surface. To simplify, we can use the Asakura and Oosawa assumption that each chain can be approximated as a hard sphere of radius equal to the radius of gyration R_g of the given ideal chain (R_g^2 = \frac{l^2}{6}(N - 1/N)). The depleted volume of a single hard sphere will then be a sphere with radius R + R_g. If two spheres are separated by a distance d < 2R_g, their depleted volumes will overlap. The change in depleted volume is responsible for the depletion force, which is pushing the particles together.
In other words, the force exists due to the osmotic pressure of the ideal chains, which is approximated as that of an ideal gas, p_{id} = nRT/V = \rho kT (constant for a given density). The force is equal to the osmotic pressure times the area A(z), which is the area of the largest axial circle within the overlap.

E_{depletion} = -\rho kT \int_0^{2R_g - d} A(z)\, dz = -\rho kT\, V_o(d),    (A.56)

where V_o(d) is the overlap volume made of two spherical caps with total volume:

V_o = \frac{\pi}{12}(6R + 4R_g + d)(2R_g - d)^2    (A.57)

The depletion energy is thus:

E_{depletion} = -\frac{\pi}{12}(6R + 4R_g + d)(2R_g - d)^2 \rho kT    (A.58)

We can understand the depletion energy in terms of the entropy gain of an ideal gas whose volume increases by the overlap volume. The Asakura and Oosawa approximation was numerically confirmed to be quite good. Note that the depletion interaction is not pairwise additive. The overlap of three spheres is generally not the sum of the pair overlaps, since there could be a region where all three spheres overlap. If we used the sum of the pair overlaps, the three-sphere overlap would be counted twice, so the three-body correction to the sum is positive (the force is repulsive). This non-additivity is important when R_g/R is large.

Dissociation constants for amino acids

K_a is the dissociation constant of an amino acid that can be protonated or deprotonated, defined as:

K_a = \frac{[X^{-/0}][H^+]}{[HX^{0/-}]} = e^{-\frac{\mu^o(X^{-/0}) + \mu^o(H^+) - \mu^o(HX^{0/-})}{kT}},    (A.59)

with pK_a = -\log_{10} K_a.

Amino Acid    pKa
Asp (D)       3.9
Glu (E)       4.3
His (H)       6.08
Cys (C)       8.28
Tyr (Y)       10.1
Lys (K)       10.5
Arg (R)       12.0

Derivation of equilibrium equation between aggregates of different sizes

Let's consider a chemical equation where two molecules form a dimer:

A + A \rightleftharpoons A_2    (A.60)

The equilibrium constant of this reaction is:

K = \frac{c_{A_2} c^o}{c_A^2} = \exp\left( -\frac{G^o_{A_2} - 2G^o_A}{kT} \right)    (A.61)

The dimer concentration c_{A_2} can be transformed into the concentration of single A molecules in the dimer as c_{A\,in\,A_2} = 2c_{A_2}. Similarly, the free energy of the dimer can be split into single-molecule contributions as \Delta G^o_{A_2} = 2\Delta G^o_{A\,in\,A_2}. The above equilibrium constant equation can thus be rewritten:

\frac{1}{2}\frac{c_{A\,in\,A_2} c^o}{c_A^2} = \exp\left( -2\frac{G^o_{A\,in\,A_2} - G^o_A}{kT} \right)    (A.62)

We can use the definition of the chemical potential µ as the free energy per mole and the activity a to get:

\frac{1}{2}\frac{a_{A\,in\,A_2}}{a_A^2} = \exp\left( -2\frac{\mu^o_{A\,in\,A_2} - \mu^o_A}{RT} \right)    (A.63)

where R is the gas constant (R = N_A k). Taking the logarithm of the equation, we obtain:

\ln\left( \frac{1}{2} a_{A\,in\,A_2} \right) - \ln a_A^2 = -2\frac{\mu^o_{A\,in\,A_2} - \mu^o_A}{RT}

-RT\ln\left( \frac{1}{2} a_{A\,in\,A_2} \right) + 2RT\ln a_A = 2(\mu^o_{A\,in\,A_2} - \mu^o_A)

\mu^o_A + RT\ln a_A = \mu^o_{A\,in\,A_2} + \frac{1}{2}RT\ln\left( \frac{1}{2} a_{A\,in\,A_2} \right)    (A.64)

In the same way we can derive the above equation for larger aggregates: trimer, tetramer, ..., N-mer:

NA \rightleftharpoons A_N    (A.65)

\mu^o_A + RT\ln a_A = \mu^o_{A\,in\,A_N} + \frac{RT}{N}\ln\left( \frac{1}{N} a_{A\,in\,A_N} \right)    (A.66)

Since this formula holds for any size N, we can write the final equilibrium equation between aggregates of different sizes M and N as:

\mu^o_{A\,in\,A_N} + \frac{RT}{N}\ln\left( \frac{1}{N} a_{A\,in\,A_N} \right) = \mu^o_{A\,in\,A_M} + \frac{RT}{M}\ln\left( \frac{1}{M} a_{A\,in\,A_M} \right)    (A.67)

Distribution of finite size aggregates

The molar fraction of molecules in an N-aggregate, Eq. 7.27, for the threshold value d = 1 simplifies to:

x_{1N} = N [x_1 B]^N B^{-1}    (A.68)

We can calculate which aggregate contains the most molecules by finding the extremum \partial x_{1N}/\partial N = 0. In order to calculate this, it is useful to express the molar fraction of monomers x_1 in terms of the total molar fraction x:

x = \sum_{N=1}^{\infty} x_{1N} = \sum_{N=1}^{\infty} N [x_1 B]^N B^{-1} = \frac{x_1}{(1 - x_1 B)^2}    (A.69)

where we used the sum \sum_{N=1}^{\infty} N a^N = \frac{a}{(1-a)^2}.
Solving this quadratic equation we obtain:

x_1 B = \frac{2xB + 1 - \sqrt{4xB + 1}}{2xB}    (A.70)

The molar fraction of molecules in the N-aggregate can be rewritten as:

x_{1N} = N \left( \frac{2xB + 1 - \sqrt{4xB + 1}}{2xB} \right)^N B^{-1}    (A.71)

For large concentrations well above the CMC, x \gg CMC = \exp\left( \frac{\mu^o_{1N} - \mu^o_1}{RT} \right) = B^{-1}, we can simplify the formula and use the approximation for large N:

x_{1N} = N \left( 1 - \frac{1}{\sqrt{xB}} \right)^N B^{-1} \approx N B^{-1} \exp\left( -\frac{N}{\sqrt{xB}} \right)    (A.72)

The distribution of aggregates thus has the shape \sim N e^{-N}. The most molecules are in the aggregate of size N_max:

\frac{\partial x_{1N}}{\partial N} \propto \exp\left( -\frac{N}{\sqrt{xB}} \right) - \frac{N}{\sqrt{xB}} \exp\left( -\frac{N}{\sqrt{xB}} \right) = 0    (A.73)

N_{max} = \sqrt{xB}    (A.74)

Fluctuation-dissipation theorem

The fluctuation-dissipation theorem relates the macroscopic dissipation of energy (friction or resistive force) to the system fluctuations in equilibrium. Together with linear response theory, it relates the system's response to applied perturbations to its equilibrium fluctuations.

Brownian motion (Einstein 1905): Let's have a potential V(x) that makes particles flow with drift velocity

v_d = -\mu \frac{dV}{dx} = \mu F,    (A.75)

where µ is the mobility of the particles, which relates the drift velocity to the applied force. The net flow is then:

j = -D\frac{\partial c}{\partial x} + v_d c = -D\frac{\partial c}{\partial x} - \mu \frac{dV}{dx} c    (A.76)

In equilibrium the flow vanishes. The solution is then c \sim \exp(-\mu V/D). At constant temperature the concentration has to be proportional to the Boltzmann factor c \sim \exp(-V/kT), so we obtain:

\frac{\mu}{D} = \frac{1}{kT}    (A.77)

The Einstein result is then:

\mu = \frac{D}{kT} = \frac{1}{\gamma}    (A.78)

or, using the Green-Kubo expression,

\mu = \frac{D}{kT} = \frac{1}{kT} \int_0^{\infty} \langle v(t) v(0) \rangle\, dt    (A.79)

Thermal noise and dissipation: We start from the Langevin equation and find the relation between the friction and the stochastic force:

m\frac{d^2 x}{dt^2} = f_c - \gamma\frac{dx}{dt} + f_s    (A.80)

Let's assume that the stochastic force f_s is a white noise (no correlations in time or space). This means:

\langle f_s(t) \rangle = 0    (A.81)
\langle f_s(t) f_s(t') \rangle = 2A\,\delta(t - t')    (A.82)
\langle f_s(t) v(0) \rangle = 0    (A.83)

where A is the amplitude of the force, related to the strength of the collisions. To find the solution of the Langevin equation we will assume f_c = 0, i.e. no conservative forces on the nanoparticle:

m\frac{dv}{dt} = -\gamma v + f_s    (A.84)

The solution is then:

m v(t) = m v(0) e^{-\frac{\gamma}{m}t} + \int_0^t f_s(t') e^{-\frac{\gamma}{m}(t - t')}\, dt'    (A.85)

We can combine the solution with another piece of available information about the velocity, the equipartition theorem \langle v^2 \rangle = kT/m. Therefore we have to calculate \langle v^2 \rangle:

\langle m^2 v^2(t) \rangle = \left\langle \left( m v(0) e^{-\frac{\gamma}{m}t} + \int_0^t f_s(t') e^{-\frac{\gamma}{m}(t - t')}\, dt' \right)^2 \right\rangle    (A.86)
= m^2 v^2(0) e^{-2\frac{\gamma}{m}t} + 0 + \left\langle \left( \int_0^t f_s(t') e^{-\frac{\gamma}{m}(t - t')}\, dt' \right)^2 \right\rangle    (A.87)

The middle term vanishes because \langle f_s(t) v(0) \rangle = 0. Let's solve the squared integral separately:

\left\langle \int_0^t f_s(t') e^{-\frac{\gamma}{m}(t - t')}\, dt' \int_0^t f_s(t'') e^{-\frac{\gamma}{m}(t - t'')}\, dt'' \right\rangle    (A.88)
= \int_0^t \int_0^t e^{-\frac{\gamma}{m}(t - t')} e^{-\frac{\gamma}{m}(t - t'')}\, 2A\,\delta(t' - t'')\, dt'\, dt''    (A.89)
= \frac{Am}{\gamma}\left( 1 - e^{-2\frac{\gamma}{m}t} \right)    (A.90)

Thus for the squared velocity we get:

\langle m^2 v^2(t) \rangle = m^2 v^2(0) e^{-2\frac{\gamma}{m}t} + \frac{Am}{\gamma}\left( 1 - e^{-2\frac{\gamma}{m}t} \right)    (A.91)

In the limit of infinite time we can relate this to the equipartition theorem:

\lim_{t \to \infty} \langle v^2(t) \rangle = \frac{A}{\gamma m} = \frac{kT}{m}    (A.92)

Therefore, the amplitude is related to the friction as:

A = kT\gamma    (A.93)

A more general solution in n dimensions would be:

\gamma = \frac{1}{nkT} \int_0^{\infty} \langle f_s(t) f_s(0) \rangle\, dt    (A.94)

Introduction to Metropolis Monte Carlo method

The Monte Carlo method is a stochastic method used to evaluate multidimensional integrals, solve differential equations, etc. In the context of molecular simulations, it is used to sample the configuration space of a studied system, leading to the evaluation of the properties of interest.
For this purpose, a random generation of configurations can be improved by using the Metropolis method, where the configurations are generated with a probability proportional to the Boltzmann factor exp(−E/kT), thus obtaining the relevant conformations of the system at the given temperature. The sampling starts from an initial configuration with energy E_old. A new configuration with energy E_new is generated by a move (for instance a small displacement), and this new configuration is accepted or rejected based on the transition probability. There are several options for calculating the transition probability depending on the particular move, but in general the transition probability has to keep the system in equilibrium once it is reached. In other words, once the system is in equilibrium, the number of visits to particular configurations is proportional to the configuration probability. We usually use the much stronger condition of detailed balance, where the average number of moves from the old state to the new one is equal to the average number of reverse moves. This can be formulated as:

P(old)\, a(old \to new) = P(new)\, a(new \to old)    (A.95)

where a is the transition probability and P the configuration probability. The ratio of transition probabilities is then:

\frac{a(old \to new)}{a(new \to old)} = \frac{P(new)}{P(old)}    (A.96)

Each transition probability a consists of the probability of generating the given move, α, and the probability of accepting such a move, p. In most of the common methods α(old → new) = α(new → old), and hence we can write the acceptance ratio:

\frac{p(old \to new)}{p(new \to old)} = \frac{P(new)}{P(old)}    (A.97)

There are other formulas that satisfy this relation, but the standard Metropolis condition is:

p(old \to new) = \frac{P(new)}{P(old)} \quad \text{if } P(new) < P(old); \qquad = 1 \quad \text{if } P(new) \geq P(old)    (A.98)

We will demonstrate this method on a simple system. Assume that we have a system in an NVT ensemble with two particles A, B moving on the x-axis with a simple displacement move. The displacement is ±1 with the same probability in both directions. The configuration probability is then P \sim \exp(-E/kT), and the probabilities of the displacement are:

p(old \to new) = e^{-\frac{E_{new} - E_{old}}{kT}} \quad \text{if } E_{new} > E_{old}; \qquad = 1 \quad \text{if } E_{new} \leq E_{old}    (A.99)

This can also be written as:

p(old \to new) = \min\left[ 1, e^{-\frac{E_{new} - E_{old}}{kT}} \right]    (A.100)

The simulation algorithm is as follows:

1. randomly select a particle (A or B)
2. randomly select a direction (+ or −)
3. do a displacement of size 1 and calculate the energy of the new state
4. accept the new configuration with probability p

We can see that detailed balance is satisfied. In other words, the probability of moving particle A from position 1 to 2 is the same as moving it from position 2 to 1. Note that the full configuration probability in a canonical ensemble is

P(N) = \frac{V^N}{\Lambda^{3N} N!} e^{-E(\mathbf{r}^N)/kT}    (A.101)

since the canonical partition function is

Q = \frac{1}{\Lambda^{3N} N!} \int_{V^N} e^{-E(\mathbf{r}^N)/kT}\, d\mathbf{r}^N    (A.102)

Another note concerns a system where we move the particles left and right in the fixed sequence (A, B, A, B, A, B, ...). Detailed balance is then not satisfied, yet we can still obtain the correct sampling. However, in complex systems the situation is more complicated, and therefore the detailed balance condition is usually used.
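A minimal Python sketch of this algorithm for the two-particle example is given below; the harmonic energy function at the end is a hypothetical illustration, not part of the original example:

```python
import math, random

def metropolis_1d(energy, x, kT, n_steps=100000):
    """Metropolis sampling of particles on the x-axis (Eqs. A.99-A.100):
    pick a particle, displace it by +-1, accept with min(1, exp(-dE/kT))."""
    E_old = energy(x)
    for _ in range(n_steps):
        i = random.randrange(len(x))      # 1. randomly select a particle
        step = random.choice((-1, 1))     # 2. randomly select a direction
        x[i] += step                      # 3. trial displacement of size 1
        E_new = energy(x)
        dE = E_new - E_old
        if dE <= 0 or random.random() < math.exp(-dE / kT):
            E_old = E_new                 # 4. accept the new configuration
        else:
            x[i] -= step                  # reject: restore the old position
    return x

# Example: particles A and B in harmonic wells (assumed energy function)
harmonic = lambda x: 0.5 * (x[0] ** 2 + (x[1] - 5) ** 2)
print(metropolis_1d(harmonic, [0, 5], kT=1.0))
```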