C7790 Introduction to Molecular Modelling -1C7790 Introduction to Molecular Modelling TSM Modelling Molecular Structures Petr Kulhánek kulhanek@chemi.muni.cz National Centre for Biomolecular Research, Faculty of Science Masaryk University, Kamenice 5, CZ-62500 Brno PS/2020 Distant Form of Teaching: Rev1 Lesson 23 Molecular Mechanics I C7790 Introduction to Molecular Modelling -2- Context microworldmacroworld equilibrium (equilibrium constant) kinetics (rate constant) free energy (Gibbs/Helmholtz) partition function phenomenological thermodynamics statistical thermodynamics microstates (mechanical properties, E) states (thermodynamic properties, G, T,…) microstate ≠ microworld Description levels (model chemistry): • quantum mechanics • semiempirical methods • ab initio methods • post-HF methods • DFT methods • molecular mechanics • coarse-grained mechanics Structure EnergyFunction Simulations: • molecular dynamics • Monte Carlo simulations • docking • … C7790 Introduction to Molecular Modelling -3𝐸 𝑘(𝐑) = 𝐸 𝑏𝑜𝑛𝑑𝑠 + 𝐸 𝑎𝑛𝑔𝑙𝑒𝑠 + 𝐸𝑡𝑜𝑟𝑠𝑖𝑜𝑛𝑠 + 𝐸𝑒𝑙𝑒 + 𝐸 𝑣𝑑𝑤+. . . )()()(ˆ ekkek EH rRr RR  = Molecular Mechanics Schrodinger equation - quantum mechanical description bonded contributions non-bonded contributions Classical physics - mechanical description approximation electron motions is omitted (electron motions is implicitly included in empirical parameters) C7790 Introduction to Molecular Modelling -4- Nomenclature 𝐸 𝑘(𝐑) = 𝐸 𝑏𝑜𝑛𝑑𝑠 + 𝐸 𝑎𝑛𝑔𝑙𝑒𝑠 + 𝐸𝑡𝑜𝑟𝑠𝑖𝑜𝑛𝑠 + 𝐸𝑒𝑙𝑒 + 𝐸 𝑣𝑑𝑤+. . . https://en.wikipedia.org/wiki/Force_field_(chemistry) https://en.wikipedia.org/wiki/Coarse-grained_modeling https://en.wikipedia.org/wiki/Molecular_mechanics Molecular mechanics uses classical mechanics to model molecular systems. The Born– Oppenheimer approximation is assumed valid, and the potential energy of all systems is calculated as a function of the atomic coordinates using force fields. Coarse-grained modeling, coarse-grained models, aim at simulating the behavior of complex systems using their coarse-grained (simplified) representation. Coarse-grained models are widely used for molecular modeling of biomolecules at various granularity levels. In the context of chemistry and molecular modelling, a force field is a computational method that is used to estimate the forces between atoms within molecules and between molecules. More precisely, the force field refers to the functional form and parameter sets used to calculate the potential energy of a system of atoms or coarse-grained particles. + parameters C7790 Introduction to Molecular Modelling -5Bonded Contributions C7790 Introduction to Molecular Modelling -6Bonded Contributions Bond stretching Angle bending Bond rotation empirical parameters Main contributions C7790 Introduction to Molecular Modelling -7Bond stretching Harmonic potential 𝐸 𝑏(𝑟) = 1 2 𝐾 𝑟 − 𝑟0 2 ➢ The simplest description of a valence bond deformation is provided by the harmonic approximation. ➢ Valence bond deformation is then effectively described in the limit of Hooke's law. ➢ Hooke's law is a law of physics that states that the force (𝐹𝑠) needed to extend or compress a spring by some distance (Δ𝑟) scales linearly with respect to that distance. The scaling factor (𝐾) is a stiffness constant. 𝑑𝐸 𝑏(𝑟) 𝑑𝑟 = −𝐹 = 𝐹𝑠 = 𝐾 𝑟 − 𝑟0 𝐹𝑠 = 𝐾Δ𝑟 restoring force deformation force equilibrium bond distance stiffness (force constant) ➢ Higher order potentials such as a cubic potential can be used, but they exhibit catastrophic behavior at longer stretching distances. Why? C7790 Introduction to Molecular Modelling -8Bond stretching, cont. Morse potential ( ) ( )2 0 1)( rra e eDrV −− −= Harmonic potential ( )2 0 2 1 )( rrKrV −= ➢ The harmonic potential does not have dissociation limit. Thus, force fields employing the harmonic approximation cannot describe reactivity. ➢ The reactivity is difficult to describe even with Morse potential properly, a noticeable exceptions are ReaxFF (reactive force field) and EVB (empirical valence bond). Disadvantage of Morse potential • more parameters are needed • more computationally demanding C7790 Introduction to Molecular Modelling -9𝐸 𝑎 = 1 2 𝐾 Θ − Θ0 2 Angle bending ➢ The simplest description of a valence angle deformation is provided by the harmonic approximation similarly to a valence bond. equilibrium valence anglestiffness (force constant) ➢ Alternative forms ➢ GROMOS Forcefield ➢ Higher order potentials ➢ Cross-terms, which describes coupling between bond stretching and angle bending. 𝐸 𝑎 = 1 2 𝐾[cos(Θ) − cos(Θ0)]2 this form avoids singularities in gradient calculations for straight angles (180°) cos(Θ)= 𝒗 𝐵𝐴. 𝒗 𝐵𝐶 𝒗 𝐵𝐴 𝒗 𝐵𝐶 𝒗 𝐵𝐴 𝒗 𝐵𝐶 B A C Θ C7790 Introduction to Molecular Modelling -10Bond rotation A B C D 𝜑 Torsion angle (dihedral angle) is signed angle between two planes A-B-C and B-C-D. cos(Θ)= 𝒏 𝐴𝐵𝐶. 𝒏 𝐵𝐶𝐷 𝒏 𝐴𝐵𝐶 𝒏 𝐵𝐶𝐷 (+) clock-wise(-) anti clock-wise only [0, p] [0, p][-p, 0] normal vectors (perpendicular to the plane) 𝒏 𝐴𝐵𝐶 = 𝒗 𝐵𝐴 × 𝒗 𝐵𝐶 Bond rotations can be described by torsion (dihedral) angles. C7790 Introduction to Molecular Modelling -11➢ Torsional energy can be approximated by a Fourier series. ➢ Due to computational complexity, the length of the Fourier series is truncated. Typical lengths are up 𝐾 = 4. Bond rotation, cont. 𝐸𝑡 = ෍ 𝑛=1 𝐾 𝑉𝑛 2 [1 − cos(𝑛𝜑 − 𝛾𝑛)] phaseamplitude length of the Fourier series This equation is for a single torsion!!!! torsional energy is a periodic function C7790 Introduction to Molecular Modelling -12➢ Angle bending and bond rotations are not able to properly describe out-of-plane deformations. ➢ Out-of-plane deformations are observed in planar parts of molecules such as aromatic rings or peptide bonds (conjugated system). ➢ There are several geometrical descriptions of out-of-plane deformations (deviation from the plane or improper torsion). ➢ The deformation energy is expressed as either harmonic potential or a Fourier series. Out-of-plane deformations A B D 𝜑 C Planes: A-B-C B-C-D improper torsion C7790 Introduction to Molecular Modelling -13- Non-Bonded (Non-covalent) Contributions C7790 Introduction to Molecular Modelling -14𝐸 𝑣𝑑𝑤 = ෍ 𝑖=1 𝑁 ෍ 𝑗=𝑖+1 𝑁 4𝜀𝑖𝑗 𝜎𝑖𝑗 𝑟𝑖𝑗 12 − 𝜎𝑖𝑗 𝑟𝑖𝑗 6 𝐸𝑒𝑙𝑒 = ෍ 𝑖=1 𝑁 ෍ 𝑗=𝑖+1 𝑁 1 4𝜋𝜀 𝑜 𝑞𝑖 𝑞 𝑗 𝑟𝑖𝑗 Non-bonded Contributions Electrostatic interactions van der Waals interactions empirical parameters N – number of atoms Main contributions C7790 Introduction to Molecular Modelling -15Excluded Interactions ➢ Non-bonded interactions are computed between ALL atoms except of those pairs, which are explicitly described by bonded terms. ➢ Non-bonded interactions are thus both of intra and intermolecular origin. ➢ Excluded interactions are: ➢ 1-2 (bond stretching) ➢ 1-2-3 (angle bending) ➢ 1-2-3-4 (bond rotations) are not fully excluded but instead they are scaled C7790 Introduction to Molecular Modelling -16Electrostatic interactions 𝐸𝑒𝑙𝑒 = ෍ 𝑖=1 𝑁 ෍ 𝑗=𝑖+1 𝑁 1 4𝜋𝜀 𝑜 𝑞𝑖 𝑞 𝑗 𝑟𝑖𝑗 ➢ Electron density and positively charged nuclei of atoms are approximated by point charges. ➢ Interaction between point charges is provided by Coulomb potential. ➢ This simple model does not describe induction and penetration effect. ➢ Penetration electrostatic energy is always attractive and is consequence of overlap of smeared electron density and point nuclear charges at short distances. However, the effect of penetration energy is absorbed in LJ potential in the simple FF. ➢ Polarizable force fields ➢ they try to include effect of induction employing polarizable electrostatic models (induced dipoles, Drude oscillator, fluctuating charges, etc…) ➢ more computationally demanding, improvement is questionable C7790 Introduction to Molecular Modelling -17Point charges Atomic partial charges are not observable. Rather, they are (crude) approximation of electron density and point nuclear charges. Point charges must describe well electrostatic potential of molecules and their interactions. Sources of charges are QM calculations: ➢ Restrained ElectroStatic Potential Charges (RESP) derived from ESP charges. ➢ RESP corrects chemically non-intuitive charges on carbon atoms. ➢ RESP/ESP procedure fails on large structures with buried atoms. ➢ CM5 charges ➢ CHelpG charges ➢ DDEC6 charges ➢ others … Abdel-Azeim, S. Revisiting OPLS-AA Force Field for the Simulation of Anionic Surfactants in Concentrated Electrolyte Solutions. J. Chem. Theory Comput. 2020, 16 (2), 1136–1145. https://doi.org/10.1021/acs.jctc.9b00947. C7790 Introduction to Molecular Modelling -18ESP charges ESP charges (ElectroStatic Potential) are charges derived from electrostatic potential. The principle of charge calculation consists of two steps: 1. calculation of electrostatic potential VQM from wave function on discretized molecular envelope (set of points) 2. finding point atomic charges that create electrostatic potential VPC which is in the best agreement with the quantum mechanical potential (least squares method) pQMV ,→ kpPC qV , ( ) min! 2 ,, =−p pPCpQM VV is searched by least squares method http://biomodel.uah.es/Jmol/surfaces/inicio.htm ESP charges and their derivatives are used in molecular mechanics because, by their nature, they describe well electrostatic properties of molecules/system. C7790 Introduction to Molecular Modelling -19van der Waals interactions van der Walls interactions are very often described by Lenard-Jones potential. 𝐸 𝑣𝑑𝑤 = ෍ 𝑖=1 𝑁 ෍ 𝑗=𝑖+1 𝑁 𝜀𝑖𝑗 𝑅 𝑚,𝑖𝑗 𝑟𝑖𝑗 12 − 2 𝑅 𝑚,𝑖𝑗 𝑟𝑖𝑗 6 𝐸 𝑣𝑑𝑤 = ෍ 𝑖=1 𝑁 ෍ 𝑗=𝑖+1 𝑁 4𝜀𝑖𝑗 𝜎𝑖𝑗 𝑟𝑖𝑗 12 − 𝜎𝑖𝑗 𝑟𝑖𝑗 6 repulsive part (Pauli repulsion) attractive part (dispersion) 𝜎𝑖𝑗 𝑅 𝑚,𝑖𝑗 Alternative forms: 𝐸𝑣𝑑𝑤 = ෍ 𝑖=1 𝑁 ෍ 𝑗=𝑖+1 𝑁 𝐴𝑖𝑗 𝑟𝑖𝑗 12 − 𝐵𝑖𝑗 𝑟𝑖𝑗 6 "Better" models: ➢ exp-6 potential (however, it is unphysical at short distances) C7790 Introduction to Molecular Modelling -20Combining rules In computational chemistry and molecular dynamics, the combination rules or combining rules are equations that provide the interaction energy between two dissimilar nonbonded atoms, usually for the part of the potential representing the van der Waals interaction. Combining rules considerably reduce the number of needed parameters. https://en.wikipedia.org/wiki/Combining_rules Lorentz-Berthelot rules 𝜀𝑖𝑗 = 𝜀𝑖𝑖 𝜀𝑗𝑗 𝜎𝑖𝑗 = 𝜎𝑖𝑖 + 𝜎𝑗𝑗 2 arithmetic mean geometric mean LB rules are rather inaccurate. However, their success relies on the fact that the number of unlike interactions is considerably higher that between like atoms. Then, like parameters are optimized in such a way that resulting unlike parameters performs well. Side outcome is that like parameters alone do not perform well. C7790 Introduction to Molecular Modelling -21FF Parameters C7790 Introduction to Molecular Modelling -22Atom Types ➢ Force Filed parameters are required for: ➢ each valence bond (Kb, r0) ➢ each valence angle (Ka, Q0) ➢ each torsion angle (series of Vn, g) ➢ each atom (qi, eii, sii) Atom Types: ➢ Atoms types group atoms within the similar chemical environment This is impractical due large number of required parameters!!! solution PARM99 for DNA,RNA,AA, organic molecules, TIP3P wat. Polariz.& LP incl.02/04/99 C 12.01 0.616 ! sp2 C carbonyl group CA 12.01 0.360 sp2 C pure aromatic (benzene) CT 12.01 0.878 sp3 aliphatic C CV 12.01 0.360 sp2 arom. 5 memb.ring w/1 N and 1 H (HIS) CW 12.01 0.360 sp2 arom. 5 memb.ring w/1 N-H and 1 H (HIS) C* 12.01 0.360 sp2 arom. 5 memb.ring w/1 subst. (TRP) CY 12.01 0.360 nitrile C (Howard et al.JCC,16,243,1995) CZ 12.01 0.360 sp C (Howard et al.JCC,16,243,1995) H 1.008 0.161 H bonded to nitrogen atoms HC 1.008 0.135 H aliph. bond. to C without electrwd.group H1 1.008 0.135 H aliph. bond. to C with 1 electrwd. group HA 1.008 0.167 H arom. bond. to C without elctrwd. groups C7790 Introduction to Molecular Modelling -23Assigning Parameters - I input molecule atom types template library (atom types + partial charges) topology coordinates Force field (FF parameters) partial charges atom names -> atom types predefined template libraries for components of biomolecular structures ➢ aminoacid residues ➢ nucleotides C7790 Introduction to Molecular Modelling -24Assigning Parameters - II input molecule atom types topology coordinates external QM / internal EEM calculations Force field (FF parameters) partial charges ➢ typically, small organic molecules ➢ partial atomic charges are calculated independently C7790 Introduction to Molecular Modelling -25Parameter optimization Parameters can be derived from: ➢ experimental data ➢ bulk properties (density, etc.) ➢ thermodynamics (binding affinities) ➢ experimental geometries (equilibrium bond lengths and valence angles) ➢ vibrational spectra (force constants) ➢ computational data (QM calculations) ➢ geometry (equilibrium values) ➢ vibrational analysis (force constants) ➢ potential energy scans for bond rotations Transferability of parameters is key for general use in modelling. C7790 Introduction to Molecular Modelling -26Common Force Fields AMBER (Assisted Model Building and Energy Refinement) – widely used for proteins and DNA. CHARMM (Chemistry at HARvard Molecular Mechanics) – originally developed at Harvard, widely used for both small molecules and macromolecules GROMOS (GROningen MOlecular Simulation) – a force field that comes as part of the GROMOS software, a general-purpose molecular dynamics computer simulation package for the study of biomolecular systems. GROMOS force field A-version has been developed for application to aqueous or apolar solutions of proteins, nucleotides, and sugars. A B-version to simulate gas phase isolated molecules is also available. MMFF (Merck Molecular Force Field) – developed at Merck for a broad range of molecules. OPLS (Optimized Potential for Liquid Simulations) (variants include OPLS-AA, OPLS-UA, OPLS-2001, OPLS-2005, OPLS3e, OPLS4) – developed by William L. Jorgensen at the Yale University Department of Chemistry. GAFF – a general AMBER force field was developed for rational drug design. GAFF is compatible with the AMBER force field, and it has parameters for almost all the organic molecules made of C, N, O, H, S, P, F, Cl, Br and I. As a complete force field, GAFF is suitable for study of a great number of molecules (such as database searching) in an automatic fashion. https://en.wikipedia.org/wiki/Force_field_(chemistry)