Molecular Dynamics 2 Molecular Dynamics ● is computer simulation of physical movements by atoms and molecules ● is frequently used in the study of proteins and biomolecules, as well as in materials science ● is a specialized discipline of molecular modeling and computer simulation based on statistical mechanics Wikipedia 3 Molecular Dynamics Biological molecules exhibit a wide range of time scales over which specific processes occur; for example – Local Motions (0.01 to 5 Å, 10-15 to 10-1 s) ● Atomic fluctuations ● Sidechain Motions ● Loop Motions – Rigid Body Motions (1 to 10Å, 10-9 to 1s) ● Helix Motions ● Domain Motions (hinge bending) ● Subunit motions – Large-Scale Motions (> 5Å, 10-7 to 104 s) ● Helix coil transitions ● Dissociation/Association ● Folding and Unfolding 4 Molecular Dynamics ● Molecular dynamics simulations permit the study of complex, dynamic processes that occur in biological systems. These include, for example, ● Protein stability ● Conformational changes ● Protein folding ● Molecular recognition: proteins, DNA, membranes, complexes ● Ion transport in biological systems ● and provide the mean to carry out the following studies, ● Drug Design ● Structure determination: X-ray and NMR 5 Molecular Dynamics ● First macromolecular MD simulation published (1977, Size: 500 atoms, Simulation Time: 9.2 ps=0.0092 ns, Program: CHARMM precursor) Protein: Bovine Pancreatic Trypsine Inhibitor. This is one of the best studied proteins in terms of folding and kinetics. Its simulation published in Nature magazine paved the way for understanding protein motion as essential in function and not just accessory. 6 Molecular Dynamics ● MD simulation of the complete satellite tobacco mosaic virus (STMV) (2006, Size: 1 million atoms, Simulation time: 50 ns, program: NAMD) This virus is a small, icosahedral plant virus which worsens the symptoms of infection by Tobacco Mosaic Virus (TMV). Molecular dynamics simulations were used to probe the mechanisms of viral assembly. The entire STMV particle consists of 60 identical copies of a single protein that make up the viral capsid (coating), and a 1063 nucleotide single stranded RNA genome. One key finding is that the capsid is very unstable when there is no RNA inside. The simulation would take a single 2006 desktop computer around 35 years to complete. It was thus done in many processors in parallel with continuous communication between them 7 MD ensembles ● Microcanonical ensemble (NVE) : The thermodynamic state characterized by a fixed number of atoms, N, a fixed volume, V, and a fixed energy, E. This corresponds to an isolated system. ● Canonical Ensemble (NVT): This is a collection of all systems whose thermodynamic state is characterized by a fixed number of atoms, N, a fixed volume, V, and a fixed temperature, T. ● Isobaric-Isothermal Ensemble (NPT): This ensemble is characterized by a fixed number of atoms, N, a fixed pressure, P, and a fixed temperature, T. ● Grand canonical Ensemble (mVT): The thermodynamic state for this ensemble is characterized by a fixed chemical potential, m, a fixed volume, V, and a fixed temperature, T. 8 MD – how it works 9 Molecular Dynamics 0 ps 1 ps Time2 ps 1 MD step = 0.002 ps = 2 fs 500 steps 500 steps 1 snapshot 1 snapshot Saved each 500 step ... ... .. . MD trajectorie = set of snapshots 2 ns 10 Molecular Dynamics - trajectory MD trajectorie = set of snapshots 0 ps 200 ps 400 ps 600 ps 800 ps 1000 ps 1200 ps 1400 ps 1600 ps 1800 ps 11 MD – average structure .. . 12 MD – structural changes 13 MD – math behind Newton’s equation of motion is given by where Fi is the force exerted on particle i, mi is the mass of particle i and ai is the acceleration of particle i. The force can also be expressed as the gradient of the potential energy, Combining these two equations yields where V is the potential energy of the system. Newton’s equation of motion can then relate the derivative of the potential energy to the changes in position as a function of time. 14 MD – math behind The potential energy is a function of the atomic positions (3N) of all the atoms in the system. Due to the complicated nature of this function, there is no analytical solution to the equations of motion; they must be solved numerically. Numerous numerical algorithms have been developed for integrating the equations of motion. * Verlet algorithm * Leap-frog algorithm * Velocity Verlet * Beeman’s algorithm 15 MD – Verlet algorithm All the integration algorithms assume the positions, velocities and accelerations can be approximated by a Taylor series expansion: 16 MD – Verlet algorithm To derive the Verlet algorithm one can write Summing these two equations, one obtains The Verlet algorithm uses positions and accelerations at time t and the positions from time t-δt to calculate new positions at time t+δt. The Verlet algorithm uses no explicit velocities. The advantages of the Verlet algorithm are, i) it is straightforward, and ii) the storage requirements are modest . The disadvantage is that the algorithm is of moderate precision. 17 MD - The Velocity Verlet algorithm This algorithm yields positions, velocities and accelerations at time t. There is no compromise on precision. 18 MD – Beeman´s algorithm This algorithm is closely related to the Verlet algorithm The advantage of this algorithm is that it provides a more accurate expression for the velocities and better energy conservation. The disadvantage is that the more complex expressions make the calculation more expensive. 19 MD – The leap-frog algorithm In this algorithm, the velocities are first calculated at time t+1/2δt; these are used to calculate the positions, r, at time t+δt. In this way, the velocities leap over the positions, then the positions leap over the velocities. The advantage of this algorithm is that the velocities are explicitly calculated, however, the disadvantage is that they are not calculated at the same time as the positions. The velocities at time t can be approximated by the relationship: 20 MD – Periodic boundary conditions 21 MD – short range interactions The pairwise interactions for a single particle can be computed by a) computing the interaction to all other particles or b) by dividing the domain into cells with an edge length of at least the cut-off radius of the interaction potential and computing the interaction between the particle and all particles in the same (red) and in the adjacent (green) cells. 22 MD – Long range interactions ● Particle Mesh Evald Real space Fourier space 23 MD – Solvation model ● Implicit solvent – a method of representing solvent as a continuous medium instead of individual “explicit” solvent molecules – Poisson – Boltzmann model – Generalized Born model 24 MD – Solvation model ● Explicit solvent model 25 Restraints x Constraints r0 r0 26 MD – post-simulation analysis ● MM-PBSA , MM-GBSA analysis 27 MD – post-simulation analysis ● Secondary structure of proteins – DSSP – DSSP recognizes eight types of secondary structure (each identified by its own symbol), depending on the pattern of hydrogen bonds.The 310 helix, alpha helix and pi helix are symbolized as G, H and I, respectively, and are recognized by having a repetitive sequence of hydrogen bonds in which the donor residue is three, four, or five residues later in the backbone. DSSP recognizes two types of hydrogen-bond pairs in beta sheet structures, the parallel and antiparallel bridge; a lone bridge is symbolized by B (a beta bridge), whereas longer sets of H-bond pairs of the same type are symbolized by E for extended, which is equivalent to a beta sheet. Sheets with beta bulges are likewise symbolized by E. The remaining types are T for turn (featuring a hydrogen bond typical of a helix), S for a region of high curvature (if the angle between the vector from Cαi to Cαi+2 and the vector from Cαi-2 to Cαi is less than 70°, and a blank if no other rule pertains. Additionally there is L meaning "loop" or "other" 28 MD - software ● AMBER ● NAMD ● GROMOS ● GROMACS ● YASARA ● CHARMM