C7790 Introduction to Molecular Modelling -1C7790 Introduction to Molecular Modelling TSM Modelling Molecular Structures Petr Kulhánek kulhanek@chemi.muni.cz National Centre for Biomolecular Research, Faculty of Science Masaryk University, Kamenice 5, CZ-62500 Brno PS/2020 Distant Form of Teaching: Rev1 Lesson 15 Potential Energy Surface III C7790 Introduction to Molecular Modelling -2Local vs global minimum on PES E(x) x configurations local minima global minima REMEMBER: This is 1D projection of E(R), which is a function of 3N variables (N-number of atoms). To find a global minimum, it is necessary to find ALL local minima. Due to PES complexity, this is not computationally achievable even for small systems. C7790 Introduction to Molecular Modelling -3Local vs global minimum on PES E(x) x configurations local minima representing some conformational changes nearest "global" minimum for example reactant state global minimum representing a given state for example product state REMEMBER: This is 1D projection of E(R), which is a function of 3N variables (N-number of atoms). C7790 Introduction to Molecular Modelling -4Local vs global minimum on PES Finding local minimum: ➢ it is rather simple task, which employs local geometry optimizers ➢ the success of finding of local minimum is almost always guaranteed (problematic might by shallow minima). Finding global minimum: ➢ it is VERY difficult task, which can employ deterministic and/or stochastic methods such as ➢ genetic algorithms ➢ Monte-Carlo sampling algorithms ➢ parallel tempering ➢ others ➢ the success of finding of global minimum is not guaranteed C7790 Introduction to Molecular Modelling -5Finding the optimal geometry (local minima) C7790 Introduction to Molecular Modelling -6Optimal geometry local minima C7790 Introduction to Molecular Modelling -7Optimization methods Geometry optimization methods I. zero order (energy only) ▪ downhill simplex method II. first order (energy and gradient only) ▪ steepest descent method ▪ conjugate gradient method III. second order (energy, gradient and Hessian) ▪ Newton's method IV.pseudo-second order (energy, gradient and approximate Hessian) ▪ Broyden-Fletcher-Goldfarb-Shanno method (BFGS) )(RE Task: find R such E(R) is minimum (has zero gradient) min! the most often used approach C7790 Introduction to Molecular Modelling -8Optimization methods Geometry optimization methods I. zero order (energy only) ▪ downhill simplex method II. first order (energy and gradient only) ▪ steepest descent method ▪ conjugate gradient method III. second order (energy, gradient and Hessian) ▪ Newton's method IV.pseudo-second order (energy, gradient and approximate Hessian) ▪ Broyden-Fletcher-Goldfarb-Shanno method (BFGS) Above mentioned methods (algorithms) are general for any function: The function f is called, variously, an objective function, a loss function or cost function (minimization), a utility function or fitness function (maximization), or, in certain fields, an energy function or energy functional. A feasible solution that minimizes (or maximizes, if that is the goal) the objective function is called an optimal solution. https://en.wikipedia.org/wiki/Mathematical_optimization C7790 Introduction to Molecular Modelling -9Zero-order methods Downhill simplex method (Nelder-Mead algorithm) ➢ iterative method ➢ only energy is required (no gradient or Hessian) ➢ can escape local minima and can find nearest "global" minimum https://sudonull.com/post/69185-Nelder-Mead-optimization-method-Python-implementation-example https://codesachin.wordpress.com/2016/01/16/nelder-mead-optimization/ Other zero-order methods: ➢ BOBYQA ➢ COBYLA ➢ majority of global optimizers ➢ etc. Further details: https://en.wikipedia.org/wiki/Derivative-free_optimization C7790 Introduction to Molecular Modelling -10First-order methods https://en.wikipedia.org/wiki/Gradient_descent Steepest Descent Method ➢ iterative method ➢ only gradient is required (energy is only required for monitoring) ➢ it can find a local minimum 𝑹 𝑛= 𝑹 𝑛−1 − 𝛾 𝜕𝐸(𝑹 𝑛−1) 𝜕𝑹 energy gradient"step" size Other first-order methods: ➢ conjugate-gradients ➢ etc. step size can be a constant, varying, or different for geometry domains (bonds, angles, …) ➢ Rarely used for QM as these methods require more iterations to reach a minimum than psedosecond order methods. ➢ Quite often used for MM. C7790 Introduction to Molecular Modelling -11Second-order methods 𝑹 𝑛= 𝑹 𝑛−1 − 𝛾 𝜕2 𝐸(𝑹 𝑛−1) 𝜕𝑹2 −1 𝜕𝐸(𝑹 𝑛−1) 𝜕𝑹 Newton's Method ➢ iterative method ➢ gradient and Hessian are required (energy is only required for monitoring) ➢ it can find a local minimum ➢ the method converges significantly faster (in smaller number of steps) than zeroorder or first-order methods ➢ it is EXTREMELY computationally expensive due to Hessian calculations energy gradient "step" size Hessian Solution: pseudo-second order methods (such as BFGS) ➢ initial Hessian is approximated (empirical approaches, or unit matrix) ➢ in next iterations, Hessian is updated using gradients ➢ Hessian is thus improving during optimization, which results in faster convergence in final steps https://en.wikipedia.org/wiki/Newton%27s_method_in_optimization step size can be a constant, varying, or different for geometry domains (bonds, angles, …) C7790 Introduction to Molecular Modelling -12Cartesian vs Internal Coordinates O H 1 0.974298 O 1 1.454349 2 96.868054 H 3 0.974298 1 96.868054 2 239.552651 Cartesian coordinates Internal coordinates (Z-matrix) bond length bond angle torsion angle 3N 3N-6 3N-5 Number of degrees of freedom: Number of degrees of freedom: (linear diatomic molecule) x y z O -0.180077 -0.046023 -0.062789 H 0.196208 -0.747659 0.498793 O 0.006537 1.047922 0.877207 H -0.931885 1.299156 0.951390 O O H H C7790 Introduction to Molecular Modelling -13Internal vs Cartesian coordinates Optimization in internal coordinates converges fasters than in Cartesian coordinates: ➢ Hessian in internal coordinates can naturally provide difference between force constants of different geometry parameters (bonds, angles, torsions) ➢ This property of internal coordinates allows to use different step sizes for different geometry parameters (bonds, angles, torsions). In some rare cases, optimization in internal coordinates can fail (oscillation, etc.) Potential solutions: ➢ use different optimization algorithm ➢ switch to Cartesian coordinates C7790 Introduction to Molecular Modelling -14Practical realizations Avogadro employing MM potential Nemesis employing MM potential Gaussian employing QM potential # RHF/cc-pVDZ Opt NoSymm C7790 Introduction to Molecular Modelling -15Optimization of initial model E(x) x configurations for example reactant state for example product state REMEMBER: This is 1D projection of E(R), which is a function of 3N variables (N-number of atoms).starting (built) models OK but not global minima Models derived from high resolution X-ray structures are not problematic. C7790 Introduction to Molecular Modelling -16Optimization of initial model E(x) x configurations for example reactant state for example product state starting (built) model BAD REMEMBER: This is 1D projection of E(R), which is a function of 3N variables (N-number of atoms). ➢ In silico (built) models are very susceptible to initial structure. ➢ Therefore, frequency (vibration, Hessian) analysis is a MUST to check the nature of optimized stationary point. C7790 Introduction to Molecular Modelling -17- Summary ➢ It is relatively easy to find a local minimum. ➢ All geometry optimizers stops at a stationary point (a point with zero gradient), which dos not necessarily need to represent a local minimum. ➢ Due to complexity of PES, it is important to verify a nature of found stationary point because the found geometry can represent a transition state or a higher order saddle point. ➢ This is especially important for in silico models, which usually starts a far away from optimal geometry. C7790 Introduction to Molecular Modelling -18- Homework C7790 Introduction to Molecular Modelling -19Numerical gradient calculation Forward differences Central differences x0x-1 x1x0 x1 h xEhxE xx xExE x E x )()()()()( 00 01 01 0 −+ = − − =   R h hxEhxE xx xExE x E x 2 )()()()()( 00 11 11 0 −−+ = − − =   − −R C7790 Introduction to Molecular Modelling -20Numerical gradient calculation Forward differences Central differences x0x-1 x1x0 x1 h xEhxE xx xExE x E x )()()()()( 00 01 01 0 −+ = − − =   R h hxEhxE xx xExE x E x 2 )()()()()( 00 11 11 0 −−+ = − − =   − −R it is calculated onceit is calculated for each gradient component a total of 3N + 1 energy calculations they are calculated for each gradient component a total of 6N energy calculations C7790 Introduction to Molecular Modelling -21Tasks I 1. Express the gradient of the function E(R) according to the Cartesian coordinates of both atoms. 2. The system contains 300 atoms. The calculation of its energy by the quantumchemical method takes 15 minutes. Calculation of energy and analytical gradient then 20 minutes. 1. Determine the calculation time of the numerical gradient and compare it with the calculation time of the analytical gradient. 2. Determine the calculation time of numerical Hessian, which is calculated a) from energies and b) from analytical gradients. 3. Suggest a way to speed up the calculation of the numerical gradient a Hessian. 2 0 )( 2 1 )( rrKE −=R C7790 Introduction to Molecular Modelling -22- 1) For the function below, determine the character of the points with values: a) x = 1 b) x = 0 c) x = -1 2) In what situation can the second derivative of a function be zero? 3) What is the relationship between the extent of the reaction x and reaction coordinate rC (also referred to as x)? Tasks II 33015)( 2 ++= xxxE C7790 Introduction to Molecular Modelling -23Tasks III 1. Study the mentioned local geometry optimization methods. Focus on their principle, advantages and disadvantages compared to other optimization methods. Literature: (1) Leach, A.R. Molecular Modeling: Principles and Applications, 2nd ed .; Prentice Hall: Harlow, England; New York, 2001. (2) Jensen, F. Introduction to Computational Chemistry, 2nd ed .; John Wiley & Sons:Chichester, England; Hoboken, NJ, 2007.