Volker John
Numerical Methods for Ordinary Differential Equations

1. Explicit One-Step Methods (p. 3)
2. Numerical Methods for Stiff Ordinary Differential Equations (p. 27)
3. Multi-Step Methods (p. 49)
4. Summary and Outlook (p. 73)
Appendix A: Topics on the Theory of Ordinary Differential Equations (p. 77)
References (p. 99)

Chapter 1 Explicit One-Step Methods

Remark 1.1. Contents. This course presents methods for the numerical solution of explicit systems of initial value problems for ordinary differential equations of first order
$$y'(x) = f(x, y(x)), \quad y(x_0) = y_0.$$
For the most part, only initial value problems for scalar ordinary differential equations of first order
$$y'(x) = f(x, y(x)), \quad y(x_0) = y_0, \qquad (1.1)$$
are considered, for simplicity of presentation. The extension of the results and the methods to systems is generally straightforward. It will always be assumed that the initial value problem possesses a unique solution in a neighborhood of the initial value. In applications, the independent variable is often the time. ✷

1.1 Consistency and Convergence

Definition 1.2. Grid, step size. A grid is a decomposition $I_h$ of the interval $I = [x_0, x_e]$,
$$I_h = \{x_0, x_1, \ldots, x_N = x_e\} \quad \text{with } x_0 < x_1 < \ldots < x_N.$$
The differences between neighboring grid points, $h_k = x_{k+1} - x_k$, are called step sizes. For an equidistant grid, the notation $h = h_k$ will be used for the step size, see Figure 1.1. ✷

Fig. 1.1 Equidistant grid.

Remark 1.3. Explicit and implicit methods. Let $y(x_k)$ denote the solution of (1.1) in the node $x_k$ and $y_k$ a numerical approximation of $y(x_k)$. A numerical method for the solution of (1.1) on a grid $I_h$ is called explicit if an approximation $y_{k+1}$ in $x_{k+1}$ can be calculated directly from already computed values $y_i$, $i \le k$. Otherwise, the method is called implicit. Implicit methods require in each step the solution of a generally nonlinear equation for computing $y_{k+1}$. ✷

Definition 1.4. One-step method, incremental function. A one-step method for the computation of an approximation $y_{k+1}$ of the solution of (1.1) on a grid $I_h$ has the form
$$y_{k+1} = y_k + h_k \Phi(x, y, h_k), \quad k = 0, 1, \ldots, \qquad y_0 = y(x_0). \qquad (1.2)$$
Here, $\Phi(\cdot, \cdot, \cdot)$ is called the incremental function of the one-step method. ✷

Example 1.5. One-step methods, incremental functions. The explicit or forward Euler method
$$y_{k+1} = y_k + h_k f(x_k, y_k), \quad k = 0, 1, 2, \ldots, \qquad y_0 = y(x_0),$$
is an explicit one-step method with the incremental function
$$\Phi(x, y, h_k) = f(x_k, y_k).$$
The computation of $y_{k+1}$ requires only the substitution of already computed values into the function $f(x, y)$ from the initial value problem (1.1).

The implicit or backward Euler method
$$y_{k+1} = y_k + h_k f(x_{k+1}, y_{k+1}), \quad k = 0, 1, 2, \ldots, \qquad y_0 = y(x_0),$$
is an implicit one-step method with the incremental function
$$\Phi(x, y, h_k) = f(x_{k+1}, y_{k+1}).$$
One has to solve an equation for computing $y_{k+1}$. The complexity of this step depends on $f(x, y)$. ✷

Remark 1.6.
Representation of implicit one-step methods. Explicit one-step methods require only that known values are inserted in the incremental function. Hence, their incremental function can be written finally in the form 1.1 Consistency and Convergence 5 xk+1xk ˆyk+1 y(xk+1) y(xk) lek+1 Fig. 1.2 The local error. Φ (xk, yk, hk). For the considerations in this section, one can adopt the point of view that also implicit one-step methods can be written as explicit onestep methods, because the data for the nonlinear equation are xk, yk, and hk. However, generally one does not know the concrete form of the incremental function. ✷ Example 1.7. Incremental function of the implicit Euler method. The incremental function of the implicit Euler method on an equidistant grid can be written in the form Φ (x, y, h) = f (x + h, y + hΦ (x, y, h)) , which allows formally the representation of this method as explicit one-step scheme. ✷ Definition 1.8. Local error. Let ˆyk+1 be the result of one step of an explicit one-step method (1.2) with the initial value y(xk), i.e., ˆyk+1 = y(xk) + hkΦ (xk, y(xk), hk) . Then, le (xk+1) = lek+1 = y (xk+1) − ˆyk+1 = y (xk+1) − (y(xk) + hkΦ (xk, y(xk), hk)) (1.3) is called local error, see Figure 1.2 ✷ Remark 1.9. The local error. In the literature, sometimes y (xk+1) − y(xk) hk − Φ (xk, y(xk), hk) is defined to be the local error. 6 1 Explicit One-Step Methods For the local error, one starts from the solution of the initial value problem and considers the error after one step of the numerical method. One should require for a reasonable method that the local error is small in an appropriate sense. ✷ Definition 1.10. Consistent method. Let y(x) be the solution of the initial value problem (1.1), hmax = maxk hk, and S := {(x, y) : x ∈ [x0, xe], y ∈ R} . The one-step method (1.2) is said to be consistent, if for all f ∈ C(S), which satisfy in S a Lipschitz condition with respect to y, it holds lim hmax→0 � max xk∈Ih |le (xk+1)| hk � = 0 or lim hmax→0 � max xk∈Ih |f(xk, y(xk)) − Φ (xk, y(xk), hk)| � = 0. (1.4) Both conditions are equivalent, compare Remark 1.11. ✷ Remark 1.11. Approximation of the derivative with the incremental function. For bounded incremental functions, it is obvious that the local error converges to zero if hmax → 0, because in this case it holds hk → 0 and ˆyk+1 → y(xk), such that this statement follows from (1.3). Consistency requires more, namely that the incremental function approximates the derivative of the solution sufficiently well. Applying (1.3) and (1.1) yields le (xk+1) hk = y (xk+1) − y(xk) hk − Φ (xk, y(xk), hk) ≈ y� (xk) − Φ (xk, y(xk), hk) = f (xk, yk) − Φ (xk, y(xk), hk) , compare (1.4). ✷ Example 1.12. Consistency of the explicit Euler method. For the explicit Euler method, it is Φ (xk, y(xk), hk) = f (xk, y(xk)) . Hence, condition (1.4) from Definition 1.10 is satisfied and the method is consistent. ✷ Remark 1.13. Quality of the approximation of the incremental function. For practical purposes, not only the consistency itself but the quality of the approximation of the derivative by the incremental function is essential. The quality allows a comparison of different one-step methods. For simplicity of presentation, let hk = h for all k. ✷ 1.1 Consistency and Convergence 7 Definition 1.14. Order of consistency. 
An explicit one-step method (1.2) has the consistency order p ∈ N, if p is the largest natural number such that for all functions f ∈ C(S), which satisfy a Lipschitz condition with respect to y, it holds |le (xk + h)| ≤ Chp+1 for all xk ∈ Ih, for all Ih with h ∈ (0, H], and with the constant C > 0 being independent of h. The constant C might depend on derivatives of y(x), on f(x, y), and on partial derivatives of f(x, y). ✷ Example 1.15. Order of consistency of the explicit Euler method. Consider the explicit Euler method and assume that the function y(x) is two times continously differentiable. Then, it follows with Taylor series expansion and using the differential equation that |le (xk + h)| = |y(xk + h) − ˆyk+1| = |y(xk) + hy� (xk) + h2 2 y�� (xk + θh) − y(xk) − h f (xk, y(xk)) � �� � =y � (xk) | = h2 2 � �y�� (xk + θh) � � ≤ h2 2 �y�C 2 ([x0,xe]) , with θ ∈ (0, 1). Since there is no way to replace the term on the right-hand side by a term with a larger power of h, the method has consistency order 1. ✷ Remark 1.16. Consistency and convergence. The consistency is a local property of a one-step method. For practical purposes, it is important that the computed solution converges to the analytic solution if the grid becomes finer and finer. Of course, the order of convergence is of importance, too. It will be shown that, under certain conditions, the convergence of a onestep method follows from its consistency and that the order of convergence equals the consistency order. ✷ Definition 1.17. Convergent method, order of convergence. A onestep method (1.2) converges for the initial value problem (1.1) on the interval I = [x0, xe], if for each sequence of grids {Ih} with hmax = maxhk hk → 0 for the global error e(xk, h) = y(xk) − yk, xk ∈ Ih, it follows that max xk∈Ih |e(xk, h)| → 0 for hmax → 0. The one-step method has the order of convergence p∗ , if p∗ is the largest natural number such that for all step lengths hmax ∈ (0, H], it holds |e(xk, h)| ≤ Chp ∗ max ∀ xk ∈ Ih, 8 1 Explicit One-Step Methods where C > 0 is independent of hmax. ✷ Lemma 1.18. Estimate for a sequence of real numbers. Assume that for real numbers xn, n = 0, 1, . . ., the inequality |xn+1| ≤ (1 + δ) |xn| + β holds with constants δ > 0, β ≥ 0. Then, it holds that |xn| ≤ enδ |x0| + enδ − 1 δ β, n = 0, 1, . . . . Proof. With induction, problem for exercises. � Theorem 1.19. Connection of consistency and convergence. Let y(x) be the solution of the initial value problem (1.1) with f ∈ C(S). Let a Lipschitz condition hold for the second argument of the incremental function |Φ(x, y1, h) − Φ(x, y2, h)| ≤ M |y1 − y2| ∀ (x, y) ∈ S, h ∈ (0, H], (1.5) with M ∈ R fixed. Assume that for the local error the estimate |le (xk + h)| ≤ Chp+1 ∀ xk ∈ Ih, h ∈ (0, H] (1.6) is valid and assume that y0 = y(x0). Then, it follows for the global error that |e(xk+1, h)| ≤ C eM(xk+1−x0) − 1 M hp , where C is independent of h. Proof. It holds that yk+1 = yk + hΦ (xk, yk, h) , y(xk+1) = y(xk) + hΦ (xk, y(xk), h) + le � xk+1 � , k = 0, 1, . . . . Then, it follows with the triangle inequality, the assumption on the local error (1.6), and the Lipschitz condition of the incremental function (1.5) that � �e(xk+1, h) � � = � �y(xk+1) − yk+1 � � = � �y(xk) − yk + le � xk+1 � + h � Φ (xk, y(xk), h) − Φ (xk, yk, h) �� � = � �e(xk, h) + le � xk+1 � + h � Φ (xk, y(xk), h) − Φ (xk, yk, h) �� � ≤ |e(xk, h)| + � �le � xk+1 �� � + h |Φ (xk, y(xk), h) − Φ (xk, yk, h)| ≤ |e(xk, h)| + Ch p+1 + hM |y(xk) − yk| = (1 + hM) |e(xk, h)| + Ch p+1 . 
This sequence of inequalities has the form that was considered in Lemma 1.18. One obtains with e(x0) = 0 1.2 Explicit Runge–Kutta Schemes 9 � �e(xk+1, h) � � ≤ e (k+1)hM |e(x0)| + C e (k+1)hM − 1 hM h p+1 = C e M(xk+1−x0) − 1 M h p . � Remark 1.20. To Theorem 1.19. • The consideration of a constant step length is only for simplicity of presentation. The result of the theorem holds also for non-constant step lengths with h = maxk hk. • One-step methods compute an approximation yk of the solution in the grid points xk, k = 0, 1, . . . , N. To enable a better comparison with the analytic solution, one connects these points linearly from (xk, yk) to (xk+1, yk+1). In this way, one obtains a piecewise linear approximation of the solution that is defined on [x0, xe]. This function is called yh (x). The considerations from above can be extended to yh (x). ✷ 1.2 Explicit Runge–Kutta Schemes Remark 1.21. Idea. The Euler methods are only of first order. The idea of Runge1 –Kutta2 methods consists in using an incremental function Φ(x, y, h) that is a linear combination of values of f(x, y) in different points. With this approach, one obtains methods of higher order for the cost of evaluating more values of f(x, y). This approach can be illustrated well at the integral equation that is equivalent to the initial value problem (1.1). For simplicity, let the right-hand side of (1.1) depend only on x. Then, the integral equation has the form y(x) = y0 + � x x0 f(t) dt. (1.7) The idea of the Runge–Kutta methods consists in approximating the righthand side by a quadrature rule, e.g., in the interval [xk, xk+1] by � xk+1 xk f(t) dt ≈ hk s� j=1 bjf � xk + cjhk � with the weights bj and the nodes xk + cjh. In the following, only hk = h for all k will be considered for the reason of simplicity. ✷ 1 Carle David Tolm´e Runge (1856 – 1927) 2 Martin Kutta (1867 – 1944) 10 1 Explicit One-Step Methods Definition 1.22. Runge–Kutta methods, increments, and stages. A Runge–Kutta method has the form yk+1 = yk + hΦ(x, y, h), k = 0, 1, . . . , y0 = y(x0), where the incremental function is defined with the help of Ki(x, y, h) = f  xk + cih, yk + h s� j=1 aijKj(x, y, h)   by Φ(x, y, h) = s� i=1 biKi(x, y, h), with c1, . . . , cs, b1, . . . , bs, aij ∈ R, i, j = 1, . . . , s. The quantities Ki(x, y, h), i = 1, . . . , s, are called increments. The natural number s ∈ N is the number of stages of the method. An equivalent definition is as follows y (i) k+1 = yk + h s� j=1 aijf � xk + cjh, y (j) k+1 � , (1.8) Φ(x, y, h) = s� i=1 bif � xk + cih, y (i) k+1 � . (1.9) The intermediate values y (i) k+1 are called stages. ✷ Remark 1.23. Butcher3 tableau. For the reason of clarity, one writes a Runge– Kutta scheme in general in form of a tableau, the so-called Butcher tableau c1 a11 a12 · · · a1s c2 a21 a22 · · · a2s c3 a31 a32 · · · a3s ... ... ... ... cs as1 as2 · · · ass b1 b2 · · · bs =⇒ c A bT . (1.10) Here, c are the nodes, A is the matrix of the method, and b are the weights. ✷ Remark 1.24. Increments and Butcher tableau. For explicit Runge–Kutta schemes, the increments can be computed one after the other 3 John C. Butcher, born. 1933 1.2 Explicit Runge–Kutta Schemes 11 K1(x, y, h) = f(xk, yk), K2(x, y, h) = f (xk + c2h, yk + ha21K1(x, y, h)) , ... Ks(x, y, h) = f  xk + csh, yk + h s−1� j=1 asjKj(x, y, h)   . (1.11) The Butcher tableau has the form 0 c2 a21 c3 a31 a32 ... ... ... ... cs as1 as2 · · · as,s−1 b1 b2 · · · bs−1 bs . ✷ Example 1.25. Explicit Euler scheme. 
The explicit Euler scheme is an explicit Runge–Kutta scheme with the Butcher tableau 0 1 . In the integral equation, the approximation � xk+1 xk f(t, y(t)) dt ≈ hf (xk, yk(xk)) is used, see the proof of the Theorem of Peano, lectures notes of Numerical Mathematics I. ✷ Theorem 1.26. Consistency of explicit Runge–Kutta schemes. Let f ∈ C(S), see Definition 1.10. An explicit Runge–Kutta scheme is consistent if and only if s� i=1 bi = 1. (1.12) Proof. From the continuity of f(x, y) and the definition (1.11) of the increments of an explicit Runge–Kutta scheme, it follows that lim h→0 Ki(x, y, h) = f(xk, y(xk)), ∀ (x, y) ∈ S, i = 1, . . . , s, for the case that the initial value of this step is yk = y(xk). The continuity of the absolute value function gives lim h→0 |f(xk, y(xk)) − Φ (xk, y(xk), h)| = lim h→0 � � � � � f(xk, y(xk)) − s� i=1 biKi(x, y, h) � � � � � 12 1 Explicit One-Step Methods = � � � � � f(xk, y(xk)) − s� i=1 bi lim h→0 Ki(x, y, h) � � � � � = � � � � � f(xk, y(xk)) � 1 − s� i=1 bi �� � � � � = 0 if and only if �s i=1 bi = 1. Hence, the condition (1.4) in Definition 1.10 is satisfied. � Theorem 1.27. Interpretation of the increments. Let for the solution of (1.1) hold y ∈ C2 ([x0, xe]), let f ∈ C(S), and let f be Lipschitz continuous in the second argument. If yk = y(xk) and ci = i−1� j=1 aij, i ≥ 2, (1.13) holds, then Ki(x, y, h) is an approximation of at least first order (of consistency) to y� (xk + cih), i.e., y� (xk + cih) − Ki(x, y, h) = O � h2 � . Proof. The proof follows by induction. i = 2. For i = 2, it follows with (1.1), the Lipschitz continuity, and Taylor series expansion that � �y � (xk + c2h) − K2(x, y, h) � � = � �f � xk + c2h, y(xk + c2h) � − f � xk + c2h, y(xk) + ha21f(xk, y(xk)) �� � ≤ L |y(xk + c2h) − y(xk) − ha21f(xk, y(xk))| = L � � �y(xk) + c2hy � (xk) + O(h 2 ) − y(xk) − ha21y � (xk) � � � = L � � �(c2 − a21)hy � (xk) + O(h 2 ) � � � . Hence, in the case c2 = a21, the error is of order O(h 2 ). i > 2. Let the asymptotic order of the errors be proved for all indices 2, . . . , i−1. Then, one gets in the same way as for i = 2 � �y � (xk + cih) − Ki(x, y, h) � � = � � � � � � f (xk + cih, y(xk + cih)) − f  xk + cih, y(xk) + h i−1� j=1 aijKj(x, y, h)   � � � � � � ≤ L � � � � � � y(xk + cih) − y(xk) − h i−1� j=1 aijKj(x, y, h) � � � � � � = L � � � � � � y(xk) + cihy � (xk) + O(h 2 ) − y(xk) − h i−1� j=1 � aij � y � (xk + cjh) + O(h 2 ) �� � � � � � � = L � � � � � � cihy � (xk) + O(h 2 ) − h i−1� j=1 � aij � y � (xk) + O(h) �� � � � � � � 1.2 Explicit Runge–Kutta Schemes 13 = L � � � � � � h  ci − i−1� j=1 aij   y � (xk) + O(h 2 ) � � � � � � . The order of the error O(h 2 ) is given, if ci = �i−1 j=1 aij. � Remark 1.28. Conditions on the coefficients for certain orders of convergence. The conditions from Theorems 1.26 and 1.27 are satisfied for many explicit Runge–Kutta schemes. The goal consists in determining the coefficients b1, . . . , bs, and aij in such a way that one obtains an order of consistency as high as possible. The consistency order of a Runge–Kutta scheme with s stages can be derived from the Taylor series expansion of the local error. Let (1.12) be valid, then one obtains, e.g., • A Runge–Kutta scheme with the parameters (A, b, c) has at least consistency order p = 2 if s� j=1 bjcj = 1 2 . (1.14) This condition will be shown in Example 1.29 for s = 2. • If in addition s� j=1 bjc2 j = 1 3 and s� j=1 bj s� k=1 ajkck = 1 6 hold, then the order of consistency is at least p = 3. 
The proof of the last statement and conditions for even higher order consistency can be found in the literature, e.g. in (Strehmel & Weiner, 1995; Strehmel et al., 2012, Section 2.4.2). ✷ Example 1.29. Runge–Kutta methods with 2 stages. For the investigation of 2-stage Runge–Kutta schemes, one considers for simplicity the so-called autonomous initial value problem y� (x) = f(y(x)), y(x0) = y0. One has for the increments K1(y, h) = f(yk), K2(y, h) = f (yk + ha21K1(yk, h)) = f (yk + ha21f(yk)) = f(yk) + ha21f(yk)∂yf(yk) + O(h2 ). If the initial value is exact, it follows for the incremental function that Φ(y(xk)) = b1K1(y, h) + b2K2(y, h) (1.15) = (b1 + b2)f(y(xk)) + hb2a21f(y(xk))∂yf(y(xk)) + O(h2 ). The Taylor series expansion of the solution has the form 14 1 Explicit One-Step Methods y(xk + h) = y(xk) + h y� (xk) � �� � =f(y(xk)) + h2 2 y�� (xk) + O � h3 � . One obtains with the chain rule y�� (x) = d dx y� (x) = d dx f(y(x)) = ∂yf(y)y� (x) = ∂yf(y)f(y(x)). Now, it follows for the local error, using Taylor series expansion and (1.15), that le(xk + h) = y(xk + h) − y(xk) − hΦ(y(xk)) = y(xk) + hf(y(xk)) + h2 2 � ∂yf(y(xk))f(y(xk)) � + O � h3 � − y(xk) −h � (b1 + b2)f(y(xk)) + hb2a21f(y(xk))∂yf(y(xk)) + O(h2 ) � = h � 1 − (b1 + b2) � f(y(xk)) + h2 � 1 2 − b2a21 � f(y(xk))∂yf(y(xk)) +O � h3 � . To achieve an order of consistency as large as possible, the first two terms have to vanish. One obtains with the condition c2 = a21 that b1 + b2 = 1, b2a21 = 1 2 ⇐⇒ b2c2 = 1 2 . The first equation is the general condition for consistency (1.12) and the second condition is exactly (1.14) for s = 2. These two conditions characterize all 2-stage explicit Runge–Kutta methods that possess consistency and convergence order 2 c2 c2 1 − 1 2c2 1 2c2 , with c2 �= 0. In the case c2 = 1/2, one obtains the method of Runge (1895) 1/2 1/2 0 1 . This method corresponds with respect to the approximation of the integral in (1.7) to the application of the mid point rule. For c2 = 1, one gets the method of Heun4 (1900) 4 Karl Heun (1859 – 1929) 1.2 Explicit Runge–Kutta Schemes 15 1 1 1/2 1/2 , which corresponds to the use of the trapezoidal rule for the numerical quadrature in (1.7). ✷ Remark 1.30. Autonomous ordinary differential equations. Every explicit first order ordinary differential equation y� (x) = f(x, y(x)) can be transformed into an autonomous form ˜y� (x) = ˜f (˜y(x)) = � f(x, y(x)) 1 � by introducing the function y(x) := x and ˜y(x) := � y(x) y(x) � and noting that (y(x), x) are just the components of ˜y(x). ✷ Theorem 1.31. Consistency and convergence of explicit Runge– Kutta methods. Let y(x) be the solution of the initial value problem (1.1) with f ∈ C(S) and let f(x, y) satisfy a Lipschitz condition in the second argument. Then, an explicit Runge–Kutta scheme that is consistent of order p converges also with order p. Proof. The incremental function of an explicit Runge–Kutta scheme is a linear combination of values of the right-hand side f(x, y). Thus, the assumptions of Theorem 1.19 are satisfied, since the Lipschitz condition in this theorem follows from the required Lipschitz condition on the right-hand side of the differential equation. The statement of the theorem follows now directly from Theorem 1.19. � Remark 1.32. Explicit Runge–Kutta methods of higher order. Analogously to 2-stage methods, it is possible to derive conditions on the coefficients of an explicit Runge–Kutta scheme in order to construct methods of higher order. 
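Before turning to higher-order schemes, it may help to see how the increment recursion (1.11) and the update formula look in code. The following sketch is an illustration only and not part of the original notes; the function name `explicit_rk_step` and the test problem are chosen here for convenience. It evaluates one step of a general explicit Runge–Kutta method from its Butcher tableau and applies the tableau of Heun's method from Example 1.29 to $y' = -y$, $y(0) = 1$.

```python
import numpy as np

def explicit_rk_step(f, x, y, h, A, b, c):
    """One step of an explicit Runge-Kutta method given by a Butcher tableau.

    Computes the increments K_i = f(x + c_i h, y + h * sum_{j<i} a_ij K_j)
    one after the other, as in (1.11), and returns y + h * sum_i b_i K_i.
    """
    s = len(b)
    K = np.zeros(s)
    for i in range(s):
        K[i] = f(x + c[i] * h, y + h * np.dot(A[i, :i], K[:i]))
    return y + h * np.dot(b, K)

# Heun's method (Example 1.29, c2 = 1), consistency and convergence order 2
A = np.array([[0.0, 0.0],
              [1.0, 0.0]])
b = np.array([0.5, 0.5])
c = np.array([0.0, 1.0])

# Test problem y' = -y, y(0) = 1, exact solution exp(-x)
f = lambda x, y: -y
for h in [0.1, 0.05, 0.025]:
    n = round(1.0 / h)
    x, y = 0.0, 1.0
    for _ in range(n):
        y = explicit_rk_step(f, x, y, h, A, b, c)
        x += h
    # error at x = 1 decreases roughly by a factor of 4 when h is halved
    print(h, abs(y - np.exp(-1.0)))
```

For systems of equations, $y$ and the increments become vectors; the structure of the loop is unchanged.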
An important question is the minimal number of stages that is necessary to be able to reach a certain order. Some answers to this question are from Butcher (1963, 1965, 1985) p 1 2 3 4 5 6 7 8 min s 1 2 3 4 6 7 9 11 . ✷ Example 1.33. Classical Runge–Kutta scheme (1901). The so-called classical Runge–Kutta scheme has four stages and the Butcher tableau 16 1 Explicit One-Step Methods 0 1/2 1/2 1/2 0 1/2 1 0 0 1 1/6 1/3 1/3 1/6 . It is based on the Simpson5 rule. The center node of the Simpson rule is used twice, c2 = c3, but with a different second argument for the computation of the increments. This method is of fourth order. ✷ 1.3 Step Length Control Remark 1.34. Motivation. The considerations so far did not provide a way for estimating a good step length for solving a given initial value problem with prescribed accuracy and with as little work as possible. • If the steps are too large, then the numerical solution might be too inac- curate. • If the steps are too small, then the numerical simulation might take much longer than necessary. A good step length depends certainly on the concrete problem and generally it will change within the considered interval. For these reasons, the step length should be controlled during the numerical simulation of the initial value problem. A typical approach consists in computing two approximations of the solution in a node with different methods and to draw conclusions on the size of the local error based on the difference of these approximations. Of course, the consideration of the global error would be better. However, Theorem 1.19 shows that on the one hand, the global error is influenced by problem-dependent terms, like the length of the interval [x0, xe] or the Lipschitz constant. On the other hand, the global error is expected to be small only if the local errors are small. ✷ 1.3.1 The Richardson Method Remark 1.35. Idea. Given a numerical method for solving an initial value problem and given a step length h. The Richardson6 method consists of the following steps, see also Figure 1.3: 5 Thomas Simpson (1710 – 1761) 6 Lewis Fry Richardson (1881 – 1953) 1.3 Step Length Control 17 y(x0 + 2h) yh x0 + hx0 x0 + 2h y0 y(x0 + h) y2h y2×h Fig. 1.3 Sketch of the Richardson method. 1. Starting from a node (x0, y0) and using a step length of 2h, an approximation y2h at the node x0 + 2h will be computed. 2. Two approximations yh and y2×h in x0 + h and x0 + 2h are computed with two steps of length h. 3. The step length will be controlled by comparing y2h and y2×h. In general, the more accurate approximation will be y2×h. In addition, it will be demonstrated that it is possible to improve the accuracy of y2×h with the information obtained by this method. ✷ Example 1.36. Richardson method for an explicit 2-stage Runge–Kutta method. Consider an explicit 2-stage Runge–Kutta scheme. One obtains in the first step of the Richardson method, using (1.8), (1.9), y (1) 2h = y0, y (2) 2h = y0 + 2ha21f(x0, y0), y2h = y0 + 2h [b1K1(x, y) + b2K2(x, y)] = y0 + 2h � b1f(x0, y (1) 2h ) + b2f � x0 + 2c2h, y (2) 2h �� , or written as Butcher tableau 0 2c2 2a21 2b1 2b2 . Note that because of the step length 2h, the weights sum up to 2. The second step of the Richardson method yields y (1) 2×h = y0, y (2) 2×h = y0 + ha21f(x0, y0), y (3) 2×h = yh = y0 + h � b1f � x0, y (1) 2×h � + b2f � x0 + c2h, y (2) 2×h �� , y (4) 2×h = yh + ha21f(x0 + h, yh), 18 1 Explicit One-Step Methods y2×h = yh + h � b1f(x0 + h, yh) + b2f � x0 + h + c2h, y (4) 2×h �� . 
Inserting the formula for yh in the last two lines, one sees that the Butcher tableau of this method is 0 c2 a21 1 b1 b2 1 + c2 b1 b2 a21 b1 b2 b1 b2 . That means, the computation of y2×h is equivalent to the computation of an approximation with the help of an explicit 4-stage Runge–Kutta scheme. Altogether, five function evaluations are needed: f(x0, y0), f � x0 + 2c2h, y (2) 2h � , f � x0 + c2h, y (2) 2×h � , f(x0 + h, yh), f � x0 + h + c2h, y (4) 2×h � . In the case of a s-stage Runge–Kutta method, (3s − 1) function evaluations are required. This number is rather large and the high costs per time step are a disadvantage of the Richardson method. ✷ Remark 1.37. Comparison of both approximations. Consider a one-step meth- od yk+1 = yk + hΦ(x, y, h) of order p. Let the initial value y(x0) be exact, then it follows for the local error in x0 + 2h that y(x0 + 2h) − y2h = C(x0)(2h)p+1 + O � hp+2 � . (1.16) For estimating the local error of y2×h it will be assumed that the incremental function Φ(x, y, h) is Lipschitz continuous in the second argument. This assumption is always satisfied for explicit Runge–Kutta schemes if f(x, y) possesses this property, see the proof of Theorem 1.31. It is y2×h = yh + hΦ (x + h, yh, h) . (1.17) Let ˆy2×h = y(x0 + h) + hΦ (x + h, y(x0 + h), h) (1.18) be the iterate that is computed with the exact starting value in x0 +h. Using the definition of the consistency order, one obtains with (1.17) and (1.18) y(x0 + 2h) − y2×h = (y(x0 + 2h) − ˆy2×h) + (ˆy2×h − y2×h) 1.3 Step Length Control 19 = � C(x0 + h)hp+1 + O � hp+2 � � + � y(x0 + h) + hΦ (x + h, y(x0 + h), h) −yh − hΦ (x + h, yh, h) � . For the terms with the incremental function, one gets from the Lipschitz continuity and the consistency order |hΦ (x + h, y(x0 + h), h) − hΦ (x + h, yh, h)| ≤ hL |y(x0 + h) − yh| � �� � O(h p+1 ) = O � hp+2 � . It follows, applying again the consistency error, that y(x0 + 2h) − y2×h = C(x0 + h)hp+1 + y(x0 + h) − yh + O � hp+2 � = C(x0 + h)hp+1 + C(x0)hp+1 + O � hp+2 � + O � hp+2 � = 2C(x0)hp+1 + O � hp+2 � , (1.19) where one assumes that C(x0 + h) = C(x0) + O(h), i.e., that the constants do not change too rapidly. Neglecting in (1.16) and (1.19) the higher order terms allows to eliminate y(x0 + 2h) and solve for the constant, yielding C(x0) = 1 2 � y2×h − y2h 2p − 1 � 1 hp+1 . (1.20) From (1.19), it follows for the local error of the more accurate method that y(x0 + 2h) − y2×h = y2×h − y2h 2p − 1 + O � hp+2 � . (1.21) The first term on the right-hand side is a computable approximation of this local error. ✷ Remark 1.38. Increasing the accuracy. Rearranging terms in (1.21) gives y(x0 + 2h) − � y2×h + y2×h − y2h 2p − 1 � = O � hp+2 � . Then, y2×h = y2×h + y2×h − y2h 2p − 1 is an approximation of the solution of order p + 1. ✷ 20 1 Explicit One-Step Methods Remark 1.39. Automatic step length control. From (1.21) and (1.20), it follows that err = |y2×h − y2h| 2p − 1 ≈ 2C(x0)hp+1 (1.22) is a computable approximation of the local error. This approximation will be compared with a prescribed tolerance. Often, a so-called scaled tolerance sc is used, (Hairer et al., 1993, p. 167) or (Strehmel et al., 2012, p. 61). The scaled tolerance is a combination of an absolute tolerance atol and a relative tolerance rtol sc = atol + max {|y0| , |y2×h|} rtol. Then, the scaled error errsc = |y2×h − y2h| (2p − 1)sc is defined. • If errsc ≤ 1 ⇐⇒ err ≤ sc, then the performed step will be accepted. Starting from y2×h or y2×h, the next step will be performed. 
If y2×h is used, this approach is also called local Richardson extrapolation. An important aspect is the choice of the step length hnew for the next step. The guideline is that the scaled error for the next step should be on the one hand still smaller than 1 but on the other hand as close to 1 as possible. Following (1.22), it should hold 1 = errnew sc = 2C (x0 + 2h) hp+1 new sc ≈ 2C(x0)hp+1 new sc = 2C(x0)hp+1 sc � hnew h �p+1 ≈ errsc � hnew h �p+1 , i.e., hnew has to be chosen such that hnew ≈ � 1 errsc �1/(p+1) h. (1.23) • If errsc > 1, then the performed step will be rejected. The Richardson method is repeated from (x0, y0) with a step length hnew < h. That means, the work that was spent for performing the step with step length h was wasted. One likes to avoid this situation. ✷ Remark 1.40. Issues of the practical implementation. In practical simulations, one uses some modifications of (1.23) for stabilizing the algorithm. • A safety factor α ∈ (0, 1) is introduced hnew = α � 1 errsc �1/(p+1) h, 1.3 Step Length Control 21 often α ∈ [0.8, 0.9]. • One likes to avoid large oscillations of the sizes of subsequent steps. For this reason, a factor for the maximal increase αmax of the new step size with respect to the current step size and a factor for the maximal decrease αmin < αmax are used. Then, one obtains hnew = h min � αmax, max � αmin, α � 1 errsc �1/(p+1) �� . If a very large step length is proposed α � 1 errsc �1/(p+1) > αmax, then the factor αmax becomes effective and similarly for the case that a very small step length is proposed. • Usually, one prescribes a minimal step length hmin and a maximal step length hmax and requires for all step lengths that hk ∈ [hmin, hmax]. • In the first step, one has to estimate h. Generally, this estimate has to be corrected. In practice, this correction is done very fast by algorithms for automatic step length control. An algorithm for determining a good initial step length can be found in (Hairer et al., 1993, p. 168). ✷ 1.3.2 Embedded Runge–Kutta Schemes Remark 1.41. Motivation, embedded Runge–Kutta schemes. Richardson extrapolation is quite expensive in terms of evaluations of the incremental function. It is possible to construct a step length control that needs less evaluations, with so-called embedded Runge–Kutta schemes. The idea of embedded Runge–Kutta schemes consists in computing numerical approximations of the solution at the next time with two one-step methods with different order. The methods are chosen in such a way that it is possible to use certain evaluations of the incremental function for both of them. That means, one has to construct a Runge–Kutta scheme of the form 0 c2 a21 ... ... cs as1 · · · as,s−1 b1 · · · bs−1 bs ˜b1 · · · ˜bs−1 ˜bs , 22 1 Explicit One-Step Methods x0 + hx0 y0 y(x0 + h) ˜y1 y1 Fig. 1.4 Sketch of embedded Runge–Kutta schemes. such that y1 = y0 + h s� i=1 biKi(x, y) is order of p and ˜y1 = y0 + h s� i=1 ˜biKi(x, y) is of order q, see Figure 1.4. In general, it is q = p − 1 or q = p + 1. ✷ Example 1.42. Runge–Kutta–Fehlberg 2(3). Consider explicit Runge–Kutta schemes with 3 stages 0 c2 a21 c3 a31 a32 b1 b2 b3 p = 2 ˜b1 ˜b2 ˜b3 q = 3 . One of the schemes should be of order 2 and the other one of third order. There are 11 parameters to choose. 
From Theorem 1.26, Theorem 1.27, and Remark 1.28, it follows that 8 equations have to be satisfied c2 = a21, c3 = a31 + a32, b1 + b2 + b3 = 1, b2c2 + b3c3 = 1 2 , ˜b1 + ˜b2 + ˜b3 = 1, ˜b2c2 + ˜b3c3 = 1 2 , 1.3 Step Length Control 23 ˜b2c2 2 + ˜b3c2 3 = 1 3 , ˜b3a32c2 = 1 6 . That means, one has to set three parameters. First, one can choose c2 = 1, b3 = 0. Then, it follows from the first equation that a21 = 1, from the fourth equation that b2 = 1/2, and from the third equation that b1 = 1/2. Now, one chooses c3 = 1/2. From the sixth and seventh equation, it follows that ˜b2 = 1/6 and ˜b3 = 4/6. Then, one gets from the fifth equation ˜b1 = 1/6 and from the eighth equation a32 = 1/4. Finally, the second equation gives a31 = 1/4. The resulting methods have the form 0 1 1 1/2 1/4 1/4 1/2 1/2 0 p = 2 1/6 1/6 4/6 q = 3 . The method with order q = 3 is the Simpson rule. The complete embedded approach is called Runge–Kutta–Fehlberg7 2(3) method (RKF 2(3)). ✷ Remark 1.43. Error estimate, theoretical drawback. By construction, it holds for the embedded scheme that y1 = y(x0 + h) + O � hp+1 � , ˜y1 = y(x0 + h) + O � hq+1 � . It follows that |err| := |˜y1 − y1| = � � �y(x0 + h) + O � hq+1 � − y(x0 + h) + O � hp+1 �� � � = � � �O � hp+1 � + O � hq+1 �� � � (1.24) is an estimate of the main error term of the Runge–Kutta scheme of order q∗ = min{p, q}. That means, one obtains only an estimate of the error of the lower order method. To obtain information only on the lower order method is the main theoretical drawback of this approach, since one is interested actually in the higher order method and one will continue the computation also from the higher order approximation. ✷ Remark 1.44. Automatic step length control, I Controller. Let h be the step size that was used for computing y1 of order p and ˜y1 of order q with p < q. From (1.24), one has |err| = |y1 − ˜y1| = Chp+1 . (1.25) 7 Erwin Fehlberg (1911 – 1972) 24 1 Explicit One-Step Methods Given a tolerance tol for the maximal local error. • One approach consists in controlling the error per step (EPS). Then, one requires that r1 = |err| ≤ tol. (1.26) If this condition is satisfied, then the current step is accepted. Next, one requires for the new step size that the local error is equal to the tolerance Chp+1 new = tol, with C from (1.25) This requirement gives hnew = � tol C �1/(p+1) = � tol Chp+1 �1/(p+1) h. With (1.25) and (1.26), the new step length is computed by hnew = � α tol |err| �1/k h = � α tol r1 �1/k h, (1.27) where k = p + 1 and α ∈ (0, 1) is again a safety factor. • Another way is the consideration of the error relative to the current step length, the so-called error per unit step (EPUS), r1 = |err| h ≤ tol. (1.28) The satisfaction of a condition of form (1.28) leads to a new step of form (1.27) with k = p. • If (1.26) or (1.28) is not satisfied, then the step is rejected and it will be repeated with a step length smaller than h. • A generalization of this approach is the so-called I Controller. Replacing in (1.27) 1/k by kI gives hnew = � α tol r1 �kI h. For obtaining a useful automatic step length control mechanism, the choice kI = 1/k or equivalently kkI = 1 is not necessary. The following choices can be found in the literature kkI ∈ [0, 2] ⇐⇒ kI ∈ [0, 2/k] stable control, kkI ∈ (1, 2) ⇐⇒ kI ∈ (1/k, 2/k) fast and oscillating control, kkI ∈ (0, 1) ⇐⇒ kI ∈ (0, 1/k) slow and smooth control, kkI = 1 ⇐⇒ kI = 1/k standard I Controller. 
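The step length selection described above can be summarized in a few lines of code. The following sketch is an illustration with assumed default values for the safety factor and the bounds (it is not taken from the notes); it implements the accept/reject decision (1.26) and a step size update in the spirit of (1.27), with the safeguards of Remark 1.40.

```python
def new_step_size(h, err, tol, p, alpha=0.85, alpha_min=0.2, alpha_max=5.0,
                  h_min=1e-10, h_max=1.0):
    """Propose a new step size from the error estimate of a method of order p.

    Uses h_new = alpha * (tol / err)**(1/(p+1)) * h, cf. (1.27) with k = p + 1
    (error per step) and the safety factor applied as a prefactor as in
    Remark 1.40, limited by alpha_min, alpha_max and by h_min, h_max.
    Returns (h_new, accepted), where accepted means err <= tol, cf. (1.26).
    """
    factor = alpha * (tol / max(err, 1e-16)) ** (1.0 / (p + 1))
    factor = min(alpha_max, max(alpha_min, factor))
    return min(h_max, max(h_min, factor * h)), err <= tol

# Example: error estimate err = |y1 - y1_tilde| from an embedded pair, lower order p = 2
h_new, accepted = new_step_size(h=0.1, err=3.0e-4, tol=1.0e-4, p=2)
print(accepted, h_new)  # step rejected; repeat from the same node with h_new < h
```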
1.3 Step Length Control 25 There are more sophisticated controllers that are used in practical simulations, see S¨oderlind (2002) for an overview. ✷ Remark 1.45. Methods used in practice. In practice, one uses, e.g., • RKF 4(5), s = 6, Fehlberg (1964), • RKF 7(8), s = 13, Fehlberg (1969), • DOPRI 4(5) (or DOPRI 5(4) or DOPRI5), s = 6, Dormand8 , Prince9 : Dormand & Prince (1980), • DOPRI 7(8), s = 13, Prince & Dormand (1981). The standard routine ode45 from MATLAB uses DOPRI 4(5). ✷ Remark 1.46. Fehlberg trick. The Fehlberg trick requires that Ks = f � xk + csh, yk + h s−1� i=1 asiKi � ! = f � xk + h, yk + h s� i=1 biKi � �� � yk+1 � , i.e., the last evaluation of the incremental function of the old step can be used as first value of the incremental function in the new step. The conditions for applying this trick are asi = bi, i = 1, . . . , s − 1, bs = 0, cs = 1. It can be applied, e.g., in DOPRI 4(5). This trick works only if hold ≈ hnew. ✷ 8 John R. Dormand 9 P. J. Prince Chapter 2 Numerical Methods for Stiff Ordinary Differential Equations 2.1 Stiff Ordinary Differential Equations Remark 2.1. Stiffness. It was observed in Curtiss & Hirschfelder (1952) that explicit methods failed for the numerical solution of initial value problems for ordinary differential equations that model certain chemical reactions. They introduced the notation stiffness for such chemical reactions where the fast reacting components arrive in a very short time in their equilibrium and the slowly changing components are more or less fixed, i.e., stiff. In 1963, Dahlquist found out that the reason for the failure of explicit Runge–Kutta methods is their bad stability, see Section 2.3. It should be emphasized that the stability properties of the equations themselves are good, it is in fact a problem of the explicit methods. There is no unique definition of stiffness in the literature. However, essential properties of stiff systems are as follows: • There exist, for certain initial conditions, solutions that change slowly. • Solutions in a neighborhood of these smooth solutions converge quickly to them. A definition of stiffness can be found in (Strehmel & Weiner, 1995, p. 202), (Strehmel et al., 2012, p. 208). This definition involves a certain norm that depends on the equation and it might be complicated to evaluate this norm. If the solution of (1.1) is sought in the interval [x0, xe] and if the right-hand side of (1.1) is Lipschitz continuous in the second argument with Lipschitz constant L, then an approximation of this definition is as follows. A system of ordinary differential equations is called stiff if L (xe − x0) � 1. (2.1) Another definition of stiffness will be given in Definition 2.28. ✷ Example 2.2. Stiff system of ordinary differential equations. Consider the sys- tem 27 28 2 Numerical Methods for Stiff Ordinary Differential Equations Fig. 2.1 Solutions of Example 2.2, left: first component, right: second component. y� 1 = −80.6y1 + 119.4y2, y� 2 = 79.6y1 − 120.4y2, in (0, 1). This system is a linear system of ordinary differential equations that can be written in the form y� = � −80.6 119.4 79.6 −120.4 � y. Taking as Lipschitz constant, e.g., the l1 norm of the system matrix (column sums), one gets L = 239.8 and condition (2.1) is satisfied. The general solution of this system is, compare Appendix A.2.3, y(x) = c1 � 3 2 � e−x + c2 � −1 1 � e−200x . The constants are determined by the initial condition. If the initial condition is such that c2 = 0, then the solution is smooth for all x > 0. 
Otherwise, if c2 �= 0, then the solutions changes rapidly for small x while approaching the smooth solution, see Figure 2.1 ✷ 2.2 Implicit Runge–Kutta Schemes Remark 2.3. Motivation. If the upper triangular part of the matrix of a Runge–Kutta method, see Definition 1.22, is not identical to zero, the Runge– Kutta method is called implicit. That means, there are increments that depend not only on previously computed increments but also on not yet computed increments. Thus, one has to solve a nonlinear problem for computing these increments. Consequently, the implementation of implicit Runge–Kutta 2.2 Implicit Runge–Kutta Schemes 29 methods is much more involved compared with the implementation of explicit Runge–Kutta methods. Generally, performing one step of an implicit method is much more time-consuming than for an explicit method. However, the great advantage of implicit methods is that they can be used for the numerical simulation of stiff systems, see the stability theory in Section 2.3. ✷ Remark 2.4. Derivation of implicit Runge–Kutta methods. Implicit Runge– Kutta schemes can be derived from the integral representation (1.7) of the initial value problem. One can show that for each implicit Runge–Kutta scheme with the weights bj and the nodes xk + cjh there is a corresponding quadrature rule with the same weights and the same nodes, see the section on Gaussian quadrature in Numerical Mathematics I. ✷ Example 2.5. Gauss–Legendre quadrature. Consider the interval [xk, xk +h] = [xk, xk+1]. Let c1, . . . , cs be the roots of the Legendre polynomial Ps(t) with the arguments t = 2 h (x − xk) − 1 =⇒ t ∈ [−1, 1]. There are s mutually distinct real roots in (−1, 1). After having computed c1, . . . , cs, one can determine the coefficients aij, bj such that one obtains a method of order 2s, see Example 2.8. ✷ Remark 2.6. Simplifying order conditions. The order conditions for an implicit Runge–Kutta scheme with s stages are the same as given in Theorems 1.26, 1.27, and Remark 1.28. These conditions lead to a nonlinear system of equations for computing the parameters of the scheme. These computations are generally quite complicated. A useful tool for solving this problem are the so-called simplifying order conditions, introduced in Butcher (1964): B(p) : s� i=1 bick−1 i = 1 k , k = 1, . . . , p, C(l) : s� j=1 aijck−1 j = 1 k ck i , i = 1, . . . , s, k = 1, . . . , l, (2.2) D(m) : s� i=1 bick−1 i aij = 1 k bj � 1 − ck j � , j = 1, . . . , s, k = 1, . . . , m, with 00 = 1. One can show that for sufficiently large values l and m, the conditions C(l) and D(m) can be reduced to B(p) with appropriate p. ✷ Remark 2.7. Interpretation of B(p) and C(l). Consider the initial value prob- lem 30 2 Numerical Methods for Stiff Ordinary Differential Equations y� (x) = f(x), y(x0) = 0. With the fundamental theorem of differential calculus, one sees that this problem has the solution y(x0 + h) = � x0+h x0 f(ξ) dξ = h � 1 0 f (x0 + hθ) dθ. A Runge–Kutta method with s stages gives y1 = h s� i=1 bif (x0 + cih) . Consider in particular the case that f(x) is a polynomial f(x) = (x−x0)k−1 , k ∈ N \ {0}. Then, the analytical solution has the form y(x0 + h) = h � 1 0 (hθ) k−1 dθ = (hθ)k k � � � � � θ=1 θ=0 = hk k . (2.3) The Runge–Kutta scheme yields y1 = h s� i=1 bi(cih)k−1 = hk s� i=1 bick−1 i . (2.4) Comparing (2.3) and (2.4), one can observe that condition B(p) means that the quadrature rule that is the basis of the Runge–Kutta method is exact for polynomials of degree (p − 1). 
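This interpretation is easy to check numerically. The following sketch is an illustration and not part of the notes; it verifies the conditions $B(k)$ for the weights and nodes of the implicit midpoint rule and of the classical Runge–Kutta scheme from Example 1.33.

```python
import numpy as np

def check_B(b, c, k_max):
    """Return the largest k <= k_max such that B(k) holds, i.e.
    sum_i b_i c_i^{k-1} = 1/k for k = 1, ..., k_max (up to round-off)."""
    b, c = np.asarray(b, float), np.asarray(c, float)
    p = 0
    for k in range(1, k_max + 1):
        if abs(np.dot(b, c ** (k - 1)) - 1.0 / k) > 1e-12:
            break
        p = k
    return p

# Implicit midpoint rule (s = 1 Gauss-Legendre scheme): B(2), order 2s = 2
print(check_B([1.0], [0.5], 6))                              # -> 2
# Classical Runge-Kutta scheme (Example 1.33): B(4)
print(check_B([1/6, 1/3, 1/3, 1/6], [0.0, 0.5, 0.5, 1.0], 6))  # -> 4
```

Note that $B(p)$ alone does not determine the order; the conditions $C(l)$ and $D(m)$ constrain the matrix $A$ as well.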
Condition C(1) is (1.13) with the upper limit s ci = s� j=1 aij, i = 1, . . . , s. (2.5) ✷ Example 2.8. Classes of implicit Runge–Kutta schemes. • Gauss–Legendre schemes. The nodes of the Gauss–Legendre quadrature are used. A method with s stages possesses the maximal possible order 2s, where all nodes are in the interior of the intervals. To get the optimal order, one has to show that B(2s), C(s), D(s) are satisfied, see (Strehmel et al., 2012, Section 8.1.2), i.e., s� i=1 bick−1 i = 1 k , k = 1, . . . , 2s, s� j=1 aijck−1 j = 1 k ck i , i = 1, . . . , s, k = 1, . . . , s, (2.6) 2.2 Implicit Runge–Kutta Schemes 31 s� i=1 bick−1 i aij = 1 k bj � 1 − ck j � , j = 1, . . . , s, k = 1, . . . , s. An example is the implicit mid point rule, whose coefficients can be derived by setting s = 1 in (2.6). One obtains the following conditions b1 = 1, b1c1 = 1 2 , a11 = c1, b1a11 = b1 (1 − c1) . Consequently, the implicit mid point rule is given by 1/2 1/2 1 . • Gauss–Radau1 methods. These methods are characterized by the feature that one of the end points of the interval [xk, xk+1] belongs to the nodes. A method of this class with s stages has at most order 2s − 1. Examples (s = 1): ◦ 0 1 1 s = 1, p = 1, ◦ 1 1 1 s = 1, p = 1, implicit Euler scheme. The first scheme does not satisfy condition (2.5). • Gauss–Lobatto2 methods. In these methods, both end points of the interval [xk, xk+1] are nodes. A method of this kind with s stages cannot be of higher order than (2s − 2). Examples: ◦ trapezoidal rule, Crank3 –Nicolson4 scheme 0 0 0 1 1/2 1/2 1/2 1/2 s = p = 2. ◦ other scheme 0 1/2 0 1 1/2 0 1/2 1/2 s = 2, p = 2. The second scheme does not satisfy condition (2.5). ✷ 1 Rodolphe Radau (1835 – 1911) 2 Rehuel Lobatto (1797 – 1866) 3 John Crank (1916 – 2006) 4 Phyllis Nicolson (1917 – 1968) 32 2 Numerical Methods for Stiff Ordinary Differential Equations Remark 2.9. Diagonally implicit Runge–Kutta methods (DIRK methods). For an implicit Runge–Kutta method with s stages, one has to solve a coupled nonlinear system for the increments K1(x, y), . . . , Ks(x, y). This step is expensive for a large number of stages s. A compromise is the use of so-called diagonally implicit Runge–Kutta (DIRK) methods c1 a11 0 0 · · · 0 c2 a21 a22 0 · · · 0 c3 a31 a32 a33 · · · 0 ... ... ... ... cs as1 as2 · · · ass b1 b2 · · · bs−1 bs . In DIRK methods, one has to solve s independent nonlinear equations for the increments. In the equation for Ki(x, y), only the stages K1(x, y), . . . , Ki(x, y) appear, where K1(x, y), . . . , Ki−1(x, y) were already computed. ✷ 2.3 Stability Theory Remark 2.10. On the stability theory. The stability theory studies numerical methods for solving the linear initial value problem y� (x) = λy(x), y(0) = 1, λ ∈ C. (2.7) It will turn out the even at the simple initial value problem (2.7) the most important stability properties of numerical methods can be explored. The solution of (2.7) is y(x) = eλx . If the initial condition will be slightly perturbed to be 1+δ0, then the solution of the perturbed initial value problem is ˜y(x) = (1 + δ0)eλx = eλx + δ0eλx . If λ = a + ib with a = Re(λ) > 0, then the difference |y(x) − ˜y(x)| = � � �δ0eλx � � � = |δ0| |eax | � � �e ibx � � � = |δ0| |eax | becomes for each δ0 �= 0 arbitrarily large if x is sufficiently large. That means, the initial value problem (2.7) is not stable in this case. In this situation, one cannot expect that any numerical method is stable. Hence, this situation is not of interest for numerical simulations. 
In contrast, if Re(λ) < 0, then the difference |y(x) − ˜y(x)| becomes arbitrarily small and the initial value problem is stable, i.e., small changes of the 2.3 Stability Theory 33 data result only in small changes of the solution. This case is of interest for the stability theory of methods for solving ordinary differential equations. This section considers one-step methods with equidistant meshes with step size h. The solution of (2.7) in the node xk+1 = (k + 1)h is y(xk+1) = eλxk+1 = eλ(xk+h) = eλh eλxk = eλh y(xk) =: ez y(xk), with z := λh ∈ C, Re(z) ≤ 0. Now, it will be studied how the step from xk to xk+1 looks like for different one-step methods. In particular, large steps are of interest, i.e., |z| → ∞. ✷ Example 2.11. Behavior of different one-step methods for one step of the model problem (2.7). 1. Explicit Euler method. The general form of this method is yk+1 = yk + hf (xk, yk) . In particular, one obtains for (2.7) yk+1 = yk + hλyk = (1 + z) yk =: R(z)yk. It holds, independently of Re(z), that lim|z|→∞ |R(z)| = ∞. 2. Implicit Euler method. This method has the form yk+1 = yk + hf (xk+1, yk+1) . For applying it to (2.7), one can rewrite it as follows yk+1 = yk + hλyk+1 ⇐⇒ (1 − z)yk+1 = yk ⇐⇒ yk+1 = 1 1 − z yk = � 1 + z 1 − z � yk =: R(z)yk. For this method, one has, independently of Re(z), that lim|z|→∞ |R(z)| = 0. 3. Trapezoidal rule. The general form of this method is yk+1 = yk + h 2 (f(xk, yk) + f(xk+1, yk+1)) , which can be derived from the Butcher tableau given in Example 2.8. For the linear differential equation (2.7), one gets 34 2 Numerical Methods for Stiff Ordinary Differential Equations yk+1 = yk + h 2 (λyk + λyk+1) ⇐⇒ � 1 − z 2 � yk+1 = � 1 + z 2 � yk ⇐⇒ yk+1 = 1 + z/2 1 − z/2 yk = � 1 + z 1 − z/2 � yk =: R(z)yk. Let z = 2r (cos(φ) + i sin(φ)). Inserting this expression gives lim |z|→∞ � � � � 1 + z/2 1 − z/2 � � � � = lim r→∞ � � � � 1 + r (cos(φ) + i sin(φ)) 1 − r (cos(φ) + i sin(φ)) � � � � = lim r→∞ � � � � 1/r + (cos(φ) + i sin(φ)) 1/r − (cos(φ) + i sin(φ)) � � � � = |(cos(φ) + i sin(φ))| |− (cos(φ) + i sin(φ))| = 1 1 = 1. Hence, one has that lim|z|→∞ |R(z)| = 1 for the trapezoidal rule, independently of φ, and with that independently of Re(z). The function R(z) describes for each method the step from xk to xk+1. Thus, this function is an approximation of ez , which has for different methods different properties, e.g., the limit for |z| → ∞. ✷ Definition 2.12. Stability function. Let 1 = (1, . . . , 1)T ∈ Rs , ˆC = C ∪ ∞, where ∞ has to be understood as in function theory (Riemann sphere), and consider a Runge–Kutta method with s stages and with the parameters (A, b, c). Then, the function R : ˆC → ˆC, z �→ 1 + zbT (I − zA)−1 1 (2.8) is called stability function of the Runge–Kutta method. ✷ Remark 2.13. Stability functions from Example 2.11. All stability functions from Example 2.11 can be written in the form (2.8). One obtains, e.g., for the trapezoidal rule b = � 1/2 1/2 � , I − zA = � 1 0 −z 2 1 − z 2 � , (I − zA) −1 = 1 1 − z 2 � 1 − z 2 0 z 2 1 � , from what follows that 1 + zbT (I − zA)−1 1 = 1 + z 1 − z/2 � 1 2 − z 4 + z 4 + 1 2 � = 1 + z 1 − z/2 . ✷ Theorem 2.14. Form of the stability function of Runge–Kutta methods. Given a Runge–Kutta scheme with s stages and with the parameters 2.3 Stability Theory 35 (A, b, c), then the stability function R(z) is a rational function defined on ˆC, whose polynomial order in the numerator and in the denominator is at most s. 
The poles of this functions might be only at values that correspond to the inverse of an eigenvalue of A. For an explicit Runge–Kutta scheme, R(z) is a polynomial. Proof. Consider first an explicit Runge–Kutta scheme. In this case, the matrix A is a strictly lower triangular matrix. Hence, I − zA is a triangular matrix with the values one at its main diagonal. This matrix is invertible and it is (I − zA) −1 = I + zA + . . . + z s−1 A s−1 , (2.9) which can be checked easily by multiplication with (I − zA) and using that A s = 0 since A is strictly lower triangular. It follows from (2.8) and (2.9) that R(z) is a polynomial in z of degree at most s. Now, the general case will be considered. The expression (I −zA) −1 1 can be interpreted as the solution of the linear system of equations (I − zA)ζ = 1. Using the Cramer rule, one finds that the i-th component of the solution has the form ζi = det Ai det(I − zA) , where Ai is the matrix that is obtained by replacing the i-th column of (I − zA) by the right-hand side, i.e., by 1. The numerator of ζi is a polynomial in z of order at most (s−1) since there is one column where z does not appear. The denominator is a polynomial of degree at most s. Multiplying with zb T from the left-hand side gives just a rational function with polynomials of at most degree s both in the numerator and in the denominator. There is only one case where this approach does not work, namely if det(I − zA) = det(z(I/z − A)) = z s det(I/z − A) = 0, i.e., if 1/z is an eigenvalue of A. � Theorem 2.15. Solution of the initial value problem (2.7) obtained with a Runge–Kutta scheme. Consider a Runge–Kutta method with s stages and with the parameters (A, b, c). If z−1 = (λh)−1 is not an eigenvalue of A, then the Runge–Kutta scheme is well-defined for the initial value problem (2.7). In this case, it is yk = (R(hλ))k , k = 0, 1, 2, . . . . Proof. The statement of the theorem follows directly if one writes the Runge–Kutta scheme for (2.7) and applies induction. exercise � Definition 2.16. Stability domain. The stability domain of a one-step method is the set S := {z ∈ ˆC : |R(z)| ≤ 1}. ✷ Remark 2.17. Desirable property for the stability domain. The stability domain of the initial value problem (2.7) is, see Remark 2.10, 36 2 Numerical Methods for Stiff Ordinary Differential Equations Sanal = C− 0 := {z ∈ C : Re(z) ≤ 0}, since R(z) = ez . In this domain, the solution decreases (for Re(z) < 0) or its absolute value is constant (for Re(z) = 0). A desirable property of a numerical method is that it should be stable for all parameters where the initial value problem is stable, i.e., C− 0 ⊆ S. ✷ Definition 2.18. A-stable method. If for the stability domain S of a onestep method, it holds that C− 0 ⊆ S, then this one-step method is called A-stable. ✷ Lemma 2.19. Property of an A-stable method. Consider an A-stable one-step method, then it is |R(∞)| ≤ 1. Proof. By the assumption C − 0 ⊆ S, the absolute value of the stability function is bounded from above by 1 for all |z| → ∞ with Re(z) ≤ 0. From Theorem 2.14, it follows that the stability function has to be a rational function where the polynomial degree of the numerator is not larger than the polynomial degree of the denominator, since otherwise the function is unbounded for |z| → ∞. It is known from function theory that such rational functions are continuous in ∞. Hence, it is |R(∞)| ≤ 1. � Remark 2.20. On A-stable methods. 
The behavior of the stability function for $|z| \to \infty$, $z \in \mathbb{C}_0^-$, is of utmost interest, since it describes the length of the steps that is admissible for a given $\lambda$ such that the method is still stable. However, the property $|R(\infty)| \le 1$ alone does not imply that the step length can be chosen arbitrarily large without losing the stability of the method. ✷

Definition 2.21. Strongly A-stable method, L-stable method. An A-stable one-step method is called strongly A-stable if it satisfies in addition $|R(\infty)| < 1$. It is called L-stable (left stable) if it even holds that $|R(\infty)| = 0$. ✷

Example 2.22. Stability of some one-step methods. The types of stability defined in Definitions 2.18 and 2.21 are of utmost importance for the quality of a numerical method.

1. Explicit Euler method. It is $R(z) = 1 + z$, i.e., the stability domain is the closed circle with radius 1 and center $(-1, 0)$, see Figure 2.2. This method is not A-stable. For $|\lambda|$ large, one has to use very small steps in order to get stable simulations. The smallness of the step lengths for stable simulations of stiff problems is the basic problem of all explicit methods.

2. Implicit Euler method. One has for this method $R(z) = 1/(1 - z)$. The stability domain is the complete complex plane without the open circle with radius 1 and center $(1, 0)$, see Figure 2.2. Hence, the method is A-stable. From Example 2.11, it is already known that $|R(\infty)| = 0$, such that the implicit Euler method is even L-stable. A smallness condition on the step lengths does not arise for this method, at least for the model problem (2.7). In general, one can apply with the implicit Euler method much larger steps than, e.g., with the explicit Euler method. Step size restrictions arise, e.g., from the physics of the problem and from the required accuracy of the simulations. However, as for every implicit scheme, one has to solve in general a nonlinear equation in each node. Thus, the numerical costs and the computing time per step are usually much larger than for explicit schemes.

Fig. 2.2 Stability domain of the explicit Euler method (left) and the implicit Euler method (right).

3. Trapezoidal rule. For the trapezoidal rule, one gets with $z = a + ib$, $a, b \in \mathbb{R}$,
$$|R(z)|^2 = \left|\frac{1 + z/2}{1 - z/2}\right|^2 = \left|\frac{1 + a/2 + ib/2}{1 - a/2 - ib/2}\right|^2 = \frac{(2 + a)^2 + b^2}{(2 - a)^2 + b^2}.$$
Thus, $|R(z)| \le 1$ if $|2 + a| \le |2 - a|$, compare Figure 2.3, i.e.,
$$|R(z)| \begin{cases} < 1 & \text{for } a < 0 \iff \operatorname{Re}(z) < 0, \\ = 1 & \text{for } a = 0 \iff \operatorname{Re}(z) = 0, \\ = 1 & \text{for } z = \infty, \end{cases}$$
see also Example 2.11. Hence, one obtains $S = \mathbb{C}_0^-$. This method is A-stable but not L-stable. However, in contrast to the implicit Euler method, which is a first order method, the trapezoidal rule is a second order method. ✷

Fig. 2.3 Illustration to the trapezoidal rule, Example 2.22.

Remark 2.23. Linear systems of ordinary differential equations. Consider the linear system of ordinary differential equations with constant coefficients
$$y'(x) = Ay(x), \quad y(0) = y_0, \quad A \in \mathbb{R}^{n \times n}, \ y_0 \in \mathbb{R}^n. \qquad (2.10)$$
The solution of (2.10) has the form
$$y(x) = e^{Ax} y_0,$$
where $Ax$ is defined component-wise, as a multiplication of a scalar with a matrix. The first factor on the right-hand side is the matrix exponential. ✷

Definition 2.24. Matrix exponential. Let $A \in \mathbb{R}^{n \times n}$ and $A^0 := I$, $A^1 := A$, $A^2 := AA$, \ldots, $A^k := A^{k-1} A$. The matrix exponential is defined by
$$e^A := \exp(A) : \mathbb{R}^{n \times n} \to \mathbb{R}^{n \times n}, \quad A \mapsto \sum_{k=0}^{\infty} \frac{A^k}{k!}.$$
✷

Lemma 2.25.
Properties of the matrix exponential. The matrix exponential has the following properties: i) The series ∞� k=0 Ak k! converges absolutely for all A ∈ Rn×n , like in the real case n = 1. ii) If the matrices A, B ∈ Rn×n are commuting, i.e., if AB = BA holds, then it follows that eA eB = eA+B . iii) The matrix � eA �−1 ∈ Rn×n exists for all A ∈ Rn×n and it holds � eA �−1 = e−A . This property corresponds to ex �= 0 for the scalar case. 2.3 Stability Theory 39 iv) It holds rank � eA � = n, det � eA � �= 0. v) The matrix-valued function R → Rn×n , x �→ eAx , where Ax is defined component-wise, is continuously differentiable with respect to x with d dx eAx = AeAx . The derivative of the exponential is the first factor in this matrix product. The formula looks the same as in the scalar case. Proof. i) with comparison test with a majorizing series, using that the corresponding series with real argument converges for all real numbers, see literature, ii) follows from i), exercise, iii) follows from ii), exercise, iv) follows from iii), v) direct calculation with difference quotient, exercise. � Example 2.26. Matrix exponential. There are only few classes of matrices that allow an easy computation of the matrix exponential: diagonal matrices and nilpotent matrices. 1. Consider A =   1 0 0 0 2 0 0 0 3   =⇒ Ak =    1k 0 0 0 2k 0 0 0 3k    . It follows that eAx = ∞� k=0 (Ax)k k! = ∞� k=0 1 k!    xk 0 0 0 (2x)k 0 0 0 (3x)k    =    �∞ k=0 x k k! 0 0 0 �∞ k=0 (2x) k k! 0 0 0 �∞ k=0 (3x) k k!    =   ex 0 0 0 e2x 0 0 0 e3x   . 2. This example illustrates property ii) of Lemma 2.25. For the matrices A = � 2 0 0 3 � , B = � 0 1 0 0 � , it is possible to calculate the corresponding series easily, since B is a nilpotent matrix (B2 = 0). One obtains eA = � e2 0 0 e3 � , eB = I + B = � 1 1 0 1 � . It holds AB �= BA and 40 2 Numerical Methods for Stiff Ordinary Differential Equations eA eB = � e2 e2 0 e3 � �= � e2 e3 0 e3 � = eB eA . Assume that eA eB = eA+B . Since eA+B = eB+A , it follows that then also eB eA = eB+A = eA+B = eA eB , which is a contradiction to the calculations from above. ✷ Remark 2.27. Extension of the stability theory to linear systems. Consider system (2.10). It will be assumed that the matrix A possesses n eigenvalues λ1, . . . , λn ∈ C. Further, it will be assumed that this matrix can be diagonalized, i.e., there exists a matrix Q ∈ Rn×n such that Λ = Q−1 AQ, with Λ = diag(λ1, . . . , λn). This property is given, e.g., if all eigenvalues are mutually different. The columns qi of Q are the eigenvectors of A. With the substitution y(x) = Qz(x) =⇒ y� (x) = Qz� (x), one obtains the differential equation Qz� (x) = AQz(x) ⇐⇒ z� (x) = Q−1 AQz(x) = Λz(x). The equations of this system are decoupled. Its general solution is given by z(x) = eΛx c = � cieλix � i=1,...,n . It follows that the general solution of (2.10) has the form y(x) = Qz(x) = n� i=1 cieλix qi. Inserting this expression in the initial condition yields y(0) = n� i=1 ciqi = Qc = y0 =⇒ c = Q−1 y0. Hence, one obtains the following solution of the initial value problem y(x) = n� i=1 � Q−1 y0 � i eλix qi, (2.11) 2.3 Stability Theory 41 where � Q−1 y0 � i is the i-th component of Q−1 y0. Now, one can easily see that the solution is stable (small changes of the initial data lead to small changes of the solution) only if all eigenvalues have a negative real part. The study of numerical methods makes sense only in the case that the problem is well posed, i.e., all eigenvalues have a negative real part. 
Then, the most important term in (2.11) with respect to stability is the term with the eigenvalue of A with the largest absolute value of its real part, since for the stability, the absolute values of the product of the real parts of the eigenvalues and the step length are important. ✷ Definition 2.28. Stiff system of ordinary differential equations. The linear system of ordinary differential equations y� (x) = Ay(x), A ∈ Rn×n , is called stiff, if all eigenvalues λi of A possess a negative real part and if q := max{|Re(λi)| , i = 1, . . . , n} min{|Re(λi)| , i = 1, . . . , n} � 1. Sometimes, the system is called weakly stiff if q ≈ 10 and stiff if q > 10. ✷ Remark 2.29. On Definition 2.28. Definition 2.28 has a disadvantage. The ratio becomes large also in the case that the eigenvalue with the smallest absolute value of the real part is close to zero. However, this eigenvalue is not important for the stability of numerical methods, only the eigenvalue with the largest absolute value of the real part. ✷ Remark 2.30. Local stiffness for general ordinary differential equations. The concept of stiffness can be extended in some sense from linear differential equations to general differential equations. The differential equation y� (x) = f(x, y(x)) can be transformed, by introducing the functions y(x) := x and ˜y(x) := � y(x) y(x) � , to the autonomous form ˜y� (x) = ˜f (˜y(x)) = � f(x, y(x)) 1 � . By linearizing at the initial value ˜y0, one obtains a differential equation of the form ˜y� (x) = A˜y(x). Applying some definition of stiffness to the linearized equation, it is possible to define a local stiffness for the general equation. 42 2 Numerical Methods for Stiff Ordinary Differential Equations However, if one considers nonlinear problems, one has to be careful in the interpretation of the results. In general, the results are valid only locally, i.e., in a neighborhood of the point of linearization, and they do not describe the stability of a numerical method in the whole domain of definition of the nonlinear problem. ✷ 2.4 Rosenbrock Methods Remark 2.31. Goal. From the stability theory, it became obvious that one has to use implicit methods for stiff problems. However, implicit methods are computationally expensive, one has to solve in general nonlinear problems in each step. The goal consists in constructing implicit methods that have on the one hand a reduced computational complexity but on the other hand, they should be still accurate and stable. ✷ Remark 2.32. Linearly implicit Runge–Kutta methods. Consider, without loss of generality, the autonomous initial value problem in Rn y� (x) = f(y), y(0) = y0, compare Remark 1.30. DIRK methods, see Remark 2.9, have a Butcher tableau of the form c1 a11 0 0 · · · 0 c2 a21 a22 0 · · · 0 c3 a31 a32 a33 · · · 0 ... ... ... ... cs as1 as2 · · · ass b1 b2 · · · bs−1 bs . One has to solve s decoupled nonlinear equations Kj = f � yk + h j−1� l=1 ajlKl + hajjKj � , j = 1, . . . , s. (2.12) This fixed point equation can be solved with a fixed point iteration. As a special fixed point iteration, the quasi Newton method for solving the j-th equation leads to an iterative scheme of the form K (m+1) j = K (m) j (2.13) − � I − ajjhJ �−1 � K (m) j − f � yk + h j−1� l=1 ajlKl + hajjK (m) j �� � �� � residual , 2.4 Rosenbrock Methods 43 m = 0, 1, . . .. The derivative with respect to Kj of the corresponding nonlinear problem to (2.12) with right-hand side 0 is I − ajjh ���� ∂y ∂Kj ∂yf � yk + h j−1� l=1 ajlKl + hajjKj � ∈ Rn×n . 
In (2.13), one uses usually the approximation of the derivative J = ∂yf(yk) instead of the derivative at the current iterate, hence it is a quasi Newton method. If the step length h is sufficiently small, then the matrix � I − ajjhJ � is non-singular, since then it is sufficiently close to the identity, and the linear systems of equations possess a unique solution. Often, it turns out to be sufficient for reaching the required accuracy to perform just one step of the iteration. This statement holds in particular if the step length is sufficiently small and if a sufficiently accurate start value K (0) j is available. One utilizes the ansatz (linear combination of the already computed increments) K (0) j := j−1� l=1 djl ajj Kl, where the coefficients djl, l = 1, . . . , j −1, still need to be determined. Applying just one step in (2.13) with this ansatz, one obtains an implicit method with linear systems of equations of the form � I − ajjhJ � Kj = f � yk + h j−1� l=1 � ajl + djl � Kl � − hJ j−1� l=1 djlKl, j = 1, . . . , s, yk+1 = yk + h s� j=1 bjKj. (2.14) This class of methods is called linearly implicit Runge–Kutta methods. Linearly implicit Runge–Kutta methods are still implicit methods. One has to solve in each step only s linear systems of equations. That means, these methods are considerably less computationally complex than the original implicit methods and the first goal stated in Remark 2.31 is achieved. Now, one has to study which properties of the original methods are transferred to the linearly implicit methods. In particular, stability is of importance. If stability will be lost, then the linearly implicit methods are not suited for solving stiff differential equations. ✷ Theorem 2.33. Stability of linearly implicit Runge–Kutta methods. Consider a Runge–Kutta method with the parameters (A, b, c), where A ∈ Rs×s is a non-singular lower triangular matrix (which was used for the derivation of (2.14)). Then, the corresponding linearly implicit Runge–Kutta 44 2 Numerical Methods for Stiff Ordinary Differential Equations method (2.14) with J = ∂yf(yk) has the same stability function R(z) as the original method, independently of the choice of {djl}. Proof. The linearly implicit method will be applied to the one-dimensional (to simplify notations) test problem y � (x) = λy(x), y(0) = 1, with Re(λ) < 0. Since f(y) = λy, one obtains J = λ. The j-th equation of (2.14) has the form � 1 − ajjhλ � Kj = λ � yk + h j−1� l=1 � ajl + djl � Kl � − hλ j−1� l=1 djlKl = λyk + hλ j−1� l=1 ajlKl, j = 1, . . . , s. Multiplication with h gives with z = λh Kjh − z j� l=1 ajlKlh = zyk, j = 1, . . . , s. This equation is equivalent, using matrix-vector notation, to (I − zA) Kh = zyk1, K = (K1, . . . , Ks) T . Let h be chosen in such a way that z −1 is not an eigenvalue of A. Then, one obtains by inserting this equation in the second equation of (2.14) yk+1 = yk + hb T K = yk + hb T (I − zA) −1 1 z h yk = � 1 + zb T (I − zA) −1 1 � yk = R(z)yk. Now one can see that in the parentheses there is the stability function R(z) of the original Runge–Kutta method, see (2.8). � Remark 2.34. On the stability and consistency. Since the most important stability properties of a numerical method for solving initial value problems with ordinary differential equations depend only on the stability function, these properties transfer from the original implicit Runge–Kutta method to the corresponding linearly implicit method. The choice of the coefficients {djl} will influence the order of the linearly implicit method. 
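As an illustration of the structure of (2.14), the following minimal sketch (Python/NumPy as an assumed tool) performs one step of a linearly implicit Runge–Kutta method. The function name, the test problem, and the Jacobian in the usage part are hypothetical and not taken from these notes.

```python
import numpy as np

def linearly_implicit_rk_step(f, J, yk, h, A, b, d):
    """One step of a linearly implicit Runge-Kutta method of the form (2.14).

    A : (s, s) lower triangular Runge-Kutta matrix (a_jl) with nonzero diagonal
    b : (s,)   weights b_j
    d : (s, s) strictly lower triangular matrix of the coefficients d_jl
    J : approximation of the Jacobian of f at yk
    """
    s, n = len(b), len(yk)
    K = np.zeros((s, n))
    I = np.eye(n)
    for j in range(s):
        u = sum(((A[j, l] + d[j, l]) * K[l] for l in range(j)), np.zeros(n))
        v = sum((d[j, l] * K[l] for l in range(j)), np.zeros(n))
        # one linear solve per stage with the system matrix (I - a_jj h J), see (2.14)
        K[j] = np.linalg.solve(I - A[j, j] * h * J, f(yk + h * u) - h * J @ v)
    return yk + h * (b @ K)

# Usage sketch: with s = 1 and a_11 = b_1 = 1 (the tableau of the implicit Euler
# method) and no d-coefficients, one obtains a linearly implicit Euler step.
f = lambda y: np.array([-50.0 * y[0]])     # hypothetical stiff test problem
J = np.array([[-50.0]])                    # Jacobian of f at yk
y_next = linearly_implicit_rk_step(f, J, np.array([1.0]), 0.1,
                                   np.array([[1.0]]),    # A
                                   np.array([1.0]),      # b
                                   np.array([[0.0]]))    # d
print(y_next)
```

Note that only s linear solves with the matrices (I − a_jj hJ) occur per step; no nonlinear equation has to be solved.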
For an inappropriate choice of these coefficients, the order of the linearly implicit method might be lower than the order of the original method. ✷ Example 2.35. Linearly implicit Euler method. The implicit Euler method has the Butcher tableau 1 1 1 . With (2.14), it follows that the linearly implicit Euler method has the form � I − h∂yf(yk) � K1 = f (yk) , yk+1 = yk + hK1. 2.4 Rosenbrock Methods 45 The linearly implicit Euler method is L-stable, like the implicit Euler method, and one has to solve in each step only one linear system of equations. There are no coefficients {djl} to be chosen in this method. ✷ Remark 2.36. Rosenbrock5 methods. Another possibility for simplifying the use of linearly implicit methods and decreasing the numerical costs consists in using for all increments the same coefficient ajj = a. In this case, all linear systems of equations in (2.14) possess the same system matrix (I − ahJ). Then, one needs only one LU decomposition of this matrix and can solve all systems in (2.14) with this decomposition. This approach is called Rosenbrock methods or Rosenbrock–Wanner6 methods (ROW methods) (I − ahJ) Kj = f � yk + h j−1� l=1 � ajl + djl � Kl � − hJ j−1� l=1 djlKl, j = 1, . . . , s, yk+1 = yk + h s� j=1 bjKj. (2.15) In practice, it is often even possible to use the same approximation J of the Jacobian for some subsequent steps. This is true in particular, if the solution changes only slowly. In this way, one can save additional computational costs. ✷ Example 2.37. The method ode23s. In MATLAB, one can find for solving stiff initial value problems with ordinary differential equations the Rosenbrock method ode23s, see Shampine & Reichelt (1997). This method has the form (I − ahJ) K1 = f(yk), a = 1 2 + √ 2 ≈ 0.2928932, (I − ahJ) K2 = f � yk + 1 2 hK1 � − ahJK1, (2.16) yk+1 = yk + hK2. From the equation for the second increment, it follows that d21 = a. Then, one obtains with (2.15) a21 = 1/2 − d21 = 1/2 − a. Using the condition that the nodes are the sums of the rows of the matrix, it follows that the corresponding Butcher tableau looks like a a 1/2 1/2 − a a 0 1 . ✷ 5 Howard H. Rosenbrock (1920 – 2010) 6 Gerhard Wanner, born 1942 46 2 Numerical Methods for Stiff Ordinary Differential Equations Theorem 2.38. Order of ode23s. The Rosenbrock method ode23s is of second order if h ∈ (0, 1/(2a �J�2)). Proof. Let h ∈ (0, 1/(2a �J�2)), where �·�2 denotes the spectral norm of J, which is induced by the Euclidean vector norm �·�2. It can be shown, see class Computer Mathematics, that the matrix (I − ahJ) is invertible if �ahJ�2 < 1. This condition is satisfied for the choice of h from above. Let K be the solution of (I − ahJ)K = f. (2.17) Then, one obtains with the triangle inequality, with the compatibility of the Euclidean vector norm and the spectral matrix norm, and with the choice of h that �(I − ahJ)K�2 ≥ �K�2 − ah �JK�2 ≥ �K�2 − ah �J�2 �K�2 ≥ �K�2 − a �J�2 2a �J�2 �K�2 = 1 2 �K�2 . It follows with (2.17) that 1 2 �K�2 ≤ �(I − ahJ)K�2 = �f�2 =⇒ �K�2 ≤ 2 �f�2 . (2.18) Thus, the solution of the linear system of equations is bounded by the right-hand side. One obtains for the first increment of ode23s by recursive insertion, using (2.16), K1 = f(yk) + ahJK1 = f(yk) + ahJ (f(yk) + ahJK1) = f(yk) + ahJf(yk) + h 2 a 2 J 2 K1 = f(yk) + ahJf(yk) + O � h 2 � . (2.19) The last step is allowed since K1 is bounded by the data of the problem (the right-hand side f(yk)) independently of h, see (2.18) where the constant in the estimate is 2. 
Using a Taylor series expansion and considering only first order terms explicitly, one obtains in a similar way for the second increment of ode23s K2 = f � yk + 1 2 hK1 � − ahJK1 + ahJK2 = f (yk) + 1 2 h∂yf (yk) K1 − ahJK1 + ahJK2 + O � h 2 � (2.20) (2.19) = f (yk) + 1 2 h∂yf (yk) f (yk) − ahJf(yk) + ahJK2 + O � h 2 � (2.20) = f (yk) + 1 2 h∂yf (yk) f (yk) − ahJf(yk) + ahJf(yk) + O � h 2 � = f (yk) + 1 2 h∂yf (yk) f (yk) + O � h 2 � . Inserting these results in (2.16) gives for one step of ode23s yk+1 = yk + hf (yk) + 1 2 h 2 ∂yf (yk) f (yk) + O � h 3 � . (2.21) The Taylor series expansion of the solution y(x) of the system of differential equations in xk has the form, using the differential equation and the chain rule, y(xk+1) = y(xk) + hy � (xk) + h 2 2 y �� (xk) + O � h 3 � 2.4 Rosenbrock Methods 47 = y(xk) + hf (yk) + h 2 2 ∂f (y) ∂x (xk) + O � h 3 � = y(xk) + hf (yk) + h 2 2 ∂yf (yk) y � (xk) + O � h 3 � = y(xk) + hf (yk) + h 2 2 ∂yf (yk) f (yk) + O � h 3 � . Starting with the exact value at xk, then the first three terms of (2.21) correspond to the Taylor series expansion of the solution y(x) of the system of differential equations in xk. Thus, it follows that the local error is of order O � h 3 � , from what follows that the consistency order of ode23s is two, see Definition 1.14. � Remark 2.39. To the proof of Theorem 2.38. Note that it is not needed in the proof of Theorem 2.38 that J is the exact derivative ∂yf (yk). The method ode23s remains a second order method if J is only an approximation of ∂yf (yk) and even if J is an arbitrary matrix. However, the transfer of the stability properties from the original method to ode23s is only guaranteed for the choice J = ∂yf (yk), see Theorem 2.33. ✷ Theorem 2.40. Stability function of ode23s. Assume that J = ∂yf(yk), then the stability function of the Rosenbrock method ode23s has the form R(z) = 1 + (1 − 2a)z (1 − az)2 . (2.22) Proof. The statement of the theorem follows from applying the method to the usual test equation, exercise. � Corollary 2.41. Stability of ode23s. If J = ∂yf(yk), then the Rosenbrock method ode23s is L-stable. Proof. The statement is obtained by applying the definition of L-stability to the stability function (2.22). � Remark 2.42. On the order of ode23s. It remains the question whether an appropriate choice of J might even increase the order of the method. However, for the model problem of the stability analysis, a series expansion of the stability function shows that the exponential function is reproduced exactly only to the quadratic term. From this observation, it follows that one does not obtain a third order method even with exact Jacobian. In practice, there is no important reason from the point of view of accuracy to compute a new Jacobian in each step. Often, it is sufficient to update J every now and then. ✷ Chapter 3 Multi-Step Methods 3.1 Definition Remark 3.1. Multi-step methods. The characteristic feature of one-step methods is that they need for computing yk+1 only the value from the previous approximation yk of the solution. A straightforward extension consists in constructing methods that use for computing yk+1 more than one of the previous approximations yk, yk−1, . . .. Such methods are called multi-step methods. ✷ Definition 3.2. q-step method, linear q-step method. A q-step method with q ≥ 1 is a numerical method for approximately solving y� (x) = f (x, y(x)) , y(x0) = y0, (3.1) where yk+1 depends on yk+1−q but not on yi with i < k + 1 − q. 
A q-step method is called linear, if it has the form yk+1 = q−1� j=0 ajyk−j + h q−1� j=0 bjf � xk−j, yk−j � + hb−1f (xk+1, yk+1) , (3.2) k = q − 1, q, . . . , with q ≥ 1, a0, . . . , aq−1, b−1, . . . , bq−1 ∈ R, aq−1 �= 0 or bq−1 �= 0. For q = 1, the method is called a one-step method. If b−1 �= 0, then the linear q-step method is an implicit method, otherwise it is an explicit method. ✷ Remark 3.3. Initial values for a q-step method. A q-step method needs q initial values. However, the initial value problem (3.1) provides only the initial value y0. The second initial value y1 can be computed with a one-step method, the next initial value y2 with a one-step method or with a two-step method and so on. It follows that all initial values yi, i > 0, are already numerical approximations. This aspect has to be taken into account in the error analysis of multi-step methods, see Remark 3.23. ✷ 49 50 3 Multi-Step Methods integral xp interpolationxp−r xp−j xp+m Fig. 3.1 Parameters in the derivation of predictor-corrector schemes. 3.2 Predictor-Corrector Methods Remark 3.4. Construction. Starting point of the construction of predictorcorrector methods is the equivalent integral form of the initial value problem (3.1) y(x) = y0 + � x x0 f (t, y(t)) dt. (3.3) Denote the solution at ˜x by y(˜x), then it holds that y(x) = y(˜x) + � x ˜x f (t, y(t)) dt. (3.4) The main idea of predictor-corrector methods consists in approximating the integral on the right-hand side of (3.4) in an appropriate way. There are two principal difficulties: • The dependency of the term in the integral on t is generally not known since the function y(t) is unknown. • Even if the dependency of the function in the integral on t is known, generally it will be impossible to find an analytic expression of the solution. Consider an equidistant grid with nodes xi = x0 + ih, i = 0, 1, . . . . For the derivation of the methods, assume that the term in the integral is known. Then, the derivation is similar to the derivation of the Newton1 – Cotes2 formulas for numerical quadrature. In this approach, the term in the integral of (3.4) is replaced by a polynomial interpolant. Let the boundaries of the integral be the nodes ˜x = xp−j, starting point with parameter j, x = xp+m end point with parameter m, (3.5) with parameters j, m ∈ N0 that need yet to be determined. It will be required that the interpolation polynomial pr(x) satisfies the following properties: • the degree of pr(x) is lower than or equal to r, • pr(xi) = f(xi, y(xi)) for i = p, p − 1, . . . , p − r. 1 Isaac Newton (1642 – 1727) 2 Roger Cotes (1682 – 1716) 3.2 Predictor-Corrector Methods 51 Thus, xp is the most right-hand side node for computing the interpolation polynomial. The value r is a third parameter, compare Figure 3.1. Note that two sets of nodes are involved in the construction, namely the nodes that determine the boundaries of the integral and the nodes that are used to define the interpolation polynomial. The solution of this interpolation problem is given by the Lagrange3 interpolation polynomial pr(x) = r� i=0 f � xp−i, y(xp−i) � Li(x) with Li(x) = r� l=0,l�=i x − xp−l xp−i − xp−l , i = 0, 1, . . . , r. (3.6) It follows by using (3.4), (3.5), and (3.6) that yp+m ≈ yp−j + r� i=0 f � xp−i, y(xp−i) � � x ˜x Li(t) dt = yp−j + h r� i=0 βif � xp−i, y(xp−i) � (3.7) with βi = 1 h � x ˜x Li(t) dt = 1 h � x ˜x   r� l=0,l�=i t − xp−l xp−i − xp−l   dt. The constructed method is in particular linear. 
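The weights βi in (3.7) can be computed numerically for arbitrary interpolation nodes by integrating the Lagrange basis polynomials. The following minimal sketch (Python/SciPy as an assumed tool; the function name is hypothetical) does this and, for two interpolation nodes xk, xk−1 with integration from xk to xk+1, recovers the weights 3/2 and −1/2 that reappear in Example 3.5 below.

```python
import numpy as np
from scipy.integrate import quad

def beta_coefficients(nodes, x_lower, x_upper, h):
    """Weights beta_i of (3.7): beta_i = (1/h) * integral of the Lagrange basis
    polynomial L_i over [x_lower, x_upper], for arbitrary interpolation nodes."""
    betas = []
    for i in range(len(nodes)):
        def L_i(t, i=i):
            val = 1.0
            for l, xl in enumerate(nodes):
                if l != i:
                    val *= (t - xl) / (nodes[i] - xl)
            return val
        integral, _ = quad(L_i, x_lower, x_upper)
        betas.append(integral / h)
    return betas

# Interpolation nodes x_k and x_{k-1}, integration from x_k to x_{k+1} on an
# equidistant grid; this recovers the weights 3/2 and -1/2, see Example 3.5.
h, xk = 0.1, 0.0
print(beta_coefficients([xk, xk - h], xk, xk + h, h))   # approx [1.5, -0.5]
```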
Note that so far the assumption of having an equidistant grid was not used. Finally, the formula for βi should be simplified. To this end, note that all fixed values from the interval are nodes of the equidistant grid, such that, e.g., xp = x0 + ph. Replacing these values and using the substitution t = xp + sh =⇒ dt = hds, yields βi = 1 h � m −j   r� l=0,l�=i xp + sh − xp−l xp−i − xp−l   h ds = � m −j   r� l=0,l�=i x0 + ph + sh − x0 − ph + lh x0 + ph − ih − x0 − ph + lh   ds 3 Joseph Louis Lagrange (1736 – 1813) 52 3 Multi-Step Methods = � m −j   r� l=0,l�=i s + l −i + l   ds. (3.8) Now, different methods can be obtained depending on the choice of m, j, and r and by replacing the generally unknown value y(xp−i) in (3.7) with yp−i. There are four important classes of methods. ✷ Example 3.5. Adams4 –Bashforth5 methods. The class of q-step Adams–Bashforth methods is given by m = 1, j = 0, and r = q−1. It follows that the q-step Adams–Bashforth method uses the nodes xk+1−q, . . . , xk for computing the Lagrangian interpolation polynomial. These are q nodes and pq(x) is at most of degree q − 1. Adams–Bashforth methods are explicit methods. They have the general form yk+1 = yk + h q−1� i=0 βif (xk−i, yk−i) , (3.9) see (3.7), with βi = � 1 0   q−1� l=0,l�=i s + l −i + l   ds, (3.10) compare (3.8). In the case q = 1, the term in the integral in (3.4) is replaced by a constant interpolation polynomial with the node (xk, f(xk, yk)). Using the convention that the product is 1 if there is formally no factor in (3.10), this approach yields yk+1 = yk + h �� 1 0 ds � f (xk, yk) = yk + hf (xk, yk) , i.e., one obtains the explicit Euler method. If q = 2, then the term in the integral is approximated by a linear interpolation polynomial with the nodes (xk−1, f(xk−1, yk−1)) and (xk, f(xk, yk)). Using (3.9) and (3.10), one obtains yk+1 = yk + h ��� 1 0 s + 1 1 ds � f (xk, yk) + �� 1 0 s −1 ds � f (xk−1, yk−1)) � = yk + h � 3 2 f (xk, yk) − 1 2 f (xk−1, yk−1) � = yk + h 2 � 3f (xk, yk) − f (xk−1, yk−1) � . q ≥ 3, exercise ✷ 4 John Couch Adams (1819 – 1892) 5 Francis Bashforth (1819 – 1912) 3.2 Predictor-Corrector Methods 53 Example 3.6. Adams–Moulton6 methods. Adams–Moulton methods are defined by m = 0, j = 1, and r = q. Hence, it follows that βi = � 0 −1   q� l=0,l�=i s + l −i + l   ds and from (3.7) that yk = yk−1 + h q� i=0 βif (xk−i, yk−i) or, by transforming the index, yk+1 = yk + h q� i=0 βif (xk+1−i, yk+1−i) . The q + 1 nodes of these methods are given by xk+1−q, . . . , xk, xk+1. That means, Adams–Moulton methods are implicit methods. This class contains two one-step methods that are obtained for q = 0 (which can be used in contrast to the requirement in Definition 3.2) and q = 1. Note that the parameter q in (3.2) determines both the previous approximations to be used and the previous arguments of the function f. But in the construction of the methods, three independent parameters were introduced to determine these values. This construction introduces some freedom which allows here to set q = 0. Considering the case q = 0, then the term in the integral is replaced by a constant interpolation polynomial with the node at (xk+1, f(xk+1, yk+1)). This approach results in the method yk+1 = yk + h �� 0 −1 ds � f (xk+1, yk+1) = yk + hf (xk+1, yk+1) , which is the implicit Euler method. For q = 1, one uses a linear interpolation polynomial with the points (xk, f(xk, yk)) and (xk+1, f(xk+1, yk+1)). 
One gets yk+1 = yk + h ��� 0 −1 s + 1 1 ds � f (xk+1, yk+1) + �� 0 −1 s −1 ds � f (xk, yk) � = yk + h � 1 2 f (xk+1, yk+1) + 1 2 f (xk, yk) � = yk + h 2 [f (xk+1, yk+1) + f (xk, yk)] . 6 Forest Ray Moulton (1872 – 1952) 54 3 Multi-Step Methods This method is the trapezoidal rule. ✷ Example 3.7. Nystr¨om7 methods. The class of Nystr¨om methods is obtained by using m = 1, j = 1, and r = q − 1. They have the form yk+1 = yk−1 + h q−1� i=0 βif (xk−i, yk−i) with βi = � 1 −1   q−1� l=0,l�=i s + l −i + l   ds. These methods are explicit and one uses the q nodes xk+1−q, . . . , xk. One gets, e.g., for q = 1, the method yk+1 = yk−1 + h �� 1 −1 ds � f (xk, yk) = yk−1 + 2hf (xk, yk) . ✷ Example 3.8. Milne8 method. Milne methods are defined by m = 0, j = 2, and r = q. Using a transform of the index, one finds that they have the form yk+1 = yk−1 + h q� i=0 βif (xk+1−i, yk+1−i) with βi = � 0 −2   q� l=0,l�=i s + l −i + l   ds. Thus, these are implicit methods. ✷ Remark 3.9. On the coefficients of multi-step methods. One can find tables with the coefficients for multi-step methods in the literature. ✷ Remark 3.10. Using implicit methods in practice, predictor-corrector methods. If implicit methods are used, then one has to solve in each node xk+1 an equation that is generally nonlinear. This step can be performed with some kind of fixed point iteration, e.g., with a method of Newton-type. To achieve a good efficiency of the method, a good initial iterate is of importance. To obtain a good initial iterate, one can use an explicit (multi-step) method. For 7 Evert J. Nystr¨om (1895 – 1960) 8 William Edwin Milne (1890 – 1971) 3.3 Convergence of Multi-Step Methods 55 this reason, explicit multi-step methods are called predictor methods and implicit multi-step methods are called corrector methods. The combination of a predictor method with a corrector method is called predictor-corrector method. Often, it is sufficient for computing the next iterate to perform the predictor step and one or two corrector steps. ✷ Remark 3.11. Nordsieck9 form. It is possible to transform multi-step methods in a one-step form, the so-called Nordsieck form. This form uses instead of yk, . . . , yk−q+1, f (xk, yk) , . . . , f � xk−q+1, yk−q+1 � , the values yk, y� (xk), y�� (xk), . . . , y(q) (xk), see, e.g., (Strehmel et al., 2012, Section 4.4.3). The advantage of the Nordsieck form consists in the possibility of applying a step length control as it is known from one-step methods, Section 1.3. Otherwise, a step length control for form (3.2) of multi-step methods becomes rather complicated. On the other hand, using the Nordsieck form requires that the solution of the initial value problem is q times continuously differentiable. ✷ 3.3 Convergence of Multi-Step Methods Remark 3.12. Generalities. In this section, linear multi-step methods of the form (3.2) will be considered. Similarly to one-step methods, notations like local error, consistency, or order of convergence will be introduced. The extension of these notations to nonlinear multi-step methods is straightforward. ✷ Definition 3.13. Local error. Let yk+1 be the results of (3.2), k ≥ q, where the initial values are exactly the values of the solution yk+1−q = y(xk+1−q), . . . , yk = y(xk). Then, the local error is defined by le(xk+1) = lek+1 = y(xk+1)−   q−1� j=0 ajy � xk−j � + h q−1� j=−1 bjf � xk−j, y(xk−j) �   . (3.11) ✷ 9 Arnold Nordsieck (1911 – 1971) 56 3 Multi-Step Methods Definition 3.14. Consistent method, consistency order. 
Let y(x) be the solution of the initial value problem (3.1), S = {(x, y) : x ∈ I = [x0, xe], y ∈ R}, and IN an equidistant mesh on I with N intervals. The multi-step method (3.2) is called consistent if for all f ∈ C(S), which satisfy in S a Lipschitz condition with respect to y, it holds lim h→0 � max xk∈IN le(xk + h) h � = 0, with h = xe − x0 N . (3.12) If the expression on the left-hand side converges like hp for p ≥ 1, then the multi-step scheme has the consistency order p. ✷ Example 3.15. Consistency order for a Nystr¨om method. The consistency order of a multi-step method can be computed in the same way as for a one-step method by expanding the local error in a Taylor series with respect to h. After having then divided by h, the order of the first non-vanishing term gives the consistency order. Consider the Nystr¨om method for q = 3 yk+1 = yk−1 + h ��� 1 −1 2� l=1 s + l l ds � f (xk, yk) +   � 1 −1 2� l=0,l�=1 s + l −1 + l ds   f (xk−1, yk−1) + �� 1 −1 1� l=0 s + l −2 + l ds � f (xk−2, yk−2) � = yk−1 + h � 7 3 f (xk, yk) − 2 3 f (xk−1, yk−1) + 1 3 f (xk−2, yk−2) � . It follows with (3.11) and (3.1) that le(xk+1) = y(xk+1) − y(xk−1) −h � 7 3 f (xk, y(xk)) − 2 3 f (xk−1, y(xk−1)) + 1 3 f (xk−2, y(xk−2)) � = y(xk+1) − y(xk−1) − h � 7 3 y� (xk) − 2 3 y� (xk−1) + 1 3 y� (xk−2) � . (3.13) Now, the the individual terms will be expanded y(xk+1) = y(xk + h) = y(xk) + hy� (xk) + h2 2 y�� (xk) + h3 6 y��� (xk) + h4 24 y(4) (xk) + O(h5 ), 3.3 Convergence of Multi-Step Methods 57 y(xk−1) = y(xk − h) = y(xk) − hy� (xk) + h2 2 y�� (xk) − h3 6 y��� (xk) + h4 24 y(4) (xk) + O(h5 ), y� (xk−1) = y� (xk − h) = y� (xk) − hy�� (xk) + h2 2 y��� (xk) − h3 6 y(4) (xk) + O(h4 ), y� (xk−2) = y� (xk − 2h) = y� (xk) − 2hy�� (xk) + 2h2 y��� (xk) − 4h3 3 y(4) (xk) + O(h4 ). Inserting these expressions in formula (3.13) for the local error gives le(xk+1) = y(xk) + hy� (xk) + h2 2 y�� (xk) + h3 6 y��� (xk) + h4 24 y(4) (xk) −y(xk) + hy� (xk) − h2 2 y�� (xk) + h3 6 y��� (xk) − h4 24 y(4) (xk) − 7h 3 y� (xk) + 2 3 � hy� (xk) − h2 y�� (xk) + h3 2 y��� (xk) − h4 6 y(4) (xk) � − 1 3 � hy� (xk) − 2h2 y�� (xk) + 2h3 y��� (xk) − 4h4 3 y(4) (xk) � + O(h5 ) = h4 3 y(4) (xk) + O(h5 ). With (3.12), one obtains that this method has consistency order 3. ✷ Remark 3.16. Linear multi-step methods with a high order of convergence. The goal in constructing multi-step methods consists of course in obtaining convergent methods of high order. A high order of convergence can be expected only if the consistency order is high, i.e., if the local error is small. Using the Taylor series expansion of the terms in the local error and requiring that as many leading terms as possible vanish, one gets a linear system of equations for determining the coefficients aj, bj, j = 0, . . . , q − 1 and b−1 in (3.11). In this way, one obtains a method of the form yk+1 − q−1� j=0 ajyk−j = h q−1� j=−1 bjf � xk−j, yk−j � . (3.14) Constructing one-step methods in this way, one always obtains a convergent one-step method, e.g., compare Example 1.29. However, the situation might be different for multi-step methods. ✷ 58 3 Multi-Step Methods Example 3.17. Non-convergent multi-step method. Consider the idea from Remark 3.16 for the construction of an explicit linear multi-step method with q = 2 and with maximal order of consistency. That means, the ansatz for the method is as follows, compare (3.14), yk+1 − a0yk − a1yk−1 = h [b0f (xk, yk) + b1f (xk−1, yk−1)] . 
The local error has the form le(xk+1) = y (xk+1) − a0y (xk) − a1y (xk−1) − hb0f (xk, y(xk)) −hb1f (xk−1, y(xk−1)) = y (xk+1) − a0y (xk) − a1y (xk−1) − hb0y� (xk) − hb1y� (xk−1). Now, the individual terms are expanded in powers of h: y (xk+1) = y (xk + h) = y (xk) + hy� (xk) + h2 2 y�� (xk) + h3 6 y��� (xk) + O(h4 ), y (xk−1) = y (xk − h) = y (xk) − hy� (xk) + h2 2 y�� (xk) − h3 6 y��� (xk) + O(h4 ), y� (xk−1) = y� (xk − h) = y� (xk) − hy�� (xk) + h2 2 y��� (xk) + O(h3 ). Inserting the expansions gives le(xk+1) = y (xk) + hy� (xk) + h2 2 y�� (xk) + h3 6 y��� (xk) − a0y (xk) −a1 � y (xk) − hy� (xk) + h2 2 y�� (xk) − h3 6 y��� (xk) � − hb0y� (xk) −hb1 � y� (xk) − hy�� (xk) + h2 2 y��� (xk) � + O(h4 ) = [1 − a1 − a0] y (xk) + [1 + a1 − b1 − b0] hy� (xk) + � 1 2 − a1 2 + b1 � h2 y�� (xk) + � 1 6 + a1 6 − b1 2 � h3 y��� (xk) + O(h4 ). Requiring that the first four terms should vanish leads to the following linear system of equations     1 1 0 0 −1 0 1 1 1/2 0 −1 0 −1/6 0 1/2 0         a1 a0 b1 b0     =     1 1 1/2 1/6     . The unique solution of this system is a1 = 5, a0 = −4, b1 = 2, b0 = 4. Consequently, one obtains the method 3.3 Convergence of Multi-Step Methods 59 yk+1 = −4yk + 5yk−1 + h [4f (xk, yk) + 2f (xk−1, yk−1)] (3.15) with third order of consistency. Next, the convergence of the method will be studied at the model initial value problem y� (x) = −y(x), y(0) = 1, with the solution y(x) = exp(−x). As second initial condition, one takes the value of the solution in the mesh point x1 = h, i.e., y1 = exp(−h). Inserting the special form of the right-hand side of the model problem, f(xk, yk) = −yk, in (3.15), one can represent the computed solution explicitly. This solution satisfies the homogeneous linear difference equation yk+1 + (4 + 4h)yk + (−5 + 2h)yk−1 = 0. The solution of this difference equation can be obtained with the ansatz yk = ξk . Inserting this ansatz in the difference equation leads to ξk+1 + (4 + 4h)ξk + (−5 + 2h)ξk−1 = 0. This equation is satisfied for ξ = 0. Other solutions can be obtained after division by ξk−1 from ξ2 + (4 + 4h)ξ + (−5 + 2h) = 0. (3.16) One gets the solutions ξ1(h) = −2 − 2h + 3 � 1 + 2 3 h + 4 9 h2 , ξ2(h) = −2 − 2h − 3 � 1 + 2 3 h + 4 9 h2 . For simplicity, the dependency on h will be neglected in the notation. The general solution of the difference equations can be represented as a linear combination of the special solutions (superposition principle) yk = C1ξk 1 + C2ξk 2 . Now, the constants can be determined from the initial conditions. It holds y0 = C1 + C2 = 1, y1 = e−h = C1ξ1 + C2ξ2, from what follows that C1(h) = e−h − ξ2 ξ1 − ξ2 , C2(h) = − e−h − ξ1 ξ1 − ξ2 . Expanding ξ1(h), ξ2(h), C1(h) and C2(h) in powers of h and inserting these expansions in the solution (exercise), gives for fixed x > 0 and hN := x/N 60 3 Multi-Step Methods yN = � 1 + O � x N �� � 1 − x N + O �� x N �2 ��N − 1 216 � x N �4 � 1 + O � x N �� � −5 − 3 x N + O �� x N �2 ��N . Considering now the convergence of the method, i.e., hN → 0 ⇐⇒ N → ∞. Then, one obtains for the first term, using well known properties of the exponential, that lim N→∞ � 1 + O � x N �� � 1 − x N + O �� x N �2 ��N = e−x . This part approximates the solution of the model problem. For the second term, it holds that − 1 216 � x N �4 � 1 + O � x N �� � −5 − 3 x N + O �� x N �2 ��N = − (−5)N 216 � x N �4 � 1 + O � x N �� � 1 + 3 5 x N + O �� x N �2 ��N . 
Since lim N→∞ � 1 + 3 5 x N + O �� x N �2 ��N = e3x/5 , one finds that the second term behaves for large N as follows − (−5)N 216 � x N �4 e3x/5 . (3.17) This expression oscillates with increasing N and the modulus is increasing for finer grids (‘exponential (−5)N is stronger than polynomial (x/N)4 ’), compare the values for x = 1 in the following table N value of (3.17) 1 0.0421787 2 - 0.1054467 3 0.3514890 4 - 1.3180836 5 5.2723345 6 - 21.96806 7 94.14883 8 - 411.90113 9 1830.6717 10 - 8238.0226 It follows that the method does not converge. 3.3 Convergence of Multi-Step Methods 61 Such an oscillatory behavior can be observed also if the method is applied for solving other initial value problems. For the considered example, the reason for this behavior is that the general solution of the difference equation contains a term that becomes arbitrarily large for large k and for small h (or large N). For the considered method it is lim h→0 ξ2(h) = −5 =⇒ lim k→∞ � � �ξ k 2 (h) � � � = ∞. The solution of the difference equation was obtained from the roots of the polynomial (3.16). It can be guessed that the roots of this polynomial will be of importance for the convergence of multi-step methods. ✷ Definition 3.18. Null stable linear multi-step method. A linear q-step method is called null stable if the first characteristic polynomial Ψ(ξ) = ξq − a0ξq−1 − . . . − aq−1 (3.18) possesses only roots ξq with � �ξq � � ≤ 1 that are simple in the case that � �ξq � � = 1. For the notation ‘null stable’, compare Remark 3.37 below. ✷ Example 3.19. Null stability for predictor-corrector methods. The methods from the four most important classes of predictor-corrector methods are null stable. • Adams–Bashforth methods, Adams–Moulton methods. The first characteristic polynomial has the form Ψ(ξ) = ξq − ξq−1 = (ξ − 1) ξq−1 . The only non-trivial root is ξq = 1 and this root is simple. • Nystr¨om methods, Milne methods. For these methods, the first characteristic polynomial is Ψ(ξ) = ξq − ξq−2 = (ξ + 1) (ξ − 1) ξq−2 . Hence, the only non-trivial roots are ξq = 1 and ξq = −1. They are simple. Null stability does not mean stable in the sense that the method can be applied for the numerical solution of stiff problems, see Example 3.22. ✷ Theorem 3.20. First Dahlquist10 barrier. The maximal order of consistency of a null stable linear q-step method is p =    q + 1 for q odd, q + 2 for q even, q if b−1 ≤ 0, in particular, if the method is explicit. 10 Germund Dahlquist (1925 – 2005) 62 3 Multi-Step Methods Proof. Only a sketch of the proof is given here, for details see the literature, e.g., (Strehmel et al., 2012, Section 4.2.3) or (Hairer et al., 1993, Section III.3). First, one sets for ξ ∈ C, |ξ| < 1, z = ξ − 1 ξ + 1 . Then, one defines the polynomials R(z) = � 1 − z 2 �q Ψ(ξ) = q� l=0 αlz l , S(z) = � 1 − z 2 �q σ(ξ) = q� l=0 βlz l , with σ(ξ) = b−1ξ q + b0ξ q−1 + . . . + bq−1. (3.19) As next step, one can prove that a linear multi-step method has consistency order p if and only if R(z) � ln 1 + z 1 − z �−1 − S(z) = O � z p� for z → 0. Using a Taylor series expansion of the term with the logarithm, one has on the left-hand side of this statement a polynomial. Now, one studies which coefficients of this polynomial might vanish such that the method is null stable in the individual cases given in the theorem. � Example 3.21. Consistency order of some predictor-corrector methods. 
• Adams–Bashforth methods with q steps have the consistency order q and Adams–Moulton methods with q steps possess the consistency order q+1. Thus, Adams–Moulton methods where q is even have an order that is less than the maximal possible order according to Theorem 3.20. • The 2-step Milne method (also Milne–Simpson method) yk+1 = yk−1 + h � 1 3 f(xk+1, yk+1) + 4 3 f(xk, yk) + 1 3 f(xk−1, yk−1) � (3.20) has the consistency order 4. This method achieves the maximal order of consistency for a null stable 2-step method. ✷ Example 3.22. Stability of the 2-step Milne method. This implicit method is null stable, see Example 3.19, and it possesses the maximal possible order of consistency for a null stable method, see Example 3.21. Thus, so far it shows favorable properties. But having a closer look on its stability reveals that this method has a severe drawback. Consider again the model initial value problem y� (x) = λy(x), y(0) = 1, 3.3 Convergence of Multi-Step Methods 63 with the solution y(x) = exp(λx). Applying the 2-step Milne method for the solution of this problem, then the method (3.20) has the form yk+1 = yk−1 + hλ � 1 3 yk+1 + 4 3 yk + 1 3 yk−1 � . This equation can be rewritten as a linear difference equation � 1 − hλ 3 � yk+1 − 4hλ 3 yk − � 1 + hλ 3 � yk−1 = 0. The general solution of this difference equation can be represented in the form yk = C1ξk 1 + C2ξk 2 , (3.21) where ξ1(h) and ξ2(h) are the solutions of the quadratic equation � 1 − hλ 3 � ξ2 − 4hλ 3 ξ − � 1 + hλ 3 � = 0. One obtains ξ1(h) = 3 3 − hλ  2hλ 3 + � 1 + (hλ)2 3   , ξ2(h) = 3 3 − hλ  2hλ 3 − � 1 + (hλ)2 3   . Now, the constants C1, C2 can be determined from the initial condition and from the value of the first step. It is for x = 0 C1 + C2 = 1 (3.22) and for x = h, taking the exact value, eλh = C1ξ1 + (1 − C1)ξ2 = (1 − C2) ξ1 + C2ξ2. (3.23) Expanding ξ1(h) and ξ2(h) in powers of h at h = 0, one obtains as first order approximation by computing the derivatives of ξ1(h) and ξ2(h), respectively, and inserting zero ξ1(h) = 1 + λh + O � h2 � , ξ2(h) = −1 + λ 3 h + O � h2 � . (3.24) In the interesting case, λ < 0, ξ1(h) approaches 1 from left, with values smaller than 1, and ξ2(h) approaches −1 also from left, but here the modulus of ξ2(h) is larger than 1. The last property leads to undesired effects. 64 3 Multi-Step Methods For the approximate solution in the node xk = kh, k = 0, 1, . . ., one gets with (3.21) yk = C1 � 1 + λh + O � h2 ��xk/h + C2 � −1 + λ 3 h + O � h2 ��xk/h . (3.25) The first term converges to exp(λxk) for h → 0, since lim h→0 (1 + λh) xk/h = lim h→0 � 1 + λxk h xk �xk/h = exp(λxk). It behaves like the solution of the model initial value problem. The second term behaves for small h like (−1)xk/h � 1 − λ 3 h �xk/h . Here, the second factor converges to exp(−λxk/3), but the first factor oscillates for xk/h ∈ N. That means, for the stable initial value problem with λ < 0, this term gives an oscillatory, bounded (for fixed xk), but exponentially large perturbation. The behavior of the solution depends on the constants C1 and C2. Inserting the expansion (3.24) in condition (3.23) gives eλh = (1 − C2) � 1 + λh + O � h2 �� + C2 � −1 + λ 3 h + O � h2 �� = 1 + λh − 2C2 − 2λh 3 C2 + O � h2 � . An expansion of the exponential yields 1 + λh + O � h2 � = 1 + λh − 2C2 − 2λh 3 C2 + O � h2 � =⇒ O � h2 � = −2C2 − 2λh 3 C2. It follows that C2(h) = O � h2 � and from (3.22), it follows that C1 = O (1). In summary, it is for the second term of (3.25) lim h→0 C2(h) � −1 + λ 3 h + O � h2 ��xk/h � �� � bounded = 0. 
The method converges. However, the term 3.3 Convergence of Multi-Step Methods 65 Fig. 3.2 Example 3.22: application of the 2-step Milne method to the model problem with λ = −5 and h ∈ {0.1, 0.01, 0.001} (left to right, top to bottom). C2(h) � −1 + λ 3 h + O � h2 ��xk/h ≈ ±h2 exp � − λxk 3 � becomes small in the case λ � −1 and large xk only if the step size h is very small, see Figure 3.2. The behavior found for this method can be observed in practice for all qstep methods of consistency order q+2 if these methods are applied to initial value problems with exponentially decaying solution. This kind of instability is a strong restriction of the usefulness of these methods. ✷ Remark 3.23. Start of multi-step methods and convergence. Apart of the consistency of multi-step methods, one is above all interested in their convergence. For one-step methods, convergence follows from consistency under rather general assumptions and the order of consistency and convergence are the same, see Theorem 1.19. The situation becomes more complicated for multi-step methods. First of all, one needs for starting a q-step method besides the known initial value y0 = y(x0) still (q − 1) further approximations y1, . . . , yq−1 for y(x1), . . . , y(xq−1). These values can be computed, for instance by a onestep method. The accuracy of these approximations has a strong influence on the accuracy of the q-step method that uses these values. Assume that the 66 3 Multi-Step Methods approximations behave as follows y0 = y(x0), y1 = y(x1) + ε1(h), . . . , yq−1 = y(xq−1) + εq−1(h). Then, the values that are computed with the q-step method depend also on the perturbations ε1(h), . . . , εq−1(h) and one should write for the computed solution in the node xk more exactly yk(ε, h), where ε(x, h) is a function for which εi(h) = ε(xi, h), i = 1, . . . , q − 1, holds. ✷ Definition 3.24. Global error. Let y(x) be the solution of the initial value problem (3.1). Denote the approximations of y(x) that are computed with a multi-step method with step length h by yk(ε, h), where the accuracy of the initial approximations is given by the function ε(x, h). Then, the quantity e(xk, ε, h) := yk(ε, h) − y(xk) is called global error or global discretization error at the node xk with respect to the step length h and the perturbations ε(x, h). ✷ Definition 3.25. Convergence of a multi-step method. Consider the ordinary differential equation of the initial value problem (3.1) in [a, b] and let x0 ∈ [a, b]. A multi-step method for solving initial value problems of form (3.1) is called convergent if lim n→∞ e(x, ε, hn) = 0, with hn = x − x0 n , for all x ∈ [a, b], for all f ∈ C1 ([a, b] × R), and for all functions ε(x, h) with lim n→∞ |ε(x, hn)| = 0, for x = x0 + ihn, i = 0, . . . , q − 1. ✷ Lemma 3.26. A convergent linear multi-step methods is null stable. A convergent linear multi-step method (3.2) is null stable. Proof. The proof is performed by contradiction. Assume that the linear multi-step method is not null stable. Consider the initial value problem y � (x) = 0, y(0) = 0, whose solution is y(x) = 0. Applying a linear multi-step method of form (3.2) to this problem yields the homogeneous linear difference equation yk+1 − q−1� j=0 ajyk−j = 0. (3.26) Since the method is assumed to be not null stable, the corresponding first characteristic polynomial Ψ(ξ) has a root with |ξ1| > 1 or a root |ξ2| = 1 that is not simple. Without loss 3.3 Convergence of Multi-Step Methods 67 of generality, let the multiplicity of ξ1 be one and of ξ2 be two. 
Similarly to Example 3.17, one finds that the solution of (3.26) in the node xk = kh is given by yk = C1ξ k 1 + C2kξ k 2 , C1, C2 ∈ R, where one of these coefficients is not zero. Consider a fixed x with x = mh, m ∈ N. Choosing C1 = C2 = √ h, where it will be discussed below that this is an admissible choice, the solution in x is given by √ hξ x/h 1 + x √ h ξ x/h 2 . (3.27) For the initial value x = 0, the value of (3.27) is √ h and for the initial value x = h, it is √ hξ1 + √ hξ2. Thus, for the initial values, (3.27) converges to the analytic solution as h → 0, so that the choices of C1 and C2 were admissible. However, for other values of x, which is assumed to be fix, both terms in (3.27) diverge. This observation contradicts the assumed convergence of the linear multi-step method. Hence, it is null stable. � Theorem 3.27. Connection of convergence and null stability. Let yk+1 = q−1� j=0 ajyk−j + hΦ(xk+1, . . . , xk+1−q, yk+1, . . . , yk+1−q, h) (3.28) be a consistent multi-step method for the solution of initial value problems of form (3.1), which is more general than a linear multi-step method. Assume that the incremental function satisfies the following conditions: i) Φ(xk+1, . . . , xk+1−q, yk+1, . . . , yk+1−q, h) ≡ 0 for all x ∈ [a, b], all yk ∈ R, and all h ∈ R if f(x, y) ≡ 0. ii) Lipschitz continuity with respect to the y-components, i.e., there are constants h0 > 0 and M such that � �Φ(xq, . . . , x0, vq, . . . , v0, h) − Φ(xq, . . . , x0, wq, . . . , w0, h) � � ≤ M q� i=0 |vi − wi| for all xq, . . . , x0 ∈ [a, b], all vi, wi ∈ R, i = 0, . . . , q, and all step sizes h with h < h0. Then, the multi-step method converges if and only if it is null stable. Proof. For the proof, it is referred to the literature, e.g., (Strehmel et al., 2012, Section 4.2.5). � Remark 3.28. To Theorem 3.27. • The first assumption and the null stability guarantee that the multi-step method solves the trivial initial value problem y� (x) = 0, y(x0) = 0, exactly if ε0 = ε1 = . . . = εq−1 = 0. 68 3 Multi-Step Methods • A linear multi-step method is a special case of (3.28). For linear multi-step methods, the first assumption is always satisfied, since the incremental function is a linear combination of values of the right-hand side f(x, y) of the ordinary differential equation. Due to the same reason, the incremental function of these methods satisfies the second assumption if the right-hand side f(x, y) is Lipschitz continuous with respect to the second argument. Altogether, if the right-hand side of the initial value problem is sufficiently smooth, then a consistent linear multi-step method is convergent if and only if it is null stable. ✷ Theorem 3.29. Order of convergence. Consider a multi-step method of the form (3.28) that satisfies the assumptions stated in Theorem 3.27 and which possesses the order of consistency p. Then, it holds for all f ∈ Cp ([a, b] × R) and for all x ∈ [a, b] that |e(x, ε, h)| = O (hp ) , if for the accuracy of the initial values it holds |εi(h)| = O (hp ) for i = 0, . . . , q − 1. Proof. See literature, e.g., (Strehmel et al., 2012, Section 4.2.5) or (Hairer et al., 1993, Chapter III.4). � Remark 3.30. Interpretation of Theorem 3.29. If a multi-step method with consistency order p should also have convergence order p, then it is necessary to compute the initial approximations sufficiently accurately, e.g., with a onestep method of order p. Considering the complete method, which consists of the starting method for computing the approximations y1, . . . 
, yq−1 and a predictor-corrector method for computing the other values, then the order of the complete method is determined by the partial method with the lowest order. ✷ 3.4 Backward Difference Formula (BDF) Methods Remark 3.31. Construction. The construction of Backward Difference Formula (BDF) methods is based on the original initial value problem (3.1) and not on the integral form (3.3) as it is the case for predictor-corrector methods. Given q + 1 nodes xk+1−q, . . . , xk+1 and q ≥ 1 known approximations of the solution yk+1−q, . . . , yk. Then, the idea of BDF methods consists in approximating the solution by an interpolation polynomial pq(x) of degree q with the nodes � xk+1−q, yk+1−q � , . . . , (xk, yk). Now, another condition is needed in order to define a polynomial of degree q and this condition shall also allow to compute yk+1. For BDF methods, one uses the requirement that 3.4 Backward Difference Formula (BDF) Methods 69 this polynomial should satisfy the differential equation (3.1) in xk+1, i.e., p� q (xk+1) = f (xk+1, yk+1) , (3.29) which leads to an interpolation of Hermite type. It follows from this requirement that BDF methods are implicit methods. ✷ Example 3.32. BDF methods. Consider an equidistant grid with grid size h. • q = 1. The linear interpolation polynomial through the points (xk, yk) and (xk+1, yk+1) is given by the Newton representation (using divided differences) p1(x) = yk+1 + (x − xk+1) yk − yk+1 xk − xk+1 . It is p� 1(x) = yk − yk+1 xk − xk+1 such that requirement (3.29) and xk − xk+1 = −h leads to yk − yk+1 −h = f (xk+1, yk+1) ⇐⇒ yk+1 = yk + hf (xk+1, yk+1) . Hence, BDF(1) is just the implicit Euler method. • q = 2. The Newton representation of the quadratic interpolation polynomial through (xk−1, yk−1) , (xk, yk) , (xk+1, yk+1) is given by p2(x) = yk+1 + (x − xk+1) yk − yk+1 xk − xk+1 + (x − xk+1)(x − xk) xk−1 − xk+1 � yk−1 − yk xk−1 − xk − yk − yk+1 xk − xk+1 � . (3.30) Computing the derivative of this polynomial and using that the grid is equidistant yields p� 2(x) = yk − yk+1 −h + (x − xk+1) + (x − xk) −2h � yk−1 − yk −h − yk − yk+1 −h � , such that requirement (3.29) leads to p� 2(xk+1) = yk+1 − yk h + h 2h � yk+1 − 2yk + yk−1 h � = f (xk+1, yk+1) . Collecting terms gives the BDF(2) method 3 2 yk+1 − 2yk + 1 2 yk−1 = hf (xk+1, yk+1) . (3.31) BDF(2) is the most popular multi-step method for stiff problems. 70 3 Multi-Step Methods • q ≥ 3. The derivation of higher order methods proceeds in the same way. One obtains, e.g., as BDF(3) method 11 6 yk+1 − 18 6 yk + 9 6 yk−1 − 2 6 yk−2 = hf (xk+1, yk+1) . (3.32) It should be emphasized that in BDF methods the right-hand side of the initial value problem appears only in one term, namely f (xk+1, yk+1). This situation is in contrast to the predictor-corrector methods from Section 3.2. This property of BDF methods is of advantage if the computation of the right-hand side is complicated or numerically expensive, like for special discretizations of partial differential equations. ✷ Lemma 3.33. Null stability of BDF(1), BDF(2), and BDF(3). The methods BDF(1), BDF(2), and BDF(3) are null stable. Proof. The statement of the lemma is obtained by computing the roots of the first characteristic polynomial. • q = 1. The characteristic polynomial is λ − 1 with the root λ1 = 1. • q = 2. The characteristic polynomial of BDF(2) (3.31) is λ 2 − 4 3 λ + 1 3 . A straightforward calculation gives λ1 = 2 3 + 1 3 = 1, λ2 = 2 3 − 1 3 = 1 3 . • q = 3. 
For BDF(3), see (3.32), one obtains the characteristic polynomial λ 3 − 18 11 λ 2 + 9 11 λ − 2 11 . By inserting, one checks that λ1 = 1 is a root of this polynomial. Extracting the linear factor with this root yields λ 3 − 18 11 λ 2 + 9 11 λ − 2 11 λ − 1 = λ 2 − 7 11 λ + 2 11 . The remaining roots are given by the roots of the quadratic polynomial, which are λ2 = 7 + i √ 39 22 , λ3 = 7 − i √ 39 22 , such that |λ2| = |λ3| = √ 22/11 ≈ 0.4264. � Remark 3.34. Null stability of BDF(q) methods. It can be shown that BDF(q) methods are null stable only for q ≤ 6, e.g., see Cryer (1972). Hence, BDF(q) methods for q > 6 are not of interest. ✷ Lemma 3.35. Consistency of BDF(q) methods. BDF(q) methods with q ≤ 6 are consistent of order q. 3.4 Backward Difference Formula (BDF) Methods 71 Proof. The proof is obtained by a Taylor series expansion (exercise). � Theorem 3.36. Convergence of BDF(q) methods. Let f ∈ Cq ([a, b]×R) and Lipschitz continuous with respect to the second argument and let the initial values be computed sufficiently accurately, then the BDF(q) methods with q ≤ 6 are convergent of order q. Proof. The incremental function of BDF(q) methods is Φ(xk+1, . . . , xk+1−q, yk+1, . . . , yk+1−q, h) = f � xk+1, yk+1 � , so that the assumptions of Theorem 3.27 are satisfied. Because BDF(q) methods with q ≤ 6 are null stable and consistent of order q, the other assumptions of Theorem 3.29 are also satisfied and the statement of the theorem follows now from Theorem 3.29. � Remark 3.37. On the stability. Stability of multi-step methods is studied at the same initial value problem (2.7) as it was used for one-step methods. In the same way as in Example 3.17, one obtains a homogeneous difference equation yk+1 − q−1� j=0 ajyk−j = z q−1� j=−1 bjyk−j with z = λh. With the ansatz yk = ξk and after division by ξk+1−q , one obtains a characteristic equation Ψ(ξ) − zσ(ξ) = 0, (3.33) compare (3.16), where Ψ(ξ) is the first characteristic polynomial (3.18). The polynomial σ has the coefficients bj, compare (3.19). Note that for z = 0, only the roots of the first characteristic polynomial are considered, which are important for the null stability of the method. This relation might be the reason for the notion ‘null’ stable. ✷ Definition 3.38. Stability domain. The set S = � z ∈ C : for all roots ξl of (3.33) it holds |ξl| ≤ 1; if ξl is a multiple root, then it holds |ξl| < 1 � is called stability domain of a linear multi-step method. ✷ Definition 3.39. A-stability, A(α)-stability. A linear multi-step method is called A-stable if C− ⊂ S. It is called A(α)-stable with α ∈ (0, π/2) if � z ∈ C− with |arg(z) − π| ≤ α � ⊂ S, with arg(z) ∈ [0, 2π). ✷ Theorem 3.40 (Second Dahlquist barrier). An A-stable linear multistep method is at most of second order. 72 3 Multi-Step Methods Table 3.1 Values of α (in degree) for the A(α)-stability of BDF(q) methods. q 1 2 3 4 5 6 α 90 90 86.03 73.35 51.84 17.84 Proof. See literature, e.g., (Strehmel et al., 2012, Section 9.1). � Remark 3.41. A(α)-stability of BDF(q) methods. BDF(q) methods are A(α)stable for q ≤ 6 and even A-stable for q ≤ 2. The values of α for BDF(q) methods are given in Table 3.1. Because of the small value of α for q = 6, the method BDF(6) is not used in practice. ✷ Remark 3.42. Variable step size for BDF(q) methods. BDF(q) methods can be used on non-equidistant grids, e.g., a formula for BDF(2) with variable step size can be derived on the basis of the quadratic interpolation polynomial (3.30). 
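For the equidistant case, the following minimal sketch (Python/NumPy as an assumed tool; the function name and test data are hypothetical) applies the BDF(2) scheme (3.31) to the linear test problem y′ = λy, for which the implicit equation in each step is linear and can be solved directly. It merely illustrates that large steps remain usable for strongly negative λ, in line with the A-stability of BDF(2).

```python
import numpy as np

def bdf2_linear(lam, y0, h, n_steps):
    """BDF(2), see (3.31), applied to the linear test problem y'(x) = lam*y(x),
    y(0) = y0. The second starting value is computed with BDF(1), i.e. with the
    implicit Euler method."""
    ys = [y0, y0 / (1.0 - h * lam)]            # y_0 and the starting value y_1
    for k in range(1, n_steps):
        # (3/2) y_{k+1} - 2 y_k + (1/2) y_{k-1} = h * lam * y_{k+1}
        ys.append((2.0 * ys[k] - 0.5 * ys[k - 1]) / (1.5 - h * lam))
    return np.array(ys)

# Stiff example: despite h*lam = -100, the approximations stay bounded and decay
# in magnitude, whereas an explicit method would need a much smaller step.
print(bdf2_linear(-1000.0, 1.0, 0.1, 10))
```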
For q > 1, there is some restriction on the admissible change of the step size from one mesh cell to its neighbor; e.g., for BDF(2), stability is guaranteed as long as hk+1/hk ≤ 2.41421, see (Strehmel et al., 2012, p. 328) for more details. ✷

Chapter 4 Summary and Outlook

4.1 Comparison of Numerical Methods

Remark 4.1. Motivation. Given an initial value problem in practice, one has to choose a method for its numerical solution. It is desirable to use a method that is appropriate for the given problem. This section discusses some criteria for making this choice. ✷

Remark 4.2. Criteria for comparing numerical methods for solving initial value problems.
• Computing time. Computing time is important in many applications. If the evaluation of the right-hand side of the initial value problem is time-consuming, the number of evaluations is important. For implicit methods, the number of evaluations of the Jacobian and the number of LU factorizations are of importance.
• Accuracy. Computing an accurate numerical solution is of course desirable. However, aiming for high accuracy is often in conflict with short computing times. An easy step length control should be possible.
• Memory. On modern computers, the amount of memory is usually not a big issue. However, if the given initial value problem has special structures, like a sparse Jacobian, such structures should be supported by the numerical method. It should be kept in mind that on modern computers the access to the memory, and not the number of floating point operations, essentially determines the computing time.
• Reliability. The step length control (or order control) should be sensitive with respect to local changes of the right-hand side and react correctly.
• Robustness. The method should also work for complicated examples. It should be flexible with the step length control, e.g., reduce the step length appropriately if the right-hand side has steep gradients.
• Simplicity. In complex applications, the simplicity of the method is often of importance. ✷

Remark 4.3. Some experience. There are many more numerical methods for solving initial value problems than presented in this course. Here, only some remarks on the presented methods are given.
• Non-stiff problems. For such problems, explicit one-step and linear multi-step methods were presented. Often, one-step methods need fewer steps than multi-step methods and they require fewer evaluations of the right-hand side. A few popular explicit one-step methods are given in Remark 1.45.
• Complex problems. For complex problems, e.g., from fluid dynamics, where one has an initial value problem with respect to time, very often only the simplest methods are used, like the explicit or implicit Euler method, the trapezoidal rule (Crank–Nicolson scheme), or sometimes BDF(2). An efficient and theoretically supported step length control is not possible with these methods. Other methods, like Rosenbrock schemes, have been used only for academic problems so far, e.g., in John & Rang (2010). ✷

4.2 Boundary Value Problems

Remark 4.4. A one-dimensional boundary value problem. Boundary value problems prescribe, in contrast to initial value problems, values at (some part of) the boundary of the domain. A typical example in one dimension is

−u′′ = f in (0, 1), u(0) = a, u(1) = b, (4.1)

with given values a, b ∈ R. Solving (4.1) can in principle be performed by integrating the right-hand side of the differential equation twice.
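For instance, assuming the particular right-hand side f ≡ 1 (an illustration chosen here, not an example from the notes), two integrations of −u′′ = 1 give u(x) = −x²/2 + c1x + c0. The boundary conditions yield c0 = u(0) = a and −1/2 + c1 + a = u(1) = b, i.e., c1 = b − a + 1/2, hence u(x) = a + (b − a)x + x(1 − x)/2.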
Whether or not this analytic calculation can be performed depends on f. Each integration gives a constant, such that the general solution of the differential equation contains two constants. The values of these constants can be determined from the given boundary conditions. ✷

Remark 4.5. Boundary value problems in higher dimensions. Let Ω ⊂ Rd, d ∈ {2, 3}, be a bounded domain. Then, a typical boundary value problem is

−Δu = −Σ_{i=1}^{d} ∂²u/∂x_i² = f in Ω, u = g on ∂Ω, (4.2)

where Δ is the Laplacian and ∂Ω is the boundary of Ω. Solutions of problems of type (4.2) can hardly be found analytically. The topic of Numerical Mathematics III will be the introduction of numerical methods for solving problems of type (4.2). Such methods include Finite Difference Methods, Finite Element Methods, and Finite Volume Methods. ✷

4.3 Differential-Algebraic Equations

Remark 4.6. Differential-Algebraic Equations (DAEs). In many applications, the modeling of processes leads to a coupled system of equations of different type. A typical example is a system of the form

y′(t) = f(t, y(t), z(t)),
0 = g(t, y(t), z(t)), (4.3)

with given functions f and g. In (4.3), the derivative of y with respect to t occurs, but not the derivative of z with respect to t. Problem (4.3) is called a semi-explicit differential-algebraic equation (DAE), the variable y is the differential variable, and z is the algebraic variable. In this context, the notion ‘algebraic’ means that there are no derivatives. ✷

Example 4.7. Equations for incompressible fluids. Equations for the behavior of incompressible fluids are derived on the basis of two conservation laws. The first one is Newton’s second law of motion (net force equals mass times acceleration, conservation of linear momentum) and the second one is the conservation of mass. The unknown variables in these equations are the velocity u(t, x, y, z) = (u1(t, x, y, z), u2(t, x, y, z), u3(t, x, y, z))^T and the pressure p(t, x, y, z), where t is the time and (x, y, z) is the spatial variable. Whereas the conservation of linear momentum contains the temporal derivative ∂tu of the velocity, i.e., it is a differential equation with respect to time, the conservation of mass reads

∇ · u = ∂xu1 + ∂yu2 + ∂zu3 = 0. (4.4)

Thus, one obtains a coupled model consisting of the differential equation (with respect to time) and the algebraic equation (4.4). ✷

Remark 4.8. Theory of DAEs. There is not sufficient time for presenting the theory of DAEs. One can find it, e.g., in (Strehmel et al., 2012, Chapter 13) or (Kunkel & Mehrmann, 2006, Part I). ✷

Remark 4.9. Direct approach for the discretization of DAEs. In the so-called direct approach, the DAE (4.3) is embedded into the singularly perturbed problem

y′(t) = f(t, y(t), z(t)),
εz′(t) = g(t, y(t), z(t)), (4.5)

with 0 < ε ≪ 1. Problem (4.5) is an ordinary differential equation, to which the methods presented in this course can be applied. When these methods are formulated for problem (4.5), the so-called singular perturbation parameter ε appears. Setting ε = 0 in these formulations then leads formally to methods for the DAE (4.3). One then has to study the properties of these methods. ✷

Appendix A Topics on the Theory of Ordinary Differential Equations

A.1 Ordinary Differential Equations of Higher Order

Remark A.1. Motivation. The notion of stiffness comes from the consideration of first order systems of ordinary differential equations.
There are some connections of such systems to ordinary differential equations of higher order, e.g. a solution method for linear first order systems requires the solution of a higher order linear differential equation, see Remark A.36. ✷ A.1.1 Definition, Connection to First Order Systems Definition A.2. General and explicit n-th order ordinary differential equation. The general ordinary differential equation of order n has the form F x, y(x), y′ (x), . . . , y(n) (x) = 0. (A.1) This equation is called explicit, if one can write it in the form y(n) (x) = f x, y(x), y′ (x), . . . , y(n−1) (x) . (A.2) The function y(x) is a solution of (A.1) in an interval I if y(x) is n times continuously differentiable in I and if y(x) satisfies (A.1). Let x0 ∈ I be given. Then, (A.1) together with the conditions y(x0) = y0, y′ (x0) = y1, . . . , y(n−1) (x0) = yn−1 is called initial value problem for (A.1). ✷ 77 78 A Topics on the Theory of Ordinary Differential Equations Example A.3. Special cases. The general resp. explicit ordinary differential equation of higher order can be solved analytically only in special cases. Two special cases, that will not be considered here, are as follows: • Consider the second order differential equation y′′ (x) = f(x, y′ (x)). Substituting y′ (x) = z(x), one obtains a first order differential equation for z(x) z′ (x) = f(x, z(x)). If one can solve this equation analytically, one gets y′ (x). If it is then possible to find a primitive of y′ (x), one has computed an analytical solution of the differential equation of second order. In the case of an initial value problem with y(x0) = y0, y′ (x0) = y1, the initial value for the first order differential equation is z(x0) = y1. The second initial value is needed for determining the constant of the primitive of y′ (x). • Consider the differential equation of second order y′′ (x) = f(y, y′ ). Let a solution y(x) of this differential equation be known and let y−1 (y) its inverse function, i.e. y−1 (y(x)) = x. Then, one can use the ansatz p(y) := y′ y−1 (y) . With the rule for differentiating the inverse function ((f−1 )′ (y0) = 1/f′ (x0)), one obtains dp dy (y) = y′′ y−1 (y) d dy y−1 (y(x)) = y′′ y−1 (y) y′(x) = y′′ y−1 (y) y′ (y−1(y)) = y′′ y−1 (y) p(y) = y′′ (x) p(y) . This approach leads then to the first order differential equation p′ (y) = f(y, p(y)) p(y) . ✷ A.1 Ordinary Differential Equations of Higher Order 79 Theorem A.4. Connection of explicit ordinary differential equations of higher order and systems of differential equations of first order. Every explicit differential equation of n-th order (A.2) can be transformed equivalently to a system of n differential equations of first order y′ k(x) = yk+1(x), k = 1, . . . , n − 1, y′ n(x) = f(x, y1(x), . . . , yn(x)) (A.3) or (note that the system is generally nonlinear, since the unknown functions appear also in f(·, . . . , ·)) y′ (x) =      y′ 1(x) y′ 2(x) ... y′ n(x)      =      0 1 0 · · · 0 0 0 1 · · · 0 ... ... ... ... ... 0 0 0 · · · 0           y1(x) y2(x) ... yn(x)      +      0 0 ... f(x, y1, . . . , yn)      for the n functions y1(x), . . . , yn(x). The solution of (A.2) is y(x) = y1(x). Proof. Insert in (A.2) y1(x) := y(x), y2(x) := y′ 1(x) = y′ (x), y3(x) := y′ 2(x) = y′′ (x), . . . yn(x) := y′ n−1(x) = y(n−1) (x). If y ∈ Cn(I) is a solution of (A.2), then y1(x), . . . , yn(x) is obviously a solution of (A.3) in I. Conversely, if y1(x), . . . 
, yn(x) ∈ C1(I) is a solution of (A.3), then it holds y2(x) = y′ 1(x), y3(x) = y′ 2(x) = y′′ 1 (x), . . . , yn(x) = y (n−1) 1 (x) y′ n(x) = y (n) 1 (x) = f(x, y1, . . . , yn). Hence, the function y1(x) is n times continuously differentiable and it is the solution of (A.2) in I. Example A.5. Transform of a higher order differential equation into a system of first order equations. The third order differential equation y′′′ (x) + 2y′′ (x) − 5y′ (x) = f(x, y(x)) can be transformed into the form y1(x) = y(x) y′ 1(x) = y2(x)(= y′ (x)) y′ 2(x) = y3(x)(= y′′ (x)) y′ 3(x) = y′′′ (x) = −2y′′ (x) + 5y′ (x) + f(x, y(x)) = −2y3(x) + 5y2(x) + f(x, y1(x)). ✷ 80 A Topics on the Theory of Ordinary Differential Equations A.1.2 Linear Differential Equations of n-th Order Definition A.6. Linear n-th order differential equations. A linear differential equation of n-th order has the form an(x)y(n) (x)+an−1(x)y(n−1) (x)+. . .+a1(x)y′ (x)+a0(x)y(x) = f(x), (A.4) where the functions a0(x), . . . , an(x) are continuous in the interval I, in which a solution of (A.4) is searched, and it holds an(x) = 0 in I. The linear n-th order differential equation is called homogeneous if f(x) = 0 for all x ∈ I an(x)y(n) (x) + an−1(x)y(n−1) (x) + . . . + a1(x)y′ (x) + a0(x)y(x) = 0. (A.5) ✷ Theorem A.7. Superposition principle for linear differential equations of higher order. Consider the linear differential equation of n-th order (A.4), then the superposition principle holds: i) If y1(x) and y2(x) are two solutions of the homogeneous equation (A.5), then c1y1(x) + c2y2(x), c1, c2 ∈ R, is a solution of the homogeneous equation, too. ii) If y0(x) is a solution of the inhomogeneous equation and y1(x) is a solution of the homogeneous equation, then y0(x) + y1(x) is a solution of the inhomogeneous equation. iii) If y1(x) and y2(x) are two solutions of the inhomogeneous equation, then y1(x) − y2(x) is a solution of the homogeneous equation. Proof. Direct calculations, exercise. Corollary A.8. General solution of the inhomogeneous differential equation. The general solution of (A.4) is the sum of the general solution of the homogeneous linear differential equation of n-th order (A.5) and one special solution of the inhomogeneous n-th order differential equation (A.4). Remark A.9. Transform in a linear system of ordinary differential equations of first order. A linear differential equation of n-th order can be transformed equivalently into a linear n × n system y′ k(x) = yk+1(x), k = 1, . . . , n − 1, y′ n(x) = − n−1 i=0 ai(x) an(x) yi+1(x) + f(x) an(x) or A.1 Ordinary Differential Equations of Higher Order 81 y′ (x) =      y′ 1(x) y′ 2(x) ... y′ n(x)      =      0 1 0 · · · 0 0 0 1 · · · 0 ... ... ... ... ... − a0(x) an(x) − a1(x) an(x) − a2(x) an(x) · · · −an−1(x) an(x)           y1(x) y2(x) ... yn(x)      +      0 0 ... f(x) an(x)      =: A(x)y(x) + f(x). (A.6) ✷ Theorem A.10. Existence and uniqueness of a solution of the initial value problem. Let I = [x0−a, x0+a] and ai ∈ C(I), i = 0, . . . , n, f ∈ C(I). Then, the linear differential equation of n-th order (A.4) has exactly one solution y ∈ Cn (I) for given initial value y(x0) = y0, y′ (x0) = y1, . . . , y(n−1) (x0) = yn−1. Proof. Since (A.4) is equivalent to the system (A.6), one can apply the theorem on global existence and uniqueness of a solution of an initial value problem from Picard–Lindel¨of, see lecture notes Numerical Mathematics I or the literature. 
To this end, one has to show the Lipschitz continuity of the right-hand side of (A.6) with respect to y1, . . . , yn. Denoting the right-hand side by F(x, y) gives F(x, y) − F(x, ˜y) [C(I)]n = A(y − ˜y) [C(I)]n ≤ A [C(I)]n,∞ y − ˜y [C(I)]n , where one uses the triangle inequality to get Ai·y C(I) = max x∈I n j=1 aij(x)yj(x) ≤ max x∈I n j=1 |aij(x)| max j=1,...,n max x∈I |yj(x)| = Ai· C(I) y [C(I)]n for i = 1, . . . , n. Now, one can choose L = A [C(I)]n,∞ = max x∈I max 1, a1(x) an(x) + . . . + an−1(x) an(x) . All terms are bounded since I is closed (compact) and continuous functions are bounded on compact sets. Definition A.11. Linearly independent solutions, fundamental system. The solutions yi(x) : I → R, i = 1, . . . , k, of (A.5) are called linearly independent if from k i=1 ciyi(x) = 0, for all x ∈ I, ci ∈ R, 82 A Topics on the Theory of Ordinary Differential Equations it follows that ci = 0 for i = 1, . . . , k. A set of n linearly independent solutions is called a fundamental system of (A.5). ✷ Definition A.12. Wronski1 matrix, Wronski determinant. Let yi(x), i = 1, . . . , k, be solutions of (A.5). The matrix W(x) =      y1(x) . . . yk(x) y′ 1(x) . . . y′ k(x) ... y (n−1) 1 (x) . . . y (n−1) k (x)      is called Wronski matrix. For k = n the Wronski determinant is given by det(W)(x) =: W(x). ✷ Lemma A.13. Properties of the Wronski matrix and Wronski determinant. Let I = [a, b] and let y1(x), . . . , yn(x) be solutions of (A.5). i) The Wronski determinant fulfills the linear first order differential equation W′ (x) = − an−1(x) an(x) W(x). ii) It holds for all x ∈ I W(x) = W(x0) exp − x x0 an−1(t) an(t) dt with arbitrary x0 ∈ I. iii) If there exists a x0 ∈ I with W(x0) = 0, then it holds W(x) = 0 for all x ∈ I. iv) If there exists a x0 ∈ I with rank(W(x0)) = k, then there are at least k solutions of (A.5), e.g. y1(x), . . . , yk(x), linearly independent. Proof. i) Let Sn be the set of all permutations of {1, . . . , n} and let σ ∈ Sn. Denote the entries of the Wronski matrix by W(x) = yjk(x) n j,k=1 . If σ = (σ1, . . . , σn), then let n j=1 yj,σj (x) = (y1,σ1 y2,σ2 . . . yn,σn ) (x). Applying the Laplace2 formula for determinants and the product rule yields 1 Joseph Marie Wronski (1758 – 1853) 2 Pierre–Simon (Marquis de) Laplace (1749 – 1827) A.1 Ordinary Differential Equations of Higher Order 83 d dx det(W(x)) = d dx   σ∈Sn  sgn(σ) n j=1 yj,σj (x)     = σ∈Sn  sgn(σ) n i=1   n j=1,j=i yj,σj (x)   y′ i,σi (x)   = n i=1   σ∈Sn  sgn(σ) n j=1,j=i yj,σj (x)y′ i,σi (x)     = n i=1 det    · · · · · · · · · y (i−1) 1 (x) ′ · · · y (i−1) n (x) ′ · · · · · · · · ·    . exercise for n = 2, 3. In the last step, again the Laplace formula for determinants was applied. In the i-th row of the last matrix is the first derivative of the corresponding row of the Wronski matrix, i.e. there is the i-th order derivative of (y1(x), . . . , yn(x)). The rows with dots in this matrix coincide with the respective rows of W(x). For i = 1, . . . , n − 1, the determinants vanish, since in these cases there are two identical rows, namely row i and i + 1. Thus, it is d dx det(W(x)) = det         y1(x) . . . yn(x) y′ 1(x) . . . y′ n(x) ... y (n−2) 1 (x) . . . y (n−2) n (x) y (n) 1 (x) . . . y (n) n (x)         . Now, one uses that y1(x), . . . , yn(x) are solutions of (A.5) and one replaces the n-th derivative in the last row by (A.5). Using rules for the evaluation of determinants, one obtains d dx det(W(x)) = n i=1 − ai−1(x) an(x) det      y1(x) . . . 
yn(x) y′ 1(x) . . . y′ n(x) ... y (i−1) 1 (x) . . . y (i−1) n (x)      . Apart of the last term, all other determinants vanish, since all other terms have two identical rows, namely the i-th row and the last row. ii) This term is the solution of the initial value problem for the Wronski determinant and the initial value W(x0), see the respective theorem in the lecture notes of Numerical Mathematics I. iii) This statement follows directly from ii) since the exponential does not vanish. iv) exercise Theorem A.14. Existence of a fundamental system, representation of the solution of a homogeneous linear differential equation of nth order by the fundamental system. Let I = [a, b] with x0 ∈ I. The homogeneous equation (A.5) has a fundamental system in I. Each solution of (A.5) can be written as a linear combination of the solutions of an arbitrary fundamental system. 84 A Topics on the Theory of Ordinary Differential Equations Proof. Consider n homogeneous initial value problems with the initial values y (i−1) j (x0) = δij, i, j = 1, . . . , n. Each of these initial value problems has a unique solution yj(x), see Theorem A.10. It is W(x0) = 1 for these solutions. From Lemma A.13, iii), it follows that {y1(x), . . . , yn(x)} is a fundamental system. Let y(x) be an arbitrary solution of (A.5) with the initial values y(i−1)(x0) = ˜yi−1, i = 1, . . . , n, and {y1(x), . . . , yn(x)} an arbitrary fundamental system. The system      y1(x0) . . . yn(x0) y′ 1(x0) . . . y′ n(x0) . . . y (n−1) 1 (x0) . . . y (n−1) n (x0)           c0 c1 ... cn−1      =      ˜y0 ˜y1 ... ˜yn−1      has a unique solution since the matrix spanned by a fundamental system is not singular. The function n i=1 ci−1yi(x) satisfies the initial conditions (these are just the equations of the system) and, because of the superposition principle, it is a solution of (A.5). Since the solution of the initial value problem to (A.5) is unique, Theorem A.10, it follows that y(x) = n i=1 ci−1yi(x). Theorem A.15. Special solution of the inhomogeneous equation. Let {y1(x), . . . , yn(x)} be a fundamental system of the homogeneous equation (A.5) in I = [a, b]. In addition, let Wl(x) be the determinant, which is obtained from the Wronski determinant W(x) with respect to {y1(x), . . . , yn(x)} by replacing the l-th column by (0, 0, . . . , f(x)/an(x))T . Then, y(x) = n l=1 yl(x) x x0 Wl(t) W(t) dt, x0, x ∈ I, is a solution of the inhomogeneous equation (A.4). Proof. The proof uses the principle of the variation of the constants. This principle will be explained in a simpler setting in Remark A.27. For details of the proof, see the literature. A.1.3 Linear n-th Order Differential Equations with Constant Coefficients Definition A.16. Linear differential equation of n-th order with constant coefficients. A linear n-th order differential equation with constant coefficients has the form any(n) (x) + an−1y(n−1) (x) + . . . + a1y′ (x) + a0y(x) = f(x), (A.7) with ai ∈ R, i = 0, . . . , n, an = 0. ✷ A.1 Ordinary Differential Equations of Higher Order 85 A.1.3.1 The Homogeneous Equation Remark A.17. Basic approach for solving the homogeneous linear differential equation of n-th order with constant coefficients. Because of the superposition principle, one needs the general solution of the homogeneous differential equation. That means, one has to find a fundamental system, i.e. n linearly independent solutions. Consider n i=0 aiy (i) h (x) = 0. (A.8) In the case of a differential equation of first order, i.e. 
n = 1, a1y′ h(x) + a0yh(x) = 0, one can get the solution by the method of separating the variables (unknowns), see lecture notes of Numerical Mathematics I. One obtains yh(x) = c exp − a0 a1 x , c ∈ R. One uses the same structural ansatz for computing the solution of (A.8) yh(x) = eλx , λ ∈ C. (A.9) It follows that y′ h(x) = λeλx , . . . , y (n) h (x) = λn eλx . Inserting into (A.8) gives anλn + an−1λn−1 + . . . + a1λ + a0 eλx = 0. (A.10) It is eλx = 0, also for complex λ. Because, using Euler’s formula, it holds for λ = a + ib, a, b ∈ R, that eλx = eax (cos(bx) + i sin(bx)) = eax cos(bx) + ieax sin(bx). A complex number is zero iff its real part and its imaginary part are vanish. It is eax > 0 and there does not exist a (bx) ∈ R such that at the same time sin(bx) and cos(bx) vanish. Hence, eλx = 0. The equation (A.10) is satisfied iff one of the factors is equal to zero. Since the second factor cannot vanish, it must hold p(λ) := anλn + an−1λn−1 + . . . + a1λ + a0 = 0. The function p(λ) is called characteristic polynomial of (A.8). The roots of the characteristic polynomial are the values of λ in the ansatz of yh(x). 86 A Topics on the Theory of Ordinary Differential Equations From the fundamental theorem of algebra it holds that p(λ) has exactly n roots, which do not need to be mutually different. Since the coefficients of p(λ) are real numbers, it follows that with each complex root λ1 = a + ib, a, b ∈ R, b = 0, also its conjugate λ2 = a − ib is a root of p(λ). It will be shown that the basic ansatz (A.9) is not sufficient in the case of multiple roots. ✷ Theorem A.18. Linearly independent solutions in the case of real roots with multiplicity k. Let λ0 ∈ R be a real root of the characteristic polynomial p(λ) with multiplicity k, 1 ≤ k ≤ n. Then, one can obtain with λ0 the k linearly independent solutions of (A.8) yh,1(x) = eλ0x , yh,2(x) = xeλ0x , . . . , yh,k(x) = xk−1 eλ0x . (A.11) Proof. For k = 2. yh,1(x), yh,2(x) solve (A.8). This statement is already clear for yh,1(x) since this function has the form of the ansatz (A.9). For yh,2(x) it holds y′ h,2(x) = (1 + λ0x) eλ0x , y′′ h,2(x) = 2λ0 + λ2 0x eλ0x , ... y (n) h,2(x) = nλn−1 0 + λn 0 x eλ0x . Inserting into the left-hand side of (A.8) yields eλ0x n i=0 ai(iλi−1 0 + λi 0x) = eλ0x x n i=0 aiλi 0 p(λ0) + n i=0 aiiλi 0 p′(λ0) . (A.12) It is p(λ0) = 0, since λ0 is a root of p(λ). The second term is the derivative p′(λ) of p(λ) at λ0. Since the multiplicity of λ0 is two, one can write p(λ) in the form p(λ) = (λ − λ0)2 p0(λ), where p0(λ) is a polynomial of degree n − 2. It follows that p′ (λ) = 2 (λ − λ0) p0(λ) + (λ − λ0)2 p′ 0(λ). Hence, it holds p′(λ0) = 0, (A.12) vanishes, and yh,2(x) is a solution of (A.8). yh,1(x), yh,2(x) are linearly independent. One has to show, Lemma A.13, that the Wronski determinant does not vanish. It holds W(x) = det yh,1(x) yh,2(x) y′ h,1(x) y′ h,2(x) = det eλ0x xeλ0x λ0eλ0x (1 + λ0x)eλ0x = e2λ0x det 1 x λ0 1 + λ0x = e2λ0x (1 + λ0x − λ0x) = e2λ0x > 0 for all x ∈ I. A.1 Ordinary Differential Equations of Higher Order 87 Roots of multiplicity k > 2. The principle proof is analogous to the case k = 2, where one uses the factorization p(λ) = (λ − λ0)k p0(λ). The computation of the Wronski determinant becomes more involved. Remark A.19. Complex roots. The statement of Theorem A.18 is true also for complex roots of p(λ). The Wronski determinant is e2λ1x = 0. However, the corresponding solutions, e.g. ˜y1,h(x) = eλ1x = e(a+ib)x are complex-valued. 
Since one has real coefficients in (A.8), one likes to obtain also real-valued solutions. Such solutions can be constructed from the complex-valued solutions. Let λ1 = a + ib, λ1 = a − ib, a, b ∈ R, b = 0, be a conjugate complex roots of p(λ), then one obtains with Euler’s formula eλ1x = e(a+ib)x = eax (cos(bx) + i sin(bx)) , eλ1x = e(a−ib)x = eax (cos(bx) − i sin(bx)) . Because of the superposition principle, each linear combination is also solution of (A.8). ✷ Theorem A.20. Linearly independent solution for simple conjugate complex roots. Let λ1 ∈ C, λ1 = a+ib, b = 0, be a simple conjugate complex root of the characteristic polynomial p(λ) with real coefficients. Then, yh,1(x) = Re eλ1x = eax cos(bx), yh,2(x) = Im eλ1x = eax sin(bx), are real-valued, linearly independent solutions of (A.8). Proof. Use the superposition principle for proving that the functions are solutions and the Wronski determinant for proving that they are linearly independent, exercise. Theorem A.21. Linearly independent solution for conjugate complex roots with multiplicity greater than one. Let λ1 ∈ C, λ1 = a+ib, b = 0, be a conjugate complex root with multiplicity k of the characteristic polynomial p(λ) with real coefficients. Then, yh,1(x) = eax cos(bx), . . . , yh,k(x) = xk−1 eax cos(bx), yh,k+1(x) = eax sin(bx), . . . , yh,2k(x) = xk−1 eax sin(bx) (A.13) are real-valued, linearly independent solutions of (A.8). Proof. The proof is similarly to the previous theorems. Theorem A.22. Fundamental system for (A.8). Let p(λ) be the characteristic polynomial of (A.8) with the roots λ1, . . . , λn ∈ C, where the roots are counted in correspondence to their multiplicity. Then, the set of solutions of form (A.11) and (A.13) form a fundamental system of (A.8). 88 A Topics on the Theory of Ordinary Differential Equations Proof. A real root with multiplicity k gives k linearly independent solutions and a conjugate complex root with multiplicity k gives 2k linearly independent solutions. Thus, the total number of solutions of form (A.11) and (A.13) is equal to the number of roots of p(λ). This number is equal to n, because of the fundamental theorem of algebra. It is known from Theorem A.14 that a fundamental system has exactly n functions. Altogether, the correct number of functions is there. One can show that solutions that correspond to different roots are linearly independent, e.g., (G¨unther et al., 1974, p. 75). The linearly independence of the solutions that belong to the same root, was already proved. Example A.23. Homogeneous second order linear differential equation with constant coefficients. 1. Consider y′′ (x) + 6y′ (x) + 9y(x) = 0. The characteristic polynomial is p(λ) = λ2 + 6λ + 9 with the roots λ1 = λ2 = −3. One obtains the fundamental system yh,1(x) = e−3x , yh,2(x) = xe−3x . The general solution of the homogeneous equation has the form yh(x) = c1yh,1(x) + c2yh,2(x) = c1e−3x + c2xe−3x , c1, c2 ∈ R. 2. Consider y′′ (x) + 4y(x) = 0 =⇒ p(λ) = λ2 + 4 =⇒ λ1,2 = ±2i. It follows that yh,1(x) = cos(2x), yh,2(x) = sin(2x) yh(x) = c1 cos(2x) + c2 sin(2x), c1, c2 ∈ R. ✷ A.1.3.2 The Inhomogeneous Equation Remark A.24. Goal. Because of the superposition principle, a special solution of (A.7) has to be found. This section sketches several possibilities to obtain such a solution. ✷ Remark A.25. Appropriate ansatz (St¨orgliedans¨atze). If the right-hand side f(x) possesses a special form, it is possible to obtain a solution of the inhomogeneous equation (A.7) with an appropriate ansatz. 
From (A.7) it becomes clear, that this way works only if on the left-hand side and the right-hand A.1 Ordinary Differential Equations of Higher Order 89 side of the equation are the same types of functions. In particular, one needs the same types of functions for yi(x) and all derivatives up to order n. This approach works, e.g., for the following classes of right-hand sides: • f(x) is a polynomial f(x) = b0 + b1x + . . . + bmxm , bm = 0. The appropriate ansatz is also a polynomial yi(x) = xk (c0 + c1x + . . . + cmxm ) , where 0 is a root of p(λ) with multiplicity k. • If the right-hand side is f(x) = (b0 + b1x + . . . + bmxm ) eax , then one can use the following ansatz yi(x) = xk (c0 + c1x + . . . + cmxm ) eax , where a is a root of p(λ) with multiplicity k. The first class of functions is just a special case for a = 0. • For right-hand sides of the form f(x) = (b0 + b1x + . . . + bmxm ) cos(bx), f(x) = (b0 + b1x + . . . + bmxm ) sin(bx), one can use the ansatz yi(x) = xk (c0 + c1x + . . . + cmxm ) cos(bx) +xk (d0 + d1x + . . . + dmxm ) sin(bx), if ib is a root of p(λ) with multiplicity k. One can find the ansatz for more right-hand sides in the literature, e.g. in Heuser (2006). ✷ Example A.26. Appropriate ansatz (St¨orgliedansatz). Consider y′′ (x) − y′ (x) + 2y(x) = cos x. The appropriate ansatz is given by yi(x) = a cos x + b sin x =⇒ y′ i(x) = −a sin x + b cos x =⇒ y′′ i (x) = −a cos x − b sin x. Inserting into the equation gives 90 A Topics on the Theory of Ordinary Differential Equations −a cos x − b sin x + a sin x − b cos x + 2a cos x + 2b sin x = cos x =⇒ (−a − b + 2a) cos x + (−b + a + 2b) sin x = cos x. The last equation is satisfied if the numbers a, b solve the following linear system of equations a − b = 1, a + b = 0 =⇒ a = 1 2 , b = − 1 2 . One obtains the special solution yi(x) = 1 2 (cos x − sin x) . ✷ Remark A.27. Variation of the constants. If one cannot find an appropriate ansatz, then one can try the variation of the constants. This approach will be demonstrated for the second order differential equation y′′ (x) + a1y′ (x) + a0y(x) = f(x). (A.14) Let yh,1(x), yh,2(x) be two linearly independent solutions of the homogeneous differential equation such that yh(x) = c1yh,1(x) + c2yh,2(x) is the general solution of the homogeneous equation. Now, one makes the ansatz yi(x) = c1(x)yh,1(x) + c2(x)yh,2(x) with two unknown functions c1(x), c2(x). The determination of these functions requires two conditions. One has y′ i(x) = c′ 1(x)yh,1(x) + c1(x)y′ h,1(x) + c′ 2(x)yh,2(x) + c2(x)y′ h,2(x) = (c′ 1(x)yh,1(x) + c′ 2(x)yh,2(x)) + c1(x)y′ h,1(x) + c2(x)y′ h,2(x). Now, one sets the term in the parentheses zero. This is the first condition. It follows that y′′ i (x) = c′ 1(x)y′ h,1(x) + c1(x)y′′ h,1(x) + c′ 2(x)y′ h,2(x) + c2(x)y′′ h,2(x). Inserting this expression into (A.14) gives A.1 Ordinary Differential Equations of Higher Order 91 f(x) = c′ 1(x)y′ h,1(x) + c1(x)y′′ h,1(x) + c′ 2(x)y′ h,2(x) + c2(x)y′′ h,2(x) +a1 c1(x)y′ h,1(x) + c2(x)y′ h,2(x) + a0 (c1(x)yh,1(x) + c2(x)yh,2(x)) = c1(x) y′′ h,1(x) + a1y′ h,1(x) + a0yh,1(x) =0 +c2(x) y′′ h,2(x) + a1y′ h,2(x) + a0yh,2(x) =0 +c′ 1(x)y′ h,1(x) + c′ 2(x)y′ h,2(x). This is the second condition. Summarizing both conditions gives the following system of equations yh,1(x) yh,2(x) y′ h,1(x) y′ h,2(x) c′ 1(x) c′ 2(x) = 0 f(x) . This system possesses a unique solution since yh,1(x), yh,2(x) are linearly independent from what follows that the determinant of the system matrix, which is just the Wronksi matrix, is not equal to zero. 
The solution is c′ 1(x) = − f(x)yh,2(x) yh,1(x)y′ h,2(x) − y′ h,1(x)yh,2(x) , c′ 2(x) = f(x)yh,1(x) yh,1(x)y′ h,2(x) − y′ h,1(x)yh,2(x) . The success of the method of the variation of the constants depends only on the difficulty to find the primitives of c′ 1(x) and c′ 2(x). For equations of order higher than two, one has the goal to get a linear system of equations for c′ 1(x), . . . , c′ n(x). To this end, one sets for each derivative of the ansatz the terms with c′ 1(x), . . . , c′ n(x) equal to zero. The obtained linear system of equations has as matrix the Wronski matrix and as right-hand side a vector, whose first (n − 1) components are equal to zero and whose last component is f(x). ✷ Example A.28. Variation of the constants. Find the general solution of y′′ (x) + 6y′ (x) + 9y(x) = e−3x 1 + x . The general solution of the homogeneous equation is yh(x) = c1e−3x + c2xe−3x , see Example A.23. The variation of the constants leads to the following system of linear equations e−3x xe−3x −3e−3x (1 − 3x)e−3x c′ 1(x) c′ 2(x) = 0 e−3x 1+x . Using, e.g., the Cramer rule, gives 92 A Topics on the Theory of Ordinary Differential Equations c′ 1(x) = − e−6x x 1+x (1 − 3x + 3x)e−6x = − x 1 + x , c′ 2(x) = e−6x 1 1+x (1 − 3x + 3x)e−6x = 1 1 + x . One obtains c1(x) = − x 1 + x dx = − 1 + x 1 + x dx + 1 1 + x dx = −x + ln |1 + x| , c2(x) = 1 1 + x dx = ln |1 + x| . Thus, one gets yi(x) = (−x + ln |1 + x|) e−3x + ln |1 + x| xe−3x and one obtains for the general solution y(x) = (−x + ln |1 + x| + c1) e−3x + (ln |1 + x| + c2) xe−3x . Inserting this function into the equation proves the correctness of the result. ✷ A.2 Linear Systems of Ordinary Differential Equations of First Order A.2.1 Definition, Existence and Uniqueness of a Solution Definition A.29. Linear system of first order differential equations. In a linear system of ordinary differential equations of first order one tries to find functions y1(x), . . . , yn(x) : I → R, I = [a, b] ⊂ R, that satisfy the system y′ i(x) = n j=1 aij(x)yj(x) + fi(x), i = 1, . . . , n, or in matrix-vector notation y′ (x) = A(x)y(x) + f(x) (A.15) with A.2 Linear Systems of Ordinary Differential Equations of First Order 93 y(x) =    y1(x) ... yn(x)    , y′ (x) =    y′ 1(x) ... y′ n(x)    , A(x) =    a11(x) · · · a1n(x) ... ... ... an1(x) · · · ann(x)    , f(x) =    f1(x) ... fn(x)    , where aij(x), fi(x) ∈ C(I). If f(x) ≡ 0, then the system is called homogeneous. ✷ Theorem A.30. Superposition principle for linear systems. Consider the linear system of ordinary differential equations (A.15), then the superposition principle holds: i) If y1(x) and y2(x) are two solutions of the homogeneous systems, then c1y1(x) + c2y2(x), c1, c2 ∈ R, is a solution of the homogeneous system, too. ii) If y0(x) is a solution of the inhomogeneous system and y1(x) is a solution of the homogeneous system, then y0(x) + y1(x) is a solution of the inhomogeneous system. iii) If y1(x) and y2(x) are two solutions of the inhomogeneous system, then y1(x) − y2(x) is a solution of the homogeneous system. Proof. Direct calculations, exercise. Corollary A.31. General solution of the inhomogeneous system. i) If y1(x), y2(x), . . . , yk(x) are solutions of the homogeneous system, then any linear combination k i=1 ciyi(x), c1, . . . , ck ∈ R, is also a solution of the homogeneous system. ii) The general solution of the inhomogeneous system is the sum of a special solution of the inhomogeneous system and the general solution of the homogeneous system. Theorem A.32. 
Existence and uniqueness of a solution of the initial value problem. Let I = [x0 − a, x0 + a] and aij ∈ C(I), fi ∈ C(I), i, j = 1, . . . , n. Then, there is exactly one solution y(x) : I → Rn of the initial value problem to (A.15) with the initial value y(x0) = y0 ∈ Rn . Proof. The statement of the theorem follows from the theorem on global existence and uniqueness of a solution of an initial value problem from Picard–Lindel¨of, see lecture notes Numerical Mathematics I or the literature. Since the functions aij(x) are continuous on the closed (compact) interval I, they are also bounded due to the Weierstrass theorem. That means, there is a constant M with |aij(x)| ≤ M, x ∈ I, i, j = 1, . . . , n. Denoting the right hand side of (A.15) by f(x, y), it follows that 94 A Topics on the Theory of Ordinary Differential Equations f(x, y1) − f(x, y2) ∞ = max i=1,...,n |fi(x, y1) − fi(x, y2)| = max i=1,...,n n j=1 aij(x)y1,j(x) + fi(x) − n j=1 aij(x)y2,j(x) − fi(x) = max i=1,...,n n j=1 aij(x) (y1,j(x) − y2,j(x)) ≤ n max i,j=1,...,n |aij(x)| max i=1,...,n |y1,i(x) − y2,i(x)| ≤ nM y1 − y2 ∞ , i.e. the right hand side satisfies a uniform Lipschitz condition with respect to y with the Lipschitz constant nM. Hence, the assumptions of the theorem on global existence and uniqueness of a solution of an initial value problem from Picard–Lindel¨of are satisfied. A.2.2 Solution of the Homogeneous System Remark A.33. Scalar case. Because of the superposition principle, one needs the general solution of the homogeneous system y′ (x) = A(x)y(x) (A.16) for finding the general solution of (A.15). The homogeneous system has always the trivial solution y(x) = 0. In the scalar case y′ (x) = a(x)y(x), the general solution has the form y(x) = c exp x x0 a(t) dt , c ∈ R, x0 ∈ (a, b), see lecture notes Numerical Mathematics I or the literature. Also for the system (A.16), it is possible to specify the general solution with the help of the exponential. ✷ Theorem A.34. General solution of the homogeneous linear system of first order. The general solution of (A.16) is yh(x) = e x x0 A(t) dt c, c ∈ Rn , x0 ∈ (a, b). (A.17) The integral is defined component-wise. Proof. i) (A.17) is a solution of (A.16). This statement follows from the derivative of the matrix exponential and the rule on the differentiation of an integral with respect to the upper limit y′ h(x) = d dx e x x0 A(t) dt c = d dx x x0 A(t) dt e x x0 A(t) dt c = A(x)e x x0 A(t) dt c. A.2 Linear Systems of Ordinary Differential Equations of First Order 95 ii) every solution of (A.16) is of form (A.17). Consider an arbitrary solution ˜yh(x) of (A.16) with ˜yh(x0) ∈ Rn. Take in (A.17) c = ˜yh(x0). Then, it follows that yh(x0) = e x0 x0 A(t) dt ˜yh(x0) = e0 =I ˜yh(x0) = ˜yh(x0). That means, e x x0 A(t) dt ˜yh(x0) is a solution of (A.16) which has in x0 the same initial value as ˜yh(x). Since the solution of the initial value problem is unique, Theorem A.32, it follows that ˜yh(x) = e x x0 A(t) dt ˜yh(x0). A.2.3 Linear Systems of First Order with Constant Coefficients Remark A.35. Linear system of first order differential equations with constant coefficients. A linear system of first order differential equations with constant coefficients has the form y′ (x) = Ay(x) + f(x), A =    a11 · · · a1n ... ... ... an1 · · · ann    ∈ Rn×n . (A.18) Thus, the homogeneous system has the form y′ (x) = Ay(x). (A.19) Its general solution is given by yh(x) = eAx c, c ∈ Rn , (A.20) see Theorem A.34. ✷ Remark A.36. 
Elimination method, substitution method for the homogeneous system. One needs, due to the superposition principle, the general solution of the homogeneous system. In practice, it is generally hard to compute exp(Ax), because it is defined by an infinite series. For small systems, i.e., n ≤ 3, 4, one can use the elimination or substitution method for computing the general solution of (A.19). This method is already known from the solution of linear systems of equations. One solves one equation for a certain unknown function yi(x) and inserts the result into the other equations. For differential equations, the equation has to be differentiated, see Example A.37. This step reduces the dimension of the system by one. One continues with this method until one reaches an equation with only one unknown function. For this function, a homogeneous linear differential equation of order n has to be solved, see Section A.1.3. The other components of the solution vector of (A.19) can be obtained by back substitution. ✷

Example A.37. Elimination method, substitution method. Find the solution of

y'(x) = ( −3  −1 ; 1  −1 ) y(x), i.e., y1'(x) = −3y1(x) − y2(x), y2'(x) = y1(x) − y2(x).

Solving the second equation for y1(x) and differentiating gives

y1(x) = y2'(x) + y2(x), y1'(x) = y2''(x) + y2'(x).

Inserting into the first equation yields

y2''(x) + y2'(x) = −3(y2'(x) + y2(x)) − y2(x) ⟺ y2''(x) + 4y2'(x) + 4y2(x) = 0.

The general solution of this equation is

y2(x) = c1 e^{−2x} + c2 x e^{−2x}, c1, c2 ∈ R.

One obtains from the second equation

y1(x) = y2'(x) + y2(x) = (−c1 + c2) e^{−2x} − c2 x e^{−2x}.

Thus, the general solution of the given linear system of differential equations with constant coefficients is

y(x) = (−c1 + c2, c1)^T e^{−2x} + (−c2, c2)^T x e^{−2x}.

Note that one can choose the constants in y2(x), but the constants in y1(x) are then determined by the back substitution. If the constants are chosen via y1(x), one obtains

y(x) = (C1, −C1 − C2)^T e^{−2x} + (C2, −C2)^T x e^{−2x}.

If an initial condition is given, then the corresponding constants can be determined. ✷

Remark A.38. Other methods for computing the general solution of the homogeneous system. There are also other methods for computing the general solution of (A.19).
• The idea of the method of main vectors and eigenvectors consists in transforming the system to a triangular system. Then it is possible to solve the equations successively. To this end, one constructs with the so-called main vectors and eigenvectors an invertible matrix C ∈ R^{n×n} such that C^{−1}AC is a triangular matrix. One can show that such a matrix C exists for each A ∈ R^{n×n}. Then, one sets

y(x) = Cz(x) ⟹ y'(x) = Cz'(x).

Inserting into (A.19) yields

Cz'(x) = ACz(x) ⟺ z'(x) = C^{−1}ACz(x).

This is a triangular system for z(x), which is solved successively for the components of z(x). The solution of (A.19) is obtained by computing Cz(x).
• The method of matrix functions is based on an appropriate ansatz for the solution.
However, the application of both methods becomes very time-consuming for larger n, see the literature. ✷

Remark A.39. Methods for determining a special solution of the inhomogeneous system. For computing the general solution of the inhomogeneous system of linear differential equations of first order with constant coefficients, one also needs a special solution of the inhomogeneous system.
There are several possibilities for obtaining this solution:
• Method of the variation of constants. One replaces c in (A.20) by c(x), inserts this expression into (A.18), obtains conditions for c'(x), and tries to compute c(x) from these conditions.
• Appropriate ansatz (Störgliedansätze). If each component of the right-hand side f(x) has a special form, e.g., a polynomial, sine, cosine, or exponential, then it is often possible to find the special solution with an appropriate ansatz.
• Method of elimination. If the right-hand side f(x) of (A.18) is (n − 1) times continuously differentiable, then one can proceed exactly as in the elimination method. One obtains for one component of y(x) an inhomogeneous ordinary differential equation of order n with constant coefficients, for which one has to find a special solution. A special solution for (A.18) is obtained by back substitution. ✷

References

Butcher, J. C. (1964) Implicit Runge-Kutta processes. Math. Comp., 18, 50–64.
Cryer, C. W. (1972) On the instability of high order backward-difference multistep methods. Nordisk Tidskr. Informationsbehandling (BIT), 12, 17–25.
Curtiss, C. F. & Hirschfelder, J. O. (1952) Integration of stiff equations. Proc. Nat. Acad. Sci. U. S. A., 38, 235–243.
Deuflhard, P. & Bornemann, F. (2002) Scientific computing with ordinary differential equations. Texts in Applied Mathematics, vol. 42. Springer-Verlag, New York, pp. xx+485. Translated from the 1994 German original by Werner C. Rheinboldt.
Deuflhard, P. & Bornemann, F. (2008) Numerische Mathematik 2. de Gruyter Lehrbuch [de Gruyter Textbook], revised edn. Walter de Gruyter & Co., Berlin, pp. xii+499. Gewöhnliche Differentialgleichungen [Ordinary differential equations].
Dormand, J. R. & Prince, P. J. (1980) A family of embedded Runge-Kutta formulae. J. Comput. Appl. Math., 6, 19–26.
Fehlberg, E. (1964) New high-order Runge-Kutta formulas with step size control for systems of first- and second-order differential equations. Z. Angew. Math. Mech., 44, T17–T29.
Fehlberg, E. (1969) Klassische Runge-Kutta-Formeln fünfter und siebenter Ordnung mit Schrittweiten-Kontrolle. Computing (Arch. Elektron. Rechnen), 4, 93–106.
Günther, P., Beyer, K., Gottwald, S. & Wünsch, V. (1974) Grundkurs Analysis. Teil 4. Leipzig: BSB B. G. Teubner Verlagsgesellschaft, p. 308. Erste Auflage, Mathematisch-Naturwissenschaftliche Bibliothek, Band 56.
Hairer, E., Nørsett, S. P. & Wanner, G. (1993) Solving ordinary differential equations. I. Springer Series in Computational Mathematics, vol. 8, second edn. Berlin: Springer-Verlag, pp. xvi+528. Nonstiff problems.
Heuser, H. (2006) Gewöhnliche Differentialgleichungen. Mathematische Leitfäden [Mathematical Textbooks], fifth edn. Stuttgart: B. G. Teubner, p. 628. Einführung in Lehre und Gebrauch [Introduction to theory and application].
John, V. & Rang, J. (2010) Adaptive time step control for the incompressible Navier-Stokes equations. Comput. Methods Appl. Mech. Engrg., 199, 514–524.
Kunkel, P. & Mehrmann, V. (2006) Differential-algebraic equations. EMS Textbooks in Mathematics. European Mathematical Society (EMS), Zürich, pp. viii+377. Analysis and numerical solution.
Prince, P. J. & Dormand, J. R. (1981) High order embedded Runge-Kutta formulae. J. Comput. Appl. Math., 7, 67–75.
Shampine, L. F. & Reichelt, M. W. (1997) The MATLAB ODE suite. SIAM J. Sci. Comput., 18, 1–22. Dedicated to C. William Gear on the occasion of his 60th birthday.
Söderlind, G. (2002) Automatic control and adaptive time-stepping.
Numer. Algorithms, 31, 281–310. Numerical methods for ordinary differential equations (Auckland, 2001).
Strehmel, K., Weiner, R. & Podhaisky, H. (2012) Numerik gewöhnlicher Differentialgleichungen, second edn. Springer Spektrum, p. 505.
Strehmel, K. & Weiner, R. (1995) Numerik gewöhnlicher Differentialgleichungen. Teubner Studienbücher Mathematik [Teubner Mathematical Textbooks]. Stuttgart: B. G. Teubner, p. 462.