E7441: Scientific computing in biology and biomedicine
Systems of linear equations
Vlad Popovici, Ph.D.
RECETOX

Outline
1 Systems of linear equations - reminder: norms, linear systems, conditioning, accuracy
2 Solving linear systems: diagonal systems, triangular systems, Gaussian elimination
3 Special cases: symmetric positive definite systems
4 Examples and applications

Additional references:
- Golub, Van Loan: Matrix Computations. Johns Hopkins Univ. Press, 3rd ed., 1996

A motivating example - multiple linear regression
Linear model: y = Xβ + ϵ
- y = [y1, ..., yn]^T is the vector of observed values, X = [x_ij] ∈ M_{n,p}(R) is the matrix of independent variables, and ϵ is a vector of residuals (errors)
- β can be found by minimizing the sum of squared residuals, which leads to solving the normal equations:
    X^T X β = X^T y

Naïve numerical solution - DO NOT USE!
    β = (X^T X)^{-1} X^T y
implemented as
    import numpy as np
    ...
    beta_hat_direct = np.linalg.inv(X.T @ X) @ X.T @ y
Much better:
    beta_hat_solve = np.linalg.solve(X.T @ X, X.T @ y)

Vectors and norms
Let x = [x1, ..., xn]^T be a vector. The p-norm is defined as
    ∥x∥_p = (∑_{i=1}^n |x_i|^p)^{1/p}
Special cases:
- p = 1 (Manhattan or city-block norm): ∥x∥_1 = ∑_i |x_i|
- p = 2 (Euclidean norm): ∥x∥_2 = (∑_i x_i^2)^{1/2}
- p → ∞ (∞-norm): ∥x∥_∞ = max_i |x_i|
(Figure: the unit circles for the different norms.)

Vector norms - properties
For all x, y ∈ R^n and for any norm:
- ∥x∥ ≥ 0, with ∥x∥ = 0 ⇔ x = 0
- ∥αx∥ = |α| · ∥x∥, for any scalar α
- ∥x + y∥ ≤ ∥x∥ + ∥y∥ (triangle inequality); also | ∥x∥ − ∥y∥ | ≤ ∥x − y∥
- ∥x∥_1 ≥ ∥x∥_2 ≥ ∥x∥_∞
- ∥x∥_1 ≤ √n ∥x∥_2 and ∥x∥_2 ≤ √n ∥x∥_∞ → any two norms differ by at most a constant factor, hence they are equivalent
Python: numpy.linalg.norm(x, p) or scipy.linalg.norm(x, p)

Matrix norms
Let A = [a_ij] be a square n × n matrix. A matrix norm can be defined based on a vector norm:
    ∥A∥ = max_{x ≠ 0} ∥Ax∥ / ∥x∥
i.e. the maximum "stretching" the matrix A applies to a vector.
- ∥A∥_1 = max_j ∑_{i=1}^n |a_ij| (maximum absolute column sum)
- ∥A∥_∞ = max_i ∑_{j=1}^n |a_ij| (maximum absolute row sum)
- ∥A∥_2 = ? (we'll see it later)

Matrix norms - properties
Let A and B be two square matrices. Then:
- ∥A∥ > 0 if A ≠ 0
- ∥αA∥ = |α| · ∥A∥, for any scalar α
- ∥A + B∥ ≤ ∥A∥ + ∥B∥
- ∥A · B∥ ≤ ∥A∥ · ∥B∥
- ∥Ax∥ ≤ ∥A∥ · ∥x∥, for any vector x
Python: numpy.linalg.norm(A, p) or scipy.linalg.norm(A, p)

Linear systems
In general, a system of linear equations has the form
    a11 x1 + a12 x2 + ... + a1n xn = b1
    a21 x1 + a22 x2 + ... + a2n xn = b2
    ...
    am1 x1 + am2 x2 + ... + amn xn = bm
or, in matrix form, Ax = b, where A is an m × n matrix (A ∈ M_{m,n}(R)), and b and x are vectors with m and n elements, respectively.
In other words: can the vector b be expressed as a linear combination of the columns of matrix A?
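This question can be checked numerically: Ax = b is consistent exactly when appending b as a column to A does not increase the rank. A minimal sketch (the helper name is ours, not a library function):

    import numpy as np

    def is_consistent(A, b):
        # Ax = b has a solution iff rank([A | b]) == rank(A),
        # i.e. iff b lies in span(A)
        return np.linalg.matrix_rank(np.column_stack([A, b])) == \
               np.linalg.matrix_rank(A)

    A = np.array([[1., 2.], [2., 4.]])           # singular: rank(A) = 1
    print(is_consistent(A, np.array([1., 2.])))  # True: b in span(A)
    print(is_consistent(A, np.array([1., 0.])))  # False: no solution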
In Python
Python: x = numpy.linalg.lstsq(A, b[, rcond])
- if A is square and of full rank, the "exact" solution is returned
- otherwise it performs least squares regression (it returns the solution together with the residuals, the rank and the singular values of A)
NOTE: numpy.linalg.solve() works only for square, full-rank (nonsingular) matrices

Square matrices case (m = n)
A ∈ M_{n,n}(R) is singular if it has any of the following equivalent properties:
- A has no inverse (A^{-1} does not exist)
- det(A) = 0
- rank(A) < n (rank: the maximum number of rows or columns that are linearly independent)
- Az = 0 for some vector z ≠ 0
Otherwise, the matrix is nonsingular. If A is nonsingular, there is a unique solution; otherwise, depending on b, there may be zero or infinitely many solutions.

Geometrical interpretation (2D):
- a linear equation defines a line
- if A is nonsingular, the two lines intersect
- if A is singular, the two lines may be parallel (no solution) or identical (infinitely many solutions)
If A is singular and b ∈ span(A), the system is consistent and has infinitely many solutions. (span(A) is the vector space generated by the columns of A.)

Examples
- let A = [[1, 2], [3, 4]] and b = [−1, −1]^T; then A is nonsingular and there is a unique solution, x = [1, −1]^T
- let A = [[1, 2], [2, 4]] and b = [−1, −2]^T; then A is singular and there is no unique solution
Try these out in Python and check the documentation of the solve() and lstsq() functions.

Singularity, norm and conditioning
- the condition number of a nonsingular square matrix is
    cond(A) = ∥A∥ · ∥A^{-1}∥
- convention: cond(A) = ∞ for singular A
- it is the ratio between the maximum stretching and the maximum shrinking the matrix applies to a nonzero vector
- a large cond(A) indicates a matrix close to singularity
- a small det(A) does not imply a large cond(A)

Condition number - properties
- cond(A) ≥ 1
- cond(I) = 1 (I is the identity matrix - Python: eye(n))
- cond(αA) = cond(A), for any A and scalar α
- for a diagonal matrix D = diag(d_i), d_i ≠ 0, we have cond(D) = max |d_i| / min |d_i|
- the condition number is used for assessing the accuracy of the solutions of linear systems

Condition number: exact computation requires the matrix inverse:
  ▶ ∥A∥ is easy to compute
  ▶ computing ∥A^{-1}∥ at low cost is difficult → expensive (even more than finding the solution of the problem) and prone to numerical instability
In practice, the condition number is estimated as a byproduct of the solution process. One approach: find lower bounds on ∥A^{-1}∥ and, thus, on cond(A). If Ax = y, then x = A^{-1}y and it follows that
    ∥x∥ / ∥y∥ ≤ ∥A^{-1}∥,
with "=" achieved for some optimal y. So one needs to find a y for which the left-hand side is as large as possible, to get a good estimate of ∥A^{-1}∥.
Python: numpy.linalg.cond()
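A crude illustration of this lower-bound idea (a sketch only, on a toy matrix - practical estimators choose y far more cleverly than random probing):

    import numpy as np

    rng = np.random.default_rng(0)
    A = np.array([[1., 2.], [3., 6.000001]])  # nearly singular toy matrix

    est = 0.0
    for _ in range(100):
        y = rng.standard_normal(2)
        x = np.linalg.solve(A, y)  # x = A^{-1} y
        est = max(est, np.linalg.norm(x) / np.linalg.norm(y))  # lower bound on ||A^{-1}||

    print(est * np.linalg.norm(A, 2))  # lower bound on cond(A), in the 2-norm
    print(np.linalg.cond(A))           # reference value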
Ill-conditioned matrices - example
Consider the Hilbert matrix H with elements h_ij = 1/(i + j − 1). It arises, for example, in the least squares approximation of functions by polynomials, where
    h_ij = ∫_0^1 x^{i+j−2} dx.
In Python, use hilbert() and invhilbert() (from the scipy.linalg package) for H and H^{-1}, respectively.

    import numpy as np
    import scipy.linalg

    for n in np.arange(5, 15):
        H = scipy.linalg.hilbert(n)
        invH = scipy.linalg.invhilbert(n)  # exact inverse for n < 15!
        c = np.linalg.cond(H)
        d1 = np.linalg.det(H) * np.linalg.det(np.linalg.inv(H))
        d2 = np.linalg.det(H) * np.linalg.det(invH)
        print('n={:2d}\tcond={:e}\tdet1={:10.7f}\tdet2={:10.7f}'.format(n, c, d1, d2))

The floating-point representation of h_ij damages the results more than the inversion process does (both det1 and det2 should equal 1):
    n= 5  cond=4.766073e+05  det1=  1.0000000  det2=   1.0000000
    ...
    n=10  cond=1.602498e+13  det1=  1.0000229  det2=   1.0000879
    n=11  cond=5.224781e+14  det1=  1.0014194  det2=   1.0023949
    n=12  cond=1.642592e+16  det1=  1.0681547  det2=   0.9870101
    n=13  cond=4.493668e+18  det1= −9.5735009  det2=   0.3085276
    n=14  cond=3.219842e+17  det1=  1.1823510  det2=1728.5395280

Accuracy of solutions
The condition number leads to error bounds:
- let x be the solution to Ax = b and x̂ the solution to Ax̂ = b + ∆b
- let ∆x = x̂ − x; then b + ∆b = Ax + A∆x, so ∆x = A^{-1}∆b; combining ∥∆x∥ ≤ ∥A^{-1}∥ ∥∆b∥ with ∥b∥ ≤ ∥A∥ ∥x∥ gives
    ∥∆x∥ / ∥x∥ ≤ cond(A) · ∥∆b∥ / ∥b∥

Relative change in solution
The condition number bounds the relative change in the solution due to a relative change in the right-hand side, regardless of the algorithm used to compute the solution. In other words, cond(A) defines the uncertainty in x, given the uncertainty in b. Similarly, for a perturbation of the matrix: if (A + D)x̂ = b, then
    ∥∆x∥ / ∥x̂∥ ≤ cond(A) · ∥D∥ / ∥A∥

- if the data (A, b) are accurate to machine precision, the relative error in the solution can be approximated by
    ∥x̂ − x∥ / ∥x∥ ≈ cond(A) · ϵ_mach
  i.e. the solution loses about log10(cond(A)) decimal digits of accuracy with respect to the input data
- this analysis concerns the relative error in the largest components of the solution vector; the relative error can be larger in the smaller components

- the condition number is affected by the scaling of A, so one way of improving the solution is rescaling - but this does not help a matrix near singularity
- example: A = [[1, 0], [0, ϵ]], b = [1, ϵ]^T; the matrix A is ill-conditioned for small ϵ: cond(A) = 1/ϵ; by scaling the 2nd equation with 1/ϵ, the matrix becomes well conditioned
- in general, it is more difficult...

Residuals
- residual vector: r = b − Ax̂, where x̂ is the approximate solution to Ax = b
- theoretically: if A is nonsingular, then ∥x̂ − x∥ = 0 ⇔ ∥r∥ = 0
- practically, a small residual is not necessarily equivalent to a small error, since
    ∥∆x∥ / ∥x̂∥ ≤ cond(A) · ∥r∥ / (∥A∥ · ∥x̂∥)
- a small relative residual implies a small relative error only if A is well-conditioned
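A numerical illustration of this point (a sketch, using the Hilbert matrix from the earlier example; we plant a known solution so that the error can be measured):

    import numpy as np
    from scipy.linalg import hilbert

    n = 12
    A = hilbert(n)       # cond(A) ~ 1e16
    x_true = np.ones(n)  # plant a known solution
    b = A @ x_true
    x_hat = np.linalg.solve(A, b)

    rel_res = np.linalg.norm(b - A @ x_hat) / (np.linalg.norm(A, 2) * np.linalg.norm(x_hat))
    rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
    print(rel_res)  # tiny: the residual looks excellent
    print(rel_err)  # orders of magnitude larger, as cond(A) * eps_mach predicts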
Residuals - backward error analysis
- let D be the "delta" matrix such that x̂ is the exact solution of (A + D)x̂ = b; then
    ∥r∥ / (∥A∥ · ∥x̂∥) ≤ ∥D∥ / ∥A∥
- a large relative residual implies a large backward error and indicates an unstable algorithm
- stable algorithms yield small relative residuals, regardless of the conditioning of a nonsingular A

General strategy
- transform the system (mainly A) such that the solution is easier to compute (but unchanged)
- if M is a nonsingular matrix, the systems Ax = b and MAx = Mb have the same solution
- trivial transformations:
  ▶ permutation of the rows of the system: use a permutation matrix (has exactly one 1 in each row and each column, all other entries are 0)
  ▶ diagonal scaling: may improve the accuracy

A few relevant functions in Python
Please use ? or the online documentation for details!
- solve(): solves linear systems Ax = b via various methods, for A a square matrix; you can specify the properties of A in scipy.linalg.solve(); check out the other scipy.linalg.solve*() functions!
- scipy.linalg.lu(): computes the LU factorization
- numpy.triu(): returns the upper triangular part of a matrix
- numpy.tril(): returns the lower triangular part of a matrix
- numpy.diag(): returns the diagonal of a matrix
- numpy.linalg.cond(): estimates the condition number

Diagonal systems
The simplest linear system is the diagonal one, diag(a11, a22, ..., ann) x = [b1, b2, ..., bn]^T, with the obvious solution x = [b_i / a_ii]_i.

    def diagsolve(A, b):
        # Solve A x = b for a diagonal matrix A.
        d = np.diag(A)
        if np.any(np.isclose(d, 0)):
            raise RuntimeError('A is singular!')
        x = b / d  # this is element-wise
        return x

Triangular systems
A 3 × 3 upper triangular system
    a11 x1 + a12 x2 + a13 x3 = b1
             a22 x2 + a23 x3 = b2
                      a33 x3 = b3
is equivalent to
    a11 x1 = b1 − a12 x2 − a13 x3
    a22 x2 = b2 − a23 x3
    a33 x3 = b3

A is lower triangular if a_ij = 0 for i < j, and upper triangular if a_ij = 0 for i > j. For an upper triangular A, the solution is obtained by back-substitution:
    x_n = b_n / a_nn
    x_i = (b_i − ∑_{j=i+1}^n a_ij x_j) / a_ii, for i = n − 1, n − 2, ..., 1

Algorithm: back-substitution (not vectorized!)
    for j = n down to 1 do
        if a_jj = 0 then stop   # singular matrix
        x_j ← b_j / a_jj
        for i = 1 to j − 1 do
            b_i ← b_i − a_ij x_j

Exercise
- derive the forward substitution method for lower triangular matrices
- implement in Python the functions fwsolve() and bksolve() for forward and backward substitution (one possible solution is sketched below)
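A possible (non-vectorized, row-oriented) reference sketch for the exercise, assuming numpy; the singularity test mirrors the one in diagsolve() above:

    import numpy as np

    def bksolve(A, b):
        # back-substitution for an upper triangular A
        n = len(b)
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):
            if np.isclose(A[i, i], 0):
                raise RuntimeError('A is singular!')
            x[i] = (b[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
        return x

    def fwsolve(A, b):
        # forward substitution for a lower triangular A
        n = len(b)
        x = np.zeros(n)
        for i in range(n):
            if np.isclose(A[i, i], 0):
                raise RuntimeError('A is singular!')
            x[i] = (b[i] - A[i, :i] @ x[:i]) / A[i, i]
        return x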
Elementary elimination matrices
Goal: find transformations of nonsingular matrices that lead to triangular systems.
Example: let z = [z1, z2]^T with z1 ≠ 0; then
    [[1, 0], [−z2/z1, 1]] · [z1, z2]^T = [z1, 0]^T
→ use linear combinations of rows.

In general, for a pivot z_k ≠ 0, take M_k equal to the identity matrix except for column k, which below the diagonal holds −m_{k+1}, ..., −m_n, where m_i = z_i / z_k for i = k + 1, ..., n. Then
    M_k z = M_k [z1, ..., z_k, z_{k+1}, ..., z_n]^T = [z1, ..., z_k, 0, ..., 0]^T
- pivot: z_k
- M_k is called a Gaussian transformation or elementary elimination matrix

Properties of the Gaussian transformation
- M_k is nonsingular (it is a lower triangular, full-rank matrix)
- M_k = I − m e_k^T, where m = [0, ..., 0, m_{k+1}, ..., m_n]^T and e_k is the k-th column of the identity matrix
- M_k^{-1} = I + m e_k^T: just the sign is changed for the inverse; denote L_k = M_k^{-1}
- if M_j = I − t e_j^T with j > k, then M_k M_j = I − m e_k^T − t e_j^T, so the product is a sort of "union" of the two matrices; note that the order of multiplication is important
- a similar result holds for the inverses

Gaussian elimination
- transform the system Ax = b into a triangular system:
  ▶ choose M1 with a11 as pivot to eliminate the 1st column below a11; the new system is M1 Ax = M1 b and the solution stays the same
  ▶ next, choose M2 with a22 as pivot to eliminate the 2nd column below a22; the new system is M2 M1 Ax = M2 M1 b and the solution stays the same
  ▶ ... continue until the system is triangular
- solve the system M_{n−1} ... M1 Ax = M_{n−1} ... M1 b by back-substitution

LU factorization
- let M = M_{n−1} ... M1 and L = M^{-1}; then
    L = (M_{n−1} ... M1)^{-1} = M1^{-1} ... M_{n−1}^{-1} = L1 ... L_{n−1}
  which is unit lower triangular
- by design, U = MA is upper triangular
- then A = M^{-1} U = LU, with L lower triangular and U upper triangular
- Gaussian elimination is thus a factorization of a matrix into a product of two triangular matrices: the LU factorization
- the LU factorization is unique up to a diagonal scaling of the factors

- if A is factorized into LU, the system becomes LUx = b and is solved by forward-substitution (the analogue of back-substitution, run from the first row) in the lower triangular system Ly = b, followed by back-substitution in Ux = y
- Gaussian elimination and LU factorization express the same solution process
- in Python, check scipy.linalg.lu(), ...lu_factor(), ...lu_solve()
Python example (note that lu() also returns a permutation matrix P, with A = P L U - see the pivoting slides below; here a11 = 0 forces a row interchange, so P must be taken into account):

    import numpy as np
    import scipy.linalg

    A = np.array([[0, 1, 1], [2, -1, -1], [1, 1, -1]])
    b = np.array([2, 0, 1])
    P, L, U = scipy.linalg.lu(A)  # check the documentation!
    y = scipy.linalg.solve_triangular(L, P.T @ b, lower=True)
    x = scipy.linalg.solve_triangular(U, y, lower=False)
    print(x)

Note: det(A) = det(L) det(U)
- if, at any stage, the leading entry on the diagonal is zero, the pivot cannot be chosen there → interchange that row with some row below having a non-zero pivot
- if there is no way to choose a proper pivot, the matrix U will be singular, but the factorization can still be performed; the back-substitution will fail, however
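For illustration, the elimination process can also be coded directly; a minimal sketch without pivoting (the function name is ours - in practice use scipy's routines):

    import numpy as np

    def lu_nopivot(A):
        # naive LU factorization, no pivoting -- for illustration only
        U = A.astype(float).copy()
        n = U.shape[0]
        L = np.eye(n)
        for k in range(n - 1):
            if np.isclose(U[k, k], 0):
                raise RuntimeError('zero pivot encountered')
            for i in range(k + 1, n):
                L[i, k] = U[i, k] / U[k, k]     # multiplier m_i = u_ik / u_kk
                U[i, k:] -= L[i, k] * U[k, k:]  # eliminate the entry below the pivot
        return L, np.triu(U)

    A = np.array([[2., 1., 1.], [4., 3., 3.], [8., 7., 9.]])
    L, U = lu_nopivot(A)
    print(np.allclose(L @ U, A))  # True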
Experiment (from C. Van Loan, "Introduction to scientific computing")
Consider the system
    [[ϵ, 1], [1, 1]] [x1, x2]^T = [1 + ϵ, 2]^T
with the solution [1, 1]^T. Write Python code to solve it using the LU factorization, for ϵ = 10^{-2}, 10^{-4}, ..., 10^{-18}. Discuss the results!

Another application of LU decomposition
Suppose you have to compute the scalar α = z^T A^{-1} b ∈ R, with z, b ∈ R^n and A ∈ R^{n×n} nonsingular. But x = A^{-1} b is the solution of the linear system Ax = b. So you should use the LU decomposition, compute x, and then α = z^T x. In Python:

    # ...define A, b, z
    P, L, U = scipy.linalg.lu(A)  # check the documentation!
    y = scipy.linalg.solve_triangular(L, P.T @ b, lower=True)
    x = scipy.linalg.solve_triangular(U, y, lower=False)
    alpha = z @ x  # the scalar product z^T x

Improving stability
- choose the pivot to minimize error propagation
- choose the entry of largest magnitude on or below the diagonal as pivot: this is called partial pivoting
- each M_k is preceded by a permutation matrix P_k that interchanges rows
- still MA = U, but now M = M_{n−1} P_{n−1} ... M1 P1
- L = M^{-1} is triangular, but not necessarily lower triangular; in general,
    (P_{n−1} ... P1) A = PA = LU
- check again the previous Python examples: this P is exactly the permutation matrix returned by scipy.linalg.lu()

- if the pivot is sought as the largest entry in the entire unreduced submatrix, you have complete pivoting
- it requires permutations of rows AND columns: there are two permutation matrices, P and Q, such that PAQ = LU
- better numerical stability, but much more expensive in computation
- in general, only partial pivoting is used with Gaussian elimination

Pivoting is not required if:
- the matrix is diagonally dominant:
    ∑_{i=1, i≠j}^n |a_ij| < |a_jj|, j = 1, ..., n
- the matrix is symmetric positive definite: A = A^T and x^T Ax > 0, ∀x ≠ 0
Examples of symmetric positive (semi-)definite matrices from practice? (Hint: recall X^T X in the normal equations.)

Residuals
- r = b − Ax̂, where x̂ was obtained by Gaussian elimination
- it can be shown that
    ∥r∥ / (∥A∥ ∥x̂∥) ≤ ∥E∥ / ∥A∥ ≤ ρ n ϵ_mach
  where E is the backward error in the data matrix, (A + E)x̂ = b, and ρ = max |u_ij| / max |a_ij| is the growth factor
- without pivoting, ρ is unbounded, so the algorithm is unstable
- with partial pivoting, ρ ≤ 2^{n−1}
- in practice, ρ ≈ 1, so ∥r∥ / (∥A∥ ∥x̂∥) ⪅ n ϵ_mach

Residuals, cont'd
- Gaussian elimination with partial pivoting yields small relative residuals, regardless of the conditioning
- however, the computed solution is close to the real solution only if the system is well-conditioned
- a yet smaller growth factor can be obtained with complete pivoting, but the extra cost may not be worth it
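The typical size of the growth factor under partial pivoting is easy to check empirically (a sketch; for a random matrix, ρ is usually tiny compared with the 2^{n−1} worst case):

    import numpy as np
    import scipy.linalg

    rng = np.random.default_rng(2)
    n = 100
    A = rng.standard_normal((n, n))
    P, L, U = scipy.linalg.lu(A)             # LU with partial pivoting
    rho = np.abs(U).max() / np.abs(A).max()  # growth factor
    print(rho)  # typically a small number, far below 2^(n-1)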
Example: in 3-digit decimal arithmetic, solve
    [[0.641, 0.242], [0.321, 0.121]] [x1, x2]^T = [0.883, 0.442]^T
- the exact solution is [1, 1]^T
- Gaussian elimination leads to x̂ = [0.782, 1.58]^T
- the exact residual is r = [−0.000622, −0.000202]^T → as small as can be expected with 3-digit precision
- but the error is large: ∥x̂ − x∥ = 0.6196, which is ≈ 62% relative error!
- this is because of ill-conditioning: cond(A) > 4000

What happened?
The Gaussian elimination led to
    [[0.641, 0.242], [0, 0.000242]] [x1, x2]^T = [0.883, −0.000383]^T
so x2 is the ratio of two quantities that lie entirely below the rounding level of the arithmetic, yielding an essentially arbitrary result. Then x1 is computed so as to satisfy the 1st equation, resulting in a small residual but a large error.

Implementation and complexity
The general form of Gaussian elimination is
    for i do
        for j do
            for k do
                a_ij ← a_ij − (a_ik / a_kk) a_kj
- the order of the loops does not matter for the final result...
- ...but, depending on how the matrix is stored in memory, the orderings have different performance

Implementation and complexity (cont'd)
- there are about n^3/3 floating-point operations → the complexity is O(n^3)
- the forward-/back-substitutions require about n^2 multiplications and n^2 additions (for a single b)
- if you explicitly invert A to get x = A^{-1} b, you need n^3 operations → 3× more than for the LU factorization
- inversion is also less precise: think of the difference between 3^{-1} × 18 and 18/3 in fixed-precision arithmetic
- matrix inversion is convenient in formulas, but in practice you use factorizations! E.g., A^{-1} B should be computed via the LU factorization of A, followed by forward- and back-substitutions with the columns of B

Gauss-Jordan elimination
- idea: for each element of the diagonal, eliminate all the elements below AND above it in its column, using combinations of rows
- the elimination matrix looks like the identity, except for column k, which holds −m_1, ..., −m_{k−1} above the diagonal and −m_{k+1}, ..., −m_n below it, where m_i = a_i / a_k
- apply the same transformation to the right-hand side, too

Gauss-Jordan elimination, cont'd
- the result is a diagonal matrix on the left-hand side
- the solution is obtained by dividing the entries of the transformed right-hand side by the corresponding diagonal entries
- it requires n^3/2 multiplications and the same number of additions → 50% more expensive than the LU decomposition
- despite being more expensive, it is sometimes preferred to the LU decomposition for parallel implementations
- if the right-hand side is initialized with the identity matrix, after Gauss-Jordan elimination it becomes A^{-1} (see the sketch below)
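A minimal sketch of Gauss-Jordan elimination (no pivoting, nonzero pivots assumed; the function name is ours); with B = I it returns A^{-1}:

    import numpy as np

    def gauss_jordan(A, B):
        # reduce [A | B] until the A-part is diagonal, then divide by the diagonal;
        # with B = I the result is A^{-1}
        M = np.hstack([A.astype(float), B.astype(float)])
        n = A.shape[0]
        for k in range(n):
            for i in range(n):
                if i != k:
                    M[i] -= (M[i, k] / M[k, k]) * M[k]  # eliminate above AND below
        return M[:, n:] / np.diag(M)[:, None]           # divide rows by the diagonal

    A = np.array([[2., 1.], [1., 3.]])
    print(np.allclose(gauss_jordan(A, np.eye(2)) @ A, np.eye(2)))  # True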
Solving series of similar problems
- idea: reuse as much as possible from previous computations
- if only the right-hand side changes, the LU decomposition does not have to be recomputed
- if A suffers only a rank-one change, one can still use the pre-computed A^{-1} (Sherman-Morrison formula):
    (A − uv^T)^{-1} = A^{-1} + A^{-1} u (1 − v^T A^{-1} u)^{-1} v^T A^{-1}
- this has a complexity of O(n^2), compared to the O(n^3) needed by a new inversion

For the modified equation (A − uv^T) x = b, the solution is
    x = A^{-1} b + A^{-1} u (1 − v^T A^{-1} u)^{-1} v^T A^{-1} b
and is obtained by the following procedure:
- solve Az = u, so z = A^{-1} u
- solve Ay = b, so y = A^{-1} b
- compute x = y + ((v^T y) / (1 − v^T z)) z
If A is already factored, this approach has a complexity of O(n^2).

Comments on scaling
- theoretically, multiplying an equation (a row of A together with the corresponding entry of b) by a nonzero constant does not change the solution
- in practice, it affects the conditioning and the choice of pivots and, by consequence, the accuracy
- example: [[1, 0], [0, ϵ]] [x1, x2]^T = [1, ϵ]^T is ill-conditioned for small ϵ, since cond(A) = 1/ϵ; it becomes well-conditioned if the second equation is multiplied by 1/ϵ

Iterative refinement
- let x0 be an approximate solution to Ax = b and r0 = b − Ax0 the corresponding residual
- let z0 be the solution to Az = r0
- an improved approximate solution is then x1 = x0 + z0 (HOMEWORK: prove that Ax1 = b)
- repeat until convergence
- the process needs higher precision for computing a useful residual
- not often used, but sometimes useful

Special forms of linear systems
For some special cases of A, storage and computation time can be saved. For example, if A is
- symmetric: A = A^T, i.e. a_ij = a_ji for all i, j
- positive definite: z^T Az > 0, ∀z ≠ 0
- band diagonal: a_ij = 0 if |i − j| > β, where β is the bandwidth
- sparse: most of the elements of A are zero

Symmetric positive definite systems
Cholesky decomposition: A = LL^T, where L is lower triangular.
- A admits a Cholesky decomposition if and only if it is symmetric positive definite
- if the decomposition exists, it is unique

Algorithm: Cholesky decomposition (overwriting A)
    for j = 1 to n do
        for k = 1 to j − 1 do
            for i = j to n do
                a_ij ← a_ij − a_ik a_jk
        a_jj ← √a_jj
        for k = j + 1 to n do
            a_kj ← a_kj / a_jj

Cholesky decomposition - properties
- does not need pivoting to maintain stability
- only n^3/6 multiplications and n^3/6 additions are required
- in the algorithm above, only the lower triangle of A is modified; if needed, it can be restored from the upper triangle
- requires about half the computation and half the memory of the LU factorization
- there are variations of the Cholesky decomposition for banded matrices, for positive semi-definite matrices (semi-Cholesky decomposition) and for symmetric indefinite matrices
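In Python, an SPD system is conveniently solved with scipy.linalg.cho_factor()/cho_solve(); a small sketch, reusing the normal-equations setting from the motivating example (X^T X is symmetric positive definite when X has full column rank):

    import numpy as np
    from scipy.linalg import cho_factor, cho_solve

    rng = np.random.default_rng(3)
    X = rng.standard_normal((50, 5))
    y = rng.standard_normal(50)

    A = X.T @ X  # symmetric positive definite (X has full column rank)
    b = X.T @ y
    c, low = cho_factor(A)         # Cholesky factorization of A
    beta = cho_solve((c, low), b)  # two triangular solves
    print(np.allclose(A @ beta, b))  # True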
Suggestions of methods to use
If A is a real dense square matrix...
- ...use LU decomposition with partial pivoting: A = PLU
- ...and is a band matrix: use LU decomposition with pivoting and row interchanges
- ...and is tridiagonal: use Gaussian elimination
- ...and is symmetric positive definite: use the Cholesky decomposition
- ...and is symmetric tridiagonal: use a special Cholesky with pivoting, A = LDL^T
- ...and is symmetric indefinite: use a special Cholesky
In Python (scipy.linalg), check the documentation of the functions cholesky(), ldl(), lu().

Polynomial interpolation
- a function p(x) interpolates a set of points {(x_i, y_i) | i = 0, ..., N} if it satisfies y_i = p(x_i) for all i = 0, ..., N
- this leads to a system of N + 1 equations; if p(x) is a polynomial of degree M, p(x) = a_M x^M + ... + a_1 x + a_0, the system has the form
    a_0 + a_1 x_0 + ... + a_M x_0^M = y_0
    ...
    a_0 + a_1 x_N + ... + a_M x_N^M = y_N
  where the unknowns are a_0, ..., a_M
- if M = N, the system matrix is a Vandermonde matrix
- in Python, check the functions numpy.polyfit() and numpy.polyval()
- write a Python function that solves the interpolation problem for M = N; do NOT use the functions above for the interpolation!

1D Poisson problem
A two-point boundary value problem:
    −u″(x) = y(x), x ∈ [0, 1], u(0) = u(1) = 0,
where y is a given continuous function on [0, 1]. If y cannot be integrated exactly, approximate solutions are sought. We use the finite differences
    u′(x) = lim_{h→0} (u(x + h/2) − u(x − h/2)) / h
    u″(x) = lim_{h→0} (u(x + h) − 2u(x) + u(x − h)) / h^2
Divide the interval [0, 1] into m + 1 equal subintervals of length h = 1/(m + 1) and let x_i = ih, i = 0, ..., m + 1, be the endpoints of these subintervals.

Denote y(x_i) = y(ih) = y_i and u(x_i) = u(ih) = u_i. Then the problem becomes
    −(u_{i+1} − 2u_i + u_{i−1}) / h^2 = y_i, i = 1, ..., m, with u_0 = u_{m+1} = 0.
This can be written as the linear system T u = h^2 y, where u = [u_1, ..., u_m]^T, y = [y_1, ..., y_m]^T and T is the m × m tridiagonal matrix
    T = [[ 2, −1,  0, ...,  0],
         [−1,  2, −1, ...,  0],
         ...
         [ 0, ..., −1,  2, −1],
         [ 0, ...,  0, −1,  2]]
which is a Toeplitz matrix. The system can be solved using the Levinson algorithm - see the scipy.linalg.solve_toeplitz() function in Python.

Questions?