Appendix B: LINEAR ALGEBRA: MATRICES

TABLE OF CONTENTS

§B.1 Matrices
  §B.1.1 Square Matrices
  §B.1.2 Symmetry and Antisymmetry
  §B.1.3 Special Matrices
  §B.1.4 *Are Vectors a Special Case of Matrices?
§B.2 Elementary Matrix Operations
  §B.2.1 Equality
  §B.2.2 Transposition
  §B.2.3 Addition, Subtraction and Scaling
§B.3 Matrix Products
  §B.3.1 Matrix by Vector Product
  §B.3.2 Matrix by Matrix Product and Matrix Powers
  §B.3.3 Matrix Product Properties
§B.4 Bilinear and Quadratic Forms
§B.5 Determinants
  §B.5.1 Determinant Properties
  §B.5.2 Cramer's Rule and Homogeneous Systems
§B.6 Singular Matrices, Rank
  §B.6.1 Rank Deficiency
  §B.6.2 Rank of Matrix Sums and Products
  §B.6.3 Singular Systems: Particular and Homogeneous Solutions
  §B.6.4 *Rank of Rectangular Matrices
§B.7 Matrix Inversion
  §B.7.1 Explicit Computation of Inverses
  §B.7.2 Some Properties of the Inverse
§B.8 Characteristic Polynomial and Eigenvalues
  §B.8.1 *Similarity Transformations and Invariants
  §B.8.2 *Congruential Transformations
  §B.8.3 *The Cayley-Hamilton Theorem
§B.9 *The Inverse of a Sum of Matrices
  §B.9.1 *The Sherman-Morrison and Woodbury Formulas
  §B.9.2 *Formulas for Modified Determinants
  §B.9.3 *Inversion of Low Order Symmetric Matrix Sums
§B.10 *Matrix Orthogonality
  §B.10.1 *Matrix Orthogonalization Via Projectors
  §B.10.2 *Orthogonal Projector Properties
§B. Notes and Bibliography
§B. Exercises

§B.1. Matrices

This Appendix introduces the concept of a matrix. Consider a set of scalar quantities arranged in a rectangular array containing m rows and n columns:

\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1j} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2j} & \cdots & a_{2n} \\
\vdots &        &        & \vdots &        & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mj} & \cdots & a_{mn}
\end{bmatrix}

§B.3. Matrix Products

§B.3.1. Matrix by Vector Product

The product y = A x of an m × n matrix A = [a_{ij}] and an n-vector x is the m-vector y with components

y_i = \sum_{j=1}^{n} a_{ij} x_j = a_{ij} x_j ,   i = 1, ..., m.   (B.23)

The summation symbol Σ may be omitted if the summation convention is used, as illustrated above after the second equality sign.

Example B.12. The product of a 2 × 3 matrix and a 3-vector is a 2-vector:

\begin{bmatrix} 1 & -3 & 0 \\ 4 & 2 & -1 \end{bmatrix}
\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
=
\begin{bmatrix} 1 \times 1 + (-3) \times 2 + 0 \times 3 \\ 4 \times 1 + 2 \times 2 + (-1) \times 3 \end{bmatrix}
=
\begin{bmatrix} -5 \\ 5 \end{bmatrix}.   (B.24)

This product definition is not arbitrary but emanates from the analytical and geometric properties of entities represented by matrices and vectors. For the product definition to make sense, the column dimension of the matrix A (called the pre-multiplicand) must equal the dimension of the vector x (called the post-multiplicand). For example, the reverse product x A would not make sense unless m = n = 1. If the row dimension m of A is one, the matrix formally reduces to a row vector, and the matrix-vector product reduces to the inner product defined by Equation (A.15) of Appendix A. The result of this operation is a one-dimensional vector, or scalar. Thus we see that the present definition properly embodies previous cases. The associative and commutative properties of the matrix-vector product fall under the rules of the more general matrix-matrix product discussed next.
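The index formula (B.23) maps directly to code. The following short Python sketch, added purely as an illustration, evaluates the product of Example B.12 with an explicit double loop and checks the result against NumPy's built-in matrix-vector product.

import numpy as np

A = np.array([[1, -3, 0],
              [4,  2, -1]], dtype=float)   # 2 x 3 matrix of Example B.12
x = np.array([1, 2, 3], dtype=float)       # 3-vector of Example B.12

# Direct implementation of y_i = sum_j a_ij x_j, as in (B.23)
m, n = A.shape
y = np.zeros(m)
for i in range(m):
    for j in range(n):
        y[i] += A[i, j] * x[j]

print(y)          # [-5.  5.]
print(A @ x)      # same result using NumPy's matrix-vector product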
§B.3.2. Matrix by Matrix Product and Matrix Powers

We now pass to the most general matrix-by-matrix product, and consider the operations involved in computing the product C of two matrices A and B:

C = A B.   (B.25)

Here A = [a_{ij}] is a matrix of order m × n, B = [b_{jk}] is a matrix of order n × p, and C = [c_{ik}] is a matrix of order m × p. The entries of the result matrix C are defined by the formula

c_{ik} = \sum_{j=1}^{n} a_{ij} b_{jk} = a_{ij} b_{jk} ,   i = 1, ..., m,   k = 1, ..., p.   (B.26)

We see that the (i, k)th entry of C is computed by taking the inner product of the ith row of A with the kth column of B. For this definition to work and the product to be possible, the column dimension of A must be the same as the row dimension of B. Matrices that satisfy this rule are said to be product-conforming, or conforming for short. If the two matrices do not conform, their product is undefined. The following mnemonic notation often helps in remembering this rule:

   C    =    A      B        (B.27)
 (m×p)     (m×n)  (n×p)

For the matrix-by-vector case treated in the preceding subsection, p = 1. Matrix A is called the pre-multiplicand and is said to premultiply B. Matrix B is called the post-multiplicand and is said to postmultiply A. This careful distinction on which matrix comes first is a consequence of the absence of commutativity: even if B A exists (it only does if p = m), it is not generally the same as A B.

If A = B, the product A A is called the square of A and is denoted by A^2. Note that for this definition to make sense, A must be a square matrix — else the factors would not be conforming. Similarly, A^3 = A A A = A^2 A = A A^2. Other positive-integer powers can be analogously defined.

This definition does not encompass negative powers. For instance, A^{-1} denotes the inverse of A, which is studied later. The general power A^m, where m can be a real or complex scalar, can be defined with the help of the matrix spectral form — an advanced topic covered in Appendix E.

A square matrix A that satisfies A = A^2 is called idempotent. We shall see later that this condition characterizes the so-called projector matrices. A square matrix A whose pth power is the null matrix is called p-nilpotent.

Remark B.2. For hand computations, the matrix product is most conveniently organized by the so-called Falk's scheme, in which B is written above the result and A to its left, so that entry c_{ik} falls at the intersection of row i of A and column k of B:

                     [ b_{11} ... b_{1k} ... b_{1p} ]
                     [   ...      ...        ...    ]  = B
                     [ b_{n1} ... b_{nk} ... b_{np} ]
 [ a_{i1} ... a_{in} ] = A      [ ...  c_{ik} ... ]  = C        (B.28)

Each entry in row i of A is multiplied by the corresponding entry in column k of B, and the products are summed and stored in the (i, k)th entry of C.

Example B.13. To illustrate Falk's scheme, let us form the product C = A B of the following matrices:

A = \begin{bmatrix} 3 & 0 & 2 \\ 4 & -1 & 5 \end{bmatrix},\qquad
B = \begin{bmatrix} 2 & 1 & 0 & -5 \\ 4 & 3 & -1 & 0 \\ 0 & 1 & -7 & 4 \end{bmatrix}.   (B.29)

The matrices are conforming because the column dimension of A and the row dimension of B are the same (3). We arrange the computations as shown below:

                              [ 2  1   0  -5 ]
                              [ 4  3  -1   0 ]  = B
                              [ 0  1  -7   4 ]
  [ 3   0  2 ]                [ 6  5  -14  -7 ]
  [ 4  -1  5 ]  = A           [ 4  6  -34   0 ]  = C = A B        (B.30)

Here 3 × 2 + 0 × 4 + 2 × 0 = 6, and so on.

§B.3.3. Matrix Product Properties

Associativity. The associative law is verified:

A (B C) = (A B) C.   (B.31)

Hence we may delete the parentheses and simply write A B C.

Distributivity. The distributive law also holds: If B and C are matrices of the same order, then

A (B + C) = A B + A C,   and   (B + C) A = B A + C A.   (B.32)

Commutativity. The commutativity law of scalar multiplication does not generally hold. If A and B are square matrices of the same order, then the products A B and B A are both possible but in general A B ≠ B A. If A B = B A, the matrices A and B are said to commute. One important case is when A and B are diagonal. In general A and B commute if they share the same eigensystem.
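As a computational counterpart of the entry formula (B.26) and of Example B.13, the sketch below forms C = A B entry by entry and also shows, with two square matrices chosen only for illustration (they are not taken from the text), that A B and B A generally differ.

import numpy as np

A = np.array([[3., 0., 2.],
              [4., -1., 5.]])                 # 2 x 3, as in (B.29)
B = np.array([[2., 1., 0., -5.],
              [4., 3., -1., 0.],
              [0., 1., -7., 4.]])             # 3 x 4, as in (B.29)

# c_ik = sum_j a_ij b_jk, the definition (B.26)
m, n = A.shape
p = B.shape[1]
C = np.zeros((m, p))
for i in range(m):
    for k in range(p):
        C[i, k] = sum(A[i, j] * B[j, k] for j in range(n))

print(C)                        # [[  6.   5. -14.  -7.]
                                #  [  4.   6. -34.   0.]]  matches (B.30)
print(np.allclose(C, A @ B))    # True

# Lack of commutativity for two arbitrary square matrices
S = np.array([[1., 2.], [3., 4.]])
T = np.array([[0., 1.], [1., 0.]])
print(S @ T)                    # [[2. 1.] [4. 3.]]
print(T @ S)                    # [[3. 4.] [1. 2.]]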
Example B.14. Matrices

A = \begin{bmatrix} a & b \\ b & c \end{bmatrix},\qquad
B = \begin{bmatrix} a-β & b \\ b & c-β \end{bmatrix}   (B.33)

commute for any a, b, c, β. More generally, A and B = A - β I commute for any square matrix A.

Transpose of a Product. The transpose of a matrix product is equal to the product of the transposes of the operands taken in reverse order:

(A B)^T = B^T A^T.   (B.34)

The general transposition formula for an arbitrary product sequence is

(A B C ... M N)^T = N^T M^T ... C^T B^T A^T.   (B.35)

Congruential Transformation. If B is a symmetric matrix of order m and A is an arbitrary m × n matrix, then

S = A^T B A   (B.36)

is a symmetric matrix of order n. Such an operation is called a congruent or congruential transformation. It is often found in finite element analysis when changing coordinate bases, because such a transformation preserves certain key properties (such as symmetry).

Loss of Symmetry. The product of two symmetric matrices is not generally symmetric.

Null Matrices may have Non-null Divisors. The matrix product A B can be zero although A ≠ 0 and B ≠ 0. Likewise it is possible that A ≠ 0, A^2 ≠ 0, ..., but A^p = 0.

§B.4. Bilinear and Quadratic Forms

Let x and y be two column vectors of order n, and A a real square n × n matrix. Then the following triple product produces a scalar result:

 s  =   y^T    A     x        (B.37)
       (1×n) (n×n) (n×1)

This is called a bilinear form. Matrix A is called the kernel of the form. Transposing both sides of (B.37), and noting that the transpose of a scalar does not change it, we obtain the result

s = x^T A^T y = y^T A x.   (B.38)

If A is symmetric and the vectors x and y coalesce, i.e. A^T = A and x = y, the bilinear form becomes a quadratic form

s = x^T A x.   (B.39)

Transposing both sides of a quadratic form reproduces the same equation.

Example B.15. The kinetic energy of a dynamic system consisting of three point masses m_1, m_2, m_3 moving in one dimension with velocities v_1, v_2 and v_3, respectively, is

T = \frac{1}{2} (m_1 v_1^2 + m_2 v_2^2 + m_3 v_3^2).   (B.40)

This can be expressed as the quadratic form

T = \frac{1}{2} v^T M v,   (B.41)

in which

M = \begin{bmatrix} m_1 & 0 & 0 \\ 0 & m_2 & 0 \\ 0 & 0 & m_3 \end{bmatrix},\qquad
v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}.   (B.42)

Here M denotes the system mass matrix whereas v is the system velocity vector.

The following material discusses more specialized properties of matrices, such as determinants, inverses, rank and orthogonality. These apply only to square matrices unless otherwise stated.

§B.5. Determinants

The determinant of a square matrix A = [a_{ij}] is a number denoted by |A| or det(A), through which important properties such as matrix singularity or its eigenvalue spectrum can be compactly characterized. This number is defined as the following function of the matrix elements:

|A| = det(A) = \sum \pm\, a_{1 j_1} a_{2 j_2} \cdots a_{n j_n},   (B.43)

where the column indices j_1, j_2, ..., j_n are taken from the set {1, 2, ..., n}, with no repetitions allowed. The plus (minus) sign is taken if the permutation (j_1 j_2 ... j_n) is even (odd).

Example B.16. If the order n does not exceed 3, closed forms of the determinant are practical. For a 2 × 2 matrix,

\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11} a_{22} - a_{12} a_{21}.   (B.44)

For a 3 × 3 matrix,

\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11} a_{22} a_{33} + a_{12} a_{23} a_{31} + a_{13} a_{21} a_{32} - a_{13} a_{22} a_{31} - a_{12} a_{21} a_{33} - a_{11} a_{23} a_{32}.   (B.45)

Remark B.3. The concept of determinant is not applicable to non-square matrices or to vectors. Thus the notation |x| for a vector x can be reserved for its magnitude (as in Appendix A) without risk of confusion.
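A quick numerical check of the closed form (B.45) is easy to script. The sketch below, an added illustration, expands a 3 × 3 determinant term by term and compares it with NumPy's determinant routine; the test matrix is the coefficient matrix that reappears later in Example B.17.

import numpy as np

def det3(A):
    # Term-by-term expansion of a 3 x 3 determinant, as in (B.45)
    return (A[0, 0]*A[1, 1]*A[2, 2] + A[0, 1]*A[1, 2]*A[2, 0] + A[0, 2]*A[1, 0]*A[2, 1]
            - A[0, 2]*A[1, 1]*A[2, 0] - A[0, 1]*A[1, 0]*A[2, 2] - A[0, 0]*A[1, 2]*A[2, 1])

A = np.array([[5., 2., 1.],
              [3., 2., 0.],
              [1., 0., 2.]])
print(det3(A))              # 6.0
print(np.linalg.det(A))     # 6.0 up to round-off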
Remark B.4. Inasmuch as the expression (B.43) contains n! terms, the calculation of |A| from the definition is impractical for general matrices whose order exceeds 3 or 4. For example, if n = 10, (B.43) contains 10! = 3,628,800 terms, each involving 9 multiplications, so over 30 million floating-point operations would be required to evaluate |A| according to that definition. A more practical method based on matrix decomposition is described in Remark B.5.

§B.5.1. Determinant Properties

Some useful rules associated with the calculus of determinants are listed next.

I. Rows and columns can be interchanged without affecting the value of a determinant. Consequently

|A| = |A^T|.   (B.46)

II. If two rows, or two columns, are interchanged the sign of the determinant is reversed. For example:

\begin{vmatrix} 3 & 4 \\ 1 & -2 \end{vmatrix} = - \begin{vmatrix} 1 & -2 \\ 3 & 4 \end{vmatrix}.   (B.47)

III. If a row (or column) is changed by adding to or subtracting from its elements the corresponding elements of any other row (or column), the determinant remains unaltered. For example:

\begin{vmatrix} 3 & 4 \\ 1 & -2 \end{vmatrix} = \begin{vmatrix} 3+1 & 4-2 \\ 1 & -2 \end{vmatrix} = \begin{vmatrix} 4 & 2 \\ 1 & -2 \end{vmatrix} = -10.   (B.48)

IV. If the elements in any row (or column) have a common factor α, then the determinant equals the determinant of the corresponding matrix in which α = 1, multiplied by α. For example:

\begin{vmatrix} 6 & 8 \\ 1 & -2 \end{vmatrix} = 2 \begin{vmatrix} 3 & 4 \\ 1 & -2 \end{vmatrix} = 2 × (-10) = -20.   (B.49)

V. When at least one row (or column) of a matrix is a linear combination of the other rows (or columns), the determinant is zero. Conversely, if the determinant is zero, then at least one row and one column are linearly dependent on the other rows and columns, respectively. For example, consider

\begin{vmatrix} 3 & 2 & 1 \\ 1 & 2 & -1 \\ 2 & -1 & 3 \end{vmatrix}.   (B.50)

This determinant is zero because the first column is a linear combination of the second and third columns:

column 1 = column 2 + column 3.   (B.51)

Similarly, there is a linear dependence between the rows, which is given by the relation

row 1 = (7/5) row 2 + (4/5) row 3.   (B.52)

VI. The determinant of an upper triangular or lower triangular matrix is the product of the main diagonal entries. For example,

\begin{vmatrix} 3 & 2 & 1 \\ 0 & 2 & -1 \\ 0 & 0 & 4 \end{vmatrix} = 3 × 2 × 4 = 24.   (B.53)

This rule is easily verified from the definition (B.43), because all terms vanish except the one with j_1 = 1, j_2 = 2, ..., j_n = n, which is the product of the main diagonal entries. Diagonal matrices are a particular case of this rule.

VII. The determinant of the product of two square matrices is the product of the individual determinants:

|A B| = |A| |B|.   (B.54)

The proof requires the concept of triangular decomposition, which is covered in the Remark below. This rule can be generalized to any number of factors. One immediate application is to matrix powers: |A^2| = |A| |A| = |A|^2, and more generally |A^m| = |A|^m for any positive integer m.

VIII. The determinant of the transpose of a matrix is the same as that of the original matrix:

|A^T| = |A|.   (B.55)

This rule can be directly verified from the definition of determinant, and also as a direct consequence of Rule I.

Remark B.5. Rules VI and VII are the key to the practical evaluation of determinants. Any square nonsingular matrix A (the qualifier "nonsingular" is explained in §B.6) can be decomposed as the product of two triangular factors

A = L U,   (B.56)

in which L is unit lower triangular and U is upper triangular. This is called a LU triangularization, LU factorization or LU decomposition. It can be carried out in O(n^3) floating-point operations. According to Rule VII,

|A| = |L| |U|.   (B.57)

According to Rule VI, |L| = 1 and |U| = u_{11} u_{22} ... u_{nn}. The last operation requires only O(n) operations. Thus the evaluation of |A| is dominated by the effort involved in computing the factorization (B.56). For n = 10, that effort is roughly 10^3 = 1000 floating-point operations, compared to approximately 3 × 10^7 for the naive application of the definition (B.43), as noted in Remark B.4. Thus the LU-based method is roughly 30,000 times faster for that modest matrix order, and the advantage grows factorially with increasing n.
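Remark B.5 can be tried out directly with a library LU factorization. The sketch below uses SciPy, whose lu routine returns a factorization with row pivoting, A = P L U, so the permutation determinant (±1) must be included; the test matrix is an arbitrary illustrative choice, not one from the text.

import numpy as np
from scipy.linalg import lu

A = np.array([[3., 2., 1.],
              [0., 2., -1.],
              [4., 1., 5.]])     # arbitrary nonsingular test matrix

P, L, U = lu(A)                  # A = P L U, L unit lower triangular, U upper triangular

# |A| = |P| |L| |U| = (+-1) * 1 * (u11 u22 ... unn), by Rules VI and VII
detA = np.linalg.det(P) * np.prod(np.diag(U))
print(detA)
print(np.linalg.det(A))          # same value, here 17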
§B.5.2. Cramer's Rule and Homogeneous Systems

Cramer's rule provides a recipe for solving linear algebraic equations directly in terms of determinants. Let the system of equations be — as usual — denoted by

A x = y,   (B.58)

in which A is a given n × n matrix, y is a given n × 1 vector, and x is the n × 1 vector of unknowns. The explicit form of (B.58) is Equation (A.1) of Appendix A, with n = m. A closed form solution for the components x_1, x_2, ..., x_n of x in terms of determinants is

x_1 = \frac{ \begin{vmatrix} y_1 & a_{12} & a_{13} & \cdots & a_{1n} \\ y_2 & a_{22} & a_{23} & \cdots & a_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ y_n & a_{n2} & a_{n3} & \cdots & a_{nn} \end{vmatrix} }{ |A| },\qquad
x_2 = \frac{ \begin{vmatrix} a_{11} & y_1 & a_{13} & \cdots & a_{1n} \\ a_{21} & y_2 & a_{23} & \cdots & a_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ a_{n1} & y_n & a_{n3} & \cdots & a_{nn} \end{vmatrix} }{ |A| },\quad \ldots   (B.59)

The rule can be remembered as follows: in the numerator of the quotient for x_j, replace the jth column of A by the right-hand side y. This method of solving simultaneous equations is known as Cramer's rule. Because the explicit computation of determinants is impractical for n > 3, as explained in Remark B.4, direct use of the rule has practical value only for n = 2 and n = 3 (it is marginal for n = 4). But such small-order systems arise often in finite element calculations at the Gauss point level; consequently implementors should be aware of this rule for such applications. In addition the rule may be advantageous for symbolic systems; see Example B.18 and Remark B.6 below.

Example B.17. Solve the 3 × 3 linear system

\begin{bmatrix} 5 & 2 & 1 \\ 3 & 2 & 0 \\ 1 & 0 & 2 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} =
\begin{bmatrix} 8 \\ 5 \\ 3 \end{bmatrix}   (B.60)

by Cramer's rule:

x_1 = \frac{ \begin{vmatrix} 8 & 2 & 1 \\ 5 & 2 & 0 \\ 3 & 0 & 2 \end{vmatrix} }{ \begin{vmatrix} 5 & 2 & 1 \\ 3 & 2 & 0 \\ 1 & 0 & 2 \end{vmatrix} } = \frac{6}{6} = 1,\qquad
x_2 = \frac{ \begin{vmatrix} 5 & 8 & 1 \\ 3 & 5 & 0 \\ 1 & 3 & 2 \end{vmatrix} }{ \begin{vmatrix} 5 & 2 & 1 \\ 3 & 2 & 0 \\ 1 & 0 & 2 \end{vmatrix} } = \frac{6}{6} = 1,\qquad
x_3 = \frac{ \begin{vmatrix} 5 & 2 & 8 \\ 3 & 2 & 5 \\ 1 & 0 & 3 \end{vmatrix} }{ \begin{vmatrix} 5 & 2 & 1 \\ 3 & 2 & 0 \\ 1 & 0 & 2 \end{vmatrix} } = \frac{6}{6} = 1.   (B.61)

Example B.18. Solve the 2 × 2 linear algebraic system

\begin{bmatrix} 2+β & -β \\ -β & 1+β \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} =
\begin{bmatrix} 5 \\ 0 \end{bmatrix}   (B.62)

by Cramer's rule:

x_1 = \frac{ \begin{vmatrix} 5 & -β \\ 0 & 1+β \end{vmatrix} }{ \begin{vmatrix} 2+β & -β \\ -β & 1+β \end{vmatrix} } = \frac{5+5β}{2+3β},\qquad
x_2 = \frac{ \begin{vmatrix} 2+β & 5 \\ -β & 0 \end{vmatrix} }{ \begin{vmatrix} 2+β & -β \\ -β & 1+β \end{vmatrix} } = \frac{5β}{2+3β}.   (B.63)

Remark B.6. Cramer's rule has gained importance in symbolic computations carried out by computer algebra systems, which arise when the entries of A and y are algebraic expressions, as in the example system (B.62). In such cases Cramer's rule may be competitive with factorization methods for up to moderate matrix orders, for example n ≤ 20, because the determinantal products may be simplified on the fly.

One immediate consequence of Cramer's rule is what happens if

y_1 = y_2 = ... = y_n = 0.   (B.64)

A linear equation system with a null right-hand side,

A x = 0,   (B.65)

is called a homogeneous system. From the rule (B.59) we see that if |A| is nonzero, all solution components are zero, and consequently the only possible solution is the trivial one x = 0. The case in which |A| vanishes is discussed in the next section.
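Remark B.6 mentions the use of Cramer's rule in symbolic computation. The following SymPy sketch, added as an illustration, reproduces Example B.18 by replacing each column of A in turn with the right-hand side.

import sympy as sp

beta = sp.symbols('beta')
A = sp.Matrix([[2 + beta, -beta],
               [-beta, 1 + beta]])
detA = sp.simplify(A.det())                      # equals 2 + 3*beta, as in (B.63)

# Cramer's rule numerators: column j of A replaced by y = (5, 0)
A1 = sp.Matrix([[5, -beta], [0, 1 + beta]])      # column 1 replaced
A2 = sp.Matrix([[2 + beta, 5], [-beta, 0]])      # column 2 replaced
x1 = sp.simplify(A1.det() / detA)
x2 = sp.simplify(A2.det() / detA)

print(x1)   # equals (5 + 5*beta)/(2 + 3*beta), cf. (B.63)
print(x2)   # equals 5*beta/(2 + 3*beta), cf. (B.63)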
§B.6. Singular Matrices, Rank

If the determinant |A| of an n × n square matrix A ≡ A_n is zero, then the matrix is said to be singular. This means that at least one row and one column are linearly dependent on the others. If this row and column are removed, we are left with another matrix, say A_{n-1}, to which we can apply the same criterion. If the determinant |A_{n-1}| is zero, we can remove another row and column from it to get A_{n-2}, and so on. Suppose that we eventually arrive at an r × r matrix A_r whose determinant is nonzero. Then matrix A is said to have rank r, and we write rank(A) = r.

If the determinant of A is nonzero, then A is said to be nonsingular. The rank of a nonsingular n × n matrix is equal to n. Obviously the rank of A^T is the same as that of A, since it is only necessary to transpose "row" and "column" in the definition.

The notion of rank can be extended to rectangular matrices, as covered in the textbooks cited in the Notes and Bibliography. That extension, however, is not important for the material covered here.

Example B.19. The 3 × 3 matrix

A = \begin{bmatrix} 3 & 2 & 2 \\ 1 & 2 & -1 \\ 2 & -1 & 3 \end{bmatrix}   (B.66)

has rank r = 3 because |A| = -5 ≠ 0.

Example B.20. The matrix

A = \begin{bmatrix} 3 & 2 & 1 \\ 1 & 2 & -1 \\ 2 & -1 & 3 \end{bmatrix},   (B.67)

already used as an example in §B.5.1, is singular because its first row and column may be expressed as linear combinations of the others, as shown in (B.51) and (B.52). Removing the first row and column leaves a 2 × 2 matrix whose determinant is 2 × 3 - (-1) × (-1) = 5 ≠ 0. Thus (B.67) has rank r = 2.

§B.6.1. Rank Deficiency

If the square matrix A is supposed to be of rank r but in fact has a smaller rank r̄ < r, the matrix is said to be rank deficient. The number r - r̄ > 0 is called the rank deficiency.

Example B.21. Suppose that the unconstrained master stiffness matrix K of a finite element has order n, and that the element possesses b independent rigid body modes. Then the expected rank of K is r = n - b. If the actual rank is less than r, the finite element model is said to be rank-deficient. This is usually undesirable.

Example B.22. As an illustration of the foregoing rule, consider the two-node, 4-DOF, Bernoulli-Euler plane beam element stiffness derived in Chapter 12:

K = \frac{EI}{L^3} \begin{bmatrix} 12 & 6L & -12 & 6L \\ & 4L^2 & -6L & 2L^2 \\ & & 12 & -6L \\ \mathrm{symm} & & & 4L^2 \end{bmatrix},   (B.68)

in which EI and L are positive scalars. It may be verified (for example, through an eigenvalue analysis) that this 4 × 4 matrix has rank 2. The number of rigid body modes is 2, and the expected rank is r = 4 - 2 = 2. Consequently this FEM model is rank sufficient.

§B.6.2. Rank of Matrix Sums and Products

In finite element analysis matrices are often built through sum and product combinations of simpler matrices. Two important rules apply to "rank propagation" through those combinations.

The rank of the product of two square matrices A and B cannot exceed the smallest rank of the multiplicand matrices. That is, if the rank of A is r_A and the rank of B is r_B,

rank(A B) ≤ min(r_A, r_B).   (B.69)

In particular, if either matrix is singular, so is the product. Regarding sums: the rank of a matrix sum cannot exceed the sum of the ranks of the summand matrices. That is, if the rank of A is r_A and the rank of B is r_B,

rank(A + B) ≤ r_A + r_B.   (B.70)

Rule (B.70) is the basis for studying the effect of the number of Gauss points on the rank of an element stiffness matrix being built through numerical integration, as discussed in Chapter 19.
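The ranks quoted in Examples B.19, B.20 and B.22, as well as the product rule (B.69), can be checked numerically. In the sketch below the beam stiffness (B.68) is evaluated with unit EI and L, values chosen only to make the test concrete.

import numpy as np

A66 = np.array([[3., 2., 2.],
                [1., 2., -1.],
                [2., -1., 3.]])          # matrix (B.66)
A67 = np.array([[3., 2., 1.],
                [1., 2., -1.],
                [2., -1., 3.]])          # matrix (B.67), singular

print(np.linalg.matrix_rank(A66))        # 3
print(np.linalg.matrix_rank(A67))        # 2

# Beam stiffness (B.68) with illustrative unit EI and L; expected rank 2
EI, L = 1.0, 1.0
K = (EI / L**3) * np.array([[ 12.,   6*L, -12.,    6*L],
                            [ 6*L, 4*L**2, -6*L, 2*L**2],
                            [-12.,  -6*L,  12.,   -6*L],
                            [ 6*L, 2*L**2, -6*L, 4*L**2]])
print(np.linalg.matrix_rank(K))          # 2

# Product rule (B.69): rank(A B) <= min(rank A, rank B)
print(np.linalg.matrix_rank(A66 @ A67))  # 2 = min(3, 2)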
§B.6.3. Singular Systems: Particular and Homogeneous Solutions

Having introduced the notion of rank we can now discuss what happens to the linear system (B.58) when the determinant of A vanishes, meaning that its rank is less than n. If so, (B.58) has either no solution or an infinite number of solutions. Cramer's rule is of limited or no help in this situation.

To discuss this case further we note that if |A| = 0 and the rank of A is r = n - d, where d ≥ 1 is the rank deficiency, then there exist d nonzero independent vectors z_i, i = 1, ..., d, such that

A z_i = 0.   (B.71)

These d vectors, suitably orthonormalized, are called null eigenvectors of A, and form a basis for its null space. Let Z denote the n × d matrix obtained by collecting the z_i as columns. If y in (B.58) is in the range of A, that is, there exists a nonzero x_p such that y = A x_p, its general solution is

x = x_p + x_h = x_p + Z w,   (B.72)

where w is an arbitrary d × 1 weighting vector. This statement can be easily verified by substituting this solution into A x = y and noting that A Z vanishes. The components x_p and x_h are called the particular and homogeneous portions, respectively, of the total solution x. (The terms particular solution and complementary solution are also often used.) If y = 0 only the homogeneous portion remains.

If y is not in the range of A, system (B.58) does not generally have a solution in the conventional sense, although least-squares solutions can usually be constructed. The reader is referred to the many textbooks in linear algebra for further details.

§B.6.4. *Rank of Rectangular Matrices

The notion of rank can be extended to rectangular matrices, real or complex, as follows. Let A be m × n. Its column range space R(A) is the subspace spanned by A x, where x ranges over all complex n-vectors; mathematically, R(A) = {A x : x ∈ C^n}. The rank r of A is the dimension of R(A). The null space N(A) of A is the set of n-vectors z such that A z = 0; the dimension of N(A) is n - r.

Using these definitions, the product and sum rules (B.69) and (B.70) generalize to the case of rectangular (but conforming) A and B. So does the treatment of linear equation systems A x = y in which A is rectangular. Such systems often arise in the fitting of observation and measurement data. In finite element methods, rectangular matrices appear in change of basis through congruential transformations, and in the treatment of multifreedom constraints.

§B.7. Matrix Inversion

The inverse of a square nonsingular matrix A is represented by the symbol A^{-1} and is defined by the relation

A A^{-1} = I.   (B.73)

The most important application of the concept of inverse is the solution of linear systems. Suppose that, in the usual notation, we have

A x = y.   (B.74)

Premultiplying both sides by A^{-1} we get the inverse relationship

x = A^{-1} y.   (B.75)

More generally, consider the matrix equation for multiple (m) right-hand sides:

   A      X   =    Y,        (B.76)
 (n×n)  (n×m)    (n×m)

which reduces to (B.74) for m = 1. The inverse relation that gives X as a function of Y is

X = A^{-1} Y.   (B.77)

In particular, the solution of

A X = I,   (B.78)

is X = A^{-1}. Practical methods for computing inverses are based on directly solving this equation.

§B.7.1. Explicit Computation of Inverses

The explicit calculation of matrix inverses is seldom needed in large matrix computations, but occasionally the need arises for the explicit inverse of small matrices that appear in element-level computations, for example the inversion of Jacobian matrices at Gauss points, or of constitutive matrices. A general formula for the entries of the inverse can be obtained by specializing Cramer's rule to (B.78): the (i, j) entry of A^{-1} is the cofactor of a_{ji} divided by |A|.

§B.9. *The Inverse of a Sum of Matrices

The last expression in (B.95) may be further transformed by matrix manipulations as

(A + B)^{-1} = A^{-1} - (I + A^{-1} B)^{-1} A^{-1} B A^{-1}
             = A^{-1} - A^{-1} (I + B A^{-1})^{-1} B A^{-1}
             = A^{-1} - A^{-1} B (I + A^{-1} B)^{-1} A^{-1}
             = A^{-1} - A^{-1} B A^{-1} (I + B A^{-1})^{-1}.   (B.96)

In all of these forms B may be singular (or null). If B is also invertible, the third expression in (B.96) may be transformed to the commonly used variant

(A + B)^{-1} = A^{-1} - A^{-1} (A^{-1} + B^{-1})^{-1} A^{-1}.   (B.97)
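A small numerical experiment makes the identities (B.96) and (B.97) tangible. The matrices below are arbitrary illustrative choices for which A, B and A + B are all invertible, so every quoted form applies.

import numpy as np

A = np.array([[4., 1., 0., 0.],
              [1., 3., 1., 0.],
              [0., 1., 2., 1.],
              [0., 0., 1., 5.]])
B = np.array([[1., 0., 2., 0.],
              [0., 2., 0., 0.],
              [1., 0., 1., 0.],
              [0., 3., 0., 1.]])

Ainv = np.linalg.inv(A)
I = np.eye(4)
lhs = np.linalg.inv(A + B)

# First and third forms of (B.96); B need not be invertible for these
form1 = Ainv - np.linalg.inv(I + Ainv @ B) @ Ainv @ B @ Ainv
form3 = Ainv - Ainv @ B @ np.linalg.inv(I + Ainv @ B) @ Ainv
print(np.allclose(lhs, form1), np.allclose(lhs, form3))    # True True

# Variant (B.97) requires B to be invertible as well
form97 = Ainv - Ainv @ np.linalg.inv(Ainv + np.linalg.inv(B)) @ Ainv
print(np.allclose(lhs, form97))                            # True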
The case of singular A may be handled using the notion of generalized inverses. This is a topic beyond the scope of this course, which may be studied, e.g., in the textbooks [83,129,686]. The special case of B being of low rank merges with the Sherman-Morrison and Woodbury formulas, covered below.

The Sherman-Morrison formula discussed below gives the inverse of a matrix modified by a rank-one matrix. The Woodbury formula extends it to a modification of arbitrary rank. In structural analysis these formulas are of interest for problems of structural modification, in which a finite element model (or, in general, a discrete model) is changed by an amount expressible as a low-rank correction to the original model.

§B.9.1. *The Sherman-Morrison and Woodbury Formulas

Let A be a square n × n invertible matrix, whereas u and v are two n-vectors and β an arbitrary scalar. Assume that α = 1 + β v^T A^{-1} u ≠ 0. Then

(A + β u v^T)^{-1} = A^{-1} - \frac{β}{α} A^{-1} u v^T A^{-1}.   (B.98)

When β = 1 this is called the Sherman-Morrison formula after [740]. (For a history of this remarkable expression and its extensions, which are important in many applications such as statistics and probability, see the review paper by Henderson and Searle cited previously.) Since any rank-one correction to A can be written as β u v^T, (B.98) gives the rank-one change to its inverse. The proof can be done by direct multiplication; see Exercise B.15. For practical computation of the change one solves the linear systems A a = u and A b = v for a and b, using the known A^{-1}. Compute α = 1 + β v^T a. If α ≠ 0, the change to A^{-1} is the dyadic -(β/α) a b^T.

Consider next the case where U and V are two n × k matrices with k ≤ n and β an arbitrary scalar. Assume that the k × k matrix Σ = I_k + β V^T A^{-1} U, in which I_k denotes the k × k identity matrix, is invertible. Then

(A + β U V^T)^{-1} = A^{-1} - β A^{-1} U Σ^{-1} V^T A^{-1}.   (B.99)

This is called the Woodbury formula, after [890]. It reduces to the Sherman-Morrison formula (B.98) if k = 1, in which case U and V are the vectors u and v, and Σ reduces to the scalar α.

§B.9.2. *Formulas for Modified Determinants

Let Ã denote the adjoint of A. Taking the determinants of both sides of A + β u v^T one obtains

|A + β u v^T| = |A| + β v^T Ã u.   (B.100)

If A is invertible, replacing Ã = |A| A^{-1} this becomes

|A + β u v^T| = |A| (1 + β v^T A^{-1} u).   (B.101)

Similarly, one can show that if A is invertible, and U and V are n × k matrices,

|A + β U V^T| = |A| |I_k + β V^T A^{-1} U|.   (B.102)

§B.9.3. *Inversion of Low Order Symmetric Matrix Sums

The inversion formulas given here are important in recent developments in continuum mechanics. Let A be an n × n positive definite symmetric matrix with n = 2, 3, let I be the identity matrix of order n, and let c be a scalar. We want to express (A + c I)^{-1} directly in terms of A, I, c and the invariants of A provided by its characteristic coefficients. The derivations are carried out in [402], and make essential use of the Cayley-Hamilton theorem.

Two Dimensional Case. Here n = 2. The invariants are I_A = trace(A) and II_A = det(A). Then

(A + c I)^{-1} = - \frac{1}{c (c + I_A) + II_A} \left( A - (c + I_A) I \right).   (B.103)

Three Dimensional Case. Here n = 3. The invariants are I_A = trace(A), II_A = (I_A^2 - trace(A^2))/2, and III_A = det(A). Then

(A + c I)^{-1} = \frac{1}{c [c (c + I_A) + II_A] + III_A} \left( A^2 - (c + I_A) A + [c (c + I_A) + II_A] I \right).   (B.104)
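The practical recipe given above for the Sherman-Morrison update is easy to exercise numerically. The 3 × 3 matrix, the vectors u and v, and the scalar β below are arbitrary illustrative data, chosen so that α ≠ 0; the script also checks the determinant counterpart (B.101).

import numpy as np

A = np.array([[4., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])
u = np.array([1., 0., 2.])
v = np.array([0., 1., 1.])
beta = 0.5

Ainv = np.linalg.inv(A)

# Solve A a = u and A b = v, then apply the dyadic correction -(beta/alpha) a b^T
a = np.linalg.solve(A, u)
b = np.linalg.solve(A, v)
alpha = 1.0 + beta * (v @ a)
updated_inv = Ainv - (beta / alpha) * np.outer(a, b)

direct = np.linalg.inv(A + beta * np.outer(u, v))
print(np.allclose(updated_inv, direct))      # True, verifying (B.98)

# Determinant of the rank-one modified matrix, as in (B.101)
print(np.isclose(np.linalg.det(A + beta * np.outer(u, v)),
                 np.linalg.det(A) * alpha))  # True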
§B.10. *Matrix Orthogonality

Let A and B be two product-conforming real matrices; for example, A is k × m whereas B is m × n. If their product is the null matrix,

C = A B = 0,   (B.105)

the matrices are said to be orthogonal. This is the generalization of the notion of vector orthogonality discussed in the previous Appendix.

§B.10.1. *Matrix Orthogonalization Via Projectors

The matrix orthogonalization problem can be stated as follows. Product-conforming matrices A and B are given, but their product is not zero. How can A be orthogonalized with respect to B so that (B.105) is verified?

Suppose that B is m × n with m ≥ n and that B^T B is nonsingular (equivalently, B has full rank).⁷ Then form the m × m orthogonal projector matrix, or simply projector,

P_B = I - B (B^T B)^{-1} B^T,   (B.106)

in which I is the m × m identity matrix. Since P_B^T = P_B, the projector is square symmetric. Note that

P_B B = B - B (B^T B)^{-1} (B^T B) = B - B = 0.   (B.107)

It follows that P_B projects B onto its null space. Likewise B^T P_B = 0. Postmultiplying A by P_B yields

Ā = A P_B = A - A B (B^T B)^{-1} B^T.   (B.108)

Matrix Ā is called the projection of A onto the null space of B.⁸ It is easily verified that Ā and B are orthogonal:

Ā B = A B - A B (B^T B)^{-1} (B^T B) = A B - A B = 0.   (B.109)

Consequently, forming Ā via (B.106) and (B.108) solves the orthogonalization problem.

If B is square and nonsingular, Ā = 0, as may be expected. If B has more columns than rows, that is m < n, the projector (B.106) cannot be constructed since B^T B is necessarily singular. A similar difficulty arises if m ≥ n but B^T B is singular. Such cases require treatment using generalized inverses, which is a topic beyond the scope of this Appendix.⁹

In some applications, notably FEM, matrix A is square symmetric and it is desirable to preserve symmetry in Ā. That can be done by pre- and postmultiplying by the projector:

Ā = P_B A P_B.   (B.110)

Since P_B^T = P_B, the operation (B.110) is a congruential transformation, which preserves symmetry.

⁷ If you are not sure what "singular", "nonsingular" and "rank" mean, or what (·)^{-1} stands for, please read §B.6 and §B.7.
⁸ In contexts such as control and signal processing, Ā is called a filter and the operation (B.108) is called filtering.
⁹ See, e.g., the textbooks [83,686].

§B.10.2. *Orthogonal Projector Properties

The following properties of the projector (B.106) are useful when checking out computations. Forming its square,

P_B^2 = P_B P_B = I - 2 B (B^T B)^{-1} B^T + B (B^T B)^{-1} B^T B (B^T B)^{-1} B^T
      = I - 2 B (B^T B)^{-1} B^T + B (B^T B)^{-1} B^T = I - B (B^T B)^{-1} B^T = P_B,   (B.111)

shows that the projector matrix is idempotent. Repeating the process one sees that P_B^k = P_B for any positive integer k. If B is m × n with m ≥ n and full rank n, P_B has m - n unit eigenvalues and n zero eigenvalues. This is shown in the paper [258], in which various applications of orthogonal projectors and orthogonalization to multilevel FEM computations are covered in detail.
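The projector properties just listed are convenient to verify in a few lines. The 4 × 2 matrix B and the 3 × 4 matrix A below are arbitrary full-rank illustrative choices, not data from the text.

import numpy as np

B = np.array([[1., 0.],
              [1., 1.],
              [0., 1.],
              [2., 1.]])                 # m = 4, n = 2, full rank
A = np.array([[1., 2., 0., 1.],
              [0., 1., 3., 0.],
              [2., 0., 1., 1.]])         # conforming: A B is 3 x 2

m = B.shape[0]
PB = np.eye(m) - B @ np.linalg.inv(B.T @ B) @ B.T   # projector (B.106)

print(np.allclose(PB, PB.T))        # True: symmetric
print(np.allclose(PB @ PB, PB))     # True: idempotent, as in (B.111)
print(np.allclose(PB @ B, 0))       # True: (B.107)

Abar = A @ PB                       # projection (B.108)
print(np.allclose(Abar @ B, 0))     # True: orthogonalized, (B.109)

# Eigenvalues: m - n ones and n zeros for full-rank B (here two of each)
print(np.round(np.linalg.eigvalsh(PB), 12))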
Notes and Bibliography

Much of the material summarized here is available in expanded form in linear algebra textbooks. Highly reputable ones include Bellman [73], Golub and Van Loan [335], Householder [410], and Strang [773]. As books of historical importance we may cite Aitken [9], Muir [555], Turnbull [833], and Wilkinson and Reinsch [876]. For references focused on the eigenvalue problem, see Appendix E. For inverses of matrix sums, there are two SIAM Review articles: [359,385]. For a historical account of the topic and its close relation to the Schur complement, see the bibliography in Appendix P.

Homework Exercises for Appendix B: Matrices

In all ensuing exercises matrices and vectors are assumed real unless otherwise stated.

EXERCISE B.1 [A:10] Given the three matrices

A = \begin{bmatrix} 2 & 4 & -1 & 2 \\ 2 & 5 & 1 & 0 \\ 3 & 1 & -1 & 2 \end{bmatrix},\quad
B = \begin{bmatrix} 2 & -2 \\ 1 & 0 \\ 4 & 1 \\ -3 & 2 \end{bmatrix},\quad
C = \begin{bmatrix} 1 & -3 \\ 2 & 0 \end{bmatrix}   (EB.1)

compute the product D = A B C by hand using Falk's scheme. Hint: do B C first, then premultiply that by A.

EXERCISE B.2 [A:10] Given the two square matrices of (EB.2), verify by direct computation that A B ≠ B A.

EXERCISE B.3 [A:10] Given the matrices A and B of (EB.3) (note that B is symmetric), compute S = A^T B A, and verify that S is symmetric.

EXERCISE B.4 [A:10] Given the two square matrices of (EB.4), verify that A B = 0 although A ≠ 0 and B ≠ 0. Is B A also null?

EXERCISE B.5 [A:10] Given the square matrix

A = \begin{bmatrix} 0 & a & b \\ 0 & 0 & c \\ 0 & 0 & 0 \end{bmatrix}   (EB.5)

in which a, b, c are real or complex scalars, show by direct computation that A^2 ≠ 0 but A^3 = 0.

EXERCISE B.6 [A:10] Can a diagonal matrix be antisymmetric?

EXERCISE B.7 [A:20] Prove the matrix product transposition rule (B.34). Hint: call C = (A B)^T, D = B^T A^T, and use the matrix product definition (B.26) to show that the generic entries of C and D agree.

EXERCISE B.8 [A:20] If A is an arbitrary m × n real matrix, show: (a) both products A^T A and A A^T exist, and (b) both are square and symmetric. Hint: for (b) use the symmetry condition S = S^T and the transposition rule (B.34).

EXERCISE B.9 [A:15] Show that A^2 exists if and only if A is square.

EXERCISE B.10 [A:20] If A is square and antisymmetric, show that A^2 is symmetric. Hint: start from A = -A^T and apply the results of Exercise B.8.

EXERCISE B.11 [A:15] If A is a square matrix of order n and c a scalar, show that det(c A) = c^n det A.

EXERCISE B.12 [A:20] Let u and v denote real n-vectors normalized to unit length, so that u^T u = 1 and v^T v = 1, and let I denote the n × n identity matrix. Show that

det(I - u v^T) = 1 - v^T u.   (EB.6)

EXERCISE B.13 [A:25] Let u denote a real n-vector normalized to unit length, so that u^T u = 1, while I denotes the n × n identity matrix. Show that

H = I - 2 u u^T   (EB.7)

is orthogonal, H^T H = I, and involutory, H^2 = I. This matrix is called an elementary Hermitian, a Householder matrix, or a reflector. It is a fundamental ingredient of many linear algebra algorithms; for example the QR algorithm for finding eigenvalues.

EXERCISE B.14 [A:20] The trace of an n × n square matrix A, denoted trace(A), is defined by (B.6). Show that if the entries of A are real,

trace(A^T A) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^2.   (EB.8)

This is the square of the Frobenius norm ||A||_F (incorrectly called the Euclidean norm by some authors).

EXERCISE B.15 [A:25] Prove the Sherman-Morrison formula (B.98) by direct matrix multiplication.

EXERCISE B.16 [A:30] Prove the Woodbury formula (B.99) by considering the block bordered system (EB.9), in which I_k and I_n denote the identity matrices of orders k and n, respectively. Solve (EB.9) two ways: eliminating first B and then C, and eliminating first C and then B. Equate the results for B.

EXERCISE B.17 [A:20] Show that the eigenvalues of a real symmetric square matrix are real, and that the eigenvectors are real vectors.

EXERCISE B.18 [A:25] Let the n real eigenvalues λ_i of a real n × n symmetric matrix A be classified into two subsets: r eigenvalues are nonzero whereas n - r are zero. Show that A has rank r.

EXERCISE B.19 [A:15] Show that if A is positive definite, A x = 0 implies that x = 0.
EXERCISE B.20 [A:20] Show that for any real m × n matrix A, A^T A is nonnegative. (Existence is proven in a previous Exercise.)

EXERCISE B.21 [A:15] Show that a triangular matrix is normal if and only if it is diagonal.

EXERCISE B.22 [A:20] Let A be a real orthogonal matrix. Show that all of its eigenvalues λ_i, which are generally complex, have unit modulus.

EXERCISE B.23 [A:15] Let A and B be two real symmetric matrices. If the product A B is symmetric, show that A and B commute.

EXERCISE B.24 [A:30] Let A and T be real n × n matrices, with T nonsingular. Show that T^{-1} A T and A have the same eigenvalues. (This is called a similarity transformation in linear algebra.)

EXERCISE B.25 [A:40] (Research level) Let A be m × n and B be n × m. Show that the nonzero eigenvalues of A B are the same as those of B A (Kahan).

EXERCISE B.26 [A:25] Let A be real skew-symmetric, that is, A = -A^T. Show that all eigenvalues of A are either purely imaginary or zero.

EXERCISE B.27 [A:30] Let A be real skew-symmetric, that is, A = -A^T. Show that U = (I + A)^{-1} (I - A), called a Cayley transformation, is orthogonal.

EXERCISE B.28 [A:30] Let P be a real square matrix that satisfies

P^2 = P.   (EB.10)

Such matrices are called idempotent, or projectors. Show that (a) all eigenvalues of P are either zero or one; (b) I - P is also a projector; (c) P^T is also a projector.

EXERCISE B.29 [A:35] Show that a necessary and sufficient condition for two square matrices to commute is that they have the same eigenvectors.

EXERCISE B.30 [A:30] A matrix whose elements are equal on any line parallel to the main diagonal is called a Toeplitz matrix. (They arise in finite difference or finite element discretizations of regular one-dimensional grids.) Show that if T_1 and T_2 are any two Toeplitz matrices, they commute: T_1 T_2 = T_2 T_1. Hint: do a Fourier transform to show that the eigenvectors of any Toeplitz matrix are of the form {e^{iωmh}}; then apply the previous Exercise.