2  INTRODUCTORY MATRIX MATERIAL

2.1 BLOCK OPERATIONS

It is very often convenient, in both theoretical and computer work, to partition a matrix into submatrices. This can be done in numerous ways, as suggested by this example, in which a 4 x 4 matrix is cut by one horizontal and one vertical line into four blocks:

\begin{pmatrix} 0 & 1 & 2 & 3 \\ 4 & 5 & 6 & 7 \\ \hline 8 & 9 & 10 & 11 \\ 12 & 13 & 14 & 15 \end{pmatrix}

Each submatrix or block can be labeled by subscripts, and we can display the original matrix with submatrices or blocks for its elements. The general form of a partitioned matrix therefore is

(2.1.1)   A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1\ell} \\ \vdots & & & \vdots \\ A_{k1} & A_{k2} & \cdots & A_{k\ell} \end{pmatrix}.

This means that the number of rows in each A_{ij} must be the same for each i, and the number of columns must be the same for each j. The size of A_{ij} is therefore m_i x n_j for certain integers m_i and n_j. We indicate this by writing

(2.1.1')   A = (A_{ij}),  size A_{ij} = m_i x n_j,

where the row counts m_1, ..., m_k are displayed down the side of the array and the column counts n_1, ..., n_\ell across the top.

A square matrix A of order n is often partitioned symmetrically. Suppose that n = n_1 + n_2 + ... + n_r with n_i >= 1. Partition A as

(2.1.2)   A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1r} \\ \vdots & & & \vdots \\ A_{r1} & A_{r2} & \cdots & A_{rr} \end{pmatrix},   size A_{ij} = n_i x n_j.

The diagonal blocks A_{ii} are square matrices of order n_i.

Example. With n = 6, n_1 = 2, n_2 = 1, n_3 = 3, cuts after the second and third rows and after the second and third columns give a symmetric partition of a 6 x 6 matrix.

Square matrices are frequently partitioned so as to be compounded of square blocks all of the same size. Dotted lines, bars, and commas are all used in an obvious way to indicate partitions. The sizes of the blocks must be such that they all fit together properly.

Example. A 6 x 6 matrix may be partitioned into four 3 x 3 blocks. If a square matrix A of order nk is composed of n x n square submatrices all of order k, it is termed an (n, k) matrix. Thus the matrix just described is a (2, 3) matrix.

Subject to certain conformability conditions on the blocks, the operations of scalar multiplication, transposition, conjugation, addition, and multiplication are carried out in the same way when expressed in block notation as when they are expressed in element notation. This means

(2.1.3)   c (A_{ij}) = (c A_{ij}),
(2.1.4)   (A_{ij})^T = (A_{ji}^T),
(2.1.5)   (A_{ij})^* = (A_{ji}^*),
(2.1.6)   (A_{ij}) + (B_{ij}) = (A_{ij} + B_{ij}),
(2.1.7)   (A_{ij})(B_{ij}) = (C_{ij}),   where C_{ij} = \sum_{r=1}^{\ell} A_{ir} B_{rj}.

Here T designates the transpose and * the conjugate transpose. In (2.1.6) the size of each A_{ij} must be the size of the corresponding B_{ij}. In (2.1.7), designate the size of A_{ij} by \alpha_i x \beta_j and the size of B_{ij} by \gamma_i x \delta_j. Then, if \beta_r = \gamma_r for 1 <= r <= \ell, the product A_{ir}B_{rj} can be formed and produces an \alpha_i x \delta_j matrix, independently of r. The sum can then be found as indicated, and the C_{ij} are \alpha_i x \delta_j matrices and together constitute a partition. Note that the rule for forming the blocks C_{ij} of the matrix product is the same as when the A_{ij} and B_{ij} are single numbers.

Example. If A and B are n x n matrices and if

C = \begin{pmatrix} A & B \\ B & -A \end{pmatrix},

then

C^2 = \begin{pmatrix} A^2 + B^2 & AB - BA \\ BA - AB & A^2 + B^2 \end{pmatrix}.
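The block rule (2.1.7) is easy to check numerically. A minimal sketch in Python with NumPy (a language and library the text itself does not use); the particular block sizes are chosen arbitrarily for illustration:

    import numpy as np

    rng = np.random.default_rng(0)

    # Partition sizes: row blocks of A, shared inner blocks, column blocks of B.
    m = [2, 3]        # alpha_i: row counts of the blocks A_i1, ..., A_il
    p = [2, 1, 4]     # beta_r = gamma_r: inner block sizes
    q = [3, 2]        # delta_j: column counts of the blocks B_1j, ..., B_lj

    A = rng.standard_normal((sum(m), sum(p)))
    B = rng.standard_normal((sum(p), sum(q)))

    # Split A and B into blocks A_ir and B_rj.
    Ablk = [np.split(rows, np.cumsum(p)[:-1], axis=1)
            for rows in np.split(A, np.cumsum(m)[:-1], axis=0)]
    Bblk = [np.split(rows, np.cumsum(q)[:-1], axis=1)
            for rows in np.split(B, np.cumsum(p)[:-1], axis=0)]

    # Form C_ij = sum_r A_ir B_rj as in (2.1.7) and reassemble.
    Cblk = [[sum(Ablk[i][r] @ Bblk[r][j] for r in range(len(p)))
             for j in range(len(q))] for i in range(len(m))]
    C = np.block(Cblk)

    assert np.allclose(C, A @ B)   # blockwise product agrees with the ordinary product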
"Hadamard matrices" of order 2n are given recursively by means of the definition H H2 = [1 -1]' H v+1 .[V V] H , -H , Write out H4 and Hg explicitly. Compute H2E2' H4H4" Let A, B, C, D all be n x n and let a, b, c, d be scalars. What is Let I be the identity of order p. Prove that /i B \ det . 0 = det C. A B 7. If A and C are square, prove that det(Q „) = (det A)(det C). 8. If A and C are square, prove that the eigenvalues A B of (g c) are those of A together with those of C. 2.2 DIRECT SUMS k, let A^ be a square matrix of For i = 1, 2, . . . , order n--. The block diagonal square matrix (2.2.1) A = 0 = diag(A1, A2, of order n^ + n2 + ••• + n^ is called the direct sum of the Ak and is designated by -" .....■ 1 ■■■■■ ; " ~ k ffi A. = ffi A. . k l (2.2.2) A-l • A2 i=l The following identities are easily established. (1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (A ffi B) (A + B) C = A (B © C) . (C + D) = (A © %) (A © B) (C © D) = AC ffi BD. T ((| © D) T T (A ffi B) = A ffi B (A ffi B) * = A* ffi B*. (AffiB) 1 = A1ffiB'L, assuming that the indicated inverses exist. det (A ffi B) = (det A) (det B). tr (A ffi B) = tr A + tr B. If pA(A) designates the characteristic polynomial of A, then PAeB„ (A ffi B) = { AA, AB}. (AA designates the set of eigenvalues of A.) 22 Introductory Matrix Material PROBLEMS Let A © A, Prove that det A = A- © A~ © 12 k n. , det A. and that for integer p, AP = A? o.= l pi 1 ... © AR. Give a linear algebra interpretation of the direct sum along the following lines. Let V be a finite-dimensional vector space and let L and M be sub-spaces. Write V = L 8 M if and only if every vector x E V can be written uniquely in the form x = y + z withy € L, z 6 M. Show that V = L © M if and only if y } are bases for y } is a m (a) dim V = dim L + dim M, L D M = {0} (b) if {xi;...,x£} and {y^. L and M, then {x.,...,x.,y_, basis for V. x 1 The fundamental theorem of rank-canonical form for square matrices tells us that ifAisanxn matrix of rank r, then there exist nonsingular matrices P, Q such that PAQ =1 © 0 _ . Verify this formulation. r n_r 2.3 KRONECKER PRODUCT Let A and B be m x n and p x q respectively. Then the Kronecker product (or tensor, or direct product of A and B) is that mp * nq matrix defined by (2.3.1) A 0 B = allB' amlB' a12B. a _B, m2 a, B In a B mn Important properties of the Kronecker product are as follows (indicated operations are assumed to be defined]: .....______________„___ (1) (aA) & B = A 0 (aB) = a(A 0 B); a scalar. (2) (A + B) 0 C = (A 0 C) + (BSC). (3) A ® (B + C) = (A 0 B) + (A 0 C) . |^(4) A 0 (B 0 C) = (A 0 B) 0 C. Kronecker Product 23 (5) (A ( i B) (C ( 3D) = (AC) 0 BD. (6) A & B = A < 3 B. (7) (A T i B) = AT 0 BT; (A 0 B)* = A* 0 B*. (8) r (A 0 B) = r (A) r(B) . orders m and n. Then (9) (10) (ID (12) tr (A 0 B) (tr(A))(tr(B)). If A and B are nonsingular, so is A B"1. B and (A 0 B) 1 = A-1 B) = (det A) "(det B)m. det (A There exists a permutation matrix P (see Section 2.4) depending only on m, n, such that B 0 A = P*(A 0 B)P. (13) Let fS{x, y) designate the polynomial P 4 k 0(x, y) = I a. xJy . j,k=0 3k Let 0(A; B) designate the mn x mn matrix j,k=0 *j,k A^ 0 Bk. Thf.n the eigenvalues of ß (A; B) are 0(A, 1, 2, m. 1, 2, n where X and u are the eigenvalues of A r s and B respectively Lues of ,m; s = 1, 2, ...,n eigenvalues of A 0 B are A u ^ r s In particular, the r = 1, 2, PROBLEMS 1. 
PROBLEMS

1. Show that I_m \otimes I_n = I_{mn}.
2. Describe the matrices I \otimes A and A \otimes I.
3. If A is m x m and B is n x n, show that A \otimes B = (A \otimes I_n)(I_m \otimes B) = (I_m \otimes B)(A \otimes I_n).
4. If A and B are upper (or lower) triangular, then so is A \otimes B.
5. If A \otimes B \neq 0 is diagonal, so are A and B.
6. Let A and B have orders m and n respectively. Show that the matrix (A \otimes I_n) + (I_m \otimes B) has the mn eigenvalues \lambda_r + \mu_s, r = 1, 2, ..., m; s = 1, 2, ..., n, where \lambda_r and \mu_s are the eigenvalues of A and B. This matrix is often called the Kronecker sum of A and B.
7. Let A and B be of orders m and n. If A and B both are (1) normal, (2) Hermitian, (3) positive definite, (4) positive semidefinite, or (5) unitary, then A \otimes B has the corresponding property. See Section 2.9.
8. Kronecker powers: Let A^{[2]} = A \otimes A and, in general, A^{[k+1]} = A \otimes A^{[k]} = A^{[k]} \otimes A. Prove that (AB)^{[k]} = A^{[k]}B^{[k]}.
9. Let Ax = \lambda x and By = \mu y. Set Z = (x_1 y^T, x_2 y^T, ..., x_m y^T)^T, that is, Z = x \otimes y. Prove that (A \otimes B)Z = \lambda\mu Z.

2.4 PERMUTATION MATRICES

By a permutation \sigma of the set N = \{1, 2, ..., n\} is meant a one-to-one mapping of N onto itself. Including the identity permutation, there are n! distinct permutations of N. One can indicate a typical permutation by

(2.4.1)   \sigma(1) = i_1,  \sigma(2) = i_2,  ...,  \sigma(n) = i_n.

The inverse permutation is designated by \sigma^{-1}. Thus

(2.4.1')   \sigma^{-1}(i_k) = k,   k = 1, 2, ..., n.

Let E_j designate the unit (row) vector of n components which has a 1 in the jth position and 0's elsewhere:

(2.4.2)   E_j = (0, ..., 0, 1, 0, ..., 0).

By a permutation matrix of order n is meant a matrix of the form

(2.4.3)   P_\sigma = \begin{pmatrix} E_{\sigma(1)} \\ E_{\sigma(2)} \\ \vdots \\ E_{\sigma(n)} \end{pmatrix}.

One has

(2.4.4)   P_\sigma = (p_{ij}),  where  p_{i,\sigma(i)} = 1, i = 1, 2, ..., n, and p_{ij} = 0 otherwise.

The ith row of P_\sigma has a 1 in the \sigma(i)th column and 0's elsewhere. The jth column of P_\sigma has a 1 in the \sigma^{-1}(j)th row and 0's elsewhere. Thus each row and each column of P_\sigma has precisely one 1 in it.

Example.

P_\sigma = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}

corresponds to \sigma(1) = 4, \sigma(2) = 1, \sigma(3) = 3, \sigma(4) = 2, which is often written as \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 3 & 2 \end{pmatrix}.

It is easily seen that

(2.4.5)   P_\sigma (y_1, y_2, ..., y_n)^T = (y_{\sigma(1)}, y_{\sigma(2)}, ..., y_{\sigma(n)})^T.

Hence if A = (a_{ij}),

(2.4.6)   P_\sigma A = (a_{\sigma(i), j}),

so that P_\sigma A is A with its rows permuted by \sigma. Similarly,

(2.4.7)   A P_\sigma = (a_{i, \sigma^{-1}(j)}),

that is, A P_\sigma is A with its columns permuted by \sigma (column \sigma(j) of A P_\sigma is column j of A), and

(2.4.8)   A P_\sigma^* = (a_{i, \sigma(j)}).

Note also that

(2.4.9)   P_\sigma P_\tau = P_{\sigma\tau},

where the product of the permutations \sigma, \tau is applied from left to right. Furthermore,

(2.4.10)   P_\sigma^T = P_{\sigma^{-1}};  hence  (P_\sigma)^* = P_{\sigma^{-1}},

(2.4.11)   (P_\sigma)^* P_\sigma = P_{\sigma^{-1}} P_\sigma = P_\iota = I.

Therefore

(2.4.12)   (P_\sigma)^* = P_{\sigma^{-1}} = (P_\sigma)^{-1}.

The permutation matrices are thus unitary, forming a subgroup of the unitary group. From (2.4.6), (2.4.8), and (2.4.12) it follows that if A is n x n,

(2.4.13)   P_\sigma A P_\sigma^* = (a_{\sigma(i), \sigma(j)}),

so that the similarity transformation P_\sigma A P_\sigma^* causes a consistent renumbering of the rows and columns of A by the permutation \sigma.

Among the permutation matrices, the matrix

(2.4.14)   \pi = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \end{pmatrix}

plays a fundamental role in the theory of circulants. It corresponds to the forward shift permutation \sigma(1) = 2, \sigma(2) = 3, ..., \sigma(n-1) = n, \sigma(n) = 1, that is, to the cycle \sigma = (1, 2, 3, ..., n) generating the cyclic group of order n (\pi is for "push"). One has

(2.4.15)   \pi^2 = P_{\sigma^2},

corresponding to \sigma^2, for which \sigma^2(1) = 3, \sigma^2(2) = 4, ..., \sigma^2(n) = 2. Similarly for \pi^k and \sigma^k. The matrix \pi^n corresponds to \sigma^n, the identity permutation, so that \pi^n = I.
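The relations (2.4.12)-(2.4.14) are easy to experiment with. The sketch below is my own (Python/NumPy, not from the text); it builds P_\sigma row by row from unit vectors as in (2.4.3), using 0-based indices:

    import numpy as np

    def perm_matrix(sigma):
        """P_sigma as in (2.4.3): row i is the unit vector with a 1 in column sigma(i)."""
        n = len(sigma)
        P = np.zeros((n, n))
        P[np.arange(n), sigma] = 1.0
        return P

    n = 5
    rng = np.random.default_rng(2)
    sigma = rng.permutation(n)
    P = perm_matrix(sigma)

    # (2.4.12): P_sigma is unitary (here real orthogonal).
    assert np.allclose(P @ P.T, np.eye(n))

    # (2.4.13): P A P* renumbers rows and columns of A consistently by sigma.
    A = rng.standard_normal((n, n))
    assert np.allclose(P @ A @ P.T, A[np.ix_(sigma, sigma)])

    # (2.4.14): the forward-shift ("push") matrix pi satisfies pi^n = I.
    pi = perm_matrix(np.roll(np.arange(n), -1))   # sigma(i) = i + 1 (mod n)
    assert np.allclose(np.linalg.matrix_power(pi, n), np.eye(n))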
2.5 THE FOURIER MATRIX

Let n be a fixed integer >= 1 and set

(2.5.1)   \omega = \exp\!\left(\frac{2\pi i}{n}\right) = \cos\frac{2\pi}{n} + i \sin\frac{2\pi}{n}.

In a good deal of what follows, \omega might be taken as any primitive nth root of unity, but we prefer to standardize the selection as in (2.5.1). Note that

(2.5.2)   (a) \omega^n = 1, (b) \omega\bar\omega = 1, (c) \bar\omega = \omega^{-1}, (d) \bar\omega^k = \omega^{-k} = \omega^{n-k}, (e) 1 + \omega + \omega^2 + \cdots + \omega^{n-1} = 0.

By the Fourier matrix of order n we shall mean the matrix F (= F_n) whose conjugate transpose is

(2.5.3)   F^* = \frac{1}{\sqrt{n}}\left(\omega^{(i-1)(j-1)}\right) = \frac{1}{\sqrt{n}} \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & \omega & \omega^2 & \cdots & \omega^{n-1} \\ 1 & \omega^2 & \omega^4 & \cdots & \omega^{2(n-1)} \\ \vdots & & & & \vdots \\ 1 & \omega^{n-1} & \omega^{2(n-1)} & \cdots & \omega^{(n-1)^2} \end{pmatrix}.

F is unitary and F^4 = I, so the eigenvalues of F lie among \pm 1, \pm i. The characteristic polynomial f(\lambda) of F is

n \equiv 1 (mod 4):   f(\lambda) = (\lambda - 1)(\lambda^4 - 1)^{(1/4)(n-1)},
n \equiv 2 (mod 4):   f(\lambda) = (\lambda - 1)(\lambda + 1)(\lambda^4 - 1)^{(1/4)(n-2)},
n \equiv 3 (mod 4):   f(\lambda) = (\lambda - 1)(\lambda + 1)(\lambda + i)(\lambda^4 - 1)^{(1/4)(n-3)}.

The Discrete Fourier Transform. Working with complex n-tuples, write Z = (z_1, z_2, ..., z_n)^T and \hat Z = (\hat z_1, \hat z_2, ..., \hat z_n)^T. The linear transformation

(2.5.8)   \hat Z = FZ,

where F is the Fourier matrix, is known as the discrete Fourier transform (DFT). Its inverse is given simply by

(2.5.9)   Z = F^*\hat Z.

The transform (2.5.8) often goes by the name of harmonic analysis or periodogram analysis, while the inverse transform (2.5.9) is called harmonic synthesis. The reasons behind these terms are as follows. Suppose that p(z) = a_0 + a_1 z + \cdots + a_{n-1}z^{n-1} is a polynomial of degree <= n - 1. It will be determined uniquely by specifying its values p(z_k) at n distinct points z_k, k = 1, 2, ..., n, in the complex plane. Select these points z_k as the n roots of unity 1, \omega, \omega^2, ..., \omega^{n-1}. Then clearly

(2.5.10)   (p(1), p(\omega), ..., p(\omega^{n-1}))^T = n^{1/2} F^* (a_0, a_1, ..., a_{n-1})^T,

so that

(2.5.11)   (a_0, a_1, ..., a_{n-1})^T = n^{-1/2} F (p(1), p(\omega), ..., p(\omega^{n-1}))^T.

The passage from functional values to coefficients through (2.5.11) or (2.5.8) is an analysis of the function, while in the passage from coefficient values to functional values through (2.5.10) or (2.5.9) the functional values are built up or "synthesized."

These formulas for interpolation at the roots of unity can be given another form. By a Vandermonde matrix V(z_0, z_1, ..., z_{n-1}) is meant a matrix of the form

(2.5.12)   V(z_0, z_1, ..., z_{n-1}) = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ z_0 & z_1 & \cdots & z_{n-1} \\ z_0^2 & z_1^2 & \cdots & z_{n-1}^2 \\ \vdots & & & \vdots \\ z_0^{n-1} & z_1^{n-1} & \cdots & z_{n-1}^{n-1} \end{pmatrix}.

From (2.5.4) one has, clearly,

(2.5.13)   V(1, \omega, \omega^2, ..., \omega^{n-1}) = n^{1/2}F^*,   \overline{V}(1, \omega, \omega^2, ..., \omega^{n-1}) = n^{1/2}F.

One now has from (2.5.11)

(2.5.14)   p(z) = (1, z, ..., z^{n-1})(a_0, a_1, ..., a_{n-1})^T
              = (1, z, ..., z^{n-1}) \, n^{-1/2} F \,(p(1), p(\omega), ..., p(\omega^{n-1}))^T
              = (1, z, ..., z^{n-1}) \, V^{-1}(1, \omega, \omega^2, ..., \omega^{n-1}) \,(p(1), p(\omega), ..., p(\omega^{n-1}))^T.

Note. In the literature of signal processing, a sequence-to-sequence transform is known as a discrete or digital filter. Very often the transform, such as (2.5.8), is linear and is called a linear filter.

Fourier Matrices as Kronecker Products. The Fourier matrices of orders 2^n may be expressed as Kronecker products. This factorization is a manifestation, essentially, of the idea known as the Fast Fourier Transform (FFT) and is of vital importance in real-time calculations.

Let F'_{2^n} designate the Fourier matrix of order 2^n whose rows have been permuted according to the bit-reversing permutation. Thus F'_2 = F_2, while F'_4 is F_4 with its second and third rows interchanged (bit reversal of the indices 0, 1, 2, 3 gives 0, 2, 1, 3). One has

(2.5.15)   F'_4 = (I_2 \otimes F_2) D_4 (F_2 \otimes I_2),

where D_4 = \mathrm{diag}(1, 1, 1, w), w = \exp(-2\pi i/4). This may be easily checked out. As is known, A \otimes B = P(B \otimes A)P^* for some permutation matrix P that depends merely on the dimensions of A and B. We may therefore write, for some permutation matrix S_4 (one has, in fact, S_4^{-1} = S_4),

(2.5.16)   F'_4 = (I_2 \otimes F_2) D_4 S_4 (I_2 \otimes F_2) S_4.

Similarly,

(2.5.17)   F'_{16} = (I_4 \otimes F'_4) D_{16} (F'_4 \otimes I_4),

where

(2.5.18)   D_{16} = \mathrm{diag}(I, D^2, D, D^3)

with

(2.5.19)   D = \mathrm{diag}(1, w, w^2, w^3),   w = \exp(-2\pi i/16).

Again, for an appropriate permutation matrix S_{16} = S_{16}^{-1},

(2.5.20)   F'_{16} = (I_4 \otimes F'_4) D_{16} S_{16} (I_4 \otimes F'_4) S_{16}.

For order 256 use

(2.5.21)   D_{256} = \mathrm{diag}(I, D^8, D^4, ..., D^{15}),

where the exponent sequence 0, 8, 4, ..., 15 is the bit-reversed order of 0, 1, ..., 15 and where

(2.5.22)   D = \mathrm{diag}(1, w, ..., w^{15}),   w = e^{-2\pi i/256}.
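The matrix F and the interpolation formulas (2.5.10)-(2.5.11) can be checked against a library FFT. A sketch of my own in Python/NumPy, assuming numpy.fft.fft computes \sum_k z_k e^{-2\pi i jk/n}, so that the unitary F of (2.5.3) is fft(z)/\sqrt{n}:

    import numpy as np

    n = 8
    w = np.exp(2j * np.pi / n)                      # as in (2.5.1)
    j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    Fstar = w ** (j * k) / np.sqrt(n)               # F* as in (2.5.3)
    F = Fstar.conj().T                              # the Fourier matrix F itself

    # F is unitary, and F Z agrees with the standard FFT up to the factor 1/sqrt(n).
    assert np.allclose(F @ F.conj().T, np.eye(n))
    z = np.random.default_rng(3).standard_normal(n) + 0j
    assert np.allclose(F @ z, np.fft.fft(z) / np.sqrt(n))

    # Interpolation at the roots of unity, as in (2.5.10)-(2.5.11):
    # recover the coefficients of p(z) = a_0 + ... + a_{n-1} z^{n-1} from its values.
    a = np.random.default_rng(4).standard_normal(n)
    values = np.array([np.polyval(a[::-1], w ** m) for m in range(n)])
    a_recovered = F @ values / np.sqrt(n)           # (2.5.11)
    assert np.allclose(a_recovered, a)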
PROBLEMS

1. Evaluate det F_n.
2. Find the polynomial p_{n-1}(z) of degree <= n - 1 that takes on the values 1/z at the nth roots of unity \omega^j, j = 1, 2, ..., n.
3. What is the limiting behavior of p_{n-1}(z) as n \to \infty?
4. Write F = R + iS, where R and S are real and i = \sqrt{-1}. Show that R and S are symmetric and that R^2 + S^2 = I, RS = SR. Exhibit R and S explicitly.

2.6 HADAMARD MATRICES

By a Hadamard matrix of order n, H (= H_n), is meant a matrix whose elements are either +1 or -1 and for which

(2.6.1)   HH^T = H^TH = nI.

Thus n^{-1/2}H is an orthogonal matrix.

Examples.

H_1 = (1),   H_2 = \sqrt{2}\,F_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},

and there is more than one Hadamard matrix of order 4; one of them is

H_2 \otimes H_2 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}.

It is known that if n > 3, the order of a Hadamard matrix must be a multiple of 4. It is conjectured, but not proved, that for every order that is a multiple of 4 there exists at least one Hadamard matrix.

Theorem 2.6.1. If A and B are Hadamard matrices of orders m and n respectively, then A \otimes B is a Hadamard matrix of order mn.

Proof. (A \otimes B)(A \otimes B)^T = (A \otimes B)(A^T \otimes B^T) = (AA^T) \otimes (BB^T) = (mI_m) \otimes (nI_n) = mnI_{mn}.

In some areas, particularly digital signal processing, the term Hadamard matrix is limited to the matrices of order 2^n given specifically by the recursion

(2.6.2)   H_1 = (1),   H_{2^{n+1}} = H_{2^n} \otimes H_2,   H_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.

These matrices have the additional property of being symmetric,

(2.6.3)   H_{2^n}^T = H_{2^n},

so that

(2.6.4)   H_{2^n}^2 = 2^n I.

The Walsh-Hadamard Transform. By this is meant the transform

(2.6.5)   \hat Z = HZ,

where H is a Hadamard matrix.

PROBLEMS

1. Hadamard parlor game: Write down in a row any four numbers. Then write the sum of the first two, the sum of the last two, the difference of the first two, and the difference of the last two to form a second row. Iterate this procedure four times. The final row will be four times the original row. Explain, making reference to H_4. Generalize.
2. Define a generalized permutation matrix P as follows: P is square, and every row and every column of P has exactly one nonzero element in it. That element is either +1 or -1. Show that if H is a Hadamard matrix and if P and Q are generalized permutation matrices, then PHQ is a Hadamard matrix.
3. With the notation of (2.6.2), prove that H_{2^{n+1}} = (H_{2^n} \otimes I_2)(I_{2^n} \otimes H_2).
4. Using Problem 3, show that the Hadamard transform of a vector by H_{2^n} can be carried out in n \cdot 2^n additions or subtractions.
5. If H is a Hadamard matrix of order n, prove that |det H| = n^{n/2}.

2.7 TRACE

The trace of a square matrix A = (a_{ij}) of order n is defined as the sum of its diagonal elements:

(2.7.1)   tr A = \sum_{j=1}^{n} a_{jj}.

The principal general properties of the trace are:

(1) tr(aA + bB) = a tr(A) + b tr(B).
(2) tr(AB) = tr(BA).
(3) tr A = tr(S^{-1}AS), S nonsingular.
(4) If \lambda_1, ..., \lambda_n are the eigenvalues of A, then tr A = \sum_{i=1}^{n}\lambda_i.
(5) More generally, if p designates a polynomial, then tr(p(A)) = \sum_{k=1}^{n} p(\lambda_k).
(6) tr(AA^*) = tr(A^*A) = \sum_{i,j}|a_{ij}|^2 = square of the Frobenius norm of A.
(7) tr(A \oplus B) = tr A + tr B.
(8) tr(A \otimes B) = (tr A)(tr B).
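A quick numerical check of properties (2), (4), (7), and (8), as a sketch of my own in Python/NumPy (not part of the text):

    import numpy as np

    rng = np.random.default_rng(5)
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((3, 3))

    # (2): tr(AB) = tr(BA); illustrated here with A and its transpose.
    assert np.isclose(np.trace(A @ A.T), np.trace(A.T @ A))

    # (4): tr A is the sum of the eigenvalues.
    assert np.isclose(np.trace(A), np.linalg.eigvals(A).sum().real)

    # (7) and (8): trace of a direct sum and of a Kronecker product.
    direct_sum = np.block([[A, np.zeros((4, 3))], [np.zeros((3, 4)), B]])
    assert np.isclose(np.trace(direct_sum), np.trace(A) + np.trace(B))
    assert np.isclose(np.trace(np.kron(A, B)), np.trace(A) * np.trace(B))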
2.8 GENERALIZED INVERSE

For large classes of matrices, such as square singular matrices and rectangular matrices, no inverse exists. That is, there are many matrices A for which there exists no matrix B such that AB = BA = I.

In discussing the solution of systems of linear equations, we know that if A is n x n and nonsingular, then the solution of the equation

AX = B,

where X and B are n x m matrices, can be written very neatly in matrix form as

X = A^{-1}B.

Although the "solution" given above is symbolic, and in general is not the most economical way of solving systems of linear equations, it has important applications. However, we have so far only been able to use this idea for square nonsingular matrices.

In this section we show that for every matrix A, whether square or rectangular, singular or nonsingular, there exists a unique "generalized inverse," often called the "Moore-Penrose" inverse of A, and that, employing it, the formal solution X = A^{-1}B can be given a useful interpretation. This generalized inverse has several of the important properties of the inverse of a square nonsingular matrix, and the resulting theory is able in a remarkable way to unify a variety of diverse topics. The theory originated in the 1920s, but was rediscovered in the 1950s and has been developed extensively since then.

2.8.1 Right and Left Inverses

Definition. If A is an m x n matrix, a right inverse of A is an n x m matrix B such that AB = I_m. Similarly, a left inverse is a matrix C such that CA = I_n.

Example. If

A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix},

a right inverse of A is the matrix

B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{pmatrix},

since AB = I_2. However, note that A does not have a left inverse, since for any matrix C, by the theorem on the rank of a product, r(CA) <= r(A) = 2, so that CA \neq I_3. Similarly, although A is, by definition, a left inverse of B, there exists no right inverse of B.

The following theorem gives necessary and sufficient conditions for the existence of a right or left inverse.

Theorem 2.8.1.1. An m x n matrix A has a right (left) inverse if and only if A has rank m (n).

Proof. We work first with right inverses. Assume that AB = I_m. Then m = r(I_m) <= r(A) <= m. Hence r(A) = m. Conversely, suppose that r(A) = m. Then A has m linearly independent columns, and we can find a permutation matrix P so that the matrix \bar A = AP has its first m columns linearly independent. Now, if we can find a matrix \bar B such that \bar A\bar B = AP\bar B = I, then B = P\bar B is clearly a right inverse for A. Therefore we may assume, without loss of generality, that A has its first m columns linearly independent. Hence A can be written in the block form A = (A_1, A_2), where A_1 is an m x m nonsingular matrix and A_2 is some m x (n - m) matrix. Now let Q = A_1^{-1}A_2, so that A = A_1(I_m, Q). Seek B partitioned as

B = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix},

where B_1 is m x m and B_2 is (n - m) x m. Then AB = A_1B_1 + A_1QB_2 = I if and only if B_1 + QB_2 = A_1^{-1}, or if and only if B_1 = A_1^{-1} - QB_2. Therefore we have

B = \begin{pmatrix} A_1^{-1} - QB_2 \\ B_2 \end{pmatrix}

for an arbitrary (n - m) x m matrix B_2. Thus there is a right inverse, and if n > m, it is not unique.

We now prove the theorem for a left inverse. Suppose, again, that A is m x n and r(A) = n. Then A^T is n x m and r(A^T) = n. By the first part, A^T has a right inverse: A^TB = I. Hence B^TA = I and A has a left inverse.

Corollary. If A is n x n of rank n, then A has both a right and a left inverse, and they are the same.

Proof. The existence of a right and a left inverse for A follows immediately from the theorem. To prove that they are the same, assume AB = I, CA = I. Then C(AB) = CI = C. But also C(AB) = (CA)B = IB = B, so that B = C. This is the matrix that is defined to be the inverse of A, denoted by A^{-1}.
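As a concrete illustration (my own sketch in Python/NumPy, not the book's), the construction in the proof can be compared with the particular right inverse A^T(AA^T)^{-1} that reappears in the problems below and in Theorem 2.8.2.1; any choice of the free block B_2 works.

    import numpy as np

    rng = np.random.default_rng(6)
    m, n = 3, 5
    A = rng.standard_normal((m, n))          # full row rank with probability 1

    # One standard right inverse for the full row rank case: B = A^T (A A^T)^{-1}.
    B = A.T @ np.linalg.inv(A @ A.T)
    assert np.allclose(A @ B, np.eye(m))

    # The construction in the proof: A = (A1, A2), B1 = A1^{-1} - Q B2, B2 arbitrary.
    # (Assumes the first m columns of A are independent, true almost surely here.)
    A1, A2 = A[:, :m], A[:, m:]
    Q = np.linalg.inv(A1) @ A2
    B2 = rng.standard_normal((n - m, m))     # free block: right inverses are not unique
    B_proof = np.vstack([np.linalg.inv(A1) - Q @ B2, B2])
    assert np.allclose(A @ B_proof, np.eye(m))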
PROBLEMS

1. Find a left inverse for ... . Find all the left inverses.
2. Does \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} have a left inverse?
3. Let A be m x n and have a left inverse B. Suppose that the system of linear equations AX = C has a solution. Prove that the solution is unique and is given by X = BC.
4. Let B be a left inverse for A. Prove that ABA = A and BAB = B.
5. Let A be m x n and have rank n. Prove that A^TA is nonsingular and that (A^TA)^{-1}A^T is a left inverse for A.
6. Let A be m x n and have rank n. Let W be m x m positive definite symmetric. Prove that A^TWA is nonsingular and that (A^TWA)^{-1}A^TW is a left inverse for A.

2.8.2 Generalized Inverses

Definition. Let A be an m x n matrix. Then an n x m matrix X that satisfies any or all of the following properties is called a generalized inverse:

(1) AXA = A,
(2) XAX = X,
(3) (AX)^* = AX,
(4) (XA)^* = XA.

Here the star * represents the conjugate transpose. A matrix satisfying all four of the properties above is called a Moore-Penrose inverse of A (for short: an M-P inverse).

We show now that every matrix A has a unique M-P inverse. It is denoted by A^+. It should be remarked that the M-P inverse is often designated by other symbols, such as A^\dagger. The notation A^+ is used here because (a) it is highly suggestive and (b) it comes close to one used in the APL computer language.

We first prove the following lemma on "rank factorization" of a matrix.

Lemma. If A is an m x n matrix of rank r, then A = BC, where B is m x r, C is r x n, and r(B) = r(C) = r.

Proof. Since the rank of A is r, A has r linearly independent columns. We may assume, without loss of generality, that these are the first r columns of A, for, if not, there exists a permutation matrix P such that the first r columns of the matrix AP are r linearly independent columns of A. But if AP can be factored as AP = B\bar C with r(B) = r(\bar C) = r, then A = BC, where C = \bar C P^{-1} and r(C) = r(\bar C) = r, since P is nonsingular. Thus, if we let B be the m x r matrix consisting of the first r columns of A, the remaining n - r columns are linear combinations of the columns of B, say of the form BQ^{(i)} for r x 1 vectors Q^{(i)}. Then, if we let Q be the r x (n - r) matrix Q = (Q^{(1)}, ..., Q^{(n-r)}), we have

A = (B, BQ)

(the blocks having r and n - r columns respectively). If we let C = (I_r, Q), we have A = B(I_r, Q) = BC and r(B) = r(C) = r.

We next show the existence of an M-P inverse in the case where A has full row or full column rank.

Theorem 2.8.2.1.
(a) If A is square and nonsingular, set A^+ = A^{-1}.
(b) If A is n x 1 (or 1 x n) and A \neq 0, set A^+ = (A^*A)^{-1}A^* (or A^+ = A^*(AA^*)^{-1}).
(c) If A is m x n and r(A) = m, set A^+ = A^*(AA^*)^{-1}; if r(A) = n, set A^+ = (A^*A)^{-1}A^*.

Then A^+ is an M-P inverse for A. Moreover, in the case of full row rank it is a right inverse; in the case of full column rank it is a left inverse. Note that (a) and (b) are really special cases of (c).

Proof. Direct calculation. Observe that if A is m x n and r(A) = m, then AA^* is m x m. It is well known that r(AA^*) = m, so that (AA^*)^{-1} can be formed. Similarly for A^*A.

We can now show the existence of an M-P inverse for any m x n matrix A. If A = 0, set A^+ = 0_{n,m}. This is readily verified to satisfy requirements (1), (2), (3), and (4) for a generalized inverse. If A \neq 0, factor A as in the lemma into the product A = BC, where B is m x r, C is r x n, and r(B) = r(C) = r. Now B has full column rank while C has full row rank, so that B^+ and C^+ may be found as in the previous theorem. Now set

A^+ = C^+B^+.

Theorem 2.8.2.2. Let A^+ be defined as above. Then it is an M-P inverse for A.

Proof. It is easier to verify properties (3) and (4) first. They will then be used in proving properties (1) and (2).

(3) AA^+ = BC(C^+B^+) = B(CC^+)B^+ = BIB^+ = BB^+, and since BB^+ = (BB^+)^*, we have AA^+ = (AA^+)^*.

(4) Similarly, A^+A = C^+(B^+B)C = C^+C = (C^+C)^* = (A^+A)^*.

(1) (AA^+)A = (BB^+)BC = B(B^+B)C = BC = A.

(2) (A^+A)A^+ = (C^+C)C^+B^+ = C^+B^+ = A^+.
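Theorem 2.8.2.2 invites a small computation. The sketch below is my own (Python/NumPy; the book does not use it). It builds a rank factorization A = BC from the singular value decomposition rather than by column selection, forms A^+ = C^+B^+ with the full-rank formulas of Theorem 2.8.2.1, and checks the four defining properties against numpy.linalg.pinv.

    import numpy as np

    rng = np.random.default_rng(12)
    # A rank-2 matrix: the product of a 4 x 2 and a 2 x 5 factor.
    A = rng.standard_normal((4, 2)) @ rng.standard_normal((2, 5))

    # A rank factorization A = B C (here obtained from the SVD; any one will do).
    u, s, vh = np.linalg.svd(A)
    r = int(np.sum(s > 1e-12 * s[0]))
    B = u[:, :r] * s[:r]              # m x r, full column rank
    C = vh[:r, :]                     # r x n, full row rank
    assert np.allclose(B @ C, A)

    # Full-rank formulas of Theorem 2.8.2.1, then A+ = C+ B+.
    Bplus = np.linalg.inv(B.T @ B) @ B.T
    Cplus = C.T @ np.linalg.inv(C @ C.T)
    Aplus = Cplus @ Bplus

    # The four Moore-Penrose properties, and agreement with the library routine.
    assert np.allclose(A @ Aplus @ A, A) and np.allclose(Aplus @ A @ Aplus, Aplus)
    assert np.allclose((A @ Aplus).T, A @ Aplus) and np.allclose((Aplus @ A).T, Aplus @ A)
    assert np.allclose(Aplus, np.linalg.pinv(A))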
Now we prove that the M-P inverse is unique.

Theorem 2.8.2.3. Given an m x n matrix A, there is only one matrix A^+ that satisfies all four properties for the Moore-Penrose inverse.

Proof. Suppose that there exist matrices B and C satisfying

(1) ABA = A, ACA = A;  (2) BAB = B, CAC = C;  (3) (AB)^* = AB, (AC)^* = AC;  (4) (BA)^* = BA, (CA)^* = CA.

Then

B = BAB [by (2)] = (BA)^*B [by (4)] = A^*B^*B = (ACA)^*B^*B [by (1)] = A^*C^*A^*B^*B
  = (CA)(BA)B [by (4)] = CA(BAB) = CAB [by (2)],

and

C = CAC [by (2)] = C(AC)^* [by (3)] = CC^*A^* = CC^*(ABA)^* [by (1)] = CC^*A^*B^*A^*
  = C(AC)(AB) [by (3)] = (CAC)(AB) = CAB [by (2)].

Therefore B = CAB = C. The bracketed numbers indicate the defining properties used at each step.

Penrose has given the following recursive method for computing A^+, which is included in case the reader would like to write a computer program.

Theorem 2.8.2.4 (the Penrose algorithm). Let A be m x n and have rank r > 0.

(a) Set B = A^*A (B is n x n).
(b) Set C_1 = I (C_1 is n x n).
(c) Set recursively, for i = 1, 2, ..., r - 1,

    C_{i+1} = \frac{1}{i}\,tr(C_iB)\,I - C_iB.

(d) Then

    A^+ = r\,[tr(C_rB)]^{-1}\,C_rA^*.
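Since the theorem is offered as a basis for a computer program, here is a direct transcription in Python/NumPy (my own sketch, not the book's code); it follows steps (a)-(d) literally and checks the result against numpy.linalg.pinv.

    import numpy as np

    def penrose_pinv(A):
        """Moore-Penrose inverse by the Penrose algorithm (Theorem 2.8.2.4)."""
        A = np.asarray(A, dtype=complex)
        r = np.linalg.matrix_rank(A)
        if r == 0:
            return A.conj().T * 0                    # A = 0: the M-P inverse is the zero matrix
        B = A.conj().T @ A                           # (a) B = A*A, n x n
        C = np.eye(A.shape[1], dtype=complex)        # (b) C_1 = I
        for i in range(1, r):                        # (c) C_{i+1} = (1/i) tr(C_i B) I - C_i B
            C = (np.trace(C @ B) / i) * np.eye(A.shape[1]) - C @ B
        return r / np.trace(C @ B) * C @ A.conj().T  # (d)

    A = np.array([[1., 2., 3.],
                  [2., 4., 6.],                      # rank-deficient on purpose (rank 2)
                  [1., 0., 1.]])
    assert np.allclose(penrose_pinv(A), np.linalg.pinv(A))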
2.8.3 The UDV Theorem

Let A be m x n; then, as is well known, AA^* is positive semidefinite Hermitian of order m, and r(AA^*) = r(A) = r(A^*). Hence the eigenvalues of AA^* are real and nonnegative. Write them as d_1^2, d_2^2, ..., d_r^2, 0, 0, ..., 0, where the d_i are positive and where there are m - r zeros in the list. The numbers d_1, d_2, ..., d_r are known as the singular values of A.

Theorem 2.8.3.2 (the UDV theorem). Let A be m x n of rank r. Then A can be written as

(2.8.3.1)   A = U^*DV^*,

where U and V are unitary matrices of orders m and n respectively and where D is the m x n matrix

D = \begin{pmatrix} D_1 & 0 \\ 0 & 0 \end{pmatrix},   D_1 = \mathrm{diag}(d_1, d_2, ..., d_r),

D_1 being a nonsingular diagonal matrix of order r. Note that the representation (2.8.3.1) can be written as UAV = D or, changing the notation, U^*AV = D, and so on (since U and V are unitary).

Proof. Define D_1 = \mathrm{diag}(d_1, d_2, ..., d_r). Let U_1 be m x r and consist of the (orthonormal) eigenvectors of AA^* corresponding to the eigenvalues d_1^2, ..., d_r^2. We have AA^*U_1 = U_1D_1^2 and U_1^*U_1 = I_r (cf. Theorems 2.9.3 and 2.9.9). Let U_2 be the m x (m - r) matrix whose columns consist of an orthonormal basis for the null space of A^*. Write U = (U_1, U_2) (block notation). Then A^*U_2 = 0 and U_2^*U_2 = I_{m-r}, and

U^*U = \begin{pmatrix} U_1^*U_1 & U_1^*U_2 \\ U_2^*U_1 & U_2^*U_2 \end{pmatrix}.

Now, since AA^*U_1 = U_1D_1^2, we have U_2^*AA^*U_1 = U_2^*U_1D_1^2. But A^*U_2 = 0, so that U_2^*A = 0; hence U_2^*U_1D_1^2 = 0. Since D_1 is nonsingular, it follows that U_2^*U_1 = (U_1^*U_2)^* = 0. This means that

U^*U = \begin{pmatrix} I_r & 0 \\ 0 & I_{m-r} \end{pmatrix} = I_m,

and hence that U is unitary. Let V_1 be the n x r matrix defined by V_1 = A^*U_1D_1^{-1}. Let V_2 be the n x (n - r) matrix whose columns are a set of n - r orthonormal vectors for the null space of A. Thus AV_2 = 0 and V_2^*V_2 = I_{n-r}. Now define V as the n x n matrix V = (V_1, V_2). Then

V_1^*V_1 = (D_1^{-1}U_1^*A)(A^*U_1D_1^{-1}) = D_1^{-1}U_1^*U_1D_1^2D_1^{-1} = D_1^{-1}D_1 = I_r,

and V_2^*V_1 = V_2^*A^*U_1D_1^{-1} = (AV_2)^*U_1D_1^{-1} = 0. It follows that V is unitary. Finally,

U^*AV = \begin{pmatrix} U_1^* \\ U_2^* \end{pmatrix} A (V_1, V_2) = \begin{pmatrix} U_1^*AV_1 & U_1^*AV_2 \\ U_2^*AV_1 & U_2^*AV_2 \end{pmatrix} = \begin{pmatrix} D_1 & 0 \\ 0 & 0 \end{pmatrix} = D,

since U_1^*AV_1 = U_1^*AA^*U_1D_1^{-1} = D_1, U_2^*A = 0, and AV_2 = 0.

Using the UDV theorem, we can produce a very convenient formula for A^+.

Theorem. If A = U^*DV^*, where U, V, D are as in Theorem 2.8.3.2 above, then

A^+ = VD^+U,   where   D^+ = \begin{pmatrix} D_1^{-1} & 0 \\ 0 & 0 \end{pmatrix}   (an n x m matrix).

Proof. By a direct computation,

A(VD^+U) = (U^*DV^*)(VD^+U) = U^*DD^+U = U^*\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}U

and

(VD^+U)A = V\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}V^*,

so the third and fourth properties for the generalized inverse are satisfied. Also,

A(VD^+U)A = (U^*DV^*)(VD^+U)(U^*DV^*) = U^*DD^+DV^* = U^*DV^* = A.

Similarly (VD^+U)A(VD^+U) = VD^+U, proving the first two properties.

Theorem 2.8.3.3. For each A there exist polynomials p and q such that

A^+ = A^*p(AA^*),   A^+ = q(A^*A)A^*.

Proof. Let A be m x n and have rank r. Then, by the diagonal decomposition theorem, there exist unitary matrices U, V of orders m and n and an m x n matrix

D = \begin{pmatrix} D_1 & 0 \\ 0 & 0 \end{pmatrix},   D_1 = \mathrm{diag}(d_1, d_2, ..., d_r),   d_1d_2\cdots d_r \neq 0,

such that A = U^*DV^*. Then A^* = VD^*U, AA^* = U^*DD^*U, and A^+ = VD^+U. For an arbitrary polynomial p(z),

p(AA^*) = p(U^*(DD^*)U) = U^*p(DD^*)U.

Hence A^*p(AA^*) = VD^*p(DD^*)U. Therefore, for A^+ to equal A^*p(AA^*), it is necessary and sufficient that D^+ = D^*p(DD^*); equivalently,

d_k^{-1} = \bar d_k\,p(|d_k|^2),   k = 1, 2, ..., r.

Thus p(|d_k|^2) = 1/|d_k|^2, k = 1, 2, ..., r, is necessary and sufficient. Let s designate the number of distinct values among |d_1|, |d_2|, ..., |d_r|. Then, by the fundamental theorem of polynomial interpolation (see any book on interpolation, approximation, or numerical analysis), there is a unique polynomial of degree <= s - 1 that takes on the values 1/|d_k|^2 at the s points |d_k|^2. The second identity for A^+ is proved similarly.

PROBLEMS

1. Let U and V be unitary. Prove that (UAV)^+ = V^*A^+U^*.
2. Let A be normal. Give a representation for A^+ in terms of the characteristic values of A. See Section 2.9.
3. Prove that if A is normal, AA^+ = A^+A.
4. Prove that A^+ = A^* if and only if the singular values of A are 0 or 1.
5. Prove that A^+ = \lim_{t \to 0} A^*(tI + AA^*)^{-1}.
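The UDV representation and the formula A^+ = VD^+U derived above can be reproduced with NumPy's SVD routine. In this sketch of mine, numpy.linalg.svd is assumed to return A = u diag(s) vh with u, vh unitary; the code translates that convention into the book's U, D, V.

    import numpy as np

    rng = np.random.default_rng(8)
    A = rng.standard_normal((4, 6))
    m, n = A.shape

    u, s, vh = np.linalg.svd(A)
    r = int(np.sum(s > 1e-12 * s[0]))        # numerical rank

    # In the book's notation A = U* D V*, so take U = u*, V = vh*.
    U = u.conj().T
    V = vh.conj().T
    D = np.zeros((m, n)); D[:r, :r] = np.diag(s[:r])
    assert np.allclose(U.conj().T @ D @ V.conj().T, A)

    # A+ = V D+ U, with D1^{-1} in the leading block (the formula following Theorem 2.8.3.2).
    Dplus = np.zeros((n, m)); Dplus[:r, :r] = np.diag(1.0 / s[:r])
    assert np.allclose(V @ Dplus @ U, np.linalg.pinv(A))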
2.8.4 Generalized Inverses and Solutions of Linear Equations

Using the properties of the generalized inverse, we are able to determine, for any system of equations AX = B, whether or not the system has a solution. If it does, we can obtain a matrix equation, involving the generalized inverse, which exhibits this solution. Oddly enough, we need only the first property of a generalized inverse. That is, we may use any matrix A^{(1)} such that AA^{(1)}A = A.

Definition. If A is m x n, any n x m matrix A^{(1)} that satisfies AA^{(1)}A = A is called a (1)-inverse of A. More generally, any matrix that satisfies some combination of the four requirements for the generalized inverse listed above is labeled by the corresponding numbers.

Example. A (1, 2, 4)-inverse for A is one that satisfies conditions (1), (2), and (4).

Theorem 2.8.4.1. Let A be m x n. The system of equations

AX = B

has a solution if and only if B = AA^{(1)}B for any (1)-inverse A^{(1)} of A. In this case, the general solution is given by

X = A^{(1)}B + (I - A^{(1)}A)Y

for an arbitrary n x 1 vector Y.

Proof. Let B = AA^{(1)}B. Then AX = AA^{(1)}B is solved by X = A^{(1)}B. Suppose, conversely, that the system has a solution X_0: AX_0 = B. Then, for any (1)-inverse,

AA^{(1)}B = AA^{(1)}AX_0 = (AA^{(1)}A)X_0 = AX_0 = B.

Moreover, if X = A^{(1)}B + (I - A^{(1)}A)Y, then, with B = AA^{(1)}B,

AX = AA^{(1)}B + A(I - A^{(1)}A)Y = B + (A - AA^{(1)}A)Y = B + 0 = B.

Therefore any such X is a solution. To show that it is the general solution, we must show that if AX_0 = B, then X_0 = A^{(1)}B + (I - A^{(1)}A)Y for some Y. Let R = X_0 - A^{(1)}B. Then

AR = AX_0 - AA^{(1)}B = B - B = 0;

therefore R = R - A^{(1)}AR. Hence X_0 = A^{(1)}B + (I - A^{(1)}A)R, which is of the required form with Y = R.

In the numerical utilization of this theorem one should, of course, use some standard (1)-inverse of A such as A^+.

PROBLEMS

1. Show that if A is an m x n matrix and B is any (1)-inverse of A, then AB and BA are idempotent of orders m and n respectively, and BAB is a (1,2)-inverse of A.
2. Show that if A is m x n (n x m), of rank m, then any (1)-inverse of A is a right (left) inverse of A, and any right (left) inverse of A is a (1,2,3)- [(1,2,4)-] inverse of A.
3. Consider two systems of equations: (1) AX = B, (2) CX = D. Find conditions such that every solution of (1) is a solution of (2).
4. What happens in Problem 3 if B = D = 0?
5. Prove that the matrix equation AXB = C has a solution if and only if AA^+CB^+B = C. In this case, the general solution is given by

   X = A^+CB^+ + Y - A^+AYBB^+

   for an arbitrary Y.

2.8.5 The M-P Inverse and Least Squares Problems

Let A be m x n, let X be n x 1 and B be m x 1, and consider the system of equations

AX = B.

If the vector B lies in the range of A, then there exist one or more solutions to this system. If the solution is not unique, we might want to know which solution has minimum norm. If the vector B is not in the range of A, then there is no solution to the system, but it is often desirable to find a vector X in some way closest to a solution. To this end, for any X, define the residual vector R = AX - B and consider its Euclidean norm ||R|| = \sqrt{R^*R}. A least squares solution to the system is a vector X_0 such that its residual has minimum norm; that is,

||R_0|| = ||AX_0 - B|| <= ||AX - B||   for all n x 1 vectors X.

Theorem 2.8.5.1. The system of equations AX = B always has a least squares solution. This solution is unique if and only if the columns of A are linearly independent. In this case, the unique least squares solution is given by X = A^+B.

Proof. Let R(A) designate the range space of A, and let [R(A)]^\perp designate its orthogonal complement. Then we can write B = B_1 + B_2, where B_1 is in R(A) and B_2 is in [R(A)]^\perp. For any X, AX is in R(A), as is AX - B_1; hence AX - B_1 is orthogonal to B_2. Now AX - B = (AX - B_1) - B_2. Hence, for any X,

||AX - B||^2 = ||AX - B_1||^2 + ||B_2||^2 >= ||B_2||^2.

Therefore ||B_2|| is a lower bound for the values ||AX - B||, and it is achieved if and only if AX = B_1. Since B_1 is in R(A), there is a solution X_0 to AX = B_1. For this vector X_0,

||R_0||^2 = ||AX_0 - B||^2 = ||B_2||^2 <= ||AX - B||^2,

so that the lower bound is achieved. Since a unique solution to AX = B_1 exists if and only if the columns of A are linearly independent, the theorem is proved.

For any solution X_0 to AX = B_1, R_0 = AX_0 - B = B_1 - (B_1 + B_2) = -B_2 is in [R(A)]^\perp. Therefore A^*R_0 = 0, or A^*(AX_0 - B) = 0, or

A^*AX_0 = A^*B.

These are the normal equations determining the least squares solution. If the columns of A are independent, then r(A^*A) = r(A) = n, so that the n x n matrix A^*A is nonsingular. The least squares solution X_0 is determined by A^*AX_0 = A^*B, so that

X_0 = (A^*A)^{-1}A^*B.

But, from our previous work, A^+ = (A^*A)^{-1}A^*.
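A short numerical illustration of the full-column-rank case (my own sketch in Python/NumPy): the normal-equation solution, A^+B, and the library least squares routine all agree.

    import numpy as np

    rng = np.random.default_rng(9)
    A = rng.standard_normal((8, 3))          # tall, full column rank with probability 1
    B = rng.standard_normal(8)               # generally not in the range of A

    x_normal = np.linalg.solve(A.T @ A, A.T @ B)       # normal equations A*A X = A*B
    x_pinv = np.linalg.pinv(A) @ B                      # X = A+ B
    x_lstsq = np.linalg.lstsq(A, B, rcond=None)[0]      # library least squares

    assert np.allclose(x_normal, x_pinv) and np.allclose(x_pinv, x_lstsq)

    # The residual A X - B is orthogonal to the range of A (the normal equations).
    assert np.allclose(A.T @ (A @ x_pinv - B), 0, atol=1e-10)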
The equality holds here if and only if A(X - XQ) = 0 Hence if ax f axq< inequality (1) holds. Suppose, then, that AX = AX A'AA'B = A'B = X, 0" Therefore, X Then A'AX = A AX 0 x0 + (X v - A'B + (I - A'A)X. Hence by inequality (2) of the lemma, so that and only if X = X. . This theorem may be rephrased as follows. Given the system AX = B. Then the vector A'B is either the unigue least squares solution or it is the least squares solution of minimum norm. PROBLEM 1. A is square and singular. Characterize the solution A'B. 2.9 NORMAL MATRICES, QUADRATIC FORMS, AND FIELD OF VALUES We record here a number of important facts. By a normal matrix is meant a square matrix A for which (2.9.1) AA* = A*A. Examples. Hermitian, skew-Hermitian, and unitary matrices are normal. Hence real symmetric"; sKew-symmetric, and ortnbgonal matrices are also normal. All circulants are"'no7maTT~a*s~w^"3Tta~rT^ Theorem 2.9.1. a is normal if and only if there is a unitary u""and diagonal D such that A = (J*DU. Theorem 2.9.2. a is normal if and only if there is a polynomial p (x) such that' A™*" = "pTSTT""""" "~~*" Theorem 2.9.3. A is Hermitian if and only if there is a unitary''matrix U""an3 "*ai~real '"dfaTgbnaT D such" that" a~~=tt*du~---*-*—---" Theorem 2.9.4. A is (real) symmetric if and only if 'there is a (real) ortnogonal matrix u ana a reaT ' diagonal D such that a = U*D"tn ~ 60 Introductory Matrix Material Normal Matrices 61 PROBLEMS 1. Prove that A is normal if and only if A = R + is where R and S are real symmetric and commute. 2. Prove that A is normal if and only if in the polar decomposition of A (A = HU with H positive semi-definite Hermitian, U unitary) one has HU = UH. 3. Let A have eigenvalues X^, An- Prove that A is normal if and only if the eigenvalues of AA* are |X1|2, |X2|2, |Xj2. 4. Prove that A is normal if and only if the eigenvalues of A + A* are X^ + X^, + A,,, ...f X + X . n n 5. If A is normal and p(z) is a polynomial, then p (A) is normal. 6. If A is normal, prove that A" is normal. 7. If A and B are normal, prove that A 8 B is normal. 8. Use Theorem 2.9.1 to prove Theorem 2.9.2. Quadratic Forms. Let m be n x n and let Z = [z , z.,, zn)T- By a quadratic form is meant the function of z , ... zn given by (2.9.2) |m(Z) = Z*MZ. ' It is often of importance to distinguish the quadratic form from a matrix that gives rise to it. The real and the complex cases are essentially different . Lemma 2.9.5. Let Q be real and square anc U a real colunin^^ -QT, that is, if and only if Q is sikew-symmet£ic. Proof. (a) Let Q = ~QT. If a = UTQU, aT = a = UTQTU = UT(-Q)U = -a. Therefore a = 0. (b) Let UTQU = 0 for all (real) U. Write Q = ^1 + ^2 w^ere ^1 symmetric and Q2 is skew-symmetric. Then, for all U UTQU = VTQ1U + UTQ2U = UTQ]_U = 0. Since is symmetric, we have for some orthogonal P and real diagonal matrix A: = PTAP. Therefore for all real ux u5ptAPU = (PU)TA(PU). Write PU = (Uj,, un), A = diag(Xir A ) . Then we have ^k=l Ak(%^ = 0 for a11 (ul' un)f hence for a11 (u^, uR). This clearly implies X^ = 0, for k = 1, 2, n. Hence = 0 and Q = Q2 = skew-symmetric. Theorem 2.9.6. Let Q and R be real square and u be a real column. Then UTQU a UTRU for all U if and only if Q - R is skew-symmetric■ Proof. UTQU = UTRU if and only if l)T(Q - R]o = 0. Corollary. Let Q be real and U be a real column. Then ' "-"*-!—~- T (2.9.3) uTQU = UT(-° t Q )U. 1 t The matrix T(Q + Q ) is known as the symmetriza- tion of Q. * - We pass now to the complex case. Lemma 2.9.7. 
We pass now to the complex case.

Lemma 2.9.7. Let M be a square matrix with complex elements and let Z be a complex column vector. Then Z^*MZ = 0 for all complex Z if and only if M = 0.

Proof. (a) The "if" part is trivial.

(b) "Only if." Write Z = X + iY, M = R + iS, where X, Y, R, S are all real. Then we are given

(2.9.4)   (X^T - iY^T)(R + iS)(X + iY) = 0   for all real X, Y.

Select Y = 0. Then X^T(R + iS)X = 0 for all real X, or X^TRX = 0 and X^TSX = 0. Therefore, by the first lemma, R and S must be skew-symmetric: R + R^T = 0, S + S^T = 0. Expanding the product on the left side of (2.9.4), we obtain

X^TRX + iX^TRY + iX^TSX - X^TSY - iY^TRX + Y^TRY + Y^TSX + iY^TSY = 0.

In view of the skew-symmetry of R and S and the first lemma, we have X^TRX = X^TSX = Y^TRY = Y^TSY = 0. Therefore we have, for all real X, Y,

(Y^TSX - X^TSY) + i(X^TRY - Y^TRX) = 0,

or

Y^TSX = X^TSY = Y^TS^TX   and   X^TRY = Y^TRX = X^TR^TY.

Thus, for all real X, Y, X^T(R - R^T)Y = 0 and Y^T(S - S^T)X = 0. Selecting X and Y as appropriate unit vectors (0, ..., 0, 1, 0, ..., 0), this tells us that R - R^T = 0 and S - S^T = 0. But R^T = -R and S^T = -S; therefore R = S = 0 and M = 0.

Theorem 2.9.8. Let M and N be square matrices of order n with complex elements and suppose that

(2.9.5)   Z^*MZ = Z^*NZ   for all complex vectors Z.

Then M = N.

Proof. As before, Z^*MZ = Z^*NZ if and only if Z^*(M - N)Z = 0.

Note that this theorem is false if (2.9.5) holds only for real Z.

Corollary. Z^*MZ is real for all complex Z if and only if M is Hermitian.

Proof. Z^*MZ is real if and only if Z^*MZ = (Z^*MZ)^* = Z^*M^*Z. Hence M = M^*.

Let M be a Hermitian matrix. It is called positive definite if Z^*MZ > 0 for all Z \neq 0. It is called positive semidefinite if Z^*MZ >= 0 for all Z. It is called indefinite if there exist Z_1 \neq 0 and Z_2 \neq 0 such that Z_1^*MZ_1 > 0 > Z_2^*MZ_2.

Theorem 2.9.9. Let M be a Hermitian matrix of order n with eigenvalues \lambda_1, ..., \lambda_n. Then

(a) M is positive definite if and only if \lambda_k > 0, k = 1, 2, ..., n.
(b) M is positive semidefinite if and only if \lambda_k >= 0, k = 1, 2, ..., n.
(c) M is indefinite if and only if there are integers j, k, j \neq k, with \lambda_j > 0 and \lambda_k < 0.

Field of Values. Let M designate a matrix of order n. The set of all complex numbers Z^*MZ with ||Z|| = 1 is known as the field of values of M and is designated by \mathcal{F}(M). Here ||Z|| designates the Euclidean norm of Z. The following facts, due to Hausdorff and Toeplitz, are known.

(1) \mathcal{F}(M) is a closed, bounded, connected, convex subset of the complex plane.
(2) The field of values is invariant under unitary transformations:

(2.9.6)   \mathcal{F}(M) = \mathcal{F}(U^*MU),   U unitary.

(3) If ch M designates the convex hull of the eigenvalues of M, then

(2.9.7)   ch M \subseteq \mathcal{F}(M).

(4) If M is normal, then \mathcal{F}(M) = ch M.
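The facts above can be visualized by sampling random unit vectors. The sketch below (my own, in Python/NumPy) treats the Hermitian case, where M is normal with real eigenvalues, so that by fact (4) its field of values is the real segment from the smallest to the largest eigenvalue.

    import numpy as np

    rng = np.random.default_rng(11)

    B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
    M = (B + B.conj().T) / 2                      # Hermitian, hence normal
    lam = np.linalg.eigvalsh(M)                   # real eigenvalues, ascending

    for _ in range(2000):
        Z = rng.standard_normal(5) + 1j * rng.standard_normal(5)
        Z = Z / np.linalg.norm(Z)                 # ||Z|| = 1
        f = Z.conj() @ M @ Z                      # a point of the field of values
        assert abs(f.imag) < 1e-10                # Z*MZ is real (corollary to Theorem 2.9.8)
        assert lam[0] - 1e-10 <= f.real <= lam[-1] + 1e-10   # it lies in ch M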
(b1) (b") x1 * 2' 5s(M) is the line segment joining A^ and ^"(M) is the single point A^. REFERENCES General: Aitken, [1]; Barnett and Story; Bellman, [2]; Browne; Eisele and Mason; Forsythe and Moler; Gant-macher; Lancaster, [1]; MacDuffee; Marcus; Marcus and Mine; Muir and Metzler; Newman; M. Pearl; Pullman; Suprunenko and Tyshkevich; Todd; Turnbull and Aitken. Vandermonde matrices: Gautschi. Discrete Fourier transforms: Aho, Hopcroft and Ullman; Carlitz; Davis and Rabinowitz? Fiduccia; Flinn and McCowan; Harmuth; Nussbaumer; Winograd; J. Pearl., **Any square matrix is unitarily similar to an upper triangular matrix.