2
[ In/'/p ö 0<<1 i i
C 11 <'1 • /ot >■-/ I 'c •/1 / ct r 1W ed'J-
INTRODUCTORY MATRIX MATERIAL
2.1] BLOCK OPERATIONS ^/_________._———
It is very often convenient in both theoretical and computer work to partition a matrix into submatrices, This can be done in numerous ways as suggested by this example:
2 !
I
I    3 I
10   I 11 I
14   j 15
Each submatrix or block can be labeled by subscripts, and we can display the original matrix with submatrices or blocks for its elements.    The general form of a partitioned matrix therefore is
(2.1.1)      A =
11
kl
12
A
k2
11
kS,
Block Operations
17
This means that the number of rows in each A.
must be the same for each' i ahTthe number o £ "co l'uinns
must be the same tor each i.    I'fie size or A-is-
---- j_j
rpiv    nr>>-<--i ^ -»-v -^A-^-,-----_
i 3
indicate this by writing
:  for certain integers m± and n . We
(2.1.1')    A =
No.  of columns
"11
*kl
»12
k2
_L£
kX.-
No. of rows
n,
A square matrix A of order n is often partitioned symmetrically.    Suppose that xT^~nT + n? +' • --:'T"n with ni f 1.    Partition A as ~ r
11
12
Alr
(2.1.2)
A =
r2
where size A±j = n± x nj.    The diagonal blocks A.. are
square matrices of order n . «=—-_—_11
___ . t l -~—
Example.
X	X	X	X	X	X
X	X	X	n	X	X
X	X	X	X	X	X
X	X	X	X	X	X
X	X	X	X	X	X
X	X	X	X	X	X
= 6
= 2
= 1
= 3
is a symmetric partition of a 6 x 6 matrix.
3g"?rf.?a^icfs ara^eaJ^Jttft-Efc uar or compounded of square blocks all of the samg^TzTT^'~-
Dotted lines, bars, commas are all used in an obvious way to indicate partitions.    The size of the blocks must be such that they all fit together properly.
16
Block Operations
19
18
Introductory Matrix Material
Example ■
X	X	X	X	X	X
X	X	X	X	X	X
X	X	X	X	X	X
X	X	X	X	X	X
X	X	X	X	X	X
X	X	X	X	X	X
If a square matrix A of order nk is composed of n x n square submatrices all of order k,  it is termed an (n, k)   matrix.    Thus the matrix depicted above is a (2,  3) matrix.
Subject to certain conformability conditions on the blocks, the operations of scalar product, transpose,   conjugation, a"3ol'LTron"   alid'muTEXpIiLcation are carriecT dutTTn the"s~sm~e'"way when expressed Th_bioc^ notation as when they are expressed in element" noS7 tion.     This means
(2.1.3)
11
cA
11
cA
1JI
(2.1.4)
Jcl
11
k£
ca.
kl
11
cA
kl
Tel
Akl
k£
kl
(2.1.5)
11
^1£'
A* All
A * Akl
"kl
A
k£
A*
kJt
Here T designates the transpose and * the conj ugate transpose.
(2.1.6)
All • * * A1JI\      / Bll * ' ■ BU
Akl * * *  AkS, Bkl  • * * Bkl
All + Bll
A1J> + B1J>
\l + Bkl
Ak£ + Bk£
, A
(2.1.7)
11
1H<
lk£
11
'11
'In'
'kn
where C..  = J    ,A. B
13      ur=l lr r^
In  (2.1.6)  the size of each A^^ must be the size
of the corresponding B
In  (2.1.7), designate the size of A.. by a.   x  g .
13 1 3
and the size of B.. by Y.   * 6..    Then,  if 6    = y for 13    2    1       3 ' r      1 r
1 < r < I, the product A.  B  . can be formed and pro-—     —   1 c ir r] c
duces an       x 6.. matrix,  independently of r.    The sum
can then be found as indicated and the C.. are a.   x 6.
13 1 :
matrices and together constitute a partition. Note that the rule for forming the blocks C.. of the matrix
product is the same as when A..  and B..  are single numbers. ^ x3
Example.    If A and B are n x n matrices and if B      -A /
C =
20
Introductory Matrix Material
Block Operations
21
, A2 + B2,
C =
BA - AB,
AB - BA A2 + B2
PROBLEMS 1
2.
3.
In the example just given what is C2 if A and B commute?
In the example, compute C3. what if a and B commute?
Let
4.
6.
M =
,  N =
B.
r 0
be two block diagonal matrices. When can the product MN be formed?   What is the product?
"Hadamard matrices" of order 2n are given recursively by means of the definition
H
H2 =   [1 -1]'        H v+1
.[V V]
H , -H ,
Write out H4 and Hg explicitly.    Compute H2E2' H4H4"
Let A, B, C, D all be n x n and let a, b, c, d be scalars.    What is
Let I    be the identity of order p.    Prove that /i B \
det
. 0
= det C.
A B
7. If A and C are square, prove that det(Q „) = (det A)(det C).
8. If A and C are square, prove that the eigenvalues
A B
of  (g c) are those of A together with those of C.
2.2    DIRECT SUMS
k,  let A^ be a square matrix of
For i = 1,  2,   . . . ,
order n--.    The block diagonal square matrix
(2.2.1)    A =
0
= diag(A1, A2,
of order n^ + n2 + •••  + n^ is called the direct sum
of the Ak and is designated by -"
.....■   1   ■■■■■ ; " ~ k
ffi A.   =    ffi    A. . k l
(2.2.2)
A-l • A2
i=l
The following identities are easily established.
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(A ffi B) (A + B)
C = A
(B © C) .
(C + D)  =   (A © %)
(A © B) (C © D)  = AC ffi BD.
T
((| © D)
T T (A ffi B)    = A
ffi B
(A ffi B) * = A* ffi B*.
(AffiB)   1 = A1ffiB'L, assuming that the indicated inverses exist.
det (A ffi B)  =  (det A) (det B).
tr (A ffi B)  = tr A + tr B.
If pA(A) designates the characteristic polynomial of A,  then PAeB<M  =   (pA(X) ) (pB(X) ) .
Hence >„ (A ffi B)  = { AA,  AB}.     (AA designates the set of eigenvalues of A.)
22
Introductory Matrix Material
PROBLEMS
Let A
© A,
Prove that det A =
A-   © A~ © 12 k
n.   ,   det A.   and that for integer p, AP = A?
o.= l      pi 1 ...   © AR.
Give a linear algebra interpretation of the direct sum along the following lines.    Let V be a finite-dimensional vector space and let L and M be sub-spaces.    Write V = L 8 M if and only if every vector x E V can be written uniquely in the form x = y + z withy € L,  z 6 M.    Show that V = L © M if and only if
y } are bases for
y } is a m
(a) dim V = dim L + dim M, L D M = {0}
(b) if {xi;...,x£} and {y^.
L and M, then {x.,...,x.,y_, basis for V.        x 1
The fundamental theorem of rank-canonical form for square matrices tells us that ifAisanxn matrix of rank r,  then there exist nonsingular matrices P, Q such that PAQ =1    © 0 _ . Verify this  formulation. r n_r
2.3    KRONECKER PRODUCT
Let A and B be m x n and p x q respectively.    Then the Kronecker product   (or tensor, or direct product of A and B)  is that mp * nq matrix defined by
(2.3.1)    A 0 B =
allB'
amlB'
a12B.
a _B, m2
a, B In
a B mn
Important properties of the Kronecker product are as follows   (indicated operations are assumed to be defined]: .....______________„___
(1) (aA)  & B = A 0  (aB) = a(A 0 B);  a scalar.
(2) (A + B)   0 C =  (A 0 C) + (BSC).
(3) A ®   (B + C)   =   (A 0 B) +   (A 0 C) . |^(4) A 0   (B 0 C)   =   (A 0 B) 0 C.
Kronecker Product
23
(5)	(A (	i B) (C (	3D)   =   (AC)   0 BD.
(6)	A &	B = A <	3 B.
(7)	(A	T i B) =	AT 0 BT;    (A 0 B)* = A*  0 B*.
(8)	r (A	0 B) =	r (A) r(B) .
orders m and n.
Then
(9) (10)
(ID (12)
tr (A 0 B)
(tr(A))(tr(B)).
If A and B are nonsingular, so is A B"1.
B and
(A 0 B)   1 = A-1
B)   =   (det A) "(det B)m.
det (A
There exists a permutation matrix P (see Section 2.4) depending only on m, n, such that B 0 A = P*(A 0 B)P.
(13)    Let fS{x,  y)  designate the polynomial
P 4 k
0(x,  y)  =      I      a. xJy . j,k=0 3k
Let 0(A;  B)  designate the mn x mn matrix
j,k=0
*j,k
A^   0 Bk.
Thf.n the eigenvalues of ß (A;  B) are
0(A,
1, 2,
m.
1, 2,
n where X    and u    are the eigenvalues of A r s
and B respectively Lues of
,m;   s = 1,   2, ...,n
eigenvalues of A 0 B are A u ^ r s
In particular, the r = 1, 2,
PROBLEMS 1.    Show that I
I = n
m       n mn Describe the matrices I
A, A
If A is m x m and B is n x n,  then A 0 B
(A 0 I  ) (I    0 B) n m
(I    © B) (A
24
Introductory Matrix Material
Permutation Mairiaas
9. 10.
If A and B are upper (or lower) triangular, then so is A ® B.
If A ® B ^ 0 is diagonal,  so are A and B.
Let A and B have orders m, n respectively
+
Show
that the matrix  (I   ® B)
(A © I  )  has the n
m
< s
A and u are the eigenvalues, r s 3
eigenvalues X
1,  2,   ..., n, where
of A and B.    This matrix is often called the Kronecker sum of A and B.
Let A and B be of orders m and n.    If A and B both are  (1) normal,   (2)  Hermitian,   (3) positive definite,   (4)  positive semidefinite,  and (5) unitary,  then A ® B has the corresponding property.    See Section 2.9.
[21
Kronecker powers: Let A       = A ® A and, in
[k+1]
general, A A[k]   « A[i]. Prove that (AB Let Ax
A © A
[k]
Prove that A
[k]
= A[k]B[k]
Ax and By = py,  x =
T       ,      T T
-xy , x2y
that   (A ® B)Z = AyZ.
'2.4   /PERMUTATION MATRICES
(x
1'
x y ]. Prove irr
By a permutation^g of the set N= fl,  2, n} is
meant a one-to-one mapping of N onto itself. Including the identity permutation there are n! distinct permutations of N.    One can indicate a typical per-mutation by
c(l) i± (2.4.1)        a(2) i<= i.
a(n)  * i.
(2.4.11)
2
i.
The inverse permutation is designated by a 1. Thus
Let Ej designate the unit   (row) vector of n cor
ponents which has a 1 in the jth position and O's elsewhere:
(2.4.2)
E
(0,
/  0,  1, 0,
, 0)
BY a permutation matrix of order n is meant a matrix of the form
(2.4.3)
One has
(2.4.4)
(a^_j) where
3i,a(i)  = 1'   1 = !<2'---
1 / 3
- 0 I otherwise.
^>-J^^mLM,^Jsm.%. I in the aJi l;th.„column..Jatld_.a,; elsewhere.    The^ith^qlrari ofp has a l_in the
5.... iiX^hJ^.9K.3M.Als elsewhsre.    Thus each row and each column of P has precisely one 1 in it.
Example
P =
a
'0 0 0 1 10 0 0 0    0 10
'0100
which is often written as
It is easily seen that
Permutation Matrices
27
26
Introductory Matrix Material
(2.4.5)
y (l)
co(2)
c (n)
Hence if A =  <ajj2_^s an n x r matrix'
(2.4.6)
:A = (aa(i),j
that is, P A is A with its rows permuted by a. More-
over ,
(2.4.7)     (xr  x2, xn)P(
=  (x    ,       , x ,
a i(l)      a ±(2)
) ,
a "(n)
so that if A =   (aii)  is r x n,
(2.4.8)
ap,
(a    _! >-
XyO (j)
That is, AP0 is A with its columns permuted by a Note also that (2.4.9)
P P    - P
ox at
3
where the product of the permutations o,  t is applied
from left to right. Furthermore,
1;
(2.4.10)
hence
(P0)* = P
(2.4.11) (PO)*P0 = Pö_1Pa = Px Therefore
(2.4.12)
I.
(Pa)* = P^ = (Pa)
-1
The permutation matrices are_ thus unitar^^ormlng a "subgroup" of "the uniTary^groug^
From  (2.4.6),   (2.4.8)  and  (2.4.12) it follows that if A is n x n
(2.4.13)
P AP*
a a
(aa(i) ,a(j)J '
so that the similarity transformation P AP* causes a
consistent renumbering of the rows and columns of A by the permutation a.
Among the permutation matrices,  the matrix
(2.4.14)
plays a fundamental role in the theory of circulants. This corresponds to the forward shift permutation a(l) = 2,  o(2) = 3, a(n-l) = n,     (n)  = 1, that
is, to the cycle a =  (1,  2,  3, n)  generating the
cyclic group of order n   (tt is for "push").    One has
(2.4.15)
tt2 -
0
0
corresponding to a2 for which a2(I)  = 3,  a2 (2)  = 4,
2 k k
a   (n)  = 2.    Similarly for tt    and o  .    The matrix
Tjn corresponds to <jn
I,  so that
(2.4.16)
n
TT
Note also that T
(2.4.17) tt
1 n-1
tt      =   tt = tt
=   IT* =
A particular instance_of   (2.4.13) is
(2.4.18) ! ttAttT =  (a. . .   .   . )
| 1+1,3+1'
where A =   (a.j.)  and the subscripts are taken mod n.
Permutation Matrices
29
28
Introductory Matrix Material
Here is a second instance.     Let L
(A
1'
X^)   .    Then,  for any permutation matrix Pq1
(2.4.19)
Pa(diag L)P* = diag(PaL).
A second permutation matrix of importance is
0
(2.4.20)
o
o
■J
o
which corresponds to the permutation a(l) = 1, a(2) n,  a(3)  = n - 1, o(j)  = n - j + 2, a(n) =
Exhibited as a product of cycles,  o =   (1)(2, n)
(3, n - 1) ,   . . . ,   (n,  2).
It follows"that d2
2.
I, hence
(2.4.21) Also,
(2.4.22)
I.
r* = r
Again, as an instance of (2.4.13),
(2.4.23)       r(diag L)T = diag(TL).
Finally, we cite the counteridentity K, which has I's on the main counterdiagohal and 01s elsewhere:
(2.4.24)
K = K
One has K = K*.   K    = I, K
-1
Let P = PQ designate an n x n permutation matrix.
Now a may be factored into a product of disjoint cycles.    This factorization is unique up to the arrangement of factors.    Suppose that the cycles in the product have lengths p^,  p2, pm,
• • - + p    = n) .    Let 7t      designate the ir matrix
(P1 + P2
(2.4.14)  of order p.
By a rearrangement of rows and
columns,  the cycles in P^ can be brought into the
form of involving only contiguous indices,  that is, indices that are successive integers.    By (2,4.13), then,  there exists a permutation matrix R of order n such that____
(2.4.25)
RPR*  - RPR
•1
7t © ti
Pl p2
© it
Since the characteristic polynomial of tt is
P]T"-Pk~----------- ----™—"   Pk -
(-1)     (A      - 1),  it follows that the characteristic
1)
polynomial of RPR*, hence of P,  is n™=1(-l) (A
The__ej^ggny_aJLue.s of the permutation matrix P are therefore^ the roots of unity comprised~In the "tdtaTity or ' roots of" EKe'm equatlonST*™'**™™"     ™ ——.
= 1,
k = 1, 2,
Example.    Let a be the permutation of 1,  2,   3,   4,   5, 6 tor which a(l)  = 5,  o(2)  = 1,   a(3)  = 6,  a(4)  = 4, a (5.)  = 2,   a(6)  = 3.    Then a can be factored into cycles as o =   (152)(4)(36).    Therefore, m
3 and p^ =  3,
Po = l.r P-
P . =
The matrix P is
o
\
»
3 0 Introductory Matrix Material Permutation Matrices 31
The matrix R corresponding to t(1)  = 1,  t(5) = 2, t(2)  = 3, t(4)  = 4,  x(3) = 5,  t(6)  = 6,  is such that
RPaR* =
0	1	0	0 :	0	0
0	0	1	0	0	0
1	0	0	0	0	0
0	0	0	1	0	0
0	0	0	0	0	1
0	0	0	0	1	0
The_ eigenvalues of^ p    are therefore the roots of (A3 - 1) (A - 1) (X2 - 1) .
A permutation a is called primitive if its factorization consists of one cycle of full length n. The eigenvalues ofa primitive permutation matrix are the nth roots of unity, hence they are distinct.
PROBLEMS
If M is m x n, describe the relationship between
[n/2]
6.
M,  KM,  and MK . '    m n
Prove that det Kn = (-1)
where   [x] designates the largest integer < x. Determine the characteristic polynomial of it .
For integer p,  set M    = tt^ + tt ^.    Prove that »
p /h,
M    = M P n-p
M
M
p+1
Let C
M,M 1 P
(x) =
M
0 n
- M   , . p-1
2 cos n£
21, M M    = M  , + p q p+q
M
p-q'
where x = 2 cos 6, desig-
nate the Tschebvscheff polynomials of the first
kind. One knows that C ™"7(xl =' xC~Tx) ::~C Tlx") , - n+1 n n-1
Referring to Problem 4,
CQ(x)  = 2,
C1 (x)   = x.
prove that M   = C   (M, ) .
P        ^ n Let N =  {0,   1,  2, 2    - 1} and let a desig-
nate the permutation of N that results from reversing the binary bits of the elements of N.
7 .
8 .
9.
10.
11.
12.
13.
14.
Example		When n =		
0 ■+ 000		000	-»■	o.
1 001		100	~y	4,
2 -+ 010		010	->-	2,
3 -+ 011	-y	110	->-	6,
4 ->- 100	-+	001	->-	1,
5 -+ 101	->	101		5,
6 ■+ 110		011		3,
7 -* 111		111		7.
3,
Discuss the factorization of a for n about the general case?
Describe the matrices I
What
i it   ;  it ® n n
ii    and tt n n
I are m
If m > 1, prove that I
derogatory,  that is,  their minimal polynomial is not their characteristic polynomial.
Prove that K      0 K     = K
„m        2n 2m+n
Let tt be of order n.    Prove that I + tt + tt +
-•-  + tt11 ^ = J, where J is the matrix of all l's.
If a is a primitive permutation, prove that a ^~ is primitive.
If a and x are primitive permutations, is it true that ct is primitive?
P is a primitive permutation matrix if and only if it is of the form P = R*ttR where R is a permutation matrix.
P is a primitive permutation matrix of order n if and only if n is the least positive integer
for which Pn = I.        '"p      'i. (iVw ■ ,vi-*„/<,, >.
THE FOURIER MATRIX
Let n be a fixed integer > 1 -and set
c t \ _,2Tri. 2tt  , 2tt
(2.5.1)    w = exp(—:—)  = cos — + l sin —
In a good deal of what follows, w might be taken as
any primitive nth root.....of unity.,' but we prefer to
standardize the selection as in  (2.5.1).    Note that
i
32
Introductory Matrix Material
(2.5.2)
(a) w
n
1,
(b) ww = 1,
(c) w = w"1,
(d) wk = w"k = wn"\
(e) 1 + w + w    + • • .
_i_ n~l n + w       = 0,
By the Fourier matrix of order n, we shall mean
atrix J"
(2.5.3) '. f*
the matrix f  (= F ) where
i_(w(i-l) (J-D
1_
/n		
1	1	
w	2 w	
w2	w4	
w
2<n-l)
n-1     2(n-1)
,(n-l)(n-1)
Note the star on the left-hand member.    The sequence
w ,  k= 0,  1, is periodic; hence there are only n
distinct elements in F.    F can therefore be written alternatively as
(2.5.4) F*
,-1/2
1	1	1
	2	
w	w	w
2	4	
w	w	w
	n-2	•
w	w	w
It is easily established that F and F* are symmetric:
(2.5.5)
T
F  , F*
(F*)T = F,   F = F*,
It is o:  fundamental importance that
The Fourier Matrix
Theorem 2.5.1.    F is unitary:
(2.5.6)
FF*
—T —T FF    = F F
F*F = I I
or or
F      = F*
-1 —T F      = F ,
or
Proof.    This is a result of the geometric series identity
y1  r(i-k)     1 -  Wn(j-k) fn  if  j  = k,
I   W     -        =-- = (
n-1
I
r=0
v 0 if j ^ k.
A second application of the geometrical identity yields
Theorem 2.5.2
= F*F* = r =
1
F,
Corollary.
4 2 F*   = r
I.
3 4 p*    = f* (f*
IF
We may write the Fourier matrix picturesquely in the form
(2.5.7)
(It may be shown that all the qth roots of I are of the form M "'"DM where D = diagdj^, u,,, u  ) ,  u? =
and"where M is any nonsingular matrix7T~
Corollary. The^jej^enyalj^^ appropriate multiplicities.
Carlitz has obtained the characteristic polynomials f(A)   of F*   (= F*).    They are as follows.
n = 0(mod 4),
f(A)  = (A
i
(A
1) (A + i) (A + 1) • l)(n/4>-\
34
Introductory Matrix Material
The Fourier Matrix
35
n 5 1(mod 4), f (X) n = 2 (mod 4), f (X) n = 3(mod 4),        f (X)
(X - 1)(X4 - i)(V4)(n-l)
(X'
1)(X4 - 1)(V4)(n-2)
(X + i) (X'
1)
(X'
1)
(1/4)(n-3)
The discrete Fourier transform, complex n-tuples, write
Working with
Z = (z , z2, Z =   (z^f  z 21
and
The linear transformation (2.5.8)    ;   Z = FZ I
where F is the Fourier matrix is known as the discrete Fourier transform  [DFT).    Its inverse is given simply by
(2.5.9)
F*Z.
The transform (2.5.8) often goes by the name of harmonic analysis or periodocfram analysis, while the inverse transform (2.5.9)  is called harmonic synthesis, The reasons behind these terms are as follows: suppose
that p(z) = aQ + a^z + ... + anzn_1 is a polynomial of
degree < n - 1.    It will be determined uniquely by specifying its values p(z )  at n distinct points z, ,
K K k = 1,  2, n in the complex plane.    Select these
points zk as the n roots of unity 1, w, w , w~ .
Then clearly
(2.5 .10)
n^F*
J
p(w )
so that
(2.5.11)
vT1'2*
p(D
P (w)
n-1'
p (w
n-1,
The passage from functional values to coefficients through (2.5.11) or  (2.5.8)  is an analysis of the function, while in the passage from coefficient values to functional values through  (2.5.10) or  (2.5.9) the functional values are built up or "synthesized."
These formulas for interpolation at the roots of unity can be given another form.
By a Vandermonde matrix V(Zq,  z^, zn_^) is
(2.5.12)
From (2.5.4)  one has, clearly,
\-2-
i V(l, w, w , ..
(2.5.13) _    _2 !
V(l, w, w ,   ... , w" l)
form		
1	1	1
zo	21	z . n-1
2	z2	2
zo		zn-l
n-1	n-1	n-1
zo	21	z . n-1
n-1, l/2„, w      )  = n F*
nl/2F* = nl/2F_
One now has from (2.5.11) (2.5.14)    p(z)  =  (1,  z, .
n-1, , z      ) (aQ, ax,
n-1. -1/2 = (1,  z,  . . . ,  z )n
F(p(l) , p(w) ,
, P(wn-L))T n-1,
a    , ) n-1
-2
(1,   z,   ...,   z      )V (1,  w, w ,
, wn_1) (p(l) ,  p(w) ,
, n-1,.T , P (w      )) .
36
Introductory Matrix Material
37
Note.    In the literature of signal processing, a sequence-to-sequence transform is known as a discrete or digital filter.    Very often the transform [such as (2.5.8)]  is linear and is called a linear filter.
Fourier Matrices as Kronecker Products.;   The Fourier
matrices of orders 2   maybe expressed as Kronecker products.    This factorization is a manifestation, essentially, of the idea known as the Fast Fourier Transform (FFT)  and is of vital importance in real time calculations.
Let F'    designate the Fourier matrices of order
_2__
2    whose rows have been permuted according to the bit
reversing pernuitatlon (see Problem"^ pT TUJ Examples
F2
1 -1
i -i
1 1 -1 -1
One has
(2. 5.15)
F4 =   (I2 8 F2)D4(F2
I2),
where D    = diag(l,  1,  1, i).    This may be easily checked out.
As is known, A ® B = P(B ® A)P* for some permutation matrix P that depends merely on the dimensions of A and B.    We may therefore write,  for some permutation matrix S4   (one has, in fact, sT1
(2.5.16)
F4
(I2  0 F^)D4S4(I2
F2>S4
Similarly,
(2.5.17)
F ■ =
16
F^D16(F4 S I4»
where
(2.5.18)      D16 = diag(I,  D2, D, D3)
with
(2.5.19)      D = diagd, w, w2, w3),
w = exp
42jri
16
Again,  for an appropriate permutation matrix S-^g =
°16 °16' (2.5.20)
FJ6 =   (I4 8 F4)D16S16(I4 6 F41S16-j
For 256 use.
(2.5.21)      D256 = diag(I, D8, D4, D15)
where the sequence 0,   8,  4, 15 is the bit
reversed order of 0,  1,...., 15 and where
-27Ti/256
w = e '
(2.5.22)      D = diag(l, w, w15),
PROBLEMS
1. Evaluate det F .
n
2. Find the polynomial p^_^(z) of degree <_ n - 1 that takes on the values 1/z at the nth roots of unity,
wJ,  j = 1, 2,
, n.    What is the limiting
behavior of P_(z)  as n ->• °°?    (de Mere)
Write F
R + iS where R and S are real and i =
/d.    Show that R and S are symmetric and that R2 + S2 = I,  RS = SR. Exhibit R and S explicitly.
2.6    HADAMA.RD MATRICES
By a Hadamard matrix of order n,  H   (= Hn),   is meant a
matrix whose elements are either +1 or -1 and for which
38
Introductory Matrix Material
Hadamard Matrices
39
(2.6.1)
Thus, n
f T HH    = H H
nl.
•1/2
H is an orthogonal matrix.
Examples
Hx = (1), /2 F0 = H.
4,1
= (
1 1 1 -1
1
1
1
1
1 1 1 1
l4,2
It is known that if n > 3, then the order of an Hadamard matrix must be. "a multiple of 4.   With one""
least one.Hadamard matrix.
Theorem 2.6.1.    If A and B are Hadamard matrices of orders m and n respectively,  then A 0 B ..is. an Hadamard matrix of order mn.
Proof
(A ® B) (A 8 B)
(A 0 B) (A
T
BT)   =   (AAT)   0 (BBT)
(mlm) 0 (nln) = mn(Im
mnl
In some areas, particularly digital sicrnal processing , the term Hadamard nmatrix is limited to the
matrices of order 2n given specifically by the recursion
(2.6.2)
= (1),
H     , ,   =  H       ®  H . .
2n+l        2n 2%
t1 1) [1   -1> '
These matrices have the additional property of being symmetric,
(2.6.3)
so that
(2.6.4)
V = V
H2    = 2nI. 2n
The Walsh-Hadamard Transform. By this is meant the transform
(2.6.5)
HZ
where H is an Hadamard matrix.
PROELEMS
Hadamard parlor.^gapie;    Write down in a row any four numbers.    Then write the sum of the first two,  the sum of the last two;  the difference of the first two, the difference of the last two to form a second row.    Iterate this procedure four times.    The final row will be four times the original row.    Explain, making reference to H, Generalize.
P is square and every row
Define a
follows. ~P" is"~square and every"row and every co Umn of P has exactly one nonzero, element in it. That element is either a +1 or a -1.    Show that if H is an Hadamard matrix, and if P and Q are generalized permutation matrices, then PHQ is an Hadamard matrix.
With the notation of   (2.6.2)  prove that
,n+l
= (H
I2)(1^ ® H2).
40
Introductory Matrix Material
Generalized Inverse
41
4. Using Problem 3,  show that the Hadamard transform of a vector by H     can be carried out in
2n
< n 2    additions or subtractions.
5. If H is an Hadamard matrix of order n, prove that Idet Hi = nn/2.
inverse exists.    That is,  there are_ many matrices A for which there exists no matrix B such~~that AB = BA = I. " "
In discussing the solution of systems of linear eguations, we know that if A is n x n and nonsingular then the solution of the eguation
AX
B,
2.7 TRACE
The trace of a square matrix A = (a^) of order n is defined as the sum of its diagonal elements:
(2.7.1)
tr A
n I j-l
a . . , 33
The principal general properties of the trace are tr (aA + bB)  = a tr (A)  + b tr (B) .
(1)
(2)
(3)
(4)
(5)
tr(AB) = tr (BA).
tr A = tr (S 1AS), S nonsingular.
If        are the eigenvalues of A, then
tr A = 7" 1   \. .
L 1=1 l
More generally,  if p designates a polynomial
P(A)
j = 0 3
(6)
(7) (3)
then tr(p(A)) = ££=1 p(AR).
tr(AA*) = tr(A*A) = . ,|a..|2 = square ___ '3 — -L^D
of Frobenius norm of A. tr (A ® B)  = tr A + tr B. tr (A ® B)  =  (tr A) (tr B)
£7]
■
where X and B are n x m matrices, can be written very neatly in matrix form as
X = A XB. !
.........«t\
Although the "solution" give above is symbolic, and in general is not the most economical way of solving systems of linear equations, it has important applications.    However, we have so far only been able to use this idea for square nonsingular matrices. In this section we show that for every matrix A, whether square or rectangular, sin^I^*jor'''nolfsingui.ar, there exists a unique "generalized inverse" often called the "Moore-Penrose" inverse of A, and employing it,
the formal solution X = A "'"Bean be given a useful interpretation.    This generalized inverse has several "of the important properties of the inverse of a square nonsingular matrix,  and the resulting theory is able in a remarkable way to unify a variety of diverse topics.    This theory originated in the 1920s, but was rediscovered in the 1950s and has been developed extensively since then.
Right and Left Inverses
Definition.     If A is an m x n matrix, a right inverse
of A is an n x m matrix B such that AB = I . Similar-
m
ly a left inverse is a matrix C such that CA = I .
GENERALIZED INVERSE
Example■ If
A =
For large classes of matrices,  such as the square "singular" matrices and the rectangular matrices, no
a right inverse of A is the matrix
42
Introductory Matrix Material
Generalized Inverse
43
since AB
2"
However, note that A does not have a left inverse, since for any matrix C, by the theorem on the rank of a product, r(CA)  < r(A) =2, so that CA ^ I^-    Similarly, although A is, by definition, a left
inverse of B,  there exists no right inverse of B.
The following theorem gives necessary and sufficient conditions for the existence of a right or left inverse.
Theorem 2.8 .1.1. An_ m x n matrix A__ has a right (left) inverse jLf and ojiiy~jLf A has" rahkmTn)
Proof.    We work first with right inverses.
Assume that AB = I
Then m = r (I ) < ro —
r(A)  < m.
Hence r(A)  = m.
Conversely, suppose that r(A)  = m.    Then A has m linearly independent columns, and we can find a permutation matrix P so that the matrix A = AP has its first m columns linearly independent.    Now,  if we can find a matrix B such that AB = APB = I, then B = PB is clearly a right inverse for A.
Therefore, we may assume, without loss of generality, that A has its first m columns linearly independent.    Hence A can be written in the block form
A =   (A1, A2)
where A^ is an m x m nonsingular matrix and is some m x   (n - m) matrix.
Q)
ill
Now let
A = A, (I 1 m
This can be factored to yield
B =
where B-^ is m x n and B2 is
m)   x m.    Then AB = I
if and only if
A1B1 + A1QB2
or if and only if
Bl + QB2 = Al '
or if and only if
Bl = Al3 Therefore, we ha\
QB2.
QB,
B =
for an arbitrary (n - m) x m matrix B^. Thus there is a right inverse, and if n > m, it is not unique.
We now prove the"^Ke^r^~f6r,'a'"'le^t^lnvers^T Suppose, again, that A is m x n and r(A) = n. Then
By the first part, AT has Hence BTA = I and A has a
is n x m and r(AT)
a right inverse: ATB = I. left inverse.
Corollary. If A is_n_x n of rank n, then A_has both a right and a left inverse "and" they are the same.
Proof.    The existence of a right and a left inverse for A follows immediately from the theorem. To prove that they are the same we assume
AB = I,        CA = I.
Then C(AB)  = CI = C.    But also,
C(AB)  =   (CA)B = IB = B,
so that B = C
be the inverse of A, denoted by A
This is the matrix that is defined to 1
44
Introductory Matrix Material
Generalized Inverse
PROBLEMS 1
Find a left inverse for left inverses.
Does
Find all the
1
2
3
4
have a left inverse?
Let A be in x n and have a left inverse B. Suppose that the system of linear equations AX = C has a solution.    Prove that the solution is unique and is given by X = BC.
Let B be a left inverse for A. Prove that ABA = A and BAB = B.
5.    Let A be m x n and have rank n.
nonsingular and that (ATA) "LAT for A.
Prove that A A is is a left inverse
Let A be in x n and have rank n.    Let W be m x na positive definite symmetric.    Prove that ATWA is
m —IT
nonsingular and that (A WA)    A W is a left inverse
Generalized Inverses
Definition.    Let A be an m x n matrix.    Then an n x m matrix X that satisfies any or all of the following properties is called a generalized inverse:
(1) AXA = A,
(2) XAX = X,
(3) (AX)* = AX,
(4) (XA)* = XA.
Here the star * represents the conjugate transpose. A matrix satisfying all four of the properties above is called a Moore-Penrose inverse of A  (for short: an M-P inverse).    We show now that every matrix A has a
unique M-P inverse. It is denoted by A". It should be remarked that the M-P inverse is often designated
by other symbols, such as A .    The notation A'  is used here because   (a)  it is highly suggestive and   (b) it comes close to one used in the APL computer language.
We first prove the following lemma on "rank factorization" of a matrix.
Lemma.
If A is an m x n matrix of rank r, then A = BC,
where B is m x r, C is r x n and r(B) = r(C) = r.
Proof.    Since the rank of A is r, A has r linearly independent columns.    We may assume, without loss of generality, that these are the first r columns of A, for,  if not, there exists a permutation matrix P such that the first r columns of the matrix AP are the r linearly independent columns of A.    But if AP can be factored as
then
AP = BC,        r (B)  = r (C)
A = BC
where C = CP     and r(C) = r(C) = r,  since P is non-singular.
Thus if we let B be the m x r matrix consisting
of the first r columns of A, the remaining n - r
columns are linear combinations of the columns of B,
of the form Bq'-''  for some r x 1 vector q'^'. Then if we let Q be the r *   (n - r) matrix,
Q = (Q
(1)
Q(n"r)),
we have
If we let
we have
r n-r
A =  (B,    BQ) (letters over blocks
indicate number of columns)
C =   (Ir,  Q) ,
A = B(Ir,  Q)   = BC
and r (B)  = r (C) = r.
46
Introductory Matrix Material
Generalized Inverse
47
.We next show the existence of an M-P inverse in the case where A has full row or full column rank.
Theorem 2.8.2.1
(a) If A is square and nonsingular,  set A' = A
(b) If A is n x 1   (or 1 x n) and A / 0, set
1 -----v l
-1
A =
(A*A)
A*
(or A =
(c)    IfAismxri and r (A) A*(AA*)-1. AT  = (A*A)_1A*.
(AA*) m,  set A
A*)
Then A' is, an M-P inverse for A.    Moreover, in the case of full row rank,  it is a right inverse; in the case of full column rank, it is a left inverse. *
Note that  (a)  and   (b) are really special cases of (c).
Proof.    Direct calculation.    Observe that if A is m x n and r(A) = m, then AA* is m x m.    It is well
known that r (AA*) = m, so that  (AA*)~   can be formed. Similarly for A*A.
We can now show the existence of an M-P inverse for any m x n matrix A.
If A = 0, set AT = 0* = 0_
n ,m
This .is readily
verified to satisfy requirements (1), (2), (3) and (4) for a generalized inverse.
If A ^ 0, factor A as in the lemma into the product
A = BC
where B is m x r, C is r x n and r(B) = r(C) = r. Now B has full column rank while C has full row rank, so
that B'  and C may be found as in the previous theorem.
Now set .
f T *     * I
:  A    = C B  . j
Theorem 2.8.2.2. Let A' be defined as above. Then it is an M-P inverse for A.
Proof.
It is easier to verify properties (3!
and (4) first. They will then be used in proving properties  (1) and (2).
(3)     AA'   = B(CC")BV  = BIB~  = BB~,  and since
BBT =   (BBT)*, we have AAT =   (AAT)*.
(4)
(1)
(2)
Now we pi unique.
Similarly, A (AAT)A = (BB (ATA)AT = (C
A=C'C=   (C C) * = ) BC =  BC = A. C)CTBT = CVBT  = A~
(A'A) *.
A the M-P invert
Theorem 2.8.2.3.    Given an ra x n matrix A,  there is
o_nly one matrix A' that satisfies all four properties fo'r the Mb or e -P en ro s e invers if.
Proof.    Suppose that there exist matrices B and C satisfying
Then B
and
C =
ABA	= A		(1)	ACA	= A,
BAB	= B		(2)	CAC	= c,
(AB)	* = AB		(3)	(AC) *	= AC
(BA)	*   = BA		(4)	(CA)*	= CA
(2)	(4)		(1)		
=	(BA)B =	{A*B*	)B    = (A*C*A*)B	*B	
(4)		(4)	(2)		
=	(CA)(A*B*	B) =	CA(BAB)     = CAB		
(2)	(3)		(1)		
=	C(AC) =	CC*A*	= CC*(A*B*A*)		
(3)		(3)	and (2)		
=    (CC*A*)(AB)
CAB.
Therefore B = C.    The integers over the equality signs show the equations used to derive the equality. Penrose has given the following recursive method
48
Introductory Matrix Material
ilized Inverse
49
for computing A', which is included in case the reader would like to write a computer program.
Theorem 2.8.2.4 (the Penrose algorithm) m * n and have rank r > 0.
Let A be
(a)
(b)
(c)
Set B - A*A  (B is n x n)
x n) .
Set recursively for 1=1, 2,
Set C1 = I
(C1 is n
t  - 1:
ci+1 ■ <l/i)tr(C.B)r
C.B (C^ is n
n) .
Then tr(CrB) / 0 and AT = rCyA*/tr(CfB). Moreover, Cr+1B
0.    We therefore do not need to know r
beforehand, but merely stop the recurrence when we have "arrived at this sTage.
The proof is omitted.
Also very useful is the Greville algorithm. Theorem 2.8.2.5.
where       is the
kth column of A and A,   , is the submatrix of A consis-
___.__— .....-i— mi........Km, i.....—1 in.iiin.iij_.........WW—mm
Set d^ = Ak and
ting of its first k - 1 columns.
- Ak^S:
Set b
cw if cv £ °-
If ck _= 0,
set bk -  (1 + d*dR)
"1d*A,;
k k-1"
Then
A    = A, = k
T<-1
dkbk
To fitjarts , set ° (a*a1)"1a*.
PROBLEMS 1.     If a
, verify that
1 f-6
3 -
3
3 9
I)
10.
11.
12.
13.
14.
I If
f a =   ( 1    1 ) ,  find a" v 1    2 '
A =
find A'.
Use Penrose's formulas to compute the inverse of the nonsingular matrix
,14      8 3v (8      5 2).
'3      2 1 Use Greville's algorithm.
.If c is a nonzero scalar, prove that  (cA)' = (l/c)A\
Prove that  (A')" = A. Prove that  (AT)* = (A*)T.
If d is a scalar, define d' by d' = d ^ if d / 0, d* = 0 if d = 0.    Let A = diag(d1 Prove that A' = diag(d^,
d ) n
d') . n
Prove that ,0 B^
(
0 v j ,0 B+)  and (B
0;
0
)
Prove that if A' = 0, then A = 0*.
Let A =  (*       and have rank 1.    Prove that
A =
, 2 l
|br
Id
Let J be the J matrix of order n.    Prove that JT = (l/n2)J.
Let S be an n x n matrix with l's on the super-diagonal and 0*s elsewhere.    Find Sv.
2
Let P be any projection matrix  (i.e., P   = P, P*
= P)
Prove that P
P.
Tj___^,,~4-^nr M=4-vi v Maffl-riai Generali2ed Inverse
\J j.HI.j. UUUl^ LWJ-j      i iu v-J. -i-^i    * «- wjl.
15.	Prove that
16.	Prove that
17.	Prove that
18.	Taking A =
	eral, (AB)
19.	If a and b
	and (ab*)T
20.	Prove that
	2.8.3 The
A" =   (A*A)"A* r(A)  = r(. (1,   0) , B
A*(AA*)
(a "a
(^), show that,  in gen-
(a*a)Ta*,
(a*a)'(b*b)'ba*. , 6i B) '  = A'  8 B' .
The UDV Theorem and the M-P Inverse
We begin by establishing a theorem that is of great utility in visualizing the action and facilitating the manipulation of rectangular  (or square) matrices. This is the UDV theorem, also called the diagonal decomposition theorem or the singular value decomposition theorem. ...
Theorem 2.8.3.1. Let A be an m x n matrix with com-plex elements and of rank r. Then the exist unitary matrices U,   V oi orders m and n respectively such that
(2.8.3.1)
where
«ii ir ■ i" i "i ii •
(2.8.3.2)
UDV*
(Dl °)
d >
3
a,"
is m x n and where D^JT -^2J»i^lJ«»^2^"'^™ nonsingular diagonal matrix of order r.
Note that the representation  (2.8.3.1)  can be written as U*AV - D or. changing the notation, ^JAV = D, and so on   (since U and V are unitary).
"'   Let A be it! x n; then, as is well known, AA* is positive semidef inite Hermitian s^aunefc^A-c. and r (AA*) r !A)  = r (A*).    Hence the eigenvalues of AA* are real
2    ,2 „S 1' d2'
and nonnecjative.    Write them as d^, d", d_, 0,
0,
0 where the d.'s are positive and where there
are m - r O's in the list.
The numbers d-^, d.,,
dr are known as the singular values of A.
Proof.    Define Dn
diag(d
1' "2'
d ) . r
Let
U   be m x r and consist of the  (orthonormal) eigenvec-
2 ?
tors of AA* corresponding to the eigenvalues d., dt,, o L l
We have AA*U
.., &l (cf
and U*^
Theorems 2.9.3 and 2.9.9) I
Let U2 be the m x (m
1
r) matrix
whose columns consist of an orthonormal basis for the
null space of A* Write U = (U
Then A*U,
0 and U*U2
^, U2)   (block notation)
= I
m-r Then
U*U =
{U1' V
UiUl U2U1
u*u2
U2U2
Now,  since AA*U^ = viDx>
U*AA*U1 = U^Uj^D^.
But A*U_ =
0,  so that U*A = 0, hence ^2V1D1 = o.    Since is nonsingular,  it follows that U*U,  = U*U„ = 0. This
means that
2 1
u*u =
0
1"2
0
and hence that U is unitary.
Let V1 be the n x r matrix defined by V1 =
r) matrix whose n
A*U1D1
Let V2 be the n % (n
columns are a set of null space of A.    Thus AV.,
V as the n x n matrix V =  (V^, V2).
r orthonormal vectors for the
0 and V|V2 = Now
Define
V1V1
(d^Up.) (A^d"1) = d^uju^^d"1 dt^d,   =  I ,
11 r
and V*V1 = VJjA^D"1 =  (AV2)*U1D11 = 0.
that V is unitary.
/ U* S
Finally,
U*AV =
A(V,
v2)
It follows
U*AV2
U*AV2
Introductory Matrix Material
Generalized Inverse
53
/ uiAVi     u \     f ui
^ 0 0 '
U*AA*U1D1
■ 0 0 '
Using UDV theorem, we can produce a very convenient formula for A'.
If A = U*DV*, where U, V,  P are as
r
Theorem 2.8.3.2, above, then
vd'u
where
r	m-r	
-1	„ \	
Dl	1	r
0	o	n-r
Proof.    By a direct computation, it is easy to show that the n x m matrix
r	m-r
-1	°)
Dl	
0	0
is D
'U)   = U*DD'U = U*(or    0)u and
Now since A(VD /I 0\
(VDVU)A = v(Qr    0JV*,  the third and fourth properties for the generalized inverse are satisfied. Also, AA^A =   (U*DV*)(VD*u)(U*DV*)  = u*dd~dv* = U*DV* = A. Similarly A~ AA~ = A~, proving the first two properties.
Theorem 2.8.3.3. For each A there exist polynomials p and q such that
A*p(AA*),
AT = q(A*A)A*.
^■4
Proof.    Let A be m * n and have rank r.    Then by the diagonal decomposition theorem there exist unitary matrices U, V of order m and n and an m x n matrix
( 1 )
n-r
0^ 0'
r
m-r
where
that A = U*DV*.
D1 = diag (d-j^,
2'
Then A*
dr),
dld2"
•dr f 0), such
vd*U, aa* = U*DD*U, and
a' = VD'u. For an arbitrary polynomial p(z), p(aa*) = p(U*(DD*)U) = U*p(DD*)U.    Hence a*p(aa*) =
VD*p(DD*)U.    Therefore for a'  to equal a*p(aa*)' it is
necessary and sufficient that D'  = d*p(dd*). Equi-valently,
C1
o
0
or d"1 = dkp(|dk|2), k
= l/(|d, ■"
ient. among
D* '0 1,  2,
Thus P(|dkr)
•), k = 1, 2,
r is necessary and suffic-
Let s designate the number of distinct values |d^|,   |d2|, |dr|.    Then by the fundamental
theorem of polynomial interpolation (see any book on interpolation, approximation, or numerical analysis) there is a unique polynomial of degree < s - 1 that
takes on the values
I -2
at the s points |d.
The second identity for A'   is proved similarly.
PROBLEMS 1.
2.
Let U and V be unitary.    Prove that   (UAV)' = V*AVU*.
Let A be normal.    Give a representation for A' in terms of the characteristic values of A. See Section 2.9.
Prove that if A is normal, AAf = a^A.
Prove that AT = A* if and only if the singular
54
Introductory Matrix Material
values of A are 0 or 1 Prove that A'
limt_0 A*(tl + AA*)
-1
2.8.4    Generalized Inverts and ggfiligttfi of \^__^-^ Linear Foliations
Using the properties of the generalized inverse we are able to determine, for any system of equations
AX = B,
whether or not the system has a solution.    If it does, we can obtain a matrix equation, involving the generalized inverse, which exhibits this solution. Oddly enough, we need only the first property of a general-
■ »(D
ized inverse.    That is, we may use any matrix A , such that AA(1)A
A.
(1)
that
Definition.    IfAismxn, any n x m matrix a
satisfies AA^'a = A is called a  (1)-inverse of A. More generally, any matrix that satisfies any combination of the four requirements for the generalized inverse on page 4J^f,^^S^^^»^,^^^S^Mfc
Example. A (1, 2, 4)-inverse for A, is one that satisfies conditions   (1),   (2), and (4).
Theorem 2.8.4.1. Let A be m x n. The system of equations
has a solution if and only if B = AA(1)B, for any (1)
inverse"A[L oE A. In this case, the general solution is given by
X = a(1)b + (I
a(1)a)y
for an arbitrary n x 1 vector Y.
Proof.    Let b = AA^'b.    Then AX (1)
AA(1^B is
solved by X = A B. Suppose, conversely, that the system has a solution X.? AXQ = B.    Then,  for any
Generalized Inverse
55
(1)-inverse, A b
(1)
AXQ = (AA(1)A)XQ
AA(1)B.
Moreover, if X = A(1)B + (I - A(1)A)Y, then with B = AA(1)B,
AX = AA^ B + A(I
A(1)A)Y
= B + (A
AA(1)A)Y
B + 0 = B.
Therefore any such X is a solution.
To show that it is the general solution, we must
show that if AXq = b then Xg
for some Y.    Let R = X. - A^B (1)
AA     b = B - b = 0
A(1)b +   (I - A(1)A)Y
Then AR Now therefore R = R -
'     AX ■
(1) A      AR.
Hence, XQ = A^B + (I - i required form with Y = R.
(1)
A)R which is of the
In the numerical utilization of this theorem one should, of course, use some standard  (1)-inverse of A
such as A'. PROBLEMS
Show that if A is an m x n matrix and B is any
(1) -inverse of A,  then AB and BA are idempotent
of orders m and n respectively and BAB is a (1,2)-inverse of A.
Show that if A is m x n  (n x m), of rank m, then any  (1)-inverse of A is a right  (left)  inverse of A, and any right  (left)  inverse of A is a (1,2,3)-[(1,2,4)-]  inverse of A.
Consider two systems of equations:   (1) AX = B,
(2) CX = D. Find conditions such that every solution of  (1)   is a solution of (2).
What happens in Problem 3 if B = D = 0?
Prove that the matrix equation AXB = C has a
solution if and only if AA'CB'B = C.    In this case, the general solution is given by
Introductory Matrix Material
Generalized Inverse
57
X = A^CBT + Y - A'AYBB*
for an arbitrary Y.
2.8.5    The M-P Inverse and Least Square Problems
Let A be m * n, X and B be n x 1, and consider the system of equations '•• ^ M \
AX = B.
If the vector B lies in the range of A, then there exists one or more solutions to this system.    If the solution is not unique we might want toknow which solution has minimum norm.    If the vector B is not in the range of A, then there is no solution to the system, but it is often desirable to find a vector X in some way closest to a solution.    To this end,  for any X,  define the residual vector R = AX - B and consider
its Euclidean norm | | R| | = /R*R.    A least squares solution to the system is a vector Xn such that its * residual has minimum norm,   ^'i'hat is, '
| | |R | |  =   | |AX   - B| |  1 I I AX - B| |        for all n x 1
vectors X.
Theorem 2.8.5.1.    The system of equations AX = B always has a least squares solution.    This solution is unique it and only it the columns or A are "linearly independent■    In this case,  the unique least squares
solution is given by X = A'B.
Proof.    Let r[a) designate the range space of a
and by  [r(a)]-^- designate its orthogonal complement. Then we can write b = b^ +       where Bj^ is in r(a) and
b2 is in orthogonal complement  [r (a) ]"*-.    For any x,
ax is in r(a) as is ax - b^, hence is orthogonal to
b2.    Now ax - b - ax - B1 - b2.    Hence,  for any x,
||ax - b||2 * ||ax - bJ|2 * | [ b2 | 12 > ||b2||2.
Therefore  |j B0| |    is a lower bound for the values 2
[ I ax - bI I    and is achieved if and only if ax = b^. Since b. is in r(a), there is a solution x   to ax = b
For this vector Xg,
I Ir0II 2 = I|ax0 - bM2 = I|b2M2 1 Max - b||2,
so that the lower bound is achieved.
Since a unique solution to AX = b1 exists if and
only if the columns of A are linearly independent, the theorem is proved.
For any solution X^ to AX = B^,
Rq = AXq - B = B1 -   [B1 + B2) = -B2 is in  [R (A)
Therefore A*RQ = 0, or
A*(AXQ - B)  = 0,
or
A*AX„ = A*B. 0
These are the normal equations determining the least squares solution.
If the columns of A are independent, then r(A*A) = r(A)  = n, so that the n * n matrix A*A is nonsingular.    The least squares solution Xg is determined by A*AXq = A*B,   so that xq =   (A*A)-1A*b. But,
from our previous work. A' =  (A*A) A*.
Finally,  we take up the general case.
Lemma.    Let P = AA" t^Q = ^Jj^A.    Then,  if X and Y are arbitrary vectors "rc^nformaxHTe) ~
I|AX +  (I - P)Y||2 =  1|AX||2 +  ||(I - P)Y| j2
and
||A*Y +  (I - Q)X||2 =  ||A*Y||2 +  ||(I - Q)X||2.
Proof.     Since A = AA'A,  AX = AA'AX = PZ with Z = AX.    We now prove that PZ 1 (I - P)Y.    This is equivalent to  (PZ)*(I - P)Y = 0 or Z*P*(I - P)Y = 0. But
2 - V T
P* = P and P    =   (AA A)A    - AA   =  P. Therefore,
Introductory Matrix Material
Generalized Inverse
59
P*(I - P)  =0.    The first equality above now follows from Pythagoras'  theorem.    The second equality can be
derived from the first using A'' = A.
Another way of phrasing this work is that P is the projection onto the range space R(A) of A while I - P is the projection onto the orthogonal complement of R(A).
Theorem 2.8.5.2.    Let A be m x n and B be m x 1. Let
xg = A.B.: (i)
Then for any n x 1 X f X_, we have either
AX
AXr
or
(2)
I AX - BI I =
IX  II > IIX,
|AX0 - B|
and
Proof.    For any X we have
AX - B = AX - AATB + AATB - B
= A(X - ArB)  +   (I - AAV)(-B). By the previous lemma,
||AX - B||2 =   ||ACX - A*B)||2 +   ||(I - AA')(-B)|| =   ||A(X - X0)||2 +   ||AXQ - B||2 >   I|AXQ - B||2. The equality holds here if and only if A(X - XQ) = 0
Hence if ax f axq<  inequality (1) holds.
Suppose, then, that AX = AX
A'AA'B = A'B = X,
0"
Therefore, X
Then A'AX = A AX
0
x0 + (X
v -
A'B + (I - A'A)X. Hence by inequality (2) of the lemma,
so that
and
only if X = X. .
This theorem may be rephrased as follows. Given
the system AX = B.    Then the vector A'B is either the
unigue least squares solution or it is the least squares solution of minimum norm.
PROBLEM
1.    A is square and singular.    Characterize the solution A'B.
2.9 NORMAL MATRICES, QUADRATIC FORMS, AND FIELD OF VALUES
We record here a number of important facts.    By a normal matrix is meant a square matrix A for which
(2.9.1)
AA* = A*A.
Examples.    Hermitian,  skew-Hermitian,  and unitary matrices are normal.    Hence real symmetric"; sKew-symmetric, and ortnbgonal matrices are also normal. All circulants are"'no7maTT~a*s~w^"3Tta~rT^
Theorem 2.9.1.    a is normal if and only if there is a unitary u""and diagonal D such that A = (J*DU.
Theorem 2.9.2.    a is normal if and only if there is a polynomial p (x)  such that' A™*" = "pTSTT""""" "~~*"
Theorem 2.9.3. A is Hermitian if and only if there is a unitary''matrix U""an3 "*ai~real '"dfaTgbnaT D such" that"
a~~=tt*du~---*-*—---"
Theorem 2.9.4. A is (real) symmetric if and only if 'there is a (real) ortnogonal matrix u ana a reaT ' diagonal D such that a = U*D"tn ~
60
Introductory Matrix Material
Normal Matrices
61
PROBLEMS
1. Prove that A is normal if and only if A = R + is where R and S are real symmetric and commute.
2. Prove that A is normal if and only if in the polar decomposition of A  (A = HU with H positive semi-definite Hermitian, U unitary) one has HU = UH.
3. Let A have eigenvalues X^, An-    Prove that A is normal if and only if the eigenvalues of AA* are  |X1|2,   |X2|2, |Xj2.
4. Prove that A is normal if and only if the eigenvalues of A + A* are X^ + X^,        + A,,, ...f
X    + X . n n
5. If A is normal and p(z)  is a polynomial, then p (A)   is normal.
6. If A is normal, prove that A"  is normal.
7. If A and B are normal, prove that A 8 B is normal.
8. Use Theorem 2.9.1 to prove Theorem 2.9.2.
Quadratic Forms.    Let m be n x n and let Z =  [z , z.,, zn)T-    By a quadratic form is meant the function of z  ,   ...    zn given by
(2.9.2)       |m(Z)  = Z*MZ. '
It is often of importance to distinguish the quadratic form from a matrix that gives rise to it. The real and the complex cases are essentially different .
Lemma 2.9.5.    Let Q be real and square anc U a real colunin^^
-QT,  that is, if and only if Q is sikew-symmet£ic. Proof.
(a)     Let Q = ~QT.    If a = UTQU,  aT = a = UTQTU =
UT(-Q)U = -a.    Therefore a = 0.
(b)    Let UTQU = 0 for all   (real)   U.    Write Q = ^1 + ^2 w^ere ^1       symmetric and Q2 is skew-symmetric. Then, for all U
UTQU = VTQ1U + UTQ2U = UTQ]_U = 0.
Since       is symmetric, we have for some orthogonal P
and real diagonal matrix A:       = PTAP.    Therefore for
all real ux u5ptAPU =  (PU)TA(PU).    Write PU =
(Uj,, un), A = diag(Xir A  ) .    Then we have
^k=l Ak(%^ = 0 for a11  (ul' un)f hence for a11
(u^, uR).    This clearly implies X^ = 0,  for k =
1,  2, n.    Hence       = 0 and Q = Q2 = skew-symmetric.
Theorem 2.9.6.    Let Q and R be real square and u be a
real column. Then UTQU a UTRU for all U if and only if Q - R is skew-symmetric■
Proof.    UTQU = UTRU if and only if l)T(Q - R]o = 0.
Corollary.    Let Q be real and U be a real column.
Then ' "-"*-!—~-
T
(2.9.3)        uTQU = UT(-° t Q )U.
1 t
The matrix T(Q + Q )  is known as the symmetriza-
tion of Q. * -
We pass now to the complex case.
Lemma 2.9.7.    Let M be a square matrix with complex
elements and leTTJTj^ _a__co^
Then
Z*MZ = 0
for all complex Z if and only if M = 0. Proof
(a) The "if" is trivial.
(b) "Only if."    Write Z = X + iY, M = R + is
Introductory Matrix Material Normal Matrices 63
where X, Y,  R, S are all real.    Then we are given
(2.9.4)     (X* - iY*)(R + iS)(X + iY)  = 0        for all
real X, Y.
Select Y = 0. Then X*(R + iS)x = 0 for all real X or X*RX = 0 and X*SX = 0.    Therefore, by the first
T
lemma,  R and S must be skew-symmetric:  R + R =0,
m
S + S    = 0.    Expanding the product on the left side of  (2.9.4), we obtain
X*RX + iX*RY + iX*SX - X*SY - iY*RX + Y*RY
+ Y*SX + iY*SY.
In view of the skew symmetry of R and S and the first lemma, we have X*RX = X*SX = Y*RY = Y*SY = 0. Therefore, we have for all real X, Y:
(Y*SX - X*SY)   +  i(X*RY - Y*RX)   = 0
or
and
Y*SX = X*SY = Y*S*X
X*RY = Y*RX = X*R*Y.
Thus,   for all real X, Y,  X*(R - R*)Y = 0 and
Y*(S - 3*)X = 0.    Selecting X and Y as appropriate
unit vectors  (0r-<0,  1,  0, 0), this tells us that
R - R* = 0 and S - S* = 0.    But R* = RT = -R and S* =
ST = -S,  therefore R = S = 0 and M = 0.
Theorem 2.9.8.    Let M and N be square matrices of order n with complex elements and suppose that
(2.9.5)        Z*MZ = Z*NZ
for all complex vectors Z.    Then M = N.
Proof.    As before,  Z*MZ = Z*NZ if and only if Z* (M - N) Z = 0.
Note that this theorem is false if (2.9.5) only for real Z.
holds
Corollary. Z*MZ is real for all complex Z if and only if M is Hermitian.
Proof.    Z*MZ is real if and only if Z*MZ = (Z*MZ)* = Z*M*Z.     Hence M = M*.
Let M be a Hermitian matrix.    It is called positive definite if Z*MZ > 0 for all Z / 0.    It is called positive semidefinite if Z*MZ > 0 for all Z. It is called indefinite if "there exist Z   r 0 and Z2 f 0 such that Z*MZ1 > 0 > Z*MZ2-
Theorem 2.9.9.    Let M be a Hermitian matrix of prder
n with eigenvalues X^, (a)
k = 1, 2,
., X '. Then M is positive definite if and only if X. >
(b) M is positive semidefinite if and only if Xk > 0, k = 1,  2,  ..., n.
(c) M is indefinite if and only if there are integers j, k,  j f k, with X.  > 0,   X    < 0.
., ^.- -.. 3 K
Field of Values.    Let M designate a matrix of order n. The set of all complex numbers Z*MZ with  ||z|| = 1 is known as the field of values of M and is designated
by j?(VL)
designates the Euclidean norm of Z.
"The following facts, due to Hausdorff and Toeplitz, are known.
(1)
(2)
(2.9.6)
(3)
(2.9.7)
(4)
S1 (M) is a closed, bounded, connected, convex subset of the complex" plane".'""
The field of values is invariant under unitary .transforjnations:
jr(M)  =   JJ(U*MU) ,
U
unitary.
If ch M designates the convex hull of the eigenvalues of M, then
ch M    c ^(M) .
If M is normal, then  ,F(M)  = ch M.
(
64
Introductory Matrix Material
Normal Matrices 65
PROBLEMS
1.    Show that the field of values of a 2 x 2 matrix M is either an ellipse   (circle), a straight line segment, or a single point.    More specifically, by Schur's theorem**,  if one reduces M unitarily to upper triangular form,
Hadamard matrices:    Ahmed and Rao; Hall; Harmuth; Wallis, Street, and Wallis.
Generalized inverses:    Ben-Israel and Greville; Meyer.
UDV theorem:    Ben-Israel and Greville; Forsythe and Moler; Golub and Reinsch  (numerical methods).
M = CP
0
m A,
U unitary,
then
(a)    M is not normal if and only if in 0.
(a")    X^      \2- is the interior and
boundary of an ellipse with foci at A^,
Length
of major axis  (|m|2 +  |A, - A| 2)1/'2 .
^2,  length of minor axis is
(a")
1 '2'
5? (M) is the disk with center
(b)
at A^ and radius |m|/2. M is normal  (m = 0).
(b1) (b")
x1 *
2'
5s(M) is the line segment
joining A^ and
^"(M) is the single point A^.
REFERENCES
General:  Aitken,   [1]; Barnett and Story; Bellman, [2]; Browne; Eisele and Mason; Forsythe and Moler; Gant-macher; Lancaster,   [1]; MacDuffee; Marcus; Marcus and Mine; Muir and Metzler; Newman; M. Pearl; Pullman; Suprunenko and Tyshkevich; Todd; Turnbull and Aitken.
Vandermonde matrices: Gautschi.
Discrete Fourier transforms:    Aho, Hopcroft and Ullman; Carlitz;  Davis and Rabinowitz? Fiduccia; Flinn and McCowan;   Harmuth;  Nussbaumer; Winograd; J. Pearl.,
**Any square matrix is unitarily similar to an upper triangular matrix.