MASARYK UNIVERSITY

Brisk guide to maths

Jan Slovák, Martin Panák, Michal Bulant et al

Brno 2013

The work on the textbook has been supported by the project CZ.1.07/2.2.00/15.0203. INVESTMENTS IN EDUCATION DEVELOPMENT

Authors: Mgr. Michal Bulant, Ph.D., Mgr. Aleš Návrat, Dr. rer. nat., Mgr. Martin Panák, Ph.D., prof. RNDr. Jan Slovák, DrSc., RNDr. Michal Veselý, Ph.D.
Graphics and illustrations: Mgr. Petra Rychlá

© 2013 Masaryk University

Contents

Chapter 1. Initial warmup 3
1. Numbers and functions 3
2. Combinatorics 7
3. Difference equations 11
4. Probability 15
5. Plane geometry 23
6. Relations and mappings 36

Chapter 2. Elementary linear algebra 72
1. Vectors and matrices 72
2. Determinants 83
3. Vector spaces and linear mappings 92
4. Properties of linear mappings 108

Chapter 3. Linear models and matrix calculus 137
1. Linear processes 137
2. Difference equations 143
3. Iterated linear processes 150
4. More matrix calculus 157
5. Decompositions of the matrices and pseudoinversions 176

Chapter 4. Analytic geometry 207
1. Affine and euclidean geometry 207
2. Geometry of quadratic forms 227
3. Projective geometry 234

Chapter 5. Establishing the ZOO 254
1. Polynomial interpolation 254
2. Real numbers and limit processes 263
3. Derivatives 281
4. Power series 293

Chapter 6. Differential and integral calculus 347
1. Differentiation 347
2. Integration 364
3. Infinite series 382

Chapter 7. Continuous models 419
1. Fourier series 420
2. Metric spaces 432
3. Integral operators 448
4. Discrete transforms 455

Chapter 8. Continuous models with more variables 463
1. Functions and mappings on Rn 463
2. Integration for the second time 494
3. Differential equations 516
4. Notes about numerical methods 539

Preface

This textbook follows years of lecturing on Mathematics at the Faculty of Informatics at Masaryk University in Brno. The programme requires an introduction to genuine mathematical thinking and precision, but there is not much time dedicated to it. Thus, we want to cover seriously, but quickly, about as much of the mathematical methods as is usual in bigger courses in the classical Science and Technology programmes. At the same time, we do not want to give up the completeness and correctness of the mathematical exposition. We want to introduce and explain the more demanding parts of Mathematics, together with elementary explicit examples of how to use the results. But we do not want to decide for the reader how much theory or practice to enjoy, and in which order. These requirements have led us to the two-column format, where the rather theoretical explanations and the practical examples are split. This way, we want to please and help the readers to find their own way. Either to go through the examples and algorithms first, and then to come to the explanations of why things work, or the other way round. We also hope to overcome the usual stress of readers horrified by the amount of material. With our text, they are not supposed to read through everything in linear order. On the contrary, the readers should enjoy browsing through the text and finding their own paths.

In both columns, we intend to present a rather standard exposition of basic Mathematics, but focusing on the essence of the concepts and their relations. The examples solve simple mathematical problems, but we also try to show their use in mathematical models in practice as much as possible. We are aware that the theoretical text is written in a very compact way. A lot of details are left to the readers, in particular in the more difficult paragraphs.
Similarly, the examples display a variety ranging from very simple ones to those requesting some deeper thought. We would very much like to help the reader

• to formulate precise definitions of basic concepts and to prove simple mathematical results,
• to perceive the meaning of roughly formulated properties, relations and outlooks for exploring mathematical tools,
• to understand the instructions and algorithms creating mathematical models and to appreciate their usage.

The goals are ambitious, and nearly everyone needs his or her own path, including failures. This is one of the reasons why we come back to basic ideas and concepts several times with growing complexity and width of the discussions. Of course, this might also look chaotic, but we very much hope that this approach gives a much better chance to those who persist in their effort. Clearly this textbook cannot be the only source for everybody. Actually, the only really good procedure is to combine several sources and to think about their differences on the way. But we hope it should be a good beginning and help for everybody who is ready to return to the individual parts again and again.

To make this task simpler, we have added emotive icons. We hope they will not only enliven the dry mathematical text but also indicate which parts should rather be read carefully, or better jumped over in the first round. The usage of the icons follows the feelings of the authors, and we have tried to use them in a systematic way. Roughly speaking, there are icons warning of complexity and difficulty, icons indicating unpleasant technicality and the need for patience, and icons showing the joy of the game.

The practical column with the examples should be readable nearly independently of the theory. Without the ambition to know the deeper reasons why the algorithms work, it should be possible to read just this column. Some definitions and descriptions in the theoretical text are marked so that they can be caught easily when reading the examples, too. The examples and the theory are partly coordinated to allow jumping back and forth, but the links are not tight.

CHAPTER 1

Initial warmup

"value, difference, position" - what are they and how to comprehend them?

A. Numbers and functions

We can already work with natural, integer, rational and real numbers. We argue why rational numbers are not sufficient for us (although computers are actually not able to work with any other), and we recall the so-called complex numbers (because even the reals are not enough for some calculations).

1.1. Find some real number which is not rational.

Solution. One among many possible answers is √2. Already the ancient Greeks knew that if we prescribe the area of a square to be a² = 2, then we cannot find a rational a which satisfies it. Why? Assume that (p/q)² = 2 holds for natural numbers p and q that have no common divisor different from 1 (otherwise we could further reduce the fraction p/q). Then p² = 2q² is an even number, and so p is even as well, say p = 2r. But then p² = 4r² is divisible by 4, hence q² = 2r² is even, and so q must be even too. Thus p and q have 2 as a common factor, which is a contradiction. □

1.2. Remark. It can even be proven that the n-th root of a natural number, where n is natural, is either natural or is not rational (see ||G||).

The goal of the first chapter is to introduce the reader to the fascinating world of mathematical thinking.
For that we choose our examples of mathematical modelling of real situations using abstract objects and connections to be as specific as possible. We also go through a few topics and mechanisms to which we will subsequently return in the rest of the book, and at the end of the chapter we speak about the language of mathematics itself (which we will mostly use in an intuitive way).

The easier the objects and settings we work with are, the more difficult it is to understand in depth the nuances of the use of particular tools and mechanisms. Mostly it is possible to reach the core ideas only through their connections to others. Therefore we introduce them from many points of view at once. Changing the topics very often might be confusing, but it will surely get better when we return to specific ideas and notions in later chapters. The name of this chapter can also be understood as an encouragement to patience. Even the simplest tasks and ideas are easy only for those who have already seen similar ones. Full knowledge and mathematical thinking can be reached only through a long and complicated journey. Let us start with the simplest thing: common numbers.

1. Numbers and functions

Since the dawn of ages people wanted to know "how much" of something they had, or "how much" something is worth, "how long" a particular task will take, etc. The result of such ideas is usually some "number". We consider something to be a number if we can multiply it and add it, and it behaves according to the usual rules - either according to all the rules we expect, or only to some. For instance, the result of multiplication does not depend on the order of multiplicands, we have the number zero whose adding does not change the result, we have the number one which behaves in a similar manner with respect to multiplication, and so on. The simplest example are the so-called natural numbers, which we denote N = {0, 1, 2, 3, ...}. Note that we consider zero to be a natural number, as is usual especially in computer science. To count "one, two, three, ..." is learned already by little children at their pre-school age. Some time later we meet the integers Z = {..., -2, -1, 0, 1, 2, ...} and finally we get used to floating-point numbers, and we know what a 1.19-multiple of the price means thanks to the 19% tax.

1.3. Find all solutions to the equation x² = b for any real number b.

Solution. We know that this equation always has a solution x in the domain of real numbers whenever b is non-negative. If b < 0, then no such real x can exist. Thus we need to find a bigger domain where this equation has a solution. First we add to the real numbers a new number i, the so-called imaginary unit, and try to extend the definitions of addition and multiplication in order to preserve the usual behaviour of numbers (as summarised in 1.1). Clearly we need to be able to multiply the new number i by real numbers and to add it to real numbers. Therefore we need to work in our newly defined domain of complex numbers C with formal expressions of the form z = a + ib. In order to satisfy all the properties of associativity and distributivity, we define the addition so that we add the real parts and the imaginary parts independently. Similarly, we want the multiplication to behave as if we multiply tuples of real numbers, with the additional rule that i² = -1, that is,

(a + ib) + (c + id) = (a + c) + i(b + d),
(a + ib) · (c + id) = (ac - bd) + i(bc + ad).
□

The real number a is called the real part of the complex number z, the real number b is called the imaginary part of the complex number z, and we write re(z) = a, im(z) = b.

1.4. Verify that all the properties (KG1)-(KG4), (O1)-(O4) and (P) of scalars from 1.1 hold.

Solution. Zero is the number 0 + i0, one is the number 1 + i0; both these numbers are for simplicity denoted as before, that is, 0 and 1. All the properties are obtained by direct calculation. □

A complex number is given by a tuple of real numbers, therefore it is a point in the real plane R².

1.5. Show that the distance of the complex number z = a + ib from the origin (we denote it by |z|) is given by the expression √(z z̄), where z̄, the complex conjugate, is a - ib.

Solution. The product

z z̄ = (a² + b²) + i(-ab + ba) = a² + b²

is always a real number and indeed gives us the square of the distance from the number z to the origin. Thus it holds that |z|² = z z̄. □

1.1. Properties of numbers. In order to be able to work properly with numbers, we need to be more careful with their definition and properties. In mathematics, the basic statements about properties of objects, whose validity is assumed without the need to prove them, are called axioms. A good choice of axioms determines both the range of the theory they give rise to and its usability in mathematical models of reality. Let us now list the basic properties of the operations of addition and multiplication for our calculations with numbers, which we denote by a, b, c, .... Both operations take two numbers a, b, and by applying addition or multiplication we obtain the resulting values a + b and a · b.

Properties of scalars

Properties of numbers:
(KG1) (a + b) + c = a + (b + c), for all a, b, c
(KG2) a + b = b + a, for all a, b
(KG3) there exists 0 such that for all a it holds that a + 0 = a
(KG4) for all a there exists b such that a + b = 0

The properties (KG1)-(KG4) are called the properties of a commutative group. They are called associativity, commutativity, existence of the neutral element (when speaking of addition we usually say zero element), and existence of the inverse element (when speaking of addition we also say the negative of a and denote it by -a), respectively.

Properties of multiplication:
(O1) (a · b) · c = a · (b · c), for all a, b, c
(O2) a · b = b · a, for all a, b
(O3) there exists 1 such that for all a it holds that 1 · a = a
(O4) a · (b + c) = a · b + a · c, for all a, b, c.

The properties (O1)-(O4) are called associativity, commutativity, existence of the unit element, and distributivity of addition with respect to multiplication, respectively.

Sets with operations +, · that satisfy the properties (KG1)-(KG4) and (O1)-(O4) are called commutative rings.

Further properties of multiplication:
(P) for every a ≠ 0 there exists b such that a · b = 1,
(OI) if a · b = 0, then either a = 0 or b = 0.

The property (P) is called the existence of the inverse element with respect to multiplication (this element is then denoted by a⁻¹), and the property (OI) says that there exist no "divisors of zero".

The properties of the operations of addition and multiplication will be used very often, even when we do not know which objects we are really working with. In this way we obtain very general mathematical tools. However, it is good to have some idea of typical examples of the objects we work with.
The integers Z are a good example of a commutative group; the natural numbers are not, since they do not satisfy (KG4) (and possibly do not even contain the neutral element, if one does not consider zero to be natural). If a commutative ring also satisfies the property (P), we speak of a field (often also of a commutative field).

1.6. Remark. The distance |z| is also called the absolute value of the complex number z.

1.7. Polar form of complex numbers. Let us first consider complex numbers of the form z = cos φ + i sin φ.
Clearly we have n(n - 1) ⋯ (n - k + 1) possible results of a sequential choice of our k elements, but we obtain the same k-tuple in k! distinct orders. If we want to choose the items along with an ordering, we speak of a variation of k-th degree.
As we have just checked, the numbers of combinations and variations are given by the following formulas, which are not very effective for calculations with large k and n, since they contain factorials.
Combinations and variations

Proposition. For the number c(n, k) of combinations of k-th degree among n elements, where 0 ≤ k ≤ n, it holds that

(1.3)    c(n, k) = \binom{n}{k} = \frac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots 1} = \frac{n!}{(n-k)!\,k!}.

For the number v(n, k) of variations it holds that

(1.4)    v(n, k) = n(n-1)\cdots(n-k+1)

for all 0 ≤ k ≤ n (and zero otherwise).
We pronounce the binomial coefficient \binom{n}{k} as "n over k". The name stems from the so-called binomial expansion, which is the expansion of (a + b)^n. If we expand (a + b)^n, the coefficient at a^k b^{n-k} equals, for every 0 ≤ k ≤ n, exactly the number of ways to choose a k-tuple from the n parentheses in the product (from these parentheses we take a, from the others we take b). Therefore we have

(1.5)    (a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^k b^{n-k},
and note that for the derivation only distributivity, commutativity and associativity of multiplication and addition were necessary. The formula (1.5) therefore holds in every commutative ring.
Let us present another simple example of a mathematical proof - a few simple propositions about binomial coefficients. To simplify the formulations we define \binom{n}{k} = 0 whenever k < 0 or k > n.
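The formulas (1.3) and (1.4) are easy to check experimentally. Here is a minimal Python sketch (the function names c and v merely mirror the notation above):

```python
from math import comb, factorial

def c(n, k):
    """c(n, k) from (1.3): combinations of k-th degree among n elements."""
    return comb(n, k)

def v(n, k):
    """v(n, k) from (1.4): n(n-1)...(n-k+1), the number of variations."""
    return factorial(n) // factorial(n - k)

# An ordered choice is an unordered one together with one of k! orderings.
assert v(10, 3) == c(10, 3) * factorial(3) == 720
```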
1.7. Proposition. For all natural numbers k and n we have
speaker AB. The number of all orderings where B speaks right after A is then equal to the number of permutations of seven elements. Clearly, the same is the number of all orderings where A speaks right after B. Since the number of all possible orderings of eight speakers is 8!, the result is 8! - 2 · 7!. □
1.18. How many anagrams of the word PROBLEM are there, such that
a) the letters B and R are next to each other,
b) the letters B and R are not next to each other.
Solution. a) The pair of letters B and R can be treated as a single indivisible "double-letter". In total we then have six distinct letters, and there are 6! words of six indivisible letters. In our case we have to multiply this by two, since the double-letter can be either BR or RB. Thus the result is 2 · 6!.

b) 7! - 2 · 6!, the complement of part a) with respect to the number of all seven-letter words of distinct letters. □
1.19. In how many ways can an athlete place 10 distinct cups on 5 shelves, if all 10 cups fit on any single shelf?
Solution. Let us add 4 indistinguishable items, say separators, to the cups. The number of all distinct orderings of cups and separators is clearly 14!/4! (the separators are indistinguishable). Every placement of cups into shelves corresponds to exactly one ordering of cups and separators. It is enough to say that the cups before the first separator in the ordering are placed in the first shelf (preserving the order), the cups between the first and the second separator in the second shelf, and so on. Thus the number 14!/4! is the result. □
1.20. Determine the number of four-digit numbers with exactly two distinct digits.

Solution. The two distinct digits used in the number can be chosen in \binom{10}{2} ways; from the two chosen digits we can compose 2^4 - 2 distinct four-digit strings (we subtract the 2 strings that use only one of the digits). In total we have \binom{10}{2}(2^4 - 2) = 630 numbers. But in this way we have also counted the strings that start with zero. Of these there are \binom{9}{1}(2^3 - 1) = 63. Thus we have 630 - 63 = 567 numbers. □
1.21. Determine the number of even four-digit numbers composed of exactly two distinct digits.
Solution. Analogously to the previous example, let us first ignore the peculiarities of the digit zero. We thus obtain \binom{5}{2}(2^4 - 2) + 5 · 5(2^3 - 1) numbers (the first summand counts the numbers consisting only of even digits, the second summand gives the number of even four-digit numbers
(1) \binom{n}{k} = \binom{n}{n-k}
(2) \binom{n+1}{k+1} = \binom{n}{k} + \binom{n}{k+1}
(3) \sum_{k=0}^{n} \binom{n}{k} = 2^n
(4) \sum_{k=0}^{n} k \binom{n}{k} = n\, 2^{n-1}.
Proof. The first claim follows directly from the formula (1.3). If we expand the right-hand side of (2), we obtain

\frac{n!}{k!(n-k)!} + \frac{n!}{(k+1)!(n-k-1)!} = \frac{(k+1)\,n! + (n-k)\,n!}{(k+1)!(n-k)!} = \frac{(n+1)!}{(k+1)!(n-k)!},

which is the left-hand side of (2).
In order to prove (3), we use so-called mathematical induction. This tool is very suitable for statements saying that something should hold for every natural number n. Mathematical induction consists of two steps. In the first, base step, we assert the claim for n = 0 (in general, for the smallest n for which the claim should hold). In the second, inductive step, we assume that the claim holds for some n (and all smaller numbers) and using this we prove the claim for n + 1. Putting it together, we obtain that the claim holds for every n.
The claim (3) clearly holds for n = 0, since \binom{0}{0} = 1 = 2^0. (It is similarly easy for n = 1.) Now let us assume that the claim holds for some n and calculate the corresponding sum for n + 1, using the claims (2) and (3). We obtain
\sum_{k=0}^{n+1} \binom{n+1}{k} = \sum_{k=0}^{n+1} \left( \binom{n}{k-1} + \binom{n}{k} \right) = \sum_{k=-1}^{n} \binom{n}{k} + \sum_{k=0}^{n+1} \binom{n}{k} = 2^n + 2^n = 2^{n+1}.
Note that the formula (3) gives the number of all subsets of an n-element set, since \binom{n}{k} is the number of all subsets of size k. Note also that (3) follows from (1.5) by choosing a = b = 1.
To prove (4) we again employ induction, as in (3). For n = 0 the claim clearly holds. The inductive assumption says that (4) holds for some n. Let us now calculate the corresponding sum for n + 1 using (2) and the inductive assumption. We obtain
\sum_{k=0}^{n+1} k \binom{n+1}{k} = \sum_{k=0}^{n+1} k \left( \binom{n}{k-1} + \binom{n}{k} \right) = \sum_{k=-1}^{n} (k+1) \binom{n}{k} + \sum_{k=0}^{n+1} k \binom{n}{k}

= \sum_{k=0}^{n} \binom{n}{k} + \sum_{k=0}^{n} k \binom{n}{k} + \sum_{k=0}^{n} k \binom{n}{k} = 2^n + n\,2^{n-1} + n\,2^{n-1} = (n+1)\,2^n.
This completes the inductive step and the claim is proven for all natural n. □
with one digit even and one digit odd). Again we have to subtract the numbers that start with zero; of these there are (2^3 - 1) · 4 + (2^2 - 1) · 5. The final number is thus

\binom{5}{2}(2^4 - 2) + 5 \cdot 5(2^3 - 1) - (2^3 - 1) \cdot 4 - (2^2 - 1) \cdot 5 = 272.
□
1.22. There are 677 people at a concert. Do some of them have the same name initials?
Solution. There are 26 letters in the alphabet. Thus the number of all possible name initials is 26² = 676. Hence at least two people have the same initials. □
1.23. New players meet in a volleyball team (6 people). How many times do they shake hands when introducing to each other (everybody shakes with everybody)? How many times do they shake hands with the opponent after playing a match?
Solution. Every pair of players shakes hands at the introduction. The number of handshakes is then equal to the number of combinations c(6, 2) = \binom{6}{2} = 15. After a match, each of the six players shakes hands six times (once with each of the six opponents). Thus the number is 6² = 36. □
1.24. In how many ways can five people be seated in a car for five people, if only two of them have a driving licence? In how many ways can 20 passengers and two drivers be seated in a bus with 25 seats?
Solution. For the driver's seat we have two choices, and the other places are then arbitrary: for the second seat we have four choices, for the third three, then two, and then one. That makes 2 · 4! = 48 ways. Similarly, in the bus we have two choices for the driver, and then the second driver plus the passengers can be seated among the 24 remaining seats arbitrarily. Let us first choose the 21 seats to be occupied, that is, \binom{24}{21} ways; on these seats, the people can be placed in 21! ways. That makes 2 · \binom{24}{21} · 21! = 2 · 24!/3! ways. □
1.25. In how many ways can we insert into three distinct envelopes five identical 10-bills and five identical 100-bills such that no envelope stays empty?
Solution. Let us first compute the number of insertions ignoring the non-emptiness condition. Using the rule of product (we insert the 10-bills and the 100-bills independently), this is \binom{7}{2}^2, since for each denomination the number of insertions is the number of combinations with repetitions C(3, 5) = \binom{7}{2} = 21. Now we subtract the insertions where exactly one envelope is empty, and then the insertions where two envelopes are empty. We obtain

\binom{7}{2}^2 - 3\left(\binom{6}{1}^2 - 2\right) - 3 = 21^2 - 3(6^2 - 2) - 3 = 336.

□
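A brute-force verification of this count is straightforward; in the following Python sketch the helper splits is ours, purely for illustration:

```python
# Brute-force check of 1.25: five identical 10-bills and five identical
# 100-bills into three distinct envelopes, with no envelope left empty.
def splits(total):
    """All ordered ways to write `total` as a sum of three non-negative parts."""
    return [(i, j, total - i - j)
            for i in range(total + 1) for j in range(total + 1 - i)]

count = sum(1 for tens in splits(5) for hundreds in splits(5)
            if all(t + h > 0 for t, h in zip(tens, hundreds)))
assert count == 336
```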
The second property from our claim allows us to arrange all the binomial coefficients into the so-called Pascal triangle, where every number is obtained as the sum of the two coefficients situated right "above" it:
n = 0:                 1
n = 1:               1   1
n = 2:             1   2   1
n = 3:           1   3   3   1
n = 4:         1   4   6   4   1
n = 5:       1   5  10  10   5   1
Note that in the individual rows we have exactly the coefficients at the individual powers in the expansion (1.5); for instance, the last row given says

(a + b)^5 = a^5 + 5a^4 b + 10a^3 b^2 + 10a^2 b^3 + 5a b^4 + b^5.
1.8. Choice with repetitions. An ordering of n elements, some of which are indistinguishable, is called a permutation with repetitions.

Let there be, among the n given elements, p₁ elements of the first kind, p₂ elements of the second kind, ..., p_k elements of the k-th kind, where p₁ + p₂ + ⋯ + p_k = n. Then the number of permutations with repetitions of these elements is denoted P(p₁, ..., p_k).

As with permutations and combinations without repetitions: for the choice of the first element we have n possibilities, for the second n - 1, and so on, until the last element, for which only one choice remains. But we consider orderings that differ only in the order of indistinguishable elements to be identical. Since the elements of each kind can be reordered in pᵢ! ways, we have
Permutations with repetitions

P(p_1, \dots, p_k) = \frac{n!}{p_1! \cdots p_k!}
A free choice of k elements out of n, where order matters, is called a variation of k-th degree with repetitions; their number is denoted V(n, k). Free choice here means that we assume that for every choice we have the same number of possibilities - for instance, when we return the elements back before the next choice, when we repeatedly throw the same dice, and so on. The following clearly holds:
Variations with repetitions

V(n, k) = n^k
If we are interested in a choice without taking care of order, we speak of combinations with repetitions, and for their number we write C(n, k). At first sight, it does not seem easy to determine this number. The proof of the following theorem is typical for mathematics - we reduce the problem to another problem we have already dealt with. In our case it is a reduction to standard combinations without repetitions:
Combinations with repetitions

Theorem. The number of combinations with repetitions of k-th order from n elements equals, for every k ≥ 0 and n ≥ 1,

C(n, k) = \binom{n + k - 1}{k}.
1.26. Determine the number of distinct sentences which can arise by permuting letters in the individual words in the sentence "Skokan na koks" (the arising sentences and words do not have to make any sense).
Solution. Let us first compute the number of anagrams of the individual words. From the word "skokan" we obtain 6!/2 distinct anagrams (permutations with repetitions P(1, 1, 1, 1, 2)), "na" yields two, and "koks" yields 4!/2. Therefore, using the rule of product, we have (6!/2) · 2 · (4!/2) = 8640 sentences. □
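The product rule above is easy to confirm by enumerating the anagrams directly; a small Python sketch:

```python
from itertools import permutations

def anagrams(word):
    """Number of distinct orderings of the letters (a multiset permutation count)."""
    return len(set(permutations(word)))

assert anagrams("skokan") == 360 and anagrams("na") == 2 and anagrams("koks") == 12
assert anagrams("skokan") * anagrams("na") * anagrams("koks") == 8640
```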
1.27. How many distinct anagrams of the word "krakatit" are there, such that between the two letters "k" there is exactly one other letter?

Solution. In the considered anagrams there are exactly six possible placements of the group of the two "k", since the first of the two "k" can be at any of the positions 1-6. Once the spots for the two "k" are fixed, the remaining letters can be placed arbitrarily, that is, in P(1, 1, 2, 2) ways. Using the rule of product, we have

6 \cdot P(1, 1, 2, 2) = 6 \cdot \frac{6!}{2 \cdot 2} = 1080.
□
1.28. In how many ways can we insert five golf balls into five holes (into every hole one ball), if we have four white balls, four blue balls and three red balls?
Solution. Let us first solve the problem in the case where we have five balls of every colour. Then it amounts to a free choice of five elements from three possibilities (there is a choice among three colours for every hole), that is, variations with repetitions (see above). We have

V(3, 5) = 3^5.
Now let us subtract the configurations where there are either balls of only one colour (there are three such configurations) or exactly four red balls (there are 2 · 5 = 10 of those: we first choose the colour of the non-red ball in two ways, and then the hole it is in, in five ways). Thus we can do it in

3^5 - 3 - 10 = 230

ways.
□
1.29. In how many ways could the English Premier League have finished, if we know that no two of the three teams Newcastle United, Fulham and Tottenham Hotspur are "adjacent" in the final table? (There are 20 teams in the league.)
Solution. First approach. We use the inclusion-exclusion principle. From the number of all possible resulting tables we subtract the tables
Proof. The proof is based on a trick (a simple one, as soon as we understand it). We show two different approaches.
Assume first that we are drawing cards from a deck of n different cards, and in order to make it possible to draw some card multiple times, we add to the deck k - 1 different jokers (we definitely want to draw at least one of the original cards). Say that we have drawn r original cards and s jokers, so that r + s = k. It seems that we should devise a method of assigning "substitute" jokers to original cards, so that we know how many times we have drawn each original card. But we actually need to discuss only the number of ways to do that.

For that we can use mathematical induction and assume that the claim holds for all smaller arguments n and k. We need the number of combinations with repetitions of s-th order from the r drawn original cards, which gives \binom{s + r - 1}{s} = \binom{k - 1}{s}, and this is exactly the number of combinations without repetitions of s-th order from all the k - 1 jokers. Thus the theorem is proven.
Alternative approach (induction-free): Over the set

S = {a₁, ..., aₙ},

from which we choose the combination, we fix an ordering of the elements, and for our choices of elements of S we prepare n boxes into which we place (in the fixed order) the elements of S (one element into every box).
The individual chosen elements xᵢ ∈ S are then placed into the box which already contains this element. Now let us realise that in order to recover the original combination, we just need to know how many elements there are in the individual boxes. For instance, the arrangement

a | b b b | c c | d
∗ | ∗ ∗ ∗ | ∗ ∗ | ∗

determines the choice b, b, c from the set S = {a, b, c, d}.
In the general case of a choice of k elements from n possible ones, we have a chain of n + k symbols, and the number C(n, k) equals the number of possible placements of the box walls | among the individual elements. This amounts to the choice of n - 1 positions from the n + k - 1 possible ones. Since

\binom{n+k-1}{n-1} = \binom{n+k-1}{k},

the theorem is proven (for the second time).
□
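The statement of the theorem can be checked against a direct enumeration, e.g. with Python's standard library:

```python
from itertools import combinations_with_replacement
from math import comb

# C(n, k) from the theorem versus a direct enumeration of multisets.
n, k = 4, 3
count = sum(1 for _ in combinations_with_replacement(range(n), k))
assert count == comb(n + k - 1, k) == 20
```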
3. Difference equations
In the previous paragraphs we saw formulas which determined the value of a scalar function defined on the natural numbers (the factorial) or on tuples of natural numbers (the binomial coefficients) using already defined values. In the paragraph 1.5 the binomial coefficients are defined by a directly computable formula, but we can also understand them through the relationship exhibited in 1.8: instead of the value of the function we give the difference corresponding to a change of the variable.

Such an approach appears very often when formulating mathematical models that describe real systems in economy, biology, etc. We will look at only a few simple examples and will return to this topic later.
where some two of the three teams are adjacent, and then add back the tables where all three teams are adjacent. The number is then

20! - 3 \cdot 2! \cdot 19! + 3! \cdot 18! = 1741445647958016000.
Second approach. Let us consider the three teams as "separators". The remaining 17 teams have to be distributed so that between any two separators there is at least one team. The remaining teams can be arbitrarily permuted, as can the separators. Thus we have

\binom{18}{3} \cdot 17! \cdot 3! = 1741445647958016000

ways.
□
1.30. For any fixed n ∈ N determine the number of all solutions to the equation

x₁ + x₂ + ⋯ + x_k = n

on the set of non-negative integers.
Solution. Every solution (r₁, ..., r_k), \sum_{i=1}^{k} r_i = n, can be uniquely encoded as a sequence of ones and separators, where we first write r₁ ones, then a separator, then r₂ ones, then another separator, and so on. Such a sequence then clearly contains n ones and k - 1 separators. Conversely, every such sequence clearly determines some solution of the given equation. Thus there are exactly as many solutions as there are such sequences, that is,

\binom{n + k - 1}{k - 1}.

□
C. Difference equations
Difference equations (also called recurrence relations) are relations between the elements of some sequence, where an element of the sequence depends on the previous elements. To solve a difference equation means to find an explicit formula for the n-th (that is, an arbitrary) element of the sequence. The recurrence relation alone allows us to compute the n-th element only by computing all the previous elements.

If an element of the sequence is determined only by the immediately preceding element, we speak of a first order difference equation. These appear in the real world, for instance when we want to find out how long the repayment of a loan will take for a fixed monthly repayment, or how much we shall pay per month if we want to repay a loan within a fixed time.
1.31. Michael wants to buy a new car. The car costs €30,000. Michael wants to take out a loan and repay it with fixed monthly repayments. The car company offers him a loan with a yearly interest rate of 6%. Michael would like to finish repaying the loan in three years. How much should he pay per month?
1.9. Linear difference equations of first order. A general difference equation of first order is an expression of the form

f(n + 1) = F(n, f(n)),

where F is a known scalar function of two arguments. If we know the "initial" value f(0), we can compute f(1) = F(0, f(0)), then f(2) = F(1, f(1)), and so on. Using this approach we can compute the value f(n) for an arbitrary n ∈ N. Note that this idea resembles the construction of the natural numbers from the empty set, or the principle of mathematical induction.
An example of such equation is the definition of the factorial function:
(n + 1)! = (n + 1) · n!

We see that the value f(n + 1) depends on both n and the value f(n).
Another very simple example is f(n) = C for some fixed scalar C and all n, and the so-called linear difference equation of first order

(1.6)    f(n + 1) = a · f(n) + b,

where a ≠ 0 and b are known scalars.
Such a difference equation is easy to solve if b = 0. Then it is the well-known recurrent definition of the geometric progression, and it holds that

f(1) = a f(0), f(2) = a f(1) = a² f(0), and so on.

Thus for all n we have

f(n) = aⁿ f(0).
This is also the relation for the so-called Malthusian population growth model, which is based on the assumption that during a given time interval the population grows with a constant ratio a relative to its state before the interval.
We will prove a general result for first order equations which are similar to the linear ones but allow varying coefficients a and b,

(1.7)    f(n + 1) = aₙ · f(n) + bₙ.
First let us think about what such equations can describe.
The linear difference equation (1.6) can be nicely interpreted as a mathematical model in finance, e.g. for savings or a loan repayment with a fixed interest rate a and a fixed payment b (the cases of savings and loans differ only in the sign of b).
With varying parameters a and b we obtain a similar model with varying interest rate and repayment. We can imagine for instance that n is the number of months,
aₙ is the interest rate in the n-th month, and bₙ the repayment in the n-th month. Do not be afraid of the seemingly difficult calculations in the following result. It is a typical example of a technical mathematical statement for which it is hard to "guess" precisely how it should be formulated. On the other hand, it is then a simple exercise on the properties of scalars and mathematical induction to prove it. Really interesting are the corollaries, see 1.11 later.
In the formulation we use, along with the usual notation Σ for sums, the similar notation Π for products. In the rest of the text we will also use the convention that when the index set is empty, the sum is zero and the product is one.
Solution. Let s denote the amount Michael has to pay per month. After the first month Michael repays s; part of it repays the loan, part of it pays the interest. Let d_k stand for the debt after k months. After the first month the debt is

d₁ = 30000 - s + (0.06/12) · 30000.

In general, after the k-th month we have

(1.1)    d_k = d_{k-1} - s + (0.06/12) · d_{k-1}.
Using the relation (1.9), d_k is given by

d_k = \left(1 + \frac{0.06}{12}\right)^k \cdot 30000 - \left(\left(1 + \frac{0.06}{12}\right)^k - 1\right) \cdot \frac{12 s}{0.06}.
Repaying the loan in three years means d₃₆ = 0; thus we obtain

(1.2)    s = 30000 \cdot \frac{0.06/12}{1 - (1 + 0.06/12)^{-36}} \approx 912.7.

□
Note that the formula (1.9) from 1.11 can be used in our case only as long as all the values d_k are positive, that is, as long as Michael still has to repay something.
1.32. Consider the situation from the previous example. How long would Michael have to pay if he repaid €500 per month?
Solution. Setting q = 1 + 0.06/12 = 1.005 and c = 30000, the condition d_k = 0 gives the equation

q^k = \frac{200 s}{200 s - c};

by taking logarithms of both sides we obtain

k = \frac{\ln(200 s) - \ln(200 s - c)}{\ln q},

which for s = 500 gives approximately k ≈ 71.5. Thus Michael would be paying for six years (and the last repayment would be less than €500). □
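Both loan examples can be replayed by iterating the recurrence (1.1) directly; a minimal Python sketch (the function name months_to_repay is ours):

```python
# Iterating the recurrence (1.1): d_k = d_{k-1} - s + (0.06/12) * d_{k-1}.
def months_to_repay(s, debt=30000.0, monthly_rate=0.06 / 12):
    k = 0
    while debt > 0:
        debt = debt * (1 + monthly_rate) - s
        k += 1
    return k

assert months_to_repay(912.7) == 36  # the fixed payment computed in (1.2)
assert months_to_repay(500.0) == 72  # k = 71.5: the 72nd payment is partial
```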
1.33. Determine the sequence {yₙ}ₙ₌₁^∞ which satisfies the following recurrence relation:

y_{n+1} = \frac{3 y_n}{8} + 1,  n ≥ 1,  y₁ = 1.
Linear recurrence can appear for instance in geometric problems:
1.10. Proposition. The general solution of the first order difference equation (1.7) with the initial condition f(0) = y₀ is given by the formula

(1.8)    f(n) = \left(\prod_{i=0}^{n-1} a_i\right) y_0 + \sum_{j=0}^{n-2} \left(\prod_{i=j+1}^{n-1} a_i\right) b_j + b_{n-1}.
Proof. We will prove the proposition using mathematical induction. The claim clearly holds for n = 1, where it amounts directly to the definition f(1) = a₀ y₀ + b₀.

Assuming that the statement holds for some fixed n, we can easily compute

f(n+1) = a_n \left( \left(\prod_{i=0}^{n-1} a_i\right) y_0 + \sum_{j=0}^{n-2} \left(\prod_{i=j+1}^{n-1} a_i\right) b_j + b_{n-1} \right) + b_n = \left(\prod_{i=0}^{n} a_i\right) y_0 + \sum_{j=0}^{n-1} \left(\prod_{i=j+1}^{n} a_i\right) b_j + b_n,

as can be seen directly by multiplying out. □
Let us again note that for the proof we did not need anything about the scalars except the properties of a commutative ring.
1.11. Corollary. The general solution of the linear difference equation (1.6) with a ≠ 1 and the initial condition f(0) = y₀ is

(1.9)    f(n) = a^n y_0 + \frac{1 - a^n}{1 - a}\, b.
Proof. If we set aᵢ = a and bᵢ = b constant and use the general formula (1.8), we obtain

f(n) = a^n y_0 + b\left(1 + a + \cdots + a^{n-1}\right).

The sum of this geometric progression can be computed using the formula 1 - a^n = (1 - a)(1 + a + ⋯ + a^{n-1}), and that yields the required result. □
Note that for calculating the sum of the geometric progression we required the existence of the inverse element for non-zero scalars. We could not do that with integers only. Thus the last result holds for fields of scalars, and we can use it for linear difference equations where the coefficients a, b and the initial condition f(0) = y₀ are rational, real or complex numbers; and also in the ring of remainder classes Z_k with k prime (we will define remainder classes in paragraph 1.41).
It is noteworthy that the formula (1.9) actually holds even with integer coefficients and an integer initial condition. Then we know in advance that all f(n) are integers, and the integers are a subset of the rational numbers. Thus our formula necessarily gives the correct integer solutions.
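A quick numerical sanity check of the closed form (1.9) against the iteration of (1.6), sketched in Python:

```python
# The closed form (1.9) versus direct iteration of f(n + 1) = a f(n) + b.
a, b, y0 = 0.5, 2.0, 1.0
f = y0
for n in range(1, 11):
    f = a * f + b
    assert abs(f - (a**n * y0 + (1 - a**n) / (1 - a) * b)) < 1e-12
```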
Observing the proof in more detail, we see that 1 — a" is always divisible by 1 — a, thus the last paragraph should not have surprised
1.34. Suppose n lines divide the plane into areas. What is the maximal number of areas that can arise this way?

Solution. Let the number of areas be pₙ. If there is no line in the plane, then the whole plane is one area, thus p₀ = 1. If there are n lines, then adding the (n + 1)-st line increases the number of areas by the number of areas this new line intersects. If no two lines are parallel and no three lines intersect at the same point, the number of areas the (n + 1)-st line crosses equals one plus the number of its intersections with the previous lines (each crossed area is divided into two, so the total number increases by one at every crossing). The new line has at most n intersections with the already present n lines, and the segment of the line between two intersections crosses exactly one area; thus the new line crosses at most n + 1 areas. Before adding the line there were at most pₙ areas (by the definition of pₙ).
Thus we obtain the recurrence relation
p_{n+1} = p_n + (n + 1),

from which we obtain an explicit formula for pₙ either by applying Proposition 1.10 or directly:
p_n = p_{n-1} + n = p_{n-2} + (n-1) + n = p_{n-3} + (n-2) + (n-1) + n = \cdots = p_0 + 1 + 2 + \cdots + n = 1 + \frac{n(n+1)}{2} = \frac{n^2 + n + 2}{2}.
□
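The recurrence and the closed formula are easily compared by a short loop; a Python sketch:

```python
# p_{n+1} = p_n + (n + 1) versus the closed form (n^2 + n + 2) / 2.
p = 1                              # p_0: no lines, one area
for n in range(1, 21):
    p += n
    assert p == (n * n + n + 2) // 2
```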
Recurrence relations can be of higher order than the first. Let us present examples of combinatorial problems in whose solutions recurrence relations can be used.
us. However, it can be seen that with scalars from Z₄ and, say, a = 3 we fail, since 1 - a = 2 is a divisor of zero there.
1.12. Nonlinear example. Let us return for a while to the first order equation (1.6), which we used for a very primitive model of population growth directly proportional to the momentary population size p. At first sight it is clear that such a model with a > 1 leads to a very rapid and unbounded growth.

A more realistic model has such a population change Δp(n) = p(n + 1) - p(n) only for small values of p, that is, Δp/p ≈ r > 0. Thus, if we want the population to grow by 5% per time interval when p is small, we choose r = 0.05. For some limit value p = K > 0 the population does not grow any more, and for even greater values it decreases (since, for instance, the resources for feeding the population are limited, individuals in a big population are obstacles to each other, etc.).
Let us assume that the values yₙ = Δp(n)/p(n) change linearly in p(n). Graphically, we can imagine this dependence as a line in the plane of the variables p and y which passes through the points [0, r] (when p = 0 we have y = r) and [K, 0] (the second condition: when p = K the population does not change). Thus we set

y = -\frac{r}{K}\, p + r.

Setting y = yₙ = \frac{p(n+1) - p(n)}{p(n)}, we obtain

\frac{p(n+1) - p(n)}{p(n)} = -\frac{r}{K}\, p(n) + r,

that is, by multiplying out, a difference equation of first order in which the value p(n) appears in both the first and the second power:

(1.10)    p(n + 1) = p(n) \left(1 - \frac{r}{K}\, p(n) + r\right).
Try to think through the behaviour of this model for various values of r and K. In the picture we can see the values for the parameters r = 0.05 (that is, five percent growth in the ideal state), K = 100 (the resources limit the population to the size 100), and p(0) = 2 individuals.
Note that the originally almost exponential growth slows down later, and the values approach the desired limit of 100 individuals. For p close to zero and K much greater than r, the right side of the equation (1.10) is approximately p(n)(1 + r), that is, the behaviour is similar to that of the Malthusian model. On the other hand, for p almost equal to K the right side of the equation is approximately p(n). For an initial value of p greater than K the values will decrease, for a smaller one they will grow; thus the system will basically oscillate about the value K.
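The described behaviour can be reproduced by iterating (1.10) directly; a minimal Python sketch with the parameters used above:

```python
# Iterating (1.10) with r = 0.05, K = 100 and p(0) = 2, as in the picture.
r, K, p = 0.05, 100.0, 2.0
for _ in range(1000):
    p = p * (1 - (r / K) * p + r)
assert abs(p - K) < 1e-6  # the population settles at the limit value K
```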
1.35. How many words of length 12 are there that consist only of the letters A and B but do not contain the sub-word BBB?
Solution. Let aₙ denote the number of words of length n consisting of the letters A and B but without BBB as a sub-word. Then for aₙ (n > 3) the following recurrence holds:

a_n = a_{n-1} + a_{n-2} + a_{n-3},

since the words of length n satisfying the condition end either with an A, or with AB, or with ABB. There are a_{n-1} admissible words ending with an A (preceding the last A there can be an arbitrary word of length n - 1 satisfying the condition), and analogously for the two remaining groups. Further, we can easily compute a₁ = 2, a₂ = 4, a₃ = 7. Using the recurrence relation we can then compute

a₁₂ = 1705.
We could also derive an explicit formula for the n-th element of the sequence using the theory we have developed. The characteristic polynomial of the recurrence relation is x³ - x² - x - 1, with one real and two complex roots, which we can express using the relations (1.12).
□
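Both the recurrence and the value a₁₂ = 1705 can be confirmed by brute force over all 2¹² words; a Python sketch:

```python
from itertools import product

# Brute force versus the recurrence a_n = a_{n-1} + a_{n-2} + a_{n-3}.
def count_words(n):
    return sum(1 for w in product("AB", repeat=n) if "BBB" not in "".join(w))

a = [2, 4, 7]                      # a_1, a_2, a_3
for _ in range(4, 13):             # extend up to a_12
    a.append(a[-1] + a[-2] + a[-3])
assert a[-1] == count_words(12) == 1705
```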
1.36. The score of a basketball match between the teams of the Czech Republic and Russia is 12 : 9 for the Russian team after the first quarter. In how many ways could the score have developed?
Solution. If we denote by P_{(k,l)} the number of ways in which the score of a quarter ending k : l could have developed, then for k, l ≥ 3 the following recurrence relation holds:

P_{(k,l)} = P_{(k-3,l)} + P_{(k-2,l)} + P_{(k-1,l)} + P_{(k,l-1)} + P_{(k,l-2)} + P_{(k,l-3)}.

(We can divide all possible evolutions of a quarter with the final score k : l into six mutually exclusive possibilities, according to which team scored last and how much the score was worth: 1, 2 or 3 points.) By the symmetry of the problem, it clearly holds that P_{(k,l)} = P_{(l,k)}. Further
4. Probability
Let us have a look at another frequent example of scalar-valued functions - observed values that are often known neither explicitly by a formula nor implicitly by some description. They are the result of some randomness, and we try to describe the probability of some outcome happening.
1.13. What is probability? As a simple example we can use common six-sided dice throwing, with sides labelled as
1, 2, 3, 4, 5, 6.
If we describe a mathematical model of such throwing with a "fair" dice, we expect and thus also require that every side occurs with the same frequency. In words, we say that "every side chosen in advance occurs with the probability 1/6".
But if you try to manufacture such a dice from wood with a knife, you will probably observe that the relative frequencies of the sides are not the same. In such a situation we can, after a large number of tries, count the relative frequencies of the individual labels and set these to be the probabilities in our mathematical description. But no matter how large the number of tries is, we cannot exclude the possibility that all the tries were some unlikely combination of results, and thus that our model is not well chosen.
In the following part we will work with an abstract mathematical description of probability in the simplest approach. The question of how accurate or adequate it is for a specific real-world problem lies outside the realm of mathematics. But that does not mean that such questions are not for mathematicians, quite the opposite (most likely in cooperation with experts in the given area). Later we will return to probability and see it as a theory describing the behaviour of random processes, or of fully deterministic processes where not all the determining parameters are known.
Mathematical statistics allows us to say how much we can expect a given model to correspond to reality, or allows us to determine the parameters of the model in such a way that the correspondence with the observations is high, while at the same time estimating the reliability of the chosen model.
Both probability and statistics require a complex mathematical theory, which we will build over the course of a few semesters.
Using the example of our dice we can imagine it as follows: in probability theory we work with the parameters pᵢ for the probabilities of the individual sides, and we only require that these probabilities are non-negative and that their sum is

p₁ + p₂ + p₃ + p₄ + p₅ + p₆ = 1.
When choosing specific values pᵢ for a specific dice, mathematical statistics then lets us estimate the reliability of our mathematical model of the dice.
Our humble goal for now is just to indicate how to capture probabilistic considerations abstractly in formal mathematical objects. The following paragraphs are thus basically just exercises in simple operations with sets and in combinatorics (that is, in counting the numbers of possibilities satisfying given conditions in finite sets).
we have for k ≥ 3:

P_{(k,2)} = P_{(k-3,2)} + P_{(k-2,2)} + P_{(k-1,2)} + P_{(k,1)} + P_{(k,0)},
P_{(k,1)} = P_{(k-3,1)} + P_{(k-2,1)} + P_{(k-1,1)} + P_{(k,0)},
P_{(k,0)} = P_{(k-3,0)} + P_{(k-2,0)} + P_{(k-1,0)},

which along with the initial conditions P_{(0,0)} = 1, P_{(1,0)} = 1, P_{(2,0)} = 2, P_{(3,0)} = 4, P_{(1,1)} = 2, P_{(2,1)} = P_{(1,1)} + P_{(0,1)} + P_{(2,0)} = 5, P_{(2,2)} = P_{(0,2)} + P_{(1,2)} + P_{(2,1)} + P_{(2,0)} = 14 gives, step by step,

P_{(12,9)} = 497178513.
□
Remark. We see that the recurrence relation in this problem has a more complex form than those we have dealt with in our theory, and thus we cannot evaluate an arbitrary P_{(k,l)} explicitly; we can evaluate it only by successive computation from the previous elements. Such an equation is called a partial difference equation, since the elements of the equation are indexed by two independent variables (k, l).
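Such successive computation is exactly what memoisation automates; a Python sketch evaluating the relation above (with negative indices treated as zero, which reproduces the initial conditions):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def P(k, l):
    if k < 0 or l < 0:
        return 0
    if k == 0 and l == 0:
        return 1
    return (P(k - 3, l) + P(k - 2, l) + P(k - 1, l)
            + P(k, l - 1) + P(k, l - 2) + P(k, l - 3))

assert P(12, 9) == 497178513       # the value computed above
```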
We will talk more about recurrent formulas (difference equations) of higher orders with constant coefficients in chapter 3.
D. Probability
Let us state a few simple exercises on classical probability, where we deal with an experiment with only a finite number of outcomes ("all cases") and we are interested in whether the outcome of the experiment belongs to a given subset of possible outcomes ("favourable cases"). The probability we are trying to determine then equals the number of favourable cases divided by the total number of all cases. Classical probability can be used when we assume (know) that each of the possible outcomes has the same probability of happening (for instance, in fair dice throwing).
1.37. What is the probability that a roll of a dice results in a number greater than 4?
Solution. There are six possible outcomes (the set {1, 2, 3, 4, 5, 6}) of which two are favourable ({5, 6}). Thus the probability is 2/6 = 1/3.
□
1.38. We randomly choose a group of five people from a group of eight men and four women. What is the probability that there are at least three women in the chosen group?
Solution. We compute the probability as the quotient of the number of favourable cases and the number of all cases. We divide the number of five-member groups containing at least three women by the number of all five-member groups.
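For a problem of this size, the favourable and all cases can simply be enumerated; a Python sketch:

```python
from itertools import combinations

# All 5-member groups from 8 men ("M") and 4 women ("W"); favourable: >= 3 women.
people = ["M"] * 8 + ["W"] * 4
groups = list(combinations(range(12), 5))
favourable = [g for g in groups if sum(people[i] == "W" for i in g) >= 3]
assert (len(favourable), len(groups)) == (120, 792)  # probability 120/792 = 5/33
```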
1.14. Random events. We work with a non-empty, fixed set Ω of all possible outcomes, which we call the sample space. For simplicity, the set Ω is finite, with elements ω₁, ..., ωₙ corresponding to the individual possible outcomes. Every subset A ⊂ Ω represents a possible event. A set 𝒜 of subsets of the sample space is called the set of events if

• Ω ∈ 𝒜 (the sample space is an event),
• if A, B ∈ 𝒜, then A \ B ∈ 𝒜 (that is, for every two events their set difference is also an event),
• if A, B ∈ 𝒜, then A ∪ B ∈ 𝒜 (that is, for every two events their union is also an event).
Clearly also the complement Aᶜ = Ω \ A of an event A is an event, which we call the opposite event to the event A. The intersection of two events is again an event, since for every two subsets A, B ⊂ Ω we have A ∩ B = A \ (Ω \ B).

The rotation around a point w = (w_x, w_y) by the angle ψ maps a vector v = (x, y) to

R_ψ · (v - w) + w = (cos ψ (x - w_x) - sin ψ (y - w_y) + w_x, sin ψ (x - w_x) + cos ψ (y - w_y) + w_y).
1.32. Reflection. Another well-known example of a mapping which preserves lengths is the so-called reflection through a line. Again, it suffices to describe reflections through lines that pass through the origin O; all other reflections can be derived using shifts and rotations.

Let us look for the matrix of the reflection with respect to the line whose direction is given by a unit vector v such that the angle between v and the vector (1, 0) is ψ. Let us first realise that
Z_0 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}

is the matrix of the reflection with respect to the line with direction (1, 0).
In general, we can rotate any line so that it has the direction (1, 0), and thus we can write a general reflection matrix as

Z_ψ = R_ψ · Z_0 · R_{-ψ},

where we first rotate via the matrix R_{-ψ} so that the line is in the "zero" position, reflect with the matrix Z₀, and return back with the rotation R_ψ.
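The composition Z_ψ = R_ψ Z₀ R_{-ψ} can be checked numerically; a Python sketch using numpy (our choice of tool, purely for illustration):

```python
import numpy as np

def R(psi):
    """Rotation of the plane by the angle psi."""
    return np.array([[np.cos(psi), -np.sin(psi)],
                     [np.sin(psi),  np.cos(psi)]])

Z0 = np.array([[1.0, 0.0],
               [0.0, -1.0]])
psi = 0.7                      # an arbitrary test angle
Z = R(psi) @ Z0 @ R(-psi)      # reflection across the line at angle psi
v = np.array([np.cos(psi), np.sin(psi)])
assert np.allclose(Z @ v, v)             # the line's direction stays fixed
assert np.allclose(Z @ Z, np.eye(2))     # reflecting twice gives the identity
```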
(or vice versa). The vectors in case (a) are thus perpendicular, that is, their scalar product is zero,

which means that the linear mapping given by this matrix is the projection onto the x axis. Similarly we can see that the matrix A₂ determines the reflection with respect to the y axis. The matrix A₃ can be expressed in the form of a rotation matrix. The determinant det A
satisfies all the three conditions we wanted. How many such mappings could there possibly be? Every vector can be expressed using the two basis vectors e₁ = (1, 0) and e₂ = (0, 1), and by linearity every possibility for vol A is uniquely determined by its values on this pair of vectors. Since for the area, in the same way as for the determinant, clearly vol A(e₁, e₁) = vol A(e₂, e₂) = 0 (due to the required antisymmetry), every such scalar function is necessarily determined by its value on the single tuple of arguments (e₁, e₂). Therefore all the possibilities are equal up to a scalar multiple, which can be determined by the condition

vol A(e₁, e₂) = 1/2,

that is, we choose the orientation and the scale through the choice of the basis vectors, and we want the unit square to have area equal to one.

Thus we see that the determinant gives the area of the parallelogram determined by the columns of the matrix A, and the area of the triangle is one half of that.
1.35. Visibility in the plane. The previous description of the value of the oriented area gives us an elegant tool for determining the position of a point relative to oriented line segments. By an oriented line segment we mean two points in the plane R² with a fixed order. We can imagine it as an arrow from one point to the other. Such an oriented line segment divides the plane into two half-planes; let us call them "left" and "right". We want to be able to tell whether a given point is in the left or in the right half-plane.

Such tasks are often met in computer graphics when dealing with the visibility of objects. For simplicity, we can imagine that a line segment can be "seen" from the points to the right of it and cannot be seen from the points to the left of it (this corresponds to the notion that an object bounded by line segments oriented counterclockwise has its interior to the left of the segments, and the segments cannot be seen through from the interior).
Solution. Expanding both sides of the equation in the coordinates u = (u₁, u₂), v = (v₁, v₂) yields:

2(‖u‖² + ‖v‖²) = 2(u₁² + u₂² + v₁² + v₂²)
= u₁² + 2u₁v₁ + v₁² + u₂² + 2u₂v₂ + v₂² + u₁² - 2u₁v₁ + v₁² + u₂² - 2u₂v₂ + v₂²
= (u₁ + v₁)² + (u₂ + v₂)² + (u₁ - v₁)² + (u₂ - v₂)²
= ‖u + v‖² + ‖u - v‖².
□
1.81. Show that the composition of an odd number of point reflections in the plane is again a point reflection.

Solution. The point reflection in the plane across the point S is given by the formula X ↦ S - (X - S), that is, X ↦ 2S - X. (The image of the point X in this reflection is obtained by adding to S the vector opposite to X - S.) Successive application of three point reflections across the points S, T and U thus yields

X ↦ 2S - X ↦ 2T - (2S - X) ↦ 2U - (2T - (2S - X)) = 2(U - T + S) - X,

that is, X ↦ 2(U - T + S) - X, which is the point reflection across the point S - T + U. A composition of any odd number of point reflections can thus be reduced step by step to a composition of three point reflections, hence it is a point reflection (in principle this is a proof by mathematical induction; try to formulate it by yourself). □
1.82. Construct a (2n + 1)-gon, given the middle points of all its sides.

Solution. We use the fact that the composition of an odd number of point reflections is again a point reflection (see the previous exercise). Denote the vertices of the (2n + 1)-gon we are looking for by A₁, A₂, ..., A_{2n+1}, and the middle points of its sides (starting from the middle point of A₁A₂) by S₁, S₂, ..., S_{2n+1}. If we carry out the point reflections across the middle points in this order, then clearly the point A₁ is a fixed point of the resulting point reflection, and thus it is its centre. In order to find it, it is enough to carry out this composed point reflection with an arbitrary point X of the plane: the point A₁ then lies in the middle of the line segment XX′, where X′ is the image of X. The remaining vertices A₂, ..., A_{2n+1} can be obtained by successively mapping the point A₁ by the point reflections across the points S₁, ..., S_{2n}. □
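The construction can be turned into a computation: the centre A₁ is the alternating sum S₁ - S₂ + ⋯ + S_{2n+1}. A Python sketch with a made-up pentagon of midpoints:

```python
import numpy as np

def vertices_from_midpoints(S):
    """Recover the vertices of a (2n+1)-gon from the midpoints of its sides."""
    signs = [(-1) ** i for i in range(len(S))]
    A = [sum(s * P for s, P in zip(signs, S))]   # A_1 = S_1 - S_2 + ... + S_(2n+1)
    for Si in S[:-1]:
        A.append(2 * Si - A[-1])                 # reflect the last vertex across S_i
    return A

S = [np.array(p, float) for p in [(0, 0), (2, 0), (3, 2), (1, 3), (-1, 1)]]
A = vertices_from_midpoints(S)
for i in range(len(S)):                          # each S_i is the midpoint of a side
    assert np.allclose((A[i] + A[(i + 1) % len(S)]) / 2, S[i])
```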
From our considerations it can be seen that in the Euclidean plane two vectors are perpendicular whenever their scalar product is zero.

In the case of a real vector space of any dimension we shall try a similar approach, because the concept of the angle between two vectors is always a two-dimensional one (we want the angle to be the same in the two-dimensional subspace containing u and v). In the following paragraphs we shall consider only finite-dimensional vector spaces over the real scalars R.
Scalar product and perpendicularity

A scalar product on a vector space V over the real numbers is a mapping ( , ) : V × V → R which is symmetric in its arguments, linear in each of its arguments, and such that (v, v) ≥ 0, with ‖v‖² = (v, v) = 0 if and only if v = 0.

The number ‖v‖ = √(v, v) is called the size of the vector v.

Vectors v and w ∈ V are called orthogonal or perpendicular whenever (v, w) = 0; we also write v ⊥ w. The vector v is called normalised whenever ‖v‖ = 1.

A basis of the space V composed of orthogonal vectors only is called an orthogonal basis. If the vectors are additionally normalised, it is an orthonormal basis.
the sum of the images of the vectors, and the image of a multiple of a vector is the multiple of the image of the vector. These properties are shared by the mappings stated at the start of this paragraph. Such a mapping is then uniquely determined by its behaviour on the vectors of a basis (in the plane, by the images of two vectors not lying on the same line; in space, by the images of three vectors not lying in the same plane).

And how do we write down a linear mapping f on a vector space V? Let us start for simplicity with the plane R²: assume that the image of the point (vector) (1, 0) is (a, b) and the image of the point (vector) (0, 1) is (c, d). This uniquely determines the image of an arbitrary point with coordinates (u, v):

f((u, v)) = f(u(1, 0) + v(0, 1)) = u f((1, 0)) + v f((0, 1)) = (ua, ub) + (vc, vd) = (au + cv, bu + dv),

which can be efficiently written down as follows:

\begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} au + cv \\ bu + dv \end{pmatrix}.
A linear mapping is thus uniquely determined by a matrix. Furthermore, when we have another linear mapping $g$ given by the matrix $\begin{pmatrix} e & f \\ g & h \end{pmatrix}$, then we can easily compute (an interested reader can fill in the details by himself) that their composition $g \circ f$ is given by the matrix
$$\begin{pmatrix} e & f \\ g & h \end{pmatrix} \cdot \begin{pmatrix} a & c \\ b & d \end{pmatrix} = \begin{pmatrix} ea + fb & ec + fd \\ ga + hb & gc + hd \end{pmatrix}.$$
This leads us to defining matrix multiplication in exactly this way: we want the application of a mapping to a vector to be given by the multiplication of the matrix of the mapping with the given vector, and the composition of mappings to be given by the product of the corresponding matrices. It works analogously in spaces of higher dimension. Furthermore, this again shows what has already been proven in (2.5), namely that matrix multiplication is associative but not commutative, because this is so for the composition of mappings. That is another motivation for investigating vector spaces.
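As a quick numerical sanity check of this convention (a numpy sketch of ours, not part of the exposition), applying $g \circ f$ agrees with multiplying by the product of the two matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))   # matrix of f
B = rng.standard_normal((2, 2))   # matrix of g
x = rng.standard_normal(2)

# composition g(f(x)) versus the product matrix (B @ A) applied to x
assert np.allclose(B @ (A @ x), (B @ A) @ x)
# non-commutativity: B @ A and A @ B differ in general
print(np.allclose(A @ B, B @ A))  # almost surely False
```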
Let us now recall that already in the first chapter we worked with matrices of some linear mappings in the plane $\mathbb{R}^2$, notably with the rotation around a point and with axial symmetry (see 1.31 and 1.32).
Let us now try to write down matrices of linear mappings from $\mathbb{R}^3$ to $\mathbb{R}^3$. What does the matrix of a rotation in three dimensions look like? Let us begin with the special (easier to describe) rotations about the coordinate axes:
2.64. Matrices of rotations about the coordinate axes in $\mathbb{R}^3$. Write down the matrices of the rotations by the angle $\varphi$ about the three coordinate axes.
For a general basis $(u_1, \dots, u_n)$ and vectors $x = \sum_i x_i u_i$, $y = \sum_j y_j u_j$ we obtain, by bilinearity,
$$\langle x, y\rangle = \Big\langle \sum_i x_i u_i, \sum_j y_j u_j \Big\rangle = \sum_{i,j} x_i y_j \langle u_i, u_j\rangle = \sum_{i,j} s_{ij}\, x_i y_j,$$
where $S = (s_{ij})$ is the symmetric matrix with entries $s_{ij} = \langle u_i, u_j\rangle$.
If the basis is orthonormal, the matrix S is the unit matrix. This proves the following useful claim:
Scalar product and orthonormal basis
Proposition. In every orthonormal basis the scalar product is given in coordinates by the expression
$$\langle x, y\rangle = x^T \cdot y.$$
For every general basis of the space $V$ there is a symmetric matrix $S$ such that the coordinate expression of the scalar product is
$$\langle x, y\rangle = x^T \cdot S \cdot y.$$
2.41. Orthogonal complements and projections. For every fixed subspace $W \subseteq V$ in a space with scalar product we define its orthogonal complement as
$$W^\perp = \{u \in V;\ u \perp v \text{ for all } v \in W\}.$$
Directly from the definition it is clear that $W^\perp$ is a vector subspace. If $W \subseteq V$ has a basis $(u_1, \dots, u_k)$, the condition for $W^\perp$ is given as $k$ homogeneous equations for $n$ variables. Thus $W^\perp$ will have dimension at least $n - k$. Also, $u \in W \cap W^\perp$ means $\langle u, u\rangle = 0$ and thus also $u = 0$ by the definition of the scalar product. Clearly then the whole space $V$ is the direct sum
$$V = W \oplus W^\perp.$$
A linear mapping $f : V \to V$ on any vector space is called a projection if
$$f \circ f = f.$$
In such a case, for every vector $v \in V$ we have
$$v = f(v) + \big(v - f(v)\big) \in \mathrm{Im}(f) + \mathrm{Ker}(f) = V,$$
and if $v \in \mathrm{Im}(f)$ and $f(v) = 0$ then also $v = 0$. The above sum of subspaces is therefore direct. We say that $f$ is a projection on the subspace $W = \mathrm{Im}(f)$ along the subspace $U = \mathrm{Ker}(f)$. In words, the projection can be described naturally as follows: we decompose the given vector into its components in $W$ and in $U$ and forget the second one.
If V has a scalar product, we say that the projection is perpendicular if the kernel is perpendicular to the image.
Solution. When rotating any particular point about a given coordinate axis (say $x$), the corresponding coordinate ($x$) does not change, and the remaining two coordinates transform by the rotation in the plane which we already know (a matrix of type $2 \times 2$).
Thus we gradually obtain the following matrices. Rotation about the axis $z$:
$$\begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix},$$
rotation about the axis $y$:
$$\begin{pmatrix} \cos\varphi & 0 & \sin\varphi \\ 0 & 1 & 0 \\ -\sin\varphi & 0 & \cos\varphi \end{pmatrix},$$
rotation about the axis $x$:
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\varphi & -\sin\varphi \\ 0 & \sin\varphi & \cos\varphi \end{pmatrix}.$$
The sign at $\varphi$ in the matrix for the rotation about $y$ is different. We want, as with any other rotation, the rotation about the $y$ axis to be in the positive sense, that is, when we look against the direction of the $y$ axis, the world turns anti-clockwise. The signs in the matrices depend on the orientation of our coordinate system. Usually, in the 3-dimensional space the so-called right-handed coordinate system is chosen: if we place our right hand on the $x$ axis such that the fingers point in the direction of the axis and such that we can rotate the $x$ axis in the $xy$ plane so that it coincides with the $y$ axis and they point in the same direction, then the thumb points in the direction of the $z$ axis. In such a system the rotation about $y$ in the positive sense is a rotation in the negative sense in the plane $xz$ (that is, the axis $z$ turns in the direction towards $x$). Think about the positive and negative sense of the rotations about all three axes. □ The knowledge of these matrices allows us to write down the matrix of the rotation about any (oriented) axis. Let us start with a specific example:
2.65. Find the matrix of the rotation in the positive sense through the angle $\pi/3$ about the line passing through the origin with the oriented directional vector $(1, 1, 0)$, under the standard basis of $\mathbb{R}^3$.
Solution. The given rotation can be obtained by composing the following three mappings:
• rotation through the angle $\pi/4$ in the negative sense about the axis $z$ (the axis of the rotation goes over to the $x$ axis);
• rotation through the angle $\pi/3$ in the positive sense about the $x$ axis;
• rotation through the angle $\pi/4$ in the positive sense about the $z$ axis (the $x$ axis goes back to the axis of the rotation).
Every subspace $W \subseteq V$ thus defines a perpendicular projection onto $W$. It is the projection on $W$ along $W^\perp$, given by the unique decomposition of every vector $u$ into components $u_W \in W$ and $u_{W^\perp} \in W^\perp$; that is, the linear mapping which maps $u_W + u_{W^\perp}$ to $u_W$.
2.42. Existence of orthonormal bases. Note that on every finite-dimensional real vector space there definitely exist scalar products: just pick any basis, declare it orthonormal, and we immediately have a scalar product. In this basis the scalar products are computed as in the formula of the Theorem 2.40.
But we can also go the other way. If we are given a scalar product on a vector space $V$, we can easily use suitable perpendicular projections to transform any basis into an orthonormal one. This is the Gram-Schmidt orthogonalisation process. The point of this procedure is to transform a given sequence of nonzero generators $v_1, \dots, v_k$ of a finite-dimensional space $V$ into an orthogonal set of nonzero generators of $V$.
Gram-Schmidt orthogonalisation
Proposition. Let $(u_1, \dots, u_k)$ be a linearly independent $k$-tuple of vectors of a space $V$ with scalar product. Then there exists an orthogonal system of vectors $(v_1, \dots, v_k)$ such that $v_i \in \langle u_1, \dots, u_i\rangle$, $i = 1, \dots, k$. We obtain it by the following procedure:
• The independence of the vectors $u_i$ ensures that $u_1 \neq 0$; we choose $v_1 = u_1$.
• If we have already constructed the vectors $v_1, \dots, v_\ell$ of the required properties, we choose $v_{\ell+1} = u_{\ell+1} + a_1 v_1 + \cdots + a_\ell v_\ell$, where $a_i = -\dfrac{\langle u_{\ell+1}, v_i\rangle}{\|v_i\|^2}$.
Proof. Let us begin with the first (nonzero) vector $v_1$ and calculate the perpendicular projection $v_2$ of $u_2$ onto
$$\langle v_1\rangle^\perp \subseteq \langle v_1, u_2\rangle.$$
The result is nonzero if and only if $u_2$ is independent of $v_1$. In all further steps we work similarly.
In the $\ell$-th step we want that for $v_{\ell+1} = u_{\ell+1} + a_1 v_1 + \cdots + a_\ell v_\ell$ we have $\langle v_{\ell+1}, v_i\rangle = 0$ for all $i = 1, \dots, \ell$. That implies
$$0 = \langle u_{\ell+1} + a_1 v_1 + \cdots + a_\ell v_\ell,\ v_i\rangle = \langle u_{\ell+1}, v_i\rangle + a_i \langle v_i, v_i\rangle,$$
and we can see that the vectors with the desired properties are determined uniquely up to scalar multiples. □
Whenever we have an orthogonal basis of a vector space $V$, we just normalise the vectors in order to obtain an orthonormal basis. Thus we have proven:
Corollary. On every finite-dimensional real vector space with scalar product there exists an orthonormal basis.
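The procedure from the proposition is directly implementable. Here is a minimal numpy sketch (ours; it assumes the input vectors are linearly independent) which also performs the final normalisation:

```python
import numpy as np

def gram_schmidt(U):
    """Columns of U are linearly independent vectors u_1, ..., u_k.
    Returns a matrix whose columns form an orthonormal basis of span(U)."""
    V = []
    for u in U.T:
        v = u.copy()
        for w in V:                      # subtract projections on previous vectors
            v -= (u @ w) * w             # w is already normalised
        V.append(v / np.linalg.norm(v))  # normalise to get an orthonormal basis
    return np.column_stack(V)

Q = gram_schmidt(np.array([[1., 1.], [1., 0.], [0., 1.]]))
print(np.round(Q.T @ Q, 10))  # identity matrix: the columns are orthonormal
```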
In an orthonormal basis, coordinates and perpendicular projections are very easy to calculate. Indeed, let us have an orthonormal basis $(e_1, \dots, e_n)$ of a space $V$. Then every vector $v = x_1 e_1 + \cdots + x_n e_n$ satisfies
$$\langle e_i, v\rangle = \langle e_i,\ x_1 e_1 + \cdots + x_n e_n\rangle = x_i,$$
and it always holds that
$$(2.3)\qquad v = \langle e_1, v\rangle e_1 + \cdots + \langle e_n, v\rangle e_n.$$
The matrix of the resulting rotation is the product of the matrices corresponding to the three given mappings, where the order of the matrices is given by the order in which the mappings are applied: the mapping applied first is the rightmost one in the product. Thus we obtain the desired matrix
$$
\begin{pmatrix} \frac{\sqrt2}{2} & -\frac{\sqrt2}{2} & 0 \\ \frac{\sqrt2}{2} & \frac{\sqrt2}{2} & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \frac12 & -\frac{\sqrt3}{2} \\ 0 & \frac{\sqrt3}{2} & \frac12 \end{pmatrix}
\begin{pmatrix} \frac{\sqrt2}{2} & \frac{\sqrt2}{2} & 0 \\ -\frac{\sqrt2}{2} & \frac{\sqrt2}{2} & 0 \\ 0 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} \frac34 & \frac14 & \frac{\sqrt6}{4} \\ \frac14 & \frac34 & -\frac{\sqrt6}{4} \\ -\frac{\sqrt6}{4} & \frac{\sqrt6}{4} & \frac12 \end{pmatrix}.
$$
Note that the resulting rotation could also be obtained, for instance, as the composition of the following three mappings:
• rotation through the angle $\pi/4$ in the positive sense about the axis $z$ (the axis of rotation goes over to the axis $y$);
• rotation through the angle $\pi/3$ in the positive sense about the axis $y$;
• rotation through the angle $\pi/4$ in the negative sense about the axis $z$ (the axis $y$ goes back to the axis of rotation).
Analogously we obtain
$$
\begin{pmatrix} \frac{\sqrt2}{2} & \frac{\sqrt2}{2} & 0 \\ -\frac{\sqrt2}{2} & \frac{\sqrt2}{2} & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \frac12 & 0 & \frac{\sqrt3}{2} \\ 0 & 1 & 0 \\ -\frac{\sqrt3}{2} & 0 & \frac12 \end{pmatrix}
\begin{pmatrix} \frac{\sqrt2}{2} & -\frac{\sqrt2}{2} & 0 \\ \frac{\sqrt2}{2} & \frac{\sqrt2}{2} & 0 \\ 0 & 0 & 1 \end{pmatrix}
= \begin{pmatrix} \frac34 & \frac14 & \frac{\sqrt6}{4} \\ \frac14 & \frac34 & -\frac{\sqrt6}{4} \\ -\frac{\sqrt6}{4} & \frac{\sqrt6}{4} & \frac12 \end{pmatrix}.
$$
□
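The whole computation is easy to check numerically. A small sketch (ours, in numpy; the functions implement the three rotation matrices from 2.64):

```python
import numpy as np

def Rz(phi):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def Rx(phi):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def Ry(phi):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

M1 = Rz(np.pi/4) @ Rx(np.pi/3) @ Rz(-np.pi/4)
M2 = Rz(-np.pi/4) @ Ry(np.pi/3) @ Rz(np.pi/4)
print(np.allclose(M1, M2))                    # True: both compositions agree
axis = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
print(np.allclose(M1 @ axis, axis))           # True: the axis stays fixed
```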
2.66. Matrix of a general rotation in $\mathbb{R}^3$. Derive the matrix of a general rotation in $\mathbb{R}^3$.
Solution. We can proceed as in the previous example, now with general values. Consider an arbitrary unit vector $(x, y, z)$ and compose the following mappings:
i) rotation $\mathcal{R}_1$ in the positive sense about the $z$ axis which takes the line with the directional vector $(x, y, z)$ into the plane $xz$; its matrix is
$$R_1 = \begin{pmatrix} \frac{x}{\sqrt{1-z^2}} & \frac{y}{\sqrt{1-z^2}} & 0 \\ -\frac{y}{\sqrt{1-z^2}} & \frac{x}{\sqrt{1-z^2}} & 0 \\ 0 & 0 & 1 \end{pmatrix},$$
mapping the vector $(x, y, z)$ to $(\sqrt{1-z^2}, 0, z)$;
If we are given a subspace $W \subseteq V$ and its orthonormal basis $(e_1, \dots, e_k)$, we can surely extend it to an orthonormal basis $(e_1, \dots, e_n)$ of the whole $V$. The perpendicular projection of a general vector $v \in V$ on $W$ is then given by the relation
$$v \mapsto \langle e_1, v\rangle e_1 + \cdots + \langle e_k, v\rangle e_k.$$
Thus for the perpendicular projection it is enough to know an orthonormal basis of the subspace $W$ on which we are projecting.
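A sketch of this projection formula (ours, in numpy; the orthonormal basis is obtained here by a QR decomposition rather than an explicit Gram-Schmidt run):

```python
import numpy as np

def project(v, W):
    """Perpendicular projection of v onto the column span of W."""
    E, _ = np.linalg.qr(W)     # orthonormal basis e_1, ..., e_k of the span
    return E @ (E.T @ v)       # sum of <e_i, v> e_i

W = np.array([[1., 1.], [1., 0.], [0., 1.]])
v = np.array([1., 2., 3.])
p = project(v, W)
print(np.allclose(W.T @ (v - p), 0))   # True: v - p is perpendicular to W
```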
Let us also note that, in general, the projection $f$ on a subspace $W$ along $U$ and the projection $g$ on $U$ along $W$ are tied by the relation $g = \mathrm{id}_V - f$. When dealing with perpendicular projections on a given subspace $W$, it is therefore always more efficient to calculate an orthonormal basis of whichever of $W$ and $W^\perp$ has the smaller dimension.
Let us also note that the existence of an orthonormal basis ensures that for every real space $V$ of dimension $n$ with scalar product there exists a linear mapping which is an isomorphism between $V$ and the space $\mathbb{R}^n$ with the standard scalar product. This has in essence been shown already in the Theorem 2.40, where the desired isomorphism is exactly the coordinate assignment. In words: in an orthonormal basis the scalar product is computed in coordinates by the same formula as the standard scalar product in $\mathbb{R}^n$.
We shall return to the questions of the size of a vector and of projections in the following chapters in a more general context.
2.43. Angle of two vectors. As we have already noted, the angle of two linearly independent vectors in space must be the same as when we consider them in the two-dimensional subspace they generate. Basically, this is the reason why the notion of angle is independent of the dimension of the ambient space: if we choose an orthogonal basis whose first two vectors generate the same subspace as the two given vectors $u$ and $v$ (whose angle we are measuring), we can simply take the definition from planar geometry. Even without choosing the basis it must hold that:
Angle of two vectors
The angle $\varphi$ of two vectors $u$ and $v$ is given by the relation
$$\cos\varphi = \frac{\langle u, v\rangle}{\|u\|\,\|v\|}.$$
A scalar product is a special case of a bilinear mapping $\alpha : V \times V \to \mathbb{K}$, where for any four vectors $u, v, w, z$ and scalars $a, b, c, d$ we have
$$\alpha(au + bv,\ cw + dz) = ac\,\alpha(u, w) + ad\,\alpha(u, z) + bc\,\alpha(v, w) + bd\,\alpha(v, z).$$
ii) rotation $\mathcal{R}_2$ in the positive sense about the $y$ axis through the angle with cosine $\sqrt{1-z^2}$ and sine $z$, under which the line with the directional vector $(\sqrt{1-z^2}, 0, z)$ goes over to the line with the directional vector $(1, 0, 0)$; the matrix of this rotation is
$$R_2 = \begin{pmatrix} \sqrt{1-z^2} & 0 & z \\ 0 & 1 & 0 \\ -z & 0 & \sqrt{1-z^2} \end{pmatrix};$$
iii) rotation $\mathcal{R}_3$ in the positive sense about the $x$ axis through the angle $\varphi$, with the matrix $R_3$ from 2.64. The matrix of the rotation through the angle $\varphi$ about the (oriented) axis $(x, y, z)$ is then the product
$$R = R_1^{-1} \cdot R_2^{-1} \cdot R_3 \cdot R_2 \cdot R_1.$$
Plugging a fixed vector into the second argument of such a form $\alpha$ we obtain a linear form $\alpha(\,\cdot\,, v) \in V^*$, which is the image of this vector under the mapping $V \to V^*$, $v \mapsto \alpha(\,\cdot\,, v)$. If we choose a fixed basis of a finite-dimensional space $V$ and the dual basis of $V^*$, then this mapping is given in coordinates by a matrix $A$:
$$x \mapsto \big(y \mapsto y^T \cdot A \cdot x\big).$$
4. Properties of linear mappings
A more detailed analysis of the properties of various types of linear mappings will now lead us to a better understanding of the tools which vector spaces offer for the modelling of linear processes and systems.
2.45. Let us begin with four examples in the lowest interesting dimension. In the standard basis of the plane $\mathbb{R}^2$ with the standard scalar product we consider the following matrices of mappings $f : \mathbb{R}^2 \to \mathbb{R}^2$:
$$A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad C = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}, \quad D = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$
The matrix $A$ gives the perpendicular projection along the subspace
$$W = \{(0, a);\ a \in \mathbb{R}\} \subseteq \mathbb{R}^2$$
The transition matrix for changing the basis from the standard basis to the basis $f$ is then given by
on the subspace
The matrix of the mapping in the basis $f$ is then given by
$$T^{-1} A\, T.$$
□
2.68. Consider the vector space of polynomials in one variable of degree at most 2 with real coefficients. In this space, consider the basis $1, x, x^2$. Write down the matrix of the derivative mapping in this basis and also in the basis $1 + x^2,\ x,\ x + x^2$.
Solution. $\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & 1 & 1 \\ 2 & 1 & 3 \\ 0 & -1 & -1 \end{pmatrix}$. □
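The second matrix can be checked by a change of basis: if the columns of $T$ hold the coordinates of the new basis vectors, the matrix in the new basis is $T^{-1} D\, T$. A numpy sketch of ours:

```python
import numpy as np

D = np.array([[0., 1., 0.],   # derivative in the basis (1, x, x^2)
              [0., 0., 2.],
              [0., 0., 0.]])
# columns: coordinates of 1 + x^2, x, x + x^2 in the basis (1, x, x^2)
T = np.array([[1., 0., 0.],
              [0., 1., 1.],
              [1., 0., 1.]])
print(np.linalg.inv(T) @ D @ T)
# [[ 0.  1.  1.]
#  [ 2.  1.  3.]
#  [ 0. -1. -1.]]
```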
2.69. In the standard basis of $\mathbb{R}^3$ determine the matrix of the rotation through the angle $90°$ in the positive sense about the line $(t, t, t)$, $t \in \mathbb{R}$, oriented in the direction of the vector $(1, 1, 1)$. Further, give the matrix of this rotation in the basis
$$g = \big((1, 1, 0),\ (1, 0, -1),\ (0, 1, 1)\big).$$
Solution. We can easily determine the matrix of the given rotation in a suitable basis, namely in the basis given by the directional vector of the line and by two mutually perpendicular vectors in the plane $x + y + z = 0$, that is, in the plane of vectors perpendicular to $(1, 1, 1)$. Note that the matrix of the rotation in the positive sense through $90°$ in some orthonormal basis of $\mathbb{R}^2$ is
$$\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$
In an orthogonal basis whose vectors have sizes $k$ and $l$ it is
$$\begin{pmatrix} 0 & -l/k \\ k/l & 0 \end{pmatrix}.$$
If we choose the perpendicular vectors $(1, -1, 0)$ and $(1, 1, -2)$ in the plane $x + y + z = 0$, with sizes $\sqrt2$ and $\sqrt6$, then in the basis $f = \big((1, 1, 1), (1, -1, 0), (1, 1, -2)\big)$ the rotation we are looking for has the matrix
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -\sqrt3 \\ 0 & \frac{1}{\sqrt3} & 0 \end{pmatrix}.$$
In order to obtain the matrix of the rotation in the standard basis, it is enough to change the basis. The transition matrix $T$ for changing the basis from the basis $f$ to the standard basis is obtained by writing the coordinates (with respect to the standard basis) of the vectors of the basis $f$ as the columns of $T$:
$$T = \begin{pmatrix} 1 & 1 & 1 \\ 1 & -1 & 1 \\ 1 & 0 & -2 \end{pmatrix}.$$
Finally, for the desired matrix $R$ we have
$$V = \{(a, 0);\ a \in \mathbb{R}\} \subseteq \mathbb{R}^2,$$
that is, the projection on the $x$-axis along the $y$-axis. Evidently for this mapping $f : \mathbb{R}^2 \to \mathbb{R}^2$ it holds that $f \circ f = f$, and thus the restriction $f|_V$ of the given mapping to its image is the identity mapping. The kernel of $f$ is exactly the subspace $W$.
The matrix $B$ has the property $B^2 = 0$, therefore the same holds for the corresponding mapping $f$. We can view it as the differentiation mapping on the polynomials $\mathbb{R}_1[x]$ of degree at most one in the basis $(1, x)$ (we shall deal with differentiation in chapter five, see ??).
The matrix $C$ gives a mapping $f$ which stretches the first vector of the basis $a$-times and the second $b$-times. Therefore the whole plane splits into two subspaces which are preserved by the mapping and on which the mapping acts as a homothety, that is, as scaling by a scalar multiple (the first case was the special case $a = 1$, $b = 0$). For instance, the choice $a = 1$, $b = -1$ corresponds to the axial symmetry (mirror symmetry) across the $x$-axis, which is the same as complex conjugation $x + iy \mapsto x - iy$ on the two-dimensional real space $\mathbb{R}^2 \simeq \mathbb{C}$ in the basis $(1, i)$. This is a linear mapping of the two-dimensional real vector space $\mathbb{C}$, but not of the one-dimensional complex space $\mathbb{C}$.
The matrix $D$ is the matrix of the rotation through the right angle in the standard basis, and at first sight we can see that no one-dimensional subspace is preserved by this mapping.
Such a rotation is a bijection of the plane onto itself, therefore we can surely find distinct bases in the domain and codomain in which its matrix is the unit matrix $E$ (we simply take any basis of the domain and its image in the codomain). But we are not able to do this with the same basis for both the domain and the codomain. Let us view the matrix $D$ as the matrix of a mapping $g : \mathbb{C}^2 \to \mathbb{C}^2$ in the standard basis of the complex vector space $\mathbb{C}^2$. Then we can find vectors $u = (i, 1)$, $v = (-i, 1)$, for which we have
$$\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} i \\ 1 \end{pmatrix} = i \cdot u, \qquad \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} -i \\ 1 \end{pmatrix} = -i \cdot v.$$
That means that in the basis $(u, v)$ on $\mathbb{C}^2$ the mapping $g$ has the matrix
$$\begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix},$$
and we note that this complex analogy to the case of the matrix $C$ has on the diagonal the elements $\lambda = \cos(\frac{\pi}{2}) + i\sin(\frac{\pi}{2})$ and its complex conjugate $\bar\lambda$. In other words, the argument of this number in polar form gives the angle of the rotation.
This is easy to understand if we denote the real and imaginary parts of the vector $u$, that is,
$$u = \mathrm{Re}\,u + i\,\mathrm{Im}\,u = x_u + i\,y_u.$$
The vector $v$ is the complex conjugate of $u$. We are interested in the restriction of the mapping $g$ to the real vector space $V = \mathbb{R}^2 \cap \langle u, v\rangle_{\mathbb{C}}$. Evidently,
$$V = \langle u + \bar u,\ i(u - \bar u)\rangle = \langle x_u,\ y_u\rangle.$$
$$R = T \cdot \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -\sqrt3 \\ 0 & \frac{1}{\sqrt3} & 0 \end{pmatrix} \cdot T^{-1} = \begin{pmatrix} \frac13 & \frac13 - \frac{\sqrt3}{3} & \frac13 + \frac{\sqrt3}{3} \\ \frac13 + \frac{\sqrt3}{3} & \frac13 & \frac13 - \frac{\sqrt3}{3} \\ \frac13 - \frac{\sqrt3}{3} & \frac13 + \frac{\sqrt3}{3} & \frac13 \end{pmatrix}.$$
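A direct numerical check of this change-of-basis computation (a numpy sketch of ours):

```python
import numpy as np

T = np.array([[1., 1., 1.],
              [1., -1., 1.],
              [1., 0., -2.]])
Mf = np.array([[1., 0., 0.],
               [0., 0., -np.sqrt(3)],
               [0., 1/np.sqrt(3), 0.]])
R = T @ Mf @ np.linalg.inv(T)
print(np.round(R, 3))
# the axis stays fixed, and R is a rotation (determinant 1):
print(np.allclose(R @ np.ones(3), np.ones(3)),
      np.isclose(np.linalg.det(R), 1.0))
```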
This result can be checked by plugging into the matrix of the general rotation (2.66): by normalising the vector $(1, 1, 1)$ we obtain $(x, y, z) = (1/\sqrt3, 1/\sqrt3, 1/\sqrt3)$, with $\cos\varphi = 0$ and $\sin\varphi = 1$.
2.46. Eigenvalues and eigenvectors. Let us now look at equations of the form $f(u) = a \cdot u$ for a linear mapping $f : V \to V$ on a vector space of dimension $n$ over the scalars $\mathbb{K}$. If we imagine such an equality written in coordinates, that is, using the matrix $A$ of the mapping in some basis, it is an expression
$$A \cdot x = a \cdot x, \qquad\text{that is,}\qquad (A - a \cdot E) \cdot x = 0.$$
From the previous discussion we know that such a system of equations has the only solution $x = 0$ whenever the matrix $A - aE$ is invertible. Thus we want to find values $a \in \mathbb{K}$ for which $A - aE$ is not invertible; the necessary and sufficient condition for this is (see Theorem 2.23)
$$(2.4)\qquad \det(A - a \cdot E) = 0.$$
If we consider $\lambda = a$ as a variable in this scalar equation, we are actually looking for the roots of a polynomial of degree $n$. As we have seen in the case of the matrix $D$, the roots may, but need not, exist, depending on the field of scalars $\mathbb{K}$ we work over.
Eigenvalues and eigenvectors
Scalars $\lambda$ satisfying the equation $f(u) = \lambda \cdot u$ for some nonzero vector $u \in V$ are called eigenvalues of the mapping $f$; the corresponding nonzero vectors $u$ are eigenvectors of the mapping $f$.
If $u, v$ are eigenvectors associated with the same eigenvalue $\lambda$, then for every linear combination of $u$ and $v$ it holds that
$$f(au + bv) = a f(u) + b f(v) = \lambda(au + bv).$$
Therefore the eigenvectors associated with the same eigenvalue $\lambda$ form, together with the zero vector, a nontrivial vector subspace $V_\lambda$, the so-called eigenspace associated with $\lambda$. For instance, if $\lambda = 0$ is an eigenvalue, the kernel $\mathrm{Ker}\,f$ is the eigenspace $V_0$.
From the definition of eigenvalues it is clear that their computation cannot depend on the choice of basis, and hence of the matrix of the mapping $f$. Indeed, as a direct corollary of the transformation properties from paragraph 2.38 and the Cauchy theorem 2.19 on the determinant of a product, by choosing different coordinates we obtain the matrix $A' = P^{-1} A P$ with an invertible matrix $P$, and
$$|P^{-1} A P - \lambda E| = |P^{-1} A P - P^{-1} \lambda E\, P| = |P^{-1}(A - \lambda E) P| = |P^{-1}|\,|A - \lambda E|\,|P| = |A - \lambda E|,$$
because scalar multiplication is commutative and $|P^{-1}| = |P|^{-1}$.
For these reasons we use the same terminology for matrices and mappings:
2.72. Consider the complex numbers as a real vector space and choose $1$ and $i$ for its basis. Determine in this basis the matrix of the following linear mappings:
a) conjugation,
b) multiplication by the number $(2 + i)$.
Determine the matrices of these mappings in the basis $f = \big((1 - i), (1 + i)\big)$.
Solution. In order to determine the matrix of a linear mapping in some basis, it is enough to determine the images of the basis vectors.
a) For the conjugation we have $1 \mapsto 1$, $i \mapsto -i$; written in coordinates, $(1, 0) \mapsto (1, 0)$ and $(0, 1) \mapsto (0, -1)$. By writing the images into the columns we obtain the matrix $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. In the basis $f$ the conjugation swaps the basis vectors, that is, $(1, 0) \mapsto (0, 1)$ and $(0, 1) \mapsto (1, 0)$, and the matrix of the conjugation in this basis is
$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
b) For the basis $(1, i)$ we obtain $1 \mapsto 2 + i$, $i \mapsto 2i - 1$, that is, $(1, 0) \mapsto (2, 1)$ and $(0, 1) \mapsto (-1, 2)$. Thus the matrix of the multiplication by the number $2 + i$ in the basis $(1, i)$ is $\begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix}$.
Now let us determine the matrix in the basis $f$. Multiplication by $(2 + i)$ gives us: $(1 - i) \mapsto (1 - i)(2 + i) = 3 - i$ and $(1 + i) \mapsto (1 + i)(2 + i) = 1 + 3i$. The coordinates $(a, b)_f$ of the vector $3 - i$ in the basis $f$ are given, as we know, by the equation $a \cdot (1 - i) + b \cdot (1 + i) = 3 - i$, that is, $(3 - i)_f = (2, 1)$. Analogously $(1 + 3i)_f = (-1, 2)$. Altogether, we have obtained the matrix $\begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix}$.
Think about the following: why is the matrix of the multiplication by $2 + i$ the same in both bases? Would the two matrices be the same for multiplication by an arbitrary complex number? □
2.73. Determine the matrix $A$ which, in the standard basis of the space $\mathbb{R}^3$, gives the orthogonal projection on the vector subspace generated by the vectors $u_1 = (-1, 1, 0)$ and $u_2 = (-1, 0, 1)$.
Solution. Let us first note that the given subspace is a plane passing through the origin with the normal vector $u_3 = (1, 1, 1)$. The ordered triple $(1, 1, 1)$ is clearly a solution of the system
$$-x_1 + x_2 = 0, \qquad -x_1 + x_3 = 0,$$
that is, the vector $u_3$ is perpendicular to the vectors $u_1, u_2$.
Under the given projection the vectors $u_1$ and $u_2$ must map to themselves and the vector $u_3$ to the zero vector. In the basis composed of
Characteristic polynomial of a matrix and of a mapping
For a matrix $A$ of dimension $n$ over $\mathbb{K}$ we call the polynomial $|A - \lambda E| \in \mathbb{K}_n[\lambda]$ the characteristic polynomial of the matrix $A$.
Roots of this polynomial are the eigenvalues of the matrix $A$. If $A$ is the matrix of a mapping $f : V \to V$ in a certain basis, then $|A - \lambda E|$ is also called the characteristic polynomial of the mapping $f$.
Because the characteristic polynomial of a linear mapping $f : V \to V$ is independent of the choice of the basis of $V$, its coefficients at the individual powers of the variable $\lambda$ are scalars expressing properties of $f$; that is, they cannot depend on the choice of the basis. Notably, as a simple exercise in calculating determinants, we can express the coefficients at the highest and lowest powers (we assume $\dim V = n$ and the matrix of the mapping to be $A = (a_{ij})$ in a certain basis):
$$|A - \lambda \cdot E| = (-1)^n\lambda^n + (-1)^{n-1}(a_{11} + \cdots + a_{nn}) \cdot \lambda^{n-1} + \cdots + |A| \cdot \lambda^0.$$
The coefficient at the highest power says only whether the dimension of the space $V$ is even or odd. We have already noted that the determinant of the matrix of a mapping expresses how the given linear mapping scales volumes.
It is interesting that the sum of the diagonal elements of the matrix of a mapping does not depend on the choice of basis. We call it the trace of the matrix and denote it by $\mathrm{Tr}\,A$; the trace of a mapping is defined as the trace of its matrix in an arbitrary basis. This is actually not so surprising: in chapter eight we illustrate with methods of differential calculus that the trace is the linear approximation of the determinant in the neighbourhood of the unit matrix, see ??.
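Both observations are easy to test numerically (a numpy sketch of ours): the eigenvalues multiply to the determinant and sum to the trace.

```python
import numpy as np

A = np.array([[2., 1.], [1., 2.]])
lam = np.linalg.eigvals(A)                       # roots of |A - lambda E|
print(np.isclose(lam.prod(), np.linalg.det(A)))  # product of roots = |A|
print(np.isclose(lam.sum(), np.trace(A)))        # sum of roots = Tr A
```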
In the following we show a few important properties of eigenspaces.
2.47. Theorem. Eigenvectors of a linear mapping $f : V \to V$ associated with distinct eigenvalues are linearly independent.
Proof. Let $a_1, \dots, a_k$ be distinct eigenvalues of the mapping $f$ and $u_1, \dots, u_k$ eigenvectors associated with them. We proceed by induction on the number of linearly independent vectors among them. Suppose $u_1, \dots, u_\ell$ are linearly independent while $u_{\ell+1} = \sum_{i=1}^{\ell} c_i u_i$ is a linear combination of them. At least $\ell = 1$ can be chosen, because the eigenvectors are nonzero. Then
$$f(u_{\ell+1}) = a_{\ell+1} \cdot u_{\ell+1} = \sum_{i=1}^{\ell} a_{\ell+1}\, c_i\, u_i, \qquad f(u_{\ell+1}) = f\Big(\sum_{i=1}^{\ell} c_i u_i\Big) = \sum_{i=1}^{\ell} a_i\, c_i\, u_i.$$
By subtracting these two expressions we obtain $0 = \sum_{i=1}^{\ell}(a_{\ell+1} - a_i)\, c_i\, u_i$. All the differences between the eigenvalues are nonzero and at least one coefficient $c_i$ is nonzero. That is a contradiction with the assumed linear independence of $u_1, \dots, u_\ell$, therefore the vector $u_{\ell+1}$ must be linearly independent of the others. □
$u_1, u_2, u_3$ (in this order), the matrix of this projection is thus
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
Using the transition matrix
$$T = \begin{pmatrix} -1 & -1 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}, \qquad T^{-1} = \begin{pmatrix} -\frac13 & \frac23 & -\frac13 \\ -\frac13 & -\frac13 & \frac23 \\ \frac13 & \frac13 & \frac13 \end{pmatrix},$$
for changing the basis from the basis $(u_1, u_2, u_3)$ to the standard basis, and back from the standard basis to the basis $(u_1, u_2, u_3)$, we obtain
$$A = \begin{pmatrix} -1 & -1 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}\begin{pmatrix} -\frac13 & \frac23 & -\frac13 \\ -\frac13 & -\frac13 & \frac23 \\ \frac13 & \frac13 & \frac13 \end{pmatrix} = \begin{pmatrix} \frac23 & -\frac13 & -\frac13 \\ -\frac13 & \frac23 & -\frac13 \\ -\frac13 & -\frac13 & \frac23 \end{pmatrix}.$$
The theorem just proved can be seen as a decomposition of a linear mapping $f$ into a sum of simple mappings: for $n$ distinct eigenvalues $\lambda_i$ of the characteristic polynomial we obtain one-dimensional eigenspaces $V_{\lambda_i}$. Each of them defines a projection onto this invariant one-dimensional subspace, on which the mapping acts simply as multiplication by the eigenvalue $\lambda_i$. The whole space $V$ then decomposes into the direct sum of the individual eigenspaces. Furthermore, this decomposition can be easily calculated:
Basis of eigenvectors
□
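The projection matrix from 2.73 can be verified numerically (a numpy sketch of ours):

```python
import numpy as np

A = np.array([[ 2., -1., -1.],
              [-1.,  2., -1.],
              [-1., -1.,  2.]]) / 3
u1 = np.array([-1., 1., 0.])
u2 = np.array([-1., 0., 1.])
u3 = np.ones(3)
print(np.allclose(A @ u1, u1), np.allclose(A @ u2, u2), np.allclose(A @ u3, 0))
print(np.allclose(A @ A, A))   # A is a projection: A^2 = A
```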
2.74. In the vector space $\mathbb{R}^3$ determine the matrix of the orthogonal projection onto the plane $x + y - 2z = 0$. ○
2.75. In the vector space $\mathbb{R}^3$ determine the matrix of the orthogonal projection onto the plane $2x - y + 2z = 0$. ○
I. Bases and inner products
Using the inner product we can solve in a different (better?) way problems which we were already able to solve using changes of coordinates.
2.76. Write down the matrix of the mapping of the orthogonal projection onto the plane passing through the origin and perpendicular to the vector $(1, 1, 1)$.
Solution. The image of an arbitrary point (vector) $x = (x_1, x_2, x_3) \in \mathbb{R}^3$ under the considered mapping can be obtained by subtracting from the given vector its orthogonal projection onto the direction normal to the considered plane, that is, onto the direction $(1, 1, 1)$. This projection $p$ is given (see (2.3)) as
$$p = \frac{\langle x, (1, 1, 1)\rangle}{\|(1, 1, 1)\|^2}\,(1, 1, 1) = \Big(\frac{x_1 + x_2 + x_3}{3},\ \frac{x_1 + x_2 + x_3}{3},\ \frac{x_1 + x_2 + x_3}{3}\Big).$$
The resulting mapping is thus
$$x \mapsto x - p = \Big(\frac{2x_1}{3} - \frac{x_2 + x_3}{3},\ \frac{2x_2}{3} - \frac{x_1 + x_3}{3},\ \frac{2x_3}{3} - \frac{x_1 + x_2}{3}\Big),$$
Corollary. If there exist $n$ mutually distinct roots $\lambda_i$ of the characteristic polynomial of the mapping $f : V \to V$ on the $n$-dimensional space $V$, then $V$ decomposes into a direct sum of eigenspaces of dimension 1. This means that there exists a basis of $V$ composed only of eigenvectors, and in this basis $f$ has a diagonal matrix. This basis is uniquely determined up to the order of its elements and scalar multiples of its vectors.
The corresponding basis (expressed in coordinates with respect to an arbitrarily chosen basis of $V$) is obtained by solving the $n$ systems of homogeneous linear equations in $n$ variables with the matrices $(A - \lambda_i \cdot E)$, where $A$ is the matrix of $f$ in the chosen basis.
with the matrix
$$\begin{pmatrix} \frac23 & -\frac13 & -\frac13 \\ -\frac13 & \frac23 & -\frac13 \\ -\frac13 & -\frac13 & \frac23 \end{pmatrix}.$$
We have (correctly) obtained the same matrix as in the exercise 2.73.
□
2.48. Invariant subspaces. We have seen that every eigenvector $v$ of a mapping $f : V \to V$ generates a subspace $\langle v\rangle \subseteq V$ which is preserved by the mapping $f$.
More generally, we say that a vector subspace $W \subseteq V$ is an invariant subspace for a linear mapping $f$ if $f(W) \subseteq W$.
Consider a linear mapping $\varphi : V \to W$ between vector spaces. As we can surely imagine, the vector $v \in V$ can represent the state of some system we are observing, while $\varphi(v)$ gives the result after some process has been realised.
If we want to reach a given result $b \in W$ of such a process, we solve the problem
$$\varphi(x) = b$$
for some unknown vector $x$ and a known vector $b$.
In fixed coordinates we then have the matrix $A$ of the mapping and we solve the corresponding system of linear equations.
$$\dots, \qquad x_2 \le 90, \qquad x_1 \le 110.$$
The objective function (the function that gives the profit for given numbers of manufactured nuts and bolts) is $40x_1 + 60x_2$. The previous system of inequalities determines a certain area in $\mathbb{R}^2$, and optimising the profit means finding in this area the point (or points) in which the objective function attains its maximum value, that is, finding the largest $k$ such that the line $40x_1 + 60x_2 = k$ has a non-empty intersection with the given area. Graphically, we can find the solution for example by placing into the plane the line $p$ satisfying the equation $40x_1 + 60x_2 = 0$ and moving it "upwards" as long as it still intersects the area. It is clear that the last intersection is either a point or a boundary segment of the area (which must then be parallel to $p$). Thus we obtain (see the figure) the point $x_1 = 110$ and $x_2 = 5$. The maximum possible income is thus $40 \cdot 110 + 60 \cdot 5 = 4700$ Kč. □
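Such small linear programmes are routinely solved numerically. Since part of the restriction system above was lost, the following scipy sketch (ours) uses purely hypothetical restrictions and only illustrates the mechanics:

```python
from scipy.optimize import linprog

# maximise 40 x1 + 60 x2 subject to hypothetical restrictions
# x1 + x2 <= 115, x1 <= 110, x1, x2 >= 0 (illustrative data only)
res = linprog(c=[-40, -60],                 # linprog minimises, so negate
              A_ub=[[1, 1], [1, 0]],
              b_ub=[115, 110],
              bounds=[(0, None), (0, None)])
print(res.x, -res.fun)   # optimal point and maximal profit
```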
3.2. Minimisation of the costs of feeding. A stable in Nišovice u Volyně buys fodder for the winter: hay and oats. The nutritional values of the fodder and the required daily portions for one foal are given in the table:

g/kg                             Hay     Oats    Requirements
Dry basis                        841     860     at least 6300 g
Digestible nitrogen substances   53      123     at most 1150 g
Starch                           0.348   0.868   at most 5.35 g
Calcium                          6       1.6     at least 30 g
Phosphate                        2.8     3.5     at most 44 g
Sodium                           0.2     1.4     approximately 7 g
Cost (Kč per kg)                 1.80    1.60

Every foal must get at least 2 kg of oats in its daily meal. The average cost (including transportation) is 1.80 Kč per 1 kg of hay and 1.60 Kč per 1 kg of oats. Compose a daily diet for one foal with minimal costs.
3.3. Optimal distribution of material. The inner wooden panelling of a cottage requires
• at most 120 planks of length 35 cm,
• from 180 to 330 planks of length 120 cm,
• at least 30 planks of length 95 cm.
When solving the system of equations we are thus left with exactly $n - k$ free parameters, and by setting one of them to one and the others to zero we obtain exactly $n - k$ linearly independent solutions. All solutions are then given by all the linear combinations of these $n - k$ solutions. Every such $(n-k)$-tuple of solutions is called a fundamental system of solutions of the given homogeneous system of equations. We have proved:
Theorem. The set of all solutions of the homogeneous system of equations
$$A \cdot x = 0$$
for $n$ variables with the matrix $A$ of rank $k$ is a vector subspace of $\mathbb{K}^n$ of dimension $n - k$. Every basis of this subspace forms a fundamental system of solutions of the given homogeneous system.
3.2. Non-homogeneous systems of equations. Consider now the general system of equations
$$A \cdot x = b.$$
Let us realise once again that the columns of the matrix $A$ are actually the images of the vectors of the standard basis of $\mathbb{K}^n$ under the corresponding linear mapping.
In the standard problems of linear programming we optimise the value of the objective function under restrictions of the form described below, and the variables are non-negative.
It is easy to see that every general linear programming problem can be transformed into a standard one of either type. Aside from sign changes, we can decompose the variables that have no sign restriction into a difference of two non-negative ones. Without loss of generality we shall further work only with the standard maximisation problem.
How do we solve such a problem? We seek the maximum of a linear form $h$ over subsets $M$ of a vector space given by linear inequalities, that is, in the plane by an intersection of half-planes (in general we shall speak about half-spaces in the next chapter). Note that every linear form $h : V \to \mathbb{R}$ on a real vector space (that is, an arbitrary linear scalar function) is monotone in every chosen direction; along the direction it either only grows, or only decreases, or is constant. More precisely, if we choose a fixed starting vector $u \in V$ and a "directional" vector $v \in V$, then the composition of our form $h$ with this parametrisation yields
$$t \mapsto h(u + t\,v) = h(u) + t\,h(v).$$
This expression is, with growing parameter $t$, indeed either increasing, or decreasing, or constant (depending on whether $h(v)$ is positive, negative, or zero).
Thus we must expect that problems similar to the one with the painter are either unsatisfiable (if the given set of restrictions is empty), or the profit is unbounded (if the restrictions determine an unbounded part of the space and the form $h$ is non-zero in some of the unbounded directions), or they attain a maximal solution
in at least one of the "vertices" of the set $M$ (while usually that is the case for a single vertex only, it can also happen that the maximum is attained on a whole part of the boundary of the area $M$).
3.5. Formulation using linear equations. Finding an optimum is not always as simple as in the previous case. The problem can contain many variables and restrictions, and even deciding whether the set $M$ of satisfiable points is non-empty can be a problem. We do not have space here for the complete theory, but we mention at least two directions of ideas which show that the solution can actually always be found in a way similar to the previous paragraph.
Let us begin with a comparison with systems of linear equations, because we understand those well. Let us write the inequalities (3.1)-(3.3) in the general form:
$$A \cdot x \le b,$$
where $x$ is now an $n$-dimensional vector, $b$ an $m$-dimensional vector, and $A$ the corresponding matrix. By an inequality between vectors we mean the individual inequalities between all their coordinates. We want to maximise the product $c \cdot x$ for a given row vector $c$ of the coefficients of the linear form $h$. If we add a new auxiliary variable for every inequality, and one more variable $z$ for the value of the linear form $h$, we can rewrite the whole system as a system of linear equations
$$\begin{pmatrix} 1 & -c & 0 \\ 0 & A & E \end{pmatrix} \cdot \begin{pmatrix} z \\ x \\ x_s \end{pmatrix} = \begin{pmatrix} 0 \\ b \end{pmatrix},$$
where the matrix is composed of blocks with $1 + n + m$ columns and $1 + m$ rows, and the vectors have the corresponding components. Additionally we require that all the coordinates of $x$ and $x_s$ be non-negative.
If the given system of equations has a solution, then within this set of solutions we seek values of the variables $z$, $x$ and $x_s$ such that all coordinates of $x$ and $x_s$ are non-negative and $z$ is maximised. In paragraph 4.11 on page 215 we will discuss this situation from the viewpoint of affine geometry.
Specifically, in our problem of the black and white painter the system of linear equations looks as follows:
$$\begin{pmatrix} 1 & -c_1 & -c_2 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 \\ 0 & w_1 & w_2 & 0 & 1 & 0 \\ 0 & b_1 & b_2 & 0 & 0 & 1 \end{pmatrix} \cdot \begin{pmatrix} z \\ x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix} = \begin{pmatrix} 0 \\ L \\ W \\ B \end{pmatrix}.$$
3.6. Duality of linear programming. Consider a real matrix $A$ with $m$ rows and $n$ columns, a vector of restrictions $b$, and a row vector $c$ giving the objective function. From these data we can compose two problems of linear programming, for $x \in \mathbb{R}^n$ and $y \in \mathbb{R}^m$.
Maximisation problem: maximise $c \cdot x$ under the conditions $A \cdot x \le b$ and $x \ge 0$.
Minimisation problem: minimise $y^T \cdot b$ under the conditions $y^T \cdot A \ge c$ and $y \ge 0$.
We say that these two problems are mutually dual. For deriving further properties of linear programming we first introduce some terminology.
We say that a problem is solvable if there is an admissible vector $x$ which meets all the restrictions. A solvable maximisation (minimisation) problem is bounded if its objective function is bounded from above (below) over the set of admissible vectors.
Lemma. If $x \in \mathbb{R}^n$ is an admissible vector for the standard maximisation problem and $y \in \mathbb{R}^m$ is an admissible vector for the dual minimisation problem, then for the objective functions we have
$$c \cdot x \le y^T \cdot b.$$
Proof. It is actually a simple observation: $x \ge 0$ and $c \le y^T \cdot A$, but also $y \ge 0$ and $A \cdot x \le b$, thus it must also hold that
$$c \cdot x \le y^T \cdot A \cdot x \le y^T \cdot b,$$
which is what we wanted to prove. □
From here we immediately see that if both dual problems are solvable, then they must be bounded. Even more interesting is the following corollary, which is directly implied by the inequality in the previous proof.
Corollary. If there exist admissible vectors $x$ and $y$ of the dual linear problems such that the objective functions satisfy $c \cdot x = y^T \cdot b$, then both are optimal solutions of the corresponding problems.
3.7. Theorem (On duality). If a standard problem of linear programming is solvable and bounded, then its dual is also solvable and bounded, there exists an optimal solution for each of the problems, and the optimal values of the corresponding objective functions are equal.
Proof. One direction was already proved in the previous corollary. It remains to prove the existence of an optimal solution. That can be done by constructing an algorithm, which we will not do in great detail now. We will return to the missing part of the proof in the part about affine geometry on page 215. □
Let us note yet another corollary of the just formulated duality theorem:
Corollary (Equilibrium theorem). Consider admissible vectors $x$ and $y$ for the standard maximisation problem and its dual problem from the definition in 3.6. Then both these vectors are optimal if and only if $y_i = 0$ for all coordinates with index $i$ for which $\sum_{j=1}^{n} a_{ij} x_j < b_i$, and simultaneously $x_j = 0$ for all coordinates with index $j$ for which $\sum_{i=1}^{m} y_i a_{ij} > c_j$.
Proof. Suppose that both relations regarding the zeroes
among the $x_j$ and $y_i$ hold. Then in the following computation we can calculate with equalities, because the summands with strict inequality have zero coefficients anyway:
$$\sum_{i=1}^{m} y_i b_i = \sum_{i=1}^{m} y_i \Big(\sum_{j=1}^{n} a_{ij} x_j\Big) = \sum_{i=1}^{m}\sum_{j=1}^{n} y_i a_{ij} x_j,$$
and for the same reason also
$$\sum_{j=1}^{n} c_j x_j = \sum_{j=1}^{n}\Big(\sum_{i=1}^{m} y_i a_{ij}\Big) x_j = \sum_{i=1}^{m}\sum_{j=1}^{n} y_i a_{ij} x_j.$$
This shows one implication, thanks to the duality theorem.
Suppose now that both $x$ and $y$ are optimal vectors. We thus know that
$$\sum_{i=1}^{m} y_i b_i \ \ge\ \sum_{i=1}^{m}\sum_{j=1}^{n} y_i a_{ij} x_j \ \ge\ \sum_{j=1}^{n} c_j x_j,$$
but simultaneously the left-hand and right-hand sides are equal, so equality holds everywhere. If we rewrite the first equality as
$$\sum_{i=1}^{m} y_i \Big(b_i - \sum_{j=1}^{n} a_{ij} x_j\Big) = 0,$$
we see that it can be satisfied only if the relation from the statement holds, because it is a sum of non-negative numbers equal to zero. From the second equality we similarly derive the second part, and the proof is finished. □
The duality theorem and the equilibrium theorem are useful when solving linear programming problems, because they show the relation between the zeroes among the additional variables and the satisfaction of the restrictions.
3.8. Notes about linear models in economy. Our very schematic problem of the black and white painter from paragraph 3.4 can serve to illustrate one of the typical economic models, the so-called model of production planning. The model tries to capture the problem completely, that is, to capture both external and internal relations. The left-hand sides of the inequalities (3.1), (3.2), (3.3) and of the objective function $h(x_1, x_2)$ express various production relations. Depending on the character of the problem, we have on the right-hand sides either exact values (and then we solve equations) or capacity restrictions and goal optimisation (and then we obtain linear programming problems).
We can thus in general solve the problem of resource allocation with supplier restrictions, and either minimise costs or maximise income. We can also interpret duality from this point of view. If our painter wanted to set prices $y_L$ for his work, $y_W$ for the white colour and $y_B$ for the black colour, he would minimise the objective function
$$L \cdot y_L + W \cdot y_W + B \cdot y_B$$
with the restrictions
$$y_L + w_1 y_W + b_1 y_B \ge c_1, \qquad y_L + w_2 y_W + b_2 y_B \ge c_2.$$
But that is exactly the dual problem to the original one, and the theorem 3.7 says that the optimal state is the one where the objective functions have the same value.
Among economic models we can find many modifications. One of them are the problems of financial planning, connected to the optimisation of a portfolio: we set the volumes of investments into individual investment opportunities with the goal of meeting given restrictions on risk factors while maximising the profit, or dually of minimising the risk at a given volume.
Another common model is the marketing application, for instance the allocation of costs for advertisement in various media or the placing of advertisements into time slots. The restrictions are in this case determined by the budget, the target population, etc.
Very common are models of nutrition, that is, setting up how much of different kinds of food should be eaten in order to meet the required total amounts of specific components, e.g. minerals and vitamins.
Problems of linear programming also arise in personnel tasks, where workers with specific qualifications and other properties are distributed into working shifts. Common are also problems of merging, problems of splitting, and problems of goods distribution.
2. Difference equations
We have already met difference equations in the first chapter, albeit briefly and only of first order. Now we present a more general theory for linear equations with constant coefficients, which provides not only very practical tools but also a nice illustration of the concepts of vector spaces and linear mappings.
Homogeneous linear difference equation of order k
3.9. Definition. A homogeneous linear difference equation of order $k$ is given by the expression
$$a_0 x_n + a_1 x_{n-1} + \cdots + a_k x_{n-k} = 0, \qquad a_0 \neq 0,\ a_k \neq 0,$$
where the coefficients $a_i$ are scalars which may possibly also depend on $n$.
We also say that such an equality defines a homogeneous linear recurrence of order $k$, and we usually write the sequence in question as a function
$$x_n = f(n) = -\frac{a_1}{a_0} f(n-1) - \cdots - \frac{a_k}{a_0} f(n-k).$$
A solution of this equation is a sequence of scalars $x_i$, for all $i \in \mathbb{N}$ (or $i \in \mathbb{Z}$), which satisfies the equation for any fixed $n$.
By specifying any $k$ consecutive values $x_i$, all the other values are determined uniquely. Indeed, we work over a field of scalars, so the values $a_0$ and $a_k$ are invertible, and thus using the defining relation any $x_n$ can be computed uniquely, and similarly any $x_{n-k}$. Induction thus immediately proves that all the remaining values are determined uniquely.
The space of all infinite sequences $x_i$ forms a vector space, where addition and multiplication by scalars work coordinate-wise. Directly from the definition it is immediate that a sum of two solutions of a homogeneous linear equation, or a multiple of a solution, is again a solution. Analogously as with homogeneous systems of linear equations we see that the set of all solutions forms a subspace.
An initial condition on the values of a solution is given as a $k$-dimensional vector in $\mathbb{K}^k$. A sum of initial conditions determines the sum of the corresponding solutions, and similarly for scalar multiples. Note also that plugging zeroes and ones into the initial $k$ values immediately yields $k$ linearly independent solutions of the equation. Thus, although the vectors are infinite sequences, the set of all solutions has finite dimension; we know that its dimension equals the order $k$ of the equation, and we can easily obtain a basis of the space of solutions. Again we speak of a fundamental system of solutions, and all other solutions are its linear combinations.
As we have already checked, if we choose $k$ consecutive indices $i, i+1, \dots, i+k-1$, the homogeneous linear difference equation gives a linear mapping $\mathbb{K}^k \to \mathbb{K}^\infty$ assigning to the $k$-dimensional vector of initial values the infinite sequence of the same scalars. The independence of such solutions is equivalent to the
independence of the initial values, which can be easily checked using a determinant. If we have a $k$-tuple of solutions $(x_n^{[1]}, \dots, x_n^{[k]})$, it is independent if and only if the following determinant, the so-called Casoratian, is non-zero for one $n$ (which then implies it is non-zero for all $n$):
$$C\big(x_n^{[1]}, \dots, x_n^{[k]}\big) = \begin{vmatrix} x_n^{[1]} & \cdots & x_n^{[k]} \\ x_{n+1}^{[1]} & \cdots & x_{n+1}^{[k]} \\ \vdots & & \vdots \\ x_{n+k-1}^{[1]} & \cdots & x_{n+k-1}^{[k]} \end{vmatrix}.$$
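For experiments, the Casoratian is easy to evaluate (a numpy sketch of ours; the helper name is illustrative):

```python
import numpy as np

def casoratian(solutions, n):
    """solutions: list of k indexable sequences; returns the determinant of
    the k x k matrix with entry (i, j) = solutions[j][n + i]."""
    k = len(solutions)
    M = np.array([[solutions[j][n + i] for j in range(k)] for i in range(k)])
    return np.linalg.det(M)

# two solutions of x_{n+2} = x_{n+1} + 2 x_n: 2^n and (-1)^n
a = [2**n for n in range(10)]
b = [(-1)**n for n in range(10)]
print(casoratian([a, b], 0))   # -3.0, non-zero: the solutions are independent
```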
3.10. Solution of homogeneous recurrences with constant coefficients. It is hard to find a universal mechanism for finding the solution of a general homogeneous linear difference equation, that is, a directly computable expression for the general solution $x_n$. In practical models one very often meets equations where the coefficients are constant. In this case it is possible to guess a suitable form of the solution, and indeed we will be able to find $k$ linearly independent solutions. This solves the problem completely, as all other solutions are linear combinations of these.
For simplicity let us start with equations of second order. These are very often encountered in practical problems where relations are based on the two previous values. A linear difference equation (recurrence) of second order with constant coefficients is thus for us of the form
$$(3.4)\qquad f(n+2) = a \cdot f(n+1) + b \cdot f(n) + c,$$
where $a, b, c$ are known scalar coefficients.
For instance, in population models we can assume that the individuals in a population mature and start breeding two seasons later (that is, they add a multiple $b \cdot f(n)$ with $b > 1$ to the value $f(n+2)$), while the immature individuals tire and destroy a part of the mature population (that is, the coefficient $a$ is negative). Furthermore, it might be that somebody destroys (uses, eats) a fixed amount $c$ every season.
A special case with $c = 0$ is for instance the Fibonacci sequence of numbers $y_0, y_1, \dots$, where $y_{n+2} = y_{n+1} + y_n$.
If, when solving a mathematical problem, we do not have any new idea, we can always try where a known solution of a similar problem leads. Let us try to plug into the equation (3.4) with the coefficient $c = 0$ a solution similar to that of the linear equations of first order, that is, $f(n) = \lambda^n$ for some scalar $\lambda$. By plugging in we obtain
$$\lambda^{n+2} - a\lambda^{n+1} - b\lambda^n = \lambda^n\big(\lambda^2 - a\lambda - b\big) = 0.$$
This relation will hold either for $\lambda = 0$ or for the choice of the values
$$\lambda_1 = \tfrac12\big(a + \sqrt{a^2 + 4b}\big), \qquad \lambda_2 = \tfrac12\big(a - \sqrt{a^2 + 4b}\big).$$
We have thus determined when such solutions indeed work; we just have to choose the scalar $\lambda$ suitably. But this is not enough for us, since we need to find a solution for any two initial values $f(0)$ and $f(1)$, and so far we have found only two specific sequences satisfying the given equation (or possibly only one sequence, if $\lambda_2 = \lambda_1$).
As we have already derived for even very general linear recurrences, a sum of two solutions $f_1(n)$ and $f_2(n)$ of our equation $f(n+2) - a \cdot f(n+1) - b \cdot f(n) = 0$ is clearly again a solution
B. Recurrent equations
Various linear recurrences can be a good tool for describing models of growth. Let us begin with a very popular population model that uses a linear difference equation of second order:
3.4. Fibonacci sequence. In the beginning of spring, a stork brought two newborn rabbits, a male and a female, onto a meadow. The female is, after being two months old, able to deliver two newborns, a male and a female. The newborns can then start delivering after one month, and then every month. Every female is pregnant for one month and then delivers. How many pairs of rabbits will there be after nine months (if none of them dies and none "moves in")?
Solution. After one month, there is still one pair, but the female is already pregnant. After two months, the first newborns are delivered, so there are two pairs. Every following month there are as many new pairs as there were pregnant females one month before, which equals the number of at-least-one-month-old pairs, which in turn equals the number of pairs that were there two months ago. The total number of pairs $p_n$ after $n$ months is thus the sum of the numbers of pairs in the two preceding months. For the number of pairs we thus have the following homogeneous linear recurrence
$$(3.1)\qquad p_{n+2} = p_{n+1} + p_n, \qquad n = 1, 2, \dots,$$
which, along with the initial conditions $p_1 = 1$ and $p_2 = 1$, uniquely determines the number of pairs of rabbits on the meadow in the individual months. Linearity of the formula means that all members of the sequence $(p_n)$ appear in the first power; the meaning of the word recurrence is hopefully clear, and homogeneity means that the absolute term is missing in the formula (see below for a non-homogeneous formula). For the value of the $n$-th member we can derive an explicit formula. In searching for it we can use the observation that, for a certain $r$, the function $r^n$ is a solution of the difference equation without initial conditions. This $r$ can be obtained by plugging into the recurrence:
$$r^{n+2} = r^{n+1} + r^n,$$
and after dividing by $r^n$ we obtain
$$r^2 = r + 1,$$
which is the so-called characteristic equation of the given recurrence. Our equation thus has the roots $\frac{1-\sqrt5}{2}$ and $\frac{1+\sqrt5}{2}$, and the sequences
of the same equation, and the same holds for scalar multiples of a solution. Our two specific solutions thus yield even more general solutions
$$f(n) = C_1 \lambda_1^n + C_2 \lambda_2^n$$
for arbitrary scalars $C_1$ and $C_2$, and for the unique solution of a specific problem with given initial values $f(0)$ and $f(1)$ it remains just to find the corresponding scalars $C_1$ and $C_2$. (And we also need to check whether this is possible for any two initial values.)
3.11. Choice of scalars. Let us show how this can work on at least one example. Let us focus on the problem that the roots of the characteristic polynomial in general do not lie in the same field of scalars as the coefficients of the equation. Thus we solve the problem:
$$(3.5)\qquad y_{n+2} = y_{n+1} + \tfrac12\, y_n, \qquad y_0 = 2,\ y_1 = 0.$$
$a_n = \big(\frac{1+\sqrt5}{2}\big)^n$ and $b_n = \big(\frac{1-\sqrt5}{2}\big)^n$, $n \ge 1$, satisfy the given relation. The relation is also satisfied by any linear combination, that is, any
In our case we thus have $\lambda_{1,2} = \frac12\big(1 \pm \sqrt3\big)$, and clearly
$$y_0 = C_1 + C_2 = 2, \qquad y_1 = \tfrac12 C_1\big(1 + \sqrt3\big) + \tfrac12 C_2\big(1 - \sqrt3\big) = 0$$
is satisfied for exactly one choice of these constants. A direct calculation yields $C_1 = 1 - \frac{\sqrt3}{3}$, $C_2 = 1 + \frac{\sqrt3}{3}$, and our problem has the unique solution
$$f(n) = \Big(1 - \frac{\sqrt3}{3}\Big)\frac{(1 + \sqrt3)^n}{2^n} + \Big(1 + \frac{\sqrt3}{3}\Big)\frac{(1 - \sqrt3)^n}{2^n}.$$
Note that even though the solutions found for an equation with rational coefficients look complicated and are expressed with irrational (or possibly complex) numbers, we know a priori that the solution itself is again rational. Without this "step aside" into a bigger field of scalars, however, we would not be able to describe the general solution.
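One may convince oneself numerically (a sketch of ours) that the closed form reproduces the recurrence:

```python
from math import sqrt

def closed(n):
    s3 = sqrt(3)
    return (1 - s3/3) * ((1 + s3)/2)**n + (1 + s3/3) * ((1 - s3)/2)**n

y = [2.0, 0.0]
for n in range(2, 8):
    y.append(y[-1] + 0.5 * y[-2])   # y_{n+2} = y_{n+1} + y_n / 2
print([round(v, 10) for v in y])
print([round(closed(n), 10) for n in range(8)])  # the same numbers
```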
We will meet similar situations very often. The general solution also allows us to discuss, without directly computing the constants, the qualitative behaviour of the sequence of numbers $f(n)$, that is, whether the values approach some fixed value with growing $n$, or oscillate in some interval, or are unbounded.
3.12. General case of homogeneous recurrences. Let us now try, similarly to the case of second order, to plug the choice $x_n = \lambda^n$ for some (yet unknown) scalar $\lambda$ into the general homogeneous equation from the definition 3.9. For every $n$ we obtain the condition
$$\lambda^{n-k}\big(a_0\lambda^k + a_1\lambda^{k-1} + \cdots + a_k\big) = 0,$$
which means that either $\lambda = 0$ or $\lambda$ is a root of the so-called characteristic polynomial in the parentheses. The characteristic polynomial is independent of $n$.
Assume that the characteristic polynomial has $k$ distinct roots $\lambda_1, \dots, \lambda_k$. For this purpose we may extend the field of scalars we are working in, for instance $\mathbb{Q}$ into $\mathbb{R}$ or $\mathbb{R}$ into $\mathbb{C}$, because the results of the calculations will again be solutions staying in the original field, thanks to the equation itself. Each of the roots gives us a single possible solution
$$x_n = (\lambda_i)^n.$$
In order to be happy, we require $k$ linearly independent solutions.
sequence $c_n = s\,a_n + t\,b_n$, $s, t \in \mathbb{R}$. The numbers $s$ and $t$ can be chosen so that the resulting combination satisfies the initial conditions, in our case $c_1 = 1$, $c_2 = 1$. For simplicity it is convenient to define the zeroth member of the sequence as $c_0 = 0$ and compute $s$ and $t$ from the equations for $c_0$ and $c_1$. We find out that $s = \frac{1}{\sqrt5}$, $t = -\frac{1}{\sqrt5}$, and thus
$$(3.2)\qquad p_n = \frac{(1+\sqrt5)^n - (1-\sqrt5)^n}{2^n \sqrt5}.$$
Such a sequence satisfies the given recurrence and also the initial conditions $c_0 = 0$, $c_1 = 1$, thus it is the unique sequence given by these requirements. Note that the value of the formula (3.2) is an integer for any natural $n$ (it gives the integer Fibonacci sequence), although it might not seem so at first glance. □
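A quick check of the formula (3.2) against the recurrence (a sketch of ours):

```python
from math import sqrt

def binet(n):
    s5 = sqrt(5)
    return ((1 + s5)**n - (1 - s5)**n) / (2**n * s5)

fib = [0, 1]
for _ in range(10):
    fib.append(fib[-1] + fib[-2])
print(fib)
print([round(binet(n)) for n in range(12)])   # matches the list above
```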
3.5. A simplified model of the behaviour of the gross domestic product.
Consider the difference equation
$$(3.3)\qquad y_{k+2} - a(1+b)\,y_{k+1} + ab\,y_k = 1,$$
where $y_k$ is the gross domestic product in the year $k$. The constant $a$ is the so-called consumption tendency, a macroeconomic factor giving the fraction of money that people spend (from what they have at their disposal), and the constant $b$ describes the dependence of the measure of investment of the private sector on the consumption tendency.
We further assume that the size of the domestic product is normalised such that on the right-hand side of the equation the result is 1.
Compute the values $y_n$ for $a = \frac34$, $b = \frac13$, $y_0 = 1$, $y_1 = 1$.
Solution. Let us first look for the solution of the homogeneous equation (with zero right-hand side) in the form $r^k$. The number $r$ must be a solution of the characteristic equation
$$x^2 - a(1+b)x + ab = 0, \qquad\text{that is,}\qquad x^2 - x + \tfrac14 = 0,$$
which has the double root $\frac12$. All the solutions of the homogeneous equation are then of the form $a\big(\frac12\big)^n + b\,n\big(\frac12\big)^n$.
Let us also note that if we find some solution of the non-homogeneous equation (a so-called particular solution), then adding to it any solution of the homogeneous equation yields another solution of the non-homogeneous equation. It can be shown that in this way we obtain all the solutions of the non-homogeneous equation.
In our case (that is, when all the coefficients and the non-homogeneous term are constant) a particular solution is the constant
In order to do this, it suffices to verify the independence by plugging the $k$ values $n = 0, \dots, k-1$ for the $k$ choices of $\lambda_i$ into the Casoratian (see 3.9). We thus obtain the so-called Vandermonde matrix, and it is a nice (but not entirely trivial) exercise to compute that for every $k$ and any $k$-tuple of distinct $\lambda_i$ the determinant of such a matrix is non-zero, see (2.24) on page 87. But that means that the chosen solutions are linearly independent.
We have thus found the fundamental system of solutions of the homogeneous difference equation in the case that all the roots of its characteristic polynomial are distinct.
Consider now a multiple root $\lambda$ and plug into the defining relation the assumed solution $x_n = n\lambda^n$. We obtain the condition
$$a_0 n\lambda^n + a_1(n-1)\lambda^{n-1} + \cdots + a_k(n-k)\lambda^{n-k} = 0.$$
This condition can be rewritten using the so-called derivative of a polynomial (see ?? on page ??), which we denote by an apostrophe:
$$\lambda\big(a_0\lambda^n + \cdots + a_k\lambda^{n-k}\big)' = 0,$$
and right at the beginning of the fifth chapter we shall see that a root of a polynomial $f$ has multiplicity greater than one if and only if it is a root of $f'$. Our condition is thus satisfied.
With a greater multiplicity $\ell$ of a root of the characteristic polynomial we can proceed similarly and use the fact that a root with multiplicity $\ell$ is a root of all derivatives of the polynomial up to order $\ell - 1$ (inclusively). The derivatives look as follows:
$$f(\lambda) = a_0\lambda^n + \cdots + a_k\lambda^{n-k},$$
$$f'(\lambda) = a_0 n\lambda^{n-1} + \cdots + a_k(n-k)\lambda^{n-k-1},$$
$$f''(\lambda) = a_0 n(n-1)\lambda^{n-2} + \cdots + a_k(n-k)(n-k-1)\lambda^{n-k-2},$$
$$\vdots$$
$$f^{(\ell+1)}(\lambda) = a_0 n(n-1)\cdots(n-\ell)\,\lambda^{n-\ell-1} + \cdots + a_k(n-k)\cdots(n-k-\ell)\,\lambda^{n-k-\ell-1}.$$
Let us look at the case of a triple root $\lambda$ and try to find a solution in the form $n^2\lambda^n$. Plugging into the defining relation we obtain the equation
$$a_0 n^2\lambda^n + \cdots + a_k(n-k)^2\lambda^{n-k} = 0.$$
Clearly the left-hand side equals the expression $\lambda^2 f''(\lambda) + \lambda f'(\lambda)$, and because $\lambda$ is a root of both derivatives, the condition is satisfied.
Using induction we can easily prove that even the general condition for a solution of the form $x_n = n^\ell\lambda^n$,
$$a_0 n^\ell\lambda^n + \cdots + a_k(n-k)^\ell\lambda^{n-k} = 0,$$
is satisfied, since the left-hand side can be written as a linear combination of the expressions $\lambda^j f^{(j)}(\lambda)$ with $j \le \ell$, the leading one being $\lambda^\ell f^{(\ell)}(\lambda)$. We have thus come close to the complete proof of the following:
Theorem. Every homogeneous linear difference equation of order $k$ over any field of scalars $\mathbb{K}$ contained in the complex numbers $\mathbb{C}$ has as the set of all its solutions a $k$-dimensional vector space generated by the sequences $x_n = n^\ell\lambda^n$, where $\lambda$ are the (complex) roots of the characteristic polynomial and the powers $\ell$ run over all
$y_n = c$. By plugging into the equation we obtain $c - c + \frac14 c = 1$, that is, $c = 4$. All solutions of the difference equation
$$y_{k+2} - y_{k+1} + \tfrac14\, y_k = 1$$
are thus of the form $4 + a\big(\frac12\big)^n + b\,n\big(\frac12\big)^n$. We require $y_0 = y_1 = 1$, and these two equations give $a = b = -3$; thus the solution of this non-homogeneous equation is
$$y_n = 4 - \frac{3}{2^n} - \frac{3n}{2^n}.$$
Again, since we know that the sequence given by this formula satisfies the given difference equation as well as the given initial conditions, it is indeed the only sequence characterised by these properties. □ In the previous case we used the so-called method of indeterminate coefficients. It is based on the following: on the basis of the non-homogeneous term of the given difference equation we "guess" the form of a particular solution. The forms of particular solutions are known for many types of non-homogeneous terms. For instance, the equation
$$(3.4)\qquad y_{n+k} + a_1 y_{n+k-1} + \cdots + a_k y_n = P_m(n),$$
where $P_m(n)$ is a polynomial of degree $m$ and the corresponding characteristic equation has real roots, has (almost always) a particular solution of the form $Q_m(n)$, where $Q_m(n)$ is a polynomial of degree $m$.
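Returning to the solved example above, a quick numerical check of its closed form against the recurrence (a sketch of ours):

```python
def closed(n):
    return 4 - 3 / 2**n - 3 * n / 2**n

y = [1.0, 1.0]
for k in range(10):
    y.append(1 + y[-1] - 0.25 * y[-2])   # y_{k+2} = 1 + y_{k+1} - y_k / 4
print(all(abs(y[n] - closed(n)) < 1e-12 for n in range(len(y))))  # True
```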
Another possible way to solve such equations is the so-called variation of constants method: we first find the solution
$$y(n) = \sum_{i=1}^{k} c_i f_i(n)$$
of the homogenised equation, then consider the constants $c_i$ as functions $c_i(n)$ of the variable $n$, and look for a particular solution of the given equation in the form
$$y(n) = \sum_{i=1}^{k} c_i(n) f_i(n).$$
Let us show in a picture the values of $f(n)$ for $n \le 35$ for the equation
$$f(n) = 9 f(n-1) - f(n-2) + 1, \qquad f(0) = f(1) = 1.$$
natural numbers between zero and the multiplicity of the corresponding root $\lambda$.
Proof. The aforementioned relations between the multiplicity of a root and the derivatives of the polynomial will be proven later, and we won't prove the fact that every complex polynomial has exactly as many roots (counting multiplicities) as its degree. Thus it just remains to prove that the k-tuple of solutions found is linearly independent. Even in this case we can inductively prove that the corresponding Casoratian is non-zero, as we already did in the case of the Vandermonde determinant before.
For illustration of our approach, let us show how the calculation looks in the case of a root λ₁ with multiplicity one and a root λ₂ with multiplicity two:
$$C(\lambda_1^n, \lambda_2^n, n\lambda_2^n) = \begin{vmatrix} \lambda_1^n & \lambda_2^n & n\lambda_2^n \\ \lambda_1^{n+1} & \lambda_2^{n+1} & (n+1)\lambda_2^{n+1} \\ \lambda_1^{n+2} & \lambda_2^{n+2} & (n+2)\lambda_2^{n+2} \end{vmatrix} = \lambda_1^n\lambda_2^{2n}\begin{vmatrix} 1 & 1 & n \\ \lambda_1 & \lambda_2 & (n+1)\lambda_2 \\ \lambda_1^2 & \lambda_2^2 & (n+2)\lambda_2^2 \end{vmatrix}$$
$$= \lambda_1^n\lambda_2^{2n}\begin{vmatrix} 1 & 1 & 0 \\ \lambda_1 & \lambda_2 & \lambda_2 \\ \lambda_1^2 & \lambda_2^2 & 2\lambda_2^2 \end{vmatrix} = \lambda_1^n\lambda_2^{2n+1}(\lambda_1 - \lambda_2)^2 \neq 0.$$
In the general case the proof can be carried out in a completely similar way, inductively. □
3.13. Real basis of the solutions. For equations with real coefficients, real initial conditions always lead to real solutions. Still, the corresponding fundamental solutions derived from the just proven theorem might exist only in the complex domain.
Let us therefore try to find other generators, which will be more convenient for us. Because the coefficients of the characteristic polynomial are real, each of its roots is either real or the roots come in complex conjugate pairs.
If we describe a root in the polar form as λ = |λ|(cos φ + i sin φ), then λ^n = |λ|^n(cos nφ + i sin nφ), and for a conjugate pair of roots the real and imaginary parts |λ|^n cos(nφ) and |λ|^n sin(nφ) yield two real generators of the solution space.
The unevenness of the curves is a consequence of imprecise plotting; both signals are of course smooth sinusoidal curves.
Solution. The characteristic polynomial of the given equation is x⁴ − x³ − x + 1. Looking for its roots, we solve the reciprocal equation
$$x^4 - x^3 - x + 1 = 0.$$
The standard procedure is to divide the equation by x² and then use the substitution t = x + 1/x, that is, t² = x² + 1/x² + 2. We obtain the equation
$$t^2 - t - 2 = 0$$
with roots t₁ = −1, t₂ = 2. For each of these values of the indeterminate t we separately solve the equation given by the substitution:
$$x + \frac{1}{x} = -1.$$
It has two complex roots: x₁ = −½ + i√3/2 = cos(2π/3) + i sin(2π/3) and x₂ = −½ − i√3/2 = cos(2π/3) − i sin(2π/3).
For the second value of the indeterminate t we obtain the equation
$$x + \frac{1}{x} = 2$$
Note that in the areas where the resulting signal is roughly as strong as the original there is a dramatic shift in the phase. Cheap equalisers indeed work in such a bad way.
with double root 1. Thus the basis of the vector space of solutions of the difference equation in question is the following quadruple of sequences: $\{(-\tfrac12 + i\tfrac{\sqrt3}{2})^n\}_{n=1}^{\infty}$, $\{(-\tfrac12 - i\tfrac{\sqrt3}{2})^n\}_{n=1}^{\infty}$, $\{1\}_{n=1}^{\infty}$ (the constant sequence) and $\{n\}_{n=1}^{\infty}$. If we are looking for a real basis, we must replace the two complex generators by sequences that are real. As these generators are geometric sequences whose members are complex conjugates of each other, it suffices to take as new generators the sequences given by half of their sum and by half of the i-multiple of their difference. This yields the following real basis of the solution space: $\{1\}_{n=1}^{\infty}$ (constant sequence), $\{n\}_{n=1}^{\infty}$, $\{\cos(n\cdot 2\pi/3)\}_{n=1}^{\infty}$, $\{\sin(n\cdot 2\pi/3)\}_{n=1}^{\infty}$. □
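The four real generators can be tested directly; a small plain-Python check (the recurrence is rewritten from the characteristic polynomial x⁴ − x³ − x + 1):

```python
from math import cos, sin, pi

# x^4 - x^3 - x + 1 = 0 corresponds to x_{n+4} = x_{n+3} + x_{n+1} - x_n.
basis = [
    lambda n: 1.0,                    # constant sequence (root 1)
    lambda n: float(n),               # from the double root 1
    lambda n: cos(2 * pi * n / 3),    # real part of the complex pair
    lambda n: sin(2 * pi * n / 3),    # imaginary part of the complex pair
]
for f in basis:
    for n in range(20):
        assert abs(f(n + 4) - f(n + 3) - f(n + 1) + f(n)) < 1e-12
print("all four real sequences solve the recurrence")
```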
3.8. Find a sequence that satisfies the given non-homogeneous difference equation with the initial conditions:
$$x_{n+2} = x_{n+1} + 2x_n + 1, \qquad x_1 = 2,\ x_2 = 2.$$
Solution. The general solution of the homogenised equation is of the form a(−1)^n + b·2^n. A particular solution is the constant −1/2. The general solution of the given non-homogeneous equation without initial conditions is thus
$$a(-1)^n + b\,2^n - \tfrac{1}{2}.$$
Plugging in the initial conditions gives the constants a = −5/6, b = 5/6. The given difference equation with initial conditions is thus satisfied by the sequence
$$x_n = -\tfrac{5}{6}(-1)^n + \tfrac{5}{3}\,2^{n-1} - \tfrac{1}{2}.$$
□
3.9. Determine the sequence of real numbers that satisfies the following non-homogeneous difference equation with initial conditions:
$$2x_{n+2} = -x_{n+1} + x_n + 2, \qquad x_1 = 2,\ x_2 = 3.$$
Solution. The general solution of the homogenised equation is of the form a(−1)^n + b(1/2)^n. A particular solution is the constant 1. The general solution of the non-homogeneous equation without initial conditions is thus
$$a(-1)^n + b\left(\tfrac{1}{2}\right)^n + 1.$$
Plugging in the initial conditions we obtain the constants a = 1, b = 4. The given equation with initial conditions is thus satisfied by the sequence
$$(-1)^n + 4\left(\tfrac{1}{2}\right)^n + 1.$$
3. Iterated linear processes
3.17. Iterated processes. In practical models we very often encounter the situation where the evolution of a system in a given time interval is given by a linear process, and we are interested in the behaviour of the system after many iterations. Very often the linear process remains the same; from the mathematical point of view we are thus dealing with repeated multiplication of the state vector by the same matrix.
While for solving systems of linear equations we needed only minimal knowledge of the properties of linear mappings, in order to understand the behaviour of an iterated system we need to know the properties of eigenvalues and eigenvectors and further structural results.
In a sense we are in the same setting as with linear recurrences, and actually our description of filters in the previous paragraphs can be phrased this way. Imagine that we are working with sound and keep track, via the state vector
$$Y_n = (x_n, x_{n-1}, \dots, x_{n-k+1})^T,$$
of all values from the actual one back to the last one that is still being processed in our linear filter. In one time interval (for the frequency of an audio signal a very short one) we then move to the state vector
$$Y_{n+1} = (x_{n+1}, x_n, \dots, x_{n-k+2})^T,$$
where the first value x_{n+1} = a₁x_n + ⋯ + a_k x_{n−k+1} is computed as with homogeneous difference equations, the other entries are just shifted by one position, and the last one is forgotten. The corresponding square matrix of order k satisfying Y_{n+1} = A·Y_n looks as follows:
$$A = \begin{pmatrix} a_1 & a_2 & \cdots & a_{k-1} & a_k \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}.$$
For such a simple matrix we have derived an explicit procedure for the complete formula of the solution. In general it won't be so easy, even for very similar systems. One of the typical cases is the study of the dynamics of a population in various biological systems.
Note also that the matrix A has (understandably) the characteristic polynomial
$$p(\lambda) = \lambda^k - a_1\lambda^{k-1} - \cdots - a_k,$$
as can be easily derived by expansion along the last column and recurrence. This can be explained also directly: the solution x_n = λ^n, λ ≠ 0, basically means that multiplication by the matrix A takes the vector (λ^k, …, λ)^T to its λ-multiple. Thus such λ must be an eigenvalue of the matrix A.
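A short sketch, assuming NumPy (the coefficients a₁, a₂, a₃ below are arbitrary sample values, not from the text), showing the matrix A built from a recurrence and its characteristic polynomial:

```python
import numpy as np

a = [0.5, 0.3, 0.2]            # a1, a2, a3 of x_{n+1} = a1 x_n + a2 x_{n-1} + a3 x_{n-2}
k = len(a)
A = np.zeros((k, k))
A[0, :] = a                    # the first row computes the new value
A[1:, :-1] = np.eye(k - 1)     # the other rows just shift the state by one

# np.poly returns the coefficients of the characteristic polynomial:
print(np.poly(A))              # -> [1, -0.5, -0.3, -0.2], i.e. l^3 - a1 l^2 - a2 l - a3
```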
3.18. Leslie model for population growth. Imagine that we are dealing with a system of individuals (cattle, insects, cell cultures, etc.) divided into m groups (according to their age, evolution stage, etc.). The state X_n is thus given by the vector
$$X_n = (u_1, \dots, u_m)^T$$
depending on the time t_n at which we observe the system. A linear model of the evolution of such a system is then given by a matrix A of dimension m, which gives the change of the vector X_n
□
3.10. Solve the following difference equation:
$$x_{n+4} = x_{n+3} - x_{n+2} + x_{n+1} - x_n.$$
Solution. From the theory we know that the space of solutions of this difference equation is a four-dimensional vector space whose generators can be obtained from the roots of the characteristic polynomial of the given equation. The characteristic equation is
$$x^4 - x^3 + x^2 - x + 1 = 0.$$
It is a reciprocal equation (that means that the coefficients of the k-th and the (n−k)-th powers of x are equal). Thus we use the substitution u = x + 1/x. After dividing the equation by x² (zero cannot be a root) and substituting (note that x² + 1/x² = u² − 2) we obtain
$$u^2 - u - 1 = 0.$$
Thus we obtain u_{1,2} = (1 ± √5)/2. From there, via the equation x² − ux + 1 = 0, we determine the four roots
$$x_{1,2,3,4} = \frac{1 \pm \sqrt{5} \pm \sqrt{-10 \pm 2\sqrt{5}}}{4}.$$
Now we note that the roots of the characteristic equation could have been "guessed" right away: it is
$$x^5 + 1 = (x + 1)(x^4 - x^3 + x^2 - x + 1),$$
and thus the roots of the polynomial x⁴ − x³ + x² − x + 1 are also roots of the polynomial x⁵ + 1, which are exactly the fifth roots of −1. By this we obtain that the roots of the characteristic polynomial are the numbers x_{1,2} = cos(π/5) ± i sin(π/5) and x_{3,4} = cos(3π/5) ± i sin(3π/5). Thus a real basis of the space of solutions of the given difference equation is, for instance, the basis of sequences cos(nπ/5), sin(nπ/5), cos(3nπ/5) and sin(3nπ/5), which are cosines and sines of the arguments of the corresponding powers of the roots of the characteristic polynomial.
Note that along the way we have derived the algebraic expressions
$$\cos\tfrac{\pi}{5} = \tfrac{1+\sqrt5}{4},\quad \sin\tfrac{\pi}{5} = \tfrac{\sqrt{10-2\sqrt5}}{4},\quad \cos\tfrac{3\pi}{5} = \tfrac{1-\sqrt5}{4},\quad \sin\tfrac{3\pi}{5} = \tfrac{\sqrt{10+2\sqrt5}}{4}$$
(because all the roots of the equation have absolute value 1, these are the real (imaginary) parts of the corresponding roots). □
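A numeric confirmation of these closed forms (plain Python):

```python
from math import cos, sin, pi, sqrt

assert abs(cos(pi / 5) - (1 + sqrt(5)) / 4) < 1e-12
assert abs(sin(pi / 5) - sqrt(10 - 2 * sqrt(5)) / 4) < 1e-12
assert abs(cos(3 * pi / 5) - (1 - sqrt(5)) / 4) < 1e-12
assert abs(sin(3 * pi / 5) - sqrt(10 + 2 * sqrt(5)) / 4) < 1e-12
print("the closed forms at pi/5 and 3*pi/5 check out")
```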
3.11. Determine the explicit expression of the sequence satisfying the difference equation x_{n+2} = 2x_{n+1} − 2x_n with members x₁ = 2, x₂ = 2.
Solution. The roots of the characteristic polynomial x² − 2x + 2 are 1 + i and 1 − i. The basis of the (complex) vector space of solutions
to
$$X_{n+1} = A\cdot X_n$$
when the time changes from t_n to t_{n+1}.
Let us show as an example the so-called Leslie model for population growth, which is given by the matrix
$$A = \begin{pmatrix} f_1 & f_2 & f_3 & \cdots & f_{m-1} & f_m \\ \tau_1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & \tau_2 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \tau_3 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & \tau_{m-1} & 0 \end{pmatrix},$$
whose parameters are tied to the evolution of a population divided into m age groups: f_i denotes the relative fertility of the corresponding age group (in the observed time shift, N individuals in the i-th group give rise to f_i·N new ones, which belong to the first group), while τ_i is the relative survival rate of the i-th group in one time interval (the fraction that moves on to the next group). Clearly such a model can be used with any number of age groups.
All coefficients are thus non-negative real numbers and the numbers τ_i are between zero and one. Note that when all τ_i equal one, it is actually a linear recurrence with constant coefficients, and thus we get either exponential growth/decay (for real roots λ of the characteristic polynomial) or oscillation connected with potential growth/decay (for complex roots).
Before we introduce a more general theory, let us play for a while with this specific model.
Direct computation with the Laplace expansion along the last column yields the characteristic polynomial p_m(λ) of the matrix A for the model with m groups:
$$p_m(\lambda) = |A - \lambda E| = -\lambda\, p_{m-1}(\lambda) + (-1)^{m-1} f_m \tau_1 \cdots \tau_{m-1}.$$
Easily by induction we derive that this characteristic polynomial is of the form
$$p_m(\lambda) = (-1)^m\bigl(\lambda^m - a_1\lambda^{m-1} - \cdots - a_{m-1}\lambda - a_m\bigr)$$
with non-negative coefficients a₁, …, a_m whenever all the parameters τ_i and f_i are positive. For instance, it is always
$$a_m = f_m \tau_1 \cdots \tau_{m-1}.$$
Let us qualitatively estimate the distribution of the roots of the polynomial p_m. Sadly, the details of this procedure can be properly explained only later, after understanding some parts of so-called mathematical analysis in chapter five and later; however, it should all be intuitively clear even now. We express the characteristic polynomial in the form
$$p_m(\lambda) = \pm\lambda^m\bigl(1 - q(\lambda)\bigr),$$
where q(λ) = a₁λ^{−1} + ⋯ + a_mλ^{−m} is a strictly decreasing non-negative function for λ > 0. Evidently there exists exactly one positive λ for which q(λ) = 1 and thus also p_m(λ) = 0. In other words, for every Leslie matrix there exists exactly one positive real eigenvalue.
For actual Leslie models of populations all coefficients τ_i and f_j are between zero and one, and a typical situation is that the only positive real eigenvalue λ₁ is greater than or equal to one, while the absolute values of the other eigenvalues are strictly smaller than one.
is thus formed by the sequences y_n = (1+i)^n and z_n = (1−i)^n. The sequence in question can thus be expressed as a linear combination of these sequences (with complex coefficients), x_n = a·y_n + b·z_n, where a = a₁ + ia₂, b = b₁ + ib₂. From the recurrence relation we compute x₀ = ½(2x₁ − x₂) = 1, and by substituting n = 0 and n = 1 into the expression for x_n we obtain
$$1 = x_0 = a_1 + ia_2 + b_1 + ib_2,$$
$$2 = x_1 = (a_1 + ia_2)(1 + i) + (b_1 + ib_2)(1 - i).$$
Comparing the real and the imaginary parts of both equations, we obtain the linear system of four equations
$$a_1 + b_1 = 1,\qquad a_2 + b_2 = 0,$$
$$a_1 - a_2 + b_1 + b_2 = 2,\qquad a_1 + a_2 - b_1 + b_2 = 0$$
with solution a₁ = b₁ = b₂ = ½, a₂ = −½. Thus we can express the sequence in question as
$$x_n = \left(\tfrac{1}{2} - \tfrac{1}{2}i\right)(1+i)^n + \left(\tfrac{1}{2} + \tfrac{1}{2}i\right)(1-i)^n.$$
The sequence can also be expressed using the real basis of the (complex) vector space of solutions, that is, using the sequences u_n = ½(y_n + z_n) = (√2)^n cos(nπ/4) and v_n = ½ i(z_n − y_n) = (√2)^n sin(nπ/4). The transition matrix for changing the basis from the complex one to the real one and its inverse are
$$T = \begin{pmatrix} \tfrac12 & -\tfrac{i}{2} \\ \tfrac12 & \tfrac{i}{2} \end{pmatrix}, \qquad T^{-1} = \begin{pmatrix} 1 & 1 \\ i & -i \end{pmatrix};$$
for expressing the sequence x_n in the real basis, that is, for the coordinates (c, d) of the sequence x_n under the basis {u_n, v_n}, we thus have
If we begin with any state vector X which is given as a sum of eigenvectors,
$$X = x_1 + \cdots + x_m,$$
with eigenvalues λ_i, then iterations yield
$$A^k\cdot X = \lambda_1^k x_1 + \cdots + \lambda_m^k x_m;$$
thus, under the assumption that |λ_i| < 1 for all i ≥ 2, all components in the eigensubspaces decrease very fast, except for the component λ₁^k x₁.
The distribution of the population among the age groups thus approaches very fast the ratios of the components of the eigenvector associated with the dominant eigenvalue λ₁.
For example, for the matrix (let us recall the meaning of the individual coefficients; they are taken from a model for sheep breeding, that is, the values τ_i contain both natural deaths and the activities of the breeders)
$$A = \begin{pmatrix} 0 & 0.2 & 0.8 & 0.6 & 0 \\ 0.95 & 0 & 0 & 0 & 0 \\ 0 & 0.8 & 0 & 0 & 0 \\ 0 & 0 & 0.7 & 0 & 0 \\ 0 & 0 & 0 & 0.6 & 0 \end{pmatrix}$$
the eigenvalues are approximately
$$1.03,\quad 0,\quad -0.5,\quad -0.27 + 0.74i,\quad -0.27 - 0.74i$$
with absolute values 1.03, 0, 0.5, 0.78, 0.78, and the eigenvector corresponding to the dominant eigenvalue is approximately
$$x^T = (30\ \ 27\ \ 21\ \ 14\ \ 8).$$
We have directly chosen the eigenvector whose coordinates sum to 100; it thus gives the percentage distribution of the population.
If, instead of a three-percent total growth of the population, we rather wanted a constant size and said that we will consume sheep from the second group, we would be asking how much we should decrease τ₂ so that the dominant eigenvalue becomes one.
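The numbers quoted above are easy to reproduce; a minimal sketch assuming NumPy (the rounding of the dominant eigenvalue may differ slightly in the last displayed digit):

```python
import numpy as np

A = np.array([
    [0,    0.2, 0.8, 0.6, 0],
    [0.95, 0,   0,   0,   0],
    [0,    0.8, 0,   0,   0],
    [0,    0,   0.7, 0,   0],
    [0,    0,   0,   0.6, 0],
])
w, V = np.linalg.eig(A)
i = np.argmax(w.real)               # the unique positive (dominant) eigenvalue
v = V[:, i].real
print(w[i].real)                    # close to 1.03
print(np.round(100 * v / v.sum()))  # proportions close to (30, 27, 21, 14, 8)
```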
3.19. Matrices with non-negative elements. Real matrices with no negative elements have very special properties. They are also very often present in practical models. We shall thus introduce the so-called Perron-Frobenius theory, which deals with such matrices.
Let us begin with the definition of some notions, so that we can formulate our ideas.
Positive and primitive matrices
Definition. By a positive matrix we understand a square matrix A all of whose elements a_ij are real and strictly positive. A primitive matrix is a square matrix A such that some power A^k is positive.
thus we have again an alternative expression of the sequence x_n in which no complex numbers appear (though there are square roots):
$$x_n = (\sqrt2)^n\cos\tfrac{n\pi}{4} + (\sqrt2)^n\sin\tfrac{n\pi}{4},$$
which we could also have obtained by solving two linear equations in the two variables c, d, namely 1 = x₀ = c·u₀ + d·v₀ = c and 2 = x₁ = c·u₁ + d·v₁ = c + d. □
Let us recall that the spectral radius of a matrix A is the maximum of the absolute values of all (complex) eigenvalues of A. The spectral radius of a linear mapping on a (finite-dimensional) vector space is the spectral radius of its matrix under some basis.
By the norm of a matrix A ∈ ℝ^{n²} or of a vector x ∈ ℝ^n we mean the sum of the absolute values of all its elements. For a vector x we write |x| for its norm.
The following result is very useful and hopefully also well understandable. Its proof is, in its difficulty, quite atypical for this
3.12. Determine the explicit expression of the sequence satisfying the difference equation x_{n+2} = 3x_{n+1} + 3x_n with members x₁ = 1 and x₂ = 3. ○
3.13. Determine the explicit formula for the n-th member of the unique solution {x_n}_{n=1}^∞ satisfying the following conditions:
$$x_{n+2} = x_{n+1} - x_n, \qquad x_1 = 1,\ x_2 = 5.$$ ○
3.14. Determine the explicit formula for the n-th member of the unique solution {x_n}_{n=1}^∞ satisfying the following conditions:
$$-x_{n+3} = 2x_{n+2} + 2x_{n+1} + x_n, \qquad x_1 = 1,\ x_2 = 1,\ x_3 = 1.$$ ○
3.15. Determine the explicit formula for the n-th member of the unique solution {x_n}_{n=1}^∞ satisfying the following conditions:
$$-x_{n+3} = 3x_{n+2} + 3x_{n+1} + x_n, \qquad x_1 = 1,\ x_2 = 1,\ x_3 = 1.$$ ○
C. Population models
The population models we are about to deal with lead to recurrence relations in vector spaces. The unknown in this case is not a sequence of numbers but a sequence of vectors; the role of the coefficients is played by matrices. We begin with a simple (two-dimensional) case.
3.16. Savings. With a friend we are saving for a holiday together by monthly payments in the following way. At the beginning I give 10 € and he gives 20 €. Every consecutive month each of us pays as much as the month before plus one half of what the other one paid the month before. How much will we have after one year? How much money will I pay in the twelfth month?
Solution. Denote the amount I pay in the n-th month by x_n and the amount my friend pays by y_n; counting in tens of euros, the first month we thus give x₁ = 1, y₁ = 2. For the following payments we can write down the recurrence relations
$$x_{n+1} = x_n + \tfrac12 y_n,$$
$$y_{n+1} = y_n + \tfrac12 x_n.$$
If we denote the common savings by z_n = x_n + y_n, then by summing the equations we obtain z_{n+1} = z_n + ½z_n = (3/2)z_n. That is a geometric sequence, and we obtain z_n = 3·(3/2)^{n−1}. Over the year we will thus save z₁ + z₂ + ⋯ + z₁₂. This partial sum is easy to compute:
textbook, so we give at least a vague idea of how to do it. If the reader has problems with smooth reading, we suggest skipping the proof immediately.
Theorem (Perron). If A is a primitive matrix with spectral radius λ ∈ ℝ, then λ is a simple root of the characteristic polynomial of the matrix A which is strictly greater than the absolute value of any other eigenvalue of A. Furthermore, there exists an eigenvector x associated with λ such that all the elements x_i of x are positive.
Vague idea of the proof. In the proof we shall rely on intuition from elementary geometry. Some of the concepts used will be made more precise in the analytic geometry of the fourth chapter, some analytical aspects will be studied in more detail in the fifth chapter and later, and some claims won't be proven in this textbook at all. Hopefully the presented ideas will not only illuminate the theorem but also motivate a deeper study of geometry and analysis by themselves. Let us begin with an understandable auxiliary lemma:
Lemma. Consider any polyhedron P containing the origin 0 ∈ ℝ^n. If some iteration of a linear mapping ψ : ℝ^n → ℝ^n maps P into its interior, then the spectral radius of the mapping ψ is strictly smaller than one.
Consider the matrix A of the mapping ψ under the standard basis. Because the eigenvalues of A^k are the k-th powers of the eigenvalues of the matrix A, we can without loss of generality assume that ψ itself already maps P into its interior. Clearly ψ cannot have any eigenvalue with absolute value greater than one.
Let us argue by contradiction. Assume that there exists an eigenvalue λ with |λ| = 1. There are then two possibilities: either λ^k = 1 for a suitable k, or there is no such k.
The image of P is a closed set (that means that whenever the points of the image cluster around some point y in ℝ^n, the point y also lies in the image), and the boundary of P is not intersected by the image at all. Thus ψ cannot have a fixed point on the boundary, and there cannot even be a point on the boundary to which points of the image would converge. The first argument excludes the case that some power of λ is one, because such a fixed point on the boundary of P would then exist. In the remaining case there would be a two-dimensional subspace W ⊂ ℝ^n on which the restriction of ψ acts as a rotation by an irrational multiple of 2π, and thus there definitely exists a point y in the intersection of W with the boundary of P. But then the point y could be approached arbitrarily closely by points from the set of iterates ψ^n(y), and thus y would have to be in the image as well. That is a contradiction, and thus the lemma is proven.
Now let us prove the Perron theorem. Our first step is ensuring the existence of an eigenvector with all elements positive. Let us consider the so-called standard simplex
$$S = \{x = (x_1, \dots, x_n)^T;\ |x| = 1,\ x_i \ge 0,\ i = 1, \dots, n\}.$$
Because all elements of the matrix A are non-negative, the image A·x has all coordinates non-negative whenever x does, and at least one of them is always non-zero. The mapping
$$x \mapsto \frac{1}{|A\cdot x|}\,(A\cdot x)$$
therefore
$$3\left(1 + \tfrac32 + \cdots + \left(\tfrac32\right)^{11}\right) = 3\cdot\frac{\left(\tfrac32\right)^{12} - 1}{\tfrac32 - 1} \approx 772.5.$$
maps S to itself. This mapping S → S satisfies all the assumptions of the so-called Brouwer fixed point theorem, and thus there exists a vector y ∈ S which is mapped by this mapping to itself.
In a year we will thus have saved over 772 tens of euros, that is, more than 7720 €.
The recurrence system of equations describing the savings can be written in matrix form as follows:
$$\begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix} = \begin{pmatrix} 1 & \tfrac12 \\ \tfrac12 & 1 \end{pmatrix}\cdot\begin{pmatrix} x_n \\ y_n \end{pmatrix}.$$
It is thus again a geometric sequence; its elements are now vectors, and the quotient is not a scalar but a matrix. The solution can be found analogously:
The power of the matrix acting on the vector (x₁, y₁) can be found by expressing this vector in a basis of eigenvectors. The characteristic polynomial of the matrix is (1 − λ)² − ¼ = 0, and thus the eigenvalues are λ_{1,2} = 3/2, 1/2. The corresponding eigenvectors are (1, 1) and (1, −1). For the initial vector (x₁, y₁) = (1, 2) we compute
$$\begin{pmatrix} 1 \\ 2 \end{pmatrix} = \frac32\begin{pmatrix} 1 \\ 1 \end{pmatrix} - \frac12\begin{pmatrix} 1 \\ -1 \end{pmatrix},$$
and thus
$$\begin{pmatrix} x_n \\ y_n \end{pmatrix} = \frac32\left(\frac32\right)^{n-1}\begin{pmatrix} 1 \\ 1 \end{pmatrix} - \frac12\left(\frac12\right)^{n-1}\begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$
That means that in the 12th month I pay
$$x_{12} = \left(\frac32\right)^{12} - \left(\frac12\right)^{12} \approx 130$$
tens of euros, that is, about 1300 €, and my friend pays essentially the same amount. □
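Direct iteration confirms both numbers; a plain-Python sketch (amounts in tens of euros, as in the computation above):

```python
x, y, total = 1.0, 2.0, 0.0           # x_1 = 1, y_1 = 2 (tens of euros)
for month in range(1, 13):
    total += x + y
    if month == 12:
        print(round(x, 1))            # ~129.7, the ~130 paid in the 12th month
    x, y = x + y / 2, y + x / 2       # simultaneous update of both payments
print(round(total, 1))                # ~772.5 saved within the year
```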
Remark. The previous example can also be solved without matrices, by rewriting the recurrence: x_{n+1} = x_n + ½y_n = ½x_n + ½z_n, where z_n is known explicitly.
The previous example was in fact a model of growth (in this case of the growth of saved money). Let us now turn to growth models describing primarily the growth of some population. The Leslie model of population growth, with which we dealt in great detail in the theoretical part, describes very well not only populations of sheep (for which it was developed) but can also be applied to modelling of populations such as the following:
3.17. Rabbits for the second time. Let us show how the Leslie model can describe the population of rabbits on a meadow, which we considered in exercise (‖3.4‖). Assume now that the rabbits die after reaching the ninth month of age (in the original model the rabbits were immortal). Denoting the numbers of rabbits according to their age in months at time t (in months) by x₁(t), x₂(t), …, x₉(t), the numbers of rabbits in the individual categories after one month are described by the formulas
That means that
$$A\cdot y = \lambda\, y, \qquad \lambda = |A\cdot y|,$$
and we have found an eigenvector that lies in S. Because some power A^k has, by our assumption, all elements positive, and of course A^k·y = λ^k y, all elements of the vector y are strictly positive (that is, y lies inside S) and λ > 0.
In order to prove the rest of the theorem, we consider the mapping given by the matrix A in a more suitable basis, and furthermore multiply it by the constant λ^{−1}:
$$B = \lambda^{-1}\,(Y^{-1}\cdot A\cdot Y),$$
where Y is the diagonal matrix with the coordinates y_i of the just-found eigenvector y on the diagonal. Evidently B is also a primitive matrix, and furthermore the vector z = (1, …, 1)^T is its eigenvector, because clearly Y·z = y.
If we now prove that μ = 1 is a simple root of the characteristic polynomial of the matrix B and that all other roots have absolute value strictly smaller than one, the proof will be finished.
To do that, we use the auxiliary lemma. Consider the matrix B as the matrix of a linear mapping acting on row vectors,
$$u = (u_1, \dots, u_n) \mapsto u\cdot B = v,$$
that is, by multiplication from the right. Thanks to the fact that z = (1, …, 1)^T is an eigenvector of the matrix B, the sum of the coordinates of the row vector v is
$$\sum_{i=1}^n v_i = \sum_{i,j=1}^n u_j b_{ji} = \sum_{j=1}^n u_j$$
whenever u ∈ S. Therefore the mapping maps the simplex S to itself, and thus it has in S a (row) eigenvector w with eigenvalue one (a fixed point, thanks to the theorem of Brouwer). Because some power B^k contains only strictly positive elements, the image of the simplex S under the k-th iteration of the mapping given by B lies inside S. We are getting close to using the lemma prepared for this proof.
We shall still work with row vectors. Denote by P the shift of the simplex S to the origin by the just-found eigenvector w, that is, P = −w + S. Evidently P is a polyhedron containing the origin, and the vector subspace V ⊂ ℝ^n generated by P is invariant under the action of the matrix B by multiplication of row vectors from the right. The restriction of our mapping to P thus satisfies the assumptions of the auxiliary lemma, and therefore all its eigenvalues are strictly smaller than one in absolute value.
We have yet to deal with the problem that the mapping just considered is given by multiplying row vectors by the matrix B from the right, while originally we were interested in the mapping given by the matrix B and multiplication of column vectors from the left. But that is equivalent to multiplying the transposed column vectors by the transposed matrix B^T in the usual way, from the left. Thus we have proven the claim about the eigenvalues for the transpose of B; but transposing does not change eigenvalues.
The dimension of the space V is n − 1, which completes the proof. □
$$x_1(t+1) = x_2(t) + x_3(t) + \cdots + x_9(t), \qquad x_i(t+1) = x_{i-1}(t)\ \text{ for } i = 2, 3, \dots, 9,$$
or
$$\begin{pmatrix} x_1(t+1) \\ x_2(t+1) \\ x_3(t+1) \\ x_4(t+1) \\ x_5(t+1) \\ x_6(t+1) \\ x_7(t+1) \\ x_8(t+1) \\ x_9(t+1) \end{pmatrix} = \begin{pmatrix} 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix}\cdot\begin{pmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \\ x_4(t) \\ x_5(t) \\ x_6(t) \\ x_7(t) \\ x_8(t) \\ x_9(t) \end{pmatrix}.$$
The characteristic polynomial of this matrix is λ⁹ − λ⁷ − λ⁶ − λ⁵ − λ⁴ − λ³ − λ² − λ − 1. The roots of this polynomial are hard to express explicitly, but we can estimate one of them very well: λ₁ ≈ 1.608 (why must it be smaller than (√5 + 1)/2?). Thus, according to this model, the population grows approximately as the geometric sequence 1.608^t.
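The estimate λ₁ ≈ 1.608 can be verified with a few lines, assuming NumPy:

```python
import numpy as np

A = np.zeros((9, 9))
A[0, 1:] = 1                               # all groups but the youngest reproduce
A[np.arange(1, 9), np.arange(0, 8)] = 1    # ageing by one month
lam = max(np.linalg.eigvals(A).real)
print(lam)                                 # ~1.608, indeed below (1 + sqrt(5)) / 2
```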
3.18. Pond. Let us have a simple model of a pond inhabited by a population of white fish (roach, bleak, vimba, nase, etc.). We assume that 20 % of the babies survive their second year, and from that age on they are able to reproduce. Of these young fish, approximately 60 % survive their third year, and in the following years their mortality can be ignored. Furthermore we assume that the birth rate is three times the number of fish able to reproduce.
Such a population would clearly fill the pond very quickly. We thus want to maintain balance using a predator, for instance the esox. Assume that one esox eats about 500 mature white fish per year. How many esox should be put into the pond in order for the population to stagnate?
Solution. If we denote by p the number of babies, by m the number of young fish and by r the number of adult fish, then the state of the population in the following year is given by
$$(p, m, r) \mapsto (3m + 3r,\ 0.2p,\ 0.6m + \tau r),$$
where 1 − τ is the relative mortality of the adult fish caused by the esox. The corresponding matrix describing this model is then
$$\begin{pmatrix} 0 & 3 & 3 \\ 0.2 & 0 & 0 \\ 0 & 0.6 & \tau \end{pmatrix}.$$
If the population is to stagnate, this matrix must have eigenvalue 1; in other words, one must be a root of its characteristic polynomial, which is of the form λ²(τ − λ) + 0.36 − 0.6(τ − λ) = 0. That means that τ must satisfy
$$(\tau - 1) + 0.36 - 0.6(\tau - 1) = 0, \qquad 0.4\,\tau - 0.04 = 0,$$
that is, τ = 0.1.
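With τ = 0.1 the matrix indeed has the eigenvalue 1, as a quick check (assuming NumPy) confirms:

```python
import numpy as np

tau = 0.1
M = np.array([[0, 3, 3], [0.2, 0, 0], [0, 0.6, tau]])
print(np.round(np.linalg.eigvals(M), 3))   # one of the eigenvalues equals 1
```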
3.20. Simple corollaries. With the knowledge of the Perron theorem, the following very useful claim has a surprisingly simple proof, and it shows how strong the property of being a primitive matrix is.
Corollary. If A = (a_ij) is a primitive matrix and x ∈ ℝ^n is its eigenvector with all coordinates non-negative and eigenvalue λ, then λ > 0 is the spectral radius of A. Furthermore,
$$\min_{j}\sum_{i=1}^{n} a_{ij} \;\le\; \lambda \;\le\; \max_{j}\sum_{i=1}^{n} a_{ij}.$$
Proof. From the theorem of Perron we know that the spectral radius μ is an eigenvalue; choose an eigenvector y associated with μ such that the difference x − y has only strictly positive coordinates. Then necessarily for all powers n we have
$$0 \le A^n\cdot(x - y) = \lambda^n x - \mu^n y,$$
but we also have λ ≤ μ. From there we directly get λ = μ.
It remains to estimate the spectral radius by the minimum and the maximum of the sums of the individual columns of the matrix. We denote these by b_min and b_max, scale x so that the sum of its coordinates equals one, and count:
$$\lambda = \sum_{i=1}^{n}\lambda x_i = \sum_{i,j=1}^{n} a_{ij}x_j = \sum_{j=1}^{n}\Bigl(\sum_{i=1}^{n} a_{ij}\Bigr)x_j \le \sum_{j=1}^{n} b_{\max}\,x_j = b_{\max},$$
and similarly
$$\lambda = \sum_{j=1}^{n}\Bigl(\sum_{i=1}^{n} a_{ij}\Bigr)x_j \ge \sum_{j=1}^{n} b_{\min}\,x_j = b_{\min}.$$
□
Note that, for instance, all Leslie matrices from 3.18 in which all the coefficients f_i and τ_j are strictly positive are primitive, so we can apply the just derived results to them.
The Perron-Frobenius theorem is a generalisation of the Perron theorem to more general matrices; we won't give it here. More information can be found, for instance, in ??.
3.21. Markov chains. A very frequent and interesting case of linear processes with only non-negative elements in the matrix is the mathematical model of a system which can be in one of m states with various probabilities. At a given point of time, the system is in state i with probability x_i, and the transition from the state j to the state i happens with probability t_ij.
We can write the process as follows: at time n the system is described by the probability vector
$$x_n = (u_1(n), \dots, u_m(n))^T.$$
That means that all components of the vector x are real non-negative numbers and their sum equals one. The components give the distribution of the probability of the individual possibilities for the state of the system. The distribution of probabilities at time
In the following year only 10 % of the adult fish are allowed to survive, and the rest should be eaten by the esox. If we denote the desired number of esox by x, then together they eat 500x white fish, which according to the previous computation should be 0.9r. The ratio of the number of white fish to the number of esox should thus be r/x = 500/0.9 ≈ 556. That is approximately one esox per 556 white fish. □ In general, we can work with the previous model as follows:
3.19. Let the relation between the numbers of predators D_k and prey K_k in a given month and the following month (k ∈ ℕ ∪ {0}) in a predator-prey population model be given by the linear system
(a) D_{k+1} = 0.6 D_k + 0.5 K_k, K_{k+1} = −0.16 D_k + 1.2 K_k;
(b) D_{k+1} = 0.6 D_k + 0.5 K_k, K_{k+1} = −0.175 D_k + 1.2 K_k;
(c) D_{k+1} = 0.6 D_k + 0.5 K_k, K_{k+1} = −0.135 D_k + 1.2 K_k.
Let us analyse the behaviour of this model after a very long time.
Solution. Note that the individual variants differ from each other only in the value of the coefficient at D_k in the second equation. We can thus express all three cases as
$$\begin{pmatrix} D_k \\ K_k \end{pmatrix} = \begin{pmatrix} 0.6 & 0.5 \\ -a & 1.2 \end{pmatrix}\cdot\begin{pmatrix} D_{k-1} \\ K_{k-1} \end{pmatrix},$$
where we successively set a = 0.16, a = 0.175, a = 0.135. The value of the coefficient a represents here the average number of prey killed by one (clearly "humble") predator per month. Denoting
$$T = \begin{pmatrix} 0.6 & 0.5 \\ -a & 1.2 \end{pmatrix},$$
we immediately obtain
$$\begin{pmatrix} D_k \\ K_k \end{pmatrix} = T^k\cdot\begin{pmatrix} D_0 \\ K_0 \end{pmatrix}, \qquad k \in \mathbb{N}.$$
Using the powers of the matrix T we can determine the evolution of the populations of predators and prey after a very long time. We easily compute the eigenvalues
(a) λ₁ = 1, λ₂ = 0.8;
(b) λ₁ = 0.95, λ₂ = 0.85;
(c) λ₁ = 1.05, λ₂ = 0.75,
and the respective eigenvectors
(a) (5, 4)^T, (5, 2)^T;
(b) (10, 7)^T, (2, 1)^T;
(c) (10, 9)^T, (10, 3)^T.
n + 1 is given by multiplication by the probabilistic transition matrix T = (t_ij), that is,
$$x_{n+1} = T\cdot x_n.$$
Because we assume that the vector x captures all possible states, and the system again transits with total probability one into one of the states, all columns of T are also probabilistic vectors. Such a process is called a (discrete) Markov process, and the resulting sequence of vectors x₀, x₁, … is called a Markov chain.
Note that every probabilistic vector x is indeed mapped by a Markov process to a vector with the sum of coordinates equal to one:
$$\sum_{i,j} t_{ij}x_j = \sum_{j}\Bigl(\sum_{i} t_{ij}\Bigr)x_j = \sum_{j} x_j = 1.$$
Now we can use the Perron-Frobenius theory in its full power. Because the sum of the rows of the matrix T always equals the vector (1, …, 1), the matrix T − E is singular, and thus one is surely an eigenvalue of the matrix T.
If, furthermore, T is a primitive matrix (for instance when all its elements are non-zero), we know from the corollary in 3.20 that one is a simple root of the characteristic polynomial and all other roots have absolute value strictly smaller than one.
Theorem. Markov processes with a matrix which has no zero element, or some power of which has this property, satisfy:
• there exists a unique probabilistic eigenvector x∞ for the eigenvalue 1;
• the iterations T^k x₀ approach the vector x∞ for any initial probabilistic vector x₀.
Proof. The first claim follows directly from the positivity of the coordinates of the eigenvector derived in the Perron theorem.
Assume first that the algebraic and geometric multiplicities of the eigenvalues of the matrix T coincide. Then every probabilistic vector x₀ can be written (in the complex extension ℂ^n) as a linear combination
$$x_0 = c_1 x_\infty + c_2 u_2 + \cdots + c_n u_n,$$
where u₂, …, u_n extend x∞ to a basis of eigenvectors. The k-th iteration then again gives a probabilistic vector
$$x_k = T^k\cdot x_0 = c_1 x_\infty + \lambda_2^k c_2 u_2 + \cdots + \lambda_n^k c_n u_n.$$
Because all the eigenvalues λ₂, …, λ_n are strictly smaller than one in absolute value, all components of the vector x_k except the first one approach zero (in norm) very rapidly. But x_k remains probabilistic, so it must be that c₁ = 1, and the second claim is proven.
In reality, even with distinct algebraic and geometric multiplicities of eigenvalues we reach the same conclusion by a more detailed study of the so-called root subspaces of the matrix T, which we carry out in connection with the so-called Jordan matrix decomposition later in this chapter; see the note 3.33.
Even in the general case there is, besides the eigensubspace ⟨x∞⟩, a uniquely determined invariant (n−1)-dimensional complement, on which all eigenvalues are smaller than one in absolute value, and thus the corresponding components of x_k approach zero as before. □
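The convergence claimed by the theorem is easy to watch numerically; a minimal sketch assuming NumPy, with an illustrative 2×2 primitive column-stochastic matrix of our own choosing:

```python
import numpy as np

T = np.array([[0.9, 0.3],
              [0.1, 0.7]])     # columns sum to one, all entries positive
x = np.array([1.0, 0.0])       # an arbitrary probabilistic starting vector
for _ in range(50):
    x = T @ x
print(x)                       # -> [0.75 0.25], the probabilistic eigenvector for 1
```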
For k ∈ ℕ we thus have
(a) $$T^k = \begin{pmatrix} 5 & 5 \\ 4 & 2 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 0.8^k \end{pmatrix}\begin{pmatrix} 5 & 5 \\ 4 & 2 \end{pmatrix}^{-1};$$
(b) $$T^k = \begin{pmatrix} 10 & 2 \\ 7 & 1 \end{pmatrix}\begin{pmatrix} 0.95^k & 0 \\ 0 & 0.85^k \end{pmatrix}\begin{pmatrix} 10 & 2 \\ 7 & 1 \end{pmatrix}^{-1};$$
(c) $$T^k = \begin{pmatrix} 10 & 10 \\ 9 & 3 \end{pmatrix}\begin{pmatrix} 1.05^k & 0 \\ 0 & 0.75^k \end{pmatrix}\begin{pmatrix} 10 & 10 \\ 9 & 3 \end{pmatrix}^{-1}.$$
From there we further have, for big k ∈ ℕ,
(a) $$T^k \approx \begin{pmatrix} 5 & 5 \\ 4 & 2 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 5 & 5 \\ 4 & 2 \end{pmatrix}^{-1} = \frac{1}{10}\begin{pmatrix} -10 & 25 \\ -8 & 20 \end{pmatrix},$$
(b) $$T^k \approx \begin{pmatrix} 10 & 2 \\ 7 & 1 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 10 & 2 \\ 7 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},$$
(c) $$T^k \approx \begin{pmatrix} 10 & 10 \\ 9 & 3 \end{pmatrix}\begin{pmatrix} 1.05^k & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 10 & 10 \\ 9 & 3 \end{pmatrix}^{-1} = \frac{1.05^k}{60}\begin{pmatrix} -30 & 100 \\ -27 & 90 \end{pmatrix},$$
because exactly for big k ∈ ℕ we can set
(a) $$\begin{pmatrix} 1 & 0 \\ 0 & 0.8^k \end{pmatrix} \approx \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix};\quad \text{(b)}\ \begin{pmatrix} 0.95^k & 0 \\ 0 & 0.85^k \end{pmatrix} \approx \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix};\quad \text{(c)}\ \begin{pmatrix} 1.05^k & 0 \\ 0 & 0.75^k \end{pmatrix} \approx \begin{pmatrix} 1.05^k & 0 \\ 0 & 0 \end{pmatrix}.$$
Let us note that in the variant (b), that is, for a = 0.175, it was not even necessary to compute the eigenvectors. Thus we have obtained
(a) $$\begin{pmatrix} D_k \\ K_k \end{pmatrix} \approx \frac{1}{10}\begin{pmatrix} -10 & 25 \\ -8 & 20 \end{pmatrix}\begin{pmatrix} D_0 \\ K_0 \end{pmatrix} = \frac{1}{10}\begin{pmatrix} 5\,(-2D_0 + 5K_0) \\ 4\,(-2D_0 + 5K_0) \end{pmatrix},$$
3.22. Iteration of stochastic matrices. Matrices of Markov chains, that is, matrices whose columns sum to one, are called stochastic matrices. Standard problems connected with Markov processes ask, for example, for the expected time between transitions from one state to another, and so on. Right now we are not equipped for solving these problems, but we will return to this topic later.
We reformulate the previous theorem into a simple but surprising result. By convergence to a limit matrix in the following theorem we mean the following: if we bound the allowed error ε > 0, then we can find a bound on the number of iterations k after which all the components of the matrix differ from the limit ones by less than ε.
Corollary. Let T be a primitive stochastic matrix of a Markov process and let x∞ be the stochastic eigenvector for the dominant eigenvalue 1 (as in the theorem above). Then the iterations T^k converge to the limit matrix T∞ whose columns all equal x∞.
Proof. The columns of the matrix T^k are the images of the vectors of the standard basis under the corresponding iterated linear mapping. But these are images of probabilistic vectors, and thus all of them converge to x∞. □
Now, as a short goodbye to Markov processes, let us think about the question whether there exist states into which the system tends to get and then stay in them.
We say that a state is transient if the system stays in it with probability strictly smaller than one. A state is absorbing if the system, once there, stays in it with probability one, and if the system can get into it with non-zero probability from any of the transient states. Finally, a Markov chain is absorbing if all its states are either absorbing or transient.
If in an absorbing Markov chain the first r states of the system are absorbing, then this means that the stochastic matrix T of the system decomposes into the block-wise upper triangular form
$$T = \begin{pmatrix} E & R \\ 0 & Q \end{pmatrix},$$
where E is the unit matrix whose dimension is given by the number of absorbing states, while R is a positive matrix and Q a non-negative one. Iterations of this matrix yield a matrix with the same block of zero values in the bottom-left corner, so T is not primitive; for instance
$$T^2 = \begin{pmatrix} E & R + R\cdot Q \\ 0 & Q^2 \end{pmatrix}.$$
Even about such matrices we can obtain much information using the full Perron-Frobenius theory, and with some knowledge of probability and statistics we can also estimate the expected time after which the system gets into one of the absorbing states.
4. More matrix calculus
We have seen in quite practical examples that understanding the inner structure of matrices and their properties is a strong tool for specific computations and analyses. This is even more true for the efficiency of numerical calculations with matrices. Therefore we will now deal with abstract theory for a while.
(b) $$\begin{pmatrix} D_k \\ K_k \end{pmatrix} \approx \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} D_0 \\ K_0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix},$$
(c) $$\begin{pmatrix} D_k \\ K_k \end{pmatrix} \approx \frac{1.05^k}{60}\begin{pmatrix} -30 & 100 \\ -27 & 90 \end{pmatrix}\begin{pmatrix} D_0 \\ K_0 \end{pmatrix} = \frac{1.05^k}{60}\begin{pmatrix} 10\,(-3D_0 + 10K_0) \\ 9\,(-3D_0 + 10K_0) \end{pmatrix}.$$
These results can be interpreted as follows:
(a) If 2D₀ < 5K₀, the sizes of both populations stabilise at non-zero values (we say that they are stable); if 2D₀ > 5K₀, both populations die out.
(b) Both populations die out.
(c) For 3D₀ < 10K₀ a population boom of both species begins; for 3D₀ > 10K₀ both populations die out.
Even a tiny change in the value of a can thus lead to a completely different result. This is caused by the constancy of the value of a: it does not depend on the sizes of the populations. Note that this restriction (that is, assuming a to be constant) is not realistic. Still, we obtain an estimate of the values of a for which the populations are stable.
□
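The three regimes can be read off mechanically from the dominant eigenvalues; a short check assuming NumPy:

```python
import numpy as np

for a in (0.16, 0.175, 0.135):
    T = np.array([[0.6, 0.5], [-a, 1.2]])
    print(a, np.round(np.linalg.eigvals(T), 3))
# a = 0.16  -> 1.0,  0.8   (populations can stabilise)
# a = 0.175 -> 0.95, 0.85  (both die out)
# a = 0.135 -> 1.05, 0.75  (boom or extinction, depending on the initial state)
```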
3.20. Remark. Another model for populations of predators and prey is the model of Lotka and Volterra, which describes the relation between the populations by a system of two ordinary differential equations. In that model both populations oscillate, which is in accord with observations.
In linear models an important role is played by the primitive matrices (3.19).
3.21. Which of the matrices
$$A = \begin{pmatrix} 0 & \tfrac17 \\ 1 & \tfrac67 \end{pmatrix},\qquad B = \begin{pmatrix} \tfrac12 & 0 & \tfrac13 \\ 0 & 1 & \tfrac12 \\ \tfrac12 & 0 & \tfrac16 \end{pmatrix},\qquad C = \begin{pmatrix} 0 & 1 & 0 \\ \tfrac14 & 0 & \tfrac12 \\ \tfrac34 & 0 & \tfrac12 \end{pmatrix},$$
$$D = \begin{pmatrix} \tfrac13 & \tfrac12 & 0 & 0 \\ \tfrac12 & \tfrac13 & 0 & 0 \\ 0 & \tfrac16 & \tfrac16 & \tfrac13 \\ \tfrac16 & 0 & \tfrac56 & \tfrac23 \end{pmatrix},\qquad E = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}$$
are primitive?
Solution. Because
$$A^2 = \begin{pmatrix} \tfrac17 & \tfrac{6}{49} \\ \tfrac67 & \tfrac{43}{49} \end{pmatrix}, \qquad C^3 = \begin{pmatrix} \tfrac38 & \tfrac14 & \tfrac14 \\ \tfrac14 & \tfrac38 & \tfrac14 \\ \tfrac38 & \tfrac38 & \tfrac12 \end{pmatrix},$$
We will further investigate some special types of linear mappings on vector spaces, as well as the general case, where the structure is described by the so-called Jordan theorem.
3.23. Unitary spaces and mappings. We are already used to the fact that it is efficient to work in the domain of complex numbers, even when we are interested only in real objects. Furthermore, in many areas complex vector spaces are a necessary component of the problem. For instance, take so-called quantum computing, which has become a very active area of theoretical computer science, although quantum computers have not been constructed yet (in a usable form).
Therefore we extend what we know about orthogonal mappings from the end of the second chapter with the following definitions:
Unitary spaces
Definition. A unitary space is a complex vector space V together with a mapping V × V → ℂ, (u, v) ↦ u·v, which satisfies for all vectors u, v, w ∈ V and scalars a ∈ ℂ:
(1) u·v = $\overline{v\cdot u}$ (the bar stands for complex conjugation),
(2) (au)·v = a(u·v),
(3) (u + v)·w = u·w + v·w,
(4) if u ≠ 0, then u·u > 0 (note that the expression is real by (1)).
Such a mapping is called a scalar product on V.
The real number √(v·v) is called the size of the vector v, and a vector is normalised if its size equals one. Vectors u and v are called orthogonal if their scalar product is zero; a basis composed of mutually orthogonal and normalised vectors is called an orthonormal basis of V.
At first sight this is an extension of the definition of Euclidean vector spaces into the complex domain. We will keep on using the alternative notation ⟨u, v⟩ for the scalar product of vectors u and v. Identically to the real domain, we obtain immediately from the definition the following simple properties of the scalar product, for all vectors in V and scalars in ℂ:
$$u\cdot u \in \mathbb{R},$$
$$u\cdot u = 0 \text{ if and only if } u = 0,$$
$$u\cdot(av) = \bar a\,(u\cdot v),$$
$$u\cdot(v + w) = u\cdot v + u\cdot w,$$
$$u\cdot 0 = 0\cdot u = 0,$$
$$\Bigl(\sum_i a_i u_i\Bigr)\cdot\Bigl(\sum_j b_j v_j\Bigr) = \sum_{i,j} a_i\bar b_j\,(u_i\cdot v_j),$$
where the last equality holds for all finite linear combinations. It is a simple exercise to prove everything formally; for instance, the first property follows from the defining property (1).
The standard example of a scalar product on a complex vector space is
$$(x_1, \dots, x_n)^T\cdot(y_1, \dots, y_n)^T = x_1\bar y_1 + \cdots + x_n\bar y_n.$$
Thanks to the conjugation of the coordinates of the second argument, this mapping satisfies all the required properties. The space ℂ^n with this scalar product is called the standard unitary space of dimension n. In matrix notation we can write this scalar product as x·y = ȳ^T·x.
the matrices A and C are primitive. On the other hand, the middle column of the matrix B^n is always (for n ∈ ℕ) the vector (0, 1, 0)^T; hence the matrix B cannot be primitive. The product
$$D\cdot\begin{pmatrix} 0 \\ 0 \\ a \\ b \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ a/6 + b/3 \\ 5a/6 + 2b/3 \end{pmatrix}, \qquad a, b \in \mathbb{R},$$
implies that the matrix D² has a zero two-dimensional (square) sub-matrix in its upper-right corner. By repeating this implication we obtain that the same property is shared by the matrices D³ = D·D², D⁴ = D·D³, …, D^n = D·D^{n−1}; thus the matrix D is not primitive. The matrix E is a permutation matrix (in every row and every column there is exactly one non-zero element, namely 1). It is not difficult to see that the powers of a permutation matrix are again permutation matrices, so the matrix E is also not primitive. This can be verified directly by calculating the powers E², E³, E⁴; the matrix E⁴ is the unit matrix. □
A mechanical way to run such primitivity checks is sketched below; after that we show a more robust model.
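A minimal primitivity test, assuming NumPy; it relies on Wielandt's bound, by which a primitive n × n matrix already has A^k positive for some k ≤ (n−1)² + 1:

```python
import numpy as np

def is_primitive(A):
    n = A.shape[0]
    P = np.eye(n)
    for _ in range((n - 1) ** 2 + 1):      # Wielandt's bound on the exponent
        P = P @ A
        if (P > 0).all():
            return True
    return False

A = np.array([[0, 1/7], [1, 6/7]])
B = np.array([[1/2, 0, 1/3], [0, 1, 1/2], [1/2, 0, 1/6]])
C = np.array([[0, 1, 0], [1/4, 0, 1/2], [3/4, 0, 1/2]])
print(is_primitive(A), is_primitive(B), is_primitive(C))   # True False True
```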
3.22. Model of spreading of annual plants. We consider the plants that at the beginning of the summer blossom, at the peak of the summer produce seeds and die. Some of the seeds burst into flowers at the end of the autumn, some survive the winter in the ground and burst into flowers at the start of the spring. The flowers that burst out in autumn and survive the winter are usually bigger in the spring and usually produce more seeds. After this, the whole cycle repeats.
The year is thus divided into four parts and in each of these parts we distinguish between some "forms" of the flower:
Part: Stage
beginning of the spring: small and big seedlings
beginning of the summer: small, medium and big blossoming flowers
peak of the summer: seeds
autumn: seedlings and seeds
We denote by x₁(t) and x₂(t) the numbers of small and big seedlings, respectively, at the start of the spring of the year t, and by y₁(t), y₂(t) and y₃(t) the numbers of small, medium and big flowers, respectively, in the summer of that year. From a small seedling either a small or a medium flower grows; from a big seedling either a medium or a big flower grows. Each seedling can of course die (weather, being eaten by a cow, etc.), and then nothing grows out of it. Denote by b_ij the probability that a seedling of the j-th size, j = 1, 2, grows into a flower of the
Completely analogously to the Euclidean spaces and orthogonal mappings, great importance is carried by those mappings that respect the scalar product.
Unitary mappings
A linear mapping φ : V → W between unitary spaces is called a unitary mapping if for all vectors u, v ∈ V we have
$$u\cdot v = \varphi(u)\cdot\varphi(v).$$
A unitary isomorphism is a bijective unitary mapping.
3.24. Properties of spaces with scalar product. In the brief discussion of Euclidean spaces in the previous chapter we have already derived some simple properties of spaces with a scalar product. The proofs for the complex case are very similar.
In the following we shall work with real and complex spaces simultaneously and write K for ℝ or ℂ; in the real case the conjugation is just the identity mapping (being the restriction of the conjugation in the complex plane to the real line). Similarly to the real case, we define in general, for an arbitrary vector subspace U ⊂ V of a space with scalar product, its orthogonal complement
$$U^\perp = \{v \in V;\ u\cdot v = 0 \text{ for all } u \in U\},$$
which is clearly also a vector subspace of V.
In the following paragraphs we work exclusively with finite-dimensional unitary or Euclidean spaces. However, many of our results have a natural generalisation to so-called Hilbert spaces, which are certain infinite-dimensional spaces with scalar products; we return to them later, albeit briefly.
Proposition. For every finite-dimensional space V of dimension n with scalar product we have:
(1) In V there exists an orthonormal basis.
(2) Every system of non-zero orthogonal vectors in V is linearly independent and can be extended to an orthogonal basis.
(3) For every system of linearly independent vectors (u₁, …, u_k) there exists an orthonormal basis (v₁, …, v_n) such that its vectors successively generate the same subspaces as the vectors u_j, that is, ⟨v₁, …, v_i⟩ = ⟨u₁, …, u_i⟩, 1 ≤ i ≤ k.
(4) If (u₁, …, u_n) is an orthonormal basis of V, then the coordinates of every vector u ∈ V are expressed via
$$u = (u\cdot u_1)u_1 + \cdots + (u\cdot u_n)u_n.$$
(5) In any orthonormal basis the scalar product has the coordinate form
$$u\cdot v = x\cdot y = x_1\bar y_1 + \cdots + x_n\bar y_n,$$
where x and y are the columns of coordinates of the vectors u and v in the chosen basis. In particular, every n-dimensional space with scalar product is isomorphic to the standard Euclidean ℝ^n or the standard unitary ℂ^n.
(6) The orthogonal sum of unitary subspaces V₁ + ⋯ + V_k in V is always a direct sum.
(7) If A ⊂ V is an arbitrary subset, then A^⊥ ⊂ V is a vector (and thus also unitary) subspace and (A^⊥)^⊥ ⊂ V is exactly the subspace generated by A. Furthermore, V = ⟨A⟩ ⊕ A^⊥.
(8) V is the orthogonal sum of n one-dimensional unitary subspaces.
i-th size, i = 1, 2, 3. Then we have
$$0 < b_{11} < 1,\quad b_{12} = 0,\quad 0 < b_{21} < 1,\quad 0 < b_{22} < 1,\quad b_{31} = 0,\quad 0 < b_{32} < 1,$$
$$b_{11} + b_{21} \le 1,\qquad b_{22} + b_{32} \le 1$$
(think in detail about what each of these inequalities expresses). Using the classical probability, we can compute b₁₁ as the ratio of the positive outcomes (a small seedling grew into a small flower) to all possible outcomes (the number of small seedlings), that is, b₁₁ = y₁(t)/x₁(t). From there
$$y_1(t) = b_{11}x_1(t).$$
Analogously we obtain the equality
$$y_3(t) = b_{32}x_2(t).$$
If we denote for a while by y_{2,1}(t) and y_{2,2}(t) the numbers of medium flowers that grew out of small and big seedlings respectively, we have y₂(t) = y_{2,1}(t) + y_{2,2}(t) and b₂₁ = y_{2,1}(t)/x₁(t), b₂₂ = y_{2,2}(t)/x₂(t), and thus
$$y_2(t) = b_{21}x_1(t) + b_{22}x_2(t).$$
If we now collect the values into the vectors and matrix
$$x(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix},\qquad y(t) = \begin{pmatrix} y_1(t) \\ y_2(t) \\ y_3(t) \end{pmatrix},\qquad B = \begin{pmatrix} b_{11} & 0 \\ b_{21} & b_{22} \\ 0 & b_{32} \end{pmatrix},$$
we can rewrite the previous equations in the matrix notation
$$y(t) = B\,x(t).$$
Denoting by c₁₁, c₁₂ and c₁₃ the numbers of seeds produced by a small, medium and big flower respectively, and by z(t) the total number of seeds produced in the summer of the year t, we have
$$z(t) = c_{11}y_1(t) + c_{12}y_2(t) + c_{13}y_3(t),$$
or in matrix notation
$$z(t) = C\,y(t)$$
with
$$C = (c_{11}\ \ c_{12}\ \ c_{13}).$$
If the matrix C is to describe the modelled reality, we assume that the inequalities
$$0 < c_{11} < c_{12} < c_{13}$$
hold.
Finally, denote by w₁(t) and w₂(t) the number of seeds that burst in the autumn and the number of seeds that stay in the ground over the winter, respectively; by d₁₁ and d₂₁ the probabilities that a seed bursts out in the autumn and that it does not, respectively; and by f₁₁ and f₂₂ the probabilities that the seedling and the seed, respectively, do
Proof. (1), (2), (3): We first extend the given system of vectors to any basis (u₁, …, u_n) of the space V and then run the Gram-Schmidt orthogonalisation from 2.42. This yields an orthogonal basis with the properties required in (3). From the Gram-Schmidt orthogonalisation algorithm it is clear that if the first k vectors already formed an orthogonal system, then they are not changed during the process. Thus we have also proved (2) and (1).
(4): If u = a₁u₁ + ⋯ + a_nu_n, then
$$u\cdot u_i = a_1(u_1\cdot u_i) + \cdots + a_n(u_n\cdot u_i) = a_i\,\|u_i\|^2 = a_i.$$
(5): Similarly we compute, for arbitrary vectors u = x₁u₁ + ⋯ + x_nu_n and v = y₁u₁ + ⋯ + y_nu_n,
$$u\cdot v = (x_1u_1 + \cdots + x_nu_n)\cdot(y_1u_1 + \cdots + y_nu_n) = x_1\bar y_1 + \cdots + x_n\bar y_n.$$
(6): We need to show that for any pair V_i, V_j of the given subspaces, their intersection is trivial. If u ∈ V_i and simultaneously u ∈ V_j, then u ⊥ u, that is, u·u = 0. That is possible only for the zero vector u ∈ V.
(7): Let u, v ∈ A^⊥. Then (au + bv)·w = 0 for all w ∈ A and a, b ∈ K (by the distributivity of the scalar product). We have thus checked that A^⊥ is a unitary subspace of V. Let (v₁, …, v_k) be some basis of ⟨A⟩ chosen among the elements of A, and let (u₁, …, u_k) be the orthonormal basis obtained by the Gram-Schmidt orthogonalisation of the vectors (v₁, …, v_k). We extend it to an orthonormal basis of the whole V (both exist thanks to the already proven parts of this proposition). Because it is an orthogonal basis, necessarily ⟨u_{k+1}, …, u_n⟩ = ⟨u₁, …, u_k⟩^⊥ = A^⊥ and A ⊂ ⟨u_{k+1}, …, u_n⟩^⊥ (this follows from expressing the coordinates under the orthonormal basis). If u ⊥ ⟨u_{k+1}, …, u_n⟩, then u is necessarily a linear combination of the vectors u₁, …, u_k, but that is the case exactly when it is a linear combination of the vectors v₁, …, v_k, which is equivalent to u lying in ⟨A⟩.
(8): This is equivalent to the existence of an orthonormal basis. □
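The Gram-Schmidt procedure used in this proof is easy to carry out numerically; a minimal sketch assuming NumPy, with the convention ⟨u, v⟩ = Σ uᵢv̄ᵢ from this chapter (the two sample vectors are our own):

```python
import numpy as np

def gram_schmidt(vectors):
    basis = []
    for v in vectors:
        w = np.asarray(v, dtype=complex)
        for e in basis:
            w = w - np.vdot(e, w) * e        # subtract the projection <w, e> e
        basis.append(w / np.linalg.norm(w))  # normalise to size one
    return basis

e1, e2 = gram_schmidt([[1, 1j, 0], [1, 0, 1j]])
print(abs(np.vdot(e1, e2)))                  # ~0, the vectors are orthogonal
```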
3.25. Important properties of the size. Now we have everything prepared for the basic properties connected with our definition of the size of vectors. We speak also of the norm defined by the scalar product. Note that all the claims consider only finite sets of vectors, and their validity does not depend on the dimension of the space V in which it all takes place.
Theorem. For any vectors u, v in a space V with scalar product we have:
(1) ‖u + v‖ ≤ ‖u‖ + ‖v‖, with equality if and only if u and v are linearly dependent (triangle inequality);
(2) |u·v| ≤ ‖u‖ ‖v‖, with equality if and only if u and v are linearly dependent (Cauchy inequality);
(3) for every orthonormal system of vectors (e₁, …, e_k),
$$\|u\|^2 \ge |u\cdot e_1|^2 + \cdots + |u\cdot e_k|^2$$
(Bessel inequality);
not die during the winter, respectively. The probabilities d₁₁, d₂₁ clearly must satisfy
$$0 < d_{11},\qquad 0 < d_{21},\qquad d_{11} + d_{21} = 1,$$
and because a seedling dies during the winter more easily than a seed hidden in the ground, we assume about f₁₁, f₂₂ that
(4) For an orthonormal system of vectors (e₁, …, e_k), the vector u belongs to the subspace ⟨e₁, …, e_k⟩ if and only if
$$\|u\|^2 = |u\cdot e_1|^2 + \cdots + |u\cdot e_k|^2$$
$$0 < f_{11} < f_{22} < 1.$$
When denoting
$$D = \begin{pmatrix} d_{11} \\ d_{21} \end{pmatrix},\qquad F = \begin{pmatrix} f_{11} & 0 \\ 0 & f_{22} \end{pmatrix},\qquad w(t) = \begin{pmatrix} w_1(t) \\ w_2(t) \end{pmatrix},$$
we obtain, with similar reasoning as before, the equalities
$$w(t) = D\,z(t),\qquad x(t+1) = F\,w(t).$$
Because matrix multiplication is associative, we can compose from the previous equalities the recurrence formulas for the numbers in the individual stages of the flowers in the following year:
$$x(t+1) = Fw(t) = F(Dz(t)) = (FD)(Cy(t)) = (FDC)(Bx(t)) = (FDCB)\,x(t),$$
$$y(t+1) = Bx(t+1) = (BF)w(t) = (BFD)z(t) = (BFDC)\,y(t),$$
$$z(t+1) = Cy(t+1) = (CB)x(t+1) = (CBF)w(t) = (CBFD)\,z(t),$$
$$w(t+1) = Dz(t+1) = (DC)y(t+1) = (DCB)x(t+1) = (DCBF)\,w(t).$$
Using the notation
$$A_x = FDCB,\qquad A_y = BFDC,\qquad A_z = CBFD,\qquad A_w = DCBF,$$
we simplify them to
$$x(t+1) = A_x x(t),\quad y(t+1) = A_y y(t),\quad z(t+1) = A_z z(t),\quad w(t+1) = A_w w(t).$$
From these formulas we can compute the distribution of the population of the flowers in any part of any year, provided we know the starting distribution of the population (say, in the year zero).
For instance, let the distribution of the population be known in the summer, that is, let z(0) be the number of seeds. The distribution of the population at the beginning of the spring in the t-th year is
$$x(t) = A_x x(t-1) = A_x^2 x(t-2) = \cdots = A_x^{t-1}x(1) = A_x^{t-1}Fw(0) = A_x^{t-1}FDz(0).$$
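The composition of the four stages can be carried out mechanically; a sketch assuming NumPy, where all the parameter values are made up for illustration only, chosen merely to respect the inequalities stated above:

```python
import numpy as np

B = np.array([[0.3, 0.0],             # b11, b12 = 0
              [0.2, 0.4],             # b21, b22
              [0.0, 0.5]])            # b31 = 0, b32
C = np.array([[10.0, 50.0, 100.0]])   # c11 < c12 < c13
D = np.array([[0.3], [0.7]])          # d11 + d21 = 1
F = np.diag([0.1, 0.4])               # f11 < f22

Ax = F @ D @ C @ B                    # one whole year, spring to spring
lam = (C @ B @ F @ D).item()          # the scalar A_z = CBFD
print(lam, np.linalg.eigvals(Ax).real.max())   # the two growth rates agree
```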
(Parseval equality);
(5) for an orthonormal system of vectors (e₁, …, e_k) and a vector u ∈ V, the vector
$$w = (u\cdot e_1)e_1 + \cdots + (u\cdot e_k)e_k$$
is the unique vector minimising the size ‖u − v‖ over all v ∈ ⟨e₁, …, e_k⟩.
Proof. All the proofs rely on direct computations.
(2): Define the vector w := u − ((u·v)/‖v‖²)v, so that w ⊥ v, and compute
$$0 \le \|w\|^2\|v\|^2 = \Bigl(\|u\|^2 - \frac{(u\cdot v)\overline{(u\cdot v)}}{\|v\|^2}\Bigr)\|v\|^2 = \|u\|^2\|v\|^2 - |u\cdot v|^2.$$
From there it directly follows that ‖u‖²‖v‖² ≥ |u·v|², and equality holds if and only if w = 0, that is, exactly when u and v are linearly dependent.
(1): Again it suffices to compute
$$\|u + v\|^2 = \|u\|^2 + u\cdot v + v\cdot u + \|v\|^2 = \|u\|^2 + 2\,\mathrm{Re}(u\cdot v) + \|v\|^2$$
$$\le \|u\|^2 + 2|u\cdot v| + \|v\|^2 \le \|u\|^2 + 2\|u\|\|v\| + \|v\|^2 = \bigl(\|u\| + \|v\|\bigr)^2.$$
Because these are non-negative real numbers, indeed ‖u + v‖ ≤ ‖u‖ + ‖v‖. Furthermore, in the case of equality all the previous inequalities must be equalities as well, and that is equivalent (using the previous part) to u and v being linearly dependent.
(3), (4): Let (e₁, …, e_k) be an orthonormal system of vectors. We extend it to an orthonormal basis (e₁, …, e_n) (that is always possible by the previous theorem). Then, again by the previous theorem, for every vector u ∈ V
$$\|u\|^2 = \sum_{i=1}^n (u\cdot e_i)\overline{(u\cdot e_i)} = \sum_{i=1}^n |u\cdot e_i|^2 \ge \sum_{i=1}^k |u\cdot e_i|^2.$$
But that is the Bessel inequality. Moreover, equality holds if and only if u·e_i = 0 for all i > k, which proves the Parseval equality.
(5): Choose an arbitrary v ∈ ⟨e₁, …, e_k⟩ and extend the given orthonormal system to an orthonormal basis (e₁, …, e_n). Let (u₁, …, u_n) and (x₁, …, x_k, 0, …, 0) be the coordinates of u and v under this basis. Then
$$\|u - v\|^2 = |u_1 - x_1|^2 + \cdots + |u_k - x_k|^2 + |u_{k+1}|^2 + \cdots + |u_n|^2,$$
and this expression is clearly minimised by choosing x₁ = u₁, …, x_k = u_k. □
3.26. Properties of unitary spaces. The properties of orthogonal mappings have a direct analogue in the complex domain. We can easily formulate and prove them together:
Proposition. Consider a linear mapping (endomorphism) φ : V → V on a space with scalar product. Then the following conditions are equivalent.
Note that the matrix A_z = CBFD is of type 1 × 1; that is, it is not really a matrix but just a scalar. Denoting it by λ = A_z, we compute
$$\lambda = CBFD = (c_{11}\ \ c_{12}\ \ c_{13})\begin{pmatrix} b_{11} & 0 \\ b_{21} & b_{22} \\ 0 & b_{32} \end{pmatrix}\begin{pmatrix} f_{11} & 0 \\ 0 & f_{22} \end{pmatrix}\begin{pmatrix} d_{11} \\ d_{21} \end{pmatrix}$$
$$= (c_{11}b_{11} + c_{12}b_{21}\ \ \ c_{12}b_{22} + c_{13}b_{32})\begin{pmatrix} f_{11}d_{11} \\ f_{22}d_{21} \end{pmatrix}$$
$$(3.5)\qquad = b_{11}c_{11}d_{11}f_{11} + b_{21}c_{12}d_{11}f_{11} + b_{22}c_{12}d_{21}f_{22} + b_{32}c_{13}d_{21}f_{22}$$
and order the previous computation into a suitable form:
$$x(t) = (FDCB)^{t-1}FDz(0) = FD(CBFD)^{t-1}z(0) = \lambda^{t-1}FDz(0).$$
(3): The standard scalar product in K^n is always given, for columns of scalars x, y, by the expression x·y = ȳ^T E x, where E is the unit matrix. The property (2) thus means that the matrix A of the mapping … (5): The claim is expressed via the matrix A of the mapping … (6): Because for the determinant we have |Ā^T A| = |E| = |A Ā^T| = |A||Ā| = 1, there exists the inverse matrix A^{−1}. But we also have A Ā^T A = A, therefore also Ā^T A = E, which is expressed exactly by (6).
(6) ⇒ (1): In the chosen orthonormal basis we have
From it we can see that the increment k is mostly influenced by the number of the seed that overwinter (parameter ^21) and their survivability (parameter ^22)- This revelation is not surprising, the farmers are aware of this fact since the times of neolithic times. The result shows that the mathematical model indeed adequately describes the reality.
Other interesting and well-described models of growth can be found in the collection of exercises after this chapter.
3.23. Consider the following Leslie model: a farmer breeds sheep. The birth-rate of sheep depends only on their age and is on average 2 lambs per sheep between one and two years of age, 5 lambs per sheep between two and three years of age and 2 lambs per sheep between three and four years of age. Younger sheep do not deliver any lambs. Every year, half of the sheep die, uniformly distributed among all age groups. Every sheep older than four years is sent to the butchery. The farmer would like to sell (living) lambs younger than one year for their skin. What part of the lambs can be sold every year such that the
eigenvalues of unitary matrices are always complex units in the complex plane.
As with the orthogonal mappings we can easily check that the orthogonal complements of invariant subspaces with respect to a unitary φ : V → V are always invariant too. Indeed, if φ(U) ⊂ U, u ∈ U and v ∈ U⊥ are arbitrary, then u = φ(u′) for a suitable u′ ∈ U, and therefore ⟨φ(v), u⟩ = ⟨φ(v), φ(u′)⟩ = ⟨v, u′⟩ = 0.

Proposition. Let φ : V → V be a unitary mapping of complex vector spaces. Then V is the orthogonal sum of one-dimensional eigen-subspaces.

Proof. There surely exists at least one eigenvector v ∈ V. Then the restriction of φ to the orthogonal complement ⟨v⟩⊥ is again unitary, and the claim follows by induction on the dimension. □

For any linear mapping ψ : V → W we can naturally define its dual mapping ψ* : W* → V* by the relation
(3.6) ⟨v, ψ*(α)⟩ = ⟨ψ(v), α⟩,
where ⟨ , ⟩ denotes the evaluation of the form (the second argument) on the vector (the first argument), and v ∈ V and α ∈ W* are arbitrary.
Let us choose a basis v of V and a basis w of W, and let us write A for the matrix of the mapping ψ under these bases. Then we can easily compute the matrix of the mapping ψ* in the corresponding dual bases of the dual spaces. Indeed, the definition says that if we represent the forms from W* in coordinates as rows of scalars, then the mapping ψ* is given by the same matrix as ψ, if we multiply the row vectors by it from the right:

⟨ψ(v), α⟩ = (α₁, …, α_n) · A · (v₁, …, v_n)ᵀ = ⟨v, ψ*(α)⟩.

That means that the matrix of the dual mapping ψ* is the transpose Aᵀ, because α·A = (Aᵀ·αᵀ)ᵀ.
size of the herd remains the same? In what ratio will then the sheep be distributed among individual age categories?
Solution. The matrix of the model (without the action of the farmer) is

L = ( 0   2   5   2 )
    ( 1/2 0   0   0 )
    ( 0   1/2 0   0 )
    ( 0   0   1/2 0 )
The farmer can influence how many sheep younger than one year stay in his herd to the next year, that is, he can influence the element l₂₁ of the matrix L. Thus we are dealing with the model

( 0 2   5   2 )
( a 0   0   0 )
( 0 1/2 0   0 )
( 0 0   1/2 0 )
and we are looking for an a such that the matrix has the eigenvalue 1 (we know that it has only one real positive eigenvalue). The characteristic polynomial of this matrix is

λ⁴ − 2aλ² − (5/2)aλ − (1/2)a,

and if we require it to have 1 as a root, it must be that a = 1/5 (we substitute λ = 1 and set the polynomial equal to zero). Without intervention, half of the lambs would survive to the next year, so the farmer can sell 3/10 of the lambs that are born each year. The corresponding eigenvector for the eigenvalue 1 of the given matrix is (20, 4, 2, 1)ᵀ, and in these ratios the population stabilises. □
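A quick numerical cross-check of this answer (a sketch in Python with numpy; the code is ours and not part of the exercise):

import numpy as np

a = 1/5                                   # fraction of new-born lambs kept
L = np.array([[0, 2,   5,   2],
              [a, 0,   0,   0],
              [0, 0.5, 0,   0],
              [0, 0,   0.5, 0]])

lam, vecs = np.linalg.eig(L)
i = np.argmin(abs(lam - 1))               # the eigenvalue closest to 1
print(lam[i].real)                        # -> 1.0: the herd is stationary
v = vecs[:, i].real
print(v / v[-1])                          # -> [20. 4. 2. 1.]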
3.24. Consider the Leslie population growth model for the population of rats, divided into three groups according to age: younger than one year, between one year and two years and between two years and three years. Assume that there exists no rat older than three years. The average birth-rate of one rat in individual age categories is the following: in the first group it is zero, in the second and in the third it is 2 rats. The mortality in the second group is zero, that is, the rats that survive their first year die after three years of life. Determine the mortality in the first group, if you know that the population stagnates (the total number of rats does not change). O
D. Markov processes
3.25. Sweet-toothed gambler. A gambler bets on a coin flip - whether it results in a head or in a tail. At the start of the game he has three sweets. On every flip he bets one sweet: if he wins, he gains one additional sweet; if he loses, he loses the sweet. The game ends when he loses all his sweets or has at least five sweets. What is the probability that the game has not ended after four bets?
Let us further assume that we are in a vector space with scalar product. Substituting a fixed vector v ∈ V into the second argument of the scalar product gives us the mapping

V → V* = Hom(V, K),   v ↦ (w ↦ ⟨w, v⟩ ∈ K).
The non-degeneracy condition of the scalar product ensures that this mapping is a bijection. Each of the values is indeed a linear form over complex or real scalars, because the second argument of the scalar product is the fixed one. At first sight it is clear that the vectors of an orthonormal basis are mapped onto the forms constituting the dual basis, and every vector can thus be understood via the scalar product as a linear form.
In the case of vector spaces with scalar product, our identification of a vector space with its dual also takes the dual mapping ψ* to the mapping ψ* : W → V given by the formula

(3.7) ⟨ψ(v), w⟩ = ⟨v, ψ*(w)⟩,

where by the same notation of parentheses as in the definition (3.6) we now mean the scalar product. This mapping is called the adjoint mapping to ψ.
Equivalently, we can understand the relation (3.7) to be the definition of the adjoint mapping ψ*; for instance, by substituting all the tuples of vectors of an orthonormal basis for the vectors v and w we directly obtain all the values of the matrix of the mapping ψ*. The previous calculation for the dual mapping in coordinates can now be repeated, we just have to keep in mind that in orthonormal bases of unitary spaces the coordinates of the second argument are conjugated:
⟨ψ(v), w⟩ = (w̄₁, …, w̄_n) · A · (v₁, …, v_n)ᵀ = ⟨v, ψ*(w)⟩.

Therefore we see that if A is the matrix of the mapping ψ in an orthonormal basis, then the matrix of the adjoint mapping ψ* is the transposed and conjugated matrix Ā; we denote this by A* = Āᵀ.
The matrix A* is called the adjoint matrix of the matrix A. Note that adjoint matrices are well defined for any rectangular matrix. We should not confuse them with the algebraic adjoints, which we used for square matrices when working with determinants.
We can thus summarise: for any linear mapping ψ : V → W between unitary spaces with the matrix A under orthonormal bases, its dual mapping has the matrix Aᵀ in the dual bases. If we also identify the vector spaces with their duals using the scalar product, then the dual mapping corresponds to the adjoint mapping ψ* : W → V (it is a custom to denote this mapping in the same way as the dual mapping), which has the matrix A*. The distinction between the matrix of the dual mapping and that of the adjoint mapping is thus in the additional conjugation, which is of course a corollary of the fact that the identification of a unitary space with its dual is not a complex linear mapping (since the scalars are brought out of the second argument of the scalar product conjugated).
Solution. Before the j-th round we can describe the state of the player by the random vector X_j = (p₀(j), p₁(j), p₂(j), p₃(j), p₄(j), p₅(j)), where p_i is the probability that the player has i sweets. If the player has i sweets (i = 1, 2, 3, 4) before the j-th bet, then after the bet he has (i − 1) sweets with probability 1/2 and (i + 1) sweets with probability 1/2. If he attains five sweets or loses them all, the number of sweets does not change any more. The vector X_{j+1} is then obtained from the vector X_j by multiplying it with the matrix

A = ( 1 0.5 0   0   0   0 )
    ( 0 0   0.5 0   0   0 )
    ( 0 0.5 0   0.5 0   0 )
    ( 0 0   0.5 0   0.5 0 )
    ( 0 0   0   0.5 0   0 )
    ( 0 0   0   0   0.5 1 )
At the start we have

X₁ = (0, 0, 0, 1, 0, 0)ᵀ,

and after four bets the situation is described by the vector

X₅ = A⁴X₁ = (1/8, 3/16, 0, 5/16, 0, 3/8)ᵀ,
that is, the probability that the game ends in the fourth bet or sooner is one half.
Note also that the matrix A describing the evolution of the probabilistic vector X is itself probabilistic, that is, in each column the sum is one. But it does not have the property required by the Perron-Frobenius theorem, and by a simple computation you can check (or see directly without any computation) that there exist two linearly independent eigenvectors corresponding to the eigenvalue 1: the case that the player has no sweets, that is x = (1, 0, 0, 0, 0, 0)ᵀ, and the case when the player has 5 sweets and the game thus ends with him keeping all the sweets, that is x = (0, 0, 0, 0, 0, 1)ᵀ. All the other eigenvalues (approximately 0.8, 0.3, −0.8, −0.3) are in absolute value strictly smaller than one. Thus the components in the corresponding eigensubspaces vanish with the iteration of the process for an arbitrary initial distribution, and the process approaches the limiting probabilistic vector of the form (a, 0, 0, 0, 0, 1 − a)ᵀ, where the value a
3.28. Self-adjoint mappings. A special case of linear mappings is given by those that are identical with their adjoint mappings: ψ* = ψ. Such mappings are called self-adjoint. Equivalently we can say that they are those mappings whose matrix A under one (and thus under every) orthonormal basis satisfies A = A*.
In the case of Euclidean spaces the self-adjoint mappings are those that have a symmetric matrix (under an orthonormal basis). They are often called symmetric mappings, with symmetric matrices.
In the complex domain the matrices that satisfy A = A* are called Hermitian matrices. Sometimes they are also called self-adjoint matrices. Note that the Hermitian matrices form a real vector subspace in the space of all complex matrices, but not a complex subspace.
Remark. Especially interesting in this connection is the following observation. If we multiply a Hermitian matrix A by the imaginary unit, we obtain the matrix B = iA with the property B* = −iA* = −B. Such matrices are called anti-Hermitian. Just as every real matrix is a sum of its symmetric and anti-symmetric parts,

A = (1/2)(A + Aᵀ) + (1/2)(A − Aᵀ),

in the complex domain we analogously have

A = (1/2)(A + A*) + i·(1/2i)(A − A*),

and we can thus express every complex matrix in exactly one way as a sum

A = B + iC

with Hermitian matrices B and C. It is an analogy of the decomposition of a complex number into its real and purely imaginary components, and in the literature we often encounter the notation

B = re A = (1/2)(A + A*),   C = im A = (1/2i)(A − A*).
In the language of linear mappings this means that every complex linear automorphism can be uniquely expressed using two self-adjoint mappings.
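The decomposition is easy to try out numerically; here is a small sketch in Python with numpy (our own illustration, with an arbitrary random matrix):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

B = (A + A.conj().T) / 2          # Hermitian part,  B* = B
C = (A - A.conj().T) / 2j         # also Hermitian,  A = B + iC

assert np.allclose(B, B.conj().T)
assert np.allclose(C, C.conj().T)
assert np.allclose(A, B + 1j * C)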
3.29. Spectral decomposition. We consider a self-adjoint mapping ψ : V → V with the matrix A under some orthonormal basis, and we try to proceed similarly as in 2.50. Again, we first look in general at the invariant subspaces of self-adjoint mappings and at their orthogonal complements. If for some subspace W ⊂ V and a self-adjoint mapping ψ : V → V we have ψ(W) ⊂ W, then for every v ∈ W⊥, w ∈ W

⟨ψ(v), w⟩ = ⟨v, ψ(w)⟩ = 0.

That means that also ψ(W⊥) ⊂ W⊥.

Consider now the matrix A of a self-adjoint mapping under some orthonormal basis and A·x = λx for some eigenvector x ∈ Cⁿ. We obtain

λ⟨x, x⟩ = ⟨Ax, x⟩ = ⟨x, Ax⟩ = ⟨x, λx⟩ = λ̄⟨x, x⟩.

The positive real number ⟨x, x⟩ can be cancelled out, and thus it must be that λ = λ̄, that is, the eigenvalues are always real.

The characteristic polynomial det(A − λE) has as many complex roots as the dimension of the square matrix A (counting multiplicities), and all of them are actually real. Thus we have proved an important general result:
depends on the initial number of sweets. In our case it is a = 0.4; if there were 4 sweets at the start, it would be a = 0.2, and so on. □
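The whole evolution can be checked by iterating the matrix; the following sketch in Python with numpy (our own, not part of the exercise) reproduces both the answer 1/2 and the limit value a = 0.4:

import numpy as np

# columns are the states 0..5 (number of sweets); 0 and 5 are absorbing
A = np.zeros((6, 6))
A[0, 0] = A[5, 5] = 1.0
for i in range(1, 5):
    A[i - 1, i] = A[i + 1, i] = 0.5

x = np.zeros(6); x[3] = 1.0                  # start with three sweets
x5 = np.linalg.matrix_power(A, 4) @ x
print(x5)                                    # [1/8, 3/16, 0, 5/16, 0, 3/8]
print(1 - x5[0] - x5[5])                     # 0.5: the game is still running

x_inf = np.linalg.matrix_power(A, 200) @ x
print(np.round(x_inf, 6))                    # -> (0.4, 0, 0, 0, 0, 0.6)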
3.26. Car rental. A company rents cars on a weekly basis and has two branches, one in Prague and one in Brno. A car rented in Brno can be returned in Prague and vice versa. After some time it was discovered that roughly 80 % of the cars rented in Prague and 90 % of the cars rented in Brno are returned to Prague. How should the cars be distributed among the branches so that at the start of each week both have the same number of cars as the week before? How will the situation look after a long time, if the cars are initially distributed in a random way?
Solution. Let us denote by x_B and x_P the components of the vector in question, that is, the initial numbers of cars in Brno and in Prague, respectively. The distribution of the cars between the branches is then described by the vector x = (x_B, x_P)ᵀ. If we take such a multiple of the vector x that the sum of its components is 1, then its components give the percentage distribution of the cars. According to the statement, the state at the end of the week is described by the vector

A x = ( 0.1 0.2 ) ( x_B )
      ( 0.9 0.8 ) ( x_P )

The matrix A thus describes our (linear) system of car rental. If at the end of the week the branches should have the same number of cars as at the beginning, we are looking for a vector x for which Ax = x holds. That means that we are looking for an eigenvector of the matrix A associated with the eigenvalue 1.

The characteristic polynomial of the matrix A is (0.1 − λ)(0.8 − λ) − 0.9·0.2 = (λ − 1)(λ + 0.1), and 1 is indeed an eigenvalue of the matrix A. The corresponding eigenvector x = (x_B, x_P)ᵀ satisfies the equation

( −0.9  0.2 ) ( x_B ) = 0.
(  0.9 −0.2 ) ( x_P )

It is thus a multiple of the vector (0.2, 0.9)ᵀ. For determining the percentage distribution we are looking for a multiple such that x_B + x_P = 1. That is satisfied by the vector

(1/1.1) (0.2, 0.9)ᵀ ≈ (0.18, 0.82)ᵀ.

The suitable distribution of the cars between Prague and Brno is thus such that 18 % of the cars are in Brno and 82 % of the cars are in Prague.

If we choose the initial state x = (x_B, x_P)ᵀ arbitrarily, then the state after n weeks is described by the vector x_n = Aⁿx. Now it is useful to express the initial vector x in the basis of the eigenvectors of A. The eigenvector for the eigenvalue 1 has already been found, and similarly we find an eigenvector for the eigenvalue −0.1; that is for instance the vector (−1, 1)ᵀ.
Proposition. The orthogonal complement of an invariant subspace for a self-adjoint mapping is also invariant. Furthermore, all the eigenvalues of a Hermitian matrix A are always real.

From the definition itself it is clear that the restriction of a self-adjoint mapping to an invariant subspace is again self-adjoint. The previous claim thus ensures that there always exists a basis of V composed of eigenvectors. Indeed, the restriction of ψ to the orthogonal complement of an invariant subspace is again a self-adjoint mapping, thus we can add into the basis one eigenvector after another, until we obtain the whole decomposition of V. Eigenvectors associated with distinct eigenvalues are perpendicular, because from the equations ψ(u) = λu, ψ(v) = μv we have (the eigenvalue μ being real)

λ⟨u, v⟩ = ⟨ψ(u), v⟩ = ⟨u, ψ(v)⟩ = μ⟨u, v⟩.

Usually our result is formulated using projections onto the eigensubspaces. We say that a projector P : V → V is perpendicular if Im P ⊥ Ker P. Two perpendicular projectors P, Q are mutually perpendicular if Im P ⊥ Im Q.

Theorem (On the spectral decomposition). For every self-adjoint mapping ψ : V → V on a vector space with scalar product there exists an orthonormal basis composed of eigenvectors. If λ₁, …, λ_k are all the distinct eigenvalues of ψ and P₁, …, P_k are the corresponding perpendicular and mutually perpendicular projectors onto the eigenspaces corresponding to the eigenvalues, then

ψ = λ₁P₁ + ⋯ + λ_kP_k.

The dimension of the image of each of these projectors equals the algebraic multiplicity of the eigenvalue λᵢ.
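A compact numerical illustration of the theorem (a sketch in Python with numpy; the sample matrix is ours) builds the perpendicular projectors from an orthonormal basis of eigenvectors and reassembles the mapping:

import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])            # symmetric, hence self-adjoint

lam, Q = np.linalg.eigh(A)                 # orthonormal eigenvectors as columns

# perpendicular projectors P_i = q_i q_i^T onto the eigenspaces
P = [np.outer(Q[:, i], Q[:, i]) for i in range(3)]
assert np.allclose(A, sum(l * Pi for l, Pi in zip(lam, P)))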
3.30. Orthogonal diagonalisation. Mappings for which we can find an orthonormal basis as in the previous theorem about the spectral decomposition are called orthogonally diagonalisable. They are of course exactly those mappings for which we can find an orthonormal basis under which their matrix is diagonal. Let us think for a while about what they can look like.
For the Euclidean case it is simple: diagonal matrices are first of all symmetric, so the orthogonally diagonalisable mappings are exactly the self-adjoint ones. As a corollary we obtain the result that an orthogonal mapping of a Euclidean space into itself is orthogonally diagonalisable if and only if it is self-adjoint (these are exactly the self-adjoint mappings with eigenvalues ±1).
For complex unitary spaces the situation is more complicated. Consider an arbitrary linear mapping φ : V → V of a unitary space and let φ = ψ + iη be the (uniquely given) decomposition of φ into its Hermitian and anti-Hermitian parts. If φ has a diagonal matrix D under a suitable orthonormal basis, then D = re D + i im D, where the real and the imaginary parts are exactly the matrices of ψ and η (this follows from the uniqueness of the decomposition). Thus it also holds that ψ∘η = η∘ψ and φ∘φ* = φ*∘φ. The mappings φ : V → V with the last listed property are called normal.
Mutual connections are shown in the following proposition (we follow the notation of this paragraph):
Proposition. The following conditions are equivalent:
(1) φ is orthogonally diagonalisable,
(2) φ*∘φ = φ∘φ* (that is, φ is a normal mapping),
(3) ψ∘η = η∘ψ,
The initial vector can thus be expressed as the linear combination

x = a (0.2, 0.9)ᵀ + b (−1, 1)ᵀ.

The state after n weeks is then

x_n = Aⁿx = a (0.2, 0.9)ᵀ + b (−0.1)ⁿ (−1, 1)ᵀ.
(4) for the matrix A = (a_ij) of the mapping φ under an arbitrary orthonormal basis, the sum Σ_{i,j} |a_ij|² equals the sum of the squared absolute values of the eigenvalues of φ.

Proof. (2) ⟺ (3): it suffices to do a direct calculation

φφ* = (ψ + iη)(ψ − iη) = ψ² + η² + i(ηψ − ψη),
φ*φ = (ψ − iη)(ψ + iη) = ψ² + η² + i(ψη − ηψ).

Subtraction yields 2i(ηψ − ψη).

(2) ⟹ (1): let u ∈ V be an eigenvector of the normal mapping φ. Then

φ(u)·φ(u) = ⟨φ*φ(u), u⟩ = ⟨φφ*(u), u⟩ = φ*(u)·φ*(u),

thus also ‖φ(u)‖ = ‖φ*(u)‖. If φ is normal, then (φ − λ idV)* = (φ* − λ̄ idV), and thus (φ − λ idV) is also a normal mapping. From the previous equation it follows that if φ(u) = λu, then φ*(u) = λ̄u. That means that the orthogonal complement of an eigensubspace is again invariant, and we can proceed as with the self-adjoint mappings.

(4): the expression Σ_{i,j} |a_ij|² is the trace of the matrix AA*, which is the matrix of the mapping φ∘φ*, and the trace is independent of the choice of the orthonormal basis; for a diagonal matrix it equals exactly the sum of the squared absolute values of the eigenvalues. The remaining implication relies on the triangular form of the mapping φ : V → V, which we prove later in 3.37. The theorem says that for every linear mapping φ : V → V there exists an orthonormal basis under which φ has an upper triangular matrix.
and we easily determine

A^∞ = ( 1 a+ab+ab² a+ab a 0 )
      ( 0 0        0    0 0 )
      ( 0 0        0    0 0 )
      ( 0 0        0    0 0 )
      ( 0 b³       b²   b 1 )

so the game ends with the probability a + ab + ab² ≈ 0.885 as a loss, and with the probability roughly 0.115
Proof. For a unitary mapping φ we have φφ* = idV = φ*φ, and thus φφ* = (ψ + iη)(ψ − iη) = ψ² + η² = idV. On the other hand, for a normal mapping the last calculation shows that the other implication holds too. □
3.31. Non-negative mappings and roots. Non-negative real numbers are exactly those that can be written as squares. A generalisation of this behaviour to matrices and mappings can be seen in the products of matrices B = A*·A (that is, compositions of mappings ψ*∘ψ):

⟨B·x, x⟩ = ⟨A*·A·x, x⟩ = ⟨A·x, A·x⟩ ≥ 0

for all vectors x. Furthermore, we clearly have

B* = (A*·A)* = A*·A = B.

Hermitian matrices B with this property are called positively semidefinite, and if the zero value is attained only for x = 0, they are called positively definite. Analogously, we speak of positively definite and positively semidefinite mappings ψ : V → V.

For every positively semidefinite mapping ψ : V → V we can find its root, that is, a mapping η such that η∘η = ψ. It is simplest to see this under an orthonormal basis where ψ has a diagonal matrix. Such a basis exists (as we have already proven), and the matrix A of the mapping ψ has only non-negative real numbers on the diagonal, namely the eigenvalues of ψ. If some of them were negative, the condition of non-negativity would already fail for one of the basis vectors. It then suffices to define the mapping η using the matrix B with the square roots of the corresponding eigenvalues on the diagonal.
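As a small numerical sketch (Python with numpy; the example matrix is ours), the root of a positively definite matrix via its diagonalisation:

import numpy as np

B = np.array([[2.0, 1.0],
              [1.0, 2.0]])               # symmetric positively definite

lam, Q = np.linalg.eigh(B)               # B = Q diag(lam) Q^T, lam >= 0
root = Q @ np.diag(np.sqrt(lam)) @ Q.T   # the mapping eta
assert np.allclose(root @ root, B)       # eta o eta = psi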
3.32. Spectra and nilpotent mappings. At the end of this section we return to the question about the behaviour of linear mappings in full generality. We still work with real or complex vector spaces.
Let us recall that the spectrum of a linear mapping f : V → V is the sequence of roots of the characteristic polynomial of the mapping f, counted with multiplicities. The algebraic multiplicity of an eigenvalue is its multiplicity as a root of the characteristic polynomial; the geometric multiplicity of an eigenvalue is the dimension of the corresponding subspace of eigenvectors.
A linear mapping f : V → V is called nilpotent if there exists an integer k ≥ 1 such that the iterated mapping fᵏ is identically zero. The smallest k with this property is called the degree of nilpotency of the mapping f. The mapping f : V → V is called cyclic if there exists a basis (u₁, …, u_n) of the space V such that f(u₁) = 0 and f(u_i) = u_{i−1} for all i = 2, …, n. In other words, the matrix of f under this basis is of the form
A = ( 0 1 0 … 0 )
    ( 0 0 1 … 0 )
    ( ⋮       ⋮ )
    ( 0 0 0 … 0 )
If f(v) = a·v, then for every natural k we have fᵏ(v) = aᵏ·v. Notably, the spectrum of a nilpotent mapping can contain only the zero scalar (and that is always present).
Directly from the definition it follows that every cyclic mapping is nilpotent; furthermore, its degree of nilpotency is equal to the
as a win of €80. (We multiply the initial vector (0, 1, 0, 0, 0)ᵀ by the matrix A^∞ and obtain the vector (a + ab + ab², 0, 0, 0, b³)ᵀ.) □
3.30. Consider the situation from the previous case and assume that the probabilities of a win and of a loss are both 1/2. Denote by A the matrix of the process. Without using any computational software, determine A¹⁰⁰. O
3.31. Absent-minded professor. Consider the following situation: an absent-minded professor carries an umbrella with him, but with probability 1/2 he forgets it at the place he is leaving from. In the morning he leaves for work. From work he goes for lunch to a restaurant and then back. After he finishes his work, he leaves for home. Consider for simplicity that he goes nowhere else and that in the restaurant the umbrella stays at his favourite spot, from where he can take it the next day (if he does not forget it there). Consider this situation as a Markov process and write down its matrix. What is the probability that after many days the umbrella is located in the restaurant in the morning? (It is useful to take one day as the time unit, from morning to morning.)
Solution.
T = ( 11/16 3/8 1/4 )
    (  3/16 3/8 1/4 )
    (  1/8  1/4 1/2 )

(ordering the states as: umbrella at home, at work, in the restaurant).
We compute for instance the element a₁₁, that is, the probability
that the umbrella starts its day at home and stays there (that is, will be
there the next day in the morning) - there are three disjoint ways for
the umbrella:
D: the professor forgets the umbrella at home in the morning: p₁ = 1/2;
DPD: the professor takes it to work, there he forgets to take it to the lunch, and in the evening he takes it home: p₂ = (1/2)·(1/2)·(1/2) = 1/8;
DPRPD: the professor takes the umbrella with him all the time and does not forget it anywhere: p₃ = (1/2)·(1/2)·(1/2)·(1/2) = 1/16.
In total, a₁₁ = p₁ + p₂ + p₃ = 11/16.
The eigenvector of this matrix corresponding to the dominant eigenvalue 1 is (2, 1, 1), and thus the desired probability is 1/(2 + 1 + 1) = 1/4. □
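The stationary vector can also be found by simply iterating the chain; a sketch in Python with numpy (ours, with the states ordered home, work, restaurant):

import numpy as np

T = np.array([[11/16, 3/8, 1/4],
              [ 3/16, 3/8, 1/4],
              [ 1/8,  1/4, 1/2]])

x = np.array([1.0, 0.0, 0.0])       # the umbrella starts at home
for _ in range(60):                 # the iterations converge quickly
    x = T @ x
print(x)                            # -> [0.5, 0.25, 0.25], i.e. (2, 1, 1)/4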
3.32. Algorithm for determining the importance of pages. Internet search engines can find (almost) all pages containing a given word or phrase. But how should the pages be sorted so that the user receives a list ordered by the relevance of the pages? One of the possibilities is the following algorithm: the collection of all the found pages is considered to be a system, with each of the found
dimension of the space V. The operator of derivation on polynomials, D(xᵏ) = k·x^{k−1}, is an example of a cyclic mapping on the spaces K_n[x] of all polynomials of degree at most n over the scalars K.
Surprisingly, this also holds the other way - every nilpotent mapping is a direct sum of cyclic mappings. A proof of this claim takes a lot of work, thus we first formulate the results we are aiming at, and then gradually start with the technical work. In the resulting theorem about Jordan decomposition appear vector (sub)spaces and linear mappings on them with a single eigenvalue X and a matrix
J = ( λ 1 0 … 0 )
    ( 0 λ 1 … 0 )
    ( ⋮       ⋮ )
    ( 0 0 0 … λ )
Theorem (Jordan theorem about canonical form). Let V be a vector space of the dimension n and f : V -> V be a linear mapping with n eigenvalues, counting algebraic multiplicities. Then there exists a unique decomposition of the space V into a direct sum of subspaces
V = V₁ ⊕ ⋯ ⊕ V_k

such that f(V_i) ⊂ V_i, the restriction of f to each V_i has a single eigenvalue λ_i, and the restriction of f − λ_i·id to V_i is either cyclic or the zero mapping.
The theorem thus says that for a suitable basis every linear mapping has block-diagonal form with Jordan blocks along the diagonal. The total number of ones over the diagonal in such form equals the difference between total algebraic and geometric multiplicity of the eigenvalues.
3.33. Notes. Note that we have already proven the Jordan theorem for the cases when all the eigenvalues are distinct or when the geometric and algebraic multiplicities of all the eigenvalues coincide. In particular, we have proven it for unitary, normal and self-adjoint mappings.
Another useful observation is that for every linear mapping f, every eigenvalue of f has a uniquely determined invariant subspace that corresponds to the Jordan block in the matrix.
We should also mention one very useful corollary of the Jordan theorem (which we have already used in the discussion about the behaviour of Markov chains). Assume that the eigenvalues of our mapping f are all smaller than one in absolute value. Then the repeated application of the linear mapping to any vector v ∈ V makes all the coordinates of fᵏ(v) decrease quickly below any bound. Indeed, assume for simplicity that f has only one eigenvalue λ on the whole of V and that f − λ idV is cyclic (that is, we consider only one Jordan block), and let v₁, …, v_l be the corresponding basis. Then the condition from the theorem says f(v₂) = λv₂ + v₁, f²(v₂) = λ²v₂ + 2λv₁, and similarly for the other v_i and higher powers. In any case, the iteration results in higher and higher powers of λ at all the non-zero components, and the smallest of these powers can be at most the degree of nilpotency lower than the number of iterations.
This proves the claim (and the same argument can be used to prove that a mapping with all eigenvalues of absolute
pages as one of its states. We describe a random walk on these pages as a Markov process. The probabilities of transitions between pages are given by the hyperlinks: each link, say from page A to page B, determines the probability 1/(total number of links from the page A) with which the process moves from the page A to the page B. If no links lead from some page, we treat it as a page with links to every other page. This gives us a probabilistic matrix m (the element m_ij corresponds to the probability with which we move from the j-th page to the i-th page). Thus if one randomly clicks on the links in the found pages (and from a linkless page one just moves to a randomly chosen page), the probability that at a given point in time (distant enough from the beginning) one is located at the i-th page corresponds to the i-th component of the unit eigenvector of the matrix m corresponding to the eigenvalue 1. Looking at the sizes of these probabilities, we define the importance of the individual pages.
This algorithm can be modified by assuming that the user stops clicking from link to link after a certain time and starts again on a random page. Say that with probability d he chooses a new page at random and with probability (1 − d) keeps clicking. In such a situation the probability of transition between any two pages S_i and S_j is non-zero: it is d/n + (1 − d)/(total number of links at the page S_i) if a link from S_i to S_j exists, and d/n otherwise (if there are no links at S_i, then it is 1/n). According to the Perron-Frobenius theorem, the eigenvalue 1 has multiplicity one and is dominant, and thus the corresponding eigenvector is unique (if we chose the transition probabilities only as described in the previous paragraph, this would not have to be the case).
For illustration, consider the pages A, B, C and D. The links lead from A to B and to C, from B to C, and from C to A; from D no links lead. Let us say that the probability that the user chooses a random new page is 1/5. Then the matrix m looks as follows:

m = ( 1/20 1/20  17/20 1/4 )
    ( 9/20 1/20  1/20  1/4 )
    ( 9/20 17/20 1/20  1/4 )
    ( 1/20 1/20  1/20  1/4 )

The eigenvector corresponding to the eigenvalue 1 is (305/53, 175/53, 315/53, 1)ᵀ; the importance of the pages is thus given by the order of the sizes of the corresponding components, that is, C > A > B > D.
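A short sketch of this computation in Python with numpy (the link table is taken from the example above; the code itself is our own illustration):

import numpy as np

links = {0: [1, 2], 1: [2], 2: [0], 3: []}    # A=0, B=1, C=2, D=3
n, d = 4, 1/5

m = np.full((n, n), d / n)                    # the random-jump part
for j, out in links.items():
    if out:
        for i in out:                         # spread (1-d) over the links
            m[i, j] += (1 - d) / len(out)
    else:
        m[:, j] = 1 / n                       # a linkless page

val, vec = np.linalg.eig(m)
v = vec[:, np.argmin(abs(val - 1))].real
print(v / v[3])                               # -> [305/53, 175/53, 315/53, 1]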
3.33. Based on the temperature at 14:00, the days are divided into warm, average and cold. From the all-year statistics, after a warm day the next day is warm in 50 % of the cases and average in 30 % of the cases; after an average day the next day is average in 40 % of the cases and cold in 30 % of the cases; and after
value strictly greater than one leads to unbounded growth of all the coordinates of the iterations fᵏ(v)).
The rest of this part of the third chapter is devoted to the proof of the Jordan theorem and some necessary lemmata. It is way more difficult than anything so far and the reader can skip it, until the beginning of the fifth part of this chapter.
3.34. Root spaces. We have already seen on examples that eigensubspaces describe the geometric properties of only some linear mappings. Thus we now introduce a more subtle tool, the so-called root subspaces.
Definition. A non-zero vector u ∈ V is called a root vector of a linear mapping φ : V → V if there exist a ∈ K and an integer k > 0 such that (φ − a·idV)ᵏ(u) = 0, that is, the k-th iteration of the given mapping maps u to zero. The set of all the root vectors corresponding to a fixed scalar λ, together with the zero vector, is called the root subspace associated to the scalar λ ∈ K, and is denoted by R_λ.
If u is a root vector and the k from the definition is chosen smallest possible, then (φ − a·idV)^{k−1}(u) is an eigenvector with the eigenvalue a. Thus we have R_λ = {0} for all scalars λ which are not in the spectrum of the mapping φ.
Proposition. For a linear mapping φ : V → V we have:
(1) for every λ ∈ K, R_λ ⊂ V is a vector subspace,
(2) for all λ, μ ∈ K, R_λ is invariant with respect to the linear mapping (φ − μ·idV); in particular R_λ is invariant with respect to φ,
(3) an eigenvector of φ with an eigenvalue μ ≠ λ cannot lie in R_λ,
(4) the restriction of (φ − λ·idV) to R_λ is nilpotent.
Proof. The first two claims are easily checked directly from the definition. (3): If u ∈ R_λ is an eigenvector with the eigenvalue μ, then 0 = (φ − λ·idV)ᵏ(u) = (μ − λ)ᵏ·u, and thus also u = 0 for λ ≠ μ.
(4) Choose a basis e₁, …, e_p of the subspace R_λ. Because according to the definition there exist numbers k_i such that (φ − λ·idV)^{k_i}(e_i) = 0, the whole mapping (φ − λ·idV)|_{R_λ} is nilpotent. □
3.35. Factor subspaces. Our next aim is to show that the dimension of a root space always equals the algebraic multiplicity of the corresponding eigenvalue. Let us first introduce some useful technical tools.
Definition. Let U ⊂ V be a vector subspace. On the set of all the vectors in V we define an equivalence relation as follows: v₁ ∼ v₂ if and only if v₁ − v₂ ∈ U. The axioms of an equivalence are easy to check. The set V/U of the classes of this equivalence, together with the operations defined using representatives, that is, [v] + [w] =
a cold day the next day is cold in 50 % of the cases and average in 30 % of the cases. Without any further information, derive how many warm, cold and average days can be expected in a year.
Solution. For each day, exactly one of the states „warm day", „average day", „cold day" occurs. If the vector x_n has as its components the probabilities that a certain (n-th) day is warm, average and cold (respectively), then the components of the vector
x_{n+1} = ( 0.5 0.3 0.2 ) x_n
          ( 0.3 0.4 0.3 )
          ( 0.2 0.3 0.5 )

give the probabilities that the next day is warm, average and cold, respectively. For verification it suffices to substitute

x_n = (1, 0, 0)ᵀ,  x_n = (0, 1, 0)ᵀ,  x_n = (0, 0, 1)ᵀ;

for instance, for the third choice we must obtain the probabilities that after a cold day there follows a warm, average and cold day (respectively). We see that the problem is a Markov chain with the probabilistic transition matrix

T = ( 0.5 0.3 0.2 )
    ( 0.3 0.4 0.3 )
    ( 0.2 0.3 0.5 )

Because all the elements of this matrix are positive, there exists a probabilistic vector

x_∞ = (x_∞^1, x_∞^2, x_∞^3)ᵀ,

to which the vector x_n approaches as n grows, independently of the initial vector. Furthermore, thanks to the corollary of the Perron-Frobenius theorem, x_∞ is the eigenvector of the matrix T for the eigenvalue 1. Thus it must hold that

x_∞^1 = 0.5 x_∞^1 + 0.3 x_∞^2 + 0.2 x_∞^3,
x_∞^2 = 0.3 x_∞^1 + 0.4 x_∞^2 + 0.3 x_∞^3,
1 = x_∞^1 + x_∞^2 + x_∞^3,

where the last condition means that the vector x_∞ is probabilistic. It is easy to compute that this system has the single solution

x_∞^1 = x_∞^2 = x_∞^3 = 1/3.

Thus we can expect roughly the same number of warm, average and cold days.
Let us emphasise that the sum of the numbers in any column of the matrix T had to equal 1 (otherwise it would not be a Markov process). Because Tᵀ = T (the matrix is symmetric), the sum of all the numbers in any row also equals 1. We say that a matrix with non-negative elements and with the property that the sum of the numbers
[v + w], a·[u] = [a·u], forms a vector space which we call the factor vector space of the space V by the subspace U.
Check the correctness of the definition of the operations and that all the axioms of a vector space hold!
The classes (vectors) in the factor space V/U will often be denoted as a formal sum of one representative with all the vectors of the subspace U, for instance u + U ∈ V/U, u ∈ V. The zero vector in V/U is exactly the class 0 + U, that is, the vector u ∈ V represents the zero element of V/U if and only if u ∈ U.
As simple examples, think about V/{0} ≅ V, V/V ≅ {0}, and about the factor space of the plane R² by any one-dimensional subspace (here every one-dimensional subspace U ⊂ R² is a line passing through the origin), where the classes of the equivalence are the lines parallel to it.
Proposition. Let U ⊂ V be a vector subspace and let (u₁, …, u_n) be a basis of V such that (u₁, …, u_k) is a basis of U. Then dim V/U = n − k and the vectors

u_{k+1} + U, …, u_n + U

form a basis of V/U.

Proof. Because V = ⟨u₁, …, u_n⟩, also V/U = ⟨u₁ + U, …, u_n + U⟩. But the first k generators are zero, thus V/U = ⟨u_{k+1} + U, …, u_n + U⟩. Assume that a_{k+1}·(u_{k+1} + U) + ⋯ + a_n·(u_n + U) = (a_{k+1}·u_{k+1} + ⋯ + a_n·u_n) + U = 0 ∈ V/U. That is equivalent to the linear combination of the vectors u_{k+1}, …, u_n belonging to the subspace U. Because U is generated by the remaining vectors, the combination is necessarily zero, that is, all the coefficients a_i are zero. □
3.36. Induced mappings on factor spaces. Assume that U ⊂ V is an invariant subspace with respect to a linear mapping φ : V → V and choose a basis u₁, …, u_n of the space V such that the first k vectors of this basis form a basis of U. In this basis φ has a matrix in block upper triangular form and induces a well-defined linear mapping on the factor space V/U. This leads to the following statement: let φ : V → V be a linear mapping whose spectrum contains n elements (that is, all the roots of the characteristic polynomial lie in K, counting their multiplicities). Then there exists a sequence
in any column equals one, and analogously for the rows, doubly stochastic. An important property of every doubly stochastic primitive matrix (of any dimension, that is, number of states) is that the corresponding vector x_∞ has all its components identical, that is, after sufficiently many iterations all the states of the corresponding Markov chain are attained with the same frequency. □
3.34. John is used to going running every evening. He has three tracks: short, medium and long. Whenever he chooses the short track, the next day he feels bad about it and chooses uniformly between the long and the medium one. Whenever he chooses the long track, the next day he chooses arbitrarily among all three. Whenever he chooses the medium track, the next day he feels good about it and again chooses uniformly between the medium and the long one. Assume that he has been running like this for a very long time. How often does he choose the short track and how often the long one? What is the probability that he chooses the long one, given that he picked it a week before?
Solution. Clearly it is a Markov process with three possible states: the choices of the short, the medium and the long track. This order of the states gives the probabilistic transition matrix

T = ( 0   0   1/3 )
    ( 1/2 1/2 1/3 )
    ( 1/2 1/2 1/3 )

It suffices to realise that, for instance, the second column corresponds to the choice of the medium track on the previous day, which means that with probability 1/2 the medium track is chosen again (the second row) and with probability 1/2 the long track is chosen (the third row). Because we have

T² = ( 1/6  1/6  1/9 )
     ( 5/12 5/12 4/9 )
     ( 5/12 5/12 4/9 )

we can use the corollary of the Perron-Frobenius theorem for Markov chains. It is not difficult to compute the eigenvector corresponding to the eigenvalue 1 which is a probabilistic vector, namely

(1/7, 3/7, 3/7)ᵀ.
The values 1/7, 3/7, 3/7 give, respectively, the probabilities that on a randomly chosen day he chooses the short, the medium and the long track.
Let John at a certain day (that is, in time n e N) choose a long track. This corresponds to the probabilistic vector
x_n = (0, 0, 1)ᵀ,
of invariant subspaces {0} = V₀ ⊂ V₁ ⊂ ⋯ ⊂ V_n = V with dimensions dim V_i = i. Under a basis u₁, …, u_n of the space V such that V_i = ⟨u₁, …, u_i⟩, the mapping φ has an upper triangular matrix.
and after seven days we have

x_{n+7} = T⁷ · (0, 0, 1)ᵀ.

The enumeration gives as the components of x_{n+7} the values

0.142861225…, 0.428569387…, 0.428569387…
Thus the probability that he chooses the long track under the condition that he chose it seven days ago is roughly 0.428569, very close to the limit value 3/7 ≈ 0.428571. □
3.35. The production line is not reliable: individual products differ in quality in a non-negligible way. Moreover, a certain worker tries to improve the quality of the products and intervenes in the process. The products are distributed into the classes I, II, III according to their quality, and a report found out that after a product of the class I the next product is of the class I in 80 % of the cases and of the class II in 10 % of the cases; after a product of the class II the next product is of the class II in 60 % of the cases and of the class I in 20 % of the cases; and after a product of the class III the next product is of the class III in 50 % of the cases and of the class II in 25 % of the cases. Compute the probability that the 18-th product is of the class I, if the 16-th product is of the class III.
Solution. Let us first solve the problem without using a Markov chain. Given that the 16-th product is of the class III, the event in question is satisfied by the following cases:
• 17-th product is of the class I and 18-th product is of class I;
• 17-th product is of the class II and 18-th product is of class I;
• 17-th product is of the class III and 18-th product is of class I,
with probabilities respectively
• 0.25 · 0.8 = 0.2;
• 0.25 · 0.2 = 0.05;
• 0.5 · 0.25 = 0.125.
Thus we easily obtain the result
0.375 = 0.2 + 0.05 + 0.125.
Now let us view the problem as a Markov process. From the statement we have that to the order of the possible states „product is of
(φ − λ·idV)ʲ(w) = 0 for a suitable j. We have thus derived that for a linear mapping φ : V → V whose whole spectrum is in K, V = R_{λ₁} ⊕ ⋯ ⊕ R_{λ_n} is the direct sum of the root subspaces, and if we choose suitable bases for these subspaces, the matrix of φ is block-diagonal.
Theorem. Let φ : V → V be a nilpotent linear mapping. Then there exists a decomposition of V into a direct sum of subspaces V = V₁ ⊕ ⋯ ⊕ V_k such that the restriction of φ to any of them is cyclic.
Proof. Verifying this is quite straightforward and consists of the construction of a basis of the space V on which the action of the mapping φ directly shows the decomposition into the cyclic mappings. But taking care of the details will take some time. Let k be the degree of nilpotency of the mapping φ and denote P_i = im(φ^i), i = 0, …, k, that is,

{0} = P_k ⊂ P_{k−1} ⊂ ⋯ ⊂ P₁ ⊂ P₀ = V.

Choose an arbitrary basis e₁^{k−1}, …, e_{p_{k−1}}^{k−1} of the space P_{k−1}, where p_{k−1} > 0 is the dimension of P_{k−1}. From the definition it follows that P_{k−1} ⊂ Ker φ, that is, always φ(e_j^{k−1}) = 0.

Assume that P_{k−1} ≠ V. Because P_{k−1} = φ(P_{k−2}), there necessarily exist in P_{k−2} vectors e_j^{k−2}, j = 1, …, p_{k−1}, such that φ(e_j^{k−2}) = e_j^{k−1}. Assume

a₁e₁^{k−1} + ⋯ + a_{p_{k−1}}e_{p_{k−1}}^{k−1} + b₁e₁^{k−2} + ⋯ + b_{p_{k−1}}e_{p_{k−1}}^{k−2} = 0.

An application of the mapping φ to this linear combination yields b₁e₁^{k−1} + ⋯ + b_{p_{k−1}}e_{p_{k−1}}^{k−1} = 0, therefore all b_j = 0. But then also all a_j = 0, because these are combinations of basis vectors. Thus we have verified the linear independence of all the 2p_{k−1} chosen vectors. We extend them to a basis

(1)  e₁^{k−1}, …, e_{p_{k−1}}^{k−1}, e₁^{k−2}, …, e_{p_{k−2}}^{k−2}

of the space P_{k−2}. Furthermore, the images of the added basis vectors lie in P_{k−1}, so they must be linear combinations of the basis elements e₁^{k−1}, …, e_{p_{k−1}}^{k−1}. We can therefore exchange the chosen vectors e_{p_{k−1}+1}^{k−2}, …, e_{p_{k−2}}^{k−2} for their differences with suitable linear combinations of the vectors e₁^{k−2}, …, e_{p_{k−1}}^{k−2}. This ensures that the vectors added to the basis of P_{k−2} belong to the kernel of the mapping φ. Let us thus assume that this already holds for the chosen basis (1).

Let us assume further that we have already constructed a basis of the subspace P_{k−l} which can be arranged into the scheme

e₁^{k−1}, …, e_{p_{k−1}}^{k−1}
e₁^{k−2}, …, e_{p_{k−1}}^{k−2}, e_{p_{k−1}+1}^{k−2}, …, e_{p_{k−2}}^{k−2}
⋮
e₁^{k−l}, …, e_{p_{k−1}}^{k−l}, e_{p_{k−1}+1}^{k−l}, …, e_{p_{k−2}}^{k−l}, …, e_{p_{k−l}}^{k−l}

where the value of the mapping φ on any basis vector is located directly above it, or equals zero if there is nothing above that basis vector. If P_{k−l} ≠ V, then again there must exist vectors e₁^{k−l−1}, …, e_{p_{k−l}}^{k−l−1} which map onto e₁^{k−l}, …, e_{p_{k−l}}^{k−l}, and we can extend them to a basis of P_{k−l−1}, say by vectors e_{p_{k−l}+1}^{k−l−1}, …, e_{p_{k−l−1}}^{k−l−1}. By gradually subtracting the values of the iterations of the mapping φ on these vectors we achieve that the vectors added to the basis of
Also, for n ∈ N we can directly determine

Tⁿ = ( (1/6)ⁿ          0                0                0                0           0 )
     ( (2/6)ⁿ − (1/6)ⁿ (2/6)ⁿ           0                0                0           0 )
     ( (3/6)ⁿ − (2/6)ⁿ (3/6)ⁿ − (2/6)ⁿ  (3/6)ⁿ           0                0           0 )
     ( (4/6)ⁿ − (3/6)ⁿ (4/6)ⁿ − (3/6)ⁿ  (4/6)ⁿ − (3/6)ⁿ  (4/6)ⁿ           0           0 )
     ( (5/6)ⁿ − (4/6)ⁿ (5/6)ⁿ − (4/6)ⁿ  (5/6)ⁿ − (4/6)ⁿ  (5/6)ⁿ − (4/6)ⁿ  (5/6)ⁿ      0 )
     ( 1 − (5/6)ⁿ      1 − (5/6)ⁿ       1 − (5/6)ⁿ       1 − (5/6)ⁿ       1 − (5/6)ⁿ  1 )
The values in the first column correspond gradually to the probabilities that n times in a row the result is 1; that n times in a row the result is 1 or 2 with at least one 2 (therefore we subtract the probability given in the first row); that n times in a row the result is 1, 2 or 3 with at least one 3; and so on, up to the last row, where there is the probability that at least once during the n throws the result is 6 (this can be easily derived from the probability of the complementary event). Similarly, in the fourth column there are the non-zero probabilities of the events „n times in a row the result is 1, 2, 3 or 4", „n times in a row the result is 1, 2, 3, 4 or 5 and at least once it is 5" and „at least once during the n attempts the result is 6". The interpretation of the matrix T as the probabilistic transition matrix of a Markov process thus allows for a quick expression of the powers Tⁿ, n ∈ N. □
3.37. In this problem we deal with a certain property of an animal species which is determined not by the sex but only by a certain gene, a tuple of alleles. Every individual gains one allele from each of its parents, randomly and independently. The forms of the gene are given by the two alleles a, A; they form the three possible states aa, aA = Aa and AA of the property.
(a) Assume that each individual of a certain population mates only with an individual of another population, in which only the property given by the tuple aA appears. Exactly one of their offspring (a randomly chosen one) is left on the spot, and he will also mate only with an individual of that specific population, and so on. Determine the probabilities of the appearance of aa, aA, AA in the considered population after a certain time.
(b) Solve the problem given in the case (a), if the other population is composed only of individuals with the tuple AA.
(c) Two randomly chosen individuals of opposite sex are bred. From their progeny, again two individuals of opposite sex are randomly chosen and bred. If this carries on for a long time, compute the probability that both of the bred individuals have the tuple of alleles AA, or both aa (then the process of breeding ends).
P_{k−l−1} lie in the kernel of φ. This implies that if the matrix J of the mapping φ is in the Jordan canonical form, with d_l(λ) denoting the number of Jordan blocks of size l with the eigenvalue λ and r_l(λ) denoting the rank of the matrix (J − λE)ˡ, then

n − r_l(λ) = d₁(λ) + 2d₂(λ) + ⋯ + l·d_l(λ) + l·d_{l+1}(λ) + ⋯

From here we calculate

d_l(λ) = r_{l−1}(λ) − 2r_l(λ) + r_{l+1}(λ)

(where the last row arises by combining the previous one for the values l − 1, l, l + 1).
3.41. Note. The proof of the theorem about the existence of the Jordan canonical form was constructive, but it does not give a perfect algorithmic approach for the construction. Let us now summarise the approach already derived for the explicit computation of a basis under which a given mapping φ : V → V has a matrix in the canonical Jordan form.
(1) We find the roots of the characteristic polynomial.
(2) If there are fewer than n = dim V of them (counting multiplicities), there is no canonical form.
(3) If there are n linearly independent eigenvectors, we obtain a basis of V composed of eigenvectors, under which φ has a diagonal matrix.

T = ( 1 1/2 0 )
    ( 0 1/2 1 )
    ( 0 0   0 )

We immediately see all the eigenvalues 1, 1/2 and 0 (if we subtract any of them from the diagonal, the rank of the resulting matrix is not 3, that is, the homogeneous system given by this matrix has a non-trivial solution). To these eigenvalues then respectively correspond the eigenvectors

(1, 0, 0)ᵀ, (−1, 1, 0)ᵀ, (1, −2, 1)ᵀ.

(4) Otherwise the eigenvectors correspond to the upper border of the scheme from the proof of the theorem 3.39, and a suitable basis must be found by applying the iterations of φ − λ·idV.

From there for an arbitrary n ∈ N it follows that

Tⁿ = ( 1 −1  1 ) ( 1 0   0 ) ( 1 1 1 )
     ( 0  1 −2 ) ( 0 2⁻ⁿ 0 ) ( 0 1 2 )
     ( 0  0  1 ) ( 0 0   0 ) ( 0 0 1 )

Clearly for big n ∈ N we can substitute 0 for 2⁻ⁿ, which implies

Tⁿ ≈ ( 1 −1  1 ) ( 1 0 0 ) ( 1 1 1 )   ( 1 1 1 )
     ( 0  1 −2 ) ( 0 0 0 ) ( 0 1 2 ) = ( 0 0 0 )
     ( 0  0  1 ) ( 0 0 0 ) ( 0 0 1 )   ( 0 0 0 )
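A numerical cross-check of this diagonalisation (a sketch in Python with numpy; ours, not part of the solution):

import numpy as np

T = np.array([[1, 0.5, 0],
              [0, 0.5, 1],
              [0, 0,   0]])
P = np.array([[1., -1.,  1.],
              [0.,  1., -2.],
              [0.,  0.,  1.]])             # columns are the eigenvectors

n = 40
Tn = P @ np.diag([1.0, 0.5 ** n, 0.0]) @ np.linalg.inv(P)
assert np.allclose(Tn, np.linalg.matrix_power(T, n))
print(np.round(Tn, 9))                     # -> rows (1,1,1), (0,0,0), (0,0,0)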
Thus if the individuals of the original population procreate exclusively with the members of the specific population (the one that has only AA), then after a sufficient number of breedings the tuples aA and aa are necessarily eliminated completely (and it does not matter what their original distribution was).
The case (c). Now we have 6 possible states (in this order)
AA, AA; aA,AA; aa, AA;
aA,aA; aa,aA; aa,aa,
while these states are given by the genotypes of the parents. The matrix of the corresponding Markov chain is
T = ( 1 1/4 0 1/16 0   0 )
    ( 0 1/2 0 1/4  0   0 )
    ( 0 0   0 1/8  0   0 )
    ( 0 1/4 1 1/4  1/4 0 )
    ( 0 0   0 1/4  1/2 0 )
    ( 0 0   0 1/16 1/4 1 )
If we consider for instance the situation (the second column) where one of the parents has the tuple AA and the other one has aA, then clearly each of the four cases (we are talking about the tuples of alleles of two randomly chosen offspring)

AA, AA; AA, aA; aA, AA; aA, aA

occurs with the same probability. The probability of staying in the second state is thus 1/2, the probability of a transition from the second state to the first is 1/4, and to the fourth state also 1/4.
Now we should again determine the powers T" for big n e N. Considering the form of the first and of the last column we immediately
where U is an upper triangular matrix, and thus

A = L·U,

where L is a lower triangular matrix with ones on the diagonal and U is upper triangular. This decomposition is called the LU-decomposition of the matrix A.
For a general matrix, the Gaussian elimination to the row echelon form may need some additional row permutations, sometimes even column permutations. Then we obtain the more general decomposition

A = P·L·U·Q,

where P and Q are suitable permutation matrices.
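For instance, scipy provides the variant with a row permutation; a small sketch (assuming scipy is available; scipy.linalg.lu returns the factorisation A = P·L·U):

import numpy as np
from scipy.linalg import lu

A = np.array([[0.0, 2.0],
              [3.0, 4.0]])        # the elimination needs a row swap

P, L, U = lu(A)                   # A = P @ L @ U
assert np.allclose(A, P @ L @ U)
print(L)                          # lower triangular, ones on the diagonal
print(U)                          # upper triangular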
3.43. Notes. A direct corollary of the Gaussian elimination is also the realisation that, up to a choice of suitable bases on the domain and the codomain, every mapping f : V → W is given by a matrix in block-diagonal form, with a unit matrix of the size given by the dimension of the image of f and with zero blocks all around. This can be reformulated as follows: every matrix A of the type m/n over a field of scalars K can be decomposed into the product

A = P ( E 0 ) Q,
      ( 0 0 )

where P and Q are suitable invertible matrices.
For square matrices we have shown in 3.32, when discussing the properties of linear mappings f : V → V over complex vector spaces, that every square matrix A of the dimension m can be decomposed into the product
A = P·B·P⁻¹,
where B is block-diagonal with Jordan blocks associated to eigenvalues on the diagonal. Indeed, it is just a reformulation of the Jordan theorem, because multiplying by the matrix P and by its inverse from the other side corresponds in this case just to a change of the basis on the vector space V and the cited theorem says that in a suitable basis every mapping has Jordan canonical form.
Analogously, when discussing the self-adjoint mappings we proved that for real symmetric matrices and for complex Hermitian matrices there always exists a decomposition into the product

A = P·B·P*,

where B is a diagonal matrix with all the (always real) eigenvalues on the diagonal, counting multiplicities. Indeed, the matrices P and P* again stand for a change of the basis, but we allow only changes between orthonormal bases, so the matrix P of the change must be orthogonal (unitary). From there we have P⁻¹ = P*.
For real orthogonal mappings we derived an expression analogous to the symmetric case, only with B block-diagonal with blocks of size two or one, expressing either a rotation, a mirror symmetry or the identity with respect to the corresponding subspaces.
3.44. Singular decomposition theorem. Let us return to general linear mappings between vector spaces (in general distinct ones). If a scalar product is defined on them and we restrict ourselves to orthonormal bases only, we must proceed in a more refined way than in the case of arbitrary bases.
find out that 1 is an eigenvalue of the matrix T. It is very easy to find the corresponding eigenvectors

(1, 0, 0, 0, 0, 0)ᵀ,  (0, 0, 0, 0, 0, 1)ᵀ.

By considering only the four-dimensional submatrix of the matrix T (omitting the first and the sixth rows and columns) we find the remaining eigenvalues

1/2, 1/4, (1 − √5)/4, (1 + √5)/4.
If we recall the solution of the exercise called Sweet-toothed gambler, we do not have to compute Tⁿ. In that exercise we obtained the same eigenvectors corresponding to the eigenvalue 1, and the other eigenvalues also had their absolute values strictly smaller than 1 (the exact values were not used). Thus we obtain the identical conclusion: the process approaches the probabilistic vector

(a, 0, 0, 0, 0, 1 − a)ᵀ,

where a ∈ [0, 1] is given by the initial state. Because a non-zero number can occur only at the first and the sixth position of the resulting vector, the states

aA, AA;  aa, AA;  aA, aA;  aa, aA

disappear after many breedings. Let us further realise (this also follows from the exercise Sweet-toothed gambler) that the probability that the process ends with AA, AA equals the relative ratio of the occurrence of the allele A in the initial state.
The case (d). Let the values a, b, c ∈ [0, 1] give, in this order, the relative ratios of the occurrence of the tuples AA, aA, aa in the given population. We want to express the relative ratios of the tuples AA, aA, aa in the offspring of the population. If the choice of the tuples for breeding is random, then for a suitably big population it can be expected that the relative ratio of breedings where both individuals have AA is a², the relative ratio for the tuple aA with AA is 2ab, the relative ratio for aA with aA is b², and so on. The offspring of parents with the tuples AA, AA must inherit AA. The probability that the offspring of parents with the tuples AA, aA has AA is clearly 1/2, and the probability that the offspring of parents with the tuples aA, aA has AA is 1/4. There are no other cases giving an offspring with the tuple AA (if one of the parents has the tuple aa, then the offspring cannot have AA). The relative frequency of AA in the progeny is thus

a² · 1 + 2ab · (1/2) + b² · (1/4) = a² + ab + b²/4.
Theorem. Let A be any matrix of the type m/n over real or complex scalars. Then there exist square unitary matrices U and V of the dimensions m and n, and a real diagonal matrix D with non-negative elements of the dimension r, r ≤ min{m, n}, such that

A = USV*,   S = ( D 0 )
                ( 0 0 )

and r is the rank of the matrix AA*. Furthermore, S is determined uniquely up to the order of its elements, and the elements of the diagonal matrix D are the square roots of the eigenvalues d_i of the matrix AA*. If A is a real matrix, then the matrices U and V are orthogonal.
Proof. Assume first that m ≤ n and denote by φ : Kⁿ → Kᵐ the mapping between the real or complex spaces with standard scalar products, given by the matrix A under the standard bases.

We can reformulate the statement of the theorem as follows: there exist orthonormal bases on Kⁿ and Kᵐ under which the mapping φ has the matrix S. The matrix A*A is positively semidefinite and Hermitian, so there exists a unitary matrix V such that B = V*A*AV is diagonal, with the (real and non-negative) non-zero eigenvalues d₁, …, d_r placed first on the diagonal. From there

B = V*A*AV = (AV)*(AV).

That is equivalent to the claim that the first r columns of the matrix AV are orthogonal and the remaining ones are zero, because they have zero size.

Let us now denote the first r columns by v₁, …, v_r ∈ Kᵐ. It thus holds that ⟨v_i, v_i⟩ = d_i, i = 1, …, r, and the normalised vectors u_i = (1/√d_i)·v_i form an orthonormal system of non-zero vectors. Let us extend them to an orthonormal basis u = u₁, …, u_m of the whole Kᵐ. Expressing our original mapping under the bases given by the columns of V and by u, we obtain exactly the matrix S. If m > n, we can apply the previous part of the proof to the matrix A*. From there we directly obtain the desired claim.

If we work over real scalars, all the previous steps of the proof are realised in the real domain as well. □
3.45. Geometric interpretation. Diagonal values of the matrix D from the previous theorem are called singular values of the matrix A. Let us reformulate this theorem in the real case more geometrically.
Analogously we gradually determine the relative frequencies of the tuples aA and aa in the progeny:

ab + bc + 2ac + b²/2   and   c² + bc + b²/4.

This process can be viewed as a mapping T that transforms the vector (a, b, c)ᵀ. It holds that

T : (a, b, c)ᵀ ↦ (a² + ab + b²/4, ab + bc + 2ac + b²/2, c² + bc + b²/4)ᵀ.

Let us mention that the domain (and also the codomain) of T consists of just the vectors (a, b, c)ᵀ, where a, b, c ∈ [0, 1], a + b + c = 1.

We would like to express the operation T by multiplying the vector by some constant matrix. But that is clearly not possible (the mapping T is not linear). It is thus not a Markov process, and the determination of what happens after a long time cannot be simplified as in the previous cases. But we can compute what happens if we apply the mapping T twice in a row. In the second step we obtain

T : (a² + ab + b²/4, ab + bc + 2ac + b²/2, c² + bc + b²/4)ᵀ ↦ (t₁, t₂, t₃)ᵀ, where

t₁ = (a² + ab + b²/4)² + (a² + ab + b²/4)(ab + bc + 2ac + b²/2) + (1/4)(ab + bc + 2ac + b²/2)²,

t₂ = (a² + ab + b²/4)(ab + bc + 2ac + b²/2) + (ab + bc + 2ac + b²/2)(c² + bc + b²/4) + 2(a² + ab + b²/4)(c² + bc + b²/4) + (1/2)(ab + bc + 2ac + b²/2)²,

t₃ = (c² + bc + b²/4)² + (ab + bc + 2ac + b²/2)(c² + bc + b²/4) + (1/4)(ab + bc + 2ac + b²/2)².

It can be shown (using a + b + c = 1) that

t₁ = a² + ab + b²/4,   t₂ = ab + bc + 2ac + b²/2,   t₃ = c² + bc + b²/4,
For the corresponding linear mappings φ : Rⁿ → Rᵐ the singular values have indeed a simple geometric meaning: let K ⊂ Rⁿ be the unit ball for the standard scalar product. The image φ(K) is then always an m-dimensional ellipsoid (possibly degenerate). The singular values of the matrix A are the sizes of its main half-axes, and the theorem further says that the original ball always allows mutually perpendicular diameters whose images are exactly all the half-axes of this ellipsoid.
For square matrices it can be seen that A is invertible if and only if all the singular values are non-zero. The ratio of the greatest to the smallest singular value is an important parameter for the robustness of numerical computations with matrices, for instance for the computation of the inverse matrix. Let us also note that there exist fast methods of computation (approximation) of the eigenvalues, thus the singular decomposition is very effective to work with.
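A small numerical sketch in Python with numpy (the example matrix is ours): the singular decomposition, the half-axes and the ratio just mentioned,

import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 0.0]])                # a real 3x2 matrix

U, s, Vt = np.linalg.svd(A)               # A = U S V^T
S = np.zeros((3, 2)); S[:2, :2] = np.diag(s)
assert np.allclose(A, U @ S @ Vt)

print(s)                                  # singular values -> [4. 2.]
print(s[0] / s[-1])                       # ratio of the half-axes: 2.0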
3.46. Polar decomposition theorem. The singular decomposi-f% tion theorem is a starting point for many very useful tools. Let us now think about some direct corollaries (which by themselves are quite non-trivial). The statement of the theorem says that for any matrix A, real or complex, A — U SW* with S diagonal with non-negative real numbers on the diagonal and U and W unitary. But then also A — USU*UW* and let us call the matrices P = USU*, V = UW*. First of them, P, is Hermitian (in real case symmetric) and positively semidefinite, because it regards just how to write down the mapping with real diagonal matrix S in another orthonormal basis, while V is a product of two unitary matrices and thus again unitary (in the real case orthogonal). Furthermore A* — WSU* and thus AA* — USSU* — P2 and our matrix P is actually the square root of the easily computable Hermitian matrix A A*.
Assume that A = PV = QU are two such decompositions of the matrix A into the product of a positive semidefinite Hermitian and a unitary matrix, and assume that A is invertible. But then

AA* = PVV*P = P² = QUU*Q = Q²

is positive definite, and thus the matrices Q = P = √(AA*) are uniquely determined and invertible. But then also U = V = P^{−1}A.
We have thus completely derived a very useful analogy of the decomposition of a real number into a sign (orthogonal matrices in dimension one are exactly ±1) and the absolute value (the matrix P, for which we can compute the square root).
Theorem (Polar decomposition theorem). Every square complex matrix A of dimension n can be expressed in the form A = P·V, where P is a Hermitian positive semidefinite square matrix of the same dimension and V is unitary. We have P = √(AA*). If A is invertible, the decomposition is unique and V = (√(AA*))^{−1}A.

If we work over real scalars, P is symmetric and V orthogonal.
If we apply the same theorem to A* instead of A, we obtain the same result, but with the order of the Hermitian and unitary matrices reversed. The matrices in the corresponding right and left decompositions will of course be in general distinct.
that is,

T : (a^2 + ab + b^2/4, ab + bc + 2ac + b^2/2, c^2 + bc + b^2/4)^T ↦ (a^2 + ab + b^2/4, ab + bc + 2ac + b^2/2, c^2 + bc + b^2/4)^T.
We have obtained a surprising result: further applications of the transform T do not change the vector obtained in the first step. That means that the distribution of the considered tuples is, after an arbitrarily long time, the same as in the first generation of offspring. For a big population we have thus proven that the evolution takes place during the first generation (unless there are some mutations or selection). □
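The stabilisation after the first generation can also be verified symbolically; the sketch below assumes the sympy library and encodes the mapping T from the text.

    import sympy as sp

    a, b, c = sp.symbols('a b c', nonnegative=True)

    def T(a, b, c):
        # the non-linear mapping from the text
        return (a**2 + a*b + b**2/4,
                a*b + b*c + 2*a*c + b**2/2,
                c**2 + b*c + b**2/4)

    once = T(a, b, c)
    twice = T(*once)
    # on the set a + b + c = 1 the second application changes nothing:
    print([sp.expand((t2 - t1).subs(c, 1 - a - b)) for t1, t2 in zip(once, twice)])
    # -> [0, 0, 0]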
3.38. Let there be two boxes which together contain n white and n black balls. At regular time intervals a ball is taken from each box and moved to the other one, so the number of balls in each of the boxes is at the beginning (and thus for all the time) equal to n. Give for this Markov process its probabilistic transition matrix T.
Solution. This case is often used in physics as a model of blending two incompressible liquids (introduced by D. Bernoulli already in the year 1769) or, analogously, as a model of diffusion of gases. The states 0, 1, ..., n correspond, for instance, to the number of white balls in the first box. This information already says how many black balls are in the first box (and the remaining balls are then in the second box). If in a certain step the state changes from j ∈ {1, ..., n} to j − 1, it means that a white ball was drawn from the first box and a black ball from the second box. That happens with probability
(j/n) · (j/n) = j²/n².
Transition from the state j ∈ {0, ..., n − 1} to the state j + 1 corresponds to drawing a black ball from the first box and a white ball from the second box, with probability
((n − j)/n) · ((n − j)/n) = (n − j)²/n².
The system stays in the state j ∈ {1, ..., n − 1} if balls of the same colour are drawn from both boxes, which happens with probability

(j/n) · ((n − j)/n) + ((n − j)/n) · (j/n) = 2j(n − j)/n².
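A small sketch (assuming numpy) which assembles the matrix T from the three probabilities just computed; the columns are indexed by the current state j and sum to one, as they should for a Markov process.

    import numpy as np

    def transition_matrix(n):
        # states 0, 1, ..., n = the number of white balls in the first box
        T = np.zeros((n + 1, n + 1))
        for j in range(n + 1):
            if j >= 1:
                T[j - 1, j] = (j / n) ** 2             # white out of box 1, black in
            if j <= n - 1:
                T[j + 1, j] = ((n - j) / n) ** 2       # black out of box 1, white in
            T[j, j] = 2 * j * (n - j) / n ** 2         # balls of the same colour drawn
        return T

    T = transition_matrix(4)
    print(T.sum(axis=0))                               # all ones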
In the complex case the analogy with the decomposition of numbers is even more remarkable: the positive semidefinite P again plays the role of the absolute value of a complex number, and the unitary matrix V has a unique expression as a sum V = re V + i im V with Hermitian real and imaginary parts and with the property (re V)² + (im V)² = E; that is, we obtain a full analogy of the polar form of complex numbers (see the final remark in 3.30). Note however that in higher dimensions it is important in which order this „polar form" of the matrix is written; it is possible both ways, but the results are in general distinct.
For many practical applications it is faster to use the so-called QR decomposition of matrices, which is an analogy of the Schur orthogonal triangulation theorem:
3.47. Theorem. For every complex matrix A of the type m/n there exist a unitary matrix Q and an upper triangular matrix R such that A = Q·R.

If we work over real scalars, both Q and R are real.
Proof. In the geometric formulation we need to prove that for every mapping φ : K^n → K^m with the matrix A under the standard bases we can choose a new orthonormal basis on K^m under which the mapping has an upper triangular matrix; such a basis is provided by the Gram-Schmidt orthonormalisation of the columns of the matrix A. □

In the pair of orthonormal bases coming from the theorem about the singular decomposition, the matrix A takes the block form

A = ( D  0 )
    ( 0  0 )

with D diagonal and invertible, and a candidate matrix B for the pseudoinverse can be written in the block form

B = ( D^{−1}  P )
    (   Q     R )

for suitable matrices P, Q and R. But now

BA = ( D^{−1}  P ) ( D  0 )   ( E   0 )
     (   Q     R ) ( 0  0 ) = ( QD  0 )
suffices for him to have €346 if p = 0.495 (or €1727 if p = 0.499). Therefore it is possible in big casinos that „passionate" players play almost fair games. □
3.40. In a certain company there are two competing departments. The management has decided to measure every week the relative incomes (with respect to the number of employees) attained by these two departments. Two employees of the less successful department are then moved to the more successful one. This process goes on as long as both departments have some employees. You have gained a position in this company and you can choose one of the two departments to work in. You want to choose the department which won't be cancelled due to the employee movement. What will be your choice, if one of the departments has 40 employees, the other 10, and you estimate that the smaller one will have a relatively greater income in 54 % of the cases? O

Further applications of Markov chains can be found in the additional exercises after this chapter.
E. Unitary spaces
Already in the previous chapter we defined the scalar product on real vector spaces (2.40); in this chapter we extend the definition to complex spaces too (3.23).
3.41. Groups O(n) and U(n). Consider all linear mappings from R^n to R^n which preserve the given scalar product, that is (with respect to the definitions of the lengths of vectors and the deviations of two vectors), all linear mappings that preserve lengths and angles. These mappings form a group with respect to the operation of composition (see 1.1): the composition of two such mappings is, by the definition, also a mapping that preserves lengths and angles, the unit element of the group is the identity mapping, and the inverse element for a given mapping is its inverse mapping, which exists thanks to the condition on the preservation of lengths. The matrices of such mappings thus form a group with the operation of matrix multiplication; it is called the orthogonal group and is denoted by O(n). It is a subgroup of the group of all invertible mappings from R^n to R^n.
If we additionally require that the matrices have determinant one, we speak of the special orthogonal group SO(n) (in general the determinant of a matrix in O(n) can be either 1 or −1).
Similarly we define the unitary group U(n) as the group of all (complex) matrices that correspond to the complex linear mappings from
should be Hermitian, thus QD = O, and then also Q = O (the matrix D is diagonal and invertible). Analogously, the assumption that AB is Hermitian implies that P is zero. Additionally, we have

B = BAB = ( D^{−1}  0 ) ( D  0 ) ( D^{−1}  0 )   ( D^{−1}  0 )
          (   0     R ) ( 0  0 ) (   0     R ) = (   0     0 ).

On the right-hand side there is zero in the lower right corner, and thus also R = O and the claim is proven.
(4): Consider the mapping φ : K^n → K^m, x ↦ Ax, and the direct sums K^n = (Ker φ)⊥ ⊕ Ker φ, K^m = Im φ ⊕ (Im φ)⊥. The restricted mapping φ̃ := φ|_(Ker φ)⊥ : (Ker φ)⊥ → Im φ is a linear isomorphism. If we choose suitable orthonormal bases on (Ker φ)⊥ and Im φ and extend them to orthonormal bases of the whole spaces, the mapping φ has the matrix S and φ̃ the matrix D from the theorem about the singular decomposition. For a given b ∈ K^m, the point z ∈ Im φ which minimises the distance ‖b − z‖ (that is, the point which realises the distance ρ(b, Im φ), see the next chapter) is exactly the component z = b1 of the decomposition b = b1 + b2, b1 ∈ Im φ, b2 ∈ (Im φ)⊥. In the suitably chosen bases, the mapping given under the standard bases by the pseudoinverse A^(−1) maps b1 onto φ̃^{−1}(b1) ∈ (Ker φ)⊥ and annihilates all of (Im φ)⊥.

From the point (4) of the previous theorem we obtain that the matrix AA^(−1) is the matrix, in the vector space R^n (n being the number of the rows of the matrix A), of the perpendicular projection onto the subspace generated by the columns of the matrix A (this interpretation is of course meaningful only for matrices with more rows than columns).

Furthermore, for matrices A whose columns are independent vectors, the expression (A^T A)^{−1} A^T makes sense, and it is not hard to verify that this matrix satisfies all the properties from (1) and (2) of the previous theorem; thus it is the pseudoinverse of the matrix A.
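Both descriptions of the pseudoinverse can be checked against each other numerically; the following sketch assumes the numpy library, and the matrix A is an arbitrary illustration with independent columns, not one from the text.

    import numpy as np

    A = np.array([[2., 1.],
                  [1., 1.],
                  [2., 3.]])                      # independent columns

    pinv_svd = np.linalg.pinv(A)                  # computed via the singular decomposition
    pinv_formula = np.linalg.inv(A.T @ A) @ A.T   # the formula (A^T A)^{-1} A^T
    print(np.allclose(pinv_svd, pinv_formula))    # True
    print(np.allclose(A @ pinv_svd @ A, A))       # one of the defining properties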
C" to C" that preserve a given scalar product in a unitary space. Analogously, SU(n) denotes the subgroup of matrices in U(n) with determinant one (in general, determinant of matrix in U(n) can be any complex unit).
3.42. Consider the vector space V of functions R → C. Determine whether the mapping

From the definition of exp one can show that exp(A + B) = exp(A)·exp(B) holds for commuting matrices, as we are used to with the exponential in the domain of numbers. Because in general (u + v)* = u* + v* and (cv)* = c̄v*, we obtain

U* = ( Σ_{n=0}^∞ (1/n!) (iH)^n )* = Σ_{n=0}^∞ (1/n!) (−iH*)^n,

and because H* = H, then

U* = Σ_{n=0}^∞ (1/n!) (−iH)^n = exp(−iH),

and thus

U*U = exp(iH) exp(−iH) = exp(0) = E.

Moreover, det(U) = e^{trace(iH)}. □
3.44. Hermitian matrices A, B, C satisfy [A, C] = [B, C] = 0 and [A, B] ≠ 0, where [·,·] is the commutator of matrices defined by the relation [A, B] = AB − BA. Show that at least one eigensubspace of the matrix C must have dimension > 1.
Solution. We prove it by contradiction. Assume that all eigensubspaces of the operator C have dimension 1. Then any vector u can be written as u = Σ_k c_k u_k, where the u_k are linearly independent eigenvectors of the operator C associated with the eigenvalues λ_k (and c_k = ⟨u, u_k⟩). For these eigenvectors it clearly holds that

0 = [A, C]u_k = ACu_k − CAu_k = λ_k Au_k − C(Au_k).
3.50. Linear regression. The approximation property (3) from the previous theorem is very useful in cases where we are to find the best possible approximation of the (non-existent) solution of a given system Ax = b, where A is a real matrix of the type m/n and m > n.
For instance, an experiment gives us many measured real values b_i, and we want to find a linear combination of some functions f_j which approximates the values b_i. The actual values of the chosen functions at the points y_i ∈ R give a matrix a_ij = f_j(y_i), whose columns are given by the values of the individual functions f_j at the considered points, and our goal is to determine the coefficients x_j ∈ R so that the sum of the squares of the deviations from the actual values,

Σ_i ( b_i − Σ_j x_j f_j(y_i) )²,

is minimised. In other words, we seek a linear combination of the functions f_j which interpolates the given values b_i "well". Thanks to the previous theorem, the optimal coefficients are

x = A^(−1) b.
In order to have a more specific idea, consider just two functions f_1(x) = x, f_2(x) = x² and assume that the "measured values" of their unknown combination g(x) = y_1 x + y_2 x² at the integral values of x between 1 and 10 are

b^T = (1.44, 10.64, 4.48, 14.56, 31.12, 39.20, 54.88, 71.28, 85.92, 104.16).

This vector arose by computing the values of x + x² at the given points, shifted by random values in the range ±8. The matrix A = (a_ij) is in our case equal to
A^T = ( 1  2  3   4   5   6   7   8   9   10  )
      ( 1  4  9  16  25  36  49  64  81  100 )
and the coefficients in the combination are

x = A^(−1) b = (0.61, 0.99)^T.
The resulting interpolation can be seen in the picture, where the given values b are interpolated with a green polygonal chain, while the red graph corresponds to the combination g. The computations were done in the system Maple using the command leastsqrs(B,b). If you are familiar with Maple (or some other similar software), try to do some experiments with similar tasks.
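If Maple is not at hand, the same experiment can be carried out, for instance, in Python; the sketch below assumes the numpy library and uses the vector b printed above.

    import numpy as np

    x = np.arange(1, 11)
    A = np.column_stack([x, x**2])            # columns: values of f1(x) = x and f2(x) = x^2
    b = np.array([1.44, 10.64, 4.48, 14.56, 31.12,
                  39.20, 54.88, 71.28, 85.92, 104.16])

    coef, *_ = np.linalg.lstsq(A, b, rcond=None)   # minimises ||Ax - b||
    print(coef)                                    # approximately (0.61, 0.99)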
From there we see that Au_k is an eigenvector of the matrix C with the eigenvalue λ_k. But that means that Au_k = λ_A u_k for some number λ_A. Similarly we derive Bu_k = λ_B u_k for some number λ_B. For the commutator of the matrices A and B we then obtain

[A, B]u_k = ABu_k − BAu_k = λ_A λ_B u_k − λ_B λ_A u_k = 0.
But that means that

[A, B]u = [A, B] Σ_k c_k u_k = Σ_k c_k [A, B]u_k = 0,

and because u was arbitrary, [A, B] = 0, which is a contradiction. □
3.45. Applications in quantum physics. In quantum physics we do not assign to a quantity a numerical value, as in classical physics, but a Hermitian operator. That is nothing but a Hermitian mapping, which however can (and often does) act on a unitary space of infinite dimension (we can imagine it as a matrix of infinite dimension). Vectors in this unitary space then represent the states of the given physical system. When measuring a given physical quantity we obtain only values that are eigenvalues of the corresponding operator.

For instance, instead of the coordinate x we have the operator x̂ of the coordinate, which acts as multiplication by x. If the state of the system is described by the vector v, then x̂(v) = xv, that is, the vector is multiplied by the real number x. At first glance this Hermitian operator is different from our cases of finite dimension: evidently every real number is an eigenvalue (x̂ has the so-called continuous spectrum). Similarly, in place of speed (more precisely, momentum) we have the operator p̂ = −i d/dx. The eigenvectors are the solutions of the differential equation −i dv/dx = λv. Even in this case the spectrum is continuous. That expresses the fact that the corresponding physical quantity is continuous (it can attain any real value). On the other hand, we have physical quantities, for instance energy, that can attain only discrete values (energy exists in quanta). The corresponding operators are then really similar to Hermitian matrices, they just have infinitely many eigenvalues.
3.46. Show that x̂ and p̂ are Hermitian and that

[x̂, p̂] = i.
Solution. For any vector v it holds that

[x̂, p̂]v = x̂p̂v − p̂x̂v = x(−i dv/dx) + i d(xv)/dx = iv,

and from there we directly have our claim. □
3.47. Show that

[x̂ − p̂, x̂ + p̂] = 2i.

Solution. Evidently [x̂, x̂] = 0 and [p̂, p̂] = 0, and the rest follows from the linearity of the commutator and the previous exercise. □
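Both commutator identities can be reproduced symbolically; the following sketch assumes the sympy library, with v standing for an arbitrary differentiable state.

    import sympy as sp

    x = sp.symbols('x', real=True)
    v = sp.Function('v')(x)

    p = lambda f: -sp.I * sp.diff(f, x)       # the momentum operator p = -i d/dx
    xop = lambda f: x * f                     # the coordinate operator

    comm = lambda A, B, f: A(B(f)) - B(A(f))
    print(sp.simplify(comm(xop, p, v)))       # -> I*v(x), i.e. [x, p] = i
    print(sp.simplify(comm(lambda f: xop(f) - p(f),
                           lambda f: xop(f) + p(f), v)))   # -> 2*I*v(x)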
3.48. Jordan form. Find the Jordan form of the matrices i) A = (−1 1; −6 4), ii) A = (−1 1; −4 3). What is the geometric interpretation of these decompositions?
Solution. i) We first compute the characteristic polynomial of the matrix A:

|A − λE| = |−1−λ  1; −6  4−λ| = λ² − 3λ + 2.

The eigenvalues of A are the roots of this polynomial, that is, λ_{1,2} = 1, 2. Because the matrix is of order two and has two distinct eigenvalues, its Jordan form is the diagonal matrix J = (1 0; 0 2). The eigenvector (x, y) associated with the eigenvalue 1 satisfies 0 = (A − E)x, that is, −2x + y = 0, which holds exactly for the multiples of the vector (1, 2). Similarly we find that the eigenvector associated with the eigenvalue 2 is (1, 3). The matrix P is then obtained by writing these eigenvectors into the columns, that is, P = (1 1; 2 3). For the matrix A we then have A = P·J·P^{−1}. The inverse of P is P^{−1} = (3 −1; −2 1), and we obtain

(−1 1; −6 4) = (1 1; 2 3) (1 0; 0 2) (3 −1; −2 1).

This decomposition tells us that the matrix A determines the linear mapping which has the aforementioned diagonal form in the basis of the eigenvectors (1, 2), (1, 3). That means that in the direction (1, 2) nothing changes and in the direction (1, 3) every vector is stretched to double.
ii) The characteristic polynomial of the matrix A is in this case

|A − λE| = |−1−λ  1; −4  3−λ| = λ² − 2λ + 1.

We obtain the double root λ = 1, and the corresponding eigenvector (x, y) satisfies

0 = (A − E)x = (−2 1; −4 2) (x; y).

The solutions are, as in the previous case, the multiples of the vector (1, 2). The fact that the system does not have two linearly independent vectors as a
solution says that the Jordan form in this case is not diagonal; it will be the matrix J = (1 1; 0 1). The basis for which A has this form consists of the eigenvector (1, 2) and a vector that is mapped onto this eigenvector by the mapping A − E; it is thus a solution of the system of equations

(−2 1; −4 2) (x; y) = (1; 2).

One solution is, for instance, the vector (1, 3). We obtain the same basis as in the previous case and we can write

(−1 1; −4 3) = (1 1; 2 3) (1 1; 0 1) (3 −1; −2 1).

The mapping now acts on a vector as follows: the component in the direction (1, 3) stays the same, while the new component in the direction (1, 2) is the sum of the original components in the directions (1, 2) and (1, 3). □
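Computations of this kind are easy to verify, for instance with the sympy library (a minimal sketch; the matrices are the two from this exercise):

    import sympy as sp

    for A in (sp.Matrix([[-1, 1], [-6, 4]]),
              sp.Matrix([[-1, 1], [-4, 3]])):
        P, J = A.jordan_form()                    # A = P * J * P**(-1)
        print(J)                                  # diag(1, 2), resp. [[1, 1], [0, 1]]
        print(sp.simplify(P * J * P.inv() - A))   # the zero matrix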
3.49. Find the Jordan forms of the matrices

A1 = (1/3) (5 −1; −2 4)   and   A2 = (1/3) (5 −1; 4 1),

write down the decompositions, give their geometric interpretation and draw how the vectors v = (3, 0), A1v and A2v decompose with respect to the basis of the eigenvectors of the matrices A1, A2.

Solution. The matrices have the same Jordan forms as the matrices in the previous exercise, now both with respect to the basis of the vectors (1, 2) and (1, −1), that is,

(1/3) (5 −1; −2 4) = (1 1; 2 −1) (1 0; 0 2) · (1/3) (1 1; 2 −1)

and

(1/3) (5 −1; 4 1) = (1 1; 2 −1) (1 1; 0 1) · (1/3) (1 1; 2 −1).

For the vector v = (3, 0) we obtain v = (1, 2) + 2(1, −1), and for its images A1v = (5, −2) = (1, 2) + 2·2·(1, −1) and A2v = (5, 4) = (2 + 1)·(1, 2) + 2·(1, −1). □
F. Matrix decompositions
3.50. Prove or disprove:
• Let A be a square matrix n x n. Then the matrix AT A is symmetric.
• Let A be a square matrix with only real positive eigenvalues. Then A is symmetric.
3.51. Find an LU-decomposition of the following matrix:
( −2  1   0 )
( −4  4   2 )
( −6  1  −1 ).
Solution. The Gaussian elimination corresponds to multiplications by lower triangular matrices; carrying it out, we obtain for the original matrix A the relation X·A = U, where X is the lower triangular matrix given by the Gaussian reduction and U is upper triangular. From this equality we have A = X^{−1}·U, which is the desired decomposition (we thus have to compute the inverse of X):

( −2  1   0 )   ( 1   0  0 ) ( −2  1  0 )
( −4  4   2 ) = ( 2   1  0 ) (  0  2  2 )
( −6  1  −1 )   ( 3  −1  1 ) (  0  0  1 ). □
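The elimination described above is a few lines of code; the sketch below assumes numpy and performs the LU-decomposition without row exchanges (which suffices for this matrix).

    import numpy as np

    def lu_nopivot(A):
        # LU-decomposition without pivoting; assumes non-zero pivots appear
        A = A.astype(float)
        n = A.shape[0]
        L, U = np.eye(n), A.copy()
        for k in range(n - 1):
            for i in range(k + 1, n):
                L[i, k] = U[i, k] / U[k, k]        # the elimination multiplier
                U[i, :] -= L[i, k] * U[k, :]       # subtract a multiple of the pivot row
        return L, U

    A = np.array([[-2, 1, 0], [-4, 4, 2], [-6, 1, -1]])
    L, U = lu_nopivot(A)
    print(L)               # [[1, 0, 0], [2, 1, 0], [3, -1, 1]]
    print(U)               # [[-2, 1, 0], [0, 2, 2], [0, 0, 1]]
    print(L @ U - A)       # the zero matrix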
3.52. Find the LU-decomposition of the matrix

(  1   1   0 )
(  1  −1   2 )
( −1   1  −1 ). O
3.53. Ray-tracing. In computer 3D-graphics the image is very often rendered by the ray-tracing algorithm. The basis of this algorithm is the approximation of light waves by rays (lines) and the approximation of the displayed objects by polyhedra. These are bounded by planes, and it is necessary to compute where exactly the light rays are reflected from these planes. From physics we know how rays are reflected: the angle of incidence equals the angle of reflection. We have already met this topic in exercise 1.64.
A ray of light travelling in the direction v = (1, 2, 3) hits the plane given by the equation x + y + z = 1. In what direction is it reflected?

Solution. The unit normal vector to the plane is n = (1/√3)(1, 1, 1). The vector v_R that gives the direction of the reflected ray lies in the plane spanned by the vectors v and n. We can express it as a linear combination of these vectors. Furthermore, the rule for the angle of reflection says that ⟨v, n⟩ = −⟨v_R, n⟩. From there we obtain a quadratic equation for the coefficients of the linear combination.
The exercise can also be solved in an easier, more geometric way. From the picture we can directly derive that

v_R = v − 2⟨v, n⟩n,

and in our case we obtain v_R = (−3, −2, −1). □
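The reflection formula translates directly into code (a sketch assuming numpy):

    import numpy as np

    def reflect(v, normal):
        # reflect the direction v in a plane with the given normal vector
        n = normal / np.linalg.norm(normal)
        return v - 2 * np.dot(v, n) * n

    print(reflect(np.array([1., 2., 3.]), np.array([1., 1., 1.])))   # [-3. -2. -1.]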
3.54. Singular decomposition, polar decomposition, pseudoinverse. Compute the singular decomposition of the matrix

A = (  0  0  −1/2 )
    ( −1  0   0   )
    (  0  0   0   ).

Then compute its polar decomposition and find its pseudoinverse.
Solution. We first compute A^T A:

A^T A = (  0   −1  0 ) (  0  0  −1/2 )   ( 1  0   0  )
        (  0    0  0 ) ( −1  0   0   ) = ( 0  0   0  )
        ( −1/2  0  0 ) (  0  0   0   )   ( 0  0  1/4 ),

and obtain a diagonal matrix. But we need to find an orthonormal basis under which the matrix is diagonal and the zero row is the last one.
This can clearly be obtained by rotating through the right angle about the x-axis (the y-coordinate then goes to z, and z goes to −y). This rotation is an orthogonal transformation given by the matrix

V = ( 1   0  0 )
    ( 0   0  1 )
    ( 0  −1  0 ).

By this we have (without much computation) found the decomposition A^T A = V B V^T, where B is diagonal with the eigenvalues (1, 1/4, 0) on the diagonal. Because now B = (AV)^T (AV), the columns of the matrix

AV = (  0  1/2  0 )
     ( −1   0   0 )
     (  0   0   0 )

form an orthogonal system of vectors, which we normalise and extend to a basis: (0, −1, 0), (1, 0, 0), (0, 0, 1). The transition matrix from this basis to the standard one is then

U = (  0  1  0 )
    ( −1  0  0 )
    (  0  0  1 ).

Finally, we obtain the decomposition

A = U √B V^T = (  0  1  0 ) ( 1   0   0 ) ( 1  0   0 )
               ( −1  0  0 ) ( 0  1/2  0 ) ( 0  0  −1 )
               (  0  0  1 ) ( 0   0   0 ) ( 0  1   0 ).

The geometric interpretation of the decomposition is the following: first everything is rotated through the right angle about the x-axis, then follows the projection onto the xy-plane under which the unit ball maps onto the ellipse with half-axes 1 and 1/2, and the result is rotated through the right angle about the z-axis.
The polar decomposition A = P·W can be simply obtained from the singular one: P := U √B U^T and W := U V^T, that is,

P = ( 1/2  0  0 )            W = (  0  0  −1 )
    (  0   1  0 )    and         ( −1  0   0 )
    (  0   0  0 )                (  0  1   0 ),

and indeed

P·W = (  0  0  −1/2 )
      ( −1  0   0   ) = A.
      (  0  0   0   )

The pseudoinverse matrix is then given by the expression A^(−1) := V S U^T, where

S = ( 1  0  0 )
    ( 0  2  0 )
    ( 0  0  0 ).

Thus we have

A^(−1) = (  0  −1  0 )
         (  0   0  0 )
         ( −2   0  0 ).
□
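The whole exercise can be cross-checked numerically; the sketch assumes the numpy and scipy libraries. (The unitary factors computed by the software may differ from ours in signs and in the directions belonging to the kernel; the factor P and the pseudoinverse are unique.)

    import numpy as np
    from scipy.linalg import polar

    A = np.array([[ 0., 0., -0.5],
                  [-1., 0.,  0. ],
                  [ 0., 0.,  0. ]])

    U, s, Vt = np.linalg.svd(A)
    print(s)                          # [1.  0.5 0. ]

    W, P = polar(A, side='left')      # returns (unitary, Hermitian) with A = P @ W
    print(np.round(P, 12))            # diag(1/2, 1, 0)
    print(np.allclose(P @ W, A))      # True

    print(np.linalg.pinv(A))          # [[ 0. -1.  0.], [ 0.  0.  0.], [-2.  0.  0.]]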
3.55. QR decomposition. The QR decomposition of a matrix A is very useful in the case when we are given a system of linear equations Ax = b which has no solution, but we need to find as good an approximation as possible, that is, we want to minimise ‖Ax − b‖. According to the Pythagorean theorem we have ‖Ax − b‖² = ‖Ax − b∥‖² + ‖b⊥‖², where b was decomposed into b∥, which belongs to the range of the linear transformation corresponding to the matrix A, and b⊥, which is perpendicular to this range. The projection onto the range of A can be written in the form QQ^T for a suitable matrix Q; specifically, we obtain this matrix through the Gram-Schmidt orthonormalisation of the columns of the matrix A. Then we have Ax − b∥ = Q(Q^T Ax − Q^T b). The system in the parentheses has a solution, for which we obtain ‖Ax − b‖ = ‖b⊥‖, which is the minimal value. Furthermore, the matrix R := Q^T A is upper triangular, and therefore the approximate solution can be found very easily.
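A sketch of this procedure (assuming numpy; the overdetermined system is the one solved via the pseudoinverse in exercise 3.58 below, so the two answers can be compared):

    import numpy as np

    A = np.array([[2., 1., 2.],
                  [1., 1., 3.],
                  [2., 1., 1.],
                  [1., 0., 1.]])
    b = np.array([1., 2., 0., -1.])

    Q, R = np.linalg.qr(A)                 # orthonormalised columns and R = Q^T A
    x = np.linalg.solve(R, Q.T @ b)        # back substitution in R x = Q^T b
    print(x)                               # approximately (-6/5, 7/3, 1/3)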
Find an approximate solution of the system
x + 2y = 1 2x + 4y = 4
Solution. We have the system Ax = b with A = (1 2; 2 4) and b = (1; 4), which evidently has no solution. We thus orthonormalise the columns of A. We take the first of them and divide it by its size; this yields the first vector of the orthonormal basis, (1/√5)(1; 2). But the second column is twice the first, and thus after orthonormalisation it becomes zero. Therefore we have Q = (1/√5)(1; 2). The projector onto the range of A is then QQ^T = (1/5)(1 2; 2 4). Next we compute

R = Q^T A = (√5, 2√5)   and   Q^T b = 9/√5.

The approximate solution then satisfies Rx = Q^T b, which in our case means 5x + 10y = 9 (the approximate solution is thus not unique). The QR decomposition of the matrix A is then

(1 2; 2 4) = (1/√5)(1; 2) · (√5, 2√5).
□
3.56. Minimise ‖Ax − b‖ for

A = (  2  −1  −1 )
    ( −1   2  −1 )
    ( −1  −1   2 )

and b = (1, 1, 1)^T, and write down the QR decomposition of the matrix A.

Solution. The normalised first column of the matrix A is e1 = (1/√6)(2, −1, −1)^T. From the second column we subtract its component in the direction e1:

(−1, 2, −1)^T − ⟨(−1, 2, −1)^T, e1⟩ e1 = (−1, 2, −1)^T + (1/2)(2, −1, −1)^T = (0, 3/2, −3/2)^T.

By this we have created an orthogonal vector, which we normalise and obtain e2 = (1/√2)(0, 1, −1)^T. The third column of the matrix A is already linearly dependent on the first two (we can verify this by computing the determinant). The desired column-orthogonal matrix is then

Q = (  2/√6    0   )
    ( −1/√6   1/√2 )
    ( −1/√6  −1/√2 ).

Next we compute

R = Q^T A = ( √6  −3/√6  −3/√6 )
            (  0   3/√2  −3/√2 )

and Q^T b = (0, 0)^T. The solution of the equation Rx = Q^T b is x = y = z. The multiples of the vector (1, 1, 1) thus minimise ‖Ax − b‖.

(The mapping given by the matrix A is three times the orthogonal projection onto the plane with the normal vector (1, 1, 1).)
□
3.57. Linear regression. The knowledge we have obtained in this chapter can be successfully used in practice for solving problems with linear regression, that is, for finding the best approximation of some functional dependence by a linear function.

We are given values of a functional dependence at some points (for instance, we investigate the value of the property owned by people depending on their intelligence, the value of the property of their parents, the number of mutual friends with Mr. Williams, ...), that is, f(a_1^1, ..., a_n^1) = y_1, ..., f(a_1^k, ..., a_n^k) = y_k, with k > n (we thus have more equations than unknowns), and we want the "best possible" approximation of this dependence by a linear function, that is, we want to express the value of the property as a linear function f(x_1, ..., x_n) = b_1 x_1 + b_2 x_2 + ... + b_n x_n + c. We define "best possible" as minimising

Σ_{i=1}^k ( y_i − ( Σ_{j=1}^n b_j a_j^i + c ) )²

with regard to the real constants b_1, ..., b_n, c. Our goal is thus to find the linear combination of the columns of the matrix A = (a_j^i) (with coefficients b_1, ..., b_n) which has the smallest distance from the vector (y_1, ..., y_k) in R^k; in other words, to find the orthogonal projection of the vector (y_1, ..., y_k) onto the subspace generated by the columns of the matrix A. By the theorem 3.49, the optimal coefficients are

(b_1, ..., b_n)^T = A^(−1) (y_1, ..., y_k)^T.
3.58. Using the least squares method, solve the system
2x + y + 2z = 1
x + y + 3z = 2
2x + y + z = 0
x + z = −1
Solution. Our system has no solution, since its matrix has rank 3 while the extended matrix has rank 4. The best approximation of the vector b = (1, 2, 0, −1)^T formed by the right-hand sides of the equations is then obtained, by the theorem 3.49, with the vector A^(−1)b (AA^(−1)b is then the best approximation itself, the perpendicular projection of the vector b onto the space generated by the columns of the matrix A).

Because the columns of the matrix A are linearly independent, its pseudoinverse is given by the relation (A^T A)^{−1} A^T. Thus we have

A^(−1) = (A^T A)^{−1} A^T =

( 3/5   −1     0   ) ( 2  1  2  1 )   ( 1/5  −2/5   1/5   3/5 )
( −1   10/3  −2/3 ) ( 1  1  1  0 ) = (  0    1/3   2/3  −5/3 )
(  0   −2/3   1/3 ) ( 2  3  1  1 )   (  0    1/3  −1/3   1/3 ).

The desired x equals

A^(−1)b = (−6/5, 7/3, 1/3)^T.

The projection (the best possible approximation of the column of the right-hand sides) is then the vector (3/5, 32/15, 4/15, −13/15)^T. □
G. Additional exercises for the whole chapter
3.59. Model of evolution of a whale population. For the evolution of a population, the females are important, and for them the important factor is not age but fertility. From this point of view we can divide the females into newborns (juveniles), that is, females which are not yet fertile; young fertile females; adult females with the highest fertility; and postclimacterial females, which are no longer fertile (but are still important for taking care of the newborns and for food gathering).
We model the evolution of such a population in time. For the time unit we choose the time it takes to reach adulthood. A newborn female which survives this interval becomes fertile. The evolution of a young female to full fertility and to the postclimacterial state depends on the environment; that is, the transition to the next category is a random event. Analogously, the death of an individual is also a random event. A young fertile female has fewer offspring per unit interval than an adult female. Let us formalise these statements.
Denote by x1(t), x2(t), x3(t), x4(t) the number of juvenile, young, adult and postclimacterial females at time t, respectively. The amount can be expressed as a number of individuals, but also as a number of individuals relative to a unit area (the so-called population density), as a total biomass, and similarly. Further denote by p1 the probability that a juvenile female survives the unit time interval and becomes fertile, and by p2 and p3 the respective probabilities that a young female becomes adult and that an adult female becomes old. Another random event is the dying (positively formulated: the survival) of females that do not move to the next category; we denote the respective probabilities by q2, q3 and q4 for young, adult and old females. Each of the numbers p1, p2, p3, q2, q3, q4 is, being a probability, from the interval [0, 1]. A young female can survive, reach adulthood or die; these events are mutually exclusive, together they form a sure event, and death cannot be excluded. Thus we have p2 + q2 < 1. For similar reasons we have p3 + q3 < 1. Finally, we denote by f2 and f3 the average number of daughters of a young and of an adult female, respectively. These parameters satisfy 0 < f2 < f3.
The expected number of newborn females in the next time interval is the sum of the daughters of the young and of the adult females, that is,

x1(t + 1) = f2 x2(t) + f3 x3(t).
Let us denote for a while by x2,1(t + 1) the number of young females at time t + 1 which were juvenile in the previous time interval, that is, at time t, and by x2,2(t + 1) the number of young females that were already fertile at time t, survived that time interval but did not move into adulthood. The probability p1 that a juvenile female survives the interval can be expressed by the classical probability, that is, by the ratio x2,1(t + 1)/x1(t), and similarly we can express the probability q2 as the ratio x2,2(t + 1)/x2(t). Because the young females at time t + 1 are exactly those that survived the juvenile stage and those that were already fertile, survived and did not evolve further, it holds that

x2(t + 1) = x2,1(t + 1) + x2,2(t + 1) = p1 x1(t) + q2 x2(t).
Analogously we derive the expected number of fully fertile females,

x3(t + 1) = p2 x2(t) + q3 x3(t),

and the expected number of postclimacterial females,

x4(t + 1) = p3 x3(t) + q4 x4(t).
Figure 1. Evolution of a population of orca whales. On the horizontal axis the time is in years, on the vertical axis the size of the population. The individual areas depict the numbers of juvenile, young, adult and old females, respectively, from below.
Now we can denote

A = ( 0   f2  f3  0  )
    ( p1  q2  0   0  )
    ( 0   p2  q3  0  )
    ( 0   0   p3  q4 ),        x(t) = (x1(t), x2(t), x3(t), x4(t))^T,

and rewrite the previous recurrent formulas in the matrix form

x(t + 1) = A x(t).
Using this matrix difference equation we can easily compute the expected numbers of whale females in the individual categories, if we know the distribution of the population at some initial time.
Specifically, for the population of orca whales the following parameters were observed: p1 = 0.9775, q2 = 0.9111, f2 = 0.0043, p2 = 0.0736, q3 = 0.9534, f3 = 0.1132, p3 = 0.0452, q4 = 0.9804. The time interval is in this case one year.
If we start at the time t = 0 with a unit measure of young females in some unoccupied area, that is, with the vector x(0) = (0, 1, 0, 0)^T, we can compute
x(1) = ( 0       0.0043  0.1132  0      ) ( 0 )   ( 0.0043 )
       ( 0.9775  0.9111  0       0      ) ( 1 ) = ( 0.9111 )
       ( 0       0.0736  0.9534  0      ) ( 0 )   ( 0.0736 )
       ( 0       0       0.0452  0.9804 ) ( 0 )   ( 0      ),

x(2) = ( 0       0.0043  0.1132  0      ) ( 0.0043 )   ( 0.01224925 )
       ( 0.9775  0.9111  0       0      ) ( 0.9111 ) = ( 0.83430646 )
       ( 0       0.0736  0.9534  0      ) ( 0.0736 )   ( 0.13722720 )
       ( 0       0       0.0452  0.9804 ) ( 0      )   ( 0.00332672 )
and we can carry on. The results of the computation can also be expressed graphically, see Figure 1. Try by yourself a computation and a graphical depiction of the results for a different initial
distribution of the population. The result should be the observation that the total population grows exponentially, while the ratios of the sizes of the individual groups gradually stabilise at constant values. The matrix A has the eigenvalues

λ1 = 1.025441326,  λ2 = 0.980400000,  λ3 = 0.834222976,  λ4 = 0.004835698;

the eigenvector associated with the largest eigenvalue λ1 is

w = (0.03697187, 0.31607121, 0.32290968, 0.32404724)^T;

this vector is normed such that the sum of its components equals 1.
Compare the evolution of the size of the population with the exponential function F(t) = λ1^t x0, where x0 is the total size of the initial population. Compute also the relative distribution into the individual categories after a certain time of evolution and compare it with the components of the eigenvector w. They will turn out to be very close; this is caused by the fact that A has a single eigenvalue of the greatest absolute value and by the fact that the vector space generated by the eigenvectors associated with the eigenvalues λ2, λ3, λ4 intersects the non-negative orthant only in the zero vector. The structure of the matrix A alone does not ensure such an easily predictable evolution, because it is a so-called reducible matrix (see ??).
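The suggested experiments are easy to carry out; the following sketch assumes numpy and uses the parameters of the orca population given above.

    import numpy as np

    A = np.array([[0.    , 0.0043, 0.1132, 0.    ],
                  [0.9775, 0.9111, 0.    , 0.    ],
                  [0.    , 0.0736, 0.9534, 0.    ],
                  [0.    , 0.    , 0.0452, 0.9804]])

    x = np.array([0., 1., 0., 0.])            # one unit of young females at t = 0
    for _ in range(50):
        x = A @ x                             # x(t + 1) = A x(t)

    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)               # the dominant eigenvalue, ~1.02544
    w = eigvecs[:, k].real
    w = w / w.sum()                           # normed so that the components sum to 1
    print(eigvals[k].real, w)
    print(x / x.sum())                        # after 50 years: close to w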
3.60. Model of growth of a population of teasels Dipsacus sylvestris. This plant can be seen in four stages: either as a blossoming plant, or as a rosette of leaves, the rosettes coming in three sizes, small, medium and large. The life cycle of this monocarpic perennial plant can be described as follows.
A blossoming plant produces some number of seeds in late summer and dies. Of the seeds, some sprout already in that year into a rosette of leaves, usually of medium size. Other seeds spend the winter in the ground. Some of the seeds in the ground sprout in the spring into a rosette, but because they were weakened during the winter, the size is usually small. After three or more winters the „sleeping" (formally, dormant) seeds die, as they lose the ability to sprout. Depending on the environment of the plant, a small or medium rosette can grow during the year, and any rosette can stay in its category or die (wither, be eaten by insects, etc.). A medium or large rosette can burst into flower in the next year. The blossoming flower then produces seeds and the cycle repeats.
In order to be able to predict the spreading of the population of the teasels, we need to quantify the described events. Botanists discovered that a blossoming plant produces on average 431 seeds. The probabilities that a seed sprouts and that a rosette grows or bursts into flower are summarised in the following table:
event                                                          probability
seed produced by a flower dies                                 0.172
seed sprouts into a small rosette in the current year          0.008
seed sprouts into a medium rosette in the current year         0.070
seed sprouts into a large rosette in the current year          0.002
seed sprouts into a small rosette after spending the winter    0.013
seed sprouts into a medium rosette after spending the winter   0.007
seed sprouts into a large rosette after spending the winter    0.001
seed sprouts into a small rosette after spending two winters   0.010
seed dies after spending one winter                            0.013
small rosette survives but does not grow                       0.125
medium rosette survives but does not grow                      0.238
large rosette survives but does not grow                       0.167
small rosette grows into a medium one                          0.125
small rosette grows into a large one                           0.036
medium rosette grows into a large one                          0.245
medium rosette bursts into a flower                            0.023
large rosette bursts into a flower                             0.750
Note that all the relevant events in the life cycle have their probabilities given and that the events are mutually exclusive.
Let us imagine that we always observe the population at the beginning of the vegetative year, say in March, and that all the considered events take place in the rest of the year, say from April to February. The population consists of blossoming flowers, rosettes of three sizes, produced seeds and seeds that have been dormant for one or two years. This could lead us to a division of the population into seven classes: just-produced seeds, seeds dormant for one year, seeds dormant for two years, small, medium and large rosettes, and blossoming flowers. But the just-produced seeds either change into rosettes in the same year or spend the winter; thus they do not form a separate category. Let us thus denote:
x1(t) — the number of seeds dormant for one year in the spring of the year t
x2(t) — the number of seeds dormant for two years in the spring of the year t
x3(t) — the number of small rosettes in the spring of the year t
x4(t) — the number of medium rosettes in the spring of the year t
x5(t) — the number of large rosettes in the spring of the year t
x6(t) — the number of blossoming flowers in the spring of the year t

The number of seeds produced in the year t is 431 x6(t). The probability that a seed stays dormant for the first year equals the probability that the seed does not sprout into any rosette and does not die, that is, 1 − (0.008 + 0.070 + 0.002 + 0.172) = 0.748. The expected number of seeds dormant for the first winter in the next year is thus

x1(t + 1) = 0.748 · 431 x6(t) = 322.388 x6(t).

The probability that a seed which has been dormant for one year stays dormant for the second year equals the probability that the dormant seed does not sprout into any rosette and does not die, that is, 1 − 0.013 − 0.007 − 0.001 − 0.013 = 0.966. The expected number of seeds dormant for two winters is thus

x2(t + 1) = 0.966 x1(t).

A small rosette can sprout from a seed immediately, from a seed dormant for one year, or from a seed dormant for two years. The expected number of small rosettes sprouted from the non-dormant seeds in the year t equals 0.008 · 431 x6(t) = 3.448 x6(t). The expected number of small rosettes sprouted
from the seeds dormant for one and two years is 0.013 x1(t) and 0.010 x2(t), respectively. Besides these newly sprouted small rosettes, the population contains also the older small rosettes that have not grown yet; of those there are 0.125 x3(t). The total expected number of small rosettes is thus

x3(t + 1) = 0.013 x1(t) + 0.010 x2(t) + 0.125 x3(t) + 3.448 x6(t).
Analogously we determine the expected numbers of medium and large rosettes,

x4(t + 1) = 0.007 x1(t) + 0.125 x3(t) + 0.238 x4(t) + 0.070 · 431 x6(t)
          = 0.007 x1(t) + 0.125 x3(t) + 0.238 x4(t) + 30.170 x6(t),

x5(t + 1) = 0.245 x4(t) + 0.167 x5(t) + 0.002 · 431 x6(t)
          = 0.245 x4(t) + 0.167 x5(t) + 0.862 x6(t).

A blossoming flower can arise either from a medium or from a large rosette. The expected number of blossoming flowers is thus

x6(t + 1) = 0.023 x4(t) + 0.750 x5(t).
We have thus obtained six recurrent formulas for the individual components of the investigated population. We now denote

A = ( 0      0      0      0      0      322.388 )
    ( 0.966  0      0      0      0      0       )
    ( 0.013  0.010  0.125  0      0      3.448   )
    ( 0.007  0      0.125  0.238  0      30.170  )
    ( 0.008  0      0.038  0.245  0.167  0.862   )
    ( 0      0      0      0.023  0.750  0       ),

x(t) = (x1(t), x2(t), x3(t), x4(t), x5(t), x6(t))^T,

and write the previous equalities in the matrix form suitable for the computation,

x(t + 1) = A x(t).
If we know the distribution of the individual components of the population in some initial year t = 0, we can compute the expected numbers of flowers and seeds in the following years. We can also compute the total number of individuals n(t) at the time t, n(t) = Σ_{i=1}^6 xi(t), the relative distribution of the individual components xi(t)/n(t), i = 1, 2, 3, 4, 5, 6, and the yearly relative change of the size of the population n(t + 1)/n(t). The results of such calculations for fifteen years, in the case that we have put one blossoming flower into some locality, are given in Table 1. Unlike the whale population, a graphical image would not be very clear here, as the numbers of flowers are negligible compared to the numbers of seeds (the individual areas for flowers would merge in the picture).
The matrix A has the eigenvalues

λ1 = 2.3339,
λ2 = −0.9569 + 1.4942 i,
λ3 = −0.9569 − 1.4942 i,
λ4 = 0.1187 + 0.1953 i,
λ5 = 0.1187 − 0.1953 i,
λ6 = −0.1274.

The eigenvector associated with the eigenvalue λ1 is

w = (0.6377, 0.2640, 0.0122, 0.0693, 0.0122, 0.0046);
this vector is normed such that the sum of its components is equal to one. We see that with increasing time t the relative increment of the size of the population approaches the eigenvalue λ1, and the relative distribution of the components in the population approaches the components of the normed eigenvector associated with λ1.
t    x1           x2          x3         x4          x5         x6         n(t)
0    0.00         0.00        0.00       0.00        0.00       1.00       1.00
1    322.39       0.00        3.45       30.17       0.86       0.00       356.87
2    0.00         311.43      4.62       9.87        10.25      1.34       337.50
3    432.13       0.00        8.31       43.37       5.46       7.91       497.18
4    2550.50      417.44      33.93      253.07      22.13      5.09       3282.16
5    1641.69      2463.78     59.13      235.96      91.78      22.42      4514.76
6    7227.10      1585.88     130.67     751.37      107.84     74.26      9877.12
7    23941.29     6981.37     382.20     2486.25     328.89     98.16      34218.17
8    31646.56     23127.29    767.29     3768.67     954.73     303.85     60568.39
9    97958.56     30570.58    1786.27    10381.63    1627.01    802.72     143126.78
10   258788.42    94627.97    4570.24    27597.99    4358.70    1459.04    391402.36
11   470376.19    249989.61   9912.57    52970.28    10991.08   3903.78    798143.52
12   1258532.41   454383.40   23314.10   134915.73   22317.98   9461.62    1902925.24
13   3050314.29   1215742.31  56442.70   329291.15   55891.57   19841.54   4727523.56
14   6396675.73   2946603.60  127280.49  705398.22   133660.97  49492.37   10359111.38
15   15955747.76  6179188.75  299182.59  1721756.52  293816.44  116469.89  24566161.94

t    x1(t)/n(t)  x2(t)/n(t)  x3(t)/n(t)  x4(t)/n(t)  x5(t)/n(t)  x6(t)/n(t)  n(t+1)/n(t)
0    0.000       0.000       0.000       0.000       0.000       1.000       356.868
1    0.903       0.000       0.010       0.085       0.002       0.000       0.946
2    0.000       0.923       0.014       0.029       0.030       0.004       1.473
3    0.869       0.000       0.017       0.087       0.011       0.016       6.602
4    0.777       0.127       0.010       0.077       0.007       0.002       1.376
5    0.364       0.546       0.013       0.052       0.020       0.005       2.188
6    0.732       0.161       0.013       0.076       0.011       0.008       3.464
7    0.700       0.204       0.011       0.073       0.010       0.003       1.770
8    0.522       0.382       0.013       0.062       0.016       0.005       2.363
9    0.684       0.214       0.012       0.073       0.011       0.006       2.735
10   0.661       0.242       0.012       0.071       0.011       0.004       2.039
11   0.589       0.313       0.012       0.066       0.014       0.005       2.384
12   0.661       0.239       0.012       0.071       0.012       0.005       2.484
13   0.645       0.257       0.012       0.070       0.012       0.004       2.191
14   0.617       0.284       0.012       0.068       0.013       0.005       2.371
15   0.650       0.252       0.012       0.070       0.012       0.005

Table 1. Modelled evolution of the population of teasels Dipsacus sylvestris: sizes of the individual components of the population, the total size of the population, the relative distribution of the individual components, and the relative increments of the size.
Every non-negative matrix that has non-zero elements at the same positions as A is primitive. The evolution of the population thus necessarily approaches a stable structure.
3.61. Nonlinear model of a population. Investigate in detail the evolution of the population for the non-linear model (1.12) from the textbook with K = 1 and

i) the rate of growth r = 1 and the initial state p(1) = 0.2,
ii) the rate of growth r = 1 and the initial state p(1) = 2,
iii) the rate of growth r = 1 and the initial state p(1) = 3,
iv) the rate of growth r = 2.2 and the initial state p(1) = 0.2,
v) the rate of growth r = 3 and the initial state p(1) = 0.2.

Compute the first few members and predict the future evolution of the population.
Solution.

i) The first few members of the sequence p(n) are in the following table. From there we can see that the size of the population converges to the value 1.

n   p(n)
1   0.2
2   0.36
3   0.5904
4   0.83222784
5   0.971852502
6   0.999207718
7   0.999999372

(Graph of the evolution of the population for r = 1 and p(1) = 0.2.)
ii) For the initial value p(1) = 2 we obtain p(2) = 0, and after that the population does not change.
iii) For p(1) = 3 we obtain

n   p(n)
1   3
2   −3
3   −15
4   −255
5   −65535

and from there we see that the population decreases beyond all bounds.

iv) For the rate of growth r = 2.2 and the initial state p(1) = 0.2 we obtain
n    p(n)
1    0.2
2    0.552
3    1.0960512
4    0.864441727
5    1.122242628
6    0.820433675
7    1.144542647
8    0.780585155
9    1.157383491
10   0.756646772
11   1.161738128
12   0.748363958
13   1.162657716
14   0.74660417
We see that instead of convergence we obtain in this case an oscillation: after some time the population jumps between the values 1.16 and 0.74. (Graph of the evolution of the population for r = 2.2 and p(1) = 0.2.)

v) For the rate of growth r = 3 and the initial state p(1) = 0.2 we obtain
n    p(n)
1    0.2
2    0.68
3    1.3328
4    0.00213248
5    0.008516278
6    0.033847529
7    0.131953152
8    0.475577705
9    1.223788359
10   0.402179593
11   1.123473097
12   0.707316989
13   1.328375987
14   0.019755658
15   0.077851775
16   0.293224403
17   0.91495596
18   1.148390614
19   0.63715945
20   1.330721306
21   0.010427642
22   0.041384361
23   0.160399447
In this case the situation is more complicated: the population starts oscillating between more values. In order to see between which values, we would need to compute more members of the sequence. (Graph of the members from the table.)
□
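All the tables above can be reproduced with a few lines of Python; the recurrence p(n + 1) = p(n) + r·p(n)(1 − p(n)) used below is the stated nonlinear model with K = 1, and it matches the tabulated values.

    def population(r, p1, steps):
        # iterate p(n+1) = p(n) + r*p(n)*(1 - p(n))
        p = [p1]
        for _ in range(steps - 1):
            p.append(p[-1] + r * p[-1] * (1 - p[-1]))
        return p

    print(population(1, 0.2, 7))       # converges to 1
    print(population(2.2, 0.2, 14))    # oscillates between roughly 1.16 and 0.75
    print(population(3, 0.2, 23))      # a more complicated oscillation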
3.62. In a laboratory, an experiment with the same probability of success and failure is carried out. If the experiment succeeds, the probability of success of the second experiment is 0.7. If the first experiment fails, the probability of success of the second experiment is only 0.6.

This process then goes on: if the previous experiment was successful, the probability of the next success is 0.7, and if the previous experiment was a failure, the probability of the next success is 0.6. For arbitrary n ∈ N, determine the probability that the n-th experiment is successful.
Solution. Let us introduce the probabilistic vector

x_n = (x_n^1, x_n^2)^T, n ∈ N,

where x_n^1 is the probability of the success of the n-th experiment and x_n^2 = 1 − x_n^1 is the probability of its failure. According to the statement,

x_1 = (1/2, 1/2)^T,

and clearly also

x_2 = ( 0.7  0.6 ) ( 1/2 )   ( 13/20 )
      ( 0.3  0.4 ) ( 1/2 ) = (  7/20 ).

Using the notation

T = ( 7/10  3/5 )
    ( 3/10  2/5 ),
it holds that
(3.7) x_{n+1} = T · x_n, n ∈ N,

because the probabilistic vector x_{n+1} depends only on x_n, and this dependence is the same for all n. From the relation (3.7) we directly have

(3.8) x_{n+1} = T · T · x_{n−1} = ... = T^n · x_1, n ≥ 2, n ∈ N.
Therefore we express T^n, n ∈ N. This is a Markov process, and thus 1 is an eigenvalue of the matrix T. The second eigenvalue, 0.1, follows for instance from the fact that the trace (the sum of the elements on the diagonal) equals the sum of the eigenvalues (every eigenvalue counted with its algebraic multiplicity). To these eigenvalues correspond the eigenvectors (2, 1)^T and (1, −1)^T. We thus obtain

T = ( 2   1 ) ( 1   0    ) ( 2   1 )^{−1}
    ( 1  −1 ) ( 0  1/10  ) ( 1  −1 ),

that is, for n ∈ N we have

T^n = ( 2   1 ) ( 1   0       ) ( 2   1 )^{−1}
      ( 1  −1 ) ( 0  10^{−n}  ) ( 1  −1 ).

The substitution

( 2   1 )^{−1}         ( 1   1 )
( 1  −1 )       = (1/3)( 1  −2 )

and multiplication yields

T^n = (1/3) ( 2 + 10^{−n}   2 − 2·10^{−n} )
            ( 1 − 10^{−n}   1 + 2·10^{−n} ).

From there, from (3.7) and from (3.8) it follows that

x_{n+1} = ( 2/3 − 1/(6·10^n), 1/3 + 1/(6·10^n) )^T, n ∈ N.

In particular, we see that for big n the probability of success of the n-th experiment is close to 2/3. □
3.63. A student living in a student dormitory is very „socially tired" (as a result, he is not able to fully perceive the universe around him and to coordinate his movements). In this state he decides to invite to the party in progress his friend who lives at one end of the hall. But at the other end of the hall lives somebody he definitely does not want to invite. He is so „tired", though, that he realises his decision to make a step in the desired direction only in 53 out of 100 attempts (in the remaining 47 he makes a step in exactly the opposite direction). Assuming that he starts in the middle of the hall and that the distance to each of the doors at the ends corresponds to twenty of his awkward steps, determine the probability that he first reaches the desired door. O
3.64. Let n ∈ N persons play the so-called „silent post". For simplicity, assume that the first person whispers to the second person exactly one (arbitrarily chosen) of the words „yes", „no". The second person then whispers to the third the one of the words „yes", „no" that the second person thinks was whispered by the first person. This then continues up to the n-th person. If the probability that the word changes (on purpose or accidentally) to the other word during one transmission equals p ∈ (0, 1), determine, for big n ∈ N, the probability that the n-th person correctly receives the word transmitted by the first person.
Solution. We can view this problem as a Markov chain with two states called Yes and No, saying that the process is in the state Yes at the time m ∈ N if the m-th person thinks that the received word is „yes". For the order of the states Yes, No, the probabilistic transition matrix is

T = ( 1−p   p  )
    (  p   1−p ).

The product of the matrix T^{m−1} and the probabilistic vector of the initial choice of the first person then gives the probabilities of what the m-th person thinks. We do not even have to compute the powers of this matrix: all the elements of the matrix T are positive numbers and, moreover, this matrix is doubly stochastic. Thus we know that for big n ∈ N the probabilistic vector is close to the vector (1/2, 1/2)^T. The probability that the n-th person says „yes" is thus approximately the same as the probability that the n-th person says „no", independently of the initial word. For a big number of participants it thus holds that roughly half of them hear „yes" (we repeat that this does not depend on the initial word).
For completeness, let us determine what the result would be if we assumed that the probability of a change from „yes" to „no" is for every person equal to p ∈ (0, 1) and the probability of a change from „no" to „yes" is equal to a (in general distinct) q ∈ (0, 1). In this case, for the same order of the states, we obtain the probabilistic matrix

T = ( 1−p   q  )
    (  p   1−q ),

which leads (for big n ∈ N) to a probabilistic vector close to the vector

( q/(p + q), p/(p + q) )^T;

this follows, for instance, from the expression of the matrix T^n. Again, with sufficiently many people, it does not depend on the initial choice of the word. Simply speaking, in this model the people themselves decide what the transmitted information is; more precisely, the people themselves determine the frequency of occurrence of „yes" and „no", if there are enough of them (and there is no checking present).

Let us further add that the obtained result was experimentally confirmed. In a psychological experiment, an individual was repeatedly exposed to an event that could be interpreted in two ways, in time intervals that ensured that the subject still remembered the previous event. See for instance „T. Havránek et al.: Matematika pro biologické a lékařské vědy, Praha, Academia 1981", where an experiment is described in which an ambiguous object (say, a drawing of a cube which can be perceived both from the bottom and from the top) is lighted on in fixed time intervals. Such a process is a Markov chain with the transition matrix

T = ( 1−p   q  )
    (  p   1−q ),

where p, q ∈ (0, 1). □
3.65. In a certain game you can choose one of two opponents. The probability that you beat the better one is 1/4, while the probability that you beat the worse one is 1/2. But the opponents cannot be distinguished, so you do not know which one is the better. A big number of games awaits you (and for each of them you can choose a different opponent), and of course you want to reach as high a ratio of wins as possible. Consider these two strategies:

1. For the first game choose the opponent randomly. If you win a game, carry on with the same opponent; if you lose it, change the opponent.
2. For the first two games, choose one opponent randomly. Then, for the next two games, change the opponent if you lost both previous games, otherwise stay with the same one.

Which of the strategies is better?
Solution. Both strategies lead to a Markov chain. For simplicity, denote the worse opponent by A and the better one by B. In the first case, for the states „game with A" and „game with B" (in this order) we obtain the probabilistic transition matrix

T = ( 1/2  3/4 )
    ( 1/2  1/4 ).

This matrix has all elements positive, and thus it suffices to find the probabilistic vector x∞ associated with the eigenvalue 1. It holds that

x∞ = (3/5, 2/5)^T.

Its components correspond to the probabilities that, after a long row of games, the opponent is the player A or the player B. Thus we can expect that 60 % of the games will be played against the worse of the opponents. Because

2/5 = (3/5)·(1/2) + (2/5)·(1/4),

roughly 40 % of the games will be won.
For the second strategy, let us use the states „two games in a row with A" and „two games in a row with B", which lead to the probabilistic transition matrix

T = ( 3/4  9/16 )
    ( 1/4  7/16 ).

We easily determine that now

x∞ = (9/13, 4/13)^T.

Against the worse opponent we would then play (9/4)-times more frequently than against the better one. Recall that for the first strategy it was only (3/2)-times. The second strategy is thus better. Let us also note that with the second strategy roughly 42.3 % of the games are won; it suffices to evaluate

0.423 ≐ 11/26 = (9/13)·(1/2) + (4/13)·(1/4).
□
3.66. Petr regularly meets his friend. He is, however, „well-known" for his bad timekeeping. He is trying to change, so it holds that if he was late for the last meeting, then in half of the cases he comes on time and in one tenth of the cases even sooner than he should. If he was on time or sooner for the last meeting, he returns to his „carelessness" and comes late with probability 0.8, and on time with probability only 0.2. What is the probability that he comes late to the 20th meeting, if he was on time on the eleventh?
Solution. Clearly it is a Markov process with the states „Petr came late", „Petr came on time", „Petr came sooner", with the probabilistic transition matrix (for the given order of states)

T = ( 0.4  0.8  0.8 )
    ( 0.5  0.2  0.2 )
    ( 0.1  0    0   ).

The eleventh meeting is described by the probabilistic vector (0, 1, 0)^T (we know for sure that Petr came on time). To the twentieth meeting corresponds the vector

T^9 (0, 1, 0)^T = (0.571578368, 0.371316224, 0.057105408)^T.

The desired probability is thus 0.571578368 (exactly). Let us add that

T^9 = ( 0.571316224  0.571578368  0.571578368 )
      ( 0.371512832  0.371316224  0.371316224 )
      ( 0.057170944  0.057105408  0.057105408 );

from there we see that it really does not matter whether he came to the eleventh meeting late (first column), on time (second) or sooner (third). □
3.67. Two students, A and B, spend every Monday morning playing a certain computer game. The one who wins then pays for both of them in the evening in the restaurant. The game can also end in a draw; then each pays half. The result of the previous game partially determines the next one. If a week ago student A won, then he wins again with probability 3/4 and with probability 1/4 the game is a draw. A draw is repeated with probability 2/3, and with probability 1/3 the next game is won by B. If student B won a game, then he wins again with probability 1/2 and with probability 1/4 student A is the winner of the next game. Determine the probability that today each of them pays half of the costs, if the first game, played a long time ago, was won by A.
Solution. We are given a Markov process with the states „student A wins", „the game ends in a draw", „student B wins" (in this order), with the probabilistic transition matrix

T = ( 3/4   0    1/4 )
    ( 1/4  2/3   1/4 )
    (  0   1/3   1/2 ).

We want to find the probability of the transition from the first state to the second one after a big number n ∈ N of steps (weeks). The matrix T is primitive, because

T² = (  9/16   1/12   5/16 )
     ( 17/48  19/36  17/48 )
     (  1/12   7/18   1/3  ).
It thus suffices to find the probabilistic eigenvector of the matrix T associated with the eigenvalue 1. It is easy to compute that

x∞ = (2/7, 3/7, 2/7)^T.

We know that for big n the probabilistic vector differs only very slightly from x∞ and does not depend on the initial state; that is, for big n ∈ N we can set

T^n ≈ ( 2/7  2/7  2/7 )
      ( 3/7  3/7  3/7 )
      ( 2/7  2/7  2/7 ).

The desired probability is the element of this matrix in the second row and the first column (the second component of the vector x∞). Thus we have (quite quickly) found the result 3/7. □
Solutions of the exercises
3.2. The daily diet should contain 3.9 kg of hay and 4.3 kg of oats. The costs per foal are then 13.82 Kč.
3.3. 1
3.12. 1
3.13. x_n = 2√3 sin(n·π/6) − 4 cos(n·π/6).
3.14. x_n = −3(−1)^n − 2 cos(n·2π/3) − 2√3 sin(n·2π/3).
3.15. x_n = (−1)^n(−2n² + 8n − 7).
3.24. The Leslie matrix of the given model is (the mortality of the first group being denoted by a)

( 0  2  2 )
( a  0  0 )
( 0  1  0 ).

The stagnation condition corresponds to the fact that the matrix has 1 as an eigenvalue, that is, that the polynomial λ³ − 2aλ − 2a has 1 as its root; hence a = 1/4.
3.27. The matrix of the process has the dominant eigenvalue 1, the corresponding eigenvector being (6/5, 1). Because the eigenvalue is dominant, the ratio of the viewers stabilises at 6 : 5.
3.30. As in (3.29), the game ends after three bets. Thus all the powers of A, starting with A³, are identical:

A³ = ( 1  7/8  3/4  1/2  0 )
     ( 0   0    0    0   0 )
     ( 0   0    0    0   0 )
     ( 0   0    0    0   0 )
     ( 0  1/8  1/4  1/2  1 ).
3.40. We can use the result of the exercise called Ruining of the player. The probability that the first department is cancelled is according to this exercise equals to
1 _ / 0-46 \5 1 \ 1-0.46/
.100
,25
= 0.56.
1 _ / 0-46 y 1 \ 1-0.46/
It was enough to plug in p — 1 — 0.54, y — 10/2 and x — 40/2 to (||3.6||). It is thus more clever to choose
the smaller department.
3.50.
• The claim holds. (B := AT A, btj — (i-th row of AT) ■ (j-th column of A)= bjt AT) ■ (;'-th column of A)=(j-th column of A) • (;'-th row of AT)
A 1N
(j-th row of
The claim does not hold. Consider for instance A
0 1
3.52.
1
0
3.63. Again it is a special case of the Ruining of the player. It suffices to reformulate the statement accordingly. For p = 0, 47, y = 20 and x = 20 from (||3.6||) follows the result
1
0,917 =
V 1-0,47 /
/_047_\
V 1-0,47 /
206
CHAPTER 4
Analytic geometry
position, incidence, projection
- and we return to matrices again...
A. Affine geometry
4.1. Find the parametric equation for a line in R3 given by equations
x - 2y + z = 2, 2x + y - z = 5.
Solution. It is obviously sufficient to solve the equation system. However we can use different approach. We need to find non-zero direction vector ortogonal to normal vectors (1, —2, 1), (2, 1,-1). Cross product
(1, -2, 1) x (2, 1, -1) = (1,3,5) gives us such vector. We can notice that triple (x, y, z) = (2,-l,-2) satisfies the respective system and we obtain the solution
[2,-1,-2]+ t (1,3, 5), t eR.
□
4.2. Plane in R4 is given by its parametric equation
q : [0, 3, 2, 5] + t (1, 0, 1, 0) + s (2, -1, -2, 2), t, s e Find its implicit equation.
Now we come back to our view on geometry that we had when we studied positions of points in the plane in the 5th part of the first chapter, c.f. 1.23. First we will be interested in properties of objects in the Euclidean space, delimited by points, straight fines, planes etc. The essential point will be to clarify how their properties are related to the notion of vectors and whether they depend on the notion of length of vectors.
In the next part, we will use linear algebra for the study of objects which are defined in a nonlinear way. To do this we will need a little bit more from the theory of matrices again. The results will be important later on, while discussing the technique for optimalization, i.e. searching for extrema of functions.
At the end of this chapter we show how the projectivization of affine spaces help us to get a simplification and stability of algorithms typical for computer graphics.
1. Affine and euclidean geometry
While we were clarifying the structure of solutions of linear equations in the first part of the previous chapter we found out in paragraph 3.1 that all solutions of non-H homogeneous systems of linear equations does not form vector spaces but always arise in such a way that to an one particular solution we add the vector space of solutions of the corresponding homogeneous system. On the other hand, the difference of any two solutions of the nonhomogeneous system is always a solution of the homogeneous system. This behaviour is similar to the behaviour of linear difference equations, as we have seen in paragraph 3.14 already.
4.1. Affine spaces. A direction how to deal with the theory is given already in the discussion about the geometry of the plane, c.f. paragraph 1.25 and further. There we described straight lines and points as sets of solutions of systems of linear equations. Any line was considered as a one-dimensional subspace, although its points were described by two coordinates. Parametrically, the fine was defined by the sum of a single point (i.e. to a pair of coordinates) and multiples of a fixed direction vector. Now we will proceed in the same way in arbitrary dimension.
___J Standard affine space J___
Standard affine space A„ is a set of all points in M" = A„ together with an operation which to a point A — (a\,..., an) e An and a vector i; — (v\,... ,v„) e Rn — V assigns the point
a + v — (ai + di, .
v„) e
An •
CHAPTER 4. ANALYTIC GEOMETRY
Solution. Our task is to find a system of equations with 4 variables x, y, z, u (because dimension of the space is 4) which are satisfied by the coordinates of precisely those points which he in the plane. Note that sought system must contain 2 = 4 — 2 linearly independent equations. We solve the problem by so called elimination of parameters. Points [x,y,z,u] € q satisfy
x = t + 2s,
y = 3 — s,
z = 2 + t - 2s,
u = 5 + 2s,
where t,s sR. We can express the system as matrix
/1 2 -1 0 0 0 0 \
0 -1 0 -1 0 0 3
1 -2 0 0 -1 0 2
2 0 0 0 -1 5/
where the first two columns are direction vectors of the plane, followed by negative identity matrix and finally the last column is vector of coordinates of point [0, 3, 2, 5]. We expressed the system in such a way so that it is a system in t, s, x, y, z, u and we move all the unknown variables to the one side of the equations. We transform obtained matrix using elementary row operations in order to get as much zero-rows on the left-hand side of the first vertical line. Adding (—1)-times the first row and (—4)-times the second row to the third row and adding twice the second row to the first row we obtain
/1 2 -1 0 0 0 0 \
0 -1 0 -1 0 0 3
1 -2 0 0 -1 0 2
\° 2 0 0 0 -1 5 )
(1 2 -1 0 0 0 0 \
0 -1 0 -1 0 0 3
0 0 1 4 -1 0 -10
0 0 -2 0 -1 11 /
Which implies result
+
4y -2y
- 10 u + 11
0, 0.
Coefficients on the right-hand side of the first vertical line, respective to the rows which are zero-rows on the left-hand side of that line, are the coefficients of general equations of a planes. Note that if we expressed the original system as a matrix
/1 0 0 0 1 2 0 \
0 1 0 0 0 -1 3
0 0 1 0 1 -2 2
0 0 1 0 2 5/
This operation satisfies the following three properties:
(1) A + 0 — A for all points A e A„ and the null vector 0 e V,
(2) A + (v + w) — (A + v) + w for all vectors v,weV and points A e A„,
(3) for every two points A, B e A„ there exists exactly one vector v e V such that A + v — B. This vector is denoted hy v — B — A, sometimes also AB.
The underlying vector space M" is called the difference space of the standard affine space A„.
We notice a danger of several formal ambiguities. We are
'i§t# using the same symbol "+" for two different oper-'''jfxj/k ations: for adding a vector from the difference space t0 a p0int in the affine space, and for for summing vectors in the difference space V — Rn. Also we do not introduce specific letters for the set of points in the affine space, i.e. A„ denotes both this set of points and also the whole structure defining the affine space.
Why do we actually want to distinguish between the set of points in the affine space A„ and its difference space V when both spaces can be viewed as M" ? It is going on fundamental formal step to understanding the geometry in M": The thing is that the geometric objects like straight lines, points, planes etc. do not depend directly on the vector space structure of the set M", and do not depend at all on the fact that we are working with n-tuples of scalars. We only need to know what it means to move "straight in a given direction". For instance, we consider the affine plane as an unbounded board without chosen coordinates but with the possibility to move about a given vector. When we switch to such abstract view, we will be able to discuss the "plane geometry" for two-dimensional subspaces, i.e. planes in higher-dimensional spaces, the geometry of "Euclidean space" for three-dimensional subspaces etc., without the need to work with ^-tuples of coordinates.
This point of view is present in the following definition:
4.2. Definition. The affine space A with the difference space V is a set of points V, together with the map
V
(A, v)
where V is a vector space and our map satisfies the properties (l)-(3) from the definition of the standard affine space.
So for a fixed vector i; e V we get a translation rv : A -> A as the restricted map
tv : V ~ V x {v} -* V, A\-^ A + v.
By the dimension of an affine space A, we mean the dimension of its difference space.
In sequel we do not distinguish accurately between denoting the set of points A and the set of vectors V, we talk about points and vectors of the affine space A instead.
It follows immediately form the axioms that for arbitrary points A, B, C in the affine space A
(4.1)
(4.2)
(4.3)
A - A = 0 e V B — A — -(A - B) (C - B) + (B - A) = C - A.
Indeed, (4.1) follows from the fact that A+0 — 0 and that such vector is unique (the first and the third defining property). By adding
208
CHAPTER 4. ANALYTIC GEOMETRY
where x,y,z,u remains on the left-hand side of the equations, similar
transformation
/ 1 0 0 0
0 10 0
0 0 10
V 0 0 0 1
gives us the result
1 2 0 \ / 1 0 0 0 1 2 0 \
0 -1 3 0 1 0 0 0 -1 3
1 -2 2 -l -4 1 0 0 0 -10
0 2 5 ) V 0 2 0 1 0 0 11 /
4y 2y
+
+ u
10,
11.
When expressing system as a matrix, it is important to take into consideration whether the vertical line separates left-hand side from right-hand side. As we saw in this exercise, parameter eUmination method can be long-winded and it is not difficult to make a mistake along the way.
Another solution All we wanted to obtain in fact, are two linearly independent normal vectors, i.e. vectors perpendicular to (1, 0, 1, 0), (2,-1, —2, 2). If we "guessed" that these vectors could be for example (0, 2, 0, 1), (-1, 0, 1, 2), inputting x = 0, y = 3, z = 2, u = 5 to the equations
2y + u = a,
—x + z + 2u = b
we get a = 11, b = 12, and the sought implicit expression is
2y + u = 11,
+ 2u = 12.
+ z
u 2u
□
4.3. Find a parametric equation of the plane passing through points
A = [2, 1, 1], £ = [3,4, 5], C = [4, -2, 3].
Then find a parametric equation of the open half-plane containing the point C and bounded by line going through the points A, B.
Solution. We need one point and two (linearly independent) vectors lying in this plane for the parametric equation of the plane. It is enough to choose the point A and vectors 5 — A = (1,3, 4) and C — A = (2, —3, 2), which are obviously independent. A point [x, y, z] lies in the plain if and only if there exist numbers t, s € R so that
x =2 + 1-t+2-s, y = l + 3- t - 3-s, z = 1 + 4 ■ t + 2 ■ s;
which means the parametric equation is
[2, 1, 1] + t (1, 3, 4) + s (2, -3, 2), t, s e R.
Setting s = 0 gives us a line passing through points A, B. For t = 0, s > 0 we get a ray with initial point A and passing through C. Particular but arbitrarily choosen t e R and variable s > 0 gives us a ray initiated on the border line and going through the half-plane in
successively B — A and A — B to A, according to the second defining property we obtain obviously A again. So we added the null vector which proves (4.2). Similarly, (4.3) follows from the defining property 4.1 (2) and the uniqueness.
Let us remark that the choice of one fixed point Aq e A determines a bijection between V and A. So for a fixed basis u in V we get for every point A e A a unique expression
A = Aq + x\u\ + ■ ■ ■ + x„u„.
We talk about an affine coordinate system (Aq; u\,..., un) given by the origin of the affine coordinate system Aq and the basis u of the corresponding difference space, or also about an affine frame
(Aq, w).
We can summarize the situation as follows: Affine coordinates of a point A in the frame (Aq, u) are the coordinates of the vector A — Aq in the basis u of the difference space V.
The choice of an affine coordinate system identifies each n-dimensional affine space A with the standard affine space A„.
4.3. Affine subspaces. If we choose only such points in A which have some of in advance chosen coordinates equal to zero (for instance the last one), we obtain again a set which behaves as an affine space. Indeed, this is the spirit of the following definition of the so called affine
Ma subspaces.
Subspaces of an affine space
Definition. The nonempty subset Q c A of an affine space A with a difference space V is called an affine subspace in A if the subset W = {B — A; A, B e Q} c Visa vector subspace and for any A e Q, v e W we have A + v e Q.
It is important to include both of the conditions in the definition since it is easy to find examples of sets which satisfy the first condition but not the second one. Have a think about a straight line in the plane with one removed point.
For an arbitrary set of points M c A in an affine space with a difference space V, we define the vector space
Z(M) = ({B — A; B, A e M}) C V
of all vectors generated by the differences of points in M.
In particular, V = Z(A) and every affine subspace Q c A itself satisfies the axioms for an affine space with the difference space Z(Q).
Directly from the definitions we also get that the intersection of any set of affine subspaces is either an affine subspace or the empty set.
The affine subspace (M) in A generated by a nonempty set M c A is the intersection of all affine subspaces which contain all points of the subset M.
„ Affine hull and parametric description of a subspace \^
The affine subspaces can be nicely described by their difference spaces if we choose a point Aq e M in a generating set M. Indeed, we get (M) = {Aq + v; v e Z(M) c Z(A)}, i.e. to generate the affine subspace we take the vector subspace Z(M) in the difference space generated by all differences of points in M, and we add this vector space to an arbitrary point in M. We talk also about the affine hull of the set of points M in A.
209
CHAPTER 4. ANALYTIC GEOMETRY
which point C lies. That means that the sought open half-plane can be expressed parametrically as
□
[2, 1,1] + ? (1, 3, 4) + s (2, -3, 2), «el,s> 0.
4.4. Determine relative position of lines
p : [1,0, 3] + t (2,-1,-3), (el,
q : [1,1, 3] + s (1,-1,-2), s el.
Solution. We will find common points of given lines (subspaces intersection). We get a system
1 + 2t = 1 + s, 0 - t = 1 - s, 3 - 3t = 3 - 2s.
>From the first two equations we get that t = 1, s = 2. However, this does not satisfy the third equation. The system does not have a solution. Direction vector (2, —1, —3) of the line p is not a multiple of direction vector (1, — 1, —2) of the line q which means that the lines are not parallel. Hence, they are skew lines. □
4.5. Find all numbers a elso that lines
p : [4, -4, 8] +t (2, 1, -4), (el, q : [a, 6, -5] + s (1, -3,3), s € R
are intersecting.
Solution. Lines are intersecting if and only if the system
+
4 -4
+ 2t + t - At
a
s,
6 - 3s, 5 + 3s
has exactly one solution. Expressing the system as a matrix (the first column corresponding to t, the second to s), we solve
1 2 -1 a - 4 \
1 3 10
V-4 -3 -13 )
3 10
-1 a - A
-3 -13
\ 10
1 a-24
3
1
We see that the system has exactly one solution if and only if the second row is a multiple of the third row. This property is satisfied only for a = 3. Let us add that the point of intersection of the lines is [6, —3, 4].
□
On the other hand, whenever we choose a subspace U in the difference space Z(A) and a fixed point A e A the subset A + U, created by all possible sums of the point A and all vectors in U, is an affine subspace. This approach leads to the notion of parametrization of subspaces:
Let Q = A + Z(Q) is an affine subspace in A„ and («i,... ,uk) is a basis of Z(Q) c Rn. Then the expression of the subspace
Q — {A + hui +■■■ + tkuk; t\, ... ,tk Z(B) between their difference spaces such that for all A e A, v e Z(A) the following holds
f(A + v) = f(A) + cp(v).
The maps / and P(A,C)
(4) In each cartesian coordinate system (Ao; e), the distance of the points A — Aq + a\e\ + • • • + anen, B — Ao + b\e\ +
----\-b„en is yjj2"=i(ai ~ bi)2-
(5) Given a point A and a subspace Q in £„, there exists a point P e Q which minimalizes the distance between A and the points in Q. The distance between A and P is equal to the length of the orthogonal projection of the vector A — B into Z(Q)-1 for an arbitrary B e Q.
(6) More generally, for subspaces Q and 1Z in £„ there exist points P e Q and Q e 1Z which minimalize the distances of points B € Q and A € 1Z. The distance between the points P and Q is equal to the length of the orthogonal projection of the vector A — B into Z(Q)1- for arbitrary points B e Q and A e 7Z.
Proof. The first three properties follow directly from the ^ properties of length of vectors in spaces with a scalar product, the fourth one follows directly from the expression of the scalar product in an orthonormal basis.
218
CHAPTER 4. ANALYTIC GEOMETRY
4.21. In vector space R4 compute distance i; between point [0, 0, 6, 0] and vector subspace
U : [0, 0, 0, 0] + h (1, 0, 1, 1) + h (2, 1, 1, 0) + h (1, -1, 2, 3),
Solution. We will solve the problem by the least squares method. Let U's generating vectors be columns of matrix
/l 2 1 \
A =
■1
0 1
1 1 2 \1 0 3/
and we substitute point [0, 0, 6, 0] by corresponding vector b =
(0, 0, 6, 0)T. We will solve A ■ x = b, i.e. linear equation system
X\ + 2x2 + x3 = 0,
x2 — xj, = 0,
x\ + x2 + 2x3 = 6,
x\ + 3x3 = 0,
by least squares method. (Note that the system does not have a solution
- the distance would be 0 otherwise.) Let's multiply A ■ x = b by
matrix AT from the left-hand side. Augmented matrix AT ■ A ■ x =
AT ■ b then is
3 3 6 3 6 3 6 3 15 12
By elementary row operations we transform the matrix to the normal form
3 3 3 6 6 3
3 15
12 j \ 0 -3 3 We continue with backward eUmination 1 1 2 0 1 -1 0 0 0
and see the solution
■1
0
x = (2 - 3t, t, t) T , t e R.
Note that the existence of infinitely many solutions is caused by third vector generating U, which is redundat because
3 (1, 0, 1, 1) - (2, 1, 1, 0) = (1, -1, 2, 3).
Arbitrary (t e R) linear combination
(2 - 3f) (1, 0, 1, 1) + t (2, 1, 1,0) + ? (1, -1, 2, 3) = (2, 0, 2, 2)
corresponds to a point [2, 0, 2, 2] in subspace U, which is the nearest point to [0, 0, 6, 0]. The distance is therefore
u = || [2, 0, 2, 2] - [0, 0, 6, 0] || = V22 + 0 + (-4)2 + 22 = 2^6.
□
Let us look at the relation for the minimal distances p(A, B) for B e Q. The vector A — B decomposes uniquely as A — B = u\ +u2, where u\ e Z(Q), u2 e Z(Q)-1. The component u2 does not depend on the choice ofB € Q since any potential change of the point B would show by adding a vector from Z(Q). Now let us choose P = A + (—u2) = B + u\ e Q. We get
BH2 =
ll"2ll > ll"2ll — IIA ■
From here we see that the minimal possible distance is reached exactly for our point P and its value is ||«2 II indeed.
We get the general result in a similar way. For the choice of arbitrary points A e 1Z and B e Q their difference is given as a sum of vectors u\ e Z(K) + Z(Q) and u2 e (Z(K) + Z(Q))J-, where the component u2 does not depend on the choice of the points. Adding suitable vectors from the difference spaces of 1Z and Q we obviously obtain points A' and B' whose distance is exactly
II"2||. □
Now we extend our brief overview of elementary problems in the affine geometry.
4.17. Examples of standard problems. (1) To find the distance from the point A € £„ to the subspace Q C £„:
A method of solving such problem is given in the proposition 4.16.
(2) In £2 to construct the straight line q through a given point A which form a given angle with a given line p:
Let us remind that we have worked with angles between vectors in the plane geometry already (see e.g. 2.43). We find a vector u € M2 lying in the difference space of the line q, and we choose a vector i; having the prescribed angle with u. The desired line is given by the point A and the difference space (v). The problem has either two solutions or only one solution.
(3) To find the perpendicular from a point to a given line:
The procedure is introduced in the poof of the last but one item of the proposition 4.16.
(4) In £3 to determine the distance of two lines p, q:
We choose arbitrarily one point from each of the lines, A e p, B e q. The component of the vector A—B lying in the orthogonal complement (Z(p) + Z(q))1- has the length equal to the distance between p and q.
(5) In £3 to find the axis of two skew lines p a q:
By the axis we mean the crossbar which realizes the minimal possible distance of the given skew lines in terms of the points of intersection. Again, the procedure can be derived from the proof of the proposition 4.16 (the last item). Let rj is the subspace generated by a single point A e p and the sum Z(p) + (Z(p) + Z(q))-1. Provided that the lines p and q are not parallel, it is going to be a plane. Then the intersection rjllq together with the difference space (Z(p)+Z(q))-L give the parametric expression of the desired axis. If the lines are parallel, then the problem has an infinite number of solutions.
4.18. Angles. Various geometric notions like angles, orientation, volume etc. in the point spaces £„ are defined in
f ~-±Z% terms of suitable notions from the vector euclidean spaces just as the notion of the distance. Let us remind that we defined the angle between two vectors at the end of the third part of the second chapter, see 2.43.
219
CHAPTER 4. ANALYTIC GEOMETRY
4.22. Compute volume of parallelepiped in R3 with base in plane z, = 0 and with edges given by pairs of vertices [0, 0, 0], [-2, 3, 0]; [0, 0, 0], [4, 1, 0] a [0, 0, 0], [5, 7, 3]. Solution. Parallelepiped is given by vectors (4,1,0), (—2,3,0), (5,7,3). We know that its volume is defined as determinant 4-2 5
Indeed, from Cauchy inequality follows 0
< 1, and
3 0
3 • 14 = 42.
Note that if we modified the order of vectors, we would get result ±42, because determinant gives us oriented volume of parallelepiped. Further note that the volume would not change if the third vector was [a,b,3] for arbitrary a, b € R. Its surface obviously depends only on ortogonal distance of planes of its upper and lower base and their area
4 -2
1
14.
□
4.23. Let points [0, 0, 1], [2, 1, 1], [3, 3, 1], [1, 2, 1] define a paral-leloid. Determine point X lying on line p : [0, 0, 1] + (1, 1, l)t so that parallelepiped defined by given paralleloid and point X has volume of 1.
Solution. We will form a determinant which gives us volume of a parallelepiped with X moving along line p:
t t
1 0
2 0
Volume should be 1 which introduces condition t = 1/3.
□
4.24. Let A BCD EF GHbea cube (with common notation, i.e. vectors E — A, F — B, G — C, H — D are orthogonal to the plane defined by vertices A, B, C, D) in Euclidean space R3. Compute angle cp between vectors F — A a H — A.
Solution. We have solved this problem using formula for angle between vectors. Let's think about the problem further. Vertices A, F, H are vertices of a triangle with all sides of the same length, it is hence equilateral triangle and therefore cp = jt/3. □
4.25. Let 5 be a midpoint of edge AB of cube ABCDEFGH (with common labelling). Compute cosine of angle between lines £"5 and BG.
Solution. Dilatation (homotethy) is similar mapping, hence it preserves angles. We can therefore asume that the cube edge has length 1. Further, we can place the point A to the origin of coordinate system and points B and E to points [1, 0, 0] and [0, 0, 1] respectively. Other coordinates are then given: S = [1/2, 0, 0], G = [1,1,1], vector
MINI
so it has sense to define the angle cp(u, v) between vectors u, v e V in a real vector space with a scalar product given by the equation
u ■ v
cos cp(u, v) — -, 0 < cp(u, v) < 2tt.
I"l III'll
This is completely in accordance with the situation in the two-dimensional euclidean space R2 and with our philosophy that the notion related to the two vectors is the issue of the plane geometry in fact.
In the euclidean plane, we used also the geometric functions cos and sin which we defined by a pure geometric consideration. We will come back to this in the beginning of the fifth chapter, when we will be able to check precisely the geometric opinion that the function cos is decreasing in the interval [0, it]. Therefore, the angle between two vectors in higher-dimensional spaces is measured in the plane which is generated by these two vectors (or it is zero), and our defining relation corresponds to the conventions in all dimensions.
In an arbitrary real vector space with a scalar product, it follows directly from definitions that
v\\2 =
■ + |MI - 2(« • v)
2 2
— \\u\\ + ||«11 — 2||w|| ||u|| coscp(u, v).
This is evidently the well known law of cosines from the plane geometry.
Next, the following relation holds for each orthonormal basis e of the difference space V and a non-zero vector u e V
\\u\\2 — ^2 \u • ei\2-
i
By dividing this equation by the number ||» ||2 we get
1 = ^(cos^(w, e,))2,
which is the law of directional cosines cp(u, e{) of the vector u.
Now we can derive reasonable definitions for angles between general subspaces in an euclidean vector space from the definitions of angles between vectors. Concurrently we must decide how to deal with cases, where the subspaces have a nontrivial intersection. As the angle between two lines, we want to take the smaller one from the two possible angles, in the case of two nonparallel planes in R3 we do not want to say that the angle is zero since they intersect and have one direction in common:
Angles between subspaces [__<
4.19. Definition. Let us consider finite-dimensional subspaces Ui, U2 in an euclidean vector space V of an arbitrary dimension.
The angle between vector subspaces U\, U2 is the real number a — cp(Ui, U2) e [0, j] satisfying: (1) If dimt/i = dimt/2 = 1, U\ = (u), U2 = (u),then
|w.u|
cos a = -.
(2) If the dimensions of U\, U2 positive and U\ n U2 — {0}, then the angle is the minimum of all angles between one-dimensional subspaces
a — min{ From this system we can see that y = — z, and x = 2z. Vector
(2, —1, 1) is therefore direction vector of p; in other words, we have
(p is obviously passing through the origin)
p : [0, 0, 0] + t (2,-1, 1), t eR.
For angle l)) —
Proof. According to the Cauchy inequality, for all vectors u e U we have
\u ■ v\ \u ■ (v\ + v2)\ \u ■ V\\
\\u\\ \\v\ \u\\ IIu 1II \\u\\ III'll
\v\ I
\v\ I
\v\\ \\v\ I
\v\ ■ v\ \\v\\ \\v\ II
This implies
cos(p((v), («)) < cos(p((v), (v\))
\vi I
+ 3b2 + 2V3ab.
and thus the vector v\, which we have found, represents the largest possible value of the cosine of angles between all choices of vectors inU. But since the function cos is decreasing on the interval [0, j], we get the smallest possible angle in this way, and so the claim is proved. □
4.21. Calculating angles. The procedure in the previous lemma ;i can be understood as follows. We take the orthogonal projection of the one-dimensional subspace generated by 1; t into the subspace U, and we look at the ratio between 1; 1 ^ ' and its image. A similar procedure is used in the higher dimension too. However, the problem is to recognize the directions whose projections give the desired (minimal) angle. We can see this in our previous example if we project the bigger space U into one-dimensional (1;) first, and then orthogonally back to U. We find out that the desired angle corresponds to the direction of
221
CHAPTER 4. ANALYTIC GEOMETRY
If we use a2 + b2 = 1, we get
0 = 2b2 + 2V3ab, tj. 0 = b (b + V3aj .
Together (remember that c = 3a and a2 + b2 = 1)
1 73
a = ±1, & = 0, c = ±3; a = ±-, & = T—, c
2 2
We can easily check that lines determined by those coefficients
x + 3 = 0, satisfy all the conditions.
1
- x
2
73 3 — V + - = 0 2^2
3
±-.
2
□
4.28. Determine general equation of all planes so that angle between every such plane and plane x + y + z — 1 = 0is 60°, and further, they contain line p : [1, 0, 0] + t (1, 1, 0). O
4.29. Determine angles between planes
a: [1,0, 2] + (1,-1, l)t + (0,1,-2)5 p: [3, 3, 3]+ (1,-2,0)? + (0,1,1)5
Solution. Line of intersection between planes has direction vector (1,-1,1), plane ortogonal to this vector has intersection with given planes generated by vectors vektory (1,0, —1) a (0, 1, 1). Angle between these one-dimensional subspaces is 60°. □
4.30. Cube ABCDA'B' C D' (in standard notation, i.e. ABCD and A'B' C D' are faces and A A' is an edge). Compute angle between AB' and AD'.
Solution. Consider cube of side 1 and place it in M3 in such way that
vertex A has coordinates [0, 0, 0], vertex B coordinates [1, 0, 0] and
vertex C coordinates [1, 1, 0]. Then vertex B! has coordinates [1, 0, 1]
and vertex D' coordinates [0, 1, 1]. We can determine vectors AB' =
B' — A = [1,0, l]-[0, 0, 0] = (1,0, 1), AD' = D'-A = [0, 1, 1]-
[0, 0, 0] = (0, 1, 1). By definition of angle U2 as before. Similarly, let i/> : U2 -> U\ be the map which has arisen from the orthogonal projection onU\. In the bases (e\, ek) and (e'j,..., e'}), these maps have matrices
A =
ek ■ ex
B =
\e\-e\ ... ek-e'J
l -e\
erei
\e[ ■ ek ... e'r ekJ
Since we are regarding scalar products on a real vector space, et ■ e'j — e'j -et holds for all indices i, j, in particular we have B — A T.
The composition of maps fofi : U\ —>• U\ has therefore a symmetric positive semidefinite matrix AT A, and i/> is an adjoint map to i ) + pi t>i ) +qi (u2, vt )
- p2(v1,v1) - q2(v2,v1) = 0,
((-3,2, -5, -7, -3),v2) + pi (uuv2) +qx (u2,v2)
- p2 (vu v2) - q2 (v2, v2) = 0. By computing those dot products we get linear equation system
tj-
We define the absolute value of the volume of a parallelepiped inductively such that we fulfil the idea that it is the product of the volume of the "base" and the "altitude":
\Vol\Tk(A;Ul,
I Vol\Vi(A; «0 = ||«i II
■ ,«*)= Ik* II I Vol |7Vi(A; «i,
■, w*-i).
If u\,..., un is a basis agreeing with the orientation of V, we define the (oriented) volume of the parallelepiped by
VolP*(A; «i,..., u„) — | Vol\Vk(A; uu..., u„),
in the case of a nonagreeing basis we set
VolP*(A; wi,..., un) — —| Vol\Vk(A; uu ..., un).
The following claim clarifies our former comments that the determinant expresses the volume in a sense. The thing is that the first claim says exactly that we get the volume of the parallelepiped in a ^-dimensional space, which is stretched on k vectors, such that we write down their coordinates (in an orthonormal basis) into columns of a matrix and we calculate the determinant.
The formula in the second claim is called Gramm determinant. Its advantage is that it is independent on the choice of basis and, therefore, it is better to handle in the case that k is lower then the dimension of the whole space.
Theorem. Let Q c £n be an euclidean subspace, and let (e\,..., ek) be its orthonormal basis. Then for arbitrary vectors u\, ... ,uk € Z(Q) and A € Q the . following holds
(1) VolTk(A; ux
Uk) —
(2) (VolTk(A;Ul,...,Uk)y
Proof. The matrix
u\ ■ e\ u\ ■ u\
u\ ■ Uk
Uk ■ e\ Uk ■ ek
.. Uk ■ U\ .. Uk ■ Uk
A = :
\u\-ek ..
has the coordinates of vectors u\,. columns, and
|A|2= \A\\A\ = \A-u\ ■ u\
U\ ■ Uk
Uk ■ ei\
Uk ■ ek) . ,uk in the chosen basis in
|A Uk ■ u\
Uk ■ Uk
\ATA\
Hence we see that if (1) holds, then also (2) holds.
The unoriented volume is directly form the definition equal to the product
\Vol\Vk(A;ui,. where ui — u\, v2 — u2
uk) — IN Illk2ll •• , vk — uk
a2vi
Nil, ■ ak v\
of _,vk-i is the result of the Gramm-Schmidt orthogonalization.
224
CHAPTER 4. ANALYTIC GEOMETRY
7,
6pi - \q\ - 9p2 - 3q2
-4pi + 6qi + 6q2 = 6,
9pi - 33p2 - q2 = 31,
3pi - 6qi - p2 - 9q2 = -11,
which we solve by forming matrix and performing elementary row operations.
7 \ / 1
/
V
9
3
0
-9 0
-33 -1
0 0 0
1 0 0 1 0
V 0 0 0 1
0
0 0
o \
-1 -1
2
31
-11 J
The solutions is (pi, q\, p2, q2) = (0, —1, — 1, 2). We have found
Xt-X2 = (-3, 2, -5, -7, -3)-u2+vx-2v2 = (-3, 4, -2, -4, 2).
The size of vector (—3,4,—2,—4, 2) and at the same time distance between planes a\, a2 is hence
7 = V(-3)2 + 42 + (-2)2 + (-4)2 + 22.
We determined distance between q\ and £2 differently than the distance between a\ and a2. We could have used both methods in both cases. Let's try the former method for the case of a\, a2. Let's find ortogonal complement of vector subspace generated by
(2,1,0,0,1), (-2,0,1,1,0), (2,2,4,0,3), (2,0,0,-2,-1)
We get / 2
V
1 0 0 1
2 4 0 0
0
1 0
1 \ 0
3
-1 /
/ 1 0 0 0
0 10 0
0 0 10
V 0 0 0 1
3/2 \ -2
1
2
The ortogonal complement is ((—3/2,2,-1,-2,1)), or rather ((3, —4, 2, 4, —2)). Note that distance between a\ and a2 equals the size of ortogonal projection of vector (difference of arbitrary point in o\ and arbitrary point in a2)
u = (3, -2, 5, 7, 3) = [3, -1, 7, 7, 3] - [0, 1, 2, 0, 0]
to this ortogonal complement. Denote the ortogonal projection of u as pu and choose 1; = (3, —4, 2, 4, —2). Obviously pu = a ■ v for some del and it holds
( u — pu, v ) = 0, tj. ( u, v ) — a { v, v ) = 0.
Computing gives 49 — a ■ 49 = 0. Therefore pu = 1 ■ v = v and the distance between planes a\ and a2 is equal
\\Pu\\ = 732 + (-4)2 + 22 + 42 + (-2)2 = 7.
Method of computing distance using ortogonal complement of sum of vector spaces has proven to be „faster way to the solution". With no doubt, it will be the same for planes q\ a q2. The second method however reveals points where the distance can be measure (pair
Thus we have
(yo\Vk(A;uu...,uk)Y =
VI ■ VI
vi ■ vk
v\ ■ v\ 0 0 0
. . Vk- VI
■ • vk- vk
vk ■ vk
Let us denote by B the matrix whose columns are formed jji:, by the coordinates of vectors v\,..., vk in the orthonor-
mal basis e. Since 1;
vk have arisen from u\
|t as images under a linear transformation with an upper-triangular matrix C with ones on the diagonal, we have B = CAand|B| = \C\\A\ = \A\. Butthen|A|2 = \B\2 = \A\\A\, and thus ~VolVk(A; u\,..., uk) — ±|A|. The resulting volume is zero if the vectors u\,... ,uk are dependent. Provided that they are independent, the sign of the determinant is positive if and only if the basis u\,... ,uk defines the same orientation as the basis e. □
We can formulate the following important geometric consequence:
4.23. Corollary. For each linear map • V on an euclidean
space V, det - [u\,..., un] is antisymmetric n-linear map. It means, it is linear in all arguments, and the interchange of any two arguments causes the change of sign of the result.
(2) The outer product is zero if and only if the vectors u\,... ,un are linearly dependent.
(3) Vectors u\,..., un form a positive basis if and only if their outer product is positive.
In technical applications in the space R3, we often use a closely related operation, so called cross product, which assigns a vector to any pair of vectors.
Let us consider an arbitrary euclidean vector space V of dimension n > 2 and vectors u\,..., u„-\ e V. If we substitute
tors u\,
225
CHAPTER 4. ANALYTIC GEOMETRY
of points in which the planes are the closest). Let's find such points in the case of planes q\, q2. Denote
ui = (1, 0, -1, 0, 0), u2 = (0, 1, 0, 0, -1), ui = (1,1, 1,0, 1), v2 = (0, -2, 0, 0, 3).
Points Xi € q\, X2 € Q2, which are „the closest" (as commented above), are
Xi = [7, 2, 7, -1,1] + hui + siu2, X2 = [2, 4, 7, -4, 2] + t2vi + s2v2,
so
X-i — Xo
[7,2,7, -1, 1] - [2,4,7, -4,2] +hui + SiU2 - t2vt - s2v2 (5, -2, 0, 3, -1) + hui + s\u2 - t2v\ - s2v2.
Dot products
\XX -X2,Ul) [Xl-X2,vl)
o,
0,
[Xi-X2,u2) Xl-X2,v2)
o,
0
then lead to linear equation system
2h = -5,
2s i + 5^2 = 1,
-4t2 - s2 = -2,
-5si - t2 - I3s2 = -1
with only solution t\ = —5/2, s\ = 41/2, t2 = 5/2, s2
obtained
"9 45 19
5 41
Xl = [7,2,7,-1, l]--«l +yM2
X2 = [2, 4, 7, -4, 2] + -vi - Sv2
2 2
-1,
-8. We
39
9 45 19 39 2' T' T' ~ ' ~T
Now we can easily see that the distance between points x\,x2 (and, at the same time, distance between planes q\, q2) je || x\ — x2 \ \ = ||(0,0,0,3,0)||=3. □
4.35. Find intersection of plane passing through point A = [l,2,3,4]eK4 and ortogonal to plane
q : [1, 0, 1, 0] + (1, 2, -1, -2)s + (1, 0, 0, l)f, s,(eR.
Solution. First, let's find plane ortogonal to q. Its direction will be ortogonal to direction of q, for vectors (a,b,c,d) within its direction we get linear equation system
(a,b, c,d) ■ (1,2,-1, -2) = 0 = a+2b-c-2d = 0 (a,b,c,d) ■ (1,0, 0, 1) =0 = a+d = 0.
these n — 1 vectors into the first n — 1 arguments of the n-linear map defined by the volume determinant as above, then we are given one argument left, i.e. a linear form on V. Since we have the scalar product at disposal, each linear form corresponds to exactly one vector. We call this vector u € y the cross product of vectors u\,..., w„_i, i.e. the following holds for each vector w e V
{v, w)
[Ml, ... , W„_l, HI J.
We denote the cross product byw = «i x...x«„_i.
If the coordinates of our vectors in an orthonormal basis are v = (yi,..., yn)T, w = (xi, ... ,xn)T and Uj = (u\j,. ..unj)T, then our definition can be expressed as
y\x\
' ynXn
«11
Ul(n-l) Xi
Mnl • • • Mn(n — 1)
We see from here that the vector i; is given uniquely and its coordinates are calculated by the formal expansion of this determinant along the last column. At the same time, the following properties of the cross product are direct consequences of the definition:
Theorem. For the cross product v — u\ x ... x w„_i we have
(1) v e ..., Un-i)1-
(2) v is nonzero if and only if the vectors u \ ,...,«„_ i are linearly independent,
(3) the length \\v\\ of the cross product is equal to the absolute value of the volume of parallelepiped V(0; u\, ..., w„_i),
(4) («i,..., w„_i, u) is an agreeing basis of the oriented eu-clidean space V.
Proof. The first claim follows directly from the defining formula for i; since substituting an arbitrary vec-WL JkY/ tor uj for w we get the scalar product v ■ uj on the left and the determinant with two equal columns on the right.
The rank of the matrix with n — 1 columns uj is given by the maximal size of a non-zero minor. The minors which define coordinates of the cross product are of degree n — 1 and thus the claim (2) is proved.
If the vectors u\,... ,u„-i are dependent, then also (3) holds. Therefore, let us consider that the vectors are independent, let i; be their cross product, and let us choose an orthonormal basis (ei,..., e„_i) of the space (ui,..., u„-\). It follows from what we have proved that there exists a multiple (l/a)v, 0 ^ a e R, such that (e i,..., ek, (1 /a) v) is an orthonormal basis of the whole space V. The coordinates of our vectors in this basis are
uj — (u\j,u(n-i)j, 0)T, v — (0, ..., 0, a)T.
So the outer product [u\,..., w„_i, i;] is equal (see the definition of cross product)
0
|«i, ..., w„_i, v\ —
«11
«i(«-i)
«(«-i)i ••• «(«-i)(«-i) 0 0 ... 0 a
— (v, v) — a . Expanding the determinant along the last column we get
a2 = a VolP(0; uu ..., ;„_i).
226
CHAPTER 4. ANALYTIC GEOMETRY
Solution is two-dimensional vector space ((0, 1,2, 0), (—1,0, —3, 1)). Plane r ortogonal to q passing through a has parametric equation
r : [1, 2, 3, 4] + (0, 1, 2, 0)w + (-1,0, -3, l)v, u, v e R.
We can obtain intersection of planes from both parametric equations. We get linear equation system
1 + s + t = l-v
2s = 2 + u
1 — s = 3 + 2u — 3v
-2s + t = 4 + v,
which has only solution (it must be so as matrix columns are linearly independent) s = -8/19, t = 34/19, u = -54/19, v = -26/19. Inputting parameter values s and t into parametric form of plane q, we obtain sought intersection [45/19, -16/19, 11/19, 18/19] (needless to say, we get the same solution by inputting the values into r). □
4.36. Find a line passing through point [1,2] e R2 so that angle between this line and line
p : [0, l] + f(l,l)
is 30°.
Solution. Angle between two lines is angle between their direction vectors. It is sufficient to find direction vector v of the line. One way to do so is to rotate direction vector of p by 30°. Rotation matrix for the angle 30° is
cos 30° - sin 30c sin 30° cos 30°
Sought vector v is therefore
We could perform the backward rotation as well. The line (one of two possible) has parametric equation
/V3 1 73 l\ [1'21 + (--2- + 2J'-
□
4.37. Determine cos a, where a is angle between two adjacent faces of regular octahedron (octahedron has eight equilateral triangles as faces).
Solution. Octahedron is symetric, therefore it does not matter which two faces we choose. Further, without loss of generality, asume octahedron of edge length 1 and place it into standard Cartesian coordinate
Both the remaining two claims from the proposition follow From here. □
4.25. Affine and euclidean properties. Now we can have a think about which properties are related to the affine structure of the space and for which properties we really need the scalar product in the difference space.
It is obvious that all euclidean transformations, i.e. bijective affine maps between euclidean spaces, which preserve the distance between points preserve also all objects we have studied. I.e. next to the distances they preserve also un-oriented angles, unoriented volumes, angle between sub-spaces etc. If we want them to preserve also oriented angles, cross products, volumes, then we must assume in addition that our transformations preserve the orientation too.
We may formulate our problem also as follows: Which concepts of euclidean geometry are preserved under affine transformations?
First let us remind that an affine transformation on a n-dimensional space A is uniquely defined by mapping n + 1 points in a general position, i.e. by mapping one n-dimensional simplex. In the plane, it means to choose the image of one (nondegenerate) triangle, which may be an arbitrary (nondegenerate) triangle. The preserved properties will be the properties related to subspaces in particular, i.e. the properties of the type "a line passing through a point" or "a plane contains a line" etc. At the same time, the col-inearity of vectors is preserved, and for every two colinear vectors, the ratio of their lengths is preserved (independently on the scalar product defining the length). Similarly, we have already seen that the ratio of volumes of two n-dimensional parallelepipeds is preserved under transformations (since the determinant of the corresponding matrix changes about the same multiple).
These affine properties can be used smartly in the plane to prove geometric claims. For instance, to prove the fact that the medians of a triangle intersect in a single point and in one third of their lengths, it is sufficient to verify this only in the case of an isosceles right-angled triangle or only in the case of an equilateral triangle, and then this property holds for all triangles. Think this argumentation over!
2. Geometry of quadratic forms
After straight lines, the simplest objects in the analytic geom-_ etry of plane are so called conic sections. They are given by quadratic equations in cartesian coordinates, and by coefficients we recognize //// ■ that the conic is a circle, ellipse, parabola or hyperbola, potentially it may be also a pair of lines or a point (the degenerate cases).
We will see that our tools enable us to classify effectively these objects in all finite dimensions and to work with them. It is also obvious that we cannot distinguish a circle from an ellipse in affine geometry, therefore we begin in the euclidean geometry.
4.26. Quadrics in £„. In analogy with equations of conic sections in plane, we start with objects in euclidean point spaces which are defined in a given orthonormal basis by quadratic equations, we talk about quadrics.
227
CHAPTER 4. ANALYTIC GEOMETRY
system R3 so that its centroid lies in [0, 0, 0]. Its vertices then are located in points A = [^,0,0], B = [0,^,0], C = [-^,0,0], D = [0, 0], £ = [0,0,-|]aF = [0, 0, ^].
We will compute angle between faces CDF and BCF. We have to find vectors ortogonal to their intersection and lying within respective faces, which means ortogonal to CF. They are altitudes from D and F to edge CF in triangles CDF and BCF respectively. Altitudes in equilateral triangle are the same segments as medians, so they are SD and SB, where S is midpoint of CF. Because we know coordinates of points C and F, the point S has coordinates [—0, ^-] and vectors are SD = (^, ,-% a SB = (^, f,Together
cos a
4 ' 2
/ V2 72 _V_2\ /V2 72 V2x ^ 4 ' 2' 4 ^ ' ^ 4 ' 2 ' 4 ^
Therefore a = 132°
□
4.38. In Euclidean space spaces U,V, where
determine angle • R. Similarly, we may think of a general symmetric bilinear form on an arbitrary vector space.
For an arbitrary basis on this vector space, the value fix) on vector x = x\e\ +----h xnen is given by the equation
f(x) = Fix, x) = ^^XiXjFiei, ej) = xT ■ A ■ x
'J
where A = (a^) is a symmetric matrix with elements atj = F(et, ef). We call such maps / quadratic forms, and the formula from above for the value of the form in terms of the chosen coordinates is called the analytic formula for the form.
In general, by a quadratic form we mean the restriction / (x) of a symmetric bilinear form Fix, y) to arguments of the type (x, x). Evidently, we can reconstruct the whole bilinear form F from the values fix) since
fix + y) = Fix + y, x + y) = fix) + fiy) + 2F(x, y).
If we change the basis to a different basis e[,..., e'n, we get different coordinates x = S ■ x* for the same vector (here S is the corresponding transformation matrix), and so
fix) = iS-x')T
A ■ iS-x') = ix'Y ■ iS ■ A ■ S) ■ x'.
Now let us assume again that our vector space is equipped with a scalar product. Then the previous computation can be formulated as follows. The matrix of bilinear form F, which is the same as the matrix of /, transforms under a change of coordinates in such a way that for orthogonal changes it coincides with the transformation of a matrix of a linear map (indeed, then we have S-1 = ST). We can interpret this result also as the following observation:
Proposition. Let V be a real vector space with a scalar product. Then formula
3
19 19 19 I (0,0,0,1,0) 11
-V4 = (0,0, 0, 1,0),
..(0,0,0,1,-1)11 J2 2 Hence cp = jt/4.
Case (e). Let's determine intersection of vector subspaces associated with given affine subspaces. Vector {x\, x2, x3, X4, x$) is in vector subspace of U, if and only if
(X\, X2, X3, X4, x5) =
t (2, 1, 3, 5, 3) + s (0, 3, 1, 4, -2) + r (1, 2, 4, 0, 3)
Proof. Indeed, each bilinear form with a fixed second argument becomes a linear form au( ) = F( , u), and in the presence of a scalar product, it must be given by formula a(u)(v) = v ■ w for a suitable vector w. We set = w. One show directly from the coordinate expression displayed above that cp is a linear map with matrix A. Hence it is selfadjoint.
On the other hand, each symmetric map cp defines a symmetric bilinear form F by formula F(u, v) = (cp(u), v) = (u, cp(v)), and thus also a quadratic form by restriction. □
We get immediately the following consequence of this proposition. For each quadratic form / there exists an orthonormal basis of the difference space in which / has a diagonal matrix (and the values on the diagonal are determined uniquely up to their order).
Due to the identification of quadratic forms with linear maps, we can also define correctly the rank of the quadratic form as the rank of its matrix in any basis (i.e. the rank is equal to the dimension of the image of the corresponding map cp).
4.28. Classification of quadrics. Let us come back to our equation (4.4). Our results on quadratic forms enable us to rewrite this equation as follows
Y^^iA + J2biXi + b = o.
Hence we may assume directly that the quadric is given in this form. In the next step, we do completing the squares for the coordinates x, with A, ^ 0, which "absorbs" the squares together with the linear terms in the same variable (so called Lagrange algorithm, will be discussed in detail later). So we are left only with linear terms corresponding to variables for which the coefficient at the quadratic term was zero, and we get
r = l
pi)2+ E
j satisfying Xj
bjXj + c
= 0
0.
This corresponds to a translation of the origin about the vector with coordinates pt and to such a choice of basis of the difference space that we get the desired diagonal form in the quadratic part. In the identification of quadratic forms with linear maps derived above, it means that cp is diagonal on the orthogonal complement of its kernel. If we are left with some linear terms, we may adjust the orthonormal basis of the difference space for the kernel of cp such that the corresponding linear form is a multiple of the first term of the dual basis. Hence we can already reach the final formula
where k is the rank of matrix of quadratic form /. If b / 0, we can make the constant c in the equation to be zero by a next change of the origin.
Hence we see that the linear term may (but does not have to) appear only in the case that the rank of / is less than n, c e R may be nonzero only if b = 0. The resulting equations are called the canonical analytic formulas for quadrics.
229
CHAPTER 4. ANALYTIC GEOMETRY
for some t,s,r el, and, at the same time, (x\, x2, x3, x4, x5) e V if and only if
(jci, x2, x3, x4, x5) = p (-1, 1, 1, -5, 0) + q (1, 5, 1, 13, -4) for some p, q el. Let's find such t,s,r, p,q el,so that
t (2, 1, 3, 5, 3) + s (0, 3, 1, 4, -2) + r (1, 2, 4, 0, 3) = p(-l, 1, 1, -5,0) +q (1,5, 1, 13, -4).
It is a homogeneous linear equation system. We will solve it in matrix form (order of variables is t, s, r, p, q)
í2 0 1 1 -1 / 1 3 2 -1 -5 \
1 3 2 -1 -5 0 2 1 -1 -3
3 1 4 -1 -1 ~ ... ~ 0 0 1 -1 1
5 4 0 5 -13 0 0 0 0 0
\3 -2 3 0 4 ) \0 0 0 0
4.29. The case of £2. As an example of the previous procedure, let us go through the whole discussion in the simplest ; / case of a nontrivial dimension, i.e. dimension two. ^~z_ The original equation has the form
an
x2 + a22y2 + 2ai2xy + a\x + 02y ■
0.
By a suitable choice of a basis of difference space and the subsequent completing the squares we reach the form (we use the same notation x, y for the new coordinates):
„2
a\\x
■ a22
y2 + a\x + a2y + <
0
where a, may be nonzero only in the case that an is zero. By the last step of the general procedure, i.e. in dimension n = 2 only by a choice of a translation, we reach exactly one of the following equations:
It has showed that vectors defining V are linear combination of U's vectors. That means V is subset of U, and hence M a quadratic form. Then there exist a polar basis for f on V.
Proof. (1) Let A be the matrix of / in basis u — (u\, ...,u„) on V, and let us assume an / 0. Then we may write
f(x\, ..., x„) — a\\x\ + 2ai2XiX2 H-----h A22-4 + • • •
— flfj^flnXi + Ö12X2 H-----h fll«X„)2
+ terms not containing x\.
Hence we transform the coordinates (i.e. we change the basis) such that in new coordinates we have
x\ — a\\x\ + fli2X2 H-----h a\nxn, x'2 — X2, ..., x'n — x„.
It corresponds to the new basis (as an exercise, compute the transformation matrix)
V\ = üyyU\, 1)2 —- U2 — Ü\2U\, . . . , V„ — U„ — üyy a\nu\
and so, as we may expect, in the new basis the corresponding symmetric bilinear form satisfies g(v\,v{) — 0 for all i > 0 (compute!). Thus / has the form a^x'l2 + h in the new coordinates, where h is a quadratic form independent on the variable x\.
Due to technical reasons, it is mostly better to choose v\ — u\ in the new basis. Then we have the expression f — f\ + h, where f\ depend only on x'1, while x'1 does not appear in h, but g(vu vi) - an.
(2) Let us assume that after doing the step (1), we get for h a matrix (of rank less about one) with a nonzero coefficient at x7,2. Then we may repeat exactly the same procedure and we get the expression f — f\ + fi + h, where h contains only the variables with index greater than two. We may proceed in this way as long until we get a diagonal form after n — 1 steps or in a step, say ;-th step, the element an is zero.
(3) If the last possibility happens, but in the same time there exists some other element ajj / 0 with j > i, then it suffices to switch the ;-th and the j-th vector of the basis and to continue according the the previous procedure.
(4) Let us assume now that we come to the situation ajj — 0 for all j > i. If there is no element aß ^ 0 with j > i,k > i, then we are done since we have got a diagonal matrix. If aß / 0, then we use transformation vj — uj + uk + we keep the other vector of basis constant (i.e. x?k — xk — xj, the other remain constant). Then h(vj, Vj) = h(uj, Uj) + h(uk, WjO + 2h(uk, Uj) — 2aj\ ^0 and we can continue according to (1). □
231
CHAPTER 4. ANALYTIC GEOMETRY
'I zlL
o \
,0 0 1
1 3
-3 ,
polar basis is therefore 0, 0), §, 0), (1, -3, 1)). □
3. /(xi, x2, X3)
4.40. Determine polar basis of form / : M3
2*1X3 -(- X2.
Solution. Matrix of the form is
/0 0 1N A = J 0 1 0
V 0 °v
We can switch the order of variables: yi = x2, y2 = xi, y3 = x3. It is then trivial to apply step (1) of Lagrange algorithm (there are no common terms), however for the next step, case (4) sets in. We introduce transformation zi = yi,z2 = y2, zj, = yj, — y2. Pak
f(xl,x2,x3) = zj + 2z2(z3 + z2) = zj + ^(2z2 + z3f - X-z\.
Together we get z,\=y\= x2, z2 = y2= xu z3 = y3 - y2 = x3 - xx. Matrix T for change to polar basis is
/ 0 1 0\ /0 1 0>
T = 1 0 0 and T~l = 1 0 0
V-i o i) \o 1 1,
polar basis is therefore ((0, 1, 0), (1, 0, 1) (0, 1, 1)). □
4.41. Find polar basis of quadratic form / standard basis defined as
I, which is in
f(xi,x2, x3) = xix2 + dissolution. By application of Lagrange algorithm we get:
f(X\, X2, X3) = 2xiX2 + X2X3
we perform substitution according to step (4) of the algorithm y2 — x2—x\, = 2xi (xi + yi) + (*i + y2)*3 = 2x\ + 2x\y2 + X1X3 + y2x3 =
1 1 2 1 2 1 2
= 2^2xi +y2 + 2xi) ~ y2 ~ 8X3 + y2X3 =
substitution yi — 2x\ + y2 + 5X3
1 2 ' 2 ^ 2 , 1 2 1 \2 , ^ 2
= ~ 2^ ~ 8X3 ^ = 23'1 ~ 2yi ~ 2Xs) 8X3 = substitution V3 — \y2 — \x3
4.31. Affine classification of quadratic forms. We can improve the Lagrange algorithm for computing polar basis by
____multiplying the vectors from basis by a scalar such
that the coefficients at squares of variables in the corresponding analytic formula for our form will be only scalars 1,-1 and 0. Moreover, the following law of inertia says that the number of one's and minus one's does not depend on our choices in the course of the algorithm. These numbers are called the signature of a quadratic form. As before, we get a complete description of quadratic forms in the sense that two such forms may be transformed each one into the other by an affine transformation if and only if they have the same signature.
Theorem. For each nonzero quadratic form of rank r on a real vector space V there exists a natural number 0 < p < r and r independent linear forms 0 while for*; e Q we have f(v) < 0. Hence necessarily P n Q — {0} holds, and therefore dim P + dim Q < n. From here we conclude p + (n — q) < n, i.e. p < q. However, we get also q < p by the opposite choice of subspaces.
Thus p is independent on the choice of the polar basis. But then for two matrices with the same rank and the same number of positive coefficients in the diagonal form of the corresponding quadratic form, we get the same analytic formulas. □
While we discussed symmetric maps we talked about definite and semidefinite maps. The same discussion has an obvious meaning also for symmetric bilinear forms and quadratic forms. A quadratic form / on a real vector space V is called (1) positive definite if f(u) > 0 for all vectors u / 0,
232
CHAPTER 4. ANALYTIC GEOMETRY
We get change of basis matrix by either expressing the old variables (xi, x2, x3) by new variables (yi, y3, x3), or equivalently expressing the new ones by the old ones (which is easier), we however need to compute inverse matrix in the latter case.
We have y\ = 2x\ + y2 + \x3 = 2x\ + (x2 — x{) + \x3 and
\x3. Matrix changing basis from
1 V 9X3
1 X\ + \x3
^3 — 2->-z 2^ ~~ 2-"1 1 2
polar basis to standard basis is
Inverse matrix is
'1 _2 _
45 : 3 3 : v0 0
One of polar bases of the given quadratic forms is hence for example basis (see the columns of matrix
{(1/3, 1/3, 0), (-2/3, 4/3, 0), (-1/2, 1/2, 1)}. □
4.42. Determine the type of conic section defined by
3xf
3x\x2 + x2 — 1 = 0.
Solution. We complete the squares:
1
3.Xi — 3x\X2 ~h X2 — 1
:(3*i
:x2y
1
-x\ + x2 1
1
yl 3 4 T 3 1
-,y\
2
2-71 rA 3"
According to list 4.29, the given conic section is hyperbola.
□
4.43. By completing the squares express quadric
-x2 + 3y2 + z2 + 6xy - 4z, = 0
in such way that one can determine its type from it.
Solution. We move all terms containing x to — x2 and complete the
square. We get equation
-(x - 3y)2 + 9/ + 3y2 + z2 - 4z = 0.
There are no „unwanted" terms containing y , so we repeat the procedure for z, which gives us
-(x - 3y)2 + 12/ + (z - 2)2 - 4 = 0.
Now we can conclude that there is a transformation of variables that leads to equation (we can divide by 4 first)
+ y2+z2
1 =0.
(2) positive semidefinite if f(u) > 0 for all vectors u e V,
(3) negative definite if f(u) < 0 for all vectors u ^ 0,
(4) negative semidefinite if f(u) < 0 for all vectors u e V,
(5) indefinite if f(u) > 0 and f(v) < 0 for two vectors u, v e V.
We use the same names also for symmetric matrices corresponding to quadratic forms. By a signature of a symmetric matrix we mean the signature of the corresponding quadratic form.
4.32. Theorem (Sylvester criterion). A symmetric real matrix A is positive definite if and only if all its leading principal minors are positive.
A symmetric real matrix A is negative definite if and only if (— 1)' | A, | > Ofor all leading principal submatrices Ai.
Proof. We must analyse in detail the form of the transformations used in the Lagrange algorithm for constructing the polar basis. The transformation used in the first step of this algorithm always has an upper triangular matrix T, and if we use the technical modification mentioned in the proof of proposition 4.30, the matrix moreover has ones on the diagonal:
T = ( 1  t₁₂  ⋯  t₁ₙ )
    ( 0  1   ⋯  t₂ₙ )
    ( ⋮        ⋱  ⋮ )
    ( 0  0   ⋯  1  )
Such a matrix of the transformation from the basis u to the basis v has several nice properties. In particular, its leading principal submatrices T_k formed by the first k rows and columns are the transformation matrices of the subspace P_k = ⟨u₁, ..., u_k⟩ from the basis (u₁, ..., u_k) to the basis (v₁, ..., v_k). The leading principal submatrices A_k of the matrix A of the form f are the matrices of the restrictions of f to P_k. Therefore, the matrices A_k and A′_k of the restrictions to P_k in the bases u and v, respectively, satisfy A′_k = T_k^T A_k T_k. The inverse matrix to an upper triangular matrix with ones on the diagonal is an upper triangular matrix with ones on the diagonal again, hence we may similarly express A_k in terms of A′_k. Thus the determinants of the matrices A_k and A′_k are equal by the Cauchy formula. So we proved a useful statement:
Let f be a quadratic form on V, dim V = n, and let u be a basis of V such that we never need the steps (3) and (4) of the Lagrange algorithm while finding the polar basis. Then as the result we get the analytic formula
f(x₁, ..., xₙ) = λ₁x₁² + λ₂x₂² + ⋯ + λᵣxᵣ²,
where r is the rank of the form f, λ₁, ..., λᵣ ≠ 0, and the leading principal submatrices of the (former) matrix A of the quadratic form f satisfy |A_k| = λ₁λ₂⋯λ_k, k ≤ r.
In our procedure, each successive transformation creates zeros under the diagonal in the next column. From here it is obvious that if the leading principal minors are nonzero, then the next diagonal term in A is nonzero as well. By this consideration we proved the so-called Jacobi theorem:
Corollary. Let f be a quadratic form of rank r on a vector space V with matrix A in a basis u. No steps of the Lagrange algorithm other than completing squares are needed if and only if the leading principal submatrices of A satisfy |A₁| ≠ 0, ..., |Aᵣ| ≠ 0. Then
We can tell the type of the conic section without transforming its equation to the form listed in 4.29. As we know, we can express every conic section as
a₁₁x² + 2a₁₂xy + a₂₂y² + 2a₁₃x + 2a₂₃y + a₃₃ = 0.
there exists a polar basis (which we get by the above algorithm) in which f has the analytic formula
The determinants
Δ = det( a₁₁ a₁₂ a₁₃ ; a₁₂ a₂₂ a₂₃ ; a₁₃ a₂₃ a₃₃ )  and  δ = det( a₁₁ a₁₂ ; a₁₂ a₂₂ )
are so-called invariants of the conic section, which means that they are not changed by Euclidean transformations (rotations and translations). Furthermore, different types of conic sections have different signs of those determinants.
• Δ ≠ 0: non-degenerate conic sections;
an ellipse for δ > 0, a hyperbola for δ < 0 and a parabola for δ = 0. Furthermore, for a real (not imaginary) ellipse, (a₁₁ + a₂₂)Δ < 0 must hold.
• Δ = 0: degenerate conic sections, i.e. lines.
We can easily check that the signs (or the vanishing) of the determinants are really invariant under coordinate transformations. Denote X = (x, y, 1)^T and let A be the matrix of the quadratic form, so that the corresponding conic section has the equation X^T A X = 0. We get the standard form by a rotation and a translation, i.e. by a transformation to new coordinates x′, y′ satisfying
x = x′ cos α − y′ sin α + c₁,
y = x′ sin α + y′ cos α + c₂,
or, in matrix form, X = M X′ for the new coordinates X′ = (x′, y′, 1)^T, where
M = ( cos α  −sin α  c₁ ; sin α  cos α  c₂ ; 0  0  1 ).
Substituting X = MX′ into the conic section equation we get the equation in the new coordinates
X^T A X = 0  ⟺  (MX′)^T A (MX′) = 0  ⟺  X′^T M^T A M X′ = 0.
Denote by A′ = M^T A M the matrix of the quadratic form in the new coordinates. The matrix M has unit determinant, so
det A′ = det M^T det A det M = det A = Δ.
f(x₁, ..., xₙ) = |A₁|x₁² + (|A₂|/|A₁|)x₂² + ⋯ + (|Aᵣ|/|Aᵣ₋₁|)xᵣ².
Hence if all leading principal minors are positive, then f is positive definite by the Jacobi theorem.
On the other hand, let us suppose that the form f is positive definite. Then for a suitable regular matrix P we have A = P^T E P = P^T P, and so |A| = |P|² > 0. Let u be a chosen basis in which the form f has the matrix A. The restrictions of f to the subspaces V_k = ⟨u₁, ..., u_k⟩ are positive definite forms f_k again, and the corresponding matrices in the bases u₁, ..., u_k are the leading principal submatrices A_k. Thus |A_k| > 0 according to the previous part of the proof.
The claim about negative definite forms follows by observing the fact that A is positive definite if and only if —A is negative definite. □
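The criterion translates directly into a numerical test. A minimal sketch in Python (numpy assumed; determinants of the leading blocks are computed directly, so this is meant for small, well-conditioned matrices only):

    import numpy as np

    def is_positive_definite(A, tol=1e-12):
        """Sylvester criterion: all leading principal minors positive."""
        A = np.asarray(A, dtype=float)
        return all(np.linalg.det(A[:k, :k]) > tol
                   for k in range(1, A.shape[0] + 1))

    def is_negative_definite(A, tol=1e-12):
        # A is negative definite iff -A is positive definite,
        # i.e. iff (-1)^i |A_i| > 0 for all leading principal minors.
        return is_positive_definite(-np.asarray(A, dtype=float), tol)

    A = np.array([[2.0, -1.0], [-1.0, 3.0]])
    print(is_positive_definite(A))    # True: the minors are 2 and 5
    print(is_negative_definite(-A))   # True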
3. Projective geometry
In many elementary texts on analytic geometry, the authors finish with the affine and euclidean objects described above. The affine and euclidean geometries are sufficient for many practical problems, but not for all of them.
For instance, in processing an image from a camera, angles are not preserved and parallel lines may (but do not have to) intersect. Another reason for finding a more general framework for geometric problems and considerations is the wish to deal only with simple numerical operations like matrix multiplication. Moreover, it is difficult to distinguish very small angles from zero angles, and thus it is preferable to have tools which do not need such a distinction.
The basic idea of projective geometry is to extend affine spaces by points at infinity in a way that allows easy work with linear objects like points, lines, planes, projections, etc.
4.33. Projective extension of the affine plane. We begin with the simplest interesting case, the geometry in a plane. If we imagine the points of the plane A₂ as the plane z = 1 in ℝ³, then each point P in our affine plane is represented by a vector u = (x, y, 1) ∈ ℝ³, and so it is represented also by a one-dimensional subspace ⟨u⟩ ⊂ ℝ³. On the other hand, almost every one-dimensional subspace in ℝ³ intersects our plane in exactly one point P, and the vectors of such a subspace are given by coordinates (x, y, z) uniquely up to a common scalar multiple. Only the subspaces corresponding to vectors (x, y, 0) have no intersection with our plane.
___| Projective plane |___
Definition. The projective plane 𝒫₂ is the set of all one-dimensional subspaces in ℝ³. Homogeneous coordinates of a point P = (x : y : z) in the projective plane are triples of real numbers given up to a common scalar multiple, at least one of which must be nonzero. A straight line in the projective plane is defined as the set of one-dimensional subspaces (i.e. points in 𝒫₂) which generate a two-dimensional subspace (i.e. a plane) in ℝ³.
Necessarily, the determinant A₃₃, the algebraic complement of a₃₃, is also invariant under these coordinate transformations, although in general only det A′ = det M^T det A det M holds for it. For a pure rotation the matrix is
M = ( cos α  −sin α  0 ; sin α  cos α  0 ; 0  0  1 )
and det A′₃₃ = det A₃₃ = δ, while for a pure translation
M = ( 1  0  c₁ ; 0  1  c₂ ; 0  0  1 )
and this subdeterminant remains unchanged as well.
4.44. Determine the type of the conic section 2x² − 2xy + 3y² − x + y − 1 = 0.
Solution. The determinant
Δ = det( 2 −1 −1/2 ; −1 3 1/2 ; −1/2 1/2 −1 ) = −23/4 ≠ 0,
hence it is a non-degenerate conic section. Moreover δ = 5 > 0, therefore it is an ellipse. Furthermore (a₁₁ + a₂₂)Δ = (2 + 3)·(−23/4) < 0, so it is a real ellipse.
□
4.45. Determine the type of the conic section x² − 4xy − 5y² + 2x + 4y + 3 = 0.
Solution. The determinant
Δ = det( 1 −2 1 ; −2 −5 2 ; 1 2 3 ) = −34 ≠ 0,
and furthermore δ = det( 1 −2 ; −2 −5 ) = −9 < 0; it is therefore a hyperbola. □
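The two exercises above follow a fixed recipe, so the invariant test is easy to automate. A small sketch in Python (numpy assumed), taking the six coefficients of the general equation:

    import numpy as np

    def classify_conic(a11, a12, a22, a13, a23, a33):
        """Type of a11 x^2 + 2 a12 xy + a22 y^2 + 2 a13 x + 2 a23 y + a33 = 0."""
        A = np.array([[a11, a12, a13],
                      [a12, a22, a23],
                      [a13, a23, a33]])
        Delta = np.linalg.det(A)
        delta = np.linalg.det(A[:2, :2])
        if abs(Delta) < 1e-12:
            return "degenerate conic section"
        if delta > 0:
            # a real (non-imaginary) ellipse requires (a11 + a22) * Delta < 0
            return "real ellipse" if (a11 + a22) * Delta < 0 else "imaginary ellipse"
        return "hyperbola" if delta < 0 else "parabola"

    print(classify_conic(2, -1, 3, -0.5, 0.5, -1))  # 4.44: real ellipse
    print(classify_conic(1, -2, -5, 1, 2, 3))       # 4.45: hyperbola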
4.46. Determine the equation and type of the conic section passing through the points
[−2, −4], [8, −4], [0, −2], [0, −6], [6, −2].
Solution. We input the coordinates of the points into the general conic section equation
a₁₁x² + a₂₂y² + 2a₁₂xy + a₁x + a₂y + a = 0.
We get the linear equation system
4a₁₁ + 16a₂₂ + 16a₁₂ − 2a₁ − 4a₂ + a = 0,
64a₁₁ + 16a₂₂ − 64a₁₂ + 8a₁ − 4a₂ + a = 0,
4a₂₂ − 2a₂ + a = 0,
36a₂₂ − 6a₂ + a = 0,
36a₁₁ + 4a₂₂ − 24a₁₂ + 6a₁ − 2a₂ + a = 0.
In matrix form we perform row operations:
( 4  16  16  −2  −4  1 )
( 64 16 −64   8  −4  1 )
( 0   4   0   0  −2  1 )
( 0  36   0   0  −6  1 )
( 36  4 −24   6  −2  1 )
In order to have a concrete example, let us look at two parallel lines in the affine plane ℝ²:
L₁ : y − x − 1 = 0,  L₂ : y − x + 1 = 0.
If we view the points of the lines L₁ and L₂ as finite points in the projective space 𝒫₂, their homogeneous coordinates (x : y : z) obviously satisfy the equations
L₁ : y − x − z = 0,  L₂ : y − x + z = 0.
It is easy to see that the intersection L₁ ∩ L₂ is the point (−1 : 1 : 0) ∈ 𝒫₂ in this context, i.e. the point at infinity corresponding to the common direction vector of the lines.
4.34. Affine coordinates in the projective plane. Conversely, if we begin with the projective plane and want to see the affine plane as its "finite" part, then instead of the plane z = 1 we may take another plane α in ℝ³ which does not pass through the origin 0 ∈ ℝ³. Then the finite points are those one-dimensional subspaces which have a nonzero intersection with the plane α.
Let us proceed further in our example of two parallel lines from the previous paragraph, and let us see what their equations look like in the coordinates of the affine plane given by y = 1. To get them, it suffices to substitute y = 1 into the previous equations:
L′₁ : 1 − x − z = 0,  L′₂ : 1 − x + z = 0.
Now the "infinite" points of our former affine plane are given by z = 0, and we see that our lines L′₁ and L′₂ intersect in the point (1, 1, 0). This corresponds to the geometric vision that two parallel lines L₁, L₂ in the affine plane intersect at infinity, precisely in the point (1 : 1 : 0).
4.35. Projective spaces and transformations. One can generalize our procedure from the affine plane in a natural way to each finite dimension.
Choosing an arbitrary affine hyperplane Aₙ in the vector space ℝⁿ⁺¹ which does not pass through the origin, we may identify the points P ∈ Aₙ with the one-dimensional subspaces generated by these points. The remaining one-dimensional subspaces fill a hyperplane parallel to Aₙ, and we call them the infinite points of the projective extension 𝒫ₙ of the affine space Aₙ.
Obviously the set of infinite points in V„ is always a projective space of dimension one less. An affine straight line has only one infinite point in its projective extension (both ends of the line "intersect" in infinity and thus the projective line looks like a circle), the projective plane has a projective line of infinite points, the three-dimensional projective space has a projective plane of infinite points etc.
More generally, we define the projectivization of a vector space: for an arbitrary vector space V of dimension n + 1 we define
𝒫(V) = {P ⊂ V ; P a vector subspace, dim P = 1}.
Choosing a basis u in V we get so-called homogeneous coordinates on 𝒫(V): for a point P ∈ 𝒫(V) we take an arbitrary nonzero vector u ∈ P and the coordinates of this vector in the basis u. The points of the projective space 𝒫(V) are called geometric points, while their generators in V are called arithmetic representatives.
In the chosen projective coordinates, we can fix one of them to be one (i.e. we exclude all points of the projective extension which have this coordinate equal to zero), and so we get an embedding
( 4 16 16 −2  −4   1 )
( 0  4  0  0  −2   1 )
( 0  0 64 −8  12  −9 )
( 0  0  0 24 −36  27 )
( 0  0  0  0   3  −2 )
and further
( 48  0  0  0  0 −1 )
( 0  12  0  0  0 −1 )
( 0   0 64  0  0  0 )
( 0   0  0 24  0  3 )
( 0   0  0  0  3 −2 )
We can choose the value of a. If we choose a = 48, we get
a₁₁ = 1, a₂₂ = 4, a₁₂ = 0, a₁ = −6, a₂ = 32.
The conic section has the equation
x² + 4y² − 6x + 32y + 48 = 0.
We complete x² − 6x and 4y² + 32y to squares, which gives us
(x − 3)² + 4(y + 4)² − 25 = 0,
or rather
(x − 3)²/5² + (y + 4)²/(5/2)² = 1.
We can see it is an ellipse with center at [3, −4].
□
4.47. Other characteristics of conic sections. Let us take a further look at some terms related to conic sections. An axis of a conic section is a line of reflection symmetry of the conic section. From the canonical forms of conic sections in a polar basis (4.29) it can be derived that an ellipse has two axes (x = 0 and y = 0), a parabola has one axis (x = 0) and a hyperbola has two axes (x = 0 and y = 0). An intersection of an axis with the conic section itself is called a vertex of the conic section. The numbers a, b from the canonical form of a conic section (which express the distances between the vertices and the origin) are called the semi-axes lengths. In the case of the ellipse and the hyperbola, the axes intersect in the origin. This point is a point of central symmetry of the conic section and is called the center of the conic section. Besides vertices and centers there are other interesting points lying on the axes of a conic section. For an ellipse we have the foci E, F, characterized by the property |EX| + |FX| = 2a for an arbitrary X lying on the ellipse. The following example shows that such points E and F really exist.
4.48. Existence of foci. For an ellipse with semi-axes lengths a > b, the points E = [−e, 0] and F = [e, 0], where e = √(a² − b²), are its foci (in the polar basis).
Solution. Consider points X = [x, y] which satisfy the property |EX| + |FX| = 2a; we show that these are exactly the points of the ellipse.
of the n-dimensional affine space Aₙ ⊂ 𝒫(V). It is precisely the construction which we used in our example of the projective plane.
4.36. Perspective projection. The advantages of projective geometry show up nicely in the case of the perspective projection ℝ³ → ℝ². Let us imagine that an observer sitting in the origin observes "one half of the world", i.e. the points (X, Y, Z) ∈ ℝ³ with Z > 0, and sees the image "projected" on the screen given by the plane Z = f > 0.
Thus a point (X, Y, Z) in the "real world" projects to a point (x, y) on the screen as follows:
x = f X/Z,  y = f Y/Z.
This formula is not only nonlinear, but the accuracy of calculations will also be problematic when Z is small.
Extending this transformation to a map 𝒫₃ → 𝒫₂ we get (X : Y : Z : W) ↦ (x : y : z) = (fX : fY : Z), i.e. a map described by the simple linear formula
(x, y, z)^T = ( f 0 0 0 ; 0 f 0 0 ; 0 0 1 0 ) (X, Y, Z, W)^T.
This simple expression defines the perspective projection for finite points in ℝ³ ⊂ 𝒫₃, which we substitute as points with W = 1. In this way we have eliminated problems with points whose image runs to infinity. Indeed, if the Z-coordinate of a real point is close to zero, then the value of the third homogeneous coordinate of the image is close to zero, i.e. it corresponds to a point close to infinity.
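The computation is easy to experiment with. A minimal sketch in Python (numpy assumed; the screen distance f and the sample points are arbitrary choices):

    import numpy as np

    f = 1.0  # distance of the screen plane Z = f

    # Matrix of the perspective projection in homogeneous coordinates.
    P = np.array([[f, 0.0, 0.0, 0.0],
                  [0.0, f, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])

    def project(X, Y, Z):
        """Image of the finite point (X : Y : Z : 1) on the screen."""
        x, y, z = P @ np.array([X, Y, Z, 1.0])
        # (x : y : z) is a finite point of the image plane iff z != 0.
        return (x / z, y / z) if abs(z) > 1e-12 else None

    print(project(2.0, 1.0, 4.0))  # (0.5, 0.25), i.e. (f X/Z, f Y/Z)
    print(project(2.0, 1.0, 0.0))  # None: the image lies at infinity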
4.37. Affine and projective transformations. Obviously, each injective linear map φ : V₁ → V₂ between vector spaces maps one-dimensional subspaces to one-dimensional subspaces. Therefore, we get a map on the projectivizations T : 𝒫(V₁) → 𝒫(V₂). Such maps are called projective maps; in the literature one also uses the notion collineation if this map is invertible.
Otherwise put, the projective map is a map between projective spaces such that in each system of homogeneous coordinates on domain and image it is given by multiplication by a matrix. More generally if our auxiliary linear map is not injective, then we define the projective map only outside of its kernel, i.e. on points whose homogeneous coordinates do not map to zero.
Since injective maps V —>• V of a vector space to itself are invertible, all projective maps of projective space V„ to itself are invertible too. They are also called regular collineations or projective transformations. In homogeneous coordinates, they correspond to invertible matrices of dimension n+1. Two such matrices define the same projective transformation if and only if they differ by a constant multiple.
If we choose the first coordinate as the one whose vanishing defines the infinite points, then the transformations preserving infinite points are given by matrices whose first row vanishes up to its first element. If we want to switch to affine coordinates of the finite points, i.e. we fix the first coordinate to be one, the first element in the first row must also be equal to one. Hence the matrices of collineations
Coordinate-wise, this equation reads
√((x + e)² + y²) + √((x − e)² + y²) = 2a.
By squaring and performing some operations we get
(a² − e²)x² + a²y² = a²(a² − e²).
Substituting e² = a² − b² and dividing by a²b² we get
x²/a² + y²/b² = 1.
□
Remark. The number e from the previous example is called the eccentricity of the ellipse. Similarly, the foci of a hyperbola are points E, F which satisfy ||EX| − |FX|| = 2a for an arbitrary X on the hyperbola. You can check that there are two points satisfying this condition, [−e, 0] and [e, 0] (in the polar basis), where e = √(a² + b²). The focus of a parabola is the point F = [0, p/2], characterized by the fact that the distance between this point and an arbitrary X on the parabola equals the distance between X and the line y = −p/2.
4.49. Find the foci of the ellipse x² + 2y² = 2.
Solution. We can see from the equation that the semi-axes lengths are a = √2 and b = 1. We easily compute (see 4.48): e = √(a² − b²) = 1; the foci then are [−1, 0] and [1, 0]. □
preserving finite points of our affine space have the form:
( 1   0   ⋯  0   )
( b₁  a₁₁ ⋯  a₁ₙ )
( ⋮           ⋮  )
( bₙ  aₙ₁ ⋯  aₙₙ )
where b = (b₁, ..., bₙ)^T ∈ ℝⁿ and A = (aᵢⱼ) is an invertible matrix of dimension n. The action of such a matrix on the vector (1, x₁, ..., xₙ) is exactly a general affine transformation, where b is the translation and A its linear part. Thus the affine maps are exactly those collineations which preserve the hyperplane of infinite points.
4.38. Determining collineations. In order to define an affine map, it is necessary and sufficient to define an image of the affine frame. In the above description of affine transformations as special cases of projective maps it corresponds to a suitable choice of an image of a suitable arithmetic basis of the vector space V.
But it does not hold in general that the image of an arithmetic basis of V determines the collineation uniquely. We show the core of the problem on a simple example of the affine plane. If we choose four points A, B, C, D in the plane such that each three of them are in general position (i.e. no three of them lie on a line), then we may choose their images in the collineation as follows:
Let us choose arbitrarily their four images A′, B′, C′, D′ with the same property, and let us choose their homogeneous coordinates u, v, w, z, u′, v′, w′, z′ in ℝ³. Obviously the vectors z and z′ can be written as linear combinations
z = c₁u + c₂v + c₃w,
z′ = c′₁u′ + c′₂v′ + c′₃w′,
where all six coefficients must be nonzero, since otherwise there would exist three of our points not in general position.
Now we choose new arithmetic representatives ũ = c₁u, ṽ = c₂v and w̃ = c₃w of the points A, B and C respectively, and similarly ũ′ = c′₁u′, ṽ′ = c′₂v′ and w̃′ = c′₃w′ for the points A′, B′ and C′. This choice defines a unique linear map φ mapping ũ, ṽ, w̃ to ũ′, ṽ′, w̃′ (and hence z to z′).

Given a collineation f : 𝒫(V) → 𝒫(V), we call a point S ∈ 𝒫(V) the center of the collineation f if all hyperplanes in the bunch determined by S are fixed hyperplanes. A hyperplane α is called the axis of the collineation f if all its points are fixed points.
It follows directly from the definition that the axis of a collineation is the center of the dual collineation, while the bunch of hyperplanes defining the center of collineation is the axis of the dual collineation.
Since the matrices of a collineation on the former and the dual space differ only by the transposition, their eigenvalues coincide (eigenvectors are column respectively row vectors corresponding
from 4.29, which corresponds to intersecting a cone in ℝ³ with different planes. Non-degenerate sections are depicted. Degenerate sections are those which pass through the vertex of the cone.
We define the following useful terms for a conic section in the projective plane:
Points P, Q ∈ 𝒫₂ corresponding to one-dimensional subspaces ⟨p⟩, ⟨q⟩ (generated by vectors p, q ∈ ℝ³) are called polar conjugate with respect to the conic section f if F(p, q) = 0, or rather p^T A q = 0, holds.
A point P = ⟨p⟩ is called a singular point of the conic section f when it is polar conjugate with respect to f with all points of the plane, so F(p, x) = 0 for all x ∈ 𝒫₂; in other words, Ap = 0. Hence the matrix A of the conic section does not have maximal rank and therefore defines a degenerate conic section. Non-degenerate conic sections do not contain singular points.
We call the set of all points X = ⟨x⟩ polar conjugate with P = ⟨p⟩ the polar of the point P with respect to the conic section f. It is therefore the set of points for which F(p, x) = p^T A x = 0. Because the polar is given by a linear equation in the coordinates, it is always (in the non-singular case) a line. The following part explains the geometric interpretation of the polar.
4.57. Polar characterization. Consider a non-degenerate conic section f. The polar of a point P ∈ f with respect to f is the tangent to f with the touch point P. The polar of a point P ∉ f is the line defined by the touch points of the tangents to f passing through P.
Solution. We first consider P ∈ f and show by contradiction that the polar has exactly one common point with the conic section (the touch point). Suppose that the polar of P, defined by F(p, x) = 0, intersects f in Q = ⟨q⟩ ≠ P. Then obviously F(p, q) = 0, f(p) = F(p, p) = 0 and f(q) = F(q, q) = 0. For an arbitrary point X = ⟨x⟩ lying on the line through P and Q we then have x = αp + βq for some α, β ∈ ℝ. Because of the bilinearity and symmetry of F we get
f(x) = F(x, x) = α²F(p, p) + 2αβF(p, q) + β²F(q, q) = 0,
to the same eigenvalues). For example, in the projective plane (and for the same reason in each real projective space of even dimension) each collineation has at least one fixed point, since the characteristic polynomials of the corresponding linear maps are of odd degree and so have at least one real root.
Instead of discussing a general theory, we now briefly illustrate its usefulness on several results for projective planes.
Proposition. A projective transformation different from the identity has either exactly one center and exactly one axis, or it has neither a center nor an axis.
Proof. Let us consider a collineation f on 𝒫ℝ³ and let us assume that it has two distinct centers A and B. Let us denote by ℓ the line given by these two centers, and let us choose a point X in the projective plane outside of ℓ. If p and q are the lines passing through the pairs of points (A, X) and (B, X) respectively, then also f(p) = p and f(q) = q, and in particular the point X is fixed. But this means that all points of the plane outside of ℓ are fixed. Hence each line different from ℓ has all points out of ℓ fixed, and thus also its intersection with ℓ is fixed. It means that f is the identity mapping, and so we have proved that every projective transformation different from the identity may have at most one center. The same consideration for the dual projective plane gives the result about at most one axis.
If f has a center A, then all lines passing through A are fixed and correspond therefore to a two-dimensional subspace of row eigenvectors of the matrix corresponding to the transformation f. Therefore, there exists a two-dimensional subspace of column eigenvectors to the same eigenvalue, and this one represents exactly the line of fixed points, hence the axis. The same consideration in the reversed order proves the opposite statement: if a projective transformation of the plane has an axis, then it also has a center. □
For practical problems it is useful to work with complex projective extensions also in the case of a real plane. Then the geometric behaviour can be easily read off the potential existence of real or imaginary centers and axes.
4.42. Projective classification of quadrics. At the end of this section we come back to conics and quadrics. A quadric Q in the n-dimensional affine space ℝⁿ is defined by the general quadratic equation (4.4), see page 228. Viewing the affine space ℝⁿ as affine coordinates in the projective space 𝒫ℝⁿ⁺¹, we may aim to describe the set Q by homogeneous coordinates in the projective space. The formula in these coordinates should contain only terms of second order, since only a homogeneous formula is independent of the choice of the multiple of homogeneous coordinates (x₀, x₁, ..., xₙ) of a point. Hence we are searching for a homogeneous formula whose restriction to affine coordinates, i.e. the substitution x₀ = 1, gives the original formula (4.4).
But this is especially easy: we simply add enough powers of x₀ to all terms, namely none to the quadratic terms, one x₀ to the linear terms and x₀² to the constant term in the original affine equation for Q.
So we get a well defined quadratic form f on the vector space ℝⁿ⁺¹ whose zero set correctly defines the so-called projective quadric Q̄.
The intersection of the "cone" Q̄ ⊂ ℝⁿ⁺¹, i.e. of the zero set of this form, with the affine plane x₀ = 1 is the original quadric Q, whose points are called proper points of the quadric, while the other points Q̄ \ Q in the projective extension are the infinite points.
which means that every point x of the line lies on the conic section f. However, when a conic section contains a line, it has to be degenerate, which is a contradiction. We can also see that in the case of a degenerate conic section the polar is a line of the conic section itself.
The claim for P ∉ f follows from the symmetry of the bilinear form F: when a point Q lies on the polar of P, then the point P lies on the polar of Q.
□
Using polar conjugates we can find the axes and the center of a conic section without the Lagrange algorithm.
Write the matrix of the conic section as a block matrix
( A  a ; a^T  α ),
where A = (aᵢⱼ) for i, j = 1, 2, a is the vector (a₁₃, a₂₃)^T and α = a₃₃. That means the conic section is defined by the equation
u^T A u + 2a^T u + α = 0
for the vector u = (x, y)^T. Now we show that
4.58. The axes of a conic section are the polars of the points at infinity determined by the eigenvectors of the matrix A.
Solution. Because of the symmetry of A, it has in the basis of its eigenvectors the diagonal shape D = ( λ 0 ; 0 μ ), where λ, μ ∈ ℝ, and this basis is orthogonal. Denote by U the matrix changing the basis to the eigenbasis (its columns are the eigenvectors); then the matrix of the conic section in the eigenbasis is
( U^T 0 ; 0 1 )( A a ; a^T α )( U 0 ; 0 1 ) = ( D  U^T a ; a^T U  α ).
So in this basis we have the canonical form defined by the vector U^T a (up to a translation). Specifically, denoting by v_λ, v_μ the eigenvectors, we have
λ(x + (a^T v_λ)/λ)² + μ(y + (a^T v_μ)/μ)² = (a^T v_λ)²/λ + (a^T v_μ)²/μ − α.
It means that the eigenvectors are the direction vectors of the axes of the conic section (the so-called main directions), and the equations of the axes in this basis are x = −(a^T v_λ)/λ and y = −(a^T v_μ)/μ. The coordinates u_λ and u_μ of the points of the axes in the standard basis satisfy v_λ^T u_λ = −(a^T v_λ)/λ and v_μ^T u_μ = −(a^T v_μ)/μ, because v_λ^T(λu_λ +
The classification of real or complex projective quadrics, up to projective transformations, is a problem which we have already managed: it is all about finding the canonical polar basis, see paragraph 4.29. From this classification, which is given by the signature of the form in the real case and only by the rank in the complex case, we can deduce relatively easily also the classification of affine quadrics. We show the core of the procedure on the case of conics in the affine and projective plane.
The projective classification gives the following possibilities, described by homogeneous coordinates (x : y : z) in the projective plane 𝒫ℝ³:
• imaginary regular conic given by x² + y² + z² = 0
• real regular conic given by x² + y² − z² = 0
• pair of imaginary lines given by x² + y² = 0
• pair of real lines given by x² − y² = 0
• one double line x² = 0.
We consider this classification as real, i.e. the classification of the quadratic forms is given not only by the rank but also by the signature. Nevertheless, the points of the quadric are considered also in the complex extension. This is how the stated names should be understood; e.g. the imaginary conic does not have any real points.
4.43. Affine classification of quadrics. For the affine classification we must restrict the projective transformations to those which preserve the line of infinite points. We can realize this also by an opposite procedure: for a fixed projective type of conic Q, i.e. its cone Q̄ ⊂ ℝ³, we choose different affine planes α ⊂ ℝ³ which do not pass through the origin, and we observe how the set of points Q̄ ∩ α, the proper points of Q in the affine coordinates realized by the plane α, changes.
Hence in the case of a regular conic we have a true cone Q̄ given by the equation z² = x² + y², and as the planes α we may take, for instance, the tangent planes to the unit sphere. If we begin with the plane z = 1, the intersection consists only of finite points forming a unit circle Q. By successively tilting α we get more and more stretched ellipses, until we reach a slope at which α is parallel with one of the lines of the cone. At that moment there appears one (double) infinite point of our conic, whose finite points still form one connected component, and so we get a parabola. Continuing the tilting gives rise to two infinite points, the set of finite points is no longer connected, and so we get the last regular quadric in the affine classification, a hyperbola.
The method just introduced enables us to continue the classification in higher dimensions. In particular, let us notice that the intersection of our conic with the projective line of infinite points is always a quadric in dimension one less; in our case it is either an empty set, or a double point, or two points, as the types of quadrics on a projective line. Moreover, an affine transformation transforming one possible realization of a fixed projective type to another one exists only if the corresponding quadrics in the infinite line are projectively equivalent. In this way it is possible to continue the classification of quadrics in dimension three and higher.
a) = 0 and v_μ^T(μu_μ + a) = 0. These equations are equivalent to the equations v_λ^T(Au_λ + a) = 0 and v_μ^T(Au_μ + a) = 0, which are the polar equations of the points defined by the vectors v_λ and v_μ. □
4.59. Remark. A corollary of the previous claim is the fact that the center of the conic section is polar conjugate with all points at infinity. The coordinates s of the center then satisfy the equation As + a = 0.
The equation As + a = 0 for the center coordinates has exactly one solution for δ = det(A) ≠ 0 and no solution for δ = 0. That means that, regarding non-degenerate conic sections, the ellipse and the hyperbola have exactly one center, and the parabola does not have any (its center is a point at infinity).
4.60. Prove that the angle between a tangent to a parabola (at an arbitrary touch point) and the axis of the parabola is the same as the angle between the tangent and the line connecting the focus with the touch point.
Solution. The polar (i.e. the tangent) at a point X = [x₀, y₀] of the parabola given by the canonical equation x² = 2py in the polar basis is the line
x₀x − py − py₀ = 0.
The cosine of the angle between the tangent and the parabola axis (x = 0) is given by the dot product of the corresponding unit direction vectors. The unit direction vector of the tangent is (1/√(p² + x₀²))(p, x₀), and therefore for the cosine we have
(1/√(p² + x₀²))(p, x₀)·(0, 1) = x₀/√(p² + x₀²).
Now we show that the cosine of the angle between the tangent and the line connecting the focus F = [0, p/2] with the touch point X is the same. The unit direction vector of the connecting line is
(1/√(x₀² + (y₀ − p/2)²))(x₀, y₀ − p/2).
For the cosine of the angle we get
(1/√(p² + x₀²))·(1/√(x₀² + (y₀ − p/2)²))·(x₀y₀ + px₀/2).
Substituting y₀ = x₀²/(2p) we again obtain x₀/√(p² + x₀²).
This example shows that light rays striking parallel to the axis of a parabolic mirror are reflected to the focus and, vice versa, light rays going through the focus are reflected in the direction parallel to the axis of the parabola. This is the principle of many devices, such as the parabolic reflector. □
4.61. Find the equation of the tangent at P = [1, 1] to the conic section
4x² + 5y² − 8xy + 2y − 3 = 0.
Solution. By projectivizing we get the conic section defined by the quadratic form (x, y, z)A(x, y, z)^T with the matrix
A = ( 4 −4 0 ; −4 5 1 ; 0 1 −3 ).
Using the previous theorem, the tangent is the polar of P, which has homogeneous coordinates (1 : 1 : 1). It is given by the equation (1, 1, 1)A(x, y, z)^T = 0, which in this case gives
2y − 2z = 0.
Moving back to inhomogeneous coordinates we get the tangent equation y = 1.
□
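All the tangent and polar computations in these exercises reduce to one matrix-vector product. A minimal sketch in Python (numpy assumed), repeating exercise 4.61:

    import numpy as np

    # Matrix of the conic 4x^2 + 5y^2 - 8xy + 2y - 3 = 0.
    A = np.array([[ 4.0, -4.0,  0.0],
                  [-4.0,  5.0,  1.0],
                  [ 0.0,  1.0, -3.0]])

    def polar(A, p):
        """Coefficients (u, v, w) of the polar line ux + vy + wz = 0 of p."""
        return np.asarray(p, dtype=float) @ A

    print(polar(A, (1, 1, 1)))  # [0. 2. -2.], i.e. 2y - 2z = 0, or y = 1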
4.62. Find the coordinates of the intersection of the y axis with the conic section defined by
5x² + 2xy + y² − 8x = 0.
Solution. The y axis, i.e. the line x = 0, is the polar of the sought point P with homogeneous coordinates (p) = (p₁ : p₂ : p₃). That means that the equation x = 0 is equivalent to the polar equation F(p, v) = p^T A v = 0, where v = (x, y, z)^T. This is satisfied when Ap = (a, 0, 0)^T for some a ∈ ℝ. For the conic section matrix
A = ( 5 1 −4 ; 1 1 0 ; −4 0 0 )
this condition gives the equation system
5p₁ + p₂ − 4p₃ = a,
p₁ + p₂ = 0,
−4p₁ = 0.
We can find the coordinates of the point P by the inverse matrix, p = A⁻¹(a, 0, 0)^T, or solve the system directly by backward substitution. In this case we easily obtain the solution p = (0, 0, −a/4). So the y axis touches the conic section at the origin. □
4.63. Find the touch point of the line x = 2 with the conic section from the previous exercise.
Solution. The line has the equation x − 2z = 0 in the projective extension, and therefore we get the condition Ap = (a, 0, −2a)^T for the touch point P, which gives us
5p₁ + p₂ − 4p₃ = a,
p₁ + p₂ = 0,
−4p₁ = −2a.
Its solution is p = (a/2, −a/2, a/4). These homogeneous coordinates are equivalent to (2 : −2 : 1), and hence the touch point has coordinates [2, −2]. □
4.64. Find the equations of the tangents passing through P = [3, 4] to the conic section defined by
2x² − 4xy + y² − 2x + 6y − 3 = 0.
Solution. Suppose that the touch point T has homogeneous coordinates given by a multiple of the vector t = (t₁, t₂, t₃). The condition that T lies on the conic section is t^T A t = 0, which gives
2t₁² − 4t₁t₂ + t₂² − 2t₁t₃ + 6t₂t₃ − 3t₃² = 0.
The condition that P lies on the polar of T is p^T A t = 0, where p = (3, 4, 1) are homogeneous coordinates of the point P. In this case, the equation gives us
(3, 4, 1)( 2 −2 −1 ; −2 1 3 ; −1 3 −3 )(t₁, t₂, t₃)^T = −3t₁ + t₂ + 6t₃ = 0.
Now we can substitute t₂ = 3t₁ − 6t₃ into the previous quadratic equation. Then
−t₁² + 4t₁t₃ − 3t₃² = 0.
Because the equation is not satisfied for t₃ = 0, we can move to the inhomogeneous coordinates (t₁/t₃, t₂/t₃, 1), for which we get
−(t₁/t₃)² + 4(t₁/t₃) − 3 = 0 and t₂/t₃ = 3(t₁/t₃) − 6,
i.e. t₁/t₃ = 1 and t₂/t₃ = −3, or t₁/t₃ = 3 and t₂/t₃ = 3. So the touch points have homogeneous coordinates (1 : −3 : 1) and (3 : 3 : 1). The tangent equations are the polars of those points: 7x − 2y − 13 = 0 and x = 3. □
4.65. Find the equations of the tangents passing through the origin to the circle
x² + y² − 10x − 4y + 25 = 0.
Solution. The touch point (t₁ : t₂ : t₃) satisfies
(0, 0, 1)( 1 0 −5 ; 0 1 −2 ; −5 −2 25 )(t₁, t₂, t₃)^T = −5t₁ − 2t₂ + 25t₃ = 0.
From here we express, for example, t₂ and substitute it into the circle equation, which (t₁ : t₂ : t₃) has to satisfy as well. We get the quadratic equation 29t₁² − 250t₁ + 525 = 0 (for t₃ = 1), which has the solutions t₁ = 5 and t₁ = 105/29. We compute the coordinate t₂ and get the touch points [5, 0] and [105/29, 100/29]. The tangents are the polars of those points, with equations y = 0 and 20x − 21y = 0. □
4.66. Find the equations of the tangents to the circle x² + y² = 5 which are parallel with the line 2x + y + 2 = 0.
Solution. In the projective extension, these tangents intersect in the point at infinity satisfying 2x + y + 2z = 0 and z = 0, so in the point with homogeneous coordinates (1 : −2 : 0). They are the tangents from this point to the circle. We can use the same method as in the previous exercise. The conic section matrix is diagonal with the diagonal (1, 1, −5), and therefore the touch point (t₁ : t₂ : t₃) of the tangents satisfies t₁ − 2t₂ = 0. Substituting into the circle equation we get 5t₂² = 5. Hence t₂ = ±1, the touch points are [2, 1] and [−2, −1], and the tangents are their polars 2x + y − 5 = 0 and 2x + y + 5 = 0. □
A tangent touching the conic section at a point at infinity is called an asymptote of the conic section. The number of asymptotes of a conic section equals the number of intersections of the conic section with the line at infinity: an ellipse has no real asymptote, a parabola has one (which is, however, the line at infinity itself) and a hyperbola has two.
4.67. Find the points at infinity and the asymptotes of the conic section defined by
4x² − 8xy + 3y² − 2y − 5 = 0.
Solution. First, we rewrite the conic section in homogeneous coordinates:
4x² − 8xy + 3y² − 2yz − 5z² = 0.
The points at infinity are then the points with homogeneous coordinates (x : y : 0) satisfying this equation, which means
4x² − 8xy + 3y² = 0.
For the fraction x/y we get two solutions: x/y = 1/2 and x/y = 3/2. The conic section is therefore a hyperbola with the points at infinity P = (1 : 2 : 0) and Q = (3 : 2 : 0). The asymptotes are then the polars of the points P and Q, i.e.
(1, 2, 0)( 4 −4 0 ; −4 3 −1 ; 0 −1 −5 )(x, y, z)^T = −4x + 2y − 2z = 0,
(3, 2, 0)( 4 −4 0 ; −4 3 −1 ; 0 −1 −5 )(x, y, z)^T = 4x − 6y − 2z = 0,
i.e. the lines 2x − y + 1 = 0 and 2x − 3y − 1 = 0.
□
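The same computation can be checked by a few lines of Python (numpy assumed); the printed coefficient vectors are determined only up to a nonzero multiple:

    import numpy as np

    # Conic 4x^2 - 8xy + 3y^2 - 2y - 5 = 0 in homogeneous form.
    A = np.array([[ 4.0, -4.0,  0.0],
                  [-4.0,  3.0, -1.0],
                  [ 0.0, -1.0, -5.0]])

    # Points at infinity: roots of 4t^2 - 8t + 3 = 0 with t = x/y.
    for t in np.roots([4.0, -8.0, 3.0]):      # 1.5 and 0.5
        p = np.array([t, 1.0, 0.0])           # the point (t : 1 : 0)
        print(p @ A)                          # asymptote coefficients
    # Output (up to scale): 2x - 3y - z = 0 and -2x + y - z = 0.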
Further exercises on conic sections can be found on page 250.
4.68. Harmonic cross-ratio. If the cross-ratio of four points lying on a line equals −1, we talk about a harmonic quadruple. Let ABCD be a quadrilateral. Denote by K the intersection of the lines AB and CD, and by M the intersection of the lines AD and BC. Further, let L and N be the intersections of the line KM with AC and BD, respectively. Show that the points K, L, M, N form a harmonic quadruple. O
D. Further exercises on this chapter
4.69. Find the parametric equation of the intersection of the planes in ℝ³:
α : 2x + 3y − z + 1 = 0 and ρ : x − 2y + 5 = 0.
o
4.70. Find the common perpendicular of the skew lines
p : [1, 1, 1] + t(2, 1, 0),  q : [2, 2, 0] + s(1, 1, 1).
o
4.71. Jarda is standing at the point [−1, 1, 0] and has a stick of length 4. Can he simultaneously touch the lines p and q, where
p : [0, −1, 0] + t(1, 2, 1),  q : [3, 4, 8] + s(2, 1, 3)?
O (The stick has to pass through [−1, 1, 0].)
4.72. A cube ABCDEFGH is given. Let the point T lie on the edge BF, |BT| = ½|BF|. Compute the cosine of the angle between the planes ATC and BDE. Q
4.73. A cube ABCDEFGH is given. Let the point T lie on the edge AE, |AT| = ½|AE|, and let S be the midpoint of the edge AD. Compute the cosine of the angle between the planes BDT and SCH. Q
4.74. A cube ABCDEFGH is given. Let the point T lie on the edge BF, |BT| = ½|BF|. Compute the cosine of the angle between the planes ATC and BDE. O
4.75. Determine the tangents to the ellipse x²/20 + y²/5 = 1 parallel with the line x + y − 7 = 0.
Solution. Lines parallel with the given line intersect it in the point at infinity (1 : −1 : 0). We construct the tangents to the given ellipse passing through this point. The touch point T = (t₁ : t₂ : t₃) lies on its polar and therefore satisfies t₁/20 − t₂/5 = 0, so t₂ = t₁/4. Substituting into the ellipse equation we get t₁ = ±4. The touch points of the sought tangents are therefore [4, 1] and [−4, −1], and the tangents are the polars of those points, with equations x + y = 5 and x + y = −5. □
4.76. Determine the points at infinity and the asymptotes of the conic section
2x² + 4xy + 2y² − y + 1 = 0.
Solution. The equation for the points at infinity, 2x² + 4xy + 2y² = 0, or rather 2(x + y)² = 0, has the solution x = −y. The only point at infinity therefore is (1 : −1 : 0) (the conic section is a parabola). The asymptote is the polar of this point, namely the line at infinity z = 0. □
4.77. Prove that the product of the distances between an arbitrary point of a hyperbola and its asymptotes is constant, and find the value of this constant.
Solution. Denote the point lying on the hyperbola by P = [x_P, y_P]. The equations of the asymptotes of a hyperbola in canonical form are bx ± ay = 0. Their normals are (b, ±a), and from here we determine the projections P₁, P₂ of the point P to the asymptotes. For the distances between the point P and the asymptotes we get
|PP₁,₂| = |bx_P ± ay_P|/√(a² + b²).
The product is therefore equal to
(b²x_P² − a²y_P²)/(a² + b²) = a²b²/(a² + b²),
because P lies on the hyperbola. □
4.78. Compute the angle between the asymptotes of the hyperbola 3x² − y² = 3.
Solution. For the cosine of the angle between the asymptotes of a hyperbola in canonical form we get cos α = |a² − b²|/(a² + b²). In our case (a² = 1, b² = 3) the angle equals 60°. □
4.79. Compute the centers of the conic sections
(a) 9x² + 6xy − 2y − 2 = 0,
(b) x² + 2xy + y² + 2x + y + 2 = 0,
(c) x² − 4xy + 4y² + 2x − 4y − 3 = 0,
(d) (x − α)²/a² + (y − β)²/b² = 1.
Solution. (a) The system As + a = 0 for computing proper centers is
9s₁ + 3s₂ = 0,
3s₁ − 1 = 0,
and, solving it, we obtain the center [1/3, −1].
(b) In this case we have
s₁ + s₂ + 1 = 0,
s₁ + s₂ + 1/2 = 0,
and therefore there is no proper center (the conic section is a parabola). Moving to homogeneous coordinates we can obtain the center at infinity (1 : −1 : 0).
(c) The coordinates of the center in this case satisfy
s₁ − 2s₂ + 1 = 0,
−2s₁ + 4s₂ − 2 = 0,
and the solution is a whole line of centers. This is because the conic section degenerates to a pair of parallel lines.
(d) From the equations for the center computation we immediately get that the center is (α, β). The coordinates of the center therefore give the translation of the coordinate system origin to the frame in which the ellipse has the basic form.
□
4.80. Find the equations of the axes of the conic section 6xy + 8y² + 4y + 2x − 13 = 0.
Solution. The main directions of the conic section (the direction vectors of the axes) are the eigenvectors of the matrix
( 0 3 ; 3 8 ).
The characteristic equation has the form λ² − 8λ − 9 = 0, so the eigenvalues are λ₁ = −1, λ₂ = 9. The corresponding eigenvectors are then (3, −1) and (1, 3). The axes are the polars of the points at infinity defined by those directions. For (3, −1) we get the axis equation −3x + y + 1 = 0, and for (1, 3) the axis 9x + 27y + 7 = 0. □
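The center and the axes can again be computed mechanically. A short sketch in Python (numpy assumed), repeating exercise 4.80; the printed line coefficients are determined up to scale:

    import numpy as np

    # Conic 6xy + 8y^2 + 4y + 2x - 13 = 0.
    A = np.array([[0.0, 3.0], [3.0, 8.0]])   # quadratic part
    a = np.array([1.0, 2.0])                 # linear part (a13, a23)

    print(np.linalg.solve(A, -a))            # center: solves As + a = 0

    Abar = np.array([[0.0, 3.0,  1.0],
                     [3.0, 8.0,  2.0],
                     [1.0, 2.0, -13.0]])
    w, V = np.linalg.eigh(A)
    for i in range(2):
        v = np.append(V[:, i], 0.0)          # point at infinity (v1 : v2 : 0)
        print(w[i], v @ Abar)                # eigenvalue, axis coefficients
    # Up to scale: -3x + y + 1 = 0 and 9x + 27y + 7 = 0.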
4.81. Determine the equations of the axes of the conic section 4x² + 4xy + y² + 2x + 6y + 5 = 0.
Solution. The eigenvalues of the matrix ( 4 2 ; 2 1 ) are λ₁ = 0, λ₂ = 5, and the corresponding eigenvectors are (−1, 2) and (2, 1). We get the axes 5 = 0 and 2x + y + 1 = 0. The former equation is obviously satisfied by no point. Hence there is only one axis (the conic section is a parabola). □
4.82. The equation
x² + 3xy − y² + x + y + 1 = 0
defines a conic section. Determine its center, axes, asymptotes and foci.
Exercise solutions
4.9. 2, 3, 4, 6, 7, 8. Try to find positions of the planes which correspond to each of those numbers on your own.

If the inverse function f⁻¹ : B → ℝ exists (do not confuse this notation with the function x ↦ (f(x))⁻¹), then it is uniquely determined by either of the following identities:
f⁻¹ ∘ f = id_ℝ,  f ∘ f⁻¹ = id_B,
and the other one then holds as well. If f is defined on a set A ⊂ ℝ and f(A) = B, the existence of f⁻¹ is conditioned by the same statements with the identity mappings id_A and id_B, respectively, on the right-hand sides. As we can see from the picture, the graph of the inverse function is obtained simply by interchanging the axes of the input and output variables.
If we knew that the inverse y = f⁻¹(x) of a differentiable function x = f(y) is also differentiable, then the chain rule would immediately yield
1 = (id)′(x) = (f ∘ f⁻¹)′(x) = f′(y)·(f⁻¹)′(x),
so we obtain the formula (apparently, f′(y) must be non-zero in this case)
___| Derivative of an inverse function |___
(5.6)  (f⁻¹)′(x) = 1/f′(y).
This corresponds to the intuitive idea that for y = f(x) the value of f′ is approximately Δy/Δx, while for x = f⁻¹(y) it is approximately (f⁻¹)′(y) ≈ Δx/Δy. And this indeed is the way we can calculate the derivatives of inverse functions:
Theorem. If f is a real-valued function differentiable at a point x₀ and such that f′(x₀) ≠ 0, then there is a function f⁻¹ defined on some neighborhood of the point f(x₀) such that (5.6) holds.
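The formula (5.6) is easy to test numerically, for instance for f(y) = e^y with f⁻¹(x) = ln x. A minimal sketch in Python (standard library only):

    import math

    def f(y):
        return math.exp(y)

    def f_inv(x):
        return math.log(x)

    x0 = 2.0
    y0 = f_inv(x0)
    h = 1e-6
    numeric = (f_inv(x0 + h) - f_inv(x0 - h)) / (2 * h)  # central difference
    print(numeric)        # approx. 0.5
    print(1.0 / f(y0))    # 1/f'(y0) = 1/x0 = 0.5, in accordance with (5.6)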
positive x-axis in the "positive sense of rotation", i.e. counterclockwise.) O
Solution. π/4.
5.90. Determine the equations of the tangent and normal line to the curve given by the equation
x³ + y³ − 2xy = 0
at the point [1, 1].
Solution. y = 2 − x; y = x.
5.91. Prove the following inequalities:
x/(1 + x) < ln(1 + x) < x for all x > 0.
Solution. The inequalities follow, for instance, from the mean value theorem (attributed to Lagrange) applied to the function y = ln(1 + t), t ∈ [0, x].
F. Extremal problems
The simple observation 5.32 about the geometrical meaning of the derivative also tells us that a differentiable real-valued function of a real variable can have extrema only at the points where its derivative is zero. We can utilize this mere fact when solving miscellaneous practical problems.
5.92. Consider the parabola y = x². Determine the x-coordinate x_A of the parabola's point which is nearest to the point A = [1, 2].
Solution. It is not difficult to realize that there is a unique solution to this problem and that we are actually looking for the absolute minimum of the function
f(x) = √((x − 1)² + (x² − 2)²),  x ∈ ℝ.
Apparently, the function f takes the least value at the same point where the function
g(x) = (x − 1)² + (x² − 2)²,  x ∈ ℝ,
does. Since
g′(x) = 4x³ − 6x − 2,  x ∈ ℝ,
by solving the equation 0 = 2x³ − 3x − 1, we first get the stationary point x = −1, and after dividing the polynomial 2x³ − 3x − 1 by the polynomial x + 1, we obtain the remaining two stationary points
(1 + √3)/2 and (1 − √3)/2.
As the function g is a polynomial (differentiable on the whole domain), from the geometrical sense of the problem, we get
Proof. First, let us realize that the requirement that the derivative at x₀ be non-zero means that our function f is either increasing or decreasing on some neighborhood of the point; see the corollary 5.32. Thus there exists an inverse function defined on some neighborhood. Since a continuous monotone function maps a closed bounded interval onto a closed interval, the image f(U) of any open set U contained in the domain of f is open as well. Then, by the definition of continuity, the inverse function is continuous, too.
To prove our proposition, it now suffices to carefully read through the proof of the fourth statement of the theorem 5.33. We only choose f for h and f⁻¹ for f, and instead of supposing the existence of the derivatives of both functions, we know that the composite function is differentiable (and that the composite is the identity function). Indeed, by the lemma 5.31, there is a function φ continuous at the point y₀ such that
f(y) − f(y₀) = φ(y)(y − y₀)
on some neighborhood of y₀. Further, it satisfies φ(y₀) = f′(y₀). However, then the substitution y = f⁻¹(x) gives
x − x₀ = φ(f⁻¹(x))(f⁻¹(x) − f⁻¹(x₀))
for all x lying in some neighborhood O(x₀) of the point x₀. Further, we have f⁻¹(x₀) = y₀, and since f is either strictly increasing or strictly decreasing, we get that φ(f⁻¹(x)) ≠ 0 for all x ∈ O(x₀) \ {x₀}. Thus we can write
(f⁻¹(x) − f⁻¹(x₀))/(x − x₀) = 1/φ(f⁻¹(x)),
and the continuity of φ at y₀ yields (f⁻¹)′(x₀) = 1/f′(y₀). □
the function f must take the greatest value on I at its only stationary point x₀ = b/4. Thus the sides of the wanted rectangle are b/2 long (twice x₀, considering the original problem) and h/2 (which can be obtained by substituting b/4 for x into the expression h − 2hx/b). Hence we get S = hb/4. □
5.94. Among the rectangles such that two of their vertices lie on the x-axis and the other two have positive y-coordinates and lie on the parabola y = 8 − 2x², find the one which has the greatest area.
Solution. The base of the largest rectangle is 4/√3 long, and the rectangle's height is then 16/3. This result can be obtained by finding the absolute maximum of the function
S(x) = 2x(8 − 2x²)
on the interval I = [0, 2]. Since this function is non-negative on I, takes zero at I's boundary points, is differentiable on the whole of I and its derivative is zero at a unique point of I, namely x = 2/√3, it has the maximum there. □
5.95. Let the ellipse 3x² + y² = 2 be given. Write the equation of its tangent line which forms the smallest triangle possible in the first quadrant and determine the triangle's area.
Solution. The line corresponding to the equation ax + by + c = 0 intersects the axes at the points [−c/a, 0], [0, −c/b], and the area of the triangle whose vertices are these two points and the origin is
S = c²/(2|ab|).
The line which touches the ellipse at [x_T, y_T] has the equation 3x_Tx + y_Ty − 2 = 0. The area of the triangle corresponding to it is
Now, let us focus on the derivative of the exponential function f(x) = aˣ. If the derivative of aˣ exists at all points x, it will surely hold that
f′(x) = lim_{Δx→0} (a^{x+Δx} − aˣ)/Δx = aˣ · lim_{Δx→0} (a^{Δx} − 1)/Δx.
On the other hand, if the derivative at zero exists, then this formula guarantees the existence of the derivative at any point of the domain and also determines its value. At the same time, we verified the validity of the formula for the one-sided derivatives.
Unfortunately, it will take us some time to verify (see 5.43 and 6.43) that the derivatives of exponential functions indeed exist. We will also see that there is an especially important base e, the so-called Euler's number, for which the derivative at zero equals one. What we can do now is to notice that the exponential functions are special in the way that their derivatives are proportional (with a constant coefficient) to their values:
f′(x) = f′(0)·aˣ,
and once we know that (eˣ)′ = eˣ, we also get
(aˣ)′ = (e^{ln(a)x})′ = ln(a)·e^{ln(a)x} = ln(a)·aˣ.
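The proportionality of the derivative to the value of the exponential function can be observed directly with a central difference. A minimal sketch in Python (standard library only; the base a = 3 is an arbitrary choice):

    import math

    a = 3.0
    h = 1e-7
    for x in (0.0, 1.0, 2.5):
        numeric = (a**(x + h) - a**(x - h)) / (2 * h)
        print(numeric / a**x)   # approx. ln(3) = 1.0986... at every point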
5.37. Mean value theorems. Before we continue our journey of building new functions, we derive several simple statements about derivatives. The meaning of all of them is intuitively clear from the pictures, and the proofs only follow the visual imagination.
Theorem. Let a function f : ℝ → ℝ be continuous on a closed bounded interval [a, b] and differentiable inside this interval. If f(a) = f(b), then there is a number c ∈ (a, b) such that f′(c) = 0.
Proof. Since the function f is continuous on the closed interval (i.e. on a compact set), it reaches both a maximum and a minimum there. If the maximum and the minimum shared the value f(a) = f(b), it would mean that the function f is constant, and thus its derivative is zero at all points of the interval (a, b). Therefore, let us suppose that at least one of the maximum and the minimum is different and occurs at an interior point c. Then it is impossible to have f′(c) ≠ 0, because then the function f would be either increasing or decreasing at the point c (see 5.32), and so it would take both lower and higher values than f(c) in any neighborhood of the point c. □
The above theorem is called Rolle's theorem. It immediately implies the following corollary, known as Lagrange's mean value theorem.
5.38. Theorem. Let a function f : ℝ → ℝ be continuous on an interval [a, b] and differentiable at all points inside this interval.
thus S = 2/(3x_T y_T). Further, in the first quadrant we have x_T, y_T > 0. To minimize this area means to maximize the product x_T y_T, which is (in the first quadrant) the same as to maximize
(x_T y_T)² = x_T²(2 − 3x_T²) = −3(x_T² − 1/3)² + 1/3.
Hence the wanted minimum of the area is at x_T = 1/√3. The tangent's equation is √3x + y = 2, and the triangle's area is S_min = 2√3/3.
□
5.96. At the time t = 0, three points P, Q, R began moving in the plane as follows: the point P is moving from the point [−2, 1] in the direction (3, 1) at the constant speed √10 m/s, the point Q is moving from [0, 0] in the direction (−1, 1) with the constant acceleration 2√2 m/s² (beginning at zero speed), and the point R is going from [0, 1] in the direction (1, 0) at the constant speed 2 m/s. At which time will the area of the triangle PQR be minimal?
Solution. The positions of the points P, Q, R at time t are
P = [−2, 1] + t(3, 1),
Q = [0, 0] + t²(−1, 1),
R = [0, 1] + t(2, 0).
The area of the triangle PQR is determined, for instance, by half the absolute value of the determinant whose rows are the coordinates of the vectors PQ and QR (see 1.34). So we minimize the determinant:
det( −t² − 3t + 2   t² − t − 1 ; t² + 2t   1 − t² ) = 2t³ − t + 2.
The derivative is 6t² − 1, so the extrema occur at t = ±1/√6. Since we consider non-negative time only, we are interested in t = 1/√6. The second derivative of the considered function is positive at this point, thus the function has a local minimum there. Further, its value at this point is positive and less than the value at the point 0 (the boundary point of the interval where we are looking for the extremum), so this point is the wanted global minimum. □
5.97. At 9 o'clock in the morning, the old wolf left his den D and as a part of his everyday warm-up, he began running counterclockwise around his favorite stump S at the constant speed 4 kph (not very quick, is he), keeping the constant distance of 1 km from it. At the same time, Little Red Riding Hood set out from her house H straight to her Grandma's cottage C at the constant speed 4 kph. When will they be closest to each other and what will their distance be at that time? The coordinates (in kilometers) are: D = [2, 3], S = [2, 2], H = [0, 0], C = [5, 5].
Solution. The wolf is moving along a unit circle, so his angular speed equals his absolute speed and his position in time can be described by
Then there is a number c e (a, b) such that
fib) - f(a)
fie) =
b — a
VETA 0 S-rttpW tfWOTE.
Proof. The proof is a simple record of the geometrical mean-ing of the theorem: The secant line between the points [a, f(a)] and [b, fib)] has a tangent line which is parallel to it (have a look at the picture). The equation of our secant line is
y — g(x) — f(a) + ^—^-^^(x - a).
b — a
The difference h (x) — f(x) — g(x) determines the distance of the graph and the secant line (in the values of y). Surely h(a) — h(b) and
it, , ft, , fib) - f(a)
h(x) = f (x)--■-.
b — a
By the previous theorem, there is a point c at which h' (c) — 0. □
The mean value theorem can also be written in the form:
(5.9) f(b) = f(a) + f'(c)(b-a).
In the case of a parametrically given curve in the plane, i. e. a pair of functions y — fit), x — git), the same result about existence of a tangent line parallel to the secant line going through the marginal points is described by the so-called Cauchy's mean value theorem:
Corollary. Let functions y = fit) andx = git) be continuous on an interval [a, b] and differentiable inside this interval, and further let g'it) ^ 0 for all t € (a, b). Then there is a point c € (a, b) such that
fjb) - fja) _ f'jc) gib) - gia) g'ic) '
Proof. Again, we rely on Rolle's theorem. Thus we set
hit) = ifib) - fia))git) - igib) - gia)) fit).
NowMa) = fib)gia)-fia)gib),hib) = f{b)g{a)-f{a)g{p), so there is a number c e (a,b) such that h'(c) — 0. Since g'ic) / Owe get just the desired formula. □
A reasoning similar to the one in the above proof leads to a supremely useful tool for calculating limits of quotients of functions. The theorem is known as I'Hospital's rule:
the following parametric equations:
x(t) = 2 − cos(4t),  y(t) = 2 − sin(4t).
Little Red Riding Hood is then moving along the line
x(t) = 2√2 t,  y(t) = 2√2 t.
Let us find the extrema of the (squared) distance ρ of their positions in time:
ρ(t) = [2 − cos(4t) − 2√2t]² + [2 − sin(4t) − 2√2t]²,
ρ′(t) = 16(cos(4t) − sin(4t))(√2t − 1) + 32t + 4√2(cos(4t) + sin(4t)) − 16√2.
It is impossible to solve the equation ρ′(t) = 0 algebraically; we can only find the solution numerically (using some computational software). Apparently, there will be infinitely many local extrema: every round, the wolf's direction is at some moment parallel to that of Little Red Riding Hood, so their distance decreases for some period; however, Little Red Riding Hood is moving away from the wolf's favorite stump around which he is running. We find out that the first local minimum occurs at t ≈ 0.31 and the next one at t ≈ 0.97, when the distance of our heroes is approximately 70 meters. Clearly this second one is the global minimum as well.
The situation when we cannot solve a given problem explicitly is quite common in practice and the use of numerical methods is of great importance. □
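A grid scan of ρ is enough here. A minimal sketch in Python (numpy assumed), using the parametrizations above:

    import numpy as np

    c = 2 * np.sqrt(2)

    def rho(t):
        # Squared distance between the wolf and Little Red Riding Hood.
        return ((2 - np.cos(4 * t) - c * t)**2
                + (2 - np.sin(4 * t) - c * t)**2)

    ts = np.linspace(0.0, 2.0, 2_000_001)
    d2 = rho(ts)
    print(ts[np.argmin(d2)], np.sqrt(d2.min()))  # t approx. 0.97, about 0.07 km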
5.98. Halley's problem, 1686. A basketball player is standing in front of a basket, at distance l from its rim, which is at height h above the throwing point. Determine the minimal initial speed v₀ which the player must give to the ball in order to score, and the angle φ corresponding to this v₀. See the picture.
Solution. Once again, we will omit units of measurement: we can assume that distances are given in meters and times in seconds (and speeds in meters per second then). Suppose the player throws the ball at time t = 0 and it goes through the rim at time t₀ > 0. We will express the ball's position (while flying) by the points [x(t), y(t)] for t ∈ [0, t₀], and we require that x(0) = 0, y(0) = 0, x(t₀) = l, y(t₀) = h. Apparently,
x′(t) = v₀ cos φ,  y′(t) = v₀ sin φ − gt
for t ∈ (0, t₀), where g is the gravity of Earth, since the values x′(t) and y′(t) are, respectively, the horizontal and vertical speed of the ball. By integrating these equations, we get
x(t) = v₀t cos φ + c₁,  y(t) = v₀t sin φ − ½gt² + c₂
5.39. Theorem. Let us suppose that f and g are functions differentiable on some neighborhood of a point x₀ ∈ ℝ, yet not necessarily at the point x₀ itself. Moreover, let the limits
lim_{x→x₀} f(x) = 0,  lim_{x→x₀} g(x) = 0
exist. If the limit
lim_{x→x₀} f′(x)/g′(x)
exists, then the limit
lim_{x→x₀} f(x)/g(x)
exists as well, and these two limits are equal.
Proof. Without loss of generality, we can assume that both the functions f and g take the value zero at the point x₀. Again, we can illustrate the statement by a picture. Let us consider the points [g(x), f(x)] ∈ ℝ² parametrized by the variable x. The quotient of the values then corresponds to the slope of the secant line between the points [0, 0] and [g(x), f(x)]. At the same time, we know that the quotient of the derivatives corresponds to the slope of the tangent line at the given point. Thus we want to derive the existence of the limit of the slopes of the secant lines from the fact that the limit of the slopes of the tangent lines exists.
Technically, we can make use of the mean value theorem in a parametric form. First of all, let us realize that the existence of the expression f′(x)/g′(x) on some neighborhood of the point x₀ (excluding x₀ itself) is implicitly assumed; thus, especially for points c sufficiently close to x₀, we will have g′(c) ≠ 0.⁴ Thanks to the mean value theorem, we now have
lim_{x→x₀} f(x)/g(x) = lim_{x→x₀} (f(x) − f(x₀))/(g(x) − g(x₀)) = lim_{x→x₀} f′(c_x)/g′(c_x),
where c_x is a number lying between x₀ and x, dependent on x. From the existence of the limit
lim_{x→x₀} f′(x)/g′(x),
it follows that this value will be shared by the limit of any sequence created by substituting the values x = xₙ approaching x₀ into
This is not always necessary for the existence of the limit in a general sense. Nevertheless, for the statement of l'Hospital's rule, it is. A thorough discussion can be found (googled) in the popular article 'R. P. Boas, Counterexamples to L'Hospital's Rule, The American Mathematical Monthly, October 1986, Volume 93, Number 8, pp. 644-645.'
291
CHAPTER 5. ESTABLISHING THE ZOO
for t € (0, to) and c\, c2 € R. From the initial conditions
lim x(t) = jc(0) = 0, lim y(t) = y(0) = 0,
t^0+ t^0+
it follows that c\ = c2 = 0. Substituting the remaining conditions
lim x(t) = x(to) = I, lim y(t) = y(t0) = h
t-Hg- t-Hg-
then gives
/ = voto cos cp, h = voto sin cp — j gt^.
According to the first equation, we have that
I
(5.1)
to
Vq cos cp
and thus we get only one equation
gl2
(5.2)
h = I tan cp
2f2 cos2 cp
where v0 e (0, +oo), cp e (0, tt/2).
Let us remind that our task is to determine the minimal v0 and the corresponding cp which satisfies this equation. To be more comprehensible, we want to find the minimal value of v0 for which there is an angle cp satisfying (||5.2||). Since
l+tan2 From the last equation (quadratic equation in p = tan cp), it follows that
tan 0.
Once again, a suitable substitution (this time q = v^) allows us to reduce the left side to a quadratic expression and subsequently to get
(v2 -g[h + Vh^TfJj (v2 -g[h- Vh^TpJj > 0. As h < \Jh2 + P, it must be that
h + V/i2 +12
l. e. v0
>
h + Jh2 + P
The least value (5.4)
h + y/h2 + P
f'(x)/g'(x). Especially, we can substitute any sequence cXn for xn -> xo, and thus the limit
,. f'(cx) lim -
■*^*o g'(cx)
will exist, and the last two limits will be equal. Thus we have proved that the wanted limit exists and also has the same value. □
>From the proof of the theorem, it is apparent that it holds for one-sided limits as well.
5.40. Corollaries. L'Hospital's rule can easily be extended for limits at the improper points ±oo and for the case of infinite values of the limits. If, for instance, we have
lim f{x) = 0, lim g(x) = 0,
x^oo x^oo
then limJC_>0+ fiX/x) = 0 and limJC_>0+ g(\/x) = 0.
At the same time, from existence of the limit of the quotient of the derivatives at infinity, we get
(/(!/*))' /'(l/xX-l/x2) lim -= lim--—
x^0+ (g(l/x))> x^0+ g'(l/x)(-l/x2)
,. /'(I/*) ,. fix) = lim -= lim -.
X^0+ g'(l/x) X^QO g'(X)
Applying the previous theorem, we get that the limit
f(x) f(l/x) f'(x) lim -= lim -= lim -
x^oo g(X) x^0+ g(l/x) x^oo g'(x)
will exist in this case as well.
The limit calculation is even simpler in the case when
lim f(x) = ±oo, lim g(x) = ±oo.
X^XQ X^Xq
Then it suffices to write
f(x) l/g(x) lim -= lim -,
*->*0 g(x) x^x0 l/f(x)
which is already the case of usage of l'Hospital's rule from the previous theorem. It can be proved that l'Hospital's rule holds in the same form for infinite limits as well:
Theorem. Let f and g be functions differentiable on some neighborhood of a point xo e M, yet not necessarily at the point xo itself Further, let the limits lim*-^ f(x) = ±oo and lim*-^ g(x) = ±oo exist. If the limit
fix)
exists, then the limit
lim
*->*0 g'(x)
,. fix) lim
x^x0 g(x)
exists as well and they equal each other.
Proof. Once again, we can apply the mean value theorem. The key step is to express the quotient in a form where the derivative arises:
/(*) _ /(*) f(x)-f(y) g(x)-g(y) g(x) f(x)-f(y) g(x)-g(y) g(x) where y is fixed, from a selected neighborhood of xo and x is approaching xq. Since the limits of / and g at xq are infinite, we can surely assume that the differences of the values of both functions at x and y, having fixed y, are non-zero.
292
CHAPTER 5. ESTABLISHING THE ZOO
is then matched (see (||5.3||)) by (5.5)
v2 h + Vh2 + P tancp = — = —
i. e. cp = arctg
h + Vh2 + P
I I The previous calculation was based upon the conditions x(t0) = I, y(t0) = h only. However, these only talk about the position of the ball at the time t0, but the ball could get through the rim from below. Therefore, let us add the condition / (t0) < 0 which says that the ball was falling at the time, and let us prove that it holds for vq from (|| 5.41|) and cp from (||5.5||).
Let us remind that we have (see (||5.1||), (||5.2||))
to
vq cos cp
0 2(1 tan cp—h) cos2
Using this, from we get
lim y(t)
t^to-
Vo sin Xfi) —
xx
is a special case of the so-called power mean with exponent r, also known as generalized mean:
Mr(xi,
■xi
The special value M~l is called harmonic mean. Now, let us calculate the limit value of Af for r approaching zero. For this purpose, we will determine the limit by l'Hospital's rule (it it an expression of the form 0/0 and we differentiate with respect to r, while x, are constant parameters).
The following calculation, in which we apply the chain rule and our knowledge of the derivative of the power function, must be read from the back. Existence of the last limit implies the existence of the last-but-one, and so on.
lim ln(Mr(xi,
r^0
, x„)) — lim
r^0
O)
x\ \\iX\-\-----Yxrn \nxn
lim-
r^0
lnxi + •
lnx„
= ln^/xl
Hence we can immediately see that
lim Mr (xi, ..., x„) — Zjx\ ... x„,
r^O
which is a value known as geometric mean.
4. Power series
5.42. The calculation of ex. Besides addition and multiplication, \j, we can also manipulate with limits of sequences. ~i. Thus it might be a good idea to approximate non-polynomial functions by sequences of values that can be calculated.
293
CHAPTER 5. ESTABLISHING THE ZOO
We have obtained that the elevation angle corresponding to the throw with minimal energy is the arithmetic mean of the right angle and the angle at which the rim is seen (from the ball's position).
The problem of finding the minimal speed of the thrown ball was actually solved by Edmond Halley as early as in 1686, when he determined the minimal amount of gunpowder necessary for a cannonball to hit a target which lies at greater height (beyond a rampart, for instance). Halley proved (the so-called Halley's calibration rule) that to hit a target at the point [I, h] (shooting from [0,0]) one needs the same minimal amount of gunpowder as when hitting a horizontal target at distance h + ~Jh2 + P (at the angle — 1, b ^ 0, and a natural number n > 2, it is true that (1 + b)n > 1 + nb.
Proof. For n — 2, we get
(1 +b)2 = 1 +2b-
■b2 > 1
2b.
From now on, we proceed by induction on n, supposing b > —I. Let us assume that the proposition holds for some k > 2 and let us calculate:
(1 + b)k+l = (1 + b)k(\ +b)> (1+ kb)(\ + b)
= l + (k+l)b + kb2 > l + (k+ \)b The statement is, of course, true for b — — 1 as well.
□
Now we can bound the quotient of adjacent terms a„ of out sequence
(i + ^r)-1
i
(n2 - \)nn n2n(n - 1)
1 \" n nL I n —
> (1--)-
1.
n n — 1
Thus we have proved that our sequence is indeed increasing.
The following, very similar, calculation (applying Bernoulli's inequality once again) verifies that the sequence of numbers
b„
1
1 + -
n
n + l
is decreasing. Surely b„ > an.
b„
b„
+1
n
n + 1 n
n+2
1
1
1 + -
n
n
n + 1
n+2
1
1 + -
n
2n + l
2n
n+2
n(n ■+ 2)
n
1
n+2
= 1.
n(n+2)/
Thus the sequence a„ is increasing and bounded from above, so the set of its terms has a supremum which equals the limit of the
294
CHAPTER 5. ESTABLISHING THE ZOO
R = t>0f0 cos cp, —h = v$to sin cp — -\ gt^.
From the first equation, it follows that
R
to
VQ cos -2gR(q>)
R
Vq sin 2 Af. However, so great indeces j satisfy Cj+\ < < 2~^~N+1^ cN. This means that the parial sums of the first n terms of our formal sum are bounded from above by the sums
N-l j j n-N j D„ < >^ —x-7 H--x" >^ —7.
295
CHAPTER 5. ESTABLISHING THE ZOO
for cp -» 7t/2— the value of R decreases) and is differentiable at every point of this interval, it has its maximum at the point where its derivative is zero. This means that R (cp) can be maximal only if
(5.8)
R(cp) = h tan 2cp.
Let us thus substitute (||5.8||) into (||5.7||). We obtain
h tan 2cp v2, sin 2cp — gh2 tan2 2cp + 2hv^ cos2 cp = 0. This equation can be transformed to
tan 2cp v2, sin 2cp + 2v2) cos2 cp = gh tan2 2cp,
2 sin2 2cp
+ vl (cos2 )(l+cos 2 N for some fixed N (very great) and choose a fixed number k < N (quite small). Then for sufficiently large N, we can approximate the sum of the first k terms in the expression of un in (5.10) by i;* with arbitrary precision. Since this part of the sum of un is strictly less than un itself, the sequence u„ must converge to the same limit as the sequence v„. Thus we have proved
__ [ The power series for ex [__,
Theorem. The exponential function is, for every number x e expressed as the limit of the partial sums in the expression
1 9
1+xH--x2
2!
1
y-x".
«=0
5.44. Number series. When deriving the previous important theorems about the function ex, we have accidentally worked with several extraordinarily useful concepts and tools. Now, we will formulate
them in general:
Infinite number series
Definition. An infinite series of numbers is an expression
E
«=0
an = aß -\- a\ -\- a2 -
at
Let, for instance, javelin thrower Barbora Spotakova give a javelin the speed i>o = 27.778 m/s = 100 km/h at the height h = 1.8 m (with g = 9.806 65 m • s~2). Then the javelin can fly up to the distance
R ( 0. It thus makes sense to analyze the function (see (||5.11||) and (||5.12||))
a(y) = 4 arcsin ^ — 2arcsin j, y € [0, R].
By selecting the appropriate unit of length (for which R = 1) we can turn to the function
a(x) = 4arcsin^ — 2arcsinx, x e [0, 1].
Having calculated the derivative
a'(x) = —jL= - 2, x e (0, 1),
we can easily determine that the equation a'(x) =0 has a unique solution
324
CHAPTER 5. ESTABLISHING THE ZOO
x0 = y^e(0, 1), if «2eE(l,4). Let us set n = 4/3 (which is approximately the refractive index of water). Further,
a'(x) > 0, x e (0, xo), c/(x) < 0, x e (xo, 1). We have found that at the point
x0 = /^P- = I Ti = 0.86, the function a has a global maximum
o-(xo) = 4 arcsin -4? - 2 arcsin ^ = 0.734 rad ^ 42 °.
y v/ 2V3 3V3
Although it is amazing that the peak of the rainbow cannot be above the level of approximately 42 ° with regard to the observer, what is even more amazing are the values
a (0.14) = 39.4°, a (0.94) = 39.2 °, a(0.8) = 41.2°, a (0.9) = 41.5 °.
Those imply (the function a is increasing on the interval [0, x0] and decreasing on the interval [x0, 1]) that more than 20 % of the values a lie in the band from around 39 ° to around 42 °, and 10 % lie in a band thinner than 1 °. Furthermore, if we consider
a(0.84) =41.9°, a (0.88) = 41.9 °,
we can see that the rays for which a is close to 42 ° have the greatest intensity. Let us emphasize that this is an instance of the so-called principle of minimum deviation: the highest concentration of the diffused light happens to be at the rays with minimum deviation since the total angle deviation of the ray equals the angle 8 = it — a.
The droplets from which the rays creating the rainbow for the observer come lie on the surface of a cone having the central angle equal to 2a (x0). The part of this cone which is above ground then appears as the rainbow arc to the observer (see the picture). Thus when the sun is setting, the rainbow has the shape of a semicircle. Let us remark that the rainbow exists only with regard to its observer - it is not anchored in the space. Eventually, let us add that the circular shape of the rainbow was examined as early as 1635-1637 by René Descartes. □
5.202. L'Hospital's pulley.
A rope of length r is tied to the ceiling at point A. A pulley is attached to its other end. Another
§,, rope of length I > \fcP- + r2, going through the pulley, is tied to the ceiling at point B which is at distance d from the point A. A weight is attached to this rope. In what position will the x weight stabilize (the system will be in a stationary position)? Omit the mass and the size of the ropes and the pulley. See the picture.
Solution. The system will be in a stationary position if its potential energy is minimized, i. e. the distance f(x) of the weight from the ceiling is maximal. However, this means that for r > d, the pulley only moves under the point B. Further on we will thus suppose that r < d. By the Pythagorean theorem, the distance of the pulley from the ceiling is Vr2 — x2 and from the weight then I — y/(d — x)2 + r2 — x2 , which gives
fix) = Vr2 - x2 + I - y/(d - x)2 + r2 - x2 .
The state of the system is fully given by the value x e [0, r] (see the picture), so it suffices to find the global maximum of the function / on the interval [0, r]. First, we calculate the derivative
325
CHAPTER 5. ESTABLISHING THE ZOO
f W = Jfl -x2 ~ J(d-x)2+fl -x2 = Jr2 -x2 + J(d-x)2+fl -x2 ' X ^ ('0' r-*"
Exponentiating the equatino f'(x) = 0 for x e (0, r) leads to
x1 = d2
r2 —x2 {d—x)2+r2 —x2
Multiplying both sides by (r2 — x2) {(d — x)2 + r2 — x2) then leads to
2dx3 - (2d2 +r2)x2 + d2?2 = 0, x e (0, r).
If we notice that one of the roots of the left-hand polynomial is x = d, we can easily transform the last equation into the form
(x-d) (2dx2 - r2x - dr2) = 0, x e (0, r),
or (we have a formula for the quadratic equation)
2d(x-d)(x- d+f&M.) (x - ^g^) =0, xe (0, r).
Hence we can see that the equation f'(x) = 0 has at most one solution on the interval (0, r). (Since r < d and \Jr2 + Sd2 > r, there are surely not two roots of the considered polynomial in x in the interval (0, r).) It remains to determine whether
i2 W r2 +%& 1 X0 - -4d- - 4 r
G (0, r).
Realizing that r,d > 0 and r < d,we get
0 < x0 < ^ r
l+V^T
r.
As the function /' is continuous on the interval (0, r), it can change sign only at the point x0. From the limits
lim f'(x) = -jf=, lim f'(x) = -oo,
x^0+ -Jdl+rl x^r-
it follows that
fix) > 0, x e (0, x0), /'(jc) < 0, x e (x0, r). Thus the function / has the global maximum on the interval [0, r] at the point x0. □
5.203. A nameless mail company can only transport parcels whose length does not exceed 108 inches and whose sum of length and maximal perimeter is at most 165 inches. Find the largest (i. e. having the greatest volume) parcel which can be transported by this company.
Solution. Let M denote the value 165 (inches) and x the parcel's length (in inches as well). Apparently, the wanted parcel has such a shape that for any t e (0, x), its cross section has a constant perimeter (the maximal one). We will denote this perimeter by p (in inches, again). We want the parcel to have the greatest volume so that the cross section of a given perimeter has the greatest area possible. It is not difficult to realize that the largest planar figure of a given perimeter is a disc. Thus we have derived that the desired parcel has the shape of a cylinder with height equal to x and radius r = p/2n.
Its volume is
V =jtr2x =
and it must be that p + x < M and x < 108. Thus we consider the parcel for which p + x = M. Its volume is
V(X) = i-M^2L = *3-2Mx2+M2x where x e (Q) 10g] _
Having calculated the derivative
326
CHAPTER 5. ESTABLISHING THE ZOO
v,(x) = ^-amx+m2 = 3^-^-?)^ x e (Q) 10g) j
we easily find out the the function V is increasing on the interval (0, 55] = (0, M/3] and decreasing on the interval [55, 108] = [M/3, min{108, A/}]. The greatest volume is thus obtained for x = M/3, where
v (f) = m = 0.011789 M3 ^ 0.867 8 m3. If the company also required that the parcel have the shape of a rectangular cuboid (or more generally a right prism of a given number of faces), we can repeat the previous reasoning for a given cross section of area S without specifying what the cross section looks like. It suffices to realize that necessarily S = kp2 for some k > 0 which is determined by the shape of the cross section. (If we change only the size of the sides of the polygon which is the cross section, then its perimeter will change by the same ratio. However, its area will change by square of the ratio.) Thus the parcel's volume is the function
V(x) = Sx = kp2x = k (M — x)2x, x e (0, 108].
The constant k does not affect the point of the global maximum of the function V, so the maximum is again at the point x = M/3. For instance, for the largest right prism having a square base, we have p = M — x = 2M/3, i. e. the length of the square's sides is a = M/6 and the volume is then
V =a2x = = 0.009 259 M3 ^0.681 6 m3.
For a parcel in the shape of a ball (when x is the diameter), the condition p + x < M can immediately be expressed as nx + x < M, i. e. x < M/{jt + 1) < 108. Thus for x = M/{jt + 1), we get the maximal volume
V = \it (f)3 = -^-3 = 0.007 370 M3 ^0.542 6 m3.
Similarly, for a parcel in the shape of a cube (when x is the length of the cube's edges), the condition p + x < M means x < M/5 < 108. Thus for x = M/5 we get the maximal volume
V =x3 = (f)3 = 0.008 M3 « 0.588 9 m3.
Let us add that the length of the edges of the cube which has the same volume as the found cylinder
is
a = -M-= 0.227 595 M « 0.953 849 m.
Let us realize its length and perimeter sum to 5a = 1.138 M, i. e. more than the company's limit by around 14 %. □
5.204. A large military area (further denoted by MA) having the shape of a square and area of 100 km2 is bounded along its perimeter by a narrow path. From the starting point in one corner of MA, one can get to the target point inside MA by going 5 km along the path and then 2 km perpendicularly to it. However, one can also go along the path at 5 kph for any time period and then askew through the MA at 3 kph. What distance do you have to travel along the path if you want to get there as soon as possible?
Solution. To travel x km along the path (where x e [0, 5]), we need x/5 hours. Our way through MA will then be
V22 + (5 - x)2 = Vx2 - lOx + 29 kilometers long and we will cover it in \Jx2 — lOx + 29/3 hours. Altogether, our journey will take
327
CHAPTER 5. ESTABLISHING THE ZOO
f(x) = \x + \y/x2 - lOx + 29 hours (let us remind that x e [0, 5]). The only zero point of the function
fix) = \ +
1 , 1 jt-5
5 3 v/jc2-10jc+29
is x = 7/2. Since the derivative /' exists at every point of the interval [0, 5] and since
/(D = ?|(5) = f (0) = ^, the function / has its absolute minimum at the point x = 7/2 Thus we should go 3.5 km along the path. □
5.205. You find yourself in a boat on a lake at distance d km from the shore. You want to get to a given place on the shore whose straight-line distance is \JS + Z2 from you (see the picture). What path will you take if you want to be there as soon as possible, supposing you can row at t>i kph and run along the shore at v2 kph? How long will the journey take?
Solution. The optimal strategy is apparently given by first rowing straight to the shore at some point [0, x] for x € [0,1] and then running along the shore to the target point [0,1] (see the picture), so the trajectory consists of two line segments (or only one segment, in the case when x = I). The voyage to the point [0, x] on the shore will take
hours
and the final run then
l—x
hours.
«2
We want to minimize the total time, i. e. the function
on the interval [0, /]. Further, we can assume that t>i < v2. (Clearly for t>i > v2 the optimal strategy is to row straight to the target point, which corresponds to x = I.) First, we calculate the first derivative
and then the second derivative
f(x) = —rJ==, jce(0,Z)-
/ (d2 +x
Further, we solve the equation
t' (x) = 0, i. e. Exponentiating this equation gives
v-2
l +x2 V2
Simple rearrangements lead to
2 \v? / • v7
xA = v 7 2, i. e. ^ —--
-(5)
Let us realize that we consider only x e (0,1). Thus we are interested in whether
^- d
-2- < I, * " ^--'-
If this inequaUty holds, then also v\ < v2 and the function i changes sign only at the point
X0 = G (0,/),
328
CHAPTER 5. ESTABLISHING THE ZOO
and this change is from negative to positive (consider limx^0+ f (x) < 0 and t" (x) > 0, x e (0, /)). This means that in this case, at the point x0 there is the global minimum of the function t on the interval [0, /]. However, if the inequality (||5.205||) is false, then we have f (x) < 0 for all x e (0,1) whence it follows that the global minimum of the function t on [0,1] is at the right-hand marginal point (the function t is decreasing on its domain). The fastest journey will take (in hours)
t (x0)
d2 +*l l-x0 1
V2 Vi
a d
V2
dv2+ivifi^f-id dV2(i-(a)2)+hlyi-(^)2 dV2fi^f+iv
"2 V
i d
supposing (|| 5.2051|), and if (115.20511) does not hold.
ViV2
t (I) = hours
□
5.206. A company is looking for a rectangular patch of land with sides of lengths 5a and b. The company wants to enclose it with a fence and then split it into 5 equal parts (each being a rectangle with sides a, b) by further fences. For which values of a, b will the area S = Sab of the patch be maximal if the total length of the used fences is to equal 2 400 m?
Solution. Let us reformulate the statement of the problem: We want to maximize the product Sab while satisfying the condition
(5.13) 6b + 10a = 2400, a,b>0.
It can easily be shown that the function
a h-> 5a
2 400-lOa
defined for a e [0, 240] takes the maximal value at the point a = 120. Hence the result is
a = 120 m, b = 200 m. Let us add that the mentioned value of b immediately follows from (||5.13||).
□
5.207. A rectangle is inscribed into an equilateral triangle with sides of length a so that one of its sides lies on one of the triangle's sides and the other two of the rectangle's vertices lie on the remaining sides of the triangle. What is the maximum possible area of the rectangle?
5.208. Choose the dimensions of an (open) swimming pool whose volume is 32 m3 and whose bottom has the shape of a square, so that one would spare the least amount of paint possible to prime its bottom and walls. O
5.209. Express the number 28 as a sum of two non-negative numbers such that the sum of the first summand squared and the second summand cubed is as small as possible. O
329
CHAPTER 5. ESTABLISHING THE ZOO
5.210. With the help of the first derivative, find the real number a > 0 for which the sum a + l/a is minimal. Then solve this problem without using the differential calculus. O
5.211. Inscribe a rectangle with the greatest perimeter possible into a semidisc with radius r. Determine the rectangle's perimeter. O
5.212. Among the rectangles with perimeter Ac, find the one having the greatest area (if such one exists) and determine the lengths of its sides. O
5.213. Find the height h and the radius r of the largest (i. e. having the greatest volume) cone which fits into a ball of radius R. Q
5.214. From the triangles with a given perimeter p, select the one with the greatest area. O
5.215. On the parabola given by the equation 2x2 — 2y = 9, find the points which are closest to the origin of the coordinate system. O
5.216. Your task is to create a one-liter tin having the "usual" shape of a cylinder so that the minimal amount of material would be used. Determine the proper ratio between its height h and radius r. O
5.217. Determine the distance of the point [3,-l]el2 from the parabola y = x2 — x + \. Q
5.218. Determine the distance of the point [—4, —2] e R2 from the parabola y = x2 + x + 1. O
5.219. At the time t = 0, a car left the point A = [5, 0] at the speed of 4 units per second in the direction (—1, 0). At the same time, another car left the point B = [—2, —1] at the speed of 2 units per second in the direction (0, 1). When will the cars be closest to each other and what will their distance be at that moment? O
5.220. At the time t = 0, a car left the point A = [0, 0] at 2 units per second in the direction (1,0). At the same time, another car left the point B = [1, — 1] at 3 units per second in the direction (0, 1). When will they be closest to each other and what will the distance be? O
5.221. Determine the maximum possible volume of a cone with surface area 3tv cm2 (the surface area of its base is included as well). The area of a cone is P = 7tr(r + h), its volume then V = \Ttr2h, where r is the radius of its base and h is its height. O
5.222. A 13 feet long ladder is leaned against a house. Suddenly the base of the ladder slips off and the ladder begins to go down (still touching the house at its other end). When the base of the ladder is 12 feet from the house, it is moving at 5 feet per second from it. At this moment:
(a) What is the speed of the top of the ladder?
(b) What is the rate of change of the triangle dehmited by the house, the ladder, and ground?
(c) What is the rate of change of the angle enclosed by the ladder and the ground?
o
5.223. Suppose you own an excess of funds without the possibility to invest outside your own factory which acts at a regulated market with a nearly unhmited demand and a hmited access to some key raw materials, which allows you to produce at most 10 000 products per day. You know that the raw profit p and the expenses e, as functions of a variable x which determines the average number of products per day, satisfy
330
CHAPTER 5. ESTABLISHING THE ZOO
v(x) = 9x, n(x) = x3 - 6x2 + 15x, x e [0, 10]. At what production will you profit the most from your factory? O
5.224. Determine
lim ( cotx--
x^O \ X
Solution. If we realize that
1
lim cotx = +oo, lim — = +oo,
x^0+ x^0+ X
1
lim cotx = — oo, lim — = — oo,
jt-»0- x^O- X
we can see that both one-sided limits are of the type oo — oo. We can thus consider the (two-sided) limit.
We will write the cotangent function as the ratio of the cosine and the sine and convert the fractions
to a common denominator, i. e.
1 \ x cos x — sin x lim cotx--= lim-.
x^o \ x) x^o xsmx
Thus we have obtained an expression of the type 0/0 for which we get (by 1'Hospital's rule)
xcosx —sinx cosx — x sinx — cosx —xsinx
lim-= lim-= lim
x^o xsinx x^o sinx + xcosx sinx + x cosx
By one more use of 1'Hospital's rule for the type 0/0, we then get
—xsinx —sinx—xcosx 0 — 0
lim-= lim-=-= 0.
x^o sinx + x cosx cosx + cosx — x sinx 1 + 1 — 0
5.225. Determine the limit
7TX
5.226. Calculate
lim (1 — x) tan ■ 2
lim (— — xtan x).
^f-V2 /
5.227. Using l'Hospital's rule, determine
j^((3"-2"W
5.228. Calculate
. 1 1
lim
l \ 2 In x x2 — 1
5.229. By l'Hospital's rule, calculate the limit
^2
2
lim cos
x^+oo \ x
□
o
o
o
o
o
331
CHAPTER 5. ESTABLISHING THE ZOO
5.230. Determine
lim (1 — cosx)s
o
5.231. Determine the following Umits
lim xtaT, Hm xisz,
where a e M is arbitrary. O
5.232. By any means, verify that
ex - 1 lim-= 1.
x^O X
o
5.233. By applying the ratio test (also called D'Alembert's criterion; see 5.46), determine whether the infinite series
(a) E
n = l oo
(b) E ff;
«=i
(c) E „".„,
converges
« = 1
Solution. Since (a„ > 0 for all n)
2*+1-(«+2)3-3* -• ■v-a-Ti3 3"+1-2"-(« + l)
(a) um 22±i = lim ll[<»?rf; = lim = lim ^ = f < 1;
v y „^oo an 3»+1-2»-(« + D3 3(« + l)3 „^^ 3«3 3
(b) lim 22±i = Km f^l- • = Urn 4r = 0 < 1;
(C) Km 2s±l = lim (, i^r'n, • = I™ t^tt • lim ^ = lim 4 •
v y „^oo an \(« + l)2-(n + 1)! «" / „^00 (« + l)2 „^00 «" „^00 «2
lim (l + i)" = 1 e > 1, the series (a) converges; (b) converges; (c) does not converge (it diverges to +00). □
5.234. By applying the root test (Cauchy's criterion), determine whether the infinite series
(a) E ln"(« + l);
(b) E
«=1
00
(c) Earcsin"! «=1
converges.
Solution. Once again we consider series with non-negative terms only, where
(a) lim ^fa~n = lim = 0 < 1;
(«±±y iim (i+iy
(b) lim 4a~n = lim ^TT = 7°°V L = f < 1;
(c) lim ^/öJJ" = lim arcsin = arcsin 0 = 0 < 1.
This means that all of the examined series converge. □
332
CHAPTER 5. ESTABLISHING THE ZOO
5.235. Determine whether the series
oo
(a) £(-D" ln(l + £);
n = \
oo 2
(b) E ^ ■
.i!
« = 1 oo
(c) v (~3)" «=i
converges.
Solution. The case (a). By l'Hospital's rule, we have
r K1+^) r r 1 1
lim v ! ' = lim 2 ,- = lim —l— = 1,
hoo 2* x^*+oo (2^) x^*+oo 1 + 2*
hence
0 < In (1 + < £
for all sufficiently large neN. However, we know that the series E^i *s convergent. So it must be that
00
£ln(l + £) < +00,
«=i
i. e. the examined series converges (absolutely). The case (b). The ratio test gives
lim
i- 2("+1)2-«l i- 22"+1 i- 2-4"
hm--= lim ^—r = lim ^-V = +00.
«^00 (« + l)!-2«2 «^00 " + 1
Thus the series does not converge.
The case (c). Now we will use the general version of the root test
lim sup y\an I = lim sup 6+3_1)n = f < 1, whence it follows that the series is (absolutely) convergent. □
5.236. By any means, determine whether the following alternating series converge:
(a) E(-l)
n « +3«-l .
(3«-2)2 n = \
/u\ —1 3n4-3n3+9n-
yO) *■) (5«3-2)4"
_±\h-1 3n4-3n3+9n-l
n = l
Solution. The case (a). Since we have that
„2
lim = lim = i y^ 0,
„^oo (3«-2)2 9«2 9 7-'
it immediately follows that the limit
does not exist. Therefore, the series does not converge (a necessary condition for the convergence is not satisfied).
The case (b). We have seen that when applying the ratio (or root) test, the polynomials neither in the numerator nor in the denominator affect the value of the examined limit. Let us thus consider the series
00
4«
n = l
for which we have
lim
"n+\
4- )J
0, pe(0,2), /a»<0, 0 there exists a norm of the partition S > 0 such that for all partitions S with norm lesser than S, we have
15S - /| < e.
For example, if we choose g(x) on interval [0, 1] as a sequentially constant function with finitely many discontinuities ci,... ,Ck and "jumps"
at = lim g(x) - lim g(x),
x—y ci _|_ x—yc[ —
then the Riemann-Stieltjes integral exists for every continuous fix) and equals
»1 k
/ f(x)dg(x) = y2aif(ck). Jo ~i
By the same technique we used for the Riemann integral, we can now define upper and lower sums and uppoer and lower Riemann-Stieltjes integral, which have the advantage that for bounded functions they always exist and their values coincide if and only if the Riemann-Stieltjes integral in the above sense exists.
We already encountered problems with Riemann integration of functions that were "too jumpy". Technically, for function g(x) on a finite interval [a, b] we define its variation by
sup Igte)
S r = l
■ g(Xi-l)\,
where we take the supremum over all partitions S of the interval [a, b]. If the supremum is infinite, we say g(x) has an unbounded variation on [a, b], otherwise we say g is a function with a bounded variation on [a, b].
Similarly to the procedure for the Riemann integral, we can quite easily derive the following:
Theorem. Let fix) and g(x) be real functions on a finite interval [a, b].
(1) Ifgix) is decreasing and continuously differentiable, then the Riemann integral on the left side and the Riemann-Stieltjes integral on the right side both exist simultaneously and their values are equal
fb pb
f(x)dg(x)
f f(x)g\x)dx= f
Ja Ja
(2) If fix) is continuous and g(x) is a nondecreasing function
rb
with a finite variation, then the integral Ja f(x)dg(x) exists.
6.49. Kurzweil integral. The last stop will be a modification of \^ the Riemann integral, which fixes the unfortunate be-\ havior at the third point in the paragraph 6.37, i.e. the limits of the nondecreasing sequences of integrable functions will again be integrable. Then we will be able to interchange the order of the limit process and integration in these cases, just like with uniform convergence.
First notice what's the essence of the problem. Intuitively we should assume that very small sets must have a zero size, and thus the changes of values of the functions on such sets shouldn't influence the integration. Moreover, a countable union of such "negli-_gihlq for Jhe purpose of integration" sets should have a zero size again. Surely we would expect that for example the set of rational
392
CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS
Of course, the convergence of the series at points ±1 can be veri-
fied directly. It's even possible to directly deduce that E ^
«=i
(n + D
(by writing out ^
(n + D
n + 1"
1
□
6.86. Sum of a series. Using theorem 6.41 "about the interchange of a limit and an integral of a sequence of uniformly convergent functions", we'll now add the number series
oo 1
T —
n = \
We'll use the fact that f
xn+l
1
«2" '
Solution. On interval (2, oo), the series of functions E^Li ^rr converges uniformly. That is implied for example by the Weierstrass test: each of the function is decreasing on interval (2, oo), thus their
values are at most ^n-; the series Y^T=
:1 2" + !
is convergent though
2«+l '
(it's a geometric series with quotient -]). Hence according to the Weierstrass test, the series of functions E^Li ^rr converges uniformly. We can even write the resulting function explicitly. Its value at any x € (2, oo) is the value of the geometric series with quotient 7, so if we denote the limit by fix), we have
fix)
Y —
z—i Yn + 1
« = 1
1 1
X2 1 -
1
x(x — 1)
By using (6.43) (3), we get
Ax
'2
~ 1 _ ™ r°° dx
nln ~ ^ J2 xn+1
n=\ n=\ 2
/•oo / 00 j
A
\«=i
-jf
dx
x(x
-<5
dx
f 1 1
lim /---
<5^oo J2 X — 1 X
dx
lim [(ln(<5 - 1) - ln(<5) - ln(l) + In2]
lim
• X2 between metric spaces with metrics d\ and d2, respectively, is called an isometry iff all elements x,y e X satisfy d2(cp(x), cp(y)) — d\(x, y).
Of course, every isometry is a bijection onto its image (this follows from the properly that the distance of distinct elements is non-zero) and the corresponding inverse mapping is an isometry as well.
Now, let us consider two inclusions of a dense subset, i\ : X —>• Xi and i2 : X —>• X2, into two completions of the space X, and let us denote the corresponding metrics by d, d\, and d2, respectively. Apparently, the mapping
X
X2
438
CHAPTER 7. CONTINUOUS MODELS
Hence we can see that the sequence {x„} cannot be a Cauchy sequence. Thus we have found the answer for the usual metric. However, we could have utilized the fact that the sequence {xn} is not convergent by (||7.281|) and that we find ourselves in a complete metric space, where Cauchy sequences and convergent sequences coincide.
For the metric d, it suffices to realize that the mapping / introduced in (||7.271|) is a continuous bijection between the sets [0, oo) and [0, 1), having the property that /(0) = 0. Thus, any sequence is convergent "in the original meaning" if and only if it converges in the metric space R with metric d. It holds as well that a sequence is a Cauchy sequence in R with respect to the usual metric if and only if it is a Cauchy sequence with respect to d. □
7.26. Is the metric space 0,
12 0
12 0 -24 > 0, 0 2 0 = 6 > 0, -3 0 1
imply that the matrix Hf (2, 1, 4) is positive definite - there is a strict local minimum at the point [2,1,4]. □
8.30. Find the local extrema of the function
z = (x2 - l) (l - x4 - y2) , x, y e R.
Solution. Once again, we calculate the partial derivatives zx, zy and set them equal to zero. This leads to the equations
-6x5 + 4x3 + 2x - Ixy2 = 0, (x2 - l) (-2y) = 0,
whose solutions [x, y] = [0, 0], [x, y] = [1, 0], [x, y] = [-1, 0]. (In order to find these solutions, it suffices to find the real roots 1, — 1 of the polynomial — 6x4 + Ax2 + 2 using the substitution u = x2. Now, we compute the second-order partial derivatives
-30x4 + 12x2 + 2-2/, z
xy
~yx
-4xy, zyy = -2 (x2 - l)
Hz (0, 0)
and evaluate the Hessian at the stationary points:
^ °), //z(l,0) = //z(-l,0) = (-016 £
We can see that the first matrix is positive definite, so the function has a strict local minimum at the origin.
However, the second and third matrices are negative semidefinite. Therefore, the knowledge of second partial derivatives in insufficient for deciding whether there is an extremum at the points [1,0] and [—1,0]. On the other hand, we can examine the function values near these points. We have
z (1,0) =z (-1,0) = 0, z(x,0)<0 forxe(-l,l). Further, consider y dependent on x e (—1, 1) by the formula y = ^2 (l — x4), so that y -» 0 for x -» ± 1. For this choice, we get
z (x, ^2(1 -x4)) = (x2 - 1) (x4 - 1) > 0, x e (-1, 1).
We have thus shown that in arbitrarily small neighborhoods of the points [1, 0] and [—1,0], the function z, takes on both higher and lower values than the function value at the corresponding point. Therefore, these are not extrema. □
8.31. Decide whether the polynomial
p(x, y)
x6 + y8 + y4x4
has a local extremum at the stationary point [0, 0]. Solution. We can easily verify that the partial derivatives px and py are indeed zero at the origin. However, each of the partial derivatives Pxx, Pxy, Pyy is also equal to zero at the point [0, 0]. The Hessian Hp (0, 0) is thus both positive and negative semidefinite at the same time. However, a simple idea can lead us to the result: We can notice that p(0,0) = Oand
p(x, y) = x6 (1 - y5) + y8 + y4x4 > 0
for [x, y] € R x (—1, 1) \ {[0, 0]}. Therefore, the given polynomial has a local minimum at the origin. □
8.32. Determine local extrema of the function / : R3 -» R,
f(x,y,z)=x2y + y2z+x-zon.R3. O
Now, gy (x, y) — fy (x + t, y) — fy (x, y), so we can write • 0 must guarantee the wanted equality
fxy(x, y) = fyx(x, y)
at all points (x, y).
The same procedure for functions of n variables proves the following fundamental result:
I Interchangeability of partial derivatives
8.10. Theorem. Let f : En —>• M be a k-times differentiable function with continuous partial derivatives up to order k (inclusive) in a neighborhood of a point x € M". Then all partial derivatives of the function f at the point x up to order k (inclusive) are independent of the order of differentiation.
Proof. The proof for the second order was illustrated above in , the special case when n — 2 . The procedure works
similarly for the general case as well. " Formally, the proof can be led in the following way: we may assume that for every fixed choice of a pair of coordinates Xi and xj, the whole discussion of their interchanging takes place in a two-dimensional afline subspace, i. e., all the other variables are considered to be constant and take no effect in the reasonings.
In the case of higher-order derivatives, the proof can be finished by induction on the order. Indeed, every order of the indeces i\, ...,ik can be obtained from a fixed one by several swaps of adjacent pairs of indeces. □
8.11.
Hessian. In the case of first-order derivatives, we introduced the differential, being the linear form df(x) which approximates a function fata point x in the best way. Similarly, we will now want to understand the quadratic approximation of a function / : E„ —>•
474
CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES
8.33.
Determine the local extrema of the function /
9
x y
y2 z + 4x +z
on
8.34. Determine the local extrema of the function /
fix, y, z) = xz2 + y2 z - x + y on R3.
8.35. Determine the local extrema of the function /
fix, y, z) = y2z - xz2 + x + 4y on R3.
8.36. Determine the local extrema of the function /
fix, y) = x2y + x2 + ly2 + y on R2
8.37. Determine the local extrema of the function /
fix, y) = x2y + 2/ + 2y on R2.
8.38. Determine the local extrema of the function /
fix, y) = x2 + xy + 2y2 + y on R2.
8.39. Determine the local extrema of the function /
fix, y) = x2 + xy - 2y2 + y on R2.
O
► I
O
► I
o o o o o
F. Implicitly given functions and mappings
8.40. Let F : R2 -» R be a function, F(x,y) = xysin^xy2). Show that the equality Fix,y) = 1 implicitly defines a function / : U -» R on a neighborhood U of the point [1,1] so that Fix, fix)) = 1 forx € U. Determine f'il).
Solution. The function is differentiable on the whole R2, so it is such on any neighborhood of the point [1, 1]. Let us evaluate Fy at [1, 1]:
Fyix, y) = x sin
+ Ttx2 y2 cos
so Fy(l, 1) = 1 7^ 0. Therefore, it follows from theorem 8.18 that the equation Fix,y) = 1 implicitly determines on a neighborhood of the point (1, 1) a function / : U -» R defined on a neighborhood of the point (number) 1. Moreover, we have
Fxix, y)=y sin (^y2) + ijxy3 cos (^y2) ,
so the derivative of the function / at the point 1 satisfies ^(1, 1) 1
Fy(h 1)
1
□
Remark. Notice that although we are unable to explicitly define the function / from the equation Fix, fix)) = 1, we are able to determine its derivative at the point 1.
8.41. Considering the function F : R2 -
Fix, y) = ex sin(y) + y
77-/2 - 1
, show that the equation F{x,y) = 0 implicitly defines the variable y to be a function of x, y = fix), on a neighborhood of the point [0,77-/2]. Compute /'(0).
Solution. The function is differentiable in a neighborhood of the point [0,7T/2]; moreover, Fy = ex cosy + 1, F(0, jt/2) = 1 ^ 0, so the equation indeed defines a function / : U -» R on a neighborhood of the point [0, tt/2]. Further, we have Fx = ex siny, 7^(0, jt/2) = 1, and its derivative at the point 0 satisfies:
Fx(0, tt/2) 1 f'iO) = - _.' =-- = -1. □
7% (0,77-/2)
1
Hessian
Definition. If / : M" —>• M is a twice differentiable function, we call the symmetric matrix of functions
H fix) =
d2f dxi dxj
(H-ix)
ix) =
a2/
dx\dxn
ix)\
a2/
dxn dx\
ix)
■Al—ix)!
ÓXndXn v ' /
the Hessian of the function / at the point x.
We have already seen from the previous reasonings that zeroing the differential at a point (x, y) e E2 guarantees stationary behavior along all curves going through this point. The Hessian
H fix, y)
fxx ix, y) fxy ix, y)
fxyix,y) /vv(V. V)
plays the role of the second derivative. For every parametrized straight line
cit) = (x(0, yit)) the univariate functions
a(0 = /M0,X0)
m = fixo,yo) + ^-(xo,yo)^ ox
(x0 + %t, y0 + nt),
of
t-(*o, yo)ri
dy
fxxixo, yo)t + 2fxyix0, yo)ŠV + fyyixo, yo)v'
will share the same derivatives up to the second order (inclusive) at the point t — 0 (calculate this on your own!). The function f3 can be written in terms of vectors as
ßit) — fixo, yo) + dfixQ, y0) •
l-iH v)-Hfixo,yo)-(Í
or Pit) = fix0, yo) + dfix0, yo)iv) + jHfixo, yo)iv, v), where i; — (§, n) is the increase given by the derivative of the curve c(f), and the Hessian is used as a symmetric 2-form.
This is an expression which looks like Taylor's theorem for univariate functions, namely the quadratic approximation of a function by Taylor's polynomial of degree two. The following picture shows both the tangent plane and this quadratic approximation for two distinct points and the function fix, y) — sin(x) cos(y).
8.12. Taylor's theorem. The multidimensional version of Tay-lor's theorem is once again an example of a mathemat-ical statement where the most difficult part is finding ' ] '\MXJ the right formulation. The proof is quite simple then.
475
CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES
8.42. Let
F(x, y, z) = sin(xy) + sin(yz) + sin(xz).
Show that the equation F{x, y,z) =0 implicitly defines a function z(x, y) : R2 -» R on a neighborhood of the point [tt, 1,0] e I3 so that F(x, y, zix, y)) = 0.
Determine zxin, 1) and zy(7t, 1).
Solution. We will calculate Fz = y cos(yz)+x cos(xz), Fz(tc, 1, 0) = tt + 1 7^ 0, and the function z(x, y) is defined by the equation F(x, y, z(x, y)) = 0 on a neighborhood of the point [tt, 1, 0]. In order to find the values of the wanted partial derivatives, we first need to calculate the values of the remaining partial derivatives of the function F at the point [tt, 1,0].
Fx(x, y, z) Fy(x, y, z)
y cos(xy) + zcos(xz) Fx(jv, 1, 0) x cos(xy) + z cos(yz) Fy(jv, 1, 0)
-1,
-TT,
odkud
zxin, 1)
Zy(Tt, 1)
1
F An, 1,0) Fz(it, 1,0)
FyJTt, 1,0) _ _
Fz(it, 1,0) ~ tt + ť
Tt + 1
7t
□
8.43. Having the mapping F : R3 ^ R2, F(x,y,z) = (f(x,y,z),g(x,y,z)) = (exsmy,xyz), show that the equation F(x, ci(x), c2(x)) = (0,0) defines a curve c : R -» M2 on a neighborhood of the point [1, Tt, 1]. Determine the tangent vector to this curve at the point 1.
Solution. We will calculate the square matrix of the partial derivatives of the mapping F with respect to y and z;.
H(x,y,z) = [fy fz
8y 8z
Hence, H(l,jt, 1)
-1
1
x cos y ex sin y 0
xz, xy
and det#(l, tt, 1)
-TT
ŕ o.
Now, it follows from the implicit mapping theorem (see 8.18) that the equation Fix, c\{x), c2(x)) = (0, 0) on a neighborhood of the point [1, tt, 1] determines a curve (ci(x), c2(x)) defined on a neighborhood of the point [1, tt]. In order to find its tangent vector at this point, we need to determine the )column) vector (fx, gx) at this point:
fx
sin y e
yz
.x sin y
fxihTT, 1)
8 Ahn, 1)
The wanted tangent vector is thus
(cúAh (c2)Ah)
fy(hn,l) fz(hn,l) ^(l,7r,l) gz(hn,\)
-1 0
1 TT
TT
1 0
fAhn,\ 8x(\,n, 1)
0
□
We will proceed in the direction mentioned above, and we will introduce a notation for the particular parts of Dkf approximations of higher orders for functions /:£„—>• R". It will alwyas be ^-linear expressions in the increases, and we will be interested only in their enumeration at a &-tuple of same values.
We have already discussed the differential D1 f — df (the first order) and the Hessian D2f — Hf (the second order). In general, for functions /:£„—>• R, points x — (x\,..., x2) e En, and increases i; — (§i, ...,§„), we set
Dkf(x)(v)
E
dkf
l• R be a k-times differentiable function in a neighborhood Og(x) of a point x € En. For every increase v € W of size || v || < ^, exzsta a number 6, 0 < 0 < 1, swcft
/(x + u) = /(x) + Dlf(x)(v) + ±D2f(x)(v)+
1
(*- 1)!
2! k!
Proof. For an increase i; e M", we consider the parametrized jijf'straight line c(f) — x + tv in E„, and we examine the function • R defined by the composition (pit) — |■ / o c(t). Taylor's theorem for univariate functions claims that (see Theorem 6.4)
(Pit) = (piO) + /(xq), respectively. If equality holds for no x / xo in the previous inequalities, we talk about a strict extrémům.
For the sake of simplicity, we will suppose that our function / has continuous both first-order and second-order partial derivatives on its domain. A necessary condition for existence of an extrémům at a point xq is that the differential be zero at this point, i. e., df(xo) — 0. Indeed, if df(xo) / 0, then there is a direction v in which we have dvf(xo) / 0. However, then the function value is increasing at one side of the point xo along the line xo + tv and it is decreasing on the other side, see (5.32).
An interior point x e En of the domain of a function / at which the differential df(x) is zero is called a stationary point of the function f.
All
CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES
extrema. Further, inside every eighth of the sphere given by the coordinate planes, there may or may not be another extremum. The particular quadrants can be easily parametrized, and the function h (considered a function of two parameters) can be analyzed by standard means (or we can have it drawn in Maple, for example).
Actually, solving the system (no matter whether algebraically or in Maple again) leads to a great deal of stationary points. Besides the six points we have already talked about (two of the coordinates equal to zero and the other to ±1) and which have a = ±|, there are also the points
y 3 3 3 J
for example, where a local extremum indeed occurs.
If we restrict our interest to the points of the circle K, we must give another function G another free parameter rj representing the gradient coefficient. This leads to the bigger system
0 0 0
0 = x2 + y2 +z2- L 0 = x+y + z.
3x2 — 2Xx - V,
3/ - 2ky - V,
3z2 -2Xz - V,
However, since a circle is also a compact set, h must have both a global minimum and maximum on it. Further analysis is left to the reader. □
f(x,y,z) = 1. If so, find
8.46. Determine whether the function / : R3 -» x2 y has any extrema on the surface 2x2 + 2^ + z2 these extrema and determine their types.
Solution. Since we are interested in extrema of a continuous function on a compact set (ellipsoid) - it is both closed and bounded in R3 - the given function must have both a minimum and maximum on it. Moreover, since the constraint is given by a continuously differentiable function and the examined function is differentiable, the extrema must occur at stationary points of the function in question on the given set. We can build the following system for the stationary points:
2xy x2 0
4kx, 4ky, 2kz.
This system is satisfied by the points [± , , 0] and [± , — , 0]. The function takes on only two values at these four stationary points. Ir follows from the above that the first and second stationary points are maxima of the function on the given ellipsoid, while the other two are minima. □
8.47. Decide whether the function / : R3 -» R, f(x, y, z) = z -xy2 has any minima and maxima on the sphere
2,2,2
x +y +z
1.
We will again, for a while, work with a simple function in E2 in order to illustrate our conclusions directly. Let us consider the function fix, y) — sin(x) cos(y) which has been discussed and caught in many pictures, namely in paragraphs 8.9 a 8.8.
The shape of this function resembles well-known egg plates, so it is apparent that we can find a lot of extrema, but also many more stationary point which, in fact, will not be extrema (the little "saddles" noticeable in the picture).
Therefore, let is calculate the first derivatives, and then the necessary second-order ones:
fxix, y) — cos(x) cos(y), fy(x, y) — - sin(x) sin(y),
and both derivatives will be zero for two sets of points
(1) cos(x) — 0, sin(y) — 0, that is (x, y) — i^^-n, £n), for any t,leZ
(2) cos(y) — 0, sin(x) — 0, that is (x, y) — (kit, 2^-n), for any t,leZ.
The second partial derivatives are
Hfix,y) =(f" ffxy)ix,y)
\Jxy Jyy/
- sin(x) cos(y) — cos(x) sin(y)
- cos(x) sin(y) — sin(x) cos(y)
We thus get the following Hessians in our two sets of stationary points:
If so, determine them.
(1) Hfikn + j, £n) — ± ^ ^j, where the sign — occurs
when k and £ have the same parity (remainder upon division by two), and the sign + occurs in the other case;
(2) Hfikn, In + j) — ± ^ ^j, where, again, the sign — occurs occurs when k and £ have the same parity, and the sign + occurs in the other case;
Now, if we look at the proposition of Taylor's theorem for order k — 2, we get, in a neighborhood of one of the stationary points
(*o, yo),
fix, y) = f(x0, yo)+ 1
+ 2Hf(x° + °(x ~ x°)' y° + e(y ~ yo))(x - xo, y - yo),
where Hf is now considered a quadratic form evaluated at the increase (x — xo, y — yo). Since the Hessian of our function is continuous (i. e., continuous partial derivatives up to order two, inclusive) and the matrices of the Hessian are non-degenerate, the local maximum occurs if and only if our point (xo, yo) belongs to the former group with k and £ of the same parity. On the other
478
CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES
Solution. We are looking for solutions of the system
x = —ky2, y = —2kxy, z = k.
2j. The first
The second equation implies that either y = 0 or x = possibility leads to the points [0, 0, 1], [0, 0, —1]. The second one cannot be satisfied (substituting into the equation of the sphere, we get the equation
1 1 ,
777 + -r, +k = L 4k2 2k2
which has no solution. The function has a maximum and minimum, respectively, at the two computed points on the given sphere. □
8.48. Determine whether the function / : R3 -» R, f(x, y, z) = xyz, has any extrema on the ellipsoid given by the equation
g(x,y,z) = kx2 + lf + z2 = 1, k, I e R+.
If so, calculate them.
Solution. First, we build the equations which must be satisfied by the stationary points of the given function on the ellipsoid:
dx dx
dy
dy
yz
xz.
xy
2Xkx,
2Xly,
2Xz.
dz dz
We can easily see that the equation can only be satisfied by a triple of non-zero numbers. Dividing pairs of equations and substituting into the ellipse's equation, we get eight solutions, namely the stationary points x = ±77^, y = ±-^j, z = ±7^5- However, the function / takes on only two distinct values at these eight points. Since it is continuous and the given ellipsoid is compact, / must have both a maximum and minimum on it. Moreover, since both / and g are continuously differentiable, these extrema must occur at stationary points. Therefore, it must be that four of the computed stationary points are local maxima of the function (of value 777=) and the other four are
minima (of value
3V3«
□
8.49. Determine the global extrema of the function
f(x, y) = x2 - 2y2 + 4xy - 6x - 1 on the set of points [x, y] that satisfy the inequalities (8.1) x>0, y>0, y<-x+3.
Solution. We are given a polynomial with continuous partial derivatives on a compact (i. e. closed and bounded) set. Such a function necessarily has both a minimum and a maximum on this set, and this can happen only at stationary points or on the boundary. Therefore, it suffices to find stationary points inside the set and the ones on a finite number of open (or singleton) parts of the boundary, then evaluate / at these points and choose the least and the greatest values. Notice that the set of points determined by the inequalities (||8.11|) is clearly a triangle with vertices at [0, 0], [3, 0], [0, 3].
hand, if the parities are different, then the point from the former group happens to be a point of a local minimum.
On the other hand, the Hessian of the latter group of points is always positive at some increases and negative at other ones. Therefore, the entire function / behaves in this manner in a small neighborhood of the given point.
In order to formulate the general statement about the Hessian and the local extrema at stationary points, we have to remember the discussion about quadratic forms from the paragraphs 4.31^1.32 in the chapter on affine geometry. There, we introduced the following attributes for a quadratic form h : En -> M:
• positively definite iff h (w) > 0 for all u ^ 0
• positively semidefinite iff h(u) > 0 for all u e V
• negatively definite iff h(u) < 0 for all u ^ 0
• negatively semidefinite iff h(u) < 0 for all u e V
• indefinite iff h(u) > 0 and f(v)<0 for appropriate u,v e V.
We also invented some methods which allow us to find out whether a given form has any of these properties.
The Taylor expansion with remainder immediately yields the following proposition:
Theorem. Let f : En -> Rbe a twice continuously differentiable function and x e En be a stationary point of the function f. Then
(1) f has a strict local minimum at x if Hf(x) is positively definite,
(2) f has a strict local minimum at x if H fix) is negatively definite,
(3) f does not have an extremum at x if H fix) is indefinite.
Proof. The Taylor second-order expansion with remainder applied to out function f(x\,..., x„), an arbitrary point x — (xi,..., x„), and any increase 1; — (vi,..., v„), such that both x and x + v lie in the domain of the function /, says that
f(x + v) = f(x) + df ix)iv) + \nfix + 0 ■ v)iv)
for an appropriate real number 6, 0 < 6 < 1. Since we suppose that the differential is zero, we get
fix + v) = fix) + l-Hfix + 6 ■ v)iv).
By our assumption, the quadratic form Hf(x) is continuously dependent on the point x, and the definiteness or indefiniteness of quadratic forms can be determined by the sign of the major subde-terminants of the matrix Hf, see Sylvester's criterion in paragraph 4.32. However, the determinant itself is a polynomial expression in the coefficients of the matrix, hence a continuous function. Therefore, the non-zeroness and signs of the examined determinants are the same in a sufficiently small neighborhood of the point x as at the point x itself.
In particular, for positively definite Hf(x), we have guaranteed that, at a stationary point x, f(x + v) > f(x) for sufficiently small 1;, so this is a sharp minimum of the function / at the point x. The case of negative definiteness is analogous. If Hf(x) is indefinite, then there are directions 1;, w in which fix + v) > fix) and fix + w) < fix), so there is no extremum at the stationary point in question. □
479
CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES
Let us determine the stationary points inside this triangle as the solution of the equations fx = 0, fy = 0. Since
fx(x, y) = 2x+Ay- 6, fy(x, y) = Ax - Ay, these equations are satisfied only by the point [1, 1]. The boundary suggests itself to be expressed as the union of three line segments given by the choice of pairs of vertices. First, we consider x = 0, y € [0, 3], when fix, y) = —2y2 — 1. However, we know the graph of this (univariate) function on the interval [0, 3] It is thus not difficult to find the points at which global extrema occur. They are the marginal points [0, 0], [0, 3]. Similarly, we can consider y = 0, x € [0, 3], also obtaining the marginal points [0, 0], [3,0]. Finally, we get to the line segment y = — x + 3, x e [0, 3]. Making some rearrangements, we get
f(x, y) = f{x, -x + 3) = -5x2 + 18* - 19, x e [0, 3]. We thus need to find the stationary points of the polynomial p(x) = —5x2 + 18* — 19 from the interval [0, 3]. The equation p'ix) = 0, i. e., — 10* + 18 = 0, is satisfied by x = 9/5. This means that in the last case, we obtained one more point (besides the marginal points), namely [9/5, 6/5], where a global extremum may occur. Altogether, we have these points as "suspects":
[1,1], [0,0], [0,3], [3,0], [f,f] with function values
-A, -1, -19, -10, respectively. We can see that the function / takes on the greatest value -1 at the point [0, 0] and the least value -19 at the point [0, 3]. □
8.50. Determine whether the function / : R3 -» R, fix, y, z) = fz has any extrema on the line segment given by the equations
2x + y + z = 1,
x — y + 2z, = 0 and the constraint x e [—1,2]. If so, find these extrema and determine their types. Justify all of your decisions.
Solution. We are looking for the extrema of a continuous function on a compact set. Therefore, the function must have both a minimum and a maximum on this set, and this will happen either at the marginal points of the segment or at those where the gradient of the examined function is a linear combination of the gradients of the functions that give the constraints. First, let us look for the points which satisfy the gradient condition; together with the constraints, this yields the system
0 = 2k + l,  2yz = k - l,  y² = k + 2l,
2x + y + z = 1,  x - y + 2z = 0.
The solutions are [x, y, z] = [2/3, 0, -1/3] and [x, y, z] = [4/9, 2/9, -1/9] (of course, the variables k and l can also be computed, but we are not interested in them). The marginal points of the given line segment are [-1, 5/3, 4/3] and [2, -4/3, -5/3]. Considering these four points, the function takes on the greatest value at the first marginal point (f(x, y, z) = 100/27), which is its maximum on the given segment, and it
Let us notice that the theorem yields no result if the Hessian of the examined function is degenerate, yet not indefinite, at the point in question. The reason is again the same as in the case of univariate functions. In these cases, there are directions in which both the first and second derivatives vanish, so at this level of approximation, we cannot determine whether the function behaves like t³ or like ±t⁴ until we calculate the higher-order derivatives in the necessary directions at least.
At the same time, we can notice that even at those points where the differential is non-zero, the definiteness of the Hessian Hf(x) has similar consequences as the non-vanishing of the second derivative of a univariate function. Indeed, for a function f : Rⁿ → R, the expression
z(x + v) = f(x) + df(x)(v)
defines a tangent hyperplane to the graph of the function f in the space Rⁿ⁺¹, so Taylor's theorem of order two with remainder, as used in the proof, shows that when the Hessian is positively definite, all the values of the function f in a sufficiently small neighborhood of the point x lie above the values of the tangent hyperplane, i. e., the whole graph is above the tangent hyperplane in a sufficiently small neighborhood. In the case of negative definiteness, it is the other way round. Finally, when the Hessian is indefinite, the graph of the function goes from one side of the hyperplane to the other, but this happens, in general, along objects of lower dimension in the tangent hyperplane, so we have no straightforward generalization of inflexion points.
8.14. The differential of mappings. The concepts of a derivative and a differential can be easily extended to mappings F : Eₙ → Eₘ. Having selected the Cartesian coordinate system on both sides, this mapping is an ordinary m-tuple
F(x₁, …, xₙ) = (f₁(x₁, …, xₙ), …, fₘ(x₁, …, xₙ))
of functions fᵢ : Eₙ → R. We say that F is a differentiable or k-times differentiable mapping iff the corresponding property is shared by all the functions f₁, …, fₘ.
Differential and Jacobian matrix
The differentials dfᵢ(x) of the particular functions fᵢ give a linear approximation of the increases of their values for the mapping
F(x₁, …, xₙ) = (f₁(x₁, …, xₙ), …, fₘ(x₁, …, xₙ)).
Therefore, we can expect that they will also give a coordinate expression of the linear mapping D¹F(x) : Rⁿ → Rᵐ between the direction spaces which linearly approximates the increases of our mapping. The resulting matrix, whose rows are the differentials of the particular coordinate functions,
D¹F(x) = (df₁(x); df₂(x); …; dfₘ(x)) = (∂fᵢ/∂xⱼ(x)), i = 1, …, m, j = 1, …, n,
is called the Jacobian matrix of the mapping F at the point x. The linear mapping D¹F(x) defined on the increases v = (v₁, …, vₙ) by the identically denoted Jacobian matrix is called the differential of the mapping F at the point x.
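As a small illustration (our own addition, not part of the text), the Jacobian matrix of a concrete mapping can be computed symbolically; we assume Python with sympy and choose the mapping (r, φ) ↦ (r cos φ, r sin φ):

import sympy as sp

r, phi = sp.symbols('r phi', positive=True)
F = sp.Matrix([r*sp.cos(phi), r*sp.sin(phi)])   # F : R^2 -> R^2
J = F.jacobian([r, phi])                        # the matrix D^1 F
print(J)                # [[cos(phi), -r*sin(phi)], [sin(phi), r*cos(phi)]]
print(sp.simplify(J.det()))                     # r

The determinant r is the familiar Jacobian of the polar transformation used in the examples of this chapter.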
takes the least value f(x, y, z) = -80/27 at the second marginal point, which is thus its minimum there. □
8.51. Find the maximal and minimal values of the polynomial
p(x, y) = 4x³ - 3x - 4y³ + 9y
on the set
M = {[x, y] ∈ R²; x² + y² ≤ 1}.
Solution. This is again the case of a polynomial on a compact set; therefore, we can restrict our attention to stationary points inside M and to suitable points on the boundary of M. However, the only solutions of the equations
px(x, y) = 12x² - 3 = 0,  py(x, y) = -12y² + 9 = 0
are the points
[1/2, √3/2], [1/2, -√3/2], [-1/2, √3/2], [-1/2, -√3/2],
which all lie on the boundary of M. This means that p has no extremum inside M. Now, it suffices to find the maximum and minimum of p on the unit circle k : x² + y² = 1. The circle k can be expressed parametrically as
x = cos t, y = sin t, t ∈ [-π, π].
Thus, instead of looking for the extrema of p on M, we are now seeking the extrema of the function
f(t) := p(cos t, sin t) = 4cos³t - 3cos t - 4sin³t + 9sin t
on the interval [-π, π]. For t ∈ [-π, π], we have
f'(t) = -12cos²t sin t + 3sin t - 12sin²t cos t + 9cos t.
In order to determine the stationary points, we must express the function f' in a form from which we can find the intersections of its graph with the horizontal axis. To this purpose, we use the identity
1/cos²t = 1 + tan²t,
which is valid provided both sides are well-defined. We thus obtain
f'(t) = cos³t [-12tan t + 3(tan t + tan³t) - 12tan²t + 9(1 + tan²t)]
for t ∈ [-π, π] with cos t ≠ 0. However, this condition does not exclude any stationary points since sin t ≠ 0 whenever cos t = 0. Therefore, the stationary points of f are those points t ∈ [-π, π] for which
-4tan t + tan t + tan³t - 4tan²t + 3 + 3tan²t = 0.
The substitution s = tan t leads to
s³ - s² - 3s + 3 = 0, i. e., (s - 1)(s - √3)(s + √3) = 0.
Then, the values s = 1, s = √3, s = -√3 respectively correspond to
t ∈ {-3π/4, π/4},  t ∈ {-2π/3, π/3},  t ∈ {-π/3, 2π/3}.
Now, we evaluate the function f at each of these points as well as at the marginal points t = -π, t = π. Sorting them, we get
f(-π/3) = -1 - 3√3 < f(-3π/4) = -3√2 < f(-2π/3) = 1 - 3√3 < f(±π) = 1 < f(π/3) = -1 + 3√3 < f(π/4) = 3√2 < f(2π/3) = 1 + 3√3.
Therefore, the polynomial p takes its maximal value 1 + 3√3 on M at the point [-1/2, √3/2] and its minimal value -1 - 3√3 at the point [1/2, -√3/2]. □

Composing with an increasing function does not change the points of extrema (of course, it can change the extremal values). However, we know that the function g gives the distance of a point [x, y] from the point [2, 0]. Since the set M is clearly a square with vertices [1, 0], [0, 1], [-1, 0], [0, -1], the point of M that is closest to [2, 0] is the vertex [1, 0], while the most distant one is [-1, 0]. Altogether, we have obtained that the minimal value of f occurs at the point [1, 0] and the maximal one at [-1, 0]. □
8.53. Compute the local extrema of the function y = f(x) given implicitly by the equation
3x² + 2xy + x = y² + 3y + 5/4,  [x, y] ∈ R² \ {[x, x - 3/2]; x ∈ R}.
Solution. In accordance with the theoretical part (see 8.18), let us denote
F(x, y) = 3x² + 2xy + x - y² - 3y - 5/4,  [x, y] ∈ R² \ {[x, x - 3/2]; x ∈ R},
and calculate the derivative
f'(x) = -Fx/Fy = -(6x + 2y + 1)/(2x - 2y - 3).
We can see that this derivative is continuous on the whole set in question. In particular, the function f is defined implicitly on this set (the denominator is non-zero there).
A local extremum may occur only for those x, y which satisfy f' = 0, i. e., 6x + 2y + 1 = 0. Substituting y = -3x - 1/2 into the equation F(x, y) = 0, we obtain -12x² + 6x = 0, which leads to
[x, y] = [0, -1/2],  [x, y] = [1/2, -2].
We can also easily compute that
y'' = (y')' = -[(6 + 2y')(2x - 2y - 3) - (6x + 2y + 1)(2 - 2y')] / (2x - 2y - 3)².
Substituting x = 0, y = -1/2, y' = 0 and x = 1/2, y = -2, y' = 0, we obtain
y'' = -6·(-2)/(-2)² = 3 > 0 for [x, y] = [0, -1/2]
and
y'' = -6·2/2² = -3 < 0 for [x, y] = [1/2, -2],
mapping would exist. The Cartesian image of lines with constant coordinate r or φ in polar coordinates consists of circles centered at the origin and of rays emanating from it, respectively.

The chain rule
Theorem. Let F : Eₙ → Eₘ and G : Eₘ → Eᵣ be two differentiable mappings, where the domain of G contains the whole image of F. Then, the composite mapping G ∘ F is also differentiable, and its differential at any point from the domain of F is given by the composition of differentials,
D¹(G ∘ F)(x) = D¹G(F(x)) ∘ D¹F(x).
The corresponding Jacobian matrix is given by the product of the corresponding Jacobian matrices.
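The statement can be tested symbolically. The sketch below (our own addition, assuming Python with sympy; the mappings F and G are arbitrary hypothetical choices) verifies that the Jacobian matrix of G ∘ F equals the product of the Jacobian matrices:

import sympy as sp

x, y, u, v = sp.symbols('x y u v')
F = sp.Matrix([x*y, x + y**2])            # F : R^2 -> R^2
G = sp.Matrix([sp.sin(u) + v, u*v])       # G : R^2 -> R^2

JF = F.jacobian([x, y])
JG_at_F = G.jacobian([u, v]).subs([(u, F[0]), (v, F[1])])
GF = G.subs([(u, F[0]), (v, F[1])])       # the composition G o F
print(sp.simplify(GF.jacobian([x, y]) - JG_at_F*JF))   # zero matrix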
Proof. In paragraph 8.5 and in the proof of Taylor's theorem, we derived how the differentiation of mappings composed from functions and curves behaves. This proves the special cases of this theorem for n = r = 1. The general case can be proved analogously; we just have to work with more vectors now.
Let us fix an arbitrary increase v and calculate the directional derivative of the composition G ∘ F at a point x ∈ Eₙ. This actually means determining the differentials of the particular coordinate functions of the mapping G composed with F. For the sake of simplicity, we will write g ∘ F for any one of them.
d_v(g ∘ F)(x) = lim_{t→0} (1/t)(g(F(x + tv)) - g(F(x))).
The expression in parentheses can, from the definition of the differential of g, be expressed as
g(F(x + tv)) - g(F(x)) = dg(F(x))(F(x + tv) - F(x)) + α(F(x + tv) - F(x)),
where α is a function defined on a neighborhood of the point F(x) which is continuous and satisfies lim_{v→0} (1/‖v‖)α(v) = 0. Substitution into the equality for the directional derivative yields
d_v(g ∘ F)(x) = lim_{t→0} (1/t)(dg(F(x))(F(x + tv) - F(x)) + α(F(x + tv) - F(x)))
= dg(F(x))( lim_{t→0} (1/t)(F(x + tv) - F(x)) ) + lim_{t→0} (1/t) α(F(x + tv) - F(x))
= dg(F(x)) ∘ D¹F(x)(v) + 0,
We have thus proved that the implicitly given function has a strict local minimum at the point x = 0 and a strict local maximum at x = 1/2.
□
8.54. Find the local extrema of the function z = f(x, y) given on the maximum possible set by the equation
(8.2)  x² + y² + z² - xz - yz + 2x + 2y + 2z - 2 = 0.
Solution. Differentiating (8.2) with respect to x and y gives
2x + 2z·zx - z - x·zx - y·zx + 2 + 2zx = 0,
2y + 2z·zy - x·zy - z - y·zy + 2 + 2zy = 0.
Hence we get that
(8.3)  zx = fx(x, y) = (z - 2x - 2)/(2z - x - y + 2),  zy = fy(x, y) = (z - 2y - 2)/(2z - x - y + 2).
We can notice that the partial derivatives are continuous at all points where the function f is defined. This implies that the local extrema can occur only at stationary points. These points satisfy
zx = 0, i. e., z - 2x - 2 = 0,
zy = 0, i. e., z - 2y - 2 = 0.
We thus have two equations, which allow us to express the dependency of x and y on z. Substituting into (8.2), we obtain the points
[x, y, z] = [-3 + √6, -3 + √6, -4 + 2√6],
[x, y, z] = [-3 - √6, -3 - √6, -4 - 2√6].
Now, we need the second derivatives in order to decide whether local extrema really occur at these points. Differentiating zx in (8.3) with respect to x, we obtain
zxx = fxx(x, y) = [(zx - 2)(2z - x - y + 2) - (z - 2x - 2)(2zx - 1)] / (2z - x - y + 2)²,
and with respect to y,
zxy = fxy(x, y) = [zy(2z - x - y + 2) - (z - 2x - 2)(2zy - 1)] / (2z - x - y + 2)².
We need not calculate zyy since the variables x and y are interchangeable in (8.2) (if we swap x and y, the equation is left unchanged). Moreover, the x- and y-coordinates of the considered points are the same; hence fyy = fxx there. Now, we evaluate these derivatives at the stationary points:
fxx(-3 + √6, -3 + √6) = fyy(-3 + √6, -3 + √6) = -1/√6,
fxy(-3 + √6, -3 + √6) = fyx(-3 + √6, -3 + √6) = 0,
fxx(-3 - √6, -3 - √6) = fyy(-3 - √6, -3 - √6) = 1/√6,
fxy(-3 - √6, -3 - √6) = fyx(-3 - √6, -3 - √6) = 0.
As for the Hessian, we have
Hf(-3 + √6, -3 + √6) = ( -1/√6, 0 ; 0, -1/√6 ),  Hf(-3 - √6, -3 - √6) = ( 1/√6, 0 ; 0, 1/√6 ),
so the first Hessian is negative definite and the second one is positive definite: the implicitly given function has a strict local maximum at the former point and a strict local minimum at the latter one.
where we made use of the properties of the function α and the fact that linear mappings between finite-dimensional spaces are always continuous.
Thus, we have proved the theorem for the particular coordinate functions g₁, …, gᵣ of the mapping G. The whole theorem now follows from matrix multiplication. □
Now, we can illustrate, by a simple example, the usage of our concept of transformations and the theorem about differentiation of composite mappings. We have seen that the polar coordinates are obtained from the Cartesian ones by the transformation F : R² → R² which, in coordinates (x, y) and (r, φ), is written as follows (for instance, on the domain of all points in the first quadrant except for the points having x = 0):
r = √(x² + y²),  φ = arctan(y/x).
Consider a function g : E₂ → R which can be expressed in the polar coordinates as g(r, φ).
In fact, we are considering a one-dimensional quadratic form whose positive (negative) definiteness at a stationary point means that there is a minimum (maximum) at that point. Realizing that the stationary points had x = 2k, y = 2k, mere substitution yields
d²L(√2/2, √2/2) = -4√2 dx²,  d²L(-√2/2, -√2/2) = 4√2 dx²,
which means that there is a strict local maximum of the function f at the point [√2/2, √2/2], while at the point [-√2/2, -√2/2], there is a strict local minimum. The corresponding values are
(8.5)  f(√2/2, √2/2) = 2√2,  f(-√2/2, -√2/2) = -2√2.
Now, we will demonstrate a quicker way to obtain the result. We know (or can easily calculate) the second partial derivatives of the function L, i. e., the Hessian with respect to the variables x and y:
HL(x, y) = ( 2/x³ - 6k/x⁴, 0 ; 0, 2/y³ - 6k/y⁴ ).
inverse function is then the multiplicative inverse of the derivative of the original function.
Interpreting this situation for a mapping E₁ → E₁ and linear mappings R → R as their differentials, the non-vanishing is a necessary and sufficient condition for the differential to be invertible. In this way, we obtain a statement which is valid for finite-dimensional spaces in general:
The inverse mapping theorem
Theorem. Let F : Eₙ → Eₙ be a differentiable mapping on a neighborhood of a point x₀ ∈ Eₙ, and let the Jacobian matrix D¹F(x₀) be invertible. Then in some neighborhood of the point x₀, the inverse mapping F⁻¹ exists, and its differential at the point F(x₀) is the inverse mapping to the differential D¹F(x₀), i. e., it is given by the inverse matrix to the Jacobian matrix of the mapping F at the point x₀.
Proof. First, we should try to verify that the theorem makes sense and is expectable. If we supposed that the inverse mapping existed and was differentiable at the point F(x₀), the differentiation of composite functions would enforce the formula
id_{Rⁿ} = D¹(F⁻¹ ∘ F)(x₀) = D¹(F⁻¹)(F(x₀)) ∘ D¹F(x₀),
which verifies the formula at the end of the theorem. Therefore, we know right from the beginning which differential of F⁻¹ to look for.
In the next step, we will suppose that the inverse mapping F⁻¹ exists in a neighborhood of the point F(x₀) and that it is continuous. We are to verify the existence of the differential. Since F is differentiable in a neighborhood of x₀, it follows that
F(x) - F(x₀) - D¹F(x₀)(x - x₀) = α(x - x₀)
with a function α satisfying lim_{v→0} (1/‖v‖)α(v) = 0. To verify the approximation properties of the linear mapping (D¹F(x₀))⁻¹, it suffices to calculate the following limit for y = F(x) approaching y₀ = F(x₀):
lim_{y→y₀} (1/‖y - y₀‖)(F⁻¹(y) - F⁻¹(y₀) - (D¹F(x₀))⁻¹(y - y₀)).
Substitution into the previous equality gives
lim_{y→y₀} (1/‖y - y₀‖)(x - x₀ - (D¹F(x₀))⁻¹(D¹F(x₀)(x - x₀) + α(x - x₀)))
= -lim_{y→y₀} (1/‖y - y₀‖)(D¹F(x₀))⁻¹(α(x - x₀))
= -(D¹F(x₀))⁻¹( lim_{y→y₀} (1/‖y - y₀‖) α(x - x₀) ),
where the last equality follows from the fact that linear mappings between finite-dimensional spaces are always continuous and, thanks to the invertibility of the differential, performing it before the limit process has no impact upon the existence of the limit.
The evaluation
HL(√2/2, √2/2) = ( -2√2, 0 ; 0, -2√2 ),  HL(-√2/2, -√2/2) = ( 2√2, 0 ; 0, 2√2 )
then tells us that the quadratic form is negative definite at the former stationary point (there is a strict local maximum) and positive definite at the latter one (there is a strict local minimum).
We should be aware of a potential trap in this "quicker" method in the case when we obtain an indefinite form (matrix). Then, we cannot conclude that there is no extremum at that point: since we have not included the constraint (which we did when computing d²L), we are considering a more general situation. The restriction of the function f to the given set is a curve, which can be viewed as a univariate function, and this must correspond to a one-dimensional quadratic form. □
8.56. Find the global extrema of the function
f(x, y) = 1/x + 1/y,  x ≠ 0, y ≠ 0,
on the set of points that satisfy the equation 1/x² + 1/y² = 4.
Solution. This exercise illustrates that looking for global extrema may be much easier than looking for local ones (cf. the above exercise), even in the case when the function values are considered on an unbounded set. First, we determine the stationary points (8.4) and the values (8.5) the same way as above. Let us emphasize that we are looking for the function's extrema on a set that is not compact, so we cannot make do with merely evaluating the function at the stationary points. The reason is that the function f might have no extremum on the considered set at all; its range might be an open interval. However, we will show that this is not the case here.
Let us thus consider |x| ≥ 10. From the constraint we always have 1/y² ≤ 4, so the equation 1/x² + 1/y² = 4 can be satisfied only by those values y for which |y| ≥ 1/2. We have thus obtained the bounds
-2√2 < -1/10 - 2 ≤ f(x, y) ≤ 1/10 + 2 < 2√2, if |x| ≥ 10.
At the same time (interchanging x and y leads to the same task),
-2√2 < -1/10 - 2 ≤ f(x, y) ≤ 1/10 + 2 < 2√2, if |y| ≥ 10.
Hence we can see that the function f must have global extrema on the considered set, and this must happen inside the square ABCD with vertices A = [-10, -10], B = [10, -10], C = [10, 10], D = [-10, 10].
The intersection of the "hundred times reduced" square with vertices A' = [-1/10, -1/10], B' = [1/10, -1/10], C' = [1/10, 1/10], D' = [-1/10, 1/10] with the given set is clearly empty. Therefore, the global extrema occur at points of the compact set bounded by these two squares. Since f is continuously differentiable on this set, the global extrema can occur only at stationary points. We thus must have
f_max = f(√2/2, √2/2) = 2√2,  f_min = f(-√2/2, -√2/2) = -2√2.
Let us notice that we are almost done with the proof. The limit at the end of our expression is, thanks to the properties of the function α, zero whenever the magnitudes ‖F(x) - F(x₀)‖ are greater than C‖x - x₀‖ for some constant C > 0. This is a bit stronger property than F⁻¹ being continuous; in the literature, this property of a function is called Lipschitz continuity. So, now it remains "merely" to prove the existence of a Lipschitz-continuous inverse mapping to the mapping F.
To simplify the reasonings to come, we transform the general case to a slightly simpler statement. Especially, we can achieve x₀ = 0 ∈ Rⁿ and y₀ = F(x₀) = 0 ∈ Rⁿ by a convenient choice of Cartesian coordinates, without loss of generality.
Composing the mapping F with any linear mapping G yields a differentiable mapping again, and we know how this changes the differential. The choice G(x) = (D¹F(0))⁻¹(x) gives D¹(G ∘ F)(0) = id_{Rⁿ}. Therefore, we can assume that
D¹F(0) = id_{Rⁿ}.
Now, having these assumptions, let us consider the mapping K(x) = F(x) - x. This mapping is differentiable, too, and its differential at 0 is apparently zero.
By the Taylor expansion with remainder of the particular coordinate functions Kᵢ and the definition of the Euclidean distance, we get, for any continuously differentiable mapping K in a neighborhood of the origin of Rⁿ, the bound
‖K(x) - K(y)‖ ≤ C√n ‖x - y‖,
where C is bounded by the maximum of the absolute values of the partial derivatives in the Jacobian matrix of the mapping K in the neighborhood in question.
Since the differential of the mapping K at the point x₀ = 0 is zero in our case, we can, selecting a sufficiently small neighborhood U of the origin, achieve the bound
‖K(x) - K(y)‖ ≤ (1/2)‖x - y‖.
Further, substituting the definition K(x) = F(x) - x and invoking the triangle inequality ‖(u - v) + v‖ ≤ ‖u - v‖ + ‖v‖, i. e., ‖u‖ - ‖v‖ ≤ ‖u - v‖, we get
‖y - x‖ - ‖F(x) - F(y)‖ ≤ ‖F(x) - F(y) + y - x‖ = ‖K(x) - K(y)‖ ≤ (1/2)‖x - y‖.
Hence, finally,
(1/2)‖x - y‖ ≤ ‖F(x) - F(y)‖.
With this bound, we have reached a great advancement: if x ≠ y in our neighborhood U of the origin, then we also must have F(x) ≠ F(y). Therefore, our mapping is bijective onto its image. Let us write F⁻¹ for its inverse defined on the image of U. For this mapping, our bound says that
‖F⁻¹(x) - F⁻¹(y)‖ ≤ 2‖x - y‖,
so this mapping is not only continuous but also Lipschitz-continuous, as we needed in the previous part of the proof.
It immediately follows from this reasoning that a function which has continuous partial derivatives on a compact set is Lipschitz-continuous on it as well.
□
8.57. Determine the maximal and minimal values of the function f(x, y, z) = xyz on the set M given by the conditions
x² + y² + z² = 1,  x + y + z = 0.
Solution. It is not hard to realize that M is a circle. However, for our problem, it is sufficient to know that M is compact, i. e., bounded (by the first condition, it lies on the unit sphere) and closed (the set of solutions of the given equations is closed since if the equations are satisfied by all terms of a converging sequence, then they are satisfied by its limit as well). The function f as well as the constraint functions F(x, y, z) = x² + y² + z² - 1, G(x, y, z) = x + y + z have continuous partial derivatives of all orders (they are polynomials). The Jacobi constraint matrix is
( Fx(x, y, z), Fy(x, y, z), Fz(x, y, z) ; Gx(x, y, z), Gy(x, y, z), Gz(x, y, z) ) = ( 2x, 2y, 2z ; 1, 1, 1 ).
Its rank is reduced (less than 2) if and only if the vector (2x, 2y, 2z) is a multiple of the vector (1, 1, 1), which gives x = y = z, and thus x = y = z = 0 (by the second constraint). However, the set M does not contain the origin. Therefore, we may look for the stationary points using the method of Lagrange multipliers. For
L(x, y, z, λ₁, λ₂) = xyz - λ₁(x² + y² + z² - 1) - λ₂(x + y + z), the equations Lx = 0, Ly = 0, Lz = 0 give
yz - 2λ₁x - λ₂ = 0,
xz - 2λ₁y - λ₂ = 0,
xy - 2λ₁z - λ₂ = 0,
respectively. Subtracting the first equation from the second one and from the third one leads to
xz - yz - 2λ₁y + 2λ₁x = 0,
xy - yz - 2λ₁z + 2λ₁x = 0,
i. e.,
(x - y)(z + 2λ₁) = 0,  (x - z)(y + 2λ₁) = 0.
The last equations are satisfied in these four cases:
x = y, x = z;  x = y, y = -2λ₁;  z = -2λ₁, x = z;  z = -2λ₁, y = -2λ₁;
thus (including the constraint G = 0)
x = y = z = 0;  x = y = -2λ₁, z = 4λ₁;  x = z = -2λ₁, y = 4λ₁;  x = 4λ₁, y = z = -2λ₁.
Except for the first case (which clearly cannot happen), including the constraint F = 0 yields
(4λ₁)² + (-2λ₁)² + (-2λ₁)² = 1,  i. e.,  λ₁ = ±1/(2√6).
Altogether, we get the points
[1/√6, 1/√6, -2/√6], [-1/√6, -1/√6, 2/√6],
[1/√6, -2/√6, 1/√6], [-1/√6, 2/√6, -1/√6],
[-2/√6, 1/√6, 1/√6], [2/√6, -1/√6, -1/√6].
It could seem that we are done with the proof, yet it is not so. To finish the proof completely, we still have to show that the mapping F restricted to a sufficiently small neighborhood is not only bijective, but also that it maps open neighborhoods of zero onto open neighborhoods of zero.
Let us choose δ > 0 so small that the neighborhood V = O_δ(0) lies in U together with its boundary and, at the same time, the Jacobian matrix of the mapping F is invertible on the whole of the closure of V. This surely can be done since the determinant is a continuous mapping. Let B denote the boundary of the set V (i. e., the corresponding sphere). Since B is compact and F is continuous, the function
ρ(x) = ‖F(x)‖
has both a maximum and a minimum on B. Let us denote a = (1/2) min_{x∈B} ρ(x) and consider any y ∈ O_a(0). Of course, a > 0. We want to show that there is at least one x ∈ V such that y = F(x), which will prove the whole inverse mapping theorem.
To this purpose, consider the function (y is our fixed point)
h(x) = ‖F(x) - y‖².
Again, the image of the closure of V under h must have a minimum. First, we show that this minimum cannot occur for x ∈ B. Indeed, we have F(0) = 0, hence h(0) = ‖y‖² < a². At the same time, by our definition of a, the distance of y from F(x) for x ∈ B is at least a for y ∈ O_a(0) (since a was selected to be half the minimum of the magnitude of F(x) on the boundary), so h(x) ≥ a² there. Therefore, the minimum occurs inside V, and it must be at a stationary point z of the function h. However, this means that for all j = 1, …, n, we have
∂h/∂xⱼ(z) = Σ_{i=1}^{n} 2(fᵢ(z) - yᵢ) ∂fᵢ/∂xⱼ(z) = 0.
This system of equations can be considered a system of linear equations with unknowns ξᵢ = fᵢ(z) - yᵢ and coefficients given by twice the Jacobian matrix D¹F(z). For every z ∈ V, such a system has a unique solution, and that is zero, since we suppose that the Jacobian matrix is invertible.
Thus, we have found the wanted point x = z ∈ V satisfying, for all i = 1, …, n, the equality fᵢ(z) = yᵢ, i. e., F(z) = y. □
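The formula for the differential of the inverse can also be checked numerically. The following sketch (our own addition, assuming Python with numpy; the mapping F is a hypothetical example with invertible Jacobian near x₀) approximates D¹(F⁻¹)(F(x₀)) by finite differences of a Newton-based inverse and compares it with the inverse Jacobian matrix:

import numpy as np

def F(p):
    x, y = p
    return np.array([x + y**3, x**3 + y])

def JF(p):
    x, y = p
    return np.array([[1.0, 3*y**2], [3*x**2, 1.0]])

def F_inv(q, guess):
    p = guess.copy()
    for _ in range(50):                   # Newton iteration for F(p) = q
        p = p - np.linalg.solve(JF(p), F(p) - q)
    return p

x0 = np.array([0.5, 0.2])
y0 = F(x0)
h = 1e-6
cols = [(F_inv(y0 + h*e, x0) - F_inv(y0 - h*e, x0))/(2*h) for e in np.eye(2)]
print(np.column_stack(cols))              # numerical D^1(F^{-1})(y0)
print(np.linalg.inv(JF(x0)))              # (D^1 F(x0))^{-1}, should agree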
8.18. The implicit function theorem. Our next goal is to apply the inverse mapping theorem to work with implicitly defined functions. For the beginning, let us consider a differentiable function F(x, y) defined in the plane E₂, and let us look for those points (x, y) at which F(x, y) = 0.
An example of this is the usual (implicit) definition of straight lines and circles:
F(x, y) = ax + by + c = 0,
F(x, y) = (x - s)² + (y - t)² - r² = 0, r > 0.
While in the first case, the dependency given by the first formula is (for b ≠ 0)
y = f(x) = -(a/b)x - c/b
for all x; in the other case, for any point (x₀, y₀) satisfying the equation of the circle and such that y₀ ≠ t (these are the marginal
In literature, there are many examples of mappings which, for instance, continuously and bijectively map a line segment onto a square.
We will not verify that these really are stationary points. The only important thing is that all stationary points are among these six.
We are looking for the global maximum and minimum of the continuous function / on the compact set M. However, the global extrema (we know they exist) can occur only at points of local extrema with respect to M. And the local extrema can occur only at the aforementioned points. Therefore, it suffices to evaluate the function / at these points. Thus we find out that the wanted maximum is
f(-1/√6, -1/√6, 2/√6) = f(-1/√6, 2/√6, -1/√6) = f(2/√6, -1/√6, -1/√6) = 1/(3√6),
while the minimum is
f(1/√6, 1/√6, -2/√6) = f(1/√6, -2/√6, 1/√6) = f(-2/√6, 1/√6, 1/√6) = -1/(3√6).
□
8.58. Find the extrema of the function f : R³ → R, f(x, y, z) = x² + y² + z², on the plane x + y - z = 1 and determine their types.
Solution. We can easily build the equations that describe the linear dependency between the gradient of the examined function and the normal to the constraint plane:
2x = k,  2y = k,  2z = -k,  k ∈ R.
Together with the constraint, the only solution is the point [1/3, 1/3, -1/3]. Further, we can notice that, along any direction lying in the constraint plane (for instance, (1, -1, 0)), the function increases as we move away from this point, since f is a positive definite quadratic form. Therefore, the examined function has a minimum at this point.
Another solution. We will reduce this problem to finding the extrema of a two-variable function on R². Since the constraint is linear, we can express z = x + y - 1. Substituting this into the given function then yields a real-valued function of two variables:
f(x, y) = x² + y² + (x + y - 1)² = 2x² + 2xy + 2y² - 2x - 2y + 1.
Setting both partial derivatives equal to zero, we get the linear equations
4x + 2y - 2 = 0,  4y + 2x - 2 = 0,
whose only solution is the point [1/3, 1/3]. Since f is a quadratic function whose quadratic part is positive definite, it has a (global) minimum at this stationary point. Then, we can get the corresponding point [1/3, 1/3, -1/3] in the constraint plane from the linear dependency of z. □
8.59. Find the extrema of the function f(x, y, z) = x + y on the circle given by the equations x + y + z = 1 and x² + y² + z² = 4.
Solution. The "suspects" are those points which satisfy
(1, 1, 0) = k·(1, 1, 1) + l·(x, y, z),  k, l ∈ R.
Clearly, l ≠ 0 (the vectors (1, 1, 0) and (1, 1, 1) are not parallel), and comparing the first two coordinates, x = y. Substituting this into the equations of the circle then leads to the two solutions
[x, y, z] = [1/3 ± √22/6, 1/3 ± √22/6, 1/3 ∓ √22/3].
points of the circle in the direction of the coordinate x), we can find a neighborhood of the point x₀ in which either
y = f(x) = t + √(r² - (x - s)²)  or  y = f(x) = t - √(r² - (x - s)²),
according to which semicircle the point (x₀, y₀) belongs to. Having drawn a picture of the situation, the reason is clear: we cannot describe both semicircles simultaneously by a single function y = f(x). The marginal points (s ± r, t) of the interval [s - r, s + r] are more intriguing. They also satisfy the equation of the circle, yet we have Fy(s ± r, t) = 0 at them, which reflects the fact that the tangent line to the circle at these points is parallel to the y-axis. Indeed, we cannot find neighborhoods of these points in which the circle could be described as a function y = f(x).
Moreover, the derivative of our function y = f(x) = t + √(r² - (x - s)²) at the points where it is defined can be expressed in terms of the partial derivatives of the function F:
f'(x) = -(x - s)/√(r² - (x - s)²) = -Fx(x, f(x))/Fy(x, f(x)).
If we interchange the roles of the variables x and y and look for a dependency x = f(y) such that F(f(y), y) = 0, then we will succeed in neighborhoods of the points (s ± r, t) with no problem. Let us notice that the partial derivative Fx is non-zero at these points.
Our observation thus (for our two examples) says: for a function F(x, y) and a point (a, b) ∈ E₂ such that F(a, b) = 0, there is a unique function y = f(x) satisfying F(x, f(x)) = 0 near a whenever Fy(a, b) ≠ 0. In this case, we can even compute f'(a) = -Fx(a, b)/Fy(a, b). We will prove that this proposition actually always holds true. The last statement about derivatives can be remembered easily (and is quite comprehensible if things are thoroughly understood) from the expression for the differential of the function g(x) = F(x, y(x)) with the differential dy = f'(x)dx substituted:
0 = dg = Fx dx + Fy dy = (Fx + Fy f'(x)) dx.
We could work analogously with implicit expressions F(x, y, z) = 0, where we can look for a function g(x, y) such that F(x, y, g(x, y)) = 0. As an example, consider the function f(x, y) = x² + y², whose graph is a circular paraboloid with vertex at the origin. It can be defined implicitly by the equation
0 = F(x, y, z) = z - x² - y².
Before formulating the result for the general situation, let us notice which dimensions could or should appear in the problem. If we wanted to find, for this function F, a curve c(x) = (c₁(x), c₂(x)) in the plane such that
F(x, c(x)) = F(x, c₁(x), c₂(x)) = 0,
then we would succeed as well (even for all initial conditions x = a), yet the result would not be unique for a given initial condition. In fact, it suffices to consider an arbitrary curve on the circular paraboloid whose projection onto the first coordinate has non-zero derivative. Then we consider x to be the parameter of the curve, and c(x) is chosen to be its projection onto the plane yz.
Since every circle is compact, it suffices to examine the function values at these two points. We find out that there is a maximum of the considered function on the given circle at the former point and a minimum at the latter one. □
8.60. Find the extrema of the function f : R³ → R, f(x, y, z) = x² + y² + z², on the plane 2x + y - z = 1 and determine their types. ○
8.61. Find the maximum of the function f : R² → R, f(x, y) = xy on the circle with radius 1 which is centered at the point [x₀, y₀] = [0, 1]. ○
8.62. Find the minimum of the function f : R² → R, f(x, y) = xy on the circle with radius 1 which is centered at the point [x₀, y₀] = [2, 0]. ○
8.63. Find the minimum of the function f : R² → R, f(x, y) = xy on the circle with radius 1 which is centered at the point [x₀, y₀] = [2, 0]. ○
8.64. Find the minimum of the function f : R² → R, f(x, y) = xy on the ellipse x² + 3y² = 1. ○
8.65. Find the minimum of the function f : R² → R, f(x, y) = x²y on the circle with radius 1 which is centered at the point [x₀, y₀] = [0, 0]. ○
8.66. Find the maximum of the function f : R² → R, f(x, y) = x³y on the circle x² + y² = 1. ○
8.67. Find the maximum of the function f : R² → R, f(x, y) = xy on the ellipse 2x² + 3y² = 1. ○
8.68. Find the maximum of the function f : R² → R, f(x, y) = xy on the ellipse x² + 2y² = 1. ○
H. Volumes, areas, centroids of solids
8.69. Find the volume of the solid which lies in the half-space z ≥ 0, inside the cylinder x² + y² ≤ 1, and in the half-space
a) z ≤ x,
b) x + y + z ≤ 0.
Therefore, we expect that one function of m + 1 variables implicitly defines a hypersurface in R^{m+1} which we want to express (at least locally) as the graph of a function of m variables. Similarly, we can anticipate that n functions of m + n variables will define an intersection of n hypersurfaces in R^{m+n}, which is, in "most" cases, an m-dimensional object.
Let us thus consider a differentiable mapping
F = (f₁, …, fₙ) : R^{m+n} → Rⁿ.
The Jacobian matrix of this mapping will have n rows and m + n columns, and we can write it symbolically as
D¹F = (D¹ₓF, D¹_yF),
where D¹ₓF = (∂fᵢ/∂xⱼ), i = 1, …, n, j = 1, …, m, is the matrix of n rows and the first m columns in the Jacobian matrix, while D¹_yF = (∂fᵢ/∂xⱼ), i = 1, …, n, j = m + 1, …, m + n, is the square matrix of order n with the remaining columns. Here, a point (x₁, …, x_{m+n}) ∈ R^{m+n} is written as (x, y) ∈ Rᵐ × Rⁿ. The multidimensional analogy to the previous reasoning with the non-zero partial derivative with respect to y is the condition that the matrix D¹_yF is invertible.
The implicit mapping theorem
Theorem. Let F : R^{m+n} → Rⁿ be a mapping differentiable in an open neighborhood of a point (a, b) ∈ Rᵐ × Rⁿ = R^{m+n} at which F(a, b) = 0 and det D¹_yF ≠ 0. Then there exists a differentiable mapping G : Rᵐ → Rⁿ defined on a neighborhood U of the point a ∈ Rᵐ, with G(a) = b, such that F(x, G(x)) = 0 for all x ∈ U.
Moreover, the Jacobian matrix D¹G of the mapping G is, on the neighborhood of the point a, given by the product of matrices
D¹G(x) = -(D¹_yF)⁻¹(x, G(x)) · D¹ₓF(x, G(x)).
Proof. For the sake of comprehensibility, we first show the proof for the simplest case of the equation F(x, y) = 0 with a function F of two variables. At first sight, it will look complicated, because it is presented in a way which extends to the general dimensions as stated in the theorem.
We extend the function F to
F̃ : R² → R², (x, y) ↦ (x, F(x, y)).
The Jacobian matrix of the mapping F̃ is
D¹F̃(x, y) = ( 1, 0 ; Fx(x, y), Fy(x, y) ).
It follows from the assumption Fy(a, b) ≠ 0 that the same holds in a neighborhood of the point (a, b) as well, so the mapping F̃ is invertible in this neighborhood, by the inverse mapping theorem. Therefore, let us take the uniquely defined differentiable inverse mapping F̃⁻¹ in a neighborhood of the point (a, 0).
Now, let us denote by π : R² → R the projection onto the second coordinate, and consider the function
f(x) = π ∘ F̃⁻¹(x, 0).
Solution. a) The volume can be calculated with ease using cylindric coordinates: the cylinder is determined by the inequality r ≤ 1, and the half-space z ≤ x by z ≤ r cos φ (so that necessarily cos φ ≥ 0, i. e., x ≥ 0).

8.70. Find the volume of the solid given by the inequalities x² + y² + z² ≤ 1, 3x² + 3y² ≥ z², x ≥ 0.
Solution. First, we should realize what the examined solid looks like: it is the part of a ball which lies outside a given cone (see the picture). The best way to determine the volume is probably to subtract half the volume of the sector given by the cone from half the ball's volume (note that the volume of the solid does not change if we replace the condition x ≥ 0 with z ≥ 0; the sector is cut either "horizontally" or "vertically", but always into halves). We will calculate in spherical coordinates
x = r cos φ sin ψ, y = r sin φ sin ψ, z = r cos ψ.
Again, we express the conditions in the spherical coordinates: r² ≤ 1 and 3sin²ψ ≥ cos²ψ, i. e., tan ψ ≥ 1/√3. Just like in the case of the ball, the variables occur independently in the inequalities, so the integration bounds of the variables will be independent of each other as well. The condition r² ≤ 1 implies r ∈ (0, 1]; from tan ψ ≥ 1/√3 together with z ≥ 0 we have ψ ∈ [π/6, π/2]; the variable φ runs through [0, 2π).

For the sphere of radius r > 0, centered at (a, b, c), i. e., given by the equation
F(x, y, z) = (x - a)² + (y - b)² + (z - c)² = r²,
we get the normal vectors at a point P = (x₀, y₀, z₀) as the non-zero multiples of the gradient, i. e., multiples of
D¹F = (2(x₀ - a), 2(y₀ - b), 2(z₀ - c)),
and the tangent vectors are exactly the vectors perpendicular to the gradient. Therefore, the tangent plane to the sphere at the point P can always be described implicitly in terms of the gradient by the equation
0 = (x₀ - a)(x - x₀) + (y₀ - b)(y - y₀) + (z₀ - c)(z - z₀).
This is a special case of the following general formula:
In cylindric coordinates
x = r cos φ, y = r sin φ, z = z,
the computation is analogous. □

Another alternative is to compute the volume as that of a solid of revolution, again splitting the solid into two parts as in the previous case (the part "under the cone" and the part "under the sphere"). However, these solids cannot be obtained by rotating around one of the axes. The volume of the former part can be calculated as the difference between the volumes of the cylinder x² + y² ≤ 1/4, 0 ≤ z ≤ √3/2, and the part of the cone 3x² + 3y² ≤ z², 0 ≤ z ≤ √3/2.

Tangent hyperplane to an implicitly given hypersurface
Consider a differentiable function f(x₁, …, x_{m+1}) and its level set M(b, f) = {x ∈ R^{m+1} : f(x) = b}. If P ∈ M(b, f) is a point at which the differential does not vanish, df(P) ≠ 0, then the tangent hyperplane to M(b, f) at the point P is given by the equation
0 = df(P)(x - P) = ∂f/∂x₁(P)(x₁ - P₁) + ⋯ + ∂f/∂x_{m+1}(P)(x_{m+1} - P_{m+1}),
and the normal line at P is generated by the gradient of f.

The same reasoning applies to the particular functions fᵢ of a differentiable mapping F = (f₁, …, fₙ) : R^{m+n} → Rⁿ.
For a fixed choice b = (b₁, …, bₙ), the set of all solutions is, of course, the intersection of all hypersurfaces M(bᵢ, fᵢ) corresponding to the particular functions fᵢ. The same must hold for tangent directions, while normal directions are generated by the particular gradients. Therefore, if D¹F is the Jacobian matrix of a mapping which implicitly defines a set M, and P ∈ M is a point at which D¹F(P) has maximal rank, then the tangent directions at P are exactly the solutions v of D¹F(P)·v = 0, while the normal space at P is generated by the rows of the matrix D¹F(P).
Solution. We will compute the integral in spherical coordinates. The segment can be perceived as a spherical sector without the cone (with vertex at the point [0, 0, 0] and with the circular base z = 1, x² + y² ≤ 1). In these coordinates, the sector is the product of the intervals [0, √2] × [0, 2π) × [0, π/4]. We thus integrate in the given bounds, in any order:
∫₀^{2π} ∫₀^{√2} ∫₀^{π/4} r² sin θ dθ dr dφ = (4/3)(√2 - 1)π.
In the end, we must subtract the volume of the cone. That is equal to (1/3)πR²H (where R is the radius of the cone's base and H is its height; both are equal to 1 in our case), so the total volume is
V_sector - V_cone = (4/3)π(√2 - 1) - (1/3)π = (1/3)π(4√2 - 5).
The volume of a general spherical segment of height h in a ball of radius R could be computed similarly:
V = V_sector - V_cone = ∫₀^{2π} ∫₀^{R} ∫₀^{θ₀} r² sin θ dθ dr dφ - (1/3)π(R - h)(R² - (R - h)²),  cos θ₀ = (R - h)/R. □

The solid of the next exercise is symmetric with respect to the coordinate planes, so it suffices to integrate over its part with x ≥ 0, y ≥ 0 and multiply by 8. Doing so, the volume of the whole solid comes out as
V = 128. □
Remark. Note that the projection of the considered solid onto both the plane y = 0 and the plane z = 0 is a circle with radius 4, yet the solid is not a ball.
8.73. Find the volume of the part of the cylinder x² + y² ≤ 4 bounded by the planes z = 0 and z = x + y + 2.
For every curve c(t) ⊂ M going through P = c(0), the composite function h(c(t)) must have an extremum at t = 0. Therefore, the derivative must satisfy
d/dt|_{t=0} h(c(t)) = dh(P)(c'(0)) = 0.
However, this means that the differential of the function h at the point P vanishes along all tangent increases to M at P. This property is equivalent to stating that the gradient of h lies in the normal subspace (more precisely, in its direction space). Such points P ∈ M are called stationary points of the function h with respect to the bindings given by F.
As we have seen in the previous paragraph, the normal space to our set M is generated by the rows of the Jacobian matrix of the mapping F, so the stationary points are determined equivalently by the following proposition:

Lagrange multipliers
Theorem. Let F = (f₁, …, fₙ) : R^{m+n} → Rⁿ be a mapping differentiable in a neighborhood of a point P, F(P) = 0. Further, let M be given implicitly by the equation F(x, y) = 0, and let the rank of the matrix D¹F at the point P be n. Then P is a stationary point of a continuously differentiable function h : R^{m+n} → R with respect to the conditions F if and only if there exist real parameters λ₁, …, λₙ such that
grad h = λ₁ grad f₁ + ⋯ + λₙ grad fₙ.
Let us notice that the method of Lagrange multipliers is an algorithmic one. Therefore, let us take a look at the numbers of unknowns and equations: the gradients are vectors of m + n coordinates, so the requirement of the theorem gives m + n equations. The variables are, on one side, the coordinates x₁, …, x_{m+n} of the wanted stationary points P with respect to the bindings, and, on the other hand, the n parameters λᵢ in the linear combination. Finally, the point P has to belong to the implicitly given set M, which gives n more equations. Altogether, we have 2n + m equations for 2n + m variables, so we can expect that the solution will be given by a discrete set of points P (i. e., each of them will be an isolated point).
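Since the method is algorithmic, it can be handed over to a computer algebra system. The following sketch (our own addition, assuming Python with sympy) sets up all 2n + m equations for the example 8.57 solved alongside, namely h = xyz with the bindings x² + y² + z² = 1 and x + y + z = 0:

import sympy as sp

x, y, z, l1, l2 = sp.symbols('x y z lambda1 lambda2', real=True)
h = x*y*z
f1 = x**2 + y**2 + z**2 - 1
f2 = x + y + z

# grad h = lambda1 grad f1 + lambda2 grad f2, plus the binding equations
eqs = [sp.Eq(sp.diff(h, v), l1*sp.diff(f1, v) + l2*sp.diff(f2, v))
       for v in (x, y, z)]
eqs += [sp.Eq(f1, 0), sp.Eq(f2, 0)]
for sol in sp.solve(eqs, [x, y, z, l1, l2], dict=True):
    print(sol)    # the six stationary points found in exercise 8.57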
8.23. Arithmetic mean-geometric mean inequality. As an example of a practical application of the Lagrange multipliers, we will prove the inequality
(1/n)(x₁ + ⋯ + xₙ) ≥ (x₁ ⋯ xₙ)^{1/n}
for any n positive real numbers x₁, …, xₙ. Further, we will prove that equality holds if and only if all the xᵢ are equal.
Let us thus take the sum x₁ + ⋯ + xₙ = c to be the binding condition, for a (non-specified) non-negative constant c. We will look for the maxima and minima of the function
f(x₁, …, xₙ) = (x₁ ⋯ xₙ)^{1/n}
with respect to our binding condition and the assumption x₁ > 0, …, xₙ > 0.
The normal vector to the hyperplane defined by the condition is (1, …, 1). Therefore, the function f can have an extremum only at the points where its gradient is a multiple of this vector.
Solution. We will work in cylindric coordinates given by the equations x = r cos φ, y = r sin φ, z = z.

The value of the integral is thus unchanged whenever the derivative of the reparametrization is positive, i. e., if we keep the orientation of the curve, and it is the same value up to sign if the derivative of the transformation is negative.
Precisely speaking, we have learned to integrate the differential df of a function over curves. However, the connection with integration of functions may not be apparent: we clearly cannot get the length of the curve if we select the constant function with value one for f. We need a geometric point of view to explain this. The size of a vector is given by a quadratic form, rather than a linear one. However, if we take the square root of the values of a (positively definite) quadratic form, we get a linear form (up to sign, see above). We will get back to these connections shortly.
8.34. Vector fields and linear forms. In the previous paragraph, the parametrization of a curve was used to obtain a tangent vector c'(t) ∈ Rⁿ at every point of the image M of the curve. We thus have a mapping X : M → M × Rⁿ, c(t) ↦ (c(t), c'(t)). We talk about the vector field X along the curve M.
In general, we define a vector field X on an open set U ⊂ Rⁿ as an assignment of a vector X(x) ∈ Rⁿ in the direction space of the Euclidean space Rⁿ to each point x of the considered domain.
If a vector field X on an open set U ⊂ Rⁿ is given, then we can define, for every differentiable function f on U, its derivative in the direction of the vector field X in terms of the directional derivative, by the formula
X(f) : U → R,  X(f)(x) = d_{X(x)} f.
Therefore, if we have, in coordinates, X(x) = (X₁(x), …, Xₙ(x)), then
X(f)(x) = X₁(x) ∂f/∂x₁(x) + ⋯ + Xₙ(x) ∂f/∂xₙ(x).
The simplest vector fields have all coordinate functions equal to zero except for one function Xᵢ, which is constantly equal to one; these are just the partial derivative operators ∂/∂xᵢ.
y = 2r sin φ, r ∈ [0, 1],
leading to (the Jacobian of the transformation is 6r):
∫_c (2e^{2x} sin y - 3y³) dx + (e^{2x} cos y + (4/3)x³) dy = ∬_D (2e^{2x} cos y + 4x² - (2e^{2x} cos y - 9y²)) dx dy
= ∫₀¹ ∫₀^{2π} 6r (4(3r cos φ)² + 9(2r sin φ)²) dφ dr = ∫₀¹ ∫₀^{2π} 216 r³ dφ dr = 108π.

Given a mapping η assigning to each point of an open subset U ⊂ Rⁿ a linear form η(x) ∈ Rⁿ*, we talk about a linear form η on U.
Every differentiable function f on an open subset U ⊂ Rⁿ defines a linear form df on U. We use the notation Ω¹(U) for the set of all smooth linear forms on U.
It is apparent that, in the coordinates (x₁, …, xₙ), we can use the differentials of the particular coordinate functions to express every linear form η as
η(x) = η₁(x) dx₁ + ⋯ + ηₙ(x) dxₙ,
where ηᵢ(x) are uniquely determined functions. Such a form η is evaluated at a vector field X(x) = X₁(x) ∂/∂x₁ + ⋯ + Xₙ(x) ∂/∂xₙ by
η(X(x)) = η₁(x) X₁(x) + ⋯ + ηₙ(x) Xₙ(x).
If the form η is the differential of a function f, we get just the expression X(f)(x) = df(X(x)) used above.
Let us notice that we have actually defined the integral of any linear form η over (non-parametrized) curves M, in terms of an arbitrary parametrization c(t):
∫_M η = ∫_a^b η(c(t))(c'(t)) dt;
although we worked with the differential of a function back then, we actually verified that the value of the integral is independent of the choice of parametrization for any linear form.
We can also notice that we need not write any symbol denoting which concept of a volume we are integrating with respect to; it is given by the definition of a linear form.
8.35. k-dimensional surfaces and k-forms. Instead of parametrized curves, we will now work with differentiable mappings φ : V → Rⁿ defined on open subsets V ⊂ Rᵏ. A subset M ⊂ Rⁿ is called a k-dimensional manifold if each of its points has an open neighborhood Ũ ⊂ Rⁿ with a mapping ψ : Ṽ → Ũ, defined on an open subset Ṽ ⊂ Rᵏ × R^{n-k}, which is a diffeomorphism and ψ⁻¹(M ∩ Ũ) = V × {0}. This definition, which might seem complicated at first sight, is illustrated by a picture. Manifolds can typically be given by implicit mappings, see paragraph 8.18 and the discussion in 8.19.
where c is the positively oriented circle x² + y² = 9. ○
8.110. Compute the integral
∫_c (1/x + 2xy - y³/3) dx + (1/y + x² + x³/3) dy,
where c is the positively oriented boundary of the set D = {(x, y) ∈ R² : 4 ≤ x² + y² ≤ 9, … ≤ y ≤ √3 x}. ○
8.111. Remark. An important corollary of Green's theorem is the formula for the area of a region D bounded by a curve c:
m(D) = (1/2) ∮_c -y dx + x dy.
8.112. Compute the area of the region bounded by the ellipse x²/a² + y²/b² = 1.
Solution. Using the formula from 8.111 and the parametrization x = a cos t, y = b sin t, t ∈ [0, 2π], we get
m(D) = (1/2) ∮_c -y dx + x dy
= -(1/2) ∫₀^{2π} b sin t · (-a sin t) dt + (1/2) ∫₀^{2π} a cos t · b cos t dt
= (1/2) ab ∫₀^{2π} sin²t dt + (1/2) ab ∫₀^{2π} cos²t dt
= (1/2) ab ∫₀^{2π} (cos²t + sin²t) dt = (1/2) ab · 2π = πab,
which is indeed the well-known formula for the area of an ellipse with semi-axes a and b.
□
8.113. Find the area bounded by the cycloid which is given parametrically as ψ(t) = [a(t - sin t), a(1 - cos t)], for a > 0, t ∈ [0, 2π], and the x-axis.
Solution. Let the curves that bound the area be denoted by c₁ and c₂. For the area, we get
m(D) = (1/2) ∫_{c₁} -y dx + x dy + (1/2) ∫_{c₂} -y dx + x dy.
Now, we compute the mentioned integrals step by step. The parametric equation of the curve c₁ (a segment of the x-axis) is (t, 0), t ∈ [0, 2aπ], so we obtain for the first integral
(1/2) ∫_{c₁} -y dx + x dy = -(1/2) ∫₀^{2aπ} 0 · 1 dt + (1/2) ∫₀^{2aπ} t · 0 dt = 0.
The parametric equation of the curve c₂ is ψ(t) = (a(t - sin t), a(1 - cos t)), with t running from 2π to 0 (note the orientation). Hence
(1/2) ∫_{c₂} -y dx + x dy = (1/2) ∫_{2π}^{0} a²(-2 + 2cos t + t sin t) dt = (1/2) ∫₀^{2π} a²(2 - 2cos t - t sin t) dt = 3πa²,
so the area under one arch of the cycloid is m(D) = 3πa². □
An exterior k-form η on a manifold M assigns to every point x ∈ M a form η(x) ∈ Λᵏ(T_xM)* in such a way that the pullback of this form by any parametrization yields a smooth exterior k-form on V. We will use the notation Ωᵏ(M) for the set of all smooth exterior k-forms on M.
8.36. Outer product of exterior forms. Given a k-form α ∈ ΛᵏRⁿ* and an ℓ-form β ∈ ΛˡRⁿ*, we can create a (k + ℓ)-form α ∧ β by means of all possible permutations σ of the arguments. We just have to alternate the arguments in all possible orders and take the right sign each time:
(α ∧ β)(X₁, …, X_{k+ℓ}) = (1/(k! ℓ!)) Σ_σ sign(σ) α(X_{σ(1)}, …, X_{σ(k)}) β(X_{σ(k+1)}, …, X_{σ(k+ℓ)}).
It is clear from the definition that α ∧ β is indeed a (k + ℓ)-form. In the simplest case of 1-forms, the definition says that
(α ∧ β)(X, Y) = α(X)β(Y) - α(Y)β(X).
In the case of a 1-form α and a k-form β, we get
(α ∧ β)(X₀, X₁, …, X_k) = Σ_{j=0}^{k} (-1)ʲ α(X_j) β(X₀, …, X̂_j, …, X_k),
J. Applications of Stokes' theorem - the Gauss-Ostrogradsky theorem
8.114. Compute I = ∬_S x³ dy dz + y³ dx dz + z³ dx dy, where S is the sphere x² + y² + z² = 1.
Solution. It is advantageous to work in spherical coordinates
x = ρ sin ψ cos φ, y = ρ sin ψ sin φ, z = ρ cos ψ.

Every parametrization φ : V → U satisfies φ*(α ∧ β) = φ*α ∧ φ*β. In particular, consider the n-form ω_{Rⁿ} giving the standard n-dimensional volume of parallelograms; in the standard coordinates, we have
ω_{Rⁿ} = dx₁ ∧ ⋯ ∧ dxₙ.
If we want to integrate a function f(x) "in the old fashion", we consider the form ω = f ω_{Rⁿ} instead, i. e., ω will have the form (8.3) in the standard coordinates. We define
∫_U ω = ∫_U f(x) dx₁ ∧ ⋯ ∧ dxₙ = ∫_U f(x) dx₁ … dxₙ,
where the right-hand side is the Riemann integral of a function. We can notice that the n-form on the left-hand side is independent of the choice of coordinates.
If we want to express the form ω in different coordinates using a diffeomorphism φ : V → U, it means that we evaluate ω at a point φ(u) = x on the values of the vectors given by the columns of the Jacobian matrix D¹φ(u); hence
∫_V φ*ω = ∫_V f(φ(u)) det(D¹φ(u)) du₁ … duₙ,
which is, by the theorem on the transformation of variables from paragraph 8.31, the same value if the determinant of the Jacobian matrix keeps being positive, and the same value up to sign if it is negative.
Our new interpretation thus yields a geometrical sense for the integral of an n-form on Rⁿ, supposing the corresponding Riemann integral exists in some (hence any) coordinates. This integration takes into account the orientation of the area we are integrating over.
8.38. Integration of exterior forms on manifolds. Now, we are almost ready for the definition of the integral of a k-form ω on a k-dimensional oriented manifold M. For the sake of simplicity, we will examine smooth forms ω with compact support. First, let us assume that we are given a k-dimensional manifold M ⊂ Rⁿ, one of its local parametrizations φ : V → U ⊂ M ⊂ Rⁿ compatible with the orientation, and that the support of ω lies in U. Let us denote
φ*(ω)(u) = f(u) du₁ ∧ ⋯ ∧ du_k.
Invoking the relation (8.2) for the pullback of a form by a composite mapping, we get, for any other such parametrization ψ,
∫_U ω = ∫_V φ*(ω) = ∫_{ψ⁻¹(U)} ψ*(ω),
so the integral does not depend on the chosen parametrization.
Therefore, if we select a different cover and unit decomposition, we can do the above reasoning for a common refinement of these covers and verify that the expression we have defined is actually independent of all of our choices (think this out in detail!).
8.41. Exterior differential of exterior forms. As we have seen, the differential of a function can be interpreted as a mapping
d : Ω⁰(Rⁿ) → Ω¹(Rⁿ).
By means of parametrizations, this definition extends to functions on manifolds M, where the differential is a linear form on M. The following theorem extends this differential to arbitrary exterior forms on manifolds M ⊂ Rⁿ.

Exterior differential
Theorem. There is a unique mapping d : Ωᵏ(M) → Ω^{k+1}(M), for all manifolds M ⊂ Rⁿ and all k, such that
• d is linear with respect to multiplication by real numbers,
• for k = 0, it is the differential of functions,
• d(α ∧ β) = dα ∧ β + (-1)ᵏ α ∧ dβ for a k-form α,
• d ∘ d = 0.
With a bit of effort, it can be shown that a differential equation of the form y' = f(ax + by + c) can be transformed to an equation with separated variables, using the substitution z = ax + by + c. Let us emphasize that the new variable z replaces y.
We thus set z = x + y, which gives z' = 1 + y'. Substitution into (8.8) yields
dz/dx = (z + 1)/(2z - 1) + 1,  i. e.,  dz/dx = 3z/(2z - 1),
which separates as
(2/3 - 1/(3z)) dz = dx.
Stokes' theorem. For a compact oriented k-dimensional manifold M with boundary ∂M and any smooth (k - 1)-form ω,
∫_M dω = ∫_{∂M} ω.
Proof. Using an appropriate locally finite cover of the manifold M and a unit decomposition subordinate to it, we can express the integrals on both sides as the sum (even a finite one, since the support of the considered form ω is compact) of integrals of forms on Rᵏ or on the half-space.
We can thus assume, without loss of generality, that M is the half-space
M = {(x₁, …, x_k) ∈ Rᵏ : x₁ ≤ 0}
and that ω is a form with compact support on M. Then, ω is a sum of the forms
ω = ω_j(x) dx₁ ∧ ⋯ ∧ dx̂_j ∧ ⋯ ∧ dx_k,
where the hat indicates the omission of the corresponding linear form, and ω_j(x) is a smooth function with compact support. Its exterior differential is
dω = (-1)^{j-1} (∂ω_j/∂x_j) dx₁ ∧ ⋯ ∧ dx_k.

dv/du = -(4u + 3v)/(3u + 2v),
which can be solved by the further substitution z = v/u. We thus obtain
z'u + z = -(4 + 3z)/(3 + 2z),
dz/du · u = -(2z² + 6z + 4)/(3 + 2z),
(3 + 2z)/(2z² + 6z + 4) dz = -du/u,
provided z² + 3z + 2 ≠ 0. Integrating, we get
(1/2) ln|z² + 3z + 2| = -ln|u| + ln|C|, C ≠ 0,
ln|(z² + 3z + 2) u²| = ln C², C ≠ 0,
(z² + 3z + 2) u² = ±C², C ≠ 0.
We thus have
(z² + 3z + 2) u² = D, D ≠ 0,
and returning to the original variables (z = v/u, u = x + 1, v = y - 1),
v² + 3vu + 2u² = D, D ≠ 0,
(y - 1)² + 3(y - 1)(x + 1) + 2(x + 1)² = D, D ≠ 0.
Making simple rearrangements, the general solution can be expressed as
(x + y)(2x + y + 1) = D, D ≠ 0.
Now, let us return to the condition z² + 3z + 2 ≠ 0. It follows from z² + 3z + 2 = 0 that z = -1 or z = -2, i. e., v = -u or v = -2u. For v = -u, we have x = u - 1 and y = v + 1 = -u + 1, which means that y = -x. Similarly, for v = -2u, we have y = -2u + 1, hence y = -2x - 1. However, both functions y = -x and y = -2x - 1 satisfy the original differential equation, and they are included in the general solution for the choice D = 0. Therefore, every solution is given in the implicit form
(x + y)(2x + y + 1) = D, D ∈ R.
□
8.125. Find the general solution of the differential equation
(x² + y²) dx - 2xy dy = 0.
Solution. For y ≠ 0, simple rearrangements lead to
y' = (x² + y²)/(2xy) = (1 + (y/x)²)/(2 · (y/x)).
Using the substitution u = y/x, we get to the equation
u'x + u = (1 + u²)/(2u).
If j > 1, the form ω evaluates identically to zero on the boundary ∂M. At the same time, invoking the fundamental theorem about antiderivatives of univariate functions, we get
∫_M dω = (-1)^{j-1} ∫ ( ∫_{-∞}^{∞} (∂ω_j/∂x_j) dx_j ) dx₁ ⋯ dx̂_j ⋯ dx_k = 0,
since the function ω_j has compact support. So the theorem is true in this case. However, if j = 1, then we obtain
∫_M dω = ∫_{Rᵏ⁻¹} ( ∫_{-∞}^{0} (∂ω₁/∂x₁) dx₁ ) dx₂ ⋯ dx_k = ∫_{Rᵏ⁻¹} ω₁(0, x₂, …, x_k) dx₂ ⋯ dx_k = ∫_{∂M} ω.
This finishes the proof of Stokes's theorem. □
8.44. Notes about applications of Stokes' theorem. We have proved an extraordinarily strong result which covers several standard integral relations from classical vector analysis. For instance, we can notice that, by Stokes' theorem, the integral of the exterior differential dω of any (k - 1)-form over a compact manifold without boundary is always zero (for example, integrating a 2-form dω over the sphere S² ⊂ R³).
Let us look step by step at the cases of Stokes' theorem in lower dimensions.
The case n = 2, k = 1. We are thus examining a surface M in the plane, bounded by a curve C = ∂M. If we have ω(x, y) = f(x, y)dx + g(x, y)dy, then dω = (-∂f/∂y + ∂g/∂x) dx ∧ dy. Therefore, Stokes' theorem gives the formula
∫_C f(x, y) dx + g(x, y) dy = ∫_M (∂g/∂x - ∂f/∂y) dx ∧ dy,
which is one of the standard forms of the so-called Green's theorem.
Using the standard scalar product on R², we can identify the vector field X with a linear form ω_X such that ω_X(Y) = ⟨Y, X⟩. In the standard coordinates (x, y), this just means that the field X = f(x, y) ∂/∂x + g(x, y) ∂/∂y defines exactly the form ω given above. The integral of ω_X over a curve C has the physical interpretation of the work done by movement along this curve in the force field X. Green's theorem then says, besides others, that if ω_X = dF for some function F, then the work done along a closed curve is always zero. Such fields are called potential fields, and the function F is the potential of the field X.
With Green's theorem, we have verified once again that integrating the differential of a function along a curve depends solely on the initial and terminal points of the curve.
The case n = 3, k = 2. We are examining a region M in R³, bounded by a surface S. If ω = f(x, y, z) dy ∧ dz + g(x, y, z) dz ∧ dx + h(x, y, z) dx ∧ dy, we get dω = (∂f/∂x + ∂g/∂y + ∂h/∂z) dx ∧ dy ∧ dz, and Stokes' theorem says that
∬_S f dy ∧ dz + g dz ∧ dx + h dx ∧ dy = ∫_M (∂f/∂x + ∂g/∂y + ∂h/∂z) dx ∧ dy ∧ dz.
For u ≠ ±1 and D = -1/C, we have
u'x = (1 + u²)/(2u) - u = (1 - u²)/(2u),
2u/(1 - u²) du = dx/x,
-ln|1 - u²| = ln|x| + ln|C|, C ≠ 0,
ln(1/|1 - u²|) = ln|Cx|, C ≠ 0,
1 = Cx(1 - u²), C ≠ 0,
x² - y² = -Dx,  i. e.,  y² = x² + Dx, D ≠ 0.
v2,
±x. While y = 0 is —x are solutions and
The condition u = ± 1 corresponds to y not a solution, both the functions y = x and y can be obtained by the choice D = 0. The general solution is thus
y2 = x2 + Dx, Dei. □
8.126. Solve
y' = -2y/(x² - 1) + x.
Solution. The given equation is of the form y' = a(x)y + b(x), i. e., it is a non-homogeneous linear differential equation (the function b is not identically equal to zero). The general solution of such an equation can be obtained by the method of the integration factor (the non-homogeneous equation is multiplied by the expression e^{-∫a(x)dx}) or by the method of variation of the constant (the integration constant arising in the solution of the corresponding homogeneous equation is considered to be a function of the variable x). We will illustrate both of these methods on this problem.
As for the former method, we multiply the original equation by the expression
e^{-∫a(x)dx} = e^{∫ 2/(x²-1) dx} = (x - 1)/(x + 1),
where the corresponding integral is understood to stand for any antiderivative, and any non-zero multiple of the obtained function can be considered (that is why we could remove the absolute value). Thus, consider the equation
y' (x - 1)/(x + 1) + y · 2/(x + 1)² = x(x - 1)/(x + 1).
The core of the method of the integration factor is the fact that the expression on the left-hand side is the derivative of y(x - 1)/(x + 1). Integrating thus leads to
y (x - 1)/(x + 1) = ∫ x(x - 1)/(x + 1) dx = x²/2 - 2x + 2 ln|x + 1| + C, C ∈ R.
Therefore, the solutions are the functions
y = (x + 1)/(x - 1) · (x²/2 - 2x + 2 ln|x + 1| + C), C ∈ R.
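The result can be cross-checked with a computer algebra system; the sketch below (our own addition, assuming Python with sympy and the right-hand side of the equation as reconstructed above) should return an equivalent general solution:

import sympy as sp

x = sp.Symbol('x')
y = sp.Function('y')
ode = sp.Eq(y(x).diff(x), -2*y(x)/(x**2 - 1) + x)
print(sp.dsolve(ode, y(x)))
# equivalent to y = (x + 1)/(x - 1)*(x**2/2 - 2*x + 2*log|x + 1| + C)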
As for the latter method, we first solve the corresponding homogeneous equation
y' = -2y/(x² - 1),
This is the statement of the so-called Gauss-Ostrogradsky theorem.
This theorem also has a very illustrative physical interpretation. Every vector field X = f(x, y, z) ∂/∂x + g(x, y, z) ∂/∂y + h(x, y, z) ∂/∂z defines an exterior 2-form ω_X(x, y, z) = f(x, y, z) dy ∧ dz + g(x, y, z) dz ∧ dx + h(x, y, z) dx ∧ dy by substitution for the first argument in the standard form of volume. The integral of this form over a surface can be perceived so that the integrated 2-form infinitesimally contributes to the integral, at every point, the increase equal to the volume of the parallelepiped given by the field X and a little piece of the surface. If we consider the vector field to be the velocity of movement of the particular points of the space, this is the "flow rate" through the given surface. On the right-hand side of the integral, there is the expression d(ω_X) = (div X) dx ∧ dy ∧ dz. The Gauss-Ostrogradsky theorem says that if div X equals zero identically, then the total flow through the boundary surface of the region is zero as well. Such fields, with div X = 0, are called solenoidal vector fields.
The case n — 3, k — 1. In this case, we have a surface M in R3 bounded by a curve C. If the linear form co is the differential of some function, we find out that the integral over the surface depends on the boundary curve only. This is the classical Stokes' theorem. If we use the standard scalar product, just like in the plane,
to identify the vector field X = f∂/∂x + g∂/∂y + h∂/∂z with the form ω_X = f dx + g dy + h dz, we obtain

∫_C f dx + g dy + h dz = ∫_M dω_X,

where dω_X = (∂h/∂y − ∂g/∂z) dy∧dz + (∂f/∂z − ∂h/∂x) dz∧dx + (∂g/∂x − ∂f/∂y) dx∧dy. This 2-form can again be identified with a single vector field rot X, which yields dω_X by substitution into the standard form of volume. This field is called the rotation or curl of the vector field X. We can see that in the three-dimensional space, vector fields X having the property that ω_X = dF for some function F are given by the condition rot X = 0. They are called conservative (or potential) vector fields.
3. Differential equations
In this section, we will get back to (vector) functions of one variable, which will be given and examined in terms of their instantaneous changes. At the end, we will stop for a while to look at equations containing partial derivatives.
8.45. Linear and non-linear difference models. The concept of derivative was introduced in order to work with instantaneous changes of the examined quantities. In the introductory chapter, we once defined differences for the same reason, and it was just the relations between the values of the quantities and the changes of them or other quantities which led to the so-called difference equations. As a motivating introduction to equations containing derivatives of unknown functions, we will now return to the difference equations for a while.
The simplest model was the interest on deposits or loans (and the same for the so-called Malthusian model of populations). The increase was proportional to the value, see 1.10. In continuous modeling, the same requirement leads to an equation connecting
which is an equation with separated variables. We have
dy/dx = −2y/(x² − 1),
∫ dy/y = ∫ −2/(x² − 1) dx,
ln|y| = −ln|x − 1| + ln|x + 1| + ln|C|, C ≠ 0,
ln|y| = ln|C(x + 1)/(x − 1)|, C ≠ 0,
y = C(x + 1)/(x − 1), C ≠ 0,

where we had to exclude the case y = 0. However, the function y = 0 is always a solution of a homogeneous linear differential equation, and it can be included in the general solution. Therefore, the general solution of the corresponding homogeneous equation is

y = C(x + 1)/(x − 1), C ∈ R.
Now, we will consider the constant C to be a function C(x). Differentiating leads to

y′ = (C′(x)(x + 1)(x − 1) + C(x)(x − 1) − C(x)(x + 1)) / (x − 1)².

Substituting this into the original equation, we get

(C′(x)(x + 1)(x − 1) + C(x)(x − 1) − C(x)(x + 1)) / (x − 1)² = −2C(x)(x + 1)/((x − 1)(x² − 1)) + x.
It follows that

C′(x) = x(x − 1)/(x + 1),

i.e.,

C(x) = ∫ x(x − 1)/(x + 1) dx = x²/2 − 2x + 2 ln|x + 1| + C, C ∈ R.
Now, it suffices to substitute:

y = C(x) · (x + 1)/(x − 1) = (x + 1)/(x − 1) · (x²/2 − 2x + 2 ln|x + 1| + C), C ∈ R.
We can see that the result we have obtained here is of the same form as in the former case. This should not be surprising as the differences between the two methods are insignificant and the computed integrals are the same.
Finally, we can notice that the solution of an equation y′ = a(x)y can be found in the same way for any continuous function a. We thus always have

y = C e^{∫a(x)dx}, C ∈ R.

Similarly, the solution of an equation y′ = a(x)y + b(x) with an initial condition y(x₀) = y₀ can be determined explicitly as (provided the coefficients, i.e. the functions a and b, are continuous)

y = e^{∫_{x₀}^{x} a(t)dt} (y₀ + ∫_{x₀}^{x} b(t) e^{−∫_{x₀}^{t} a(s)ds} dt).

Let us remark that the linear equation has no singular solution, and the general solution contains a C ∈ R. □
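With the right-hand side of 8.126 read as above (y′ = −2y/(x² − 1) + x), the computed general solution can also be verified symbolically; a minimal sympy sketch:

from sympy import symbols, ln, simplify, diff

x, C = symbols('x C')
y = (x + 1)/(x - 1) * (x**2/2 - 2*x + 2*ln(x + 1) + C)
print(simplify(diff(y, x) + 2*y/(x**2 - 1) - x))  # prints 0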
8.127. Solve the linear equation
(y′ + 2xy) e^{x²} = cos x.
the derivative y′(t) of a function with its value:

(8.4) y′(t) = r·y(t),

with a proportionality constant r.
It is easy to guess the solution of this equation, i. e. a function y(t) which satisfies the equality identically,
y(t) = C e^{rt},
with an arbitrary constant C. This constant can be determined uniquely by choosing the so-called initial value y₀ = y(t₀) at some point t₀. If a part of the increase in our model were given by a constant action independent of the value y or t (like bank charges or the natural decrease of a population as a result of sending some part of it to slaughterhouses), we could use an equation with a constant s on the right-hand side:
(8.5) y′(t) = r·y(t) + s.

Apparently, the solution of this equation is the function

y(t) = C e^{rt} − s/r.
It is very easy to come across this solution if we realize that the set of all solutions of the equation (8.4) is a one-dimensional vector space, while the solutions of the equation (8.5) are obtained by adding any one of its solutions to the solutions of the previous equation. We can then easily find the constant solution y(t) = k for k = −s/r.
Similarly, in paragraph 1.13, we managed to create the so-called logistic model of population growth based upon the assumption that the ratio of the change of the population size p(n + 1) − p(n) and its size p(n) is affine with respect to the population size itself. We also wanted the model to behave similarly to the Malthusian one for small values of the population size and to cease growing when reaching a limit value K. Now, the same relation for the continuous model can be formulated for a population p(t) dependent on time t by the equality
(8.6) p′(t) = p(t) · (r − (r/K) · p(t)),

i.e., at the value p(t) = K for a large constant K, the instantaneous increase of the function p is indeed zero, while for p(t) near zero, the ratio of the rate of increase of the population and its size is close to r, which is often a small number (roughly hundredths) expressing the rate of increase of the population in good conditions.
It is surely not easy to solve such an equation without knowing the proper theory (although we will be able to deal with this type of equations presently). However, as an exercise on differentiation, we can easily verify that the following function is a solution for every constant C:
p(t) = K / (1 + C·K·e^{−rt}).
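That exercise on differentiation can also be delegated to a computer; a minimal sympy sketch checking that this function satisfies (8.6):

from sympy import symbols, exp, simplify, diff

t, r, K, C = symbols('t r K C', positive=True)
p = K / (1 + C*K*exp(-r*t))
print(simplify(diff(p, t) - p*(r - r/K*p)))  # prints 0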
Solution. If we used the method of integrating factor, we would only rewrite the equation trivially, since it is already of the desired form: the expression on the left-hand side is the derivative of y·e^{x²}. Thus, we can immediately calculate

(y·e^{x²})′ = cos x,
y·e^{x²} = ∫ cos x dx,
y·e^{x²} = sin x + C,
y = e^{−x²}(sin x + C), C ∈ R.
□
8.128.
Find all non-zero solutions of the Bernoulli equation
y′ − y/x = 3xy².
Solution. The Bernoulli equation

y′ = a(x)y + b(x)y^r, r ≠ 0, r ≠ 1, r ∈ R,

can be solved by first dividing by the term y^r and then using the substitution u = y^{1−r}, which leads to the linear differential equation

u′ = (1 − r)(a(x)u + b(x)).

In this very problem, the substitution u = y^{1−2} = 1/y gives

u′ + u/x = −3x.
Similarly to the previous exercise, we have

u = e^{−ln|x|} (∫ −3x·e^{ln|x|} dx),

where ln|x| was obtained as an (arbitrary) antiderivative of 1/x. Further,

∫ −3x·e^{ln|x|} dx = ∫ −3x·|x| dx.

The absolute value can be replaced with a sign that can be canceled, i.e., it suffices to consider

u = (1/x) (∫ −3x² dx) = (1/x)(−x³ + C), C ∈ R.

Returning to the original variable, we get

y = 1/u = x/(C − x³), C ∈ R.
The excluded case y = 0 is a singular solution (which, of course, is true for every Bernoulli equation with r positive). □
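Again, a one-line symbolic check of the obtained family of solutions (a sympy sketch):

from sympy import symbols, simplify, diff

x, C = symbols('x C')
y = x / (C - x**3)
print(simplify(diff(y, x) - y/x - 3*x*y**2))  # prints 0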
8.129. Interchanging the variables, solve the equation
y dx − (x + y² sin y) dy = 0.
Solution. When the variable x occurs only in the first power in the differential equation and y occurs in the arguments of elementary functions, we can apply the so-called method of variable interchange, where we look for the solution as a function x of the independent variable y.
First, we write the equation explicitly:
y′ = y/(x + y² sin y).
This equation is not of any of the previous types, so we rewrite it as follows:
Confronting the red graph (left-hand picture) of this function for the choice K = 100, r = 0.05, and C = 1 (the first two were used in 1.13 this way, the last one roughly corresponds to the initial value p(0) = 1) with the right-hand picture (the solution of the difference equation from 1.13 with the same values of the parameters), we can see that both approaches to population modeling indeed yield quite similar results. To compare the output, the left-hand picture also contains, in green, the graph of the solution of the equation (8.4) with the same constant r and initial condition.
8.46. First-order differential equations. By an (ordinary) first-order differential equation, we usually mean the relation between the derivative y′(t) of a function with respect to the variable t, its value y(t), and the variable itself, which can be written in terms of some real-valued function F : R³ → R as the equality

F(y′(t), y(t), t) = 0.

The writing resembles implicitly given functions y(t); however, this time, there is a dependency upon the derivative of the wanted function y(t).
If the equation is solved at least explicitly with regard to the derivative, i.e.,

y′(t) = f(t, y(t))

for some function f : R² → R, we can imagine graphically what this equation defines. For every value (t, y) in the plane, we can consider the arrow corresponding to the vector (1, f(t, y)), i.e., the velocity with which the point of the graph of the solution moves through the plane in dependence on the free parameter t.
For the equation (8.6), for instance, we get the following picture (illustrating the solution for the initial condition as above).
dy/dx = y/(x + y² sin y),
dx/dy = (y/(x + y² sin y))^{−1} = (x + y² sin y)/y,
x′ = x/y + y sin y.

We have thus obtained a linear differential equation. Now, we can easily compute its general solution

x = −y cos y + Cy, C ∈ R.
□
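Since x is expressed as a function of y here, a symbolic check differentiates with respect to y; a minimal sympy sketch:

from sympy import symbols, cos, sin, simplify, diff

y, C = symbols('y C')
x = -y*cos(y) + C*y
# the equation rewritten as dx/dy = (x + y^2 sin y)/y
print(simplify(diff(x, y) - (x + y**2*sin(y))/y))  # prints 0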
Further problems concerning first-order differential equations can be found on page ??.
L. Practical problems leading to differential equations
8.130. A water purification plant with volume 2000 m³ was contaminated with lead which is spread in the water with density 10 g/m³. Water is flowing in and out of the basin at 2 m³/s. In what time does the amount of lead in the basin decrease below 10 μg/m³ (which is the hygienic norm for the amount of lead in drinkable water by a regulation of the European Community), provided the water keeps being mixed uniformly?
Solution. Let us denote the volume of water in the basin by V (m³) and the rate of the water's flow by v (m³/s). In an infinitesimal (infinitely small) time interval dt, (m/V)·v·dt grams of lead runs out of the basin, so we can construct the differential equation

dm = −(m/V)·v·dt

for the change of the lead's mass in the basin. Separating the variables, we get the equation

dm/m = −(v/V)·dt.

Integrating both sides of the equation and getting rid of the logarithms, we get the solution in the form m(t) = m₀e^{−(v/V)t}, where m₀ is the lead's mass at time t = 0. Substituting the concrete values, we find out that t = 6 h 35 min. □
8.131. The speed of transmission of a message in a population consisting of P people is directly proportional to the number of people who have not heard the message yet. Determine the function f which describes the dependency of the number of people who have heard the message on time. Is it appropriate to use this model of message transmission for small or large values of P?
Solution. We construct a differential equation for f. The speed of the transmission f′(t) should be directly proportional to the number of people who have not heard the message yet, i.e. the value P − f(t). Altogether,

df/dt = k·(P − f(t)).
[Figure: the direction field of the logistic equation (8.6), with y(x) from 0 to 100 on the vertical axis and x from 0 to 200 on the horizontal axis, illustrating the solution for the initial condition as above.]
Considering these pictures, we can intuitively anticipate that for every initial condition, there will exist a unique solution of our equation. However, as we will see, this proposition holds only for sufficiently smooth functions f.
8.47. Integration of differential equations. Before examining existence of the solutions of the differential equations, we present at least one truly elementary method of solution. It transforms the solution to ordinary integration, which usually leads to an implicit description of the solution.
Equations with separated variables

Consider a differential equation in the form

(8.7) y′(t) = f(t)·g(y(t))

for two continuous functions of a real variable, f and g.

The solution of this equation can be obtained by integration, i.e., we find the antiderivatives

G(y) = ∫ dy/g(y), F(t) = ∫ f(t) dt.

This procedure reliably finds a solution satisfying g(y(t)) ≠ 0.

Then, computing the function y(t) from the implicitly given formula F(t) + C = G(y) with an arbitrary constant C leads to the solution, because differentiating this equation using the chain rule for the composite function G(y(t)) indeed leads to (1/g(y(t)))·y′(t) = f(t).
As an example, we can find the solution of the equation

y′(x) = x · y(x).

Direct calculation gives ln|y(x)| = x²/2 + C. Hence it looks (at least for positive values of y) as

y(x) = e^{x²/2 + C} = D · e^{x²/2},

where D is an arbitrary positive constant now. Let us stop for a while to examine the resulting formula and signs thoroughly. The
Separating the variables and introducing a constant K (the number of people who know the message at time t = 0 must be P − K), we get the solution

f(t) = P − K·e^{−kt},

where k is a positive real constant.
Apparently, this model makes sense for large values of P only. □
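A short symbolic check that this function indeed satisfies f′ = k(P − f) (a sympy sketch):

from sympy import symbols, exp, simplify, diff

t, P, K, k = symbols('t P K k', positive=True)
f = P - K*exp(-k*t)
print(simplify(diff(f, t) - k*(P - f)))  # prints 0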
8.132. The speed at which an epidemic spreads in a given closed population consisting of P people is directly proportional to the product of the number of people who have been infected and the number of people who have not. Determine the function fit) describing the number of infected people in time.
Solution. Just like in the previous problem, we construct a differential equation:

df/dt = k · f(t) · (P − f(t)).

Again, separating the variables and introducing suitable constants K and L, we obtain

f(t) = K / (1 + L·e^{−Kkt}).
□
8.133. The speed at which a given isotope of a given chemical element decays is directly proportional to the amount of the given isotope. The half-life of the plutonium isotope ²³⁹Pu is 24,100 years. In what time does a hundredth of a nuclear bomb whose active component is the mentioned isotope disappear?
Solution. Denoting the amount of plutonium by m, we can build a differential equation for the rate of the decay:

dm/dt = −k·m,

where k is an unknown constant. The solution is thus the function m(t) = m₀e^{−kt}. Substituting into the equation for the half-life (e^{−k·24100} = 1/2), we get the constant k ≈ 2.88 · 10⁻⁵. The wanted time is then approximately 349 years. □
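The two numbers can be recomputed directly from the half-life; a small numerical sketch:

import numpy as np

half_life = 24100.0                # years
k = np.log(2) / half_life          # ≈ 2.88e-5 per year
t = np.log(100/99) / k             # time until m(t) = 0.99 m_0
print(k, t)                        # ≈ 2.876e-05, ≈ 349.4 years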
8.134. The acceleration of an object falling in a constant gravitational field with a certain resistance of the environment is given by the formula

dv/dt = g − kv,

where k is a constant which expresses the resistance of the environment. An object was dropped in a gravitational field with g = 10 ms⁻² at the initial speed of 5 ms⁻¹; the resistance constant is k = 0.5 s⁻¹. What will the speed of the object be in three seconds?

Solution. Separating the variables in dv/dt = g − kv leads to

v(t) = g/k − (g/k − v₀)·e^{−kt},

whence
constant solution y(x) = 0 satisfies our equation as well, and for negative values of y, we can use the same solution with negative constants D. In fact, the constant D can be arbitrary, and we have found a solution satisfying any initial value.
[Figure: the direction field of the equation y′ = xy with two solutions illustrating the instability with respect to the initial values.]
The picture shows two solutions which demonstrate the instability of the equation with regard to the initial values: if, for any x₀, we change a tiny y₀ from a negative value to a positive one, then the behavior of the resulting solution changes dramatically. Moreover, we should notice the constant solution y(x) = 0, which satisfies the initial condition y(x₀) = 0.
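For comparison, a computer algebra system reproduces this one-line computation directly; a minimal sympy sketch:

from sympy import symbols, Function, Eq, dsolve, diff

x = symbols('x')
y = Function('y')
print(dsolve(Eq(diff(y(x), x), x*y(x))))
# Eq(y(x), C1*exp(x**2/2)), i.e. the family D*e^{x^2/2} found above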
Using separation of variables, we can easily solve the nonlinear equation from the previous paragraph which described a logistic population model. Try this as an exercise.
In the first chapter, we paid much attention to the so-called linear difference equations, and their general solution, looking quite awful, was determined in paragraph 1.10 on page 13. Although it was clear beforehand that it would be a one-dimensional affine space of satisfying sequences, it was a hardly transparent sum, because we needed to take into account all of the changing coefficients.
We can thus use this as a source of inspiration for the following construction of the solution of a general first-order linear equation

(8.8) y′(t) = a(t)y(t) + b(t)

with continuous coefficients a(t) and b(t).

First of all, let us find the solution of the homogenized equation y′(t) = a(t)y(t). This can be computed easily by separation of variables, obtaining

y(t) = y₀·F(t, t₀), F(t, s) = e^{∫_s^t a(x) dx}.
In the case of difference equations, we "guessed" the solution, and then we proved by induction that it was correct. It is even simpler now, as it suffices to differentiate the correct solution to verify the statement.
The solution of first-order linear equations

The solution of the equation (8.8) with initial value y(t₀) = y₀ is (locally in a neighborhood of t₀) given by the formula

y(t) = y₀·F(t, t₀) + ∫_{t₀}^{t} F(t, s)·b(s) ds,

where F(t, s) = e^{∫_s^t a(x) dx}.
v(3) = 20 − 15e^{−3/2} ms⁻¹ ≈ 16.7 ms⁻¹ after substitution.
□
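The numerical value is immediate; a small sketch with the given constants:

import numpy as np

g, k, v0 = 10.0, 0.5, 5.0
v = lambda t: g/k - (g/k - v0)*np.exp(-k*t)
print(v(3.0))  # ≈ 16.65 ms^-1, i.e. 20 - 15 e^{-3/2}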
Verify the correctness of the solution by yourselves (pay proper attention to the differentiation of the integral, where t appears both in the upper bound and as a free parameter in the integrand).
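One way to do the suggested verification is numerical: compare the formula with a standard ODE solver on a concrete choice of coefficients of ours, say a(t) = cos t and b(t) = e^{−t}; a scipy sketch:

import numpy as np
from scipy.integrate import quad, solve_ivp

a = lambda t: np.cos(t)
b = lambda t: np.exp(-t)
t0, y0 = 0.0, 1.0

def F(t, s):  # F(t, s) = exp(int_s^t a(x) dx)
    return np.exp(quad(a, s, t)[0])

def y_formula(t):
    integral = quad(lambda s: F(t, s) * b(s), t0, t)[0]
    return y0 * F(t, t0) + integral

sol = solve_ivp(lambda t, y: a(t)*y + b(t), (t0, 3.0), [y0],
                rtol=1e-10, atol=1e-12)
print(y_formula(3.0), sol.y[0, -1])  # the two values agree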
8.135. The rate of increase of a population of a certain type of bug is inversely proportional to its size. At time t = 0, the population had 100 bugs. In a month, the population doubled. What will the size of the population be in two months?
Solution. Let us consider a continuous approximation of the number of bugs, and let their amount be denoted by P. Then, we can build the following equation:

dP/dt = k/P, P(0) = 100,

whence P = √(2kt + c). Substituting the given values (c = 100² from P(0) = 100, and 2k = 30000 from P(1) = 200), we get P(2) = √70000 ≈ 265, which is an estimate of the actual number of bugs. □
Now, we can, for instance, directly solve the equation y′(x) = 1 − x·y(x), this time encountering stable behavior, visible in the following picture.

[Figure: the direction field of y′ = 1 − xy with solutions illustrating the stable behavior.]
8.136. Find the equation of the curve with the following properties: it lies in the first quadrant, goes through the point [1, 3/4], and its tangent at any point marks, on the positive half-axis y, a segment whose length is the same as the distance of that point from the origin. ○

8.137. Consider a chemical compound C isolated in a container. C is unstable, with the half-life of a molecule equal to q time units. If there were M moles of the compound C in the container at the beginning (i.e., at time t = 0), how many moles of it will be there at time t > 0? ○

8.138. A 100-gram body stretches a spring by 5 cm when hung on it. Express the dependency of its position on time t, provided the speed of the body is 10 cm/s when going through the equilibrium point. ○
Further practical problems that lead to differential equations can be found on page ??.
M. Higher-order differential equations
8.139. Underdamped oscillation. Now, we will describe a simple model for the movement of a solid object attached to a point with a strong spring. If y(t) is the deviation of our object from the point yo = y(0) = 0, then we can assume that the acceleration y" it) in time t is proportional to the magnitude of the deviation, yet with the other sign. The proportionality constant k is called the spring constant. Considering the case k = 1, we get the so-called oscillation equation
y"(t) = -y(t).
This equation corresponds to the system of equations
x′(t) = −y(t), y′(t) = x(t)

from 8.7. The solution of this system is given by

x(t) = R cos(t − τ), y(t) = R sin(t − τ)

with an arbitrary non-negative constant R, which determines the maximum amplitude, and a constant τ, which determines the initial phase.
Therefore, in order to determine a unique solution, we need to know not only the initial position yo, but also the speed of the motion at that moment. These two pieces of information uniquely determine both the amplitude and the initial phase.
Moreover, let us imagine that as a result of the properties of the spring material, there is another force which is directly proportional to the instantaneous speed of our object, with the other sign than the amplitude again. This is expressed by one more term with the first derivative, so our equation is now
8.48. Transformation of coordinates. Our pictures tend to indicate that differential equations can be perceived as geometric objects (the "directional field of the arrows"), so we should be able to look for the solution by conveniently chosen coordinates. We will get back to this point of view later; now, we will only show three simple and typical tricks as they seem from the explicit form of the equations in coordinates.
We begin with the so-called homogeneous equations of the form

y′(t) = f(y(t)/t).

Considering the transformation z(t) = y(t)/t, assuming that t ≠ 0, we get by the chain rule that

z′(t) = (1/t²)(t·y′(t) − y(t)) = (1/t)(f(z) − z),
which is an equation with separated variables.
Another example is the so-called Bernoulli differential equations, which are of the form

y′(t) = f(t)y(t) + g(t)y(t)ⁿ,

where n ≠ 0, 1. The choice of the transformation z = y^{1−n} leads to the equation

z′(t) = (1 − n)·y(t)^{−n}·(f(t)y(t) + g(t)y(t)ⁿ) = (1 − n)f(t)z(t) + (1 − n)g(t),

which is a linear equation, which we are able to integrate.
In the end, let us take a look at an extraordinarily important equation, the so-called Riccati equation. It is a form of the Bernoulli equation with n = 2, extended by an absolute term:

y′(t) = f(t)y(t) + g(t)y(t)² + h(t).

This equation can also be transformed to a linear equation provided that we are able to guess a particular solution x(t). Then, we can use the transformation

z(t) = 1/(y(t) − x(t)).

Verify by yourselves that this transformation leads to the equation

z′(t) = −(f(t) + 2x(t)g(t))·z(t) − g(t).
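As an illustration, take the hypothetical Riccati equation y′ = y² − 2/t² with the guessed particular solution x(t) = 1/t (so f = 0, g = 1, h = −2/t²); the transformed equation should then be z′ = −(2/t)z − 1, which sympy confirms:

from sympy import symbols, Function, simplify, diff

t = symbols('t', positive=True)
z = Function('z')
y = 1/t + 1/z(t)               # y = x(t) + 1/z with x(t) = 1/t
residual = diff(y, t) - (y**2 - 2/t**2)
# substituting z' = -(2/t) z - 1 must satisfy the equation identically
print(simplify(residual.subs(diff(z(t), t), -2*z(t)/t - 1)))  # prints 0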
y"(t) = -y(t)-ay'(t),
where a is a constant which expresses the magnitude of the damping. In the following picture, there are the so-called phase diagrams for solutions with two distinct initial conditions, namely with zero damping on the left, and for the value of the coefficient a = 0.3 on the right.
[Figure: two phase diagrams ("Tlumené oscilace", i.e. damped oscillations), with zero damping on the left and a = 0.3 on the right.]
The oscillations are expressed by the y-axis values; the x-axis values describe the speed of the motion.
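Phase diagrams like these can be produced numerically; a minimal scipy sketch for the damped case, with the value a = 0.3 from the text:

import numpy as np
from scipy.integrate import solve_ivp

a = 0.3  # damping coefficient

def rhs(t, u):
    x, y = u                  # x is the speed, y the deviation
    return [-y - a*x, x]      # x' = -y - a x, y' = x

sol = solve_ivp(rhs, (0.0, 40.0), [1.0, 0.0], dense_output=True)
ts = np.linspace(0.0, 40.0, 1000)
x, y = sol.sol(ts)
# the curve (x(t), y(t)) spirals into the origin for a > 0
print(abs(y).max(), abs(y[-1]))  # the amplitude decays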
8.140. Undamped oscillation. Find the function y(i) which satisfies the following differential equation and initial conditions:
y"(t) + 4y(t) = f(t), ;y(0) = 0, y'(0) = -1,
where the function f(t) is piecewise continuous:
1 cos(2?) for 0 < t < 7i, 10 for t > 7t.
fit)
Solution. This problem is a model of undamped oscillation of a spring (omitting friction, non-linearities in the toughness of the spring, and other factors) which is initiated by an outer force.
The function f(t) can be written as a linear combination of Heav-iside's function u(t) and its shift, i. e.,
Since
f(t) = cos(2í)(w(í) - M0)
£(y")(s) = s2C(y) - sy(0) - y'(0) = s2 C(y) + 1,
we get, applying the results of the above exercises 7 and 8 to the Laplace transform of the right-hand side
s2 C(y) + \ + 4C(y)
£(cos(2í)(«(0 - M0)) = £(cos(2i) • u(t)) - £(cos(2i) • M0)
£(cos(2i))
Hence,
C(y)
(1
1
'£(cos(2(í + 7C))
s2 +4
s2 +4
+ (1
(s2 + 4)2
Performing the inverse transform, we obtain the solution in the form
s
y(t)
sin(2í) + \t sin(2i) + C~l \e
(s2 + Af
Just as we saw in the case of integration of functions (which is, in fact, the simplest type of equations with separated variables), the equations usually do not have a solution expressible explicitly in terms of elementary functions.
Similarly as with standard engineering tables of values of special functions, books listing the solutions of basic equations were compiled as well.⁴ Today, the wisdom concealed in them is essentially transferred to software systems like Maple or Mathematica. There, we can assign any task on ordinary differential equations, and we will get the results in a surprisingly good deal of cases; yet, for most problems, it will still not be possible.
The way out of this is numerical methods, which try only to approximate the solutions. However, to be able to use them, we still need good theoretical starting points regarding existence, uniqueness, and stability of the solutions.
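The simplest of these numerical methods just follows the arrows (1, f(t, y)) of the directional field with a small step; a minimal sketch of the explicit Euler method (our illustration, not a method discussed in the text here):

import numpy as np

def euler(f, t0, y0, t1, n=1000):
    """Approximate the solution of y' = f(t, y), y(t0) = y0, at t1."""
    t, y, h = t0, y0, (t1 - t0) / n
    for _ in range(n):
        y += h * f(t, y)   # follow the arrow (1, f(t, y)) for a step h
        t += h
    return y

# sanity check on y' = y, y(0) = 1; the exact value at t = 1 is e
print(euler(lambda t, y: y, 0.0, 1.0, 1.0), np.e)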
We begin with the so-called Picard-Lindelof theorem:
Existence and uniqueness of the solutions of ODEs
8.49. Theorem. Let a function f(t, y) : R² → R have continuous partial derivatives on an open set U. Then for every point (t₀, y₀) ∈ U ⊂ R², there exists a maximal interval I = [t₀ − a, t₀ + b], with positive a, b ∈ R, and a unique function y(t) : I → R which is a solution of the equation

y′(t) = f(t, y(t))

on the interval I.
Proof. Notice that if a function y(t) is a solution of our equation satisfying the initial condition y(t₀) = y₀, then it also satisfies the equation

y(t) = y₀ + ∫_{t₀}^{t} y′(s) ds = y₀ + ∫_{t₀}^{t} f(s, y(s)) ds.
However, the right-hand side of this expression is, up to constant, the integral operator
L(y)(t) = y₀ + ∫_{t₀}^{t} f(s, y(s)) ds.
When solving our first-order differential equations, we are thus looking for a fixed point of this operator L, i.e., we want to find a function y = y(t) satisfying L(y) = y.
On the other hand, if a Riemann-integrable function y(t) is a fixed point of the operator L (y), then it immediately follows from the antiderivative theorem that y(t) indeed satisfies the given differential equation, including the initial conditions.
We can quite easily estimate how much the values L(y) and L(z) of the operator differ for various arguments y(t) and z(t). Indeed, thanks to the partial derivatives of the function f being continuous, we know that f is locally Lipschitz. This means that we have the bound

|f(t, y) − f(t, z)| ≤ C·|y − z|,

with a constant C, if we restrict the values (t, y) to a neighborhood of the point (t₀, y₀) with compact closure. We choose an ε > 0 and restrict the value of t to some interval J = [t₀ − a₀, t₀ + b₀] so
⁴ E.g., Kamke.
However, by formula (||7.36||), we have

L⁻¹(e^{−πs}·s/(s² + 4)²)(t) = (1/4)·L⁻¹(e^{−πs}·L(t sin(2t)))(t) = (1/4)(t − π)·sin(2(t − π))·H_π(t).

Since Heaviside's function H_π is zero for t < π and equal to 1 for t > π, we get the solution in the form

y(t) = −(1/2) sin(2t) + (1/4) t sin(2t) for 0 ≤ t < π,
y(t) = ((π − 2)/4) sin(2t) for t ≥ π.
□
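The key inverse transform used above, L⁻¹(s/(s² + 4)²) = (1/4) t sin(2t), can be reproduced with sympy (the result may carry an explicit Heaviside factor):

from sympy import symbols, inverse_laplace_transform

s, t = symbols('s t', positive=True)
print(inverse_laplace_transform(s/(s**2 + 4)**2, s, t))
# t*sin(2*t)/4 (possibly multiplied by Heaviside(t))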
8.141. Find the general solution of the equation
/" - 5/ - 8/ + 48y = 0.
Solution. This is a third-order linear differential equation with constant coefficients, since it is of the form

y⁽ⁿ⁾ + a₁y⁽ⁿ⁻¹⁾ + a₂y⁽ⁿ⁻²⁾ + ⋯ + a_{n−1}y′ + aₙy = f(x)

for certain constants a₁, …, aₙ ∈ R. Moreover, we have f(x) = 0, i.e., the equation is homogeneous.

First of all, we will find the roots of the so-called characteristic polynomial

λⁿ + a₁λⁿ⁻¹ + a₂λⁿ⁻² + ⋯ + a_{n−1}λ + aₙ.

Each real root λ with multiplicity k corresponds to the k solutions

e^{λx}, x·e^{λx}, …, x^{k−1}·e^{λx},

and every pair of complex roots λ = α ± iβ with multiplicity k corresponds to the k pairs of solutions

e^{αx} cos(βx), x·e^{αx} cos(βx), …, x^{k−1}·e^{αx} cos(βx),
e^{αx} sin(βx), x·e^{αx} sin(βx), …, x^{k−1}·e^{αx} sin(βx).
Then, the general solution corresponds to all linear combinations of the above solutions.
Therefore, let us consider the polynomial
k3 - 5k2 - Sk + 48
with roots ki = k2 = 4, k3 = —3. Since we know the roots, we can deduce the general solution as well:
y = de41 + C2x e4x + C3e"3\ d, C2, C3el. □
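The roots of the characteristic polynomial can of course also be found numerically; a one-line numpy check:

import numpy as np

print(np.roots([1, -5, -8, 48]))  # approximately [4, 4, -3]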
8.142. Compute
y‴ + y″ + 9y′ + 9y = eˣ + 10 cos(3x).
Solution. First, we will solve the corresponding homogeneous equation. The characteristic polynomial is equal to

λ³ + λ² + 9λ + 9,

with roots λ₁ = −1, λ₂ = 3i, λ₃ = −3i. The general solution of the corresponding homogeneous equation is thus

y = C₁e^{−x} + C₂ cos(3x) + C₃ sin(3x), C₁, C₂, C₃ ∈ R.

The solution of the non-homogeneous equation is of the form

y = C₁e^{−x} + C₂ cos(3x) + C₃ sin(3x) + y_p, C₁, C₂, C₃ ∈ R,
for a particular solution yp of the non-homogeneous equation.
The right-hand side of the given equation is of a special form. In general, if the non-homogeneous part is given by a function
that J × [y₀ − ε, y₀ + ε] ⊂ U, and we consider only those functions y(t) and z(t) which, for t ∈ J, satisfy

max |y(t) − y₀| ≤ ε, max |z(t) − y₀| ≤ ε.

Now, we obtain the bound

|(L(y) − L(z))(t)| = |∫_{t₀}^{t} (f(s, y(s)) − f(s, z(s))) ds| ≤ ∫_{t₀}^{t} |f(s, y(s)) − f(s, z(s))| ds.
Theorem. Let f : U × Rᵏ → Rⁿ be a mapping with continuous first derivatives, where U ⊂ Rⁿ is open. Then the system of differential equations dependent upon a parameter λ ∈ Rᵏ, with initial condition at a point x ∈ U,

y′(t) = f(y(t), λ), y(0) = x,

has a unique solution y(t, x, λ), which is a mapping with continuous first derivatives with respect to each variable.
Proof. First, we can notice that we can consider a system dependent on parameters to be an ordinary autonomous system with no parameters if we consider even the parameters to be space variables and add the (vector) conditions λ′(t) = 0 and λ(0) = λ. Therefore, without loss of generality, we can prove the theorem for autonomous systems with no further parameters and concentrate on the dependency upon the initial conditions.
Just like in the case of the fundamental existence theorem, we will build upon Picard's approximations of the solution using the integral operator:

y₀(t, x) = x, y_{k+1}(t, x) = x + ∫₀^{t} f(y_k(s, x)) ds.
By merely refining the proof of theorem 8.49, we can verify the uniform convergence of the approximations y_k(t, x) to the solution y(t, x), including the variable x.
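The Picard approximations are also a practical computational scheme; a minimal numerical sketch for the autonomous equation y′ = y, y(0) = x (an example of ours), whose iterates are the Taylor polynomials of x·eᵗ:

import numpy as np

def picard(f, x, ts, iterations=10):
    """Iterate y_{k+1}(t) = x + int_0^t f(y_k(s)) ds on the grid ts."""
    y = np.full_like(ts, x)  # y_0(t, x) = x
    for _ in range(iterations):
        g = f(y)
        # cumulative trapezoidal rule for the integral from 0 to t
        y = x + np.concatenate(([0.0],
            np.cumsum(0.5*(g[1:] + g[:-1])*np.diff(ts))))
    return y

ts = np.linspace(0.0, 1.0, 1001)
print(picard(lambda y: y, 1.0, ts)[-1], np.e)  # both ≈ 2.71828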
Now, for the initial condition, let us fix a point x₀ and a small neighborhood V of it, which, if need be, will be reduced during the following bounds, and let us write C for the constant which, thanks to the Lipschitzness of the function f, gives the bound
|f(y) − f(z)| ≤ C·|y − z| for all y, z in V.

Determine the isolated, limit, interior, and boundary points of the sets N, Q, and X = {x ∈ R; 0 ≤ x < 1} in R.
Solution. The set N. For any n ∈ N, we have that

O₁(n) ∩ N = (n − 1, n + 1) ∩ N = {n}.
Hence, there is a neighborhood of n ∈ N in R which contains only one natural number (the number n), therefore every point n ∈ N is isolated. There are thus no interior points (an isolated point cannot be interior). A point a ∈ R is a limit point of A if and only if every neighborhood of a contains infinitely many points of A. However, the set
O₁(a) ∩ N = (a − 1, a + 1) ∩ N, where a ∈ R,

is finite, hence N has no limit points. By finiteness of this set, we have that
δ_b := inf_{n ∈ O₁(b) ∩ N} |b − n| > 0 for b ∈ R ∖ N.

Therefore, O_{δ_b}(b) ∩ N = ∅, so no b ∈ R ∖ N is a boundary point of N. We also know that every point which belongs to a given set but is not its interior point is necessarily its boundary point. The set of N's boundary points thus contains N, and so it equals N.
The set Q. The rational numbers are a dense subset of the real numbers. This means that for every real number, there is a sequence of rational numbers converging to it. (We can, for instance, imagine the decimal representation of a real number and the corresponding sequence whose i-th term will be the representation truncated to the first i decimal digits. Furthermore, we can suppose that the terms of this sequence are pairwise distinct, for example by deliberately changing the last digit, or by taking the representation with recurring nines rather than zeros, i.e. 0.999… for the integer 1, and so on.) The set of Q's limit points is thus the whole R, and every point x ∈ R ∖ Q is a boundary point. Especially, we get that any δ-neighborhood
(p/q − δ, p/q + δ), where p, q ∈ Z, q ≠ 0,

of a rational number p/q contains infinitely many rational numbers, hence there are no isolated points. The number √2/10ⁿ is rational for no n ∈ N: supposing the contrary (again, p, q ∈ Z, q ≠ 0),

√2/10ⁿ = p/q, i.e. √2 = 10ⁿ·p/q,

we arrive at an immediate contradiction, as we know that the number √2 is not rational. Every neighborhood of a rational number p/q thus contains infinitely many real numbers p/q + √2/10ⁿ (n ∈ N)
The multiplication A·(y₀, y₁, y′₀, y′₁)ᵀ gives the vector (a₃, a₂, a₁, a₀)ᵀ of coefficients of the polynomial f, i.e.

f(x) = (2y₀ − 2y₁ + y′₀ + y′₁)x³ + (−3y₀ + 3y₁ − 2y′₀ − y′₁)x² + y′₀x + y₀.
5.9. Spline interpolation. Similarly, we can prescribe any finite number of derivatives at the particular points, and a convenient choice for the upper bound on the degree of the wanted polynomial leads to a unique interpolation. We will not delay ourselves with details here. Unfortunately, these interpolations do not solve the problems mentioned already in connection with the simple interpolation of values: complexity of the computations and instability. However, the usage of derivatives allows us to improve our methods:
As we have seen in the pictures demonstrating the instability of the interpolation by a single polynomial of sufficiently large degree, small local changes of the values dramatically affected the overall changes of the behavior of the resulting polynomial. Thus we may try to use small polynomial pieces of low degrees which we, however, must be able to link to one another properly.
The simplest case is to link each pair of adjacent points with a polynomial of degree at most one. This is also the most frequent way of displaying data. From the view of derivatives, this means that they will be constant on each of the segments and then will change in a leap.
A bit more sophisticated method is to prescribe the value and the derivative at each point, i.e. we will have four values for two points, which uniquely determines Hermite's polynomial of degree three, see above. This polynomial can then be used for all the values of the input variable between the marginal points x₀ < x₁, i.e., on the interval [x₀, x₁]. Such a piecewise polynomial approximation has the property that the first derivatives will be compatible.
However, in practice, mere compatibility of the first derivatives is insufficient (for instance, with railway tracks), and furthermore, the values of the first derivatives are not always at our disposal. Thus we get the idea of making use of the values at the given points, and on the other hand to require equality of the first and second derivatives between the adjacent pieces of the cubic polynomials. These conditions yield the same number of equations and unknowns, and so the problem will be similarly solvable:
Cubic splines

Let x₀ < x₁ < ⋯ < xₙ be real values at which the required values y₀, …, yₙ are given. A cubic interpolation spline for this assignment is a function S : R → R which satisfies the following conditions:

• the restriction of S on the interval [x_{i−1}, x_i] is a polynomial S_i of degree at most three, i = 1, …, n,
• S_i(x_{i−1}) = y_{i−1} and S_i(x_i) = y_i for all i = 1, …, n,
• S′_i(x_i) = S′_{i+1}(x_i) for all i = 1, …, n − 1,
• S″_i(x_i) = S″_{i+1}(x_i) for all i = 1, …, n − 1.
The cubic spline¹ for n + 1 points consists of n cubic polynomials, i.e. we have 4n free parameters (the first condition from the
The name comes from the meaning of a ruler used to draw smooth curves between points.
which are not rational (Q, as a field, is closed under subtraction). Therefore, every point p/q ∈ Q is a boundary point as well, and there are no interior points of the set Q.
The set X = [0, 1). Let a ∈ [0, 1) be an arbitrary number. Apparently, the sequences (a + 1/n) (for n large enough that a + 1/n < 1) and (1 − 1/n) consist of points of X and converge to a and 1, respectively. So we have easily shown that the set of X's limit points contains the interval [0, 1]. There are no other limit points: for any b ∉ [0, 1], there is δ > 0 such that O_δ(b) ∩ [0, 1] = ∅ (for b < 0 it suffices to take δ = −b, and for b > 1 we can choose δ = b − 1). Since every point of the interval [0, 1) is a limit point, there are no isolated points. For a ∈ (0, 1), let δ_a be the lesser of the two positive numbers a and 1 − a. Considering
O_{δ_a}(a) = (a − δ_a, a + δ_a) ⊂ (0, 1), a ∈ (0, 1),
we see that every point of the interval (0, 1) is an interior point of X. For every 8 e (0, 1), we have that
O_δ(0) ∩ [0, 1) = (−δ, δ) ∩ [0, 1) = [0, δ),
O_δ(1) ∩ [0, 1) = (1 − δ, 1 + δ) ∩ [0, 1) = (1 − δ, 1),
so every δ-neighborhood of the point 0 contains some points of the interval [0, 1) and some points of the interval (−δ, 0), and every δ-neighborhood of 1 has a non-empty intersection with the intervals [0, 1) and [1, 1 + δ). Therefore, 0 and 1 are boundary points. Altogether, we have found that the set of X's interior points is the interval (0, 1) and the set of X's boundary points is the two-element set {0, 1}, as we know that no point can be both interior and boundary and that a boundary point of X must be an interior or limit point of X. □
5.29. Determine the suprema and infima of the following sets in R:

A = (−3, 0] ∪ (1, π) ∪ {6}; B = {(−1)ⁿ/n ; n ∈ N} ⊂ R; C = (−9, …). ○

5.30. Find sup A and inf A for

A = {(n + (−1)ⁿ)/n ; n ∈ N}. ○

5.31. The following sets are given:

N = {1, 2, …, n, …}, M = {1/n ; n ∈ N}, J = (0, 2] ∪ [3, 5] ∖ {4}.

Determine inf N, sup M, inf J and sup J in R. ○
definition). The other conditions then yield 2n + (n − 1) + (n − 1) more equalities, i.e. two parameters remain free. In practice, we often prescribe the values of the derivatives at the marginal points explicitly (the so-called complete spline), or assume they equal zero (this case is called a natural spline).
Unfortunately, the computation of the whole spline is not as easy as with the independent computations of Hermite's cubic polynomials, because the data mingle between adjacent intervals. However, with an appropriate ordering, one can obtain a matrix of the system such that all of its non-zero elements appear on three diagonals only. These matrices are nice enough to be solved in time proportional to the number of points, using a suitable numerical method. For comparison, one can interpolate the same data as in the case of the Lagrange polynomial, now using splines; a computational sketch follows.
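In practice, such splines are readily available in numerical libraries; a minimal Python sketch (with made-up data) using scipy, where bc_type='natural' corresponds to the natural spline above:

import numpy as np
from scipy.interpolate import CubicSpline

xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # nodes x_0 < ... < x_n
ys = np.array([1.0, 3.0, 2.0, 2.5, 0.0])   # prescribed values y_0, ..., y_n
S = CubicSpline(xs, ys, bc_type='natural')  # S'' = 0 at the marginal points
print(S(xs))           # reproduces ys
print(S(1.5), S(2.5))  # smooth values between the nodes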
2. Real number and limit processes
It is important to have a sufficiently large stock of functions with which we can express all usual dependencies. However, at the same time, the choice of the functions must be carefully restricted so that we would be able to build some universal and efficient tools for the work with them.
Actually, the first problem we have to solve is how to define the values of the functions at all. After all, all we can get with a finite number of multiplications and additions is polynomial functions, and efficient manipulation can be done with rational numbers only. However, we cannot make do with rational numbers even when looking for roots of quadratic polynomials as, for instance, √2 is not a rational number.
Thus our first step will be a thorough introduction of the so-called limit process, i.e. we will define precisely what it means that some values approach a certain value.
We can also notice that an important property of polynomials is the "continuous" dependency of their values on the input variable. Intuitively said, if we change x a little bit, the value of f(x) also changes only a little. On the other hand, this behavior is not possessed by piecewise constant functions f : R → R near the sudden "jumps". For instance, the so-called Heaviside function

f(x) = 0 for all x < 0, 1/2 for x = 0, 1 for all x > 0

has this type of "discontinuity" for x = 0.
Let us formalize these intuitive statements.
5.32. Find a set M ⊂ R which does not have an infimum in R but has a supremum there. Similarly, find a set N ⊂ R which does not have a supremum in R but has an infimum there. ○

5.33. Find a subset X of the set R such that sup X < inf X. ○

5.34. Find sets A, B, C ⊂ R such that

A ∩ B = ∅, A ∩ C = ∅, B ∩ C = ∅, sup A = inf B = inf C = sup C. ○
5.35. Mark the following sets in the complex plane:

i) {z ∈ C : |z − 1| = |z + 1|},
ii) {z ∈ C : 1 ≤ |z − i| ≤ 2},
iii) {z ∈ C : Re(z²) = 1},
iv) {z ∈ C : Re(1/z) < 1/2}.

Solution.
• the imaginary axis,
• the annulus around i,
• the hyperbola a² − b² = 1 (writing z = a + ib),
• the exterior of the unit disc centered at 1. □
C. Limits

In the subsequent exercises, we will deal with calculating limits of sequences, that is, with what the sequences "look like at infinity". Then, if we were to determine the n-th term of a given sequence for a very large n, the limit of the sequence (supposing it exists) can approximate it very well. We devote much space to the computation of limits of sequences (and limits of functions) in this exercise column; that is why they begin earlier (and end later) than in the part concerning the theory.

Let us begin with limits of sequences. The needful definitions can be found on page 266.
5.10. Real numbers. So far, we have made do with algebraic properties of real numbers which claimed that R is a field. However, we have also used the relation of the standard (total) order of the real numbers, denoted "≤" (see the paragraph 1.38). The properties (axioms) of the real numbers, including the connections between the relations and other operations, are enumerated in the following table. The bars indicate how the axioms gradually guarantee that the real numbers form an abelian (commutative) group with respect to addition, that R ∖ {0} is an abelian group with respect to multiplication, that R is a field, and that the set R together with the operations +, · and the order relation is a so-called ordered field. Finally, the last axiom can be perceived as claiming that R is "sufficiently dense", i.e. there are no points missing between any points (like, for instance, √2 is missing in the rational numbers).
Axioms of the real numbers

(R1) (a + b) + c = a + (b + c), for all a, b, c ∈ R
(R2) a + b = b + a, for all a, b ∈ R
(R3) there is an element 0 ∈ R such that for all a ∈ R, a + 0 = a
(R4) for all a ∈ R, there is an additive inverse (−a) ∈ R such that a + (−a) = 0
(R5) (a · b) · c = a · (b · c), for all a, b, c ∈ R
(R6) a · b = b · a, for all a, b ∈ R
(R7) there is an element 1 ∈ R, 1 ≠ 0, such that for all a ∈ R, 1 · a = a
(R8) for all a ∈ R, a ≠ 0, there is a multiplicative inverse a⁻¹ ∈ R such that a · a⁻¹ = 1
(R9) a · (b + c) = a · b + a · c, for all a, b, c ∈ R
(R10) the relation ≤ is a total order, i.e. reflexive, antisymmetric, transitive, and total on R
(R11) for all a, b, c ∈ R, a ≤ b implies a + c ≤ b + c
(R12) for all a, b ∈ R, a > 0 and b > 0 implies a · b > 0
(R13) every non-empty set A ⊂ R which has an upper bound has a least upper bound.
The concept of a least upper bound (also called supremum) must be thoroughly introduced. It makes sense for any partially ordered set, i.e. a set with a (not necessarily total) ordering relation. We will also meet it later in algebraic contexts. Let us remind ourselves that, at the general level, an ordering relation is any binary relation on a set which is reflexive, antisymmetric, and transitive; see the paragraph 1.38.

Supremum and infimum
Definition. Let us consider a subset A c B in a partially ordered set B. An upper bound of the set A is any element b e B such that b > a holds for all a e A. Dually, we define the concept of a lower bound of the set A as an element b e B such that b < a for all a € A.
The least upper bound of the set A, if it exists, is called its supremum and denoted by sup A. Dually, the greatest lower bound, if it exists, is called an infimum; we write inf A.
The last axiom of our table of properties of the real numbers thus claims that for every non-empty set A of real numbers, it is true that if there is a number a which is greater than or equal to all
5.36. Calculate the following limits of sequences:

i) lim_{n→∞} (2n² + 3n + 1)/(n + 1),
ii) lim_{n→∞} (2n² + 3n + 1)/(3n² + n + 1),
iii) lim_{n→∞} (n + 1)/(2n² + 3n + 1),
iv) lim_{n→∞} (2^n − 2^{−n})/(2^n + 2^{−n}),
v) lim_{n→∞} √(4n² + n)/n,
vi) lim_{n→∞} (√(4n² + n) − 2n).

Solution.

i) lim_{n→∞} (2n² + 3n + 1)/(n + 1) = lim_{n→∞} (2n + 3 + 1/n)/(1 + 1/n) = ∞,
ii) lim_{n→∞} (2n² + 3n + 1)/(3n² + n + 1) = lim_{n→∞} (2 + 3/n + 1/n²)/(3 + 1/n + 1/n²) = 2/3,
iii) lim_{n→∞} (n + 1)/(2n² + 3n + 1) = lim_{n→∞} (1/n + 1/n²)/(2 + 3/n + 1/n²) = 0,
iv) lim_{n→∞} (2^n − 2^{−n})/(2^n + 2^{−n}) = lim_{n→∞} (1 − 2^{−2n})/(1 + 2^{−2n}) = 1,
v) By the squeeze theorem (5.21): for all n ∈ N,

√(4n²) ≤ √(4n² + n) ≤ √(4n² + n + 1/16), i.e. 2n ≤ √(4n² + n) ≤ 2n + 1/4.

Since lim_{n→∞} 2n/n = 2 and lim_{n→∞} (2n + 1/4)/n = 2, we get lim_{n→∞} √(4n² + n)/n = 2 as well.
vi)

lim_{n→∞} (√(4n² + n) − 2n) = lim_{n→∞} ((√(4n² + n) − 2n)(√(4n² + n) + 2n))/(√(4n² + n) + 2n) = lim_{n→∞} n/(√(4n² + n) + 2n) = lim_{n→∞} 1/(√(4 + 1/n) + 2) = 1/4.

□
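Such limits are easy to check numerically by evaluating the sequence at a large index; for instance, for v) and vi):

import numpy as np

n = 1e8
print(np.sqrt(4*n**2 + n) / n)      # ≈ 2
print(np.sqrt(4*n**2 + n) - 2*n)    # ≈ 0.25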
5.37. Let c > 0 be a real number. Show that

lim_{n→∞} ⁿ√c = 1.
Solution. First, let us consider c > 1. The function ⁿ√c is decreasing (in n), yet all its values are greater than 1, hence the sequence ⁿ√c has a limit, and this limit is equal to the infimum of the sequence's terms. Let us suppose, for a while, that this limit is greater than 1, that is, 1 + ε for some ε > 0. Then, by the definition of a limit, all the sequence's terms will eventually (from some index m on) be less than 1 + ε + ε²/4; especially ᵐ√c < 1 + ε + ε²/4. But then we have that

²ᵐ√c ≤ √(1 + ε + ε²/4) = 1 + ε/2 < 1 + ε,
numbers x ∈ A, then there is a least number with this property. For instance, the choice A = {x ∈ Q, x² < 2} gives us the supremum sup A = √2.
An immediate consequence of this axiom is also the existence of infima for any non-empty set of real numbers bounded from below. (It suffices to realize that changing the sign of all the numbers interchanges suprema and infima).
For the formal construction of our theory, we need to know whether the properties we demand from the real numbers are realizable, i.e. whether there is such a set R with the operations and ordering relation which satisfies the thirteen axioms. So far, we have constructed correctly only the rational numbers, which form an ordered field, i.e. satisfy the axioms (R1)–(R12), which can easily be verified.
Actually, the real numbers can not only be constructed, but the construction is, up to isomorphism, unique. However, for our need, we will do with an intuitive idea of the real line. We will focus on the existence and uniqueness later on.
5.11. The complex plane. Let us remind ourselves that the complex numbers are given as pairs of real numbers. We usually write them as z = re z + i·im z. Therefore, the plane C = R² is a good image of the complex numbers. With addition and multiplication, the complex numbers satisfy the axioms (R1)–(R9) and thus form a field. There is, however, no natural ordering defined on them which would satisfy the axioms (R10)–(R13). Nevertheless, we will work with them, as we have already seen that extending some scalars to the complex numbers is highly advantageous for calculations, and sometimes even necessary.
There is an important operation on the complex numbers, the so-called conjugation. It is the reflection symmetry with respect to the line of real numbers, i. e. changing the sign of the imaginary part. We denote it by a bar over the number z e C:
z̄ = re z − i·im z.
Since for z = x + iy,
z·z̄ = (x + iy)(x − iy) = x² + y²,
this value expresses the squared distance of the complex number from the origin (zero). The square root of this non-negative real number is called the absolute value of the complex number z; we write

(5.3) |z| = √(z·z̄)

(a non-negative real number). The absolute value is also defined on any ordered field of scalars K; we just define the absolute value |a| as follows:

|a| = a if a ≥ 0, and |a| = −a if a < 0.
Of course, it is true that for any numbers a, b ∈ K,

(5.4) |a + b| ≤ |a| + |b|.
This property is called the triangle inequality. It also holds for the absolute value of the complex numbers, which was defined above.
Especially for the field of rational numbers and the field of real numbers, which are subfields of the complex numbers, both definitions of the absolute value coincide.
which contradicts our assumption that 1 + ε is the infimum of the considered sequence.

The theorem is trivial for c = 1, and for a number c ∈ (0, 1), it follows from the above if we invoke the theorem for the number 1/c. □
5.38. Determine

lim_{n→∞} ⁿ√n.

Solution. Apparently, we have ⁿ√n ≥ 1, n ∈ N. So we can set ⁿ√n = 1 + aₙ for certain numbers aₙ ≥ 0, n ∈ N. By the binomial theorem, we get that

n = (1 + aₙ)ⁿ = 1 + (n choose 1)·aₙ + (n choose 2)·aₙ² + ⋯ + aₙⁿ, n ≥ 2 (n ∈ N).

Hence we have the bound (all the numbers aₙ are non-negative)

n ≥ (n choose 2)·aₙ² = (n(n − 1)/2)·aₙ², n ≥ 2 (n ∈ N),

which leads to

0 ≤ aₙ ≤ √(2/(n − 1)), n ≥ 2 (n ∈ N).

By the squeeze theorem,

0 = lim_{n→∞} 0 ≤ lim_{n→∞} aₙ ≤ lim_{n→∞} √(2/(n − 1)) = 0.

Thus we have obtained the result

lim_{n→∞} ⁿ√n = lim_{n→∞} (1 + aₙ) = 1 + 0 = 1.

We can notice that by further application of the squeeze theorem, we get

1 = lim_{n→∞} 1 ≤ lim_{n→∞} ⁿ√c ≤ lim_{n→∞} ⁿ√n = 1

for every real number c > 1. □
5.39. Calculate the limit

lim_{n→∞} (√2 · ⁴√2 · ⁸√2 ⋯ ²ⁿ√2).

Solution. To determine the limit, it is sufficient to express the terms in the form

2^{1/2} · 2^{1/4} · 2^{1/8} ⋯ 2^{1/2ⁿ} = 2^{1/2 + 1/4 + 1/8 + ⋯ + 1/2ⁿ}.

Thus we get

lim_{n→∞} (√2 · ⁴√2 ⋯ ²ⁿ√2) = lim_{n→∞} 2^{1/2 + 1/4 + ⋯ + 1/2ⁿ} = 2^{Σ_{n=1}^{∞} (1/2)ⁿ}.
5.12. Convergence of a sequence. In the following paragraphs, we will work with one of the number sets K of rational, real, or complex numbers. The absolute value thus must be understood in the corresponding context, and we should also bear in mind that the triangle inequality holds in all these cases.

We would like to formalize the notion of a sequence of numbers approaching a limit. Therefore, the key object of our interest will be sequences of numbers aᵢ, where the index i usually goes throughout the natural numbers. We will denote the sequences either loosely as a₀, a₁, …, or as infinite vectors (a₀, a₁, …), or (similarly to the matrix notation) as (aᵢ)ᵢ₌₀^∞.
Cauchy sequences

Let us consider a sequence (a₀, a₁, …) of elements of K such that for any fixed positive number ε > 0, it holds for all but finitely many pairs of terms aᵢ, aⱼ of the sequence that

|aᵢ − aⱼ| < ε.

In other words, for any fixed ε > 0, there is an index N such that the above inequality holds for all i, j ≥ N; i.e. the elements of the sequence are eventually arbitrarily close to each other. Such a sequence is called a Cauchy sequence.
Intuitively, we feel that either all but finitely many of the sequence's terms are equal (then |aᵢ − aⱼ| = 0 will hold from some index N on), or they "approach" some value. This is easily imaginable in the complex plane: choosing an arbitrarily small disc (with radius equal to ε), then, supposing we have a Cauchy sequence, it must be possible to place it in the complex plane in such a way that it covers all but finitely many of the elements of the infinite sequence aᵢ. We can imagine that the disc gradually shrinks to a single value a; see the picture.
[Figure: a sequence of complex numbers covered by a shrinking disc.]

If such a value a ∈ K exists for a Cauchy sequence, we would expect the sequence to have the property of convergence:
Convergent sequences

We say that a sequence (aᵢ)ᵢ₌₀^∞ converges to a value a iff for any positive real number ε,

|aᵢ − a| < ε

holds for all but finitely many indices i (the set of those i for which the inequality does not hold may depend on ε). The number a is called the limit of the sequence (aᵢ)ᵢ₌₀^∞.

If a sequence aᵢ ∈ K, i = 0, 1, …, converges to a ∈ K, then for any fixed positive ε, we know that |aᵢ − a| < ε for all i greater than a certain N ∈ N. However, by the triangle inequality, we then get that for all pairs of indices i, j ≥ N, it is true that

|aᵢ − aⱼ| = |aᵢ − a + a − aⱼ| ≤ |aᵢ − a| + |a − aⱼ| < 2ε.
Thus we have proved:
Lemma. Every converging sequence is a Cauchy sequence.
By the well-known formula for the sum of a geometric series,

Σ_{n=1}^{∞} (1/2)ⁿ = 1,

whence it follows that

lim_{n→∞} (√2 · ⁴√2 · ⁸√2 ⋯ ²ⁿ√2) = 2¹ = 2. □
5.40. Determine

lim_{n→∞} (1/n² + 2/n² + ⋯ + (n − 2)/n² + (n − 1)/n²). ○

5.41. Calculate

lim_{n→∞} (√(n³ − 11n² + 2) + ⁷√(n⁷ − 2n⁵ − n³ − n + sin²n)) / (2 − ⁵√(5n⁴ + 2n³ + 5)). ○

5.42. Determine the limit

lim_{n→∞} (n! + (n − 2)! − (n − 4)!) / (n⁵⁰ + n! − (n − 1)!). ○

5.43. Find two sequences (let us denote their terms by xₙ and yₙ (n ∈ N), respectively) having infinite limits and such that

lim_{n→∞} (xₙ + yₙ) = 1, lim_{n→∞} (xₙ yₙ²) = +∞. ○

5.44. Determine the limit points of the sequence (aₙ) given by

aₙ = ((−1)ⁿ 2n)/√(4n² + 5n + 3), n ∈ N. ○

5.45. Calculate

lim sup aₙ and lim inf aₙ if aₙ = ((n² + 4n − 5)/(n² + 9)) sin(nπ/4), n ∈ N. ○

5.46. Determine

lim inf_{n→∞} ((−1)ⁿ (1 + 1/n) + sin(nπ/4)). ○
However, in the field of rational numbers, it can easily happen that the corresponding value a does not exist even for a Cauchy sequence. For instance, the number √2 can be approached by rational numbers aᵢ with arbitrary accuracy, thereby obtaining a Cauchy sequence converging to √2, whose limit is, however, not rational.

Ordered fields of scalars in which every Cauchy sequence converges are called complete. The following theorem proposes that the axiom (R13) guarantees that the real numbers are such a field:

Theorem. Every Cauchy sequence of real numbers aᵢ converges to a real value a ∈ R.
Proof. The terms of any Cauchy sequence form a bounded set, since any choice of ε bounds all but finitely many of them. Let us define B as the set of those real numbers x for which x < aⱼ holds for all but finitely many terms aⱼ of the sequence.

Apparently, B has an upper bound, and thus has a supremum as well, by (R13). Let us define a = sup B. Now, having fixed some ε > 0, we choose N so that |aᵢ − aⱼ| < ε for all i, j ≥ N. Especially, aⱼ > a_N − ε and aⱼ < a_N + ε for all indices j ≥ N, and so a_N − ε belongs to B, while a_N + ε does not. Altogether, we get that |a − a_N| ≤ ε, and thus

|a − aⱼ| ≤ |a − a_N| + |a_N − aⱼ| < 2ε

for all j ≥ N. However, this means that a is the limit of the considered sequence. □
Corollary. Every Cauchy sequence of complex numbers zᵢ converges to a complex number z.

Proof. Let us write zᵢ = aᵢ + i·bᵢ. Since |aᵢ − aⱼ|² ≤ |zᵢ − zⱼ|², and similarly for the values bᵢ, both sequences of real numbers aᵢ and bᵢ are Cauchy sequences. They converge to a and b, respectively, and we can easily verify that z = a + i·b is the limit of the sequence zᵢ. □
5.13. Remark. The previous discussion gives us a method for defining the real numbers. We proceed similarly to building the integers from the natural numbers (adding all additive inverses) and building the rational numbers from the integers (adding all multiplicative inverses of non-zero numbers). This time, we "complete" the rational numbers by all limits of Cauchy sequences.
It suggests itself to introduce a suitable equivalence relation on the set of all Cauchy sequences of rational numbers so that Cauchy sequences (aᵢ) and (bᵢ) are equivalent iff the distances |aᵢ − bᵢ| converge to zero (this is the same as the condition that merging these sequences into a single one, say with the terms of the first sequence at the odd positions and the terms of the second at the even positions, yields a Cauchy sequence as well). We will not verify in detail that this relation is an equivalence, neither will we define the operations and the ordering relation, nor will we prove that all of the axioms will indeed hold. Nevertheless, it is not difficult. Nor is proving the fact that the axioms (R1)–(R13) define the real numbers uniquely up to isomorphism (a bijective mapping preserving the algebraic operations as well as the ordering). We will return to these notes later.
5.47. Now let us proceed with limits of functions. The definition can be found on page 272. Determine

(a) lim_{x→π/3} sin x;
(b) lim_{x→2} (x² + x − 6)/(x² − 3x + 2);
(c) lim_{x→+∞} (arccos(1/(x + 1)))³;
(d) lim_{x→−∞} arctg(1/x), lim_{x→−∞} arctg x, lim_{x→−∞} arctg(sin x).

Solution. Exercise (a). Let us remind ourselves that a function f is, by definition, continuous at a given point x iff the limit of f at x is equal to the function value f(x). However, we know that the function y = sin x is continuous at every real number. Thus we get that

lim_{x→π/3} sin x = sin(π/3) = √3/2.

Exercise (b). The immediate substitution x = 2 leads to both a zero numerator and a zero denominator. Despite that, the problem can be solved very easily. The reduction

lim_{x→2} (x² + x − 6)/(x² − 3x + 2) = lim_{x→2} ((x − 2)(x + 3))/((x − 2)(x − 1)) = lim_{x→2} (x + 3)/(x − 1) = (2 + 3)/(2 − 1) = 5

leads to the correct result (thanks to the continuity of the obtained function at the point x₀ = 2). Let us realize that the limit of a function can be calculated from the function values in an arbitrarily small deleted neighborhood of a given point x₀ and that the limit does not depend on the function value at the point. We can thus make use of multiplying or reducing by factors which do not change the function values in an arbitrarily selected deleted neighborhood of the point x₀.
Exercise (c). By moving the limit inwards twice, the original limit transforms to

(arccos(lim_{x→+∞} 1/(x + 1)))³.

It can easily be shown that

lim_{x→+∞} 1/(x + 1) = 0.

As the function y = arccos x is continuous at the point 0 and takes the value π/2 there, and the function y = x³ is continuous at π/2, we get that

lim_{x→+∞} (arccos(1/(x + 1)))³ = (arccos(lim_{x→+∞} 1/(x + 1)))³ = (π/2)³.
5.14. Closed sets. For our further work with the real or complex numbers, we will need to thoroughly understand the notions of closeness, boundedness, convergence, and so on. For any subset A of points in K, we will be interested not only in the points belonging to A, but also in the ones which can be approached by limits of sequences.

Limit points of a set
Let us consider a set $A$ of points belonging to $\mathbb K$. A point $x\in\mathbb K$ is called a limit point of the set $A$ iff there is a sequence $a_0, a_1, \dots$ of elements of $A$ such that all its terms differ from $x$, yet its limit is $x$.
The limit points of a subset A of rational, real, or complex numbers are those numbers x which can be approached by such sequences of numbers lying in A which do not contain the point x itself. Let us notice that a limit point of a set may or may not belong to it.
For every non-empty set $A\subset\mathbb K$ and a fixed point $x\in\mathbb K$, the set of all distances $|x-a|$, $a\in A$, is a set of real numbers bounded from below, and so it has an infimum $d(x,A)$, which is called the distance of the point $x$ from the set $A$. Let us notice that $d(x,A)=0$ if and only if $x\in A$ or $x$ is a limit point of $A$. (We suggest that the reader prove this in detail from the definitions.)
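As a quick worked instance of these notions (our own example), take
$$A = \left\{\tfrac1n;\ n\in\mathbb N\right\}\subset\mathbb R,\qquad d(0,A) = \inf_{n\in\mathbb N}\left|0-\tfrac1n\right| = 0,\qquad 0\notin A,$$
so $0$ is a limit point of $A$, approached by the sequence $1,\tfrac12,\tfrac13,\dots$, and the closure introduced below is $\bar A = A\cup\{0\}$.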
Closed sets
The closure $\bar A$ of a set $A\subset\mathbb K$ is the set of those points which have zero distance from $A$ (note that the distance from the empty set of points is undefined, therefore $\bar\emptyset = \emptyset$).
A closed subset in $\mathbb K$ is a set which coincides with its closure. Thus closed sets are exactly those which contain all of their limit points as well. A typical example of a closed set is a closed interval
$$[a,b] = \{x\in\mathbb R;\ a\le x\le b\}$$
of real numbers, where $a$ and $b$ are fixed real numbers.
If either of the boundary values of the interval is missing, we write $a=-\infty$ (minus infinity) and similarly $b=+\infty$. The corresponding closed intervals are denoted by $(-\infty,b]$, $[a,\infty)$, and $(-\infty,\infty)$.
The closed sets are exactly those which contain all they can "converge to". A closed set may be formed by a sequence of real numbers without a limit point or a sequence with a finite number of limit points together with these points. The unit disc (including its boundary circle) in the complex plane is another example of a closed set.
We can easily verify that any intersection and any finite union of closed sets is again a closed set. Indeed, if all of the points of some sequence belong to the considered intersection of closed sets, then they belong to each of the sets, and so do all the limit points. However, if we wanted to say the same about an arbitrary union, we would get in trouble: singleton sets are closed, but a sequence of points created from them may not be. On the other hand, if we restrict our attention to finite unions and consider a limit point of some sequence lying in this union, then the limit point must also be the limit point of any subsequence, especially one lying in only one of the united sets. As this set is assumed to be closed, the limit point lies in it, and thus it lies in the whole union.
268
CHAPTER 5. ESTABLISHING THE ZOO
Exercise (d). The function $y = \operatorname{arctg} x$ has properties which are "useful when calculating limits": it is continuous and injective (increasing) on the whole domain. These properties always (with no further conditions or limitations) allow us to move the examined limit into the argument of such a function. Therefore, let us consider
$$\operatorname{arctg}\Big(\lim_{x\to-\infty}\frac1x\Big),\qquad \operatorname{arctg}\Big(\lim_{x\to-\infty}x^4\Big),\qquad \operatorname{arctg}\Big(\lim_{x\to-\infty}\sin x\Big).$$
Apparently,
$$\lim_{x\to-\infty}\frac1x = 0,\qquad \lim_{x\to-\infty}x^4 = +\infty,$$
and the limit $\lim_{x\to-\infty}\sin x$ does not exist, which implies
$$\lim_{x\to-\infty}\operatorname{arctg}\frac1x = \operatorname{arctg}0 = 0,\qquad \lim_{x\to-\infty}\operatorname{arctg}\left(x^4\right) = \lim_{y\to+\infty}\operatorname{arctg}y = \frac{\pi}{2},$$
and the last limit does not exist, either.
□
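A brief numerical sanity check of exercise (d) (a small sketch of ours in Python; math.atan plays the role of arctg):

```python
import math

for x in (-1e3, -1e6, -1e9):
    # arctg(1/x) tends to arctg(0) = 0 and arctg(x^4) tends to pi/2
    print(x, math.atan(1 / x), math.atan(x ** 4))
# arctg(sin(x)) keeps oscillating as x -> -oo, so its limit cannot exist
```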
5.48. Determine the limit
$$\lim_{x\to0}\frac{1-\cos x}{x^2\sin(x^2)}.$$
Solution.
$$\lim_{x\to0}\frac{1-\cos x}{x^2\sin(x^2)} = \lim_{x\to0}\frac{2\sin^2\frac x2}{x^2\sin(x^2)} = \lim_{x\to0}\frac{\frac12\sin^2\frac x2}{\left(\frac x2\right)^2\sin(x^2)} = \frac12\left(\lim_{x\to0}\frac{\sin\frac x2}{\frac x2}\right)^{2}\cdot\lim_{x\to0}\frac{1}{\sin(x^2)} = \frac12\cdot\infty = \infty.$$
The previous calculation must be considered "from the back". Since the limits on the right-hand side exist (no matter whether finite or infinite) and the expression $\frac12\cdot\infty$ is meaningful (see the note after theorem 5.22), the original limit exists as well. If we split the original limit into the product
$$\lim_{x\to0}(1-\cos x)\cdot\lim_{x\to0}\frac{1}{x^2\sin(x^2)},$$
we would get the $0\cdot\infty$ type, which is an indeterminate form, but this tells us nothing about the existence of the original limit. □
5.49. Determine the following limits:
i) $\lim_{x\to2}\dfrac{x-2}{\sqrt{x^2-4}}$;  ii) $\lim_{x\to0}\dfrac{\sin(\sin x)}{\sin x}$;  iii) $\lim_{x\to0}\dfrac{\sin^2 x}{x}$;  iv) $\lim_{x\to0}\mathrm e^{1/x}$.
Solution.
i)
$$\lim_{x\to2}\frac{x-2}{\sqrt{x^2-4}} = \lim_{x\to2}\frac{x-2}{\sqrt{(x-2)(x+2)}} = \lim_{x\to2}\frac{\sqrt{x-2}}{\sqrt{x+2}} = \frac{0}{2} = 0;$$
ii)
$$\lim_{x\to0}\frac{\sin(\sin x)}{\sin x} \overset{(5.27)}{=} \lim_{y\to0}\frac{\sin y}{y} = 1,$$
5.15. Open sets. There is another useful type of subsets of the real numbers: the open intervals
$$(a,b) = \{x\in\mathbb R;\ a<x<b\},$$
where, again, $a$ and $b$ are fixed real numbers or the infinite values $\pm\infty$. Every open interval is an open set in the following sense:
Open sets and neighborhoods of points
An open set in $\mathbb K$ is a set whose complement is a closed set. A neighborhood of a point $a\in\mathbb K$ is any open set $O$ which contains $a$. If the neighborhood is defined as
$$O_\delta(a) = \{x\in\mathbb K;\ |x-a|<\delta\}$$
for some positive number $\delta$, then we call it the $\delta$-neighborhood of the point $a$.
Let us notice that for any set $A$, $a\in\mathbb K$ is a limit point of $A$ if and only if every neighborhood of $a$ contains at least one more point $b\in A$, $b\ne a$.
Lemma. A set $A\subset\mathbb K$ of numbers is open if and only if with every point $a\in A$, an entire neighborhood of $a$ belongs to $A$.
Proof. Let $A$ be an open set and $a\in A$. If there were no neighborhood of the point $a$ inside $A$, there would be a sequence $a_n\notin A$ with $|a-a_n|<1/n$. But then the point $a\in A$ is a limit point of the set $\mathbb K\setminus A$, which is impossible since the complement of $A$ is closed.
Now let us suppose that every $a\in A$ has an entire neighborhood of its own lying in $A$. This naturally prevents a limit point $b$ of the set $\mathbb K\setminus A$ from lying in $A$. Thus the set $\mathbb K\setminus A$ is closed, and so $A$ is open. □
From this lemma, it immediately follows that any union of open sets results in an open set, and further that any finite intersection of open sets is also an open set.
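The finiteness assumption for intersections cannot be dropped; a standard example (ours) in $\mathbb R$ is
$$\bigcap_{n=1}^{\infty}\left(-\tfrac1n,\tfrac1n\right) = \{0\},$$
an infinite intersection of open intervals which is a singleton, hence closed and not open. Dually, the infinite union of the closed singletons $\{1/n\}$, $n\in\mathbb N$, is not closed, as it misses its limit point $0$.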
In the case of the real numbers, the $\delta$-neighborhood of a point $a$ is the open interval of length $2\delta$, centered at $a$. In the complex plane, it is the disc of radius $\delta$, also centered at $a$.
5.16. Bounded and compact number sets. The closed and open sets are the basic concepts of topology. Without going into deeper connections, we have just made ourselves familiar with the topology of the real line and the topology of the complex plane. The following concepts will be extremely useful:
Bounded and compact sets
A set $A$ of rational, real, or complex numbers is called bounded iff there is a positive real number $r$ such that $|z| < r$ for all numbers $z\in A$. Otherwise, the set is called unbounded.
A set which is both bounded and closed is called compact.
Closed bounded intervals of real numbers are a typical example of compact sets.
Let us add further topological concepts that will allow us to express ourselves efficiently:
An interior point of a set A of real or complex numbers is such a point that one of its neighborhoods is contained in A.
A boundary point of a set $A$ is a point whose every neighborhood has a non-empty intersection with both $A$ and its complement $\mathbb K\setminus A$. A boundary point of the set $A$ may or may not belong to it.
where we made use of the fact that $\lim_{x\to0}\sin x = 0$.
iii)
$$\lim_{x\to0}\frac{\sin^2 x}{x} = \lim_{x\to0}\sin x\cdot\lim_{x\to0}\frac{\sin x}{x} = 0\cdot1 = 0;$$
again, the original limit exists because both limits on the right-hand side exist and their product is well-defined.
iv) One must be cautious when calculating this limit. Both one-sided limits exist, but they are different, which implies that the examined limit does not exist:
$$\lim_{x\to0+}\mathrm e^{1/x} = +\infty,\qquad \lim_{x\to0-}\mathrm e^{1/x} = 0.$$
□
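The one-sided behaviour in iv) is easy to observe numerically (again a small sketch of ours):

```python
import math

for x in (0.5, 0.1, 0.01):
    # approaching 0 from the right and from the left, respectively
    print(x, math.exp(1 / x), math.exp(1 / -x))
# e^(1/x) blows up as x -> 0+ and tends to 0 as x -> 0-
```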
5.50. Calculate
(a) $\lim_{x\to2}\dfrac{x+2}{(x-2)^6}$;  (b) $\lim_{x\to2}\dfrac{x+2}{(x-2)^5}$;  (c) $\lim_{x\to+\infty}\dots$;  (d) $\dots$
Solution.
In this exercise, we will be concerned with so-called indeterminate forms. We recommend perceiving indeterminate forms as a helping concept which is only to facilitate the first approach to limit calculations, because an obtained indeterminate form only means that one "has found out nothing". We know that the limit of a sum is the sum of the limits, the limit of a product is the product of the limits, and the limit of a quotient is the quotient of the limits, supposing the particular limits exist and do not lead to one of the following expressions: $\infty-\infty$, $0\cdot\infty$, $0/0$, $\infty/\infty$, which are called indeterminate forms. For completeness, let us add that these rules can be combined and that an expression containing an indeterminate form is itself considered an indeterminate form. For instance, the forms
$$-\infty+\infty,\qquad \frac00\cdot\infty,\qquad 0\cdot(\infty-\infty),\qquad \frac{3+\infty}{(-\infty)^3}$$
are all indeterminate, but the forms
$$0-\infty,\qquad \frac{3}{3+\infty},\qquad \frac{0}{-\infty}$$
can be called "determinate" (one can immediately determine the limit: they correspond to the values $-\infty$, $0$, $0$, respectively).
In exercise (a), the quotient of the numerator and the denominator gives us $4/0$. Expressions containing division by zero are inappropriate (later, we will learn to avoid them). Yet it leads to the result; it is not an indeterminate form. We may notice that the denominator
An open cover of a set $A$ is a system of open sets $U_i$, $i\in I$, such that its union contains the whole of $A$.
An isolated point of a set $A$ is a point $a\in A$ such that there is a neighborhood $N$ of $a$ satisfying $N\cap A = \{a\}$.
5.17. Theorem. All subsets $A$ of the real numbers satisfy:
(1) a non-empty set $A$ is open iff it is a union of countably (or finitely) many open intervals,
(2) every point $a\in A$ is either interior or boundary,
(3) every boundary point of $A$ is either an isolated or a limit point of $A$,
(4) $A$ is compact iff every infinite sequence contained in it has a subsequence converging to a point in $A$,
(5) $A$ is compact iff each of its open covers contains a finite subcover.
Proof. (1) Apparently, every open set is a union of neighborhoods of its points, i.e. of open intervals. So the question that remains is whether it suffices to take countably many of them. Thus we may try to select intervals which are as "large" as possible. We will consider points $a,b\in A$ to be related iff the whole open interval $(\min\{a,b\},\max\{a,b\})$ is contained in $A$. Clearly, this relation is an equivalence (the open interval $(a,a)$ is the empty set, which is contained in any set; symmetry and transitivity are apparent). The classes of this equivalence relation are intervals which are pairwise disjoint. Each of these intervals surely contains a rational number, and the obtained rational numbers are pairwise distinct. However, there are only countably many rational numbers, so the statement is proved.
(2) It follows immediately from the definitions that no point can be both interior and boundary. Let $a\in A$ be a point that is not interior. Then there is a sequence of points $a_i\notin A$ with $a$ as its limit point. At the same time, $a$ belongs to each of its neighborhoods. Thus $a$ is boundary.
(3) Suppose that $a\in A$ is boundary but not isolated. Then, similarly to the reasoning from the previous paragraph, there are points $a_i$, this time inside $A$, whose limit point is $a$.
(4) Suppose that $A$ is a compact set, i.e. both closed and bounded. Let us consider an infinite sequence of points $a_i\in A$. This set surely has both a supremum $b$ and an infimum $a$ (we could have taken any upper and lower bounds of the set $A$ as well). Now let us cut the interval $[a,b]$ into halves: $[a,\frac12(a+b)]$ and $[\frac12(a+b),b]$. At least one of them contains infinitely many of the terms $a_i$. We select this half and one of the terms contained in it;
approaches zero from the right (for $x\ne2$ we have that $(x-2)^6>0$). We write this as $4/{+}0$. Thus the numerator and the denominator are both positive in some deleted neighborhood of the point $x_0=2$, and one can say that the denominator, at the limit point, is "infinitely times less" than the numerator, that is
$$\lim_{x\to2}\frac{x+2}{(x-2)^6} = +\infty,$$
which corresponds to setting $4/{+}0 = +\infty$ (similarly, we can set $4/{-}0 = -\infty$).
When calculating the limit of (b), one can proceed analogously. Since the one-sided limits
$$\lim_{x\to2+}\frac{x+2}{(x-2)^5} = +\infty \ne -\infty = \lim_{x\to2-}\frac{x+2}{(x-2)^5}$$
differ, the examined limit does not exist. We can write $4/{\pm}0$ (or, more generally, $a/{\pm}0$, $a\ne0$, $a\in\mathbb R^*$), which is a "determinate form". When thoroughly distinguishing the symbols $+0$ and $-0$ from $\pm0$, $a/{\pm}0$ for $a\ne0$ always means that the limit in question does not exist.
Exercises (c), (d). If $f(x)>0$ for all considered $x\in\mathbb R$, then
$$f(x)^{g(x)} = \mathrm e^{\ln\left(f(x)^{g(x)}\right)} = \mathrm e^{g(x)\cdot\ln f(x)}.$$
Making use of the fact that the exponential function is continuous and injective on the whole of its domain $\mathbb R$, we can replace the limit
$$\lim_{x\to x_0} f(x)^{g(x)}\qquad\text{with}\qquad \mathrm e^{\lim_{x\to x_0}\left(g(x)\cdot\ln f(x)\right)}.$$
Let us remind that either of these limits exists if and only if the other one exists. Further,
$$\lim_{x\to x_0}\left(g(x)\cdot\ln f(x)\right) = a\in\mathbb R \ \Longrightarrow\ \lim_{x\to x_0} f(x)^{g(x)} = \mathrm e^{a},$$
$$\lim_{x\to x_0}\left(g(x)\cdot\ln f(x)\right) = +\infty \ \Longrightarrow\ \lim_{x\to x_0} f(x)^{g(x)} = +\infty,$$
$$\lim_{x\to x_0}\left(g(x)\cdot\ln f(x)\right) = -\infty \ \Longrightarrow\ \lim_{x\to x_0} f(x)^{g(x)} = 0.$$
Thus we can write
$$\lim_{x\to x_0} f(x)^{g(x)} = \mathrm e^{\lim_{x\to x_0} g(x)\,\cdot\,\lim_{x\to x_0}\ln f(x)}$$
if both limits on the right-hand side exist and do not lead to the indeterminate form $0\cdot\infty$. It is not difficult to realize that this indeterminate form can only be obtained in three cases, corresponding to the remaining indeterminate forms $0^0$, $\infty^0$, $1^\infty$, when we have, respectively, that
$$\lim_{x\to x_0} f(x) = 0\quad\text{and}\quad \lim_{x\to x_0} g(x) = 0,$$
$$\lim_{x\to x_0} f(x) = +\infty\quad\text{and}\quad \lim_{x\to x_0} g(x) = 0,$$
$$\lim_{x\to x_0} f(x) = 1\quad\text{and}\quad \lim_{x\to x_0} g(x) = \pm\infty.$$
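A standard illustration of the $0^0$ case (an example of ours, not taken from the exercises): for $f(x)=g(x)=x$ and $x_0=0$ approached from the right,
$$\lim_{x\to0+} x^x = \mathrm e^{\lim_{x\to0+} x\ln x} = \mathrm e^{0} = 1,$$
even though the form $0^0$ itself carries no information.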
then we cut the selected interval into halves. Again, we select a half which contains infinitely many of the sequence's terms and select one of those points. By this procedure, we obtain a Cauchy sequence (you can prove this by yourselves; all you need is careful manipulation with the bounds, similarly as above). However, we know that Cauchy sequences have limit points or are constant up to finitely many exceptions. Thus there is a subsequence with the wanted limit. From the fact that $A$ is closed, it follows that the obtained point lies in $A$.
Now the other direction: if every infinite subset of $A$ has a limit point in $A$, then all limit points are in $A$, and so $A$ is closed. If $A$ were not bounded, we would be able to find an increasing or decreasing sequence whose adjacent terms differ by at least $1$, for instance. However, such a sequence of points in $A$ cannot have a limit point at all.
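The halving argument in the proof of part (4) can be mirrored in code. A minimal sketch of ours; since a program cannot test "infinitely many terms", the more populated half of a long finite sample stands in for that choice:

```python
import math

def limit_point(sample, lo, hi, steps=60):
    """Bisection from the proof of 5.17(4): repeatedly keep a half
    interval containing 'infinitely many' terms of the sequence."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        left = sum(1 for a in sample if lo <= a <= mid)
        right = sum(1 for a in sample if mid < a <= hi)
        if left >= right:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

sample = [math.sin(n) for n in range(10_000)]  # a bounded sequence in [-1, 1]
print(limit_point(sample, -1.0, 1.0))  # approximates one of its limit points
```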
(5) First, let us focus on the easier implication, i.e. let us suppose that every open cover contains a finite one, and prove that $A$ is both closed and bounded.
We say that a set $A\subset\mathbb R$ has zero measure iff for every $\varepsilon>0$ we can find a covering of the set $A$ by a countable system of open intervals $J_i$, $i=1,2,\dots$, such that
$$\sum_{i=1}^{\infty} m(J_i) < \varepsilon,$$
where $m(J_i)$ denotes the length of the interval $J_i$.
In the following, by the statement "a function $f$ has the given property on the set $B$ almost everywhere" we'll always mean that $f$ has this property at all points except for a subset $A\subset B$ of zero measure. For example, the characteristic function of the rational numbers is zero almost everywhere, a piecewise continuous function is continuous almost everywhere, etc.
We'd now like to modify the definition of the Riemann integral so that, when choosing the partitions and the corresponding Riemann sums, we could eliminate the ominous effect of the values of the integrated function on a set of zero measure known in advance. It also seems reasonable to guarantee that the segments of the partitions with representants are controllably small near the points of such a set.
A positive real function $\delta$ on a finite interval $[a,b]$ is called a calibre. We call a partition $\Xi$ of the interval $[a,b]$ with representants $\xi_i$ $\delta$-calibrated, if we have
$$\xi_i - \delta(\xi_i) < x_{i-1} \le \xi_i \le x_i < \xi_i + \delta(\xi_i)$$
for all $i$.
For the further procedure, it's essential to verify that for every calibre $\delta$, a $\delta$-calibrated partition with representants can be found. This statement is called Cousin's lemma and can be proven, for example, in the usual way based upon the properties of suprema. For a given calibre $\delta$ on $[a,b]$, we'll denote by $M$ the set of all points $x\in[a,b]$ such that a $\delta$-calibrated partition with representants can be found on $[a,x]$. Surely $M$ is nonempty and bounded, thus it has a supremum $s$. If $s\ne b$, then we could find a calibrated partition with a representant at $s$ reaching beyond $s$, which leads to a contradiction.
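Cousin's lemma itself is non-constructive, but for concrete calibres a greedy construction usually works. A minimal sketch of ours (the representant of each segment is its left endpoint, so the calibre condition above holds by construction; termination is guaranteed whenever $\delta$ is bounded away from zero):

```python
def calibrated_partition(a, b, delta):
    """Greedy delta-calibrated partition of [a, b] with representants.

    Tags xi_i sit at the left endpoints, so
    xi_i - delta(xi_i) < x_{i-1} = xi_i <= x_i < xi_i + delta(xi_i).
    """
    points, tags = [a], []
    x = a
    while x < b:
        tags.append(x)
        x = min(b, x + 0.9 * delta(x))  # stay strictly below xi + delta(xi)
        points.append(x)
    return points, tags

# example: a calibre forcing fine segments near the origin
points, tags = calibrated_partition(0.0, 1.0, lambda x: 0.05 + x / 10)
print(len(tags), points[:4])
```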
Now we can define a generalization of the Riemann integral in this way:
Definition. A function $f$ defined on a finite interval $[a,b]$ has the Kurzweil integral
$$I = \int_a^b f(x)\,dx,$$
if for every $\varepsilon>0$ there exists a calibre $\delta$ such that for every $\delta$-calibrated partition $\Xi$ with representants, the inequality $|S_\Xi - I| < \varepsilon$ holds for the corresponding Riemann sum $S_\Xi$.
6.50. Properties of the Kurzweil integral. First notice that when defining the Kurzweil integral, we only restricted the set of all partitions for which we take the Riemann sums into account. Hence if our function is Riemann integrable, then it must also have the Kurzweil integral, and these two integrals are equal.
which can be seen, for example, from the ratio test for convergence of series:
$$\lim_{n\to\infty}\left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty}\frac{(n+1)\,2^{-(n+1)}}{n\,2^{-n}} = \lim_{n\to\infty}\frac{n+1}{2n} = \frac12.$$
In total, according to (6.43) (3), we have
$$\int_{\ln2}^{\ln3} f(x)\,dx = \int_{\ln2}^{\ln3}\sum_{n=1}^{\infty} n\,\mathrm e^{-nx}\,dx = \sum_{n=1}^{\infty}\int_{\ln2}^{\ln3} n\,\mathrm e^{-nx}\,dx = \sum_{n=1}^{\infty}\left(\frac{1}{2^n}-\frac{1}{3^n}\right) = 1-\frac12 = \frac12.$$
□
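A quick numerical cross-check of the value $1/2$ (a sketch of ours; the series is summed directly and the integral is approximated by the midpoint rule):

```python
import math

print(sum(2.0 ** -n - 3.0 ** -n for n in range(1, 60)))  # ~0.5

def f(x, terms=200):
    # partial sum of sum_n n e^{-nx}
    return sum(n * math.exp(-n * x) for n in range(1, terms))

a, b, steps = math.log(2), math.log(3), 10_000
h = (b - a) / steps
print(h * sum(f(a + (i + 0.5) * h) for i in range(steps)))  # ~0.5 as well
```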
6.88. Determine the following limit (give reasons for the procedure of computation):
$$\lim_{n\to\infty}\int_0^{\infty}\frac{\cos\frac xn}{\left(1+\frac xn\right)^n}\,dx.$$
Solution. First we'll determine $\lim_{n\to\infty}\frac{\cos(x/n)}{(1+x/n)^n}$. The sequence of these functions converges pointwise and we have
$$\lim_{n\to\infty}\frac{\cos\frac xn}{\left(1+\frac xn\right)^n} = \frac{1}{\lim_{n\to\infty}\left(1+\frac xn\right)^n} = \frac{1}{\mathrm e^x}.$$
It can be shown that the given sequence converges uniformly. Then, according to (6.41),
$$\lim_{n\to\infty}\int_0^{\infty}\frac{\cos\frac xn}{\left(1+\frac xn\right)^n}\,dx = \int_0^{\infty}\lim_{n\to\infty}\frac{\cos\frac xn}{\left(1+\frac xn\right)^n}\,dx = \int_0^{\infty}\frac{dx}{\mathrm e^x} = 1.$$
We leave the verification of uniform convergence to the reader (we only point out that the discussion is more complicated than in the previous cases).
□
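Numerically, the integrals indeed approach $1$ (a rough sketch of ours; the infinite interval is truncated where the integrand is negligible):

```python
import math

def integral(n, upper=50.0, steps=100_000):
    # midpoint rule for the integral of cos(x/n) / (1 + x/n)^n over [0, upper]
    h = upper / steps
    return h * sum(math.cos((i + 0.5) * h / n) / (1 + (i + 0.5) * h / n) ** n
                   for i in range(steps))

for n in (5, 50, 500):
    print(n, integral(n))  # the values tend to 1 as n grows
```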
For the same reason, we can repeat the argumentation of Theorem 6.24 about the simple properties of the Riemann integral and again verify that the Kurzweil integral behaves in the same way. In particular, a linear combination of integrable functions, $c f(x)+d g(x)$, is again integrable, and its integral is $c\int_a^b f(x)\,dx + d\int_a^b g(x)\,dx$, etc. For proving this, it suffices to think through small modifications when discussing the refined partitions, which moreover should be $\delta$-calibrated.
Analogously, as in the case of monotone sequences of pointwise convergent functions, we can extend the argumentation verifying that the limits of uniformly convergent sequences of integrable functions $f_n$ are again integrable, and the integral of the limit is the limit of the integrals of the $f_n$.
Finally, the Kurzweil integral behaves the way we would like it to even on sets of zero measure:
Theorem. Consider a function $f$ on an interval $[a,b]$ which is zero almost everywhere. Then the Kurzweil integral $\int_a^b f(x)\,dx$ exists and equals zero.
Proof. This is a nice illustration of the idea that we can get rid of the influence of values on a small set by a smart choice of the calibre. Denote by $M$ the corresponding set of zero measure, outside of which $f(x)=0$, and write $M_k\subset[a,b]$, $k=1,\dots$, for the subset of the points for which $k-1\le|f(x)|<k$. Because all the sets $M_k$ have zero measure, we can cover each of them by a countable system of pairwise disjoint open intervals $J_{k,j}$ whose sum of lengths is arbitrarily small. Now define the calibre $\delta(x)$ for $x\in J_{k,j}$ so that the whole interval $(x-\delta(x),\,x+\delta(x))$ is still contained in $J_{k,j}$. Outside of the set $M$, we define $\delta$ arbitrarily.
For a $\delta$-calibrated partition $\Xi$ of the interval $[a,b]$, we can then put a bound on the corresponding Riemann sum:
$$\left|\sum_{i=0}^{n-1} f(\xi_i)(x_{i+1}-x_i)\right| = \left|\sum_{i:\ \xi_i\in M} f(\xi_i)(x_{i+1}-x_i)\right| \le \sum_{k=1}^{\infty}\ \sum_{i:\ \xi_i\in M_k} k\,(x_{i+1}-x_i)$$
$$\max_{x\in[1,2]}\Big(x+\frac{2}{\sqrt{1+x^2}}\Big) = 2+\frac{2}{\sqrt{1+2^2}} = 2+\frac{2}{\sqrt5}.$$
An increasing function takes its maximum value at the right end point of a closed interval. □
7.25. Determine whether the sequence $\{x_n\}_{n\in\mathbb N}$, where
$$x_1 = 1,\qquad x_n = 1+\frac12+\cdots+\frac1n,\quad n\in\mathbb N\setminus\{1\},$$
is a Cauchy sequence. First, consider the usual metric given by the difference in absolute value (i.e., induced by the norm of absolute value). Then, consider the metric
Solution. Let us remind that
$$(7.28)\qquad \sum_{k=1}^{\infty}\frac1k = \infty.$$
Therefore,
$$\sum_{k=m}^{\infty}\frac1k = \infty,\quad m\in\mathbb N,$$
i.e.
$$\lim_{n\to\infty}|x_n - x_m| = \sum_{k=m+1}^{\infty}\frac1k = \infty,\quad m\in\mathbb N.$$
for such a function $f$. Let us notice that if we considered both the one-sided limits (which always exist by our definition) and the value of the function itself to be the value $f(x)$ at points of discontinuity, we could work with maxima instead of suprema. It is apparent again that it is a norm (except for the problems with the values at the discontinuity points).
7.17. Completion of metric spaces. Both the real numbers $\mathbb R$ and the complex numbers $\mathbb C$ are (with the metric given by the absolute value) complete metric spaces. Actually, this is contained in the axiom of the existence of suprema. Let us remind that the real numbers were created as a "completion" of the space of rational numbers, which is not complete itself. It is apparent that the closure of the set $\mathbb Q\subset\mathbb R$ is the whole $\mathbb R$.
Dense and nowhere-dense subsets
We say that a subset $A\subset X$ in a metric space $X$ is dense iff the closure of $A$ is the whole space $X$. A set $A$ is said to be nowhere dense in $X$ iff the set $X\setminus\bar A$ is dense.
Apparently, $A$ is dense in $X$ iff every non-empty open set in the whole space $X$ has a non-empty intersection with $A$.
In all the cases of norms on functions from the previous paragraph, we can easily see that the metric spaces defined in this way are not complete, since it can happen that the limit of a Cauchy sequence of functions from our vector space $S^0[a,b]$ is a function which does not belong to this space any more. Let us consider the interval $[0,1]$ as the domain of the functions $f_n$ which take zero on $[0,1/n)$ and are equal to $\sin(1/x)$ on $[1/n,1]$. Apparently, they converge to the function $\sin(1/x)$ in all $L_p$ norms, but this function does not lie in our spaces.
Completion of a metric space
Let $X$ be a metric space with the metric $d$ which is not complete. A metric space $\hat X$ with a metric $\hat d$ such that $X\subset\hat X$, $d$ is the restriction of $\hat d$ to the subset $X$, and the closure $\bar X$ is the whole space $\hat X$, is called a completion of the metric space $X$.
The following theorem says that the completion of an arbitrary (incomplete) metric space $X$ can be found in essentially the same way as the real numbers were created from the rationals. Before we get to the quite difficult proof of this extraordinarily important and useful result, let us notice that such a completion $\hat X$ of a space $X$ is unique, in a certain sense:
Is the metric space of continuous functions on $[-1,1]$ with the metric given by the norm (a) $\|f\|_1 = \int_{-1}^{1}|f(x)|\,dx$; (b) $\|f\|_\infty = \max\{|f(x)|;\ x\in[-1,1]\}$ complete?
Solution. The case (a). Let us, for every $n\in\mathbb N$, define a function
$$f_n(x) = 0,\ x\in[-1,0),\qquad f_n(x) = nx,\ x\in\left[0,\tfrac1n\right),\qquad f_n(x) = 1,\ x\in\left[\tfrac1n,1\right].$$
The obtained sequence $\{f_n\}_{n\in\mathbb N}$ is a Cauchy sequence with respect to the given norm. Let us focus on the potential limit of the sequence $\{f_n\}$. The continuous function $f$ would have to satisfy
$$f(x) = 0,\ x\in[-1,0],\qquad f(x) = 1,\ x\in[\varepsilon,1]$$
for an arbitrarily small $\varepsilon>0$. Thus, necessarily,
$$f(x) = 0,\ x\in[-1,0],\qquad f(x) = 1,\ x\in(0,1].$$
However, this function is not continuous on $[-1,1]$: it does not belong to the considered metric space. Therefore, the sequence $\{f_n\}$ does not have a limit in it, and the space with the norm (a) is not complete.
The case (b). Let $\{f_n\}$ be a Cauchy sequence; then for every $\varepsilon>0$ (or for every $\varepsilon/2$ if you want), there is an $n(\varepsilon)\in\mathbb N$ such that
$$(7.29)\qquad \max_{x\in[-1,1]}|f_m(x)-f_n(x)| < \frac\varepsilon2,\quad m,n>n(\varepsilon).$$
In particular, we get for every $x\in[-1,1]$ a Cauchy sequence $\{f_n(x)\}_{n\in\mathbb N}\subset\mathbb R$ of numbers. Since the metric space $\mathbb R$ with the usual metric is complete, every such sequence (for $x\in[-1,1]$) is convergent. Let us set
$$f(x) := \lim_{n\to\infty} f_n(x),\quad x\in[-1,1].$$
A mapping $i_2\circ i_1^{-1}$ is well-defined on the dense subset $i_1(X)\subset X_1$. Its image is the dense subset $i_2(X)\subset X_2$ and, moreover, this mapping is clearly an isometry. The inverse mapping $i_1\circ i_2^{-1}$ works in the same way.
Every isometric mapping maps, of course, Cauchy sequences to Cauchy sequences. At the same time, such Cauchy sequences converge to the same element in the completion if and only if this holds for their images under the isometry.
The first and second properties are clearly satisfied. To prove the triangle inequality, it suffices to realize that $d(m,n)\in(1,4/3]$ if $m\ne n$. All Cauchy sequences can be found equally easily: they are the so-called almost stationary sequences, constant from some index on (i.e., constant except for finitely many terms). Thus, every Cauchy sequence is convergent, so the metric space in question is complete. Let us introduce the sets
$$A_n := \left\{m\in\mathbb N;\ d(m,n)\le 1+\tfrac1n\right\},\quad n\in\mathbb N.$$
As the inequality in their definition is not strict, it is guaranteed that they are closed sets. Since $A_n = \{n,n+1,\dots\}$, (7.30) does not hold. If we omitted the requirement (7.31), it would mean that the metric space is not complete, which is not true. Finally, let us mention that
$$\lim_{n\to\infty}\sup\{d(x,y);\ x,y\in A_n\} = 1 \ne 0.$$
□
7.28. Prove that the metric space $\ell_2$ is complete.
Solution. Let us consider an arbitrary Cauchy sequence $\{x_n\}_{n\in\mathbb N}$ in the space $\ell_2$. However, every term of this sequence is again a sequence, i.e., $x_n = \{x_k^n\}_{k\in\mathbb N}$. Let us mention that, of course, the range of the indices does not matter: there is no difference whether $n,k\in\mathbb N$ or
elements $x$, $y$, $z$, and we easily get
$$d(x,z) = \lim_{i\to\infty} d(x_i,z_i) \le \lim_{i\to\infty} d(x_i,y_i) + \lim_{i\to\infty} d(y_i,z_i) = d(x,y) + d(y,z).$$
Apparently, the restriction of the metric d just defined to the original space X is identical to the original metric because the original points are represented by constant sequences.
It remains to prove that $X$ is dense in $\hat X$ and that the constructed metric space is complete. We want to prove that for any fixed Cauchy sequence $x=\{x_i\}$ and every (no matter how small) $\varepsilon>0$, we can find an element $y$ of the original space such that the distance of the constant sequence of $y$'s from the chosen sequence $x_i$ does not exceed $\varepsilon$. However, since the sequence $x_i$ is a Cauchy sequence, all pairs of its terms $x_n$, $x_m$ will eventually (i.e. for sufficiently large indices $m$ and $n$) become closer than $\varepsilon$ to each other. Then the choice $y=x_n$ for one of those indices necessarily gives that the elements $y$ and $x_m$ will be closer than $\varepsilon$, and so, from the limit point of view, it will hold that $\hat d(y,x)\le\varepsilon$.
Finally, it remains to prove that Cauchy sequences of points of the extended space $\hat X$ with respect to the metric $\hat d$ are necessarily convergent. In other words, we want to show that repeating the above procedure does not yield new points. This can be done by approaching the points of a Cauchy sequence $\hat x_k$ by points $y_k$ from the original space $X$ so that the resulting sequence $y=\{y_k\}$ would be the limit of the original sequence with respect to the metric $\hat d$.
Since we already know that $X$ is a dense subset in $\hat X$, we can choose, for every element $\hat x_k$ of our fixed sequence, an element $z_k\in X$ so that the corresponding constant sequence $\hat z_k$ would satisfy $\hat d(\hat z_k,\hat x_k)<1/k$. Now, let us consider the sequence $z=\{z_0,z_1,\dots\}$. The original sequence $\hat x$ is Cauchy, i.e. for a fixed real number $\varepsilon>0$, there is an index $n(\varepsilon)$ such that $\hat d(\hat x_n,\hat x_m)<\varepsilon/2$ whenever both $m$ and $n$ are greater than $n(\varepsilon)$. Without loss of generality, we can assume that our index $n(\varepsilon)$ is greater than or equal to $4/\varepsilon$. Now, for $m$ and $n$ greater than $n(\varepsilon)$, we get:
$$d(z_m,z_n) = \hat d(\hat z_m,\hat z_n) \le \hat d(\hat z_m,\hat x_m) + \hat d(\hat x_m,\hat x_n) + \hat d(\hat x_n,\hat z_n) < \frac1m + \frac\varepsilon2 + \frac1n \le 2\cdot\frac\varepsilon4 + \frac\varepsilon2 = \varepsilon.$$
Thus $z=\{z_i\}$ is a Cauchy sequence of elements in $X$, and so $\hat z\in\hat X$. Let us examine whether the distance $\hat d(\hat x_n,\hat z)$ approaches zero, which we tried to guarantee by the construction. From the triangle inequality,
$$\hat d(\hat z,\hat x_n) \le \hat d(\hat z,\hat z_n) + \hat d(\hat z_n,\hat x_n).$$
However, from our previous bounds, it follows that both the summands on the right-hand side converge to zero, thereby finishing the proof. □
In the following three paragraphs, we will introduce three quite simple theorems about complete metric spaces. They are highly applicable both in mathematical analysis and in verifying the convergence of numerical methods.
$n,k\in\mathbb N\cup\{0\}$. Let us introduce helping sequences $y_k$ for $k\in\mathbb N$ so that
$$y_k = \{y_k^n\}_{n\in\mathbb N} = \{x_k^n\}_{n\in\mathbb N}.$$
If $\{x_n\}$ is a Cauchy sequence in $\ell_2$, then each of the sequences $y_k$ is a Cauchy sequence in $\mathbb R$ (the sequences $y_k$ are sequences of real numbers). It follows from the completeness of $\mathbb R$ (with respect to the usual metric) that all of the sequences $y_k$ are convergent. Let us denote their limits by $z_k$, $k\in\mathbb N$.
It suffices to prove that $z=\{z_k\}_{k\in\mathbb N}\in\ell_2$ and that the sequence $\{x_n\}$ converges for $n\to\infty$ in $\ell_2$ just to the sequence $z$. The sequence $\{x_n\}_{n\in\mathbb N}\subset\ell_2$ is a Cauchy sequence; therefore, for every $\varepsilon>0$, there is an $n(\varepsilon)\in\mathbb N$ with the property that
$$\sum_{k=1}^{\infty}\left(x_k^m - x_k^n\right)^2 < \varepsilon^2,\quad m,n>n(\varepsilon),\ m,n\in\mathbb N.$$
In particular,
$$\sum_{k=1}^{l}\left(x_k^m - x_k^n\right)^2 < \varepsilon^2,\quad m,n>n(\varepsilon),\ m,n,l\in\mathbb N,$$
whence, letting $m\to\infty$, we can obtain
$$\sum_{k=1}^{l}\left(z_k - x_k^n\right)^2 \le \varepsilon^2,\quad n>n(\varepsilon),\ n,l\in\mathbb N,$$
i.e. (this time $l\to\infty$)
$$(7.32)\qquad \sum_{k=1}^{\infty}\left(z_k - x_k^n\right)^2 \le \varepsilon^2,\quad n>n(\varepsilon).$$

For every positive real number $\varepsilon$, we can find an index $n(\varepsilon)$ such that all the sets $A_i$ with indices $i>n(\varepsilon)$ have diameters less than $\varepsilon$. However, then for such large indices $i,j$, we will have $d(z_i,z_j)<\varepsilon$, and thus our sequence is a Cauchy sequence. Therefore, it has a limit point $z\in X$, which, of course, must be a limit point of all the sets $A_i$; thus it belongs to all of them (since they are all closed) and so to their intersection.
We have proved the existence of $z$. Now, it remains to prove its uniqueness. For that purpose, assume there are points $z$ and $y$, both belonging to the intersection of all the sets $A_i$. Then their distance is at most the diameter of the sets $A_i$, which converges to zero. This completes the proof. □
7.21. Theorem (Baire theorem). If $X$ is a complete metric space, then the intersection of every countable system of open dense sets $A_i$ is a dense set in the metric space $X$.
Proof. Let a system of dense open sets $A_i$, $i=1,2,\dots$, be given in $X$. We want to show that the set $A=\bigcap_{i=1}^{\infty} A_i$ has a non-empty intersection with any open set $U\subset X$. We will proceed inductively, invoking the previous theorem.
Surely there is a $z_1\in A_1\cap U$, but since the set $A_1$ is open, the closure of an $\varepsilon_1$-neighborhood $U_1$ (for sufficiently small $\varepsilon_1$) of the point $z_1$ is contained in $A_1$ as well. Let us denote the closure of this $\varepsilon_1$-ball $U_1$ by $B_1$. Further, let us suppose that the points $z_i$ and their open $\varepsilon_i$-neighborhoods $U_i$ are already chosen for $i=1,\dots,n$. Since the set $A_{n+1}$ is open and dense in $X$, there is a point $z_{n+1}\in A_{n+1}\cap U_n$; however, since $A_{n+1}\cap U_n$ is open, the point $z_{n+1}$ belongs to it together with a sufficiently small $\varepsilon_{n+1}$-neighborhood $U_{n+1}$. Then, the closures surely satisfy $B_{n+1}=\bar U_{n+1}\subset U_n$, and so the closed set $B_{n+1}$ is contained in $A_{n+1}\cap U_n$. Moreover, we can assume that $\varepsilon_n<1/n$.
If we proceed in this inductive way from the original point $z_1$ and the set $B_1$, we get a nested sequence of non-empty closed sets $B_n$ whose diameters approach zero. Therefore, there is a point $z$ common to all of these sets, i.e.,
$$z \in \bigcap_{i=1}^{\infty} U_i = \bigcap_{i=1}^{\infty} B_i \subset \Big(\bigcap_{i=1}^{\infty} A_i\Big)\cap U,$$
which is the statement to be proved. □
7.22. Bounded and compact sets. The following concepts facilitated the phrasing of our observations about the real numbers. They can be reformulated for general metric spaces with almost no changes:
An interior point of a subset $A$ in a metric space is an element of $A$ which belongs to it together with some of its $\varepsilon$-neighborhoods.
A boundary point of a set $A$ is an element $x\in X$ such that each of its neighborhoods has a non-empty intersection with both $A$ and the complement $X\setminus A$. A boundary point may or may not belong to the set $A$ itself.
An open cover of a set $A$ is a system of open sets $U_i\subset X$, $i\in I$, such that their union contains the whole of $A$.
An isolated point of a set $A$ is an element $a\in A$ such that one of its $\varepsilon$-neighborhoods in $X$ has the singleton intersection $\{a\}$ with $A$.
A set $A$ of elements of a metric space is called bounded iff its diameter is finite, i.e., there is a real number $r$ such that $d(x,y)<r$ for all elements $x,y\in A$. Otherwise, the set is said to be unbounded.
Therefore, for every $\varepsilon>0$, there is an $n(\varepsilon)\in\mathbb N$ satisfying
$$\sum_{n=n(\varepsilon)+1}^{\infty}\frac{1}{n^2} < \frac{\varepsilon^2}{4}.$$
From each of the intervals $[-1/n,1/n]$ for $n\in\{1,\dots,n(\varepsilon)\}$, we can choose finitely many points $x_1^n,\dots,x_{m(n)}^n$ so that we would have
$$\min_{j\in\{1,\dots,m(n)\}}\left|x-x_j^n\right| < \frac{\varepsilon}{2\sqrt{n(\varepsilon)}}$$
for any $x\in[-1/n,1/n]$. Let us consider those sequences $\{y_n\}_{n\in\mathbb N}$ from $\ell_2$ whose terms with indices $n>n(\varepsilon)$ are zero, and at the same time,
$$y_1\in\left\{x_1^1,\dots,x_{m(1)}^1\right\},\ \dots,\ y_{n(\varepsilon)}\in\left\{x_1^{n(\varepsilon)},\dots,x_{m(n(\varepsilon))}^{n(\varepsilon)}\right\}.$$
There are only finitely many such sequences, and they create an $\varepsilon$-net for $A$, since for every element of $A$ one of them is closer than $\varepsilon$. Since $\varepsilon>0$ is arbitrary, the set $A$ is totally bounded, which implies its compactness.
It is very simple to determine whether the set $B$ is compact. Every compact set must be closed, but the set $B$ is not. Its closure is
$$\bar B = \left\{\{x_n\}_{n\in\mathbb N}\in\ell_2;\ |x_n|\le\tfrac1n,\ n\in\mathbb N\right\}.$$
The set $\bar B$ is compact. The proof of this fact is much simpler than for the set $A$, thus we leave it as an exercise for the reader. □
D. Integral operators
The convolution is one of the tools for smoothing functions:
7.32. Determine the convolution $f_1 * f_2$, where
$$f_1(x) = \frac1x\ \ \text{for } x\ne0,\qquad f_2(x) = \begin{cases}\frac12 & \text{for } x\in[-1,1],\\ 0 & \text{otherwise.}\end{cases}$$
Solution. The value of the convolution at a point $t$ is given by the integral $\int_{-\infty}^{\infty} f_1(x)\,f_2(t-x)\,dx$. The integrated function is non-zero only where the second factor is non-zero, i.e., where $(t-x)\in[-1,1]$, i.e., $x\in[t-1,t+1]$. The value of the convolution at the point $t$ can therefore be interpreted as the integral mean of the function $f_1$ over the interval $(t-1,t+1)$. When integrating over this interval, we have to distinguish whether the number $0$ belongs to it. If the interval contains zero, the integral must be split into two improper integrals; however, the contribution of the smaller one cancels thanks to the function $\frac1x$ being odd, so only the integral of $\frac{1}{2x}$ over the remaining part survives (think out why the formula works for negative numbers $t$ as well). Thus, we get:
$$f_1 * f_2(t) = \int_{t-1}^{t+1}\frac{1}{2x}\,dx = \frac12\ln\left|\frac{t+1}{t-1}\right| \quad\text{for } t\in(-\infty,-1]\cup[1,\infty),$$
$$f_1 * f_2(t) = \frac12\ln\left|\frac{1+t}{1-t}\right| \quad\text{for } t\in[-1,1].$$
□
Now, let us try to calculate the convolution of two functions both of which have a finite support.
A metric space $X$ is called compact iff every sequence of terms $x_i\in X$ has a subsequence converging to some point $x\in X$.
In the case of the real numbers, we mentioned several characterizations of compactness. The concept of boundedness is a bit more complicated in the case of metric spaces. For any subsets $A,B\subset X$ in a metric space $X$ with metric $d$, we define their distance
$$\operatorname{dist}(A,B) = \inf_{x\in A,\ y\in B}\ d(x,y).$$
If $A=\{x\}$ is a singleton set, we talk about the distance $\operatorname{dist}(x,B)$ of the point $x$ from the set $B$. We say that a metric space $X$ is totally bounded iff for every positive real number $\varepsilon$, there is a finite set $A$ such that
$$\operatorname{dist}(x,A) < \varepsilon$$
for all points $x\in X$. Let us remind that a metric space is bounded iff the whole $X$ has a finite diameter.
We can immediately see that a totally bounded space is, in particular, bounded. Indeed, the diameter of a finite set is always finite, and if $A$ is the set corresponding to $\varepsilon$ from the definition of total boundedness, then the distance $d(x,y)$ of two points can always be bounded by the sum of $\operatorname{dist}(x,A)$, $\operatorname{dist}(y,A)$, and $\operatorname{diam} A$, which is a finite number. In the case of a metric on a subset of a finite-dimensional Euclidean space, these concepts coincide, since the boundedness of a set guarantees the boundedness of all the coordinates in a fixed orthonormal basis, and this implies the total boundedness. (Verify this in detail by yourselves!)
Theorem. The following statements about a metric space $X$ are equivalent:
(1) $X$ is compact,
(2) every open cover of $X$ contains a finite subcover,
(3) $X$ is complete and totally bounded.
Sketch of the proof. If the second statement of the theorem is satisfied, then we can easily see that the space $X$ must be totally bounded. Indeed, it suffices to choose the cover of $X$ consisting of all $\varepsilon$-balls centered at the points $x\in X$. We can choose a finite cover from it, and the set of the centers $x_i$ of the balls participating in this finite cover already satisfies the condition from the definition of total boundedness.
To prove the implication (2) $\Rightarrow$ (3), we also need to show the completeness. Let us consider a Cauchy sequence $x_i$.
□
7.23. Compactness on continuous functions. As an example of the fact that the behavior of compactness in spaces of functions may differ from that in Euclidean spaces, we mention a very useful theorem, known as the Arzelà-Ascoli theorem.
Theorem. A set $M\subset C[a,b]$ is compact if and only if it is bounded, closed, and equicontinuous.
7.33. Determine the convolution $f_1 * f_2$, where
$$f_1(x) = \begin{cases}1-x^2 & \text{for } x\in[-1,1],\\ 0 & \text{otherwise,}\end{cases}\qquad f_2(x) = \begin{cases}x & \text{for } x\in[0,1],\\ 0 & \text{otherwise.}\end{cases}$$
Solution. The value of the convolution $f_1 * f_2$ at a point $t$ is given by the integral over all real numbers of the product of the function $f_1(x)$ and the function $f_2(t-x)$ with respect to the variable $x$ (see 7.13). Thus, this value is zero if either of the values $f_1(x)$ and $f_2(t-x)$ is zero for every real $x$. On the other hand, the value of the convolution can be non-zero at a point $t$ only if there are numbers $x\in[-1,1]$ ($f_1(x)\ne0$) such that $(t-x)\in[0,1]$ ($f_2(t-x)\ne0$). Since the convolution is commutative, we may equally integrate the product $f_2(x)f_1(t-x)$; then $f_1*f_2(t)$ can be non-zero only if $[t-1,t+1]\cap[0,1]\ne\emptyset$. This happens for $t\in[-1,2]$. We integrate over $x$ belonging to the intersection of the intervals $[t-1,t+1]$ and $[0,1]$. Further, this intersection depends on $t\in[-1,2]$:
a) for $t\in[-1,0]$, we have $[t-1,t+1]\cap[0,1] = [0,t+1]$,
b) for $t\in[0,1]$, we have $[t-1,t+1]\cap[0,1] = [0,1]$,
c) for $t\in[1,2]$, we have $[t-1,t+1]\cap[0,1] = [t-1,1]$.
According to the intersection of these intervals, we then have:
a)
$$\int_{-\infty}^{\infty} f_2(x)f_1(t-x)\,dx = \int_0^{t+1} x\left(1-(t-x)^2\right)dx = -\frac{1}{12}t^4+\frac12 t^2+\frac23 t+\frac14,$$
b)
$$\int_{-\infty}^{\infty} f_2(x)f_1(t-x)\,dx = \int_0^{1} x\left(1-(t-x)^2\right)dx = -\frac12 t^2+\frac23 t+\frac14,$$
c)
$$\int_{-\infty}^{\infty} f_2(x)f_1(t-x)\,dx = \int_{t-1}^{1} x\left(1-(t-x)^2\right)dx = \frac{1}{12}t^4 - t^2 + \frac43 t.$$
Altogether, we get:
$$f_1*f_2(t) = \begin{cases} -\frac{1}{12}t^4+\frac12 t^2+\frac23 t+\frac14 & \text{for } t\in[-1,0],\\ -\frac12 t^2+\frac23 t+\frac14 & \text{for } t\in[0,1],\\ \frac{1}{12}t^4 - t^2 + \frac43 t & \text{for } t\in[1,2],\\ 0 & \text{otherwise.}\end{cases}$$
□
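The resulting piecewise polynomials can be checked numerically; a small sketch of ours discretizing the convolution integral:

```python
def f1(x):
    return 1 - x * x if -1 <= x <= 1 else 0.0

def f2(x):
    return x if 0 <= x <= 1 else 0.0

def conv(t, lo=-2.0, hi=3.0, steps=20_000):
    # Riemann sum for the integral of f1(x) f2(t - x) over the real line
    h = (hi - lo) / steps
    return h * sum(f1(lo + i * h) * f2(t - lo - i * h) for i in range(steps))

def exact(t):
    if -1 <= t <= 0:
        return -t**4 / 12 + t**2 / 2 + 2 * t / 3 + 0.25
    if 0 < t <= 1:
        return -t**2 / 2 + 2 * t / 3 + 0.25
    if 1 < t <= 2:
        return t**4 / 12 - t**2 + 4 * t / 3
    return 0.0

for t in (-0.5, 0.5, 1.5):
    print(t, conv(t), exact(t))  # agreement up to the discretization error
```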
7.24. Proof of theorem 7.8 about Fourier series. The general context of metrics and convergence now allows us to get back to the proof of the theorem in which we got a first idea about piecewise and other convergences of Fourier series. However, we do not care about necessary conditions for convergence, and many other formulations can be found in the literature. On the other hand, our theorem 7.8 was quite simple, and it covers a good deal of useful cases.
Firstly, it is good to realize how convergences may differ with respect to different $L_p$ norms. For the sake of simplicity, we will always work in the completion of the space $S^0$ or $S^1$ with respect to the corresponding norm, without thinking about what these spaces actually look like (even though they could be described quite easily with the help of the Kurzweil integral).
Hölder's inequality (applied to the function $f$ and the constant $1$) yields the first of the following bounds on $S^0[a,b]$:
$$\int_a^b |f(x)|\,dx \le |b-a|^{1/q}\left(\int_a^b |f(x)|^p\,dx\right)^{1/p},$$
$$\left(\int_a^b |f(x)|^p\,dx\right)^{1/p} \le C^{1/q}\left(\int_a^b |f(x)|\,dx\right)^{1/p},$$
where $p>1$ and $1/p+1/q=1$, and $C\ge|f(x)|$ on the whole interval $[a,b]$ (such a uniform bound by a constant always exists if $f\in S^0[a,b]$). The second bound follows immediately from the bound $|f(x)|^p \le C^{p-1}|f(x)|$ and the relation $1-1/p = 1/q$.
Thus it is apparent from the first bound that $L_p$-convergence $f_n\to f$ is, for any $p>1$, always stronger than $L_1$-convergence (and with a merely modified bound, we can derive an even stronger proposition, namely that $L_q$-convergence is stronger than $L_p$-convergence whenever $q>p$; try this by yourselves). However, to apply the second bound, we have to require uniform boundedness of the sequence of functions $f_n$, i.e. the bound for the functions $f_n$ by a constant $C$ must be independent of $n$. Then we can assert that $|f_n(x)-f(x)|\le 2C$, and our bound implies that $L_1$-convergence is stronger than $L_p$-convergence.
Therefore, all our $L_p$-norms on the space $S^0[a,b]$ are equivalent with regard to the convergence of uniformly bounded sequences of functions.
The most difficult (and most interesting) part is to prove the first statement of theorem 7.8, which is often referred to in the literature as the Dirichlet condition (it is deemed to have been derived as early as 1824). First, we show how this property of piecewise convergence implies the statements (2) and (3) of the theorem to be proved. Without loss of generality, we can assume that we are working on the interval $[-\pi,\pi]$, i.e. with period $T=2\pi$.
As the first step, we prepare simple bounds for the coefficients of the Fourier series. An obvious bound is
$$|a_n(f)| \le \frac1\pi\int_{-\pi}^{\pi}|f(x)|\,dx,$$
and the same for all the coefficients $b_n$, since both $\cos(nx)$ and $\sin(nx)$ are bounded by $1$ in absolute value. However, if $f$ is a continuous function in $S^1[a,b]$, we can integrate by parts, thus obtaining
$$a_n(f) = \frac1\pi\int_{-\pi}^{\pi} f(x)\cos(nx)\,dx = -\frac{1}{n\pi}\int_{-\pi}^{\pi} f'(x)\sin(nx)\,dx = -\frac1n\,b_n(f').$$

$h_n\to0$ such that the corresponding sequence of functions $y_{h_n}$ converges uniformly to a continuous function $y(t)$. Further, let us write more simply $y_n(t) = y_{h_n}(t)\to y(t)$.
However, each of the continuous functions $y_n$ fails to be differentiable at only finitely many points of the interval $[t_0,t]$, so we can write
$$y_n(t) = y_0 + \int_{t_0}^{t} y_n'(s)\,ds.$$
On the other hand, the derivatives on the particular intervals are constant, so we can write (here, $k$ is the largest index such that $t_0+kh_n\le t$, while $y_j$ and $t_j$ are the points from the definition of the function $y_n$)
$$y_n(t) = y_0 + \sum_{j=0}^{k-1}\int_{t_j}^{t_{j+1}} f(t_j,y_j)\,ds + \int_{t_k}^{t} f(t_k,y_k)\,ds.$$
Instead, we would like to see
$$y_n(t) = y_0 + \int_{t_0}^{t} f(s,y_n(s))\,ds,$$
but the difference between this integral and the last two terms in the previous expression is bounded by the possible differences of the function $f(t,y)$ and the lengths of the intervals.
8.146. Determine the general solution of the equation
$$y'' - y' = 5.$$
Solution. The characteristic polynomial of the equation is $\lambda^2-\lambda$, with roots $1$, $0$. Therefore, the general solution of the homogenized equation is $c_1 + c_2\mathrm e^x$, where $c_1,c_2\in\mathbb R$. We are looking for a particular solution in the form $ax$, $a\in\mathbb R$, using the method of undetermined coefficients. The result is $a=-5$, and the general solution is of the form
$$y = c_1 + c_2\mathrm e^x - 5x.$$
□
8.147. Solve the equation
$$y'' - 2y' + y = \frac{\mathrm e^x}{x^2+1}.$$
Solution. We will solve this non-homogeneous equation using the method of variation of constants. We will thus obtain the solution in the form
$$y = C_1(x)\,y_1(x) + C_2(x)\,y_2(x) + \cdots + C_n(x)\,y_n(x),$$
where $y_1,\dots,y_n$ give the general solution of the corresponding homogeneous equation and the functions $C_1(x),\dots,C_n(x)$ can be obtained from the system
$$C_1'(x)\,y_1(x) + \cdots + C_n'(x)\,y_n(x) = 0,$$
$$C_1'(x)\,y_1'(x) + \cdots + C_n'(x)\,y_n'(x) = 0,$$
$$\vdots$$
$$C_1'(x)\,y_1^{(n-2)}(x) + \cdots + C_n'(x)\,y_n^{(n-2)}(x) = 0,$$
$$C_1'(x)\,y_1^{(n-1)}(x) + \cdots + C_n'(x)\,y_n^{(n-1)}(x) = f(x).$$
The roots of the characteristic polynomial $\lambda^2-2\lambda+1$ are $\lambda_1=\lambda_2=1$. Therefore, we are looking for the solution in the form
$$y = C_1(x)\,\mathrm e^x + C_2(x)\,x\mathrm e^x,$$
considering the system
$$C_1'(x)\,\mathrm e^x + C_2'(x)\,x\mathrm e^x = 0,$$
$$C_1'(x)\,\mathrm e^x + C_2'(x)\left[\mathrm e^x + x\mathrm e^x\right] = \frac{\mathrm e^x}{x^2+1}.$$
We can compute the unknowns $C_1'(x)$ and $C_2'(x)$ using Cramer's rule. It follows from
$$\begin{vmatrix} \mathrm e^x & x\mathrm e^x\\ \mathrm e^x & \mathrm e^x + x\mathrm e^x\end{vmatrix} = \mathrm e^{2x},\qquad \begin{vmatrix} 0 & x\mathrm e^x\\ \frac{\mathrm e^x}{x^2+1} & \mathrm e^x + x\mathrm e^x\end{vmatrix} = -\frac{x\,\mathrm e^{2x}}{x^2+1},\qquad \begin{vmatrix} \mathrm e^x & 0\\ \mathrm e^x & \frac{\mathrm e^x}{x^2+1}\end{vmatrix} = \frac{\mathrm e^{2x}}{x^2+1}$$
Thanks to our universal bound for $f(t,y)$ above, we can thus use just the last integral instead of the actual values in the limit process $\lim_{n\to\infty} y_n(t)$, thereby obtaining
$$y(t) = \lim_{n\to\infty}\left(y_0 + \int_{t_0}^{t} f(s,y_n(s))\,ds\right) = y_0 + \int_{t_0}^{t}\left(\lim_{n\to\infty} f(s,y_n(s))\right)ds = y_0 + \int_{t_0}^{t} f(s,y(s))\,ds,$$
where we used the uniform convergence $y_n(t)\to y(t)$. This proves the theorem.
□
8.52. Systems of first-order equations. The problem of finding the solution of the equation $y'(x)=f(x,y)$ can also be viewed as looking for a (parametrized) curve $(x(t),y(t))$ in the plane, where the parametrization of the variable $x(t)=t$ has been fixed beforehand. However, if we accept this point of view, then we can forget this fixed choice for one variable, and we can add an arbitrary number of variables.
In the plane, for instance, we can write such a system in the form
$$x'(t) = f(t,x(t),y(t)),\qquad y'(t) = g(t,x(t),y(t))$$
with two functions $f,g:\mathbb R^3\to\mathbb R$. Similarly for more variables.
A simple example in the plane might be the system of equations
$$x'(t) = -y(t),\qquad y'(t) = x(t).$$
It can be easily guessed (or verified at least) that there is a solution of this system,
$$x(t) = R\cos t,\qquad y(t) = R\sin t,$$
with an arbitrary non-negative constant $R$, and the curves of the solutions are exactly the parametrized circles with radius $R$.
In the general case, we will work with the vector notation of the system in the form
$$x'(t) = f(t,x(t))$$
for a vector function $x:\mathbb R\to\mathbb R^n$ and a mapping $f:\mathbb R^{n+1}\to\mathbb R^n$. We are able to extend the validity of the theorem on uniqueness and existence of solutions to such systems:
Existence and uniqueness for systems of ODEs
Theorem. Consider functions $f_i(t,x_1,\dots,x_n):\mathbb R^{n+1}\to\mathbb R$, $i=1,\dots,n$, with continuous partial derivatives. Then, for every point $(t_0,x_1,\dots,x_n)\in\mathbb R^{n+1}$, there exists a maximal interval $[t_0-a,\,t_0+b]$, with positive numbers $a,b\in\mathbb R$, and a unique function $x(t):[t_0-a,\,t_0+b]\to\mathbb R^n$ which is the solution of the system of equations
$$x_1'(t) = f_1(t,x_1(t),\dots,x_n(t)),$$
$$\vdots$$
$$x_n'(t) = f_n(t,x_1(t),\dots,x_n(t))$$
with the initial condition
$$x_1(t_0) = x_1,\ \dots,\ x_n(t_0) = x_n.$$
that
$$C_1(x) = \int\frac{-x}{x^2+1}\,dx = -\frac12\ln\left(x^2+1\right) + C_1,\quad C_1\in\mathbb R,$$
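Assuming the sympy library is available, the whole of exercise 8.147 can be cross-checked symbolically (a sketch of ours, not part of the text):

```python
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')
ode = sp.Eq(y(x).diff(x, 2) - 2 * y(x).diff(x) + y(x), sp.exp(x) / (x**2 + 1))
print(sp.dsolve(ode))
# the solution should combine the homogeneous part (C1 + C2*x)*exp(x)
# with log(x**2 + 1) and atan(x) terms, matching C1(x) computed above
```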