MASARYK UNIVERSITY

Brisk guide to maths

Jan Slovák, Martin Panák, Michal Bulant et al.

Brno 2013

The work on the textbook has been supported by the project CZ.1.07/2.2.00/15.0203 INVESTMENTS IN EDUCATION DEVELOPMENT.

Authors:
Mgr. Michal Bulant, Ph.D.
Mgr. Aleš Návrat, Dr. rer. nat.
Mgr. Martin Panák, Ph.D.
prof. RNDr. Jan Slovák, DrSc.
RNDr. Michal Veselý, Ph.D.

Graphics and illustrations: Mgr. Petra Rychlá

© 2013 Masaryk University

Contents

Chapter 1. Initial warmup
1. Numbers and functions
2. Combinatorics
3. Difference equations
4. Probability
5. Plane geometry
6. Relations and mappings

Chapter 2. Elementary linear algebra
1. Vectors and matrices
2. Determinants
3. Vector spaces and linear mappings
4. Properties of linear mappings

Chapter 3. Linear models and matrix calculus
1. Linear processes
2. Difference equations
3. Iterated linear processes
4. More matrix calculus
5. Decompositions of the matrices and pseudoinversions

Chapter 4. Analytic geometry
1. Affine and Euclidean geometry
2. Geometry of quadratic forms
3. Projective geometry

Chapter 5. Establishing the ZOO
1. Polynomial interpolation
2. Real numbers and limit processes
3. Derivatives
4. Power series

Chapter 6. Differential and integral calculus
1. Differentiation
2. Integration
3. Infinite series

Chapter 7. Continuous models
1. Fourier series
2. Metric spaces
3. Integral operators
4. Discrete transforms

Chapter 8. Continuous models with more variables
1. Functions and mappings on $\mathbb{R}^n$
2. Integration for the second time
3. Differential equations
4. Notes about numerical methods

Preface

This textbook follows years of lecturing Mathematics at the Faculty of Informatics of Masaryk University in Brno. The programme requires an introduction to genuine mathematical thinking and precision, but not much time is dedicated to it. Thus, we want to cover seriously, but quickly, about as many mathematical methods as is usual in the bigger courses of the classical Science and Technology programmes. At the same time, we do not want to give up the completeness and correctness of the mathematical exposition. We want to introduce and explain the more demanding parts of Mathematics together with elementary explicit examples of how to use the results. But we do not want to decide for the reader how much theory or practice to enjoy, and in which order.

All these requests have led us to the two-column format, where the rather theoretical explanations and the practical examples are split. This way, we want to please and help the readers to find their own way: either to go through the examples and algorithms first, and only then come to the explanations of why things work, or the other way round. We also hope to overcome the usual stress of readers horrified by the amount of material. With our text, they are not supposed to read through everything in a linear order. On the contrary, the readers should enjoy browsing through the text and finding their own paths.

In both columns, we intend to present a rather standard exposition of basic Mathematics, focusing on the essence of the concepts and their relations. The examples solve simple mathematical problems, but we also try to show their use in mathematical models in practice as much as possible. We are aware that the theoretical text is written in a very compact way. A lot of details are left to the readers, in particular in the more difficult paragraphs.
Similarly, the examples display a variety from the very simple ones to those requiring some deeper thought. We would very much like to help the reader
• to formulate precise definitions of basic concepts and to prove simple mathematical results,
• to perceive the meaning of roughly formulated properties, relations and outlooks for exploring mathematical tools,
• to understand the instructions and algorithms creating mathematical models and to appreciate their usage.

These goals are ambitious, and nearly everyone needs his or her own path, including failures. This is one of the reasons why we come back to basic ideas and concepts several times, with growing complexity and width of the discussion. Of course, this might also look chaotic, but we very much hope that this approach gives a much better chance to those who persist in their effort. Clearly, this textbook cannot be the only source for everybody. Actually, the only really good procedure is to combine several sources and to think about their differences on the way. But we hope it can be a good beginning and help for everybody who is ready to return to the individual parts again and again.

To make this task simpler, we have added emotive icons. We hope they will not only enliven the dry mathematical text but also indicate which parts should rather be read carefully, or better jumped over in the first round. The usage of the icons follows the feelings of the authors, and we have tried to use them in a systematic way. Roughly speaking, there are icons warning of complexity and difficulty, icons indicating unpleasant technicality and the need for patience, and icons showing the joy of the game.

The practical column with the examples should be readable nearly independently of the theory. Without the ambition to know the deeper reasons why the algorithms work, it should be possible to read just this column. Some definitions and descriptions in the theoretical text are marked so that they can be caught easily when reading the examples, too. The examples and the theory are partly coordinated to allow jumping there and back, but the links are not tight.

CHAPTER 1

Initial warmup

"value, difference, position"
– what is it and how to comprehend it?

A. Numbers and functions

We can already work with natural, integer, rational and real numbers. We argue why rational numbers are not sufficient for us (although computers are actually not able to work with any others), and we recall the so-called complex numbers (because even the reals are not enough for some calculations).

1.1. Find some real number which is not rational.

Solution. One among many possible answers is $\sqrt{2}$. Already the old Greeks knew that if we prescribe the area of a square to be $a^2 = 2$, then we cannot find a rational $a$ satisfying it. Why? Assume $(p/q)^2 = 2$ holds for natural numbers $p$ and $q$ which have no common divisor different from 1 (otherwise we could further reduce the fraction $p/q$). Then $p^2 = 2q^2$ is an even number. Thus the left-hand side $p^2$ is even, and so is $p$. Hence $p^2$ is divisible by 4, so $q^2 = p^2/2$ is even, and so $q$ is even, too. This means that $p$ and $q$ have 2 as a common factor, which is a contradiction. □

1.2. Remark. It can even be proven that the $n$-th root of a natural number, where $n$ is natural, is either natural or is not rational (see ||G||).

The goal of the first chapter is to introduce the reader to the fascinating world of mathematical thinking.
For that, we choose our examples of mathematical modelling of real situations using abstract objects and connections to be as specific as possible. We also go through a few topics and mechanisms to which we will subsequently return in the rest of the book, and at the end of the chapter we speak about the language of mathematics itself (which we will mostly use in an intuitive way).

The easier the objects and settings we work with are, the more difficult it usually is to understand in depth the nuances of the use of the particular tools and mechanisms. Mostly it is possible to reach the core ideas only through their connections to others. Therefore we introduce them from many points of view at once. Changing the topics very often might be confusing, but it will surely get better when we return to specific ideas and notions in later chapters. The name of this chapter can also be understood as an encouragement to patience. Even the simplest tasks and ideas are easy only for those who have already seen similar ones. Full knowledge and mathematical thinking can be reached only through a long and complicated journey. Let us start with the simplest thing: common numbers.

1. Numbers and functions

Since the dawn of ages people wanted to know "how much" of something they had, "how much" something is worth, "how long" a particular task will take, etc. The result of such ideas is usually some "number". We consider something to be a number if we can multiply it and add it, and it behaves according to the usual rules – either according to all the rules we expect, or only to some of them. For instance, the result of multiplication does not depend on the order of the multiplicands, we have the number zero whose addition does not change the result, we have the number one which behaves in a similar manner with respect to multiplication, and so on.

The simplest example are the so-called natural numbers, which we denote $\mathbb{N} = \{0, 1, 2, 3, \dots\}$. Note that we consider zero to be a natural number, as is usual especially in computer science. To count "one, two, three, ..." is learned already by little children at their pre-school age. Some time later we meet the integers $\mathbb{Z} = \{\dots, -2, -1, 0, 1, 2, \dots\}$, and finally we get used to floating-point numbers, and we know what a 1.19-multiple of the price means thanks to the 19% tax.

1.3. Find all solutions to the equation $x^2 = b$ for any real number $b$.

Solution. We know that this equation always has a solution $x$ in the domain of real numbers whenever $b$ is non-negative. If $b < 0$, then such a real $x$ cannot exist. Thus we need to find a bigger domain where this equation has a solution. First we add to the real numbers a new number i, the so-called imaginary unit, and try to extend the definitions of addition and multiplication in order to preserve the usual behaviour of numbers (as summarised in 1.1). Clearly we need to be able to multiply the new number i by real numbers and to sum it with real numbers. Therefore we work in our newly defined domain of complex numbers $\mathbb{C}$ with formal expressions of the form $z = a + i\,b$.

In order to satisfy all the properties of associativity and distributivity, we define the addition so that we add the real parts and the imaginary parts independently. Similarly, we want the multiplication to behave as if we multiplied tuples of real numbers, with the additional rule $i^2 = -1$, that is,

$(a + i\,b) + (c + i\,d) = (a + c) + i\,(b + d),$
$(a + i\,b) \cdot (c + i\,d) = (ac - bd) + i\,(bc + ad).$ □
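The two defining rules above can be checked mechanically. The following small sketch (our own illustration, not part of the original text) represents $z = a + ib$ as a pair $(a, b)$ and compares the rules with Python's built-in complex arithmetic:

```python
# Addition and multiplication of complex numbers represented as pairs (a, b).

def add(z, w):
    (a, b), (c, d) = z, w
    return (a + c, b + d)                    # real and imaginary parts separately

def mul(z, w):
    (a, b), (c, d) = z, w
    return (a * c - b * d, b * c + a * d)    # uses the rule i*i = -1

z, w = (1.0, 2.0), (3.0, -4.0)
assert add(z, w) == (4.0, -2.0)
assert mul(z, w) == (11.0, 2.0)

# The same results via Python's built-in complex type:
assert complex(*z) + complex(*w) == complex(*add(z, w))
assert complex(*z) * complex(*w) == complex(*mul(z, w))
```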
The real number $a$ is called the real part of the complex number $z$, the real number $b$ is called the imaginary part of $z$, and we write $\operatorname{re}(z) = a$, $\operatorname{im}(z) = b$.

1.4. Verify that all the properties (KG1–4), (O1–4) and (P) of scalars from 1.1 hold.

Solution. Zero is the number $0 + i\,0$, one is the number $1 + i\,0$; both these numbers are for simplicity denoted as before, that is, 0 and 1. All the properties are obtained by direct calculation. □

A complex number is given by a tuple of real numbers, therefore it is a point in the real plane $\mathbb{R}^2$.

1.5. Show that the distance of the complex number $z = a + i\,b$ from the origin (we denote it by $|z|$) is given by the expression $\sqrt{z\bar{z}}$, where $\bar{z} = a - i\,b$ is the complex conjugate of $z$.

Solution. The product
$z\bar{z} = (a^2 + b^2) + i\,(-ab + ba) = a^2 + b^2$
is always a real number and indeed gives the square of the distance from the number $z$ to the origin. Thus it holds that $|z|^2 = z\bar{z}$. □

1.1. Properties of numbers. In order to be able to work properly with numbers, we need to be more careful with their definition and properties. In mathematics, the basic statements about properties of objects, whose validity is assumed without the need to prove them, are called axioms. A good choice of axioms determines both the range of the theory they give rise to and its usability in mathematical models of reality. Let us now list the basic properties of the operations of addition and multiplication for our calculations with numbers, which we denote by $a, b, c, \dots$. Both operations work by taking two numbers $a, b$; by applying addition or multiplication we obtain the resulting values $a + b$ and $a \cdot b$.

Properties of scalars

Properties of addition:
(KG1) $(a + b) + c = a + (b + c)$, for all $a, b, c$
(KG2) $a + b = b + a$, for all $a, b$
(KG3) there exists 0 such that for all $a$ it holds that $a + 0 = a$
(KG4) for all $a$ there exists $b$ such that $a + b = 0$

The properties (KG1)–(KG4) are called the properties of a commutative group. They are, respectively, associativity, commutativity, the existence of a neutral element (when speaking of addition we usually say zero element), and the existence of an inverse element (when speaking of addition we also say the negative of $a$ and denote it by $-a$).

Properties of multiplication:
(O1) $(a \cdot b) \cdot c = a \cdot (b \cdot c)$, for all $a, b, c$
(O2) $a \cdot b = b \cdot a$, for all $a, b$
(O3) there exists 1 such that for all $a$ it holds that $1 \cdot a = a$
(O4) $a \cdot (b + c) = a \cdot b + a \cdot c$, for all $a, b, c$.

The properties (O1)–(O4) are called, respectively, associativity, commutativity, the existence of a unit element, and the distributivity of addition with respect to multiplication. Sets with operations $+$, $\cdot$ that satisfy the properties (KG1)–(KG4) and (O1)–(O4) are called commutative rings.

Further properties of multiplication:
(P) for every $a \neq 0$ there exists $b$ such that $a \cdot b = 1$,
(OI) if $a \cdot b = 0$, then either $a = 0$ or $b = 0$.

The property (P) is called the existence of an inverse element with respect to multiplication (this element is then denoted by $a^{-1}$), and the property (OI) says that there exist no "divisors of zero".

The properties of the operations of addition and multiplication will be used very often, even when we do not know which objects we are really working with. In this way we obtain very general mathematical tools. However, it is good to have some idea of typical examples of the objects we work with.
The integers $\mathbb{Z}$ are a good example of a commutative group; the natural numbers are not, since they do not satisfy (KG4) (and possibly do not even contain the neutral element, if one does not consider zero to be natural). If a commutative ring also satisfies the property (P), we speak of a field (often also of a commutative field).

1.6. Remark. The distance $|z|$ is also called the absolute value of the complex number $z$.

1.7. Polar form of complex numbers. Let us first consider complex numbers of the form $z = \cos\varphi + i\sin\varphi$.
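A brief numerical sketch (ours, using Python's standard cmath module) of the facts from 1.5 and 1.7: the squared distance $|z|^2$ equals $z\bar{z}$, and every non-zero $z$ can be written as $|z|(\cos\varphi + i\sin\varphi)$:

```python
import cmath, math

z = 3 + 4j

# |z|^2 = z * conj(z); the product is a real number
assert abs(z) ** 2 == (z * z.conjugate()).real == 25.0

# polar form: z = r * (cos(phi) + i sin(phi)), with r = |z| and phi the argument
r, phi = cmath.polar(z)
w = r * (math.cos(phi) + 1j * math.sin(phi))
assert cmath.isclose(z, w)
```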

Clearly we have $n(n-1)\cdots(n-k+1)$ possible results of sequentially choosing our $k$ elements, but we obtain the same $k$-tuple in $k!$ distinct orders. If we want to choose the items along with an ordering, we speak of a variation of $k$-th degree. As we have just checked, the numbers of combinations and variations are given by the following formulas, which are not very efficient for calculations with large $k$ and $n$, since they contain factorials.

Combinations and variations

Proposition. For the number $c(n, k)$ of combinations of $k$-th degree among $n$ elements, where $0 \le k \le n$, it holds that
(1.3) $c(n,k) = \binom{n}{k} = \dfrac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots 1} = \dfrac{n!}{(n-k)!\,k!}.$
For the number $v(n, k)$ of variations it holds that
(1.4) $v(n,k) = n(n-1)\cdots(n-k+1)$
for all $0 \le k \le n$ (and zero otherwise).

We pronounce the binomial coefficient $\binom{n}{k}$ as "$n$ over $k$". The name stems from the so-called binomial expansion, which is the expansion of $(a+b)^n$. If we expand $(a+b)^n$, the coefficient at $a^k b^{n-k}$ equals, for every $0 \le k \le n$, exactly the number of ways to choose a $k$-tuple from the $n$ parentheses in the product (from these parentheses we take $a$, from the others we take $b$). Therefore we have
(1.5) $(a+b)^n = \sum_{k=0}^{n} \binom{n}{k}\, a^k b^{n-k},$
and note that for the derivation only distributivity, commutativity and associativity of multiplication and addition were necessary. The formula (1.5) therefore holds in every commutative ring.

Let us present another simple example of a mathematical proof – a few simple propositions about binomial coefficients. For a simplification of the formulations we define $\binom{n}{k} = 0$ whenever $k < 0$ or $k > n$.

1.7. Proposition. For all natural numbers $k$ and $n$ we have
(1) $\binom{n}{k} = \binom{n}{n-k}$,
(2) $\binom{n+1}{k+1} = \binom{n}{k} + \binom{n}{k+1}$,
(3) $\sum_{k=0}^{n} \binom{n}{k} = 2^n$,
(4) $\sum_{k=0}^{n} k \binom{n}{k} = n\,2^{n-1}$.

speaker AB. The number of all orderings where B speaks right after A is then equal to the number of permutations of seven elements. Clearly, the same holds for the number of all orderings where A speaks right after B. Since the number of all possible orderings of eight speakers is $8!$, the result is $8! - 2 \cdot 7!$. □

1.18. How many anagrams of the word PROBLEM are there such that
a) the letters B and R are next to each other,
b) the letters B and R are not next to each other?

Solution. a) The pair of letters B and R can be treated as a single indivisible "double-letter". In total we then have six distinct letters, and there are $6!$ words of six distinct letters. In our case we have to multiply this by two, since the double-letter can be either BR or RB. Thus the result is $2 \cdot 6!$.
b) $7! - 2 \cdot 6!$, the complement of the part a) with respect to the number of all seven-letter words of distinct letters. □

1.19. In how many ways can an athlete place 10 distinct cups on 5 shelves, if all 10 cups fit on every shelf?

Solution. Let us add 4 indistinguishable items, say separators, to the cups. The number of all distinct orderings of cups and separators is clearly $14!/4!$ (the separators are indistinguishable). Every placement of the cups on the shelves corresponds to exactly one ordering of cups and separators: the cups before the first separator are placed on the first shelf (preserving their order), the cups between the first and the second separator on the second shelf, and so on. Thus the number $14!/4!$ is the result. □

1.20. Determine the number of four-digit numbers with exactly two distinct digits.

Solution. The two distinct digits used in the number can be chosen in $\binom{10}{2}$ ways; from two chosen digits we can compose $2^4 - 2$ distinct four-digit sequences (we subtract the 2 for the two sequences using only a single digit).
In total we have $\binom{10}{2}(2^4 - 2) = 630$ numbers. But in this way we have also counted the numbers that start with zero. Of these there are $\binom{9}{1}(2^3 - 1) = 63$. Thus we have $630 - 63 = 567$ numbers. □

1.21. Determine the number of even four-digit numbers composed of exactly two distinct digits.

Solution. Analogously to the previous example, let us first ignore the peculiarities of the digit zero. We thus obtain
$\binom{5}{2}(2^4 - 2) + 5 \cdot 5\,(2^3 - 1)$
numbers (the first summand counts the numbers consisting of two even digits, the second summand gives the number of even four-digit numbers with one digit even and one digit odd). Again we have to subtract the numbers that start with zero, of which there are $(2^3 - 1) \cdot 4 + (2^2 - 1) \cdot 5$. The final number is thus
$\binom{5}{2}(2^4 - 2) + 5 \cdot 5\,(2^3 - 1) - (2^3 - 1) \cdot 4 - (2^2 - 1) \cdot 5 = 272.$ □

1.22. There are 677 people at a concert. Do some of them have the same name initials?

Solution. There are 26 letters in the alphabet, so the number of all possible name initials is $26^2 = 676$. Thus at least two people have the same initials. □

1.23. New players meet in a volleyball team (6 people). How many handshakes take place when everybody shakes hands with everybody while introducing themselves? And how many handshakes with the opponents are there after playing a match?

Solution. Every pair of players shakes hands at the introduction. The number of handshakes thus equals the number of combinations $c(6, 2) = \binom{6}{2} = 15$. After a match each of the six players shakes hands six times (with each of the six opponents), so the number is $6^2 = 36$. □

Proof. The first proposition follows directly from the formula (1.3). If we expand the right-hand side of (2), we obtain
$\binom{n}{k} + \binom{n}{k+1} = \dfrac{n!}{k!\,(n-k)!} + \dfrac{n!}{(k+1)!\,(n-k-1)!} = \dfrac{(k+1)\,n! + (n-k)\,n!}{(k+1)!\,(n-k)!} = \dfrac{(n+1)!}{(k+1)!\,(n-k)!},$
which is the left-hand side of (2).

In order to prove (3), we use so-called mathematical induction. This tool is very suitable for statements saying that something should hold for every natural number $n$. Mathematical induction consists of two steps. In the first, base step, we verify the claim for $n = 0$ (in general, for the smallest $n$ the claim should hold for). In the second, inductive step, we assume that the claim holds for some $n$ (and for all smaller numbers), and using this we prove the claim for $n + 1$. Putting it together, we obtain that the claim holds for every $n$.

The claim (3) clearly holds for $n = 0$, since $\binom{0}{0} = 1 = 2^0$. (It is similarly easy for $n = 1$.) Now let us assume that the claim holds for some $n$ and calculate the corresponding sum for $n + 1$ using the claims (2) and (3). We get
$\sum_{k=0}^{n+1} \binom{n+1}{k} = \sum_{k=0}^{n+1} \left( \binom{n}{k-1} + \binom{n}{k} \right) = \sum_{k=-1}^{n} \binom{n}{k} + \sum_{k=0}^{n+1} \binom{n}{k} = 2^n + 2^n = 2^{n+1}.$

Note that the formula (3) gives the number of all subsets of an $n$-element set, since $\binom{n}{k}$ is the number of all its subsets of size $k$. Note also that (3) follows from (1.5) by choosing $a = b = 1$.

To prove (4) we again employ induction, as in (3). For $n = 0$ the claim clearly holds. The inductive assumption says that (4) holds for some $n$. Let us now calculate the corresponding sum for $n + 1$ using (2) and the inductive assumption. We obtain
$\sum_{k=0}^{n+1} k \binom{n+1}{k} = \sum_{k=0}^{n+1} k \left( \binom{n}{k-1} + \binom{n}{k} \right) = \sum_{k=-1}^{n} (k+1) \binom{n}{k} + \sum_{k=0}^{n+1} k \binom{n}{k}$
$= \sum_{k=0}^{n} k \binom{n}{k} + \sum_{k=0}^{n} \binom{n}{k} + \sum_{k=0}^{n} k \binom{n}{k} = n\,2^{n-1} + 2^n + n\,2^{n-1} = (n+1)\,2^n.$
This completes the inductive step, and the claim is proven for all natural $n$. □
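The four identities of Proposition 1.7 are also easy to confirm numerically for small $n$; the following sketch (ours, using the standard math.comb, which returns zero whenever $k > n$) checks them for all $n$ up to 20:

```python
from math import comb

for n in range(21):
    for k in range(n + 1):
        assert comb(n, k) == comb(n, n - k)                        # (1)
        assert comb(n + 1, k + 1) == comb(n, k) + comb(n, k + 1)   # (2)
    assert sum(comb(n, k) for k in range(n + 1)) == 2 ** n         # (3)
    assert sum(k * comb(n, k) for k in range(n + 1)) == n * 2 ** (n - 1)  # (4)
```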
1.24. In how many ways can five people be seated in a car for five people, if only two of them have a driver's licence? In how many ways can 20 passengers and two drivers be seated in a bus for 25 people?

Solution. For the driver's seat we have two choices, and the other places can then be filled arbitrarily: for the second seat we have four choices, for the third three, then two, and then one. That makes $2 \cdot 4! = 48$ ways. Similarly in the bus we have two choices for the driver, and then the second driver plus the passengers can be seated among the 24 remaining seats arbitrarily. Let us first choose the seats to be occupied, that is, $\binom{24}{21}$ choices; on these seats the 21 people can be seated in $21!$ ways. That makes
$2 \binom{24}{21} 21! = \dfrac{2 \cdot 24!}{3!}$
ways. □

1.25. In how many ways can we insert into three distinct envelopes five identical 10-bills and five identical 100-bills, such that no envelope stays empty?

Solution. Let us first compute the number of insertions ignoring the non-emptiness condition. Using the rule of product (we insert the 10-bills and the 100-bills independently), we get $C(3, 5)^2 = \binom{7}{2}^2$. Let us now subtract the insertions where exactly one envelope is empty, and then those where two envelopes are empty. We obtain
$\binom{7}{2}^2 - 3\left(\binom{6}{1}^2 - 2\right) - 3 = 21^2 - 3\,(6^2 - 2) - 3 = 336.$ □

The second property from our proposition allows us to assemble all binomial coefficients into the so-called Pascal triangle, where every number is obtained as the sum of the two coefficients situated right "above" it:

n = 0:           1
n = 1:          1 1
n = 2:         1 2 1
n = 3:        1 3 3 1
n = 4:       1 4 6 4 1
n = 5:     1 5 10 10 5 1

Note that the individual rows contain exactly the coefficients at the individual powers in the expansion (1.5); for instance, the last row given says
$(a+b)^5 = a^5 + 5a^4 b + 10a^3 b^2 + 10a^2 b^3 + 5ab^4 + b^5.$

1.8. Choice with repetitions. An ordering of $n$ elements, some of which are indistinguishable, is called a permutation with repetitions. Let there be, among the $n$ given elements, $p_1$ elements of the first kind, $p_2$ elements of the second kind, ..., $p_k$ of the $k$-th kind, where $p_1 + p_2 + \cdots + p_k = n$; then the number of permutations with repetitions of these elements is denoted $P(p_1, \dots, p_k)$.

As with permutations and combinations without repetitions, for the choice of the first element we have $n$ possibilities, for the second $n - 1$, and so on, until the last element, for which only one possibility remains. But we consider orderings which differ only in the order of indistinguishable elements to be identical. The elements of every kind can be reordered in $p_i!$ ways, thus we have

Permutations with repetitions
$P(p_1, \dots, p_k) = \dfrac{n!}{p_1! \cdots p_k!}.$

A free choice of $k$ elements out of $n$ possibilities, when order matters, is called a variation of $k$-th degree with repetitions; the number of those is denoted $V(n, k)$. Free choice in this case means that we assume that for every choice we have the same number of possibilities – for instance, when we return the elements back before the next choice, when we repeatedly throw the same dice, and so on. The following clearly holds:

Variations with repetitions
$V(n, k) = n^k.$

If we are interested in choices without taking care of the order, we speak of combinations with repetitions, and for their number we write $C(n, k)$. At first sight it does not seem easy to determine this number. The proof of the following theorem is typical for mathematics – we reduce the problem to another problem we have already dealt with; in our case, the reduction is to standard combinations without repetitions.

Combinations with repetitions
Theorem. The number of combinations with repetitions of $k$-th order from $n$ elements equals, for every $k \ge 0$ and $n \ge 1$,
$C(n, k) = \binom{n+k-1}{k}.$
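The count in the theorem can be confirmed by brute force for small values. A sketch of ours using the standard library:

```python
from itertools import combinations_with_replacement
from math import comb

# Number of k-element choices from n elements, repetitions allowed,
# order irrelevant, compared against the formula C(n, k) = comb(n+k-1, k).
for n in range(1, 7):
    for k in range(6):
        brute = sum(1 for _ in combinations_with_replacement(range(n), k))
        assert brute == comb(n + k - 1, k)
```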
1.26. Determine the number of distinct sentences which can arise by permuting the letters in the individual words of the sentence "Skokan na koks" (the arising sentences and words do not have to make any sense).

Solution. Let us first compute the number of anagrams of the individual words. From the word "skokan" we obtain $6!/2 = 360$ distinct anagrams (the permutations with repetitions $P(1, 1, 1, 1, 2)$), similarly "na" yields two and "koks" yields $4!/2 = 12$. Therefore, using the rule of product, we have $360 \cdot 2 \cdot 12 = \frac{6!\,4!}{2} = 8640$. □

1.27. How many distinct anagrams of the word "krakatit" are there, such that between the two letters "k" there is exactly one other letter?

Solution. In the considered anagrams there are exactly six possible placements of the group of the two letters "k", since the first of the two "k" can be placed at any of the positions 1–6. If we fix the spots for the two "k", then the other letters can be placed arbitrarily, that is, in $P(1, 1, 2, 2)$ ways. Using the rule of product, we have
$6 \cdot P(1, 1, 2, 2) = 6 \cdot \dfrac{6!}{2 \cdot 2} = 1080.$ □

1.28. In how many ways can we insert five golf balls into five holes (one ball into every hole), if we have four white balls, four blue balls and three red balls?

Solution. Let us first solve the problem for the case that we have five balls of every colour. In this case it amounts to a free choice of five elements from three possibilities (there is a choice of three colours for every hole), that is, variations with repetitions (see 1.8). We have $V(3, 5) = 3^5$. Now let us subtract the configurations where all the balls have one colour (there are three such), or where there are exactly four red balls (there are $2 \cdot 5 = 10$ of them; we first choose the colour of the non-red ball – two ways – and then the hole it is in – five ways). Thus we can do it in
$3^5 - 3 - 10 = 230$
ways. □

1.29. In how many ways could the English Premier League have finished, if we know that no two of the three teams Newcastle United, Fulham and Tottenham Hotspur are "adjacent" in the final table? (There are 20 teams in the league.)

Solution. First approach. We use the inclusion-exclusion principle. From the number of all possible resulting tables we subtract the tables where some two of the three teams are adjacent, and then add back the tables where all three teams are adjacent. The number is then
$20! - 3 \cdot 2! \cdot 19! + 3! \cdot 18! = 1741445647958016000.$

Second approach. Let us consider the three teams to be "separators". The remaining 17 teams have to be distributed such that between any two separators there is at least one team. The remaining teams can be arbitrarily permuted, as can the separators. Thus we have
$\binom{18}{3} \cdot 17! \cdot 3! = 1741445647958016000$
ways. □

Proof. The proof is based on a trick (a simple one, once we understand it). We show two different approaches.

Assume first that we are drawing cards from a deck of $n$ different cards, and in order to make it possible to draw some card multiple times, we add $k - 1$ different jokers to the deck (we definitely want to draw at least one of the original cards). Say that we have drawn $r$ original cards and $s$ jokers, that is, $r + s = k$. It seems that we should devise a method for assigning "substitute" original cards to the jokers, so that we know how many times each original card has been drawn. But we actually only need to discuss the number of ways in which this can be done. For that we can use mathematical induction and assume that the claim holds for all arguments smaller than $n$ and $k$: the jokers have to represent a combination with repetitions of $s$-th order from the $r$ drawn original cards, which gives $\binom{r+s-1}{s} = \binom{k-1}{s}$ possibilities – exactly the number of combinations without repetitions of $s$-th order from all the $k-1$ jokers. Thus the theorem is proven.

Alternative approach (induction-free): over the set $S = \{a_1, \dots, a_n\}$, from which we choose the combination, we fix an ordering of the elements, and for our choices of elements of $S$ we prepare $n$ boxes into which we place (in the fixed order) the elements of $S$, one element into every box. The individual choices $x_i \in S$ are then added to the box which already contains this element.
Now let us realize that in order to recover the original combination, we only need to know how many elements there are in the individual boxes. For instance,
a | bbb | cc | d, that is, ∗ | ∗∗∗ | ∗∗ | ∗,
determines the choice $b, b, c$ from the set $S = \{a, b, c, d\}$. In the general case of a choice of $k$ elements from $n$ possible ones we deal with a chain of $n + k$ symbols, and the number $C(n, k)$ equals the number of possible placements of the separators | among the individual elements. This amounts to the choice of $n - 1$ positions from the $n + k - 1$ possible ones. Since
$\binom{n+k-1}{n-1} = \binom{n+k-1}{k},$
the theorem is proven (for the second time). □

3. Difference equations

In the previous paragraphs we have seen formulas which determine the value of a scalar function defined on the natural numbers (the factorial) or on tuples of natural numbers (the binomial coefficients) using already defined values. In the paragraph 1.5 the binomial coefficients are defined by a directly computable formula, but we can also understand them using the relationship exhibited in 1.8 – instead of the value of the function we give the difference corresponding to a change of the variable. Such an approach can be seen very often when formulating mathematical models that describe real systems in economy, biology, etc. We will observe only a few simple examples and will return to this topic in the future.

1.30. For any fixed $n \in \mathbb{N}$ determine the number of all solutions to the equation
$x_1 + x_2 + \cdots + x_k = n$
on the set of strictly positive integers.

Solution. Every solution $(r_1, \dots, r_k)$, $\sum_{i=1}^{k} r_i = n$, can be uniquely encoded as a sequence of ones and separators, where we first write $r_1$ ones, then a separator, then $r_2$ ones, then another separator, and so on. Such a sequence contains $n$ ones and $k - 1$ separators, and since every $r_i$ is positive, no separator stands at the beginning or at the end and no two separators are adjacent. Conversely, every such sequence determines some solution of the given equation. Thus there are exactly as many solutions as there are ways to place the $k - 1$ separators into the $n - 1$ gaps between consecutive ones, that is,
$\binom{n-1}{k-1}.$ □

C. Difference equations

Difference equations (also called recurrence relations) are relations between the elements of some sequence, where an element of the sequence depends on the previous elements. To solve a difference equation means to find an explicit formula for the $n$-th (that is, arbitrary) element of the sequence. The recurrence relation alone allows us to compute the $n$-th element only by computing all the previous ones. If an element of the sequence is determined by the previous element alone, we speak of a first order difference equation. These are present in the real world, for instance, when we want to find out how long the repayment of a loan will take with a fixed monthly repayment, or how much we shall pay per month if we want to repay a loan in a fixed time.

1.31. Michael wants to buy a new car. The car costs €30,000. Michael wants to take out a loan and repay it with fixed monthly repayments. The car company offers him a loan with a yearly interest rate of 6%. Michael would like to finish repaying the loan in three years. How much should he pay per month?
1.9. Linear difference equations of first order. A general difference equation of first order is an expression of the form
$f(n+1) = F(n, f(n)),$
where $F$ is a known scalar function of two arguments. If we know the "initial" value $f(0)$, we can compute $f(1) = F(0, f(0))$, then $f(2) = F(1, f(1))$, and so on. Using this approach we can compute the value $f(n)$ for an arbitrary $n \in \mathbb{N}$. Note that this idea resembles the construction of the natural numbers from the empty set, and the principle of mathematical induction.

An example of such an equation is the definition of the factorial function:
$(n+1)! = (n+1) \cdot n!$
We see that the value of $f(n+1)$ depends on both $n$ and the value of $f(n)$. Another, very simple example is $f(n) = C$ for some fixed scalar $C$ and all $n$, and the so-called linear difference equation of first order
(1.6) $f(n+1) = a \cdot f(n) + b,$
where $a \neq 0$ and $b$ are known scalars.

Such a difference equation is easy to solve if $b = 0$. Then it is the well-known recurrent definition of the geometric progression, and it holds that $f(1) = a f(0)$, $f(2) = a f(1) = a^2 f(0)$, and so on. Thus for all $n$ we have $f(n) = a^n f(0)$. This is also the relation for the so-called Malthusian population growth model, which is based on the assumption that during a given time interval the population grows with a constant ratio $a$ to the state before the interval.

We will prove a general result for first order equations which are similar to the linear ones, but allow varying coefficients in place of $a$ and $b$:
(1.7) $f(n+1) = a_n \cdot f(n) + b_n.$

First let us think about what such equations can describe. The linear difference equation (1.6) can be nicely interpreted as a mathematical model for finance, e.g. savings or loan payoff with a fixed interest rate $a$ and a fixed repayment $b$ (the cases of savings and loans differ only in the sign of $b$). With varying parameters $a$ and $b$ we obtain a similar model with varying interest rate and repayment. We can imagine, for instance, that $n$ is the number of months, $a_n$ is the interest rate in the $n$-th month, and $b_n$ the repayment in the $n$-th month.

Do not be afraid of the seemingly difficult calculations in the following result. It is a typical example of a technical mathematical statement for which it is hard to "guess" precisely how it should be formulated. On the other hand, it is then a simple exercise on the properties of scalars and mathematical induction to prove it. Really interesting are the corollaries, see 1.11 below. In the formulation we use, along with the usual notation $\sum$ for sums, the similar notation $\prod$ for products. In the rest of the text we will also use the convention that when the index set is empty, the sum is zero and the product is one.

Solution. Let $S$ denote the sum Michael has to pay per month. Part of each payment repays the loan, part of it pays the interest. Let $d_k$ stand for the remaining debt after $k$ months. After the first month,
$d_1 = 30000 - S + \frac{0.06}{12} \cdot 30000.$
In general, after the $k$-th month we have
(1.1) $d_k = d_{k-1} - S + \frac{0.06}{12}\, d_{k-1}.$
Using the relation (1.9), $d_k$ is given by
$d_k = \left(1 + \frac{0.06}{12}\right)^k 30000 - \dfrac{\left(1 + \frac{0.06}{12}\right)^k - 1}{\frac{0.06}{12}}\, S.$
Repaying the loan in three years means $d_{36} = 0$; thus we obtain
(1.2) $S = \dfrac{30000 \cdot \frac{0.06}{12}}{1 - \left(1 + \frac{0.06}{12}\right)^{-36}} \doteq 912.7.$ □

Note that the recurrence relation (1.1) can be used for our case only as long as all the values $d_k$ are positive, that is, as long as Michael still has something to repay.
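The repayment recurrence (1.1) is easy to iterate directly. The following sketch (our illustration, not from the text) confirms that the monthly payment $S$ from (1.2) clears the debt after 36 months:

```python
# Iterate d_k = d_{k-1} - S + (0.06/12) * d_{k-1} for Michael's loan.
i = 0.06 / 12                             # monthly interest rate
S = 30000 * i / (1 - (1 + i) ** -36)      # payment from formula (1.2)
print(round(S, 1))                        # 912.7

d = 30000.0
for month in range(36):
    d = d * (1 + i) - S
print(round(abs(d), 6))                   # 0.0: the debt is cleared
```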
1.32. Consider the case from the previous example. For how long would Michael have to pay, if he would like to repay €500 per month?

Solution. Setting $q = 1 + \frac{0.06}{12} = 1.005$ and $c = 30000$, the condition $d_k = 0$ gives the equation
$q^k = \dfrac{200\,S}{200\,S - c},$
and by taking logarithms of both sides we obtain
$k = \dfrac{\ln(200\,S) - \ln(200\,S - c)}{\ln q},$
which for $S = 500$ gives approximately $k \doteq 71.5$. Thus Michael would be paying for six years (and the last repayment would be less than €500). □

1.33. Determine the sequence $\{y_n\}_{n=1}^{\infty}$ which satisfies the following recurrence relation:
$y_{n+1} = \frac{3y_n}{2} + 1, \quad n \ge 1, \quad y_1 = 1.$ ○

Linear recurrences can appear, for instance, in geometric problems:

1.10. Proposition. The general solution of the difference equation (1.7) of first order with the initial condition $f(0) = y_0$ is given by the formula
(1.8) $f(n) = \left(\prod_{i=0}^{n-1} a_i\right) y_0 + \sum_{j=0}^{n-2} \left(\prod_{i=j+1}^{n-1} a_i\right) b_j + b_{n-1}.$

Proof. We will prove the proposition using mathematical induction. It clearly holds for $n = 1$, where it amounts directly to the definition $f(1) = a_0 y_0 + b_0$. Assuming that the statement holds for some fixed $n$, we can easily compute
$f(n+1) = a_n \left( \left(\prod_{i=0}^{n-1} a_i\right) y_0 + \sum_{j=0}^{n-2} \left(\prod_{i=j+1}^{n-1} a_i\right) b_j + b_{n-1} \right) + b_n = \left(\prod_{i=0}^{n} a_i\right) y_0 + \sum_{j=0}^{n-1} \left(\prod_{i=j+1}^{n} a_i\right) b_j + b_n,$
as can be seen directly by multiplying out. □

Let us again note that for the proof we did not need anything about the scalars we used except for the properties of a commutative ring.

1.11. Corollary. The general solution of the linear difference equation (1.6) with $a \neq 1$ and the initial condition $f(0) = y_0$ is
(1.9) $f(n) = a^n y_0 + \dfrac{1 - a^n}{1 - a}\, b.$

Proof. If we set the $a_i$ and $b_i$ to be the constants $a$ and $b$ and use the general formula (1.8), we obtain
$f(n) = a^n y_0 + b \left(1 + \sum_{j=0}^{n-2} a^{n-j-1}\right).$
For evaluating the sum in the second summand we need to observe that it is the expression $(1 + a + \cdots + a^{n-1})\,b$. The sum of this geometric progression can be computed using the formula $1 - a^n = (1 - a)(1 + a + \cdots + a^{n-1})$, and that yields the required result. □

Note that for calculating the sum of a geometric progression we required the existence of inverse elements for non-zero scalars; we could not do that with integers only. Thus the last result holds for a field of scalars, and we can use it for linear difference equations where the coefficients $a$, $b$ and the initial condition $f(0) = y_0$ are rational, real or complex numbers – and also in the ring of remainder classes $\mathbb{Z}_k$ with $k$ prime (we will define remainder classes in paragraph 1.41).

It is noteworthy that the formula (1.9) actually holds even with integer coefficients and initial condition. Then we know in advance that all the $f(n)$ are integers, and the integers are a subset of the rational numbers, so our formula necessarily gives the correct integer solutions. Observing the proof in more detail, we see that $1 - a^n$ is always divisible by $1 - a$, so the last paragraph should not have surprised us. However, it can be seen that with scalars from $\mathbb{Z}_4$ and, say, $a = 3$ we would fail, since $1 - a = 2$ is a divisor of zero there.
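Proposition 1.10 can be sanity-checked by comparing the closed formula (1.8) with direct iteration of (1.7), for example with randomly chosen integer coefficients (a sketch of ours; the function names are hypothetical):

```python
import random

def iterate(a, b, y0):
    """Iterate f(n+1) = a_n * f(n) + b_n starting from f(0) = y0."""
    f = y0
    for an, bn in zip(a, b):
        f = an * f + bn
    return f

def closed_form(a, b, y0):
    """Formula (1.8); the empty product (j = n-1) leaves b_{n-1} alone."""
    n = len(a)
    total = y0
    for ai in a:
        total *= ai                      # (prod of all a_i) * y0
    for j in range(n):
        term = b[j]
        for i in range(j + 1, n):
            term *= a[i]                 # b_j times the product of later a_i
        total += term
    return total

random.seed(1)
a = [random.randint(-3, 3) for _ in range(8)]
b = [random.randint(-3, 3) for _ in range(8)]
assert iterate(a, b, 5) == closed_form(a, b, 5)   # exact integer arithmetic
```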
The segment of the line between two intersections crosses exactly one area, thus the new line crosses at most n +1 areas. Before adding the line, there was at most p„ areas (by the definition of p„). Thus we obtain the recurrence relation pn+1 = pn+(n + 1), from which we obtain an explicit formula for pn either by applying the formula 1.10 or directly: pn = pn-i + n = pn_2 + (n-l)+n = p„-3 + (n - 2) + (n - 1) + n = n(n + 1) n2 + n + 2 = 1 + —-- = - Po □ Recurrence relation can be more complex than first order. Let us list example of combinatorial problems, for whose solution a recurrence relation can be used. us. However it can be seen that with scalars from Z4 and say a — 3 we fail since 1 — a — 2 is a divisor of zero. 1.12. Nonlinear example. Let us return for a while to the first \\ order equation (1.6) we have used for a very primitive population growth model which directly depends on the momentary population size p. On first sight it is W clear that such model with a > 1 leads to a very rapid and unbounded growth. A more realistic model has such population change Ap(n) — p(n + 1) — pin) only for small values of p, that is Ap/p ~ r > 0. Thus if we want to let population grow by 5% for a time interval only for small p, we choose r to be 0, 05. For some limit value p — K > 0 the population does not grow and for even greater values it even decreases (since for instance the resources for the feeding of the population are limited, individuals in a big population are obstacles to each other etc). Let us assume that exactly the values y„ — Apin)/pin) change linearly in pin). Graphically we can imagine this dependence as a line in the plane of variables p and y, which goes through the points [0, r] (that is when p — 0 we have y — r) and [K, 0] (which gives the second condition - when p — K the population does not change). Thus we set y By setting y yn and p ■■ pin + 1) pin) we obtain pin) pin) — pin)+r, that is by multiplying out we obtain a difference equation first order (where the value of p in) is present as both first and second power). (1.10) pin + 1) = p(n)(l - —pin)+r) Try to thing through the behaviour of this model for various , values of r and K. On the picture we can see the ■1 values for parameters r — 0, 05 (that is, five percent growth in the ideal state), K — 100 (the resource limit the population to the size 100) and piO) are two individuals. Heunbakni M/'i^oi r Note that the original almost exponential growth slows later down and the value approaches the desired limit of 100 individuals. For p close to one and K way greater than r the right side of the equation (1.10) is approximately p(n)(l+r), that is the behaviour is similar to that of the Malthusian model. On the other hand, for p almost equal to K the right side of the equation is approximately pin). For initial value of p greater than K the values will decrease, for smaller than K they will grow, thus the system will basically oscillate about the value K. 14 CHAPTER 1. INITIAL WARM UP 1.35. How many words of length 12 that consist only of letters A and B, but do not contain a sub-word BBB, are there? Solution. Let a„ denote the number of words of length n consisting of letters A and B but without BBB as a sub-word. Then for a„ (n > 3) the following recurrence holds since the words of length n that satisfy the given condition either end with an A, or with an AB, or with an ABB. There are a„_i words ending with an A (preceding the last A there can be an arbitrary word of length n — 1 satisfying the condition). 
1.35. How many words of length 12 that consist only of the letters A and B, but do not contain the subword BBB, are there?

Solution. Let $a_n$ denote the number of words of length $n$ consisting of the letters A and B without BBB as a subword. Then for $n > 3$ the following recurrence holds:
$a_n = a_{n-1} + a_{n-2} + a_{n-3},$
since the words of length $n$ that satisfy the given condition end either with an A, or with an AB, or with an ABB. There are $a_{n-1}$ words ending with an A (the last A can be preceded by an arbitrary word of length $n - 1$ satisfying the condition), and analogously for the two remaining groups. Further, we can easily compute that $a_1 = 2$, $a_2 = 4$, $a_3 = 7$. Using the recurrence relation we can then compute
$a_{12} = 1705.$
We could also derive an explicit formula for the $n$-th element of the sequence using the theory we have developed. The characteristic polynomial of the recurrence relation is $x^3 - x^2 - x - 1$, with one real and two complex roots, which we can express using the relations (1.12). □

1.36. The score of a basketball match between the teams of the Czech Republic and Russia is 12 : 9 for the Russian team after the first quarter. In how many ways could the score have developed?

Solution. If we denote by $P_{(k,l)}$ the number of ways in which the score of a quarter ending $k : l$ could have developed, then for $k, l \ge 3$ the following recurrence relation holds:
$P_{(k,l)} = P_{(k-3,l)} + P_{(k-2,l)} + P_{(k-1,l)} + P_{(k,l-1)} + P_{(k,l-2)} + P_{(k,l-3)}.$
(We can divide all possible evolutions of a quarter with the final score $k : l$ into six mutually exclusive possibilities, according to which team scored last and how many points the score was worth – 1, 2 or 3.) Using the symmetry of the problem, it clearly holds that $P_{(k,l)} = P_{(l,k)}$. Further,

4. Probability

Let us have a look at another frequent example of scalar-valued functions – observed values that are known neither explicitly by a formula nor implicitly by some description. They are the result of some randomness, and we try to describe the probability of particular outcomes happening.

1.13. What is probability? As a simple example we can use common six-sided dice throwing, with the sides labelled 1, 2, 3, 4, 5, 6. If we describe a mathematical model of throwing such a "fair" dice, we expect, and thus also require, that every side occurs with the same frequency. In words, we say that "every side chosen in advance occurs with the probability $\frac{1}{6}$". But if you try to manufacture such a dice from wood with a knife, you will probably observe that the relative frequencies of the sides are not the same. In such a situation, we can, after a large number of tries, count the relative frequencies of the individual labels and set these to be the probabilities in our mathematical description. But no matter how large the number of tries is, we cannot exclude the possibility that all the tries we did produced some unlikely combination of results, and thus that our model is not well chosen.

In the following part we will work with an abstract mathematical description of probability in the simplest approach. The question of how accurate or adequate the model is for a specific real-world problem lies outside the realm of mathematics. But that does not mean that such questions are not for mathematicians – quite the opposite (most likely in cooperation with experts in the given area). Later we will return to probability and see it as a theory describing the behaviour of random processes, or of fully deterministic processes where not all the determining parameters are known. Mathematical statistics then allows us to say how much we can expect a given model to correspond to reality, or allows us to determine the parameters of the model in such a way that the correspondence with the observations is high, and simultaneously to estimate the reliability of the chosen model. For both probability and statistics a complex mathematical theory is required, which we will build over the course of a few semesters.

On the example of our dice we can imagine it as follows: in probability theory we work with the parameters $p_i$ for the probabilities of the individual sides and only require that these probabilities are non-negative and that their sum is
$p_1 + p_2 + p_3 + p_4 + p_5 + p_6 = 1.$
When choosing specific values $p_i$ for a specific dice, mathematical statistics can then estimate the reliability of our mathematical model of the dice.

Our humble goal for now is just to indicate how to abstractly capture probabilistic considerations in formal mathematical objects. The following paragraphs are thus basically just exercises in simple operations with sets and in combinatorics (that is, in counting the number of possibilities satisfying given conditions for finite sets).
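The relation between the parameters $p_i$ and the observed relative frequencies from 1.13 can be illustrated by simulation (our sketch; the simulated dice here is fair):

```python
import random
from collections import Counter

random.seed(0)
rolls = 60000
freq = Counter(random.randint(1, 6) for _ in range(rolls))
for side in range(1, 7):
    print(side, round(freq[side] / rolls, 3))   # each close to 1/6 = 0.167
```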
On the example of our dice we can imagine it as follows: in the probability theory we work with the parameters pi for the probabilities of individual sides and only require that these probabilities are non-negative and their sum is P1+ P2 + P3+ P4 + P5 + P6 — 1- When choosing specific values pi for a specific dice in mathematical statistics we can then estimate the reliability of our mathematical model of the die. Our humble goal for now is just to indicate how to abstractly capture the probabilistic considerations in formal f ~-±Z% mathematical objects. The following paragraphs are thus basically just exercises in simple operations with sets and combinatorics (that is, calculating the number of possibilities of satisfying the condition for finite sets). 15 CHAPTER 1. INITIAL WARM UP we have for k > 3 that: P(k,2) = P(k-3,2) + P(k-2,2) + P(k-l,2) + P(k,l) + P(k,0), P(k,l) = P(k-3,l) + P(k-2,l) + P(k-l,l) + P(k,0), P(k,0) = P(k-3,0) + P(k-2,0) + ^*(i-l,0). which along with the initial condition gives P(o,o) = 1> Ai,o) : ^(2,0) = 2, ^(3,0) = 4, ^(1,1) = 2, P(2,l) = -P(l,l) + P(0,l) + ^(2,0) P(2,2) = P(0,2) + P(l,2) + P(2,l) + P(2,0) = H g^eS p(l29) = 497178513. □ Remark. We see that the recurrence relation in this problem has a more complex form in comparison to the form we have dealt with in our theory and thus we cannot evaluate arbitrary number P^,i) explicitly, we can evaluate it only by a subsequent computing from previous elements. Such an equation is called partial difference equations, since the elements of the equation are indexed by two independent variables (k,l). We will talk more about recurrent formulas (difference equations) of higher orders with constant coefficients in chapter 3. D. Probability Let us state a few simple exercises for classical probability, where we are dealing with some experiment with only a finite number of outcomes ("all cases") and we are interested whether the outcome of the experiment belongs to a subset of possible outcomes ("favourable outcomes"). The probability we are trying to determine then equals to the number of favourable outcomes divided by the total number of all outcomes. Classical probability can be used when we assume (know) that each of the possible outcome has the same probability of happening (for instance, fair dice throwing). 1.37. What is the probability that the roll of a dice results to a number greater than 4? Solution. There are six possible outcomes (the set {1, 2, 3, 4, 5, 6}) of which two are favourable ({5, 6}). Thus the probability is 2/6 = 1/3. □ 1.38. We randomly choose a group of five people from a group of eight men and four women. What is the probability that there are at least three women in the chosen group? Solution. We compute the probability as a quotient of the number of favourable outcomes to the total number of outcomes. We divide the 1.14. Random events. We work with a non-empty fixed set £2 of all possible outcomes, which we call the sample space. For simplicity the set £2 is finite with elements &>i, ...,&>„, corresponding to individual possible outcomes. Every subset Acfi represent a possible event. The set of subsets A of the sample space is called the set of events, if • £2 e A (the sample space is an event), • if A, B e A, then A \ B e A (that is, for every two events their set difference is also an event), • if A, B e A, then A U B e A (that is, for every two events their union is also an event). 
Clearly also the complement Ac — Q \ A of an event A is an event, which we call the opposite event to the event A. The intersection of two events is again an event, since for every two subsets A, B } c £2 are called elementary events, Pospisil definuje samotne prvky jako el.jevy.Jinde se to definuje i jeste ji-nak 16 CHAPTER 1. INITIAL WARM UP favourable cases according to the number of men in the chose group: there can be either two or one. There are eight groups with five people of which one is a man (all women have to be present in such groups, thus it depends only on which man we choose). There are c(8, 2) • c(4, 3) = (j) • (3) of groups with two men (we choose two men from eight and then independently three men from four, these two choices can be independently combined and thus using the rule of product we obtain the number of such groups). Thus the total number of groups with five people and at least three women is c(12, 5) = (j2). The probability is then 8 + 0© 5 (?) 33- □ Let us give an example for which the use of classical probability is not suitable: 1.39. What is the probability that the reader of this exercise wins at least €25 million euro in EuroLotto during the next week? Solution. Such a formulation is incomplete, it does not give us enough information. We present a "wrong" solution. The sample space of possible outcomes is two-element: either the reader wins or not. Favourable event is one (win), thus the probability is 1/2 (a clearly wrong answer). □ Remark. In the previous exercise the basic condition of the usage of classical probability was violated - every elementary event must have the same probability. In fact, the elementary event has not been defined. EuroLotto has a daily draw with jackpot of €25, 000, 000 for choosing 5 correct numbers 1 — 50. There is no other way to win €25, 000, 000 than to win a jackpot on some of the day during the week. The elementary event would be that a single lotto card with 5 numbers wins a jackpot. Assuming that the reader submits k lotto cards every day of the week, the probability of winning at least one jackpot during the week equals — 2n876o- Ik 1.40. There are 2n seats in a row in cinema. We randomly seat n men and n women in the row. What is the probability that no two persons of the same sex sit next to each other? Solution. In total there are (2n!) of possible seatings, the number of seating satisfying the given condition is 2(n\)2: we have two ways for choosing the positions for men (thus also for women) - either all men sit on odd-numbered places (thus women sit on even-numbered places), or vice versa. Among these places, both men and women are translators note: im really unsure about the english terminology regarding some of these terms... • intersection of events At, i e I, corresponds to the event n,e/A,, union of events At, i e I, corresponds to the event • A, B € A are mutually exclusive, if A n B — 0, • the event A has as a corollary the event B, if A c B, Give an example of all listed terms for the sample space of dice rolling or analogously for coin throwing! 1.15. Definition. Probability space is a triple (£2, A, P), where A is a set of events of (finite) sample space £2, where there is a scalar function P : A -> K with the following properties: • P is non-negative, that is, P(A) > 0 for all events A, • P is additive, that is, P(A U B) — P(A) + P(B), whenever A, B From a sack with five white and five red balls we randomly draw three (we do not return the balls back to the sack). 
What is the probability that two of them are white and one is red?

Solution. Let us divide the event into a union of three disjoint events, according to the turn in which the red ball is drawn. The probabilities that the red ball is drawn third, second, or first are, respectively,
$\frac{1}{2} \cdot \frac{4}{9} \cdot \frac{5}{8}, \qquad \frac{1}{2} \cdot \frac{5}{9} \cdot \frac{1}{2}, \qquad \frac{1}{2} \cdot \frac{5}{9} \cdot \frac{1}{2}.$
In total,
$\frac{5}{36} + \frac{5}{36} + \frac{5}{36} = \frac{5}{12}.$

Another solution. Consider the number of all possible triples of drawn balls (balls of the same colour being indistinguishable), that is, $\binom{10}{3}$. There are $\binom{5}{2} \cdot \binom{5}{1}$ triples with exactly two white balls (two white balls can be drawn in $\binom{5}{2}$ ways, and one red ball can join them in five ways). The required probability is then
$\dfrac{\binom{5}{2}\binom{5}{1}}{\binom{10}{3}} = \dfrac{5}{12}.$ □

The same result is obtained for a general probability function $P$ on a set of events. Since $A \cap B$ and $A \setminus B$ are mutually exclusive events,
$P(A) = P(A \setminus B) + P(A \cap B),$
similarly for $B$, and we also have
$P(A \cup B) = P(A \setminus B) + P(B \setminus A) + P(A \cap B).$
If we express the probabilities $P(A \setminus B)$ and $P(B \setminus A)$ in terms of $P(A)$, $P(B)$ and $P(A \cap B)$, we obtain the formula (1.11), now for the general case.

The following theorem is a direct reflection of the so-called combinatorial inclusion-exclusion principle in our finite probability, and it says how to deal with the multiple counting of elementary events in the general case. It is probably a good example of a mathematical theorem where the hardest part is finding a good formulation; after that is done, we can say that the claim is (intuitively) obvious.

The picture shows the situation for three sets $A$, $B$, $C$ and classical probability. If we just sum the probabilities of $A$, $B$ and $C$, then the hatched areas represent elements counted twice, and the double-hatched area those counted three times. Thus we subtract the hatched areas once, which leads to subtracting the elements of the double-hatched area three times; we must therefore add them once more.

In general, thanks to the additivity of probability, we can imagine that we decompose every event into a union of elementary events (although the elementary events do not have to belong to the set of events under consideration). The probability of every event is then given by the sum of the probabilities of its elementary events. For expressing the probability that at least one of the events occurs, we can sum the probabilities of all the $A_i$, then subtract those elementary events that are present twice (that is, in an intersection of two of the $A_i$). However, we might have subtracted some events too many times, notably in the case that an element was present in at least three of the $A_i$. Thus we again add, and so on.

Theorem. Let $A_1, \dots, A_k \in \mathcal{A}$ be arbitrary events over the sample space $\Omega$ with a set of events $\mathcal{A}$. Then
$P\!\left(\bigcup_{i=1}^{k} A_i\right) = \sum_{i=1}^{k} P(A_i) - \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} P(A_i \cap A_j) + \sum_{i=1}^{k-2} \sum_{j=i+1}^{k-1} \sum_{\ell=j+1}^{k} P(A_i \cap A_j \cap A_\ell) - \cdots + (-1)^{k-1}\, P(A_1 \cap A_2 \cap \cdots \cap A_k).$
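For classical probability the theorem can be verified by direct enumeration. A sketch of ours for three events on a small sample space (exact arithmetic with fractions):

```python
from fractions import Fraction

omega = range(1, 13)                       # twelve equally likely outcomes
P = lambda E: Fraction(len(set(E)), 12)    # classical probability

A = {x for x in omega if x % 2 == 0}       # even numbers
B = {x for x in omega if x % 3 == 0}       # multiples of three
C = {x for x in omega if x <= 4}           # at most four

lhs = P(A | B | C)
rhs = (P(A) + P(B) + P(C)
       - P(A & B) - P(A & C) - P(B & C)
       + P(A & B & C))
assert lhs == rhs                          # inclusion-exclusion for k = 3
```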
1.44. From a hat in which there are five white, five red and six black balls we randomly draw balls (and do not return the drawn balls back). What is the probability that the fifth drawn ball is black?

Solution. We will solve the general problem: the probability that the $i$-th drawn ball is black. This probability is the same for all $i$, $1 \le i \le 16$ – we can imagine that we draw all the balls one by one, and every such sequence (from the first drawn ball to the last one), consisting of five white, five red and six black balls, has the same probability of being drawn. Thus we can use classical probability. There are
$P(5, 5, 6) = \frac{16!}{5!\,5!\,6!}$
such sequences. The number of sequences with a black ball in the $i$-th place, the rest being arbitrary, equals the number of arbitrary sequences of five white, five red and five black balls, that is,
$P(5, 5, 5) = \frac{15!}{5!\,5!\,5!}.$
Thus the probability is
$\dfrac{P(5,5,5)}{P(5,5,6)} = \dfrac{15!}{5!\,5!\,5!} \cdot \dfrac{5!\,5!\,6!}{16!} = \dfrac{6}{16} = \dfrac{3}{8}.$ □

Let us return to dice throwing and try to describe the events of the sample space $\Omega$ that arises when we throw until a six is rolled, but no more than a hundred times. For a single roll, the sample space consists of the six numbers from one to six, and this is classical probability. For the whole series of rolls the sample space is much bigger: it consists of the finite sequences of numbers from one to six which have at most 100 elements, such that all numbers but the last one are from one to five, and either the last number is a six, or the sequence has length exactly 100 with no six in it.

An event $A$ can be, for instance, the subset "the six appears in the second roll". The favourable elementary events are then
$[1, 6], [2, 6], [3, 6], [4, 6], [5, 6].$
Using the classical probability for the single dice rolls we derive the probabilities of events in $\Omega$. But it is not classical probability: the event $A$ means that the first roll is not a six and the second one is, so
$P(A) = \frac{5}{6} \cdot \frac{1}{6} = \frac{5}{36},$
since the first roll differs from six with probability $1 - \frac{1}{6}$ and the second roll is completely independent of the first one. Clearly, this does not amount to the quotient of the number of favourable cases by the size of the sample space.

Proof. In order to turn the aforementioned ideas into a proof, we would need to ensure that all the additions and subtractions happen with coefficient one. Instead of doing that, we give a more formal proof by mathematical induction over the number $k$ of events whose probabilities we are summing. Try to compare both approaches as they are presented; it should help to clarify what it means to "prove" and what it means to "understand".

For $k = 1$ the claim is obvious; the case $k = 2$ is the same as the equality (1.11), which we have already proved (in the general case). Let us assume that the theorem holds for any number of events up to a fixed $k > 1$. In the induction step we can work with the formula for $k + 1$ events, where the union of the first $k$ of them is considered to be the $A$ in the equation (1.11) and the remaining event is considered to be the $B$:
$P\!\left(\left(\bigcup_{i=1}^{k} A_i\right) \cup A_{k+1}\right) = P\!\left(\bigcup_{i=1}^{k} A_i\right) + P(A_{k+1}) - P\!\left(\bigcup_{i=1}^{k} (A_i \cap A_{k+1})\right).$
Applying the inductive assumption to both unions of $k$ events turns them into sums of the form $\sum_j \left((-1)^{j+1} \sum P(A_{i_1} \cap \cdots \cap A_{i_j})\right)$, and collecting the terms which do and do not contain $A_{k+1}$ yields exactly the required formula for $k + 1$ events. □

Recall that events are called stochastically independent if the probability of the intersection of any subcollection of them equals the product of their individual probabilities. From there we can easily derive that by exchanging one or more events in a set of stochastically independent events for their complements, we again obtain a set of stochastically independent events.

Very often we need to compute the probability that at least one of a set of stochastically independent events occurs, that is, $P(A_1 \cup \cdots \cup A_k)$. In such a situation we can use elementary properties of set operations, the so-called De Morgan laws,
$\left(\bigcup_{i \in I} A_i\right)^c = \bigcap_{i \in I} A_i^c, \qquad \left(\bigcap_{i \in I} A_i\right)^c = \bigcup_{i \in I} A_i^c,$
and we obtain
(1.12) $P(A_1 \cup \cdots \cup A_k) = 1 - P(A_1^c \cap \cdots \cap A_k^c) = 1 - \left(1 - P(A_1)\right) \cdots \left(1 - P(A_k)\right).$

Recall also the conditional probability: for events $A$ and $H$ with $P(H) > 0$, the probability of $A$ conditioned on the hypothesis $H$ is defined as $P(A|H) = P(A \cap H)/P(H)$. If $P(A_1) > 0$ and $P(A_2) > 0$, then
$P(A_1 \cap A_2) = P(A_2)\,P(A_1|A_2) = P(A_1)\,P(A_2|A_1).$
All these numbers express (in different manners) the probability that both events $A_1$ and $A_2$ occur. For instance, in the last case we first look at whether the first event occurred; then, assuming the first has occurred, we look at whether the second one occurs as well. Similarly, for three events $A_1, A_2, A_3$ satisfying $P(A_1 \cap A_2 \cap A_3) > 0$ we obtain
$P(A_1 \cap A_2 \cap A_3) = P(A_1)\,P(A_2|A_1)\,P(A_3|A_1 \cap A_2).$
In words, this can be described as follows: the probability that three events occur at once can be computed by first computing the probability that the first occurs, then the probability that the second occurs under the assumption that the first has occurred, and then the probability that the third occurs under the assumption that both the first and the second have occurred. If we have, in general, $k$ events $A_1, \dots, A_k$ satisfying $P(A_1 \cap \cdots \cap A_k) > 0$, then
$P(A_1 \cap \cdots \cap A_k) = P(A_1)\,P(A_2|A_1) \cdots P(A_k|A_1 \cap \cdots \cap A_{k-1}).$
Really, thanks to the assumption, all the probabilities of the intersections which are taken as the hypotheses are non-zero. By expanding the expression we obtain both on the left and on the right side of the equation the probability of the event corresponding to the intersection of $A_1, \dots, A_k$.

1.21. Geometric probability. In practical problems we often encounter much more complicated models, where the sample space is not a finite set. At this moment we do not have even the basic tools for generalising probability to infinite sets at our disposal, but we can give at least a simple illustration.

The following exercise is a simple model which estimates the probability of the death of a person in a traffic accident.

1.46. Approximately 1200 persons die per year on the roads of the Czech Republic. Determine the probability that some person from a chosen group of 500 people dies in a traffic accident in the following ten years. For simplicity, assume that every person has the same "chance" of dying in a traffic accident in a given year, namely $1200/10^7$.

Solution. Let us first count the probability that one randomly chosen person does not die in a traffic accident in ten years. The probability that they do not die in a year is $\left(1 - \frac{1200}{10^7}\right)$. The probability that they do not die in ten years is then $\left(1 - \frac{1200}{10^7}\right)^{10}$. The probability that in ten years none of the given 500 people dies is, again using the product rule (the events are independent), $\left(1 - \frac{1200}{10^7}\right)^{5000}$. The probability of the opposite event, that is, that some of the chosen people dies, is then
$1 - \left(1 - \frac{1200}{10^7}\right)^{5000} \doteq 0.4512.$ □

Remark. The model we have used in the previous exercise to describe the given situation is just approximate. The complication is in the condition that every person in the sample has the same probability of dying, which we have derived from the total number of deaths per year. But the number of deaths changes yearly, and even if it did not, the population changes. Let us show one of the possible inaccuracies with a different approach to the solution: if 1200 persons die per year, then in ten years 12000 persons die. The probability that a certain person dies in ten years can thus be estimated by $12000/10^7$. The probability that a specific person does not die in ten years is then $\left(1 - \frac{12000}{10^7}\right)$ (the first two terms of the binomial expansion of $\left(1 - \frac{1200}{10^7}\right)^{10}$). In total we analogously obtain the estimate of the probability
$1 - \left(1 - \frac{12000}{10^7}\right)^{500} \doteq 0.4514.$
We see that both estimates are very close to each other.
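Both estimates from exercise 1.46 and its remark are a one-liner to reproduce (our sketch):

```python
p = 1200 / 10**7                              # yearly chance for one person
print(round(1 - (1 - p) ** 5000, 4))          # 0.4512 (500 people, 10 years)
print(round(1 - (1 - 10 * p) ** 500, 4))      # 0.4514 (the cruder estimate)
```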
1.21. Geometric probability. In practical problems we often encounter much more complicated models where the sample space is not a finite set. At this moment we do not yet have the basic tools for generalising probability to infinite sets at our disposal, but we can give at least a simple illustration.

The following exercise is a simple model which estimates the probability of the death of a person in a traffic accident.

1.46. Approximately 1200 persons die per year on the roads of the Czech Republic. Determine the probability that some person of a chosen group of 500 people dies in a traffic accident within the following ten years. For simplicity, assume that every person has the same "chance" of dying in a traffic accident in a given year, namely $1200/10^7$.

Solution. Let us first count the probability that one randomly chosen person does not die in a traffic accident within ten years. The probability of not dying within one year is $1-\frac{1200}{10^7}$, so the probability of not dying within ten years is $\big(1-\frac{1200}{10^7}\big)^{10}$. The probability that none of the given 500 people dies within ten years is, again by the product rule (the events are independent), $\big(1-\frac{1200}{10^7}\big)^{5000}$. The probability of the opposite event, that is, that some of the chosen people dies, is then
$$1-\Big(1-\frac{1200}{10^7}\Big)^{5000} \doteq 0.4512.\ \square$$

Remark. The model we used in the previous exercise is only approximate. The weak spot is the assumption that every person in the sample has the same probability of dying, which we derived from the total number of deaths per year. But the number of deaths changes from year to year, and even if it did not, the population changes. Let us show one of the possible inaccuracies on a different approach to the solution: if 1200 persons die per year, then 12000 persons die in ten years. The probability that a certain person dies within ten years can thus be estimated by $12000/10^7$, so the probability that a specific person does not die within ten years is $1-\frac{12000}{10^7}$ (the first two terms of the binomial expansion of $\big(1-\frac{1200}{10^7}\big)^{10}$). In total we analogously obtain the estimate
$$1-\Big(1-\frac{12000}{10^7}\Big)^{500} \doteq 0.4514.$$
We see that both estimates are very close to each other.
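Both estimates are easy to reproduce; a one-liner of a sketch (ours, not the book's):

```python
p = 1200 / 10**7                        # per-person, per-year death probability
exact_model = 1 - (1 - p) ** 5000       # 500 people, 10 years each
linearised  = 1 - (1 - 10 * p) ** 500   # 10*p as the two-term binomial shortcut
print(round(exact_model, 4), round(linearised, 4))  # 0.4512 0.4514
```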
Consider the plane $\mathbb{R}^2$ of pairs of real numbers and its subset Ω with known area vol Ω; for example, we can take the unit square. Events are represented by subsets $A\subset\Omega$, and for the set of events $\mathcal{A}$ we consider some suitable system of subsets for which we can determine the area. An event A occurs if a randomly chosen point from Ω belongs to the subregion determined by A, otherwise it does not occur.

Take, for instance, the problem in which we randomly choose two values $a<b$ in the interval $[0,1]\subset\mathbb{R}$; all values a and b are chosen with the same probability, and the question is: what is the probability that the interval (a, b) has length at least one half? The choice of the pair (a, b) is actually the choice of a point [a, b] inside the triangle Ω with vertices [0,0], [0,1], [1,1] (see the picture). We can imagine this as a description of a problem in which a very tired guest at a party tries to divide a sausage by two cuts into three pieces, for himself and his two friends. What is the probability that somebody gets at least half of the sausage? We need to determine the area of the subregion corresponding to the points with $b\ge a+\frac12$, that is, the inside of the triangle A with vertices $[0,\frac12]$, $[0,1]$, $[\frac12,1]$. Clearly we get
$$P = \frac{\mathrm{vol}\,A}{\mathrm{vol}\,\Omega} = \frac{1/8}{1/2} = \frac14.$$
Try to answer on your own the question: what is the minimal prescribed length l such that the probability of choosing an interval of length at least l is one half?

1.22. Monte Carlo methods. One of the efficient methods for the approximate computation of values is simulating the probability by the relative frequency of occurrence of a chosen event. For instance, the well-known formula for the area of a disc with a given radius says that the area of the unit disc is exactly the constant π = 3.1415..., which expresses the ratio between the area of the disc and the square of its radius. (Note that there is a fact here we have not proven: why should the area of a disc be a constant multiple of the square of its radius? We will be able to prove this mathematically after we learn how to integrate. Experimentally, we can verify it by the approach given below with discs of different sizes.)

If we choose Ω to be the unit square and A to be the intersection of Ω with the unit disc centred at the origin, then $\mathrm{vol}\,A = \frac14\pi$. Thus, if we have a reliable generator of random numbers between zero and one and we compute the relative frequency with which the generated point [a, b] has distance from the origin smaller than one, that is, $a^2+b^2<1$, then the result (after a large number of attempts) approximates the number $\frac14\pi$ pretty well. Numerical approaches based on this principle are called Monte Carlo methods.
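A minimal sketch of such an estimate of π (our own code, with the obvious choice of names):

```python
import random

def monte_carlo_pi(samples=1_000_000):
    """Estimate pi from the fraction of random points of the
    unit square that fall inside the unit disc (area pi/4)."""
    inside = sum(
        1 for _ in range(samples)
        if random.random() ** 2 + random.random() ** 2 < 1
    )
    return 4 * inside / samples

print(monte_carlo_pi())  # ~3.14, improving slowly with more samples
```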
The effort to use mathematical knowledge to win in various gambling games is very old. Let us have a look at a very simple example.

1.47. Alex has €2500 left over from organising a summer camp. Alex is no fool: he added €50 from his savings and decided to play roulette, betting only on colour. The probability of winning a bet on colour is 18/37. He begins by betting €10, and if he loses, in the next bet he bets twice the amount he bet previously (but only if he has enough money; if not, he ends the game, even if he has some money left). If he wins, in the next round he again bets €10. What is the probability that, using this strategy, he wins another €2550? (As soon as he has won that amount, he ends the game.)

Solution. Let us first count how many times in a row Alex can lose. If he begins with a bet of €10, then for n bets he needs
$$10+20+\cdots+10\cdot 2^{n-1} = 10\sum_{i=0}^{n-1}2^i = 10\,(2^n-1).$$
As we can easily see, the number 2550 is of the form $10(2^n-1)$ for n = 8. Alex can thus bet eight times in a row no matter what the results are; for nine bets he would need $10(2^9-1) = 5110$ euro, and during the game he never has that much (as soon as he reaches €5100, he ends). Thus, in order to fail, he must lose eight times in a row. The probability of losing a single bet is 19/37, so the probability of losing eight times in a row is $(19/37)^8$ (the bets are independent). The probability that during these at most eight games he wins €10 (using his strategy) is thus $1-(19/37)^8$. In order to win €2550, he needs to win €10 in this way 255 times. Again by the product rule, the probability of winning is
$$\big(1-(19/37)^8\big)^{255} \doteq 0.29.$$
Thus the probability of winning this way is lower than when betting everything at once on a colour. □

1.48. On your own, try to solve the previous exercise assuming that Alex uses the same strategy as before but ends only when he has no money (if he cannot afford to double the bet after losing the previous one but still has some money, he begins again with €10).

Let us now exercise the so-called "conditional" probability (see (1.20)).

1.49. What is the probability that, when rolling two dice, the sum is 7, if we know that neither of the rolls resulted in a 2?

Solution. Let B be the event that neither of the rolls results in a 2, and let A be the event "the sum is 7". The set of all possible outcomes is again denoted by Ω. Then
$$P(A|B) = \frac{P(A\cap B)}{P(B)} = \frac{|A\cap B|/|\Omega|}{|B|/|\Omega|} = \frac{|A\cap B|}{|B|}.$$
The sum 7 can appear in four ways if there is no 2, that is, $|A\cap B| = 4$ and $|B| = 5\cdot5 = 25$, thus
$$P(A|B) = \frac{4}{25}.$$
Note that $P(A) = \frac16 \ne \frac{4}{25}$, that is, A and B are not independent. □
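The count $|A\cap B| = 4$ and the (non-)independence are easy to check by brute force; a short enumeration sketch of ours:

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # all 36 outcomes
A = [r for r in rolls if sum(r) == 7]          # the sum is 7
B = [r for r in rolls if 2 not in r]           # no roll shows a 2
AB = [r for r in A if r in B]

print(len(AB), len(B))                                   # 4 25
print(Fraction(len(AB), len(B)), Fraction(len(A), 36))   # 4/25 vs 1/6
```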
5. Plane geometry

So far we have been using elementary notions from the geometry of the real plane in an intuitive way. Now we will investigate in more detail how to deal with the need to describe "position in the plane" and to find relations between the positions of distinct points in the plane. Our tools will again be mappings, but this time we will consider only very special rules which to pairs of values (x, y) assign pairs $(w,z) = F(x,y)$. This part will also serve as a gentle introduction to the area of mathematics called linear algebra, which we will deal with in the subsequent three chapters.

1.23. Vector space R². Let us view the "plane" as the set of pairs of real numbers $(x,y)\in\mathbb{R}^2$. We will call these pairs vectors in R². For such vectors we can define addition "coordinate-wise": for vectors $u = (x,y)$ and $v = (x',y')$ we set $u+v = (x+x',\,y+y')$. Since all the properties of commutative groups hold for the individual coordinates, they hold for our new vector addition too. In particular, there exists the so-called zero vector $0 = (0,0)$, whose addition to any vector v results again in the vector v. We use the same symbol 0 for the vector and for its scalar coordinates on purpose: it will always be clear from the context which "zero" is meant.

Next we define the multiplication of vectors by scalars: for $a\in\mathbb{R}$ and $v = (x,y)\in\mathbb{R}^2$ we set $a\cdot v = (ax, ay)$. Usually we omit the symbol "·", and the juxtaposition $a\,v$ denotes the scalar multiple of the vector. We can directly check further properties of scalar multiplication and vector addition, for instance
$$a\,(u+v) = a\,u + a\,v,\qquad (a+b)\,u = a\,u + b\,u,\qquad a\,(b\,u) = (ab)\,u,$$
where we again use the same symbol "plus" for both vector and scalar addition. These operations are easy to visualise if we consider the vectors v to be arrows going from the origin 0 = [0, 0] and ending at the position [x, y] in the plane. Such arrows can be composed, one placed right after another, and this composition corresponds to vector addition. Multiplication by a scalar a corresponds to stretching the arrow to its a-multiple.

Now we are able to take a very important step: if we fix the two vectors $e_1 = (1,0)$ and $e_2 = (0,1)$, then every vector can be obtained as $u = (x,y) = x\,e_1 + y\,e_2$. The expression on the right is called a linear combination of the vectors e1 and e2. The pair of vectors $e = (e_1, e_2)$ is called a basis of the vector space R². If we choose any other two vectors u, v such that neither is a multiple of the other, that is, a different basis of R², we can do the same: the linear combinations $w = x\,u + y\,v$ give, for all distinct pairs (x, y), exactly all the vectors w of the plane.

Finally, we can consider the vectors to be arrows in an abstract position, that is, we forget the identification of the points of the plane with pairs of numbers. The only fact that remains is that all the arrows are "fixed" at the point 0, which is also the zero vector.

1.50. Michael has two mailboxes, one at gmail.com and the other at hotmail.com. His username is the same on both servers, but the passwords are different (and he does not remember which password corresponds to which server). When typing the password for his mailbox, he makes a typo with probability 5% (that is, if he intends to type a specific password, with probability 95% he actually types it). Michael typed a username and a password at the server hotmail.com, but the server rejected it. What is the probability that he chose the correct password but just mistyped it? (We assume that the username is always typed correctly.)

Solution. Let A be the event that Michael typed a wrong password at hotmail.com. This event is a union of two disjoint events: A1, he wanted to type the correct password and mistyped it; A2, he wanted to type the wrong password (the one from gmail.com), whether he then mistyped it or not. We are looking for the conditional probability $P(A_1|A)$, which according to the formula for conditional probability is
$$P(A_1|A) = \frac{P(A_1\cap A)}{P(A)} = \frac{P(A_1)}{P(A_1\cup A_2)} = \frac{P(A_1)}{P(A_1)+P(A_2)},$$
so we just need to determine the probabilities P(A1) and P(A2). The event A1 is an intersection of two independent events: Michael wanted to type the correct password, and Michael mistyped. According to the problem statement, the probability of the first is 1/2 and of the second 1/20; in total $P(A_1) = \frac12\cdot\frac1{20} = \frac1{40}$ (we multiply the probabilities since the events are independent). Further, directly from the problem statement, $P(A_2) = \frac12$. In total $P(A) = P(A_1)+P(A_2) = \frac1{40}+\frac12 = \frac{21}{40}$, and we can evaluate
$$P(A_1|A) = \frac{1/40}{21/40} = \frac{1}{21}.\ \square$$
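The same computation in a few lines of exact arithmetic (a sketch of ours):

```python
from fractions import Fraction

p_pick_right = Fraction(1, 2)    # he chose the hotmail password
p_typo       = Fraction(1, 20)   # 5% chance of a typo

P_A1 = p_pick_right * p_typo     # right password, but mistyped
P_A2 = Fraction(1, 2)            # wrong password, rejected anyway
print(P_A1 / (P_A1 + P_A2))      # 1/21
```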
Operations of addition and scalar multiplication remain, and only through a choice of the basis e1, e2 do we identify our plane of arrows with R².

1.24. Affine plane. If we fix a vector $u\in\mathbb{R}^2$, we can add it (that is, compose it with, as arrows) to any point P = [x, y]. Any fixed vector u thus defines a shift, which maps every point P of the plane to P + u.

The method of geometric probability can be used whenever the given sample space consists of infinitely many elementary events which altogether fill some region on a line, in the plane, or in space (where we can determine length, area, volume, ...). We assume that the probability that an elementary event from a given subregion occurs equals the ratio of the volume of the subregion to the volume of the sample space.

1.51. From Edinburgh Waverley station, trains depart every hour (in the direction of Aberdeen), and trains from Aberdeen to Edinburgh also run every hour. Assume that the trains move between the two stations at a uniform speed of 72 km/h and are 100 metres long; the trip takes 2 hours in either direction, and the trains meet each other somewhere along the trail. After visiting an Edinburgh pub, John, who lives in Aberdeen, takes a train home and falls asleep at the departure. During the trip from Edinburgh to Aberdeen he randomly sticks his head out of the train for five seconds, into the space where the trains ride in the other direction. What is the probability that he loses his head? (We assume that there are no other trains there.)

Solution. The mutual speed of the oncoming trains is 40 m/s, so an oncoming train passes John's window for two and a half seconds. The sample space of all outcomes is thus the interval (0, 7200 s) of John's trip. During the trip, two trains pass John's window in the opposite direction, and any overlap of their 2.5 s passing interval with the 5 s interval when John's head may be sticking out is fatal. Thus, for each train, the space of "favourable" outcomes is an interval of length 7.5 s somewhere in the sample space; for the two trains it is twice this amount. The probability of losing the head is therefore $15/7200 \doteq 0.002$. □

1.52. In one of the countries of the world, once a day between eight a.m. and eight p.m., a bus departs at a random time from town A to town B. Once a day, within the same time window, another bus departs in the other direction. The trip takes five hours in either direction. What is the probability that the buses meet, if they use the same trail?

Solution. The sample space is a square 12 × 12. If we denote the departure times of the buses by x and y, then they meet on the trail if and only if $|x-y| < 5$. This inequality determines the region of "favourable events" inside the square. The area of the remaining part is easier to compute, since it is a union of two right-angled isosceles triangles with legs of length 7. In total their area is 49, so the area of the "favourable" part is 144 − 49 = 95, and the probability is $p = \frac{95}{144} \doteq 0.66$. □
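The value 95/144 ≈ 0.66 can again be checked by a Monte Carlo experiment in the spirit of 1.22 (our sketch):

```python
import random

def buses_meet(trials=1_000_000):
    """Fraction of random departure pairs (x, y) in a 12 h window
    for which the 5 h trips overlap, i.e. |x - y| < 5."""
    hits = sum(
        1 for _ in range(trials)
        if abs(random.uniform(0, 12) - random.uniform(0, 12)) < 5
    )
    return hits / trials

print(buses_meet(), 95 / 144)  # both close to 0.6597...
```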
1.53. A rod of length two metres is randomly divided into three parts. Determine the probability that at least one part is at most 20 cm long.

Solution. A random division of the rod into three parts is given by the two cut points x and y (we first cut the rod at distance x from the origin; we do not move it, and cut it again at distance y from the origin). The sample space is thus a square C with side 2 m, that is, measuring in centimetres, 200 × 200. If we place the square C so that two of its sides lie on the axes in the plane, then the condition that at least one part is at most 20 cm long determines in the square the subregion
$$O = \{(x,y)\in C \mid (x < 20)\vee(x > 180)\vee(y < 20)\vee(y > 180)\vee(|x-y| < 20)\}.$$
As we can compute, this subregion has area $\frac{51}{100}$ of the area of the whole square. □

E. Plane geometry

Let us return for a while to the complex numbers. The complex plane is basically a "normal" plane where we have something extra:

1.54. Interpret multiplication by the imaginary unit i and taking the complex conjugate as geometric transformations of the plane.

Solution. The imaginary unit i corresponds to the point (0, 1). Multiplying any number $z = a+ib$ by the imaginary unit gives
$$i\cdot(a+ib) = -b+ia,$$
which, in the interpretation in the plane, is just the rotation of the point z through the right angle in the positive sense, that is, counterclockwise. Taking the complex conjugate is the reflection through the axis of real numbers: $z = a+ib \mapsto a-ib = \bar z$. □

Now one well-known but nevertheless useful exercise.

1.55. Determine the sum of the angles between the x-axis and the vectors (1, 1), (2, 1) and (3, 1) in the plane R² (see the picture).

Solution. If we view the plane R² as the Gauss plane (of complex numbers), the given vectors correspond to the complex numbers 1+i, 2+i and 3+i, and we are to find the sum of their arguments, which, according to de Moivre's formula, is the argument of their product. Their product is
$$(1+i)(2+i)(3+i) = (1+3i)(3+i) = 10i,$$
a purely imaginary number with argument π/2; thus the sum we are looking for is exactly π/2. □

If the observer proceeds in the usual way, the order of the operations does not matter; that is, he can first jump b-times in the direction e2 and only then a-times in the direction e1. What we have just described is called the choice of an (affine) coordinate system in the plane; the point O is its origin, and in general every point P of the plane is identified with the pair of numbers [a, b], which we also denote as the shift P − O. From now on we will work in fixed coordinates, that is, with pairs of real numbers; for better orientation we write vectors in parentheses, while brackets are used for the coordinates of points in the affine plane.

1.25. Lines in the plane. If our observer can shift by any multiple of a fixed vector, he also knows what a line is. It is a subset $p\subset A$ of the plane such that there exist a point O and a non-zero vector v with
$$p = \{P\in A;\ P-O = t\,v,\ t\in\mathbb{R}\}.$$
Let us describe the points $P(t)\in p$ in the chosen coordinates, with the choice $v = (\alpha,\beta)$:
$$x(t) = x_0 + \alpha\,t,\qquad y(t) = y_0 + \beta\,t.$$
Since the vector $v = (\alpha,\beta)$ is non-zero, at least one of the numbers α, β is non-zero. Assuming for instance $\alpha\ne 0$, we can eliminate t from the parametric equations, and a simple computation gives
$$-\beta\,x + \alpha\,y = -\beta\,x_0 + \alpha\,y_0.$$
That is the general equation of the line
$$(1.13)\qquad a\,x + b\,y = c,$$
with the following relation between the pair of numbers $(a,b) = (-\beta,\alpha)$ and the direction vector $v = (\alpha,\beta)$ of the line:
$$(1.14)\qquad a\,\alpha + b\,\beta = 0.$$
The expression on the left-hand side of the equation of the line (1.13) can be viewed as a scalar function F which depends on the points in the plane and has values in R; the equation itself is a condition on its value. We shall see later that the vector (a, b) is exactly the direction in which F grows the fastest.
For this reason, the direction perpendicular to (a, b) is exactly the direction in which our function F remains constant. The constant c then determines which among all the parallel lines with this direction the equation describes.

Now let us take two lines p and q and ask about their intersection p ∩ q. It is described as the point which satisfies the equations of both lines simultaneously. Let us write them like this:
$$(1.15)\qquad a\,x + b\,y = r,\qquad c\,x + d\,y = s.$$
Again we can view the left-hand sides as a mapping which to every pair of coordinates [x, y] of a point P in the plane assigns the vector of values of the two scalar functions F1 and F2 given by the left-hand sides of the particular equations (1.15). Thus we can write our equations as a single relation F(v) = w, where F is the mapping which maps the vector v, describing the position of a point in the plane (in our coordinates), to the vector given by the left-hand sides, and we demand that this value equals the prescribed w = (r, s).

1.26. Linear mappings and matrices. The mappings F we worked with when describing the intersection of lines have one very important property in common: they preserve the operations of addition and multiplication of vectors by scalars, that is, they preserve linear combinations:
$$F(a\cdot v + b\cdot w) = a\cdot F(v) + b\cdot F(w)$$
for all $a,b\in\mathbb{R}$ and $v,w\in\mathbb{R}^2$. We say that F is a linear mapping from R² to R², and write $F:\mathbb{R}^2\to\mathbb{R}^2$. In words: a linear combination of vectors maps to the same linear combination of the images, that is, linear mappings are exactly those mappings which preserve linear combinations.

1.56. Write the characteristic equation of the line
$$p:\ x = 2-t,\quad y = 1+3t,\quad t\in\mathbb{R}.$$
Solution. The vector (−1, 3) gives the direction of the line p. Therefore the vector (3, 1) is normal to p, and the characteristic equation of p is 3x + y + c = 0 for some $c\in\mathbb{R}$. We determine c by substituting x = 2, y = 1 (the line p passes through the point [2, 1], obtained for t = 0). We get c = −7 and consequently the result 3x + y − 7 = 0. □

1.57. We are given the line $p: [2,0]+t\,(3,2)$, $t\in\mathbb{R}$. Determine the characteristic equation of this line and its intersection with the line $q: [-1,2]+s\,(1,3)$, $s\in\mathbb{R}$.

Solution. The coordinates of the points of the line are given by the parametric equations as x = 2 + 3t and y = 0 + 2t. Eliminating t from the equations, we obtain the characteristic equation
$$2x - 3y - 4 = 0.$$
We obtain the intersection of p with the line q by substituting the parametric expression of the points of q into the characteristic equation of p:
$$2(-1+s) - 3(2+3s) - 4 = 0,$$
which gives s = −12/7, and from the parametric equation of q we obtain the coordinates of the intersection
$$P = \Big[-\frac{19}{7},\,-\frac{22}{7}\Big].\ \square$$

1.58. Determine the intersection of the lines
$$p:\ x+y-4 = 0,\qquad q:\ x = -1+2t,\ y = 2+t,\ t\in\mathbb{R}.$$
Solution. Let us first note that the direction of p is given by the vector $u_p = (1,-1)$ (any nonzero vector perpendicular to the vector (1, 1) from the characteristic equation of p), and the direction of q is given by the vector $u_q = (2,1)$. As the vector $u_p$ is not a multiple of the vector $u_q$, the lines have a nonempty intersection (they are not parallel). The point [x, y] is the intersection if and only if its coordinates satisfy the equation of p and there exists a real number t such that x = −1 + 2t, y = 2 + t. Substituting this into the equation of p, we obtain
$$(-1+2t) + (2+t) - 4 = 0.$$
This equation is satisfied by t = 1, which gives the intersection with coordinates x = 1, y = 3. □
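The defining property from 1.26 is easy to test numerically for any concrete mapping; in the following sketch (ours, with an arbitrarily chosen mapping) we check it on random data:

```python
import random

def F(v):
    """A linear mapping R^2 -> R^2, here (x, y) |-> (2x - 3y, x + y)."""
    x, y = v
    return (2 * x - 3 * y, x + y)

for _ in range(5):
    a, b = random.random(), random.random()
    v = (random.random(), random.random())
    w = (random.random(), random.random())
    lhs = F((a * v[0] + b * w[0], a * v[1] + b * w[1]))   # F(a v + b w)
    rhs = (a * F(v)[0] + b * F(w)[0], a * F(v)[1] + b * F(w)[1])
    assert all(abs(l - r) < 1e-12 for l, r in zip(lhs, rhs))
print("linearity verified on random samples")
```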
We have already encountered the same behaviour in the equation (1.13) of a line, where the linear mapping in question was $F:\mathbb{R}^2\to\mathbb{R}$ together with its prescribed value c. That is also the reason why the values z = F(x, y) of such a mapping are depicted as a plane in R³.

We will write such mappings with the help of the so-called matrices and their multiplication. By a matrix we mean a rectangular scheme of scalars, for instance
$$A = \begin{pmatrix} a & b\\ c & d\end{pmatrix}\quad\text{or}\quad v = \begin{pmatrix} x\\ y\end{pmatrix};$$
we speak of a (square) matrix A and a (column) vector v. Their multiplication is defined as follows:
$$A\cdot v = \begin{pmatrix} a & b\\ c & d\end{pmatrix}\cdot\begin{pmatrix} x\\ y\end{pmatrix} = \begin{pmatrix} ax+by\\ cx+dy\end{pmatrix}.$$
Similarly, instead of a vector we can multiply from the right by another matrix B of the same dimension as A: we just apply the given formulas to the individual columns of the matrix B, and the result is again a square matrix. We cannot multiply the vector v from the right by the matrix A, because the number of scalars in the rows of v and in the columns of A differ. However, we can write the vector w as a row of scalars (the so-called transposed vector) $w^T = (a\ \ b)$, and this we can already multiply from the right by our matrix A or by a vector v.

We can easily check the so-called associativity of multiplication (do it in detail for general matrices A, B and a vector v):
$$(A\cdot B)\cdot v = A\cdot(B\cdot v).$$
Of course, instead of the vector v we can write any matrix C of the correct dimension. Similarly easily we can see that distributivity also holds:
$$A\cdot(B+C) = A\cdot B + A\cdot C.$$
However, commutativity does not hold, and there also exist "divisors of zero". For instance
$$\begin{pmatrix}0&1\\0&0\end{pmatrix}\cdot\begin{pmatrix}0&0\\0&1\end{pmatrix} = \begin{pmatrix}0&1\\0&0\end{pmatrix},\qquad \begin{pmatrix}0&0\\0&1\end{pmatrix}\cdot\begin{pmatrix}0&1\\0&0\end{pmatrix} = \begin{pmatrix}0&0\\0&0\end{pmatrix}.$$
We observe in particular that multiplication of vectors by a fixed matrix gives a linear mapping and, in the other direction, from the values of a linear mapping F on the two fixed basis vectors we recover the whole corresponding linear mapping. Thus points in the plane are in general images of values of linear mappings from the plane to the plane, while lines are in general preimages of values of linear mappings from the plane to the real line R. With matrices and vectors we can write the equations for lines and points as A·v = w.

Of course, in particular situations it does not have to be like this. For instance, the intersection of two identical lines is the line itself (and the preimage of a particular value of the corresponding linear mapping is then a whole line), and the preimage of zero under the zero mapping is the whole plane. The first case happens exactly when the left-hand sides of the equations (1.15) are the same up to a scalar multiple (said in another way, the rows of the matrix A are the same up to a scalar multiple). In such a case either the intersection of the lines is empty (the lines are parallel but distinct) or it contains all the points of the line (identical lines).

1.59. Find the characteristic equation of the line p which goes through the point [2, 3] and is parallel to the line x − 3y + 2 = 0, and the parametric equation of the line q which goes through the points [1, 3] and [−2, 1].

Solution. Every line parallel to the line x − 3y + 2 = 0 is given by the equation x − 3y + c = 0 for some $c\in\mathbb{R}$. The line p goes through the point [2, 3]. Therefore it must hold that 2 − 3·3 + c = 0, i.e. c = 7, and p: x − 3y + 7 = 0. We can immediately write down a parametric equation of the line q:
$$q:\ [1,3] + t\,(1-(-2),\ 3-1) = [1,3] + t\,(3,2),\quad t\in\mathbb{R}.\ \square$$
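Returning to the matrix products above: the failure of commutativity and the divisors of zero can be replayed directly. A sketch of ours with a minimal hand-written multiplication:

```python
def matmul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[0, 1], [0, 0]]
B = [[0, 0], [0, 1]]
print(matmul(A, B))  # [[0, 1], [0, 0]] -- A.B is not zero
print(matmul(B, A))  # [[0, 0], [0, 0]] -- but B.A is the zero matrix
```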
This condition can be expressed by saying that the ratios a/c and b/d must be the same, that is,
$$(1.16)\qquad a\,d - b\,c = 0.$$
Note that this expression already takes care of the cases where c or d is zero.

1.27. Determinant of a matrix. The expression on the left in (1.16) is called the determinant of the matrix A, and we write for it
$$\det A = a\,d - b\,c.$$
Our discussion can now be expressed as follows:

Proposition. The determinant is a scalar function det A defined for all square matrices A, and the equation A·v = u has a unique solution if and only if det A ≠ 0.

It was essential here that we work with a field of scalars; try to think it through. For instance, the statement does not hold over the integers in general: if we solve equations with integer coefficients (that is, the matrix A has only integer entries), the solution does not have to be integral.

1.60. Determine whether some of the lines
$$p_1: 2x+3y-4 = 0,\quad p_2: x-y+3 = 0,\quad p_3: -2x+2y = -6,\quad p_4: -x-\tfrac32\,y+2 = 0,\quad p_5: x = 2+t,\ y = -2-t,\ t\in\mathbb{R}$$
are parallel.

Solution. It is clear that $-2\cdot(-x-\frac32\,y+2) = 2x+3y-4$; these two characteristic equations thus describe the same line, so p1 and p4 form the only pair of identical lines. The vector (2, 3) is normal to the line p1; for the line p2 such a vector is (1, −1), for the line p3 it is (−2, 2), and for the line p5 it is (1, 1) (perpendicular to the direction vector (1, −1)). The lines p2 and p3 are parallel (the vectors perpendicular to them are multiples of each other), but distinct, since the equations x − y + 3 = 0 and −2x + 2y + 6 = 0 clearly have no common solution. There are no further pairs of parallel lines. □

1.28. Affine mappings. Let us now investigate how matrix notation allows us to work with simple mappings in the affine plane. We have seen that matrix multiplication defines a linear mapping. A shift in the affine plane by a fixed vector $t = (r,s)\in\mathbb{R}^2$ can also easily be written in matrix notation:
$$\begin{pmatrix}x\\y\end{pmatrix}\mapsto\begin{pmatrix}x+r\\y+s\end{pmatrix}.$$
If we allow ourselves to add a fixed vector to the result of a linear mapping, our expressions take the form
$$\begin{pmatrix}x\\y\end{pmatrix}\mapsto\begin{pmatrix}a\,x+b\,y+r\\ c\,x+d\,y+s\end{pmatrix}.$$
In this way we have described exactly all the so-called affine mappings of the plane to itself. Such mappings allow us to recompute coordinates arising from different choices of origins and bases of directions for shifting. What happens if our observer from paragraph 1.23 observes the plane from a different point, or chooses different points E1, E2? Try to think through the fact that, at the level of coordinates, the difference will be realised exactly by an affine mapping. Later we will see general reasons why this holds in any dimension.

1.61. Determine the line p which is perpendicular to the line q: 6x − 7y + 13 = 0 and goes through the point [−6, 7].

Solution. Since the vector normal to q is the direction vector of p, we can directly write the result:
$$p:\ x = -6+6t,\quad y = 7-7t,\quad t\in\mathbb{R}.\ \square$$

1.62. Give an example of numbers $a,b\in\mathbb{R}$ such that the vector u is normal to the line AB, where A = [1, 2], B = [2b, b], u = (a − b, 3).

Solution. The direction of AB is (2b − 1, b − 2) (this vector is always nonzero), and therefore the vector (2 − b, 2b − 1) is normal to AB. Setting
$$2-b = a-b,\qquad 2b-1 = 3,$$
we obtain a = b = 2. □
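The Proposition of 1.27 also hides an explicit solution formula for a 2×2 system, known as Cramer's rule. A small sketch of ours:

```python
def solve2x2(a, b, c, d, r, s):
    """Solve a*x + b*y = r, c*x + d*y = s, provided det A != 0."""
    det = a * d - b * c
    if det == 0:
        raise ValueError("no unique solution: det A = 0")
    # Cramer's rule: replace one column of A by the right-hand side
    return ((r * d - b * s) / det, (a * s - r * c) / det)

# e.g. 2x - y = 5 and x + 2y = 5 meet at (3.0, 1.0) (cf. 1.63 below)
print(solve2x2(2, -1, 1, 2, 5, 5))
```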
1.63. Determine the relative position of the lines p, q in the plane, where p: 2x − y − 5 = 0 and q: x + 2y − 5 = 0. If they are not parallel, determine the coordinates of their intersection.

Solution. From the characteristic equations of p, q we read off the normal vectors (2, −1) and (1, 2). The lines are parallel if and only if these vectors are multiples of each other, which is not the case. The intersection is found by solving the system
$$2x - y - 5 = 0,\qquad x + 2y - 5 = 0.$$
Expressing y = 2x − 5 from the first equation and substituting it into the second, we obtain
$$x + 2\,(2x-5) - 5 = 0,\quad\text{i.e. } x = 3,$$
and then easily y = 2·3 − 5 = 1. The intersection thus is [3, 1]. □

1.64. Consider the plane R² with the standard coordinate system. A laser ray is sent from the origin [0, 0] in the direction (3, 1). It hits the mirror line p given by
$$p:\ [4,3]+t\,(-2,1),$$
and is reflected (the angle of reflection equals the angle of incidence). At which point does the reflected ray meet the line q, given by
$$q:\ [7,-10]+t\,(-1,6)\,?$$
Solution. The angle between the line p and the direction of the ray is 45°; the reflected ray is thus perpendicular to the incoming one, and its direction is (1, −3). (Be careful with the orientation! The direction vector of the reflected ray can also be obtained by reflecting the direction vector of the incoming ray through the direction vector of the mirror line, an axial symmetry.) The intersection of the line given by the reflected ray with the line q is the point [4, 8], which lies outside the half-line given by the reflected ray (it corresponds to the parameter value t = −2). Thus the reflected ray does not meet the line q. □

Remark. The reflection of a ray in three-dimensional space is studied in the exercise 3.53.

1.65. A line segment of length 1 started moving at noon with constant speed 1 m/s in the direction (3, 2) from the point [−2, 0]. Another line segment, also of length 1, started moving from the point [5, −2] at noon as well, in the direction (−1, 1), but with double speed. Will they collide?

Solution. The lines along which the segments move can be described parametrically:
$$p:\ [-2,0]+r\,(3,2),\qquad q:\ [5,-2]+s\,(-1,1).$$
The characteristic equation of the line p is 2x − 3y + 4 = 0. Substituting the parametric equation of the line q yields the intersection point P = [1, 2]. Now let us try to choose a single parameter t for both lines so that the corresponding points describe the positions (more precisely, the positions of the initial points) of the two segments at time t. At time 0 the first segment is at [−2, 0], the second at [5, −2]. In time t (measured in seconds) the first segment travels t units of length in the direction (3, 2), the second travels 2t units in the direction (−1, 1). The corresponding parametrisations are thus
$$p:\ [-2,0]+\frac{t}{\sqrt{13}}\,(3,2),\qquad q:\ [5,-2]+t\,\sqrt{2}\,(-1,1).$$
The initial point of the first segment reaches the point [1, 2] at time $t_1 = \sqrt{13}$ s, the initial point of the second segment at time $t_2 = 2\sqrt{2}$ s, more than half a second sooner. At the time $t_2+\frac12 = 2\sqrt2+\frac12 < t_1$ the ending point of the second segment already moves away from P. Thus, when the initial point of the first segment reaches the point P, the ending point of the second segment is already away, and the segments do not collide. □

1.66. A planar soccer player shoots a ball from the point f = [1, 0] in the direction (3, 4), hoping to hit the goal, which is the line segment between the points a = [23, 36] and b = [26, 30]. Does the ball fly towards the goal?

Solution. Since the situation takes place in the first quadrant, it is sufficient to compare the slopes of the vectors fa, (3, 4), fb. If they form either an increasing or a decreasing sequence (in the order written), the ball flies towards the goal. The sequence is 36/22, 4/3, 30/25, which is decreasing, and thus the ball flies towards the goal. □

1.67. Simplify the expression $(A-B)^T\cdot 2C\cdot u$ for given 2×2 matrices A, B, the matrix $C = \begin{pmatrix}2&0\\-1&1\end{pmatrix}$ and a vector u.

Solution. By plugging in and carrying out the matrix multiplications we obtain
$$(A-B)^T\cdot 2C\cdot u = \begin{pmatrix}-52\\ 64\end{pmatrix}.\ \square$$

1.29. Euclidean plane. Let us now give our observer the ability to see and measure distances. For instance, we can trust the usual formula for the length of the vector v = (a, b),
$$\|v\| = \sqrt{a^2+b^2},$$
in the affine coordinates chosen by the observer. Immediately we can define notions such as angle and rotation in the plane. We can easily imagine it like this: our observer declares the points E1 and E2 to be at distance 1 from the origin, and also declares them to be perpendicular. Distances in the directions of the coordinate axes are then given by the corresponding ratios, and in general the Euclidean (Pythagorean) theorem is used; this leads to the formula above.

Of course, our observer can proceed in a different manner: he can use some specific standard for the actual measurement of the distance of points P and Q in the plane, and then declare that the length of the vector Q − P is exactly the distance one travels when shifting from P to Q. He then picks some vector of length 1, constructs a perpendicular vector of length 1 (for instance, using the triangle with sides 3, 4 and 5), and continues as before.

The Euclidean plane is an affine plane together with the notion of distance just described.

1.30. Angle between vectors. The so-called trigonometric function cos φ, which we have already used in the discussion of complex numbers as points in the plane, is given by the value of the first coordinate of the unit vector whose angle with the vector (1, 0) equals φ.
1.31. Rotation. The matrix of the counter-clockwise rotation of the plane R² through the angle ψ is easy to guess: the matrix of a linear mapping has as its columns the vectors (a, c) and (b, d), where the first column (a, c) is obtained by multiplying the matrix by the basis vector (1, 0), and the second is the value at the second basis vector (0, 1). We can see from the picture that for the rotation counter-clockwise through the angle ψ these columns are
$$\begin{pmatrix}a\\c\end{pmatrix} = \begin{pmatrix}\cos\psi\\ \sin\psi\end{pmatrix},\qquad \begin{pmatrix}b\\d\end{pmatrix} = \begin{pmatrix}-\sin\psi\\ \cos\psi\end{pmatrix}.$$
The counter-clockwise direction is called the positive direction, the other one the negative direction. Therefore we obtain the claim:

Rotation matrix. The rotation through a given angle ψ in the positive direction about the origin is given by the matrix $R_\psi$:
$$R_\psi = \begin{pmatrix}\cos\psi & -\sin\psi\\ \sin\psi & \cos\psi\end{pmatrix}.$$

Now, since we know what the matrix of a rotation in the plane looks like, we can check that a rotation preserves distances and angles (as defined by the previously given formulas). Let us denote the image of a vector v by
$$v' = R_\psi\cdot v = \begin{pmatrix}v_x\cos\psi - v_y\sin\psi\\ v_x\sin\psi + v_y\cos\psi\end{pmatrix},$$
and similarly $w' = R_\psi\cdot w$. We can easily check that it really holds that
$$\|v'\| = \|v\|,\qquad v_x'\,w_x' + v_y'\,w_y' = v_x\,w_x + v_y\,w_y.$$
The previous expression can be written using vectors and matrices as
$$(R_\psi\cdot w)^T\,(R_\psi\cdot v) = w^T\cdot v.$$
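A quick numerical confirmation of these invariance properties (our own sketch):

```python
import math

def rotate(psi, v):
    """Apply the rotation matrix R_psi to the vector v = (x, y)."""
    x, y = v
    return (x * math.cos(psi) - y * math.sin(psi),
            x * math.sin(psi) + y * math.cos(psi))

v, w, psi = (3.0, 1.0), (-2.0, 5.0), 0.7
v2, w2 = rotate(psi, v), rotate(psi, w)
print(math.hypot(*v), math.hypot(*v2))                     # equal lengths
print(v[0]*w[0] + v[1]*w[1], v2[0]*w2[0] + v2[1]*w2[1])    # equal dot products
```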
The transposed vector $(R_\psi\cdot w)^T$ equals $w^T\cdot R_\psi^T$, where $R_\psi^T$ is the so-called transpose of the matrix $R_\psi$: the matrix whose rows are the columns of the original matrix, and whose columns are the rows of the original matrix. We therefore see that the rotation matrices satisfy the relation
$$R_\psi^T\cdot R_\psi = I,$$
where I is the so-called unit matrix (sometimes we denote this matrix just by 1 and mean by it the unit in the ring of matrices),
$$I = \begin{pmatrix}1&0\\0&1\end{pmatrix}.$$
This leads us to a remarkable observation: the matrix F with the property $F\cdot R_\psi = I$ (we call such a matrix the inverse matrix of the rotation matrix $R_\psi$) is the transpose of the original matrix. That makes sense, since the inverse mapping of the rotation through the angle ψ is again a rotation, but through the angle −ψ; that is, the inverse matrix of $R_\psi$ equals the matrix
$$R_{-\psi} = \begin{pmatrix}\cos(-\psi) & -\sin(-\psi)\\ \sin(-\psi) & \cos(-\psi)\end{pmatrix} = \begin{pmatrix}\cos\psi & \sin\psi\\ -\sin\psi & \cos\psi\end{pmatrix}.$$

1.68. Give an example of matrices A and B for which
(a) $(A+B)\cdot(A-B) \ne A\cdot A - B\cdot B$;
(b) $(A+B)\cdot(A+B) \ne A\cdot A + 2\,A\cdot B + B\cdot B$.

Solution. Recall that we are considering two-dimensional (square) matrices A and B. For any two matrices A and B we have
$$(A+B)\cdot(A-B) = A\cdot A - A\cdot B + B\cdot A - B\cdot B.$$
The identity $(A+B)\cdot(A-B) = A\cdot A - B\cdot B$ is thus obtained if and only if $-A\cdot B + B\cdot A$ is the zero matrix, that is, if and only if the matrices A and B commute. Suitable examples are therefore exactly the pairs of matrices which do not commute (the product changes when we change the order of the factors). We can choose, for instance,
$$A = \begin{pmatrix}1&2\\3&4\end{pmatrix},\qquad B = \begin{pmatrix}4&3\\2&1\end{pmatrix},$$
since with this choice
$$A\cdot B = \begin{pmatrix}8&5\\20&13\end{pmatrix},\qquad B\cdot A = \begin{pmatrix}13&20\\5&8\end{pmatrix}.$$
Analogously, for any pair of matrices A, B,
$$(A+B)\cdot(A+B) = A\cdot A + A\cdot B + B\cdot A + B\cdot B.$$
That means that $(A+B)\cdot(A+B) = A\cdot A + A\cdot B + A\cdot B + B\cdot B$ is satisfied if and only if A·B = B·A. In the second case the answer is thus exactly the same as in the first case. □

1.69. Decide whether the mappings $F, G:\mathbb{R}^2\to\mathbb{R}^2$ given by
$$F\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}7x-3y\\ -2x+5y\end{pmatrix},\qquad G\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}2x+2y-4\\ 4x-9y+3\end{pmatrix},\qquad x,y\in\mathbb{R},$$
are linear.

Solution. For any vector $(x,y)^T\in\mathbb{R}^2$ we can express
$$F\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}7&-3\\-2&5\end{pmatrix}\cdot\begin{pmatrix}x\\y\end{pmatrix},\qquad G\begin{pmatrix}x\\y\end{pmatrix} = \begin{pmatrix}2&2\\4&-9\end{pmatrix}\cdot\begin{pmatrix}x\\y\end{pmatrix} + \begin{pmatrix}-4\\3\end{pmatrix}.$$
This implies that both mappings are affine. Recall that an affine mapping is linear if and only if it maps the zero vector to zero. Since
$$F\begin{pmatrix}0\\0\end{pmatrix} = \begin{pmatrix}0\\0\end{pmatrix},\qquad G\begin{pmatrix}0\\0\end{pmatrix} = \begin{pmatrix}-4\\3\end{pmatrix},$$
the mapping F is linear, while the mapping G is not. □

1.70. Let us consider a regular hexagon ABCDEF (the vertices labelled in the positive direction) with centre at the point S = [1, 0] and vertex A = [0, 2]. Determine the coordinates of the vertex C.

Solution. The coordinates of the vertex C can be obtained by rotating the point A about the centre S of the hexagon through the angle 120° in the positive direction:
$$C = \begin{pmatrix}\cos 120^\circ & -\sin 120^\circ\\ \sin 120^\circ & \cos 120^\circ\end{pmatrix}\cdot(A-S) + S = \Big[\frac{3}{2}-\sqrt{3},\ -1-\frac{\sqrt{3}}{2}\Big].\ \square$$

1.71. Determine the angle between the two vectors
(a) u = (−3, −2), v = (−2, 3);
(b) u = (2, 6), v = (−3, −9).

Solution. The angle φ between the vectors u and v satisfies
$$\cos\varphi = \frac{u\cdot v}{\|u\|\,\|v\|}$$
(or vice versa: knowing the angle, the formula determines the dot product). In the case (a) we have $u\cdot v = (-3)\cdot(-2)+(-2)\cdot 3 = 0$; the vectors in the case (a) are thus perpendicular, that is, φ = π/2. In the case (b) we have $v = -\frac32\,u$, hence cos φ = −1 and φ = π. □

It is easy to write the rotation about a point $P\ne O$ in matrix form as well; the formula just has to be combined with shifts:

$$v\ \mapsto\ R_\psi\cdot(v-w) + w = \begin{pmatrix}\cos\psi\,(x-w_x) - \sin\psi\,(y-w_y) + w_x\\ \sin\psi\,(x-w_x) + \cos\psi\,(y-w_y) + w_y\end{pmatrix},$$
where w is the centre of the rotation.

1.32. Reflection. Another well-known example of a mapping which preserves lengths is the so-called reflection through a line. Again it suffices to describe reflections through lines going through the origin O; all other reflections can be derived from these using shifts and rotations. Let us look for the matrix $Z_\psi$ of the reflection with respect to the line whose direction is given by a unit vector v making the angle ψ with the vector (1, 0). Let us first realise that
$$Z_0 = \begin{pmatrix}1&0\\0&-1\end{pmatrix}$$
is the reflection with respect to the x-axis. In general, we can rotate any line so that it has the direction (1, 0), and thus we can write a general reflection matrix as
$$Z_\psi = R_\psi\cdot Z_0\cdot R_{-\psi},$$
where we first rotate via the matrix $R_{-\psi}$ so that the line is in the "zero" position, reflect with the matrix $Z_0$, and return back with the rotation $R_\psi$.
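The composed formula is easy to check numerically; in the following sketch (ours) we build Zψ from Rψ and Z0 and verify that a reflection is an involution, that is, Zψ·Zψ = I:

```python
import math

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def R(psi):  # rotation matrix R_psi
    return [[math.cos(psi), -math.sin(psi)],
            [math.sin(psi),  math.cos(psi)]]

Z0 = [[1, 0], [0, -1]]                       # reflection through the x-axis
psi = 0.9
Z = mat_mul(mat_mul(R(psi), Z0), R(-psi))    # Z_psi = R_psi . Z0 . R_{-psi}
print(mat_mul(Z, Z))                         # the unit matrix, up to rounding
```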

The matrix $A_1 = \begin{pmatrix}1&0\\0&0\end{pmatrix}$ maps $(x,y)^T$ to $(x,0)^T$, which means that the linear mapping given by this matrix is the projection onto the x-axis. Similarly we can see that the matrix $A_2 = \begin{pmatrix}-1&0\\0&1\end{pmatrix}$ determines the reflection with respect to the y-axis, since $A_2\cdot(x,y)^T = (-x,y)^T$. The matrix $A_3$ can be expressed in the form
$$\begin{pmatrix}\cos\psi&-\sin\psi\\ \sin\psi&\cos\psi\end{pmatrix}$$
for a suitable angle ψ, that is, it is a rotation matrix.

det A satisfies all the three conditions we wanted. How many such mappings can there possibly be? Every vector can be expressed using the two basis vectors $e_1 = (1,0)$ and $e_2 = (0,1)$, and by linearity every possibility for vol Δ is uniquely determined by its value on this pair of vectors. Since for areas, in the same way as for determinants, we clearly have vol Δ(e1, e1) = vol Δ(e2, e2) = 0 (due to the required antisymmetry), every such scalar function is necessarily determined by its value on the single pair of arguments (e1, e2). Therefore all the possibilities are equal up to a scalar multiple, which can be fixed by the condition
$$\mathrm{vol}\,\Delta(e_1, e_2) = \frac12,$$
that is, we choose the orientation and the scale through the choice of the basis vectors, and we want the unit square to have area equal to one. Thus we see that the determinant gives the area of the parallelogram determined by the columns of the matrix A, and the area of the corresponding triangle is one half of that.

1.35. Visibility in the plane. The previous description of the value of the oriented area gives us an elegant tool for determining the position of a point relative to oriented line segments. By an oriented line segment we mean two points in the plane R² with a fixed order. We can imagine it as an arrow from one point to the other. Such an oriented line segment divides the plane into two half-planes; let us call them "left" and "right". We want to be able to tell whether a given point is in the left or in the right half-plane. Such tasks are often met in computer graphics when dealing with the visibility of objects. For simplicity, we can imagine that a line segment can be "seen" from the points to the right of it and cannot be seen from the points to the left of it (this corresponds to the convention that an object bounded by line segments oriented counterclockwise has its interior to the left of the segments, and through the interior a segment cannot be seen).

1.80. Prove that for any vectors $u,v\in\mathbb{R}^2$,
$$2\,(\|u\|^2+\|v\|^2) = \|u+v\|^2+\|u-v\|^2.$$
Solution. Rewriting both sides of the equation in the coordinates $u = (u_1,u_2)$, $v = (v_1,v_2)$ yields
$$2\,(\|u\|^2+\|v\|^2) = 2\,(u_1^2+u_2^2+v_1^2+v_2^2)$$
$$= u_1^2+2u_1v_1+v_1^2+u_2^2+2u_2v_2+v_2^2 + u_1^2-2u_1v_1+v_1^2+u_2^2-2u_2v_2+v_2^2$$
$$= (u_1+v_1)^2+(u_2+v_2)^2+(u_1-v_1)^2+(u_2-v_2)^2 = \|u+v\|^2+\|u-v\|^2.\ \square$$

1.81. Show that composing an odd number of point reflections in the plane yields again a point reflection.

Solution. A point reflection in the plane across the point S is given by the formula $X\mapsto S-(X-S)$, that is, $X\mapsto 2S-X$. (The image of the point X under this reflection is obtained by adding the vector opposite to X − S to the vector S.) Repeated application of three point reflections across the points S, T and U thus yields
$$X\ \mapsto\ 2S-X\ \mapsto\ 2T-(2S-X)\ \mapsto\ 2U-(2T-(2S-X)) = 2\,(U-T+S)-X,$$
that is, $X\mapsto 2(U-T+S)-X$, which is the point reflection across the point S − T + U. The composition of any odd number of point reflections can thus be reduced to a composition of three point reflections, hence it is a point reflection (in principle this is a proof by mathematical induction; try to formulate it by yourself). □

1.82. Construct a (2n+1)-gon, given the middle points of all its sides.

Solution. We use the fact that the composition of an odd number of point reflections is again a point reflection (see the previous exercise). Denote the vertices of the (2n+1)-gon we are looking for by $A_1, A_2,\dots,A_{2n+1}$ and the middle points of the sides (starting with the middle point of $A_1A_2$) by $S_1, S_2,\dots,S_{2n+1}$. If we carry out the point reflections across the middle points one after another, then clearly the point $A_1$ is a fixed point of the resulting point reflection, and thus it is its centre. To find it, it is enough to apply the composed point reflection to an arbitrary point X of the plane: the point $A_1$ then lies in the middle of the line segment XX′, where X′ is the image of X under that point reflection. The remaining vertices $A_2,\dots,A_{2n+1}$ can be obtained by successively reflecting the point $A_1$ across the points $S_1,\dots,S_{2n+1}$. □
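The computation in 1.81 (and the construction in 1.82) can be traced in a few lines of code; a sketch of ours:

```python
def reflect(S, X):
    """Point reflection of X across S: X |-> 2S - X."""
    return (2 * S[0] - X[0], 2 * S[1] - X[1])

S, T, U = (0.0, 0.0), (2.0, 1.0), (3.0, 5.0)
X = (-1.0, 4.0)

composed = reflect(U, reflect(T, reflect(S, X)))     # three reflections
centre = (U[0] - T[0] + S[0], U[1] - T[1] + S[1])    # the point S - T + U
print(composed, reflect(centre, X))                  # the same point twice
```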
A mapping $f: A\to B$, written as the relation $f\subset A\times B$, $f = \{(a, f(a));\ a\in A\}$, is also known as the graph of the function f.

For mappings $f: A\to B$ and $g: B\to C$, their composition $g\circ f: A\to C$ is defined by $(g\circ f)(a) = g(f(a))$. It can also be expressed in the notation used for relations:
$$f\subset A\times B,\qquad f = \{(a, f(a));\ a\in A\},$$
$$g\subset B\times C,\qquad g = \{(b, g(b));\ b\in B\},$$
$$g\circ f\subset A\times C,\qquad g\circ f = \{(a, g(f(a)));\ a\in A\}.$$
The composition of relations is defined in a very similar way; we just add existential quantifiers to the statements, since we have to consider all possible "preimages" and all possible "images". Let $R\subset A\times B$, $S\subset B\times C$ be relations. Then
$$S\circ R\subset A\times C,\qquad S\circ R = \{(a,c);\ \exists\,b\in B,\ (a,b)\in R,\ (b,c)\in S\}.$$
A special case of a relation is the identity relation $\mathrm{id}_A = \{(a,a);\ a\in A\}\subset A\times A$.

For the corresponding visibility exercise, one computes the determinants of the matrices with rows such as B − P and C − P, where P is the viewpoint. Since the last determinant is zero, we see that the points [0, 1], [5, 6] and [7, 8] lie on a line; the side AB is thus not visible. The side BC is also not visible, unlike the side AC, for which the determinant is negative. □

Consider the set $2^A$ of all subsets of a set A; the elements of the set $2^A$ correspond to mappings from A to {0, 1} which "say" whether a given element is in the given subset. We have the relation ⊂ on the set $2^A$ given by the property of "being a subset": thus X ⊂ Z if X is a subset of Z. Clearly all three conditions from the definition of an ordering are satisfied: if X ⊂ Y and Y ⊂ X, then X and Y must necessarily be identical; if X ⊂ Y ⊂ Z, then also X ⊂ Z; and reflexivity is clear from the definition.

We say that an ordering ≤ on a set A is complete if every two elements $a,b\in A$ are comparable, which means that either a ≤ b or b ≤ a. Note that not all pairs (X, Y) of subsets of A are comparable in this sense. More precisely, if A contains more than one element, there exist subsets X and Y with neither X ⊂ Y nor Y ⊂ X.

Let us recall the recurrent definition of the natural numbers $\mathbb{N} = \{0,1,2,3,\dots\}$, where
$$0 = \emptyset,\qquad n+1 = \{0, 1, 2, \dots, n\}.$$
On the set N we define the relation ≤ as follows: m ≤ n if either m ∈ n or m = n. Clearly this is a complete ordering. For instance 2 ≤ 4, since
$$2 = \{\emptyset, \{\emptyset\}\}\ \in\ \big\{\emptyset,\ \{\emptyset\},\ \{\emptyset,\{\emptyset\}\},\ \{\emptyset,\{\emptyset\},\{\emptyset,\{\emptyset\}\}\}\big\} = 4.$$
In other words, the recurrent definition itself gives the relation n ≤ n + 1, and transitively then n ≤ k for all k obtained later in this manner.

1.39. Partitions given by equivalences. Every equivalence R on a set A determines a partition of the set A, consisting of subsets of mutually equivalent elements, the so-called equivalence classes. For any $a\in A$ we consider the class (set) of elements equivalent to a, that is,
$$R_a = \{b\in A;\ (a,b)\in R\}.$$
Often we write simply $[a]_R$, or just [a], instead of $R_a$, if it is clear from the context which equivalence we have in mind. Clearly $R_a = R_b$ if and only if $(a,b)\in R$, and every such equivalence class is therefore represented by any of its elements, a so-called representative. Furthermore, $R_a\cap R_b \ne \emptyset$ if and only if $R_a = R_b$; that is, the equivalence classes are pairwise disjoint.
Finally, $A = \bigcup_{a\in A} R_a$; that is, the whole set A is partitioned into the equivalence classes. Another way of looking at things is that [a] is the element a "up to equivalence".

1.40. Construction of the integers and the rational numbers. With natural numbers we can do addition, and we know that adding zero does not change a number. We can also define subtraction, but the result does not always belong to the set N. The basic idea of the construction of the integers from the natural numbers is to add these missing results to N. This can be done as follows: instead of the result of a subtraction, we work with ordered pairs of numbers which represent the result; it remains just to define which such pairs are equivalent (with respect to the result of the subtraction). The necessary relation is
$$(a,b)\sim(a',b')\ \Longleftrightarrow\ a-b = a'-b'\ \Longleftrightarrow\ a+b' = a'+b.$$
Note that the expression in the middle cannot always be realised within the natural numbers, but the expression on the right can. We can easily check that this really is an equivalence, and we denote its classes as the integers Z. We define addition and subtraction on Z using representatives; for instance
$$[(a,b)] + [(c,d)] = [(a+c,\ b+d)],$$
which is clearly independent of the choice of representatives. It is always possible to choose representatives (a, 0) for natural numbers and representatives (0, a) for negative numbers; this is probably the simplest and clearest choice.

This simple example shows how important it is to be able to see the equivalence classes as whole objects and to concentrate on the properties of these objects, not on the formal description of their construction. However, the description is important in order to be able to check that such objects exist at all.

1.90. Determine which sides of the quadrilateral with vertices
$$A = [95, 99],\quad B = [130, 106],\quad C = [40, 60],\quad D = [130, 120]$$
are visible from the point [2, 0].

Solution. First we need to determine the sides of the quadrilateral (the "correct" order of the vertices): ACBD. After computing the corresponding determinants as in the previous exercises, we see that only the side CB is visible. □

F. Mappings and relations

1.91. Determine whether the following relations on the set M are equivalence relations:
i) $M = \{f:\mathbb{R}\to\mathbb{R}\}$, where $f\sim g$ if f(0) = g(0);
ii) $M = \{f:\mathbb{R}\to\mathbb{R}\}$, where $f\sim g$ if f(0) = g(1);
iii) M is the set of lines in the plane, where two lines are in relation if they do not intersect;
iv) M is the set of lines in the plane, where two lines are in relation if they are parallel;
v) M = N, where $m\sim n$ if S(m) + S(n) = 20, with S(n) denoting the digit sum of the number n;
vi) M = N, where $m\sim n$ if C(m) = C(n), where C(n) = S(n) if the digit sum S(n) is less than 10, and otherwise we define C(n) = C(S(n)) (thus it always holds that C(n) < 10).

Solution.
i) Yes. Let us check the three properties of an equivalence: reflexivity, for any real function f we have f(0) = f(0); symmetry, if f(0) = g(0), then also g(0) = f(0); transitivity, if f(0) = g(0) and g(0) = h(0), then also f(0) = h(0).
ii) No. The relation is not reflexive: for instance, for the function sin we have sin 0 ≠ sin 1. It is not even transitive.
iii) No. The relation is not reflexive (every line intersects itself) and not transitive.
iv) Yes. The equivalence classes then correspond to the unoriented directions in the plane.
v) No. The relation is not reflexive: S(1) + S(1) = 2 ≠ 20.
vi) Yes. □
On the integers we have all the properties of scalars (KG1)-(KG4) and (O1)-(O4), see the paragraphs 1.1 and 1.3. For multiplication the neutral element is one, but for all numbers a other than zero and one we are not able to find a number $a^{-1}$ with the property $a\cdot a^{-1} = 1$; that means that for multiplication we are missing the inverse elements. Let us also note that the property (ID) of an integral domain holds, see 1.3: if the product of two numbers equals zero, at least one of them has to be zero.

Thanks to this last property we can construct the rational numbers Q by adding all the missing multiplicative inverses, by a method analogous to the construction of Z from N. On the set of all ordered pairs (p, q), q ≠ 0, of integers we define a relation ∼ so that it models our expectation of the fractions p/q:
$$(p,q)\sim(p',q')\ \Longleftrightarrow\ p/q = p'/q'\ \Longleftrightarrow\ p\,q' = p'\,q.$$
Again, we are not able to formulate the expected behaviour in the middle equation when we work in Z, but the equation on the right makes sense there. Clearly this relation is a well-defined equivalence (think it through!), and the rational numbers are its equivalence classes. If we formally write p/q instead of the pair (p, q), we can define the operations of multiplication and addition by the well-known formulas.

1.41. Remainder classes. Another nice and simple example are the so-called remainder classes of integers. For a fixed natural number k we define the equivalence $\sim_k$ so that two numbers $a,b\in\mathbb{Z}$ are equivalent if they have the same remainder when divided by k. The resulting set of equivalence classes is denoted by $\mathbb{Z}_k$. This procedure is simplest for k = 2. This yields $\mathbb{Z}_2 = \{0, 1\}$, where zero stands for the even numbers and one for the odd numbers. Again it is easy to see that, using representatives, we can correctly define addition and multiplication on each $\mathbb{Z}_k$.

Theorem. The remainder classes $\mathbb{Z}_k$ form a commutative field of scalars (that is, the property (P) from the paragraph 1.3 is also satisfied) if and only if k is a prime. If k is not a prime, then $\mathbb{Z}_k$ contains a divisor of zero, thus it is not an integral domain.

Proof. The second part is easy to see: if x·y = k for natural numbers x, y, then clearly the result of multiplying the corresponding classes [x]·[y] is zero. On the other hand, if x and k are relatively prime, then according to the so-called Bezout equality, which we derive later (see ??), there are natural numbers a and b satisfying a·x + b·k = 1, which for the corresponding equivalence classes gives
$$[a]\cdot[x]+[0] = [a]\cdot[x] = [1],$$
and thus [a] is the inverse element of [x]. □
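For a concrete prime, the inverse guaranteed by the Bezout argument can be computed with the extended Euclidean algorithm; a small sketch of ours:

```python
def inverse_mod(x, k):
    """Return [x]^{-1} in Z_k via the extended Euclidean algorithm,
    assuming gcd(x, k) = 1."""
    a, b, old_r, r = 1, 0, x, k      # invariant: old_r = a*x (mod k)
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        a, b = b, a - q * b
    if old_r != 1:
        raise ValueError("x is not invertible modulo k")
    return a % k

print(inverse_mod(3, 7))           # 5, since 3*5 = 15 = 1 (mod 7)
print(3 * inverse_mod(3, 7) % 7)   # 1
```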
1.92. Consider the set {3, 4, 5, 6, 7}. Write explicitly the following relations on this set:
i) a divides b;
ii) a divides b or b divides a;
iii) a and b have a common divisor greater than one.

1.93. Let the relation R be defined on R² so that $((a,b),(c,d))\in R$, for arbitrary $a,b,c,d\in\mathbb{R}$, if and only if b = d. Determine whether it is an equivalence relation. If it indeed is, describe geometrically the partition it determines.

Solution. From $((a,b),(a,b))\in R$ for all $a,b\in\mathbb{R}$ it follows that the relation is reflexive. Equally easy to see is that the relation is symmetric, since in the equality of the second coordinates we can interchange the left and the right side. If $((a,b),(c,d))\in R$ and $((c,d),(e,f))\in R$, that is, b = d and d = f, we easily get that the transitivity condition $((a,b),(e,f))\in R$, that is b = f, holds. The relation R is thus an equivalence relation, in which two points of the plane are related if and only if they have the same second coordinate (the line they determine is perpendicular to the y-axis). The corresponding partition divides the plane into the lines parallel with the x-axis. □

1.94. Determine how many distinct binary relations can be defined between the set X and the set of all subsets of X, if the set X has exactly 3 elements.

Solution. Let us first realise that the set of all subsets of X has exactly $2^3 = 8$ elements, and thus its cartesian product with X has 8·3 = 24 elements. Binary relations correspond to the subsets of this cartesian product, and of those there are exactly $2^{24}$. □

1.95. Give the domain D and the image I of the relation
$$R = \{(a,v),\ (b,x),\ (c,x),\ (c,u),\ (d,v),\ (f,y)\}$$
between the sets A = {a, b, c, d, e, f} and B = {x, y, u, v, w}. Is the relation R a mapping?

Solution. Directly from the definitions of the domain and the image of a relation we obtain
$$D = \{a,b,c,d,f\}\subset A,\qquad I = \{x,y,u,v\}\subset B.$$
It is not a mapping, since $(c,x),(c,u)\in R$, that is, the element $c\in D$ has two images. □

1.96. Determine, for each of the following relations on the set {a, b, c, d}, whether it is an ordering and whether it is complete:
$$R_a = \{(a,a),(b,b),(c,c),(d,d),(b,a),(b,c),(b,d)\},$$
$$R_b = \{(a,a),(b,b),(c,c),(d,d),(d,a),(a,d)\},$$
$$R_c = \{(a,a),(b,b),(c,c),(d,d),(a,b),(b,c),(b,d)\},$$
$$R_d = \{(a,a),(b,b),(c,c),(a,b),(a,c),(a,d),(b,c),(b,d),(c,d)\},$$
$$R_e = \{(a,a),(b,b),(c,c),(d,d),(a,b),(a,c),(a,d),(b,c),(b,d),(c,d)\}.$$
Solution. $R_a$ is an ordering which is not complete (for instance, neither $(a,c)\in R_a$ nor $(c,a)\in R_a$). The relation $R_b$ is not anti-symmetric (both $(a,d)\in R_b$ and $(d,a)\in R_b$), therefore it is not an ordering (it is an equivalence). The relation $R_c$ is not an ordering, since it is not transitive (for instance $(a,b),(b,c)\in R_c$, but $(a,c)\notin R_c$); the relation $R_d$ is not an ordering, since it is not reflexive ($(d,d)\notin R_d$). The relation $R_e$ is a complete ordering (if we interpret $(a,b)\in R$ as a ≤ b, then a ≤ b ≤ c ≤ d). □

1.97. Determine whether the mapping f is injective (one-to-one) or surjective (onto), if
(a) $f:\mathbb{Z}\times\mathbb{Z}\to\mathbb{Z}$, $f((x,y)) = x+y-10x^2$;
(b) $f:\mathbb{N}\to\mathbb{N}\times\mathbb{N}$, $f(x) = (2x,\ x^2+10)$.

Solution. In the case (a) the given mapping is surjective (it is enough to set x = 0) but not injective (it is enough to compare the values at (x, y) = (0, −9) and (x, y) = (1, 0)). In the case (b) it is an injective mapping (both its coordinates, that is, the functions y = 2x and y = x² + 10, are clearly increasing on N) which is not surjective (for instance, the pair (1, 1) has no preimage). □

1.98. Determine the number of mappings from the set {1, 2} to the set {a, b, c}. How many of them are surjective and how many injective?

Solution. To the element 1 we can assign any of the elements a, b, c; similarly, for the element 2 we also have three possibilities. Thus, according to the (combinatorial) rule of product, there are exactly $3^2 = 9$ mappings from the set {1, 2} to the set {a, b, c}. None of them can be surjective, since the set {a, b, c} has more elements than the set {1, 2}. For an arbitrary image of the element 1 (three possibilities), we get an injective mapping if and only if the element 2 is mapped to a different element (two possibilities). Thus we see that the number of injective mappings from the set {1, 2} to the set {a, b, c} is 6. □
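The counts from 1.98 (and from 1.99 and 1.100 below) are small enough to enumerate directly; a brute-force sketch of ours:

```python
from itertools import product

def count_maps(dom, cod):
    """Count all / injective / surjective mappings dom -> cod."""
    total = inj = sur = 0
    for images in product(cod, repeat=len(dom)):  # one mapping per tuple
        total += 1
        inj += len(set(images)) == len(dom)
        sur += set(images) == set(cod)
    return total, inj, sur

print(count_maps(range(2), "abc"))     # (9, 6, 0)
print(count_maps(range(3), range(4)))  # (64, 24, 0)
print(count_maps(range(4), range(3)))  # (81, 0, 36)
```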
1.99. Determine the number of injective mappings of the set $\{1, 2, 3\}$ to the set $\{1, 2, 3, 4\}$.

Solution. Any injective mapping between the given sets is given by choosing an (ordered) triple from the set $\{1, 2, 3, 4\}$ (the elements in the chosen triple correspond in order to the images of the numbers 1, 2, 3), and vice versa every injective mapping gives us such a triple. Thus the number of injective mappings equals the number of ordered triples of four elements, that is, $v(3, 4) = 4 \cdot 3 \cdot 2 = 24$. □

1.100. Determine the number of surjective mappings of the set $\{1, 2, 3, 4\}$ to the set $\{1, 2, 3\}$.

Solution. We can determine the number by subtracting the number of non-surjective mappings from the number of all mappings. The number of all mappings is $V(3, 4) = 3^4$. The number of mappings whose image has exactly one element is three, and the number of mappings whose image has exactly two elements is $\binom{3}{2}(2^4 - 2)$ (there are $\binom{3}{2}$ ways to choose the two-element image, and for a fixed two-element image there are $2^4 - 2$ ways to map four elements onto it). Thus the number of surjective mappings is $3^4 - \binom{3}{2}(2^4 - 2) - 3 = 36$. □

1.101. The Hasse diagram of an ordering. The Hasse diagram of a given ordering $\prec$ over an $n$-element set $M$ is a diagram with $n$ vertices (every vertex corresponds to exactly one element of the set), in which two vertices (elements) $a$, $b$ are joined with a (more or less vertical) line (such that $a$ is "lower" and $b$ is "higher") if and only if $b$ covers $a$, that is, $a \prec b$ and there is no $c \in M$ such that $a \prec c$ and $c \prec b$.

From the total number of 16 relations over a two-element set we have thus chosen $\{(1, 2), (2, 1)\}$, $\{(1, 2), (2, 1), (1, 1)\}$, $\{(1, 2), (2, 1), (2, 2)\}$. It is clear that each of these 3 relations is symmetric but neither reflexive nor transitive. □

1.105. Determine the number of equivalence relations over the set $\{1, 2, 3, 4\}$.

Solution. Equivalences can be enumerated by the sizes of their equivalence classes. For the sizes of equivalence classes over a four-element set we have these possibilities:

sizes of the classes / number of equivalences of this type
1, 1, 1, 1 / 1
2, 1, 1 / $\binom{4}{2} = 6$
2, 2 / $\frac{1}{2}\binom{4}{2} = 3$
3, 1 / $\binom{4}{3} = 4$
4 / 1

In total we have 15 different equivalences. □

Remark. In general, the number of partitions of a given $n$-element set is given by the Bell number $B_n$, for which the recurrence formula $B_{n+1} = \sum_{k=0}^{n} \binom{n}{k} B_k$ can be derived.

1.106. How many relations are there over an $n$-element set?

Solution. A relation is an arbitrary subset of the cartesian product of the set with itself. This cartesian product has $n^2$ elements, thus the number of all relations over an $n$-element set is $2^{n^2}$. □

1.107. How many reflexive relations are there over an $n$-element set?

Solution. A relation over the set $M$ is reflexive if and only if it has the diagonal relation $\Delta_M = \{(a, a)$, where $a \in M\}$ as a subset. For each of the remaining $n^2 - n$ ordered tuples in the cartesian product $M \times M$ we have an independent choice whether the tuple belongs to the relation or not. In total we have $2^{n^2 - n}$ different reflexive relations over an $n$-element set. □
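For a four-element set even the whole search space of $2^{16}$ relations can be enumerated, so the count 15 above admits a direct check. A small Python sketch, assuming nothing beyond the standard library:

```python
from itertools import product

X = (1, 2, 3, 4)
pairs = [(a, b) for a in X for b in X]

def is_equivalence(r):
    refl = all((a, a) in r for a in X)
    sym = all((b, a) in r for (a, b) in r)
    trans = all((a, d) in r for (a, b) in r for (c, d) in r if b == c)
    return refl and sym and trans

relations = (frozenset(p for p, bit in zip(pairs, bits) if bit)
             for bits in product((0, 1), repeat=len(pairs)))
print(sum(1 for r in relations if is_equivalence(r)))   # 15, the Bell number B_4
```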
1.108. How many symmetric relations are there over an $n$-element set?

Solution. A relation $r$ over the set $M$ is symmetric if and only if its intersection with each set $\{(a, b), (b, a)\}$, where $a \neq b$, $a, b \in M$, is either the whole two-element set or is empty. There are $\binom{n}{2}$ such two-element sets, and if we also declare what the intersection of $r$ and the diagonal relation $\Delta_M = \{(a, a)$, where $a \in M\}$ should be, then $r$ is completely determined. In total we are to make $\binom{n}{2} + n$ independent choices between two alternatives: each set of the type $\{(a, b), (b, a)\}$, where $a, b \in M$, $a \neq b$, is either a subset of $r$ or is disjoint from $r$, and every tuple $(a, a)$, $a \in M$, either is in $r$ or not. In total we have $2^{\binom{n}{2} + n}$ symmetric relations over an $n$-element set. □

1.109. How many antisymmetric relations over an $n$-element set are there?

Solution. A relation $r$ over the set $M$ is antisymmetric if and only if the intersection of $r$ with each set $\{(a, b), (b, a)\}$, $a \neq b$, $a, b \in M$, is either empty or one-element (which means that it is either $\{(a, b)\}$ or $\{(b, a)\}$). The intersection of $r$ with the diagonal relation is arbitrary. By declaring what these intersections are, the relation $r$ is completely determined. In total we thus have $3^{\binom{n}{2}} \cdot 2^n$ antisymmetric relations over an $n$-element set. □

In [?] we have defined remainder classes (also called residue classes) and we have shown that $\mathbb{Z}_p$ is a field for any prime $p$. On the other hand, events we are not used to when dealing with real or complex numbers do occur in $\mathbb{Z}_p$.

1.110. Non-zero polynomial with zero values. Find a non-zero polynomial of one indeterminate with coefficients in $\mathbb{Z}_7$, that is, an expression of the form $a_n x^n + \dots + a_1 x + a_0$, $a_i \in \mathbb{Z}_7$, $a_n \neq 0$, such that it attains only zero values over the set $\mathbb{Z}_7$ (that is, if we set $x$ to be equal to any of the elements of $\mathbb{Z}_7$ and evaluate, we always obtain zero).

Solution. For the construction of such a polynomial we use Fermat's little theorem, which says that for any prime number $p$ and number $a$ which is not divisible by $p$ we have $a^{p-1} \equiv 1 \pmod{p}$. Thus we can take for instance the polynomial $x^7 - x$ (the polynomial $x^6 - 1$ alone would not do, since it is not zero for $x = 0$). □

G. Additional exercises for the whole chapter

1.111. Let $t$ and $m$ be positive integers. Show that the number $\sqrt[m]{t}$ is either an integer or is not rational.

Solution. Show that if the number is not an integer, then it cannot be rational. If $\sqrt[m]{t}$ is not an integer, then there exist a prime $r$ and a non-negative integer $s$ such that $r^s$ divides $t$, $r^{s+1}$ does not divide $t$ (this we write as $\mathrm{ord}_r t = s$) and $m$ does not divide $s$. Assume that $\sqrt[m]{t} = \frac{p}{q}$, $p, q \in \mathbb{Z}$, in other words $t \cdot q^m = p^m$. Consider $\mathrm{ord}_r L$ and $\mathrm{ord}_r R$ and their divisibility by the number $m$ ($L$ denotes the left-hand side of the equation, ...). □

1.112. Determine $\left| \frac{(2 + 3i)(1 + i\sqrt{3})}{1 - i\sqrt{3}} \right|$.

Solution. Since the absolute value of the product (quotient) of any two complex numbers is the product (quotient) of their absolute values, and every complex number has the same absolute value as its complex conjugate, we have

$\left| \frac{(2 + 3i)(1 + i\sqrt{3})}{1 - i\sqrt{3}} \right| = \frac{|2 + 3i|\,|1 + i\sqrt{3}|}{|1 - i\sqrt{3}|} = |2 + 3i| = \sqrt{2^2 + 3^2} = \sqrt{13}.$ □

1.113. Simplify the expression $(5\sqrt{3} + 5i)^{12}$.

Solution. Taking powers one by one or doing an expansion using the binomial theorem would in this case be too time-consuming. Let us rather write $5\sqrt{3} + 5i = 10\left(\cos\frac{\pi}{6} + i\sin\frac{\pi}{6}\right)$, and using de Moivre's theorem we easily obtain

$(5\sqrt{3} + 5i)^{12} = 10^{12}\left(\cos\tfrac{12\pi}{6} + i\sin\tfrac{12\pi}{6}\right) = 10^{12}.$ □

1.114. Calculate $z_1 + z_2$, $z_1 \cdot z_2$, $\bar{z}_1$, $|z_2|$, $\frac{z_1}{z_2}$ for

a) $z_1 = 1 - 2i$, $z_2 = 4i - 3$,
b) $z_1 = 2$, $z_2 = i$. ○

1.115. Determine the distance $d$ of the numbers $z$, $\bar{z}$ in the complex plane for $z = \frac{3\sqrt{3}}{2} - \frac{3}{2}i$.

Solution. It is not difficult to realize that complex conjugates are symmetric in the complex plane with respect to the $x$-axis, and that the distance of a complex number from the $x$-axis equals the absolute value of its imaginary part. That gives $d = 3$. □
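Fermat's little theorem used in 1.110 is also easy to test numerically. A two-line Python check (for the prime 7; the variable names are ours):

```python
p = 7
print(all((x**p - x) % p == 0 for x in range(p)))   # True: x^7 - x vanishes on Z_7
print([(x**(p - 1) - 1) % p for x in range(p)])     # [6, 0, 0, 0, 0, 0, 0]: x^6 - 1 fails at x = 0
```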
1.116. At a meeting there were six men. If all of them shook hands with each other, how many handshakes happened?

Solution. The number of handshakes equals the number of ways of choosing an unordered tuple among 6 elements, thus the result is $c(6, 2) = \binom{6}{2} = 15$. □

1.117. Determine in how many ways a 4-member committee can be chosen among 15 deputies, if it is not allowed for two certain deputies to work together.

Solution. The result is $\binom{15}{4} - \binom{13}{2} = 1287$. It can be obtained by first calculating the number of all 4-member committees and then subtracting the number of those committees where the given two deputies are chosen together (in that case, we only choose two more members among the remaining 13 deputies). □

1.118. In how many ways can we divide 8 women and 4 men into two six-member groups (which are considered unordered) in such a way that there is at least one man in each group?

Solution. If we forget the last condition, a division of 12 people into two six-member groups can be done by just choosing 6 people for the first group, which can be done in $\binom{12}{6}$ ways. The groups are not distinguishable (we do not know which one is the first one), thus the total number is rather $\frac{1}{2}\binom{12}{6}$. In $\binom{8}{2}$ cases all men are in one group (we choose two women among eight to complete that group). The correct answer is thus $\frac{1}{2}\binom{12}{6} - \binom{8}{2} = 434$. □

1.119. What is the number of 4-digit numbers composed of the digits 1, 3, 5, 6, 7 and 9, where no digit occurs more than once?

Solution. We have 6 distinct digits at our disposal. We ask: how many distinct ordered 4-tuples can be chosen from them? The result is $v(6, 4) = 6 \cdot 5 \cdot 4 \cdot 3 = 360$. □

1.120. The Greek alphabet consists of 24 letters. How many words of exactly five letters can be composed in it? (Disregarding whether the words have some actual meaning or not.)

Solution. For each of the five positions in the word we have 24 possibilities, since the letters can repeat. The result is then $V(24, 5) = 24^5$. □

1.121. In a long-distance race, where the racers start one after another in given time intervals, there were $k$ racers, among them 3 friends. Determine the number of starting schedules in which no two of the 3 friends start next to each other. For simplicity assume $k > 5$.

Solution. The remaining $k - 3$ racers can be ordered in $(k - 3)!$ ways. For the three friends there are then $k - 2$ places (the start, the end and the $k - 4$ spaces), where we can put them in $v(k - 2, 3)$ ways. Using the rule of (combinatorial) product, we obtain

$(k - 3)! \cdot (k - 2)(k - 3)(k - 4) = (k - 2)!\,(k - 3)(k - 4).$ □

1.122. There are 32 participants in a tournament. The organisers have stated that the participants must divide arbitrarily into four groups, such that the first one has size 10, the second and the third 8, and the fourth 6. In how many ways can this be done?

Solution. We can imagine that from the 32 participants we create a row, where the first 10 form the first group, the next 8 the second group and so on. There are 32! orderings of all participants. Note that the division into groups is not influenced if we change the order of the people in the same group. Therefore the number of distinct divisions equals $P(10, 8, 8, 6) = \frac{32!}{10!\,8!\,8!\,6!}$. □
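Several of these answers can be verified with the binomial coefficients from Python's math module. A short sketch checking 1.117, 1.118 and 1.122 under the formulas derived above:

```python
from math import comb, factorial

print(comb(15, 4) - comb(13, 2))                    # 1287, exercise 1.117
print(comb(12, 6) // 2 - comb(8, 2))                # 434, exercise 1.118
print(factorial(32) // (factorial(10) * factorial(8)**2 * factorial(6)))  # P(10,8,8,6), exercise 1.122
```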
1.123. We need to accommodate 9 people in one four-bed room, one three-bed room and one two-bed room. In how many ways can this be done?

Solution. If we assign to the people in the four-bed room the number 1, in the three-bed room the number 2 and in the two-bed room the number 3, then we create permutations with repetitions of the elements 1, 2, 3, where 1 occurs four times, 2 three times and 3 two times. The number of such permutations is $P(4, 3, 2) = \frac{9!}{4!\,3!\,2!} = 1260$. □

1.124. Determine the number of ways to divide 33 distinct coins among three people A, B and C such that A and B together have twice as many coins as C.

Solution. From the problem statement it is clear that C must receive 11 coins. That can be done in $\binom{33}{11}$ ways. Each of the remaining 22 coins can be given either to A or to B, which gives $2^{22}$ ways. Using the rule of product we obtain the result $\binom{33}{11} \cdot 2^{22}$. □

1.125. In how many ways can we divide 40 identical balls among 4 boys?

Solution. Let us add three matches to the 40 balls. If we order the balls and matches in a row, the matches divide the balls into 4 sections. We order the boys at random, give the first boy all the balls from the first section, the second boy all the balls from the second section and so on. It is now evident that the result is $\binom{43}{3} = 12\,341$. □

1.126. According to quality, we divide food products into groups I, II, III, IV. Determine the number of all possible divisions of 9 food products into these groups, where only the numbers of products in the individual groups are distinguished.

Solution. If we write down the considered groups of the elements I, II, III, IV directly, we create combinations with repetition of the ninth order from four elements. The number of such combinations is $\binom{12}{3} = 220$. □

1.127. In how many ways could the table of the first soccer league have ended, if we know only that at least one of the teams of Ostrava and Olomouc is in the table after the team of Brno (there are 16 teams in the league)?

Solution. Let us first determine the three places where the teams of Brno, Olomouc and Ostrava ended. Those can be chosen in $c(3, 16) = \binom{16}{3}$ ways. Of the 6 possible orderings of these three teams on the given three places, only four satisfy the given condition. After that, we can independently choose the order of the remaining 13 teams at the remaining places of the table. Using the rule of product, we have the solution

$\binom{16}{3} \cdot 4 \cdot 13! = 13\,948\,526\,592\,000.$ □

1.128. How many distinct orderings (in a row) are there for a picture of a volleyball team (6 players), if

i) Gouald and Bamba want to stand next to each other,
ii) Gouald and Bamba want to stand next to each other and in the middle,
iii) Gouald and Kamil do not want to stand next to each other?

Solution.

i) In this case Gouald and Bamba can be considered a single person; we then just multiply by two to determine their relative order. Thus we have $2 \cdot 5! = 240$ orderings.
ii) Here it is similar, except that the position of Gouald and Bamba is fixed. We have $2 \cdot 4! = 48$ orderings.
iii) Probably the simplest approach is to subtract the cases where Kamil and Gouald stand next to each other (see (i)). We get $6! - 2 \cdot 5! = 720 - 240 = 480$. □

1.129. Coin flipping. We flip a coin six times.

i) How many distinct sequences of heads and tails are there?
ii) How many sequences with exactly four heads are there?
iii) How many sequences with at least two heads are there? ○

1.130. How many anagrams of the word BAZILIKA are there such that there are no two vowels next to each other and no two consonants next to each other?

Solution. Since there are four vowels and four consonants in the word, each such anagram is either of the type BABABABA or ABABABAB (with A standing for a vowel and B for a consonant). On the given four places we can permute the vowels in $P(2, 2) = \frac{4!}{2!\,2!} = 6$ ways and independently of that also the consonants ($4!$ ways). Using the rule of product, the result is then $2 \cdot 4! \cdot 6 = 288$. □
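The anagram count from 1.130 is small enough for a brute-force cross-check: generate all permutations of the letters, keep those whose vowels and consonants alternate, and let a set remove the duplicates caused by repeated letters. A Python sketch (helper names ours):

```python
from itertools import permutations

WORD = "BAZILIKA"
VOWELS = set("AI")

def alternates(w):
    kinds = [c in VOWELS for c in w]                 # True for a vowel, False for a consonant
    return all(kinds[i] != kinds[i + 1] for i in range(len(w) - 1))

print(len({p for p in permutations(WORD) if alternates(p)}))   # 288
```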
1.131. In how many ways can we divide 9 girls and 6 boys into two groups such that each group contains at least two boys?

Solution. We divide the boys and the girls independently: $2^9(2^5 - 7) = 12800$. Here $2^9$ counts the splits of the girls, and $2^5 - 7 = \frac{1}{2}(2^6 - 14)$ counts the splits of the boys with at least two boys in each group, the halving compensating for the fact that the two groups are unordered. □

1.132. A material is composed of five layers, each of them having fibres in one of six possible directions. How many such materials are there? How many of them have no two neighbouring layers with fibres in the same direction?

Solution. $6^5$ and $6 \cdot 5^4$. □

1.133. For any fixed $n \in \mathbb{N}$ determine the number of all solutions to the equation

$x_1 + x_2 + \dots + x_k = n$

in the set of positive integers.

Solution. If we look for a solution in the domain of positive integers, then we note that the natural numbers $x_1, \dots, x_k$ are a solution to the equation if and only if the non-negative integers $y_i = x_i - 1$, $i = 1, \dots, k$, are a solution to the equation $y_1 + y_2 + \dots + y_k = n - k$. Using ||1.30||, there are $\binom{n-1}{k-1}$ of them. □

1.134. There are $n$ forts on a circle ($n > 3$), numbered in a row with the numbers $1, \dots, n$. At one moment of time each of the forts shoots at one of its neighbours (fort 1 neighbours with fort $n$). Denote by $P(n)$ the number of all possible results of the shooting (a result of the shooting is the set of numbers of those forts that were hit, regardless of the number of hits taken). Prove that $P(n)$ and $P(n + 1)$ are relatively prime.

Solution. If we denote the forts that were hit by a black dot and the unhit ones by a white dot, the task is equivalent to determining the number of all possible colourings of $n$ dots on a circle with black and white colour such that no two white dots have "distance" one. For odd $n$ this number is equal to $K(n)$ - the number of colourings with black and white such that no two white dots are adjacent (we reorder the dots so that we start with dot one and proceed increasingly through the odd numbers, and then increasingly through the even ones). For even $n$ this number equals $K(n/2)^2$, the square of the number of colourings of $n/2$ dots on a circle such that no two white dots are adjacent (we colour the dots on even positions and on odd positions independently).

For $K(n)$ we easily derive the recurrence $K(n) = K(n-1) + K(n-2)$. Furthermore, we can easily compute that $K(2) = 3$, $K(3) = 4$, $K(4) = 7$, that is, $K(2) = F(4) - F(0)$, $K(3) = F(5) - F(1)$, $K(4) = F(6) - F(2)$, and using induction we can easily prove that $K(n) = F(n+2) - F(n-2)$, where $F(n)$ denotes the $n$-th member of the Fibonacci sequence ($F(0) = 0$, $F(1) = F(2) = 1$). Since $(K(2), K(3)) = 1$, we have for $n > 3$, similarly as in the Fibonacci sequence,

$(K(n), K(n-1)) = (K(n) - K(n-1), K(n-1)) = (K(n-2), K(n-1)) = \dots = 1.$

Let us now show that for every even $n = 2a$ the number $P(n) = K(a)^2$ is relatively prime with both $P(n+1) = K(2a+1)$ and $P(n-1) = K(2a-1)$. For this the following is enough: for $a > 2$ we have

$(K(a), K(2a+1)) = (K(a), F(2)K(2a) + F(1)K(2a-1)) = (K(a), F(3)K(2a-1) + F(2)K(2a-2)) = \dots$
$= (K(a), F(a+1)K(a+1) + F(a)K(a)) = (K(a), F(a+1)) = (F(a+2) - F(a-2), F(a+1))$
$= (F(a+2) - F(a+1) - F(a-2), F(a+1)) = (F(a) - F(a-2), F(a+1))$
$= (F(a-1), F(a+1)) = (F(a-1), F(a)) = 1,$

$(K(a), K(2a-1)) = (K(a), F(2)K(2a-2) + F(1)K(2a-3)) = (K(a), F(3)K(2a-3) + F(2)K(2a-4)) = \dots$
$= (K(a), F(a)K(a) + F(a-1)K(a-1)) = (K(a), F(a-1)) = (F(a+2) - F(a-2), F(a-1))$
$= (F(a+2) - F(a), F(a-1)) = (F(a+2) - F(a+1), F(a-1)) = (F(a), F(a-1)) = 1.$

This proves the claim. □
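The claim of 1.134 can also be probed experimentally for small $n$ before (or after) reading the proof. The following Python sketch (the function name P is chosen to match the exercise) enumerates all $2^n$ shooting choices and collects the distinct sets of hit forts:

```python
from itertools import product
from math import gcd

def P(n):
    results = set()
    for shots in product((-1, 1), repeat=n):          # each fort shoots left or right
        results.add(frozenset((i + s) % n for i, s in enumerate(shots)))
    return len(results)

for n in range(3, 13):
    print(n, P(n), gcd(P(n), P(n + 1)))               # the gcd is always 1
```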
1.135. How much money do I save in a building savings account in five years, if I deposit 3000 Kč monthly (on the first day of the month), the yearly interest rate is 3 % and once a year I obtain a state donation of 1500 Kč (this donation arrives on the first of May)?

Solution. Let $x_n$ be the amount of money in the account after $n$ years. Then (for $n \ge 2$, and assuming that every month is exactly one twelfth of a year) we obtain the recurrence

$x_{n+1} = 1.03\,x_n + 36000 + 1500 + 0.03 \cdot 3000\left(\tfrac{1}{12} + \tfrac{2}{12} + \dots + \tfrac{12}{12}\right) + 0.03 \cdot \tfrac{2}{3} \cdot 1500 = 1.03\,x_n + 38115,$

where the last two summands are the interest from the deposits made during the year and the interest from the state donation credited in that year. Therefore

$x_n = 38115 \sum_{i=0}^{n-2}(1.03)^i + (1.03)^{n-1}x_1 + 1500,$

while $x_1 = 36000 + 0.03 \cdot 3000\left(\tfrac{1}{12} + \dots + \tfrac{12}{12}\right) = 36585$; in total

$x_5 = 38115 \cdot \frac{(1.03)^4 - 1}{0.03} + (1.03)^4 \cdot 36585 + 1500 \doteq 202136.$ □

1.136. Remark. In reality, interest is computed according to the number of days the money is in the account. You should obtain a real bank statement of a building savings account, determine its interest rates and try to compute the credited interest in a year. Compare the result with the sum that was credited in reality. Compute until the numbers disagree ...

1.137. What is the maximum number of areas the plane can be divided into by $n$ circles?

Solution. For the maximum number $p_n$ of areas we derive the recurrence $p_{n+1} = p_n + 2n$. Note that the $(n+1)$-th circle intersects the $n$ previous circles in at most $2n$ points (and this can really occur). Clearly $p_1 = 2$. Thus for $p_n$ we obtain

$p_n = p_{n-1} + 2(n-1) = p_{n-2} + 2(n-2) + 2(n-1) = \dots = p_1 + \sum_{i=1}^{n-1} 2i = n^2 - n + 2.$ □

1.138. What is the maximum number of areas a 3-dimensional space can be divided into by $n$ planes?

Solution. Let the number be $r_n$. We see that $r_0 = 1$. Similarly to the exercise ||1.34|| we consider $n$ planes in the space, we add another plane, and we ask what is the maximum number of new areas. Again it is exactly the number of areas the new plane intersects. How many can that be? The number of areas intersected by the $(n+1)$-th plane equals the number of areas the new $(n+1)$-th plane is divided into by the lines of its intersections with the $n$ planes already situated in the space. However, there are at most $\frac{1}{2}(n^2 + n + 2)$ of those (according to the exercise in the plane), thus we obtain the recurrence

$r_{n+1} = r_n + \frac{n^2 + n + 2}{2}.$

This equation can again be solved directly:

$r_n = r_{n-1} + \frac{(n-1)^2 + (n-1) + 2}{2} = r_{n-2} + \frac{(n-2)^2 + (n-2) + 2}{2} + \frac{(n-1)^2 + (n-1) + 2}{2} = \dots$
$= r_0 + \sum_{i=0}^{n-1} \frac{i^2 + i + 2}{2} = 1 + n + \frac{1}{2}\left(\sum_{i=1}^{n-1} i^2 + \sum_{i=1}^{n-1} i\right) = \frac{n^3 + 5n + 6}{6},$

where we have used the known relation

$\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6},$

which can be easily proved by mathematical induction. □

1.139. What is the maximum number of areas a 3-dimensional space can be divided into by $n$ balls? ○

1.140. What is the number of areas a 3-dimensional space is divided into by $n$ mutually distinct planes which all contain a given point?

Solution. For the number $x_n$ of areas we derive the recurrence $x_n = x_{n-1} + 2(n-1)$, furthermore $x_1 = 2$; that is, $x_n = n(n-1) + 2$. □
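The recurrence from 1.135 is a one-line loop in code. A minimal Python sketch, under the same simplifying assumptions as the exercise (months of equal length; the constants 36585 and 38115 are taken from the computation above):

```python
x = 36585.0                  # x_1: twelve deposits of 3000 plus their interest
for _ in range(4):           # years two to five
    x = 1.03 * x + 38115
print(round(x + 1500))       # final state donation added; about 202136
```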
1.141. From a deck of 52 cards we randomly draw 16 cards. Express the probability that we choose exactly 10 red and 6 black cards.

Solution. We first realize that we don't have to care about the order of the cards. (In the resulting fraction we would obtain the ordered choices by multiplying both the numerator and the denominator by 16!.) The number of all possible (unordered) choices of 16 cards from 52 is $\binom{52}{16}$. Similarly, the number of all choices of 10 cards from 26 is equal to $\binom{26}{10}$, and of 6 cards from 26 it is $\binom{26}{6}$. Since we are choosing independently 10 cards from the 26 red and 6 cards from the 26 black, using the (combinatorial) rule of product we obtain the result

$\frac{\binom{26}{10}\binom{26}{6}}{\binom{52}{16}} \doteq 0.118.$ □

1.142. In a box there are 7 white, 6 yellow and 5 blue balls. We draw (without returning) 3 balls randomly. Determine the probability that exactly 2 of them are white.

Solution. In total there are $\binom{7+6+5}{3} = \binom{18}{3}$ ways to choose 3 balls. Choosing exactly two white allows $\binom{7}{2}$ choices of the two white balls and simultaneously 11 choices for the third ball. Using the rule of product, the number of ways to choose exactly two white balls equals $\binom{7}{2} \cdot 11$. Thus the result is

$\frac{\binom{7}{2} \cdot 11}{\binom{18}{3}} \doteq 0.283.$ □

1.143. From a deck with 108 cards ($2 \times 52$ plus 4 jolly jokers) we draw 4 cards randomly without returning. What is the probability that at least one of them is an ace or a joker?

Solution. We can easily determine the probability of the complementary event, that is, that among the 4 drawn cards there is none of the 12 special cards (8 aces and 4 jokers). This probability is given by the ratio of the number of choices of 4 cards from 96 to the number of choices of 4 cards from 108, that is, $\binom{96}{4}/\binom{108}{4}$. The sought probability is thus

$1 - \frac{\binom{96}{4}}{\binom{108}{4}} \doteq 0.380.$ □

1.144. When throwing a die, eleven times in a row the result was 4. Determine the probability that the twelfth roll results in 4.

Solution. The previous results (according to our assumptions) do not influence the results of further rolls. Thus the probability is 1/6. □

1.145. From a deck of 32 cards we randomly draw 6 cards. What is the probability that all of them have the same colour?

Solution. To obtain the result $\frac{4\binom{8}{6}}{\binom{32}{6}} \doteq 1.24 \cdot 10^{-4}$, we just first choose one of the 4 colours and realize that there are $\binom{8}{6}$ ways to choose 6 cards from the 8 cards of this colour. □

1.146. Three players are given 10 cards each and two remain (from a deck of 32 cards, 4 of which are aces). Is it more likely that somebody receives the seven, eight and nine of spades, or that two aces remain?

Solution. Since the probability that some of the players receives the three mentioned cards equals $3 \cdot \binom{10}{3}/\binom{32}{3}$, while the probability that two aces remain equals $\binom{4}{2}/\binom{32}{2}$, it is more likely that some of the players receives the three mentioned cards. Let us note that proving the inequality

$3 \cdot \frac{\binom{10}{3}}{\binom{32}{3}} > \frac{\binom{4}{2}}{\binom{32}{2}}$

is possible by transforming both sides, where by repeated crossing-out (after expanding the binomial coefficients according to their definition) we easily obtain $6 > 1$. □

1.147. We throw $n$ dice. What is the probability that none of the values 1, 3 and 6 appears?

Solution. We can reformulate the exercise: we throw one die $n$ times. The probability that the first roll does not result in 1, 3 or 6 is 1/2. The probability that neither the first nor the second roll does is clearly 1/4 (the result of the first roll does not influence the result of the second roll). Since the events determined by the results of distinct rolls are always (stochastically) independent, the probability is $1/2^n$. □
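The hypergeometric-style fractions above evaluate mechanically with math.comb. A short Python check of 1.141, 1.142 and 1.143:

```python
from math import comb

print(comb(26, 10) * comb(26, 6) / comb(52, 16))   # 1.141: about 0.118
print(comb(7, 2) * 11 / comb(18, 3))               # 1.142: about 0.283
print(1 - comb(96, 4) / comb(108, 4))              # 1.143: about 0.380
```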
1.148. Two friends shoot independently of each other at one target - one shoots, then the second shoots, then the first, and so on. The probability that the first one hits is 0.4; the second friend hits with probability 0.3. Determine the probability $P$ of the event that after one round of shooting there is exactly one hit of the target.

Solution. We determine the result by summing the probabilities of two mutually exclusive events: the first friend hit the target and the second did not; and the second friend hit the target and the first did not. Since the events of hitting are independent (note that independence is preserved when taking complements), each probability is given by the product of the probabilities of the corresponding events. That is,

$P = 0.4 \cdot (1 - 0.3) + (1 - 0.4) \cdot 0.3 = 0.46.$ □

1.149. We flip three coins twelve times. What is the probability that at least one flipping results in three tails?

Solution. If we realize that when repeating the flipping the individual results are independent, and denote for $i \in \{1, \dots, 12\}$ by $A_i$ the event "the $i$-th flipping results in three tails", we are determining

$P\left(\bigcup_{i=1}^{12} A_i\right) = 1 - (1 - P(A_1))(1 - P(A_2)) \cdots (1 - P(A_{12})).$

For every $i \in \{1, \dots, 12\}$ we have $P(A_i) = 1/8$, since each of the three coins shows tails with probability 1/2 independently of the other coins. Now we can write the final probability $1 - \left(\frac{7}{8}\right)^{12} \doteq 0.80$. □

1.150. In a particular state there is a parliament with 200 members. Two major political parties in this state flip a coin during an "election" for every seat in the parliament. Each of the parties has one side of the coin associated with it. What is the probability that each of the parties gains 100 seats? (The coin is "fair".)

Solution. There are $2^{200}$ possible results of the elections (considered as sequences of 200 results of flips). If each party is to obtain 100 seats, then there are exactly 100 tails and 100 heads in the sequence. There are $\binom{200}{100}$ such sequences (the sequence is uniquely determined by choosing the 100 seats out of 200 which will result in, say, tails). The resulting probability is $\binom{200}{100}/2^{200} = \frac{200!}{100!\,100!\,2^{200}}$. □

... with probability greater than 0.9. Clearly $P(A_i) = 1/10$ for any $i \in \mathbb{N}$. Thus it is enough to solve the inequality

$1 - \left(\tfrac{9}{10}\right)^n > 0.9,$

from which we express $n > \frac{\log 0.1}{\log 0.9}$ (with the logarithm taken to any base $a > 1$). Evaluating, we obtain that we must do the drawing at least twenty-two times. □

1.153. Texas hold'em. Let us now solve a couple of simple exercises concerning the popular card game Texas hold'em, whose rules we will not state (if the reader does not know them, she can look them up on the Internet). What is the probability that

i) the starting combination is a tuple of the same symbols?
ii) in my starting tuple of cards there is an ace?
iii) in the end I have one of the six best combinations of cards?
iv) I win, if I hold in my hand an ace and a two (of any colours), on the flop there are an ace and two twos, on the turn there is a three, and all these four cards have distinct colours? (The last card, the river, is not yet turned.)

Solution.

i) The number of distinct symbols is 13, and of each there are always four (one of each colour). Thus the number of tuples with the same symbols is $13\binom{4}{2} = 78$. The number of all possible starting tuples is $\binom{52}{2} = 1326$. The probability of having the same symbols is then $\frac{78}{1326} \doteq 0.06$.
ii) One card is an ace, that is four choices, and the second is arbitrary, that is 51 choices. But we have counted the tuples with two aces twice, and of those there are $\binom{4}{2} = 6$. Thus we obtain $4 \cdot 51 - 6 = 198$ tuples and the probability is $\frac{198}{1326} \doteq 0.15$.
iii) Let us compute the probabilities of the individual best combinations:

ROYAL FLUSH: There are exactly four such combinations - one of each colour. The number of combinations of five cards is $\binom{52}{5} = 2\,598\,960$. The probability is thus equal to $\frac{4}{2\,598\,960} \doteq 1.5 \cdot 10^{-6}$. Very small :)

STRAIGHT FLUSH: A sequence which ends with the highest card in the range 6 to K, that is, eight choices for each colour. We obtain $\frac{32}{2\,598\,960} \doteq 1.2 \cdot 10^{-5}$.

POKER: Four identical symbols - 13 choices (one for every symbol). The fifth card can be arbitrary, that is, 48 choices. That makes $\frac{13 \cdot 48}{2\,598\,960} \doteq 2.4 \cdot 10^{-4}$.

FULL HOUSE: Three identical symbols give $13\binom{4}{3} = 52$ choices and two identical symbols give $12\binom{4}{2} = 72$ choices. The probability is $\frac{52 \cdot 72}{2\,598\,960} \doteq 1.4 \cdot 10^{-3}$.

FLUSH: All five cards of the same colour means $4\binom{13}{5} = 5148$ choices, and the probability is then $\frac{5148}{2\,598\,960} \doteq 2 \cdot 10^{-3}$.

STRAIGHT: The highest card of the sequence is in the range from 6 to ace, that is, 9 choices. The colour of every card is arbitrary, which makes $9 \cdot 4^5 = 9216$ choices. But this counts both the straight flushes and the royal flushes, which we must subtract. For determining the probability of one of the six best combinations we don't have to do that - we simply do not count the first two combinations. Therefore we obtain the probability approximately

$3.5 \cdot 10^{-3} + 2 \cdot 10^{-3} + 1.4 \cdot 10^{-3} + 2.4 \cdot 10^{-4} \doteq 7.1 \cdot 10^{-3}.$

iv) The situation is clearly pretty good, and therefore it will be easier to count the bad situations, that is, those where the opponent has an even better combination. I have at this moment a full house of two aces and three twos. The only combinations that could beat me at this moment are a full house of three aces and two twos, or a poker of twos. That means that the opponent must hold either an ace or the last two. If he has the two and any other card, then he clearly wins no matter what card is the river. How many ways are there for this other card in his hand? $3 + 4 + \dots + 4 + 2 = 45$ (three remaining threes, four cards of each of the ten untouched values, and two remaining aces - the other cards are either in my hand or on the board). There are $\binom{46}{2} = 1035$ possible combinations in the opponent's hand, and the probability of such a loss is then $\frac{45}{1035} \doteq 0.043$. If he has an ace in his hand, then the following can happen. If he holds two aces, then he wins unless a two comes on the river - in that case I would have a poker. The probability of my (conditional) loss is then $\frac{1}{1035} \cdot \frac{43}{44} \doteq 10^{-3}$. If the opponent has in his hand an ace and some other card than 2 and A, then it is a draw no matter what the river is. The total probability of a win is thus almost 96 %. □

1.154. A volleyball team (with a libero, that is, 7 people) sits after a match in a pub and drinks beer. But there are not enough mugs, so the publican keeps using the same seven. What is the probability that

i) exactly one person does not receive the mug he had in the last round,
ii) nobody receives the mug he had in the last round,
iii) exactly three people receive the mug they had in the last round?

Solution.

i) If six people receive the mug they had in the last round, then clearly the seventh person also receives the mug he had. The probability is thus zero.
ii) Let $M$ be the set of all orderings and let the event $A_i$ occur when the $i$-th person receives his mug from the last round. We want to calculate $\left|M \setminus \bigcup_i A_i\right|$. By the inclusion-exclusion principle we obtain $7!\sum_{k=0}^{7}\frac{(-1)^k}{k!} = 1854$, and the probability is $\frac{1854}{5040} \doteq 0.37$.
iii) We choose which three receive the mug they had in the last round - $\binom{7}{3} = 35$ choices. The remaining four must all receive mugs from somebody else. That is again the formula from the previous part, specifically $4!\sum_{k=0}^{4}\frac{(-1)^k}{k!} = 9$ choices.
In total we thus have $9 \cdot 35 = 315$ choices, and the probability is $\frac{315}{5040} = \frac{1}{16}$. □

1.155. In how many ways can we place $n$ identical rooks on an $n \times n$ chessboard such that every non-occupied position is threatened by some of the rooks?

Solution. Such placements are a union of two sets: the set of placements where there is at least one rook in every row (therefore exactly one in every row; this set has $n^n$ elements - in every row we choose independently one position for the rook), and the set of placements where there is at least one (that is, exactly one) rook in every column (as before, this set has $n^n$ elements). The intersection of these sets has $n!$ elements (the places for the rooks are chosen sequentially starting in the first row - there we have $n$ choices, in the second row only $n - 1$, since one column is already occupied, and so on). Using the inclusion-exclusion principle, we obtain $2n^n - n!$. □

1.156. Determine the probability that, when throwing two dice, at least one of them resulted in four, given that the sum is 7.

Solution. We solve this exercise using classical probability, where the condition is interpreted as a restriction of the probability space. Due to the condition, the space has 6 elements, and exactly 2 of those are favourable to the given event. The answer is thus $2/6 = 1/3$. □

1.157. We throw two dice. Determine the conditional probability that the first die resulted in five under the condition that the sum is 9. Based on this result, decide whether the events "the first die results in five" and "the sum is 9" are independent.

Solution. If we denote the event "the first die resulted in five" by $A$ and the event "the sum is 9" by $H$, then

$P(A \mid H) = \frac{P(A \cap H)}{P(H)} = \frac{1/36}{4/36} = \frac{1}{4}.$

Note that the sum 9 occurs when the first die is 3 and the second 6, the first is 4 and the second 5, the first is 5 and the second 4, or the first is 6 and the second 3. Of those four results (which have the same probability), only one is favourable to the event $A$. Since the probability of $A$ is clearly $1/6 \neq 1/4$, the events are not mutually independent. □

1.158. Let us have a deck of 32 cards. If we twice draw one card, what is the probability that the second drawn card is an ace, if we return the first card; and when we don't return the first card (so that there are 31 cards in the deck)?

Solution. If we return the card to the deck, we are just repeating the experiment, which has 32 possible results (all with the same probability), and exactly four of them are favourable. Thus we see that the probability is 1/8. In the second case, when we do not return the card, the probability is the same. It is enough to consider that when drawing all the cards one by one, the probability of an ace as the first card is identical to the probability of an ace as the second card. We could also use conditional probability, which gives

$\frac{4}{32} \cdot \frac{3}{31} + \frac{28}{32} \cdot \frac{4}{31} = \frac{1}{8}.$ □
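The inclusion-exclusion counts from 1.154 (the mug problem) can be confirmed by listing all $7! = 5040$ permutations and tallying fixed points. A Python sketch (variable names ours):

```python
from itertools import permutations

counts = {}
for p in permutations(range(7)):
    fixed = sum(1 for i, v in enumerate(p) if i == v)   # mugs returned to their owner
    counts[fixed] = counts.get(fixed, 0) + 1

print(counts[0], counts[3], counts.get(6, 0))           # 1854, 315, 0
```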
1.159. Consider families with two children and for simplicity assume that all elements of the set $\Omega = \{bb, bg, gb, gg\}$, where $b$ stands for "boy" and $g$ stands for "girl" (the children being ordered by age), have the same probability. Choose the random events $H_1$ - the family has a boy, $A_1$ - the family has two boys. Compute $P(A_1 \mid H_1)$. Similarly consider families with three children, where $\Omega = \{bbb, bbg, bgb, gbb, bgg, gbg, ggb, ggg\}$. If $H_2$ - the family has both a boy and a girl, and $A_2$ - the family has at most one girl, decide whether the events $A_2$ and $H_2$ are independent.

Solution. Considering which of the four elements of the set $\Omega$ are (or are not) favourable to the events $A_1$ and $H_1$, we easily obtain

$P(A_1 \mid H_1) = \frac{P(A_1 \cap H_1)}{P(H_1)} = \frac{P(A_1)}{P(H_1)} = \frac{1/4}{3/4} = \frac{1}{3}.$

Further we have to determine whether the following holds: $P(A_2 \cap H_2) = P(A_2) \cdot P(H_2)$. Again we just have to realize that exactly the elements $bbb$, $bbg$, $bgb$, $gbb$ of the set $\Omega$ are favourable to the event $A_2$; to the event $H_2$ the elements $bbg$, $bgb$, $gbb$, $bgg$, $gbg$, $ggb$ are favourable, and to the event $A_2 \cap H_2$ the elements $bbg$, $bgb$, $gbb$. Therefore

$P(A_2 \cap H_2) = \frac{3}{8} = \frac{1}{2} \cdot \frac{3}{4} = P(A_2) \cdot P(H_2),$

which means that the events $A_2$ and $H_2$ are independent. □

1.160. We flip a coin five times. For every head we put a white ball in a hat, for every tail we put a black ball in the same hat. Express the probability that there are more black balls than white balls in the hat, if there is at least one black ball in the hat.

Solution. Let us have the following two events: $A$ - there are more black balls than white balls in the hat, $H$ - there is at least one black ball in the hat. We want to express $P(A \mid H)$. Note that the probability $P(H^c)$ of the event complementary to $H$ is $2^{-5}$, and that the probability of the event $A$ is the same as the probability $P(A^c)$ of its complementary event (there are more white balls in the hat). Necessarily $P(H) = 1 - 2^{-5}$, $P(A) = 1/2$. Furthermore $P(A \cap H) = P(A)$, since the event $H$ contains the event $A$ (the event $A$ has $H$ as a consequence). Thus we have obtained

$P(A \mid H) = \frac{P(A \cap H)}{P(H)} = \frac{1/2}{1 - 2^{-5}} = \frac{16}{31}.$ □

1.161. In a box there are 9 red and 7 white balls. Sequentially we draw three balls (without returning). Determine the probability that the first two are red and the third is white.

Solution. We solve this exercise using the theorem about the multiplication of probabilities. First we require a red ball; that happens with probability 9/16. If a red ball was drawn, then in the second round we draw a red ball with probability 8/15 (there are 15 balls in the box, 8 of them red). Finally, if two red balls were drawn, the probability that a white ball is drawn is 7/14 (there are 7 white balls and 7 red balls in the box). Thus we obtain

$\frac{9}{16} \cdot \frac{8}{15} \cdot \frac{7}{14} = 0.15.$ □

1.162. In a box there are 10 balls, 5 of them black and 5 white. We sequentially draw the balls and do not return them back. Determine the probability that we first draw a white ball, then a black, then a white, and in the last, fourth turn, again a white.

Solution. We use the theorem about the multiplication of probabilities. In the first round we draw a white ball with probability 5/10, then a black ball with probability 5/9, then a white ball with probability 4/8, and in the end a white ball with probability 3/7. That gives

$\frac{5}{10} \cdot \frac{5}{9} \cdot \frac{4}{8} \cdot \frac{3}{7} = \frac{5}{84}.$ □

1.163. From a deck of 32 cards we randomly draw six cards. Compute the probability that the first king is chosen as the sixth card (that is, the previous five cards do not contain any king).

Solution. Using the theorem about the multiplication of probabilities we have

$\frac{28}{32} \cdot \frac{27}{31} \cdot \frac{26}{30} \cdot \frac{25}{29} \cdot \frac{24}{28} \cdot \frac{4}{27} \doteq 0.0723.$ □

1.164. What is the probability that the sum of two randomly chosen positive numbers smaller than 1 is smaller than 3/7?

Solution. It is clear that this is a simple exercise on geometric probability, where the basic space $\Omega$ is the square with vertices at $[0, 0]$, $[1, 0]$, $[1, 1]$, $[0, 1]$ (we are choosing two numbers in $[0, 1]$). We are interested in the probability of the event that a randomly chosen point $[x, y]$ in this square satisfies $x + y < 3/7$, that is, the probability that the point lies in the triangle $A$ with vertices at $[0, 0]$, $[3/7, 0]$, $[0, 3/7]$. Now we can easily compute

$P(A) = \frac{\mathrm{vol}\,A}{\mathrm{vol}\,\Omega} = \frac{(3/7)^2/2}{1} = \frac{9}{98}.$ □
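Geometric probabilities of this kind are also pleasant targets for Monte Carlo simulation. A Python sketch for 1.164, assuming only the standard random module; the empirical frequency should approach $9/98$:

```python
import random

n = 10**6
hits = sum(random.random() + random.random() < 3 / 7 for _ in range(n))
print(hits / n, (3 / 7)**2 / 2)     # both close to 9/98 = 0.0918...
```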
We are interested in the probability of the event that a randomly chosen point [x, y] in this square satisfies x + y < 3/7, that is, the probability that the point lies in the triangle A with vertices at [0, 0], [3/7, 0], [0, 3/7]. Now we can easily compute p(a\ — Mil — (^) /2 _ 2. ^ ' vol Q 1 98 • □ 61 CHAPTER 1. INITIAL WARM UP 1.165. Let a pole be randomly broken into three parts. Determine the probability that the length of the second (middle) part is greater than two thirds of the length of the pole before the breaking. Solution. Let d stand for the length of the pole. The breaking of the pole at two points is given by the choice of the points where we split the pole. Let x be the point which is the first (closer to left end of the pole), and x + y be the point where the second splitting occurs. That says that the basic space is the set {[x, y]; x € (0, d), y € (0, d — x)}, that is, a triangle with vertices at [0, 0], [d, 0], [0, d]. The length of the middle part is given by the value of y. The condition from the exercise statement can be now restated as y > 2d/3, which corresponds to the triangle with vertices at [0, 2d/3], [d/3, 2d/3], [0, d]. Areas of the considered triangles are d1 /2 a (d/3)2/2, therefore the probability is 3^-2 _ 1 (fl ~ 9' 2 □ 1.166. A pole of length 2 m is randomly divided into three parts. Determine the probability of the event that the third part is shorter than 1,5m. Solution. This exercise is for using the geometrical probability, where we are looking for the probability that the sum of the lengths of the first two parts is greater than one fourth of the length of the pole. We determine the probability of the complementary event, that is, the probability that if we randomly choose two points on the pole, both of them are in the first quarter of the pole. The probability of this event is 1/42, since the probability of picking a point in the first quarter of the pole is clearly 1/4 and this choice is independently repeated (once). Thus the probability of the complementary event is 15/16. □ 1.167. Mirek and Marek have a lunch at the school canteen. The canteens opens from 11 to 14. Each of them eats the lunch for 30 minutes, and the arrival time is random. What is the probability that they meet at a given day, if they always sit at the same table? Solution. The space of all possible events is a square 3x3. Denote by x the arrival time of Mirek and by y the arrival time of Marek, these two meet if and only if \x — y\ < 1/2. This inequality determines in the square of possible events the area whose volume is 11/36 of the volume of the whole square. Thus that is also the probability of the event. □ 1.168. >From Brno Honza rides a car to Prague randomly between 12 and 16, and in the same time interval Martin rides a car to Brno from Prague. Both stop in a motorest in the middle of the trip for thirty minutes. What is the probability that they meet there, if Honza's speed is 150 km/h and Martin's is 100 km/h? (The distance Praha-Brno is 200 km). Solution. If we denote the departure time of Martin by x and the departure time of Honza by y, and in order to have fewer fractions in the following calculations choose a time unit to be ten minutes, then the base space is a square 24 x 24. The arrival time of Martin to the motorest is x + 6, arrival time of Honza is y +4. As in the previous exercise, the event that they meet in the motorest is equivalent to the event that their arrival times do not differ by more than thirty minutes, that is, | (x + 6) — (y + 4) | < 3. 
This condition determines an area of volume $24^2 - \frac{1}{2}(23^2 + 19^2)$ (see the figure), and the probability is

$p = \frac{24^2 - \frac{1}{2}(23^2 + 19^2)}{24^2} = \frac{131}{576} \doteq 0.23.$ □

1.169. Mirek departs randomly between 10 and 20 o'clock from Brno to Prague. Marek departs randomly in the same interval from Prague to Brno. The trip takes 2 hours. What is the probability that they meet on the road (they use the same road)?

Solution. We solve this analogously to the previous exercise. The space of all events is a square $10 \times 10$; Mirek, departing at the time $x$, meets Marek, departing at the time $y$, if and only if $|x - y| < 2$. The probability is

$p = \frac{10^2 - 8^2}{10^2} = 0.36.$ □

1.170. A two-meter-long pole is randomly divided into three pieces. Determine the probability that a triangle can be built of the pieces.

Solution. The division of the pole is given, as in the previous exercises, by the cut-points $x$ and $y$, and the probability space is again a square $2 \times 2$. In order to be able to build a triangle from the pieces, the lengths of the parts must satisfy the triangle inequalities, that is, the sum of the lengths of any two parts must be greater than the length of the third part. Since the sum of the lengths is 2 meters, this condition is equivalent to the condition that each part must be shorter than 1 meter. In terms of the cut-points $x$ and $y$ this means that we cannot simultaneously have $x < 1$ and $y < 1$, nor simultaneously $x > 1$ and $y > 1$ (this corresponds to the conditions that the two boundary parts of the pole are shorter than 1), and moreover $|x - y| < 1$ (the middle part is shorter than one). These conditions are satisfied by the shaded area in the picture, whose volume is one quarter of the square. □

1.171. Does the system of equations

(a) $4x_1 - \sqrt{3}\,x_2 = 3$, $x_1 - 2\sqrt{7}\,x_2 = -2$;
(b) $4x_1 - \sqrt{3}\,x_2 = 16$, $x_1 - 2\sqrt{7}\,x_2 = -7$;
(c) $4x_1 + 2x_2 = 7$, $-2x_1 - x_2 = -3$

have a unique solution (that is, exactly one)?

Solution. A system of equations is uniquely solvable if and only if the determinant of the matrix given by the left-hand side coefficients is nonzero. Therefore the coefficients on the right-hand side do not influence the uniqueness of the solution, and we must get the same answer in (a) and (b). Since

$\begin{vmatrix} 4 & -\sqrt{3} \\ 1 & -2\sqrt{7} \end{vmatrix} = 4 \cdot (-2\sqrt{7}) - (-\sqrt{3} \cdot 1) \neq 0, \qquad \begin{vmatrix} 4 & 2 \\ -2 & -1 \end{vmatrix} = 4 \cdot (-1) - (2 \cdot (-2)) = 0,$

in (a) and (b) there is a unique solution, and in (c) there is not. If we multiply the second equation in (c) by $-2$, we see that it has no solution at all. □

1.172. Compute the area $S$ of the quadrilateral given by the vertices $[0, -2]$, $[-1, 1]$, $[1, 5]$, $[1, -1]$.

Solution. In the usual notation $A = [0, -2]$, $B = [1, -1]$, $C = [1, 5]$, $D = [-1, 1]$, and with the usual division of the quadrilateral into the triangles $ABC$ and $ACD$ with areas $S_1$ and $S_2$, we obtain

$S = S_1 + S_2 = \frac{1}{2}\begin{vmatrix} 1 - 0 & -1 + 2 \\ 1 - 0 & 5 + 2 \end{vmatrix} + \frac{1}{2}\begin{vmatrix} 1 - 0 & 5 + 2 \\ -1 - 0 & 1 + 2 \end{vmatrix} = \frac{1}{2}(7 - 1) + \frac{1}{2}(3 + 7) = 8.$ □

1.173. Determine the area of the quadrilateral $ABCD$ with vertices $A = [1, 0]$, $B = [11, 13]$, $C = [2, 5]$ and $D = [-2, -5]$.

Solution. We divide the quadrilateral into the two triangles $ABC$ and $ACD$. We compute their areas by determinants (see 1.34):

$S = \frac{1}{2}\begin{vmatrix} 10 & 13 \\ 1 & 5 \end{vmatrix} + \frac{1}{2}\begin{vmatrix} 1 & 5 \\ -3 & -5 \end{vmatrix} = \frac{37}{2} + 5 = \frac{47}{2}.$ □

1.174. Compute the area of a parallelogram with vertices at $[5, 5]$, $[6, 8]$ and $[6, 9]$.

Solution. Although such a parallelogram is not uniquely determined (the fourth vertex is not given), the triangle with vertices at $[5, 5]$, $[6, 8]$ and $[6, 9]$ is necessarily half of every parallelogram with these three vertices (one of the sides of the triangle becomes a diagonal of the parallelogram). Therefore the area equals the determinant

$\begin{vmatrix} 6 - 5 & 6 - 5 \\ 8 - 5 & 9 - 5 \end{vmatrix} = \begin{vmatrix} 1 & 1 \\ 3 & 4 \end{vmatrix} = 1.$ □
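The determinant formula for triangle areas used in 1.172-1.174 is easily wrapped in a helper. A Python sketch (tri_area is our ad hoc name) reproducing two of the answers:

```python
def tri_area(A, B, C):
    # half the absolute value of det(B - A, C - A)
    return abs((B[0] - A[0]) * (C[1] - A[1]) - (C[0] - A[0]) * (B[1] - A[1])) / 2

# 1.172: S = S1 + S2 = 3 + 5 = 8
print(tri_area([0, -2], [1, -1], [1, 5]) + tri_area([0, -2], [1, 5], [-1, 1]))
# 1.174: the triangle is half of the parallelogram, so the area is 1
print(2 * tri_area([5, 5], [6, 8], [6, 9]))
```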
1.175. Determine the number of relations over the set $\{1, 2, 3, 4\}$ which are both symmetric and transitive.

Solution. A relation with the given properties is an equivalence over some subset of the set $\{1, 2, 3, 4\}$. In total,

$1 + 4 \cdot 1 + \binom{4}{2} \cdot 2 + \binom{4}{3} \cdot 5 + 15 = 52.$ □

1.176. Determine the number of ordering relations over a three-element set. ○

1.177. Determine the number of ordering relations over the set $\{1, 2, 3, 4\}$ such that the elements 1 and 2 are not comparable (that is, neither $1 \prec 2$ nor $2 \prec 1$, where $\prec$ stands for the ordering relation). ○

1.178. Determine the number of surjective mappings $f$ from the set $\{1, 2, 3, 4, 5\}$ to the set $\{1, 2, 3\}$ such that $f(1) = f(2)$.

Solution. Every such mapping is uniquely given by the images of the elements 1, 3, 4, 5. There are exactly as many of these mappings as there are surjective mappings of the set $\{1, 3, 4, 5\}$ to the set $\{1, 2, 3\}$, that is, 36, as we know from the exercise 1.100. □

1.179. Give all the elements of $S \circ R$, if

$R = \{(2, 4), (4, 4), (4, 5)\} \subseteq \mathbb{N} \times \mathbb{N}$, $\quad S = \{(3, 1), (3, 2), (3, 5), (4, 1), (4, 4)\} \subseteq \mathbb{N} \times \mathbb{N}$.

Solution. Considering all choices of two ordered tuples $(2, 4), (4, 1)$; $(2, 4), (4, 4)$; $(4, 4), (4, 1)$; $(4, 4), (4, 4)$ such that the second element of the first ordered tuple - which is a member of $R$ - equals the first element of the second ordered tuple - which is a member of $S$ - we obtain

$S \circ R = \{(2, 1), (2, 4), (4, 1), (4, 4)\}.$ □

1.180. Let the binary relation

$R = \{(0, 4), (-3, 0), (5, \pi), (5, 2), (0, 2)\}$

between the sets $A = \mathbb{Z}$ and $B = \mathbb{R}$ be given. Express $R^{-1}$ and $R \circ R^{-1}$.

Solution. We can immediately see that

$R^{-1} = \{(4, 0), (0, -3), (\pi, 5), (2, 5), (2, 0)\}.$

Furthermore,

$R \circ R^{-1} = \{(4, 4), (0, 0), (\pi, \pi), (2, 2), (4, 2), (\pi, 2), (2, \pi), (2, 4)\}.$ □

1.181. Decide whether the relations $R$ determined by the following conditions are transitive:

Solution. In the first case $R$ is transitive. In the second case $R$ is not transitive: for instance, $(4, 2), (2, 1) \in R$, but $(4, 1) \notin R$. □

1.182. Find all relations over $M = \{1, 2\}$ which are not antisymmetric. Which of them are transitive?

Solution. There are four relations that are not antisymmetric. They are exactly the subsets of the set $\{1, 2\} \times \{1, 2\}$ which contain both elements $(1, 2)$, $(2, 1)$ (otherwise the condition of antisymmetry is satisfied). Of these four, only the relation

$\{(1, 1), (1, 2), (2, 1), (2, 2)\} = M \times M$

is transitive, because a transitive relation containing both $(1, 2)$ and $(2, 1)$ must also contain the tuples $(1, 1)$ and $(2, 2)$. □

1.183. Is there an equivalence relation which is also an ordering over the set of all lines in the plane?

Solution. An equivalence relation (or an ordering relation) must be reflexive, therefore every line must be in relation with itself. Furthermore, we require that the relation is both symmetric (equivalence) and antisymmetric (ordering). That means that a line can be in relation only with itself. If we define the relation so that two lines are in relation if and only if they are identical, we obtain a "very natural" relation which is both an equivalence relation and an ordering. We just need to check that it is transitive, which it trivially is. Thus the only relation satisfying the problem statement is the identity over the set of all lines in the plane. □

1.184. Determine whether the relation

$R = \{(k, l) \in \mathbb{Z} \times \mathbb{Z};\ |k| \ge |l|\}$

over the set $\mathbb{Z}$ is an equivalence and/or an ordering.

Solution. The relation $R$ is not an equivalence: it is not symmetric (take $(6, 2) \in R$, $(2, 6) \notin R$); it is not an ordering: it is not antisymmetric (take $(2, -2) \in R$, $(-2, 2) \in R$). □
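Properties such as those examined in 1.184 can be tested mechanically on a finite part of $\mathbb{Z}$. A Python sketch (the helper name properties is ours; we restrict $\mathbb{Z}$ to $\{-3, \dots, 3\}$, which already exhibits the failures):

```python
def properties(r, base):
    refl = all((a, a) in r for a in base)
    sym = all((b, a) in r for (a, b) in r)
    anti = all(a == b for (a, b) in r if (b, a) in r)
    trans = all((a, d) in r for (a, b) in r for (c, d) in r if b == c)
    return refl, sym, anti, trans

base = range(-3, 4)
r = {(k, l) for k in base for l in base if abs(k) >= abs(l)}
print(properties(r, base))   # (True, False, False, True): neither symmetric nor antisymmetric
```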
1.185. Show that the intersection of an arbitrary collection of equivalence relations over a set $X$ is again an equivalence relation, and that the union of two ordering relations over a set $X$ does not have to be an ordering.

Solution. We show that the intersection of equivalence relations is reflexive, symmetric and transitive. All the equivalence relations must contain the tuple $(x, x)$ for every $x \in X$, therefore the intersection contains these tuples too. If the element $(x, y)$ is in the intersection, then the element $(y, x)$ is also in the intersection (just use the fact that every equivalence is symmetric). If the tuples $(x, y)$ and $(y, z)$ are in the intersection, then they both lie in each of the equivalences. Since the equivalences are transitive, each of them contains the element $(x, z)$, and thus that element is also in the intersection.

If we choose $X = \{1, 2\}$ and the ordering relations $R_1 = \{(1, 1), (2, 2), (1, 2)\}$, $R_2 = \{(1, 1), (2, 2), (2, 1)\}$ over $X$, we obtain the relation

$R_1 \cup R_2 = \{(1, 1), (2, 2), (1, 2), (2, 1)\},$

which is not antisymmetric, thus not an ordering. □

1.186. Over the set $M = \{1, 2, \dots, 19, 20\}$ there is an equivalence relation $\sim$ such that $a \sim b$ for any $a, b \in M$ if and only if the first digits of the numbers $a$, $b$ are the same. Construct the partition given by this equivalence.

Solution. Two numbers from the set $M$ are in the same equivalence class if and only if they are in the relation (their first digit is the same). Therefore the partition consists of the sets $\{1, 10, 11, \dots, 18, 19\}$, $\{2, 20\}$, $\{3\}$, $\{4\}$, $\{5\}$, $\{6\}$, $\{7\}$, $\{8\}$, $\{9\}$. □

1.187. We are given the partition with the two classes $\{b, c\}$, $\{a, d, e\}$ of the set $X = \{a, b, c, d, e\}$. Write down the equivalence relation $R$ over the set $X$ which gives this partition.

Solution. The equivalence $R$ is determined by the fact that two elements are in relation if and only if they are in the same partition class (note also that $R$ must be symmetric), and every element is in relation with itself ($R$ must be reflexive). Therefore $R$ contains exactly the elements $(a, a)$, $(b, b)$, $(c, c)$, $(d, d)$, $(e, e)$, $(b, c)$, $(c, b)$, $(a, d)$, $(a, e)$, $(d, a)$, $(d, e)$, $(e, a)$, $(e, d)$. □

1.188. In the following three figures, icons are connected with lines in ways that people in different parts of the world could have assigned them. Determine whether each assignment is a mapping, and whether it is injective, surjective or bijective.

Solution. In the first case it is a mapping which is surjective but not injective, because both the snake and the spider are labelled as poisonous. The second case is not a mapping but only a relation, since the dog is labelled both as a pet and as a meal. The third case is again a mapping. This time it is neither injective nor surjective. □

1.189. Let $\{a, b, c, d\}$ be a set with the relation

$\{(a, a), (b, b), (a, b), (b, c), (c, b)\}.$

What is the minimal number of elements we have to add to the relation in order to make it an equivalence?

Solution. Let us successively ensure the three properties that define an equivalence. First comes reflexivity: we must add the tuples $(c, c)$, $(d, d)$. Second is symmetry: we must add $(b, a)$. In the third step we form the so-called transitive closure: since $a$ is in relation with $b$ and $b$ is in relation with $c$, we must also add $(a, c)$ and $(c, a)$. □
1.190. Consider the set of numbers that have five digits in the binary notation, and the relation such that two numbers are in the relation whenever their digit sums have the same parity. Write down the corresponding equivalence classes.

Solution. We have two equivalence classes (of eight members each):

$[10000] = \{10000, 10011, 10101, 10110, 11001, 11010, 11100, 11111\},$

which corresponds to the set $\{16, 19, 21, 22, 25, 26, 28, 31\}$, and

$[10001] = \{10001, 10010, 10100, 11000, 10111, 11011, 11101, 11110\},$

which corresponds to the set $\{17, 18, 20, 23, 24, 27, 29, 30\}$. □

1.191. Consider the set of numbers that have three digits in the ternary notation and the relation such that two numbers are in the relation whenever they

i) begin with the same two digits in this notation,
ii) end with the same two digits in this notation.

Write down the corresponding equivalence classes.

Solution.

i) We obtain six three-element classes:
$[100] = \{100, 101, 102\}$ corresponds to $\{9, 10, 11\}$,
$[110] = \{110, 111, 112\}$ corresponds to $\{12, 13, 14\}$,
$[120] = \{120, 121, 122\}$ corresponds to $\{15, 16, 17\}$,
$[200] = \{200, 201, 202\}$ corresponds to $\{18, 19, 20\}$,
$[210] = \{210, 211, 212\}$ corresponds to $\{21, 22, 23\}$,
$[220] = \{220, 221, 222\}$ corresponds to $\{24, 25, 26\}$.

ii) In this case we have nine two-element classes:
$[100] = \{100, 200\}$ corresponds to $\{9, 18\}$,
$[101] = \{101, 201\}$ corresponds to $\{10, 19\}$,
$[102] = \{102, 202\}$ corresponds to $\{11, 20\}$,
$[110] = \{110, 210\}$ corresponds to $\{12, 21\}$,
$[111] = \{111, 211\}$ corresponds to $\{13, 22\}$,
$[112] = \{112, 212\}$ corresponds to $\{14, 23\}$,
$[120] = \{120, 220\}$ corresponds to $\{15, 24\}$,
$[121] = \{121, 221\}$ corresponds to $\{16, 25\}$,
$[122] = \{122, 222\}$ corresponds to $\{17, 26\}$. □

1.192. What are the maximal domain $D$ and codomain $H$ such that the following mappings are bijective, and what is then the inverse function?

i) $x \mapsto x^4$,
ii) $x \mapsto x^3$,
iii) $x \mapsto \frac{1}{x + 1}$.

Solution.

i) $D = [0, \infty)$ and $H = [0, \infty)$, or also $D = (-\infty, 0]$ and $H = [0, \infty)$. The inverse function is then $x \mapsto \sqrt[4]{x}$ (in the second case $x \mapsto -\sqrt[4]{x}$).
ii) $D = H = \mathbb{R}$ and the inverse function is $x \mapsto \sqrt[3]{x}$.
iii) $D = \mathbb{R} \setminus \{-1\}$ and $H = \mathbb{R} \setminus \{0\}$. The inverse function is $x \mapsto \frac{1}{x} - 1$. □

1.193. Consider the relation in $\mathbb{R} \times \mathbb{R}$ such that a point $[x, y]$ is in the relation whenever

$(x - 1)^2 + (y + 1)^2 = 1.$

Can we describe the points of the relation using a function $y = f(x)$? Depict the points of the relation.

Solution. We cannot, because for instance $y = -1$ has two preimages: $x = 0$ and $x = 2$. The points lie on the circle with the centre at the point $[1, -1]$ and radius 1. □

1.194. Let it hold for any two integers $k$, $l$ that $(k, l) \in r$ whenever the number $4k - 4l$ is an integral multiple of 7. Is such a relation $r$ an equivalence? Is it an ordering?

Solution. Note that two integers are in the relation $r$ if and only if they have the same remainder under division by 7 (the number 4 is relatively prime to 7). Therefore this is an example of the so-called remainder classes of integers, and we know that the relation $r$ is an equivalence relation. Its symmetry (for instance, $(3, 10), (10, 3) \in r$, $3 \neq 10$) implies that it is not an ordering. □

1.195. Let the relation $r$ be defined over the set $\{3, 4, 5, \dots, n, n + 1, \dots\}$ so that two numbers are in the relation whenever they are relatively prime (that is, the prime decompositions of the numbers do not contain any common prime). Determine whether this relation is reflexive, symmetric, antisymmetric, transitive.

Solution. For any tuple of two equal numbers we have $(n, n) \notin r$. Therefore the relation is not reflexive. It is clear that whether two numbers are relatively prime or not does not depend on their order - it is a property of unordered tuples. Therefore $r$ is symmetric. From the symmetry it follows that it is not antisymmetric (for instance, $(3, 5) \in r$, $3 \neq 5$).
Since $r$ is symmetric and $(n, n) \notin r$ for any $n$, any choice of two distinct numbers which are in the relation shows that $r$ is not transitive: from $(a, b) \in r$ and $(b, a) \in r$, transitivity would force $(a, a) \in r$. □

Solutions to the exercises

1.33. $y_n = 2(f)^n - 2$.

1.92. i) $(3, 3), (4, 4), (5, 5), (6, 6), (7, 7), (3, 6)$; check that it is an ordering relation. ii) Again $(i, i)$ for $i = 3, \dots, 7$ and additionally $(3, 6), (6, 3)$; check that it is an equivalence relation. iii) $(i, i)$ for $i = 3, \dots, 7$ and also $(3, 6), (6, 3), (4, 6), (6, 4)$; check that it is not an equivalence, since transitivity does not hold.

1.103. There are three different Hasse diagrams which satisfy the given condition. In total $5! + 5! + 5!/4 = 270$.

1.114. a) $z_1 + z_2 = 1 - 3 + (-2 + 4)i = -2 + 2i$, $z_1 \cdot z_2 = 1 \cdot (-3) - 8i^2 + 6i + 4i = 5 + 10i$, $\bar{z}_1 = 1 + 2i$, $|z_2| = \sqrt{4^2 + (-3)^2} = 5$, $\frac{z_1}{z_2} = \frac{z_1 \bar{z}_2}{|z_2|^2} = \frac{1 \cdot (-3) + 8i^2 + 6i - 4i}{25} = -\frac{11}{25} + \frac{2}{25}i$. b) $2 + i$, $2i$, $2$, $1$, $\frac{2}{i} = -2i$.

1.129. i) $2^6 = 64$. ii) $\binom{6}{4} = 15$. iii) No head is one possibility, $\binom{6}{0} = 1$; one head gives $\binom{6}{1} = 6$. Thus there are 7 sequences with at most one head and the result is $64 - 7 = 57$.

1.139. The maximum number $y_n$ of areas a plane can be divided into by $n$ circles is given by $y_n = y_{n-1} + 2(n - 1)$, $y_1 = 2$, that is, $y_n = n^2 - n + 2$. For the maximum number $p_n$ of areas a space can be divided into by $n$ balls we obtain the recurrence $p_{n+1} = p_n + y_n$, $p_1 = 2$, that is, $p_n = \frac{n}{3}(n^2 - 3n + 8)$.

1.176. 19.

1.177. 87.

CHAPTER 2

Elementary linear algebra

"are you able to calculate with scalars? - if not, let us go straight to the matrices..."

In the previous chapter we warmed up with relatively simple problems which did not require any sophisticated tools. It was enough to use addition and multiplication of scalars. In this and the subsequent chapters we will be dealing with particular topics in more detail. Three chapters will be devoted to tools for working with vectors and matrices, where the operations still consist of simple operations with scalars, but applied to many scalars simultaneously. We speak about "linear objects" and "linear algebra". Although it might seem a very specialised tool, we shall see later that even more complicated objects are studied mostly through their "linear approximations".

In this chapter we will work directly with finite sequences of scalars, that is, with matrices $A$ given by their entries $A(i, j) = a_{ij}$. Matrices of the type $1/n$ or $n/1$ are actually just vectors in $\mathbb{K}^n$. Even general matrices can be understood as vectors in $\mathbb{K}^{m \cdot n}$; we just concatenate all the columns. In particular, matrix addition and matrix multiplication by scalars are defined:

$A + B = (a_{ij} + b_{ij}), \qquad a \cdot A = (a \cdot a_{ij}),$

where $A = (a_{ij})$, $B = (b_{ij})$, $a \in \mathbb{K}$. The matrix $-A = (-a_{ij})$ is called the additive inverse of the matrix $A$, and the matrix

$0 = \begin{pmatrix} 0 & \dots & 0 \\ \vdots & & \vdots \\ 0 & \dots & 0 \end{pmatrix}$

is called the zero matrix. Seeing matrices as $m \cdot n$-dimensional vectors, we obtain the following claim:

Proposition. The formulas for $A + B$, $a \cdot A$, $-A$, $0$ give for the set of all matrices of the type $m/n$ the operations of addition and multiplication by scalars which satisfy the axioms (V1)-(V4).

2.3. Matrices and equations. An often-used tool for the description of mathematical models are systems of linear equations. Matrices are useful for the description of such systems. We use for this the notion of the scalar product of two vectors, which assigns to the vectors $(a_1, \dots, a_n)$ and $(x_1, \dots, x_n)$ their product

$(a_1, \dots, a_n) \cdot (x_1, \dots, x_n) = a_1 x_1 + \dots + a_n x_n,$

that is, we subsequently multiply the corresponding coordinates of the vectors and sum the results.
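The scalar product above is a one-line function in code. The following minimal Python sketch (the helper name dot is ours) uses it to check the solution of the homogeneous system treated in the exercise 2.3 below:

```python
def dot(a, x):
    # a_1*x_1 + ... + a_n*x_n
    return sum(ai * xi for ai, xi in zip(a, x))

A = [(2, -1, 3), (3, 16, 7), (3, -5, 4), (-7, 7, -10)]
x = (11, 1, -7)                       # a solution of the homogeneous system in 2.3
print([dot(row, x) for row in A])     # [0, 0, 0, 0]
```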
Every system of $m$ linear equations in $n$ variables
$$\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2\\
&\;\;\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}$$
can be seen as a constraint on the values of $m$ scalar products of the unknown vector $(x_1, \dots, x_n)$ with the vectors of coefficients $(a_{i1}, \dots, a_{in})$.

$$\left(\begin{array}{ccc|c} 1&2&3&2\\ 0&1&1&1\\ 0&0&4&-4\end{array}\right) \sim \left(\begin{array}{ccc|c} 1&2&3&2\\ 0&1&1&1\\ 0&0&1&-1\end{array}\right)$$

First we subtracted from the second row twice the first row, and to the third row we added three times the first row. Then we added the second row to the third row and multiplied the third row by $1/4$. Let us now go back to the system of equations:
$$x_1 + 2x_2 + 3x_3 = 2,\quad x_2 + x_3 = 1,\quad x_3 = -1.$$
We immediately see that $x_3 = -1$. If we plug $x_3 = -1$ into the equation $x_2 + x_3 = 1$, we obtain $x_2 = 2$. And plugging $x_3 = -1$, $x_2 = 2$ into the first equation we obtain $x_1 = 1$. □

Systems of linear equations can thus be written in matrix notation. But is this an advantage, when we can solve the systems even without speaking of matrices? Yes, it is: we can speak about the solutions more conceptually, we can say in the language of matrices how many solutions a system has, and the matrix form is more natural and elegant for computer applications. Try to get more familiar with the particular operations which can be done with matrices.

As we have seen in the previous examples, the equivalent operations with linear equations correspond to elementary row (column) transformations. Further, we have seen that once a matrix is transformed into row echelon form (this process is called Gaussian elimination, see 2.7), solving the system is very easy. We show this on some more examples, where we will also see that a system can have infinitely many solutions.

2.3. Solve the system of linear equations
$$\begin{aligned}
2x_1 - x_2 + 3x_3 &= 0,\\
3x_1 + 16x_2 + 7x_3 &= 0,\\
3x_1 - 5x_2 + 4x_3 &= 0,\\
-7x_1 + 7x_2 - 10x_3 &= 0.
\end{aligned}$$

Solution. Because the right-hand sides of all equations are zero (such a case is called a homogeneous system), we shall work with the matrix of the system only. We find the solution by transforming the matrix into row echelon form using elementary row transformations, which correspond to changing the order of the equations, multiplying an equation by a non-zero number, and adding multiples of equations. Furthermore, we can always go back and forth between the matrix notation and the original system notation with the variables $x_i$. We obtain
$$\begin{pmatrix} 2&-1&3\\ 3&16&7\\ 3&-5&4\\ -7&7&-10\end{pmatrix} \sim \begin{pmatrix} 2&-1&3\\ 0&35/2&5/2\\ 0&-7/2&-1/2\\ 0&7/2&1/2\end{pmatrix} \sim \begin{pmatrix} 2&-1&3\\ 0&7&1\\ 0&0&0\\ 0&0&0\end{pmatrix}.$$
From there we can see that the second, third and fourth equations are multiples of the equation $7x_2 + x_3 = 0$. But we can also carry on with the transformations to see what happens:
$$\begin{pmatrix} 2&-1&3\\ 0&35/2&5/2\\ 0&-7/2&-1/2\\ 0&7/2&1/2\end{pmatrix} \sim \begin{pmatrix} 2&-1&3\\ 0&35/2&5/2\\ 0&0&0\\ 0&0&0\end{pmatrix}.$$
Although we were given four equations for three variables, the whole system has infinitely many solutions, because for any $x_3 \in \mathbb{R}$ the remaining equations
$$2x_1 - x_2 + 3x_3 = 0,\qquad 7x_2 + x_3 = 0$$
have a solution. Thus we substitute the parameter $t \in \mathbb{R}$ for the variable $x_3$ and express
$$x_2 = -\tfrac{1}{7}t,\qquad x_1 = \tfrac{1}{2}(x_2 - 3x_3) = -\tfrac{11}{7}t.$$
If we now substitute $t = -7s$, we obtain the result in a simple form:
$$(x_1, x_2, x_3) = (11s,\ s,\ -7s),\qquad s \in \mathbb{R}. \;\square$$
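The parametric family just found is easy to check mechanically: each row of the system matrix, dotted with a member of the family, must vanish. A small sketch, assuming Python; the sample parameters are arbitrary choices of ours:

```python
# the matrix of the homogeneous system from exercise 2.3
A = [[2, -1, 3], [3, 16, 7], [3, -5, 4], [-7, 7, -10]]

def mat_vec(A, x):
    # A.x computed as the scalar products of the rows of A with x
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

for s in (-2, 1, 5):
    print(mat_vec(A, [11 * s, s, -7 * s]))   # [0, 0, 0, 0] each time
```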
The vector of variables can also be seen as a column in a matrix of the type $n/1$, and similarly the values $b_1, \dots, b_m$ form a vector $u$, again a single column of a matrix of the type $m/1$. Our system of equations can then be formally written as $A\cdot x = u$:
$$\begin{pmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & & \vdots\\ a_{m1} & \cdots & a_{mn}\end{pmatrix}\begin{pmatrix} x_1\\ \vdots\\ x_n\end{pmatrix} = \begin{pmatrix} b_1\\ \vdots\\ b_m\end{pmatrix},$$
where the left-hand side is interpreted as the $m$ scalar products of the individual rows of the matrix with the column $x$ (giving rise to a column vector), whose values are prescribed by the equations. That means that the identity of the $i$-th coordinates corresponds to the original $i$-th equation
$$a_{i1}x_1 + \cdots + a_{in}x_n = b_i,$$
and the notation $A\cdot x = u$ gives back the original system of equations.

2.4. Matrix product. In the plane, that is, for vectors of dimension two, we developed a matrix calculus and we saw that it is effective to work with (see 1.26). Now we will work more generally and develop all the tools we already know from the plane case for all dimensions $n$.

Matrix multiplication can be defined only when the dimensions of the rows and columns allow it, that is, when the scalar product is defined for them as before:

___ Matrix product ___
For any matrix $A = (a_{ij})$ of the type $m/n$ and any matrix $B = (b_{jk})$ of the type $n/q$ over the ring of scalars $\mathbb{K}$ we define their product $C = A\cdot B = (c_{ik})$ as the matrix of the type $m/q$ with the elements
$$c_{ik} = \sum_{j=1}^{n} a_{ij}b_{jk}, \qquad \text{for arbitrary } 1 \le i \le m,\ 1 \le k \le q.$$

That is, the element $c_{ik}$ of the product is exactly the scalar product of the $i$-th row of the matrix on the left with the $k$-th column of the matrix on the right. For instance we have
$$\begin{pmatrix}2&1\\1&-1\end{pmatrix}\cdot\begin{pmatrix}0&1\\3&2\end{pmatrix} = \begin{pmatrix}3&4\\-3&-1\end{pmatrix}.$$

2.5. Square matrices. If there is the same number of rows and columns in the matrix, we speak of a square matrix; this number is then called the dimension of the matrix. The matrix
$$E = (\delta_{ij}) = \begin{pmatrix} 1 & \cdots & 0\\ \vdots & \ddots & \vdots\\ 0 & \cdots & 1\end{pmatrix}$$
is called the unit matrix. The numbers $\delta_{ij}$ defined in this way (one for $i = j$, zero otherwise) are called the Kronecker delta. On the set of square matrices over $\mathbb{K}$ of dimension $n$ the matrix product is defined for any two matrices, so a multiplication operation is defined there, and its properties are similar to those of scalars:

Proposition. The set of all square matrices of dimension $n$ over an arbitrary ring of scalars $\mathbb{K}$ carries a multiplication operation with the following properties of rings (see 1.3):
(1) The associativity (O1) holds.
(2) The unit matrix $E = (\delta_{ij})$ is a unit element for multiplication.
(3) Multiplication and addition are distributive.
However, for dimension $n > 1$ the square matrices are not commutative and do not form an integral domain; therefore they are not even a (non-commutative) field.

Proof. Associativity of multiplication (O1): since the scalars are associative, distributive and commutative, we can compute for three matrices $A = (a_{ij})$ of type $m/n$, $B = (b_{jk})$ of type $n/p$ and $C = (c_{kl})$ of type $p/q$:
$$A\cdot B = \Bigl(\sum_j a_{ij}b_{jk}\Bigr),\qquad B\cdot C = \Bigl(\sum_k b_{jk}c_{kl}\Bigr),$$
$$(A\cdot B)\cdot C = \Bigl(\sum_k\bigl(\sum_j a_{ij}b_{jk}\bigr)c_{kl}\Bigr) = \Bigl(\sum_{j,k} a_{ij}b_{jk}c_{kl}\Bigr),$$
$$A\cdot(B\cdot C) = \Bigl(\sum_j a_{ij}\bigl(\sum_k b_{jk}c_{kl}\bigr)\Bigr) = \Bigl(\sum_{j,k} a_{ij}b_{jk}c_{kl}\Bigr).$$
Note that while computing we relied on the fact that it does not matter in which order the given sums and products are taken, that is, we heavily used the properties of scalars.

We can easily see that multiplication by the unit matrix has the property of a unit element:
$$A\cdot E = \begin{pmatrix} a_{11} & \cdots & a_{1m}\\ \vdots & & \vdots\\ a_{m1} & \cdots & a_{mm}\end{pmatrix}\begin{pmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & 1\end{pmatrix} = A,$$
and similarly for multiplication by $E$ from the left.

It remains to show the distributivity of multiplication and addition. Again using the distributivity of scalars we easily calculate, for matrices $A = (a_{ij})$ of the type $m/n$, $B = (b_{jk})$ and $C = (c_{jk})$ of the type $n/p$, and $D = (d_{kl})$ of the type $p/q$,
$$A\cdot(B + C) = \Bigl(\sum_j a_{ij}(b_{jk}+c_{jk})\Bigr) = \Bigl(\sum_j a_{ij}b_{jk}\Bigr) + \Bigl(\sum_j a_{ij}c_{jk}\Bigr) = A\cdot B + A\cdot C,$$
$$(B + C)\cdot D = \Bigl(\sum_k (b_{jk}+c_{jk})d_{kl}\Bigr) = \Bigl(\sum_k b_{jk}d_{kl}\Bigr) + \Bigl(\sum_k c_{jk}d_{kl}\Bigr) = B\cdot D + C\cdot D.$$

As we have seen in 1.26, two matrices of dimension two do not necessarily commute:
$$\begin{pmatrix}1&0\\0&0\end{pmatrix}\cdot\begin{pmatrix}0&1\\0&0\end{pmatrix} = \begin{pmatrix}0&1\\0&0\end{pmatrix},\qquad \begin{pmatrix}0&1\\0&0\end{pmatrix}\cdot\begin{pmatrix}1&0\\0&0\end{pmatrix} = \begin{pmatrix}0&0\\0&0\end{pmatrix}.$$
This immediately gives a counterexample to the validity of (O2) and (OI). For matrices of the type $1/1$ both axioms clearly hold, because the scalars themselves satisfy them. For matrices of greater dimension the counterexamples can be obtained similarly: place the counterexample for dimension 2 in the left upper corner and fill the rest with zeros. (Verify it on your own!) □

In the proof we actually worked with matrices of more general types, so we have proved the properties in greater generality:

___ Associativity and distributivity of matrix multiplication ___
Corollary. Matrix multiplication is associative and distributive, that is,
$$A\cdot(B\cdot C) = (A\cdot B)\cdot C,\qquad A\cdot(B + C) = A\cdot B + A\cdot C,$$
whenever all the given operations are defined. The unit matrix is a unit element for multiplication, both from the right and from the left.

2.4. Find all solutions of the system of linear equations
$$\begin{aligned}
3x_1 + 3x_3 - 5x_4 &= -8,\\
x_1 - x_2 + x_3 - x_4 &= -2,\\
-2x_1 - x_2 + 4x_3 - 2x_4 &= 0,\\
2x_1 + x_2 - x_3 - x_4 &= -3.
\end{aligned}$$

Solution. The corresponding extended matrix of the system is
$$\left(\begin{array}{cccc|c} 3&0&3&-5&-8\\ 1&-1&1&-1&-2\\ -2&-1&4&-2&0\\ 2&1&-1&-1&-3\end{array}\right).$$
By changing the order of the rows (equations) we obtain
$$\left(\begin{array}{cccc|c} 1&-1&1&-1&-2\\ 2&1&-1&-1&-3\\ -2&-1&4&-2&0\\ 3&0&3&-5&-8\end{array}\right),$$
which we transform into row echelon form:
$$\sim \left(\begin{array}{cccc|c} 1&-1&1&-1&-2\\ 0&3&-3&1&1\\ 0&-3&6&-4&-4\\ 0&3&0&-2&-2\end{array}\right) \sim \left(\begin{array}{cccc|c} 1&-1&1&-1&-2\\ 0&3&-3&1&1\\ 0&0&3&-3&-3\\ 0&0&3&-3&-3\end{array}\right) \sim \left(\begin{array}{cccc|c} 1&-1&1&-1&-2\\ 0&3&-3&1&1\\ 0&0&3&-3&-3\\ 0&0&0&0&0\end{array}\right).$$
The system thus has infinitely many solutions, because we are left with three equations for four variables, and they have exactly one solution for every choice of $x_4 \in \mathbb{R}$. Thus we substitute the parameter $t \in \mathbb{R}$ for $x_4$ and go back from the matrix notation to the equations
$$x_1 - x_2 + x_3 - t = -2,\quad 3x_2 - 3x_3 + t = 1,\quad 3x_3 - 3t = -3.$$
From the last equation we have $x_3 = t - 1$. Plugging this into the second equation gives $3x_2 - 3t + 3 + t = 1$, that is, $x_2 = \tfrac{1}{3}(2t - 2)$. Finally, using the first equation, $x_1 = -2 + x_2 - x_3 + t$, i.e. $x_1 = \tfrac{1}{3}(2t - 5)$. Thus the set of solutions can be written (for $t = 3s$) in the form
$$(x_1, x_2, x_3, x_4) = \bigl(2s - \tfrac{5}{3},\ 2s - \tfrac{2}{3},\ 3s - 1,\ 3s\bigr),\qquad s \in \mathbb{R}.$$

Let us now go back to the extended matrix of the system and transform it further using row transformations so that (still in row echelon form) the first non-zero number of every row (the so-called pivot) equals one, and all the other numbers in the column of a pivot are zero. We have
$$\left(\begin{array}{cccc|c} 1&-1&1&-1&-2\\ 0&3&-3&1&1\\ 0&0&3&-3&-3\\ 0&0&0&0&0\end{array}\right) \sim \left(\begin{array}{cccc|c} 1&-1&1&-1&-2\\ 0&1&-1&1/3&1/3\\ 0&0&1&-1&-1\\ 0&0&0&0&0\end{array}\right) \sim \left(\begin{array}{cccc|c} 1&0&0&-2/3&-5/3\\ 0&1&0&-2/3&-2/3\\ 0&0&1&-1&-1\\ 0&0&0&0&0\end{array}\right),$$
because first we multiplied the second and the third rows by $1/3$, then we added the third row to the second and its $(-1)$-multiple to the first, and finally we added the second row to the first. From the last matrix we easily obtain the result
$$\begin{pmatrix}x_1\\x_2\\x_3\\x_4\end{pmatrix} = \begin{pmatrix}-5/3\\-2/3\\-1\\0\end{pmatrix} + t\begin{pmatrix}2/3\\2/3\\1\\1\end{pmatrix},\qquad t \in \mathbb{R}.$$
Free variables are those whose columns do not contain any pivot (in our case there is no pivot in the fourth column, that is, the fourth variable is free and we use it as the parameter). □
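Back on the theory side, the product formula $c_{ik} = \sum_j a_{ij}b_{jk}$ from 2.4 translates directly into code. A minimal Python sketch, also reproducing the non-commuting pair of matrices from the proof above:

```python
def mat_mul(A, B):
    # c_ik = sum over j of a_ij * b_jk  (A of type m/n, B of type n/q)
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

X = [[1, 0], [0, 0]]
Y = [[0, 1], [0, 0]]
print(mat_mul(X, Y))   # [[0, 1], [0, 0]]
print(mat_mul(Y, X))   # [[0, 0], [0, 0]]  -- X.Y != Y.X
```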
2.5. Determine the solutions of the system of equations
$$\begin{aligned}
3x_1 + 3x_3 - 5x_4 &= 8,\\
x_1 - x_2 + x_3 - x_4 &= -2,\\
-2x_1 - x_2 + 4x_3 - 2x_4 &= 0,\\
2x_1 + x_2 - x_3 - x_4 &= -3.
\end{aligned}$$

Solution. Note that this system differs from the system in the previous exercise only in the value 8 (instead of $-8$) on the right-hand side of the first equation. If we do the same row transformations as in the previous exercise, we obtain
$$\left(\begin{array}{cccc|c} 3&0&3&-5&8\\ 1&-1&1&-1&-2\\ -2&-1&4&-2&0\\ 2&1&-1&-1&-3\end{array}\right) \sim \left(\begin{array}{cccc|c} 1&-1&1&-1&-2\\ 0&3&-3&1&1\\ 0&0&3&-3&-3\\ 0&0&3&-3&13\end{array}\right) \sim \left(\begin{array}{cccc|c} 1&-1&1&-1&-2\\ 0&3&-3&1&1\\ 0&0&3&-3&-3\\ 0&0&0&0&16\end{array}\right),$$
where the last operation was subtracting the third row from the fourth. From the fourth equation, $0 = 16$, it follows that the system has no solutions. Let us emphasise that whenever we obtain an equation of the form $0 = a$ for some $a \neq 0$ (that is, a zero row on the left side of the bar and a non-zero number after it) when doing the row transformations, the system has no solutions. □

You can find more exercises on systems of linear equations on page 127.

B. Manipulations with matrices

In this sub-chapter we shall work with matrices only, in order to get more familiar with their properties.

2.6. Matrix multiplication. Carry out the matrix multiplications and check the results. Note that, in order to be able to multiply two matrices, the necessary and sufficient condition is that the first matrix has the same number of columns as the number of rows of the second matrix. The number of rows of the resulting matrix is given by the number of rows of the first matrix, and the number of columns equals the number of columns of the second matrix.

Remark. The parts i) and ii) of the previous exercise show that multiplication of square matrices is not commutative in general; in part iii) we see that two rectangular matrices, if they can be multiplied at all, can be multiplied in only one of the two orders. In parts iv) and v) you can note that $(A\cdot B)^T = B^T\cdot A^T$.

2.6. Inverse matrices. With scalars we can do the following: from the equation $a\cdot x = b$ we can express $x = a^{-1}\cdot b$, whenever the inverse of $a$ exists. We would like to be able to do this for matrices too, but we have a problem: how can we tell when such a matrix exists, and how do we compute it?

We say that $B$ is the inverse of $A$ when $A\cdot B = B\cdot A = E$. Then we write $B = A^{-1}$, and from the definition it is clear that both matrices must be square and of the same dimension $n$. A matrix which has an inverse is called an invertible matrix, or a regular square matrix. In the subsequent paragraphs we derive (among other things) that $B$ is the inverse of $A$ whenever just one of the two required equations holds (the other is then necessarily true as well).

If $A^{-1}$ and $B^{-1}$ exist, then the inverse of the product $A\cdot B$ also exists, and
$$(A\cdot B)^{-1} = B^{-1}\cdot A^{-1}. \tag{2.1}$$
Indeed, by the associativity of matrix multiplication proved a while ago,
$$(B^{-1}\cdot A^{-1})\cdot(A\cdot B) = B^{-1}\cdot(A^{-1}\cdot A)\cdot B = E,$$
$$(A\cdot B)\cdot(B^{-1}\cdot A^{-1}) = A\cdot(B\cdot B^{-1})\cdot A^{-1} = E.$$

Because we can calculate with matrices similarly as with scalars (they are just a little more complicated), the existence of the inverse matrix can really help with the solution of systems of linear equations: if we express a system of $n$ equations for $n$ unknowns as a matrix product
$$A\cdot x = u,$$
and the inverse of the matrix $A$ exists, then we can multiply from the left by $A^{-1}$ to obtain
$$A^{-1}\cdot u = A^{-1}\cdot A\cdot x = E\cdot x = x,$$
that is, $A^{-1}\cdot u$ is the desired solution. On the other hand, expanding the condition $A\cdot A^{-1} = E$ for the unknown scalars in the matrix $A^{-1}$ gives $n$ systems of linear equations with the same matrix on the left and different vectors on the right.
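The identity (2.1) is easy to spot-check in the two-dimensional case, where the explicit inverse $\begin{pmatrix}a&b\\c&d\end{pmatrix}^{-1} = \frac{1}{ad-bc}\begin{pmatrix}d&-b\\-c&a\end{pmatrix}$ is available. A sketch, assuming Python with exact `fractions` arithmetic; the two sample matrices are our own choice:

```python
from fractions import Fraction

def inv2(M):
    # the explicit 2x2 inverse, defined whenever ad - bc is non-zero
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_mul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

A, B = [[1, 3], [3, 8]], [[1, 2], [3, 4]]
print(inv2(mat_mul(A, B)) == mat_mul(inv2(B), inv2(A)))   # True
```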
2.7. Equivalent operations with matrices. Let us gain some practical insight into the relation between systems of equations and their matrices. Clearly, searching for the inverse can be more laborious than solving the system directly. But it is important that whenever we have to solve several systems of equations with the same matrix $A$ but different right-hand sides $u$, computing $A^{-1}$ can really pay off.

From the point of view of solving systems $A\cdot x = u$, it is natural to consider matrices $A$ and vectors $u$ equivalent whenever they give a system of equations with the same solution set. Let us think about operations which simplify the matrix $A$ so that obtaining the solution becomes easier. We begin with simple manipulations of the rows of the equations which do not influence the solution, together with the same modifications of the right-hand side vector. If we manage to change a square matrix into the unit matrix, the transformed right-hand side vector is a solution of the original system. If some rows of the system vanish during the manipulations (that is, they become zero), this also gives direct information about the solution. Our simple operations are:

___ Elementary row transformations ___
• switching two rows,
• multiplying a given row by a non-zero scalar,
• adding a row to another row.

These operations are called elementary row transformations. It is clear that the corresponding operations at the level of the equations of the system do not change the solution set whenever our ring is an integral domain.

Analogously, elementary column transformations of matrices are
• switching two columns,
• multiplying a given column by a non-zero scalar,
• adding a column to another column.
These do not preserve the solution set, since they interchange the variables themselves.

Systematically, we can use elementary row transformations for the sequential elimination of variables. This gives an algorithm usually called Gaussian elimination of variables.

___ Gaussian elimination of variables ___
Proposition. Any non-zero matrix over an arbitrary ring of scalars $\mathbb{K}$ can be transformed, using finitely many elementary row transformations, into the so-called (row) echelon form:
• if $a_{ik} = 0$ for all $k = 1, \dots, j$, then $a_{lk} = 0$ for all $k \le j$ and all $l \ge i$;
• if $a_{(i-1)j}$ is the first non-zero element of the $(i-1)$-th row, then $a_{ik} = 0$ for all $k \le j$.

2.7. Calculate $A^5$ and $A^{-3}$ for the matrix $A = (\cdots)$. ○

2.8. Let
$$A = \begin{pmatrix}1&0&-5\\2&7&15\\ \cdot&7&13\end{pmatrix},\qquad B = (\cdots).$$
Can the matrix $A$ be transformed into $B$ using only elementary row transformations (we then say that such matrices are row equivalent)?

Solution. Both matrices are row equivalent to the three-dimensional identity matrix. It is easy to see that row equivalence on the set of all matrices of a given type is indeed an equivalence relation. Thus the matrices $A$ and $B$ are row equivalent. □

2.9. Find some matrix $B$ for which the matrix $C = B\cdot A$ is in row echelon form, where
$$A = \begin{pmatrix}3&5&2&\cdot\\ \cdot&3&0&1\\ 1&-3&-5&\cdot\\ 7&-5&1&4\end{pmatrix}.$$

Solution. We multiply the matrix $A$ gradually from the left by suitable elementary matrices $E_1, \dots, E_8$ (consider which elementary row transformation each of them corresponds to: a row switch, a multiplication of a row by a scalar such as $1/3$ or $1/4$, or the addition of a multiple of one row to another).
In this way we obtain
$$B = E_8E_7E_6E_5E_4E_3E_2E_1 = (\cdots),\qquad C = B\cdot A = \begin{pmatrix}1&-3&-5&0\\0&1&9/4&1/4\\0&0&0&0\\0&0&0&0\end{pmatrix}. \;\square$$

2.10. Complex numbers as matrices. Consider the set of matrices
$$C = \left\{\begin{pmatrix}a&b\\-b&a\end{pmatrix};\ a, b \in \mathbb{R}\right\}.$$
Note that $C$ is closed under addition and multiplication, and further show that the mapping
$$f: \begin{pmatrix}a&b\\-b&a\end{pmatrix} \mapsto a + bi$$
satisfies $f(M + N) = f(M) + f(N)$ and $f(M\cdot N) = f(M)\cdot f(N)$ (on the left-hand sides of the equations we have addition and multiplication of matrices, on the right-hand sides addition and multiplication of complex numbers). Thus the set $C$, along with these operations, can be seen as the field $\mathbb{C}$ of complex numbers; the mapping $f$ is then called an isomorphism (of fields). For instance we have
$$\begin{pmatrix}3&5\\-5&3\end{pmatrix}\cdot\begin{pmatrix}8&-9\\9&8\end{pmatrix} = \begin{pmatrix}69&13\\-13&69\end{pmatrix},$$
which corresponds to $(3 + 5i)\cdot(8 - 9i) = 69 + 13i$. □

Proof. A matrix in row echelon form looks like
$$\begin{pmatrix} 0&\cdots&0&a_{1j}&\cdots&\cdots&a_{1n}\\ 0&\cdots&0&0&a_{2k}&\cdots&a_{2n}\\ \vdots& & & & &\ddots&\vdots \end{pmatrix},$$
and the matrix can (but does not have to) end with some zero rows. To transform an arbitrary matrix we can use a simple algorithm, which brings us, row by row, to the resulting echelon form:

___ Gaussian elimination algorithm ___
(1) By possibly switching rows, obtain a matrix in which the first row has a non-zero element in the first non-zero column of the matrix; let that column be the $j$-th.
(2) For $i = 2, \dots$, eliminate the element $a_{ij}$ of the $i$-th row: multiply the first row by the element $a_{ij}$, multiply the $i$-th row by the element $a_{1j}$, and subtract.
(3) By repeated application of steps (1) and (2), always on the not-yet-echelon part of the rows and columns of the matrix, we reach after a finite number of steps the final form of the matrix.

This proves the proposition. □

The given algorithm is really the usual elimination of variables used for systems of linear equations. In a completely analogous manner we define the column echelon form of matrices, and doing column instead of row elementary transformations we obtain an algorithm transforming a matrix into column echelon form.

Remark. We have formulated Gaussian elimination for general scalars from some ring. It might seem natural to multiply rows by suitable scalars so as to obtain a row echelon form in which the coefficients on the non-zero "diagonal" are ones, which makes computing the solution easy. However, this is not possible in general: take, for instance, the integers $\mathbb{Z}$. Moreover, for solving systems of equations the given algorithm does not make sense at all if there are divisors of zero among the scalars. Think carefully about the differences between $\mathbb{K} = \mathbb{Z}$, $\mathbb{K} = \mathbb{R}$, and possibly $\mathbb{Z}_2$ or $\mathbb{Z}_4$.
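The isomorphism of exercise 2.10 can be watched in action with a few lines of Python; `to_matrix` is our own helper name, and the floats come from Python's built-in complex type:

```python
# complex numbers encoded as 2x2 real matrices: a + bi <-> [[a, b], [-b, a]]
def mat_mul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def to_matrix(z):
    return [[z.real, z.imag], [-z.imag, z.real]]

M = mat_mul(to_matrix(3 + 5j), to_matrix(8 - 9j))
print(M)                     # [[69.0, 13.0], [-13.0, 69.0]]
print((3 + 5j) * (8 - 9j))   # (69+13j) -- f respects the products
```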
2.8. Matrix of elementary row transformations. In the following we will work exclusively over a field of scalars $\mathbb{K}$, that is, every non-zero scalar has an inverse. Note that the elementary row (column) transformations correspond to multiplication from the left (right) by the following matrices:

(1) Switching the $i$-th and $j$-th rows (columns): the unit matrix with the $i$-th and $j$-th rows interchanged,
$$\begin{pmatrix} 1& & & & \\ &0&\cdots&1& \\ &\vdots&\ddots&\vdots& \\ &1&\cdots&0& \\ & & & &1\end{pmatrix}.$$

(2) Multiplying the $i$-th row (column) by the scalar $a$: the unit matrix with $a$ in the position $(i, i)$.

(3) Adding the $j$-th row (column) to the $i$-th: the unit matrix with an extra 1 in the position $(i, j)$.

This trivial observation is actually very important, since the product of invertible matrices is invertible (recall (2.1)) and all elementary transformations over a field of scalars are invertible (the definition of an elementary transformation itself ensures that the inverse transformation is of the same type, and it is easy to determine the corresponding matrix). For an arbitrary matrix $A$ we thus obtain, by multiplying from the left with a suitable invertible matrix $P = P_k \cdots P_1$ (that is, by sequential multiplication with $k$ elementary matrices), its equivalent row echelon form $A' = P\cdot A$.

In general, if we apply the same elimination procedure to the columns, we can obtain from any matrix $B$ its column echelon form $B'$ by multiplying it from the right by a suitable invertible matrix $Q = Q_1 \cdots Q_l$. If we start with a matrix $B = A'$ already in row echelon form, this procedure eliminates only the still non-zero elements off the diagonal, and at the end the remaining non-zero elements can be rescaled to ones. Thus we have verified a very important result which we will use many times in the future:

2.9. Theorem. For every matrix $A$ of the type $m/n$ over a field of scalars $\mathbb{K}$ there exist square invertible matrices $P$ of dimension $m$ and $Q$ of dimension $n$ such that the matrix $P\cdot A$ is in row echelon form and
$$P\cdot A\cdot Q = \begin{pmatrix} E_h & 0\\ 0 & 0\end{pmatrix},$$
that is, a matrix with ones on the first $h$ diagonal positions and zeros everywhere else.

2.10. Algorithm for computing the inverse matrix. In the previous paragraphs we basically obtained the complete algorithm for computing the inverse matrix. Using the simple approach described below, we either find out that the inverse does not exist, or we compute it. Keep in mind that we are still working over a field of scalars.

Equivalent row transformations of a square matrix $A$ of dimension $n$ lead to a matrix $P'$ such that $P'\cdot A$ is in row echelon form. It could be that some of the last rows are zero. If the inverse of $A$ exists, then the inverse of $P'\cdot A$ exists as well. But if the last row of $P'\cdot A$ is zero, then the last row of $P'\cdot A\cdot B$ is also zero for any matrix $B$ of dimension $n$. That is, the appearance of a zero row in the result of (row) Gaussian elimination means that the inverse $A^{-1}$ cannot exist.

2.11. Solve the equations
$$\begin{pmatrix}1&3\\3&8\end{pmatrix}\cdot X_1 = \begin{pmatrix}1&2\\3&4\end{pmatrix},\qquad X_2\cdot\begin{pmatrix}1&3\\3&8\end{pmatrix} = \begin{pmatrix}1&2\\3&4\end{pmatrix}$$
for matrices $X_1$, $X_2$.

Solution. Clearly the unknowns $X_1$ and $X_2$ must be matrices of the type $2\times 2$ (in order for the products to be defined and the results to be of the type $2\times 2$). Set
$$X_1 = \begin{pmatrix}a_1&b_1\\c_1&d_1\end{pmatrix},\qquad X_2 = \begin{pmatrix}a_2&b_2\\c_2&d_2\end{pmatrix}$$
and multiply out the matrices in the first equation. It has to hold that
$$\begin{pmatrix}a_1 + 3c_1 & b_1 + 3d_1\\ 3a_1 + 8c_1 & 3b_1 + 8d_1\end{pmatrix} = \begin{pmatrix}1&2\\3&4\end{pmatrix},$$
that is, $a_1 + 3c_1 = 1$, $b_1 + 3d_1 = 2$, $3a_1 + 8c_1 = 3$, $3b_1 + 8d_1 = 4$. Adding a $(-3)$-multiple of the first equation to the third we obtain $c_1 = 0$ and then $a_1 = 1$. Analogously, adding a $(-3)$-multiple of the second equation to the fourth we obtain $d_1 = 2$ and then $b_1 = -4$. Thus we have
$$X_1 = \begin{pmatrix}1&-4\\0&2\end{pmatrix}.$$
We find the values $a_2, b_2, c_2, d_2$ by a different approach. Using the relation
$$\begin{pmatrix}a&b\\c&d\end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix}d&-b\\-c&a\end{pmatrix},$$
which holds for any numbers $a, b, c, d \in \mathbb{R}$ with $ad - bc \neq 0$ (easy to derive; it also follows directly from 2.2), we calculate
$$\begin{pmatrix}1&3\\3&8\end{pmatrix}^{-1} = \begin{pmatrix}-8&3\\3&-1\end{pmatrix}.$$
Multiplying the second given equation by this matrix from the right gives
$$X_2 = \begin{pmatrix}1&2\\3&4\end{pmatrix}\cdot\begin{pmatrix}-8&3\\3&-1\end{pmatrix} = \begin{pmatrix}-2&1\\-12&5\end{pmatrix}. \;\square$$

2.12. Solve the matrix equation
$$\begin{pmatrix}2&5\\1&3\end{pmatrix}\cdot X = \begin{pmatrix}4&-1\\2&1\end{pmatrix}. \;\bigcirc$$
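The correspondence of 2.8 between row operations and elementary matrices can be verified directly. A small Python sketch; the matrix `A` and the chosen operation are arbitrary examples of ours:

```python
def mat_mul(A, B):
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[2, -1, 3], [4, 0, 1], [0, 5, 2]]

P = identity(3)
P[1][0] = -2          # elementary matrix: add (-2) x row 0 to row 1
print(mat_mul(P, A))  # [[2, -1, 3], [0, 2, -5], [0, 5, 2]]
```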
Assume now that $A^{-1}$ exists. By the above, elimination then yields a row echelon form with no zero row, that is, all diagonal elements of $P'\cdot A$ are non-zero. Carrying the elimination further by elementary row transformations from the bottom-right corner backwards, and rescaling the diagonal elements to ones, we obtain the unit matrix $E$. That is, we find another invertible matrix $P''$ such that, for $P = P''\cdot P'$, we have $P\cdot A = E$. Doing column instead of row transformations, we can (under the assumption of the existence of $A^{-1}$) find a matrix $Q$ such that $A\cdot Q = E$. From this we have
$$P = P\cdot E = P\cdot(A\cdot Q) = (P\cdot A)\cdot Q = Q.$$
That is, we have found the inverse matrix $A^{-1} = P = Q$ of the matrix $A$. Notably, at the point of finding the matrix $P$ with the property $P\cdot A = E$, we do not have to do any further computation, since we already have the inverse matrix.

Practically we can work as follows:

___ Computing the inverse matrix ___
Write the original matrix $A$ and the unit matrix $E$ next to each other. Transform the matrix $A$ using elementary row transformations first into row echelon form, then, using the so-called backwards elimination, into a diagonal matrix, and then, multiplying by the inverse elements of $\mathbb{K}$, into the unit matrix. Apply all these transformations simultaneously to the matrix $E$; as a result, the matrix $A^{-1}$ appears in the place where $E$ was. If during the course of the procedure we obtain a zero row in the original matrix, we conclude that the inverse matrix does not exist.

2.13. Computing the inverse matrix. Compute the inverse matrices of the matrices
$$A = \begin{pmatrix}4&3&2\\5&6&3\\3&5&2\end{pmatrix},\qquad B = \begin{pmatrix}1&0&1\\3&3&4\\2&2&3\end{pmatrix}.$$
Then determine the matrix $(A^T\cdot B)^{-1}$.

Solution. We find the inverse as follows: write the matrix $A$ and the unit matrix next to each other, then use elementary row transformations so that the sub-matrix $A$ changes into the unit matrix. This changes the original unit sub-matrix into $A^{-1}$. We gradually obtain
$$\left(\begin{array}{ccc|ccc} 4&3&2&1&0&0\\ 5&6&3&0&1&0\\ 3&5&2&0&0&1\end{array}\right) \sim \left(\begin{array}{ccc|ccc} 1&-2&0&1&0&-1\\ 0&16&3&-5&1&5\\ 0&11&2&-3&0&4\end{array}\right) \sim \left(\begin{array}{ccc|ccc} 1&-2&0&1&0&-1\\ 0&5&1&-2&1&1\\ 0&1&0&1&-2&2\end{array}\right) \sim \left(\begin{array}{ccc|ccc} 1&0&0&3&-4&3\\ 0&1&0&1&-2&2\\ 0&0&1&-7&11&-9\end{array}\right).$$
In the first step we subtracted the third row from the first; in the second step we added a $(-5)$-multiple of the first row to the second and a $(-3)$-multiple of the first row to the third; in the third step we subtracted the third row from the second; in the fourth step we added a $(-2)$-multiple of the second row to the third; in the fifth step we added a $(-5)$-multiple of the third row to the second and a 2-multiple of the third row to the first; and in the last step we switched the second and the third rows. Let us emphasise the result:
$$A^{-1} = \begin{pmatrix}3&-4&3\\1&-2&2\\-7&11&-9\end{pmatrix}.$$
Let us note that, thanks to the suitably chosen row transformations, we did not have to cope with fractions when calculating $A^{-1}$. Although we could proceed similarly for $B^{-1}$, we will rather do the more obvious row transformations. We have
$$\left(\begin{array}{ccc|ccc} 1&0&1&1&0&0\\ 3&3&4&0&1&0\\ 2&2&3&0&0&1\end{array}\right) \sim \cdots \sim \left(\begin{array}{ccc|ccc} 1&0&0&1&2&-3\\ 0&1&0&-1&1&-1\\ 0&0&1&0&-2&3\end{array}\right),$$
that is,
$$B^{-1} = \begin{pmatrix}1&2&-3\\-1&1&-1\\0&-2&3\end{pmatrix}.$$
Using the identity $(A^T\cdot B)^{-1} = B^{-1}\cdot(A^T)^{-1} = B^{-1}\cdot(A^{-1})^T$ and the knowledge of the inverse matrices computed before, we obtain
$$(A^T\cdot B)^{-1} = \begin{pmatrix}-14&-9&42\\-10&-5&27\\17&10&-49\end{pmatrix}. \;\square$$
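The boxed procedure above is short enough to code in full. A sketch of the Gauss-Jordan computation of the inverse, assuming Python with exact `fractions` arithmetic (our own implementation, not the book's):

```python
from fractions import Fraction

def inverse(A):
    """Eliminate on [A | E]; return A^{-1}, or None if a zero column appears."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return None                      # no inverse exists
        M[c], M[pivot] = M[pivot], M[c]      # switch rows
        M[c] = [x / M[c][c] for x in M[c]]   # rescale the pivot to 1
        for r in range(n):
            if r != c and M[r][c] != 0:      # eliminate the rest of the column
                M[r] = [x - M[r][c] * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

A = [[4, 3, 2], [5, 6, 3], [3, 5, 2]]
print(inverse(A))   # rows 3 -4 3 / 1 -2 2 / -7 11 -9, as in exercise 2.13
```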
2.11. Linear dependence and rank. In the previous musings about calculations with matrices we worked all the time with additions of rows and columns, viewed as vectors, along with their multiplications by scalars. Such operations are called linear combinations. We return to them in an abstract sense in 2.24, but it is useful to understand their meaning now. By a linear combination of rows (columns) of a matrix $A = (a_{ij})$ of the type $m/n$ we understand an expression of the form
$$c_1 u_{i_1} + \cdots + c_k u_{i_k},$$
where the $c_i$ are scalars and $u_j = (a_{j1}, \dots, a_{jn})$ are rows (or $u_j = (a_{1j}, \dots, a_{mj})$ are columns) of the matrix $A$.

If there exists a linear combination of the given rows, with at least one non-zero scalar coefficient, which results in the zero row, we say that these rows are linearly dependent. In the other case, that is, when the only possibility of obtaining the zero row is by taking all scalars zero, the rows are called linearly independent. Analogously, we define linearly dependent and linearly independent columns.

The previous results about Gaussian elimination can now be interpreted as follows: the number of non-zero "steps" in the row (column) echelon form always equals the number of linearly independent rows (columns) of the matrix.

Let $E_h$ be the matrix from theorem 2.9 with $h$ ones on the diagonal, and assume that two different transformation sequences yield two different values $h' < h$. Then, according to our algorithm, there are invertible matrices $P$ and $Q$ such that $P\cdot E_{h'}\cdot Q = E_h$. In the product $E_{h'}\cdot Q$ there are more zero rows in the bottom part of the matrix than there are in $E_h$, yet we should be able to reach $E_h$ from it using only elementary row transformations. But increasing the number of linearly independent rows using only elementary row transformations is not possible. Therefore the number of ones in the matrix $P\cdot A\cdot Q$ of theorem 2.9 is independent of the choice of the elimination sequence, and it is always equal to the number of linearly independent rows of $A$, and also to the number of linearly independent columns of $A$. This number is called the rank of the matrix and we denote it by $h(A)$. Let us record the following theorem:

Theorem. Let $A$ be a matrix of the type $m/n$ over a field of scalars $\mathbb{K}$. The matrix $A$ has the same number $h(A)$ of linearly independent rows and linearly independent columns. Notably, the rank is always at most the minimum of the dimensions of the matrix $A$.

The algorithm for computing the inverse matrix also says that a square matrix $A$ of dimension $m$ has an inverse if and only if its rank equals $m$.

2.12. Matrices as mappings. Analogously to the way we worked with matrices in the geometry of the plane (see 1.29), we can interpret every square matrix $A$ as a mapping
$$A: \mathbb{K}^n \to \mathbb{K}^n,\qquad x \mapsto A\cdot x.$$
Thanks to the distributivity of matrix multiplication it is clear how linear combinations of vectors are mapped under such mappings:
$$A\cdot(a\,x + b\,y) = a\,(A\cdot x) + b\,(A\cdot y).$$

2.14. Compute the inverse matrix of the matrix
$$\begin{pmatrix}0&-2&\cdot\\-2&1&\cdot\\-5&2&\cdot\end{pmatrix}. \;\bigcirc$$

2.15. Compute the inverse matrix of the matrix
$$\begin{pmatrix}8&3&0&0&0\\5&2&0&0&0\\0&0&-1&0&0\\0&0&1&2&0\\0&0&3&5&\cdot\end{pmatrix}. \;\bigcirc$$

2.16. Determine whether there exists an inverse of the matrix
$$C = \begin{pmatrix}1&1&1&1\\ \cdot&\cdot&\cdot&\cdot\\ 1&1&\cdot&\cdot\\ 1&1&-1&1\end{pmatrix}.$$
If yes, then compute $C^{-1}$. ○

2.17. Compute the inverse of the matrix $(\cdots)$, where $i$ is the imaginary unit. ○

2.18. Write the inverse matrix of the $n \times n$ matrix ($n > 1$)
$$\begin{pmatrix}2-n&1&\cdots&1\\1&2-n&&\vdots\\ \vdots&&\ddots&1\\1&\cdots&1&2-n\end{pmatrix}. \;\bigcirc$$

C. Permutations

In order to define the key notion of the matrix calculus, the determinant, we must deal with permutations (bijections of a finite set) and their parities. We use the two-row notation for permutations (see 2.14): in the first row we list all elements of the given set, and every column then corresponds to a pair (preimage, image) of the given permutation. Because a permutation is a bijection, the second row is indeed a permutation (an ordering) of the first row, in accordance with the definition from combinatorics.
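A dictionary is a convenient stand-in for the two-row notation. A brief sketch, assuming Python; the permutation is the one from exercise 2.19 below:

```python
# two-row notation as a dict {preimage: image}
sigma = {1: 3, 2: 1, 3: 6, 4: 7, 5: 8, 6: 9, 7: 5, 8: 4, 9: 2}

def compose(sigma, tau):
    # apply tau first, then sigma, matching composition of mappings
    return {x: sigma[tau[x]] for x in tau}

identity = {x: x for x in sigma}
inverse = {v: k for k, v in sigma.items()}
print(compose(sigma, inverse) == identity)   # True
```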
Really, if A — P ■ Ek ■ Q with matrix Ek with k ones as in 2.9, then the invertible Q first just bijectively "shuffles" the n-dimensional vectors in K", the matrix Ek then "copies" first k coordinates and zeroes the n — k remaining. This "^-dimensional" image then cannot be enlarged by multiplying with P. 2.13. Solving systems of linear equations. We shall return to the notions of dimension, linear independence and so on in the third part of this chapter. But even now we can notice what the already derived results say about the solution of the system of linear equations. If we consider the matrix of the system of equations and add to it the column of the required results, we speak about the extended matrix of the system. The approach we have presented before corresponds to sequential variable elimination in the equations and deletion of the linearly dependent equations (these are simply a consequence of other equations). We have thus derived the complete information about the size of the set of solutions of the system of linear equations, based on the rank of the matrix of the system. If we are left with more nonzero rows in the row echelon form of the extended matrix than in original matrix of the system, then there cannot be any solution (simply, we cannot hit the given value with the corresponding linear mapping). If the rank of both matrices is the same, then we have in the backwards elimination exactly that many free parameters as the difference between the number of variables n and the rank h(A). 2. Determinants In the fifth part of the first chapter we have seen (see 1.27) that for square matrices of dimension 2 over the real numbers there exists scalar function det, which as-5S=tS=E^8 signs to the matrix a non-zero number if and only if the inverse of the matrix exists. We did not say it in these words, but you can check by yourself that it means indeed the same (see the paragraphs starting with 1.26 and the formula (1.16)). Determinant was also useful in another way, see the paragraphs 1.33 and 1.34, where we have derived that the volume of the parallelepiped should be linearly dependent on every of the two vectors defining it and it is useful to require the change of the sign when changing the order of these vectors. Because determinant (and only determinant) had these properties, up to a constant scalar multiple, we stated that it corresponds to the definition of the volume. Now we will see that we can do it similarly for every finite dimension. In this part we will work with arbitrary scalars K and matrices over these scalars. Our results about determinants will thus hold for all commutative rings, notably also for integer matrices. 2.14. Definition of the determinant. Let us remind that the bijective mapping from the set X to itself is called permutation of 83 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA then corresponds to a tuple (preimage, image) in the given permutation. Because permutation is a bijection, the second row is indeed a permutation (ordering) of the first row, in accordance with the definition from combinatorics. 2.19. Decompose the permutation a 123456789 316789542 into a product of transpositions. Solution. We first decompose the permutation into a product of independent cycles: let us start with the first element (one) and look on the second row to see what the image of one is. It is three. Now we look on the column that starts with three, and find out that the image of three is six, and so on. 
We carry on in this manner until we again reach the starting element (in this case one). We obtain the following sequence of elements, which map one to the next under the given permutation:
$$1 \mapsto 3 \mapsto 6 \mapsto 9 \mapsto 2 \mapsto 1.$$
The mapping which maps elements in such a manner is called a cycle (see 2.16), which we denote $(1, 3, 6, 9, 2)$. Now we take any element not contained in the obtained cycle and start the same procedure with it. We obtain the cycle $(4, 7, 5, 8)$. From the method it is clear that the result does not depend on the first obtained cycle. Each element of the set $\{1, 2, \dots, 9\}$ appears in one of the obtained cycles, so we can write
$$\sigma = (1, 3, 6, 9, 2)\circ(4, 7, 5, 8).$$
For cycles the decomposition into transpositions is simple; we have
$$(1, 3, 6, 9, 2) = (1, 3)\circ(3, 6)\circ(6, 9)\circ(9, 2) = (1, 3)(3, 6)(6, 9)(9, 2).$$
We thus obtain
$$\sigma = (1, 3)(3, 6)(6, 9)(9, 2)(4, 7)(7, 5)(5, 8). \;\square$$

Remark. Let us note that the operation $\circ$ is composition of mappings, so it is necessary to carry out the composition "backwards", as we are used to with compositions of mappings. Applying the given composition of transpositions to, for instance, the element two, we can write step by step:
$$[(1,3)(3,6)(6,9)(9,2)](2) = [(1,3)(3,6)(6,9)]\bigl((9,2)(2)\bigr) = [(1,3)(3,6)(6,9)](9) = [(1,3)(3,6)](6) = (1,3)(3) = 1,$$
so the mapping indeed maps the element 2 to the element 1 (it is actually just the cycle $(1, 3, 6, 9, 2)$ written in a different way).

If $X = \{1, 2, \dots, n\}$, a permutation can be written by putting the resulting ordering into a table:
$$\begin{pmatrix}1&2&\cdots&n\\ \sigma(1)&\sigma(2)&\cdots&\sigma(n)\end{pmatrix}.$$
An element $x \in X$ is called a fixed point of the permutation $\sigma$ if $\sigma(x) = x$. A permutation $\sigma$ such that there exist exactly two distinct elements $x, y \in X$ with $\sigma(x) = y$, while all other elements $z \in X$ are fixed points, is called a transposition; we denote it by $(x, y)$. Of course, such a transposition also satisfies $\sigma(y) = x$, hence the name.

In dimension 2 the formula for the determinant was simple: take all possible products of two elements, one from every column and every row of the matrix, give them signs such that switching two columns changes the sign of the whole result, and sum all of them (that is, both of them):
$$A = \begin{pmatrix}a&b\\c&d\end{pmatrix},\qquad \det A = ad - bc.$$
In general, consider square matrices $A = (a_{ij})$ of dimension $n$ over $\mathbb{K}$. The formula for the determinant of the matrix $A$ is also composed of all possible products of elements taken from the individual rows and columns:

___ Definition of determinant ___
The determinant of the matrix $A$ is the scalar $\det A = |A|$ defined by the relation
$$|A| = \sum_{\sigma\in\Sigma_n} \operatorname{sgn}(\sigma)\, a_{1\sigma(1)}\cdot a_{2\sigma(2)}\cdots a_{n\sigma(n)},$$
where $\Sigma_n$ is the set of all permutations of $\{1, \dots, n\}$, and the sign $\operatorname{sgn}$ of a permutation $\sigma$ will be described below. Each of the expressions $\operatorname{sgn}(\sigma)a_{1\sigma(1)}\cdot a_{2\sigma(2)}\cdots a_{n\sigma(n)}$ is called a member of the determinant $|A|$.

In dimensions 2 and 3 we can easily guess the correct signs. The product of the elements on the diagonal should come with the positive sign, and we want antisymmetry when switching two columns or rows.

___ Determinants in dimension 2 and 3 ___
For $n = 2$ it is, as we expected,
$$\begin{vmatrix}a_{11}&a_{12}\\a_{21}&a_{22}\end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}.$$
Similarly for $n = 3$:
$$\begin{vmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{vmatrix} = a_{11}a_{22}a_{33} - a_{13}a_{22}a_{31} + a_{13}a_{21}a_{32} - a_{11}a_{23}a_{32} + a_{12}a_{23}a_{31} - a_{12}a_{21}a_{33}.$$
This formula is called the Sarrus rule.
When writing the cycle we write only the elements on which the cycle (that is, the mapping) nontrivially acts (that is, the element is mapped to some other element). Fixed-points of the cycle are not listed. Thus it is necessary to know on which set do we consider the given cycle (mostly it will be clear from the context). The cycle c = (4, 7, 5, 8) from the previous example is thus a mapping (permutation), which, in the two-row notation, looks like this 123456789 123786549 If the original permutation has some fixed-points they do not appear in the cycle decomposition. Let us further note that the notation (1, 2, 3) gives the same cycle as for instance (2, 3, 1) or (3, 1, 2). But the notation (1, 3, 2) is a different cycle. 2.20. Determine the parity of the following permutations: 1 2 3 4 5 7 8 9 3 1 7 8 9 5 4 2 1 2 3 4 5 6 2 4 6 1 5 3 Solution. >From the previous exercise we know that a = (1,3)(3,6)(6, 9)(9, 2)(4, 7)(7,5)(5, 8). Its parity is given by the parity of the number of transpositions in its decomposition (which is, unlike the number of transposition in an arbitrary decomposition, always the same). There are seven transpositions in the decomposition, thus the permutation is odd. Even without the knowledge of a decomposition of a into transpositions we could compute the number of tuples (a,b) c {1, 2,..., 9} x {1, 2, ..., 9} which are inverse with respect to a (see 2.15): we go sequentially through the second row in the two-row notation and for every number k there we count the number of numbers which are smaller than k and are located after k in the second row. It is not hard to realise that the number of inversions in a given permutation is exactly the number of tuples "bigger before smaller" in the second row. For a we compute (stepping through the second row): after three there is one and two, thus we add 2; after one there is no smaller number and we add 0; after six there is five, four and two, thus we add 4, similarly for seven, eight and nine, for five we add 2, for four we add 1 and for two nothing. Thus we have 17 inversions in total and the permutation is indeed odd. 2.15. Parity of permutation. How can we find a sign of a per-jiTZi mutation. We say that a tuple of element a,b e fv4t X — {1, ... ,n] forms an inversion in permutationo, .f^few^'i it " l> and a (a) > a(b). Permutation a is called 'ffsr*%5s-L even (odd), if it contains even (odd) number of inversions. Parity of permutation a is (-l)™mber of inversions and we de_ note it by sgn(er). This amounts to our definition of sign for computing determinant. But we would like to know how to calculate with parity. >From the following theorem about permutations it is clear that the Saarus rule really gives the determinant for the dimension 3. Theorem. Over the set X — {1, 2,..., n\ there are exactly n! distinct permutations. These can be ordered in a sequences such that every two consequent permutations differ in exactly one transposition. For any permutation there is such sequence starting with it. Every transposition changes parity. Proof. For one- and two-element X the claim is trivial. Let us do induction over the number of dimensions. Assume that the claim holds for all sets with n — 1 elements and consider a permutation g(\) — a\,..., g(n) — an. According to the induction assumption all the permutations that end with an can be obtained in a sequence where every two consequent permutations differ in one transposition. There are (n — 1)! such permutations. 
From the following theorem about permutations it is also clear that the Sarrus rule really gives the determinant in dimension 3.

Theorem. On the set $X = \{1, 2, \dots, n\}$ there are exactly $n!$ distinct permutations. They can be ordered in a sequence such that every two consecutive permutations differ in exactly one transposition, and for any permutation there is such a sequence starting with it. Every transposition changes the parity.

Proof. For one- and two-element $X$ the claim is trivial. Let us do induction on the number of elements. Assume the claim holds for all sets with $n - 1$ elements and consider a permutation $\sigma(1) = a_1, \dots, \sigma(n) = a_n$. According to the induction assumption, all the permutations that end with $a_n$ can be obtained in a sequence where every two consecutive permutations differ in one transposition; there are $(n-1)!$ such permutations. On the last of them we use the transposition exchanging $a_n$ with some element $a_i$ which has not yet been at the last position, and once again form a sequence of all permutations ending with $a_i$. After doing this procedure $n$ times, we obtain $n\cdot(n-1)! = n!$ distinct permutations, that is, all permutations on $n$ elements, and the resulting sequence satisfies the condition.

Note that the last sentence of the theorem does not seem useful for its applications, but it is a very important part for proving the theorem by induction over the size of $X$.

It remains to prove the part of the theorem about parities. Consider an ordering $(a_1, \dots, a_i, a_{i+1}, \dots, a_n)$ containing $r$ inversions. Then clearly the ordering $(a_1, \dots, a_{i+1}, a_i, \dots, a_n)$ contains either $r - 1$ or $r + 1$ inversions. Every transposition $(a_i, a_j)$ is obtainable by doing $(j - i) + (j - i - 1) = 2(j - i) - 1$ transpositions of neighbouring elements, therefore doing any transposition changes the parity. Also, we already know that all permutations can be obtained by applying transpositions one after another. □

We found out that applying a transposition changes the parity of a permutation, and any ordering of the numbers $\{1, 2, \dots, n\}$ can be obtained through transpositions of neighbouring elements. Therefore we have proven:

Corollary. On every finite set $X = \{1, \dots, n\}$ with $n > 1$ elements there are exactly $\frac{1}{2}n!$ even and $\frac{1}{2}n!$ odd permutations.

If we compose two permutations, it means first doing all the transpositions forming the first permutation and then all the transpositions forming the second. Therefore, for any two permutations $\sigma, \eta: X \to X$ we have
$$\operatorname{sgn}(\sigma\circ\eta) = \operatorname{sgn}(\sigma)\cdot\operatorname{sgn}(\eta),\qquad\text{and also}\qquad \operatorname{sgn}(\sigma^{-1}) = \operatorname{sgn}(\sigma).$$

D. Determinants

Make sure on the following exercises that you can compute determinants of the types $2\times 2$ and $3\times 3$ (using the Sarrus rule):

2.21. Compute the determinants of the matrices
$$\begin{pmatrix}1&2&3\\1&-1&2\\3&2&2\end{pmatrix},\qquad \begin{pmatrix}1&1&1\\1&0&0\\-2&0&1\end{pmatrix}. \;\bigcirc$$

2.22. Compute the determinant of the matrix
$$\begin{pmatrix}1&3&5&6\\1&2&2&2\\1&1&1&2\\0&1&2&1\end{pmatrix}.$$

Solution. We start by expanding along the first column, which contains the greatest number of zeroes (one). Gradually we get
$$\begin{vmatrix}1&3&5&6\\1&2&2&2\\1&1&1&2\\0&1&2&1\end{vmatrix} = 1\cdot\begin{vmatrix}2&2&2\\1&1&2\\1&2&1\end{vmatrix} - 1\cdot\begin{vmatrix}3&5&6\\1&1&2\\1&2&1\end{vmatrix} + 1\cdot\begin{vmatrix}3&5&6\\2&2&2\\1&2&1\end{vmatrix} = -2 - 2 + 6 = 2,$$
using the Sarrus rule for the three-dimensional determinants. □
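The defining formula from 2.14 can now be evaluated literally, summing over all $n!$ permutations with the sign from 2.15. A short Python sketch (practical only for small $n$, since the sum has $n!$ members):

```python
from itertools import permutations
from math import prod

def sgn(p):   # parity via the inversion count, as in 2.15
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
              if p[i] > p[j])
    return -1 if inv % 2 else 1

def det(A):   # |A| = sum over sigma of sgn(sigma) a_{1,sigma(1)}...a_{n,sigma(n)}
    return sum(sgn(p) * prod(A[i][p[i]] for i in range(len(A)))
               for p in permutations(range(len(A))))

print(det([[1, 3, 5, 6], [1, 2, 2, 2], [1, 1, 1, 2], [0, 1, 2, 1]]))  # 2 (ex. 2.22)
```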
If we define for a given permutation a relation r such that two elements x, y e X are in relation if and only if ar(x) — y for some iteration of the permutation a, then clearly it is an equivalence relation (check it carefully!). Because X is finite set, for some £ it must hold that a1 (x) — x. If we pick one equivalence class {x, er(x),..., ol~l (x)} c X and define other elements to be fixed-points, we obtain a cycle. Evidently, the original permutation X is then composition of all these cycles for individual equivalence classes and it does not matter in which order we compose the cycles. For determining the parity we just have to note that cycles of even length can be written as an odd number of transposition, therefore their parity is — 1. Analogously, cycle of odd length can be obtained using an even number of transpositions and therefore it has parity 1. 2.17. Simple properties of determinant. Knowing the proper-|gu ties of permutations and their parities from previous r-xsg-Nj^P- paragraphs allows us to derive quickly some basic properties of determinant. For every matrix A — (ai;) of the type m/n over scalars from K we define transpose of A. It is a matrix AT — (a'..) with elements a?. — which is of the type n/m. Square matrix A with the property A — AT is called symmetric. If it holds that A — —AT, then A is called antisymmetric. _ Simple properties of determinant _. Theorem. For every square matrix A = (flij) the following claims hold: (1) \AT\ = \A\ (2) If one of the rows contains only zero elements from K, then \A\ =0. (3) If a matrix B was obtained from A by transposing two rows, then\A\ = -\B\. (4) If a matrix B was obtained from A by multiplying a row by a scalar a € K, then \ B\ — a \A\. (5) If all elements of the k-th row in A are of the form ay = + bkj and all remaining rows in the matrices A, B — (bij), C — (cij) are identical, then \A\ = \B\ + \C\. (6) Determinant \A \ does not change if we add to any row of A a linear combination of other rows. 86 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA Together we obtain the following condition for a: a4 — a2 + 1 = 0. Substituting t = a2 we have t2 — t + 1 with roots t\ = l+l2^ = cos(7r/3) + i sin(7r/3), h = x~\^ = cos(7r/3) — i sin(7r/3) = cos(—7r/3) + i sin(—7t/3), from where we obtain four possible values for the parameter a: a\ = cos(7r/6) + i sin(7r/6) = V3/2 + i/2, a2 = cos(77r/6) + i sin(77r/6) = — V3/2 — i/2, a3 = cos(—it/6) + i sin(—it/6) = V3/2 — i/2, a4 = cos(57r/6) + i sin(57r/6) = -V3/2 + i/2. □ 2.24. Vandermonde determinant. Prove the formula for the so-called Vandermonde determinant, that is, determinant of the Vandermonde matrix: 1 1 . 1 CL\ a2 . V n = a\ a2 . ■ al \ i. Solution. We show a really beautiful proof by induction, which fills the heart §, of any mathematician with supreme joy. Consider the determinant V„ to be a polynomial P in the variable a„. From the v definition of the determinant it follows that this polynomial is of degree n — 1 in this variable and that the numbers Cl\,. .. ,Cln — \ are its roots: if we substitute in the Vandermonde matrix V„ into the last column formed by the powers of a„ any of the previous columns formed by the powers of the number at, the value of this changed determinant is actually the value of the Vandermonde determinant (seen as the polynomial in the variable a„) at the point at. However, that determinant is clearly zero, because determinant of matrix with two identical, that is, linearly dependent columns is zero. That means that at is a root of P. 
Thus we have found $n - 1$ roots of a polynomial of degree $n - 1$, so this must be the complete list of its roots, and $P$ must be of the form
$$P = C(a_n - a_1)(a_n - a_2)\cdots(a_n - a_{n-1}),$$
where $C$ is some constant, the leading coefficient of the polynomial $P$. If we consider the computation of the determinant $V_n$ by expansion along the last column, we see that $C$, the coefficient at $a_n^{\,n-1}$, is exactly $V_{n-1}$. Since clearly $V_2 = a_2 - a_1$, the claim for $V_n$ follows by induction. □

Alternative solution: see the hints and solutions to the exercises.

Proof. (1) The members of the two determinants are in bijective correspondence. To the member $\operatorname{sgn}(\sigma)a_{1\sigma(1)}\cdot a_{2\sigma(2)}\cdots a_{n\sigma(n)}$ of $|A|$ there corresponds in $|A^T|$ the member (the order of the scalars in the product does not matter)
$$\operatorname{sgn}(\sigma)a_{\sigma(1)1}\cdot a_{\sigma(2)2}\cdots a_{\sigma(n)n} = \operatorname{sgn}(\sigma)a_{1\sigma^{-1}(1)}\cdot a_{2\sigma^{-1}(2)}\cdots a_{n\sigma^{-1}(n)},$$
and we have to ensure that this member has the correct sign. Since the parities of $\sigma$ and $\sigma^{-1}$ are the same, it really is a member of the determinant $|A^T|$, and the first claim is proven.
(2) This comes straight from the definition, because every member of the determinant contains exactly one element from every row. If one of the rows is zero, all the members of the determinant are zero.
(3) In all the members of $|B|$ the only change in comparison with $|A|$ is the addition of one transposition, therefore all the signs are reversed.
(4) This is straight from the definition, because the members of $|B|$ are the members of $|A|$ multiplied by the scalar $a$.
(5) In every member of $|A|$ there is exactly one element from the $k$-th row of the matrix $A$. Thanks to the distributive law for multiplication and addition in $\mathbb{K}$, the claim follows directly from the definition of the determinant.
(6) If there are two identical rows in $A$, then among the members of the determinant there are always two identical up to the sign; therefore in this case $|A| = 0$. Thanks to claim (5) we can add any other row to a given row without changing the value of the determinant, and combining claims (4) and (5), we can even add a scalar multiple of any other row, and hence any linear combination of the other rows. □

2.18. Corollaries for computation. Thanks to the previous theorem, we can bring every square matrix $A$ into row echelon form using elementary row transformations without changing the value of its determinant. We just have to be careful to add to rows only linear combinations of the other rows.

___ Computing determinants using elimination ___
If the matrix $A$ is in row echelon form, then in every member of $|A|$ at least one chosen element lies below the diagonal, except for the single member collecting all the diagonal elements. Therefore the diagonal member is the only non-zero one, and the determinant of a matrix in row echelon form is
$$|A| = a_{11}\cdot a_{22}\cdots a_{nn}.$$
The previous theorem thus gives a very effective method for computing determinants using the Gaussian elimination method, see paragraph 2.7.

Let us note a nice corollary of the first claim of the previous theorem, about the equality of the determinant of a matrix and of its transpose. It ensures that whenever we prove a claim about determinants formulated in terms of the rows of the corresponding matrix, we immediately obtain an analogous claim in terms of the columns. For instance, we can immediately formulate all the claims (2)-(6) for columns; e.g. the determinant does not change under the addition of linear combinations of the other columns.
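The elimination method of 2.18 is far cheaper than the $n!$-member sum, and it only has to track sign changes from row switches. A Python sketch with exact `fractions` arithmetic (our own implementation of the boxed method):

```python
from fractions import Fraction

def det(A):
    # reduce to echelon form: row additions keep |A|, row switches flip the sign
    M = [[Fraction(x) for x in row] for row in A]
    n, sign = len(M), 1
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    result = Fraction(sign)
    for i in range(n):
        result *= M[i][i]          # product of the diagonal of the echelon form
    return result

print(det([[1, 3, 5, 6], [1, 2, 2, 2], [1, 1, 1, 2], [0, 1, 2, 1]]))  # 2
```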
2.25. Find out whether the matrix
$$A = \begin{pmatrix}3&2&-1&2\\4&1&2&-4\\-2&2&4&1\\2&3&-4&8\end{pmatrix}$$
is invertible.

Solution. A matrix is invertible (that is, there is an inverse matrix) whenever we can transform it by elementary row transformations into the unit matrix. That is equivalent, for instance, to the property that it has a non-zero determinant. We can compute the determinant using the Laplace theorem (2.32) by expanding, for instance, along the first row:
$$|A| = 3\cdot\begin{vmatrix}1&2&-4\\2&4&1\\3&-4&8\end{vmatrix} - 2\cdot\begin{vmatrix}4&2&-4\\-2&4&1\\2&-4&8\end{vmatrix} + (-1)\cdot\begin{vmatrix}4&1&-4\\-2&2&1\\2&3&8\end{vmatrix} - 2\cdot\begin{vmatrix}4&1&2\\-2&2&4\\2&3&-4\end{vmatrix}$$
$$= 3\cdot 90 - 2\cdot 180 + (-1)\cdot 110 - 2\cdot(-100) = 0,$$
that is, the given matrix is not invertible. □

We can use the column versions of these claims for deriving the following formula for the direct calculation of the solutions of systems of linear equations:

___ Cramer rule ___
Consider a system of $n$ linear equations in $n$ variables with the matrix of the system $A = (a_{ij})$ and the column of values $b = (b_1, \dots, b_n)$; in matrix notation we are solving the equation $A\cdot x = b$. If the inverse $A^{-1}$ exists, then the individual components of the unique solution $x = (x_1, \dots, x_n)$ are given by
$$x_i = |A_i|\cdot|A|^{-1},$$
where the matrix $A_i$ arises from the matrix of the system $A$ by exchanging the $i$-th column for the column $b$ of values.

Really, as we have already seen, the inverse of the matrix of the system exists if and only if the system has a unique solution. If we have such a solution $x$, we can substitute into the matrix $A_i$, in place of the column $b$, the corresponding linear combination of the columns of the matrix $A$, that is, the values $b_i = a_{i1}x_1 + \cdots + a_{in}x_n$. Then, by subtracting the $x_j$-multiples of all the other columns, only the $x_i$-multiple of the original $i$-th column of $A$ remains in the $i$-th column. The number $x_i$ can thus be brought in front of the determinant, yielding the equation
$$|A_i|\cdot|A|^{-1} = x_i|A|\cdot|A|^{-1} = x_i,$$
which is the claim.

Let us further note that the properties (3)-(5) of the previous theorem say that the determinant, as a mapping which assigns a scalar to $n$ vectors of dimension $n$ (the rows or columns of the matrix), is an antisymmetric mapping linear in every argument, exactly as we required in analogy to the two-dimensional case.
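The Cramer rule is a one-liner once a determinant routine is available. A sketch, assuming Python with the permutation-sum `det` from earlier and exact `fractions` arithmetic; the test system is the small triangular one solved in this chapter:

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def sgn(p):
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def det(A):
    return sum(sgn(p) * prod(A[i][p[i]] for i in range(len(A)))
               for p in permutations(range(len(A))))

def cramer(A, b):
    d = Fraction(det(A))
    def A_i(i):   # exchange the i-th column for the column b
        return [row[:i] + [bi] + row[i + 1:] for row, bi in zip(A, b)]
    return [Fraction(det(A_i(i))) / d for i in range(len(A))]

# x1 + 2*x2 + 3*x3 = 2,  x2 + x3 = 1,  x3 = -1
print(cramer([[1, 2, 3], [0, 1, 1], [0, 0, 1]], [2, 1, -1]))
# [Fraction(1, 1), Fraction(2, 1), Fraction(-1, 1)]
```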
2.19. Further properties of the determinant. Later we will see that, exactly as in dimension 2, the determinant of a matrix equals the (oriented) volume of the parallelepiped determined by the columns of the matrix. We shall also see that, considering the mapping $x \mapsto A\cdot x$ given by a square matrix $A$ over $\mathbb{R}^n$, we can view the determinant of this matrix as the ratio between the volume of the parallelepiped given by vectors $x_1, \dots, x_n$ and the volume of the one given by their images $A\cdot x_1, \dots, A\cdot x_n$. Because the composition of mappings $x \mapsto A\cdot x \mapsto B\cdot(A\cdot x)$ corresponds to matrix multiplication, the so-called Cauchy theorem is easy to understand:

___ Cauchy theorem ___
Theorem. Let $A = (a_{ij})$, $B = (b_{ij})$ be square matrices of dimension $n$ over the ring of scalars $\mathbb{K}$. Then $|A\cdot B| = |A|\cdot|B|$.

Note that from the Cauchy theorem, and from the representation of the elementary row transformations by multiplication by suitable matrices (see 2.8), we immediately obtain the claims (2), (3) and (6) of theorem 2.17. We now derive this theorem in a purely algebraic way, also because the previous argumentation based on geometric intuition could hardly work for arbitrary scalars. The basic tool is the so-called expansion of the determinant along one or more rows or columns. We will also need a little technical preparation. A reader not fond of too much abstraction can skip these parts and absorb only the statement of the Laplace theorem and its corollaries.

E. Systems of linear equations for the second time

We have already encountered systems of linear equations at the beginning of the chapter. Now we will deal with them in more detail. Let us first use the advantage of computing the solution of a system of linear equations via the inverse of the matrix.

2.26. Participants of a trip. There were 45 participants of a two-day bus trip. On the first day, the fee for a watchtower was €30 for an adult, €16 for a child and €24 for a senior; in total, the fee was €1116. On the second day, the fee for a bus with a palace and botanical garden tour was €40 for an adult, €24 for a child and €34 for a senior; in total, the fee was €1542. How many adults, children and seniors were there among the participants?

Solution. Let us introduce the variables: $x$ giving the "number of adults", $y$ giving the "number of children", $z$ giving the "number of seniors". There were 45 participants, therefore
$$x + y + z = 45.$$
The total fees for the entry into the watchtower and for the second day, expressed in our variables, give $30x + 16y + 24z$ and $40x + 24y + 34z$ respectively. But we know the actual values (€1116 and €1542). Thus we have
$$30x + 16y + 24z = 1116,\qquad 40x + 24y + 34z = 1542.$$
We write the system of three linear equations in the matrix notation as
$$\begin{pmatrix}1&1&1\\30&16&24\\40&24&34\end{pmatrix}\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}45\\1116\\1542\end{pmatrix}.$$
The solution is
$$\begin{pmatrix}x\\y\\z\end{pmatrix} = \begin{pmatrix}1&1&1\\30&16&24\\40&24&34\end{pmatrix}^{-1}\begin{pmatrix}45\\1116\\1542\end{pmatrix} = \begin{pmatrix}22\\12\\11\end{pmatrix}.$$
Expressed in words, there were 22 adults, 12 children and 11 seniors. □

2.27. Using the inverse matrix, compute the solution of the system
$$\begin{aligned}
x_1 + x_2 + x_3 + x_4 &= 2,\\
x_1 + x_2 - x_3 - x_4 &= 3,\\
x_1 - x_2 + x_3 - x_4 &= 3,\\
x_1 - x_2 - x_3 + x_4 &= 5. \;\bigcirc
\end{aligned}$$

But what if the matrix of the system is not invertible? Then we cannot use the inverse matrix for solving the system; such a system has either no solution or more than one. As the reader may know, a system of linear equations has either no solution, exactly one solution, or infinitely many solutions (for instance, it cannot have exactly two solutions). The space of solutions is either a vector space (in the case when the right-hand side of the system is zero, we speak of a homogeneous system of linear equations) or an affine space, see 4.1 (in the case when the right-hand side of at least one of the equations is non-zero, we speak of a non-homogeneous system of linear equations). We demonstrate the possible types of solutions of a system of linear equations by examples.

2.20. Minors of the matrix. When investigating matrices and their properties we often work only with parts of the matrices. Therefore we need some new notions.

___ Submatrices and minors ___
Let $A = (a_{ij})$ be a matrix of the type $m/n$, and let $1 \le i_1 < \cdots < i_k \le m$, $1 \le j_1 < \cdots < j_l \le n$ be fixed natural numbers. Then the matrix
$$M = \begin{pmatrix}a_{i_1j_1}&\cdots&a_{i_1j_l}\\ \vdots& &\vdots\\ a_{i_kj_1}&\cdots&a_{i_kj_l}\end{pmatrix}$$
of the type $k/l$ is called the submatrix of the matrix $A$ determined by the rows $i_1, \dots, i_k$ and the columns $j_1, \dots, j_l$. The remaining $(m - k)$ rows and $(n - l)$ columns determine a matrix $M^*$ of the type $(m-k)/(n-l)$, which is called the complementary submatrix to $M$ in $A$. When $k = l$ we define $|M|$, which is called a subdeterminant or a minor of order $k$ of the matrix $A$. If $m = n$ and $k = l$, then $M^*$ is also square, and $|M^*|$ is called the minor complement of $|M|$, or the complementary minor of the submatrix $M$ in the matrix $A$. The scalar
$$(-1)^{i_1+\cdots+i_k+j_1+\cdots+j_k}\cdot|M^*|$$
is called the algebraic complement of the minor $|M|$. Submatrices formed by the first $k$ rows and columns are called leading principal submatrices, and their determinants leading principal minors of the matrix $A$. If we choose $k$ sequential rows and columns starting with the $i$-th row, we speak of principal submatrices and principal minors. Specially, when $k = l = 1$ and $m = n$, we call the corresponding algebraic complement the algebraic complement $A_{ij}$ of the element $a_{ij}$ of the matrix $A$.
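The submatrix bookkeeping of 2.20 is easy to get wrong by hand. A minimal Python sketch, with 0-based index sets (our convention, not the book's):

```python
# keep the chosen rows/columns (M), or drop them (the complementary M*)
def submatrix(A, rows, cols):
    return [[A[i][j] for j in sorted(cols)] for i in sorted(rows)]

def complementary(A, rows, cols):
    rr = [i for i in range(len(A)) if i not in rows]
    cc = [j for j in range(len(A[0])) if j not in cols]
    return [[A[i][j] for j in cc] for i in rr]

A = [[3, 2, -1, 2], [4, 1, 2, -4], [-2, 2, 4, 1], [2, 3, -4, 8]]
print(submatrix(A, {0, 1}, {0, 1}))      # [[3, 2], [4, 1]]
print(complementary(A, {0, 1}, {0, 1}))  # [[4, 1], [-4, 8]]
```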
Specially, when k — £ — 1, m — nwe call the corresponding complementary minor the algebraic complement A^ of the element atj of the matrix A. 2.21. Laplace determinant expansion. If the principal minor \M\ of the matrix A is of the order k, then directly from the definition of the determinant we see that each of the individual k\(n — k)\ members in the product of \M\ with its algebraic complement is a member of \A\. In general, consider a submatrix M, that is, a square matrix given by the rows i\ < 12 < ■ ■ ■ < ik and columns j\ < ■■■ < j^. Then using — 1) + • • • + (ik — k) exchanges of neighbouring rows and (ji — 1) + • • • + (jk — k) exchanges of neighbouring columns in A we can transform this submatrix M into a principal submatrix and the complementary matrix gets transformed into its complementary matrix. The whole matrix A gets transformed into a matrix B for which it holds thanks to 2.17 and the definition of determinant that \B\ — (-1)"|A|, where a = Yji=\(lh - jh) -2(1 + • • • + k). Therefore we have checked: Proposition. If A is a square matrix of dimension n and \ M\ is its minor of the order k < n, then the product of any member of\M\ with any member of its algebraic complement is a member of\A\. This claim suggests the intuition than using some products of smaller determinants we could express the determinant of the matrix itself. We see that \A\ contains exactly n \ distinct members, exactly one for each permutation. These members are mutually distinct as polynomials in elements of (a general indeterminate) 89 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA 2.28. For what values of parameters a,b sR has the system of linear equations xx X\ X\ + + (1 (1 ax2 a)x2 a)x2 2x3 + axj, b-3, 2b - 1 (a) exactly one solution; (b) no solution; (c) at least 2 solutions? Solution. We rewrite it, as usual, in the extended matrix, and transform: a a 0 a 2b - 1 -2 2 a + 2 1 —a -2 b 0 1 2 -3 0 0 a b + 2 2.29. Determine the number of solutions for the systems (a) V5x2 \2xx + + HX3 -9, (b) (c) X\ - 5x3 = -9, X\ + 2X3 = -7; Ax i + 2x2 _ 12X3 = 0, 5xi + 2x2 - x3 = 0, —2x\ — x2 + 6x3 = 4; 4xi + 2x2 _ 12X3 = 0, 5xi + 2x2 - x3 = 1, —2xi — x2 + 6x3 = 0. matrix A. If we can show that there are exactly that many mutually distinct expressions from the previous claim, we obtain the determinant \A \ from their sum. It remains to show that the members of the product | M\ ■ \ M* \ contain exactly n! distinct members from | A \. >From the chosen k rows we can choose minors M and using the previous lemma each of the k\(n — k) \ members in the products of \M\ with their algebraic complements is a member of \A\. But for distinct choices of M we can never obtain the same members and the individual members in (— \)l\+'"+lk+h+---+ii . \M\ ■ \M*\ are also mutually distinct. Therefore we have exactly the required number k! (n — k)! (^) — n! of members. Thus we have proven: ___J Laplace theorem J___ At the first step we subtract the first row from the second and the third; and at the second step we subtract the second from the third. We see that the system has a unique solution (determined by backward elimination) if and only if a 7^ 0. For a = 0, the third column is a zero column. If a = 2 and b = —2, we have a zero row, and choosing x3 € R as a parameter gives infinitely many distinct solutions. For a = 0 and b ^ —2 the last equation a = b + 2 cannot be satisfied and the system has no solution. 
Let us note that for a = 0, b = — 2 the solutions are (xi, x2, x3) = (-2 + It, -3 - It, t) , t € R and for a ^ 0 the unique solution is the triple -3a2 - ab - 4a +2b +4 2b + 3a + 4 b + 2 □ Theorem. Let A — (aij) be a square matrix of dimension n over arbitrary ring of scalars with k rows fixed. Then \A\ is a sum of all (£) products (-\)h+~+ik+h+~+ii. \m\ • |M* | of minors of the order k chosen among the fixed rows with their algebraic complements. Laplace theorem transforms the computation of \A\ into the computation of determinants of lower dimension. This method of computation is called Laplace expansion by the chose rows (or columns). For instance, the expansion of the z-th row or j-th column is: J2a where Ay denotes the algebraic complement of the element ay (that is, minor of order one). In practical computation it is often efficient to combine the Laplace expansion with a direct method of linear combination addition. 2.22. Proof of Cauchy theorem. The theorem is based on a |gu clever but elementary application of the Laplace theorem. We just use the Laplace expansion twice on particular positions in the matrix. Let us first consider the following matrix H of the dimension In (we are using the so-called block symbolics, that is, we write the matrix as if composed of the (sub)matrices A, B, and so on). (a\ H A 0 -E B -1 V 0 «In 0 -1 0 b\\ b„i 0 \ 0 bin bun / Laplace expansion of the first n rows gives us \H\ — \A \ ■ \B\. Now we add sequentially to the last n rows linear combinations of the first n columns in order to obtain a matrix with zeros in the 90 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA Solution. Vectors (1, 0, —5), (1, 0, 2) are clearly linearly independent (they are not multiples of each other) and the vector (12, V5, 11) cannot be their linear combination (its second coordinate is non-zero), therefore the matrix whose rows are these three linearly independent vectors is invertible. Thus the system for the case (a) has exactly one solution. For the cases (b) and (c) it is enough to note that (4,2, -12) = -2(-2, -1, 6). In the case (b) adding the first equation to the third multiplied by two gives 0 = 8, no solution for the system; in the case (c) the third equation is a multiple of the first - the system has clearly infinitely many distinct solutions. □ 2.30. Find (any) linear system, whose set of solutions is exactly {(t + 1, It, 3t, At); t e R}. Solution. Such a system is for instance 2x\ — x2 = 2, 2x2 — x4 = 0, 4x3 — 3x4 = 0. These solutions are satisfied exactly for every t e R and vectors (2,-1,0,0), (0,2,0,-1), (0,0,4,-3) giving the left-hand sides of the equations are clearly linearly independent (the set of solutions contains a single parameter). □ 2.31. Determine the rank of the matrix / 1 -3 0 1 \ 1 -2 2-4 1-10 1 \-2 -1 1 -2/ Then determine the number of solutions of the system of linear equations xl + x2 + x3 —3xi — 2x2 — x3 + 2x2 Xi — 4X2 + x3 and all solutions of the system xl + x2 + x3 —3x\ — 2x2 — x3 + 2x2 Xi — 4X2 + x3 and of the system — 2x4 — X4 + x4 — 2x4 — 2x4 — X4 + x4 — 2x4 4, 5, 1, 3 0, 0, 0, 0 Xl Xl Xi -2xi 3x2 2X2 + 2x3 x2 x2 + x3 1, -4, 1, -2. bottom right corner. We obtain /flll ... fll„ en K -1 0 Cnl 0 C In \ 0 \0 -1 0 ... 0 / The elements of the submatrix on the top right part must satisfy cij = anbij + ai2b2j H-----h ainbnj, that is, they are exactly the members of the product AB and | K\ — \H\. The expansion of the last n columns gives us = (-\)"+1+-+2"\A ■ B\ = (_i)2"-(»+D \A-B\ = \A-B\. This proves the Cauchy theorem. 2.23. 
Determinant and the inverse matrix. Assume first that there is an inverse matrix of the matrix A, that is, A ■ A~l — E. Since the unit matrix always satis-4*5^» fies I E\ — 1, for every invertible matrix we have that \A\ is an invertible scalar and thanks to the Cauchy theorem we have \A~1\ — |A|_1. But we can say more, combining the Laplace and Cauchy theorem. ___J Inverse matrix determinant formula J___ For any square matrix A -matrix A* = (a*), where a* of the elements in A. The matrix A* is called algebraically adjoint matrix of the matrix A. Theorem. For every square matrix A over a ring of scalars K we have that (2.2) A A* = A* A = \A\ • E. Notably, (1) A-1 exists as a matrix over a ring of scalars K if and only if IAI ~1 exists in K. (2) If A~l exists, then A~l (flij) of dimension n we define a = Aji are algebraic complements |A| A*. Proof. As we have already mentioned, Cauchy theorem shows that the existence of A-1 implies the invertibility of \A\ e K. For arbitrary square matrix A we can directly compute A ■ A* — (cij), where n n Cij = ^^atkalj = ^2aikAjk- k=\ k=\ If i — j it is exactly the Laplace expansion of \A \ of ;-th row. If i / j it is expansion of determinant of the matrix where the ;-th and j-th row is the same, therefore — 0. This implies that A ■ A* — IAI • E and we have proven the equality (2.2). Let us further assume that \A\ is an invertible scalar. If we repeat the previous computation for A * ■ A, we obtain | A \ ~1A * ■ A — E. Therefore our computation really gives the inverse matrix of A, as claimed in the theorem. □ 91 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA Solution. Because det A = —10, that is, non-zero, the columns of A are linearly independent, and thus the rank equals to the dimension. The first of the three given system is given by the extended matrix / 1 1 1 -2 4 \ -3 -2 -1 -1 5 0 2 0 1 1 V 1 -4 1 -2 3 ) But the left-hand side is exactly AT with determinant \AT\ = \A\ ^0. Therefore there exists a matrix (AT) 1 and the system has a unique solution (jci, x2, jc3, x4)T = {AT)~l ■ (4, 5, 1, 3)T . The second of the systems has the same left-hand side (given by the matrix AT) as the first. Because the numbers on the right-hand side of the equations in the system do not influence the number of solutions and because every homogeneous system has a zero solution, the only solution of the second system is given by (jci, x2, x3, x4) = (0,0, 0, 0). The third system is given by the extended matrix / 1 1 1 V -2 -3 0 -2 2 -1 0 -1 1 1 \ -4 1 "2/ which is the matrix A (only the last column is given after the vertical bar). If we try to simplify the matrix into the row echelon form, we must obtain a row ( 0 0 0 | a ) , kde a ^ 0. We know, that the column on the right-hand side is not a linear combination of the columns on the left-hand side (the rank of the matrix is 4). This system thus has no solution. □ 2.32. Let be given. Find real numbers bi,b2,b3 such that the system of linear equations A • x = b has: (a) infinitely many solutions; (b) unique solution; (c) no solution; (d) exactly four solutions. Solution. For the readers it is definitely no problem to find correct values in the cases a) and c) (it is enough to choose b\ = b2 + b3 in the case As a direct corollary of this theorem we can once again prove the Cramer rule for solving the systems of linear equations, see 2.18. 
Really, for the solution of the system A - x — b we just need to read in the equation x — A-1 • b — \A\~lA* -b the last expression as the Laplace expansion of the determinant of the matrix At which arose through the exchange of the ;-th column of A for the column b. 3. Vector spaces and linear mappings 2.24. Abstract vector spaces. Let us go back for a while to the systems of m linear equations of n variables from 2.3 and let us further assume that the system is homogeneous A ■ x — 0, that is, \am\ a\n\ (x\\ /0\ 7 VW Thanks to the distributivity of the matrix multiplication it is clear that the sum of two solutions x — (x\,..., x„) and y — (yi,..., yn) satisfies A-ix + y) A-y = 0 and is thus also a solution. Similarly, a scalar multiple a ■ x is also a solution. The set of all solutions of a fixed system of equations is therefore closed on vector addition and scalar multiplication. That are the basic properties of vectors of dimension n in K", see 2.1. Now we have the vectors in the solution space with n coordinates and the "dimension" of this space is given by the difference of the number of variables and the rank of the matrix A. Thus we can easily have with the solution of 1000 coordinates only one or two free parameters. Thus the whole solution space will behave as a plane or a line, as we have already seen in 1.25 at the page 25. Already in the paragraph 1.9 we have encountered a more interesting example of a space of all solutions of a homogeneous linear difference equation of first order. All solutions have been obtained from a single one by scalar multiplication and are also closed under addition and scalar multiples. These "vectors" of solutions are infinite sequences of numbers, although we intuitively expect that the "dimension" of the whole space of solutions should be one. Therefore we need a more general definition of vector space and its dimension: | Vector space definition [__^ Vector space V over field of scalars K is a set where we define the operations • addition, which satisfies the axioms (KG1)-(KG4) from the paragraph 1.1 on the page 4, • scalar multiplication, for which the axioms axioms (V 1)-(V4) from the paragraph 2.1 on the page 72 hold. Let us remind our simple notational convention: scalars are usually denoted by letters from the beginning of the alphabet, that is, a, b, c,..., while for vectors we shall use letters from the end, that is, u, v, w, x, y, z. Usually, x, y, z will denote n -tuples of 92 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA a) and b\ ^ b2 + b3 in the case c)). Let us further note that \A\ = 0, thus the system either has infinitely many or no solution. In general, the set of solutions of a homogeneous system of linear equations is a vector space, thus the variant d) is a priori excluded. The variant b) is possible only for a system with a regular matrix (the only solution is then a zero vector). □ 2.33. Solve the system of homogeneous linear equations given by the matrix /o V2 73 76 0 \ 2 2 73 -2 -75 0 2 75 273 -73 \3 3 73 -3 0 / o 2.34. Determine all solutions of the system 2.35. Solve x2 + x4 = 1, 3x\ — 2x2 - 3x3 + 4x4 = - 2, X\ + x2 - x3 + x4 = 2, XX x3 1. 3x — 5y + 2u + Az = 2, 5x + ly — Au - 6z = 3, Ix - Ay + + 3z = O o 2.36. Decide whether the system of linear equations + 3xi 2x\ 2x\ 3x\ + + 3x2 3x2 3x2 2x2 + + x3 x3 x3 x3 1, 8, 4, of three variables x\, x2, x3 is solvable. O 2.37. Determine the number of solutions of 2 systems of 5 linear equations AT - x (1,2, 3,4, 5)2 (1, 1, 1, 1, 1) where X — (X\, x2, x3) o 2.38. 
Determine the solution of the system of linear equations + Ax2 +2 x3 = 0, Li + 3x2 — x3 = 0, ax i 2x\ scalars. For completeness, the letters from the middle of the alphabet, for instance i, j,k,£, will mostly denote indices in expressions. In order to gain some practice in formal approach, we check simple properties of vectors, which are trivial for n -tuples for scalars, but not so evident for general vectors. 2.25. Proposition. Let V be a vector space over a field of scalars K, further take a, b, a, e K, and vectors u, v, uj e V. Then (1) a ■ u — 0 if and only if a — 0 or u — 0, (2) (—1) • u — —u, (3) a ■ (u — v) — a ■ u — a ■ v, (4) (a — b) ■ u — a ■ u — b ■ u, (5) (E?=i «i) ' (T,J=i "]) = E?=i T,J=i Oi ■ "J- Proof. We can expand 0 • u — a ■ u which according to the axiom (KG4) ensures 0 • u — 0. Now « + (-!)•« (=2) (1 + (-1)) • w = 0 • u = 0 and thus — u — (— 1) • u. Further, (V2, V3) a ■ (u + (—1) • v) = a ■ u + (—a) ■ v — a ■ u — a ■ v, (a +0) ■ u — a ■ u Which proves (3). It holds that (a (V2, V3) b) ■ u — a ■ u + (—b) ■ u — a ■ u — b ■ u which proves (4). The property (5) follows using induction with (V2) and (VI). It remains to prove (1): a-0 — a - (u — u) — a-u — a-u — 0, which along with the first derived proposition in this proof proves one implication. For the other implication we first need the axiom of the field for scalars and the axiom (V4) for vector spaces: if p ■ u — 0 and p / 0, then u — 1 • u — (p~l ■ p) ■ u — p~l ■ 0 — 0. □ 2.26. Linear (in)dependence. In the paragraph 2.11 we have worked with the so-called linear combinations of rows of a matrix. With general vectors we will work analogously: __J Linear combinations and independence |___ Expression of the form a\ ■ vi + ■ ■ ■ + ■ is called linear combination of vectors ui, ..., e V. Finite sequence of vectors v\, ..., is called linearly independent, if the only zero linear combination is the one with all coefficients zero, that is, for scalars a\,..., e K holds that ai ■ vi H-----h ak ■ vk — 0 a\ — ü2 — ■ ■ ■ — ak — 0. It is clear that in independent sequence of vectors all vectors are mutually distinct and nonzero. The set of vectors M c V in vector space V over K is called linearly independent, if every finite &-tuple of vectors v\, ... ,Vk e M is linearly independent. The set of vectors M is linearly dependent, if it is not linearly independent. 93 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA depending on the parameter aeK. 2.39. Depending on the parameter a e solutions of the system /4 2 3 V6 1 3 2 a \ (x\\ *2 5 2 ■8/ x3 \x4J O determine the number of /2\ 5 3 v-v o 2.40. Decide whether there is a system of homogeneous linear equations of three variables whose set of solutions is exactly (a) {(0, 0, 0)}; (b) {(0,1,0), (0,0,0), (1,1,0)}; (c) {(jc, 1,0); x € R}; (d) {(x,y,2y); x,y elj. O 2.41. Solve the system of linear equations, depending on the real parameters a, b. x + 2y + bz = a x - y + 2z = 1 3x-y = 1. O 2.42. Find the algebraically adjoint matrix and the inverse of the matrix (\ 0 5 0\ 4 0 \0 7 0 8/ Solution. The adjoint matrix is An A3i \A4i An A22 A32 A42 An A23 A33 A43 Au\ A24 A34 A44/ where Atj is the algebraic complement of the element atj of the matrix A, that is, the product of the number (—l)i+i and the determinant of the matrix given by A without the z'-th row and 7-th column. We have 0 0 4 -24, Al2 = - 3 0 4 0 6 0 7 0 8 5 0 0 6 0 A43 0 0 3 4 0 0 0, 144 0,... ■12. 
Directly from the definition we have that a nonempty subset M „^mL- °f vectors from a vector space over a field of scalars ifl^£ K is dependent if and only if one of its vectors can be f^few^ expressed as a finite linear combination using other 'ffsr*%5s-L vectors in M. Really, at least one of the coefficients in the corresponding zero linear combination must be nonzero, and since we are over a field of scalars, we can multiply whole combination by the inverse of this nonzero coefficient and thus express its corresponding vector using others. Every subset of a linearly independent set M is clearly also linearly independent (we require the same conditions on a smaller set of vectors). Similarly, we can see that M c V is linearly independent if and only if every finite subset of M is linearly independent. 2.27. Generators and subspaces. Subset M c Vis called vector sub space if it along with restricted operations of ad-rt dition and scalar multiplication forms a vector space. That is, we require Va, b e K, Vi>, w e M, a ■ v + b ■ w e M. Let us investigate a couple of cases: The space of m-tuples of scalars Rm with coordinate-wise addition and multiplication is a vector space over R, but also a vector space over Q. For instance for m — 2, the vectors (1, 0), (0, 1) e R2 are linearly independent, because from a- (1,0)+b- (0, 1) = (0, 0) follows a = b = 0. Further, the vectors (1,0), (V2, 0) e M2 are linearly dependent over R, because V2 • (1, 0) = (V2, 0), but over Q they are linearly independent! Over R these two vectors "generate" one-dimensional subspace, while over Q the subspace is "bigger". Polynomials of degree at most m form a vector space Mm[x]. We can see the polynomials as mappings / : R —>• R and define the addition and scalar multiplication like this: (/ + g)(x) — f(x) + g(x), (a ■ f)(x) — a ■ f(x). Polynomials of any degree also form a vector space Moo M and Rm [x] c R„ [x] is a vector subspace for any m < n < 00. Subspaces are also for instance all even polynomials or odd polynomials, that is, polynomials satisfying f(-x) = ±f(x). In a complete analogy as with polynomials we can define a structure of vector space on a set of all mappings R —>• R or of all mappings M —>• V of an arbitrary fixed set M into the vector space V. Because the condition in the definition of subspace consists only of universal quantifiers, the intersection of subspaces is still a subspace. We can clearly see it also directly: Let Wi, i e I, be vector subspaces in V, a, b e K, u,v e P 1 nieIWi. Then for all i e I, a ■ u + b ■ v e Wt, but that means that a ■ u + b ■ v e nieI Wi. Notably, the intersections (M) of all subspaces W C V that contain some given set of vectors M c V is a subspace. We say that a set M generates the subspace (M), or that the elements of M are generators of the subspace (M). Let us again formulate a few simple claims about subspace generation: Proposition. For every nonempty set M C V we have that 94 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA A* By plugging in we obtain /-24 0 20 0 \ 0 -32 0 28 8 0-4 0 \ 0 16 0 -12/ /-24 0 8 0 \ 0 -32 0 16 20 0 -4 0 \ 0 28 0 -12/ We compute the inverse matrix A~l from the relation A~l = \A\~l - A*. Determinant of the matrix A is (expanding the first row) equal to 10 2 0 0 3 0 4 5 0 6 0 0 7 0 8 3 0 4 0 3 4 0 6 0 + 2 5 0 0 7 0 8 0 7 8 16. By plugging in we obtain / -3/2 0 1/2 0 \ 0 -2 0 1 5/4 0 -1/4 0 V 0 7/4 0 -3/4/ □ 2.43. Find the algebraically adjoint matrix F* for a, ß,y,S e O 2.44. 
Calculate the algebraically adjoint matrix for the matrices -1\ (a) /3 -2 0 0 2 2 1 1 -2 -3 -2 \0 1 2 1 / (b) 1 + i 2i 3-2/ 6 where i denotes the imaginary unit. O F. Vector spaces The properties of vector space, which we have already observed for the plane or three dimensional space are possessed by other sets as well. We illustrate this by examples. 2.45. Vector space - yes or no? Decide for the following sets whether they form a vector space over the field of real numbers: (1) (M) — {ai ■ u\ H-----\-cik-iik\ k e N, at e K, Uj e M, j — 1, ...,£}; (2) M — (M) if and only ifM is a vector subspace; (3) ifN C M then (N) C (M) is a vector subspace Subspace (0) generated by the empty subspace is the trivial sub-space {0} C V. Proof. (1) The set of all linear combinations a\u\ on the right-hand side (1) is clearly a vector subspace and of course it contains M. On the other hand, each of the linear combinations must be in (M) and thus the first claim is proven. The claim (2) follows immediately from (1) and from the definition of vector space and analogously (1) implies the third claim. Finally, the smallest subspace is {0}, because empty set is contained in every subspace and each of them contains the vector 0. □ 2.28. Sums of subspace. Since we now have some intuition about generators and their respective subspaces, we should understand the possibilities how some subspaces can generate whole space V. _J Sum of subspaces J___ Let Vt,i e I be subspaces of V. Then the subspace generated by their union, that is, (U, e/ V,), is called sum of subspaces V). We denote it as 2~2iei ^- Notably, for a finite number of subspaces Vi,..., Vk C V we write Vl+--- + V* = (ViUVr2u---UVJt). We see that every element in the considered subspace can be expressed as a linear combination of vectors from the subspaces Vi. Because vector addition is commutative, we can associate members that belong to the same subspace and for a finite sum of k subspaces we obtain Vi + V2 + ■ ■ ■ + Vk = {Vl + ■ ■ ■ + vk; Vie Vi, i = 1,..., k). Sum W — Vi + ■ ■ ■ + Vk C V is called direct sum of subspaces if the intersection of any two is trivial, that is, Vi n Vj — {0} for all i ^ j. We show that in such case can every vector w e W be written in a unique way as a sum w — vi Vk, where Vi e Vi. Really, if for that vector we could simultaneously write w — v[ +----h v'k, then 0 — w — w — (vi — v[) H----+ (vk — v'k). If vi — v. is the first nonzero term of the right-hand side, then this vector from Vi can be expressed using vectors from other sub-spaces. That is a contradiction with the assumption that Vi has zero intersection with other subspaces. The only possibility is then that all the vectors on the right-hand side are zero and thus the expression of w is unique. For direct sums of subspaces we write W = Vi ® Vk |V/. 2.29. Basis. Now we have everything prepared for understanding minimal sets of generators as we understood them in the plane M2. 95 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA i) The set of solutions of the system xl + x2 + ' ' ' + X9S + x99 + ^100 =100xi, xi + x2-\-----h x98 + x99 =99xi, xi + x2 H-----h x98 =98*1, X} -(- X2 =2x\. ii) The set of solutions of the equation X\ + X2 + ■ ■ ■ + X\oo = 0 iii) The set of solution of the equation xi + 2x2 + 3x3 H-----h lOOxioo = 1. iv) The set of all real (or complex) sequences. (Real or complex sequence is a mapping /:N->Mor/:N->C The image of number n is then called 72-th member of the sequence, we usually denote it by lower index, say a„.) 
v) The set of solutions of homogeneous difference equation. vi) The set of solutions of non-homogeneous difference equation. vii) {/ : R -> R\f(\) = f(2) = c, c e R} Solution. i) Yes. They all are real multiples of the vector (1, 1, 1..., 1), 1-,-' 100 ones that is, vector space of dimension 1 (see also (2.29)). ii) Yes. It is a space of dimension 99 (corresponds to the number of free parameters of the solution). In general the set of all solutions of any system homogeneous linear equations forms a vector space. iii) No. For instance, taking twice the solution x\ = 1, xt =0, i = 2, ... 100 we do not obtain a solution. But the set of solutions forms a so-called affine space (see (4.1)). iv) Yes. The set of all real or complex sequences clearly forms a real (complex) vector space. Adding the sequences and scalar multiplication is defined term-wise, where it is clearly the vector space of all real (complex) numbers. v) Yes. In order to show that the set of sequences which satisfy given difference homogeneous equation it is enough to show that it is closed under addition and real number multiplication (as the set of all real sequences is a vector space, as we know). Let us have two sequences (x7)^0 and (yj)^0 Basis of vector space V Subset M C V is called basis of vector space V if (M) and M is linearly independent. Vector space with finite basis is called finitely dimensional, the number of elements of the basis is called the dimension ofV. If V does not have a finite basis, we say that V is infinitely dimensional. We write dim V — k, & e N or & = 00. In order to be satisfied with such definition of dimension, we must know that different bases of the same space will always have the same number of elements. This we will show in a while. But we note immediately, that the trivial subspace is generated by empty set, which is an "empty" basis. It thus has zero dimension. Basis of a ^-dimensional space will usually be denoted as a k-tuple v — (v\..., Vk) of basis vectors. It is mostly about having a convention: with finitely dimensional vector spaces we shall always consider the base along with a given order of the elements even if we have not defined it that way, strictly said. Clearly, if (vi,... ,v„) is a basis of V, the whole space V is a direct sum of the one-dimensional subspaces V=(vi)®---®(vn). An immediate corollary of the derived uniqueness of decomposition of any vector w in V into the components in the direct sum gives unique decomposition W — X\V\ + • • • + x„ v„ and allows us after choosing a basis to see vectors again as n-tuples of scalars. To this idea we will return in the paragraph 2.33, when we finish the discussion of existence of bases and sums of sub-spaces in general case. 2.30. Theorem. >From any finite set of generators of a vector space V we can choose a basis. Every base of a finitely dimensional space V has the same number of elements. Proof. First claim can be easily proved using induction on the number of generators k. Only the zero subspace does not need any generator and thus we are able to choose an empty basis. On the other hand, we are not allowed to choose the zero vector (generators would be linearly dependent) and there is nothing else in the subspace. In order to have our inductive step more natural, we deal with the case k — 1 first. We have V — ({v}) and 1; ^ 0, because {v} is linearly independent set of vectors. Then {v} is also a basis of the vector space V. Assume that the claim holds for k — n and consider V — (vi,..., vn+i). 
If vi,..., vn+i are linearly independent, then they form a basis. In the other case there exists i such that vt —aivi H-----\-ai-iVi-i +ai+ivi+i H-----\-an+\vn+\. Then V — (vi,..., i^-i, ur-+i,..., vn+\) and we can choose a basis, using inductive assumption. In remains to ensure that bases always have the same number of elements. Consider basis (y\,... ,vn) of the space V and for arbitrary nonzero vector consider a\v\ ■ a„v„ e V 96 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA satisfying the given equation, that is, anxn+k + an-\xn+k-\ + ' ' ' + a0xk = 0 anyn+k + an-iyn+k-i + ■ ■ ■ + aoyk = 0. By adding these equations, we obtain an(xn+k + yn+k) + an-i(xn+k-i + yn+k-i) + • • • + a0(xk + yk) = 0, therefore also the sequence (xj + yj)^0 satisfies the given equation. Analogously, if the sequence (xj)^0 satisfies the given equation, then also (uxj)^0, where uel vi) No. The sum of two solutions of a non-homogeneous equation anxn+k + an-\xn+k-\ + ' ' ' + a0xk — c a„y„+k + an-iyn+k-i-\-----h a0yk = c, c e {0} satisfies the equation an(xn+k + yn+k) + an-i(xn+k-i + yn+k-i) + • • • + a0(xk + yk) = 2c, that is, it does not satisfy the original non-homogeneous equation. But the set of solutions forms an affine space, see 4.1. vii) It is a vector space if and only if c = 0. If we take two functions / and g from the given set, then (/ + g)(l) = (/ + g)(2) = /(l) + g(l) = 2c. Thus if / + g is to be a member of the given set, it must be that (/ + g)(l) = c, therefore 2c = c, therefore c = 0. □ 2.46. Find out, whether the set U1 = {(x1,x2,x3) sM3; | jci | = \x2\ = \x3\} is a subspace of a vector space R3 and the set U2 = {ax2 + c; fl,cel) a subspace of the space of polynomials of degree at most 2. Solution. The set U\ is not a vector (sub)space. We can see that, for instance, (1, 1,1)+ (-1,1, 1) = (0,2,2) glA. The set U2 is a subspace (there is a clear identification with R2), be- with at ^ 0 for some i. Then 1 , , vt — — (u - (a\vi H-----hflr-iUr-i +ai+ivi+i H-----\-a„v„)) Cli and therefore also {u,v\,..., vi-\, vi+\, ...,v„) — V. We ensure that this is again a basis: if adding u to linearly independent vectors v\,..., vi+\,..., v„ would lead to a set of linearly dependent vectors, then u is their linear combination. That would mean V (V\, . . . , Vi-l, Vi + l, ...,vn), cause {axx2 + ci) + (a2x2 + c2) = (ax + a2)x2 + (ci + c2), which is not possible. Thus we have proved that for any nonzero vector u e V there exists i, 1 < i < n, such that (u,v\,..., vi-\, vi+\, ..., vn) is again a basis of V. Further, we shall instead of one vector u consider a linearly independent set u\,..., uk and we will sequentially add u\, u2,..., always exchanging for some vt using our previous approach. We have to ensure that there always is such (that is, that the vector u will not exchange for each other). Assume thus that we have already placed u\,..., ui. Then the vector ui+\ can clearly be expressed as a linear combination of such vector and the remaining vj . If only the coefficients at u i,..., ui were nonzero, that would mean that the vectors u\,..., ui+\ are linearly dependent, which is a contradiction. For every k < n we can after k steps obtain a basis in which from the original basis k vectors were exchanged for new ones. If k > n, then in the n-th step we obtain a basis consisting only of new vectors , which means that the original set could not be linearly independent. Notably it is not possible that two bases have a different number of elements. 
□ In reality, we have proved a stronger claim, the so-called Steinitz exchange theorem, which says that for every finite basis v and every system of linearly independent vectors in V we can find a subset of the basis vectors which can be exchanged with the new vectors to obtain a basis. 2.31. Corollaries of the Steinitz exchange theorem. Thanks to the possibility of freely choosing and exchanging basis vectors we can immediately derive nice (and intuitively expectable) properties of bases of vector space: Proposition. (1) Every two bases of a finitely dimensional vector space have the same number of elements, that is, our definition of dimension is basis-independent. (2) IfV has a finite basis, then every linearly independent set can be extended to a basis. (3) Basis of a finitely dimensional vector space is maximal linearly independent set. (4) Bases of a vector space are exactly minimal sets of generators. A little bit more complicated, but now easy to deal with, is the situation of dimensions of subspaces and their sums: Corollary. Let W,W\,W2 c V be subspaces of a space V of finite dimension. Then we have that (1) dimW < dim V, (2) V = W if and only if dim V = dim W, (3) dim Wi + dim W2 = dim(Wi + W2) + dim(Wi n W2). 97 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA k ■ (ax2 + c) = (ka) x2 + kc for all numbers a\,c\, a2, c2,a,c,ksR. □ 2.47. Is the set V = {(1, x); x el) with operations ®:VxV^V, (l,y)©(l,z) = (l,z +y) for all z, y eR 0:lxV ^F, z © (l,y) = (l,y ■ z) for allz, y e 1. a vector space? O G. Linear dependence and independence, bases 2.48. By calculating the determinant of a suitable matrix decide whether the vectors (1, 2, 3, 1), (1, 0, -1, 1), (2, 1, -1, 3) and (0, 0, 3, 2) are linearly dependent or not. Solution. Because 1 2 1 0 2 1 0 0 3 1 -1 1 1 10 ^0, the given vectors are linearly independent. □ 2.49. Given arbitrary linearly independent vectors u, v, w, z in a vector space V, decide whether in V the vectors u — 2v, 3u + w — z, u — 4v + w + 2z, 4v + Sw + 4z are linearly independent or not. Solution. The considered vectors are linearly independent if and only if the vectors (1, -2, 0, 0), (3, 0, 1, -1), (1, -4, 1, 2), (0, 4, 8, 4) are linearly independent in R4. We have 1-200 3 0 1-1 1-412 0 4 8 4 thus the vectors are linearly independent. -36 ^0, □ 2.50. Determine all constants ael such that the polynomials ax2 + x +2, —2x2 + ax + 3 and x2 + 2x + a are linearly dependent (in the vector space P3 [x] of polynomials of one variable of degree at most three over real numbers). Solution. In the basis 1, x, x2 the coefficients of the given vectors (polynomials) are (a, 1, 2), (—2, a, 3), (1, 2, a). Polynomials are linearly independent if and only if the matrix whose columns are given by the coordinates of the vectors has rank lower than the number of the vectors, which in this case means that rank must be two or lower. In Proof. It remains to prove only the last claim. That is clear when the dimension of one of the spaces is zero. Assume then that dim W\—r>\, dim W2 — s > 1 and let (wi..., wt) be a basis of W\ n W2 (or empty set, if the intersection is trivial). According to the Steinitz exchange theorem this basis of the intersection can be extended to a basis (wi,..., wt, ut+\ ... ,ur) for W\ and to a basis (wi... ,wt, vt+\,... ,vs) for W2. Vectors wt, ut+i. Ur, Vt+l ... ,VS clearly generate W\ + W2. We show that they are linearly independent. Let fliwi H-----Vatwt +bt+iut+i + ... ----h brur + ct+i vt+i ■csvs = 0. 
Then necessarily - (Q+i • vt+i H-----hcs • vs) — — a\ ■ w\ +----h at ■ wt + bt+\ ■ ut+\ + ■ ■ ■ + br ■ ur must belong to ^nffi. That implies that bt+i — ■ ■ ■ — br — 0, since in that way we have defined our bases. Then also a\ ■ w\ H-----h at ■ wt + ct+\ ■ vt+\ H-----h cs ■ vs — 0 and because the corresponding vectors form a basis W2, all the coefficients are zero. The claim (3) now follows by directly calculating of generators. □ 2.32. Examples. (1) K" has (as a vector space over K) dimension n. Basis is for example an n-tuple of vectors ((1,0, ...,0), (0, 1, ...,0)..., (0, ...,0, 1)). This basis is called the standard basis ofK". Note that in the case of finite field of scalars, say Z*, the whole space K" has only a finite number (kn) of elements. (2) C as a vector space over R has dimension 2, basis is for instance the numbers 1 and i. (3) Km[x], that is, the space of all polynomials of degree at most m, has dimension m + 1, basis is for instance the sequence 1, x, x2, ... ,xm . Vector space of all polynomials K[x] has dimension 00, but we can still find a basis (although infinite in size): 1, x, x2,____ (4) Vector space R over Q has dimension 00 and does not have a countable basis. (5) Vector space of all mappings / : R —>• R has also dimension 00 and does not have any finite basis. 2.33. Vector coordinates. If we fix a basis (vi,..., v„) of a finitely dimensional space V, then every vector w e V can be expressed as a linear combination 1; — aivi + ■ ■ ■ + anvn. Assume that we can do it in two ways: But then w — a\v\ H-----h a„v„ — b\v\-\-----h b„v„. 0 = (ai - b\) ■ vi H-----h (an - b„) ■ vn 98 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA the case of square matrix, rank lower than the number of rows means that the determinant is zero. The condition for a thus reads -2 1 a 2 3 a 0, that is, a is a root of the polynomial a3 — 6a — thus there are 3 such constants a\ = —l,a2j, 5 = (a + l)(a2- _ l±V2l -a —5), □ 2.51. Vectors (1,2,1), (-1,1,0), (0,1,1) are linearly independent, and therefore together form a basis of R3 (for basis it is important to give an order of the vectors). Every three-dimensional vector is therefore some linear combination of them. What linear combination corresponds to the vector (1, 1, 1), or equiv-alently, what are the coordinates of the vector (1, 1, 1) in the basis formed by the given vectors? O Solution. We seek a, b, c e R such that a(\, 2, 1) + b(-\, 1,0) + c(0, 1, 1) = (1, 1, 1). The equation must hold in every coordinate, so we have a system of three linear equations in three variables: a — b =1 2a+b+c = 1 a + c = 1, whose solution gives us a = b = — \,c = \, thus we have (1, 1, 1) = \ ■ (1, 2, 1) - \ ■ (-1, 1, 0) + l- ■ (0, 1, 1), that is, the coordinates of the vector (1, 1, 1) in the basis ((1, 2, 1), (-1,1, 0), (0, 1, 1)) are (I -I I). □ 2.52. Express the vector (5, 1, 11) as a linear combination of the vectors (3, 2, 2), (2, 3, 1), (1, 1, 3), that is, find numbers p,q,r e R, for which (5, 1, 11) = p (3, 2, 2) + q (2, 3, 1) + r (1, 1, 3). O 2.53. Consider the complex numbers C as a real vector space. Determine the coordinates of the number 2 + i in the basis given by the roots of the polynomial x2 + x + 1. Solution. Because roots of the given polynomial are + z'# and and thus a, — b[for all i — 1,..., n. We have reached the following conclusion: In a finitely dimensional space every vector can be given in a unique way as a linear combination of basis vectors. 
Coefficients of this unique linear combination expressing the given vector w e V in the chosen basis v — (vi,...,v„)sae called coordinates of the vector w in this basis. Whenever we speak about coordinates (a\,..., a„) of vector w, which we express as a sequence, we must have a fixed ordering of basis vectors v — (v\,..., v„). Although we have defined the basis as a minimal set of generators, in reality we work with them as with sequences (that is, with ordered sets). ___| Assigning coordinates to vectors J_ - Mapping, which to the vector u — a\ v\ H-----Yan vn assigns its coordinates in the basis v shall be denoted with the same symbol v : V -> K". It has the following properties: (1) v_(u + w) — v_(u) + v(w); Vm, w e V, (2) v(a ■ u) — a ■ v(u); Va e K, Vm e V. Note that the operations over left and right side of these equa-jSt# tions are not identical, quite the opposite, they are operations over different vector spaces! At this op-5^2g|3^ portunity, we can think about the general case of the basis M of (possibly infinite) vector space V. The basis then does not have to be countable, but still we can define the mapping M : V —>• KM (that is, the coordinates of the vectors are the mapping from M to K). The given properties of assignments of coordinates were already seen at the mappings in geometry we have called linear (they preserved our linear structure in the plane). Before we deal more thoroughly with the dependency of the coordinates on the choice of the basis, we look in more generality at the notion of the linearity of the mapping. 2.34. Linear mapping. For any vector space (of finite or infinite dimension) we define "linearity" of a mapping between spaces similarly to the planar case (M2): Linear mapping, definition {_ Let V and W be vector spaces over the same field of scalars K. The mapping / : V —>• W is called linear mapping (homomorphism) if the following holds: (1) f(u + v) = f(u) + f(v), Vu,veV (2) f(a-u) = a- f(u), Va e K, Vm e V. i^-, we have to determine the coordinates {a, b) of the vector Clearly, such mapping have already been seen in the case of matrix multiplication: / : K" Km,x A -x with matrix of type m/n over K. Image Im / := f(V) c W is always vector subspace, since linear combination of images f(u{) is an image of a linear combination of the vectors with the same coefficients. Analogously, the set of all vectors Ker / := f~l ({0}) c V is a subspace, since the linear combination of zero images will always be a zero vector. The subspace Ker / is called kernel of linear mapping f. Linear mapping which is a bijection is called isomorphism. 99 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA 2 + z in the basis (—^ + 1^,-^ — 1 ^). These real numbers a, b are uniquely determined by the condition 1 73 1 73 a ■ (---h z—) + b ■ (---z-) = 2 + z. 2 2 2 2 By considering individually the real and the imaginary part of the equation we obtain a system of two linear equations in two variables: 1 1 --a--b = 2 2 2 V3 73, -a--b = 1. 2 2 Its solution gives us a -2 + V3 coordinates are (—2 + — 2 — -^). -2 — ^j-, therefore the □ 2.54. Remark. As a perceptive reader has definitely spotted, the problem statement is not unambiguous - we are not given the order of the roots of the polynomial, thus we do not have the order of the basis vectors. The result is thus given up to the permutation of the coordinates. Let us also add a remark about the so-called rationalising the denominator, that is, removing the square roots from the denominator. 
The authors do not have a distinctive attitude whether this should al-ways be done or not (Does ^- look better than ?). In some cases the rationalising is undesirable: from the fraction -J= we can immediately spot that its value is a little greater than 1 (because V 35 is just a little smaller than 6), while for the rationalised fraction we cannot spot anything. 2.55. Consider complex numbers C as a real vector space. Determine the coordinates of the number 2 + z in the basis given by the roots of the polynomial x2 — x + 1. 2.56. For what values of the parameters a,b,c e M are the vectors (1, 1, a, 1), (l,b,l, 1), (c, 1, 1, 1) linearly dependent? 2.57. Let a vector space V be given along with some basis formed by the vectors u,v,w,z. Determine whether the vectors u — 3v + z, v — 5w — z, 3w — lz, u — w + z are linearly (in)dependent. 2.58. Complete the vectors 1 — x2 + x3, 1 + x2 + x3, 1 — x — x3 into a basis of the space of polynomials of degree at most 3. 2.59. Do the matrices 1 0 \ (I 4 \ (-5 0\ (I -2 1-2J' [0 -lj ' V 3 0) ' [0 3 form a basis of the vector space of square two-dimensional matrix? Analogously to the abstract definition of vector spaces, it is again necessary to prove seemingly trivial claims that follow from the axioms: Proposition. Let f : V -> W be a linear mapping between two vector spaces over the same field of scalars K. For all vectors u, u\, ..., uk e V and scalars a\,..., ak e~Kit holds that (1) /(0) = 0, (2) f(-u) = -/(«), (3) f(ai ■ u\ H-----h ak ■ uk) — a\ ■ f (u\) -\-----V ak ■ f(uk), (4) for every vector subspace V\ C V is its image /(Vi) a vector sub space in W, (5) for every vector subspace W\ C W isthe set f~l(W\) — {v e V; f(v)eWi] a vector subspace in V. Proof. We rely on the axioms, definitions and already proved results (in case you are not sure what has been used, look it up!): /(0) = f(u -u) = /((l - 1) • u) = 0 • f(u) = 0, /(-«) = /((-l) • u) = (-1) • f(u) = -f(u). The property (3) is again easy from the definition for two sum-mands using induction on the number of summands. >From the property (3) we have that (/(Vi)) — f(Vi), thus it is vector sub-space. On the other hand, if f(u) e W\ and f(v) e W\ then for any scalars it will be that f(a-u+b-v) — a ■ f (u)+b-f (v) eW\. □ 2.35. Simple corollaries. (1) Composition g of : V —>• Z of two linear mappings / : V —>• W and g : W —>• Z is again a linear mapping. (2) Linear mapping / : V —>• W is an isomorphism if and only if Im / — W and Ker / — {0} c V. Inverse mapping of an isomorphism is again an isomorphism. (3) For any two subspaces V\, V2 C V and linear mapping / : V W it holds that f(Vi + V2) = f(Vi) + f(V2), f(Vi n v2) c /(Vi) n f(v2). (4) The mapping "coordinate assignment" u : V —>• K" given by arbitrarily chosen basis u — (u\, ...,«„) of a vector space V is an isomorphism. (5) Two finitely dimensional vector spaces are isomorphic if and only if they have the same dimension. (6) Composition of two isomorphisms is an isomorphism. Proof. Proving the first claim is a very easy exercise. ^ For the proof of the second one we must realise that if / is a linear bijection, then a vector w is an image of a linear combination au + bv, that is u> — f~l (au + bv), if and only if f(w) = au + bv = f(a-f-1(u)+b-f'Hv)). Thus it also holds that w — af~l(u) + bf~l (v) and therefore the inversion of a linear bijection is again a linear bijection. Further, / is surjective if and only if Im / — W and if Ker / — {0} then f(u) — f(v) ensures f(u — v) — 0, that is, u — v. In this case / is injective. 
The remaining claims are easy to prove by induction. Try to make a counterexample - in the inclusion that is to be proved there 100 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA Solution. The four given matrices are as vectors in the space of 2 x 2 matrices linearly independent. It follows from the fact that the matrix /1 1 -5 1 \ 0 4 0 -2 1 0 3 0 V"2 -1 0 3/ is regular (which is by the way equivalent to any of the following claims: its rank equals its dimension; it can be transformed into the unit matrix by elementary row transformations; it has the inverse matrix; it has non-zero determinant (equal to 116); it stands for a system of homogeneous linear equations with only zero solution; every non-homogeneous linear system with left-hand side given by this matrix has a unique solution; the range of a linear mapping given by this matrix is a vector space of dimension 4 - this mapping is injective). □ 2.60. Let there be in M3 two vector spaces U and V generated by the vectors (1, 1, -3), (1, 2, 2) a (1, 1, -1), (1, 2, 1), (1, 3, 3), respectively. Determine the intersection of these two subspaces. Solution. The subspace V has dimension only 2 (it is not the whole space M3), because 1 1 1 1 1 -1 1 2 3 = 1 2 1 = 0 -1 1 3 1 3 3 and any two of the considered three vectors is clearly linearly independent. Similarly we can see that U has dimension 2. Also we have 1 1 1 2 -3 2 1 1 -1 2^0, and therefore the vector (1,1,-1) does not he in the subspace U. The intersection of two planes (two-dimensional spaces) passing through the origin in a three-dimensional space must be at least a line. In our case it is exactly a line (subspaces are not identical). Thus we have determined the dimension of the intersection - it is one-dimensional. If we note that 1 -(1, 1, -3)+2- (1,2,2) = (3,5, 1) = 1 • (1, 1, -1) +2 • (1,2, 1), we obtain expression of the intersection in the form of a set of all scalar multiples of the vector (3,5, 1) (thus it is a line passing through the origin with this vector as a direction). □ does not always have to hold an equality (that is, find an example where the inclusion is strict). □ 2.36. Coordinates again. Consider any two vector spaces V and W over K with dim V — n, dim W — m and consider some r linear mapping / : V -> W. For every choice of basis 1 -' *% u_ — (u\,..., un) on V, v — (vi,..., vn) on W we have ft1 ' at our disposal the corresponding coordinate assignments and the whole situation is captured in the following diagram: The bottom arrow fu,v is defined by the remaining three, that is, as a mapping it is a composition fu,v =RO f ou~l. Matrix of a linear mapping I - Every linear mapping is uniquely determined by its values on I an arbitrary set of generators, notably on the vectors of a basis u. Denote by f(u\) — an ■ vi +a2i ■ v2 H-----V am\vm f(u2) — a 12 ■ vi + a22 ■ v2 H-----h am2vm /(«„) — a\n ■ vi + a2„ ■ v2 H-----h amnvm, that is, scalars atj form a matrix A, where the columns are coordinates of the values f(uj) of the mapping / on the basis vectors expressed in the basis v on the target space W. Matrix A — (a^) is called matrix of the mapping f in bases u, v. | For a general vector u — x\u\ H-----h x„u„ e V we calculate (recall that vector addition is commutative and distributive with respect to scalar multiplication) /(«) — x\f(ui) H-----h xnf(un) = xi(flni;i-|-----Yam\vm) H-----\-x„(ai„vi-\-----Yamnvm) = (xiflnH-----Yxna\n)v\ H-----h (x\am\-\-----Yxnamn)vm. 
Using matrix multiplication we can now very easily and clearly write down the values of the mapping fu,v(u>) defined uniquely by the previous diagram. Recall that vector in W are understood as columns, that is, matrices of the type r/1 fu,v(u(u))) - v(f(w)) - A ■ u{w). On the other hand, if we have fixed bases on V and W, then every choice of a matrix A of the type m/n gives a unique linear mapping K" —>• Km and thus also a mapping / : V —>• W. If we have chosen bases of spaces V and W, every choice of a matrix of the type m/n correspond to a unique linear mapping V —>• W and we have shown a bijection between matrices of the corresponding dimension and linear mappings V —>• W. 101 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA 2.61. Determine the vector subspace (of the space R4) generated by the vectors u\ = (—1,3, —2, 1), u2 = (2,-1,-1, 2), u3 = (—4, 7, —3, 0), «4 = (1, 5, —5, 4), by choosing some maximal set of linearly independent vectors ut (that is, by choosing a basis). Solution. We write the vectors ut into the columns of a matrix and transform it using elementary row transformations. This way we obtain Í-1 3 -2 V i 2 1 0 1 \0 0 7 /I 0 -1 -i -: 2 o o -i -i 1 \ 5 -5 4/ 4\ 5/4 1 0 / ( 1 -1 3 V-2 (I 2 0 1 0 0 \0 0 2 2 -1 -1 4\ 1 5 "5/ /I 0 0 4 \ 5/4 -1/4 0 / (I 0 0 1 0 0 \0 0 2 0 4 -4 -7 7 3 2 -1 0 0 4\ 5 -7 3 / °\ 0 1 0/ 2 • (-1, 3, -2, 1) - (2, -1, -1, 2) = (-4, 7, -3, 0). □ 2.62. In the vector space R we are given three-dimensional sub-spaces U = (mi, u2, u3), V = (vi, v2, v3), while (\\ (\\ (\\ ( 1 ^ (1 \ 1 1 0 1 -1 1 , u2 = 0 , u3 = 1 , vi = -1 , v2 = 1 vv V) V) v3 = (1, — 1, — 1, l)r. Determine the dimension and give a basis of the subspace U n V. Solution. The subspace U n V contains exactly the vectors that can be obtained as a linear combinations of vectors ut and also as a linear combination of vectors vt. We thus search for numbers x\, x2, x3, yi, y2, y3 € R such that the following holds: (\\ (1\ (1\ ( 1 ^ ( 1 ^ ( 1 \ 1 1 0 i -l -1 xx 1 + x2 0 + x3 1 = yi -i + y2 l + J3 -1 vv V) v) that is, we are looking for a solution of a system + x3 = Xi Xi xx + + X2 *2 *2 + + X3 x3 yi yi -y\ -yi + + y2 y2 y2 y2 + + J3, J3, J3, ys- >From that it follows that linearly independent are exactly the vectors ui,u2,u4, that is, exactly the vectors corresponding to the columns which contain first non-zero number of some row. Furthermore we have (see the third column) 2.37. Matrix for changing the coordinates. If we choose V and W to be the same space, but with different bases, and for / pick the identity mapping, the approach from the previous paragraph expresses the vectors of the basis u in coordinates with respect to the basis v. Let the resulting matrix be T. If we then have the vector u as U — X\U\ + • • • + XnUn, that is, in coordinates with respect to u and plug for their expression using the vectors from v, we obtain the coordinate expression x — (x\,..., x„) of the same vector in the basis v. It is enough just to reorder the summands and express the individual scalars at the vectors of the basis. In reality, we are doing exactly the same thing as in the previous paragraph for the special case of the identity mapping idv on the vector space V. Matrix of this identity mapping is T and therefore the direct calculation must give x — T ■ x. The situation is depicted in the diagram The resulting matrix T is called matrix for changing the basis from u of the vector space V to the basis v of the same space. 
Directly from the definition we have: Calculating the matrix for changing the basis [_( Proposition. Matrix T for changing from the basis u to the basis v is obtained by taking coordinates of the vectors of the basis u expressed in the basis v are written as the columns of the matrix T. The role of the matrix for changing the basis is that if we know the coordinates x of the vector in the basis u, then its coordinates in the basis v are obtained by multiplying the column x with the matrix for changing the basis (from the left). Because the inverse mapping for the identity mapping is again an identity mapping, the matrix for changing the basis is always invertible and its inverse is the matrix for changing the basis in the opposite direction, that is, from the basis v to the basis u. 2.38. More coordinates. Now we show how to compose possible coordinate expressions of linear mapping. Let us consider another vector space Z over K of dimension k with basis w, linear mapping g : W —>• Z and denote the corresponding matrix by gv,w- V ■ / w ■ fu,v 8v,ui Composition g of on the upper row corresponds to the matrix of the mapping K" —>• Kk on the bottom and we directly calculate (we write A for the matrix / and B for the matrix of g in the chosen 102 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA Using matrix notation of this homogeneous system (and preserving bases): the order of the variables) we have /l i i -1 -1 -1\ /l 1 1 -1 -1 - -1\ 1 1 0 -1 1 1 0 0 -1 0 2 2 1 0 1 1 -1 1 0 -1 0 2 0 2 v> 1 1 1 1 -v 1 1 1 1 - /l 1 1 -1 -1 -is /l 1 1 -1 -1 - -1\ 0 1 1 1 1 -1 0 1 1 1 1 - -1 0 0 - -1 0 2 2 0 0 1 0 -2 - -2 0 1 3 1 !y ^0 0 0 1 1 /l 1 1 0 0 0 \ /l 0 0 0 0 2 \ 0 1 1 0 0 - 2 0 1 0 0 2 0 0 0 1 0 - -2 - 2 0 0 1 0 -2 -2 ^0 0 0 1 1 1/ ^0 0 0 1 1 0 We obtain a solution X\ t, s e -2ř, X2 = —2s, x 3 = 2s + 2t, yi t, y2 = s, y3 = t, We obtain a general vector of the intersection by substituting X\ + x2 X\ + x3 / 0 \ -it - 2s 2s V It ) We see that dim U n V unv -i i V 0 / -i 0 V1/ □ 2.63. Give some basis of the subspace U of the vector space of real matrices 3x2. Extend this basis to a basis of the whole space. Solution. Let us remind that a basis of a subspace is a set of linearly independent vectors which generate the given subspace. Because '1 2\ /0 1\ 1-2 -V 2 |3 4+32 3=0 1 v5 6/ \4 5/ \2 3 the whole subspace U is generated just by the first two matrices. These are furthermore linearly independent (none is a multiple of another) and thus give a basis. If we want to extend it to a basis of the whole space of real matrices 3 x 2, we must find four more matrices (the gv,w ° fu,v(x) = wo gov 1 ovo f o u 1 — B ■ (A ■ x) — (B ■ A) ■ x — (g o f\,w{x) for every x e Kn. Composition of mappings thus corresponds to multiplication of the corresponding matrices. Note that the isomorphisms correspond exactly to invertible matrices. The same approach gives us an answer to the question how does the matrix of the mapping change whenever we change the basis (both in the domain and in the codomain): V idy V ■ f w- fu,v w where T is the matrix for changing the basis from w' to u and S is the matrix for changing the basis from v' to v. If A is the original matrix of the mapping, then the matrix of the new mapping is given by A' = S~l AT. In the special case of linear mapping / : V —>• V, that is, mapping that has the same space V as its domain and codomain, we express / usually with a single basis u of the space V. 
Then the changing of the basis to the new one u' with the matrix T for changing from u' to u the new matrix will be A' — T~l AT. 2.39. Linear forms. A specially simple but important case of linear mappings are so-called linear forms. They are linear mappings from the vector space V over field of scalars K into the scalars K. If we are given the coordinates on V, the assignments of a single ;-th coordinate to the vectors is an example of a linear form. More precisely, for every choice of basis v — (v\ ,...,«„) we have at our disposal the linear forms v* : V —>• K such that v*(vj) — Stj, that is, zero for distinct indices i and j and one for the same indices. Vector space of all linear forms on V is denoted by V* and called dual space of the vector space V. Let us now assume that the vector space V has finite dimension n. The basis V* composed of assignments of individual coordinates as before is called dual basis. Really it is a basis of the space V*, because these forms are clearly linearly independent (prove it!) and if a is an arbitrary form, then it holds for every vector u — x\ v\ + • • • + x„ vn a(u) — jcia(ui) + • • • + x„a(v„) = a(i;i)i;*(«) H-----\-a(v„)v*(u) and thus the linear form a is a linear combination of the forms v*. For a fixed basis {1} on one-dimensional space of scalars K are for every choice of the basis v on V the linear forms a identified with matrices of the type 1/n, that is, with rows y. Exactly the components of these rows are coordinates of the general linear forms in the dual basis v*. Expressing such form on vector is then given by multiplying the corresponding row vector y with the column of the coordinates x of the vector u e V in the basis i>: a(u) — y ■ x — yixi H-----h ynxn. Thus we can see that for every finitely dimensional space V is V* isomorphic to the space V. Realisation of such isomorphism is given for instance by our choice of the dual basis for the chosen basis on the space V. 103 CHAPTER 2. ELEMENTARY LINEAR ALGEBRA 1 0 0 0 0 0 2 1 0 0 0 0 3 2 1 0 0 0 4 3 0 1 0 0 5 4 0 0 1 0 6 5 0 0 0 1 dimension of the whole space is clearly 6) such that the resulting six-tuple is linearly independent. We can use for instance the canonical basis 'l 0\ /0 1\ /0 0\ /0 0\ /0 0\ /0 0^ 0 0 I, o o 1, 1 o 1, 0 11, o o 1, 0 0 v0 0/ \0 0/ \0 0/ \0 0/ \1 0/ \0 1, of the space of real matrices 3x2, which can be identified directly with M6. If we write down the two vectors of the basis of U and then the canonical basis of the whole space, by choosing first 6 linearly independent vectors we obtain a desired basis. If we consider for instance 1, we can immediately add to the basis vectors 'l 2\ /0 1N 3 4, 23 v5 6) \4 5y of the subspace U the matrices (vectors of the space of the matrices) ^0 0\ /0 0\ /0 0\ /0 0^ 10, (oil, 00, 00 vo o) \o o) \i o) \o 1, to a basis. Let us note that the determinant given above is easy to compute - it equals the product of all elements on the diagonal, because the matrix is in lower triangular form (everything above the diagonal is zero). □ H. Linear mappings How to analytically describe similar mappings (for instance rotation, axial symmetry, mirror symmetry, projection of a three-dimensional space on a two-dimensional one) in the plane or in the space? How can we describe scaling of a picture? What do they have in common? They all are linear mappings. That means that they preserve certain structure of the space or a subspace. What structure? Structure of a vector space. 
H. Linear mappings

How can we analytically describe mappings such as rotation, axial symmetry, mirror symmetry, or the projection of a three-dimensional space onto a two-dimensional one, in the plane or in space? How can we describe the scaling of a picture? What do they all have in common? They are all linear mappings: mappings that preserve a certain structure of the space or of a subspace. What structure? The structure of a vector space.

Every point in the plane is described by two coordinates; every point in (3-dimensional) space is described by three coordinates. If we fix the origin, then it makes sense to say that some point lies, in some direction, twice as far from the origin as some other point. We also know where we arrive if we shift by some value in a given direction and then by some other value in another direction. These properties can be formalised: we speak of vectors in the plane or in space and of their addition and multiplication by scalars. A linear mapping has the property that the image of a sum of vectors is the sum of the images of the vectors, and the image of a multiple of a vector is the same multiple of the image of the vector. The mappings listed above all share these properties. Such a mapping is then uniquely determined by its behaviour on the vectors of a basis (in the plane, by the images of two vectors not lying on the same line; in space, by the images of three vectors not lying in the same plane).

In this context we again meet the scalar product of a row of n scalars with a column of n scalars, as we worked with it already in paragraph 2.3 on page 74.

When considering an infinite-dimensional space, things behave differently. For instance, the simplest example, the space of all polynomials K[x] in one variable, is a vector space with a countable basis with elements vᵢ = xⁱ, and as before we can define the linearly independent forms vᵢ*. Every formal infinite sum Σᵢ aᵢvᵢ* is now a well-defined linear form on K[x], because it will only ever be evaluated on finite linear combinations of the basis polynomials xⁱ, i = 0, 1, 2, .... The countable set of all vᵢ* is thus not a basis; in fact, one can show that this dual space has no countable basis at all.

2.40. The size of vectors and the scalar product. When dealing with the geometry of the plane R² in the first chapter, in paragraph 1.29, we already worked not only with bases and linear mappings but also with sizes of vectors and their angles. For defining these notions we used the scalar product of two vectors v = (x, y) and v′ = (x′, y′) in the form v·v′ = xx′ + yy′. Indeed, the actual expression for the size of v = (x, y) is

‖v‖ = √(x² + y²) = √(v·v),

while the (oriented) angle φ of two vectors is determined by the same product through cos φ = v·v′/(‖v‖·‖v′‖).

From these considerations it can be seen that in the Euclidean plane two vectors are perpendicular whenever their scalar product is zero. In real vector spaces of any dimension we shall try a similar approach, because the concept of the angle of two vectors is intrinsically two-dimensional (we want the angle to be the same as in the two-dimensional subspace containing u and v). In the following paragraphs we consider only finite-dimensional vector spaces over the real scalars R.

Scalar product and perpendicularity

A scalar product on a vector space V over the real numbers is a mapping ( , ) : V × V → R which is symmetric in its arguments, linear in each of its arguments, and such that (v, v) ≥ 0 and ‖v‖² = (v, v) = 0 if and only if v = 0.

The number ‖v‖ = √(v, v) is called the size of the vector v. Vectors v and w ∈ V are called orthogonal or perpendicular whenever (v, w) = 0; we also write v ⊥ w. The vector v is called normalised whenever ‖v‖ = 1. A basis of the space V composed of mutually orthogonal vectors only is called an orthogonal basis; if the vectors are additionally normalised, it is an orthonormal basis.

And how do we write down a linear mapping f on a vector space V? Let us start, for simplicity, with the plane R²: assume that the image of the point (vector) (1, 0) is (a, b) and the image of the point (vector) (0, 1) is (c, d). This uniquely determines the image of an arbitrary point with coordinates (u, v):

f((u, v)) = f(u(1, 0) + v(0, 1)) = u f(1, 0) + v f(0, 1) = (ua, ub) + (vc, vd) = (au + cv, bu + dv),

which can be written down efficiently as

(a c; b d)·(u; v) = (au + cv; bu + dv).

A linear mapping is thus uniquely determined by a matrix. Furthermore, if we have another linear mapping g given by the matrix (e f; g h), then we can easily compute (an interested reader can fill in the details) that their composition g∘f is given by the matrix product

(e f; g h)·(a c; b d) = (ea + fb, ec + fd; ga + hb, gc + hd).

This leads us to define matrix multiplication in exactly this way: we want the application of a mapping to a vector to be given by the multiplication of the matrix of the mapping with the given vector, and the composition of mappings to be given by the product of the corresponding matrices. It works analogously in spaces of higher dimension. This again shows what was already proved in (2.5): matrix multiplication is associative but not commutative, because that is how mapping composition behaves. This is yet another motivation for investigating vector spaces. A small numerical sketch follows.
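The sketch below follows the convention just introduced (columns of the matrix are the images of the basis vectors); the two concrete 2×2 mappings are chosen only for illustration.

```python
import numpy as np

def mapping(a, b, c, d):
    # f(1,0) = (a,b), f(0,1) = (c,d): the images form the columns.
    return np.array([[a, c],
                     [b, d]], dtype=float)

F = mapping(0., 1., -1., 0.)    # rotation through the right angle
G = mapping(1., 0., 0., -1.)    # mirror symmetry with respect to the x-axis

u = np.array([2., 1.])
# Applying the composition g∘f equals multiplying by the product G·F:
assert np.allclose(G @ (F @ u), (G @ F) @ u)

print(G @ F)   # differs from F @ G below:
print(F @ G)   # matrix multiplication is not commutative
```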
Returning to the scalar product: for two vectors u = x₁v₁ + ⋯ + xₙvₙ and w = y₁v₁ + ⋯ + yₙvₙ expressed in a basis v = (v₁, ..., vₙ), the bilinearity of the product gives

(u, w) = (Σᵢ xᵢvᵢ, Σⱼ yⱼvⱼ) = Σᵢⱼ xᵢyⱼ(vᵢ, vⱼ),

so in coordinates the product is computed through the symmetric matrix S = ((vᵢ, vⱼ)) of the mutual products of the basis vectors. If the basis is orthonormal, the matrix S is the unit matrix. This proves the following useful claim:

Scalar product and orthonormal basis

Proposition. The scalar product is, in every orthonormal basis, given in coordinates by the expression (x, y) = xᵀ·y. For every general basis of the space V there is a symmetric matrix S such that the coordinate expression of the scalar product is (x, y) = xᵀ·S·y.

2.41. Orthogonal complements and projections. For every fixed subspace W ⊂ V in a space with scalar product we define its orthogonal complement as

W⊥ = {u ∈ V; u ⊥ v for all v ∈ W}.

Directly from the definition it is clear that W⊥ is a vector subspace. If W ⊂ V has a basis (u₁, ..., u_k), the condition for W⊥ is given by k homogeneous equations for n variables; thus W⊥ has dimension at least n − k. At the same time, u ∈ W ∩ W⊥ means (u, u) = 0 and hence u = 0, by the definition of the scalar product. Clearly, then, the whole space is the direct sum

V = W ⊕ W⊥.

A linear mapping f : V → V on any vector space is called a projection if f∘f = f. In such a case, every vector v ∈ V satisfies

v = f(v) + (v − f(v)) ∈ Im(f) + Ker(f) = V,

and if v ∈ Im(f) and f(v) = 0, then also v = 0. The above sum of subspaces is therefore direct. We say that f is the projection onto the subspace W = Im(f) along the subspace U = Ker(f). In words, the projection can be described naturally as follows: we decompose the given vector into its component in W and its component in U and forget the second one. If V has a scalar product, we say that the projection is perpendicular if the kernel is perpendicular to the image.

Let us now recall that already in the first chapter we worked with matrices of some linear mappings in the plane R², notably with the rotation around a point and with axial symmetry (see 1.31 and 1.32). Let us now try to write down matrices of linear mappings from R³ to R³. What does the matrix of a rotation in three dimensions look like? Let us begin with the special (easier to describe) rotations about the coordinate axes.

2.64. Matrices of rotations about the coordinate axes in R³. Write down the matrices of the rotations by the angle φ about the three coordinate axes.

Solution. When rotating any point about a given axis (say x), the corresponding coordinate (x) does not change, and the remaining two coordinates are transformed by the rotation in the plane which we already know (a matrix of the type 2 × 2). Thus we gradually obtain the following matrices. Rotation about the axis z:

(cos φ  −sin φ  0)
(sin φ   cos φ  0)
(0       0      1);

rotation about the axis y:

(cos φ   0  sin φ)
(0       1  0)
(−sin φ  0  cos φ);

rotation about the axis x:

(1  0       0)
(0  cos φ  −sin φ)
(0  sin φ   cos φ).

The sign at φ in the matrix of the rotation about y is different. We want, as with any other rotation, the rotation about the y axis to be in the positive sense: when we look against the direction of the y axis, the world turns anti-clockwise. The signs in the matrices depend on the orientation of our coordinate system. Usually, in 3-dimensional space, the so-called "dextrorotary" (right-handed) coordinate system is chosen: if we place our hand on the x axis so that the fingers point in its direction, and we can rotate the x axis in the xy plane so that it coincides with the y axis and points the same way, then the thumb points in the direction of the z axis. In such a system, the rotation above is a rotation in the negative sense in the plane xz (that is, the z axis turns towards x). Think about the positive and negative sense of rotations about all three axes. □

The knowledge of these matrices allows us to write down the matrix of the rotation about any (oriented) axis. Let us start with a specific example.

2.65. Find the matrix of the rotation in the positive sense through the angle π/3 about the line passing through the origin with the oriented directional vector (1, 1, 0), under the standard basis of R³.

Solution.
The given rotation can easily be obtained by composing the following three mappings:

• the rotation through the angle π/4 in the negative sense about the axis z (the axis of the rotation goes over to the x axis);
• the rotation through the angle π/3 in the positive sense about the x axis;
• the rotation through the angle π/4 in the positive sense about the z axis (the x axis goes back over to the axis of the rotation).

Every subspace W ≠ V in a space with scalar product thus defines a perpendicular projection onto W. It is the projection onto W along W⊥, given by the unique decomposition of every vector u into components u_W ∈ W and u_{W⊥} ∈ W⊥; that is, the linear mapping which sends u_W + u_{W⊥} to u_W.

2.42. Existence of orthonormal bases. Note that on every finite-dimensional real vector space there certainly exist scalar products: just pick any basis, call it orthonormal, and we immediately have a scalar product; in this basis the products are computed as in the formula of the proposition in 2.40.

But we can also proceed the other way round. If we are given a scalar product on a vector space V, we can use suitable perpendicular projections to transform any basis into an orthonormal one. This is called the Gram-Schmidt orthogonalisation process. The point of the procedure is to transform a given sequence of nonzero generators v₁, ..., v_k of a finite-dimensional space V into an orthogonal set of nonzero generators of V.

Gram-Schmidt orthogonalisation

Proposition. Let (u₁, ..., u_k) be a linearly independent k-tuple of vectors of a space V with scalar product. Then there exists an orthogonal system of vectors (v₁, ..., v_k) such that vᵢ ∈ ⟨u₁, ..., uᵢ⟩ for i = 1, ..., k. We obtain it by the following procedure:

• the independence of the vectors ensures u₁ ≠ 0; we choose v₁ = u₁;
• if we have already constructed the vectors v₁, ..., v_ℓ with the required properties, we choose v_{ℓ+1} = u_{ℓ+1} + a₁v₁ + ⋯ + a_ℓv_ℓ, where aᵢ = −(u_{ℓ+1}, vᵢ)/(vᵢ, vᵢ).

Proof. Let us begin with the first (nonzero) vector v₁ and calculate the perpendicular projection of u₂ onto ⟨v₁⟩⊥ ⊂ ⟨v₁, u₂⟩. The result is nonzero if and only if u₂ is independent of v₁. In all further steps we work similarly. In the ℓ-th step we require that v_{ℓ+1} = u_{ℓ+1} + a₁v₁ + ⋯ + a_ℓv_ℓ satisfies (v_{ℓ+1}, vᵢ) = 0 for all i = 1, ..., ℓ. This implies

0 = (u_{ℓ+1} + a₁v₁ + ⋯ + a_ℓv_ℓ, vᵢ) = (u_{ℓ+1}, vᵢ) + aᵢ(vᵢ, vᵢ),

and we see that vectors with the desired properties are determined uniquely up to scalar multiples. □

Whenever we have an orthogonal basis of a vector space V, we only need to normalise the vectors in order to obtain an orthonormal basis. Thus we have proved:

Corollary. On every finite-dimensional real vector space with scalar product there exists an orthonormal basis.

In an orthonormal basis, coordinates and perpendicular projections are very easy to calculate. Indeed, suppose we have an orthonormal basis (e₁, ..., eₙ) of a space V. Then every vector v = x₁e₁ + ⋯ + xₙeₙ satisfies

(eᵢ, v) = (eᵢ, x₁e₁ + ⋯ + xₙeₙ) = xᵢ,

and it always holds that

(2.3) v = (e₁, v)e₁ + ⋯ + (eₙ, v)eₙ.

A computational sketch of the orthogonalisation process follows.
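A minimal sketch of the procedure from the proposition, in numpy. The sample vectors are the generators of the hyperplane x₁ + x₂ + x₃ + x₄ = 0 used later in exercise 2.82; everything else is an illustration, not part of the text.

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn a linearly independent list u_1..u_k into an orthogonal one,
    keeping span(v_1..v_i) = span(u_1..u_i), exactly as in the proposition:
    subtract from each u its projections onto the already built v's."""
    basis = []
    for u in vectors:
        v = u.astype(float).copy()
        for w in basis:
            v -= (v @ w) / (w @ w) * w   # subtract the projection onto w
        basis.append(v)
    return basis

us = [np.array([-1., 1., 0., 0.]),
      np.array([-1., 0., 1., 0.]),
      np.array([-1., 0., 0., 1.])]
vs = gram_schmidt(us)
for i, v in enumerate(vs):
    for w in vs[:i]:
        assert abs(v @ w) < 1e-12        # pairwise orthogonal
print(np.array(vs))
# Normalising each v would give an orthonormal basis of the subspace.
```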
Returning to exercise 2.65: the matrix of the resulting rotation is the product of the matrices corresponding to the three mappings, the order of the factors being given by the order of application of the mappings; the first mapping applied is the rightmost one in the product. Thus we obtain the desired matrix

R = R_z(π/4)·R_x(π/3)·R_z(−π/4)

= (√2/2 −√2/2 0; √2/2 √2/2 0; 0 0 1) · (1 0 0; 0 1/2 −√3/2; 0 √3/2 1/2) · (√2/2 √2/2 0; −√2/2 √2/2 0; 0 0 1)

= (3/4 1/4 √6/4; 1/4 3/4 −√6/4; −√6/4 √6/4 1/2).

Note that the resulting rotation could also be obtained, for instance, by composing the following three mappings: the rotation through π/4 in the positive sense about the z axis (the axis of rotation goes over to the y axis); the rotation through π/3 in the positive sense about the y axis; and the rotation through π/4 in the negative sense about the z axis (the y axis goes back over to the axis of rotation). Analogously we obtain

R = R_z(−π/4)·R_y(π/3)·R_z(π/4),

with the same result. □

If we are given a subspace W ⊂ V and its orthonormal basis (e₁, ..., e_k), we can surely extend it to an orthonormal basis (e₁, ..., eₙ) of the whole V. The perpendicular projection of a general vector v ∈ V onto W is then given by the relation

v ↦ (e₁, v)e₁ + ⋯ + (e_k, v)e_k.

For the perpendicular projection it is thus enough to know an orthonormal basis of the subspace W onto which we are projecting. Let us also note that, in general, the projection f onto W along U and the projection g onto U along W are tied by the relation g = id_V − f. When dealing with perpendicular projections onto a given subspace W, it is therefore always more efficient to compute an orthonormal basis of whichever of W and W⊥ has the smaller dimension.

Let us also note that the existence of an orthonormal basis ensures that for every real space V of dimension n with scalar product there exists a linear mapping which is an isomorphism between V and the space Rⁿ with the standard scalar product; one such isomorphism is exactly the coordinate assignment, as shown in 2.40. In words: in an orthonormal basis, the scalar product is computed in coordinates by the same formula as the standard scalar product in Rⁿ. We shall return to the questions of the size of a vector and of projections in the following chapters in a more general context.

2.43. Angle of two vectors. As we have already noted, the angle of two linearly independent vectors in space must be the same as when we consider them in the two-dimensional subspace they generate. Basically, this is the reason why the notion of angle is independent of the dimension of the original space: if we choose an orthogonal basis whose first two vectors generate the same subspace as the two given vectors u and v (whose angle we are measuring), we can simply take the definition from planar geometry. Even without choosing a basis, it must then hold:

Angle of two vectors

The angle φ of two nonzero vectors u and v is given by cos φ = (u, v)/(‖u‖·‖v‖), 0 ≤ φ ≤ π.

The matrices of the rotations about the coordinate axes now let us treat a general axis.

2.66. Matrix of a general rotation in R³. Derive the matrix of a general rotation in R³.

Solution. We can do the same as in the previous example, with general values. Consider an arbitrary unit vector (x, y, z) and the rotation through the angle φ in the positive sense about it. The rotation is the composition of the following mappings (the readable parts of the source give the second of them explicitly; the first is reconstructed analogously):

i) the rotation R₁ about the z axis, under which the line with the directional vector (x, y, 0) goes over to the line with the directional vector (√(x² + y²), 0, 0) = (√(1 − z²), 0, 0); its matrix is

R₁ = (x/√(1−z²)  y/√(1−z²)  0;  −y/√(1−z²)  x/√(1−z²)  0;  0  0  1);

ii) the rotation R₂ in the positive sense about the y axis through the angle with cosine √(1 − z²), that is, with sine z, under which the line with the directional vector (√(1 − z²), 0, z) goes over to the line with the directional vector (1, 0, 0); its matrix is

R₂ = (√(1−z²)  0  z;  0  1  0;  −z  0  √(1−z²));

iii) the rotation R₃ in the positive sense about the x axis through the angle φ.

The desired matrix of the general rotation is then the product

R = R₁⁻¹·R₂⁻¹·R₃·R₂·R₁,

whose entries are obtained by direct multiplication; a numerical check follows below. □
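The sketch below verifies the composition used in 2.65 numerically: the resulting matrix is orthogonal, fixes the axis (1, 1, 0), and has the trace 1 + 2cos φ of a rotation through φ. Only numpy is used; the helper names Rz, Rx are mine.

```python
import numpy as np

def Rz(t):
    return np.array([[np.cos(t), -np.sin(t), 0.],
                     [np.sin(t),  np.cos(t), 0.],
                     [0., 0., 1.]])

def Rx(t):
    return np.array([[1., 0., 0.],
                     [0., np.cos(t), -np.sin(t)],
                     [0., np.sin(t),  np.cos(t)]])

# Rotation through pi/3 about the axis (1,1,0), composed as in 2.65:
R = Rz(np.pi/4) @ Rx(np.pi/3) @ Rz(-np.pi/4)

axis = np.array([1., 1., 0.]) / np.sqrt(2.)
assert np.allclose(R @ axis, axis)                      # the axis is fixed
assert np.allclose(R.T @ R, np.eye(3))                  # R is orthogonal
assert np.isclose(np.trace(R), 1. + 2.*np.cos(np.pi/3)) # trace = 1 + 2cos(phi)
print(np.round(R, 4))   # matches the matrix computed above
```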
2.44. The scalar product on V can equivalently be encoded by two mappings:

• V × V → K, the product itself, which for any four vectors u, v, w, z and scalars a, b is linear in each argument: (au + bv, w) = a(u, w) + b(v, w) and (u, aw + bz) = a(u, w) + b(u, z);

• V → V*, v ↦ ( , v); that is, plugging a fixed vector into the second argument, we obtain a linear form, which is the image of this vector.

If we choose a fixed basis on a finite-dimensional space V and the dual basis on V*, then this gives, for each form written as a row y, the mapping x ↦ y·x.

4. Properties of linear mappings

A more detailed analysis of the properties of various types of linear mappings will now lead us to a better understanding of the tools which vector spaces give us for the modelling of linear processes and systems.

2.45. Let us begin with four examples in the lowest interesting dimension. In the standard basis of the plane R², with the standard scalar product, we consider the following matrices of mappings f : R² → R²:

A = (1 0; 0 0),  B = (0 1; 0 0),  C = (a 0; 0 b),  D = (0 −1; 1 0).

The matrix A gives the perpendicular projection along the subspace W = {(0, a); a ∈ R} ⊂ R² onto the subspace V = {(a, 0); a ∈ R} ⊂ R², that is, the projection onto the x-axis along the y-axis. Evidently, for this mapping f : R² → R² it holds that f∘f = f, and thus the restriction of f to its image is the identity mapping. The kernel of f is exactly the subspace W.

The matrix B has the property B² = 0, and therefore the same holds for the corresponding mapping f. We can envision it as the differentiation of polynomials R₁[x] of degree at most one in the basis (1, x) (with differentiation we shall deal in chapter five, see ??). Both properties are checked in the sketch below.

The transition matrix for changing the basis from the standard basis to the basis f is then obtained as above, and the matrix of the mapping in the basis f is T⁻¹AT. □

2.68. Consider the vector space of polynomials in one variable of degree at most 2 with real coefficients, and in it the basis 1, x, x². Write down the matrix of the derivative mapping in this basis, and also in the basis 1 + x², x, x + x².

Solution. The two matrices are

(0 1 0; 0 0 2; 0 0 0) and (0 1 1; 2 1 3; 0 −1 −1),

respectively. □
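The promised check of the two properties of A and B from 2.45, in numpy; the polynomial 3 + 5x in the last line is an arbitrary example of mine.

```python
import numpy as np

A = np.array([[1., 0.],
              [0., 0.]])   # projection onto the x-axis along the y-axis
B = np.array([[0., 1.],
              [0., 0.]])   # "differentiation" on polynomials a + bx

assert np.allclose(A @ A, A)                 # f∘f = f: A is a projection
assert np.allclose(B @ B, np.zeros((2, 2)))  # B² = 0: B is nilpotent

# B in the basis (1, x): differentiation sends coordinates (a, b) to (b, 0)
print(B @ np.array([3., 5.]))   # the derivative of 3 + 5x is 5
```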
2.69. In the standard basis of R³ determine the matrix of the rotation through the angle 90° in the positive sense about the line (t, t, t), t ∈ R, oriented in the direction of the vector (1, 1, 1). Further, give the matrix of this rotation in the basis ε = ((1, 1, 0), (1, 0, −1), (0, 1, 1)).

Solution. We can easily determine the matrix of the given rotation in a suitable basis: one given by the directional vector of the line and by two mutually perpendicular vectors in the plane x + y + z = 0, that is, in the plane of vectors perpendicular to (1, 1, 1). Note that the matrix of the rotation in the positive sense through 90° in an orthonormal basis of R² is (0 −1; 1 0); in an orthogonal basis whose vectors have sizes k and l it is (0 −l/k; k/l 0). If we choose the perpendicular vectors (1, −1, 0) and (1, 1, −2) in the plane x + y + z = 0, with sizes √2 and √6, then in the basis f = ((1, 1, 1), (1, −1, 0), (1, 1, −2)) the rotation we are looking for has the matrix

(1 0 0; 0 0 −√3; 0 1/√3 0).

To obtain the matrix of the rotation in the standard basis it is enough to change the basis. The transition matrix T for changing the basis from f to the standard basis is obtained by writing the coordinates (in the standard basis) of the vectors of f as the columns of T:

T = (1 1 1; 1 −1 1; 1 0 −2).

Finally, for the desired matrix R we have

R = T·(1 0 0; 0 0 −√3; 0 1/√3 0)·T⁻¹ = (1/3, 1/3 − √3/3, 1/3 + √3/3; 1/3 + √3/3, 1/3, 1/3 − √3/3; 1/3 − √3/3, 1/3 + √3/3, 1/3).

This result can be checked by plugging into the matrix of the general rotation (2.66): normalising the vector (1, 1, 1) gives (x, y, z) = (1/√3, 1/√3, 1/√3), with cos φ = 0, sin φ = 1. □

The matrix C gives a mapping f which enlarges the first vector of the basis a-times and the second b-times. Therefore the whole plane splits into two subspaces which are preserved by the mapping, and on which f is merely a homothety, that is, scaling by a scalar multiple (the matrix A was a special case, with a = 1, b = 0). For instance, the choice a = 1, b = −1 corresponds to the axial (mirror) symmetry with respect to the x-axis, which is the same as the complex conjugation x + iy ↦ x − iy on the two-dimensional real space R² ≅ C in the basis (1, i). This is a linear mapping of the two-dimensional real vector space C, but not of the one-dimensional complex space C.

The matrix D is the matrix of the rotation through the right angle in the standard basis, and at first sight we can see that no one-dimensional subspace is preserved by this mapping. Such a rotation is a bijection of the plane onto itself, so we can surely find distinct bases in the domain and codomain in which its matrix is the unit matrix E (we simply take any basis of the domain and its image in the codomain); but we are not able to do this with the same basis for both the domain and the codomain. Let us instead view D as the matrix of a mapping g : C² → C² in the standard basis of the complex vector space C². Then we can find the vectors u = (i, 1) and v = (−i, 1), for which

(0 −1; 1 0)·(i; 1) = i·u,  (0 −1; 1 0)·(−i; 1) = −i·v.

That means that in the basis (u, v) of C² the mapping g has the matrix (i 0; 0 −i).

Note that this complex analogy to the case of the matrix C has on the diagonal the elements λ = cos(½π) + i sin(½π) and its complex conjugate λ̄. In other words, the argument of this number in polar form gives the angle of the rotation. This is easy to understand if we write the real and imaginary parts of the vector u as

x_u = Re u,  y_u = −Im u,  so that u = x_u − i·y_u.

The vector v is the complex conjugate of u. We are interested in the restriction of the mapping g to the real vector space V = R² ∩ ⟨u, v⟩. Evidently

V = ⟨u + ū, i(u − ū)⟩ = ⟨x_u, y_u⟩.

2.46. Eigenvalues and eigenvectors. Consider an arbitrary linear mapping f : V → V on a vector space of dimension n over K together with the equation f(u) = a·u. Written in coordinates, that is, using the matrix A of the mapping in some basis, this is the expression

A·x = a·x, that is, (A − a·E)·x = 0.

From the previous sections we know that such a system of equations has the unique solution x = 0 whenever the matrix A − aE is invertible. We thus want to find those values a ∈ K for which A − aE is not invertible; the necessary and sufficient condition for that is (see Theorem 2.23)

(2.4) det(A − a·E) = 0.

If we consider λ = a as a variable in this scalar equation, we are actually looking for the roots of a polynomial of degree n. As we have seen in the case of the matrix D, the roots may exist, but do not have to, depending on the field of scalars K we are working over.

Eigenvalues and eigenvectors

Scalars λ satisfying the equation f(u) = λ·u for a nonzero vector u ∈ V are called eigenvalues of the mapping f; the corresponding nonzero vectors u are then eigenvectors of the mapping f.

If u, v are eigenvectors associated with the same eigenvalue λ, then for every linear combination of u and v it holds that

f(au + bv) = a f(u) + b f(v) = λ(au + bv).

Therefore the eigenvectors associated with the same eigenvalue λ form, together with the zero vector, a nontrivial vector subspace V_λ, the eigenspace associated with λ. A quick numerical illustration with the matrix D follows.
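A small sketch of the discussion of D above: over R the characteristic polynomial λ² + 1 has no roots, and numpy returns the complex eigenvalues ±i.

```python
import numpy as np

D = np.array([[0., -1.],
              [1.,  0.]])            # rotation through the right angle

vals, vecs = np.linalg.eig(D)        # complex output for a real matrix
print(vals)                          # [0.+1.j  0.-1.j]
assert np.allclose(np.abs(vals), 1.) # both eigenvalues lie on the unit circle
```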
For instance, if λ = 0 is an eigenvalue, the kernel Ker f is the eigenspace V₀.

From the definition of the eigenvalues it is clear that their computation cannot depend on the choice of the basis and of the matrix of the mapping f. Indeed, as a direct corollary of the transformation properties from paragraph 2.38 and of the Cauchy theorem 2.19 on the determinant of a product, choosing different coordinates gives a matrix A′ = P⁻¹AP with an invertible matrix P, and

|P⁻¹AP − λE| = |P⁻¹AP − P⁻¹(λE)P| = |P⁻¹(A − λE)P| = |P⁻¹|·|A − λE|·|P| = |A − λE|,

because scalar multiplication is commutative and |P⁻¹| = |P|⁻¹. For these reasons we use the same terminology for matrices and mappings:

2.72. Consider the complex numbers as a real vector space and choose 1 and i for its basis. Determine, in this basis, the matrices of the following linear mappings: a) conjugation, b) multiplication by the number 2 + i. Determine the matrices of these mappings in the basis f = ((1 − i), (1 + i)).

Solution. To determine the matrix of a linear mapping in some basis, it is enough to determine the images of the basis vectors.

a) For conjugation we have 1 ↦ 1, i ↦ −i; written in coordinates, (1, 0) ↦ (1, 0) and (0, 1) ↦ (0, −1). Writing the images into the columns, we obtain the matrix (1 0; 0 −1). In the basis f, conjugation swaps the basis vectors, that is, (1, 0) ↦ (0, 1) and (0, 1) ↦ (1, 0), and the matrix of conjugation in this basis is (0 1; 1 0).

b) For the basis (1, i) we obtain 1 ↦ 2 + i, i ↦ 2i − 1, that is, (1, 0) ↦ (2, 1), (0, 1) ↦ (−1, 2). Thus the matrix of multiplication by 2 + i in the basis (1, i) is (2 −1; 1 2). Now let us determine the matrix in the basis f. Multiplication by 2 + i gives (1 − i) ↦ (1 − i)(2 + i) = 3 − i and (1 + i) ↦ (1 + i)(2 + i) = 1 + 3i. The coordinates (a, b)_f of the vector 3 − i in the basis f are given, as we know, by the equation a·(1 − i) + b·(1 + i) = 3 − i, that is, (3 − i)_f = (2, 1). Analogously (1 + 3i)_f = (−1, 2). Altogether we have obtained the matrix (2 −1; 1 2).

Think about the following: why is the matrix of multiplication by 2 + i the same in both bases? Would the two matrices in these bases coincide for multiplication by an arbitrary complex number? □

2.73. Determine the matrix A which, in the standard basis of the space R³, gives the orthogonal projection onto the vector subspace generated by the vectors u₁ = (−1, 1, 0) and u₂ = (−1, 0, 1).

Solution. Let us first note that the given subspace is the plane through the origin with the normal vector u₃ = (1, 1, 1): the ordered triple (1, 1, 1) is clearly a solution of the system

−x₁ + x₂ = 0,  −x₁ + x₃ = 0,

that is, the vector u₃ is perpendicular to the vectors u₁ and u₂. Under the given projection, the vectors u₁ and u₂ must map to themselves and the vector u₃ to the zero vector. In the basis composed of

Characteristic polynomial of a matrix and of a mapping

For a matrix A of dimension n over K we call the polynomial |A − λE| ∈ Kₙ[λ] the characteristic polynomial of the matrix A. The roots of this polynomial are the eigenvalues of the matrix A. If A is the matrix of a mapping f : V → V in a certain basis, then |A − λE| is also called the characteristic polynomial of the mapping f.

Because the characteristic polynomial of a linear mapping f : V → V is independent of the choice of basis of V, its coefficients at the individual powers of the variable λ are scalars expressing properties of f itself; that is, they cannot depend on the choice of the basis.
Notably, as a simple exercise in calculating determinants, we can express the coefficients at the highest and lowest powers (we assume dim V = n and the matrix of the mapping A = (aᵢⱼ) in a certain basis):

|A − λ·E| = (−1)ⁿλⁿ + (−1)ⁿ⁻¹(a₁₁ + ⋯ + aₙₙ)·λⁿ⁻¹ + ⋯ + |A|·λ⁰.

The coefficient at the highest power says only whether the dimension of the space V is even or odd. We have already noted that the determinant of the matrix of a mapping expresses how the given linear mapping scales volumes. What is interesting is that the sum of the diagonal elements of the matrix of a mapping also does not depend on the choice of the basis. We call it the trace of the matrix and denote it by Tr A; the trace of a mapping is defined as the trace of its matrix in an arbitrary basis. In reality this is not so surprising: in chapter eight we show, as an illustration of methods of differential calculus, that the trace is actually the linear approximation of the determinant in the neighbourhood of the unit matrix, see ??.

In the following paragraphs we show a few important properties of eigenspaces.

2.47. Theorem. Eigenvectors of a linear mapping f : V → V associated with different eigenvalues are linearly independent.

Proof. Let a₁, ..., a_k be distinct eigenvalues of the mapping f and u₁, ..., u_k eigenvectors with these eigenvalues. We do the proof by induction on the number of linearly independent vectors among the chosen ones. Assume that u₁, ..., u_ℓ are linearly independent and that u_{ℓ+1} = Σᵢ cᵢuᵢ is their linear combination. At least ℓ = 1 can be chosen, because eigenvectors are nonzero. But then f(u_{ℓ+1}) = a_{ℓ+1}·u_{ℓ+1} = Σᵢ a_{ℓ+1}·cᵢ·uᵢ, while at the same time

f(u_{ℓ+1}) = Σᵢ cᵢ·f(uᵢ) = Σᵢ cᵢ·aᵢ·uᵢ.

Subtracting the two expressions we obtain 0 = Σᵢ (a_{ℓ+1} − aᵢ)·cᵢ·uᵢ. All the differences between the eigenvalues are nonzero and at least one coefficient cᵢ is nonzero. That is a contradiction with the assumed linear independence of u₁, ..., u_ℓ, and therefore the vector u_{ℓ+1} must be linearly independent of the others. □

Returning to exercise 2.73: in the basis u₁, u₂, u₃ (in this order) the matrix of the projection is

(1 0 0; 0 1 0; 0 0 0).

Using the transition matrix

T = (−1 −1 1; 1 0 1; 0 1 1)

for changing the basis from (u₁, u₂, u₃) to the standard basis (and its inverse for the opposite direction), we obtain

A = T·(1 0 0; 0 1 0; 0 0 0)·T⁻¹ = (2/3 −1/3 −1/3; −1/3 2/3 −1/3; −1/3 −1/3 2/3). □

The theorem just proved can be seen as a decomposition of a linear mapping f into a sum of very simple mappings. For distinct eigenvalues λᵢ of the characteristic polynomial we obtain one-dimensional eigenspaces V_{λᵢ}; each of them describes a projection onto this invariant one-dimensional subspace, on which the mapping acts merely as multiplication by the eigenvalue λᵢ. The whole space V is then decomposed into the direct sum of the individual eigenspaces, and this decomposition is easily calculated, as the corollary below shows.

2.74. In the vector space R³ determine the matrix of the orthogonal projection onto the plane x + y − 2z = 0. ○

2.75. In the vector space R³ determine the matrix of the orthogonal projection onto the plane 2x − y + 2z = 0. ○

I. Bases and inner products

Using the inner product we can solve, in a different (better?) way, problems we were already able to solve using changes of coordinates.

2.76. Write down the matrix of the mapping of orthogonal projection onto the plane passing through the origin and perpendicular to the vector (1, 1, 1).

Solution.
The image of an arbitrary point (vector) x = (x₁, x₂, x₃) ∈ R³ under the considered mapping can be obtained by subtracting from the given vector its orthogonal projection onto the direction normal to the considered plane, that is, onto the direction (1, 1, 1). By (2.3), this projection p is given by

p = ((x, (1, 1, 1))/‖(1, 1, 1)‖²)·(1, 1, 1) = ((x₁ + x₂ + x₃)/3, (x₁ + x₂ + x₃)/3, (x₁ + x₂ + x₃)/3).

The resulting mapping is thus

x ↦ x − p = ((2x₁ − x₂ − x₃)/3, (2x₂ − x₁ − x₃)/3, (2x₃ − x₁ − x₂)/3).

We have (correctly) obtained the same matrix as in exercise 2.73; see also the numerical sketch below. □

Corollary. If there exist n mutually distinct roots λᵢ of the characteristic polynomial of a mapping f : V → V on an n-dimensional space V, then there is a decomposition of V into a direct sum of eigenspaces of dimension 1. This means that there exists a basis of V composed only of eigenvectors, and in this basis f has a diagonal matrix. This basis is uniquely determined up to the order of its elements and up to scalar multiples. The corresponding basis (expressed in coordinates with respect to an arbitrarily chosen basis of V) is obtained by solving the n systems of homogeneous linear equations in n variables with the matrices (A − λᵢ·E), where A is the matrix of f in the chosen basis.

2.48. Invariant subspaces. We have seen that every eigenvector v of a mapping f : V → V generates a subspace ⟨v⟩ ⊂ V which is preserved by the mapping f. In more generality, we say that a vector subspace W ⊂ V is an invariant subspace for a linear mapping f if f(W) ⊂ W. From this point of view, the eigenspaces of the mapping are extremal cases of invariant subspaces, and notably, in the case of the existence of n = dim V distinct eigenvalues of the mapping f, we obtain a decomposition of V into a direct sum of n eigenspaces. In a suitable basis formed of eigenvectors the mapping then has diagonal form, with the eigenvalues on the diagonal.

2.77. In R³ the standard coordinate system is considered. In the plane z = 0 there is a mirror, and at the point [4, 3, 5] there is a candle. An observer at the point [1, 2, 3] is not aware of the mirror, but sees in it the reflection of the candle. At which point does he think the candle is?

Solution. [4, 3, −5]. □

2.78. Find the matrix of the reflection with respect to the plane x + y + z = 0.

Solution. From the equation of the plane we determine its unit normal vector; in our case it is n = (1, 1, 1)/√3. The reflection Z of a vector v can then be expressed as

Z·v = v − 2(v·n)·n = (E − 2n·nᵀ)·v,

where v·n denotes the standard scalar product. The matrix of the reflection is then

Z = E − 2n·nᵀ = (1/3)·(1 −2 −2; −2 1 −2; −2 −2 1). □

Using the inner product we can also determine the (angular) deflection of vectors:

2.79. Determine the deflection of the roots of the polynomial x² − i, considered as vectors in the complex plane.

Solution. The roots of the given polynomial are the square roots of i. By de Moivre's theorem, the arguments of the two square roots of any complex number differ by π. Their deflection is thus always π. □

2.80. Determine the cosine of the deflection of the lines p, q in R³ given by the equations

p: −2x + y + z = 1, x + 3y − 4z = 5;
q: x − y = −2, z = 6. ○
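The sketch below builds the projection matrix of 2.76 directly from the formula P = E − n·nᵀ/‖n‖² and checks the two defining properties; only numpy is used.

```python
import numpy as np

n = np.array([1., 1., 1.])
# Orthogonal projection onto the plane through 0 with normal n:
# subtract from x its projection (x,n)/|n|^2 · n.
P = np.eye(3) - np.outer(n, n) / (n @ n)

assert np.allclose(P @ P, P)      # a projection: P² = P
assert np.allclose(P @ n, 0.)     # the normal direction is the kernel
print(P)                          # equals ((2,-1,-1),(-1,2,-1),(-1,-1,2))/3
```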
2.81. A line p: [1, 1] + t·(4, …), t ∈ R is given. Determine a parametric expression of all lines q that pass through the origin and have deflection 60° with the line p. ○

2.49. Orthogonal mappings. Let us now have a look at the special case of mappings f : V → W between spaces with scalar products which preserve the sizes of all vectors u ∈ V.

Definition of orthogonal mapping

A linear mapping f : V → W between spaces with scalar products is called an orthogonal mapping if for all u ∈ V

(f(u), f(u)) = (u, u).

From the linearity of f and from the symmetry of the scalar product it follows that for all pairs of vectors the following equality holds:

(f(u + v), f(u + v)) = (f(u), f(u)) + (f(v), f(v)) + 2(f(u), f(v)).

Therefore all orthogonal mappings satisfy also the seemingly stronger condition that for all vectors u, v ∈ V,

(f(u), f(v)) = (u, v).

In the initial discussion about the geometry of the plane we proved in Theorem 1.33 that a linear mapping R² → R² preserves sizes of vectors if and only if its matrix in the standard basis (which is orthonormal with respect to the standard scalar product) satisfies Aᵀ·A = E, that is, A⁻¹ = Aᵀ.

In general, an orthogonal mapping f : V → W must always be injective, because the condition (f(u), f(u)) = 0 implies (u, u) = 0 and thus u = 0. In such a case the dimension of the image is always at least the dimension of the domain of f; but then both dimensions are equal, and we know that f : V → Im f is a bijection. If Im f ≠ W, we extend an orthonormal basis of the image of f to an orthonormal basis of the target space, and the matrix of the mapping then contains a square regular submatrix A together with zero rows so that it has the required size. Without loss of generality we may therefore assume W = V.

Our condition on the matrix of an orthogonal mapping then requires, in an orthonormal basis, that for all vectors x and y in the space Kⁿ:

(A·x)ᵀ·(A·y) = xᵀ·(Aᵀ·A)·y = xᵀ·y.

Specially choosing x and y to be vectors of the standard basis, we obtain directly Aᵀ·A = E, the same result as in dimension two. Thus we have obtained the following theorem:

Matrix of orthogonal mappings

Theorem. Let V be a real vector space with scalar product and let f : V → V be a linear mapping. Then f is orthogonal if and only if in some orthonormal basis (and then consequently in all of them) it has a matrix A satisfying Aᵀ = A⁻¹.

Proof. Indeed, if f preserves sizes, it must have the listed property in every orthonormal basis. On the other hand, the previous calculation shows that this property for the matrix in one orthonormal basis ensures the preservation of sizes. □

Square matrices satisfying the equality Aᵀ = A⁻¹ are called orthogonal matrices. A corollary of the previous theorem is also a description of all matrices S of changes between orthonormal bases: each must give a mapping Kⁿ → Kⁿ that preserves sizes, and thus satisfies the condition S⁻¹ = Sᵀ. When changing from one orthonormal basis to another, the matrix of any linear mapping therefore changes according to the relation

A′ = Sᵀ·A·S.
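A minimal check that the reflection of 2.78 is an orthogonal matrix, in numpy:

```python
import numpy as np

n = np.array([1., 1., 1.]) / np.sqrt(3.)     # unit normal of x + y + z = 0
Z = np.eye(3) - 2. * np.outer(n, n)          # reflection in that plane

assert np.allclose(Z.T @ Z, np.eye(3))       # Z is orthogonal: Zᵀ = Z⁻¹
assert np.allclose(Z @ Z, np.eye(3))         # reflecting twice = identity
vals = np.linalg.eigvals(Z)
print(np.sort(vals.real))                    # the spectrum is {-1, 1, 1}
```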
2.82. Using the Gram-Schmidt orthogonalisation, obtain an orthogonal basis of the subspace

U = {(x₁, x₂, x₃, x₄)ᵀ ∈ R⁴; x₁ + x₂ + x₃ + x₄ = 0}

of the space R⁴.

Solution. The set of solutions of the given homogeneous linear equation is clearly a vector space with the basis

u₁ = (−1, 1, 0, 0)ᵀ, u₂ = (−1, 0, 1, 0)ᵀ, u₃ = (−1, 0, 0, 1)ᵀ.

Let us denote the vectors of the orthogonal basis obtained by the Gram-Schmidt orthogonalisation process by v₁, v₂, v₃. First set v₁ = u₁. Further, let

v₂ = u₂ − ((u₂·v₁)/‖v₁‖²)·v₁ = u₂ − ½v₁ = (−½, −½, 1, 0)ᵀ,

that is, let us choose the multiple v₂ = (−1, −1, 2, 0)ᵀ. Further, let

v₃ = u₃ − ((u₃·v₁)/‖v₁‖²)·v₁ − ((u₃·v₂)/‖v₂‖²)·v₂ = u₃ − ½v₁ − (1/6)v₂ = (−1/3, −1/3, −1/3, 1)ᵀ.

Altogether we have (replacing v₃ by a suitable multiple)

v₁ = (−1, 1, 0, 0)ᵀ, v₂ = (−1, −1, 2, 0)ᵀ, v₃ = (−1, −1, −1, 3)ᵀ.

Let us add that, due to the simplicity of the exercise, we could immediately give an orthogonal basis: for instance the vectors

(1, −1, 0, 0)ᵀ, (0, 0, 1, −1)ᵀ, (1, 1, −1, −1)ᵀ,

or the vectors

(−1, 1, 1, −1)ᵀ, (1, −1, 1, −1)ᵀ, (−1, −1, 1, 1)ᵀ. □

2.83. Write down some basis of the real vector space of the matrices 3×3 over R with zero trace (the sum of the elements on the diagonal), and write down the coordinates of the matrix

(1 2 0; 0 2 0; 1 −2 −3)

in this basis. ○

2.84. Define some inner product on the vector space of matrices from the previous exercise. Compute the norm of the matrix from the previous exercise induced by the product you have defined. ○

2.50. Decomposition of an orthogonal mapping. Let us now have a more detailed look at the eigenvectors and eigenvalues of orthogonal mappings on a real vector space V with scalar product. Consider a fixed orthogonal mapping f : V → V with the matrix A in some orthonormal basis, and let us try to proceed as with the matrix D of the rotation in example 2.45. First, however, a general look at invariant subspaces of orthogonal mappings and their orthogonal complements. If f(W) ⊂ W holds for a subspace W ⊂ V and an orthogonal mapping f : V → V, then for every w ∈ W⊥ and every v ∈ W it holds that

(f(w), v) = (f(w), f(f⁻¹(v))) = (w, f⁻¹(v)) = 0,

because f⁻¹(v) ∈ W (the restriction of f to the invariant subspace W is a bijection of W onto itself). But that means f(W⊥) ⊂ W⊥. We have thus proved a simple but important proposition:

Proposition. The orthogonal complement of an invariant subspace of an orthogonal mapping is also invariant.

If all eigenvalues of an orthogonal mapping are real, this claim ensures that there always exists a basis of V composed of eigenvectors. Indeed, the restriction of f to the orthogonal complement of an invariant subspace is again an orthogonal mapping, so we can put one eigenvector after another into the basis until we obtain the whole decomposition of V. However, mostly the eigenvalues of orthogonal mappings are not real, and we again need to make a trip into complex vector spaces. Let us formulate the result right away:

Orthogonal mapping decomposition

Theorem. Let f : V → V be an orthogonal mapping on a vector space V with scalar product. Then all the roots of the characteristic polynomial of f have size one, and there exists a decomposition of V into one-dimensional eigenspaces corresponding to the eigenvalues λ = ±1 and two-dimensional subspaces P_{λ,λ̄}, on which f acts by the rotation through the angle equal to the argument of the complex number λ in the positive sense. All these subspaces are mutually orthogonal.

Proof. Without loss of generality we can work with the space V = Rᵐ with the standard scalar product. The mapping is thus given by an orthogonal matrix A, which we can equally see as the matrix of a linear mapping on the complex space Cᵐ (one which just happens to have all coefficients real). Necessarily there exist exactly m (complex) roots of the characteristic polynomial, counted with their algebraic multiplicities (see the fundamental theorem of algebra, ??). Furthermore, because the characteristic polynomial of the mapping has only real coefficients, its roots are either real or come in pairs of complex conjugates λ and λ̄. The eigenvectors in Cᵐ associated with such a pair of conjugate eigenvalues are the solutions of two systems of linear homogeneous equations which are complex conjugates of each other: the corresponding matrices of the systems are all real except for the eigenvalue entries.
Therefore the solutions of these systems are also complex conjugates of each other.

Now we use the fact that for every invariant subspace its orthogonal complement is also invariant. We first find the eigenspaces V±₁ associated with the real eigenvalues and restrict our mapping to the orthogonal complement of their sum. Without loss of generality we can thus assume that our orthogonal mapping has no real eigenvalues and that dim V = 2n > 0.

Let us now choose some eigenvalue λ = α + iβ, β ≠ 0, and let u_λ be an eigenvector associated with it. Analogously to the case of the rotation in the plane, given in paragraph 2.45 by the matrix D, we are interested in the real part of the sum of the two one-dimensional subspaces ⟨u_λ⟩ ⊕ ⟨u_λ̄⟩, where u_λ̄ is an eigenvector associated with the eigenvalue λ̄. This is the intersection of the given sum of complex subspaces with R²ⁿ; it is generated by the vectors u_λ + u_λ̄ and i(u_λ − u_λ̄), that is, it is the real vector subspace P_λ ⊂ R²ⁿ generated by the basis given by the real and imaginary parts of u_λ:

x_λ = Re u_λ, y_λ = −Im u_λ.

Because A·(u_λ + u_λ̄) = λu_λ + λ̄u_λ̄, and similarly for the second basis vector, P_λ is clearly an invariant subspace with respect to multiplication by the matrix A, and we obtain

A·x_λ = αx_λ + βy_λ,  A·y_λ = −βx_λ + αy_λ.

Because our mapping preserves sizes, the size of the eigenvalue λ must be equal to one. But that means that the restriction of our mapping to P_λ is the rotation through the argument of the eigenvalue λ. Note that the choice of the eigenvalue λ̄ instead of λ leads to the same subspace with the same rotation; we merely have it expressed in the basis x_λ̄, y_λ̄, so in coordinates we rotate through the angle with the opposite sign.

The proof of the whole theorem is finished, because by restricting our mapping to the orthogonal complement and repeating the previous construction we obtain the whole decomposition after n steps. □

2.85. Determine some basis of the vector space of all antisymmetric real square matrices of the type 4×4. Consider the standard inner product in this basis and, using this product, express the size of the matrix

(0 3 1 0; −3 0 1 2; −1 −1 0 2; 0 −2 −2 0). ○

2.86. Find the orthogonal complement U⊥ of the subspace U = {(x₁, x₂, x₃, x₄); x₁ = x₃, x₂ = x₃ + 6x₄} ⊂ R⁴.

Solution. The orthogonal complement U⊥ consists of exactly those vectors that are perpendicular to every solution of the system

x₁ − x₃ = 0,  x₂ − x₃ − 6x₄ = 0.

A vector is a solution of this system if and only if it is perpendicular to both vectors (1, 0, −1, 0) and (0, 1, −1, −6). Thus we have

U⊥ = {a·(1, 0, −1, 0) + b·(0, 1, −1, −6); a, b ∈ R}. □

2.87. Determine whether the subspaces U = ⟨(2, 1, 2, 2)⟩ and V = ⟨(−1, 0, −1, 2), (−1, 0, 1, 0), (0, 0, 1, −1)⟩ of the space R⁴ are orthogonal. If they are, is R⁴ = U ⊕ V, that is, is U⊥ = V?

2.88. Depending on the parameter t ∈ R, determine the dimension of the subspace U of the vector space R³, if U is generated by the vectors
(a) u₁ = (1, 1, 1), u₂ = (1, t, 1), u₃ = (2, 2, t);
(b) u₁ = (t, t, t), u₂ = (−4t, −4t, 4t), u₃ = (−2, −2, −2).

2.89. Construct an orthogonal basis of the subspace ⟨(1, 1, 1, 1), (1, 1, 1, −1), (−1, 1, 1, 1)⟩ of the space R⁴.

2.90. In the space R⁴ find some orthogonal basis of the subspace of all linear combinations of the vectors (1, 0, 1, 0), (0, 1, 0, −7), (4, −2, 4, 14), and of the subspace generated by the vectors (1, 2, 2, −1), (1, 1, −5, 3), (3, 2, 8, −7).

2.91. For which values of the parameters a, b ∈ R are the vectors (1, 1, 2, 0, 0), (1, −1, 0, 1, a), (1, b, 2, 3, −2) in the space R⁵ pairwise orthogonal?
We return to the ideas of this proof once again in the third chapter, when we study complex extensions of Euclidean vector spaces, see 3.26.

Remark. In dimension three, at least one of the eigenvalues ±1 must be real, because three is an odd number. The associated eigenspace is then the axis of the rotation of the three-dimensional space, through the angle given by the argument of the other eigenvalues. Try to think through how to detect in which direction the space is rotated, and also why the eigenvalue −1 means an additional reflection through the plane perpendicular to the axis of the rotation.

We shall return to the discussion of the properties of matrices and linear mappings later; before continuing with the general theory, the next chapter shows a couple of applications. We close this section with a general definition:

Spectrum of a linear mapping

2.51. Definition. The spectrum of a linear mapping f : V → V (the spectrum of a matrix) is the sequence of roots of the characteristic polynomial of f, together with multiplicities. The algebraic multiplicity of an eigenvalue is its multiplicity as a root of the characteristic polynomial; the geometric multiplicity of an eigenvalue is the dimension of the associated subspace of eigenvectors. The spectral radius of a linear mapping (of a matrix) is the greatest of the absolute values of its eigenvalues.

2.92. In the space R⁵ consider the subspace generated by the vectors (1, 1, −1, −1, 0), (1, −1, −1, 0, −1), (1, 1, 0, 1, 1), (−1, 0, −1, 1, 1). Find some basis of its orthogonal complement.

2.93. Describe the orthogonal complement of the subspace V of the space R⁴, if V is generated by the vectors (−1, 2, 0, 1), (3, 1, −2, 4), (−4, 1, 2, −4), (2, 3, −2, 5).

2.94. In the space R⁵ determine the orthogonal complement W⊥ of the subspace W, if
(a) W = {(r + s + t, −r + t, r + s, −t, s + t); r, s, t ∈ R};
(b) W is the set of solutions of the system of equations x₁ − x₃ = 0, x₁ − x₂ + x₃ − x₄ + x₅ = 0.

2.95. In the space R⁴ the vectors (1, −2, 2, 1), (1, 3, 2, 1) are given. Extend these two vectors to an orthogonal basis of the whole R⁴. (You can do it in any way you like, for instance using the Gram-Schmidt orthogonalisation process.)

In this terminology, our results about orthogonal mappings can be formulated as follows: the spectrum of an orthogonal mapping is always a subset of the unit circle in the complex plane. That means that the real part of the spectrum can contain only the values ±1, whose algebraic and geometric multiplicities coincide; the complex values of the spectrum then correspond to rotations in suitable two-dimensional subspaces which are mutually perpendicular.

2.96. Find some orthonormal basis of the subspace V ⊂ R⁴, where V = {(x₁, x₂, x₃, x₄) ∈ R⁴ | x₁ + 2x₂ + x₃ = 0}.

Solution. We see that the fourth coordinate does not appear in the restriction defining the subspace, so it seems reasonable to pick (0, 0, 0, 1) as one of the vectors of the orthonormal basis and reduce the problem to the subspace R³. Let us try once again to avoid any computation: if we set the second coordinate equal to zero, the investigated space contains the vectors with opposite first and third coordinates, notably the unit vector (1/√2, 0, −1/√2, 0). This vector is perpendicular to any vector whose first coordinate equals its third coordinate.
To stay inside the investigated subspace, we therefore take a vector with equal first and third coordinates whose second coordinate satisfies x₁ + 2x₂ + x₃ = 0, for instance (1, −1, 1, 0); after normalising we choose the vector (1/√3, −1/√3, 1/√3, 0), and we are done. □

J. Eigenvalues and eigenvectors

2.97. Eigenvalues and eigenvectors can be used for an illustrative description of linear mappings, notably in R² and R³.

(1) Consider the mapping with the matrix, under the standard basis,

A = (0 0 1; 0 1 0; 1 0 0).

We then obtain

|A − λE| = |(−λ 0 1; 0 1−λ 0; 1 0 −λ)| = −λ³ + λ² + λ − 1,

with the roots λ₁,₂ = 1, λ₃ = −1. The eigenvectors with the eigenvalue λ = 1 are computed from the system (A − E)x = 0, with the basis of the space of solutions, that is, of all eigenvectors with this eigenvalue,

u₁ = (0, 1, 0), u₂ = (1, 0, 1).

Similarly, for λ = −1 we obtain the third independent eigenvector u₃ = (−1, 0, 1). Under the basis u₁, u₂, u₃ (note that u₃ must be linearly independent of the remaining two thanks to the previous theorem, while u₁, u₂ were obtained as two independent solutions), f has the diagonal matrix

(1 0 0; 0 1 0; 0 0 −1).

The whole space R³ is a direct sum of eigenspaces, R³ = V₁ ⊕ V₋₁, dim V₁ = 2, dim V₋₁ = 1. This decomposition is uniquely determined and says a lot about the geometric properties of the mapping f. The eigenspace V₁ is furthermore a direct sum of one-dimensional eigenspaces, which can be chosen in many ways (thus such a finer decomposition has no further geometric meaning).

(2) Consider the linear mapping f : R₂[x] → R₂[x] defined by the differentiation of polynomials, that is, f(1) = 0, f(x) = 1, f(x²) = 2x. The mapping f thus has, in the usual basis (1, x, x²), the matrix

A = (0 1 0; 0 0 2; 0 0 0).

The characteristic polynomial is |A − λ·E| = −λ³, so it has the single eigenvalue λ = 0. We compute the eigenvectors: the system A·x = 0 has the space of solutions generated by (1, 0, 0). The space of eigenvectors is thus one-dimensional, generated by the constant polynomial 1.

2.98. An exercise with a change of basis. Determine the eigenvalues and eigenvectors of the matrix

A = (1 1 0; 1 2 1; 1 2 1).

Describe the geometric interpretation of this mapping and write down its matrix under the basis

e₁ = [1, −1, 1], e₂ = [1, 2, 0], e₃ = [0, 1, 1].

Solution. The characteristic polynomial of the matrix is

|(1−λ 1 0; 1 2−λ 1; 1 2 1−λ)| = −λ³ + 4λ² − 2λ = −λ(λ² − 4λ + 2).

The roots of this polynomial, the eigenvalues, are exactly the values for which the matrix A − λE does not have full rank, that is, for which the system of equations (A − λE)x = 0 has more solutions than just x = (0, 0, 0). The eigenvalues are thus 0, 2 + √2, 2 − √2. Let us compute the eigenvectors associated with the particular eigenvalues:

• 0: We solve the system with the matrix (1 1 0; 1 2 1; 1 2 1). Its solutions form the one-dimensional vector space of eigenvectors ⟨(1, −1, 1)⟩.

• 2 + √2: We solve the system with the matrix (−(1+√2) 1 0; 1 −√2 1; 1 2 −(1+√2)). The solutions form the one-dimensional space ⟨(1, 1+√2, 1+√2)⟩.

• 2 − √2: We solve the system with the matrix (√2−1 1 0; 1 √2 1; 1 2 √2−1). Its solutions form the space of eigenvectors ⟨(1, 1−√2, 1−√2)⟩.

The given matrix thus has the eigenvalues 0, 2 + √2 and 2 − √2, with the associated one-dimensional spaces of eigenvectors ⟨(1, −1, 1)⟩, ⟨(1, 1+√2, 1+√2)⟩ and ⟨(1, 1−√2, 1−√2)⟩, respectively. The mapping can thus be interpreted as the projection along the vector (1, −1, 1) onto the plane given by the vectors (1, 1+√2, 1+√2) and (1, 1−√2, 1−√2), composed with the linear mapping given by "stretching" by the factors corresponding to the eigenvalues in the directions of the associated eigenvectors.
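Before changing the basis, a quick numerical check of the eigenvalues and of one eigenvector of the matrix from 2.98, using numpy:

```python
import numpy as np

A = np.array([[1., 1., 0.],
              [1., 2., 1.],
              [1., 2., 1.]])
vals, vecs = np.linalg.eig(A)
print(np.sort(vals))   # approximately [0, 2 - sqrt(2), 2 + sqrt(2)]

# The eigenvector for the eigenvalue 0 found above:
assert np.allclose(A @ np.array([1., -1., 1.]), 0.)
```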
Now we express the mapping under the given basis. For this we need the matrix T for changing the basis from the standard basis to the new basis. It can be obtained by writing the coordinates of the vectors of the original basis, expressed under the new basis, into the columns of the matrix T. But we shall do it in a different way: we first obtain the matrix for changing the basis from the new one to the original one, that is, the matrix T⁻¹. We just write the coordinates of the vectors of the new basis into the columns:

T⁻¹ = (1 1 0; −1 2 1; 1 0 1).

Then T = (T⁻¹)⁻¹, and for the matrix B of the mapping under the new basis we have (see 2.38)

B = T·A·T⁻¹ = ½·(0 3 1; 0 3 1; 0 7 5).

Note that the first column of B is zero: e₁ is an eigenvector of A with the eigenvalue 0. □

Let us do some more exercises on computing eigenvalues and eigenvectors.

2.99. Find the eigenvalues and the associated subspaces of eigenvectors of the matrix

A = (−1 1 0; −1 3 0; 2 −2 2).

Solution. Let us first construct the characteristic polynomial of the matrix:

|(−1−λ 1 0; −1 3−λ 0; 2 −2 2−λ)| = −λ³ + 4λ² − 2λ − 4 = (2 − λ)(λ² − 2λ − 2).

This polynomial has the roots 2, 1 + √3, 1 − √3, which are then the eigenvalues of the matrix. Their algebraic multiplicity is one (they are simple roots of the polynomial), thus each has only one associated eigenvector up to a nonzero multiple (that is, the so-called geometric multiplicity of each eigenvalue is also one, see 3.32).

Let us determine the eigenvector associated with the eigenvalue 2; it is a solution of the homogeneous linear system with the matrix A − 2E:

−3x₁ + x₂ = 0,  −x₁ + x₂ = 0,  2x₁ − 2x₂ = 0.

The system has the solution x₁ = x₂ = 0, x₃ ∈ R arbitrary; the eigenvectors associated with the value 2 are thus, for instance, the vector (0, 0, 1) and any multiple of it.

Analogously we determine the remaining two eigenvectors, as solutions of the system [A − (1 + √3)E]x = 0 and of the system [A − (1 − √3)E]x = 0. The solution of the system

(−2 − √3)x₁ + x₂ = 0,  −x₁ + (2 − √3)x₂ = 0,  2x₁ − 2x₂ + (1 − √3)x₃ = 0

is the space {((√3/2 − 1)t, −t/2, t); t ∈ R}. That is the space of eigenvectors associated with the eigenvalue 1 + √3 (except for the zero vector, which is a solution of the system, but we do not consider it an eigenvector; we shall not refer to this any more in the future and will not explicitly exclude the zero vector from the set of solutions). Similarly we obtain that the space of eigenvectors associated with the eigenvalue 1 − √3 is ⟨(−1 − √3/2, −1/2, 1)⟩. □

2.100. Find the eigenvalues and the associated eigenspaces of eigenvectors of the matrix

A = (1 1 0; −1 3 0; 2 −2 2).

Solution. The characteristic polynomial of the matrix is −(λ − 2)³, with the single root 2 of multiplicity 3. The number 2 is thus an eigenvalue with algebraic multiplicity three. Its geometric multiplicity is either one, two or three. Let us determine the vectors associated with this eigenvalue as the solutions of the system

(A − 2E)x = 0: −x₁ + x₂ = 0,  −x₁ + x₂ = 0,  2x₁ − 2x₂ = 0.

Its solutions form the two-dimensional space ⟨(1, 1, 0), (0, 0, 1)⟩. The eigenvalue 2 thus has algebraic multiplicity 3 and geometric multiplicity 2; in particular the matrix is not diagonalisable. □

Further basic exercises on eigenvalues and eigenvectors of matrices can be found at page 130.
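The difference between the two multiplicities in 2.100 can also be seen numerically: the geometric multiplicity is dim Ker(A − 2E) = 3 − rank(A − 2E). A numpy sketch (note that eigenvalues of such a defective matrix are computed with limited numerical accuracy):

```python
import numpy as np

A = np.array([[ 1., 1., 0.],
              [-1., 3., 0.],
              [ 2.,-2., 2.]])
vals = np.linalg.eigvals(A)
print(vals)    # approximately 2, three times (algebraic multiplicity 3)

r = np.linalg.matrix_rank(A - 2.*np.eye(3))
print(3 - r)   # prints 2: the geometric multiplicity
```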
2.101. For any n×n matrix A, its characteristic polynomial |A − λE| is of degree n, that is, of the form

|A − λE| = cₙλⁿ + cₙ₋₁λⁿ⁻¹ + ⋯ + c₁λ + c₀, cₙ ≠ 0,

where cₙ = (−1)ⁿ, cₙ₋₁ = (−1)ⁿ⁻¹·Tr A, c₀ = |A|. If the matrix A is three-dimensional, we obtain

|A − λE| = −λ³ + (Tr A)·λ² + c₁λ + |A|.

Choosing λ = 1 we obtain

|A − E| = −1 + Tr A + c₁ + |A|.

From there we obtain c₁, and thus

|A − λE| = −λ³ + (Tr A)·λ² + (|A − E| + 1 − Tr A − |A|)·λ + |A|.

Use this expression for determining the characteristic polynomial and the eigenvalues of the matrix

A = (32 −67 47; 7 −14 13; −7 15 −6). ○

2.102. Without any computation, write down the spectrum of the linear mapping f : R³ → R³ given by (x₁, x₂, x₃) ↦ (x₁ + x₃, x₂, x₁ + x₃). ○

2.103. Give the dimensions of the eigenspaces of the eigenvalues λᵢ of the matrix

A = (4 0 0 0; 1 4 0 0; 5 2 3 0; 0 4 0 3). ○

2.104. Pauli matrices. In physics, the state of a particle with spin ½ is described by the Pauli matrices. They are the 2×2 matrices over the complex numbers

σ₁ = (0 1; 1 0),  σ₂ = (0 −i; i 0),  σ₃ = (1 0; 0 −1).

For square matrices we define their commutator (denoted by square brackets) as

[σ₁, σ₂] := σ₁σ₂ − σ₂σ₁.

Show that [σ₁, σ₂] = 2iσ₃, and similarly [σ₂, σ₃] = 2iσ₁ and [σ₃, σ₁] = 2iσ₂. Show further that σ₁² = σ₂² = σ₃² = E and that the eigenvalues of the matrices σ₁, σ₂, σ₃ are ±1.

Show also that for the matrices describing the state of a particle with spin 1 the same commutation relations hold as in the case of the Pauli matrices. Equivalently, it can be shown that with a suitable notation for the basis (1, i, j, k) one obtains the algebra of quaternions (an algebra is a vector space with a bilinear binary operation of multiplication, in this case given by matrix multiplication). For the vector space to be the algebra of quaternions, it is necessary and sufficient that

i² = j² = k² = −1 and ij = −ji = k, jk = −kj = i, ki = −ik = j.

A verification of the commutator relations in code is given below.

2.105. Can the matrix B be expressed in the form of a product B = P⁻¹·D·P for some diagonal matrix D and invertible matrix P? If it is possible, give an example of such a tuple of matrices D, P, and find out how many such tuples there are. ○

As we have already seen, based on the eigenvalues and eigenvectors of a given 3×3 matrix we can often geometrically interpret the mapping it gives in R³. Notably, we can do so in these situations:

If the matrix has 0 as an eigenvalue and 1 as an eigenvalue with geometric multiplicity 2, it is the projection in the direction of the eigenvector associated with the eigenvalue 0 onto the plane given by the eigenspace of the eigenvalue 1. If the eigenvector associated with 0 is perpendicular to that plane, the mapping is an orthogonal projection.

If the matrix has the eigenvalue −1 with eigenvector perpendicular to the plane of the eigenvectors associated with the eigenvalue 1, it is the mirror symmetry through that plane.

If the matrix has the eigenvalue 1 with eigenvector perpendicular to the plane of the eigenvectors associated with the eigenvalue −1, it is the axial symmetry (in space) through the axis given by the eigenvector associated with 1.

2.106. Determine what linear mapping R³ → R³ is given by the matrix:

Solution. The matrix has the double eigenvalue −1, whose associated eigenspace is ⟨(2, 0, 1), (1, 1, 0)⟩. Further, the matrix has 0 as an eigenvalue, with the eigenvector (1, 4, −3). The mapping given by this matrix under the standard basis is then the axial symmetry through the line given by the last vector, composed with the projection onto the plane perpendicular to the last vector, that is, the plane given by the equation x + 4y − 3z = 0. □
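A sketch verifying the claims of exercise 2.104 above with numpy (complex arithmetic):

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

def comm(a, b):
    return a @ b - b @ a

assert np.allclose(comm(s1, s2), 2j * s3)   # [σ1, σ2] = 2iσ3
assert np.allclose(comm(s2, s3), 2j * s1)   # [σ2, σ3] = 2iσ1
assert np.allclose(comm(s3, s1), 2j * s2)   # [σ3, σ1] = 2iσ2

for s in (s1, s2, s3):
    assert np.allclose(s @ s, np.eye(2))    # σ² = E
    assert np.allclose(np.sort(np.linalg.eigvals(s).real), [-1., 1.])
```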
2.107. The theorem (2.50) gives us tools to recognise the matrix of a rotation in ℝ³: it has three distinct eigenvalues with absolute value 1, one of them being the number 1 (its associated eigenvector is the axis of the rotation). The argument of the remaining two, which are necessarily complex conjugates, gives the angle of the rotation in the positive sense in the plane with the basis (xᵤ + x̄ᵤ, i(xᵤ − x̄ᵤ)), where xᵤ is an eigenvector associated with one of the two complex conjugate eigenvalues.

2.108. Determine what linear mapping is given by the matrix

A = (1/5)·( −1  3 −1
            −8  9  2
             8 −4  3 ).

Solution. By the already known method we find out that the matrix has the following eigenvalues and corresponding eigenvectors: 1, (1, 2, 0); 3/5 + (4/5)i, (1, 1+i, −1−i); 3/5 − (4/5)i, (1, 1−i, −1+i). It is thus the matrix of a rotation (all the eigenvalues have absolute value 1 and one of the eigenvalues is 1); further we know that it is a rotation by the angle arccos(3/5) ≈ 0.295π, which is the argument of the complex number 3/5 + (4/5)i. It remains to determine the direction of the rotation. First it is good to recall that the meaning of the direction of the rotation changes when we change the orientation of the axis (it is meaningless to speak of the direction of the rotation if we do not have an orientation of the axis). Using the ideas from the proof of the theorem 2.50, we see that the given matrix acts by rotating by arccos(3/5) in the positive sense in the plane with the basis ((0, 1, −1), (1, 1, −1)). The first vector of the basis is the imaginary part of the eigenvector associated with the eigenvalue 3/5 + (4/5)i, the second is then the (common) real part of the eigenvectors associated with the complex eigenvalues. The order of the vectors in the basis is important (by changing their order the meaning of the direction changes). The axis of the rotation is perpendicular to the plane. If we orient it using the right-hand rule (the perpendicular direction is obtained by taking the vector product of the vectors of the basis), then the direction of the rotation agrees with the direction of the rotation in the plane with the given basis. In our case the vector product gives (0, 1, −1) × (1, 1, −1) = (0, −1, −1). It is thus a rotation through arccos(3/5) in the positive sense about the vector (0, −1, −1), that is, a rotation through arccos(3/5) in the negative sense about the vector (0, 1, 1). □
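Whether a given matrix indeed acts in this way is again easy to test numerically; the following fragment (our own illustrative sketch, assuming numpy) checks the eigenvalues and the rotation angle of the matrix from 2.108:

```python
import numpy as np

A = np.array([[-1.0, 3.0, -1.0],
              [-8.0, 9.0,  2.0],
              [ 8.0, -4.0, 3.0]]) / 5.0

w, V = np.linalg.eig(A)
print(np.abs(w))                        # all equal to 1: a candidate rotation
# the complex eigenvalue 3/5 + 4/5 i carries the angle of the rotation
lam = w[np.argmax(np.abs(w.imag))]
print(abs(np.angle(lam)) / np.pi)       # approx 0.295, i.e. arccos(3/5)
```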
K. Additional exercises for the whole chapter

2.109. Solve the system of equations

x₁ + x₂ + x₃ + x₄ − 2x₅ = 3,
2x₂ + 2x₃ + 2x₄ − 4x₅ = 5,
−x₁ − x₂ − x₃ + x₄ + 2x₅ = 0,
−2x₁ + 3x₂ + 3x₃ − 6x₅ = 2.

Solution. The extended matrix of the system is

(  1  1  1  1 −2 |  3
   0  2  2  2 −4 |  5
  −1 −1 −1  1  2 |  0
  −2  3  3  0 −6 |  2 ).

Adding the first row to the third, adding its 2-multiple to the fourth, and then adding the (−5/2)-multiple of the second row to the fourth, we obtain

( 1 1 1 1 −2 |  3          ( 1 1 1  1 −2 |  3
  0 2 2 2 −4 |  5     →      0 2 2  2 −4 |  5
  0 0 0 2  0 |  3            0 0 0  2  0 |  3
  0 5 5 2 −10 | 8 )          0 0 0 −3  0 | −9/2 ).

The last row is clearly a multiple of the previous one, and thus we can omit it. The pivots are located in the first, second and fourth columns, so the free variables are x₃ and x₅, which we substitute by the real parameters t and s. We thus consider the system

x₁ + x₂ + t + x₄ − 2s = 3,
2x₂ + 2t + 2x₄ − 4s = 5,
2x₄ = 3.

We see that x₄ = 3/2. The second equation then gives 2x₂ + 2t + 3 − 4s = 5, that is, x₂ = 1 − t + 2s. From the first we have x₁ + 1 − t + 2s + t + 3/2 − 2s = 3, that is, x₁ = 1/2. Altogether,

(2.1) (x₁, x₂, x₃, x₄, x₅) = (1/2, 1 − t + 2s, t, 3/2, s), t, s ∈ ℝ.

In this exercise we can also transform the extended matrix by row transformations into the reduced row echelon form, where the first non-zero number in every row is 1 and where all the remaining numbers in the column of such a 1 are 0. We again omit the fourth equation, which is a combination of the first three. Gradually, multiplying the second and the third row by 1/2, subtracting the third row from the second and from the first, and finally subtracting the second row from the first, we obtain

( 1 1 1 1 −2 | 3        ( 1 1 1 0 −2 | 3/2        ( 1 0 0 0  0 | 1/2
  0 2 2 2 −4 | 5    →     0 1 1 0 −2 | 1      →     0 1 1 0 −2 | 1
  0 0 0 2  0 | 3 )        0 0 0 1  0 | 3/2 )        0 0 0 1  0 | 3/2 ).

If we choose again x₃ = t, x₅ = s (t, s ∈ ℝ), we obtain the general solution (||2.1||) in the same form directly, considering the corresponding equations x₁ = 1/2, x₂ + t − 2s = 1, x₄ = 3/2. □

2.110. Find the solution of the system of linear equations given by the extended matrix

( 3 3  2  1 | 3
  2 1  1  0 | 4
  0 5 −4  3 | 1
  5 3  3 −3 | 5 ).

Solution. We transform the given extended matrix into the row echelon form. We first copy the first and the third row; into the second row we write the sum of the 3-multiple of the second row and the (−2)-multiple of the first, and into the last row the sum of the 5-multiple of the first row and the (−3)-multiple of the last. By this we obtain

( 3 3  2  1 | 3        ( 3  3  2  1 | 3
  2 1  1  0 | 4    →     0 −3 −1 −2 | 6
  0 5 −4  3 | 1          0  5 −4  3 | 1
  5 3  3 −3 | 5 )        0  6  1 14 | 0 ).

Copying the first two rows, and adding the 5-multiple of the second row to the 3-multiple of the third and its 2-multiple to the fourth, gives

( 3  3   2  1 | 3
  0 −3  −1 −2 | 6
  0  0 −17 −1 | 33
  0  0  −1 10 | 12 ).

We copy the first, second and fourth row, and add the fourth row to the third, obtaining

( 3  3   2  1 | 3
  0 −3  −1 −2 | 6
  0  0 −18  9 | 45
  0  0  −1 10 | 12 ).

Then we have (the remaining row transformations are the "usual" ones)

( 3  3  2   1 | 3          ( 3  3  2   1 | 3
  0 −3 −1  −2 | 6     →      0 −3 −1  −2 | 6
  0  0  2  −1 | −5           0  0  1 −10 | −12
  0  0  1 −10 | −12 )        0  0  0  19 | 19 ).

We see that the system has exactly one solution. We determine it by backward elimination:

( 3  3  2 0 | 2        ( 3  3 0 0 | 6        ( 1 0 0 0 | 4
  0 −3 −1 0 | 8    →     0 −3 0 0 | 6    →     0 1 0 0 | −2
  0  0  1 0 | −2         0  0 1 0 | −2         0 0 1 0 | −2
  0  0  0 1 | 1 )        0  0 0 1 | 1 )        0 0 0 1 | 1 ).

The result is then x₁ = 4, x₂ = −2, x₃ = −2, x₄ = 1. □
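For systems like the one in 2.110, with a unique solution, the elimination can be delegated to a numerical routine. A minimal sketch of ours (assuming numpy, not part of the exercise):

```python
import numpy as np

# coefficient matrix and right-hand side of exercise 2.110
A = np.array([[3.0, 3.0,  2.0,  1.0],
              [2.0, 1.0,  1.0,  0.0],
              [0.0, 5.0, -4.0,  3.0],
              [5.0, 3.0,  3.0, -3.0]])
b = np.array([3.0, 4.0, 1.0, 5.0])

print(np.linalg.solve(A, b))   # [ 4. -2. -2.  1.]
```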
2.111. Give all the solutions of the homogeneous system

x + y = 2z + v,
z + 4u + v = 0,
−3u = 0,
z = −v

of four linear equations in the 5 variables x, y, z, u, v.

Solution. We rewrite the system into a matrix such that the first column contains the coefficients of x, the second the coefficients of y, and so on, moving all the variables in the equations to the left-hand side. By this we obtain the matrix

( 1 1 −2  0 −1
  0 0  1  4  1
  0 0  0 −3  0
  0 0  1  0  1 ).

We add the (4/3)-multiple of the third row to the second and then subtract the second row from the fourth, obtaining

( 1 1 −2  0 −1
  0 0  1  0  1
  0 0  0 −3  0
  0 0  0  0  0 ).

We multiply the third row by the number −1/3 and add the 2-multiple of the second row to the first, which gives

( 1 1 0 0 1
  0 0 1 0 1
  0 0 0 1 0
  0 0 0 0 0 ).

From the last matrix we can directly obtain all solutions, because it is in the reduced row echelon form: the first non-zero number in every row is 1, and in the column of such a 1 all other entries are zero. The solution expressed as a linear combination of two vectors is determined exactly by the columns without a leading 1, that is, by the second and the fifth column: we choose 1 as the second coordinate for the second column and as the fifth coordinate for the fifth column, and we take the numbers in the corresponding column with the opposite sign, placing them at the positions given by the columns with the leading 1 of their row. Let us add that the result can immediately be rewritten in the form

( x, y, z, u, v )ᵀ = t·( −1, 1, 0, 0, 0 )ᵀ + s·( −1, 0, −1, 0, 1 )ᵀ,

that is, (x, y, z, u, v) = (−t − s, t, −s, 0, s), t, s ∈ ℝ. □

2.112. Decompose the following permutations into transpositions:

i)   ( 1 2 3 4 5 6 7 ; 7 6 5 4 3 2 1 ),
ii)  ( 1 2 3 4 5 6 7 8 ; 6 4 1 2 5 8 3 7 ),
iii) ( 1 2 3 4 5 6 7 8 9 10 ; 4 6 1 10 2 5 9 8 3 7 ).

2.113. Determine the parity of the given permutations:

i)   ( 1 2 3 4 5 6 7 ; 7 5 6 4 1 2 3 ),
ii)  ( 1 2 3 4 5 6 7 8 ; 6 7 1 2 3 8 4 5 ),
iii) ( 1 2 3 4 5 6 7 8 9 10 ; 9 7 1 10 2 5 4 8 3 6 ).

2.114. Determine the eigenvalues of the matrix

( −13  5 4 2
    0 −1 0 0
  −30 12 9 5
  −12  6 4 1 ). ○

2.115. Having been told that the numbers 1, −1 are eigenvalues of the matrix

A = ( −11  5 4 1
       −3  0 1 0
      −21 11 8 2
       −9  5 3 1 ),

give all solutions of the characteristic equation |A − λE| = 0. Hint: if you denote all the roots of the polynomial |A − λE| as λ₁, λ₂, λ₃, λ₄, then |A| = λ₁·λ₂·λ₃·λ₄ and tr A = λ₁ + λ₂ + λ₃ + λ₄. ○

2.116. Give an example of a four-dimensional matrix with eigenvalues λ₁ = 6 and λ₂ = 7 such that the multiplicity of λ₂ as a root of the characteristic polynomial is three and that

(a) the dimension of the subspace of eigenvectors of λ₂ is 3;
(b) the dimension of the subspace of eigenvectors of λ₂ is 2;
(c) the dimension of the subspace of eigenvectors of λ₂ is 1. ○

2.117. Find the eigenvalues and eigenvectors of the matrix ( −1 0 0 ; ⋯ ; ⋯ ). ○

2.118. Determine the characteristic polynomial |A − λE|, the eigenvalues and the eigenvectors of the matrix

( 4 −1 6
  2  1 6
  2 −1 8 ). ○

Solutions to the exercises

2.7.

2.12. There is only one such matrix X, and it is

X = ( 18 −32 ⋯ ; 14 13 −13 ; 13 14 13 ; 0 0 27 ).

2.14. A⁻¹ = ( 1 10 −4 ; 1 12 −5 ; 0 5 −2 ).

2.15. ( 2 −1 0  0  0
       −5  8 0  0  0
        0  0 −1 0  0
        0  0 0 −5  2
        0  0 0  3 −1 ).

2.16. C⁻¹ = ( 0  1 1  0
              1  2 0  1
              1 −1 0  0
              0  1 −1 −1 ).

2.17. In the first case we have A = ½·( 3 −i ; i 1 ), in the second ( 14 8 5 ; ⋯ 2 11 ; ⋯ 1 10 ).

2.18. We have

(1/(n−1)) · ( 1 1 ⋯ 1
              1 0 1 ⋯ 1
              1 1 0 ⋱ 1
              ⋮     ⋱
              1 1 ⋯ 1 0 ).

2.21. −3, 17, −1.
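The parity questions in 2.113 reduce to counting inversions; here is a small sketch of such a count (our own helper function, assuming the permutations are given in one-line notation):

```python
def inversions(perm):
    """Count the pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if perm[i] > perm[j])

for p in [(7, 5, 6, 4, 1, 2, 3),
          (6, 7, 1, 2, 3, 8, 4, 5),
          (9, 7, 1, 10, 2, 5, 4, 8, 3, 6)]:
    inv = inversions(p)
    print(inv, "odd" if inv % 2 else "even")   # 17 odd, 12 even, 25 odd
```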
2.24. By subtracting the first row from all other rows and then expanding along the first column we obtain

Vₙ(x₁, x₂, …, xₙ) =
| 1  x₁      x₁²       ⋯  x₁ⁿ⁻¹
  0  x₂−x₁   x₂²−x₁²   ⋯  x₂ⁿ⁻¹−x₁ⁿ⁻¹
  ⋮
  0  xₙ−x₁   xₙ²−x₁²   ⋯  xₙⁿ⁻¹−x₁ⁿ⁻¹ |.

If we take out xᵢ₊₁ − x₁ from the i-th row for i ∈ {1, 2, …, n−1}, we obtain

Vₙ(x₁, x₂, …, xₙ) = (x₂−x₁)⋯(xₙ−x₁) ·
| 1  x₂+x₁  ⋯  ∑ⱼ₌₀ⁿ⁻² x₂ⁿ⁻ʲ⁻²x₁ʲ
  ⋮
  1  xₙ+x₁  ⋯  ∑ⱼ₌₀ⁿ⁻² xₙⁿ⁻ʲ⁻²x₁ʲ |.

By subtracting from every column (starting with the last and ending with the second) the x₁-multiple of the previous column, we obtain

Vₙ(x₁, x₂, …, xₙ) = (x₂−x₁)⋯(xₙ−x₁) · Vₙ₋₁(x₂, …, xₙ).

Because it is clear that V₂(xₙ₋₁, xₙ) = xₙ − xₙ₋₁, it holds (by induction) that

Vₙ(x₁, x₂, …, xₙ) = ∏₁≤i<j≤n (xⱼ − xᵢ).

2.80. cos φ = √2/√3.

2.81. q₁ : (2 − √3, 2√3 + i), q₂ : (2 + √3, −2√3 + i).

2.84. For instance the inner product that follows from the isomorphism of the space of all real 3×3 matrices with the space ℝ⁹. If we use the standard product on ℝ⁹, we obtain an inner product that assigns to two matrices the sum of the products of the corresponding pairs of elements. For the given matrix we obtain the norm

√(1² + 2² + 0² + 0² + 2² + 0² + 1² + (−2)² + (−3)²) = √23.

2.87. The vector that gives the subspace U is perpendicular to each of the three vectors that generate V. The subspaces are thus orthogonal. But it is not true that ℝ⁴ = U ⊕ V. The subspace V is only two-dimensional, because (−1, 0, −1, 2) = (−1, 0, 1, 0) − 2(0, 0, 1, −1).

2.88. In the first case we have dim U = 2 for t ∈ {1, 2}, otherwise we have dim U = 3. In the second case we have dim U = 2 for t ≠ 0 and dim U = 1 for t = 0.

2.89. Using the Gram-Schmidt orthogonalisation process we can obtain the result ((1, 1, 1, 1), (1, 1, 1, −3), (−2, 1, 1, 0)).

2.90. Preserving the order of the subspaces from the problem statement, we have for instance the orthogonal bases ((1, 0, 1, 0), (0, 1, 0, −7)) and ((1, 2, 2, −1), (2, 3, −3, 2), (2, −1, −1, −2)).

2.91. The result is a = 9/2, b = −5, because it must hold that 1 + b + 4 + 0 + 0 = 0 and 1 − b + 0 + 3 − 2a = 0.

2.92. The basis must contain a single vector. It is some non-zero scalar multiple of the vector (3, −7, 1, −5, 9).

2.93. The orthogonal complement V⊥ is the set of all scalar multiples of the vector (4, 2, 7, 0).

2.94. (a) W⊥ = ⟨(1, 0, −1, 1, 0), (1, 3, 2, 1, −3)⟩; (b) W⊥ = ⟨(1, 0, −1, 0, 0), (1, −1, 1, −1, 1)⟩.

2.95. There are infinitely many possible extensions, of course. A very simple one is for instance (1, −2, 2, 1), (1, 3, 2, 1), (1, 0, 0, −1), (1, 0, −1, 1).

2.101. It is |A − λE| = −λ³ + 12λ² − 47λ + 60, λ₁ = 3, λ₂ = 4, λ₃ = 5.

2.102. The result is the sequence 0, 1, 2.

2.103. The dimension is 1 for λ₁ = 4 and 2 for λ₂ = 3.

2.105. The matrix B has two distinct eigenvalues, and thus such an expression exists. For instance it holds that

( 5 6 ; 6 5 ) = ½( √2 −√2 ; √2 √2 ) · ( 11 0 ; 0 −1 ) · ½( √2 √2 ; −√2 √2 ).

There exist exactly two diagonal matrices D:

( 11 0 ; 0 −1 ) and ( −1 0 ; 0 11 ),

but the columns of the matrix P⁻¹ can be substituted by their arbitrary non-zero scalar multiples, thus there are infinitely many tuples D, P.

2.112. i) (1, 7)(2, 6)(5, 3), ii) (1, 6)(6, 8)(8, 7)(7, 3)(2, 4), iii) (1, 4)(4, 10)(10, 7)(7, 9)(9, 3)(2, 6)(6, 5).

2.113. i) 17 inversions, odd; ii) 12 inversions, even; iii) 25 inversions, odd.

2.114. The matrix has only one eigenvalue, −1.

2.115. The root −1 of the polynomial |A − λE| has multiplicity three.
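The Vandermonde formula from 2.24 can be sanity-checked numerically; a tiny sketch of ours (assuming numpy) comparing the determinant with the product ∏ᵢ<ⱼ(xⱼ − xᵢ):

```python
import numpy as np
from itertools import combinations

x = np.array([2.0, -1.0, 3.0, 0.5])
n = len(x)

# rows 1, x_i, x_i^2, ..., x_i^(n-1)
V = np.vander(x, increasing=True)

det = np.linalg.det(V)
prod = np.prod([x[j] - x[i] for i, j in combinations(range(n), 2)])
print(det, prod)   # the two numbers agree
```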
2.116. For instance,

(a) ( 6 0 0 0 ; 0 7 0 0 ; 0 0 7 0 ; 0 0 0 7 ),
(b) ( 6 0 0 0 ; 0 7 1 0 ; 0 0 7 0 ; 0 0 0 7 ),
(c) ( 6 0 0 0 ; 0 7 1 0 ; 0 0 7 1 ; 0 0 0 7 ).

2.117. The triple eigenvalue −1; the corresponding eigenspace is ⟨(1, 0, 0), (0, 2, 1)⟩.

2.118. The characteristic polynomial is −(λ − 2)²(λ − 9), that is, the eigenvalues are 2 and 9 with the associated eigenvectors (1, 2, 0), (−3, 0, 1) and (1, 1, 1).

CHAPTER 3

Linear models and matrix calculus

where are the matrices useful? - basically almost everywhere…

A. Processes with linear restrictions

Let us show an example of a very simple linear optimisation problem:

3.1. A company manufactures bolts and nuts. Both are moulded: moulding a box of bolts takes one minute, a box of nuts is moulded for two minutes. Preparing the box itself takes one minute for bolts and four minutes for nuts. The company has at its disposal two hours for moulding and three hours for box preparation. Demand says that it is necessary to manufacture at least 90 boxes of bolts more than boxes of nuts. Due to technical reasons it is not possible to manufacture more than 110 boxes of bolts. The profit from one box of bolts is 40 Kč and the profit from one box of nuts is 60 Kč. The company has no trouble with selling. How many boxes of nuts and bolts should be manufactured in order to have maximal profit?

Solution. Let us write the given data into a table:

         Bolts (1 box)   Nuts (1 box)   Capacity
Mould    1 min/box       2 min/box      2 hours
Box      1 min/box       4 min/box      3 hours
Profit   40 Kč/box       60 Kč/box

We have already developed a pretty useful package of tools, and it is time to show some applications of the matrix calculus. On some relatively easy problems we will see that the theory allows both qualitative and quantitative analyses, and that it sometimes leads quite easily to surprising results.

Although it might seem that the assumption of linearity of the relations between the quantities is too restrictive, it is quite often not so: in real problems, linear relations either appear directly, or the final process is the result of an iteration of many linear steps. And even when this is not the case, we can use this approach at least to approximate the real processes.

We like to view the matrices (and linear mappings) as objects with which we would like to work as if they were scalars. In order to do that, some pretty hard work in the fourth chapter is required. We then show a quick and useful application on the so-called matrix decompositions, which are needed for the numerical mastery of matrix calculus in the most robust way.

1. Linear processes

3.1. Solution of a system of linear equations. Simple linear processes are given by linear mappings φ : V → W on vector spaces.

As we can surely imagine, the vector v ∈ V can represent the state of some system we are observing, while φ(v) gives the result after the process was realised. If we want to reach a given result b ∈ W of such a process, we solve the problem

φ(x) = b

for some unknown vector x and a known vector b. In fixed coordinates we then have the matrix A of the mapping φ, and we look for a solution of the system of equations A·x = b.

x₁ + 2x₂ ≤ 120,
x₁ + 4x₂ ≤ 180,
x₁ ≥ x₂ + 90,
x₁ ≤ 110.

The objective function (the function that gives the profit for a given number of manufactured bolts and nuts) is 40x₁ + 60x₂. The previous system of inequalities determines a certain area in ℝ², and the optimisation of the profit means finding in this area the point (or points) in which the objective function has the maximal value, that is, finding the largest k such that the line 40x₁ + 60x₂ = k has a non-empty intersection with the given area. Graphically, we can find the solution for example by placing into the plane the line p satisfying the equation 40x₁ + 60x₂ = 0 and moving it "upwards" as long as it has some intersection with the area. It is clear that the last intersection is either a point or a boundary segment (and then the boundary line must be parallel to p). Thus we obtain (see the figure) the point x₁ = 110 and x₂ = 5. The maximal possible income is thus 40·110 + 60·5 = 4700 Kč. □

3.2. Minimisation of costs for feeding. A stable in Nišovice u Volyně buys fodder for the winter: hay and oats. The nutritional values of the fodder and the required daily portions for one foal are given in the table:

g/kg                          Hay    Oats   Requirements
Dry basis                     841    860    at least 6300 g
Digestible nitrogen matter    53     123    at most 1150 g
Starch                        0.348  0.868  at most 5.35 g
Calcium                       6      1.6    at least 30 g
Phosphate                     2.8    3.5    at most 44 g
Sodium                        0.2    1.4    approximately 7 g
Cost (Kč/kg)                  1.80   1.60

Every foal must obtain at least 2 kg of oats in its daily meal. The average cost (counting the payment for the transportation) is 1.80 Kč per 1 kg of hay and 1.60 Kč per 1 kg of oats. Compose a daily diet for one foal which has minimal costs.

3.3. Optimal distribution of material. The inner wooden panelling of a cottage has the following requirements:

• at most 120 planks of length 35 cm,
• from 180 to 330 planks of length 120 cm,
• at least 30 planks of length 95 cm.

When solving the system of equations we are thus left with exactly n−k free parameters, and by setting one of them to the value one and the others to zero we obtain exactly n−k linearly independent solutions. All solutions are then given by all the linear combinations of these n−k solutions. Every such (n−k)-tuple of solutions is called a fundamental system of solutions of the given homogeneous system of equations. We have proved:

Theorem. The set of all solutions of the homogeneous system of equations A·x = 0 in n variables with a matrix A of rank k is a vector subspace of Kⁿ of dimension n−k. Every basis of this subspace forms a fundamental system of solutions of the given homogeneous system.

3.2. Non-homogeneous systems of equations. Consider now the general system of equations A·x = b. Let us realise once again that the columns of the matrix A are actually the images of the vectors of the standard basis in Kⁿ under the linear mapping φ.
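The small optimisation problem 3.1 above can also be solved mechanically. The following sketch of ours (assuming the linprog routine from the scipy library, and using the constraints derived above) confirms the optimum x₁ = 110, x₂ = 5:

```python
from scipy.optimize import linprog

# maximise 40*x1 + 60*x2  <=>  minimise -(40*x1 + 60*x2)
c = [-40.0, -60.0]
# inequality constraints in the form  A_ub @ x <= b_ub
A_ub = [[1.0, 2.0],    # moulding minutes: x1 + 2*x2 <= 120
        [1.0, 4.0],    # box preparation: x1 + 4*x2 <= 180
        [-1.0, 1.0],   # demand: x1 >= x2 + 90
        [1.0, 0.0]]    # technical bound: x1 <= 110
b_ub = [120.0, 180.0, -90.0, 110.0]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)   # [110. 5.] 4700.0
```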

and the variables are non-negative. It is easy to see that every general linear programming problem can be transformed into a standard one of either type. Apart from sign changes, we can decompose the variables that have no sign restriction into a difference of two non-negative ones. Without loss of generality we shall further work only with the standard maximisation problem.

How do we solve such a problem? We seek the maximum of a linear form h over a subset M of a vector space which is given by linear inequalities, that is, in the plane by an intersection of half-planes (in general we shall speak, in the next chapter, about half-spaces). Note that every linear form h : V → ℝ over a real vector space (that is, an arbitrary linear scalar function) is monotone in every chosen direction; that is, along the direction it either grows all the time, or decreases all the time, or is constant. More precisely, if we choose a fixed starting vector u ∈ V and a "directional" vector v ∈ V, then the composition of our form h with this parametrisation yields

t ↦ h(u + t·v) = h(u) + t·h(v).

This expression is indeed, with increasing parameter t, always either increasing, or decreasing, or constant (depending on whether h(v) is positive, negative or zero). Thus we must surely expect that problems similar to the one with the painter are either unsatisfiable (if the given set of restrictions is empty), or the profit is unbounded (if the restrictions give an unbounded part of the space and the form h is non-zero in some of the unbounded directions), or they attain a maximal solution in at least one of the "vertices" of the set M (while usually that is the case for only a single vertex, the maximum can also be attained on a whole part of the boundary of the area M).

3.5. Formulation using linear equations. Finding an optimum is not always as simple as in the previous case. The problem can contain many variables and restrictions, and even deciding whether the set M of satisfiable points is non-empty can be a problem. We do not have space here for the complete theory, but we mention at least two directions of ideas, which show that the solution can actually always be found in a way similar to the previous paragraph.

Let us begin with a comparison with systems of linear equations, because we understand those well. We write the inequalities (3.1)-(3.3) in the general form

A·x ≤ b,

where x is now an n-dimensional vector, b is an m-dimensional vector and A is the corresponding matrix. By an inequality between vectors we mean the individual inequalities between all their coordinates. We want to maximise the product c·x for a given row vector c of coefficients of the linear form h. If we add a new auxiliary variable for every inequality and one more variable z for the value of the linear form h, we can rewrite the whole system as a system of linear equations

( 1  −c  0
  0   A  E ) · ( z ; x ; x_s ) = ( 0 ; b ),

where the matrix is composed of the indicated blocks with 1+n+m columns and 1+m rows, and x_s denotes the vector of the m additional variables. Additionally we require that all coordinates of x and x_s are non-negative. If the given system of equations has a solution, then in this set of solutions we seek values of the variables z, x and x_s such that all x and x_s are non-negative and z is maximal. In the paragraph 4.11 on page 215 we will discuss this situation from the viewpoint of affine geometry. Specifically, in our problem of the black and white painter the system of linear equations looks like this:

( 1 −c₁ −c₂ 0 0 0
  0  1   1  1 0 0
  0  w₁  w₂ 0 1 0
  0  b₁  b₂ 0 0 1 ) · ( z ; x₁ ; x₂ ; x₃ ; x₄ ; x₅ ) = ( 0 ; L ; W ; B ).
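A quick numerical illustration (ours, with made-up numbers standing in for the painter's data c, A, b, since those are fixed only symbolically here): solving the maximisation problem together with the minimisation problem that the next paragraph introduces as its dual gives the same optimal value, as the duality theory below predicts.

```python
import numpy as np
from scipy.optimize import linprog

# made-up data in the roles of c, A, b (n = 2 variables, m = 3 restrictions)
c = np.array([40.0, 60.0])
A = np.array([[1.0, 2.0],
              [1.0, 4.0],
              [1.0, 0.0]])
b = np.array([120.0, 180.0, 110.0])

# standard maximisation problem:  max c.x  with  A x <= b, x >= 0
primal = linprog(-c, A_ub=A, b_ub=b)
# dual minimisation problem:  min y.b  with  y^T A >= c^T, y >= 0
dual = linprog(b, A_ub=-A.T, b_ub=-c)

print(-primal.fun, dual.fun)   # the two optimal values coincide
```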
3.6. Duality of linear programming. Consider a real matrix A with m rows and n columns, a vector of restrictions b and a row vector c giving the objective function. From these data we can compose two problems of linear programming for x ∈ ℝⁿ and y ∈ ℝᵐ.

Maximisation problem: Maximise c·x under the conditions A·x ≤ b and x ≥ 0.

Minimisation problem: Minimise yᵀ·b under the conditions yᵀ·A ≥ cᵀ and y ≥ 0.

We say that these two problems are mutually dual. For deriving further properties of linear programming we first introduce some terminology. We say that a problem is solvable if there is some admissible vector x which meets all the restrictions. A solvable maximisation (minimisation) problem is bounded if the objective function is bounded from above (below) over the set of admissible vectors.

Lemma. If x ∈ ℝⁿ is an admissible vector for the standard maximisation problem and y ∈ ℝᵐ is an admissible vector for the dual minimisation problem, then for the objective functions we have

c·x ≤ yᵀ·b.

Proof. It is actually a simple observation: x ≥ 0 and cᵀ ≤ yᵀ·A, but also y ≥ 0 and A·x ≤ b, thus it must also hold that

c·x ≤ yᵀ·A·x ≤ yᵀ·b,

which is what we wanted to prove. □

From here we immediately see that if both dual problems are solvable, then they must be bounded. Even more interesting is the following corollary, which is directly implied by the inequality in the previous proof.

Corollary. If there exist admissible vectors x and y of the dual linear problems such that the objective functions satisfy c·x = yᵀ·b, then both are optimal solutions of the corresponding problems.

3.7. Theorem (About duality). If a standard problem of linear programming is solvable and bounded, then its dual is also solvable and bounded, there exists an optimal solution for each of the problems, and the optimal values of the corresponding objective functions are equal.

Proof. One direction was already proved in the previous corollary. It remains to prove the existence of an optimal solution. That is easy to prove by constructing an algorithm, which we won't do in great detail now. We will return to the missing part of the proof in the part about affine geometry, at page 215. □

Let us note yet another corollary of the just formulated duality theorem:

Corollary (Equilibrium theorem). Consider two admissible vectors x and y for the standard maximisation problem and its dual problem from the definition in 3.6. Then both these vectors are optimal if and only if yᵢ = 0 for all coordinates with index i for which ∑ⱼ₌₁ⁿ aᵢⱼxⱼ < bᵢ, and simultaneously xⱼ = 0 for all coordinates with index j for which ∑ᵢ₌₁ᵐ yᵢaᵢⱼ > cⱼ.

Proof. Assume first that both relations regarding the zeroes among the xⱼ and yᵢ hold. Then in the following computation we can calculate with equalities, because the summands with strict inequality have zero coefficients anyway:

∑ᵢ₌₁ᵐ yᵢbᵢ = ∑ᵢ₌₁ᵐ ∑ⱼ₌₁ⁿ yᵢaᵢⱼxⱼ,

and for the same reason also

∑ᵢ₌₁ᵐ ∑ⱼ₌₁ⁿ yᵢaᵢⱼxⱼ = ∑ⱼ₌₁ⁿ cⱼxⱼ.
This shows one implication, thanks to the duality theorem. Consider now that both x and y are optimal vectors. We thus know that

∑ᵢ₌₁ᵐ yᵢbᵢ ≥ ∑ᵢ₌₁ᵐ ∑ⱼ₌₁ⁿ yᵢaᵢⱼxⱼ ≥ ∑ⱼ₌₁ⁿ cⱼxⱼ,

but simultaneously the left- and right-hand sides are equal. Thus there is equality everywhere. If we rewrite the first equality as

∑ᵢ₌₁ᵐ yᵢ·(bᵢ − ∑ⱼ₌₁ⁿ aᵢⱼxⱼ) = 0,

we see that it can be satisfied only if the relation from the statement holds, because it is a sum of non-negative numbers and it equals zero. From the second equality we similarly derive the second part, and the proof is finished. □

The duality theorem and the equilibrium theorem are useful when solving problems of linear programming, because they show us the relations between the zeroes among the additional variables and the satisfiability of the restrictions.

3.8. Notes about linear models in economy. Our very schematic problem of the black and white painter from the paragraph 3.4 can be used to illustrate one of the typical economical models, the so-called model of production planning. The model tries to capture the problem completely, that is, to capture both external and internal relations. The left-hand sides of the equations (3.1), (3.2), (3.3) and of the objective function h(x₁, x₂) express various production relations. Depending on the character of the problem, we have on the right-hand sides either exact values (and then we solve equations) or capacity restrictions and goal optimisation (and then we obtain linear programming problems). We can thus in general solve the problem of source allocation with supplier restrictions, and either minimise costs or maximise income.

We can also interpret duality from this point of view. If our painter wanted to set up the price y_L of his work, the price y_W of the white colour and the price y_B of the black colour, he would minimise the objective function L·y_L + W·y_W + B·y_B under the restrictions

y_L + w₁y_W + b₁y_B ≥ c₁,
y_L + w₂y_W + b₂y_B ≥ c₂.

But that is exactly the dual problem to the original one, and the theorem 3.7 says that the optimal state is the one where the objective functions have the same value.

Among economical models we can find many modifications. One of them are the problems of financial planning, which are connected with the optimisation of a portfolio. We set up the volumes of investments into the individual investment opportunities with the goal of meeting the given restrictions on the risk factors while maximising the profit, or dually, minimising the risk under a given volume. Another common model is the marketing application, for instance the allocation of costs for advertisement in various media, or the placing of advertisements into time intervals. The restrictions are in this case determined by the budget, the target population, etc.

Very common are models of nutrition, that is, setting up how much of different kinds of food should be eaten in order to meet the total volume of specific components, e.g. minerals and vitamins.

Problems of linear programming also arise in personal tasks, where workers with specific qualifications and other properties are distributed into working shifts. Common are also problems of merging, problems of splitting and problems of distribution of goods.

2. Difference equations

We have already met difference equations in the first chapter, albeit briefly and of first order only. Now we show a more general theory for linear equations with constant coefficients, which gives not only very practical tools but also a nice illustration of the concepts of vector spaces and linear mappings.

Homogeneous linear difference equations of order k

3.9. Definition. A homogeneous linear difference equation of order k is given by the expression

a₀xₙ + a₁xₙ₋₁ + ⋯ + aₖxₙ₋ₖ = 0, a₀ ≠ 0, aₖ ≠ 0,

where the coefficients aᵢ are scalars, which can possibly also depend on n. We also say that such an equality gives a homogeneous linear recurrence of order k, and we usually write the sequence in question as a function

xₙ = f(n) = −(a₁/a₀)·f(n−1) − ⋯ − (aₖ/a₀)·f(n−k).
A solution of this equation is a sequence of scalars xᵢ, for all i ∈ ℕ (or i ∈ ℤ), which satisfy the equation with any fixed n. By choosing any k consecutive values xᵢ, all the other values are determined uniquely. Indeed, we work over a field of scalars, so the values a₀ and aₖ are invertible, and thus using the defining relation any xₙ can be computed uniquely; similarly for xₙ₋ₖ. Induction thus immediately proves that all remaining values are determined uniquely.

The space of all infinite sequences xᵢ forms a vector space, where addition and multiplication by scalars work coordinate-wise. Directly from the definition it is immediate that a sum of two solutions of a homogeneous linear equation, or a multiple of a solution, is again a solution. Analogously as with homogeneous linear systems, we see that the set of all solutions forms a subspace.

An initial condition on the values of the solutions is given as a k-dimensional vector in Kᵏ. A sum of initial conditions determines the sum of the corresponding solutions, and similarly for scalar multiples. Note also that plugging zeroes and ones into the initial k values immediately yields k linearly independent solutions of the equation. Thus, although the vectors are infinite sequences, the set of all solutions has finite dimension; we know that its dimension equals the order of the equation k, and we can easily obtain a basis of all the solutions. Again we speak of a fundamental system of solutions, and all other solutions are its linear combinations.

As we have already checked, if we choose k consecutive indices i, i+1, …, i+k−1, the homogeneous linear difference equation gives a linear mapping Kᵏ → K^∞ of k-dimensional vectors of initial values into infinite sequences of the same scalars. The independence of such solutions is equivalent to the independence of the initial values, which can easily be told from a determinant. If we have a k-tuple of solutions (xₙ^[1], …, xₙ^[k]), it is independent if and only if the following determinant, the so-called Casoratian, is non-zero for one n (which then implies that it is non-zero for all n):

C(xₙ^[1], …, xₙ^[k]) = | xₙ^[1]      ⋯  xₙ^[k]
                         xₙ₊₁^[1]    ⋯  xₙ₊₁^[k]
                         ⋮
                         xₙ₊ₖ₋₁^[1]  ⋯  xₙ₊ₖ₋₁^[k] |.
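A quick numerical illustration of the Casoratian (a sketch of ours, assuming numpy): for the recurrence xₙ₊₂ = xₙ₊₁ + 2xₙ the sequences (−1)ⁿ and 2ⁿ form a fundamental system, and the determinant below is non-zero:

```python
import numpy as np

# two solutions of x_{n+2} = x_{n+1} + 2 x_n
sol1 = lambda n: (-1.0) ** n
sol2 = lambda n: 2.0 ** n

n = 3  # any fixed index works
C = np.array([[sol1(n),     sol2(n)],
              [sol1(n + 1), sol2(n + 1)]])
print(np.linalg.det(C))   # nonzero: here 3 * (-2)^n = -24
```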
3.10. Solution of homogeneous recurrences with constant coefficients. It is hard to find a universal mechanism for finding the solution of general homogeneous linear difference equations, that is, a directly computable expression for the general solution xₙ. In practical models, however, we very often meet equations where the coefficients are constant. In this case it is possible to guess a suitable form of the solution, and indeed we will be able to find k linearly independent solutions. This is a complete solution of the problem, as all other solutions will be linear combinations of these.

For simplicity let us start with equations of second order. Such equations are very often encountered in practical problems, where there are relations based on the two previous values. A linear difference equation (recurrence) of second order with constant coefficients is for us thus an expression

(3.4) f(n+2) = a·f(n+1) + b·f(n) + c,

where a, b, c are known scalar coefficients. For instance, in population models we can assume that the individuals in a population mature and start breeding two seasons later (that is, they add to the value f(n+2) a multiple b·f(n) with positive b > 1), while the immature individuals tire and destroy a part of the mature population (that is, the coefficient a is negative). Furthermore, it might be that somebody destroys (uses, eats) a fixed amount c every season.

A special case with c = 0 is for instance the Fibonacci sequence of numbers y₀, y₁, …, where yₙ₊₂ = yₙ₊₁ + yₙ.

If, when solving a mathematical problem, we don't have any new idea, we can always try to what success some known solution of a similar problem leads. Let us try to plug into the equation (3.4), with coefficient c = 0, a solution similar to that of the linear equations, that is, f(n) = λⁿ for some scalar λ. By plugging in we obtain

λⁿ⁺² − aλⁿ⁺¹ − bλⁿ = λⁿ(λ² − aλ − b) = 0.

This relation will hold either for λ = 0 or for the choice of the values

λ₁ = ½(a + √(a² + 4b)), λ₂ = ½(a − √(a² + 4b)).

We have thus determined when such solutions indeed work; we just have to choose the scalar λ suitably. But this is not enough for us, since we need to find a solution for any two initial values f(0) and f(1), and so far we have only found two specific sequences satisfying the given equation (or possibly only one sequence, if λ₂ = λ₁). As we have already derived for even very general linear recurrences, a sum of two solutions f₁(n) and f₂(n) of our equation f(n+2) − a·f(n+1) − b·f(n) = 0 is clearly again a solution of the same equation, and the same holds for scalar multiples of a solution.

B. Recurrent equations

Distinct linear recurrences can be a good tool for describing various models of growth. Let us begin with a very popular population model that uses a linear difference equation of second order:

3.4. Fibonacci sequence. In the beginning of the Spring, a stork brought two newborn rabbits, male and female, onto a meadow. The female is, after being two months old, able to deliver two newborns, male and female. The newborns can then start delivering after one month, and then every month. Every female is pregnant for one month and then she delivers. How many pairs of rabbits will there be at the meadow after nine months (if none of them dies and none "moves in")?

Solution. After one month, there is still one pair, but the female is already pregnant. After two months, the first newborns are delivered, thus there are two pairs. Every next month, there are as many new pairs as there were pregnant females one month before, which equals the number of at-least-one-month-old pairs, which equals the number of pairs two months before. The total number pₙ of pairs after n months is thus the sum of the numbers of pairs in the previous two months. For the numbers of pairs we thus have the following homogeneous linear recurrent formula:

(3.1) pₙ₊₂ = pₙ₊₁ + pₙ, n = 1, 2, …,

which along with the initial conditions p₁ = 1 and p₂ = 1 uniquely determines the numbers of pairs of rabbits at the meadow in the individual months. Linearity of the formula means that all members of the sequence (pₙ) appear in the first power; the meaning of the word recurrence is hopefully clear, and the homogeneity means that the absolute term is missing in the formula (see further below for a non-homogeneous formula). For the value of the n-th member we can derive an explicit formula. In searching for the formula we can use the observation that for certain r the function rⁿ is a solution of the difference equation without the initial conditions. This r can be obtained by plugging into the recurrent relation:

rⁿ⁺² = rⁿ⁺¹ + rⁿ,

and after dividing by rⁿ we obtain

r² = r + 1,

which is the so-called characteristic equation of the given recurrent formula. Our equation thus has the roots (1+√5)/2 and (1−√5)/2, and the sequences aₙ = ((1+√5)/2)ⁿ and bₙ = ((1−√5)/2)ⁿ, n ≥ 1, satisfy the given relation.
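Before deriving the explicit formula below, one can always fall back on direct iteration of (3.1); the following small sketch (ours) computes the rabbit counts and answers the nine-month question:

```python
def pairs(n):
    """Iterate the Fibonacci recurrence p_{n+2} = p_{n+1} + p_n, p_1 = p_2 = 1."""
    p_prev, p_cur = 1, 1   # p_1, p_2
    for _ in range(n - 2):
        p_prev, p_cur = p_cur, p_prev + p_cur
    return p_cur if n >= 2 else 1

print(pairs(9))   # 34 pairs of rabbits after nine months
```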
Our two specific solutions thus allow even more general solutions f(n) = Ci^ + C2Xn2 for arbitrary scalars C\ and C2 and for unique solution of the specific problem with given initial values /(0) and /(l) it remains just to find the corresponding scalars C\ and C2. (And we also need to check whether it is possible for any two initial values). 3.11. Choice of scalars. Let us show how this can work on at least one example. Let us concentrate on the problem that the roots of the characteristic polynomial are in general not in the same field of the scalars as the coefficients in the equation. Thus we solve the problem: 1 (3.5) yn+2 yo — 2, yi — 0. (±^)" and b„ 2 , ^..^ ^„ — (—^-)n, n > 1 satisfy the given relation. The relation is also satisfied by any linear combination, that is, any In our case is thus X\t2 — \ (1 ± \/3) and clearly yo — C\ + C2 — 2 yi = ^Ci(l + V3) + ^C2(1-V3) is satisfied for exactly one choice of these constants. Direct calculation yields C\ — I — \\fi, C2 — 1 4- 5V3 and our problem has unique solution f(n) = (1 - ^/3)^(1 + V3)" + (1 + ^/3)^d - V3)". Note that even if the found solutions for the equation with integral coefficients look complicated and are expressed with irrational (or possibly complex) numbers, we know a priori that the solution itself is again integral. Without this "step aside" into bigger field of scalars we would not be able to describe the general solution. We will meet with similar events very often. General solution allows us also without direct enumeration of constant to discuss qualitative behaviour of the sequence of numbers fin), that is, whether the values with growing n approach some fixed value or oscillate in some interval or are unbounded. 3.12. General case of homogeneous recurrences. Let us now try similarly as in the case of second order to plug in the choice x„ — X" for some (yet unknown) scalar X into the general homogeneous equation from the definition 3.9. For every n we obtain the condition X"-kia0Xk + aiXk~l ■ ■ ■ + ak) = 0 which means that either X — 0 or X is the root of the so-called characteristic polynomial in the parentheses. Characteristic polynomial is independent of n. Assume that the characteristic polynomial has k distinct roots X\,... ,Xk. We can for this purpose extended the field of scalars we are working in, for instance Q into R or R into C, because the result of the calculations will again solutions that stay in the original field thanks to the equation itself. Each of the roots gives us single possible solution xn — iXi)" ■ In order to be happy, we require k linearly independent solutions. 145 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS sequence c„ = san + tb„, s, t e R. The numbers s and t can be chosen so that the resulting combination satisfies the initial conditions, in our case c\ = 1, c2 = 1. For simplicity it is clever to define the zero-th member of the sequence as co = 0 and compute s and t from the equations for c0 and c\. We find out that s = — , t = and thus (1 + V5)" - (1 - V5)" (3.2) Pn 2"(V5) Such sequence satisfies the given recurrent formula and also the initial conditions c0 = 0, c\ = 1, thus it is the single sequence given by this requirements. Note that the value of the formula (||3.2||) is integer for any natural n (it gives the integer Fibonacci sequence), although it might not seem so at the first glance. □ 3.5. Simplified model for behaviour of gross domestic product. 
Consider the difference equation (3.3) yk+2 - a{\ + b)yk+i + abyk = 1, where yk is the gross domestic product at the year k. The constant a is the so-called consumption tendency, which is a macroeconomical factor that gives the fraction of money that the people spend (from what they have at their disposal), and the constant b describes the dependence of the measure of investment of the private sector on the consumption tendency. We further assume that the size of the domestic product is normalised such that on the right-hand side of the equation the result is 1. Compute the values y„ for a = |, b = |, yo = 1, y\ = 1. Solution. Let us first look for the solution of the homogeneous equation (the right side being zero) in the form of r1. The number r must be a solution of the characteristic equation a(\ + b)x +ab = 0, that is, x2 1 x + - = 0, 4 which has a double root \. All the solutions of the homogeneous equation are then of the form a(\)n + bn(^)n. Let us also note that if we find some solution of the non-homogeneous equation (the so-called particular solution), then if we add to it any solution of the homogeneous solution, we obtain another solution of the non-homogeneous equation. It can be shown that by this we obtain all solutions of the non-homogeneous equation. In our case (that is, when all the coefficients and the non-homogeneous term are constant) is a particular solution the constant In order to do this it suffices to check the independence by plugging k values for n — 0,..., k — 1 for k choices of Xi into the Casortian (see 3.9). We thus obtain the so-called Vandermonde matrix and it is a nice (but not entirely trivial) exercise to compute that for every k and any /c-tuple of distinct Xt is determinant of such matrix non-zero, see ||2.24|| on the page 87. But that means that the chosen solutions are linearly independent. We have thus found the fundamental system of solutions of the homogeneous difference equation in the case that all the roots of its characteristic polynomial are distinct. Consider now the multiple root X and plug into the definition the assumed solution x„ — nXn. We obtain the condition aotiX" + ■ ak(n k)Xn~k = 0. This condition can be rewritten using the so-called derivation of a polynomial (see ?? on the page ??), which we denote by apostrophe: X(a0Xn + ■ ■ ■ + akXn-k)' = 0 and right at the beginning of the fifth chapter we shall see that the root of a polynomial / has multiplicity greater than one if and only if it is a root of /'. Our condition is thus satisfied. With greater multiplicity £ of the root of the characteristic \^ polynomial we can proceed similarly and use the fact that a root with multiplicity £ is a root of all derivations of the polynomial up to £ — 1 (inclusively). Derivations look like this: \ n— k f(X) = a0Xn H-----\-akX f'(X) = a0nX"~L H-----h ak(n - k)X' n—k — 1 a0n(n-l)X"-2+- ■ ■ +ak(n-k)(n-k-l)X"-k-2 f «+1> =avn...(n-£)Xn-1-1 +. + ak(n — k) ... (n — k — £)X n-k-l-l Let us look on the case for a triple root X and try to find a solution in the form n2Xn. Plugging into the definition we obtain the equation a0n2Xn + --- + ak(n- k)2Xn~k = 0. Clearly the left side equals the expression X2f"(X) + Xf'(X) and because X is a root of both derivations, the condition is satisfied. Using induction we easily prove that even for general condition for the solution in the form x„ —nlXn, a0nlXn + ... 
ak(n - k)lXn~k = 0, the solution can be obtained as a linear combination of the derivations of the characteristic polynomial starting with the expression Xl+lf^+l-Xl£(£ + \)fil) + ... and we have thus came close to the complete proof of the following: Theorem. Every homogeneous linear difference equation of the order k over any field of scalars K contained in the complex numbers K has as a set of all solutions a k-dimensional vector space generated by the sequences xn — nlXn, where X are (complex) roots of the characteristic polynomial and the powers £ run overall 146 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS y„ = c. By plugging into the equation we obtain c is, c = 4. All solutions of the difference equation 1 yk+2 - yk+\ + - • yk = l c + ±c = l,that are thus of the form 4 + a (pn + bn (pn. We require that yo = yi = 1 and these two equations give a = b = —3, thus the solution of this non-homogeneous equation is 3" 1 i Again, as we know that the sequence given by this formula satisfies the given difference equation and also the given initial conditions, it is indeed the only sequence characterised by these properties. □ In the previous case we have used the so-called method of indeterminate coefficients. It is based on the following: on the basis of the non-homogeneous term of the given difference equation we "guess" the form of the particular solution. The forms of the particular solutions are known for many non-homogeneous terms. For instance the equation (3.4) yn+k +alyn+k_l H-----h akyn = Pm(n), where P (m) is a polynomial of degree n and the corresponding characteristic equation has real roots, has (almost always) particular solution of the form Qm(n), where Qm(n) is a polynomial of degree m. Other possible way how to solve such equation is the so-called variation of constants method, where we first find a solution y(n) = ^CifiQi) r = l of the homogenised equation and we consider the constants ct as functions Ci(n) of the variable n and we look for a particular solution of the given equation in the form y(n) = ^Ci(n)fi(n). i=\ Let us show on the picture the values of ft for i < 35 with the equation fin) = 9-f(n - 1) - lf(n - 2) + 1 /(0) = /(l) = 1. natural numbers between zero and multiplicity of the corresponding root k. Proof. Aforementioned relations between the multiplicity of a root and the derivation of the polynomial will be proven later, and we won't prove the fact that every complex polynomial has exactly that many roots (counting multiplicities) as is its degree. Thus it just remains to prove that the found k-tuple of solutions is linearly independent. Even in this case we can inductively prove that the corresponding Casortian is non-zero, as we have already done in the case of Vandermonde determinant before. For illustration of our approach we show how does the calculation look like for the case of a root k\ with multiplicity one and a root X2 with multiplicity two: C(k\,k\,nk\) ,« + 1 1 «+2 k2 ^2+1 i«+2 — k\k2 — k\k2 1 kl 1 k2 kl k2 1 1 n 1 2n ~k\k2 k\ — X2 kl(ki-k2) k\ — k2 "2 n (n + l)k2 (n + 2)k\ 1 n 0 k2 0 k1 nk\ (n + \)kn+l (n + 2)kn+2 k2 kl(k\ — k2) ki n i 2« + l klk2 (ki - k2y / 0. In the general case the proof can be carried on in a completely similar way, inductively. □ 3.13. Real basis of the solutions. For equations with real coefficients the initial real conditions always lead to real solutions. Still, the corresponding fundamental solutions derived using the just proven theorem might exists only in the complex domain. 
3.13. Real basis of the solutions. For equations with real coefficients, real initial conditions always lead to real solutions. Still, the corresponding fundamental solutions derived using the just proven theorem might exist only in the complex domain. Let us therefore try to find other generators, which will be more convenient for us. Because the coefficients of the characteristic polynomial are real, each of its roots is either real, or the roots come in pairs of complex conjugates. If we describe the solution in the polar form as

λⁿ = |λ|ⁿ(cos nφ + i·sin nφ),

then the pair of generators associated with the conjugate roots λ and λ̄ can be replaced by the real generators given by half of their sum, |λ|ⁿcos(nφ), and by half of the i-multiple of their difference, |λ|ⁿsin(nφ).

The unevenness of the curves is a consequence of imprecise plotting; both signals are of course smooth sine curves. Note that in the areas where the resulting signal is roughly as strong as the original there is a dramatic shift in the phase. Cheap equalisers indeed work in such a bad way.

Solution. The characteristic polynomial of the given equation is x⁴ − x³ − x + 1. Looking for its roots, we solve the reciprocal equation

x⁴ − x³ − x + 1 = 0.

The standard procedure is to divide the equation by x² and then use the substitution t = x + 1/x, that is, t² = x² + 1/x² + 2. We obtain the equation

t² − t − 2 = 0,

with the roots t₁ = −1, t₂ = 2. For both of these values of the indeterminate t we solve separately the equation given by the substitution. For t₁ = −1 it is

x + 1/x = −1.

It has two complex roots: x₁ = −½ + (√3/2)i = cos(2π/3) + i·sin(2π/3) and x₂ = −½ − (√3/2)i = cos(2π/3) − i·sin(2π/3). For the second value of the indeterminate t we obtain the equation

x + 1/x = 2,
While for solving the systems of linear equation we needed only minimal knowledge of properties of linear mappings, in order to understand the behaviour of an iterated system we need to know the properties of eigenvalues, properties of eigenvectors and other structural results. In a sense we are in the same environment as with linear recurrences and actual our description of filters in previous paragraphs can be described in such way. Imagine that we are working with sound and are keeping track by the state vector Yn — (xn, ■ Xn-k+l) of all values from the actual one to the last one that is yet being processed in our linear filter. In one time interval (for the frequency of audio signal a very short one) we then move to the state vector Yn + l — (xn + l, X„, . . . , X„-k+2), where the first value x„+i = a\x„-\-----ha,tx«-,t+i is computed as with homogeneous difference equations, the others are just shift by one position and last one is forgotten. The corresponding square matrix of order k that satisfies Yn+\ — A ■ Yn looks as follows: (a\ «2 ••• a-k\ 1 0 ... 0 0 A = 0 1 0 0 \0 0 ... 1 O.J For such simple matrix we have derived explicit procedure for the complete formula for the solution. In general, it wont be so easy even for very similar systems. One of the typical cases is study of dynamics of a population in distinct biological systems. Note also that the matrix A has (understandably) the characteristic polynomial p(X) = Xk - aik k-l ak, as can be easily derived by expanding the last column and the recurrence. That is explainable also directly, because the solution x„ — X", X / 0 basically means that the matrix A by multiplication takes the eigenvector (Xk,..., X)T to its A-multiple. Thus such X must be eigenvalue of the matrix A. 3.18. Leslie model for population growth. Imagine that we are dealing with some system of individuals (cattle, insects, cell cultures, etc.) divided into m groups (according to their age, evolution stage, etc.). The state X„ is thus given by the vector Xn — (ui, ..., um)T depending on the time t„ in which we are observing the system. Linear model of evolution of such system is then given by the matrix A of dimension n, which gives the change of the vector X„ 150 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS □ 3.10. Solve the following difference equation: -"-«+4 = -"-«+3 -"-«+2 ~i~ -"-« + 1 -*-«• Solution. From the theory we know that the space of the solutions of this difference equation is a four-dimensional vector space whose generators can be obtained from the roots of the characteristic polynomial of the given equation. The characteristic polynomial is x4 - x3 + x2 - x + 1 = 0. It is a reciprocal equation (that means that the coefficients at the (n — /c)-th and /c-th power of x, k = 1, are equal). Thus we use the substitution u = x + £. After dividing the equation by x2 (zero cannot be a root) and substituting (note that x2 + \ = u2 — 2) we obtain 2 1 1 2 X — X + 1---1--7=" - u — 1=0. 1±VI X x^ Thus we obtain the indeterminates «1,2 = 1^2- From there then by the equation x2 — ux + 1 = 0 we determine the four roots Xl, 2,3,4 1± V5±V-10±2V5 Now we note that the roots of the characteristic equation could have been "guessed" right away - it is x5 + 1 = (x + l)(x4 - x3 + x2 - X + 1), and thus the roots of the polynomial x4 — x3 + x2 — x + 1 are also the roots of the polynomial x5 + 1, which are exactly the fifth roots of the — 1. 
By this we obtain that the solutions of the characteristic polynomial are the numbers x\ ,2 = cos( j) ± i sin(y) and x3 4 = cos(^) ± i sin(3j-). Thus the real basis of the space of the solution of the given difference equation is for instance the basis of the sequences cos(rj-), sin(rj-), cos(3^ZL) and sin(3^ZL), which are sines and cosines of the arguments of the corresponding powers of the roots of the characteristic polynomial. Note that we have by the way derived the algebraic expressions for ^10~2^ cos(3i) = 2^=1 and sin(3f) = cos(f) = ±±^, sin(f) V10+2 VI .5/ 4 . ^"v5/ 4 ' w"v 5 ' 4 M 4 (because all the roots of the equation have the absolute value 1, they are real (imaginary) parts of the corresponding roots). □ 3.11. Determine the explicit expression of the sequence satisfying the difference equation x„+2 = 2x„+i — 2x„ with members x\ = 2, x2 = 2. Solution. The roots of the characteristic polynomial x2 — 2x + 2 are 1 + i and 1 — i. The basis of the (complex) vector space of the solution to Xn+i — A ■ X„ when time changes from t„ to ?„+i. Let us show as an example the so-called Leslie model for population growth, where there is the matrix (fl h h ■ fm — 1 fm^ 0 0 . 0 0 0 ^2 0 . 0 0 0 0 ^3 0 0 \0 0 0 ... rm-i 0/ whose parameters are tied with the evolution of a population divided into m age groups such that f denotes the relative fertility of the corresponding age group (in the observed time shift from N individuals in the z-th group arise new f N ones - that is, they are in the first group), while t, is relative mortality in the z-th group in one time interval. Clearly such model can be used with any number of age groups. All coefficients are thus non-negative real numbers and the numbers t, are between zero and one. Note that when all x are equal one, it is actually a linear recurrence with constant coefficients and thus has either exponential growth/decay (for real roots X of the characteristic polynomial) or oscillation connected with potential growth/decay (for complex roots). Before we introduce more general theory, let us play for a while with this specific model. Direct computation with the Laplace expansion of the last column yields the characteristic polynomial pm (X) of the matrix A for the model with m groups: Pm(X) = \A- XE\ = -kPm-i(k) + (-l)m~7mTl . . . Xm — \. Easily by induction we derive that this characteristic polynomial is of the form Pm(X) = (-l)m(X" ■ fli X m — l -\X — am) and mainly non-negative coefficients a\,..., am, if all parameters xi and fi are positive. For instance it is always — fm X\ . . . Xm — 1. Let us qualitatively estimate the distribution of the roots of the polynomial pm. Sadly, details of this procedure could fbe exactly explained only later, after understanding some parts of the so-called mathematical analysis in the chapter five and later, however it should all be intuitively clear even now. We express the characteristic polynomial in the form pm(X) = ±Xm(\-q(X)) where q(X) — a\X~l + • • • + amX~m is a strictly decreasing non-negative function for X > 0. Evidently there exists exactly one positive X for which q(X) — 1 and thus also pm (X) — 0. In other words, for every Leslie matrix there exists exactly one positive real eigenvalue. For actual Leslie models of populations all coefficients t, and fj are between one and zero and a typical situation is when the only real eigenvalue Xi is greater or equal to one, while the absolute values of the other eigenvalues are strictly smaller than one. 151 CHAPTER 3. 
LINEAR MODELS AND MATRIX CALCULUS is thus formed by the sequences y„ = (1 + /)" and z„ = (1 — /)". The sequence in question can thus be expressed as a linear combination of these sequences (with complex coefficients). It is thus x„ = a-yn+b-Zn, where a = a\ + ia2, b = b\ + ib2. From the recurrent relation we compute x0 = \(2x\ — x2) = 0 and by substitution n = 0 and n = 1 into the expression of xn we obtain 1 = x0 = a\ + ia2 + b\ + ib2 2 = xi = (fli + ia2)(\ + i) + (bi + ib2){\ - /), and by comparing the real and the complex part of both equations we obtain a linear system of four equations i a\ + bi a2 + b2 a\ - a2 + b\ + b2 ai + a2 — t>i + b2 = 0 with solution a\ = b\ = b2 = ^ and a2 = —1/2. Thus we can express the sequence in question as Xn = (\- ^0(1 + 0" + (\ + - if- The sequence can also be expressed using the real basis of the (complex) vector space of the space of solutions, that is, using the sequences u„ = \{y„+ zn) = (\/2)" cos(^) and v„ = \i(z„ - y„) = (V2)" sin(rj-). The transition matrix for the changing the basis from the complex one to the real one is Mi -$)• the inverse matrix is T~l = 0 ^, for expressing the sequence x„ using the real basis, that is, for expressing the coordinates (c, d) of the sequence x„ under the basis {u„, vn}, we have If we begin with any state vector X which is given as a sum of eigenvectors X — Xi + ■ ■ ■ + xm with eigenvalues Xi, then iterations yield X X\xr + ■ ■ x\,xn thus under the assumption that \Xi\ < 1 for all i > 2, all components in the eigensubspaces decrease very fast, except for the component X\X\. Distribution of the population among the age groups are thus very fast approaching the ratios of the components of eigenvector to the dominant eigenvalue X\. For example for the matrix (let us realise the meaning of individual coefficient, they are taken from the model for sheep breeding, that is, the values x contain both natural deaths and activities of breeders) ( 0 0.2 0.8 0.6 o\ = 1 0.95 0 0 0 0 A = 0 0.8 0 0 0 = 0 0 0 0.7 0 0 = 2 1 o 0 0 0.6 0/ the eigenvalues are approximately 1.03, 0, -0.5, -0,27 + 0.74/, -0.27-0.74/ with absolute values 1.03, 0, 0.5, 0.78, 0.78 and the eigenvector corresponding to the dominant eigenvalue is approximately XT = (30 27 21 14 8). We have immediately chosen the eigenvector whose coordinates sum to 100, it directly gives us the percentual distribution of the population. If we instead of three-percent total growth of the population rather wanted constant number and said that we will eat sheep from second group, we would be asking the question how much shall we decrease x2 so that the dominant eigenvalue would be one. 3.19. Matrices with non-negative elements. Real matrices that have no negative elements have very special properties. Also, they are very often present in practical /zj$£fakvfe models. We shall thus introduce the so-called Perron--l^jf^^K-J— Frobenius theory which deals with such matrices. Let us begin with definition of some notions in order to be able to formulate our ideas. | Positive and primitive matrix [__> L Definition. Under positive matrix we understand a square matrix A whose all elements atj are real and strictly positive. Primitive matrix is such square matrix A such that some power Ak is positive. 
thus we have again an alternative expression of the sequence xn where there are no complex numbers (but there are square roots): x„ = (V2rcos(^) + (V2rsin(^), which we could have obtained by solving two linear equations in two variables c, d, that is, 1 = x0 = c ■ u0 + d ■ v0 = c and 2 = x\ = c ■ u\ + d ■ v\ = c + d. □ Let us recall that spectral radius of matrix A is the maximum of absolute values of all (complex) eigenvalues of A. Spectral radius of a linear mapping over (finitely-dimensional) vector space is the spectral radius of the corresponding matrix under some basis. 2 Norm of a matrix A e Rn or of a vector x e Rn is the sum of absolute values of all elements. For vector x we write |x| for its norm. The following result is very useful and hopefully also well understandable. Its proof is with its hardness quite atypical for this 152 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS 3.12. Determine the explicit expression of the sequence satisfying the difference equation x„+2 = 3x„+1 + 3x„ with members x\ = 1 and x2 = 3. O 3.13. Determine the explicit formula for the n -th member of the unique solution {x„}™=l that satisfies the following conditions: Xn+2 = Xn + \ — X„, , X\ = 1, x2 = 5. o 3.14. Determine the explicit formula for the n -th member of the unique solution {x„}™=l that satisfies the following conditions: — Xn+3 = 2Xn+2 + 2xn + i + X„, X\ = 1, X2 = 1, X3 = 1. o 3.75. Determine the explicit formula for the n -th member of the unique solution {xn}^=l that satisfies the following conditions: — X„+3 = 3x„+2 + 3x„ + i + X„, X\ = 1, X2 = 1, X3 = 1. o C. Population models Population models which we are to deal with right now will have recurrence relations in vector spaces. The unknown in this case is not a sequence of numbers but a sequence of vectors. The role of coefficients is played by matrices. We begin with a simple (two-dimensional) case. 3.16. Savings. With a friend we are saving for a holiday together by monthly payments in the following way. At the beginning I give 10 €and he gives 20 €. Every consecutive month each of us gives as many as last month plus one half of what the other has given the month before. How much will we have after one year? How much many will I pay in the twelfth month? Solution. The amount of many I pay in the 72-th month is denoted as x„ and the amount my friend is paying is y„. The first month we thus givex! = 10, yi = 20. For the following payments we can write down a recurrent relation: Xn + l = Xn -\- ~^y-n y«+i = y» + 2xn If we denote the common savings as z,„ = xn + y„, then by summing the equations we obtain zn+i = zn + \zn = \zn- That is a geometric sequence and we obtain z,„ = 3.(|)"_1. In a year we will thus have zi + z,2 + • • • + z,\2- This partial sum is easy to compute textbook, so we give at least a vague idea how to do it. If the reader has some problems with smooth reading, we suggest skipping the proof immediately. Theorem (Perron). If A is a primitive matrix with spectral radius X € M, then X is a simple root of characteristic polynomial of the matrix A, which is strictly greater than the absolute value of any other eigenvalue of the matrix A. Furthermore, there exist eigenvector x associated with X such that all elements x, of x are positive. Vague idea. In the proof we shall rely on the intuition from \\ elementary geometry. 
Partly we will make the used concepts more precise in the analytical geometry in the fourth chapter, some analytical aspects will be W studied in more detail in the fifth chapter and later, and some claims won't be proven in this textbook at all. Hopefully the presented ideas will not just illuminate the theorem but also will motivate for deeper study of geometry and analysis by themselves. Let us begin with a understandable auxiliary lemma: Lemma. Consider any polyhedron P containing the origin 0 e W. If some iteration of the linear mapping iff : M" -> M" maps P in its inside, then the spectral radius of the mapping iff is strictly smaller than one. Consider the matrix A of the mapping i/r under the standard basis. Because the eigenvalues of Ak are the k-th powers of the eigenvalues of the matrix A, we can without loss of generality assume that the mapping i/r already maps P into its inside. Clearly i/r cannot have any eigenvalue with absolute value greater than one. Let us argue by contradiction. Assume that there exists eigenvalue X with \X\ — 1. Thus there are two possibilities, either Xk — 1 for suitable k or there is no such k. The image of P is a closed set (that means that when the points in the image group about some point y in M", the point y is also in the image) and the border of P is not intersected at all by the image. Thus ^ cannot have a fixed point on the border and there cannot even be any point on the border to which the points in the image would converge. The first argument excludes that some power of X is one, because such fixed point on the border of P would then exist. In the remaining case there would definitely be a two-dimensional subspace W C M" on which the restriction of i/r acts as a rotation by an irrational argument and thus there definitely exist a point y in the intersection of W with the border of P. But then the point y could be approached arbitrarily close by the points from the set i[r" (y) (through all iterations) and thus would have to be in the image also. That leads to a contradiction and thus the lemma is proven. Now let us prove the Perron theorem. Our first step is ensuring the existence of the eigenvector which has all elements positive. Let us consider the so-called standard simplex S — {x — (xi ,x„Y , |x| = \,xt >(U = 1, Because all elements in the matrix A are non-negative, the image A ■ x has all non-negative coordinates as x does and at least one of them is always non-zero. The mapping x 1 (A ■ x) thus 3(1 + - H----+ = 3 1-1 772,5. maps S to itself. This mapping S —>• S satisfies all the assumptions of the so-called Brouwer fixed point theorem and thus there exists vector y e S such that it is mapped by this mapping to itself. That 153 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS In a year we will have saved over 772 €. The recurrent system of equation describing the savings system can be written by matrices as follows: means that xn + l y«+i 1 f \ / Xr 7 1/ Wr- it is thus again a geometric sequence. Its elements are now vectors and the quotient is not a scalar, but a matrix. The solution can be found analogously: The power of the matrix acting on the vector (xi,yi) can be found by expressing this vector in the basis of eigenvectors. The characteristic polynomial of the matrix is (1 — A)2 — 4- — 0 and the eigenvalues are thus A 1,2 3 j_ 2' 2 . The corresponding eigenvectors are thus (1,1) and (1,-1). 
For the initial vector (xi,yi) = (1, 2) we compute 'l\ 3 /l\ l/l and thus 2 VI 3 /3\"_1 /l 2 V2 -1 1 /1\"_1 / 1 2 V2 That means that in the 12. month I pay Xl2 12 12 130 EUR and my friend pays basically the same amount. □ Remark. The previous example can be solved also without matrices by rewriting the recurrent equation: x„ = xn + \yn = \xn + \z,n- The previous example was actually a model of growth (in the case of growth of saved money). Let us now go to the models of growth describing primarily a growth of some population. Leslie model of population growth with which we have coped with in great detail in the theoretical part describes very well not only populations of sheep (according to which it was developed), but can be also applied in modelling of the following populations: 3.17. Rabbits for the second time. Let us show how the Leslie model can describe the population of the rabbits on the meadow with which we have worked in the exercise (||3.4||). Let us consider that the rabbits are dying after reaching the ninth year of age (in the original model the rabbits were immortal). Let us denote the numbers of rabbits according to their age in months in time t (in months) as x\(t), x2(t),..., x9(t), then the numbers of rabbits in individual categories are after one month described by the formula x\{t + 1) = A ■ y = ky, k = \ A ■ y\ and we have found an eigenvector that lies in S. Because some power of Ak has due to our assumption all elements positive and of course we have Ak ■ y = kky, all elements of the vector y are strictly positive (that is, they he inside of S) and k > 0. In order to prove the rest of the theorem, we will consider the mapping given by the matrix A in a more suitable basis and furthermore we shall multiply it by a constant A-1: B = A_1(y_1 ■ A ■ Y), where Y is a diagonal matrix with coordinates yi of a just-found eigenvector y on a diagonal. Evidently B is also a primitive matrix and furthermore the vector z = (1,..., l)T is its eigenvector, because clearly Y ■ z = y. If we know prove that [i = 1 is a simple root of the characteristic polynomial of the matrix B and that all other roots have absolute value strictly smaller than one, the proof is finished. In order to do that we use the auxiliary lemma. Consider the matrix B to be a matrix of a linear mapping that maps the row vectors u = («1 ««) B = v, that is, using multiplication from the right. Thanks to the fact that z = (1,..., l)T is an eigenvector of the matrix B, the sum of the coordinates of the row vector i; i,i=\ i = \ whenever u e S. Therefore the mapping maps the simplex S on itself and thus has in S a (row) eigenvector w with eigenvalue one (fixed point, thanks to the theorem of Brouwer). Because some power Bk contains only strictly positive elements, the image of the simplex S in the &-th iteration of the mapping given by B lies inside of S. We are getting close to using our lemma prepared for this proof. We shall still work with the row vectors. Denote by P the shift of the simplex S into the origin by the eigenvector w we have just found, that is, P = —w + S. Evidently P is a polyhedron containing the origin and the vector subspace V c K" generated by P is invariant to the action of the matrix B through multiplication of the row vectors from the right. Restriction of our mapping on P thus satisfies the assumptions of the auxiliary lemma and thus all its eigenvalues are strictly smaller than one. 
We have yet to deal with the problem that the just considered mapping is given by multiplication of the row vectors from the right with the matrix B, while originally we were interested in the mapping given by the matrix B and multiplication of the column vectors from the left. But that is equivalent to the multiplication of the transposed column vectors with the transposed matrix B in the usual way - from the left. Thus we have proven the claim about eigenvalues for the transpose of B. But transposing does not change the eigenvalues. Dimension of the space V is n — 1, thus completing the proof. □ 154 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS or x3(t) + -- • +x9(0, Xi (t + l) = Xi -1 (0, pro i = 2,3 /xi(t+l)\ 1 1 1 1 1 1 1 Ai(0\ x2(t+l) 1 0 0 0 0 0 0 0 0 x2(t) x3(t+l) 0 1 0 0 0 0 0 0 0 x3(t) x4(t+ 1) 0 0 1 0 0 0 0 0 0 Mi') x5U + 1) = 0 0 0 1 0 0 0 0 0 x5(i) x6(t + 1) 0 0 0 0 1 0 0 0 0 X(,(t) xi (t + 1) 0 0 0 0 0 1 0 0 0 xi(t) x8(7+ 1) 0 0 0 0 0 0 1 0 0 Mf) \x9(f + 1)7 Vo 0 0 0 0 0 0 1 0/ \x9(f)/ ,10, Characteristic polynomial of the given matrix is A9 — A7 —A6 —A5 —A4 — A3 — A2 — A — 1. Roots of this equation are hard to explicitly express, but we can estimate one of them very well - Ai = 1, 608 (why must it be smaller than (V5 + l)/2)?). Thus the population grows according to this model approximately with the geometric sequence 1, 608'. 3.18. Pond. Let us have a simple model of a pond where there lives a population of white fish (roach, bleak, vimba, nase, etc.). We assume that 20 % of babies survive their second year and from that age on they are able to reproduce. For these young fish, approximately 60 % of them survives their third year and in the following years the mortality can be ignored. Furthermore we assume that the birth rate is three times the number of fish that can reproduce. Such population would clearly very quickly fill the pond. Thus we want to maintain a balance by using a predator, for instance esox. Assume that one esox eats per year approximately 500 mature white fish. How many esox should be put into the pond in order for the population to stagnate? Solution. If we denote by p the number of babies, by m the number of young fish and by r the number of adult fish, then the state of the population in the next year is given by: 3m + 3r 0,2p ,0, 6m + xr , where 1 — r is the relative mortality of the adult fish caused by the esox. The corresponding matrix describing this model is then If the population is to stagnate, then this matrix must have eigenvalue 1. In other words, one must be the root of the characteristic polynomial of this matrix. That is of the form A2(r - A) + 0, 36 - 0, 6.(r - A) = 0. That means that r must satisfy r - 1+0.36 - 0.6(r - 1) = 0 0.4r - 0.04 = 0 3.20. Simple corollaries. The following very useful claim has with the knowledge of the Perron theorem a surprisingly simple proof and shows how strong is the prop-erty of the primitive matrix of a mapping. Corollary. If A — (flij) is a primitive matrix and x e W its eigenvector with all coordinates non-negative and eigenvalue A, then A > 0 is the spectral radius of A. Furthermore it holds that mm £>,7 0. >From the theorem of Perron we know that the spectral radius 11 is an eigenvalue and choose such eigenvector y associated with [i such that the difference x — y has only strictly positive coordinates. Then necessarily for all the powers of n we have 0 < A" ■ (x - y) = A"x - fi"y, but we also have that A < ji. From there we directly have A — ji. 
It remains to estimate the spectral radius using the minimum and maximum of sums of individual columns of the matrix. We denote them by />m;n and bmax, choose x to be a vector with the sum of coordinates equal to one and count: X! auxj - X! "kXi ~k i,j=l i = l n y n \ n * = (fly )xj ^12 b™*xj n / n v. n A = j2\12aij )xj - X^min*/ : = 1 vr = l □ Note that for instance all Leslie matrices from 3.18, where all the coefficients f and tj are strictly positive, are primitive and thus we can apply on them the just derived results. Perron-Frobenius theorem is a generalisation of the Perron theorem for more general matrices, which we won't give here. More information can be found for instance in ??. 3.21. Markov chains. Very frequent and interesting case of linear processes with only non-negative elements in matrix is a mathematical model of a system which can be in one of m states with various probabilities. In a given point of time the system is in state i with probability X{ and transition form the state i to the state j happens with probability ?y. We can write the process as follows: at time n the system is described by the probability vector x„ - («l(«), um{n)) That means that all components of the vector x are real non-negative numbers and their sum equals one. Components give the distribution of the probability of individual possibilities for the state of the system. The distribution of the probabilities at time 155 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS 0.6 At + 0.5 Kk, -0.16 Dk + 1.2 Kk; In the next year only 10 % is allowed to survive and the rest should be eaten by the esox. If we denote the desired number of esox by x, then together they eat 500x fish, which should according to the previous computation be 0.9r. The ratio of the number of white fish to the number of esox should thus be r- = ^j. That is approximately one esox for 556 white fish. □ In general, we can work with the previous model as follows: 3.19. Let in the population model prey-predator be the relation between the number of predators Dk and preys Kk in the given and the following month (ieNU {0}) be given by the linear system (a) Dk+l Kk+i (b) Djt+i = 0.6 + 0.5 Kk, Kk+l = —0.175 Dk + 1.2 Kk; (c) Djt+i = 0.6 + 0.5 Kk, Kk+l = —0.135 Dk + 1.2 Kk. Let us analyse the behaviour of this model after a very long time. Solution. Note that individual variants differ from each other only in the value of the coefficients at Dk in the second equation. We can thus express all three cases as 'DA (0.6 0.5\ (Dk_x KkJ \-a 1.2j'\Kk-ly where we gradually set a = 0.16, a = 0.175, a = 0.135. The value of the coefficient a represents here the average number of preys killed by one (clearly a „humble") predator per month. When denoting y—a 1.21 we immediately obtain Using the powers of the matrix T we can determine the evolution of the populations of predators and preys after a very long time. We easily compute the eigenvalues (a) kt = 1, k2 = 0.8; (b) kt = 0.95, k2 = 0.85; (c) ki = 1.05, k2 = 0.75 the matrix T is and the respective eigenvectors are (a) (5,4)r, (5,2)r; (b) (10,7)r, (2, if; (c) (10,9)r , (10, 3f. k e N, jk l Do k e N. n + 1 is given by multiplying the probabilistic transition matrix T — (tij), that is, Xfi+\ — T - xn. Because we assume that the vector x captures all possible states and thus with total probability one again transits into some of the state, all columns of T are also given by probabilistic vectors. Such process is called (discrete) Markov process and the resulting sequence of vectors xo, x \, ... 
is called Markov chain xn. Note that every probabilistic vector x is actually mapped by a Markov process on a vector with a sum of coordinates equal to one: j2tijxj = X(X'/;)V/ = Xv; = '• >J J ' J Now we can use the Perron-Frobenius theory in its full power. Because the sum of the rows of the matrix is always equal to the vector (1,..., 1), we can easily see that the matrix T—E is singular and thus one is surely an eigenvalue of the matrix T. If furthermore T is a primitive matrix (for instance, when all elements are non-zero), from the corollary 3.20 we know that one is a simple root of the characteristic polynomial and all others have absolute value strictly smaller than one. Theorem. Markov processes with the matrix that has no zero element or that some its power has this property, satisfy: • there exist unique eigenvector im for the eigenvalue 1 which is probabilistic, • the iteration Tkxo approaches the vector x^ for any initial probabilistic vector xq. Proof. This claim follows directly from the positivity of the coordinates of the eigenvector derived in the Perron theorem. Assume first that the algebraic and geometric multiplicities of the eigenvalues of the matrix T are the same. Then every probabilistic vector xo can be (in complex extension C") written as linear combination x0 — ClX0 ■ c2u2 where u2... ,u„ extend Xoo to a basis of the eigenvectors. But then the &-th iteration gives again a probabilistic vector Xk Tk-x0 + X2c2u2-\-----hA/„«„. Because all eigenvalues X2, ■ ■ ■ Xn are in absolute value strictly smaller than one, all components of the vector xk but the first one approach (in norm) zero very rapidly. But xk is still probabilistic, thus it must be that c\ — 1 and the second claim is proven. In reality even with distinct algebraic and geometric multiplicities of eigenvalues we reach the same conclusion using a more detailed study of the so-called root subspaces of the matrix T which we reach in the connection with the so-called Jordan matrix decomposition even in this chapter, see the note 3.33. Even in the general case we reach in the eigensubspace (xqo) a uniquely determined invariant (n — 1)-dimensional complement, on which are all eigenvalues in absolute value smaller than one and thus the corresponding component in xk approaches zero as before. □ 156 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS For & e N thus holds that (a) 5 5\ (I 0\* (5 5W 4 21 ' \0 0.8/ ' U 2 ,i _ j'10 2\ /0.95 0 \* /10 2^ 1 7 1/ ' l 0 0.85/ ' I 7 1 (b) (c) ) 10\ /1.05 0 V /10 10x _1 9 3 J ' V 0 0.75) ' \ 9 3 >From there we further have for big 4eN that (a) (5 5\ (\ 0\ (5 5 (b) 4 2/ \0 0/ \4 2 j_ /-10 25 10 V -8 20 10 2\ /0 0\ /10 2 (c) 7 1/ \0 0/ V 7 1 0 0 0 0 ,*_./'10 10\ /1,05* 0\ /10 10 9 3 / V 0 0/ V 9 3 1,05* /-30 100N 60 V"27 90 / ' because exactly for big ieNwe can set (a) 1 0 \ /l 0 0 0.8/ ~ VO 0 0.95 0 V /0 0 0 0.85/ ~ VO 0 1.05 0 V /1.05* 0 (b) (c) 0 0.15 J \ 0 Oj Let us note that in the variant (b), that is for a = 0.175, it was not necessary to compute the eigenvectors. Thus we have obtained (a) Kk)~10\-8 20) '\K0 = J_(5 (-2D0 + 5K0) 10 14 (-2D0 + 5K0) 3.22. Iteration of the stochastic matrices. Matrices of Markov chains, that is, matrices whose rows have sum of their components equal to one are called stochastic matrices. Standard problems connected with Markov processes contain answers to the question about the expected time elapsed between transition from one state to another and so on. Right now we are not prepared for solving these problems, but we return to this topic later. 
We reformulate the previous theorem into a simple, but surprising result. By convergence to a limit matrix in the following theorem we mean the following: if we say that we want to bound the possible error e > 0, then we can find a bound on the number of iterations k after which all the components of the matrix differ from the limit one by less than e. Corollary. Let T be a primitive stochastic matrix from a Markov process and let Xqo be the stochastic eigenvector for the dominant eigenvalue I (as in the theorem before). Then the iterations Tk converge to the limit matrix T^, whose columns all equal to iw. Proof. Columns in the matrix Tk are images of the vectors of the standard basis under the corresponding iterated linear mapping. But these are images of the probabilistic vectors and thus all of them converge to x^. □ Now for a short goodbye to Markov processes we think about the problem whether there exist for a given system the states into which the system tends to get in and stay in them. We say that a state is transient, if the system stays in it with probability strictly smaller than one. State is absorbing if the system stays in it with probability one and into which the system can get with non-zero probability from any of the transient states. Finally, Markov chain x„ is absorbing, if all its states are either absorbing or transient. If in the absorbing Markov chain first of r states of the system are absorbing, then for the stochastic matrix T of the system this means that it decomposes into "block-wise" upper triangular form E R 0 Q where £ is a unit matrix whose dimension is given by the number of absorbing states, while R is a positive matrix and Q non-negative. In any case, iterations of this matrix yield a matrix which has the same block of zero values in the bottom-left block and thus it is not primitive, for instance r2 = R + R-Q Q2 Even about such matrices we can obtain many information using the full Perron-Frobenius theory and with knowledge of probability and statistics also estimate expected time after which the system gets into one of the absorbing states. 4. More matrix calculus On pretty practical examples we have seen that understanding the inner structure of matrices and their properties is a strong tool for specific computations and analyses. Even more is it true for effectivity of numerical calculations with matrices. Therefore we will for a while deal with abstract theory. 157 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS (b) (c) 0 0 0 0 Dk 1.05* 60 1.05* 60 D0 K0 -30 100 -27 90 Do K0 10(-3D0 + 10K0) 9(-3D0 + 10Z0) These results can be interpreted as follows: (a) If 2Do < 5Ko, the sizes of both populations stabilise on nonzero sizes (we say that they are stable); if 2D0 > 5K0, both populations die out. (b) Both populations die out. (c) For 3D0 < 10 K0 begins a population boom of both kinds; for 3 Do > 10Ko both populations die out. Even a tiny change of the size of a can lead to a completely different result. This is caused by the constantness of the value of a - it does not depend on the size of the populations. Note that this restriction (that is, assuming a to be constant) has no interpretation in reality. But still we obtain an estimate on the sizes of a for stable populations. □ 3.20. Remark. Other model for the populations of predators and preys is the model by Lotka and Volterra, which describes a relation between the populations by a system of two ordinary differential equations. 
Using this model both populations oscillate, which is in accord with observations. In linear models an important role is played by the primitive matrices (3.19). 3.21. Which of the matrices 0 1/7 1 6/7 D 0 /1/2 0 l/3\ r 1 0 B = ° 1 1/2 , C = 1/4 0 1/2 \l/2 0 1/6/ \3/4 0 1/2 1/2 0 0 \ (0 1 0 0\ 1/3 0 0 E 0 0 0 1 1/6 1/6 1/3 1 0 0 0 0 5/6 2/3) \° 0 1 V are primitive? Solution. Because A2 = /l/7 6/49 \ 3 16/7 43/49/' '3/8 1/4 l/4> 1/4 3/8 1/4 ,3/8 3/8 1/2, We will investigate further some special types of linear mappings on vector spaces and also a general case where the structure is described using the so-called Jordan theorem. 3.23. Unitary spaces and mappings. We are already used to the fact that it is efficient to work in the domain of complex numbers even in the case when we are interested only in real objects. Furthermore, in many areas the complex vector spaces are necessary component of the problem. For instance, take the so-called quantum computing, which became a very active area of theoretical computer science, although quantum computers have not been constructed yet (in a usable form). Therefore we extend what we know about orthogonal mappings and mappings from the end of the second chapter with the following definitions: __ | Unitary spaces [__> Definition. Unitary space is a complex vector space V along with the mapping V x V -> C, (u, v) i-> u ■ v, which satisfies for all vectors u, v, w e V and scalars a e C (1) u ■ v — v ■ u (the bar stands for complex conjugation), (2) (au) ■ v — a(u ■ v), (3) (u + v) ■ w — u ■ w + v ■ w, (4) if u / 0, then u ■ u > 0 (notably if the expression is real). Such mapping is called scalar product over V. Real number *Jv ■ v is called size of the vector v and vector is normalised, if its size equals one. Vectors u and i; are called orthogonal if their scalar product is zero, basis composed of mutually orthogonal and normalised vectors is called orthonormal basis V. On first sight this is an extension of the definition of Euclidean vector spaces into the complex domain. We will keep on using the alternative notation (u, v) for scalar product of vectors u and i;. Identically to the real domain, we obtain immediately from the definition the following simple properties of the scalar product for all vectors in V and scalars in C: u ■ u € M u ■ u — 0 if and only if u — 0 u ■ (av) — a(u ■ v) u ■ (v + w) — u ■ v + u ■ w M.0 = 0-« = 0 ' j ij where the last equality holds for all finite linear combinations. It is a simple exercise to prove everything formally, for instance the fist property follows from the definition property (1). Standard example of scalar product over complex vector space (xi x„)T ■ (yi, • x„) -xiyi ■ xn yn. Thanks to conjugation of the coordinates of the second argument this mapping satisfies all required properties. The space C" with this scalar product is called standard unitary space of dimension n. We can denote this scalar product with matrix notation as x • y — -T y ■ x. 158 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS the matrices A and C are primitive, and because '1/2 0 1/3 0 1 1/2 vl/2 0 1/6 the middle column of the matrix B" is always (for n e N) the vector (0, 1, 0)T, that is, the matrix B cannot be primitive. The product /1/3 1/2 0 0 \ M / 0 \ 1/2 1/3 0 0 0 1/6 1/6 1/3 \l/6 0 5/6 2/3/ 0 a w o a/6 + bß \5a/6 + 2b/3j a, b e implies that the matrix D2 has in the right upper corner a zero two-dimensional (square) sub-matrix. 
By repetition of this implication we obtain that the same property is shared by the matrices D3 = D ■ D2, D4 = D ■ D3, ..., D" = D ■ D"-\ thus the matrix D is not primitive. The matrix E is a permutation matrix (in every row and every column there is exactly one non-zero element, 1). It is not difficult to realise that the powers of the permutation matrix are again permutation matrices. The matrix E is thus also not primitive. This can be easily verified by calculating the powers E2, E3, E4. The matrix E4 is a unit matrix. □ Now we show a more robust model. 3.22. Model of spreading of annual plants. We consider the plants that at the beginning of the summer blossom, at the peak of the summer produce seeds and die. Some of the seeds burst into flowers at the end of the autumn, some survive the winter in the ground and burst into flowers at the start of the spring. The flowers that burst out in autumn and survive the winter are usually bigger in the spring and usually produce more seeds. After this, the whole cycle repeats. The year is thus divided into four parts and in each of these parts we distinguish between some "forms" of the flower: Part Stage beginning of the spring small and big seedlings beginning of the summer small, medium and big blossoming flowers peak of the summer seeds autumn seedlings and seeds We denote by x\(t) and by x2(t) the number of small and big seedlings respectively at the start of the spring in the year t and by y\(t), y2(t) and (0 the number of small, medium and big flowers respectively in the summer of that year. From the small seedlings either small or big flowers grow, from the big seedlings either medium or big flowers grow. Each of the seedlings can of course die (weather, be eaten by a cow, etc.) and nothing grows out of it. Denote by btj the probability that the seedling of the j-th size, j = 1,2 grows into a flower of the Completely analogously to the Euclidean spaces and orthogonal mappings, great importance is in those mappings that respect scalar product. ___J Unitary mapping J___,- Linear mapping

W between unitary spaces is called unitary mapping, if for all vectors u,v e V we have u ■ v — cp(u) ■ (p(v). Unitary isomorphism is a bijective unitary mapping. 3.24. Properties of spaces with scalar product. In a brief discussion about Euclidean spaces in the previous chapter we have already derived some simple properties of spaces with scalar product. The proofs for the complex case are very similar. In the following we shall work with real and complex spaces simultaneously and we write K for R or C, in the real case the conjugation is just the identity mapping (as the actual restriction of the conjugation in the complex plane to the real line is). Similarly to the real space we define in general for arbitrary vector subspace U C V in the space with scalar product its orthogonal complement U1- = {v e V; u ■ v = 0 for all u e U], which is clearly also a vector subspace in V. In the following paragraphs we work exclusively with finitely-dimensional unitary or Euclidean spaces. However, many of our results have a natural generalisation for the so-called Hilbert spaces, which are specific infinitely-dimensional spaces with scalar products, to which we return later, albeit briefly. Proposition. For every finitely-dimensional space V of dimension n with scalar product we have: (1) InV there exists an orthonormal basis. (2) Every system of non-zero orthogonal vectors in V is linearly independent and can be extended to an orthogonal basis. (3) For every system of linearly independent vectors (u\,..., uk) there exists an orthonormal basis (y\, ..., vn) such that its vectors respectively generate the same subspaces as the vector Uj, that is, ..., vt) — {u\..., ut), 1 < i < k. (4) If («i, ..., u„) is an orthonormal basis V, then coordinates of every vector u e V are expressed via u — (u ■ u{)u\ + ••• + («• u„)u„. (5) In any orthonormal basis the scalar product has the coordinate form u ■ v — x ■ y — x\yi H-----V xnyn where x and y are columns of coordinates of the vectors u and v in a chosen basis. Notably, every n-dimensional space with scalar product is isomorphic to the standard Euclidean R" or the unitary C". (6) Orthogonal sum of unitary subspaces V\ + ■ ■ ■ + Vk in V is always a direct sum. (7) If A C V is an arbitrary subset, then A1- C V is a vector (and thus also unitary) subspace and (A-1)1- C V is exactly the subspace generated by A. Furthermore we have V — (A) © A\ (8) V is orthogonal product of n one-dimensional unitary sub-spaces. 159 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS z'-th size, i = 1, 2, 3. Then we have 0 < bn < 1, bl2 = 0, 0 < &21 < 1, 0 \u ■ e\ I + • • • + \u ■ ek\ (Bessel inequality). 160 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS not die during the winter respectively. The probabilities dn,d2i clearly must satisfy the inequalities 0 < d\\, 0 < d2\, dn + d2\ = 1, and because a seedling dies in the winter more easily than a seed hidden in the ground, we assume about fu, f22 that (4) For orthonormal system of vectors (e\, ..., ek) the vector u belongs to the subspace e [e\,..., ek) if and only if |2 |m||2 — \u ■ e\ \2 ■ \u-ek\ When denoting 'dn ,d2\ D 0 < /n < f22 < 1. fu 0 w(t) Wi(t) 0 MJ' ~w \w2(t) we obtain with similar ideas as before the equalities w(t) = Dz(t), x(t + 1) = Fw(t). 
Because the matrix multiplication is associative, we can for the numbers in individual stages of flowers in the following year from the previous equalities compose the recurrent formulas: x(t + 1) =Fw(t) = F{Dz(t)) = (FD)z(t) = (FD){Cy(t)) = =(FDC)y(t) = (FDC)(Bx(t)) = (FDCB)x(t), y(t + 1) =Bx(t + 1) = B{Fw(t)) = (BF)w(t) = (BF)(Dz(t)) = =(BFD)z(t) = (BFD)(Cy(t)) = (BFDC)y(t), z(t + 1) =Cy(t + 1) = C(Bx(t + 1)) = (CB)x(t + 1) = (CB)(Fw(t)) =(CBF)w(t) = (CBF)(Dz{t)) = (CBFD)z(t), W(t + 1) =Dz(t + 1) = D(Cy(t + 1)) = (DC)y(t + 1) = =(DC)(Bx(t + 1)) = (DCB)x(t + 1) = (DCB)(Fw(t)) = ==(DCBF)w(t). Using the notation Ax = FDCB, Ay = BFDC, Az = CBFD, Aw = DCBF, we simplify them into the formula x(t+\) = Axx{t), y(H-l) = Ayy(t), z(t+l) = Azz(t), w(t+\) = Aw(t) >From these formulas we can compute the distribution of the population of the flowers in any part of any year, if we know the starting distribution of the population (that is, in the year zero). For instance, let the distribution of the population be known in the summer, that is, z(0) of seeds. The distribution of the population at the beginning of the spring in the t-th year is x(t) = Axx(t - 1) = A2xx(t -2) = ... = A'-lx(l) = A'~l Fw(0) = =A'-1FDz(0). (Parseval equality) (5) For orthonormal system of vectors (e\, ..., ek) and vector u € V is the vector w — (u ■ e\)e\ H-----h (u ■ ek)ek the only vector which minimises the size \\u — v\\ for all v € (ei, ..., ek). Proof. All proofs rely on direct computations: (2): Define the vector w :— u — ^v, that is, w _L v and compute 0 < \\w\\2 = \\u\\2 0 < IMI2IMI2 = (u-v) IM 2 (" ' ») II i|2ii ii2 u-v ,,\ i (u-v)(u-v) ||., M2 II '•J II \\VV 2{u ■ v){u ■ v) + (u ■ v)(u ■ v) >From there it directly follows that ||«||2IMI2 > \u ■ v\2 and the equahty holds if and only if w — 0, that is, whenever u and v are linearly dependent. (1): Again it suffices to compute |2 ||2 ii m2 v\\ — \\u\\ ■ < \\u\ J v ||- + U ■ V + V ■ u IMI2 + 2Re(« • v) \v\\2 +2\u ■ v\ < \\u\\2 2 2||m| iwiir Because these are positive real numbers, it indeed is that || u+v\\ < \\u\\ + \\v\\. Furthermore, with equahty it must be that in all previous inequalities equality also holds, but that is equivalent to the condition that u and v are linearly dependent (using the previous part). (3), (4): Let (e\,..., ek) be an orthonormal system of vectors. We extend to an orthonormal basis (e\,...,e„) (that is always possible thanks to the previous theorem). Then, again using the previous theorem, is for every vector u e V n n k \\u\\2 — ^(w • et)(u ■ ei) — ^2 \u ■ e,|2 > ^\u ■ e,|2 /—1 /—1 /—1 But that is the Bessel inequality. Furthermore, equality can hold if and only if u ■ =0 for all i > k, which proves the Parseval equahty. (5): Choose arbitrary v e {e\,..., ek) and extend the given orthonormal system to the orthonormal basis (e\,..., en). Let («i,... ,un) and (xi,..., xk, 0,..., 0) be coordinates of u and v • under this basis. Then \\u-v\\2 — \u\ - xi |2 H-----h \uk - xk\2 + l^+il2 H-----h |«„|2 and this expression is clearly minimised when choosing the individual vectors to be x\ — u\,..., xk — uk. □ 3.26. Properties of unitary spaces. The properties of orthogonal mapping have a direct analogue in the complex domain. We can easily formulate them and prove gSL^ together: Proposition. Consider the linear mapping (endomorphism) cp : V —>• V on the space with scalar product. Then the following conditions are equivalent. 161 CHAPTER 3. 
LINEAR MODELS AND MATRIX CALCULUS Note that the matrix Az = CBFD is of the type 1 x 1; it is not a matrix but just a scalar. We can denote by A = Az, compute (3.5) fbn 0 A = CBFD = (cxx cX2 cX3) j b2i b22 0 b32) 7n 0 \ (dn 0 f22 Ui = (cnbn + cX2b2X cX2b22 + cX3b32) ^^i) = = bncndnfn + b2ici2dnfn + b22cX2d2Xf22 + b32cX3d2Xf22 and order the previous computation into a suitable form x(t) = (FDCB)'-1 FDz(0) = FD(CBFD)'~2 CBFDz(0) = = FD(CBFDy-lz(■ (2): The mapping

■ (3): Standard scalar product is in K" always given for columns x, y of scalars by the expression x ■ y — xT Ey, where E is the unit matrix. The property (2) thus means that the matrix A of the mapping

■ (5) The claim is expressed via the matrix A of the mapping

■ (6): Because for the determinant we have |ArA| — \E\ — \AAT\ — \A\\A\ — 1, there exists the inverse matrix A-1. But we also have A A1A — A, therefore also A1A — E which is expressed exactly by (6). (6) =>■ (1): In the chosen orthonormal basis we have From it we can see that the increment k is mostly influenced by the number of the seed that overwinter (parameter ^21) and their survivability (parameter ^22)- This revelation is not surprising, the farmers are aware of this fact since the times of neolithic times. The result shows that the mathematical model indeed adequately describes the reality. Other interesting and well-described models of growth can be found in the collection of exercises after this chapter. 3.23. Consider the following Leslie model: farmer breeds sheep. The birth-rate of sheep depends only on their age and is on average 2 lambs per sheep between one and two years of age, 5 lambs per sheep between two and three years of age and 2 lambs per sheep between three and four years of age. Younger sheep do not deliver any lambs. Every year, half of the sheep die, uniformly distributed among all age groups. Every sheep older than four years is sent to the butchery. Farmer would like to sell (living) lambs younger than one year for their skin. What part of the lambs can be sold every year such that the eigenvalues of unitary matrices are always complex units in the complex plane. As with the orthogonal mappings we can easily check that orthogonal complements of invariant subspaces with respect to unitary cp : V -> V are always also invariant. Indeed, if cp(U) c U, u e U and 1; e U1- are arbitrary, then V be a unitary mapping of complex vector spaces. Then V is orthogonal sum of one-dimensional eigen-subspaces. Proof. There surely exist at least one eigenvector v e V. Then the restriction

W we can naturally define its dual mapping ty* : W* -> V* by the relation (3.6) (v, f*(a)) = {f(v),a), where ( , ) denotes evaluation of the form (the second argument) on the vector (first argument), v e V and a e W* are arbitrary. Let us choose bases v over V, w over W and let us write A for the matrix of the mapping i/r under these bases. Then we easily compute in dual bases the matrix of the mapping ty* in the corresponding dual bases over the dual spaces. Indeed, the definition says that if we represented the vectors from W* in the coordinates as rows of scalars, then the mapping ty* is given by the same matrix as if, if we multiply by it the row vectors from the right: (ijf(v), a) — (a 1 , ®n) ■ A — (v, if* (a)). \Vn/ That means that the matrix of the dual mapping ty* is the transpose AT, because a - A — (AT ■ aT)T. 163 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS size of the herd remains the same? In what ratio will then the sheep be distributed among individual age categories? Solution. Matrix of the model (without action of the farmer) is /0 2 5 L = i 0 0 o \ \0 0 2\ 0 0 0 5 0/ The farmer can influence how many sheep younger than one year stay in his herd to the next year, that is, he can influence the element In of the matrix L. Thus we are dealing with the model /0 2 5 2\ a 0 0 0 0 ± 0 0 \0 0 \ 0/ and we are looking for an a such that the matrix has the eigenvalue 1 (we know that it has only one real positive eigenvalue). The characteristic polynomial of this matrix is X - 2aX2 5 1 -X - -, 2 2 and if we require it to have 1 as a root, it must be that a (we substitute X = 1 and set the polynomial equal to zero). The farmer can thus sell i i 10 of lambs that are born that year. The corresponding eigenvector for the eigenvalue 1 of the given matrix is (20, 4,2, 1) and in these ratios the population stabilises. □ 3.24. Consider the Leslie population growth model for the population of rats, divided into three groups according to age: younger than one year, between one year and two years and between two years and three years. Assume that there exists no rat older than three years. The average birth-rate of one rat in individual age categories is the following: in the first group it is zero, in the second and in the third it is 2 rats. The mortality in the second group is zero, that is, the rats that survive their first year die after three years of life. Determine the mortality in the first group, if you know that the population stagnates (the total number of rats does not change). O D. Markov processes 3.25. Sweet-toothed gambler. Gambler bets on a coin - whether a flip results in a head or in a tail. At the start of the game he has three sweets. On every flip, he bets on sweet and if he wins, he gains one additional, if he looses, he looses the sweet. The game ends when he loses all sweets or has at least five sweets. What is the probability that the game does not end after four bets? Let us further assume that we are in a vector space with scalar product. If we choose one fixed vector v e V, substituting vectors for the second argument in the scalar product gives us a mapping V -* V* = Hom(V, K) V (W !->• (u, 111) € The non-degeneracy condition of the scalar product ensures that this mapping is a bijection. Furthermore we know that it indeed is a linear mapping over complex or real scalars, because we have fixed the second argument. 
On first sight it is clear that the vectors of the orthonormal basis are mapped on forms that constitute a dual basis, and every vector can be thus understood using the scalar product as a linear form. In the case of vector spaces with scalar product our identification of a vector space with its dual also takes the dual mapping i/>* to the mapping i/>* : W -> V given by the formula (3.7) where by the same notation of parentheses as in the definition (3.6) we now mean scalar product. This mapping is called adjoint mapping to iff. Equivalently we can understand the relation (3.27) to be the definition of the adjoint mapping i/>*, for instance by substituting all tuples of vectors of an orthonormal basis for the vectors u and i; we directly obtain all values of the matrix of the mapping i/>*. The previous calculation for the dual mapping in coordinates can be now repeated, we just have to keep in mind that in orthonormal bases in unitary spaces the coordinates of the second argument are conjugated: (i[r(v), hi) — (wi, ..., w„) ■ A \Vn/ = A \w„) — (v, \jr* (w)) \Vn/ Therefore we see that if A is the matrix of the mapping i/> in an orthonormal basis, then the matrix of the adjoint mapping i/>* is the transposed and conjugated matrix A - we denote this by A* — AT. The matrix A* is called the adjoint matrix of the matrix A. Note that adjoint matrices are well defined for any rectangular matrix. We should not confuse them with algebraic adjoints, which we have used for square matrices when working with determinants. We can thus summarise that for any linear mapping i/> : V -> W between unitary spaces under orthonormal bases with the matrix A, its dual mapping has in the dual bases the matrix AT. If we also identify using the scalar product the vector spaces with their duals, then the dual mapping corresponds to the adjoint mapping i/>* : W -> V (it is a custom to denote this mapping in the same way as the dual mapping), which has the matrix A*. The distinction between the matrix of the dual mapping and of the adjoint mapping is thus in the additional conjugation, which is of course the corollary of the fact that unifying the unitary space with its dual is not complexly linear mapping (since from the second argument in the scalar product the scalars are brought out conjugated). 164 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS Solution. Before the 7-th round we can describe the state of the player by the random vector Xj = (Mi), PiU), P2U), PsU), P4(j), PsU)), where pt is the prob-abiUty that the player has z sweets. If the player has before the 7-th bet z sweets (z = 2,3, 4), then after the bet he has with 1/2 probability (z — 1) sweets with 1/2 probability (z + 1) sweets. If he attains five sweets or loses them all, the number of sweets does not change. The vector Xj+i is then obtained from the vector Xj by multiplying it with the matrix /l 0.5 0 0 0 0\ 0 0 0.5 0 0 0 0 0.5 0 0.5 0 0 0 0 0.5 0 0.5 0 0 0 0 0.5 0 0 0 0 0 0.5 V x, At the start we have M 0 0 1 0 W after four bets the situation is described by the following vector x, AAX, 16 0 _5_ 16 0 \V that is, the probability that the game ends in the fourth bet or sooner is one half. Note also that the matrix A describing the evolution of the prob-abilist vector X is itself probabilistic, that is, in each column the sum is one. 
But it does not have the property required by the Perron-Frobenius theorem and by a simple computation you can check (or you can see it straight without any computation) that there exist two linearly independent eigenvectors corresponding to the eigenvalue 1 -the case that the player has no sweet, that is x = (1, 0, 0, 0, 0, 0)T, or the case when the player has 5 sweets and the game thus ends with him keeping all the sweets, that is, x = (0, 0, 0, 0, 0, l)T. All other eigenvalues (approximately 0.8,0.3, —0.8, —0.3) are in absolute value strictly smaller than one. Thus the components in the corresponding eigensubspaces with iteration of the process with arbitrary initial distribution vanish and the process approaches the limiting value of the probabilistic vector of the form (a, 0.0, 0.0, 1 — a), where the value a 3.28. Self-adjoint mappings. A special case of linear mapping are those that are identical with their adjoint mappings: if* = if. Such mappings are called self-< adjoint. Equivalently we can say that they are those mappings whose matrix A is under one (and thus under all) orthogonal basis satisfies A = A*. In the case of Euclidean spaces the self-adjoint mappings are those that have symmetric matrix (under some basis). Often they are called symmetric matrices and symmetric mappings. In the complex domain the matrices that satisfy A = A* are called Hermitian matrices. Sometimes they are also called self-adjoint matrices. Note that Hermitian matrices form a real vector subspace in the space of all complex matrices, but it is not a sub-space in the complex domain. Remark. Especially interesting is in this connection the following remark. If we multiply a Hermitian matrix A by the imaginary unit, we obtain the matrix B = i A, which has the property B* = i AT = — B. Such matrices are called anti-Hermitian. As every real matrix is a sum of a symmetric and an anti-symmetric part, 1 1 A = -(A + A1) + -(A-A1), in the complex domain we analogously have A=l-(A + A*) + i^-(A-A*) 2 2i and can thus express every complex matrix in exactly one way as a sum A = B + iC with Hermitian matrices B and C. It is an analogy of the decomposition of a complex number into its real and purely imaginary component and in the literature we often encounter the notation B = re A = -(A + A*), C = im A = — (A - A*). 2 2i In the language of linear mappings this means that every complex linear automorphism can be uniquely expressed using two self-adjoint mappings. 3.29. Spectral decomposition. We consider a self-adjoint mapping if : V -> V with matrix A under some orthonormal basis and we try to proceed similarly as in 2.50. Again, I we first look in general at the invariant subspaces of self-adjoint mappings and on their orthogonal complements. If for any subspace W C V and self-adjoint mapping if : V -> V we have if(W) c W, then also for every 1; e W-1, w e W (if(v), in) = (u, if(w)) = 0. That means that also f(W^) c W^. Consider now the matrix A of a self-adjoint mapping under some orthonormal basis and A ■ x = Xx for some eigenvector x e C". We obtain X(x, x) = {Ax, x) = (x, Ax) = (x, Xx) = X(x, x). Positive real number (x, x) can be cancelled out and thus it must be X = X, that is, eigenvalues are always real. The characteristic polynomial det( A—X E) has that many complex roots as is the dimension of the square matrix A, and all of them are actually real. Thus we have proved important general result: 165 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS depends on the initial number of sweets. 
In our case it is a = 0.4, if there were 4 sweets at the start, it would be a = 0.2 and so on. □ 3.26. Car rental. Company that rents cars every week has two branches - one in Prague and one in Brno. A car rented in Brno can be returned in Prague and vice versa. After some time it has been discovered that in Prague, roughly 80 % of the cars rented in Prague and 90 % of the cars rented in Brno are returned there. How to distribute the cars among the branches such that in both there is at the start of the week always the same number of cars as in the week before? How will the situation look like after some long time, if the cars are distributed at the start in a random way? Solution. Let us denote the components of the vector in question, that is, the initial number of cars in Brno and in Prague by xB and x P respectively. The distribution of the cars between branches is then described by the vector x = (^B^j ■ ^ we consider such a multiple of the vector x such that the sum of its components in 1, then its components give the procentual distribution of the cars. At the end of the week is according to the statement the state is described by the vector I ^ ^ 0 8 / \ jc '0.1 0.2 v0.9 0.8 rental. If at the end of the week in the branches there should be the same number of cars as at the beginning, we are looking for such vector x for which it holds that Ax = x. That means that we are looking for an eigenvector of the matrix A associated with the eigenvalue 1. The characteristic polynomial of the matrix A is (0.1 — A) (0.8 — A) — 0.9.0.2 = (A — 1)(A + 0.1) and 1 is indeed an eigenvalue of the matrix A. The corresponding eigenvector x = I Xb ) satisfies the \xpj -0.9 0.2 \ (xB The matrix A thus describes our (linear) system of car 0. It is thus a multiple of the vector . For determining the procentual distribution we are looking for The suitable distribution of the cars between equation ^Q9 _Q y ^ 0.2^ .0.9, a multiple such that xB + xP = 1. That is satisfied by the vector _L (02\ - (0A^ 11 \0.9J ~ V0.82y Prague and Brno is such that 18% of the cars are in Brno and 82% of the cars are in Prague. If we choose arbitrarily the initial state x = I Xb ), then the state \xpj after n week is described by the vector xn = A"x. Now it is useful to express the initial vector x in the basis of the eigenvectors of A. The eigenvector of the eigenvalue 1 has already been found and similarly we find eigenvectors of the eigenvalue —0.1. That is for instance the vector Proposition. Orthogonal complement of an invariant subspace for self-adjoint mapping is also invariant. Furthermore, all eigenvalues of a Hermitian matrix A are always real. >From the definition itself it is clear that restriction of a self-adjoint mapping on an invariant subspace is again self-adjoint. The previous claim thus ensures that there always exists a basis of V composed of eigenvectors. Indeed, the restriction of \[r on the orthogonal complement of an invariant subspace is again self-adjoint mapping, thus we can add into the basis one eigenvector after another, until we obtain whole decomposition of V. Eigenvector associated with distinct eigenvalues are perpendicular, because from the equations \[f(u) — Xu, ^j/(v) — pv we have that X{u, v) — (if(u), v) — {u, ifr(v)) — p{u, v) — p{u, v). Usually our result is formulated using projections on eigensub-spaces. About the projector P : V -> V we say that it is perpendicular if Im P _L Ker P. Two perpendicular projectors P, Q are mutually perpendicular if Im P _L Im Q. 
Theorem (About the spectral decomposition). For every self-adjoint mapping iff : V -> V on a vector space with scalar product there exists an orthonormal basis composed of eigenvectors. If Xi, ..., Xk are all distinct eigenvalues of iff and P\, . . . , Pk are the corresponding perpendicular and mutually perpendicular projectors on the eigenspaces corresponding to the eigenvalues, then f = XiPi + --- + XkPk. Dimension of images of these projectors is always equal to the algebraic multiplicity of the eigenvalues a,. 3.30. Orthogonal diagonalisation. Mappings for which we can \^ find an orthonormal basis as in the previous theo-\ rem about spectral decomposition are called orthogonally diagonalisable. They are of course exactly such mappings for which we can find an orthonormal basis such that the matrix of the mapping is diagonal under this basis. Let us think for a while how can they look like. For the Euclidean case it is simple: diagonal matrices are first of all symmetric, thus they are exactly the self-adjoint mappings. As a corollary we obtain a result that an orthogonal mapping of an Euclidean space into itself is orthogonally diagonalisable if and only if it is self-adjoint (they are exactly the self-adjoint mappings with eigenvalues ±1). For complex unitary spaces the situation is more complicated. Consider arbitrary linear mapping cp : V -> V of a unitary space and let cp — \[r + in be the (uniquely given) decomposition of cp into its Hermitian and anti-Hermitian part. If cp has under a suitable orthonormal basis a diagonal matrix D, then D — reD + iimD, where the real and the imaginary parts are exactly the matrices \[r and n (follows from the uniqueness of the decomposition). Thus it also holds that \[r o n — n o i\r and cp o cp* — cp* ocp. The mappings cp : V —>• V with the last listed property are called normal. Mutual connections are shown in the following proposition (we follow the notation of this paragraph): Proposition. The following conditions are equivalent: (1) cp is orthogonally diagonalisable, (2) cp* o cp — cp o cp* (that is, cp is a normal mapping), (3) iff o r) = r) o ifr, 166 CHAPTER 3. LINEAR MODELS AND MATRIX CALCULUS The initial vector can thus be expressed as a linear combination /0 2\ /-1N x = a + b I j ). State after n weeks is then (4) for a matrix A — (atj) of a mapping

Proof. The equivalence of (2) and (3): it suffices to do a direct calculation

φφ* = (ψ + iη)(ψ − iη) = ψ² + η² + i(ηψ − ψη),
φ*φ = (ψ − iη)(ψ + iη) = ψ² + η² + i(ψη − ηψ).

Subtraction yields 2i(ηψ − ψη).

(2) ⟹ (1): let u ∈ V be an eigenvector of the normal mapping φ. Then

φ(u)·φ(u) = ⟨φ*φ(u), u⟩ = ⟨φφ*(u), u⟩ = φ*(u)·φ*(u),

thus |φ(u)| = |φ*(u)|. If φ is normal, then (φ − λ·id_V)* = φ* − λ̄·id_V, so φ − λ·id_V is also a normal mapping. From the previous equality it follows that if φ(u) = λu, then φ*(u) = λ̄u. That means that the orthogonal complement of an eigenvector is invariant under both φ and φ*, and we can build an orthonormal basis of eigenvectors by adding one eigenvector at a time, just as in the self-adjoint case.

(4): the expression Σ_{i,j} |a_ij|² is the trace of the matrix AA*, which is the matrix of the mapping φ ∘ φ*, and the trace does not depend on the choice of orthonormal basis. The claim then follows from the Schur theorem on orthogonal triangulation, which we prove later in 3.37. That theorem says that for every linear mapping φ : V → V there exists an orthonormal basis under which φ has an upper triangular matrix with the eigenvalues on the diagonal; the equality Σ_{i,j} |a_ij|² = Σ_i |λ_i|² holds exactly when the off-diagonal entries of this triangular matrix vanish, that is, exactly when φ is orthogonally diagonalisable.
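The relationship between normality and unitary diagonalisability is easy to observe numerically. The sketch below (our own Python/scipy illustration, not part of the original text) computes the complex Schur form T of a normal matrix and checks that T is in fact diagonal:

    import numpy as np
    from scipy.linalg import schur

    # A normal (here even orthogonal) matrix: a rotation of the plane.
    A = np.array([[0.0, -1.0],
                  [1.0,  0.0]])
    assert np.allclose(A @ A.conj().T, A.conj().T @ A)   # normality

    # Complex Schur form: A = Q T Q* with Q unitary, T upper triangular.
    T, Q = schur(A, output='complex')

    # For a normal matrix the triangular factor is diagonal,
    # i.e. A is unitarily (orthogonally) diagonalisable.
    print(np.allclose(T, np.diag(np.diag(T))))   # True
    print(np.diag(T))                            # eigenvalues +i, -i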

From the form of the powers Aⁿ we easily determine that the game ends with probability a + ab + ab² ≈ 0.885 as a loss, and with probability roughly 0.115 as a win of €80, where

A^∞ = ( 1  a+ab+ab²  a+ab  a  0 )
      ( 0  0         0     0  0 )
      ( 0  0         0     0  0 )
      ( 0  0         0     0  0 )
      ( 0  b³        b²    b  1 ).

(We multiply the initial vector (0, 1, 0, 0, 0)ᵀ by the matrix A^∞ and obtain the vector (a + ab + ab², 0, 0, 0, b³)ᵀ.) □

3.30. Consider the situation from the previous case and assume that the probabilities of a win and of a loss are both 1/2. Denote by A the matrix of the process. Without using any computational software, determine A¹⁰⁰. ○

Proof. For a unitary mapping φ we have φφ* = id_V = φ*φ, and thus φφ* = (ψ + iη)(ψ − iη) = ψ² + η² = id_V, with ψη − ηψ = 0. On the other hand, for a normal mapping the last calculation shows that the converse implication holds too. □

3.31. Non-negative mappings and roots. Non-negative real numbers are exactly those which can be written as squares. A generalisation of this behaviour to matrices and mappings can be seen in products of matrices B = A*·A (that is, compositions of mappings ψ* ∘ ψ):

⟨B·x, x⟩ = ⟨A*·A·x, x⟩ = ⟨A·x, A·x⟩ ≥ 0

for all vectors x. Furthermore, we clearly have B* = (A*·A)* = A*·A = B. Hermitian matrices B with this property are called positive semidefinite, and if the zero value is attained only for x = 0, they are called positive definite. Analogously, we speak of positive definite and positive semidefinite mappings ψ : V → V.

For every positive semidefinite mapping ψ : V → V we can find its root, that is, a mapping η such that η ∘ η = ψ. It is easiest to see this in an orthonormal basis in which ψ has a diagonal matrix. Such a basis exists (as we have already proven), and the matrix A of ψ has on the diagonal only non-negative real numbers, the eigenvalues of ψ; if some of them were negative, the non-negativity condition would already fail for one of the basis vectors. It then suffices to define the mapping η by the matrix B with the square roots of the corresponding eigenvalues on the diagonal.

3.32. Spectra and nilpotent mappings. At the end of this section we return to the question of the behaviour of linear mappings in full generality. We shall still work with real or complex vector spaces. Let us recall that the spectrum of a linear mapping f : V → V is the sequence of roots of the characteristic polynomial of f, counting multiplicities. The algebraic multiplicity of an eigenvalue is its multiplicity as a root of the characteristic polynomial; the geometric multiplicity of an eigenvalue is the dimension of the corresponding subspace of eigenvectors.

A linear mapping f : V → V is called nilpotent if there exists an integer k ≥ 1 such that the iterated mapping f^k is identically zero. The smallest k with this property is called the degree of nilpotency of the mapping f. The mapping f : V → V is called cyclic if there exists a basis (u₁, …, u_n) of the space V such that f(u₁) = 0 and f(u_i) = u_{i−1} for all i = 2, …, n. In other words, the matrix of f in this basis is of the form

( 0 1 0 … 0 )
( 0 0 1 … 0 )
( ⋮       ⋱ )
( 0 0 0 … 0 ).

If f(v) = a·v, then for every natural k we have f^k(v) = a^k·v. Notably, the spectrum of a nilpotent mapping can contain only the zero scalar (and that is always present). Directly from the definition it follows that every cyclic mapping is nilpotent, and moreover its degree of nilpotency equals the dimension of the space V.
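The square-root construction from paragraph 3.31 is easy to carry out numerically; a small sketch of ours (assuming Python with numpy) for a positive semidefinite B = AᵀA:

    import numpy as np

    # A positive semidefinite matrix B = A^T A.
    A = np.array([[1.0, 2.0],
                  [0.0, 3.0]])
    B = A.T @ A

    # Diagonalise B orthogonally (B is symmetric) and take square roots
    # of its non-negative eigenvalues on the diagonal.
    w, Q = np.linalg.eigh(B)
    root = Q @ np.diag(np.sqrt(w)) @ Q.T

    print(np.allclose(root @ root, B))   # True: root is a square root of B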
3.31. Absent-minded professor. Consider the following situation: an absent-minded professor carries an umbrella with him, but with probability 1/2 he forgets it at the place he is leaving. In the morning he leaves for work; from work he goes to a restaurant for lunch and back, and after finishing work he leaves for home. Assume for simplicity that he goes nowhere else and that in the restaurant the umbrella stays at his favourite spot, where he can pick it up the next day (unless he forgets it there again). Describe this situation as a Markov process and write down its matrix. What is the probability that after many days the umbrella is located in the restaurant in the morning? (It is useful to take one day, from morning to morning, as the time unit.)

Solution. With the states ordered as home, work, restaurant, the matrix is

T = ( 11/16  3/8  1/4 )
    (  3/16  3/8  1/4 )
    (  1/8   1/4  1/2 ).

We compute, for instance, the element t₁₁, that is, the probability that the umbrella starts its day at home and stays there (so it will be there again the next morning). There are three disjoint routes for the umbrella:

H: the professor forgets it at home in the morning, p₁ = 1/2;
HWH: he takes it to work, forgets to take it to lunch, and takes it home in the evening, p₂ = 1/2 · 1/2 · 1/2 = 1/8;
HWRWH: he keeps the umbrella with him all the time and forgets it nowhere, p₃ = 1/2 · 1/2 · 1/2 · 1/2 = 1/16.

In total t₁₁ = p₁ + p₂ + p₃ = 11/16. The eigenvector of this matrix corresponding to the dominant eigenvalue 1 is (2, 1, 1), and the desired probability is therefore 1/(2 + 1 + 1) = 1/4. □

The operator of differentiation on polynomials, D(x^k) = k·x^{k−1}, is an example of a cyclic mapping on the spaces K_n[x] of all polynomials of degree at most n over the scalars K.

Surprisingly, this also holds the other way round: every nilpotent mapping is a direct sum of cyclic mappings. The proof of this claim takes a lot of work, so we first formulate the results we are aiming at and only then gradually begin the technical work. In the resulting theorem about the Jordan decomposition there appear vector (sub)spaces and linear mappings on them with a single eigenvalue λ and matrix

J = ( λ 1 0 … 0 )
    ( 0 λ 1 … 0 )
    ( ⋮       ⋱ 1 )
    ( 0 0 0 … λ ).

These matrices (and the corresponding invariant subspaces) are called Jordan blocks.

Theorem (Jordan theorem on the canonical form). Let V be a vector space of dimension n and f : V → V a linear mapping with n eigenvalues, counting algebraic multiplicities. Then there exists a unique decomposition of the space V into a direct sum of subspaces

V = V₁ ⊕ ⋯ ⊕ V_k

such that f(V_i) ⊆ V_i, the restriction of f to each V_i has a single eigenvalue λ_i, and the restriction f − λ_i·id on V_i is either cyclic or the zero mapping.

The theorem thus says that in a suitable basis every linear mapping has a block-diagonal form with Jordan blocks along the diagonal. The total number of ones above the diagonal in this form equals the difference between the total algebraic and the total geometric multiplicity of the eigenvalues.
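To double-check the umbrella example numerically, a short sketch (again our own Python/numpy illustration) iterates the matrix T and compares the limit with the eigenvector (2, 1, 1)/4:

    import numpy as np

    # Umbrella process: the states are home, work, restaurant.
    T = np.array([[11/16, 3/8, 1/4],
                  [ 3/16, 3/8, 1/4],
                  [ 1/8,  1/4, 1/2]])

    x = np.array([1.0, 0.0, 0.0])   # the umbrella starts at home
    for _ in range(50):
        x = T @ x

    print(x)   # approx. [0.5 0.25 0.25], i.e. (2, 1, 1)/4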
3.33. Notes. Note that we have already proven the Jordan theorem in the cases where all eigenvalues are distinct, or where the geometric and algebraic multiplicities of all eigenvalues coincide. In particular, we have proven it for unitary, normal and self-adjoint mappings.

Another useful observation is that for every linear mapping f, every eigenvalue of f has a uniquely determined invariant subspace that corresponds to its Jordan blocks in the matrix.

We should also mention one very useful corollary of the Jordan theorem (which we have already used in the discussion of the behaviour of Markov chains). Assume that the eigenvalues of our mapping f are all smaller than one in absolute value. Then repeated application of the linear mapping forces all coordinates of f^k(v) below any bound, for every vector v ∈ V. Indeed, assume for simplicity that on the whole of V the mapping f has a single eigenvalue λ and that f − λ·id is cyclic (that is, we consider a single Jordan block), and let v₁, …, v_l be the corresponding basis. The condition from the theorem then gives f(v₂) = λv₂ + v₁, f²(v₂) = λ²v₂ + 2λv₁, and similarly for the other v_i and higher powers. In any case, iterating produces higher and higher powers of λ at all non-zero components, and the smallest of these powers can lag behind the number of iterations by at most the degree of nilpotency. This proves the claim (and the same argument shows that for a mapping with all eigenvalues of absolute value strictly greater than one, the iterations f^k(v) lead to unbounded growth of all coordinates).

3.32. Algorithm for determining the importance of pages. Internet search engines can find on the web (almost) all pages containing a given word or phrase. But how should the pages be sorted, so that the user receives a list ordered according to their relevance? One possibility is the following algorithm: the collection of all found pages is considered as a system, and each found page is one of its states. We describe a random walk on these pages as a Markov process, with the transition probabilities given by the hyperlinks: each link, say from page A to page B, determines the probability 1/(total number of links leading from page A) with which the process moves from page A to page B. If no links lead from some page, we treat it as a page with links to every other page. This gives a stochastic matrix M (the element m_ij is the probability with which we move from the j-th page to the i-th page). If one randomly clicks links on the found pages (and from a linkless page one moves to a randomly chosen page), then the probability of being located at the i-th page at a given moment (sufficiently distant from the start) corresponds to the i-th component of the unit eigenvector of M for the eigenvalue 1. Looking at the sizes of these probabilities, we define the importance of the individual pages.

The algorithm can be modified by assuming that the user stops clicking from link to link after some time and starts again at a random page: say, with probability d he chooses a new page at random and with probability (1 − d) he keeps clicking. In this situation the probability of a transition between any two pages S_j and S_i is non-zero: it is d/n + (1 − d)/(number of links leading from S_j) if a link from S_j to S_i exists, and d/n otherwise (if no links lead from S_j at all, it is 1/n). By the Perron-Frobenius theorem the eigenvalue 1 has multiplicity one and is dominant, so the corresponding eigenvector is unique (with the transition probabilities chosen only as in the previous paragraph, this would not have to be the case). As an illustration, consider pages A, B, C and D. Links lead from A to B and to C, from B to C, and from C to A; from D no link leads anywhere. Let the probability that the user chooses a random new page be 1/5.
Then the matrix M looks as follows:

M = ( 1/20  1/20  17/20  1/4 )
    ( 9/20  1/20   1/20  1/4 )
    ( 9/20  17/20  1/20  1/4 )
    ( 1/20  1/20   1/20  1/4 ).

The eigenvector corresponding to the eigenvalue 1 is (305/53, 175/53, 315/53, 1)ᵀ; the importance of the pages is thus given by the order of the sizes of the corresponding components, that is, C > A > B > D.

3.33. Based on the temperature at 14:00, days are classified as warm, average and cold. All-year statistics show that a warm day is followed by a warm day in 50 % of the cases and by an average day in 30 % of the cases; an average day is followed by an average day in 40 % of the cases and by a cold day in 30 % of the cases; and a cold day is followed by a cold day in 50 % of the cases and by an average day in 30 % of the cases. Without any further information, derive how many warm, cold and average days per year can be expected.

The rest of this part of the third chapter is devoted to the proof of the Jordan theorem and some necessary lemmata. It is considerably more difficult than anything so far, and the reader may skip it until the beginning of the fifth part of this chapter.

3.34. Root spaces. We have already seen in examples that eigensubspaces describe the geometric properties of only some linear mappings. We therefore now introduce a more subtle tool, the so-called root subspaces.

Definition. A non-zero vector u ∈ V is called a root vector of a linear mapping φ : V → V if there exist a ∈ K and an integer k > 0 such that (φ − a·id_V)^k(u) = 0, that is, the k-th iteration of the mapping φ − a·id_V maps u to zero. The set of all root vectors corresponding to a fixed scalar λ, together with the zero vector, is called the root subspace associated with λ ∈ K, and is denoted R_λ.

If u is a root vector and the k from the definition is chosen smallest possible, then (φ − a·id_V)^{k−1}(u) is an eigenvector with the eigenvalue a. We thus have R_λ = {0} for all scalars λ which are not in the spectrum of the mapping φ.
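A reader can replay the page-ranking illustration above in a few lines of code; the following sketch (our Python/numpy illustration) builds M from the link structure and the damping factor d = 1/5 and finds the dominant eigenvector by simple iteration:

    import numpy as np

    links = {0: [1, 2], 1: [2], 2: [0], 3: []}   # A=0, B=1, C=2, D=3
    n, d = 4, 1/5

    # Column-stochastic transition matrix with damping.
    M = np.full((n, n), d / n)
    for j, outs in links.items():
        if outs:
            for i in outs:
                M[i, j] += (1 - d) / len(outs)
        else:                      # a linkless page leads everywhere
            M[:, j] = 1 / n

    x = np.full(n, 1 / n)
    for _ in range(100):           # power iteration towards eigenvalue 1
        x = M @ x

    print(x / x[3])                # approx. (305/53, 175/53, 315/53, 1)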

Proposition. For a linear mapping φ : V → V we have:
(1) for every λ ∈ K, R_λ ⊆ V is a vector subspace,
(2) for every λ, μ ∈ K, the subspace R_λ is invariant with respect to the linear mapping φ − μ·id_V; in particular, R_λ is invariant with respect to φ,
(3) R_λ ∩ R_μ = {0} whenever λ ≠ μ,
(4) the restriction of φ − λ·id_V to R_λ is nilpotent.

Proof. The first two claims follow directly from the definition. For (3): if u ∈ R_λ ∩ R_μ were non-zero, then by (2) we could apply φ − μ·id_V repeatedly and assume directly that φ(u) = μu. From there we have 0 = (φ − λ·id_V)^k(u) = (μ − λ)^k·u, and thus u = 0 for λ ≠ μ, a contradiction. (4) Choose a basis e₁, …, e_p of the subspace R_λ. Because according to the definition there exist numbers k_i such that (φ − λ·id_V)^{k_i}(e_i) = 0, the whole mapping (φ − λ·id_V)|_{R_λ} is nilpotent. □

3.35. Factor subspaces. Our next aim is to show that the dimension of a root space always equals the algebraic multiplicity of the corresponding eigenvalue. Let us first introduce some useful technical tools.

Definition. Let U ⊆ V be a vector subspace. On the set of all vectors in V we define an equivalence relation as follows: v₁ ∼ v₂ if and only if v₁ − v₂ ∈ U. The axioms of an equivalence are easy to check. The set V/U of the classes of this equivalence, together with the operations defined via representatives, that is, [v] + [w] = [v + w] and a·[u] = [a·u], forms a vector space which we call the factor vector space of the space V by the subspace U. Check the correctness of the definition of the operations and that all the axioms of a vector space hold!

Classes (vectors) in the factor space V/U will often be denoted as a formal sum of one representative with all vectors of the subspace U, for instance u + U ∈ V/U, u ∈ V. The zero vector in V/U is exactly the class 0 + U; that is, a vector u ∈ V represents the zero element of V/U if and only if u ∈ U.

Solution. Each day attains exactly one of the states "warm day", "average day", "cold day". If the vector x_n has as its components the probabilities that a given (n-th) day is warm, average and cold (respectively), then the components of the vector

x_{n+1} = ( 0.5 0.3 0.2 ; 0.3 0.4 0.3 ; 0.2 0.3 0.5 ) · x_n

give the probabilities that the next day is warm, average and cold, respectively. To verify this, it suffices to substitute x_n = (1, 0, 0)ᵀ, (0, 1, 0)ᵀ and (0, 0, 1)ᵀ; for the third choice, for instance, we must obtain the probabilities that a cold day is followed by a warm, average and cold day, respectively. We see that the problem is a Markov chain with the stochastic transition matrix

T = ( 0.5 0.3 0.2 ; 0.3 0.4 0.3 ; 0.2 0.3 0.5 ).

Because all the entries of this matrix are positive, there exists a probabilistic vector x_∞ = (x¹_∞, x²_∞, x³_∞)ᵀ to which the vector x_n converges as n grows, independently of x_n for small n. Furthermore, thanks to the corollary of the Perron-Frobenius theorem, x_∞ is an eigenvector of the matrix T for the eigenvalue 1. Thus it must hold that

x¹_∞ = 0.5·x¹_∞ + 0.3·x²_∞ + 0.2·x³_∞,
x²_∞ = 0.3·x¹_∞ + 0.4·x²_∞ + 0.3·x³_∞,
x³_∞ = 0.2·x¹_∞ + 0.3·x²_∞ + 0.5·x³_∞,
1 = x¹_∞ + x²_∞ + x³_∞,

where the last condition means that the vector x_∞ is probabilistic. It is easy to compute that this system has the single solution x¹_∞ = x²_∞ = x³_∞ = 1/3. We can thus expect roughly the same number of warm, average and cold days.

Let us emphasise that the sum of the numbers in each column of the matrix T had to equal 1 (otherwise it would not be a Markov process). Because Tᵀ = T (the matrix is symmetric), the sum of the numbers in each row is also equal to 1. A matrix with non-negative entries in which the numbers in each column, and analogously in each row, sum to one is called doubly stochastic.
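For a quick numerical confirmation that a symmetric (hence doubly stochastic) transition matrix equalises the states, one may run a sketch like this (ours, assuming Python with numpy):

    import numpy as np

    T = np.array([[0.5, 0.3, 0.2],
                  [0.3, 0.4, 0.3],
                  [0.2, 0.3, 0.5]])

    x = np.array([1.0, 0.0, 0.0])   # start from a surely warm day
    for _ in range(40):
        x = T @ x

    print(x)   # approx. [1/3 1/3 1/3]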
For simple examples, think about V/{0} ≅ V, V/V ≅ {0}, and about the factor space of the plane ℝ² by any one-dimensional subspace (here, every one-dimensional subspace U ⊆ ℝ² is a line passing through the origin); the classes of equivalence are the lines parallel to this line.

Proposition. Let U ⊆ V be a vector subspace and let (u₁, …, u_n) be a basis of V such that (u₁, …, u_k) is a basis of U. Then dim V/U = n − k, and the vectors u_{k+1} + U, …, u_n + U form a basis of V/U.

Proof. Because V = ⟨u₁, …, u_n⟩, also V/U = ⟨u₁ + U, …, u_n + U⟩. But the first k generators are zero, so V/U = ⟨u_{k+1} + U, …, u_n + U⟩. Assume that

a_{k+1}·(u_{k+1} + U) + ⋯ + a_n·(u_n + U) = (a_{k+1}·u_{k+1} + ⋯ + a_n·u_n) + U = 0 ∈ V/U.

This is equivalent to the linear combination of the vectors u_{k+1}, …, u_n belonging to the subspace U. Because U is generated by the remaining vectors, the combination is necessarily zero, that is, all the coefficients a_i are zero. □

3.36. Induced mappings on factor spaces. Assume that U ⊆ V is an invariant subspace with respect to a linear mapping φ : V → V, and choose a basis u₁, …, u_n of the space V such that the first k vectors of this basis form a basis of U. In this basis φ has a block upper triangular matrix of the form ( B C ; 0 D ), where B is the matrix of the restriction φ|_U; moreover, φ induces on the factor space a well-defined linear mapping φ_{V/U} : V/U → V/U, φ_{V/U}(v + U) = φ(v) + U, whose matrix in the basis u_{k+1} + U, …, u_n + U is exactly the block D.

Corollary. Let φ : V → V be a linear mapping whose spectrum contains n elements (that is, all roots of the characteristic polynomial lie in K and we count their multiplicities). Then there exists a sequence of invariant subspaces {0} = V₀ ⊂ V₁ ⊂ ⋯ ⊂ V_n = V with dimensions dim V_i = i. In a basis u₁, …, u_n of the space V such that V_i = ⟨u₁, …, u_i⟩, the mapping φ has an upper triangular matrix with the eigenvalues on the diagonal.

An important property of every doubly stochastic primitive matrix (in any dimension, that is, for any number of states) is that the corresponding vector x_∞ has all components identical: after sufficiently many iterations, all the states in the corresponding Markov chain are attained with the same frequency. □

3.34. John goes running every evening. He has three tracks: short, medium and long. Whenever he chooses the short track, the next day he feels bad about it and chooses uniformly between the long and the medium one. Whenever he chooses the long track, the next day he chooses arbitrarily among all three. Whenever he chooses the medium track, the next day he feels good about it and again chooses uniformly between the medium and the long one. Assume he has been running like this for a very long time. How often does he choose the short track and how often the long one? What is the probability that he chooses the long track given that he picked it a week earlier?

Solution. Clearly this is a Markov process with three possible states: the choices of the short, the medium and the long track. This ordering of the states gives the stochastic transition matrix

T = (  0    0   1/3 )
    ( 1/2  1/2  1/3 )
    ( 1/2  1/2  1/3 ).

It suffices to realise that, for instance, the second column corresponds to the choice of the medium track on the previous day, which means that with probability 1/2 the medium track is chosen again (the second row) and with probability 1/2 the long track is chosen (the third row). Because

T² = ( 1/6   1/6   1/9 )
     ( 5/12  5/12  4/9 )
     ( 5/12  5/12  4/9 )

has all entries positive, we can use the corollary of the Perron-Frobenius theorem for Markov chains. It is not difficult to compute the eigenvector corresponding to the eigenvalue 1 which is a probabilistic vector, namely (1/7, 3/7, 3/7)ᵀ. The values 1/7, 3/7, 3/7 then give, respectively, the probabilities that on a randomly chosen day he chooses the short, the medium and the long track.

Let John choose the long track on a certain day (that is, at time n ∈ ℕ). This corresponds to the probabilistic vector x_n = (0, 0, 1)ᵀ,

and after seven days we have

x_{n+7} = T⁷ · (0, 0, 1)ᵀ.

The evaluation gives the components of x_{n+7} as 0.142861225…, 0.428569387…, 0.428569387…. The probability that he chooses the long track, under the condition that he chose it seven days earlier, is thus roughly 0.428569 ≈ 3/7 = 0.428571…. □

3.35. A production line is not reliable: individual products differ in quality in a non-negligible way, and a certain worker moreover tries to improve the quality of the products and intervenes in the process. The products are divided into classes I, II, III according to their quality. A report found that a product of class I is followed by a product of the same quality in 80 % of the cases and by one of quality II in 10 % of the cases; a product of class II is followed by a product of class II in 60 % of the cases and by one of quality I in 20 % of the cases; and a product of quality III is followed by one of quality III in 50 % of the cases and by one of quality II in 25 % of the cases. Compute the probability that the 18-th product is of quality I, given that the 16-th product is of quality III.

Solution. Let us first solve the problem without using a Markov chain. Given that the 16-th product is of class III, the event in question is covered by the cases

• the 17-th product is of class I and the 18-th product is of class I;
• the 17-th product is of class II and the 18-th product is of class I;
• the 17-th product is of class III and the 18-th product is of class I,

with respective probabilities

• 0.25 · 0.8 = 0.2;
• 0.25 · 0.2 = 0.05;
• 0.5 · 0.25 = 0.125.

Thus we easily obtain the result 0.375 = 0.2 + 0.05 + 0.125.

Now let us view the problem as a Markov process. From the statement, to the ordering of the possible states "product of class I", "product of class II", "product of class III" there corresponds the transition matrix

T = ( 0.8  0.2  0.25 )
    ( 0.1  0.6  0.25 )
    ( 0.1  0.2  0.5  ),

and the desired probability is the entry of T² in the first row and third column, which again gives 0.375. □

3.37. Theorem (Schur's theorem on orthogonal triangulation). Let φ : V → V be an arbitrary linear mapping on a (real or complex) unitary space with m = dim V eigenvalues, counting multiplicities. Then there exists an orthonormal basis of the space V such that the matrix of φ in this basis is upper triangular with the eigenvalues λ₁, …, λ_m on the diagonal.
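Returning to the production-line exercise above, the two computations agree, as a short check (our Python/numpy sketch) confirms:

    import numpy as np

    # Columns: quality of the current product (I, II, III);
    # rows: quality of the next product.
    T = np.array([[0.8, 0.2, 0.25],
                  [0.1, 0.6, 0.25],
                  [0.1, 0.2, 0.5 ]])

    # Probability that the 18-th product is of class I, given that
    # the 16-th product is of class III: the entry (I, III) of T^2.
    print((T @ T)[0, 2])   # 0.375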

3.38. Theorem. The dimension of the root space R_λ of a linear mapping φ : V → V equals the algebraic multiplicity of the eigenvalue λ.

In the proof, the key step is to show that λ is not an eigenvalue of the mapping ψ : V/R_λ → V/R_λ induced by φ on the factor space, so that the whole algebraic multiplicity of λ is concentrated in R_λ. Assume, for contradiction, that λ is an eigenvalue of ψ. Let (v + R_λ) ∈ V/R_λ be the corresponding eigenvector, that is, ψ(v + R_λ) = λ·(v + R_λ), which according to the definition means v ∉ R_λ and φ(v) = λ·v + w for a suitable w ∈ R_λ. We thus have w = (φ − λ·id_V)(v), and (φ − λ·id_V)^j(w) = 0 for a suitable j; hence (φ − λ·id_V)^{j+1}(v) = 0 and v ∈ R_λ, a contradiction.

We have thus derived that for a mapping φ : V → V whose whole spectrum is in K,

V = R_{λ₁} ⊕ ⋯ ⊕ R_{λ_n}

is the direct sum of the root subspaces. If we choose suitable bases for these subspaces, then φ has a block-diagonal matrix whose blocks correspond to the individual root spaces, and on each of them φ − λ_i·id is nilpotent.

3.39. Theorem. Let φ : V →

V be a nilpotent linear mapping. Then there exists a decomposition of V into a direct sum of subspaces V = V₁ ⊕ ⋯ ⊕ V_k such that the restriction of φ to any of them is cyclic.

Proof. Verifying this is quite straightforward and consists of constructing a basis of the space V on which the action of the mapping φ directly exhibits the decomposition into cyclic mappings; taking care of the details, however, will take some time.

Let k be the degree of nilpotency of the mapping φ and set P_i = Im(φ^i), i = 0, …, k, that is,

{0} = P_k ⊂ P_{k−1} ⊆ ⋯ ⊆ P₁ ⊆ P₀ = V.

Choose an arbitrary basis e^{k−1}_1, …, e^{k−1}_{p_{k−1}} of the space P_{k−1}, where p_{k−1} > 0 is the dimension of P_{k−1}. From the definition it follows that P_{k−1} ⊆ Ker φ, that is, always φ(e^{k−1}_j) = 0.

Assume that P_{k−1} ≠ V. Because P_{k−1} = φ(P_{k−2}), there necessarily exist vectors e^{k−2}_j ∈ P_{k−2}, j = 1, …, p_{k−1}, such that φ(e^{k−2}_j) = e^{k−1}_j. Assume

a₁e^{k−1}_1 + ⋯ + a_{p_{k−1}}e^{k−1}_{p_{k−1}} + b₁e^{k−2}_1 + ⋯ + b_{p_{k−1}}e^{k−2}_{p_{k−1}} = 0.

Applying the mapping φ to this linear combination yields b₁e^{k−1}_1 + ⋯ + b_{p_{k−1}}e^{k−1}_{p_{k−1}} = 0, therefore all b_j = 0. But then also all a_j = 0, because these are combinations of basis vectors. We have thus verified the linear independence of all 2p_{k−1} chosen vectors. We extend them to a basis

(1)  e^{k−1}_1, …, e^{k−1}_{p_{k−1}}, e^{k−2}_1, …, e^{k−2}_{p_{k−2}}

of the space P_{k−2}. Furthermore, the images of the added basis vectors lie in P_{k−1}, so they must be linear combinations of the basis elements e^{k−1}_1, …, e^{k−1}_{p_{k−1}} = φ(e^{k−2}_1), …, φ(e^{k−2}_{p_{k−1}}). We can thus exchange each of the chosen vectors e^{k−2}_j, j = p_{k−1}+1, …, p_{k−2}, for its difference with the corresponding combination of the vectors e^{k−2}_1, …, e^{k−2}_{p_{k−1}}; this ensures that the vectors added to the basis of P_{k−2} belong to the kernel of the mapping φ. Let us thus assume this about the chosen basis (1) right away.

Assume further that we have already constructed a basis of the subspace P_{k−l} which can be arranged into the scheme

e^{k−1}_1 … e^{k−1}_{p_{k−1}}
e^{k−2}_1 … e^{k−2}_{p_{k−1}},  e^{k−2}_{p_{k−1}+1} … e^{k−2}_{p_{k−2}}
⋮
e^{k−l}_1 … e^{k−l}_{p_{k−1}},  e^{k−l}_{p_{k−1}+1} … e^{k−l}_{p_{k−2}},  …,  e^{k−l}_{p_{k−l+1}+1} … e^{k−l}_{p_{k−l}},

where the value of the mapping φ on any basis vector is located directly above it, or equals zero if there is nothing above that vector. If P_{k−l} ≠ V, then again there must exist vectors e^{k−l−1}_j, j = 1, …, p_{k−l}, which map onto e^{k−l}_j, and we extend them to a basis of P_{k−l−1}, say by vectors e^{k−l−1}_j, j = p_{k−l}+1, …, p_{k−l−1}. By gradually subtracting the iterated values of φ on these vectors we achieve, as before, that the vectors added to the basis of P_{k−l−1} lie in the kernel of φ.

Also, for n ∈ ℕ we can directly determine the powers Tⁿ: with the states ordered 1, …, 6, the entry of Tⁿ in row i and column j equals (i/6)ⁿ − ((i−1)/6)ⁿ for i > j, equals (j/6)ⁿ for i = j, and is zero for i < j. The values in the first column correspond, one after another, to the probabilities that n times in a row the result is 1; that n times in a row the result is 1 or 2 and there was at least one 2 (therefore we subtract the probability given in the first row); that n times in a row the result is 1, 2 or 3 and at least once the result was 3; and so on, up to the last row, which contains the probability that at least once during the n throws the result is 6 (this is easily derived from the probability of the complementary event). Similarly, in the fourth column stand the non-zero probabilities of the events "n times in a row the result is 1, 2, 3 or 4", "n times in a row the result is 1, 2, 3, 4 or 5 and at least once it is 5", and "at least once during the n attempts the result is 6".
The interpretation of the matrix T as the transition matrix of a Markov process thus allows a quick expression of the powers Tⁿ, n ∈ ℕ. □

3.37. In this problem we deal with a certain property of an animal species which is determined, independently of sex, by a single gene: a pair of alleles. Every individual gains one allele from each of its parents, randomly and independently. The forms of the gene are given by the two alleles a, A; they form three possible states aa, aA = Aa and AA of the property.

(a) Assume that each individual of a certain population mates only with an individual of another population, in which only the property given by the pair aA occurs. Exactly one of their offspring (a randomly chosen one) is left in place, and it again mates only with an individual of that specific population, and so on. Determine the probabilities of the appearance of aa, aA, AA in the considered population after a long time.

(b) Solve the problem from (a) in the case that the other population consists only of individuals with the pair AA.

(c) Two randomly chosen individuals of opposite sex are bred. From their progeny, two individuals of opposite sex are again randomly chosen and bred. If this is carried on for a long time, compute the probability that both bred individuals have the pair of alleles AA, or both aa (the breeding process then ends).

After k such steps we obtain a basis of the whole space V with the properties of the scheme above, and the columns of the scheme span the subspaces V₁, …, V_k on which the restriction of φ is cyclic. □

The degree of nilpotency of each restriction φ − λ·id_V is at most n. This implies that if the matrix J of the mapping φ is in the canonical Jordan form, then the numbers d_l(λ) of Jordan blocks of size l with eigenvalue λ are determined by the ranks r_l(λ) of the powers (J − λ·E)^l. From here we calculate

n − r₁(λ) = d₁(λ) + d₂(λ) + ⋯ + d_k(λ),
r_{l−1}(λ) − r_l(λ) = d_l(λ) + d_{l+1}(λ) + ⋯,
d_l(λ) = r_{l−1}(λ) − 2·r_l(λ) + r_{l+1}(λ)

(where the last row arises by combining the previous one for the values l − 1, l and l + 1).

3.41. Note. The proof of the theorem about the existence of the Jordan canonical form was constructive, but it does not give us a perfect algorithmic approach for the construction. We now summarise the approach already derived for the explicit computation of a basis in which a given mapping φ : V → V has its matrix in the canonical Jordan form.

(1) We find the roots of the characteristic polynomial.
(2) If there are fewer than n = dim V of them (counting multiplicities), there is no canonical form.
(3) If there are n linearly independent eigenvectors, we obtain a basis of V composed of eigenvectors, and in it φ has a diagonal matrix.
(4) Otherwise, for each eigenvalue whose geometric multiplicity is smaller than the algebraic one, it is necessary to find a suitable basis by applying iterations of φ − λ·id_V: we look for the vectors at the upper border of the scheme from the proof of theorem 3.39 and complete the corresponding chains by their iterated images.

In the case (b) of the breeding exercise, with the states ordered AA, aA, aa, the process has the matrix

T = ( 1  1/2  0 )
    ( 0  1/2  1 )
    ( 0   0   0 ).

We immediately see all the eigenvalues, 1, 1/2 and 0 (if we subtract any of them from the diagonal, the rank of the resulting matrix is less than 3, that is, the homogeneous system given by this matrix has a non-trivial solution). To these eigenvalues there respectively correspond the eigenvectors (1, 0, 0)ᵀ, (−1, 1, 0)ᵀ and (1, −2, 1)ᵀ.

From there, for arbitrary n ∈ ℕ it follows that

Tⁿ = ( 1 −1  1 ) ( 1   0    0 ) ( 1 1 1 )
     ( 0  1 −2 ) ( 0  2⁻ⁿ  0 ) ( 0 1 2 )
     ( 0  0  1 ) ( 0   0    0 ) ( 0 0 1 ).

Clearly, for large n ∈ ℕ we can substitute 0 for 2⁻ⁿ, which implies

Tⁿ ≈ ( 1 −1  1 ) ( 1 0 0 ) ( 1 1 1 )   ( 1 1 1 )
     ( 0  1 −2 ) ( 0 0 0 ) ( 0 1 2 ) = ( 0 0 0 )
     ( 0  0  1 ) ( 0 0 0 ) ( 0 0 1 )   ( 0 0 0 ).

Thus if the individuals of the original population breed exclusively with members of the specific population (the one having only AA), after a sufficient number of breedings the pairs aA and aa are necessarily eliminated completely (and it does not matter what their original distribution was).

The case (c). Now we have 6 possible states, in this order: AA, AA; aA, AA; aa, AA; aA, aA; aa, aA; aa, aa, where the states are given by the genotypes of the two parents. The matrix of the corresponding Markov chain is

T = ( 1  1/4  0  1/16  0    0 )
    ( 0  1/2  0  1/4   0    0 )
    ( 0  0    0  1/8   0    0 )
    ( 0  1/4  1  1/4   1/4  0 )
    ( 0  0    0  1/4   1/2  0 )
    ( 0  0    0  1/16  1/4  1 ).

If we consider for instance the situation (second column) where one of the parents has the pair AA and the other aA, then clearly each of the four combinations AA, AA; AA, aA; aA, AA; aA, aA occurs with the same probability (we are talking about the pairs of alleles of two randomly chosen offspring). The probability of staying in the second state is thus 1/2, the probability of a transition from the second state to the first is 1/4, and to the fourth state also 1/4. Now we should again determine the powers Tⁿ for large n ∈ ℕ; considering the form of the first and of the last column, we immediately

Gaussian elimination without row exchanges can be written as left multiplication by invertible lower triangular matrices with ones on the diagonal, and it produces an upper triangular matrix U; thus A = L·U, where L is a lower triangular matrix with ones on the diagonal and U is upper triangular. This decomposition is called the LU-decomposition of the matrix A. In the case of a general matrix, Gaussian elimination into the row echelon form may need some additional row permutations, sometimes even column permutations; we then obtain the more general A = P·L·U·Q, where P and Q are suitable permutation matrices.

3.43. Notes. A direct corollary of Gaussian elimination is also the observation that, up to the choice of suitable bases on the domain and codomain, every mapping f : V → W is given by a matrix in block-diagonal form with a unit matrix, of size given by the dimension of the image of f, and zero blocks all around. This can be reformulated as follows: every matrix A of the type m/n over a field of scalars K can be decomposed into the product

A = P ( E 0 ; 0 0 ) Q,

where P and Q are suitable invertible matrices.

For square matrices we have shown in 3.32, when discussing the properties of linear mappings f : V → V over complex vector spaces, that every square matrix A of dimension m can be decomposed into the product A = P·B·P⁻¹, where B is block-diagonal with Jordan blocks associated with the eigenvalues on the diagonal. Indeed, this is just a reformulation of the Jordan theorem, because multiplying by the matrix P and by its inverse from the other side corresponds in this case just to a change of basis on the vector space V, and the cited theorem says that in a suitable basis every mapping has the Jordan canonical form.

Analogously, when discussing self-adjoint mappings we proved that for real symmetric matrices or complex Hermitian matrices there always exists a decomposition into the product A = P·B·P*, where B is a diagonal matrix with all the (always real) eigenvalues on the diagonal, counting multiplicities.
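The limit behaviour in case (b) is immediate to check numerically; a tiny sketch of ours (assuming Python with numpy):

    import numpy as np

    # Case (b): states AA, aA, aa; the other population carries only AA.
    T = np.array([[1.0, 0.5, 0.0],
                  [0.0, 0.5, 1.0],
                  [0.0, 0.0, 0.0]])

    print(np.linalg.matrix_power(T, 50))
    # approx. [[1 1 1], [0 0 0], [0 0 0]]: everything ends up at AA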
Indeed, it is again a product of matrices corresponding to a change of basis, but we allow only changes between orthonormal bases, so the change-of-basis matrix P must be orthogonal (unitary); from there P⁻¹ = P*. For real orthogonal mappings we derived an analogous expression as for the symmetric ones, only with B diagonal with blocks of size two or one, expressing rotations, mirror symmetries or the identity with respect to the corresponding subspaces.

3.44. Singular decomposition theorem. Let us return to general linear mappings between (in general distinct) vector spaces. If a scalar product is defined on them and we restrict ourselves to orthonormal bases only, we must proceed in a more refined way than in the case of arbitrary bases.

find out that 1 is an eigenvalue of the matrix T. It is very easy to find the eigenvectors (1, 0, 0, 0, 0, 0)ᵀ and (0, 0, 0, 0, 0, 1)ᵀ corresponding to the eigenvalue 1. By considering only the four-dimensional submatrix of the matrix T (omitting the first and sixth rows and columns) we find the remaining eigenvalues

1/2,  1/4,  (1 − √5)/4,  (1 + √5)/4.

If we recall the solution of the exercise called Sweet-toothed gambler, we do not have to compute Tⁿ: in that exercise we obtained the same eigenvectors corresponding to the eigenvalue 1, and the other eigenvalues also had absolute value strictly smaller than 1 (their exact values were not used). Thus we obtain the identical conclusion: the process approaches the probabilistic vector

(a, 0, 0, 0, 0, 1 − a)ᵀ,

where a ∈ [0, 1] is given by the initial state. Because only the first and sixth positions of the resulting vector can be non-zero, the states aA, AA; aa, AA; aA, aA; aa, aA disappear after many breedings. Let us further realise (this also follows from the exercise Sweet-toothed gambler) that the probability that the process ends with AA, AA equals the relative ratio of the appearances of A in the initial state.

The case (d). Let the values a, b, c ∈ [0, 1] give, in this order, the relative frequencies of the pairs AA, aA, aa in the given population. We want to express the relative frequencies of the pairs AA, aA, aa among the offspring of the population. If the choice of pairs for breeding is random, then for a suitably large population it can be expected that the relative frequency of matings of two AA individuals is a², of an aA with an AA is 2ab, of two aA individuals is b², and so on. An offspring of parents with pairs AA, AA must inherit AA. The probability that an offspring of parents with pairs AA, aA has AA is clearly 1/2, and the probability that an offspring of parents with pairs aA, aA has AA is 1/4. There are no other cases giving an offspring with the pair AA (if one of the parents has the pair aa, the offspring cannot have AA). The relative frequency of AA in the progeny is thus

a²·1 + 2ab·(1/2) + b²·(1/4) = a² + ab + b²/4.

Theorem. Let A be an arbitrary matrix of the type m/n over real or complex scalars. Then there exist square unitary matrices U and V of dimensions m and n, and a real diagonal matrix D of dimension r with non-negative entries, r ≤ min{m, n}, such that

A = U S V*,  S = ( D 0 ; 0 0 ),

and r is the rank of the matrix AA*. Furthermore, S is determined uniquely up to the order of its entries, and the entries of the diagonal matrix D are the square roots of the eigenvalues d_i of the matrix AA*. If A is a real matrix, then the matrices U and V are orthogonal.
Proof. Assume first that m ≤ n and denote by φ : Kⁿ → K^m the mapping between the real or complex spaces with standard scalar products given by the matrix A in the standard bases. We can reformulate the statement of the theorem as follows: there exist orthonormal bases on Kⁿ and K^m in which the mapping φ has the matrix S from the statement.

The matrix A*A is Hermitian and positive semidefinite, so there exists a unitary matrix V such that B = V*A*AV is diagonal, with the non-negative eigenvalues d₁, …, d_r, 0, …, 0 on the diagonal, the non-zero ones first. From there B = V*A*AV = (AV)*(AV). This is equivalent to the claim that the first r columns of the matrix AV are orthogonal and the remaining ones are zero, because they have zero size. Let us denote the first r columns v₁, …, v_r ∈ K^m. Then ⟨v_i, v_i⟩ = d_i, i = 1, …, r, and the normalised vectors u_i = v_i/√(d_i) form an orthonormal system of non-zero vectors. Let us extend them to an orthonormal basis u₁, …, u_m of the whole K^m. Expressing our original mapping φ in the bases given by the columns of V and by this new basis, we obtain exactly the matrix S.

If m > n, we can apply the previous part of the proof to the matrix A*; from there we directly obtain the desired claim. If we work over real scalars, all the previous steps of the proof are realised in the real domain as well. □

This proof of the theorem about the singular decomposition is constructive, and we can indeed use it for computing the unitary (orthogonal) matrices U and V and the non-zero diagonal entries of the matrix S.

3.45. Geometric interpretation. The diagonal values of the matrix D from the previous theorem are called the singular values of the matrix A. Let us reformulate this theorem in the real case more geometrically. For the corresponding linear mappings φ : ℝⁿ → ℝ^m the singular values have a simple geometric meaning: let K ⊆ ℝⁿ be the unit ball for the standard scalar product. Its image φ(K) is always an m-dimensional ellipsoid (possibly degenerate). The singular values of the matrix A are then the lengths of its principal half-axes, and the theorem further says that the original sphere always possesses mutually orthogonal diameters whose images are exactly the half-axes of this ellipsoid.

For square matrices one sees that A is invertible if and only if all singular values are non-zero. The ratio of the greatest to the smallest singular value is an important parameter for the robustness of numerical computations with matrices, for instance for the computation of the inverse matrix. Let us also note that there exist fast methods for computing (approximating) eigenvalues, so the singular decomposition is very effective to work with.

Analogously we gradually compute the relative frequencies of the pairs aA and aa in the progeny:

ab + bc + 2ac + b²/2  and  c² + bc + b²/4.

This process can be viewed as a mapping T transforming the vector (a, b, c)ᵀ:

T : (a, b, c)ᵀ ↦ (a² + ab + b²/4, ab + bc + 2ac + b²/2, c² + bc + b²/4)ᵀ.

Let us mention that the domain (and also the codomain) of T consists exactly of the vectors (a, b, c)ᵀ with a, b, c ∈ [0, 1] and a + b + c = 1. We would like to express the operation T as multiplication of the vector by some constant matrix, but that is clearly not possible (the mapping T is not linear). It is thus not a Markov process, and the determination of what happens after a long time cannot be simplified as in the previous cases. But we can compute what happens if we apply the mapping T twice in a row. Writing a′ = a² + ab + b²/4, b′ = ab + bc + 2ac + b²/2 and c′ = c² + bc + b²/4 for the frequencies after the first step, the second step gives

t₁ = a′² + a′b′ + b′²/4,  t₂ = a′b′ + b′c′ + 2a′c′ + b′²/2,  t₃ = c′² + b′c′ + b′²/4.

It can be shown (using a + b + c = 1) that t₁ = a′, t₂ = b′ and t₃ = c′.

3.46. Polar decomposition theorem. The singular decomposition theorem is a starting point for many very useful tools. Let us now think about some of its direct corollaries (which are by themselves quite non-trivial). The statement of the theorem says that any matrix A, real or complex, can be written as A = U S W*, with S diagonal with non-negative real numbers on the diagonal and U and W unitary. But then also A = U S U*·U W*, and let us call the matrices P = U S U*, V = U W*.
The first of them, P, is Hermitian (symmetric in the real case) and positive semidefinite, because it is just the mapping with the real diagonal matrix S written down in another orthonormal basis, while V is a product of two unitary matrices and thus again unitary (orthogonal in the real case). Furthermore, A* = W S U*, and thus AA* = U S S U* = P², so our matrix P is actually the square root of the easily computable Hermitian matrix AA*.

Assume that A = P·V = Q·U are two such decompositions of the matrix A into a product of a positive semidefinite Hermitian matrix and a unitary one, and assume that A is invertible. But then

AA* = P V V* P = P² = Q U U* Q = Q²

is positive definite, and thus the matrices Q = P = √(AA*) are uniquely determined and invertible. But then also U = V = P⁻¹A. We have thus completely derived a very useful analogy of the decomposition of a real number into its sign (orthogonal matrices in dimension one are exactly ±1) and its absolute value (the matrix P, for which we can compute the square root).

Theorem (Polar decomposition theorem). Every square complex matrix A of dimension n can be expressed in the form A = P·V, where P is a positive semidefinite Hermitian square matrix of the same dimension and V is unitary. We have P = √(AA*). If A is invertible, the decomposition is unique and V = (√(AA*))⁻¹A. If we work over real scalars, P is symmetric and V orthogonal.

If we apply the same theorem to A* instead of A, we obtain the same result, but with the order of the Hermitian and unitary matrices reversed. The matrices in the corresponding right and left decompositions will of course in general be distinct.

That is, a further application of the transformation T does not change the vector obtained in the first step. We have obtained the surprising result that the frequencies of the considered allele pairs are, after an arbitrarily long time, the same as in the first generation of offspring. For a large population we have thus proven that the whole evolution takes place during the first generation (unless there is some mutation or selection). □

3.38. Two boxes together contain n white and n black balls. At regular time intervals a ball is drawn from each box and moved to the other box, so the number of balls in each of the boxes is at the beginning (and thus for all time) equal to n. Describe this Markov process by its probabilistic transition matrix T.

Solution. This setting is often used in physics as a model of blending two incompressible liquids (introduced already in 1769 by D. Bernoulli) or, analogously, as a model of diffusion of gases. The states 0, 1, …, n correspond, for instance, to the number of white balls in the first box. This information already says how many black balls are in the first box (and the remaining balls are then in the second box). If in a certain step the state changes from j ∈ {1, …, n} to j − 1, it means that a white ball was drawn from the first box and a black ball from the second. That happens with probability
(j/n)·(j/n) = j²/n². The transition from the state j ∈ {0, …, n − 1} to the state j + 1 corresponds to drawing a black ball from the first box and a white ball from the second box, with probability ((n − j)/n)² = (n − j)²/n². The system stays in the state j ∈ {1, …, n − 1} if balls of the same colour were drawn from both boxes, which happens with probability 2j(n − j)/n². The matrix T is therefore tridiagonal: in the j-th column it has 2j(n − j)/n² on the diagonal, (n − j)²/n² below it and j²/n² above it. □

In the complex case the analogy with the decomposition of numbers is even more amusing: the positive semidefinite P again plays the role of the absolute value of a complex number, while the unitary matrix V has a unique expression V = re V + i·im V with Hermitian real and imaginary parts and the property (re V)² + (im V)² = E; that is, we obtain a full analogy of the polar form of complex numbers (see the final remark in 3.30). Note, however, that in the case of more dimensions it is important in which order this "polar form" of the matrix is written. It is possible both ways, but the results are in general distinct.

For many practical applications it is faster to use the so-called QR decomposition of matrices, which is an analogy of the Schur orthogonal triangulation theorem:

3.47. Theorem. For every complex matrix A of the type m/n there exist a unitary matrix Q and an upper triangular matrix R such that A = Q·R. If we work over real scalars, both Q and R are real.

Proof. In the geometric formulation we need to prove that for every mapping φ : Kⁿ → K^m with the matrix A in the standard bases we can choose a new orthonormal basis on K^m in which φ has an upper triangular matrix; such a basis is provided by the Gram-Schmidt orthonormalisation of the images of the standard basis vectors. □
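A quick numerical illustration of theorem 3.47 (a sketch of ours, assuming Python with numpy):

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [2.0, 4.0],
                  [0.0, 1.0]])

    # Reduced QR decomposition: the columns of Q are orthonormal,
    # R is upper triangular.
    Q, R = np.linalg.qr(A)
    print(np.allclose(A, Q @ R))             # True
    print(np.allclose(Q.T @ Q, np.eye(2)))   # True
    print(R)                                 # upper triangular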

suffices for him to have €346 if p = 0.495 (or €1727 if p = 0.499). Therefore it is possible in big casinos that "passionate" players play almost fair games. □

3.40. In a certain company there are two competing departments. The management has decided to measure every week the relative incomes (with respect to the number of employees) attained by the two departments; two employees of the less successful department are then moved to the more successful one. This process goes on as long as both departments have some employees. You have gained a position in this company and you can choose one of the two departments to work in; you want to choose the department which will not be cancelled by the employee movement. What will be your choice, if one of the departments has 40 employees, the other 10, and you estimate that the second one will have relatively greater income in 54 % of the cases? ○

Further applications of Markov chains can be found in the additional exercises after this chapter.

E. Unitary spaces

In the previous chapter we defined the scalar product for real vector spaces (2.40); in this chapter we extend the definition to complex spaces as well (3.23).

3.41. Groups O(n) and U(n). If we consider all linear mappings from ℝ³ to ℝ³ which preserve a given scalar product, that is, with respect to the definitions of lengths of vectors and deviations of two vectors, all linear mappings that preserve lengths and angles, then these mappings form a group with respect to the operation of composition (see 1.1): the composition of two such mappings is by definition again a mapping preserving lengths and angles, the unit element of the group is the identity mapping, and the inverse element of a given mapping is its inverse mapping, which exists thanks to the condition of length preservation. Matrices of such mappings thus form a group with the operation of matrix multiplication; it is called the orthogonal group and is denoted O(n). It is a subgroup of the group of all invertible mappings from ℝⁿ to ℝⁿ. If we additionally require that the matrices have determinant one, we speak of the special orthogonal group SO(n) (in general the determinant of a matrix in O(n) can be either 1 or −1). Similarly we define the unitary group U(n) as the group of all complex matrices that correspond to the complex linear mappings from ℂⁿ to ℂⁿ preserving a given scalar product in a unitary space. Analogously, SU(n) denotes the subgroup of matrices in U(n) with determinant one (in general the determinant of a matrix in U(n) can be any complex unit).

3.42. Consider the vector space V of functions ℝ → ℂ. Determine whether the mapping

From there we see that, written in the same block division as S,

B = ( D⁻¹ P )
    ( Q   R )

for suitable matrices P, Q and R. But now the product

B·A = ( E   0 )
      ( QD  0 )

should be Hermitian, thus QD = 0 and so also Q = 0 (the matrix D is diagonal and invertible). Analogously, the assumption that A·B is Hermitian implies that P is zero. Additionally, we have

B = B·A·B = ( D⁻¹ 0 )
            ( 0   0 ),

and since on the right-hand side the lower right corner is zero, also R = 0 and the claim is proven.

(4): Consider the mapping φ : Kⁿ → K^m, x ↦ Ax, and the direct sums Kⁿ = (Ker φ)⊥ ⊕ Ker φ, K^m = Im φ ⊕ (Im φ)⊥. The restricted mapping φ̃ := φ|_{(Ker φ)⊥} : (Ker φ)⊥ → Im φ is a linear isomorphism. If we choose suitable orthonormal bases on (Ker φ)⊥ and Im φ and extend them to orthonormal bases of the whole spaces, the mapping φ will have the matrix S, and φ̃ the matrix D, from the theorem about the singular decomposition.
For a given b ∈ K^m, the point z ∈ Im φ that minimises the distance ‖b − z‖ (that is, the point realising the distance of b from the affine subspace Im φ, see the next chapter) is exactly the component z = b₁ of the decomposition b = b₁ + b₂, b₁ ∈ Im φ, b₂ ∈ (Im φ)⊥. In suitably chosen bases, the mapping originally given in the standard bases by the pseudoinverse A⁽⁻¹⁾ is given by the matrix from the singular decomposition theorem; notably, we have A⁽⁻¹⁾(Im φ) = (Ker φ)⊥.

From the point (4) of the previous theorem we obtain that the matrix A·A⁽⁻¹⁾ is the matrix of the perpendicular projection of the vector space ℝⁿ, where n is the number of rows of the matrix A, onto the subspace generated by the columns of the matrix A (this interpretation is of course meaningful only for matrices that have more rows than columns). Furthermore, for matrices A whose columns are independent vectors, the expression (AᵀA)⁻¹Aᵀ makes sense, and it is not hard to verify that this matrix satisfies all the properties (1) and (2) from the previous theorem; thus it is a pseudoinverse of the matrix A.
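Numerically, the pseudoinverse is obtained directly from the singular decomposition by inverting the non-zero singular values; a compact sketch of ours (assuming Python with numpy):

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 1.0],
                  [2.0, 1.0]])          # more rows than columns

    # Pseudoinverse via the singular decomposition A = U S V*.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    A_pinv = Vt.T @ np.diag(1 / s) @ U.T

    print(np.allclose(A_pinv, np.linalg.pinv(A)))             # True
    print(np.allclose(A_pinv, np.linalg.inv(A.T @ A) @ A.T))  # True for
                                                              # independent columns
    print(A @ A_pinv)   # perpendicular projection onto the column space of A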

3.43. For a Hermitian matrix H, put U = exp(iH) = Σ_{n=0}^∞ (1/n!)·(iH)ⁿ. From the definition of exp one can show that exp(A + B) = exp(A)·exp(B) whenever A and B commute, just as we are used to with the exponential in the domain of numbers. Because in general (u + v)* = u* + v* and (cv)* = c̄v*, we obtain

U* = ( Σ_{n=0}^∞ (1/n!)·(iH)ⁿ )* = Σ_{n=0}^∞ (1/n!)·(−i)ⁿ(H*)ⁿ,

and because H* = H,

U* = Σ_{n=0}^∞ (1/n!)·(−iH)ⁿ = exp(−iH),

and thus U*U = exp(iH)·exp(−iH) = exp(0) = E; the matrix U is unitary. Moreover, det(U) = e^{trace(iH)}. □

3.44. Hermitian matrices A, B, C satisfy [A, C] = [B, C] = 0 and [A, B] ≠ 0, where [ , ] is the commutator of matrices defined by the relation [A, B] = AB − BA. Show that at least one eigensubspace of the matrix C must have dimension > 1.

Solution. We prove it by contradiction. Assume that all eigensubspaces of the operator C have dimension 1. Then we can write any vector u as u = Σ_k c_k u_k, where the u_k are linearly independent eigenvectors of the operator C associated with the eigenvalues λ_k (and c_k = ⟨u, u_k⟩). For these eigenvectors it clearly holds that

0 = [A, C]u_k = ACu_k − CAu_k = λ_k·Au_k − C(Au_k).

From there we see that Au_k is an eigenvector of the matrix C with the eigenvalue λ_k. But that means that Au_k = λ_k^A·u_k for some number λ_k^A. Similarly we derive Bu_k = λ_k^B·u_k for some number λ_k^B. For the commutator of the matrices A and B we then obtain

[A, B]u_k = ABu_k − BAu_k = λ_k^A λ_k^B·u_k − λ_k^B λ_k^A·u_k = 0.

But that means that

[A, B]u = [A, B] Σ_k c_k u_k = Σ_k c_k [A, B]u_k = 0,

and because u was arbitrary, [A, B] = 0, which is a contradiction. □

3.50. Linear regression. The approximation property (3) from the previous theorem is very useful in the cases where we are to find as good an approximation as possible of the (non-existent) solution of a given system Ax = b, where A is a real matrix of the type m/n and m > n. For instance, an experiment gives us many measured real values b_j, and we want to find a linear combination of some functions f_i which approximates the values b_j. The actual values of the chosen functions at the points y_j ∈ ℝ give a matrix a_ij = f_j(y_i), whose columns are given by the values of the individual functions f_j at the considered points, and our goal is to determine the coefficients x_j ∈ ℝ so that the sum of the squares of the deviations from the actual values is minimised. In other words, we seek a linear combination of the functions f_j which interpolates the given values b_i "well". Thanks to the previous theorem, the optimal coefficients are A⁽⁻¹⁾b.

In order to have a more specific idea, consider just two functions f₁(x) = x, f₂(x) = x², and assume that the "measured values" of their unknown combination g(x) = y₁x + y₂x² at the integer values of x between 1 and 10 are

bᵀ = (1.44, 10.64, 4.48, 14.56, 31.12, 39.20, 54.88, 71.28, 85.92, 104.16).

This vector arose by computing the values of x + x² at the given points, shifted by random values in the range ±8. The matrix A = (a_ij) is in our case given by

Aᵀ = ( 1 2 3 4  5  6  7  8  9  10  )
     ( 1 4 9 16 25 36 49 64 81 100 ),

and the coefficients of the combination are

(y₁, y₂)ᵀ = A⁽⁻¹⁾b ≈ (0.61, 0.99)ᵀ.

The resulting interpolation can be seen in the picture, where the given values b are interpolated by a green polygonal chain, while the red graph corresponds to the combination g. The computations were done in the system Maple using the command leastsqrs(B,b). If you are friendly with Maple (or some other similar software), try to do some experiments with similar tasks.
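Those without Maple can replay the same least-squares fit in a few lines; a sketch of ours (assuming Python with numpy; the data vector is the one printed above):

    import numpy as np

    x = np.arange(1, 11)
    b = np.array([1.44, 10.64, 4.48, 14.56, 31.12,
                  39.20, 54.88, 71.28, 85.92, 104.16])

    # Columns are the values of f1(x) = x and f2(x) = x^2.
    A = np.column_stack([x, x**2])

    # Least-squares solution, i.e. the pseudoinverse applied to b.
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    print(coef)   # approx. [0.61 0.99]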
3.45. Applications in quantum physics. In quantum physics we do not assign quantities numerical values, as in classical physics, but Hermitian operators. Such an operator is nothing but a Hermitian mapping, which can (and often does) amount to a linear transformation of a unitary space of infinite dimension (we can imagine it as a matrix of infinite dimension). Vectors of this unitary space then represent the states of the given physical system. When measuring a given physical quantity, we obtain only values that are eigenvalues of the corresponding operator.

For instance, instead of the coordinate x we have the operator x̂ of the coordinate, which acts as multiplication by x: if the state of the system is described by the vector v, then x̂(v) = xv, that is, it corresponds to multiplying the vector by the real number x. At first glance this Hermitian operator differs from our finite-dimensional cases: evidently every real number is an eigenvalue (x̂ has a so-called continuous spectrum). Similarly, in place of velocity (more precisely, momentum) we have the operator p̂ = −i·d/dx. Its eigenvectors are the solutions of the differential equation −i·dv/dx = λv. In this case, too, the spectrum is continuous, which expresses the fact that the corresponding physical quantity is continuous (it can attain any real value). On the other hand, there are physical quantities, for instance energy, that can attain only discrete values (energy exists in quanta). The corresponding operators are then really similar to Hermitian matrices, only with infinitely many eigenvalues.

3.46. Show that x̂ and p̂ are Hermitian and that [x̂, p̂] = i.

Solution. For any vector v it holds that

[x̂, p̂]v = x̂p̂v − p̂x̂v = x·(−i·dv/dx) + i·d(xv)/dx = iv,

and from there we directly have our claim. □

3.47. Show that [x̂ − p̂, x̂ + p̂] = 2i.

Solution. Evidently [x̂, x̂] = 0 and [p̂, p̂] = 0, and the rest follows from the linearity of the commutator from the previous exercise. □

3.48. Jordan form. Find the Jordan form of the matrix A. What is the geometric interpretation of this decomposition of the matrix?

Solution. i) We first compute the characteristic polynomial of the matrix A = ( −1 1 ; −6 4 ):

|A − λE| = | −1−λ  1 ; −6  4−λ | = λ² − 3λ + 2.

The eigenvalues of the matrix A are the roots of this polynomial, that is, λ₁,₂ = 1, 2. Because the matrix is of order two and has two distinct eigenvalues, its Jordan form is the diagonal matrix J = ( 1 0 ; 0 2 ). The eigenvector (x, y) associated with the eigenvalue 1 satisfies 0 = (A − E)x, that is, −2x + y = 0, which holds exactly for the multiples of the vector (1, 2). Similarly we find that the eigenvector associated with the eigenvalue 2 is (1, 3). The matrix P is then obtained by writing these eigenvectors into the columns, that is, P = ( 1 1 ; 2 3 ). For the matrix A we then have A = P·J·P⁻¹. The inverse of P is P⁻¹ = ( 3 −1 ; −2 1 ), and we obtain

( −1 1 ; −6 4 ) = ( 1 1 ; 2 3 )·( 1 0 ; 0 2 )·( 3 −1 ; −2 1 ).

This decomposition tells us that the matrix A determines a linear mapping which has, in the basis of the eigenvectors (1, 2), (1, 3), the aforementioned diagonal form: in the direction (1, 2) nothing changes, and in the direction (1, 3) every vector is stretched by the factor two.

ii) The characteristic polynomial of the matrix A = ( −1 1 ; −4 3 ) is in this case

|A − λE| = | −1−λ  1 ; −4  3−λ | = λ² − 2λ + 1 = 0.

We obtain the double root λ = 1, and the corresponding eigenvector (x, y) satisfies

0 = (A − E)x = ( −2 1 ; −4 2 )·(x, y)ᵀ.

The solutions are, as in the previous case, the multiples of the vector (1, 2). The fact that the system has no two linearly independent vectors as a
solution says that the Jordan form in this case is not diagonal; it will be the matrix J = ( 1 1 ; 0 1 ). The basis in which A has this form consists of the eigenvector (1, 2) and a vector that maps onto this eigenvector under the mapping A − E, that is, a solution of the system of equations

( −2 1 ; −4 2 )·(x, y)ᵀ = (1, 2)ᵀ.

One solution is, for instance, the vector (1, 3); we obtain the same basis as in the previous case, and we can write

( −1 1 ; −4 3 ) = ( 1 1 ; 2 3 )·( 1 1 ; 0 1 )·( 3 −1 ; −2 1 ).

The mapping now acts on a vector as follows: the component in the direction (1, 3) stays the same, and the component in the direction (1, 2) is replaced by the sum of the coefficients that determine the components in the directions (1, 3) and (1, 2). □

3.49. Find the Jordan forms of the matrices

A₁ = (1/3)·( 5 −1 ; −2 4 )  and  A₂ = (1/3)·( 5 −1 ; 4 1 ),

write down the decompositions, and draw how the vectors v = (3, 0), A₁v and A₂v decompose with respect to the basis of the eigenvectors of the matrices A₁, A₂.

Solution. The matrices have the same Jordan forms as the matrices in the previous exercise, and both have them in the basis of the vectors (1, 2) and (1, −1), that is,

(1/3)·( 5 −1 ; −2 4 ) = ( 1 1 ; 2 −1 )·( 1 0 ; 0 2 )·(1/3)·( 1 1 ; 2 −1 )

and

(1/3)·( 5 −1 ; 4 1 ) = ( 1 1 ; 2 −1 )·( 1 1 ; 0 1 )·(1/3)·( 1 1 ; 2 −1 ).

For the vector v = (3, 0) we obtain v = (1, 2) + 2·(1, −1), and for its images A₁v = (5, −2) = (1, 2) + 2·2·(1, −1) and A₂v = (5, 4) = (2 + 1)·(1, 2) + 2·(1, −1). □

F. Matrix decompositions

3.50. Prove or disprove:
• Let A be a square matrix n × n. Then the matrix AᵀA is symmetric.
• Let A be a square matrix with only real positive eigenvalues. Then A is symmetric.

3.51. Find an LU-decomposition of the following matrix:

( −2  1  0 )
( −4  4  2 )
( −6  1 −1 ).

Solution.

( −2  1  0 )   ( 1  0  0 ) ( −2  1  0 )
( −4  4  2 ) = ( 2  1  0 )·(  0  2  2 )
( −6  1 −1 )   ( 3 −1  1 ) (  0  0  1 ).

We first multiply the matrices X that correspond to the steps of the Gaussian elimination, obtaining for the original matrix A the equality X·A = U, where X is a lower triangular matrix given by the Gaussian reduction and U is upper triangular. From this equality we have A = X⁻¹·U, which is the desired decomposition (we thus have to compute the inverse of X). □

3.52. Find the LU-decomposition of the matrix

(  1  1  0 )
(  1 −1  2 )
( −1  1 −1 ).  ○

3.53. Ray-tracing. In computer 3D graphics the image is very often displayed using the ray-tracing algorithm. The basis of this algorithm is the approximation of light waves by rays (lines) and the approximation of the displayed objects by polyhedra. These are bounded by planes, and it is necessary to compute where exactly the light rays are reflected from these planes. From physics we know how rays are reflected: the angle of incidence equals the angle of reflection. We have already met this topic in exercise 1.64.

A ray of light travelling in the direction v = (1, 2, 3) hits the plane given by the equation x + y + z = 1. In what direction is it reflected?

Solution. The unit normal vector of the plane is n = (1/√3)·(1, 1, 1). The vector v_R giving the direction of the reflected ray lies in the plane spanned by the vectors v, n, so we can express it as a linear combination of these vectors. Furthermore, the rule for the angle of reflection says that ⟨v, n⟩ = −⟨v_R, n⟩. From there we obtain a quadratic equation for the coefficient of the linear combination. The exercise can also be solved in an easier, more geometric way: from the picture we can directly derive that v_R = v − 2⟨v, n⟩n, and in our case we obtain v_R = (−3, −2, −1). □

3.54. Singular decomposition, polar decomposition, pseudoinverse.
3.54. Singular decomposition, polar decomposition, pseudoinverse. Compute the singular decomposition of the matrix

A = ( 0 0 -1/2 ; -1 0 0 ; 0 0 0 ).

Then compute its polar decomposition and find its pseudoinverse.

Solution. We first compute A^T A = ( 1 0 0 ; 0 0 0 ; 0 0 1/4 ) and obtain a diagonal matrix. But we need an orthonormal basis in which the matrix is diagonal with the zero in the last place of the diagonal. Such a basis is clearly obtained by rotating through the right angle about the x-axis (the y-coordinate goes to z, and z goes to -y). This rotation is the orthogonal transformation given by the matrix

V = ( 1 0 0 ; 0 0 1 ; 0 -1 0 ).

By this we have (without much computation) found the decomposition A^T A = V B V^T, where B is diagonal with the eigenvalues (1, 1/4, 0) on the diagonal. Because B = (AV)^T (AV), the columns of the matrix

AV = ( 0 1/2 0 ; -1 0 0 ; 0 0 0 )

form an orthogonal system of vectors, which we normalise and extend to a basis: (0, -1, 0), (1, 0, 0), (0, 0, 1). The transition matrix from this basis to the standard one is

U = ( 0 1 0 ; -1 0 0 ; 0 0 1 ).

Finally, we obtain the singular decomposition

A = U √B V^T = ( 0 1 0 ; -1 0 0 ; 0 0 1 )( 1 0 0 ; 0 1/2 0 ; 0 0 0 )( 1 0 0 ; 0 0 -1 ; 0 1 0 ).

The geometric interpretation of the decomposition is the following: first everything is rotated through the right angle about the x-axis, then follows the projection onto the xy-plane under which the unit ball is mapped onto the ellipse with half-axes 1 and 1/2, and the result is rotated through the right angle about the z-axis.

The polar decomposition A = P·W can be obtained simply from the singular one: P := U √B U^T and W := U V^T, that is,

P = ( 1/2 0 0 ; 0 1 0 ; 0 0 0 ),   W = ( 0 0 -1 ; -1 0 0 ; 0 1 0 ).

The pseudoinverse matrix is then given by the expression A^(-1) := V S U^T, where S = ( 1 0 0 ; 0 2 0 ; 0 0 0 ). Thus we have

A^(-1) = ( 0 -1 0 ; 0 0 0 ; -2 0 0 ). □

3.55. QR decomposition. The QR decomposition of a matrix A is very useful when we are given a system of linear equations Ax = b which has no solution, but we need as good an approximation as possible, that is, we want to minimise ‖Ax - b‖. According to the Pythagorean theorem, ‖Ax - b‖² = ‖Ax - b_∥‖² + ‖b_⊥‖², where b is decomposed into the part b_∥ belonging to the range of the linear transformation corresponding to A, and the part b_⊥ perpendicular to this range. The projection onto the range of A can be written in the form QQ^T for a suitable matrix Q; specifically, we obtain Q through the Gram-Schmidt orthonormalisation of the columns of A. Then Ax - b_∥ = Q(Q^T Ax - Q^T b). The system Q^T Ax = Q^T b in the parentheses has a solution, for which ‖Ax - b‖ = ‖b_⊥‖, the minimal possible value. Furthermore, the matrix R := Q^T A is upper triangular, and therefore the approximate solution can be found very easily.

Find an approximate solution of the system

x + 2y = 1,
2x + 4y = 4.

Solution. We have a system Ax = b with A = ( 1 2 ; 2 4 ) and b = (1, 4)^T, which evidently has no solution. We thus orthonormalise the columns of A. We take the first of them and divide it by its size; this yields the first vector of the orthonormal basis, (1/√5)(1, 2)^T. But the second column is twice the first, and thus after the orthonormalisation it becomes zero. Therefore

Q = (1/√5)( 1 ; 2 ),

and the projector onto the range of A is QQ^T = 1/5 ( 1 2 ; 2 4 ). Next we compute R = Q^T A = ( √5  2√5 ) and Q^T b = 9/√5. The approximate solution then satisfies Rx = Q^T b, which in our case means 5x + 10y = 9 (the approximate solution is thus not unique). The QR decomposition of the matrix A is then

A = QR = (1/√5)( 1 ; 2 ) · ( √5  2√5 ). □
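The decomposition and the pseudoinverse from 3.54 can be cross-checked with numpy's built-in routines; note that a singular decomposition is unique only up to sign choices in the singular vectors, so the factors may differ from the hand computation while the products agree.

    # Numerical cross-check of exercise 3.54.
    import numpy as np

    A = np.array([[0.0, 0.0, -0.5],
                  [-1.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])

    U, s, Vt = np.linalg.svd(A)
    print(s)                                       # [1.  0.5 0. ]
    print(np.allclose(A, U @ np.diag(s) @ Vt))     # True

    print(np.linalg.pinv(A))
    # [[ 0. -1.  0.]
    #  [ 0.  0.  0.]
    #  [-2.  0.  0.]]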
3.56. Minimise ‖Ax - b‖ for

A = ( 2 -1 -1 ; -1 2 -1 ; -1 -1 2 ),   b = (1, 1, 1)^T,

and write down the QR decomposition of the matrix A.

Solution. The normalised first column of A is e1 = (1/√6)(2, -1, -1). From the second column we subtract its component in the direction of e1; since ((-1, 2, -1), e1) = -3/√6, we obtain

(-1, 2, -1) + 1/2 (2, -1, -1) = (0, 3/2, -3/2).

By this we have created an orthogonal vector, which we normalise to obtain e2 = (1/√2)(0, 1, -1). The third column of A is already linearly dependent on the first two (we can verify this by computing the determinant). The desired column-orthogonal matrix is then

Q = ( 2/√6 0 ; -1/√6 1/√2 ; -1/√6 -1/√2 ).

Next we compute

R = Q^T A = ( √6  -√6/2  -√6/2 ; 0  3/√2  -3/√2 )

and Q^T b = (0, 0)^T. The solutions of the equation Rx = Q^T b satisfy x = y = z; the multiples of the vector (1, 1, 1) thus minimise ‖Ax - b‖. Indeed, the mapping given by the matrix A is, up to the factor 3, the orthogonal projection onto the plane with normal vector (1, 1, 1). □

3.57. Linear regression. The knowledge we have gained in this chapter can be used practically for linear regression, that is, for finding the best approximation of some functional dependence by a linear function. Suppose we are given the values of such a dependence in k points (for instance, we investigate the value of the property of people depending on their intelligence, the value of the property of their parents, the number of mutual friends with Mr. Williams, ...):

f(a_1^1, ..., a_n^1) = y_1, ..., f(a_1^k, ..., a_n^k) = y_k,   k > n

(we thus have more equations than unknowns), and we want the "best possible" approximation of this dependence by a linear function f(x_1, ..., x_n) = b_1 x_1 + b_2 x_2 + ... + b_n x_n + c. We define "best possible" by the minimisation of

Σ_{i=1}^{k} ( y_i - ( Σ_{j=1}^{n} b_j a_j^i + c ) )²

with respect to the real constants b_1, ..., b_n, c. Our goal is thus to find the linear combination of the columns of the matrix A = (a_j^i) (with coefficients b_1, ..., b_n) which has the smallest distance from the vector (y_1, ..., y_k) in R^k; in other words, to find the orthogonal projection of the vector (y_1, ..., y_k) onto the subspace generated by the columns of A. By the theorem 3.49 this projection is realised by the vector

(b_1, ..., b_n)^T = A^(-1)(y_1, ..., y_k)^T,

where A^(-1) denotes the pseudoinverse of A.

3.58. Using the least squares method, solve the system

2x + y + 2z = 1,
x + y + 3z = 2,
2x + y + z = 0,
x + z = -1.

Solution. Our system has no solution, since its matrix has rank 3 while the extended matrix has rank 4. The best approximation of the vector b = (1, 2, 0, -1)^T formed by the right-hand sides can thus be obtained, using the theorem 3.49, as x = A^(-1)b (then A A^(-1) b is the best approximation, i.e. the perpendicular projection of b onto the space generated by the columns of the matrix A). Because the columns of A are linearly independent, its pseudoinverse is given by the relation A^(-1) = (A^T A)⁻¹ A^T. Thus we have

A^(-1) = ( 3/5 -1 0 ; -1 10/3 -2/3 ; 0 -2/3 1/3 )( 2 1 2 1 ; 1 1 1 0 ; 2 3 1 1 )
       = ( 1/5 -2/5 1/5 3/5 ; 0 1/3 2/3 -5/3 ; 0 1/3 -1/3 1/3 ).

The desired x equals A^(-1)b = (-6/5, 7/3, 1/3)^T. The projection (the best possible approximation of the column of right-hand sides) is then the vector (3/5, 32/15, 4/15, -13/15)^T. □
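The computation of 3.58 is easy to reproduce with numpy, once via the pseudoinverse formula and once with the built-in least-squares solver; both give the same approximate solution.

    # Least-squares solution of the overdetermined system from 3.58.
    import numpy as np

    A = np.array([[2.0, 1.0, 2.0],
                  [1.0, 1.0, 3.0],
                  [2.0, 1.0, 1.0],
                  [1.0, 0.0, 1.0]])
    b = np.array([1.0, 2.0, 0.0, -1.0])

    A_pinv = np.linalg.inv(A.T @ A) @ A.T          # (A^T A)^(-1) A^T
    x = A_pinv @ b
    print(x)        # [-1.2  2.3333...  0.3333...], i.e. (-6/5, 7/3, 1/3)
    print(A @ x)    # the projection (3/5, 32/15, 4/15, -13/15)

    print(np.linalg.lstsq(A, b, rcond=None)[0])    # the same x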
G. Additional exercises for the whole chapter

3.59. Model of evolution of a whale population. For the evolution of a population, the females are important, and for them the key factor is not age but fertility. From this point of view we can divide the females into newborns (juveniles), that is, females who are not yet fertile; young fertile females; adult females with the highest fertility; and postclimacterial females which are not fertile any more (but are still important for taking care of the newborns and for gathering food).

We model the evolution of such a population in time. For the time unit we choose the time needed to reach adulthood. A newborn female which survives this interval becomes fertile. The evolution of a young female to full fertility and to the postclimacterial state depends on the environment; the transition to the next category is thus a random event. Analogously, the death of an individual is a random event. A young fertile female has fewer children per unit interval than an adult female.

Let us formalise these statements. Denote by x1(t), x2(t), x3(t), x4(t) the numbers of juvenile, young, adult and postclimacterial females at time t, respectively. The amounts can be expressed as numbers of individuals, but also as numbers of individuals per unit area (the so-called population density), as total biomass, and similarly. Further denote by p1 the probability that a juvenile female survives the unit time interval and becomes fertile, and by p2 and p3 the respective probabilities that a young female becomes adult and that an adult female becomes old. Another random event is dying (positively formulated: survival) of the females that do not move to the next category; the respective probabilities are q2, q3 and q4 for young, adult and old females. Each of the numbers p1, p2, p3, q2, q3, q4 is a probability, so it lies in the interval [0, 1]. A young female can survive and reach adulthood, survive without reaching it, or die; these events are mutually exclusive, together they form a sure event, and death cannot be excluded. Thus we have p2 + q2 < 1, and for similar reasons p3 + q3 < 1. Finally, denote by f2 and f3 the average numbers of daughters of a young and of an adult female, respectively; these parameters satisfy 0 < f2 < f3.

The expected number of newborn females in the next time interval is the sum of the daughters of young and of adult females, that is,

x1(t + 1) = f2·x2(t) + f3·x3(t).

Denote for a while by x_{2,1}(t + 1) the number of young females at time t + 1 which were juvenile in the previous time interval, that is at time t, and by x_{2,2}(t + 1) the number of young females that were already fertile at time t, survived that interval but did not move into adulthood. The probability p1 that a juvenile female survives the interval can be expressed as classical probability, that is, as the ratio x_{2,1}(t + 1)/x1(t), and similarly the probability q2 as the ratio x_{2,2}(t + 1)/x2(t). Because the young females at time t + 1 are exactly those that survived the juvenile stage together with those that were already fertile, survived and did not evolve further, it holds that

x2(t + 1) = x_{2,1}(t + 1) + x_{2,2}(t + 1) = p1·x1(t) + q2·x2(t).

Analogously we derive the expected number of fully fertile females,

x3(t + 1) = p2·x2(t) + q3·x3(t),

and the expected number of postclimacterial females,

x4(t + 1) = p3·x3(t) + q4·x4(t).

[Figure 1. Evolution of a population of orca whales. On the horizontal axis the time in years, on the vertical axis the size of the population. The individual areas depict the numbers of juvenile, young, adult and old females, respectively, from below.]
Now we can denote

A = ( 0 f2 f3 0 ; p1 q2 0 0 ; 0 p2 q3 0 ; 0 0 p3 q4 )

and write the model in the matrix form x(t + 1) = A·x(t). For the orca whale, the estimated parameters give

A = ( 0      0.0043 0.1132 0      ;
      0.9775 0.9111 0      0      ;
      0      0.0736 0.9534 0      ;
      0      0      0.0452 0.9804 ).

Starting, for instance, from a single young female, x(0) = (0, 1, 0, 0)^T, we compute

x(1) = A·x(0) = (0.0043, 0.9111, 0.0736, 0)^T,
x(2) = A·x(1) = (0.01224925, 0.83430646, 0.13722720, 0.00332672)^T,

and we can carry on. The results of the computation can also be expressed graphically; see Figure 1. Try by yourself a computation and graphical depiction of the results for a different initial distribution of the population. The result should be the observation that the total population grows exponentially, while the ratios of the sizes of the individual groups gradually stabilise at constant values.

The matrix A has the eigenvalues

λ1 = 1.025441326,  λ2 = 0.980400000,  λ3 = 0.834222976,  λ4 = 0.004835698;

the eigenvector associated with the largest eigenvalue λ1 is

w = (0.03697187, 0.31607121, 0.32290968, 0.32404724),

normed so that the sum of its components equals 1. Compare the evolution of the size of the population with the exponential function F(t) = λ1^t · x0, where x0 is the total size of the initial population. Compute also the relative distribution over the individual categories after a certain time of evolution and compare it with the components of the eigenvector w. They will appear very close. This is caused by the fact that A has only a single eigenvalue of the greatest absolute value, and by the fact that the vector space generated by the eigenvectors associated with the eigenvalues λ2, λ3, λ4 intersects the non-negative orthant only in the zero vector. The structure of the matrix A itself does not ensure such easily predictable evolution, because it is a so-called reducible matrix (see ??).
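The iteration and the eigenvalue analysis above can be reproduced with a few lines of numpy (a sketch for experimentation, not part of the original text).

    # Simulation of the whale model x(t+1) = A x(t).
    import numpy as np

    A = np.array([[0.0,    0.0043, 0.1132, 0.0],
                  [0.9775, 0.9111, 0.0,    0.0],
                  [0.0,    0.0736, 0.9534, 0.0],
                  [0.0,    0.0,    0.0452, 0.9804]])

    x = np.array([0.0, 1.0, 0.0, 0.0])     # one young female to start with
    for t in range(50):
        x = A @ x
    print(x / x.sum())        # relative distribution after 50 steps

    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)
    w = eigvecs[:, k].real
    print(eigvals[k].real)    # ~1.02544, the dominant eigenvalue
    print(w / w.sum())        # ~(0.0370, 0.3161, 0.3229, 0.3240)

The printed distribution after 50 steps is already very close to the normed eigenvector w, exactly as discussed above.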
3.60. Model of growth of a population of teasels (Dipsacus sylvestris). This plant can be seen in four stages: either as a blossoming plant, or as a rosette of leaves, where the rosettes come in three sizes, small, medium and large. The life cycle of this monoecious perennial plant can be described as follows. A blossoming plant produces some number of seeds in late summer and dies. Of the seeds, some sprout already in that year into a rosette of leaves, usually of medium size. The other seeds spend the winter in the ground. Some of these sprout in the spring into a rosette, usually a small one, because they were weakened during the winter. After three or more winters the "sleeping" (formally, dormant) seeds die, as they lose the ability to sprout. Depending on the environment, a small or medium rosette can grow during the year, and any rosette can stay in its category or die (wither, be eaten by insects, etc.). A medium or large rosette can burst into a flower in the next year. The blossoming flower then produces seeds and the cycle repeats.

To be able to predict the spreading of the population of the teasels, we need to quantify the described events. The botanists discovered that a blossoming plant produces on average 431 seeds. The probabilities that a seed sprouts and that a rosette grows or bursts into a flower are summarised in the following table:

event                                                           probability
seed produced by a flower dies                                  0.172
seed sprouts into a small rosette in the current year           0.008
seed sprouts into a medium rosette in the current year          0.070
seed sprouts into a large rosette in the current year           0.002
seed sprouts into a small rosette after spending the winter     0.013
seed sprouts into a medium rosette after spending the winter    0.007
seed sprouts into a large rosette after spending the winter     0.001
seed sprouts into a small rosette after spending two winters    0.010
seed dies after spending one winter                             0.013
small rosette survives but does not grow                        0.125
medium rosette survives but does not grow                       0.238
large rosette survives but does not grow                        0.167
small rosette grows into a medium one                           0.125
small rosette grows into a large one                            0.036
medium rosette grows into a large one                           0.245
medium rosette bursts into a flower                             0.023
large rosette bursts into a flower                              0.750

Note that all the relevant events of the life cycle have their probabilities given and that the events are mutually incompatible. Let us imagine that we always observe the population at the beginning of the vegetative year, say in March, and that all the considered events take place in the rest of the year, say from April to February. In the population there are blossoming flowers, rosettes of three sizes, produced seeds, and seeds that have been dormant for one or two years. This could lead us to a division of the population into seven classes: just-produced seeds, seeds dormant for one year, seeds dormant for two years, small, medium and large rosettes, and blossoming flowers. But the just-produced seeds turn within the same year either into rosettes or into overwintering seeds, so they do not form a separate category. Let us thus denote:

x1(t) : the number of seeds dormant for one year in the spring of the year t,
x2(t) : the number of seeds dormant for two years in the spring of the year t,
x3(t) : the number of small rosettes in the spring of the year t,
x4(t) : the number of medium rosettes in the spring of the year t,
x5(t) : the number of large rosettes in the spring of the year t,
x6(t) : the number of blossoming flowers in the spring of the year t.

The number of seeds produced in the year t is 431·x6(t). The probability that a seed stays dormant for the first year equals the probability that it neither sprouts into any rosette nor dies, that is,

1 - (0.008 + 0.070 + 0.002 + 0.172) = 0.748.

The expected number of seeds dormant for one winter in the next year is thus

x1(t + 1) = 0.748 · 431·x6(t) = 322.388·x6(t).

The probability that a seed which has been dormant for one year stays dormant for the second year equals the probability that it neither sprouts into any rosette nor dies, that is, 1 - 0.013 - 0.007 - 0.001 - 0.013 = 0.966. The expected number of seeds dormant for two winters is thus

x2(t + 1) = 0.966·x1(t).

A small rosette can sprout from a fresh seed, from a seed dormant for one year, or from a seed dormant for two years. The expected number of small rosettes sprouted from fresh seeds in the year t equals 0.008 · 431·x6(t) = 3.448·x6(t); the expected numbers of small rosettes sprouted from the seeds dormant for one and two years are 0.013·x1(t) and 0.010·x2(t), respectively.
Besides these newly sprouted small rosettes, the population also contains the older small rosettes that have survived without growing; of those there are 0.125·x3(t). The total expected number of small rosettes is thus

x3(t + 1) = 0.013·x1(t) + 0.010·x2(t) + 0.125·x3(t) + 3.448·x6(t).

Analogously we determine the expected numbers of medium and large rosettes:

x4(t + 1) = 0.007·x1(t) + 0.125·x3(t) + 0.238·x4(t) + 0.070 · 431·x6(t)
          = 0.007·x1(t) + 0.125·x3(t) + 0.238·x4(t) + 30.170·x6(t),

x5(t + 1) = 0.008·x1(t) + 0.038·x3(t) + 0.245·x4(t) + 0.167·x5(t) + 0.002 · 431·x6(t)
          = 0.008·x1(t) + 0.038·x3(t) + 0.245·x4(t) + 0.167·x5(t) + 0.862·x6(t).

A blossoming flower can arise either from a medium or from a large rosette. The expected number of blossoming flowers is thus

x6(t + 1) = 0.023·x4(t) + 0.750·x5(t).

We have thus obtained six recurrence formulas for the individual components of the investigated population. Denoting

A = ( 0     0     0     0     0     322.388 ;
      0.966 0     0     0     0     0       ;
      0.013 0.010 0.125 0     0     3.448   ;
      0.007 0     0.125 0.238 0     30.170  ;
      0.008 0     0.038 0.245 0.167 0.862   ;
      0     0     0     0.023 0.750 0       ),

x(t) = (x1(t), x2(t), x3(t), x4(t), x5(t), x6(t))^T,

we can write the previous equalities in the matrix form suitable for the computation,

x(t + 1) = A·x(t).

If we know the distribution of the individual components of the population in some initial year t = 0, we can compute the expected numbers of flowers and seeds in the following years. We can also compute the total number of individuals n(t) = Σ_{i=1}^{6} x_i(t) at time t, the relative distribution of the individual components x_i(t)/n(t), i = 1, ..., 6, and the yearly relative change of the population size n(t + 1)/n(t). The results of such calculations for fifteen years, in the case that we have planted one blossoming flower in some locality, are given in Table 1. Unlike for the whale population, a picture would not be very clear here, as the numbers of flowers are negligible compared to the numbers of seeds (the individual areas for flowers would merge in the picture).

The matrix A has the eigenvalues

λ1 = 2.3339,                λ4 = 0.1187 + 0.1953i,
λ2 = -0.9569 + 1.4942i,     λ5 = 0.1187 - 0.1953i,
λ3 = -0.9569 - 1.4942i,     λ6 = -0.1274.

The eigenvector associated with the eigenvalue λ1, normed so that the sum of its components equals one, is

w = (0.6377, 0.2640, 0.0122, 0.0693, 0.0122, 0.0046).

We see that with increasing time t the relative increment of the size of the population approaches the eigenvalue λ1, and the relative distribution of the components in the population approaches the components of the normed eigenvector associated with the eigenvalue λ1. Every non-negative matrix that has non-zero elements at the same positions as A is primitive; the evolution of the population thus necessarily approaches this stable structure.

t   x1           x2          x3         x4          x5         x6         n(t)
0   0.00         0.00        0.00       0.00        0.00       1.00       1.00
1   322.39       0.00        3.45       30.17       0.86       0.00       356.87
2   0.00         311.43      4.62       9.87        10.25      1.34       337.50
3   432.13       0.00        8.31       43.37       5.46       7.91       497.18
4   2550.50      417.44      33.93      253.07      22.13      5.09       3282.16
5   1641.69      2463.78     59.13      235.96      91.78      22.42      4514.76
6   7227.10      1585.88     130.67     751.37      107.84     74.26      9877.12
7   23941.29     6981.37     382.20     2486.25     328.89     98.16      34218.17
8   31646.56     23127.29    767.29     3768.67     954.73     303.85     60568.39
9   97958.56     30570.58    1786.27    10381.63    1627.01    802.72     143126.78
10  258788.42    94627.97    4570.24    27597.99    4358.70    1459.04    391402.36
11  470376.19    249989.61   9912.57    52970.28    10991.08   3903.78    798143.52
12  1258532.41   454383.40   23314.10   134915.73   22317.98   9461.62    1902925.24
13  3050314.29   1215742.31  56442.70   329291.15   55891.57   19841.54   4727523.56
14  6396675.73   2946603.60  127280.49  705398.22   133660.97  49492.37   10359111.38
15  15955747.76  6179188.75  299182.59  1721756.52  293816.44  116469.89  24566161.94

t   x1/n   x2/n   x3/n   x4/n   x5/n   x6/n   n(t+1)/n(t)
0   0.000  0.000  0.000  0.000  0.000  1.000  356.868
1   0.903  0.000  0.010  0.085  0.002  0.000  0.946
2   0.000  0.923  0.014  0.029  0.030  0.004  1.473
3   0.869  0.000  0.017  0.087  0.011  0.016  6.602
4   0.777  0.127  0.010  0.077  0.007  0.002  1.376
5   0.364  0.546  0.013  0.052  0.020  0.005  2.188
6   0.732  0.161  0.013  0.076  0.011  0.008  3.464
7   0.700  0.204  0.011  0.073  0.010  0.003  1.770
8   0.522  0.382  0.013  0.062  0.016  0.005  2.363
9   0.684  0.214  0.012  0.073  0.011  0.006  2.735
10  0.661  0.242  0.012  0.071  0.011  0.004  2.039
11  0.589  0.313  0.012  0.066  0.014  0.005  2.384
12  0.661  0.239  0.012  0.071  0.012  0.005  2.484
13  0.645  0.257  0.012  0.070  0.012  0.004  2.191
14  0.617  0.284  0.012  0.068  0.013  0.005  2.371
15  0.650  0.252  0.012  0.070  0.012  0.005

Table 1. Modelled evolution of the population of teasels Dipsacus sylvestris: sizes of the individual components of the population, the total size of the population, the relative distribution of the individual components, and the relative increments of the size.
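Table 1 is easy to regenerate from the matrix; the following short numpy sketch reproduces the totals n(t) and the dominant eigenvalue.

    # Fifteen years of the teasel model, starting from one blossoming plant.
    import numpy as np

    A = np.array([[0.0,   0.0,   0.0,   0.0,   0.0,   322.388],
                  [0.966, 0.0,   0.0,   0.0,   0.0,   0.0],
                  [0.013, 0.010, 0.125, 0.0,   0.0,   3.448],
                  [0.007, 0.0,   0.125, 0.238, 0.0,   30.170],
                  [0.008, 0.0,   0.038, 0.245, 0.167, 0.862],
                  [0.0,   0.0,   0.0,   0.023, 0.750, 0.0]])

    x = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0])    # one blossoming flower
    for t in range(1, 16):
        x = A @ x
        print(t, x.sum())        # n(1) = 356.868, ..., n(15) ~ 2.46e7

    print(max(np.linalg.eigvals(A).real))           # ~2.3339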
3.61. Nonlinear model of a population. Investigate in detail the evolution of the population for the nonlinear model (1.12) from the textbook, with K = 1 and

i) rate of growth r = 1 and the initial state p(1) = 0.2,
ii) rate of growth r = 1 and the initial state p(1) = 2,
iii) rate of growth r = 1 and the initial state p(1) = 3,
iv) rate of growth r = 2.2 and the initial state p(1) = 0.2,
v) rate of growth r = 3 and the initial state p(1) = 0.2.

Compute the first few members and predict the future evolution of the population.

Solution. i) The first members of the sequence p(n) are in the following table; from there we can see that the size of the population converges to the value 1.

n  p(n)
1  0.2
2  0.36
3  0.5904
4  0.83222784
5  0.971852502
6  0.999207718
7  0.999999372

ii) For the initial value p(1) = 2 we obtain p(2) = 0, and after that the population does not change.

iii) For p(1) = 3 we obtain

n  p(n)
1  3
2  -3
3  -15
4  -255
5  -65535

and from there we see that the population decreases beyond all bounds.

iv) For the rate of growth r = 2.2 and the initial state p(1) = 0.2 we obtain

n   p(n)
1   0.2
2   0.552
3   1.0960512
4   0.864441727
5   1.122242628
6   0.820433675
7   1.144542647
8   0.780585155
9   1.157383491
10  0.756646772
11  1.161738128
12  0.748363958
13  1.162657716
14  0.74660417

We see that instead of converging, the population oscillates: after some time it jumps between the values (roughly) 1.16 and 0.75.

v) For the rate of growth r = 3 and the initial state p(1) = 0.2 we obtain

n   p(n)
1   0.2
2   0.68
3   1.3328
4   0.00213248
5   0.008516278
6   0.033847529
7   0.131953152
8   0.475577705
9   1.223788359
10  0.402179593
11  1.123473097
12  0.707316989
13  1.328375987
14  0.019755658
15  0.077851775
16  0.293224403
17  0.91495596
18  1.148390614
19  0.63715945
20  1.330721306
21  0.010427642
22  0.041384361
23  0.160399447

In this case the situation is more complicated: the population starts oscillating among more values, and to see among which values we would have to compute further members. □
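The model (1.12) itself is not restated here, but the computed values above determine the recurrence p(n+1) = p(n) + r·p(n)·(1 - p(n)/K); the following small Python sketch iterates exactly this relation and reproduces the tables.

    # Iteration of p(n+1) = p(n) + r p(n) (1 - p(n)/K), K = 1.
    def iterate(r, p, K=1.0, steps=14):
        values = [p]
        for _ in range(steps - 1):
            p = p + r * p * (1 - p / K)
            values.append(p)
        return values

    print(iterate(1.0, 0.2, steps=7))   # 0.2, 0.36, 0.5904, ... -> 1
    print(iterate(2.2, 0.2))            # settles into the 2-cycle ~1.16 / ~0.75
    print(iterate(3.0, 0.2, steps=10))  # the more complicated oscillation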
3.62. In a lab, an experiment is carried out repeatedly; the first run succeeds or fails with the same probability 1/2. If a run succeeds, the probability of success of the next run is 0.7; if it fails, the probability of success of the next run is only 0.6. This goes on, that is, after every success the next run succeeds with probability 0.7, and after every failure with probability 0.6. For any n ∈ N, determine the probability that the n-th experiment is successful.

Solution. Let us introduce the probabilistic vector x_n = (x_n^1, x_n^2)^T, n ∈ N, where x_n^1 is the probability of success of the n-th experiment and x_n^2 = 1 - x_n^1 the probability of its failure. According to the statement, x_1 = (1/2, 1/2)^T and clearly also

x_2 = ( 0.7 0.6 ; 0.3 0.4 )·(1/2, 1/2)^T = (13/20, 7/20)^T.

Using the notation

T = ( 7/10 3/5 ; 3/10 2/5 ),

it holds that

(3.7) x_{n+1} = T·x_n, n ∈ N,

because the probabilistic vector x_{n+1} depends only on x_n, and this dependence is the same for every n. From the relation (3.7) we directly have

(3.8) x_{n+1} = T·T·x_{n-1} = ... = T^n·x_1, n ≥ 2, n ∈ N.

Therefore we express T^n, n ∈ N. This is a Markov process, so 1 is an eigenvalue of the matrix T. The second eigenvalue, 0.1, follows for instance from the fact that the trace (the sum of the elements on the diagonal) equals the sum of the eigenvalues (every eigenvalue counted with its algebraic multiplicity). To these eigenvalues correspond the eigenvectors (2, 1) and (1, -1), respectively. We thus obtain

T = ( 2 1 ; 1 -1 )( 1 0 ; 0 1/10 )( 2 1 ; 1 -1 )⁻¹,

that is, for n ∈ N,

T^n = ( 2 1 ; 1 -1 )( 1 0 ; 0 10^{-n} )( 2 1 ; 1 -1 )⁻¹.

Substituting

( 2 1 ; 1 -1 )⁻¹ = 1/3 ( 1 1 ; 1 -2 )

and multiplying yields

T^n = 1/3 ( 2 + 10^{-n}  2 - 2·10^{-n} ; 1 - 10^{-n}  1 + 2·10^{-n} ).

From there, from (3.7) and from (3.8) it follows that

x_{n+1} = ( 2/3 - 1/(6·10^n), 1/3 + 1/(6·10^n) )^T, n ∈ N.

In particular, we see that for big n the probability of success of the n-th experiment is close to 2/3. □
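The convergence to 2/3 shows up after a handful of iterations; a minimal numpy sketch:

    # Success probability of the n-th experiment in 3.62.
    import numpy as np

    T = np.array([[0.7, 0.6],
                  [0.3, 0.4]])
    x = np.array([0.5, 0.5])      # the first experiment: 50/50

    for n in range(1, 8):
        print(n, x[0])            # 0.5, 0.65, 0.665, 0.6665, ... -> 2/3
        x = T @ x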
3.63. A student living in a student dormitory is very "socially tired" (as a result, he is not able to fully perceive the universe around him and to coordinate his movements). In this state he decides to invite to the party-in-progress his friend, who lives at one end of the hall. At the other end of the hall, however, lives somebody he definitely does not want to invite. But he is so "tired" that he carries out the decision to make a step in the desired direction only in 53 of 100 attempts (in the remaining 47 he makes a step in exactly the opposite direction). Assuming that he starts in the middle of the hall and that the distance to each of the doors at the two ends corresponds to twenty of his awkward steps, determine the probability that he first reaches the desired door. ○
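Before solving this analytically (see the solutions at the end of the chapter), one can estimate the answer by simulation; the following Monte Carlo sketch in plain Python agrees with the closed-form result 0.917.

    # Monte Carlo estimate for the random walk of exercise 3.63.
    import random

    def reaches_friend(p=0.53, distance=20):
        position = 0
        while abs(position) < distance:
            position += 1 if random.random() < p else -1
        return position > 0

    random.seed(1)
    trials = 100_000
    hits = sum(reaches_friend() for _ in range(trials))
    print(hits / trials)    # ~0.917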
3.64. Let n ∈ N persons play the so-called "silent post". For simplicity, assume that the first person whispers to the second exactly one (arbitrarily chosen) of the words "yes", "no". The second person then whispers to the third the one of the words "yes", "no" that they think the first person whispered; this continues up to the n-th person. If the probability that the word gets changed (on purpose or accidentally) into the other word during one transmission equals p ∈ (0, 1), determine, for big n ∈ N, the probability that the n-th person correctly receives the word transmitted by the first one.

Solution. We can view the problem as a Markov chain with the two states Yes and No; the process is in the state Yes at time m ∈ N if the m-th person thinks that the received word is "yes". For the order of the states Yes, No, the probabilistic transition matrix is

T = ( 1-p  p ; p  1-p ).

The product of the matrix T^{m-1} with the probabilistic vector of the initial choice of the first person then gives the probabilities of what the m-th person thinks. We do not have to compute the powers of this matrix: all the elements of T are positive, and moreover the matrix is doubly stochastic. Thus we know that for big n ∈ N the probabilistic vector is close to the vector (1/2, 1/2)^T. The probability that the n-th person says "yes" is thus approximately the same as the probability that they say "no", independently of the initial word. For a big number of participants it thus holds that roughly half of them hear "yes" (we repeat that this does not depend on the initial word).

For completeness, let us determine the result under the assumption that the probability of the change from "yes" to "no" equals p ∈ (0, 1) for every person, while the probability of the change from "no" to "yes" equals a (in general different) q ∈ (0, 1). In this case, for the same order of the states, we obtain the probabilistic transition matrix

T = ( 1-p  q ; p  1-q ),

which leads (for big n ∈ N) to a probabilistic vector close to

( q/(p + q), p/(p + q) )^T;

this follows, for instance, from the expression of the matrix T^n. Again, with sufficiently many people the result does not depend on the initial choice of the word. Simply speaking, in this model the people themselves decide what the transmitted information is; more precisely, the people themselves determine the frequencies of occurrence of "yes" and "no", if there are enough of them (and no checking is present).

Let us further add that the obtained result was confirmed experimentally. In a psychological experiment, an individual was repeatedly exposed to an event that could be interpreted in two ways, in time intervals which ensured that the subject still remembered the previous event; see for instance T. Havránek et al.: Matematika pro biologické a lékařské vědy, Praha, Academia 1981, where an ambiguous object (say, a drawing of a cube which can be perceived from both the bottom and the top) is lighted in fixed time intervals. Such a process is a Markov chain with the transition matrix

T = ( 1-p  q ; p  1-q ), where p, q ∈ (0, 1). □

3.65. In a certain game you can choose one of two opponents. The probability that you beat the better one is 1/4, while the probability that you beat the worse one is 1/2. But the opponents cannot be distinguished, so you do not know which one is the better one. You await a big number of games (and for each of them you can choose a different opponent), and of course you want to reach as large a winning ratio as possible. Consider these two strategies:

1. For the first game, choose the opponent randomly. If you win a game, carry on with the same opponent; if you lose it, change the opponent.
2. For the first two games, choose (one) opponent randomly. Then, for the next two games, change the opponent if you lost both previous games, otherwise stay with the same one.

Which of the strategies is better?

Solution. Both strategies give a Markov chain. For simplicity, denote the worse opponent by A and the better one by B. In the first case, for the states "game with A" and "game with B" (in this order) we obtain the probabilistic transition matrix

T = ( 1/2 3/4 ; 1/2 1/4 ).

This matrix has all elements positive, so it suffices to find the probabilistic vector x_∞ associated with the eigenvalue 1. It holds that

x_∞ = ( 3/5, 2/5 )^T.

Its components correspond to the probabilities that after a long row of games the opponent is the player A or the player B; thus we can expect that 60 % of the games will be played against the worse of the opponents. Because

2/5 = 3/5 · 1/2 + 2/5 · 1/4,

roughly 40 % of the games will be winning ones.

For the second strategy, let us use the states "two games in a row with A" and "two games in a row with B", which lead to the probabilistic transition matrix

T = ( 3/4 9/16 ; 1/4 7/16 ).

We easily determine that now

x_∞ = ( 9/13, 4/13 )^T.

Against the worse opponent we would then play (9/4)-times more frequently than against the better one; recall that for the first strategy it was (3/2)-times. The second strategy is thus better. Let us also note that with the second strategy roughly 42.3 % of the games are winning, since

0.423 ≐ 11/26 = 9/13 · 1/2 + 4/13 · 1/4. □
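The two stationary distributions and winning ratios can be confirmed numerically; a short numpy sketch (our illustration):

    # Long-run winning ratios for the two strategies of 3.65.
    import numpy as np

    def stationary(T):
        # probabilistic eigenvector of a column-stochastic matrix for eigenvalue 1
        vals, vecs = np.linalg.eig(T)
        v = vecs[:, np.argmin(abs(vals - 1))].real
        return v / v.sum()

    T1 = np.array([[0.5, 0.75],
                   [0.5, 0.25]])
    T2 = np.array([[0.75, 9/16],
                   [0.25, 7/16]])
    win = np.array([0.5, 0.25])     # chance to beat A resp. B

    for T in (T1, T2):
        x = stationary(T)
        print(x, win @ x)           # (3/5, 2/5) -> 0.40 ; (9/13, 4/13) -> ~0.423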
3.66. Petr regularly meets his friend. He is, however, "well known" for his bad timekeeping. He is trying to change: if he was late for the last meeting, then in half of the cases he comes on time and in one tenth of the cases he even comes earlier than he should. But if he was on time or early for the last meeting, he returns to his "carelessness" and comes late with probability 0.8, on time with probability only 0.2. What is the probability that he comes late to the 20th meeting, if he was on time for the eleventh?

Solution. It is clearly a Markov process with the states "Petr came late", "Petr came on time", "Petr came early" (in this order) and the probabilistic transition matrix

T = ( 0.4 0.8 0.8 ; 0.5 0.2 0.2 ; 0.1 0 0 ).

The eleventh meeting is described by the probabilistic vector (0, 1, 0)^T (we know for sure that Petr came on time). To the twentieth meeting corresponds the vector

T⁹·(0, 1, 0)^T = (0.571578368, 0.371316224, 0.057105408)^T.

The desired probability is thus 0.571578368 (exactly). Let us add that

T⁹ = ( 0.571316224 0.571578368 0.571578368 ;
       0.371512832 0.371316224 0.371316224 ;
       0.057170944 0.057105408 0.057105408 ).

From there we see that it (almost) does not matter whether he came late to the eleventh meeting (first column), on time (second) or early (third). □

3.67. Two students A and B spend every Monday morning playing a certain computer game. The winner then pays for both of them in the restaurant in the evening. The game can also end in a draw; then each pays for a half. The result of the previous game partially determines the next one. If student A won a week ago, then he wins again with probability 3/4, and the game is a draw with probability 1/4. A draw is repeated with probability 2/3, and with probability 1/3 the next game is won by B. If student B won a game, then he wins again with probability 1/2, and student A wins the next game with probability 1/4. Determine the probability that today each of them pays half of the costs, if the first game, played a long time ago, was won by A.

Solution. We are given a Markov process with the states "student A wins", "the game ends in a draw", "student B wins" (in this order) and the probabilistic transition matrix

T = ( 3/4 0 1/4 ; 1/4 2/3 1/4 ; 0 1/3 1/2 ).

We want to find the probability of the transition from the first state to the second after a big number n ∈ N of steps (weeks). The matrix T is primitive, because

T² = ( 9/16 1/12 5/16 ; 17/48 19/36 17/48 ; 1/12 7/18 1/3 ).

It thus suffices to find the probabilistic eigenvector x_∞ of T associated with the eigenvalue 1. It is easy to compute that

x_∞ = ( 2/7, 3/7, 2/7 )^T.

We know that for big n the probabilistic vector differs only very little from x_∞, independently of the initial state; that is, for big n ∈ N we can set

T^n ≈ ( 2/7 2/7 2/7 ; 3/7 3/7 3/7 ; 2/7 2/7 2/7 ).

The desired probability is the element of this matrix in the second row and first column (the second component of the vector x_∞). Thus we have (quite quickly) found the result 3/7. □
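The ninth matrix power from 3.66 takes one line with numpy:

    # T^9 applied to the eleventh-meeting state "on time".
    import numpy as np

    T = np.array([[0.4, 0.8, 0.8],
                  [0.5, 0.2, 0.2],
                  [0.1, 0.0, 0.0]])

    T9 = np.linalg.matrix_power(T, 9)
    print(T9 @ np.array([0.0, 1.0, 0.0]))
    # [0.571578368 0.371316224 0.057105408]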
Solutions of the exercises

3.2. The daily diet should contain 3.9 kg of hay and 4.3 kg of oats. The costs per foal are then 13.82 Kč.

3.13. x_n = 2√3 sin(n·π/6) - 4 cos(n·π/6).

3.14. x_n = -3(-1)^n - 2 cos(n·2π/3) - 2√3 sin(n·2π/3).

3.15. x_n = (-1)^n (-2n² + 8n - 7).

3.24. The Leslie matrix of the given model is (denoting the mortality of the first group by a)

( 0 2 2 ; a 0 0 ; 0 1 0 ).

The stagnation condition corresponds to the matrix having the eigenvalue 1, that is, to the polynomial λ³ - 2aλ - 2a having 1 as its root; hence a = 1/4.

3.27. The matrix has the dominant eigenvalue 1 with the corresponding eigenvector (6/5, 1). Because the eigenvalue is dominant, the ratio of the viewers stabilises at 6 : 5.

3.30. As in (3.29), the game ends after three bets at the latest; thus all the powers of A, starting with A³, are identical:

A³ = ( 1 7/8 3/4 1/2 0 ;
       0 0 0 0 0 ;
       0 0 0 0 0 ;
       0 0 0 0 0 ;
       0 1/8 1/4 1/2 1 ).

3.40. We can use the result of the exercise called "Ruining of the player". By that exercise, the probability that the first department is cancelled equals

( 1 - (0.46/(1 - 0.46))⁵ ) / ( 1 - (0.46/(1 - 0.46))²⁵ ) ≐ 0.56.

It was enough to substitute p = 1 - 0.54, y = 10/2 and x = 40/2 into (3.6). It is thus wiser to choose the smaller department.

3.50. • The claim holds: setting B := A^T A, we have b_ij = (i-th row of A^T) · (j-th column of A) = (i-th column of A) · (j-th column of A) = b_ji.
• The claim does not hold. Consider for instance A = ( 1 1 ; 0 1 ).

3.52. ( 1 1 0 ; 1 -1 2 ; -1 1 -1 ) = ( 1 0 0 ; 1 1 0 ; -1 -1 1 )( 1 1 0 ; 0 -2 2 ; 0 0 1 ).

3.63. Again, this is a special case of the "Ruining of the player"; it suffices to reformulate the statement accordingly. For p = 0.47, y = 20 and x = 20, (3.6) gives the result

( 1 - (0.47/(1 - 0.47))²⁰ ) / ( 1 - (0.47/(1 - 0.47))⁴⁰ ) ≐ 0.917.

CHAPTER 4

Analytic geometry

position, incidence, projection - and we return to matrices again...

Now we come back to the view on geometry we had when studying the positions of points in the plane in the fifth part of the first chapter, c.f. 1.23. First we will be interested in the properties of objects in the Euclidean space delimited by points, straight lines, planes etc. The essential point will be to clarify how their properties are related to the notion of vectors, and whether they depend on the notion of the length of vectors. In the next part, we will use linear algebra for the study of objects which are defined in a nonlinear way; to do this we will need a little bit more from the theory of matrices again. The results will be important later on, when discussing the techniques for optimization, i.e. the search for extrema of functions. At the end of this chapter we show how the projectivization of affine spaces helps us to reach simplicity and stability of the algorithms typical for computer graphics.

1. Affine and euclidean geometry

While clarifying the structure of solutions of systems of linear equations in the first part of the previous chapter, we found out in paragraph 3.1 that the solutions of non-homogeneous systems do not form vector spaces, but always arise by adding the vector space of solutions of the corresponding homogeneous system to one particular solution. On the other hand, the difference of any two solutions of the non-homogeneous system is always a solution of the homogeneous one. This behaviour is similar to the behaviour of linear difference equations, as we have already seen in paragraph 3.14.

4.1. Affine spaces. A direction how to deal with the theory is given already by the discussion of the geometry of the plane, c.f. paragraph 1.25 and further. There we described straight lines and points as sets of solutions of systems of linear equations. Every line was considered a one-dimensional subspace, although its points were described by two coordinates. Parametrically, the line was defined by the sum of a single point (i.e. a pair of coordinates) and multiples of a fixed direction vector. Now we will proceed in the same way in arbitrary dimension.

A. Affine geometry

4.1. Find a parametric equation of the line in R³ given by the equations

x - 2y + z = 2,
2x + y - z = 5.

Solution. It is obviously sufficient to solve the system of equations, but we can also use a different approach: we need a non-zero direction vector orthogonal to both normal vectors (1, -2, 1), (2, 1, -1), and the cross product

(1, -2, 1) × (2, 1, -1) = (1, 3, 5)

gives us such a vector. We notice that the triple (x, y, z) = (2, -1, -2) satisfies the system, and we obtain the solution

[2, -1, -2] + t(1, 3, 5), t ∈ R. □
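A quick numerical confirmation of 4.1 with numpy:

    # Direction of the line as the cross product of the two normals.
    import numpy as np

    n1 = np.array([1.0, -2.0, 1.0])
    n2 = np.array([2.0, 1.0, -1.0])
    print(np.cross(n1, n2))      # [1. 3. 5.]

    # the particular point [2, -1, -2] satisfies both equations:
    P = np.array([2.0, -1.0, -2.0])
    print(n1 @ P, n2 @ P)        # 2.0 5.0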
Standard affine space

The standard affine space A_n is the set of all points in R^n = A_n together with the operation which to a point A = (a_1, ..., a_n) ∈ A_n and a vector v = (v_1, ..., v_n) ∈ R^n = V assigns the point

A + v = (a_1 + v_1, ..., a_n + v_n) ∈ A_n.

4.2. A plane in R⁴ is given by its parametric equation

ρ : [0, 3, 2, 5] + t(1, 0, 1, 0) + s(2, -1, -2, 2), t, s ∈ R.

Find its implicit equation.

Solution. Our task is to find a system of equations in the 4 variables x, y, z, u (because the dimension of the space is 4) which is satisfied exactly by the coordinates of the points of the plane. Note that the sought system must contain 2 = 4 - 2 linearly independent equations. We solve the problem by the so-called elimination of parameters. The points [x, y, z, u] ∈ ρ satisfy

x = t + 2s,  y = 3 - s,  z = 2 + t - 2s,  u = 5 + 2s,

where t, s ∈ R. We can express this system by the matrix

( 1  2 | -1  0  0  0 |  0 ;
  0 -1 |  0 -1  0  0 |  3 ;
  1 -2 |  0  0 -1  0 |  2 ;
  0  2 |  0  0  0 -1 |  5 ),

where the first two columns contain the direction vectors of the plane, followed by the negative of the identity matrix, and the last column holds the coordinates of the point [0, 3, 2, 5]; each row represents one of the equations above with all the unknowns t, s, x, y, z, u moved to one side. We transform the matrix by elementary row operations so as to obtain as many rows as possible which are zero in the first two columns. Adding (-1)-times the first row and (-4)-times the second row to the third row, and twice the second row to the fourth row, we obtain

( 1  2 | -1  0  0  0 |   0 ;
  0 -1 |  0 -1  0  0 |   3 ;
  0  0 |  1  4 -1  0 | -10 ;
  0  0 |  0 -2  0 -1 |  11 ),

which implies the result

x + 4y - z = 10,
2y + u = 11.

The coefficients standing to the right of the first two columns, in the rows which are zero there, are the coefficients of the general equations of the plane. Note that if we expressed the original system by the matrix

( 1 0 0 0 | 1  2 | 0 ;
  0 1 0 0 | 0 -1 | 3 ;
  0 0 1 0 | 1 -2 | 2 ;
  0 0 0 1 | 0  2 | 5 ),

where x, y, z, u remain on the left-hand side of the equations, a similar transformation gives

(  1  0 0 0 | 1  2 |   0 ;
   0  1 0 0 | 0 -1 |   3 ;
  -1 -4 1 0 | 0  0 | -10 ;
   0  2 0 1 | 0  0 |  11 ),

that is, the result

-x - 4y + z = -10,
2y + u = 11.

When expressing a system by a matrix, it is important to keep track of whether a vertical line separates the left-hand sides from the right-hand sides. As we saw in this exercise, the parameter elimination method can be long-winded, and it is not difficult to make a mistake along the way.

Another solution. All we in fact wanted are two linearly independent normal vectors, i.e. vectors perpendicular to (1, 0, 1, 0) and (2, -1, -2, 2). If we "guess" that these vectors could be, for example, (0, 2, 0, 1) and (-1, 0, 1, 2), then substituting x = 0, y = 3, z = 2, u = 5 into the equations 2y + u = a, -x + z + 2u = b we get a = 11, b = 12, and the sought implicit expression is

2y + u = 11,
-x + z + 2u = 12. □
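The same elimination can be done numerically: the normal vectors of the plane form the kernel of the matrix whose rows are the direction vectors, which one reads off an SVD. The resulting (orthonormal) normals differ from the hand-picked ones above, but they describe the same plane.

    # Implicit equations of the plane of 4.2 via the null space.
    import numpy as np

    D = np.array([[1.0, 0.0, 1.0, 0.0],
                  [2.0, -1.0, -2.0, 2.0]])
    P = np.array([0.0, 3.0, 2.0, 5.0])

    _, _, Vt = np.linalg.svd(D)
    for n in Vt[2:]:                       # the last two right singular vectors
        print(n, "* (x,y,z,u) =", n @ P)   # one implicit equation each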
For instance, we consider the affine plane as an unbounded board without chosen coordinates but with the possibility to move about a given vector. When we switch to such abstract view, we will be able to discuss the "plane geometry" for two-dimensional subspaces, i.e. planes in higher-dimensional spaces, the geometry of "Euclidean space" for three-dimensional subspaces etc., without the need to work with ^-tuples of coordinates. This point of view is present in the following definition: 4.2. Definition. The affine space A with the difference space V is a set of points V, together with the map V (A, v) where V is a vector space and our map satisfies the properties (l)-(3) from the definition of the standard affine space. So for a fixed vector i; e V we get a translation rv : A -> A as the restricted map tv : V ~ V x {v} -* V, A\-^ A + v. By the dimension of an affine space A, we mean the dimension of its difference space. In sequel we do not distinguish accurately between denoting the set of points A and the set of vectors V, we talk about points and vectors of the affine space A instead. It follows immediately form the axioms that for arbitrary points A, B, C in the affine space A (4.1) (4.2) (4.3) A - A = 0 e V B — A — -(A - B) (C - B) + (B - A) = C - A. Indeed, (4.1) follows from the fact that A+0 — 0 and that such vector is unique (the first and the third defining property). By adding 208 CHAPTER 4. ANALYTIC GEOMETRY where x,y,z,u remains on the left-hand side of the equations, similar transformation / 1 0 0 0 0 10 0 0 0 10 V 0 0 0 1 gives us the result 1 2 0 \ / 1 0 0 0 1 2 0 \ 0 -1 3 0 1 0 0 0 -1 3 1 -2 2 -l -4 1 0 0 0 -10 0 2 5 ) V 0 2 0 1 0 0 11 / 4y 2y + + u 10, 11. When expressing system as a matrix, it is important to take into consideration whether the vertical line separates left-hand side from right-hand side. As we saw in this exercise, parameter eUmination method can be long-winded and it is not difficult to make a mistake along the way. Another solution All we wanted to obtain in fact, are two linearly independent normal vectors, i.e. vectors perpendicular to (1, 0, 1, 0), (2,-1, —2, 2). If we "guessed" that these vectors could be for example (0, 2, 0, 1), (-1, 0, 1, 2), inputting x = 0, y = 3, z = 2, u = 5 to the equations 2y + u = a, —x + z + 2u = b we get a = 11, b = 12, and the sought implicit expression is 2y + u = 11, + 2u = 12. + z u 2u □ 4.3. Find a parametric equation of the plane passing through points A = [2, 1, 1], £ = [3,4, 5], C = [4, -2, 3]. Then find a parametric equation of the open half-plane containing the point C and bounded by line going through the points A, B. Solution. We need one point and two (linearly independent) vectors lying in this plane for the parametric equation of the plane. It is enough to choose the point A and vectors 5 — A = (1,3, 4) and C — A = (2, —3, 2), which are obviously independent. A point [x, y, z] lies in the plain if and only if there exist numbers t, s € R so that x =2 + 1-t+2-s, y = l + 3- t - 3-s, z = 1 + 4 ■ t + 2 ■ s; which means the parametric equation is [2, 1, 1] + t (1, 3, 4) + s (2, -3, 2), t, s e R. Setting s = 0 gives us a line passing through points A, B. For t = 0, s > 0 we get a ray with initial point A and passing through C. Particular but arbitrarily choosen t e R and variable s > 0 gives us a ray initiated on the border line and going through the half-plane in successively B — A and A — B to A, according to the second defining property we obtain obviously A again. 
So we added the null vector which proves (4.2). Similarly, (4.3) follows from the defining property 4.1 (2) and the uniqueness. Let us remark that the choice of one fixed point Aq e A determines a bijection between V and A. So for a fixed basis u in V we get for every point A e A a unique expression A = Aq + x\u\ + ■ ■ ■ + x„u„. We talk about an affine coordinate system (Aq; u\,..., un) given by the origin of the affine coordinate system Aq and the basis u of the corresponding difference space, or also about an affine frame (Aq, w). We can summarize the situation as follows: Affine coordinates of a point A in the frame (Aq, u) are the coordinates of the vector A — Aq in the basis u of the difference space V. The choice of an affine coordinate system identifies each n-dimensional affine space A with the standard affine space A„. 4.3. Affine subspaces. If we choose only such points in A which have some of in advance chosen coordinates equal to zero (for instance the last one), we obtain again a set which behaves as an affine space. Indeed, this is the spirit of the following definition of the so called affine Ma subspaces. Subspaces of an affine space Definition. The nonempty subset Q c A of an affine space A with a difference space V is called an affine subspace in A if the subset W = {B — A; A, B e Q} c Visa vector subspace and for any A e Q, v e W we have A + v e Q. It is important to include both of the conditions in the definition since it is easy to find examples of sets which satisfy the first condition but not the second one. Have a think about a straight line in the plane with one removed point. For an arbitrary set of points M c A in an affine space with a difference space V, we define the vector space Z(M) = ({B — A; B, A e M}) C V of all vectors generated by the differences of points in M. In particular, V = Z(A) and every affine subspace Q c A itself satisfies the axioms for an affine space with the difference space Z(Q). Directly from the definitions we also get that the intersection of any set of affine subspaces is either an affine subspace or the empty set. The affine subspace (M) in A generated by a nonempty set M c A is the intersection of all affine subspaces which contain all points of the subset M. „ Affine hull and parametric description of a subspace \^ The affine subspaces can be nicely described by their difference spaces if we choose a point Aq e M in a generating set M. Indeed, we get (M) = {Aq + v; v e Z(M) c Z(A)}, i.e. to generate the affine subspace we take the vector subspace Z(M) in the difference space generated by all differences of points in M, and we add this vector space to an arbitrary point in M. We talk also about the affine hull of the set of points M in A. 209 CHAPTER 4. ANALYTIC GEOMETRY which point C lies. That means that the sought open half-plane can be expressed parametrically as □ [2, 1,1] + ? (1, 3, 4) + s (2, -3, 2), «el,s> 0. 4.4. Determine relative position of lines p : [1,0, 3] + t (2,-1,-3), (el, q : [1,1, 3] + s (1,-1,-2), s el. Solution. We will find common points of given lines (subspaces intersection). We get a system 1 + 2t = 1 + s, 0 - t = 1 - s, 3 - 3t = 3 - 2s. >From the first two equations we get that t = 1, s = 2. However, this does not satisfy the third equation. The system does not have a solution. Direction vector (2, —1, —3) of the line p is not a multiple of direction vector (1, — 1, —2) of the line q which means that the lines are not parallel. Hence, they are skew lines. □ 4.5. 
The convex hulls of finite sets are called convex polyhedrons. If and only if the vertices A_0, ..., A_k defining the convex polyhedron are in a general position, we get a k-dimensional simplex. In the case of a simplex, the expression of any of its points as an affine combination of the defining vertices is unique.

A specific example are the convex polyhedrons defined by one point and a finite number of vectors: let u_1, ..., u_k be arbitrary vectors in the difference space W, and A ∈ A_n a point. A parallelepiped V_k(A; u_1, ..., u_k) ⊂ A_n is the set

V_k(A; u_1, ..., u_k) = {A + c_1 u_1 + ... + c_k u_k; 0 ≤ c_i ≤ 1}.

If the vectors u_1, ..., u_k are independent, we talk about a k-dimensional parallelepiped V_k(A; u_1, ..., u_k) ⊂ A_n. It is obvious from the definition that parallelepipeds are convex; in fact, they are the convex hulls of their vertices.

4.12. Transformation of point coordinates. The coordinates of a point X are expressed as [2, 2, 3] with respect to the affine frame {[1, 2, 3]; (1, 1, 1), (1, -1, 2), (2, 1, 1)} in R³. Determine its coordinates in the standard frame, i.e. with respect to {[0, 0, 0]; (1, 0, 0), (0, 1, 0), (0, 0, 1)}.

Solution. The coordinates [2, 2, 3] with respect to the frame {[1, 2, 3]; (1, 1, 1), (1, -1, 2), (2, 1, 1)} give us the point

[1, 2, 3] + 2·(1, 1, 1) + 2·(1, -1, 2) + 3·(2, 1, 1) = [11, 5, 12],

which are the coordinates of the point X in the standard frame. □

4.13. Transformation of a mapping. Express an affine mapping f of R², given in the standard coordinates, in the coordinate system with the basis u = {(1, 1), (-1, 1)} and the origin [2, 0].

Solution. The change of basis matrix from the basis u to the standard basis is

T = ( 1 -1 ; 1 1 ).

We get the expression of the mapping in the frame ([2, 0], u) by first transforming the coordinates from the frame ([2, 0], u) to the standard frame, i.e. to the frame ([0, 0]; (1, 0), (0, 1)), then applying the mapping f in the standard coordinates, and finally transforming back to the coordinates in the frame ([2, 0], u). The transformation equations for the change from the coordinates y_1, y_2 in the frame ([2, 0], u) to the standard coordinates x_1, x_2 read

x_1 = y_1 - y_2 + 2,
x_2 = y_1 + y_2;

hence the sought expression of f is the composition of this substitution, of f itself, and of the inverse substitution. □
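Both directions of the coordinate change from 4.12 take two lines with numpy:

    # Affine frame -> standard coordinates and back (data of 4.12).
    import numpy as np

    origin = np.array([1.0, 2.0, 3.0])
    M = np.column_stack([[1.0, 1.0, 1.0],    # frame basis vectors as columns
                         [1.0, -1.0, 2.0],
                         [2.0, 1.0, 1.0]])
    y = np.array([2.0, 2.0, 3.0])            # coordinates in the affine frame

    x = origin + M @ y
    print(x)                                 # [11.  5. 12.]
    print(np.linalg.solve(M, x - origin))    # back again: [2. 2. 3.]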
4.14. Let a standard coordinate system be given in the space R³. Agent K resides at the point S with coordinates [0, 1, 2], and the headquarters gave him a coordinate system with the origin S and the basis {(1, 1, 0), (-1, 0, 1), (0, 1, 2)}. Agent Bond resides at the point D with coordinates [1, 1, 1] and uses the coordinate system with the origin D and the basis {(0, 0, 1), (-1, 1, 2), (1, 0, 1)}. Agent K has set an appointment with agent Bond in the old brickfield, which in K's coordinate system lies at the point [1, 1, 0]. Where should Bond come (in his own coordinate system)?

Solution. The change of basis matrix from agent K's basis to Bond's one (with the same origins) is

T = ( -4 2 -1 ; 1 0 1 ; 2 -1 1 ).

The point [1, 1, 0] of K's system thus has, relative to Bond's basis, the vector coordinates T·(1, 1, 0)^T = (-2, 1, 1)^T. Translating the origin (the vector S - D = (-1, 0, 1) has the coordinates (2, 0, -1) with respect to Bond's basis), we get the result (-2, 1, 1) + (2, 0, -1) = (0, 1, 0). □

4.10. Examples of standard affine exercises. (1) To find a parametric description of an implicitly given subspace and vice versa: Finding a particular solution of the non-homogeneous system and a fundamental system of solutions of the homogenized system, we get (in the coordinates in which the equations were set) exactly the desired parametric description. In the opposite direction, we write the parametric description in coordinates and then eliminate the free parameters t_1, ..., t_k; this yields exactly the equations defining the given subspace implicitly.

(2) To find the subspace generated by several subspaces Q_1, ..., Q_s (in general of different dimensions; e.g., to find the plane in R³ given by a line and a point, or by three points), and to describe this subspace implicitly or parametrically: The resulting subspace Q is always determined by one fixed point A_i in each subspace Q_i and by the sum of all the difference spaces, for instance

Q = A_1 + ( Z({A_1, ..., A_s}) + Z(Q_1) + ... + Z(Q_s) ).

If the subspaces are given implicitly, it is possible to convert them to the parametric form first; nevertheless, in some concrete situations other methods are advantageous. Notice that we really need to use one point from each of the subspaces: for example, two parallel lines in the plane generate the whole plane, but they share the same one-dimensional difference space.

(3) To find the intersection of the subspaces Q_1, ..., Q_s: If they are given in the implicit form, it is sufficient to unify all the equations into one system (and to leave out the linearly dependent ones). If the resulting system is unsolvable, the intersection is empty; in the opposite case we get an implicit description of the affine subspace which is the intersection we are searching for. If we are given parametric forms, we may also directly search for common points as solutions of the appropriate equations, similarly to finding intersections of vector spaces; in this way we directly get a parametric description again. If the number of subspaces is greater than two, we must search for the intersection step by step. If one of the subspaces is given parametrically and another one implicitly, it suffices to substitute the parametrized coordinates into the equations and to solve the resulting system.

(4) To find a crossbar between skew lines p, q in A_3 passing through a given point or having a given direction: By a crossbar we mean a straight line which has a nonempty intersection with both of the skew lines; the resulting crossbar r is thus a one-dimensional affine subspace. If we are given one of its points A ∈ r, then the affine subspace generated by p and A is either a straight line (if A ∈ p) or a plane (if A ∉ p). In the first case we have infinitely many solutions, one for each point of q; in the second case it suffices to find the intersection B of the plane ⟨p ∪ {A}⟩ with q, and then r = ⟨{A, B}⟩. The problem has no solution if the intersection is empty; if q ⊂ ⟨p ∪ {A}⟩, we again get infinitely many solutions; and if the intersection contains a single point, we get exactly one solution. If we are given a direction u ∈ R^n instead, i.e. the difference space of r, then we consider the subspace Q generated by p and the difference space Z(p) + ⟨u⟩ ⊂ R^n. Again, we get infinitely many solutions if q ⊂ Q; otherwise we consider the intersection of Q with q and finish as in the previous case.

The solutions of many other practical geometric problems are based mostly on a systematic use of the steps given above.
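A sketch of the crossbar computation from (4) with a prescribed direction: the common perpendicular of two skew lines (here we borrow the data of exercise 4.16 below) is obtained by solving P + t·u - (Q + s·v) = k·(u × v) for t, s, k.

    # Common perpendicular of two skew lines, solved as a 3x3 system.
    import numpy as np

    P, u = np.array([3.0, 0, 3]), np.array([0.0, 1, 2])
    Q, v = np.array([0.0, -1, -2]), np.array([1.0, 2, 3])

    w = np.cross(u, v)                    # direction of the perpendicular
    M = np.column_stack([u, -v, -w])      # t u - s v - k w = Q - P
    t, s, k = np.linalg.solve(M, Q - P)
    print(P + t * u)                      # foot on p: [3. 1. 5.]
    print(Q + s * v)                      # foot on q: [2. 3. 4.]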
For example, two parallel fines in a plane generate the whole plane but they share the same one-dimensional difference space. (3) To find the intersection of the subspaces Q\, ■ ■ ■, Qs: If they are given in the implicit form, it is sufficient to unify all equations into one system (and to leave out the linearly dependent). If the system that has arisen is insolvable, then the intersection is empty. In the opposite case, we get an implicit description of the affine subspace which is the intersection we are searching for. 214 CHAPTER 4. ANALYTIC GEOMETRY system) at point [1, 1,0]. Where should Bond come (regarding his coordinate system)? Solution. Change of basis matrix from agent K's basis to the Bond's one (with the same origins) is /-4 2 -1N T = 1 0 1 \2 -1 1 f Vector (0, 1,2) thus has coordinates T ■ (0, l,2)r = (0,2, if, by transposing origin (we add vector (—1,0, 1)) we get the result (-1,2,2). □ 4.15. Find a transversal of lines (line passing through both lines) q : [2,2,0] + f(l, 1, 1), [1, 1, 1] +t(2, 1,0), so vn&Xffl', u, 0] lies on this line. Solution. We find intersection of sought transversal with line q (denote it by Q). Transversal contains some point lying on p and the point [1, 0, 0], therefore it lies in plane p defined by this point and line p, thus in plane [1, 1, 1] + t(2, 1,0) + ^(0, 1, 1). Point Q is then intersection of this plane and line q. We will find it by solving system 1 +2t 1+t + s 1+s 2 + u 2 + u Left-hand sides of equations represent all three coordinates of an arbitrary point of plane p respectively, right-hand sides then represent coordinates of arbitrary point on q (we denoted the free variable as u in order not to be ambiguous). Solving this sytem, we obtain s = 2, t = 2, u = 3 and by inputting u = 3 into line q equation we get Q = [5, 5, 3] (we get the same point by inputting s = 2, t = 2, into parametric equations of p). Sought transversal is thereby given by point Q and point [1, 0, 0]. Now we easily compute the intersection with p, point P = [7/3, 5/3, 1]. □ 4.16. Find a common perpendicular of two skew lines p: [3,0, 3] + (0, 1,2)?, (el q : [0,-1,-2]+ (1,2,3)5 sei Solution. We want to find a transversal perpendicular to both direction vector of line p and direction vector of line q. We can find If we are given parametric forms, we may also directly search for common points as solutions of the appropriate equations, similarly as we were finding the intersections of vector spaces. In this way, we get directly the parametric description again. If the number of subspaces is greater then two, we must search for the intersection step by step. If one of the subspaces is defined parametrically and the other implicitly, it suffices to substitute the parametrized coordinates and to solve the resulting system of equations. (4) To find a crossbar between skew lines p, q in A3 passing through a given point or having a given direction: By a crossbar we mean a straight fine which has a f nonempty intersection with both the skew fines. Thus the resulting crossbar is r is an one-dimensional affine sub-space. If we are given its one point A e r, then the affine subspace generated by p and A is either a straight line (A e p) or a plane (A ^ p). In the first case, we have an infinite number of solutions, one for each point of q, in the second case, it suffices to find the intersection B of the plane (pUA) with q, and r — ({A, B}). The problem has no solution if the intersection is empty. 
If q c (pUA), we get an infinite number of solutions again, and if the intersection has one element, we get exactly one solution. If we are given a direction u e Rn, i.e. the difference space of r, instead of a point, then we consider the subspace Q generated by p and the difference space Z(p) + (u) c K". Again, we get infinite number of solutions if q c Q, otherwise we consider the intersection Q with q and we finish in the same way as in the previous case. The solutions of many other practical geometric problems are based mostly on the systematic use of the steps given above. 4.11. Remarks to linear programming. In the beginning of the third chapter in paragraphs 3.4-3.8, we dealt with practical problems which are given by systems of linear inequalities. I We easily check that each single inequality a\x\ ■anx„ < b defines a halfspace in the standard affine space M" which is bounded by a hyperplane given by the corresponding equation (compare with the definition in paragraph 4.9(4)). Indeed, if we choose the parametric description of the hyperplane {P + tlVl+---+t„-lV„-l} with vectors vi,..., v„-\ from the difference space, then by completing these vectors by 1; to a basis of the whole M", the value a\x\ + • • • + a„x„ — b on the linear combination tivi + • • • + tn-ivn-i + tnv must be positive for all vectors with either a positive or a negative t„. At the same time we see that the set of all admissible vectors for the problem of the linear programming is always an intersection of a finite number of convex sets and hence the set itself is either convex or empty. If the intersection is simultaneously nonempty and bounded, then it is obviously a convex polyhedron. As we have justified in 3.4 already, each linear form is either permanently increasing or permanently decreasing or constant along each (parametrized) straight line in the affine space. Thus if a given problem from linear programming is solvable and bounded, then it must have 215 CHAPTER 4. ANALYTIC GEOMETRY the right direction for example by cross product of those two vectors, and obtain direction (1, —2, 1). Now we form linear equation system which expresses that a vector defined by some two points, one of them lying on p, the other on q, was parallel with direction (1, —2, 1). Symbolically we get system P — Q = k{\, —2, 1), or [3, 0, 3] + (0, 1, 2)t - [0, -1, -2] + (1, 2, 3)s = Jk(l, -2, 1). Treat- 1-,-' 1-,-' p q ing this equality component-wisely, we get 3-5 1 + t - 2s 5 + 2t - 3s -2k with solution t = 1, s =2, k = 1. Inputting t = 1 into line p parametric equation we get one point of the common perpendicular, point [3, 1,5], by inputting s = 2 into line q equation we then get point [3, 1,5]. The common perpendicular is defined by those two points □ B. Eucledian geometry 4.17. Determine distance of lines in M3. p : [1, -1, 0] + t(-1, 2, 3), and q : [2, 5, -1] + t(-1, -2, 1). Solution. The distance is defined as the distance of ortogonal projections of arbitrary points on the respective lines to the ortogonal complement of the vector subspace generated by their directions. We find the ortogonal complement using cross product: ((-1, 2, 3), (-1, -2, l))x = ((-1, 2, 3) x (-1, -2, 1)) = ((8, -2,4)) = ((4,-1,2)). Transversal is for example segment [1,—1,0][2,5,—1], so we project vector [1, -1, 0] - [2, 5, -1] = (-1, -6, 1). We obtain distance of lines: |(-1,-6,1). (4,-1,2)| 4 P(P, B between affine spaces is called the affine map if there exists a linear map

Z(B) between their difference spaces such that for all A e A, v e Z(A) the following holds f(A + v) = f(A) + cp(v). The maps / and

: [-25,0, 26] +s (-9, 1,7), seK. We get point A by choosing certain s el. On top of that vectors A — B = (-28-9*, -11 +5,22 + 7s) , A-C = (-20 - 9s, 13 + s, 28 + 7s) should be of the same length, i.e. V(-28 - 9s)2 + (-11 + s)2 + (22 + 7s)2 = V(-20 - 9s)2 + (13 + s)2 + (28 + Is)2 , or rather (-28 - 9s)2 + (-11 + s)2 + (22 + 7s)2 = (-20 - 9s)2 + (13 + s)2 + (28 + Is)2. should hold. >From the last equation we get s = —3. Therefore A = [-25, 0, 26] - 3 (-9, 1,7) = [2, -3, 5]. □ 4.19. Michael is standing in [2, 1,2] and has a stick of length 4. Can he touch lines p and q with this stick at the same time? p : [-1,4, 1] + t (-1,2,0), q : [4,4,-1]+5(1,2,-4)? (Stick has to pass through [2, 1, 2].) Solution. We know how to compute transversal of those lines passing through [2, 1, 2]. It is segment [1, 0, 1][3, 2, 3], its length is \/V2, which is less than 4. Michael is able to touch the lines. □ 4.20. In Euclidian space M4 determine the distance of point A = [2, —5, 1,4] and subspace defined by equations U : Ax\ — 2x2 — 3x3 — 2x4 + 12 = 0, 2x\ — x2 — 2x3 — 2x4 + 9 = 0. Solution. First, we find a parametric expression of subspace U. For example, B = [0, 3, 0, 3] e U. 4.13. Ratio of colinear points. The affine combinations of pairs of points can be also expressed with the help of so called ratio of points on a straight line. If C is given § by an affine combination of points A and B ^ C where C — rA + sB, then we say that the number X = (C; A, B) = --r is the ratio of the point C with respect to the given points A and B. Since we can express the point C as C = A + s(B - A) = B + r(A - B), the ratio X is the ratio of length of the oriented vectors C — A and C — B. In particular, X — — 1 if and only if C is the center of the fine segment between A and B (i.e. r — s — \ in our affine combination). Hence our characterization of affine maps in terms of affine combinations has the following intelligible consequence: Corollary. Affine maps are exactly those maps which keep the ratios invariant. 4.14. Changes of coordinates. Under the choice of an affine coordinate system (Aq, w) on A and a system (Bo, v) on B, we get the coordinate expression of the affine map / : A -> B. It follows directly from the definition that it is sufficient to express the image f(Ao) of the origin of coordinate system on A in the coordinate system on B, i.e. to express the vector f(Ao) — Bq in the basis v as a column of coordinates yo, and everything else is then given by multiplying by the matrix of the map

P(A,C) (4) In each cartesian coordinate system (Ao; e), the distance of the points A — Aq + a\e\ + • • • + anen, B — Ao + b\e\ + ----\-b„en is yjj2"=i(ai ~ bi)2- (5) Given a point A and a subspace Q in £„, there exists a point P e Q which minimalizes the distance between A and the points in Q. The distance between A and P is equal to the length of the orthogonal projection of the vector A — B into Z(Q)-1 for an arbitrary B e Q. (6) More generally, for subspaces Q and 1Z in £„ there exist points P e Q and Q e 1Z which minimalize the distances of points B € Q and A € 1Z. The distance between the points P and Q is equal to the length of the orthogonal projection of the vector A — B into Z(Q)1- for arbitrary points B e Q and A e 7Z. Proof. The first three properties follow directly from the ^ properties of length of vectors in spaces with a scalar product, the fourth one follows directly from the expression of the scalar product in an orthonormal basis. 218 CHAPTER 4. ANALYTIC GEOMETRY 4.21. In vector space R4 compute distance i; between point [0, 0, 6, 0] and vector subspace U : [0, 0, 0, 0] + h (1, 0, 1, 1) + h (2, 1, 1, 0) + h (1, -1, 2, 3), Solution. We will solve the problem by the least squares method. Let U's generating vectors be columns of matrix /l 2 1 \ A = ■1 0 1 1 1 2 \1 0 3/ and we substitute point [0, 0, 6, 0] by corresponding vector b = (0, 0, 6, 0)T. We will solve A ■ x = b, i.e. linear equation system X\ + 2x2 + x3 = 0, x2 — xj, = 0, x\ + x2 + 2x3 = 6, x\ + 3x3 = 0, by least squares method. (Note that the system does not have a solution - the distance would be 0 otherwise.) Let's multiply A ■ x = b by matrix AT from the left-hand side. Augmented matrix AT ■ A ■ x = AT ■ b then is 3 3 6 3 6 3 6 3 15 12 By elementary row operations we transform the matrix to the normal form 3 3 3 6 6 3 3 15 12 j \ 0 -3 3 We continue with backward eUmination 1 1 2 0 1 -1 0 0 0 and see the solution ■1 0 x = (2 - 3t, t, t) T , t e R. Note that the existence of infinitely many solutions is caused by third vector generating U, which is redundat because 3 (1, 0, 1, 1) - (2, 1, 1, 0) = (1, -1, 2, 3). Arbitrary (t e R) linear combination (2 - 3f) (1, 0, 1, 1) + t (2, 1, 1,0) + ? (1, -1, 2, 3) = (2, 0, 2, 2) corresponds to a point [2, 0, 2, 2] in subspace U, which is the nearest point to [0, 0, 6, 0]. The distance is therefore u = || [2, 0, 2, 2] - [0, 0, 6, 0] || = V22 + 0 + (-4)2 + 22 = 2^6. □ Let us look at the relation for the minimal distances p(A, B) for B e Q. The vector A — B decomposes uniquely as A — B = u\ +u2, where u\ e Z(Q), u2 e Z(Q)-1. The component u2 does not depend on the choice ofB € Q since any potential change of the point B would show by adding a vector from Z(Q). Now let us choose P = A + (—u2) = B + u\ e Q. We get BH2 = ll"2ll > ll"2ll — IIA ■ From here we see that the minimal possible distance is reached exactly for our point P and its value is ||«2 II indeed. We get the general result in a similar way. For the choice of arbitrary points A e 1Z and B e Q their difference is given as a sum of vectors u\ e Z(K) + Z(Q) and u2 e (Z(K) + Z(Q))J-, where the component u2 does not depend on the choice of the points. Adding suitable vectors from the difference spaces of 1Z and Q we obviously obtain points A' and B' whose distance is exactly II"2||. □ Now we extend our brief overview of elementary problems in the affine geometry. 4.17. Examples of standard problems. 
(1) To find the distance from the point A € £„ to the subspace Q C £„: A method of solving such problem is given in the proposition 4.16. (2) In £2 to construct the straight line q through a given point A which form a given angle with a given line p: Let us remind that we have worked with angles between vectors in the plane geometry already (see e.g. 2.43). We find a vector u € M2 lying in the difference space of the line q, and we choose a vector i; having the prescribed angle with u. The desired line is given by the point A and the difference space (v). The problem has either two solutions or only one solution. (3) To find the perpendicular from a point to a given line: The procedure is introduced in the poof of the last but one item of the proposition 4.16. (4) In £3 to determine the distance of two lines p, q: We choose arbitrarily one point from each of the lines, A e p, B e q. The component of the vector A—B lying in the orthogonal complement (Z(p) + Z(q))1- has the length equal to the distance between p and q. (5) In £3 to find the axis of two skew lines p a q: By the axis we mean the crossbar which realizes the minimal possible distance of the given skew lines in terms of the points of intersection. Again, the procedure can be derived from the proof of the proposition 4.16 (the last item). Let rj is the subspace generated by a single point A e p and the sum Z(p) + (Z(p) + Z(q))-1. Provided that the lines p and q are not parallel, it is going to be a plane. Then the intersection rjllq together with the difference space (Z(p)+Z(q))-L give the parametric expression of the desired axis. If the lines are parallel, then the problem has an infinite number of solutions. 4.18. Angles. Various geometric notions like angles, orientation, volume etc. in the point spaces £„ are defined in f ~-±Z% terms of suitable notions from the vector euclidean spaces just as the notion of the distance. Let us remind that we defined the angle between two vectors at the end of the third part of the second chapter, see 2.43. 219 CHAPTER 4. ANALYTIC GEOMETRY 4.22. Compute volume of parallelepiped in R3 with base in plane z, = 0 and with edges given by pairs of vertices [0, 0, 0], [-2, 3, 0]; [0, 0, 0], [4, 1, 0] a [0, 0, 0], [5, 7, 3]. Solution. Parallelepiped is given by vectors (4,1,0), (—2,3,0), (5,7,3). We know that its volume is defined as determinant 4-2 5 Indeed, from Cauchy inequality follows 0 < 1, and 3 0 3 • 14 = 42. Note that if we modified the order of vectors, we would get result ±42, because determinant gives us oriented volume of parallelepiped. Further note that the volume would not change if the third vector was [a,b,3] for arbitrary a, b € R. Its surface obviously depends only on ortogonal distance of planes of its upper and lower base and their area 4 -2 1 14. □ 4.23. Let points [0, 0, 1], [2, 1, 1], [3, 3, 1], [1, 2, 1] define a paral-leloid. Determine point X lying on line p : [0, 0, 1] + (1, 1, l)t so that parallelepiped defined by given paralleloid and point X has volume of 1. Solution. We will form a determinant which gives us volume of a parallelepiped with X moving along line p: t t 1 0 2 0 Volume should be 1 which introduces condition t = 1/3. □ 4.24. Let A BCD EF GHbea cube (with common notation, i.e. vectors E — A, F — B, G — C, H — D are orthogonal to the plane defined by vertices A, B, C, D) in Euclidean space R3. Compute angle cp between vectors F — A a H — A. Solution. We have solved this problem using formula for angle between vectors. Let's think about the problem further. 
Vertices A, F, H are vertices of a triangle with all sides of the same length, it is hence equilateral triangle and therefore cp = jt/3. □ 4.25. Let 5 be a midpoint of edge AB of cube ABCDEFGH (with common labelling). Compute cosine of angle between lines £"5 and BG. Solution. Dilatation (homotethy) is similar mapping, hence it preserves angles. We can therefore asume that the cube edge has length 1. Further, we can place the point A to the origin of coordinate system and points B and E to points [1, 0, 0] and [0, 0, 1] respectively. Other coordinates are then given: S = [1/2, 0, 0], G = [1,1,1], vector MINI so it has sense to define the angle cp(u, v) between vectors u, v e V in a real vector space with a scalar product given by the equation u ■ v cos cp(u, v) — -, 0 < cp(u, v) < 2tt. I"l III'll This is completely in accordance with the situation in the two-dimensional euclidean space R2 and with our philosophy that the notion related to the two vectors is the issue of the plane geometry in fact. In the euclidean plane, we used also the geometric functions cos and sin which we defined by a pure geometric consideration. We will come back to this in the beginning of the fifth chapter, when we will be able to check precisely the geometric opinion that the function cos is decreasing in the interval [0, it]. Therefore, the angle between two vectors in higher-dimensional spaces is measured in the plane which is generated by these two vectors (or it is zero), and our defining relation corresponds to the conventions in all dimensions. In an arbitrary real vector space with a scalar product, it follows directly from definitions that v\\2 = ■ + |MI - 2(« • v) 2 2 — \\u\\ + ||«11 — 2||w|| ||u|| coscp(u, v). This is evidently the well known law of cosines from the plane geometry. Next, the following relation holds for each orthonormal basis e of the difference space V and a non-zero vector u e V \\u\\2 — ^2 \u • ei\2- i By dividing this equation by the number ||» ||2 we get 1 = ^(cos^(w, e,))2, which is the law of directional cosines cp(u, e{) of the vector u. Now we can derive reasonable definitions for angles between general subspaces in an euclidean vector space from the definitions of angles between vectors. Concurrently we must decide how to deal with cases, where the subspaces have a nontrivial intersection. As the angle between two lines, we want to take the smaller one from the two possible angles, in the case of two nonparallel planes in R3 we do not want to say that the angle is zero since they intersect and have one direction in common: Angles between subspaces [__< 4.19. Definition. Let us consider finite-dimensional subspaces Ui, U2 in an euclidean vector space V of an arbitrary dimension. The angle between vector subspaces U\, U2 is the real number a — cp(Ui, U2) e [0, j] satisfying: (1) If dimt/i = dimt/2 = 1, U\ = (u), U2 = (u),then |w.u| cos a = -. (2) If the dimensions of U\, U2 positive and U\ n U2 — {0}, then the angle is the minimum of all angles between one-dimensional subspaces a — min{From this system we can see that y = — z, and x = 2z. Vector (2, —1, 1) is therefore direction vector of p; in other words, we have (p is obviously passing through the origin) p : [0, 0, 0] + t (2,-1, 1), t eR. For angle

l)) — Proof. According to the Cauchy inequality, for all vectors u e U we have \u ■ v\ \u ■ (v\ + v2)\ \u ■ V\\ \\u\\ \\v\ \u\\ IIu 1II \\u\\ III'll \v\ I \v\ I \v\\ \\v\ I \v\ ■ v\ \\v\\ \\v\ II This implies cos(p((v), («)) < cos(p((v), (v\)) \vi I + 3b2 + 2V3ab. and thus the vector v\, which we have found, represents the largest possible value of the cosine of angles between all choices of vectors inU. But since the function cos is decreasing on the interval [0, j], we get the smallest possible angle in this way, and so the claim is proved. □ 4.21. Calculating angles. The procedure in the previous lemma ;i can be understood as follows. We take the orthogonal projection of the one-dimensional subspace generated by 1; t into the subspace U, and we look at the ratio between 1; 1 ^ ' and its image. A similar procedure is used in the higher dimension too. However, the problem is to recognize the directions whose projections give the desired (minimal) angle. We can see this in our previous example if we project the bigger space U into one-dimensional (1;) first, and then orthogonally back to U. We find out that the desired angle corresponds to the direction of 221 CHAPTER 4. ANALYTIC GEOMETRY If we use a2 + b2 = 1, we get 0 = 2b2 + 2V3ab, tj. 0 = b (b + V3aj . Together (remember that c = 3a and a2 + b2 = 1) 1 73 a = ±1, & = 0, c = ±3; a = ±-, & = T—, c 2 2 We can easily check that lines determined by those coefficients x + 3 = 0, satisfy all the conditions. 1 - x 2 73 3 — V + - = 0 2^2 3 ±-. 2 □ 4.28. Determine general equation of all planes so that angle between every such plane and plane x + y + z — 1 = 0is 60°, and further, they contain line p : [1, 0, 0] + t (1, 1, 0). O 4.29. Determine angles between planes a: [1,0, 2] + (1,-1, l)t + (0,1,-2)5 p: [3, 3, 3]+ (1,-2,0)? + (0,1,1)5 Solution. Line of intersection between planes has direction vector (1,-1,1), plane ortogonal to this vector has intersection with given planes generated by vectors vektory (1,0, —1) a (0, 1, 1). Angle between these one-dimensional subspaces is 60°. □ 4.30. Cube ABCDA'B' C D' (in standard notation, i.e. ABCD and A'B' C D' are faces and A A' is an edge). Compute angle between AB' and AD'. Solution. Consider cube of side 1 and place it in M3 in such way that vertex A has coordinates [0, 0, 0], vertex B coordinates [1, 0, 0] and vertex C coordinates [1, 1, 0]. Then vertex B! has coordinates [1, 0, 1] and vertex D' coordinates [0, 1, 1]. We can determine vectors AB' = B' — A = [1,0, l]-[0, 0, 0] = (1,0, 1), AD' = D'-A = [0, 1, 1]- [0, 0, 0] = (0, 1, 1). By definition of angle

U2 as before. Similarly, let i/> : U2 -> U\ be the map which has arisen from the orthogonal projection onU\. In the bases (e\, ek) and (e'j,..., e'}), these maps have matrices A = ek ■ ex B = \e\-e\ ... ek-e'J l -e\ erei \e[ ■ ek ... e'r ekJ Since we are regarding scalar products on a real vector space, et ■ e'j — e'j -et holds for all indices i, j, in particular we have B — A T. The composition of maps fofi : U\ —>• U\ has therefore a symmetric positive semidefinite matrix AT A, and i/> is an adjoint map to o

i ) + pi t>i ) +qi (u2, vt ) - p2(v1,v1) - q2(v2,v1) = 0, ((-3,2, -5, -7, -3),v2) + pi (uuv2) +qx (u2,v2) - p2 (vu v2) - q2 (v2, v2) = 0. By computing those dot products we get linear equation system tj- We define the absolute value of the volume of a parallelepiped inductively such that we fulfil the idea that it is the product of the volume of the "base" and the "altitude": \Vol\Tk(A;Ul, I Vol\Vi(A; «0 = ||«i II ■ ,«*)= Ik* II I Vol |7Vi(A; «i, ■, w*-i). If u\,..., un is a basis agreeing with the orientation of V, we define the (oriented) volume of the parallelepiped by VolP*(A; «i,..., u„) — | Vol\Vk(A; uu..., u„), in the case of a nonagreeing basis we set VolP*(A; wi,..., un) — —| Vol\Vk(A; uu ..., un). The following claim clarifies our former comments that the determinant expresses the volume in a sense. The thing is that the first claim says exactly that we get the volume of the parallelepiped in a ^-dimensional space, which is stretched on k vectors, such that we write down their coordinates (in an orthonormal basis) into columns of a matrix and we calculate the determinant. The formula in the second claim is called Gramm determinant. Its advantage is that it is independent on the choice of basis and, therefore, it is better to handle in the case that k is lower then the dimension of the whole space. Theorem. Let Q c £n be an euclidean subspace, and let (e\,..., ek) be its orthonormal basis. Then for arbitrary vectors u\, ... ,uk € Z(Q) and A € Q the . following holds (1) VolTk(A; ux Uk) — (2) (VolTk(A;Ul,...,Uk)y Proof. The matrix u\ ■ e\ u\ ■ u\ u\ ■ Uk Uk ■ e\ Uk ■ ek .. Uk ■ U\ .. Uk ■ Uk A = : \u\-ek .. has the coordinates of vectors u\,. columns, and |A|2= \A\\A\ = \A-u\ ■ u\ U\ ■ Uk Uk ■ ei\ Uk ■ ek) . ,uk in the chosen basis in |A Uk ■ u\ Uk ■ Uk \ATA\ Hence we see that if (1) holds, then also (2) holds. The unoriented volume is directly form the definition equal to the product \Vol\Vk(A;ui,. where ui — u\, v2 — u2 uk) — IN Illk2ll •• , vk — uk a2vi Nil, ■ ak v\ of _,vk-i is the result of the Gramm-Schmidt orthogonalization. 224 CHAPTER 4. ANALYTIC GEOMETRY 7, 6pi - \q\ - 9p2 - 3q2 -4pi + 6qi + 6q2 = 6, 9pi - 33p2 - q2 = 31, 3pi - 6qi - p2 - 9q2 = -11, which we solve by forming matrix and performing elementary row operations. 7 \ / 1 / V 9 3 0 -9 0 -33 -1 0 0 0 1 0 0 1 0 V 0 0 0 1 0 0 0 o \ -1 -1 2 31 -11 J The solutions is (pi, q\, p2, q2) = (0, —1, — 1, 2). We have found Xt-X2 = (-3, 2, -5, -7, -3)-u2+vx-2v2 = (-3, 4, -2, -4, 2). The size of vector (—3,4,—2,—4, 2) and at the same time distance between planes a\, a2 is hence 7 = V(-3)2 + 42 + (-2)2 + (-4)2 + 22. We determined distance between q\ and £2 differently than the distance between a\ and a2. We could have used both methods in both cases. Let's try the former method for the case of a\, a2. Let's find ortogonal complement of vector subspace generated by (2,1,0,0,1), (-2,0,1,1,0), (2,2,4,0,3), (2,0,0,-2,-1) We get / 2 V 1 0 0 1 2 4 0 0 0 1 0 1 \ 0 3 -1 / / 1 0 0 0 0 10 0 0 0 10 V 0 0 0 1 3/2 \ -2 1 2 The ortogonal complement is ((—3/2,2,-1,-2,1)), or rather ((3, —4, 2, 4, —2)). Note that distance between a\ and a2 equals the size of ortogonal projection of vector (difference of arbitrary point in o\ and arbitrary point in a2) u = (3, -2, 5, 7, 3) = [3, -1, 7, 7, 3] - [0, 1, 2, 0, 0] to this ortogonal complement. Denote the ortogonal projection of u as pu and choose 1; = (3, —4, 2, 4, —2). Obviously pu = a ■ v for some del and it holds ( u — pu, v ) = 0, tj. ( u, v ) — a { v, v ) = 0. Computing gives 49 — a ■ 49 = 0. 
Therefore pu = 1 ■ v = v and the distance between planes a\ and a2 is equal \\Pu\\ = 732 + (-4)2 + 22 + 42 + (-2)2 = 7. Method of computing distance using ortogonal complement of sum of vector spaces has proven to be „faster way to the solution". With no doubt, it will be the same for planes q\ a q2. The second method however reveals points where the distance can be measure (pair Thus we have (yo\Vk(A;uu...,uk)Y = VI ■ VI vi ■ vk v\ ■ v\ 0 0 0 . . Vk- VI ■ • vk- vk vk ■ vk Let us denote by B the matrix whose columns are formed jji:, by the coordinates of vectors v\,..., vk in the orthonor- mal basis e. Since 1; vk have arisen from u\ |t as images under a linear transformation with an upper-triangular matrix C with ones on the diagonal, we have B = CAand|B| = \C\\A\ = \A\. Butthen|A|2 = \B\2 = \A\\A\, and thus ~VolVk(A; u\,..., uk) — ±|A|. The resulting volume is zero if the vectors u\,... ,uk are dependent. Provided that they are independent, the sign of the determinant is positive if and only if the basis u\,... ,uk defines the same orientation as the basis e. □ We can formulate the following important geometric consequence: 4.23. Corollary. For each linear map

• V on an euclidean space V, det

- [u\,..., un] is antisymmetric n-linear map. It means, it is linear in all arguments, and the interchange of any two arguments causes the change of sign of the result. (2) The outer product is zero if and only if the vectors u\,... ,un are linearly dependent. (3) Vectors u\,..., un form a positive basis if and only if their outer product is positive. In technical applications in the space R3, we often use a closely related operation, so called cross product, which assigns a vector to any pair of vectors. Let us consider an arbitrary euclidean vector space V of dimension n > 2 and vectors u\,..., u„-\ e V. If we substitute tors u\, 225 CHAPTER 4. ANALYTIC GEOMETRY of points in which the planes are the closest). Let's find such points in the case of planes q\, q2. Denote ui = (1, 0, -1, 0, 0), u2 = (0, 1, 0, 0, -1), ui = (1,1, 1,0, 1), v2 = (0, -2, 0, 0, 3). Points Xi € q\, X2 € Q2, which are „the closest" (as commented above), are Xi = [7, 2, 7, -1,1] + hui + siu2, X2 = [2, 4, 7, -4, 2] + t2vi + s2v2, so X-i — Xo [7,2,7, -1, 1] - [2,4,7, -4,2] +hui + SiU2 - t2vt - s2v2 (5, -2, 0, 3, -1) + hui + s\u2 - t2v\ - s2v2. Dot products \XX -X2,Ul) [Xl-X2,vl) o, 0, [Xi-X2,u2) Xl-X2,v2) o, 0 then lead to linear equation system 2h = -5, 2s i + 5^2 = 1, -4t2 - s2 = -2, -5si - t2 - I3s2 = -1 with only solution t\ = —5/2, s\ = 41/2, t2 = 5/2, s2 obtained "9 45 19 5 41 Xl = [7,2,7,-1, l]--«l +yM2 X2 = [2, 4, 7, -4, 2] + -vi - Sv2 2 2 -1, -8. We 39 9 45 19 39 2' T' T' ~ ' ~T Now we can easily see that the distance between points x\,x2 (and, at the same time, distance between planes q\, q2) je || x\ — x2 \ \ = ||(0,0,0,3,0)||=3. □ 4.35. Find intersection of plane passing through point A = [l,2,3,4]eK4 and ortogonal to plane q : [1, 0, 1, 0] + (1, 2, -1, -2)s + (1, 0, 0, l)f, s,(eR. Solution. First, let's find plane ortogonal to q. Its direction will be ortogonal to direction of q, for vectors (a,b,c,d) within its direction we get linear equation system (a,b, c,d) ■ (1,2,-1, -2) = 0 = a+2b-c-2d = 0 (a,b,c,d) ■ (1,0, 0, 1) =0 = a+d = 0. these n — 1 vectors into the first n — 1 arguments of the n-linear map defined by the volume determinant as above, then we are given one argument left, i.e. a linear form on V. Since we have the scalar product at disposal, each linear form corresponds to exactly one vector. We call this vector u € y the cross product of vectors u\,..., w„_i, i.e. the following holds for each vector w e V {v, w) [Ml, ... , W„_l, HI J. We denote the cross product byw = «i x...x«„_i. If the coordinates of our vectors in an orthonormal basis are v = (yi,..., yn)T, w = (xi, ... ,xn)T and Uj = (u\j,. ..unj)T, then our definition can be expressed as y\x\ ' ynXn «11 Ul(n-l) Xi Mnl • • • Mn(n — 1) We see from here that the vector i; is given uniquely and its coordinates are calculated by the formal expansion of this determinant along the last column. At the same time, the following properties of the cross product are direct consequences of the definition: Theorem. For the cross product v — u\ x ... x w„_i we have (1) v e ..., Un-i)1- (2) v is nonzero if and only if the vectors u \ ,...,«„_ i are linearly independent, (3) the length \\v\\ of the cross product is equal to the absolute value of the volume of parallelepiped V(0; u\, ..., w„_i), (4) («i,..., w„_i, u) is an agreeing basis of the oriented eu-clidean space V. Proof. 
The first claim follows directly from the defining formula for i; since substituting an arbitrary vec-WL JkY/ tor uj for w we get the scalar product v ■ uj on the left and the determinant with two equal columns on the right. The rank of the matrix with n — 1 columns uj is given by the maximal size of a non-zero minor. The minors which define coordinates of the cross product are of degree n — 1 and thus the claim (2) is proved. If the vectors u\,... ,u„-i are dependent, then also (3) holds. Therefore, let us consider that the vectors are independent, let i; be their cross product, and let us choose an orthonormal basis (ei,..., e„_i) of the space (ui,..., u„-\). It follows from what we have proved that there exists a multiple (l/a)v, 0 ^ a e R, such that (e i,..., ek, (1 /a) v) is an orthonormal basis of the whole space V. The coordinates of our vectors in this basis are uj — (u\j,u(n-i)j, 0)T, v — (0, ..., 0, a)T. So the outer product [u\,..., w„_i, i;] is equal (see the definition of cross product) 0 |«i, ..., w„_i, v\ — «11 «i(«-i) «(«-i)i ••• «(«-i)(«-i) 0 0 ... 0 a — (v, v) — a . Expanding the determinant along the last column we get a2 = a VolP(0; uu ..., ;„_i). 226 CHAPTER 4. ANALYTIC GEOMETRY Solution is two-dimensional vector space ((0, 1,2, 0), (—1,0, —3, 1)). Plane r ortogonal to q passing through a has parametric equation r : [1, 2, 3, 4] + (0, 1, 2, 0)w + (-1,0, -3, l)v, u, v e R. We can obtain intersection of planes from both parametric equations. We get linear equation system 1 + s + t = l-v 2s = 2 + u 1 — s = 3 + 2u — 3v -2s + t = 4 + v, which has only solution (it must be so as matrix columns are linearly independent) s = -8/19, t = 34/19, u = -54/19, v = -26/19. Inputting parameter values s and t into parametric form of plane q, we obtain sought intersection [45/19, -16/19, 11/19, 18/19] (needless to say, we get the same solution by inputting the values into r). □ 4.36. Find a line passing through point [1,2] e R2 so that angle between this line and line p : [0, l] + f(l,l) is 30°. Solution. Angle between two lines is angle between their direction vectors. It is sufficient to find direction vector v of the line. One way to do so is to rotate direction vector of p by 30°. Rotation matrix for the angle 30° is cos 30° - sin 30c sin 30° cos 30° Sought vector v is therefore We could perform the backward rotation as well. The line (one of two possible) has parametric equation /V3 1 73 l\ [1'21 + (--2- + 2J'- □ 4.37. Determine cos a, where a is angle between two adjacent faces of regular octahedron (octahedron has eight equilateral triangles as faces). Solution. Octahedron is symetric, therefore it does not matter which two faces we choose. Further, without loss of generality, asume octahedron of edge length 1 and place it into standard Cartesian coordinate Both the remaining two claims from the proposition follow From here. □ 4.25. Affine and euclidean properties. Now we can have a think about which properties are related to the affine structure of the space and for which properties we really need the scalar product in the difference space. It is obvious that all euclidean transformations, i.e. bijective affine maps between euclidean spaces, which preserve the distance between points preserve also all objects we have studied. I.e. next to the distances they preserve also un-oriented angles, unoriented volumes, angle between sub-spaces etc. 
If we want them to preserve also oriented angles, cross products, volumes, then we must assume in addition that our transformations preserve the orientation too. We may formulate our problem also as follows: Which concepts of euclidean geometry are preserved under affine transformations? First let us remind that an affine transformation on a n-dimensional space A is uniquely defined by mapping n + 1 points in a general position, i.e. by mapping one n-dimensional simplex. In the plane, it means to choose the image of one (nondegenerate) triangle, which may be an arbitrary (nondegenerate) triangle. The preserved properties will be the properties related to subspaces in particular, i.e. the properties of the type "a line passing through a point" or "a plane contains a line" etc. At the same time, the col-inearity of vectors is preserved, and for every two colinear vectors, the ratio of their lengths is preserved (independently on the scalar product defining the length). Similarly, we have already seen that the ratio of volumes of two n-dimensional parallelepipeds is preserved under transformations (since the determinant of the corresponding matrix changes about the same multiple). These affine properties can be used smartly in the plane to prove geometric claims. For instance, to prove the fact that the medians of a triangle intersect in a single point and in one third of their lengths, it is sufficient to verify this only in the case of an isosceles right-angled triangle or only in the case of an equilateral triangle, and then this property holds for all triangles. Think this argumentation over! 2. Geometry of quadratic forms After straight lines, the simplest objects in the analytic geom-_ etry of plane are so called conic sections. They are given by quadratic equations in cartesian coordinates, and by coefficients we recognize //// ■ that the conic is a circle, ellipse, parabola or hyperbola, potentially it may be also a pair of lines or a point (the degenerate cases). We will see that our tools enable us to classify effectively these objects in all finite dimensions and to work with them. It is also obvious that we cannot distinguish a circle from an ellipse in affine geometry, therefore we begin in the euclidean geometry. 4.26. Quadrics in £„. In analogy with equations of conic sections in plane, we start with objects in euclidean point spaces which are defined in a given orthonormal basis by quadratic equations, we talk about quadrics. 227 CHAPTER 4. ANALYTIC GEOMETRY system R3 so that its centroid lies in [0, 0, 0]. Its vertices then are located in points A = [^,0,0], B = [0,^,0], C = [-^,0,0], D = [0, 0], £ = [0,0,-|]aF = [0, 0, ^]. We will compute angle between faces CDF and BCF. We have to find vectors ortogonal to their intersection and lying within respective faces, which means ortogonal to CF. They are altitudes from D and F to edge CF in triangles CDF and BCF respectively. Altitudes in equilateral triangle are the same segments as medians, so they are SD and SB, where S is midpoint of CF. Because we know coordinates of points C and F, the point S has coordinates [—0, ^-] and vectors are SD = (^, ,-% a SB = (^, f,Together cos a 4 ' 2 / V2 72 _V_2\ /V2 72 V2x ^ 4 ' 2' 4 ^ ' ^ 4 ' 2 ' 4 ^ Therefore a = 132° □ 4.38. In Euclidean space spaces U,V, where determine angle

• R. Similarly, we may think of a general symmetric bilinear form on an arbitrary vector space. For an arbitrary basis on this vector space, the value fix) on vector x = x\e\ +----h xnen is given by the equation f(x) = Fix, x) = ^^XiXjFiei, ej) = xT ■ A ■ x 'J where A = (a^) is a symmetric matrix with elements atj = F(et, ef). We call such maps / quadratic forms, and the formula from above for the value of the form in terms of the chosen coordinates is called the analytic formula for the form. In general, by a quadratic form we mean the restriction / (x) of a symmetric bilinear form Fix, y) to arguments of the type (x, x). Evidently, we can reconstruct the whole bilinear form F from the values fix) since fix + y) = Fix + y, x + y) = fix) + fiy) + 2F(x, y). If we change the basis to a different basis e[,..., e'n, we get different coordinates x = S ■ x* for the same vector (here S is the corresponding transformation matrix), and so fix) = iS-x')T A ■ iS-x') = ix'Y ■ iS ■ A ■ S) ■ x'. Now let us assume again that our vector space is equipped with a scalar product. Then the previous computation can be formulated as follows. The matrix of bilinear form F, which is the same as the matrix of /, transforms under a change of coordinates in such a way that for orthogonal changes it coincides with the transformation of a matrix of a linear map (indeed, then we have S-1 = ST). We can interpret this result also as the following observation: Proposition. Let V be a real vector space with a scalar product. Then formula

3 19 19 19 I (0,0,0,1,0) 11 -V4 = (0,0, 0, 1,0), ..(0,0,0,1,-1)11 J2 2 Hence cp = jt/4. Case (e). Let's determine intersection of vector subspaces associated with given affine subspaces. Vector {x\, x2, x3, X4, x$) is in vector subspace of U, if and only if (X\, X2, X3, X4, x5) = t (2, 1, 3, 5, 3) + s (0, 3, 1, 4, -2) + r (1, 2, 4, 0, 3) Proof. Indeed, each bilinear form with a fixed second argument becomes a linear form au( ) = F( , u), and in the presence of a scalar product, it must be given by formula a(u)(v) = v ■ w for a suitable vector w. We set = w. One show directly from the coordinate expression displayed above that cp is a linear map with matrix A. Hence it is selfadjoint. On the other hand, each symmetric map cp defines a symmetric bilinear form F by formula F(u, v) = (cp(u), v) = (u, cp(v)), and thus also a quadratic form by restriction. □ We get immediately the following consequence of this proposition. For each quadratic form / there exists an orthonormal basis of the difference space in which / has a diagonal matrix (and the values on the diagonal are determined uniquely up to their order). Due to the identification of quadratic forms with linear maps, we can also define correctly the rank of the quadratic form as the rank of its matrix in any basis (i.e. the rank is equal to the dimension of the image of the corresponding map cp). 4.28. Classification of quadrics. Let us come back to our equation (4.4). Our results on quadratic forms enable us to rewrite this equation as follows Y^^iA + J2biXi + b = o. Hence we may assume directly that the quadric is given in this form. In the next step, we do completing the squares for the coordinates x, with A, ^ 0, which "absorbs" the squares together with the linear terms in the same variable (so called Lagrange algorithm, will be discussed in detail later). So we are left only with linear terms corresponding to variables for which the coefficient at the quadratic term was zero, and we get r = l pi)2+ E j satisfying Xj bjXj + c = 0 0. This corresponds to a translation of the origin about the vector with coordinates pt and to such a choice of basis of the difference space that we get the desired diagonal form in the quadratic part. In the identification of quadratic forms with linear maps derived above, it means that cp is diagonal on the orthogonal complement of its kernel. If we are left with some linear terms, we may adjust the orthonormal basis of the difference space for the kernel of cp such that the corresponding linear form is a multiple of the first term of the dual basis. Hence we can already reach the final formula where k is the rank of matrix of quadratic form /. If b / 0, we can make the constant c in the equation to be zero by a next change of the origin. Hence we see that the linear term may (but does not have to) appear only in the case that the rank of / is less than n, c e R may be nonzero only if b = 0. The resulting equations are called the canonical analytic formulas for quadrics. 229 CHAPTER 4. ANALYTIC GEOMETRY for some t,s,r el, and, at the same time, (x\, x2, x3, x4, x5) e V if and only if (jci, x2, x3, x4, x5) = p (-1, 1, 1, -5, 0) + q (1, 5, 1, 13, -4) for some p, q el. Let's find such t,s,r, p,q el,so that t (2, 1, 3, 5, 3) + s (0, 3, 1, 4, -2) + r (1, 2, 4, 0, 3) = p(-l, 1, 1, -5,0) +q (1,5, 1, 13, -4). It is a homogeneous linear equation system. We will solve it in matrix form (order of variables is t, s, r, p, q) í2 0 1 1 -1 / 1 3 2 -1 -5 \ 1 3 2 -1 -5 0 2 1 -1 -3 3 1 4 -1 -1 ~ ... 
~ 0 0 1 -1 1 5 4 0 5 -13 0 0 0 0 0 \3 -2 3 0 4 ) \0 0 0 0 4.29. The case of £2. As an example of the previous procedure, let us go through the whole discussion in the simplest ; / case of a nontrivial dimension, i.e. dimension two. ^~z_ The original equation has the form an x2 + a22y2 + 2ai2xy + a\x + 02y ■ 0. By a suitable choice of a basis of difference space and the subsequent completing the squares we reach the form (we use the same notation x, y for the new coordinates): „2 a\\x ■ a22 y2 + a\x + a2y + < 0 where a, may be nonzero only in the case that an is zero. By the last step of the general procedure, i.e. in dimension n = 2 only by a choice of a translation, we reach exactly one of the following equations: It has showed that vectors defining V are linear combination of U's vectors. That means V is subset of U, and hence

M a quadratic form. Then there exist a polar basis for f on V. Proof. (1) Let A be the matrix of / in basis u — (u\, ...,u„) on V, and let us assume an / 0. Then we may write f(x\, ..., x„) — a\\x\ + 2ai2XiX2 H-----h A22-4 + • • • — flfj^flnXi + Ö12X2 H-----h fll«X„)2 + terms not containing x\. Hence we transform the coordinates (i.e. we change the basis) such that in new coordinates we have x\ — a\\x\ + fli2X2 H-----h a\nxn, x'2 — X2, ..., x'n — x„. It corresponds to the new basis (as an exercise, compute the transformation matrix) V\ = üyyU\, 1)2 —- U2 — Ü\2U\, . . . , V„ — U„ — üyy a\nu\ and so, as we may expect, in the new basis the corresponding symmetric bilinear form satisfies g(v\,v{) — 0 for all i > 0 (compute!). Thus / has the form a^x'l2 + h in the new coordinates, where h is a quadratic form independent on the variable x\. Due to technical reasons, it is mostly better to choose v\ — u\ in the new basis. Then we have the expression f — f\ + h, where f\ depend only on x'1, while x'1 does not appear in h, but g(vu vi) - an. (2) Let us assume that after doing the step (1), we get for h a matrix (of rank less about one) with a nonzero coefficient at x7,2. Then we may repeat exactly the same procedure and we get the expression f — f\ + fi + h, where h contains only the variables with index greater than two. We may proceed in this way as long until we get a diagonal form after n — 1 steps or in a step, say ;-th step, the element an is zero. (3) If the last possibility happens, but in the same time there exists some other element ajj / 0 with j > i, then it suffices to switch the ;-th and the j-th vector of the basis and to continue according the the previous procedure. (4) Let us assume now that we come to the situation ajj — 0 for all j > i. If there is no element aß ^ 0 with j > i,k > i, then we are done since we have got a diagonal matrix. If aß / 0, then we use transformation vj — uj + uk + we keep the other vector of basis constant (i.e. x?k — xk — xj, the other remain constant). Then h(vj, Vj) = h(uj, Uj) + h(uk, WjO + 2h(uk, Uj) — 2aj\ ^0 and we can continue according to (1). □ 231 CHAPTER 4. ANALYTIC GEOMETRY 'I zlL o \ ,0 0 1 1 3 -3 , polar basis is therefore 0, 0), §, 0), (1, -3, 1)). □ 3. /(xi, x2, X3) 4.40. Determine polar basis of form / : M3 2*1X3 -(- X2. Solution. Matrix of the form is /0 0 1N A = J 0 1 0 V 0 °v We can switch the order of variables: yi = x2, y2 = xi, y3 = x3. It is then trivial to apply step (1) of Lagrange algorithm (there are no common terms), however for the next step, case (4) sets in. We introduce transformation zi = yi,z2 = y2, zj, = yj, — y2. Pak f(xl,x2,x3) = zj + 2z2(z3 + z2) = zj + ^(2z2 + z3f - X-z\. Together we get z,\=y\= x2, z2 = y2= xu z3 = y3 - y2 = x3 - xx. Matrix T for change to polar basis is / 0 1 0\ /0 1 0> T = 1 0 0 and T~l = 1 0 0 V-i o i) \o 1 1, polar basis is therefore ((0, 1, 0), (1, 0, 1) (0, 1, 1)). □ 4.41. Find polar basis of quadratic form / standard basis defined as I, which is in f(xi,x2, x3) = xix2 + dissolution. By application of Lagrange algorithm we get: f(X\, X2, X3) = 2xiX2 + X2X3 we perform substitution according to step (4) of the algorithm y2 — x2—x\, = 2xi (xi + yi) + (*i + y2)*3 = 2x\ + 2x\y2 + X1X3 + y2x3 = 1 1 2 1 2 1 2 = 2^2xi +y2 + 2xi) ~ y2 ~ 8X3 + y2X3 = substitution yi — 2x\ + y2 + 5X3 1 2 ' 2 ^ 2 , 1 2 1 \2 , ^ 2 = ~ 2^ ~ 8X3 ^ = 23'1 ~ 2yi ~ 2Xs) 8X3 = substitution V3 — \y2 — \x3 4.31. Affine classification of quadratic forms. 
We can improve the Lagrange algorithm for computing polar basis by ____multiplying the vectors from basis by a scalar such that the coefficients at squares of variables in the corresponding analytic formula for our form will be only scalars 1,-1 and 0. Moreover, the following law of inertia says that the number of one's and minus one's does not depend on our choices in the course of the algorithm. These numbers are called the signature of a quadratic form. As before, we get a complete description of quadratic forms in the sense that two such forms may be transformed each one into the other by an affine transformation if and only if they have the same signature. Theorem. For each nonzero quadratic form of rank r on a real vector space V there exists a natural number 0 < p < r and r independent linear forms 0 while for*; e Q we have f(v) < 0. Hence necessarily P n Q — {0} holds, and therefore dim P + dim Q < n. From here we conclude p + (n — q) < n, i.e. p < q. However, we get also q < p by the opposite choice of subspaces. Thus p is independent on the choice of the polar basis. But then for two matrices with the same rank and the same number of positive coefficients in the diagonal form of the corresponding quadratic form, we get the same analytic formulas. □ While we discussed symmetric maps we talked about definite and semidefinite maps. The same discussion has an obvious meaning also for symmetric bilinear forms and quadratic forms. A quadratic form / on a real vector space V is called (1) positive definite if f(u) > 0 for all vectors u / 0, 232 CHAPTER 4. ANALYTIC GEOMETRY We get change of basis matrix by either expressing the old variables (xi, x2, x3) by new variables (yi, y3, x3), or equivalently expressing the new ones by the old ones (which is easier), we however need to compute inverse matrix in the latter case. We have y\ = 2x\ + y2 + \x3 = 2x\ + (x2 — x{) + \x3 and \x3. Matrix changing basis from 1 V 9X3 1 X\ + \x3 ^3 — 2->-z 2^ ~~ 2-"1 1 2 polar basis to standard basis is Inverse matrix is '1 _2 _ 45 : 3 3 : v0 0 One of polar bases of the given quadratic forms is hence for example basis (see the columns of matrix {(1/3, 1/3, 0), (-2/3, 4/3, 0), (-1/2, 1/2, 1)}. □ 4.42. Determine the type of conic section defined by 3xf 3x\x2 + x2 — 1 = 0. Solution. We complete the squares: 1 3.Xi — 3x\X2 ~h X2 — 1 :(3*i :x2y 1 -x\ + x2 1 1 yl 3 4 T 3 1 -,y\ 2 2-71 rA 3" According to list 4.29, the given conic section is hyperbola. □ 4.43. By completing the squares express quadric -x2 + 3y2 + z2 + 6xy - 4z, = 0 in such way that one can determine its type from it. Solution. We move all terms containing x to — x2 and complete the square. We get equation -(x - 3y)2 + 9/ + 3y2 + z2 - 4z = 0. There are no „unwanted" terms containing y , so we repeat the procedure for z, which gives us -(x - 3y)2 + 12/ + (z - 2)2 - 4 = 0. Now we can conclude that there is a transformation of variables that leads to equation (we can divide by 4 first) + y2+z2 1 =0. (2) positive semidefinite if f(u) > 0 for all vectors u e V, (3) negative definite if f(u) < 0 for all vectors u ^ 0, (4) negative semidefinite if f(u) < 0 for all vectors u e V, (5) indefinite if f(u) > 0 and f(v) < 0 for two vectors u, v e V. We use the same names also for symmetric matrices corresponding to quadratic forms. By a signature of a symmetric matrix we mean the signature of the corresponding quadratic form. 4.32. Theorem (Sylvester criterion). 
A symmetric real matrix A is positive definite if and only if all its leading principal minors are positive. A symmetric real matrix A is negative definite if and only if (— 1)' | A, | > Ofor all leading principal submatrices Ai. Proof. We must analyse in detail the form of the transforma-\^ tions used in the Lagrange algorithm for constructing h the polar basis. The transformation used in the first step of this algorithm always have an upper triangular matrix T and if we use the technical modification mentioned in the proof of proposition 4.30 moreover, the matrix has one's on the diagonal: /I T = £12 0 1 "n2\ 0 V □ Such matrix of the transformation from basis u to basis v has several nice properties. In particular, its leading principal submatrices Tk formed by first k rows and columns are the transformation matrices of a subspace Pk — ■ ■ ■, uk) from basis (u\,..., uk) to basis (vi..., vk). The leading principal submatrices Ak of the matrix A of form / are matrices of restrictions of the form / to Pk. Therefore, the matrices Ak and A'k of restrictions to Pk in basis uandv respectively satisfy A k — A'k(Tk)~l, where T is the transformation matrix from u to v. The inverse matrix to an upper triangular matrix with one's on the diagonal is an upper triangular matrix with one's on the diagonal again. Hence we may similarly express A' in terms of A. Thus the determinants of matrices Ak and A'k are equal by Cauchy formula. So we proved a useful statement: Let f be a quadratic form on V, dim V — n, and let u be a basis of V such that we never need the items (3) and (4) from the Lagrange algorithm while finding the polar basis. Then as the result we get analytic formula f(x\, ..., x„) — \\x\ + X2x\ + ■ ■ ■ + Xry?r where r is the rank of form f, k\, ... ,kr ^ 0 and for leading principal submatrices of the (former) matrix A of quadratic form f we have \ Ak\ — k\k2 ... kk, k < r. In our procedure, each sequential transformation makes zeros under the diagonal in next column. From here it is obvious that if the leading principal minors are nonzero then the next diagonal term in A is nonzero. By this consideration we proved so called Jacobi theorem: Corollary. Let f be a quadratic form of rank r on a vector space V with matrix A in basis u. There is no need of other steps in Lagrange algorithm than completing squares if and only if the leading principal submatrices of A satisfy \A\ \ ^ 0, \Ar\ ^ 0. Then 233 CHAPTER 4. ANALYTIC GEOMETRY We can tell the type of the conic section without transforming its equation to the form listed in 4.29. As we know, we can express every conic section as aux2 + 2anxy + any2 + 2ai3x + 2a23y + a33 = 0. there exists a polar basis (which we get by the above algorithm), in which f has analytic formula det A an an an a 12 a22 a23 a 13 a32 a33 and Determinants an ayi &12 a22 are so called invariants of conic section which means that they are not changed by Euclidian transformation (rotation and translation). Furthermore, different types of conic sections have different signs of those determinants. • A / 0 non-degenerate conic sections: ellipse for 8 > 0, hyperbola for 8 < 0 and parabola for 8 = 0 Furthermore, for real ellipse (not imaginary), (an +a22)A < 0 must hold. • A = 0 degenerate conic sections, lines We can easily check that signs (or zero-value) of the determinants are (x\ really invariant to coordinate transformation. Denote X = I y I and w A is a matrix of quadratic form. Then the corresponding conic section has equation XT AX = 0. 
We get the standard form by rotation and translation, i.e. by transformation to new coordinates x', y1 satisfying x = x' cos a — / sin a + c\ y = x' sin a + y' cos a + c2, (X'\ or, in matrix form, for new coordinates X' = \ y' \ holds cos a — sin a c\ \ ix in a cos a c2 I I / 0 0 1/ \1 Inputting X = MX' into the conic section equation we get equation in new coordinates XTAX = 0 (MX')T A(MX') = 0 X'TMT A MX' = 0. Denote by A' matrix of the quadratic form in new coordinates. Then (cos a —sin a ciN sin a cos a c2 | has unit 0 0 1 determinant, so det A' = det MT det A det M = det A = A. n = \ x„) — \ Ai\x] \A2\ \Ai\~ \Ar\ x2 \Ar-l\ r' Hence if all leading principal minors are positive, then / is positive definite by Jacobi theorem. On the other hand, let us consider that the form / is positive definite. Then for a suitable regular matrix P we have A — PTEP = PTP. And so \A\ = \P\2 > 0. Let u be a chosen basis in which the form / has matrix A. The restrictions of / to subspaces V* — (u\,..., uk) are positive definite forms fk again, and the corresponding matrices in bases u i,..., ui_ are the leading principal submatrices A^. Thus|A,t| > 0 according to the previous part of the proof. The claim about negative definite forms follows by observing the fact that A is positive definite if and only if —A is negative definite. □ 3. Projective geometry In many elementary texts on analytic geometry, the authors finish with the affine and euclidean objects described if '■. :i above. The affine and euclidean geometries are sufficient for many practical problems, but not for all prob-■%iw*4&**-^-~ lems. For instance in processing an image from a camera, angles are not preserved and parallel lines may (but does not have to) intersect. The next reason for finding a more general framework for geometric problems and considerations is to deal only with simple numerical operations like matrix multiplication. Moreover, it is difficult to distinguish very small angles from zero angles, and thus it is preferable to have tools which do not need such distinguishing. The basic idea of projective geometry is to extend affine spaces by points in infinity such that it allows us an easy work with linear objects like points, lines, planes, projections, etc. 4.33. Projective extension of affine plane. We begin with the simplest interesting case, the geometry in a plane. If we imagine the points in plane A2 as the plane z — 1 in V?, then each point P in our affine plane is represented by a vector u — (x, y, 1) e R3, and so it is represented also by a one-dimensional subspace (u) c R3. On the other hand, almost each one-dimensional subspace in R3 intersects our plane in exactly one point P, and the vectors of such subspace are given by coordinates (x, y, z) uniquely up to a common scalar multiple. Only the subspaces corresponding to vectors (x, y, 0) will not have any intersection with our plane. _ | Projective plane [___ Definition. Projective plane V2 is the set of all one-dimensional subspaces in R3. Homogeneous coordinates of point P — (x : y : z) in the projective plane are triples of real numbers given up to a common scalar multiple, while at least one of them must be nonzero. The straight line in projective plane is defined as the set of one-dimensional subspaces (i.e. points in V2) which generate a two-dimensional subspace (i.e. a plane) in R3. 234 CHAPTER 4. 
ANALYTIC GEOMETRY Necessarily, also determinant A33, which is algebraic complement of «33 is invariant to coordination transformation, because for rotation only det A' = det MT det A det M holds. In this case matrix M = ^cos a — sin a 0\ sin a cos a Ol anddetA33 = detA33 = S. For translation 0 0 1/ /l 0 cx only M = I 0 1 c2 | and this subdeterminant remains unchanged. \0 0 1 4.44. Determine type of conic section 2x2 —2xy+3y2 —x+y—1=0. Solution. Determinant A ■1 3 i j_ "2 2 i 4 i -i j/0 hence it is non-degenerate conic section. Moreover 8 = 5 > 0, therefore it is ellipse, ellipse. ellipse. Furthermore (an + a22)A = (2 + 3) • (—^) < 0, so it is real □ 4.45. Determine type of conic section x2 — 4xy — 5 y2 + 2x+4y+3 0. 1 -2 1 -2 -5 2 1 2 3 Solution. Determinant A furthermore 8 1 -2 -2 -5 -34 ^ 0, 9 < 0, it is therefore hyperbola. □ 4.46. Determine equation and type of conic section passing through points [-2, -4], [8, -4], [0, -2], [0, -6], [6, -2]. Solution. We will input coordinates of the points into general conic section equation aiix2 + a22y2 + 2a\2xy + a\x + a2y + a 0 We get linear equation system 4«ii + 16^22 + 16ai2 — 2a i — 4a2 + a 64fln + 16^22 — 64^12 + 8fli — 4a2 + a 4a22 - 2a2 + a 36a22 — 6a2 + a 36fln + 4^22 — 24^12 + 6ci\ — 2a2 + a In matrix form we perform operations 0, 0, 0, 0, 0. (4 16 16 -2 -4 1\ 64 16 -64 8 -4 1 0 4 0 0 -2 1 0 36 0 0 -6 1 \36 4 -24 6 -2 V In order to have a concrete example, let us look at two parallel lines in affine plane R2 Li:y-jc-l=0, L2:y-x+ 1=0. If we see the points of lines L i and L2 as finite points in projective space V2, their homogeneous coordinates (x : y : z) obviously satisfy equations L\: y — x — z — 0, L2 : y — x + z — 0. It is easy to see that the intersection L i n L2 is the point (—1:1: 0) e V2 in this context, i.e. the point of infinity corresponding to the common direction vector of the lines, 4.34. Affine coordinates in projective plane. On the contrary if we begin with the projective plane and if we want to see the affine plane as its "finite" part, then instead of plane " z — 1 we may take an other plane a in R3 which does not pass through origin 0 e M3. Then the finite points will be those one-dimensional subspaces which have a nonzero intersection with the plane a. Let us proceed farther in our example of two parallel lines from the previous paragraph, and let us see what their equations look like in coordinates in affine plane given by y — 1. To get them, it suffices to substitute y — 1 into the previous equations: Li : 1 0, L'7 : 1 0 Now the "infinite" points of our former affine plane are given by z — 0, and we see that our lines L\ and L'2 intersects in point (1, 1,0). This corresponds to the geometric vision that two parallel lines L i, L2 in affine plane intersect in infinity, in point (1:1:0) precisely. 4.35. Projective spaces and transformations. One can generalize in a natural way our procedure from the affine plane to each finite dimension. Choosing an arbitrary affine hyperplane A„ in vector space M"+1 which does not pass through origin we may identify the points P e A„ with one-dimensional sub-spaces generated by these points. The remaining one-dimensional subspaces fulfil a hyperplane parallel to A„, and we call them infinite points in the projective extension V„ of affine plane A„. Obviously the set of infinite points in V„ is always a projective space of dimension one less. 
An affine straight line has only one infinite point in its projective extension (both ends of the line "intersect" in infinity and thus the projective line looks like a circle), the projective plane has a projective line of infinite points, the three-dimensional projective space has a projective plane of infinite points etc. More generally, we define the projectivization of a vector space: for an arbitrary vector space V of dimension n+1 we define V(V) = {P C V; P C V, dim V = 1}. Choosing a basis u in V we get so called homogeneous coordinates on V(V) such that for a P e V(V) we use its arbitrary nonzero vector u e V and the coordinates of this vector in basis u. The points of the projective space V(V) are called geometric points, while their generators in V are called arithmetic representatives. In the chosen projective coordinates, we can fix one of them to be one (i.e. we exclude all points of the projective extension which have this coordinate equal to zero), and so we get an embedding 235 CHAPTER 4. ANALYTIC GEOMETRY /4 16 16 -2 -4 1 \ 0 4 0 0 -2 1 0 0 64 -8 12 -9 0 0 0 24 -36 27 0 0 0 3 -v /48 0 0 0 0 -1\ 0 12 0 0 0 -1 0 0 64 0 0 0 0 0 0 24 0 3 \0 0 0 0 3 -V ci2 = 32. We can choose value of a. If we choose a = 48, we get an = l, a22 = 4, an = 0, ax = -Conic section has equation x2 + Ay2 - 6x + 32y + 48 = 0. We will complete x2 — 6x, Ay2 + 32y to squares, which gives us (x - 3)2 + A(y + A)2 - 25 = 0, or rather (x-3)2 (y+A)2 (I)2 " = ' We can see it is an ellipse with center in [3, —4]. □ 4.47. Other characteristics of conic sections. Let's take a further look into some terms related to conic sections. Axis of conic section is a line of reflection symmetry for conic section. From canonical form of conic section in polar basis (4.29) it can be derived that an ellipse has two axes (x = 0 a y = 0), a parabola has one axis (x = 0) a hyperbola has two axes (x = 0 a y = 0). Intersection of axis and conic section itself is called conic section vertex. Numbers a, b from canonical form of conic section (which express distance between vertices and origin) are called semi-axes length. In the case of ellipse and hyperbola, the axes intersect in the origin. This point is a point of central symmetry for the conic section. This point is called center of conic section. Besides vertices and centers there are other interesting points lying on axis of conic section. For ellipse we have ellipse foci E, F characterized by property \EX\ + \FX\ = 2a for arbitrary X lying on ellipse. Following example shows that such points E a F really exist. 4.48. Existence of foci. For ellipse with lengths of semi-axes a > b are points E = [—e, 0] and F = [e, 0], where e = \J a2 — b2 its foci (in polar coordinates). Solution. Consider points X = [x, y], which satisfy properly \EX\ + \FX\ = 2a and we show that these are exactly ellipse points. of n-dimensional affine space A„ C V(V). It is precisely the construction which we used in our example on projective plane. 4.36. Perspective projection. The advantages of projective geometry shows up nicely in the case of perspective projection R3 -> R2. Let us imagine that an observer '$&.J sitting in the origin observes "one half of the world", i.e. the points (X, Y, Z) e R3 with Z > 0, and sees the image "projected" on the screen given by plane Z = / > 0. 
Thus a point (X, Y, Z) in the "real world" projects to a point (x, y) on the screen as follows: */7 X y = f- It is not only a nonlinear formula but also the accuracy of calculations will be problematic in the case that Z is small. Extending this transformation to a map V3 -> V2 we get (X : Y : Z : W) k> (x : y : z) = (-fX : -fY : Z), i.e. a map described by simple linear formula '/ 0 0 0> 0/00 .0 0 10; /a Y Z W This simple expression defines the perspective projection for finite points in R3 c V3 which we substitute as points with W = 1. In this way we eliminated problems with points whose image runs to infinity. Indeed, if the Z-coordinate of a real point is close to zero, then the value of the third homogeneous coordinate of the image is close to zero, i.e. it corresponds to a point close to infinity. Affine and projective transformations. Obviously, each injective linear map

• V2 between vector spaces maps one-dimensional subspaces to one-dimensional sub-spaces. Therefore, we get a map on projectivizations T : ~P(yi) —>• Viy^). Such maps are called projective maps, in literature one uses also the notion collineation if this map is in-vertible. Otherwise put, the projective map is a map between projective spaces such that in each system of homogeneous coordinates on domain and image it is given by multiplication by a matrix. More generally if our auxiliary linear map is not injective, then we define the projective map only outside of its kernel, i.e. on points whose homogeneous coordinates do not map to zero. Since injective maps V —>• V of a vector space to itself are invertible, all projective maps of projective space V„ to itself are invertible too. They are also called regular collineations or projective transformations. In homogeneous coordinates, they correspond to invertible matrices of dimension n+1. Two such matrices define the same projective transformation if and only if they differ by a constant multiple. If we choose the first coordinate as the one whose vanishing defines infinite points, then the transformations preserving infinite points are given by matrices whose first row vanishes up to its first element. If we want to switch to affine coordinates of finite points, i.e we fix the first coordinate to be one, the first element in the first row must be also equal to one. Hence the matrices of collineations 236 CHAPTER 4. ANALYTIC GEOMETRY Coordinate-wise, this equation looks like yj(x + e)2 + y2 + J{x - e)2 + y2 = 2a By raising to a power and performing some operations we get (a2 - e2)x2 + a2/ = a2(a2 - e2). Substituting e2 = a2 b2 and dividing by a2b2 we get 1. x2 y2 az b2 □ Remark. Number e from the previous example is called eccentricity of an ellipse. Similarly, hyperbola foci are points E, F, which satisfy \\EX\ — \FX\\ = 2a for arbitrary X on hyperbola. You can check that there are two points satisfying this condition [—e, 0] and [e, 0] (in polar basis), where e = \Ja2 + b2. Parabola focus is a point F of coordinates F = [0, f] and it is characterized by a fact that distance between this point and arbitrary X on parabola is equal to the distance between X and line y £ 2 ■ 4.49. Find foci of ellipse x2 +2/ = 2. Solution. We can see from the equation that semi-axes lengths area = 1 and b = We easily compute (see ||4.48||): e = \Ja2 — b2 = 1, foci coordinates then are [ — 1, 0] a [1, 0]. □ preserving finite points of our affine space have the form: /l 0 ••• 0\ flu \K ««1 ,b„)T fli« where b = (b\,..., bny e Rn and A = (ai;) is an invert-ible matrix of dimension n. The action of such matrix on vector (1, x\,..., x„) is exactly a general affine transformation, where b is the translation and A its linear part. Thus the affine maps are exactly those collineations which preserve the hyperplane of infinity points. 4.38. Determining collineations. In order to define an affine map, it is necessary and sufficient to define an image of the affine frame. In the above description of affine transformations as special cases of projective maps it corresponds to a suitable choice of an image of a suitable arithmetic basis of the vector space V. But it does not hold in general that the image of an arithmetic basis of V determines the collineation uniquely. We show the core of the problem on a simple example of affine plane. If we choose four points A, B, C, D in the plane such that each three of them are in a general position (i.e. 
no three of them he on a line), then we may choose their images in the collineation as follows: Let us choose arbitrarily their four images A', B' ,C, D' with the same property, and let us choose their homogeneous coordinates u, v, w, z, u', v', w', z' v R3. Obviously the vectors z and z' can be written as linear combinations z = c\u + cjv + csw, Z = C,K - y 1/1 ~t~ ' 3 ' where all six coefficients must be nonzero, otherwise there exist three of our points which are not in general position. Now we choose new arithmetic representatives u = c\u, v = c2v and w = c3w of points A, B and C respectively, and similarly u' = c\u', v' = C2v' and w' = c3w' for points A', B' a C. This choice defines an unique linear map

• V(V) we call a point S e V(V) the center of collineation f if all hyperplanes in the bunch determined by S are fixed hyperplanes. A hyperplane a is called the axis of collineation f if all its points are fixed points. It follows directly from the definition that the axis of a collineation is the center of the dual collineation, while the bunch of hyperplanes defining the center of collineation is the axis of the dual collineation. Since the matrices of a collineation on the former and the dual space differ only by the transposition, their eigenvalues coincide (eigenvectors are column respectively row vectors corresponding 239 CHAPTER 4. ANALYTIC GEOMETRY from 4.29., which corresponds to intersecting cone in R3 with different planes. Non-degenerate sections are depicted. Degenerate sections are those which passes the vertex of the cone. We define following useful terms for conic section in projective plane : Points P, Qe V2 corresponding to one-dimensional subspaces (p)' (?) (generated by vectors p,q € R3) are called polar conjugated with respect to conic section /, if F(p, q) = 0, or rather pT Aq = 0 holds. Point P= (p) is called singular point of conic section /, when it is polar conjugated with respect to / with all points of the plane, so F(p,x) = 0 Vx € V2. In other words, Ap = 0. Hence the matrix A of conic section does not have maximal rank and therefore does define degenerate conic section. Non-degenerate conic sections do not contain singular points. We call the set of all points X= (x) polar conjugated with P = (p) polar of point P with respect to conic section /. It is therefore set of point for which F(p,x) = pTAx = 0. Because polar is given by linear combination of coordinates, it is always (in non-singular case) a line. The following part explains geometric interpretation of polar. 4.57. Polar characterization. Consider non-degenerate conic section /. Polar of point P € f with respect to / is tangent to / with the touch point P. Polar of point P £ f is line defined by touch points of tangents to / passing through P. Solution. We will first consider Pe / and show by contradiction that polar has exactly one common point with the conic section (the touch point). Suppose that polar of P, defined by F(p, x) = 0, intersects / in Q= (q) t^P. Then obviously F(p,q) =0 and fiq) = F(q, q) = 0. For arbitrary point X = (x) lying on P and Q we then have x = ap + fiq for some a, /i e K. Because of bilinearity and symmetry of F we get fix) = Fix, x) a2Fip,p)+2apFip,q)+p2Fiq, q) 0 to the same eigenvalues). For example in the pojective plane (and due to the same reason in each real projective space of even dimension) each collineation has at least one fixed point since the characteristic polynomials of corresponding linear maps are of odd degree and so they have at least one real root. Instead of discussing a general theory, we will illustrate now shortly its usefulness on several results for projective planes. . Proposition. A projective transformation different from the identity has either exactly one center and exactly one axis or it does not have center neither axis. Proof. Let us consider collineation / on VR3 and let us assume that it has two distinct centers A and B. Let us denote by £ the line given by these two centers, and let us choose a point X in projective plane outside of £. If p and q are the lines passing through pairs of points (A, X) respectively (B, X), then also f(p) = p and f(q) = q and in particular also point X is fixed. 
But this means that all points of the plane outside of £ are fixed. Hence each line different from £ has all points out of £ fixed and thus also its intersection with £ is fixed. It means that / is identity mapping and so we proved that every projective transformation different from the identity may have at most one center. The same consideration for the dual projective plane gives the result about at most one axis. If / has a center A, then all lines passing through A are fixed and correspond therefore to a two-dimensional subspace of row eigenvectors of matrix corresponding to transformation /. Therefore, there exists a two-dimensional subspace of column eigenvectors to the same eigenvalue, and this one represents exactly the line of fixed points, hence the axis. The same consideration in the reversed order proves the opposite statement - if a projective transformation of plane has an axis, then it has also a center. □ For practical problems it is useful to work with complex projective extensions also in the case of a real plane. Then the geometric behaviour can be easily read off the potential existence of real or imaginary centers and axes. 4.42. Projective classification of quadrics. In the end of this section we come back to conies and quadrics. A quadric ; / Q in n-dimensional affine space Rn is defined by gen-sr^ eral quadratic equation (4.4), see page 228. Viewing affine space Rn as affine coordinates in projective space VRn+1 we may aim to describe the set Q by homogeneous coordinates in projective space. The formula in these coordinates should contain only terms of second order since only a homogeneous formula is independent of the choice of the multiple of homogeneous coordinates (xo, x\,... ,x„) of a point. Hence we are searching for a homogeneous formula whose restriction to affine coordinates, i.e. substitution xo = 1, gives the original formula (4.4). But this is especially easy, we simply add enough xq to all terms - no one to quadratic terms, one to linear terms and x2, to the constant term in the original affine equation for Q. So we get a well defined quadratic form / on vector space Rn+1 whose zero set defines correctly so called projective quadric The intersection of "cone" Q c Rn+1 of the zero set of this form with affine plane xq = 1 is the original quadric Q whose points are called proper points of the quadric, while the other points Q \ Q in the projective extension are the infinite points. 240 CHAPTER 4. ANALYTIC GEOMETRY which means that every point x of line is lying on conic section /. However, when the conic section contains a line, it has to be degenerate, which is contradiction. As well, we can see that in the case of degenerate conic section, the polar is line of conic section itself. Claim for P £ f follows from the corollary of symmetry of bilinear form F. When point Q lies on polar of P, then point P lies on polar of Q. □ Using polar conjugates we can find axes and center of conic sections without need of Lagrange algorithm. Consider conic section matrix as a block matrix A a T a a where A = (atj) for i, j = 1, 2, a is vector (ai3, a23) and a = a33. That means the conic section is defined by equation uT Au + 2aTu + a = 0 for vector u = (x, y). Now we show that 4.58. Axes of conic section are polars of points at infinity determined by eigenvectors of matrix A. Solution. Because of symmetry of A, in the basis of its eigenvectors, it has diagonal shape D = , where X, \± e M. and this basis is ortogonal. 
Denote by U matrix changing basis to eigenbasis (columns are eigenvectors), then the conic section matrix in eigenbasis is 'UT 0\/A a\/U 0\ _ / D UTa" 0 l) \aT a)\0 l) ~ [aTU a So in this basis we have canonical form defined by vector UTa (up to translation). Specifically, denote by vx, eigenvectors and we have a1Vi t aTv„ t (aTvA2 (aTv,.)2 k(x + —±)2 + fi(y +--)2 = kJ +--'— - a. A fl A fl It means that eigenvectors are direction vectors of the conic section axes (so called main directions) and axes equations in this basis are x = —^y2- and y = —^J±- Axes coordinates uk and in standard T T basis satisfy v^iii = — ^-^ and v^u^ = —^J±, because v^iXui + The classification of real or complex projective quadrics, up to projective transformations, is a problem which we have already managed — it is all about finding the canonical polar basis, see paragraph 4.29. From this classification, which is given by signature of the form in real case and only by rank in complex case, we can deduce relatively easily also the classification of affine quadrics. We show the core of the procedure on the case of conies in affine and projective plane. The projective classification gives the following possibilities, described by homogeneous coordinates (x : y : z) in projective plane VR3: imaginary regular conic given by x2 + y2 real regular conic given by x2 + y2 pair of imaginary lines given by x2 + y2 — 0 = 0 z2 = 0 • pair of real lines given by x2 • one double line x2 — 0. y o We consider this classification as real, i.e. the classification of quadratic forms is given not only by its rank but also by its signature. Nevertheless, the points of quadric are considered also in the complex extension. In this way we should understand the stated names, e.g. the imaginary conic does not have any real points. 4.43. Affine classification of quadrics. For affine classification we must restrict the projective transformations to those which preserve the line of infinite points. We can realize this also by an opposite procedure — for a fixed projective type of conic Q, i.e. its cone Q c R3, we are choosing different affine planes a c R3 which do not pass through origin, and we observe how the set of points Q n a, which are proper points of Q in affine coordinates realized by plane a, is changing. Hence in the case of a regular conic we have a true cone Q given by equation z2 — x2 + y2 and as planes a we may take the tangent planes to unite sphere for instance. If we begin with plane z — 1, the intersection consists only from finite points forming a unite circle Q. By a successive sloping of a we get more and more stretched ellipse until we get such slope that a is parallel with one of lines of the cone. In that moment there appears one (double) infinite point of our conic whose finite points still form one connected component, and so we get parabola. The continuation in sloping gives rise to two infinite points and the set of finite points is no more connected, and so we get the last regular quadric in the affine classification, a hyperbola. We can take the advice from the introduced method which enable us to continue the classification in higher dimensions. In particular, let us notice that the intersection of our conic with the projective line of infinite points is always a quadric in dimension one less, i.e. in our case it is either an empty set or a double point or two points as types of quadrics on a projective line. 
Next we found out that we found an affine transformation transforming one of possible realizations of a fixed projective type to another one only if the corresponding quadrics in the infinite line were projectively equivalent. In this way, it is possible to continue the classification of quadrics in dimension three and farther. 241 CHAPTER 4. ANALYTIC GEOMETRY a) = 0 and v^ip^u^ + a) = 0. These equations are equivalent to equations vf(Au^ + a) = 0 a u^(AwM + a) = 0 which are polar equations of points defined by vectors v^av^. □ 4.59. Remark. Corollary of the previous claim is a fact that center of the conic section is polar conjugated with all points at infinity. Center s coordinates then satisfy equation As + a = 0. If det(A) 7^ 0, then equation As+a =0 for center coordinates has exactly one solution for 8 = det(A) ^ 0 and no solutions for 8 = 0. That means that, regarding non-degenerate conic sections, ellipse and hyperbola have exactly one center and parabola does not have any (its center is point at infinity). 4.60. Prove that angle between tangent to parabola (with arbitrary touch point) and parabola axis is the same as angle between the tangent and line connecting focus and the touch point. Solution. Polar (i.e. tangent) of point X=[x0, yo] to parabola defined by canonical equation in polar basis is line satisfying 0 i i x0x - py - py0 = 0 Cosine of angle between tangent and parabola axis (x = 0) is given by dot product of corresponding unit direction vectors. Unit direction vector of the tangent is , 1 (p,x0) and therefore for cosine we have 1 Xn = (p,x0).(0,l) = Pz+*o \JP +xo Now we show that cosine of angle between tangent and line connecting focus F=[0, f ] and touch point X is equal. Unit direction vector of the connecting line is 1 p (x0, yo - -)■ For cosine of angle we have 1 1 px0 ===(x0y0 + ——) p2 + xz0 Jxz0 + (yo - f) P\2 r2 0. rr^t _£Q_ Substituting yo = ^ we get /p2+4 This example shows that lightrays striking parallel with axis of parabolic mirrow are reflecting to the focus and, vice versa, lightrays going through focus reflect in direction parallel with axis of parabola. This is the principle of many devices such as parabolic reflector. □ 242 CHAPTER 4. ANALYTIC GEOMETRY 4.61. Find equation of tangent in P=[l, 1] to the conic section 4x2 + 5V2 - 8xy + 2y - 3 = 0 Solution. By projecting we get conic section defined by quadratic form (x, y, z)A(x, y, z)T with matrix / 4 -4 0 A = -4 5 1 \ 0 1-3 Using previous theorem, tanget is polar of P, which has homogenenous coordinates (1:1: 1). It is given by equation (1,1, l)A(x, y, z)T = 0, which in this case gives 2y - 2z = 0 Moving back to inhomogeneous coordinates we get tangent equation y = l. □ 4.62. Find coordinates of intersection of y axis and conic section defined by 5x2 + 2xy + y2 - 8x = 0 Solution, y axis, i.e. line x = 0, is polar of sought point P with homogeneous coordinates (p) = (p\ : p2 : p3). That meansthat equation x = 0 is equivalent to polar equation F(p, v) = pT Av = 0, where i; = (x, y, z)T. This is satisfied when Ap = (a, 0, 0)T for some a € R. This condition gives us for conic section matrix /5 1 -4\ A = 1 1 0 \-4 0 0 / equation system 5pi + Pi ~ 4p3 = aj Pi + Pi = 0 -4Pl = 0 We can find point P coordinates by inverse matrix, p = A~1(a,0, 0)T, or solve the system directly by backward substitution. In this case we can easily obtain solution p = (0, 0, —\a). So y axis touches the conic section in the origin. □ CHAPTER 4. ANALYTIC GEOMETRY 4.63. 
Find a touch point of line x = 2 with conic section from the previous exercise. Solution. Line has equation x — 2z, = 0 in projective extension and therefore we get condition Ap = (a,0, —2a) for touch point P, which gives us 5pi + p2 - 4p3 = a Pi + P2 = 0 —4pi = —2a Its solutionis p = (\a, — \a, \a). These homogeneous coordinates are equivalent to (2, —2, 1) and hence the touch point has coordinates [2, -2]. □ 4.64. Find equations of tangents passing through P= [3, 4] to tangent defined by 2x2 - 4xy + y2 - 2x + 6y - 3 = 0 Solution. Suppose that the touch point has homogeneous coordinates given by multiple of vector t = (t\,t2, h)- Condition of T lying on conic section is tT At = 0, which gives 2t2 - 4ttt2 + 2ttt3 + 6t2h - 3rj = 0 Condition of P lying on polar of T is pT At = 0, where p = (3, 4, 1) are homogeneous coordinates of point P. In this case, the equation gives us (2 -2 -l\(h\ (3, 4, 1) -2 1 3 \ \t2 \= -3h +t2 + 6t3=0 V-l 3 -3) \t3) Now we can input t2 = 3ti — 6t3 to the previous quadratic equation. Then -t\ +4tit3 - 3t\ = 0 Because for t3 = 0 equation is not satisfied, we can move to inhomo-geneous coordinates ^, 1), for which we get -(^)2 + 4(^)-3 = 0 a g = 3(£)-6, tj. f = 1 a ff = -3, nebo lj- = 3 a ff = 3. So the touch points have homogeneous coordinates (1 : —3 : 1) and (3:3: 1). Tangent equations are polars of those points Ix — 2y — 13 = 0 and x = — 3. □ 4.65. Find equation of tangent passing through origin to the circle x2 + y2 - lOx - 4y + 25 = 0 Solution. Touch point (t\ : t2 : t3) satisfies 1 0 -5\ jh\ (0, 0, 1) | 0 1 -2 I I f2 I = —5*i - 2t2 + 25 = 0 -5 -2 25 / W 244 CHAPTER 4. ANALYTIC GEOMETRY >From here we derive for example t2 and substitute in circle equation, which {t\ : t2 : h) has to satisfy as well. We get quadratic equation 29t2 - 250fi + 525 = 0, which has solutions h = 5 and tx = . We compute coordinate t2 and get touch points [5, 0] and [^, ^]. The tangents are polars of those points with equations y = 0 a 20x —21y = 0. □ 4.66. Find tangents equations to circle x2 +v2 = 5 which are parallel with 2x + y + 2 = 0. Solution. In projective extension, these tangets intersect in point at infinity satisfying 2x + y + z, = 0, so in point with homogeneous coordinates (1 : —2 : 0). They are tangents from this point to the circle. We can use the same method as in previous exercise. Conic section matrix is diagonal with the diagonal (1,1,-5) and therefore the touch point (?i : t2 : t3) of the tangents satisfy t\ — 2t2 = 0. Substituting into circle equation we get 5t\ =5. Since that t2 = ±1 and touch points are [2, 1] and [—2, —1]. □ Tangent touching the conic section at infinity is called conic section asymptote. Number of asymptotes of conic section is equal to number of intersections between conic section and line at infinity, which means that ellipse does not have any real asymptote, parabola has one (which is however line at infinity) and hyperbola two of them. 4.67. Find points at infinity and asymptotes of conic section defined by Ax2 - Sxy + 3/ - 2y - 5 = 0 Solution. First, we rewrite the conic section in homogeneous coordinates. 4x2 - 8xy + 3y2 - 2yz - 5z2 = 0 Points at infinity are then points determined by homogeneous coordinates (x : y : 0) satisfying this equation, which means 4x2 - 8xy + 3/ = 0. For fraction - we get two solutions: - = — \ and - = — Conic y ° y 2 y 2 section is therefore hyperbola with points at infinity P= (—1 : 2 : 0) a Q= (—3:2:0). Asymptoty jsou potom polAA/y bodLZ P a Q, tj. 
4 -4 0\ /V (-1,2,0)1-4 3 -1 y I =-12x + 10y-2 = 0 (-3,2,0) -4 3 -1 y =-20x + 18y - 2 = 0 □ 245 CHAPTER 4. ANALYTIC GEOMETRY You can find further exercises on conic sections on the page 250. 4.68. Harmonic cross-ratio. If cross-ratio of four points lying on line equal to —1, we talk about harmonic quadruple. Let A BCD be a quadrilateral. Denote by K intersection of lines AB and CD, by M intersection of lines AD and BC. Further let L, N be intersection of KM and AC, BD respectively. Show that points K, L, M, N are harmonic quadruple. O 246 CHAPTER 4. ANALYTIC GEOMETRY 247 CHAPTER 4. ANALYTIC GEOMETRY 248 CHAPTER 4. ANALYTIC GEOMETRY 249 CHAPTER 4. ANALYTIC GEOMETRY D. Further exercise on this chapter 4.69. Find parametric equation of intersection of planes in M3: a : 2x + 3y — z + 1 = 0 a p : x — 2y + 5 = 0. o 4.70. Find common perpendicular of skew lines p : [1, 1, 1] + t(2, 1, 0), q : [2, 2, 0] + f (1, 1, 1). o 4.71. Jarda is standing in [ — 1, 1,0] and has a stick of length 4. Can he simultaneously touch lines p and q, where p : [0,-1,0] + f(l,2, 1), q : [3,4, 8]+^(2, 1,3)? O (Stick has to pass through [-1,1,0].) 4.72. Cube ABCDEFGH is given. Let point T lie on edge BF, \BT\ = \\BF\. Compute cosine of angle between ATC and BDE. Q 4.73. Cube ABCDEFGH is given. Let point T lie on edge AE, \AT\ = \\AE\ and S is midpoint of AD. Compute cosine of angle between BDT and SCH. Q 4.74. Cube ABCDEFGH is given. Let point T lie on edge BF,\BT\ = \\BF\. Compute cosine of angle between ATC and BDE. O 2 2 4.75. Determine tangent to ellipse + y = 1 parallel with line x + y — 7 = 0. Solution. Lines parallel with given line intersect this line in point at infinity (1 : —1 : 0). We construct tangents to given ellipse passing through this point. Touch point T= (h '■ h '■ h) lies on its polar and therefore satisfies — § = 0, so t2 = ^t\. Substituting in ellipse equation we get t\ = ±y. Touch points of sought tangents are [y, |] and [—y, — f ]. Tangents are polars of those points. These have equations x + y = 5 and x + y = —5. □ 4.76. Determine points at infinity and asymptotes of conic section 2x2 + 4xy + 2V2 - y + 1 = 0 Solution. Equation of points at infinity 2x2 + 4xy + ly2 = 0 or rather 2(x + y)2 = 0 has solution ^ — y. The only point at infinity therefore is (1 : — 1 : 0) (conic section is a parabola). Asymptote is a polar of this point, specifically line at infinity z = 0. □ 4.77. Prove that product of distances between arbitrary point on a hyperbola and its asymptotes is consant and tell the value of this constant. Solution. Denote the point lying on hyperbola as P. Asymptote equation of hyperbola in canonical form is bx ± ay = 0. Their normals are (b, ±a) and from here we determine projections Pi, P2 of point P to asymptotes. For distance between point P and asymptotes we get |P Pi 21 = -7==. The 2 „2 j.2 2 2 2 product is therefore equal a ^2^_h2P = f2+b2' because P ues on hyperbola. □ 250 CHAPTER 4. ANALYTIC GEOMETRY 4.78. Compute angle between asymptotes of hyperbola 3x2 — y2 = 3. Solution. For cosine of angle between asymptotes of hyperbola in canonical form we get cos a b2+a . In our case the angle is equal 60°. □ 4.79. Compute centers of conic sections (a) 9x2 + 6xy - 2y - 2 = 0 (b) x2 + 2xy + y2 + 2x + y + 2 = 0 (c) x2 - 4xy + Ay2 + 2x - Ay - 3 = 0 (d) ^# + ^# = 1 Solution, (a) System As + a = 0 for computing proper centers is 9si + 3s2 = 0 3*1-2 = 0 and, solving it, we obtain center [|, — 2]. 
(b) In this case we have si+s2 + l = 0 S1+S2 + J = 0 and therefore there is no proper center (conic section is a parabola). Moving to homogeneous coordinates we can obtain center at infinity (1 : — 1 : 0). (c) Coordinates of center in this case satisfy 51-2*2 + 1 = 0 -2*i +4*2-2 = 0 and the solution is whole line of centers. It is so because the conic section is degenerated to pair of parallel lines. (d) From equations for center computation we immediately get that center is (a, (3). Coordinates of center therefore gives translation of coordinate system origin to the frame in which the ellipse has basic form. □ 4.80. Tell the equations of axes of conic section 6xy + Sy2 + 4y + 2x — 13 = 0. Solution. Main directions of the conic section (axes direction vectors) are eigenvectors of matrix 0 3\ 2 g 1. Characteristic equation has form X2 — 8A — 9 = 0 and eigenvalues are therefore Ai = — 1, X.2 = 9. Corresponding eigenvectors are then (3, — l)and(l, —3). Axes are polars of points at infinity defined by those directions. For (3, —1) we get axis equation — 3x + y + 1 = 0 and for (1, —3) axis -9x - 2\y - 5 = 0. □ 4.81. Determine equations of axes of conic section Ax2 + 4xy + y2 + 2x + 6y + 5 = 0. (A 2\ Solution. Eigenvalues of matrix I ^ ^ I are X\ = 0, A2 = 5 and corresponding eigenvectors are (—1,2) and (2, 1). We get axes 5 = 0 and 2x + y + 1 = 0. The former axis is obviously satisfied for no point. Hence there is only on axis (the conic section is a parabola). □ 4.82. Equation x2 + 3xy - y2 + x + y + 1 = 0. 251 CHAPTER 4. ANALYTIC GEOMETRY defines a conic section. Tell its center, axes, asymptotes and foci. 252 CHAPTER 4. ANALYTIC GEOMETRY Exercise solution 4.9. 2, 3, 4, 6, 7, 8. Try to find planes positions which correspond to each of those numbers on your own. 4.2 K or Z -> K, where K is a given set of numbers. We also worked with sequences of vectors over real or complex numbers Let us remind the discussion from the paragraph 1.4, where we thought about how to deal with scalar functions. There is nothing to add to this discussion and we would like (to start off) to work with functions M -> M (real-valuedfunctions of a real variable), or M -> C (complex-valued functions of a real variable), or functions Q -> Q (rational-valued functions of a rational variable) and so on. Our conclusions can usually be extended to the cases concerning vector values over the considered scalars, but we will mostly talk only about the cases of real and complex numbers. Let us begin with the easiest functions which we can assign explicitly by finitely many algebraic operations with scalars. 5.1. Polynomials. We can add and multiply scalars, and these op-WsJ^// erations satisfy a number of properties which we enumerated in the paragraphs 1.1 and 1.3. If we admit any finite number of these operations, leaving one of the variables as an unknown and fixing the other scalars, we get the so-called polynomial functions: CHAPTER 5. ESTABLISHING THE ZOO Each equation arose from one of the given conditions. Another option is to construct the required polynomial from the fundamental Lagrange polynomials. (see 5.4): Polynomials A polynomial over a ring of scalars ! given by the expression fix) = anx" + a„_ix"_1 is a mapping / H----+ a\x + ao, P(x) (jc-3)(jc-4)(jc-5) 1 . ^-!±-!±-L + o .(...) + (2-3)(2-4)(2-5) (x-2)(x-3)(x-5) (x-2)(x-3)(x-4) (-1) • —-zrr:-^7~.-— + o • (4 - 2) (4 - 3) (4 - 5) z-29. where at, i = 0,..., n, are fixed scalars, multiplication is indicated by mere concatenation of symbols, and"+" denotes addition. 
If a„ / 0, we say that the the polynomial / has degree n. The decree of the zero polynomial is undefined. The scalars at are called (5 — 2) (5 — 3) (5 — 4) coefficients of the polynomial f. 4 , ,101 The coefficients of the polynomial form, of course, the solution of the aforementioned system of linear equations. □ 5.2. Find a polynomial P satisfying the following conditions: P(l + 0 = i, P(2) = 1, P(3) = -i. 5.3. For pairwise distinct points x0,... ,xn e M, consider the elementary Lagrange polynomials (5.4) li{x) := fr-^^-^-ijjx-^iH.-^) xsR,i=0,...,n. (.Xi-Xo) — [Xi-Xi-l)[Xi-Xi + l) — (Xi-X„) Prove that J2 h W = 1 for all x e Solution. Apparently, Y,h(xo) = l + 0 + -- + 0=l, = 0+1 + -- - + 0=1, E^fe) = o + o + "- + i = i. r=0 This means that the polynomial 5Z"=o ^ (x) °f degree not greater than n takes the value 1 at the n + 1 points x0,..., x„. However, there is exactly one such polynomial, namely the constant polynomial y = 1. □ 5.4. Find a polynomial P satisfying the following conditions: P(l) = 0, P'(l) = 1, P(2) = 3, P'(2) = 3. Solution. Once again, we will provide two methods of finding the polynomial. The given conditions give rise to four linear equations for the coefficients of the wanted polynomial. So if we look for a polynomial The polynomials of degree zero are exactly the non-zero constant mappings. In algebra, polynomials are more often defined as formal expressions of the aforementioned form of fix), i. e. a polynomial is defined to be a sequence ao, a\,... of coefficients such that only finitely many of them are non-zero. However, we will show shortly that these approaches are equivalent. It is easy to verify that the polynomials over a given ring of scalars form a ring as well. The multiplication and addition of polynomials are given by the operations in the original ring K by the values of the polynomials, i. e. (/ ' *)(*) = fix) ■ 8(x), (f + g)ix) = fix) +gix), where the operations on the left-hand side are interpreted in the ring of polynomials whereas the operations on the right-hand side are the ones of the ring of scalars. 5.2. Euclidean division of polynomials. As we have already mentioned, we will work exclusively with the scalar fields Q, R, or C. In all these fields, the following statement holds: Proposition (Euclidean division of polynomials). For any two polynomials f of degree n and g of degree m, there is exactly one pair of polynomials q, r such that f = q ■ g +r and the degree of the polynomial r is less than m or r = 0. Proof. Let us begin with the uniqueness. Suppose we have two expressions of the polynomial / in terms of the polynomials q, q', r, and r1, i. e. we have f = q-g+r = q'-g + r'. Subtraction gives 0 = (q — q') ■ g + ir — r1). If q — q', then r = r' as well. And if q ^ q', then the term of the highest degree in iq — q') • g cannot be compensated by r — r', which leads to a contradiction. We have thus proved the uniqueness of the result of the division, supposing it exists. It remains to prove that the polynomial / can always be expressed in the wanted form. If m > n, we can immediately set / = 0 • g + /. Therefore, let us suppose that n > m and prove the proposition by induction on the degree of the polynomial /. If / is of degree zero, then the statement is trivial. Let us thus suppose that the statement holds for all polynomials / of degree less than n > 0 and consider the expression h(x) = fix) — jrx"~m gix). 
Either h(x) is the zero polynomial and we have got what we have been looking for, or it is a polynomial of a lower degree and as such can be written in the desired form as h (x) = q ■ g + r whence f(x)=h(x)-and we are done. ^x«- lgix) = iq + ^x«- □ 255 CHAPTER 5. ESTABLISHING THE ZOO of degree less than four, we get the same number of equations and unknown coefficients (let us say P(x) = a3x3 + a2x2 + a\x + a0): P(\) = (a3 + a2 + a\ + a0 =0, P'(\) = 3a3 + 2a2 + ax = 1, P (2) = %a3 + 4a2 + 2a i + ciq = 3, P'(2) = \2a3 + 4a2 + ax = 3. By solving this system, we obtain the polynomial P(x) = —2x3 + 10x2 - 13* + 5. Another solution. We will use fundamental Hermite polynomials: h\(x) = 1- L ~~rr(x V 0 + (- -1) h\(x) = (5- 2x)(x - -I)2, h\{x) = (x- l)(x- 2)\ h\(x) = (x- 2)(x- I)2- Altogether, (2x - l)(x - 2)2, P(x) = 0-h\(x)+3-h\(x)+\-h\(x)+3-h\(x) -2x3+10x2-13x+5. □ 5.5. Using Lagrange interpolation, approximate cos2 1. Use the values the function takes at the points f, f, and j. Solution. First, we determine the mentioned values: cos2(j) = 1/2, cos2(^-) = 1/4, cos2(^) = 0. Then, we determine the elementary Lagrange polynomials, calculating their values at the given point. *o(l) Altogether, P(l) = (1 "f)(l 2> 8(7r" 3)(7T- 2) - ,71 M 1Z \ { 1Z 3 )\4 2> O 7T2 (1- 2> 9(7r" -4)(7T- -2) (- - 71 \ f 71 - ~) ~ 2> y 7T2 (1 2i7t- 4)(7T- 3) ^2 71 \ f 71 4){ 2 7T2 (71- 3)(7T- 2) 1 •9(7r" -4)(7T- -2) 7T2 4 * y 7T2 + 0 (5jt - 12) (jt -2) ijT2 0.288913. We may notice we did not need to calculate the third elementary polynomial. The actual value is cos2 1 = 0.291927. □ If the value / (b) equals zero for some element b e K, then we must have r — 0 in the quotient f(x) = q(x)(x —b) + r because otherwise we could not achieve f(b) — q(b) ■ 0 + r, where the degree of r is zero. We say that b is a root of the polynomial f. The degree of q is then exactly n — 1. If q also has a root, we can continue and in no more than n steps we arrive at a constant polynomial. Thus we have proved that the number of roots of any non-zero polynomial over the field K is at most the degree of the polynomial. Hence we can easily derive the following observation: Corollary. IfM. is an infinite field, then the polynomials f and g are equal as mappings if and only if they equal as sequences of coefficients. Proof. Suppose that / — g, i. e. / — g — 0, as a mapping. Therefore, the polynomial (/ — g)(x) has infinitely many roots, which is possible only if it is the zero polynomial. □ Let us realize that of course, this statement does not hold with finite fields. A simple non-example is the polynomial x2 + x over Z2 which represents a constant zero mapping. 5.3. Interpolation polynomial. It is often useful to give an easily • R or Q —>• Q, form a very nice class j« of functions of one variable. We can lay them through any set of given values. Moreover, they are easily expressible, so their value at any point can be calculated without difficulties. However, we encounter a number of problems when trying to put them in practice. The first of the problems is to quickly find the polynomial which we will lay through the given data since solving the aforementioned system of linear equations generally requires time proportional to the cubed number of given points, which is unacceptable for larger data. Another problem is slow computation of the value of a polynomial of a relatively high degree at a given point. Both of these problems can be partially bypassed by selecting a more convenient expression of the interpolation polynomial (i. e. 
we choose a basis of the corresponding vector space of all polynomials of degree at most k which is better than the standard basis 1, x, x2, ..., x" ). We will demonstrate this on one exercise: 257 CHAPTER 5. ESTABLISHING THE ZOO the result P(25) = 5.892, which is apparently wrong as the binary logarithm is an increasing function. Can you explain the origin of this error? Solution. Joe asked Google and learned that the interpolation error can be expressed as f(x)-Pn(x) (X — X0)(x — X\) . . . (x — Xn) + (n + 1)! where the point § is not known, but lies in the interval given by the least and greatest knots. The term in the fraction's numerator causes the accuracy to deteriorate by adding farther knots. □ 5.8. A week later, Joe needed to approximate Vv. He got the idea of reversing the problem and using the inverse interpolation, ie. to interchange the roles of arguments (function inputs) and values (function outputs) and to approximate the value of an appropriate function at the point 0. Describe his procedure. Solution. The function x2 — 1 takes 0 at Vv. Joe took the points x0 = 2, x\ = 2.5, and x2 = 3, with the function values —3, —0.75, and 2, respectively. Then he interchanged their roles, thus obtaining the elementary Lagrange polynomials l0(x) h(x) (x + 0.75)(x -2) 4 (-3 + 0.75X-3 - 2) ~ 45" 16 , 16 32 --x--x -\--, 99 99 33 6,3 9 —x2 H--x H--. 55 11 55 1 —x 9 2 15' For y/l, he got the approximate value 2 • Z0(0) + 2.5 • l\ (0) + 3 • l2(0) 437 165 2.6485. Additional questions: Joe made a mistake while constructing one of the elementary polynomials, try to find it. Does this mistake affect the resulting value? How could we make use of the value of the derivative at the point 2.5? □ 5.9. Find a natural spline 5 which satisfies 5(-l) = 0, 5(0) = 1, 5(1) = 0. Solution. The wanted spline consists of two cubic polynomials, let us denote them 5i for the interval [—1,0] and S2 for the interval [0, 1]. The word "natural" requires that the second derivatives of 5i and S2 be zero at the points — 1 and 1, respectively. Thanks to the given value at 0, we know that the absolute coefficients of both the polynomials are 1. By symmetry, the common value of the first derivative at the point 0 is 0. So we can set 5i (x) = ax3 + bx2 + 1 and S2(x) = cx3 + Lagrange interpolation polynomials Lagrange interpolation polynomial can easily be expressed in terms of the so-called elementary Lagrange polynomials li of degree n with the properties 1 i J |0 i*j Apparently, these polynomials must be (up to a constant) equal to the expressions (x — x$)... (x — x,_i)(x — xi+\)... (x — x„), and so Uj^ix-xj) 1 I I./. / k=0 Proof. We will proof this statement by induction on the number of the points x,. Apparently, it holds for n — 1 (and the problem is completely uninteresting for n — 0). Let us suppose that the result is correct for n — 1, i. e. n-l V(x0, ■ ■ ■, x„_i) = ]~[(Xi-Xj0. i>k=0 Now let us consider the values x$,..., x„_i to be fixed and let us vary the value of x„. Expanding the determinant by the last row (see ??), we obtain the wanted determinant as the polynomial (5.1) V(x0, ...,*„) = (x„)"V(x0,..., x„_i) - (xn)"-1 This is a polynomial of degree n since we know that its coefficient at (x„)" is non-zero, by the induction hypothesis. Apparently, it will take zero at any point x„ — x, for i < n because in that case, the original determinant contains two identical rows. Our polynomial is thus divisible by the expression (.Xn x§)(xn Xl) • • • (xn xw_i), which itself is of degree n. 
Hence it follows that the whole Vandermonde determinant (as a polynomial in the variable x„) must, up to a multiplicative constant, equal this expression, i. e. V(x0, x„) = c ■ (xn - x0)(x„ - xi) • • • (x„ - x„_i). Confronting the coefficients at the highest exponent in (5.1) with this expression yields c — V(x0, x„_i), which finishes the proof of this lemma. □ Again, we can see that the determinant will be very small if the distances of the points x, are such. 5.6. Derivatives of polynomials. We have found out that the values of the polynomials rapidly tend to infinite values as the input variable grows (see the pictures as well). Therefore, it is apparent that polynomials are unable to describe periodic events (such as the values of the trigonometric functions). One could say that we will achieve much better results, at least between the points x,, if we look not only at the function values, but also at the rate of increase of the function at those points. For this purpose, we introduce (only intuitively, for the time being) the concept of a derivative for polynomials. Again, we can work with real, complex or rational polynomials. The rate of increase of a real-valued polynomial f(x) at a point x e R is well expressed by f(x + Ax) - f(x) (5.2) Ax and since we can calculate (over an arbitrary ring) (x + Ax)* = x* +kxk~1 Ax + • • • + (^)x' (Ax)*"' + • ■(Ax)* 259 CHAPTER 5. ESTABLISHING THE ZOO Xi -2 -1 1 2 yi 1 -1 -1 1 Then find any polynomial of degree greater than three which satisfies the conditions in the table. O 5.13. Find a polynomial p(x) = ax3 + bx2 + cx + d which satisfies p(0) = l, p(l) = 0, p(2) = l, p(3) = 10. o 5.14. Construct a polynomial p of degree three or less which satisfies p(0) = 2, p{\) = 3, p(2) = 12, p(5) = 147. we get, for the polynomial f(x) quotient in the form (in V f(x+Ax)-f(x) Ax nx" 1 Ax ■ +(Axf Ax „«-2 oo, the above Ax ■+Cl\ — Ax + (n — l)a„_ix H----+ a\ + Ax(... ) where the expression in parentheses is polynomially dependent on Ax. Clearly, for the values Ax very close to zero, we get a value arbitrarily close to the following expression: Derivatives of polynomials [__^ The derivative of a polynomial f(x) — anx" respect to the variable x is the polynomial oo with o 5.15. Let the values yo, ■ ■ ■, y„ e 1 at pairwise distinct points x0,..., x„ € R, respectively, be given. How many polynomials of degree exactly n + 1 and taking the given values at the given points are there? O 5.16. Determine the Hermite interpolation polynomials P, Q satisfying P(-l) = -ll, P(l) = l, P'(-l) = 12, P'(l)=4; e(-l) = -9, Q(l) = -1, Q'(-l) = 10, Q'(l)=2. o 5.17. Replace the function / with a Hermite polynomial, knowing that f'(x) — na„x" 1 + (n — Y)an-\x" Xi -1 1 2 f(Xi) f\Xi) 4 8 -4 -8 -8 11 x0 = 0, X\ = 2, X2 = 1, yo = 0, y\ = 4, yi = 1, So = 0, = 4, y2 = 2. o 5.18. Without calculation, determine the Hermite interpolation polynomial if the following is given: >From the definition, it is clear that it is just the value f'(xo) of the derivative which gives us a good approximation of the polynomial's behavior near the point xq. To be more precise, the lines /(so + Ax) - /fa) y —-7^-(x - x0) + /(x0), Ax i. e. the secant lines of the graph of the polynomial going through the points [xq, /(xo)] and [xq+Ax, /(xq+Ax)] approach, as Ax decreases, to the line y — f'(x0)(x - x0) + /(xo), which must be the tangent to the graph of the polynomial /. We talk about linear approximation of the polynomial / by its tangent line. 
The derivative of polynomials is a linear mapping which assigns to polynomials of degree at most n polynomials of degree at most n — 1. Iterating this procedure, we obtain the second derivative /", the third derivative /(3), and generally after /c-tuple iteration, the polynomial of degree n — k. Thus the (n + l)-st derivative is the zero polynomial. This linear mapping is an example of the so-called cyclic nilpotent mappings, which are more thoroughly examined in the paragraph 3.32 on nilpotent mappings. o 5.19. Find a polynomial of degree three or less taking the value y = 4 at the point x = 1 and y = 9 at x = 2, having its derivative equal to —2 at x =0 and to 1 at x = 1. Then find a polynomial of degree three or less taking the value y = 6 at both the points x = 1 and x = — 1 and having its derivative equal to 2 at both these points. O 5.7. Hermite's interpolation problem. Again, let us consider m + 1 pairwise distinct real numbers xo,... ,xm,i. e. xi / xj for all i / j. We will want to lay polynomi-|lv als through given values, but we will now determine ^l^%s»A_ not only the values at those points, but also the first derivatives. That it, we set y, a 3/. for all i. We are looking for a polynomial / which will satisfy these conditions on the values and derivatives. 260 CHAPTER 5. ESTABLISHING THE ZOO 5.20. How many polynomials satisfying the following conditions are there? The degree is four or less, the values at x0 = 5 and x\ = 55 are yo = 55 and y\ = 5, respectively, and both the first and second derivatives at the point xq are zero. O 5.21. Find any polynomial P satisfying P(0) = 6, P(l) = 4, P(2)=4, P'(2) = 1. Analogously as in the case of interpolating the values only, we obtain the following system of 2(m + 1) equations for the coefficients of the polynomial fix) = anx" +----h ao: ao + XQfli +----h (xo)"a„ a0 + xma\ H-----h (xm)"a„ a\ + 2xofl2 + • • • + n(xo)"_1fln yo ym y0 o 5.22. Construct the natural cubic interpolation spline for the values yo = 1, yi = 0, y2 = 1 at the points xq = — 1, x\ = 0, X2 = 1, respectively. O 5.23. Construct the natural cubic interpolation spline for the function /(jc) = |jc|, jce[-l,l], selecting the points x0 = — 1, x\ = 0, x2 = 1. O 5.24. Construct the natural cubic interpolation spline for the points x0 = —3, x\ = 0, x2 = 3 and the values yo = —3, y\ = 0, y2 = 3. Q 5.25. Without calculation, construct the natural cubic interpolation spline for the points x0 = —l,xi = 0 a x2 = 2 and the value yo = yi = y2 = 1 at these points. O 5.26. Construct the complete (i. e., the derivatives at the marginal points are given) cubic interpolation spline for the points xq = —3, x\ = —2, x2 = —1 and the values y0 = 0, yi = 1, y2 = 2, y'0 = 1, y'2 = 1. O 5.27. Construct the natural cubic interpolation spline for the function a\ +2xma2■ + n(xm)" lan ym y selecting the points Xq = 0, X\ = 1, X2 = 3. o More problems concerning polynomial interpolation can be found at 315. Again, we could verify that the choice n = 2m+l makes the determinant of this system non-zero, and thus there will be exactly one solution. However, similarly to Lagrange polynomial, our polynomial / can be constructed straightaway. We just create a set of polynomials with values 0 or 1 (at both the derivatives and the values) in order to express the desired values as their linear combination. 
Verifying the following definition and proposition is left to the reader: Hermite's interpolation polynomial [___ Hermite's interpolation polynomial is defined by fundamental Hermite's polynomials: l"(xt) 1 -(X -Xi) di(x)Y h\ix) = h]ix) = where l(x) = ]1Li(x h]ixj) (h])'(xj) hjixj) (hf)'ixj) and so Hermite's interpolation polynomial is given by the expression k f(x) = J2(yih}ixi) + y'ih2ixi)y r = l i'(Xi) (x - xt) (£t(x))2 , - Xi). These polynomials satisfy: 1 for i = j 0 for i / j 0 0 5.8. Examples of Hermite's polynomials. The simplest case is the one of prescribing the value and the derivative at one point. This fully determines a polynomial of degree one fix) = f(x0) + f'ix0)ix - x0), i. e. exactly the equation of the straight line given by the value and slope at the point x$. When we set the values and the derivatives at two points, i. e. y0 = fix0), y0 = f'ixo), yi = /(*i), = f'ixi) for two distinct points x;, we still obtain an easily computable problem. Let us look at it in a simple case when xo = 0, x\ = 1. Then the matrix of the system and its inverse will be /0 0 0 1\ A = 1 0 \3 1 0 0/ A = (2 -2 1 1 \ -3 3 -2 -1 0 0 1 0 0 0 261 CHAPTER 5. ESTABLISHING THE ZOO B. Topology of the complex numbers and their subsets 5.28. Find limit, isolated, boundary, and interior points of the sets N, , X = {x e R; 0 < x < 1} inM. Solution. The set N. For any n e N, we have that 01 (n) n N = (n - 1, n + 1) n N = {n}. Hence, there is a neighborhood of n e N in R which contains only one natural number (the number n), therefore every point n e N is isolated. There are thus no interior points (an isolated point cannot be interior). A point a € R is a limit point of A if and only if every neighborhood of a contains infinitely many points of A. However, the set 0i (a) n N = (a - 1, a + 1) n N, where a e R, is finite, hence N has no limit points. By finiteness of this set, we have that Sh := inf | b — n inf \b-n\>0 for b e R \ N. neOi(b)nN Therefore, 0Sb (b) n N = 0, so no b e M \ N is a boundary point of N. We also know that every point which is not an interior point of a given set is necessarily its boundary point. The set of N's boundary points thus contains N, and so it equals N. The set Q. The rational numbers are a dense subset of the real numbers. This means that for every real number, there is a sequence of rational numbers converging to it. (We can, for instance, imagine the decimal representation of a real number and the corresponding sequence whose z'-th term will be the representation truncated to the first i decimal digits. Furthermore, we can suppose that the terms of this sequence are pairwise distinct, for example by deliberately changing the last digit, or by taking the representation with recurring nines rather than zeros, ie. 0.999... for the integer 1 and so on). The set of Q's limit points is thus the whole R and every point x e R \ Q is a boundary point. Especially, we get that any 8 -neighborhood P P --8, —h 8 I , where p, q e Z, q ^ 0, q q of a rational number p/q contains infinitely many rational numbers, hence there are no isolated points. The number 72/10" is rational for no n € N. Supposing the contrary (again, p, q sZ,q ^ 0) V2 p ^ 10> ie 10" q q we arrive at an immediate contradiction as we know that the number \fl is not rational. Every neighborhood of a rational number p/q thus contains infinitely many real numbers p/q + 72/10" (n e N) The multiplication A • iyo,y\,ylQ,y'])T gives the vector («3, ci2, a\, ao)T of coefficients of the polynomial /, i. e. 
/(*) = (2y0-2yi +/0 + /i)x3 + (-3)>o + 3yq - 2y0 - y\)x2 + y0x + y0. 5.9. Spline interpolation. Similarly, we can prescribe any finite \. number of derivatives at the particular points and a convenient choice for the upper bound on the degree of the wanted polynomial leads to a unique interpolation. We will not delay ourselves with details here. Unfortunately, these interpolations do not solve the problems mentioned already in connection with the simple interpolation of values - complexity of the computations and instability. However, the usage of derivatives allows us to improve our methods: As we have seen in the pictures demonstrating the instability of the interpolation by a single polynomial of sufficiently large degree, small local changes of the values dramatically affected the overall changes of the behavior of the resulting polynomial. Thus we may try to use small polynomial pieces of low degrees which we, however, must be able to link to one another properly. The simplest case is to link each pair of adjacent points with a polynomial of degree at most one. This is also the most frequent way of displaying data. From the view of derivatives, this means that they will be constant on each of the segments and then will change in a leap. A bit more sophisticated method is to prescribe the value and the derivative at each point, i. e. we will have four values for two points, which uniquely determines Hermite's polynomial of degree three, see above. This polynomial can then be used for all the values of the input variable between the marginal points xo < x\. We talk about the interval [xo, x{\. Such a piecewise polynomial approximation has the property that the first derivatives will be compatible. However, in practice, mere compatibility of the first derivatives is insufficient (for instance, with railway tracks), and furthermore, the values of the first derivatives are not always at our disposal. Thus we get the idea of making use of the values at the given points, and on the other hand to require equality of the first and second derivatives between the adjacent pieces of the cubic polynomials. This conditions yield the same number of equations and unknowns, and so the problem will be similarly solvable: _ | Cubic splines [___ Let xo < xi < • • • < x„ be real values at which the required values yo, ..., yn are given. A cubic interpolation spline for this assignment is a function S : R —>• R which satisfies the following conditions: • the restriction of S on the interval [x,_i, x;] is a polynomial St of degree at most three i — 1,..., n • Si(xi-i) — yt-i and St (xt) — yt for all; — 1, ...n, • S'i (xi) = (xi) for all i = 1,..., n - 1, • S'! (xt) = S'!+l{Xi) for all i = 1,..., n - 1. The cubic spline1 for n + 1 points consists of n cubic polynomials, i. e. we have An free parameters (the first condition from the The name comes from the meaning of a ruler used to draw smooth curves between points. 262 CHAPTER 5. ESTABLISHING THE ZOO which are not rational (Q, as a field, is closed under subtraction). Therefore, every point p/q € Q is boundary as well, and there are no interior points of the set Q. The set X = [0, 1). Let a e [0, 1) be an arbitrary number. Apparently, the sequences {a + {1 — ^}%Li converge to a and 1, respectively. So we have easily shown that the set of X's Umit points contains the interval [0, 1]. There are no other limit points: for any b <£ [0, 1] there is 8 > 0 such that Os (b) n [0, 1] = 0 (for b < 0 it suffices to take 8 = —b, and for b > 1 we can choose 8 = b — 1). 
Since every point of the interval [0, 1) is a limit point, there are no isolated points. For a e (0, 1), let 8a be the less one of the two positive numbers a, I — a. Considering 0Sa (a) = (a - 8a, a + 8a) c (0, 1), a € (0, 1), we see that every point of the interval (0, 1) is an interior point of X. For every 8 e (0, 1), we have that os (0) n [0, i) = (-8, s) n [0, i) = [0, s), 0& (1) n [0, 1) = (1 - 8, l+8)0 [0, l) = (l - s, 1), so every 8 -neighborhood of the point 0 contains some points of the interval [0, 1) and some points of the interval (—8,0), and every 8-neighborhood of 1 has a non-empty intersection with the intervals [0, 1), [1,1 + 8). Therefore, 0 and 1 are boundary points. Altogether, we have found that the set of X's interior points is the interval (0, 1) and the set of X's boundary points is the two-element set {0, 1}, as we know that no point can be both interior and boundary and that a boundary point must be an interior or limit point. □ 5.29. Determine the suprema and infima of the following sets in R: (-1)" ; n e N} C R, A = (-3,0]U(1,tt)U{6}; B 5.30. Find sup A and inf A for n+ (-!)" 5.31. The following sets are given: N = {1,2, ...,n, ...}, M = t7 = (0,2]U[3,5]\{4}. Determine inf N, sup M, inf J and sup J in R. ; n e N ; C = (-9, O O 1 -; n € N n definition). The other conditions then yield In + (n — 1) + (n — 1) more equalities, i. e. two parameters remain free. In practice, we often prescribe the values of the derivatives at the marginal points explicitly (the so-called complete spline), or assume they equal zero (this case is called a natural spline). Unfortunately, the computation of the whole spline is not as easy as with the independent computations of Hermite's cubic polynomials because the data mingle between adjacent intervals. However, with an appropriate order, one can obtain a matrix of the system such that all of its non-zero elements appear on three diagonals only. These matrices are nice enough to be solved in time proportional to the number of points, using a suitable numerical method. For comparison, let us look at interpolation of the same data as in the case of the Lagrange polynomial, now using splines: 2. Real number and limit processes It is important to have a sufficiently large stock of functions with which we can express all usual dependencies. However, at the same time, the choice of the functions must be carefully restricted so that we would be able to build some universal and efficient tools for the work with them. Actually, the first problem we have to solve is how to define the values of the functions at all. After all, all we can get with a finite number of multiplication and addition is polynomial functions and efficient manipulation can be done with rational numbers only. However, we cannot do with rational numbers even when looking for roots of quadratic polynomials as, for instance, \/2 is not a rational number. Thus our first step will be a thorough introducing of the so-called limit process, i. e. we will define precisely what it means that some values approach a certain value. We can also notice that an important property of polynomials is their "continuous" dependency of their values on the input variable. Intuitively said, if we change x a little bit, the value of f(x) also changes a bit only. On the other hand, this behavior is not possessed by piecewise constant functions / : R -> R near the sudden "jumps". 
For instance, the so-called Heaviside's function /(*) = 0 for all x < 0, 1/2 forx = 0, 1 for all x > 0 o has this type of "discontinuity" for x = 0. Let us formalize these intuitive statements. 263 CHAPTER 5. ESTABLISHING THE ZOO 5.32. Find a set M c R which does not have an infimum in R but has a supremum there. Similarly, find a set A7" c R which does not have a supremum in M but has an infimum there. O 5.33. Find a subset X of the set M such that sup X < inf X. Q 5.34. Find sets A, B, C c M such that AHS = 0, ARC = 0, SRC = 0, sup A = inf 5 = infC = supC. O 5.35. Mark the following sets in the complex plane: i) {z eCMz- 1| = |z + l|}, n) {z e C| 1 < \z - i\ < 2}, iii) {z € C| Re(z2) = 1}, iv) {z e C| Re(i) < ±}. Solution. • the imaginary axis, • annulus around /, • the hyperbola a2 — b2 = 1, • exterior of the unit disc centered at 1. □ C. Limits In the subsequent exercises, we will deal with calculating limits of sequences, that is what the sequences "look like at infinity". Then, if we were to determine the 72-th term of a given sequence for a very large 72, the hmit of the sequence (supposing it exists) can approximate it very well. We devote much space to computation of hmits of sequences (and limits of functions) in this exercise column, that is why they begin earlier (and end later) than in the part concerning theory. Let us begin with limits of sequences. The needful definitions can be found at page. 266. 5.10. Real numbers. So far, we have made do with algebraic properties of real numbers which claimed that R is a field. However, we have also used the relation of the standard (total) order of the real numbers, denoted "<" (see the paragraph 1.38). The properties (axioms) of the real numbers, including the connections between the relations and other operations, are enumerated in the following table. The bars indicate how the axioms gradually guarantee that the real numbers form an abelian (commutative) group with respect to addition, that R \ {0} is an abelian group with respect to multiplication, that R is a field, that the set R together with the operations +, • and the order relation is a so-called ordered field. Finally, the last axiom can be perceived as claiming that R is "sufficiently dense", i. e. there are no points missing between any points (like, for instance, \fl is missing in the rational numbers). [ Axioms of the real numbers [__< (Rl) (a + b) + c = a + (b + c), for all a,b,ceR (R2) a+b = b + a, for all a,/> e R (R3) there is an element 0 e M such that for all a e R, a+0 = a (R4) for all a e R, there is an additive inverse (—a) e R such that a + (-a) = 0 (R5) (R6) (R7) (R8) (a ■ b) ■ c = a ■ (b ■ c), for all a, b, c e R a ■ b = b ■ a for all a, b e R there is an element 1 e R, 1 / 0, such that for all a e R, 1 • a = a for all a e R, a / 0, there is a multiplicative inverse -1 such that a ■ a 1 = 1 (R9) a ■ (b + c) = a ■ b + a ■ c, for all a, b, c e R (RIO) the relation < is a total order, i. e. reflexive, antisymmetric, transitive, and total on R (Rl 1) for all a, b, c e R, a < b implies a + c < b + c (R12) for all a, b e R, a > 0 and b > 0 implies a ■ b > 0 1 (R13) every non-empty set A c has a least upper bound. which has an upper bound The conception of a least upper bound (also called supremum) must be thoroughly introduced. It makes sense for any partially ordered set, i. e. a set with a (not necessarily total) ordering relation. We will also meet it later in algebraic contexts. 
Let us remind that at the general level, an ordering relation is any binary relation on a set which is reflexive, antisymmetric, and transitive; see the paragraph 1.38. __ I Supremum and infimum [__> Definition. Let us consider a subset A c B in a partially ordered set B. An upper bound of the set A is any element b e B such that b > a holds for all a e A. Dually, we define the concept of a lower bound of the set A as an element b e B such that b < a for all a € A. The least upper bound of the set A, if it exists, is called its supremum and denoted by sup A. Dually, the greatest lower bound, if it exists, is called an infimum; we write inf A. The last axiom of our table of properties of the real numbers thus claims that for every non-empty set A of real numbers, it is true that if there is a number a which is greater than or equal to all 264 CHAPTER 5. ESTABLISHING THE ZOO 5.36. Calculate the following limits of sequences: i) lim 2"2+l"+1, ü) lim 2"2+3"+1, iii) lim n + l «^oo 2«2+3« + l ' 2"-2" V4«" -oo 2"+2-" ■ 24 iv) lim„ v) lim vi) lim -v/4«2 + n — 2n. Solution. i) lim 2«2+3« + l n + l 2«+3+7 lim n^*oo 1 + 7, ii) lim 2"2+3"+1 7 „^„o 3«z+« + l lim — n^*oo 3+ ^ + -L n „2 iii) lim n^*cx> iv) n + l 2n2+3n + l lim 1 OO. 1 + - lim . „^oo 2n+3+- 2" - 2~" 2" + 2"" lim IL _ i — + 1 2-n i 1 v) By the squeeze theorem (5.21): Vn e N : : /4n2" < y4n2+n < 4n2+n + lim n^*oo vi) 2nH Then lim - 2. So lim /4«2 lim 2« 2, lim ■ 4«2+« + v/4n2+n 2 as well. lim \f\~n1 + n — 2n lim n^*oo (V4«2 + n - 2n)(jAn2 +n + 2n) V4«2 + n + 2n n lim —_ «^oo ^4„2 _|_ „ _|_ 2n lim 1 oo ^4n2+n + 2 1 4" □ 5.37. Let c e lim ^/c = 1. n^*cx> Solution. First, let us consider c > 1. The function Hfc is decreasing (in n), yet all its values are greater than 1, hence the sequence Zfc has a limit, and this limit is equal to the infimum of the sequence's terms. Let us suppose, for a while, that thus Umit is greater than 1, that is 1 +s for some s > 0. Then by the definition of a limit, all the sequence's 2 terms will eventually (from some index m on) be less than 1 + e + 2 especially ?/c < 1 + e + . But then we have that ic < 1 +s + 1 + 2<1 + £' numbers x € A, then there is a least number with this property. For instance, the choice A = {x € Q, x2 < 2} gives us the supremum sup A = \fl. An immediate consequence of this axiom is also the existence of infima for any non-empty set of real numbers bounded from below. (It suffices to realize that changing the sign of all the numbers interchanges suprema and infima). For the formal construction of our theory, we need to know whether the properties we demand from the real numbers are realizable, i. e. whether there is such a set R with the operations and ordering relation which satisfy the thirteen axioms. So far, we have constructed correctly only the rational numbers, which form an ordered field, i. e. satisfy the axioms (Rl) - (R12), which can easily be verified. Actually, the real numbers can not only be constructed, but the construction is, up to isomorphism, unique. However, for our need, we will do with an intuitive idea of the real line. We will focus on the existence and uniqueness later on. 5.11. The complex plane. Let us remind that the complex numbers are given as pairs of real numbers. We usually write them as z = re z + i im z. Therefore, the plane C = M2 is a good image of the complex numbers. With addition and multiplication, the complex numbers satisfy the axioms (R1)-(R9) and thus form a field. 
There is, however, no natural ordering defined on them which would satisfy the axioms (R10)-R(13). Nevertheless, we will work with them as we have already seen that extending some scalars to the complex numbers is highly advantageous for calculations, and sometimes even necessary. There is an important operation on the complex numbers, the so-called conjugation. It is the reflection symmetry with respect to the line of real numbers, i. e. changing the sign of the imaginary part. We denote it by a bar over the number z e C: z = rez — i imz. Since for z = x + iy, z ■ z = (x + iy) (x — iy) = x2 + y2, this value expresses the squared distance of the complex numbers from the origin (zero). The square root of this non-negative real number is called the absolute value of the complex number z; we write (a positive real number). We will show that (5.3) z • z. The absolute value is also defined on any ordered field of scalars K, we just define the absolute value \a\ as follows: Ia if a > 0 —a if a < 0. Of course, it is true that for any numbers a, b e K, (5.4) \a+b\ < \a\ + \b\. This property is called the triangle inequality. It also holds for the absolute value of the complex numbers, which was defined above. Especially for the field of rational numbers and the field of real numbers, which are subfields of the complex numbers, both definitions of the absolute value coincide. 265 CHAPTER 5. ESTABLISHING THE ZOO which contradicts our assumption that 1 + e is the infimum of the considered sequence. The theorem is trivial for c = 1, and for a number c € (0, 1) it follows from the above, if we invoke the theorem for the number l/c. □ 5.38. Determine lim J/n. Solution. Apparently, we have j/n > 1, n e N. So we can set rfn = 1 + a„ for certain numbers a„ > 0, neN. By the binomial theorem we get that n = (1 + a„)n = 1 + Qo„ + Qa2 + ••• + <, n > 2 (n e N). Hence we have the bound (all the numbers a„ are non-negative) n > i ic 72 (?2 — 1) which leads to 0 < a„ < 72-1 72 > 2(72 € N), 72 > 2(72 e N). By the squeeze theorem, 0 = lim 0 < lim an < lim n^-oc n^-oc n^-oc Thus we have obtained the result 72 - 1 lim t/n = lim (1 + a„) = 1 + 0 = 1. We can notice that by further application of the squeeze theorem, we get 1 = lim 1 < lim Ifc < lim j/n~ = 1 for every real number c > 1. □ 5.39. Calculate the limit um (y/i.yi.yi... 272). Solution. To determine the hmit, it is sufficient to express the terms in the form 22 • 2? • 2s • • • 22* = 22+?+5+'"+2^. Thus we get lim (y/2 .Zfl.yi... 2lfl \ = lim 2 n—>oo \ / n—>oo lim ( i+i+i. 2+4 + 8 _ J_ "2" I + I + I. 2+4 + 8 2«=i 5.12. Convergence of a sequence. In the following paragraphs, we will work with one of the number sets K of ratio-X, nal, real, or complex numbers. The absolute value thus must be understood in the corresponding context, and we should also bear in mind that the triangle inequality holds in all these cases. We would like to formalize the notion of a sequence of numbers approaching a limit. Therefore, the key object of our interest will be sequences of numbers a,, where the index i usually goes throughout the natural numbers. We will denote the sequences either loosely as ao, a\,..., or as infinite vectors (ao, ai,...), or (similarly to the matrix notation) as (flr)^i- _ | Cauchy sequences [___ Let us consider a sequence (ao, a l, • • •) of elements of K such that for any fixed positive number e > 0, it holds for all but finitely many terms a, of the sequence that for all but finitely many terms a;, a,- — a,- < e. 
I In other words, for any fixed e > 0, there is an index N such that the above inequality holds for all i, j > N; i. e. the elements of the sequence are eventually arbitrarily close to each other. Such a sequence is called a Cauchy sequence. Intuitively, we feel that either all but finitely many of the sequence's terms are equal (then \at — aj \ =0 will hold from some index N on), or they "approach" some value. This is easily imaginable in the complex plane: choosing an arbitrarily small disc (with radius equal to e), then, supposing we have a Cauchy sequence, it must be possible to put it into the complex plane in such a way that it covers all but finitely many of the elements of the infinite sequence a,. We can imagine that the disc gradually shrinks to a single value a; see the picture. tos-wutnost O fJOMri&CNltff cjsel- If such a value a e K exists for a Cauchy sequence, we would expect the sequence to have the property of convergence: __ [ Convergent sequences [__> We say that a sequence (a;)°^0 converges to a value a iff for any positive real number e, fl; — a\ < s 1 holds for all but finitely many indeces i (the set of those i for which the inequality does not hold may depend on e). The number a is called the limit of the sequence (fli)^0. If a sequence a, e K, i — 0, 1,..., converges to a e K, then for any fixed positive e, we know that \at —a\ < s for all i greater than a certain N e N. However, by the triangle inequality, we then get that for all pairs of indeces i, j > N, it is true that \ai — aj \ — \cii — ax + o/v — Qj \ < Wi — fl/v I + \aN ~ aj \ < 2e. Thus we have proved: Lemma. Every converging sequence is a Cauchy sequence. 266 CHAPTER 5. ESTABLISHING THE ZOO By the well-known formula for the sum of geometric series, 00 /1 \ « 1 whence it follows that lim (V2-^2-^2---72)=21=2. n—>co \ / □ 5.40. Determine 1 2 n — 2 72 — 1 lim — + — + ••• + —— + O 5.41. Calculate V«3 — H«2 + 2 + ^Jn1 — 2n5 — n3 — n + sin2 n lim - 2 - ^5t24 + 2n3 + 5 5.42. Determine the limit n\ + (n-2)\- (n -4)! o lim n50 +n\- (n - 1)! O 5.43. Find two sequences (let use denote their terms by x„ and y„ (n e N), respectively) having infinite limits and such that lim (x„ +y„) = l, lim (x„ y2) = +00. 5.44. Determine the limit points of the sequence given by (-l)"2n o V4«2 + 5n + 3 neN. O 5.45. Calculate if lim sup a„ and lim inf a„ n2 + An — 5 9 H7T --- sin —, n e N. «2 + 9 4 O 5.46. Determine lim inf ( (-1)" ( 1 + -J +sin — o However, in the field of rational numbers, it can easily happen that the corresponding value a does not exist even for a Cauchy sequence. For instance, the number \fl can be approached by rational numbers at with arbitrary accuracy, thereby obtaining a sequence converging to \fl, but the limit is not rational. Ordered fields of scalars in which every Cauchy sequence is converging are called complete. The following theorem proposes that the axiom (R13) guarantees that the real numbers are such a field: Theorem. Every Cauchy sequence of real numbers ai converges to a real value a e M. Proof. The terms of any Cauchy sequence form a bounded set Q since any choice of e bounds all but finitely many of them. Let us define B as the set of those real num- ^sSgSiLg bers x for which x < aj holds for all but finitely many terms aj of the sequence. Apparently, B has an upper bound, and thus has a supremum as well, by (R13). Let us define a — sup 5. 
Now, having fixed some e > 0, we choose N so that \at -Especially, aj > a^ — s and aj < apj -and so a^ — s belongs to B, while a^ we get that \a — a^\ < s, and thus aj\ < s for all i, j > N. s for all indeces j > N, - s does not. Altogether, aj\ < ■ aN\ |fljv — aj \ < 2e for all j > N. However, this means that a is the limit of the considered sequence. □ Corollary. Every Cauchy sequence of complex numbers n converges to a complex number z. Proof. Letuswritezi — at+ibi. Since |a,— aj\2 < \zi—Zj\2 and similarly for the values bt, both sequences of real numbers a, and bt are Cauchy sequences. They converge to a and b, respectively, and we can easily verify that z — a + 1' b is the limit of the sequence zi. □ 5.13. Remark. The previous discussion gives us a method for defining the real numbers. We proceed similarly to building the integers from the natural numbers (adding all additive inverses) and building the rational numbers from the integers (adding all multiplicative inverses of non-zero numbers). This time, we "complete" the rational numbers by all limits of Cauchy sequences. It suggests itself to introduce a suitable equivalence relation on the set of all Cauchy sequences of rational numbers so that Cauchy sequences (a,■) and (bt) are equivalent iff the distances | a, — bt I converge to zero (this is the same as the condition that merging these sequences into a single sequence-for instance, the terms of the first sequence will become the odd terms of the resulting one and the terms of the second sequence will be the even ones-yields a Cauchy sequence as well). We will not verify that this relation is an equivalence in detail, neither will we define the operations and the ordering relation, nor will we prove that all of the axioms will indeed hold. Nevertheless, it is not difficult. Nor is proving the fact that the axioms (R1)-(R13) define the real numbers uniquely up to isomorphism (a bijective mapping preserving the algebraic operations as well as the ordering). We will return to this notes later. 267 CHAPTER 5. ESTABLISHING THE ZOO 5.47. Now let us proceed with limits of functions. The definition can be found at page 272. Determine (a) lim sinx; (b) (c) (d) lim x + x 2 x2 - 3x + 2' lim arccos- X^+QO \ X + 1 lim x + x (jc-2)(jc+3) ,. x + 3 2 + 3 c lim-:-— = lim-- =--- = 5 ►2 X2 - 3x + 2 x^2 (X - 2) (X - 1) ^21-1 2-1 leads to the correct result (thanks to continuity of the obtained at function at the point x0 = 2). Let us realize that the Umit of a function can be calculated from the function values in an arbitrarily small deleted neighborhood of a given point xo and that the Umit does not depend on the function value at the point. We can thus make use of multiplying or reducing by factors which do not change the function values in an arbitrarily selected deleted neighborhood of the point x0. Exercise (c). By moving the limit inwards twice, the original Umit transforms to arccos I lim - >+°° x + 1 It can easily be shown that lim 1 0. ► +oo x + 1 As the function y = arccos x is continuous at the point 0 and takes the value 7t/2 there, and the function y = x3 is continuous at jt/2, we get that lim (arccos—-—| = (arccos ( lim —-—| | = (— :^+cx) \ x + 1 / V V^+CX) x + 1 / / ^ 2 5.14. Closed sets. Four our further work with the real or complex numbers, we will need to thoroughly understand the notions of closeness, boundedness, convergence, and so on. 
For any subset A of points in K, we will be interested not only of the points belonging to a e A, but also in the ones which can be approached by limits of sequences. Limit points of a set [__^ 1 , lim arctg —, lim arctg x , lim arctg (sin x) . x^—oo x x^—oo x^—oo Solution. Exercise (a). Let us remind that a function / is, by definition, continuous at a given point x iff the limit of / at x is equal to the function value f(x). However, we know that the function y = sin x is continuous at every real number. Thus we get that ,. . . TV V3 lim sin x = sin — = —. x^n/3 3 2 Exercise (b). The immediate substitution x = 2 leads to both zero numerator and zero denominator. Despite that, the problem can be solved very easily. The reduction Let us consider a set A of points belonging to K. A point x € K is called a limit point of the set A iff there is a sequence «0, «i, • • • of elements of A such that all its terms differ from x, yet its limit is x. The limit points of a subset A of rational, real, or complex numbers are those numbers x which can be approached by such sequences of numbers lying in A which do not contain the point x itself. Let us notice that a limit point of a set may or may not belong to it. For every non-empty set A c K and a fixed point x e K, the set of all distances |x — a\, a e A, is a set of real numbers bounded from below, and so it has an infimum d(x, A), which is called the distance of the point x from the set A. Let us notice that d(x, A) — 0 if and only if x e A or x is a limit point of A. (We suggest that the reader prove this in detail from the definitions.) [ Closed sets |___ The closure A of a set A c K is the set of those points which have zero distance from A (note that the distance from the empty set of points is undefined, therefore 0 = 0). A closed subset in K is such a set which coincides with its closure. Thus these are exactly those sets which contain all of its limit points as well. There is a typical example of a closed set: a closed interval [a, b] — {x e R, a < x < b] of real numbers, where a and b are fixed real numbers. If either of the boundary values of the interval is missing, we write a — —oo (minus infinity) and similarly b — +oo. Such closed intervals are denoted by (—oo, b], [a, oo), and (—oo, oo). The closed sets are exactly those which contain all they can "converge to". A closed set may be formed by a sequence of real numbers without a limit point or a sequence with a finite number of limit points together with these points. The unit disc (including its boundary circle) in the complex plane is another example of a closed set. We can easily verify that any intersection and any finite union of closed set is again a closed set. Indeed, if all of the points of some sequence belong to the considered intersection of closed sets, then they belong to each of the sets, and so do all the limit points. However, if we wanted to say the same about an arbitrary union, we would get in trouble: singleton sets are closed, but a sequence of points created from them may not be. On the other hand, if we restrict our attention to finite unions and consider a limit point of some sequence lying in this union, then the limit point must also be the limit point of any subsequence, especially the one lying in only one of the united sets. As this set is assumed to be closed, the limit point lies in it, and thus it lies in the whole union. 268 CHAPTER 5. ESTABLISHING THE ZOO Exercise (d). 
The function y = arctg x has properties which are "useful when calculating hmits" - it is continuous and injective (increasing) on the whole domain. These properties always (with no further conditions or hmitations) allow to move the examined limit into the argument of such a function. Therefore, let us consider arctg ( lim — J , arctg ( lim x4 J , arctg ( lim sinx \ x^ — oo x I \x^—oo I \x^— oo Apparently, 1 lim - = 0, x^-oo x lim x = +oo x^> — oo and the limit lim^-oo sinx does not exist, which implies 4 1 4 7T lim arctg - = arctg 0 = 0, lim arctg x = lim arctg y = — —* —oo x x^—oo y^+oo 2 and the last hmit does not exist, either. □ 5.48. Determine the limit lim 1 — cos x >b x2 sin(x2) Solution. 1 — cos X lim >o X2 sin(x2) lim 2 sin2 (I) >o X2 sin(x2) lim 1 ^ (!) 2 . 2sm V2 >0 (f) sin(x2) 1 / sin - lim — 2 \*->o (f)V 1 — -lim 1 - • oo = oo. 2 j - >osin2(x2) 2 The previous calculation must be considered "from the back". Since the limits on the right-hand side exist (no matter whether finite or infinite) and the expression \ ■ oo is meaningful (see the note after theorem 5.22), the original hmit exists as well. If we split the original limit into the product 1 lim(l - cosx) • lim , x^o x^o x2 sm(xz) we would get the 0 • oo type, which is an indeterminate form, but this tells us nothing about existence of the original limit. □ 5.49. Determine the following limits: i) lim^2 in) lim^o x-2 •Jx1-^ ii) lim^o sin (sin x) iv) lim^o e~> Solution. x — 2 x — 2 i) lim , = lim Vx - 2 0 lim = - = 0. ►2 ^/x2 _ 4 x^2 J(x -2)(x + 2) x^2 Vx + 2 4 ... ,. x-2 (5.27),. siny 11) lim , = lim-= 1, 5.15. Open sets. There is another useful type of subsets of the real numbers: open intervals (a, b) = {x € R; a < x < b], where, again, a and b are fixed real numbers or infinite values ±oo. It is an open set, in the following sense: __\ Open sets and neighborhoods of points |__- An open set in K is a set whose complement is a closed set. A neighborhood of a point a e K is any open set O which contains a. If the neighborhood is defined as Og(a) = {x e K, |x - a\ < 8} for some positive number 8, then we call it the S -neighborhood of the point a. | Let us notice that for any set A, a e K is a limit point of A if and only if every neighborhood of a contains at least one more point b e A,b ^ a. Lemma. A set A cK of numbers is open if and only if with every point a € A, an entire neighborhood of a belongs to A. Proof. Let A be an open set and a e A. If there were no neighborhood of the point a inside A, there would be a sequence a„ <£ A, \a — an \ < 1/n. But then the point a e A is a limit point of the set K \ A, which is impossible since the complement of A is closed. Now let us suppose that every a e A has an entire neighborhood of its lying in A. This naturally prevents a limit point b of the set K \ A to lie in A. Thus the set K \ A is closed, and so A is open. □ From this lemma, it immediately follows that any union of open sets results in an open set, and further than any finite intersection of open sets is also an open set. In the case of the real numbers, the ^-neighborhood of a point a is the open interval of length 28, centered at a. In the complex plane, it is the disc of radius 8, also centered at a. 5.16. Bounded and compact number sets. The closed and open sets are the basic concepts of topology. Without going into deeper connections, we have just made ourselves familiar with the topology of the real line and the topology of the complex plane. 
The following concepts will be extremely useful: ___j Bounded and compact sets J___i A set A of rational, real, or complex numbers is called bounded iff there is a positive real number r such that \z\ < r for all numbers z e A. Otherwise, the set is called unbounded. A set which is both bounded and closed is called compact. ►2 Vx2 - 4 y^o y Closed bounded intervals of real numbers are a typical example of compact sets. Let us add further topological concepts that will allow us to express efficiently: An interior point of a set A of real or complex numbers is such a point that one of its neighborhoods is contained in A. A boundary point of a set A is such a point that all its neighborhoods are disjoint with neither A, nor its complement K \ A. A boundary point of the set A may or may not belong to it. 269 CHAPTER 5. ESTABLISHING THE ZOO where we made use of the fact that lim sin x = 0. x^O sin x iii) lim- x^O X smx lim sin x ■ lim- x^O x^O X 0-1=0, again, the original limit exists because both the right-hand side limits exist and their product is well-defined. iv) One must be cautious when calculating this limit. Both one-sided limits exist, but are different, which implies that the examined Umit does not exist: lim e x^0+ oo, lim x^O- □ 5.50. Calculate (a) lim* (c) lim* Solution. x+2 >2 (x-lf ' (b) (d) lim* lim* x+2 >2 (x-2)5 ' ► +0o - In this exercise, we will be concerned with so-called indeterminate forms. We recommend perceiving indeterminate forms as a helping concept which is only to facilitate the first approach to limit calculations because the obtained indeterminate form only means that one "has found out nothing". We know the limit of a sum is the sum of the limits, the limit of a product is the product of the Umits, and the limit of a quotient is the quotient of the limits, supposing the particular Umits exist and do not lead to one of the following expressions oo — oo, 0 • oo, 0/0, oo/oo, which are called indeterminate forms. For completeness, let us add that these rules can be combined and that an expression containing an indeterminate form is itself considered an indeterminate form. For instance, the forms -oo + oo 0 oo (—oo)3+oo are all indeterminate, but the forms 0 - oo, T-^- = ' 3+oo 0 • (oo — oo)" 0 -oo oo, 3 + oo (—oo)3 oo can be called "determinate" (one can immediately determine the Umit - they correspond to the values — oo, 0, 0, respectively). In exercise (a), the quotient of the numerator and the denominator gives us 4/0. Expressions containing division by zero are inappropriate (later, we should be able to avoid them). Yet it leads to the result, it is not an indeterminate form. We may notice that the denominator An open cover of a set A is such a system of open sets Ut, i e I, that its union contains the whole of A. An isolated point of a set A is a point a e A such that there is a neighborhood Nofa satisfying N n A — {a}. ^ ~ t 1 -' 5.17. Theorem. All subsets A of the real numbers satisfy: (1) a non-empty set A is open iff it is a union of countably (or finitely) many open intervals, (2) every point a e A is either interior or boundary, (3) every boundary point of A is either an isolated or a limit point of A, (4) A is compact iff every infinite sequence contained in it has a subsequence converging to a point in A, (5) A is compact iff each of its open covers contains a finite sub-cover Proof. (1) Apparently every open set is some union of neighborhoods of its points, i. e. of open intervals. 
So the question that remains is whether it suffices to take countably many of them. Thus we may try to select intervals which will be as "great" as possible. We will consider points a, b e A to be related iff the whole open interval (min{a, b], max{a, b}) is contained in A. Clearly, this relation is an equivalence (the open interval (a, a) is the empty set, which is contained in any set; symmetry and transitivity are apparent). The classes of this equivalence relation are intervals which are pairwise disjoint. Each of these intervals surely contains a rational number, and the obtained rational numbers are also pairwise distinct. However, there are only countably many rational numbers, so the statement is proved. (2) It follows immediately from the definitions that no point can be both interior and boundary. Let a e A be a point that is not interior. Then there is a sequence of points at ^ A with a as its limit point. At the same time, a belongs to each of its neighborhoods. Thus a is boundary. (3) Suppose that a e A is boundary but not isolated. Then, similarly to the reasoning from the previous paragraph, there are points at, this time inside A, whose limit point is a. (4) Suppose that A is a compact set, i. e. both closed and bounded. Let us consider an infinite sequence of points at e A. This set surely has both a supremum b and an infimum a (we could have taken any upper and lower bounds of the set A as well). Now let us cut the interval [a, b] into halves: [a, \(b — a)] and [j(b — a),b]. At least one of them contains infinitely many of the terms at. We will select this half and one of the terms contained in it; 270 CHAPTER 5. ESTABLISHING THE ZOO approaches zero from the right (for x ^ 2 we have that (x — 2)6 > 0). We write this as 4/ + 0. Thus the numerator and denominator are both positive in some deleted neighborhood of the point x0 = 2 and one can say that the denominator, at the Umit point, is "infinitely times less" than the numerator, that is x + 2 lim +oo, = +oo (similarly, we can set - 7i (x - 2f which corresponds to setting 4/ + 0 4/ - 0 = -oo). When calculating the limit of (b), one can proceed analogously. Since the numbers have the same sign, we get that x + 2 x + 2 lim +oo 7^ —oo lim ►2+ (X - 2)5 x^2- (x - 2)5 ' so the examined Umit does not exist. We can write 4/±0 (or, more generally, a/ ± 0, a 7^ 0, a € R*), which is a "determinate form". When thoroughly distinguishing the symbols +0 and —0 from ±0, a/ ± 0 for a 0 always means the limit in question does not exist. Exercises (c), (d). If fix) > 0 for all considered x e R, then f(x)g{x) = eln(/w*w) = esM-ln/M. Making use of the fact that the exponential function is continuous and injective on the whole of its domain (R), we can replace the Umit g(x) with lim fix) X—>XQ lim (g(x)-lnf(x)) Let us remind that either of these limits exists if and only if the other one exists. Further, lim igix) ■ Infix)) = a e R X^-XQ lim igix) ■ Infix)) = +oo X—>XQ lim igix)-In fix)) -OO lim fix) X—>XQ lim fix) X—>XQ lim fix) X—>XQ g(x) g(x) g(x) e , +oo, 0. Thus we can write lim fix) X—>Xq g(x) lim g(x)- lim In f(x) X—^A'q X—^A'q if both limits on the right-hand side exist and do not lead to the indeterminate form 0 • oo. It is not difficult to realize that this indeterminate form can only be obtained in three cases, corresponding to the remaining indeterminate forms 0°, oo°, 1°°, when we have, respectively, that and and lim fix) = 0 X^-Xq lim f{x) = +oo X^-Xq lim fix) = 1 X^XQ and lim gix) = 0, X^XQ lim gix) = 0, X^XQ lim g(x) = ±oo. 
X^-XQ then we cut the selected interval into halves. Again, we select such a half which contains infinitely many of the sequence's terms and select one of those points. By this procedure, we obtain a Cauchy sequence (you can prove this by yourselves; all you need is careful manipulation with some bound, similarly as above). However, we know that Cauchy sequences have limit points or are constant up to finitely many exceptions. Thus there is a subsequence with the wanted limit. >From the fact that A is closed, it follows that the obtained point lies in A. Now the other direction: if every infinite subset of A has a Umit point in A, then all limit points are in A, and so A is closed. If A were not bounded, we would be able to find an increasing or decreasing sequence such that the differences of adjacent numbers would be at least 1, for instance. However, such a sequence of points in A cannot have a Umit point at all. (5) First, let us focus on the easier implication, i. e. let us suppose that every open cover contains a finite one and prove that A is both closed and bounded. Apparently, A - c lies in B, which is impossible. The original choice of c < b led to a contradiction, which proves the desired equaUty b = c. Now, with the help of a neighborhood of b lying in C, we can find in C a finite cover for the whole of A. □ 271 CHAPTER 5. ESTABLISHING THE ZOO In other cases, knowledge (and existence) of the limits lim fix), lim gix) x^xq x^xq allows us to determine the result (having defined some more expressions) lim fix) x—>xq lim fix) x^xq J™ g(x) Since 1\ 1 lim I 2 H— J =2, lim — = 0, lim x = +oo, we have that lim 2 + - x^+oo \ x lim x x^>+oo lim x^+oo \ x 2° = 1, 1 or lim x x = lim (xx) =0. The last result can be expressed as 0°° = 0 or oo°° = oo, oo-1 = 0 (let us emphasize that these are not indeterminate forms). Although we have laid great emphasis on the reader to prefer reasoning about the Umit behavior of functions to mindless labeling of the forms as determinate and indeterminate, it is, we hope, clear now why we will focus on the indeterminate ones. □ 5.51. Calculate lim sinx + txx +oo 2 cos x — 1 — x2 ' 3*+i + x5 - Ax lim lim +oc y + 2X + x2 ' 4X - 8x6 - 2X - 167 lim x^>+oo 3X - 45x - Vnjtx+l2 Jx — sin3 x + x arctg x Vl + 2x + x2 Solution. Having reduced the first fraction by the polynomial x2, we get lim sinx + Tlx lim v2 x^+oo 2COSX — 1 — X2 x^+oo Boundedness of the expressions 2 cos x — 1 r2 1 | sin x | < 1, | 2 cos x — 11 < 3 pro x e and x2 -» +oo for x -» +oo give us the result lim x^>+oo —T +x 0 + jt 2 cos x — 1 r2 1 0-1 -71. 5.18. Limits of functions and sequences. For the discussion of limits, it is advantageous to extend the set R of real numbers by the two infinite values ±oo as we have done when defining intervals. A neighborhood of infinity is any interval (a, oo). Similarly, any interval (—oo, a) is a neighborhood of — oo. Further, we will extend the concept of a limit point so that oo is a limit point of a set A c R iff every neighborhood of oo has a non-empty intersection with it, i. e. if the set A is unbounded from above. Similarly for —oo. We talk about infinite limit points, sometimes also called improper limit points of the set A. _ "Calculations with infinities" _. We also introduce rules for calculation with the formally added values ±oo and arbitrary "finite" numbers a e R: a + oo = oo a — oo = —oo a ■ oo = oo, if a > 0 a ■ oo = —oo, if a < 0 a ■ (—oo) = —oo, if a > 0 a ■ (—oo) = oo, if a < 0 a ±oo = 0, for all a / 0. 
The following definition covers many cases of limit processes and needs to be thoroughly understood. We will go through the particular cases in detail presently. _ [ Real and complex limits [__^ Definition. Let us consider a subset A c R and a real-valued function / : A —>• R or a complex-valued function / : A —>• C, defined on A. Further, let us consider ste^!3 a Umit point xq of the set A (i. e. a real number or ±oo). We sat that / has limit a e R (or a complex limit a e C) at the point xo and write lim f(x) = a x^xq iff for every neighborhood 0(a) of the point a, there is a neighbor-hood0(xo) ofthe point xo such that for all x e An(0(xo)\{xo}), it holds that/(x) e 0(a). In the case of a real-valued function, a = ±oo can also be the Umit. Such a Umit is called infinite or improper. In the other case, i. e. a e R, we say the Umit is finite or proper. It is important to notice that the value of / at xo has no occurrence in the definition, and the function / may even not be defined at this Umit point (and in the case of an improper Umit point, it cannot, of course)! We often talk about a deleted neighborhood 0(x) \ {x} of those points where we are interested in the function values. For now, we will not define improper Umits of complex functions. 5.19. The most often cases of domains. Our definition of a Umit covers several very dissimilar situations: 272 CHAPTER 5. ESTABLISHING THE ZOO In the last argumentation, we actually used the squeeze theorem and the notation c/oo = 0 which is valid for any c € R (or bounded/ oo = 0, where "bounded" denotes a bounded function). This procedure can be generalized. Any limit of the form /iW + fi(x) + ••• + /„ (x) lim >x0 gi(x) + g2(x) H----+ g„(x) where satisfies lim lim =0, ie {2, x^x0 f\ (X) lim -=0, i e {2, x^x0 gl(x) ft(x) + f2(x) + ■ ■ ■ + fm(x) m lim >x0 gl(x) + g2(x) H----+gn(x) x^x0 gi(x) supposing the limit on the right-hand side exists. It is advantageous to realize (the third limit can be determined, for example, by l'Hospital's rule, with which we will make ourselves familiar later) lim — = 0, lim — = 0, lim — = 0, lim — = 0 for eel, 0 < a < ß, 1 < a < Hence we immediately have that 1X + 1 lim + x5 - Ax lim 3-3" lim +oo 3* + 2X + X2 x^+oo 3 Ax - 8x6 - 2X - 167 3; lim >+oc y _ 45x _ Jllxx + U x^+oc _7TT7T12 • 7tx If we realize that -oo. 71 lim arctgx = — > 1, x^+oo 2 we will also obtain that lim x^+oo x — sin x + x arctg x Vl + 2x +x2 x arctg x lim -—— = lim arctgx x^+oo x/x2 x^>+oc TV □ 5.52. Determine the limits lim 1 1 1 1 «^oovl-2 2-3 3-4 1 1 lim + + ••• + (n — 1) • n 1 Mn2 + 1 V«2 + 2 V«2 + n. Solution. Since for every natural number k > 2 it holds that (what we do here is called partial fraction decomposition - we will present it in detail in the chapter concerning integration of rational functions) 1 _ 1 1 (k-l)k ~ k - 1 ~ k' (1) Limits of sequences. If A — N, i. e. the function / is defined for the natural numbers only, we talk about limits of sequences of real or complex numbers. In this case, the only limit point of the domain is oo, and we often write the values (terms) of the sequence as fin) — an and the limit in the form lim a„ — a. According to the definition, this means that for any neighborhood 0(a) of the limit value a, there is an index N e N such that a„ e 0(a) for all n > N. Actually, we have only reformulated the definition of convergence of a sequence (see 5.12). We have only added the possibility of infinite limits. We also say that the sequence an converges to a. 
We can easily see from our definition for complex numbers that a sequence of complex values has limit a if and only if the real parts of at converge to re a and the imaginary parts converge to ima. (2) Limits of functions at an interior point of an interval. If / is defined on the interval A — (a,b) and xq is an interior point of this interval, we talk about the limit of a function at an interior point of its domain. Usually, we write lim fix) — a. x^xq Let us examine why it is important to require fix) e 0(a) only for the points x / xq in this case as well. As an example, let us consider the function / : R —>• R |0 ifx/0 1 ifx=0. Apparently, the limit at zero is well-defined, and in accordance with our expectations, lim^o fix) — 0 even though the value /(0) — 1 does not belong into small neighborhoods of the limit point 0. (3) One-sided limits. If A — [a, b] is a bounded interval and xo — a or xo — b, we talk about a one-sided limit of the function / at the point xq: from the left and from the right, respectively. If the point xo is an interior point of the domain of /, we can, in order to determine the limit, consider the domain restricted to [xo, b] or [a, xo]. The resulting limits are also called a right-sided limit and left-sided limit, respectively, of the function / at the point xq. We denote them by limx^x+ fix) and lim^^- fix), respectively. As an example, we can consider the one-sided limits at xo — 0 for Heaviside's function h from the beginning of this part. Apparently, lim h(x) — 1, lim h(x) — 0. However, the limit lim^o fix) does not exist. It follows from out definitions that the limit at an interior point of the domain of an arbitrary function / exists if and only if both one-sided limits exist and are equal. 5.20. Further examples of limits. (1) The limit of a complex function / : A —>• C exists if and only if the limits of both the real part and the imaginary part exist. In this case, we have lim fix) — lim (re/(x)) + i lim (im/(x)). x^xq x^xq x^xq The proof is straightforward and makes direct use of the definition of distances are neighborhoods of the points in the complex plane. 273 CHAPTER 5. ESTABLISHING THE ZOO we get that lim 1 1 1 + z-—: + —— + ■■■ + 1 1-2 2-3 3-4 (n — 1) • n 1 1 ,111111 lim----1-----1-----1-----h «^ooVl 22334 n-l n lim I 1 - - 1. Let us remark that this limit is quite important: it determines the sum of one of the so-called telescoping series (with which Johann I Bernoulli (1667-1748) worked). To determine the second limit, we invoke the squeeze theorem. The bounds 1 11 In 1 +••• + :+••• + ■ •Jn2 + n y/n2 +n 1 1 +••• + V«2 + 1 V«2 + n V«2 + 1 for n € N give that 1 + ••• + •Jn2 + n \Jn2 +n 1 n V«2 +1 V«2 +1 lim —-== V«2 + n < lim _ w«2 +1 n < lim —=. n^°° V«2 +1 + ••• + V«2 + Since lim lim V«2 + n n-xx>^/„2 1, lim lim V«2 + l -/"2 1, we also have that lim 1 1 + V«2 + 1 V«2 + 2 + ••• + 1 V«2 + n □ 5.53. Calculate (a) (b) -v/1 + X — \J\ — X lim-; x^O X lim cosx — smx yyr/4 cos (2x) (c) lim ^ f ^x2 + 2x + 3 - + 2x + 2) . Solution. We will calculate the wanted limits using the method of multiplying both the numerator and the denominator by a suitable expression. The first fraction can be conveniently extended by Vl +X + y/l-X Indeed, the membership into a ^-neighborhood of a complex value z is guaranteed by the real (1 /\/2)S-neighborhoods of the real and the imaginary parts of z. Hence the proposition follows immediately. (2) Let / be a real or complex polynomial. Then for every point x € R, it holds that Really, if/(x) lim f(x) = f(x0). 
x^xq anx" + ■■■ +ao, then the identity (xo + S)k — ■ + Sk, substituted for k — 0,..., n, gives that choosing a sufficiently small S makes the values arbitrarily close to f(x0). (3) Now, let us consider the following, quite awful, function defined on the whole real line x* XQ Mx\~l I 1 if x e 0 if x i It is apparent straight from the definition that this function has an (even one-sided) limit at no point of its domain. (4) The following function is even trickier than the previous one. Let / : R —>• R be the function defined as follows:2 /(*) if x — ifx i a p, q relatively prime Choosing any point x, no matter whether rational or irrational, \. and a huge natural number m, then x will belong to \>, exactly one of the intervals ^jjp) for some n (if x — £ we consider only coprime m > q). We set Sjc to be the minimum of the distances of the point x from the edges of these intervals for the considered m less than k. Of course, it always holds that &k < \- Now, let us consider some e > 0 and k such that j < e. Then for all y in the deleted <5-neighborhood of the point x, we have either f(y) = 0 (if it is an irrational value) or f(y) < j for r > k (if it is a rational value). In either case, we get that \f(y)\ < e. Therefore, this function's limit is zero at all real points x. However, only at the irrational points, this limit equals the function value. 5.21. Theorem (The squeeze theorem). Let f, g, h be three real-valued functions with the same domain A and such that there is a deleted neighborhood of a limit point xq e R of the domain where fix) < g(x) < h(x). Then, supposing there are limits lim f(x) = f0, lim h(x) = h0 x^xq x^xq and fo — ho, the limit lim g(x) = go x^xq exists as well and it satisfies go — fo — ho. This function is called Thomae function after a German mathematician J. Thomae, 1840-1921. 274 CHAPTER 5. ESTABLISHING THE ZOO and making use of the well-known formula (a — b)(a + b) = a2 — b2. Thus we obtain ,. VI +x - VI -x ,. (1+jc) - (1 -jc) lim- = lim ——-=--=— X x^O x (Vl +x + Vl - x) 2 2 = lim x^O >o vi +x + Vi - x VT + VT 1. Similarly we can calculate cosx — sinx lim yyr/4 cos (2jc) lim (cos x + sin x) (cos x — sin x) >n/A (cos x + sin x) cos (2x) cos2 x — sin2 x lim x^n/4 (cosx + sinx) cos (2x) 1 1 lim -:- = —-- x^jt/4 cosx + Sinx V2 i V2 The reduction was made thanks to the identity cos (2x) = cos2 x — sin2 x, iel As for the last Umit, to make use of the formula (a - b) [a2 + ab + b2) = a3 - b3, we need the expression V2 2 ' (x2 + 2x + 3)2+v/jc2 + 2x + 3-^x2 +2x + 2+J (x2 + 2x + 2\ which corresponds to a2 + ab + b2, so we choose a = v^x2 + 2x + 3, /3 = v^x2 + 2x + 2. By this extension, we transform the original limit to for some polynomials P, Q. Let us emphasize that this really holds for all n e N. For n = 1, one must realize that we set (*) = 0 and that the polynomials P, Q may be constant zeros. So we get (1 + 2nx)n = 1 +2n2x + 2n3 (n-l)x2 + P (x) x3, x e R, (l + nx)2n = 1 +2n2x + n3 (2n - 1)x2 + Q (x) x3, x e R. Mere substitution and simple rearrangements give us (1 +2nx)n - (l+nx)2n lim lim x^O xl lim (-t23 + (P(x) - Q(x)) x) = -n3 + 0 = -«3 (2«3 (n - 1) - n3 (2n - 1)) x2 + (P(x) - Q(x)) x3 Proof. From the assumptions of the theorem, it follows that for any e > 0, there is a neighborhood O(xq) of the point xq e A c R in which both / (x) and h (x) lie in the interval (fo—s, fo+s), for all x ^ xq. >From the condition f(x) < g(x) < h(x), it follows that g(x) e (fo -s, fo + s) as well, so lim^^ g(x) = f0. 
The presented reasoning can be gently modified for infinite Umit values or for Umits at infinite points xq. It would be a good idea to think it through thoroughly! □ We can notice that this theorem allows us to calculate the limit for all types discussed above, i. e. limits of sequences, limits of functions at interior points, one-sided limits, and so on. 5.22. Theorem. Let A C R be the domain of real or complex functions f and g, let xq be a limit point of A and let the limits lim f(x) = a € x^xq lim g(x) = b € x^xq exist. Then: (1) the limit a is unique, (2) the limit of the sum f g exists and satisfies lim (f(x) + g(x)) = a + b, x^xq (3) the limit of the product f ■ g exists and satisfies lim (f(x) ■ g(x)) = a- b, x^xq (4) supposing b ^ 0, the limit of the quotient f/g exists and satisfies f(x) _ a ~ b' lim □ *^*o g(x) Proof. (1) Let us suppose that a and a' are two values of the Umit \imx^X() f(x). If a / a', then there are dis-U£/& joint neighborhoods 0(a) and O(a'). However, for sufficiently small neighborhoods of xo, the values of / should Ue in both the neighborhoods, which is a contradiction. Thus a = a'. (2) Let us choose some neighborhood of a + b, for instance 02B(a+b). For a sufficiently small neighborhood of xo and x / xo, both f(x) and g(x) will lie in e-neighborhoods of the points a and b. Hence their sum will Ue in the 2e-neighborhood of the value a + b. The proposition is proved. (3) Similarly to the above paragraph: we take 0Ei(ab). For sufficiently small neighborhoods of xq, the values of both / and 275 CHAPTER 5. ESTABLISHING THE ZOO 5.54. Calculate lim (tan x) ^(2x) x^n/4 Solution. Limits of the type 1±0° (like the examined one) can be calculated using the formula lim fix) x—>xq lim «f(x)-l)g(x)) xq supposing the limit on the right-hand side exists and f(x)^l for all x of some deleted neighborhood of the point x0 e M. Therefore, let us determine sin x . \ sin (2x) lim (tanx — l)tan (2x) x^7l/4 lim 1 x^k/4 ycosx / cos (2x) sin x—cos x 2 sin x cos x lim x^7l/4 lim cosx -2 sinx cos2 x — sin2 x i V2 x^jt/4 cosx + smx V2 i V2 2 2 Hence we have that lim (tan x) tan (7.x) Let us remark that the used formula holds more generally for "the type iwhatever"5 that is with no further conditions on the limit lim^™ g(x) which even need not exist. □ 5.55. Show that smx lim- Jtr^O X 1. Solution. Let us consider the unit circle (especially its quarter lying in the first quadrant) and its point [cosx, sinx], x e (0, it/2). The length of the arc between the points [cosx, sinx] and [1,0] is equal to x. So we apparently have smx < x, x e The value tan x is then the distance between the points [1, sin x/ cos x] and [1,0]. We can see that (feel free to draw a picture) x < tanx, x e This inequality also follows from the fact that the area of the triangle with vertices [0, 0], [1, 0], [1, tanx] is greater than the area of the considered circular sector. Altogether, we have obtained that sinx smx < x < cosx that is 1 < < X € x e smx cosx smx 1 > - > cosx, x e (••!) • g will hit e-neighborhoods of the values a and b. Therefore, their product will he in the required e2-neighborhood. (4) This is left as an exercise for the reader. □ Remark. If we look thoroughly at the presented proofs, we see that the statement of the theorem can be extended even to some infinite values of the limits of real-valued functions: Firstly, it must be the case that at least one of the limits is finite or that both limits share the same sign. 
Then it holds that the limit of the sum is the sum of the limits, with the conventions from 5.18. However, the case "oo — oo" is excluded. In the second case, one of the limits may be infinite, then the other one must be non-zero. Then, again, the limit of the product is the product of the limits. Now, the case "0 • (±oo)" is excluded. In the case of a quotient, we may have a e R and b — ±oo, then the resulting limit will be zero; or a — ±oo and b e R, then it will be ±oo according to the signs of the numerator and the denominator. The case " ^" is excluded. Let us emphasize that our theorem also covers, as a special case, the corresponding statements about the convergence of sequences as well as about one-sided limits of functions defined on an interval. For reasoning about limits, the following corollary of the definitions may be technically useful. It connects limits of sequences and of functions in general. 5.23. Corollary. Let us consider a real or complex function f defined on a set A C R and a limit point xo of the set A. The function f has limit y at the point xo if and only if for every sequence of points xn € A converging to, but different from xq, the sequence of the values f(xn) has limit y. i I I I f _»-1-f-jiw 4 Proof. First, let us suppose that the limit of / at xq is y. Then for any neighborhood U of the point y, there must be a neighborhood V of the point xo such that for all x e V n A, x / xo, we have fix) e U. For every sequence x„ —>• xo of points different from xo, the terms x„ will he in V for all n greater than a suitable N. Therefore, the sequences of values f(x„) will converge to y as well. Now, let us suppose that the function / does not converge to y at x —>• xo. Then for some neighborhood U of the value y, there is a sequence of points xm / xo in A which are closer to xo than 1/m, and yet the value f(xm) does not belong to U. This way, we have constructed a sequence of points lying in A and different from xq for which the values f(x„) do not converge to y, thereby finishing the proof. □ 276 CHAPTER 5. ESTABLISHING THE ZOO Invoking the squeeze theorem, we get the inequalities sinx 1 = lim 1 > lim - > lim cos x = cos 0=1. x^0+ x^0+ X x^0+ Thus we have proved that lim smx 1. x^0+ X The function y = (sinx)/x defined for x ^ 0 is even, whence it follows that smx lim - x^O- X smx lim -= 1. x^0+ X Since both one-sided limits exist and have the same value, the examined Umit exists as well and satisfies lim smx smx lim - = 1. x^O X x^0± X Let us remark that at first sight, one could say the limit can be calculated using l'Hospital's rule. However, then one would have to know the sine's derivative at zero which, actually, is the limit in question. Thus we may not invoke l'Hospital's rule in this case. 5.56. Determine the limits □ lim n^oo \ n + 1 lim ( 1 + —r sin2 x lim x^O X lim 3 tan x >o 5x2 lim —y~. sin x sin (3x) lim- sin (5x) lim I 1 - ^ arcsm x lim-; x^O X tan (3x) lim >o sin (5x) lim _ lim ^x e - e x->o x x->o sin (2x) Solution. When calculating these limits, we will use our knowledge of the following limits (a e 1): / a\n sinx ex — 1 lim(l + -j =efl; lim-= 1; lim-= 1. Thus we know that x^0 X x^0 X , , IV /n-l\n 1 lim 1 - - = lim The substitution m = n — 1 gives us (n — 1\ ( m -) = lim - m m + l m lim I - J • lim m^oo \ m + 1/ m^oo m + l Altogether, we have lim m m + l • lim m m + l Now, we have prepared tools for a correct formulation of the property of continuity, with which we have dealt when talking about polynomials. ___J Continuity of functions J. 
- Definition. Let / be a real or complex function defined on an interval Act We say that / is continuous at a point x0 e A iff lim f(x) = f(x0). x^xq The function / is said to be continuous on the set A iff it is continuous at every point xq e A. Let us notice that for the boundary points of the interval A, the definition says that value of / equals the value of the one-sided limit there. We say that the function is right-continuous or left-continuous at such a point. We have also seen that every polynomial is a continuous function on the whole R, see 5.20(2). Further, we have met a function which is continuous at irrational real numbers only although it has limits at all rational points as well, see 5.20(4). From the previous theorem 5.22 about limit properties, many of the following propositions immediately follow. 5.24. Theorem. Let f andg be (real or complex) functions defined on an interval A and continuous at a point xq € A. Then (1) the sum f + g is continuous at xq (2) the product f ■ g is continuous at xq (3) if g(xo) ^ 0, then the quotient f/g is well-defined on some neighborhood ofxQ and is continuous at xq. (4) if a continuous function h is defined on an neighborhood of the value f(xo) of the real-valued function f, then the composite function h o f is defined on an neighborhood of the point xq and is at xq. Proof. The statements (1) and (2) are apparent. We need to supplement the proof of (3). If g (xq) / 0, then the entire e-neighborhood of the number g (xq) does not contain zero for a sufficiently small £ > 0. >From the continuity of g, it follows that on a sufficiently small ^-neighborhood of the point xq, g will be non-zero and the quotient f/g is thus well-defined there. However, then it will be continuous at xq by the previous theorem. (4) Let us choose a neighborhood O of the value h(f(xo)). >From the continuity of h, there is a neighborhood O' of the point 277 CHAPTER 5. ESTABLISHING THE ZOO Clearly, the second Umit is equal to 1. Changing the variables (replacing n with m), we can write the result n lim n + 1 Further, it holds that lim 1 + l V and lim 1 + lim 1 + lim I 1 - -\ = lim ((1 - - e° = l 0. Let us point out that the first result follows from the Umits lim (1 + \) = lim (l + -and the second one from e, lim - = 0 n—>oo fi lim 1 - n^oo V n lim n = +oo, where we set e~°° = 0 (this is a notation for lim^-oo ex = 0, which is a determinate form). We can easily get that sin x sin x lim-= lim sinx • lim-=0-1=0. x^O X x^O x^O X Apparently, and the limit lim- x^o sinx lim l"1 = 1 1 x^o sinx does not exist (we write 1/ ± 0). If we used the rule for the limit of a product to determine the limit x lim —t— sin x , we would obtain ll/±0=l/±0. This means that the limit does not exist (this, again, is a determinate form). For the calculation of arcsin x lim-, x^0 X we will make use of the identity x = sin (arcsin x) which holds for any x € (—1, 1), that is in some neighborhood of the point 0. Substituting y = arcsin x, we get arcsm x arcsin x lim-= lim lim y l. x->o x sin (arcsin x) y^osiny Let us remark that y -» 0 follows from substituting x = 0 into y = arcsin x and from continuity of this function at 0 (this also guarantees that such a substitution can be made). /(xo) which is mapped into O by h. The continuous function / maps some sufficiently small neighborhood of the point xo into the neighborhood O'. However, this is the definition property of continuity, which finishes the proof. 
□ Now we can quite easily derive some basic connections between continuous mappings and the topology of the real numbers: 5.25. Theorem. Let f : R -> R be a continuous function. Then (1) the inverse image f~l(U) of every open set U is an open set, (2) the inverse image f~l (W) of every closed set W is a closed set, (3) the image f(K) of every compact set K is a compact set, (4) f has both a maximum and a minimum on every compact set K. Proof. (1) Let us consider a point xo e f~l (U). There is a ;i neighborhood O of the value /(xo) which is contained in "Kfe U since U is open. However, then there is a neighborhood Alfa O' of the point xo which is mapped into O and thus belongs 'if ^ to the inverse image. Therefore, every point of the inverse image is interior, which finishes the proof. (2) Let us consider a limit point xo of the inverse image f~l(W) and a sequence x;, /(x,) e W, which converges to it. >From the continuity of /, it apparently follows that /(xr-) converges to /(xo), and since W is closed, it must be that /(xo) e W. Clearly, all limit points of the inverse image of the set W are contained in W. (3) Let us choose any open cover of / (K). The inverse images of the particular intervals are unions of open intervals and thus create a cover of the set K. We can select a finite cover from it, so it suffices to take finitely many of the corresponding images to cover the original set f(K). (4) Since the image of a compact set is again a compact set, the image must be bounded and it must contain both the supremum and the infimum. Hence it follows that these must also be the maximum and the minimum, respectively. □ 5.26. Corollary. Let f : R —>• R be continuous. Then (1) the image of every interval is again an interval, (2) f takes all the values between the maximal and the minimal one on the closed interval [a, b].3 Proof. (1) First, let us consider an open interval A and suppose there is a point y e R such that f(A) contains points less than y as well as points greater than y, but y ^ f(A). This means that for open sets B\ — (—oo, y) and B2 — (y, oo), their inverse images A\ — f~l{B\) c A and A2 — f~1(B2) C A cover A. Again, these sets are open, disjoint, and have a non-empty intersection with A. Thus there must be a point x e A which does not he in A i but is a limit point of this set. At the same time, it must he in A2, which is impossible for two disjoint open sets. Thus we have proved that if there is a point y which does not belong to the image of the interval, then either all of the values must be above y or all must be below. Hence it follows that the image is again an interval. Let us notice that the marginal points of this interval may or may not he in the image. This theorem is (especially in Czech literature) called Bolzano's theorem. Bernard Bolzano worked in Prague at the beginning of the 19th century. 278 CHAPTER 5. ESTABLISHING THE ZOO We can immediately see that 3 tan2 x (3 sin x sin x 1 lim--— = lim - • >o 5x2 o \ 5 x x cos2 x 3 sinx sinx 1 - • lim-• lim-• lim 5 x^O X x^O X x^0COS2X 3 3 = -•1.1.1 = -. 5 5 By appropriate extension and substitution, we get sin (3x) / sin (3x) 5x 3 lim-= lim o sin (5x) x^o \ 3x sin (5x) 5 sin (3x) 5x 3 = lim-• lim-• - x^o 3x x^o sin (5x) 5 = lim^.lim-^ = l.l.U2. y^o y z^osinz 5 5 5 Thanks to the previous result, it can easily be calculated that tan (3x) / sin (3x) 1 lim-= lim osin(5x) ysin (5x) cos (3x) sin(3x) 1 = lim-• lim 3 3 - • 1 -. 
x^o sin (5x) x^o cos (3x) 5 5 Similarly, we can determine ,(5-2)* _ y I e" —_—— jtr^O X x^O lim = hm & (5 - 2)x lim e2x ■ lim- x^O x^O 3x 3x 2 (5-2) •3 e° • lim e^ - 1 and also lim e5x — e lim y^O y i5x — 1 e~x — 1 3 = 1 • 1 -3 = 3 >o sin(2x) *->■() \ sin (2jc) sin (2x) lim ±5x 1 2x 5 e x — 1 2x 1 lim >o \ 5x sin (2x) 2 —x sin (2x) \ 2 *5x 1 2x 5 lim >o 5x x->o sin (2x) 2 lim 1 2x lim >o —x x->o sin (2x) \ 2 1 ,. e"-l z 5 ,. e"-l ,. z lim-• lim-•--lim-• lim- u^o u z^osinz 2 v^o v z^o smz 1 5 1 „ 2 + 2=3- 5.57. Calculate the limits 1 — cos (2x) lim-:-; x^o xsinx Solution. We will utilize the fact that sinx lim-= x^O X lim x^O cosx If the domain interval A contains one of its limit points, then the continuous function must map it to a limit point or an interior point of the image of the interior of A. This verifies the statement. (2) This statement immediately follows from the previous one as the image of a closed bounded interval (i. e. a compact set) must be a closed interval again. □ We will finish our introductory discussion by some more theorems which are useful tools for calculating limits. 5.27. Theorem (About the limit of a composite function). Let f, g : M —>• M be functions and lim^a f(x) = b. (1) If the function g is continuous at the point b, then lim g (f(x)) =g Urn f(x)) =g(b). x^a \x^a / (2) If the limit lim-y^j, g(y) exists and f(x) ^ b holds for all x from some deleted neighborhood of the point a, then lim g (f(x)) = lim g(y). x^a y^b Proof. The first proposition can be proved similarly to 5.24(4). From the continuity of g at the point b, it follows that for any neighborhood V of the value g(b), we can find a sufficiently small neighborhood U of the point b whose values of g lies in V. However, if / has limit b at the point a, then / will hit U by all its values for some sufficiently small deleted neighborhood of the point a, which verifies the first statement. Even if we cannot use the continuity of g at the point b, the previous reasoning will hold as well if we ensure that sufficiently small neighborhoods of the point a are mapped into a deleted neighborhood of the point b by the function /. □ 5.28. Who is in the ZOO. We have begun to build our menagerie of functions with polynomials and functions which can be created from them "by parts". At the same time, we have derived many properties for a huge class of continuous functions. However, we do not have many practically manageable examples at our disposal (except for the polynomials). As another example, we will concentrate on the quotients of polynomials. Let / and g be two polynomials which can take complex values as well (i. e. we admit expressions a„x" H-----hflo with complex coefficients at e C, but we allow to substitute real values only for the variable x). .\{x el,g(i) = 0| C, fix) The function h h(x) = g(x) is well-defined at all real points x except for the roots of the polynomial g. Such functions are called rational functions. From the theorem 5.24, it follows that rational functions are continuous at □ all points of their domains. At the points where they are undefined, they can have • a finite limit, supposing the point is a common root of both / and g and the multiplicity in / is at least as great as in g (in this case, extending the function's domain by this point and defining it to take the value of the limit there makes the functions continuous at the point as well), • an infinite limit, supposing the one-sided infinite limits are equal, 279 CHAPTER 5. 
ESTABLISHING THE ZOO Then, we get lim x^O 1 — cos (2jc) x sinx 1 — (cos2 x — sin2 x) x sinx (l — cos2x) + sin2x x sinx 2 sin x sin x lim-:- = lim 2- X Sinx X lim x^O lim and lim x^O 1 — cos X lim x^O 1 — cos x 1 + cos X x2 1 + cos X lim 2; 1 — cos X >o x2 (1 + cosx) lim sin2x >o x2 (1 + cosx) smx lim x^O X lim 1 >o 1 + cosx Let us remark that we could also use the identity 1 — cos (2x) = 2 sin2 x, x e D. Continuity of functions □ 5.58. Let us examine existence of limits and continuity of the function (x — I) - sgnx at the points 0 and 1. Solution. First, let us calculate the one-sided limits at the point 0: linwo-(* " l)~sgn* = linw0-(* - 1) = -1, linwo+(* " l)"sgn" = linwo+ ^ = ~h whence lim(x — i)-ssn* = —l. However, the function value at 0 x^O equals 1, so the examined function is not continuous at the point 0. Further, we have that linwi-(jc - iys^x = linwi- jzt = -oo, limx^1+(x - l)-ssn^ = limx^1+ ^ = 00. Both one-sided limits at the point 1 exist, yet they differ, which implies that the (two-sided) limit of this function at 1 does not exist, and the function is not continuous here, either. □ R(x) 5.59. Without invoking the squeeze theorem, prove that the function [x, x e {£; n € N} ; [0, x e M\ {1; n e N} is continuous at the point 0. Solution. The function R is continuous at the point 0 if and only if limi?(x) = R(0) = 0. x^O We will show that, by the definition of a limit, the examined limit equals 0. Using the "usual" notation, we have a = 0, x0 = 0. Let 8 > 0 be arbitrary. For any x e (—8,8) we have that R(x) = 0, • different one-sided infinite limits. This situation is illustratively caught by the picture, which shows the values of the function (x - 0.05a)(x - 2 - 0.2a) (x - 5) h(x) — - x(x-2)(x-4) for a — 0 (the left-hand picture thus displays the rational function (x - 5)/(x - 4)) and for a = 5/3. 5.29. Power and exponential functions. The polynomials are created by addition and multiplication of scalars and the simple power functions n->/ with natural exponents n — 0, 1, 2,____The sense of the function x i-> x_1, defined for all x / 0, is also obvious. Now, we will extend this definition to a general power function X3 with an arbitrary a e R. We will use the properties of powers and roots, which we will consider to be a "matter of course". For a negative integer —a, we thus define x~a = (x*)"1 = (x~l)a. Further, we would surely want the equality bn — x for n e N to 1 imply that b is the n-th root of x, i. e. b — x«. It is necessary to verify that such b's always exist for positive real numbers x. By factoring out y2 — yi in i\ — y\, we can easily see that the function y i->- -f is increasing for y > 0. Let us choose a number x > 0 and consider the set B — {y e R, y > 0, yn < x}. This is a non-empty set bounded from above, so let us set b — sup B. We already know that a power function with a natural exponent n is continuous, so we can easily verify that bn — x. Indeed, surely bn < x, and if the inequality were strict, we would find a number y such that b" < y < x, which would imply that b < y, which contradicts the definition of a supremum. Thus we have the power function correctly defined for all rational numbers a — X3 — (xp)? = (x?)p. Eventually, we can notice that for the values a e R and x > 1, X3 is strictly increasing for rational a's. Therefore, we define x° — supfx^ ,yeQ, y < a}. For 0 < x < 1, we proceed analogously (one must be careful of the inequality signs) or we set X3 — (j)~a- For x — 1, we define 1" — 1 for any a. 
Now, we have defined the power function x i->- X3 for all x e [0, 00) and a e R. However, we can consider another view of the construction: For every fixed real number c > 0, there is a well-defined function y i->- cy on the whole real line. This function is called an exponential function with base c. 280 CHAPTER 5. ESTABLISHING THE ZOO or R(x) = x, hence (in both cases) we get R(x) e (—8,8). In other words, having chosen any 8 -neighborhood (—8, 8) of the point a, we can take the 8-neighborhood (—8,8) of the point x0 as then for any x e (—8,8) (the considered neighborhood of xo) it holds that R(x) € (—8, 8) (here, the interval (—8, 8) is the neighborhood of a). This matches the definition of a hmit (we did not even have to require x ^ x0). The considered function R is called the Riemann function (hence the name R). In literature, it can be found in many modifications. For instance, the function /(*) 1, j_ 1; is also "often" called the Riemann function. □ 5.60. By defining the values at the points — 1 and 1, extend the function f(x) = (x2 - 1) si 2x - 1 sin ■ 1 X £ ±1 (X € R) so that the resulting function is continuous on the whole R. Solution. The original function is continuous at every point of its domain. Thus the extended function will be continuous if and only if we set f(-l) := lim ( (x2 - l)sin-X ~ 1 1 f(l) := lim I (x2 - l) sin ■ 2x - 1 1 \ v ' Xz - 1 If either of these hmits did not exist (or were infinite), the function could not be extended to a continuous one. Clearly we have that 2x - 1 sin — xl whence it follows that i 1 < 1, x ^ ±1 (x e R), 1 < fix) < \ xL - 1 , x ^ ±1 (x e R). Since lim I x2 — 1 Jt->±1 1 0, by the squeeze theorem, we get the result /(±1) := 0. 5.61. Determine whether the equation e2* — x4 + 3x3 — 6x2 a positive solution. Solution. Let us consider the function □ 5 has /(*) Jlx x4 + 3x3 - 6x2 - 5, x > 0, for which f(0) -4, lim f(x) x^+oo lim e2x x^>+oc +oo. The properties which we used when defining the power function and the exponential function f(y) — cy,L e. c — /(l), can be summarized in a single inequality for any positive real x and real y: /(.V • V) - /(V) • /(V) together with condition of continuity. Indeed, for y — 0 we get that /(0) — 1, and hence 1 — f(0) — f(x — x) — f(x) ■ (f(x))~l and, eventually, for a natural number n, apparently f(nx) — (f(x))n. Thus we have determined the values x° for all x > 0 and a e Q. The continuity condition determines the function's values at the remaining points as well. The exponential function especially satisfies the well-known formulas (5.5) (ax) ix-y 5.30. Logarithmic functions. We have just seen that the exponential function f(x) — ax is increasing for a > 1 and decreasing for 0 < a < 1. Thus in both cases, there is a function f~l (x) inverse to it. This function is called a logarithmic function with base a. We write lnfl (x), and lnfl (ax) — x is the defining property. The equalities (5.5) are thus equivalent to lna(*•?) = lnfl(x) + lnfl(y), lna(xy) = y- lna(x). Logarithmic functions are defined only for positive input values and are, on the whole domain, increasing for base a > 1 and decreasing for 0 < a < 1. lnfl(l) — 0 holds for every a. We will see presently that there is an extremely important value of a, the so-called Euler's number e, see the paragraph 5.42. The function lne (x) is called the natural logarithm and denoted by ln(x) (i. e. omitting the base e). 3. 
Derivatives When we were talking about polynomials, we already discussed how to describe the rate at which the func-1^17/, tion changes at a given point of its domain (see the paragraph 5.6). Back then, we examined the quotient (5.2), which expressed the slope of the secant line between the points [x, f(x)] e R2 and [x + Ax, f(x + Ax)] e M2 for a (small) increase Ax of the input variable. This reasoning is correct for any real or complex function /; we only have to properly work with the concept of a limit, instead of "intuitive decreasing" of Ax. We introduce the definition of both proper and improper derivatives, i. e. we admit infinite values of the derivatives as well. We can notice that, unlike in the case of a mere limit of a function, now the function must be defined at the point xo at which we consider the derivative. 281 CHAPTER 5. ESTABLISHING THE ZOO >From the fact that / is continuous on the whole domain it thus follows that it takes on all values y e [—4, +00). Especially, its graph necessarily intersects the positive semiaxis x, ie. the equation fix) =0 has a solution. □ 5.62. At which points x e R is the function y = cos ( arctg ( I 12x21 + 11 I (considering maximum domain) continuous? 5.63. Determine whether the function x, x < 0; /(*) 0, x, 0, x, 1 x-3 ' is continuous; left-continuous points — 7T, 0, 1, 2, 3, 7T. 5.6~¥. Extend the function 0 < x < 1; jc = 1; 1 < x < 2; 2 < x < 3; x > 3 right-continuous at the o /(*) = arctg 1 + — • 2 5 sin x , x e \{0} at x = 0 so that it is continuous at this point. 5.65. Find all p € R for which the function sin (6x) o /(*) 3x x e \{0}; fiO) is continuous at the origin. o 5.66. Choose a real number a so that the function -4 - 1 h(x) = is continuous on'. 5.67. Calculate 1 x > 1; h(x) a, x < 1 o lim sin8 x lim sin8 x o 5.68. Find all possible values of the parameter a e R so that the inequality (a - 2)x2 - (a - 2)x + 1 > 0 holds for all real numbers x. Solution. We can notice that for a = 2, the inequality holds trivially (there is constant 1 on the left side). For a ^ 2, the left side is a quadratic function / (x) in the variable x, and further / (0) = 1. Thanks to the function fix) being continuous, the inequality fix) > 0 will hold Derivative of a function of a real variable [_s 5.31. Definition. Let / be a real or complex function defined on an interval Acl and xq e A. If the limit lim fix) - fix0) x0 exists, we say that the function / has derivative a at the point x$. The value of the derivative is denoted by fixo) or 37 (xo) or a = In accordance with the value of the defining limit, the derivative is also sometimes called proper or improper. One-sided derivatives (i. e. left-sided derivatives and right-sided derivatives) are defined analogously in terms of the corresponding one-sided limits. If a function has a derivative at a point xq, we say the function is dijferentiable at x$. A function which is differentiable at every point of a given interval is said to be differentiable on the interval. 1 Derivatives can be easily manipulated with, but we will have a lot of work correctly deriving the derivatives even of some already constructed functions. Therefore, a bit prematurely, we introduce a table of derivatives of several such functions. In the last column, you can find references to the corresponding paragraph where the result is proved. We can also notice that even though we are unable to express inverse functions to some of our functions by elementary means, we are nonetheless able to calculate their derivatives; see 5.35. 
Derivatives of some functions [__i function domain derivative polynomials fix) whole M fix) is again a polynomial 5.6 cubic splines whole M only the first deriva- 5.9 hix) tive of h'ix) is continuous rational functions /(*)/* to whole R except for roots rational functions: f'(x)g(x)-f(x)g'(x) gU)2 5.34 of g power functions interval fix) = ax"'1 ?? fix) = x* (0, 00) exponential func- whole R fix) = In (a) -ax ?? tions fix) = ax, a > 0, a ^ 1 logarithmic interval fix) = ?? functions (0, 00) fix) = lnfl(x), a > 0, a ^ 1 From the formulation of the definition, we would anticipate that /' (xo) will allow us to approximate the function / by a straight line y — /(xo) + fix0)ix - x0). 282 CHAPTER 5. ESTABLISHING THE ZOO for all real x if and only if there is no solution to the equation fix) = 0 in R (the whole of the graph of the function / will then be "above" the x-axis). This will occur if and only if the discriminant of the quadratic equation ia — 2)x2 — ia — 2)x + 1 = 0 (in x) will be negative. Thus we get the following necessary and sufficient condition: D = ia-2)2 - Ma - 2) = ia - 2) (a - 6) < 0. This is true for a e (2,6). Altogether, the inequality holds for all real jtiffae[2,6). □ 5.69. In R, solve the equation 2x +y +4x +5x +6x =5_ o Solution. The function on the left side is a sum of five increasing functions on R, so it must be increasing as well. For x = 0, its value is 5, which is thus the only solution of the equation. □ 5.70. In R, solve the equation 2X + 3X + 6X = 1. o 5.71. Determine whether the polynomial v-37 + 5x21 - 4x9 + 5x4 - 2x - 3 has a real root in the interval (—1, 1). o E. Derivatives First of all, let us show that the derivatives enlisted in the table of paragraph 5.31 are correct. We will derive them right from the definition of a derivative. 5.72. >From the definition, (see 5.31) find the derivatives of the functions x" (x is the variable, n is a constant positive integer), yfx, sinx. Solution. First, let us remark that by substituting h for x — x0 in the definition of a derivative, we get lim x^-xq fix) - fix0) x0 lim fix0 + h) - fjx0) h In the following calculations, we will work with the latter expression of the limit. This is the meaning of the following lemma, which says that replacing the constant coefficient f'(xo) in the line's equation with a certain continuous function gives exactly the values of /. The difference between the values if (x) and the value tK*o) on a neighborhood of xq then says how much the slopes of the secant lines and the tangent line at the point xq differ. S»-- Lemma. A real or complex function fix) has a finite derivative at xq if and only if there is a neighborhood 0(xq) and a function iff which is continuous at xo and such that for all x e O(xq), it holds that fix) = /(xo) + f(x)(x - x0). Furthermore, then i/(xo) = f'(xo), and f itself is continuous at the point xo. Proof. First, let us suppose that /'(xq) is a finite derivative. If if is to exist, it is surely of the form f(x) = (f(x) - /(x0))/(x - x0) for all x e O \ {xo}. On the other hand, we define the value at the point xo as /'(xo). Surely, then lim if(x) = f'(x0) = if(x0) x^xq as desired. And if such a function if exists, the same procedure calculates its limit at xo. Thus the derivative /'(xo) exists as well and equals if(x0). >From the expression of / in terms of continuous functions, it is apparent the / itself is continuous at the point xo. □ 5.32. Geometrical meaning of the derivative. 
The previous lemma can be illustrated geometrically, thereby getting another view at the derivative. It says that it can be determined whether the derivative exists from the graph of the function y = f(x), i. e. the corresponding curve with coordinates x and y: the derivative exists if and only if the slope of the secant line going through the points [xo, /(xo)] and [x, f(x)] changes continuously. If so, the limit value of this slope is the value of the derivative. _J Functions increasing and decreasing at a point Corollary. If a real-valued function f has derivative /'(xo) > 0 at a point xq e M, then there is a neighborhood 0(xq) such that fix) > f'(xo) for all points x € 0(xq), x > xo, and fix) < /(xo) holds for all x € 0(xq), x < xq. 283 CHAPTER 5. ESTABLISHING THE ZOO (xny = lim (x + h)n - x" h (")x"-1h + (")x"-2h2 + --- + hn lim —- h nxn + lim " Xxn~2h + (" )xn~3h2 + ■■■+ h"~l nx .«-1 lim h lim 1 h^O h(y/x + h + yfx) h^O Jx + h + yfx 1 2y/x' sin(x + h) — sinx (sin x) = lim- h^O h sin x cos h + cos x sin h — sin x = lim- h^O h cos x sin A sinx(cos/i — 1) = lim--h lim- ft^o ft h^O ft cosx • lim shift 2(sin|)2 lim ft^o h h^O h sin f cos x • 1 + lim sin t - cosx. 5.73. Differentiate: Sol ni: □ i) x sin x, ii) iii) ln(x + Vx2 — a2), a 7^ 0, iv) arctan^^^), |x| < 1, v) X*. he formula for the derivative of a product (the Leib-e get that (x sinx)' = x' • sinx + x • (sinx)' = sinx + x cosx. (ii) By the formula for the derivative of a quotient (5.34), we have sinx (sinx)' • x — sinx • x' x cosx — sinx X xz xz (iii) This time, we will use the formula for the derivative of function composition (the chain rule, see 5.33). Setting h(x) = ln(x), f(x)=x + Vx2 — a2, we obtain On the other hand, if the derivative satisfies /'(xo) < 0, then there is a neighborhood 0(xq) such that f(x) < /(xo) far all points x € 0(xq), x > xq, and f(x) > /(xo) far all x € O(xq), x < Xq. , ._. x/x + h - ~Jx {x/x + h - Vx)(V-x + h + yfx) (y/x) = lim-= lim-, - h^O h h^o h(Jx +h + Vx) Proof. Let us consider the former case. By the previous lemma, we have fix) — fixo) + V^(x)(x — xo) and tK*o) > 0. However, since is continuous at the point xo, there must exist a neighborhood 0(xo) on which it holds that t/c(x) > 0. Then with increasing x > xo, the value fix) > fixo) increases as well, and analogously for x < x$. The latter case (with a negative derivative) can be proved similarly. □ The functions that, for all points x of some neighborhood of a pointxo, satisfy fix) > /(xq) if x > xq and fix) < /(xq) if x < xo are called increasing at the point x$. If the function is increasing at all points of a given interval, then it is said to be increasing on the interval. Of course, functions which are increasing on an interval satisfy fib) > fia) for all a < b from this interval. Dually, a function is said to be decreasing at a point xq iff there is a neighborhood of the point xq such that fix) < /(xq) if x > xo and fix) > fixo) if x < xq for all points x of the neighborhood. It is decreasing on an interval iff it is decreasing at every point of the interval. Thus our corollary says that a function having a non-zero finite derivative at a point is either increasing or decreasing at that point, according to the sign of the derivative. As an illustration of a simple usage of the connection between the derivatives and the properties of being an increasing (or decreasing) function, we can consider existence of inverses to polynomials. 
Since hardly any polynomials are exclusively increasing or decreasing functions, we cannot anticipate that there would be WglofJally defined inverse functions to them. On the other hand, the inverse exists to every restriction of / to an interval between adjacent roots of the derivative /', i. e. where the derivative of the polynomial is non-zero and keeps the sign. These inverse functions will never be polynomials, except for the case of polynomials of degree one, when the equation y gives that 1 x = -iy - b). Similarly with a polynomial of degree two, the equation leads to the formula y — ax + bx + c -b ± jb2 -Aa(c- y) 2a ln(x + Vx2 - a2)' = hi fix))' = hi fix)) ■ fix) (x _|_ Vx2 — a2)' and thus the inverse (given by the above formula) exists only for x + Vx2 — a2 those x which lie in either of the intervals (—oo, 2a ),(-&, oo) 1 + + Vx2" For the work with inverse functions to polynomials, we thus cannot do with the functions we have at our disposal now, so we obtain new additions to our menagerie. 284 CHAPTER 5. ESTABLISHING THE ZOO where we used the chain rule once again when differentiating V-*2 — a2. (iv) Again, we are looking for the derivative of a composed function: arctan l l + l-x2 VT^x2 + h-x2 1 + l-x2 Vl -X2 + 1 VT^x2 VT^x2 (v) First, we transform the function to a function with constant base (preferably the base e) which we are able to differentiate. (jc*y = ((elnxy)' = (exlnxy (x lnx)' • e x In x (1 + lnx) • x* □ 5.74. Find the derivative of the function y = xsin x,x > 0. Solution. We have (xsinx)' = (esinx lnx)' = esinx lnx (cosx lnx + ^) xsinx (cosx lnx + säi). 5.75. For positive x, determine the derivative of the function fix) = xlnx. □ o Solution. f'{x) = 2xln x~l ■ lnx. 5.76. For x e (0, ix/2), calculate the derivative of the function y = (sinx)c°sx. o Solution, (sin x)1+cosx (cot2x -ln(sinx)). We advise the reader to make up some functions and find their derivatives. The results can be verified in a great deal of mathematical programs. In the following exercise, we will look at the geometrical meaning of the derivative at a given point, namely that it determines the slope of the tangent line to the function graph at the given point (see 5.32). 5.33. Formulae for calculation of derivatives. Now, we will introduce several basic facts about the calculation of derivatives. They will talk about how much the differentiation is compatible with the algebraic operations of addition and multiplication of real or complex functions. The last formula then allows us to efficiently determine the derivatives of composite functions. It is also called the "chain rule". Intuitively, they can be understood very easily if we imagine that the derivative of a function y = f(x) is the quotient of the rates of increase of the output variable y and the input variable x: , Ay J Ax Of course, for y = h(x) = f(x) + g(x), the increase in y is given by the sum of the increases of / and g, and the increase of the input variable is still the same. Therefore, the derivative of a sum is the sum of the derivatives. In the case of a product, we have to be a bit more careful. For y = f(x)g(x), the increase is Ay = fix + Ax)g(x + Ax) - f(x)g(x) = f(x+Ax)(g(x+Ax)-g(x)) + (/(x+Ax)- f(x))g(x) Now, if we make the increase Ax small, we actually calculate the limit of a sum of products, which, as we know, can be calculated as the sum of the products of the limits. Thus we can expect that the derivative of a product fg is given by the expression fg' +f'g, which is called Leibniz rule. 
The derivative of a composite function is even more interesting: Let us consider a function g = ho f, where the domain of the function z = h(y) contains the codomain of the function y = f(x). Again, by writing out the increases, we obtain that Az _ Az Ay Ax Ay Ax' Thus we may expect that the formula will be of the form (h o f)'(x) = h'(f(x))f'(x). Now we will provide correct formulations together with proofs: ___| Rules for differentiation J_ - 5' = Theorem. Let f and g be real or complex functions defined on a neighborhood of a point xq e M and having finite derivatives at this point. Then (1) for every real or complex number c, the function x c- f(x) has a derivative at the point xq and (cf)'(x0) = c(f'(x0)), (2) the function f + g has a derivative at the point xq and (f + g)'(x0) = f(x0) + g'(x0), (3) the function f ■ g has a derivative at the point xq and (/ ' S)'(*o) = f'(xo)g(xo) + f(x0)g'(x0). 285 CHAPTER 5. ESTABLISHING THE ZOO 5.77. Using differential, approximate arccotg 1, 02. Solution. The differential of the function / having continuous derivative at the point x0 is equal to /' (x0) dx = f (x0) (x - x0). The equation of the tangent to /'s graph at the point [xo, f(xo)] is then y - f (*o) = /' (*o) (* - *o) ■ Hence we can see that the differential is the growth on the tangent line. However, the values on the tangent approximate those of /, supposing the difference x — x0 is "small". Thus we obtain the following formula for approximating the function value by its differential: /(*) ~ / (*o) + /' (*o) (x - x0). So, setting f(x) := arccotgx, x0 := 1, we get arccotg 1, 02 « arccotg 1 + ^ (1, 02 - 1) = f - 0, 01. Eventually, let us remark that the point x0 is of course selected so that the expression x — x0 is as close to zero as possible, yet we must be able to calculate the values of / and /' at the point. □ 5.78. Using differential, approximate arcsin 0, 497. O Solution, f - 0, 003. 5.79. Using differential, approximate a := arctg 1, 02; b := '70. o Solution, a ~ f + 0, 01; b ~ 4, 125. 5.80. Using differential, approximate (a) sin(f|); (b) sin (Iff). O Solution, (a) \ spin . 360 ' \/2 i \/2jt 2 360 ' (b)^r + 5.81. Determine the parameter celso that the tangent line to the graph of the function ln^r) at the point [1,0] goes through the point [2, 2]. Solution. >From the statement of the problem it follows that the tangent's slope is |5f = 2. The slope is determined by the derivative at the (4) Further, if h is a function defined on a neighborhood of the image yo — /(xo) and having a derivative at the point yo, the composite function h o / also has a derivative at the point xq and I (h o f)'(xo) = h'(f(x0)) ■ f'(xo). Proof. (1) and (2): A straightforward application of the theorem about sums and products of function limits yields the result. (3) We will rewrite the quotient of the increases (which we have already mentioned), in the following way: (fg)(x) - (fgKxo) ,g(x)-g(x0) , f(x)-f(x0) - = f(x)--1--g(xo)- x — xq x — xq x — xq The limit of this expression for x —>• xo gives the wanted result because / is continuous at the point xo. (4) By the lemma 5.31, there are functions i/r and

R exists (do not confuse this notation with the function x i-> (/(x))-1), then it is uniquely determined by either of the following identities / 1 o / = idM, f of 1 = idB and the other one then holds as well. If / is defined on a set A c R and f(A) — B, the existence of f~l is conditioned by the same statements with identity mappings id a and idg, respectively, on the right-hand sides. As we can see from the picture, the graph of the inverse function is obtained simply by interchanging the axes of the input and output variables. If we knew that the inverse y — f~1(x) of a differentiable function x — f(y) is also differentiable, then the chain rule would immediately yield 1 = (id)'(x) = (/ o rl)'(x) = f(y) ■ (f-l)'(x), so we obtain the formula (apparently, f'(y) must be non-zero in this case) ___| Derivative of an inverse function |___ (5.6) f'(y) This corresponds to the intuitive idea that for y — f(x), the value of /' is approximately ^ while for x — f~l (y) it is approximately (f~l)'(y) — And this indeed is the way how we can calculate the derivatives of inverse functions: Theorem. If f is a real-valued function differentiable at a point xo and such that /'(xo) ^ 0, then there is a function f~l defined on some neighborhood of the point and such that (5.6) holds. 287 CHAPTER 5. ESTABLISHING THE ZOO positive x-axis in the "positive sense of rotation", ie. counterclockwise.) O Solution. 7T/4. 5.90. Determine the equations of the tangent and normal line to the curve given by the equation + y3 - 2xy = 0 at the point [1, 1]. Solution, y = 2 — x; y = x. 5.91. Prove the following: o < In (1 + x) < x for all x > 0. o Solution. The inequalities follow, for instance, from the mean value theorem (attributed to Lagrange) applied to the function y = ln(l +0, t e [0, x]. F. Extremal problems The simple observation 5.32 about the geometrical meaning of the derivative also tells us that a differentiable real-valued function of a real variable can have extremes only at the points where its derivative is zero. We can utilize this mere fact when solving miscellaneous practical problems. 5.92. Consider the parabola y = x2. Determine the x-coordinate xA of the parabola's point which is nearest to the point A = [1,2]. Solution. It is not difficult to realize that there is a unique solution to this problem and that we are actually looking for the absolute minimum of the function f(x) = y/ix - l)2 + (x2 - 2)2, x e R. Apparently, the function / takes the least value at the same point where the function g(x) = (x - l)2 + (x2 - 2)2, x e R does. Since g'(x) = 4x3 - 6x - 2, x e R, by solving the equation 0 = 2x3 — 3x — 1, we first get the stationary point x = — 1 and after dividing the polynomial 2x3 — 3x — 1 by the polynomial x + 1, we then obtain the remaining two stationary points ±fi and l-±fi. As the function g is a polynomial (differentiable on the whole domain), from the geometrical sense of the problem, we get Proof. First, let us realize that the request that the derivative at xo be non-zero means that our function / is either increasing or decreasing on some neighborhood of the point; see the corollary 5.32. Thus there exists an inverse function defined on some neighborhood. Since a continuous function maps a closed bounded interval onto a closed interval; the image f(U) of any open set U contained in the domain of / is open as well. Then, by the definition of continuity, the inverse function is continuous, too. 
To prove our proposition, it now suffices to carefully read through the proof of the fourth statement of the theorem 5.33. We only choose / for h and f~l for /, and we know that the composite function is differentiable instead of supposing existence of the derivatives of both the functions (and we know that the composite is the identity function): Indeed, by the lemma 5.31, there is a function ^ continuous at the point yo such that f(y) - /(yo) = (p(y)(y - yo), on some neighborhood of yo. Further, it satisfies cp(yo) — /'(yo)-However, then the substitution y = f~l (x) gives that x - xo = ^(/-1(x))(/"1(x) - f-\x0)), for all x lying in some neighborhood O (xo) of the point xo. Further, we have f~l (xq) — yo, and since / is either strictly increasing or strictly decreasing, we get that cp(f~1(x)) / 0 for all x e O(xo)\ {xq}. Thus we can write f-Hx)-rHx0) l x -x0 0, x € I, the function / must take the greatest value on / at its only stationary point x0 = b/4. Thus the sides of the wanted rectangle are b/2 long (twice x0: considering the original problem) and h/2 (which can be obtained by substituting b/4 for x into the expression h — 2hx/b). Hence we get S = hb/4. □ 5.94. Among rectangles such that two of their vertices lie on the x-axis and the other two have positive y-coordinates and lie on the parabola y = 8 — 2x2, find the one which has the greatest area. Solution. The base of the largest rectangle is 4/V3 long, the rectangle's height is then 16/3. This result can be obtained by finding the absolute maximum of the function Six) = 2x (8 - 2x2) on the interval / = [0, 2]. Since this function is non-negative on /, takes zero at /'s boundary points, is differentiable on the whole of / and its derivative is zero at a unique point of /, namely x = 2/V3, it has the maximum there. □ 5.95. Let the ellipse 3x2 + y2 = 2 be given. Write the equation of its tangent line which forms the smallest triangle possible in the first quadrant and determine the triangle's area. Solution. The line corresponding to the equation ax + by + c = 0 intersects the axes at the points [—^, 0], [0, —|] and the area of the triangle whose vertices are these two points and the origin is S = 2 j^. The line which touches the ellipse at [xT, yT] has the equation 3xxr + yyT —2 = 0. The area of the triangle corresponding to it is Now, let us focus on the derivative of the exponential function fix) — ax. If the derivative of ax exists at all points x, it will surely hold that fix) lim a x+Ax -,Ax lim ■ Ax Ajc^o Ax On the other hand, if the derivative at zero exists, then this formula guarantees the existence of the derivative at any point of the domain and also determines its value. At the same time, we verified the validity of the formula for the one-sided derivatives. Unfortunately, it will take us some time to verify (see 5.43,11 i 11, and 6.43) that the derivatives of exponential functions indeed exist. We will also see that there is an especially important base e, the so-called Euler's number, for which the derivative at zero equals one. What we can do now is to notice that the exponential functions are special in the way that their derivatives are proportional (with a constant coefficient) to their values: 1 f'iO)ax iax)' = (eln^)' = ln(a)(eln(fl)*) = ln(a) • ax. 5.37. Mean value theorems. Before we continue our journey of building new functions, we will derive several sim-<\, pie statements about the derivatives. 
The meaning U of all of them is intuitively clear from the pictures and the proofs only follow the visual imagination. Theorem. Let a function f : M -> M be continuous on a closed bounded interval [a, b] and differentiable inside this interval. If fia) = fib), then there is a number c e (a, b) such that f'ic) = 0. Proof. Since the function / is continuous on the closed interval (i. e. on a compact set), it reaches both a maximum and a minimum there. If the maximum and the minimum shared the value fia) — fib), it would mean that the function / is constant, and thus its derivative is zero at all points of the interval (a,b). Therefore, let us suppose that at least one of the maximum and the minimum is different and let it occur at an interior point c. Then it is impossible to have f'ic) / 0 because then the function / would be either increasing or decreasing at the point c (see 5.32) and so it would take both lower and higher values than fic) at a neighborhood of the point c. □ The above theorem is called Rolle's theorem.lt immediately implies the following corollary, known as Lagrange's mean value theorem. 5.38. Theorem. Let a function f : M -> M be continuous on an interval [a, b] and differentiable at all points inside this interval. 289 CHAPTER 5. ESTABLISHING THE ZOO thus S = Jx2yT. Further, in the first quadrant, we have that xT,yT > 0. To minimize this area means to maximize the product xTyT = 3x2r, which is (in the first quadrant) the same as to maximize xttJ2-(xTyT)2 = x2(2 minimum is at xT 3x2T) -3(x2 \)2 + Hence, the wanted -j^. The tangent's equation is V3x + y = 2 and the triangle's area is 5mi„ = 2-jp. □ 5.96. At the time t = 0, the three points P, Q, R began moving in the plane as follows: The point P is moving from the point [—2, 1] in the direction (3, 1) at the constant speed VlO m/s, the point Q is moving from [0, 0] in the direction (— 1, 1) with the constant acceleration 2 V2 m/s2 (beginning at zero speed) and the point R is going from [0, 1] in the direction (1, 0) at the constant speed 2 m/s. At which time will the area of the triangle PQRbe minimal? Solution. The equations of the points P, Q,Rm time are P Q R [-2, 1] + (3, l)t, [0, 0] + (-l, l)ř, [0, 1] + (2, 0)í. The area of the triangle P QR is determined, for instance, by half the absolute value of the determinant whose rows are the coordinates of the vectors PQ and QR (see 1.34). So we minimize the determinant: -2 + t t 2t -1 + i2 2t - t + 2. The derivative is 6r — 1, so the extrema occur at t = Thanks to considering non-negative time only, we are interested in t = . The second derivative of the considered function is positive at this point, thus the function has a local minimum there. Further, its value at this point is positive and less than the value at the point 0 (the boundary point of the interval where we are looking for the extremum), so this point is the wanted global minimum. □ 5.97. At 9 o'clock in the morning, the old wolf left his den D and as a part of his everyday warm-up, he began running counterclockwise around his favorite stump S at the constant speed 4 kph (not very quick, is he), keeping the constant distance of 1 km from it. At the same time, Little Red Riding Hood set out from her house H straight to her Grandma's cottage C at the constant speed 4 kph. When will they be closest to each other and what will their distance be at that time? The coordinates (in kilometers) are: D = [2, 3], S = [2, 2], H = [0, 0], C = [5, 5]. Solution. 
The wolf is moving along a unit circle, so his angular speed equals his absolute speed and his position in time can be described by Then there is a number c e (a, b) such that fib) - f(a) fie) = b — a VETA 0 S-rttpW tfWOTE. Proof. The proof is a simple record of the geometrical mean-ing of the theorem: The secant line between the points [a, f(a)] and [b, fib)] has a tangent line which is parallel to it (have a look at the picture). The equation of our secant line is y — g(x) — f(a) + ^—^-^^(x - a). b — a The difference h (x) — f(x) — g(x) determines the distance of the graph and the secant line (in the values of y). Surely h(a) — h(b) and it, , ft, , fib) - f(a) h(x) = f (x)--■-. b — a By the previous theorem, there is a point c at which h' (c) — 0. □ The mean value theorem can also be written in the form: (5.9) f(b) = f(a) + f'(c)(b-a). In the case of a parametrically given curve in the plane, i. e. a pair of functions y — fit), x — git), the same result about existence of a tangent line parallel to the secant line going through the marginal points is described by the so-called Cauchy's mean value theorem: Corollary. Let functions y = fit) andx = git) be continuous on an interval [a, b] and differentiable inside this interval, and further let g'it) ^ 0 for all t € (a, b). Then there is a point c € (a, b) such that fjb) - fja) _ f'jc) gib) - gia) g'ic) ' Proof. Again, we rely on Rolle's theorem. Thus we set hit) = ifib) - fia))git) - igib) - gia)) fit). NowMa) = fib)gia)-fia)gib),hib) = f{b)g{a)-f{a)g{p), so there is a number c e (a,b) such that h'(c) — 0. Since g'ic) / Owe get just the desired formula. □ A reasoning similar to the one in the above proof leads to a supremely useful tool for calculating limits of quotients of functions. The theorem is known as I'Hospital's rule: 290 CHAPTER 5. ESTABLISHING THE ZOO the following parametric equations: x(t) = 2- cos (4f), y(t) = 2- sin(4f), Little Red Riding Hood is then moving along the line x(t) = 2J2t, y(t) = 2y/2t. Let us find the extrema of the (squared) distance p of their paths in time: p(t) = [2 - cos(40 - 2V2tf + [2 - sin(4f) - 2V2?]2, p'(t) =16(cos(40 - sin(40)(V2f - 1) + 32?+ + 4V2(cos(40 + sin(40) - 16V2. It is impossible to solve the equation p' (t) = 0 algebraically, we can only find the solution numerically (using some computational software). Apparently, there will be infinitely many extrema: every round, the wolfs direction is at some moment parallel to that of Little Red Riding Hood, so their distance is decreasing for some period; however, Little Red Riding Hood is moving away from the wolfs favorite stump around which he is moving. We find out that the first local minimum occurs at t = 0.31, and then at t = 0.97, when the distance of our heroes will be approximately 5 meters. Clearly this is the global maximum as well. The situation when we cannot solve a given problem explicitly is quite common in practice and the use of numerical methods is of great importance. □ 5.98. Halley's problem, 1686. A basketball player is standing in front of a basket, at distance I from its rim which is at height h from the throwing point. Determine the minimal initial speed v0 which the player must give to the ball in order to score, and the angle cp corresponding to this v0. See the picture. Solution. Once again, we will omit units of measurement: we can assume that distances are given in meters, times in seconds (and speeds in meters per second then). 
Suppose the player throws the ball at time t = 0 and it goes through the rim at time t0 > 0. We will express the ball's position (while flying) by the points [x(t), y(t)] for t e [0, t0], and we require that x(0) = 0, y(0) = 0, x(t0) = I, y(t0) = h. Apparently, x? (t) = vo cos cp, y (t) = vo sin cp — gt for t € (0, to), where g is the gravity of Earth, since the values x' (t) and y (0 are, respectively, the horizontal and vertical speed of the ball. By integrating these equations, we get x(t) = v$t cos cp + c\, y(t) = v$t sin cp — ^ gt2 + c2 5.39. Theorem. Let us suppose that f and g are functions dijferen-tiable on some neighborhood of a point xo e R, yet not necessarily at the point xq itself Moreover, let the limits lim f(x) x—>xq 0, lim g{x) = 0 exist. If the limit exists, then the limit lim fix) x^x0 g'(x) lim x^x0 g(x) exists as well and these two limits are equal. Proof. Without loss of generality, we can assume that both the functions / and g take zero at the point xq. Again, we can illustrate the statement by a picture. *^t~~:_ Let us consider the points [g(x), f(x)] e R2 parametrized by the variable x. The quotient of the values then corresponds to the slope of the secant line between the points [0, 0] and [f(x),g(x)]. At the same time, we know that the quotient of the derivatives corresponds to the slope of the secant line at the given point. Thus we want to derive that the limit of the slopes of the secant lines exists from the fact that the limit of the slopes of the tangent lines exists. Technically, we can make use of the mean value theorem in a parametric form. First of all, let us realize that the existence of the expression f'(x)/g'(x) on some neighborhood of the point xo (excluding xo itself) is implicitly assumed; thus especially for points c sufficiently close to xo, we will have g'(c) / 0.4 Thanks to the mean value theorem, we have now that fix) ,. /(x)-/(x0) f'(cx) lim lim lim «o g(x) x^x0 g(X) - g(x0) x^x0 g'(Cx) where cx is a number lying between xo and x, dependent on x. From existence of the limit ,. fix) lim -, x^xo g'(x) it follows that this value will be shared by the limit of any sequence created by substituting the values x — xn approaching xq into This is not always necessary for the existence of the limit in a general sense. Nevertheless, for the statement of l'Hospital's rule, it is. A thorough discussion can be found (googled) in the popular article 'R. P. Boas, Counterexamples to L'Hospital's Rule, The American Mathematical Monthly, October 1986, Volume 93, Number 8, pp. 644-645.' 291 CHAPTER 5. ESTABLISHING THE ZOO for t € (0, to) and c\, c2 € R. From the initial conditions lim x(t) = jc(0) = 0, lim y(t) = y(0) = 0, t^0+ t^0+ it follows that c\ = c2 = 0. Substituting the remaining conditions lim x(t) = x(to) = I, lim y(t) = y(t0) = h t-Hg- t-Hg- then gives / = voto cos cp, h = voto sin cp — j gt^. According to the first equation, we have that I (5.1) to Vq cos cp and thus we get only one equation gl2 (5.2) h = I tan cp 2f2 cos2 cp where v0 e (0, +oo), cp e (0, tt/2). Let us remind that our task is to determine the minimal v0 and the corresponding cp which satisfies this equation. To be more comprehensible, we want to find the minimal value of v0 for which there is an angle cp satisfying (||5.2||). Since l+tan2From the last equation (quadratic equation in p = tan cp), it follows that tan

0. Once again, a suitable substitution (this time q = v^) allows us to reduce the left side to a quadratic expression and subsequently to get (v2 -g[h + Vh^TfJj (v2 -g[h- Vh^TpJj > 0. As h < \Jh2 + P, it must be that h + V/i2 +12 l. e. v0 > h + Jh2 + P The least value (5.4) h + y/h2 + P f'(x)/g'(x). Especially, we can substitute any sequence cXn for xn -> xo, and thus the limit ,. f'(cx) lim - ■*^*o g'(cx) will exist, and the last two limits will be equal. Thus we have proved that the wanted limit exists and also has the same value. □ >From the proof of the theorem, it is apparent that it holds for one-sided limits as well. 5.40. Corollaries. L'Hospital's rule can easily be extended for limits at the improper points ±oo and for the case of infinite values of the limits. If, for instance, we have lim f{x) = 0, lim g(x) = 0, x^oo x^oo then limJC_>0+ fiX/x) = 0 and limJC_>0+ g(\/x) = 0. At the same time, from existence of the limit of the quotient of the derivatives at infinity, we get (/(!/*))' /'(l/xX-l/x2) lim -= lim--— x^0+ (g(l/x))> x^0+ g'(l/x)(-l/x2) ,. /'(I/*) ,. fix) = lim -= lim -. X^0+ g'(l/x) X^QO g'(X) Applying the previous theorem, we get that the limit f(x) f(l/x) f'(x) lim -= lim -= lim - x^oo g(X) x^0+ g(l/x) x^oo g'(x) will exist in this case as well. The limit calculation is even simpler in the case when lim f(x) = ±oo, lim g(x) = ±oo. X^XQ X^Xq Then it suffices to write f(x) l/g(x) lim -= lim -, *->*0 g(x) x^x0 l/f(x) which is already the case of usage of l'Hospital's rule from the previous theorem. It can be proved that l'Hospital's rule holds in the same form for infinite limits as well: Theorem. Let f and g be functions differentiable on some neighborhood of a point xo e M, yet not necessarily at the point xo itself Further, let the limits lim*-^ f(x) = ±oo and lim*-^ g(x) = ±oo exist. If the limit fix) exists, then the limit lim *->*0 g'(x) ,. fix) lim x^x0 g(x) exists as well and they equal each other. Proof. Once again, we can apply the mean value theorem. The key step is to express the quotient in a form where the derivative arises: /(*) _ /(*) f(x)-f(y) g(x)-g(y) g(x) f(x)-f(y) g(x)-g(y) g(x) where y is fixed, from a selected neighborhood of xo and x is approaching xq. Since the limits of / and g at xq are infinite, we can surely assume that the differences of the values of both functions at x and y, having fixed y, are non-zero. 292 CHAPTER 5. ESTABLISHING THE ZOO is then matched (see (||5.3||)) by (5.5) v2 h + Vh2 + P tancp = — = — i. e. cp = arctg h + Vh2 + P I I The previous calculation was based upon the conditions x(t0) = I, y(t0) = h only. However, these only talk about the position of the ball at the time t0, but the ball could get through the rim from below. Therefore, let us add the condition / (t0) < 0 which says that the ball was falling at the time, and let us prove that it holds for vq from (|| 5.41|) and cp from (||5.5||). Let us remind that we have (see (||5.1||), (||5.2||)) to vq cos cp 0 2(1 tan cp—h) cos2 Using this, from we get lim y(t) t^to- Vo sin

Xfi) — xx is a special case of the so-called power mean with exponent r, also known as generalized mean: Mr(xi, ■xi The special value M~l is called harmonic mean. Now, let us calculate the limit value of Af for r approaching zero. For this purpose, we will determine the limit by l'Hospital's rule (it it an expression of the form 0/0 and we differentiate with respect to r, while x, are constant parameters). The following calculation, in which we apply the chain rule and our knowledge of the derivative of the power function, must be read from the back. Existence of the last limit implies the existence of the last-but-one, and so on. lim ln(Mr(xi, r^0 , x„)) — lim r^0 O) x\ \\iX\-\-----Yxrn \nxn lim- r^0 lnxi + • lnx„ = ln^/xl Hence we can immediately see that lim Mr (xi, ..., x„) — Zjx\ ... x„, r^O which is a value known as geometric mean. 4. Power series 5.42. The calculation of ex. Besides addition and multiplication, \j, we can also manipulate with limits of sequences. ~i. Thus it might be a good idea to approximate non-polynomial functions by sequences of values that can be calculated. 293 CHAPTER 5. ESTABLISHING THE ZOO We have obtained that the elevation angle corresponding to the throw with minimal energy is the arithmetic mean of the right angle and the angle at which the rim is seen (from the ball's position). The problem of finding the minimal speed of the thrown ball was actually solved by Edmond Halley as early as in 1686, when he determined the minimal amount of gunpowder necessary for a cannonball to hit a target which lies at greater height (beyond a rampart, for instance). Halley proved (the so-called Halley's calibration rule) that to hit a target at the point [I, h] (shooting from [0,0]) one needs the same minimal amount of gunpowder as when hitting a horizontal target at distance h + ~Jh2 + P (at the angle

— 1, b ^ 0, and a natural number n > 2, it is true that (1 + b)n > 1 + nb. Proof. For n — 2, we get (1 +b)2 = 1 +2b- ■b2 > 1 2b. From now on, we proceed by induction on n, supposing b > —I. Let us assume that the proposition holds for some k > 2 and let us calculate: (1 + b)k+l = (1 + b)k(\ +b)> (1+ kb)(\ + b) = l + (k+l)b + kb2 > l + (k+ \)b The statement is, of course, true for b — — 1 as well. □ Now we can bound the quotient of adjacent terms a„ of out sequence (i + ^r)-1 i (n2 - \)nn n2n(n - 1) 1 \" n nL I n — > (1--)- 1. n n — 1 Thus we have proved that our sequence is indeed increasing. The following, very similar, calculation (applying Bernoulli's inequality once again) verifies that the sequence of numbers b„ 1 1 + - n n + l is decreasing. Surely b„ > an. b„ b„ +1 n n + 1 n n+2 1 1 1 + - n n n + 1 n+2 1 1 + - n 2n + l 2n n+2 n(n ■+ 2) n 1 n+2 = 1. n(n+2)/ Thus the sequence a„ is increasing and bounded from above, so the set of its terms has a supremum which equals the limit of the 294 CHAPTER 5. ESTABLISHING THE ZOO R = t>0f0 cos cp, —h = v$to sin cp — -\ gt^. From the first equation, it follows that R to VQ cos

-2gR(q>) R Vq sin 2

Af. However, so great indeces j satisfy Cj+\ < < 2~^~N+1^ cN. This means that the parial sums of the first n terms of our formal sum are bounded from above by the sums N-l j j n-N j D„ < >^ —x-7 H--x" >^ —7. 295 CHAPTER 5. ESTABLISHING THE ZOO for cp -» 7t/2— the value of R decreases) and is differentiable at every point of this interval, it has its maximum at the point where its derivative is zero. This means that R (cp) can be maximal only if (5.8) R(cp) = h tan 2cp. Let us thus substitute (||5.8||) into (||5.7||). We obtain h tan 2cp v2, sin 2cp — gh2 tan2 2cp + 2hv^ cos2 cp = 0. This equation can be transformed to tan 2cp v2, sin 2cp + 2v2) cos2 cp = gh tan2 2cp, 2 sin2 2cp + vl (cos2

)(l+cos 2 N for some fixed N (very great) and choose a fixed number k < N (quite small). Then for sufficiently large N, we can approximate the sum of the first k terms in the expression of un in (5.10) by i;* with arbitrary precision. Since this part of the sum of un is strictly less than un itself, the sequence u„ must converge to the same limit as the sequence v„. Thus we have proved __ [ The power series for ex [__, Theorem. The exponential function is, for every number x e expressed as the limit of the partial sums in the expression 1 9 1+xH--x2 2! 1 y-x". «=0 5.44. Number series. When deriving the previous important theorems about the function ex, we have accidentally worked with several extraordinarily useful concepts and tools. Now, we will formulate them in general: Infinite number series Definition. An infinite series of numbers is an expression E «=0 an = aß -\- a\ -\- a2 - at Let, for instance, javelin thrower Barbora Spotakova give a javelin the speed i>o = 27.778 m/s = 100 km/h at the height h = 1.8 m (with g = 9.806 65 m • s~2). Then the javelin can fly up to the distance R ( m^ we sav ^e series converges and equals s iff the limit s = lim s„ of the partial sums exists and is finite. If the sequence of partial sums has an improper limit, we say that the series diverges to oo or —oo. If the limit of the partial sums does not exist, we sometimes say that the series oscillates. However, the world record of Barbora Spotakova does not even approach 80 m although the impact of other phenomena (air resistance, for example) can be neglected. Still we must not forget that from 1 April 1999, the center of gravity of the women's javelin was moved towards its tip upon the decision of IAAF (International Association of Athletics Federation). This reduced the flight distance by around 10 %. The original record (with "correctly balanced" javelin) was 80.00 m. For the sequence of partial sums sn to be converging, it is necessary and sufficient that it is a Cauchy sequence, i. e. km — Sn I — \an+l + • • • + am I must be arbitrarily small for sufficiently great m > n. Since |fl«+ll H-----h \am \ > \an+i H-----h am\, the convergence of the series Y^k=.o \an \ implies the convergence of the series y^jtlo ««• 296 CHAPTER 5. ESTABLISHING THE ZOO The performed reasoning and the obtained result can be applied to other athletic disciplines and sports. In golf, for instance, h is close to 0, and thus it is just the angle 0, and therefore we could get a helping hand form the corresponding one-sided limit. □ Further miscellaneous problems concerning extrema of functions of a real variable can be found at 320 G. L'Hospital's rule 5.100. Verify that the Umit (a) sin (2jc) — 2 sin x 0 lim--- is of the type -; x^o 2ex -x2 -2x -2 0 (b) (c) (d) (e) (f) lnx oo lim - is of the type —; x^0+ cotx oo lim I-- — -— I is of the type oo — oo; 1+ V x — 1 lnx lim (In (x — 1) • lnx) is of the type 0 • oo; 1 Q lim (cot x) i- * is of the type oo ; jt-»0+ lim x^0 \ X sin x \ x is of the type 1°°; __| Absolutely convergent series J__ We say that a series 2~2k*=o ®n ^ absolutely convergent iff the I series z~2T=o converges. | The absolute convergence has been introduced because it can often be much easily verified. Moreover, the following theorem shows that all simple algebraic operations behave "very well" in the case of series that con-■— verge absolutely. 5.45. Theorem. Let S = 2~2T=o fl« an<^ T — S«=o b" be two absolutely convergent series. 
Then (1) their sum converges absolutely to the sum oo oo oo S + T = ^ an + ^ bn = ^2(a„ + bn), n—0 n—0 n—0 (2) their difference converges absolutely to the difference oo oo oo S - T = Efl« ~ E^" = E(fl" ~ b")' n—0 n—0 n—0 (3) their product converges absolutely to the product oo / n ST E°" ) ' ( E^" I = E \ ^2an-kbk \«=0 \h=0 n=0 \k=0 Proof. Both the first and the second statements are a straightforward consequence of the corresponding properties of limits. The third statement requires our attention. Let us write cn — y^,an-kbk- k=0 From the assumptions and the rule for the Umit of a product, we get (Efl") ' IE*" \n=0 / \h=0 I Thus it suffices to prove that Efl" ' \Hb" \«=o \«=0 0= lim k^oo Ik \ I k \ k - EM-E \n=0 I \n=0 ' n=0 Ck Let us confront the expressions (Efl") ' (EM = E aibJ> \n=0 I \n=0 I 0k i + i>k +P 00 J2 (b) (c) (d) (e) (f) (g) lim (2ex -x2 -2x -2) = 2- 0- 0- 2 = 0; jt=>0 V ' lim lnx = — oo, lim cotx = +oo; X 1 lim -= +oo, lim -= +oo; x^i+ x — 1 x^i+ lnx lim lnx =0, lim In (x — 1) = — oo; jtr=>l + jtr=>l + lim cotx = +00, lim -= 0; x^o+ x^o+lnx sinx 1 lim-= 1, lim — = +oo; x^0 X Jt=>0 Xz JTX lim cos — = 0, lim lnx =0. x=>l — 2 x=>l — The case (a). Applying l'Hospital's rule transforms the Umit sin (2x) — 2 sin x lim into the limit lim ö 2e* - x2 - 2x - 2 2 cos(2x) — 2 cos x >o 2ex -2x -2 which is of the type 0/0. Two more applications of the rule lead to —4 sin (2x) + 2 sinx lim- x^o 2ex - 2 and (the above limit is also of the type 0/0) — 8 cos (2x) + 2 cosx —8 + 2 lim-=-= —3. x^o 2ex 2 Altogether, we have (returning to the original Umit) sin (2x) — 2 sin x lim---= —3. x^o 2ex — x2 — 2x — 2 Let us remark that multiple appUcation of l'Hospital's rule in an exercise is quite common. From now on, we will set the Umits of quotients of derivatives obtained by l'Hospital's rule equal the Umits of the original quotients. We can do this if the gained limits on the right sides really exist, i. e. add more terms into it, i. e. we will take all as in the product and remove only those whose indeces are both at most k/2. \a;b i+j>k 0 1. We know nothing in the case of \q\ = 1: the series may or may not converge. (3) If the limit lim = q exists, then the series converges absolutely for q < 1 while it does not converge for q > 1. Again, in the case of q = 1, the series may or may not converge. Proof. (1) We know that the existence and the potential value Jtsb/ of the Umit of a sequence of complex numbers is given by the Umits of the real parts and the imaginary parts. Thus it suffices to prove the first proposition for sequences of real numbers. If lim^oo an does not exist or is non-zero, then for sufficiently small number e > 0, there are infinitely many terms ak with \ak\ > s. So there must be either infinitely many positive terms of infinitely many negative terms among them. But then, adding any one of them into the partial sum, we get that the difference of the adjacent terms s„ and s„+i is at least e. Thus the sequence of partial sums cannot be a Cauchy sequence and, therefore, it cannot be converging, either. (2) Since we want to prove the absolute convergence, we can assume straightaway that the terms of the series are real numbers at > 0. The proof was given for the special value of q = 1/2 when deriving the value of ex using series. Now, let us consider q < r < 1 for a real number r. From the existence of the Umit of the quotients, we can deduce that for every j greater than a sufficiently large N, it holds that aj+i < r(J~N+1) aN. 
But this means that the partial sums s„ are, for large n > N, bounded from above by the sums n-N 7=0 j=0 j=0 1 _ fi-N+l T^~r Since 0 < r < 1, the set of all partial sums is an increasing sequence bounded from above, and thus its limits is its supremum. 298 CHAPTER 5. ESTABLISHING THE ZOO actually we will make sure that what we write is senseful only afterwards. The case (b). This time, differentiation of the numerator and the denominator gives lnx lim - = lim ■x^O+COtX x^0+ lim x^0+ sin2 x The last limit can be determined easily (we even know it). From lim — sinx = 0, smx lim -= 1, x^0+ X the result 0 = 0-1 follows. We could also have used l'Hospital's rule again (now for the expression 0/0), obtaining the result — sin2x — 2-sinx-cos x —2-0-1 lim - = lim -=-= 0. x^0+ X x^0+ 1 1 The case (c). By mere transforming to a common denominator: lim 1 lim x lnx — (x — 1) -i+\x — 1 lnx/ x^i+ (x — l)lnx we have obtained the type 0/0. We have that xlnx-(x-l) lnx + f-1 lim -= lim -:—-- lim lnx -i+ (x-l)lnx £^±_|_mx ^i+1-J.+lnx We have the quotient 0/0, which (again by l'Hospital's rule) satisfies lnx 7 1 1 lim lim i+1-I+lnx +1 1 + 1 2' Returning to the original Umit, we write the result lim 1 i+ V x — 1 lnx The case (d). We transform the assigned expression into the type oo/oo (to be precise, into the type — oo/oo) by creating the fraction In (x - 1) lim In (x — 1) • lnx = lim By l'Hospital's rule, ,. ln(x-l) lim ---= lim In x 1 x-l x^l + x^l + 1 1 1 In x lim -x In x -i+ x-l In2 x * This indeterminate form (of the type 0/0) can once again be determined by l'Hospital's rule: -xln2x -ln2x - 2x lnx - - lim - = lim ►1+ X — 1 *->■! + 1 0 + 0 0. In the case q > r > 1, we will use a similar technique. However, this time, from the existence of the limit of the quotients, we now deduce that aj+i JJ-N+l) ax > 0. However, this means that the absolute values of the particular terms of the series do not converge to zero, and thus the series cannot be converging, by the already proved part of the theorem. (3) The proof is quite similar to the previous case. From the existence of the limit q < 1, it follows that for any r, q < r < 1, there is an N such that for all n > N, Z/\an\ < r holds. Exponentiation then gives us \an \ < r", so we, once again, are comparing this to a geometric series. Thus the proof can be finished in the same way as in the case of the ratio test. □ In the proof of both the second and the third statement, we have used a weaker assumption than the existence of the limit. We only wanted to know that the examined sequences of non-negative terms are, from a given index on, either all greater or all less than a given number. For this purpose, however, it suffices to consider, for a given sequence of terms bn, the supremum of the terms with index higher than n. These suprema always exist and create a non-increasing sequence. Its infimum is then called limes superior of the sequence and denoted by lim sup bn. The advantage is that limes superior always exists. Therefore, we can reformulate the previous result (without having to change the proof) in a stronger form: Corollary. Let S — Z~2^Lo a" ^e an infinite series of real or complex numbers. (1) If an+i q — lim sup then the series S converges absolutely for q < 1 and does not converge for q > 1. For q = 1, it may or may not converge. (2) If q = lim sup $\an\, the series converges absolutely for q < 1 while it does not converge for q > 1. For q = 1, it may or may not converge. 5.47. Power series. 
If we consider not a sequence of numbers an, but rather a sequence of functions /„ (x) sharing the same domain A, we can use the definition of addition of series "pointwise", thereby ■'r///' obtaining the concept of the series of functions «=o _j Convergence of power series J__ A power series is given by an expression oo S(x) — ^2anX" ■ n=0 299 CHAPTER 5. ESTABLISHING THE ZOO The cases (e), (f), (g). Since lim (cot x) i"' x^0+ / sin x \ x2 lim - x^O \ X lim lim lnťcot x) In x lim / 7tX\ (cos —) In x - QX^O X lim (lnjt-ln(cos ^-)) it suffices to calculate the limits given in the argument of the exponential function. By 1'Hospital's rule and simple rearrangements, we get lim In (cotx) >o+ lnx type +00 -00 = l Km cot" ,™2* x^0+ lim - x^o+ cosx ■ sinx 0 type- = lim -1 *^o+ cos2x — sin x 1-0 ln^i lim —- x^O X2 type5 type : lim^ x^O x cos x—sin x v2 lim x cosx — smx lim 2x x->b 2x2 sinx cosx — x sinx — cosx 0 4x sin x + 2x2 cos x lim smx lim 0 4 sin x + 2x cos x — cosx 0 type- -1 hence 0 4 cos x + 2 cos x — 2x sin x 4 + 2 — 0 1 lim (cot x) in x x^0+ ( sin x \ x lim — x^O \ X 1 We can proceed similarly when determining the last Umit. We have that In (cos ?y) / Jtx\ lim ) (In x) • In I cos — ) lim lim 1 In x sin type ) - 2/2 -00 00 -00 00 1 _ j_ In2 x x jt x sin ^ • In x = — lim -^-=r-. 2x^i- cos^f- Since this form is of the type 0/0, we could continue by using 1'Hospital's rule; instead, we will go from x sin ■ In2 x lim clover to the product of limits 7tX- cos ?y lim (x sin —^ • x^i- V 2 / ln2x hm -ŤŤT cos iy 1 • lim ln2x COS ^ We say that s(x) has radius of convergence p > 0 iff 5(x) converges for every x satisfying |x| < p and does not converge for |x| > p. 5.48. Properties of power series. Although a significant part of the proof of the following theorem will have to be postponed until the end of the following chapter, we will formulate the basic properties of the power series right now: Absolute convergence and differentiation Theorem. Let s(x) = Y^Lq o-n^ be a power series and let the limit _ r = lim yjoj exist Then the radius of convergence of the series S equals p — r-1. The power series s(x) converges absolutely on the whole interval of convergence and is continuous on it (including the marginal points, supposing it is convergent there). Moreover, the derivative exists on this interval, and (x) = 'y^nanxn n = \ Proof. To verify the absolute convergence of the series, we can use the root test from theorem 5.46(3), for every value of x. We calculate _ lim ^/\anx" I = rx, and the series converges absolutely, or does not converge if this Umit is different from 1. Hence it follows that it indeed converges for |x| < p and diverges for |x| > p. The statements about the continuity and the derivatives will be proved later in a more general context, see 6.43-6.45. □ Let us also notice that, when proving the convergence, we can use a stronger form of the root test, and so the radius r of convergence can, for every power series, be described explicitly by r~l = lim sup ^J\an\. 5.49. Notes. If the coefficients of the series increase rapidly, i. e. ■ a„ = n", then r = 00, i. e. the radius of conver-i?LvY/ gence is zero. Indeed, such a series converges at a single point, namely x = 0. Now we will have a look at some examples of convergence of power series (including the marginal points of the corresponding interval): Let us consider 00 1 S(x) = J2x", Tlx) The former case is a geometric series, which we have already met. 
Its sum is, for every x, |x| < 1, s(x) = 1 — X while |x| > 1 guarantees that the series diverges. For x = 1, we obtain the series 1 + 1 + 1 + ..., which is apparently divergent. For x = — 1, we get the series 1 — 1 + 1 — ..., whose partial sums do not have a Umit, i. e. the series oscillates. 300 CHAPTER 5. ESTABLISHING THE ZOO Only now we apply 1'Hospital's rule for ln2x lim COS : type 0 lim 2 lnx ("§) sin : Altogether, we have lim (hue • In (cos = ^.1-0 = 0, i. e. lim fcos —^ In x □ 5.101. As we have implicitly mentioned, using l'Hospital's rule can lead to a non-existing limit even though the original hmit exists: Determine the limit x + sin x lim - x^oo x Solution. The limit is of the type ^, by l'Hospital's rule, we get that x + sinx 1 + cosx lim x^-oc lim x^-oc X x^oo 1 and since the hmit linu^oocosx does not exist, nor does the limit linu^oo 1 + cosx. However, the original hmit exists because x 1 x + sin x x + 1 — < - < - and by the squeeze theorem, x + sin x x + sin x x + 1 1 = lim - < lim - < lim -= 1. X^QO X X^QO X X^QO X □ 5.102. Determine lnx lim h°o X lim x In —, x^o+ x lim x e x^O- lim x^O X 100 ' lim x e* x^0+ lim (In x — x) ; x^>+oc lim lim lim ►+o° X + lnx • COSX ' x^+oo */x + 3 x^+oo ^fx2 _|_ i Solution. It can easily be shown (for instance, by 72-fold use of l'Hospital's rule) that for any 72 e N, it holds that x e lim — =0, i. e. lim — +00. The squeeze theorem implies the following generalization for real numbers a > 0: x° lim — 0, i. e. lim +00. The theorem 5.46(2) shows that the radius of convergence of the latter example is 1 as well because lim 4Tx"+1 «+1 — x lim n -xn n n + 1 For x — 1, we get the series 1 + 5 + -j + ..., which is divergent: By gradually summing up the 2k~1 adjacent terms 1/2*-1, ..., 1/(2* - 1) and replacing each of them by 2~k (thus they total up to 1/2), we can bound the partial sums from below by the sum of these 1 /2's. Since the bound from below diverges to infinity, so does the original series. On the other hand, the series T(—l) — — 1 + \ — \ + ... converges although it, of course, cannot converge absolutely. This follows from a more general theorem which will be introduced in the next chapter. 5.50. Trigonometric functions. With the power series, our society of functions increased by a lot of new examples of ^r smooth functions, i. e. functions which are arbitrarily f many times differentiable on the whole of their domains. Moreover, all of these additions to out menagerie have the property (similarly to polynomials) that the formula which defines them in fact defines a function C —>• C. Indeed, our reasoning about the absolute convergence holds flawlessly for series of complex numbers as well. Therefore, the power series will be convergent when we replace x with any complex number lying inside the disc with radius r centered at the origin of the complex plane. Let us, for a while, play with the most important example, the exponential function 1 „ 1 Ix2 2 This power series has infinite radius of convergence, so it defines a smooth function for all complex numbers x. Its values are the limits of values of (complex) polynomials with real coefficients and each polynomial is completely determined by finitely many of its values. Especially, the values of the power series are completely determined on the complex domain by their values at real input values x. Therefore, the complex exponential must also satisfy the usual formulas which we have already derived for the real values x. 
In particular, we have ex+y = Qx see (5.5) and the theorem 5.45(3). Let us substitute the values x — i ■ t, where i e Cis the imaginary unit, t e R arbitrary. ef = 1 + a - -r2 -i—r3 + — t4 + i—t5 -... 2 3! 4! 5! and apparently, the conjugate number to z — e" is the number z — e Hence z ■ z 1 hoo x" and all the values z — e" lie on the unit circle (centered at the origin) in the complex plane. The real and imaginary parts of the points lying on the unit circle have been described using the trigonometric functions cos 6 and sin 6, where 6 is the corresponding angle. 301 CHAPTER 5. ESTABLISHING THE ZOO Taking into account that the graphs of the functions y = ex and y = lnx (the inverse function to y = ex) are symmetric with regard to the line y = x, we further see that lnx x lim -= 0, i. e. lim -= +00. x lnx Thus we have obtained the first result. That could also be derived from 1'Hospital's rule because lnx lim - :^+oo x lim lim 1 0. hoo 1 x^+00 x Let us point out that l'Hospital's rule can be used to calculate all of the following five limits. However, it is possible to determine these limits by much simpler means. For instance, the substitution y = 1/x leads to lim x In — x^0+ X lim ie> = x^0+ v Iny = lim - y lim — y 0; +00. Of course, x -» 0+ gives y By the substitutions u ■ tively, lim x e" x^O- _ j_ ft x2 lim 1/x -» +00 (we write 1/ + 0 = +00). — 1/x, v = 1/x2 we get that, respec- lim e u -00; >0 X 100 v50 lim — o, where x -» 0— corresponds to u = —1/x -» +00 (we write — 1/ — 0 = +00) and then x -> Oto 1; = 1/x2 -» +00 (again 1/+0 = +00). We have also clarified that lim (In x X) lim -OO. Potential doubts can be scattered by the limit In x — x .. /. x lnx - lim lnx lim (1 V lnx/ -OO, which proves that even when decreasing the absolute value of the considered expression (without changing the sign), the absolute value of the expression remains unbounded. We can equally easily determine x lim lim +°° x + In x • cos x X lim — :^+oo x 1; lim x +°° Vx2 + 1 lim -^►+00 lim +00; = 1. We have seen that the l'Hospital's rule may not be the best method for calculating Umits of types 0/0, 00/00. The three preceding exercises illustrate that it even cannot be applied in all cases (for indeterminate 60N!0METfUCK£. Differentiating the parametric description of the points of a circle t i-> e", we get the vectors of "velocities" which will be given by the formula (if we do not believe that the power series can be differentiated term by term yet, we can instead differentiate the real part and the imaginary part separately) t i-> (elt)' — i -e", so their will keep the unit size. Hence we can deem that the whole circle will be traversed when the value of the parameter reaches the length of the circle, i. e. 2it (a thorough definition of the length of a curve needs integral calculus, then we will be able to verify this statement). This procedure can be used to define the number it, sometimes also called Archimedes' constant or Ludolphian number 5 half the length of the unit circle in the Euclidean plane M2. Now, we can at least partially convince ourselves by a look at the least positive roots of the real part of the partial sums of our series, i. e. the corresponding polynomials. Already with order ten, we get the number it with accuracy of 5 decimal places. Thus we obtain the definition of trigonometric functions in terms of the power series: cos t — v&&" = 1--12 1 4 1 6 -r--r- + (-D 1 4! 6! sin? — lme" —t--f3 H--15--1 7! (2k)\ 1 3 1 3! 5! 
+ (-D* 1 (2k + 1) The following picture illustrates the convergence of the series for the cosine function. It is the graph of the corresponding polynomial of degree 68. Gradually drawing the partial sums, we can see that the approximation near zero is very good and hardly changes at all. As the order increases, the approximation gets better farther from the origin as well. -*This number describes the ratio of the circumference to the diameter of an (arbitrary) circle. It was known to Babylonians and Greeks as early as the ancient times. The term Ludolphian number is derived from the name of German mathematician Ludolph van Ceulen of the 16th century, who produced 35 digits of the decimal expansion of the number, using the method of inscribed and circumscribed regular polygons, invented by Archimedes. 302 CHAPTER 5. ESTABLISHING THE ZOO forms). If we had applied it to the first problem, we would have obtained, for x > 0, the quotient 1 x 1 + £2ii — In x ■ sin x x + cos x — x In x ■ sin x x which is more complicated than the original one. The limit for x -» +oo does not even exist, so one of the prerequisites of l'Hospital's rule is not satisfied. In the second case, any number of multiple uses of 1'Hospital's rule leads to indeterminate forms. For the last problem, l'Hospital's rule sends us back to the original hmit: first it gives the fraction 1 2x 2Jx2 + l and then 2x 27*2 + 1 1 From here, we can deduce that the limit equals 1 (we are looking for a non-negative real number ael such that a = a~l) only if we have already shown it exists at all. □ Other examples concerning calculation of hmits by l'Hospital's rule can be found at page 331. H. Infinite series Infinite series naturally appear in a series (of problems). 5.103. Sierpinski carpet. The unit squares is divided into nine equal squares and the middle one is removed. Each of the eight remaining squares is again divided into nine equal subsquares and the middle subsquare (of each of the eight squares) is removed again. Having applied this procedure ad infinitum, determine the area of the resulting figure. Solution. In the first step, a square having the area of 1/9 is removed. In the second steps, eight squares (each having the area of 9-2, i. e. totaling to 8 • 9-2) are removed. Every further iteration removes eight times more squares than in the previous steps, but the squares are nine times smaller. The sum of areas of all the removed squares is I + I + + 9 T 92 T 93 T E «=o 8* 9«+i ■ The area of the remaining figure (known as Sierpinski carpet) thus equals 1 E 9«+i — i 9 E (9) 1 «=0 «=0 0. The well-known formula e" &~" — sin21 + cos2 t — 1 follows straight from the definition. Further, from the derivative (e"Y — i e" we can see that (sin t)' — cos t, (cos t)' — — sin t. Of course, this result can also be verified by differentiating our series term by term. Let to denote the least positive number for which — — i. e. the first positive zero point of the function cos t. According to out definition of it, we have ?o — \it. The square of this value is el2t° — s~l2'o = (s~"°)2, and so it is a zero point of the function sin t. Of course, for any t, it holds that pQ\4k it 1 -e" Therefore, both trigonometric functions sin and cos are periodic, with period 2it. Right from our definitions, we can see that this is their prime period. Now we can easily derive all the usual formulae connecting the trigonometric functions. We will, for illustration, introduce some of them. First, let us notice that the definition says that (5.12) cos? 
= -(e^+e-'O (5.13) 2 sin? — — (elt -e~lt). 2i Thus the product of these functions can be expressed as sin? cos? — — (elt -e~lt)(elt +e~lt) 4i = —(ei2t-e~i2t) = -sin2f. 4« 2 Further, we can utilize our knowledge of derivatives: cos 2t — (— sin 2t)' — (sin t cos t)' — cos21 — sin2 t. The properties of further trigonometric functions sin t tan t cos t cot t — (tan t) □ can easily be derived from their definitions and the formulae for derivatives. The graphs of the functions sine, cosine, tangent, and cotangent are displayed on the pictures (they are the red one and the green one on the left, and the red one and the green one on the right, respectively): 303 CHAPTER 5. ESTABLISHING THE ZOO 5.104. Koch snowflake, 1904. Create a "snowflake" by the following procedure: At the beginning, consider an equilateral triangle with sides of length 1. With each of its three sides, do the following: Cut it into three equally long parts, build another equilateral triangle above (i. e. pointing out from, not into, the original triangle) the middle part and remove the middle part. This transforms the original equilateral triangle into a six-pointed star. Once again, repeat this step ad infinitum, thus obtaining the desired snowflake. Prove that the created figure has infinite perimeter. Then determine its area. Solution. The perimeter of the original triangle is equal to 3. In each step, the perimeter increases by one third since three parts of every line segment are replaced with four equally long ones. Hence it follows that the snowflake's perimeter can be expressed as the limit d„ = 3 (I)" and lim d„ = +oo. The figure's area is apparently increasing during the construction. To determine it, it thus suffices to catch the rise between two consecutive steps. The number of the figure's sides is four times higher every step (the line segments are divided into thirds and one of them is doubled) and the new sides are three times shorter. The figure's area thus grows exactly by the equilateral triangles glued to each side (so there is the same number of them as of the sides). In the first iteration (when creating the six-pointed star from the original triangle), the area grows by the three equilateral triangles with sides of length 1/3 (one third of the original sides' length). Let us denote the area of the original equilateral triangle by So. If we realize that shortening an equilateral triangle's sides three times makes its area decrease nine times, we get So + 3-f. for the area of the six-pointed star. Similarly, in the next step we obtain the area of the figure as So + 3-f +4-3-|. Now it is easy to deduce that the area of the resulting snowflake equals the limit lim (S0 + 3 • f + 4 • 3 • | + A-An.%. J>Q_) ' ^ J 9/1+1) S0 lim (l + i + + ••• + • (IT) So 1 + j lim So ( 1 + 1 + + i+iE(ir k=0 (IT) = S0 So 1 + I Hm E (IT ' k=0 i _i_ I . _L_ 3 1 4 So- The snowflake's area is thus equal to 8/5 of the area of the original triangle, i. e. 8 o ? >->0 4 2 73 5 • 1 2 3 Cyclometric functions are the functions inverse to trigonometric functions. Since the trigonometric functions all have period 2it, their inverses can be defined only inside one period, and further only on the part where the given function is either increasing or decreasing. The inverse trigonometric functions are with domain [—1, 1] andränge [—tt/2, tt/2]. Then arccos — cos-1 with domain [—1, 1] and range [0, it], see the left-hand picture. 
The remaining functions are (displayed in the picture on the right) arctan — tan-1 with domain R and range (—tt/2, tt/2), and finally arccot — cot-1 with domain R and range (0, it). The so-called hyperbolic functions are also of great importance in practice, namely sinhx — -(ex — e x), 2 ' coshx — -(ex +e x). 2 The name indicates that they should have something in common with a hyperbola. A straight calculation gives (the squares cancel out and only the mixed terms remain) (coshx)2 - (sinhx)2 = 2^(e* e~x) = 1. The points [cosh t, sinh t] e R2 indeed parametrically describe a hyperbola in the plane. For hyperbolic functions, one can easily derive identities similar to the ones for trigonometric functions. Among many of them, we can easily see from the definition (by substituting into (5.12) and (5.13)) that coshx — cos(i'x), i sinhx — sin(i'x). 304 CHAPTER 5. ESTABLISHING THE ZOO Let us notice that this snowfiake is an example of an infinitely long curve which encloses a finite area. □ 5.105. Calculate the series oo / n. n = \ (b) e h «=o (c) e (42/1-I + 42n) > « = 1 oo (d) e £; «=1 00 (e) (3« + l)(3«+4) ■ «=0 Solution. The case (a). From the definition, the series is equal to 00 / 1 1 \ «=1 ((tt " 7l) + (A " 7f) + • • • + 0 ~ vfe)) = i^(1 + (-7! + 7!) + --- + (-* + ^)-vfe) = 1-The case (b). Apparently, this sequence is a quintuple of the standard geometric series with the common ratio q = 1/3, hence e^=5e(ir=5 L_ — il . 4 — 2 ' «=0 «=0 3 The case (c). We have that (with the substitution m = n — 1) °° 3 2 3 °° 1 2 °° 1 e (42/1-1 + 421) = 4 e (4211-2) + is e (42/1-2) = «=1 «=1 (i + £)E 16/ ^ 42" m—0 14 V (-ly' 16 2^ V16/ m— 0 16 V42" « = 1 14 16 1- 14 15 • The series of linear combinations was expressed as a linear combination of series (to be more precise, as a sum of series with factoring out the constants), which is a valid modification supposing the obtained series are absolutely convergent. The case (d). From the partial sum sn = \ + ^ + ^+ ■■■ + §:, neN, we immediately get that 32 + 33 + ■ n—1 1 n ' 3/1 ' 3/1+1 > n e N. Therefore, $n 3 Since lim 3/1+1 - i + J- + J- + ~ 3 1 32 1 33 1 0, we get that 4- — 3« 3«+i 1 n e N. e # = Am § (.„ - f) = | lim e ^ iEar=i(i-i)=i- The case (e). It suffices to use the form (this is the so-called partial fraction decomposition) 5.51. Notes. (1) If a power series S(x) is expressed with the value of the variable x moved by a constant offset xq, we arrive at the function T(x) — S(x — xq). If p is the ■■f )j radius of convergence of S, then T will be well-defined on the interval (xq — p, xq + p). We say that T is a power series centered at xq. The power series can be defined in the following way: S(x) — ^a„(x -x0)", «=0 where xq is an arbitrary fixed real number. All of our previous reasonings are still valid, we must only be aware of the fact that they relate to the point xq. Especially, such a power series converges on the interval (xo — p, xo + p), where p is its radius of convergence. Further, it holds that if a power series y — T(x) has its values in an interval where a power series S(y) is well-defined, then the values of the function S o T are also described by a power series which can be obtained by formal substitution of y — T(x) for y into S(y). (2) As soon as we have power series with a general center at our disposal, we can calculate the coefficients of the power series for inverse functions straightaway. We will not introduce a list of formulae here, it can easily be obtained in Maple, for instance, by the procedure "series". 
For illustration, we will have a look at two examples: We have seen that 1 1 2 1 3 1 4 -x2 + -x3 H--x4 2 6 24 Since e° — 1, we will search for a power series centered at x — 1 for the inverse function lnx, i. e. lnx — ao+a\(x — l)+a2(x — I)2 +aj,(x — I)3 +a4(x — I)4 +... . Applying the equality x — elnx, regrouping the coefficients by the powers of x and substituting, we get: ( 1 2 1 3 1 4 A x — an + a\ x H—xr H—x H--x + ... u V 2 6 24 / ■ a2\x '-x2 ■ fl3 I x -x2 ao + flix + ( — a\ + a2 )x 24 —a\ + Ü2 + as Ix 6 - + - |fl2 + -A3 + a4 \x Confronting the coefficients at the corresponding powers on both sides, we get 1 1 1 «0 = 0, a\ — 1, «2 = — ~' fl3 = y ' °4 = —4' " " " which indeed corresponds to the valid expression (will be verified later): lnx = 00 ^_\~jn~ 1 (x - 1)". «=1 Similarly, we can play with the series 1 3 1 5 1 7 sin? = ?--r H--r--r 3! 5! 7! 305 CHAPTER 5. ESTABLISHING THE ZOO l (3« + l)(3«+4) which gives 3« + l lim i (l Z^ (3« «=0 I . I _ 7 T 7 lim i (l - -^-) 3 V 3«+4/ j_ 1 3 ' 3«+4' 1 n e NU {0}, 1 + 1 4 4 + l)(3«+4) 1 10 T T 3« + l 3«+4/ □ 5.106. Verify that n — 1 «—0 Solution. We can immediately see that — ' 22 + 32 < Z 22 — 2' 42 + 52 + 62 + 72 < ^ 42 — 4' or, in general: í _i_ i__i__ (2«)2 (2"+1-l)2 ^ (2«)2 2"' < 2' « 1 n € N. Hence (by comparing the terms of both of the series) we get the wanted inequality, from which, by the way, it follows that the series YlT=i ^2 converges absolutely. Eventually, let us specify that 00 j 2 00 1 Z^^2=T<^=Z^2«- n—\ n—0 and the (unknown so far) series for its inverse (note that we are looking for a series centered at zero again because we have sin 0 — 0) ■ a\t + ajt1 arcsin t — üq Substitution gives ■ as a4t t — ao + a\ i t 3! 5! a2\ t 3! -ť3+-ť5 5! = AO + + fl2f hence :fl2 + A4 If arcsin t — t a3 )t3+ 3 120 1*3 6 öl -aj, + 05 1^ 40 (3) We can also notice that if we believed right from the beginning that the function ex can be expressed as a power series centered at zero and that power series can be differentiated term by term, then we would easily obtained the differential equation for the coefficients a„ as we know that (x"+1 )' = (« + l)x". Therefore, from the condition that the exponential function has its derivative equal to its value at every point, it follows that and hence it is clear that a„ — X flo 1 □ 5.107. Examine convergence of the series 00 YMn=±i. '—1 n « = 1 Solution. Let us try to add up the terms of this series. We have that 00 J2 In s±i = lim (In \ + In § + In \ + • • • + In =±i) = n = \ lim ln^ 2-3-4-(» + l) , 9, — lim In (n + 1) = +00. Thus the series diverges to +00. 5.108. Prove that the series g arctg »2+2»+3^+4 g ^±L o n + l ' ni+n1— n «=0 « = 1 do not converge. Solution. Since lim arctg «2+2«+3y/«+4 n + l lim arctg ^ and lim 3" + l lim + OO, □ the necessary condition lim a„ = 0 for the series YlT=nna" to con~ verge does not hold in either case. □ 306 CHAPTER 5. ESTABLISHING THE ZOO 5.109. What is the series «=2 Solution. From the inequalities (consider the graph of the natural logarithm) 1 < Inn 3, n e N, it follows that >/l < ■ifinn < n > 3, n e N. By the squeeze theorem, lim v^lnn = 1, i- C- lim -»J= = 1 - «^►00 n^oo Vinn Thus the series does not converge. As its terms are non-negative, it must diverge to +00. □ 5.110. Determine whether the series 1 (« + l)-3" ' 0) E 7^TTV3 «=0 00 . (b) E «=1 (c) E —r- v 7 «—In « « = 1 converge. Solution. All of the three enlisted series consist of non-negative terms only, so the series is either finite (i. e. converges), or diverges to +00. 
We have that (a) E - E (I)" = TZT < +00; «=0 «=0 3 n — 1 n — 1 n — 1 00 00 (c) E —1— > E 1 = +°°- v 7 ^-^ n —In« — t—1 n n — \ n — \ Hence it follows that the series (a) converges; (b) diverges to +00; (c) diverges to +00. □ More interesting exercises concerning series can be found at page 332. I. Power series In the previous chapter, we examined whether it makes sense to assign a value to a sum of infinitely many numbers. Now we will turn our attention to the problem what sense the sum of infinitely many functions may have. 5.111. Determine the radius of convergence of the following power series: 00 i) E « = 1 307 CHAPTER 5. ESTABLISHING THE ZOO a) E n = \ \_ (1+0"" Solution. i) From we get that 1 lim sup "n+\ Thus the power series converges exactly for the real numbers x € {—\,\) (alternatively, the complex numbers \x\ < \). \ (it is har- Let us notice that the series diverges for x monic), but on the other hand, it converges for x = — ± (alternating harmonic series). To determine the convergence for any x lying in the complex plane on the circle of radius \ is a much harder question which goes beyond our lectures. ii) lim sup 1 (l + 0" lim sup 1 +i V2 2 ' □ 5.112. Determine the radius r of convergence of the power series (a) E^«; «=i oo (b) £(-4/0"*"; n = l co 2 (c) E(i + £)" «=i (d) E (2+(-l)») n x « = 1 Solution. It holds that l l. 8 ' (a) lim y\a„ | = lim „- (b) lim ^/| an I = lim An = +oo; (c) lim = lim (1 + i)" = e; _ it—$ (d) lim sup y\a„\= lim sup 2+(lir = Urn sup Jfl"_\r 1. Therefore, the radius of convergence is (a) r = 8, (b) r = 0, (c) r = 1/e, (d)r = l. □ 5.113. Calculate the radius r of convergence of the power series E ein «=i •Jn4+2n3 + l-jf (x - 2)". Solution. The radius of convergence of any power series does not change if we move its center or alter its coefficients while keeping their absolute values. Therefore, let us determine the radius of convergence of the series 308 CHAPTER 5. ESTABLISHING THE ZOO ^ \?«4+2«3 + l-3T" Since lim ^n3" = ( lim = 1 for a > 0, we can move to the series 00 «=1 with the same radius of convergence r = jt/3. □ 5.114. Give an example of a power series centered at the origin which, on the interval (—3,3), determines the function 1 x2-x-l2' Solution. As _1_-_I_- I (J.___]—) x2-x-l2 (x-4)(x+3) 7 Vjt-4 x+3/ and 1 _ 4- _ 1 x-4 1-| 4 (l + f + £ + ••• + £ + •••), _L_ -_L_ - I (l __ + _! + ... + Iz£)_ + ... ^) x+3 — l-(-f) _ 3 ^ 3 ^ 32 ^ ^ 3" ^ / ' we get 1 _ _J_ V x^_ _ J_ (-*)" _ V / (-!)"+'__1_\ T« x2_x_n 28 ^ 4" 21 ^ 3" ^ \ 21-3" 28-4" / - «=0 «=0 «=0 x 7 □ 5.115. Approximate the number sin 1° with error less than 10-10. Solution. We know that oo sin x = x - ± x3 + jj Xs - ± x1 H----= t t^htt . x eR. 3! 5! 7! (2« + l)! «=0 Substituting x = jt/180 gives us that the partial sums on the right side will approximate sin 1°. It remains to determine the sufficient number of terms to add up in order to provably get the error below 10-10. The series jr___1_ (JL_\3 I _L (JL.)5 _ i_ (JL.)7 _i_ — V ( n \2n+1 180 3! V180/ 5! V180/ 7! V180/ ~r ' " — 2^ (2n + l)\ V 180/ «=0 is alternating with the properly that the sequence of the absolute values of its terms is decreasing. If we replace any such convergent series with its partial sum, the error we thus make will be less than the absolute value of the first term not included in the partial sum. (We do not give a proof of this theorem.) The error of the approximation sinl°^-r^ 180 1803-3! is thus less than 1805-5! 1U • □ 5.116. Determine the radius r of convergence of the power series 309 CHAPTER 5. 
ESTABLISHING THE ZOO V 2 "! r" 2-^ (2n)! «=0 o 5.777. Calculate the radius of convergence for 2%/" ■ O 5.778. Without calculation determine the radius of convergence of the power series oo 5 n- « = 1 V —_ X" o 5.779. Find the domain of convergence of the power series y«+i 34n n = l O 5.720. Determine for which x e M the power series E <-3)" (x-2)" ^ v/«4+2«3 + lll V 7 converges. O 5.727. Is the radius of convergence of the power series oo oo n—0 n—1 common to all sequences {a«}^L0 of real numbers? O 5.722. Decide whether the following implications hold: (a) If the limit lim ^o2 exists and is finite, then the power se- 3»/„2 ries Z~2 an(x — xd)" n = \ converges absolutely at at least two distinct points x. (b) Conditional convergence of series 2~1™=\ an, 2~1™=\ ^« implies that the series X^i(6fl« — 5b„) converges as well. (c) If a series YlT=o a» satisfies lim a2 = 0, then it is convergent. (d) If a series YlT=i an converges, then the series oo n « = 1 converges absolutely. o 5.725. Approximate cos with error less than 10~5. O 5.124. For the convergent series 310 CHAPTER 5. ESTABLISHING THE ZOO (-1)" ^ V^+100' «=0 bound the error of its approximation by the partial sum sg 999. O 5.125. Express the function y = ex, defined on the whole real line, as an infinite polynomial whose terms are of the form an(x — 1)". Then express the function y = 2X defined on R as an infinite polynomial with terms anx" . Q 5.126. Find the function / to which, for x e R, the sequence of functions converges. Is this convergence uniform on M? O 5.127. Does the series 00 T kde x e R, n = \ converge uniformly on the real line? O 5.128. Approximate (a) the cosine of ten degrees with accuracy of at least 10~5; (b) the definite integral JQ1/2 ^-j- with accuracy of at least 10~3. o 5.729. Determine the power series centered at xG = 0 of the function X f{x) = /V dt, x € R. 0 o 5.130. Using the integral test, find the values a > 0 for which the series oo y - «=i converges. O 5.131. Determine for which x e R the series y_t_x3„ ^ 2" • n ■ ln(n) converges. O 5.132. Determine all x e R for which the power series £ ^ is con- i=\ " vergent. O Solution. Forx e [-1, 1]. □ 5.133. For which x € R does the series oo Eln(n\) nx « = 1 311 CHAPTER 5. ESTABLISHING THE ZOO converge? O 5.134. Determine whether the series oo T (-lf^tan -±= ■ ' n^Jn n = \ converges absolutely, converges conditionally, diverges to +oo, diverges to — oo, or none of the above, (such a series is sometimes said to be oscillating). O 5.135. Calculate the series oo T — t— n-V « = 1 with the help of an appropriate power series. O 5.136. Forx e (-1, 1), add x - Ax2 + 9x3 - 16x4 H---- 5.137. Supposing \ x\ < 1, determine the series oo (a) E^*2-1; «=i 2^n-l (b) Y,n2x> n = \ 5.138. Calculate using the power series E 2n-l (-2)""1 n = \ Jin E 2"-«! X «=0 y O O E(-i)" (2« + l)x2 «=0 for some x e (—1, 1). O 5.739. For x e R, calculate the series oo 1 v3n + l o J. Additions into the ZOO 5.140. Determine the maximal subset of R where the function y = arctg (x21 + sinx) • 5--— can be defined. O Solution. R. 5.141. Write the maximal domain of the function arccos (In x) 312 CHAPTER 5. ESTABLISHING THE ZOO COS X . X3 ' COS X o Solution. (1, e]. 5.142. Determine the domain and the range of the function y 2-3x ■ Then determine the function inverse to this one. O Solution, (-oo, I) U (f, +oo); (-oo, -|) U +oo); y = ff±}, x + -\. 5.143. 
Is the function (a) y (b) y = ^ + l; (c) y = ^; (d) y = ^ + l; (e) y = sin x + tan |; (f) y = ln|lf; (g) y = sinhx = (h) y = coshx = with the maximal domain odd? O Solution, (a) yes; (b) no; (c) no; (d) no; (e) yes; (f) yes; (g) yes; (h) no. 5.144. Is the function (a) y (b) y = ^ + l; (c) y = ^; (d) y = ^ + l; (e) y = sin x + tan |; (f) y = in jif; (g) _y = sinhx COS X . COS X 2 (h) y = coshx = ^+fl with the maximal domain even? O 5.145. Determine whether the function (a) y = sin x • In I x I; (b) y = arccotgx; (c) y = x8 - ^3x6 + 3x2 - 6; (d) y = cos (it — x); (e) y = £»i£±£_ v 7 J 3+7 cos jc with the maximal domain is odd and whether it is even. O 5.146. Is the function (a) y = In (cos x) ; (b) y = tan (3x) + 2 sin (6x) 313 CHAPTER 5. ESTABLISHING THE ZOO with maximal domain periodic? O 5.147. Draw the graphs of the functions /(x)=e|x|, xeR; g(x)=ln|x|, ieK\{0). O 5.148. Draw the graph of the function y = 2_l x , x € 5.149. The functions o sinhx = e 2e , x e R; coshx = 2e , x eR; tanhx = x e M; cothx = ie8\{0) cosh x' ' sinh x ' 1 J are called hyperbolic functions. Determine the derivatives of these functions on their domains. O 5.150. At any point x e R, calculate the derivative of the area hyperbolic sine (denoted arsinh), the function inverse to the hyperbolic sine y = sinh xonl. O Note: The inverse functions to the hyperbolic functions y = coshx, x € [0, +oo), y = tanhx, x e R and y = cothx, x e (—oo, 0)U(0, +oo) are called area hyperbolic functions (y = arsinhx belongs to them, too). They are denoted arcosh, artanh, arcoth, respectively and are defined for x e [1, +00), x e (—1,1), and x e (—00, —1) U (1, +00), respectively. Let us add that (arcoshx)' = jA—^, x > 1, (artanhx)' = \x \ < 1, (arcothx)' = \x \ > 1. 5.151. Calculate: 2 12 12 2 + lH---1---1---1---1---1---- 2! 3! 4! 5! 6! O Solution. Confronting the series with the expansions of the functions sinh and cosh into power series, we get the result sinh(l) +2cosh(l). □ 314 CHAPTER 5. ESTABLISHING THE ZOO K. Additional exercises to the whole chapter 5.152. Determine a polynomial P(x) of the least degree possible satisfying the conditions P(l) = 1, P(2) = 28, P(0) = 2, P'(0) = 1, P'(l) =9. O 5.153. Determine a polynomial P(x) of the least degree possible satisfying the conditions P(0) = 0, P(l) = 4, P(-l) = -2, P'(0) = 1, =7. O 5.754. Determine a polynomial 7J(x) of the least degree possible satisfying the conditions P(0) = -1, = -1, P'(-l) = 10, P'(0) = -1, = 6. O 5.755. From the definition of a limit, prove that lim (x3 - 2) = -2. 5.158. Determine both one-sided limits lim arctan —, lim arctan —. x^0+ X x^O- X Knowing the result, decide existence of the Umit lim arctan —. x^O X 5.159. Do the following limits exist? sinx 5x4 + 1 lim ——, lim- x^O X x^O X 5.160. Calculate the Umit tanx — sinx lim sin x 5.161. Determine 2 sin3 x + 7 sin2 x + 2 sin x — 3 lim-----. x^k/6 2 sin x + 3 sin x — 8 sin x + 3 5.162. For any m, n e N, determine i™ - 1 lim -. x^l X" - 1 o 5.156. From the definition of a limit, determine ,. (l+x)2-3 lim -, x^-i 2 i. e. write the 5(e)-formula as in the previous exercise. O 5.757. From the definition of a limit, show that r 3 (x ~ 2)4 ^ lim -= +oo. o o o o o o 315 CHAPTER 5. ESTABLISHING THE ZOO 5.163. Calculate lim ( v x2 + x — x) . o 5.164. Determine lim x^+oo (xy/l+X2 -X2) . o 5.765. Calculate + cosx lim---. sin x 5.166. Determine sin (Ax) o lim >o Vx + 1 - 1 O 5.767. Calculate v/1 + tan x — V1 — tan x lim -. x^o- smx o 5.76S. 
Calculate 2X + Vl +x2 -x9 - 7x5 + 44x2 lim -—==- ■->-00 3X + ^6x6 + x2 - 18x5 - 592X4 o 5.769. Letlim^^.oo f(x) = 0. Is it true thatlim^-ooC/Cxj-gCx)) = 0 for every increasing function g : R -> M? O 5.770. Determine the limit . 2«-l / n lim-- n^oo \ n + 5 5.777. Calculate sin x — x o lim o 5.772. For x > e, determine the sign of the derivative of the function fix) = arctan lnx -l+lnjc' o Solution. f'(x) < 0, x > e. 5.173. Determine all local extrema of the function y = x In2 X 316 CHAPTER 5. ESTABLISHING THE ZOO defined on the interval (0, +00). O Solution. The function has a local maximum at the point x\ = e~2 and it has a local minimum at the point x2 = 1. 5.174. Is there a real number a such that the function y = ax + sinx has a global minimum on the interval [0, 2tt] at the point x0 = 5n/41 O Solution. There is not: for a = V2/2, there is only a local extremum at the point. 5.175. Find the absolute minimum of the function y = ex — Inx, x > 0 on its domain. O Solution. 2 = e - — In -. e e 5.176. Determine the maximum value of the function y = ^/3xe~x, xeR. O Solution. 4=. 5.177. Find the absolute extrema of the polynomial p(x) = x3 — 3x + 2 on the interval [—3, 2]. O Solution.4 = p{-\) = p (2), -16 = p (-3). 5.178. Let a moving object's distance in time be given as follows: sit) = -it - 3)2 + 16, (e[0,7], where (is the time in seconds, and the distance is in meters. Determine (a) the initial (i. e. at the time ( = 0 s) speed of the object; (b) the time and position at which its speed is zero; (c) its speed and acceleration at the time ( = 4 s. Let us remark that the object's speed is the derivative of its position and acceleration is the derivative of its speed. O 5.179. From the definition of a derivative /' of a function / at the point x0, calculate /' for fix) = Vx at any point x0 > 0. O 5.180. Determine whether the derivative of the function fix) = x arctan\, x e R \ {0}, /(0) = 0 at the point x0 = 0 exists. O 5.181. Does the derivative of the function y = sin (arctan (J 12x21 + 11 | • ^^12 + sin(sin(sin(sinx))), x e R at the point xo = tt3 + 371 exist? O 5.182. Determine whether the derivative of the function f(x) = (x2 - ljsin^, x;t-l(x6l), /(-1)=0 317 CHAPTER 5. ESTABLISHING THE ZOO at the point x0 = — 1 exists. O 5.183. Give an example of a function / : R -» R which is continuous on the whole real axis, but does not have derivatives at the points x\ = 5, x2 = 9. Q 5.184. Find functions / and g which have derivatives at no real point, yet their composition / o g is differentiable at every point of the real line. O 5.185. Using the basic formulae, calculate the derivative of the function (a) y = (2 — x2) cosx + 2x sinx, x e R; (b) y = sin (sinx) , x e R; (c) y = sin (In (x3 + 2x)) , x e (0, +oo); 1+Jt-Jt2 1-Jt+Jt2 (d) y = fe^ ^ e o o 5.786. By any means, determine the derivative of the function (a) y = tJx y/x yf^c, x € (0, +oo); (b) y = In |tan || , x e M \ {«7r; n e Z}. 5.187. Write the derivative of the function y = sin (sin (sinx)), x e 5.188. For the function i=£ j. .3/v3 /(x) = arccos + Vx3 having the maximum possible domain, calculate /' on the largest subset of R where this derivative exists. O 5.189. At any point x ^ {nrr; n € Z}, determine the first derivative of the function y = Vsin x. O 5.190. For x e R, differentiate o xV 1 + x2 + e* (x2 - 2x + 2) . O 5.191. Calculate /'(l) if /(jc) = (x - l)(x - 2)2(x - 3)3, x e R. o 5.792. Determine the derivative of the function |x|^l(xeM). 318 CHAPTER 5. ESTABLISHING THE ZOO o 5.193. 
Differentiate (with respect to the real variable x) x In2 (x + Vl +x2) - 2Vl +x2 In (x + Vl + x2) + 2x at all points where the derivative exists. Simplify the obtained expression. O 5.194. Determine /' on a maximal set if f(x) = \ogx e. O 5.195. Express the derivative of the product of four functions [f(x)g(x)h(x)k(x)] ' as a sum of products of their derivatives and themselves, supposing all of these functions are differ-entiable. O 5.196. Determine the derivative of the function _ x3 (x+l)2jx~+2 y (x+3)2 for x > 0. O 5.197. A highway patrol helicopter is flying 3 kilometers above a highway at the speed of 120 kph. Its pilot localizes a car whose straight-line distance from the helicopter is 5 kilometers and which is approaching it at 160 kph (with regard to the helicopter). Determine the car's speed with regard to a tin lying on the highway. Solution. For the sake of simplicity, we will omit units of measurement (distances will be expressed in kilometers and times in hours, speeds in kph, then). The helicopter's position at time t can be expressed by the point [y(t), 3], and the car's position by [x(t), 0], then. (We choose the axes so that the helicopter and the car are moving along the x-axis.) Let us denote by s(t) the straight-hne distance of the car from the helicopter and by t0 the moment mentioned in the problem's statement. Let us calculate the car's speed with respect to the origin. We can suppose that x(t) > y(t) > 0, then x' (t) < 0, y (0 > 0 for the considered time moments t since the car is approaching the point [0, 0] from the right - the value x(t) decreases as t increases, therefore x1 (i) < 0. Similarly we can get that y (0 > 0 and also s' (t) < 0. Let us add that, for instance, yf (t) determines the rate of change of the function y at time t, i. e. the helicopter's speed We know that s (t0) = 5, *' (to) = -160, y' (t0) = 120 and that (s(t) is the hypotenuse of the right triangle) (5.9) (x(t)-y(t))2 + 32 = s2(t). Hence it follows (x(t) > y(t) > 0) that (x (to) - y (to))2 + 32 = 52, i. e. x (t0) - y (t0) = 4. By differentiating the identity (||5.9||), we get 2 (x(t) - y(t)) (x' (0 - y (0) = 2s(t)s> (t) and then for t = t0, 2-A(x'(to)- 120) =2-5-(-160), i.e. x' (t0) = -80. 319 CHAPTER 5. ESTABLISHING THE ZOO We have calculated that the car is approaching the tin at 80 kph. It suffices to realize with which units of measurement we worked. Having obtained a negative value is caused by our choice of the coordinate system. □ 5.198. For which a e R is the cubic polynomial P which satisfies the conditions P (0) = 1, P'(0) = 1, P(\) =2fl+2, P'(l) = 5a + 1, a monotonie function on the whole M? Solution. >From the conditions P(0) = 1 and P'(0) = 1 it follows that P(x) = bx3 + cx2 + x + 1 where b, c e R; the two remaining conditions determine two equations for the variables b and c: b + c + 2 = 2a + 2, 3b + 2c + \ = 5a + 1 with the unique solution b = c = a. The polynomials which satisfy the desired conditions are thus of the form P(x) = ax3 + ax2 + x + 1, a e R. The monotonicity of the polynomial is equivalent to having no local extrema. The extrema can occur only at those points where the derivative changes sign. Therefore, the polynomial is monotonie if and only if its derivative keeps the sign on the whole R. The derivative is P'(x) = 3ax2 +2ax + 1 and it will keep the sign iff the discriminant is non-positive. Thus we get the condition 4a2 -12a < 0 4a(a - 3) < 0, which is true for a e [0, 3]. 
However, for a = 0 the polynomial P is monotonie, yet not cubic, Thus the set of satisfactory numbers a is the interval (0, 3]. □ 5.199. Regiomontanus' problem, 1471. In the museum, there is a painting on the wall. Its lower edge is a meters above ground and its ^ upper edge b meters, then (its height thus equals b—a). A tourist is looking at the painting, her eyes being at height h < a meters above ground. (The reason for the inequality h < a can, for instance, be to allow more equally tall visitors to view the painting simultaneously in several rows.) How far from the wall should the tourist stand if she wants to maximize her angle of view at the painting? 320 CHAPTER 5. ESTABLISHING THE ZOO Solution. Let us denote by x the distance (in meters) of the tourist from the wall and by cp her angle of view at the painting. Further, let us set (see the picture) the angles a, f3 e (0, jt/2) by tana = ^, tan£ = x ' ' x Our task is to maximize cp = a — f3. Let us add that for h > b, one can proceed analogously and for h € [a, b], the angle cp increases as x decreases (cp = it for x = 0 and h € (a, b)). >From the condition h < a it follows that the angle cp is acute, i. e. cp e (0, it 12). Since the function y = tanx is increasing on the interval (0, it 12), we can turn our attention to maximizing the value tan cp. We have that tan cp = tan (a — (3) tana—tan fi 1 +tan a tan fi x(b—a) l + t±.!LdL x2+(h_h-)(a_h-)- So it suffices to find the global maximum of the function /(*) x(b—a) x2+(b-h)(a-h) ' x e [0, +oo). From the expression (b-a)[x2+(b-h)(a-h)]-2x2(b-a) [x2+(b-h)(a-h)f f'(x) we can see that f'(x) > 0 for x e (o, y/(b - h)(a - hf) , f'(x) < 0 for x e ^/(b - h)(a - h), +oo) (b-a)[(b-h)(a-h)-x2 [x2+(b-h)(a-h)f X e (0, +oo), Hence the function / has its global maximum at the point x0 = — h)(a — h) (let us remind the inequalities h < a < b). The point xo can, of course, be determined by other means. For instance, we can (instead of looking for the maximum of the positive function / on the interval (0, +oo)) try to find the global minimum of the function 321 CHAPTER 5. ESTABLISHING THE ZOO g(x) = -f- = *2+(»-W«-ft> = ^ + (»-*)(flTA), xe(0,+00) with the help of the so-called AM-GM inequality (between the arithmetic and geometric means) y-^>yfyiY2, V!,y2>0, where the equality occurs iff y\ = y2. The choice yiW = ^. ^) = ^rrI then gives g(x) = yi(x) + y2(x) > 2 Vvi(x)y2(x) = ^ V(* - A) (a - A). Therefore, if there is a number x > 0 for which yi(x) = y2(x), then the function g has the global minimum at x. The equation yl(*) = y2(*), i.e. ^ = has a unique positive solution x0 = y/(b — h)(a — h). We have determined the ideal distance of the tourist from the wall in two different ways. The angle corresponding to x0 is (p0 = arctan 2 x.f'~^ ,, = arctan;-h—- ' xl+(b-h)(a-h) 2 J(b-h)(a-h)' When looking at the painting from the ground (being an ant, for instance), we have h = 0, and so b—a x0 = yjab, cpo = arctan; 2 \jab If the painting is 1 meter high and its lower edge is 2 meters above ground (a = 2, b = 3), then the ant will see the painting at the largest angle i and the point B in a homogeneous space with speed of light v2. See the picture. Solution. Once again, we will omit units of measurement. We can assume that distances are given in meters, speeds t>i, v2 in meters per second (and time in seconds, then). The ray is determined by Fermat's principle of least time: of all the paths between the points A and B, the light will go along the one which can be traversed in the least time. 
In homogeneous spaces, the ray will be a straight line (in this case, we will consider its segment). So it suffices to determine the point R (given by the value x) where the ray refracts. The distance between the points A and R is Jh\ + x2, between points R and B it is y h\ + (d — x)2, then. The total time of the transmission of energy between the points A and B is thus given by the function T(x) = 1--h-- in the variable x e [0,d]. Let us emphasize that we want to find the point x e [0, d] at which the value T(x) is minimal. The derivative 322 CHAPTER 5. ESTABLISHING THE ZOO T'(x) "'■ ''ij-t-a- V2 w«2 d—x vi.h]+x2 v2Jhl+(d-x)2 is a continuous function on the interval [0, d], so its sign can be easily described by its zero points. From the equation T'(x) = 0, i. e. x - d~x it follows that x jh2+(d-x)2 This expression is useful for us because (see the picture) d—x sini V] (5.10) —— = —. sin 2) is increasing with respect to x. Since T'(0) < 0 and T'(d) > 0, there is exactly one stationary point x0. From the inequalities T'(x) < 0 for x e [0, xq) and T'(x) > 0 for x e (xo, i and v2 is constant for the given homogeneous spaces and determines an important quantity which describes the interface of optical spaces. It is called a refractive index and denoted by n. Usually, the first space is vacuum, i. e. v\ = c, and v2 = v, thus obtaining the (absolute) index of refraction n = c/v. For vacuum, we get n = 1, of course. This value is also used for air since its refractive index at the standard conditions (i. e. pressure of 101 325 Pa, temperature of 293 K and absolute humidity of 0.9 gm~3) is n = 1.000272. Other spaces have n > 1 (n = 1.31 for ice, n = 1.33 for water, n = 1.5 for glass). However, the refractive index also depends on the wave length of the electromagnetic radiation in question (for example, for water and light, it ranges from n = 1.331 to n = 1.344), where the index ordinarily decreases as the wave length increases. The speed of light in an optical space having n > 1 depends on its frequency. We talk about the dispersion of light. The dispersion causes rays of different colors to refract at different angles. (The violet ray refracts the most and the red ray refracts the least.) This is also the origin of a rainbow. We can further remind the well-known Newton's experiment with a glass prism from 1666. Eventually, let us remark that our task always has a solution because we can choose the point R arbitrarily. If, together with the speeds t>i and v2, the angle 0. It thus makes sense to analyze the function (see (||5.11||) and (||5.12||)) a(y) = 4 arcsin ^ — 2arcsin j, y € [0, R]. By selecting the appropriate unit of length (for which R = 1) we can turn to the function a(x) = 4arcsin^ — 2arcsinx, x e [0, 1]. Having calculated the derivative a'(x) = —jL= - 2, x e (0, 1), we can easily determine that the equation a'(x) =0 has a unique solution 324 CHAPTER 5. ESTABLISHING THE ZOO x0 = y^e(0, 1), if «2eE(l,4). Let us set n = 4/3 (which is approximately the refractive index of water). Further, a'(x) > 0, x e (0, xo), c/(x) < 0, x e (xo, 1). We have found that at the point x0 = /^P- = I Ti = 0.86, the function a has a global maximum o-(xo) = 4 arcsin -4? - 2 arcsin ^ = 0.734 rad ^ 42 °. y v/ 2V3 3V3 Although it is amazing that the peak of the rainbow cannot be above the level of approximately 42 ° with regard to the observer, what is even more amazing are the values a (0.14) = 39.4°, a (0.94) = 39.2 °, a(0.8) = 41.2°, a (0.9) = 41.5 °. 
Those imply (the function a is increasing on the interval [0, x0] and decreasing on the interval [x0, 1]) that more than 20 % of the values a lie in the band from around 39 ° to around 42 °, and 10 % lie in a band thinner than 1 °. Furthermore, if we consider a(0.84) =41.9°, a (0.88) = 41.9 °, we can see that the rays for which a is close to 42 ° have the greatest intensity. Let us emphasize that this is an instance of the so-called principle of minimum deviation: the highest concentration of the diffused light happens to be at the rays with minimum deviation since the total angle deviation of the ray equals the angle 8 = it — a. The droplets from which the rays creating the rainbow for the observer come lie on the surface of a cone having the central angle equal to 2a (x0). The part of this cone which is above ground then appears as the rainbow arc to the observer (see the picture). Thus when the sun is setting, the rainbow has the shape of a semicircle. Let us remark that the rainbow exists only with regard to its observer - it is not anchored in the space. Eventually, let us add that the circular shape of the rainbow was examined as early as 1635-1637 by René Descartes. □ 5.202. L'Hospital's pulley. A rope of length r is tied to the ceiling at point A. A pulley is attached to its other end. Another §,, rope of length I > \fcP- + r2, going through the pulley, is tied to the ceiling at point B which is at distance d from the point A. A weight is attached to this rope. In what position will the x weight stabilize (the system will be in a stationary position)? Omit the mass and the size of the ropes and the pulley. See the picture. Solution. The system will be in a stationary position if its potential energy is minimized, i. e. the distance f(x) of the weight from the ceiling is maximal. However, this means that for r > d, the pulley only moves under the point B. Further on we will thus suppose that r < d. By the Pythagorean theorem, the distance of the pulley from the ceiling is Vr2 — x2 and from the weight then I — y/(d — x)2 + r2 — x2 , which gives fix) = Vr2 - x2 + I - y/(d - x)2 + r2 - x2 . The state of the system is fully given by the value x e [0, r] (see the picture), so it suffices to find the global maximum of the function / on the interval [0, r]. First, we calculate the derivative 325 CHAPTER 5. ESTABLISHING THE ZOO f W = Jfl -x2 ~ J(d-x)2+fl -x2 = Jr2 -x2 + J(d-x)2+fl -x2 ' X ^ ('0' r-*" Exponentiating the equatino f'(x) = 0 for x e (0, r) leads to x1 = d2 r2 —x2 {d—x)2+r2 —x2 Multiplying both sides by (r2 — x2) {(d — x)2 + r2 — x2) then leads to 2dx3 - (2d2 +r2)x2 + d2?2 = 0, x e (0, r). If we notice that one of the roots of the left-hand polynomial is x = d, we can easily transform the last equation into the form (x-d) (2dx2 - r2x - dr2) = 0, x e (0, r), or (we have a formula for the quadratic equation) 2d(x-d)(x- d+f&M.) (x - ^g^) =0, xe (0, r). Hence we can see that the equation f'(x) = 0 has at most one solution on the interval (0, r). (Since r < d and \Jr2 + Sd2 > r, there are surely not two roots of the considered polynomial in x in the interval (0, r).) It remains to determine whether i2 W r2 +%& 1 X0 - -4d- - 4 r G (0, r). Realizing that r,d > 0 and r < d,we get 0 < x0 < ^ r l+V^T r. As the function /' is continuous on the interval (0, r), it can change sign only at the point x0. From the limits lim f'(x) = -jf=, lim f'(x) = -oo, x^0+ -Jdl+rl x^r- it follows that fix) > 0, x e (0, x0), /'(jc) < 0, x e (x0, r). 
Thus the function / has the global maximum on the interval [0, r] at the point x0. □ 5.203. A nameless mail company can only transport parcels whose length does not exceed 108 inches and whose sum of length and maximal perimeter is at most 165 inches. Find the largest (i. e. having the greatest volume) parcel which can be transported by this company. Solution. Let M denote the value 165 (inches) and x the parcel's length (in inches as well). Apparently, the wanted parcel has such a shape that for any t e (0, x), its cross section has a constant perimeter (the maximal one). We will denote this perimeter by p (in inches, again). We want the parcel to have the greatest volume so that the cross section of a given perimeter has the greatest area possible. It is not difficult to realize that the largest planar figure of a given perimeter is a disc. Thus we have derived that the desired parcel has the shape of a cylinder with height equal to x and radius r = p/2n. Its volume is V =jtr2x = and it must be that p + x < M and x < 108. Thus we consider the parcel for which p + x = M. Its volume is V(X) = i-M^2L = *3-2Mx2+M2x where x e (Q) 10g] _ Having calculated the derivative 326 CHAPTER 5. ESTABLISHING THE ZOO v,(x) = ^-amx+m2 = 3^-^-?)^ x e (Q) 10g) j we easily find out the the function V is increasing on the interval (0, 55] = (0, M/3] and decreasing on the interval [55, 108] = [M/3, min{108, A/}]. The greatest volume is thus obtained for x = M/3, where v (f) = m = 0.011789 M3 ^ 0.867 8 m3. If the company also required that the parcel have the shape of a rectangular cuboid (or more generally a right prism of a given number of faces), we can repeat the previous reasoning for a given cross section of area S without specifying what the cross section looks like. It suffices to realize that necessarily S = kp2 for some k > 0 which is determined by the shape of the cross section. (If we change only the size of the sides of the polygon which is the cross section, then its perimeter will change by the same ratio. However, its area will change by square of the ratio.) Thus the parcel's volume is the function V(x) = Sx = kp2x = k (M — x)2x, x e (0, 108]. The constant k does not affect the point of the global maximum of the function V, so the maximum is again at the point x = M/3. For instance, for the largest right prism having a square base, we have p = M — x = 2M/3, i. e. the length of the square's sides is a = M/6 and the volume is then V =a2x = = 0.009 259 M3 ^0.681 6 m3. For a parcel in the shape of a ball (when x is the diameter), the condition p + x < M can immediately be expressed as nx + x < M, i. e. x < M/{jt + 1) < 108. Thus for x = M/{jt + 1), we get the maximal volume V = \it (f)3 = -^-3 = 0.007 370 M3 ^0.542 6 m3. Similarly, for a parcel in the shape of a cube (when x is the length of the cube's edges), the condition p + x < M means x < M/5 < 108. Thus for x = M/5 we get the maximal volume V =x3 = (f)3 = 0.008 M3 « 0.588 9 m3. Let us add that the length of the edges of the cube which has the same volume as the found cylinder is a = -M-= 0.227 595 M « 0.953 849 m. Let us realize its length and perimeter sum to 5a = 1.138 M, i. e. more than the company's limit by around 14 %. □ 5.204. A large military area (further denoted by MA) having the shape of a square and area of 100 km2 is bounded along its perimeter by a narrow path. From the starting point in one corner of MA, one can get to the target point inside MA by going 5 km along the path and then 2 km perpendicularly to it. 
However, one can also go along the path at 5 kph for any time period and then askew through the MA at 3 kph. What distance do you have to travel along the path if you want to get there as soon as possible? Solution. To travel x km along the path (where x e [0, 5]), we need x/5 hours. Our way through MA will then be V22 + (5 - x)2 = Vx2 - lOx + 29 kilometers long and we will cover it in \Jx2 — lOx + 29/3 hours. Altogether, our journey will take 327 CHAPTER 5. ESTABLISHING THE ZOO f(x) = \x + \y/x2 - lOx + 29 hours (let us remind that x e [0, 5]). The only zero point of the function fix) = \ + 1 , 1 jt-5 5 3 v/jc2-10jc+29 is x = 7/2. Since the derivative /' exists at every point of the interval [0, 5] and since /(D = ?|i kph and run along the shore at v2 kph? How long will the journey take? Solution. The optimal strategy is apparently given by first rowing straight to the shore at some point [0, x] for x € [0,1] and then running along the shore to the target point [0,1] (see the picture), so the trajectory consists of two line segments (or only one segment, in the case when x = I). The voyage to the point [0, x] on the shore will take hours and the final run then l—x hours. «2 We want to minimize the total time, i. e. the function on the interval [0, /]. Further, we can assume that t>i < v2. (Clearly for t>i > v2 the optimal strategy is to row straight to the target point, which corresponds to x = I.) First, we calculate the first derivative and then the second derivative f(x) = —rJ==, jce(0,Z)- / (d2 +x Further, we solve the equation t' (x) = 0, i. e. Exponentiating this equation gives v-2 l +x2 V2 Simple rearrangements lead to 2 \v? / • v7 xA = v 7 2, i. e. ^ —-- -(5) Let us realize that we consider only x e (0,1). Thus we are interested in whether ^- d -2- < I, * " ^--'- If this inequaUty holds, then also v\ < v2 and the function i changes sign only at the point X0 = G (0,/), 328 CHAPTER 5. ESTABLISHING THE ZOO and this change is from negative to positive (consider limx^0+ f (x) < 0 and t" (x) > 0, x e (0, /)). This means that in this case, at the point x0 there is the global minimum of the function t on the interval [0, /]. However, if the inequality (||5.205||) is false, then we have f (x) < 0 for all x e (0,1) whence it follows that the global minimum of the function t on [0,1] is at the right-hand marginal point (the function t is decreasing on its domain). The fastest journey will take (in hours) t (x0) d2 +*l l-x0 1 V2 Vi a d V2 dv2+ivifi^f-id dV2(i-(a)2)+hlyi-(^)2 dV2fi^f+iv "2 V i d supposing (|| 5.2051|), and if (115.20511) does not hold. ViV2 t (I) = hours □ 5.206. A company is looking for a rectangular patch of land with sides of lengths 5a and b. The company wants to enclose it with a fence and then split it into 5 equal parts (each being a rectangle with sides a, b) by further fences. For which values of a, b will the area S = Sab of the patch be maximal if the total length of the used fences is to equal 2 400 m? Solution. Let us reformulate the statement of the problem: We want to maximize the product Sab while satisfying the condition (5.13) 6b + 10a = 2400, a,b>0. It can easily be shown that the function a h-> 5a 2 400-lOa defined for a e [0, 240] takes the maximal value at the point a = 120. Hence the result is a = 120 m, b = 200 m. Let us add that the mentioned value of b immediately follows from (||5.13||). □ 5.207. 
A rectangle is inscribed into an equilateral triangle with sides of length a so that one of its sides lies on one of the triangle's sides and the other two of the rectangle's vertices lie on the remaining sides of the triangle. What is the maximum possible area of the rectangle? 5.208. Choose the dimensions of an (open) swimming pool whose volume is 32 m3 and whose bottom has the shape of a square, so that one would spare the least amount of paint possible to prime its bottom and walls. O 5.209. Express the number 28 as a sum of two non-negative numbers such that the sum of the first summand squared and the second summand cubed is as small as possible. O 329 CHAPTER 5. ESTABLISHING THE ZOO 5.210. With the help of the first derivative, find the real number a > 0 for which the sum a + l/a is minimal. Then solve this problem without using the differential calculus. O 5.211. Inscribe a rectangle with the greatest perimeter possible into a semidisc with radius r. Determine the rectangle's perimeter. O 5.212. Among the rectangles with perimeter Ac, find the one having the greatest area (if such one exists) and determine the lengths of its sides. O 5.213. Find the height h and the radius r of the largest (i. e. having the greatest volume) cone which fits into a ball of radius R. Q 5.214. From the triangles with a given perimeter p, select the one with the greatest area. O 5.215. On the parabola given by the equation 2x2 — 2y = 9, find the points which are closest to the origin of the coordinate system. O 5.216. Your task is to create a one-liter tin having the "usual" shape of a cylinder so that the minimal amount of material would be used. Determine the proper ratio between its height h and radius r. O 5.217. Determine the distance of the point [3,-l]el2 from the parabola y = x2 — x + \. Q 5.218. Determine the distance of the point [—4, —2] e R2 from the parabola y = x2 + x + 1. O 5.219. At the time t = 0, a car left the point A = [5, 0] at the speed of 4 units per second in the direction (—1, 0). At the same time, another car left the point B = [—2, —1] at the speed of 2 units per second in the direction (0, 1). When will the cars be closest to each other and what will their distance be at that moment? O 5.220. At the time t = 0, a car left the point A = [0, 0] at 2 units per second in the direction (1,0). At the same time, another car left the point B = [1, — 1] at 3 units per second in the direction (0, 1). When will they be closest to each other and what will the distance be? O 5.221. Determine the maximum possible volume of a cone with surface area 3tv cm2 (the surface area of its base is included as well). The area of a cone is P = 7tr(r + h), its volume then V = \Ttr2h, where r is the radius of its base and h is its height. O 5.222. A 13 feet long ladder is leaned against a house. Suddenly the base of the ladder slips off and the ladder begins to go down (still touching the house at its other end). When the base of the ladder is 12 feet from the house, it is moving at 5 feet per second from it. At this moment: (a) What is the speed of the top of the ladder? (b) What is the rate of change of the triangle dehmited by the house, the ladder, and ground? (c) What is the rate of change of the angle enclosed by the ladder and the ground? o 5.223. 
Suppose you own an excess of funds without the possibility to invest outside your own factory which acts at a regulated market with a nearly unhmited demand and a hmited access to some key raw materials, which allows you to produce at most 10 000 products per day. You know that the raw profit p and the expenses e, as functions of a variable x which determines the average number of products per day, satisfy 330 CHAPTER 5. ESTABLISHING THE ZOO v(x) = 9x, n(x) = x3 - 6x2 + 15x, x e [0, 10]. At what production will you profit the most from your factory? O 5.224. Determine lim ( cotx-- x^O \ X Solution. If we realize that 1 lim cotx = +oo, lim — = +oo, x^0+ x^0+ X 1 lim cotx = — oo, lim — = — oo, jt-»0- x^O- X we can see that both one-sided limits are of the type oo — oo. We can thus consider the (two-sided) limit. We will write the cotangent function as the ratio of the cosine and the sine and convert the fractions to a common denominator, i. e. 1 \ x cos x — sin x lim cotx--= lim-. x^o \ x) x^o xsmx Thus we have obtained an expression of the type 0/0 for which we get (by 1'Hospital's rule) xcosx —sinx cosx — x sinx — cosx —xsinx lim-= lim-= lim x^o xsinx x^o sinx + xcosx sinx + x cosx By one more use of 1'Hospital's rule for the type 0/0, we then get —xsinx —sinx—xcosx 0 — 0 lim-= lim-=-= 0. x^o sinx + x cosx cosx + cosx — x sinx 1 + 1 — 0 5.225. Determine the limit 7TX 5.226. Calculate lim (1 — x) tan ■ 2 lim (— — xtan x). ^f-V2 / 5.227. Using l'Hospital's rule, determine j^((3"-2"W 5.228. Calculate . 1 1 lim l \ 2 In x x2 — 1 5.229. By l'Hospital's rule, calculate the limit ^2 2 lim cos x^+oo \ x □ o o o o o 331 CHAPTER 5. ESTABLISHING THE ZOO 5.230. Determine lim (1 — cosx)s o 5.231. Determine the following Umits lim xtaT, Hm xisz, where a e M is arbitrary. O 5.232. By any means, verify that ex - 1 lim-= 1. x^O X o 5.233. By applying the ratio test (also called D'Alembert's criterion; see 5.46), determine whether the infinite series (a) E n = l oo (b) E ff; «=i (c) E „".„, converges « = 1 Solution. Since (a„ > 0 for all n) 2*+1-(«+2)3-3* -• ■v-a-Ti3 3"+1-2"-(« + l) (a) um 22±i = lim ll[<»?rf; = lim = lim ^ = f < 1; v y „^oo an 3»+1-2»-(« + D3 3(« + l)3 „^^ 3«3 3 (b) lim 22±i = Km f^l- • = Urn 4r = 0 < 1; (C) Km 2s±l = lim (, i^r'n, • = I™ t^tt • lim ^ = lim 4 • v y „^oo an \(« + l)2-(n + 1)! «" / „^00 (« + l)2 „^00 «" „^00 «2 lim (l + i)" = 1 e > 1, the series (a) converges; (b) converges; (c) does not converge (it diverges to +00). □ 5.234. By applying the root test (Cauchy's criterion), determine whether the infinite series (a) E ln"(« + l); (b) E «=1 00 (c) Earcsin"! «=1 converges. Solution. Once again we consider series with non-negative terms only, where (a) lim ^fa~n = lim = 0 < 1; («±±y iim (i+iy (b) lim 4a~n = lim ^TT = 7°°V L = f < 1; (c) lim ^/öJJ" = lim arcsin = arcsin 0 = 0 < 1. This means that all of the examined series converge. □ 332 CHAPTER 5. ESTABLISHING THE ZOO 5.235. Determine whether the series oo (a) £(-D" ln(l + £); n = \ oo 2 (b) E ^ ■ .i! « = 1 oo (c) v (~3)" «=i converges. Solution. The case (a). By l'Hospital's rule, we have r K1+^) r r 1 1 lim v ! ' = lim 2 ,- = lim —l— = 1, hoo 2* x^*+oo (2^) x^*+oo 1 + 2* hence 0 < In (1 + < £ for all sufficiently large neN. However, we know that the series E^i *s convergent. So it must be that 00 £ln(l + £) < +00, «=i i. e. the examined series converges (absolutely). The case (b). The ratio test gives lim i- 2("+1)2-«l i- 22"+1 i- 2-4" hm--= lim ^—r = lim ^-V = +00. 
«^00 (« + l)!-2«2 «^00 " + 1 Thus the series does not converge. The case (c). Now we will use the general version of the root test lim sup y\an I = lim sup 6+3_1)n = f < 1, whence it follows that the series is (absolutely) convergent. □ 5.236. By any means, determine whether the following alternating series converge: (a) E(-l) n « +3«-l . (3«-2)2 n = \ /u\ —1 3n4-3n3+9n- yO) *■) (5«3-2)4" _±\h-1 3n4-3n3+9n-l n = l Solution. The case (a). Since we have that „2 lim = lim = i y^ 0, „^oo (3«-2)2 9«2 9 7-' it immediately follows that the limit does not exist. Therefore, the series does not converge (a necessary condition for the convergence is not satisfied). The case (b). We have seen that when applying the ratio (or root) test, the polynomials neither in the numerator nor in the denominator affect the value of the examined limit. Let us thus consider the series 00 4« n = l for which we have lim "n+\ 4-oo V3« «=>oo \ V3«/ □ 5.238. Determine whether the series oo (a) E ^; n = \ (b) E cos(jrrc) converges absolutely, converges conditionally, or does not converge at all. Solution. The case (a). It is easy to show that this series converges absolutely. For instance, E I | — E ^ < E 2« = ^ n — 1 n — 1 n—0 and the second inequality has already been proven. The case (b). We can see that cos (jtn) = (—1)", n e N. So we have an alternating series such that the sequence of the absolute values of its terms is decreasing. Therefore, from the Umit lim 4= = 0 it follows that the series is convergent. On the other hand, oo oo oo J2 = Et?>E1 = +°°- n = l n = l v" n = l Thus the series converges conditionally. □ 5.239. Calculate the series OO / v n = l (b) E Jr; oo (c) E (42/1-1 + 42^)' n = l 00 (d) E f; n = l 00 (e) 2E (3« + l)(3«+4)' «=0 Solution. The case (a). By the definition, 00 / 1 1 \ n = l lim «=>oo VVvT s/l ) + (J2 V3)+"'+(i vil)) lim «=>oo (1 + (-75 + 7l) + --- + (-^ + 7i)-7fcT) = L 334 CHAPTER 5. ESTABLISHING THE ZOO The case (b). Apparently, this is five times the convergent geometric series with the common ratio q = 1/3, hence cx cx ^ E pr = 5 E (3) = 5 • -—y = y- «=0 «=0 The case (c). We have that (substituting m = n — 1) E (42/1-1 + 4>) 4 E (42/1-2) ~T~ 16 E (42/1-2) « — 1 « — 1 « — 1 CX) CX) 14 V / Um 14 1 14 f 3 , 2.\ y- J_ _ 14 y- (J_\« \4 T 16/ ^ 42m 16 ^ V 16/ 16 1__L 15- m=0 m=0 16 The series of linear combinations was expressed as a linear combination of series (to be more precise, as a sum of series with factoring out the constants), which is a valid modification supposing the obtained series are absolutely convergent. The case (d). From the partial sum sn = | + + Jr + ••• + £, neN, we immediately obtain that 3 — 32 T 33 T T yi T 3„+i , « c «■ Thus v — I_i_X_i_J__i_..._i_J___«_ i, c w Since lim = 0, we get CX « Ef =lim| (,„-|) = f^E^ = i£(ir = i(^T-i) = i- i=l v 3 7 The case (e). It suffices to use the form (the so-called partial fraction decomposition) i 1111 (3« + l)(3«+4) 3 3« + l 3 3«+4 which gives , neNU(O), ^ (3« + l)(3«+4) 3 V 4^4 7^7 10 ^ ^ 3« + l 3«+4/ n=0 n^oo im i (l — t^tt) = k- 3 V 3«+4/ 3 □ 5.240. Verify that cx °° i E < E 2F-n — l n—0 Solution. We can immediately see that 1 < or the general bound 1<1 .L + ±<2-^-± — + — + — + —<4- — — - 1 - A' 22 + 32 < Z 22 — 2' 42 + 52 + 62 + 72 < ^ 42 — 4' (2/!)2 T -T p/i+l _1)2 ^ ^ (2/1)2 — 2"' " *= 1X1 • Hence (by comparing the terms of both of the series) we get the wanted inequality, from which, by the way, it follows that the series E^Li ^2 is absolutely convergent. Let us specify that CX 2 °° 1 n — l n—0 335 CHAPTER 5. 
ESTABLISHING THE ZOO □ 5.241. Examine convergence of the series oo t—1 n « = 1 Solution. Let us try to add up the terms of this series. We have that oo T In s±i = lim (In } + In § + In f + • • • + In s±±) = j " «=>oo " 7 lim In 2-3l49;("+1) = lim In (n + 1) = +oo. H=>0o 1-z-3---H n=>0o Thus the series diverges to +oo. □ 5.242. Prove that the series g arctan "2+2"+3>+4 g _^±l_ n + l do not converge. Solution. Since and lim arctan "2+2"+3f+4 = iim arctan = f !• 3"-t-1 1' 3" lim 3 2_ = lim — = +oo, the necessary condition lim an = 0 for the series Eh==h0 an to converge does not hold. □ oo E n=>oo 5.243. What is the series 71n« n=L Solution. From the inequalities (consider the graph of the natural logarithm) 1 < Inn < n, n > 3, neN it follows that Vl < yinn < n > 3, n e N. By the squeeze theorem (5.21), i lim vlnn = 1, i. e. lim -=== = 1. «=>oo «=>oo Vinn Thus the series is not convergent. Since all its terms are non-negative, it must diverge to +oo. □ 5.244. Find out whether the series oo 1 (a) Ett; (n + l)-3« n=0 (b) E ^ «=i oo (C) E~T" v 7 n —In n n = l converges. Solution. All of the three enlisted series consist of non-negative terms only, so the series either converges, or diverges to +oo. We have oo oo (a) E ^ £ (5)" = T_T < +°°' 00 9 00 9 00 « — 1 77 — 1 « — 1 336 chapter 5. establishing the zoo (c) E —1— > E 1 = +00. n — 1 n — 1 Hence it follows that (a) converges; (b) diverges to +00; (c) diverges to +00. □ 5.245. Show that the so-called harmonic series E1 «=i diverges. Solution. For any natural number k, the sum of the first 2h terms of this series is greater than k/2: 1111111 1 + ~ + 7 + 7 + 7 + 7 + 7 + 7 + --- 2 3 4 5 6 7 8 1.1_1 "4+4-2 .i+i+i+i-i as the sum of the terms from 2l + 1 to 2l+l is always greater than 2l-times (its number) 1/2' (the least one of them), which sums to 1/2. □ 5.246. Determine whether the following series converge, or diverge: oo i) E- « = 1 oo «=i m) E 1 „.2100000 «=1 oo iy) E (1+0« «=1 Solution. i) We will examine the convergence by the ratio test: 2(72 + 1) 2n+\ lim an + \ = lim n + l In n—>oo n—>oo Z n lim n—>oo 2 > 1, so the series diverges. ii) We will bound the series from below: we know that £ < for any natural number n. Thus the sequence of the partial sums s„ of the examined series and the sequence of the partial sums s'n of the harmonic series satisfy: n ^ 1 r = l V r = l Since the harmonic series diverges (see the previous exercise), by definition, the sequence of its partial sums {s'n}'^=l diverges as well. Therefore the sequence of its partial sums {s„}^Li also diverges and so does the examined sequence. iii) This series is divergent since it is a multiple of the harmonic series. iv) The examined series is geometric, with common ratio j^j. Such a sequence is convergent if and only if the absolute value of the common ratio is less than one. We know that V2 1 TT7 i i1 1 -i '2 ~ 21' 1 1 4 + 4 < 1, 337 CHAPTER 5. ESTABLISHING THE ZOO hence the series converges, and we are even able to calculate it: 1 + i 1 1 □ 5.247. Consider a square with sides of length a > 0. Now consider the square whose vertices are the midpoints of the original square's sides. Then consider the square whose vertices are again the midpoints of the sides of the previous square; and so on. Determine the sum of the areas and the sum of the perimeters of all these (infinitely many) squares. O 5.248. Let a sequence of rows of semidiscs be given, such that for each n e N, the 72-th row contains 2" semidiscs, each having the radius of 2~". 
What is the area of an arbitrary figure consisting of all these semidiscs, supposing the semicircles do not overlap? O 5.249. Solve the equation 1 - tanx + tan2x - tan3 x + tan4x - tan5 x -\----= t ta1^2^.. tanzx+l 5.250. Determine 00 E (2"-!" yi-i") n = l 5.251. Calculate E v^2 + 2n + l. n = l 5.252. Prove the convergence of the series E 3"+2" 6" «=1 5.254. Sum up 1-3 T 3-5 T 5-7 T Z^ (2«-l)(2« + l) ■ « = 1 5.255. Using the partial fraction decomposition, calculate o o o and find its value. O 5.253. Calculate the series 00 (a) E n = l 00 (b) E ^- n=0 o o 338 CHAPTER 5. ESTABLISHING THE ZOO 11=2 (b) E n = l O 5.256. Determine the value of the convergent series oo E 4«2-l ' «=0 o 5.257. Calculate the series E n2-\-3n ' n = l O 5.258. In terms of oo (-i)"'1 -1 _ ixi_ i il _ i ii _ I express the following two series s — t = 1_i+!_! + !_! + A_! + A ■— n 2^3 4^5 6^7 8 ^ « = 1 (both the series contain the same elements as the first one, only in a different order). O 5.259. Determine whether the series oo 2"+(-2)" «=0 converges. O 5.260. Prove the following statement: If a series E«*lo a« converges, then lim sin (3an + it) = 0. «=> oo o oo _ oo oo 5.267. For which a e M; £ € Z; y e M\{0} do the series E v; £ £ ^ « = 120 h=240 «=360 converge? O 5.262. Determine whether the series _j^« «"—5«°+2« «=21 2 converges absolutely, converges conditionally, or does not converge at all. O 5.263. Find out whether the limit lim l\ + \ + • • • + -4) «=>oo V" " " ' is finite. Let us warn that one cannot make use of the sums oo oo 1 jt2 «-1 ii- 6 ' t—i nL ii—1 n—2 5.264. Find all real numbers A > 0 for which the series o 339 CHAPTER 5. ESTABLISHING THE ZOO £(-l)"ln(l + A2") n = l is convergent. O 5.265. Let us remind that the harmonic series diverges; i. e. oo E 1 = +OC. « = 1 Determine whether the series I + ..._l_I_l_J__l_..._l_J__l_J__l_..._l_J__l_... 1 T 9 T 11 T T 19 T 21 T T 29 T ..._I__L_I_..._I__L_I—L_|_..._|__L_|_J__i_... is divergent as well. O 5.266. Give an example of two divergent series E^Li an, E^Li ^« with positive numbers for which the series E^Li @a„ — 2bn) converges absolutely. O 5.267. Find out whether the two series OO - OO 7 4 Ei_i\« ("!) . V-1 /_i \n n —n+n \ L> (2«)!' L> nZ+2n6+n n — l n — l converge absolutely, converge conditionally, or do not converge at all. O 5.268. Does the series El -\\n + \ ^n+^n + l n = l converge? O 5.269. Find the values of the parameter p € R for which the series OO £(-D" sin"f n = l converges. O 340 CHAPTER 5. ESTABLISHING THE ZOO Solutions to the exercises 5.2. P(x) = (-| - ji)x2 + (2 + 3i> - I - t'-5.77. 3.x2 -2x -4. 5.72. (2x2 - 5) /3; eg. (fx2 - § )3. 5.73. a = 1,7> = -2, c = 0, = 1. 5.74. x3 + x2 - x + 2. 5.15. Infinitely many. 5.16. P(x) = x3 - 2X2 + 5x - 3; g(x) = x3 - 2x2 + 3x - 3. 5.77. x5 -2x4 -5x + 2. 5.78. x2. 5.79. x3 -2x+5;x3 - x + 6. 5.20. Infinitely many. 5.27. Eg. x2 - 3x + 6. 5.22. Si{x) = \ (x + l)3 - \ (x + 1) + l,x e [-1.0]; S2(x) = -3-K3 + fx2^ e [0, 1]. 5.23. Si(x) = \ (x + l)3 - § (x + 1) + l,x e [-1,0]; S2(x) = -513 + \x2,x e [0, 1]. 5.24. 5i(jc) = x; £>(x) = x. 5.25. 5i(jc) = 1; 52(x) = 1. 5.2(5. Si(x) = x + 3, x e [-3 + / - 1, -3 + /]; i e {1, 2}. 5.27. 5iW = 1 - ^x + ^x3; S2(x) = \ - \ (x - 1) + ^ (x - l)2 - ^ (x - l)3. 5.29. sup A = 6, 1 sup B — —, sup C = 9, 5.30. It can easily be shown that sup A — —, 5.31. Clearly infN=L supAf = 0, 5.32. We can, for instance, set M:=Z\N; N := N. inf A = -3; inf£ = -1; inf C = -9. inf A = 0. infj" = 0, sup J = 5. 5.33. Consider any singleton (one-element set) Xcl. 5.34. 
The set C must be a singleton. Thus, let us choose C — {0}, for example. Now we can take A = (—1,0), £ = (0, 1). 5.40. We have /l 2 n-2 n-l\ /1 + n - 1 n-l\ 1 lim \ -2+-2+---+ —Ť- + —2~~ J = llm----T~ = o• 5.41. It can easily be shown that V«3 - Hh2 + 2 + vV - 2n5 - n3 - n + sin2 ;i lim - -= —00. 2 - V5fi4 + 2n3 + 5 341 CHAPTER 5. ESTABLISHING THE ZOO 5.42. The limit is equal to 1. 5.43. We can, for instance, set x„ :— n, yn :— —n + 1, n e N. 5.44. The answer is ± 1. 5.45. The result is lim sup a„ — 1, 5.46. We have liminf ( (-1)" ( 1 + - I +sin— =-e-—. n^oo \ \ n J 4 / 2 5.62. The examined function is continuous on the whole R. 5.63. The function is continuous at the points — it, 0, it; only right-continuous at the point 2; only left-continuous at the point 3; and continuous from neither side at 1. 5.64. It is necessary to set /(0) :— 0. 5.65. The function is continuous iff p — 2. 5.66. The correct answer is a — 4. 5.67. It holds that sin8 x sin8 x lim —-— = lim —-— = 0. jr-»0+ Xi x^-oo xi 5.70. The only solution is x = —1. 5.71. It does. 5.116. r — +oo. 5.117. 1. 5.118. 3. 5.7/9. [-1, 1]. 5.120.x e [2- \,2+ i]. 5./2Z. It is. 5.722. (a) True. (b) False. (c) False. (d) True. ->•-«■"• 1 102.2 + 104.4!- 5.724. The error lies in the interval (0, 1 /200). 5.126. f(x) =x,x eR; it is. 5.127. It does not. j.izy. Z^«=0 (2«+1)h! a 5.730. a > 1. 5.737. [-^2,^2). 5.733. x > 2. 5.134. The series is absolutely convergent. 5.735. In (3/2). lim inf a„ — 0. 342 CHAPTER 5. ESTABLISHING THE ZOO 5J36- 5.137. (a) I In i±|; (b) J^. 5.758. 2/9. 5.ii9.xe~. 5.144. (a) no; (b) no; (c) yes; (d) yes; (e) no; (f) no; (g) no; (h) yes. 5.145. The functions (a), (e) are odd; the functions (c), (d) are even. 5.146. It is periodic, the prime period being (a) 2tt; (b) tt/3. 5.147. The functions / and g are even, so it suffices to consider the graphs of the functions y — ex, x e [0, +oo) and y — lnx, x € (0, +oo). 5.148. The given function is even, so to draw its graph, it suffices to know the graph of the function y — 2X, X € (—oo, 0]. 5.149. (sinhx)' = coshx; (coshx)' = sinhx; (tanhx)' = —(cothx)' = l cosh2*' v " ' sinh2*' 5.150. -J—. 5.152. x4 + 2X3 - x2 + x - 2. 5.153. x4 + 2X3 - Ix2 + x + 2. 5.154. x4 + 3X3 - 3X2 - x - 1. 5.155. For every e > 0, it suffices to assign to the e-neighborhood of the point —2 the <5-neighborhood of the point 0 given by m J, 8 — s, and without loss of generality, we can assume that e < 1. Since if e > 1, we can set S — 1. 5.156. Existence of the limit and the equality ,. (l+x)2-3 3 lim -= — x^-i 2 2 follows, for example, from the choice S :— s for e e (0, 1). 5.157. Since — (x — 2)4 < x for x < 0, we get 3 (x — 2)4/2 > —x for x < 0. 5.158. As 1 7T 1 7T lim arctan — — —, lim arctan — —--. x^o+ x 2 x^o- x 2 the considered limit does not exist. 5.159. The former limit equals +oo, the latter does not exist. 5.160. The limit can be determined by a lot of means. For instance: tanx — sinx /tanx —sinx cotx lim-r-= lim x^O sin3x x^o \ sin3x cotx 1 — cos X 1 = lim -t— = lim x^O cos x • sin2 x x^o cos x (l — cos2 x) 1 1 = lim >0 cosx (1 + cosx) 2 5.161. We have that 2 sin x + 7 sin x + 2 sin x — 3 sin x + 1 lim-----= lim -= —3. :^jt/6 2sin3x + 3 sin2x - 8 sinx + 3 x^n/6 sinx - 1 5.162. We have x™ - 1 m lim -= —. x" - 1 n 343 CHAPTER 5. ESTABLISHING THE ZOO 5.163. After multiplying by the fraction Vx2 + x + X Vx2 4- x + x we can easily get that lim I V-y- + x — x) 5.164. We have 5.165. We have lim (x V1 + x2 — x2 ^ = -. ->+oo\ / 2 a/2 — V1 + cos x \fl lim *->0 sin x 5.166. 
By extending the given fraction, we can obtain sin (4x) lim >0 Vx + 1 - 1 5.7(57. We have that V1 + tan x — VI — tan x lim -= 1. x^o- sinx 5.168. Apparently, 2X + Vl +x2 -x9 - 7a-5 + 44X2 7 lim - —-= —. x^-°° y + V6x6 + x2 - 18x5 - 592X4 18 5.769. The statement is false. For example, consider f(x) := —, x € (—oo. 0); g(x) := x, x e x 5.170. n \2"-1 lim I - ) = e «=>oo \n + 5 -10 5.178. (a) v(0) = 6m/s; (b) f = 3 s, 5(3) = 16m; (c) u(4) = -2m/s, a(4) = -2m/s2. 5.779. /'(*„) = 5.780. It does not because the one-sided derivatives differ (concretely: tt/2 from the right and —jt/2 from the left). 5.787. It does. 5.782. It does not. 5.183. f{x) :=|x-5| + |x-9|. 5.184. For instance, let / = g take 1 at rational numbers and —1 at irrational ones. 5.785. (a) x2 sinx; (b) cos (sinx) ■ cosx; (c) ffig cos (in (x3 + 2x)); (d) J^~^)2• 5.786. (a) | x 8; (b) cosecx = 5.187. cosx • cos (sinx) ■ cos (sin (sinx)). 5.788. /'(*) = . 1 , + 1, x € (l - V2, 1 + V2). V1+2jc— jc- V / 5.789. c/" . 3 v sin x 344 CHAPTER 5. ESTABLISHING THE ZOO 5.190. + j? t?. 5.191. -8. 5.192.^ fig. x e R. 5.194. f'(x) = -\ (log, e)2, x > 0, x / 1. 5.195. [f(x)g(x)h(x)k(x)] ' = f'(x)g(x)h(x)k(x) + f(x)g'(x)h(x)k(x) + f(x)g(x)h'(x)k(x) + f(x)g(x)h(x)k'(x). 5.207. The inscribed rectangle has sides of lengths x, *j3/2(a — x), thus its area is V3/2(a — x)x. The 5.208.4m x 4m x 2m. 5.209. 28 = 24 + 4. 5.210. a = 1. 5.211.241 r. 5.212. It is the square with sides of length c). 5.213. h = ffl, r = ^fl. 5.214. It is the equilateral triangle (with area V3 /?2/36). 5.2/5. [2, -1/2], [-2, -1/2]. 5.2/(5. i; = 2r. 5.217. The closest point is [1, 1], the distance then 2\/2. 5.218. The closest point is [— 1, 1], distance 3V2. 5.219. t = 1, 5s, the distance will be V5 units. 5.220. It will happen at the time ? = ^ s, the distance being units. 5.221. P = nrv + nr2 =>• j; = =>• V = \r(P - nr2). The extremum is at r = the substitution gives V = ^ cm3. 5.222. (a) 12ft/s; (b) -59, 5ft2/s; (c) -lrad/s. 5.223. At about 3 414 products per day. 5.224. Triple use of 1'Hospital's rule gives maximum occurs for x = a/2, hence the greatest possible area is (V3/8)a2. sinx — x lim 6 5.225. 2/jr. 5.226. 5.227. (C 5.228. 1/2. 5.229. We have 345 CHAPTER 5. ESTABLISHING THE ZOO 5.230. By double applying 1'Hospital's rule, one obtains lim (1 -cosx)sin* =e° = 1. x^O 5.231. In both cases, the result is e". 5.232. The limit can be easily calculated by 1'Hospital's rule, for instance. 5.247. 2a2; 4a (2 + V2"). 5.248. tt/2. 5.249. x = | + A-7T, x = ^ + Jbr, k e Z. 5.250. 5. 5.25/. +oo. 5.252. 3/2. 5.253. (a) 3; (b) 9/4. 5.254. 1/2. 5.255. (a) 3/4; (b) 1/4. 5.256. -1/2. 5.257. 11/18. 5.258. s/2; 3s/2 (s = ln2). 5.259. It does. 5.260. It suffices to consider the necessary condition for convergence, namely lim^oo a„ = 0. 5.261. a > 0; /3 e {-2, -1, 0, 1, 2}; y e (-oo, -1) U (1, +oo). 5.262. It is absolutely convergent. 5.26i. The limit is equal to 1/2. 5.264. A € [0, 1). 5.265. The value of the given series is finite - the series converges. 5.266. For example: a„ = n/3, bn = n/2, n e N. 5.267. The former series converges absolutely; the latter one does conditionally. 5.268. It does. 5.269. pel. 346 CHAPTER 6 Differential and integral calculus we already have the menagerie, but what shall we do with it? - we'll learn to control it... A. 
Derivatives of higher orders First we'll introduce a convention for denoting the derivatives of higher orders: we'll denote the second derivative of function / of one variable by /" or f(2), derivatives of third or higher order only by y (3) y (4) y («) por remembrance, we' 11 start with a slightly cunning problem using "only" first derivatives. 6.1. Determine the following derivatives: i) (x2 ■ sinx)", ii) (**)", iv) (x")(n), v) (sinx)(n). Solution, (a) (x2 ■ sinx)" 4x cosx — x2 sinx. (b) (xx)" = [(1 + In *)**]' = (d) (x")(n) = [(x")'] = (2xsinx +x2cosx)' = 2 sinx + xx~l +xx(l + lnx)2. In the previous chapter we were playing either with extremely large classes of functions — all continuous, all differentiable etc. — or only with particular functions — for example exponential, goniometric, polynomials etc. However we had only a minimum of tools and we computed everything by hand. From the qualitative point of view, we only indicated how to use the knowledge of a linear approximation of a function to its derivative to discuss the local behavior of such function near a given point. Now we will put together several results that will allow us to work with functions more easily in simulations of real problems. By differentiation we learned how to measure instantaneous changes. In this chapter we will deal with the task of summing infinitely many of these "infinitely small" changes, e.g. how to "integrate". First though, we will clarify some things about differentiating. In the last part of the chapter we will come back to series of functions and fill in several missing steps in our argumentation so far. 1. Differentiation 6.1 Higher order derivatives. If the first derivative f'(x) of a y/- real or a complex function has a derivative (/')' (xo) -^I'b at the point xo, we say that the second derivative of function / (or second order derivative) exists. Then we write f"(x0) = (f')'(x0) or /(2)(x0). Function / is double differentiable on some interval, if it has a second derivative at each of its points. We define derivatives of higher orders inductively: .__j k TIMES DIFFERENTIABLE FUNCTIONS j__-- x2(lnx)2 x2(lnx)4' rtVlO-D = (nxn-l )(»-!) A real or a complex function / is differentiable (k + 1) times at the point xq for some natural number k, if it is differentiable k times on some neighbourhood of the point xo and its k-ih derivative has a derivative at the point xo. For the k-ih derivative of the function fix) we write f^k\x). For k = 0, by 0 times differentiable functions we mean continuous functions. If derivatives of all orders exist on an interval, we say that the function / is smooth on it. For functions with continuous k-ia derivative we use the denotation the class offuncitons Ck iA) on an interval A, where k can attain values 0, 1,..., oo. Often we write only Ck, if the domain is known from the context. n\. CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS (e) (sinx)(n) = re(z" sinx) + im(z" cosx). 6.2. Differentiate the expression □ tfT^T ■ (x + 2)3 ex(x + 132)2 of variable x > 1. Solution. We'll solve this problem using the so called logarithmic differentiation. Let / be an arbitrary positive function. We know that [ln/(Jc)]' = ^g, tj. /'(*) = /(*)• [In/(*)]', if the derivative fix) exists. The usefulness of this formula is given by the fact that for some functions, it's easier to differentiate their logarithm then themselves. Such is the expression in our provlem. 
We'll obtain Zfx~^X ■ (x + 2)3 ex(x + 132)2 In Zfx~^X ■ (x + 2)3 ex(x + 132)2 tfx~^\ ■ (x + 2)3 ex(x + 132)2 tfx~^\ ■ (x + 2)3 ex(x + 132)2 i)fx~^l ■ (x + 2)3 " ex(x + 132)2 3 in (x + 2) + i in (x - 1) - x lne - 2In (x + 132) 1 x+2 40-1) x + 132 □ 6.3. Let n e N be arbitrary. Find the «-th derivative of function y = ln±±£, Solution. With respect of the equality ln±±f =ln(l + x) -ln(l -x) , jce(-l, 1), we'll define an auxihary function /(jc):=ln(fljc + l), x e (-1, 1), a = ±1. For i e (-1, 1) we can easily (sequentially) compute ax-\-l' -a2 (ax + l)2 ' f(3)ix)- 2a f4)ix) (ax+l)3 ' -6a4 (ax + l)4 " Based on these results we can figure out that (, (_iy-i(„ - \)\an (6.1) /w(*)=v ; v '-, ie(-U), neN. (ax + 1)" We'll verify the validity of this formula by mathematical induction. It holds for n = 1, 2, 3, 4, so it suffices to show that its validity for k e N implies its validity for k + 1. Because the direct computation yields fk+l)ix) = ( (-!)*-'(*-!)! (ax + l)k (-!)*-'(*-!)! ak (-k)a _ (-l)kk\ak+l vzorec (116.1 II) it holds for all neN. Then ln(n)(l+x) (x + l)" ln(n)(l-x) From here we obtain the result (ln^)(B) = (n-l)!( l (l-x)n ("-D! (-x+ir (-1)" x e (-1, 1). We can illustrate the concept of higher order derivatives on polynomials. Because a derivative of a polynomial is a polynomial with a degree one less than the original one, after a finite number of differentiations we get the zero polynomial. More precisely, after exactly k + 1 differentiations, where k is the degree of the polynomial, we get zero. Of course then derivatives of all orders exist, e.g. feC°(R). In the spline construction, see 5.9, we took care that the resulting functions would belong to the class C2(M). Their third derivatives will be sequentially constant functions. That is why the splines won't belong to C3 (R), even though all their higher order derivatives will be zero in all of the inner points of all single intervals in the interpolation. Think this example through in detail! The next assertion is a simple combinatorical corollary of Leibniz's rule for differentiation of a product of two functions: Lemma. If two functions f and g have derivatives of order k at the point xq, then their product also has a derivative of order k and the following equality holds: (/■«)«(ao) = i;(*)/«(xo)^(xo). Proof. For k — 0 the statement is trivial, for k — 1 it's Leibniz's product rule. If the equality holds for some k, by differentiating the right hand side and using Leibniz's rule we obtain a smiliar expression E i=0 fi+1)(xo)g (k- i)(xo) + f(i)(x0)g(k- -i + l) i*o) In this new sum, the sum of orders of derivatives of products in all summands isk+1 and the coefficients of f^ (xo)g<*+1_i-) (xo) are the sums of binomial coefficients (. * l) + (*) = (*+!). □ 6.2. Multiple roots and inversions of polynomials. We already computed the derivatives of polynomials in the paragraph 5.6 and it can be seen that these are smooth functions. In this case differentiation can be viewed as an injective algebraic map. Let's see how we can use differentiation for discussing multiple roots of polynomials. First we formulate The fundamental theorem of algebra, whose proof will be left over to ??. Theorem. Each nonzero complex polynomial f : C —>• C of degree at least one has a root. Thus a polynomial of degree k > 0 has exactly k complex roots (counting multiplicities) and can be written uniquely in the form f(x) = (x -ai)Cl ■ (x -aq)c Nlel . 
fr — n Y-q ^ where a\,..., aq are all roots of the polynomial / and 1 < c\, ..., cq < k are their multiplicities (i.e. natural numbers). By differentiation of f(x) as a function of one real variable x we get f(x) =Cl(x- ai)^-\ .. (x - aq)c* +... (l+xT + cq(x -ai)Cl...(x -aqfi Ca-l 348 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS pro x € (-1, 1) a n € N. □ 6.4. Determine the second derivative of function y = tg x on its whole domain, i.e. for cos x ^ 0. O 6.5. Determine the fifth and the sixth derivative of the polynomial p (x) = (3x2 + 2x + 1) • (2x - 6) • (2x2 - 5x + 9), x e R. If ci = 1 and the root a \ is real, the value of the derivative /' at the point a\ will be nonzero, because the first term is nonzero, while all the others will vanish after setting x = a\. Similarly for other roots. Thus we verified a convenient property that a real root a of a polynomial / is multiple if and only if it is a root of its derivative /'. (We will extend this statement to all complex roots in time.) o 6.6. With no computation involved, determine the 12th derivative of function y = e2x + cos x + x10 - 5x7 + 6x3 - 1 Ix + 3, x e R. o 6.7. Write the 26th derivative of function f(x) = sinx + x23 - x18 + 15X11 - 13x8 - 5x4 - llx3 + 16 + e2* pro x e R. O We'll show some more interesting examples of using the differential calculus. First though, we'll mention the Jensen inequality, which disscusses convex and concave functions and which we'll use later. 6.8. Jensen inequality. For a strictly convex function / on interval / and for arbitrary points x\, ..., xn e / and real numbers c\, ..., cn > 0 sucht that c\ + ■ ■ ■ + cn = 1, the inequality □ \i = l / i = l holds, with equality occuring if and only if x\ = ■ ■ ■ = x„. Solution. Proof can be found for example in ||??|| Remark. The Jensen inequality can be also formulated in a more intuitive j way: the centroid of mass points placed upon a graph of a strictly convex function lies above this graph. 63r Vrove that among all (convex) «-gons inscribed into a circle, the regular n-gon has the biggest area (for arbitrary n > 3). Solution. Clearly it suffices to consider the «-gons inside of which lies the center of the circle. We'll divide each such n-gon inscribed into a circle with radius r to n triangles with areas St, i € {\, ... ,n} according to the figure. With regard to the fact that sin^ = ^, cos^ = ^, ie{l.....n}, 2 r ' 2 r ' 1 ' we have 5; = xt ht = r2 sin y cos y = ^ r2 sin while for values 0, fix) will be increasing, while for even n it will be decreasing on the left side and increasing on the right side, therefore at xo it will attain its minimal value among points from (sufficiently small) neighbourhood of xq = 0. We can apply this point of view to function /'. If the second derivative is nonzero, its sign determines the behavior of the first derivative. That's why at the critical point xo the derivative fix) will be increasing if the second derivative is positive and decreasing if the second derivative is negative. If it's increasing though, it means that it will necessarily be negative to the left of the critical point and positive to the right of it. In that case, function / is decreasing to the left of the critical point and increasing to the right of it. That means / attains its minimal value among all points from (sufficiently small) neighbourhood of xo at the point xo. On the other hand, if the second derivative is negative at x$, the first derivative is decreasing, thus negative to the left of xo and positive to the right of it. 
Function / will then attain its maximal value among all values from some neighbourhood. A function that is differentiable on (a, b) and continuous on [a, b] certainly has an absolute maximum and minimum of this interval. It can be attained only at its boundary or at a point with zero derivative, i.e. in a critical point. That means critical points may be sufficient for finding extremes and second derivatives will help us determine the types of the extremes, if they are nonzero. For more precise discussion though we need better approximation of the studied function than a linear one. That's why we'll first study notions in this direction and later come back to discussing the course of functions. 349 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS interval. Then according to Jensen's inequality for q = l/n and x, (Pi, we have -sin [Y,-(Pi:)<-E^sm<°i. tj. sin Q]- cp, > \r = l / r = l \r = l / r = l Moreover, we know the equality occurs exactly for 0), the perimeter also gets a times bigger and the area a2 times (it's a square measure). Hence IQ doesn't depend on the size of the region, but only on its shape. Thus we can consider a regular n-gon inscribed into a unit circle. According to the figure, h = cos cp = cos -, £ = sin cp = sin -, which yields on = n ■ x 2n sin - and 5 = n ■ \hx = n cos - sin -. fit 17 17 Thus for a regular n-gon, we have IQ ^cotg^, 4«2 sin2 f which we can verify for example for a square (n = 4) with a side of length a, where JO — 4™2 — 1L — e. cotCT e. 1 V ~ (4a)2 — 4 — 4 LUl& 4 ■ Using the limit transition for n -> 00 and the Umit lim sai = 1, we get the isoperimetric quotient for a circle: 6.4. Taylor expansion. As a surprisingly easy use of Rolle's theorem we will now derive an extremely important result. It's called Taylor expansion with a remainder. Intuitively we can get to it by reversing our notions about power series. If we have a power series centered in a, S(x) — an(x — a)n, «=0 and we differentiate it repeatedly, we are getting power series (we know that we can differentiate such expression term after term, even if we haven't proved it yet) Sw (x) — ^n(n — \) ... (n — k + \)an(x - a) n—k n—k At the point x = a we then have S^k) (a) — Icla^. Then we can conversely read the last statement as an equation for and rewrite the original series as 00 ^ S(x) = ^-SW (fl)(. a)n. «=0 If we have some sufficiently smooth function fix) instead of a power series, it's suitable to ask if it can be expressed as a power series and how fast will the partial sums (i.e. approximations of function / by polynomials) converge. Our notion just suggested we can expect a good approximation by polynomials in the neighbourhood of point a. I Taylor polynomials of function / |__, Fot k times differentiable function / we define its Taylor polynomial ofk-th degree by the relation Tk,af(x) = fia) + f'(a)(x - a) + i/»(x - a)2+ -f(3\a)ix-a)3 + 6 + -f(k\a)ix-a)k. k\ The precise answer looks similar to the mean value theorem, but we work with higher degrees of polynomials: Theorem (Taylor expansion with a remainder). Let f (x) be a function that is k times differentiable on interval (a, b) and continuous on [a, b]. Then for all x € (a, b) there exists a number c € (a, x) such that fix) = fia) + f'ia)ix-a) + ... + —" a)*"1 + Y/k){c){X ~ ik = Tk-hafix) + -f(k)ic)ix-a)k. k\ Proof. Define the remainder R (i.e. the error of the approximation for fixed x) as follows f(x) = Tk.haf(x) + R i.e. R — i^rix — a)k for a suitable number r (dependant on x). Now consider function F(§) defined by k-i l—' 7! 
k\ 350 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS IQ = lim ^cotg- lim cos 0 1 1. Of course, for a circle with radius r, we could have also directly computed IQ = *£ 4jr(jir2;) (2nr)2 1. For the boundary a of sector of a circle with radius r and central angle

)J

0, pe(0,2), /a»<0, 0 with the property c\ + ■ ■ ■ + cn = 1. Moreover we know that the equality occurs if and only if x\ = ■ ■ ■ = x„. By choosing we then get y hi £l i < r = l V 7 By several simplifications, we obtain the inequality and then (notice that ELi °i = 0 t < y °1 A — i—i Xi ' r = l with equality again occuring for (6.3) xi = ■■■ =x„, tj. ^1 On_ This implies that S the smallest, if and only if (|| 6.31|) holds. This smallest value of S is I2/(4jt A). Now we only need to determine the lengths of the cut parts ot. If (||6.3||) holds, then clearly ot = kkt for all i € {1, ..., n} and certain constant k > 0. From *? n n E = / and simultaneously E °i = ^ E A/ = /—i /—i /—i we can immediately see that k = l/A, i.e. oi = %l, ie{l.....b). Let's take a look at a specific situation where we are to cut a string of length 1 m into two smaller ones and then create a square and a circle from them so that the sum of their areas is the smallest possible. For a square and a circle (in order), we have (see the example called Isoperimetric quotient) M=£, *2 = 1, tj. A=Ai+A2 = ^. Then the lengths of the respective parts are (in metres) °i = ife ' 1 = 4^7 = °- 56' °2 — • 1 — -3— — 0 44 4±zl 1 — 4+7t — u' The area of a square with perimeter 0, 56 m (with a side of length a = 0,14 m) is 0, 0196 m2 and the area of a circle with perimeter 0, 44 m (and radius r = 0, 07 m) is approximately 0, 015 4 m2. We can verify that (in m2 0,035 = 0,019 6 + 0,015 4. 4jtA 4(4+jt) □ Taylor expansions. We necessarily need the derivatives of higher orders to determine the Taylor expansion of a given function. Corollary (Taylor's theorem). Assume that the function f(x) is smooth on the interval (a — b, a + b) and all of its derivatives are bounded uniformally here by a constant M > 0, i.e. \fik)(x)\ • 0+,i.e. a finite sum of limits of the expressions -l/xz — x J j e \/xz All these expressions are of type oo/oo, so we can use L'Hospital's rule repeatedly on them. Obviously after several differentiations of both the numerator and denominator (and a similar adjustment as higher) there will be still the same expression in the denominator, while in the numerator the power will be nonnegative. Thus the whole expression necessarily has a It's a special case of so called Whitney's theorem, see. complete the citation and information. 352 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS 6.12. Determine the Taylor expansions (of k-ih order at point x) of the following functions: i) Tq of function sin x, ii) T? of function —. Solution, (i) We'll compute the values of the first, second and third derivative of function / = sin at point 0: f'(0) = cos(0) = 1, /(2)(0) = -sin(0) = 0, /(3)(0) = -cos(0) = -1, also /(0) = 0. Thus the Taylor expansion of the third order of functionsin(x) at point Ois ro3(sin(x)) = x — -x3. (ii) Again/(l) = e, /'(I) /(2)(D /(3)(D e x ex x ex x x=\ ex2 2ex 2- + — x xJ Jt = l nx2 + 6ex 6ex -2e x = \ Thus we get the Taylor expansion of third order of function y at point 1: T3(-)=e + -(x-l)2 x3 3x2 e(--+- V 3 2 2x + -). o □ 6.13. Determine the Taylor polynomial ro6 of function sin and using theorem (6.4), estimate the error of the polynomial at point 7t/4. Solution. Analogously to the previous example, we compute fi 1 3 1 s rn (sin(jc)) = x--x -\--xr . 0 6 120 Using the theorem 6.4, we then estimate the size of the remainder (error) R. According to the theorem, there exists c e (0, j) such that R(jt/4) COS(c)7T7 7!47 < 1 7! 0, 0002. 6.14. 
Find the Taylor polynomial of third order of function y = arctg x, x e M at point x0 = 1. 6.15. Determine the Taylor expansion of third order at point x0 of function □ o = o (a) y (b) y (c) y (d) y (e) y 1 COS x ' e 2 ; sin (sin x) ; tg*; ex sin x zero limit at zero, just like we've computed in the case of the first derivative higher. The same will hold true for a finite sum of such expressions, so we've found out that each derivative / w (x) at zero will exist and its value will be zero. We've shown that our function fix) is smooth on whole M, it's of course a nonzero function everywhere except for x — 0, but all its derivatives at this point are zero. Of course, then it's not an analytic function at the point xq — 0. 6.7. Examples of nonanalytic smooth functions. We can easily K; „ modify our function f(x) from the previous paragraph in this way: -l/x2 ifx < 0 if x > 0 Again it's a smooth function on whole R. By another modification we can obtain a function that is nonzero in all inner points of the interval [—a, a], a > 0 and zero elsewhere: h(x) if |x| > a if Ixl < a. This function is again smooth on whole R. The last two functions are on the figures, on right the parameter a — 1 is used. Finally we'll show how to get smooth analogies of Heaviside functions. For two fixed real numbers a < b we define the function fix) with usage of earlier defined function g in this way: /(*) = gix - a) gix - a) + gib - x) Obviously for all x e R the denominator of the fraction is positive (because for each of the intervals determined by numbers a and b at least one of the summands of the denominator is nonzero, therefore the whole denominator is positive. Thus from our definition we get a smooth function fix) on whole R. For x < a the denominator of the fraction is zero according to the definition of g though, for x > b the numerator and denominator are equal. On the next two figures there are these functions fix) with parameters a — 1 — a, b — 1 + a, where on the left we have a — 0.8 and on the right we have a — 0.4. 353 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS defined in a certain neighbourhood of point x0. O 6.16. Determine the Taylor expansion of fourth order of function y = lnx2, x € (0, 2) at point x0 = 1. O 6.17. Find the estimation of the error of the approximation In (1 + x) ~ x forx e (-1,0). O 6.18. Write the Taylor polynomial of fourth degree of function y = sin x, x € R centered at the origin. Using this polynomial, approximately compute sin 1° and determine the limit lim x smx~x jc^0+ x O 6.19. Determine the Taylor polynomial centered at the origin of degree at least 8 of function y = e2x,x sR. Q 6.20. Express the polynomial x3 — 2x + 5 as a polynomial in variable 1. O 6.21. Espand the function ln( 1 +x) into a power series at point 0 and 1 and determine all x e R for which these series converge. Solution. First we'll determine the expansion at point 0. To expand a function into a power series at a given point is the same as to determine its Taylor expansion at that point. We can easily see that [ln(x + 1)] («) (-1) n + l (n- 1)! (x + 1)" so after computing the derivatives at zero, we have ln(x + 1) = In 1 + oo E anx", where n = l (_iy.+i(n_ i)! (_i) n + l Thus we can write ln(x + 1) x2 + « = 1 For the radius of convergence, we can then use the limit of the quotient of the following coefficients of terms of the power series 1 1 lim„_ "n+\ 1 -n^oo i 1. lim, Hence the series converges for arbitrary x e (—1, 1). 
For x = — 1 we get the harmonic series (with a negative sign), for x = 1 we get the alternating harmonic series, which converges by the Leibniz criterion. Thus the given series converges exactly for x e ( — 1, 1]. 0,5 1 Now we can also easily create a smooth analogy of the characteristic function of the interval [c, d\. Denote the higher specified function f(x) with parameters a — —s, b — +s as fE (x). Now for the interval (c, d) with length d — c > 2s we define the function hB(x) — fB (x — c) ■ fB (d — x). This function is identically zero on the intervals (—oo, c — s) a (d + s, oo) and identically equal one on the interval (c + s,d — s), moreover it's smooth everywhere and locally it's either constant or monotonie. The smaller the e > 0, the faster our function jumps from zero to one around the beginning of the interval or back at the end of it. Thus we can see that smooth functions are very "plastic" — from a local behaviour around one point we cannot deduce anything at all about the global behavior of such function. Conversely, analytic functions are completely determined just by derivatives at one point. In particular they are completely determined by their behavior on an arbitrarily small neighbourhood of a single point from their domain. In this sense, they are very "rigid". Local behavior of functions. We've seen that the sign of the first derivative of a differentiable function determines whether it's increasing or decreasing on some neighbourhood of the given point. If •■iu, derivative is zero though, it doesn't tell us much about the behavior of the function by itself. We've already encountered the importance of the second derivative while describing critical points. Now we'll generalize the discussion of critical points for all orders. We'll start with discussing the local extremes of functions, i.e. values, that are strictly bigger or strictly smaller than all the other values from some neighbourhood of a given point. In the following we'll consider functions with sufficiently high number of continuous derivatives, without specifically pointing this assumption out. We say the point a in domain of / is a critical point of order k iff f{l\d) = 0, f^i>(a)^0 (k+l), Suppose f^k+l\a) > 0. Then this continuous derivative is positive on a certain neighbourhood O(a) of the point a as well. In that case, a Taylor expansion with a remainder gives us f(x) = f(a) + -l—f(k+l\c)(x - a)k+1 (k+ 1)! for all x in O(a). Because of that, the change of values of f(x) in a neighbourhood of a is given by the behavior of the function (x —a)k+1. Moreover, if k+l is an even number, then the values of f(x) in such neighbourhood are necessarily bigger than the value 354 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS Analogously, for the expansion at point 1, by computing the above derivatives from ||6.211|, we get ln(x + 1) = ln(2) + l-(x - 1) - l-(x - if + ^(x -if-... 00 (-1)" + ! = ln(2) + ^i-^_(jc- If, «=i n ■ 2" and for the radius of convergence of this series, we get 1 1 lim„_ "n+\ I. lim„_ 2"+l (n+l) -OO 1 Vn The first series converges for —1 < x < 1, the second for —1 < x < 3. □ 6.22. Expand the function (a) y = ln±±£, jce(-l,l); (b) y = e.*2 + x2e~2\ x e R into a Taylor series centered at the origin.. Solution. If the function can be expressed as a sum of a power series (with a positive radius of convergence) on its domain of convergence, then this series is necessarily the Taylor series of the given function (its sum). This allows us to find the corresponding Taylor series easily. Case (a). 
We know that OO j ln(l+jc) = E t^x", *e(-l,l), n = \ i.e. n — 1 n — 1 xn = Hřhx In total, we have OO ln{±£=ln(l+x)-ln(l-x) = E n — 1 n—1 forx e (-1, 1). Case (b). Similarly, the well known identity 2 Jht-\ implies ex = f Xxn, x e «=0 e*2 = E^2)" = E^«, xs «=0 «=0 and OO OO x2 e~2x = x2 E h ("2x)" = E {-^r xn+2, x e «=0 ' «=0 Hence Q*2 _i_ x2t~2x = T x2n+(-2)nxn+2 «=0 X € □ R /(a) and obvisouly the point a is a point of local minimum then. If k is even though, then the values on left are small and on right bigger than f(a), so an extreme doesn't occur even locally. On the other we can notice that the graph of function fix) intersects its tangent y — fia) at point [a, f(a)]. Conversely, if f^k+l\a) < 0, then because of the same reasoning it's a local maximum for odd k a again the extreme doesn't occur for even k. 6.9. Convex and concave functions. We say that differentiable function / is concave at point a, if in a certain neigh- ; / bourhood its graph lies completely below the tangent e~l at point [a, fia)], i.e. we require fix) < fia) + f'(a)(x - a). Conversely, we say that / is convex at point a, if its graph is above the tangent at point a, i.e. fix) > fia) + f'(a)(x - a). A function is convex or concave on an interval, if it has this property in its every point. Moreover suppose that function / has continuous second derivatives in a neighbourhood of point a. From the Taylor expansion of second order with a remainder we obtain fix) = fia) + f'ia)ix -a) + ^f"(c)(x - af. Then obviously the function is convex, whenever f"ia) > 0, and concave, whenever f"ia) < 0. If the second derivative is zero, we cab zse derivatives of higher orders. We can only make the same conclusion if the first other nonzero derivative after the first derivative is of even order. If the first nonzero derivative is of odd order, clearly the points of the graph of the function on opposite sides of some small neighbourhood of the studied point will lie on opposite sides of the tangent at this this point. 6.10. Inflection points. Point a is called an inflective point of a differentiable function /, if the graph of function /crosses from one side of the tangent to the other. Suppose / has continuous third derivatives and write the Taylor expansion of third order with a remainder: / (x) = /(«)+/'(«) ix -a)+ \ f'ia) ix -af + - f" (c) (x -af. 2 6 If a is a nonzero point of the second derivative such that f'ia) / 0, then the first derivative is nonzero on some neighbourhood as well and clearly it's an inflective point. In that case, the sign of the third derivative determines whether the graph of the function crosses the tangent from the top to the bottom or vice versa. Moreover, if a is an isolated nonzero point of the second derivative and simultaneously an inflective point, then clearly on some small neighbourhood of a the function is concave on one side and convex on the other. Thus we can also see the inflective points as the points of the crossover betwenn concave and convex behaviour of the graph of the function. 355 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS 6.23. Determine the Taylor series centered at the origin of function (b) y = arctgx, x e (—1, 1). Solution. Case (a). We'll use the formula oo oo = £(-*)" = £(-i)"*", jce(-i.i) «=0 «=0 abou the sum of a geometric series. By differentiating it, we obtain (oo \ ' oo £(-!)"*" ) = E(-l)"n^_1. Jce(-l,l), «=o / «=i with (x0)' = 0, thus the lower index is n = 1. We can see that oo \« + l ~, vn — \ (1+xV J2, (-1)"+1 njc""1, jce(-l, 1). «=i Case (b). 
We can express the derivative of function y = arctg t as oo oo (arctg0' = ^7r = £ HT = £(-l)"<2", fe(-U). «=0 «=0 Because forx e (—1, 1) we have / (arctgt)' dt = arctgx — arctg 0 = arctgx 6.11. Asymptotes of the graph of the function. We'll introduce one more useful utility for sketching the graph of a function. We'll try to figure out the so called asymp-^ Civ totes, i.e. the lines, which the values of function / M'i^ approach. An asymptote at the improper point oo is such a line y = ax + b, which satisfies lim (f(x) — ax — b) = 0. We also call it an asymptote with a slope. If such an asymptote exists, it satisfies lim (f(x) — ax) = b and therefore the limit and x / oo \ oo/ x \ oo / £(-1)"^ )dt = e ((-iyft2"dt) = e o V«=o / «=o V o / «=o we already have the result oo arctgx = J2£Tx2"+1, *e(-l,l). (-1)" r2« + l 2n + l X «=0 6.24. Find the Taylor series centered at x0 = 0 of function X f(x) = f u cos u2 du, x € M.. 0 Solution. The equality oo cos? = E tt€ t2", (et £—^ (2«)! ' impUes «=o 2 (-1)" / 2^2" M COS H = U x 1-^- '" £ (2«)! (M ) £ (In (-1)" „4« + ! «=0 U , U € «=0 and then (for x e M) X X / oo /(jc) = J u cos u2 du = f I E ^)T m4"+1 I 0 0 V«=o " ' £ m.hAn+iäu =e^ «=0 \ 0 (-1)" x4«+2 «=0 1+2) ' □ lim -= a □ x^oo x exists as well. Conversely, if the last two limits exist, the limit from the definition of the asymptote exists as well, thus these are sufficient conditions as well. We can define and compute the asymptote at the improper point — oo similarly. This way we can find all the potentional lines satasfying the proporties of asymptotes with a slope. All we have left are potential lines perpendicular the the x axis: The asyptotes at points a e R are lines x = a such that the function / has at least on of the one-sided limits at point a infinite. We also speak of th asymptotes without a slope. For example rational functions have an asymptote in zero points of denominator that aren't zero points of the numerator. We'll compute at least one simple example: function fix) = x + \ has the asymptotes y = x and x = 0. Indeed, the one-sided limits from the right and left at zero are clearly ±oo, while the limit f{x)/x = 1 + 1/x2 is of course exactly ± 1 at the improper points, while the limit f{x) — x = 1/x is zero at the improper points. By differentiating we get /'(x) = l-x-2, f"(x) = 2x-3. 6.25. On the interval of convergence (—1, 1), determine the sum of the series E n in + 1) x" . n = l Solution. We have function f'{x) has two zero points ±1. At point x = 1 the function has a local minimum, at point x = — 1 a local maximum. The second derivative has no zero points in all its domain (—oo, 0) U (0, oo), so our function doesn't have any inflection points. 356 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS oo «=1 X>(/i + 1)jc" = E" (xn+l)' i oo E nX -1 + £jc" «=o «=i .2 E^«V « = 1 ^■2 E«*"+1 «=i Ex" «=i 2 1 2* for all x e (-1, 1). □ d^d. Expand the function cos2(x) into a power series (i.e. determine its Taylor expansion) at point 0 and determine for which real numbers this series converges. 6.27. Expand the function sin2(x) into a power series at point 0 and determine for which real numbers this series converges 6.28. Expand the function ln(x3 + 3x2 + 3x + 1) into a power series at point 0 and determine for which x e R it converges. O 6.29. Expand the function In *Jx into a power series at point 1 and determine for which i e lit converges. O More problems about Taylor polynomials and series can be found on page 412. 
Now we'll state several "classical" problems, in which we'll determine the course of distinct functions. 6.30. Determine the range of function f(x)=%±, xeR. Solution. The line y = 1 is clearly an asymptote of function / at +oo and the line y = — 1 is an asymptote at — oo, because lim 4 = 1, lim sLzl 11111 p-ril lim sLzl 2=1 = _1 0+1 The inequality > o, X € then implies that / is continuous and increasing on R. Hence the range is the interval (—1,1). □ 2 6.31. State all intervals on which the function y = e~x , x e R is concave. O 6.32. Consider function j = arctg^-, x^0(xsR). Determine intervals on which this function is convex and concave and also all its asymptotes. 6.33. Find all asymptotes of function (a) y = xex; with maximal domain. 6.34. State the asymptotes of function o o y = 2 arctg 6.35. Consider function X £ ±1 (X € R). o 6.12. Differential of a function. In practical use of differential calculus, we often work with dependencies between several quantites, say y a x, and the choice of de-^ pendant and independant variable is not fixed. The explicit relation y — fix) with some function / is then only one of several options. Differentiating then expresses, that the immediate change of y — fix) is proportional to the immediate change of x with a proportion of fix) — ^-(x). This relation is often being written as df dfix) — —(x)dx, dx where we interpret dfix) as a linear map of increments of given J/(x)(Ai) — fix) ■ Ax, while dx(x)(Ax) — Ax. We speak of the differential of function f if the approximative property fix +Ax)- fix)-df ix)iAx) lim Ajc^O Ax 0 holds. Taylor theorem then implies that a function with bounded derivative /' has a differential df. In particular, that occurs at point x if the first derivative fix) exists and is continuous. If the quantity x is expressed by another quantity t, i.e. x — git), and moreover by a function with continuous first derivative again, the the rule for differentiating composite functions tells us the composite function fog has again a differential df dx df(f) =-j-(x) — (f)dt. dx at Therefore we can see df as a linear approximation of the given quantity dependant on the increments of the dependant variable, no matter how this dependance is given. 6.13. The curvature of the graph of a function. To train our-\\ selves in the basic rules for differentiating composite functions etc., we'll disscuss the graph of a smooth function fix) as a special case of a parametrized curve in a plane for now. We can imagine it as a movement in the plane parametrized by an independant variable x. For an arbitrary point x from the domain of our function, by computing the first derivative we can immediately get the vector (1, f(x)) e R2 that represents the immediate velocity of such a movement. . The tangent line through the point [x, fix)] 357 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS y = In 3e2x+ex + l0 e* + l O defined for all real x. Find its asymptotes. 6.36. Determine the course of the function f(x) = ^|x|3 + l. Solution. The domain is the whole real axis, / has no discontinuities. For example it suffices to consider that the function y = f/x is continuous at every point x e R (unlike even roots defined only on the nonnegative axis). We can also immediately see that fix) > 1 and fi—x) = fix) for all x e R, i.e. the function / is positive and even. Thus we can obtain the point [0, 1] as the only intersections of the graph of / with the axes by substituting x = 0. 
The limit behavior of the function can be determined only at ±oo (there are no discontinuities), where we can easily compute (6.4) lim ý\. + 1 lim yp lim I + 00. Now we'll step up to determing the course of a function by using its derivatives. For x > 0, we have /(i) = vV+T=(i3+i)] hence (6.5) fix) = 1 (x3 + lp 3x2 (x3 + iy > 0, x > 0. This implies that / is increasing on the interval (0, +oo). With respect to its continuity at the origin, it must be increasing on [0, +oo). Because it's an even function, we know that on the interval (—oo, 0] it must be decreasing. Thus it has only one local minimum at point x0 = 0, which is also a (strict) global minimum. Because a noncon-stant continuous function maps an interval to an interval, the range of / is exactly [1, +oo) (consider fix0) = 1 and (||6.4||)). Notice that thanks to the even parity of the function, we didn't have to compute the derivative /' on the negative half-axis, which can though be easily determined by substituting |x|3 = (—x)3 = —x3, yielding fix) = \ (-x3 + lp (-3x2) vV*3+i)3 < o, x < 0. When computing f'(0), we can proceed according to the definition or we can use the hmits lim 0 uu, ,--^ — lim - determine the one-sided derivatives and then /'(0) = 0. In fact, we didn't even have to compute the first derivative on the positive half-axis either. To obtain that / is increasing on (0, +oo), we only needed to realize that both functions y = f/x and y = x3 + 1 are increasing on R and a composition of increasing functions is again an increasing function. For x > 0, we can easily compute the second derivative using (II6.5H) 2^(*3 + l)2-§p(*3 + l)-1(3;C2) vV+i)4 parametrized by this directional vector then represents a linear approximation of the curve. We've also seen that in the case /" (x) — 0 and simultaniously f"'ix) ^ 0 the graph of our function intersects the tangent line, which means the tangent line is the best approximation of the curve at the point x up to the second order as well. We usually describe this by saying that the graph of the function / has a zero curvature at the point x. While the nonzero values of the first derivative described the speed of the growth (no matter the sign), we can intuitively expect the second derivative will describe the extent of the curvature of the graph. So far we've only seen that the graph of the function is above its tangent for a positive value and below it for a negative one. We got the tangent at a fixed point P — [x, fix)] as a limit of the secants, i.e. the lines passing through the points P a Q — [x + Ax, fix + Ax)]. If we want to approximate the second derivative, we will interpolate the points P and Q / P by a circle Cq, whose center is at the intersection of the perpendicular lines to the tangents at points P and Q. It can be seen from the figure that if the angle between the tangent at a fixed point P and the x axis is a and the angle between the tangent at a fixed point Q and the x axis is a + Aa, then the angle of the mentioned perpendicular lines will be A a as well. If we denote the radius of our circle by p, then the length of the arc between points P and Q will be pAa. As the point Q approaches a fixed point P, the length of the arc will approach the length s of the studied curve, i.e. the graph of the function fix), and the circle will approach the circle Cp. Thus we get the basic relation for the limit radius p of the circle Cp: As ds p = lim -= —. Aa^O Aa da We define the curvature of the graph of the function / at a point P as the number 1/p. 
Zero curvature then corresponds to an infinite radius p. For computing the radius p we need to express the length of the arc s by the change of the angle a and express the derivative of this function by the derivative of /. We can already notice that for an increasing angle 6 the length of the arc can either increase or decrease, depending on whether the circle Cq has the center above or below the graph of the function /. The sign of p then reflects whether the function is concave or convex. We also need to think abour the special case when the center "runs off" to infinity in limit, i.e. instead of a circle we get a line again, which is the tangent. Obviously, we don't have a direct tool to compute the derivative ^j-. However, we know that tg a — df/dx and by differentiating this equality by x we obtain (using the rule for differentiating composite functions) 1 da (cos a)2 dx On the left hand side we can substitute = 1 + (tga)2 = (cos a)2 1 + if')2 which implies (see the rule for differentiating an inverse function) dx da l + (tga)2 l + >\2 f" f" 358 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS i.e. after a simplification we have 2x (6.6) > 0, x > 0. Similarly we can compute f"(x) -. 2xj(-x3 + l)2 - lx^(-xi + l)-1(-3x2) ^(-3+l)4 2x > 0, for x > 0 and then /"(0) = 0. Next, we can use a limit transition: lim 2x 0 - _ „ _ lim --. According to the inequality (||6.6||), / is strictly convex on the interval (0, +oo). Also / must be strictly convex on (—oo, 0). To obtain this conclusion though, we again didn't have to compute the second derivative for x < 0, it sufficed to use the even parity of the function. In total, we obtained that / is convex on its whole domain (it doesn't have any inflection points). To be able to plot the graph of the function, we still need to find the asymptotes (we leave the computation of values of the function at certain points to the reader). Since / is continuous on R, it can't have any horizontal asymptotes. A line y = ax+b is an inclined asymptote for x -» oo if and only if both (proper) limits lim fill = a, lim (f(x) - ax) = b. 2x exist. Analogous statement holds for x lim m = lim Si = x^-oo x x^-oo x lim (f(x) — 1 • x) = lim (\/~ x^-oo x^-oo \ > — oo. Hence the limits lim ^ = 1, x3 + 1 lim Vx^+T-x j(x3 + l)2+xjx3 + l+x2 ^{x3 + \)2+xJx3+A+x2 lim x3+l-x3 um 3^ 0 x^°° ^f(x3 + \f+xjx3 + \+x2 imply that the line y = x is an asymptote at +oo. If we again consider the fact that / is even, we'll immediately obtain the line y = —x as an asymptote at — oo. □ 6.37. Determine the course of the function f(x) = cosx J ^ ' cos 2x ' Solution. The domain consists of exactly those x e R, for which cos 2x 7^ 0. The equality cos 2x = 0 is satisified exactly for 2x = § + kit, k € Z, tj. x = f + kf, k € Z. Hence the domain is {f + kf; k eZ}. Clearly we have /(-x) cos(—x) cos(—2x) cos 2x fix) Now we're almost done, because the increment of the length of the arc s in dependance on x is given by the formula ±- = (1 + ax so by using the rule for differentianting a composite function we can compute ds ^ da dx da ds dx (1 + (/')2)3/2 Now we can see the relation between the curvature and the second derivative: the numerator of our fraction is always positive, no matter the value of the first derivative. It's equal to the third power of the size of the tangent vector of the studied curve. The sign of the curvature is therefore given only by the sign of the second derivative, which only confirms our notions about concave and convex points of functions. 
If the second derivative is zero, the curvature is zero as well. The circle, by which we defined the curvature is called the osculating circle. Try to compute the curvature of simple functions yourself and use osculating circles while sketching their graphs. The computation at the critical points of the function / is the easiest, because in these we get the radius of the osculating circle as the reciprocal of the second derivative with the corresponding sign. 6.14. Vector differential calculus. As we've mentioned already ^ in the introduction to chapter five, for our notions about differentiating it was quite essential that we studied functions defined on real numbers and that their values could be added and multiplied by real numbers. That's why we need out functions / : R -> V to have values in the vector space V. For distinction, we'll call them vector functions of one real variable or more briefly vector functions. Now we'll take more interest in real function with values in plane or space, i.e. / : R M2 and / : R R3. We talk about (parametrized) curves in plane and space. Similarly, we could work with values in Rn for any finite dimension n. For simplification, we'll work in the fixed standard bases in M2 and R3, so our curves will be given by doubles or triples of simple real functions of one real variable, respectively. The vector function r in plane or space, respectively, is then given by r(t) = x(?)ei + y(f>2, K0 = *(0«l + y(t)e2 + z(t)e3. The derivative of such a vector function is a vector, which approximates the map r by a linear map of a line to the plane or to the space. In plane it's dr(t) dt -(f) = r'(f) =x'(0e! +y(0«2 and similarly in space. We have to understand the differential of a vector function in this context as well: / dx dy dz \ dr = I —e\ H--e2 H--ej, )dt \ dt dt dt J where we understand the expression on the right hand side in a way the increment of the scalar independant variable t is linearly mapped by multiplying the vector of the derivative and thus obtaining the corresponding increment of the vector quantity r. 359 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS for all x in the domain, thus / (with its domain symmetric with respect to the origin) is an even function, which was implied by the even parity of the function y = cos x. Moreover, because cosine is periodic with a period of 2tt (i.e. y = cos 2x has a period of n), it suffices to consider the function / for x e V := [0,7T] \ {f + kf; k e Z} = [0, f) U (f, 3f) U , n], since the course of the function on its whole domain can be derived using its even parity and periodicity with a period of 2tt . Hence we'll only be concerned with the discontinuities x\ = n/4 and X2 = 3n/4. We'll determine the corresponding one-sided limits lim -^f = n cos 2x c^ 4 lim -^f 3_ cos 2x x^lf- +oo, = +oo, lim COS X cos 2x lim -^f 3_ cos 2x -oo, -oo. If we have a respect to the continuity of / on the interval (tt/4, 3n/4), we can see that / attains all real values on this interval. Hence the range of / is the whole R. We also found out that the discontinuities are of the second kind, where at least one of the one-sided limits is improper (or doesn't exist). By that, we simultaneously proved that the lines x = n/4 and x = 3n/4 are horizontal asymptotes. If we'd want to formulate the previous results without a restriction to the [0, n ], we can say that at all points xk = ^ + !f, ksZ f has a discontinuity of the second kind and every line x = \ + kf, keZ is a horizontal asymptote. 
Also the periodicity of / implies that no other asymptotes exist. In particular, it cannot have any inclined asymptotes, nor can the (improper) hmits limx^+0o f(x), lim^-oo f(x) exist. Now we'll find the points of intersection with the axes. The point of intersection [0, 1] with the y axis can be cound by computing /(0) = 1. When looking for the points of intersection with the x axis, we consider the equation cos x = 0, x e V with the only solution being x = jt/2. Then we can easily obtain the intervals [0, tt/4), (it/2, 3n/4), on which / is positive, and the intervals (n/4, Jx/2), (3n/4, it], where it's negative. Now we'll step up to computing the derivative - sinx cos 2x — 2 cosx (— sin2x) cos 22x — sin x (cos2 x — sin2 x) + 2 cos x (2 sin x cos x) cos2 2x sin3 x +3 cos2 x sin x (sin2 x + cos2 x + 2 cos2 x) sin x cos2 2x cos2 2x (2 cos j + 1) si sinx -, x e V. cos2 2x The points at which f'(x) = 0 are clearly the solutions of the equation sinx = 0, x € V, i.e. the derivative is zero at points x3 = 0, x4 = jt. The inequalities 2cos2x + 1 > cos22x > 0, sinx > 0, x eV C\ (0, jt) imply that / is increasing at every inner point of the set V, thus / is increasing on every subinterval of V. The even parity of / then implies that it's decreasing at every point x e (—it, 0), x ^ —3n/4, If the r(t) represents a parametrization of a curve, then its derivative is a velocity vector of such defined distance. We studied a special case of the vector r(t) — te\ + f(t)e2 giving the graph of the function / in the last paragraph. The second derivative then represents the acceleration of such defined movement. Notice that of course the acceleration need not be collinear with the velocity. In fact, in the case of the graph of a function, the acceleration is collinear with the velocity only at the points, where /" is zero, which corresponds the idea that the acceleration can be collinear only if the curvature of the graph is zero. 6.15. Differentiating composite maps. In linear algebra and geometry there are very useful maps called forms. They have one or more vectors as their arguments and they are linear > in each of their aguments. Thus we can define the size of the vectors (the dot product is symmetric bilinear form) or the volume of a parallelepiped (it's a n-linear antisymmetric form, where n is the dimension of the space), see for example the paragraphs 2.44 a 4.22. Of course, we can use vectors r(t) dependant on a parameter as the arguments of these operations. By a straightforward usage of Leibniz's rule for differentiation of a product of functions we will verify the following Theorem. (1) If r(t) : R —>• MP is a dijferentiable vector and * : R" —>• Rm is a linear map, then the derivative of the map for satisfies d(V o r) dt dr * o —. dt (2) Consider differentiable vectors n, I a k-linear form : R" x ... xR" on the space derivative of the composed map cp(t) = *(n(0, ...,rjt(0) satisfies (generalized Leibniz's) rule fk) H-----h (n, R" and The the dw ,dr\ — = M—,r2, dt v dt drk x (3) The previous statement remains valid without a change even if also has values in the vector space (and is linear in all its k arguments). Proof. (1) In linear algebra, it is shown that the linear maps are given by a constant matrix of scalars A — (atj) in a way that y n n \ * o r(t) = ( aun (t), ■ ■ ■, amiri (0 ) • We now carry out the differentiation by individual coordinates of the result. However, we know, that derivative acts linerly towards scalar linear combinations, see Theorem 5.33. 
That's why we indeed get the derivative by simply evaluating the original linear map * on the derivative r1 (t). (2) We obtain the second statement analogously. We write out the evaluation of the /c-linear form on the vectors r\,..., rk in the coordinates in this way: 0, x or sin2x 3 + 4 cos2 x sin2 x + 8 respectively, we have fix) 11 -4cos4x -4cos2x > 3, x e M = 0 for certain x e P if and only if cosx = 0. But that's satisfied only by x5 = n/2 e V. It's clear that /" changes its sign at this point, i.e. it's a point of inflection. No other points of inflection exist (the second derivative /" is continuous on V). Other changes of the sign of /" occur at zero points of the denominator, which we have already determined as discontinuities x\ = Tt/4 and x2 = 37T/4. Hence the sign changes exactly at points x\, x2, x5, thus the inequality fix) > 0 pro x -» 0+ implies that / is convex on the interval [0, tt/4), concave on (tt/4, tt/2], convex on [tt/2, 3it/4) and concave on (3^/4, it]. The convexity and concavity of / on other subintervals is given by its periodicity and a simple observation: if a function is even and convex on an interval (a, b), where 0 < a < b, then it's also convex on i-b, -a). All that's left is computing the derivative (to estimate the speed of the growrth of the function) at the point of inflection, yielding /' (tt/2) = 1. Based on all previous results, it's now easy to plot the graph of function /. □ 6.38. Determine the course of the function x ln(x) and plot its graph. Solution. i) First we'll determine the domain of the function: where the scalars B^.--** ^ given as the value of the given form <$>(eil,..., etk) on the chosen &-tuple of base vectors for every choice of indices. The rule for differentiating a product of scalar functions then yields the statement. (3) If <$> has vector values, it's given by finitely many components and we can use the previous notion on each of them □ On the euclidean space R3, besides the dot product, which assigns a scalar to two vectors, we also have the vector product, which assigns the vector u x v e R3 to vectors u and i;, see 4.24. This vector u x v is orthogonal to both vectors u and i;, its size equals the area of the parallelogram determined by the u and i; (in this order) and an orientation such that the triple u, v, u x v is a positively oriented basis. The previous theorem immediately implies convenient corollaries: Corollary. Consider the vectors u(f) and v(t) in the space R3. The derivatives of their dot product (u(t), v(t)) and their vector product u(t) x v(t) satisfy (6.1) (6.2) — (u(t), v(t)) dt d — (u(t) x v(t)) dt (u'(t), v(t)) + (u(t), v'(t)) u(t) XV(t) + u(t) x v'(t) 6.16. The curvature of curves. Now we have far more powerful tools for studying curves in amore systematic way than we discussed the curvature of the graphs of functions. Let's look at the curves r(t) in space and assume they are parametrized in way that their tangent vector always has a unit size, i.e. {r1 (t), r1 (?)) = 1 for all t. We say the curve r(t) is parametrized by the length. By another differentiation of this unit vector r' (t) we obtain a vector r" (t), for which we'll evaluate (using the symmetry of the dot product) 0= 4 =2 dt and thus the acceleration vector r" (t) is always orthogonal to the velocity vector. 
That corresponds to the idea that after the choice of a parametrization with a constant size of velocity, the acceleration in the direction of the movement cannot be noticeable, therefore the acceleration must lie in the plane orthogonal to the velocity vector. If the second derivative is nonzero, we call the normed vector 1 n{t) = -r" it) \V'(i)\\ the main normal of the curve r(t). The scalar function ic(t) satisfying (at the points where r" it) / 0) r" it) = K(t)n(t) is called the curvature of the curve r(t). At the zero points of the second derivative we define ic(t) by zero value as well. At the nonzero points of the curvature the unit vector bit) = r1 it) x n it) is well defined, we call it the binomial of the curve r(t). By a direct computation we obtain 0= -(bit),/it)) dt \{1} (b'it),r'it)) + (bit),r"it)) = (b'it),r'it))+Kit)(bit),nit)) = {b\t),ť(t)), 361 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS ii) We'll find the intervals of monotonicity of the function: first we'll find zero points of the derivative: ln(jc) - 1 ln2(x) 0 The root of this equation is e. Next we can see that f'(x) is negative on both intervals (0, 1) and (1, e), hence f(x) is decreasing on both intervals (0, 1) and (1, e). Additionally, f'(x) is positive on the interval (e, oo), thus f(x) is increasing here. That means the function / has the only extreme at point e, being the minimum, (we can also decide this using the sign of the second derivative of the function / at point e, because f(2)(e) > 0). iii) We'll find the points of inflection: ln(jc) - 2 /(2)W 0 x In (x) The root of this equation is e2, so it must be a point of inflection (it cannot be an extreme with regard to the ptrevious point). iv) The asymptotes. The line x = 1 is an asymptote of the function. Next, let's look for asymptotes with a finite slope k: limy ln(jc) lim - x^oo ln(x) 0. If the asymptote exists, its slope must be 0. Let's continue the computation lim - x^°o ln(x) 0 • x = lim ln(x) = oo, and because the Umit isn't finite, an asymptote with a finite slope doesn't exist. The course of the function: □ Now move from determining the course of functions onto other subjects connected to derivatives of functions. First we'll demonstrate the concept of curvature and the osculating circle on an ellipse 6.39. Determine the curvature of the ellipse x2 + ly2 = 2 at its vertices (||4.47||). Also determine the equations of the circles of osculation at these vertices. which show that the tangent vector to the binormal is orthogonal to both b(t) and r' it). Therefore it must be a multiple of the vector of the main normal. We write b'it) -x(t)n(t) and call the scalar function x(t) the torsion of the curve r(t). We have not yet computed the speed of change of the main normal, which we can also write as n(t) = bit) x r' it): n'it) = b'it) x r1 (0 + K(t)b(t) x nit) = -x(t)n(t) x r1 it) + K(t)(-r/ it)) = x(t)b(t) -Kit)r'it). Successively, for all points with nonzero second derivative of the curve r(t) parametrized by the length of the arc, we derived an importnat basis (r1 it), nit), bit)), called the Frenet frame in the classical literature and simultaneously in this base we expressed the derivatives of its components by the form of the so called Frenet-Serret formulas dr'-(t) = K(t)n(t), —it) = x(t)b(t) - K(t)r'(t) dt db dt dt it) = -x(t)n(t). Notice that if the curve r(t) still lies in one plane, then its torsion is an identically zero function. In fact, the converse is true as well. 
We won't prove it here, but it is a corollary of a classical result of geometric theory of curves: Two curves in a space parametrized by the length of the arc can be mapped to each other by an euclidean transformation, if and only if their curvature functions and torsion functions coincide except for a constant shift of the parameter. Moreover, for every choice of smooth funcitons k ax there exists a smooth curve with these parameters. We won't prove this result here, the persons concerned can find the thorough version in[?]. By a straightforward computation we can check that the curvature of the graph of the function y = fix) in plane and the curvature k of this curve defined in this paragraph coincide. Indeed, by computing the derivative of the composite function using the differential of the length of the arc for the graph of a function of form dt = (1 + ifxf)1/2dx, dx = il + ifx)2)~1/2dt (here we write fx = ^) we obtain this relation for our unir tangent vector of the graph of a cruve r'it) = ix'it), y (0) = (d + ifx)2r1/2, fxd + (fx)2)-1'2) and by fairly not well arranged, but similar computation of the second derivative and its size we indeed obtain La 'dx2 6.17. The approximations of derivatives and the asymptotic estimations. In the beggining of this textbook in paragraphs 1.3, 1.9 and further we discussed how to express the value of a function by changes, i.e. differen-tions. In the next part of the text we will similaely the function / using its derivatives, i.e. immediate changes. Before that though, let's stop at the connections between 362 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS Solution. Because the ellipse is already in the basic form at the given coordinates (there are no mixed or linear terms), the given basis is already a polar basis. Its axes are the coordinate axes x and y, its vertices are the points [72, 0], [-72, 0], [0, 1] and [0, -1]. Let's first compute the curvature at vertice [0, 1]. If we consider the coordinate y as a function of the coordinate x (determined uniquely in a neighbourhood of [0, 1] ), then differentiating the equation of the ellipse with respect to the variable x yields 2x + 4y/ = 0, hence / = — ^ (/ denotes the derivative of function y(x) with respect to the variable x; in fact it's nothing else than expressing the derivative of a function given implicitly, see ??). Differentiating this equation with respect to x than yields y" = - ^-). At point [1, 0], we obtain / = 0 - \ (we'd receive the same results if we explicitly expressed and y" y = \ 72 — x2 from the equation of the ellipse and performed differentiation; the computation would be only a little more complicated, as the reader can surely verify). According to 6.13, the radius of the osculation circle will be (! + (/)¥ (y")2 -2, derivatives and differentions. The key to this will be the Taylor series with a remainder. Suppose that for some (sufficiently) differentiable function / (x) defined on the interval [a, b], we know the values fi — f (x,) at the points xq = a, x\, x2, ■ ■ ■, x„ — b, while all indices i — 1,..., n satisfy x;■ — x,_i — h > 0 for some constant h. Write the Taylor expansion of function / in the form f(xi ±h) = fi± hf'(Xi) + y± ^/(3)te) + ■ ■ ■ We know that if we finish the expansion by a term of order k in h, i.e. an expression containing hk, then the actual error will be bounded by uk+l —- (k+\)V on the interval [x, — h, xt + h]. If the (k + l)-the derivative / is continuous, we can approximate it by a constant. 
Then we can see that for small h, the error of the approximation by the Taylor polynomial of order k acts like hk+l except for a constant multiple. Such an estimation is called an asymptotic estimation. Definition. We say the expression G (h) is asymptoticall equal to F(h) for h -* 0 and write G(h) = 0(F(h)), if the finite limit G(h) lim h^o F(h) a € . exists. or 2, respectively, and the sign tells us the circle will be "below" the graph of the function. The ideas in 6.13 and 6.16 imply that its center will be in the direction opposite to the normal line of this curve, i.e. on the y axis (the function y as a function of variable x has a derivative at point [0, 1], thus the tangent line to its graph at this point will be parallel to the x axis, and because the normal is perpendicular to the tangent, it must be the y axis at this point). The radius is 2, so the center will be at point [0, 1 — 2] = [0, —1]. In total, the equation of the osculation circle of the ellipse x2 + ly2 = 2 at point [0, 1] will be x2 + (y + l)2 = 4. Analogously, we can determine the equation of the osculation circle at point [0, —1]: x2 + (y — l)2 = 4. The curvatures of the ellipse (as a curve) at these points then equal \ (the absolute value of the curvature of the graph of the function). For determining the osculation circle at point [72, 0], we'll consider the equation of the ellipse as a formula for the variable x depending on the variable y, i.e. x as a function of y (in a neighbourhood of point [72, 0], the variable y as a function of x isn't determined uniquely, so we cannot use the previous procedure - technically it would end up by diving by zero). Sequentially, we obtain: 2xx' + 4y = 0, thus x' = -2f, and x" = -2{\ - y-^). Hence at point [72, 0], we have x' = 0 and x" = — 72 and the radius of the circle of osculation is p = —= ^- according to 6.13. The normal line is heading to — oo along the x axis at point [72, 0], thus the center of the osculation circle will be on the x axis on the other side at distance ^, hence at the point [72 - ^, 0] = P^, 0]. In total, the equation of the circle of osculation at vertice [72, 0] will be (x — ^)2 + y2 = \. The curvature at both of these vertices equals 72. Denote the sought estimations of the values of the derivatives of fix) at the points x, as f^ and write the Taylor expansion briefly in this way: ft±i = ft±./;'// ■ rii fin JLh2±l^h3. 2 6 For the approximations of the first derivative we can immediately use three different differences computed from the Taylor expansion: (i) fi+l~fi-l-^f(3)(Xi) 2h 3! , (1) _ fi + l ~ fi _ h fh, s ~ h 2\J (Xi) + --- f!1) = i^L + ^f"(Xi) + ... when we only substracted the respective polynomials. Then we obtain a numerical representation of the first derivative. The first of them has an asymptotic estimation of the error of (i) fi + l - fi i-1 2h Oih2), the other two have 0(/z). We call them the central difference, the forward difference and the backward difference. Surprisingly, the central difference one digit place better than the other two. We can proceed the same way when approximating the second derivative. To be able to compute fix,) from a suitable combination of the Taylor polynomials, we need to cancel both the first derivatives and the value at x,. The simpliest combination cancels all the odd derivatives as well: (2) _ fi + l ~ 2fj + fj + l h2 363 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS □ 6.40. Remark. 
The vertices of an ellipse (more generally the vertices of a closed smooth curve in plane) can be defined as the points at which the function of curvature has an extreme. The ellipse having four vertices isn't a coincidence. The so called "Four vertices theorem" states that a closed curve of the class C3 has at least four vertices. (A curve of the class C3 is locally given parametrically by points [f(t), g(t)] € R2, t € (a, b) c R, where / and g are functions of the class C3 (R).) Thus the curvature of the ellipse at its any point is between its curvatures at its vertices, i.e. between \ and \fl. B. Integration First, several easy examples that everyone should handle. 6.41. Using integration "by heart", express (a) / e~x dx, x e R; (b)/ •JA-x2 dx, x e (—2, 2); x2+3 dx, x e 3x2 + l x3+x+2 dx, i^-l. Solution. We can easily obtain (a) / e~x dx = — f —e~x dx + C; (b)/ •jA-x2 dx = f ■ 'i-(!r ■■ dx arcsin § + C; x2+3 dx dx -+i V3 f 1 73 (d)/ -Larctg^ + C; 3x2+3 1 + (73) ■ dx x3+3x+2 dx = In x3 + 3x + 2 \ + C, where we used the formula / 4^ dx f(x) ln|/(*)| + C. □ 6.42. Compute the indefinite integral T + 4 e~ - ^ + 9 sin 5x + 2 cos : + 3-jc) dx pro i ^ 3, i ^ | + /C7T, k sZ. Solution. Only by combining the earlier derived formulas, we obtain We call this the differention of the second order and just like the central first differention, the asymptotic estimation is one digit place better than we would expect at first glance: A2) _ fi + l J i 2fi + fi+l h2 ■0(h2). 2. Integration 6.18. Newton integral. Now we'll take interest in the reverse procedure than we did for differentiating. We'll want to reconstruct the actual values of some function us-5s- ing its immediate changes. If we consider the given function f(x) the derivative of an unknown function F(x) then at the differential level we can write dF — f (x)dx. We call the function F the primitive function or the indefinite integral of the function / and traditionally we write F(x) J f (x)dx. Lemma. The primitive function F(x) to the function f(x) is determined uniquely on each interval [a, b] except for an additive constant. Proof. The statement follows immediately from Lagrange's mean value theorem, see 5.38. Indeed, if F'(x) — G'(x) — f(x) on the whole interval [a, b], the function (F — G)(x) has a zero derivative in all points c of the interval [a, b]. Then the mean value theorem implies that for all points x in this interval, F(x) - G(x) = F(a) - G(a) + 0-(x-a). Then the difference of the values of functions F a G must be the same on the interval [a, b]. □ The previous lemma leads us to this notation for the indefinite integral: F(x) -f f (x)dx + C with an unknown constant C. We can also consider the value of real function f(x) as an immediate increment of the area bounded by the graph of the function / and the x axis and try to find the size of this area between boundary values a a b of some interval. Let's try to connect this concept with the indefinite integral. Suppose we know a real function and its indefinite integral F(x), i.e. F'(x) — f(x) on the interval [a,b]. If we divide the interval [a, b] to n parts by choosing the points a — xo < x\ < ■ ■ ■ < x„ — b and approximate the values of the derivatives at the points x, by the expressions f(xt) = F'ixt) F(xi+i) - F(xj) Xi + l — Xi by summing over all the intervals of our partition, we obtain an estimation of the sought size of the area: 364 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS / (r + 4ex - ± + 9sin5x + 2cos § - ^ + dx ^ +6e^ + ~ I cos5x + 4sinf - 3tgx - In | 3 -x \ + C. 
□ For expressing the following integrals, we'll use the method of integration by parts (see 6.20). 6.43. Compute / x cosx Ax, x e R and flnxdx,x> 0; Solution. u = lnx u' — 1 i/ = 1 i; = x / lnx dx / / x cos x dx xlnx — / ldx=xlnx — x + C W = X w' = 1 i/ = cosx t> = smx x sinx + cosx + C. x sin x — / sin x dx □ 6.44. Using integration by parts, compute (a) / (x2 + l) e~x dx, x e R, (b) /(2x — 1) lnx dx, x > 0, (c) / arctgx 0. f (x2 + l) e~x dx 2 + l G'(x) (x2 + 1) e~x +f2xe~ F'(x) G(x) --dx = 2x —e" F(x)--G'(x) 2x - e~ F'(x) G(x) -- (x2 + l) e~x - 2x e~x + f 2 e~x dx = - (x2 + l) e" 2x e~x - 2e~x + C = -e~x (x2 + 2x + 3) + C; (b) /(2x —l)lnx dx (x2 — x) In x — / : F(x)~-G'(x) — dx = lnx = 2x - F'(x) G(x) -. 1/x j2 (x2 — x) In x + / 1 — x dx {x2 — x) In x + x X- + C; (c) «-i «-i !=0 (Xi + l-Xi) i=0 F(b) - F(a). Therefore we can expect that for "nice enough" functions f(x), the size of the area bounded by the graph of the function and the x axis can truly be calculated as a difference of the values of the primitive function at the boundary points of the interval. This procedure is called Newton integral. We write Ja f(x)dx = [F(x)fa = F(b) - F(a) and also speak of the (Newton) definite integral within the bounds a, b. In the case of a complex function /, the real and the imaginary part of its indefinite integral is uniquely determined by the real and the imaginary part of /, so with no further comments we'll only work with real functions from now on and we'll come back to complex ones in applications as needed. 6.19. Integration "by heart". Before we'11 clarify how the Newton integral is connected to the size of an area a eventually how to use it for simulations of practical problems, we'll show several procedures of computing the Newton integral. We'll only use our knowledge of differentiation. The most easy case is the one when we can see the derivative in the integrated function flat out. To do that in the simple cases, it suffices to read the tables for function derivatives in our menagerie from the other side. This way we get f.ex. the following statements for all a e R and n e 7L, n / — 1: / / / / / / / / / / / adx = ax" dx C —— x"+1 «+l C •ax dx = ± eax +C a — dx = a In x + C x a cos(bx) dx = | sin(bx) + C a sin (to) dx = — | cos (to) + C a cos (to) sin" (bx) dx = b(n+l) a sin(to) cos" (bx) dx = sin"+1(to) b(n+l) cos C n+1(bx) + C - In (cos (to)) + C b a tg(to) dx ^^dx = Kctg(t) + C yfa2 — x 1 _ dx = arccos (-) + C dx = arcsin (t—^ + C. 365 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS / arctg x dx F(x) = arctg x G'ix) = 1 G(x) = x x arctgx - / dx = x arctgx - \J dx x arctgx - \ In (l + x2) + C; (d) / ex sin x dx F(x)--G'(x) smx F'(x) G(x)~- cosx -e^cosx + f ex cosxdx F(x) -. G'ix) e = cosx F\x) = ex G(x) = sinx —ex cosx + ex sinx — / e* sinx 5 . (7+ln x)1 dx, x > 0; (c) f — v ' J (l+sin jt (d) f .cos* dx, X € ■' Vl+sin2* Solution. We have (a) j dx, x £ ü+pl, k € Z; / V2x — 5 dx t = 2x — 5 I dt = 2dx I y (2x - 5)3 + C; / yftdt t2 +C (b) (7+ln x)' dx t = 7 + lnx dt = - dx x (7+lnjt:)8 ffdt = j + C + C; (c) (l+sin x)2 dx t = 1 + sin x (if = cos x o du = (l + -r=) dt & - 1 f = sinx dt = cos x 0. Thus we should consider the values C\ and C2. For the sake of simplicity though, we'll use the notation without indices and stating the corresponding intervals. Furthermore, we'll help ourselves by letting aC = C for a e R \ {0} and C + b = C for b e R, based on the fact that {C; CeR) = {aC; C e R] = {C + b; C e R] = R. 
We could then obtain an entirely correct expression for example by substitutions C = aC, C = C + b. These simplifications will prove their usefulness when computing more complicated problems, because they make the procedures and the simplifications more lucid. Case (b). Sequential simplifications of the integrated function lead to ftg2xdx = f^Ldx= f J ° •> cosz x •> 1—COS2 X dx f —k— dx — f 1 dx = tgx — x + C, •I COSz X •> ° where we helped ourselves by the knowledge of the derivative (tg*)' = d*7, x^f+ /C7T,/ceZ. Case (c). It suffices to realize that this is a special case of the formula ff^dx = ln\f(x)\ + C, which can be verified directly by differentiation (ln|/(x)| + C)' = (ln[±/(x)])'+(Q / _ [±f(x)]' _ ±f'(x) _ f'(x) ±f(x) ±f(x) f(x) Hence I- i+si*-dx =ln (1 + sinx) +C. Case (d). Because the integral of a sum is the sum of integrals (if the seperate integrals are sensible) and a nonzero constant can be factored out of the integral at any time, we have / 6 sin 5x + cos | + 2 e^" dx = — | cos 5x + 2 sin | + 3 e^ + C. □ 6.48. Determine (a) / ^7 dx, x ^ j + kit, k € Z; (b) / x2 e~3x dx, x e R; (c) / cos2 x dx, x s R. Solution. Case (a). Using integration by parts, we obtain F(x) = x F'(x) = 1 G'« = d^ G{x)=tgx x tgx + / c™xx dx = x tgx + In | cosx | + C. Case (b). This time we are clearly integrating a product of two functions. By applying the method of integration by parts, we reduce the integral to another integral in a way that we differentiate one function and integrate the second. We can integrate both of them (we can differentiate all elementary functions). Thus we must decide which of the two variants of the method we'll use (whether we'll integrate the function y = x2, or y = e~3x). Notice that we can use integration bz ■ dx xtgx — f tgx dx While evaluating definite integrals, we can compute the whole reccurence straight out after avaluating in the given bounds. F.ex. it can be seen immediately that while doing integration over the interval [0, 2it], our integrals have these values: <>2jt Iq — I dx I Jo [x]2n = lit *2n h -L f I cosxdx — [sinx]2,^ — 0 Jo 0 for even m cos x dx — m — l -Im-2 for odd m Thus for even m — In we obtain the resulr í>2jt Jo 2„ , (2n - l)(2n-3)...3- 1„ cos x dx —--——-—---2it, jo 2n(2n — 2)... 2 outright, while for odd m it's always zero (as could be guessed from the graph of the function cos x). 6.23. Integration of rational functions. While doing integration of rational functions, we can use several simplifications. Particularly in the case the degree of the polynomial / in the numerator is greater or equal to the degree of the polynomial g in the denominator, it's sensible to carry out the division with a remainder outright (see the paragraph 5.2) and reduce the integration to a sum of two integrals. The first one will be an integration of a polynomial and the second one an integration of an expression f/g with degree of g strictly greater than the degree of / (such functions are called proper rational functions). Indeed, we can achieve this by simple division of the polynomial: / h f = q-g+h, — = q + -. g g Thus we can assume without loss of generality that degree of g is strictly greater than the degree of /. We' 11 show another procedure on a simple example. Let's try to analyse how to get the result f(x) _ 4x+2 -2 6 g (x) x2 + 3x + 2 which we can integrate directly: 4x + 2 1 / ■ dx -2 In I II 61n| C. - 3x +2 First off, by modifying the sum of the fractions to a common denominator we can verify this equality easily. 
Conversely, if we know our expression can be written in the form 4x + 2 A B x2+3x + 2 x+1 x + 2 we only need to compute the cofficients A and B. We can obtain equations for them by multiplying both side by the polynomial x2 + 3x + 2 from the denominator and comparing coefficients of the individual powers of x in the resulting polynomials on both sides: 4x + 2 = A(x + 2) + B(x + 1) 2A + B = 2, A + B = 4. This is where our decomposition comes from. It's called decomposition into partial fractions. This elementary procedure can easily be generalized. It's a purely algebraic notion based upon properties of polynomials, which we'll come back to in chapter ??. 368 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS parts repeatedlz and that the 72-th derivative of a polynomial of degree n € N is a constant polynomial. That gives us a way to compute x e dx F(x)--G'(x) -3x ■\x2 e~3x + \ I F'(x) G(x) x e 2x 1 „-3x 3 C 3xdx and furthermore x e -3x dx 1 I x e~3x + 3 10-3* "3 e I «-3x 9 C F(x) = x F'(x) G'(x) = e~3x G(x) --1 /e~3xdx = -\xe~3x - ±e~3x + C. In total, we have /-i-2 ^ 3x .1 y _ __1_ -^.2 ^ 3x _ 2 _ 3x _ 2 _ 3x ~v C L c ~ V - ^ -A< G g . V G G - \q~3x (x2 + \x + I) + C. Note that a repeated use of integration by parts within the scope of computing one integral is common (just like when computeing limits by the l'Hospital rule). Case (c). Again we apply integration by parts using f cos2 x dx = f cos x ■ cos x dx = cos x ■ sin x + / sin2 x dx F(x) = cosx F'(x) = — sinx G'(x) = cosx G(x) = sinx cos x • sin x + f 1 — cos2 x dx = cos x • sin x + f \dx — f cos2 x dx = cos x • sin x + x — f cos2 x 0. Suppose the denominator g (x) and the numerator f(x) don't share any real or complex roots and that g (x) has exactly n distinct real roots a\,..., an. Then the points a\, ...an are exactly all the discontinuities of the function f(x)/g(x). For simplifying the notion first write g(x) as the product g(x) = p(x)q(x) of two coprime polynomials. By Bezout identity (see ??), which is a corollary of ordinary polynomial division with a remainder, there exist polynomials a(x) and b(x) of degrees strictly lower than the degree of g such that a(x)p(x) + b(x)q(x) — 1. Multiplying this equality by the quotient f(x)/g(x), we obtain fix) _ a(x) b(x) g(x) q(x) p(x) Now suppose our polynomial g (x) has only real roots, therefore it has a unique factorization (x — at)"', where are the multiplicities of the roots a,, i — 1,..., k. By a sequential use of the previous procedure with coprime polynomials p(x) and g(x), we get a representation of f(x)/g (x) as a sum of fractions of the form rijx) rk(x) (x-fli)"i "' (x-ak)mr where the degrees of the polynomials n (x) are strictly lesser than the degrees of the denominators. Each of them can be very easily represented as a sum r(x) Ai A2 An ——— =-=--1-----1-----1---—, (x — a)n x — a (x — a)2 (x — a)n if we start from the highest powers of the polynomial r(x) and sequentially compute A\, A2,... by suitable adding and removing of summands in the numerator. F.ex. 5x - 16 x - 2 1 5 6 (x - 2)2 (x - 2)2 (x - 2)2 x - 2 (x - 2)2' Now we need to handle the case, where there are not enough real roots. There always exists a factorization of g(x) to linear factors with eventual complex roots though. Repeating the previous notion for complex polynomials gives us the same result. If we know in advance the coefficients of the polynomials are real though, the complex roots in our expressions will come up simultaneously with their complex conjugate roots. 
Therefore we work with quadratic factors of the form of sum of squares (x — a)2 + b2 and their powers straight out. Our previous notion work very well again and guarantees that it will be possible to see the respective summands in the form of Bx + C 7x>+x ((x - a)2 + b2)" ' Similarly to the real roots case, we can always find the corresponding decomposition into partial fractions of the form A\x + B\ Anx + Bn (x - a)2 + b2 H h ((x - a)2 + b2)" in the case of a power ((x—a)2+b2)n of such quadratic (irreducble) factor as well. Specific results can also be tried out in Maple by calling the procedure "convert(h, parfrac, x)" that decomposes the expression h that is rationally dependant on the variable x into partial fractions. 369 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS Solution. Case (a). This is a simple problem for the so called first substitution method, whose essence is writing the integral in the form of (6.8) J f((p(x))(p'(x)dx for certaing functions / and cp. Using the substitution y = (pipe), (we also substitute dy = cp1 (x) dx, which we get by differentiating y = (p(x)) , such integral can be reduced to the integral / f(y) dy. By substituing y = cos x, where dy = — sin x dx, we then obtain / cos5 x ■ sinx dx = — f cos5 x (— sinx) dx = — f y5 dy = - 2- + C -Case (b). Using the equality + C. / cos5 x • sin2 x dx = f (cos2 x) sin2 x • cos x dx /(I sin' x) sin2 x • cos x dx we're tempted to use the substitution t t f cos5 x • sin2 x dx ft6 - 2t4 + t2 dt dt - sinx cos x dx + l + C sin x, which yields J = /(i -pftdt • 7 i • 5 sin x 2 sin x + + c. 7 " 5 1 3 1 w — 7 5 1 3 Case (c). Because both sine and cosine are contained in an even power, we cannot proceed as in the previous problem. Let's try to use the so called second substitution method, which means a reduction of / f(y) dy to the form (||6.8||) for y = (p{x). A situation in which we replace a simple expression by a more complicated one might seem surprising. But don't forget that this more complicated integral might have such a form that we may just be able to compute it. We want to determine the primitive function of function f(x) = tg4x. Thus it's sensible to consider the substitution u = tg x. We obtain x = arctg u 1 dx - I ■ dx du 1+u2 , y - u + arctg u + C = ^ --tgx + x + C f t^—t du = f U2 — 1 + -y—r du tgx + arctg (tgx) + C Case (d). We have dx •O^+x z = x '5 A~ — dx I l-z2+z zs+z6 1 6z5 which can be easily computed by integrating by parts, yielding ■ / te' dt F(t) = t F'(t) = 1 dt) = e G(0 = e' \te yedt e~x (x2 + 1) + C. Case (b). Similarly, we obtain / x arcsin x2 dx = F'(t) G(t) F(t) = arcsin t G'(t) = 1 t = XT dt = 2x dx l / arcsin t dt t arcsin t f-r=dt u du l 1 -2t dt \t arcsin t + \ f ^= = \t arcsin t + \yfü + C 2t arcsint + ±Vl - t2 + C \x2 arcsinx2 + ±Vl - x4 + C. □ 6.51. Compute the integral / Vl -x2 dx, x e (-1, 1) in two different ways. Solution. Integration by parts yields F(x) = Vl -x2 F'(x) G(x)-- VT -xf x Vl — x2 — f Vl — x2 dx + arcsin x, which implies 2 / V1 — x2 dx = x Vl — x2 + arcsinx + C, i.e. / V1 — x2 dx = ^ (x Vl — x2 + arcsinx^ + C. The substitution method along with (||6.7||) then yields x = sin y ' dx = cos y dy f V1 — x2 dx f y/1 — sin2 y ■ cos j dy f cos2 j 0. Solution. This problem can illustrate the possibilities of combining the substitution method and integration by parts (in the sscope of one problem. First we'll use the substitution y = V* to get rid of the root from the argument of the exponential function. 
That leads to the integral (3) If f is real function defined on the interval [a,b], C el is a constant and the integral f% f(x)dx exists, then the integral f C ■ f(x)dx also exists and / Ja b rb C ■ f(x)dx = C ■ I f(x)dx Ja holds. Proof. (1) First suppose that the integral over the whole interval exists. When computing it, surely we can limit ourselves to limits of the Riemann sums whose partitions have the point c among their partitioning points. Each such sum can be obtained as a sum of two partial Riemann sums. If these two partial sums would depend on the chosen partitions and representants in limit, then the total sums couldn't be independant on the choices in limit (it suffices to keep of sequence of partition of the subinerval the same and change the other so the limit would change). Conversely, if both Riemann integrals on both subintervals exists, they can be approximated with arbitrary precision by the Riemann sums, and moreover independantly on their choice. If we add a partitioning point c to any sequence of Riemann sums over the whole interval [a,b], we'll change the value of the whole sum and also the values of partial sums over the intervals belonging to [a, c] and [c, b] at most by a multiple of the norm of the partition and possible differences of the bounded function / on whole [a, b]. That's a number tending arbitrarily close to zero for a decreasing norm of the partition. Then necessarily the partial Riemann sums of our function also converge to the limits, whose sum is the Riemann integral over [a, b]. (2) In every Riemann sum, the sum of the functions manifests as the sum of the values in the chosen representants. Because multiplication of real numbers is distributive, the statement follows. (3) The same thought as in the previous case. □ The following result is crucial for understanding the relation between an integral and a derivative: 6.25. Theorem (The fundamental theorem of integral calculus). For every continuous function f on a finite interval rb [a, b] there exists its Riemann integral Ja f(x)dx. Moreover, the function F(t) given on the interval [a, b] by the Riemann integral F(x) = f f(t)dt J a is a primitive function to f on this interval. The whole proof of this important statement will be somewhat longer. In the first step for proving the existence of the integral, we'll use an alternative definition, in which we replace the choice of representants and the corresponding value /(&) by the suprema Mi of the values f(x) in the corresponding subinterval [x,_i, x;], or by theinfima m, of the function fix) in the same subinterval, respectively. We speak of upper and lower Riemann sums, respectively (in literature, this process is sometimes called the Darboux integral). 371 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS y2 = x 2y dy = dx Now by using integration by parts, we'll compute / dx 2fyeydy. fytydy f(y)--g'(y) y f'(y) g(y) -- 1 yey - feydy yey — ey + C. Thus in total, we have /e^dx = 2yey - 2ey + C = 2e^ (y/x - l) + C. □ 6.53. Prove that 1 . „ - sin x 2 1 1 3 -- cos(2x) H--cos(4x) H--. 4 ' 16 16 Solution. Easier than to compare the given expressions directly is to show that the functions on the right and left hand side have the same derivatives. We have L' = 2 cos x sin3 x = sin(2x) sin2 x, p' = \ sin(2x) + | sin(4x) = sin2x(j + ^cos(2x)) = sin(2x) sin2 x. Hence the left and the right hand side differ by a constant. This constant can be determined by comparing the values at one point, for example 0. Both functions are zero at zero, thus they are equal. 
□ C. Integration of rational functions 6.54. Integrate (a) f ^ dx, x ^ 2; (b) / (TiV * * "4: (c) f^TTs dx,xeR; (d) f , ?0x~71 a dx, x e R. Solution. Cases (a), (b). We have f^dx j x—2 y = x — 2 dy = dx f-dy = 6In I j |+C = 61n|x-2|+C and similarly (X+4Y dx y = x +4 dy = dx I^dy -2y< + C (x+4)' + c. We can see that integrating the partial fractions which correspond to real roots of a denominator of rational function is very easy. Moreover, without loss of generality we can obtain y = X — Xq and -x0 ■ dx dy = dx I f y dy ■ A In I x — x0 I + C A In I y I + C (x-xq)' ■ dx y = x - X0 dy = dx A fj^dy Ay- -n + l + c (l-n)(x-x0r-1 for all A, x0 e R, n > 2, n e N. Case (c). Now we are to integrate a partial fraction corresponding to a pair of complex conjugate roots. Thus in the denominator there is a polynomial of degree 2 and in the numerator at most 1. If it's of degree 1, we'll write the partial fraction so that we'll have a multiple 6.26. Upper and lower Riemann integral. Because our func-jijfition is continuous, it's surely bounded on a closed interval, hence all the above considered suprema and infima and fi-| nite. Then the upper Riemann sum corresponging to the partition S — (xq, ..., x„) is given by the expression Ss.sup = J2( sup /(§)) • (xi - Xi-i) = J^Miixt - xt-i), r = l while the lower Riemann sum is Ss.inf — inf /(§)) • (xt - xt-i) — y^mtixi - x,_i). i—^h-i<^ 0 we can find k such that ^Ht.sup, k> N will be distant from Smp by less than e. 372 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS of the derivative of the denominator in the numerator and add to it the fraction, in whose numerator there is only a constant. This way we'll obtain r 3x+7 i _ 3 r 2x-4 jv , n r dx J x2-4x + 15 2 J x2-4x + l5 - 2 J x2-4x + l5 dX + 13 / '. §ln(x2 -4x + 15) + = f ln(x2 -Ax + 15) + x-2 x2-4x + \5 J2 I dx +1 y -- dy dx ln(x2-4x + 15) + JLf dy y2+l ■ In (x2 - Ax + 15) + J= arctg y + C = In (x2 - Ax + 15) + arctg *jg + C Again, we can generally express Ax+b (x-x0)2+a2 dx and compute 2(x-x0) f f MX-X20> 2 dx J (x-x0)2+a2 f , 2(X10) 2 dx + (B + Ax0) f ■] (x-x0)2+a2 v K" •> f _2 j y (x-x0)2+a2 dx y (x — xq)2 + a2 dy = 2 (x — xq) dx I (x-x0)2+a2 In | y | + C = ln[(jc - x0)z + a2] + C, x—xq dx = -4 f a1 j dx +i z = dz. dx a J z2 + l ± arctgz + C = i arctg + C, i.e. J Ax+b (x-x0)2+a2 dx f In ((x - x0)2 + a2) + *±j^ arctg £^o + C, where the values A, B, x0 e M, a > 0 are arbitrary. Case (d). All that is left are the partial fractions for multiple complex roots in the form of r Ax+2B „„, A, B,x0 e M, a > 0,n e N\ {1}, which can be analogically simplified to 2(*-*p) 2 [(x_X())2+fl2] + (S + Ax0) [(*-*0)2+a2]' Then we'll determine 2(x-xp) [(x-x0)2+a2 y (x — xq)2 + a2 l (l-n)yn dy = 2 (x — xo) 0, we can find S > 0 such that for all partitions with norm at most S the inequality l^s.sup — | < s will hold. That's exactly the statement that the number Smp is the limit of all sequences of upper sums with norms of the partition approaching zero. We cam prove the statement for lower sums in exactly the same way. If the Riemann integral doesn't exist, there exist sequences of partitions and representants with different limits of Riemann sums. The the proven statement implies, that the limits of upper and lower sums will be different as well. Conversely, suppose Smp — Sm, but then all Riemann sums of sequences of the partitions must have the same limit because of the inequalities (6.3). □ 6.27. Uniform continuity. 
Until now, we have only used the continuity of our function / to show that all such functions are bounded on a closed finite interval. We still have to show though, that for continuous functions, ^sup — ^inf* >From the definition of continuity we know that for every fixed point x € [a, b] and every neighbourhood 0B(f(x)) there exists a neighbourhood Os(x) such that /(£><$ (x)) c 0E(/(*)). This statement can also be rewritten in this way: for y, z e Og (x), i.e. \y-z\< 28, it's true that f(y), f(z) € 0E(f(x)), i.e. \f(y)-f(z)\<2s. We're going to need a global variant of such properly, we call itw-niform continuity of function /: Theorem. Let f be a continuous function on a closed finite interval [a, b]. Then for every s > 0, there exists 8 > 0 such that for all z, y € [a, b] satisfying \y — z\ < 8, the inequality \f(y)- f(z)\ctg^ + I + C In total, we have ■ dx 30x-n 15 x-3 (x2-6x + 13) ~ x2-6x + 13 + 16 WCX% 2 + 8 x2-6x + 13 11 arrtCT £=1 4- 13x-159 _l r1 16arctg 2 + 8(jc2_fa+13) +c. + C □ 6.55. Integrate the rational functions jc3 + 1 jc(jc-I)3 x-4 (a)/ ^ / 5jt2+6*+3 (c)/ (e) / dx, i /0, i / 1; (5*>+6*+3)-^/£_ = ± In (5x2 + 6x + 3) - ^ arctgf + C = ^ In (5x2 + 6x + 3) - ^ arctg ^t3 + C; 5jc+3 -4= 0, dependant on dixed small e > 0, such that \f(x + Ax)-f(x)\ D Thus A + C, A - 2C + D, C - 2D, -2A + D. _8_ " 25 • (x-l)2(x2+2x+2) dx = f dx dx x+8 ■ dx In I x - 1 l j_ 25 1 ~ " 1 5(jc-1) where we used 25(x-l) 1 J 5(x-l)2 J 25(x2+2x+2) ^ In (x2 + 2x + 2) - ^ arctg (x + 1) + C, jt+8 x2+2x+2 dx = f \(2x+2) ■ dx ■f 2x+2 7/ l (x+l)2 + l dx l x2+2x+2 dx + x2+2x+2 x2+2x+2 2 In (x2 + 2x + 2) + 7 arctg (x + 1) + C. □ 6.57. Determine (a) I^tlXT1 dx,xeR; (b) fj^dx, x ^±1. Solution. Case (a). First we must do the division of polynomials (x3 + 2x2 + x - 1) : (x2 - x + 1) = x + 3 + ^fj-, to consider a proper rational function (with the degree of the numerator lower then the degree of the denominator). Now we'll compute xi+2x2+x-l x2-x + l + 3x + |/ dx = f x +3dx + f 2x-\ dx 3x-4 x2-x+l dx dx 2 ■ - ■ 2 , x2-x+r - 2 j {x_hy+{4y XY + 3x + | In (x2 - x + 1) - j= arctg ^ + C. Case (b). We have fJ*L.1dx=fldx + ±f£-1-±f 1 /■ * /UX I 72JT ■Jlx-2 8 ^ j^-v^ + l 7 8 J jc-1 8 J x+l 4 J x2+l 1 r V2x+2 • 0 from the right clearly exists, so we've evaluated the improper integral JO We can proceed in the same way if we want to integrated over an unbounded interval. In this case, we often speak of improper integrals of the first kind, while the integrals of unbounded functions on finite intervals are improper integrals of the second kind. More generally, for example for a e R -f J a f(x) dx — lim f J a f(x) dx, if the limit on the right hand side exists. Similarly we can have a finite upper bound and the other one infinite. If both are infinite, we evaluate the integral as a sum of two integrals with a chosen fixed bound in the middle, i.e. /cx) pa />c f(x)dx= J f(x)dx+ J -cx) J—oo Ja f(x) dx. The existence nor the value doesn't depend on the choice of such bound, because by changing it, we only change both sum-mands by the same finite value, but with the opposite sign. Conversely a limit for which the upper and lower bound would approach ±oo at the same speed can lead to different results! For example ^1 / J —i x dx — -x2 = 0, even though the values of the integrals x dx with one fixed bound approach infinite values fast. 
When evaluating the improper integral of a rational function we must carefully divide the given interval according to the discontinuities of the integrated function and compute all the improper integrals seperately. Moreover it's necesarry to divide the whole interval in a way that we always integrate a function unbounded only in a neighbourhood of one of the boundary points. 6.31. New acquisitions to the ZOO. From the solved problems it could seem it's usual to find an indefinite integral by expressions composed of known elementary functions. That's a I completely false impression, f ^ On the contrary, an overwhelming majority of continuous functions leads to integrals we cannot express in this way. Even if we integrate fairly simple functions. Because the functions obtained by integration often appear in applications, many of them have names and before the advent of computers, extensive tables were published for the needs of engineers. In further text, we'll come back to the methods of obtaining numeric approximations of such functions. Let's see at least some examples. In methods of signal processing, the function sinc(x) sin(x) is very important. It can be verified in a way fairly straightforward, although toilful way, that it's a smooth function with limit values /(0) = 1, /'(0) = 0, /"(0) = -^. 376 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS x + | In | x l|-±ln|x + l|-±arctgx + 4f / If 8 i i 4 "^a- ' 16 Zjc+V2 dx V2 r 16 J dx Ix-sfj x2--j2x + \ dx dx x + £ In | x ^ln 16 | In | x + 11 - \ arctgx + (x2 - V2x + l) - ^ arctg (V2x - l) -2(| In (x2 + V2x + l) - x arctg (V2x + l) + C. □ 6.58. Compute 2jc4+2jc2-5jc + 1 x{jc2-jc+1)2 0, x e. Solution. Case(a). The advantage of the method of integrating rational functions described above is its universality (using it, we can find primitive functions of every rational function). Sometimes though, using the substitution method or integrating by parts is more convenient. For example, v-2 I JjT^dx y = x dy = 2x dx I dy 2(l+y2) 1 f dy 2 J l+y2 ■ arctgy + C = j arctgx + C. Case (b). Using substitution, we obtain an integral of a rational function y = In x I •f x In3 5 In x_ x+x In2 x—2x 5y f In3 5 In x „2 . jc+ln2 x-2 x y-l y2+2y+2 - dx = x -y+2 dy dy = ■ dx Hence it can be immediately seen that this even function will have an absolute maximum at the point x = 0 and with the increasing absolute value of x, it will ripple with ever decreasing amplitude. The sine integral function is defined by Si(jc) = / sinc(f) dt. Another important functions are Fresnel's sine and cosine integrals FresnelS (x) FresnelC(x) = / sinfijrf2) Jo = / cosfijrf2) Jo dt dt. On the left figure, there's the course of the function Si(x), on the right we can see both Fresnel's functions. ll/vJl'''1''''''1''1''1..... We can also obtain a new type of functions, if we allow a free parameter in the integrated expression, on which the result then depends. As an example, let's look at one of the most important mathematical functions ever — the so calld Gamma function. It's defined by r(z) Jo f~ldt. It can be shown that this function is analytic at all points z £ Z and for small z e N we can evaluate: Jo T(l) = / e-'fdt = [-e"1]^ = 1 T(2) = / e_r f1 dt = [- e_r + J e~c dt = 0 + 1 = 1 fOO e~' tdt = 0 + 2 = 2 poo poo = / e~'t1 dt = [- + / e~'dt = < Jo Jo POO PC T(3) = / e-'f-dt = 0 + 2 / Jo Jo and by induction we can easily derive that for all positive integers n this function yields the value of a factorial: Tin) = (n- 1)! 
The following figure shows the course of the function f(x) — ln(r(x)) in logarithmic scale of the dependant variable. Hence we can see how fast the factorial actually grows. 377 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS I^y-\j^±%dy + 3j 1 In I y In I In x - y-l 1 (j + l)2+l2 dy In (y2 + 2y + 2) + 3 arctg (y + 1) + C 1 | - i In (ln2x + 21nx +2) + 3arctg(lnx + 1) + C. □ For an arbitrary function / that is continuous and bounded on a bounded interval (a,b), the so called Newton-Leibniz formula (6.9) / f(x)dx = [F(x)]ba := lim F(x) - lim F J x^b- x=>a+ (X) holds, where F'(x) = f(x),x e (a,b). Emphasise that under the given conditions, the primitive function F always exists and so do both proper limits in (||6.9||). Hence to compute the definite integral, we only need to find the antiderivative and determine the respective onesided limits (eventually only values of the function, if the primitive function is continuous at the boundary points of the interval). 6.60. Determine (a)/ 1 Vx^+JT* dx, x > 0; dx, x ^ l. dx, x e R \ [-1, 1]; dx, x e (—oo, —4) U (1, +oo); dx, x e (—1,2); dx, i^l. (x-l)^x2+x + l Solution. In this problem, we'll illustrate the use of the substitution method while integrating expressions containing roots. Case (a). If the integral is in the form of p(i). pay p(j) y/x) dx for certain numbers p(\), p(2),..., p(j) e N and a rational function / (of more variables), the substitution f = x is suggested, where n is the (least) common multiple of numbers p(l),..., p{j). Using this substitution, we can always reduce the integrand (the integrated function) to a rational function, which we can always integrate. We'll get dx _ ? 0 = X, i°/x = t dx dx r^V^+^t2) 10/^ = 10/(1 10 [In? + In lOf dt 1 i 1 _ 10ty j- 4- 2fi ^ 3fi 1 t+l ) dt = dt + (i+ \°I)10 Case (b). For integrals 10 ^ ln(l+0] + C: + C. 5 I__10 & + 3^ 2^ p(\X p 0 and the polynomial ax2 + bx + c has real roots x\, x2, we'll use the representation X-X2 X—X\ Vax2 + bx + c = yfa y (x — X1)2 = yfa \ x — x\ and let?2 = . If a < 0 and the polynomial ax2 + bx + c has real roots x\ < x2, we'll use the representation V ax2 + bx + c = -a (x — xi) 2 *2-:t x—X\ -a (x — xi) X2-X X—X\ and let f2 = jz^. If the polynomial ax2 + bx + c doesn't have real roots (necessarily for a > 0), we choose the substitution Vax2 + bx + c = ±V^ ■ x dzt with any choice of the signs. Note that we of course choose the signs so that we get as easy expression to integrate as possible. In all these cases, these substitutions again lead to rational functions. Hence (d) many disjoint intervals, by which we can cover the given set A, while the lower integral is the supremum of the sums of lengths of finitely many disjoint intervals that can be embedded into the set A. We can proceed in the same way in higher dimensions when defining the Jordan measure. For the definition of area (volume) in higher-dimensional space we will also be able to use the concept of the Riemann integral as well, when we generalize it for the multidimensional case. It's good to already notice though that the original notion about an area of a plane figure bounded by a graph of a function in the way described above will indeed be fulfilled completely. 6.33. Mean value of a function. For a finite set of values, we're used to think of their mean value and usually define it as the arithmetic mean. For a Riemann integrable function f(x) on an interval (finite or infinite) [a, b], its mean value is defined by m(f) 1 [l b-a Ja f(x) dx. 
By definition, m(f) is the altitude of the rectangle (oriented according to the sign) over the interval [a, b], which has the same area as the area between the x axis and the graph of the fix). Hence the integral mean value theorem holds in general. Proposition. If fix) is a Riemann integrable function on an interval [a, b], then there exists a number mi f) satisfying fb fix) dx = mif)ib — a). f J a 6.34. Length of a space curve. The integral we built can be also effectively used to compute the length of a curve in H'~±^i multidimensional vector space M". For the sake of simplicity, we' 11 show this on the case of a curve in M2 with coordinates x, y. Suppose we have a parametric description of a curve F : R —>• M2, Fit) = [git), fit)] and look at it as a trajectory of a movement. For simplicity suppose that fit) and git) have sequentially continuous derivatives. By differentiating the map Fit) we'll obtain values that will correspond to the speed of the movement along this trajectory. Hence the total length of the curve (i.e. distance traveled over time between the values t = a,t = b) will be given by an integral over the interval [a, b], where the integrated function hit) will be exactly the sizes of the vectors F'(t). Therefore we want to compute the length s given by s= fhhit)dt= fh J if'it))2 + ig'it))2 dt. In a special case when the curve is a graph of a function y = fix) between points a < b, we'll obtain J a 1 + (f'(x))2 dx. The same result can be intuitively seen as a corollary of Pythagor's theorem: for a linear increment of the length of a curve As corresponding to the increment Ax of variable x, we can compute As '(Ax)2 + iAy)2 379 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS dx dx (e) (f) (x+4)Jx2+3x-4 J (x+4)J(x-l)(x+4) t2 _ x-1 dx X dx x+4 -j__4 d-t2)2 dt I dt = f 11-^1 5 1-fi ' dt sgn(l-f2)/l^ = fsgn(^)f + C f sgn (x) M + C; x +l) i2 +2r-2 , i 2t-l 1 -2(^-^+1) , (2r-l)2 flf 2 A = f_(J^Dl_dt - f_ J >2+2>-2 >2 -n-1 j r2+2r-2 2r-1 2r-1 f f VI 1 _ VI 1 \ j ^ 3 r+l-V3 3 ,+ 1+73/ dt V3 In f+ 1- V3 V3 In t + 1 + V3 + C fin h-1-V3 h-1+V5 V*2+jt + l-jt + l-\/3 + C = ^ In 1 . . -pi J V*2+*+l-*+l+V3 V3 + c. □ 6.61. Using a suitable substitution, compute j„ _ / „ -x/5-l\ i i /V5-1 —=— dx, x € (-oo, ^i^1) U (- ;+Vx2+x-l \ 2 / V Solution. Even though the quadratic polynomial under the root has real roots x\, x2, we won't solve this problem by substitution t2 = jEr^- We could proceed that way, but we'll rather use a method we introduced for the complex roots case. That's because this method yields a very simple integral of a rational function, as can be seen from the calculation and when looking at our definition of integral, that means Conversely, the fundamental theorem of differential calculus (see 6.25) shows, that at the level of differentials, such defined quantity of the length of a graph of a function y — y(x) satisfies ds — yj 1 + (/ (x))2dx, just as we've expected. As an easy example, we'll calculate the length of a unit circle as a double of an integral of the function y — Vl — x2 over [—1, 1]. We already know that the result must be 2it, because we defined it in this way. ''LiTT- '1 + (/)2 dx = 2 U1 + T^dx : dx — 2[arcsinx]l_1 — 2it. If we instead use y — \j r2 — x2 — rJ 1 — (x/r)2 and bounds [—r, r] in the previous calculation, by substituting x — rt we'll obtain the length of a circle with radius r. sir) (x/r)2 (x/r)2 2r[arcsinx]l_1 — 2itr. ■ dx — 2 sr—p : dt The result is of course well known from elementary geometry. 
Nonetheless, by using integral calculus, we've just derived an important fact, that the length of a circle is linearly dependent on its diameter 2r. The number it is exactly the ration, in which is this dependancy realized. 6.35. Areas and volumes. The Riemann integral can be used directly to compute areas or volumes of shapes defined by a graph of a function. As an example, let's calculate the area of a circle with radius r. The halfcircle bounded by the function \lr2 — x2 has an area, whose double a(r) can be computed using the substitution x — r sin t,dx — r cos t dt (using the corollary for h in the paragraph 6.22) a(r) 2 — x2 dx — 2r2 2r2 — [cos? sin? + t\_n/2 ,jt/2 J-n/2 -.irr2. cos ? dt It's again worth noticing that this well known formula is derived from the principles of integral calculus and surprisingly, the area of a circle is not only proportional to the square of the radius, but this proportion is again given by the constant jr. Also notice the ratio of the area and the perimeter of a circle, i.e. Ttr2 r 2nr 2 A square with the same area has a side of length spiir and therefore its perimeter is 4v/jrr. Hence the perimeter of a square with an 380 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS dx :+jx2+x-l \Jx2 + X — 1 = X + t x2 + x - 1 = x2 +2xt + t2 v — t+1 1-2* -2J2 +2H-2 (t+2)(l-2t) «+2 IJl) Vx2 + X - 1 • M is positive and nonincreasing on the interval (1, 00). Then this series converges if and only if the integral fix) Ax. converges. Proof. If we interpret the integral as an area under the curve, -2 the criterion is clear. 381 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS 6.63. Compute the definite integrals (b) J2, dx; (C) fo (e2*+3 + cos2*) dx'i Solution. We have (a) TT^T2 dx y = 1 — x dy = - 2 2x dx r-^dy f^dy = [Vy]10 = l; (b) dx Jx^l Z = X + V-? dz, [lnz] 2+V3 2+V3 / \dz In (2 + V3); (c) 1 / 1 \ 1 1 1 / ( ??+3 + ^7 )dx = f dx + f ^5-^ >oo f>k / /(x)dx = lim / /(x)dx > lim ^k — 00 Jl k^ooJi k^oo and the intergral diverges. Now suppose the given integral converges and denote the k-ih partial sum of the given series by Sk. Then we have the inequalities k^oo /(x)dx — lim / /(x)dx < lim sk < 00, k^oo rk because Sk is an upper sum of the Riemann integral J: f(x) dx and we suppose the given series converges. □ 3. Infinite series While building our menagerie, we have already encountered power series, which extend the collection of all polynomials in a natural way, see 5.44. We also said that we'll obtain a class of analytic functions in this way, but we didn't even prove that power series are continuous functions. Now we can easily show that it is indeed the case and that we can also differentiate and integrate power series term by term. Because of this, we will see that it's not possible to obtain a sufficiently wide class of functions by using power series. For example, in this way we can never obtain only sequentially continuous periodic functions, which are very important for simulations and processing of audio and video signals. 6.37. How tamed are our series of functions? Let's now return iJl to disscussing the limits of sequences of functions and the sum of series of functions from the point of applying the ^ methods of differential and integral calculus. Consider a convergent series of functions S(x) = Ylf«(x) «=1 on an interval [a,b]. Natural questions are: • If all functions /„ (x) are continuous at some point xo e [a, b], is the function S(x) also continuous at the point xo? 
• If all functions f„(x) are differentiable at some point a e [a,b], is the function S(x) also differentiable there and does the equality S> (x) = £~ 1 f> (x) hold? • If all functions /„ (x) are Riemann integrable on an interval [a, b], is the function S(x) also integrable there and does the equality fha S{x)dx = Y.T=i fa /« hold? First we'll show on examples that the answer on all three such formulated questions are "NO!". But then we'll find simple additional conditions on the convergence of the series which, on the contrary, will guarantee the validity of all three statements. Hence the series of functions are not generally well managable, though we can choose a wide class of ones which can be worked with very 382 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS is clearly the antiderivative of function fix) := x5 In (x + 1) on interval (—1, 1), i.e. by differentiating it, we'll get exactly /. Hence /t5 In (t + 1) dt f t5 In (t + 1) dt o £. Improper integrals -x5 In (x + 1) □ 6.66. Decide if +0o f dx e 1 and the x axis (from the left, the figure is bounded by the line x = 1). Hence the integral is a positive real number, or equals +oo. We know that f < arctgx < §, x e [1, +oo). But that implies +00 +00 +00 z = z f x-ldx< f ^ dx<\ f x-2 dx = 7t, 1 1 1 i.e. in particular +0o f ^ dx e □ The formula (|| 6.91|) can be also used in a case when the function / is unbounded or the interval (a, b) is unbounded. We speak of the so called improper integrals. For the improper integrals, the limits on the right hand side may be improper and may not exist at all. If one of the Umits doesn't exist or we receive an expression oo — oo, it means that the integral doesn't exist (oo — oo doesn't have a character of an indefinite expression in this case). We say the integral oscilates. In every other case, we have the result (recall that oo + oo = +oo, —oo — oo = — oo, ±oo + a = ±oo for a e M). 6.67. Determine (a) / sin x dx; (b) / dx . x*+x2 ' (d) / % -l Solution. Case (a). We can immediately determine oo / sinx dx = [— cosx]^° = lim (— cosx) + cos 1. well. Fortunately, power series will belong there as well. Then we'll also give some thoughts to alternative concepts of integration that work more satisfyingly even for wider classes of functions 6.38. Examples of nasty sequences. (1) First consider the functions fnix) = (sinx)" on intervalu [0, it]. The values of these functions will be nonnega-tive and lesser than one at all points 0 < x < it, except for x — j, where the value is 1. Hence on the whole interval [0, it], these functions will converge to the function fix) = lim fnix) = 0 for all x / f 1 forx = j. point by point. Clearly, the limit of the sequence of functions /„ is a noncontiguous function, even though all functions /„ (x) are continuous. The problematic point is even an inner point of the interval. We can find the same phenomenon for series of functions, because the sum is the limit of partial sums. Hence in the previous example, it suffices to express /„ as the n-t partial sum. For example, fiix) — sinx, f2ix) — (sinx)2 — sinx, etc. The left figure plots the functions fmix) for m — n3,n — 1, ..., 10. (2) Let's now look at the second question, i.e. badly behaving derivatives. Quite natural idea on the same principle as above is constructing a sequence of functions which will always have the same nonzero derivative at one point, but they will become smaller and smaller, so they will pointwise converge to an identic zero function. 
The previous figure on the right plots the functions /„(x)=x(l-x2)" on interval [—1, 1] for values n — m2, m — 1, glance, it's clear that 10. At first lim /„ (x) = 0 and all functions /„ (x) are smooth. Their derivative at the point x — 0 is /„'(0) = ((1 -x2)" -2«x2(l -x2)""1)!^ = 1 no matter the n. But the limit function for the sequence /„ has a zero derivative at every point of course! (3) We've already seen the counterexample to the third statement in 6.32. The characteristic function xq of rational numbers can be expressed as a sum of countably many functions, which will be numbered exactly by rational numbers and will be zero everywhere except for the single point after which they are named for, 383 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS Because the limit on the right hand side doesn't exist, the integral os-cilates. Cases (b), (c). Analogously, we can easily compute dx x*+xl dx 2(x2+l) 1 v ' 1 1 J\ I 1 1+x2 dX r i i < ["I-arctgxJ lim (-i - arctgx) + \ + arctg l=0-f + l + | = l- f and even more easily /§ = [2V*t = 4-0 = 4, o where the primitive function is continuous from the right side at the origin (thus the limit equals the value of the function). Case (d). If we'd mindlessly compute -l -1 l r dx_ J X2 -1 2M 0 2 • oo = +oo. 6.68. Compute the definite integrals (a) /o°° dx' (b) f^2 In I x I dx; (°) h — dx> (d) f:x ^ dx; r2 1 (e) It ' 1 x In x Solution. We have (a) dx. dx (x+2)5 [(x+2)-^ (Ämo(x + 2)"4-2"4) = -Ho-^) = ^ (b) 2 0 2 2 / In I x I dx = f In | x \ dx + f In | x \ dx = 2 f In x dx -2-2 0 0 F(x) = lnx F'(x) = \ G'(x) = 1 G(x) = x 2 I [x lnx]2, — / 1 dx where they value will be 1. Riemann integrals of all such functions will be zero, but their sum is not a Riemann integrable function. This example illustrates the fundamental flaw of the Riemann integral, which we'll come back to later. We can easily also find an example when the limit function / is integrable, all functions /„ are continuous, but the value of the integral still isn't the limit of the values of integrals of /„. It suffices to slightly change the sequence of functions which we used above: fn(x) =2nx(l -x2)". We can easily verify that the values of these functions also converge to zero for every x e [0, 1] (for example we can see that \n(fn(x)) -+ -oo). But we'd receive an obviously wrong result (a negative value while integrating a positive function). The reason why the Newton-Leibniz formula cannot be applied in this way is the discontinuity of the given function at the origin. But if we use the additivity rule b c b f f(x)dx = f f(x)dx + f f(x)dx, a a c which always holds, if the integrals on the right hand side are sensible, we'll find the correct result 101 n 1 C dx_ — C dx_ \ C dx_ — r_n _i_r_ii — J x2 — J x2 ~t~ J x2 — L jcJ-i + L xk - -1 -1 0 lim - 1 - 1 - lim = oo - 2 + oo = +oo. Note that the even character of function y = x~2 also implies Jo fn (x) dx 1 1/0. 6.39. Uniform convergence. An obvious reason of failure in all three previous examples is the fact that the speed of pointwise convergence of values f„(x) —>• f(x) varies dramatically point from point. Hence a natural idea is to limit ourselves to cases where the convergence will have roughly the same speed all over the interval Uniform convergence [___ □ 1 Definition. We say the sequence of functions /„ (x) converges uniformly on interval [a, b] to a limit fix), if for every positive number e, there exists a natural number N e N such that for all n > N and all x e [a,b] the inequality \fn(x) ~ f(x)\ < S holds. 
We say a series of functions converges uniformly on an interval, if the sequence of its partial sums converges uniformly. Albeit the choice of the number N depends on the chosen e, it's independant on the point x e [a,b]. That's a difference from the pointwise convergence, where N depends on both e and x. We can visualise the definition graphically in this way: if we consider a zone created by a translation of the limit function / (x) to / (x) ± s for arbitrarily small, but fixed positive e, all of the functions /„ (x) will fall into this zone, except for finitely many of them. Clearly the first and the last of the previous cases didn't have this property; at the second case, the sequence of derivatives f'n lacked it. The following three theorems can be briefly summed up by a statement that all three generally false statements in 6.37 are true for uniform convergence (but beware of the subtilities when differentiating). 6.40. Theorem. Let fn(x) be a sequence of functions that are continuous on interval [a, b], which converges uniformly to function f(x) on this interval. Then f(x) is also continuous on interval [a, b]. Proof. We want to show that for an arbitrary fixed point xq e [a, b] and any fixed small e > 0, the inequality \f(x)-f(xo)\ 0 we have \fn(x) ~ f(x)\ < S for all x € [a, b] and all sufficiently large n. Choose some n with this property and consider S > 0 such that \fn(x) ~ fn(xo)\ < S for all x in <5-neighbourhood of xo (that's possible, because all fn(x) are continuous). Then \f(x) - f(x0)\ < \f(x) - fn (x)| + \fn(x) - fn(x0)\ + \fn(xo) - f(xo)\ < 3e for all x in our chosen <5-neighbourhood of x$. □ 6.41. Theorem. Let fn(x) be a sequence of Riemann integrable functions on a finite interval [a, b] which converge uniformly to function f (x) on this interval. Then f(x) is Riemann integrable as well and lim / f„(x)dx— I I lim f„(x))dx — I f(x)dx. The proof of this theorem is based upon a generalization of properties of Cauchy sequences of numbers to uniform convergence of functions. This way we can work with the existence of the limit of a sequence of integrals without needing to know it. ___I Uniformly Cauchy sequences |____- Definition. We say the sequence of functions /„ (x) on interval [a, b] is uniformly Cauchy, if for every (small) positive number e, there exists (large) natural number N such that for all x € [a, b] and all n > N, the inequality \fn(x) ~ fm(x)\ < £ holds. Clearly every uniformly convergent sequence of function on interval [a, b] is also uniformly Cauchy on the same interval; it suffices to notice the usual bound \fn(x) - fm(x)\ < \fn(x) - f(x)\ + \f(x) - fm(x)\ based on triangle inequality. This observation will now suffice to prove our theorem, but first we'll stop at a convinient converse statement: Proposition. Every uniformly Cauchy sequence of functions fn (x) on interval [a, b] uniformly converges to some function f on this interval. Proof. The condition for a sequence of functions to be Cauchy implies that also for all x e [a,b], the sequence of values /„ (x) is a Cauchy sequence of real (eventually complex) numbers. Hence the sequence of functions /„ (x) must converge pointwise to some function f(x). We'll show that in fact, the sequence f„(x) converges to its limit uniformly. Choose N large enough so that \fn(x) ~ fm(x)\ < £ 385 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS Solution. 
We'll first solve this problem by the substitution method and then repeatedly apply integration by parts, yielding J2 | oo 1 ff e~ydy = y = xr _ dy = 2x dx | 2 Q F'iy) = nf~l G(y) = -e-y f x2n+l e~x dx o ■ F(y) = f G'iy) = e-y [-/ e~y]™ + nff-1e-ydy \ = \ f f~l e~y dy F(y) = f~l G'(y) = e-y F'(y) = (n- l)f~2 G(y) = -e-y oo I ( [-f-1 e->]~ + (n-l)f f~2 e~y dy e-ydy = ... = nÜLf^l f ye-y dy F(y) = y G'(y) = e-y F'iy) = i G(y) = -< ni([-y-y]7 + f'-3dy ni [-e-yE 00 «! 2 ■ □ 6.71. In dependancy on a e M+ determine the integral f0 j% dx. O F. Lengths, areas, surfaces, volumes 6.72. Determine the length of the curve given parametrically: x = sin21, y = cos21, for? e [0, f]. Solution. According to ||??||, the length of a curve is given by the integral V(x'(0)2 + (/(O)2 dt = / V(sin202 + (-sin202 dt V2sin2f dt = V2. If we reaUze that the given curve is a part of the line y = 1 — x (since sin21 + cos21 = 1) and moreover the segment with boundary points [0, 1] (for t = 0) and [1, 0] (for t = |), we can immediately writw its length V2. □ 6.73. Determine the length of a curve given parametrically: x = t2, y = r3 for? e [0, V5]. Solution. We'll again determine the length i by using the formula 1991 f J4t2 + 9t4 dt = I t^9t2+Adt Jo Jo 1 r5 .- 2 3 , 335 for some small positive e chosen beforehand and all n > N, x € [a, b]. Now choose one such n and fix it, then we have \f„(x)-f(x)\ for all x € [a, b]. lim \f„{x) - fm(x)\ < £ □ Proof of the Theorem. Recall that every uniformly convergent sequence of functions is also uniformly Cauchy and that the Riemann sums of all single terms of our sequence converge to rb J a f" W ^x independently on the choice of the partition and the representants. Hence, if we have \fn(x) for all x € [a, b], then also fm(x)\ < E rb rb / /„ (x) dx -Ja Ja fm (x) dx < e\b ■ Therefore the sequence of numbers f„ (x) dx is Cauchy, hence convergent. Also because of the uniform convergence of the sequence /„ (x), the same must be true for the limit function fix) (its Riemann sums are arbitrarily close to the Riemann sums of the functions /„ for sufficiently large n), so the limit function fix) will again be integrable. Moreover, pb pb / /„ (x) dx - I . Ja Ja , (x) dx — I fix) dx so it must be the correct limit value. < e\b ■ □ For the corresponding result about derivatives, we need to take extra care regarding the assumptions: 6.42. Theorem. Let fn(x) be a sequence of functions differen-tiable on interval [a, b] and assume fn(xo) —>• f(xo) at some point xq € [a, b]. Moreover, let all derivatives gn (x) = f'n (x) be continuous and let them converge uniformly to function g(x) on the same interval. The the function fix) = f* git) dt is also differentiate on interval [a, b], the functions fn(x) converge to fix) and fix) = g(x). Proof. If we consider functions /„ (x) = /„ (x) — /„ (xo) instead of fix), the assumptions and conclusions in the theorem will be valid or invalid for both sequences and the same time. Hence without loss of generality we can assume that all our functions satisfy /„ (xq) — 0. Then for all x e [a, b], we can write fr n (x) — I gn Jx0 it) dt. Because the functions g„ uniformly converge to function g on whole [a, b], the functions /„ (x) converge to function fix) f git)dt. Because function g is a uniform limit of continuous functions, it is again a continuous function, thus we have proved all that was needed, see 6.24 about the Riemann integral and a primitive function. □ For infinite series, we can sum up the previous conlusions in this way: □ 6.43. Corollary. 
Consider functions fix) on interval [a, b]. 386 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS 6.74. Determine the area to the right of the line x = 3 and bounded by the graph of a function y = ^-j- and the x axis. Solution. The are is given by the improper integral J3°° -^-j-dx. We'll compute it using decomposition into partial fractions: 1 x3 - 1 1 x = 1 xu : 1 = C-B x : 0 = A + C Ax + B C +- x2 + x + 1 1 (Ax + B)(x - 1) + C(x2 + x + 1), 1 c = r 2 B = —, 3 1 A = — 3 (1) If all functions fn (x) are continuous on [a, b] and the series oo S(x) = ^2fn(x) n = l uniformly converges to function S(x), then S(x) is continuous on [a, b]. (2) If all functions fn (x) are continuously differentiable on interval [a, b], the series S(x) — YlnLi fn(x) converges for some xq € [a, b] and the series T(x) — Yl^Li fn(x) converges uniformly on [a, b], then the series S(x) converges, it is continuously differentiable on [a, b] and S* (x) = T(x), i.e. / OO \ i oo (£/«(*)) = £/»• \ = 1 ' n = l (3) If all functions f„ (x) are Riemann integrable on [a, b] and the series and we can write r ' il = ir(J— -i+2 ) J3 x3-l 3J3 \x-l x2+x + \J dx. Now we'll seperately determine the indefinite integral / xi+x\\ ^x / / x +2 x2 + x + 1 dx x + (* + 5>2 + ! dx + 3 r l 2j (jc + I) ■ dx 2^+4 substitution at the first integral t = X2 + X + 1 At = 2(x + \) dx 2 J t 2 J (jc + I; )2 + ^ 1> ^ 4 substitution at the first integral S=X + \ As = Ax 1 - 3 - ln(x2 + x + 1) + - 1 9 - ln((x2 + x + 1) + J s2 + l 34 r l As substitution at the second integral U = ff Au = -Xs As V3 - ln(xz + x + 1) + 2 / ~T~~—7 d« = J u2 + l - ln(x2 + x + 1) + V3 arctan(w) = 1 9 /2x + l\ - ln(x2 + x + 1) + V3 arctan -— . 2 V V3 / S(x) = £/„(x) n = \ uniformly converges to function S(x) on [a, b], then S(x) is integrable on [a, b] and / £/«(*) k* = £ / f» Ja \ = l 7 n = lJa (x) dx. 6.44. Test of uniform convergence. The simpliest way to find out whether a sequence of functions converges uniformly is a comparison with absolute converts gence of a suitable sequence of numbers. This is often called the Weierstrass test. Suppose we have a series of functions /„ (x) on interval I — [a, b] and we have a bound \fn(x)\ <%el for suitable real constants a„ and for all x € [a, b]. We can immediately put a bound on the differences of the partial sums (x) = E /«w «=i for distinct indices k. For & > m we get k*(x) - Sm(x)| = n=m + l E ^ E i/»(x)i< E a*- n=m + l If a series of (nonnegative) constants 2~2T=i a" is convergent, then of course the sequence of its partial sums is Cauchy. But we have just verified that in that case the sequence of partial sums s„ (x) will even be uniformly Cauchy. Thanks to the statement proven above in 6.41 we just proved the following Theorem (Weierstrass test). Let fn (x) be a sequence of functions defined on interval [a, b] with \fn(x)\ < an e M. If the series of numbers Y^Li a" convergent, then the series S(x) — Z~2^Li fn(x) converges uniformly. 387 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS In total, for the improper integral we can write: 1 1 dx 1 r 1 9 r (2x + 1 - lim In \x — 11--ln(x + x + 1) — v 3 arctan I --=— 3 |_ 2 V V3 1/1 1 , r (28 + \\ - lim - In \8 - 1| - - ln(<52 + 8 + 1)- V3 arctan —— 3 &^oo \3 2 V V3 / 1 1 73 7 — In 2 H— In 13 H--arctan —= 3 6 3 V3 1 1 V3 7 -In 13--In 2 H--arctan —= — 6 3 3 V3 1 3 s- lim In 1 Vx2 + x + 1 7 1 r (28 + 1 - lim v 3 arctan --=— 3 <5—» oo V ^3 - In 13 H--— arctan —=--m2--it. 6 73 V3 3 6 □ 6.75. 
Determine the surface and volume of a circular paraboloid created by rotating a part of the parabola y = 2x2 for x e [0, 1] around the y axis. Solution. The formulas stated in the texts are true for rotating the curves around the x axis! Hence it's necessary either to integrate the given curve with respect to variable y, or to transform. -2 „ V / Jo ■ Ax = it S = 2it f + — Ax =2it f J- + — Ax Jo V 2V 8x J0 V2 16 7T- 17vT7 - 1 "24 ' □ 6.76. Compute the area 5 of a figure composed of two parts of plane bounded by lines x = 0, x = l,x = 4, the x axis and the graph of a function y fx-l Solution. First realize that fx-l < 0, x e [0, 1), /jc—1 and lim -oo, lim > 0, xe(l,4] +oo. ,1_ jx-1 x^l + V*-l The first part of the figure (below the x axis) is thus bounded by the curves y = 0, x = 0, x = 1, y = -yL= with an area given by the improper integral fx-l dx: while the second part (above the x axis), which is bounded by the 6.45. Consequences for power series. The Weierstrass test is very useful for discussing power series S(x) — ^ a„ (x - x0)" «=0 centered at point x$. During our first encounter with power series we showed in jgu 5.47 that each such series converges on (xo — S, xo + S), where the so called radius of convergence S > 0 •Vv ' can also be zero or oo. (see 5.51). In particular, in the proof of the theorem 5.47, we used a comparison with a suitable geometric series to verify the convergence of the series S(x). By the Weierstrass test, the series S(x) converges uniformly on every compact (i.e. finite) interval [a, b] belonging to the interval (xo — S, xo + S). Thus we proved this: Theorem. Every power series S(x) is continuous and continuously differentiable at all points inside its interval of convergence. The function S(x) is also integrable and differentiating and integrating can be done term by term. In fact, the so called Abel's theorem states the power series are continuous even in boundary points of their domain (including eventual infinite limits). We won't prove it here. Just proven pleasant properties of power series also point at the boundaries of their useablity when simulating dependences of some practical events or processes. In particular, it's not possible to simulate sequentially continuous functions very well by using power series. As we' 11 see in a moment, for specific needs it's possible to find better sets of functions /„ (x) than the values /„ (x) — x" . The best known examples are the Fourier series and the so called wavelets which we'll discuss in the next chapter. 6.46. Laurent series. In the context of Taylor expansions let's look at a smooth function f(x) — e_1/* from paragraph 6.6. We've seen it's not analytic at zero, because all its derivatives are zero there. So while at all other points xo this function is given by convergent Taylor series with radius of convergence r — \x$\, at the origin the series converges at only one point. But if we substitute the expression — 1/x2 for x in the power series for e*, we get a series of functions S(x) = V-(-D"x -In E (_1)N x2", «=o which will converge at all points except for x / 0 and gives us a good description of behavior near the exceptional point x — 0. Thus it seems useful to consider the following more general series quite similar to the power ones: ___| Laurent series [___ A series of functions of the form S(x) - ^2 an(x - x0)n is called a Laurent series centered at xq. We call the series convergent if both its parts with positive and negative exponents converge separately. curves 388 CHAPTER 6. 
DIFFERENTIAL AND INTEGRAL CALCULUS y = 0, 1, has an area of Since 4 /■ 1 4, y dx. . , |V(x-D2 + c, /Jt —1 z dx the sum Si + S2 can be gotten as (l^W -§)+ ^Hm (| V9 - f^l), |(l + V9). lim We have shown among other things, that the given figure has a finite area, even though it's unbounded (both from the top and the bottom). (If we approach x = 1 from the right, eventually from the left, its altitude grows beyond measure.) Recall here the indefinite expression of type 0 • oo. Namely, the figure is bounded if we limit ourselves to x e [0, 1 - S] U [1 + S, 4] for an arbitrarily small S > 0. □ 6.77. Determine the avarage velocity vp of a solid in the time interval [1, 2], if its velocity is v(t) = -?f—, t€[l,2]. Omit the units. Solution. To solve the problem, it suffices to realize that the sought avarage velocity is the mean value of function i; on interval [1,2]. Hence 2 5 vP = 2TT / "7= dt = f t1^ dx = V5 - V2, with 1 + t2 =x,tdt Ji+t2 dx/2. □ 6.78. Compute the length s of a part of the curve calles tractrix given by the parametric description fit) = rcost + rln(tg|) , g(t)=rsiat, te[jv/2,a], where r > 0, a € in/2, jt). Solution. Since fit) -r sin? + 2tg j -cos2 j -r sin t + fit) = r cos t on interval [tv/2, a], for the length s we get jt/2 r2 cos4 t + r2 cos2 t dt dt f S2SI dt J sin t Tl/2 Tl/2 -r [In (sin 0]^2 -r In (sin a) □ 6.79. Compute the volume of a solid created by rotation of a bounded surface, whose boundary is the curve x4 — 9x2 + y4 =0, around the x axis. Solution. If [x, y] is a point on the x4 — 9x2 + y4 =0, clearly this curve also intersects points [—x, y], [x, —y], [—x, —y]. Thus is symmetric with respect to both axes x, y. For y = 0, we have x2 ix — 3) ix + 3) = 0, i.e. the x axis is intersected by the boundary The purpose of Laurent series can be seen at rational functions. Consider such function Six) = f(x)/g(x) with coprime polynomials / and g and consider a root xo of polynomial g(x). If the multiplicity of this root is s, then after multiplication we get function Six) = S(x)(x — xqY, which will now be analytic on some neighbouthood of the point xq and therefore we can write Six) = Cl-l ix - x0y oo ^ a„(x x0 + a0 + «i (x — x0) + x0)". Now consider seperate parts Six) = S- + S+ = an (x — xo)" + an (x — xo)". « = -oo n=0 As for the series S+, Theorem 5.47 implies that its radius of convergence R is given by the equality R~x = lim sup tf\a„\. If we apply the same idea to the series S- with 1/x subtituted for x though, we'll find out the series S- ix) converges for |x — xo | > r, where r 1 = lim sup ^J\a-n |. These notions remain completely true even for complex values of x substituted into our expressions. Theorem. A Laurent series Six) centered at xo converges for all x € C satisfying r < |x— xq\ < R and diverges for all x satisfying |x — xo| < r or \x — xq\ > R. Hence we can see the Laurent series need not converge at any point at all, because we can have values R < r. But if we look for example at the above case of rational functions expanded to Laurent series at some of the roots of the denominator, then clearly r = 0 and therefore, as expected, it will really converge in the punctured neighbouthood of this point xo, while R will be given exactly by the distance to the closest root of the denominator. In case of our first example, for the function e-1^ we have r = 0 and R = oo. 6.47. Numerical approximation of integration. 
Just like at the end of the previous part of the text (see paragraph 6.17), we'll now use the Taylor expansion to propose as good and simple approximations of integration as possible. We'll work with an integral / = rb Ja fix)dx of analytic function fix) and a uniform partition of the interval [a, b] using points a = xq, x\, ..., x„ = b with distances Xi■ — Xi-i = h > 0. We'll denote the points in the middle of the intervals in the partitions by xr-+i/2 and the values of our function at the points of the partition by /(*/) = f. We' 11 compute the contribution of one segment of the partition to the integral by the Taylor expansion and the previous theorem. We intentionally integrate symmetrically around the middle values so that the derivatives of odd orders cancel each other out while 389 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS curve at points [—3,0], [0, 0], [3,0]. In the first quadrant, it can then be expressed as a graph of the function f(x) = ^9x2 -x4, x e [0, 3]. The sought volume is thus a double (here we consider x > 0) of the integral / Jtf2ix) dx = it f V9x2 — x4 dx. o o Using the substitution t = V9 — x2 ixdx = —tdf), we can easily compute 3 _ 3 _ 0 / V9x2 - x4 dx = fx ■ V9 - x2 dx = - J t2 dt = 9, 0 0 3 and receive the result 18tt. □ 6.80. Torricelli's trumpet, 1641. Let a part of a branch of the hyperbola xy = 1 for x > a, where a > 0, rotate around the x axis. Show that the solid of revolution created in this manner has a finite volume V and simultaneously an infinite surface S. Solution. We know that + 0o +0o V = Tt f (i) dx = TT f -jj dx = 77" lim '^►+00 K) and +00 S = 2n f i J x +00 +00 1 + (-ji) dx =2jt f dx > 2jt f \dx 2n I lim In x — In a +00. The fact the the given solid (the so called Torricelli's trumper) cannot be painted with a finite amount of color, but can be filled with a finite amount of fluid, is called Torriccelli's paradox. But realize that a real color painting has a nonzero width, which the computation doesn't take into account. For example, if we would paint it from the inside, a single drop of color would undoubtedly "block" the trumpet of infinite length. □ Another problems about computing lengths of curves, areas of plane figures and volumes of parts of space can be found on page 413. 6.81. Apllications of the integral criterion of convergence. Now let's get back to (number) series. Thanks to the integral criterion of convergence (see 6.33), we can decide the question of convergence for a wider class of series: Decide, whether the following sums converge of diverge: 00 a) E "f, « = 1 00 b) E£- «=1 Solution. First notice, that we cannot decide the convergence of none of these series by using the ratio or root test (all Umits lim | | and «^00 lim zfchi equal 1). Using the integral criterion for convergence of series, we obtain: a) í 1 x In (x) ■ dx -dt = lim [ln(r)]o = 00, 0 t &^oo integrating: cA/2 / fixi + 1/2 + t)dt = / Y)-/("Wl/2)<" )dt J-h/2 J-hl2\r7^) ' 00 / /-A/2 1 \ =e(/ y{k\xi+v2Ýdt\ t-L\J-h/2k\ J 00 h2k+l En_f (2k) f \ 22H2k + \)\J (Xi+1'2)- k=0 A very simple numerical approximation of integration on one segment of the partition is the so called trapezoidal rule, which uses the area of a trapezoid given by the points [x;, 0], [x;, f], [0, xt+i], [xt+i, f+i] for approximation. This area is Pi = l-ifi+fi+i)h so in total we can approximate the integral I by value n-l h /uch = e * = 2(/o+2/1 +"'+2fn~i + fn)- We'll now compare /nch with the exact value of I computed by contributions over seperate segments of the partition. 
We can express the values f by middle values and derivatives //+i/2 m this way: h 1 h2 n fi + \/2±\/2 = fi + \/2 ± i:fi + l/2 + T^yif (l + l/2) 2\22' h3 ± 3!23/(3)(/ + 1/2) + ---' so for the contribution P, to the approximation we get Pi = \(fi + fi+i)h = h(fi+l/2 + ^f'd + 1/2)) + Oih5). >From here we get an estimations of the error I — 7iicn over one segment of the partition h h2 Ai = h(fi+l/2 + -f!'+l/2 - fi+l/2 - -//;i/2 + Oih4)) -f!'+l/2 + Oih5). h3 12 The total error is thus estimated as / " /lich = J^nh3f" + n °^5) = - ^f" + °^4) where /" represents the approximation of the second derivative of If the linear approximation of the function over the seperate segments doesn't suffice, we can try can an approximation by a quadratic polynomial. To do that, we'll always need three points, so we'll work with segments of the partition in pairs. Suppose n — 2m and consider x, with odd indices. We'll require fi+l = fix, +h) = fi+ah + fJh2 fi-l = fixt-h) = f-ah + ph2 which gives (see the similarity to the difference for approximating the second derivative) 1 P = -^.(fi+i + fi-l ~ 2fi)- 390 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS hence the given series diverges. b) 1 -rr dx XL lim hence the given series converges. □ 6.82. Using the integral criterion, decide the convergence of series E n = l 1 (« + l)ln2(« + l) Solution. The function /(*) x e [1, +oo) (x + \)\Y?(x + \), is clearly positive and nonincreasing on its whole domain, thus the given series converges if and only if the integral f^°° f(x)dx con- verges. By using the substitution y dx/(x + 1)), we can compute +0o +0o In (x + 1) (where dy f i (x+l)ln2(x + l) dx = f -i dy = ±. In 2 Hence the series converges. G. Uniform convergence 6.83. Does the sequence of functions □ x e n € N yn = e^ converge uniformly on M? Solution. The sequence {y„}„ef>] converges pointwise to the constant function y = 1 on R, since lim e4«2 1, x e e > 2 for all n e N But the computation y„ (yin^j implies that it's not a uniform convergence. (In the definition of uniform convergence, it suffices to consider s € (0, 1).) □ 6.84. Decide whether the series •Jx-n E n = l n*+x2 converges uniformly on the interval (0, +oo). Solution. Using the denotation fn(x) we have _ ^fx-n - ^pf+I2' n(nA-3x2) x > 0, n e N, x > 0, n e N. 2VI(«4+*2) From now on, let n e N be arbitrary. The inequalities f'n(x) > 0 for and f'n(x) < 0forx e imply that the maximum of function /„ is attained exactly at the point x = n2/*j3. Since V27 4«2 E n = l 721 4«2 727 4 The area of approximation of the integral over two segments of the partition between x, _ i and x,+1 is now estimated by the expression /h 2 ft + at + fit2 dt = 2hft + -fin3 -h 3 = 2hfi + ^(fi+i+fi-i-2fi) 6 = jM+i + fi-i-2fi)- This procedure is called Simpson's rule. The whole integral is now approximated by /Simp = jh(f0 + hn + 4 fk + 2 E /*)■ odd k even ^ Similarly to the procedure above we can derive that the total error is estimated by / " /simp = -^(b - a)A4/(4) + 0(h5), where /(4) represents the approximation of the fourth derivative of /• By the end of this chapter, we'll stop at other concepts of integration. First we'll show a modification of the Riemann integral, which will later be useful in notions about probability and statistics. We'll mostly stay in an area of notions and comments though, readers interested in a thorough explication will need to find another sources. 6.48. Riemann-Stieltjes integral. 
In our idea of integration as summing infinitely many linearized (infinitely) small increments of the area given by a function fix) we omitted the possibility that for different values of x we would take the increments with different weights. This could be surely arranged at the infinitesimal level by interchanging the differential dx for 0 there exists a norm of the partition S > 0 such that for all partitions S with norm lesser than S, we have 15S - /| < e. For example, if we choose g(x) on interval [0, 1] as a sequentially constant function with finitely many discontinuities ci,... ,Ck and "jumps" at = lim g(x) - lim g(x), x—y ci _|_ x—yc[ — then the Riemann-Stieltjes integral exists for every continuous fix) and equals »1 k / f(x)dg(x) = y2aif(ck). Jo ~i By the same technique we used for the Riemann integral, we can now define upper and lower sums and uppoer and lower Riemann-Stieltjes integral, which have the advantage that for bounded functions they always exist and their values coincide if and only if the Riemann-Stieltjes integral in the above sense exists. We already encountered problems with Riemann integration of functions that were "too jumpy". Technically, for function g(x) on a finite interval [a, b] we define its variation by sup Igte) S r = l ■ g(Xi-l)\, where we take the supremum over all partitions S of the interval [a, b]. If the supremum is infinite, we say g(x) has an unbounded variation on [a, b], otherwise we say g is a function with a bounded variation on [a, b]. Similarly to the procedure for the Riemann integral, we can quite easily derive the following: Theorem. Let fix) and g(x) be real functions on a finite interval [a, b]. (1) Ifgix) is decreasing and continuously differentiable, then the Riemann integral on the left side and the Riemann-Stieltjes integral on the right side both exist simultaneously and their values are equal fb pb f(x)dg(x) f f(x)g\x)dx= f Ja Ja (2) If fix) is continuous and g(x) is a nondecreasing function rb with a finite variation, then the integral Ja f(x)dg(x) exists. 6.49. Kurzweil integral. The last stop will be a modification of \^ the Riemann integral, which fixes the unfortunate be-\ havior at the third point in the paragraph 6.37, i.e. the limits of the nondecreasing sequences of integrable functions will again be integrable. Then we will be able to interchange the order of the limit process and integration in these cases, just like with uniform convergence. First notice what's the essence of the problem. Intuitively we should assume that very small sets must have a zero size, and thus the changes of values of the functions on such sets shouldn't influence the integration. Moreover, a countable union of such "negli-_gihlq for Jhe purpose of integration" sets should have a zero size again. Surely we would expect that for example the set of rational 392 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS Of course, the convergence of the series at points ±1 can be veri- fied directly. It's even possible to directly deduce that E ^ «=i (n + D (by writing out ^ (n + D n + 1" 1 □ 6.86. Sum of a series. Using theorem 6.41 "about the interchange of a limit and an integral of a sequence of uniformly convergent functions", we'll now add the number series oo 1 T — n = \ We'll use the fact that f xn+l 1 «2" ' Solution. On interval (2, oo), the series of functions E^Li ^rr converges uniformly. 
That is implied for example by the Weierstrass test: each of the function is decreasing on interval (2, oo), thus their values are at most ^n-; the series Y^T= :1 2" + ! is convergent though 2«+l ' (it's a geometric series with quotient -]). Hence according to the Weierstrass test, the series of functions E^Li ^rr converges uniformly. We can even write the resulting function explicitly. Its value at any x € (2, oo) is the value of the geometric series with quotient 7, so if we denote the limit by fix), we have fix) Y — z—i Yn + 1 « = 1 1 1 X2 1 - 1 x(x — 1) By using (6.43) (3), we get Ax '2 ~ 1 _ ™ r°° dx nln ~ ^ J2 xn+1 n=\ n=\ 2 /•oo / 00 j A \«=i -jf dx x(x -<5 dx f 1 1 lim /--- <5^oo J2 X — 1 X dx lim [(ln(<5 - 1) - ln(<5) - ln(l) + In2] lim 0 we can find a covering of set A by a countable system of open intervals Jt,i — 1,2,... such that ^m(Ji) < s. In the following, by the statement "function / has the given property on set B almost everywhere" we'll always mean the fact that / has this property at all points except for a subset A c B of zero measure. For example, the characteristic function of rational numbers is zero almost everywhere, a sequentially continuous function is continuous almost everywhere etc. We'd now like to modify the definition of the Riemann integral so that when choosing the partition and the corresponding Riemann sums, we would be able to eliminate the omnious effect of the values of the integrated function on a before known set of zero measure. It also seems reasonable to try to guarantee that the segments in the given partitions with representants would have the property that near points of such set, they would be controllably small. A positive real function S on a finite interval [a, b] is called a calibre. We call a partition S of interval [a, b] with representants & <5-calibrated, if we have ^ - Kb) < xi-i < Hi < xi < Hi + S(Hi) for all i. For further procedure, it's essential to verify that for every calibre S, a ^-calibrated partition with representants can be found. This statement is called Cousin's lemma and can be proven for example in the usual way based upon the properties of supremas. For a given calibre S on [a, b], we'll denote by M the set of all points x € [a, b] such that a ^-calibrated partition with representants can be found on [a, x]. Surely M is nonempty and bounded, thus it has a supremum s. If s / b, then we could find a calibrated partition with a representant at s, which leads to a contradiction. Now we can define a generalization of the Riemann integral in this way: Definition. Function / defined on a finite interval [a, b] has its Kurzweil integral -f J a f(x) dx, if for every e > 0, there exists a calibre S such that for every 8-calibrated partition with representants S, the inequality | Se — 11 < e holds for the corresponding Riemann sum Se ■ 6.50. Properties of the Kurzweil integral. First notice that when \\ defining the Kurzweil integral, we only bounded the set of all partitions, for which we take the Riemann sums into account. Hence if our function is Riemann W integrable, then it must also have the Kurzweil integral and these two integrals are equal. 393 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS which can be seen for example from the ratio test for convergence of series: lim ř7=>oo an + l lim ř7=>oo (n + 1)2 -(«+1 n2n ,. ln + 1 1 hm--= -. «=>°o 2 n 2 In total, according to (6.43) (3), we have /•In 3 /-In 3 00 / f(x) áx = / ^Tne-nx Jin 2 /•in J w in 2 ^ 00 1 00 00 / 1 1 \ D--"ti = E(^-^) = i /hT"* dx 1 _ 1 2 ~ 2" □ 6.88. 
6.88. Determine the following limit (give reasons for the procedure of computation):

lim_{n→∞} ∫_0^∞ cos(x/n) / (1 + x/n)^n dx.

Solution. First we determine lim_{n→∞} cos(x/n)/(1 + x/n)^n. The sequence of these functions converges pointwise, and we have

lim_{n→∞} cos(x/n)/(1 + x/n)^n = 1 / lim_{n→∞} (1 + x/n)^n = e^{−x} (see (||??||)).

It can be shown that the given sequence converges uniformly. Then, according to (6.41),

lim_{n→∞} ∫_0^∞ cos(x/n)/(1 + x/n)^n dx = ∫_0^∞ e^{−x} dx = 1.

We leave the verification of the uniform convergence to the reader (we only point out that the discussion is more complicated than in the previous cases). □

For the same reason, we can repeat the argumentation of Theorem 6.24 about the simple properties of the Riemann integral and again verify that the Kurzweil integral behaves in the same way. In particular, a linear combination c f(x) + d g(x) of integrable functions is again integrable, and its integral equals c ∫_a^b f(x) dx + d ∫_a^b g(x) dx, etc. To prove this, it only suffices to think through small modifications in the discussion of the refined partitions, which moreover should be δ-calibrated.

Analogously, for the case of monotone sequences of pointwise convergent functions, we can extend the argumentation verifying that the limits of uniformly convergent sequences of integrable functions f_n are again integrable, and the integral of the limit is the limit of the integrals of f_n.

Finally, the Kurzweil integral behaves the way we would like it to even with respect to sets of zero measure:

Theorem. Consider a function f on an interval [a, b] which is zero almost everywhere. Then the Kurzweil integral ∫_a^b f(x) dx exists and equals zero.

Proof. This is a nice illustration of the idea that we can get rid of the influence of the values on a small set by a smart choice of calibre. Denote by M the corresponding set of zero measure, outside of which f(x) = 0, and write M_k ⊂ [a, b], k = 1, 2, ..., for the subset of points at which k − 1 ≤ |f(x)| < k. Because all the sets M_k have zero measure, we can cover each of them by a countable system of pairwise disjoint open intervals J_{k,j} with an arbitrarily small sum of lengths, say Σ_j m(J_{k,j}) < ε 2^{−k}/k. Now define the calibre δ(x) for x ∈ J_{k,j} so that the whole interval (x − δ(x), x + δ(x)) is still contained in J_{k,j}. Outside of the set M, we define δ arbitrarily. In a δ-calibrated partition Ξ of the interval [a, b], only the representants ξ_i ∈ M contribute to the Riemann sum, each segment with a representant in M_k ∩ J_{k,j} is contained in J_{k,j}, and these segments overlap at most in their endpoints. Hence we can bound the corresponding Riemann sum by

|Σ_i f(ξ_i)(x_i − x_{i−1})| ≤ Σ_{k=1}^∞ k · Σ_j m(J_{k,j}) < Σ_{k=1}^∞ k · ε 2^{−k}/k = ε

(up to an irrelevant factor of 2 if some point of M serves as the representant of two adjacent segments), which can be made arbitrarily small. □

6.105. For x ∈ (0, 1), compute ∫ (... + 4 sin x − 5 cos x) dx. ○

6.113. Determine ... ○

6.106. Determine the indefinite integrals (a) ∫ arctg x dx; ... by using integration by parts. ○

6.107. By repeated use of integration by parts, for all x ∈ ℝ determine (a) ∫ x² sin x dx; ... ○

6.109. Using integration by parts, determine ∫ (2 − x²) eˣ dx on the whole real line. ○

6.110. Integrate (a) ∫ (2x + 5)^... dx, x ∈ ℝ; (b) ∫ ... dx, x > 0; (c) ∫ e^{−x³} x² dx, x ∈ ℝ; (d) ∫ 15 arcsin²x / √(1 − x²) dx, x ∈ (−1, 1); (e) ∫ ... dx, x > 0; (f) ...; (g) ∫ ... dx, x ∈ ℝ; (h) ∫ sin √x dx, x > 0, by using the substitution method. ○

6.111. For x ∈ (0, 1), by using suitable substitutions, reduce the integrals ∫ x √(...) dx and ∫ dx / ((x − 1)√(x² + x + 1)) to integrals of rational functions. ○

6.112. For x ∈ (−π/2, π/2), compute ∫ dx / (1 + sin² x) using the substitution t = tg x. ○

6.114. Compute (a) ∫ xⁿ ln x dx, x > 0, n ≠ −1; (b) ∫ ... dx, x ∈ ℝ. ○

6.115. For x > 0, determine (a) ∫ ...(2 + 5x)... dx; (b) ∫ ... dx; (c) ∫ ... dx.

Solution. All three given integrals are binomial, i.e., they can be written as ∫ x^m (a + b x^n)^p dx for some a, b ∈ ℝ, m, n, p ∈ ℚ.
The binomial integrals are usually solved by applying the substitution method. If p € Z (not necessarily p < 0), we choose the subtitution x = f, where s is the common denominator of numbers m a n: if 22±i e Z and p g Z, we choose a + &x" = z*, where s is the denominator of number p; and if + p e Z (p £ Z, ^ Z), we choose a + bx" = f x", where s is the denominator of p. In these three cases, a reduction to an integration of a rational function is guaranteed. Hence we can easily compute (a) p e Z x e J xz+4x+8 6.128. Compute the indefinite integral of the function i (x2+x+l)2 ' y — X £1. f , * , dx, jceM\{l,2}. J (x-\)(x-2)2 ' 6.132. Forx e (0, |), compute (a) / sin3x cos4x dx; (c) / 2 sin2 | dx; (d) / cos2x dx; (e) / cos5 x Vsinx dx; sin x ' O O o o o o o 408 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS 6.133. Let y = | x | on the interval I = [— 1, 1] and let S„ = (-l,-2=±,..., - ±,0, I 2=1 1) be a partition of the interval 7 for arbitrary neN. Determine Ssn, SUp and 5sn, inf (the upper and lower Riemann sum corresponding to the given partition). Based on this result decide if the function y = | x | on [— 1, 1] is integrable (in Riemann sense). o 6.134. Compute lim ^+^+-+7TTT. «^►00 " o 6.135. How many distinct primitive functions to function y = cos (lnx) does there exist on the interval (0, 10)? O 6.136. Give an example of a function / on th interval I = [0, 1] that doesn't have a primitive function on/. O 6.137. Using the Newton integral, compute tc (a) / sinx dx; 0 1 (b) / arctgx dx; 0 3jt/4 (c) f tx^ dx; J 1 +sin x ' -rc/4 e (d) / I In x I dx . l/e 6.138. Compute \ Vl+x2 1 3 fj^dx. 0 6.141. For example by repeated integration by parts, compute tc/2 f e2x cosx dx. 0 6.142. Determine 1 / x2 e~x dx. -1 o o 6.139. For arbitrary real numbers a < b determine b f sgnx dx. a Recall that sgnx = 1, for x > 0; sgnx = — 1, fpr x < 0; and sgnO = 0. O 6.140. Compute the definite integral o o 409 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS 6.143. Compute the integral 1 /' //, dx •> v/5-4jc (a) / e" dx In 2 (b) / £ dx. o 0 0 6.147. Order the numbers n/2 tt/2 1 A := / cosx sin2x dx, B := f sin2x dx. C := f —x5 5X dx. 6.151. With an error lesser than 1/10, approximately compute 2 f(x- 22^-i)mx dx. O using the substitution method. O 6.144. Compute o 6.145. Which of the positive numbers Jl/2 71 p := f cos7 a- dx, q '■= f cos2 x dx 0 0 is bigger? O 6.146. Determine the signs of these three numbers (values of integrals) 2 71 271 a := fx3 2X dx; b := f cosx dx; c := f ^ dx. o 0 0-1 D-.= f^dX + J^4dx + f^dx 2tz tt 10 by size. O 6.148. By considering the geometric meaning of the definite integral, determine 2 (a) / | x - 11 dx; -2 0,10 (b) / tgx dx; -0,10 2tz (c) / sinx dx. o o 6.149. Compute f1l\x\ dx. O 6.150. Determine i f x5 sin2 x dx. o 410 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS 6.152. Without using the symbols of differentiating and integrating, express I / 3t2 cost dt\ r2 / x4 + 3x3 +5x2 +4x +2 dx. 6.154. Compute the integral '5 sin t 1 — cos21 4 At. 6.155. Compute the integral "ln2 dx o e2x — 3ex 6.156. Compute: zl (i) f02 sin x sin 2x dx, (ii) / sin2 x sin 2x dx. 6.157. Compute the improper integral — oo +CX) (b) / f; 0 (C) / 2£!±vi dx. 0 1 (d) / In | x | dx. -l 6.158. Determine 3jt/2 /" COS x ^ 1+sinx 0 dx. 6.159. Compute the improper integral + 0o / 72-TTT dx. 6.160. Compute o with variable x e M and a real constant a, if we differentiate with respect to x. O 6.153. Compute the indefinite integral 1 o o o o o o o 411 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS +00 / e2x+ex+l dx. 6.161. 
By using the substitution method, compute 0 2 00 1 f x e~x dx; f ^-j- dx. -00 0 6.162. Compute the integrals 1 - n 4 _ n +°° - n f eJl dx- f ^— dx- f ^— dx ■J VI ' J VI ' J V? 0 1 4 did.?. Find the values of a e M, for which +00 (a) / % € M; 1 (b) / £ e M; 0 +00 (c) / sin ax dx e M. 6.764. For which p, q € R is the integral +00 r _dx J xP ] (b) / £3 e — 00 +CX) (c) / A e 1 E (-1)" o o o o ' In? x 2 finite? O 6.165. Decide, if the following is true: +00 —00 +00 o 6.766. Approximately compute cos with an error lesser than 10~5. O 6.167. For a convergent series V^+100' estimate the error of the approximation of its sum by the partial sum 59999. O 6.168. Without computing the derivatives, determine the Talor polynomial of degree 4 centered at x0 = 0 of function f(x) = cosx — 2sinx — In(1 + x) , x e (—1, 1). Then decide if the graph of function / in neighbourhood of the point [0, 1] is above or below the tangent line. O 6.169. By using differentiation, obtain the Taylor expansion of function y = cosx from the Taylor expansion of function y = sin x centered at the origin. O 412 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS 6.170. Find the analytic function whose Taylor series is JC ^ JC i ~^ JC JC i * * * j for* e [-1,1]. O 6.171. From the knowledge of the sum of a geometric series, derive the Taylor series of function y = 5+2x centered at the origin. Then determine its radius of convergence. O 6.172. Expand the function to a Taylor series centered at the origin. O 6.173. Expand the function cos2(x) to a power series at the point n/4 and determine for which x e R this series converges. O 6.174. Express the function y = ex defined on the whole real axis as an infinite polynomial with terms of the form a„ (x — 1)" and express the function y = 2X defined on R as an infinite polynomial with terms a„x". Q 6.175. Find a function / such that for x e R, the sequence of functions to it. Is this convergence uniform on M? O 6.176. Does the series oo kde xeR, n = \ converge uniformly on the whole real axis? O 6.177. By using differentiation, obtain the Taylor expansion of function y = cosx from the Taylor expansion of function y = sin x centered at the origin. O 6.178. Approximate (a) cosine of ten degress with a precision of at least 10~5; (b) the definite integral JQ1/2 with a precision of at least 10~3. o 6.179. Determine the power expansion centered at x0 = 0 of function r f(x) = f e dt, x s o o 6.180. Find the analytic function whose Taylor series is JC ^ JC ~~\ ~^ JC JC ~~\ * * * ^ for* e [-1,1]. O 6.181. From the knowledge of the sum of a geometric series, derive the Taylor series of function y — 5+2x centered at the origin. Then determine its radius of convergence. O 6.182. Let a movement of a solid (a trajectory of a mass point) be described by function s(t) = -(t - 3)2 + 16, fe[0,7] in units m, s. Determine (a) the initial (i.e. at time t = 0 s) velocity of the solid; (b) the time and location at which the solid has zero velocity; (c) the velocity and the acceleration of the solid at time t = 4 s. Recall that velocity is the derivative of trajectory and acceleration is the derivative of velocity. O 413 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS Solutions of the exercises 6.4. 2^F. COS X 6.5. p(5)(x) = 12 • 5!; p(6)(x) = 0. 6.6. 212e2x + cosx. 6.7. f(26)(x) = -sinx + 226 e2*. 6.74. f + i (x - 1) - I (x - l)2 + ^ (x - l)3. 6.15. (a) 1 + ^; (b) 1 - ^; (c) x - ^; (d) x + ^; (e) x + x2 + ^. (5.7(5. 2 (x - 1) - (x - l)2 + | (x - l)3 - \ (x - l)4. (5.77. 
"*3 3(l+x)3 • 6 IS v c;n 1° ~ ^ lim „ x sin*-*2 _ 1 6.7,5. x --g-, sin 1 ~ - g—3, lim^0+-p---g- 6.20. (x - l)3 + 3 (x - l)2 + (x - 1) + 4. 6.26. V(-l)"-x2", (2«)! converges for all real x. (5.27. converges for all real x. 6.28. 00 92«-l y(_D»+if-^ (2«)! «=1 ^3(-l)"+1 n f(x) = > -x" , *—' n «=1 converges forx € (—1, 1]. 6.29. It's good to realize we're expanding i ln(x). /(x) = ^(-iy+1-(x-iy, 2/ r=0 Converges on interval (0, 2]. (5.32. It's convex on intervals (—oo, 0) and (0, 1/2); concave on interval (1/2, +oo). It has only one asymptote, the line y — it/4 (v ±oo). 6.33. (a) y = 0 at —oo; (b) x = 2 - horizontal, y = 1 v ±oo. 6.34. y — 0 for x -> ±oo. (5J5. y = In 10, y = x + ln3. (5.77. for a e (0, 1), oo else. 6.89. All of them. 6.90. The range is (—oo, 0] U [4, +oo). Function / is not odd, even nor periodic. It has a single discontinuity xo — — 1 with lim fix) — —oo, lim fix) — +oo. x^> — 1 + x^* —1 — The function intersects the x axis only at the origin. It's positive for x < —1 and nonpositive for x > —1. It can be shown easily that lim fix) — +oo, lim fix) — —oo; x^—oo x^+oo X € K \ 414 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS This implies that / is increasing on the intervals [—2, — 1), (—1,0] and decreasing on the intervals (—oo, —2], [0, +oo). At the stationary point x\ — 0 it reaches a strict local maximum and at the stationary point x2 — —2 it has a sharp local minimum y2 — 4. It's convex on the interval (—oo, —1) and concave on the interval (—1, +oo). It doesn't have a point of inflection. The line x — —1 is a horizontal asymptote, the inclined asymptote at ±oo is the line y — — x + 1. For example, /(—3) — 9/2, /'(—3) — —3/4, /(l) — —1/2, /'(l) = -3/4. 6.91. The function is defined and continuous on R \ {0}. It's not odd, even nor periodic. It's negative exactly on the interval (1, +oo). The only point of intersection of the graph with the axes is the point [1,0]. At the origin, / has a discontinuity of the second kind and its range is is R, because lim f(x) — +oo, lim f(x) — —oo, lim f(x) — +oo. Aditionally, f'(x) = -££, x€M\{0], f"(x) = x € R \ {0}. The only stationary point is x\ — —72. The function / is increasing on the interval [x\, 0), decreasing on the intervals (—oo, x\], (0, +oo). Hence at point x\ it has a local minimum y\ — 3/74. It has no points of inflection. It's convex on its whole domain. The fine x — 0 is a horizontal asymptote and the line y — — x is an inclined asymptote at ±oo. 6.92. The function is defined and continuous on R \ {1}. It's not odd, even nor periodic. The points of intersection of the graph of / with the axes are the points [i-72,o] and [0, —1]. At xo — 1, the function has a discontinuity of the second kind and its range is R, which follows from the limits lim f(x) — —oo, lim f(x) — +oo, lim f(x) — +oo. JC—>■ 1 — Jt-»1 + X^±QO After the arrangement /(x) = (x-l)2 + IlT, x€M\{l], it's not difficult to compute f'(x) = 2^^, x€M\{l], f"(x) = 2^0^, x€M\{l}. The only stationary point is x\ — 2. The function / is increasing on the interval [2, +oo), decreasing on the intervals (—oo, 1), (1, 2]. Hence at the point x\ it attains the local minimum y\ — 3. It's convex on the intervals (-oo, 1 - 72), (1, +00) and concave on the intervals (l-72, l) . The point x2 — 1 — 72 is thus a point of inflection. The line x — 1 is a horizontal asymptote. The function doesn't have any inclined asymptotes. 6.93. The function is defined and continuous on whole R. It's not odd, even nor periodic. 
It attains positive values on the positive half-axis, negative values on the negative half-axis. The point of intersection of the graph of / with the axes is only the point [0, 0]. The derivative can be determined easily: f'(x) = - 7x"e-*, x e R \ {0}, /'(0) = +00, The only zero point of the first derivative is the point xo — 1/3. The function / is increasing on the interval (—00, 1/3] and decreasing on the interval [1/3, +00). Hence at the point xo, it has an absolute maximum yo — l/73e. Since lim^-oo f(x) — —00, its range is (—00, yo]. The points of inflection are xi = t=^, x2 = 0, x3 = li^. It's convex on the intervals (xi, xi) and (X3, +00), concave on the intervals (—00, xi), (x2, X3). The only asymptote is the line y — 0 at +00, i.e. limx^+0o f(x) — 0. 6.94. The function is defined and continuous onR\ {2}. It's not odd, even nor periodic. It's positive exactly on the interval (0, 2). The only point of intersection of the graph of / with the axes is the point [0, 0]. At the point xo — 2, the so called jump of size it is realized, as follows from the limits lim f(x) = f, lim f(x) = -f. We have 415 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS The first derivative doesn't have a zero point. The function / is therefore increasing at every point of its domain. Since lim /(*) = -£, lim /(*) = -§, its range is the set (—it/2, tt/2) \ {—tt/4}. The function / is convex on the interval (—00, 1), concave on the intervals (1,2), (2, +00). Thus the point x\ — 1 is a point of inflection with /(l) — tt/4. The only asymptote is the line y — —it/4 at ±00. 6.95. Domain R+, global maximum x — e, point of inflection x = Ve3, increasing on (0, e), decreasing on (e, 00), concave on (0, Ve3, convex on (Ve3, 00), asymptotes x = 0 and y — 0, lim^o /(x) = —00, lim^oo /(x) = 0. 6.96. Domain R \ [1,2]. Local maximum x = 1~2V^, concave on the whole domain, asymptotes x = 1, x — 2. 6.97. Domain R. Local minimas —1,1, maximum at 0. Even function. Points of inflection ±-^, no asymptotes. 6.98. Domain R \ [—j, 1]. No global extremes. No inflection points, asymptotes x — — 5, x — 1. 6.99. DomainM\{l}. Noextrems. No points of inflection, convex on (—00, 1), concave on (1, 00). Horizontal asymptote x — 1. Inclined asymptote y — x + 1. 6.100. (a) £ xV^; (b) £ + 2^ + ^; (c) ™L; (d) In (1 + sinx). 6.707. (a) -cotgx - x + C; (b) tgx - cotgx + C. 6.102. ex + 3arcsin §. 6.703. ±ln(l +x4) + C. 6.704. 2 V2 arctg ^ + C. 6.105. In J xff J + 5 arcsin x — 4 cos x — 5 sin x + C. 6.706. (a) x arctgx - Ml+__) + c; (b) + C. 6.107. (a) —x2 cos x + 2x sin x + 2 cos x + C; (b) ex (x2 — 2x + 2) + C. 6.708. ^ (21n2x - 21nx + l) + C. 6.709. (2x -x2)ex + C. 6.770. (a) (2*+5)" + C; (b) + C; (c) -± e"^ + C; (d) 5 arcsin3 x + C; (e) ^ + C; (f) arctg2 Vx + C; (g) ^ arctg e*) + C; (h) 2 sin fx - 2 fx cos fx + C. 6.111. For example 1 — x = r^x gives / ~2^4 ^ x x2 ^ x2 + l ^ (x2 + \)2- 6-120.^ + ^-1. 6.121. ^ + /Xliv 6.122. ^-^ + ^. 416 CHAPTER 6. DIFFERENTIAL AND INTEGRAL CALCULUS 6.123.--^ + ^-^. 6-124. jM + i\ + CT + -x 6-125. 1 + -L - 2 + ^. 6.126. (a)31n|x-2|;(b)-^?. 6.127. § In (x2 + 4x + 8) - \ arctg ^ + C. 6.128. ^zrctg^ + J^)+C. 6-129. I in Jgfc + f arctg ^ + C. 6.130. iln|x - 1| - iln(x2 +x + l) - arctg ^ + C. 6.737. In (| x - 1 | (x - 2)4) - ^ + x + C. 6.7J2. (a) ^ _ £°^i + C; (b) ^ + § + C; (c) x - sin x + C; (d) | + ^ + C; (e) | sini x - | sini x + £ sinT x + C; (f) ^ +2tgx - ^ + C; (g) ± In |tg|| - + C; (h) In |tg f | + C. 6.133. SSn, sup = =£1, SSn, m = 2=1; yes, it is. 6.134. /f VI Jx = \ (2V2 - l). 6.135. Infinitely many. 6.136. 
For example, / can attain a value of 1 at rational points of the interval and be zero at irrational points. 6.138. V5 - V2. 6.139. \b\ - \a\. 6.140. iln2. 6.141. i (e* - 2). 6.142. e-5e~l. 6.143. I. 6.744. (a) 4; (b) 6.145. p < q. 6.146. a >0;b = 0;c>0. 6.147. C 1; (b) a < 1; (c) a = 0. (5.764. Exactly for /? > 1, q e K and for p — 1, q > 1. 6.165. (a) true; (b) false; (c) true. 6.166. 1 - ^2 + (5.7(57. The error belongs to the interval (0, 1/200). 6.168. 1 — 3a- + jjx4 ; above the tangent line. 6.169. ZZo 24" co (-1)" 9« (2n)! 6.770. v = arctgx. 6.777. Exactly for x € (—§' §)> we nave E(-f) oo 1 _ 1 5+2x 5 «=0 6.172. ^ 2~2n=0 3nX" ■ 6.173. /(jc) = 1/2+ >----(x--) J J ' ^ (2i + 1)! V 4/ The series converges for all x € R. 6.175. fix) =x,x eM;yes. 6.776. No. 6.177. ZZo{-^x2n- 6.178. {a)\-^ + ^;{b)\-^. 0.1/V. l„=0 (2« + l)«! X 6.180. y = arctgx. 6.787. Exactly for x € ( — |, we have 5+2x — 5 ^ \ 5 1 ■ 6.7S2. (a) v(0) = 6m/s; (b) t = 3 s, s(3) = 16m; (c) v(4) = -2m/s, a(4) = -2m/s2. 418 CHAPTER 7 Continuous models How do we manage non-linear objects? - mainly by linear tools again... A. Orthogonal systems of functions If we want to depict a three-dimensional object in a plane, we usually consider its (mostly orthogonal) projection into the plane. Similarly, if we want to "express" some more complicated function in terms In this chapter, we will show several applications of the tools of differential and integral calculus for various problems in which we will do with functions of one real variable. The tools and procedures will be quite similar to the ones shown in the third chapter, i. e. manipulations with linear combinations of selected generators and linear transformations (looking for their kernels or the reverse images of given elements). However, we will work not with finite-dimensional vectors but with spaces of functions, i. e. the vector spaces we will be considering will seldom have finite dimension. We will get back to these as well as other practical fields in the next chapter in the context of functions of several variables, differential equations, and the calculus of variations. First, we will approximate functions with linear combinations from given sets of generators. However, on the way, we will have to clarify how we can work with such concepts like distance. Actually, we will sketch the basics of what is called the theory of metric spaces, and this part will also serve us as a preparation for analysis in Euclidean spaces M". We will mainly resume applying the procedures we have already known from the Euclidean vector spaces. We will find out that our intuition from the Euclidean spaces of low dimensions is quite convenient in general as well. Then we will focus on integral operators, i. e. linear mappings on functions which are defined in terms of integrals. Especially, we will pay our attention to the so-called Fourier analysis. As usual, our reasoning will touch discrete variants of previously discussed continuous operations. In the entire chapter, we will work with functions of one variable which will take real or (very often) complex values. CHAPTER 7. CONTINUOUS MODELS of simpler ones, we can consider its projection into the (real) vector space generated by those simpler functions. 
Then we will be able to integrate, for instance, the more complicated functions in the same way as we integrated (or differentiated) functions expressed in terms of power series (if the space of the simpler functions is "sufficiently" large, then we will be able to do this with arbitrary accuracy). We can even define a scalar product on a suitable (infinite-dimensional) vector space of functions on a given interval. The scalar product is thus not defined on the space of all functions on the given interval, but rather on a subspace of it which, on the other hand, is large enough for our calculations (in particular, it contains all continuous functions on the given interval). The scalar product allows us to calculate projections in the same way as we used to in the case of finite-dimensional vector spaces. Given a finite-dimensional vector (sub)space of functions, if we want to determine the projection of a function onto it, we first calculate an orthogonal (or orthonormal) basis of this subspace by the Gram-Schmidt orthogonalization process and then determine the orthogonal projection in the known way (2.3).

7.1. Let the vector subspace ⟨x², 1/x⟩ of the space of real-valued functions on the interval [1, 2] be given. Complete the function 1/x to an orthogonal basis of the subspace and determine the orthogonal projection of the function x onto it.

Solution. First, we deal with the basis. It is required that the function 1/x be one of the vectors of the basis. The vector space in question is generated by two linearly independent functions, thus its dimension is 2 (and all of the vectors lying in it are of the form a · (1/x) + b · x² for some a, b ∈ ℝ). It remains to find one more vector of the basis which is orthogonal to the function f₁ = 1/x. According to the Gram-Schmidt process, we look for it in the form f₂ = x² + k · (1/x), k ∈ ℝ. The real constant k can be determined from the condition of orthogonality:

0 = ⟨1/x, x² + k · (1/x)⟩ = ⟨1/x, x²⟩ + k ⟨1/x, 1/x⟩.

Therefore,

k = − ⟨1/x, x²⟩ / ⟨1/x, 1/x⟩ = − (∫₁² x dx) / (∫₁² x⁻² dx) = − (3/2) / (1/2) = −3.

Thus, the wanted orthogonal basis is (1/x, x² − 3/x). Now we calculate the projection p_x of the function x onto this subspace (see (2.3)):

p_x = (⟨x, 1/x⟩ / ⟨1/x, 1/x⟩) · (1/x) + (⟨x, x² − 3/x⟩ / ⟨x² − 3/x, x² − 3/x⟩) · (x² − 3/x) = 2 · (1/x) + (15/34) · (x² − 3/x). □

7.2. Let us consider the real vector space of functions on the interval [1, 2] generated by the functions 1/x, 1/x², 1/x³. Complete the function 1/x to an orthogonal basis of this space.

1. Fourier series

7.1. Spaces of functions. As usual, we begin with choosing appropriate sets of functions which we want to work with. We want to have enough functions so that our models can be conveniently applied in practice. At the same time, the functions must be sufficiently "nice", as we must be able to integrate and differentiate them as needed. We will largely work with functions defined on an interval I = [a, b] ⊂ ℝ, or on an unbounded interval (i.e., the marginal values a and b can take the values ±∞, but the sets will still be closed).

Spaces of piecewise smooth functions

The set of functions S⁰ = S⁰[a, b] contains exactly the piecewise continuous functions on I = [a, b] with real or complex values; i.e., we suppose that at every point of the interval, the function f ∈ S⁰ has finite one-sided limits both from the left and from the right, and that on every finite interval, there are only finitely many points of discontinuity. In particular, all such functions are bounded on bounded intervals.
For every natural number k > 1, we will also consider the set of all piecewise continuous functions / such that all their derivatives up to order k (inclusive) lie in 5° (i. e. they need not exist at all points, but their one-sided limits do exist). We will denote this set by 5*. In the case of an unbounded interval I, we will also often work with the subset 6^ c 5* of all functions with a compact support (i. e. the functions take zero outside some finite closed interval). On bounded intervals, of course, all such functions have a compact support in this sense. When we are not interested in the interval we work on, we will write only 6^ in all cases. In the case of a finite interval [a, b] or under the condition of a compact support, our functions from 5° are always Riemann integrable on the chosen interval I, both in the absolute value and squared, i. e. Ja \f(x)\dx < oo, Ja (f(x)) dx < oo. Our reasonings can be extended on significantly greater domains of functions, yet this often costs us a lot of technical effort. From time to time, we will make reference to Kurzweil (or Lebesgue) integrable functions for which the results are more compact and nicer. Interested readers are referred to extended and specialized literature. Actually we will keep the same strategy as with the rational and real numbers - we calculate with nice functions only and we can "handle" the limits of Cauchy sequences in the chosen metrics (which are usually needed only formally). Distance of functions. From the already proved properties of limits and derivatives, we can immediately see that 5* and 6^ are vector spaces. In finite-dimensional spaces, the distance of vectors was considered in terms of the differences of the particular coordinates. In spaces of functions, we can proceed analogously and utilize the absolute value of real or complex numbers (or the Euclidean distance) in the following way: Distance of functions [__> 420 CHAPTER 7. CONTINUOUS MODELS Solution. Similarly to the previous exercise, we use the Gram-Schmidt orthogonalization process (with the given scalar product). Thus we gradually get that h{x) = jj - tz, □ fs(x) =1 3_ 4x ' 3 I 13 2x2 24x ■ 7.3. Determine the projection of the functions ^ and x onto the vector space from the exercise ||7.2||. Determine the distances from this vector space as well. Solution. Projection of ^ : + %f2 + \h, distance: Projection of x : 2j\ + (-§ + ln(2))/2 + (-|ln(2) + |)/3, distance: about 0.03496029493. We can see that the distance of the function which behaves in a similar way as the generators is smaller. □ 7.4. Let the vector subspace (sin(x),x) of the space of real-valued functions on the interval [0, n] be given. Complete the function x to an orthogonal basis of the subspace and determine the orthogonal projection of the function \ sin(x) onto it. O 7.5. Let the vector subspace (cos(x), x) of the space of real-valued functions on the interval [0, n] be given. Complete the function cos(x) to an orthogonal basis of the subspace and determine the orthogonal projection of the function ^ cos(x) onto it. B. Fourier series o One of the fundamental studied periodic processes which can be met in applications is a general simple harmonic oscillation in mechanics. It is the case of a mass point moving along a straight line. 
It is well known that the function / which describes the position of the mass point on the line in time is of the form (7.1) f (t) = a sin (cot +b) for certain constants a, co > 0, b e R determined by the position and velocity of the point at the initial time. The function y = f(t) can, for instance, be obtained by solving the homogeneous linear differential equation (7.2) / + co2y = 0 following from Newton's law of force for the given movement. Let us mention that the function / has period T = 2jt/co (in mechanics, one often talks about frequency 1/T) and that the positive value a (expressing the maximum displacement of the oscillating point from the equilibrium position) is called the amplitude. The value b (expressing the position of the point at the initial time) is called the initial phase, and co is the angular frequency of the oscillation. Similarly, we can focus on the function z = g (t) which describes the dependence of voltage upon time t in an electrical circuit with inductance L and capacity C and which is the solution of the differential equation (7.3) z" + co\ 0. The only difference between the equations (|| 7.21|) ans (|| 7.31|) (besides the dissimilar physical interpretation) is the constant co. In the equation (||7.21|), there is co2 = k/m where k is the proportionality constant Definition. The L\-distance of functions / and g from 6^ is defined by 11/~ g\\i = / l/W - g(x)\dx. J a Similarly, /^-distance of / and g is defined by II/-SII2 or \f(x)-g(x)\zdx 1/2 The size of the function || /1|: or || /1| 2 is understood to be its distance from the zero function. In the first case, the L\ -distance of functions / and g which take real values only expresses the area enclosed by the graphs of these functions, regardless of which function takes greater values. Since we consider piecewise continuous functions / and g, their distance can equal zero only if they differ in their values at the points of discontinuity, i. e. at only finitely many points on bounded intervals. Indeed, if our functions differ at a point xo and they are continuous at it, then they also differ on some sufficiently small neighborhood of this point, and this neighborhood, in turn, contributes a non-zero value into the integral. If we have three functions f,g, and h, then, of course, f\h(x)-f(x)\dx= f Ja Ja pb pb < / \h(x) - g(x)\dx + / Ja Ja \h(x) - g(x) +g(x) - f(x)\dx \g(x) - f(x)\dx, so the usual triangle inequality holds. We can notice that to derive this inequality, we only used the triangle inequality for the scalars; thus it holds for functions /, g e 6J? with complex values as well. The second definition is similar. The square of the size ||/||2 of a function / is ll/H22= / \f(x)\2dx J a and it is derived from the well-defined symmetric bilinear mapping of real functions to scalars (f,8) -f Ja f(x)g(x)dx by substituting / for both functions. In the case of complex values, we obtain this size similarly from the scalar product, using complex conjugation: (f,8) - Ja f(x)g(x) dx, as we saw when talking about the unitary spaces in the third chapter: Thus the triangle inequality will hold as well because the whole discussion can be done in a space of dimension at most three with scalar product, generated by given functions /, g, h. 7.3. (In)finiteness of dimensions, orthogonality. Let us, for a first argument and symmetry (f,8) = (8,f), 421 CHAPTER 7. CONTINUOUS MODELS and m is the mass of the point, while in the equation (||7.3||), there is CO (LC)"1. 
Actually, every periodic process which can be described by a function of the form (||7.1||) is considered a harmonic oscillation, and the constants a,co,b are almost exclusively given the mentioned names borrowed from the simple harmonic oscillation of a mass point in mechanics. Applying one of the sum formulae sin (a + j3) = cos a sin + sin a cos a, j3 e M, we can write the function / (see (||7.1||)) as (7.4) f(t) = c cos (cot) + d sin (cot) where c = a smb, d = acosb. Thus, the function / from (||7.4||) also describes a harmonic oscillation with amplitude a = \Jc1 + d2 and the initial phase b € [0, lit) satisfying sin b = c/a, cos b = d/a. An important task in application problems is the composition (so-called superposition) of different harmonic oscillations. A key position is occupied by the superposition of finitely many harmonic oscillations expressed by functions of the form /„ (x) = an cos (ncox) + b„ sin (ncox) for n € {1, ..., m}. These particular functions have prime period 2jt/(nco). Therefore, their sum (7.5) [an cos (ncox) + bn sin (ncox) ] «=i is a periodic function with period 2n/co. It holds generally that the superposition of any finite collection of simple harmonic oscillations with commensurable periods is a period function whose period is the lease common multiple of the prime periods of the particular oscillations. The sum (||7.5||) modified by an appropriate movement, (7.6) 2 + [an cos (ncox) + bn sin (ncox) ] «=i is just the m-th partial sum of the functional series oo (7.7) --\- \-an cos (ncox) + bn sin (ncox) ]. «=i From the physical point of view, it is a complicated periodic process which can serve as a natural approximation of the superposition of infinitely many simple harmonic oscillations (so-called harmonic components) of the functional series (||7.7||). It may be an interesting question whether, on the other hand, every periodic process can be in a "reasonable" way expressed by the superposition of finitely (or possibly infinitely) many simple harmonic oscillations - whether every periodic process is the result of such a superposition. Exactly formulated from the mathematical point of view: whether every periodic function can be expressed as the finite sum (|| 7.61|), or at least as the series (||7.7||). Of course, the positive answer for a significant and broad class of periodic functions is obtained for the infinite sum only (see the theoretical part). We have already mentioned that periodic processes play an important role in many physical and technical fields. Traditionally, we i. e. in the real case, it is a symmetric bilinear mapping. At the same time, for continuous functions, the condition of non-zero size of non-zero functions holds as well, while for our piecewise continuous functions, zero size implies that the function is non-zero except for an (at most) countable set of points (finite on every finite interval). Thus we have truly defined the scalar product for the vector subspace of continuous functions. In the case of more general functions, we should, from the technical point of view, identify functions which differ on finite intervals at finitely many points only. In our subsequent reasonings, this technical obstruction will play an insignificant role (we will occasionally make a reference to it in notes). In the case of finite-dimensional real or complex vector spaces, we considered scalar products and the size of vectors as soon as in the second and third chapter. 
Now let us notice that when we derived the properties, we always worked with pairs or finite sets of vectors. Now, we can do just the same with functions, and if we restrict our definition of a scalar product to a vector subspace generated (over real or complex numbers, according to our need) by only finitely many functions /i ,...,/*, we again obtain a well-defined scalar product on this finite-dimensional vector subspace. As an example, let us consider the functions ft — xl, i — 0,..., k. In 5°, they generate the (k + 1)-dimensional vector sub-space Rklx] of all polynomials of degree at most k. The scalar product of two such polynomials is given by an integral. Every polynomial of degree at most k is uniquely expressed as a linear combination of the generators fo, ■■■ , fk- Moreover, if our generators were such that (7.1) (fi,fj) = |0 for/^j, |l for «' = ./', then we would have the so-called orthonormal basis. At this occasion, let us remind the procedure of Gram-Schmidt orthogonaliza-tion, see 2.42, which transforms any system of linearly independent generators ft into new (again linearly independent) orthogonal generators gt of the same subspace, i. e. (gi, gj) — 0 for all i / j. We can calculate them step by step as gi — f\ and by the formulae (fl+Ugi) gi+i — fi+i ■ aigi, fori > 1. For illustration, we will apply this procedure to three polynomials 1, x, x2 on the interval [—1, 1]. We get g\ — 1, g2 = X g3 = x2 = x2 II* 1 3 1 dx ■ 1 — x — 0 — x x2 • 1 dx ■ 1 — (' ten2;- ■ x dx Thus the corresponding orthogonal basis of the space R2M of all polynomials of degree less than three on the interval [—1, 1] is 1, x, x2 — 1/3. Normalization, i. e. multiplying by an appropriate scalar to change the size of the basis' elements to one, gives the orthonormal basis [i [3 l/5, /n=^-, /*2 = y-*, h3 = -xJ-(3x2 -1/3). 422 CHAPTER 7. CONTINUOUS MODELS can point out acoustics, mechanics, and electrical engineering where answering the above question is undoubtedly of extreme importance. Besides that, looking for the answer has given rise to an independent mathematical field - the theory of Fourier series. Later, it began to be applied in many other classes of problems (among others, for solving the most of important types of ordinary and partial differential equations) and led to development of the particular theoretical foundations of mathematics (for instance, to precise definition of the fundamental concepts of a function and an integral). The Fourier series are named in honor of French mathematician and physicist Jean B. J. Fourier, who was the first to apply trigonometric expressions (||7.6||) in practice in his work from 1822 devoted to the issue of heat conduction (he began to deal with this issue in 1804, and he finished the publication as early as in 1811). The significance of this work of Fourier's for physics is enormous although Fourier himself did not pay much attention to physics. He introduced mathematical methods which even nowadays belong to the standard tools of theoretical physics. His mathematical theory of heat also became the foundations for George S. Ohm when he derived the famous law of conduction of electric current. We should not forget to mention that there were many mathematicians who studied the properties of the sums (|| 7.61|) many years earlier than Fourier (L. Euler, for example). However, they did not achieve such significant results with regard to practical applications. 7.6. 
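The orthogonalization just performed is mechanical enough to hand over to a computer algebra system. The following sketch (our own helper built on sympy) runs Gram-Schmidt with the inner product ⟨f, g⟩ = ∫ f g dx and reproduces both the basis 1, x, x² − 1/3 on [−1, 1] and the basis (1/x, x² − 3/x) from exercise 7.1:

    import sympy as sp

    x = sp.symbols('x')

    def gram_schmidt(funcs, a, b):
        """Gram-Schmidt with the inner product <f, g> = integral of f*g over [a, b]."""
        inner = lambda f, g: sp.integrate(f * g, (x, a, b))
        basis = []
        for f in funcs:
            for g in basis:
                f = f - inner(f, g) / inner(g, g) * g   # subtract the projection onto g
            basis.append(sp.simplify(f))
        return basis

    # Monomials on [-1, 1] give the (unnormalized) Legendre polynomials.
    print(gram_schmidt([1, x, x**2], -1, 1))        # [1, x, x**2 - 1/3]

    # The generators from exercise 7.1 on [1, 2].
    print(gram_schmidt([1/x, x**2], 1, 2))          # [1/x, x**2 - 3/x]

Dividing each of the three polynomials by its norm then yields exactly the orthonormal basis above.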
Determine the Fourier coefficients of the function (a) g(x) = sin(2x) cos (3x) , x € [—Jt, 7t]; (b) g(x) = cos4x, x € [—Jt, Jt]. Solution. The case (a). Since for ieK,we have sin (2x) cos (3x) = sin (2x) [cos (2x) cos x — sin (2x) sin x] = sin {Ax) cos x — sin (2x) sin x j cos x sin {Ax) 1—cos(4x) smx - \ sin x + \ cos x sin (Ax) + \ sin x cos (Ax) - 5 sin x + j sin (5x) we can see that the Fourier coefficients are all zero except for b\ -1/2, b5 = 1/2. The case (b). Similarly, from -l2 cos4x l+cos(2x) = [cos2 x] [l + 2 cos (2x) + cos2 (2x) ] l ~~ 4 3 + \ cos (2x) + \ cos (Ax) , x e 1+2 cos (2x) + 1+cos(4jc) 2 3/4, a2 = 1/2, a4 = 1/8, and the other coefficients 8 1 2 it follows that a0 are all zero. We showed in this exercise that the calculation of the Fourier series may not lead to integrations (usually by parts). Especially in the cases where the function g is a product (power) of functions y = sin (mx), y = cos (nx) for m, n e N, it suffices to apply high-school knowledge (well-known trigonometric formulae). □ 7.7. Find the Fourier series for the periodic extension of the function (a) g(x) = 0, x € [—Jt, 0), g(x) = sinx, x e [0, Jt); (b) g(x) = |x |, x € [-Jt,jt); (c) g(x) =0, x e [-1, 0), g(x) = x + 1, x e [0, 1). Such orthonormal generators of Rk [x] are called Legendre polynomials. 7.4. Orthogonal systems of functions. We have just reminded ourselves the advantages of orthonormal bases of sub-spaces of finite-dimensional vector spaces. In the last example of Legendre polynomials generating M2 [x] C V — Rk[x], k > 2, for any polynomial h e V, the function H — (h, h\)h\ + (h, h2)h2 + (h, h3)h3 will be the uniquely determined function which minimizes our L2-distance \\h — H\\ among all functions in R,t[x], see 3.25. The coefficients for the best approximation of a given function by a function from a selected subspace can be obtained just by integrating in the definition of the scalar product. The mentioned example suggests the following generalization: If we do the Gram-Schmidt orthogonalization for all monomials 1, x, x2,..., i. e. for a countable system of generators, what will become of that? ___J Orthogonal systems of functions |_ - Every (at most) countable system of linearly independent functions in 6^ [a, b] such that the scalar product of each pair of distinct functions is zero is called an orthogonal system of functions. If all the functions /„ in the sequence are pairwise orthogonal and for all n, the size ||/„ ||2 = 1, we talk about an orthonormal system of functions. Let us thus consider an orthogonal system of functions /„ e 5° [a, b] and suppose that for (real or complex) constants c„, the series F(x) = ^c„/„ «=1 (F, fn) 00 b — ^2 Cm / m=\ fm (x) fn (x) dx — Cn || fn I converges uniformly on a finite interval [a, b]. Then the scalar product (F, /„) can easily be expressed in terms of the particular summands (see the corollary 6.43), obtaining where the norm means (just like in further paragraphs) our Z,2-size. Surely we can now anticipate in what sense the procedures from finite-dimensional spaces can be extended: Instead of finite linear combinations of base vectors, we will work I with infinite series of pairwise orthogonal functions. The f ^ following theorem gives us a transparent and very general answer to the question how well the finite sums of such a series can approach a given function: 7.5. Theorem. Let /„, n — 1, 2,..., be an orthogonal sequence of (real or complex) functions in 5° [a, b] and let g e 5° [a, b] be an arbitrary function. 
Let us denote rb _ cn — II/nil 2 g(x)f„(x) dx. (I) For any fixed n € N, the expression which has the least L2-distance from g among all linear combinations of functions 423 CHAPTER 7. CONTINUOUS MODELS Solution. The case (a). Direct calculation gives Xg+2jT 0 71 üQ = j- f g(x) dx = ^ f Odx + j- f sinx dx X(j -71 0 = J- [-cosx]* = 1, 71 L J U jt ' fl„ = ^ g (x) COS («X) (ix 0 jt = ^ / 0 »i2ii/»ii2< 11*1 n = l (3) L^-distance of g from the partial sums s\ — 2~2n=i c„f„ converges to zero, i. e. lim ||g — ^jfc ii = 0, if and only if «=1 Before we start with the proof, let us first look at the meanings of the particular statements of this theorem. Since we are working with an arbitrarily chosen orthogonal system of functions, we cannot expect that all functions can be approximated by linear combinations of the functions f. For instance, if we consider the case of Legendre orthogonal polynomials on the interval [—1, 1] and restrict ourselves to even degrees only, we will surely be able to approximate even functions only. Nevertheless, the first statement of the theorem says that we can always reach the best approximation possible by partial sums (in /^-distance). The second and third statements then can be perceived as an analogy to the orthogonal projections into subspaces expressed by Cartesian coordinates. Indeed, if for a given function g, the series F(x) — E^Li cnfn converges pointwise, then the function F(x) is, in a certain sense, a orthogonal projection of g into the vector subspace of all such series. The second statement is called Bessel's inequality and it is an analogy of the finite-dimensional proposition that the orthogonal projection of a vector cannot be greater than the original vector. The equality from the third statement is called Parseval's theorem and it says that if a given vector does not become smaller by the orthogonal projection into a given subspace, then it surely belongs to the subspace. On the other hand, our theorem does not claim that the partial sums of the considered series would have to converge pointwise to some function. There is no analogy to this phenomenon in the finite-dimensional world. In general, the series F(x) need not be convergent (i. e. if we considered more general functions than the ones from our space even in the case when the equality in (3) holds. However, if the series z~2n%i lc« I converges to a finite value and all the functions /„ are bounded uniformly on I, then, the series F(x) — E^Li c« /« apparently converges at every point x. Yet it need not converge to the function g everywhere. We will get back to this problem shortly. The proof of all of the three statements of the theorem is quite similar to the case of finite-dimensional Euclidean spaces. No wonder it is so as the bounds for the distances of g from the partial sum / are constructed in the finite-dimensional linear hull of the functions concerned: 424 CHAPTER 7. CONTINUOUS MODELS and for any n e N using integration by parts, we get xq+2tt jt an = ^ f g(x) cos (nx) dx = ^ f x cos («x) 0 iff for every 71 o x e M, it holds that fix + T) = fix). 1 r st \ ■ tr^ii n\j , l r /v \ • /tiz n\j a It is apparent that sums and scalar products of periodic func-/ / (v) sm ([2/c - l]y) dy + - / / (x) sm ([2/c - l]x) dx = 0. , Ff , , ,. * ^ tions with the same period are again periodic tunctions with the same period. o o The integral fx0+T fix) dx of a periodic function / on an Jx0 rjTi > . ,. . , . interval whose length equals the period T is independent of the The case (b). 
We immediately get , . „ ^ J ° ■ choice of xo e M. o The last proposition can be proved easily: a0 = - fix)dx = ± f f(x)dx + ± f fix)dx= 0, Let us choose two such margina1 Points xo and y0 for the in- 71 _n 71 -tz 71 o tegration Substituting t = x + for a suitable we transform ■fyo+T f(x^^x to^e case wnen J'o 6 tx0'xo + T]- Now we can split the interval of integration into three parts, thereby finishing and then, in an analogous way as for the first statement, we get that for the proof. any k e N, The orthogonality of the Fourier system of functions can be i nr n/ N ^r„,n N , j^t\ calculated by a nice trip to the world of complex num- a2k = - / / (x) cos ( 2/c x) dx ?~i%21 u u- u i * zk jt j j \ / \l Ji!J bers, which we can utilize later: Let us remind that e" = cos(x) + /sin(x). ^ /(x) cos ([2/c]x) dx + ^ f fix) cos ([2/c]x) dx = \ x = y + jt \ "~3£Jif?~ Straight differentiation of the product of real-valued o functions, we can verify that real-valued functions z(x) and cpix) 1 77r/(j + ^)cos([2^][y+7r])^ + l//(x)cos([2^]x) dx of a real variable x satisfy -2n / /(y) cos ([2/c][y + tt]) ^y + 1 / /(x) cos ([2/c]x) Jx 0 0 (z(x) e'^W)' = z'(x) e'"w +i z(x) qf (x) e'" r>(At) 426 CHAPTER 7. CONTINUOUS MODELS = / /(?) [cos (t2^) cos ([2^]^) - sin (&k]y) sin ([2/c]tt)] dy o + / f(x) cos ([2/c]x) U2«-1]jt [2«-l]2jt2 ) n = l sin [2n — l]jrx □ 7.11. Express the function g(x) = cosx, x e (0, n), as the sum of a cosine Fourier series and a sine Fourier series. Solution. Of course, we have cosx = cosx, x e (—it, jt), considering the left-hand cosine to be the even extension of the function g and the right-hand cosine to be the uniquely given cosine Fourier series. Then, the sine series must have a„ = 0, n e N U {0}, and we can easily calculate that Fix) —--h 2_^(fl« cos(n&>x) + bn sin(n&)x)j, n = \ which have values b« = jf <0+T gix) cos(n&>x) dx, '-0+T gix) sin(n&)x) dx. 7.7. Exponential formula. A while ago, we used the basic formula for parametrization of the unit circle in the complex plane by the trigonometric functions when we verified the orthogonality of the functions cos(nx), sin(nx) If we consider co — lir/T to be the speed of running around the circle, where T is the time of one lap, we get the same parametrization in the form: For a (real or complex) function fit) and all integers n, we can define, in this context, its complex Fourier coefficients as the complex numbers i rT'2 cn = ~ fit)e-'mnt dt. 1 J-T/2 Straight from the definition, the relation between the coefficients a„ and b„ of the Fourier series (after recalculating the formulae for these coefficients for functions with a general period of length T) and these complex coefficients c„ become clear. For natural numbers n, we get c„ = \(an - ib„), c-n — \ian + ib„), and if the function / takes on real values only, c„ and c_„ are, of course, complex conjugates of each other. Thus we have expressed the Fourier series F(t) for a function fit) in the form Both in the case of functions with real and complex values, the corresponding Fourier series can be written in this form. However, the coefficients will be complex in general in both cases. We will return to this expression several times; for instance, when we will discuss the extraordinarily useful Fourier transformation. We can notice that having fixed T, the expression co — 2it/T describes just the change of the frequency caused by n being increased by one. Thus it is just the discrete step by which we change the frequencies when calculating the coefficients of the Fourier series. 
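As a quick numerical illustration of these relations, the following sketch (our own helper, with the choice T = 2π, hence ω = 1) computes c_n for the function from exercise 7.6 (a) and confirms c_n = ½(a_n − i b_n) as well as c_{−n} being the complex conjugate of c_n for real-valued f:

    import numpy as np
    from scipy.integrate import quad

    T = 2 * np.pi
    omega = 2 * np.pi / T   # equals 1 for this choice of T

    def c(f, n):
        """Complex Fourier coefficient c_n = (1/T) * integral of f(t) e^{-i omega n t}."""
        re = quad(lambda t: f(t) * np.cos(omega * n * t), -T / 2, T / 2)[0]
        im = quad(lambda t: -f(t) * np.sin(omega * n * t), -T / 2, T / 2)[0]
        return (re + 1j * im) / T

    f = lambda t: np.sin(2 * t) * np.cos(3 * t)   # the function from exercise 7.6 (a)

    # There b_1 = -1/2 and b_5 = 1/2, so c_1 = i/4 and c_5 = -i/4 by c_n = (a_n - i b_n)/2.
    print(c(f, 1), c(f, 5))
    print(c(f, -1), c(f, -5))   # the complex conjugates, since f is real-valued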
In subsequent parts of this chapter, we will show that the Fourier series work with a complete orthogonal system on S⁰. However, we will have to prepare ourselves for that thoroughly. For that reason, we formulate some useful results right now and add some practically oriented notes. We will get back to the proofs later.

7.8. Theorem. Let us consider a finite interval [a, b] with length T = b − a. Further, let f be a function with real or complex values in S¹[a, b] (i.e., a piecewise continuous function with a piecewise continuous first derivative), periodically extended to the whole ℝ. Then:
(1) the Fourier series F(t) of f converges pointwise at every t ∈ ℝ, and its value is F(t) = ½ (lim_{y→t+} f(y) + lim_{y→t−} f(y)); in particular, F(t) = f(t) at every point of continuity of f;
(2) if f is moreover continuous on ℝ, then its Fourier series converges to f uniformly.

For the sine Fourier series of the function g(x) = cos x from exercise 7.11, we compute

b₁ = (2/π) ∫_0^π cos x sin x dx = (1/π) ∫_0^π sin(2x) dx = 0,

bₙ = (2/π) ∫_0^π cos x sin(nx) dx = (1/π) ∫_0^π (sin([n+1]x) + sin([n−1]x)) dx = −(1/π) [cos([n+1]x)/(n+1) + cos([n−1]x)/(n−1)]_0^π, n ≥ 2.

Considering that bₙ = 0 for odd n ∈ ℕ and bₙ = 4n/((n² − 1)π) for even n ∈ ℕ \ {1}, we get

cos x = Σ_{n=1}^∞ (8n/((4n² − 1)π)) sin(2nx), x ∈ (0, π). □

7.12. Write the Fourier series of the π-periodic function which equals cosine on the interval (−π/2, π/2). Further, write the cosine Fourier series of the 2π-periodic function y = |cos x|.

Solution. It is not hard to realize that we are looking for one Fourier series only (the second part of the problem is only a reformulation of the first one). Therefore, let us construct the Fourier series of the function g(x) = cos x, x ∈ [−π/2, π/2]. Since g is even, we have bₙ = 0, n ∈ ℕ. At the same time,

a₀ = (2/π) ∫_{−π/2}^{π/2} cos x dx = 4/π,

aₙ = (2/π) ∫_{−π/2}^{π/2} cos x cos(2nx) dx = (1/π) ∫_{−π/2}^{π/2} (cos([2n+1]x) + cos([2n−1]x)) dx = 4(−1)^{n+1} / (π(4n² − 1)), n ∈ ℕ.

Altogether,

|cos x| = 2/π + (4/π) Σ_{n=1}^∞ ((−1)^{n+1}/(4n² − 1)) cos(2nx), x ∈ ℝ. □

7.9. Extension of periodic functions. A convergent Fourier series converges, of course, outside the original interval [−T/2, T/2] as well, and thus defines a periodic function on the whole ℝ. As an example, let us consider the Fourier series of the periodic function given by Heaviside's function g(x) restricted to one period; i.e., our function g will be equal to −1 on the interval [−π, 0] and to 1 on the interval (0, π). We need not care about the values at zero and at the marginal points of the interval, because these have no effect on the coefficients of the Fourier series. Its periodic extension onto the whole ℝ is usually called an "angular wave function". Since it is an odd function, the coefficients at the functions cos(nx) must all be zero. For the coefficients at the functions sin(nx), we get

bₙ = (1/π) ∫_{−π}^{π} g(x) sin(nx) dx = (2/π) ∫_0^π sin(nx) dx = (2/(nπ)) (1 − (−1)ⁿ).

Thus the Fourier series has the form

g(x) = (4/π) (sin x + (1/3) sin(3x) + (1/5) sin(5x) + ...).

The partial sums of its first five and fifty terms, respectively, are shown in the following pictures. If the interval [−T/2, T/2] is chosen for the prime period of such an angular wave function, i.e., we want to work with the periodic extension of Heaviside's function with period T, we can easily recalculate that the resulting Fourier series has the form

g(x) = (4/π) (sin(ωx) + (1/3) sin(3ωx) + (1/5) sin(5ωx) + ...), where ω = 2π/T.

Let us further compute the Fourier series of the periodic extension of the function x² from the interval [−1, 1]. For n > 0, double application of integration by parts yields

aₙ = (2/2) ∫_{−1}^{1} x² cos(πnx) dx = 2 ∫_0^1 x² cos(πnx) dx = 4(−1)ⁿ / (π²n²).

The remaining coefficient is

a₀ = ∫_{−1}^{1} x² dx = 2 ∫_0^1 x² dx = 2/3.

The entire series giving the periodic extension of x² from the interval [−1, 1] thus equals

f(x) = 1/3 + (4/π²) Σ_{n=1}^∞ ((−1)ⁿ/n²) cos(πnx).

By the Weierstrass criterion, it is apparent that this series converges uniformly. Therefore, f(x) will be continuous. Thanks to theorem 7.8, we know that actually f(x) = x² on the whole interval [−1, 1], since we are approximating a continuous function on the whole ℝ, and the convergence must be uniform.
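This uniform convergence can be observed numerically. The following rough sketch measures the sup-norm error of the partial sums over [−1, 1], which decays like the tail Σ_{n>N} 4/(π²n²), i.e., roughly as 1/N:

    import numpy as np

    x = np.linspace(-1.0, 1.0, 2001)

    def partial_sum(N):
        """Partial sum of 1/3 + (4/pi^2) * sum of (-1)^n cos(pi n x)/n^2 up to n = N."""
        s = np.full_like(x, 1.0 / 3.0)
        for n in range(1, N + 1):
            s += 4.0 / np.pi**2 * (-1.0)**n * np.cos(np.pi * n * x) / n**2
        return s

    for N in (5, 50, 500):
        print(N, np.max(np.abs(partial_sum(N) - x**2)))   # sup-norm error shrinks ~ 1/N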
Thus our series approximates the function x2 on the interval [0, 1] far better than we could do it with a periodic extension of the function from this interval only. Let us proceed with our illustrations. Thanks to the uniform convergence, we can invoke the rule for differentiating and integrating series term by term and calculate the Fourier series for the functions x and x3. The differentiation will be the simpler one: -ix2)' 2 ' 0 oo -E TT i—l (-D n + 1 ■ sin(Trnx). «=i Apparently, this series cannot converge uniformly since the periodic extension of the function x is not a continuous function. However, we can easily derive that it will converge pointwise (see our reasonings about alternating series in ??), thus we really obtained the equality, (see the theorem ??). 430 CHAPTER 7. CONTINUOUS MODELS not give it in the denominators (the change of the upper bound takes effect when calculating a0). Therefore, we had to multiply the original series by n2. Readers who are unable to perform the corresponding calculations in mind and to immediately realize the differences, are advised to calculate the Fourier series of the function g directly. Substituting x = 0 and x = n then gives 7t oo i. e. E n = l (-D" 12 ' and n—\ n—\ In other words, we have found another way of expressing 7T 12(1 J- 4- -i- 22 ' 32 42 + ...)=6(l + £ + £ + £ + ...). □ 7.15. Using the Fourier series of the function g(x) = ex,x e [0, lit), calculate YZLi resolution. We have (see also (||7.9||), (||7.10||)) a0 =I/e^x = l(e2--l), o In ex[cos(nx)+n sin(nx)] l+«2 - f ex cos (nx) dx = - 71 j v ' 71 2n 0 (l+«2)jr ' 1 n e N, 2jt ^ / e* sin (njc) 1tz— 2 2 hence it follows that el7T + l 2 which can be refined to i+«2 -1 / j_ _j_ cos 0—n sin 0 2 « = 1 oo 2^1^ l+«2 « = 1 oo 1 _ (jr-l)e2;r+jr+l 2- l+„2 " 2(e □ 7.16. Calculate the series y — (2n- n = \ Solution. To determine the value of this series, one can successfully apply many known Fourier series of various functions. Let us remind, for instance, the Fourier series Similarly, we can integrate term by term, leading to 1 3 2 4 ^ (-1)" -x = —x ■ n = \ ■ sin(jrnx), and the resulting Fourier series is obtained by substituting for x from the previous equality. 7.11. General Fourier series and wavelets. In the case of a general orthogonal system of functions /„ and the series made from it, we often talk about general Fourier series with respect to the orthogonal system of functions /„. Fourier series and further tools built upon them are used for processing various signals, pictures, and so on. The nature of the period trigonometric functions used in the standard Fourier series and their simple scaling by increasing the frequency limit their usability. In many application fields, there arose a natural requirement of more convenient orthogonal systems of functions which will reflect the supposed nature of the data and which could be processed more efficiently. Requirements for fast numerical processing usually include quick scalability and the possibility of easy movements by constant values. We can hope in such a system if, for instance, we choose a suitable continuous function i/r with a compact support from which we create countably many functions ^fjk, j,k e Z, by translations and dilations: fjk(x) =2>/2f(2>x-k). 
If at the same time the following two conditions are satisfied:
• the form of the mother function captures the possible behavior of the data in a good way,
• its descendants ψ_{jk} form a complete orthogonal system,
then possibly only a few of the functions will do to approximate the processed signal in question. We talk about so-called wavelets. We have no space for details here; it is an extraordinarily vital field of research as well as the basis of commercial applications. Interested readers are referred to specialized literature.

Let us remark that in practice only discrete versions of our objects are used, i.e., the values of all the functions ψ_{jk} are only enlisted on a discrete (very large) set of points and are also orthogonal in this sense. The JPEG2000 standard is a good example: it uses this technique as a tool for professional compression of visual data in the film industry; so does the format DjVu for compressed publications. One of the first wavelets was created by Ingrid Daubechies. In the picture below, there are the so-called Daubechies mother wavelet D₄(x) and its daughter D₄(2x − 1).

The series recalled in exercise 7.16 is

π/2 − (4/π) Σ_{n=1}^∞ cos([2n−1]x)/(2n−1)²,

which was calculated for the function g(x) = |x|, x ∈ [−π, π). Since this function is continuous on [−π, π) and |−π| = |π|, we even know that

|x| = π/2 − (4/π) Σ_{n=1}^∞ cos([2n−1]x)/(2n−1)², x ∈ [−π, π].

Substituting x = 0 gives

0 = π/2 − (4/π) Σ_{n=1}^∞ 1/(2n−1)², i.e., Σ_{n=1}^∞ 1/(2n−1)² = π²/8. □

7.17. Sum up the series Σ_{n=1}^∞ 1/n⁴ and Σ_{n=1}^∞ (−1)^{n+1}/n⁴.

Solution. First, let us remind that the values of the series Σ_{n=1}^∞ 1/n² = π²/6 and Σ_{n=1}^∞ (−1)^{n+1}/n² = π²/12 have already been determined. In this exercise, we hint at the procedure by which the series Σ 1/n^k for a general even k ∈ ℕ can be calculated. We use the identities

(7.11) Σ_{n=1}^∞ sin(nx)/n = (π − x)/2, x ∈ (0, 2π),
(7.12) x² = 4π²/3 + 4 Σ_{n=1}^∞ cos(nx)/n² − 4π Σ_{n=1}^∞ sin(nx)/n, x ∈ (0, 2π),

which follow from the constructions of the Fourier series of the functions g(x) = x and g(x) = x², respectively, on the interval [0, 2π). Substituting (||7.11||) into (||7.12||) gives

Σ_{n=1}^∞ cos(nx)/n² = (3x² − 6πx + 2π²)/12, x ∈ (0, 2π).

Mere substitution then proves the validity of this formula at the marginal points x = 0, x = 2π as well. The left-hand series is apparently bounded from above by Σ_{n=1}^∞ 1/n², thus it converges absolutely and uniformly on [0, 2π]. Therefore, it can be integrated term by term, starting from Σ_{n=1}^∞ sin(nx)/n³.

The function D₄ is not described by any explicit formula of classical analysis. The function is given only by the values it takes on a finite (yet very large) set of input values. It is chosen so that it has, in its various parts, all the properties needed for graphical data: both slow and fast increase, sharp turns at both extrema, and so on. The complexity of the construction lies, of course, in the condition that the system obtained by the mentioned construction be orthogonal!

2. Metric spaces

In this part of the chapter, we will focus on the concepts of distance and convergence in a more abstract way. We will take advantage of this presently when we prove the already mentioned properties of Fourier series, and we will also return to these concepts in miscellaneous contexts. So we can consider the subsequent pages to be a very useful (and hopefully manageable) trip into the world of mathematics for the competent or courageous.

7.12. Metrics and norms. When we derived the technique of Fourier series, we freely talked about distance on a space of functions. Now we will stop by this concept and explain it thoroughly. The Euclidean distance on the vector spaces ℝⁿ satisfies the following three abstract requirements, and so does the L₁-distance d(f, g) = ‖f − g‖₁ on the space of continuous and absolutely integrable functions. For the oncoming paragraphs, let us try to keep these two examples in our minds.

Axioms of a metric and a norm

A set X together with a mapping d : X × X → ℝ such that for all x, y, z ∈ X the following conditions are satisfied:
(7.2) d(x, y) ≥ 0, and d(x, y) = 0 iff x = y,
(7.3) d(x, y) = d(y, x),
(7.4) d(x, z) ≤ d(x, y) + d(y, z),
is called a metric space. The mapping d is the metric on X.

If X is a vector space over ℝ and ‖·‖ : X → ℝ is a function satisfying
(7.5) ‖x‖ ≥ 0, and ‖x‖ = 0 iff x = 0,
(7.6) ‖λx‖ = |λ| ‖x‖ for all scalars λ,
(7.7) ‖x + y‖ ≤ ‖x‖ + ‖y‖,
then the function ‖·‖ is called a norm on X, and the space X is then a normed vector space. A norm always gives rise to the metric d(x, y) = ‖x − y‖.
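For a concrete feeling of these axioms on a space of functions, the following small sketch (our own illustration, with numerical quadrature standing in for the integrals) computes the L₁- and L₂-distances of a few functions on [0, 1] and spot-checks the triangle inequality (7.4):

    import numpy as np
    from scipy.integrate import quad

    def d1(f, g):
        """L1-distance of f and g on [0, 1]."""
        return quad(lambda x: abs(f(x) - g(x)), 0.0, 1.0)[0]

    def d2(f, g):
        """L2-distance of f and g on [0, 1]."""
        return np.sqrt(quad(lambda x: (f(x) - g(x))**2, 0.0, 1.0)[0])

    f = lambda x: x**2
    g = lambda x: np.sin(x)
    h = lambda x: np.exp(-x)

    for d in (d1, d2):
        # triangle inequality: d(f, h) <= d(f, g) + d(g, h)
        print(d(f, h), "<=", d(f, g) + d(g, h))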
Similarly, further integration gives
$$\sum_{n=1}^{\infty}\frac{1-\cos(nx)}{n^4} = \int_0^x \frac{t^3-3\pi t^2+2\pi^2 t}{12}\,dt = \frac{x^2\,(x-2\pi)^2}{48}, \qquad x\in[0,2\pi].$$
In the limit, we would get just the series $\sum_{n=1}^{\infty}\frac1{n^4}$; thus it should suffice to integrate the right-hand function twice and calculate one limit. However, the integration of the right-hand side then leads to a non-elementary integral, i.e., the antiderivative cannot be expressed in terms of the elementary functions we usually work with. Substituting $x=\pi$ instead yields
$$\sum_{n=1}^{\infty}\frac{1-(-1)^n}{n^4} = \frac{\pi^4}{48}, \quad\text{i.e.}\quad \sum_{n=1}^{\infty}\frac{1}{(2n-1)^4} = \frac{\pi^4}{96},$$
and since $\sum_n \frac1{n^4} = \sum_n \frac1{(2n-1)^4} + \frac1{16}\sum_n\frac1{n^4}$, we conclude
$$\sum_{n=1}^{\infty}\frac{1}{n^4} = \frac{16}{15}\cdot\frac{\pi^4}{96} = \frac{\pi^4}{90}, \qquad \sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n^4} = \Big(1-\frac{2}{16}\Big)\,\frac{\pi^4}{90} = \frac{7\pi^4}{720}.¹ \qquad □$$
¹The function $\zeta(p) = \sum_{n=1}^{\infty} n^{-p}$ is called the Riemann zeta function.

2. Metric spaces

In this part of the chapter, we will focus on the concepts of distance and convergence in a more abstract way. We will take advantage of this presently, when we prove the already mentioned properties of Fourier series, and we will also return to these concepts in miscellaneous contexts. So we can consider the subsequent pages to be a very useful (and hopefully manageable) trip into the world of mathematics for the competent or courageous.

7.12. Metrics and norms. When we derived the technique of Fourier series, we freely talked about the distance on a space of functions. Now we pause at this concept and explain it thoroughly. The Euclidean distance on the vector spaces $\mathbb R^n$ satisfies the following three abstract requirements. (So does the $L_1$-distance $d(f,g)=\|f-g\|_1$ on the space of continuous and absolutely integrable functions.) For the oncoming paragraphs, let us try to keep these two examples in mind.

___ Axioms of a metric and a norm ___
A set $X$ together with a mapping $d : X\times X \to \mathbb R$ such that for all $x,y,z\in X$ the following conditions are satisfied:
$$(7.2)\quad d(x,y) \ge 0, \text{ and } d(x,y)=0 \text{ iff } x=y,$$
$$(7.3)\quad d(x,y) = d(y,x),$$
$$(7.4)\quad d(x,z) \le d(x,y) + d(y,z),$$
is called a metric space. The mapping $d$ is the metric on $X$. If $X$ is a vector space over $\mathbb R$ and $\|\ \| : X\to\mathbb R$ is a function satisfying
$$(7.5)\quad \|x\| \ge 0, \text{ and } \|x\|=0 \text{ iff } x=0,$$
$$(7.6)\quad \|\lambda x\| = |\lambda|\,\|x\| \text{ for all scalars } \lambda,$$
$$(7.7)\quad \|x+y\| \le \|x\| + \|y\|,$$
then the function $\|\ \|$ is called a norm on $X$, and the space $X$ is then a normed vector space. A norm always gives the metric $d(x,y) = \|x-y\|$.

Let us consider an arbitrary sequence of elements $x_0, x_1, \dots$ lying in $X$ such that, for any fixed positive real number $\varepsilon$, it holds for all but finitely many terms $x_i$ of the sequence that, for all but finitely many terms $x_j$, $d(x_i,x_j) < \varepsilon$. In other words, for any fixed $\varepsilon > 0$, there is an index $N$ such that the above inequality holds for all $i,j > N$; i.e., the elements of the sequence are eventually arbitrarily close to each other. Such a sequence is called a Cauchy sequence.

Just as in the case of the real or complex numbers, we would like every Cauchy sequence of terms $x_i\in X$ to converge to some value $x$ in the following sense:

___ Convergent sequences ___
If a sequence of elements $x_0, x_1, \dots \in X$, a fixed element $x\in X$, and every positive real number $\varepsilon$ are such that, for all but finitely many $i$ (depending on the choice of $\varepsilon$), it holds that $d(x_i,x) < \varepsilon$, we say that the sequence $x_i$, $i=0,1,\dots$, converges to the element $x$, which is called the limit of the sequence in the metric space $X$.

Thanks to the triangle inequality, we get that for each pair of terms $x_i, x_j$ of a convergent sequence with sufficiently large indices (the notation is taken from the definition above),
$$d(x_i,x_j) \le d(x_i,x) + d(x,x_j) < 2\varepsilon.$$
Therefore, every convergent sequence is a Cauchy sequence. Metric spaces where the converse is true as well (i.e., every Cauchy sequence is convergent) are called complete metric spaces.

7.14. Topology, convergence, and continuity. Just as in the case of the real numbers, we can formulate convergence in terms of "open neighborhoods".

7.18. Using Parseval's identity for Fourier's orthogonal system, verify that
$$\sum_{n=1}^{\infty}\frac{1}{(2n-1)^4} = \frac{\pi^4}{96}.$$
Solution. We have already summed this series up (see (7.13)). Now we reveal that such number series can be summed up even more easily thanks to Fourier series. However, this way is conditioned by the knowledge of a good deal of Fourier series and can be a bit more complicated for the reader. (We recommend comparing the solutions of this exercise and the previous one.) It is imperative to choose an appropriate Fourier series. For instance, let us take the Fourier series
$$|x| = \frac{\pi}{2} - \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{\cos([2n-1]x)}{(2n-1)^2},$$
which we obtained for the function $g(x)=|x|$, $x\in[-\pi,\pi)$, and which has already been used once to determine the value of a series. Parseval's identity says for it that
$$\frac1\pi\int_{-\pi}^{\pi} x^2\,dx = \frac{\pi^2}{2} + \frac{16}{\pi^2}\sum_{n=1}^{\infty}\frac{1}{(2n-1)^4}.$$
Since $\frac1\pi\int_{-\pi}^{\pi}x^2\,dx = \frac{2\pi^2}{3}$, it follows that
$$\sum_{n=1}^{\infty}\frac{1}{(2n-1)^4} = \Big(\frac{2\pi^2}{3}-\frac{\pi^2}{2}\Big)\,\frac{\pi^2}{16} = \frac{\pi^4}{96}. \qquad □$$

Now we will illustrate how Fourier series can be applied in the theory of differential equations. For the sake of simplicity, we consider only the non-homogeneous (compare to (7.2)) differential equation
$$(7.14)\qquad y'' + a^2 y = f(x)$$
with the unknown $y$ in the variable $x\in\mathbb R$, with a periodic, continuously differentiable function $f:\mathbb R\to\mathbb R$ on the right-hand side and a constant $a>0$. Let $T>0$ be the prime period of the function $f$ and let its Fourier series on $[-T/2,T/2]$ be known, i.e.,
$$(7.15)\qquad f(x) = \frac{A_0}{2} + \sum_{n=1}^{\infty}\Big(A_n\cos\frac{2\pi n x}{T} + B_n\sin\frac{2\pi n x}{T}\Big), \qquad x\in\mathbb R.$$

7.19. Prove that if the equation (7.14) has a periodic solution on $\mathbb R$, then the period of this solution is also a period of the function $f$. Further, prove that the equation (7.14) has a unique periodic solution with period $T$ if and only if
$$(7.16)\qquad a \ne \frac{2\pi n}{T} \quad\text{for every } n\in\mathbb N.$$

Solution. Let a function $y=g(x)$, $x\in\mathbb R$, be a solution of the equation (7.14) and let it have period $p>0$. In order to substitute the function $g$ into a second-order differential equation, its second derivative $g''$ must exist. Since the functions $g, g', g'', \dots$ share the same period, the function $g''(x)+a^2 g(x) = f(x)$ is also periodic with period $p$. In other words, the function $f$ is periodic as a linear combination of functions with period $p$. Thus we have proved the first statement; in particular, $p = lT$ for a certain $l\in\mathbb N$.

___ Open and closed sets ___
The open $\varepsilon$-neighborhood of an element $x$ in a metric space $X$ (or just $\varepsilon$-neighborhood, in short) is the set
$$O_\varepsilon(x) = \{\,y\in X;\ d(x,y) < \varepsilon\,\}.$$
A subset $U\subseteq X$ is open iff, together with every point $x$ it contains, it contains some $\varepsilon$-neighborhood of $x$ as well. A subset $W\subseteq X$ is closed iff its complement $X\setminus W$ is an open set.

Instead of an $\varepsilon$-neighborhood, we also talk about an (open) $\varepsilon$-ball centered at $x$. In the case of a normed space, we can make do with $\varepsilon$-balls centered at zero: those added to a given element $x$ give just its $\varepsilon$-neighborhood.

The limit points of a subset $A\subseteq X$ are those elements $x\in X$ for which there is a sequence of points of $A$, none of them equal to $x$, converging to $x$. We can easily see that a set is closed if and only if it contains all of its limit points. Indeed, it follows straight from the definition that a set $A$ is closed if and only if for every point $x\notin A$ there is some $\varepsilon>0$ such that the whole $\varepsilon$-neighborhood $O_\varepsilon(x)$ has an empty intersection with $A$. Thus if $A$ were closed and $x$ were a limit point of $A$ not belonging to it, then every $\varepsilon$-neighborhood of $x$ would contain infinitely many points of the set $A$, which is a contradiction. On the other hand, let us suppose that $A$ contains all of its limit points, and let us consider $x\in X\setminus A$.
If in every $\varepsilon$-neighborhood of the point $x$ there were a point $x_\varepsilon\in A$, then the choices $\varepsilon = 1/n$ would give us a sequence of points $x_n\in A$ converging to $x$. But then the point $x$ would have to be a limit point, thus lying in $A$, which again leads to a contradiction.

For every subset $A$ of a metric space $X$, we define its interior as the set of those points of $A$ which belong to $A$ together with some neighborhood of theirs. Further, we define the closure $\bar A$ of a set $A$ as the union of the original set $A$ with the set of all of $A$'s limit points.

As easily as in the case of the real numbers, we can verify that the intersection of any system of closed sets, as well as the union of any finite system of closed sets, results in a closed set again. It is the other way round with open sets: any union of open sets is an open set, but in general only a finite intersection of open sets is again an open set. Prove these propositions by yourselves in detail! We also advise the reader to verify that the interior of a set $A$ equals the union of all open sets contained in $A$, while the closure of $A$ is the intersection of all closed sets which contain $A$.

Closed and open sets are the essential concepts of the mathematical discipline called topology. Without going into deeper connections, we have just made ourselves familiar with the topology of metric spaces. The concept of convergence can now be reformulated as follows: a sequence of elements $x_i$, $i=0,1,\dots$, in a metric space $X$ converges to $x\in X$ iff, for every open set $U$ containing $x$, all but finitely many points of our sequence lie in $U$.

Just as in the case of the real numbers, we can define continuous mappings between metric spaces: a mapping $f : W\to Z$ is continuous iff the preimage $f^{-1}(V)$ of every open set $V\subseteq Z$ is an open set in $W$. Of course, this means nothing else than that for every $z=f(x)\in Z$ and every positive real number $\varepsilon$, there is a positive real number $\delta$ such that for all elements $y\in W$ with distance $d_W(x,y)<\delta$, it also holds that $d_Z(z,f(y))<\varepsilon$.

Now, suppose that a function $y=g(x)$, $x\in\mathbb R$, is a periodic solution of the equation (7.14) with period $T$ and that it is expressed by a Fourier series as follows:
$$(7.17)\qquad g(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\big[a_n\cos(\omega n x) + b_n\sin(\omega n x)\big], \qquad x\in\mathbb R,$$
where $\omega = 2\pi/T$. If $g$ satisfies the equation (7.14), it must have a continuous second derivative on $\mathbb R$. Therefore,
$$g'(x) = \sum_{n=1}^{\infty}\big[\omega n\,b_n\cos(\omega n x) - \omega n\,a_n\sin(\omega n x)\big], \qquad x\in\mathbb R,$$
$$(7.18)\qquad g''(x) = \sum_{n=1}^{\infty}\big[-\omega^2 n^2 a_n\cos(\omega n x) - \omega^2 n^2 b_n\sin(\omega n x)\big], \qquad x\in\mathbb R.$$
Substituting (7.15), (7.17), and (7.18) into (7.14) yields
$$\frac{a^2 a_0}{2} + \sum_{n=1}^{\infty}\big[(-\omega^2 n^2 + a^2)\,a_n\cos(n\omega x) + (-\omega^2 n^2 + a^2)\,b_n\sin(n\omega x)\big] = \frac{A_0}{2} + \sum_{n=1}^{\infty}\big[A_n\cos(n\omega x) + B_n\sin(n\omega x)\big].$$
Comparing the coefficients, we get
$$\frac{a^2 a_0}{2} = \frac{A_0}{2},\ \text{ i.e. } a_0 = \frac{A_0}{a^2}, \quad\text{and}$$
$$(7.20)\qquad (-\omega^2 n^2 + a^2)\,a_n = A_n, \qquad (-\omega^2 n^2 + a^2)\,b_n = B_n, \qquad n\in\mathbb N.$$
We can see that there is exactly one pair of sequences $\{a_n\}_{n\in\mathbb N\cup\{0\}}$, $\{b_n\}_{n\in\mathbb N}$ satisfying these conditions if and only if
$$-\omega^2 n^2 + a^2 = -\Big(\frac{2\pi n}{T}\Big)^2 + a^2 \ne 0 \quad\text{for every } n\in\mathbb N,$$
i.e., if (7.16) holds. In this case, the only solution of (7.14) with period $T$ is determined by the only solution
$$(7.21)\qquad a_n = \frac{A_n}{-\omega^2 n^2 + a^2}, \qquad b_n = \frac{B_n}{-\omega^2 n^2 + a^2}$$
of the system (7.20). Let us emphasize that we have silently utilized the uniform convergence of the series in (7.18). This follows, among other things, from deeper results of the general theory of Fourier series to which we will not pay further attention. □

7.20. Using the solution of the previous problem, find all $2\pi$-periodic solutions of the differential equation
$$y'' + 2y = \sum_{n=1}^{\infty}\frac{\sin(nx)}{n^2}, \qquad x\in\mathbb R.$$
Solution. The equation is in the form of (7.14) for $a=\sqrt2$ and the continuously differentiable function
$$f(x) = \sum_{n=1}^{\infty}\frac{\sin(nx)}{n^2}$$
with prime period $T=2\pi$. According to problem 7.19, the condition $\sqrt2\notin\mathbb N$ implies that there is exactly one $2\pi$-periodic solution. If we look for it as the value of the series
$$\frac{a_0}{2} + \sum_{n=1}^{\infty}\big[a_n\cos(nx) + b_n\sin(nx)\big], \qquad x\in\mathbb R,$$
we further know that (see (7.19) and (7.21))
$$a_0 = a_n = 0, \qquad b_n = \frac{1}{n^2\,(2-n^2)}, \qquad n\in\mathbb N.$$
Thus, the given equation has the unique $2\pi$-periodic solution
$$y = \sum_{n=1}^{\infty}\frac{\sin(nx)}{n^2\,(2-n^2)}, \qquad x\in\mathbb R. \qquad □$$

Again, similarly as in the case of real-valued functions, a mapping between metric spaces is continuous if and only if it respects the convergence of sequences.

7.15. $L_p$-norms. Now we have the general tools with which we can look at examples of metric spaces created by finite-dimensional vectors or by functions. We will restrict ourselves to an extraordinarily useful class of norms. We begin with the real or complex finite-dimensional vector spaces $\mathbb R^n$ and $\mathbb C^n$; for a fixed real number $p\ge 1$ and any vector $z=(z_1,\dots,z_n)$, we define
$$\|z\|_p = \Big(\sum_{i=1}^{n}|z_i|^p\Big)^{1/p}.$$
We are going to prove that this indeed defines a norm. The first two properties from the definition are clear. It remains to prove the triangle inequality. For that purpose, we will use the so-called Hölder inequality:

Lemma. For a fixed real number $p>1$ and every pair of $n$-tuples of non-negative real numbers $x_i$ and $y_i$, it holds that
$$\sum_{i=1}^{n} x_i y_i \le \Big(\sum_{i=1}^{n}x_i^p\Big)^{1/p}\Big(\sum_{i=1}^{n}y_i^q\Big)^{1/q},$$
where $1/q = 1 - 1/p$.

Proof. Let us denote by $X$ and $Y$ the two expressions in the product on the right-hand side of the inequality to be proved. If all of the numbers $x_i$ or all of the numbers $y_i$ are zero, then the statement clearly holds. Therefore, let us suppose that $X\ne 0$ and $Y\ne 0$. Hölder's inequality is a straightforward corollary of the convexity of the exponential function. Let us define the numbers $v_k$ and $w_k$ so that
$$x_k = X\,e^{v_k/p}, \qquad y_k = Y\,e^{w_k/q}.$$
Since $1/p + 1/q = 1$, we can consider the affine combination of the values $v_k$ and $w_k$, and thanks to the mentioned convexity we obtain
$$e^{v_k/p + w_k/q} \le \frac1p\,e^{v_k} + \frac1q\,e^{w_k}.$$
Hence we can calculate straightaway that
$$\frac{1}{XY}\,x_k y_k \le \frac1p\,\frac{x_k^p}{X^p} + \frac1q\,\frac{y_k^q}{Y^q},$$
and summing over $k=1,\dots,n$,
$$\frac{1}{XY}\sum_{k=1}^{n} x_k y_k \le \frac{1}{p\,X^p}\sum_{k=1}^{n}x_k^p + \frac{1}{q\,Y^q}\sum_{k=1}^{n}y_k^q.$$
However, the particular sums on the right-hand side give exactly $X^p$ and $Y^q$, so the whole expression is equal to $1/p + 1/q = 1$. Multiplying this inequality by the number $XY$ finishes the proof. □

C. Metric spaces

7.21. Show that the definition of a metric as a function $d$ defined on $X\times X$ for a non-empty set $X$ and satisfying
$$(7.22)\qquad d(x,y) = 0 \text{ if and only if } x=y, \qquad x,y\in X,$$
$$(7.23)\qquad d(x,z) \le d(y,x) + d(y,z), \qquad x,y,z\in X,$$
is equivalent to the definition given in the theoretical part, in paragraph 7.12.

Solution. Ostensibly, this definition lays fewer requirements on the metric than the definition from the theoretical part. The definitions are equivalent iff the conditions (7.22), (7.23) imply
$$(7.24)\qquad d(y,x) \ge 0, \qquad x,y\in X,$$
$$(7.25)\qquad d(x,y) = d(y,x), \qquad x,y\in X.$$
However, if we set $x=z$ in (7.23), we get (7.24) from (7.22). Similarly, the choice $y=z$ in (7.23), using (7.22), implies that $d(x,y)\le d(y,x)$ for all points $x,y\in X$. Interchanging the variables $x$ and $y$ then gives $d(y,x)\le d(x,y)$, i.e. (7.25). Thus, we have proved that the definitions are equivalent.
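Returning for a moment to problem 7.20: the series solution found there is easy to validate numerically, because the residual of the truncated series vanishes term by term. A minimal sketch in Python (truncation point and tolerance are ours):

```python
import numpy as np

N = 50
xs = np.linspace(0, 2*np.pi, 1000)

# candidate periodic solution of y'' + 2y = sum_n sin(n x)/n^2
y   = sum(np.sin(n*xs) / (n**2 * (2 - n**2)) for n in range(1, N+1))
ypp = sum(-n**2 * np.sin(n*xs) / (n**2 * (2 - n**2)) for n in range(1, N+1))
f   = sum(np.sin(n*xs) / n**2 for n in range(1, N+1))

# term by term: (-n^2 + 2) b_n = 1/n^2, so the residual is machine zero
assert np.max(np.abs(ypp + 2*y - f)) < 1e-12
```

The check mirrors exactly the coefficient comparison (7.20): each mode $\sin(nx)$ satisfies the equation separately.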
Many more ways of defining a metric can be found in the literature. Besides those, one can find many definitions which are a bit different and lead to objects other than metrics (the most important ones being pseudometrics, ultrametrics, and semimetrics). The first axiomatic definition of a "traditional" metric was given by Maurice Fréchet in 1906. However, the name of the metric comes from Felix Hausdorff, who used this word in his work from 1914. □

7.22. Consider the power set (the set of all subsets) of a given finite set. Determine whether the mapping defined for all considered subsets $X, Y$ by
(a) $d_1(X,Y) := |(X\cup Y)\setminus(X\cap Y)|$;
(b) $d_2(X,Y) := \dfrac{|(X\cup Y)\setminus(X\cap Y)|}{|X\cup Y|}$ for $X\cup Y\ne\emptyset$, and $d_2(\emptyset,\emptyset):=0$,
is a metric. (By $|X|$, we mean the number of elements of a set $X$.)

Solution. We will omit verifications of the first and second conditions from the definition of a metric in exercises on deciding whether a particular mapping is a metric. The reader should immediately see that both $d_1$ and $d_2$ satisfy them. Therefore, we analyze the triangle inequality only.

The case (a). For any sets $X, Y, Z$, we have
$$(7.26)\qquad (X\cup Z)\setminus(X\cap Z) \subseteq \big[(X\cup Y)\setminus(X\cap Y)\big] \cup \big[(Y\cup Z)\setminus(Y\cap Z)\big],$$
since if $x\in(X\cup Z)\setminus(X\cap Z)$, then exactly one of the following occurs: $x\in X$ and $x\notin Z$, or $x\notin X$ and $x\in Z$. Thus, it makes sense to consider these four possibilities:
$$x\in X,\ x\notin Z,\ x\in Y; \qquad x\in X,\ x\notin Z,\ x\notin Y;$$
$$x\notin X,\ x\in Z,\ x\in Y; \qquad x\notin X,\ x\in Z,\ x\notin Y,$$
which may occur for $x\in(X\cup Z)\setminus(X\cap Z)$. However, in any of these four cases, $x$ belongs to exactly one of the sets $(X\cup Y)\setminus(X\cap Y)$, $(Y\cup Z)\setminus(Y\cap Z)$. Thus we have obtained the inclusion (7.26), whence the wanted triangle inequality follows:
$$d_1(X,Z) = |(X\cup Z)\setminus(X\cap Z)| \le \big|\big[(X\cup Y)\setminus(X\cap Y)\big]\cup\big[(Y\cup Z)\setminus(Y\cap Z)\big]\big|$$
$$\le |(X\cup Y)\setminus(X\cap Y)| + |(Y\cup Z)\setminus(Y\cap Z)| = d_1(X,Y) + d_1(Y,Z).$$
The case (b). We can proceed similarly to the case of $d_1$.

___ Minkowski inequality ___
For every $p\ge 1$ and all $n$-tuples of non-negative real numbers $(x_1,\dots,x_n)$ and $(y_1,\dots,y_n)$, it holds that
$$\Big(\sum_{i=1}^{n}(x_i+y_i)^p\Big)^{1/p} \le \Big(\sum_{i=1}^{n}x_i^p\Big)^{1/p} + \Big(\sum_{i=1}^{n}y_i^p\Big)^{1/p}.$$

To verify this practical inequality, we can use the following trick, invoking Hölder's inequality. We surely have (notice that $p>1$)
$$\sum_{i=1}^{n} x_i\,(x_i+y_i)^{p-1} \le \Big(\sum_{i=1}^{n}x_i^p\Big)^{1/p}\Big(\sum_{i=1}^{n}(x_i+y_i)^{(p-1)q}\Big)^{1/q},$$
as well as
$$\sum_{i=1}^{n} y_i\,(x_i+y_i)^{p-1} \le \Big(\sum_{i=1}^{n}y_i^p\Big)^{1/p}\Big(\sum_{i=1}^{n}(x_i+y_i)^{(p-1)q}\Big)^{1/q}.$$
Summing up the last two inequalities and taking into account that $p+q = pq$, and so $(p-1)q = p$, we arrive at
$$\frac{\sum_{i=1}^{n}(x_i+y_i)^p}{\big(\sum_{i=1}^{n}(x_i+y_i)^p\big)^{1/q}} \le \Big(\sum_{i=1}^{n}x_i^p\Big)^{1/p} + \Big(\sum_{i=1}^{n}y_i^p\Big)^{1/p}.$$
However, $1 - 1/q = 1/p$, so this is just the Minkowski inequality which we wanted to prove. Thus we have verified that on every finite-dimensional real or complex vector space there is a class of norms $\|\ \|_p$ for all $p\ge 1$. Besides that, we further set
$$\|z\|_\infty = \max\{\,|z_i|,\ i=1,\dots,n\,\},$$
which, apparently, is a norm too. We can notice that Hölder's inequality can be written, in the context of these norms, for all $x=(x_1,\dots,x_n)$, $y=(y_1,\dots,y_n)$ as
$$\sum_{i=1}^{n}|x_i y_i| \le \|x\|_p\,\|y\|_q$$
for all $p\ge 1$ and $q$ satisfying $1/p + 1/q = 1$, where for $p=1$ we set $q=\infty$.

7.16. $L_p$-norms for sequences and functions. Now we can easily define norms on suitable infinite-dimensional vector spaces as well. Let us begin with sequences. The vector space $\ell_p$, $p\ge 1$, is the set of all sequences of real or complex numbers $x_0, x_1, \dots$ such that
$$\sum_{i=0}^{\infty}|x_i|^p < \infty.$$
All sequences with bounded absolute values of their terms create the space $\ell_\infty$. Taking the limit as $n$ goes to $\infty$, we immediately get from the Minkowski inequality that
$$\|x\|_p = \Big(\sum_{i=0}^{\infty}|x_i|^p\Big)^{1/p}$$
is a norm on $\ell_p$. Similarly, we set $\|x\|_\infty = \sup\{\,|x_i|,\ i=0,1,\dots\,\}$ on $\ell_\infty$, again obtaining a norm.
Let us denote by $X'$ the complement of a set $X$. The equalities
$$(X\cup Y)\setminus(X\cap Y) = (X\cap Y'\cap Z)\cup(X\cap Y'\cap Z')\cup(X'\cap Y\cap Z)\cup(X'\cap Y\cap Z'),$$
$$(Y\cup Z)\setminus(Y\cap Z) = (X\cap Y\cap Z')\cup(X\cap Y'\cap Z)\cup(X'\cap Y\cap Z')\cup(X'\cap Y'\cap Z),$$
$$\big[(X\cup Z)\setminus(X\cap Z)\big]\cup\big[Y\setminus(X\cup Z)\big] = (X\cap Y\cap Z')\cup(X\cap Y'\cap Z')\cup(X'\cap Y\cap Z)\cup(X'\cap Y'\cap Z)\cup(X'\cap Y\cap Z'),$$
which, again, can be proved by listing the several possibilities, imply a stronger form of (7.26):
$$\big[(X\cup Z)\setminus(X\cap Z)\big]\cup\big[Y\setminus(X\cup Z)\big] \subseteq \big[(X\cup Y)\setminus(X\cap Y)\big]\cup\big[(Y\cup Z)\setminus(Y\cap Z)\big].$$
Further, we invoke the inequality
$$\frac{|(X\cup Z)\setminus(X\cap Z)|}{|X\cup Z|} \le \frac{\big|\big[(X\cup Z)\setminus(X\cap Z)\big]\cup\big[Y\setminus(X\cup Z)\big]\big|}{\big|X\cup Z\cup\big[Y\setminus(X\cup Z)\big]\big|},$$
which is based upon calculations with non-negative numbers only, since it holds in general that
$$\frac{x}{z} \le \frac{x+y}{z+y}, \qquad y\ge 0,\ z>0,\ x\in[0,z].$$
Since, apparently, $X\cup Z\cup\big[Y\setminus(X\cup Z)\big] = X\cup Y\cup Z$, we get
$$d_2(X,Z) = \frac{|(X\cup Z)\setminus(X\cap Z)|}{|X\cup Z|} \le \frac{\big|\big[(X\cup Y)\setminus(X\cap Y)\big]\cup\big[(Y\cup Z)\setminus(Y\cap Z)\big]\big|}{|X\cup Y\cup Z|}$$
$$\le \frac{|(X\cup Y)\setminus(X\cap Y)|}{|X\cup Y|} + \frac{|(Y\cup Z)\setminus(Y\cap Z)|}{|Y\cup Z|} = d_2(X,Y) + d_2(Y,Z),$$
if $X\cup Z\ne\emptyset$ and $Y\ne\emptyset$. However, for $X=Z=\emptyset$ or $Y=\emptyset$, the triangle inequality clearly holds as well. Therefore, both mappings are metrics. The metric $d_1$ has a mere auxiliary use here. On the other hand, the metric $d_2$ has wider applications and is also known as the Jaccard metric. It is named after the biologist Paul Jaccard who, in 1908, described the measure of similarity of insect populations using the function $1-d_2$. □

7.23. Let
$$d(x,y) = \frac{|x-y|}{1+|x-y|}, \qquad x,y\in\mathbb R.$$
Prove that $d$ is a metric on $\mathbb R$.

Solution. Again, we prove the triangle inequality only (the rest is clear). Let us introduce the helping increasing function
$$(7.27)\qquad f(t) := \frac{t}{1+t}, \qquad t\ge 0.$$

Eventually, let us get back to the spaces of functions $\mathcal S^0[a,b]$ on a finite interval $[a,b]$, or $\mathcal S^0_c[a,b]$ on an unbounded interval. We have already met the norm $\|\ \|_1$. However, for every $p\ge 1$ and all functions in such a space, the Riemann integrals $\int_a^b|f(x)|^p\,dx$ surely exist, so we can define
$$\|f\|_p = \Big(\int_a^b|f(x)|^p\,dx\Big)^{1/p}.$$
The Riemann integral was defined in terms of limits of the so-called Riemann sums corresponding to partitions $\Xi$ with representatives $\xi_i$. In our case, those are the finite sums
$$S_{\Xi,\xi} = \sum_{i=1}^{n}|f(\xi_i)|^p\,(x_i - x_{i-1}).$$
Hölder's inequality applied to the Riemann sums of a product of two functions $f(x)$ and $g(x)$ gives
$$\sum_{i=1}^{n} f(\xi_i)g(\xi_i)\,(x_i-x_{i-1}) \le \Big(\sum_{i=1}^{n}f(\xi_i)^p\,(x_i-x_{i-1})\Big)^{1/p}\Big(\sum_{i=1}^{n}g(\xi_i)^q\,(x_i-x_{i-1})\Big)^{1/q},$$
where on the right-hand side there is just the product of the Riemann sums for the integrals defining $\|f\|_p$ and $\|g\|_q$. Passing to the limits, we thus verify the Hölder inequality for integrals:
$$\int_a^b f(x)g(x)\,dx \le \Big(\int_a^b f(x)^p\,dx\Big)^{1/p}\Big(\int_a^b g(x)^q\,dx\Big)^{1/q},$$
which is valid for all non-negative real-valued functions $f$ and $g$ in our space of piecewise continuous functions with compact support. In just the same way as in the previous paragraph, we can derive the integral variant of the Minkowski inequality from Hölder's inequality:
$$\|f+g\|_p \le \|f\|_p + \|g\|_p.$$
Thus $\|\ \|_p$ is indeed a norm on the vector space of all continuous functions having compact support, for all $p\ge 1$ (we verified this for $p=1$ long ago). We will use the word "norm" in this context for the entire space $\mathcal S^0[a,b]$ of piecewise continuous functions as well; however, we should bear in mind that we then have to identify functions which differ only by their values at points of discontinuity. Among these norms, the case $p=2$ is exceptional: we have realized it by the scalar product, and in this case we could have derived the triangle inequality much more easily using the Schwarz inequality.
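Both integral inequalities are easy to observe numerically on Riemann sums, exactly in the spirit of the derivation above. A small sketch in Python (the test functions and exponents are an arbitrary choice of ours):

```python
import numpy as np

xs = np.linspace(0, 1, 100001)
dx = xs[1] - xs[0]
f = np.exp(xs)             # any non-negative piecewise continuous functions do
g = np.abs(np.sin(7 * xs))

def lp_norm(h, p):
    # Riemann-sum approximation of the L_p norm on [0, 1]
    return (np.sum(np.abs(h)**p) * dx) ** (1 / p)

p, q = 3.0, 1.5            # conjugate exponents: 1/p + 1/q = 1
holder    = np.sum(f * g) * dx <= lp_norm(f, p) * lp_norm(g, q) + 1e-12
minkowski = lp_norm(f + g, p) <= lp_norm(f, p) + lp_norm(g, p) + 1e-12
assert holder and minkowski
```

Note that the discrete sums themselves satisfy the finite-dimensional Hölder and Minkowski inequalities, so the check passes for every grid, not only in the limit.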
For the functions from $\mathcal S^0[a,b]$, we can also define an analogy of the $L_\infty$-norm on $n$-dimensional vectors. Since our functions are piecewise continuous, they always have suprema of their absolute values on a finite closed interval, so we can set
$$\|f\|_\infty = \sup\{\,|f(x)|,\ x\in[a,b]\,\}$$
for such a function $f$. Let us notice that if we consider both one-sided limits (which always exist by our definition) as well as the value of the function itself to be values of $f$ at points of discontinuity, we can work with maxima instead of suprema. It is apparent again that this is a norm (except for the usual problems with values at discontinuity points).

The fact that $f$ from (7.27) is increasing need not be verified by calculating the first derivative. It can be seen from the simple rearrangement
$$f(s) - f(r) = \frac{s}{1+s} - \frac{r}{1+r} = \frac{s-r}{(1+s)(1+r)} > 0, \qquad s > r \ge 0.$$
Therefore,
$$d(x,z) = \frac{|x-z|}{1+|x-z|} \le \frac{|x-y|+|y-z|}{1+|x-y|+|y-z|} = \frac{|x-y|}{1+|x-y|+|y-z|} + \frac{|y-z|}{1+|x-y|+|y-z|}$$
$$\le \frac{|x-y|}{1+|x-y|} + \frac{|y-z|}{1+|y-z|} = d(x,y) + d(y,z), \qquad x,y,z\in\mathbb R. \qquad □$$

7.24. Determine the distance of the functions
$$f(x) = x, \qquad g(x) = -\frac{x}{\sqrt{1+x^2}}, \qquad x\in[1,2],$$
as elements of the normed vector space of continuous functions on $[1,2]$ with the norm $\|h\| = \max_{x\in[1,2]}|h(x)|$.

Solution. Since the function $x + \frac{x}{\sqrt{1+x^2}}$ is positive and increasing on $[1,2]$, we get
$$\|f-g\| = \max_{x\in[1,2]}\Big(x + \frac{x}{\sqrt{1+x^2}}\Big) = 2 + \frac{2}{\sqrt{1+2^2}} = 2 + \frac{2}{\sqrt5}.$$
An increasing function takes its maximum value at the right marginal point of a closed interval. □

7.25. Determine whether the sequence $\{x_n\}_{n\in\mathbb N}$, where
$$x_1 = 1, \qquad x_n = 1 + \frac12 + \cdots + \frac1n, \quad n\in\mathbb N\setminus\{1\},$$
is a Cauchy sequence. First, consider the usual metric given by the difference in absolute value (i.e., induced by the norm of the absolute value). Then, consider the metric $d$ from the previous exercise.

Solution. Let us recall that
$$(7.28)\qquad \sum_{k=1}^{\infty}\frac1k = \infty.$$
Therefore,
$$\lim_{n\to\infty}|x_n - x_m| = \sum_{k=m+1}^{\infty}\frac1k = \infty, \qquad m\in\mathbb N.$$

7.17. Completion of metric spaces. Both the real numbers $\mathbb R$ and the complex numbers $\mathbb C$ form (with the metric given by the absolute value) a complete metric space. Actually, this is contained in the axiom of the existence of suprema. Let us recall that the real numbers were created as a "completion" of the space of rational numbers, which is not complete itself. It is apparent that the closure of the set $\mathbb Q\subseteq\mathbb R$ is the whole $\mathbb R$.

___ Dense and nowhere-dense subsets ___
We say that a subset $A\subseteq X$ of a metric space $X$ is dense iff the closure of $A$ is the whole space $X$. A set $A$ is said to be nowhere dense in $X$ iff the set $X\setminus\bar A$ is dense.

Apparently, $A$ is dense in $X$ iff every open set in the whole space $X$ has a non-empty intersection with $A$. In all the cases of norms on functions from the previous paragraphs, we can easily see that the metric spaces defined in this way are not complete, since it can happen that the limit of a Cauchy sequence of functions from our vector space $\mathcal S^0[a,b]$ is a function which does not belong to this space any more. Let us consider the interval $[0,1]$ as the domain of the functions $f_n$ which take zero on $[0,1/n)$ and are equal to $\sin(1/x)$ on $[1/n,1]$. Apparently, they converge to the function $\sin(1/x)$ in all $L_p$-norms, but this function does not lie in our spaces.

___ Completion of a metric space ___
Let $X$ be a metric space with metric $d$ which is not complete. A metric space $\tilde X$ with metric $\tilde d$ such that $X\subseteq\tilde X$, $d$ is the restriction of $\tilde d$ to the subset $X$, and the closure $\bar X$ is the whole space $\tilde X$, is called a completion of the metric space $X$.

The following theorem says that the completion of an arbitrary (incomplete) metric space $X$ can be found in essentially the same way as the real numbers were created from the rationals. Before we get to the quite difficult proof of this extraordinarily important and useful result, let us notice that such a completion $\tilde X$ of a space $X$ is unique in a certain sense:
A mapping $\varphi : \tilde X_1 \to \tilde X_2$
between metric spaces with metrics $d_1$ and $d_2$, respectively, is called an isometry iff all elements $x,y\in\tilde X_1$ satisfy $d_2(\varphi(x),\varphi(y)) = d_1(x,y)$. Of course, every isometry is a bijection onto its image (this follows from the property that the distance of distinct elements is non-zero), and the corresponding inverse mapping is an isometry as well.

Now, let us consider two inclusions of a dense subset, $i_1 : X\to\tilde X_1$ and $i_2 : X\to\tilde X_2$, into two completions of the space $X$, and let us denote the corresponding metrics by $d$, $d_1$, and $d_2$, respectively. Apparently, the mapping $i_2\circ i_1^{-1}$ is well-defined on the dense subset $i_1(X)\subseteq\tilde X_1$. Its image is the dense subset $i_2(X)\subseteq\tilde X_2$, and, moreover, this mapping is clearly an isometry. The dual mapping $i_1\circ i_2^{-1}$ works in the same way. Every isometric mapping maps, of course, Cauchy sequences to Cauchy sequences. At the same time, such Cauchy sequences converge to the same element in the completion if and only if this holds for their images under the isometry. Hence the mapping extends to a mapping $\tilde X_1\to\tilde X_2$ which is both a bijection and an isometry. Thus, the completions $\tilde X_1$ and $\tilde X_2$ are indeed identical in this sense.

Hence we can see that the sequence $\{x_n\}$ cannot be a Cauchy sequence. Thus we have found the answer for the usual metric. (However, we could also have utilized the fact that the sequence $\{x_n\}$ is not convergent, by (7.28), and that we find ourselves in a complete metric space, where Cauchy sequences and convergent sequences coincide.) For the metric $d$, it suffices to realize that the mapping $f$ introduced in (7.27) is a continuous bijection between the sets $[0,\infty)$ and $[0,1)$ with the property $f(0)=0$. Thus, any sequence is convergent "in the original meaning" if and only if it converges in the metric space $\mathbb R$ with the metric $d$. It holds as well that a sequence is a Cauchy sequence in $\mathbb R$ with respect to the usual metric if and only if it is a Cauchy sequence with respect to $d$. □

7.26. Is the metric space of continuous functions on $[-1,1]$ with the metric given by the norm
(a) $\|f\|_1 = \int_{-1}^{1}|f(x)|\,dx$;
(b) $\|f\|_\infty = \max\{\,|f(x)|;\ x\in[-1,1]\,\}$
complete?

Solution. The case (a). Let us, for every $n\in\mathbb N$, define a function
$$f_n(x) = 0,\ x\in[-1,0), \qquad f_n(x) = nx,\ x\in[0,\tfrac1n), \qquad f_n(x) = 1,\ x\in[\tfrac1n,1].$$
The obtained sequence $\{f_n\}_{n\in\mathbb N}$ of continuous functions is a Cauchy sequence, since $\|f_m - f_n\|_1 = \frac12\big|\frac1n - \frac1m\big|$ for $m,n\in\mathbb N$. Let us focus on the potential limit of the sequence $\{f_n\}$. The limit, a continuous function $f$, would have to satisfy
$$f(x) = 0,\ x\in[-1,0], \qquad f(x) = 1,\ x\in[\varepsilon,1]$$
for an arbitrarily small $\varepsilon>0$. Thus, necessarily, $f(x)=0$ for $x\in[-1,0]$ and $f(x)=1$ for $x\in(0,1]$. However, this function is not continuous on $[-1,1]$; it does not belong to the considered metric space. Therefore, the sequence $\{f_n\}$ does not have a limit there, and the space in (a) is not complete.

The case (b). Let $\{f_n\}$ be a Cauchy sequence: for every $\varepsilon>0$ (or for every $\varepsilon/2$, if you wish) there is an $n(\varepsilon)\in\mathbb N$ such that
$$(7.29)\qquad \max_{x\in[-1,1]}|f_m(x) - f_n(x)| < \frac{\varepsilon}{2}, \qquad m,n > n(\varepsilon).$$
In particular, we get for every $x\in[-1,1]$ a Cauchy sequence $\{f_n(x)\}_{n\in\mathbb N}\subset\mathbb R$ of numbers. Since the metric space $\mathbb R$ with the usual metric is complete, every such sequence (for each $x\in[-1,1]$) is convergent. Let us set
$$f(x) := \lim_{n\to\infty} f_n(x), \qquad x\in[-1,1].$$

7.18. Theorem. Let $X$ be a metric space with metric $d$ which is not complete. Then there exists a completion $\tilde X$ of $X$ with metric $\tilde d$, which is unique up to bijective isometries.

Proof. The idea of the construction is quite identical to the one used when building the real numbers. Two Cauchy sequences $x_i$ and $y_i$ of points of $X$ are considered equivalent iff $d(x_i,y_i)$ converges to zero as $i$ approaches infinity. This is a convergence of real numbers, so the definition is correct.
From the properties of convergence of real numbers, it is quite apparent that the relation defined above is really an equivalence relation. The reader is advised to verify this in detail. For instance, the transitivity follows from the fact that the sum of two sequences converging to zero converges to zero as well. Now, let us define $\tilde X$ as the set of the classes of this equivalence of Cauchy sequences. The original points $x\in X$ can be identified with the classes of sequences equivalent to the constant sequence $x_i = x$, $i=0,1,\dots$.

It is now easy to define the metric $\tilde d$. It suggests itself to consider
$$\tilde d(\tilde x,\tilde y) = \lim_{i\to\infty} d(x_i,y_i)$$
for sequences $\tilde x = \{x_0,x_1,\dots\}$ and $\tilde y = \{y_0,y_1,\dots\}$. First, we have to verify that this limit exists at all and is finite. Straight from the triangle inequality, and from the fact that both the sequences $\tilde x$ and $\tilde y$ are Cauchy sequences, it follows that the numbers $d(x_i,y_i)$ form a Cauchy sequence of real numbers, so its limit exists. If we select different representatives $\tilde x' = \{x_0',x_1',\dots\}$ and $\tilde y' = \{y_0',y_1',\dots\}$, then the triangle inequality for the distances of real numbers (we need to consider its consequences for differences of distances) gives
$$|d(x_i',y_i') - d(x_i,y_i)| \le |d(x_i',y_i') - d(x_i',y_i)| + |d(x_i',y_i) - d(x_i,y_i)| \le d(x_i,x_i') + d(y_i,y_i').$$
Therefore, the definition is indeed independent of the choice of representatives. Further, we verify that $\tilde d$ is a metric on $\tilde X$. The first and second properties are clear, so it remains to prove the triangle inequality. For that purpose, let us choose three Cauchy representatives of the elements $\tilde x, \tilde y, \tilde z$, and we easily get
$$\tilde d(\tilde x,\tilde z) = \lim_{i\to\infty} d(x_i,z_i) \le \lim_{i\to\infty} d(x_i,y_i) + \lim_{i\to\infty} d(y_i,z_i) = \tilde d(\tilde x,\tilde y) + \tilde d(\tilde y,\tilde z).$$
The restriction of $\tilde d$ to the original space $X$ clearly coincides with $d$, the original points being represented by constant sequences; so it remains to prove that $X$ is dense in $\tilde X$ and that the constructed metric space is complete.

Letting $m\to\infty$ in (7.29), we obtain
$$\max_{x\in[-1,1]}|f(x) - f_n(x)| \le \frac{\varepsilon}{2} < \varepsilon, \qquad n > n(\varepsilon).$$
However, this means that the sequence $\{f_n\}_{n\in\mathbb N}$ converges uniformly to the function $f$ on $[-1,1]$; in other words, $\{f_n\}$ converges to $f$ with respect to the given norm. We have already found that a uniform limit of continuous functions is a continuous function. Thanks to that, we need not verify separately that $f$ belongs to the considered space: the metric space in (b) is complete. □

The first and second properties of a metric are clearly satisfied. To prove the triangle inequality, it suffices to realize that $d(m,n)\in(1,4/3]$ if $m\ne n$. All Cauchy sequences can be found equally easily: they are the so-called almost stationary sequences, constant from some index on (i.e., constant except for finitely many terms). Thus, every Cauchy sequence is convergent, so the metric space in question is complete. Let us introduce the sets
$$A_n := \Big\{\,m\in\mathbb N;\ d(m,n) \le 1+\tfrac{1}{2n}\,\Big\}, \qquad n\in\mathbb N.$$
As the inequality in their definition is not strict, it is guaranteed that they are closed sets. Since $A_n = \{n, n+1, \dots\}$, the conclusion (7.30) does not hold: the intersection of all the $A_n$ is empty. If we could omit the requirement (7.31), this would mean that the metric space is not complete, which is not true. Finally, let us note that
$$\lim_{n\to\infty}\,\sup\{\,d(x,y);\ x,y\in A_n\,\} = 1 \ne 0. \qquad □$$
We want to prove that for any fixed Cauchy sequence $\tilde x = \{x_i\}$ and every (no matter how small) $\varepsilon>0$, we can find an element $y$ of the original space such that the distance of the constant sequence of $y$'s from the chosen sequence does not exceed $\varepsilon$. However, since the sequence $x_i$ is a Cauchy sequence, all pairs of its terms $x_n, x_m$ eventually (i.e., for sufficiently large indices $m$ and $n$) become closer than $\varepsilon$ to each other. Then the choice $y = x_n$ for one of these indices necessarily gives that the elements $y$ and $x_m$ are closer than $\varepsilon$, and so, from the limit point of view, $\tilde d(\tilde y,\tilde x)\le\varepsilon$. This proves the density.

Finally, it remains to prove that Cauchy sequences of points of the extended space $\tilde X$ with respect to the metric $\tilde d$ are necessarily convergent. In other words, we want to show that repeating the above procedure does not yield new points. This can be done by approximating the points $\tilde x_k$ of a Cauchy sequence in $\tilde X$ by points $z_k$ of the original space $X$ so that the resulting sequence $z = \{z_k\}$ is the limit of the original sequence with respect to $\tilde d$. Since we already know that $X$ is a dense subset of $\tilde X$, we can choose, for every element $\tilde x_k$ of our fixed sequence, an element $z_k\in X$ so that the constant sequence $\tilde z_k$ satisfies $\tilde d(\tilde x_k,\tilde z_k) < 1/k$. Now, let us consider the sequence $z = \{z_0, z_1, \dots\}$. The original sequence is Cauchy, i.e., for a fixed real number $\varepsilon>0$ there is an index $n(\varepsilon)$ such that $\tilde d(\tilde x_n,\tilde x_m) < \varepsilon/2$ whenever both $m$ and $n$ are greater than $n(\varepsilon)$. Without loss of generality, we can assume that our index $n(\varepsilon)$ is greater than or equal to $4/\varepsilon$. Now, for $m$ and $n$ greater than $n(\varepsilon)$, we get
$$d(z_m,z_n) = \tilde d(\tilde z_m,\tilde z_n) \le \tilde d(\tilde z_m,\tilde x_m) + \tilde d(\tilde x_m,\tilde x_n) + \tilde d(\tilde x_n,\tilde z_n) < \frac1m + \frac\varepsilon2 + \frac1n \le 2\cdot\frac\varepsilon4 + \frac\varepsilon2 = \varepsilon.$$
Thus $z_i$ is a Cauchy sequence of elements of $X$, and so $\tilde z\in\tilde X$. Let us examine whether the distance $\tilde d(\tilde x_n,\tilde z)$ approaches zero, which we tried to guarantee by the construction. From the triangle inequality,
$$\tilde d(\tilde z,\tilde x_n) \le \tilde d(\tilde z,\tilde z_n) + \tilde d(\tilde z_n,\tilde x_n).$$
However, from our previous bounds it follows that both summands on the right-hand side converge to zero, thereby finishing the proof. □

In the following paragraphs, we will introduce three quite simple theorems about complete metric spaces. They are highly applicable both in mathematical analysis and in verifying the convergence of numerical methods.

7.28. Prove that the metric space $\ell_2$ is complete.

Solution. Let us consider an arbitrary Cauchy sequence $\{x_n\}_{n\in\mathbb N}$ in the space $\ell_2$. Every term of this sequence is itself a sequence, i.e., $x_n = \{x_k^n\}_{k\in\mathbb N}$, $n\in\mathbb N$. (Of course, the exact range of the indices does not matter; there is no difference between $n,k\in\mathbb N$ and $n,k\in\mathbb N\cup\{0\}$.) Let us introduce the helping sequences $y_k$, $k\in\mathbb N$, of the $k$-th coordinates, so that $y_k = \{x_k^n\}_{n\in\mathbb N}$. If $\{x_n\}$ is a Cauchy sequence in $\ell_2$, then each of the sequences $y_k$ is a Cauchy sequence in $\mathbb R$ (the $y_k$ are sequences of real numbers). It follows from the completeness of $\mathbb R$ (with respect to the usual metric) that all the sequences $y_k$ are convergent. Let us denote their limits by $z_k$, $k\in\mathbb N$. It suffices to prove that $z = \{z_k\}_{k\in\mathbb N}\in\ell_2$ and that the sequence $\{x_n\}$ converges in $\ell_2$ to the sequence $z$ as $n\to\infty$.

The sequence $\{x_n\}_{n\in\mathbb N}\subset\ell_2$ is a Cauchy sequence; therefore, for every $\varepsilon>0$ there is an $n(\varepsilon)\in\mathbb N$ with the property that
$$\sum_{k=1}^{\infty}\big(x_k^m - x_k^n\big)^2 < \varepsilon^2, \qquad m,n > n(\varepsilon).$$
In particular,
$$\sum_{k=1}^{l}\big(x_k^m - x_k^n\big)^2 < \varepsilon^2, \qquad m,n > n(\varepsilon),\ l\in\mathbb N,$$
whence, letting $m\to\infty$, we obtain
$$\sum_{k=1}^{l}\big(z_k - x_k^n\big)^2 \le \varepsilon^2, \qquad n > n(\varepsilon),\ l\in\mathbb N,$$
i.e. (this time $l\to\infty$),
$$(7.32)\qquad \sum_{k=1}^{\infty}\big(z_k - x_k^n\big)^2 \le \varepsilon^2, \qquad n > n(\varepsilon).$$
Especially, we have
$$\sum_{k=1}^{\infty}\big(z_k - x_k^n\big)^2 < \infty, \qquad n > n(\varepsilon),$$
and, at the same time, $\sum_{k=1}^\infty (x_k^n)^2 < \infty$ for every $n\in\mathbb N$, which follows straight from $\{x_n\}\subset\ell_2$. Since, by the Minkowski inequality,
$$\Big(\sum_{k=1}^{\infty} z_k^2\Big)^{1/2} \le \Big(\sum_{k=1}^{\infty}\big(z_k - x_k^n\big)^2\Big)^{1/2} + \Big(\sum_{k=1}^{\infty}\big(x_k^n\big)^2\Big)^{1/2} < \infty,$$
we get $z\in\ell_2$, and the convergence of $\{x_n\}$ to $z$ in $\ell_2$ follows from (7.32). □
7.29. In the metric space $\ell_2$, determine whether the set $A = \{\{x_n\}_{n\in\mathbb N};\ |x_n|\le\frac1n,\ n\in\mathbb N\}$ is compact; similarly, in $\ell_\infty$, determine whether the set $B = \{\{x_n\}_{n\in\mathbb N};\ |x_n|<\frac1n,\ n\in\mathbb N\}$ is compact.

7.20. Theorem. Let $X$ be a complete metric space and let $A_1\supseteq A_2\supseteq\cdots$ be a non-increasing sequence of non-empty closed subsets of $X$ whose diameters satisfy
$$(7.31)\qquad \lim_{i\to\infty}\,\sup\{\,d(x,y);\ x,y\in A_i\,\} = 0.$$
Then the intersection
$$(7.30)\qquad \bigcap_{i=1}^{\infty} A_i$$
contains exactly one point.

Proof. Choose a point $z_i$ in each of the sets $A_i$. Since the diameters of the sets approach zero, for every positive real number $\varepsilon$ we can find an index $n(\varepsilon)$ such that for all $A_i$ with indices $i > n(\varepsilon)$, their diameters are less than $\varepsilon$. But then, for such large indices $i,j$, we will have $d(z_i,z_j) < \varepsilon$, and thus our sequence is a Cauchy sequence. Therefore, it has a limit point $z\in X$, which, of course, must be a limit point of all the sets $A_i$; thus it belongs to all of them (since they are all closed) and hence to their intersection. We have proved the existence of $z$. Now it remains to prove its uniqueness. For that purpose, assume there are points $z$ and $y$, both belonging to the intersection of all the sets $A_i$. Then their distance must be at most the diameter of each of the sets $A_i$, but that converges to zero. This completes the proof. □

7.21. Theorem (Baire theorem). If $X$ is a complete metric space, then the intersection of every countable system of open dense sets $A_i$ is a dense set in the metric space $X$.

Proof. Let a system of dense open sets $A_i$, $i=1,2,\dots$, be given in $X$. We want to show that the set $A = \bigcap_{i=1}^{\infty} A_i$ has a non-empty intersection with any open set $U\subseteq X$. We will proceed inductively, invoking the previous theorem. Surely there is a $z_1\in A_1\cap U$; since the set $A_1\cap U$ is open, the closure of an $\varepsilon_1$-neighborhood $U_1$ of the point $z_1$ (for sufficiently small $\varepsilon_1$) is contained in $A_1\cap U$ as well. Let us denote the closure of this $\varepsilon_1$-ball $U_1$ by $B_1$. Further, let us suppose that the points $z_i$ and their open $\varepsilon_i$-neighborhoods $U_i$ are already chosen for $i=1,\dots,n$. Since the set $A_{n+1}$ is open and dense in $X$, there is a point $z_{n+1}\in A_{n+1}\cap U_n$; however, since $A_{n+1}\cap U_n$ is open, the point $z_{n+1}$ belongs to it together with a sufficiently small $\varepsilon_{n+1}$-neighborhood $U_{n+1}$. Then the closures surely satisfy $B_{n+1} = \bar U_{n+1}\subseteq U_n$, and so the closed set $B_{n+1}$ is contained in $A_{n+1}\cap U_n$. Moreover, we can assume that $\varepsilon_n < 1/n$. Proceeding in this inductive way from the original point $z_1$ and the set $B_1$, we get a non-increasing sequence of non-empty closed sets $B_n$ whose diameters approach zero. Therefore, there is a point $z$ common to all of these sets, i.e.,
$$z \in \bigcap_{i=1}^{\infty} B_i \subseteq \Big(\bigcap_{i=1}^{\infty} A_i\Big)\cap U,$$
which is the statement to be proved. □

7.22. Bounded and compact sets. The following concepts facilitated the phrasing of our observations about the real numbers. They can be reformulated for general metric spaces with almost no changes. An interior point of a subset $A$ of a metric space is an element of $A$ which belongs to it together with some of its $\varepsilon$-neighborhoods. A boundary point of a set $A$ is an element $x\in X$ such that each of its neighborhoods has a non-empty intersection with both $A$ and the complement $X\setminus A$. A boundary point may or may not belong to the set $A$ itself. An open cover of a set $A$ is a system of open sets $U_i\subseteq X$, $i\in I$, such that their union contains the whole of $A$. An isolated point of a set $A$ is an element $a\in A$ such that one of its $\varepsilon$-neighborhoods in $X$ has the singleton intersection $\{a\}$ with $A$. A set $A$ of elements of a metric space is called bounded iff its diameter is finite, i.e., there is a real number $r$ such that $d(x,y)\le r$ for all elements $x,y\in A$. In the other case, the set is said to be unbounded.
Solution. Since $\sum_k \frac1{k^2}$ converges, for every $\varepsilon>0$ there is an $n(\varepsilon)\in\mathbb N$ satisfying
$$\sum_{k=n(\varepsilon)+1}^{\infty}\frac{1}{k^2} < \frac{\varepsilon^2}{2}.$$
From each of the intervals $[-\frac1n,\frac1n]$, $n\in\{1,\dots,n(\varepsilon)\}$, we can choose finitely many points $x_1^n,\dots,x_{m(n)}^n$ so that, for any $x\in[-\frac1n,\frac1n]$,
$$\min_{j\in\{1,\dots,m(n)\}}|x - x_j^n| < \frac{\varepsilon}{\sqrt{2\,n(\varepsilon)}}.$$
Let us consider those sequences $\{y_n\}_{n\in\mathbb N}$ from $\ell_2$ whose terms with indices $n > n(\varepsilon)$ are zero and, at the same time,
$$y_l \in \{x_1^l,\dots,x_{m(l)}^l\}, \qquad l\in\{1,\dots,n(\varepsilon)\}.$$
There are only finitely many such sequences, and they create an $\varepsilon$-net for $A$, since for every $x\in A$ a suitable choice of $y$ gives
$$d(x,y)^2 = \sum_{k=1}^{n(\varepsilon)}|x_k - y_k|^2 + \sum_{k=n(\varepsilon)+1}^{\infty}|x_k|^2 < n(\varepsilon)\cdot\frac{\varepsilon^2}{2\,n(\varepsilon)} + \frac{\varepsilon^2}{2} = \varepsilon^2.$$
Since $\varepsilon>0$ is arbitrary, the set $A$ is totally bounded; as it is also closed in the complete space $\ell_2$, this implies its compactness.

It is very simple to determine whether the set $B$ is compact. Every compact set must be closed, but the set $B$ is not. Its closure is
$$\bar B = \{\{x_n\}_{n\in\mathbb N}\in\ell_\infty;\ |x_n|\le\tfrac1n,\ n\in\mathbb N\}.$$
The set $\bar B$ is compact. The proof of this fact is much simpler than for the set $A$, so we leave it as an exercise for the reader. □

D. Integral operators

Convolution is one of the tools for smoothing functions:

7.32. Determine the convolution $f_1 * f_2$, where
$$f_1(x) = \frac1x \text{ for } x\ne 0, \qquad f_2(x) = 1 \text{ for } x\in[-1,1], \quad f_2(x) = 0 \text{ otherwise.}$$

Solution. The value of the convolution at a point $t$ is given by the integral $\int_{-\infty}^{\infty} f_1(x)\,f_2(t-x)\,dx$. The integrated function is non-zero where the second factor is non-zero, i.e., for $(t-x)\in[-1,1]$, i.e., $x\in[t-1,t+1]$. The value of the convolution at the point $t$ can thus be interpreted as an integral mean of the function $f_1$ over the interval $(t-1,t+1)$. When integrating over this interval, we have to distinguish whether the number $0$ belongs to it. If the interval contains zero, the integral must be split into two improper integrals; however, the symmetric part around zero cancels thanks to the function $\frac1x$ being odd, so the integral $\int_{1-|t|}^{1+|t|}\frac1x\,dx$ remains (think out why the resulting formula works for negative numbers $t$ as well). Thus we get
$$f_1 * f_2(t) = \int_{t-1}^{t+1}\frac1x\,dx = \ln\Big|\frac{t+1}{t-1}\Big| \quad\text{for } t\in(-\infty,-1)\cup(1,\infty),$$
$$f_1 * f_2(t) = \ln\frac{1+t}{1-t} \quad\text{for } t\in(-1,1). \qquad □$$

Now, let us try to calculate the convolution of two functions both of which have a finite support.

A metric space $X$ is called compact iff every sequence of terms $x_i\in X$ has a subsequence converging to some point $x\in X$. In the case of the real numbers, we mentioned several characterizations of compactness; the concept of boundedness is a bit more complicated in the case of general metric spaces. For any subsets $A, B\subseteq X$ of a metric space $X$ with metric $d$, we define their distance
$$\operatorname{dist}(A,B) = \inf_{x\in A,\,y\in B}\{d(x,y)\}.$$
If $A=\{x\}$ is a singleton set, we talk about the distance $\operatorname{dist}(x,B)$ of the point $x$ from the set $B$. We say that a metric space $X$ is totally bounded iff for every positive real number $\varepsilon$ there is a finite set $A$ such that $\operatorname{dist}(x,A) < \varepsilon$ for all points $x\in X$. Let us recall that a metric space is bounded iff the whole $X$ has a finite diameter. We can immediately see that a totally bounded space is, in particular, bounded. Indeed, the diameter of a finite set is always finite, and if $A$ is the set corresponding to $\varepsilon$ from the definition of total boundedness, then the distance $d(x,y)$ of two points can always be bounded by the sum of $\operatorname{dist}(x,A)$, $\operatorname{dist}(y,A)$, and $\operatorname{diam} A$, which is a finite number. In the case of a metric on a subset of a finite-dimensional Euclidean space, these concepts coincide, since the boundedness of a set guarantees the boundedness of all coordinates in a fixed orthonormal basis, and this implies total boundedness. (Verify this in detail by yourselves!)

Theorem.
The following statements about a metric space $X$ are equivalent:
(1) $X$ is compact;
(2) every open cover of $X$ contains a finite subcover;
(3) $X$ is complete and totally bounded.

Sketch of the proof. If the second statement of the theorem is satisfied, then we can easily see that the space $X$ must be totally bounded. Indeed, it suffices to choose the cover of $X$ consisting of all $\varepsilon$-balls centered at the points $x\in X$. We can choose a finite cover from it, and the set of centers $x_i$ of the balls participating in this finite cover already satisfies the condition from the definition of total boundedness. To prove the implication (2) $\Rightarrow$ (3), we also need to show completeness; here one considers a Cauchy sequence $x_i$ … □

7.23. Compactness of sets of continuous functions. As an example of the fact that the behavior of compactness in spaces of functions may differ from that in Euclidean spaces, we mention a very useful theorem, known as the Arzelà–Ascoli theorem.

Theorem. A set $M\subseteq C[a,b]$ is compact if and only if it is bounded, closed, and equicontinuous.

7.33. Determine the convolution $f_1 * f_2$, where
$$f_1(x) = 1-x^2 \text{ for } x\in[-1,1], \quad 0 \text{ otherwise}; \qquad f_2(x) = x \text{ for } x\in[0,1], \quad 0 \text{ otherwise.}$$

Solution. The value of the convolution $f_1*f_2$ at a point $t$ is given by the integral, over all real numbers, of the product of the function $f_1(x)$ and the function $f_2(t-x)$ with respect to the variable $x$ (see 7.13). Thus, this value is zero if either of the values $f_1(x)$ and $f_2(t-x)$ is zero for every real $x$. On the other hand, the value of the convolution can be non-zero at a point $t$ only if there are numbers $x\in[-1,1]$ (where $f_1(x)\ne0$) such that $(t-x)\in[0,1]$ (where $f_2(t-x)\ne0$). I.e., $f_1*f_2(t)$ can be non-zero only if $[t-1,t+1]\cap[0,1]\ne\emptyset$; this happens for $t\in[-1,2]$. Substituting $y=t-x$, we integrate the product $f_2(y)\,f_1(t-y) = y\,(1-(t-y)^2)$ over $y$ belonging to the intersection of the intervals $[t-1,t+1]$ and $[0,1]$. This intersection depends on $t\in[-1,2]$:
a) for $t\in[-1,0]$, we have $[t-1,t+1]\cap[0,1] = [0,t+1]$;
b) for $t\in[0,1]$, we have $[t-1,t+1]\cap[0,1] = [0,1]$;
c) for $t\in[1,2]$, we have $[t-1,t+1]\cap[0,1] = [t-1,1]$.
According to the intersection of these intervals, we then have:
a) $\displaystyle\int_{-\infty}^{\infty} f_1(x)f_2(t-x)\,dx = \int_0^{t+1} y\,\big(1-(t-y)^2\big)\,dy = -\frac{t^4}{12} + \frac{t^2}{2} + \frac{2t}{3} + \frac14,$
b) $\displaystyle\int_{-\infty}^{\infty} f_1(x)f_2(t-x)\,dx = \int_0^{1} y\,\big(1-(t-y)^2\big)\,dy = -\frac{t^2}{2} + \frac{2t}{3} + \frac14,$
c) $\displaystyle\int_{-\infty}^{\infty} f_1(x)f_2(t-x)\,dx = \int_{t-1}^{1} y\,\big(1-(t-y)^2\big)\,dy = \frac{t^4}{12} - t^2 + \frac{4t}{3}.$
Altogether, we get
$$f_1 * f_2(t) = \begin{cases} -\frac{t^4}{12} + \frac{t^2}{2} + \frac{2t}{3} + \frac14 & \text{for } t\in[-1,0],\\[2pt] -\frac{t^2}{2} + \frac{2t}{3} + \frac14 & \text{for } t\in[0,1],\\[2pt] \frac{t^4}{12} - t^2 + \frac{4t}{3} & \text{for } t\in[1,2],\\[2pt] 0 & \text{otherwise.} \end{cases} \qquad □$$

7.24. Proof of theorem 7.8 about Fourier series. The general context of metrics and convergence now allows us to get back to the proof of the theorem in which we got a first idea about pointwise and other convergences of Fourier series. We do not care about necessary conditions for convergence here, and many other formulations can be found in the literature; on the other hand, our theorem 7.8 is quite simple and covers a good deal of useful cases. Firstly, it is good to realize how convergences may differ with respect to different $L_p$-norms. For the sake of simplicity, we will always work in the completion of the space $\mathcal S^0$ or $\mathcal S^1$ with respect to the corresponding norm, without worrying about what these spaces actually look like (even though they could be described quite easily with the help of the Kurzweil integral).
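Before continuing with the proof, the piecewise result of problem 7.33 above can be confirmed numerically by discretizing the convolution integral. A minimal sketch in Python (grid step and tolerance are our choices):

```python
import numpy as np

dx = 1e-3
x = np.arange(-3, 3, dx)
f1 = np.where(np.abs(x) <= 1, 1 - x**2, 0.0)
f2 = np.where((0 <= x) & (x <= 1), x, 0.0)

# discrete approximation of (f1 * f2)(t) on the grid t = 2*x[0] + k*dx
conv = np.convolve(f1, f2) * dx

def exact(t):
    if -1 <= t <= 0:
        return -t**4/12 + t**2/2 + 2*t/3 + 0.25
    if 0 <= t <= 1:
        return -t**2/2 + 2*t/3 + 0.25
    if 1 <= t <= 2:
        return t**4/12 - t**2 + 4*t/3
    return 0.0

for ti in (-0.5, 0.25, 1.5):
    k = int(round((ti - 2*x[0]) / dx))
    assert abs(conv[k] - exact(ti)) < 5e-3
```

The index arithmetic reflects that `np.convolve` of samples starting at $x_0$ produces samples of the convolution starting at $2x_0$.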
Holder's inequality (applied to functions / and constant 1) yields the first of the following bounds on 5° [a, b]: f J a \fix)\dx < \a-b\Vl^ \fix)\Pdx .h\VlCVl (J^\fix)\dx < \a VP where p > 1 and l/p + l/q — 1, C > \f(x)\ on the whole interval [a, b] (such a uniform bound by a constant always exists if / e 5° [a, b]). The second bound follows immediately from the bound \f(x)\P < CP'1 \f(x)\and the relation 1 - l/p = l/q. Thus it is apparent from the first bound that Lp-convergence /«—>•/ is, for any p > 1, always stronger than L\-convergence (and with a merely modified bound, we can derive an even stronger proposition, namely that Lq -convergence is stronger than Lp- convergence whenever q > p; try this by yourselves). However, to apply the second bound, we have to require uniform boundedness of the sequence of functions /„, i. e. the bound for the functions /„ by a constant C must be independent of n. Then we can assert that I fn ix) — fix)\ < 2C, and our bound implies that L1 -convergence is stronger than Lp-convergence. Therefore, all our Lp-norms on our space 5° [a, b] are equivalent with regard to convergence of uniformly bounded sequences of functions. The most difficult (and most interesting) part is to prove the first statement of the theorem 7.8, which is in literature often referenced as Dirichlet condition (it is deemed to be derived as early as in 1824). First, we prove how this property of piecewise convergence implies the statements (2) and (3) of the theorem to be proved. Without loss of generality, we can assume that we are working on the interval [—it, it], i. e. with period T — 2it. As the first step, we prepare simple bounds for the coefficients of the Fourier series. A bound-of-course is 1 r * J-71 \fix)\ dx, and the same for all the coefficients bn since both cos(x) and sin(x) are bounded by 1 in absolute value. However, if / is a continuous function in S1 [a,b], we can integrate by parts, thus obtaining anif) 1 r * J-71 fix) cos(nx)?) — z sin (atf) ] dt. Since the product of any two odd functions is an even function, the product of an even function and an odd function is an odd function and since the integral of an odd function over the interval [ — 1, 1] is 0 (if this integral exists at all) and the integral of an even function over the interval [—1, 1] is twice the integral over [0, 1], we further get Hf){fi>) = vfc / sin (<»0 dt = ^ cos(a)t) -ll j , 2_ cos oo—l If we directly used the known expression of the Fourier transform of an odd function /, we would obtain more easily that f f(t) sm(cot) dt o / sin (cot) dt o 2 cos (w— 1 □ 7.36. Describe the Fourier transform T(f) of the function f(t)=Q~a?, teR, where a > 0. Solution. Our task is to calculate oo F(f)(co) = -±= f e~at2 e~i0Jtdt. Differentiating (with respect to co) and then integrating by parts (for F' = -it Q~a?, G = e-i0Jt) gives oo (F(f)(co))' = -±= f -it e~at2 e~imt dt = -at2 - ■10Jt - lim f e -at2—iojt c i(—io) Q—<*t2 Q—ioJt ^ f 2a -4= ( f lim q-^ - f lim q~atl - f f q~atl e~imt dt V2F \ 2a 2a t^_0o J 2a at2 -at2 0,—ia m I 1 / e-^ t~imt dt 2a I y^f 2a T(f)(co). = -bn(f')- n We write a„ (/) for the corresponding coefficient of the function /, and so on. Thus we can see that the "smoother" a function is, the more rapidly the Fourier coefficients approach zero. Iterating this procedure, we really obtain a bound for functions / in [—it, it] with continuous derivatives up to order k inclusive \a„(f)\ r \f(k+1\x)\dx, nlc+l7t J_K and the same for b„(f). 
Let us thus consider a continuous function $f$ in the space $\mathcal S^1[a,b]$ such that the partial sums of its Fourier series converge pointwise to $f$. Then we can assert that
$$|s_N(x) - f(x)| = \Big|\sum_{k=N+1}^{\infty}\big(a_k\cos(kx) + b_k\sin(kx)\big)\Big| \le \sum_{k=N+1}^{\infty}\big(|a_k| + |b_k|\big).$$
The right-hand side can further be estimated using the coefficients $a_k'$ and $b_k'$ of the derivative $f'$ (invoking Hölder's inequality for the $L_p$ and $L_q$ norms of infinite series with $p=q=2$, see 7.15, and Bessel's inequality for general Fourier series, see 7.5.(2)):
$$|s_N(x) - f(x)| \le \sum_{k=N+1}^{\infty}\frac1k\big(|a_k'| + |b_k'|\big) \le \sqrt2\,\Big(\sum_{k=N+1}^{\infty}\frac1{k^2}\Big)^{1/2}\Big(\sum_{k=N+1}^{\infty}\big(|a_k'|^2 + |b_k'|^2\big)\Big)^{1/2} \le \sqrt2\cdot\frac{1}{\sqrt N}\cdot\frac{1}{\sqrt\pi}\,\|f'\|_2.$$
Thus we have obtained not only a proof of the uniform convergence of our series to the anticipated value, but also a bound for the speed of the convergence:
$$\sup_{x\in\mathbb R}|s_N(x) - f(x)| \le \frac{\sqrt2}{\sqrt\pi}\,\|f'\|_2\cdot\frac{1}{\sqrt N}.$$
This proves the statement 7.8.(2), supposing the Dirichlet condition 7.8.(1) holds.

7.25. $L_2$-convergence. In the next step of our proof, we derive the $L_2$-convergence of Fourier series under the condition of uniform convergence. The proof utilizes the common technique of approximating objects which are not continuous by ones which are. We describe it without further details; interested readers should be able to fill in the gaps by themselves without any difficulties. First, we formulate the statement we need in general:

Lemma. The subset of continuous functions in $\mathcal S^0[a,b]$ on a finite interval $[a,b]$ is dense in this space with respect to the $L_2$-norm.

Therefore, let us look for functions $y(\omega) = \mathcal F(f)(\omega)$ which satisfy the differential equation
$$(7.33)\qquad y' = -\frac{\omega}{2a}\,y.$$
Writing $y' = dy/d\omega$, we have
$$\frac{dy}{y} = -\frac{\omega}{2a}\,d\omega,$$
unless the function $y$ equals zero (apparently, $y=0$ is a solution of (7.33)). Integration yields
$$\ln|y| = -\frac{\omega^2}{4a} + \ln|C|, \quad\text{i.e.}\quad y = C\,e^{-\frac{\omega^2}{4a}}, \qquad C\in\mathbb R\setminus\{0\}.$$
Including the zero solution as well, we can express all the solutions of the differential equation (7.33) as the functions
$$y(\omega) = K\,e^{-\frac{\omega^2}{4a}}, \qquad K\in\mathbb R.$$
The constant is determined by the value at $\omega = 0$: using $\int_{-\infty}^{\infty}e^{-at^2}\,dt = \sqrt{\pi/a}$, we get $K = \mathcal F(f)(0) = \frac{1}{\sqrt{2\pi}}\sqrt{\frac{\pi}{a}} = \frac{1}{\sqrt{2a}}$, and therefore
$$\mathcal F(f)(\omega) = \frac{1}{\sqrt{2a}}\,e^{-\frac{\omega^2}{4a}}. \qquad □$$

To prove the lemma, consider first a function $h$ with a single jump discontinuity at the origin, say $h(x)=\operatorname{sgn}(x)$. For every $\delta>0$, we define the function $f_\delta$ as $x/\delta$ for $|x|<\delta$ and $f_\delta(x) = h(x)$ otherwise. Apparently, all the functions $f_\delta$ are continuous, since the point of discontinuity was overcome by a convenient linear function on an interval whose size is controlled by $\delta$. It can be calculated very easily that $\|h - f_\delta\|_2 \to 0$, as the function is bounded in absolute value, and so the contribution of the integration over a shrinking interval has to approach zero. All discontinuity points of a general function $f$ can be treated in exactly the same way; there are only finitely many of them, and so all the considered functions are limit points of sequences of continuous functions.

Now, our proof is already simple, because for the given function $f$, the distance of the partial sums of its Fourier series can be bounded with the help of a continuous approximation $f_\delta$ in this way (all norms in this paragraph are $L_2$-norms):
$$\|f - s_N(f)\| \le \|f - f_\delta\| + \|f_\delta - s_N(f_\delta)\| + \|s_N(f_\delta) - s_N(f)\|,$$
and the particular summands on the right-hand side can be controlled: the first of them is at most $\varepsilon$, and by the assumption of uniform convergence for continuous functions, the second summand can be bounded equally tightly. It is good to notice that the third one is the norm of the partial sum of the Fourier series of $f - f_\delta$.
Thus we have
$$\|(f-f_\delta) - s_N(f-f_\delta)\| \le \|f - f_\delta\|,$$
since the partial sum is the orthogonal projection onto the corresponding finite-dimensional subspace. Therefore, thanks to the triangle inequality,
$$\|s_N(f-f_\delta)\| \le 2\,\|f-f_\delta\| \le 2\varepsilon.$$
Altogether, we have bounded the whole distance, for sufficiently close continuous functions and sufficiently large numbers $N$, by the number $4\varepsilon$. This verifies the $L_2$-convergence we wanted to prove.

7.26. Dirichlet kernel. Finally, we get to the proof of the first statement of theorem 7.8. It follows straight from the definition of the Fourier series $F(t)$ of a function $f(t)$, using its expression with the complex exponentials from 7.7, that the partial sums $s_N(t)$ can be written as
$$s_N(t) = \sum_{k=-N}^{N}\Big(\frac1T\int_{-T/2}^{T/2} f(x)\,e^{-i\omega k x}\,dx\Big)\,e^{i\omega k t},$$
where $T$ is the period we are working with and $\omega = 2\pi/T$. This expression can be rewritten as
$$s_N(t) = \int_{-T/2}^{T/2} K_N(t-x)\,f(x)\,dx,$$
and the function
$$K_N(y) = \frac1T\sum_{k=-N}^{N} e^{i\omega k y}$$
is called the Dirichlet kernel. Let us notice that the sum is a piece of a geometric series with common ratio $e^{i\omega y}$. Thus it can be expressed explicitly for all $y\ne0$ in the following way (both the numerator and the denominator are multiplied by $-e^{-i\omega y/2}$, so that we can substitute the real-valued function $\sin$):
$$K_N(y) = \frac1T\cdot\frac{e^{-iN\omega y} - e^{i(N+1)\omega y}}{1 - e^{i\omega y}} = \frac1T\cdot\frac{e^{-i(N+1/2)\omega y} - e^{i(N+1/2)\omega y}}{e^{-i\omega y/2} - e^{i\omega y/2}} = \frac1T\cdot\frac{\sin\big((N+\tfrac12)\,\omega y\big)}{\sin(\omega y/2)}.$$
At the point $y=0$, of course, we see that $K_N(0) = \frac1T\,(2N+1)$. It is apparent from the last expression that $K_N(y)$ is an even function, and using l'Hospital's rule, we can quickly calculate that it is continuous everywhere.

7.37. Determine the function $f$ whose Fourier transform is $\mathcal F(f)(\omega) = \frac{\sin\omega}{\sqrt{2\pi}\,\omega}$.

Solution. If we substitute $-\omega$ for $\omega$ in the integral over the interval $(-\infty,0]$, we obtain
$$f(t) = \frac{1}{2\pi}\int_0^{\infty}\frac{\sin\omega}{\omega}\big[\cos(\omega t) - i\sin(\omega t) + \cos(\omega t) + i\sin(\omega t)\big]\,d\omega = \frac1\pi\int_0^{\infty}\frac{\sin\omega\,\cos(\omega t)}{\omega}\,d\omega.$$
Let us mention that the previous expression can be obtained already from the fact that the function $y = \frac{\sin\omega}{\omega}$ with maximal domain is even. Using the identity
$$\sin x\cdot\cos(xy) = \tfrac12\big(\sin[x(1+y)] + \sin[x(1-y)]\big), \qquad x,y\in\mathbb R,$$
which, among other things, follows from the sum formulae for the sine function, we get
$$f(t) = \frac{1}{2\pi}\Big[\int_0^{\infty}\frac{\sin[\omega(1+t)]}{\omega}\,d\omega + \int_0^{\infty}\frac{\sin[\omega(1-t)]}{\omega}\,d\omega\Big].$$
The substitutions $u = \omega(1+t)$, $v = \omega(1-t)$ then give
$$f(t) = \frac{1}{2\pi}\Big(\int_0^{\infty}\frac{\sin u}{u}\,du - \int_0^{\infty}\frac{\sin v}{v}\,dv\Big) = 0, \qquad t>1;$$
$$f(t) = \frac{1}{2\pi}\Big(\int_0^{\infty}\frac{\sin u}{u}\,du + \int_0^{\infty}\frac{\sin v}{v}\,dv\Big) = \frac1\pi\int_0^{\infty}\frac{\sin u}{u}\,du, \qquad t\in(-1,1);$$
$$f(t) = 0, \qquad t<-1.$$
Thus we have proved that the function $f$ is zero for $|t|>1$ and constant (necessarily non-zero) for $|t|<1$. (All along, we assume that the inverse Fourier transform exists.) Let us determine the value $f(0)$. The function $g(t)=1$, $|t|\le1$ (and $0$ otherwise), satisfies
$$\mathcal F(g)(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-1}^{1} e^{-i\omega t}\,dt = \frac{2}{\sqrt{2\pi}}\int_0^1\cos(\omega t)\,dt = \sqrt{\frac2\pi}\cdot\frac{\sin\omega}{\omega}.$$
Hence it follows that $f(0) = g(0)/2 = 1/2$. Let us also emphasize the evaluation of the integral
$$\int_0^{\infty}\frac{\sin u}{u}\,du = \frac\pi2,$$
which we have obtained along the way. □

7.38. Solve the integral equation
$$\int_0^{\infty} f(t)\,\sin(xt)\,dt = e^{-x}, \qquad x>0,$$
for an unknown function $f$.

Solution. If we multiply both sides of the equation by the number $\sqrt{2/\pi}$, we obtain just the sine Fourier transform on the left-hand side. Therefore, it suffices to apply the inverse transform to the equation. Thus we get
$$f(t) = \frac2\pi\int_0^{\infty} e^{-x}\sin(xt)\,dx, \qquad t>0.$$
Integrating by parts twice, we can obtain
$$\int e^{-x}\sin(xt)\,dx = \frac{e^{-x}}{1+t^2}\big[-\sin(xt) - t\cos(xt)\big] + C,$$
hence
$$\int_0^{\infty} e^{-x}\sin(xt)\,dx = \frac{t}{1+t^2}.$$
Therefore, the function
$$f(t) = \frac{2t}{\pi\,(1+t^2)}, \qquad t>0,$$
is the solution of the equation. □

Since all the partial sums of the series for the constant function $f(x)=1$ also equal $1$, we get from the definition of the Dirichlet kernel that
$$\int_{-T/2}^{T/2} K_N(x)\,dx = 1.$$
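The explicit formula for the Dirichlet kernel and its unit integral are easy to verify numerically before we use them. A small Python sketch (period, grid, and tolerance chosen by us):

```python
import numpy as np

T = 2 * np.pi
omega = 2 * np.pi / T   # = 1 for this period

def K(N, y):
    # Dirichlet kernel K_N(y) = sin((N+1/2) w y) / (T sin(w y / 2)),
    # with the removable singularity at y = 0 replaced by (2N+1)/T
    y = np.asarray(y, dtype=float)
    num = np.sin((N + 0.5) * omega * y)
    den = np.sin(omega * y / 2)
    safe = np.where(np.abs(den) < 1e-12, 1.0, den)
    return np.where(np.abs(den) < 1e-12, (2*N + 1) / T, num / (T * safe))

ys = np.linspace(-np.pi, np.pi, 400001)
dy = ys[1] - ys[0]
for N in (1, 5, 20):
    # only the k = 0 term of the defining sum survives the integration
    assert abs(np.sum(K(N, ys)) * dy - 1.0) < 1e-4
```

The unit integral is exactly the statement that integrating each exponential $e^{i\omega k y}$ over a full period kills every term except $k=0$.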
In the case of periodic functions, the integrals over intervals whose length equals the period are independent of the choice of the marginal points. Hence, changing the coordinates, we can also use the expression
$$s_N(x) = \int_{-T/2}^{T/2} K_N(y)\,f(x+y)\,dy$$
for the partial sums. Finally, we are fully prepared. First, we focus on the case when the function $f$ is continuous and differentiable at the point $x$. We want to prove that in this case, the Fourier series $F(x)$ converges to the value $f(x)$ at the point $x$. We get
$$s_N(x) - f(x) = \int_{-T/2}^{T/2}\big(f(x+y) - f(x)\big)\,K_N(y)\,dy.$$
The integrated expression can be rewritten into a form which is reminiscent of Fourier coefficients of convenient functions: for $y\ne0$,
$$\big(f(x+y)-f(x)\big)\,K_N(y) = \frac1T\cdot\frac{f(x+y)-f(x)}{\sin(\omega y/2)}\Big(\sin(N\omega y)\cos(\omega y/2) + \cos(N\omega y)\sin(\omega y/2)\Big),$$
where both of the resulting factor functions of $y$ are piecewise continuous (the quotient has a finite limit at $y=0$ thanks to the assumed differentiability of $f$ at $x$). The integrals of their products with $\sin(N\omega y)$ and $\cos(N\omega y)$ are, up to constant multiples, Fourier coefficients, and these approach zero as $N\to\infty$. Hence $s_N(x)\to f(x)$.

For the intensity of the wave $E = A\cos(\omega t - kx)$, where $\omega$ is the angular frequency and, for any fixed $t$, the so-called wavelength $\lambda = 2\pi/k$ is the prime period in $x$, while the quotient $\omega/k$ represents the speed with which the wave propagates, we have
$$c\varepsilon_0\,\frac1\tau\int_0^\tau E^2\,dt = c\varepsilon_0 A^2\,\frac1\tau\int_0^\tau \cos^2(\omega t - kx)\,dt = c\varepsilon_0 A^2\,\frac1\tau\int_0^\tau \frac{1+\cos(2(\omega t - kx))}{2}\,dt = \frac12\,c\varepsilon_0 A^2\,\frac1\tau\Big[t + \frac{\sin(2(\omega t - kx))}{2\omega}\Big]_0^\tau.$$
Averaging over a whole number of periods (or letting $\tau\to\infty$), the oscillating term drops out, and the intensity equals $\frac12 c\varepsilon_0 A^2$, proportional to the squared amplitude.

Next, consider a point $x$ where $f$ has both one-sided limits but is discontinuous. We can split $f$ near $x$ into the sum $f_1 + f_2$ of a function $f_1$ continuous and differentiable at $x$ and an odd jump part $f_2$; the symmetry of the kernel $K_N$ forces the partial sums of the odd part to vanish at the point of the jump. Thus we can refer to the previous continuous case and obtain, for the Fourier series $F(x)$ of our function $f$ at the point of discontinuity, the identity
$$F(x) = F_1(x) + F_2(x) = \frac12\Big(\lim_{y\to x+} f(y) + \lim_{y\to x-} f(y)\Big) + 0,$$
which is what we wanted to prove. In the case of a discontinuity at a general point, we proceed in the same way, and the whole proof has come to an end (so has the proof of the statements (2) and (3) of theorem 7.8, where we required that the Dirichlet condition be true).

3. Integral operators

7.27. Integral operators. In the case of finite-dimensional vector spaces, we can perceive vectors as mappings from a finite set of fixed generators into the space of coordinates. The sums of vectors and the scalar multiples of vectors are then given by the corresponding operations on such functions. We then worked with vector spaces of functions of a real variable in the same way, their values being scalars (or vectors as well). The simplest linear mappings $\alpha$ between vector spaces map vectors to scalars (the so-called linear forms). They were defined as the sum of products of the coordinates $x_i$ of vectors with fixed values $a_i = \alpha(e_i)$ at the generators $e_i$, i.e., by one-row matrices:
$$(x_1,\dots,x_n) \mapsto (a_1,\dots,a_n)\cdot(x_1,\dots,x_n)^{\mathsf T}.$$
More complicated mappings, with values lying in the same space, are given similarly by square matrices. We can approach linear operations on spaces of functions in an analogous way. For the sake of simplicity, we will work with the real vector space $\mathcal S$ of all piecewise continuous real-valued functions with compact support, defined on the whole of $\mathbb R$ or on an interval $I=[a,b]$. Linear mappings $\mathcal S\to\mathbb R$ will be called (real) linear functionals. Examples of such functionals can be given in two different ways: by evaluating the function's values (or its derivatives' values) at some fixed points, or in terms of integration. We can, for instance, consider the functional $L$ given by evaluation at a single fixed point $x_0\in I$, i.e.,
$$L(f) = f(x_0).$$
Or we can have a functional given by integration against a fixed function $g(x)$, i.e.,
$$L(f) = \int_a^b f(x)\,g(x)\,dx.$$
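The weighted-averaging interpretation of such functionals, discussed next, is easy to watch numerically. A minimal Python sketch of the uniform averaging functional over a shrinking $\varepsilon$-neighborhood of the origin (function, grid, and values of $\varepsilon$ are our choices):

```python
import numpy as np

def average(f, eps, n=100001):
    # L(f) = (1/(2 eps)) * integral of f over [-eps, eps], trapezoid rule
    xs = np.linspace(-eps, eps, n)
    w = np.full(n, xs[1] - xs[0])
    w[0] /= 2
    w[-1] /= 2
    return np.dot(w, f(xs)) / (2 * eps)

for eps in (0.5, 0.05, 0.005):
    print(eps, average(np.cos, eps))
# the averages tend to cos(0) = 1 as eps -> 0; here the error behaves like eps^2
```

For a continuous $f$, this functional converges to the evaluation functional $f\mapsto f(0)$ as $\varepsilon\to0$, which is the bridge between the two kinds of examples above.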
the case of g(x) — 1 for all points x. We can get a good image from the choice . . (O if|x|>e Is if|x| 0. The integral of the function g over R equals one, and our linear functional can be perceived as a (uniform) averaging of the values of the function / over the e-neighborhood of the origin. Similarly, we can work with the function /p/2 r-q/2 e-^x dx -p/2 J-q/2 -ikrjy dy ~ g—ik^x ~ p/2 - £-iktjy - -p/2 _ —ikt] q/2 0 if \x\ > £ if \x\ < s 2 sin(/c§/?/2) 2 sin(kr]q/2) krj Apq sin(k^p/2) sin(kr]q/2) kijp/2 krjq/2 The graph of the function f(x) = looks as follows: The graph of the function VK§, vi) sin £ sin rj 1 ~ then does: And the diffraction we are describing: _^ith which worked in the paragraph 6.6. This function is smooth on the whole R with a compact support on the interval (—e, e). Our functional has the meaning of a weighted combination of the values, but this time, the weights of the input values decrease rapidly as their distance from the origin increases. Surely, the integral of g over the whole R is finite, yet it is not equal to one. Dividing g by this integral would lead to a functional which would have the meaning of a non-uniform averaging of a given function /. There is another quite common instance, the so-called Gaussian function g(x) = -e * , it which also has a unit integral over the whole R (we will verify this later). This time, all the input values x in the corresponding "average" have a non-zero weight, yet this weight becomes insignificantly small as the distance from the origin increases. We could observe another example with a unit integral over the whole a while ago when we were discussing the Dirichlet kernels g(x) — Kn(x) for Fourier series. 7.28. Function convolution. Integral functionals from the previous paragraph can easily be modified to obtain a "steamed averaging" of the values of a given function / near a given point jel: Ly(f) -f J —c f(x)g(y -x)dx __\ Convolution of functions of a real variable |___ The free parameter y in the definition of the functional Ly(f) can be perceived as a new independent variable, and our operation Ly actually maps functions to functions again, / i->- /: f(y) = Ly(f)= f f(x)g(y-x)dx. This operation is called the convolution of functions f and g, denoted f * g. The convolution is mostly defined for real or complex functions on R with a compact support. 449 CHAPTER 7. CONTINUOUS MODELS I- 1-10'rad - Orad L-1-10 rad Since lim^o = 1, the intensity at the middle of the image is directly proportional to To = A2p2q2. The Fourier transform can be easily scrutinized if we aim a laser pointer through a subtle opening between the thumb and the index finger; it will be the image of the function of its permeability. The image of the last picture can be seen if we create a good rectangular opening by, for instance, gluing together some stickers with sharp edges. 7.40. Find the solution to the so-called equation of heat conduction (equation of diffusion) ut(x, t) = a2 uxx(x, t), x € M, t > 0 satisfying the initial condition lim u(x, t) = f(x). Notes: The symbol ut = ^ stands for the partial derivative of the the u with respect to t (i. e., differentiating with respect to t and considering x to be constant), and similarly, uxx = £f- denotes the second partial derivative with respect to x (i. e., twice differentiating with respect to x while considering t to be constant). 
The physical interpretation of this problem is as follows: We are trying to determine the temperature u (x, t) in an thermally isolated and homogeneous bar of infinite length (the range of the variable x) if the initial temperature of the bar is given as the function /. The section of the bar is constant and the heat can spread in it by conduction only. The coefficient a2 then equals the quotient f^, where a is the coefficient of thermal conductivity, c is the specific heat and q is the density. In particular, we assume that a2 > 0. Solution. We apply the Fourier transform to the equation, with respect to variable x. We have T (ut) (co, t) '2Jr / ut(x, t)e-imxdx f u(x,t)e-imx dx —oo where differentiated with respect to t, i. e., T (ut) (co, t) = (T (u) (co, 0)' = (T (u))t (co, t). At the same time, we know that By the transformation t — z — x, we can easily calculate that /oo f(x)g(z- x)dx -oo f— oo f(z-t)g(t)dt = (g*f)(z). / Thus the convolution, considered a binary operation * : Sc x Sc —> Sc of pairs of functions having compact supports, is commutative. Similarly, convolutions can be considered with integration over a finite interval; we only have to guarantee that the functions participating in them be well-defined. Especially, this can be done for periodic functions with integrating over an interval whose length equals the period. The convolution is an extraordinarily useful tool for modeling the way in which we observe the data of an experiment or the influence of a medium through which information is transferred (for instance, an analog audio or video signal affected by noise, and so on) The input value / is the transferred information and the function g is chosen so that it would express the influence of the medium or the technical procedure used for the signal processing or the processing of any other data. 7.29. Gibbs phenomenon. Actually, we have already seen a useful case of convolution. In paragraph 7.26, we in-terpreted the partial sum of the Fourier series for a function / as a convolution with Dirichlet kernel KN(y) = £ ~icoky -772 r This interpretation allows us to explain the so-called Gibbs phenomenon mentioned in paragraph 7.9. 7.30. Fourier transform. The convolution is one of many examples of a general integral operators on spaces of functions L(f)(y) -f Ja f(x)k(y, x) dx. The function k(y, x), dependent on two variables, is called the kernel of the integral operator L. The domain of such functionals must be chosen in view of the properties of the particular kernels so that the used integral would exist at all. The theory of integral operators with kernels and equations they contain is very useful and interesting at the same time. However, we do not have enough space for it here. We will focus only on an extraordinarily important case, the so-called Fourier transform T, which has deep connections with Fourier series. Let us remind that a function f(t), given by its converging Fourier series, equals where the numbers c„ are complex Fourier coefficients, con — nljt/T with period T, see paragraph 7.7. Having fixed T, the expression Aco — 2it/T describes just the change of the frequency caused by n being increased by one. Thus it is just the discrete step by which we change the frequencies when 450 CHAPTER 7. CONTINUOUS MODELS T (a2 uxx) (co, t) = a2 T (uxx) (co, t) = —a2a? T (u) (oo, t). Denoting y(co, t) = T (u) (co, t), we get to the equation yt = -a2co2 y. 
We already solved a similar differential equation when we were calculating Fourier transforms, so it is now easy for us to determine all of its solutions y(co, t) = K(co) e-fl2ft>\ K(co) e R. It remains to determine K(co). The transformation of the initial condition gives T(f) (co) = lim F(u) (oo, t) = lim y(oo, t) = K(oo)e° = K(co), t^0+ t^0+ hence y(co, t)=T (/) (oo) e-a2°?t, K(oo) e R. Now, using the inverse Fourier transform, we can return to the original differential equation with solution u(x, t) 2Jr f y(oo, t) eimx doo -4- f T(f) (oo) e-"2a)2' eia)X doo ^ f I -jL= f f(s) e-ias ds ) &~amt &imx doo -j== f f(S) I -JL. f e-aWt e-ia*s-x) d(D | ds Computing the Fourier transform F(f) of the function f(t) for a > 0, we have obtained (while relabeling the variables) -at2 'dp /2c e ^, c > 0. According to this formula (consider c = a2t > 0, p we have oo, r x), Therefore, ./2t7 J C u(x, t) 2m2t e-ia*s-x) dM I2a2t 2a ■Jilt J f(s)e 4a2' ds. □ 7.41. Determine the Laplace transform C(f)(s) of the function (a) f(t) = tat; (b) f(t) = Cleait + c2ea2t; (c) f(t) = cos ibt); (d) fit) = sin ibt); (e) fit) = cosh ibt); (f) fit) = sinh ibt), where the values b e R and ci,c2 € C are arbitrary and the positive number i e Kis greater than the real parts of the numbers a,a\,a2 e C and it is also greater than b in the problems (e) and (f). Solution. The case (a). It follows directly from the definition of the Laplace transform that calculating the coefficients of the Fourier series. The coefficient 1/T in the formula T/2 T/2 f(t) e~la)nt dt then equals Aco/2tt, so the series for f(t) can be rewritten as 'T/2 f(t) i — — rv-i x T/2 /We" dx e" Now, let us imagine the values con for all n e Z as the chosen repitesentatives for small intervals [co„, co„+i] of length Aco. Then, offi®pression in the big inner parentheses in the previous formula fa^K) actually describes the summands of the Riemann sums for thetlmproper integral g(co)eat da, where g(co) is a function which takes, at the points con, the values -T/2 g(e>n) I. T/2 /We" dx. We are working with piecewise continuous functions with a compact support, thus our function / is integrable in absolute value over the whole R. Letting T —>• oo, the norm Aco of our subinter-vals in the Riemann sum gets finer. At the same time, in the last expression, we obtain the integral g(o>) r J —c /We" dx. The previous reasonings show that there is quite a large set of Riemann integrable functions / on R for which we ^ (\ can define a pair of mutually inverse integral opera-U tors: Fourier transform J____ For every piecewise continuous real or complex function / on . with a compact support, we define Hf)(co) = f(co) i r \[7jz J-c Z(0e" dt. This function / is called the Fourier transform of the function /. The previous reasonings also show that we will have f(t) = J^1 (f)(t) — f \/2jT J-c f(co)elmt dco. This says that the Fourier transform J7 just defined has an inverse operation J7-1, which is called inverse Fourier transform. Let us notice that both the Fourier transform and its inverse are integral operators with almost identical kernels ±iojt k(co, t) — e: Of course, there transforms are meaningful for much greater domains. Interested readers are referenced to specialized literature. 451 CHAPTER 7. CONTINUOUS MODELS £ (f) (s) = f eat e~st dt = fe~ lim (^-) -(s-a) (s—a)t 1 The case (b). Using the result of the above case and the linearity of improper integrals, we obtain oo oo £ (/) (s) = a f eait e~st dt + c2 f e"2t e~st dt = ^ + 0 0 12 The case (c). 
Since cos (bt) = \ (eibt + &~ibt) , the choice c\ = 1/2 = c2, a\ = ib, a2 = —ib in the previous variant gives £ (/) (*) = /( 1 „ibt + 5° i ) dt i + i 2(s-ib) 1 2(s+/fc) 2+fc2- The cases (d), (e), (f). Analogously, the choices (d) c\ = —ill, c2 = i/2, ci\ = ib, a2 = —ib; (e) ci = 1/2 = c2, ai = b,a2 = -b; (f) ci = 1/2, c2 = -1/2, ai = b,a2 = -b lead to (d) £ (/) (S) - fr-Ce) £ (/) (i) (f) £ (/) (*) s2+fc2 ' s . s2_h2 > s2-fc2 • 7.42. Using the relation (7.34) □ C(f')(s)=sC(f)(s)- lim f(t), derive the Laplace transforms of the functions y = cos t and y = sin t. Solution. First, let us realize that from (||7.34||), it follows that £(f")(S) = S£(f')(S)- lim f'(t) = - Hm f'(t) = s sC (/) (S) - lim f(t) s2C(f)(s)-s lim fit) - lim /'(f). Therefore, -£ (siní) (5) = £ (- siní) (5) = £ ((siní)") (5) = s2 £ (sin ř) (5) — s lim sin ř — lim cos t = s2 £ (sin ř) (5) — 1, whence we get —£ (siní) (s) = s2£ (siní) (5) — 1, i. e. £ (siní) (5) Now, invoking (||7.34||), we can easily determine s2+r £ (cosi) (s) = £ ((siní)') 00 = s ^ - Jim siní = □ 7.43. For s > — 1, calculate the Laplace transform £ (g) (s) of the function g(t) = te~'. Further, for s > 1, calculate the Laplace transform £ (h) (s) of the function 7.31. Simple properties. The Fourier transform changes the local and global behavior of functions in an interesting way. Let us begin with a simple example in which we find a function f(t) which is transformed to the indicator function of the interval Q], i. e., /(&>) = 0 for M > Q, and /(&>) = 1 for M < Q. The inverse transform ^F~l gives f(t) = —r 1 '2jt 1 1 :(e' sPhtt 2i 2Q sin(ňř) 2Ťř Thus, except for a multiplicative constant and the scaling of the input variable, it is the very important function sinc(x) — Straight calculation of the limit at zero (1' Hospital's rule) gives /(0) — 2Q(2tt)~1/2, the closest zero points are at t — ±ir/ £2 and the function drops to zero quite rapidly outside the origin x — 0. This function is caught in the picture by a wavy curve for £2 — 20. Simultaneously, the area where our function f(t) keeps waving more rapidly as £2 increases is also depicted by a curve. Omega = 20.000 We can see the indicator function of the interval £2] is Fourier-transformed to the function /, which has takes significant positive values near zero, and the value taken at zero is a fixed multiple of £2. Therefore, as £2 increases, the / concentrates more and more near the origin. Further, we will derive the Fourier transform of the derivative fit) for a function /. We keep supposing that / has a compact support, i. e., especially both T(f') and T(f) really exist. Let us use integration by parts: nf)(u) i V2jr i(tíT(f)((tí) = tH /(0e" dt f J —c /(0e- dt Thus we can see that Fourier transform converts the (limit) operation of differentiation to the (algebraic) operation of a simple multiplication by the variable. Of course, this formula can be iterated, obtaining Hf")(o>) = -a>LHf ),■■■, Hfin)) onHf). 452 CHAPTER 7. CONTINUOUS MODELS hit) = t sinhf. Solution. Integrating by parts, we obtain C ig) (s)= ft e"' e~st dt = ft e"(s+1)i dt = lim (^ztt ) - 0 I dt (lim 1 (s+l> ,2 • Differentiating the Laplace transform of a general function — / (i. e., an improper integral) with respect to the parameter s gives ' oo \ 1 oo oo / -fit) e~st dt) =f -fit) (e"")' dt = ftfit) e"s' dt. vO / 0 0 This means that the derivative of the Laplace transform £(—f)is) is the Laplace transform of the function tfit). The Laplace transform of the function y = sinh t has already been determined as the function y = -jry;. 
Therefore, (ä)(*) = (-Ä)' 2s (s2-l)2- Let us notice that we could also have determined C ig) is) this way. □ The basic Laplace transforms are enumerated in the following table: yit) Ciy)is) teai feat sincot cos cot eat sin cot eat (cos cot + - sin cot) t sin -s)t df 1 (ia)—s)t s — ico 1 eim --(lim — s — ico t^oo est s co + i- 1) 1 s + ico s — ico is — ico) is + ico) s1 + co1 s1 + co □ 7.45. Let Ciy)is) denote the Laplace transform of a function yit). Using integration by parts, prove that Solution. (7.35) Ciy')is) = sCiy)is) - y(0) 7.32. The relation to convolutions. There is another extremely important property, the relation between convolutions and Fourier transforms. Let us calculate the transform of the convolution h — f * g, where, as usual, we assume that the functions have compact supports. We will switch the order of integration, which is a step whose correctness will be verified later in differential and integral calculus, see ??. In the next little step, we will introduce the substitution t — x — u. Hh)i2tc As Q, goes to oo, the left-hand expression transforms to 1 iTig))iz) — giz), while on the right-hand side, we get giz) f g(t)8(z - t)dt. The wanted 8(f) thus looks as a "function" which takes zero everywhere except the single point t — 0 where it "takes such an infinite value" that integrating the product of 8(t) and any integrable function g gives just the value of g at the point t — 0. Of course, it is not a function in the common sense, but it is an object used quite often. It is called the Dirac function 8 and it can be described correctly as an instance of the so-called distribution. Since we do not have enough space and time, we will not pay further attention to distributions. We only mention that the Dirac 8 can be imagined as a unit impulse at a single point. Its Fourier transform is the constant function Ti8) ico) — —)=. s/LTZ 453 CHAPTER 7. CONTINUOUS MODELS C(f)is) =s2C(y)-sy(0)-y'(0) and, by induction: n £(y(n))(s) = sn£iy)is) - JV-' y(i~l) (0). 7.46. Find a function y(t) satisfying the differential equation fit) + 4y(0 = sin 2? and the initial conditions y(0) = 0 and y (0) = 0. Solution. From the previous example ||8.149||: s2Ciy)is) + 4£(y)(s) = £ (sin 2?) 00 At the same time, 2 □ l. e., £ (sin 2000 C(y)is) = s2 + 4' 2 is2 + 4)2 The inverse transform gives y(t) = i sin 2? — \t cos2f. □ 7.47. Find a function y(0 satisfying the differential equation y" it) + 6y' (0 + 9y(0 = 50 sin t and the initial conditions y(0) = 1 and y (0) = 4. Solution. The Laplace transform gives s2Ciy)is) - s - 4 + 6isCiy)is) - 1) + 9£(y)(s) = 50£(sin000, i. e., is2 + 6s + 9)£iy)is) 50 + 1 + s +10, £(y)(s) 50 s +10 + is2+l) is + 3)2 is + 3)2' Decomposing the first term into partial fractions, we get 50 As + B C D +-t + so is2 + l)is + 3)2 s2 + l s + 3 is+3)2 50 = (As + B)(s + 3)2 + Cis2 + l)(s + 3) + D(s2 + 1). Substituting s = —3, we obtain 50 = 10D so D = 5, and confronting the coefficients at s3 0 = A + C, so A = -C. Confronting the coefficients at s then yields 4 0 = 9A + 6B + C = 8A + 6B, so B = -C. Confronting the absolute terms, we get 50 = 9S + 3C + D = 12C + 3C + 5 so C = 3, B = 4, A = -3. On the other hand, many functions which are not integrable in absolute value on R are Fourier-transformed to expressions with Dirac 8. For instance, Jr(cos(nf))('w) = J-(^(n — &)) + 8(n + ft))), which can easily be seen from the calculation of the Fourier transform of the function fa cos(nx) and then be letting Q approach oo. 
We can get the Fourier transform of the sine function in a similar way, we can also take advantage of the fact that the transform of the derivative of this function will differ only by a multiple of the imaginary unit and the variable. These transforms are a base for Fourier analysis of signals: If a signal is a pure sinusoid of a given frequency, then this is recognized in the Fourier transform as two single-point impulses right at the positive and negative value of the frequency. If the signal is a linear combination of several such pure signals, we obtain the same linear combination of single-point impulses. However, since we always process a signal in a finite time interval only, we actually get not single-point impulses, but rather a wavy curve similar to the function sine with a strong maximum at the value of the corresponding frequency. The size of this maximum also yields the information about the original amplitude of the signal. 7.34. Fourier sine a cosine transform. If we apply the Fourier transform to an odd function f(t), i. e., /(—t) — —f(t), the contribution in the integration of the product of f(t) and the function cos(±ft)f) cancels for positive and negative values of t. Thus straight calculation gives F(f)(co) ——= I f(t)sincotdt. I In The resulting function is odd again, hence by the same reason, the inverse transform can determined in a similar way: F(f)(co) ——= I f(t)sincotdt. /2it Omitting the imaginary unit i gives mutually inverse transforms, which are called the Fourier sine transform for odd functions: fAto) fit) 2 r V it Jo V it Jo f(t) sin (ft>?) dt, fs (t) sin(ft)f) dt. Similarly, one can define the Fourier cosine transform for even functions: fdeo) = fit) f(t) cos (ft>?) dt, 12 r * Jo [2 [°° - — I fs it) sin <*)t dt. * Jo 7.35. Laplace transforms. The Fourier transform cannot be applied to functions which are not integrable in absolute value over the entire R (at least, we do not obtain true functions). The so-called Laplace transform acts quite similarly as the Fourier one and is flawless in this sense: £(/)(*) = m Jo f(t)&~st dt. 454 CHAPTER 7. CONTINUOUS MODELS Since we have s + 10 s + 3 + 7 1 + 7 (s + 3)2 (s + 3)2 s+3 (s + 3) 2 ' £(y)(s) -3s+4 I _3_ I 5 s+3 ~T (s+3)2 s2 + l + ^- + 7 ^ s+3 ^ (s+3)2 -3s I__4__|_ _4_ I 12 s2+l s2+l s+3 (s+3)2- Hence, using the inverse Laplace transform, we get the solution in the form The integral operator C has a rapidly reducing kernel if s is a positive real number. Therefore, the Laplace transform is usually perceived as a mapping of suitable functions on the interval [0, oo) to the function on the same or shorter interval. The image C (p) exists, for example, for every polynomial p(t) and all positive numbers s. In an analogous way as in the case of the Fourier transform, we can get the formula for the Laplace transform of a differentiated function for s > 0 using integration by parts: C(f'(t))(s) y(t) -3cos? +4sin? + 4e 3t + 12te 3t. / /'(0e-Jo dt -St-iOO Jo 7.48. Find the Laplace transform of Heaviside's function H(t) and shifted Heaviside's function Ha(t) = H(t — a): '0 forf<0, for t = 0, for t > 0. poo / f(t)e-st dt Jo sC(fXs). H(t) Solution. C(H(t))(s) / H(t)e~stdt = Jo Jo dt C(Ha(t))(s) C(H(t -a))(s) f Jo ;(0-l) POO / H Jo (t - a)e st At f Ja -s(t+a) dt = e~as C(H(t))(s) □ =[/(0e" = -/(0) The properties of the Laplace transform and many more transforms used especially in technical practice can be found in specialized literature. 4. 
Discrete transforms The Fourier analysis of signals mentioned in the previous paragraph used to be realized by special analog circuits in radio technology, for instance. Nowadays, we work only with discrete data when processing signals by computer circuits. Let us assume that there is a fixed (tiny) sample interval r given in a (discrete) time variable and that our signal repeats with period A^t (for a very large natural number AO, which is the maximal period which can be rep--st ^resented in our discrete model. □ 7.49. Show that (7.36) C(f(t).Ha(t))(s) Solution. C(f(t).Ha(t))(s) e-as£(f(t + a))(s) poo pc / f(t)H(t - a)e~st dt = Jo Ja poo pc / f(t + a)e-s(t+a) dt = e~as Jo Jo e-asC(f(t + a))(s). f(t)e~stdt f(t +a)e~st dt □ 7.50. Solution. Find a function y(t) satisfying the differential equation and the initial conditions: /(0 +4y(t) = f(t), y(0) = 0, /(0) = -1, where the function f(t) is piecewise continuous: cos(2f) for 0 < t < 7T, 0 for t > 7T. This problem is a model of undamped oscillation of a spring (excluding friction and other phenomena like non-linearities in the toughness of the spring and so on) which is initiated by an outer force during the initial period only and then ceases. 455 CHAPTER 7. CONTINUOUS MODELS The function fit) can be written as a linear combination of Heav-iside's function u(t) and its shift, i. e., fit) = cos(2f)(w(0 - MO)- Since Ciy")is) = s2Ciy) - syiO) - y'(0) = s2 Ciy) + 1, we get, making use of the previous examples 7 and 8, the right-hand sides to the calculation of the Laplace transform s2Ciy) + 1 + ACiy) = £(cos(2t)(u(t) - MO)) = £(cos(20 • u(t)) - £(cos(20 • MO) = £(cos(20) - e~ns£(cos(2(f + jt)) sz + 4 Hence, = -TT7 + d " g"m)f2L,2-sz + 4 (sz + 4)z The inverse transform then yields the solution in the form y(0 = -\ sin(2f) + \t sin(20 + C~l (e~™ ^ $+ ^ According to (||7.36||), C~\e~nS^T^) = \C-He-dt,mi2t))) = it - jt) sin(2(f - jt)) ■ Hn(t). Since Heaviside's function is zero for t < jt and equals 1 for t > jt, we get the solution in the form -\ sin(2f) + \t sin(20 for 0 < t < jx I sin(2f) for t > Jt □ 7.51. Find a function y(i) satisfying the differential equation f it) = cos ijtt) - yit), t e (0, +oo) and the initial conditions y(0) = c\, / (0) = c2. Solution. First, let us emphasize that from the theory of ordinary differential equations, it follows that this problem has a unique solution. Further, let us remind that and C (/") (*) = s2£ (f) (s) - s lim fit) - lim fit) £(cos ibt)) is) s2+h2 , Applying the Laplace transform to the given differential equation thus gives s2C (y) is) - sci -c2 = -zj-r - £ (y) is), p. r r„\ tc\ c„. — -A. l. e., S C\S c2 (7.37) C iy) is) = + -y1— + -Tj-. (s2 + 1) (s2 +Jt2) sz + 1 s1 + 1 It suffices to find a function y satisfying (||8.12||). Partial fraction decomposition gives _s_ _ 1 / s _ s \ (s2+\)(s2+jt2) ~ n2-l \s2+l s2+n2 ) ' 456 CHAPTER 7. CONTINUOUS MODELS Therefore, from the expression of £ (cos (bt)) (s) mentioned above and the proved equality 1 2+i' £ (sin t) (s) = we already obtain the wanted solution y(t) = -^—j (cos t — cos (jtt)) + ci cos t + c2 sin t. □ 7.52. Solve the system of differential equations x" (0+x' (0 = y(t)-f (t)W, x' (t)+2x(t) = -y(t)+y (t)+e~' with initial conditions x(0) = 0, y(0) = 0, x' (0) = 1, / (0) = 0. Solution. Once again, we apply the Laplace transform. 
Together with this transforms the first equation to s2 £ (x) (s) — s lim x(t) — lim x' (t) + s£ (x) (s) — lim x(t) = £ (y) (s) - (s2£ (y) (s) - s lim y(t) - lim yf (t)) + ^ and the second one to s£ (x) (s) - lim x(t) + 2£ (x) (s) = -£ (y) (s) + s£ (y) (s) - lim y(t) + ^. If we enumerate the Umits (according to the initial conditions), we get the linear equations s2£ (x) (s)-l+ s£ (x) (s) = £ (y) (s) - s2 £ (y) (s) + ^ and s£ (x) (s) + 2£ (x) (s) = -£ (y) (s) + s£ (y) (s) + ^ with a unique solution Once again, we use partial fraction decomposition, obtaining £ (x) (s) - i ^- + ^ —±__IJ- - 1 i , I _J_ Since we have already calculated that £ (t e"') (*) = -L-i, £ (sinh 0 (*) = £ (f sinh 0 (s) 2s (*2-ir we get x(t) = | f e~l + \ sinh f, y(t) = | f sinh t. The reader can verify that these functions x and y are really the wanted solution. We strongly recommend to perform the verification (for instance for the reason that the Laplace transforms of the functions y = e',y = sinh t and y = t sinh t were obtained only for s > 1). □ 7.53. Find the solution to the following system of differential equations: x'(t) = -2x(t) + 3y(0 + 3t2, y'(t) = -4x(t) + 5y(0 + e', x(0) = 1, y(0) = -1 Solution. £(x')(s) = £(-2x + 3y + 3t2)(s), £(y')(s) = £(-4x + 5y + e')(s). 457 CHAPTER 7. CONTINUOUS MODELS The left-hand sides can be written using (||8.11||), and the right-hand ones can be rewritten thanks to linearity of the operator C. Since C(3t2)(s) = ^ and C(e')(s) = j^, we get the system of linear equations sC(x)(s) - 1 sC(y)(s) + 1 -2C(x)(s) + 3C(y)(s) + -4C(x)(s) + 5C(y)(s) + s-l ■ After rearrangements, we get X(s)x(s) = b(s) in matrices, where we denoted 's + 2 -3 \ ... fC(x)(s)\ , , ( 1 + 4 4 ,-5>X^ = Uy)(,))andb^ = (-l+S7lr Ms) Cramer's rule says that C(x)(s) = ^1, C(y)(s) = M where s + 2 -3 4 s-5 1 + 3s+ 2, + Ä 5 + 2 1 + 4 -1 + 1 -3 s-5 l s-l (s-5)(l + ^) + 3(-l + ^T) (* + 2)(-l + ^T) (s -5)(s3 +6) 24 o3 • |A| = |Ai| = |A2| = Hence C(x)(s) £(y)(s) ■■ (s - l)(s - 2) V * - i sj Using partial fraction decomposition, we can express the Laplace images of the solution C(x)(s)- 39 3 ' 28 21 (s - \)(s - 2) V s3 s-\ 1 /(s + 2)(2-s) 4s3 +24 39 "2s2 (s-l)2 + 28 s-l £(x)(s) (s-l)2 + 27 s-l 4(s-2) __7__ s-2 25 «3 87 4s ' 12 «3 21 s and then we arrive at the solutions of Cauchy's problem with the inverse transformation: x(t) -- y(t) 39 t - 3te' + 28 1 ■ The vector representing the data is then decomposed orthogonally and some basis vectors (columns of the matrix C ) are dropped. This produces a reduction of data reasonably approximating the original set. The inverse transform is easy. Since C is orthogonal, it is given by multiplication by the transposed matrix. Show that for n = 2, the matrix C equals ^ j ^ and that it is orthogonal. Calculate the orthogonal decomposition of vector (3, 4) 458 CHAPTER 7. CONTINUOUS MODELS CCJ with respect to the basis formed by the columns of the matrix and determine the eigenvalues and eigenvectors. Solution. Let us calculate 1/1 1 Wl 1 \ 1 (2 0^ 2 \1 -iJ'V1 "V ~ 2 V° 2-Thus, the matrix C is really orthogonal and its columns create an or-thonormal basis e\ = (-^, -^),e2 = — The coefficients of the orthogonal decomposition of the vector u = (3, 4) can be obtained easily by applying the transposed matrix 75 G -i)G) = Therefore, the orthogonal decomposition has the following form: 7 A\ 1 1. C1 u ^2 \v2/ 72 \ V2y The characteristic polynomial of the matrix C is (A+) (A—-j|) — 5 = 0, so the eigenvalues are A^2 = ±1 (an orthogonal matrix cannot have any others). 
The corresponding eigenvectors are determined by the respective equations (- So these are, for instance, the vectors (-^-^, 1 — (^,— 1 (which are orthogonal automatically). Remark. Try to draw a picture of the action of the mapping determined by the matrix A on some vector in the plane. 7.55. Discrete cosine transform 2. Show that the symmetric matrix /0 1 ... 0 0\ 1 0 ... 0 0 0, (-L + l)* + 0. □ 0 0 ... 0 1 \0 0 ... 1 0/ has eigenvalues Xt = cos \y\. Therefore, the domain of this function is {{x, y) e \x\ > \y\}- You can see the graph of this function in the picture. At the very beginning of our journey through the mathematical countryside, we have seen that it is not difficult to work with more parameters simultaneously since vectors can be manipulated as easily as scalars. We only have to think things out well. Now, once again, we will deal with situations when the mathematically expressed relations depend on more (yet still finitely many) parameters. We will see that there is no need of brand new ideas; it will often do to reduce the problems we encounter to the ones we are able to solve. At the same time, we can finally return to the discussion of situations when the function values are described in terms of instantaneous changes - i. e., we will stop for a while to look at ordinary and partial differential equations. At the very end, we will introduce the so-called variation problems. As usual, we will try to comment on the discrete variants of our approaches or problems on the fly. 1. Functions and mappings on R" 8.1. Multivariate functions. For the modeling of processes (or objects in graphics) in practice, we can seldom do with functions R -> R of one variable. At least, functions dependent on parameters are necessary, and the dependence of the change of the results on the parameters is often more important than the result itself. Therefore, we will consider the functions fixux2,..., jc„) : R" -* R, and we will try to extend our methods for monitoring the values and their changes for this situation. We call them functions of more variables or, more compactly, multivariate functions. We will often work with the cases n — 2 or n — 3 so that the concepts being introduced would be easier to understand, and we will, in these cases, use the letters x, y, z instead of numbered variables. This means that a function / defined in the "plane" R2 will be denoted f :R2 3 ix, y) i-> fix, y) e R, and, similarly, in the "space" R3 f :R3 3 ix, y, z) k> fix, y, z) e R. Just like in the case of univariate functions, we talk about the domain A c 1" on which the function in question is defined. When examining a function given by a concrete formula, the first task is often to find the largest domain on which the formula makes sense. CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES It is also useful to consider the graph of a multivariate function, i. e., a subset G/CK"xI = R"+\ defined by c) This formula is again a composition of a logarithm and a polyno- mial of two variables. However, the polynomial — x2 y takes on only non-positive real values, where the logarithm is undefined (as a function R -» R). d) This formula correctly defines a value iff the argument of the arc sine lies in the interval [—1, 1], which is broken by exactly those pairs (x, y) € R2 whose first component is rational. The formula thus defines a function on the set {(x, y), x e R \ Q}. 
e) The argument of the square root must be non-negative, the argument of the natural logarithm must be positive, and the argument of the arc sine must be from [—1,1]. □ B. The topology of En 8.2. A known fact about the space En is that the shortest path between a pair of points is a line segment. However, many more metrics can be defined on the space R" (or on its subsets). For instance, considering a map of a state as a subset of R2, the distance of two points may be defined as the time necessary to get from one of the points to the other by public transport or on foot. In France, for example, the shortest paths between most pairs of points in this metric are far from line segments. 8.3. Show that every non-empty proper subset of En has a boundary point (which need not he in it). Solution. Let U c En be a non-empty subset with no boundary point. Consider a point X e U, a point Y e U' := E„ \U, and the line segment XY c En. Intuitively, going from X to Y along this segment, we must once get from U to U', and this can happen only at a boundary point (everyone who has ever been to a foreign country is surely well acquainted with this fact). Formally, let A be the point of XY for which \XA\ = sup{|XZ|, XZ € U} (apparently, there is exactly one such point on the segment XY). This point is a boundary point of U: it follows from the definition of A that any line segment XB (with B e XA) is contained in U; in particular, B e U. However, if there were a neighborhood of A contained in U, then there would exist a part of the line segment XY longer than XA which would be contained in U, which contradicts the definition of the point A. Therefore, any neighborhood of the point A contains a point from U as well as a point from E„\U. □ Gf — {(xi, , x„, f(x\, ...,x„)); (xi, , x„) e A}, where A is the domain of /. For instance, the graph of the function defined in the plane by the formula fix, y) x + y x2 + y2 is quite a nice surface, caught in the picture. The maximal domain of this function consists of all the points of the plane except for the origin (0, 0). When defining the function, and especially when drawing its graph, we used fixed coordinates in the plane. If we fix the value of either of the coordinates, only one variable remains. Fixing the value of x, for example, we get the mapping R -> R3, ^ (x, y, f{x, y)), i. e., a curve in the space R3. Curves are vector functions of one variable, with which we have already worked, namely in chapter six (see 6.14). The images of the curves for some fixed values of the coordinates x and y are depicted by lines in the picture. The curves c : R -> R" are, besides multivariate functions, the easiest examples of mappings F : Rm -> R", which we will get to shortly. In the case of functions of one variable, we built the entire differential and integral calculi upon the concepts of convergence, open neighborhoods, continuity, and so on. In the second part of chapter seven, these concepts were generalized for the so-called metric spaces, rather than only for the Euclidean spaces R". Before going on with the following paragraphs, it is appropriate to recall these parts, or to look for the concepts and results there when necessary. We present a bit of a summary here. 464 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES 8.4. Prove that the only non-empty clopen (both closed and open) subset of E„ is E„ itself. Solution. It follows from the above exercise ||8.3|| that every nonempty proper subset U of E„ has a boundary point. 
If U is closed, then it is equal to its closure; therefore, it contains all of its boundary points. However, an open set (by definition) cannot contain a boundary point. □ 8.5. Show that the space E„ cannot be written as the union of (at least two) disjoint non-empty open sets. Solution. Suppose that E„ can be expressed thus, i. e., E„ where I is an index set. Let us fix a set U from this union. Then, we can write E„ = U U U, where both U and U (being a union of open sets) are open. However, they are also complements of open sets; therefore, they are closed as well. This contradicts the result of the previous exercise || 8.41|. □ 8.6. Prove or disprove: a union of (even infinitely many) closed subsets of E" is a closed subset of E". Solution. The proposition does not hold in general. As a counterexample, consider the union U Ui. iel u 1 -, 1 1 of closed subsets of R, which is equal to the open interval (0, 1). □ 8.7. Prove or disprove: an intersection of (even infinitely many) open subsets of E" is an open subset of E". Solution. The proposition does not hold in general. As a counterexample, consider the intersection n(-f.+!) i=2 V 7 of open subsets of R, which is equal to the closed singleton {1}. □ 8.8. Consider the graph of a continuous function / : R2 -» R as a subset of £"3. Determine whether this subset is open, closed, and compact, respectively. Solution. The subset is not open since any neighborhood of a point lxo, yo, f(xo, yo)] contains a segment of the line x = x0, y = y0. However, there is a unique point of the graph of the function on this segment, and that is the point [x0, y0, f(x0, y0)]. The continuity of / implies that the subset is closed - we will show that every convergent sequence of points of the graph of / converges to a point which also lies in the graph: If such a sequence is convergent in £3, then it must converge in every component, so the sequence {[xn, y«]}^Li is convergent in R2. Let us denote this Umit by [a,b]. Then, it follows from the definition of continuity that its function values at the points [x„, yn] must converge to the value f(a,b). However, this means that the sequence {[x„, y„, f(xn, y„)]}™=l converges to the point [a,b, f(a,b)], which belongs to the graph of the function /. Therefore, the graph is a closed set. 8.2. Euclidean spaces. A Euclidean space En is perceived as a set of points in Rn without any choice of coordinates, and its direction space Rn is considered to be the § vector space of all increases that can be added to the points of the space E„. Moreover, a standard scalar product E xtyt, is selected on Rn, where u — (xi,..., x„) and 1; — (yi,..., y„) are arbitrary vectors. This gives a metric on E„, i. e., a function describing the distance ||P — Q\\ of pairs of points P, Q by the formula Oil where u is the vector which yields the point P when added to the point Q. In the plane E2, for instance, the distance of the points P\ — (xi, yi) and P2 — (x2, y2) is given by \\Pi -P2||2 = (xi -x2)2 + (yi -y2f. Metrics defined in this manner satisfy the triangle inequality for every triple of points P, Q, R: R\\ = HP - Q) + (Q - R)\\ < \\(P-Q)\ \(Q-R)\ see 3.25(1) in geometry, or the axioms of a metric in 7.12, or the same inequality (5.4) for scalars. 
The concepts defined for real and complex scalars and discussed for metric spaces in detail thus can be carried over (extended) with no problem for the points P of any Euclidean space: ___J The topology of Euclidean spaces |_ - a Cauchy sequence: a sequence of points P, such that for every fixed s > 0, || Pi—Pj:ll < £ holds for all indeces but for finitely many exceptional values i, j; a convergent sequence: a sequence of points P converges to a point P iff for every fixed e > 0, || P — P || < e holds for all but finitely many indeces i; the point P is then called the limit of the sequence P,; a limit point P of a set A c E„: there exists a sequence of points in A converging to P and different from P; a closed set: contains all of its limit points; an open set: its complement is closed; an open ^-neighborhood of a point P: the set Og (P) = {Q € E„; IIP - Gil < S], 8 e R, S > 0; a boundary point P of a set A: every ^-neighborhood of P has non-empty intersection with both A and the complement En \ A; an interior point P of a set A: there exists a <5-neighborhood of P which lies inside A; a bounded set: lies inside some <5-neighborhood of one of its points (for a sufficiently large 8); a compact set: both closed and bounded. The reader should make an appropriate effort to read the para-■ graphs 3.25, 5.14-5.17, 7.14-7.16, and 7.22 as ^l_3fc.Y / well as try to think out/recall the definitions and connections of all these concepts. 465 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES The subset is closed, yet it is not compact since it is not bounded (its orthogonal projection onto the coordinate plane xy is the whole R2. (A subset of E„ is compact iff it is both closed and bounded.) □ C. Tangent lines, tangent planes, graphs of multivariate functions 8.9. A car is moving at velocity given by the vector (0, 1, 1). At the initial time t = 0, it is situated at the point [1, 0, 0]. The acceleration of the car in time t is given by the vector (— cos t, — sin t, 0). Describe the dependency of the position of the car upon the time t. Solution. As we have already discussed in paragraph 8.4, we got acquainted with the means of solving this type of problem as early as in chapter 6. Notice that the "integral curve" C(t) from the theorem of paragraph 8.4 starts at the point (0, 0, 0) (in other words, C(0) = (0, 0, 0)). In the affine space R", we can move it so that it starts at an arbitrary point, and this does not change its derivative (this is performed by adding a constant to every component in the parametric equation of the curve). Therefore, up to the movement, this integral curve is determined uniquely (nothing else than constants can be added to the components without changing the derivative). When we integrate the curve of acceleration, we get the curve of velocity (— sin t, cos ř — 1,0). Considering the initial velocity as well, we obtain the velocity curve of the car: (— siní, cosi, 1) (we shifted the curve of the vector (0, 1, 1), i. e., so that now the velocity curve at time t = 0 agrees with the given initial velocity). Further integration leads to the curve (cos t — 1, sin t, t). Shifting this of the vector (1, 0, 0) then fits with the initial position of the car. Therefore, the car moves along the curve [cos t, sin t, t] (this curve is called a helix). □ 8.10. Determine both the parametric and implicit equations of the tangent line to the curve c : R -» R3, c(t) = {c\{f), c2(t), c3(t)) = (t, t2, f3) at the point which corresponds to the parameter's value t = 1. Solution. 
The value t = 1 corresponds to the point c(l) = [1, 1, 1]. The derivatives of the particular components are c[ (t) = l,c'2(t) = 2t, c3(t) = 3t2. The values of the derivatives at the point t = 1 are 1, 2, 3. Therefore, the parametric equations of the tangent fine are: x = c[(l)s +Ci(l) = t + 1, y = c'2(l)s + c2(l) = 2i + l, z = c'3(\)s + c3(\) = 3i + l. In order to get the implicit equations (which are not given canonically), we ehminate the parameter t, thereby obtaining: 2x — y = 1, 3x - z = 2. □ 8.11. The set of differentiable functions. We can notice that multi-variate polynomials are differentiable on the whole of their do- slr mam- Similarly, the composition of a differentiable univariate 9 function and a differentiable multivariate function leads to a differentiable multivariate function. For instance, the function sin(x + y) is differentiable on the whole R2; ln(x + y) is a differentiable function on the set of points with x > y (an open half-plane, i. e., without the It should be clear straight from the definitions that sequences of points Pi have the properties mentioned in the first and second items if and only if these properties are possessed by the real sequences obtained from the particular coordinates of the points P, in every Cartesian coordinate system. Therefore, it also follows from Lemma 5.12 that every Cauchy sequence of points in E„ is convergent. Especially, E„ is always a complete metric space. 8.3. Compact sets. Our games with open, closed, or compact sets could seem useless in the case of the real line E\ since in the end, we almost always talked about intervals. In the case of metric spaces in the second part of chapter seven, it probably was, on the other hand, too complicated. However, the same approach is quite easy in the case of Euclidean spaces R". It is also very useful and important (and it is, of course, a special case of general metric spaces). Just like in the case of E\, we define the open cover of a set (i. e., a system of open sets containing the given set), and the Theorem 5.17 holds as well (with mere reformulations): Theorem. Subsets A C En of Euclidean spaces satisfy: (1) A is open if and only if it is a union of a countable (or finite) system of ^-neighborhoods, (2) every point a e A is either interior or boundary, (3) every boundary point of A is either an isolated or a limit point of A, (4) A is compact if and only if every infinite sequence contained in it has a subsequence converging to a point in A, (5) A is compact if and only if each of its open covers contains a finite subcover. Proof. The proof from 5.17 can be reused without changes in the case of propositions (l)-(3), yet now the concepts have to be perceived in a different way, and the "open intervals" are substituted with multidimensional <5-neighborhoods of appropriate points. However, the proof of the fourth and fifth propositions has to be adjusted properly. Therefore, it is a good idea to go through the proof of the corresponding propositions for general metric spaces in 7.22 while noticing the parts which can be simplified for Euclidean spaces. □ 8.4. Curves in E„. Almost all of our discussion about limits, X derivatives, and integrals of functions in chapters 5 and 6 concerned functions of a real variable and real or complex values since we used only the triangle inequality valid for the magnitudes of the real and complex numbers. 
We already noticed back then that this argument can be carried over to any functions of a real variable with values in a Euclidean space R", and we introduced several tools for the work with curves in paragraphs 6.14-6.17. Therefore, let us remind that for every (parametrized) curve1, i. e., a mapping c : R -> R" in an n-dimensional space, we can work with the concepts which simply extend our reasonings from the univariate functions: • a limit: lim^í0 c(t) e R" ^In geometry, one often makes a distinction between a curve as a subset of e„ and its parametrization e ->• e". When we say the word "curve", we will exclusively mean the parametrized curve. 466 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES boundary line). The proofs of these propositions are left as an exercise on hmit compositions. Remark.Notation of partial derivatives The partial derivative of a function / : R" -» R in variables x\,..., x„ with respect to the variable xi will be denoted by both J^- and the shorter expression fXl. In the exercise part of the book, we will rather keep to the latter notation. On the other hand, the notation J^- better catches the fact that this is a ' dx\ derivative of / in the direction of the vector field (you will learn what a vector field is in paragraph 8.34). I'-'ol (c(0 - eft,)) e 8.12. Determine the domain of the function / R,f(x,y) x2 ^/y. Calculate the partial derivatives where they are defined on this domain. Solution. The domain of the function in question in R2 is the half-plane {(x, y), y > 0}. In order to determine the partial derivative with respect to a given variable, we consider the other variables to be constants in the formula that defines the function. Then, we simply differentiate the expression as a univariate function. We thus get: 1 x2 fx = 2xy afy = - — . 277 The partial derivatives exist at all points of the domain except for the boundary line y = 0. □ 8.13. Determine the derivative of the function / : R3 -» R, f(x,y,z) = x2yz at the point [1,-1,2] in the direction 1; = (3,2,-1). Solution. The directional derivative can be calculated in two ways. The first one is to derive it directly from the definition (see paragraph 8.5). The second one is to use the differential of the function; see 8.6 and theorem 8.7. Since the given function is a polynomial, it is differentiable on the whole R3. Let us go from the definition: 1 Mx,y,z) = lim-[/(*+3f, y + 2t, z - t) - f(x, y, z)] = t^o t = lim -[(x + 3t)2(y + 20 (z - 0 - x2 yz] = t^o t = lim-[t (6xyz + 2x2z - x2 y) + t2 (...)] = t^o t = 6xyz + 2x2z - x2y. We have thus derived the derivative in the direction of the vector (3, 2, —1) as a function of three real variables which determine the point at which we are interested in the value of the derivative. Evaluating this for the desired point thus leads to /„ (1, —1,2) = — 7. In order to compute the directional derivative from the differential of the function, we first have to determine the partial derivatives of the function: fx = 2xyz„ fy = x2z„ fz = x2y. It follows from the note beyond theorem 8.7 that we can express Mh -1, 2) = 3/(1,-1,2) + 2^(1,-1, 2)+ • a derivative: c'(?o) — lim^- • an integral: f% c(t)dt e W' We can also notice that both the limit and the derivative of curves make sense in an affine space even without selecting the coordinates (where the limit is again a point in the original space, while the derivative is a vector in the direction space!). In the case of an integral, we will have to consider curves in the vector space M". 
The reason for this can be seen even in the case of one dimension, where we need to know the origin to be able to see the "area under the graph of a function". Once again, it is apparent straight from the definition that limits, derivatives, and integrals can be calculated by particular n-coordinate components in W, and their existence can be determined in the same way. We can also directly formulate the analogy of the connection between the Riemann integral and the antiderivative for curves (see 6.25): Proposition. Let c be a curve in M", continuous on an interval [a, b]. Then its Riemann integral c(f)dt exists. Moreover, the curve C(t) Ja c(s)ds e is well-defined, differentiable, and it holds that C (t) — c(t)for all values t € [a, b]. It is worse with the mean value theorem and, in general, with Taylor's theorem, see 5.38 and 6.4. We can apply them in a selected coordinate system to the particular coordinate functions of a differentiable function c(t) — (c\ (t),..., cn (?)) on a finite interval [a, b]. In the case of the mean value theorem, for instance, we get the existence of numbers U such that ct(b) - ct(a) — (b -a) ■ c-0i). However, these numbers will be distinct in general, so we cannot express the difference vector of the marginal points c(b) — c(a) as a multiple of the derivative of the curve at a single point. For example, in the plane E2, we thus get for the differentiable curve c(t) — (x(t), y(0) that c(b) - c(a) = (x'&ib - a), 3/ (n)(b - a)) = (£-«)• (x'(£), for two (different, in general) values e [a,b]. However, this reasoning is still sufficient for the following bound: Lemma. Ifc is a curve in En with continuous derivative on a compact interval [a, b], then we have for all a < s < t < b that \\c(t) - c(s)\\ < Vn(maxre[a,fe] ||c'(r)||) • \t - s\. Proof. Direct application of the mean value theorem gives for appropriate points rt inside the interval [s, t] the following: n n \\c(t) - c(s)\\2 = £(c,-(0 - a(s))2 < Y](c'i(ri)(t - s))2 r = l <(t - s)2 Y^maxre[sJ] c'tir)2 r = l < n(maxre[s,r], r-=i,...,„ |c-(r)|)2(? - s)2 < nmaxre[M \\c'(r)\\2(t - s)2. 467 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES + (-l)/z(l,-l,2) 3-(-4) + 2-2 + (- l).(-l) -7. 8.14. Determine the derivative of the function / □ R, 005(17) z at the point [0, 0, 2] in the direction of the vector f(x, y, z) (1,2,3). Solution. The domain of this function is M3 except for the plane z = 0. The following calculations will be considered only on this domain. The function in question is differentiable at the point [0, 0, 2] (this follows from the note ||8.11||). We can determine the value of the examined directional derivative by 8.6, using partial derivatives. First, we determine the partial derivatives of the given function (as we have already mentioned in exercise ||8.12||, in order to determine the partial derivative with respect to x, we differentiate it as a univariate function (in x) and use the chain rule; similarly for other partial derivatives): fx 2xy sin(x2y) fy x sin(x y) fz cos(x y) z z z" Evaluating this expression at the particular values, we obtain fx(0, 0, 2) + 2 • fy(0, 0, 2) + 3 • M0, 0, 2) 1 -0 + 2-0 + 3- 3 4' □ h~)+fx e 1 1, - e (x - 1) + /, 1 1, - e 1 y - - -1 — x + ey. The given point does not satisfy this equation, so it does not lie in the tangent plane. □ □ An important concept is the one of a tangent vector to a curve c : R -> E„ at a point c(to) e E„, which is defined as the vector in the direction Rn given by the derivative c'(to) e Rn. 
The straight line T given parametrically by T : c(t0) + t ■ c'(t0) is called the tangent line to the curve c at the point to. Unlike the tangent vector, the tangent line T, being a non-parametrized line, is independent of the parametrization of the curve c since thanks to the chain rule, changing the parametrization leads to the same tangent vector, up to multiple. 8.5. Partial derivatives. For every function / : Rn -> R and an arbitrary curve c : R -> Rn, we can consider their composition (/ o c)(() : R ^ R. This composite function Foe expresses the behavior of the function / along the curve c. The simplest case is when we use straight lines. I directional and partial derivatives |_^ 8.15. Having a function / : W -» R with differential df(x) and a point x € W, determine a unit direction u e 1„ in which the directional derivative dv (x) is maximal. Solution. According to the note beyond theorem 8.4, we are maximizing the function fv(x) = vifXl(x) + v2fX2(x) + • • • + vnfXn(x) in dependence on the variables v\, ...,vn which are bound by the condition v\ + ■ ■ ■ + v\ = 1. We have already solved this type of problem in chapter 3, when we talked about linear optimization (viz || ?? ||). The value /„ (x) can be interpreted as the scalar product of the vectors (fXl ,...,/*„) and {v\,... ,vn). And this product is maximal if the vectors are of the same direction. The vector 1; can thus be obtained by normalizing the vector (fXl ,...,/*„). In general, we say the the function grows maximally in the direction (fXl, ..., fXn). Then, this vector is called the gradient of the function /. In paragraph 8.19, we will recall this idea and go into further details. □ 8.16. Determine whether the tangent plane to the graph of the function / : K x 1+ ^ I, f(x, y) = x ■ ln(y) at the point [1, ±] goes through the point [1, 2, 3] e M3. Solution. First of all, we calculate the partial derivatives: fx(x, y) = ln(y), fy(x, y) = |; their values at the point [1, ^] are —1, e; further f(l, -) = —1. Therefore, the equation of the tangent plane is 1 Definition. We say that / : R" —>• R has derivative in the direction of a vector v e Rn at a point x e En iff the derivative dvf(x) of the composite mapping t i-> f(x + tv) at the point t — 0, i. e., 1 dvf(x) = lim-(f(x + tv) - f(x)). t^o t The value dv f is also called a directional derivative. The special choice of the lines in the direction of the axes of the coordinate system yields the so-called partial derivatives of the function f, which are denoted by J£-, i — I,... ,n, ox (without referring to the function) as operations For functions in the plane, we thus get ^-f(xy) ax ^-f(x,y) dy 1 lim -(f (x t^o t t, y) lim -(fix, y t^o t t) f(x, y)) f(x, y)). Especially, we can see that the partial differentiation with respect to a given variable is just the casual one-variable differentiation while considering the other variables to be constants. 8.6. The differential of a function / : Rn —>• R. However, we will not do with partial or directional derivatives for a good approximation of the behavior of a function by linear expressions. We would probably expect that a "differentiable" function of more variables composed with any differentiable curve again yields differentiable functions of one variable, which we have known well. However, let us look at the functions in the plane given by the formulae g(x, y) = h(x, y) i 1 if yx = 0 10 otherwise [l ify=x2/0 10 otherwise 468 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES 8.17. 
Determine the parametric equation of the tangent line to the intersection of the graphs of the functions / : R2 -» R, f(x, y) = x2 + xy - 6, g : R x R+ -» R, g(x, y) = x ■ ln(y) at the point [2, 1]. Solution. The tangent hne to the intersection is the intersection of the tangent planes at the given point. The plane that is tangent to the graph of / and goes through the point [2, 1] is z = f(2,l) + fx(2,l)(x-xo) + fy(2,l)(y-y0) = 5x + 2y - 12. The tangent plane to the graph of g is then z = f(2,\) + gx(x,y)(2,\)(x-xo) + g(x,y)y(2,\)(y-yo) = 2y-2. The intersection hne of these two planes is given parametrically as [2, t, 2t - 2], t e R. Another solution. The normal to the surface given by the equation f(x, y,z) = 0 at the point b = [2, 1, 0] is (fx(b), fy(b), fz(b)) = (5,2, —1); the normal to the surface given by g(x, y,z) = 0 at the same point is (0, 2,-1). The tangent line is perpendicular to both normals; we can thus obtain a vector it is parallel to as the vector product of the normals, which is (0, 5, 10). Since the tangent hne goes through the point [2, 1,0], its parametric equation is [2, 1 + t, 2t], t e R. □ 8.18. Determine all second partial derivatives of the function / given by f(x, y, z) = *Jxy In z. Solution. First, we determine the domain of the given function: the argument of the square root must be non-negative, and the argument of the natural logarithm must be positive. Therefore, Df = {(x, y, z) e R\ (z > l&(xy > 0)) V (0 < z < l)&(xy < 0)}. Now, we calculate the first partial derivatives with respect to each of the three variables: fx y ln(z) fy x ln(z) fz xy 2^/xy ln(z) 2V*y ln(z) 2z,~Jxy ln(z) Each of these three partial derivatives is again a function of three variables, so we can consider (first) partial derivatives of these functions. Those are the second partial derivatives of the function /. We will write the variable with respect to which we differentiate as a subscript of the function /. fxx fxy fxz fyy fyz fin2 4(xy Inz)" xy In2 z + Inz 4(xylnz)2 2y/xy lnz xy2 Inz y + 4z(xylnz)2 2z^/xy In z x2 In2 z 4(xy Inz) 5 X2 y In z + Apparently, neither of them extends all smooth curves going through the point (0, 0) to smooth functions. On the other hand, both partial derivatives of g at (0, 0) exist and no other directional derivatives do, while h has all directional derivatives at the point (0, 0), and we even have dvh(0) — 0 for all directions v, so this is a linear dependence on » 6 I2. We can also imagine a function / which, along the lines (r cos 6, r sin 6) with a fixed angle 6, takes the values k(6)r, where k(6) is a periodic odd function of the angle 6, with period 2it. All of its directional derivatives dv f at (0, 0) exist, yet these will not be linear expressions depending on the directions i; for general functions k(6). Therefore, we will imitate the case of univariate functions as thoroughly as possible, and we will forbid such a pathological behavior of functions directly by a definition: ___J Differential J___ Definition. A function / : W -> R is differentiable at a point x iff all of the following three conditions hold: (1) the directional derivatives dvf(x) at the point x exist for all vectors v eRn, dyf(x) is linearly dependent on the increase of v, linwo m(/(* + v)~ f(x) ~ dvf{x)) = °-The linear expression dv f (in a vector variable v) is called a differential of the function f evaluated at the increase of v. 
(2) (3) In words, we require that the increases of the function / at the point x be well approximated by linear functions of increases of the variable quantities. It follows directly from the definition of directional derivatives that the differential can be defined solely by the property (3). Indeed, if there is a linear form df(x) such that the increases i; at the point x satisfy the property (3) with dvf(x) — df(x)(v), then df(x)(v) is apparently just the directional derivative of the function / at the point x, so the properties (1) and (2) are automatically satisfied. Let us examine what we can say about the differential of a o). To this purpose, consider any smooth curve 11->-(x(t), y(t)) with xo — x(0), yo — y(0). Using the mean value theorem for univariate functions in both summands separately, we obtain that 17(f(x(t),y(t))- f(x0,y0)) = \(f(x(t), y(t))-f(x0, y(t))) + l(f(X0, y(t))-f(x0, yo)) - (x(t) - x0) df ax df -(y(t)-yo)-T-(xo, y(n)) dy 4z(xylnz)2 2z,^/xy In z for suitable numbers § and r\ between 0 and t. Especially, for every sequence of numbers t„ converging to zero, we can get the corresponding sequences of numbers §„ and rjn which also converge to zero and all will satisfy the above expression. Letting t —>• 0, we get thanks to continuity of the partial derivatives that (see the test for convergence of a function using 469 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES fzi 2 2 x y xy 4z2(xy\wz,)2 2z2jxy lnz By the theorem about interchangeability of partial derivatives (see 8.10), we know that fxy = fyx, fxz = fzx, fyz = fzy. Therefore, it suffices to compute the mixed partial derivatives (the word "mixed" means that we differentiate with respect to more than one variable) just for one order of differentiation. □ D. Taylor polynomials 8.19. Write the second-order Taylor expansion of the function / : R2 -» R, f(x, y) = ln(x2 + y2 + 1) at the point [1, 1]. Solution. First, we compute the first partial derivatives: 2x _ 2y fx = - xz then the Hessian: Hf(x, y) + f + 1 2y2-2;c2+2 (x2+y2+l)2 4xy 4xy (x2+y2+l)2 2;c2-2y2+2 (x2+y2+l)2 I \ (x2+y2 + \)2 The value of the Hessian at the point [1, 1] is 4 2 "9 9 Altogether, we get that the second-order Taylor expansion at the point [1, 1] is T2(x, y) f(l, l) + Ml, 1)(* - l) + Mh i)(y - l) + -i,y- i)Hf(i, l). 1 y x - 1 = ln(3) + -(x- l) + -(y- l) + y(x- I)2- -A-(x-\)(y-\)+l-(y-\)2 = ^(x2 + y2 + 8x + Sy - 4xy - 14) + ln(3). □ Remark. In particular, we can see that the second-order Taylor expansion of an arbitrary differentiable function at a given point is a second-order polynomial. 8.20. Determine the second-order Taylor polynomial of the function / : R2 -» R2, f(x,y) = xy cos y at the point [n, it]. Decide whether the tangent plane to the graph of this function at the point [it, it, f{jt, 7t)] goes through the point [0, jt, 0]. Solution. As in the above exercises, we find out that T(x, y) = l-jt2y2 -xy- jr3y + X-Tt\ The tangent plane to the graph of the given function at the point [7t,7t] is given by the first-order Taylor polynomial at the point [n, n]; its general equation is thus z = —Tty — 7tX + TV2, and this equation is satisfied by the given point [0, jt, 0]. □ subsequences of the input values, 5.23, and Theorem 5.22 about the limits of sums and products of functions) ^f(x(t), y(0)|m) = x!(oAxo, yo) + y (O)-^(xo, yo), at ox ay which is a pleasant extension of the theorem on differentiation of composite functions of one variable for vector-valued functions. 
Of course, the special choice of parametrized straight lines (x(f), y(t)) = (x0 + y0 + trfi transforms our calculation, with i; — (§, r]), to the equality of df dvf(x0, yo) — -z-(x0, yo)% + t-(*o, yo)n, ox ay and this formula can be expressed in the nice way in which we described coordinate expressions of linear functions on vector spaces: df df df — —dx H--dy. dx dy In other words, the directional derivative dvf is indeed a linear function R" —>• R on the increases, with coordinates given by the partial derivatives. Similarly, we can now prove that the assumption of continuous partial derivatives at a given point guarantees the approximation properties of the differential as well. We will consider general multivariate functions straightaway: 8.7. Theorem. Let f : En —>• R be a function of n variables which has continuous partial derivatives in a neighborhood of the a point x € En. Then its differential df at the point x exists and its coordinate expression is given by the formula df df df df — -—dxi + -—dx2 H-----h -—dx„. OX l 0X2 oxn Proof. This theorem can be derived analogously to the pro-cedure described above, for the case n — 2. We only i, have to be careful in details and finish the reasoning about the approximation properties. Just like above, we consider a curve c(t) = (Cl(t),...,cn(t)), c(0) — (0,0) and a point x e R", and we express the difference fix + c(t)) — f(x) for the composite function f(c(t)) as follows: f(xi + ci(0, ..., x„ + cn(t)) - f(xi,x2 + c2(t), ...) + f(x\,x2 +c2(t), ...))- f(xi, x2, ..., x„ + cn(t)) + f(xi,x2, ..., x„ +c„(0) - f(x\,x2, > xn). Now, we can apply the mean value theorem to all of the n sum-mands, thus obtaining (similarly to the case of two variables) df (ci(0 -ci(0)) — (xi +c1(61),x2 + c2(t), ...,xn +cn(t)) OX I df + (c2(0 - c2(0))-Mxi, x2 + c2(92), ...,x„+ cn(t)) OX2 df + (c„(0 - c„(0))-—(xi, x2, ..., x„ + ci(6n)), dx„ 470 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES 8.21. Determine the third-order Taylor polynomial of the function / : R3 -» R, fix, y,z)=x3y + xz2 + xy + 1 at the point [0, 0, 0]. O 8.22. Determine the second-order Taylor polynomial of the function / : R2 -» R, f(x, y) = x2 sin y + y2 cos x at the point [0, 0]. Decide whether the tangent plane to the graph of this function at the point [0, 0, 0] goes through the point [n, n, n]. Q 8.23. Determine the second-order Taylor polynomial of the function ln(x2 y) at the point [1,1]. O 8.24. Determine the second-order Taylor polynomial of the function / : R2 -» R, fix, y) = tan(xj + y) at the point [0, 0]. O £. Extrema of multivariate functions 8.25. Determine the stationary points of the function / : R2 -» R, fix,y) = x2 y + y2x — xy and decide which of these points are local extrema and of which type. Solution. The first derivatives are fx = 2xy + y2 — y, fy = x2 + 2xy —x. If we set both partial derivatives equal to zero simultaneously, the system has the following solution: {x = y = 0], {x = 0, y = 1], {x = 1, y = 0], {x = 1/3, y = 1/3], which are four stationary points of the given function. 2y 2x + 2y — 1 2x + 2y — 1 2x Its values at the stationary points are, respectively, 0 -l\ /11W0 1 -l o y \i oj' \i i Therefore, the first three Hessians are indefinite, and the last one is positive definite. The point [1/3, 1/3] is thus a local minimum. □ 8.26. Determine the point in the plane x + y + 3z = 5 lying in R3 which is closest to the origin of the coordinate system. First, do this by applying the methods of linear algebra; then, using the methods of differential calculus. Solution. 
It is the intersection point of the perpendicular going through the point [0, 0, 0] to the plane. The normal to the plane is it, t, 3f), t € R. Substituting into the equation of the plane, we get the intersection point [5/11,5/11, 15/11]. Alternatively, we can minimize the distance (or its square) of the plane's points from the origin, i. e., the function (5 - y - 3z)2 + y2 + z2. Setting the partial derivatives equal to zero, we get the system 3y + 10z - 15 = 0 2y + 3z-5 = 0, whose solution is as above. Since we know that the minimum exists and is the only stationary point, we need not calculate the Hessian any more. □ The Hessian of the function / is for appropriate values Qi, 0 < Qi < t. This is a finite sum, so the same reasoning as in the case of two variables verifies that ^-f(x + c(t))t=0 = cj(0) |--(x) + • • • + c'n(0)^-(x). at ox i axn The special choice of the curves c(t) — x + tv for a directional vector i; verifies the statement about existence and linearity of the directional derivatives at x. At the same time, we can apply the mean value theorem in the same way to the difference fix + v)-fix) = dvfix+0v) df df — vi-—(x + 6v) H-----h v„-—(x + 6v) ax i axn with an appropriate 6, 0 < 6 < 1, where the latter equality holds according to the formula for directional derivatives derived above, for sufficiently small v's thanks to the continuity of the partial derivatives in a neighborhood of the point x. Since all the partial derivatives are continuous at the point x, we know that for an arbitrarily small e > 0, there is a neighborhood U of the origin in M" such that for w e U, all partial derivatives ^-(x + w) differ from |£- (x) by less than e. Thus we get the bound -(f(x + w)-f(x)-dwf(x + 9w)) \w\\s, so the approximation property of the differential is satisfied as well. □ 8.8. A plane tangent to the graph of a function. The linear approximation of the function behavior by its differential can, similarly to the case of univariate functions, :. be expressed with respect to its graph. We will just work with hyperplanes instead of tangent lines. For the case of a function on E2 and a fixed point (xq, yo) e E2, consider the plane in £3 given by the equation z = f(xo, yo) + dfix0, yo)(x -x0,y- yo) df df — f(xo, yo) + t-(*o, yo)(x - x0) + —ixo, yo)(y - yo)-ax ay We have already seen that the increase of the function values of a differentiable function / : E„ —>• R at points x + tv and x is always expressed in terms of the directional derivative dvf at a suitable point between them. Therefore, this is the only plane out of those which contain the point (xo, yo) having the property that all derivatives, and so the tangent lines of all curves c(t) = (x(t), y(t), f(x(t), y(t))) as well lie in it. It is called the tangent plane to the graph of the function /. Two tangent planes to the graph of the function f(x, y) — sin(x) cos(y) are shown in the picture. The diagonal line is the image of the curve c(t) — (t, t, fit, t)). All CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES 8.27. Determine the local extrema of the function fix, y) = x2 + arctan2 x + I y3 + y I , x, y € Solution. The function / can be written as the sum f\ + f2, where f^x) = x2 + arctan2x, x e R, f2(y) = \ y3 + y \ , y e R. If the function / has a local extremum at a point, then it does so with respect to an arbitrary subset of its domain. In other words, if the function has, for instance, a maximum at a point [a, b] and we set y = b, then the univariate function f(x,b)ofx must have a maximum at the point x = a. 
Let us thus fix an arbitrary y € R. For this fixed value of y, we get a univariate function, which is a shift of the function f\. This means that its maxima and minima are at the same points. However, it is easy to find the extrema of the function f\. We can just realize that this function is even (it is the sum of two even functio: and the function y = arctan2 x is the product of two odd functions) and increasing for x > 0 (the composition as well as the sum of increasing functions is again an increasing function). Therefore, it has a unique extremum, and that is a minimum at the point x = 0. Similarly, for any fixed value of x, f is a shift of the function f2, and f2 has a minimum at the point y = 0, which is its only extremum. We have thus proved that / can have a local extremum only at the origin. Since /(0,0)=0, f(x,y)>0, [ij]el2\([0,0]), the function / has a strict local (even global) minimum at the point [0,0]. □ 8.28. Examine the local extrema of the function fix, y) = (x + y2) ei, ijel. Solution. This function has partial derivatives of all orders on the whole of its domain. Therefore, local extrema can occur only at stationary points, where both the partial derivatives fx, fy are zero. Then, it can be determined whether the local extremum occurs by computing the second derivatives. We can easily determine that fx(x, y) = ei + \ (x + y2)ef fy(x, y) = 2y e%, x, y e R. A stationary point [x, y] must satisfy fy(x,y)=0, i.e. y = 0, and, further, Mx,y) = A(jc,0)=et(l + ijc) =0, i.e. x = -2. We can see that there is a unique stationary point, namely [—2,0]. Now, we calculate the Hessian Hf at this point. If this matrix (the corresponding quadratic form) is positive definite, the extremum is a strict local minimum. If it is negative definite, the extremum is a strict local maximum. Finally, if the matrix is indefinite, there will be no extremum at the point. We have fxxix, y) = \ e§ (2 + \ (x + y2)) , fyyix, y) = 2&, fxyix, y) = fyXix, y) = yei, x, y e R. Therefore, 'fxx (-2, 0) fxy i-2, 0)\ _ (l/2t 0 For the case of functions of n variables, the tangent plane is defined as an analogy to the tangent plane to an area in the three-dimensional space. Instead of being puzzled by a great deal of indeces, it will be useful to recall affine geometry, where we already worked with the so-called hyperplanes, see paragraph 4.3. tangent (hyper)plane to the graph of a function at a point ,_| Definition. A tangent hyperplane to the graph of a function / : Rn -> R at a point x e Rn is the nadr containing the point (x, fix)) with direction which is the graph of the linear mapping df(x) : Rn -> R, i. e. the differential at the point x e En. The definition takes advantage of the fact that the directional derivative dv f is given by the increase in the tangent (hyper)plane corresponding to the increase of the input vector v. Many analogies with the univariate functions follow from these reasonings. In particular, a differentiable function / on E„ has zero differential at a point x e En if and only if its composition with any curve going through this point has a stationary point there, i. e., is neither increasing, nor decreasing in the linear approximation. In other words, the tangent plane at such a point is parallel to the hyperplane of the variables (i. e., its direction is E„ c having added the last coordinate set to zero). Of course, this does not mean that / should have a local extremum at such a point. Just like in the case of univariate functions, this depends on the values of higher derivatives. 
8.9. Derivatives of higher orders. The operation of differentiation can be iterated similarly to the case of univariate functions. This time, we can choose different directions for each iteration. If we fix an increase i; e M", the enumeration of the differentials at this increase defines a (differential) operation on differentiable functions / : E„ -> R f dvf = df(v), and the result is again a function df(v) : E„ -> R. If this function is differentiable as well, we can repeat this procedure with another increase, and so on. In particular, we can work with iterations of partial derivatives. For second-order partial derivatives, we write 32 „ d2f -°-)f dxj dxi -f Hf i-2, 0) fyxi-2,0) fyyi-2,Q) 0 2/e dxi dxj dxi dxj In the case of the repeated choice i — j, we also write JL 0 JL\ f = JL f = itU. dxi ° ;iv( /' ;ay • dx2' All CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES We should recall that the eigenvalues of a diagonal matrix are exactly the values on the diagonal. Further, positive definiteness means that all the eigenvalues are positive. Hence it follows that there is a strict local minimum at the point [—2,0]. □ 8.29. Find the local extrema of the function fix, y, z) 9 2 + T + \ ~ 3xz - 2y + 2z, x,y,z e Solution. The function / is a polynomial; therefore, it has partial derivatives of all orders. It thus suffices to look for its stationary points (the extrema cannot be elsewhere). In order to find them, we differentiate / with respect to each of the three variables x, y, z, and set the derivatives equal to zero. We thus obtain 3x2 — 3z = 0, i. e., z = x2, 2y — 2 = 0, i. e., y = 1, and (utilizing the first equation) z -3x + 2 = 0, i. e., x e {1,2}. Therefore, there are two stationary points, namely [1, 1, 1] and [2, 1, 4]. Now, we compute all second-order partial derivatives: fxx = 6x, fXy = fyX = 0, fxz = fzx = ~3, fyy = 2, fyz = fzy = 0, fz = 1. Having this, we are able to evaluate the Hessian at the stationary points: Hf (1,1,1) 0 0 2 0 Hf (2,1,4) Now, we need to know whether these matrices are positive definite, negative definite, or indefinite in order to determine whether and which extrema occur at the corresponding points. Clearly, the former matrix (for the point [1, 1, 1]) has eigenvalue X = 2. Since its determinant equals —6 and it is a symmetric matrix (all eigenvalues are real), the matrix must have a negative eigenvalue as well (because the determinant is the product of the eigenvalues). Therefore, the matrix Hf (1, 1, 1) is indefinite, and there is no extremum at the point [1, 1, 1]. We will use the so-called Sylvester's criterion for the latter matrix Hf (2, 1,4). According to this criterion, a real-valued symmetric matrix an an ■ ■ ai„^ an a22 a23 ■ ■ a2n A = an a23 a33 ■ ■ a3n \ain a2n a3n ■ ann J d\ = \an\ , d2 is positive definite if and only if all of its leading principal minors A, i. e. the determinants an a\2 a\3 , d3= «i2 a22 a23 , ..., d„ = \ A |, a 13 a23 a33 are positive. Further, it is negative definite iff dx < 0, d2 > 0, d3 < 0, i-l)"dn>0. The inequalities axx a12 an a22 We proceed in the same way with further iterations and talk about partial derivatives of order k &f dx{ ... dxi. More generally, we can also iterate (assuming the function is sufficiently differentiable) any directional derivatives; for instance, dv o dwf for two fixed increases v, w e R". 
___| times differentiable functions |_ - We say that a function / : E„ -> R is k-times (continuously) differentiable at a point x iff all partial derivatives up to order k (inclusive) exist in a neighborhood of the point x and are continuous at this point. We say that / is /c-differentiable iff it is /c-times (continuously) differentiable at all points of its domain. >From now on, we will always work with continuously differentiable functions unless explicitly stated otherwise. To show all of this in the simplest form, we will once again work in the plane E2, supposing the second-order partial derivatives are continuous. In the plane as well as in the space, iterated derivatives are often denoted by mere indeces referring to the variable names, for example: df d2f d2f d2f Jx ^ , j xx ^ 9 , j xy ^ ^ , jyx ^ ^ ox oxL oxo y oyox We will show that if certain senseful conditions are satisfied, the partial derivatives commute, i. e., we need not care about the order in which we differentiate. Since we suppose that the partial derivatives exist and are continuous, the limits fxy (x, y) = lim y (fx (x, y + t) - fx (x, y)) 1( lr — lim - lim - I f(x +s,y + t)-f(x,y + t) t^0 t \s^0 S /(V • V. V) • /(V. V)) exist. However, since the limits can be expressed by any choice of values t„ —>• 0 and s„ —>• 0 and the limits of the corresponding sequences, we will also have that fxy(X y) = lim j ({f(x + t,y + t)- f(x, y + 0) -(f(x + t,y)- f(x,y))y and this limit value is continuous at (x, y). Let us consider the expression from which we take the last limit to be a function 0, 12 0 12 0 -24 > 0, 0 2 0 = 6 > 0, -3 0 1 imply that the matrix Hf (2, 1, 4) is positive definite - there is a strict local minimum at the point [2,1,4]. □ 8.30. Find the local extrema of the function z = (x2 - l) (l - x4 - y2) , x, y e R. Solution. Once again, we calculate the partial derivatives zx, zy and set them equal to zero. This leads to the equations -6x5 + 4x3 + 2x - Ixy2 = 0, (x2 - l) (-2y) = 0, whose solutions [x, y] = [0, 0], [x, y] = [1, 0], [x, y] = [-1, 0]. (In order to find these solutions, it suffices to find the real roots 1, — 1 of the polynomial — 6x4 + Ax2 + 2 using the substitution u = x2. Now, we compute the second-order partial derivatives -30x4 + 12x2 + 2-2/, z xy ~yx -4xy, zyy = -2 (x2 - l) Hz (0, 0) and evaluate the Hessian at the stationary points: ^ °), //z(l,0) = //z(-l,0) = (-016 £ We can see that the first matrix is positive definite, so the function has a strict local minimum at the origin. However, the second and third matrices are negative semidefinite. Therefore, the knowledge of second partial derivatives in insufficient for deciding whether there is an extremum at the points [1,0] and [—1,0]. On the other hand, we can examine the function values near these points. We have z (1,0) =z (-1,0) = 0, z(x,0)<0 forxe(-l,l). Further, consider y dependent on x e (—1, 1) by the formula y = ^2 (l — x4), so that y -» 0 for x -» ± 1. For this choice, we get z (x, ^2(1 -x4)) = (x2 - 1) (x4 - 1) > 0, x e (-1, 1). We have thus shown that in arbitrarily small neighborhoods of the points [1, 0] and [—1,0], the function z, takes on both higher and lower values than the function value at the corresponding point. Therefore, these are not extrema. □ 8.31. Decide whether the polynomial p(x, y) x6 + y8 + y4x4 has a local extremum at the stationary point [0, 0]. Solution. We can easily verify that the partial derivatives px and py are indeed zero at the origin. 
However, each of the partial derivatives Pxx, Pxy, Pyy is also equal to zero at the point [0, 0]. The Hessian Hp (0, 0) is thus both positive and negative semidefinite at the same time. However, a simple idea can lead us to the result: We can notice that p(0,0) = Oand p(x, y) = x6 (1 - y5) + y8 + y4x4 > 0 for [x, y] € R x (—1, 1) \ {[0, 0]}. Therefore, the given polynomial has a local minimum at the origin. □ 8.32. Determine local extrema of the function / : R3 -» R, f(x,y,z)=x2y + y2z+x-zon.R3. O Now, gy (x, y) — fy (x + t, y) — fy (x, y), so we can write

• 0 must guarantee the wanted equality fxy(x, y) = fyx(x, y) at all points (x, y). The same procedure for functions of n variables proves the following fundamental result: I Interchangeability of partial derivatives 8.10. Theorem. Let f : En —>• M be a k-times differentiable function with continuous partial derivatives up to order k (inclusive) in a neighborhood of a point x € M". Then all partial derivatives of the function f at the point x up to order k (inclusive) are independent of the order of differentiation. Proof. The proof for the second order was illustrated above in , the special case when n — 2 . The procedure works similarly for the general case as well. " Formally, the proof can be led in the following way: we may assume that for every fixed choice of a pair of coordinates Xi and xj, the whole discussion of their interchanging takes place in a two-dimensional afline subspace, i. e., all the other variables are considered to be constant and take no effect in the reasonings. In the case of higher-order derivatives, the proof can be finished by induction on the order. Indeed, every order of the indeces i\, ...,ik can be obtained from a fixed one by several swaps of adjacent pairs of indeces. □ 8.11. Hessian. In the case of first-order derivatives, we introduced the differential, being the linear form df(x) which approximates a function fata point x in the best way. Similarly, we will now want to understand the quadratic approximation of a function / : E„ —>• 474 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES 8.33. Determine the local extrema of the function / 9 x y y2 z + 4x +z on 8.34. Determine the local extrema of the function / fix, y, z) = xz2 + y2 z - x + y on R3. 8.35. Determine the local extrema of the function / fix, y, z) = y2z - xz2 + x + 4y on R3. 8.36. Determine the local extrema of the function / fix, y) = x2y + x2 + ly2 + y on R2 8.37. Determine the local extrema of the function / fix, y) = x2y + 2/ + 2y on R2. 8.38. Determine the local extrema of the function / fix, y) = x2 + xy + 2y2 + y on R2. 8.39. Determine the local extrema of the function / fix, y) = x2 + xy - 2y2 + y on R2. O ► I O ► I o o o o o F. Implicitly given functions and mappings 8.40. Let F : R2 -» R be a function, F(x,y) = xysin^xy2). Show that the equality Fix,y) = 1 implicitly defines a function / : U -» R on a neighborhood U of the point [1,1] so that Fix, fix)) = 1 forx € U. Determine f'il). Solution. The function is differentiable on the whole R2, so it is such on any neighborhood of the point [1, 1]. Let us evaluate Fy at [1, 1]: Fyix, y) = x sin + Ttx2 y2 cos so Fy(l, 1) = 1 7^ 0. Therefore, it follows from theorem 8.18 that the equation Fix,y) = 1 implicitly determines on a neighborhood of the point (1, 1) a function / : U -» R defined on a neighborhood of the point (number) 1. Moreover, we have Fxix, y)=y sin (^y2) + ijxy3 cos (^y2) , so the derivative of the function / at the point 1 satisfies ^(1, 1) 1 Fy(h 1) 1 □ Remark. Notice that although we are unable to explicitly define the function / from the equation Fix, fix)) = 1, we are able to determine its derivative at the point 1. 8.41. Considering the function F : R2 - Fix, y) = ex sin(y) + y 77-/2 - 1 , show that the equation F{x,y) = 0 implicitly defines the variable y to be a function of x, y = fix), on a neighborhood of the point [0,77-/2]. Compute /'(0). Solution. 
The function is differentiable in a neighborhood of the point [0,7T/2]; moreover, Fy = ex cosy + 1, F(0, jt/2) = 1 ^ 0, so the equation indeed defines a function / : U -» R on a neighborhood of the point [0, tt/2]. Further, we have Fx = ex siny, 7^(0, jt/2) = 1, and its derivative at the point 0 satisfies: Fx(0, tt/2) 1 f'iO) = - _.' =-- = -1. □ 7% (0,77-/2) 1 Hessian Definition. If / : M" —>• M is a twice differentiable function, we call the symmetric matrix of functions H fix) = d2f dxi dxj (H-ix) ix) = a2/ dx\dxn ix)\ a2/ dxn dx\ ix) ■Al—ix)! ÓXndXn v ' / the Hessian of the function / at the point x. We have already seen from the previous reasonings that zeroing the differential at a point (x, y) e E2 guarantees stationary behavior along all curves going through this point. The Hessian H fix, y) fxx ix, y) fxy ix, y) fxyix,y) /vv(V. V) plays the role of the second derivative. For every parametrized straight line cit) = (x(0, yit)) the univariate functions a(0 = /M0,X0) m = fixo,yo) + ^-(xo,yo)^ ox (x0 + %t, y0 + nt), of t-(*o, yo)ri dy fxxixo, yo)t + 2fxyix0, yo)ŠV + fyyixo, yo)v' will share the same derivatives up to the second order (inclusive) at the point t — 0 (calculate this on your own!). The function f3 can be written in terms of vectors as ßit) — fixo, yo) + dfixQ, y0) • l-iH v)-Hfixo,yo)-(Í or Pit) = fix0, yo) + dfix0, yo)iv) + jHfixo, yo)iv, v), where i; — (§, n) is the increase given by the derivative of the curve c(f), and the Hessian is used as a symmetric 2-form. This is an expression which looks like Taylor's theorem for univariate functions, namely the quadratic approximation of a function by Taylor's polynomial of degree two. The following picture shows both the tangent plane and this quadratic approximation for two distinct points and the function fix, y) — sin(x) cos(y). 8.12. Taylor's theorem. The multidimensional version of Tay-lor's theorem is once again an example of a mathemat-ical statement where the most difficult part is finding ' ] '\MXJ the right formulation. The proof is quite simple then. 475 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES 8.42. Let F(x, y, z) = sin(xy) + sin(yz) + sin(xz). Show that the equation F{x, y,z) =0 implicitly defines a function z(x, y) : R2 -» R on a neighborhood of the point [tt, 1,0] e I3 so that F(x, y, zix, y)) = 0. Determine zxin, 1) and zy(7t, 1). Solution. We will calculate Fz = y cos(yz)+x cos(xz), Fz(tc, 1, 0) = tt + 1 7^ 0, and the function z(x, y) is defined by the equation F(x, y, z(x, y)) = 0 on a neighborhood of the point [tt, 1, 0]. In order to find the values of the wanted partial derivatives, we first need to calculate the values of the remaining partial derivatives of the function F at the point [tt, 1,0]. Fx(x, y, z) Fy(x, y, z) y cos(xy) + zcos(xz) Fx(jv, 1, 0) x cos(xy) + z cos(yz) Fy(jv, 1, 0) -1, -TT, odkud zxin, 1) Zy(Tt, 1) 1 F An, 1,0) Fz(it, 1,0) FyJTt, 1,0) _ _ Fz(it, 1,0) ~ tt + ť Tt + 1 7t □ 8.43. Having the mapping F : R3 ^ R2, F(x,y,z) = (f(x,y,z),g(x,y,z)) = (exsmy,xyz), show that the equation F(x, ci(x), c2(x)) = (0,0) defines a curve c : R -» M2 on a neighborhood of the point [1, Tt, 1]. Determine the tangent vector to this curve at the point 1. Solution. We will calculate the square matrix of the partial derivatives of the mapping F with respect to y and z;. H(x,y,z) = [fy fz 8y 8z Hence, H(l,jt, 1) -1 1 x cos y ex sin y 0 xz, xy and det#(l, tt, 1) -TT ŕ o. 
Now, it follows from the implicit mapping theorem (see 8.18) that the equation Fix, c\{x), c2(x)) = (0, 0) on a neighborhood of the point [1, tt, 1] determines a curve (ci(x), c2(x)) defined on a neighborhood of the point [1, tt]. In order to find its tangent vector at this point, we need to determine the )column) vector (fx, gx) at this point: fx sin y e yz .x sin y fxihTT, 1) 8 Ahn, 1) The wanted tangent vector is thus (cúAh (c2)Ah) fy(hn,l) fz(hn,l) ^(l,7r,l) gz(hn,\) -1 0 1 TT TT 1 0 fAhn,\ 8x(\,n, 1) 0 □ We will proceed in the direction mentioned above, and we will introduce a notation for the particular parts of Dkf approximations of higher orders for functions /:£„—>• R". It will alwyas be ^-linear expressions in the increases, and we will be interested only in their enumeration at a &-tuple of same values. We have already discussed the differential D1 f — df (the first order) and the Hessian D2f — Hf (the second order). In general, for functions /:£„—>• R, points x — (x\,..., x2) e En, and increases i; — (§i, ...,§„), we set Dkf(x)(v) E dkf l• R be a k-times differentiable function in a neighborhood Og(x) of a point x € En. For every increase v € W of size || v || < ^, exzsta a number 6, 0 < 0 < 1, swcft /(x + u) = /(x) + Dlf(x)(v) + ±D2f(x)(v)+ 1 (*- 1)! 2! k! Proof. For an increase i; e M", we consider the parametrized jijf'straight line c(f) — x + tv in E„, and we examine the function

• R defined by the composition (pit) — |■ / o c(t). Taylor's theorem for univariate functions claims that (see Theorem 6.4) (Pit) = (piO) + /(xq), respectively. If equality holds for no x / xo in the previous inequalities, we talk about a strict extrémům. For the sake of simplicity, we will suppose that our function / has continuous both first-order and second-order partial derivatives on its domain. A necessary condition for existence of an extrémům at a point xq is that the differential be zero at this point, i. e., df(xo) — 0. Indeed, if df(xo) / 0, then there is a direction v in which we have dvf(xo) / 0. However, then the function value is increasing at one side of the point xo along the line xo + tv and it is decreasing on the other side, see (5.32). An interior point x e En of the domain of a function / at which the differential df(x) is zero is called a stationary point of the function f. All CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES extrema. Further, inside every eighth of the sphere given by the coordinate planes, there may or may not be another extremum. The particular quadrants can be easily parametrized, and the function h (considered a function of two parameters) can be analyzed by standard means (or we can have it drawn in Maple, for example). Actually, solving the system (no matter whether algebraically or in Maple again) leads to a great deal of stationary points. Besides the six points we have already talked about (two of the coordinates equal to zero and the other to ±1) and which have a = ±|, there are also the points y 3 3 3 J for example, where a local extremum indeed occurs. If we restrict our interest to the points of the circle K, we must give another function G another free parameter rj representing the gradient coefficient. This leads to the bigger system 0 0 0 0 = x2 + y2 +z2- L 0 = x+y + z. 3x2 — 2Xx - V, 3/ - 2ky - V, 3z2 -2Xz - V, However, since a circle is also a compact set, h must have both a global minimum and maximum on it. Further analysis is left to the reader. □ f(x,y,z) = 1. If so, find 8.46. Determine whether the function / : R3 -» x2 y has any extrema on the surface 2x2 + 2^ + z2 these extrema and determine their types. Solution. Since we are interested in extrema of a continuous function on a compact set (ellipsoid) - it is both closed and bounded in R3 - the given function must have both a minimum and maximum on it. Moreover, since the constraint is given by a continuously differentiable function and the examined function is differentiable, the extrema must occur at stationary points of the function in question on the given set. We can build the following system for the stationary points: 2xy x2 0 4kx, 4ky, 2kz. This system is satisfied by the points [± , , 0] and [± , — , 0]. The function takes on only two values at these four stationary points. Ir follows from the above that the first and second stationary points are maxima of the function on the given ellipsoid, while the other two are minima. □ 8.47. Decide whether the function / : R3 -» R, f(x, y, z) = z -xy2 has any minima and maxima on the sphere 2,2,2 x +y +z 1. We will again, for a while, work with a simple function in E2 in order to illustrate our conclusions directly. Let us consider the function fix, y) — sin(x) cos(y) which has been discussed and caught in many pictures, namely in paragraphs 8.9 a 8.8. 
The shape of this function resembles well-known egg plates, so it is apparent that we can find a lot of extrema, but also many more stationary point which, in fact, will not be extrema (the little "saddles" noticeable in the picture). Therefore, let is calculate the first derivatives, and then the necessary second-order ones: fxix, y) — cos(x) cos(y), fy(x, y) — - sin(x) sin(y), and both derivatives will be zero for two sets of points (1) cos(x) — 0, sin(y) — 0, that is (x, y) — i^^-n, £n), for any t,leZ (2) cos(y) — 0, sin(x) — 0, that is (x, y) — (kit, 2^-n), for any t,leZ. The second partial derivatives are Hfix,y) =(f" ffxy)ix,y) \Jxy Jyy/ - sin(x) cos(y) — cos(x) sin(y) - cos(x) sin(y) — sin(x) cos(y) We thus get the following Hessians in our two sets of stationary points: If so, determine them. (1) Hfikn + j, £n) — ± ^ ^j, where the sign — occurs when k and £ have the same parity (remainder upon division by two), and the sign + occurs in the other case; (2) Hfikn, In + j) — ± ^ ^j, where, again, the sign — occurs occurs when k and £ have the same parity, and the sign + occurs in the other case; Now, if we look at the proposition of Taylor's theorem for order k — 2, we get, in a neighborhood of one of the stationary points (*o, yo), fix, y) = f(x0, yo)+ 1 + 2Hf(x° + °(x ~ x°)' y° + e(y ~ yo))(x - xo, y - yo), where Hf is now considered a quadratic form evaluated at the increase (x — xo, y — yo). Since the Hessian of our function is continuous (i. e., continuous partial derivatives up to order two, inclusive) and the matrices of the Hessian are non-degenerate, the local maximum occurs if and only if our point (xo, yo) belongs to the former group with k and £ of the same parity. On the other 478 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES Solution. We are looking for solutions of the system x = —ky2, y = —2kxy, z = k. 2j. The first The second equation implies that either y = 0 or x = possibility leads to the points [0, 0, 1], [0, 0, —1]. The second one cannot be satisfied (substituting into the equation of the sphere, we get the equation 1 1 , 777 + -r, +k = L 4k2 2k2 which has no solution. The function has a maximum and minimum, respectively, at the two computed points on the given sphere. □ 8.48. Determine whether the function / : R3 -» R, f(x, y, z) = xyz, has any extrema on the ellipsoid given by the equation g(x,y,z) = kx2 + lf + z2 = 1, k, I e R+. If so, calculate them. Solution. First, we build the equations which must be satisfied by the stationary points of the given function on the ellipsoid: dx dx dy dy yz xz. xy 2Xkx, 2Xly, 2Xz. dz dz We can easily see that the equation can only be satisfied by a triple of non-zero numbers. Dividing pairs of equations and substituting into the ellipse's equation, we get eight solutions, namely the stationary points x = ±77^, y = ±-^j, z = ±7^5- However, the function / takes on only two distinct values at these eight points. Since it is continuous and the given ellipsoid is compact, / must have both a maximum and minimum on it. Moreover, since both / and g are continuously differentiable, these extrema must occur at stationary points. Therefore, it must be that four of the computed stationary points are local maxima of the function (of value 777=) and the other four are minima (of value 3V3« □ 8.49. Determine the global extrema of the function f(x, y) = x2 - 2y2 + 4xy - 6x - 1 on the set of points [x, y] that satisfy the inequalities (8.1) x>0, y>0, y<-x+3. Solution. 
We are given a polynomial with continuous partial derivatives on a compact (i. e. closed and bounded) set. Such a function necessarily has both a minimum and a maximum on this set, and this can happen only at stationary points or on the boundary. Therefore, it suffices to find stationary points inside the set and the ones on a finite number of open (or singleton) parts of the boundary, then evaluate / at these points and choose the least and the greatest values. Notice that the set of points determined by the inequalities (||8.11|) is clearly a triangle with vertices at [0, 0], [3, 0], [0, 3]. hand, if the parities are different, then the point from the former group happens to be a point of a local minimum. On the other hand, the Hessian of the latter group of points is always positive at some increases and negative at other ones. Therefore, the entire function / behaves in this manner in a small neighborhood of the given point. In order to formulate the general statement about the Hessian and the local extrema at stationary points, we have to remember the discussion about quadratic forms from the paragraphs 4.31^1.32 in the chapter on affine geometry. There, we introduced the following attributes for a quadratic form h : En -> M: • positively definite iff h (w) > 0 for all u ^ 0 • positively semidefinite iff h(u) > 0 for all u e V • negatively definite iff h(u) < 0 for all u ^ 0 • negatively semidefinite iff h(u) < 0 for all u e V • indefinite iff h(u) > 0 and f(v)<0 for appropriate u,v e V. We also invented some methods which allow us to find out whether a given form has any of these properties. The Taylor expansion with remainder immediately yields the following proposition: Theorem. Let f : En -> Rbe a twice continuously differentiable function and x e En be a stationary point of the function f. Then (1) f has a strict local minimum at x if Hf(x) is positively definite, (2) f has a strict local minimum at x if H fix) is negatively definite, (3) f does not have an extremum at x if H fix) is indefinite. Proof. The Taylor second-order expansion with remainder applied to out function f(x\,..., x„), an arbitrary point x — (xi,..., x„), and any increase 1; — (vi,..., v„), such that both x and x + v lie in the domain of the function /, says that f(x + v) = f(x) + df ix)iv) + \nfix + 0 ■ v)iv) for an appropriate real number 6, 0 < 6 < 1. Since we suppose that the differential is zero, we get fix + v) = fix) + l-Hfix + 6 ■ v)iv). By our assumption, the quadratic form Hf(x) is continuously dependent on the point x, and the definiteness or indefiniteness of quadratic forms can be determined by the sign of the major subde-terminants of the matrix Hf, see Sylvester's criterion in paragraph 4.32. However, the determinant itself is a polynomial expression in the coefficients of the matrix, hence a continuous function. Therefore, the non-zeroness and signs of the examined determinants are the same in a sufficiently small neighborhood of the point x as at the point x itself. In particular, for positively definite Hf(x), we have guaranteed that, at a stationary point x, f(x + v) > f(x) for sufficiently small 1;, so this is a sharp minimum of the function / at the point x. The case of negative definiteness is analogous. If Hf(x) is indefinite, then there are directions 1;, w in which fix + v) > fix) and fix + w) < fix), so there is no extremum at the stationary point in question. □ 479 CHAPTER 8. 
CONTINUOUS MODELS WITH MORE VARIABLES Let us determine the stationary points inside this triangle as the solution of the equations fx = 0, fy = 0. Since fx(x, y) = 2x+Ay- 6, fy(x, y) = Ax - Ay, these equations are satisfied only by the point [1, 1]. The boundary suggests itself to be expressed as the union of three line segments given by the choice of pairs of vertices. First, we consider x = 0, y € [0, 3], when fix, y) = —2y2 — 1. However, we know the graph of this (univariate) function on the interval [0, 3] It is thus not difficult to find the points at which global extrema occur. They are the marginal points [0, 0], [0, 3]. Similarly, we can consider y = 0, x € [0, 3], also obtaining the marginal points [0, 0], [3,0]. Finally, we get to the line segment y = — x + 3, x e [0, 3]. Making some rearrangements, we get f(x, y) = f{x, -x + 3) = -5x2 + 18* - 19, x e [0, 3]. We thus need to find the stationary points of the polynomial p(x) = —5x2 + 18* — 19 from the interval [0, 3]. The equation p'ix) = 0, i. e., — 10* + 18 = 0, is satisfied by x = 9/5. This means that in the last case, we obtained one more point (besides the marginal points), namely [9/5, 6/5], where a global extremum may occur. Altogether, we have these points as "suspects": [1,1], [0,0], [0,3], [3,0], [f,f] with function values -A, -1, -19, -10, respectively. We can see that the function / takes on the greatest value -1 at the point [0, 0] and the least value -19 at the point [0, 3]. □ 8.50. Determine whether the function / : R3 -» R, fix, y, z) = fz has any extrema on the line segment given by the equations 2x + y + z = 1, x — y + 2z, = 0 and the constraint x e [—1,2]. If so, find these extrema and determine their types. Justify all of your decisions. Solution. We are looking for the extrema of a continuous function on a compact set. Therefore, the function must have both a minimum and a maximum on this set, and this will happen either at the marginal points of the segment or at those where the gradient of the examined function is a linear combination of the gradients of the functions that give the constraints. First, let us look for the points which satisfy the gradient condition: 0 2yz y2 2x + y + z x — y + 2z 2k +1, k-l, k + 2l, 1, 0. The solution is [x, y, z] = [§, 0, -j] and [x, y,z] = [f, §, -|] (of course, the variables k and I can also be computed, but we are not interested in them). The marginal points of the given line segment are [—1, |, |] and [2, —|, —|]. Considering these four points, the function takes on the greatest value at the first marginal point (/(*, y, z) = 4pp), which is its maximum on the given segment, and it Let us notice that the theorem yields no result if the Hessian of the examined function is degenerate, yet not indefinite at the point in question. The reason is again the same as in the case of univariate functions. In these cases, there are directions in which both the first and second derivatives vanish, so at this level of approximation, we cannot determine whether the function behaves like f3 or ±?4 until we calculate the higher-order derivatives in the necessary directions at least. At the same time, we can notice that even at those points where the differential is non-zero, the definiteness of the Hessian Hf(x) has similar consequences as the non-zeroness of the second derivative of a univariate function. 
Indeed, for a function / : Rn -> R, the expression z(x + v) = f(x) + df(x)(v) defines a tangent hyperplane to the graph of the function / in the space Rn+1, so Taylor's theorem of order two with remainder, as used in the proof, shows that when the Hessian is positively definite, all the values of the function / lie in a sufficiently small neighborhood of the point x above the values of the tangent hyperplane, i. e., the whole graph is above the tangent hyperplane in a sufficiently small neighborhood. In the case of negative definiteness, it is the other way round. Finally, when the Hessian is indefinite, the graph of the function goes from one side of the hyperplane to the other, but this happens, in general, along objects of lower dimension in the tangent hyperplane, so we have no straightforward generalization of inflexion points. 8.14. The differential of mappings. The concepts of a derivative and a differential can be easily extended to mappings F : E„ -> Em. Having selected the Cartesian coordinate system on both sides, this mapping is an ordinary m-tuple F(xu ..., x„) - (/i(*i, ..., xn), ..., fm(xi, .. .,*„)) of functions f : E„ -> R. We say that F is a differentiable or k-times differentiable mapping iff the corresponding property is shared by all the functions fi,..., fm. Differential and Jacobian matrix |_^ The differentials dfi (x) of the particular functions f give a linear approximation of the increases of their values for the mapping F(xu ... ,x„) = (/i(*i, .. .,xn), ..., fm(xx, ..., x„)). Therefore, we can expect that they will also give a coordinate expression of the linear mapping D1F(x) : Rn —>• Rm between the direction spaces, which linearly approximates the increases of our mapping. The resulting matrix DlF(x) (dfi(x)\ df2(x) — \dfm (x) J 3/2 obci I 3/m \ obci Ml 3/2 dX2 Ms. dx2 3£l\ dx„ > dxn dx„ / (x) is called the Jacobian matrix of the mapping F at a point x. The linear mapping D1F(x) defined on the increases i; — (vi,..., v„) by identically denoted the Jacobian matrix is called 480 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES takes the least value at the second marginal point {fix, y, z) which is thus its minimum there. 80 n 27'' □ 8.51. Find the maximal and minimal values of the polynomial p(x, y) = 4x3 - 3x - Ay3 + 9y on the set M = {[x, y] e R2; x2 + y2 < l} . Solution. This is again the case of a polynomial on a compact set; therefore, we can restrict our attention to stationary points inside or on the boundary of M and the "marginal" points on the boundary of M. However, the only solutions of the equations px(x, y) = \2x2 -3 = 0, py(x, y) = -12/ + 9 = 0 are the points 1 V3 1 S 1 V3 1 V3 2' 2 2' 2 2' 2 2' 2 which are all on the boundary of M. This means that p has no extrémům inside M. Now, it suffices to find the maximum and minimum of p on the unit circle k : x2 + y2 = 1. The circle k can be expressed parametrically as x=cosf, y = sinf, te[—jt,7t]. Thus, instead of looking for the extrema of p on M, we are now seeking the extrema of the function /(f) := p(cos f, sin f) = 4 cos3 t — 3 cos t — 4 sin3 f + 9 sin f on the interval [—it, it]. For f e [—7T, tt], we have /'(f) = —12cos2 f sinf + 3 sinf — 12sin2 f cosf + 9cos f, In order to determine the stationary points, we must express the function /' in a form from which we will be able to calculate the intersection of its graph with the x-axis. To this purpose, we will use the identity -K- = 1 + tan2 f, cosz t which is valid provided both sides are well-defined. 
We thus obtain f'if) = cos3 f [- 12tan f + 3 (tan f + tan3 f) - 12tan2 f + 9 (l + tan2f) ] for f e [—it, it] with cosf ^ 0. However, this condition does not exclude any stationary points since sin f ^ 0 if cos f = 0. Therefore, the stationary points of / are those points f e [—it, tv] for which -4 tan f + tan f + tan3 f - 4 tan2 f + 3 + 3 tan2 f = 0. The substitution s = tan f leads to s3 - s2 - 3s + 3 = 0, i. e. (s - 1) (s - Vfj (s + Vfj = 0. Then, the values 5=1, s = >/3, s = — >/3 respectively correspond to f e {—I jt, t e {—I jt, |7r}, f e {—\it, §77-}. Now, we evaluate the function / at each of these points as well as at the marginal points t = —it, t = it. Sorting them, we get / (-1 it) = -1 - 373 < / (-1 jt) = -3V2 / (I jt) = 3V2 > / (i tt) = -1 + 373 > 0. the differential of the mapping F at a point x in the domain iff we have lim — (F(x + v)- F(x) - Z)1F(x)();)) = 0. u->0 Hull v 7 Several times, we have already used the fact that the definition of Euclidean distance guarantees that the limits of values in En exist if and only if the limits of the particular coordinate components do. Direct application of Theorem 8.5 about existence of the differential for functions of n variables to the particular coordinate functions of the mapping F thus leads to the following proposition (prove this in detail by yourselves!): Corollary. Let F : En —>• Em be a mapping such that all of its coordinate functions have continuous partial derivatives in a neighborhood of a point x e En. Then the differential D1 F(x) exists, and it is given by the Jacobian matrix D1 F(x). 8.15. Transformation of coordinates. A mapping F : En —>• En which has an inverse mapping G : En —>• En defined on the whole of F's image is called a trans-formation. Such a mapping can be perceived as a change of coordinates. We usually require that both F and G be (continuously) differentiable mappings. Just like in the case of vector spaces, the choice of our "point of view", i. e. the choice of coordinates, can simplify or deteriorate our comprehension of the examined object. The change of coordinates is now being discussed in a much more general form than in the case of affine mappings in the fourth chapter. Sometimes, the term "curvilinear coordinates" is used in this general sense. A very illustrative example is the change of the most usual coordinates in the plane to the so-called polar ones, i. e., the position of a point P is given by its distance r — ^/x2 + y2 from the origin and the angle cp — arctan(y/x) between the ray from the origin to it and the x-axis (if x ^ 0). 1/2*Pi 3/2*Pi The change from polar coordinates to the standard ones is ppolar = (r' «o) ^ (r cos V, r sin?) = cartesian It is apparent that it is necessary to limit the polar coordinates to an appropriate subset of points (r, 0 changes the points of extrema (of course, they can change their values). However, we know that the function g gives the distance of a point [x, y] from the point [2, 0]. Since the set M is clearly a square with vertices [1, 0], [0, 1], [—1, 0], [0, —1], the point of M that is closest to [2, 0] is the vertex [1,0], while the most distant one is [ — 1, 0]. Altogether, we have obtained that the minimal value of / occurs at the point [1,0] and the maximal one at [ — 1, 0]. □ 8.53. Compute the local extrema of the function y = fix) given implicitly by the equation 3x2+2xy+x = y2+3y + \, [x, y] sR2\{[x,x - |] ; x € R} . Solution. 
In accordance with the theoretical part (see 8.18), let us denote Fix, y) = 3x2 + 2xy + x — y2 — 3y — |, [x, y]sR2\ {[x,x - §] ; x € R} and calculate the derivative 6x+2y + l 2x—2y—3 ' We can see the this derivative is continuous on the whole set in question. In particular, the function / is defined implicitly on this set (the denominator is non-zero). A local extremum may occur only for those x, y which satisfy / = 0, i. e., 6x+2y+\ = 0. Substituting y = —3x — 1/2 into the equation Fix, y) = 0, we obtain — 12x2 + 6x = 0, which leads to [x,y] = [0,-±], [x,y] = [\,-2\. We can also easily compute that „ _ i ,y _ (6+2/)(2x-2;y-3)-(6jt+2;y+l)(2-2/) y — \y ) — (2x-2y-3)2 Substituting x = 0, y = —1/2, / = 0 and x = 1/2, y = —2, / = 0, we obtain f = _6(_2)_o > Q for [JC> = [o, 4] and f 6(+2)-0 < 0 for [*,)>] = [±,-2], mapping would exist. The Cartesian image of lines in polar coordinates with constant coordinates r or

• Em and G : Em —>• Er be two dif-ferentiable mappings, where the domain of G contains the whole image of F. Then, the composite mapping G o F is also differen-tiable , and its differential at any point form the domain of F is given by the composition of differentials Dl(GoF)(x) = D1G(F(x))o/)1F(x). The corresponding Jacobian matrix is given by the product of the corresponding Jacobian matrices. Proof. In paragraph 8.5 and in the proof of Taylor's theorem, we derived how differentiation of mappings composed from functions and curves behaves This proves the > special cases of this theorem for n — r — 1. The general case can be proved analogously, we just have to work with more vectors now. Let us fix an arbitrary increase i; and calculate the directional derivative for the composition G o F at a point x e E„. This actually means to determine the differentials for the particular coordinate functions of the mapping G composed with F. For the sake of simplicity, we will write g o F for any one of them. dv(g o F)(x) = lim Ug(F(x + tv)) - g(F(x))). t^o t The expression in parentheses can, from the definition of the differential of g, be expressed as g(F(x + tv)) - g(F(x) = dg(F(x))(F(x + tv) - F(x)) + a(F(x + tv) - F(x)), where a is a function defined on a neighborhood of the point F(x) which is continuous and satisfies lim^o ^a^i;) — 0. Substitution into the equality for the directional derivative yields dv(g o F)(x) = lim Udg(F(x))(F(x t^o t a(F(x ■ dg(F(x))(\iml-(F( 1 \t->o t + tv) - F(x)) tv) - F(x))) f tv) - Fix) lim t^o t aiFix + tv) - Fix)) = dg(F(x)) o DlFix)iv) + 0, 482 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES We have thus proved that the implicitly given function has a strict local minimum at the point x = 0 and a strict local maximum at x = 1/2. □ 8.54. Find the local extrema of the function z the maximum possible set by the equation f(x,y) given on (8.2) xz. yz + 2x + 2y + 2z,-2 = 0. Solution. Differentiating (||8.2||) with respect to x and y gives 2x + 2zzx - z - xzx - yzx + 2 + 2z,x = 0, 2y + 2zzy Hence we get that z (8.3) xz,^ z - yz,y + 2 + 2z,y 0. fx(x, y) 2z fy(x, y) — 2x —2 x - y + 2' -2y -2 2z, — x — y + 2 We can notice that the partial derivatives are continuous at all points where the function / is defined. This implies that the local extrema can occur only at stationary points. These points satisfy zx = 0, i. e. z — 2x — 2 = 0, zy = 0, i. e. z - 2y - 2 = 0. We have thus two equations, which allow us to express the dependency of x and y on z. Substituting into (||8.2||), we obtain the points [x, y, z] = [x, y, z] -3 + 76, -3 + 76, -4 + 276 -3 - V6, -3 - 76, -4 - 276 Now, we need the second derivatives in order to decide whether the local extrema really occur at the corresponding points. Differentiating zx in (||8.3||), we obtain 7 — f (v v\ - fa-2)(2z-x-y +2)-(z-2x-2)(2Zx-1) Zxx - Jxx\x, y) - {2z_x_y+2)2 with respect to x, and _ r , n _ Zj,(2z-x-y+2)-(z-2x-2)(2zj,-l) Zxy ~ Jxy(X, y) - (2z_x_y+2)2 , with respect to y. We need not calculate zyy since the variables x and y are interchangeabel in (||8.2||) (if we swap x and y, the equation is left unchanged). Moreover, the x- and y-coordinates of the considered points are the same; hence zx Now, we evaluate that at the stationary points: fxx (-3 + 76, -3 + V6) = fyy (-3 + V6, -3 + 76) fxy (-3 + 76, -3 + 76) = fyx (-3 + 76, -3 + 76) = 0, fxx (-3 - 76, -3 - 76") = fyy (-3 - 76, -3 - 76") = j= fxy (-3 - 76, -3 - 76") = fyx (-3 - 76, -3 - 76") = 0. 
As for the Hessian, we have Hf (-3 + 76, -3 + 76") = Hf{[- 76, -3 where we made use of the properties of the function a and the fact the a linear mapping between finite-dimensional spaces are always continuous. Thus, we have proved the theorem for the particular functions gi,..., gr of the mapping G. The whole theorem now follows from matrix multiplication. □ Now, we can illustrate, by a simple example, the usage of our concept of transformation and the theorem about differentiation of composite mappings. We have seen that the polar coordinates are given from the Cartesian ones by the transformation F : M2 —>• M2 which, in coordinates (x, y) and (r, cp), is written as follows (for instance, on the domain of all point in the first quadrant except for the points having x — 0): — Jx2 + y2, (f — arctan y Consider a function gt : E2 —>• M which can be expressed as g(r, In fact, we are considering a one-dimensional quadratic form whose positive (negative) definiteness at a stationary point means that there is a minimum (maximum) at that point. Realizing that the stationary points had x =2k,y = 2k, mere substitution yields \dx2, d2L^,^ = -4V2dx2, d2L(-^,-£)=4j2c which means that there is a strict local maximum of the function / at the point is a strict (8.5) V2/2, V2/2 , while at the point -V2/2, -V2/2 ocal minimum. The corresponding values are: there -2V2. Now, we will demonstrate a quicker way how to obtain the result. We know (or we can easily calculate) the second partial derivatives of the function L, i. e., the Hessian with respect to the variables x and y: HL (x, y) > 2__6k n v3 r4 u inverse function is then the multiplicative inverse of the derivative of the original function. Interpreting this situation for a mapping E\ —>• E\ and linear mappings R —>• R as their differentials, the non-zeroness is a necessary and sufficient condition for the differential to be invert-ible. In this way, we obtain a statement which is valid for finite-dimensional spaces in general: I The inverse mapping theorem [__, 1 Theorem. Let F : En —>• En be a differentiable mapping on a neighborhood of a point xq € En, and let the Jacobian matrix D1 f(xo) be invertible. Then in some neighborhood of the point xq, the inverse mapping F-1 exists, and its differential at the point F(xq) is the inverse mapping to the differential D1F(xo), i. e., it is given by the inverse matrix to the Jacobian matrix of the mapping F at the point xq. Proof. First, we should try to verify that the theorem makes sense and is expectable. If we supposed that the inverse mapping existed and was differentiable at the point F(xq), differentiation of composite functions enforces the formula idM« = dl(F~l o F)(xo) = dHf'1) o D1F(xq), which verifies the formula at the end of the theorem. Therefore, we know right from the beginning which differential for F~l to look for. In the next step, we will suppose that the inverse mapping F~1 exists in a neighborhood of the point F(xq) and that it is continuous. We are to verify the existence of the differential. Since F is differentiable in a neighborhood of xq, it follows that F(x) - F(x0) - dlF(x0)(x - x0) — a(x - x0) with function a : R" —>• 0 satisfying lim^o ^a^i;) — 0. 
To verify the approximation properties of the linear mapping (D1F(xo))~1, it suffices to calculate the following limit for y — F(x) approaching yo — F(xq): lim 1 n{F-\y) - F^iyo) - (Z)1JF(x0))-1(y - yo))-y^yo \\y - y0|| Substitution into the previous equality gives 1 lim x0- y^yo IIy - yo (D1JF(x0))-1(D1JF(x0)(x - x0) + a(x - x0)) -1 lim y^yo IIy - y0|| -(D1JF(xo))-1(«(x-x0)) (^^(xo))-1 lim -1 -{a{x -x0)), y->yo ||y - yoll where the last equahty follows from the fact that linear mappings between finite-dimensional spaces are always continuous, and thanks to invertibility of the differential, performing it before the limit process has no impact upon existence of the limit. 484 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES The evaluation HL HL -2V2 0 0 -2V2 2V2 0 o 2V2 then tells us that the quadratic form is negative definite for the former stationary point (there is a strict local maximum) and positive definite for the latter one (there is a strict local minimum). We should be aware of a potential trap in this "quicker" method in the case we obtain an indefinite form (matrix). Then, we cannot conclude that there is not an extremum at that point since as we have not included the constraint (which we did when computing (PL), we are considering a more general situation. The graph of the function / on the given set is a curve which can be defined as a univariate function. This must correspond to a one-dimensional quadratic form. □ 4. 8.56. Find the global extrema of the function f(x,y) = -\ + -\, x^O, y^O on the set of points that satisfy the equation \ + \ x y Solution. This exercise is to illustrate that looking for global extrema may be much easier than for local ones (cf. the above exercise) even in the case when the function values are considered on an unbounded set. First, we would determine the stationary points (|| 8.4||) and the values (||8.5||) the same way as above. Let us emphasize that we are looking for the function's extrema on a set that is not compact, so we will not do with evaluating the function at the stationary points. The reason is that the function / may not have an extremum on the considered set -its range might be an open interval. However, we will show that this is not the case here. Let us thus consider \ x\ > 10. We can realize that the equation \ + \ = 4 can be satisfied only by those values y for which | y | > x y 1/2. We have thus obtained the bounds -2V2 < 10 + 2 < 272, if > 10. 2 < f(x, y) < 10 At the same time, we have (interchanging x and y leads to the same task) -2V2 < -2 < f(x,y) < ± +2 < 2V2, if |y|>10. Hence we can see that the function / must have global extrema on the considered set, and this must happen inside the square A BCD with vertices A = [-10, -10], B = [10, -10], C = [10, 10], D = [-10, 10]. The intersection of the "hundred times reduced" square with vertices at Ä = [-1/10,-1/10], B = [1/10,-1/10], C = [1/10, 1/10], D = [-1/10, 1/10] and the given set is clearly the empty set. Therefore, the global extrema are at points inside the compact set bounded by these two squares. Since / is continuously dif-ferentiable on this set, the global extrema can occur only at stationary points. We thus must have fm /(f,f)=2V2, /mto = /(-f,-f) = -2V2. Let us notice that we are almost done with the proof. The limit at the end of our expression is, thanks to the properties of the function a, zero if the magnitudes ||F(x) — F(xo)|| are greater than C\\x — xq|| for some constant C. 
This is a bit stronger property than F-1 being continuous; in literature, this property of a function is called Lipschitz continuity. So, now it remains "merely" to prove the existence of a Lipschitz-continuous inverse mapping to the mapping F. To simplify the reasonings to come, we will transform the general case to a statement which is a bit more simple. Especially, we can achieve xq — 0 e Rn, y0 — F(xq) — 0 e M" by a convenient choice of Cartesian coordinates, which is without loss of generality. Composing the mapping F with any linear mapping G yields a differentiable mapping again, and we know this changes the differential. The choice G(x) = (Dl F(O))"1 (x) gives Dl(Go F)(0) = idu". Therefore, we can assume that DlF(0) = idRn. Now, having these assumptions, let us consider the mapping K(x) — F(x) — x. This mapping is differentiable, too, and its differential at 0 is apparently zero. By The Taylor expansion with remainder of the particular coordinate functions Kt and the definition of Euclidean distance, we get for any continuously differentiable mapping A" in a neighborhood of the origin of W the bound \\K(x) - K(y)\\ < Cfn\\x-y\\, where C is bounded by the maximum of all absolute values of the partial derivatives in the Jacobian matrix of the mapping A" in the neighborhood in question.2 Since the differential of the mapping K at the point xo — 0 is zero in our case, we can, selecting a sufficiently small neighborhood U of the origin, achieve the bound \K(x)-K(y)\ 1 < -1 - 2 ■yll. Further, substituting for the definition K(x) voking the triangle inequality || (u — v) + v\\ e., ||m|| — ||d|| < ||« — d|| as well, we get ||y — x|| — ||-F(x) — F(y)|| < ||-F(x) 1 = F (x) — x and in- < ||« — i>|| + ||d||, i. F(y) + y - x|| Hence, finally, 1 — Ilx 2 y|| < \\F(x)-F(y)\ With this bound, we have reached a great advancement: if x / y in our neighborhood U of the origin, then we also must have F(x) ^ F(y). Therefore, our mapping is bijective. Let us write F-1 for its inverse defined on the image of U. For this function, our bound says that |F_1(jc) - F~l(y)II < 2|| yll, so this mapping is not only continuous, but also Lipschitz-continuous, as we needed in the previous part of the proof. It immediately follows from this reasoning that a function which has continuous partial derivatives on a compact set is Lipschitz-continuous on it as well. 485 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES □ 8.57. Determine the maximal and minimal values of the function fix, y, z) = xyz on the set M given by the conditions 1, x + y + z = 0. x2 + y2 + z2 Solution. It is not hard to realize that M is a circle. However, for our problem, it is sufficient to know that M is compact, i. e. bounded (the first condition of the equation of the unit sphere) and closed (the set of solutions of the given equations is closed since if the equations are satisfied by all terms of a converging sequence, then it is satisfied by its limit as well). The function / as well as the constraint functions F(x, y, z) = x2 + y2 + z2 — 1, G(x, y, z) = x + y + z, have continuous partial derivatives of all orders (since they are polynomials). The Jacobi constraint matrix is 'Fx(x, y, z) Fy(x, y, z) Fz{x, y, z) \ = (2x 2y 2z" KGx(x,y,z) Gy(x,y,z) Gz(x, y, z)) ~ \l 1 1 Its rank is reduced (less than 2) if and only if the vector (2x, 2y, 2z) is a multiple of the vector (1, 1, 1), which gives x = y = z, and thus x = y = z =0 (by the second constraint). However, the set M does contain the origin. 
Therefore, we may look for stationary points using the method of Lagrange multipliers. For L(x, y, z, A.i, k2) = xyz - ki (x2 + y2 + z2 - l) - k2 (x + y + z), the equations Lx = 0, Ly = 0, Lz = 0 give yz, — 2k\x — k2 = 0, xz, — 2k\y — k2 = 0, xy — 2k\z, — k2 = 0, respectively. Subtracting the first equation from the second one and from the third one leads to xz — yz — 2k\y + 2k\x = 0, xy — yz — 2k\z, + 2k\x = 0, i. e., (x -y)(z + 2kx) = 0, (x-z) (y + 2*i) = 0. The last equations are satisfied in these four cases: x = y, x = z; x = y, y = —2k\; z = —2k\,x=z; z = — 2k\, y = — 2k\, thus (including the constraint G = 0) x = y = z, = 0; x = y = —2k\, z, = 4Ai; x = z = —2k\, y = Ak\\ x = 4A.i, y = z = —2k\. Except for the first case (which clearly cannot happen), including the constraint F = 0 yields (4A02 + (-2A02 + (-2A02 = 1, Altogether, we get the points i. e. k\ 1 '76' 1 2 '76' V6. 1 2 1 76 2 76' 1 '76' 1 76 It could seem that we are done with the proof, yet it is not \\ true. To finish the proof completely, we still have to show that the mapping F restricted to a sufficiently small neighborhood is not only bijective, but also that it maps open neighborhoods of zero onto open neighborhoods of zero. 3 Let us choose S so small that the neighborhood V — (D& (0) lies in U with its boundary, and, at the same time, the Jacobian matrix of the mapping is invertible on the whole V. This surely can be done since the determinant is a continuous mapping. Let B denote the boundary of the set V (i. e., the corresponding sphere). Since B is compact and F is continuous, the function P(x) = \\F(x)\\ has both a maximum and a minimum on B. Let us denote a — ^minx£B p(x) and consider any y e Oa(0). Of course, a > 0. We want to show that there is at least one x e V such that y — F(x), which will prove the whole inverse mapping theorem. To this purpose, consider the function (y is our fixed point) h(x)= \\F(x)-y\\2. Again, the image h(V) U h(B) must have a minimum. First, we show that this minimum cannot occur for x e B. Indeed, we have F(0) — 0, hence h(0) — \\y\\ < a. At the same time, by our definition of a, the distance of y from F(x) for x e B is at least a for y e Oa (0) (since a was selected to be half the minimum of the magnitude of F(x) on the boundary). Therefore, the minimum occurs inside V, and it must be at a stationary point z of the function h. However, this means that for all j — 1,..., n, we have ^r(z) = E2(/'«-w)ö77« = 0. dxJ i=\ dfi dxj This system of equations can be considered a system of linear equations with variables & — ft (z) — yi and coefficients given by twice the Jacobian matrix D1F(z). For every zeV, such a system has a unique solution, and that is zero since we suppose that the Jacobian matrix is invertible. Thus, we have found the wanted point x — z e V satisfying, for all i — 1,..., n, the equality ft (z) — yi, i. e., F(z) — y. □ 8.18. The implicit function theorem. Our next goal is to apply the inverse mapping theorem for work with implicitly defined functions. For the beginning, let us IIISlI? consider a differentiable function F(x, y) defined in the plane E2, and let us look for those point (x, y) at which F{x, y) = 0. An example of this can be the usual (implicit) definition of straight lines and circles: F(x, y) — ax + by + c — 0 F(x, y) = (x - s)2 + (y - t)2 - r2 = 0, r > 0. 
While in the first case, the function given by the first formula is (for a c y — fix) —--x-- y J b b for all x; in the other case, for any point (x$, yo) satisfying the equation of the circle and such that yo / t (these are the marginal l l 76' 76' 2 76 1 76' 2 1 76' 76 2 1 1 76 ' 76 ' 76 In literature, there are many examples of mappings which, for instance, continuously and bijectively map a line segment onto a square. 486 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES We will not verify that these really are stationary points. The only important thing is that all stationary points are among these six. We are looking for the global maximum and minimum of the continuous function / on the compact set M. However, the global extrema (we know they exist) can occur only at points of local extrema with respect to M. And the local extrema can occur only at the aforementioned points. Therefore, it suffices to evaluate the function / at these points. Thus we find out that the wanted maximum is /(--.--■-)-/(--■-■-- V 76 V6 76/ V Vě Vě V6y .2 1 1 \ 1 / ,76' 76' 76/ 376' while the minimum is f(- - --)=f(- -- -V76 76 76/ V76 76 76y = /M-.2-,2J) = _J_. V 76 76 76/ 376 □ 8.58. Find the extrema of the function / : R3 -» R, fix, y, z) = x2 + y2 + z2, on the plane x + y — z, = 1 and determine their types. Solution. We can easily build the equations that describe the linear dependency between the normal to the constraint surface and the examined function: k, y = k z -k, k e The only solution is the point [|, |, — |]. Further, we can notice that the function is increasing in the direction of (1, —1,0), and this direction lies in the constraint plane. Therefore, the examined function has a minimum at this point. Another solution. We will reduce this problem to finding the extrema of a two-variable function on R2. Since the constraint is linear, we can express z, = x + y — 1. Substituting this into the given function then yields a real-valued function of two variables: f(x, y) = x2 + y2 + (x + y - l)2 = 2x2 + 2xy + y2 - 2x - 2y + 1. Setting both partial derivatives equal to zero, we get the linear equation 4x + 2y - 2 = 0, 4y + 2x - 2 = 0, whose only solution is the point [|, |]. Since it is a quadratic function with positive coefficients at the unknowns, it is unbounded on R2. Therefore, there is a (global) minimum at the obtained point. Then, we can get the corresponding point [|, |, — |] in the constraint plane from the linear dependency of z. □ 8.59. Find the extrema of the function x + y : R3 -» R on the circle given by the equations x + y + z, = 1 and x2 + y2 + z2 = 4. Solution. The "suspects" are those points which satisfy (1, 1,0)=*:- (1, 1,1) + /- (jc, y, z), k,leR. Clearly, x = y(= 1/Z). Substituting this into the equation of the circle then leads to the two solutions 1 722 1 722 1 722 - ± -—, - ± -—, - T -— 3 6 3 6 3 3 points of the circle in the direction of the coordinate x), we can find a neighborhood of the point xq in which either or y y = f(x) = t + f(x-s)2 = fix) = t - V(x - s)2 according to which semicircle the point (xq, yo) belongs to. Having the picture of the situation drawn, the reason is clear: we cannot describe both the semicircles simultaneously by a single function y — fix). The marginal points of the interval [s — r, s + r] are more amazing. They also satisfy the equation of the circle, yet we have at them that Fy is ±r,t) — 0, which describes the position of the tangent line to the circle at these points, parallel to the y-axis. 
Indeed, we cannot find neighborhoods of these points in which the circle could be described as a function y — fix). Moreover, the derivatives of our function y — fix) — t + 7(x — s)2 — r2 at points where it is defined can be expressed in terms of partial derivatives of the function F: fix) = - 2(x — s) y ■ If we interchange the roles of the variables x and y and we will want to find a dependency x — fiy) such that F(/(y), y) — 0, then we will succeed in neighborhoods of points is ± r, t) with no problem. Let us notice that the partial derivative Fx is non-zero at these points. Our observation thus (for mere two examples) says: for a function Fix, y) and a point (a,b) e E2 such that F(a,b) — 0, there is a unique function y — fix) satisfying F(x, fix)) — 0 if we have Fy(a, b) / 0. In this case, we can even compute f'ia) — —Fx(a,b)/Fy(a,b). We will prove that actually, this proposition is always true. The last statement about derivatives can be remembered (and is quite comprehensible if things are thoroughly understood) from the expression for the differential of the function gix) — Fix, y(x)) and the differential dy — f'(x)dx 0 — dg — Fxdx + Fydy — (Fx + Fyf'(x))dx. We could work analogously with the implicit expressions F{x, y, z) — 0, where we can look for a function gix, y) such that Fix, y, gix, y)) — 0. As an example, consider the function fix,y) — x2 + y2, whose graph is a circular paraboloid centered at the point (0, 0). This can be defined implicitly by the equation 0 = Fix, y,z) = z-x2 ■y2- Before formulating the result straight for the general situation, we can notice which dimensions could/should appear in the problem. If we wanted to find, for this function F, a curve c(x) — (ci (x), c2Íx)) in the plane such that Fix, c(x)) — Fix, c\ix), C2Íx)) — 0, then we succeed as well (even for all initial conditions x — a), yet the result will not be unique for a given initial condition. In fact, it suffices to consider an arbitrary curve on the circular paraboloid whose projection onto the first coordinate has non-zero derivative. Then we consider x to be the parameter of the curve, and c(x) is chosen to be its projection onto the plane yz. 487 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES Since every circle is compact, it suffices to examine the function values at these two points. We find out that there is a maximum of the considered function on the given circle at the former point and a minimum at the latter one. □ 8.60. Find the extrema of the function / : R3 -» R, f(x, y, z) = x2 + y2 + z2, on the plane 2x + y — z, = 1 and determine their types. o 8.61. Find the maximum of the function / : R2 -» R, f(x, y) = xy on the circle with radius 1 which is centered at the point [x0, y0] = [0, i]. O 8.62. Find the minimum of the function / : R2 -» R, f = xy on the circle with radius 1 which is centered at the point [xo, yo] = [2, 0]. o 8.63. Find the minimum of the function / : R2 -» R, f = xy on the circle with radius 1 which is centered at the point [x0, y0] = [2, 0]. o 8.64. Find the minimum of the function / : R2 -» R, f = xy on the ellipse x2 + 3/ = 1. O 8.65. Find the minimum of the function / : R2 -» R, f = x2y on the circle with radius 1 which is centered at the point [x0, y0] = [0, 0]. o 8.66. Find the maximum of the function / : I on the circle x2 + y2 = 1. 8.67. Find the maximum of the function / : on th ellipse 2x2 + 3 V2 = 1. 8.68. Find the maximum of the function / : on the ellipse x2 + 2y2 = 1. l,f(x,y) =x3y O R, f(x, y) = xy o R, f(x, y) = xy o H. 
Volumes, areas, centroids of solids 8.69. Find the volume of the solid which lies in the half-plane z, > 0, the cylinder x2 + y2 < 1, and the half-plane a) z < x, b) x + y + z, < 0. Therefore, we expect that one function of m + 1 variables defines implicitly a hypersurface in Rm+1 which we want to express (at least locally) as the graph of a function of m variables. We can anticipate that n functions of m + n variables will define an intersection of n hy-persurfaces in Rm+n, which is, in "most" cases, a m-dimensional object. Let us thus consider a differentiable mapping fn): The Jacobian matrix of this mapping will have n rows and m columns, and we can write it symbolically as d1F = (dlxF, d\F) (Ml obci . Ml \Sx\ Ml dxm Ml dxm vm+n 3/i dxm SXn. dxm _Mn_ where (x\,..., xm+n) e Rm+n is written as (x, y) e D\ F is a matrix of n rows and the first m columns in the Jacobian matrix, while D^F is a square matrix of order n, with the remaining columns. The multidimensional analogy to the previous reasoning with the non-zero partial derivative with respect to y is the condition that the matrix Dy F is invertible. The implicit mapping theorem Theorem. Let F : Rm+n —>• Rn be a differentiable mapping in an open neighborhood of a point (a,b) e Rm x Rn — Rm+n at which F(a,b) — 0, and det DyF ^ 0. Then there exists a differentiable mapping G : Rm —>• R" defined on an neighborhood U of the point a € Rm with image G(U) which contains the point b and such that F(x, G(x)) = Ofor all x € U. Moreover, the Jacobian matrix D1G of the mapping G is, in the neighborhood of the point a, given by the product of matrices DlG{x) -(DyF) -\x, G(x)) ■ DlxF(x, G(x)). Proof. For the sake of comprehensibility, we first show the proof for the simplest case of the equation F(x, y) — 0 with a function F of two variables. At first sight, it ft, v will be quite complicated because it will be presented in a way which can be extended for the general dimensions as the theorem states. We extend the function F to F : R2 R2, (x, y) (x, F(x, y)). The Jacobian matrix of the mapping F is 1 0 yFx(x,y) Fy(x,y) It follows from the assumption Fy(a,b) / 0 that the same holds in a neighborhood of the point (a, b) as well, so the function F is invertible in this neighborhood, by the inverse mapping theorem. Therefore, let us take the uniquely defined differentiable inverse mapping F-1 in a neighborhood of the point (a, 0). Now, let us denote by it : R2 —>• R the projection onto the second coordinate, and consider the function /(x)=jroF_1(i,0). DlF(x, y) = 488 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES Solution, a) The volume can be calculated with ease using cylindric coordinates. There, the cylinder is determined by the inequality r < 1; the half-plane z < x by z < r cos z2, x > 0. Solution. First, we should realize what the examined solid looks like. It is a part of a ball which lies outside a given cone (see the picture). The best way to determine the volume is probably to subtract half the volume of the sector given by the cone from half the ball's volume (note that the volume of the solid does not change if we replace the condition x > 0 with z > 0 - the sector is cut either "horizontally" or "vertically", but always to halves). We will calculate in spherical coordinates. x = r cos() dif/ d

z2. Again, we express the conditions in the spherical coordinates: r2 < 1, 3sin2(i/» > cos2(i/», i. e., tan(i/>) > Just like in the case of the ball, we can see that the variables occur independently in the inequalities, so the integration bounds of the variables will be independent of each other as well. The condition r2 < 1 implies r e (0, 1]; from tan(V0 > we have e [0- f ]■ The variable

dr dcp = —=. 73 8.19. The gradient of a function. As we have seen in the previous paragraph, if F is a continuously dif-ferentiable function of n variables, the definition F(x\,..., x„) — b with a fixed value b e R defines m7Jw- a subset M c Rn which often has the properties of an (n — l)-dimensional hypersurface. To be more precise, if the vector of the partial derivatives D F 3xi dx„ is non-zero, we can describe the set M locally as the graph of a continuously differentiable function of n — 1 variables. In this connection, we also talk about level sets Mj,. The vector D1F e Rn is called the gradient of the function F. In technical and physical literature, it is also often denoted as grad F. Since M], is given by a constant value of the function F, the derivatives of the curves lying in M will surely have the property that the differential dF always evaluates to zero on them. Indeed, for every such a curve, we have F(c(t)) — b, hence — F(c(t)) = dF(c'(t)) = 0. at On the other hand, v = (vu...,v„) e R" directional derivative we can consider a general vector and the magnitude of the corresponding \dvF\ = M dx -vi dx„ — cos ( ID Fll where

0, centered at (a, b, c), i. e., given by the equation F(x, y, z) = (x- a)2 + (y - b)2 + (z - c)2 = r2, we get the normal vectors at a point P — (xq, yo, zo) as a non-zero multiple of the gradient, i. e., a multiple of D1F — (2(x0 - a), 2(yo - b), 2(z0 - c)), and the tangent vectors will be exactly the vectors perpendicular to the gradient. Therefore, the tangent plane to a sphere at the point P can always be described implicitly in terms of the gradient by the equation 0 = (x0 - a)(x - x0) + (y0 - b)(y - y0) + (z0 - c)(z - z0)-This is a special case of the following general formula: 490 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES In cylindric coordinates Tangent hyperplane to an implicitly given hypersurface X y z r cos( \, respectively. 7T 7r □ Another alternative is to compute it as the volume of a solid of revolution, again splitting the solid into the two parts as in the previous case (the part "under the cone" and the part "under the sphere". However, these solids cannot be obtained by rotating around one of the axes. The volume of the former part can be calculated as the difference between the volumes of the cylinder x2 + y2 < ^,0 < z < ^ and the cone's part 3x2 + 3y2 R given by multiplying the column of coordinates by the row vector grad F). Clearly, The selected point P satisfies our equation. □ 8.20. A model of illumination of 3D objects. Let us consider illumination of a three-dimensional object where we know the direction i; of the light falling onto the two-dimensional surface of this object, i. e. a set M given implicitly by an equation F(x, y, z) — 0. The light intensity of a point P e M is defined as I cos • Rn. For a fixed choice b — (b\,..., b„), the set of all solutions is, of course, the intersection of all hypersurfaces M(bt, f) corresponding to the particular functions f. The same must hold for tangent directions, while normal directions are generated by particular gradients. Therefore, if D1F is the Jacobian matrix of a mapping which implicitly defines a set M and a point P — 491 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES Solution. We iwll compute the integral in spherical coordinates. The segment can be perceived as a spherical sector without the cone (with vertex at the point [0, 0, 0] and the circular base z, = 1, x2 + y2 = 1). In these coordinates, the sector is the product of the intervals [0, \/2] x [0, 2tc) x [0,7r/4]. We thus integrate in the given bounds, in any order: f/7 Jo Jo Jo r2 sin(Ö) dö dr dcp = -(V2 - l)jt. In the end, we must subtract the volume of the cone. That is equal to \ttR2H (where R is the radius of the cone's base and H is its height; both are equal to 1 in our case), so the total volume is 4 r 1 1 r ^sector - ^cone = ~W2 - 1) - -it = -7r(4V2 - 5). The volume of a general spherical segment with height h in a ball with radius R could be computed similarly: V ^sector ~~ ^cone / / r2 sin(6>) dr dö d

0, y > 0. Therefore, the volume of the whole solid is equal to y = 8/o/o 7W=- dydx = 128. □ Remark. Note that the projection of the considered solid onto both the plane y = 0 and the plane z, = 0 is a circle with radius 4, yet the solid is not a ball. 8.73. Find the volume of the part of the cylinder x2 + y2 by the planes z, = 0 and z, = x + y + 2. 4 bounded For every curve c(t) c M going through P — c(0), we must have that h (c(t)) is an externum of this univariate function. Therefore, the derivative must satisfy ^-A(c(0)|M> = d^o)h(P) = dh(P)(c'(0)) = 0. However, this means that the differential of the function h at the point P is zero along all tangent increases to M at P. This property is equivalent to stating that the gradient of h lies in the normal subspace (more precisely, in its direction space). Such points P e M are called stationary points of the function H with respect to bindings given by F. As we have seen in the previous paragraph, the normal space to our set M is generated by the rows of the Jacobian matrix of the mapping F, so the stationary points are determined equivalently by the following proposition: ___J Lagrange multipliers | - Theorem. Let F = (/i, ..., /„) : Rm+n -> Rn be a differen-tiable function in a neighborhood of a point P, F(P) — 0. Further, let M be given implicitly by an equation F (x, y) — 0, and let the rank of the matrix D1 F at the point P ben. Then P is a stationary point of a continuously differentiable function h : W"+n -> Rwith respect to the conditions F if and only if there exist real parameters k\, ... ,kn such that grad/z = Xi grad/i X„ grad/„. Let us notice that the method of Lagrange multipliers is an algorithmic one. Therefore, let us take a look at the ^,/ numbers of unknowns and equations: the gradients are vectors of m+n coordinates, so the request of the theorem gives m + n equations. The variables are, on one side, the coordinates x\,..., xm+n of the wanted stationary points P with respect to the bindings, and, on the other hand, the n parameters A, in the linear combination. Now it remains to state that the point P belongs to the implicitly given set M, which leads to n more equations. Altogether, we have 2n + m equations for 2n + m variables, so we can expect that the solution will be given by a discrete set of points P (i. e., each one of them will be an isolated point). 8.23. Arithmetic mean-geometric mean inequality. As an example of practical application of the Lagrange multipliers, we will prove the inequality 1 (x\ H-----h xn) > ^/xT for any n positive real numbers x\,... ,x„. Further, we will prove that the inequality holds with equality if and only if all the x, 's are equal. Let us thus take the sum x\ + ■■■ +x„ — c to be the binding condition for a (non-specified) non-negative constant c. We will look for the maxima and minima of the function f (x\, . . . , Xfi) — \JX\ • • • xn with respect to our binding condition and the assumption x\ > 0,...,x„ > 0. The normal vector to the hyperplane defined by the condition is (1,..., 1). Therefore, the function / can have an externum only 493 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES Solution. We will work in cylindric coordinates given by the equations x = r cos(\/2 p2 y V = 4 / / —rdzdrdcp Jo Jo Jfi V2 = 2V2 / / 2r - r3 dr dcp = 2^2 \ dcp Jo Jo Jo = V2ir. □ 8.78. Calculate the volume of the ellipsoid x2 + 2y2 + 3z2 = 1. Solution. We will consider the coordinates x y z r cosfa) sin(ö), ■^rsinfa)sin(ö), TS ^rcos(Ö). 
The corresponding Jacobian is -j^r2 sin(0), so the volume is V = / / sin(Ö) dr dö dcp Jo Jo Jo V6 3V6 7T. Se,^ = £ f(^...in)AXi (here, we write Ai;, Jn for the product of the sizes of the particular intervals from the division defining the little cube with the corresponding indeces) always converge to the value independent of the selected sequence of divisions and representatives. The function / is then said to be Riemann-integrable over I. As a relatively simple exercise, you can prove in detail that every Riemann-integrable function over an interval I must be bounded there. The reason is the same as in the case of univariate functions: we control the norms of the divisions used in the definition somewhat roughly. The situation is much worse if we try to integrate in this way over unbounded intervals, because, unlike integrals in one variable, we cannot replace the wanted result uniquely with the limit of integrals over bounded areas, see ?? below. Therefore, we will further talk about integration of functions over M" only for functions whose support is compact, i. e. functions which take zero outside a bounded interval I. A bounded set M C M" is said to be Riemann measurable iff its indicator function, defined by Xm(xi, > X-n) — 1 for (x\,..., x„) e S 0 for all other points in 1 is Riemann-integrable over W. For any Riemann-measurable set M and a function / defined at all points of M, we can consider the function / — xm • / as a function defined on the whole W, and this function / apparently has a compact support. The Riemann integral of the function / over the set M is defined by / / dx\ ... dxn — / Jm Jr" fdx\ ... dx„, supposing the right-hand integral exists. This definition of the Riemann integral does not provide reasonable instructions how to compute the values of integrals. However, it immediately leads to the following basic properties of the Riemann integral (cf. Theorem 6.24): 8.26. Theorem. The set of Riemann-integrable real-valued functions over an interval I C M" is a vector space over the real scalars, and the Riemann integral is a linear form there. If the integration domain S is given as a disjoint union of finitely many Riemann-measurable domains S{, the integral over a function f over S is given by the sums of the integrals over the particular domains Si. Proof. All the properties follows directly from the definition of the Riemann integral and the properties of convergent sequences of real numbers, just like in the case of univariate functions. We advise to think out the details by yourselves. □ Now, let us rewrite the theorem into usual equalities: 496 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES □ 8.79. Remark. Note that if the transformation the coordinates is linear (and affine), then the space is deformed "uniformly". This means that the volume of an arbitrary solid is changed proportionally to the change of the volume of an infinitesimal volume element, which is the Jacobian. Therefore, if we consider the volume of the ball with a given radius r to be known, (in this case, r = 1), we can infer directly that the volume of the ellipsoid is V i 76 T7t 376 It. 8.80. Find the volume of the solid which is bounded by the paraboloid 2x2 + 5V2 = z, and the plane z = 1. Solution. We choose the coordinates x = -j^r cos(cp), y = ^rsinOp), z = z. The determinant of the Jacobian is so the volume is /» 2tt /»1 /»1 Jo Jo J i2 V 10 dz dr dcp 7T 2V10 □ 8.81. Find the volume of the solid which lies in the first octant and is bounded by the surfaces y2 + z2 = 9 and y2 = 3x. 
Solution. In cylindric coordinates, V prc/2 p3 p Jo Jo Jo r-j cos2 o?) 27 r dx dr dw = —jt. * 16 □ 8.82. Find the volume of the solid in M3 which is bounded by the cone part 2x2 +y2 = (z — 2)2, z > 2 and the paraboloid 2x2 + y2 = 8 — z. Solution. First of all, we find the intersection of the given surfaces: (z - 2)2 -z + 8, z > 2; _| Finite additivity and linearity J_ The first part says that a linear combination (over scalars in R) of Riemann-integrable functions f■ : I -> R, i — 1,..., k is always a Riemann-integrable function, and it can be computed as follows: a\f\(x\, ..., x„) H-----\-akfk(xi, ...,xn)\dx\. ..dx„ a\ j f\(x\, ..., x„) dx\ ... dxn + ----Vak j fk(x\, ..., x„) dx\ .. .dx„. The second part then says that for disjoint Riemann-measurable sets Mi and M2 and for a function / : Rn —>• R which is Riemann-integrable over both these sets, we have that / Ja MlUM2 f (x\, ..., x„) dx\ ... dx„ = / f(xi,...,xn)dxi...dxn + JM\ j JM2 f(x\, ..., x„) dx\ ... dx„. 8.27. Multiple integrals. We will see in a while that Riemann-jSt'* integrable functions especially involve the cases when the integration domain M can be defined by a continuous function dependency of the coordinates of boundary points so that, given the first coordinate x, we can define the range of the next coordinate by two functions, i. e., y e [(f(x), i[r(x)], then the range of the next coordinate by z e [n(x, y), r(x, y)], and so on for all of the other coordinates. We can indeed do this in the case of our ball from the introductory example: for x e [—1, 1], we define the range for y as y e [—VI — x2, Vl — x2]. The volume of the ball can then be computed by integration of the mentioned function /, or we can integrate the indicator function of the ball, i. e. the function which takes one on the area S c R3 which is defined by z e ' r" ^ ~ r" ^ ~~ TT y2]. The following theorem is fundamental for this. It transforms the computation of a Riemann integral to a gradual computation of several univariate integrals (while the other variables are considered to be parameters, which can thus appear in the integration bounds as well). ___| Multiple integrals J___ Theorem. Let M c E continuous functions be a bounded set given, as above, by M = {(xi, ..., x„); xi e [a, b], x2 e [ih(xi), n2(xi)], x„ e [i/n(x\, ...,xn-i),nn(xi, ...,x„_i)]}, and f be a function which is continuous on M. Then the Riemann integral of the function f over the set M exists and is given by the 497 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES therefore, z = 4, and the equation of the intersection is 2x2 + y = 4. The substitution x = r cos fa), y = r sin fa), z, = z transforms the given surfaces to the form r2 = (z — 2)2, z, > 2, and r2 = 8 — z, i. e., z = r+2 for the former surface and z, = 8—r2 for the latter. Altogether, the projection of the given solid onto the coordinate cp is equal to the interval [0, 2it]. Having fixed a cp0 e [0, 2it], the projection of the intersection of the solid and the plane cp = ip0 onto the coordinate r equals (independently of cpo) the interval [0, 2]. Having fixed both ro and cpo, the projection of the intersection of the solid and the line r = ro, cp = cpo, onto the coordinate z, is equal to the interval [ro +2, 8—r^ ]. The Jacobian of the considered transformation is j V2 r, so we can write V p2n p2 pS-r2 0 JO Jr+2 V2 dz, dr dcp 16V2 -TV. □ 8.83. Find the volume of the solid which lies inside the cylinder y2 + z" = 4 and the half-space x > 0 and is bounded by the surface y2 + j2 -2 + 2x z" + zx = 16. Solution. 
In cylindric coordinates, V f'ff Jo Jo Jo r dx dr dcp = 2&it. □ 8.84. The centroid of a solid. The coordinates (xt, yt,zt) of the cen-troid of a (homogeneous) solid T with volume V in M3 are given by the following integrals: Zt E E E xdxdy dz, y dx dy dz, z dxdydz. Jo ^0 ^2 2 dy dx s/6 in Jo dy' dx' TV 10 JO 4V6' The other integrals we need can be computed directly in Cartesian coordinates x and y: x dy dx 73 / 1 - 3x2 Xa/-dx 2 1 - 3f dt V2 18 ' formula The centroid of a figure in M2 or other dimensions can be computed analogously. 8.85. Find the centroid of the part of the ellipse 3x2 +2/ = 1 which lies in the first quadrant of the plane M2. Solution. First, let us calculate the volume of the given ellipse. The transformation x = -j^x1 ,y = -jj/ with Jacobian ^ leads to / f(xi, x2, ..., x„)üfxi .. .dx„ — / 1/ Jm Ja \J\lr2 a rin(x\,...,Xn-\) f(x\, xi, ■ ■ ■, xn) dxn ) ... dx2 )dxi fn(xi,---,Xn-i) Proof. First of all, we will go through the proof for the case of two variables, and then we will see that there is no need of further ideas in the general case. Consider an interval / — [a,b] x [c, d] containing our set M — {(x, y); x e [a,b],y e [ijf(x), n(y)]} and divisions S of the interval / with representatives^-. The corresponding integral sum is SS,5=£/(£y)A*l7 '•j = E(E/^'))a^)Ax'' where we write Axy for the product of the sizes Ax, and Ax; of the intervals which correspond to the choice of the representative Now, let us assume that we work only with choices of representatives l-ij which all share the same first coordinate x,. If we leave the division of the interval [a, b] and refine only the division of [c, d], the values of the inner sum of our expression will approach the value of the integral Pl(Xi) Jy(xi) f(xt, y)dy, which surely exists since the function f(x{,y) is continuous. Moreover, we thus obtain a function which is continuous in the free parameter x;, see 8.24. Therefore, further refinement of the division of the interval [a, b] leads, in limit, to the desired formula ^2SiAxi^S = j (j f(x,y)dy\dx. It remains to deal with the case of general choices of representatives of general divisions S. However, since we are working with a continuous function / on a compact set, it is actually uniformly continuous there. Therefore, if we select a small real number e > 0 beforehand, we can always, for the norm of a division, find a bound S > 0 so that the values of the function / for the general choices xtj differs by no more than e from the choices used above. The limit processes thus result in the same for general Riemann sums 63,5 as we saw above. Now, the general case can be proved easily by induction. In the case of n — 1, the result is trivial. The presented reasoning can easily be transformed for a general induction |x step, writing (x2,... ,x„) instead of y, having x\ instead of x, and perceiving the particular little cubes of the divisions as (n — 1)-dimensional cubes Cartesian-multiplied by the last interval. In the last-but-one step of the proof, we just use the induction hypothesis, rather than the simple one-dimensional integration. The final argument about uniform continuity remains 498 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES 1 . /\-3x2 1 [H 1 - 3x2 y dy dx = - / -:-dx r?3 m— l r / / y dy dx = - Jo Jo z Jo 4~1 1 V3 (1 - 3x2)dx = —. 18 Therefore, the coordinates of the centroid are — ]. □ 8.86. Find the volume and the centroid of a homogeneous cone of height h and circular base with radius r. Solution. 
Positioning the cone so that the vertex is at the origin and points downwards, we have in cylindric coordinates that V =4 rTr/2 pr ph Jo Jo hp 1 2 p dz dp dip = —Tthr Apparently, the centroid lies on the z-axis. For the z-coordinate, we get 1 r 1 r12 r rh 3 z = 77 / - zdV = 77 / / / zpdzdpdcp =-h. V JkuLzel V Jo Jo J±n 4 TP Thus, the centroid hes -\h over the center of the cone's base. □ 8.87. Find the centroid of the solid which is bounded by the paraboloid 2x2 + 2y2 = z, the cylinder (x + l)2 + y2 =0, and the plane z = 0. Solution. First, we will compute the volume of the given solid. Again, we use the cylindric coordinates (x = r ■ cos cp, y = r ■ sin cp, z, = z), where the equation of the paraboloid is z = 2r2 and the equation of the cylinder reads r = —2 cos(tp). Moreover, taking into account the fact that the plane x = 0 is tangent to the given cylinder, we can easily determine the bounds of the integral that corresponds to the volume of the examined solid: /» /»— 2 cos pl^r2 V = I I I r dz dr dtp Ji Jo Jo 3zl /| fo L -2 cos c 2r dr dtp i cos ittp = 3jz, where the last integral can be computed using the method of recurrence from 6.22. Now, let us find the centroid. Since the solid is symmetric with respect to the plane y = 0, the y-coordinate of the centroid must be zero. Then, the remaining coordinates xT and zt of the centroid can the same. We advise to go through this proof in detail as an exercise. □ FuBINl's theorem 8.28. Corollary. For a multidimensional interval M — [a\, b\\ x [a2, bi\ x ... x [an, bn] and a continuous function f(x\,..., x„) on M, the multiple Riemann integral Jm f(xi fbi i>b2 rbn J a\ Jai Jüf, f(x\, ..., x„) dx\ ... dx„ is independent of the order in which the integrations are performed. Proof. In the case of a multidimensional interval M in the previous theorem, any order of integration expresses the area M in the required form. Therefore, the order of integration has no effect upon the result of the integral. □ The possibility of changing the order of integration in multiple integrals is extremely useful. We have already taken advantage of this result, namely when studying the connection of Fourier transforms and convolutions, see paragraph 7.9. Our simple derivation of Fubini's theorem builds upon the simple properties of Riemann integration and the continuity of the integrated function. Fubini, in fact, proved this result in a much more general context of integration, while the theorem we have just introduced was used by mathematicians like Cauchy at least a century before Fubini. We can also notice that we have defined no concept of an improper integral for unbounded multivariate functions. You can verify that it is quite impossible to do this in a reasonable way, just consider the following example of two multiple integrals: fo (io y (x y)3 - y \ l dy Jdx — — ■\dy=-: 3 dxJ dy (x + yy The reason can be felt already from the properties of non-absolutely converging series. There, rearranging the summands can lead to an arbitrary result. The situation is a bit better if we calculate the Riemann integral of a bounded Riemann-integrable function f(x) with non-compact support over the whole M". If there is a universal bound / (x) dx < C with a constant C independent of the choice of an n-dimensional interval I, then it is possible to define / f(x) dx — lim / Jw r^°° Jir f(x) dx, where Ir — {{x\,..., x„); \xj \ < r, j — 1,..., n], and the result is, of course, bounded by the same constant C. In this case as well, 499 CHAPTER 8. 
CONTINUOUS MODELS WITH MORE VARIABLES be computed by the following integrals: 1 xT = — V 1 V 1 V 1 V j j j xdxdydz f f / Jo Jf Jo /i L \:-[ 10 -2 cos a r cos cp dz dr dcp 2r coscp dr dph ■ cos cp dcp 4 3' where the last integral was computed by 6.22 again. Analogously for the z-coordinate of the centroid: Zt V Í L Í -2 cos c 20 zr cos cp dz dr dcp = —. The coordinates of the centroid are thus [—4,0, 2§-]. □ 8.88. Find the centroid of the homogeneous solid in M3 which lies between the planes z, = 0 and z, = 2, bounded by the cones x2 + y2 = z2 andx2 + y2 = 2z2. Solution. The problem can be solved in the same way as the previous ones. It would be advantageous to work in cylindric coordinates. However, we can notice that the solid in question is an "annular cone": it is formed by cutting out a cone Ki with base radius 4 of a cone K2 with base radius 8, of common height 2. The centroid of the examined solid can be determined by the "rule of lever": the centroid of a system of two solids is the weighted arithmetic mean of of the particular solids' centroids, weighed by the masses of the solids. We found out in exercise || 8.86|| that the centroid of a homogeneous cone is situated at quarter its height. Therefore, the centroids of both cones lie at the same point, and this points thus must be the centroid of the examined solid as well. Hence, the coordinates of the wanted centroid are [0, 0, 4]. □ 8.89. Find the volume of the solid in M3 which is bounded by the cone part x2 + y2 = (z — 2)2 and the paraboloid x2 + y2 = 4 — z. Solution. We build the corresponding integral in cylindric coordinates, which evaluates as follows: 2jt /•! /-4-r2 V r> All /»1 /»4—/ JO h) Jr+2 r dz dr dcp = —it. □ 8.90. Find the volume of the solid in M3 which lies under the cone x2 + y2 = (z — 2)2, z < 2 and over the paraboloid x2 + y2 = z. Solution. V r>2iz /»1 r>2—r JO JO Jr2 r dz dr dcp = —jx. Note that the considered solid is symmetric with the solid from the previous exercise || 8.891| (the center of the symmetry is the point [0,0,2]). Therefore, it must have the same volume. □ Fubini's theorem holds in the form / f(x)dx= / •••( / f(x)dxi) ...dx„. Jr" J-qo V-oo 8.29. Notes about integration. The Riemann integral of multivariate functions behaves even worse than we have seen in the case of func-I tions of one variable in the sixth chapter. Therefore, more V-1 sophisticated approaches to integrations have been developed, which are derived from the concept of the measure of a set. Let us take a quick look at this problem. We can consider the strict analogy of the lower and upper Riemann integrals for univariate functions. This means taking infima or suprema, respectively, over the corresponding multidimensional interval, instead of the function values at the representatives in the Riemann sums. For bounded functions, we always get well-defined values this way, and if we do this for the indicator function xm of a fixed set M, we get the so-called inner and outer Riemann measure of the set M. Apparently, the inner measure is the limit of the areas given by the sum of the volumes of all intervals from our divisions which are inside M, and, on the other hand, the outer measure is given by the sum of the volumes of intervals covering M. It follows directly from the definition that a set M is Riemann-measurable if and only if its lower and upper measures are equal. The sets whose outer measure is zero are, of course, Riemann-measurable. We call them measure-zero sets or null sets. 
It can be shown quite easily that Riemann-integrable functions are exactly those bounded functions with compact support whose set of discontinuity points has measure zero. Surely, this definition makes the measure finitely additive, i. e., a disjoint union of finitely many measurable sets is again a measurable set, and its measure is given by the sum of the measures of the sets being united. However, unlike in the case of one variable, now it does not hold that a countable disjoint union of measurable sets is measurable, so we must expect problems with limit approaches, as we have seen in the case of one variable. If we restrict ourselves to the vector space Sc (Rn) of all continuous functions with compact support, we can proceed in the same way as in the seventh chapter, i. e., we can define, for functions / e SC(R"), their norms ll/l (/ \f(x\, ... ,xn)\p dxi ...dx„ VP for all values 1 < p < oo. Thanks to the Riemann integral having been defined in terms of divisions, the properties of the norm can be verified in the same way as for univariate functions, using Holder's and Minkowski's inequalities. We thus get the metric spaces Cp. As we have known from the general theory, its completion exists (and it is determined uniquely, up to isometry), and it can be shown that it will again be a space of functions. Moreover, a more general theory of integration can be developed so that the norms on these complete spaces would be given by the same formulae as above. However, we will not go deeper into these parts of mathematical analysis here. 500 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES 8.91. Find the centroid of the surface bounded by the parabola y = 4 — x2 and the line y = 0. O 8.92. Find the centroid of the circular sector corresponding to the angle of 60° that was cut out of a disc with radius 1. O 8.93. Find the centroid of the semidisc x2 + y2 = 1, y > 0. O 8.94. Find the centroid of the circular sector corresponding to the angle of 120° that was cut out of a disc with radius 1. O 8.95. Find the volume of the solid in M3 which is given by the inequalities z > 0, z - x < 0, and (x - l)2 + y2 < 1. O 8.96. Find the volume of the solid in M3 which is given by the inequalities z > 0, z-y <0. O 8.97. Find the volume of the solid bounded by the surface 3x2 + 2y2 + 3z2 + 2xy - 2yz - 4xz = 1. 8.98. Find the volume of the part of M3 lying inside the ellipsoid 2x2 + y2 + z2 = 6 and in the half-space x > 1. O 8.99. The area of the graph of a real-valued function f(x, y) in variables x and y. The area of the graph of a function of two variables over an area S in the plane x y is given by the integral P = f Jl+fl + ffdxdy. Considering the cone x2 + y2 = z2. find the area of the part of its lateral surface which lies above the plane z, = 0 and inside the cylinder x2 + y2 = y. Solution. The wanted area can be calculated as the area of the graph of the function z = fx2 + y2 over the disc K: x2 — (y — ^)2. We can easily see that x y fx =-, / = —-—, x x2 + y2 ' y x2 + y2 ' so the area is expressed by the integral jjJl + f2 + f2Axdy = jjf V5 dx dy V 2 / / r dr dcp =- / sin cp Jo Jo 2 J0 V2.7V □ 8.100. Find the area of the parabola z, = x2+y2 over the disc x2+y2 < 4. O 8.101. Find the area of the part of the plane x + 2y + z, = 10 that lies over the figure given by (x — \)2 + y2 < \ and y > x. Q In the following exercise, we will also apply our knowledge of the theory of Fourier transforms from the previous chapter. 8.30. Differentiation with respect to parameters. 
Now, we can finally deal with the promised dependency of integrals upon parameters. The following result is / highly applicable. For instance, we can use it when examining integral transforms, which we talked about in the second part of chapter seven. Now, our previous results about extrema of multivariate functions also have a direct application for minimization of areas or volumes of objects defined in terms of functions dependent on parameters. __J Differentiation with respect to parameters |„__ Theorem. For a continuous function f(x, y\,..., yn) definedfor all x from a finite interval [a, b] and for all (y\,..., yn) lying in some neighborhood U of a point c — (c\,..., c„) € M", consider the integral f Ja F(yu ..., y„) — / f(x,yi,...,yn)dx. O If there exists a continuous partial derivative M- on a neighbor- ly j hood of the point c, then (c) exists as well, and we have 8F fh df ~z—(c) — / 7—(x,c\, ... ,cn)dx. oyj J a oyj Proof. Thanks to the considered continuity of all functions, we can easily utilize our knowledge about univariate antiderivatives, and the result will be a simple consequence of Fubini's theorem. Since all the other parameters yj play only the passive role of a constant parameter in our reasonings, we can assume without loss of generality that there is only one parameter y. Let us denote G(y) -f Ja (x, y) dx, F(y) -f Ja fix, y) dx and compute, invoking Fubini's theorem, the antiderivative Hiy) = I* G(y)dy Jyo = Fiy) - F(y0). Ja \Jyo df \ — ix,y)dy)dx dy ) Finally, differentiating with respect to y yields dH dF G(y) = —(y) = —(y), ay ay which is what we have wanted to prove. □ 8.31. Change of coordinates at integration. When calculating integrals of univariate functions, we used coordinate transformations as an extraordinarily powerful tool. $KJtO% The situation is very similar in the case of functions ^%s»-i— of more variables. First, let us recall (with an appropriate interpretation for the subsequent generalization) the transformation for a single variable. The integrated expression / (x) dx describes the area of a rectangle defined by a (linearized) increase of the variable x and by the value 501 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES 8.102. Fourier transform and diffraction. Light intensity is a physical quantity which expresses the transmission of energy by waves. The intensity of a general light wave is defined as the time-averaged magnitude of the Poynting vector, which is the vector product of mutually orthogonal vectors of electric and magnetic fields. A monochromatic plane wave spreading in the direction of the j-axis satisfies cs0 If r Jo El dt, where c is the speed of light and eo is the vacuum permittivity. The monochromatic wave is described by the harmonic function Ey = \jr(x, t) = A cos(a>t —kx). The number A is the maximal amplitude of the wave, a> is the angular frequency, and for any fixed t, the so-called wave length a is the prime period. The number k then represents the speed & = y" at which the wave propagates. We have / = cs0- f E2 dt = cs0- f A2 cos2(cot — k x) dt = * Jo ? Jo 1 fT 1 + cos(2((ot - k x)) cs0A2- f —' - dt * Jo 2 1 sin(2(atf — k x)) ,T -cs0A2-[t + 2co -I x 1 91 , sin(2( 1 9/ sin(2((wr — k x)) — sin(2(—k x)) , -ce0A2(l +---) = 2 2(or -cs0A f(x). 
If we transform the variable by a relation x — u(t), then the linearized increase can be expressed as du dx — —dt, dt and so the corresponding contribution for the integral is given by du f(u(t))—dt, dt where we either suppose that the sign of the derivative u'(t) is positive, or we interchange the bounds of the integral, so that the sign takes no effect in the result. Intuitively, the procedure for n variables is quite similar. We only have to use our knowledge about the volume of parallelepipeds from linear algebra. We use, for Riemann integrals in the Riemann sums, an approximation which takes the volume (area) of a small multidimensional interval and multiplies it by the value of the function at the representative point. If we transform the coordinates, we not only get the function value at the representative point in a new coordinate expression, but we also have to account for the change of the area or volume of the corresponding small multidimensional interval. Once again, this is the case of a linear approximation of a change, which we know well — this is actually an action of the linear approximation of the used transformation, i. e. an action of the Jacobian matrix, see 8.14. The change of the volume is then given (in absolute value) by the determinant of this matrix (see our discussion of this topic in linear algebra, especially 4.22). Transformation of coordinates |_, The second term in the parentheses can be neglected since it is always less than ^ = jfr < 10~6 f°r real detectors of light, so it is much inferior to 1. The light intensity is directly proportional to the squared amplitude. A diffraction is such a deviation from straight-line propagation of light which cannot be explained as the result of a refraction or reflection (or the change of the ray's direction in a medium with continuously varying refractive index). The diffraction can be observed when a lightbeam propagates through a bounded space. The diffraction phenomena are strongest and easiest to see if the light goes through openings or obstacles whose size is roughly the wavelength of the light. In the case of the Fraunhofer diffraction, with which we will deal in the following example, a monochromatic plane wave goes through a very thin rectangular opening and projects on a distant surface. For instance, we can highlight a spot on the wall with a laser pointer. The image we get is the Fourier transform of the function describing the permeability of the shade - opening. Let us choose the plane of the diffraction shade as the coordinate plane z = 0. Let a plane wave A exp(ikz) (independent of the point (x, y) of landing on the shade) hit this plane perpendicularly. Let s(x, y) denote the function of the permeability of the shade, then the resulting waves falling onto the projection surface at a point (§, rj) can be described as the integral sum of the waves (Huygens-Fresnel principle) which have gone through the shade and propagate through the medium from all points (x, y, 0) (as a spherical wave) into the point (§, n, z): Theorem. Let G(t\ Jn) (Xl, G(t\, ..., t„), be a continuously differentiable mapping, G(M) and M be Riemann-measurable sets, and f : M -continuous function. Then, ' Xfi) — N = Jm f(xi / Jn f(G(h,tn)) det^Gfa,..., tn))\dti ...dtn. Proof. Since we are working with a continuous function / and a differentiable change of coordinates, the integrals on both sides of the equality to be proved apparently exist. Therefore, we only need to prove that their values are indeed equal. 
Let us denote our composite function by ,tn) = f(G(t1, tn)), and choose a sufficiently large n-dimensional interval I containing TV and its division S. The entire proof is nothing more than a more exact writing of the discussion presented above. First, let us notice two things: The images of the boundaries of our interval hx...in are differentiable objects (sides, edges, etc.); in particular, they will again be Riemann-measurable sets. For each little part hx...in of our division S, the integral / over Jix..xn — G(Iix...in) surely exists. Further, if we fix the center f(1.. Jn °f me interval hx...in, then we get the linear image of this interval (note that we map the interval 502 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES J Jm2 rf) = A I I s(x, y)e l2 -ik(^x+rff) áx dy p/2 r-q/2 rf) = A -p/2 J- -ik(&+rff) dy áx p/2 J -q/2 /p/2 r-q/2 e-^x áx / e-ikny dy -p/2 J-q/2 ~ g—ik^x ~ p/2 -£-iktjy - q/2 -p/2 _ —ikt] _ -q/2 2sin(& i-p/2) 2sin(& rjq/2) sin(& i-p/2) sin(& rjq/2) A--= Apq-- ki; krj k^p/2 kriq/2 The graph of the function f(x) = looks as follows: The graph of the function f (£, v) = then does: And the diffraction we are describing: shifted to the origin with the linear mapping given by the Jacobian matrix, and the result is then added to the image of the center) */,...«„ = G(f,in) + DiG(f, .,)(/,-, l\...ln ^ \'l\...lnJ ' ^ \ll\...ln)\1-l\...ln H\...ln) an n -dimensional parallelepiped. If our division is very fine, this parallelepiped differs only a bit from the image Jiu...in. Exactly speaking, thanks to the uniform continuity of the mapping G, we can, for an arbitrarily small e > 0, find a norm of the division such that we will have for all finer divisions that G(th...in) + (1 + e)D1G(tl,... tn)(Ih...in) d Jh...ik. However, then the n-dimensional volumes will also satisfy voln(Jil...in) < (l+s)"voln(Rh...in) = (l+e)"|detG(fil..Jt)|vol„(/il..JJ. Now, we are able to bound the whole integral from above: Jm f(xi ^) dx Y • • • djCfj — f(xi, ...,xn)dxi ...dxn l\-ln '1-'" < SUP g)VOl„ (/^...ij il...iJ^'-'t^eI'l-'« < (1 + e)n ( SUP *)|det G(th...ik)| vol„(/,■,...,„). h...in(fu-'tn)&Ii^n Letting the norms of the divisions approach zero, the left-hand value remains the same, while on the right side, we obtain the Riemann integral. Instead of the equality to be proved, we get the inequality: / f(xi,...,x„)dxi...dx„ Jm Jn f(G(h ,t„))\áet(DlG(tu , t„))\ dt\ ... dt„. However, now we can repeat the same reasoning so that we interchange G and G-1, the integration domains M and N, and the functions / and g. We thus immediately obtain the other inequality: / Jn g(tu tn)\det(DlG(ti,tn))\dti ...dtn f f(xu...,xn)\dzt(DlG(G-l(xu Jm Idet^G-1^! Jm which finishes the proof. , x„))\dxi ■ , x„)))\ . dXfj □ 8.32. An example in two dimensions. The coordinate transfor-fSU mations are quite transparent for the integral of a con-i'^i^r^ tinuous function f(x, y) of two variables. Consider the differentiable transformation G(s, t) — (x(s, t), y(s, t)). Denoting g(s, t) — f(x(s, t), y(s, OX we get dx dy dx dy / f(x, y)dxdy — \ g(s, t) Jg(N) Jn ds dt dt ds dsdt. 503 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES M0°rad Orad L-I■ 10 rad Since lim^^o ^f- = 1, the intensity at the middle of the image is directly proportional to 7o = A2p2q2. The Fourier transform can be easily scrutinized if we aim a laser pointer through a subtle opening between the thumb and the index finger; it will be the image of the function of its permeability. 
The image of the last picture can be seen if we create a good rectangular opening by, for instance, gluing together some stickers with sharp edges. I. Applications of Stoke's theorem - Green's theorem 8.103. Compute (x — y)dx + x Ay, where c is the positively oriented curve represented by the perimeter of the square ABCD with vertices A = [2,2]; B = [-2, 2];C = [-2, -2]; D = [2, -2]. Solution. Using Green's theorem (see 8.44), we reduce the given curve integral to an area (multiple) integral. The integral is of the form / fix, y) Ax + g(x, y) Ay, where fix, y) = x — y and g(x, y) = x. c The needed partial derivatives of the functions f(x,y) and g(x,y) are thus fy(x, y) = — 1 and gx(x, y) = 1. All of the functions fix, y), g(x, y), fy(x, y), and gx(x, y) are continuous on Green's theorem: so we can use / (x — y) Ax + x Ay II (1 + 1) Ax Ay If Ax Ay 2 2 2J f dxdy = 2[x]2_2.[y]2_2 = 32. -2 -2 □ 8.104. Compute / x Ax + xy Ay, where c is the positively oriented curve going through the vertices A [0, 0]; B = [1,0]; C = [0, 1]. As a truly simple example, we can calculate the integral of the indicator function of a disc with radius R (i. e. its area) and the integral of the function fit, 6) — cos(f) defined in polar coordinates inside a circle with radius \it (i. e. the volume hidden under such a "cap placed above the origin", see the picture). First, we determine the Jacobian matrix of the transformation : r cos 9, y — r sin# ^cos9 — rsin#N . sin 6 r cos 6 D G = Hence, the determinant of this matrix is equal to AetDlGir, 6) = r(sin2 6 + cos2 6) = r. Therefore, we can calculate directly for the disc S which is the image of the rectangle (r, 6) e [0, R] x [0, 2it] — T. We thus get the area of the disc: r r2lZ rR rR I dxdy — I J rdr d6 — I Js Jo Jo Jo Inrdr — 7t R The integration of the function / will be very similar, using multiple integration and integration by parts: <>2jt pjt/2 / fdxdy = / / Js Jo Jo r cos rdr dB — ?r2 2tt. 8.33. Curve integrals. We often cannot do with integrals over J§L'* °Pen su^sets m fl^" because our quantities are given only on objects which are similar to curves or sur-'S^^^^^M faces in M3. The previous reasoning about changes of coordinates when computing integrals clarified the intuitive imagination that our process of integration is the sum of volumes of small linearized parallelepipeds multiplied by the value of the integrated function. Extending this idea, we could define integration over such multidimensional surfaces in M" directly. However, we will first relieve the integration of dependency on coordinates, and then we will transform it to the well-known integration on M". Recall the calculation of the length of a curve by univariate integrals, which was discussed in paragraph 6.7 on page 353. The curve was parametrized as a mapping c(t) : R —>• M", and the size of the tangent vector || c' it) \\ was expressed in the Euclidean vector space. This procedure was given by the universal relation for an arbitrary tangent vector, i. e., we actually found p : Rn —>• R which gave the true size when evaluated at c'(t). This mapping satisfied p(a v) — \a\p(v) since we ignored the orientation of the curve given by our parametrization. If we wanted a signed length, respecting the orientation, then our mapping p would be linear on every one-dimensional subspace id". 504 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES Solution. The curve c is the boundary of the triangle ABC. 
The integrated functions are continuously differentiable on the whole M2, so we can use Green's theorem: 1 -x + l J x4 Ax + x y Ay = J J y Ax Ay = J J y Ax Ay = D 1 / 0 0 2 -x + l 2X2 t ~ ~2 Ax + x 1 / x2 -2x + l Ax □ 8.105. Calculate / (xy + x + y) Ax + (xy + x — y) Ay, where c is the circle with radius 1 centered at the origin. Solution. Again, the prerequisites of Green's theorem are satisfied, so we can use Green's theorem, which now gives J (xy + x + y) Ax + (xy + x - y) Ay c = jj y + \- x- \AxAy D 1 2jt = J J r2 (sin cp — cos 0, i. e., if we keep the orientation of the curve, and the same value up to sign if the derivative of the transformation is negative. Precisely speaking, we have learned to integrate the differential df of a function over curves. However, it may be the case that the connection with integration of functions is not apparent. We clearly cannot get the length of the curve if we select a constant function with value one for /. We need a geometric point of view to explain this. The size of a vector is given by a quadratic form, rather than a linear one. However, if we take the square root of the values of a (positively definite) quadratic form, we get a linear form (up to sign, see above). We will get back to these connections shortly. 8.34. Vector fields and linear forms. In the previous paragraph, the parametrization of a curve was used to obtain a tangent vector c'(t) e R" for every point in the image M of the curve. We thus have a mapping X : M M x R", c(t) \-+ (c(t), c'(t)). We talk about the vector field X along the curve M. In general, we define a vector field X on an open set U C R" as assigning the vector X(x) e R" in the direction space of the Euclidean space R" to every of its points x in the considered domain. If a vector field X on an open set U C R" is given, then we can define for every differentiable function / on U its derivative in the direction of the vector field X in terms of the directional derivative by the formula X(f):U^R, X(f)(x)=dx(x)f. (Xl(x),...Xn(x)), Therefore, if we have, in coordinates, X(x) then X(f)(x) df Xl(x)-^(x) OX l df + Xn(x)-^(x). dxn The simplest vector fields will have all coordinate functions equal to zero except for one function X, which will be constantly equal 505 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES y = 2r sincp re [0, 1], leading to (the Jacobian of the transformation is 6r): / (2e2x sin y - 3y3)dx + (e2x cos y + -x3) dy = jj 2e2x cos y + 4x2- (2e2x cos y - 9y2) Ax Ay D 1 2jt j j 6r [4(3r cos cp)2 + 9(2r sin • Rn* on an open subset U, we talk about a linear form r\ on U. Every differentiable function / on an open subset U c R" defines a linear form df on U. We use the notation Ql (U) for the set of all smooth linear forms on U. It is apparent that in the coordinates (xi,..., x„), we can use the differentials of the particular coordinate functions to express every linear form rj as rj (x) — rji(x)dx\ + • • • + rjn(x)dxn, where m (x) are uniquely determined functions. Such a form rj is evaluated at a vector field X(x) — Xi(x)-^- H-----h X„ (x) by V(X(x)) = m(x)Xi(x) + ■■■ + Vn(x)X„(x). If the form rj is the differential of a function /, we get just the expression X(f)(x) — df(X(x)) used above. 
Let us notice that we have actually defined the integral of any linear form over (non-parametrized) curves M in terms of an arbitrary parametrization c(t) J-f JM Ja V(c(t))(c'(t))dt, since although we worked with the function differential back then, we actually verified that the value of the integral was independent of the choice of parametrization for any linear form. We can also notice that we need not write any symbol denoting which concept of a volume we are integrating with respect to. It is given by the definition of a linear form. 8.35. /-dimensional surfaces and &-forms. Instead of parametrized curves, we will now work with differentiable mappings cp : V • M c R" which can be extended to a mapping

• R" which is a diffeomorphism and (p~l (M) — V x {0}. This definition, which might seem complicated at the first sight, is illustrated by a picture. Manifolds can be typically given by implicit mappings, see paragraph 8.18 and the discussion in 8.19. 506 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES where c is the positively oriented circle x2 + y2 =9. 8.110. Compute the integral O / 1 y3 1 x3 (- + 2xy - —) dx + (- + x2 + —) dy, x 3 y 3 where c is the positively oriented boundary of the set D = {(x,y) e M2 : 4 < x2 + y2 < 9, < y < V3x}. O 8.111. Remark. An important corollary of Green's theorem is the formula for computing the area D that is bounded by a curve c. m(D) -y dx + x dy. 8.112. Compute the area given by the ellipse ^ + = 1- Solution. Using the formula ||8.111|| and the transformation x a cos t,y = b sin t, we get for t € [0, 2it] that m(D) -y dx + x dy if c 2n -a cos t ■ b cos tdt--/ 2 J 2 J o 2jt 2n 1 Í 2 1 Í 2 -ab I cos tdt -\—ab I sin tdt 2 J 2 J 2jt 2jt a cos t ■ b cos tdt — - I b sin t ■ (—a sin t)dt 2n —ab f i 2 J ___ . cos21 + sin2 tdt = -ab2it = ital 2 2 which is indeed the well-known formula for the area of an ellipse with semi-axes a and b. □ 8.113. Find the area bounded by the cycloid which is given paramet-rically as \jr{t) = [a(t — siní); a(l — cos f)L for a > 0, t e (0, 2jt), and the x-axis. Solution. Let the curves that bound the area be denoted by c\ and c2. As for the area, we get (D) = \ L -ydx +x áy + \ 4 - ydx +x dy- m Now, we will compute the mentioned integrals step by step. The parametric equation of the curve c\ (a segment of the x-axis) is (t; 0); t € [0; 2ait], so we obtain for the first integral that -ydx+xdy = - 0 • 1 dt + / t ■ Odt = 0. 2 Jo Jo The parametric equation of the curve c2 is \li(t) e (a(t — sin t), a(\ — cost e \_2tt; 0]. The mapping

• M of a manifold M, an t]( Ak(TxM)* that the pullback of this form by any parametrization yields a smooth exterior &-form on V. We will use the notation £2k (M) for the set of all smooth exterior &-forms on M. 8.36. Outer product of exterior forms. Given a &-form a e A" and an ü-form, e A*R°*, we can create a (k + £)-form a a p by all possible permutations a of the arguments. We just have to alternate the arguments in all possible orders and take the right sign each time: (aAp)(Xu...,Xk+l) = 1 x — sign((r)q!(Zfr(i), Xa(k) )P(Xa(k+i), Xa{k+£) )• It is clear from the definition that a a p is indeed a (k In the simplest case of 1-forms, the definitions says that (a a P)(X, Y) = (a(X)ft(Y) - a(Y)ft(X)). In the case of a 1-form a and a &-form f3, we get £)-iorm. (aAß)(X0,XU---,Xk) = k ^2(-iya(Xj)ß(x0,...,Xj,. ■, Xk), J. Applications of Stoke's theorem - the Gauss-Ostrogradsky theorem 8.114. Compute I = ffx3 dy dz +y3 dx dz +z3 dx dy,where S s is given by the sphere x2 + y2 + z2 = 1. Solution. It is advantageous to work in spherical coordinates x = p sin

• U satisfies cp* (a a P) — + z + x dx dy dz 2 2i 3 J J J p ■ iß2 sin2 cp + z + p2cos2cp) dp dip dz. 0 0 1 2 2i 3 /// 0 0 1 2 2i 3 /// 0 0 1 2 2i 3 /// p ■ (p2 (sin2

r«, giving the standard n-dimensional volume of parallelograms, i. e., in the standard coordinates, we will have CO^n — dx\ a • • • a dx„. If we want to integrate a function fix) "in the old fashion", we consider the form co — fco^n instead, i. e. co will have the form (8.3) in the standard coordinates. We define I co = I f(x)dx\ a Ju Ju a dx„ I fix) dx\ ... dxn, Ju where there is the Riemann integral of a function on the right-hand side. We can notice, that the n-form on the left-hand side is independent of the choice of coordinates. If we want to express the form co in different coordinates using a diffeomorphism

• U, it means we will evaluate co at a point cpiu) — x at the values of the vectors ) I f(u) det(Z) 1ipiu))du\ Jv which is, by the theorem on transformation of variables from paragraph 8.31, the same value if the determinant of the Jacobian matrix keeps being positive, and the same value up to sign if it is negative. Our new interpretation thus yields a geometrical sense for the integral of an n-form on M", supposing the corresponding Riemann-integral exists in some (hence any) coordinates. This integration takes into account the orientation of the area we are integrating over. 8.38. Integration of exterior forms on manifolds. Now, we are almost ready for the definition of an integral of a k-formy on a ^-dimensional oriented manifold. For the sake of simplicity, we will examine smooth forms co ^— with compact support. First, let us assume that we are given a ^-dimensional manifold M c M" and one of its local parametrizations

• U C M c M". The choice of the parametrization

• 7 c K1, we can easily compute using the same definition. Let us denote cp* (a))(u) — f (u)du\ a • • • a diik- Invoking the relation (8.2) for the pullback of a form by a composite mapping, we get f a>= f cp*(co)= f n) JM JRk JRk Ip* (/ du\ a • • • a dllk) f(f(v)) det(Dli/)(v)dvi---dvk. f JRk I JRk This is again the same value as fRk cp* co. This proves the correctness of our definition of the integral fM co provided the integrated &-form has compact support lying in the image of a single parametrization. However, typical manifolds M are given by implicit equations; e. g. x2 + y2 + z2 — 1 defines the surface of the unit ball, i. e. the sphere S2 c R3. If we want to integrate an exterior 2-form on S2, we will have to use several parametrizations. Fortunately, our definition of the integral is additive with respect to disjoint unions of integration domains. Therefore, if we can write M — U\ U U2 U • • • U Um U B, where Ui are pairwise disjoint images of parametrizations cpi, and B is a set whose inverse image in any parametrization is a Riemann-measurable set with measure zero, we can compute / co — I &)+•••+/ JM JU\ JU, CO + ■ ■ ■ + I CD, >Ui Jum and we can easily verify that this value is independent of the choice of the sets Ui and the parametrizations (in particular, we need not be worried by the set B since the result of any integration on it is zero). For example, we can imagine splitting a sphere to the upper and lower hemispheres, leaving the equator B uncovered. When calculating in practice, we usually divide the entire manifold into several disjoint areas, and we integrate on each of them separately. However, we will mention a global definition which is more advantageous from the technical point of view. 8.39. Unit decomposition. Consider a manifold M c M" and one of its covers by open images Ui of parametrizations cot. We can surely find a countable cover of each manifold M (it suffices to realize that we can do with parametrizations which map the origin to points with rational coordinates in M"). Furthermore, we can assume that any point in x € M belongs to only finitely many sets Ui. Such a cover is called a locally finite cover by parametrizations cpi. 510 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES The validity of these two formulae is actually guaranteed by the chain rule theorem and the theorem for differentiating an inverse function, respectively. It was just the facility of the manipulations that inspired G. W. Leibniz to introduce this notation, which has been in use up to now. Further, we should realize why we have not written the general solution (||8.6||) in the suggesting form (8.7) y = sin (tan x + x + C) , Ce As we will not mention the domains of differential equations (i. e., for which values of x the expressions are well-defined), we will not change them by "redundant" simplifications, either. It is apparent that the function y from (|| 8.71|) is defined for all x e (0, jt) \ {tt/2}. However, for the values of x which are close to Tt/2 (having fixed C), there is no y satisfying (||8.6||). In general, the solutions of differential equations are curves which may not be expressible as graphs of elementary functions (on the whole intervals where we consider them). Therefore, we will not even try to do that. □ 8.119. Find the general solution of the equation / = (2 — y) tan x. Solution. Again, we are given a differential equation with separated variables. We have dy dx dy — = (2 — y) tan x, y-2 In I y - 2 I smx -dx, cosx — In I cos x In I C I, C^O. 
Here, the shift obtained from the integration has been expressed by In | C |, which is very advantageous (bearing in mind what we want to do next) especially in those cases when we obtain a logarithm on both sides of the equation. Further, we have In | y — 2 | =ln|Ccosjc|, C^O, | y — 2 | = | Ccosjc |, C^O, y-2 = Ccosx, C^O, where we should write ±C (after removing the absolute value). However, since we consider all non-zero values of C, it makes no difference whether we write +C or — C. We should pay attention to the fact that we have made a division by the expression y — 2. Therefore, we must examine the case y = 2 separately. The derivative of a constant function is zero, so we have found another solution, y =2. However, this solution is not singular since it is contained in the general solution as the case C = 0. Thus, the correct result is y = 2 + Ccosx, Cel. □ 8.120. Find the solution of the differential equation (1 + e*)y/ = ex which satisfies the initial condition y(0) = 1. Solution. If the functions / : (a, b) -» R and g : (c,d) -» R are continuous and giy) ^ 0, y e (c,d), then the initial problem / = f(x)g(y), y(x0) = y0 Now, recall the smooth variants of indicator functions from paragraph 6.7. For every pair of positive numbers e < r, we constructed a function fB,r(t) such that fB,r(t) — 1 for \t\ < r — s, while fB,r(t) — 0 for \ t\ > r + s, and 0 < fB,r(t) < 1 everywhere. At the same time, we had fit) ^ 0 if and only if \t\ < r + s. Now, if we define \r,e,xq (x) — fB,r(\x - x0|), we get a smooth function which takes the value 1 inside the ball Br-B (xq), with support exactly Br+B (xq), and with values between 0 and 1 everywhere. Lemma (Whitney's theorem). Every closed set K c of all zero points of some smooth function. is the set Proof. The idea of the proof is quite simple. If A" — Rn, the zero function is convenient, so we can further assume that K / W. An open set U — Rn\K can be expressed as the union of (at most) countably many open balls Bri (x,), and for each of them, we choose a smooth non-negative function f on Rn whose support is just Bri (xi), see the function Xr,e,x0 above. Now, we add up all these functions into an infinite series fix) — ^2akfk(x), where the coefficients a* are selected so small that this series would converge to a smooth function fix). To this purpose, it suffices to choose a* so that all partial derivatives of all functions a* fk ix) up to order k (inclusive) would be bounded from above by 2~k. Then, not only the series ^ a* fk is bounded from above by the series ^ 2~k, hence by Weierstrass criterion, it converges uniformly on the entire Rn, but we get the same for all series of partial derivatives, since we can always write them as E k=0 Clk drfk dx;, ■ dxir E k=r ak ar.fk dxh ■ ■ ■ dxir where the first part is a smooth function as it is a finite sum of smooth functions, and the second part can again be bounded from above by an absolutely converging series of numbers, so this expression will converge uniformly to dx. ^../dx.r ■ It is apparent from the definition that the function fix) satisfies the conditions of the lemma. □ Unit decomposition on a manifold Theorem. Consider a manifold M C R" and one of its locally finite covers by open images Ui of parametrizations cpt. Then, there exists a system of smooth functions f on the sets U{ such that for every point x e M, we have ^ f (x) = 1, and f (x) ^ 0 if and only ifxe Ui. 
The system of functions f from the theorem is called the unit decomposition subordinate to a locally finite cover of a manifold by parametrizations. 511 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES has a unique solution for any x0 e (a,b), y0 e (c,d). This solution is determined implicitly as y(x) yo *o In practical problems, we first find all solutions of the equation and then select the one which satisfies the initial condition. Let us compute: (l + e*) y dy/dx = ex, ydy 1 +ex ■ dx, y2 y— =ln(l +ex) +ln|C|, C ^ 0, y— = ln(C[l+e*]), C>0. The substitution y = 1, x = 0 then gives £ = In (C • 2), i. e. C - 2 We have thus found the solution y_ 2 ln(^[l+e*]), l. e., □ y = Jlln(f [1+e*]) on a neighborhood of the point [0, 1] where y > 0. 8.121. Find the solution of the differential equation which satisfies y(0) = 1. Solution. Similarly to the previous example, we get dy dx f + 1 ~~ x + 1' arctan y = In | x + 11 + C, Cel. The initial condition (i. e., the substitution x = 0 and y = 1) gives arctan 1 = In | 11 + C, i. e., C = f. Therefore, the solution of the given initial problem is the function y(x) = tan (In | x + 1 | + |) on a neighborhood of the point [0,1]. □ 8.122. Solve (8.8) y x+y + l 2x + 2y - 1' Solution. Let a function / : (a,b) x (c,d) -» R have continuous second-order partial derivatives and f(x, y) ^ 0, x e (a,b), y € (c,d). Then, the differential equation / = f(x, y) can be transformed to an equation with separated variables if and only if f(x,y) j;Ax.y) = 0, x e (a, &), y e (c, • Ut such that the closures of all images • U of a piece U of the manifold M, and let us have a look at the contribution of the integral over U. We get f « = E [ (fi°)= f JU ■ JVifW Jv ip CD. Therefore, if we select a different cover and unit decomposition, we can do the above reasoning for a common refinement of these covers and verify that the expression we have defined is actually independent of all of our choices (think this out in detail!). 8.41. Exterior differential of exterior forms. As we have seen, the differential of a function can be interpreted as a mapping d : Q°(Rn) Q\Rn). By means of parametrizations, this definition can be extended to functions on manifolds M, where the differential is a linear form on M. The following theorem extends this differential to arbitrary exterior forms on manifolds McR". [ Exterior differential [__^ Theorem. There is a unique mapping d : Qk(M) —>• Qk+lM,for all manifolds Mcl" and k — 0,..., k, such that • d is linear with respect to multiplication by real numbers • for k — 0, it is a differential of functions 512 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES With a bit of effort, it can be shown that a differential equation of the form y = f(ax + by + c) can be transformed to an equation with separated variables, and this can be done by the substitution z = ax + by + c. Let us emphasize that the variable z, replaces y. We thus set z, = x + y, which gives z! = 1 + /. Substitution into (II8.8H) yields dz, z + 1 1 dx 2z — 1 dz 3z dx 2z — 1 2 1 3 ~ Yz dz = 1 cx+i c eR. y = xe □ 8.124. Compute y 4;c+3y + l Solution. 
In general, we are able to solve every equation of the form ax + by + c (8.9) y = / ^Ax +By + C/ If the system of linear equations (8.10) ax +by +c = 0, Ax+By + C=0 has a unique solution x0, yo, then the substitution U — X — Xq, V — y — yo transforms the equation (||8.9||) to a homogeneous equation dv _ r / au+bv \ du ~ J \Au+Bv I ' If the system (||8.10||) has no solution or has infinitely many solutions, the substitution z = ax + by transforms the equation (||8.9||) to an equation with separated variables (often, the original equation is already such). In this problem, the corresponding system of equations 4x + 3y + 1 = 0, 3x + 2y + 1 = 0 has a unique solution x0 = — 1, yo = 1. The substitution u = x + 1, v = y — 1 then leads to the homogeneous equation 4u+3v dv Oriented boundary of a manifold Let us consider a closed subset M c R" such that its interior M c M is an oriented ^-dimensional manifold with a cover by compatible parametrizations cpt. Further, let us assume that for every boundary point x e 8M — M \ M, it has a neighborhood in M with parametrization

— I JM JdM Proof. Using an appropriate locally finite cover of the manifold M and a unit decomposition subordinate to it, we can express the integrals on both sides as the sum (even a finite one, since the support of the considered from co is compact) of integrals of forms on Rk or the half-space 1 We can thus assume without loss of generality that M is the half-space M = and the form co is a form with compact support on M. Then, co will surely be the sum of the forms co — coj (x)dx\ a • • • a dxj a • • • a dxfc, where the hat indicates omission of the corresponding linear form, and coj (x) is a smooth function with compact support. Its exterior differential is 3u+2v ' • dcoj dco — (—ly —-dx\ a 8xj a dxk. 514 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES which can be solved by further substitution z = v/u. We thus obtain 4 + 3z z'u + z dz — u =-- du 2z + 3 2z2 + 6z + 4 3 + 2z' 2z2 + 6z + 4 3 + 2z dz = - du u provided z2 + 3z + 2 ^ 0. Integrating, we get 1 i , - In J z2 + 3z + 2 In I u I + In I C - In J (z2 + 3z + 2) u2 J = In | C | In J (z2 + 3z + 2) u2 J = In C2 (z2 + 3z + 2)u2 = ±C2 C ^ 0. We thus have (z2 + 3z + 2) u2 = D, D/0 and returning to the original variables, ,2 — +3-+2)uz UL u V2 + 3vu + 2m2 (y- l)2 + 3(y- l)(x + l) + 2(x + l)2 D, D^O, D, D^O, D, D^O. Making simple rearrangements, the general solution can be expressed as (x + y) (2x + y + 1) = D, D jL 0. Now, let us return to the condition z2 + 3z, + 2 ^ 0. It follows from z2 + 3z + 2 = 0 that z = — 1 or z = —2, i. e., v = —u or v = —2u. For i; = —u, we have x = u — 1 and y = i; + 1 = —m + 1, which means that y = — x. Similarly, for i; = —2u, we have y = —2u + 1, hence y = —2x — 1. However, both functions y = —x, y = —2x — 1 satisfy the original differential equations and are included in the general solution for the choice D = 0. Therefore, every solution is known from the implicit form (x + y) (2x + y + 1) = D, D € R. □ 8.125. Find the general solution of the differential equation (x2 + y2) dx — 2xy dy = 0. Solution. For y ^ 0, simple rearrangements lead to v _ i±± - i±iil! y — 2xy — 2j • Using the substitution u = y/x, we get to the equation u'x + u 2u If j > 1, the form co on the boundary 3M evaluates identically to zero. At the same time, invoking the fundamental theorem about antiderivatives of univariate functions, we get f i f ( f°° da)i \ I d(D — (—Xy I [I -dx; )dx\ ■ ■ ■ dxj ■ ■ ■ dxk JM Jm*-1 VJ-oo dxj J = (-1> AM dxi ■ ■ ■ dxj ■ ■ ■ dxk — 0, since the function coj has compact support. So the theorem is true in this case. However, if j — 1, then we obtain f f ( f° da)i \ j d(D — j II -dx\ \dxj_......dxk JM JRk-1 \J-oo dxi ) — I a>i (0, x2, ■ ■ ■, xjc)dx2 ■ ■ ■ dxi — J a> JRk-1 JdM This finishes the proof of Stokes's theorem. □ 8.44. Notes about application of Stokes' theorem. We have proved an extraordinarily strong result which covers several standard integral relations from the classic vector analysis. For instance, we can notice that by Stokes' theorem, the integration of the exterior differential dco of any (k — l)-forms over a compact manifold without boundary is always zero (for example, integrating a 2-form dco over the sphere S2 c R3). Let us look step by step at the cases of Stokes' theorem in lower dimensions. The case n — 2, k — 1. We are thus examining a surface M in the plane, bounded by a curve C — 8M. If we have co(x, y) — f(x, y)dx+g(x, y)dy,ihendco — (— j^ + ^dxAdy. 
Therefore, Stokes' theorem gives the formula ■g(x, y)dy = / ( -JM \ j fix, y)dx dx a dy, K+dA dy dx which is one of the standard forms of the so-called Green's theo- Using the standard scalar product on M2, we can identify the vector field X with a linear form cox such that a>x(Y) — (Y, X). In the standard coordinates (x, y), this just means that the field X — f(x, y) J-r 4- g(x, y)-^ defines right the form co given above. The integral of cox over a curve C has the physical interpretation of the work done by movement along this curve in the force field X. Green's theorem then says, besides others, that if cox — dF for some function F, then the work done along a closed curve is always zero. Such fields are called potential fields and the function F is the potential of the field X. With Green's theorem, we have verified once again that integrating the differential of a function along a curve depends solely on the initial and terminal points of the curve. The case n — 3, k — 2. We are examining a region in M3, bounded by a surface S. If ca — f(x, y, z)dy a dz + g(x, y, z)dz a dx + h(x, y, z)dx a dy, we get dm — (|£ + |^ + ^-)dx a dy a dz, and Stokes' theorem says that Ii fix, y, z)dy Adz+g(x, y, z)dzAdx+h(x, y, z)dxAdy JM \ Bf dx dy dh 3z H---1--\dx A dy A dz. 515 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES For u 7^ ±1 and D -1/C, we have du — x dx 1+u2 2u2 2u In 1 \ —u1 In I x I + In I C In- 1 In I Cx 1 = Cx ( 1 z) x -Dx y2 2u dx ■ du = —, x , C ^ 0, , c # 0, C/0, D/0, 0^0. v2, ±x. While y = 0 is —x are solutions and The condition u = ± 1 corresponds to y not a solution, both the functions y = x and y can be obtained by the choice D = 0. The general solution is thus y2 = x2 + Dx, Dei. □ 8.126. Solve y 2y *2-l • Solution. The given equation is of the form / = a(x)y + b(x), i. e., a non-homogeneous linear differential equation (the function b is not identically equal to zero). The general solution of such an equation can be obtained using the method of integration factor (the non-homogeneous equation is multiplied by the expression e~ ^ a(x) ^) or the method of variable separation (the integration constant that arises in the solution of the corresponding homogeneous equations is considered to be a function in the variable x). We will illustrate both of these methods on this problem. As for the former method, we multiply the original equation by the expression J- dx x-1 x + l ' where the corresponding integral is understood to stand for any anti-derivative and where any non-zero multiple of the obtained function can be considered (that is why we could remove the absolute value). Thus, consider the equation 2y x(x — 1) x + l 1 (jc + 1)2 _ x+l " The core of the method of integration factor is that fact that the expression on the left-hand side is the derivative of y . Integrating this leads to ,x-l x(x — 1) dx 2x +2 mix + 1 I + C, Ce y x+i j x+i 2 Therefore, the solutions are the functions y = x^T (t _ 2x + 21n|jc + 1 | + c) , CeR. As for the latter method, we first solve the corresponding homogeneous equation V -__2J- y - x^-v This is the statement of the so-called Gauss-Ostrogradsky theorem. This theorem also has a very illustrative physical interpretation. Every vector field X — /(x, y, z) J-r + g(x, y, z)f^ + h(x,y,z)-^ defines an exterior 2-form cox(x, y, z) — fix, y, z)dy A dz + g(x, y, z)dz A dx + h(x, y, z)dx A dy by substitution for the first argument in the standard form of volume. 
The integral of this form over a surface can be perceived so that the integrated 2-form infinitesimally contributes, at every point to the integral, the increase equal to the volume of the parallelepiped given by the field X and a little piece of surface. If we consider the vector field to be the velocity of movement of the particular points of the space, this will be the "flow rate" through the given surface. On the right-hand side of the integral, there is an expression which can be defined as d(cox) — (div X)dx A dy A dz. Gauss-Ostrogradsky theorem says that if divZ equals zero identically, then the total flow rate through the boundary surface of the region is zero as well. Such fields, with divZ — 0, are called solenoidal vector fields. The case n — 3, k — 1. In this case, we have a surface M in R3 bounded by a curve C. If the linear form co is the differential of some function, we find out that the integral over the surface depends on the boundary curve only. This is the classical Stokes' theorem. If we use the standard scalar product, just like in the plane, to identify the vector field X — f-^r co — f dx + gdx + hdz, we obtain ■ s1- « dy h-^ with the form fc gdx + hdz f d(D, where^=(|-f)JyA*+(f-f)*AJx+(f-|i)JxA dy. This 2-form can again be identified with a single vector field rot X, which yields dco by substitution into the standard form of volume. This field is called the rotation or curl of the vector field X. We can see that in the three-dimensional space, vector fields X having the property that cox — dF for some function F are given by the condition rot X — 0. They are called conservative (or potential) vector fields. 3. Differential equations In this section, we will get back to (vector) functions of one variable, which will be given and examined in terms of their instantaneous changes. At the end, we will stop for a while to look at equations containing partial derivatives. 8.45. Linear and non-linear difference models. The concept of derivative was introduced in order to work with instantaneous changes of the examined quantities. In the introductory chapter, we once defined differences for the same reason, and it was just the relations between the values of the quantities and the changes of them or other quantities which lead to the so-called difference equations. As a motivating introduction to equations containing derivatives of unknown functions, we will now return to the difference equations for a while. The simplest model was interests of deposits or loans (and the same for the so-called Malthusian model of populations). The increase was proportional to the value, see 1.10. Considering continuous modeling, the same request leads to an equation connecting 516 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES which is an equation with separated variables. We have dy 2y dx x2 — ln|y In I x dy y 1 I + In I x + 1 I + In I C x + 1 ln|y In C y = C where we had to exclude the case y x - 1 x + 1 x2 - 1 2 -dx, - 1 x - 1 0. However, the function y = 0 is always a solution of a homogeneous linear differential equation, and it can be included in the general solution. Therefore, the general solution of the corresponding homogeneous equation is ax+i) c R J X — 1 ' Now, we will consider the constant C to be a function C(x). Differentiating leads to ./ _ C'(x) (jc + 1)(jc-1)+C(jc) (jc-I)-C(jc) y - (x-i)2 Substituting this into the original equation, we get C'(x) (jt+l)(jt-l)+C(jt) (Jt-I)-C(jt) _ _ 2C(x) (x + 1) (x-l)2 ~ (x-l)(x2-l)' It follows that C(x) x(x — 1) x + l ' l. 
e., C(x) r x(x -1) J x + l dx, C(x) 2x + 2 In I x + 1 I + C, Ce Now, it suffices to substitute: y = c(x) x+l x-l x + l x-l 2x + 2 In I x + 1 I + C C e We can see that the result we have obtained here is of the same form as in the former case. This should not be surprising as the differences between the two methods are insignificant and the computed integrals are the same. Finally, we can notice that the solution y of an equation / = a(x)y can be found in the same way for any continuous function a. We thus always have y = Cefa(x)dx, CeR. Similarly, the solution of an equation / = a (x)y +b(x) with an initial condition y(xo) = yo can be determined explicitly as (provided the coefficients, i. e. the functions a and b, are continuous) f* ait) dt ( ex , — [' a(s) ds ,\ y = e^o (y0 + jxQbit)e Ao dt) . Let us remark that the linear equation has no singular solution, and the general solution contains aCel □ 8.127. Solve the linear equation (y + 2xy) e*2 = cos x. the derivative y1 (t) of a function with its value (8.4) /(?) = r.y(0 with a proportionality constant r. It is easy to guess the solution of this equation, i. e. a function y(t) which satisfies the equality identically, y(0 = Cert with an arbitrary constant C. This constant can be determined uniquely be choosing the so-called initial values yo — y(?o) at some point to. If a part of the increase in our model were given by a constant action independent of the value y or t (like bank charges or the natural decrease of population as a result of sending some part of it to slaughterhouses), we could use an equation with a constant s on the right-hand side. (8.5) y(f) = r-y(f) + 5. Apparently, the solution of this equation is the function y(0 = Cert--. r It is very easy to come across this solution if we realize that the set of all solutions of the equation (8.4) is a one-dimensional vector space, while the solutions of the equation (8.5) are obtained by adding any one of its solutions to the solutions of the previous equation. We can then easily find the constant solution y(t) — k forA: = -f. Similarly, in paragraph 1.13, we managed to create the so-called logistic model of population growth based upon the assumption that the ratio of the change of the population size p(n + 1) — pin) and its size pin) is affine with respect to the population size itself. We also wanted the model to behave similarly as the Malthu-sian one for small values of the population size and to cease growing when reaching a limit value K. Now, the same relation for the continuous model can be formulated for a population pit) dependent on time t by the equality (8.6) AO = Pit) (- K Pit) i. e., at the value pit) — K for a large constant K, the instantaneous increase of the function p is indeed zero, while for pit) near zero, the ratio of the rate of increase of the population and its size is close to r, which is often a small number (roughly hundredths) expressing the rate of increase of the population in good conditions. It is surely not easy to solve such an equation without knowing the proper theory (although we will be able to deal with this type of equations presently). However, as an exercise on differentiation, we can easily verify that the following function is a solution for every constant C: Pit) K 1 + CK e" 517 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES Solution. If we used the method of integration factor, we would only rewrite the equation trivially since it is already of the desired form - 2 the expression on the left-hand side is the derivative of y ex . 
Thus, we can immediately calculate / cosx, ye- ye- sinx + C, y = e x (sinx + C) , cos x dx, Cef, Cel. □ 8.128. Find all non-zero solutions of the Bernoulli equation y-f = 3xy2. Solution. The Bernoulli equation y = a(x)y + b(x)f , r#0, r#l, reM can be solved by first dividing by the term / and then using the substitution u = yl~r, which leads to the linear differential equation u' = (1 — r) [a(x)u + b(x)] . In this very problem, the substitution u = y1-2 = 1/y gives u' + a = -3x. X Similarly to the previous exercise, we have u =e-lnl*l [J-3xeln|x|t/x] , was obtained as an (arbitrary) antiderivative to 1/x. where In | x Furhter, -3x e1 In I x dx / -3x \x\dx The absolute value can be replaced with a sign that can be canceled, i. e., it suffices to consider u = \ [f -3x2 dx] = \ [-x3 + C], CeR. Returning to the original variable, we get y = - = ttS, CeR. J u C-xi ' The excluded case y = 0 is a singular solution (which, of course, is true for every Bernoulli equation with r positive). □ 8.129. Interchanging the variables, solve the equation y dx — (x + y2 sin y) dy = 0. Solution. When the variable x occurs only in the first power in the differential equation and y occurs in the arguments of elementary functions, we can apply the so-called method of variable interchange, when we look for the solution as for a function x of the independent variable y- First, we write the equation explicitly: y = —f—• y x+yA sin y This equation is not of any of the previous types, so we rewrite it as follows: Confronting the red graph (left-hand picture) of this function with the choice K = 100, r — 0, 05, and C — 1 (the first two were used in 1.13 this way, the last one roughly corresponds to the initial value p(0) — 1) with the right-hand picture (the solution of the difference equation from 1.13 with the same values of the parameters), we can see that both approaches to population modeling indeed yield quite similar results. To compare the output, the left-hand picture also contains in green the graph of the solution of the equation (8.4) with the same constant r and initial condition. 8.46. First-order differential equations. By an (ordinary) first-order differential equation, we usually mean the relation between the derivative y1 (t) of a function with respect to the variable t, its value y(t), and the vari-able itself, which can be written in terms of some real-valued function F : R3 —>• R as the equality F(/(0,y(0,0 = 0. The writing resembles implicitly given functions y(t); however, this time, there is a dependency upon the derivative of the wanted function y(t). If the equation is solved at least explicitly with regard to the derivative, i. e., y'(t) = f(t, y(0) for some function / : R —>• R, we can imagine graphically what this equation defines. For every value (t, y) in the plane, we can consider the arrow corresponding to the vector (1, fit, y)), i. e., the velocity with which the point of the graph of the solution moves through the plane in dependence on the free parameter t. For the equation (8.6), for instance, we get the following picture (illustrating the solution for the initial condition as above). 518 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES dy_ _ y dx x + y2 sin y ' dx ( y \_1 x ay \x + y1 sin y J y , 1 x = — x + y sin y. y We have thus obtained a linear differential equation. Now, we can easily compute its general solution x = —y cos y + Cy, Cel. □ Further problems concerning first-order differential equations can be found on page ??. L. 
Practical problems leading to differential equations 8.130. A water purification plant with volume 2000 m3 was contaminated with lead which is spread in the water with density 10 g/m3. Water is flowing in and out of the basin at 2 m3/s. In what time does the amount of lead in the basin decrease below 10 [ig/m3 (which is the hygienic norm for the amount of lead in drinkable water by a regulation of the European Community) provided the water keeps being mixed uniformly? Solution. Let us denote the water's volume in the basin by V (m3), the speed of the water's flow by i; (m3/s). In an infinitesimal (infinitely small) time unit At,y-v At grams of lead runs out of the basin, so we can construct the differential equation dm m - — ■ v At V for the change of the lead's mass in the basin. Separating the variables, we get the equation Am m v -At. V Integration both sides of the equation and getting rid of the logarithms, we get the solution in the form m(t) = m0e~^', where m0 is the lead's mass at time t = 0. Substituting the concrete values, we find out that t = 6 h 35 min. □ 8.131. The speed of transmission of a message in a population consisting of P people is directly proportional to the number of people who have not heard the message yet. Determine the function / which describes the dependency of the number of people who have heard the message on time. Is it appropriate to use this model of message transmission for small or large values of PI Solution. We construct a differential equation for /. The speed of the transmission = fit) should be directly proportional to the number of people who have not heard of it, i. e. the value P —fit). Altogether, dj_ dt 10C- y(x) 80- 60- 40- 207 '///// '//// '//// 1111 III! //// //// ■%//// //// //// //// 1111 m ',',y, //// //// //// /// ' //// 111 m //// //// /// /// J11 w /// /// //// //// 1111 m '///, //// //// //// I " I ' I ' I ' I " I T "T "1 T T™ 50 100 ~i—I—r—r—r—r—r—i 150 200 Considering these pictures, we can intuitively anticipate that for every initial condition, there will exist a unique solution of our equation. However, as we will see, this proposition holds only for sufficiently smooth functions /. 8.47. Integration of differential equations. Before examining existence of the solutions of the differential equations, we present at least one truly elementary method of solution. It transforms the solution to ordinary integration, which usually leads to an implicit description of the solution. equations with separated variables J____ Consider a differential equation in the form (8.7) y« = /(o-s mo) for two continuous functions of a real variable, / and g. The solution of this equation can be obtained by integration, i. e., we find the antiderivatives dy Giy) fix)dx. This procedure reliably finds a solution which satisfies g(y(t)) ^ 0. Then, computing the function y(x) from the implicitly given formula F(x) + C — Giy) with an arbitrary constant C leads to the solution, because differentiating this equation using the chain rule for the composite function G(y(x)) indeed leads to ■ y1 (x) — fix). As an example, we can find the solution of the equation y (x) = x • y(x). f C. Hence it looks (at Direct calculation gives In \ y(x) least for positive values of y) as 2x2 HP - fit)). y(x) = eix2+c = D ■ e^2, where D is an arbitrary positive constant now. Let us stop for a while to examine the resulting formula and signs thoroughly. The 519 CHAPTER 8. 
CONTINUOUS MODELS WITH MORE VARIABLES Separating the variables and introducing a constant K (the number of people who know the message at time t = 0 must be P — K), we get the solution fit) Ke -kt where k is a positive real constant. Apparently, this model makes sense for large values of P only. □ 8.132. The speed at which an epidemic spreads in a given closed population consisting of P people is directly proportional to the product of the number of people who have been infected and the number of people who have not. Determine the function fit) describing the number of infected people in time. Solution. Just like in the previous problem, we construct a differential equation: df dt Again, separating the variables and introducing suitable constants K and L, we obtain K k ■ f(t) iP - fit)) . fit) 1+Le -Kkt □ 8.133. The speed at which a given isotope of a given chemical element decays is directly proportional to the amount of the given isotope. The half-life of the isotope of plutonium H9Pu is 24,100 years. In what time does a hundredth of a nuclear bomb whose active component is the mentioned isotope disappear? Solution. Denoting the amount of plutonium by m, we can build a differential equation for the rate of the decay: dm - = k ■ m, dt where k is an unknown constant. The solution is thus the function m{t) = m0e~kt. Substituting into the equation for half-life^-*' = |), we get the constant k = 2.88 • 105. The wanted time is then approximately 349 years. □ 8.134. The acceleration of an object falling in a constant gravitational field with a certain resistance of the environment is given by the formula dv dt where k is a constant which expresses the resistance of the environment. An object was dropped in a gravitational field with g = 10 ms-2 at the initial speed of 5 ms-1, the resistance constant is k = 0.5 s-1. What will the speed of the object be in three seconds? Solution. — = g - kv, (f-4 -kt constant solution y(x) — 0 satisfies our equation as well, and for negative values of y, we can use the same solution with negative constants D. In fact, the constant D can be arbitrary, and we have found a solution satisfying any initial value. \\\w )//S'sS'-*-lilt//"-*-. I III ',,111 -'///II '///// In /// HHP ---ww v -n.WW \ m\\ n Iii ill } / / f/sS'^-111////"- WW,',',! wm " mini -"//// "////111 - -,,/ The picture shows two solutions which demonstrate the instability of the equation with regard to the initial values: If, for any xq, we change a tiny yo from a negative value to a positive one, then the behavior of the resulting solution changes dramatically. Moreover, we should notice the constant solution y(x) — 0, which satisfies the initial condition y(*o) — 0. Using separation of variables, we can easily solve the nonlinear equation from the previous paragraph which described a logistic population model. Try this as an exercise. In the first chapter, we paid much attention to the so-called linear difference equations, and their general solution, looking quite awful, was determined in paragraph 1.10 on page 13. Although it was clear beforehand that it will be a one-i S dimensional affine space of satisfying sequences, it was a hardly transparent sum, because we needed to take into account all of the changing coefficients. We can thus use this as a source of inspiration for the following construction of the solution of a general first-order linear equation (8.8) y (t) = a(t)y(t) + b(t) with continuous coefficients a it) ans bit). 
First of all, let us find the solution of the homogenized equation y (0 — a(t)y(t). This can be computed easily by separation of variables, obtaining y(t) = yoF(t,to), F(t,s) = daMdx . In the case of difference equations, we "guessed" the solution, and then we proved by induction that it was correct. It is even simpler now, as it suffices to differentiate the correct solution to verify the statement. .... | The solution of first-order linear equations |_, The solution of the equation (8.8) with initial values y(?o) — yo is (locally in a neighborhood of to) given by the formula yit) = yoFit, t0) + / Fit, s)b(s) ds, f J* a(x) dx v(3) = 20 — I5e 2 ms 1 after substitution. □ where F(t, s) — e Verify the correctness of the solution by yourselves (pay proper attention to the differentiation of the integral where t is both in the upper bound and a free parameter in the integrand). 520 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES 8.135. The rate of increase of a population of a certain type of bug is indirectly proportional to its size. At time t = 0, the population had 100 bugs. In a month, the population doubled. What will the size of the population be in two months? Solution. Let us consider a continuous approximation of the number of bugs, and let their amount be denoted by P. Then, we can build the following equation: dP _ k dt ~ P' 77-100, □ P = ^Kt + c. Substituting the given values, we get P (2) which is an estimate of the actual number of bugs. Now, we can, for instance, directly solve the equation y' (x) — 1 — x ■ y(x), this time encountering stable behavior, visible in the following pic- ture. -\\ \ -\\ -W \ 2,5* A 8.136. Find the equation of the curve with the following properties: It lies in the first quadrant, goes through the point [l, 3/4], and its tangent at any point marks on the positive half-axis y a segment whose length is the same as the distance of that point from the origin. O 8.137. Consider a chemical compound C isolated in a container. C is unstable, with half-time of a molecule equal to q time units. If there were M moles of the compound C in the container at the beginning (i. e., at time t = 0), how many moles of it will be there at time t > 0? o 8.138. A 100-gram body lengthens a spring of 5 cm if hung on it. Express the dependency of its position on time t provided the speed of the body is 10 cm/s when going through the equilibrium point. O Further practical problems that lead to differential equations can be found on page ??. M. Higher-order differential equations 8.139. Underdamped oscillation. Now, we will describe a simple model for the movement of a solid object attached to a point with a strong spring. If y(t) is the deviation of our object from the point yo = y(0) = 0, then we can assume that the acceleration y" it) in time t is proportional to the magnitude of the deviation, yet with the other sign. The proportionality constant k is called the spring constant. Considering the case k = 1, we get the so-called oscillation equation y"(t) = -y(t). This equation corresponds to the system of equations x'(0 = -y(0, y'(t)=x(t) from 8.7. The solution of this system is given by x(t) = R cos(? - r), y(t) = R sin(f - r) with an arbitrary non-negative constant R, which determines the maximum amplitude, and a constant r, which determines the initial phase. Therefore, in order to determine a unique solution, we need to know not only the initial position yo, but also the speed of the motion at that moment. 
These two pieces of information uniquely determine both the amplitude and the initial phase. Moreover, let us imagine that as a result of the properties of the spring material, there is another force which is directly proportional to the instantaneous speed of our object, with the other sign than the amplitude again. This is expressed by one more term with the first derivative, so our equation is now «w /—w\\v. /^•\\\\\\\ //^-^n\\\\\\\\. ///^-^nww \\ \ \\ \ . t//S'— ^WWWWU / / / / ->»^n\\ \ \ \ w / m / / ///ssss'^—-l/l/.//./!/././././/././././/.//./ /-*w\ \ v. Y/7^X\\\\\ hit / / //x/^——-~-~n\\\\\ /////// ///SSSS'^-'-^ 1/1/1//1//1/1/1/1/1/1/1/1/.//.//./ 8.48. Transformation of coordinates. Our pictures tend to indi-f'Q ^ cate that differential equations can be perceived as •£j\ geometric objects (the "directional field of the ar-yg^ggr-js rows"), so we should be able to look for the solution by conveniently chosen coordinates. We will get back to this point of view later; now, we will only show three simple and typical tricks as they seem from the explicit form of the equations in coordinates. We begin with the so-called homogeneous equations of the form yw = /(^-). Considering a transformation z get by the chain rule that 1 r, assuming that t / 0, then we z'(t) = -{ty'{i) y(t)) = -t(f(z) z), which is an equation with separated variables. Another example is the so-called Bernoulli differential equations, which are of the form y(t) = f(t)y(t) + g(t)y(t)n, where n ^ 0, 1. The choice of the transformation z — yl~n leads to the equation z'(t) = (1 - n)y(t)-"(f(t)y(t) + g(t)f) = (l-n)f(t)z(t) + (l-n)g(t), which is a linear equation, which we are able to integrate. In the end, let us take a look at an extraordinarily important equation, the so-called Riccati equation. It is a form of the Bernoulli equation with n — 2, extended by an absolute term y(t) = f(t)y(t)+g(t)y(t)2 + h(t). This equation can also be transformed to a linear equation provided that we are able to guess a particular solution x(t). Then, we can use the transformation 1 z(0 =-• y(t) - x(t) Verify by yourselves that this transformation leads to the equation z'(t) = -(f(t) + 2xit)git))zit) - git). 521 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES y"(t) = -y(t)-ay'(t), where a is a constant which expresses the magnitude of the damping. In the following picture, there are the so-called phase diagrams for solutions with two distinct initial conditions, namely with zero damping on the left, and for the value of the coefficient a = 0.3 on the right. Tlumené oscilace Tlumené oscilace r. The oscillations are expressed by the y-axis values; the x-axis values describe the speed of the motion. 8.140. Undamped oscillation. Find the function y(i) which satisfies the following differential equation and initial conditions: y"(t) + 4y(t) = f(t), ;y(0) = 0, y'(0) = -1, where the function f(t) is piecewise continuous: 1 cos(2?) for 0 < t < 7i, 10 for t > 7t. fit) Solution. This problem is a model of undamped oscillation of a spring (omitting friction, non-linearities in the toughness of the spring, and other factors) which is initiated by an outer force. The function f(t) can be written as a linear combination of Heav-iside's function u(t) and its shift, i. 
e., Since f(t) = cos(2í)(w(í) - M0) £(y")(s) = s2C(y) - sy(0) - y'(0) = s2 C(y) + 1, we get, applying the results of the above exercises 7 and 8 to the Laplace transform of the right-hand side s2 C(y) + \ + 4C(y) £(cos(2í)(«(0 - M0)) = £(cos(2i) • u(t)) - £(cos(2i) • M0) £(cos(2i)) Hence, C(y) (1 1 '£(cos(2(í + 7C)) s2 +4 s2 +4 + (1 (s2 + 4)2 Performing the inverse transform, we obtain the solution in the form s y(t) sin(2í) + \t sin(2i) + C~l \e (s2 + Af Just as we saw in the case of integration of functions (which is, in fact, the simplest type of equations with separated variables), the equations usually do not have a solution expressible explicitly in terms of elementary functions. Similarly as with standard engineer tables of values of special functions, books listing the solutions of basic equations were compiled as well.4 Today, the wisdom concealed in them is essentially transferred to software systems like Maple or Mathematica. There, we can assign any task on ordinary differential equations, and we will get the results in a surprisingly good deal of cases, yet after all, it will not be possible for most problems. The way out of this is numerical methods, which try only to approximate the solutions. However, to be able to use them, we still need good theoretical starting points regarding existence, uniqueness, and stability of the solutions. We begin with the so-called Picard-Lindelof theorem: Existence and uniqueness of the solutions of ODEs 8.49. Theorem. Let a function j'(t, y) : R2 -> R have continuous partial derivatives on an open set U. Then for every point (to, yo) e U D R2, there exists a maximal interval I — [?o — a, to + b], with positive a, b e R, and a unique function y(t) : I -> R which is a solution of the equation f(t) = f(t, y(t)) on the interval I. Proof. Notice that if a function y(t) is a solution of our equation satisfying the initial condition y(?o) — to, then it also satisfies the equation y(0 = yo + f yf" (t) dt = y0 + I fit, y(t)) dt. Jt0 Jt0 However, the right-hand side of this expression is, up to constant, the integral operator L(y)(t) = yQ+ f f(t,y(t))dt. Jt0 When solving our first-order differential equations, we are thus looking for a fixed point of this operator L, i. e., we want to find a function y — y(t) satisfying L(y) — y. On the other hand, if a Riemann-integrable function y(t) is a fixed point of the operator L (y), then it immediately follows from the antiderivative theorem that y(t) indeed satisfies the given differential equation, including the initial conditions. We can quite easily guess for the operator L how much its values L (y) and L (z) differ for various arguments y(t) and z(t). Indeed, thanks to the partial derivatives of the func-> tion / being continuous, we know that / is locally Lips-chitz. This means that we have the bound \f(t,y)-f(t,z)\ < C\y-z\, with a constant C if we restrict the values (t, y) to a neighborhood of the point (to, yo) with compact closure. We choose an e > 0 and restrict the value of t to some interval J — [to — ao, to + bo] so /' E. g., Kamke. 522 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES However, by formula (||7.36||), we have s 1 / — tis CTy\e l-C-l(e-ns C(t sin(2f))) (s2 + 4)2 ) 4 = (t -tx) sin(2(f - tx)) ■ Hn(t). Since Heaviside's function is zero for t < tx and equal to 1 for t > tx, we get the solution in the form y(0 ■\ sin(2f) + \t sin(2f) for 0 < f < ix n-2 sin(20 for t > tt □ 8.141. Find the general solution of the equation /" - 5/ - 8/ + 48y = 0. Solution. 
Just as we saw in the case of integration of functions (which is, in fact, the simplest type of equation with separated variables), differential equations usually do not have a solution expressible explicitly in terms of elementary functions. Similarly to the standard engineering tables of values of special functions, books listing the solutions of basic equations were compiled as well (e.g., Kamke). Today, the wisdom concealed in them is essentially transferred to software systems like Maple or Mathematica. There, we can pose any task on ordinary differential equations, and we get results in a surprisingly good deal of cases; yet, for most problems, this will not be possible. The way out is numerical methods, which try only to approximate the solutions. However, to be able to use them, we still need good theoretical starting points regarding existence, uniqueness, and stability of the solutions.

We begin with the so-called Picard-Lindelöf theorem:

Existence and uniqueness of the solutions of ODEs

8.49. Theorem. Let a function f(t, y): R² → R have continuous partial derivatives on an open set U. Then for every point (t₀, y₀) ∈ U, there exists a maximal interval I = [t₀ − a, t₀ + b], with positive a, b ∈ R, and a unique function y(t): I → R which is a solution of the equation y'(t) = f(t, y(t)) on the interval I.

Proof. Notice that if a function y(t) is a solution of our equation satisfying the initial condition y(t₀) = y₀, then it also satisfies the equation

y(t) = y₀ + ∫_{t₀}^{t} y'(s) ds = y₀ + ∫_{t₀}^{t} f(s, y(s)) ds.

The right-hand side of this expression is exactly the integral operator

L(y)(t) = y₀ + ∫_{t₀}^{t} f(s, y(s)) ds.

When solving our first-order differential equation, we are thus looking for a fixed point of this operator L, i.e., we want to find a function y = y(t) satisfying L(y) = y. On the other hand, if a Riemann-integrable function y(t) is a fixed point of the operator L, then it immediately follows from the antiderivative theorem that y(t) indeed satisfies the given differential equation, including the initial condition.

We can quite easily estimate how much the values L(y) and L(z) differ for various arguments y(t) and z(t). Indeed, since the partial derivatives of the function f are continuous, f is locally Lipschitz. This means that we have the bound

|f(t, y) − f(t, z)| ≤ C |y − z|

with a constant C, if we restrict the values (t, y) to a neighborhood of the point (t₀, y₀) with compact closure. We choose an ε > 0 and restrict the value of t to some interval J = [t₀ − a₀, t₀ + b₀] so that J × [y₀ − ε, y₀ + ε] ⊂ U, and we consider only those functions y(t) and z(t) which, for t ∈ J, satisfy

max_{t∈J} |y(t) − y₀| ≤ ε, max_{t∈J} |z(t) − y₀| ≤ ε.

Now, we obtain the bound

|(L(y) − L(z))(t)| = |∫_{t₀}^{t} (f(s, y(s)) − f(s, z(s))) ds| ≤ ∫_{t₀}^{t} |f(s, y(s)) − f(s, z(s))| ds ≤ C |t − t₀| max_{s∈J} |y(s) − z(s)|.

Restricting the size of the interval J so that C |t − t₀| < 1 on it, we thus have

max_{t∈J} |L(y)(t) − L(z)(t)| ≤ c · max_{t∈J} |y(t) − z(t)|

for some constant 0 ≤ c < 1. In paragraph 7.19 on page 441, such operators were called contractions. However, for the assumptions of the Banach contraction theorem, which guarantees a uniquely determined fixed point, we also need completeness of the space X of functions on which the operator L works. In our case, we can notice that merely from the continuity of the mapping f(t, y), there follows a uniform bound for all of the functions y(t) considered above and all values t > s in their domain:

|L(y)(t) − L(y)(s)| ≤ ∫_{s}^{t} |f(r, y(r))| dr ≤ B |t − s|

with a suitable constant B > 0. Therefore, besides the conditions mentioned above, we can even restrict ourselves to the subset of all uniformly continuous functions. This set is already compact, hence it is a complete subset of the space of continuous functions on our interval, see the Arzelà-Ascoli theorem 7.23. Therefore, there exists a unique fixed point y(t) of this contraction L, which is the solution of our equation.

It remains to show the existence of a maximal interval I = [t₀ − a, t₀ + b]. Let us suppose that we have found a solution y(t) on an interval (t₀, t₁) and, at the same time, the limit y₁ = lim_{t→t₁⁻} y(t) exists and is finite. Then, it follows from what has been proved above that there exists a solution with the initial condition (t₁, y₁) in some neighborhood of the point t₁, and to the left of t₁ it must coincide with the solution y(t). Therefore, the solution y(t) can surely be extended to the right of t₁. There are thus only two possibilities when the solution does not exist to the right of t₁: either there is no finite limit of y(t) at t₁ from the left, or the limit y₁ exists, yet the point (t₁, y₁) lies on the boundary of the domain U of the function f. In both cases, we indeed have a maximal extension of the solution to the right of t₀. The argumentation for the maximal solution to the left of t₀ is analogous. □

8.141. Find the general solution of the equation

y''' − 5y'' − 8y' + 48y = 0.

Solution. This is a third-order linear differential equation with constant coefficients, since it is of the form

y⁽ⁿ⁾ + a₁y⁽ⁿ⁻¹⁾ + a₂y⁽ⁿ⁻²⁾ + ⋯ + aₙy = f(x)

for certain constants a₁, ..., aₙ ∈ R. Moreover, we have f(x) = 0, i.e., the equation is homogeneous. First of all, we find the roots of the so-called characteristic polynomial

λⁿ + a₁λⁿ⁻¹ + a₂λⁿ⁻² + ⋯ + a_{n−1}λ + aₙ.

Each real root λ with multiplicity k corresponds to the k solutions

e^(λx), x e^(λx), ..., x^(k−1) e^(λx),

and every pair of complex roots λ = α ± iβ with multiplicity k corresponds to the k pairs of solutions

e^(αx) cos(βx), x e^(αx) cos(βx), ..., x^(k−1) e^(αx) cos(βx),
e^(αx) sin(βx), x e^(αx) sin(βx), ..., x^(k−1) e^(αx) sin(βx).

The general solution then corresponds to all linear combinations of the above solutions.

Therefore, let us consider the polynomial λ³ − 5λ² − 8λ + 48, with roots λ₁ = λ₂ = 4, λ₃ = −3. Since we know the roots, we can write down the general solution as well:

y = C₁e^(4x) + C₂x e^(4x) + C₃e^(−3x), C₁, C₂, C₃ ∈ R. □
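If we only need the roots of a characteristic polynomial numerically, a one-line computation suffices. This is a small illustrative sketch using numpy, purely as a numerical cross-check of the factorizations used here and in the next exercise; it is not part of the method itself.

import numpy as np

print(np.roots([1, -5, -8, 48]))   # polynomial of 8.141: roots 4, 4, -3
print(np.roots([1, 1, 9, 9]))      # polynomial of 8.142: roots -1, 3i, -3i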
8.142. Compute

y''' + y'' + 9y' + 9y = eˣ + 10 cos(3x).

Solution. First, we solve the corresponding homogeneous equation. The characteristic polynomial is λ³ + λ² + 9λ + 9, with roots λ₁ = −1, λ₂ = 3i, λ₃ = −3i. The general solution of the corresponding homogeneous equation is thus

y = C₁e^(−x) + C₂ cos(3x) + C₃ sin(3x), C₁, C₂, C₃ ∈ R.

The solution of the non-homogeneous equation is of the form

y = C₁e^(−x) + C₂ cos(3x) + C₃ sin(3x) + y_p, C₁, C₂, C₃ ∈ R,

for a particular solution y_p of the non-homogeneous equation. The right-hand side of the given equation is of a special form. In general, if the non-homogeneous part is given by a function

Pₙ(x) e^(αx),

where Pₙ is a polynomial of degree n, then there is a particular solution of the form

y_p = x^k Rₙ(x) e^(αx),

where k is the multiplicity of α as a root of the characteristic polynomial and Rₙ is a polynomial of degree at most n. More generally, if the non-homogeneous part is of the form

e^(αx) [P_m(x) cos(βx) + Sₙ(x) sin(βx)],

where P_m is a polynomial of degree m and Sₙ is a polynomial of degree n, there exists a particular solution of the form

y_p = x^k e^(αx) [R_l(x) cos(βx) + T_l(x) sin(βx)],

where k is the multiplicity of α + iβ as a root of the characteristic polynomial and R_l, T_l are polynomials of degree at most l = max{m, n}.

In our problem, the non-homogeneous part is a sum of two functions of the special form (see above). Therefore, we look for (two) corresponding particular solutions using the method of undetermined coefficients, and then we add up these solutions. This gives a particular solution of the original equation (and then the general solution as well). Let us begin with the function y = eˣ, which leads to a particular solution of the form y_p1(x) = Aeˣ for some A ∈ R. Since

y_p1(x) = y'_p1(x) = y''_p1(x) = y'''_p1(x) = Aeˣ,

substitution into the original equation, whose right-hand side contains only the function y = eˣ, leads to

20Aeˣ = eˣ, i.e. A = 1/20.

For the right-hand side with the function y = 10 cos(3x), we look for a particular solution in the form

y_p2(x) = x [B cos(3x) + C sin(3x)].

Recall that the number 3i was obtained as a root of the characteristic polynomial (with multiplicity one). We can easily compute the derivatives

y'_p2(x) = [B cos(3x) + C sin(3x)] + x[−3B sin(3x) + 3C cos(3x)],
y''_p2(x) = 2[−3B sin(3x) + 3C cos(3x)] + x[−9B cos(3x) − 9C sin(3x)],
y'''_p2(x) = 3[−9B cos(3x) − 9C sin(3x)] + x[27B sin(3x) − 27C cos(3x)].

Substituting them into the equation, whose right-hand side contains the function y = 10 cos(3x), we get

−18B cos(3x) − 18C sin(3x) − 6B sin(3x) + 6C cos(3x) = 10 cos(3x).

Comparing the coefficients leads to the system of linear equations

−18B + 6C = 10, −18C − 6B = 0

with the only solution B = −1/2 and C = 1/6, i.e.,

y_p2(x) = x [−½ cos(3x) + ⅙ sin(3x)].

Altogether, the general solution is

y = C₁e^(−x) + C₂ cos(3x) + C₃ sin(3x) + (1/20)eˣ − ½ x cos(3x) + ⅙ x sin(3x), C₁, C₂, C₃ ∈ R. □
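The bookkeeping of the method of undetermined coefficients is easy to automate. The following sympy sketch (our illustration, using the same ansatz as the computation above) recovers A = 1/20, B = −1/2 and C = 1/6.

from sympy import symbols, diff, exp, cos, sin, Eq, solve

x, A, B, C = symbols('x A B C')
D = lambda y: diff(y, x, 3) + diff(y, x, 2) + 9*diff(y, x) + 9*y

# ansatz A*e^x for the right-hand side e^x
print(solve(Eq(D(A*exp(x)), exp(x)), A))                       # [1/20]

# ansatz for 10*cos(3x); the terms containing x cancel identically
r = (D(x*(B*cos(3*x) + C*sin(3*x))) - 10*cos(3*x)).expand()
print(solve([r.coeff(cos(3*x)), r.coeff(sin(3*x))], [B, C]))   # {B: -1/2, C: 1/6}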
8.50. Iterative approximations of solutions. The proof of the previous theorem can be reformulated as an iterative procedure which provides approximate solutions by step-by-step integration. With a concrete bound for the constant c from the proof, we even get direct bounds for the errors. Try to think this out as an exercise (see the proof of the Banach fixed-point theorem in paragraph 7.19). It can then be shown quite easily and directly that we obtain a uniformly convergent sequence of continuous functions, so the limit is again a continuous function (without invoking the complicated theorems from the seventh chapter).

Picard's approximations

The unique solution of the equation

y'(t) = f(t, y(t)),

whose right-hand side f has continuous partial derivatives, can be expressed, on a sufficiently small interval, as the limit of the step-by-step iterations beginning with the constant function (the so-called Picard approximations):

y₀(t) = y₀, y_{n+1}(t) = L(yₙ)(t), n = 0, 1, ....

It is a uniformly converging sequence of continuous functions with a continuous limit y(t).

Let us notice that in the proof we actually needed only the Lipschitz continuity of the function f in the variable y, so the theorem holds under this weaker assumption as well. We will show in the next paragraph that mere continuity of the function f guarantees the existence of a solution too, yet it is insufficient for the uniqueness.
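Picard's approximations can be carried out symbolically. A small sketch, with the illustrative choice f(t, y) = y and y₀ = 1 of our own (so that the iterations produce the partial sums of the exponential series):

from sympy import Integer, symbols, integrate

t, s = symbols('t s')

y = Integer(1)                       # y_0(t) = y0 = 1
for _ in range(5):                   # y_{n+1} = L(y_n) = 1 + int_0^t y_n(s) ds
    y = 1 + integrate(y.subs(t, s), (s, 0, t))
print(y)                             # 1 + t + t**2/2 + ... + t**5/120

Each iteration adds one more Taylor term of the exact solution e^t, illustrating the uniform convergence on bounded intervals.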
8.51. Ambiguity of solutions. Let us begin with a really simple example. Consider the equation

y'(t) = √|y(t)|.

Separating the variables, we can easily find the solution

y(t) = ¼(t + C)²

for positive values of y, with an arbitrary constant C and t + C > 0. For initial values (t₀, y₀) with y₀ ≠ 0, this is an assignment matching the previous theorem, so there is locally exactly one solution. Every solution must apparently be non-decreasing; hence, for negative values y₀, we get the same solution, only with the opposite sign and with t + C < 0. However, for the initial condition (t₀, y₀) = (t₀, 0), we have not only the already discussed solutions continuing to the left and to the right of t₀, but also the identically zero solution y(t) ≡ 0. These branches can therefore be glued together arbitrarily, see the picture.

Nevertheless, the existence of a solution is guaranteed by the following theorem, known as the Peano existence theorem:

Theorem. Consider a function f(t, y): R² → R which is continuous on an open set U. Then for every point (t₀, y₀) ∈ U, there exists a continuous solution of the equation y'(t) = f(t, y(t)) locally in some neighborhood of t₀.

Proof. The proof will be presented only roughly, leaving the details to the reader. Instead of using Picard's approximations, we proceed quite naively. We construct a solution to the right of the initial point t₀. To this purpose, we select a small step h > 0 and label the points t_k = t₀ + kh, k = 1, 2, .... The value of the derivative f(t₀, y₀) of the corresponding curve of the solution (t, y(t)) is given at the initial point (t₀, y₀), so we can substitute a parametrized line with the same derivative:

y⁽⁰⁾(t) = y₀ + f(t₀, y₀)(t − t₀),

and we label y₁ = y⁽⁰⁾(t₁). We thus inductively construct the functions and points

y⁽ᵏ⁾(t) = y_k + f(t_k, y_k)(t − t_k), y_{k+1} = y⁽ᵏ⁾(t_{k+1}).

Now, we define y_h(t) by gluing the particular linear parts, i.e.,

y_h(t) = y⁽ᵏ⁾(t) for all t ∈ [t₀ + kh, t₀ + (k+1)h].

This is clearly a continuous function, which is called the Euler approximation of the solution. Now it "only" remains to prove that the limit of the functions y_h for h approaching zero exists and is a solution. For this, we notice (as we have already done in the proof of the theorem on uniqueness and existence of the solution) that, thanks to f(t, y) being uniformly continuous on the neighborhood U where we are looking for a solution, we have, for any selected ε > 0, a δ > 0 such that

|f(t, y) − f(s, z)| < ε whenever |t − s| < δ and |y − z| < δ.

Moreover, the functions y_h have uniformly bounded derivatives (wherever the derivatives exist), so they form a uniformly bounded and equicontinuous set, and we can choose a sequence of steps hₙ → 0 such that the corresponding sequence of functions y_{hₙ} converges uniformly to a continuous function y(t). Further, let us write more simply yₙ(t) = y_{hₙ}(t) → y(t).

Each of the continuous functions yₙ has only finitely many points in the interval [t₀, t] where it is not differentiable, so we can write

yₙ(t) = y₀ + ∫_{t₀}^{t} y'ₙ(s) ds.

On the other hand, the derivatives on the particular intervals are constant, so we can write (here, k is the largest index such that t₀ + k hₙ ≤ t, while the y_j and t_j are the points from the definition of the function yₙ)

yₙ(t) = y₀ + Σ_{j=0}^{k−1} ∫_{t_j}^{t_{j+1}} f(t_j, y_j) ds + ∫_{t_k}^{t} f(t_k, y_k) ds.

Instead, we would like to see

yₙ(t) = y₀ + ∫_{t₀}^{t} f(s, yₙ(s)) ds,

but the difference between this integral and the last two terms in the previous expression is bounded by the possible differences of the values of the function f(t, y) over the particular small intervals, and these are controlled by the uniform continuity above. We can thus use just the last integral instead of the actual values in the limit process lim_{n→∞} yₙ(t), thereby obtaining

y(t) = lim_{n→∞} (y₀ + ∫_{t₀}^{t} f(s, yₙ(s)) ds) = y₀ + ∫_{t₀}^{t} (lim_{n→∞} f(s, yₙ(s))) ds = y₀ + ∫_{t₀}^{t} f(s, y(s)) ds,

where we used the uniform convergence yₙ(t) → y(t). This proves the theorem. □

8.143. Determine the general solution of the equation

y'' + 3y' + 2y = e^(−2x).

Solution. The given equation is a second-order (the highest derivative of the wanted function is of order two) linear (all derivatives occur in the first power) differential equation with constant coefficients. First, we solve the homogeneous equation y'' + 3y' + 2y = 0. Its characteristic polynomial is

λ² + 3λ + 2 = (λ + 1)(λ + 2),

with roots λ₁ = −1 and λ₂ = −2. Hence, the general solution of the homogeneous equation is

c₁e^(−x) + c₂e^(−2x),

where c₁, c₂ are arbitrary real constants. Now, using the method of undetermined coefficients, we find a particular solution of the original non-homogeneous equation. According to the form of the non-homogeneity, and since −2 is a root of the characteristic polynomial of the given equation, we look for the solution in the form y₀ = a x e^(−2x) for a ∈ R. Substituting into the original equation, we obtain

a[−4e^(−2x) + 4x e^(−2x) + 3(e^(−2x) − 2x e^(−2x)) + 2x e^(−2x)] = e^(−2x),

i.e., −a e^(−2x) = e^(−2x), hence a = −1. We have thus found the function −x e^(−2x) as a particular solution of the given equation. Hence, the general solution is

c₁e^(−x) + c₂e^(−2x) − x e^(−2x), c₁, c₂ ∈ R. □

8.144. Determine the general solution of the equation

y'' + y' = 1.

Solution. The characteristic polynomial of the given equation is λ² + λ, with roots 0 and −1. Therefore, the general solution of the homogeneous equation is c₁ + c₂e^(−x), where c₁, c₂ ∈ R. We look for a particular solution in the form ax, a ∈ R (since zero is a root of the characteristic polynomial). Substituting into the original equation, we get a = 1. The general solution of the given non-homogeneous equation is thus

c₁ + c₂e^(−x) + x, c₁, c₂ ∈ R. □

8.145. Determine the general solution of the equation

y'' + 5y' + 6y = e^(−2x).

Solution. The characteristic polynomial of the equation is λ² + 5λ + 6 = (λ + 2)(λ + 3); its roots are −2 and −3. The general solution of the homogeneous equation is thus c₁e^(−2x) + c₂e^(−3x), c₁, c₂ ∈ R. Using the method of undetermined coefficients, we look for a particular solution in the form a x e^(−2x) (−2 is a root of the characteristic polynomial), a ∈ R. Substitution into the original equation yields a = 1. Hence, the general solution of the given equation is

c₁e^(−2x) + c₂e^(−3x) + x e^(−2x), c₁, c₂ ∈ R. □
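Equations like 8.143-8.145 can also be handed to a computer algebra system directly. A sketch with sympy's dsolve, checking 8.143 (the integration constants may come out relabelled or combined, but the result should agree with the general solution above):

from sympy import Function, dsolve, symbols, diff, exp

x = symbols('x')
y = Function('y')

eq = diff(y(x), x, 2) + 3*diff(y(x), x) + 2*y(x) - exp(-2*x)
print(dsolve(eq, y(x)))
# expected, up to renaming constants: y(x) = C1*exp(-x) + (C2 - x)*exp(-2*x)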
8.146. Determine the general solution of the equation

y'' − y' = 5.

Solution. The characteristic polynomial of the equation is λ² − λ, with roots 1 and 0. Therefore, the general solution of the homogeneous equation is c₁ + c₂eˣ, where c₁, c₂ ∈ R. We look for a particular solution in the form ax, a ∈ R, using the method of undetermined coefficients. The result is a = −5, and the general solution is of the form

c₁ + c₂eˣ − 5x, c₁, c₂ ∈ R. □

8.147. Solve the equation

y'' − 2y' + y = eˣ/(x² + 1).

Solution. We solve this non-homogeneous equation using the method of variation of constants. We obtain the solution in the form

y = C₁(x)y₁(x) + C₂(x)y₂(x) + ⋯ + Cₙ(x)yₙ(x),

where y₁, ..., yₙ give the general solution of the corresponding homogeneous equation, and the functions C₁(x), ..., Cₙ(x) can be obtained from the system

C′₁(x)y₁(x) + ⋯ + C′ₙ(x)yₙ(x) = 0,
C′₁(x)y′₁(x) + ⋯ + C′ₙ(x)y′ₙ(x) = 0,
⋮
C′₁(x)y₁⁽ⁿ⁻²⁾(x) + ⋯ + C′ₙ(x)yₙ⁽ⁿ⁻²⁾(x) = 0,
C′₁(x)y₁⁽ⁿ⁻¹⁾(x) + ⋯ + C′ₙ(x)yₙ⁽ⁿ⁻¹⁾(x) = f(x).

The roots of the characteristic polynomial λ² − 2λ + 1 are λ₁ = λ₂ = 1. Therefore, we look for the solution in the form C₁(x)eˣ + C₂(x)x eˣ, considering the system

C′₁(x)eˣ + C′₂(x)x eˣ = 0,
C′₁(x)eˣ + C′₂(x)[eˣ + x eˣ] = eˣ/(x² + 1).

We can compute the unknowns C′₁(x) and C′₂(x) using Cramer's rule. It follows from

det [[eˣ, x eˣ], [eˣ, eˣ + x eˣ]] = e^(2x),
det [[0, x eˣ], [eˣ/(x² + 1), eˣ + x eˣ]] = −x e^(2x)/(x² + 1),
det [[eˣ, 0], [eˣ, eˣ/(x² + 1)]] = e^(2x)/(x² + 1)

that

C′₁(x) = −x/(x² + 1), C′₂(x) = 1/(x² + 1),

and hence

C₁(x) = ∫ −x/(x² + 1) dx = −½ ln(x² + 1) + C₁, C₁ ∈ R,
C₂(x) = ∫ 1/(x² + 1) dx = arctan(x) + C₂, C₂ ∈ R.

The general solution is therefore

y = (C₁ − ½ ln(x² + 1)) eˣ + (C₂ + arctan x) x eˣ, C₁, C₂ ∈ R. □
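The variation of constants in 8.147 can be reproduced mechanically: build the Wronskian, apply Cramer's rule, and integrate. A sympy sketch, purely as our own cross-check of the computation above:

from sympy import symbols, exp, diff, integrate, Matrix, simplify

x = symbols('x')
y1, y2 = exp(x), x*exp(x)                # fundamental system for the double root 1
f = exp(x) / (x**2 + 1)                  # right-hand side

W = Matrix([[y1, y2], [diff(y1, x), diff(y2, x)]]).det()   # Wronskian, = e^{2x}
C1 = integrate(-y2*f/W, x)               # -> -log(x**2 + 1)/2
C2 = integrate( y1*f/W, x)               # -> atan(x)
print(C1, C2, simplify(C1*y1 + C2*y2))   # a particular solution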
8.52. Systems of first-order equations. The problem of finding the solution of the equation y'(x) = f(x, y) can also be viewed as looking for a (parametrized) curve (x(t), y(t)) in the plane, where we have fixed the parametrization of the variable x(t) = t beforehand. However, if we accept this point of view, then we can forget this fixed choice for one variable, and we can add an arbitrary number of variables. In the plane, for instance, we can write such a system in the form

x'(t) = f(t, x(t), y(t)), y'(t) = g(t, x(t), y(t))

with two functions f, g: R³ → R. Similarly for more variables. A simple example in the plane might be the system of equations

x'(t) = −y(t), y'(t) = x(t).

It can be easily guessed (or verified at least) that there is a solution of this system,

x(t) = R cos t, y(t) = R sin t,

with an arbitrary non-negative constant R, and the curves of the solutions are exactly the parametrized circles with radius R.

In the general case, we work with the vector notation of the system in the form

x'(t) = f(t, x(t))

for a vector function x: R → Rⁿ and a mapping f: R^(n+1) → Rⁿ. We are able to extend the validity of the theorem on uniqueness and existence of the solutions to such systems:

Existence and uniqueness for systems of ODEs

Theorem. Consider functions fᵢ(t, x₁, ..., xₙ): R^(n+1) → R, i = 1, ..., n, with continuous partial derivatives. Then, for every point (t₀, x₁, ..., xₙ) ∈ R^(n+1), there exists a maximal interval [t₀ − a, t₀ + b], with positive numbers a, b ∈ R, and a unique function x(t): [t₀ − a, t₀ + b] → Rⁿ which is the solution of the system of equations

x′₁(t) = f₁(t, x₁(t), ..., xₙ(t)),
⋮
x′ₙ(t) = fₙ(t, x₁(t), ..., xₙ(t))

with the initial condition

x₁(t₀) = x₁, ..., xₙ(t₀) = xₙ.

For the proof, we consider the operator L(y) once again, this time mapping curves in Rⁿ to curves in Rⁿ, and we look for its fixed point. Since the Euclidean distance of two points in Rⁿ is always bounded from above by the sum of the absolute values of the differences of the particular components, the proof goes on in much the same way as in the case of 8.49. We only need to notice that the size of the vector

|f(t, z₁, ..., zₙ) − f(t, y₁, ..., yₙ)|

is bounded from above by the sum

|f(t, z₁, ..., zₙ) − f(t, y₁, z₂, ..., zₙ)| + ⋯ + |f(t, y₁, ..., y_{n−1}, zₙ) − f(t, y₁, ..., yₙ)|.

We recommend to go through the proof of Theorem 8.49 from this point of view and to think out the details. □

When we introduce and examine models of real systems, the so-called qualitative behavior of the solutions, in dependence on the initial conditions and on free parameters of the system (i.e., constants or functions), is essential. As a quite simple example of a system of first-order equations, let us mention the standard population model "predator - prey", which was introduced in the 1920s by Lotka and Volterra. Let x(t) denote the evolution of the number of individuals in the prey population and y(t) that of the predators. We assume that the increase of the prey would correspond to the Malthusian model (i.e., exponential growth with coefficient α) if they were not hunted. On the other hand, we assume that the predators would only naturally die out (i.e., exponential decrease with coefficient γ). Further, we consider an interaction of the predator and the prey which is expected to be proportional to the number of both, with a certain coefficient β, which is, in the case of the predator, supplemented by a multiplicative coefficient δ expressing the hunting efficiency. We get a system of two equations:

Lotka-Volterra model

x'(t) = αx(t) − βy(t)x(t),
y'(t) = −γy(t) + δβx(t)y(t).

It is interesting that the same model captures quite well the progress of unemployment in a system limited to employers and their employees, considering the employees to be the predators, while the employers play the role of the prey.

Much information about this and other models can be found in the literature.
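The qualitative behavior of the Lotka-Volterra system is easy to explore numerically. A minimal sketch (with illustrative parameter values α = 1, β = 0.5, γ = 1, δ = 1 of our own choice) integrates the system with a Runge-Kutta scheme; the printed extremes indicate that the orbit stays within a bounded band, circling around the equilibrium.

import numpy as np

alpha, beta, gamma, delta = 1.0, 0.5, 1.0, 1.0

def field(s):
    x, y = s
    return np.array([alpha*x - beta*y*x, -gamma*y + delta*beta*x*y])

def rk4_step(s, h):
    k1 = field(s)
    k2 = field(s + h/2*k1)
    k3 = field(s + h/2*k2)
    k4 = field(s + h*k3)
    return s + h/6*(k1 + 2*k2 + 2*k3 + k4)

state = np.array([3.0, 1.0])             # initial prey and predator sizes
orbit = [state]
for _ in range(4000):
    state = rk4_step(state, 0.005)
    orbit.append(state)
orbit = np.array(orbit)
print(orbit.min(axis=0), orbit.max(axis=0))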
... i.e.,

L(y)(s) = 2/(s² + 4)².

The inverse transform leads to

y(t) = ⅛ sin(2t) − ¼ t cos(2t). □

8.151. Find the function y(t) which satisfies the differential equation

y''(t) + 6y'(t) + 9y(t) = 50 sin t

and the initial conditions y(0) = 1, y'(0) = 4.

Solution. The Laplace transform yields

s²L(y)(s) − s − 4 + 6(sL(y)(s) − 1) + 9L(y)(s) = 50 L(sin t)(s),

i.e.,

(s² + 6s + 9) L(y)(s) = 50/(s² + 1) + s + 10,
L(y)(s) = 50/((s² + 1)(s + 3)²) + (s + 10)/(s + 3)².

Decomposing the first term into partial fractions, we obtain

50/((s² + 1)(s + 3)²) = (As + B)/(s² + 1) + C/(s + 3) + D/(s + 3)²,

so

50 = (As + B)(s + 3)² + C(s² + 1)(s + 3) + D(s² + 1).

Substituting s = −3, we get 50 = 10D, hence D = 5. Comparing the coefficients at s³, we have 0 = A + C, hence A = −C. Comparing the coefficients at s, we obtain

0 = 9A + 6B + C = 8A + 6B, hence B = −(4/3)A = (4/3)C.

Finally, comparing the absolute terms, we infer

50 = 9B + 3C + D = 12C + 3C + 5, hence C = 3, B = 4, A = −3.

Since

(s + 10)/(s + 3)² = ((s + 3) + 7)/(s + 3)² = 1/(s + 3) + 7/(s + 3)²,

we have

L(y)(s) = (−3s + 4)/(s² + 1) + 3/(s + 3) + 5/(s + 3)² + 1/(s + 3) + 7/(s + 3)²
= −3s/(s² + 1) + 4/(s² + 1) + 4/(s + 3) + 12/(s + 3)².

Now, the inverse Laplace transform yields the solution in the form

y(t) = −3 cos t + 4 sin t + 4e^(−3t) + 12t e^(−3t). □

8.53. Continuous dependence of solutions. Let two functions f(t, x) and g(t, x): U → Rⁿ have continuous partial derivatives on an open set U ⊂ R^(n+1) with compact closure. Such functions must be uniformly continuous and uniformly Lipschitz on U, so we can define the finite values

C = sup_{x≠y, (t,x),(t,y)∈U} |f(t, x) − f(t, y)| / |x − y|,
B = sup_{(t,x)∈U} |f(t, x) − g(t, x)|.

With this notation, we can formulate our fundamental theorem:

Theorem. Let x(t) and y(t) be two fixed solutions

x'(t) = f(t, x(t)), y'(t) = g(t, y(t))

of the systems considered above, given by the initial conditions x(t₀) = x₀ and y(t₀) = y₀. Then

|x(t) − y(t)| ≤ |x₀ − y₀| e^(C|t−t₀|) + (B/C)(e^(C|t−t₀|) − 1).

Proof. Without loss of generality, we can assume that t₀ = 0. From the expression of the solutions x(t) and y(t) as fixed points of the corresponding integral operators, we immediately get the bound

|x(t) − y(t)| ≤ |x₀ − y₀| + ∫₀ᵗ |f(s, x(s)) − g(s, y(s))| ds.

The integrand can be further bounded as follows:

|f(s, x(s)) − g(s, y(s))| ≤ |f(s, x(s)) − f(s, y(s))| + |f(s, y(s)) − g(s, y(s))| ≤ C |x(s) − y(s)| + B.

If we denote F(t) = |x(t) − y(t)| and a = |x₀ − y₀|, we can write our bound as

F(t) ≤ a + ∫₀ᵗ (C F(s) + B) ds.

Such a bound can be exploited quite easily thanks to the following general result, which is known as Gronwall's inequality. Notice the similarity with the general solution of linear equations.

Lemma. Let a real-valued function F(t) satisfy, for all t from an interval [0, t_max],

F(t) ≤ α(t) + ∫₀ᵗ β(s) F(s) ds

with a continuous function α(t) and a non-negative continuous function β(t). Then

F(t) ≤ α(t) + ∫₀ᵗ α(s) β(s) e^(∫ₛᵗ β(r) dr) ds.

In particular, if α(t) = a is constant, then

F(t) ≤ a e^(∫₀ᵗ β(r) dr).

To prove the lemma, write G(t) = ∫₀ᵗ β(s)F(s) ds. Then G'(t) = β(t)F(t) ≤ β(t)α(t) + β(t)G(t), and multiplying by the positive function e^(−∫₀ᵗ β(r) dr), we obtain

(G(t) e^(−∫₀ᵗ β(r) dr))' ≤ α(t) β(t) e^(−∫₀ᵗ β(r) dr).

Integrating this inequality and substituting into F(t) ≤ α(t) + G(t) yields the first proposition. If α(t) = a is constant, the integral can be evaluated explicitly:

F(t) ≤ a (1 + ∫₀ᵗ β(s) e^(∫ₛᵗ β(r) dr) ds) = a e^(∫₀ᵗ β(r) dr),

since the integrand is the derivative of −e^(∫ₛᵗ β(r) dr) with respect to s. The second proposition of the lemma has thus been proved as well. □

Now, we can finish the proof of the theorem about continuous dependency upon the parameters. We have already obtained the bound F(t) ≤ a + ∫₀ᵗ (C F(s) + B) ds, and using the slightly modified function F̃(t) = F(t) + B/C, this yields

F̃(t) ≤ B/C + a + ∫₀ᵗ C F̃(s) ds.

This is the assumption of Gronwall's inequality with (even) constant parameters, so by the second proposition of the lemma, we get

F(t) + B/C = F̃(t) ≤ (a + B/C) e^(∫₀ᵗ C ds) = (a + B/C) e^(Ct),

which is the statement we wanted to prove. □
8.153. Find the functions x(t) and y(t), t ∈ (0, +∞), satisfying the system of differential equations

x''(t) + x'(t) = y(t) − y''(t) + eᵗ,
x'(t) + 2x(t) = y'(t) − y(t) + e^(−t)

with the initial conditions x(0) = 0, x'(0) = 1, y(0) = 0, y'(0) = 0.

Solution. Again, we apply the Laplace transform. Using L(e^(±t))(s) = 1/(s ∓ 1), it transforms the first equation to

s²L(x)(s) − s lim_{t→0+} x(t) − lim_{t→0+} x'(t) + sL(x)(s) − lim_{t→0+} x(t)
= L(y)(s) − (s²L(y)(s) − s lim_{t→0+} y(t) − lim_{t→0+} y'(t)) + 1/(s − 1),

and the second one to

sL(x)(s) − lim_{t→0+} x(t) + 2L(x)(s) = −L(y)(s) + sL(y)(s) − lim_{t→0+} y(t) + 1/(s + 1).

Evaluating the limits (according to the initial conditions), we obtain the linear equations

s²L(x)(s) − 1 + sL(x)(s) = L(y)(s) − s²L(y)(s) + 1/(s − 1)

and

sL(x)(s) + 2L(x)(s) = −L(y)(s) + sL(y)(s) + 1/(s + 1),

with the only solution

L(x)(s) = (2s − 1)/(2(s − 1)(s + 1)²), L(y)(s) = 3s/(2(s² − 1)²).

Once again, we perform the partial fraction decomposition, getting

L(x)(s) = 1/(8(s − 1)) − 1/(8(s + 1)) + 3/(4(s + 1)²).

Since we have already computed that

L(t e^(−t))(s) = 1/(s + 1)², L(sinh t)(s) = 1/(s² − 1), L(t sinh t)(s) = 2s/(s² − 1)²,

we get

x(t) = ¾ t e^(−t) + ¼ sinh t, y(t) = ¾ t sinh t.

We definitely advise the reader to verify that these functions x and y are indeed the wanted solution. The reason is that the Laplace transforms of the functions y = eᵗ, y = sinh t and y = t sinh t were obtained only for s > 1. □

8.154. Find the solution of the following system of differential equations:

x'(t) = −2x(t) + 3y(t) + 3t²,
y'(t) = −4x(t) + 5y(t) + eᵗ,
x(0) = 1, y(0) = −1.

Solution.

L(x')(s) = L(−2x + 3y + 3t²)(s),
L(y')(s) = L(−4x + 5y + eᵗ)(s).

The left-hand sides can be written using (||8.111||), while the right-hand sides can be rewritten thanks to the linearity of the operator L. Since L(3t²)(s) = 6/s³ and L(eᵗ)(s) = 1/(s − 1), we get the system of linear equations

sL(x)(s) − 1 = −2L(x)(s) + 3L(y)(s) + 6/s³,
sL(y)(s) + 1 = −4L(x)(s) + 5L(y)(s) + 1/(s − 1).

In matrices, this is A(s) · x̂(s) = b(s), where

A(s) = [[s + 2, −3], [4, s − 5]], x̂(s) = (L(x)(s), L(y)(s))ᵀ, b(s) = (1 + 6/s³, −1 + 1/(s − 1))ᵀ.

Cramer's rule says that

L(x)(s) = |A₁| / |A|, L(y)(s) = |A₂| / |A|,

where

|A| = (s + 2)(s − 5) + 12 = s² − 3s + 2 = (s − 1)(s − 2),
|A₁| = (1 + 6/s³)(s − 5) + 3(−1 + 1/(s − 1)),
|A₂| = (s + 2)(−1 + 1/(s − 1)) − 4(1 + 6/s³).

Hence,

L(x)(s) = ((s − 5)(s³ + 6)/s³ − 3 + 3/(s − 1)) / ((s − 1)(s − 2)),
L(y)(s) = ((s + 2)(2 − s)/(s − 1) − (4s³ + 24)/s³) / ((s − 1)(s − 2)).

Decomposing into partial fractions, the Laplace images of the solutions can be expressed as follows:

L(x)(s) = −87/(4s) − 39/(2s²) − 15/s³ + 28/(s − 1) − 3/(s − 1)² − 21/(4(s − 2)),
L(y)(s) = −21/s − 18/s² − 12/s³ + 27/(s − 1) − 3/(s − 1)² − 7/(s − 2).

Now, the inverse transform yields the solution of this Cauchy problem:

x(t) = −87/4 − (39/2)t − (15/2)t² + 28eᵗ − 3t eᵗ − (21/4)e^(2t),
y(t) = −21 − 18t − 6t² + 27eᵗ − 3t eᵗ − 7e^(2t). □
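Following the advice given in 8.153, it is prudent to verify the result of 8.154 by substitution as well. A sympy sketch:

from sympy import symbols, exp, Rational, diff, simplify

t = symbols('t')
x = -Rational(87, 4) - Rational(39, 2)*t - Rational(15, 2)*t**2 \
    + 28*exp(t) - 3*t*exp(t) - Rational(21, 4)*exp(2*t)
y = -21 - 18*t - 6*t**2 + 27*exp(t) - 3*t*exp(t) - 7*exp(2*t)

print(simplify(diff(x, t) + 2*x - 3*y - 3*t**2))   # 0
print(simplify(diff(y, t) + 4*x - 5*y - exp(t)))   # 0
print(x.subs(t, 0), y.subs(t, 0))                  # 1, -1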
Find the solution to the so-called equation of heat conduction (equation of diffusion) ut{x, t) = a2 uxx(x, t), x e R, t > 0 satisfying the initial condition lim u (x,t) = f(x). t^0+ Notes: The symbol ut = ^ stands for the partial derivative of the the u with respect to t (i. e., differentiating with respect to t and considering x to be constant), and similarly, ux dx2 denotes the second partial derivative with respect to x (i. e., twice differentiating with respect to x while considering t to be constant). The physical interpretation of this problem is as follows: We are trying to determine the temperature u (x, t) in an thermally isolated and homogeneous bar of infinite length (the range of the variable x) if the initial temperature of the bar is given as the function /. The section of the bar is constant and the heat can spread in it by conduction only. The coefficient a2 then equals the quotient ^, where a is the coefficient of thermal conductivity, c is the specific heat and q is the density. In particular, we assume that a2 > 0. Solution. We apply the Fourier transform to the equation, with respect to variable x. We have T (ut) (co, t) Tic f ut(x, t)e~ia)Xdx 2k f u (x, t) t~imx dx —oo where differentiated with respect to t, i. e., T (ut) (co, t) = (T (u) (co, 0)' = (T (u))t (co, t). At the same time, we know that T (a2 uxx) (co, t) = a2 T (uxx) (co, t) = —a2co2 T (u) (co, t). Denoting y(co, t) = T (u) (co, t), we get to the equation yt = -a2co2 y. We already solved a similar differential equation when we were calculating Fourier transforms, so it is now easy for us to determine all of its solutions y(co, t) = K (co) e-fl2ft>\ K (co) e R. It remains to determine K(co). The transformation of the initial condition gives T(f) (co) = lim F(u) (co, t) = lim y(co, t) = K (co)e° = K (co), hence y(co, t) = T (f) (co) e"fl2ftA, K (co) e R. Now, using the inverse Fourier transform, we can return to the original differential equation with solution (x, 0 = 7= f ylfi>, 0 eimx dco f T (f) (co) e-fl2ft>2' eimx dco Dlx{y^(t,x)) =-(Dxy(t,x)) = D1f(y(t,x))-D1xy(t,x). The derivatives with respect to the initial conditions along the solution y(t, x) of the system (8.54) are thus given as the solutions of a system of n2 first-order equations with initial condition (8.10) &(t, x) = F(t, x) ■ ®(t, x), • M" with continuous first derivatives. Then, a system of differential equations dependent upon a parameter X e Rk with initial condition at a point x € U y(t) = f(y(t),X), y(0)=x has a unique solution y(t, x, X), which is a mapping with continuous first derivatives with respect to each variable. Proof. First, we can notice that we can consider a system dependent on parameters to be an ordinary autonomous system with no parameters if we consider even the parame-,«i ters to be space variables and we add (vector) conditions i| X'(t) — 0 and X(0) — X. Therefore, without loss of generality, we can prove the theorem for autonomous systems with no further parameters and concentrate on the dependency upon the initial conditions. Just like in the case of the fundamental existence theorem, we will build upon Picard's approximations of the solution using the integral operator y0(t, x) — x, yk+i(t,x) = x+ f Jo f(yk(s,x))ds. Merely specifying the proof of this theorem 8.49, we can verify the uniform convergence of the approximations yk (t, x) to the solution y(t, x), including the variable x. 
O. Equation of heat conduction

8.155. Find the solution to the so-called equation of heat conduction (equation of diffusion)

u_t(x, t) = a² u_xx(x, t), x ∈ R, t > 0,

satisfying the initial condition lim_{t→0+} u(x, t) = f(x).

Notes: The symbol u_t = ∂u/∂t stands for the partial derivative of the function u with respect to t (i.e., differentiating with respect to t while considering x to be constant), and similarly, u_xx = ∂²u/∂x² denotes the second partial derivative with respect to x (i.e., twice differentiating with respect to x while considering t to be constant). The physical interpretation of this problem is as follows: we are trying to determine the temperature u(x, t) in a thermally isolated and homogeneous bar of infinite length (the range of the variable x) if the initial temperature of the bar is given by the function f. The cross-section of the bar is constant, and the heat can spread in it by conduction only. The coefficient a² then equals the quotient α/(cϱ), where α is the coefficient of thermal conductivity, c is the specific heat, and ϱ is the density. In particular, we assume that a² > 0.

Solution. We apply the Fourier transform to the equation, with respect to the variable x. We have

F(u_t)(ω, t) = (1/√(2π)) ∫_{−∞}^{∞} u_t(x, t) e^(−iωx) dx = ∂/∂t ((1/√(2π)) ∫_{−∞}^{∞} u(x, t) e^(−iωx) dx),

i.e., F(u_t)(ω, t) = (F(u))_t(ω, t). At the same time, we know that

F(a² u_xx)(ω, t) = a² F(u_xx)(ω, t) = −a²ω² F(u)(ω, t).

Denoting y(ω, t) = F(u)(ω, t), we get to the equation

y_t = −a²ω² y.

We have already solved a similar differential equation when we were calculating Fourier transforms, so it is now easy for us to determine all of its solutions:

y(ω, t) = K(ω) e^(−a²ω²t), K(ω) ∈ R.

It remains to determine K(ω). The transform of the initial condition gives

F(f)(ω) = lim_{t→0+} F(u)(ω, t) = lim_{t→0+} y(ω, t) = K(ω) e⁰ = K(ω),

hence

y(ω, t) = F(f)(ω) e^(−a²ω²t).

Now, using the inverse Fourier transform, we can return to the original differential equation, with the solution

u(x, t) = (1/√(2π)) ∫_{−∞}^{∞} y(ω, t) e^(iωx) dω = (1/√(2π)) ∫_{−∞}^{∞} F(f)(ω) e^(−a²ω²t) e^(iωx) dω
= (1/√(2π)) ∫_{−∞}^{∞} ((1/√(2π)) ∫_{−∞}^{∞} f(s) e^(−iωs) ds) e^(−a²ω²t) e^(iωx) dω
= (1/(2π)) ∫_{−∞}^{∞} f(s) (∫_{−∞}^{∞} e^(−a²ω²t) e^(−iω(s−x)) dω) ds.

Computing the Fourier transform F(f) of the function f(t) = e^(−at²) for a > 0, we have obtained (while relabeling the variables)

∫_{−∞}^{∞} e^(−cp²) e^(−ipr) dp = √(π/c) e^(−r²/(4c)), c > 0.

According to this formula (consider c = a²t > 0, p = ω, r = s − x), we have

∫_{−∞}^{∞} e^(−a²ω²t) e^(−iω(s−x)) dω = √(π/(a²t)) e^(−(s−x)²/(4a²t)).

Therefore,

u(x, t) = (1/(2a√(πt))) ∫_{−∞}^{∞} f(s) e^(−(x−s)²/(4a²t)) ds. □

P. Numerical solution of differential equations

Now, we present two simple exercises on applying the Euler method to solving differential equations.

8.156. Use the Euler method to solve the equation y' = −y² with the initial condition y(1) = 1. Determine the approximate solution on the interval [1, 3]. Try to estimate for which value h of the step the error is less than one tenth.

Solution. The Euler method for the considered equation is given by

y_{k+1} = y_k − h·y_k², where x₀ = 1, y₀ = 1, x_k = x₀ + k·h, y_k ≈ y(x_k).

We begin the procedure with the step value h = 1 and halve it in each iteration. The estimate of the "sufficiency" of h is made somewhat imprecisely by comparing two adjacent approximations of the function y at their common points, terminating the procedure when the maximum of the absolute differences of these values is no greater than the tolerated error (0.1).

The results:

h₀ = 1:
y⁽⁰⁾ = (1, 0, 0).

h₁ = 0.5:
y⁽¹⁾ = (1, 0.5, 0.375, 0.3047, 0.2583).
Maximal difference: 0.375.

h₂ = 0.25:
y⁽²⁾ = (1.0000, 0.7500, 0.6094, 0.5165, 0.4498, 0.3992, 0.3594, 0.3271, 0.3004).
Maximal difference: 0.1094.

h₃ = 0.125:
y⁽³⁾ = (1.0000, 0.8750, 0.7793, 0.7034, 0.6415, 0.5901, 0.5466, 0.5092, 0.4768, 0.4484, 0.4233, 0.4009, 0.3808, 0.3627, 0.3462, 0.3312, 0.3175).
Maximal difference: 0.0322.

The tolerated error is thus reached for the step h = 0.125. □
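The computation of 8.156 is a few lines of code. This sketch reproduces the row of the table for h = 0.5; halving h and comparing adjacent rows then proceeds exactly as described above.

def euler(h, t_end=3.0):
    # Euler's method for y' = -y^2, y(1) = 1
    t, y, values = 1.0, 1.0, [1.0]
    while t < t_end - 1e-12:
        y = y - h*y*y
        t += h
        values.append(y)
    return values

print([round(v, 4) for v in euler(0.5)])
# [1.0, 0.5, 0.375, 0.3047, 0.2583] -- the h = 0.5 row of the table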
Its flow is given by Fl? (xi, x„) — (xi +t,X2,..., x„). On the other hand, the vector field X(t) dimensional space R is not complete as its solutions are of the form 1 f2^- on the one- t i-> C 533 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES □ for initial conditions with to / 0, so they "run away" towards infinite values in a finite time. The description of a vector fields as assigning the tangent vector in the direction space to each point of the Euclidean space is independent of the coordinates. The following theorem thus gives us a geometric local qualitative description of all solutions of systems of ordinary differential equations in a neighborhood of each point x where the given vector field X is non-zero. Theorem. If X is a vector field defined on a neighborhood of a point xq € K" and we have X(xq) ^ 0, then there exists a transformation of coordinates F such that in the new coordinates y — F(x), the vector field X is given as the field Proof. We will construct a diffeomorphism F — (fi,..., /„) step by step. Geometrically, the essence of the proof can be summarized as follows: we select a hyper-> surface which is complementary to the directions X(x), goes through the point xq, then we fix the coordinates on it, and finally, we extend them to some neighborhood of the point xo using the flow of the field X. First, we move the point xo to the origin and use a linear transformation on W in order to achieve X(0) — gfj-(O). Now, let us write in these coordinates (x\,... ,x„) the flow of the field X going through the point (xi,..., x„) at time t — 0 as x,-(0 — (Pi (t,xi,..., x„). We define fi(x\, ..., x„) = Xfi), dxi (0) = (0, 1, .0), i = 2,.. and the same formula holds for i — 1, because we have X — Therefore, the Jacobian matrix of the mapping F at the origin is the identity matrix E, so it is indeed a transformation of coordinates on some neighborhood (see the inverse mapping theorem in paragraph 8.17). Now, directly from the definition of the mapping F in terms of the flow of the vector field X, the flow of the field will be expressed in the new coordinates (yi, ■ ■ ■, y«) as Flf(y!,...,y„) = (yi + t, y2, Verify this by yourselves in detail! yn)- □ 8.56. Higher-order equations. An ordinary differential equation of order k (solved with respect to the highest derivative) is an equation s y(k) (t) = f(t,y(t),y'(t),...,yik-1)(t)), where / is a known function of k+1 variables, x is an independent variable, and y(x) is an unknown function of one variable. We will show that this type of equation is always equivalent to a system of k first-order equations. We introduce new unknown functions in a variable t as follows: y0(0 = y(t), yi(0 = y'0(0, y*-i(0 = ^_2(0- Now, 534 CHAPTER 8. CONTINUOUS MODELS WITH MORE VARIABLES the function y(t) is a solution of our original equation if and only if it is the first component of the system of equations y0(o = yi (o /iCO =y2(0 y„_2(0 = yn-i(t) y„_i(0 = f(t,yo(t), yi(0,---,y«-i(0)- We thus get the following direct corollary of the theorems from 8.52-8.54: ___| Solutions of higher-order ODEs |_ - Theorem. Let a function f(t, yo, yt-i) : U C Rk+1 -> M have continuous partial derivatives on an open set U. 
8.56. Higher-order equations. An ordinary differential equation of order k (solved with respect to the highest derivative) is an equation

y⁽ᵏ⁾(t) = f(t, y(t), y'(t), ..., y⁽ᵏ⁻¹⁾(t)),

where f is a known function of k + 1 variables, t is an independent variable, and y(t) is an unknown function of one variable. We show that this type of equation is always equivalent to a system of k first-order equations. We introduce new unknown functions in the variable t as follows:

y₀(t) = y(t), y₁(t) = y′₀(t), ..., y_{k−1}(t) = y′_{k−2}(t).

Now, the function y(t) is a solution of our original equation if and only if it is the first component of a solution of the system of equations

y′₀(t) = y₁(t),
y′₁(t) = y₂(t),
⋮
y′_{k−2}(t) = y_{k−1}(t),
y′_{k−1}(t) = f(t, y₀(t), y₁(t), ..., y_{k−1}(t)).

We thus get the following direct corollary of the theorems from 8.52-8.54:

Solutions of higher-order ODEs

Theorem. Let a function f(t, y₀, ..., y_{k−1}): U ⊂ R^(k+1) → R have continuous partial derivatives on an open set U. Then, for every point (t₀, z₀, ..., z_{k−1}) ∈ U, there exists a maximal interval I_max = [t₀ − a, t₀ + b], with positive numbers a, b ∈ R, and a unique function y(t): I_max → R which is a solution of the k-th order equation

y⁽ᵏ⁾(t) = f(t, y(t), y'(t), ..., y⁽ᵏ⁻¹⁾(t))

with the initial condition

y(t₀) = z₀, y'(t₀) = z₁, ..., y⁽ᵏ⁻¹⁾(t₀) = z_{k−1}.

Moreover, this solution depends differentiably on the initial condition and on potential further parameters entering the function f differentiably.

We can thus see that, in order to determine a solution of an ordinary k-th order differential equation uniquely, we have to prescribe, at one point, the value and the first k − 1 derivatives of the resulting function. If we worked with a system of ℓ equations of order k, then the same procedure transforms this system into a system of kℓ first-order equations. Therefore, analogous statements about existence, uniqueness, continuity, and differentiability hold again. Of course, stronger properties pass on to the solutions in the cases when the right-hand side f is differentiable up to order k (inclusive) or analytic, including the dependence on the parameters.
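The reduction of a k-th order equation to a first-order system translates directly into code. A small sketch (our own illustration): the harmonic equation y'' = −4y with y(0) = 0, y'(0) = −1, whose exact solution −½ sin(2t) appeared in 8.140, is integrated through its equivalent system with Euler steps.

import numpy as np

def to_first_order(f, k):
    # right-hand side of the equivalent system for y^(k) = f(t, y, ..., y^(k-1));
    # the state vector is u = (y_0, ..., y_{k-1})
    def F(t, u):
        du = np.empty(k)
        du[:-1] = u[1:]          # y_i' = y_{i+1}
        du[-1] = f(t, *u)        # y_{k-1}' = f(t, y_0, ..., y_{k-1})
        return du
    return F

F = to_first_order(lambda t, y, yp: -4.0*y, 2)
u, t, h = np.array([0.0, -1.0]), 0.0, 1e-4
while t < 1.0 - 1e-12:
    u = u + h*F(t, u)            # Euler step, as in the text
    t += h
print(u[0], -0.5*np.sin(2.0))    # numeric value vs exact y(1)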
8.57. Linear differential equations. We have already perceived the operation of differentiation as a linear mapping from (sufficiently) smooth functions to functions. If we multiply the derivatives (d/dt)^j of the particular orders j by fixed functions a_j(t) and add up these expressions, we get the so-called linear differential operator:

y(t) ↦ D(y)(t) = a_k(t)y⁽ᵏ⁾(t) + ⋯ + a₁(t)y'(t) + a₀(t)y(t).

To solve the corresponding homogeneous linear differential equation then means to find a function y satisfying D(y) = 0, i.e., the image is the identically zero function. It is clear straight from the definition that the sum of two solutions is again a solution, since, for any functions y₁ and y₂, we have

D(y₁ + y₂)(t) = D(y₁)(t) + D(y₂)(t).

Analogously, a constant multiple of a solution is again a solution. The set of all solutions of a k-th order homogeneous linear differential equation is thus a vector space. Applying the previous theorem about existence and uniqueness, we get the following:

The space of solutions of linear equations

Theorem. The set of all solutions of a homogeneous linear differential equation of order k is always a vector space of dimension k. Therefore, we can describe all the solutions as linear combinations of any set of k linearly independent solutions. Such solutions are determined uniquely by linearly independent initial conditions on the value of the function y(t) and its first k − 1 derivatives at a fixed point t₀.

Proof. If we choose k linearly independent initial conditions at a fixed point, then we get, for each of them, a unique solution of our equation. A linear combination of these initial conditions then leads to the same linear combination of the corresponding solutions. In this way, we exhaust all of the possible initial conditions, so we get the entire space of solutions of our equation. □

8.58. Linear differential equations with constant coefficients. The previous discussion has surely reminded us of the situation with homogeneous linear difference equations, which we dealt with in paragraph 3.9 of the third chapter. The analogy goes further when all of the coefficients a_j of the differential operator D are constant. We have already seen such first-order equations (8.8), whose solution is an exponential with an appropriate constant in the argument. Just like in the case of difference equations, it suggests itself to try whether such a form of the solution, y(t) = e^(λt) with an unknown parameter λ, can satisfy an equation of order k. Substitution yields

D(e^(λt)) = (a_k λᵏ + a_{k−1} λᵏ⁻¹ + ⋯ + a₁λ + a₀) e^(λt).

The parameter λ thus leads to a solution of a linear differential equation with constant coefficients if and only if λ is a root of the so-called characteristic polynomial

a_k λᵏ + ⋯ + a₁λ + a₀.

If this polynomial has k distinct roots, then we get the basis of the whole vector space of solutions. Otherwise, if λ is a multiple root, then a direct calculation, making use of the fact that λ is then a root of the derivative of the characteristic polynomial as well, shows that the function y(t) = t e^(λt) is also a solution. Similarly, for a root λ of higher multiplicity ℓ, we get the ℓ distinct solutions

e^(λt), t e^(λt), ..., t^(ℓ−1) e^(λt).

In the case of a general linear differential equation, we prescribe a non-zero value of the differential operator D,

D(y)(t) = b(t).

Again, analogously to the reasoning about systems of linear equations or linear difference equations, we can see that the general solution of this type of (non-homogeneous) equation for a fixed function b(t) is the sum of one arbitrary (particular) solution of this equation and of the set of all solutions of the corresponding homogeneous equation D(y)(t) = 0. The entire space of solutions is thus again a nice finite-dimensional affine space, hidden in a huge space of functions. The methods for finding a particular solution were introduced in concrete examples in the other column; in principle, they are based upon looking for the solution in a form similar to that of the right-hand side.

8.59. Matrix systems with constant coefficients. Now, let us take a look at a very special case of first-order systems, namely matrix equations

(8.11) Y'(t) = A · Y(t)

with a constant matrix A ∈ Matₙ(R) and an unknown matrix function Y(t) ∈ Matₙ(R), which we may also view as an n²-dimensional vector function. Combining our knowledge from linear algebra and univariate function analysis, we can guess the solution directly if we define the so-called exponential of a matrix by the formula

B(t) = e^(tA) = Σ_{k=0}^{∞} (tᵏ/k!) Aᵏ.

The right-hand expression can be formally viewed as a matrix whose entries b_ij(t) are infinite series created from the mentioned products. If we bound all the entries of A in absolute value by their maximum ‖A‖ = C, then the k-th summand in b_ij(t) is bounded by (|t|ᵏ/k!) nᵏCᵏ in absolute value. Hence, every series b_ij(t) is necessarily absolutely and uniformly convergent on bounded intervals, and it is bounded above by the value e^(|t|nC). Differentiating the terms of our series one by one, we get a uniformly convergent series again, with the limit A e^(tA). Therefore, by the general properties of uniformly convergent series, the derivative

(d/dt) e^(tA) = A e^(tA)

also equals this expression. We have thus obtained the general solution of our system (8.11) in the form

Y(t) = e^(tA) · Z,

where Z ∈ Matₙ(R) is an arbitrary constant matrix. Indeed, the exponential e^(tA) is an invertible matrix for every t, so we have obtained a vector space of the proper dimension, and hence all the solutions. Notably, if we have only a vector equation with a constant matrix A ∈ Matₙ(R),

y'(t) = A · y(t),

for an unknown function y: R → Rⁿ, then the exponential e^(tA) gives n linearly independent solutions with its n columns, and the general solution is given by all their linear combinations.

Finally, let us recall that we met a first-order matrix system in paragraph 8.54, when we were thinking about the derivatives of the solutions of vector equations with respect to the initial conditions. Now, consider a differentiable vector field X(x), defined on a neighborhood of a point x₀ ∈ Rⁿ, such that X(x₀) = 0. Then, the point x₀ is a fixed point of its flow Fl^X_t(x). The differential Φ(t) = D¹ₓ Fl^X_t(x₀) satisfies (see (8.10) above)

Φ'(t) = A · Φ(t), Φ(0) = E, where A = D¹X(x₀).

We thus know explicitly the evolution of the differential of the flow of the vector field at the singular point x₀, which is given by the exponential

Φ(t) = e^(tA).

This is a useful step for reasoning about the qualitative behavior in a neighborhood of the stationary point x₀.
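The matrix exponential and the identity (d/dt)e^(tA) = A e^(tA) can be checked numerically. The sketch below (our own illustration) truncates the defining series and compares it with scipy's expm, reusing the matrix A from exercise 8.154.

import numpy as np
from scipy.linalg import expm

A = np.array([[-2.0, 3.0],
              [-4.0, 5.0]])
t, h = 0.7, 1e-6

S, term = np.eye(2), np.eye(2)
for k in range(1, 30):               # truncated series sum_k t^k/k! A^k
    term = term @ (t*A) / k
    S += term
print(np.allclose(S, expm(t*A)))                     # True

dnum = (expm((t + h)*A) - expm(t*A)) / h             # difference quotient
print(np.allclose(dnum, A @ expm(t*A), atol=1e-3))   # True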
8.60. A note about Markov chains. In the third chapter, we dealt with iterative processes, where the so-called stochastic matrices and the Markov processes determined by them played an interesting role. Let us recall that a matrix A is stochastic iff the sum of each of its columns equals one. In other words,

(1 ... 1) · A = (1 ... 1).

If we take the exponential e^(tA), we obtain

(1 ... 1) · e^(tA) = Σ_{k=0}^{∞} (tᵏ/k!) (1 ... 1) · Aᵏ = eᵗ (1 ... 1).

Therefore, for every t, the invertible matrix B(t) = e^(−t) e^(tA) is stochastic. We thus get a continuous version of the Markov process, (infinitesimally) generated by the stochastic matrix A. Indeed, differentiating with respect to t, we obtain

(d/dt) B(t) = −e^(−t) e^(tA) + e^(−t) A e^(tA) = (−E + A) · B(t),

so the matrix B(t) is a solution of the matrix system of equations with constant coefficients

Y'(t) = (A − E) · Y(t)

built of the stochastic matrix A. This can be explained quite intuitively. If the matrix A is stochastic, then the instantaneous increment A · y(t) of the vector y(t) in the vector system y'(t) = A · y(t) is again a stochastic vector. However, we want the Markov process to keep the vector y(t) stochastic for all t. Hence, the sum of the increments of the particular components of the vector y(t) must be zero, which is guaranteed exactly by subtracting the identity matrix.

As we have seen above, the columns of the matrix solution Y(t) create a basis of all the solutions y(t) of the vector system. Let us further suppose that the matrix A is primitive, i.e., some of its powers has only positive entries. Then we know that its powers Aᵏ converge to a matrix A_∞, all of whose columns are eigenvectors of A corresponding to the eigenvalue 1. Hence, there must exist a universal constant bound ‖Aᵏ − A_∞‖ ≤ C for all powers, and for every small positive ε, there is an N ∈ N such that for all k ≥ N, we already have ‖Aᵏ − A_∞‖ ≤ ε. Now, we can bound the difference between the solution B(t) for large t and the constant matrix A_∞ (note that Σ_{k} (tᵏ/k!) A_∞ = eᵗ A_∞):

‖e^(−t) Σ_{k=0}^{∞} (tᵏ/k!) Aᵏ − e^(−t) Σ_{k=0}^{∞} (tᵏ/k!) A_∞‖
≤ e^(−t) Σ_{k=0}^{N−1} (tᵏ/k!) ‖Aᵏ − A_∞‖ + e^(−t) Σ_{k=N}^{∞} (tᵏ/k!) ‖Aᵏ − A_∞‖
≤ e^(−t) Σ_{k=0}^{N−1} (tᵏ/k!) C + ε.

The first term, a finite sum multiplied by e^(−t), tends to zero as t → ∞. The whole expression has thus been bounded (for sufficiently large N and all t ≥ T > 0) by a fixed multiple of ε, so B(t) converges to A_∞. We have thus proved a very interesting statement, which resembles the discrete version of Markov processes:

Continuous processes with a stochastic matrix

Theorem. Every primitive stochastic matrix A gives rise to a vector system of equations

y'(t) = (A − E) · y(t)

with the following properties:

• the basis of the vector space of all solutions is given by the columns of the stochastic matrix Y(t) = e^(−t) e^(tA);
• if the initial condition y₀ = y(t₀) is a stochastic vector, then the solution y(t) is a stochastic vector for all t;
• every stochastic solution converges, for t → ∞, to an eigenvector y_∞ of the matrix A corresponding to the eigenvalue 1.
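A numerical sketch of the theorem (with an illustrative primitive stochastic matrix of our own choice): the column sums of B(t) = e^(−t) e^(tA) stay equal to one, and a stochastic initial vector is driven towards the eigenvector of A with eigenvalue 1, here (0.6, 0.4).

import numpy as np
from scipy.linalg import expm

A = np.array([[0.8, 0.3],
              [0.2, 0.7]])            # columns sum to 1, all entries positive

y0 = np.array([1.0, 0.0])             # a stochastic initial condition
for t in (0.5, 2.0, 10.0, 40.0):
    B = np.exp(-t) * expm(t*A)        # B(t) = e^{-t} e^{tA}
    print(t, B.sum(axis=0), B @ y0)
# column sums stay (1, 1); B(t) @ y0 tends to (0.6, 0.4)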
4. Notes about numerical methods

Except for special classes of equations, like the linear ones with constant coefficients, we seldom encounter analytically solvable equations in practice. Therefore, we usually need techniques to approximate the solutions of the equations we are working with. We have already employed similar ideas wherever we dealt with approximations (so we recommend comparing this with the earlier paragraphs about splines, Taylor polynomials, and Fourier series). With a bit of courage, we can consider difference and differential equations to be mutual approximations: in one direction, we replace differences with differentials (for example, in economical or population models), and vice versa. We now stop for a while to look at replacing derivatives with differences.

First, we introduce the usual notation for bounds on errors. Let us recall that, having a function f(x) in a variable x, we say that it is, in a neighborhood of a limit point x₀ of its domain, of the order of magnitude O(φ(x)) for a function φ(x) iff there exist a neighborhood U of the point x₀ and a constant C such that

|f(x)| ≤ C |φ(x)| for all x ∈ U.