CENGAGE Learning

Linear Algebra: A Modern Introduction, 4th Edition
David Poole

Product Director: Liz Covello
Product Team Manager: Richard Stratton
Content Developer: Laura Wheel
Product Assistant: Danielle Hallock
Media Developer: Andrew Coppola
Content Project Manager: Alison Eigel Zade
Senior Art Director: Linda May
Manufacturing Planner: Doug Bertke
Rights Acquisition Specialist: Shalice Shah-Caldwell
Production Service & Compositor: MPS Limited
Text Designer: Leonard Massiglia
Cover Designer: Chris Miller
Cover & Interior Design Image: Image Source/Getty Images

© 2015, 2011, 2006 Cengage Learning

ALL RIGHTS RESERVED. No part of this work covered by the copyright herein may be reproduced, transmitted, stored, or used in any form or by any means graphic, electronic, or mechanical, including but not limited to photocopying, recording, scanning, digitizing, taping, web distribution, information networks, or information storage and retrieval systems, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the publisher.

For product information and technology assistance, contact us at Cengage Learning Customer & Sales Support, 1-800-354-9706. For permission to use material from this text or product, submit all requests online at www.cengage.com/permissions. Further permissions questions can be emailed to permissionrequest@cengage.com.

Library of Congress Control Number: 2013944173
ISBN-13: 978-1-285-46324-7
ISBN-10: 1-285-46324-2

Cengage Learning
200 First Stamford Place, 4th Floor
Stamford, CT 06902
USA

Cengage Learning is a leading provider of customized learning solutions with office locations around the globe, including Singapore, the United Kingdom, Australia, Mexico, Brazil, and Japan. Locate your local office at international.cengage.com/region.

Cengage Learning products are represented in Canada by Nelson Education, Ltd.
For your course and learning solutions, visit www.cengage.com. Purchase any of our products at your local college store or at our preferred online store www.cengagebrain.com. Instructors: Please visit login.cengage.com and log in to access instructor-specific resources.

Printed in the United States of America

Dedicated to the memory of Peter Hilton, who was an exemplary mathematician, educator, and citizen—a unit vector in every sense.

Contents

Preface vii
To the Instructor xvii
To the Student xxiii

Chapter 1  Vectors 1
1.0 Introduction: The Racetrack Game 1
1.1 The Geometry and Algebra of Vectors 3
1.2 Length and Angle: The Dot Product 18
    Exploration: Vectors and Geometry 32
1.3 Lines and Planes 34
    Exploration: The Cross Product 48
    Writing Project: The Origins of the Dot Product and Cross Product 49
1.4 Applications 50
    Force Vectors 50
Chapter Review 55

Chapter 2  Systems of Linear Equations 57
2.0 Introduction: Triviality 57
2.1 Introduction to Systems of Linear Equations 58
2.2 Direct Methods for Solving Linear Systems 64
    Writing Project: A History of Gaussian Elimination 82
    Explorations: Lies My Computer Told Me 83
    Partial Pivoting 84
    Counting Operations: An Introduction to the Analysis of Algorithms 85
2.3 Spanning Sets and Linear Independence 88
2.4 Applications 99
    Allocation of Resources 99
    Balancing Chemical Equations 101
    Network Analysis 102
    Electrical Networks 104
    Linear Economic Models 107
    Finite Linear Games 109
    Vignette: The Global Positioning System 121
2.5 Iterative Methods for Solving Linear Systems 124
Chapter Review 134

Chapter 3  Matrices 136
3.0 Introduction: Matrices in Action 136
3.1 Matrix Operations 138
3.2 Matrix Algebra 154
3.3 The Inverse of a Matrix 163
3.4 The LU Factorization 180
3.5 Subspaces, Basis, Dimension, and Rank 191
3.6 Introduction to Linear Transformations 211
    Vignette: Robotics 226
3.7 Applications 230
    Markov Chains 230
    Linear Economic Models 235
    Population Growth 239
    Graphs and Digraphs 241
Chapter Review 251

Chapter 4  Eigenvalues and Eigenvectors 253
4.0 Introduction: A Dynamical System on Graphs 253
4.1 Introduction to Eigenvalues and Eigenvectors 254
4.2 Determinants 263
    Writing Project: Which Came First: The Matrix or the Determinant? 283
    Vignette: Lewis Carroll's Condensation Method 284
    Exploration: Geometric Applications of Determinants 286
4.3 Eigenvalues and Eigenvectors of n × n Matrices 292
    Writing Project: The History of Eigenvalues 301
4.4 Similarity and Diagonalization 301
4.5 Iterative Methods for Computing Eigenvalues 311
4.6 Applications and the Perron-Frobenius Theorem 325
    Markov Chains 325
    Population Growth 330
    The Perron-Frobenius Theorem 332
    Linear Recurrence Relations 335
    Systems of Linear Differential Equations 340
    Discrete Linear Dynamical Systems 348
    Vignette: Ranking Sports Teams and Searching the Internet 356
Chapter Review 364

Chapter 5  Orthogonality 366
5.0 Introduction: Shadows on a Wall 366
5.1 Orthogonality in R^n 368
5.2 Orthogonal Complements and Orthogonal Projections 378
5.3 The Gram-Schmidt Process and the QR Factorization 388
    Explorations: The Modified QR Factorization 396
    Approximating Eigenvalues with the QR Algorithm 398
5.4 Orthogonal Diagonalization of Symmetric Matrices 400
5.5 Applications 408
    Quadratic Forms 408
    Graphing Quadratic Equations 415
Chapter Review 425

Chapter 6  Vector Spaces 427
6.0 Introduction: Fibonacci in (Vector) Space 427
6.1 Vector Spaces and Subspaces 429
    Writing Project: The Rise of Vector Spaces 443
6.2 Linear Independence, Basis, and Dimension 443
    Exploration: Magic Squares 460
6.3 Change of Basis 463
6.4 Linear Transformations 472
6.5 The Kernel and Range of a Linear Transformation 481
6.6 The Matrix of a Linear Transformation 497
    Exploration: Tilings, Lattices, and the Crystallographic Restriction 515
6.7 Applications 518
    Homogeneous Linear Differential Equations 518
Chapter Review 527

Chapter 7  Distance and Approximation 529
7.0 Introduction: Taxicab Geometry 529
7.1 Inner Product Spaces 531
    Explorations: Vectors and Matrices with Complex Entries 543
    Geometric Inequalities and Optimization Problems 547
7.2 Norms and Distance Functions 552
7.3 Least Squares Approximation 568
7.4 The Singular Value Decomposition 590
    Vignette: Digital Image Compression 607
7.5 Applications 610
    Approximation of Functions 610
Chapter Review 618

Chapter 8  Codes (online only) 620
8.1 Code Vectors 620
    Vignette: The Codabar System 626
8.2 Error-Correcting Codes 627
8.3 Dual Codes 632
8.4 Linear Codes 639
8.5 The Minimum Distance of a Code 644

Appendix A  Mathematical Notation and Methods of Proof A1
Appendix B  Mathematical Induction B1
Appendix C  Complex Numbers C1
Appendix D  Polynomials D1
Appendix E  Technology Bytes (online only)
Answers to Selected Odd-Numbered Exercises ANS1
Index I1

Preface

The last thing one knows when writing a book is what to put first.
—Blaise Pascal, Pensées, 1670

(For more on the recommendations of the Linear Algebra Curriculum Study Group, see The College Mathematics Journal 24 (1993), 41-46.)

The fourth edition of Linear Algebra: A Modern Introduction preserves the approach and features that users found to be strengths of the previous editions. However, I have streamlined the text somewhat, added numerous clarifications, and freshened up the exercises. I want students to see linear algebra as an exciting subject and to appreciate its tremendous usefulness. At the same time, I want to help them master the basic concepts and techniques of linear algebra that they will need in other courses, both in mathematics and in other disciplines. I also want students to appreciate the interplay of theoretical, applied, and numerical mathematics that pervades the subject. This book is designed for use in an introductory one- or two-semester course sequence in linear algebra.
First and foremost, it is intended for students, and I have tried my best to write the book so that students not only will find it readable but also will want to read it. As in the first three editions, I have taken into account the reality that students taking introductory linear algebra are likely to come from a variety of disciplines. In addition to mathematics majors, there are apt to be majors from engineering, physics, chemistry, computer science, biology, environmental science, geography, economics, psychology, business, and education, as well as other students taking the course as an elective or to fulfill degree requirements. Accordingly, the book balances theory and applications, is written in a conversational style yet is fully rigorous, and combines a traditional presentation with concern for student-centered learning. There is no such thing as a universally best learning style. In any class, there will be some students who work well independently and others who work best in groups; some who prefer lecture-based learning and others who thrive in a workshop setting, doing explorations; some who enjoy algebraic manipulations, some who are adept at numerical calculations (with and without a computer), and some who exhibit strong geometric intuition. In this edition, I continue to present material in a variety of ways—algebraically, geometrically, numerically, and verbally—so that all types of learners can find a path to follow. I have also attempted to present the theoretical, computational, and applied topics in a flexible yet integrated way. In doing so, it is my hope that all students will be exposed to the many sides of linear algebra. This book is compatible with the recommendations of the Linear Algebra Curriculum Study Group. From a pedagogical point of view, there is no doubt that for most students concrete examples should precede abstraction. I have taken this approach here.
I also believe strongly that linear algebra is essentially about vectors and that students need to see vectors first (in a concrete setting) in order to gain some geometric insight. Moreover, introducing vectors early allows students to see how systems of linear equations arise naturally from geometric problems. Matrices then arise equally naturally as coefficient matrices of linear systems and as agents of change (linear transformations). This sets the stage for eigenvectors and orthogonal projections, both of which are best understood geometrically. The dart that appears on the cover of this book symbolizes a vector and reflects my conviction that geometric understanding should precede computational techniques. I have tried to limit the number of theorems in the text. For the most part, results labeled as theorems either will be used later in the text or summarize preceding work. Interesting results that are not central to the book have been included as exercises or explorations. For example, the cross product of vectors is discussed only in explorations (in Chapters 1 and 4). Unlike most linear algebra textbooks, this book has no chapter on determinants. The essential results are all in Section 4.2, with other interesting material contained in an exploration. The book is, however, comprehensive for an introductory text. Wherever possible, I have included elementary and accessible proofs of theorems in order to avoid having to say, "The proof of this result is beyond the scope of this text." The result is, I hope, a work that is self-contained. I have not been stingy with the applications: There are many more in the book than can be covered in a single course. However, it is important that students see the impressive range of problems to which linear algebra can be applied. I have included some modern material on finite linear algebra and coding theory that is not normally found in an introductory linear algebra text.
There are also several impressive real-world applications of linear algebra and one item of historical, if not practical, interest; these applications are presented as self-contained "vignettes." I hope that instructors will enjoy teaching from this book. More important, I hope that students using the book will come away with an appreciation of the beauty, power, and tremendous utility of linear algebra and that they will have fun along the way.

What's New in the Fourth Edition

The overall structure and style of Linear Algebra: A Modern Introduction remain the same in the fourth edition. Here is a summary of what is new:

• The applications to coding theory have been moved to the new online Chapter 8.
• To further engage students, five writing projects have been added to the exercise sets (see pages 49, 82, 283, 301, 443). These projects give students a chance to research and write about aspects of the history and development of linear algebra. The explorations, vignettes, and many of the applications provide additional material for student projects.
• There are over 200 new or revised exercises. In response to reviewers' comments, there is now a full proof of the Cauchy-Schwarz Inequality in Chapter 1 in the form of a guided exercise.
• I have made numerous small changes in wording to improve the clarity or accuracy of the exposition. Also, several definitions have been made more explicit by giving them their own definition boxes, and a few results have been highlighted by labeling them as theorems.
• All existing ancillaries have been updated.

Features

Clear Writing Style

The text is written in a simple, direct, conversational style. As much as possible, I have used "mathematical English" rather than relying excessively on mathematical notation. However, all proofs that are given are fully rigorous, and Appendix A contains an introduction to mathematical notation for those who wish to streamline their own writing.
Concrete examples almost always precede theorems, which are then followed by further examples and applications. This flow—from specific to general and back again—is consistent throughout the book.

Key Concepts Introduced Early

Many students encounter difficulty in linear algebra when the course moves from the computational (solving systems of linear equations, manipulating vectors and matrices) to the theoretical (spanning sets, linear independence, subspaces, basis, and dimension). This book introduces all of the key concepts of linear algebra early, in a concrete setting, before revisiting them in full generality. Vector concepts such as dot product, length, orthogonality, and projection are first discussed in Chapter 1 in the concrete setting of R^2 and R^3 before the more general notions of inner product, norm, and orthogonal projection appear in Chapters 5 and 7. Similarly, spanning sets and linear independence are given a concrete treatment in Chapter 2 prior to their generalization to vector spaces in Chapter 6. The fundamental concepts of subspace, basis, and dimension appear first in Chapter 3 when the row, column, and null spaces of a matrix are introduced; it is not until Chapter 6 that these ideas are given a general treatment. In Chapter 4, eigenvalues and eigenvectors are introduced and explored for 2 × 2 matrices before their n × n counterparts appear. By the beginning of Chapter 4, all of the key concepts of linear algebra have been introduced, with concrete, computational examples to support them. When these ideas appear in full generality later in the book, students have had time to get used to them and, hence, are not so intimidated by them.

Emphasis on Vectors and Geometry

In keeping with the philosophy that linear algebra is primarily about vectors, this book stresses geometric intuition. Accordingly, the first chapter is about vectors, and it develops many concepts that will appear repeatedly throughout the text.
Concepts such as orthogonality, projection, and linear combination are all found in Chapter 1, as is a comprehensive treatment of lines and planes in R^3 that provides essential insight into the solution of systems of linear equations. This emphasis on vectors, geometry, and visualization is found throughout the text. Linear transformations are introduced as matrix transformations in Chapter 3, with many geometric examples, before general linear transformations are covered in Chapter 6. In Chapter 4, eigenvalues are introduced with "eigenpictures" as a visual aid. The proof of Perron's Theorem is given first heuristically and then formally, in both cases using a geometric argument. The geometry of linear dynamical systems reinforces and summarizes the material on eigenvalues and eigenvectors. In Chapter 5, orthogonal projections, orthogonal complements of subspaces, and the Gram-Schmidt Process are all presented in the concrete setting of R^3 before being generalized to R^n and, in Chapter 7, to inner product spaces. The nature of the singular value decomposition is also explained informally in Chapter 7 via a geometric argument. Of the more than 300 figures in the text, over 200 are devoted to fostering a geometric understanding of linear algebra.

Explorations

The introduction to each chapter is a guided exploration (Section 0) in which students are invited to discover, individually or in groups, some aspect of the upcoming chapter (see pages 1, 136, 427, 529). For example, "The Racetrack Game" introduces vectors, "Matrices in Action" introduces matrix multiplication and linear transformations, "Fibonacci in (Vector) Space" touches on vector space concepts, and "Taxicab Geometry" sets up generalized norms and distance functions. Additional explorations found throughout the book include applications of vectors and determinants to geometry, an investigation of 3 × 3 magic squares, a study of symmetry via the tilings of M. C.
Escher, an introduction to complex linear algebra, and optimization problems using geometric inequalities. There are also explorations that introduce important numerical considerations and the analysis of algorithms (see pages 83, 84, 85, 396, 398). Having students do some of these explorations is one way of encouraging them to become active learners and to give them "ownership" over a small part of the course.

Applications

The book contains an abundant selection of applications chosen from a broad range of disciplines, including mathematics, computer science, physics, chemistry, engineering, biology, business, economics, psychology, geography, and sociology. Noteworthy among these is a strong treatment of coding theory, from error-detecting codes (such as International Standard Book Numbers) to sophisticated error-correcting codes (such as the Reed-Muller code that was used to transmit satellite photos from space). Additionally, there are five "vignettes" that briefly showcase some very modern applications of linear algebra: the Global Positioning System (GPS), robotics, Internet search engines, digital image compression, and the Codabar System.

Examples and Exercises

There are over 400 examples in this book, most worked in greater detail than is customary in an introductory linear algebra textbook. This level of detail is in keeping with the philosophy that students should want (and be able) to read a textbook. Accordingly, it is not intended that all of these examples be covered in class; many can be assigned for individual or group study, possibly as part of a project. Most examples have at least one counterpart exercise so that students can try out the skills covered in the example before exploring generalizations. There are over 2000 exercises, more than in most textbooks at a similar level. Answers to most of the computational odd-numbered exercises can be found in the back of the book.
Instructors will find an abundance of exercises from which to select homework assignments. The exercises in each section are graduated, progressing from the routine to the challenging. Exercises range from those intended for hand computation to those requiring the use of a calculator or computer algebra system, and from theoretical and numerical exercises to conceptual exercises. Many of the examples and exercises use actual data compiled from real-world situations (see pages 248, 359, 526, 588). For example, there are problems on modeling the growth of caribou and seal populations, radiocarbon dating of the Stonehenge monument, and predicting major league baseball players' salaries. Working such problems reinforces the fact that linear algebra is a valuable tool for modeling real-life problems. Additional exercises appear in the form of a review after each chapter. In each set, there are 10 true/false questions designed to test conceptual understanding, followed by 19 computational and theoretical exercises that summarize the main concepts and techniques of that chapter.

Biographical Sketches and Etymological Notes

It is important that students learn something about the history of mathematics and come to see it as a social and cultural endeavor as well as a scientific one. Accordingly, the text contains short biographical sketches about many of the mathematicians who contributed to the development of linear algebra. I hope that these will help to put a human face on the subject and give students another way of relating to the material. I have found that many students feel alienated from mathematics because the terminology makes no sense to them—it is simply a collection of words to be learned. To help overcome this problem, I have included short etymological notes that give the origins of many of the terms used in linear algebra.
(For example, why do we use the word normal to refer to a vector that is perpendicular to a plane?)

Margin Icons

The margins of the book contain several icons whose purpose is to alert the reader in various ways. Calculus is not a prerequisite for this book, but linear algebra has many interesting and important applications to calculus. A calculus icon denotes an example or exercise that requires calculus. (This material can be omitted if not everyone in the class has had at least one semester of calculus. Alternatively, this material can be assigned as projects.) A second icon denotes an example or exercise involving complex numbers. (For students unfamiliar with complex numbers, Appendix C contains all the background material that is needed.) The CAS icon indicates that a computer algebra system (such as Maple, Mathematica, or MATLAB) or a calculator with matrix capabilities (such as almost any graphing calculator) is required—or at least very useful—for solving the example or exercise. In an effort to help students learn how to read and use this textbook most effectively, I have noted various places where the reader is advised to pause. These may be places where a calculation is needed, part of a proof must be supplied, a claim should be verified, or some extra thought is required. A "slow down" icon appears in the margin at such places; the message is "Slow down. Get out your pencil. Think about this."

Technology

This book can be used successfully whether or not students have access to technology. However, calculators with matrix capabilities and computer algebra systems are now commonplace and, properly used, can enrich the learning experience as well as help with tedious calculations. In this text, I take the point of view that students need to master all of the basic techniques of linear algebra by solving by hand examples that are not too computationally difficult.
Technology may then be used (in whole or in part) to solve subsequent examples and applications and to apply techniques that rely on earlier ones. For example, when systems of linear equations are first introduced, detailed solutions are provided; later, solutions are simply given, and the reader is expected to verify them. This is a good place to use some form of technology. Likewise, when applications use data that make hand calculation impractical, use technology. All of the numerical methods that are discussed depend on the use of technology. With the aid of technology, students can explore linear algebra in some exciting ways and discover much for themselves. For example, if one of the coefficients of a linear system is replaced by a parameter, how much variability is there in the solutions? How does changing a single entry of a matrix affect its eigenvalues? This book is not a tutorial on technology, and in places where technology can be used, I have not specified a particular type of technology. The student companion website that accompanies this book offers an online appendix called Technology Bytes that gives instructions for solving a selection of examples from each chapter using Maple, Mathematica, and MATLAB. By imitating these examples, students can do further calculations and explorations using whichever CAS they have and exploit the power of these systems to help with the exercises throughout the book, particularly those marked with the CAS icon. The website also contains data sets and computer code in Maple, Mathematica, and MATLAB formats keyed to many exercises and examples in the text. Students and instructors can import these directly into their CAS to save typing and eliminate errors.

Finite and Numerical Linear Algebra

The text covers two aspects of linear algebra that are scarcely ever mentioned together: finite linear algebra and numerical linear algebra.
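As an aside, the kind of technology-aided exploration described above, such as watching how a change in a single matrix entry moves the eigenvalues, can be sketched in a few lines of NumPy. This is purely an editor's illustration, not code from the text, and the matrix is an arbitrary example:

```python
import numpy as np

# An arbitrary symmetric 2 x 2 matrix; its eigenvalues are
# (5 - sqrt(5))/2 and (5 + sqrt(5))/2.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
base = np.sort(np.linalg.eigvals(A).real)

# Perturb a single entry by eps and measure how far the eigenvalues move.
for eps in (0.01, 0.1, 1.0):
    B = A.copy()
    B[0, 1] += eps
    shifted = np.sort(np.linalg.eigvals(B).real)
    print(eps, np.abs(shifted - base).max())
```

With eps = 1.0 the perturbed matrix has eigenvalues exactly 1 and 4, so each eigenvalue moves by roughly 0.38; smaller perturbations move them correspondingly less.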
By introducing modular arithmetic early, I have been able to make finite linear algebra (more properly, "linear algebra over finite fields," although I do not use that phrase) a recurring theme throughout the book. This approach provides access to the material on coding theory in Chapter 8 (online). There is also an application to finite linear games in Section 2.4 that students really enjoy. In addition to being exposed to the applications of finite linear algebra, mathematics majors will benefit from seeing the material on finite fields, because they are likely to encounter it in such courses as discrete mathematics, abstract algebra, and number theory. All students should be aware that in practice, it is impossible to arrive at exact solutions of large-scale problems in linear algebra. Exposure to some of the techniques of numerical linear algebra will provide an indication of how to obtain highly accurate approximate solutions. Some of the numerical topics included in the book are roundoff error and partial pivoting, iterative methods for solving linear systems and computing eigenvalues, the LU and QR factorizations, matrix norms and condition numbers, least squares approximation, and the singular value decomposition (see pages 83, 84, 124, 180, 311, 555, 561, 568, 590). The inclusion of numerical linear algebra also brings up some interesting and important issues that are completely absent from the theory of linear algebra, such as pivoting strategies, the condition of a linear system, and the convergence of iterative methods. This book not only raises these questions but also shows how one might approach them. Gerschgorin disks, matrix norms, and the singular values of a matrix, discussed in Chapters 4 and 7, are useful in this regard (see pages 319, 563, 600).

Appendices

Appendix A contains an overview of mathematical notation and methods of proof, and Appendix B discusses mathematical induction.
All students will benefit from these sections, but those with a mathematically oriented major may wish to pay particular attention to them. Some of the examples in these appendices are uncommon (for instance, Example B.6 in Appendix B) and underscore the power of the methods. Appendix C is an introduction to complex numbers. For students familiar with these results, this appendix can serve as a useful reference; for others, this section contains everything they need to know for those parts of the text that use complex numbers. Appendix D is about polynomials. I have found that many students require a refresher about these facts. Most students will be unfamiliar with Descartes's Rule of Signs; it is used in Chapter 4 to explain the behavior of the eigenvalues of Leslie matrices. Short answers to most of the odd-numbered computational exercises are given at the end of the book. Exercise sets to accompany Appendixes A, B, C, and D are available on the companion website, along with their odd-numbered answers.

Ancillaries

For Instructors

Enhanced WebAssign®
Printed Access Card: 978-1-285-85829-6
Online Access Code: 978-1-285-85827-2

Exclusively from Cengage Learning, Enhanced WebAssign combines the exceptional mathematics content that you know and love with the most powerful online homework solution, WebAssign. Enhanced WebAssign engages students with immediate feedback, rich tutorial content, and interactive, fully customizable eBooks (YouBook), helping students to develop a deeper conceptual understanding of their subject matter. Flexible assignment options give instructors the ability to release assignments conditionally based on students' prerequisite assignment scores. Visit us at www.cengage.com/ewa to learn more.
Cengage Learning Testing Powered by Cognero

Cengage Learning Testing Powered by Cognero is a flexible, online system that allows you to author, edit, and manage test bank content from multiple Cengage Learning solutions; create multiple test versions in an instant; and deliver tests from your LMS, your classroom, or wherever you want. Access Cognero and additional instructor resources online at login.cengage.com.

Complete Solutions Manual

The Complete Solutions Manual provides detailed solutions to all exercises in the text, including Exploration and Chapter Review exercises. The Complete Solutions Manual is available online.

Instructor's Guide

This online guide enhances the text with valuable teaching resources such as group work projects, teaching tips, interesting exam questions, examples and extra material for lectures, and other items designed to reduce the instructor's preparation time and make linear algebra class an exciting and interactive experience. For each section of the text, the Instructor's Guide includes suggested time and emphasis, points to stress, questions for discussion, lecture materials and examples, technology tips, student projects, group work with solutions, sample assignments, and suggested test questions.

Solution Builder
www.cengage.com/solutionbuilder

Solution Builder provides full instructor solutions to all exercises in the text, including those in the explorations and chapter reviews, in a convenient online format. Solution Builder allows instructors to create customized, secure PDF printouts of solutions matched exactly to the exercises assigned for class.

For Students

Student Solutions Manual (ISBN-13: 978-1-285-84195-3)

The Student Solutions Manual and Study Guide includes detailed solutions to all odd-numbered exercises and selected even-numbered exercises; section and chapter summaries of symbols, definitions, and theorems; and study tips and hints.
Complex exercises are explored through a question-and-answer format designed to deepen understanding. Challenging and entertaining problems that further explore selected exercises are also included.

Enhanced WebAssign®
Printed Access Card: 978-1-285-85829-6
Online Access Code: 978-1-285-85827-2

Enhanced WebAssign (assigned by the instructor) provides you with instant feedback on homework assignments. This online homework system is easy to use and includes helpful links to textbook sections, video examples, and problem-specific tutorials.

CengageBrain.com

To access additional course materials and companion resources, please visit www.cengagebrain.com. At the CengageBrain.com home page, search for the ISBN of your title (from the back cover of your book) using the search box at the top of the page. This will take you to the product page where free companion resources can be found.

Acknowledgments

The reviewers of the previous edition of this text contributed valuable and often insightful comments about the book. I am grateful for the time each of them took to do this. Their judgment and helpful suggestions have contributed greatly to the development and success of this book, and I would like to thank them personally: Jamey Bass, City College of San Francisco; Olga Brezhneva, Miami University; Karen Clark, The College of New Jersey; Marek Elzanowski, Portland State University; Christopher Francisco, Oklahoma State University; Brian Jue, California State University, Stanislaus; Alexander Kheyfits, Bronx Community College/CUNY; Henry Krieger, Harvey Mudd College; Rosanna Pearlstein, Michigan State University; William Sullivan, Portland State University; Matthias Weber, Indiana University. I am indebted to a great many people who have, over the years, influenced my views about linear algebra and the teaching of mathematics in general.
First, I would like to thank collectively the participants in the education and special linear algebra sessions at meetings of the Mathematical Association of America and the Canadian Mathematical Society. I have also learned much from participation in the Canadian Mathematics Education Study Group and the Canadian Mathematics Education Forum. I especially want to thank Ed Barbeau, Bill Higginson, Richard Hoshino, John Grant McLoughlin, Eric Muller, Morris Orzech, Bill Ralph, Pat Rogers, Peter Taylor, and Walter Whiteley, whose advice and inspiration contributed greatly to the philosophy and style of this book. My gratitude as well to Robert Rogers, who developed the student and instructor solutions, as well as the excellent study guide content. Special thanks go to Jim Stewart for his ongoing support and advice. Joe Rotman and his lovely book A First Course in Abstract Algebra inspired the etymological notes in this book, and I relied heavily on Steven Schwartzman's The Words of Mathematics when compiling these notes. I thank Art Benjamin for introducing me to the Codabar system and Joe Grcar for clarifying aspects of the history of Gaussian elimination. My colleagues Marcus Pivato and Reem Yassawi provided useful information about dynamical systems. As always, I am grateful to my students for asking good questions and providing me with the feedback necessary to become a better teacher. I sincerely thank all of the people who have been involved in the production of this book. Jitendra Kumar and the team at MPS Limited did an amazing job producing the fourth edition. I thank Christine Sabooni for doing a thorough copyedit. Most of all, it has been a delight to work with the entire editorial, marketing, and production teams at Cengage Learning: Richard Stratton, Molly Taylor, Laura Wheel, Cynthia Ashton, Danielle Hallock, Andrew Coppola, Alison Eigel Zade, and Janay Pryor.
They offered sound advice about changes and additions, provided assistance when I needed it, but let me write the book I wanted to write. I am fortunate to have worked with them, as well as the staffs on the first through third editions. As always, I thank my family for their love, support, and understanding. Without them, this book would not have been possible. David Poole dpoole@trentu.ca To the Instructor "Would you tell me, please, which way I ought to go from here?" "That depends a good deal on where you want to get to," said the Cat. —Lewis Carroll Alice's Adventures in Wonderland, 1865 This text was written with flexibility in mind. It is intended for use in a one- or two-semester course with 36 lectures per semester. The range of topics and applications makes it suitable for a variety of audiences and types of courses. However, there is more material in the book than can be covered in class, even in a two-semester course. After the following overview of the text are some brief suggestions for ways to use the book. An Overview of the Text See page 1 See page 32 See page 48 Chapter 1: Vectors The racetrack game in Section 1.0 serves to introduce vectors in an informal way. (It's also quite a lot of fun to play!) Vectors are then formally introduced from both algebraic and geometric points of view. The operations of addition and scalar multiplication and their properties are first developed in the concrete settings of R2 and R3 before being generalized to Rn. Modular arithmetic and finite linear algebra are also introduced. Section 1.2 defines the dot product of vectors and the related notions of length, angle, and orthogonality. The very important concept of (orthogonal) projection is developed here; it will reappear in Chapters 5 and 7. The exploration "Vectors and Geometry" shows how vector methods can be used to prove certain results in Euclidean geometry. Section 1.3 is a basic but thorough introduction to lines and planes in R2 and R3.
This section is crucial for understanding the geometric significance of the solution of linear systems in Chapter 2. Note that the cross product of vectors in R3 is left as an exploration. The chapter concludes with an application to force vectors. See page 57 Chapter 2: Systems of Linear Equations The introduction to this chapter serves to illustrate that there is more than one way to think of the solution to a system of linear equations. Sections 2.1 and 2.2 develop the main computational tool for solving linear systems: row reduction of matrices (Gaussian and Gauss-Jordan elimination). Nearly all subsequent computational methods in the book depend on this. The Rank Theorem appears here for the first time; it shows up again, in more generality, in Chapters 3, 5, and 6 (see pages 72, 205, 386, 486). Section 2.3 is very important; it introduces the fundamental notions of spanning sets and linear independence of vectors. Do not rush through this material. Section 2.4 contains six applications from which instructors can choose depending on the time available and the interests of the class. The vignette on the Global Positioning System (see page 121) provides another application that students will enjoy. The iterative methods in Section 2.5 will be optional for many courses but are essential for a course with an applied/numerical focus. The three explorations in this chapter (see pages 83, 84, and 85) are related in that they all deal with aspects of the use of computers to solve linear systems. All students should at least be made aware of these issues. Chapter 3: Matrices This chapter contains some of the most important ideas in the book. It is a long chapter, but the early material can be covered fairly quickly, with extra time allowed for the crucial material in Section 3.5.
Section 3.0 is an exploration (see page 136) that introduces the notion of a linear transformation: the idea that matrices are not just static objects but rather a type of function, transforming vectors into other vectors. All of the basic facts about matrices, matrix operations, and their properties are found in the first two sections. The material on partitioned matrices and the multiple representations of the matrix product is worth stressing, because it is used repeatedly in subsequent sections. The Fundamental Theorem of Invertible Matrices in Section 3.3 is very important and will appear several more times as new characterizations of invertibility are presented (see pages 172, 206, 296, 512, 605). Section 3.4 discusses the very important LU factorization of a matrix. If this topic is not covered in class, it is worth assigning as a project or discussing in a workshop. The point of Section 3.5 is to present many of the key concepts of linear algebra (subspace, basis, dimension, and rank) in the concrete setting of matrices before students see them in full generality. Although the examples in this section are all familiar, it is important that students get used to the new terminology and, in particular, understand what the notion of a basis means. The geometric treatment of linear transformations in Section 3.6 is intended to smooth the transition to general linear transformations in Chapter 6. The example of a projection is particularly important because it will reappear in Chapter 5. The vignette on robotic arms (see page 226) is a concrete demonstration of composition of linear (and affine) transformations. There are four applications from which to choose in Section 3.7 (see pages 230, 239). Either Markov chains or the Leslie model of population growth should be covered so that they can be used again in Chapter 4, where their behavior will be explained. Chapter 4: Eigenvalues and Eigenvectors The introduction, Section 4.0, presents an interesting dynamical system involving graphs. This exploration introduces the notion of an eigenvector and foreshadows the power method in Section 4.5. In keeping with the geometric emphasis of the book, Section 4.1 contains the novel feature of "eigenpictures" as a way of visualizing the eigenvectors of 2 × 2 matrices. Determinants appear in Section 4.2, motivated by their use in finding the characteristic polynomials of small matrices. This "crash course" in determinants contains all the essential material students need, including an optional but elementary proof of the Laplace Expansion Theorem. The vignette "Lewis Carroll's Condensation Method" presents a historically interesting, alternative method of calculating determinants that students may find appealing. The exploration "Geometric Applications of Determinants" makes a nice project that contains several interesting and useful results. (Alternatively, instructors who wish to give more detailed coverage to determinants may choose to cover some of this exploration in class.) The basic theory of eigenvalues and eigenvectors is found in Section 4.3, and Section 4.4 deals with the important topic of diagonalization. Example 4.29 on powers of matrices is worth covering in class. The power method and its variants, discussed in Section 4.5, are optional, but all students should be aware of the method, and an applied course should cover it in detail. Gerschgorin's Disk Theorem can be covered independently of the rest of Section 4.5. Markov chains and the Leslie model of population growth reappear in Section 4.6. Although the proof of Perron's Theorem is optional, the theorem itself (like the stronger Perron-Frobenius Theorem) should at least be mentioned because it explains why we should expect a unique positive eigenvalue with a corresponding positive eigenvector in these applications.
The applications on recurrence relations and differential equations connect linear algebra to discrete mathematics and calculus, respectively. The matrix exponential can be covered if your class has a good calculus background. The final topic of discrete linear dynamical systems revisits and summarizes many of the ideas in Chapter 4, looking at them in a new, geometric light. Students will enjoy reading how eigenvectors can be used to help rank sports teams and websites. This vignette can easily be extended to a project or enrichment activity. Chapter 5: Orthogonality The introductory exploration, "Shadows on a Wall," is mathematics at its best: it takes a known concept (projection of a vector onto another vector) and generalizes it in a useful way (projection of a vector onto a subspace, a plane), while uncovering some previously unobserved properties. Section 5.1 contains the basic results about orthogonal and orthonormal sets of vectors that will be used repeatedly from here on. In particular, orthogonal matrices should be stressed. In Section 5.2, two concepts from Chapter 1 are generalized: the orthogonal complement of a subspace and the orthogonal projection of a vector onto a subspace. The Orthogonal Decomposition Theorem is important here and helps to set up the Gram-Schmidt Process. Also note the quick proof of the Rank Theorem. The Gram-Schmidt Process is detailed in Section 5.3, along with the extremely important QR factorization. The two explorations that follow (see pages 396, 398) outline how the QR factorization is computed in practice and how it can be used to approximate eigenvalues. Section 5.4 on orthogonal diagonalization of (real) symmetric matrices is needed for the applications that follow. It also contains the Spectral Theorem, one of the highlights of the theory of linear algebra. The applications in Section 5.5 are quadratic forms and graphing quadratic equations.
I always include at least the second of these in my course (see pages 408, 415) because it extends what students already know about conic sections. Chapter 6: Vector Spaces The Fibonacci sequence reappears in Section 6.0, although it is not important that students have seen it before (Section 4.6). The purpose of this exploration is to show that familiar vector space concepts (Section 3.5) can be used fruitfully in a new setting. Because all of the main ideas of vector spaces have already been introduced in Chapters 1-3, students should find Sections 6.1 and 6.2 fairly familiar. The emphasis here should be on using the vector space axioms to prove properties rather than relying on computational techniques. When discussing change of basis in Section 6.3, it is helpful to show students how to use the notation to remember how the construction works. Ultimately, the Gauss-Jordan method is the most efficient here. Sections 6.4 and 6.5 on linear transformations are important. The examples are related to previous results on matrices (and matrix transformations). In particular, it is important to stress that the kernel and range of a linear transformation generalize the null space and column space of a matrix. Section 6.6 puts forth the notion that (almost) all linear transformations are essentially matrix transformations. This builds on the information in Section 3.6, so students should not find it terribly surprising. However, the examples should be worked carefully. The connection between change of basis and similarity of matrices is noteworthy. The exploration "Tilings, Lattices, and the Crystallographic Restriction" is an impressive application of change of basis. The connection with the artwork of M. C. Escher makes it all the more interesting. The applications in Section 6.7 build on previous ones and can be included as time and interest permit.
Chapter 7: Distance and Approximation Section 7.0 opens with the entertaining "Taxicab Geometry" exploration (see page 529). Its purpose is to set up the material on generalized norms and distance functions (metrics) that follows. Inner product spaces are discussed in Section 7.1; the emphasis here should be on the examples and using the axioms. The exploration "Vectors and Matrices with Complex Entries" (see page 543) shows how the concepts of dot product, symmetric matrix, orthogonal matrix, and orthogonal diagonalization can be extended from real to complex vector spaces. The following exploration, "Geometric Inequalities and Optimization Problems" (see page 547), is one that students typically enjoy. (They will have fun seeing how many "calculus" problems can be solved without using calculus at all!) Section 7.2 covers generalized vector and matrix norms and shows how the condition number of a matrix is related to the notion of ill-conditioned linear systems explored in Chapter 2. Least squares approximation (Section 7.3) is an important application of linear algebra in many other disciplines. The Best Approximation Theorem and the Least Squares Theorem are important, but their proofs are intuitively clear. Spend time here on the examples—a few should suffice. Section 7.4 presents the singular value decomposition, one of the most impressive applications of linear algebra. If your course gets this far, you will be amply rewarded. Not only does the SVD tie together many notions discussed previously; it also affords some new (and quite powerful) applications. If a CAS is available, the vignette on digital image compression is worth presenting; it is a visually impressive display of the power of linear algebra and a fitting culmination to the course. The further applications in Section 7.5 can be chosen according to the time available and the interests of the class.
Chapter 8: Codes This online chapter contains applications of linear algebra to the theory of codes. Section 8.1 begins with a discussion of how vectors can be used to design error-detecting codes such as the familiar Universal Product Code (UPC) and International Standard Book Number (ISBN). This topic requires only knowledge of Chapter 1. The vignette on the Codabar system used in credit and bank cards is an excellent classroom presentation that can even be used to introduce Section 8.1. Once students are familiar with matrix operations, Section 8.2 describes how codes can be designed to correct as well as detect errors. The Hamming codes introduced here are perhaps the most famous examples of such error-correcting codes. Dual codes, discussed in Section 8.3, are an important way of constructing new codes from old ones. The notion of orthogonal complement, introduced in Chapter 5, is the prerequisite concept here. The most important, and most widely used, class of codes is the class of linear codes that is defined in Section 8.4. The notions of subspace, basis, and dimension are key here. The powerful Reed-Muller codes used by NASA spacecraft are important examples of linear codes. Our discussion of codes concludes in Section 8.5 with the definition of the minimum distance of a code and the role it plays in determining the error-correcting capability of the code. How to Use the Book Students find the book easy to read, so I usually have them read a section before I cover the material in class. That way, I can spend class time highlighting the most important concepts, dealing with topics students find difficult, working examples, and discussing applications. I do not attempt to cover all of the material from the assigned reading in class. This approach enables me to keep the pace of the course fairly brisk, slowing down for those sections that students typically find challenging.
In a two-semester course, it is possible to cover the entire book, including a reasonable selection of applications. For extra flexibility, you might omit some of the topics (for example, give only a brief treatment of numerical linear algebra), thereby freeing up time for more in-depth coverage of the remaining topics, more applications, or some of the explorations. In an honors mathematics course that emphasizes proofs, much of the material in Chapters 1-3 can be covered quickly. Chapter 6 can then be covered in conjunction with Sections 3.5 and 3.6, and Chapter 7 can be integrated into Chapter 5. I would be sure to assign the explorations in Chapters 1, 4, 6, and 7 for such a class. For a one-semester course, the nature of the course and the audience will determine which topics to include. Three possible courses are described below and on the following page. The basic course, described first, has fewer than 36 hours suggested, allowing time for extra topics, in-class review, and tests. The other two courses build on the basic course but are still quite flexible. A Basic Course A course designed for mathematics majors and students from other disciplines is outlined on the next page. This course does not mention general vector spaces at all (all concepts are treated in a concrete setting) and is very light on proofs. Still, it is a thorough introduction to linear algebra.

Section   Lectures      Section   Lectures
1.1       1-1.5         3.6       1-2
1.2       1-1.5         4.1       1
1.3       0.5-1         4.2       2
2.1       1-2           4.3       1
2.2       1-2           4.4       0.5
2.3       1-2           5.1       1
3.1       1             5.2       2
3.2       2             5.3       1-2
3.3       2             5.4       1-1.5
3.5       1             7.3       1-1.5

Total: 23-30 lectures

Because the students in a course such as this one represent a wide variety of disciplines, I would suggest using much of the remaining lecture time for applications. In my course, I do code vectors in Section 8.1, which students really seem to like, and at least one application from each of Chapters 2-5.
Other applications can be assigned as projects, along with as many of the explorations as desired. There is also sufficient lecture time available to cover some of the theory in detail. A Course with a Computational Emphasis For a course with a computational emphasis, the basic course outlined on the previous page can be supplemented with the sections of the text dealing with numerical linear algebra. In such a course, I would cover part or all of Sections 2.5, 3.4, 4.5, 5.3, 7.2, and 7.4, ending with the singular value decomposition. The explorations in Chapters 2 and 5 are particularly well suited to such a course, as are almost any of the applications. A Course for Students Who Have Already Studied Some Linear Algebra Some courses will be aimed at students who have already encountered the basic principles of linear algebra in other courses. For example, a college algebra course will often include an introduction to systems of linear equations, matrices, and determinants; a multivariable calculus course will almost certainly contain material on vectors, lines, and planes. For students who have seen such topics already, much early material can be omitted and replaced with a quick review. Depending on the background of the class, it may be possible to skim over the material in the basic course up to Section 3.3 in about six lectures. If the class has a significant number of mathematics majors (and especially if this is the only linear algebra course they will take), I would be sure to cover Sections 6.1-6.5, 7.1, and 7.4 and as many applications as time permits. If the course has science majors (but not mathematics majors), I would cover Sections 6.1 and 7.1 and a broader selection of applications, being sure to include the material on differential equations and approximation of functions. If computer science students or engineers are prominently represented, I would try to do as much of the material on codes and numerical linear algebra as I could. There are many other types of courses that can successfully use this text.
I hope that you find it useful for your course and that you enjoy using it. To the Student "Where shall I begin, please your Majesty?" he asked. "Begin at the beginning," the King said, gravely, "and go on till you come to the end: then stop!" —Lewis Carroll Alice's Adventures in Wonderland, 1865 Linear algebra is an exciting subject. It is full of interesting results, applications to other disciplines, and connections to other areas of mathematics. The Student Solutions Manual and Study Guide contains detailed advice on how best to use this book; following are some general suggestions. Linear algebra has several sides: There are computational techniques, concepts, and applications. One of the goals of this book is to help you master all of these facets of the subject and to see the interplay among them. Consequently, it is important that you read and understand each section of the text before you attempt the exercises in that section. If you read only examples that are related to exercises that have been assigned as homework, you will miss much. Make sure you understand the definitions of terms and the meaning of theorems. Don't worry if you have to read something more than once before you understand it. Have a pencil and calculator with you as you read. Stop to work out examples for yourself or to fill in missing calculations. An icon in the margin indicates a place where you should pause and think over what you have read so far. Answers to most odd-numbered computational exercises are in the back of the book. Resist the temptation to look up an answer before you have completed a question. And remember that even if your answer differs from the one in the back, you may still be right; there is more than one correct way to express some of the solutions.
For example, a value of 1/√2 can also be expressed as √2/2, and the set of all scalar multiples of the vector [3, 1/2] is the same as the set of all scalar multiples of any nonzero scalar multiple of that vector. As you encounter new concepts, try to relate them to examples that you know. Write out proofs and solutions to exercises in a logical, connected way, using complete sentences. Read back what you have written to see whether it makes sense. Better yet, if you can, have a friend in the class read what you have written. If it doesn't make sense to another person, chances are that it doesn't make sense, period. You will find that a calculator with matrix capabilities or a computer algebra system is useful. These tools can help you to check your own hand calculations and are indispensable for some problems involving tedious computations. Technology also enables you to explore aspects of linear algebra on your own. You can play "what if?" games: What if I change one of the entries in this vector? What if this matrix is of a different size? Can I force the solution to be what I would like it to be by changing something? To signal places in the text or exercises where the use of technology is recommended, I have placed the icon CAS in the margin. The companion website that accompanies this book contains computer code working out selected exercises from the book using Maple, Mathematica, and MATLAB, as well as Technology Bytes, an appendix providing much additional advice about the use of technology in linear algebra. You are about to embark on a journey through linear algebra. Think of this book as your travel guide. Are you ready? Let's go! Vectors Here they come pouring out of the blue. Little arrows for me and for you. —Albert Hammond and Mike Hazelwood Little Arrows Dutchess Music/BMI, 1968 1.0 Introduction: The Racetrack Game Many measurable quantities, such as length, area, volume, mass, and temperature, can be completely described by specifying their magnitude.
Other quantities, such as velocity, force, and acceleration, require both a magnitude and a direction for their description. These quantities are vectors. For example, wind velocity is a vector consisting of wind speed and direction, such as 10 km/h southwest. Geometrically, vectors are often represented as arrows or directed line segments. Although the idea of a vector was introduced in the 19th century, its usefulness in applications, particularly those in the physical sciences, was not realized until the 20th century. More recently, vectors have found applications in computer science, statistics, economics, and the life and social sciences. We will consider some of these many applications throughout this book. This chapter introduces vectors and begins to consider some of their geometric and algebraic properties. We begin, though, with a simple game that introduces some of the key ideas. [You may even wish to play it with a friend during those (very rare!) dull moments in linear algebra class.] The game is played on graph paper. A track, with a starting line and a finish line, is drawn on the paper. The track can be of any length and shape, so long as it is wide enough to accommodate all of the players. For this example, we will have two players (let's call them Ann and Bert) who use different colored pens to represent their cars or bicycles or whatever they are going to race around the track. (Let's think of Ann and Bert as cyclists.) Ann and Bert each begin by drawing a dot on the starting line at a grid point on the graph paper. They take turns moving to a new grid point, subject to the following rules: 1. Each new grid point and the line segment connecting it to the previous grid point must lie entirely within the track. 2. No two players may occupy the same grid point on the same turn. (This is the "no collisions" rule.) 3.
Each new move is related to the previous move as follows: If a player moves a units horizontally and b units vertically on one move, then on the next move he or she must move between a - 1 and a + 1 units horizontally and between b - 1 and b + 1 units vertically. In other words, if the second move is c units horizontally and d units vertically, then |a - c| ≤ 1 and |b - d| ≤ 1. (This is the "acceleration/deceleration" rule.) Note that this rule forces the first move to be 1 unit vertically and/or 1 unit horizontally. A player who collides with another player or leaves the track is eliminated. The winner is the first player to cross the finish line. If more than one player crosses the finish line on the same turn, the one who goes farthest past the finish line is the winner. In the sample game shown in Figure 1.1, Ann was the winner. Bert accelerated too quickly and had difficulty negotiating the turn at the top of the track. To understand rule 3, consider Ann's third and fourth moves. On her third move, she went 1 unit horizontally and 3 units vertically. On her fourth move, her options were to move 0 to 2 units horizontally and 2 to 4 units vertically. (Notice that some of these combinations would have placed her outside the track.) She chose to move 2 units in each direction. The Irish mathematician William Rowan Hamilton (1805-1865) used vector concepts in his study of complex numbers and their generalization, the quaternions. Figure 1.1 A sample game of racetrack Problem 1 Play a few games of racetrack. Problem 2 Is it possible for Bert to win this race by choosing a different sequence of moves? Problem 3 Use the notation [a, b] to denote a move that is a units horizontally and b units vertically. (Either a or b or both may be negative.) If move [3, 4] has just been made, draw on graph paper all the grid points that could possibly be reached on the next move. Problem 4 What is the net effect of two successive moves?
In other words, if you move [a, b] and then [c, d], how far horizontally and vertically will you have moved altogether? Problem 5 Write out Ann's sequence of moves using the [a, b] notation. Suppose she begins at the origin (0, 0) on the coordinate axes. Explain how you can find the coordinates of the grid point corresponding to each of her moves without looking at the graph paper. If the axes were drawn differently, so that Ann's starting point was not the origin but the point (2, 3), what would the coordinates of her final point be? Although simple, this game introduces several ideas that will be useful in our study of vectors. The next three sections consider vectors from geometric and algebraic viewpoints, beginning, as in the racetrack game, in the plane. 1.1 The Geometry and Algebra of Vectors The Cartesian plane is named after the French philosopher and mathematician René Descartes (1596-1650), whose introduction of coordinates allowed geometric problems to be handled using algebraic techniques. The word vector comes from the Latin root meaning "to carry." A vector is formed when a point is displaced, or "carried off," a given distance in a given direction. Viewed another way, vectors "carry" two pieces of information: their length and their direction. When writing vectors by hand, it is difficult to indicate boldface. Some people prefer to write v with an arrow over it for the vector denoted in print by v, but in most cases it is fine to use an ordinary lowercase v. It will usually be clear from the context when the letter denotes a vector. The word component is derived from the Latin words co, meaning "together with," and ponere, meaning "to put." Thus, a vector is "put together" out of its components. Vectors in the Plane We begin by considering the Cartesian plane with the familiar x- and y-axes. A vector is a directed line segment that corresponds to a displacement from one point A to another point B; see Figure 1.2.
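Problems 4 and 5 above lend themselves to a quick computational check. The sketch below, in Python, is not part of the text; the helper names (net_effect, legal_next_moves) are illustrative only. It encodes rule 3 as the pair of inequalities |a - c| ≤ 1 and |b - d| ≤ 1 and shows that the net effect of successive moves is just the component-wise sum.

```python
# Sketch for the racetrack game (hypothetical helper names, not from the text).

def net_effect(moves):
    """Total horizontal and vertical displacement of a sequence of moves [a, b]."""
    x = sum(a for a, b in moves)
    y = sum(b for a, b in moves)
    return (x, y)

def legal_next_moves(a, b):
    """All moves [c, d] allowed by rule 3 after a move [a, b]:
    |a - c| <= 1 and |b - d| <= 1."""
    return [(c, d) for c in (a - 1, a, a + 1) for d in (b - 1, b, b + 1)]

# After move [3, 4], rule 3 allows exactly 9 candidate next moves
# (3 horizontal choices times 3 vertical choices).
print(len(legal_next_moves(3, 4)))   # 9
# Ann's third and fourth moves, [1, 3] then [2, 2], carry her [3, 5] altogether.
print(net_effect([(1, 3), (2, 2)]))  # (3, 5)
```

A starting point other than the origin, as in Problem 5, simply adds the starting coordinates to the net effect.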
The vector from A to B is denoted by AB; the point A is called its initial point, or tail, and the point B is called its terminal point, or head. Often, a vector is simply denoted by a single boldface, lowercase letter such as v. The set of all points in the plane corresponds to the set of all vectors whose tails are at the origin O. To each point A, there corresponds the vector a = OA; to each vector a with tail at O, there corresponds its head A. (Vectors of this form are sometimes called position vectors.) It is natural to represent such vectors using coordinates. For example, in Figure 1.3, A = (3, 2) and we write the vector a = OA = [3, 2] using square brackets. Similarly, the other vectors in Figure 1.3 are b = [-1, 3] and c = [2, -1]. The individual coordinates (3 and 2 in the case of a) are called the components of the vector. A vector is sometimes said to be an ordered pair of real numbers. The order is important since, for example, [3, 2] ≠ [2, 3]. In general, two vectors are equal if and only if their corresponding components are equal. Thus, [x, y] = [1, 5] implies that x = 1 and y = 5. It is frequently convenient to use column vectors instead of (or in addition to) row vectors. Another representation of [3, 2] is the column vector with entries 3 and 2. (The important point is that the components are ordered.) In later chapters, you will see that column vectors are somewhat better from a computational point of view; for now, try to get used to both representations. It may occur to you that we cannot really draw the vector [0, 0] = OO from the origin to itself. Nevertheless, it is a perfectly good vector and has a special name: the zero vector. The zero vector is denoted by 0. The set of all vectors with two components is denoted by R2 (where R denotes the set of real numbers from which the components of vectors in R2 are chosen). R2 is pronounced "r two." Thus, [-1, 3.5], [√2, π], and [§, 4] are all in R2.
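The component-wise view of vectors maps directly onto ordered pairs in code. The short Python sketch below (again, illustrative only and not from the text) demonstrates that order matters and that equality holds exactly when corresponding components agree.

```python
# Vectors in R2 as ordered pairs: equality is component-wise, and order matters.
v = (3, 2)
w = (2, 3)
print(v == w)   # False: [3, 2] is not the same vector as [2, 3]

# [x, y] = [1, 5] forces x = 1 and y = 5.
x, y = (1, 5)
print(x, y)     # 1 5

zero = (0, 0)   # the zero vector 0
```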
Thinking back to the racetrack game, let's try to connect all of these ideas to vectors whose tails are not at the origin. The etymological origin of the word vector in the verb "to carry" provides a clue. The vector [3, 2] may be interpreted as follows: Starting at the origin O, travel 3 units to the right, then 2 units up, finishing at P. The same displacement may be applied with other initial points. Figure 1.4 shows two equivalent displacements, represented by the vectors AB and CD.

When vectors are referred to by their coordinates, they are being considered analytically.

We define two vectors as equal if they have the same length and the same direction. Thus, AB = CD in Figure 1.4. (Even though they have different initial and terminal points, they represent the same displacement.) Geometrically, two vectors are equal if one can be obtained by sliding (or translating) the other parallel to itself until the two vectors coincide.

In terms of components, in Figure 1.4 we have A = (3, 1) and B = (6, 3). Notice that the vector [3, 2] that records the displacement is just the difference of the respective components:

AB = [3, 2] = [6 - 3, 3 - 1]

Similarly, with C = (-4, -1) and D = (-1, 1),

CD = [-1 - (-4), 1 - (-1)] = [3, 2]

and thus AB = CD, as expected.

A vector such as OP with its tail at the origin is said to be in standard position. The foregoing discussion shows that every vector can be drawn as a vector in standard position. Conversely, a vector in standard position can be redrawn (by translation) so that its tail is at any point in the plane.

Example 1.1 If A = (-1, 2) and B = (3, 4), find AB and redraw it (a) in standard position and (b) with its tail at the point C = (2, -1).

Solution We compute AB = [3 - (-1), 4 - 2] = [4, 2]. If AB is then translated to CD, where C = (2, -1), then we must have D = (2 + 4, -1 + 2) = (6, 1). (See Figure 1.5.)

New Vectors from Old

As in the racetrack game, we often want to "follow" one vector by another.
This leads to the notion of vector addition, the first basic vector operation. If we follow u by v, we can visualize the total displacement as a third vector, denoted by u + v. In Figure 1.6, u = [1, 2] and v = [2, 2], so the net effect of following u by v is

[1 + 2, 2 + 2] = [3, 4]

which gives u + v. In general, if u = [u1, u2] and v = [v1, v2], then their sum u + v is the vector

u + v = [u1 + v1, u2 + v2]

It is helpful to visualize u + v geometrically. The following rule is the geometric version of the foregoing discussion.

The Head-to-Tail Rule Given vectors u and v in R^2, translate v so that its tail coincides with the head of u. The sum u + v of u and v is the vector from the tail of u to the head of v. (See Figure 1.7.)

By translating u and v parallel to themselves, we obtain a parallelogram, as shown in Figure 1.8. This parallelogram is called the parallelogram determined by u and v. It leads to an equivalent version of the head-to-tail rule for vectors in standard position.

The Parallelogram Rule Given vectors u and v in R^2 (in standard position), their sum u + v is the vector in standard position along the diagonal of the parallelogram determined by u and v. (See Figure 1.9.)

Example 1.2 If u = [3, -1] and v = [1, 4], compute and draw u + v.

Solution We compute u + v = [3 + 1, -1 + 4] = [4, 3]. This vector is drawn using the head-to-tail rule in Figure 1.10(a) and using the parallelogram rule in Figure 1.10(b).

The second basic vector operation is scalar multiplication. Given a vector v and a real number c, the scalar multiple cv is the vector obtained by multiplying each component of v by c. For example, 3[-2, 4] = [-6, 12]. In general,

cv = c[v1, v2] = [cv1, cv2]

Geometrically, cv is a "scaled" version of v.

Example 1.3 If v = [-2, 4], compute and draw 2v, (1/2)v, and -2v.
Solution We calculate as follows:

2v = [2(-2), 2(4)] = [-4, 8]
(1/2)v = [(1/2)(-2), (1/2)(4)] = [-1, 2]
-2v = [-2(-2), -2(4)] = [4, -8]

These vectors are shown in Figure 1.11.

The term scalar comes from the Latin word scala, meaning "ladder." The equally spaced rungs on a ladder suggest a scale, and in vector arithmetic, multiplication by a constant changes only the scale (or length) of a vector. Thus, constants became known as scalars.

Observe that cv has the same direction as v if c > 0 and the opposite direction if c < 0. We also see that cv is |c| times as long as v. For this reason, in the context of vectors, constants (i.e., real numbers) are referred to as scalars. As Figure 1.12 shows, when translation of vectors is taken into account, two vectors are scalar multiples of each other if and only if they are parallel.

A special case of a scalar multiple is (-1)v, which is written as -v and is called the negative of v. We can use it to define vector subtraction: The difference of u and v is the vector u - v defined by

u - v = u + (-v)

Figure 1.13 shows that u - v corresponds to the "other" diagonal of the parallelogram determined by u and v.

Example 1.4 If u = [1, 2] and v = [-3, 1], then u - v = [1 - (-3), 2 - 1] = [4, 1].

The definition of subtraction in Example 1.4 also agrees with the way we calculate a vector such as AB. If the points A and B correspond to the vectors a and b in standard position, then AB = b - a, as shown in Figure 1.14. [Observe that the head-to-tail rule applied to this diagram gives the equation a + (b - a) = b. If we had accidentally drawn b - a with its head at A instead of at B, the diagram would have read b + (b - a) = a, which is clearly wrong! More will be said about algebraic expressions involving vectors later in this section.]

Vectors in R^3

Everything we have just done extends easily to three dimensions.
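Before moving on, the two basic operations above can be sketched in code. This is a minimal illustration of my own (not from the text): addition and scalar multiplication are performed componentwise, and subtraction is defined as u + (-1)v, exactly as in the text.

```python
def add(u, v):
    """Componentwise vector addition: [u1 + v1, u2 + v2, ...]."""
    return [ui + vi for ui, vi in zip(u, v)]

def scale(c, v):
    """Scalar multiple cv: multiply each component of v by c."""
    return [c * vi for vi in v]

def sub(u, v):
    """Vector subtraction, defined as u + (-1)v."""
    return add(u, scale(-1, v))

print(add([3, -1], [1, 4]))   # [4, 3], as in Example 1.2
print(scale(-2, [-2, 4]))     # [4, -8], as in Example 1.3
print(sub([1, 2], [-3, 1]))   # [4, 1], as in Example 1.4
```

Because the helpers zip over the components, the same functions work unchanged for vectors in R^3 or, in general, R^n.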
The set of all ordered triples of real numbers is denoted by R^3. Points and vectors are located using three mutually perpendicular coordinate axes that meet at the origin O. A point such as A = (1, 2, 3) can be located as follows: First travel 1 unit along the x-axis, then move 2 units parallel to the y-axis, and finally move 3 units parallel to the z-axis. The corresponding vector a = [1, 2, 3] is then OA, as shown in Figure 1.15.

Another way to visualize vector a in R^3 is to construct a box whose six sides are determined by the three coordinate planes (the xy-, xz-, and yz-planes) and by three planes through the point (1, 2, 3) parallel to the coordinate planes. The vector [1, 2, 3] then corresponds to the diagonal from the origin to the opposite corner of the box (see Figure 1.16).

The "componentwise" definitions of vector addition and scalar multiplication are extended to R^3 in an obvious way.

Vectors in R^n

In general, we define R^n as the set of all ordered n-tuples of real numbers written as row or column vectors. Thus, a vector v in R^n is of the form

[v1, v2, ..., vn]

or the corresponding column vector. The individual entries of v are its components; vi is called the ith component.

We extend the definitions of vector addition and scalar multiplication to R^n in the obvious way: If u = [u1, u2, ..., un] and v = [v1, v2, ..., vn], the ith component of u + v is ui + vi and the ith component of cv is just cvi.

Since in R^n we can no longer draw pictures of vectors, it is important to be able to calculate with vectors. We must be careful not to assume that vector arithmetic will be similar to the arithmetic of real numbers. Often it is, and the algebraic calculations we do with vectors are similar to those we would do with scalars. But, in later sections, we will encounter situations where vector algebra is quite unlike our previous experience with real numbers. So it is important to verify any algebraic properties before attempting to use them.
One such property is commutativity of addition: u + v = v + u for vectors u and v. This is certainly true in R^2. Geometrically, the head-to-tail rule shows that both u + v and v + u are the main diagonal of the parallelogram determined by u and v. (The parallelogram rule also reflects this symmetry; see Figure 1.17.)

Note that Figure 1.17 is simply an illustration of the property u + v = v + u. It is not a proof, since it does not cover every possible case. For example, we must also include the cases where u = v, u = -v, and u = 0. (What would diagrams for these cases look like?) For this reason, an algebraic proof is needed. However, it is just as easy to give a proof that is valid in R^n as to give one that is valid in R^2.

The following theorem summarizes the algebraic properties of vector addition and scalar multiplication in R^n. The proofs follow from the corresponding properties of real numbers.

Theorem 1.1 Algebraic Properties of Vectors in R^n

Let u, v, and w be vectors in R^n and let c and d be scalars. Then
a. u + v = v + u                       Commutativity
b. (u + v) + w = u + (v + w)           Associativity
c. u + 0 = u
d. u + (-u) = 0
e. c(u + v) = cu + cv                  Distributivity
f. (c + d)u = cu + du                  Distributivity
g. c(du) = (cd)u
h. 1u = u

Remarks

The word theorem is derived from the Greek word theorema, which in turn comes from a word meaning "to look at." Thus, a theorem is based on the insights we have when we look at examples and extract from them properties that we try to prove hold in general. Similarly, when we understand something in mathematics (the proof of a theorem, for example) we often say, "I see."

• Properties (c) and (d) together with the commutativity property (a) imply that 0 + u = u and -u + u = 0 as well.
• If we read the distributivity properties (e) and (f) from right to left, they say that we can factor a common scalar or a common vector from a sum.
Proof We prove properties (a) and (b) and leave the proofs of the remaining properties as exercises. Let u = [u1, u2, ..., un], v = [v1, v2, ..., vn], and w = [w1, w2, ..., wn].

(a) u + v = [u1, u2, ..., un] + [v1, v2, ..., vn]
          = [u1 + v1, u2 + v2, ..., un + vn]
          = [v1 + u1, v2 + u2, ..., vn + un]
          = [v1, v2, ..., vn] + [u1, u2, ..., un]
          = v + u

The second and fourth equalities are by the definition of vector addition, and the third equality is by the commutativity of addition of real numbers.

(b) Figure 1.18 illustrates associativity in R^2. Algebraically, we have

(u + v) + w = ([u1, u2, ..., un] + [v1, v2, ..., vn]) + [w1, w2, ..., wn]
            = [u1 + v1, u2 + v2, ..., un + vn] + [w1, w2, ..., wn]
            = [(u1 + v1) + w1, (u2 + v2) + w2, ..., (un + vn) + wn]
            = [u1 + (v1 + w1), u2 + (v2 + w2), ..., un + (vn + wn)]
            = [u1, u2, ..., un] + [v1 + w1, v2 + w2, ..., vn + wn]
            = [u1, u2, ..., un] + ([v1, v2, ..., vn] + [w1, w2, ..., wn])
            = u + (v + w)

The fourth equality is by the associativity of addition of real numbers. Note the careful use of parentheses.

By property (b) of Theorem 1.1, we may unambiguously write u + v + w without parentheses, since we may group the summands in whichever way we please. By (a), we may also rearrange the summands, for example, as w + u + v, if we choose. Likewise, sums of four or more vectors can be calculated without regard to order or grouping. In general, if v1, v2, ..., vk are vectors in R^n, we will write such sums without parentheses:

v1 + v2 + ... + vk

The next example illustrates the use of Theorem 1.1 in performing algebraic calculations with vectors.

Example 1.5 Let a, b, and x denote vectors in R^n.
(a) Simplify 3a + (5b - 2a) + 2(b - a).
(b) If 5x - a = 2(a + 2x), solve for x in terms of a.

Solution We will give both solutions in detail, with reference to all of the properties in Theorem 1.1 that we use. It is good practice to justify all steps the first few times you do this type of calculation.
Once you are comfortable with the vector properties, though, it is acceptable to leave out some of the intermediate steps to save time and space.

(a) We begin by inserting parentheses:

3a + (5b - 2a) + 2(b - a) = (3a + (5b - 2a)) + 2(b - a)
  = (3a + (-2a + 5b)) + (2b - 2a)      (a), (e)
  = ((3a + (-2a)) + 5b) + (2b - 2a)    (b)
  = ((3 + (-2))a + 5b) + (2b - 2a)     (f)
  = (1a + 5b) + (2b - 2a)
  = ((a + 5b) + 2b) - 2a               (h), (b)
  = (a + (5b + 2b)) - 2a               (b)
  = (a + (5 + 2)b) - 2a                (f)
  = (7b + a) - 2a                      (a)
  = 7b + (a - 2a)                      (b)
  = 7b + (1 - 2)a                      (f), (h)
  = 7b + (-1)a
  = 7b - a

You can see why we will agree to omit some of these steps! In practice, it is acceptable to simplify this sequence of steps as

3a + (5b - 2a) + 2(b - a) = 3a + 5b - 2a + 2b - 2a
                          = (3a - 2a - 2a) + (5b + 2b)
                          = -a + 7b

or even to do most of the calculation mentally.

(b) In detail, we have

5x - a = 2(a + 2x)
5x - a = 2a + 2(2x)                    (e)
5x - a = 2a + (2 · 2)x                 (g)
5x - a = 2a + 4x
(5x - a) - 4x = (2a + 4x) - 4x
(-a + 5x) - 4x = 2a + (4x - 4x)        (a), (b)
-a + (5x - 4x) = 2a + 0                (b), (d)
-a + (5 - 4)x = 2a                     (f), (c)
-a + (1)x = 2a
a + (-a + x) = a + 2a                  (h)
(a + (-a)) + x = (1 + 2)a              (b), (f)
0 + x = 3a                             (d)
x = 3a                                 (c)

Again, we will usually omit most of these steps.

Linear Combinations and Coordinates

A vector that is a sum of scalar multiples of other vectors is said to be a linear combination of those vectors. The formal definition follows.

Definition A vector v is a linear combination of vectors v1, v2, ..., vk if there are scalars c1, c2, ..., ck such that v = c1v1 + c2v2 + ... + ckvk. The scalars c1, c2, ..., ck are called the coefficients of the linear combination.

Example 1.6 The vector [2, -2, -1] is a linear combination of [1, 0, -1], [2, -3, 1], and [5, -4, 0], since

3[1, 0, -1] + 2[2, -3, 1] - [5, -4, 0] = [2, -2, -1]

Remark Determining whether a given vector is a linear combination of other vectors is a problem we will address in Chapter 2.
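A linear combination can be checked numerically. The sketch below is mine (the helper name `linear_combination` is not from the text); it forms c1v1 + c2v2 + ... + ckvk componentwise and verifies the combination in Example 1.6.

```python
def linear_combination(coeffs, vectors):
    """Return c1*v1 + c2*v2 + ... + ck*vk, computed componentwise."""
    n = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(n)]

v1, v2, v3 = [1, 0, -1], [2, -3, 1], [5, -4, 0]

# Example 1.6: coefficients 3, 2, -1 produce the vector [2, -2, -1]
print(linear_combination([3, 2, -1], [v1, v2, v3]))  # [2, -2, -1]
```

Deciding whether such coefficients *exist* for a given vector is the problem deferred to Chapter 2; this code only verifies a proposed combination.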
In R^2, it is possible to depict linear combinations of two (nonparallel) vectors quite conveniently.

Example 1.7 Let u = [3, 1] and v = [1, 2]. We can use u and v to locate a new set of axes (in the same way that e1 = [1, 0] and e2 = [0, 1] locate the standard coordinate axes). We can use these new axes to determine a coordinate grid that will let us easily locate linear combinations of u and v.

As Figure 1.19 shows, w can be located by starting at the origin and traveling -u followed by 2v. That is,

w = -u + 2v

We say that the coordinates of w with respect to u and v are -1 and 2. (Note that this is just another way of thinking of the coefficients of the linear combination.) It follows that

w = -[3, 1] + 2[1, 2] = [-1, 3]

(Observe that -1 and 3 are the coordinates of w with respect to e1 and e2.)

Switching from the standard coordinate axes to alternative ones is a useful idea. It has applications in chemistry and geology, since molecular and crystalline structures often do not fall onto a rectangular grid. It is an idea that we will encounter repeatedly in this book.

Binary Vectors and Modular Arithmetic

We will also encounter a type of vector that has no geometric interpretation, at least not using Euclidean geometry. Computers represent data in terms of 0s and 1s (which can be interpreted as off/on, closed/open, false/true, or no/yes). Binary vectors are vectors each of whose components is a 0 or a 1. As we will see in Chapter 8, such vectors arise naturally in the study of many types of codes.

In this setting, the usual rules of arithmetic must be modified, since the result of each calculation involving scalars must be a 0 or a 1. The modified rules for addition and multiplication are given below.

+ | 0 1        · | 0 1
0 | 0 1        0 | 0 0
1 | 1 0        1 | 0 1

The only curiosity here is the rule that 1 + 1 = 0.
This is not as strange as it appears; if we replace 0 with the word "even" and 1 with the word "odd," these tables simply summarize the familiar parity rules for the addition and multiplication of even and odd integers. For example, 1 + 1 = 0 expresses the fact that the sum of two odd integers is an even integer. With these rules, our set of scalars {0, 1} is denoted by Z_2 and is called the set of integers modulo 2.

Example 1.9 In Z_2, 1 + 1 + 0 + 1 = 1 and 1 + 1 + 1 + 1 = 0. (These calculations illustrate the parity rules again: The sum of three odds and an even is odd; the sum of four odds is even.)

With Z_2 as our set of scalars, we now extend the above rules to vectors. The set of all n-tuples of 0s and 1s (with all arithmetic performed modulo 2) is denoted by Z_2^n. The vectors in Z_2^n are called binary vectors of length n.

(We are using the term length differently from the way we used it in R^n. This should not be confusing, since there is no geometric notion of length for binary vectors.)

Example 1.10 The vectors in Z_2^2 are [0, 0], [0, 1], [1, 0], and [1, 1]. (How many vectors does Z_2^n contain, in general?)

Example 1.11 Let u = [1, 1, 0, 1, 0] and v = [0, 1, 1, 1, 0] be two binary vectors of length 5. Find u + v.

Solution The calculation of u + v takes place over Z_2, so we have

u + v = [1, 1, 0, 1, 0] + [0, 1, 1, 1, 0]
      = [1 + 0, 1 + 1, 0 + 1, 1 + 1, 0 + 0]
      = [1, 0, 1, 0, 0]

It is possible to generalize what we have just done for binary vectors to vectors whose components are taken from a finite set {0, 1, 2, ..., k} for k ≥ 2. To do so, we must first extend the idea of binary arithmetic.

The integers modulo 3 is the set Z_3 = {0, 1, 2} with addition and multiplication given by the following tables:

+ | 0 1 2        · | 0 1 2
0 | 0 1 2        0 | 0 0 0
1 | 1 2 0        1 | 0 1 2
2 | 2 0 1        2 | 0 2 1

Observe that the result of each addition and multiplication belongs to the set {0, 1, 2}; we say that Z_3 is closed with respect to the operations of addition and multiplication.
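The tables above can be generated rather than memorized. This is a small sketch of my own (the helper `add_mod` is not from the text): each entry is just ordinary integer arithmetic followed by taking the remainder modulo m, which is Python's `%` operator, and the same idea applied componentwise adds binary vectors as in Example 1.11.

```python
# Generate the Z_3 addition and multiplication tables by reducing mod 3
m = 3
add_table = [[(a + b) % m for b in range(m)] for a in range(m)]
mul_table = [[(a * b) % m for b in range(m)] for a in range(m)]
print(add_table)  # [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
print(mul_table)  # [[0, 0, 0], [0, 1, 2], [0, 2, 1]]

def add_mod(u, v, m):
    """Componentwise vector addition over Z_m."""
    return [(ui + vi) % m for ui, vi in zip(u, v)]

# Example 1.11: adding binary vectors of length 5 over Z_2
print(add_mod([1, 1, 0, 1, 0], [0, 1, 1, 1, 0], 2))  # [1, 0, 1, 0, 0]
```

Changing `m` reproduces the tables for any modulus, which is exactly what closure under addition and multiplication guarantees.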
It is perhaps easiest to think of this set in terms of a 3-hour clock with 0, 1, and 2 on its face, as shown in Figure 1.20. The calculation 1 + 2 = 0 translates as follows: 2 hours after 1 o'clock, it is 0 o'clock. Just as 24:00 and 12:00 are the same on a 12-hour clock, so 3 and 0 are equivalent on this 3-hour clock. Likewise, all multiples of 3, positive and negative, are equivalent to 0 here; 1 is equivalent to any number that is 1 more than a multiple of 3 (such as -2, 4, and 7); and 2 is equivalent to any number that is 2 more than a multiple of 3 (such as -1, 5, and 8). We can visualize the number line as wrapping around a circle, as shown in Figure 1.21.

Example 1.12 To what is 3548 equivalent in Z_3?

Solution This is the same as asking where 3548 lies on our 3-hour clock. The key is to calculate how far this number is from the nearest (smaller) multiple of 3; that is, we need to know the remainder when 3548 is divided by 3. By long division, we find that 3548 = 3 · 1182 + 2, so the remainder is 2. Therefore, 3548 is equivalent to 2 in Z_3.

In courses in abstract algebra and number theory, which explore this concept in greater detail, the above equivalence is often written as 3548 ≡ 2 (mod 3) or 3548 = 2 (mod 3), where ≡ is read "is congruent to." We will not use this notation or terminology here.

Example 1.13 In Z_3, calculate 2 + 2 + 1 + 2.

Solution 1 We use the same ideas as in Example 1.12. The ordinary sum is 2 + 2 + 1 + 2 = 7, which is 1 more than 6, so division by 3 leaves a remainder of 1. Thus, 2 + 2 + 1 + 2 = 1 in Z_3.

Solution 2 A better way to perform this calculation is to do it step by step entirely in Z_3:

2 + 2 + 1 + 2 = (2 + 2) + 1 + 2
              = 1 + 1 + 2
              = (1 + 1) + 2
              = 2 + 2
              = 1

Here we have used parentheses to group the terms we have chosen to combine.
We could speed things up by simultaneously combining the first two and the last two terms:

(2 + 2) + (1 + 2) = 1 + 0 = 1

Repeated multiplication can be handled similarly. The idea is to use the addition and multiplication tables to reduce the result of each calculation to 0, 1, or 2. Extending these ideas to vectors is straightforward.

Example 1.14 In Z_3^5, let u = [2, 2, 0, 1, 2] and v = [1, 2, 2, 2, 1]. Then

u + v = [2, 2, 0, 1, 2] + [1, 2, 2, 2, 1]
      = [2 + 1, 2 + 2, 0 + 2, 1 + 2, 2 + 1]
      = [0, 1, 2, 0, 0]

Vectors in Z_3^5 are referred to as ternary vectors of length 5.

In general, we have the set Z_m = {0, 1, 2, ..., m - 1} of integers modulo m (corresponding to an m-hour clock, as shown in Figure 1.22). A vector of length n whose entries are in Z_m is called an m-ary vector of length n. The set of all m-ary vectors of length n is denoted by Z_m^n.

Exercises 1.1

1. Draw the following vectors in standard position in R^2:
(a) a = (b) b = (c) c = (d) d = [entries: 2 3. 3 -2]
2. Draw the vectors in Exercise 1 with their tails at the point (2, -3).
3. Draw the following vectors in standard position in R^3:
(a) a = [0, 2, 0]  (b) b = [3, 2, 1]  (c) c = [1, -2, 1]  (d) d = [-1, -1, -2]
4. If the vectors in Exercise 3 are translated so that their heads are at the point (3, 2, 1), find the points that correspond to their tails.
5. For each of the following pairs of points, draw the vector AB. Then compute and redraw AB as a vector in standard position.
(a) A = (1, -1), B = (4, 2)  (b) A = (0, -2), B = (2, -1)
(c) A = (2, §), B = (§, 3)   (d) A = (\, k), B = (z, 1)
6. A hiker walks 4 km north and then 5 km northeast. Draw displacement vectors representing the hiker's trip and draw a vector that represents the hiker's net displacement from the starting point.
Exercises 7-10 refer to the vectors in Exercise 1. Compute the indicated vectors and also show how the results can be obtained geometrically.
7. a + b    8. b - c    9. d - c    10. a + d
Exercises 11 and 12 refer to the vectors in Exercise 3.
Compute the indicated vectors.
11. 2a + 3c    12. 3b - 2c + d
13. Find the components of the vectors u, v, u + v, and u - v, where u and v are as shown in Figure 1.23.
14. In Figure 1.24, A, B, C, D, E, and F are the vertices of a regular hexagon centered at the origin. Express each of the following vectors in terms of a = OA and b = OB:
(a) AB  (b) BC  (c) AD  (d) CF  (e) AC  (f) BC + DE + FA
In Exercises 15 and 16, simplify the given vector expression. Indicate which properties in Theorem 1.1 you use.
15. 2(a - 3b) + 3(2b + a)
16. -3(a - c) + 2(a + 2b) + 3(c - b)
In Exercises 17 and 18, solve for the vector x in terms of the vectors a and b.
17. x - a = 2(x - 2a)
18. x + 2a - b = 3(x + a) - 2(2a - b)
In Exercises 19 and 20, draw the coordinate axes relative to u and v and locate w.
19. u = [1, -1], v = [1, 1], w = [entries garbled]
20. u = [-2, 2], v = [1, -2], w = -u - 2v
In Exercises 21 and 22, draw the standard coordinate axes on the same diagram as the axes relative to u and v. Use these to find w as a linear combination of u and v.
21. u = [1, -1]? v = [2, 1]? w = [6, 2]? [entries garbled]
22. u = [2, 2]? v = [3, 1]? w = [9, ?]? [entries garbled]
23. Draw diagrams to illustrate properties (d) and (e) of Theorem 1.1.
24. Give algebraic proofs of properties (d) through (g) of Theorem 1.1.
In Exercises 25-28, u and v are binary vectors. Find u + v in each case.
25. u = [1, 0, 1], v = [1, 1, 0]? [entries garbled]
26. u = [1, 1, 1], v = [1, 0, 1]? [entries garbled]
27. u = [1, 0, 1, 1], v = [1, 1, 1, 1]
28. u = [1, 1, 0, 1, 0], v = [0, 1, 1, 1, 0]
29. Write out the addition and multiplication tables for Z_4.
30. Write out the addition and multiplication tables for Z_5.
In Exercises 31-43, perform the indicated calculations.
31. 2 + 2 + 2 in Z_3         32. 2 · 2 · 2 in Z_3
33. 2(2 + 1 + 2) in Z_3      34. 3 + 1 + 2 + 3 in Z_4
35. 2 · 3 · 2 in Z_4         36. 3(3 + 3 + 2) in Z_4
37. 2 + 1 + 2 + 2 + 1 in Z_3, Z_4, and Z_5
38. (3 + 4)(3 + 2 + 4 + 2) in Z_5
39. 8(6 + 4 + 3) in Z_9      40. 2^100 in Z_11
41. [2, 1, 2] + [2, 0, 1] in Z_3^3
42.
2[2, 2, 1] in Z_3^3
43. 2([3, 1, 1, 2] + [3, 3, 2, 1]) in Z_4^4 and Z_5^4
In Exercises 44-55, solve the given equation or indicate that there is no solution.
44. x + 3 = 2 in Z_5         45. x + 5 = 1 in Z_6
46. 2x = 1 in Z_3            47. 2x = 1 in Z_4
48. 2x = 1 in Z_5            49. 3x = 4 in Z_5
50. 3x = 4 in Z_6            51. 6x = 5 in Z_8
52. 8x = 9 in Z_11           53. 2x + 3 = 2 in Z_5
54. 4x + 5 = 2 in Z_6        55. 6x + 3 = 1 in Z_8
56. (a) For which values of a does x + a = 0 have a solution in Z_5?
(b) For which values of a and b does x + a = b have a solution in Z_6?
(c) For which values of a, b, and m does x + a = b have a solution in Z_m?
57. (a) For which values of a does ax = 1 have a solution in Z_5?
(b) For which values of a does ax = 1 have a solution in Z_6?
(c) For which values of a and m does ax = 1 have a solution in Z_m?

Length and Angle: The Dot Product

It is quite easy to reformulate the familiar geometric concepts of length, distance, and angle in terms of vectors. Doing so will allow us to use these important and powerful ideas in settings more general than R^2 and R^3. In subsequent chapters, these simple geometric tools will be used to solve a wide variety of problems arising in applications, even when there is no geometry apparent at all!

The Dot Product

The vector versions of length, distance, and angle can all be described using the notion of the dot product of two vectors.

Definition If u = [u1, u2, ..., un] and v = [v1, v2, ..., vn], then the dot product u · v of u and v is defined by

u · v = u1v1 + u2v2 + ... + unvn

In words, u · v is the sum of the products of the corresponding components of u and v. It is important to note a couple of things about this "product" that we have just defined: First, u and v must have the same number of components. Second, the dot product u · v is a number, not another vector. (This is why u · v is sometimes called the scalar product of u and v.)
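The definition translates directly into code. A minimal sketch of my own (the helper name `dot` is not from the text): sum the products of corresponding components, after checking that the two vectors have the same number of components.

```python
def dot(u, v):
    """Dot product: the sum of products of corresponding components."""
    assert len(u) == len(v), "u and v must have the same number of components"
    return sum(ui * vi for ui, vi in zip(u, v))

print(dot([1, 2, -3], [-3, 5, 2]))  # 1: the computation in Example 1.15
print(dot([-3, 5, 2], [1, 2, -3]))  # 1 again: u . v = v . u
```

Note that the result is a number (a scalar), not another vector, exactly as the definition requires.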
The dot product of vectors in R^n is a special and important case of the more general notion of inner product, which we will explore in Chapter 7.

Example 1.15 Compute u · v when u = [1, 2, -3] and v = [-3, 5, 2].

Solution u · v = 1 · (-3) + 2 · 5 + (-3) · 2 = 1

Notice that if we had calculated v · u in Example 1.15, we would have computed

v · u = (-3) · 1 + 5 · 2 + 2 · (-3) = 1

That u · v = v · u in general is clear, since the individual products of the components commute. This commutativity property is one of the properties of the dot product that we will use repeatedly. The main properties of the dot product are summarized in Theorem 1.2.

Theorem 1.2 Let u, v, and w be vectors in R^n and let c be a scalar. Then
a. u · v = v · u                                Commutativity
b. u · (v + w) = u · v + u · w                  Distributivity
c. (cu) · v = c(u · v)
d. u · u ≥ 0 and u · u = 0 if and only if u = 0

Proof We prove (a) and (c) and leave proof of the remaining properties for the exercises.

(a) Applying the definition of dot product to u · v and v · u, we obtain

u · v = u1v1 + u2v2 + ... + unvn
      = v1u1 + v2u2 + ... + vnun
      = v · u

where the middle equality follows from the fact that multiplication of real numbers is commutative.

(c) Using the definitions of scalar multiplication and dot product, we have

(cu) · v = [cu1, cu2, ..., cun] · [v1, v2, ..., vn]
         = cu1v1 + cu2v2 + ... + cunvn
         = c(u1v1 + u2v2 + ... + unvn)
         = c(u · v)

Remarks

• Property (b) can be read from right to left, in which case it says that we can factor out a common vector u from a sum of dot products. This property also has a "right-handed" analogue that follows from properties (b) and (a) together: (v + w) · u = v · u + w · u.
• Property (c) can be extended to give u · (cv) = c(u · v) (Exercise 58). This extended version of (c) essentially says that in taking a scalar multiple of a dot product of vectors, the scalar can first be combined with whichever vector is more convenient.
For example,

((1/2)[-1, -3, 2]) · [6, -4, 0] = [-1, -3, 2] · ((1/2)[6, -4, 0])
                                = [-1, -3, 2] · [3, -2, 0]
                                = 3

With this approach we avoid introducing fractions into the vectors, as the original grouping would have.

• The second part of (d) uses the logical connective if and only if. Appendix A discusses this phrase in more detail, but for the moment let us just note that the wording signals a double implication, namely,

if u = 0, then u · u = 0    and    if u · u = 0, then u = 0

Theorem 1.2 shows that aspects of the algebra of vectors resemble the algebra of numbers. The next example shows that we can sometimes find vector analogues of familiar identities.

Example 1.16 Prove that (u + v) · (u + v) = u · u + 2(u · v) + v · v for all vectors u and v in R^n.

Solution

(u + v) · (u + v) = (u + v) · u + (u + v) · v
                  = u · u + v · u + u · v + v · v
                  = u · u + u · v + u · v + v · v
                  = u · u + 2(u · v) + v · v

(Identify the parts of Theorem 1.2 that were used at each step.)

Length

To see how the dot product plays a role in the calculation of lengths, recall how lengths are computed in the plane. The Theorem of Pythagoras is all we need. In R^2, the length of the vector v = [a, b] is the distance from the origin to the point (a, b), which, by Pythagoras' Theorem, is given by √(a² + b²), as in Figure 1.25. Observe that a² + b² = v · v. This leads to the following definition.

Definition The length (or norm) of a vector v = [v1, v2, ..., vn] in R^n is the nonnegative scalar ||v|| defined by

||v|| = √(v · v) = √(v1² + v2² + ... + vn²)

In words, the length of a vector is the square root of the sum of the squares of its components. Note that the square root of v · v is always defined, since v · v ≥ 0 by Theorem 1.2(d). Note also that the definition can be rewritten to give ||v||² = v · v, which will be useful in proving further properties of the dot product and lengths of vectors.

Example 1.17 ||[2, 3]|| = √(2² + 3²) = √13

Theorem 1.3 lists some of the main properties of vector length.

Theorem 1.3 Let v be a vector in R^n and let c be a scalar. Then
a.
||v|| = 0 if and only if v = 0
b. ||cv|| = |c| ||v||

Proof Property (a) follows immediately from Theorem 1.2(d). To show (b), we have

||cv||² = (cv) · (cv) = c²(v · v) = c²||v||²

using Theorem 1.2(c). Taking square roots of both sides, using the fact that √(c²) = |c| for any real number c, gives the result.

A vector of length 1 is called a unit vector. In R^2, the set of all unit vectors can be identified with the unit circle, the circle of radius 1 centered at the origin (see Figure 1.26). Given any nonzero vector v, we can always find a unit vector in the same direction as v by dividing v by its own length (or, equivalently, multiplying by 1/||v||). We can show this algebraically by using property (b) of Theorem 1.3 above: If u = (1/||v||)v, then

||u|| = ||(1/||v||)v|| = |1/||v||| ||v|| = (1/||v||)||v|| = 1

and u is in the same direction as v, since 1/||v|| is a positive scalar. Finding a unit vector in the same direction is often referred to as normalizing a vector (see Figure 1.27).

Example 1.18 In R^2, let e1 = [1, 0] and e2 = [0, 1]. Then e1 and e2 are unit vectors, since the sum of the squares of their components is 1 in each case. Similarly, in R^3, we can construct unit vectors

e1 = [1, 0, 0], e2 = [0, 1, 0], and e3 = [0, 0, 1]

Observe in Figure 1.28 that these vectors serve to locate the positive coordinate axes in R^2 and R^3. In general, in R^n, we define unit vectors e1, e2, ..., en, where ei has 1 in its ith component and zeros elsewhere. These vectors arise repeatedly in linear algebra and are called the standard unit vectors.

Example 1.19 Normalize the vector v = [2, -1, 3].

Solution ||v|| = √(2² + (-1)² + 3²) = √14, so a unit vector in the same direction as v is given by

u = (1/||v||)v = (1/√14)[2, -1, 3] = [2/√14, -1/√14, 3/√14]
Since property (b) of Theorem 1.3 describes how length behaves with respect to scalar multiplication, natural curiosity suggests that we ask whether length and vector addition are compatible. It would be nice if we had an identity such as ||u + v|| = ||u|| + ||v||, but for almost any choice of vectors u and v this turns out to be false. [See Exercise 52(a).] However, all is not lost, for it turns out that if we replace the = sign by ≤, the resulting inequality is true. The proof of this famous and important result, the Triangle Inequality, relies on another important inequality, the Cauchy-Schwarz Inequality, which we will prove and discuss in more detail in Chapter 7.

Theorem 1.4 The Cauchy-Schwarz Inequality

For all vectors u and v in R^n,

|u · v| ≤ ||u|| ||v||

See Exercises 71 and 72 for algebraic and geometric approaches to the proof of this inequality.

In R^2 or R^3, where we can use geometry, it is clear from a diagram such as Figure 1.29 that ||u + v|| ≤ ||u|| + ||v|| for all vectors u and v. We now show that this is true more generally.

Theorem 1.5 The Triangle Inequality

For all vectors u and v in R^n,

||u + v|| ≤ ||u|| + ||v||

Proof Since both sides of the inequality are nonnegative, showing that the square of the left-hand side is less than or equal to the square of the right-hand side is equivalent to proving the theorem. (Why?) We compute

||u + v||² = (u + v) · (u + v)
           = u · u + 2(u · v) + v · v          By Example 1.16
           ≤ ||u||² + 2|u · v| + ||v||²
           ≤ ||u||² + 2||u|| ||v|| + ||v||²    By Cauchy-Schwarz
           = (||u|| + ||v||)²

as required.

Distance

The distance between two vectors is the direct analogue of the distance between two points on the real number line or two points in the Cartesian plane. On the number line (Figure 1.30), the distance between the numbers a and b is given by |a - b|; for example, d = |a - b| = |-2 - 3| = 5. (Taking the absolute value ensures that we do not need to know which of a or b is larger.)
This distance is also equal to √((a − b)²), and its two-dimensional generalization is the familiar formula for the distance d between points (a₁, a₂) and (b₁, b₂), namely,

d = √((a₁ − b₁)² + (a₂ − b₂)²)

Figure 1.30  d = |a − b| = |−2 − 3| = 5

In terms of vectors, if a = [a₁, a₂] and b = [b₁, b₂], then d is just the length of a − b, as shown in Figure 1.31. This is the basis for the next definition.

Figure 1.31  d = √((a₁ − b₁)² + (a₂ − b₂)²) = ||a − b||

Definition  The distance d(u, v) between vectors u and v in R^n is defined by

d(u, v) = ||u − v||

Example 1.20  Find the distance between u = [√2, 1, −1] and v = [0, 2, −2].

Solution  We compute u − v = [√2, −1, 1], so

d(u, v) = ||u − v|| = √((√2)² + (−1)² + 1²) = √4 = 2

Angles

The dot product can also be used to calculate the angle between a pair of vectors. In R^2 or R^3, the angle between the nonzero vectors u and v will refer to the angle θ determined by these vectors that satisfies 0° ≤ θ ≤ 180° (see Figure 1.32).

Figure 1.32  The angle between u and v

In Figure 1.33, consider the triangle with sides u, v, and u − v, where θ is the angle between u and v. Applying the law of cosines to this triangle yields

||u − v||² = ||u||² + ||v||² − 2||u|| ||v|| cos θ

Expanding the left-hand side and using ||v||² = v · v several times, we obtain

||u||² − 2(u · v) + ||v||² = ||u||² + ||v||² − 2||u|| ||v|| cos θ

which, after simplification, leaves us with u · v = ||u|| ||v|| cos θ. From this we obtain the following formula for the cosine of the angle θ between nonzero vectors u and v. We state it as a definition.

Definition  For nonzero vectors u and v in R^n,

cos θ = (u · v) / (||u|| ||v||)

Example 1.21  Compute the angle between the vectors u = [2, 1, −2] and v = [1, 1, 1].

Solution  We calculate u · v = 2·1 + 1·1 + (−2)·1 = 1, ||u|| = √(2² + 1² + (−2)²) = √9 = 3, and ||v|| = √(1² + 1² + 1²) = √3. Therefore, cos θ = 1/(3√3), so θ = cos⁻¹(1/(3√3)) ≈ 1.377 radians, or 78.9°.
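The cosine formula translates directly into code. This is a minimal sketch (plain Python; the function name `angle` is ours) that replays the computation above for u = [2, 1, −2] and v = [1, 1, 1]:

```python
import math

def angle(u, v):
    # cos(theta) = (u . v) / (||u|| ||v||), for nonzero u and v
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.acos(dot / (norm_u * norm_v))

theta = angle([2, 1, -2], [1, 1, 1])
print(round(theta, 3))                # 1.377 radians
print(round(math.degrees(theta), 1))  # 78.9 degrees
```

One caveat: in floating point, rounding can push the quotient slightly outside [−1, 1] for nearly parallel vectors, so a more careful version would clamp it before calling `acos`.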
Example 1.22  Compute the angle between the diagonals on two adjacent faces of a cube.

Solution  The dimensions of the cube do not matter, so we will work with a cube with sides of length 1. Orient the cube relative to the coordinate axes in R^3, as shown in Figure 1.34, and take the two side diagonals to be the vectors [1, 0, 1] and [0, 1, 1]. Then the angle θ between these vectors satisfies

cos θ = (1·0 + 0·1 + 1·1)/(√2 √2) = 1/2

from which it follows that the required angle is π/3 radians, or 60°.

Figure 1.34

(Actually, we don't need to do any calculations at all to get this answer. If we draw a third side diagonal joining the vertices at (1, 0, 1) and (0, 1, 1), we get an equilateral triangle, since all of the side diagonals are of equal length. The angle we want is one of the angles of this triangle and therefore measures 60°. Sometimes, a little insight can save a lot of calculation; in this case, it gives a nice check on our work!)

Remarks

• As this discussion shows, we usually will have to settle for an approximation to the angle between two vectors. However, when the angle is one of the so-called special angles (0°, 30°, 45°, 60°, 90°, or an integer multiple of these), we should be able to recognize its cosine (Table 1.1) and thus give the corresponding angle exactly. In all other cases, we will use a calculator or computer to approximate the desired angle by means of the inverse cosine function.

Table 1.1  Cosines of Special Angles

θ       0°          30°       45°             60°         90°
cos θ   √4/2 = 1    √3/2      √2/2 = 1/√2     √1/2 = 1/2  √0/2 = 0

• The derivation of the formula for the cosine of the angle between two vectors is valid only in R^2 or R^3, since it depends on a geometric fact: the law of cosines. In R^n, for n > 3, the formula can be taken as a definition instead. This makes sense, since the Cauchy-Schwarz Inequality implies that (u · v)/(||u|| ||v||) ranges from −1 to 1, just as the cosine function does.
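As a numerical check on Example 1.22 and the special-angle table, applying the cosine formula to the two face diagonals should give exactly cos 60° = 1/2 (a minimal sketch; the name `cos_angle` is our own):

```python
import math

def cos_angle(u, v):
    # (u . v) / (||u|| ||v||) for nonzero u and v
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Face diagonals of the unit cube from Example 1.22
c = cos_angle([1, 0, 1], [0, 1, 1])
print(c)                                     # 0.5 (up to rounding), the cosine of 60 degrees
print(round(math.degrees(math.acos(c)), 6))  # 60.0
```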
The word orthogonal is derived from the Greek words orthos, meaning "upright," and gonia, meaning "angle." Hence, orthogonal literally means "right-angled." The Latin equivalent is rectangular.

Orthogonal Vectors

The concept of perpendicularity is fundamental to geometry. Anyone studying geometry quickly realizes the importance and usefulness of right angles. We now generalize the idea of perpendicularity to vectors in R^n, where it is called orthogonality.

In R^2 or R^3, two nonzero vectors u and v are perpendicular if the angle θ between them is a right angle, that is, if θ = π/2 radians, or 90°. Thus, (u · v)/(||u|| ||v||) = cos 90° = 0, and it follows that u · v = 0. This motivates the following definition.

Definition  Two vectors u and v in R^n are orthogonal to each other if u · v = 0.

Since 0 · v = 0 for every vector v in R^n, the zero vector is orthogonal to every vector.

Example 1.23  In R^3, u = [1, 1, −2] and v = [3, 1, 2] are orthogonal, since u · v = 3 + 1 − 4 = 0.

Using the notion of orthogonality, we get an easy proof of Pythagoras' Theorem, valid in R^n.

Theorem 1.6  Pythagoras' Theorem
For all vectors u and v in R^n, ||u + v||² = ||u||² + ||v||² if and only if u and v are orthogonal.

Proof  From Example 1.16, we have ||u + v||² = ||u||² + 2(u · v) + ||v||² for all vectors u and v in R^n. It follows immediately that ||u + v||² = ||u||² + ||v||² if and only if u · v = 0. See Figure 1.35.

Figure 1.35

The concept of orthogonality is one of the most important and useful in linear algebra, and it often arises in surprising ways. Chapter 5 contains a detailed treatment of the topic, but we will encounter it many times before then. One problem in which it clearly plays a role is finding the distance from a point to a line, where "dropping a perpendicular" is a familiar step.

Projections

We now consider the problem of finding the distance from a point to a line in the context of vectors.
As you will see, this technique leads to an important concept: the projection of a vector onto another vector.

As Figure 1.36 shows, the problem of finding the distance from a point B to a line ℓ (in R^2 or R^3) reduces to the problem of finding the length of the perpendicular line segment PB or, equivalently, the length of the vector PB. If we choose a point A on ℓ, then, in the right-angled triangle ΔAPB, the other two vectors are the leg AP and the hypotenuse AB. AP is called the projection of AB onto the line ℓ. We will now look at this situation in terms of vectors.

Figure 1.36  The distance from a point to a line
Figure 1.37  The projection of v onto u

Consider two nonzero vectors u and v. Let p be the vector obtained by dropping a perpendicular from the head of v onto u, and let θ be the angle between u and v, as shown in Figure 1.37. Then clearly p = ||p|| û, where û = (1/||u||)u is the unit vector in the direction of u. Moreover, elementary trigonometry gives ||p|| = ||v|| cos θ, and we know that cos θ = (u · v)/(||u|| ||v||). Thus, after substitution, we obtain

p = ||v|| ((u · v)/(||u|| ||v||)) (1/||u||)u = ((u · v)/||u||²)u = ((u · v)/(u · u))u

This is the formula we want, and it is the basis of the following definition for vectors in R^n.

Definition  If u and v are vectors in R^n and u ≠ 0, then the projection of v onto u is the vector proj_u(v) defined by

proj_u(v) = ((u · v)/(u · u))u

An alternative way to derive this formula is described in Exercise 73.

Figure 1.38

Remarks

• The term projection comes from the idea of projecting an image onto a wall (with a slide projector, for example). Imagine a beam of light with rays parallel to each other and perpendicular to u shining down on v. The projection of v onto u is just the shadow cast, or projected, by v onto u.

• It may be helpful to think of proj_u(v) as a function with variable v. Then the variable v occurs only once on the right-hand side of the definition.
Also, it is helpful to remember Figure 1.38, which reminds us that proj_u(v) is a scalar multiple of the vector u (not v).

• Although in our derivation of the definition of proj_u(v) we required v as well as u to be nonzero (why?), it is clear from the geometry that the projection of the zero vector onto u is 0. The definition is in agreement with this, since ((u · 0)/(u · u))u = 0u = 0.

• If the angle between u and v is obtuse, as in Figure 1.38, then proj_u(v) will be in the opposite direction from u; that is, proj_u(v) will be a negative scalar multiple of u.

• If u is a unit vector, then proj_u(v) = (u · v)u. (Why?)

Example 1.24  Find the projection of v onto u in each case.
(a) v = [−1, 3] and u = [2, 1]
(b) v = [1, 2, 3] and u = e₃
(c) v = [1, 2, 3] and u = [1/2, 1/2, 1/√2]

Solution  (a) We compute u · v = −2 + 3 = 1 and u · u = 4 + 1 = 5, so

proj_u(v) = ((u · v)/(u · u))u = (1/5)[2, 1] = [2/5, 1/5]

(b) Since e₃ is a unit vector,

proj_{e₃}(v) = (e₃ · v)e₃ = 3e₃ = [0, 0, 3]

(c) We see that ||u|| = √(1/4 + 1/4 + 1/2) = 1. Thus,

proj_u(v) = (u · v)u = (1/2 + 1 + 3/√2)[1/2, 1/2, 1/√2] = (3(1 + √2)/2)[1/2, 1/2, 1/√2] = (3(1 + √2)/4)[1, 1, √2]

Exercises 1.2

In Exercises 1-6, find u · v.
1. u = [−1, 2], v = [3, 1]
2. u = [3, −2], v = [4, 6]
3. u = [1, 2, 3], v = [2, 3, 1]
CAS 4. u = [3.2, −0.6, −1.4], v = [1.5, 4.1, −0.2]
5. u = [1, √2, √3, 0], v = [4, −√2, 0, −5]
6. u = [1.12, −3.25, 2.07, −1.83], v = [−2.29, 1.72, 4.33, −1.54]

In Exercises 7-12, find ||u|| for the given exercise, and give a unit vector in the direction of u.
7. Exercise 1   8. Exercise 2   9.
Exercise 3   CAS 10. Exercise 4   11. Exercise 5   12. Exercise 6

In Exercises 13-16, find the distance d(u, v) between u and v in the given exercise.
13. Exercise 1   14. Exercise 2   15. Exercise 3   CAS 16. Exercise 4

17. If u, v, and w are vectors in R^n, n ≥ 2, and c is a scalar, explain why the following expressions make no sense:
(a) ||u · v||   (b) u · v + w   (c) u · (v · w)   (d) c · (u + w)

In Exercises 18-23, determine whether the angle between u and v is acute, obtuse, or a right angle.
18. u = [3, −1], v = [4, 1]
19. u = [2, 1, −1], v = [−2, 1, −1]
20. u = [4, 3, −1], v = [1, −1, 1]
CAS 21. u = [0.9, 2.1, 1.2], v = [−4.5, 2.6, −0.8]
22. u = [1, 2, 3, 4], v = [−3, 1, 2, −2]
23. u = [1, 2, 3, 4], v = [5, 6, 7, 8]

In Exercises 24-29, find the angle between u and v in the given exercise.
24. Exercise 18   25. Exercise 19   26. Exercise 20
CAS 27. Exercise 21   CAS 28. Exercise 22   CAS 29. Exercise 23

30. Let A = (−3, 2), B = (1, 0), and C = (4, 6). Prove that ΔABC is a right-angled triangle.
31. Let A = (1, 1, −1), B = (−3, 2, −2), and C = (2, 2, −4). Prove that ΔABC is a right-angled triangle.
CAS 32. Find the angle between a diagonal of a cube and an adjacent edge.
33. A cube has four diagonals. Show that no two of them are perpendicular.
34. A parallelogram has diagonals determined by the vectors d₁ = [2, 2, 0] and d₂ = [1, −1, 3]. Show that the parallelogram is a rhombus (all sides of equal length) and determine the side length.
35. The rectangle ABCD has vertices at A = (1, 2, 3), B = (3, 6, −2), and C = (0, 5, −4). Determine the coordinates of vertex D.
36. An airplane heading due east has a velocity of 200 miles per hour. A wind is blowing from the north at 40 miles per hour. What is the resultant velocity of the airplane?
37. A boat heads north across a river at a rate of 4 miles per hour. If the current is flowing east at a rate of 3 miles per hour, find the resultant velocity of the boat.
38. Ann is driving a motorboat across a river that is 2 km wide. The boat has a speed of 20 km/h in still water, and the current in the river is flowing at 5 km/h. Ann heads out from one bank of the river for a dock directly across from her on the opposite bank. She drives the boat in a direction perpendicular to the current.
(a) How far downstream from the dock will Ann land?
(b) How long will it take Ann to cross the river?
39. Bert can swim at a rate of 2 miles per hour in still water. The current in a river is flowing at a rate of 1 mile per hour.
If Bert wants to swim across the river to a point directly opposite, at what angle to the bank of the river must he swim?

In Exercises 40-45, find the projection of v onto u. Draw a sketch in Exercises 40 and 41.
40. u = [1, 1], v = [−2, 3]
41. u = [3/5, −4/5], v = [1, 4]
42. u = [1/2, −1/4, −1/2], v = [2, 2, −2]
43. u = [1, −1, 1, −1], v = [2, −3, −1, −2]
CAS 44. u = [0.5, 1.5], v = [2.1, 1.2]
CAS 45. u = [3.01, 1.34, −0.33], v = [4.25, 2.52, −1.66]

Figure 1.39 suggests two ways in which vectors may be used to compute the area of a triangle. The area A of the triangle in part (a) is given by A = ½||u|| ||v − proj_u(v)||, and part (b) suggests the trigonometric form of the area of a triangle: A = ½||u|| ||v|| sin θ. (We can use the identity sin θ = √(1 − cos²θ) to find sin θ.) In Exercises 46 and 47, compute the area of the triangle with the given vertices using both methods.

Figure 1.39

46. A = (1, −1), B = (2, 2), C = (4, 0)
47. A = (3, −1, 4), B = (4, −2, 6), C = (5, 0, 2)

In Exercises 48 and 49, find all values of the scalar k for which the two vectors are orthogonal.
48. u = [2, 3], v = [k + 1, k − 1]
49. u = [1, −1, 2], v = [k², k, −3]
50. Describe all vectors v = [x, y] that are orthogonal to u = [3, 1].
51. Describe all vectors v = [x, y] that are orthogonal to u = [a, b].
52. Under what conditions are the following true for vectors u and v in R^2 or R^3?
(a) ||u + v|| = ||u|| + ||v||
(b) ||u + v|| = ||u|| − ||v||
53. Prove Theorem 1.2(b).
54. Prove Theorem 1.2(d).

In Exercises 55-57, prove the stated property of distance between vectors.
55. d(u, v) = d(v, u) for all vectors u and v
56. d(u, w) ≤ d(u, v) + d(v, w) for all vectors u, v, and w
57. d(u, v) = 0 if and only if u = v
58. Prove that u · cv = c(u · v) for all vectors u and v in R^n and all scalars c.
59. Prove that ||u − v|| ≥ ||u|| − ||v|| for all vectors u and v in R^n. [Hint: Replace u by u − v in the Triangle Inequality.]
60. Suppose we know that u · v = u · w. Does it follow that v = w?
If it does, give a proof that is valid in R^n; otherwise, give a counterexample (i.e., a specific set of vectors u, v, and w for which u · v = u · w but v ≠ w).
61. Prove that (u + v) · (u − v) = ||u||² − ||v||² for all vectors u and v in R^n.
62. (a) Prove that ||u + v||² + ||u − v||² = 2||u||² + 2||v||² for all vectors u and v in R^n.
(b) Draw a diagram showing u, v, u + v, and u − v in R^2 and use (a) to deduce a result about parallelograms.
63. Prove that u · v = ¼||u + v||² − ¼||u − v||² for all vectors u and v in R^n.
64. (a) Prove that ||u + v|| = ||u − v|| if and only if u and v are orthogonal.
(b) Draw a diagram showing u, v, u + v, and u − v in R^2 and use (a) to deduce a result about parallelograms.
65. (a) Prove that u + v and u − v are orthogonal in R^n if and only if ||u|| = ||v||.
(b) Draw a diagram showing u, v, u + v, and u − v in R^2 and use (a) to deduce a result about parallelograms.
66. If ||u|| = 2, ||v|| = √3, and u · v = 1, find ||u + v||.
67. Show that there are no vectors u and v such that ||u|| = 1, ||v|| = 2, and u · v = 3.
68. (a) Prove that if u is orthogonal to both v and w, then u is orthogonal to v + w.
(b) Prove that if u is orthogonal to both v and w, then u is orthogonal to sv + tw for all scalars s and t.
69. Prove that u is orthogonal to v − proj_u(v) for all vectors u and v in R^n, where u ≠ 0.
70. (a) Prove that proj_u(proj_u(v)) = proj_u(v).
(b) Prove that proj_u(v − proj_u(v)) = 0.
(c) Explain (a) and (b) geometrically.
71. The Cauchy-Schwarz Inequality |u · v| ≤ ||u|| ||v|| is equivalent to the inequality we get by squaring both sides: (u · v)² ≤ ||u||² ||v||².
(a) In R^2, with u = [u₁, u₂] and v = [v₁, v₂], this becomes

(u₁v₁ + u₂v₂)² ≤ (u₁² + u₂²)(v₁² + v₂²)

Prove this algebraically. [Hint: Subtract the left-hand side from the right-hand side and show that the difference must necessarily be nonnegative.]
(b) Prove the analogue of (a) in R^3.
72. Figure 1.40 shows that, in R^2, ||proj_u(v)|| ≤ ||v||.
(a) Prove that this inequality is true in general.
[Hint: Prove that proj_u(v) is orthogonal to v − proj_u(v) and use Pythagoras' Theorem.]
(b) Prove that the inequality ||proj_u(v)|| ≤ ||v|| is equivalent to the Cauchy-Schwarz Inequality.

Figure 1.40

73. Use the fact that proj_u(v) = cu for some scalar c, together with Figure 1.41, to find c and thereby derive the formula for proj_u(v).

Figure 1.41

74. Using mathematical induction, prove the following generalization of the Triangle Inequality:

||v₁ + v₂ + ⋯ + vₙ|| ≤ ||v₁|| + ||v₂|| + ⋯ + ||vₙ||

for all n ≥ 1.

Exploration: Vectors and Geometry

Many results in plane Euclidean geometry can be proved using vector techniques. For example, in Section 1.2, we used vectors to prove Pythagoras' Theorem. In this exploration, we will use vectors to develop proofs for some other theorems from Euclidean geometry. As an introduction to the notation and the basic approach, consider the following easy example.

Example 1.25  Give a vector description of the midpoint M of a line segment AB.

Solution  We first convert everything to vector notation. If O denotes the origin and P is a point, let p be the vector OP. In this situation, a = OA, b = OB, m = OM, and AB = OB − OA = b − a (Figure 1.42). Now, since M is the midpoint of AB, we have

m − a = AM = ½AB = ½(b − a)

so

m = a + ½(b − a) = ½(a + b)

Figure 1.42  The midpoint of AB
Figure 1.43

1. Give a vector description of the point P that is one-third of the way from A to B on the line segment AB. Generalize.
2. Prove that the line segment joining the midpoints of two sides of a triangle is parallel to the third side and half as long. (In vector notation, prove that PQ = ½AB in Figure 1.43.)
3. Prove that the quadrilateral PQRS (Figure 1.44), whose vertices are the midpoints of the sides of an arbitrary quadrilateral ABCD, is a parallelogram.
4. A median of a triangle is a line segment from a vertex to the midpoint of the opposite side (Figure 1.45).
Prove that the three medians of any triangle are concurrent (i.e., they have a common point of intersection) at a point G that is two-thirds of the distance from each vertex to the midpoint of the opposite side. [Hint: In Figure 1.46, show that the point that is two-thirds of the distance from A to P is given by ⅓(a + b + c). Then show that ⅓(a + b + c) is two-thirds of the distance from B to Q and two-thirds of the distance from C to R.] The point G in Figure 1.46 is called the centroid of the triangle.

Figure 1.45  A median
Figure 1.46  The centroid

5. An altitude of a triangle is a line segment from a vertex that is perpendicular to the opposite side (Figure 1.47). Prove that the three altitudes of a triangle are concurrent. [Hint: Let H be the point of intersection of the altitudes from A and B in Figure 1.48. Prove that CH is orthogonal to AB.] The point H in Figure 1.48 is called the orthocenter of the triangle.

Figure 1.47  An altitude
Figure 1.48  The orthocenter

6. A perpendicular bisector of a line segment is a line through the midpoint of the segment, perpendicular to the segment (Figure 1.49). Prove that the perpendicular bisectors of the three sides of a triangle are concurrent. [Hint: Let K be the point of intersection of the perpendicular bisectors of AC and BC in Figure 1.50. Prove that RK is orthogonal to AB.] The point K in Figure 1.50 is called the circumcenter of the triangle.

Figure 1.49  A perpendicular bisector
Figure 1.50  The circumcenter

7. Let A and B be the endpoints of a diameter of a circle. If C is any point on the circle, prove that ∠ACB is a right angle. [Hint: In Figure 1.51, let O be the center of the circle. Express everything in terms of a and c and show that AC is orthogonal to BC.]

Figure 1.51

8. Prove that the line segments joining the midpoints of opposite sides of a quadrilateral bisect each other (Figure 1.52).

Figure 1.52

Lines and Planes

We are all familiar with the equation of a line in the Cartesian plane.
We now want to consider lines in R^2 from a vector point of view. The insights we obtain from this approach will allow us to generalize to lines in R^3 and then to planes in R^3. Much of the linear algebra we will consider in later chapters has its origins in the simple geometry of lines and planes; the ability to visualize these and to think geometrically about a problem will serve you well.

Lines in R^2 and R^3

In the xy-plane, the general form of the equation of a line is ax + by = c. If b ≠ 0, then the equation can be rewritten as y = −(a/b)x + c/b, which has the form y = mx + k. [This is the slope-intercept form; m is the slope of the line, and the point with coordinates (0, k) is its y-intercept.] To get vectors into the picture, let's consider an example.

Example 1.26  The line ℓ with equation 2x + y = 0 is shown in Figure 1.53. It is a line with slope −2 passing through the origin. The left-hand side of the equation is in the form of a dot product; in fact, if we let n = [2, 1] and x = [x, y], then the equation becomes n · x = 0. The vector n is perpendicular to the line; that is, it is orthogonal to any vector x that is parallel to the line (Figure 1.54), and it is called a normal vector to the line. The equation n · x = 0 is the normal form of the equation of ℓ.

Another way to think about this line is to imagine a particle moving along the line. Suppose the particle is initially at the origin at time t = 0 and it moves along the line in such a way that its x-coordinate changes 1 unit per second. Then at t = 1 the particle is at (1, −2), at t = 1.5 it is at (1.5, −3), and, if we allow negative values of t (i.e., we consider where the particle was in the past), at t = −2 it is (or was) at (−2, 4).

The Latin word norma refers to a carpenter's square, used for drawing right angles. Thus, a normal vector is one that is perpendicular to something else, usually a plane.
Figure 1.53  The line 2x + y = 0
Figure 1.54  A normal vector n

This movement is illustrated in Figure 1.55. In general, if x = t, then y = −2t, and we may write this relationship in vector form as

[x, y] = [t, −2t] = t[1, −2]

What is the significance of the vector d = [1, −2]? It is a particular vector parallel to ℓ, called a direction vector for the line. As shown in Figure 1.56, we may write the equation of ℓ as x = td. This is the vector form of the equation of the line.

If the line does not pass through the origin, then we must modify things slightly.

Example 1.27  Consider the line ℓ with equation 2x + y = 5 (Figure 1.57). This is just the line from Example 1.26 shifted upward 5 units. It also has slope −2, but its y-intercept is the point (0, 5). It is clear that the vectors d and n from Example 1.26 are, respectively, a direction vector and a normal vector for this line too. Thus, n is orthogonal to every vector that is parallel to ℓ. The point P = (1, 3) is on ℓ. If X = (x, y) represents a general point on ℓ, then the vector PX = x − p is parallel to ℓ and n · (x − p) = 0 (see Figure 1.58). Simplified, we have n · x = n · p. As a check, we compute

n · x = [2, 1] · [x, y] = 2x + y   and   n · p = [2, 1] · [1, 3] = 5

Thus, the normal form n · x = n · p is just a different representation of the general form of the equation of the line. (Note that in Example 1.26, p was the zero vector, so n · p = 0 gave the right-hand side of the equation.)

Figure 1.57  The line 2x + y = 5
Figure 1.58  n · (x − p) = 0

These results lead to the following definition.

Definition  The normal form of the equation of a line ℓ in R^2 is

n · (x − p) = 0   or   n · x = n · p

where p is a specific point on ℓ and n ≠ 0 is a normal vector for ℓ.
The general form of the equation of ℓ is ax + by = c, where n = [a, b] is a normal vector for ℓ.

The word parameter and the corresponding adjective parametric come from the Greek words para, meaning "alongside," and metron, meaning "measure."
Mathematically speaking, a parameter is a variable in terms of which other variables are expressed: a new "measure" placed alongside old ones.

Continuing with Example 1.27, let us now find the vector form of the equation of ℓ. Note that, for each choice of x, the vector x − p must be parallel to, and thus a multiple of, the direction vector d. That is, x − p = td or x = p + td for some scalar t. In terms of components, we have

[x, y] = [1, 3] + t[1, −2]    (1)

or

x = 1 + t,  y = 3 − 2t    (2)

Equation (1) is the vector form of the equation of ℓ, and the componentwise Equations (2) are called parametric equations of the line. The variable t is called a parameter.

How does all of this generalize to R^3? Observe that the vector and parametric forms of the equations of a line carry over perfectly. The notion of the slope of a line in R^2, which is difficult to generalize to three dimensions, is replaced by the more convenient notion of a direction vector, leading to the following definition.

Definition  The vector form of the equation of a line ℓ in R^2 or R^3 is

x = p + td

where p is a specific point on ℓ and d ≠ 0 is a direction vector for ℓ.
The equations corresponding to the components of the vector form of the equation are called parametric equations of ℓ.

We will often abbreviate this terminology slightly, referring simply to the general, normal, vector, and parametric equations of a line or plane.

Example 1.28  Find vector and parametric equations of the line in R^3 through the point P = (1, 2, −1), parallel to the vector d = [5, −1, 3].

Solution  The vector equation x = p + td is

[x, y, z] = [1, 2, −1] + t[5, −1, 3]

The parametric form is

x = 1 + 5t,  y = 2 − t,  z = −1 + 3t

Remarks

• The vector and parametric forms of the equation of a given line ℓ are not unique; in fact, there are infinitely many, since we may use any point on ℓ to determine p and any direction vector for ℓ. However, all direction vectors are clearly multiples of each other. In Example 1.28, (6, 1, 2) is another point on the line (take t = 1), and [10, −2, 6] is another direction vector.
Therefore,

[x, y, z] = [6, 1, 2] + s[10, −2, 6]

gives a different (but equivalent) vector equation for the line. The relationship between the two parameters s and t can be found by comparing the parametric equations: For a given point (x, y, z) on ℓ, we have

x = 1 + 5t = 6 + 10s
y = 2 − t = 1 − 2s
z = −1 + 3t = 2 + 6s

implying that

−10s + 5t = 5
2s − t = −1
−6s + 3t = 3

Each of these equations reduces to t = 1 + 2s.

• Intuitively, we know that a line is a one-dimensional object. The idea of "dimension" will be clarified in Chapters 3 and 6, but for the moment observe that this idea appears to agree with the fact that the vector form of the equation of a line requires one parameter.

Example 1.29  One often hears the expression "two points determine a line." Find a vector equation of the line ℓ in R^3 determined by the points P = (−1, 5, 0) and Q = (2, 1, 1).

Solution  We may choose any point on ℓ for p, so we will use P (Q would also be fine). A convenient direction vector is d = PQ = q − p = [3, −4, 1] (or any scalar multiple of this). Thus, we obtain

x = p + td = [−1, 5, 0] + t[3, −4, 1]

Planes in R^3

The next question we should ask ourselves is, How does the general form of the equation of a line generalize to R^3? We might reasonably guess that if ax + by = c is the general form of the equation of a line in R^2, then ax + by + cz = d might represent a line in R^3. In normal form, this equation would be n · x = n · p, where n is a normal vector to the line and p corresponds to a point on the line. To see if this is a reasonable hypothesis, let's think about the special case of the equation ax + by + cz = 0. In normal form, it becomes n · x = 0, where n = [a, b, c]. However, the set of all vectors x that satisfy this equation is the set of all vectors orthogonal to n.

Figure 1.59  n is orthogonal to infinitely many vectors
As shown in Figure 1.59, vectors in infinitely many directions have this property, determining a family of parallel planes. So our guess was incorrect: It appears that ax + by + cz = d is the equation of a plane, not a line, in R^3.

Let's make this finding more precise. Every plane 𝒫 in R^3 can be determined by specifying a point p on 𝒫 and a nonzero vector n normal to 𝒫 (Figure 1.60). Thus, if x represents an arbitrary point on 𝒫, we have n · (x − p) = 0 or n · x = n · p. If n = [a, b, c] and x = [x, y, z], then, in terms of components, the equation becomes ax + by + cz = d (where d = n · p).

Definition  The normal form of the equation of a plane 𝒫 in R^3 is

n · (x − p) = 0   or   n · x = n · p

where p is a specific point on 𝒫 and n ≠ 0 is a normal vector for 𝒫.
The general form of the equation of 𝒫 is ax + by + cz = d, where n = [a, b, c] is a normal vector for 𝒫.

Note that any scalar multiple of a normal vector for a plane is another normal vector.

Example 1.30  Find the normal and general forms of the equation of the plane that contains the point P = (6, 0, 1) and has normal vector n = [1, 2, 3].

Solution  With p = [6, 0, 1] and x = [x, y, z], we have n · p = 1·6 + 2·0 + 3·1 = 9, so the normal equation n · x = n · p becomes the general equation x + 2y + 3z = 9.

Geometrically, it is clear that parallel planes have the same normal vector(s). Thus, their general equations have left-hand sides that are multiples of each other. So, for example, 2x + 4y + 6z = 10 is the general equation of a plane that is parallel to the plane in Example 1.30, since we may rewrite the equation as x + 2y + 3z = 5, from which we see that the two planes have the same normal vector n. (Note that the planes do not coincide, since the right-hand sides of their equations are distinct.)

We may also express the equation of a plane in vector or parametric form.
To do so, we observe that a plane can also be determined by specifying one of its points P (by the vector p) and two direction vectors u and v parallel to the plane (but not parallel to each other). As Figure 1.61 shows, given any point X in the plane (located by x), we can always find appropriate multiples su and tv of the direction vectors such that x − p = su + tv or x = p + su + tv. If we write this equation componentwise, we obtain parametric equations for the plane.

Figure 1.61  x − p = su + tv

Definition  The vector form of the equation of a plane 𝒫 in R^3 is

x = p + su + tv

where p is a point on 𝒫 and u and v are direction vectors for 𝒫 (u and v are nonzero and parallel to 𝒫, but not parallel to each other).
The equations corresponding to the components of the vector form of the equation are called parametric equations of 𝒫.

Figure 1.62  Two normals determine a line
Figure 1.63  The intersection of two planes is a line

Example 1.31  Find vector and parametric equations for the plane in Example 1.30.

Solution  We need to find two direction vectors. We have one point P = (6, 0, 1) in the plane; if we can find two other points Q and R in 𝒫, then the vectors PQ and PR can serve as direction vectors (unless by bad luck they happen to be parallel!). By trial and error, we observe that Q = (9, 0, 0) and R = (3, 3, 0) both satisfy the general equation x + 2y + 3z = 9 and so lie in the plane. Then we compute

u = PQ = q − p = [3, 0, −1]   and   v = PR = r − p = [−3, 3, −1]

which, since they are not scalar multiples of each other, will serve as direction vectors. Therefore, we have the vector equation of 𝒫,

[x, y, z] = [6, 0, 1] + s[3, 0, −1] + t[−3, 3, −1]

and the corresponding parametric equations,

x = 6 + 3s − 3t
y = 3t
z = 1 − s − t

[What would have happened had we chosen R = (0, 0, 3)?]

Remarks

• A plane is a two-dimensional object, and its equation, in vector or parametric form, requires two parameters.
• As Figure 1.59 shows, given a point P and a nonzero vector n in R^3, there are infinitely many lines through P with n as a normal vector. However, P and two nonparallel normal vectors n₁ and n₂ do serve to locate a line ℓ uniquely, since ℓ must then be the line through P that is perpendicular to the plane with equation x = p + sn₁ + tn₂ (Figure 1.62). Thus, a line in R^3 can also be specified by a pair of equations

a₁x + b₁y + c₁z = d₁
a₂x + b₂y + c₂z = d₂

one corresponding to each normal vector. But since these equations correspond to a pair of nonparallel planes (why nonparallel?), this is just the description of a line as the intersection of two nonparallel planes (Figure 1.63). Algebraically, the line consists of all points (x, y, z) that simultaneously satisfy both equations. We will explore this concept further in Chapter 2 when we discuss the solution of systems of linear equations.

Tables 1.2 and 1.3 summarize the information presented so far about the equations of lines and planes.

Observe once again that a single (general) equation describes a line in R^2 but a plane in R^3. [In higher dimensions, an object (line, plane, etc.) determined by a single equation of this type is usually called a hyperplane.]
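The normal, general, vector, and parametric descriptions of a plane are consistent with one another, and this is easy to verify numerically. The sketch below (plain Python; the helper names are our own) rebuilds the general equation of Example 1.30 from n · p and checks that every point produced by the parametrization of Example 1.31 satisfies it:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

n = [1, 2, 3]   # normal vector from Example 1.30
p = [6, 0, 1]   # point on the plane
d = dot(n, p)
print(d)        # 9, so the general equation is x + 2y + 3z = 9

# Direction vectors from Example 1.31; both are orthogonal to n
u = [3, 0, -1]
v = [-3, 3, -1]
print(dot(n, u), dot(n, v))  # 0 0

def point(s, t):
    # vector form x = p + s u + t v, evaluated componentwise
    return [p[i] + s * u[i] + t * v[i] for i in range(3)]

# n . x = 9 for every choice of the parameters s and t
for s, t in [(0, 0), (1, 0), (0, 1), (2.5, -1.75)]:
    print(dot(n, point(s, t)))  # 9 each time
```

Note that (1, 0) and (0, 1) reproduce the points Q and R used in Example 1.31.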
Table 1.2  Equations of Lines in ℝ²

  Normal Form:      n · x = n · p
  General Form:     ax + by = c
  Vector Form:      x = p + td
  Parametric Form:  x = p₁ + td₁,  y = p₂ + td₂

Table 1.3  Lines and Planes in ℝ³

  Lines:
    Normal Form:      n₁ · x = n₁ · p₁,  n₂ · x = n₂ · p₂
    General Form:     a₁x + b₁y + c₁z = d₁,  a₂x + b₂y + c₂z = d₂
    Vector Form:      x = p + td
    Parametric Form:  x = p₁ + td₁,  y = p₂ + td₂,  z = p₃ + td₃
  Planes:
    Normal Form:      n · x = n · p
    General Form:     ax + by + cz = d
    Vector Form:      x = p + su + tv
    Parametric Form:  x = p₁ + su₁ + tv₁,  y = p₂ + su₂ + tv₂,  z = p₃ + su₃ + tv₃

The relationship among the dimension of the object, the number of general equations required, and the dimension of the space is given by the "balancing formula":

(dimension of the object) + (number of general equations) = dimension of the space

The higher the dimension of the object, the fewer equations it needs. For example, a plane in ℝ³ is two-dimensional, requires one general equation, and lives in a three-dimensional space: 2 + 1 = 3. A line in ℝ³ is one-dimensional and so needs 3 − 1 = 2 equations. Note that the dimension of the object also agrees with the number of parameters in its vector or parametric form. Notions of "dimension" will be clarified in Chapters 3 and 6, but for the time being, these intuitive observations will serve us well.

We can now find the distance from a point to a line or a plane by combining the results of Section 1.2 with the results from this section.

Example 1.32  Find the distance from the point B = (1, 0, 2) to the line ℓ through the point A = (3, 1, 1) with direction vector d = [−1, 1, 0].

Solution  As we have already determined, we need to calculate the length of PB, where P is the point on ℓ at the foot of the perpendicular from B. If we let v = AB, then AP = proj_d(v) and PB = v − proj_d(v) (see Figure 1.64). We do the necessary calculations in several steps.
Step 1:  v = AB = b − a = [1 − 3, 0 − 1, 2 − 1] = [−2, −1, 1]

Step 2: The projection of v onto d is

proj_d(v) = ((d · v)/(d · d)) d
          = (((−1)(−2) + (1)(−1) + (0)(1)) / ((−1)² + 1² + 0²)) [−1, 1, 0]
          = (1/2)[−1, 1, 0] = [−1/2, 1/2, 0]

Step 3: The vector we want is

v − proj_d(v) = [−2, −1, 1] − [−1/2, 1/2, 0] = [−3/2, −3/2, 1]

Step 4: The distance d(B, ℓ) from B to ℓ is ‖v − proj_d(v)‖. Using Theorem 1.3(b) to simplify the calculation, we have

‖v − proj_d(v)‖ = ‖[−3/2, −3/2, 1]‖ = (1/2)‖[−3, −3, 2]‖ = (1/2)√(9 + 9 + 4) = (1/2)√22

Figure 1.64: d(B, ℓ) = ‖v − proj_d(v)‖

Note  In terms of our earlier notation, d(B, ℓ) = d(v, proj_d(v)).

In the case where the line ℓ is in ℝ² and its equation has the general form ax + by = c, the distance d(B, ℓ) from B = (x₀, y₀) is given by the formula

d(B, ℓ) = |ax₀ + by₀ − c| / √(a² + b²)    (3)

You are invited to prove this formula in Exercise 39.

Example 1.33  Find the distance from the point B = (1, 0, 2) to the plane 𝒫 whose general equation is x + y − z = 1.

Solution  In this case, we need to calculate the length of PB, where P is the point on 𝒫 at the foot of the perpendicular from B. As Figure 1.65 shows, if A is any point on 𝒫 and we situate the normal vector n of 𝒫 so that its tail is at A, then we need to find the length of the projection of AB onto n. Again we do the necessary calculations in steps.

Figure 1.65: d(B, 𝒫) = ‖proj_n(AB)‖

Step 1: By trial and error, we find any point whose coordinates satisfy the equation x + y − z = 1. A = (1, 0, 0) will do.

Step 2: Set v = AB = b − a = [1 − 1, 0 − 0, 2 − 0] = [0, 0, 2]

Step 3: The projection of v onto n is

proj_n(v) = ((n · v)/(n · n)) n
          = (((1)(0) + (1)(0) + (−1)(2)) / (1² + 1² + (−1)²)) [1, 1, −1]
          = (−2/3)[1, 1, −1] = [−2/3, −2/3, 2/3]

Step 4: The distance d(B, 𝒫) from B to 𝒫 is

‖proj_n(v)‖ = |−2/3| ‖[1, 1, −1]‖ = (2/3)√3

In general, the distance d(B, 𝒫) from the point B = (x₀, y₀, z₀) to the plane whose general equation is ax + by + cz = d is given by the formula

d(B, 𝒫) = |ax₀ + by₀ + cz₀ − d| / √(a² + b² + c²)    (4)

You will be asked to derive this formula in Exercise 40.

Exercises 1.3

In Exercises 1 and 2, write the equation of the line passing through P with normal vector n in (a) normal form and (b) general form.
1. P = (0, 0), n = [3, 2]
2. P = (1, 2), n = [3, 2]

In Exercises 3-6, write the equation of the line passing through P with direction vector d in (a) vector form and (b) parametric form.
3. P = (1, 0), d = […]
4. P = (−4, 4), d = […]
5. P = (0, 0, 0), d = […]
6. P = (3, 0, −2), d = […]

In Exercises 7 and 8, write the equation of the plane passing through P with normal vector n in (a) normal form and (b) general form.
7. P = (0, 1, 0), n = […]
8. P = (3, 0, −2), n = […]

In Exercises 9 and 10, write the equation of the plane passing through P with direction vectors u and v in (a) vector form and (b) parametric form.
9. P = (0, 0, 0), u = […], v = […]
10. P = (6, −4, −3), u = […], v = […]

In Exercises 11 and 12, give the vector equation of the line passing through P and Q.
11. P = (1, −2), Q = (3, 0)
12. P = (0, 1, −1), Q = (−2, 1, 3)

In Exercises 13 and 14, give the vector equation of the plane passing through P, Q, and R.
13. P = (1, 1, 1), Q = (4, 0, 2), R = (0, 1, −1)
14. P = (1, 1, 0), Q = (1, 0, 1), R = (0, 1, 1)

15. Find parametric equations and an equation in vector form for the lines in ℝ² with the following equations:
(a) y = 3x − 1    (b) 3x + 2y = 5

16. Consider the vector equation x = p + t(q − p), where p and q correspond to distinct points P and Q in ℝ² or ℝ³.
(a) Show that this equation describes the line segment PQ as t varies from 0 to 1.
(b) For which value of t is x the midpoint of PQ, and what is x in this case?
(c) Find the midpoint of PQ when P = (2, −3) and Q = (0, 1).
(d) Find the midpoint of PQ when P = (1, 0, 1) and Q = (4, 1, −2).
(e) Find the two points that divide PQ in part (c) into three equal parts.
(f) Find the two points that divide PQ in part (d) into three equal parts.

17. Suggest a "vector proof" of the fact that, in ℝ², two lines with slopes m₁ and m₂ are perpendicular if and only if m₁m₂ = −1.

18. The line ℓ passes through the point P = (1, −1, 1) and has direction vector d = [2, 3, −1]. For each of the following planes 𝒫, determine whether ℓ and 𝒫 are parallel, perpendicular, or neither:
(a) 2x + 3y − z = 1    (b) 4x − y + 5z = 0
(c) x − y − z = 3      (d) 4x + 6y − 2z = 0

19. The plane 𝒫₁ has the equation 4x − y + 5z = 2. For each of the planes 𝒫 in Exercise 18, determine whether 𝒫₁ and 𝒫 are parallel, perpendicular, or neither.

20. Find the vector form of the equation of the line in ℝ² that passes through P = (2, −1) and is perpendicular to the line with general equation 2x − 3y = 1.

21. Find the vector form of the equation of the line in ℝ² that passes through P = (2, −1) and is parallel to the line with general equation 2x − 3y = 1.

22. Find the vector form of the equation of the line in ℝ³ that passes through P = (−1, 0, 3) and is perpendicular to the plane with general equation x − 3y + 2z = 5.

23. Find the vector form of the equation of the line in ℝ³ that passes through P = (−1, 0, 3) and is parallel to the line with parametric equations x = 1 − t, y = 2 + 3t, z = −2 − t.

24. Find the normal form of the equation of the plane that passes through P = (0, −2, 5) and is parallel to the plane with general equation 6x − y + 2z = 3.

25. A cube has vertices at the eight points (x, y, z), where each of x, y, and z is either 0 or 1. (See Figure 1.34.)
(a) Find the general equations of the planes that determine the six faces (sides) of the cube.
(b) Find the general equation of the plane that contains the diagonal from the origin to (1, 1, 1) and is perpendicular to the xy-plane.
(c) Find the general equation of the plane that contains the side diagonals referred to in Example 1.22.

26. Find the equation of the set of all points that are equidistant from the points P = (1, 0, −2) and Q = (5, 2, 4).
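Distance computations like those in Examples 1.32 and 1.33 are easy to check numerically. This sketch (the function names are ours) implements the projection method and the general-equation formula, and verifies both worked examples:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def dist_point_to_line(b, a, d):
    """Distance from point b to the line through a with direction d:
    the length of v - proj_d(v), where v = b - a (Example 1.32's method)."""
    v = [bi - ai for bi, ai in zip(b, a)]
    c = dot(d, v) / dot(d, d)                     # proj_d(v) = c*d
    perp = [vi - c * di for vi, di in zip(v, d)]
    return math.sqrt(dot(perp, perp))

def dist_point_to_plane(b, n, d):
    """Distance from point b to the plane (or line in R^2) n . x = d,
    by formulas (3) and (4)."""
    return abs(dot(n, b) - d) / math.sqrt(dot(n, n))

# Example 1.32: B = (1, 0, 2), A = (3, 1, 1), d = [-1, 1, 0]  ->  (1/2)*sqrt(22)
assert math.isclose(dist_point_to_line([1, 0, 2], [3, 1, 1], [-1, 1, 0]),
                    math.sqrt(22) / 2)

# Example 1.33: B = (1, 0, 2), plane x + y - z = 1  ->  (2/3)*sqrt(3)
assert math.isclose(dist_point_to_plane([1, 0, 2], [1, 1, -1], 1),
                    2 * math.sqrt(3) / 3)
```

The same two functions can be used to check answers to the distance exercises that follow.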
In Exercises 27 and 28, find the distance from the point Q to the line ℓ.
27. Q = (2, 2), ℓ with equation [x, y] = [1, 2] + t[1, −1]
28. Q = (0, 1, 0), ℓ with equation [x, y, z] = [1, 1, 1] + t[−2, 0, 3]

In Exercises 29 and 30, find the distance from the point Q to the plane 𝒫.
29. Q = (2, 2, 2), 𝒫 with equation x + y − z = 0
30. Q = (0, 0, 0), 𝒫 with equation x − 2y + 2z = 1

Figure 1.66 suggests a way to use vectors to locate the point R on ℓ that is closest to Q.
31. Find the point R on ℓ that is closest to Q in Exercise 27.
32. Find the point R on ℓ that is closest to Q in Exercise 28.

Figure 1.67 suggests a way to use vectors to locate the point R on 𝒫 that is closest to Q.

Figure 1.67: r = p + PQ + QR

33. Find the point R on 𝒫 that is closest to Q in Exercise 29.
34. Find the point R on 𝒫 that is closest to Q in Exercise 30.

In Exercises 35 and 36, find the distance between the parallel lines.
35. [x, y] = [1, 1] + s[−2, 3] and [x, y] = [5, 4] + t[−2, 3]
36. [x, y, z] = [1, 0, −1] + s[1, 1, 1] and [x, y, z] = [0, 1, 1] + t[1, 1, 1]

In Exercises 37 and 38, find the distance between the parallel planes.
37. 2x + y − 2z = 0 and 2x + y − 2z = 5
38. x + y + z = 1 and x + y + z = 3

39. Prove Equation (3) on page 43.
40. Prove Equation (4) on page 44.
41. Prove that, in ℝ², the distance between parallel lines with equations n · x = c₁ and n · x = c₂ is given by |c₁ − c₂| / ‖n‖.
42. Prove that the distance between parallel planes with equations n · x = d₁ and n · x = d₂ is given by |d₁ − d₂| / ‖n‖.

If two nonparallel planes 𝒫₁ and 𝒫₂ have normal vectors n₁ and n₂ and θ is the angle between n₁ and n₂, then we define the angle between 𝒫₁ and 𝒫₂ to be either θ or 180° − θ, whichever is an acute angle (Figure 1.68).

In Exercises 43-44, find the acute angle between the planes with the given equations.
43. x + y + z = 0 and 2x + y − 2z = 0
44. 3x − y + 2z = 5 and x + 4y − z = 2

In Exercises 45-46, show that the plane and line with the given equations intersect, and then find the acute angle of intersection between them.
45. The plane given by x + y + 2z = 0 and the line given by x = 2 + t, y = 1 − 2t, z = 3 + t
46. The plane given by 4x − y − z = 6 and the line given by x = t, y = 1 + 2t, z = 2 + 3t

Exercises 47-48 explore one approach to the problem of finding the projection of a vector onto a plane. As Figure 1.69 shows, if 𝒫 is a plane through the origin in ℝ³ with normal vector n, and v is a vector in ℝ³, then p = proj_𝒫(v) is a vector in 𝒫 such that v − cn = p for some scalar c.

Figure 1.69: Projection onto a plane

47. Using the fact that n is orthogonal to every vector in 𝒫 (and hence to p), solve for c and thereby find an expression for p in terms of v and n.
48. Use the method of Exercise 47 to find the projection of v onto the planes with the following equations:
(a) x + y + z = 0    (b) 3x − y + z = 0
(c) x − 2z = 0       (d) 2x − 3y + z = 0

Exploration: The Cross Product

It would be convenient if we could easily convert the vector form x = p + su + tv of the equation of a plane to the normal form n · x = n · p. What we need is a process that, given two nonparallel vectors u and v, produces a third vector n that is orthogonal to both u and v. One approach is to use a construction known as the cross product of vectors. It is valid only in ℝ³ and is defined as follows:

Definition  The cross product of u = [u₁, u₂, u₃] and v = [v₁, v₂, v₃] is the vector u × v defined by

u × v = [u₂v₃ − u₃v₂, u₃v₁ − u₁v₃, u₁v₂ − u₂v₁]

A shortcut that can help you remember how to calculate the cross product of two vectors is illustrated below. Under each complete vector, write the first two components of that vector.
Ignoring the two components on the top line, consider each block of four: Subtract the products of the components connected by dashed lines from the products of the components connected by solid lines. (It helps to notice that the first component of u × v has no 1s as subscripts, the second has no 2s, and the third has no 3s.)

u₁  v₁
u₂  v₂        u₂v₃ − u₃v₂
u₃  v₃        u₃v₁ − u₁v₃
u₁  v₁        u₁v₂ − u₂v₁
u₂  v₂

The following problems briefly explore the cross product.

1. Compute u × v.
(a) u = [0, 1, 1], v = [3, −1, 2]      (b) u = [3, −1, 2], v = [0, 1, 1]
(c) u = [1, 2, 3], v = [−2, −4, −6]    (d) u = [1, …], v = [2, …]

2. Show that e₁ × e₂ = e₃, e₂ × e₃ = e₁, and e₃ × e₁ = e₂.

3. Using the definition of the cross product, prove that u × v (as shown in Figure 1.70) is orthogonal to u and v.

Figure 1.70: u × v

4. Use the cross product to help find the normal form of the equation of the plane.
(a) The plane passing through P = (1, 0, −2), parallel to u = [0, 1, 1] and v = [3, −1, 2]
(b) The plane passing through P = (0, −1, 1), Q = (2, 0, 2), and R = (1, 2, −1)

5. Prove the following properties of the cross product:
(a) v × u = −(u × v)    (b) u × 0 = 0
(c) u × u = 0           (d) u × kv = k(u × v)
(e) u × ku = 0          (f) u × (v + w) = u × v + u × w

6. Prove the following properties of the cross product:
(a) u · (v × w) = (u × v) · w
(b) u × (v × w) = (u · w)v − (u · v)w
(c) ‖u × v‖² = ‖u‖²‖v‖² − (u · v)²

7. Redo Problems 2 and 3, this time making use of Problems 5 and 6.

8. Let u and v be vectors in ℝ³ and let θ be the angle between u and v.
(a) Prove that ‖u × v‖ = ‖u‖ ‖v‖ sin θ. [Hint: Use Problem 6(c).]
(b) Prove that the area A of the triangle determined by u and v (as shown in Figure 1.71) is given by A = (1/2)‖u × v‖.
(c) Use the result in part (b) to compute the area of the triangle with vertices A = (1, 2, 1), B = (2, 1, 0), and C = (5, −1, 3).

Figure 1.71: The triangle determined by u and v
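The componentwise definition translates directly into code. This sketch (the function names are ours) computes u × v and checks the orthogonality property of Problem 3 numerically:

```python
def cross(u, v):
    """Cross product of two 3-vectors, straight from the definition."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u, v = [0, 1, 1], [3, -1, 2]
n = cross(u, v)
assert n == [3, 3, -3]
assert dot(n, u) == 0 and dot(n, v) == 0         # n is orthogonal to u and v
assert cross([1, 0, 0], [0, 1, 0]) == [0, 0, 1]  # e1 x e2 = e3 (Problem 2)
```

Such a routine is handy for checking answers to Problems 1 and 4 by hand.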
Writing Project  The Origins of the Dot Product and Cross Product

The notations for dot and cross product that we use today were introduced in the late 19th century by Josiah Willard Gibbs, a professor of mathematical physics at Yale University. Edwin B. Wilson was a graduate student in Gibbs's class, and he later wrote up his class notes, expanded upon them, and had them published in 1901, with Gibbs's blessing, as Vector Analysis: A Text-Book for the Use of Students of Mathematics and Physics. However, the concepts of dot and cross product arose earlier and went by various other names and notations. Write a report on the evolution of the names and notations for the dot product and cross product.

1. Florian Cajori, A History of Mathematical Notations (New York: Dover, 1993).
2. J. Willard Gibbs and Edwin Bidwell Wilson, Vector Analysis: A Text-Book for the Use of Students of Mathematics and Physics (New York: Charles Scribner's Sons, 1901). Available online at http://archive.org/details/117714283.
3. Ivor Grattan-Guinness, Companion Encyclopedia of the History and Philosophy of the Mathematical Sciences (London: Routledge, 2013).

1.4 Applications

The force due to gravity acting on a mass is the product of the mass and the acceleration due to gravity (which, on Earth, is 9.8 m/s²). Thus, a 1 kg mass exerts a downward force of 1 kg × 9.8 m/s², or 9.8 kg·m/s². This unit of measurement is a newton (N), so the force exerted by a 1 kg mass is 9.8 N.

Force Vectors

We can use vectors to model force. For example, a wind blowing at 30 km/h in a westerly direction or the Earth's gravity acting on a 1 kg mass with a force of 9.8 newtons downward are each best represented by vectors, since they each consist of a magnitude and a direction. It is often the case that multiple forces act on an object. In such situations, the net result of all the forces acting together is a single force called the resultant, which is simply the vector sum of the individual forces (Figure 1.72).
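Since the resultant is just the vector sum of the individual forces, it can be computed componentwise. A small sketch (the helper function is ours; the sample forces match the setup of Example 1.34 below):

```python
import math

def resultant(forces):
    """Componentwise sum of a list of planar force vectors (east, north)."""
    return (sum(f[0] for f in forces), sum(f[1] for f in forces))

# 40 N due east plus 20 N due north:
r = resultant([(40, 0), (0, 20)])
assert r == (40, 20)
assert math.isclose(math.hypot(*r), math.sqrt(2000))  # magnitude, about 44.72 N
```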
When several forces act on an object, it is possible that the resultant force is zero. In this case, the object is clearly not moving in any direction, and we say that it is in equilibrium. When an object is in equilibrium and the force vectors acting on it are arranged head-to-tail, the result is a closed polygon (Figure 1.73).

Figure 1.72: The resultant of two forces
Figure 1.73: Equilibrium

Example 1.34  Ann and Bert are trying to roll a rock out of the way. Ann pushes with a force of 20 N in a northerly direction while Bert pushes with a force of 40 N in an easterly direction.
(a) What is the resultant force on the rock?
(b) Carla is trying to prevent Ann and Bert from moving the rock. What force must Carla apply to keep the rock in equilibrium?

Solution  (a) Figure 1.74 shows the position of the two forces. Using the parallelogram rule, we add the two forces to get the resultant r, as shown. By Pythagoras' Theorem, we see that ‖r‖ = √(20² + 40²) = √2000 ≈ 44.72 N. For the direction of r, we calculate the angle θ between r and Bert's easterly force. We find that sin θ = 20/‖r‖ ≈ 0.447, so θ ≈ 26.57°.

Figure 1.74: The resultant of two forces

(b) If we denote the forces exerted by Ann, Bert, and Carla by a, b, and c, respectively, then we require a + b + c = 0. Therefore c = −(a + b) = −r, so Carla needs to exert a force of 44.72 N in the direction opposite to r.

Often, we are interested in decomposing a force vector into other vectors whose resultant is the given vector. This process is called resolving a vector into components. In two dimensions, we wish to resolve a vector into two components. However, there are infinitely many ways to do this; the most useful will be to resolve the vector into two orthogonal components. (Chapters 5 and 7 explore this idea more generally.) This is usually done by introducing coordinate axes and choosing the components so that one is parallel to the x-axis and the other to the y-axis.
These components are usually referred to as the horizontal and vertical components, respectively. In Figure 1.75, f is the given vector and f_x and f_y are its horizontal and vertical components.

Figure 1.75: Resolving a vector into components

Example 1.35  Ann pulls on the handle of a wagon with a force of 100 N. If the handle makes an angle of 20° with the horizontal, what is the force that tends to pull the wagon forward and what force tends to lift it off the ground?

Solution  Figure 1.76 shows the situation and the vector diagram that we need to consider. We see that

‖f_x‖ = ‖f‖ cos 20°  and  ‖f_y‖ = ‖f‖ sin 20°

Thus, ‖f_x‖ ≈ 100(0.9397) ≈ 93.97 and ‖f_y‖ ≈ 100(0.3420) ≈ 34.20. So the wagon is pulled forward with a force of approximately 93.97 N, and it tends to lift off the ground with a force of approximately 34.20 N.

We solve the next example using two different methods. The first solution considers a triangle of forces in equilibrium; the second solution uses resolution of forces into components.

Example 1.36  Figure 1.77 shows a painting that has been hung from the ceiling by two wires. If the painting has a mass of 5 kg and if the two wires make angles of 45 and 60 degrees with the ceiling, determine the tension in each wire.

Solution 1  We assume that the painting is in equilibrium. Then the two wires must supply enough upward force to balance the downward force of gravity. Gravity exerts a downward force of 5 × 9.8 = 49 N on the painting, so the two wires must collectively pull upward with 49 N of force. Let f₁ and f₂ denote the tensions in the wires and let r be their resultant (Figure 1.78). It follows that ‖r‖ = 49, since we are in equilibrium. Using the law of sines, we have

‖f₁‖ / sin 45° = ‖f₂‖ / sin 30° = ‖r‖ / sin 105°

so

‖f₁‖ = ‖r‖ sin 45° / sin 105° = 49(0.7071)/0.9659 ≈ 35.87
‖f₂‖ = ‖r‖ sin 30° / sin 105° = 49(0.5)/0.9659 ≈ 25.36

Therefore, the tensions in the wires are approximately 35.87 N and 25.36 N.

Solution 2  We resolve f₁ and f₂ into horizontal and vertical components, say, f₁ = h₁ + v₁ and f₂ = h₂ + v₂, and note that, as above, there is a downward force of 49 N (Figure 1.79). It follows that

‖h₁‖ = ‖f₁‖ cos 60° = ‖f₁‖/2,    ‖v₁‖ = ‖f₁‖ sin 60° = √3‖f₁‖/2
‖h₂‖ = ‖f₂‖ cos 45° = ‖f₂‖/√2,   ‖v₂‖ = ‖f₂‖ sin 45° = ‖f₂‖/√2

Since the painting is in equilibrium, the horizontal components must balance, as must the vertical components. Therefore, ‖h₁‖ = ‖h₂‖ and ‖v₁‖ + ‖v₂‖ = 49, from which it follows that

‖f₁‖ = 2‖f₂‖/√2 = √2‖f₂‖  and  √3‖f₁‖/2 + ‖f₂‖/√2 = 49

Substituting the first of these equations into the second yields

√6‖f₂‖/2 + ‖f₂‖/√2 = 49,  or  ‖f₂‖(1 + √3)/√2 = 49

so ‖f₂‖ = 49√2/(1 + √3) ≈ 25.36. Thus, ‖f₁‖ = √2‖f₂‖ ≈ 1.4142(25.36) ≈ 35.87, so the tensions in the wires are approximately 35.87 N and 25.36 N, as before.

Exercises 1.4

Force Vectors
In Exercises 1-6, determine the resultant of the given forces.
1. f₁ acting due north with a magnitude of 12 N and f₂ acting due east with a magnitude of 5 N
2. f₁ acting due west with a magnitude of 15 N and f₂ acting due south with a magnitude of 20 N
3. f₁ acting with a magnitude of 8 N and f₂ acting at an angle of 60° to f₁ with a magnitude of 8 N
4. f₁ acting with a magnitude of 4 N and f₂ acting at an angle of 135° to f₁ with a magnitude of 6 N
5. f₁ acting due east with a magnitude of 2 N, f₂ acting due west with a magnitude of 6 N, and f₃ acting at an angle of 60° to f₁ with a magnitude of 4 N
6. f₁ acting due east with a magnitude of 10 N, f₂ acting due north with a magnitude of 13 N, f₃ acting due west with a magnitude of 5 N, and f₄ acting due south with a magnitude of 8 N
7. Resolve a force of 10 N into two forces perpendicular to each other so that one component makes an angle of 60° with the 10 N force.
8. A 10 kg block lies on a ramp that is inclined at an angle of 30° (Figure 1.80). Assuming there is no friction, what force, parallel to the ramp, must be applied to keep the block from sliding down the ramp?
9. A tow truck is towing a car. The tension in the tow cable is 1500 N, and the cable makes a 45° angle with the horizontal, as shown in Figure 1.81. What is the vertical force that tends to lift the car off the ground?
10. A lawn mower has a mass of 30 kg. It is being pushed with a force of 100 N. If the handle of the lawn mower makes an angle of 45° with the ground, what is the horizontal component of the force that is causing the mower to move forward?
11. A sign hanging outside Joe's Diner has a mass of 50 kg (Figure 1.82). If the supporting cable makes an angle of 60° with the wall of the building, determine the tension in the cable.
12. A sign hanging in the window of Joe's Diner has a mass of 1 kg. If the supporting strings each make an angle of 45° with the sign and the supporting hooks are at the same height (Figure 1.83), find the tension in each string.
13. A painting with a mass of 15 kg is suspended by two wires from hooks on a ceiling. If the wires have lengths of 15 cm and 20 cm and the distance between the hooks is 25 cm, find the tension in each wire.
14. A painting with a mass of 20 kg is suspended by two wires from a ceiling. If the wires make angles of 30° and 45° with the ceiling, find the tension in each wire.
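The equilibrium conditions used in Solution 2 of Example 1.36 (horizontal components balance; vertical components sum to the weight) apply equally to Exercise 14. A numerical sketch (the function is ours; angles are measured from a horizontal ceiling):

```python
import math

def wire_tensions(alpha_deg, beta_deg, weight):
    """Tensions in two wires that make angles alpha and beta with a
    horizontal ceiling and support a hanging weight in equilibrium:
        T1*cos(alpha) = T2*cos(beta)            (horizontal balance)
        T1*sin(alpha) + T2*sin(beta) = weight   (vertical balance)
    Solving this 2x2 system gives the law-of-sines form of Solution 1."""
    a, b = math.radians(alpha_deg), math.radians(beta_deg)
    t1 = weight * math.cos(b) / math.sin(a + b)
    t2 = weight * math.cos(a) / math.sin(a + b)
    return t1, t2

# Example 1.36: a 5 kg painting (49 N) on wires at 60 and 45 degrees.
t1, t2 = wire_tensions(60, 45, 5 * 9.8)
assert abs(t1 - 35.87) < 0.01 and abs(t2 - 25.36) < 0.01
```

The same call with the data of Exercise 14 (angles 30° and 45°, weight 20 × 9.8 N) checks that exercise's answer.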
Chapter Review

Key Definitions and Concepts
algebraic properties of vectors, 10; angle between vectors, 24; binary vector, 13; Cauchy-Schwarz Inequality, 22; cross product, 48; direction vector, 35; distance between vectors, 23; dot product, 18; equation of a line, 36; equation of a plane, 38-39; head-to-tail rule, 6; integers modulo m (ℤₘ), 14-16; length (norm) of a vector, 20; linear combination of vectors, 12; m-ary vector, 16; normal vector, 34, 38; orthogonal vectors, 26; parallel vectors, 8; parallelogram rule, 6; projection of a vector onto a vector, 27; Pythagoras' Theorem, 26; scalar multiplication, 7; standard unit vectors, 22; Triangle Inequality, 22; unit vector, 21; vector, 3; vector addition, 5; zero vector, 4

Review Questions
1. Mark each of the following statements true or false:
(a) For vectors u, v, and w in ℝⁿ, if u + w = v + w, then u = v.
(b) For vectors u, v, and w in ℝⁿ, if u · w = v · w, then u = v.
(c) For vectors u, v, and w in ℝ³, if u is orthogonal to v, and v is orthogonal to w, then u is orthogonal to w.
(d) In ℝ³, if a line ℓ is parallel to a plane 𝒫, then a direction vector d for ℓ is parallel to a normal vector n for 𝒫.
(e) In ℝ³, if a line ℓ is perpendicular to a plane 𝒫, then a direction vector d for ℓ is parallel to a normal vector n for 𝒫.
(f) In ℝ³, if two planes are not parallel, then they must intersect in a line.
(g) In ℝ³, if two lines are not parallel, then they must intersect in a point.
(h) If v is a binary vector such that v · v = 0, then v = 0.
(i) In ℤ₅, if ab = 0 then either a = 0 or b = 0.
(j) In ℤ₆, if ab = 0 then either a = 0 or b = 0.

2. If u = [3, …], v = [5, 2], and the vector 4u + v is drawn with its tail at the point (10, −10), find the coordinates of the point at the head of 4u + v.
3. If u = [3, …], v = [5, 2], and 2x + u = 3(x − v), solve for x.
4. Let A, B, C, and D be the vertices of a square centered at the origin O, labeled in clockwise order. If a = OA and b = OB, find BC in terms of a and b.
5. Find the angle between the vectors [−1, 1, 2] and [2, 1, −1].
6. Find the projection of v = [1, …] onto u = [−2, 1, 2].
7. Find a unit vector in the xy-plane that is orthogonal to […].
8. Find the general equation of the plane through the point (1, 1, 1) that is perpendicular to the line with parametric equations x = 2 − t, y = 3 + 2t, z = −1 + t.
9. Find the general equation of the plane through the point (3, 2, 5) that is parallel to the plane whose general equation is 2x + 3y − z = 0.
10. Find the general equation of the plane through the points A(1, 1, 0), B(1, 0, 1), and C(0, 1, 2).
11. Find the area of the triangle with vertices A(1, 1, 0), B(1, 0, 1), and C(0, 1, 2).
12. Find the midpoint of the line segment between A = (5, 1, −2) and B = (3, −7, 0).
13. Why are there no vectors u and v in ℝⁿ such that ‖u‖ = 2, ‖v‖ = 3, and u · v = −7?
14. Find the distance from the point (3, 2, 5) to the plane whose general equation is 2x + 3y − z = 0.
15. Find the distance from the point (3, 2, 5) to the line with parametric equations x = t, y = 1 + t, z = 2 + t.
16. Compute 3 − (2 + 4)³(4 + 3)² in ℤ₅.
17. If possible, solve 3(x + 2) = 5 in ℤ₇.
18. If possible, solve 3(x + 2) = 5 in ℤ₉.
19. Compute [2, 1, 3, 3] · [3, 4, 4, 2] in ℤ₅⁴.
20. Let u = [1, 1, 1, 0] in ℤ₂⁴. How many binary vectors v satisfy u · v = 0?

Chapter 2: Systems of Linear Equations

The world was full of equations. . . . There must be an answer for everything, if only you knew how to set forth the questions.
—Anne Tyler, The Accidental Tourist (Alfred A. Knopf, 1985), p. 235

2.0 Introduction: Triviality

The word trivial is derived from the Latin root tri ("three") and the Latin word via ("road"). Thus, speaking literally, a triviality is a place where three roads meet. This common meeting point gives rise to the other, more familiar meaning of trivial: commonplace, ordinary, or insignificant.
In medieval universities, the trivium consisted of the three "common" subjects (grammar, rhetoric, and logic) that were taught before the quadrivium (arithmetic, geometry, music, and astronomy). The "three roads" that made up the trivium were the beginning of the liberal arts.

In this section, we begin to examine systems of linear equations. The same system of equations can be viewed in three different, yet equally important, ways; these will be our three roads, all leading to the same solution. You will need to get used to this threefold way of viewing systems of linear equations, so that it becomes commonplace (trivial!) for you.

The system of equations we are going to consider is

2x + y = 8
x − 3y = −3

Problem 1  Draw the two lines represented by these equations. What is their point of intersection?

Problem 2  Consider the vectors u = [2, 1] and v = [1, −3]. Draw the coordinate grid determined by u and v. [Hint: Lightly draw the standard coordinate grid first and use it as an aid in drawing the new one.]

Problem 3  On the u-v grid, find the coordinates of w = [8, −3].

Problem 4  Another way to state Problem 3 is to ask for the coefficients x and y for which xu + yv = w. Write out the two equations to which this vector equation is equivalent (one for each component). What do you observe?

Problem 5  Return now to the lines you drew for Problem 1. We will refer to the line whose equation is 2x + y = 8 as line 1 and the line whose equation is x − 3y = −3 as line 2. Plot the point (0, 0) on your graph from Problem 1 and label it P₀. Draw a horizontal line segment from P₀ to line 1 and label this new point P₁. Next draw a vertical line segment from P₁ to line 2 and label this point P₂. Now draw a horizontal line segment from P₂ to line 1, obtaining point P₃. Continue in this fashion, drawing vertical segments to line 2 followed by horizontal segments to line 1. What appears to be happening?
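The staircase construction of Problem 5 can also be carried out numerically (a sketch in our own notation, rounding to two decimal places as Problem 6 will ask). Moving horizontally onto line 1 replaces x by (8 − y)/2, and moving vertically onto line 2 replaces y by (x + 3)/3:

```python
# Staircase iteration for the system 2x + y = 8 (line 1), x - 3y = -3 (line 2),
# starting from P0 = (0, 0).
x, y = 0.0, 0.0
points = []
for _ in range(3):
    x = (8 - y) / 2                        # horizontal segment to line 1
    points.append((round(x, 2), round(y, 2)))
    y = (x + 3) / 3                        # vertical segment to line 2
    points.append((round(x, 2), round(y, 2)))

print(points)
# [(4.0, 0.0), (4.0, 2.33), (2.83, 2.33), (2.83, 1.94), (3.03, 1.94), (3.03, 2.01)]
```

The points P₁, . . . , P₆ close in on (3, 2), the intersection point found in Problem 1.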
Problem 6  Using a calculator with two-decimal-place accuracy, find the (approximate) coordinates of the points P₁, P₂, P₃, . . . , P₆. (You will find it helpful to first solve the first equation for x in terms of y and the second equation for y in terms of x.) Record your results in Table 2.1, writing the x- and y-coordinates of each point separately.

Table 2.1
Point   x     y
P₀      0     0
P₁
P₂
⋮
P₆

The results of these problems show that the task of "solving" a system of linear equations may be viewed in several ways. Repeat the process described in the problems with the following systems of equations:

(a) 4x − 2y = 0    (b) 3x + 2y = 9     (c) x + y = 5    (d) x + 2y = 4
    x + 2y = 5         x + 3y = 10         x − y = 3        2x − y = 3

Are all of your observations from Problems 1-6 still valid for these examples? Note any similarities or differences. In this chapter, we will explore these ideas in more detail.

2.1 Introduction to Systems of Linear Equations

Recall that the general equation of a line in ℝ² is of the form

ax + by = c

and that the general equation of a plane in ℝ³ is of the form

ax + by + cz = d

Equations of this form are called linear equations.

Definition  A linear equation in the n variables x₁, x₂, . . . , xₙ is an equation that can be written in the form

a₁x₁ + a₂x₂ + ⋯ + aₙxₙ = b

where the coefficients a₁, a₂, . . . , aₙ and the constant term b are constants.

Example 2.1  The following equations are linear:

3x − 4y = −1
r − (1/2)s − (3/2)t = 9
x₁ + 5x₂ = 3 − x₃ + 2x₄
√2 x + (π/4)y − (sin π/5)z = 1
3.2x₁ − 0.01x₂ = 4.6

Observe that the third equation is linear because it can be rewritten in the form x₁ + 5x₂ + x₃ − 2x₄ = 3. It is also important to note that, although in these examples (and in most applications) the coefficients and constant terms are real numbers, in some examples and applications they will be complex numbers or members of ℤₚ for some prime number p.
The following equations are not linear:

xy + 2z = 1
x₁² − x₂² = 3
1/x + z = 2
√2 x − y − sin((π/5)z) = 1
3^x₁ + 2x₃ = 0

Thus, linear equations do not contain products, reciprocals, or other functions of the variables; the variables occur only to the first power and are multiplied only by constants. Pay particular attention to the fourth example in each list: Why is it that the fourth equation in the first list is linear but the fourth equation in the second list is not?

A solution of a linear equation a₁x₁ + a₂x₂ + ⋯ + aₙxₙ = b is a vector [s₁, s₂, . . . , sₙ] whose components satisfy the equation when we substitute x₁ = s₁, x₂ = s₂, . . . , xₙ = sₙ.

Example 2.2  (a) [5, 4] is a solution of 3x − 4y = −1 because, when we substitute x = 5 and y = 4, the equation is satisfied: 3(5) − 4(4) = −1. [1, 1] is another solution. In general, the solutions simply correspond to the points on the line determined by the given equation. Thus, setting x = t and solving for y, we see that the complete set of solutions can be written in the parametric form [t, 1/4 + (3/4)t]. (We could also set y equal to some parameter, say s, and solve for x instead; the two parametric solutions would look different but would be equivalent. Try this.)

(b) The linear equation x₁ − x₂ + 2x₃ = 3 has [3, 0, 0], [0, 1, 2], and [6, 1, −1] as specific solutions. The complete set of solutions corresponds to the set of points in the plane determined by the given equation. If we set x₂ = s and x₃ = t, then a parametric solution is given by [3 + s − 2t, s, t]. (Which values of s and t produce the three specific solutions above?)

A system of linear equations is a finite set of linear equations, each with the same variables. A solution of a system of linear equations is a vector that is simultaneously a solution of each equation in the system. The solution set of a system of linear equations is the set of all solutions of the system. We will refer to the process of finding the solution set of a system of linear equations as "solving the system."

Example 2.3  The system

2x − y = 3
x + 3y = 5

has [2, 1] as a solution, since it is a solution of both equations. On the other hand, [1, −1] is not a solution of the system, since it satisfies only the first equation.

Example 2.4  Solve the following systems of linear equations:

(a) x − y = 1    (b) x − y = 2      (c) x − y = 1
    x + y = 3        2x − 2y = 4        x − y = 3
We will refer to the process of finding the solution set of a system of linear equations as "solving the system." «- Example 2.3 The system 2x - y = 3 i x + 3y = 5 has [2, 1] as a solution, since it is a solution of both equations. On the other hand [1, — 1] is not a solution of the system, since it satisfies only the first equation. Example 2.4 Solve the following systems of linear equations: (a) x — y =s 1 (b) x — y = 2 (c) x — y = 1 x + y = 3 2x — 2y = 4 x — y = 3 60 Chapter 2 Systems of Linear Equations Solution (a) Adding the two equations together gives 2x = 4, so x = 2, from which we find that v = 1. A quick check confirms that [2,1] is indeed a solution of both equations. That this is the only solution can be seen by observing that this solution corresponds to the (unique) point of intersection (2, 1) of the lines with equations X — y = 1 and x + y — 3, as shown in Figure 2.1(a). Thus, [2, 1] is a unique solution. (b) The second equation in this system is just twice the first, so the solutions are the solutions of the first equation alone—namely, the points on the line x — y = 2. These can be represented parametrically as [2 + t,t]. Thus, this system has infinitely many solutions [Figure 2.1(b)]. (c) Two numbers x and y cannot simultaneously have a difference of 1 and 3. Hence, this system has no solutions. (A more algebraic approach might be to subtract the second equation from the first, yielding the absurd conclusion 0 = -2.) As Figure 2.1(c) shows, the lines for the equations are parallel in this case. A system of linear equations is called consistent if it has at least one solution. A system with no solutions is called inconsistent Even though they are small, the three systems in Example 2.4 illustrate the only three possibilities for the number of solutions of a system of linear equations with real coefficients. We will prove later that these same three possibilities hold for any system of linear equations over the real numbers. 
A system of linear equations with real coefficients has either
(a) a unique solution (a consistent system) or
(b) infinitely many solutions (a consistent system) or
(c) no solutions (an inconsistent system).

Solving a System of Linear Equations

Two linear systems are called equivalent if they have the same solution sets. For example,

    x − y = 1        and        x − y = 1
    x + y = 3                       y = 1

are equivalent, since they both have the unique solution [2, 1]. (Check this.)

Our approach to solving a system of linear equations is to transform the given system into an equivalent one that is easier to solve. The triangular pattern of the second example above (in which the second equation has one fewer variable than the first) is what we will aim for.

Example 2.5  Solve the system

    x − y −  z =  2
        y + 3z =  5
            5z = 10

Solution  Starting from the last equation and working backward, we find successively that z = 2, y = 5 − 3(2) = −1, and x = 2 + (−1) + 2 = 3. So the unique solution is [3, −1, 2].

The procedure used to solve Example 2.5 is called back substitution. We now turn to the general strategy for transforming a given system into an equivalent one that can be solved easily by back substitution. This process will be described in greater detail in the next section; for now, we will simply observe it in action in a single example.

Example 2.6  Solve the system

     x −  y −  z =  2
    3x − 3y + 2z = 16
    2x −  y +  z =  9

Solution  To transform this system into one that exhibits the triangular structure of Example 2.5, we first need to eliminate the variable x from Equations 2 and 3. Observe that subtracting appropriate multiples of Equation 1 from Equations 2 and 3 will do the trick.
Next, observe that we are operating on the coefficients, not on the variables, so we can save ourselves some writing if we record the coefficients and constant terms in the matrix

    [ 1  -1  -1 |  2 ]
    [ 3  -3   2 | 16 ]
    [ 2  -1   1 |  9 ]

where the first three columns contain the coefficients of the variables in order, the final column contains the constant terms, and the vertical bar serves to remind us of the equal signs in the equations. This matrix is called the augmented matrix of the system.

The word matrix is derived from the Latin word mater, meaning "mother." When the suffix -ix is added, the meaning becomes "womb." Just as a womb surrounds a fetus, the brackets of a matrix surround its entries, and just as the womb gives rise to a baby, a matrix gives rise to certain types of functions called linear transformations. A matrix with m rows and n columns is called an m × n matrix (pronounced "m by n"). The plural of matrix is matrices, not "matrixes."

There are various ways to convert the given system into one with the triangular pattern we are after. The steps we will use here are closest in spirit to the more general method described in the next section. We will perform the sequence of operations on the given system and simultaneously on the corresponding augmented matrix. We begin by eliminating x from Equations 2 and 3.
     x −  y −  z =  2        [ 1  -1  -1 |  2 ]
    3x − 3y + 2z = 16        [ 3  -3   2 | 16 ]
    2x −  y +  z =  9        [ 2  -1   1 |  9 ]

Subtract 3 times the first equation from the second equation (that is, subtract 3 times the first row from the second row):

     x −  y −  z =  2        [ 1  -1  -1 |  2 ]
              5z = 10        [ 0   0   5 | 10 ]
    2x −  y +  z =  9        [ 2  -1   1 |  9 ]

Subtract 2 times the first equation from the third equation (subtract 2 times the first row from the third row):

     x −  y −  z =  2        [ 1  -1  -1 |  2 ]
              5z = 10        [ 0   0   5 | 10 ]
         y + 3z =  5         [ 0   1   3 |  5 ]

Interchange Equations 2 and 3 (interchange rows 2 and 3):

     x −  y −  z =  2        [ 1  -1  -1 |  2 ]
         y + 3z =  5         [ 0   1   3 |  5 ]
              5z = 10        [ 0   0   5 | 10 ]

This is the same system that we solved using back substitution in Example 2.5, where we found that the solution was [3, −1, 2]. This is therefore also the solution to the system given in this example. Why? The calculations above show that any solution of the given system is also a solution of the final one. But since the steps we just performed are reversible, we could recover the original system, starting with the final system. (How?) So any solution of the final system is also a solution of the given one. Thus, the systems are equivalent (as are all of the ones obtained in the intermediate steps above). Moreover, we might just as well work with matrices instead of equations, since it is a simple matter to reinsert the variables before proceeding with the back substitution. (Working with matrices is the subject of the next section.)

Remark  Calculators with matrix capabilities and computer algebra systems can facilitate solving systems of linear equations, particularly when the systems are large or have coefficients that are not "nice," as is often the case in real-life applications. As always, though, you should do as many examples as you can with pencil and paper until you are comfortable with the techniques. Even if a calculator or CAS is called for, think about how you would do the calculations manually before doing anything. After you have an answer, be sure to think about whether it is reasonable.
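The row operations of Example 2.6 followed by the back substitution of Example 2.5 can be sketched in a few lines of code. This is an illustrative sketch, not part of the text; the helper names `add_multiple` and `back_substitute` are made up, and exact `Fraction` arithmetic is used to sidestep the roundoff issues mentioned in the Remark:

```python
from fractions import Fraction

def add_multiple(M, i, j, k):
    """Replace row i of M by (row i) + k * (row j)."""
    M[i] = [a + k * b for a, b in zip(M[i], M[j])]

def back_substitute(M):
    """Solve a triangular augmented matrix [A | b] with nonzero diagonal."""
    n = len(M)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):                     # last equation first
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(M[i][n]) - s) / M[i][i]
    return x

# Augmented matrix of Example 2.6
M = [[1, -1, -1,  2],
     [3, -3,  2, 16],
     [2, -1,  1,  9]]

add_multiple(M, 1, 0, -3)    # subtract 3 times row 1 from row 2
add_multiple(M, 2, 0, -2)    # subtract 2 times row 1 from row 3
M[1], M[2] = M[2], M[1]      # interchange rows 2 and 3

assert M == [[1, -1, -1, 2], [0, 1, 3, 5], [0, 0, 5, 10]]
print(back_substitute(M))    # the solution [3, -1, 2] of Example 2.5
```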
Do not be misled into thinking that technology will always give you the answer faster or more easily than calculating by hand. Sometimes it may not give you the answer at all! Roundoff errors associated with the floating-point arithmetic used by calculators and computers can cause serious problems and lead to wildly wrong answers to some problems. See Exploration: Lies My Computer Told Me for a glimpse of the problem. (You've been warned!)

Exercises 2.1

In Exercises 1–6, determine which equations are linear equations in the variables x, y, and z. If any equation is not linear, explain why not.

1. x − πy + ∛5 z = 0            2. x² + y² + z² = 1
3. x⁻¹ + 7y + z = sin(π/9)      4. 2x − xy − 5z = 0
5. 3 cos x − 4y + z = √3        6. (cos 3)x − 4y + z = √3

In Exercises 7–10, find a linear equation that has the same solution set as the given equation (possibly with some restrictions on the variables).

7. 2x + y = 7 − 3y              8. (x² − y²)/(x − y) = 1
9. 1/x + 1/y = 4/(xy)           10. log₁₀ x − log₁₀ y = 2

In Exercises 11–14, find the solution set of each equation.

11. 3x − 6y = 0                 12. 2x₁ + 3x₂ = 5
13. x + 2y + 3z = 4             14. 4x₁ + 3x₂ + 2x₃ = 1

In Exercises 15–18, draw graphs corresponding to the given linear systems. Determine geometrically whether each system has a unique solution, infinitely many solutions, or no solution. Then solve each system algebraically to confirm your answer.

15.  x + y = 0                  16.  x − 2y = 7
    2x + y = 3                      3x +  y = 7
17. 3x − 6y = 3                 18.  0.10x − 0.05y =  0.20
    −x + 2y = 1                     −0.06x + 0.03y = −0.12

In Exercises 19–24, solve the given system by back substitution.

19. x − 2y = 1                  20. 2u − 3v = 5
         y = 3                          2v = 6
21. x − y + z =  0              22. x₁ + 2x₂ + 3x₃ = 0
       2y − z =  1                             … = 0
           3z = −1                           4x₃ = 0
23. x₁ + x₂ − x₃ … = 1          24. x − 3y + z = 5
         x₂ + x₃ … = 0                  y − 2z = −1
                x₄ = 0

The systems in Exercises 25 and 26 exhibit a "lower triangular" pattern that makes them easy to solve by forward substitution. (We will encounter forward substitution again in Chapter 3.)
Solve these systems.

25.  x          =   2           26. x₁ = 1
    2x +  y     =  −3               …
   −3x − 4y + z = −10               2x₁ + 2x₂ + x₃ = …

Find the augmented matrices of the linear systems in Exercises 27–30.

27.  x − y = 0                  28. 2x₁ + 3x₂ … = 1
    2x + y = 3                       x₁ + x₃ = 0
                                     x₁ − 2x₂ = 0
29.  x + 5y = −1                30.  a − 2b     +  d = …
    −x +  y = −5                    −a +  b − c − 3d = …
    2x + 4y =  4

In Exercises 31 and 32, find a system of linear equations that has the given matrix as its augmented matrix.

31. [ 0   1   1 | 1 ]           32. [ 1  -1   0   3 |  1 ]
    [ 1  -1   0 | 1 ]               [ 1   1   2   1 | -1 ]
    [ 2  -1   1 | 1 ]               [ 0   1   0   2 |  3 ]

For Exercises 33–38, solve the linear systems in the given exercises.

33. Exercise 27    34. Exercise 28    35. Exercise 29
36. Exercise 30    37. Exercise 31    38. Exercise 32

39. (a) Find a system of two linear equations in the variables x and y whose solution set is given by the parametric equations x = t and y = 3 − 2t.
(b) Find another parametric solution to the system in part (a) in which the parameter is s and y = s.

40. (a) Find a system of two linear equations in the variables x₁, x₂, and x₃ whose solution set is given by the parametric equations x₁ = t, x₂ = 1 + t, and x₃ = 2 − t.
(b) Find another parametric solution to the system in part (a) in which the parameter is s and x₃ = s.

In Exercises 41–44, the systems of equations are nonlinear. Find substitutions (changes of variables) that convert each system into a linear system, and use this linear system to help solve the given system.

41. 2/x + 3/y = 0               42. x² + 2y² = 6
    3/x + 4/y = 1                   x² −  y² = 3

43. tan x − 2 sin y         =  2
    tan x −   sin y + cos z =  2
              sin y − cos z = −1

44. −2^a + 2(3^b) = 1
    3(2^a) − 4(3^b) = 1

Direct Methods for Solving Linear Systems

In this section, we will look at a general, systematic procedure for solving a system of linear equations. This procedure is based on the idea of reducing the augmented matrix of the given system to a form that can then be solved by back substitution. The method is direct in the sense that it leads directly to the solution (if one exists) in a finite number of steps.
In Section 2.5, we will consider some indirect methods that work in a completely different way.

Matrices and Echelon Form

There are two important matrices associated with a linear system. The coefficient matrix contains the coefficients of the variables, and the augmented matrix (which we have already encountered) is the coefficient matrix augmented by an extra column containing the constant terms. For the system

    2x +  y −  z = 3
     x       + 5z = 1
    −x + 3y − 2z = 0

the coefficient matrix is

    [  2   1  -1 ]
    [  1   0   5 ]
    [ -1   3  -2 ]

and the augmented matrix is

    [  2   1  -1 | 3 ]
    [  1   0   5 | 1 ]
    [ -1   3  -2 | 0 ]

Note that if a variable is missing (as y is in the second equation), its coefficient 0 is entered in the appropriate position in the matrix. If we denote the coefficient matrix of a linear system by A and the column vector of constant terms by b, then the form of the augmented matrix is [A | b].
In solving a linear system, it will not always be possible to reduce the coefficient matrix to triangular form, as we did in Example 2.6. However, we can always achieve a staircase pattern in the nonzero entries of the final matrix.

The word echelon comes from the Latin word scala, meaning "ladder" or "stairs." The French word for "ladder," échelle, is also derived from this Latin base. A matrix in echelon form exhibits a staircase pattern.

Definition  A matrix is in row echelon form if it satisfies the following properties:
1. Any rows consisting entirely of zeros are at the bottom.
2. In each nonzero row, the first nonzero entry (called the leading entry) is in a column to the left of any leading entries below it.

Note that these properties guarantee that the leading entries form a staircase pattern. In particular, in any column containing a leading entry, all entries below the leading entry are zero, as the following examples illustrate.

Example 2.7  The following matrices are in row echelon form:

    [ 2   4   1 ]        [ 1   0   1 ]
    [ 0  -1   2 ]        [ 0   1   5 ]
    [ 0   0   0 ]        [ 0   0   4 ]
    [ 1   1   2   1 ]        [ 0   2   0   1  -1   3 ]
    [ 0   0   1   3 ]        [ 0   0  -1   1   2   2 ]
    [ 0   0   0   0 ]        [ 0   0   0   0   4   0 ]
                             [ 0   0   0   0   0   5 ]

If a matrix in row echelon form is actually the augmented matrix of a linear system, the system is quite easy to solve by back substitution alone.

Example 2.8  Assuming that each of the matrices in Example 2.7 is an augmented matrix, write out the corresponding systems of linear equations and solve them.

Solution  We first remind ourselves that the last column in an augmented matrix is the vector of constant terms. The first matrix then corresponds to the system

    2x₁ + 4x₂ = 1
          −x₂ = 2

(Notice that we have dropped the last equation 0 = 0, or 0x₁ + 0x₂ = 0, which is clearly satisfied for any values of x₁ and x₂.) Back substitution gives x₂ = −2 and then 2x₁ = 1 − 4(−2) = 9, so x₁ = 9/2. The solution is [9/2, −2].

The second matrix has the corresponding system

    x₁ = 1
    x₂ = 5
     0 = 4

The last equation represents 0x₁ + 0x₂ = 4, which clearly has no solutions. Therefore, the system has no solutions. Similarly, the system corresponding to the fourth matrix has no solutions. For the system corresponding to the third matrix, we have

    x₁ + x₂ + 2x₃ = 1
               x₃ = 3

so x₁ = 1 − 2(3) − x₂ = −5 − x₂. There are infinitely many solutions, since we may assign x₂ any value t to get the parametric solution [−5 − t, t, 3].

Elementary Row Operations

We now describe the procedure by which any matrix can be reduced to a matrix in row echelon form. The allowable operations, called elementary row operations, correspond to the operations that can be performed on a system of linear equations to transform it into an equivalent system.

Definition  The following elementary row operations can be performed on a matrix:
1. Interchange two rows.
2. Multiply a row by a nonzero constant.
3. Add a multiple of a row to another row.

Observe that dividing a row by a nonzero constant is implied in the above definition, since, for example, dividing a row by 2 is the same as multiplying it by 1/2.
Similarly, subtracting a multiple of one row from another row is the same as adding a negative multiple of one row to another row. We will use the following shorthand notation for the three elementary row operations:

1. Rᵢ ↔ Rⱼ means interchange rows i and j.
2. kRᵢ means multiply row i by k.
3. Rᵢ + kRⱼ means add k times row j to row i (and replace row i with the result).

The process of applying elementary row operations to bring a matrix to row echelon form is called row reduction.

Example 2.9  Reduce the following matrix to echelon form:

    [  1   2  -4  -4   5 ]
    [  2   4   0   0   2 ]
    [  2   3   2   1   5 ]
    [ -1   1   3   6   5 ]

Solution  We work column by column, from left to right and from top to bottom. The strategy is to create a leading entry in a column and then use it to create zeros below it. The entry chosen to become a leading entry is called a pivot, and this phase of the process is called pivoting. Although not strictly necessary, it is often convenient to use the second elementary row operation to make each leading entry a 1.

We begin by introducing zeros into the first column below the leading 1 in the first row:

    [  1   2  -4  -4   5 ]   R₂ - 2R₁   [ 1   2  -4  -4   5 ]
    [  2   4   0   0   2 ]   R₃ - 2R₁   [ 0   0   8   8  -8 ]
    [  2   3   2   1   5 ]   R₄ + R₁    [ 0  -1  10   9  -5 ]
    [ -1   1   3   6   5 ]   ------->   [ 0   3  -1   2  10 ]

The first column is now as we want it, so the next thing to do is to create a leading entry in the second row, aiming for the staircase pattern of echelon form. In this case, we do this by interchanging rows. (We could also add row 3 or row 4 to row 2.)

    R₂ ↔ R₃   [ 1   2  -4  -4   5 ]
    ------->  [ 0  -1  10   9  -5 ]
              [ 0   0   8   8  -8 ]
              [ 0   3  -1   2  10 ]

The pivot this time was −1. We now create a zero at the bottom of column 2, using the leading entry −1 in row 2:

    R₄ + 3R₂  [ 1   2  -4  -4   5 ]
    ------->  [ 0  -1  10   9  -5 ]
              [ 0   0   8   8  -8 ]
              [ 0   0  29  29  -5 ]

Column 2 is now done. Noting that we already have a leading entry in column 3, we just pivot on the 8 to introduce a zero below it.
This is easiest if we first divide row 3 by 8:

    (1/8)R₃   [ 1   2  -4  -4   5 ]
    ------->  [ 0  -1  10   9  -5 ]
              [ 0   0   1   1  -1 ]
              [ 0   0  29  29  -5 ]

Now we use the leading 1 in row 3 to create a zero below it:

    R₄ - 29R₃  [ 1   2  -4  -4   5 ]
    -------->  [ 0  -1  10   9  -5 ]
               [ 0   0   1   1  -1 ]
               [ 0   0   0   0  24 ]

With this final step, we have reduced our matrix to echelon form.

Remarks
• The row echelon form of a matrix is not unique. (Find a different row echelon form for the matrix in Example 2.9.)
• The leading entry in each row is used to create the zeros below it.
• The pivots are not necessarily the entries that are originally in the positions eventually occupied by the leading entries. In Example 2.9, the pivots were 1, −1, 8, and 24. The original matrix had 1, 4, 2, and 5 in those positions on the "staircase."
• Once we have pivoted and introduced zeros below the leading entry in a column, that column does not change. In other words, the row echelon form emerges from left to right, top to bottom.

Elementary row operations are reversible, that is, they can be "undone." Thus, if some elementary row operation converts A into B, there is also an elementary row operation that converts B into A. (See Exercises 15 and 16.)

Definition  Matrices A and B are row equivalent if there is a sequence of elementary row operations that converts A into B.

The matrices in Example 2.9,

    [  1   2  -4  -4   5 ]        [ 1   2  -4  -4   5 ]
    [  2   4   0   0   2 ]   and  [ 0  -1  10   9  -5 ]
    [  2   3   2   1   5 ]        [ 0   0   1   1  -1 ]
    [ -1   1   3   6   5 ]        [ 0   0   0   0  24 ]

are row equivalent. In general, though, how can we tell whether two matrices are row equivalent?

Theorem 2.1  Matrices A and B are row equivalent if and only if they can be reduced to the same row echelon form.

Proof  If A and B are row equivalent, then further row operations will reduce B (and therefore A) to the (same) row echelon form. Conversely, if A and B have the same row echelon form R, then, via elementary row operations, we can convert A into R and B into R.
Reversing the latter sequence of operations, we can convert R into B, and therefore the sequence A → R → B achieves the desired effect.

Remark  In practice, Theorem 2.1 is easiest to use if R is the reduced row echelon form of A and B, as defined on page 73. See Exercises 17 and 18.

Gaussian Elimination

When row reduction is applied to the augmented matrix of a system of linear equations, we create an equivalent system that can be solved by back substitution. The entire process is known as Gaussian elimination.

Gaussian Elimination
1. Write the augmented matrix of the system of linear equations.
2. Use elementary row operations to reduce the augmented matrix to row echelon form.
3. Using back substitution, solve the equivalent system that corresponds to the row-reduced matrix.

Remark  When performed by hand, step 2 of Gaussian elimination allows quite a bit of choice. Here are some useful guidelines:
(a) Locate the leftmost column that is not all zeros.
(b) Create a leading entry at the top of this column. (It will usually be easiest if you make this a leading 1. See Exercise 22.)
(c) Use the leading entry to create zeros below it.
(d) Cover up the row containing the leading entry, and go back to step (a) to repeat the procedure on the remaining submatrix. Stop when the entire matrix is in row echelon form.

Example 2.10  Solve the system

          2x₂ + 3x₃ =  8
    2x₁ + 3x₂ +  x₃ =  5
     x₁ −  x₂ − 2x₃ = −5

Solution  The augmented matrix is

    [ 0   2   3 |  8 ]
    [ 2   3   1 |  5 ]
    [ 1  -1  -2 | -5 ]

We proceed to reduce this matrix to row echelon form, following the guidelines given for step 2 of the process. The first nonzero column is column 1. We begin by creating

Carl Friedrich Gauss (1777–1855) is generally considered to be one of the three greatest mathematicians of all time, along with Archimedes and Newton. He is often called the "prince of mathematicians," a nickname that he richly deserves. A child prodigy, Gauss reportedly could do arithmetic before he could talk.
At the age of 3, he corrected an error in his father's calculations for the company payroll, and as a young student, he found the formula n(n + 1)/2 for the sum of the first n natural numbers. When he was 19, he proved that a 17-sided polygon could be constructed using only a straightedge and a compass, and at the age of 21, he proved, in his doctoral dissertation, that every polynomial of degree n with real or complex coefficients has exactly n zeros, counting multiple zeros: the Fundamental Theorem of Algebra. Gauss's 1801 publication Disquisitiones Arithmeticae is generally considered to be the foundation of modern number theory, but he made contributions to nearly every branch of mathematics as well as to statistics, physics, astronomy, and surveying. Gauss did not publish all of his findings, probably because he was too critical of his own work. He also did not like to teach and was often critical of other mathematicians, perhaps because he discovered, but did not publish, their results before they did. The method called Gaussian elimination was known to the Chinese in the third century B.C. and was well known by Gauss's time. The method bears Gauss's name because of his use of it in a paper in which he solved a system of linear equations to describe the orbit of an asteroid.

a leading entry at the top of this column; interchanging rows 1 and 3 is the best way to achieve this.

    [ 0   2   3 |  8 ]   R₁ ↔ R₃   [ 1  -1  -2 | -5 ]
    [ 2   3   1 |  5 ]   ------->  [ 2   3   1 |  5 ]
    [ 1  -1  -2 | -5 ]             [ 0   2   3 |  8 ]

We now create a second zero in the first column, using the leading 1:

    R₂ - 2R₁   [ 1  -1  -2 | -5 ]
    ------->   [ 0   5   5 | 15 ]
               [ 0   2   3 |  8 ]

We now cover up the first row and repeat the procedure. The second column is the first nonzero column of the submatrix. Multiplying row 2 by 1/5 will create a leading 1.
    (1/5)R₂   [ 1  -1  -2 | -5 ]
    ------->  [ 0   1   1 |  3 ]
              [ 0   2   3 |  8 ]

We now need another zero at the bottom of column 2:

    R₃ - 2R₂   [ 1  -1  -2 | -5 ]
    ------->   [ 0   1   1 |  3 ]
               [ 0   0   1 |  2 ]

The augmented matrix is now in row echelon form, and we move to step 3. The corresponding system is

    x₁ − x₂ − 2x₃ = −5
          x₂ +  x₃ =  3
                x₃ =  2

and back substitution gives x₃ = 2, then x₂ = 3 − x₃ = 3 − 2 = 1, and finally x₁ = −5 + x₂ + 2x₃ = −5 + 1 + 4 = 0. We write the solution in vector form as

    [ 0 ]
    [ 1 ]
    [ 2 ]

(We are going to write the vector solutions of linear systems as column vectors from now on. The reason for this will become clear in Chapter 3.)

Example 2.11  Solve the system

     w −  x − y + 2z =  1
    2w − 2x − y + 3z =  3
    −w +  x − y      = −3

Solution  The augmented matrix is

    [  1  -1  -1   2 |  1 ]
    [  2  -2  -1   3 |  3 ]
    [ -1   1  -1   0 | -3 ]

which can be row reduced as follows:

    R₂ - 2R₁   [ 1  -1  -1   2 |  1 ]
    R₃ + R₁    [ 0   0   1  -1 |  1 ]
    ------->   [ 0   0  -2   2 | -2 ]

    R₃ + 2R₂   [ 1  -1  -1   2 |  1 ]
    ------->   [ 0   0   1  -1 |  1 ]
               [ 0   0   0   0 |  0 ]

The associated system is now

    w − x − y + 2z = 1
             y − z = 1

which has infinitely many solutions. There is more than one way to assign parameters, but we will proceed to use back substitution, writing the variables corresponding to the leading entries (the leading variables) in terms of the other variables (the free variables). In this case, the leading variables are w and y, and the free variables are x and z. Thus, y = 1 + z, and from this we obtain

    w = 1 + x + y − 2z
      = 1 + x + (1 + z) − 2z
      = 2 + x − z

If we assign parameters x = s and z = t, the solution can be written in vector form as

    [ w ]   [ 2 + s - t ]   [ 2 ]     [ 1 ]     [ -1 ]
    [ x ] = [     s     ] = [ 0 ] + s [ 1 ] + t [  0 ]
    [ y ]   [   1 + t   ]   [ 1 ]     [ 0 ]     [  1 ]
    [ z ]   [     t     ]   [ 0 ]     [ 0 ]     [  1 ]

Example 2.11 highlights a very important property: In a consistent system, the free variables are just the variables that are not leading variables. Since the number of leading variables is the number of nonzero rows in the row echelon form of the coefficient matrix, we can predict the number of free variables (parameters) before we find the explicit solution using back substitution.
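For systems with a unique solution, the whole of Gaussian elimination (steps 1–3) fits in a short routine. The sketch below is illustrative, not from the text; it assumes a square coefficient matrix and exactly one solution, and it uses exact `Fraction` arithmetic to avoid roundoff:

```python
from fractions import Fraction

def gaussian_eliminate(aug):
    """Solve [A | b] for a square A when the system has a unique solution."""
    M = [[Fraction(x) for x in row] for row in aug]
    n = len(M)
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it up.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]      # make a leading 1
        for r in range(col + 1, n):                     # zeros below the pivot
            M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[col])]
    # Back substitution on the resulting echelon form.
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        x[i] = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
    return x

# Example 2.10: the solution is [0, 1, 2]
aug = [[0,  2,  3,  8],
       [2,  3,  1,  5],
       [1, -1, -2, -5]]
print(gaussian_eliminate(aug))
```

Running it on the augmented matrix of Example 2.10 reproduces the hand computation, including the initial row interchange forced by the zero in the top left corner.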
In Chapter 3, we will prove that, although the row echelon form of a matrix is not unique, the number of nonzero rows is the same in all row echelon forms of a given matrix. Thus, it makes sense to give a name to this number.

Definition  The rank of a matrix is the number of nonzero rows in its row echelon form. We will denote the rank of a matrix A by rank(A).

In Example 2.10, the rank of the coefficient matrix is 3, and in Example 2.11, the rank of the coefficient matrix is 2. The observations we have just made justify the following theorem, which we will prove in more generality in Chapters 3 and 6.

Theorem 2.2  The Rank Theorem
Let A be the coefficient matrix of a system of linear equations with n variables. If the system is consistent, then

    number of free variables = n − rank(A)

Thus, in Example 2.10, we have 3 − 3 = 0 free variables (in other words, a unique solution), and in Example 2.11, we have 4 − 2 = 2 free variables, as we found.

Example 2.12  Solve the system

    x₁ −  x₂ + 2x₃ =  3
    x₁ + 2x₂ −  x₃ = −3
         2x₂ − 2x₃ =  1

Solution  When we row reduce the augmented matrix, we have

    [ 1  -1   2 |  3 ]   R₂ - R₁    [ 1  -1   2 |  3 ]
    [ 1   2  -1 | -3 ]   (1/3)R₂   [ 0   1  -1 | -2 ]
    [ 0   2  -2 |  1 ]   R₃ - 2R₂  [ 0   0   0 |  5 ]
                         ------->

leading to the impossible equation 0 = 5. (We could also have performed R₃ − (2/3)R₂ as the second elementary row operation, which would have given us the same contradiction but a different row echelon form.) Thus, the system has no solutions: it is inconsistent.

Wilhelm Jordan (1842–1899) was a German professor of geodesy whose contribution to solving linear systems was a systematic method of back substitution closely related to the method described here.

Gauss-Jordan Elimination

A modification of Gaussian elimination greatly simplifies the back substitution phase and is particularly helpful when calculations are being done by hand on a system with infinitely many solutions.
This variant, known as Gauss-Jordan elimination, relies on reducing the augmented matrix even further.

Definition  A matrix is in reduced row echelon form if it satisfies the following properties:
1. It is in row echelon form.
2. The leading entry in each nonzero row is a 1 (called a leading 1).
3. Each column containing a leading 1 has zeros everywhere else.

The following matrix is in reduced row echelon form:

    [ 1  -1   0  -2 ]
    [ 0   0   1   0 ]
    [ 0   0   0   0 ]

For 2 × 2 matrices, the possible reduced row echelon forms are

    [ 1  0 ]    [ 1  * ]    [ 0  1 ]    [ 0  0 ]
    [ 0  1 ]    [ 0  0 ]    [ 0  0 ]    [ 0  0 ]

where * can be any number. It is clear that after a matrix has been reduced to echelon form, further elementary row operations will bring it to reduced row echelon form. What is not clear (although intuition may suggest it) is that, unlike the row echelon form, the reduced row echelon form of a matrix is unique. For a short proof of this fact, see the article by Thomas Yuster, "The Reduced Row Echelon Form of a Matrix Is Unique: A Simple Proof," in the March 1984 issue of Mathematics Magazine (vol. 57, no. 2, pp. 93–94).

In Gauss-Jordan elimination, we proceed as in Gaussian elimination but reduce the augmented matrix to reduced row echelon form.

Gauss-Jordan Elimination
1. Write the augmented matrix of the system of linear equations.
2. Use elementary row operations to reduce the augmented matrix to reduced row echelon form.
3. If the resulting system is consistent, solve for the leading variables in terms of any remaining free variables.

Example 2.13  Solve the system in Example 2.11 by Gauss-Jordan elimination.

Solution  The reduction proceeds as it did in Example 2.11 until we reach the echelon form:

    [ 1  -1  -1   2 | 1 ]
    [ 0   0   1  -1 | 1 ]
    [ 0   0   0   0 | 0 ]

We now must create a zero above the leading 1 in the second row, third column.
We do this by adding row 2 to row 1 to obtain

    [ 1  -1   0   1 | 2 ]
    [ 0   0   1  -1 | 1 ]
    [ 0   0   0   0 | 0 ]

The system has now been reduced to

    w − x + z = 2
        y − z = 1

It is now much easier to solve for the leading variables:

    w = 2 + x − z    and    y = 1 + z

If we assign parameters x = s and z = t as before, the solution can be written in vector form as

    [ w ]   [ 2 + s - t ]
    [ x ] = [     s     ]
    [ y ]   [   1 + t   ]
    [ z ]   [     t     ]

Remark  From a computational point of view, it is more efficient (in the sense that it requires fewer calculations) to first reduce the matrix to row echelon form and then, working from right to left, make each leading entry a 1 and create zeros above these leading 1s. However, for manual calculation, you will find it easier to just work from left to right and create the leading 1s and the zeros in their columns as you go.

Let's return to the geometry that brought us to this point. Just as systems of linear equations in two variables correspond to lines in R², so linear equations in three variables correspond to planes in R³. In fact, many questions about lines and planes can be answered by solving an appropriate linear system.

Example 2.14  Find the line of intersection of the planes x + 2y − z = 3 and 2x + 3y + z = 1.

Solution  First, observe that there will be a line of intersection, since the normal vectors of the two planes, [1, 2, −1] and [2, 3, 1], are not parallel. The points that lie in the intersection of the two planes correspond to the points in the solution set of the system

     x + 2y − z = 3
    2x + 3y + z = 1

Gauss-Jordan elimination applied to the augmented matrix yields

    [ 1   2  -1 |  3 ]   R₂ - 2R₁   [ 1   2  -1 |  3 ]
    [ 2   3   1 |  1 ]   ------->   [ 0  -1   3 | -5 ]

    -R₂        [ 1   0   5 | -7 ]
    R₁ - 2R₂   [ 0   1  -3 |  5 ]
    ------->

Reinserting the variables, we have

    x + 5z = −7
    y − 3z =  5

We set the free variable z equal to a parameter t and thus obtain the parametric equations of the line of intersection of the two planes:

    x = −7 − 5t
    y =  5 + 3t
    z =       t

In vector form, the equation is

    [ x ]   [ -7 ]     [ -5 ]
    [ y ] = [  5 ] + t [  3 ]
    [ z ]   [  0 ]     [  1 ]

See Figure 2.2 (the intersection of the two planes).
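The reduction in Example 2.14 can be replayed with a small reduced-row-echelon-form routine. This sketch is illustrative and not from the text; the function name `rref` is ours, and exact `Fraction` arithmetic keeps the entries clean:

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row echelon form of a matrix (list of rows)."""
    M = [[Fraction(x) for x in row] for row in rows]
    nrows, ncols = len(M), len(M[0])
    lead = 0
    for r in range(nrows):
        while lead < ncols and all(M[i][lead] == 0 for i in range(r, nrows)):
            lead += 1                                   # skip pivot-free columns
        if lead == ncols:
            break
        pivot = next(i for i in range(r, nrows) if M[i][lead] != 0)
        M[r], M[pivot] = M[pivot], M[r]
        M[r] = [x / M[r][lead] for x in M[r]]           # make a leading 1
        for i in range(nrows):                          # zeros above and below
            if i != r and M[i][lead] != 0:
                M[i] = [a - M[i][lead] * b for a, b in zip(M[i], M[r])]
        lead += 1
    return M

# The two planes of Example 2.14
R = rref([[1, 2, -1, 3],
          [2, 3, 1, 1]])
print(R)  # rows [1 0 5 | -7] and [0 1 -3 | 5], i.e. x + 5z = -7 and y - 3z = 5
```

Reading the free variable z as the parameter t then gives the same parametric line as in the text.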
Example 2.15  Let

    p = [  1 ]    q = [ 0 ]    u = [ 1 ]    v = [  3 ]
        [  0 ]        [ 2 ]        [ 1 ]        [ -1 ]
        [ -1 ]        [ 1 ]        [ 1 ]        [ -1 ]

Determine whether the lines x = p + tu and x = q + tv intersect and, if so, find their point of intersection.

Solution  We need to be careful here. Although t has been used as the parameter in the equations of both lines, the lines are independent and therefore so are their parameters. Let's use a different parameter, say s, for the first line, so its equation becomes x = p + su. If the lines intersect, then we want to find an x = [x, y, z] that satisfies both equations simultaneously. That is, we want x = p + su = q + tv, or su − tv = q − p. Substituting the given p, q, u, and v, we obtain the equations

    s − 3t = −1
    s +  t =  2
    s +  t =  2

whose solution is easily found to be s = 5/4, t = 3/4. The point of intersection is therefore

    [ x ]   [  1 ]         [ 1 ]   [ 9/4 ]
    [ y ] = [  0 ] + (5/4) [ 1 ] = [ 5/4 ]
    [ z ]   [ -1 ]         [ 1 ]   [ 1/4 ]

See Figure 2.3 (two intersecting lines). (Check that substituting t = 3/4 in the other equation gives the same point.)

Remark  In R³, it is possible for two lines to intersect in a point, to be parallel, or to do neither. Nonparallel lines that do not intersect are called skew lines.

Homogeneous Systems

We have seen that every system of linear equations has either no solution, a unique solution, or infinitely many solutions. However, there is one type of system that always has at least one solution.

Definition  A system of linear equations is called homogeneous if the constant term in each equation is zero.

In other words, a homogeneous system has an augmented matrix of the form [A | 0]. The following system is homogeneous:

    2x + 3y −  z = 0
    −x + 5y + 2z = 0

Since a homogeneous system cannot have no solution (forgive the double negative!), it will have either a unique solution (namely, the zero, or trivial, solution) or infinitely many solutions. The next theorem says that the latter case must occur if the number of variables is greater than the number of equations.
Theorem 2.3  If [A | 0] is a homogeneous system of m linear equations with n variables, where m < n, then the system has infinitely many solutions.

Proof  Since the system has at least the zero solution, it is consistent. Also, rank(A) ≤ m (why?). By the Rank Theorem, we have

    number of free variables = n − rank(A) ≥ n − m > 0

So there is at least one free variable and, hence, there are infinitely many solutions.

Note  Theorem 2.3 says nothing about the case where m ≥ n. Exercise 44 asks you to give examples to show that, in this case, there can be either a unique solution or infinitely many solutions.

Linear Systems over Zp

R and Zp are examples of fields. The set of rational numbers Q and the set of complex numbers C are other examples. Fields are covered in detail in courses in abstract algebra.

Thus far, all of the linear systems we have encountered have involved real numbers, and the solutions have accordingly been vectors in some Rⁿ. We have seen how other number systems arise, notably Zp. When p is a prime number, Zp behaves in many respects like R; in particular, we can add, subtract, multiply, and divide (by nonzero numbers). Thus, we can also solve systems of linear equations when the variables and coefficients belong to some Zp. In such instances, we refer to solving a system over Zp.

For example, the linear equation x₁ + x₂ + x₃ = 1, when viewed as an equation over Z2, has exactly four solutions:

    [ x₁ ]   [ 1 ]  [ 0 ]  [ 0 ]        [ 1 ]
    [ x₂ ] = [ 0 ], [ 1 ], [ 0 ], and   [ 1 ]
    [ x₃ ]   [ 0 ]  [ 0 ]  [ 1 ]        [ 1 ]

(where the last solution arises because 1 + 1 + 1 = 1 in Z2).

When we view the equation x₁ + x₂ + x₃ = 1 over Z3, the solutions are

    [ 1 ]  [ 0 ]  [ 0 ]  [ 0 ]  [ 2 ]  [ 2 ]  [ 1 ]  [ 1 ]  [ 2 ]
    [ 0 ], [ 1 ], [ 0 ], [ 2 ], [ 0 ], [ 2 ], [ 1 ], [ 2 ], [ 1 ]
    [ 0 ]  [ 0 ]  [ 1 ]  [ 2 ]  [ 2 ]  [ 0 ]  [ 2 ]  [ 1 ]  [ 1 ]

(Check these.) But we need not use trial-and-error methods; row reduction of augmented matrices works just as well over Zp as over R.
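The solution counts above are easy to confirm by brute force before turning to row reduction. A quick illustrative check (not from the text; the helper name is ours):

```python
from itertools import product

def solutions_mod_p(p):
    """All (x1, x2, x3) in Zp with x1 + x2 + x3 = 1 (mod p)."""
    return [v for v in product(range(p), repeat=3) if sum(v) % p == 1]

print(len(solutions_mod_p(2)))  # 4 solutions over Z2
print(len(solutions_mod_p(3)))  # 9 solutions over Z3

# The "extra" solution over Z2, since 1 + 1 + 1 = 1 in Z2
assert (1, 1, 1) in solutions_mod_p(2)
```

In general there are p² solutions here: x₁ and x₂ are free, and x₃ is then forced, which previews the Remark below about counting solutions over Zp.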
Example 2.16

Solve the following system of linear equations over Z3:

    x1 + 2x2 +  x3 = 0
    x1       +  x3 = 2
         x2 + 2x3 = 1

Solution   The first thing to note in examples like this one is that subtraction and division are not needed; we can accomplish the same effects using addition and multiplication. (This, however, requires that we be working over Zp, where p is a prime; see Exercise 60 at the end of this section and Exercise 57 in Section 1.1.) We row reduce the augmented matrix of the system, using calculations modulo 3:

    [1 2 1 | 0]                 [1 0 0 | 1]
    [1 0 1 | 2]  ->  ...  ->    [0 1 0 | 2]
    [0 1 2 | 1]                 [0 0 1 | 1]

(using row operations such as R2 + 2R1, R1 + R2, and R1 + 2R2, all carried out modulo 3). Thus, the solution is x1 = 1, x2 = 2, x3 = 1.

Example 2.17

Solve the following system of linear equations over Z2:

    x1 + x2 + x3 + x4 = 1
    x1 + x2           = 1
         x2 + x3      = 0
              x3 + x4 = 0
    x1           + x4 = 1

Solution   The row reduction proceeds as follows (all arithmetic modulo 2):

    [1 1 1 1 | 1]                 [1 0 0 1 | 1]
    [1 1 0 0 | 1]                 [0 1 1 0 | 0]
    [0 1 1 0 | 0]  ->  ...  ->    [0 0 1 1 | 0]
    [0 0 1 1 | 0]                 [0 0 0 0 | 0]
    [1 0 0 1 | 1]                 [0 0 0 0 | 0]

Therefore, we have

    x1           + x4 = 1
         x2 + x3      = 0
              x3 + x4 = 0

Setting the free variable x4 = t yields (since -t = t in Z2)

    [x1, x2, x3, x4] = [1 + t, t, t, t] = [1, 0, 0, 0] + t[1, 1, 1, 1]

Since t can take on the two values 0 and 1, there are exactly two solutions:

    [1, 0, 0, 0] and [0, 1, 1, 1]

Remark   For linear systems over Zp, there can never be infinitely many solutions. (Why not?) Rather, when there is more than one solution, the number of solutions is finite and is a function of the number of free variables and p. (See Exercise 59.)

Exercises 2.2

In Exercises 1-8, determine whether the given matrix is in row echelon form. If it is, state whether it is also in reduced row echelon form. 1. 3. In Exercises 9-14, use elementary row operations to reduce the given matrix to (a) row echelon form and (b) reduced row echelon form. 0 0 1 0 1 1 1 1 0 1 7 0 1 0 0 0 3 2. 0 1 i 4 .0 1 0.
.0 0 0 0_ 11. "0 0 0 "0 1 3 0" 0 4. 0 0 .0 0 0 1. _0 0 0. ~1 0 3 4 o" "0 0 r 13. 0 0 0 0 0 6. 0 1 0 _0 1 5 0 1. .1 0 0_ "1 2 3" '2 1 3 5" 1 0 0 0 0 1 -1 15. 8. 0 1 1 0 0 0 3 0 0 1 0 0 Ü 0 5 -2 4 -2 -1 -1 -1 -3 -1 10. 12. 4 3 2 1 2 -4 3 1 14. -2 -4 -3 -6 1 2 -2 6 6 6 7 10 -3 Example 2.9 to show that we can convert

    [1  2 -4 -4  5]
    [0 -1 10  9 -5]
    [0  0  1  1 -1]
    [0  0  0  0 24]

into

    [ 1 2 -4 -4 5]
    [ 2 4  0  0 2]
    [ 2 3  2  1 5]
    [-1 1  3  6 5]

16. In general, what is the elementary row operation that "undoes" each of the three elementary row operations Ri <-> Rj, kRi, and Ri + kRj?

In Exercises 17 and 18, show that the given matrices are row equivalent and find a sequence of elementary row operations that will convert A into B.

17. A = 18. A = B 2 0 -1 1 1 0 -1 1 1 3 1 -1 3 5 1 2 2 0

19. What is wrong with the following "proof" that every matrix with at least two rows is row equivalent to a matrix with a zero row? Perform R2 + R1 and R1 + R2. Now rows 1 and 2 are identical. Now perform R2 - R1 to obtain a row of zeros in the second row.

20. What is the net effect of performing the following sequence of elementary row operations on a matrix (with at least two rows)?

    R2 + R1,  R1 - R2,  R2 + R1

21. Students frequently perform the following type of calculation to introduce a zero into a matrix:

    [3 1]   3R2 - 2R1   [3  1]
    [2 4]      ->       [0 10]

However, 3R2 - 2R1 is not an elementary row operation. Why not? Show how to achieve the same result using elementary row operations.

22. Consider the matrix A. Show that any of the three types of elementary row operations can be used to create a leading 1 at the top of the first column. Which do you prefer and why?

23. What is the rank of each of the matrices in Exercises 1-8?

24. What are the possible reduced row echelon forms of 3 x 3 matrices?

In Exercises 25-34, solve the given system of equations using either Gaussian or Gauss-Jordan elimination.

25. x1 + 2x2 - 3x3 = 9    26.
     x -  y +  z = 0
    -x + 3y +  z = 5
    3x +  y + 7z = 2

(Exercise 25 continues: 2x1 - x2 + x3 = 0, 4x1 - x2 + x3 = 4.)

27.  x1 - 3x2 - 2x3 = 0        28. 2w + 3x - y + 4z = 1
    -x1 + 2x2 +  x3 = 0            3w -  x      + z = 1
    2x1 + 4x2 + 6x3 = 0            3w - 4x + y -  z = 2

29. 2r +  s =  3
    4r +  s =  7
    2r + 5s = -1

30. — x, + 3x, 2x3 + 4x4 x, — 3x2 + 4x3 — 8x4 = 31* -\ x-i ~\~ X-) x-> &%a 2A1 1 6 6X1 + 2'" 3x4 + J*! — 2x3 — 4x5

32. √2 x +    y + 2z = 1
         √2 y - 3z = -√2
          -y + √2 z = 1

33. w + x + 2y + z =  1        34. a +  b +   c +   d =  4
    w - x -  y + z =  0            a + 2b +  3c +  4d = 10
        x +  y     = -1            a + 3b +  6c + 10d = 20
    w + x      + z =  2            a + 4b + 10c + 20d = 35

In Exercises 35-38, determine by inspection (i.e., without performing any calculations) whether a linear system with the given augmented matrix has a unique solution, infinitely many solutions, or no solution. Justify your answers.

35. 37. 0 0 1 0 1 3 1 0 1 1 2 5 6 9 10 2 1 1. 3 7 11 12 36. 38. 3-2 0 1 2 -3 .2 4 -6 12 3 4 6 5 4 3 7 7 7 7

39. Show that if ad - bc ≠ 0, then the system

    ax + by = r
    cx + dy = s

has a unique solution.

In Exercises 40-43, for what value(s) of k, if any, will the systems have (a) no solution, (b) a unique solution, and (c) infinitely many solutions?

40. kx + 2y =  3        41.  x + ky = 1
    2x - 4y = -6             kx +  y = 1

42.  x - 2y + 3z = 2      43.  x +  y + kz =  1
     x +  y +  z = k           x + ky +  z =  1
    2x -  y + 4z = k^2        kx +  y +  z = -2

44. Give examples of homogeneous systems of m linear equations in n variables with m = n and with m > n that have (a) infinitely many solutions and (b) a unique solution.

In Exercises 45 and 46, find the line of intersection of the given planes.

45. 3x + 2y + z = -1 and 2x - y + 4z = 5

46. 4x + y + z = 0 and 2x - y + 3z = 2

47. (a) Give an example of three planes that have a common line of intersection (Figure 2.4).
    (b) Give an example of three planes that intersect in pairs but have no common point of intersection (Figure 2.5).
    (c) Give an example of three planes, exactly two of which are parallel (Figure 2.6).
    (d) Give an example of three planes that intersect in a single point (Figure 2.7).

In Exercises 48 and 49, determine whether the lines x = p + su and x = q + tv intersect and, if they do, find their point of intersection.

48. p = [-1, 2, 1], q = [2, 2, 0], u = [1, 2, -1], v = [-1, 1, 0]

49. p = [3, 1, 0], q = [-1, 1, -1], u = [1, 0, 1], v = [2, 3, 1]

50. Let p = [1, 2, 3], u = [1, 1, 0], and v = [2, 1, ...]. Describe all points Q = (a, b, c) such that the line through Q with direction vector v intersects the line with equation x = p + su.

51. Recall that the cross product of vectors u and v is a vector u x v that is orthogonal to both u and v. (See Exploration: The Cross Product in Chapter 1.) If u = [u1, u2, u3] and v = [v1, v2, v3], show that there are infinitely many vectors x = [x, y, z] that simultaneously satisfy u · x = 0 and v · x = 0 and that all are multiples of

    u x v = [u2v3 - u3v2, u3v1 - u1v3, u1v2 - u2v1]

52. Let p = [1, 1, 0], q = [0, 1, -1], u = [2, -3, 1], and v = [0, 6, -1]. Show that the lines x = p + su and x = q + tv are skew lines. Find vector equations of a pair of parallel planes, one containing each line.

In Exercises 53-58, solve the systems of linear equations over the indicated Zp.

53. x + 2y = 1   over Z3        54. x + y     = 1   over Z2
    x +  y = 2                          y + z = 0
                                    x     + z = 1

55. x + y     = 1   over Z3     56. 3x + 2y = 1   over Z5
        y + z = 0                    x + 4y = 1
    x     + z = 1

57. 3x + 2y = 1   over Z7
     x + 4y = 1

58.  x1               + 4x4 = 1   over Z5
     x1 + 2x2 + 4x3         = 3
    2x1 + 2x2         +  x4 = 1
     x1        + 3x3        = 2

59. Prove the following corollary to the Rank Theorem: Let A be an m x n matrix with entries in Zp. Any consistent system of linear equations with coefficient matrix A has exactly p^(n - rank(A)) solutions over Zp.

60. When p is not prime, extra care is needed in solving a linear system (or, indeed, any equation) over Zp. Using Gaussian elimination, solve the following system over Z6. What complications arise?
    2x + 3y = 4
    4x + 3y = 2

Writing Project   A History of Gaussian Elimination

As noted in the biographical sketch of Gauss in this section, Gauss did not actually "invent" the method known as Gaussian elimination. It was known in some form as early as the third century B.C. and appears in the mathematical writings of cultures throughout Europe and Asia. Write a report on the history of elimination methods for solving systems of linear equations. What role did Gauss actually play in this history, and why is his name attached to the method?

1. S. C. Althoen and R. McLaughlin, Gauss-Jordan reduction: A brief history, American Mathematical Monthly 94 (1987), pp. 130-142.
2. Joseph F. Grcar, Mathematicians of Gaussian elimination, Notices of the AMS, Vol. 58, No. 6 (2011), pp. 782-792. (Available online at http://www.ams.org/notices/201106/index.html)
3. Roger Hart, The Chinese Roots of Linear Algebra (Baltimore: Johns Hopkins University Press, 2011).
4. Victor J. Katz, A History of Mathematics: An Introduction (Third Edition) (Reading, MA: Addison Wesley Longman, 2008).

Exploration: Lies My Computer Told Me (CAS)

Computers and calculators store real numbers in floating-point form. For example, 2001 is stored as 0.2001 x 10^4, and -0.00063 is stored as -0.63 x 10^-3. In general, the floating-point form of a number is +-M x 10^k, where k is an integer and the mantissa M is a (decimal) real number that satisfies 0.1 <= M < 1. The maximum number of decimal places that can be stored in the mantissa depends on the computer, calculator, or computer algebra system. If the maximum number of decimal places that can be stored is d, we say that there are d significant digits. Many calculators store 8 or 12 significant digits; computers can store more but still are subject to a limit. Any digits that are not stored are either omitted (in which case we say that the number has been truncated) or used to round the number to d significant digits.
For example, pi ≈ 3.141592654, and its floating-point form is 0.3141592654 x 10^1. In a computer that truncates to five significant digits, pi would be stored as 0.31415 x 10^1 (and displayed as 3.1415); a computer that rounds to five significant digits would store pi as 0.31416 x 10^1 (and display 3.1416). When the dropped digit is a solitary 5, the last remaining digit is rounded so that it becomes even. Thus, rounded to two significant digits, 0.735 becomes 0.74, while 0.725 becomes 0.72.

Whenever truncation or rounding occurs, a roundoff error is introduced, which can have a dramatic effect on the calculations. The more operations that are performed, the more the error accumulates. Sometimes, unfortunately, there is nothing we can do about this. This exploration illustrates this phenomenon with very simple systems of linear equations.

1. Solve the following system of linear equations exactly (that is, work with rational numbers throughout the calculations).

    x +          y = 0
    x + (801/800)y = 1

2. As a decimal, 801/800 = 1.00125, so, rounded to five significant digits, the system becomes

    x +       y = 0
    x + 1.0012y = 1

Using your calculator or CAS, solve this system, rounding the result of every calculation to five significant digits.

3. Solve the system two more times, rounding first to four significant digits and then to three significant digits. What happens?

4. Clearly, a very small roundoff error (less than or equal to 0.00125) can result in very large errors in the solution. Explain why geometrically. (Think about the graphs of the various linear systems you solved in Problems 1-3.)

Systems such as the one you just worked with are called ill-conditioned. They are extremely sensitive to roundoff errors, and there is not much we can do about it. We will encounter ill-conditioned systems again in Chapters 3 and 7.
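Problems 1-3 are easy to simulate. The sketch below is my own code (the helper names are made up, not from the text); it rounds every intermediate result to d significant digits and solves the system by elimination.

```python
def fl(x, d):
    """Round x to d significant digits (simulating a d-digit machine)."""
    return float(f"{x:.{d - 1}e}")

def solve_rounded(c, d):
    """Solve x + y = 0, x + c*y = 1 by elimination, rounding every
    intermediate result to d significant digits."""
    pivot = fl(c - 1.0, d)          # R2 - R1 leaves (c - 1)y = 1
    y = fl(1.0 / pivot, d)
    return fl(-y, d), y             # x = -y from the first equation

c = 801 / 800                       # = 1.00125; exact solution is (-800, 800)
print(solve_rounded(c, 14))         # essentially (-800.0, 800.0)
print(solve_rounded(fl(c, 5), 5))   # five digits: already tens of units off
# With three digits, fl(c, 3) == 1.0, the pivot becomes 0, and the rounded
# system x + y = 0, x + y = 1 has no solution at all.
```

Geometrically, the two lines are nearly parallel, so nudging one coefficient slides their intersection point a long way, which is exactly what "ill-conditioned" means.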
Here is another example to experiment with:

    4.552x + 7.083y = 1.931
    1.731x + 2.693y = 2.001

Play around with various numbers of significant digits to see what happens, starting with eight significant digits (if you can).

Exploration: Partial Pivoting

In Exploration: Lies My Computer Told Me, we saw that ill-conditioned linear systems can cause trouble when roundoff error occurs. In this exploration, you will discover another way in which linear systems are sensitive to roundoff error and see that very small changes in the coefficients can lead to huge inaccuracies in the solution. Fortunately, there is something that can be done to minimize or even eliminate this problem (unlike the problem with ill-conditioned systems).

1. (a) Solve the single linear equation 0.00021x = 1 for x.
   (b) Suppose your calculator can carry only four decimal places. The equation will be rounded to 0.0002x = 1. Solve this equation.

The difference between the answers in parts (a) and (b) can be thought of as the effect of an error of 0.00001 on the solution of the given equation.

2. Now extend this idea to a system of linear equations.
   (a) With Gaussian elimination, solve the linear system

        0.400x + 99.6y = 100
        75.3x - 45.3y = 30.0

   using three significant digits. Begin by pivoting on 0.400 and take each calculation to three significant digits. You should obtain the "solution" x = -1.00, y = 1.01. Check that the actual solution is x = 1.00, y = 1.00. This is a huge error: 200% in the x value! Can you discover what caused it?
   (b) Solve the system in part (a) again, this time interchanging the two equations (or, equivalently, the two rows of its augmented matrix) and pivoting on 75.3. Again, take each calculation to three significant digits. What is the solution this time?

The moral of the story is that, when using Gaussian or Gauss-Jordan elimination to obtain a numerical solution to a system of linear equations (i.e., a decimal approximation), you should choose the pivots with care.
Specifically, at each pivoting step, choose from among all possible pivots in a column the entry with the largest absolute value. Use row interchanges to bring this element into the correct position and use it to create zeros where needed in the column. This strategy is known as partial pivoting.

3. Solve the following systems by Gaussian elimination, first without and then with partial pivoting. Take each calculation to three significant digits. (The exact solutions are given.)

   (a) 0.001x + 0.995y = 1.00
       -10.2x + 1.00y = -50.0

       Exact solution: x = 5.00, y = 1.00

   (b) 10x - 7y       = 7
       -3x + 2.09y + 6z = 3.91
        5x -     y + 5z = 6

       Exact solution: x = 0.00, y = -1.00, z = 1.00

Abu Ja'far Muhammad ibn Musa al-Khwarizmi (c. 780-850) was a Persian mathematician whose book Hisab al-jabr w'al muqabalah (c. 825) described the use of Hindu-Arabic numerals and the rules of basic arithmetic. The second word of the book's title gives rise to the English word algebra, and the word algorithm is derived from al-Khwarizmi's name.

Exploration: Counting Operations: An Introduction to the Analysis of Algorithms

Gaussian and Gauss-Jordan elimination are examples of algorithms: systematic procedures designed to implement a particular task, in this case, the row reduction of the augmented matrix of a system of linear equations. Algorithms are particularly well suited to computer implementation, but not all algorithms are created equal. Apart from the speed, memory, and other attributes of the computer system on which they are running, some algorithms are faster than others. One measure of the so-called complexity of an algorithm (a measure of its efficiency, or ability to perform its task in a reasonable number of steps) is the number of basic operations it performs as a function of the number of variables that are input. Let's examine this proposition in the case of the two algorithms we have for solving a linear system: Gaussian and Gauss-Jordan elimination.
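Problem 2 can be reproduced directly. This sketch is my own code, not the book's: it rounds every arithmetic result to three significant digits, and running it with the rows in either order reproduces both the wrong and the correct answers.

```python
def fl(x, d=3):
    """Round x to d significant digits (the text's three-digit machine)."""
    return float(f"{x:.{d - 1}e}")

def eliminate(row1, row2):
    """Solve the 2x2 system [a b | e; c d | f] by Gaussian elimination,
    pivoting on row1's leading entry and rounding every arithmetic
    result to three significant digits."""
    a, b, e = row1
    c, d, f = row2
    b, e = fl(b / a), fl(e / a)     # normalize the pivot row
    d = fl(d - fl(c * b))           # sweep out column 1
    f = fl(f - fl(c * e))
    y = fl(f / d)                   # back substitution
    x = fl(e - fl(b * y))
    return x, y

r1 = (0.400, 99.6, 100.0)
r2 = (75.3, -45.3, 30.0)
print(eliminate(r1, r2))   # (-1.0, 1.01): tiny pivot, huge error
print(eliminate(r2, r1))   # (1.0, 1.0):   partial pivoting, correct
```

The failure mode is visible in the intermediate numbers: pivoting on 0.400 multiplies row 1 by roughly 188 before subtracting, so row 2's own entries are swamped and the three stored digits retain almost no information about them; pivoting on 75.3 keeps the multiplier small.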
For our purposes, the basic operations are multiplication and division; we will assume that all other operations are performed much more rapidly and can be ignored. (This is a reasonable assumption, but we will not attempt to justify it.) We will consider only systems of equations with square coefficient matrices, so, if the coefficient matrix is n x n, the number of input variables is n. Thus, our task is to find the number of operations performed by Gaussian and Gauss-Jordan elimination as a function of n. Furthermore, we will not worry about special cases that may arise, but rather establish the worst case that can arise: when the algorithm takes as long as possible. Since this will give us an estimate of the time it will take a computer to perform the algorithm (if we know how long it takes a computer to perform a single operation), we will denote the number of operations performed by an algorithm by T(n). We will typically be interested in T(n) for large values of n, so comparing this function for different algorithms will allow us to determine which will take less time to execute.

1. Consider the augmented matrix

    [A | b] = [ 2  4  6 |  8]
              [ 3  9  6 | 12]
              [-1  1 -1 |  1]

Count the number of operations required to bring [A | b] to the row echelon form

    [1  2  3 | 4]
    [0  1 -1 | 0]
    [0  0  1 | 1]

(By "operation" we mean a multiplication or a division.) Now count the number of operations needed to complete the back substitution phase of Gaussian elimination. Record the total number of operations.

2. Count the number of operations needed to perform Gauss-Jordan elimination; that is, to reduce [A | b] to its reduced row echelon form

    [1  0  0 | -1]
    [0  1  0 |  1]
    [0  0  1 |  1]

(where the zeros are introduced into each column immediately after the leading 1 is created in that column). What do your answers suggest about the relative efficiency of the two algorithms?

We will now attempt to analyze the algorithms in a general, systematic way.
Suppose the augmented matrix [A | b] arises from a linear system with n equations and n variables; thus, [A | b] is n x (n + 1):

    [A | b] = [a11 a12 ... a1n | b1]
              [a21 a22 ... a2n | b2]
              [ :    :        :   : ]
              [an1 an2 ... ann | bn]

We will assume that row interchanges are never needed; that is, we can always create a leading 1 from a pivot by dividing by the pivot.

3. (a) Show that n operations are needed to create the first leading 1 (dividing row 1 through by a11). (Why don't we need to count an operation for the creation of the leading 1 itself?) Now show that n operations are needed to obtain the first zero in column 1. (Why don't we need to count an operation for the creation of the zero itself?) When the first column has been "swept out," show that the total number of operations needed up to this point is n + (n - 1)n.

(b) Show that the total number of operations needed to reach the row echelon form

    [1 * ... *]
    [0 1 ... *]
    [:      : ]
    [0 0 ... 1]

is

    [n + (n - 1)n] + [(n - 1) + (n - 2)(n - 1)] + [(n - 2) + (n - 3)(n - 2)] + ... + [2 + 1*2] + 1

which simplifies to

    n^2 + (n - 1)^2 + ... + 2^2 + 1^2

(c) Show that the number of operations needed to complete the back substitution phase is

    1 + 2 + ... + (n - 1)

(d) Using summation formulas for the sums in parts (b) and (c) (see Exercises 51 and 52 in Section 2.4 and Appendix B), show that the total number of operations, T(n), performed by Gaussian elimination is

    T(n) = n^3/3 + n^2 - n/3

Since every polynomial function is dominated by its leading term for large values of the variable, we see that T(n) ≈ n^3/3 for large values of n.

4. Show that Gauss-Jordan elimination has T(n) ≈ n^3/2 total operations if we create zeros above and below the leading 1s as we go. (This shows that, for large systems of linear equations, Gaussian elimination is faster than this version of Gauss-Jordan elimination.)

Section 2.3   Spanning Sets and Linear Independence

The second of the three roads in our "trivium" is concerned with linear combinations of vectors.
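The closed form in Problem 3(d) can be checked mechanically. The sketch below is my own code (the function name is made up): it tallies the multiplications and divisions step by step, exactly as the exploration describes, and compares the total with n^3/3 + n^2 - n/3.

```python
from fractions import Fraction

def gaussian_ops(n):
    """Worst-case multiplications/divisions for Gaussian elimination
    plus back substitution on an n x (n + 1) augmented matrix."""
    ops = 0
    for m in range(n, 0, -1):   # m = size of the submatrix at each pivot
        ops += m                # normalize the pivot row: m divisions
        ops += (m - 1) * m      # sweep out the column: m mults per row below
    ops += n * (n - 1) // 2     # back substitution: 1 + 2 + ... + (n - 1)
    return ops

for n in (3, 10, 100):
    assert gaussian_ops(n) == Fraction(n**3, 3) + n**2 - Fraction(n, 3)
print(gaussian_ops(3), gaussian_ops(100))   # 17 343300
```

Note that gaussian_ops(3) = 17 is the answer you should have obtained by hand in Problem 1 for the 3 x 3 system.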
We have seen that we can view solving a system of linear equations as asking whether a certain vector is a linear combination of certain other vectors. We explore this idea in more detail in this section. It leads to some very important concepts, which we will encounter repeatedly in later chapters.

Spanning Sets of Vectors

We can now easily answer the question raised in Section 1.1: When is a given vector a linear combination of other given vectors?

Example 2.18

(a) Is the vector [1, 2, 3] a linear combination of the vectors [1, 0, 3] and [-1, 1, -3]?
(b) Is [2, 3, 4] a linear combination of the vectors [1, 0, 3] and [-1, 1, -3]?

Solution   (a) We want to find scalars x and y such that

    x[1, 0, 3] + y[-1, 1, -3] = [1, 2, 3]

Expanding, we obtain the system

     x -  y = 1
          y = 2
    3x - 3y = 3

whose augmented matrix is

    [1 -1 | 1]
    [0  1 | 2]
    [3 -3 | 3]

(Observe that the columns of the augmented matrix are just the given vectors; notice the order of the vectors, in particular, which vector is the constant vector.) The reduced echelon form of this matrix is

    [1 0 | 3]
    [0 1 | 2]
    [0 0 | 0]

(Verify this.) So the solution is x = 3, y = 2, and the corresponding linear combination is

    3[1, 0, 3] + 2[-1, 1, -3] = [1, 2, 3]

(b) Utilizing our observation in part (a), we obtain a linear system whose augmented matrix is

    [1 -1 | 2]
    [0  1 | 3]
    [3 -3 | 4]

which reduces to

    [1 0 |  5]
    [0 1 |  3]
    [0 0 | -2]

revealing that the system has no solution. Thus, in this case, [2, 3, 4] is not a linear combination of [1, 0, 3] and [-1, 1, -3].

The notion of a spanning set is intimately connected with the solution of linear systems. Look back at Example 2.18. There we saw that a system with augmented matrix [A | b] has a solution precisely when b is a linear combination of the columns of A. This is a general fact, summarized in the next theorem.

Theorem 2.4   A system of linear equations with augmented matrix [A | b] is consistent if and only if b is a linear combination of the columns of A.
Let's revisit Example 2.4, interpreting it in light of Theorem 2.4.

(a) The system

    x - y = 1
    x + y = 3

has the unique solution x = 2, y = 1. Thus,

    2[1, 1] + 1[-1, 1] = [1, 3]

See Figure 2.8(a).

(b) The system

     x -  y = 2
    2x - 2y = 4

has infinitely many solutions of the form x = 2 + t, y = t. This implies that

    (2 + t)[1, 2] + t[-1, -2] = [2, 4]

for all values of t. Geometrically, the vectors [1, 2], [-1, -2], and [2, 4] are all parallel and so all lie along the same line through the origin [see Figure 2.8(b)].

(c) The system

    x - y = 1
    x - y = 3

has no solutions, so there are no values of x and y that satisfy

    x[1, 1] + y[-1, -1] = [1, 3]

In this case, [1, 1] and [-1, -1] are parallel, but [1, 3] does not lie along the same line through the origin [see Figure 2.8(c)].

We will often be interested in the collection of all linear combinations of a given set of vectors.

Definition   If S = {v1, v2, ..., vk} is a set of vectors in R^n, then the set of all linear combinations of v1, v2, ..., vk is called the span of v1, v2, ..., vk and is denoted by span(v1, v2, ..., vk) or span(S). If span(S) = R^n, then S is called a spanning set for R^n.

Example 2.19

Show that R^2 = span([2, -1], [1, 3]).

Solution   We need to show that an arbitrary vector [a, b] can be written as a linear combination of [2, -1] and [1, 3]; that is, we must show that the equation x[2, -1] + y[1, 3] = [a, b] can always be solved for x and y (in terms of a and b), regardless of the values of a and b. The augmented matrix is

    [ 2 1 | a]
    [-1 3 | b]

and row reduction produces

    [ 2 1 | a]   R1 <-> R2   [-1 3 | b]   R2 + 2R1   [-1 3 |      b]
    [-1 3 | b]      ->       [ 2 1 | a]      ->      [ 0 7 | a + 2b]

at which point it is clear that the system has a (unique) solution. (Why?) If we continue the reduction, we find that

    x = (3a - b)/7   and   y = (a + 2b)/7

Thus, for any choice of a and b, we have

    ((3a - b)/7)[2, -1] + ((a + 2b)/7)[1, 3] = [a, b]

(Check this.)
Remark   It is also true that R^2 = span([2, -1], [1, 3], [5, 7]): if, given [a, b], we can find x and y such that x[2, -1] + y[1, 3] = [a, b], then we also have x[2, -1] + y[1, 3] + 0[5, 7] = [a, b]. In fact, any set of vectors that contains a spanning set for R^2 will also be a spanning set for R^2 (see Exercise 20).

The next example is an important (easy) case of a spanning set. We will encounter versions of this example many times.

Example 2.20

Let e1, e2, and e3 be the standard unit vectors in R^3. Then for any vector [x, y, z], we have

    [x, y, z] = x[1, 0, 0] + y[0, 1, 0] + z[0, 0, 1] = x e1 + y e2 + z e3

Thus, R^3 = span(e1, e2, e3). You should have no difficulty seeing that, in general, R^n = span(e1, e2, ..., en).

When the span of a set of vectors in R^n is not all of R^n, it is reasonable to ask for a description of the vectors' span.

Example 2.21

Find the span of [1, 0, 3] and [-1, 1, -3]. (See Example 2.18.)

Solution   Thinking geometrically, we can see that the set of all linear combinations of [1, 0, 3] and [-1, 1, -3] is just the plane through the origin with [1, 0, 3] and [-1, 1, -3] as direction vectors (Figure 2.9: two nonparallel vectors span a plane). The vector equation of this plane is

    [x, y, z] = s[1, 0, 3] + t[-1, 1, -3]

which is just another way of saying that [x, y, z] is in the span of [1, 0, 3] and [-1, 1, -3].

Suppose we want to obtain the general equation of this plane. There are several ways to proceed. One is to use the fact that the equation ax + by + cz = 0 must be satisfied by the points (1, 0, 3) and (-1, 1, -3) determined by the direction vectors. Substitution then leads to a system of equations in a, b, and c. (See Exercise 17.) Another method is to use the system of equations arising from the vector equation:

     s -  t = x
          t = y
    3s - 3t = z

If we row reduce the augmented matrix, we obtain

    [1 -1 | x]   R3 - 3R1   [1 -1 |      x]
    [0  1 | y]      ->      [0  1 |      y]
    [3 -3 | z]              [0  0 | z - 3x]

Now we know that this system is consistent, since [x, y, z] is in the span of [1, 0, 3] and [-1, 1, -3] by assumption.
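Theorem 2.4 turns span questions into consistency checks, which a machine can carry out in exact arithmetic. The sketch below is my own code (the helper names are made up): it tests whether [A | b] is consistent by comparing rank(A) with rank([A | b]), reproducing both parts of Example 2.18.

```python
from fractions import Fraction

def rank(rows):
    """Rank via row reduction with exact Fraction arithmetic."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        M[r] = [x / M[r][c] for x in M[r]]          # normalize pivot row
        for i in range(len(M)):
            if i != r and M[i][c] != 0:             # sweep out the column
                M[i] = [x - M[i][c] * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def in_span(cols, b):
    """Is b a linear combination of cols?  (Theorem 2.4: [A | b] is
    consistent iff rank(A) == rank([A | b]).)"""
    A = list(zip(*cols))                            # columns -> rows
    Ab = [row + (bi,) for row, bi in zip(A, b)]
    return rank([list(r) for r in A]) == rank([list(r) for r in Ab])

u, v = (1, 0, 3), (-1, 1, -3)
print(in_span([u, v], (1, 2, 3)))   # True  (Example 2.18(a))
print(in_span([u, v], (2, 3, 4)))   # False (Example 2.18(b))
```

The rank comparison is just a restatement of "no row [0 0 ... 0 | nonzero] appears," which is how inconsistency showed up in Example 2.18(b).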
So we must have z - 3x = 0 (or 3x - z = 0, in more standard form), giving us the general equation we seek.

Remark   A normal vector to the plane in this example is also given by the cross product

    [1, 0, 3] x [-1, 1, -3] = [-3, 0, 1]

Linear Independence

In Example 2.18, we found that

    3[1, 0, 3] + 2[-1, 1, -3] = [1, 2, 3]

Let's abbreviate this equation as 3u + 2v = w. The vector w "depends" on u and v in the sense that it is a linear combination of them. We say that a set of vectors is linearly dependent if one of them can be written as a linear combination of the others. Note that we also have u = -(2/3)v + (1/3)w and v = -(3/2)u + (1/2)w. To get around the question of which vector to express in terms of the rest, the formal definition is stated as follows:

Definition   A set of vectors v1, v2, ..., vk is linearly dependent if there are scalars c1, c2, ..., ck, at least one of which is not zero, such that

    c1v1 + c2v2 + ... + ckvk = 0

A set of vectors that is not linearly dependent is called linearly independent.

Remarks

• In the definition of linear dependence, the requirement that at least one of the scalars c1, c2, ..., ck must be nonzero allows for the possibility that some may be zero. In the example above, u, v, and w are linearly dependent, since 3u + 2v - w = 0 and, in fact, all of the scalars are nonzero. On the other hand,

    1[2, 6] - 2[1, 3] + 0[4, 1] = [0, 0]

so [2, 6], [1, 3], and [4, 1] are linearly dependent, since at least one (in fact, two) of the three scalars 1, -2, and 0 is nonzero. (Note that the actual dependence arises simply from the fact that the first two vectors are multiples.) (See Exercise 44.)

• Since 0v1 + 0v2 + ... + 0vk = 0 for any vectors v1, v2, ..., vk, linear dependence essentially says that the zero vector can be expressed as a nontrivial linear combination of v1, v2, ..., vk.
Thus, linear independence means that the zero vector can be expressed as a linear combination of v1, v2, ..., vk only in the trivial way: c1v1 + c2v2 + ... + ckvk = 0 only if c1 = 0, c2 = 0, ..., ck = 0.

The relationship between the intuitive notion of dependence and the formal definition is given in the next theorem. Happily, the two notions are equivalent!

Theorem 2.5   Vectors v1, v2, ..., vm in R^n are linearly dependent if and only if at least one of the vectors can be expressed as a linear combination of the others.

Proof   If one of the vectors, say v1, is a linear combination of the others, then there are scalars c2, ..., cm such that v1 = c2v2 + ... + cmvm. Rearranging, we obtain v1 - c2v2 - ... - cmvm = 0, which implies that v1, v2, ..., vm are linearly dependent, since at least one of the scalars (namely, the coefficient 1 of v1) is nonzero.

Conversely, suppose that v1, v2, ..., vm are linearly dependent. Then there are scalars c1, c2, ..., cm, not all zero, such that c1v1 + c2v2 + ... + cmvm = 0. Suppose c1 ≠ 0. Then

    c1v1 = -c2v2 - ... - cmvm

and we may multiply both sides by 1/c1 to obtain v1 as a linear combination of the other vectors:

    v1 = -(c2/c1)v2 - ... - (cm/c1)vm
Example 2.23 Determine whether the following sets of vectors are linearly independent: "o" 1 , 1 I" i (a) and _4_ 2. r o" (c) 1 1 , and 0 0, -1 1 (d) V 1 _o_ V 2 > _0_ V , and 0 .1. 1 1 -1 "l" , and 4 Solution In answering any question of this type, it is a good idea to see if you can determine by inspection whether one vector is a linear combination of the others. A little thought may save a lot of computation! (a) The only way two vectors can be linearly dependent is if one is a multiple of M w the other. (Why?) These two vectors are clearly not multiples, so they are linearly independent. (b) There is no obvious dependence relation here, so we try to find scalars q, c2, c3 such that 1 0 1 0 1 + c2 1 + c3 0 = 0 .0. _1_ .1. .0. The corresponding linear system is cx + c3 = 0 ci + c2 = 0 c2 + c3 = 0 and the augmented matrix is "l 0 1 o" 1 1 0 0 „0 1 1 0. Once again, we make the fundamental observation that the columns of the coefficient matrix are just the vectors in question! Section 2.3 Spanning Sets and Linear Independence The reduced row echelon form is "l 0 0 o" 0 1 0 0 _0 0 1 0_ 1» (check this), so q = 0, c2 = 0, c, = 0. Thus, the given vectors are linearly independent, (c) A little reflection reveals that 1 0 -1 0 -1 + 1 + 0 = 0 0 -1 1 0 B »■ so the three vectors are linearly dependent. [Set up a linear system as in part (b) to check this algebraically] (d) Once again, we observe no obvious dependence, so we proceed directly to reduce a homogeneous linear system whose augmented matrix has as its columns the given vectors: "l 1 1 o" "l 1 1 0~ R1 + R2 "l 0 3 O" R3-R2 2 1 4 0 -> 0 -1 2 0 -> 0 1 -2 0 _0 -1 2 0_ .0 -1 2 0_ .0 0 0 0. If we let the scalars be cx, c2, and c3, we have c, + 3c3 = 0 c2 — 2c3 = 0 from which we see that the system has infinitely many solutions. In particular, there must be a nonzero solution, so the given vectors are linearly dependent. If we continue, we can describe these solutions exactly: cx = — 3c3 and c2 = 2c3. 
Thus, for any nonzero value of c3, we have the linear dependence relation

    -3c3[1, 2, 0] + 2c3[1, 1, -1] + c3[1, 4, 2] = [0, 0, 0]

(Once again, check that this is correct.)

We summarize this procedure for testing for linear independence as a theorem.

Theorem 2.6   Let v1, v2, ..., vm be (column) vectors in R^n and let A be the n x m matrix [v1 v2 ... vm] with these vectors as its columns. Then v1, v2, ..., vm are linearly dependent if and only if the homogeneous linear system with augmented matrix [A | 0] has a nontrivial solution.

Proof   v1, v2, ..., vm are linearly dependent if and only if there are scalars c1, c2, ..., cm, not all zero, such that c1v1 + c2v2 + ... + cmvm = 0. By Theorem 2.4, this is equivalent to saying that the nonzero vector [c1, c2, ..., cm] is a solution of the system whose augmented matrix is [v1 v2 ... vm | 0].

Example 2.24

The standard unit vectors e1, e2, and e3 are linearly independent in R^3, since the system with augmented matrix [e1 e2 e3 | 0] is already in the reduced row echelon form

    [1 0 0 | 0]
    [0 1 0 | 0]
    [0 0 1 | 0]

and so clearly has only the trivial solution. In general, we see that e1, e2, ..., en will be linearly independent in R^n.

Performing elementary row operations on a matrix constructs linear combinations of the rows. We can use this fact to come up with another way to test vectors for linear independence.

Example 2.25

Consider the three vectors of Example 2.23(d) as row vectors:

    [1, 2, 0], [1, 1, -1], and [1, 4, 2]

We construct a matrix with these vectors as its rows and proceed to reduce it to echelon form.
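For three vectors in R^3, the column test of Theorem 2.6 can be run with a determinant (a tool developed in Chapter 4: the columns of a square matrix are linearly independent exactly when its determinant is nonzero). The sketch below is my own code, not the book's; it checks parts (b)-(d) of Example 2.23 and verifies the dependence relation just found.

```python
def det3(u, v, w):
    """Determinant of the 3x3 matrix whose columns are u, v, w."""
    (a, d, g), (b, e, h), (c, f, i) = u, v, w
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Example 2.23: (b) is independent; (c) and (d) are dependent.
print(det3((1, 1, 0), (0, 1, 1), (1, 0, 1)))     # 2 (nonzero: independent)
print(det3((1, -1, 0), (0, 1, -1), (-1, 0, 1)))  # 0 (dependent)
print(det3((1, 2, 0), (1, 1, -1), (1, 4, 2)))    # 0 (dependent)

# The dependence relation found for (d): -3*v1 + 2*v2 + 1*v3 = 0
v1, v2, v3 = (1, 2, 0), (1, 1, -1), (1, 4, 2)
assert all(-3 * a + 2 * b + c == 0 for a, b, c in zip(v1, v2, v3))
```

The determinant shortcut only applies when the number of vectors equals the dimension; in general one falls back on row reducing [A | 0] as the theorem prescribes.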
Each time a row changes, we denote the new row by adding a prime symbol:

[1 2  0]                  [1  2  0]                     [1  2  0]
[1 1 -1]  R2' = R2 - R1   [0 -1 -1]  R3'' = R3' + 2R2'  [0 -1 -1]
[1 4  2]  R3' = R3 - R1   [0  2  2]                     [0  0  0]

From this we see that

0 = R3'' = R3' + 2R2' = (R3 - R1) + 2(R2 - R1) = -3R1 + 2R2 + R3

or, in terms of the original vectors,

-3[1, 2, 0] + 2[1, 1, -1] + [1, 4, 2] = [0, 0, 0]

[Notice that this approach corresponds to taking c3 = 1 in the solution to Example 2.23(d).]

Thus, the rows of a matrix will be linearly dependent if elementary row operations can be used to create a zero row. We summarize this finding as follows:

Theorem 2.7 Let v1, v2, ..., vm be (row) vectors in R^n and let A be the m x n matrix with these vectors as its rows. Then v1, v2, ..., vm are linearly dependent if and only if rank(A) < m.

Proof Assume that v1, v2, ..., vm are linearly dependent. Then, by Theorem 2.5, at least one of the vectors can be written as a linear combination of the others. We relabel the vectors, if necessary, so that we can write vm = c1v1 + c2v2 + ... + c(m-1)v(m-1). Then the elementary row operations Rm - c1R1, Rm - c2R2, ..., Rm - c(m-1)R(m-1) applied to A will create a zero row in row m. Thus, rank(A) < m.

Conversely, assume that rank(A) < m. Then there is some sequence of row operations that will create a zero row. A successive substitution argument analogous to that used in Example 2.25 can be used to show that 0 is a nontrivial linear combination of v1, v2, ..., vm. Thus, v1, v2, ..., vm are linearly dependent.

In some situations, we can deduce that a set of vectors is linearly dependent without doing any work. One such situation is when the zero vector is in the set (as in Example 2.22). Another is when there are "too many" vectors to be independent. The following theorem summarizes this case. (We will see a sharper version of this result in Chapter 6.)

Theorem 2.8 Any set of m vectors in R^n is linearly dependent if m > n.
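Theorems 2.6-2.8 translate directly into a computable test: place the vectors in the columns of a matrix, row reduce, and compare the rank with the number of vectors. The following sketch (illustrative code, not part of the text) uses exact rational arithmetic so that no pivot is lost to rounding.

```python
from fractions import Fraction

def rref(M):
    """Reduced row echelon form of a matrix given as a list of rows."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue                       # no pivot in this column
        M[r], M[piv] = M[piv], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return M

def rank(M):
    return sum(any(x != 0 for x in row) for row in rref(M))

def independent(vectors):
    """Theorem 2.6 in matrix form: column vectors v1, ..., vm are
    linearly independent iff rank[v1 ... vm] = m."""
    A = [list(row) for row in zip(*vectors)]   # the vectors become columns
    return rank(A) == len(vectors)

print(independent([(1, 1, 0), (0, 1, 1), (1, 0, 1)]))   # True:  Example 2.23(b)
print(independent([(1, 2, 0), (1, 1, -1), (1, 4, 2)]))  # False: Example 2.23(d)
```

Applying `independent` to any three vectors in R^2 returns `False`, as Theorem 2.8 predicts, since a 2 x 3 matrix has rank at most 2.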
Proof Let v1, v2, ..., vm be (column) vectors in R^n and let A be the n x m matrix [v1 v2 ... vm] with these vectors as its columns. By Theorem 2.6, v1, v2, ..., vm are linearly dependent if and only if the homogeneous linear system with augmented matrix [A | 0] has a nontrivial solution. But, according to Theorem 2.3, this will always be the case if A has more columns than rows; that is the case here, since the number of columns m is greater than the number of rows n.

Example 2.26 The three given vectors in R^2 are linearly dependent, since there cannot be more than two linearly independent vectors in R^2. (Note that if we want to find the actual dependence relation among these three vectors, we must solve the homogeneous system whose coefficient matrix has the given vectors as columns. Do this!)

Exercises 2.3

In Exercises 1-6, determine if the vector v is a linear combination of the remaining vectors.

1. v - 2. v = I 2 , u2 = -1. .-1. 4" '-2 , U] = , u2 = -2. 1. Y" Y "o" 2 l .u2 = 1 _3_ _o_ .1. 1 1 0
3.2 n "1.0 3.4" 3. v = 2 >«i = 1 >u2 = 1 CAS 6. v = 2.0 .Ul = 0.4 >u2 = 1.4 3 0 1 .-2.6. L4.8_ .-6.4.
3" Y "o" 4. v = 2 l ,u2 = 1 .-1. .0. .1. -1.2 0.2 -1.0

In Exercises 7 and 8, determine if the vector b is in the span of the columns of the matrix A.

7. A = [1 2; 3 4]
8. A = [1 2 3; 5 6 7; 9 10 11]

9. Show that R^2 = span 4" 8 .12. "1* l" .1. - 1.
10. Show that R^2 = span - 3" 0" - 2 1 f
11. Show that R^3 = span "l" V "o" 0 ) l 1 { 1 0 1
12. Show that R^3 = span "l" "l" 2~ 1 J 2 1 { 0 3 1

In Exercises 13-16, describe the span of the given vectors (a) geometrically and (b) algebraically.

13. r 2l "-1 o'    14. 3" —i 2 .0. 4.
15. "l" 3~ - f o" 2 2 -1 .0.    16. 0 1 -1 1 0 1

17. The general equation of the plane that contains the points (1, 0, 3), (-1, 1, -3), and the origin is of the form ax + by + cz = 0. Solve for a, b, and c.

18. Prove that u, v, and w are all in span(u, v, w).

19. Prove that u, v, and w are all in span(u, u + v, u + v + w).
20. (a) Prove that if ux,..., um are vectors in W, S = {u,, u2>..., Ujfe}, and T = {u,,..,, uh uk+x,.... um), then span(S) C span(T). [Hint: Rephrase this question in terms of linear combinations.] (b) Deduce that if W = span(S), then W = span(T) also. 21. (a) Suppose that vector w is a linear combination of vectors u{,... ,uk and that each u, is a linear combination of vectors Vj,..., v,„. Prove that w is a linear combination of v^ ..., vm and therefore span(ui,..., Ujt) C span(vi,..., v,„). (b) In part (a), suppose in addition that each v, is also a linear combination of u1;. .., u^. Prove that spanCu^ ...,uk) = span(v1;..., v,„). (c) Use the result of part (b) to prove that p3 = span 1 1 1 0 1 1 0 0 1 [Hint: We know that R3 = span(e1( e2, e3).] Use the method of Example 2.23 and Theorem 2.6 to determine if the sets of vectors in Exercises 22-31 are linearly independent. If, for any of these, the answer can be determined by inspection (i.e., without calculation), state why. For any sets that are linearly dependent, find a dependence relationship among the vectors. 22. -1 2 3 2 3 24. 2 1 1 2 26. 28. 29. -2 3 7, -1 1 2 1 1 -1 1 0 4 -1 5 3 2 2 4 3 5 1 j 0 3 2 23. 25. 27. 1 1 1 1 j 2 -1 1 3 2 0 2 2 1 1 0 2 .3, .1. 3 6 0 4 7 0 .5 8 0 2' 3 1 -1 1 0 0 1 1 * -1 -1 1 30. " 3 -I I -1 -1 3 1 -1 31. 1 1 3 1 -1 -1 1 3 In Exercises 32-41, determine if the sets of vectors in the given exercise are linearly independent by converting the Section 2.4 Applications vectors to row vectors and using the method of Example 2.25 and Theorem 2.7. For any sets that are linearly dependent, find, a dependence relationship among the vectors. 32. Exercise 22 33. Exercise 23 34. Exercise 24 35. Exercise 25 36. Exercise 26 37. Exercise 27 38. Exercise 28 39. Exercise 29 40. Exercise 30 41. Exercise 31 42. (a) If the columns of an n X n matrix A are linearly in- dependent as vectors in W, what is the rank of A? Explain. 
(b) If the rows of an n x n matrix A are linearly independent as vectors in R^n, what is the rank of A? Explain.

43. (a) If vectors u, v, and w are linearly independent, will u + v, v + w, and u + w also be linearly independent? Justify your answer.
(b) If vectors u, v, and w are linearly independent, will u - v, v - w, and u - w also be linearly independent? Justify your answer.

44. Prove that two vectors are linearly dependent if and only if one is a scalar multiple of the other. [Hint: Separately consider the case where one of the vectors is 0.]

45. Give a "row vector proof" of Theorem 2.8.

46. Prove that every subset of a linearly independent set is linearly independent.

47. Suppose that S = {v1, ..., vk, v} is a set of vectors in some R^n and that v is a linear combination of v1, ..., vk. If S' = {v1, ..., vk}, prove that span(S) = span(S'). [Hint: Exercise 21(b) is helpful here.]

48. Let {v1, ..., vk} be a linearly independent set of vectors in R^n, and let v be a vector in R^n. Suppose that v = c1v1 + c2v2 + ... + ckvk with c1 ≠ 0. Prove that {v, v2, ..., vk} is linearly independent.

2.4 Applications

There are too many applications of systems of linear equations to do them justice in a single section. This section will introduce a few applications, to illustrate the diverse settings in which they arise.

Allocation of Resources

A great many applications of systems of linear equations involve allocating limited resources subject to a set of constraints.

Example 2.27 A biologist has placed three strains of bacteria (denoted I, II, and III) in a test tube, where they will feed on three different food sources (A, B, and C). Each day 2300 units of A, 800 units of B, and 1500 units of C are placed in the test tube, and each bacterium consumes a certain number of units of each food per day, as shown in Table 2.2. How many bacteria of each strain can coexist in the test tube and consume all of the food?
Table 2.2

            Bacteria    Bacteria    Bacteria
            Strain I    Strain II   Strain III
Food A         2           2           4
Food B         1           2           0
Food C         1           3           1

Solution Let x1, x2, and x3 be the numbers of bacteria of strains I, II, and III, respectively. Since each of the x1 bacteria of strain I consumes 2 units of A per day, strain I consumes a total of 2x1 units per day. Similarly, strains II and III consume a total of 2x2 and 4x3 units of food A daily. Since we want to use up all of the 2300 units of A, we have the equation

2x1 + 2x2 + 4x3 = 2300

Likewise, we obtain equations corresponding to the consumption of B and C:

x1 + 2x2       =  800
x1 + 3x2 + x3  = 1500

Thus, we have a system of three linear equations in three variables. Row reduction of the corresponding augmented matrix gives

[2 2 4 | 2300]        [1 0 0 | 100]
[1 2 0 |  800]  -->   [0 1 0 | 350]
[1 3 1 | 1500]        [0 0 1 | 350]

Therefore, x1 = 100, x2 = 350, and x3 = 350. The biologist should place 100 bacteria of strain I and 350 of each of strains II and III in the test tube if she wants all the food to be consumed.

Example 2.28 Repeat Example 2.27, using the data on daily consumption of food (units per day) shown in Table 2.3. Assume this time that 1500 units of A, 3000 units of B, and 4500 units of C are placed in the test tube each day.

Table 2.3

            Bacteria    Bacteria    Bacteria
            Strain I    Strain II   Strain III
Food A         1           1           1
Food B         1           2           3
Food C         1           3           5

Solution Let x1, x2, and x3 again be the numbers of bacteria of each type. The augmented matrix for the resulting linear system and the corresponding reduced echelon form are

[1 1 1 | 1500]        [1 0 -1 |    0]
[1 2 3 | 3000]  -->   [0 1  2 | 1500]
[1 3 5 | 4500]        [0 0  0 |    0]

We see that in this case we have more than one solution, given by

x1      -  x3 =    0
     x2 + 2x3 = 1500

Letting x3 = t, we obtain x1 = t, x2 = 1500 - 2t, and x3 = t. In any applied problem, we must be careful to interpret solutions properly. Certainly the number of bacteria cannot be negative. Therefore, t >= 0 and 1500 - 2t >= 0.
The latter inequality implies that t <= 750, so we have 0 <= t <= 750. Presumably the number of bacteria must be a whole number, so there are exactly 751 values of t that satisfy the inequality. Thus, our 751 solutions are of the form

[x1]   [    t    ]   [   0]       [ 1]
[x2] = [1500 - 2t] = [1500] + t * [-2]
[x3]   [    t    ]   [   0]       [ 1]

one for each integer value of t such that 0 <= t <= 750. (So, although mathematically this system has infinitely many solutions, physically there are only finitely many.)

Balancing Chemical Equations

When a chemical reaction occurs, certain molecules (the reactants) combine to form new molecules (the products). A balanced chemical equation is an algebraic equation that gives the relative numbers of reactants and products in the reaction and has the same number of atoms of each type on the left- and right-hand sides. The equation is usually written with the reactants on the left, the products on the right, and an arrow in between to show the direction of the reaction.

For example, for the reaction in which hydrogen gas (H2) and oxygen (O2) combine to form water (H2O), a balanced chemical equation is

2H2 + O2 -> 2H2O

indicating that two molecules of hydrogen combine with one molecule of oxygen to form two molecules of water. Observe that the equation is balanced, since there are four hydrogen atoms and two oxygen atoms on each side. Note that there will never be a unique balanced equation for a reaction, since any positive integer multiple of a balanced equation will also be balanced. For example,

6H2 + 3O2 -> 6H2O

is also balanced. Therefore, we usually look for the simplest balanced equation for a given reaction. While trial and error will often work in simple examples, the process of balancing chemical equations really involves solving a homogeneous system of linear equations, so we can use the techniques we have developed to remove the guesswork.

Example 2.29 The combustion of ammonia (NH3) in oxygen produces nitrogen (N2) and water. Find a balanced chemical equation for this reaction.
Solution If we denote the numbers of molecules of ammonia, oxygen, nitrogen, and water by w, x, y, and z, respectively, then we are seeking an equation of the form

wNH3 + xO2 -> yN2 + zH2O

Comparing the numbers of nitrogen, hydrogen, and oxygen atoms in the reactants and products, we obtain three linear equations:

Nitrogen:  w = 2y
Hydrogen: 3w = 2z
Oxygen:   2x = z

Rewriting these equations in standard form gives us a homogeneous system of three linear equations in four variables. [Notice that Theorem 2.3 guarantees that such a system will have (infinitely many) nontrivial solutions.] We reduce the corresponding augmented matrix by Gauss-Jordan elimination.

 w      - 2y      = 0        [1 0 -2  0 | 0]        [1 0 0 -2/3 | 0]
3w           - 2z = 0   -->  [3 0  0 -2 | 0]  -->   [0 1 0 -1/2 | 0]
     2x      -  z = 0        [0 2  0 -1 | 0]        [0 0 1 -1/3 | 0]

Thus, w = (2/3)z, x = (1/2)z, and y = (1/3)z. The smallest positive value of z that will produce integer values for all four variables is the least common denominator of the fractions 2/3, 1/2, and 1/3, namely 6, which gives w = 4, x = 3, y = 2, and z = 6. Therefore, the balanced chemical equation is

4NH3 + 3O2 -> 2N2 + 6H2O

Network Analysis

Many practical situations give rise to networks: transportation networks, communications networks, and economic networks, to name a few. Of particular interest are the possible flows through networks. For example, vehicles flow through a network of roads, information flows through a data network, and goods and services flow through an economic network. For us, a network will consist of a finite number of nodes (also called junctions or vertices) connected by a series of directed edges known as branches or arcs. Each branch will be labeled with a flow that represents the amount of some commodity that can flow along or through that branch in the indicated direction. (Think of cars traveling along a network of one-way streets.)

Figure 2.10 Flow at a node: f1 + f2 = 50
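Balancing computations like the one in Example 2.29 are mechanical enough to hand to a computer: row reduce the homogeneous system exactly, set the free variable to 1, and clear denominators. A sketch in Python (illustrative code, not from the text; it assumes, as in the example, that z is the lone free variable):

```python
from fractions import Fraction
from math import lcm

def rref(M):
    """Reduced row echelon form over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return M

# Coefficient matrix of the homogeneous system for
# w NH3 + x O2 -> y N2 + z H2O, variables ordered (w, x, y, z):
#   N: w - 2y = 0,   H: 3w - 2z = 0,   O: 2x - z = 0
A = rref([[1, 0, -2, 0],
          [3, 0, 0, -2],
          [0, 2, 0, -1]])
# Each row now reads  (leading variable) - (fraction)*z = 0; set z = 1.
z = Fraction(1)
w, x, y = (-row[3] * z for row in A)            # 2/3, 1/2, 1/3
# Clear denominators to get the smallest positive integer solution.
m = lcm(*(v.denominator for v in (w, x, y, z)))
coeffs = [int(v * m) for v in (w, x, y, z)]
print(coeffs)  # [4, 3, 2, 6], i.e. 4 NH3 + 3 O2 -> 2 N2 + 6 H2O
```

Using `Fraction` rather than floating point matters here: the least common denominator that produces the integer coefficients can only be read off from exact fractions.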
The fundamental rule governing flow through a network is conservation of flow: At each node, the flow in equals the flow out. Figure 2.10 shows a portion of a network, with two branches entering a node and two leaving. The conservation of flow rule implies that the total incoming flow, f1 + f2 units, must match the total outgoing flow, 20 + 30 units. Thus, we have the linear equation f1 + f2 = 50 corresponding to this node. We can analyze the flow through an entire network by constructing such equations and solving the resulting system of linear equations.

Example 2.30 Describe the possible flows through the network of water pipes shown in Figure 2.11, where flow is measured in liters per minute.

Figure 2.11

Solution At each node, we write out the equation that represents the conservation of flow there. We then rewrite each equation with the variables on the left and the constant on the right, to get a linear system in standard form.

Node A:  15 = f1 + f4         f1      + f4 = 15
Node B:  f1 = f2 + 10         f1 - f2      = 10
Node C:  f2 + f3 + 5 = 30          f2 + f3 = 25
Node D:  f4 + 20 = f3               f3 - f4 = 20

Using Gauss-Jordan elimination, we reduce the augmented matrix:

[1  0 0  1 | 15]        [1 0 0  1 | 15]
[1 -1 0  0 | 10]  -->   [0 1 0  1 |  5]
[0  1 1  0 | 25]        [0 0 1 -1 | 20]
[0  0 1 -1 | 20]        [0 0 0  0 |  0]

(Check this.) We see that there is one free variable, f4, so we have infinitely many solutions. Setting f4 = t and expressing the leading variables in terms of f4, we obtain

f1 = 15 - t
f2 =  5 - t
f3 = 20 + t
f4 =      t

These equations describe all possible flows and allow us to analyze the network. For example, we see that if we control the flow on branch AD so that t = 5 L/min, then the other flows are f1 = 10, f2 = 0, and f3 = 25. We can do even better: We can find the minimum and maximum possible flows on each branch. Each of the flows must be nonnegative. Examining the first and second equations in turn, we see that t <= 15 (otherwise f1 would be negative) and t <= 5 (otherwise f2 would be negative).
The second of these inequalities is more restrictive than the first, so we must use it. The third equation contributes no further restrictions on our parameter t, so we have deduced that 0 <= t <= 5. Combining this result with the four equations, we see that

10 <= f1 <= 15,   0 <= f2 <= 5,   20 <= f3 <= 25,   0 <= f4 <= 5

Electrical Networks

[The echelon form of the augmented matrix for Example 2.31 is]

[1 0 1 |  4]
[0 1 4 | 16]
[0 0 1 |  3]

Hence, the currents are I1 = 1 amp, I2 = 4 amps, and I3 = 3 amps.

Remark In some electrical networks, the currents may have fractional values or may even be negative. A negative value simply means that the current in the corresponding branch flows in the direction opposite that shown on the network diagram.

Example 2.32 The network shown in Figure 2.14 has a single power source A and five resistors. Find the currents I, I1, ..., I5. This is an example of what is known in electrical engineering as a Wheatstone bridge circuit.

Figure 2.14 A bridge circuit

Solution Kirchhoff's current law gives the following equations at the four nodes:

Node B:  I - I1 - I4 = 0
Node C:  I1 - I2 - I3 = 0
Node D:  I2 + I5 - I = 0
Node E:  I3 + I4 - I5 = 0

For the three basic circuits, the voltage law gives

Circuit ABEDA:  I4 + 2I5 = 10
Circuit BCEB:   2I1 + 2I3 - I4 = 0
Circuit CDEC:   I2 - 2I3 - 2I5 = 0

(Observe that branch DAB has no resistor and therefore no voltage drop; thus, there is no I term in the equation for circuit ABEDA. Note also that we had to change signs three times because we went "against the current." This poses no problem, since we will let the sign of the answer determine the direction of current flow.)

We now have a system of seven equations in six variables. Row reduction gives

[1 0 0 0 0 0 |  7]
[0 1 0 0 0 0 |  3]
[0 0 1 0 0 0 |  4]
[0 0 0 1 0 0 | -1]
[0 0 0 0 1 0 |  4]
[0 0 0 0 0 1 |  3]
[0 0 0 0 0 0 |  0]

(Use your calculator or CAS to check this.) Thus, the solution (in amps) is I = 7, I1 = I5 = 3, I2 = I4 = 4, and I3 = -1.
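The bridge-circuit system can also be checked numerically. In the sketch below the three voltage-law equations are taken from the text, while the current-law equations at nodes B, C, and E are reconstructed from the figure, so treat them as an assumption; the redundant fourth node equation is omitted. Exact rational elimination recovers the stated currents.

```python
from fractions import Fraction

def solve(aug):
    """Gauss-Jordan elimination on an augmented matrix [A | b]; returns
    the unique solution, assuming the system has one."""
    M = [[Fraction(x) for x in row] for row in aug]
    n = len(M[0]) - 1
    where, r = {}, 0
    for c in range(n):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        where[c] = r
        r += 1
    return [M[where[c]][n] for c in range(n)]

# Unknowns ordered (I, I1, I2, I3, I4, I5).
system = [
    [1, -1,  0,  0, -1,  0,  0],   # node B:  I  = I1 + I4   (assumed)
    [0,  1, -1, -1,  0,  0,  0],   # node C:  I1 = I2 + I3   (assumed)
    [0,  0,  0,  1,  1, -1,  0],   # node E:  I3 + I4 = I5   (assumed)
    [0,  0,  0,  0,  1,  2, 10],   # circuit ABEDA: I4 + 2*I5 = 10
    [0,  2,  0,  2, -1,  0,  0],   # circuit BCEB:  2*I1 + 2*I3 - I4 = 0
    [0,  0,  1, -2,  0, -2,  0],   # circuit CDEC:  I2 - 2*I3 - 2*I5 = 0
]
currents = [int(x) for x in solve(system)]
print(currents)  # [7, 3, 4, -1, 4, 3]
```

The negative value of I3 falls out automatically, matching the remark that a negative current simply flows opposite to the marked direction.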
The significance of the negative value here is that the current through branch CE is flowing in the direction opposite that marked on the diagram.

Remark There is only one power source in this example, so the single 10-volt battery sends a current of 7 amps through the network. If we substitute these values into Ohm's law, E = RI, we get 10 = 7R, or R = 10/7. Thus, the entire network behaves as if there were a single 10/7-ohm resistor. This value is called the effective resistance (R_eff) of the network.

Linear Economic Models

An economy is a very complex system with many interrelationships among the various sectors of the economy and the goods and services they produce and consume. Determining optimal prices and levels of production subject to desired economic goals requires sophisticated mathematical models. Linear algebra has proven to be a powerful tool in developing and analyzing such economic models. In this section, we introduce two models based on the work of Harvard economist Wassily Leontief in the 1930s. His methods, often referred to as input-output analysis, are now standard tools in mathematical economics and are used by cities, corporations, and entire countries for economic planning and forecasting. We begin with a simple example.

Wassily Leontief (1906-1999) was born in St. Petersburg, Russia. He studied at the University of Leningrad and received his Ph.D. from the University of Berlin. He emigrated to the United States in 1931, teaching at Harvard University and later at New York University. In 1932, Leontief began compiling data for the monumental task of conducting an input-output analysis of the United States economy, the results of which were published in 1941. He was also an early user of computers, which he needed to solve the large-scale linear systems in his models. For his pioneering work, Leontief was awarded the Nobel Prize in Economics in 1973.

Example 2.33
The economy of a region consists of three industries, or sectors: service, electricity, and oil production. For simplicity, we assume that each industry produces a single commodity (goods or services) in a given year and that income (output) is generated from the sale of this commodity. Each industry purchases commodities from the other industries, including itself, in order to generate its output. No commodities are purchased from outside the region and no output is sold outside the region. Furthermore, for each industry, we assume that production exactly equals consumption (output equals input, income equals expenditure). In this sense, this is a closed economy that is in equilibrium. Table 2.4 summarizes how much of each industry's output is consumed by each industry.

Table 2.4

                              Produced by (output)
                         Service   Electricity   Oil
Consumed     Service       1/4        1/3        1/2
by (input)   Electricity   1/4        1/3        1/4
             Oil           1/2        1/3        1/4

From the first column of the table, we see that the service industry consumes 1/4 of its own output, electricity consumes another 1/4, and the oil industry uses 1/2 of the service industry's output. The other two columns have similar interpretations. Notice that the sum of each column is 1, indicating that all of the output of each industry is consumed.

Let x1, x2, and x3 denote the annual output (income) of the service, electricity, and oil industries, respectively, in millions of dollars. Since consumption corresponds to expenditure, the service industry spends (1/4)x1 on its own commodity, (1/3)x2 on electricity, and (1/2)x3 on oil. This means that the service industry's total annual expenditure is (1/4)x1 + (1/3)x2 + (1/2)x3. Since the economy is in equilibrium, the service industry's expenditure must equal its annual income x1. This gives the first of the following equations; the other two equations are obtained by analyzing the expenditures of the electricity and oil industries.
Service:      (1/4)x1 + (1/3)x2 + (1/2)x3 = x1
Electricity:  (1/4)x1 + (1/3)x2 + (1/4)x3 = x2
Oil:          (1/2)x1 + (1/3)x2 + (1/4)x3 = x3

Rearranging each equation, we obtain a homogeneous system of linear equations, which we then solve. (Check this!)

-(3/4)x1 + (1/3)x2 + (1/2)x3 = 0        [-3/4  1/3  1/2 | 0]        [1 0   -1 | 0]
 (1/4)x1 - (2/3)x2 + (1/4)x3 = 0   -->  [ 1/4 -2/3  1/4 | 0]  -->   [0 1 -3/4 | 0]
 (1/2)x1 + (1/3)x2 - (3/4)x3 = 0        [ 1/2  1/3 -3/4 | 0]        [0 0    0 | 0]

Setting x3 = t, we find that x1 = t and x2 = (3/4)t. Thus, we see that the relative outputs of the service, electricity, and oil industries need to be in the ratios x1 : x2 : x3 = 4 : 3 : 4 for the economy to be in equilibrium.

Remarks

• The last example illustrates what is commonly called the Leontief closed model.
• Since output corresponds to income, we can also think of x1, x2, and x3 as the prices of the three commodities.

We now modify the model in Example 2.33 to accommodate an open economy, one in which there is an external as well as an internal demand for the commodities that are produced. Not surprisingly, this version is called the Leontief open model.

Example 2.34 Consider the three industries of Example 2.33 but with consumption given by Table 2.5. We see that, of the commodities produced by the service industry, 20% are consumed by the service industry, 40% by the electricity industry, and 10% by the oil industry. Thus, only 70% of the service industry's output is consumed by this economy. The implication of this calculation is that there is an excess of output (income) over input (expenditure) for the service industry. We say that the service industry is productive. Likewise, the oil industry is productive but the electricity industry is nonproductive. (This is reflected in the fact that the sums of the first and third columns are less than 1 but the sum of the second column is equal to 1.) The excess output may be applied to satisfy an external demand.
Table 2.5

                              Produced by (output)
                         Service   Electricity   Oil
Consumed     Service      0.20        0.50      0.10
by (input)   Electricity  0.40        0.20      0.20
             Oil          0.10        0.30      0.30

For example, suppose there is an annual external demand (in millions of dollars) for 10, 10, and 30 from the service, electricity, and oil industries, respectively. Then, equating expenditures (internal demand and external demand) with income (output), we obtain the following equations:

              output   internal demand           external demand
Service:        x1  =  0.2x1 + 0.5x2 + 0.1x3  +  10
Electricity:    x2  =  0.4x1 + 0.2x2 + 0.2x3  +  10
Oil:            x3  =  0.1x1 + 0.3x2 + 0.3x3  +  30

Rearranging, we obtain the following linear system and augmented matrix:

 0.8x1 - 0.5x2 - 0.1x3 = 10        [ 0.8 -0.5 -0.1 | 10]
-0.4x1 + 0.8x2 - 0.2x3 = 10   -->  [-0.4  0.8 -0.2 | 10]
-0.1x1 - 0.3x2 + 0.7x3 = 30        [-0.1 -0.3  0.7 | 30]

Row reduction yields

[1 0 0 | 61.74]
[0 1 0 | 63.04]
[0 0 1 | 78.70]

(to two decimal places), from which we see that the service, electricity, and oil industries must have an annual production of $61.74, $63.04, and $78.70 (million), respectively, in order to meet both the internal and external demand for their commodities. We will revisit these models in Section 3.7.

Finite Linear Games

There are many situations in which we must consider a physical system that has only a finite number of states. Sometimes these states can be altered by applying certain processes, each of which produces finitely many outcomes. For example, a light bulb can be on or off and a switch can change the state of the light bulb from on to off and vice versa. Digital systems that arise in computer science are often of this type. More frivolously, many computer games feature puzzles in which a certain device must be manipulated by various switches to produce a desired outcome. The finiteness of such situations is perfectly suited to analysis using modular arithmetic, and often linear systems over some Z_p play a role. Problems involving this type of situation are often called finite linear games.
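Such problems reduce to linear systems over Z_p, and Gauss-Jordan elimination carries over with one change: division becomes multiplication by a modular inverse. A generic sketch (illustrative code, not part of the text), demonstrated on a small 3 x 3 system over Z_3 of the kind that arises below:

```python
def solve_mod_p(A, t, p):
    """Gauss-Jordan elimination over Z_p (p prime) on [A | t].
    Returns (one solution, list of free-variable indices), or None if
    the system is inconsistent."""
    M = [[x % p for x in row] + [b % p] for row, b in zip(A, t)]
    n = len(A[0])
    pivots, r = {}, 0
    for c in range(n):
        piv = next((i for i in range(r, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        inv = pow(M[r][c], -1, p)                  # modular inverse
        M[r] = [x * inv % p for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c]:
                M[i] = [(u - M[i][c] * v) % p for u, v in zip(M[i], M[r])]
        pivots[c] = r
        r += 1
    if any(all(x == 0 for x in row[:n]) and row[n] for row in M):
        return None                                # row [0 ... 0 | nonzero]
    x = [0] * n
    for c, i in pivots.items():
        x[c] = M[i][n]                             # free variables stay 0
    return x, [c for c in range(n) if c not in pivots]

print(solve_mod_p([[1, 1, 0], [1, 1, 1], [0, 1, 1]], [0, 1, 2], 3))
# ([2, 1, 1], []): a unique solution, with no free variables
```

A return value of `None` signals a row of the form [0 ... 0 | nonzero], exactly the inconsistency pattern that rules out a target configuration in the examples that follow.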
Example 2.35 A row of five lights is controlled by five switches. Each switch changes the state (on or off) of the light directly above it and the states of the lights immediately adjacent to the left and right. For example, if the first and third lights are on, as in Figure 2.15(a), then pushing switch A changes the state of the system to that shown in Figure 2.15(b). If we next push switch C, then the result is the state shown in Figure 2.15(c).

Figure 2.15

Suppose that initially all the lights are off. Can we push the switches in some order so that only the first, third, and fifth lights will be on? Can we push the switches in some order so that only the first light will be on?

Solution The on/off nature of this problem suggests that binary notation will be helpful and that we should work with Z2. Accordingly, we represent the states of the five lights by a vector in Z2^5, where 0 represents off and 1 represents on. Thus, for example, the vector [0, 1, 1, 0, 0] corresponds to Figure 2.15(b).

We may also use vectors in Z2^5 to represent the action of each switch. If a switch changes the state of a light, the corresponding component is a 1; otherwise, it is 0. With this convention, the actions of the five switches are given by

a = [1, 1, 0, 0, 0],  b = [1, 1, 1, 0, 0],  c = [0, 1, 1, 1, 0],  d = [0, 0, 1, 1, 1],  e = [0, 0, 0, 1, 1]

The situation depicted in Figure 2.15(a) corresponds to the initial state s = [1, 0, 1, 0, 0] followed by the push of switch A. It is the vector sum (in Z2^5)

s + a = [1, 0, 1, 0, 0] + [1, 1, 0, 0, 0] = [0, 1, 1, 0, 0]

Observe that this result agrees with Figure 2.15(b). Starting with any initial configuration s, suppose we push the switches in the order A, C, D, A, C, B. This corresponds to the vector sum s + a + c + d + a + c + b. But in Z2^5, addition is commutative, so we have

s + a + c + d + a + c + b = s + 2a + b + 2c + d = s + b + d

where we have used the fact that 2 = 0 in Z2. Thus, we would achieve the same result by pushing only B and D, and the order does not matter.
(Check that this is correct.) Hence, in this example, we do not need to push any switch more than once. So, to see if we can achieve a target configuration t starting from an initial configuration s, we need to determine whether there are scalars x1, ..., x5 in Z2 such that

s + x1a + x2b + ... + x5e = t

In other words, we need to solve (if possible) the linear system over Z2 that corresponds to the vector equation

x1a + x2b + ... + x5e = t - s = t + s

In this case, s = 0 and our first target configuration is t = [1, 0, 1, 0, 1]. The augmented matrix of this system has the given vectors as columns:

[1 1 0 0 0 | 1]
[1 1 1 0 0 | 0]
[0 1 1 1 0 | 1]
[0 0 1 1 1 | 0]
[0 0 0 1 1 | 1]

We reduce it over Z2 to obtain

[1 0 0 0 1 | 0]
[0 1 0 0 1 | 1]
[0 0 1 0 0 | 1]
[0 0 0 1 1 | 1]
[0 0 0 0 0 | 0]

Thus, x5 is a free variable. Hence, there are exactly two solutions (corresponding to x5 = 0 and x5 = 1). Solving for the other variables in terms of x5, we obtain

x1 = x5,  x2 = 1 + x5,  x3 = 1,  x4 = 1 + x5

So, when x5 = 0 and x5 = 1, we have the solutions

[x1, x2, x3, x4, x5] = [0, 1, 1, 1, 0]  and  [x1, x2, x3, x4, x5] = [1, 0, 1, 0, 1]

respectively. (Check that these both work.)

Similarly, in the second case, we have t = [1, 0, 0, 0, 0]. The augmented matrix reduces as follows:

[1 1 0 0 0 | 1]        [1 0 0 0 1 | 1]
[1 1 1 0 0 | 0]        [0 1 0 0 1 | 0]
[0 1 1 1 0 | 0]  -->   [0 0 1 0 0 | 1]
[0 0 1 1 1 | 0]        [0 0 0 1 1 | 1]
[0 0 0 1 1 | 0]        [0 0 0 0 0 | 1]

showing that there is no solution in this case; that is, it is impossible to start with all of the lights off and turn only the first light on.

Example 2.35 shows the power of linear algebra. Even though we might have found out by trial and error that there was no solution, checking all possible ways to push the switches would have been extremely tedious. We might also have missed the fact that no switch need ever be pushed more than once.

Example 2.36 Consider a row with only three lights, each of which can be off, light blue, or dark blue.
Below the lights are three switches, A, B, and C, each of which changes the states of particular lights to the next state, in the order shown in Figure 2.16 (off to light blue to dark blue and back to off). Switch A changes the states of the first two lights, switch B all three lights, and switch C the last two lights. If all three lights are initially off, is it possible to push the switches in some order so that the lights are off, light blue, and dark blue, in that order (as in Figure 2.17)?

Figure 2.16
Figure 2.17

Solution Whereas Example 2.35 involved Z2, this one clearly (is it clear?) involves Z3. Accordingly, the switches correspond to the vectors

a = [1, 1, 0],  b = [1, 1, 1],  c = [0, 1, 1]

in Z3^3, and the final configuration we are aiming for is t = [0, 1, 2]. (Off is 0, light blue is 1, and dark blue is 2.) We wish to find scalars x1, x2, x3 in Z3 such that

x1a + x2b + x3c = t

(where xi represents the number of times the ith switch is pushed). This equation gives rise to the augmented matrix [a b c | t], which reduces over Z3 as follows:

[1 1 0 | 0]        [1 0 0 | 2]
[1 1 1 | 1]  -->   [0 1 0 | 1]
[0 1 1 | 2]        [0 0 1 | 1]

Hence, there is a unique solution: x1 = 2, x2 = 1, x3 = 1. In other words, we must push switch A twice and the other two switches once each. (Check this.)

Exercises 2.4

Allocation of Resources

1. Suppose that, in Example 2.27, 400 units of food A, 600 units of B, and 600 units of C are placed in the test tube each day and the data on daily food consumption by the bacteria (in units per day) are as shown in Table 2.6. How many bacteria of each strain can coexist in the test tube and consume all of the food?

Table 2.6

            Bacteria    Bacteria    Bacteria
            Strain I    Strain II   Strain III
Food A         1           2           0
Food B         2           1           1
Food C         1           1           2

2. Suppose that in Example 2.27, 400 units of food A, 500 units of B, and 600 units of C are placed in the test tube each day and the data on daily food consumption by the bacteria (in units per day) are as shown in Table 2.7.
How many bacteria of each strain can coexist in the test tube and consume all of the food?

Table 2.7

            Bacteria    Bacteria    Bacteria
            Strain I    Strain II   Strain III
Food A         1           2           0
Food B         2           1           3
Food C         1           1           1

3. A florist offers three sizes of flower arrangements containing roses, daisies, and chrysanthemums. Each small arrangement contains one rose, three daisies, and three chrysanthemums. Each medium arrangement contains two roses, four daisies, and six chrysanthemums. Each large arrangement contains four roses, eight daisies, and six chrysanthemums. One day, the florist noted that she used a total of 24 roses, 50 daisies, and 48 chrysanthemums in filling orders for these three types of arrangements. How many arrangements of each type did she make?

4. (a) In your pocket you have some nickels, dimes, and quarters. There are 20 coins altogether and exactly twice as many dimes as nickels. The total value of the coins is $3.00. Find the number of coins of each type.
(b) Find all possible combinations of 20 coins (nickels, dimes, and quarters) that will make exactly $3.00.

5. A coffee merchant sells three blends of coffee. A bag of the house blend contains 300 grams of Colombian beans and 200 grams of French roast beans. A bag of the special blend contains 200 grams of Colombian beans, 200 grams of Kenyan beans, and 100 grams of French roast beans. A bag of the gourmet blend contains 100 grams of Colombian beans, 200 grams of Kenyan beans, and 200 grams of French roast beans. The merchant has on hand 30 kilograms of Colombian beans, 15 kilograms of Kenyan beans, and 25 kilograms of French roast beans. If he wishes to use up all of the beans, how many bags of each type of blend can be made?

6.
Redo Exercise 5, assuming that the house blend contains 300 grams of Colombian beans, 50 grams of Kenyan beans, and 150 grams of French roast beans and the gourmet blend contains 100 grams of Colombian beans, 350 grams of Kenyan beans, and 50 grams of French roast beans. This time the merchant has on hand 30 kilograms of Colombian beans, 15 kilograms of Kenyan beans, and 15 kilograms of French roast beans. Suppose one bag of the house blend produces a profit of $0.50, one bag of the special blend produces a profit of $1.50, and one bag of the gourmet blend produces a profit of $2.00. How many bags of each type should the merchant prepare if he wants to use up all of the beans and maximize his profit? What is the maximum profit?

Balancing Chemical Equations

In Exercises 7-14, balance the chemical equation for each reaction.

7. FeS2 + O2 -> Fe2O3 + SO2
8. CO2 + H2O -> C6H12O6 + O2 (This reaction takes place when a green plant converts carbon dioxide and water to glucose and oxygen during photosynthesis.)
9. C4H10 + O2 -> CO2 + H2O (This reaction occurs when butane, C4H10, burns in the presence of oxygen to form carbon dioxide and water.)
10. C7H6O2 + O2 -> H2O + CO2
11. C5H11OH + O2 -> H2O + CO2 (This equation represents the combustion of amyl alcohol.)
12. HClO4 + P4O10 -> H3PO4 + Cl2O7
13. Na2CO3 + C + N2 -> NaCN + CO
14. C2H2Cl4 + Ca(OH)2 -> C2HCl3 + CaCl2 + H2O

Network Analysis

15. Figure 2.18 shows a network of water pipes with flows measured in liters per minute.
(a) Set up and solve a system of linear equations to find the possible flows.
(b) If the flow through AB is restricted to 5 L/min, what will the flows through the other two branches be?
(c) What are the minimum and maximum possible flows through each branch?
(d) We have been assuming that flow is always positive. What would negative flow mean, assuming we allowed it? Give an illustration for this example.

[Figure 2.18: a network of water pipes through nodes A, B, and C, with inflows of 20 and 10 L/min marked.]

16.
The downtown core of Gotham City consists of one-way streets, and the traffic flow has been measured at each intersection. For the city block shown in Figure 2.19, the numbers represent the average numbers of vehicles per minute entering and leaving intersections A, B, C, and D during business hours.
(a) Set up and solve a system of linear equations to find the possible flows f1, ..., f4.
(b) If traffic is regulated on CD so that f4 = 10 vehicles per minute, what will the average flows on the other streets be?
(c) What are the minimum and maximum possible flows on each street?
(d) How would the solution change if all of the directions were reversed?

[Figure 2.19: one-way streets through intersections A, B, C, and D, with entering and leaving flows of 10, 20, 15, 15, and 5 vehicles per minute marked.]

17. A network of irrigation ditches is shown in Figure 2.20, with flows measured in thousands of liters per day.
(a) Set up and solve a system of linear equations to find the possible flows f1, ....
(b) Suppose DC is closed. What range of flow will need to be maintained through DB?
(c) From Figure 2.20 it is clear that DB cannot be closed. (Why not?) How does your solution in part (a) show this?
(d) From your solution in part (a), determine the minimum and maximum flows through DB.

[Figure 2.20: a network of irrigation ditches through nodes A, B, C, and D, with an inflow of 100 thousand liters per day at A.]

18. (a) Set up and solve a system of linear equations to find the possible flows in the network shown in Figure 2.21.
(b) Is it possible for f1 = 100 and f6 = 150? [Answer this question first with reference to your solution in part (a) and then directly from Figure 2.21.]
(c) If f4 = 0, what will the range of flow be on each of the other branches?

[Figure 2.21: a network through nodes A-F with external flows of 100, 150, and 200 and internal flows f1 through f7.]

Electrical Networks

For Exercises 19 and 20, determine the currents for the given electrical networks.

19. [Figure: a circuit with currents I1, I2, I3, batteries of 8 volts and 13 volts, and resistors of 1 ohm, 1 ohm, and 4 ohms.]

20. [Figure: a circuit with currents I1, I2, I3, batteries of 5 volts and 8 volts, and resistors of 4 ohms, 1 ohm, and 2 ohms.]

21.
(a) Find the currents I, I1, ..., I5 in the bridge circuit in Figure 2.22.
(b) Find the effective resistance of this network.
(c) Can you change the resistance in branch BC (but leave everything else unchanged) so that the current through branch CE becomes 0?

[Figure 2.22: a bridge circuit, including a 2 ohm resistor.]

22. The networks in parts (a) and (b) of Figure 2.23 show two resistors coupled in series and in parallel, respectively. We wish to find a general formula for the effective resistance of each network—that is, find Reff such that E = Reff I.
(a) Show that the effective resistance Reff of a network with two resistors coupled in series [Figure 2.23(a)] is given by

Reff = R1 + R2

(b) Show that the effective resistance Reff of a network with two resistors coupled in parallel [Figure 2.23(b)] is given by

1/Reff = 1/R1 + 1/R2

[Figure 2.23: Resistors in series and in parallel.]

Linear Economic Models

23. Consider a simple economy with just two industries: farming and manufacturing. Farming consumes 1/2 of the food and 1/3 of the manufactured goods. Manufacturing consumes 1/2 of the food and 2/3 of the manufactured goods. Assuming the economy is closed and in equilibrium, find the relative outputs of the farming and manufacturing industries.

24. Suppose the coal and steel industries form a closed economy. Every $1 produced by the coal industry requires $0.30 of coal and $0.70 of steel. Every $1 produced by the steel industry requires $0.80 of coal and $0.20 of steel. Find the annual production (output) of coal and steel if the total annual production is $20 million.

25. A painter, a plumber, and an electrician enter into a cooperative arrangement in which each of them agrees to work for himself/herself and the other two for a total of 10 hours per week according to the schedule shown in Table 2.8. For tax purposes, each person must establish a value for his/her services.
They agree to do this so that they each come out even—that is, so that the total amount paid out by each person equals the amount he/she receives. What hourly rate should each person charge if the rates are all whole numbers between $30 and $60 per hour?

Table 2.8
                          Supplier
                  Painter   Plumber   Electrician
Consumer
   Painter           2         1           5
   Plumber           4         5           1
   Electrician       4         4           4

26. Four neighbors, each with a vegetable garden, agree to share their produce. One will grow beans (B), one will grow lettuce (L), one will grow tomatoes (T), and one will grow zucchini (Z). Table 2.9 shows what fraction of each crop each neighbor will receive. What prices should the neighbors charge for their crops if each person is to break even and the lowest-priced crop has a value of $50?

Table 2.9
                    Producer
               B      L      T      Z
Consumer
   B           0     1/4    1/8    1/6
   L          1/2    1/4    1/4    1/6
   T          1/4    1/4    1/2    1/3
   Z          1/4    1/4    1/8    1/3

27. Suppose the coal and steel industries form an open economy. Every $1 produced by the coal industry requires $0.15 of coal and $0.20 of steel. Every $1 produced by the steel industry requires $0.25 of coal and $0.10 of steel. Suppose that there is an annual outside demand for $45 million of coal and $124 million of steel.
(a) How much should each industry produce to satisfy the demands?
(b) If the demand for coal decreases by $5 million per year while the demand for steel increases by $6 million per year, how should the coal and steel industries adjust their production?

28. In Gotham City, the departments of Administration (A), Health (H), and Transportation (T) are interdependent. For every dollar's worth of services they produce, each department uses a certain amount of the services produced by the other departments and itself, as shown in Table 2.10. Suppose that, during the year, other city departments require $1 million in Administrative services, $1.2 million in Health services, and $0.8 million in Transportation services.
What does the annual dollar value of the services produced by each department need to be in order to meet the demands?

Table 2.10
                      Department
                 A        H        T
Buy    A       $0.20     0.10     0.20
       H        0.10     0.10     0.20
       T        0.20     0.40     0.30

Finite Linear Games

29. (a) In Example 2.35, suppose all the lights are initially off. Can we push the switches in some order so that only the second and fourth lights will be on?
(b) Can we push the switches in some order so that only the second light will be on?

30. (a) In Example 2.35, suppose the fourth light is initially on and the other four lights are off. Can we push the switches in some order so that only the second and fourth lights will be on?
(b) Can we push the switches in some order so that only the second light will be on?

31. In Example 2.35, describe all possible configurations of lights that can be obtained if we start with all the lights off.

32. (a) In Example 2.36, suppose that all of the lights are initially off. Show that it is possible to push the switches in some order so that the lights are off, dark blue, and light blue, in that order.
(b) Show that it is possible to push the switches in some order so that the lights are light blue, off, and light blue, in that order.
(c) Prove that any configuration of the three lights can be achieved.

33. Suppose the lights in Example 2.35 can be off, light blue, or dark blue and the switches work as described in Example 2.36. (That is, the switches control the same lights as in Example 2.35 but cycle through the colors as in Example 2.36.) Show that it is possible to start with all of the lights off and push the switches in some order so that the lights are dark blue, light blue, dark blue, light blue, and dark blue, in that order.

34. For Exercise 33, describe all possible configurations of lights that can be obtained, starting with all the lights off.

CAS 35. Nine squares, each one either black or white, are arranged in a 3 x 3 grid.
Figure 2.24 shows one possible arrangement. When touched, each square changes its own state and the states of some of its neighbors (black -> white and white -> black). Figure 2.25 shows how the state changes work. (Touching the square whose number is circled causes the states of the squares marked * to change.) The object of the game is to turn all nine squares black. [Exercises 35 and 36 are adapted from puzzles that can be found in the interactive CD-ROM game The Seventh Guest (Trilobyte Software/Virgin Games, 1992).]

[Figure 2.24: The nine squares puzzle]
[Figure 2.25: State changes for the nine squares puzzle]

(a) If the initial configuration is the one shown in Figure 2.24, show that the game can be won and describe a winning sequence of moves.
(b) Prove that the game can always be won, no matter what the initial configuration.

36. Consider a variation on the nine squares puzzle. The game is the same as that described in Exercise 35 except that there are three possible states for each square: white, gray, or black. The squares change as shown in Figure 2.25, but now the state changes follow the cycle white -> gray -> black -> white. Show how the winning all-black configuration can be achieved from the initial configuration shown in Figure 2.26.

[Figure 2.26: The nine squares puzzle with more states]

Miscellaneous Problems

In Exercises 37-53, set up and solve an appropriate system of linear equations to answer the questions.

37. Grace is three times as old as Hans, but in 5 years she will be twice as old as Hans is then. How old are they now?

38. The sum of Annie's, Bert's, and Chris's ages is 60. Annie is older than Bert by the same number of years that Bert is older than Chris. When Bert is as old as Annie is now, Annie will be three times as old as Chris is now. What are their ages?
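Age puzzles like Exercise 37 translate directly into small linear systems. A quick sketch of the setup (the `solve2` helper here is illustrative, not part of the text):

```python
from fractions import Fraction as F

def solve2(a, b, c, d, e, f):
    """Solve  a x + b y = e,  c x + d y = f  by Cramer's rule (assumes det != 0)."""
    det = a * d - b * c
    return F(e * d - b * f, det), F(a * f - e * c, det)

# Exercise 37: Grace's age g and Hans's age h satisfy
#   g = 3h           ->  g - 3h = 0
#   g + 5 = 2(h + 5) ->  g - 2h = 5
g, h = solve2(1, -3, 1, -2, 0, 5)
print(g, h)   # 15 5
```

So Grace is 15 and Hans is 5; in five years they will be 20 and 10, as required.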
The preceding two problems are typical of those found in popular books of mathematical puzzles. However, they have their origins in antiquity. A Babylonian clay tablet that survives from about 300 B.C. contains the following problem.

39. There are two fields whose total area is 1800 square yards. One field produces grain at the rate of 2/3 bushel per square yard; the other field produces grain at the rate of 1/2 bushel per square yard. If the total yield is 1100 bushels, what is the size of each field?

Over 2000 years ago, the Chinese developed methods for solving systems of linear equations, including a version of Gaussian elimination that did not become well known in Europe until the 19th century. (There is no evidence that Gauss was aware of the Chinese methods when he developed what we now call Gaussian elimination. However, it is clear that the Chinese knew the essence of the method, even though they did not justify its use.) The following problem is taken from the Chinese text Jiuzhang suanshu (Nine Chapters on the Mathematical Art), written during the early Han Dynasty, about 200 B.C.

40. There are three types of corn. Three bundles of the first type, two of the second, and one of the third make 39 measures. Two bundles of the first type, three of the second, and one of the third make 34 measures. And one bundle of the first type, two of the second, and three of the third make 26 measures. How many measures of corn are contained in one bundle of each type?

41. Describe all possible values of a, b, c, and d that will make each of the following a valid addition table. [Problems 41-44 are based on the article "An Application of Matrix Theory" by Paul Glaister in The Mathematics Teacher, 85 (1992), pp. 220-223.]

(a)  +   a   b        (b)  +   a   b
     c   2   3             c   3   6
     d   4   5             d   4   5

42. What conditions on w, x, y, and z will guarantee that we can find a, b, c, and d so that the following is a valid addition table?

     +   a   b
     c   w   x
     d   y   z

43.
Describe all possible values of a, b, c, d, e, and f that will make each of the following a valid addition table.

(a)  +   a   b   c        (b)  +   a   b   c
     d   3   2   1             d   1   2   3
     e   5   4   3             e   3   4   5
     f   4   3   1             f   4   5   6

44. Generalizing Exercise 42, find conditions on the entries of a 3 x 3 addition table that will guarantee that we can solve for a, b, c, d, e, and f as previously.

45. From elementary geometry we know that there is a unique straight line through any two points in a plane. Less well known is the fact that there is a unique parabola through any three noncollinear points in a plane. For each set of points below, find a parabola with an equation of the form y = ax^2 + bx + c that passes through the given points. (Sketch the resulting parabola to check the validity of your answer.)
(a) (0, 1), (-1, 4), and (2, 1)
(b) (-3, 1), (-2, 2), and (-1, 5)

46. Through any three noncollinear points there also passes a unique circle. Find the circles (whose general equations are of the form x^2 + y^2 + ax + by + c = 0) that pass through the sets of points in Exercise 45. (To check the validity of your answer, find the center and radius of each circle and draw a sketch.)

The process of adding rational functions (ratios of polynomials) by placing them over a common denominator is the analogue of adding rational numbers. The reverse process of taking a rational function apart by writing it as a sum of simpler rational functions is useful in several areas of mathematics; for example, it arises in calculus when we need to integrate a rational function and in discrete mathematics when we use generating functions to solve recurrence relations. The decomposition of a rational function as a sum of partial fractions leads to a system of linear equations. In Exercises 47-50, find the partial fraction decomposition of the given form. (The capital letters denote constants.)

47. (3x + 1)/(x^2 + 2x - 3) = A/(x - 1) + B/(x + 3)

48.
(x^2 - 3x + 3)/(x^3 + 2x^2 + x) = A/x + B/(x + 1) + C/(x + 1)^2

49. (x - 1)/((x + 1)(x^2 + 1)(x^2 + 4)) = A/(x + 1) + (Bx + C)/(x^2 + 1) + (Dx + E)/(x^2 + 4)

50. (x^3 + x + 1)/(x(x - 1)(x^2 + x + 1)(x^2 + 1)^3)
      = A/x + B/(x - 1) + (Cx + D)/(x^2 + x + 1) + (Ex + F)/(x^2 + 1) + (Gx + H)/(x^2 + 1)^2 + (Ix + J)/(x^2 + 1)^3

Following are two useful formulas for the sums of powers of consecutive natural numbers:

1 + 2 + ... + n = n(n + 1)/2

and

1^2 + 2^2 + ... + n^2 = n(n + 1)(2n + 1)/6

The validity of these formulas for all values of n >= 1 (or even n >= 0) can be established using mathematical induction (see Appendix B). One way to make an educated guess as to what the formulas are, though, is to observe that we can rewrite the two formulas above as

(1/2)n^2 + (1/2)n    and    (1/3)n^3 + (1/2)n^2 + (1/6)n

respectively. This leads to the conjecture that the sum of pth powers of the first n natural numbers is a polynomial of degree p + 1 in the variable n.

51. Assuming that 1 + 2 + ... + n = an^2 + bn + c, find a, b, and c by substituting three values for n and thereby obtaining a system of linear equations in a, b, and c.

52. Assume that 1^2 + 2^2 + ... + n^2 = an^3 + bn^2 + cn + d. Find a, b, c, and d. [Hint: It is legitimate to use n = 0. What is the left-hand side in that case?]

53. Show that 1^3 + 2^3 + ... + n^3 = (n(n + 1)/2)^2.

Vignette: The Global Positioning System

The Global Positioning System (GPS) is used in a variety of situations for determining geographical locations. The military, surveyors, airlines, shipping companies, and hikers all make use of it. GPS technology is becoming so commonplace that some automobiles, cellular phones, and various handheld devices are now equipped with it. The basic idea of GPS is a variant on three-dimensional triangulation: A point on Earth's surface is uniquely determined by knowing its distances from three other points.
Here the point we wish to determine is the location of the GPS receiver, the other points are satellites, and the distances are computed using the travel times of radio signals from the satellites to the receiver.

We will assume that Earth is a sphere on which we impose an xyz-coordinate system with Earth centered at the origin and with the positive z-axis running through the north pole and fixed relative to Earth. For simplicity, let's take one unit to be equal to the radius of Earth. Thus Earth's surface becomes the unit sphere with equation x^2 + y^2 + z^2 = 1. Time will be measured in hundredths of a second. GPS finds distances by knowing how long it takes a radio signal to get from one point to another. For this we need to know the speed of light, which is approximately equal to 0.47 (Earth radii per hundredth of a second).

Let's imagine that you are a hiker lost in the woods at point (x, y, z) at some time t. You don't know where you are, and furthermore, you have no watch, so you don't know what time it is. However, you have your GPS device, and it receives simultaneous signals from four satellites, giving their positions and times as shown in Table 2.11. (Distances are measured in Earth radii and time in hundredths of a second past midnight.)

Table 2.11  Satellite Data

Satellite        Position           Time
    1       (1.11, 2.55, 2.14)      1.29
    2       (2.87, 0.00, 1.43)      1.31
    3       (0.00, 1.08, 2.29)      2.75
    4       (1.54, 1.01, 1.23)      4.06

This application is based on the article "An Underdetermined Linear System for GPS" by Dan Kalman in The College Mathematics Journal, 33 (2002), pp. 384-390. For a more in-depth treatment of the ideas introduced here, see G. Strang and K. Borre, Linear Algebra, Geodesy, and GPS (Wellesley-Cambridge Press, MA, 1997).

Let (x, y, z) be your position, and let t be the time when the signals arrive. The goal is to solve for x, y, z, and t. Your distance from Satellite 1 can be computed as follows.
The signal, traveling at a speed of 0.47 Earth radii per hundredth of a second, was sent at time 1.29 and arrived at time t, so it took t - 1.29 hundredths of a second to reach you. Distance equals velocity multiplied by (elapsed) time, so

d = 0.47(t - 1.29)

We can also express d in terms of (x, y, z) and the satellite's position (1.11, 2.55, 2.14) using the distance formula:

d = sqrt((x - 1.11)^2 + (y - 2.55)^2 + (z - 2.14)^2)

Combining these results leads to the equation

(x - 1.11)^2 + (y - 2.55)^2 + (z - 2.14)^2 = 0.47^2 (t - 1.29)^2        (1)

Expanding, simplifying, and rearranging, we find that Equation (1) becomes

2.22x + 5.10y + 4.28z - 0.57t = x^2 + y^2 + z^2 - 0.22t^2 + 11.95

Similarly, we can derive a corresponding equation for each of the other three satellites. We end up with a system of four equations in x, y, z, and t:

2.22x + 5.10y + 4.28z - 0.57t = x^2 + y^2 + z^2 - 0.22t^2 + 11.95
5.74x         + 2.86z - 0.58t = x^2 + y^2 + z^2 - 0.22t^2 +  9.90
        2.16y + 4.58z - 1.21t = x^2 + y^2 + z^2 - 0.22t^2 +  4.74
3.08x + 2.02y + 2.46z - 1.79t = x^2 + y^2 + z^2 - 0.22t^2 +  1.26

These are not linear equations, but the nonlinear terms are the same in each equation. If we subtract the first equation from each of the other three equations, we obtain a linear system:

 3.52x - 5.10y - 1.42z - 0.01t =  -2.05
-2.22x - 2.94y + 0.30z - 0.64t =  -7.21
 0.86x - 3.08y - 1.82z - 1.22t = -10.69

The augmented matrix row reduces as

[ 3.52  -5.10  -1.42  -0.01 |  -2.05 ]         [ 1  0  0  0.36 | 2.97 ]
[-2.22  -2.94   0.30  -0.64 |  -7.21 ]  --->   [ 0  1  0  0.03 | 0.81 ]
[ 0.86  -3.08  -1.82  -1.22 | -10.69 ]         [ 0  0  1  0.79 | 5.91 ]

from which we see that

x = 2.97 - 0.36t
y = 0.81 - 0.03t        (2)
z = 5.91 - 0.79t

with t free.
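The back-substitution that comes next can be checked numerically. This sketch recovers the quadratic in t from Equation (1) by sampling; small differences from the text's rounded coefficients (0.54, 6.65, 20.32) are rounding artifacts:

```python
import math

# Parametric solution (2): x, y, z in terms of the free variable t.
def residual(t):
    x = 2.97 - 0.36 * t
    y = 0.81 - 0.03 * t
    z = 5.91 - 0.79 * t
    # Equation (1): squared distance to Satellite 1 minus (0.47(t - 1.29))^2
    return (x - 1.11)**2 + (y - 2.55)**2 + (z - 2.14)**2 - 0.47**2 * (t - 1.29)**2

# residual(t) is exactly a quadratic a t^2 + b t + c; recover a, b, c by sampling.
c = residual(0)
a = (residual(1) + residual(-1)) / 2 - c
b = (residual(1) - residual(-1)) / 2
disc = math.sqrt(b * b - 4 * a * c)
roots = sorted([(-b - disc) / (2 * a), (-b + disc) / (2 * a)])
print([round(r, 2) for r in roots])   # two candidate times, near the text's 5.60 and 6.74
```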
Substituting these equations into (1), we obtain

(2.97 - 0.36t - 1.11)^2 + (0.81 - 0.03t - 2.55)^2 + (5.91 - 0.79t - 2.14)^2 = 0.47^2 (t - 1.29)^2

which simplifies to the quadratic equation

0.54t^2 - 6.65t + 20.32 = 0

There are two solutions:

t = 6.74    and    t = 5.60

Substituting into (2), we find that the first solution corresponds to (x, y, z) = (0.55, 0.61, 0.56) and the second solution to (x, y, z) = (0.96, 0.65, 1.46). The second solution is clearly not on the unit sphere (Earth), so we reject it. The first solution produces x^2 + y^2 + z^2 = 0.99, so we are satisfied that, within acceptable roundoff error, we have located your coordinates as (0.55, 0.61, 0.56).

In practice, GPS takes significantly more factors into account, such as the fact that Earth's surface is not exactly spherical, so additional refinements are needed involving such techniques as least squares approximation (see Chapter 7). In addition, the results of the GPS calculation are converted from rectangular (Cartesian) coordinates into latitude and longitude, an interesting exercise in itself and one involving yet other branches of mathematics.

2.5 Iterative Methods for Solving Linear Systems

The direct methods for solving linear systems, using elementary row operations, lead to exact solutions in many cases but are subject to errors due to roundoff and other factors, as we have seen. The third road in our "trivium" takes us down quite a different path indeed. In this section, we explore methods that proceed iteratively by successively generating sequences of vectors that approach a solution to a linear system. In many instances (such as when the coefficient matrix is sparse—that is, contains many zero entries), iterative methods can be faster and more accurate than direct methods. Also, iterative methods can be stopped whenever the approximate solution they generate is sufficiently accurate.
In addition, iterative methods often benefit from inaccuracy: Roundoff error can actually accelerate their convergence toward a solution.

We will explore two iterative methods for solving linear systems: Jacobi's method and a refinement of it, the Gauss-Seidel method. In all examples, we will be considering linear systems with the same number of variables as equations, and we will assume that there is a unique solution. Our interest is in finding this solution using iterative methods.

Example 2.37

Consider the system

7x1 -  x2 =  5
3x1 - 5x2 = -7

[Margin note: Carl Gustav Jacobi (1804-1851) was a German mathematician who made important contributions to many fields of mathematics and physics, including geometry, number theory, analysis, mechanics, and fluid dynamics. Although much of his work was in applied mathematics, Jacobi believed in the importance of doing mathematics for its own sake. A fine teacher, he held positions at the Universities of Berlin and Königsberg and was one of the most famous mathematicians in Europe.]

Jacobi's method begins with solving the first equation for x1 and the second equation for x2, to obtain

x1 = (5 + x2)/7
x2 = (7 + 3x1)/5        (1)

We now need an initial approximation to the solution. It turns out that it does not matter what this initial approximation is, so we might as well take x1 = 0, x2 = 0. We use these values in Equations (1) to get new values of x1 and x2:

x1 = (5 + 0)/7 = 5/7 ≈ 0.714
x2 = (7 + 3(0))/5 = 7/5 = 1.400

Now we substitute these values into (1) to get

x1 = (5 + 1.400)/7 ≈ 0.914
x2 = (7 + 3(0.714))/5 ≈ 1.829

(written to three decimal places). We repeat this process (using the old values of x1 and x2 to get the new values of x1 and x2), producing the sequence of approximations given in Table 2.12.

Table 2.12
n     0     1      2      3      4      5      6
x1    0   0.714  0.914  0.976  0.993  0.998  0.999
x2    0   1.400  1.829  1.949  1.985  1.996  1.999

The successive vectors (x1, x2) are called iterates; so, for example, when n = 4, the fourth iterate is (0.993, 1.985).
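The computations in Table 2.12 can be reproduced in a few lines; a minimal sketch:

```python
def jacobi(n_iter):
    """Jacobi iteration for  7x1 - x2 = 5,  3x1 - 5x2 = -7,
    starting from the initial approximation (0, 0).
    Both updates use the values from the previous iteration."""
    x1, x2 = 0.0, 0.0
    for _ in range(n_iter):
        x1, x2 = (5 + x2) / 7, (7 + 3 * x1) / 5
    return x1, x2

print(tuple(round(v, 3) for v in jacobi(1)))    # (0.714, 1.4)
print(tuple(round(v, 3) for v in jacobi(20)))   # (1.0, 2.0)
```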
We can see that the iterates in this example are approaching (1, 2), which is the exact solution of the given system. (Check this.) We say in this case that Jacobi's method converges.

Jacobi's method calculates the successive iterates in a two-variable system according to the crisscross pattern shown in Table 2.13.

[Margin note: The Gauss-Seidel method is named after C. F. Gauss and Philipp Ludwig von Seidel (1821-1896). Seidel worked in analysis, probability theory, astronomy, and optics. Unfortunately, he suffered from eye problems and retired at a young age. The paper in which he described the method now known as Gauss-Seidel was published in 1874. Gauss, it seems, was unaware of the method!]

Before we consider Jacobi's method in the general case, we will look at a modification of it that often converges faster to the solution. The Gauss-Seidel method is the same as the Jacobi method except that we use each new value as soon as we can. So in our example, we begin by calculating x1 = (5 + 0)/7 = 5/7 ≈ 0.714 as before, but we now use this value of x1 to get the next value of x2:

x2 = (7 + 3(5/7))/5 = 64/35 ≈ 1.829

We then use this value of x2 to recalculate x1, and so on. The iterates this time are shown in Table 2.14. We observe that the Gauss-Seidel method has converged faster to the solution. The iterates this time are calculated according to the zigzag pattern shown in Table 2.15.

Table 2.14
n     0     1      2      3      4      5
x1    0   0.714  0.976  0.998  1.000  1.000
x2    0   1.829  1.985  1.999  2.000  2.000

The Gauss-Seidel method also has a nice geometric interpretation in the case of two variables. We can think of x1 and x2 as the coordinates of points in the plane. Our starting point is the point corresponding to our initial approximation, (0, 0). Our first calculation gives x1 = 5/7, so we move to the point (5/7, 0) ≈ (0.714, 0). Then we compute x2 = 64/35 ≈ 1.829, which moves us to the point (5/7, 64/35) ≈ (0.714, 1.829).
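The zigzag updates just described amount to feeding each new value into the very next formula; a sketch:

```python
def gauss_seidel(n_iter):
    """Gauss-Seidel for  7x1 - x2 = 5,  3x1 - 5x2 = -7, starting from (0, 0)."""
    x1, x2 = 0.0, 0.0
    for _ in range(n_iter):
        x1 = (5 + x2) / 7        # uses the most recent x2
        x2 = (7 + 3 * x1) / 5    # uses the x1 just computed
    return x1, x2

print(tuple(round(v, 3) for v in gauss_seidel(5)))   # (1.0, 2.0)
```

Five sweeps already agree with Table 2.14 to three decimal places, noticeably faster than the Jacobi iterates of Table 2.12.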
Continuing in this fashion, our calculations from the Gauss-Seidel method give rise to a sequence of points, each one differing from the preceding point in exactly one coordinate. If we plot the lines 7x1 - x2 = 5 and 3x1 - 5x2 = -7 corresponding to the two given equations, we find that the points calculated above fall alternately on the two lines, as shown in Figure 2.27. Moreover, they approach the point of intersection of the lines, which corresponds to the solution of the system of equations. This is what convergence means!

[Figure 2.27: Converging iterates]

The general cases of the two methods are analogous. Given a system of n linear equations in n variables,

a11 x1 + a12 x2 + ... + a1n xn = b1
a21 x1 + a22 x2 + ... + a2n xn = b2        (2)
  ...
an1 x1 + an2 x2 + ... + ann xn = bn

we solve the first equation for x1, the second for x2, and so on. Then, beginning with an initial approximation, we use these new equations to iteratively update each variable. Jacobi's method uses all of the values at the kth iteration to compute the (k + 1)st iterate, whereas the Gauss-Seidel method always uses the most recent value of each variable in every calculation. Example 2.39 later illustrates the Gauss-Seidel method in a three-variable problem.

At this point, you should have some questions and concerns about these iterative methods. (Do you?) Several come to mind: Must these methods converge? If not, when do they converge? If they converge, must they converge to the solution? The answer to the first question is no, as Example 2.38 illustrates.

Example 2.38

Apply the Gauss-Seidel method to the system

 x1 - x2 = 1
2x1 + x2 = 5

with initial approximation (0, 0).

Solution  We rearrange the equations to get

x1 = 1 + x2
x2 = 5 - 2x1

The first few iterates are given in Table 2.16. (Check these.) The actual solution to the given system is (x1, x2) = (2, 1).
Clearly, the iterates in Table 2.16 are not approaching this point, as Figure 2.28 makes graphically clear; this is an example of divergence.

So when do these iterative methods converge? Unfortunately, the answer to this question is rather tricky. We will answer it completely in Chapter 7, but for now we will give a partial answer, without proof. Let A be the n x n matrix

A = [ a11  a12  ...  a1n ]
    [ a21  a22  ...  a2n ]
    [  .    .    .    .  ]
    [ an1  an2  ...  ann ]

We say that A is strictly diagonally dominant if

|a11| > |a12| + |a13| + ... + |a1n|
|a22| > |a21| + |a23| + ... + |a2n|
  ...
|ann| > |an1| + |an2| + ... + |a_n,n-1|

That is, the absolute value of each diagonal entry a11, a22, ..., ann is greater than the sum of the absolute values of the remaining entries in that row.

Theorem 2.9  If a system of n linear equations in n variables has a strictly diagonally dominant coefficient matrix, then it has a unique solution and both the Jacobi and the Gauss-Seidel method converge to it.

Remark  Be warned! This theorem is a one-way implication. The fact that a system is not strictly diagonally dominant does not mean that the iterative methods diverge. They may or may not converge. (See Exercises 15-19.) Indeed, there are examples in which one of the methods converges and the other diverges. However, if either of these methods converges, then it must converge to the solution—it cannot converge to some other point.

Theorem 2.10  If the Jacobi or the Gauss-Seidel method converges for a system of n linear equations in n variables, then it must converge to the solution of the system.

Proof  We will illustrate the idea behind the proof by sketching it out for the case of Jacobi's method, using the system of equations in Example 2.37. The general proof is similar. Convergence means that "as iterations increase, the values of the iterates get closer and closer to a limiting value." This means that x1 and x2 converge to r and s, respectively, as shown in Table 2.17. We must prove that (x1, x2) = (r, s) is the solution of the system of equations.
In other words, at the (k + 1)st iteration, the values of x1 and x2 must stay the same as at the kth iteration.

Table 2.17
n     k    k+1   k+2   ...
x1    r     r     r    ...
x2    s     s     s    ...

But the calculations give x1 = (5 + x2)/7 = (5 + s)/7 and x2 = (7 + 3x1)/5 = (7 + 3r)/5. Therefore,

(5 + s)/7 = r    and    (7 + 3r)/5 = s

Rearranging, we see that

7r -  s =  5
3r - 5s = -7

Thus, x1 = r, x2 = s satisfy the original equations, as required.

By now you may be wondering: If iterative methods don't always converge to the solution, what good are they? Why don't we just use Gaussian elimination? First, we have seen that Gaussian elimination is sensitive to roundoff errors, and this sensitivity can lead to inaccurate or even wildly wrong answers. Also, even if Gaussian elimination does not go astray, we cannot improve on a solution once we have found it. For example, if we use Gaussian elimination to calculate a solution to two decimal places, there is no way to obtain the solution to four decimal places except to start over again and work with increased accuracy. In contrast, we can achieve additional accuracy with iterative methods simply by doing more iterations. For large systems, particularly those with sparse coefficient matrices, iterative methods are much faster than direct methods when implemented on a computer. In many applications, the systems that arise are strictly diagonally dominant, and thus iterative methods are guaranteed to converge. The next example illustrates one such application.

Example 2.39

Suppose we heat each edge of a metal plate to a constant temperature, as shown in Figure 2.29.
Figure 2.29  A heated metal plate

Eventually the temperature at the interior points will reach equilibrium, where the following property can be shown to hold:

The temperature at each interior point P on a plate is the average of the temperatures on the circumference of any circle centered at P inside the plate (Figure 2.30).

To apply this property in an actual example requires techniques from calculus. As an alternative, we can approximate the situation by overlaying the plate with a grid, or mesh, that has a finite number of interior points, as shown in Figure 2.31. The discrete analogue of the averaging property governing equilibrium temperatures is stated as follows:

The temperature at each interior point P is the average of the temperatures at the points adjacent to P.

For the example shown in Figure 2.31, there are three interior points, and each is adjacent to four other points. Let the equilibrium temperatures of the interior points be t1, t2, and t3, as shown. Then, by the temperature-averaging property, we have

    t1 = (100 + 100 + t2 + 50)/4
    t2 = (t1 + t3 + 0 + 50)/4          (3)
    t3 = (100 + 100 + 0 + t2)/4

or

    4t1 -  t2       = 250
    -t1 + 4t2 -  t3 =  50
        -  t2 + 4t3 = 200

Notice that this system is strictly diagonally dominant. Notice also that Equations (3) are in the form required for Jacobi or Gauss-Seidel iteration. With an initial approximation of t1 = 0, t2 = 0, t3 = 0, the Gauss-Seidel method gives the following iterates.

    Iteration 1:  t1 = (100 + 100 + 0 + 50)/4 = 62.5
                  t2 = (62.5 + 0 + 0 + 50)/4 = 28.125
                  t3 = (100 + 100 + 0 + 28.125)/4 = 57.031

    Iteration 2:  t1 = (100 + 100 + 28.125 + 50)/4 = 69.531
                  t2 = (69.531 + 57.031 + 0 + 50)/4 = 44.141
                  t3 = (100 + 100 + 0 + 44.141)/4 = 61.035

Continuing, we find the iterates listed in Table 2.18. We work with five-significant-digit accuracy and stop when two successive iterates agree within 0.001 in all variables.
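The Gauss-Seidel computation in Example 2.39 is easy to automate. Below is a minimal sketch in plain Python (the function name is ours, not from the text) that iterates Equations (3), using each newly computed value immediately and stopping when two successive iterates agree within 0.001 in every variable:

```python
def gauss_seidel_plate(tol=0.001):
    """Gauss-Seidel iteration for the plate system (3):
        t1 = (100 + 100 + t2 + 50)/4
        t2 = (t1 + t3 + 0 + 50)/4
        t3 = (100 + 100 + 0 + t2)/4
    New values are used as soon as they are computed."""
    t1 = t2 = t3 = 0.0
    while True:
        new_t1 = (100 + 100 + t2 + 50) / 4
        new_t2 = (new_t1 + t3 + 0 + 50) / 4
        new_t3 = (100 + 100 + 0 + new_t2) / 4
        # Stop when successive iterates agree within tol in all variables
        if max(abs(new_t1 - t1), abs(new_t2 - t2), abs(new_t3 - t3)) < tol:
            return new_t1, new_t2, new_t3
        t1, t2, t3 = new_t1, new_t2, new_t3

print(gauss_seidel_plate())  # roughly (74.107, 46.429, 61.607)
```

The first two passes of the loop reproduce the Iteration 1 and Iteration 2 values computed by hand above.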
Thus, the equilibrium temperatures at the interior points are (to an accuracy of 0.001) t1 = 74.107, t2 = 46.429, and t3 = 61.607. (Check these calculations.) By using a finer grid (with more interior points), we can get as precise information as we like about the equilibrium temperatures at various points on the plate.

Table 2.18
    n     0    1       2       3      ...    7       8
    t1    0    62.500  69.531  73.535 ...    74.107  74.107
    t2    0    28.125  44.141  46.143 ...    46.429  46.429
    t3    0    57.031  61.035  61.536 ...    61.607  61.607

Exercises 2.5

In Exercises 1-6, apply Jacobi's method to the given system. Take the zero vector as the initial approximation and work with four-significant-digit accuracy until two successive iterates agree within 0.001 in each variable. In each case, compare your answer with the exact solution found using any direct method you like.

1. 7x1 -  x2 =  6        2. 2x1 + x2 = 5
    x1 - 5x2 = -4

3. 4.5x1 - 0.5x2 =  1
     x1  - 3.5x2 = -1

4. 20x1 +   x2 -   x3 = 17
    x1  - 10x2 +   x3 = 13
   -x1  +   x2 + 10x3 = 18

5. 3x1 +  x2       = 1
    x1 + 4x2 +  x3 = 1
          x2 + 3x3 = 1

6. 3x1 -  x2             = 1
   -x1 + 3x2 -  x3       = 0
        -  x2 + 3x3 - x4 = 1
              -  x3 + 3x4 = 1

In Exercises 7-12, repeat the given exercise using the Gauss-Seidel method. Take the zero vector as the initial approximation and work with four-significant-digit accuracy until two successive iterates agree within 0.001 in each variable. Compare the number of iterations required by the Jacobi and Gauss-Seidel methods to reach such an approximate solution.

7. Exercise 1     8. Exercise 2
9. Exercise 3    10. Exercise 4
11. Exercise 5   12. Exercise 6

In Exercises 13 and 14, draw diagrams to illustrate the convergence of the Gauss-Seidel method with the given system.

13. The system in Exercise 1
14. The system in Exercise 2

In Exercises 15 and 16, compute the first four iterates, using the zero vector as the initial approximation, to show that the Gauss-Seidel method diverges.
Then show that the equations can be rearranged to give a strictly diagonally dominant coefficient matrix, and apply the Gauss-Seidel method to obtain an approximate solution that is accurate to within 0.001.

15.  x1 - 2x2 = 3
    3x1 + 2x2 = 1

16.  x1 - 4x2 + 2x3 = 2
          2x2 + 4x3 = 1
    6x1 -  x2 - 2x3 = 1

17. Draw a diagram to illustrate the divergence of the Gauss-Seidel method in Exercise 15.

In Exercises 18 and 19, the coefficient matrix is not strictly diagonally dominant, nor can the equations be rearranged to make it so. However, both the Jacobi and the Gauss-Seidel method converge anyway. Demonstrate that this is true of the Gauss-Seidel method, starting with the zero vector as the initial approximation and obtaining a solution that is accurate to within 0.01.

18. -4x1 + 5x2 = 14

19. 5x1 - 2x2 + 3x3 =  -1
     x1 + 4x2 - 4x3 = 102
   -2x1 - 2x2 + 4x3 = -90

20. Continue performing iterations in Exercise 18 to obtain a solution that is accurate to within 0.001.

21. Continue performing iterations in Exercise 19 to obtain a solution that is accurate to within 0.001.

In Exercises 22-24, the metal plate has the constant temperatures shown on its boundaries. Find the equilibrium temperature at each of the indicated interior points by setting up a system of linear equations and applying either the Jacobi or the Gauss-Seidel method. Obtain a solution that is accurate to within 0.001.

[Plate diagrams for Exercises 22-24, with boundary temperatures of 40° (Exercise 22); 0° and 100° (Exercise 23); and 0°, 20°, 40°, and 100° (Exercise 24).]

In Exercises 25 and 26, we refine the grids used in Exercises 22 and 24 to obtain more accurate information about the equilibrium temperatures at interior points of the plates. Obtain solutions that are accurate to within 0.001, using either the Jacobi or the Gauss-Seidel method.
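The rearrangement idea in Exercises 15 and 16 can be made concrete with a small general-purpose routine. The sketch below, in plain Python (the function names are ours, not from the text), first checks strict diagonal dominance and then runs Gauss-Seidel iteration, stopping when successive iterates agree within the given tolerance in every variable. Applied to the system of Exercise 15 with its two equations swapped, the coefficient matrix becomes strictly diagonally dominant and the iteration converges to the exact solution x1 = 1, x2 = -1:

```python
def is_strictly_diagonally_dominant(A):
    """Check |a_ii| > sum of |a_ij| over j != i, for every row i."""
    return all(
        abs(row[i]) > sum(abs(a) for j, a in enumerate(row) if j != i)
        for i, row in enumerate(A)
    )

def gauss_seidel(A, b, x0, tol=0.001, max_iter=1000):
    """Gauss-Seidel iteration for Ax = b: solve row i for x_i,
    using newly computed values as soon as they are available."""
    n = len(b)
    x = list(x0)
    for _ in range(max_iter):
        x_old = list(x)
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
        # Stop when successive iterates agree within tol in all variables
        if max(abs(xi - xo) for xi, xo in zip(x, x_old)) < tol:
            return x
    return x  # tolerance not reached within max_iter

# Exercise 15 with its equations swapped: 3x1 + 2x2 = 1, x1 - 2x2 = 3
A, b = [[3, 2], [1, -2]], [1, 3]
print(is_strictly_diagonally_dominant(A))  # True
print(gauss_seidel(A, b, [0, 0]))  # approaches the exact solution (1, -1)
```

With the equations in their original order (coefficient matrix [[1, -2], [3, 2]]) the same routine diverges, which is exactly what Exercise 15 asks you to observe.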
[Refined-grid plate diagrams for Exercises 25 and 26, with the boundary temperatures from Exercises 22 and 24.]

Exercises 27 and 28 demonstrate that sometimes, if we are lucky, the form of an iterative problem may allow us to use a little insight to obtain an exact solution.

27. A narrow strip of paper 1 unit long is placed along a number line so that its ends are at 0 and 1. The paper is folded in half, right end over left, so that its ends are now at 0 and 1/2. Next, it is folded in half again, this time left end over right, so that its ends are at 1/4 and 1/2. Figure 2.32 shows this process. We continue folding the paper in half, alternating right-over-left and left-over-right. If we could continue indefinitely, it is clear that the ends of the paper would converge to a point. It is this point that we want to find.

(a) Let x1 correspond to the left-hand end of the paper and x2 to the right-hand end. Make a table with the first six values of [x1, x2] and plot the corresponding points on x1, x2 coordinate axes.
(b) Find two linear equations of the form x2 = ax1 + b and x1 = cx2 + d that determine the new values of the endpoints at each iteration. Draw the corresponding lines on your coordinate axes and show that this diagram would result from applying the Gauss-Seidel method to the system of linear equations you have found. (Your diagram should resemble Figure 2.27 on page 126.)
(c) Switching to decimal representation, continue applying the Gauss-Seidel method to approximate the point to which the ends of the paper are converging to within 0.001 accuracy.
(d) Solve the system of equations exactly and compare your answers.

28. An ant is standing on a number line at point A. It walks halfway to point B and turns around. Then it walks halfway back to point A, turns around again, and walks halfway to point B. It continues to do this indefinitely. Let point A be at 0 and point B be at 1. The ant's walk is made up of a sequence of overlapping line segments.
Let x1 record the positions of the left-hand endpoints of these segments and x2 their right-hand endpoints. (Thus, we begin with x1 = 0 and x2 = 1/2. Then we have x1 = 1/4 and x2 = 5/8, and so on.) Figure 2.33 shows the start of the ant's walk.

Figure 2.32  Folding a strip of paper
Figure 2.33  The ant's walk

(a) Make a table with the first six values of [x1, x2] and plot the corresponding points on x1, x2 coordinate axes.
(b) Find two linear equations of the form x2 = ax1 + b and x1 = cx2 + d that determine the new values of the endpoints at each iteration. Draw the corresponding lines on your coordinate axes and show that this diagram would result from applying the Gauss-Seidel method to the system of linear equations you have found. (Your diagram should resemble Figure 2.27 on page 126.)
(c) Switching to decimal representation, continue applying the Gauss-Seidel method to approximate the values to which x1 and x2 are converging to within 0.001 accuracy.
(d) Solve the system of equations exactly and compare your answers. Interpret your results.

Chapter Review

Key Definitions and Concepts

augmented matrix, 61, 64
back substitution, 61
coefficient matrix, 64
consistent system, 60
convergence, 125-126
divergence, 127
elementary row operations, 66
free variable, 71
Gauss-Jordan elimination, 73
Gauss-Seidel method, 124
Gaussian elimination, 68-69
homogeneous system, 76
inconsistent system, 60
iterate, 125
Jacobi's method, 124
leading variable (leading 1), 71-73
linear equation, 58
linearly dependent vectors, 93
linearly independent vectors, 93
pivot, 66
rank of a matrix, 72
Rank Theorem, 72
reduced row echelon form, 73
row echelon form, 65
row equivalent matrices, 68
span of a set of vectors, 90
spanning set, 90
system of linear equations, 59

Review Questions

1. Mark each of the following statements true or false:
(a) Every system of linear equations has a solution.
(b) Every homogeneous system of linear equations has a solution.
(c) If a system of linear equations has more variables than equations, then it has infinitely many solutions.
(d) If a system of linear equations has more equations than variables, then it has no solution.
(e) Determining whether b is in span(a1, ..., an) is equivalent to determining whether the system [A | b] is consistent, where A = [a1 ... an].
(f) In R^3, span(u, v) is always a plane through the origin.
(g) In R^3, if nonzero vectors u and v are not parallel, then they are linearly independent.
(h) In R^3, if a set of vectors can be drawn head to tail, one after the other so that a closed path (polygon) is formed, then the vectors are linearly dependent.
(i) If a set of vectors has the property that no two vectors in the set are scalar multiples of one another, then the set of vectors is linearly independent.
(j) If there are more vectors in a set of vectors than the number of entries in each vector, then the set of vectors is linearly dependent.

2. Find the rank of the matrix

   [ -1  4  -5  0 ]
   [  3  2   1  3 ]
   [  4  2  -3  2 ]
   [ -1  6   2  2 ]

3. Solve the linear system

    x +  y - 2z = …
    x + 3y …  z = 7
   2x +  y … 5z = 7

4. Solve the linear system

   3w + 8x - 18y + z = 35
    w + 2x -  4y     = 11
    w + 3x -  7y + z = 10

5. Solve the linear system

   2x + 3y = 4
    x + 2y = 3

   over Z7.

6. Solve the linear system

   3x + 2y = 1
    x + 4y = 2

   over Z5.

7. For what value(s) of k is the linear system with augmented matrix

   [ k   2 | 1 ]
   [ 1  2k | 1 ]

   inconsistent?

8. Find parametric equations for the line of intersection of the planes x + 2y + 3z = 4 and 5x + 6y + 7z = 8.

9. Find the point of intersection of the following lines, if it exists.

10. Determine whether the given vector is in the span of the given vectors.

11. Find the general equation of the plane spanned by the given vectors.

12. Determine whether the given vectors are linearly independent.

13. Determine whether R^3 = span(u, v, w) if:
    (a) u = [1, 1, 0], v = [1, 0, 1], w = [0, 1, 1]
    (b) u = [-1, -1, 0], v = [0, 0, 1], w = [-1, 1, 1]
14. Let a1, a2, a3 be linearly independent vectors in R^3, and let A = [a1 a2 a3]. Which of the following statements are true?
    (a) The reduced row echelon form of A is I3.
    (b) The rank of A is 3.
    (c) The system [A | b] has a unique solution for any vector b in R^3.
    (d) (a), (b), and (c) are all true.
    (e) (a) and (b) are both true, but not (c).

15. Let a1, a2, a3 be linearly dependent vectors in R^3, not all zero, and let A = [a1 a2 a3]. What are the possible values of the rank of A?

16. What is the maximum rank of a 5 × 3 matrix? What is the minimum rank of a 5 × 3 matrix?

17. Show that if u and v are linearly independent vectors, then so are u + v and u - v.

18. Show that span(u, v) = span(u, u + v) for any vectors u and v.

19. In order for a linear system with augmented matrix [A | b] to be consistent, what must be true about the ranks of A and [A | b]?

20. Are the matrices

    [  1  1  1 ]       [ 1  0  -1 ]
    [  2  3 -1 ]  and  [ 1  1   1 ]
    [ -1  4  1 ]       [ 0  1   3 ]

    row equivalent? Why or why not?
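Several of these review questions (2, 14-16, 19, 20) come down to computing a reduced row echelon form. The following is a minimal Gauss-Jordan sketch in plain Python (the function names are ours, not from the text), useful for checking answers:

```python
def rref(A):
    """Reduce A to reduced row echelon form by Gauss-Jordan elimination."""
    A = [list(map(float, row)) for row in A]
    rows, cols = len(A), len(A[0])
    r = 0
    for c in range(cols):
        # Find a pivot in column c at or below row r
        pivot = next((i for i in range(r, rows) if abs(A[i][c]) > 1e-12), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        p = A[r][c]
        A[r] = [a / p for a in A[r]]  # create a leading 1
        for i in range(rows):        # clear the rest of column c
            if i != r:
                factor = A[i][c]
                A[i] = [a - factor * b for a, b in zip(A[i], A[r])]
        r += 1
    return A

def rank(A):
    """Rank = number of nonzero rows in the reduced row echelon form."""
    return sum(1 for row in rref(A) if any(abs(a) > 1e-9 for a in row))

print(rank([[1, 1, 1], [2, 3, -1], [-1, 4, 1]]))  # 3
```

Two matrices of the same size are row equivalent exactly when they have the same reduced row echelon form, which gives a quick computational check for Question 20.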