Copyright

Copyright © 1982, 1990 by Charles C. Pinter
All rights reserved.

Bibliographical Note

This Dover edition, first published in 2010, is an unabridged republication of the 1990 second edition of the work originally published in 1982 by the McGraw-Hill Publishing Company, Inc., New York.

Library of Congress Cataloging-in-Publication Data

Pinter, Charles C, 1932-
A book of abstract algebra / Charles C. Pinter. — Dover ed.
p. cm.
Originally published: 2nd ed. New York : McGraw-Hill, 1990.
Includes bibliographical references and index.
ISBN-13: 978-0-486-47417-5
ISBN-10: 0-486-47417-8
1. Algebra, Abstract. I. Title.
QA162.P56 2010
512'.02—dc22
2009026228

Manufactured in the United States by Courier Corporation
47417807 2015
www.doverpublications.com

CONTENTS*

Preface

Chapter 1 Why Abstract Algebra?
History of Algebra. New Algebras. Algebraic Structures. Axioms and Axiomatic Algebra. Abstraction in Algebra.

Chapter 2 Operations
Operations on a Set. Properties of Operations.

Chapter 3 The Definition of Groups
Groups. Examples of Infinite and Finite Groups. Examples of Abelian and Nonabelian Groups. Group Tables. Theory of Coding: Maximum-Likelihood Decoding.

Chapter 4 Elementary Properties of Groups
Uniqueness of Identity and Inverses. Properties of Inverses. Direct Product of Groups.

Chapter 5 Subgroups
Definition of Subgroup. Generators and Defining Relations. Cayley Diagrams. Center of a Group. Group Codes; Hamming Code.

Chapter 6 Functions
Injective, Surjective, Bijective Function. Composite and Inverse of Functions. Finite-State Machines. Automata and Their Semigroups.

Chapter 7 Groups of Permutations
Symmetric Groups. Dihedral Groups. An Application of Groups to Anthropology.

Chapter 8 Permutations of a Finite Set
Decomposition of Permutations into Cycles. Transpositions. Even and Odd Permutations. Alternating Groups.

Chapter 9 Isomorphism
The Concept of Isomorphism in Mathematics. Isomorphic and Nonisomorphic Groups. Cayley's Theorem. Group Automorphisms.

Chapter 10 Order of Group Elements
Powers/Multiples of Group Elements. Laws of Exponents. Properties of the Order of Group Elements.

Chapter 11 Cyclic Groups
Finite and Infinite Cyclic Groups. Isomorphism of Cyclic Groups. Subgroups of Cyclic Groups.

Chapter 12 Partitions and Equivalence Relations

Chapter 13 Counting Cosets
Lagrange's Theorem and Elementary Consequences. Survey of Groups of Order ≤ 10. Number of Conjugate Elements. Group Acting on a Set.

Chapter 14 Homomorphisms
Elementary Properties of Homomorphisms. Normal Subgroups. Kernel and Range. Inner Direct Products. Conjugate Subgroups.

Chapter 15 Quotient Groups
Quotient Group Construction. Examples and Applications. The Class Equation. Induction on the Order of a Group.

Chapter 16 The Fundamental Homomorphism Theorem
Fundamental Homomorphism Theorem and Some Consequences. The Isomorphism Theorems. The Correspondence Theorem. Cauchy's Theorem. Sylow Subgroups. Sylow's Theorem. Decomposition Theorem for Finite Abelian Groups.

Chapter 17 Rings: Definitions and Elementary Properties
Commutative Rings. Unity. Invertibles and Zero-Divisors. Integral Domain. Field.

Chapter 18 Ideals and Homomorphisms

Chapter 19 Quotient Rings
Construction of Quotient Rings. Examples. Fundamental Homomorphism Theorem and Some Consequences. Properties of Prime and Maximal Ideals.

Chapter 20 Integral Domains
Characteristic of an Integral Domain. Properties of the Characteristic. Finite Fields. Construction of the Field of Quotients.

Chapter 21 The Integers
Ordered Integral Domains. Well-ordering. Characterization of Z Up to Isomorphism. Mathematical Induction. Division Algorithm.

Chapter 22 Factoring into Primes
Ideals of Z. Properties of the GCD. Relatively Prime Integers. Primes. Euclid's Lemma. Unique Factorization.

Chapter 23 Elements of Number Theory (Optional)
Properties of Congruence. Theorems of Fermat and Euler. Solutions of Linear Congruences. Chinese Remainder Theorem. Wilson's Theorem and Consequences. Quadratic Residues. The Legendre Symbol. Primitive Roots.

Chapter 24 Rings of Polynomials
Motivation and Definitions. Domain of Polynomials over a Field. Division Algorithm. Polynomials in Several Variables. Fields of Polynomial Quotients.

Chapter 25 Factoring Polynomials
Ideals of F[x]. Properties of the GCD. Irreducible Polynomials. Unique factorization. Euclidean Algorithm.

Chapter 26 Substitution in Polynomials
Roots and Factors. Polynomial Functions. Polynomials over Q. Eisenstein's Irreducibility Criterion. Polynomials over the Reals. Polynomial Interpolation.

Chapter 27 Extensions of Fields
Algebraic and Transcendental Elements. The Minimum Polynomial. Basic Theorem on Field Extensions.

Chapter 28 Vector Spaces
Elementary Properties of Vector Spaces. Linear Independence. Basis. Dimension. Linear Transformations.

Chapter 29 Degrees of Field Extensions
Simple and Iterated Extensions. Degree of an Iterated Extension. Fields of Algebraic Elements. Algebraic Numbers. Algebraic Closure.

Chapter 30 Ruler and Compass
Constructible Points and Numbers. Impossible Constructions. Constructible Angles and Polygons.

Chapter 31 Galois Theory: Preamble
Multiple Roots. Root Field. Extension of a Field. Isomorphism. Roots of Unity. Separable Polynomials. Normal Extensions.

Chapter 32 Galois Theory: The Heart of the Matter
Field Automorphisms. The Galois Group. The Galois Correspondence. Fundamental Theorem of Galois Theory. Computing Galois Groups.

Chapter 33 Solving Equations by Radicals
Radical Extensions. Abelian Extensions. Solvable Groups. Insolvability of the Quintic.

Appendix A Review of Set Theory
Appendix B Review of the Integers
Appendix C Review of Mathematical Induction
Answers to Selected Exercises
Index

* Italic headings indicate topics discussed in the exercise sections.

PREFACE

Once, when I was a student struggling to understand modern algebra, I was told to view this subject as an intellectual chess game, with conventional moves and prescribed rules of play. I was ill served by this bit of extemporaneous advice, and vowed never to perpetuate the falsehood that mathematics is purely—or primarily—a formalism. My pledge has strongly influenced the shape and style of this book. While giving due emphasis to the deductive aspect of modern algebra, I have endeavored here to present modern algebra as a lively branch of mathematics, having considerable imaginative appeal and resting on some firm, clear, and familiar intuitions. I have devoted a great deal of attention to bringing out the meaningfulness of algebraic concepts, by tracing these concepts to their origins in classical algebra and at the same time exploring their connections with other parts of mathematics, especially geometry, number theory, and aspects of computation and equation solving.

In an introductory chapter entitled Why Abstract Algebra?, as well as in numerous historical asides, concepts of abstract algebra are traced to the historic context in which they arose. I have attempted to show that they arose without artifice, as a natural response to particular needs, in the course of a natural process of evolution. Furthermore, I have endeavored to bring to light, explicitly, the intuitive content of the algebraic concepts used in this book. Concepts are more meaningful to students when the students are able to represent those concepts in their minds by clear and familiar mental images. Accordingly, the process of concrete concept-formation is developed with care throughout this book.

I have deliberately avoided a rigid conventional format, with its succession of definition, theorem, proof, corollary, example.
In my experience, that kind of format encourages some students to believe that mathematical concepts have a merely conventional character, and may encourage rote memorization. Instead, each chapter has the form of a discussion with the student, with the accent on explaining and motivating.

In an effort to avoid fragmentation of the subject matter into loosely related definitions and results, each chapter is built around a central theme and remains anchored to this focal point. In the later chapters especially, this focal point is a specific application or use. Details of every topic are then woven into the general discussion, so as to keep a natural flow of ideas running through each chapter.

The arrangement of topics is designed to avoid tedious proofs and long-winded explanations. Routine arguments are worked into the discussion whenever this seems natural and appropriate, and proofs to theorems are seldom more than a few lines long. (There are, of course, a few exceptions to this.) Elementary background material is filled in as it is needed. For example, a brief chapter on functions precedes the discussion of permutation groups, and a chapter on equivalence relations and partitions paves the way for Lagrange's theorem.

This book addresses itself especially to the average student, to enable him or her to learn and understand as much algebra as possible. In scope and subject-matter coverage, it is no different from many other standard texts. It begins with the promise of demonstrating the unsolvability of the quintic and ends with that promise fulfilled. Standard topics are discussed in their usual order, and many advanced and peripheral subjects are introduced in the exercises, accompanied by ample instruction and commentary.

I have included a copious supply of exercises—probably more exercises than in other books at this level. They are designed to offer a wide range of experiences to students at different levels of ability.
There is some novelty in the way the exercises are organized: at the end of each chapter, the exercises are grouped into exercise sets, each set containing about six to eight exercises and headed by a descriptive title. Each set touches upon an idea or skill covered in the chapter.

The first few exercise sets in each chapter contain problems which are essentially computational or manipulative. Then, there are two or three sets of simple proof-type questions, which require mainly the ability to put together definitions and results with understanding of their meaning. After that, I have endeavored to make the exercises more interesting by arranging them so that in each set a new result is proved, or new light is shed on the subject of the chapter.

As a rule, all the exercises have the same weight: very simple exercises are grouped together as parts of a single problem, and conversely, problems which require a complex argument are broken into several subproblems which the student may tackle in turn. I have selected mainly problems which have intrinsic relevance, and are not merely drill, on the premise that this is much more satisfying to the student.

CHANGES IN THE SECOND EDITION

During the seven years that have elapsed since publication of the first edition of A Book of Abstract Algebra, I have received letters from many readers with comments and suggestions. Moreover, a number of reviewers have gone over the text with the aim of finding ways to increase its effectiveness and appeal as a teaching tool. In preparing the second edition, I have taken account of the many suggestions that were made, and of my own experience with the book in my classes.
In addition to numerous small changes that should make the book easier to read, the following major changes should be noted:

EXERCISES Many of the exercises have been refined or reworded—and a few of the exercise sets reorganized—in order to enhance their clarity or, in some cases, to make them more mathematically interesting. In addition, several new exercise sets have been included which touch upon applications of algebra and are discussed next:

APPLICATIONS The question of including "applications" of abstract algebra in an undergraduate course (especially a one-semester course) is a touchy one. Either one runs the risk of making a visibly weak case for the applicability of the notions of abstract algebra, or on the other hand—by including substantive applications—one may end up having to omit a lot of important algebra. I have adopted what I believe is a reasonable compromise by adding an elementary discussion of a few application areas (chiefly aspects of coding and automata theory) only in the exercise sections, in connection with specific exercises. These exercises may be either stressed, de-emphasized, or omitted altogether.

PRELIMINARIES It may well be argued that, in order to guarantee the smooth flow and continuity of a course in abstract algebra, the course should begin with a review of such preliminaries as set theory, induction and the properties of integers. In order to provide material for teachers who prefer to start the course in this fashion, I have added an Appendix with three brief chapters on Sets, Integers and Induction, respectively, each with its own set of exercises.

SOLUTIONS TO SELECTED EXERCISES A few exercises in each chapter are marked with the symbol #. This indicates that a partial solution, or sometimes merely a decisive hint, is given at the end of the book in the section titled Solutions to Selected Exercises.
ACKNOWLEDGMENTS

I would like to express my thanks for the many useful comments and suggestions provided by colleagues who reviewed this text during the course of this revision, especially to J. Richard Byrne, Portland State University; D. R. LaTorre, Clemson University; Kazem Mahdavi, State University College at Potsdam; Thomas N. Roe, South Dakota State University; and Armond E. Spencer, State University of New York-Potsdam. In particular, I would like to thank Robert Weinstein, mathematics editor at McGraw-Hill during the preparation of the second edition of this book. I am indebted to him for his guidance, insight, and steady encouragement.

A BOOK OF ABSTRACT ALGEBRA
Charles C. Pinter

CHAPTER ONE
WHY ABSTRACT ALGEBRA?

When we open a textbook of abstract algebra for the first time and peruse the table of contents, we are struck by the unfamiliarity of almost every topic we see listed. Algebra is a subject we know well, but here it looks surprisingly different. What are these differences, and how fundamental are they?

First, there is a major difference in emphasis. In elementary algebra we learned the basic symbolism and methodology of algebra; we came to see how problems of the real world can be reduced to sets of equations and how these equations can be solved to yield numerical answers. This technique for translating complicated problems into symbols is the basis for all further work in mathematics and the exact sciences, and is one of the triumphs of the human mind. However, algebra is not only a technique, it is also a branch of learning, a discipline, like calculus or physics or chemistry. It is a coherent and unified body of knowledge which may be studied systematically, starting from first principles and building up. So the first difference between the elementary and the more advanced course in algebra is that, whereas earlier we concentrated on technique, we will now develop that branch of mathematics called algebra in a systematic way.
Ideas and general principles will take precedence over problem solving. (By the way, this does not mean that modern algebra has no applications—quite the opposite is true, as we will see soon.)

Algebra at the more advanced level is often described as modern or abstract algebra. In fact, both of these descriptions are partly misleading. Some of the great discoveries in the upper reaches of present-day algebra (for example, the so-called Galois theory) were known many years before the American Civil War; and the broad aims of algebra today were clearly stated by Leibniz in the seventeenth century. Thus, "modern" algebra is not so very modern, after all! To what extent is it abstract? Well, abstraction is all relative; one person's abstraction is another person's bread and butter. The abstract tendency in mathematics is a little like the situation of changing moral codes, or changing tastes in music: What shocks one generation becomes the norm in the next. This has been true throughout the history of mathematics.

For example, 1000 years ago negative numbers were considered to be an outrageous idea. After all, it was said, numbers are for counting: we may have one orange, or two oranges, or no oranges at all; but how can we have minus an orange? The logisticians, or professional calculators, of those days used negative numbers as an aid in their computations; they considered these numbers to be a useful fiction, for if you believe in them then every linear equation $ax + b = 0$ has a solution (namely $x = -b/a$, provided $a \neq 0$). Even the great Diophantus once described the solution of $4x + 6 = 2$ as an absurd number. The idea of a system of numeration which included negative numbers was far too abstract for many of the learned heads of the tenth century!

The history of the complex numbers (numbers which involve $\sqrt{-1}$) is very much the same. For hundreds of years, mathematicians refused to accept them because they couldn't find concrete examples or applications.
(They are now a basic tool of physics.)

Set theory was considered to be highly abstract a few years ago, and so were other commonplaces of today. Many of the abstractions of modern algebra are already being used by scientists, engineers, and computer specialists in their everyday work. They will soon be common fare, respectably "concrete," and by then there will be new "abstractions."

Later in this chapter we will take a closer look at the particular brand of abstraction used in algebra. We will consider how it came about and why it is useful.

Algebra has evolved considerably, especially during the past 100 years. Its growth has been closely linked with the development of other branches of mathematics, and it has been deeply influenced by philosophical ideas on the nature of mathematics and the role of logic. To help us understand the nature and spirit of modern algebra, we should take a brief look at its origins.

ORIGINS

The order in which subjects follow each other in our mathematical education tends to repeat the historical stages in the evolution of mathematics. In this scheme, elementary algebra corresponds to the great classical age of algebra, which spans about 300 years from the sixteenth through the eighteenth centuries. It was during these years that the art of solving equations became highly developed and modern symbolism was invented.

The word "algebra"—al jebr in Arabic—was first used by Mohammed of Kharizm, who taught mathematics in Baghdad during the ninth century. The word may be roughly translated as "reunion," and describes his method for collecting the terms of an equation in order to solve it. It is an amusing fact that the word "algebra" was first used in Europe in quite another context. In Spain barbers were called algebristas, or bonesetters (they re-united broken bones), because medieval barbers did bonesetting and bloodletting as a sideline to their usual business.
The origin of the word clearly reflects the actual context of algebra at that time, for it was mainly concerned with ways of solving equations. In fact, Omar Khayyam, who is best remembered for his brilliant verses on wine, song, love, and friendship which are collected in the Rubaiyat—but who was also a great mathematician—explicitly defined algebra as the science of solving equations.

Thus, as we enter upon the threshold of the classical age of algebra, its central theme is clearly identified as that of solving equations. Methods of solving the linear equation $ax + b = 0$ and the quadratic $ax^2 + bx + c = 0$ were well known even before the Greeks. But nobody had yet found a general solution for cubic equations

$$x^3 + ax^2 + bx = c$$

or quartic (fourth-degree) equations

$$x^4 + ax^3 + bx^2 + cx = d$$

This great accomplishment was the triumph of sixteenth century algebra.

The setting is Italy and the time is the Renaissance—an age of high adventure and brilliant achievement, when the wide world was reawakening after the long austerity of the Middle Ages. America had just been discovered, classical knowledge had been brought to light, and prosperity had returned to the great cities of Europe. It was a heady age when nothing seemed impossible and even the old barriers of birth and rank could be overcome. Courageous individuals set out for great adventures in the far corners of the earth, while others, now confident once again of the power of the human mind, were boldly exploring the limits of knowledge in the sciences and the arts. The ideal was to be bold and many-faceted, to "know something of everything, and everything of at least one thing." The great traders were patrons of the arts, the finest minds in science were adepts at political intrigue and high finance. The study of algebra was reborn in this lively milieu.
Those men who brought algebra to a high level of perfection at the beginning of its classical age—all typical products of the Italian Renaissance—were as colorful and extraordinary a lot as have ever appeared in a chapter of history. Arrogant and unscrupulous, brilliant, flamboyant, swaggering, and remarkable, they lived their lives as they did their work: with style and panache, in brilliant dashes and inspired leaps of the imagination.

The spirit of scholarship was not exactly as it is today. These men, instead of publishing their discoveries, kept them as well-guarded secrets to be used against each other in problem-solving competitions. Such contests were a popular attraction: heavy bets were made on the rival parties, and their reputations (as well as a substantial purse) depended on the outcome.

One of the most remarkable of these men was Girolamo Cardan. Cardan was born in 1501 as the illegitimate son of a famous jurist of the city of Pavia. A man of passionate contrasts, he was destined to become famous as a physician, astrologer, and mathematician—and notorious as a compulsive gambler, scoundrel, and heretic. After he graduated in medicine, his efforts to build up a medical practice were so unsuccessful that he and his wife were forced to seek refuge in the poorhouse. With the help of friends he became a lecturer in mathematics, and, after he cured the child of a senator from Milan, his medical career also picked up. He was finally admitted to the college of physicians and soon became its rector. A brilliant doctor, he gave the first clinical description of typhus fever, and as his fame spread he became the personal physician of many of the high and mighty of his day.

Cardan's early interest in mathematics was not without a practical side. As an inveterate gambler he was fascinated by what he recognized to be the laws of chance.
He wrote a gamblers' manual entitled Book on Games of Chance, which presents the first systematic computations of probabilities. He also needed mathematics as a tool in casting horoscopes, for his fame as an astrologer was great and his predictions were highly regarded and sought after. His most important achievement was the publication of a book called Ars Magna (The Great Art), in which he presented systematically all the algebraic knowledge of his time. However, as already stated, much of this knowledge was the personal secret of its practitioners, and had to be wheedled out of them by cunning and deceit. The most important accomplishment of the day, the general solution of the cubic equation which had been discovered by Tartaglia, was obtained in that fashion.

Tartaglia's life was as turbulent as any in those days. Born with the name of Niccolò Fontana about 1500, he was present at the occupation of Brescia by the French in 1512. He and his father fled with many others into a cathedral for sanctuary, but in the heat of battle the soldiers massacred the hapless citizens even in that holy place. The father was killed, and the boy, with a split skull and a deep saber cut across his jaws and palate, was left for dead. At night his mother stole into the cathedral and managed to carry him off; miraculously he survived. The horror of what he had witnessed caused him to stammer for the rest of his life, earning him the nickname Tartaglia, "the stammerer," which he eventually adopted.

Tartaglia received no formal schooling, for that was a privilege of rank and wealth. However, he taught himself mathematics and became one of the most gifted mathematicians of his day. He translated Euclid and Archimedes and may be said to have originated the science of ballistics, for he wrote a treatise on gunnery which was a pioneering effort on the laws of falling bodies.
In 1535 Tartaglia found a way of solving any cubic equation of the form $x^3 + ax^2 = b$ (that is, without an $x$ term). When he announced his accomplishment (without giving any details, of course), he was challenged to an algebra contest by a certain Antonio Fior, a pupil of the celebrated professor of mathematics Scipio del Ferro. Scipio had already found a method for solving any cubic equation of the form $x^3 + ax = b$ (that is, without an $x^2$ term), and had confided his secret to his pupil Fior. It was agreed that each contestant was to draw up 30 problems and hand the list to his opponent. Whoever solved the greater number of problems would receive a sum of money deposited with a lawyer. A few days before the contest, Tartaglia found a way of extending his method so as to solve any cubic equation. In less than 2 hours he solved all his opponent's problems, while his opponent failed to solve even one of those proposed by Tartaglia.

For some time Tartaglia kept his method for solving cubic equations to himself, but in the end he succumbed to Cardan's accomplished powers of persuasion. Influenced by Cardan's promise to help him become artillery adviser to the Spanish army, he revealed the details of his method to Cardan under the promise of strict secrecy. A few years later, to Tartaglia's unbelieving amazement and indignation, Cardan published Tartaglia's method in his book Ars Magna. Even though he gave Tartaglia full credit as the originator of the method, there can be no doubt that he broke his solemn promise. A bitter dispute arose between the mathematicians, from which Tartaglia was perhaps lucky to escape alive. He lost his position as public lecturer at Brescia, and lived out his remaining years in obscurity.

The next great step in the progress of algebra was made by another member of the same circle.
It was Ludovico Ferrari who discovered the general method for solving quartic equations—equations of the form

$$x^4 + ax^3 + bx^2 + cx = d$$

Ferrari was Cardan's personal servant. As a boy in Cardan's service he learned Latin, Greek, and mathematics. He won fame after defeating Tartaglia in a contest in 1548, and received an appointment as supervisor of tax assessments in Mantua. This position brought him wealth and influence, but he was not able to dominate his own violent disposition. He quarreled with the regent of Mantua, lost his position, and died at the age of 43. Tradition has it that he was poisoned by his sister.

As for Cardan, after a long career of brilliant and unscrupulous achievement, his luck finally abandoned him. Cardan's son poisoned his unfaithful wife and was executed in 1560. Ten years later, Cardan was arrested for heresy because he published a horoscope of Christ's life. He spent several months in jail and was released after renouncing his heresy privately, but lost his university position and the right to publish books. He was left with a small pension which had been granted to him, for some unaccountable reason, by the Pope.

As this colorful time draws to a close, algebra emerges as a major branch of mathematics. It became clear that methods can be found to solve many different types of equations. In particular, formulas had been discovered which yielded the roots of all cubic and quartic equations. Now the challenge was clearly out to take the next step, namely, to find a formula for the roots of equations of degree 5 or higher (in other words, equations with an $x^5$ term, or an $x^6$ term, or higher). During the next 200 years, there was hardly a mathematician of distinction who did not try to solve this problem, but none succeeded. Progress was made in new parts of algebra, and algebra was linked to geometry with the invention of analytic geometry.
But the problem of solving equations of degree higher than 4 remained unsettled. It was, in the expression of Lagrange, "a challenge to the human mind."

It was therefore a great surprise to all mathematicians when in 1824 the work of a young Norwegian prodigy named Niels Abel came to light. In his work, Abel showed that there does not exist any formula (in the conventional sense we have in mind) for the roots of an algebraic equation whose degree is 5 or greater. This sensational discovery brings to a close what is called the classical age of algebra. Throughout this age algebra was conceived essentially as the science of solving equations, and now the outer limits of this quest had apparently been reached. In the years ahead, algebra was to strike out in new directions.

THE MODERN AGE

About the time Niels Abel made his remarkable discovery, several mathematicians, working independently in different parts of Europe, began raising questions about algebra which had never been considered before. Their researches in different branches of mathematics had led them to investigate "algebras" of a very unconventional kind—and in connection with these algebras they had to find answers to questions which had nothing to do with solving equations. Their work had important applications, and was soon to compel mathematicians to greatly enlarge their conception of what algebra is about.

The new varieties of algebra arose as a perfectly natural development in connection with the application of mathematics to practical problems. This is certainly true for the example we are about to look at first.

The Algebra of Matrices

A matrix is a rectangular array of numbers such as

$$\begin{pmatrix} 2 & 11 & -3 \\ 9 & 0.5 & 4 \end{pmatrix}$$

Such arrays come up naturally in many situations, for example, in the solution of simultaneous linear equations.
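The above matrix, for instance, is the matrix of coefficients of the pair of equations

$$2x + 11y - 3z = 0$$
$$9x + 0.5y + 4z = 0$$

Since the solution of this pair of equations depends only on the coefficients, we may solve it by working on the matrix of coefficients alone and ignoring everything else.

We may consider the entries of a matrix to be arranged in rows and columns; the above matrix has two rows which are

$$(2 \quad 11 \quad -3) \quad \text{and} \quad (9 \quad 0.5 \quad 4)$$

and three columns which are

$$\begin{pmatrix} 2 \\ 9 \end{pmatrix} \qquad \begin{pmatrix} 11 \\ 0.5 \end{pmatrix} \qquad \begin{pmatrix} -3 \\ 4 \end{pmatrix}$$

It is a $2 \times 3$ matrix.

To simplify our discussion, we will consider only $2 \times 2$ matrices in the remainder of this section. Matrices are added by adding corresponding entries:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} + \begin{pmatrix} a' & b' \\ c' & d' \end{pmatrix} = \begin{pmatrix} a + a' & b + b' \\ c + c' & d + d' \end{pmatrix}$$

The matrix

$$0 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

is called the zero matrix and behaves, under addition, like the number zero.

The multiplication of matrices is a little more difficult. First, let us recall that the dot product of two vectors $(a, b)$ and $(a', b')$ is

$$(a, b) \cdot (a', b') = aa' + bb'$$

that is, we multiply corresponding components and add. Now, suppose we want to multiply two matrices $A$ and $B$; we obtain the product $AB$ as follows:

The entry in the first row and first column of $AB$, that is, in this position

$$\begin{pmatrix} * & \cdot \\ \cdot & \cdot \end{pmatrix}$$

is equal to the dot product of the first row of $A$ by the first column of $B$. The entry in the first row and second column of $AB$, in other words, this position

$$\begin{pmatrix} \cdot & * \\ \cdot & \cdot \end{pmatrix}$$

is equal to the dot product of the first row of $A$ by the second column of $B$. And so on. For example,

$$\begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 2 & 0 \end{pmatrix} = \begin{pmatrix} 1 \cdot 1 + 2 \cdot 2 & 1 \cdot 1 + 2 \cdot 0 \\ 3 \cdot 1 + 0 \cdot 2 & 3 \cdot 1 + 0 \cdot 0 \end{pmatrix}$$

So finally,

$$\begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 2 & 0 \end{pmatrix} = \begin{pmatrix} 5 & 1 \\ 3 & 3 \end{pmatrix}$$

The rules of algebra for matrices are very different from the rules of "conventional" algebra. For instance, the commutative law of multiplication, $AB = BA$, is not true. Here is a simple example:

[displayed example illegible in the source: matrices $A$ and $B$ with $AB \neq BA$]

If $A$ is a real number and $A^2 = 0$, then necessarily $A = 0$; but this is not true of matrices. For example,

$$\begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

that is, $A^2 = 0$ although $A \neq 0$.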
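The multiplication rule and the two surprises noted so far (the order of the factors matters, and a nonzero matrix can have a zero square) can be checked mechanically. The following Python sketch is not part of the original text: the helper names `dot` and `matmul2` and the sample matrices `A`, `B`, and `N` are illustrative choices made here, and any other matrices with the same properties would serve equally well.

```python
# Illustrative sketch: a 2x2 matrix is stored as a nested list [[a, b], [c, d]].
# The names dot, matmul2, A, B and N are assumptions made for this example.

def dot(u, v):
    """Dot product: multiply corresponding components and add."""
    return sum(x * y for x, y in zip(u, v))

def matmul2(P, Q):
    """Entry (i, j) of PQ is the dot product of row i of P with column j of Q."""
    cols = list(zip(*Q))  # the columns of Q, read off as tuples
    return [[dot(row, col) for col in cols] for row in P]

A = [[1, 2], [3, 0]]
B = [[1, 1], [2, 0]]
N = [[1, -1], [1, -1]]  # a nonzero matrix

AB = matmul2(A, B)
BA = matmul2(B, A)
NN = matmul2(N, N)

print(AB)  # [[5, 1], [3, 3]]
print(BA)  # [[4, 2], [2, 4]]
print(NN)  # [[0, 0], [0, 0]]
```

Since `AB` and `BA` differ, this particular pair already shows that the commutative law fails, and `NN` shows a nonzero matrix whose square is the zero matrix.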
In the algebra of numbers, if AB = AC where A ≠ 0, we may cancel A and conclude that B = C. In matrix algebra we cannot. For example, if

\[ A = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad C = \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix} \]

then

\[ AB = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix} = AC \]

that is, AB = AC, A ≠ 0, yet B ≠ C.

The identity matrix

\[ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]

corresponds in matrix multiplication to the number 1; for we have AI = IA = A for every 2 × 2 matrix A. If A is a number and $A^2 = 1$, we conclude that A = ±1. Matrices do not obey this rule. For example,

\[ \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]

that is, $A^2 = I$, and yet A is neither I nor −I.

No more will be said about the algebra of matrices at this point, except that we must be aware, once again, that it is a new game whose rules are quite different from those we apply in conventional algebra.

Boolean Algebra

An even more bizarre kind of algebra was developed in the mid-nineteenth century by an Englishman named George Boole. This algebra, subsequently named boolean algebra after its inventor, has a myriad of applications today. It is formally the same as the algebra of sets.

If S is a set, we may consider union and intersection to be operations on the subsets of S. Let us agree provisionally to write

A + B for A ∪ B and AB for A ∩ B

(This convention is not unusual.) Then,

A + B = B + A
AB = BA
A(B + C) = AB + AC
A + ∅ = A
A · ∅ = ∅

and so on. These identities are analogous to the ones we use in elementary algebra. But the following identities are also true, and they have no counterpart in conventional algebra:

A + (B · C) = (A + B)(A + C)
A + A = A
AA = A
(A + B) · A = A
(AB) + A = A

and so on.

This unusual algebra has become a familiar tool for people who work with electrical networks, computer systems, codes, and so on. It is as different from the algebra of numbers as it is from the algebra of matrices.

Other exotic algebras arose in a variety of contexts, often in connection with scientific problems. There were "complex" and "hypercomplex" algebras, algebras of vectors and tensors, and many others.
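The boolean identities listed above can be verified exhaustively on a small set. The sketch below is only an illustration, assuming a three-element set S chosen here and Python's `|` and `&` standing for + (union) and · (intersection); the names `SUBSETS` and `holds_for_all` are hypothetical.

```python
from itertools import combinations, product

# Hypothetical small universe; every subset of S is tested.
S = (1, 2, 3)
SUBSETS = [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]


def holds_for_all(identity, arity):
    """Return True if identity(*args) holds for every choice of subsets of S."""
    return all(identity(*args) for args in product(SUBSETS, repeat=arity))


# A + (B . C) = (A + B)(A + C)
dual_distributive = holds_for_all(
    lambda A, B, C: A | (B & C) == (A | B) & (A | C), 3
)
# A + A = A and AA = A
idempotent = holds_for_all(lambda A: (A | A == A) and (A & A == A), 1)
# (A + B) . A = A and (AB) + A = A
absorption = holds_for_all(
    lambda A, B: ((A | B) & A == A) and ((A & B) | A == A), 2
)

# Read as ordinary arithmetic, the first identity fails, e.g. for 2, 3, 4.
numeric_counterexample = 2 + 3 * 4 != (2 + 3) * (2 + 4)

print(dual_distributive, idempotent, absorption, numeric_counterexample)
```

A check on one small set is evidence, not a proof; the identities hold for subsets of any set S.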
Today it is estimated that over 200 different kinds of algebraic systems have been studied, each of which arose in connection with some application or specific need.

Algebraic Structures

As legions of new algebras began to occupy the attention of mathematicians, the awareness grew that algebra can no longer be conceived merely as the science of solving equations. It had to be viewed much more broadly as a branch of mathematics capable of revealing general principles which apply equally to all known and all possible algebras.

What is it that all algebras have in common? What trait do they share which lets us refer to all of them as "algebras"? In the most general sense, every algebra consists of a set (a set of numbers, a set of matrices, a set of switching components, or any other kind of set) and certain operations on that set. An operation is simply a way of combining any two members of a set to produce a unique third member of the same set.

Thus, we are led to the modern notion of algebraic structure. An algebraic structure is understood to be an arbitrary set, with one or more operations defined on it. And algebra, then, is defined to be the study of algebraic structures.

It is important that we be awakened to the full generality of the notion of algebraic structure. We must make an effort to discard all our preconceived notions of what an algebra is, and look at this new notion of algebraic structure in its naked simplicity. Any set, with a rule (or rules) for combining its elements, is already an algebraic structure. There does not need to be any connection with known mathematics. For example, consider the set of all colors (pure colors as well as color combinations), and the operation of mixing any two colors to produce a new color. This may be conceived as an algebraic structure. It obeys certain rules, such as the commutative law (mixing red and blue is the same as mixing blue and red).
In a similar vein, consider the set of all musical sounds with the operation of combining any two sounds to produce a new (harmonious or disharmonious) combination. As another example, imagine that the guests at a family reunion have made up a rule for picking the closest common relative of any two persons present at the reunion (and suppose that, for any two people at the reunion, their closest common relative is also present at the reunion). This, too, is an algebraic structure: we have a set (namely the set of persons at the reunion) and an operation on that set (namely the "closest common relative" operation).

As the general notion of algebraic structure became more familiar (it was not fully accepted until the early part of the twentieth century), it was bound to have a profound influence on what mathematicians perceived algebra to be. In the end it became clear that the purpose of algebra is to study algebraic structures, and nothing less than that. Ideally it should aim to be a general science of algebraic structures whose results should have applications to particular cases, thereby making contact with the older parts of algebra. Before we take a closer look at this program, we must briefly examine another aspect of modern mathematics, namely, the increasing use of the axiomatic method.

AXIOMS

The axiomatic method is beyond doubt the most remarkable invention of antiquity, and in a sense the most puzzling. It appeared suddenly in Greek geometry in a highly developed form: already sophisticated, elegant, and thoroughly modern in style. Nothing seems to have foreshadowed it and it was unknown to ancient mathematicians before the Greeks. It appears for the first time in the light of history in the great textbook of early geometry, Euclid's Elements. Its origins (the first tentative experiments in formal deductive reasoning which must have preceded it) remain steeped in mystery. Euclid's Elements embodies the axiomatic method in its purest form.
This amazing book contains 465 geometric propositions, some fairly simple, some of astounding complexity. What is really remarkable, though, is that the 465 propositions, forming the largest body of scientific knowledge in the ancient world, are derived logically from only 10 premises which would pass as trivial observations of common sense. Typical of the premises are the following:

Things equal to the same thing are equal to each other.
The whole is greater than the part.
A straight line can be drawn through any two points.
All right angles are equal.

So great was the impression made by Euclid's Elements on following generations that it became the model of correct mathematical form and remains so to this day.

It would be wrong to believe there was no notion of demonstrative mathematics before the time of Euclid. There is evidence that the earliest geometers of the ancient Middle East used reasoning to discover geometric principles. They found proofs and must have hit upon many of the same proofs we find in Euclid. The difference is that Egyptian and Babylonian mathematicians considered logical demonstration to be an auxiliary process, like the preliminary sketch made by artists: a private mental process which guided them to a result but did not deserve to be recorded. Such an attitude shows little understanding of the true nature of geometry and does not contain the seeds of the axiomatic method.

It is also known today that many, maybe most, of the geometric theorems in Euclid's Elements came from more ancient times, and were probably borrowed by Euclid from Egyptian and Babylonian sources. However, this does not detract from the greatness of his work. Important as are the contents of the Elements, what has proved far more important for posterity is the formal manner in which Euclid presented these contents.
The heart of the matter was the way he organized geometric facts: he arranged them into a logical sequence where each theorem builds on preceding theorems and then forms the logical basis for other theorems.

(We must carefully note that the axiomatic method is not a way of discovering facts but of organizing them. New facts in mathematics are found, as often as not, by inspired guesses or experienced intuition. To be accepted, however, they should be supported by proof in an axiomatic system.)

Euclid's Elements has stood throughout the ages as the model of organized, rational thought carried to its ultimate perfection. Mathematicians and philosophers in every generation have tried to imitate its lucid perfection and flawless simplicity. Descartes and Leibniz dreamed of organizing all human knowledge into an axiomatic system, and Spinoza created a deductive system of ethics patterned after Euclid's geometry. While many of these dreams have proved to be impractical, the method popularized by Euclid has become the prototype of modern mathematical form. Since the middle of the nineteenth century, the axiomatic method has been accepted as the only correct way of organizing mathematical knowledge.

To perceive why the axiomatic method is truly central to mathematics, we must keep one thing in mind: mathematics by its nature is essentially abstract. For example, in geometry straight lines are not stretched threads, but a concept obtained by disregarding all the properties of stretched threads except that of extending in one direction. Similarly, the concept of a geometric figure is the result of idealizing from all the properties of actual objects and retaining only their spatial relationships. Now, since the objects of mathematics are abstractions, it stands to reason that we must acquire knowledge about them by logic and not by observation or experiment (for how can one experiment with an abstract thought?).
This remark applies very aptly to modern algebra. The notion of algebraic structure is obtained by idealizing from all particular, concrete systems of algebra. We choose to ignore the properties of the actual objects in a system of algebra (they may be numbers, or matrices, or whatever; we disregard what they are), and we turn our attention simply to the way they combine under the given operations. In fact, just as we disregard what the objects in a system are, we also disregard what the operations do to them. We retain only the equations and inequalities which hold in the system, for only these are relevant to algebra. Everything else may be discarded. Finally, equations and inequalities may be deduced from one another logically, just as spatial relationships are deduced from each other in geometry.

THE AXIOMATICS OF ALGEBRA

Let us remember that in the mid-nineteenth century, when eccentric new algebras seemed to show up at every turn in mathematical research, it was finally understood that sacrosanct laws such as the identities ab = ba and a(bc) = (ab)c are not inviolable, for there are algebras in which they do not hold. By varying or deleting some of these identities, or by replacing them by new ones, an enormous variety of new systems can be created.

Most importantly, mathematicians slowly discovered that all the algebraic laws which hold in any system can be derived from a few simple, basic ones. This is a genuinely remarkable fact, for it parallels the discovery made by Euclid that a few very simple geometric postulates are sufficient to prove all the theorems of geometry. As it turns out, then, we have the same phenomenon in algebra: a few simple algebraic equations offer themselves naturally as axioms, and from them all other facts may be proved.

These basic algebraic laws are familiar to most high school students today. We list them here for reference.
We assume that A is any set and there is an operation on A which we designate with the symbol *.

a * b = b * a    (1)

If Equation (1) is true for any two elements a and b in A, we say that the operation * is commutative. What it means, of course, is that the value of a * b (or b * a) is independent of the order in which a and b are taken.

a * (b * c) = (a * b) * c    (2)

If Equation (2) is true for any three elements a, b, and c in A, we say the operation * is associative. Remember that an operation is a rule for combining any two elements, so if we want to combine three elements, we can do so in different ways. If we want to combine a, b, and c without changing their order, we may either combine a with the result of combining b and c, which produces a * (b * c); or we may first combine a with b, and then combine the result with c, producing (a * b) * c. The associative law asserts that these two possible ways of combining three elements (without changing their order) yield the same result.

There exists an element e in A such that

e * a = a and a * e = a for every a in A    (3)

If such an element e exists in A, we call it an identity element for the operation *. An identity element is sometimes called a "neutral" element, for it may be combined with any element a without altering a. For example, 0 is an identity element for addition, and 1 is an identity element for multiplication.

For every element a in A, there is an element $a^{-1}$ ("a inverse") in A such that

$a * a^{-1} = e$ and $a^{-1} * a = e$    (4)

If statement (4) is true in a system of algebra, we say that every element has an inverse with respect to the operation *. The meaning of the inverse should be clear: the combination of any element with its inverse produces the neutral element (one might roughly say that the inverse of a "neutralizes" a). For example, if A is a set of numbers and the operation is addition, then the inverse of any number a is (−a); if the operation is multiplication, the inverse of any a ≠ 0 is 1/a.
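On a finite set, laws (1) through (4) can be tested by simply trying every combination of elements. The sketch below assumes the set {0, 1, 2, 3, 4} with arithmetic modulo 5 as sample operations; the set, the operations, and all function names are choices made for this illustration, not part of the text.

```python
from itertools import product


def is_commutative(A, op):
    """Law (1): a * b == b * a for all a, b in A."""
    return all(op(a, b) == op(b, a) for a, b in product(A, repeat=2))


def is_associative(A, op):
    """Law (2): a * (b * c) == (a * b) * c for all a, b, c in A."""
    return all(
        op(a, op(b, c)) == op(op(a, b), c) for a, b, c in product(A, repeat=3)
    )


def find_identity(A, op):
    """Law (3): return an element e with e * a == a == a * e, or None."""
    for e in A:
        if all(op(e, a) == a and op(a, e) == a for a in A):
            return e
    return None


def has_inverses(A, op):
    """Law (4): every a has some x with a * x == e == x * a."""
    e = find_identity(A, op)
    if e is None:
        return False
    return all(any(op(a, x) == e and op(x, a) == e for x in A) for a in A)


A = [0, 1, 2, 3, 4]  # sample set, assumed for this sketch


def add_mod5(a, b):
    return (a + b) % 5


def sub_mod5(a, b):
    return (a - b) % 5


def mul_mod5(a, b):
    return (a * b) % 5


for name, op in [("add", add_mod5), ("sub", sub_mod5), ("mul", mul_mod5)]:
    print(
        name,
        is_commutative(A, op),
        is_associative(A, op),
        find_identity(A, op),
        has_inverses(A, op),
    )
```

Under these assumptions, multiplication has the identity 1 but fails law (4) because 0 has no inverse, which mirrors the restriction a ≠ 0 in the text.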
Let us assume now that the same set A has a second operation, symbolized by ⊥, as well as the operation *:

a * (b ⊥ c) = (a * b) ⊥ (a * c)    (5)

If Equation (5) holds for any three elements a, b, and c in A, we say that * is distributive over ⊥. If there are two operations in a system, they must interact in some way; otherwise there would be no need to consider them together. The distributive law is the most common way (but not the only possible one) for two operations to be related to one another.

There are other "basic" laws besides the five we have just seen, but these are the most common ones. The most important algebraic systems have axioms chosen from among them. For example, when a mathematician nowadays speaks of a ring, the mathematician is referring to a set A with two operations, usually symbolized by + and ·, having the following axioms:

Addition is commutative and associative, it has a neutral element commonly symbolized by 0, and every element a has an inverse −a with respect to addition. Multiplication is associative, has a neutral element 1, and is distributive over addition.

Matrix algebra is a particular example of a ring, and all the laws of matrix algebra may be proved from the preceding axioms. However, there are many other examples of rings: rings of numbers, rings of functions, rings of code "words," rings of switching components, and a great many more. Every algebraic law which can be proved in a ring (from the preceding axioms) is true in every example of a ring. In other words, instead of proving the same formula repeatedly (once for numbers, once for matrices, once for switching components, and so on) it is sufficient nowadays to prove only that the formula holds in rings, and then of necessity it will be true in all the hundreds of different concrete examples of rings.

By varying the possible choices of axioms, we can keep creating new axiomatic systems of algebra endlessly.
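The distributive law (5) is not symmetric in the two operations, and a small finite system makes this visible. The following sketch assumes the integers modulo 6 as the sample system (a choice made here, not in the text); `distributes_over` is a hypothetical helper name.

```python
from itertools import product

N = 6            # hypothetical modulus chosen for the illustration
A = range(N)     # the set {0, 1, ..., 5}


def add(a, b):
    return (a + b) % N


def mul(a, b):
    return (a * b) % N


def distributes_over(star, perp):
    """Law (5): a star (b perp c) == (a star b) perp (a star c) for all a, b, c."""
    return all(
        star(a, perp(b, c)) == perp(star(a, b), star(a, c))
        for a, b, c in product(A, repeat=3)
    )


mul_over_add = distributes_over(mul, add)   # a(b + c) = ab + ac
add_over_mul = distributes_over(add, mul)   # a + bc = (a + b)(a + c) ?

print("multiplication distributes over addition:", mul_over_add)
print("addition distributes over multiplication:", add_over_mul)
```

In this system multiplication distributes over addition, as the ring axioms require, while addition does not distribute over multiplication (take a = 1, b = 1, c = 0).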
We may well ask: is it legitimate to study any axiomatic system, with any choice of axioms, regardless of usefulness, relevance, or applicability? There are "radicals" in mathematics who claim the freedom for mathematicians to study any system they wish, without the need to justify it. However, the practice in established mathematics is more conservative: particular axiomatic systems are investigated on account of their relevance to new and traditional problems and other parts of mathematics, or because they correspond to particular applications.

In practice, how is a particular choice of algebraic axioms made? Very simply: when mathematicians look at different parts of algebra and notice that a common pattern of proofs keeps recurring, and essentially the same assumptions need to be made each time, they find it natural to single out this choice of assumptions as the axioms for a new system. All the important new systems of algebra were created in this fashion.

ABSTRACTION REVISITED

Another important aspect of axiomatic mathematics is this: when we capture mathematical facts in an axiomatic system, we never try to reproduce the facts in full, but only that side of them which is important or relevant in a particular context. This process of selecting what is relevant and disregarding everything else is the very essence of abstraction.

This kind of abstraction is so natural to us as human beings that we practice it all the time without being aware of doing so. Like the Bourgeois Gentleman in Moliere's play who was amazed to learn that he spoke in prose, some of us may be surprised to discover how much we think in abstractions. Nature presents us with a myriad of interwoven facts and sensations, and we are challenged at every instant to single out those which are immediately relevant and discard the rest. In order to make our surroundings comprehensible, we must continually pick out certain data and separate them from everything else.
For natural scientists, this process is the very core and essence of what they do. Nature is not made up of forces, velocities, and moments of inertia. Nature is a whole; nature simply is! The physicist isolates certain aspects of nature from the rest and finds the laws which govern these abstractions.

It is the same with mathematics. For example, the system of the integers (whole numbers), as known by our intuition, is a complex reality with many facets. The mathematician separates these facets from one another and studies them individually. From one point of view the set of the integers, with addition and multiplication, forms a ring (that is, it satisfies the axioms stated previously). From another point of view it is an ordered set, and satisfies special axioms of ordering. On a different level, the positive integers form the basis of "recursion theory," which singles out the particular way positive integers may be constructed, beginning with 1 and adding 1 each time.

It therefore happens that the traditional subdivision of mathematics into subject matters has been radically altered. No longer are the integers one subject, complex numbers another, matrices another, and so on; instead, particular aspects of these systems are isolated, put in axiomatic form, and studied abstractly without reference to any specific objects. The other side of the coin is that each aspect is shared by many of the traditional systems: for example, algebraically the integers form a ring, and so do the complex numbers, matrices, and many other kinds of objects.

There is nothing intrinsically new about this process of divorcing properties from the actual objects having the properties; as we have seen, it is precisely what geometry has done for more than 2000 years. Somehow, it took longer for this process to take hold in algebra.

The movement toward axiomatics and abstraction in modern algebra began about the 1830s and was completed 100 years later.
The movement was tentative at first, not quite conscious of its aims, but it gained momentum as it converged with similar trends in other parts of mathematics. The thinking of many great mathematicians played a decisive role, but none left a deeper or longer lasting impression than a very young Frenchman by the name of Evariste Galois.

The story of Evariste Galois is probably the most fantastic and tragic in the history of mathematics. A sensitive and prodigiously gifted young man, he was killed in a duel at the age of 20, ending a life which in its brief span had offered him nothing but tragedy and frustration. When he was only a youth his father committed suicide, and Galois was left to fend for himself in the labyrinthine world of French university life and student politics. He was twice refused admittance to the Ecole Polytechnique, the most prestigious scientific establishment of its day, probably because his answers to the entrance examination were too original and unorthodox. When he presented an early version of his important discoveries in algebra to the great academician Cauchy, this gentleman did not read the young student's paper, but lost it. Later, Galois gave his results to Fourier in the hope of winning the mathematics prize of the Academy of Sciences. But Fourier died, and that paper, too, was lost. Another paper submitted to Poisson was eventually returned because Poisson did not have the interest to read it through.

Galois finally gained admittance to the Ecole Normale, another focal point of research in mathematics, but he was soon expelled for writing an essay which attacked the king. He was jailed twice for political agitation in the student world of Paris. In the midst of such a turbulent life, it is hard to believe that Galois found time to create his colossally original theories on algebra.

What Galois did was to tie in the problem of finding the roots of equations with new discoveries on groups of permutations.
He explained exactly which equations of degree 5 or higher have solutions of the traditional kind, and which others do not. Along the way, he introduced some amazingly original and powerful concepts, which form the framework of much algebraic thinking to this day. Although Galois did not work explicitly in axiomatic algebra (which was unknown in his day), the abstract notion of algebraic structure is clearly prefigured in his work.

In 1832, when Galois was only 20 years old, he was challenged to a duel. What argument led to the challenge is not clear: some say the issue was political, while others maintain the duel was fought over a fickle lady's wavering love. The truth may never be known, but the turbulent, brilliant, and idealistic Galois died of his wounds. Fortunately for mathematics, the night before the duel he wrote down his main mathematical results and entrusted them to a friend. This time, they weren't lost, but they were only published 15 years after his death. The mathematical world was not ready for them before then!

Algebra today is organized axiomatically, and as such it is abstract. Mathematicians study algebraic structures from a general point of view, compare different structures, and find relationships between them. This abstraction and generalization might appear to be hopelessly impractical, but it is not! The general approach in algebra has produced powerful new methods for "algebraizing" different parts of mathematics and science, formulating problems which could never have been formulated before, and finding entirely new kinds of solutions.

Such excursions into pure mathematical fancy have an odd way of running ahead of physical science, providing a theoretical framework to account for facts even before those facts are fully known. This pattern is so characteristic that many mathematicians see themselves as pioneers in a world of possibilities rather than facts.
Mathematicians study structure independently of content, and their science is a voyage of exploration through all the kinds of structure and order which the human mind is capable of discerning.

CHAPTER TWO
OPERATIONS

Addition, subtraction, multiplication, division: these and many others are familiar examples of operations on appropriate sets of numbers.

Intuitively, an operation on a set A is a way of combining any two elements of A to produce another element in the same set A. Every operation is denoted by a symbol, such as +, ×, or ÷. In this book we will look at operations from a lofty perspective; we will discover facts pertaining to operations generally rather than to specific operations on specific sets. Accordingly, we will sometimes make up operation symbols such as * and □ to refer to arbitrary operations on arbitrary sets.

Let us now define formally what we mean by an operation on set A. Let A be any set:

An operation * on A is a rule which assigns to each ordered pair (a, b) of elements of A exactly one element a * b in A.

There are three aspects of this definition which need to be stressed:

1. a * b is defined for every ordered pair (a, b) of elements of A. There are many rules which look deceptively like operations but are not, because this condition fails. Often a * b is defined for all the obvious choices of a and b, but remains undefined in a few exceptional cases. For example, division does not qualify as an operation on the set R of the real numbers, for there are ordered pairs such as (3, 0) whose quotient 3/0 is undefined. In order to be an operation on R, division would have to associate a real number a/b with every ordered pair (a, b) of elements of R. No exceptions allowed!

2. a * b must be uniquely defined. In other words, the value of a * b must be given unambiguously. For example, one might attempt to define an operation □ on the set R of the real numbers by letting a □ b be the number whose square is ab.
Obviously this is ambiguous because 2 □ 8, let us say, may be either 4 or −4. Thus, □ does not qualify as an operation on R!

3. If a and b are in A, a * b must be in A. This condition is often expressed by saying that A is closed under the operation *. If we propose to define an operation * on a set A, we must take care that *, when applied to elements of A, does not take us out of A. For example, division cannot be regarded as an operation on the set of the integers, for there are pairs of integers such as (3, 4) whose quotient 3/4 is not an integer. On the other hand, division does qualify as an operation on the set of all the positive real numbers, for the quotient of any two positive real numbers is a uniquely determined positive real number.

An operation is any rule which assigns to each ordered pair of elements of A a unique element in A. Therefore it is obvious that there are, in general, many possible operations on a given set A. If, for example, A is a set consisting of just two distinct elements, say a and b, each operation on A may be described by a table such as this one:

(x, y)    x * y
(a, a)
(a, b)
(b, a)
(b, b)

In the left column are listed the four possible ordered pairs of elements of A, and to the right of each pair (x, y) is the value of x * y. Here are a few of the possible operations:

(a, a)  a      (a, a)  a      (a, a)  b      (a, a)  b
(a, b)  a      (a, b)  b      (a, b)  a      (a, b)  b
(b, a)  a      (b, a)  a      (b, a)  b      (b, a)  b
(b, b)  a      (b, b)  b      (b, b)  a      (b, b)  a

Each of these tables describes a different operation on A. Each table has four rows, and each row may be filled with either an a or a b; hence there are 16 possible ways of filling the table, corresponding to 16 possible operations on the set A.

We have already seen that any operation on a set A comes with certain "options." An operation * may be commutative, that is, it may satisfy

a * b = b * a    (1)

for any two elements a and b in A.
It may be associative, that is, it may satisfy the equation

(a * b) * c = a * (b * c)    (2)

for any three elements a, b, and c in A.

To understand the importance of the associative law, we must remember that an operation is a way of combining two elements; so if we want to combine three elements, we can do so in different ways. If we want to combine a, b, and c without changing their order, we may either combine a with the result of combining b and c, which produces a * (b * c); or we may first combine a with b, and then combine the result with c, producing (a * b) * c. The associative law asserts that these two possible ways of combining three elements (without changing their order) produce the same result.

For example, the addition of real numbers is associative because a + (b + c) = (a + b) + c. However, division of real numbers is not associative: for instance, 3/(4/5) is 15/4, whereas (3/4)/5 is 3/20.

If there is an element e in A with the property that

e * a = a and a * e = a for every element a in A    (3)

then e is called an identity or "neutral" element with respect to the operation *. Roughly speaking, Equation (3) tells us that when e is combined with any element a, it does not change a. For example, in the set R of the real numbers, 0 is a neutral element for addition, and 1 is a neutral element for multiplication.

If a is any element of A, and x is an element of A such that

a * x = e and x * a = e    (4)

then x is called an inverse of a. Roughly speaking, Equation (4) tells us that when an element is combined with its inverse it produces the neutral element. For example, in the set R of the real numbers, −a is the inverse of a with respect to addition; if a ≠ 0, then 1/a is the inverse of a with respect to multiplication.
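Two of the concrete claims made so far lend themselves to a quick mechanical check: the division example showing that associativity can fail, and the earlier count of 16 operations on a two-element set. The sketch below is an illustration only; it assumes exact rational arithmetic via Python's `fractions` module, and represents each operation table as a dictionary from ordered pairs to values (a representation chosen here, not taken from the text).

```python
from fractions import Fraction
from itertools import product

# Division is not associative: compare 3/(4/5) with (3/4)/5 exactly.
three, four, five = Fraction(3), Fraction(4), Fraction(5)
left = three / (four / five)      # a / (b / c)
right = (three / four) / five     # (a / b) / c
print(left, right, left == right)

# Every operation on {a, b} is one way of filling a four-row table.
elements = ("a", "b")
pairs = list(product(elements, repeat=2))          # (a,a), (a,b), (b,a), (b,b)
tables = [
    dict(zip(pairs, values))
    for values in product(elements, repeat=len(pairs))
]
print(len(tables))
```

Each dictionary in `tables` plays the role of one filled-in table of x * y values; listing them all is the enumeration the text describes.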
The inverse of a is often denoted by the symbol $a^{-1}$. (The symbol $a^{-1}$ is usually pronounced "a inverse.")

EXERCISES

Throughout this book, the exercises are grouped into exercise sets, each set being identified by a letter A, B, C, etc., and headed by a descriptive title. Each exercise set contains six to ten exercises, numbered consecutively. Generally, the exercises in each set are independent of each other and may be done separately. However, when the exercises in a set are related, with some exercises building on preceding ones so that they must be done in sequence, this is indicated with a symbol † in the margin to the left of the heading.

The symbol # next to an exercise number indicates that a partial solution to that exercise is given in the Answers section at the end of the book.

A. Examples of Operations

Which of the following rules are operations on the indicated set? (Z designates the set of the integers, Q the rational numbers, and R the real numbers.) For each rule which is not an operation, explain why it is not.

Example  $a * b = \dfrac{a + b}{ab}$, on the set Z.

Solution  This is not an operation on Z. There are integers a and b such that (a + b)/ab is not an integer. (For example, $\dfrac{2 + 3}{2 \cdot 3} = \dfrac{5}{6}$ is not an integer.) Thus, Z is not closed under *.

1 $a * b = \sqrt{|ab|}$, on the set Q.
2 a * b = a ln b, on the set {x ∈ R : x > 0}.
# 3 a * b is a root of the equation $x^2 - a^2 b^2 = 0$, on the set R.
4 Subtraction, on the set Z.
5 Subtraction, on the set {n ∈ Z : n ≥ 0}.
6 a * b = |a − b|, on the set {n ∈ Z : n ≥ 0}.

B. Properties of Operations

Each of the following is an operation * on R. Indicate whether or not
(i) it is commutative,
(ii) it is associative,
(iii) R has an identity element with respect to *,
(iv) every x ∈ R has an inverse with respect to *.

Instructions  For (i), compute x * y and y * x, and verify whether or not they are equal. For (ii), compute x * (y * z) and (x * y) * z, and verify whether or not they are equal.
For (iii), first solve the equation x * e = x for e; if the equation cannot be solved, there is no identity element. If it can be solved, it is still necessary to check that e * x = x * e = x for any x ∈ R. If it checks, then e is an identity element. For (iv), first note that if there is no identity element, there can be no inverses. If there is an identity element e, first solve the equation x * x' = e for x'; if the equation cannot be solved, x does not have an inverse. If it can be solved, check to make sure that x * x' = x' * x = e. If this checks, x' is the inverse of x.

Example  x * y = x + y + 1

Commutative: Yes ☒ No □    Associative: Yes ☒ No □    Identity: Yes ☒ No □    Inverses: Yes ☒ No □

(i) x * y = x + y + 1; y * x = y + x + 1 = x + y + 1. (Thus, * is commutative.)
(ii) x * (y * z) = x * (y + z + 1) = x + (y + z + 1) + 1 = x + y + z + 2.
(x * y) * z = (x + y + 1) * z = (x + y + 1) + z + 1 = x + y + z + 2. (* is associative.)
(iii) Solve x * e = x for e: x * e = x + e + 1 = x; therefore, e = −1. Check: x * (−1) = x + (−1) + 1 = x; (−1) * x = (−1) + x + 1 = x. Therefore, −1 is the identity element. (* has an identity element.)
(iv) Solve x * x' = −1 for x': x * x' = x + x' + 1 = −1; therefore x' = −x − 2. Check: x * (−x − 2) = x + (−x − 2) + 1 = −1; (−x − 2) * x = (−x − 2) + x + 1 = −1. Therefore, −x − 2 is the inverse of x. (Every element has an inverse.)

1 x * y = x + 2y + 4

Commutative: Yes □ No □    Associative: Yes □ No □    Identity: Yes □ No □    Inverses: Yes □ No □

(i) x * y = x + 2y + 4; y * x =
(ii) x * (y * z) = x * (      ) =
(x * y) * z = (      ) * z =
(iii) Solve x * e = x for e. Check.
(iv) Solve x * x' = e for x'. Check.
2 x * y = x + 2y - xy

Commutative Yes □ No □   Associative Yes □ No □   Identity Yes □ No □   Inverses Yes □ No □

3 x * y = |x + y|

Commutative Yes □ No □   Associative Yes □ No □   Identity Yes □ No □   Inverses Yes □ No □

4 x * y = |x - y|

Commutative Yes □ No □   Associative Yes □ No □   Identity Yes □ No □   Inverses Yes □ No □

5 x * y = xy + 1

Commutative Yes □ No □   Associative Yes □ No □   Identity Yes □ No □   Inverses Yes □ No □

6 x * y = max {x, y} = the larger of the two numbers x and y

Commutative Yes □ No □   Associative Yes □ No □   Identity Yes □ No □   Inverses Yes □ No □

# 7 x * y = xy/(x + y + 1) (on the set of positive real numbers)

Commutative Yes □ No □   Associative Yes □ No □   Identity Yes □ No □   Inverses Yes □ No □

C. Operations on a Two-Element Set

Let A be the two-element set A = {a, b}.

1 Write the tables of all 16 operations on A. (Use the format explained on page 20.) Label these operations O₁ to O₁₆. Then:
2 Identify which of the operations O₁ to O₁₆ are commutative.
3 Identify which operations, among O₁ to O₁₆, are associative.
4 For which of the operations O₁ to O₁₆ is there an identity element?
5 For which of the operations O₁ to O₁₆ does every element have an inverse?

D. Automata: The Algebra of Input/Output Sequences

Digital computers and related machines process information which is received in the form of input sequences. An input sequence is a finite sequence of symbols from some alphabet A. For instance, if A = {0, 1} (that is, if the alphabet consists of only the two symbols 0 and 1), then examples of input sequences are 011010 and 10101111. If A = {a, b, c}, then examples of input sequences are babbcac and cccabaa. Output sequences are defined in the same way as input sequences. The set of all sequences of symbols in the alphabet A is denoted by A*.

There is an operation on A* called concatenation: If a and b are in A*, say a = a₁a₂ ... aₙ and b = b₁b₂ ... bₘ, then ab = a₁a₂ ... aₙb₁b₂ ... bₘ. In other words, the sequence ab consists of the two sequences a and b end to end.
For example, in the alphabet A = {0, 1}, if a = 1001 and b = 010, then ab = 1001010. The symbol λ denotes the empty sequence.

1 Prove that the operation defined above is associative.
2 Explain why the operation is not commutative.
3 Prove that there is an identity element for this operation.

CHAPTER THREE
THE DEFINITION OF GROUPS

One of the simplest and most basic of all algebraic structures is the group. A group is defined to be a set with an operation (let us call it *) which is associative, has a neutral element, and for which each element has an inverse. More formally,

By a group we mean a set G with an operation * which satisfies the axioms:
(G1) * is associative.
(G2) There is an element e in G such that a * e = a and e * a = a for every element a in G.
(G3) For every element a in G, there is an element a⁻¹ in G such that a * a⁻¹ = e and a⁻¹ * a = e.

The group we have just defined may be represented by the symbol ⟨G, *⟩. This notation makes it explicit that the group consists of the set G and the operation *. (Remember that, in general, there are other possible operations on G, so it may not always be clear which is the group's operation unless we indicate it.) If there is no danger of confusion, we shall denote the group simply with the letter G.

The groups which come to mind most readily are found in our familiar number systems. Here are a few examples.

ℤ is the symbol customarily used to denote the set {..., -3, -2, -1, 0, 1, 2, 3, ...} of the integers. The set ℤ, with the operation of addition, is obviously a group. It is called the additive group of the integers and is represented by the symbol ⟨ℤ, +⟩. Mostly, we denote it simply by the symbol ℤ.

ℚ designates the set of the rational numbers (that is, quotients m/n of integers, where n ≠ 0). This set, with the operation of addition, is called the additive group of the rational numbers, ⟨ℚ, +⟩. Most often we denote it simply by ℚ.
The symbol ℝ represents the set of the real numbers. ℝ, with the operation of addition, is called the additive group of the real numbers, and is represented by ⟨ℝ, +⟩, or simply ℝ.

The set of all the nonzero rational numbers is represented by ℚ*. This set, with the operation of multiplication, is the group ⟨ℚ*, ·⟩, or simply ℚ*. Similarly, the set of all the nonzero real numbers is represented by ℝ*. The set ℝ*, with the operation of multiplication, is the group ⟨ℝ*, ·⟩, or simply ℝ*.

Finally, ℚ^pos denotes the group of all the positive rational numbers, with multiplication. ℝ^pos denotes the group of all the positive real numbers, with multiplication.

Groups occur abundantly in nature. This statement means that a great many of the algebraic structures which can be discerned in natural phenomena turn out to be groups. Typical examples, which we shall examine later, come up in connection with the structure of crystals, patterns of symmetry, and various kinds of geometric transformations. Groups are also important because they happen to be one of the fundamental building blocks out of which more complex algebraic structures are made.

Especially important in scientific applications are the finite groups, that is, groups with a finite number of elements. It is not surprising that such groups occur often in applications, for in most situations of the real world we deal with only a finite number of objects.

The easiest finite groups to study are those called the groups of integers modulo n (where n is any positive integer greater than 1). These groups will be described in a casual way here, and a rigorous treatment deferred until later.

Let us begin with a specific example, say, the group of integers modulo 6. This group consists of a set of six elements, {0, 1, 2, 3, 4, 5}, and an operation called addition modulo 6, which may be described as follows: Imagine the numbers 0 through 5 as being evenly distributed on the circumference of a circle.
To add two numbers h and k, start with h and move clockwise k additional units around the circle: h + k is where you end up. For example, 3 + 3 = 0, 3 + 5 = 2, and so on. The set {0, 1, 2, 3, 4, 5} with this operation is called the group of integers modulo 6, and is represented by the symbol ℤ₆.

In general, the group of integers modulo n consists of the set {0, 1, 2, ..., n - 1} with the operation of addition modulo n, which can be described exactly as previously. Imagine the numbers 0 through n - 1 to be points on the unit circle, each one separated from the next by an arc of length 2π/n. To add h and k, start with h and go clockwise through an arc of k times 2π/n. The sum h + k will, of course, be one of the numbers 0 through n - 1. From geometrical considerations it is clear that this kind of addition (by successive rotations on the unit circle) is associative. Zero is the neutral element of this group, and n - h is obviously the inverse of h [for h + (n - h) = n, which coincides with 0]. This group, the group of integers modulo n, is represented by the symbol ℤₙ.

Often when working with finite groups, it is useful to draw up an "operation table." For example, the operation table of ℤ₆ is

    + | 0  1  2  3  4  5
   ---+------------------
    0 | 0  1  2  3  4  5
    1 | 1  2  3  4  5  0
    2 | 2  3  4  5  0  1
    3 | 3  4  5  0  1  2
    4 | 4  5  0  1  2  3
    5 | 5  0  1  2  3  4

The basic format of this table is as follows:

    + | 0  1  2  3  4  5
   ---+------------------
    0 |
    1 |
    2 |
    3 |
    4 |
    5 |

with one row for each element of the group and one column for each element of the group. Then 3 + 4, for example, is located in the row of 3 and the column of 4. In general, any finite group ⟨G, *⟩ has a table

    * | ...  y  ...
   ---+-------------
    x | ... x * y ...

The entry in the row of x and the column of y is x * y.

Let us remember that the commutative law is not one of the axioms of group theory; hence the identity a * b = b * a is not true in every group. If the commutative law holds in a group G, such a group is called a commutative group or, more commonly, an abelian group.
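The description of addition modulo n, the operation table, and the commutative law can be checked mechanically for small n. The following is only a minimal sketch in Python, not part of the text: the function names (add_mod, operation_table, is_group, is_abelian) are illustrative assumptions, and the axiom check is a brute-force search that is practical only for small finite sets.

```python
from itertools import product


def add_mod(n):
    """Return addition modulo n as a function on {0, 1, ..., n - 1}."""
    return lambda h, k: (h + k) % n


def operation_table(elements, op):
    """Entry [i][j] is op(elements[i], elements[j]): row of x, column of y."""
    return [[op(x, y) for y in elements] for x in elements]


def is_group(elements, op):
    """Brute-force check of closure and of axioms (G1) to (G3) on a finite set."""
    elems = list(elements)
    if not all(op(x, y) in elems for x, y in product(elems, repeat=2)):
        return False  # not closed, so op is not an operation on the set
    associative = all(
        op(x, op(y, z)) == op(op(x, y), z)
        for x, y, z in product(elems, repeat=3)
    )
    identities = [
        e for e in elems if all(op(a, e) == a and op(e, a) == a for a in elems)
    ]
    if not (associative and identities):
        return False
    e = identities[0]
    # (G3): every element needs a two-sided inverse
    return all(any(op(a, b) == e and op(b, a) == e for b in elems) for a in elems)


def is_abelian(elements, op):
    """Check the commutative law x * y = y * x for every pair."""
    elems = list(elements)
    return all(op(x, y) == op(y, x) for x, y in product(elems, repeat=2))


Z6 = list(range(6))
plus6 = add_mod(6)
table = operation_table(Z6, plus6)
for row in table:
    print(" ".join(str(entry) for entry in row))
```

Here table[3][4] is the entry in the row of 3 and the column of 4. Passing a different operation to is_group, for instance subtraction modulo 6, shows how a single failed axiom (associativity, in that case) rules out a group.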
Abelian groups are named after the mathematician Niels Abel, who was mentioned in Chapter 1 and who was a pioneer in the study of groups. All the examples of groups mentioned up to now are abelian groups, but here is an example which is not.

Let G be the group which consists of the six matrices

    I = (  1   0 )    A = (  0   1 )    B = (  0   1 )
        (  0   1 )        (  1   0 )        ( -1  -1 )

    C = ( -1  -1 )    D = ( -1  -1 )    K = (  1   0 )
        (  0   1 )        (  1   0 )        ( -1  -1 )

with the operation of matrix multiplication which was explained on page 8. This group has the following operation table, which should be checked:

      | I  A  B  C  D  K
   ---+------------------
    I | I  A  B  C  D  K
    A | A  I  C  B  K  D
    B | B  K  D  A  I  C
    C | C  D  K  I  A  B
    D | D  C  I  K  B  A
    K | K  B  A  D  C  I

In linear algebra it is shown that the multiplication of matrices is associative. (The details are simple.) It is clear that I is the identity element of this group, and by looking at the table one can see that each of the six matrices in {I, A, B, C, D, K} has an inverse in {I, A, B, C, D, K}. (For example, B is the inverse of D, A is the inverse of A, and so on.) Thus, G is a group! Now we observe that AB = C and BA = K, so G is not commutative.

EXERCISES

A. Examples of Abelian Groups

Prove that each of the following sets, with the indicated operation, is an abelian group.

Instructions Proceed as in Chapter 2, Exercise B.

1 x * y = x + y + k (k a fixed constant), on the set ℝ of the real numbers.
2 x * y = xy/2, on the set {x ∈ ℝ : x ≠ 0}.
3 x * y = x + y + xy, on the set {x ∈ ℝ : x ≠ -1}.
# 4 x * y = (x + y)/(xy + 1), on the set {x ∈ ℝ : -1 < x < 1}.

[...]

2 (a, b) * (c, d) = (ac, bc + d), on the set {(x, y) ∈ ℝ × ℝ : x ≠ 0}.
3 Same operation as in part 2, but on the set ℝ × ℝ.
4 (a, b) * (c, d) = (ac - bd, ad + bc), on the set ℝ × ℝ with the origin deleted.
5 Consider the operation of the preceding problem on the set ℝ × ℝ. Is this a group? Explain.

C. Groups of Subsets of a Set

If A and B are any two sets, their symmetric difference is the set A + B defined as follows:

    A + B = (A - B) ∪ (B - A)

Note: A - B represents the set obtained by removing from A all the elements which are in B.
[Figure: Venn diagram of two sets A and B. The shaded area is A + B.]

It is perfectly clear that A + B = B + A; hence this operation is commutative. It is also associative, as the accompanying pictorial representation suggests: Let the union of A, B, and C be divided into seven regions as illustrated.

[Figure: Venn diagram of three sets A, B, and C, with the seven regions numbered 1 to 7.]

A + B consists of the regions 1, 4, 3, and 6.
B + C consists of the regions 2, 3, 4, and 7.
A + (B + C) consists of the regions 1, 3, 5, and 7.
(A + B) + C consists of the regions 1, 3, 5, and 7.

Thus, A + (B + C) = (A + B) + C.

If D is a set, then the power set of D is the set P_D of all the subsets of D. That is,

    P_D = {A : A ⊆ D}

The operation + is to be regarded as an operation on P_D.

1 Prove that there is an identity element with respect to the operation +, which is ____.
2 Prove every subset A of D has an inverse with respect to +, which is ____. Thus, ⟨P_D, +⟩ is a group!
3 Let D be the three-element set D = {a, b, c}. List the elements of P_D. (For example, one element is {a}, another is {a, b}, and so on. Do not forget the empty set and the whole set D.) Then write the operation table for ⟨P_D, +⟩.

D. A Checkerboard Game

    +---+---+
    | 1 | 2 |
    +---+---+
    | 3 | 4 |
    +---+---+

Our checkerboard has only four squares, numbered 1, 2, 3, and 4. There is a single checker on the board, and it has four possible moves:

V: Move vertically; that is, move from 1 to 3, or from 3 to 1, or from 2 to 4, or from 4 to 2.
H: Move horizontally; that is, move from 1 to 2 or vice versa, or from 3 to 4 or vice versa.
D: Move diagonally; that is, move from 2 to 3 or vice versa, or move from 1 to 4 or vice versa.
I: Stay put.

We may consider an operation on the set of these four moves, which consists of performing moves successively. For example, if we move horizontally and then vertically, we end up with the same result as if we had moved diagonally:

    H * V = D

If we perform two horizontal moves in succession, we end up where we started: H * H = I. And so on. If G = {V, H, D, I}, and * is the operation we have just described, write the table of G.
    * | I  V  H  D
   ---+------------
    I |
    V |
    H |
    D |

Granting associativity, explain why ⟨G, *⟩ is a group.

E. A Coin Game

Imagine two coins on a table, at positions A and B. In this game there are eight possible moves:

M₁: Flip over the coin at A.
M₂: Flip over the coin at B.
M₃: Flip over both coins.
M₄: Switch the coins.
M₅: Flip coin at A; then switch.
M₆: Flip coin at B; then switch.
M₇: Flip both coins; then switch.
I: Do not change anything.

We may consider an operation on the set {I, M₁, ..., M₇}, which consists of performing any two moves in succession. For example, if we switch coins, then flip over the coin at A, this is the same as first flipping over the coin at B then switching:

    M₄ * M₁ = M₂ * M₄ = M₆

If G = {I, M₁, ..., M₇} and * is the operation we have just described, write the table of ⟨G, *⟩.

    *  | I  M₁ M₂ M₃ M₄ M₅ M₆ M₇
   ----+--------------------------
    I  |
    M₁ |
    M₂ |
    M₃ |
    M₄ |
    M₅ |
    M₆ |
    M₇ |

Granting associativity, explain why ⟨G, *⟩ is a group. Is it commutative? If not, show why not.

F. Groups in Binary Codes

The most basic way of transmitting information is to code it into strings of 0s and 1s, such as 0010111, 1010011, etc. Such strings are called binary words, and the number of 0s and 1s in any binary word is called its length. All information may be coded in this fashion.

When information is transmitted, it is sometimes received incorrectly. One of the most important purposes of coding theory is to find ways of detecting errors, and correcting errors of transmission.

If a word a = a₁a₂ ... aₙ is sent, but a word b = b₁b₂ ... bₙ is received (where the aᵢ and the bᵢ are 0s or 1s), then the error pattern is the word e = e₁e₂ ... eₙ where

    eᵢ = 0 if aᵢ = bᵢ
    eᵢ = 1 if aᵢ ≠ bᵢ

With this motivation, we define an operation of adding words, as follows: If a and b are both of length 1, we add them according to the rules

    0 + 0 = 0    1 + 1 = 0    0 + 1 = 1    1 + 0 = 1

If a and b are both of length n, we add them by adding corresponding digits.
That is (let us introduce commas for convenience),

    (a₁, a₂, ..., aₙ) + (b₁, b₂, ..., bₙ) = (a₁ + b₁, a₂ + b₂, ..., aₙ + bₙ)

Thus, the sum of a and b is the error pattern e. For example,

    0010110 + 0011010 = 0001100
    10100111 + 11110111 = 01010000

The symbol Bⁿ will designate the set of all the binary words of length n. We will prove that the operation of word addition has the following properties on Bⁿ:

1. It is commutative.
2. It is associative.
3. There is an identity element for word addition.
4. Every word has an inverse under word addition.

First, we verify the commutative law for words of length 1:

    0 + 1 = 1 = 1 + 0

1 Show that (a₁, a₂, ..., aₙ) + (b₁, b₂, ..., bₙ) = (b₁, b₂, ..., bₙ) + (a₁, a₂, ..., aₙ).
2 To verify the associative law, we first verify it for words of length 1:

    1 + (1 + 1) = 1 + 0 = 1 = 0 + 1 = (1 + 1) + 1
    1 + (1 + 0) = 1 + 1 = 0 = 0 + 0 = (1 + 1) + 0

Check the remaining six cases.
3 Show that (a₁, ..., aₙ) + [(b₁, ..., bₙ) + (c₁, ..., cₙ)] = [(a₁, ..., aₙ) + (b₁, ..., bₙ)] + (c₁, ..., cₙ).
4 The identity element of Bⁿ, that is, the identity element for adding words of length n, is ____.
5 The inverse, with respect to word addition, of any word (a₁, ..., aₙ) is ____.
6 Show that a + b = a - b [where a - b = a + (-b)].
7 If a + b = c, show that a = b + c.

G. Theory of Coding: Maximum-Likelihood Decoding

We continue the discussion started in Exercise F: Recall that Bⁿ designates the set of all binary words of length n. By a code we mean a subset of Bⁿ. For example, below is a code in B⁵. The code, which we shall call C₁, consists of the following binary words of length 5:

    00000  00111  01001  01110
    10011  10100  11010  11101

Note that there are 32 possible words of length 5, but only eight of them are in the code C₁. These eight words are called codewords; the remaining words of B⁵ are not codewords. Only codewords are transmitted. If a word is received which is not a codeword, it is clear that there has been an error of transmission.
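Word addition and the test "is the received word a codeword?" are both easy to try out. The following is a minimal sketch in Python, offered only as an illustration: the names add_words and is_codeword, the representation of words as strings of 0s and 1s, and the sample transmission are all assumptions, not part of the text.

```python
def add_words(a, b):
    """Add two binary words of equal length digit by digit (so 1 + 1 = 0)."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return "".join(str((int(x) + int(y)) % 2) for x, y in zip(a, b))


# The eight codewords of length 5 that make up the code C1 listed above.
C1 = {"00000", "00111", "01001", "01110", "10011", "10100", "11010", "11101"}


def is_codeword(word, code):
    """A received word that is not in the code signals an error of transmission."""
    return word in code


# Hypothetical transmission: a codeword is sent and its third digit is corrupted.
sent = "01001"
received = "01101"
error_pattern = add_words(sent, received)  # the sum of the two words
print(error_pattern, is_codeword(received, C1))
```

The error pattern has a 1 exactly in the positions where the sent and received words differ, which is why adding the two words recovers it.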
In a well-designed code, it is unlikely that an error in transmitting a codeword will produce another codeword (if that were to happen, the error would not be detected). Moreover, in a good code it should be fairly easy to locate errors and correct them. These ideas are made precise in the discussion which follows.

The weight of a binary word is the number of 1s in the word: for example, 11011 has weight 4. The distance between two binary words is the number of positions in which the words differ. Thus, the distance between 11011 and 01001 is 2 (since these words differ only in their first and fourth positions). The minimum distance in a code is the smallest distance among all the distances between pairs of codewords. For the code C₁ above, pairwise comparison of the words shows that the minimum distance is 2. What this means is that at least two errors of transmission are needed in order to transform a codeword into another codeword; single errors will change a codeword into a noncodeword, and the error will therefore be detected. In more desirable codes (for example, the so-called Hamming code), the minimum distance is 3, so any one or two errors are always detected, and only three errors in a single word (a very unlikely occurrence) might go undetected.

In practice, a code is constructed as follows: in every codeword, certain positions are information positions, and the remaining positions are redundancy positions. For instance, in our code C₁, the first three positions of every codeword are the information positions: if you look at the eight codewords (and confine your attention only to the first three digits in each word), you will see that every three-digit sequence of 0s and 1s is there, namely,

    000, 001, 010, 011, 100, 101, 110, 111

The numbers in the fourth and fifth positions of every codeword satisfy parity-check equations.

# 1 Verify that every codeword a₁a₂a₃a₄a₅ in C₁ satisfies the following two parity-check equations: a₄ = a₁ + a₃; a₅ = a₁ + a₂ + a₃.
2 Let C₂ be the following code in B⁶. The first three positions are the information positions, and every codeword a₁a₂a₃a₄a₅a₆ satisfies the parity-check equations a₄ = a₂, a₅ = a₁ + a₂, and a₆ = a₁ + a₂ + a₃.
# (a) List the codewords of C₂.
(b) Find the minimum distance of the code C₂.
(c) How many errors in any codeword of C₂ are sure to be detected? Explain.
3 Design a code in B⁴ where the first two positions are information positions. Give the parity-check equations, list the codewords, and find the minimum distance.

If a and b are any two words, let d(a, b) denote the distance between a and b. To decode a received word x (which may contain errors of transmission) means to find the codeword closest to x, that is, the codeword a such that d(a, x) is a minimum. This is called maximum-likelihood decoding.

4 Decode the following words in C₁: 11111, 00101, 11000, 10011, 10001, and 10111.

You may have noticed that the last two words in part 4 had ambiguous decodings: for example, 10111 may be decoded as either 10011 or 00111. This situation is clearly unsatisfactory. We shall see next what conditions will ensure that every word can be decoded into only one possible codeword.

In the remaining exercises, let C be a code in Bⁿ, let m denote the minimum distance in C, and let a and b denote codewords in C.

5 Prove that it is possible to detect up to m - 1 errors. (That is, if there are errors of transmission in m - 1 or fewer positions of a codeword, it can always be determined that the received word is incorrect.)
# 6 By the sphere of radius k about a codeword a we mean the set of all words in Bⁿ whose distance from a is no greater than k. This set is denoted by Sₖ(a); hence

    Sₖ(a) = {x : d(a, x) ≤ k}

If t = ½(m - 1), prove that any two spheres of radius t, say Sₜ(a) and Sₜ(b), have no elements in common. [Hint: Assume there is a word x such that x ∈ Sₜ(a) and x ∈ Sₜ(b). Using the definitions of t and m, show that this is impossible.]
7 Deduce from part 6 that if there are t or fewer errors of transmission in a codeword, the received word will be decoded correctly.
8 Let C₂ be the code described in part 2. (If you have not yet found the minimum distance in C₂, do so now.) Using the results of parts 5 and 7, explain why two errors in any codeword can always be detected, and why one error in any codeword can always be corrected.

CHAPTER FOUR
ELEMENTARY PROPERTIES OF GROUPS

Is it possible for a group to have two different identity elements? Well, suppose e₁ and e₂ are identity elements of some group G. Then

    e₁ * e₂ = e₂    because e₁ is an identity element, and
    e₁ * e₂ = e₁    because e₂ is an identity element

Therefore

    e₁ = e₂

This shows that in every group there is exactly one identity element.

Can an element a in a group have two different inverses? Well, if a₁ and a₂ are both inverses of a, then

    a₁ * (a * a₂) = a₁ * e = a₁
and
    (a₁ * a) * a₂ = e * a₂ = a₂

By the associative law, a₁ * (a * a₂) = (a₁ * a) * a₂; hence a₁ = a₂. This shows that in every group, each element has exactly one inverse.

Up to now we have used the symbol * to designate the group operation. Other, more commonly used symbols are + and · ("plus" and "multiply"). When + is used to denote the group operation, we say we are using additive notation, and we refer to a + b as the sum of a and b. (Remember that a and b do not have to be numbers and therefore "sum" does not, in general, refer to adding numbers.) When · is used to denote the group operation, we say we are using multiplicative notation; we usually write ab instead of a · b, and call ab the product of a and b. (Once again, remember that "product" does not, in general, refer to multiplying numbers.) Multiplicative notation is the most popular because it is simple and saves space. In the remainder of this book multiplicative notation will be used except where otherwise indicated.
In particular, when we represent a group by a letter such as G or H, it will be understood that the group's operation is written as multiplication.

There is common agreement that in additive notation the identity element is denoted by 0, and the inverse of a is written as -a. (It is called the negative of a.) In multiplicative notation the identity element is e and the inverse of a is written as a⁻¹ ("a inverse"). It is also a tradition that + is to be used only for commutative operations.

The most basic rule of calculation in groups is the cancellation law, which allows us to cancel the factor a in the equations ab = ac and ba = ca. This will be our first theorem about groups.

Theorem 1 If G is a group and a, b, c are elements of G, then
(i) ab = ac implies b = c and
(ii) ba = ca implies b = c.

It is easy to see why this is true: if we multiply (on the left) both sides of the equation ab = ac by a⁻¹, we get b = c. In the case of ba = ca, we multiply on the right by a⁻¹. This is the idea of the proof; now here is the proof:

    Suppose                     ab = ac
    Then                        a⁻¹(ab) = a⁻¹(ac)
    By the associative law,     (a⁻¹a)b = (a⁻¹a)c
    that is,                    eb = ec
    Thus, finally,              b = c

Part (ii) is proved analogously.

In general, we cannot cancel a in the equation ab = ca. (Why not?)

Theorem 2 If G is a group and a, b are elements of G, then
ab = e implies a = b⁻¹ and b = a⁻¹

The proof is very simple: if ab = e, then ab = aa⁻¹, so by the cancellation law, b = a⁻¹. Analogously, a = b⁻¹.

This theorem tells us that if the product of two elements is equal to e, these elements are inverses of each other. In particular, if a is the inverse of b, then b is the inverse of a.

The next theorem gives us important information about computing inverses.

Theorem 3 If G is a group and a, b are elements of G, then
(i) (ab)⁻¹ = b⁻¹a⁻¹ and
(ii) (a⁻¹)⁻¹ = a

The first formula tells us that the inverse of a product is the product of the inverses in reverse order.
The next formula tells us that a is the inverse of the inverse of a. The proof of (i) is as follows:

    (ab)(b⁻¹a⁻¹) = a[(bb⁻¹)a⁻¹]    by the associative law
                 = a[ea⁻¹]         because bb⁻¹ = e
                 = aa⁻¹
                 = e

Since the product of ab and b⁻¹a⁻¹ is equal to e, it follows by Theorem 2 that they are each other's inverses. Thus, (ab)⁻¹ = b⁻¹a⁻¹. The proof of (ii) is analogous but simpler: aa⁻¹ = e, so by Theorem 2 a is the inverse of a⁻¹, that is, a = (a⁻¹)⁻¹.

The associative law states that the two products a(bc) and (ab)c are equal; for this reason, no confusion can result if we denote either of these products by writing abc (without any parentheses), and call abc the product of these three elements in this order.

We may next define the product of any four elements a, b, c, and d in G by

    abcd = a(bcd)

By successive uses of the associative law we find that

    a(bc)d = ab(cd) = (ab)(cd) = (ab)cd

Hence the product abcd (without parentheses, but without changing the order of its factors) is defined without ambiguity.

In general, any two products, each involving the same factors in the same order, are equal. The net effect of the associative law is that parentheses are redundant.

Having made this observation, we may feel free to use products of several factors, such as a₁a₂ ... aₙ, without parentheses, whenever it is convenient. Incidentally, by using the identity (ab)⁻¹ = b⁻¹a⁻¹ repeatedly, we find that

    (a₁a₂ ... aₙ)⁻¹ = aₙ⁻¹ ... a₂⁻¹a₁⁻¹

If G is a finite group, the number of elements in G is called the order of G. It is customary to denote the order of G by the symbol |G|.

EXERCISES

Remark on notation In the exercises below, the exponential notation aⁿ is used in the following sense: if a is any element of a group G, then a² means aa, a³ means aaa, and, in general, aⁿ is the product of n factors of a, for any positive integer n.

A. Solving Equations in Groups

Let a, b, c, and x be elements of a group G. In each of the following, solve for x in terms of a, b, and c.
Example Solve simultaneously: x² = b and x⁵ = e

From the first equation, b = x².
Squaring, b² = x⁴.
Multiplying on the left by x, xb² = xx⁴ = x⁵ = e. (Note: x⁵ = e was given.)
Multiplying by (b²)⁻¹, xb²(b²)⁻¹ = e(b²)⁻¹. Therefore, x = (b²)⁻¹.

Solve:
1 axb = c
2 x²b = xa⁻¹c

Solve simultaneously:
# 3 x²a = bxc⁻¹ and acx = xac
4 ax² = b and x³ = e
5 x² = a² and x⁵ = e
6 (xax)³ = bx and x²a = (xa)⁻¹

B. Rules of Algebra in Groups

For each of the following rules, either prove that it is true in every group G, or give a counterexample to show that it is false in some groups. (All the counterexamples you need may be found in the group of matrices {I, A, B, C, D, K} described on page 28.)

1 If x² = e, then x = e.
2 If x² = a², then x = a.
3 (ab)² = a²b²
4 If x² = x, then x = e.
5 For every x ∈ G, there is some y ∈ G such that x = y². (This is the same as saying that every element of G has a "square root.")
6 For any two elements x and y in G, there is an element z in G such that y = xz.

C. Elements That Commute

If a and b are in G and ab = ba, we say that a and b commute. Assuming that a and b commute, prove the following:

1 a⁻¹ and b⁻¹ commute.
2 a and b⁻¹ commute. (Hint: First show that a = b⁻¹ab.)
3 a commutes with ab.
4 a² commutes with b².
5 xax⁻¹ commutes with xbx⁻¹, for any x ∈ G.
6 ab = ba iff aba⁻¹ = b. (The abbreviation iff stands for "if and only if." Thus, first prove that if ab = ba, then aba⁻¹ = b. Next, prove that if aba⁻¹ = b, then ab = ba. Proceed roughly as in Exercise A. Thus, assuming ab = ba, solve for b. Next, assuming aba⁻¹ = b, solve for ab.)
7 ab = ba iff aba⁻¹b⁻¹ = e.

† D. Group Elements and Their Inverses¹

Let G be a group. Let a, b, c denote elements of G, and let e be the neutral element of G.

1 Prove that if ab = e, then ba = e. (Hint: See Theorem 2.)
2 Prove that if abc = e, then cab = e and bca = e.
3 State a generalization of parts 1 and 2.

Prove the following:
4 If xay = a⁻¹, then yax = a⁻¹.
5 Let a, b, and c each be equal to its own inverse. If ab = c, then bc = a and ca = b.
6 If abc is its own inverse, then bca is its own inverse, and cab is its own inverse.
7 Let a and b each be equal to its own inverse. Then ba is the inverse of ab.
8 a = a⁻¹ iff aa = e. (That is, a is its own inverse iff a² = e.)
9 Let c = c⁻¹. Then ab = c iff abc = e.

† E. Counting Elements and Their Inverses

Let G be a finite group, and let S be the set of all the elements of G which are not equal to their own inverse. That is, S = {x ∈ G : x ≠ x⁻¹}. The set S can be divided up into pairs so that each element is paired off with its own inverse. (See diagram on the next page.) Prove the following:

1 In any finite group G, the number of elements not equal to their own inverse is an even number.
2 The number of elements of G equal to their own inverse is odd or even, depending on whether the number of elements in G is odd or even.
3 If the order of G is even, there is at least one element x in G such that x ≠ e and x = x⁻¹.

In parts 4 to 6, let G be a finite abelian group, say, G = {e, a₁, a₂, ..., aₙ}. Prove the following:

4 (a₁a₂ ... aₙ)² = e
5 If there is no element x ≠ e in G such that x = x⁻¹, then a₁a₂ ... aₙ = e.
6 If there is exactly one x ≠ e in G such that x = x⁻¹, then a₁a₂ ... aₙ = x.

¹ When the exercises in a set are related, with some exercises building on preceding ones so that they must be done in sequence, this is indicated with a symbol † in the margin to the left of the heading.

† F. Constructing Small Groups

In each of the following, let G be any group. Let e denote the neutral element of G.

1 If a, b are any elements of G, prove each of the following:
(a) If a² = a, then a = e.
(b) If ab = a, then b = e.
(c) If ab = b, then a = e.
2 Explain why every row of a group table must contain each element of the group exactly once. (Hint: Suppose x appears twice in the row of a. Now use the cancellation law for groups.)
3 There is exactly one group on any set of three distinct elements, say the set {e, a, b}. Indeed, keeping in mind parts 1 and 2 above, there is only one way of completing the following table. Do so! You need not prove associativity.

      | e  a  b
   ---+---------
    e | e  a  b
    a | a
    b | b

4 There is exactly one group G of four elements, say G = {e, a, b, c}, satisfying the additional property that xx = e for every x ∈ G. Using only part 1 above, complete the following group table of G:
5 There is exactly one group G of four elements, say G = {e, a, b, c}, [...]. Complete the group table of G, as in the preceding exercise.
6 Use Exercise E3 to explain why the groups in parts 4 and 5 are the only possible groups of four elements (except for renaming the elements with different symbols).

G. Direct Products of Groups

If G and H are any two groups, their direct product is a new group, denoted by G × H, and defined as follows: G × H consists of all the ordered pairs (x, y) where x is in G and y is in H. That is,

    G × H = {(x, y) : x ∈ G and y ∈ H}

The operation of G × H consists of multiplying corresponding components:

    (x, y) · (x', y') = (xx', yy')

If G and H are denoted additively, it is customary to denote G × H additively:

    (x, y) + (x', y') = (x + x', y + y')

1 Prove that G × H is a group by checking the three group axioms, (G1) to (G3):
(G1) (x₁, y₁)[(x₂, y₂)(x₃, y₃)] = ( , )
[(x₁, y₁)(x₂, y₂)](x₃, y₃) = ( , )
(G2) Let e_G be the identity element of G, and e_H the identity element of H. The identity element of G × H is ( , ). Check.
(G3) For each (a, b) ∈ G × H, the inverse of (a, b) is ( , ). Check.
2 List the elements of ℤ₂ × ℤ₃, and write its operation table. (Note: There are six elements, each of which is an ordered pair. The notation is additive.)
# 3 If G and H are abelian, prove that G × H is abelian.
4 Suppose the groups G and H both have the following property: Every element of the group is its own inverse. Prove that G × H also has this property.

H. Powers and Roots of Group Elements

Let G be a group, and a, b ∈ G. For any positive integer n we define aⁿ by

    aⁿ = aaa ... a    (n factors)

If there is an element x ∈ G such that a = x², we say that a has a square root in G. Similarly, if a = y³ for some y ∈ G, we say a has a cube root in G. In general, a has an nth root in G if a = zⁿ for some z ∈ G. Prove the following:

1 (bab⁻¹)ⁿ = baⁿb⁻¹, for every positive integer n. Prove by induction. (Remember that to prove a formula such as this one by induction, you first prove it for n = 1; next you prove that if it is true for n = k, then it must be true for n = k + 1. You may conclude that it is true for every positive integer n. Induction is explained more fully in Appendix C.)
2 If ab = ba, then (ab)ⁿ = aⁿbⁿ for every positive integer n. Prove by induction.
3 If xax = e, then (xa)²ⁿ = aⁿ.
4 If a³ = e, then a has a square root.
5 If a² = e, then a has a cube root.
6 If a⁻¹ has a cube root, so does a.
7 If x²ax = a⁻¹, then a has a cube root. (Hint: Show that xax is a cube root of a⁻¹.)
8 If xax = b, then ab has a square root.

CHAPTER FIVE
SUBGROUPS

Let G be a group, and S a nonempty subset of G. It may happen (though it doesn't have to) that the product of every pair of elements of S is in S. If it happens, we say that S is closed with respect to multiplication. Then, it may happen that the inverse of every element of S is in S. In that case, we say that S is closed with respect to inverses. If both these things happen, we call S a subgroup of G.

When the operation of G is denoted by the symbol +, the wording of these definitions must be adjusted: if the sum of every pair of elements of S is in S, we say that S is closed with respect to addition. If the negative of every element of S is in S, we say that S is closed with respect to negatives. If both these things happen, S is a subgroup of G.

For example, the set of all the even integers is a subgroup of the additive group ℤ of the integers.
Indeed, the sum of any two even integers is an even integer, and the negative of any even integer is an even integer.

As another example, ℚ* (the group of the nonzero rational numbers, under multiplication) is a subgroup of ℝ* (the group of the nonzero real numbers, under multiplication). Indeed, ℚ* ⊆ ℝ* because every rational number is a real number. Furthermore, the product of any two nonzero rational numbers is a nonzero rational number, and the inverse (that is, the reciprocal) of any nonzero rational number is a rational number.

An important point to be noted is this: if S is a subgroup of G, the operation of S is the same as the operation of G. In other words, if a and b are elements of S, the product ab computed in S is precisely the product ab computed in G.

For example, it would be meaningless to say that ⟨ℚ*, ·⟩ is a subgroup of ⟨ℝ, +⟩; for although it is true that ℚ* is a subset of ℝ, the operations on these two groups are different.

The importance of the notion of subgroup stems from the following fact: if G is a group and S is a subgroup of G, then S itself is a group.

It is easy to see why this is true. To begin with, the operation of G, restricted to elements of S, is certainly an operation on S. It is associative: for if a, b, and c are in S, they are in G (because S ⊆ G); but G is a group, so a(bc) = (ab)c. Next, the identity element e of G is in S (and continues to be an identity element in S), for S is nonempty, so S contains an element a; but S is closed with respect to inverses, so S also contains a⁻¹; thus, S contains aa⁻¹ = e, because S is closed with respect to multiplication. Finally, every element of S has an inverse in S because S is closed with respect to inverses. Thus, S is a group!

One reason why the notion of subgroup is useful is that it provides us with an easy way of showing that certain things are groups.
Indeed, if G is already known to be a group, and S is a subgroup of G, we may conclude that S is a group without having to check all the items in the definition of "group." This conclusion is illustrated by the next example.

Many of the groups we use in mathematics are groups whose elements are functions. In fact, historically, the first groups ever studied as such were groups of functions.

ℱ(ℝ) represents the set of all functions from ℝ to ℝ, that is, the set of all real-valued functions of a real variable. In calculus we learned how to add functions: if f and g are functions from ℝ to ℝ, their sum is the function f + g given by

    [f + g](x) = f(x) + g(x)   for every real number x

Clearly, f + g is again a function from ℝ to ℝ, and is uniquely determined by f and g.

ℱ(ℝ), with the operation + for adding functions, is the group ⟨ℱ(ℝ), +⟩, or simply ℱ(ℝ). The details are simple, but first, let us remember what it means for two functions to be equal. If f and g are functions from ℝ to ℝ, then f and g are equal (that is, f = g) if and only if f(x) = g(x) for every real number x. In other words, to be equal, f and g must yield the same value when applied to every real number x.

To check that + is associative, we must show that f + [g + h] = [f + g] + h, for every three functions f, g, and h in ℱ(ℝ). This means that for any real number x, {f + [g + h]}(x) = {[f + g] + h}(x). Well,

    {f + [g + h]}(x) = f(x) + [g + h](x) = f(x) + g(x) + h(x)

and {[f + g] + h}(x) has the same value.

The neutral element of ℱ(ℝ) is the function 𝒪 given by

    𝒪(x) = 0   for every real number x

To show that 𝒪 + f = f, one must show that [𝒪 + f](x) = f(x) for every real number x. This is true because [𝒪 + f](x) = 𝒪(x) + f(x) = 0 + f(x) = f(x).

Finally, the inverse of any function f is the function −f given by

    [−f](x) = −f(x)   for every real number x

One perceives immediately that f + [−f] = 𝒪, for every function f.

𝒞(ℝ) represents the set of all continuous functions from ℝ to ℝ.
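The three checks just made for ⟨ℱ(ℝ), +⟩ (associativity, neutral element, inverses) can be spot-checked on a computer. The following is a minimal Python sketch and is not part of the original text: the helper names add, neg, zero, and agree_on are hypothetical, the three sample functions are arbitrary choices, and agreement at a handful of integer points only illustrates the identities; it does not prove them.

```python
# Illustrative sketch only: functions are modelled as Python callables,
# and "equality" is tested at a few sample points, which is weaker than
# equality of functions on all of R.

def add(f, g):
    """Pointwise sum: [f + g](x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def neg(f):
    """Negative of f: [-f](x) = -f(x)."""
    return lambda x: -f(x)

def zero(x):
    """The zero function, the neutral element for pointwise addition."""
    return 0

def agree_on(p, q, points):
    """True if p and q take the same value at every sample point."""
    return all(p(x) == q(x) for x in points)

# Arbitrary sample functions and integer sample points (exact arithmetic).
f = lambda x: 2 * x
g = lambda x: x * x
h = lambda x: x ** 3 - 1
points = range(-5, 6)

print(agree_on(add(f, add(g, h)), add(add(f, g), h), points))  # associativity
print(agree_on(add(zero, f), f, points))                       # neutral element
print(agree_on(add(f, neg(f)), zero, points))                  # inverses
```

Integer sample points are used so that the comparisons are exact; with floating-point inputs, rounding could make the two sides of the associativity check differ slightly.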
Now, 𝒟(ℝ) represents the set of all the differentiable functions from ℝ to ℝ. It is a subgroup of ℱ(ℝ) because the sum of any two differentiable functions is differentiable, and the negative of any differentiable function is differentiable. Thus, 𝒟(ℝ), with the operation of adding functions, is a group denoted by ⟨𝒟(ℝ), +⟩, or simply 𝒟(ℝ).

By the way, in any group G the one-element subset {e}, containing only the neutral element, is a subgroup. It is closed with respect to multiplication because ee = e, and closed with respect to inverses because e⁻¹ = e. At the other extreme, the whole group G is obviously a subgroup of itself. These two examples are, respectively, the smallest and largest possible subgroups of G. They are called the trivial subgroups of G. All the other subgroups of G are called proper subgroups.

Suppose G is a group and a, b, and c are elements of G. Define S to be the subset of G which contains all the possible products of a, b, c, and their inverses, in any order, with repetition of factors permitted. Thus, typical elements of S would be

    abac⁻¹     c⁻¹a⁻¹bbc

and so on. It is easy to see that S is a subgroup of G: for if two elements of S are multiplied together, they yield an element of S, and the inverse of any element of S is an element of S. For example, the product of aba and cb⁻¹ac is

    abacb⁻¹ac

and the inverse of ab⁻¹c⁻¹a is

    a⁻¹cba⁻¹

S is called the subgroup of G generated by a, b, and c.

If a₁, ..., aₙ are any finite number of elements of G, we may define the subgroup generated by a₁, ..., aₙ in the same way. In particular, if a is a single element of G, we may consider the subgroup generated by a. This subgroup is designated by the symbol ⟨a⟩, and is called a cyclic subgroup of G; a is called its generator. Note that ⟨a⟩ consists of all the possible products of a and a⁻¹, for example, a⁻¹aaa⁻¹ and aaa⁻¹aa⁻¹. However, since factors of a⁻¹ cancel factors of a, there is no need to consider products involving both a and a⁻¹ side by side.
Thus, ⟨a⟩ contains

    a, aa, aaa, ...,
    a⁻¹, a⁻¹a⁻¹, a⁻¹a⁻¹a⁻¹, ...,

as well as aa⁻¹ = e.

If the operation of G is denoted by +, the same definitions can be given with "sums" instead of "products."

In the group of matrices whose table appears on page 28, the subgroup generated by D is ⟨D⟩ = {I, B, D} and the subgroup generated by A is ⟨A⟩ = {I, A}. (The student should check the table to verify this.) In fact, the entire group G of that example is generated by the two elements A and B.

If a group G is generated by a single element a, we call G a cyclic group, and write G = ⟨a⟩. For example, the additive group ℤ₆ is cyclic. (What is its generator?)

Every finite group G is generated by one or more of its elements (obviously). A set of equations, involving only the generators and their inverses, is called a set of defining equations for G if these equations completely determine the multiplication table of G.

For example, let G be the group {e, a, b, b², ab, ab²} whose generators a and b satisfy the equations

    a² = e     b³ = e     ba = ab²     (1)

These three equations do indeed determine the multiplication table of G. To see this, note first that the equation ba = ab² allows us to switch powers of a with powers of b, bringing powers of a to the left, and powers of b to the right. For example, to find the product of ab and ab², we compute as follows:

    (ab)(ab²) = abab² = aab²b² = a²b⁴

But by Equations (1), a² = e and b⁴ = b³b = b; so finally, (ab)(ab²) = b. All the entries in the table of G may be computed in the same fashion.

When a group is determined by a set of generators and defining equations, its structure can be efficiently represented in a diagram called a Cayley diagram. These diagrams are explained in Exercise G.

EXERCISES

A. Recognizing Subgroups

In parts 1-6 below, determine whether or not H is a subgroup of G. (Assume that the operation of H is the same as that of G.)
Instructions If H is a subgroup of G, show that both conditions in the definition of "subgroup" are satisfied. If H is not a subgroup of G, explain which condition fails.

Example G = ℝ*, the multiplicative group of the nonzero real numbers. H = {2ⁿ : n ∈ ℤ}
H is ☑ is not □ a subgroup of G.
(i) If 2ⁿ, 2ᵐ ∈ H, then 2ⁿ2ᵐ = 2ⁿ⁺ᵐ. But n + m ∈ ℤ, so 2ⁿ⁺ᵐ ∈ H.
(ii) If 2ⁿ ∈ H, then 1/2ⁿ = 2⁻ⁿ. But −n ∈ ℤ, so 2⁻ⁿ ∈ H.
(Note that in this example the operation of G and H is multiplication. In the next problem, it is addition.)

1 G = ⟨ℝ, +⟩, H = {log a : a ∈ ℚ, a > 0}. H is □ is not □ a subgroup of G.
2 G = 0}. H is □ is not □ a subgroup of G.
3 G = ⟨ℝ, +⟩, H = {x ∈ ℝ : tan x ∈ ℚ}. H is □ is not □ a subgroup of G.
  Hint: Use the following formula from trigonometry:
      tan(x + y) = (tan x + tan y) / (1 − tan x tan y)
4 G = ⟨ℝ*, ·⟩, H = {2ⁿ3ᵐ : m, n ∈ ℤ}. H is □ is not □ a subgroup of G.
5 G = ⟨ℝ × ℝ, +⟩, H = {(x, y) : y = 2x}. H is □ is not □ a subgroup of G.
6 G = ⟨ℝ × ℝ, +⟩, H = {(x, y) : x² + y² > 0}. H is □ is not □ a subgroup of G.
7 Let C and D be sets, with C ⊆ D. Prove that P_C is a subgroup of P_D. (See Chapter 3, Exercise C.)

B. Subgroups of Functions

In each of the following, show that H is a subgroup of G.

Example G = ⟨ℱ(ℝ), +⟩, H = {f ∈ ℱ(ℝ) : f(0) = 0}
(i) Suppose f, g ∈ H; then f(0) = 0 and g(0) = 0, so [f + g](0) = f(0) + g(0) = 0 + 0 = 0. Thus, f + g ∈ H.
(ii) If f ∈ H, then f(0) = 0. Thus, [−f](0) = −f(0) = −0 = 0, so −f ∈ H.

1 G = ⟨ℱ(ℝ), +⟩, H = {f ∈ ℱ(ℝ) : f(x) = 0 for every x ∈ [0, 1]}
2 G = ⟨ℱ(ℝ), +⟩, H = {f ∈ ℱ(ℝ) : f(−x) = −f(x)}
3 G = ⟨ℱ(ℝ), +⟩, H = {f ∈ ℱ(ℝ) : f is periodic of period π}
  Remark: A function f is said to be periodic of period a if there is a number a, called the period of f, such that f(x) = f(x + na) for every x ∈ ℝ and n ∈ ℤ.
4 G = ⟨𝒞(ℝ), +⟩, H = {f ∈
# 5 G = ⟨𝒟(ℝ), +⟩, H = {f ∈ 𝒟(ℝ) : df/dx is constant}
6 G = ⟨ℱ(ℝ), +⟩, H = {f ∈ ℱ(ℝ) : f(x) ∈ ℤ for every x ∈ ℝ}

C. Subgroups of Abelian Groups

In the following exercises, let G be an abelian group.
1 If H = {x ∈ G : x = x⁻¹}, that is, H consists of all the elements of G which are their own inverses, prove that H is a subgroup of G.
2 Let n be a fixed integer, and let H = {x ∈ G : xⁿ = 0, xⁿ ∈ H}. Prove that K is a subgroup of G.
6 Suppose H and K are subgroups of G, and define HK as follows:
      HK = {xy : x ∈ H and y ∈ K}
  Prove that HK is a subgroup of G.
7 Explain why parts 4-6 are not true if G is not abelian.

D. Subgroups of an Arbitrary Group

Let G be a group.

1 If H and K are subgroups of a group G, prove that H ∩ K is a subgroup of G. (Remember that x represents the operation "multiply by a":

    e → a → a² → a³ → ⋯

If the group has two generators, say a and b, we need two kinds of arrows, say ⇢ and →, where ⇢ means "multiply by a," and → means "multiply by b." For example, the group G = {e, a, b, b², ab, ab²} where a² = e, b³ = e, and ba = ab² (see page 47) has the following Cayley diagram:

[Figure: Cayley diagram of G; → means "multiply by b," ⇢ means "multiply by a"]

Moving in the forward direction of the arrow → means multiplying by b,

    x → xb

whereas moving in the backward direction of the arrow means multiplying by b⁻¹:

    x ← xb⁻¹

(Note that "multiplying x by b" is understood to mean multiplying on the right by b: it means xb, not bx.) It is also a convention that if a² = e (hence a = a⁻¹), then no arrowhead is used: for if a = a⁻¹, then multiplying by a is the same as multiplying by a⁻¹.

The Cayley diagram of a group contains the same information as the group's table. For instance, to find the product (ab)(ab²) in the figure on page 51, we start at ab and follow the path corresponding to ab² (multiplying by a, then by b, then again by b). This path leads to b; hence (ab)(ab²) = b.

As another example, the inverse of ab² is the path which leads from ab² back to e. We note instantly that this is ba.
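The product (ab)(ab²) = b found above can also be computed mechanically from the relations a² = e, b³ = e, and ba = ab². The following is a minimal Python sketch, not taken from the text: it assumes every element is written as aⁱbʲ and stored as the exponent pair (i, j), and the names mul, name, and elements are hypothetical.

```python
# Sketch: elements are stored as exponent pairs (i, j) standing for a^i b^j,
# with i in {0, 1} and j in {0, 1, 2}. This encoding is an assumption made
# for illustration; the text works with words in a and b directly.

def mul(x, y):
    """Product (a^i b^j)(a^k b^l), using a^2 = e, b^3 = e, ba = ab^2."""
    i, j = x
    k, l = y
    # ba = ab^2 gives b^j a^k = a^k b^(j * 2^k), so powers of b move right.
    return ((i + k) % 2, (j * 2 ** k + l) % 3)

def name(x):
    """Write (i, j) in the form used in the text: e, a, b, b2, ab, ab2."""
    i, j = x
    word = ("a" if i else "") + ("b" if j else "") + ("2" if j == 2 else "")
    return word or "e"

elements = [(i, j) for i in range(2) for j in range(3)]
ab, ab2 = (1, 1), (1, 2)

print(name(mul(ab, ab2)))   # the product (ab)(ab2) discussed in the text

# Full multiplication table, one row per left factor.
for x in elements:
    print(name(x).ljust(4), " ".join(name(mul(x, y)).ljust(3) for y in elements))
```

Following a path in the Cayley diagram corresponds to repeated right multiplication, which is what mul does one factor at a time; here b2 and ab2 stand for b² and ab².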
A point-and-arrow diagram is the Cayley diagram of a group iff it has the following two properties:
(a) For each point x and generator a, there is exactly one a-arrow starting at x, and exactly one a-arrow ending at x; furthermore, at most one arrow goes from x to another point y.
(b) If two different paths starting at x lead to the same destination, then these two paths, starting at any point y, lead to the same destination.

Cayley diagrams are a useful way of finding new groups. Write the table of the groups having the following Cayley diagrams: (Remark: You may take any point to represent e, because there is perfect symmetry in a Cayley diagram. Choose e, then label the diagram and proceed.)

[Figures: Cayley diagrams for this exercise]

H. Coding Theory: Generator Matrix and Parity-Check Matrix of a Code

For the reader who does not know the subject, linear algebra will be developed in Chapter 28. However, some rudiments of vector and matrix multiplication will be needed in this exercise; they are given here:

A vector with n components is a sequence of n numbers: (a₁, a₂, ..., aₙ). The dot product of two vectors with n components, say the vectors a = (a₁, a₂, ..., aₙ) and b = (b₁, b₂, ..., bₙ), is defined by

    a · b = (a₁, a₂, ..., aₙ) · (b₁, b₂, ..., bₙ) = a₁b₁ + a₂b₂ + ⋯ + aₙbₙ

that is, you multiply corresponding components and add. For example,

    (1, 4, −2, 3) · (6, 2, 4, −2) = 1(6) + 4(2) + (−2)4 + 3(−2) = 0

When a · b = 0, as in the last example, we say that a and b are orthogonal.

A matrix is a rectangular array of numbers. An "m by n matrix" (m × n matrix) has m rows and n columns. For example,

        ⎛ 1   2  −2   3 ⎞
    B = ⎜ 4   1   1  −3 ⎟
        ⎝ 7   2   5  −1 ⎠

is a 3 × 4 matrix: It has three rows and four columns. Notice that each row of B is a vector with four components, and each column of B is a vector with three components.

If A is any m × n matrix, let a₁, a₂, ..., aₙ be the columns of A. (Each column of A is a vector with m components.) If x is any vector with m components, xA denotes the vector

    xA = (x · a₁, x · a₂, ...
, x · aₙ)

That is, the components of xA are obtained by dot multiplying x by the successive columns of A. For example, if B is the matrix of the previous paragraph and x = (3, 1, −2), then the components of xB are

    (3, 1, −2) · (1, 4, 7) = −7
    (3, 1, −2) · (2, 1, 2) = 3
    (3, 1, −2) · (−2, 1, 5) = −15
    (3, 1, −2) · (3, −3, −1) = 8

that is, xB = (−7, 3, −15, 8).

If A is an m × n matrix, let a⁽¹⁾, a⁽²⁾, ..., a⁽ᵐ⁾ be the rows of A. If y is any vector with n components, Ay denotes the vector

    Ay = (y · a⁽¹⁾, y · a⁽²⁾, ..., y · a⁽ᵐ⁾)

That is, the components of Ay are obtained by dot multiplying y with the successive rows of A. (Clearly, Ay is not the same as yA.) From linear algebra, A(x + y) = Ax + Ay and (A + B)x = Ax + Bx.

We shall now continue the discussion of codes begun in Exercises F and G of Chapter 3. Recall that Bⁿ is the set of all vectors of length n whose entries are 0s and 1s. In Exercise F, page 32, it was shown that Bⁿ is a group. A code is defined to be any subset C of Bⁿ. A code is called a group code if C is a subgroup of Bⁿ. The codes described in Chapter 3, as well as all those to be mentioned in this exercise, are group codes.

An m × n matrix G is a generator matrix for the code C if C is the group generated by the rows of G. For example, if C₁ is the code given on page 34, its generator matrix is

         ⎛ 1  0  0  1  1 ⎞
    G₁ = ⎜ 0  1  0  0  1 ⎟
         ⎝ 0  0  1  1  1 ⎠

You may check that all eight codewords of C₁ are sums of the rows of G₁.

Recall that every codeword consists of information digits and parity-check digits. In the code C₁ the first three digits of every codeword are information digits, and make up the message; the last two digits are parity-check digits. Encoding a message is the process of adding the parity-check digits to the end of the message. If x is a message, then E(x) denotes the encoded word. For example, recall that in C₁ the parity-check equations are a₄ = a₁ + a₃ and a₅ = a₁ + a₂ + a₃.
Thus, a three-digit message a₁a₂a₃ is encoded as follows:

    E(a₁, a₂, a₃) = (a₁, a₂, a₃, a₁ + a₃, a₁ + a₂ + a₃)

The two digits added at the end of a word are those dictated by the parity check equations. You may verify that

                   ⎛ 1  0  0  1  1 ⎞
    (a₁, a₂, a₃)   ⎜ 0  1  0  0  1 ⎟  = (a₁, a₂, a₃, a₁ + a₃, a₁ + a₂ + a₃)
                   ⎝ 0  0  1  1  1 ⎠

[Dot multiply (a₁, a₂, a₃) by the successive columns of G₁.]

This is true in all cases: If G is the generator matrix of a code and x is a message, then E(x) is equal to the product xG. Thus, encoding using the generator matrix is very easy: you simply multiply the message x by the generator matrix G.

Now, the parity-check equations of C₁ (namely, a₄ = a₁ + a₃ and a₅ = a₁ + a₂ + a₃) can be written in the form

    a₁ + a₃ + a₄ = 0   and   a₁ + a₂ + a₃ + a₅ = 0

which is equivalent to

    (a₁, a₂, a₃, a₄, a₅) · (1, 0, 1, 1, 0) = 0   and   (a₁, a₂, a₃, a₄, a₅) · (1, 1, 1, 0, 1) = 0

The last two equations show that a word a₁a₂a₃a₄a₅ is a codeword (that is, satisfies the parity-check equations) if and only if (a₁, a₂, a₃, a₄, a₅) is orthogonal to both rows of the matrix:

    H = ⎛ 1  0  1  1  0 ⎞
        ⎝ 1  1  1  0  1 ⎠

H is called the parity-check matrix of the code C₁. This conclusion may be stated as a theorem:

Theorem 1 Let H be the parity-check matrix of a code C in Bⁿ. A word x in Bⁿ is a codeword if and only if Hx = 0.

(Remember that Hx is obtained by dot multiplying x by the rows of H.)

1 Find the generator matrix G₂ and the parity-check matrix H₂ of the code C₂ described in Exercise G2 of Chapter 3.
2 Let C₃ be the following code in B⁷: the first four positions are information positions, and the parity-check equations are a₅ = a₂ + a₃ + a₄, a₆ = a₁ + a₃ + a₄, and a₇ = a₁ + a₂ + a₄. (C₃ is called the Hamming code.) Find the generator matrix G₃ and parity-check matrix H₃ of C₃.

The weight of a word x is the number of 1s in the word and is denoted by w(x). For example, w(11011) = 4. The minimum weight of a code C is the weight of the nonzero codeword of smallest weight in the code. (See the definitions of "distance" and "minimum distance" on page 34.) Prove the following:

3 d(x, y) = w(x + y).
4 w(x) = d(x, 0), where 0 is the word whose digits are all 0s.
5 The minimum distance of a group code C is equal to the minimum weight of C.
6 (a) If x and y have even weight, so does x + y.
  (b) If x and y have odd weight, x + y has even weight.
  (c) If x has odd and y has even weight, then x + y has odd weight.
7 In any group code, either all the words have even weight, or half the words have even weight and half the words have odd weight. (Use part 6 in your proof.)
8 H(x + y) = 0 if and only if Hx = Hy, where H denotes the parity-check matrix of a code in Bⁿ and x and y are any two words in Bⁿ.

CHAPTER SIX
FUNCTIONS

The concept of a function is one of the most basic mathematical ideas and enters into almost every mathematical discussion. A function is generally defined as follows: If A and B are sets, then a function from A to B is a rule which to every element x in A assigns a unique element y in B. To indicate this connection between x and y we usually write y = f(x), and we call y the image of x under the function f.

There is nothing inherently mathematical about this notion of function. For example, imagine A to be a set of married men and B to be the set of their wives. Let f be the rule which to each man assigns his wife. Then f is a perfectly good function from A to B; under this function, each wife is the image of her husband. (No pun is intended.)

Take care, however, to note that if there were a bachelor in A then f would not qualify as a function from A to B; for a function from A to B must assign a value in B to every element of A, without exception. Now, suppose the members of A and B are Ashanti, among whom polygamy is common; in the land of the Ashanti, f does not necessarily qualify as a function, for it may assign to a given member of A several wives.
If f is a function from A to B, it must assign exactly one image to each element of A.

If f is a function from A to B it is customary to describe it by writing

    f : A → B

The set A is called the domain of f. The range of f is the subset of B which consists of all the images of elements of A. In the case of the function illustrated here, {a, b, c} is the domain of f, and {x, y} is the range of f (z is not in the range of f). Incidentally, this function f may be represented in the simplified notation

    f = ⎛ a  b  c ⎞
        ⎝ x  x  y ⎠

This notation is useful whenever A is a finite set: the elements of A are listed in the top row, and beneath each element of A is its image.

It may perfectly well happen, if f is a function from A to B, that two or more elements of A have the same image. For example, if we look at the function immediately above, we observe that a and b both have the same image x. If f is a function for which this kind of situation does not occur, then f is called an injective function. Thus,

Definition 1 A function f : A → B is called injective if each element of B is the image of no more than one element of A.

The intended meaning, of course, is that each element y in B is the image of no two distinct elements of A. So if

    y = f(x₁) = f(x₂)

that is, x₁ and x₂ have the same image y, we must require that x₁ be equal to x₂. Thus, a convenient definition of "injective" is this: a function f : A → B is injective if and only if

    f(x₁) = f(x₂)   implies   x₁ = x₂

If f is a function from A to B, there may be elements in B which are not images of elements of A. If this does not happen, that is, if every element of B is the image of some element of A, we say that f is surjective.

Definition 2 A function f : A → B is called surjective if each element of B is the image of at least one element of A.

This is the same as saying that B is the range of f.

Now, suppose that f is both injective and surjective.
By Definitions 1 and 2, each element of B is the image of at least one element of A, and no more than one element of A. So each element of B is the image of exactly one element of A. In this case, f is called a bijective function, or a one-to-one correspondence.

Definition 3 A function f : A → B is called bijective if it is both injective and surjective.

It is obvious that under a bijective function, each element of A has exactly one "partner" in B and each element of B has exactly one partner in A.

The most natural way of combining two functions is to form their "composite." The idea is this: suppose f is a function from A to B, and g is a function from B to C. We apply f to an element x in A and get an element y in B; then we apply g to y and get an element z in C. Thus, z is obtained from x by applying f and g in succession. The function which consists of applying f and g in succession is a function from A to C, and is called the composite of f and g. More precisely,

Let f : A → B and g : B → C be functions. The composite function denoted by g∘f is a function from A to C defined as follows:

    [g∘f](x) = g(f(x))   for every x ∈ A

For example, consider once again a set A of married men and the set B of their wives. Let C be the set of all the mothers of members of B. Let f : A → B be the rule which to each member of A assigns his wife, and g : B → C the rule which to each woman in B assigns her mother. Then g∘f is the "mother-in-law function," which assigns to each member of A his wife's mother.

For another, more conventional, example, let f and g be the following functions from ℝ to ℝ: f(x) = 2x; g(x) = x + 1. (In other words, f is the rule "multiply by 2" and g is the rule "add 1.") Their composites are the functions g∘f and f∘g given by

    [f∘g](x) = f(g(x)) = 2(x + 1)

and

    [g∘f](x) = g(f(x)) = 2x + 1

f∘g and g∘f are different: f∘g is the rule "add 1, then multiply by 2," whereas g∘f is the rule "multiply by 2 and then add 1."
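The difference between f∘g and g∘f in the example above is easy to see by evaluating both at a few points. The following is a minimal Python sketch, not part of the original text; the helper name compose and the variable names g_after_f and f_after_g are hypothetical.

```python
def compose(g, f):
    """Return the composite g∘f, that is, the function x -> g(f(x))."""
    return lambda x: g(f(x))

f = lambda x: 2 * x      # the rule "multiply by 2"
g = lambda x: x + 1      # the rule "add 1"

g_after_f = compose(g, f)   # multiply by 2, then add 1: 2x + 1
f_after_g = compose(f, g)   # add 1, then multiply by 2: 2(x + 1)

# Tabulate x, [g∘f](x), [f∘g](x) for a few sample inputs.
for x in (0, 1, 3):
    print(x, g_after_f(x), f_after_g(x))
```

Note the order of the arguments: compose(g, f) applies f first, matching the convention [g∘f](x) = g(f(x)).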
It is an important fact that the composite of two injective functions is injective, the composite of two surjective functions is surjective, and the composite of two bijective functions is bijective. In other words, if f : A → B and g : B → C are functions, then the following are true:

If f and g are injective, then g∘f is injective.
If f and g are surjective, then g∘f is surjective.
If f and g are bijective, then g∘f is bijective.

Let us tackle each of these claims in turn. We will suppose that f and g are injective, and prove that g∘f is injective. (That is, we will prove that if [g∘f](x) = [g∘f](y), then x = y.) Suppose [g∘f](x) = [g∘f](y), that is,

    g(f(x)) = g(f(y))

Because g is injective, we get

    f(x) = f(y)

and because f is injective,

    x = y

Next, let us suppose that f and g are surjective, and prove that g∘f is surjective. What we need to show here is that every element of C is g∘f of some element of A. Well, if z ∈ C, then (because g is surjective) z = g(y) for some y ∈ B; but f is surjective, so y = f(x) for some x ∈ A. Thus,

    z = g(y) = g(f(x)) = [g∘f](x)

Finally, if f and g are bijective, they are both injective and surjective. By what we have already proved, g∘f is injective and surjective, hence bijective.

A function f from A to B may have an inverse, but it does not have to. The inverse of f, if it exists, is a function f⁻¹ ("f inverse") from B to A such that

    x = f⁻¹(y)   if and only if   y = f(x)

Roughly speaking, if f carries x to y then f⁻¹ carries y to x. For instance, returning (for the last time) to the example of a set A of husbands and the set B of their wives, if f : A → B is the rule which to each husband assigns his wife, then f⁻¹ : B → A is the rule which to each wife assigns her husband.

If we think of functions as rules, then f⁻¹ is the rule which undoes whatever f does. For instance, if f is the real-valued function f(x) = 2x, then f⁻¹ is the function f⁻¹(x) = x/2 [or, if preferred, f⁻¹(y) = y/2].
Indeed, the rule "divide by 2" undoes what the rule "multiply by 2" does.

Which functions have inverses, and which others do not? If f, a function from A to B, is not injective, it cannot have an inverse; for "not injective" means there are at least two distinct elements x₁ and x₂ with the same image y. Clearly, x₁ = f⁻¹(y) and x₂ = f⁻¹(y), so f⁻¹(y) is ambiguous (it has two different values), and this is not allowed for a function.

If f, a function from A to B, is not surjective, there is an element y in B which is not an image of any element of A; thus f⁻¹(y) does not exist. So f⁻¹ cannot be a function from B (that is, with domain B) to A.

It is therefore obvious that if f⁻¹ exists, f must be injective and surjective, that is, bijective. On the other hand, if f is a bijective function from A to B, its inverse clearly exists and is determined by the rule that if y = f(x) then f⁻¹(y) = x.

Furthermore, it is easy to see that the inverse of f is also a bijective function. To sum up:

A function f : A → B has an inverse if and only if it is bijective. In that case, the inverse f⁻¹ is a bijective function from B to A.

EXERCISES

A. Examples of Injective and Surjective Functions

Each of the following is a function f : ℝ → ℝ. Determine (a) whether or not f is injective, and (b) whether or not f is surjective. Prove your answer in either case.

Example 1 f(x) = 2x
f is injective.
Proof Suppose f(a) = f(b), that is, 2a = 2b. Then a = b. Therefore f is injective. ■
f is surjective.
Proof Take any element y ∈ ℝ. Then y = 2(y/2) = f(y/2). Thus, every y ∈ ℝ is equal to f(x) for x = y/2. Therefore f is surjective. ■

Example 2 f(x) = x²
f is not injective.
Proof By exhibiting a counterexample: f(2) = 4 = f(−2), although 2 ≠ −2. ■
f is not surjective.
Proof By exhibiting a counterexample: −1 is not equal to f(x) for any x ∈ ℝ.
■

1 f(x) = 3x + 4
2 f(x) = x³ + 1
3 f(x) = |x|
# 4 f(x) = x³ − 3x
5 f(x) = { x    if x is rational
         { 2x   if x is irrational
# 6 f(x) = { 2x   if x is an integer
           { x    otherwise
7 Determine the range of each of the functions in parts 1 to 6.

B. Functions on ℝ and ℤ

Determine whether each of the functions listed in parts 1-4 is or is not (a) injective and (b) surjective. Proceed as in Exercise A.

1 f : ℝ → (0, ∞), defined by f(x) = eˣ.
2 f : (0, 1) → ℝ, defined by f(x) = tan x.
3 f : ℝ → ℤ, defined by f(x) = the least integer greater than or equal to x.
4 f : ℤ → ℤ, defined by f(n) = { n + 1   if n is even
                                { n − 1   if n is odd
5 Find a bijective function f from the set ℤ of the integers to the set E of the even integers.

C. Functions on Arbitrary Sets and Groups

Determine whether each of the following functions is or is not (a) injective and (b) surjective. Proceed as in Exercise A.

In parts 1 to 3, A and B are sets, and A × B denotes the set of all the ordered pairs (x, y) as x ranges over A and y over B.

1 f : A × B → A, defined by f(x, y) = x.
2 f : A × B → B × A, defined by f(x, y) = (y, x).
3 f : A → A × B, defined by f(x) = (x, b), where b is a fixed element of B.
4 G is a group, a ∈ G, and f : G → G is defined by f(x) = ax.
5 G is a group and f : G → G is defined by f(x) = x⁻¹.
6 G is a group and f : G → G is defined by f(x) = x².

D. Composite Functions

In parts 1-3 find the composite function, as indicated.

1 f : ℝ → ℝ is defined by f(x) = sin x. g : ℝ → ℝ is defined by g(x) = eˣ. Find f∘g and g∘f.
2 A and B are sets; f : A × B → B × A is given by f(x, y) = (y, x). g : B × A → B is given by g(y, x) = y. Find g∘f.
3 f : (0, 1) → ℝ is defined by f(x) = 1/x. g : ℝ → ℝ is defined by g(x) = ln x. Find g∘f. Explain why f∘g is undefined.
4 In school, Jack and Sam exchanged notes in a code f which consists of spelling every word backwards and interchanging every letter s with t.
Alternatively, they use a code g which interchanges the letters a with o, i with u, e with y, and s with t. Describe the codes f∘g and g∘f. Are they the same?
5 A = {a, b, c, d}; f and g are functions from A to A; in the tabular form described on page 57, they are given by

    f = ⎛ a  b  c  d ⎞      g = ⎛ a  b  c  d ⎞
        ⎝ a  c  a  c ⎠          ⎝ b  a  b  a ⎠

  Give f∘g and g∘f in the same tabular form.
6 G is a group, and a and b are elements of G. f : G → G is defined by f(x) = ax. g : G → G is defined by g(x) = bx. Find f∘g and g∘f.
7 Indicate the domain and range of each of the composite functions you found in parts 1 to 6.

E. Inverses of Functions

Each of the following functions f is bijective. Describe its inverse.

1 f : (0, ∞) → (0, ∞), defined by f(x) = 1/x.
2 f : ℝ → (0, ∞), defined by f(x) = eˣ.
3 f : ℝ → ℝ, defined by f(x) = x³ + 1.
4 f : ℝ → ℝ, defined by f(x) = { \lx if x is rational { ^ {{ x {&
5 A = {a, b, c, d}, B = {1, 2, 3, 4} and f : A → B is given by

    f = ⎛ a  b  c  d ⎞
        ⎝ 3  1  2  4 ⎠

6 G is a group, a ∈ G, and f : G → G is defined by f(x) = ax.

F. Functions on Finite Sets

1 The members of the U.N. Peace Committee must choose, from among themselves, a presiding officer of their committee. For each member x, let f(x) designate that member's choice for officer. If no two members vote alike, what is the range of f?
2 Let A be a finite set. Explain why any injective function f : A → A is necessarily surjective. (Look at part 1.)
3 If A is a finite set, explain why any surjective function f : A → A is necessarily injective.
4 Are the statements in parts 2 and 3 true when A is an infinite set? If not, give a counterexample.
# 5 If A has n elements, how many functions are there from A to A? How many bijective functions are there from A to A?

G. Some General Properties of Functions

In parts 1 to 3, let A, B, and C be sets, and let f : A → B and g : B → C be functions.

1 Prove that if g∘f is injective, then f is injective.
2 Prove that if g∘f is surjective, then g is surjective.
3 Parts 1 and 2, together, tell us that if g∘f is bijective, then f is injective and g is surjective. Is the converse of this statement true: If f is injective and g surjective, is g∘f bijective? (If "yes," prove it; if "no," give a counterexample.)
4 Let f : A → B and g : B → A be functions. Suppose that y = f(x) iff x = g(y). Prove that f is bijective, and g = f⁻¹.

H. Theory of Automata

Digital computers and other electronic systems are made up of certain basic control circuits. Underlying such circuits is a fundamental mathematical notion, the notion of finite automata, also known as finite-state machines.

A finite automaton receives information which consists of sequences of symbols from some alphabet A. A typical input sequence is a word x = x₁x₂ ⋯ xₙ, where x₁, x₂, ... are symbols in the alphabet A. The machine has a set of internal components whose combined state is called the internal state of the machine. At each time interval the machine "reads" one symbol of the incoming input sequence and responds by going into a new internal state: the response depends both on the symbol being read and on the machine's present internal state. Let S denote the set of internal states; we may describe a particular machine by specifying a function α : S × A → S. If sᵢ is an internal state and aⱼ is the symbol currently being read, then α(sᵢ, aⱼ) = sₖ is the machine's next state. (That is, the machine, while in state sᵢ, responds to the symbol aⱼ by going into the new state sₖ.) The function α is called the next-state function of the machine.

Example Let M₁ be the machine whose alphabet is A = {0, 1}, whose set of internal states is S = {s₀, s₁}, and whose next-state function is given by the table

                        Input
    Present state     0       1
         s₀           s₀      s₁
         s₁           s₁      s₀

(The table asserts: When in state s₀ and reading 0, remain in s₀. When in s₀ and reading 1, go to state s₁. When in s₁ and reading 0, remain in s₁. When in s₁ and reading 1, go to state s₀.)
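The next-state table of M₁ translates directly into a lookup table, and reading a word is a loop that applies it one symbol at a time. The following is a minimal Python sketch, not part of the original text; the names NEXT_STATE and run, and the strings "s0" and "s1" used for the states, are choices made here for illustration.

```python
# Next-state table of M1, keyed by (present state, input symbol).
NEXT_STATE = {
    ("s0", "0"): "s0",   # in s0 reading 0: remain in s0
    ("s0", "1"): "s1",   # in s0 reading 1: go to s1
    ("s1", "0"): "s1",   # in s1 reading 0: remain in s1
    ("s1", "1"): "s0",   # in s1 reading 1: go to s0
}

def run(word, state="s0"):
    """Feed the symbols of `word` to the machine one at a time."""
    for symbol in word:
        state = NEXT_STATE[(state, symbol)]
    return state

print(run("10111"))  # four 1s: the machine ends in s0
print(run("1101"))   # three 1s: the machine ends in s1
```

Since each 1 switches the state and each 0 leaves it alone, the final state records whether the number of 1s read is even (s0) or odd (s1).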
A possible use of $M_1$ is as a parity-check machine, which may be used in decoding information arriving on a communication channel. Assume the incoming information consists of sequences of five symbols, 0s and 1s, such as 10111. The machine starts off in state $s_0$. It reads the first digit, 1, and as dictated by the table above, goes into state $s_1$. Then it reads the second digit, 0, and remains in $s_1$. When it reads the third digit, 1, it goes into state $s_0$, and when it reads the fourth digit, 1, it goes into $s_1$. Finally, it reaches the last digit, which is the parity-check digit: if the sum of the first four digits is even, the parity-check digit is 0; otherwise it is 1. When the parity-check digit, 1, is read, the machine goes into state $s_0$. Ending in state $s_0$ indicates that the parity-check digit is correct. If the machine ends in $s_1$, this indicates that the parity-check digit is incorrect and there has been an error of transmission.

A machine can also be described with the aid of a state diagram, which consists of circles interconnected by arrows: the notation

$$s_i \xrightarrow{\;x\;} s_j$$

means that if the machine is in state $s_i$ when $x$ is read, it goes into state $s_j$. The diagram of the machine $M_1$ of the previous example is

[Figure: state diagram of $M_1$]

In parts 1-4 describe the machines which are able to carry out the indicated functions. For each machine, give the alphabet $A$, the set of states $S$, and the table of the next-state function. Then draw the state diagram of the machine.

# 1 The input alphabet consists of four letters, a, b, c, and d. For each incoming sequence, the machine determines whether that sequence contains exactly three a's.
2 The same conditions pertain as in part 1, but the machine determines whether the sequence contains at least three a's.
3 The input alphabet consists of the digits 0, 1, 2, 3, 4. The machine adds the digits of an input sequence; the addition is modulo 5 (see page 27). The sum is to be given by the machine's state after the last digit is read.
4 The machine tells whether or not an incoming sequence of 0s and 1s ends with 111.
5 If $M$ is a machine whose next-state function is $\alpha$, define $\bar{\alpha}$ as follows: If $x$ is an input sequence and the machine (in state $s_i$) begins reading $x$, then $\bar{\alpha}(s_i, x)$ is the state of the machine after the last symbol of $x$ is read. For instance, if $M_1$ is the machine of the example given above, then $\bar{\alpha}(s_0, 11010) = s_1$. (The machine is in $s_0$ before the first symbol is read; each 1 alters the state, but the 0s do not. Thus, after the last 0 is read, the machine is in state $s_1$.)
(a) For the machine $M_1$, give $\bar{\alpha}(s_0, x)$ for all three-digit sequences $x$.
(b) For the machine of part 1, give $\bar{\alpha}(s_i, x)$ for each state $s_i$ and every two-letter sequence $x$.
6 With each input sequence $x$ we associate a function $T_x : S \to S$ called a state transition function, defined as follows:

$$T_x(s_i) = \bar{\alpha}(s_i, x)$$

For the machine $M_1$ of the example, if $x = 11010$, $T_x$ is given by

$$T_x(s_0) = s_1 \quad \text{and} \quad T_x(s_1) = s_0$$

(a) Describe the transition function $T_x$ for the machine $M_1$ and the following sequences: $x = 01001$, $x = 10011$, $x = 01010$.
(b) Explain why $M_1$ has only two distinct transition functions. [Note: Two functions $f$ and $g$ are equal if $f(x) = g(x)$ for every $x$; otherwise they are distinct.]
(c) For the machine of part 1, describe the transition function $T_x$ for the following $x$: $x = abbca$, $x = babac$, $x = ccbaa$.
(d) How many distinct transition functions are there for the machine of part 3?

I. Automata, Semigroups, and Groups

By a semigroup we mean a set $A$ with an associative operation. (There does not need to be an identity element, nor do elements necessarily have inverses.) Every group is a semigroup, though the converse is clearly false. With every semigroup $A$ we associate an automaton $M = M(A)$ called the automaton of the semigroup $A$. The alphabet of $M$ is $A$, the set of states also is $A$, and the next-state function is $\alpha(s, a) = sa$ [or $\alpha(s, a) = s + a$ if the operation of the semigroup is denoted additively].

1 Describe $M(\mathbb{Z}_4)$.
That is, give the table of its next-state function, as well as its state diagram.
2 Describe $M(S_3)$.

If $M$ is a machine and $S$ is the set of states of $M$, the state transition functions of $M$ (defined in Exercise H6 of this chapter) are functions from $S$ to $S$. In the next exercise you will be asked to show that $T_y \circ T_x = T_{xy}$; that is, the composite of two transition functions is a transition function. Since the composition of functions is associative [$f \circ (g \circ h) = (f \circ g) \circ h$], it follows that the set of all transition functions, with the operation $\circ$, is a semigroup. It is denoted by $\mathcal{S}(M)$ and called the semigroup of the machine $M$.

3 Prove that $T_{xy} = T_y \circ T_x$.
4 Let $M_1$ be the machine of the example in Exercise H above. Give the table of the semigroup $\mathcal{S}(M_1)$. Does $\mathcal{S}(M_1)$ have an identity element? Is $\mathcal{S}(M_1)$ a group?
5 Let $M_3$ be the machine of Exercise H3. How many distinct functions are there in $\mathcal{S}(M_3)$? Give the table of $\mathcal{S}(M_3)$. Is $\mathcal{S}(M_3)$ a group? (Why?)
6 Find the table of $\mathcal{S}(M)$ if $M$ is the machine whose state diagram is

[Figure: state diagram of the machine $M$]

CHAPTER SEVEN
GROUPS OF PERMUTATIONS

In this chapter we continue our discussion of functions, but we confine our discussions to functions from a set to itself. In other words, we consider only functions $f : A \to A$ whose domain is a set $A$ and whose range is in the same set $A$.

To begin with, we note that any two functions $f$ and $g$ (from $A$ to $A$) are equal if and only if $f(x) = g(x)$ for every element $x$ in $A$.

If $f$ and $g$ are functions from $A$ to $A$, their composite $f \circ g$ is also a function from $A$ to $A$. We recall that it is the function defined by

$$[f \circ g](x) = f(g(x)) \quad \text{for every } x \text{ in } A \tag{1}$$

It is a very important fact that the composition of functions is associative. Thus, if $f$, $g$, and $h$ are three functions from $A$ to $A$, then

$$f \circ (g \circ h) = (f \circ g) \circ h$$

To prove that the functions $f \circ (g \circ h)$ and $(f \circ g) \circ h$ are equal, one must show that for every element $x$ in $A$,

$$\{f \circ [g \circ h]\}(x) = \{[f \circ g] \circ h\}(x)$$

We get this by repeated use of Equation (1):

$$\{f \circ [g \circ h]\}(x) = f([g \circ h](x)) = f(g(h(x))) = [f \circ g](h(x)) = \{[f \circ g] \circ h\}(x)$$

By a permutation of a set $A$ we mean a bijective function from $A$ to $A$, that is, a one-to-one correspondence between $A$ and itself. In elementary algebra we learned to think of a permutation as a rearrangement of the elements of a set. Thus, for the set $\{1, 2, 3, 4, 5\}$, we may consider the rearrangement which changes $(1, 2, 3, 4, 5)$ to $(3, 2, 1, 5, 4)$; this rearrangement may be identified with the function

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 2 & 1 & 5 & 4 \end{pmatrix}$$

which is obviously a one-to-one correspondence between the set $\{1, 2, 3, 4, 5\}$ and itself. It is clear, therefore, that there is no real difference between the new definition of permutation and the old. The new definition, however, is more general in a very useful way since it allows us to speak of permutations of sets $A$ even when $A$ has infinitely many elements.

In Chapter 6 we saw that the composite of any two bijective functions is a bijective function. Thus, the composite of any two permutations of $A$ is a permutation of $A$. It follows that we may regard the operation $\circ$ of composition as an operation on the set of all the permutations of $A$. We have just seen that composition is an associative operation. Is there a neutral element for composition?

For any set $A$, the identity function on $A$, symbolized by $\varepsilon_A$ or simply $\varepsilon$, is the function $x \to x$ which carries every element of $A$ to itself. That is, it is defined by

$$\varepsilon(x) = x \quad \text{for every element } x \in A$$

It is easy to see that $\varepsilon$ is a permutation of $A$ (it is a one-to-one correspondence between $A$ and itself); and if $f$ is any other permutation of $A$, then

$$f \circ \varepsilon = f \quad \text{and} \quad \varepsilon \circ f = f$$

The first of these equations asserts that $[f \circ \varepsilon](x) = f(x)$ for every element $x$ in $A$, which is quite obvious, since $[f \circ \varepsilon](x) = f(\varepsilon(x)) = f(x)$. The second equation is proved analogously.

We saw in Chapter 6 that the inverse of any bijective function exists and is a bijective function. Thus, the inverse of any permutation of $A$ is a permutation of $A$.
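Furthermore, if $f$ is any permutation of $A$ and $f^{-1}$ is its inverse, then

$$f^{-1} \circ f = \varepsilon \quad \text{and} \quad f \circ f^{-1} = \varepsilon$$

The first of these equations asserts that for any element $x$ in $A$,

$$[f^{-1} \circ f](x) = \varepsilon(x)$$

that is, $f^{-1}(f(x)) = x$. This is obviously true, by the definition of the inverse of a function. The second equation is proved analogously.

Let us recapitulate: The operation $\circ$ of composition of functions qualifies as an operation on the set of all the permutations of $A$. This operation is associative. There is a permutation $\varepsilon$ such that $\varepsilon \circ f = f$ and $f \circ \varepsilon = f$ for any permutation $f$ of $A$. Finally, for every permutation $f$ of $A$ there is another permutation $f^{-1}$ of $A$ such that $f \circ f^{-1} = \varepsilon$ and $f^{-1} \circ f = \varepsilon$. Thus, the set of all the permutations of $A$, with the operation $\circ$ of composition, is a group.

For any set $A$, the group of all the permutations of $A$ is called the symmetric group on $A$, and it is represented by the symbol $S_A$. For any positive integer $n$, the symmetric group on the set $\{1, 2, 3, \ldots, n\}$ is called the symmetric group on $n$ elements, and is denoted by $S_n$.

Let us take a look at $S_3$. First, we list all the permutations of the set $\{1, 2, 3\}$:

$$\varepsilon = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix} \quad \alpha = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix} \quad \beta = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$$

$$\gamma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} \quad \delta = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix} \quad \kappa = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}$$

This notation for functions was explained on page 57; for example,

$$\beta = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$$

is the function such that $\beta(1) = 3$, $\beta(2) = 1$, and $\beta(3) = 2$. A more graphic way of representing the same function would be

$$\begin{array}{ccc} 1 & 2 & 3 \\ \downarrow & \downarrow & \downarrow \\ 3 & 1 & 2 \end{array}$$

The operation on elements of $S_3$ is composition. To find $\alpha \circ \beta$, we note that

$$[\alpha \circ \beta](1) = \alpha(\beta(1)) = \alpha(3) = 2$$
$$[\alpha \circ \beta](2) = \alpha(\beta(2)) = \alpha(1) = 1$$
$$[\alpha \circ \beta](3) = \alpha(\beta(3)) = \alpha(2) = 3$$

Thus,

$$\alpha \circ \beta = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix} = \gamma$$

Note that in $\alpha \circ \beta$, $\beta$ is applied first and $\alpha$ next. A graphic way of representing this is

$$\begin{array}{cccc} & 1 & 2 & 3 \\ \beta: & \downarrow & \downarrow & \downarrow \\ & 3 & 1 & 2 \\ \alpha: & \downarrow & \downarrow & \downarrow \\ & 2 & 1 & 3 \end{array}$$

The other combinations of elements of $S_3$ may be computed in the same fashion.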
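The computation of $\alpha \circ \beta$ can be repeated mechanically for every pair of elements. The following is a minimal sketch, not taken from the text: each permutation is stored as the bottom row of its two-row array, and the names `PERMS`, `compose`, `inverse`, and `table` are hypothetical helpers chosen for illustration.

```python
# A permutation of {1, 2, 3} is stored as the bottom row of its two-row
# array: PERMS["beta"] == (3, 1, 2) means beta(1)=3, beta(2)=1, beta(3)=2.
PERMS = {
    "epsilon": (1, 2, 3),
    "alpha": (1, 3, 2),
    "beta": (3, 1, 2),
    "gamma": (2, 1, 3),
    "delta": (2, 3, 1),
    "kappa": (3, 2, 1),
}
NAMES = {row: name for name, row in PERMS.items()}


def compose(f, g):
    """Return f o g, the permutation that applies g first and then f."""
    return tuple(f[g[x - 1] - 1] for x in range(1, len(g) + 1))


def inverse(f):
    """Return the permutation that undoes f."""
    result = [0] * len(f)
    for x, y in enumerate(f, start=1):
        result[y - 1] = x
    return tuple(result)


def table():
    """Map each pair of names (a, b) to the name of a o b."""
    return {
        (a, b): NAMES[compose(PERMS[a], PERMS[b])]
        for a in PERMS
        for b in PERMS
    }


if __name__ == "__main__":
    print(NAMES[compose(PERMS["alpha"], PERMS["beta"])])
    print(NAMES[inverse(PERMS["beta"])])
```

Because `compose(f, g)` looks up `g` first and `f` second, it follows the same right-to-left convention as $\alpha \circ \beta$ in the text.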
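The student should check the following table, which is the table of the group $S_3$:

$$\begin{array}{c|cccccc} \circ & \varepsilon & \alpha & \beta & \gamma & \delta & \kappa \\ \hline \varepsilon & \varepsilon & \alpha & \beta & \gamma & \delta & \kappa \\ \alpha & \alpha & \varepsilon & \gamma & \beta & \kappa & \delta \\ \beta & \beta & \kappa & \delta & \alpha & \varepsilon & \gamma \\ \gamma & \gamma & \delta & \kappa & \varepsilon & \alpha & \beta \\ \delta & \delta & \gamma & \varepsilon & \kappa & \beta & \alpha \\ \kappa & \kappa & \beta & \alpha & \delta & \gamma & \varepsilon \end{array}$$

By a group of permutations we mean any group $S_A$ or $S_n$, or any subgroup of one of these groups. Among the most interesting groups of permutations are the groups of symmetries of geometric figures. We will see how such groups arise by considering the group of symmetries of the square.

We may think of a symmetry of the square as any way of moving a square to make it coincide with its former position. Every time we do this, vertices will coincide with vertices, so a symmetry is completely described by its effect on the vertices. Let us number the vertices as in the following diagram:

[Figure: square with center $P$ and vertices numbered 1 to 4]

The most obvious symmetries are obtained by rotating the square clockwise about its center $P$, through angles of 90°, 180°, and 270°, respectively. We indicate each symmetry as a permutation of the vertices; thus a clockwise rotation of 90° yields the symmetry

$$R_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix}$$

for this rotation carries vertex 1 to 2, 2 to 3, 3 to 4, and 4 to 1. Rotations of 180° and 270° yield the following symmetries, respectively:

$$R_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix} \quad \text{and} \quad R_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 2 & 3 \end{pmatrix}$$

The remaining symmetries are flips of the square about its axes $A$, $B$, $C$, and $D$:

[Figure: square with the four axes $A$, $B$, $C$, and $D$]

For example, when we flip the square about the axis $A$, vertices 1 and 3 stay put, but 2 and 4 change places; so we get the symmetry

$$R_4 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 3 & 2 \end{pmatrix}$$

In the same way, the other flips are

$$R_5 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 1 & 4 \end{pmatrix} \quad R_6 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix} \quad \text{and} \quad R_7 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix}$$

One last symmetry is the identity

$$R_0 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix}$$

which leaves the square as it was.

The operation on symmetries is composition: $R_i \circ R_j$ is the result of first performing $R_j$, and then $R_i$. For example, $R_1 \circ R_4$ is the result of first flipping the square about its axis $A$, then rotating it clockwise 90°:

$$R_1 \circ R_4 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix} \circ \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 3 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix}$$

Thus, the net effect is the same as if the square had been flipped about its axis $C$.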
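The composite $R_1 \circ R_4$ can be checked by machine, and the same idea shows how many symmetries arise from a rotation and a flip. This is a minimal sketch under stated assumptions: permutations are stored as bottom rows of their two-row arrays, and `compose` and `generate` are hypothetical helper names, not notation from the text.

```python
# Symmetries of the square, stored as bottom rows of two-row arrays
# on the vertices 1, 2, 3, 4.
R1 = (2, 3, 4, 1)  # clockwise rotation by 90 degrees
R4 = (1, 4, 3, 2)  # flip about axis A: vertices 2 and 4 change places


def compose(f, g):
    """Return f o g, the permutation that applies g first and then f."""
    return tuple(f[g[x - 1] - 1] for x in range(1, len(g) + 1))


def generate(generators):
    """Return every permutation obtainable by composing the generators.

    Starts from the identity and keeps composing with generators until
    nothing new appears; for a finite set of vertices this must stop.
    """
    identity = tuple(range(1, len(generators[0]) + 1))
    found = {identity}
    frontier = [identity]
    while frontier:
        p = frontier.pop()
        for g in generators:
            q = compose(g, p)
            if q not in found:
                found.add(q)
                frontier.append(q)
    return found


if __name__ == "__main__":
    print(compose(R1, R4))
    print(len(generate([R1, R4])))
```

With these two generators the sketch produces eight distinct permutations, which agrees with the count of symmetries listed above (three rotations, four flips, and the identity).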
The eight symmetries of the square form a group under the operation ° of composition, called the group of symmetries of the square. For every positive integer n 5s 3, the regular polygon with n sides has a group of symmetries, symbolized by D„, which may be found as we did here. These groups are called the dihedral groups. For example, the group of the square is D4, the group of the pentagon is D5, and so on. Every plane figure which exhibits regularities has a group of symmetries. For example, the following figure, has a group of symmetries R, consisting of two rotations (180° and 360°) and two flips about the indicated axes. Artificial as well as natural objects often have a surprising number of symmetries. Far more complicated than the plane symmetries are the symmetries of objects in space. Modern-day crystallography and crystal physics, for example, rely very heavily on knowledge about groups of symmetries of three-dimensional shapes. Groups of symmetry are widely employed also in the theory of electron structure and of molecular vibrations. In elementary particle physics, such groups have been used to predict the existence of certain elementary particles before they were found experimentally! groups of permutations 75 Symmetries and their groups arise everywhere in nature: in quantum physics, flower petals, cell division, the work habits of bees in the hive, snowflakes, music, and Romanesque cathedrals. EXERCISES A. Computing Elements of S6 1 Consider the following permutations/, g, and h in S6: (1 2 3 4 5 6\ _/l 2 3 4 5 6\ 1 \6 1 3 5 4 2/ 8'\2 3 1 6 5 4/ »"(] i Compute the following: /_1 = ^1 2 3 4 5 6^ 2 3 4 5 6 6 4 5 2 ) = 2 3 4 5 6) ,-,.(1 2 3 4 5 6) f'i-i1 2 3 4 5 6) 2 3 4 5 6) 2 f°(g°h) = 3 5 g°g°g = B. 
Examples of Groups of Permutations 1 Let G be the subset of S4 consisting of the permutations ,Jl 2 3 4) tmn 2 3 4\ \1 2 3 4/ M2 1 4 3/ Show that G is a group of permutations, and write its table: f g h 76 chapter seven groups of permutations 77 /-(I 2 List the elements of the cyclic subgroup of S6 generated by 2 3 4 5 6] v2 3 4 1 6 5/ 3 Find a four-element abelian subgroup of S5. Write its table. 4 The subgroup of 5, generated by 3 4 5\ /I 2 3 4 5\ 3 4 5/ g VI 2 4 5 3/ has six elements. List them, then write the table of this group: 2 3 4 5 s A: = 2 2 1 2 2 2 /-a! st ) 1 2 3 4 5 2 3 4 5' I C. Groups of Permutations of U. In each of the following, A is a subset of R and G is a set of permutations of A. Show that G is a subgroup of SA, and write the table of G. 1 A is the set of all x G R such that x^O, 1. G= {e, f, g}, where/(*) = 1/(1-*) and g(x) = (x - 1) Ix. 2 A is the set of all the nonzero real numbers. G = {e, /, g, h), where f(x) = 1 Ix, g(x) = -jc, and h(x) = -1 /jr. 3 /t is the set of all the real numbers x ^0,1. G = {e, f, g, h, j, k), where f(x) = \-x, g(x) = l/x, h(x) = 1/(1 - x), j{x) = (x-\)lx, and k(x) = xl(x - 1). 4 A is the set of all the real numbers x 7^0,1,2. G is the subgroup of SA generated by f(x) = 2 — x and g(x) = 2lx. (G has eight elements. List them, and write the table of G.) t D. A Cyclic Group of Permutations For each integer n, define /„ by /„(*) = x + n. 1 Prove: For each integer n, /„ is a permutation of U, that is, /„ G 5H. # 2 Prove that /„•/„,= /„„ and /„"' =/.„. 3 Let G = {/„ : n G Z}. Prove that G is a subgroup of 5„. 4 Prove that G is cyclic. (Indicate a generator of G.) t E. A Subgroup of SR For any pair of real numbers a 9^0 and b, define a function fmb as follows: /.,,(*) = ax + b 1 Prove that /„ b is a permutation of R, that is, fab G 5H. 2 Prove that f^'f^=L^„. # 3 Prove that /"i =/„.._»,.. 4 Let G = {/„ (, : a G R, i G R, a^O}. Show that G is a subgroup of 5R. F. 
Symmetries of Geometric Figures 1 Let G be the group of symmetries of the regular hexagon. List the elements of G (there are 12 of them), then write the table of G. Ml etc. . .. 4 5 4 1 Let G be the group of symmetries of the rectangle. List the elements of G (there are four of them), and write the table of G. 3 List the symmetries of the letter Z and give the table of this group of symmetries. Do the same for the letters V and H. 4 List the symmetries of the following shape, and give the table of their group. (Assume that the three arms are of equal length, and the three central angles are equal.) I 78 CHAPTER SEVEN groups of permutations 79 G. Symmetries of Polynomials Consider the polynomial p = (*, - x2f + (x, - xA2. It is unaltered when the subscripts undergo any of the following permutations: (1 2 3 4\ /l 2 3 4\ (1 2 3 4\ (1 2 3 4\ \2 1 3 4/1 ll 2 4 3i V2 I 4 3/ \3 4 1 2/ /l 2 3 4\ (1 2 3 4\ (1 2 3 4\ /l 2 3 4\ U 3 1 2/ U 4 2 1/ U 3 2 1/ \1 2 3 4/ 2 3 4\ [1 2 3 4] /l 2 3 4 ,4 3 12/ V3 4 2 1/ U 3 2 For example, the first of these permutations replaces p by Oj -x,)2 + (x3-Jt4)2 the second permutation replaces p by (xt - x2)2 + (xt-x,)2; and so on. The symmetries of a polynomial p are all the permutations of the subscripts which leave p unchanged. They form a group of permutations. List the symmetries of each of the following polynomials, and write their group table. 1 p = x^x2 + x2x3 2 p = (xl- x2)(x2 - x,)(xl - x3) 3 p = x^x2 + x2x3 + x,x3 4 p = {xl - x2)(x3 - xA H. Properties of Permutations of a Set A 1 Let A be a set and a 6 A. Let G be the subset of SA consisting of all the permutations / of A such that /(a) = a. Prove that G is a subgroup of SA. # 2 If /is a permutation of A and a 6 A, we say that f moves a iff {a) a. Let A be an infinite set, and let G be the subset of SA which consists of all the permutations / of A which move only a finite number of elements of A. Prove that G is a subgroup of SA. 3 Let A be a finite set, and B a subset of A. 
Let $G$ be the subset of $S_A$ consisting of all the permutations $f$ of $A$ such that $f(x) \in B$ for every $x \in B$. Prove that $G$ is a subgroup of $S_A$.
4 Give an example to show that the conclusion of part 3 is not necessarily true if $A$ is an infinite set.

I. Algebra of Kinship Structures (Anthropology)

Anthropologists have used groups of permutations to describe kinship systems in primitive societies. The algebraic model for kinship structures described here is adapted from An Anatomy of Kinship by H. C. White. The model is based on the following assumptions, which are widely supported by anthropological research:

(i) The entire population of the society is divided into clans. Every person belongs to one, and only one, clan. Let us call the clans $k_1, k_2, \ldots, k_n$.
(ii) In every clan $k_i$, all the men must choose their wives from among the women of a specified clan $k_j$. We symbolize this by writing $w(k_i) = k_j$.
(iii) Men from two different clans cannot marry women from the same clan. That is, if $k_i \neq k_j$, then $w(k_i) \neq w(k_j)$.
(iv) All the children of a couple are assigned to some fixed clan. So if a man belongs to clan $k_i$, all his children belong to a clan which we symbolize by $c(k_i)$.
(v) Children whose fathers belong to different clans must themselves be in different clans. That is, if $k_i \neq k_j$, then $c(k_i) \neq c(k_j)$.
(vi) A man cannot marry a woman of his own clan. That is, $w(k_i) \neq k_i$.

Now let $K = \{k_1, k_2, \ldots, k_n\}$ be the set of all the distinct clans. By (ii), $w$ is a function from $K$ to $K$, and by (iv), $c$ is a function from $K$ to $K$. By (iii), $w$ is an injective function; hence (see Exercise F2 of Chapter 6) $w$ is a permutation of $K$. Likewise, by (v), $c$ is a permutation of $K$.
Let $G$ be the group of permutations generated by $c$ and $w$; that is, $G$ consists of $c$, $w$, $c^{-1}$, $w^{-1}$, and all possible composites which can be formed from these, for example $c \circ w \circ w \circ c^{-1} \circ w^{-1}$. Clearly the identity function $\varepsilon$ is in $G$ since, for example, $\varepsilon = c \circ c^{-1}$. Here are two final assumptions:

(vii) Every person, in any clan, has a relation in every other clan. This means that for any $k_i$ and $k_j$ in $K$, there is a permutation $\alpha$ in $G$ such that $\alpha(k_i) = k_j$.
(viii) Rules of kinship apply uniformly to all clans. Thus, for any $\alpha$ and $\beta$ in $G$, if $\alpha(k_j) = \beta(k_j)$ for some specific clan $k_j$, it necessarily follows that $\alpha(k_i) = \beta(k_i)$ for every clan $k_i$.

Prove parts 1-3:
1 Let $\alpha \in G$. If $\alpha(k_i) = k_i$ for any given $k_i$, then $\alpha = \varepsilon$.
2 Let $\alpha \in G$. There is a positive integer $m \leq n$ such that $\alpha^m = \varepsilon$. [$\alpha^m = \alpha \circ \alpha \circ \cdots \circ \alpha$ ($m$ factors of $\alpha$). Hint: Consider $\alpha(k_1)$, $\alpha^2(k_1)$, etc.]
3 The group $G$ consists of exactly $n$ permutations.

Explain parts 4-9.
4 If a person belongs to clan $k_i$, that person's father belongs to clan $c^{-1}(k_i)$. If a woman belongs to clan $k_j$, her husband belongs to clan $w^{-1}(k_j)$.
5 If any man is in the same clan as his son, then $c = \varepsilon$. If any woman is in the same clan as her son, then $c = w$.
6 If a person belongs to clan $k_i$, the son of his mother's sister belongs to clan $c \circ w^{-1} \circ w \circ c^{-1}(k_i)$. Conclude that marriage between matrilateral parallel cousins (marriage between a woman and the son of her mother's sister) is prohibited.
7 Marriage between a man and the daughter of his father's sister is prohibited.
8 If matrilateral cross-cousins may marry (that is, a woman may marry the son of her mother's brother), then $c \circ w = w^{-1} \circ c$.
9 If patrilateral cross-cousins may marry (a woman may marry the son of her father's sister), then $c$ and $w^{-1}$ commute.
CHAPTER EIGHT
PERMUTATIONS OF A FINITE SET

Permutations of finite sets are used in every branch of mathematics (for example, in geometry, in statistics, in elementary algebra), and they have a myriad of applications in science and technology. Because of their practical importance, this chapter will be devoted to the study of a few special properties of permutations of finite sets.

If $n$ is a positive integer, consider a set of $n$ elements. It makes no difference which specific set we consider, just as long as it has $n$ elements; so let us take the set $\{1, 2, \ldots, n\}$. We have already seen that the group of all the permutations of this set is called the symmetric group on $n$ elements and is denoted by $S_n$. In the remainder of this chapter, when we say "permutation" we will invariably mean a permutation of the set $\{1, 2, \ldots, n\}$ for an arbitrary positive integer $n$.

One of the most characteristic activities of science (any kind of science) is to try to separate complex things into their simplest component parts. This intellectual "divide and conquer" helps us to understand complicated processes and solve difficult problems. The savvy mathematician never misses the chance of doing this whenever the opportunity presents itself. We will see now that every permutation can be decomposed into simple parts called "cycles," and these cycles are, in a sense, the most basic kind of permutations.

We begin with an example: take, for instance, the permutation

$$f = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 3 & 1 & 6 & 9 & 8 & 2 & 4 & 5 & 7 \end{pmatrix}$$

and look at how $f$ moves the elements in its domain:

[Figure: three closed chains, $1 \to 3 \to 6 \to 2 \to 1$, $4 \to 9 \to 7 \to 4$, and $5 \to 8 \to 5$]

Notice how $f$ decomposes its domain into three separate subsets, so that, in each subset, the elements are permuted cyclically so as to form a closed chain. These closed chains may be considered to be the component parts of the permutation; they are called "cycles." (This word will be carefully defined in a moment.) Every permutation breaks down, just as this one did, into separate cycles.

Let $a_1, a_2, \ldots, a_s$ be distinct elements of the set $\{1, 2, \ldots, n\}$. By the cycle $(a_1 a_2 \cdots a_s)$ we mean the permutation of $\{1, 2, \ldots, n\}$ which carries $a_1$ to $a_2$, $a_2$ to $a_3$, ..., $a_{s-1}$ to $a_s$, and $a_s$ to $a_1$, while leaving all the remaining elements of $\{1, 2, \ldots, n\}$ fixed. For instance, in $S_6$, the cycle $(1426)$ is the permutation

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 4 & 6 & 3 & 2 & 5 & 1 \end{pmatrix}$$

In $S_5$, the cycle $(254)$ is the permutation

$$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 5 & 3 & 2 & 4 \end{pmatrix}$$

Because cycles are permutations, we may form the composite of two cycles in the usual manner. The composite of cycles is generally called their product and it is customary to omit the symbol $\circ$. For example, in $S_5$,

$$(245)(124) = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 4 & 3 & 5 & 2 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 4 & 3 & 1 & 5 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 5 & 3 & 1 & 2 \end{pmatrix}$$

Actually, it is very easy to compute the product of two cycles by reasoning in the following manner: Let us continue with the same example,

$$\underbrace{(2\,4\,5)}_{\alpha}\,\underbrace{(1\,2\,4)}_{\beta}$$

Remember that the permutation on the right is applied first, and the permutation on the left is applied next. Now,

$\beta$ carries 1 to 2, and $\alpha$ carries 2 to 4; hence $\alpha\beta$ carries 1 to 4.
$\beta$ carries 2 to 4, and $\alpha$ carries 4 to 5; hence $\alpha\beta$ carries 2 to 5.
$\beta$ leaves 3 fixed and so does $\alpha$; hence $\alpha\beta$ leaves 3 fixed.
$\beta$ carries 4 to 1 and $\alpha$ leaves 1 fixed, so $\alpha\beta$ carries 4 to 1.
$\beta$ leaves 5 fixed and $\alpha$ carries 5 to 2; hence $\alpha\beta$ carries 5 to 2.

[...]

We let $b_2 = f(b_1)$, $b_3 = f(b_2)$, and proceed as before to obtain the next cycle, say $(b_1 \cdots b_r)$. Obviously $(b_1 \cdots b_r)$ is disjoint from $(a_1 \cdots a_k)$. We continue this process until all the numbers in $\{1, \ldots, n\}$ have been exhausted. This concludes the proof.

Incidentally, it is easy to see that this product of cycles is unique, except for the order of the factors.

Now our curiosity may prod us to ask: once a permutation has been written as a product of disjoint cycles, has it been simplified as much as possible? Or is there some way of simplifying it further?

A cycle of length 2 is called a transposition.
In other words, a transposition is a cycle $(a_i a_j)$ which interchanges the two numbers $a_i$ and $a_j$. It is a fact both remarkable and trivial that every cycle can be expressed as a product of one or more transpositions. In fact,

$$(a_1 a_2 \cdots a_r) = (a_r a_{r-1})(a_r a_{r-2}) \cdots (a_r a_2)(a_r a_1)$$

which may be verified by direct computation. For example,

$$(12345) = (54)(53)(52)(51)$$

However, there is more than one way to write a given permutation as a product of transpositions. For example, $(12345)$ may also be expressed as a product of transpositions in the following ways:

$$(12345) = (15)(14)(13)(12)$$
$$(12345) = (54)(52)(51)(14)(32)(41)$$

as well as in many other ways.

Thus, every permutation, after it has been decomposed into disjoint cycles, may be broken down further and expressed as a product of transpositions. However, the expression as a product of transpositions is not unique, and even the number of transpositions involved is not unique.

Nevertheless, when a permutation $\pi$ is written as a product of transpositions, one property of this expression is unique: the number of transpositions involved is either always even or always odd. (This fact will be proved in a moment.) For example, we have just seen that $(12345)$ can be written as a product of four transpositions and also as a product of six transpositions; it can be written in many other ways, but always as a product of an even number of transpositions. Likewise, $(1234)$ can be decomposed in many ways into transpositions, but always an odd number of transpositions.

A permutation is called even if it is a product of an even number of transpositions, and odd if it is a product of an odd number of transpositions. What we are asserting, therefore, is that every permutation is unambiguously either odd or even.

This may seem like a pretty useless fact, but actually the very opposite is true.
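The two decompositions discussed in this chapter, first into disjoint cycles and then into transpositions, can be sketched in a few lines. This is only an illustration under stated assumptions: a permutation is stored as the bottom row of its two-row array, a cycle as a tuple, and the names `cycles`, `transpositions`, `apply_product`, and `is_even` are hypothetical helpers, not the book's notation.

```python
def cycles(perm):
    """Disjoint cycles of length >= 2 of a permutation given as a bottom row."""
    seen, result = set(), []
    for start in range(1, len(perm) + 1):
        cycle, x = [], start
        while x not in seen:          # follow start -> perm(start) -> ...
            seen.add(x)
            cycle.append(x)
            x = perm[x - 1]
        if len(cycle) > 1:            # fixed points are not listed
            result.append(tuple(cycle))
    return result


def transpositions(cycle):
    """Write (a1 a2 ... ar) as (ar a_{r-1})(ar a_{r-2}) ... (ar a1)."""
    last = cycle[-1]
    return [(last, a) for a in reversed(cycle[:-1])]


def apply_product(factors, x):
    """Image of x under a product of cycles, rightmost factor applied first."""
    for cyc in reversed(factors):
        if x in cyc:
            x = cyc[(cyc.index(x) + 1) % len(cyc)]
    return x


def is_even(perm):
    """Parity from the cycle lengths: a cycle of length r uses r - 1 transpositions."""
    return sum(len(c) - 1 for c in cycles(perm)) % 2 == 0


if __name__ == "__main__":
    f = (3, 1, 6, 9, 8, 2, 4, 5, 7)
    print(cycles(f))
    print(transpositions((1, 2, 3, 4, 5)))
```

The function `is_even` relies on the claim made in the text that the parity of the number of transpositions does not depend on how the product is written; the sketch simply counts the transpositions produced by one particular decomposition.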
A number of great theorems of mathematics depend for their proof (at that crucial step when the razor of logic makes its decisive cut) on none other but the distinction between even and odd permutations.

We begin by showing that the identity permutation, $\varepsilon$, is an even permutation.

Theorem 2 No matter how $\varepsilon$ is written as a product of transpositions, the number of transpositions is even.

Proof: Let $t_1, t_2, \ldots, t_m$ be $m$ transpositions, and suppose that

$$\varepsilon = t_1 t_2 \cdots t_m \tag{1}$$

We aim to prove that $\varepsilon$ can be rewritten as a product of $m - 2$ transpositions. We will then be done: for if $\varepsilon$ were equal to a product of an odd number of transpositions, and we were able to rewrite this product repeatedly, each time with two fewer transpositions, then eventually we would get $\varepsilon$ equal to a single transposition $(ab)$, and this is impossible.

Let $x$ be any numeral appearing in one of the transpositions $t_2, \ldots, t_m$. Let $t_k = (xa)$, and suppose $t_k$ is the last transposition in Equation (1) (reading from left to right) in which $x$ appears:

$$\varepsilon = t_1 t_2 \cdots t_{k-1} \underbrace{t_k}_{=(xa)} \underbrace{t_{k+1} \cdots t_m}_{x \text{ does not appear here}}$$

Now, $t_{k-1}$ is a transposition which is either equal to $(xa)$, or else one or both of its components are different from $x$ and $a$. This gives four possibilities, which we now treat as four separate cases.

Case I $t_{k-1} = (xa)$.
Then $t_{k-1} t_k = (xa)(xa)$, which is equal to the identity permutation. Thus, $t_{k-1} t_k$ may be removed without changing Equation (1). As a result, $\varepsilon$ is a product of $m - 2$ transpositions, as required.

Case II $t_{k-1} = (xb)$, where $b \neq x, a$.
Then $t_{k-1} t_k = (xb)(xa)$. But $(xb)(xa) = (xa)(ab)$. We replace $t_{k-1} t_k$ by $(xa)(ab)$ in Equation (1). As a result, the last occurrence of $x$ is one position further left than it was at the start.

Case III $t_{k-1} = (ca)$, where $c \neq x, a$.
Then $t_{k-1} t_k = (ca)(xa)$. But $(ca)(xa) = (xc)(ca)$. We replace $t_{k-1} t_k$ by $(xc)(ca)$ in Equation (1), as in Case II.

Case IV $t_{k-1} = (bc)$, where $b \neq x, a$ and $c \neq x, a$.
Then $t_{k-1} t_k = (bc)(xa)$. But $(bc)(xa) = (xa)(bc)$. We replace $t_{k-1} t_k$ by $(xa)(bc)$ in Equation (1), as in Cases II and III.

In Case I, we are done.
In Cases II, III, and IV, we repeat the argument one or more times. Each time, the last appearance of $x$ is one position further left than the time before. This must eventually lead to Case I. For otherwise, we end up with the last (hence the only) appearance of $x$ being in $t_1$. This cannot be: for if $t_1 = (xa)$ and $x$ does not appear in $t_2, \ldots, t_m$, then $\varepsilon(x) = a$, which is impossible! ■

(The box ■ is used to mark the ending of a proof.)

Our conclusion is contained in the next theorem.

Theorem 3 If $\pi \in S_n$, then $\pi$ cannot be both an odd permutation and an even permutation.

Suppose $\pi$ can be written as the product of an even number of transpositions, and differently as the product of an odd number of transpositions. Then the same would be true for $\pi^{-1}$. But $\varepsilon = \pi\pi^{-1}$: thus, writing $\pi^{-1}$ as a product of an even number of transpositions and $\pi$ as a product of an odd number of transpositions, we get an expression for $\varepsilon$ as a product of an odd number of transpositions. This is impossible by Theorem 2.

The set of all the even permutations in $S_n$ is a subgroup of $S_n$. It is denoted by $A_n$, and is called the alternating group on the set $\{1, 2, \ldots, n\}$.

EXERCISES

A. Practice in Multiplying and Factoring Permutations

1 Compute each of the following products in $S_9$. (Write your answer as a single permutation.)
(a) $(145)(37)(682)$
(b) $(17)(628)(9354)$
(c) $(71825)(36)(49)$
(d) $(12)(347)$
# (e) $(147)(1678)(74132)$
(f) $(6148)(2345)(12493)$
2 Write each of the following permutations in $S_9$ as a product of disjoint cycles:
(a) $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 4 & 9 & 2 & 5 & 1 & 7 & 6 & 8 & 3 \end{pmatrix}$
(b) $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 7 & 4 & 9 & 2 & 3 & 8 & 1 & 6 & 5 \end{pmatrix}$
(c)
d r 6/ ( ' V9 1 2 3 4 5 6 7 .4 9 2 5 1 7 6 8 3; '1 2345678 9^ ,7 9 5 3 1 2 4 8 6/ W\9 8 7 4 3 6 5 3 Express each of the following as a product of transpositions in 58: (a) (137428) (ft) (416)(8235) /l 2 3 4 5 6 7 8\ (c) (123)(456)(1574) <<0 (3 1 4 2 * 7 6 sJ 4 If a = (3714), /3 = (123), and y = (24135) in S7, express each of the following as a product of disjoint cycles: (a) a'B (b) y-'a (c) a2B (d) B2ay # (/) yV (g) /TV (ft) a"'y2a (Note: a2 = a°a, y3 = yyy, etc.) 5 In S5, write (12345) in five different ways as a cycle, and in five different ways as a product of transpositions. 6 In S„ express each of the following as the square of a cycle (that is, express as a2 where a is a cycle): (a) (132) (b) (12345) (c) (13)(24) to y4 B. Powers of Permutations If tt is any permutation, we write 7r°7r = 7r2, i7«tJiT: of this notation is evident. 1 Compute a-1, a2, a3, a4, a5 where (a) a = (123) (b) a = (1234) it3, etc. The convenience (c) a = (123456). In the following problems, let a be a cycle of length s, say a = (a1a2 • • • as). # 2 Describe all the distinct powers of a. How many are there? Note carefully the connection with addition of integers modulo s (page 27). 3 Find the inverse of a, and show that a~l = a'~\ Prove each of the following: § 4 o2 is a cycle iff s is odd. 5 If j is odd, a is the square of some cycle of length s. (Find it. Hint- Show a = aJ + 1.) 6 If s is even, say s = 2t, then a2 is the product of two cycles of length t. (Find them.) 7 If s is a multiple of say s = kt, then a* is the product of k cycles of length r. 8 If s is a prime number, every power of a is a cycle. C. Even and Odd Permutations 1 Determine which of the following permutations is even, and which is odd. «-(} 4 It I I I I) <'><"««> (c) (12)(76)(345) (d) (1276)(3241)(7812) (e) (123)(2345)(1357) Prove each of the following: 2 (a) The product of two even permutations is even. (b) The product of two odd permutations is even. 
(c) The product of an even permutation and an odd permutation is odd. 3 (a) A cycle of length / is even if / is odd. (b) A cycle of length / is odd if / is even. 4 (a) If a and B are cycles of length / and m, respectively, then aB is even or odd depending on whether / + m - 2 is even or odd. (b) If ir = Bx •• ■ Br where each B, is a cycle of length /,, then ir is even or odd depending on whether lt + l2 + ■•• + /, — r is even or odd. D. Disjoint Cycles In each of the following, let a and B be disjoint cycles, say a = (a,a2--as) and B = (blb2-br) Prove parts 1-3: 1 For every positive integer n, (aB)n = a"B". 2 If aB = e, then a - e and B = e. 3 If (aB)' = £, then a' = e and B' = e (where t is any positive integer). (Use part 2 in your proof.) 4 Find a transposition y such that aBy is a cycle. 5 Let y be the same transposition as in the preceding exercise. Show that ayB and yaB are cycles. 88 chapter eight permutations of a finite set 89 6 Let a and B be cycles of odd length (not disjoint). Prove that if a2 = B2, then a = B. t E. Conjugate Cycles Prove each of the following in Sn: 1 Let o=(fl,.....ujbea cycle and let tt be a permutation in S„. Then irair^ is the cycle (7r(a,), . . . , rr(as)). If a is any cycle and tt any permutation, rraTr'1 is called a conjugate of a. In the following parts, let tt denote any permutation in 5„. # 2 Conclude from part 1: Any two cycles of the same length are conjugates of each other. 3 If a and 8 are disjoint cycles, then Trent'1 and irBrr'1 are disjoint cycles. 4 Let a- be a product at • ■ • a, of t disjoint cycles of lengths /,,...,/,, respectively. Then tto-tt'1 is also a product of / disjoint cycles of lengths /,. 5 Let al and a2 be cycles of the same length. Let 8, and B2 be cycles of the same length. Let a, and B, be disjoint, and let ot2 and B2 be disjoint. There is a permutation tt SSn such that atB, = Tra2B2Tr~^. and e is called the F. 
Order of Cycles 1 Prove in S„: If a - (*, • ■ • aj is a cycle of length s, then a' = a3' = e. Is a* = f for any positive integer k G2 which looks as though it might be an isomorphism. 2. Check that / is injective and surjective (hence bijective). 3. Check that / satisfies the identity f(ab)=f(a)f(b) Here's an example: U is the group of the real numbers with the operation of addition. Mpos is the group of the positive real numbers with the operation of multiplication. It is an interesting fact that R and Rpos are isomorphic. To see this, let us go through the steps outlined above: 1. The educated guess: The exponential function f(x) = ex is a function from U to (Rpos which, if we recall its properties, might do the trick. 2. / is injective: Indeed, if f(a)=f(b), that is, e" = eb, then, taking the natural log on both sides, we get a = b. f is surjective: Indeed, if y £ Rpos, that is, if y is any positive real number, then y = eln y = /(In y); thus, y = f(x) for x = In y. 3. It is well known that ea+b = e" ■ eb, that is, f{a + b)=f(a)-f(b) Incidentally, note carefully that the operation of U is +, whereas the operation of Upos is •. That is the reason we have to use + on the left side of the preceding equation, and ■ on the right side of the equation. How does one recognize when two groups are not isomorphic? In practice it is usually easier to show that two groups are not isomorphic than to show they are. Remember that if two groups are isomorphic they are replicas of each other; their elements (and their operation) may be named differently, but in all other respects they are the same and share the same properties. Thus, if a group G, has a property which group G2 does not have (or vice versa), they are not isomorphic! Here are some examples of properties to look out for: 1. Perhaps G( is commutative, and G2 is not. 2. Perhaps Gx has an element which is its own inverse, and G2 does not. 3. 
Perhaps $G_1$ is generated by two elements, whereas $G_2$ is not generated by any choice of two of its elements. 4. Perhaps every element of $G_1$ is the square of an element of $G_1$, whereas $G_2$ does not have this property. This list is by no means exhaustive; it merely illustrates the kind of things to be on the lookout for. Incidentally, the kind of properties to watch for are properties which do not depend merely on the names assigned to individual elements; for instance, in our last example, $0 \in G_1$ and $0 \notin G_2$, but nevertheless $G_1$ and $G_2$ are isomorphic. Finally, let us state the obvious: if $G_1$ and $G_2$ cannot be put in one-to-one correspondence (say, $G_1$ has more elements than $G_2$), clearly they cannot be isomorphic.

In the early days of modern algebra the word "group" had a different meaning from the meaning it has today. In those days a group always meant a group of permutations. The only groups mathematicians used were groups whose elements were permutations of some fixed set and whose operation was composition. There is something comforting about working with tangible, concrete things, such as groups of permutations of a set. At all times we have a clear picture of what it is we are working with. Later, as the axiomatic method reshaped algebra, a group came to mean any set with any associative operation having a neutral element and allowing each element an inverse. The new notion of group pleases mathematicians because it is simpler and more lean and sparing than the old notion of groups of permutations; it is also more general because it allows many new things to be groups which are not groups of permutations. However, it is harder to visualize, precisely because so many different things can be groups. It was therefore a great revelation when, about 100 years ago, Arthur Cayley discovered that every group is isomorphic to a group of permutations. Roughly, this means that the groups of permutations are actually all the groups there are!
Every group is (or is a carbon copy of) a group of permutations. This great result is a classic theorem of modern algebra. As a bonanza, its proof is not very difficult.

Cayley's Theorem Every group is isomorphic to a group of permutations.

Proof: Let G be a group; we wish to show that G is isomorphic to a group of permutations. The first question to ask is, "What group of permutations? Permutations of what set?" (After all, every permutation must be a permutation of some fixed set.) Well, the one set we have at hand is the set G, so we had better fix our attention on permutations of G. The way we match up elements of G with permutations of G is quite interesting: With each element a in G we associate a function $\pi_a : G \to G$ defined by

$\pi_a(x) = ax$

In other words, $\pi_a$ is the function whose rule may be described by the words "multiply on the left by a." We will now show that $\pi_a$ is a permutation of G:

1. $\pi_a$ is injective: Indeed, if $\pi_a(x_1) = \pi_a(x_2)$, then $ax_1 = ax_2$, so by the cancellation law, $x_1 = x_2$.
2. $\pi_a$ is surjective: For if $y \in G$, then $y = a(a^{-1}y) = \pi_a(a^{-1}y)$. Thus, each y in G is equal to $\pi_a(x)$ for $x = a^{-1}y$.
3. Since $\pi_a$ is an injective and surjective function from G to G, $\pi_a$ is a permutation of G.

Let us remember that we have a permutation $\pi_a$ for each element a in G; for example, if b and c are other elements in G, $\pi_b$ is the permutation "multiply on the left by b," $\pi_c$ is the permutation "multiply on the left by c," and so on. In general, let G* denote the set of all the permutations $\pi_a$ as a ranges over all the elements of G:

$G^* = \{\pi_a : a \in G\}$

Observe now that G* is a set consisting of permutations of G, but not necessarily all the permutations of G. In Chapter 7 we used the symbol $S_G$ to designate the group of all the permutations of G. We must show now that G* is a subgroup of $S_G$, for that will prove that G* is a group of permutations.
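To make the construction concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that G is the group $S_3$ with each element stored as a tuple; the names `op`, `pi`, `compose`, and `G_star` are hypothetical helpers and not notation from the text. It builds $\pi_a$ for every a in G, checks that each $\pi_a$ is a permutation of G, and checks that distinct elements give distinct permutations. The last check anticipates, for this one small group only, the closure property that the rest of the proof establishes in general.

```python
from itertools import permutations, product

# Illustrative choice (an assumption of this sketch): G = S_3, with each
# element stored as a tuple p, where p[i] is the image of i.
G = list(permutations(range(3)))

def op(a, b):
    """Group operation: the composite a o b (apply b first, then a)."""
    return tuple(a[b[i]] for i in range(3))

def pi(a):
    """pi_a, the rule 'multiply on the left by a', stored as a dict x -> ax."""
    return {x: op(a, x) for x in G}

def compose(f, g):
    """Composite f o g of two functions on G (apply g first, then f)."""
    return {x: f[g[x]] for x in G}

# G* = {pi_a : a in G}, indexed by the element a.
G_star = {a: pi(a) for a in G}

# 1. Each pi_a is injective and surjective: its values are exactly G.
each_is_permutation = all(sorted(p.values()) == sorted(G)
                          for p in G_star.values())

# 2. Distinct elements of G give distinct permutations.
all_distinct = len({tuple(p[x] for x in G) for p in G_star.values()}) == len(G)

# 3. The composite of pi_a and pi_b is again a member of G*, namely pi_(ab).
closed = all(compose(G_star[a], G_star[b]) == G_star[op(a, b)]
             for a, b in product(G, repeat=2))

print(each_is_permutation, all_distinct, closed)
```

Storing each $\pi_a$ as a dictionary keeps the sketch close to the definition: a permutation of G is simply a function from G to G that hits every element exactly once.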
To prove that G* is a subgroup of $S_G$, we must show that G* is closed with respect to composition, and closed with respect to inverses. That is, we must show that if $\pi_a$ and $\pi_b$ are any elements of G*, their composite $\pi_a \circ \pi_b$ is also in G*; and if $\pi_a$ is any element of G*, its inverse is in G*.

First, we claim that if a and b are any elements of G, then

$\pi_a \circ \pi_b = \pi_{ab} \qquad (3)$

To show that $\pi_a \circ \pi_b$ and $\pi_{ab}$ are the same, we must show that they have the same effect on every element x: that is, we must prove the identity $[\pi_a \circ \pi_b](x) = \pi_{ab}(x)$. Well, $[\pi_a \circ \pi_b](x) = \pi_a(\pi_b(x)) = \pi_a(bx) = a(bx) = (ab)x = \pi_{ab}(x)$. Thus, $\pi_a \circ \pi_b = \pi_{ab}$; this proves that the composite of any two members $\pi_a$ and $\pi_b$ of G* is another member $\pi_{ab}$ of G*. Thus, G* is closed with respect to composition.

It is easy to see that $\pi_e$ is the identity function: indeed,

$\pi_e(x) = ex = x$

In other words, $\pi_e$ is the identity element of $S_G$. Finally, by Equation (3),

$\pi_a \circ \pi_{a^{-1}} = \pi_{aa^{-1}} = \pi_e$

So by Theorem 2 of Chapter 4, the inverse of $\pi_a$ is $\pi_{a^{-1}}$. This proves that the inverse of any member $\pi_a$ of G* is another member $\pi_{a^{-1}}$ of G*. Thus, G* is closed with respect to inverses. Since G* is closed with respect to composition and inverses, G* is a subgroup of $S_G$.

We are now in the final lap of our proof. We have a group of permutations G*, and it remains only to show that G is isomorphic to G*. To do this, we must find an isomorphism $f : G \to G^*$. Let f be the function

$f(a) = \pi_a$

In other words, f matches each element a in G with the permutation $\pi_a$ in G*. We can quickly show that f is an isomorphism:

1. f is injective: Indeed, if $f(a) = f(b)$ then $\pi_a = \pi_b$. Thus, $\pi_a(e) = \pi_b(e)$, that is, $ae = be$, so, finally, $a = b$.
2. f is surjective: Indeed, every element of G* is some $\pi_a$, and $\pi_a = f(a)$.
3. Lastly, $f(ab) = \pi_{ab} = \pi_a \circ \pi_b = f(a) \circ f(b)$.

Thus, f is an isomorphism, and so $G \cong G^*$. ■

EXERCISES

A. Isomorphism Is an Equivalence Relation among Groups

The following three facts about isomorphism are true for all groups:
(i) Every group is isomorphic to itself.
(ii) If $G_1 \cong G_2$, then $G_2 \cong G_1$.
(iii) If $G_1 \cong G_2$ and $G_2 \cong G_3$, then $G_1 \cong G_3$.

Fact (i) asserts that for any group G, there exists an isomorphism from G to G. Fact (ii) asserts that, if there is an isomorphism f from $G_1$ to $G_2$, there must be some isomorphism from $G_2$ to $G_1$. Well, the inverse of f is such an isomorphism. Fact (iii) asserts that, if there are isomorphisms $f : G_1 \to G_2$ and $g : G_2 \to G_3$, there must be an isomorphism from $G_1$ to $G_3$. One can easily guess that $g \circ f$ is such an isomorphism. The details of facts (i), (ii), and (iii) are left as exercises.

1 Let G be any group. If $\varepsilon : G \to G$ is the identity function, $\varepsilon(x) = x$, show that $\varepsilon$ is an isomorphism.
2 Let $G_1$ and $G_2$ be groups, and $f : G_1 \to G_2$ an isomorphism. Show that $f^{-1} : G_2 \to G_1$ is an isomorphism. [Hint: Review the discussion of inverse functions at the end of Chapter 6. Then, for arbitrary elements $c, d \in G_2$, there exist $a, b \in G_1$ such that $c = f(a)$ and $d = f(b)$. Note that $a = f^{-1}(c)$ and $b = f^{-1}(d)$. Show that $f^{-1}(cd) = f^{-1}(c)f^{-1}(d)$.]
3 Let $G_1$, $G_2$, and $G_3$ be groups, and let $f : G_1 \to G_2$ and $g : G_2 \to G_3$ be isomorphisms. Prove that $g \circ f : G_1 \to G_3$ is an isomorphism.

B. Elements Which Correspond under an Isomorphism

Recall that an isomorphism f from $G_1$ to $G_2$ is a one-to-one correspondence between $G_1$ and $G_2$ satisfying $f(ab) = f(a)f(b)$. f matches every element of $G_1$ with a corresponding element of $G_2$. It is important to note that:
(i) f matches the neutral element of $G_1$ with the neutral element of $G_2$.
(ii) If f matches an element x in $G_1$ with y in $G_2$, then, necessarily, f matches $x^{-1}$ with $y^{-1}$. That is, if $x \leftrightarrow y$, then $x^{-1} \leftrightarrow y^{-1}$.
(iii) f matches a generator of $G_1$ with a generator of $G_2$.
The details of these statements are now left as an exercise. Let $G_1$ and $G_2$ be groups, and let $f : G_1 \to G_2$ be an isomorphism.

1 If $e_1$ denotes the neutral element of $G_1$ and $e_2$ denotes the neutral element of $G_2$, prove that $f(e_1) = e_2$. [Hint: In any group, there is exactly one neutral element; show that $f(e_1)$ is the neutral element of $G_2$.]
2 Prove that for each element a in G,, /(a"') — [f(a)]"'. (Hint: You may use Theorem 2 of Chapter 4.) 3 If G, is a cyclic group with generator a, prove that G2 is also a cyclic group, with generator f(a). C. Isomorphism of Some Finite Groups In each of the following, G and H are finite groups. Determine whether or not G = H. Prove your answer in either case. To find an isomorphism from G to H will require a little ingenuity. For example, if G and H are cyclic groups, it is clear that we must match a generator a of G with a generator b of H; that is, /(a) = b. Then f(aa) = bb, f(aaa) = bbb, and so on. If G and H are not cyclic, we have other ways: for example, if G has an element which is its own inverse, it must be matched with an element of H having the same property. Often, the specifics of a problem will suggest an isomorphism, if we keep our eyes open. To prove that a specific one-to-one correspondence /: G—*// is an isomorphism, we may check that it transforms the table of G into the table of H. # 1 G is the checkerboard game group of Chapter 3, Exercise D. H is the group of the complex numbers {/, —i, 1, -1} under multiplication. 2 G is the same as in part 1. H = Z„. 3 G is the group P2 of subsets of a two-element set. (See Chapter 3, Exercise C.) H is as in part 1. #4 G is 53. H is the group of matrices described on page 28 of the text. 5 G is the coin game group of Chapter 3, Exercise E. H is £>4, the group of symmetries of the square. 6 G is the group of symmetries of the rectangle. H is as in part 1. D. Separating Groups into Isomorphism Classes Each of the following is a set of four groups. In each set, determine which groups are isomorphic to which others. Prove your answers, and use Exercise A3 where convenient. 11, Z2 x Z2 P2 V [P2 denotes the group of subsets of a two-element set. (See Chapter 3, Exercise C.) V denotes the group of the four complex numbers (t, -i, 1, -1} with respect to multiplication.] 
2 S, Z„ Z3 x Z2 Z* (Z* denotes the group {1, 2,3, 4,5, 6} with multiplication modulo 7. The product modulo 7 of a and b is the remainder of ab after division by 7.) 3 Z8 P3 Z2 x Z2 x Z, Z>4 (Z>4 is the group of symmetries of the square.) 4 The groups having the following Cayley diagrams: E. Isomorphism of Infinite Groups # 1 Let E designate the group of all the even integers, with respect to addition. Prove that Z=E. 2 Let G be the group {10" : «eZ} with respect to multiplication. Prove that G 3 Z. (Remember that the operation of Z is addition.) 3 Prove that C = RxR. 4 We have seen in the text that R is isomorphic to Rp"'. Prove that R is not isomorphic to R* (the multiplicative group of the nonzero real numbers). (Hint: Consider the properties of the number -1 in R*. Does R have any element with those properties?) 100 chapter nine isomorphism 101 5 Prove that Z is not isomorphic to Q. 6 We have seen that R = Rpos. However, prove that Q is not isomorphic to Qpos. (0 (c) m < 0 and n < 0 2 Prove that (a™)" = amn in the following cases: (a) m=0 (b) n = 0 (c) m < 0 and n > 0 (d) m > 0 and n < 0 (e) m < 0 and n < 0 3 Prove that (a")'1 = a~" in the following cases: (a) n = 0 (b) n<0 108 chapter ten ORDER OF GROUP ELEMENTS 109 B. Examples of Orders of Elements 1 What is the order of 10 in Z25? 2 What is the order of 6 in Z16? 3 What is the order of f- ■C inS6? 4 What is the order of 1 in R*? What is the order of 1 in R? 5 If A is the set of all the real numbers x #0, 1,2, what is the order of in SA? 6 Can an element of an infinite group have finite order? Explain. 7 In £7A, list all the elements (a) of order 2; (b) of order 3; (c) of order 4; (d) of order 6. C. Elementary Properties of Order Let u, b, and c be elements of a group G. Prove the following: 1 Ord(a) = 1 iff a = e. 2 If ord(a) = n, then a"~r = (a) '. 3 If a* = e where k is odd, then the order of a is odd. # 4 OTd(a) = OTd(bab~s). 5 The order of a"' is the same as the order of a. 
6 The order of ab is the same as the order of ba. [Hint: if (ha)" = baba •••ba — e then a is the inverse of x. Thus, ax = e.\ 7 Ord(abc) = ord(cafc) = ord(ica). 8 Let x = a,a2 • • • an, and let y be a product of the same factors, permuted cyclically. (That is, y = akaktl ■ ■ ■ anat ■ ■ ■ ak_l.) Then ord(x) = ord(y). D. Further Properties of Order Let a be any element of finite order of a group G. Prove the following: 1 If aF = e where p is a prime number, then a has order p. (a ^ e.) # 2 The order of a* is a divisor (factor) of the order of a. 3 If ord(a) = km, then ord(a*) = m. 4 If ord(a) = n where n is odd, then ord(a2) = n. 5 If a has order n, and a' = as, then n is a factor of r — s. 6 If a is the only element of order k in G, then a is in the center of G. (Hint: Use Exercise C4. Also, see Chapter 4, Exercise C6.) 7 If the order of a is not a multiple of m, then the order of a* is not a multiple of m. (Hint: Use part 2.) 8 If ord(a) = mk and a'k = e, then r is a multiple of m. t E. Relationship between ord(ab), ord(a), and ord(ft) Let a and 6 be elements of a group G. Let ord(a) = m and ord(b) = n; \cm(m, n) denote the least common multiple of m and n. Prove parts 1-5: let 1 If a and b commute, then ord(ab) is a divisor of lcm(m, n). 2 If m and n are relatively prime, then no power of a can be equal to any power of b (except for e). (Remark: Two integers are said to be relatively prime if they have no common factors except ±1.) (Hint: Use Exercise D2.) 3 If m and n are relatively prime, then the products s'6'(0si 1, hence we can write m = m'q and n = n'q. Explain why (am)" = e, and proceed from there.] 3 Let / be the least common multiple of m and n. Let Urn = k. Explain why («")* = e. 4 Prove: If (am)' = e, then n is a factor of mt. (Thus, mt is a common multiple of m and n.) Conclude that / =s mt 5 Use parts 3 and 4 to prove that the order of am is [lcm(m, n)]/m. ord(6)= — = — Thus b" k = e. (Why?) Draw your conclusion from these facts.] 
Thus, if a has order n and a has a kth root b, then b has order $nk/l$, where n and l are relatively prime.
6 Let a have order n. Let k be an integer such that every prime factor of k is a factor of n. Prove: If a has a kth root b, then $\mathrm{ord}(b) = nk$.

† H. Relationship between the Order of a and the Order of any kth Root of a

Let a denote an element of a group G.
1 Let a have order 12. Prove that if a has a cube root, say $a = b^3$ for some $b \in G$, then b has order 36. {Hint: Show that $b^{36} = e$; then show that for each factor k of 36, $b^k = e$ is impossible. [Example: If $b^{12} = e$, then $b^{12} = (b^3)^4 = a^4 = e$.] Derive your conclusion from these facts.}
# 2 Let a have order 6. If a has a fourth root in G, say $a = b^4$, what is the order of b?
3 Let a have order 10. If a has a sixth root in G, say $a = b^6$, what is the order of b?
4 Let a have order n, and suppose a has a kth root in G, say $a = b^k$. Explain why the order of b is a factor of nk. Let $l = nk/\mathrm{ord}(b)$.
5 Prove that n and l are relatively prime. [Hint: Suppose n and l have a common factor $q > 1$. Then $n = qn'$ and $l = ql'$, so

CHAPTER ELEVEN CYCLIC GROUPS

If G is a group and $a \in G$, it may happen that every element of G is a power of a. In other words, G may consist of all the powers of a, and nothing else:

$G = \{a^n : n \in \mathbb{Z}\}$

In that case, G is called a cyclic group, and a is called its generator. We write

$G = \langle a \rangle$

and say that G is the cyclic group generated by a. If $G = \langle a \rangle$ is the cyclic group generated by a, and a has order n, we say that G is a cyclic group of order n. We will see in a moment that, in that case, G has exactly n elements. If the generator of G has order infinity, we say that G is a cyclic group of order infinity. In that case, we will see that G has infinitely many elements.

The simplest example of a cyclic group is $\mathbb{Z}$, which consists of all the multiples of 1.
(Remember that in additive notation we speak of "multiples" instead of "powers.") $\mathbb{Z}$ is a cyclic group of order infinity; its generator is 1. Another example of a cyclic group is $\mathbb{Z}_6$, which consists of all the multiples of 1, added modulo 6. $\mathbb{Z}_6$ is a cyclic group of order 6; 1 is a generator of $\mathbb{Z}_6$, but $\mathbb{Z}_6$ has another generator too. What is it?

Suppose $\langle a \rangle$ is a cyclic group whose generator a has order n. Since $\langle a \rangle$ is the set of all the powers of a, it follows from Theorem 3 of Chapter 10 that

$\langle a \rangle = \{e, a, a^2, \ldots, a^{n-1}\}$

If we compare this group with $\mathbb{Z}_n$, we notice a remarkable resemblance! For one thing, they are obviously in one-to-one correspondence:

$\langle a \rangle = \{a^0, a^1, a^2, \ldots, a^{n-1}\}$
$\mathbb{Z}_n = \{0, 1, 2, \ldots, n-1\}$

In other words, the function $f(i) = a^i$ is a one-to-one correspondence from $\mathbb{Z}_n$ to $\langle a \rangle$. But this function has an additional property, namely,

$f(i + j) = a^{i+j} = a^i a^j = f(i)f(j)$

Thus, f is an isomorphism from $\mathbb{Z}_n$ to $\langle a \rangle$. In conclusion,

$\mathbb{Z}_n \cong \langle a \rangle$

Let us review the situation in the case where a has order infinity. In this case, by Theorem 4 of Chapter 10,

$\langle a \rangle = \{\ldots, a^{-2}, a^{-1}, e, a, a^2, \ldots\}$

There is obviously a one-to-one correspondence between this group and $\mathbb{Z}$:

$\langle a \rangle = \{\ldots, a^{-2}, a^{-1}, a^0, a^1, a^2, \ldots\}$
$\mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$

In other words, the function $f(i) = a^i$ is a one-to-one correspondence from $\mathbb{Z}$ to $\langle a \rangle$. As before, f is an isomorphism, and therefore

$\mathbb{Z} \cong \langle a \rangle$

What we have just proved is a very important fact about cyclic groups; let us state it as a theorem.

Theorem 1: Isomorphism of cyclic groups
(i) For every positive integer n, every cyclic group of order n is isomorphic to $\mathbb{Z}_n$. Thus, any two cyclic groups of order n are isomorphic to each other.
(ii) Every cyclic group of order infinity is isomorphic to $\mathbb{Z}$, and therefore any two cyclic groups of order infinity are isomorphic to each other.

If G is any group and $a \in G$, it is easy to see that
1. The product of any two powers of a is a power of a; for $a^m a^n = a^{m+n}$.
2.
Furthermore, the inverse of any power of a is a power of a, because $(a^n)^{-1} = a^{-n}$.
3. It therefore follows that the set of all the powers of a is a subgroup of G.

This subgroup is called the cyclic subgroup of G generated by a. It is obviously a cyclic group, and therefore we denote it by $\langle a \rangle$. If the element a has order n, then, as we have seen, $\langle a \rangle$ contains the n elements $\{e, a, a^2, \ldots, a^{n-1}\}$. If a has order infinity, then $\langle a \rangle = \{\ldots, a^{-2}, a^{-1}, e, a, a^2, \ldots\}$ and has infinitely many elements.

For example, in $\mathbb{Z}$, $\langle 2 \rangle$ is the cyclic subgroup of $\mathbb{Z}$ which consists of all the multiples of 2. In $\mathbb{Z}_{15}$, $\langle 3 \rangle$ is the cyclic subgroup {0, 3, 6, 9, 12} which contains all the multiples of 3. In $S_3$, $\langle \beta \rangle = \{\varepsilon, \beta, \delta\}$, and contains all the powers of $\beta$.

Can a cyclic group, such as $\mathbb{Z}$, have a subgroup which is not cyclic? This question is of great importance in our understanding of cyclic groups. Its answer is not obvious, and certainly not self-evident:

Theorem 2 Every subgroup of a cyclic group is cyclic.

Let $G = \langle a \rangle$ be a cyclic group, and let H be any subgroup of G. We wish to prove that H is cyclic. Now, G has a generator a; and when we say that H is cyclic, what we mean is that H too has a generator (call it b), and H consists of all the powers of b. The gist of this proof, therefore, is to find a generator of H, and then check that every element of H is a power of this generator.

Here is the idea: G is the cyclic group generated by a, and H is a subgroup of G. Every element of H is therefore in G, which means that every element of H is some power of a. The generator of H which we are searching for is therefore one of the powers of a, namely one of the powers of a which happens to be in H; but which one? Obviously the lowest one! More accurately, the lowest positive power of a in H. And now, carefully, here is the proof:

Proof: Let m be the smallest positive integer such that $a^m \in H$. We will show that every element of H is a power of $a^m$, hence $a^m$ is a generator of H. Let $a^t$ be any element of H.
Divide t by m using the division algorithm:

$t = mq + r, \qquad 0 \le r < m$

Then $a^t = a^{mq+r} = a^{mq}a^r$. Solving for $a^r$,

$a^r = (a^{mq})^{-1}a^t = (a^m)^{-q}a^t$

But $a^m \in H$ and $a^t \in H$; thus $(a^m)^{-q}a^t \in H$. It follows that $a^r \in H$. But $r < m$ and m is the smallest positive integer such that $a^m \in H$. So $r = 0$, and therefore $t = mq$. We conclude that every element $a^t \in H$ is of the form $a^t = (a^m)^q$, that is, a power of $a^m$. Thus, H is the cyclic group generated by $a^m$. ■

This chapter ends with a final comment regarding the different uses of the word "order" in algebra. Let G be a group; as we have seen, the order of an element a in G is the least positive integer n such that

$\underbrace{aa \cdots a}_{n \text{ factors}} = e$

(The order of a is infinity if there is no such n.) Earlier, we defined the order of the group G to be the number of elements in G. Remember that the order of G is denoted by $|G|$. These are two separate and distinct definitions, not to be confused with one another. Nevertheless, there is a connection between them: Let a be an element of order n in a group. By Chapter 10, Theorem 3, there are exactly n different powers of a, hence $\langle a \rangle$ has n elements. Thus,

If $\mathrm{ord}(a) = n$ then $|\langle a \rangle| = n$

That is, the order of a cyclic group is the same as the order of its generator.

EXERCISES

A. Examples of Cyclic Groups

1 List the elements of $\langle 6 \rangle$ in $\mathbb{Z}_{16}$.
2 List the elements of $\langle f \rangle$ in $S_6$, where 2 3 >.6 1 3
3 Describe the cyclic subgroup of SF(R).
6 Show that −1, as well as 1, is a generator of $\mathbb{Z}$. Are there any other generators of $\mathbb{Z}$? Explain! What are the generators of an arbitrary infinite cyclic group $\langle a \rangle$?
7 Is $\mathbb{R}^*$ cyclic? Try to prove your answer. Hint: Suppose k is a generator of $\mathbb{R}^*$: If kk2>k* >■ t3 If k>l, then k(9) = 6.

Let a have order n, and prove the following:
1 $a^r$ is a generator of $\langle a \rangle$ iff r and n are relatively prime. (Hint: See Chapter 10, Exercise G2.)
2 $\langle a \rangle$ has $\phi(n)$ different generators. (Use part 1 in your proof.)
3 For any factor m of n, let $C_m = \{x \in \langle a \rangle : x^m = e\}$. $C_m$ is a subgroup of $\langle a \rangle$.
# 4 Cm has exactly m elements. (Hint: Use Exercise B4.) 5 An element x in (a) has order m iff x is a generator of Cm. 6 There are d>(ra) elements of order m in (a). (Use parts 1 and 5.) # 7 Let n = mk. a has order m iff r = /c/ where / and m are relatively prime. (Hint: See Chapter 10, Exercise Gl,2.) 8 If c is any generator of (a), then {cr : r is relatively prime to n) is the set of all the generators of (a). D. Elementary Properties of Cyclic Subgroups of Groups Let G be a group and let a, b £ G. Prove the following: 1 If a is a power of b, say a = bk, then (a) C (b). 1 Suppose a is a power of b, say a - bk. Then b is equal to a power of a iff (a) = (b). 3 Suppose a£ (b). Then (a) = (Z>) iff a and b have the same order. 4 Let ord(a) = n, and b = ak. Then (a) = (b) iff n and k are relatively prime. 5 Let ord(a) = n, and suppose a has a fcth root, say a = 6*. Then (a) = (b) iff k and « are relatively prime. 6 Any cyclic group of order mn has a unique subgroup of order n. E. Direct Products of Cyclic Groups Let G and H be groups, with a £ G and bSH. Prove parts 1-4: 1 If (a, fc) is a generator of G x //, then a is a generator of G and b is a generator of H. 2 If G x H is a cyclic group, then G and H are both cyclic. 3 The converse of part 2 is false. (Give an example to demonstrate this.) 4 Let ord(a) = m and ord(6) = n. The order of (a, b) in G x H is the least common multiple of m and n. (Hint: Use Chapter 10, Theorem 5. Nothing else is needed!) 5 Conclude from part 4 that if m and n are relatively prime, then (a, b) has order mn. 6 Suppose (c, a") £ G X H, where c has order m and has order n. Prove: If m and n are nor relatively prime (hence have a common factor q> 1), then the order of (c, is less than mn. 1 Conclude from parts 5 and 6 that (a) x (b) is cyclic iff ord(a) and ord(fc) are relatively prime. 8 Let G be an abelian group of order mn, where m and n are relatively prime. 
Prove: If G has an element a of order m and an element b of order n, then G = (a) x (6). (Hint: See Chapter 10, Exercise El, 2.) 9 Let (a) be a cyclic group of order mn, where m and n are relatively prime, and prove that (a) = (am) x (a"). t F. Jtth Roots of Elements in a Cyclic Group Let (a) be a cyclic group of order n. For any integer k, we may ask: which elements in (a) have a kth root? The exercises which follow will answer this question. 1 Let a have order 10. For what integers k (0« k 12), does a have a Jfcth root? For what integers k (0=s k =s 12), does a6 have &th root? 118 chapter eleven Let k and n be any integers, and let gcd(&, n) denote the greatest common divisor of k and n. A linear combination of k and n is any expression ck + dn where c and d are integers. It is a simple fact of number theory (the proof is given on page 219), that an integer m is equal to a linear combination of k and n iff m is a multiple of gcd(/c, n). Use this fact to prove the following, where a is an element of order n in a group G. 2 If m is a multiple of gcd(/c, n), then am has a Arth root in (a). [Hint: Compute am, and show that am = (ac)k for some ac S (a).] # 3 If a™ has a Jtth root in (a), then m is a multiple of gcd(£, n). Thus, a™ has a fcth root in (a) iff gcd(/c, n) is a factor of rn. 4 a has a kth root in (a) iff A: and n are relatively prime. 5 Let p be a prime number. (a) If n is not a multiple of p, then every element in (a) has a pth root. (£>) If n is a multiple of p, and a" has a pth root, then m is a multiple of p. (Thus, the only elements in (a) which have pth roots are e, ap, alp, etc.) 6 The set of all the elements in (a) having a kth root is a subgroup of (a). Explain why this subgroup is cyclic, say (a1"). What is the value of m? (Use part 3.) 
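As a small numerical companion to this chapter, the following sketch works in the additive group $\mathbb{Z}_n$ and is only an illustration under stated assumptions: the function names `cyclic_subgroup`, `order`, and `smallest_positive` are hypothetical, and checking a few values of n proves nothing in general. It lists the cyclic subgroup generated by k, compares the order of k with the size of that subgroup, and mirrors the idea behind Theorem 2 by checking that the smallest positive element of the subgroup generates it.

```python
from math import gcd

def cyclic_subgroup(k, n):
    """Return the cyclic subgroup <k> of Z_n (all multiples of k mod n), sorted."""
    elements, x = [0], k % n
    while x != 0:
        elements.append(x)
        x = (x + k) % n
    return sorted(elements)

def order(k, n):
    """Order of k in Z_n: the least positive t such that t*k = 0 (mod n)."""
    t, x = 1, k % n
    while x != 0:
        x = (x + k) % n
        t += 1
    return t

def smallest_positive(subgroup):
    """Smallest positive element of a subgroup of Z_n, or 0 if it is {0}."""
    positives = [x for x in subgroup if x > 0]
    return min(positives) if positives else 0

# The example from the text: <3> in Z_15.
print(cyclic_subgroup(3, 15))

n = 16
for k in range(n):
    H = cyclic_subgroup(k, n)
    # The order of a cyclic group equals the order of its generator.
    assert len(H) == order(k, n) == n // gcd(k, n)
    # The smallest positive element of H generates H.
    assert cyclic_subgroup(smallest_positive(H), n) == H
```

The comparison with `n // gcd(k, n)` is a standard fact about $\mathbb{Z}_n$ that the sketch assumes rather than derives; it ties in with the role played by gcd(k, n) in Exercise F.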
CHAPTER TWELVE PARTITIONS AND EQUIVALENCE RELATIONS

Imagine emptying a jar of coins onto a table and sorting them into separate piles, one with the pennies, one with the nickels, one with the dimes, one with the quarters, and one with the half-dollars. This is a simple example of partitioning a set. Each separate pile is called a class of the partition; the jarful of coins has been partitioned into five classes.

Here are some other examples of partitions: The distributor of farm-fresh eggs usually sorts the daily supply according to size, and separates the eggs into three classes called "large," "medium," and "small." The delegates to the Democratic national convention may be classified according to their home state, thus falling into 50 separate classes, one for each state. A student files class notes according to subject matter; the notebook pages are separated into four distinct categories, marked (let us say) "algebra," "psychology," "English," and "American history."

Every time we file, sort, or classify, we are performing the simple act of partitioning a set. To partition a set A is to separate the elements of A into nonempty subsets, say $A_1$, $A_2$, $A_3$, etc., which are called the classes of the partition. Any two distinct classes, say $A_i$ and $A_j$, are disjoint, which means they have no elements in common. And the union of the classes is all of A.

Instead of dealing with the process of partitioning a set (which is awkward mathematically), it is more convenient to deal with the result of partitioning a set. Thus, $\{A_1, A_2, A_3, A_4\}$, in the illustration above, is called a partition of A. We therefore have the following definition:

A partition of a set A is a family $\{A_i : i \in I\}$ of nonempty subsets of A which are mutually disjoint and whose union is all of A.

The notation $\{A_i : i \in I\}$ is the customary way of representing a family of sets $\{A_i, A_j, A_k, \ldots\}$ consisting of one set $A_i$ for each index i in I.
(The elements of / are called indices; the notation {A,: i€E /} may be read: the family of sets A,, as i ranges over /.) Let {Aj: JG/} be a partition of the set A. We may think of the indices (', /, k,... as labels for naming the classes A,, A-, Ak,.... Now, in practical problems, it is very inconvenient to insist that each class be named once and only once. It is simpler to allow repetition of indexing whenever convenience dictates. For example, the partition illustrated previously might also be represented like this, where Ax is the same class as A5, A2 is the same as A6, and A3 is the same as A7. As we have seen, any two distinct classes of a partition are disjoint; this is the same as saying that if two classes are not disjoint, they must be equal. The other condition for the classes of a partition of A is that their union must be equal to A; this is the same as saying that every element of A lies in one of the classes. Thus, we have the following, more explicit definition of partition: By a partition of a set A we mean a family (A, : / G /} of nonempty subsets of A such that (i) If any two clctsses, say At and A., have a common element x (that is, are not disjoint), then A,= A-, and (ii) Every element x of A lies in one of the classes. We now turn to another elementary concept of mathematics. A relation on a set A is any statement which is either true or false for each ordered pair (x, y) of elements of A. Examples of relations, on appropri- ate sets, are "x = y," "x c, where c is some fixed real number. 8 If C is any set, Pc denotes the set of all the subsets of C. Let D C C. In Pc, let A ~ B iff A n £> = 5 n Z>. 9 In R x R, let (a, 6) ~ (c, d) iff a2 + b2 = c1 + d2. 10 In R*, let a~ b iff a/6£ 0). 5 (x, y) ~ (u, i>) iff * + y = « + f- 6 (or, y) — (u, v) iff jc2 — y = u2 — v. D. Equivalence Relations on Groups Let G be a group. In each of the following, a relation on G is defined. Prove it is an equivalence relation. Then describe the equivalence class of e. 
1 If H is a subgroup of G, let a ~ b iff ab⁻¹ ∈ H.
2 If H is a subgroup of G, let a ~ b iff a⁻¹b ∈ H. Is this the same equivalence relation as in part 1? Prove, or find a counterexample.
3 Let a ~ b iff there is an x ∈ G such that a = xbx⁻¹.
4 Let a ~ b iff there is an integer k such that aᵏ = bᵏ.
# 5 Let a ~ b iff ab⁻¹ commutes with every x ∈ G.
6 Let a ~ b iff ab⁻¹ is a power of c (where c is a fixed element of G).

E. General Properties of Equivalence Relations and Partitions
1 Let {Aᵢ : i ∈ I} be a partition of A. Let {Bⱼ : j ∈ J} be a partition of B. Prove that {Aᵢ × Bⱼ : (i, j) ∈ I × J} is a partition of A × B.
2 Let ~_I be the equivalence relation corresponding to the above partition of A, and let ~_J be the equivalence relation corresponding to the partition of B. Describe the equivalence relation corresponding to the above partition of A × B.
3 Let f : A → B be a function. Define ~ by a ~ b iff f(a) = f(b). Prove that ~ is an equivalence relation on A. Describe its equivalence classes.
4 Let f : A → B be a function, and let {Bᵢ : i ∈ I} be a partition of B. Prove that {f⁻¹(Bᵢ) : i ∈ I} is a partition of A. If ~_I is the equivalence relation corresponding to the partition of B, describe the equivalence relation corresponding to the partition of A. [Remark: For any C ⊆ B, f⁻¹(C) = {x ∈ A : f(x) ∈ C}.]
5 Let ~₁ and ~₂ be distinct equivalence relations on A. Define ~₃ by a ~₃ b iff a ~₁ b and a ~₂ b. Prove that ~₃ is an equivalence relation on A. If [x]ᵢ denotes the equivalence class of x for ~ᵢ (i = 1, 2, 3), prove that [x]₃ = [x]₁ ∩ [x]₂.

In parts 4 to 6, an equivalence relation on R × R is given. Prove it is an equivalence relation, describe it geometrically, and give the corresponding partition.

CHAPTER THIRTEEN
COUNTING COSETS

Just as there are great works in art and music, there are also great creations of mathematics.
"Greatness," in mathematics as in art, is hard to define, but the basic ingredients are clear: a great theorem should contribute substantial new information, and it should be unexpected. That is, it should reveal something which common sense would not naturally lead us to expect. The most celebrated theorems of plane geometry, as may be recalled, come as a complete surprise; as the proof unfolds in simple, sure steps and we reach the conclusion (a conclusion we may have been skeptical about, but which is now established beyond a doubt), we feel a certain sense of awe not unlike our reaction to the ironic or tragic twist of a great story.

In this chapter we will consider a result of modern algebra which, by all standards, is a great theorem. It is something we would not likely have foreseen, and which brings new order and simplicity to the relationship between a group and its subgroups.

We begin by adding to our algebraic tool kit a new notion, a conceptual tool of great versatility which will serve us well in all the remaining chapters of this book. It is the concept of a coset.

Let G be a group, and H a subgroup of G. For any element a in G, the symbol aH denotes the set of all products ah, as a remains fixed and h ranges over H. aH is called a left coset of H in G. In similar fashion, Ha denotes the set of all products ha, as a remains fixed and h ranges over H. Ha is called a right coset of H in G.

In practice, it will make no difference whether we use left cosets or right cosets, just as long as we remain consistent. Thus, from here on, whenever we use cosets we will use right cosets. To simplify our sentences, we will say coset when we mean "right coset."

When we deal with cosets in a group G, we must keep in mind that every coset in G is a subset of G. Thus, when we need to prove that two cosets Ha and Hb are equal, we must show that they are equal sets.
What this means, of course, is that every element x ∈ Ha is in Hb, and conversely, every element y ∈ Hb is in Ha. For example, let us prove the following elementary fact:

If a ∈ Hb, then Ha = Hb    (1)

We are given that a ∈ Hb, which means that a = h₁b for some h₁ ∈ H. We need to prove that Ha = Hb. Let x ∈ Ha; this means that x = h₂a for some h₂ ∈ H. But a = h₁b, so x = h₂a = (h₂h₁)b, and the latter is clearly in Hb. This proves that every x ∈ Ha is in Hb; analogously, we may show that every y ∈ Hb is in Ha, and therefore Ha = Hb.

The first major fact about cosets now follows. Let G be a group and let H be a fixed subgroup of G:

Theorem 1 The family of all the cosets Ha, as a ranges over G, is a partition of G.

Proof: First, we must show that any two cosets, say Ha and Hb, are either disjoint or equal. If they are disjoint, we are done. If not, let x ∈ Ha ∩ Hb. Because x ∈ Ha, x = h₁a for some h₁ ∈ H. Because x ∈ Hb, x = h₂b for some h₂ ∈ H. Thus, h₁a = h₂b, and solving for a, we have

a = (h₁⁻¹h₂)b

Thus, a ∈ Hb. It follows from Property (1) above that Ha = Hb.

Next, we must show that every element c ∈ G is in one of the cosets of H. But this is obvious, because c = ec and e ∈ H; therefore,

c = ec ∈ Hc

Thus, the family of all the cosets of H is a partition of G. ■

Before going on, it is worth making a small comment: A given coset, say Hb, may be written in more than one way. By Property (1), if a is any element in Hb, then Hb is the same as Ha. Thus, for example, if a coset of H contains n different elements a₁, a₂, ..., aₙ, it may be written in n different ways, namely, Ha₁, Ha₂, ..., Haₙ.

The next important fact about cosets concerns finite groups. Let G be a finite group, and H a subgroup of G. We will show that all the cosets of H have the same number of elements! This fact is a consequence of the next theorem.

Theorem 2 If Ha is any coset of H, there is a one-to-one correspondence from H to Ha.
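Before the proof, it may help to see Theorem 1 and the claim of Theorem 2 at work in one small case. The following is only a minimal sketch in Python: the group Z₁₂ (integers modulo 12 under addition), the subgroup H = {0, 4, 8}, and the helper name `cosets` are illustrative assumptions, not taken from the text. Because the operation is addition, each coset is written H + a rather than Ha.

```python
# Cosets H + a of a subgroup H in the additive group Z_n.
# Z_12 and H = {0, 4, 8} are illustrative choices, not taken from the text.

def cosets(n, H):
    """Return the distinct cosets H + a (a in Z_n) as a set of frozensets."""
    return {frozenset((h + a) % n for h in H) for a in range(n)}

n = 12
H = {0, 4, 8}
family = cosets(n, H)

# Theorem 1: the cosets cover Z_n, and since their sizes add up to n
# they cannot overlap, so they form a partition.
union = set().union(*family)
is_partition = union == set(range(n)) and sum(len(c) for c in family) == n

# Theorem 2 (finite case): every coset has as many elements as H.
same_size = all(len(c) == len(H) for c in family)

for c in sorted(sorted(c) for c in family):
    print(c)
print(is_partition, same_size)
```

Running the sketch lists the four cosets {0, 4, 8}, {1, 5, 9}, {2, 6, 10}, {3, 7, 11} and reports that both checks hold. A brute-force check of this kind only confirms a single example; the general statements still require the proofs given in the text.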
Proof: The most obvious function from H to Ha is the one which, for each h ∈ H, matches h with ha. Thus, let f : H → Ha be defined by

f(h) = ha

Remember that a remains fixed whereas h varies, and check that f is injective and surjective.

f is injective: Indeed, if f(h₁) = f(h₂), then h₁a = h₂a, and therefore h₁ = h₂.

f is surjective, because every element of Ha is of the form ha for some h ∈ H, and ha = f(h).

Thus, f is a one-to-one correspondence from H to Ha, as claimed. ■

By Theorem 2, any coset Ha has the same number of elements as H, and therefore all the cosets have the same number of elements!

Let us take a careful look at what we have proved in Theorems 1 and 2. Let G be a finite group and H any subgroup of G. G has been partitioned into cosets of H, and all the cosets of H have the same number of elements (which is the same as the number of elements in H). Thus, the number of elements in G is equal to the number of elements in H, multiplied by the number of distinct cosets of H. This statement is known as Lagrange's theorem. (Remember that the number of elements in a group is called the group's order.)

Theorem 3: Lagrange's theorem Let G be a finite group, and H any subgroup of G. The order of G is a multiple of the order of H.

In other words, the order of any subgroup of a group G is a divisor of the order of G.

For example, if G has 15 elements, its proper subgroups other than the trivial subgroup {e} may have either 3 or 5 elements. If G has 7 elements, it has no proper subgroups other than {e}, for 7 has no factors other than 1 and 7. This last example may be generalized:

Let G be a group with a prime number p of elements. If a ∈ G where a ≠ e, then the order of a is some integer m ≠ 1. But then the cyclic group ⟨a⟩ has m elements. By Lagrange's theorem, m must be a factor of p. But p is a prime number, and therefore m = p. It follows that ⟨a⟩ has p elements, and is therefore all of G! Conclusion:

Theorem 4 If G is a group with a prime number p of elements, then G is a cyclic group.
Furthermore, any element a ≠ e in G is a generator of G.

Theorem 4, which is merely a consequence of Lagrange's theorem, is quite remarkable in itself. What it says is that there is (up to isomorphism) only one group of any given prime order p. For example, the only group (up to isomorphism) of order 7 is Z₇, the only group of order 11 is Z₁₁, and so on! So we now have complete information about all the groups whose order is a prime number.

By the way, if a is any element of a group G, the order of a is the same as the order of the cyclic subgroup ⟨a⟩, and by Lagrange's theorem this number is a divisor of the order of G. Thus,

Theorem 5 The order of any element of a finite group divides the order of the group.

Finally, if G is a group and H is a subgroup of G, the index of H in G is the number of cosets of H in G. We denote it by (G:H). Since the number of elements in G is equal to the number of elements in H, multiplied by the number of cosets of H in G,

(G:H) = (order of G) / (order of H)

EXERCISES

A. Examples of Cosets in Finite Groups
In each of the following, H is a subgroup of G. In parts 1-5 list the cosets of H. For each coset, list the elements of the coset.
Example G = Z₄, H = {0, 2}. (Remark: If the operation of G is denoted by +, it is customary to write H + x for a coset, rather than Hx.) The cosets of H in this example are
H = H + 0 = H + 2 = {0, 2} and H + 1 = H + 3 = {1, 3}
1 G = S₃, H = {ε, β, δ}.
2 G = S₃, H = {ε, α}.
3 G = Z₁₅, H = ⟨5⟩.
4 G = D₄, H = {R₀, R₄}. (For D₄, see page 73.)
5 G = S₄, H = A₄. (For A₄, see page 86.)
6 Indicate the order and index of each of the subgroups in parts 1 to 5.

B. Examples of Cosets in Infinite Groups
Describe the cosets of the subgroups described in parts 1-5:
# 1 The subgroup ⟨3⟩ of Z.
2 The subgroup Z of R.
3 The subgroup H = {2ⁿ : n ∈ Z} of R*.
4 The subgroup ⟨1/2⟩ of R*; the subgroup ⟨1/2⟩ of R.
5 The subgroup H = {(x, y) : x = y} of R × R.
6 For any positive integer m, what is the index of ⟨m⟩ in Z?
7 Find a subgroup of R* whose index is equal to 2.

C. Elementary Consequences of Lagrange's Theorem
Let G be a finite group. Prove the following:
1 If G has order n, then xⁿ = e for every x in G.
2 Let G have order pq, where p and q are primes. Either G is cyclic, or every element x ≠ e in G has order p or q.
3 Let G have order 4. Either G is cyclic, or every element of G is its own inverse. Conclude that every group of order 4 is abelian.
4 If G has an element of order p and an element of order q, where p and q are distinct primes, then the order of G is a multiple of pq.
5 If G has an element of order k and an element of order m, then |G| is a multiple of lcm(k, m), where lcm(k, m) is the least common multiple of k and m.
# 6 Let p be a prime number. In any finite group, the number of elements of order p is a multiple of p − 1.

D. Further Elementary Consequences of Lagrange's Theorem
Let G be a finite group, and let H and K be subgroups of G. Prove the following:
1 Suppose H ⊆ K (therefore H is a subgroup of K). Then (G:H) = (G:K)(K:H).
2 The order of H ∩ K is a common divisor of the order of H and the order of K.
3 Let H have order m and K have order n, where m and n are relatively prime. Then H ∩ K = {e}.
4 Suppose H and K are not equal, and both have order the same prime number p. Then H ∩ K = {e}.
5 Suppose H has index p and K has index q, where p and q are distinct primes. Then the index of H ∩ K is a multiple of pq.
# 6 If G is an abelian group of order n, and m is an integer such that m and n are relatively prime, then the function f(x) = xᵐ is an automorphism of G.

E. Elementary Properties of Cosets
Let G be a group, and H a subgroup of G. Let a and b denote elements of G. Prove the following:
1 Ha = Hb iff ab⁻¹ ∈ H.
2 Ha = H iff a ∈ H.
3 If aH = Ha and bH = Hb, then (ab)H = H(ab).
# 4 If aH = Ha, then a⁻¹H = Ha⁻¹.
5 If (ab)H = (ac)H, then bH = cH.
6 The number of right cosets of H is equal to the number of left cosets of H.
7 If J is a subgroup of G such that J = H ∩ K, then for any a ∈ G, Ja = Ha ∩ Ka. Conclude that if H and K are of finite index in G, then their intersection H ∩ K is also of finite index in G.

Theorem 5 of this chapter has a useful converse, which is the following:

Cauchy's theorem If G is a finite group, and p is a prime divisor of |G|, then G has an element of order p.

For example, a group of order 30 must have elements of orders 2, 3, and 5. Cauchy's theorem has an elementary proof, which may be found on page 340.

In the next few exercise sets, we will survey all possible groups whose order is ≤ 10. By Theorem 4 of this chapter, if G is a group with a prime number p of elements, then G ≅ Z_p. This takes care of all groups of orders 2, 3, 5, and 7. In Exercise G6 of Chapter 15, it will be shown that if G is a group with p² elements (where p is a prime), then G ≅ Z_{p²} or G ≅ Z_p × Z_p. This will take care of all groups of orders 4 and 9. The remaining cases are examined in the next three exercise sets.

† F. Survey of All Six-Element Groups
Let G be any group of order 6. By Cauchy's theorem, G has an element a of order 2 and an element b of order 3. By Chapter 10, Exercise E3, the elements
e, a, b, b², ab, ab²
are all distinct; and since G has only six elements, these are all the elements in G. Thus, ba is one of the elements e, a, b, b², ab, or ab².
1 Prove that ba cannot be equal to either e, a, b, or b². Thus, ba = ab or ba = ab².
Either of these two equations completely determines the table of G. (See the discussion at the end of Chapter 5.)
2 If ba = ab, prove that G ≅ Z₆.
3 If ba = ab², prove that G ≅ S₃.
It follows that Z₆ and S₃ are (up to isomorphism) the only possible groups of order 6.

† G. Survey of All 10-Element Groups
Let G be any group of order 10.
1 Reason as in Exercise F to show that G = {e, a, b, b², b³, b⁴, ab, ab², ab³, ab⁴}, where a has order 2 and b has order 5.
2 Prove that ba cannot be equal to e, a, b, b², b³, or b⁴.
3 Prove that if ba = ab, then G ≅ Z₁₀.
4 If ba = ab², prove that ba² = a²b⁴, and conclude that b = b⁴. This is impossible because b has order 5; hence ba ≠ ab². (Hint: The equation ba = ab² tells us that we may move a factor a from the right to the left of a factor b, but in so doing, we must square b. To prove an equation such as the preceding one, move all factors a to the left of all factors b.)
5 If ba = ab³, prove that ba² = a²b⁹ = a²b⁴, and conclude that b = b⁴. This is impossible (why?); hence ba ≠ ab³.
6 Prove that if ba = ab⁴, then G ≅ D₅ (where D₅ is the group of symmetries of the pentagon).
Thus, the only possible groups of order 10 (up to isomorphism) are Z₁₀ and D₅.

† H. Survey of All Eight-Element Groups
Let G be any group of order 8. If G has an element of order 8, then G ≅ Z₈. Let us assume now that G has no element of order 8; hence all the elements ≠ e in G have order 2 or 4.
1 If every x ≠ e in G has order 2, let a, b, c be three such elements. Prove that G = {e, a, b, c, ab, bc, ac, abc}. Conclude that G ≅ Z₂ × Z₂ × Z₂.
In the remainder of this exercise set, assume G has an element a of order 4. Let H = ⟨a⟩ = {e, a, a², a³}. If b ∈ G is not in H, then the coset Hb = {b, ab, a²b, a³b}. By Lagrange's theorem, G is the union of He = H and Hb; hence
G = {e, a, a², a³, b, ab, a²b, a³b}
2 Assume there is in Hb an element of order 2. (Let b be this element.) If ba = a²b, prove that b²a = a⁴b², hence a = a⁴, which is impossible. (Why?) Conclude that either ba = ab or ba = a³b.
3 Let b be as in part 2. Prove that if ba = ab, then G ≅ Z₄ × Z₂.
4 Let b be as in part 2. Prove that if ba = a³b, then G ≅ D₄.
5 Now assume the hypothesis in part 2 is false. Then b, ab, a²b, and a³b all have order 4. Prove that b² = a². (Hint: What is the order of b²?
What element in G has the same order?)
6 Prove: If ba = ab, then (a³b)² = e, contrary to the assumption that ord(a³b) = 4. If ba = a²b, then a = b⁴a = e, which is impossible. Thus, ba = a³b.
7 The equations a⁴ = b⁴ = e, a² = b², and ba = a³b completely determine the table of G. Write this table. (G is known as the quaternion group Q.)
Thus, the only groups of order 8 (up to isomorphism) are Z₈, Z₂ × Z₂ × Z₂, Z₄ × Z₂, D₄, and Q.

† I. Conjugate Elements
If a ∈ G, a conjugate of a is any element of the form xax⁻¹, where x ∈ G. (Roughly speaking, a conjugate of a is any product consisting of a sandwiched between any element and its inverse.) Prove each of the following:
1 The relation "a is equal to a conjugate of b" is an equivalence relation in G. (Write a ~ b for "a is equal to a conjugate of b.")
This relation ~ partitions any group G into classes called conjugacy classes. (The conjugacy class of a is [a] = {xax⁻¹ : x ∈ G}.)
For any element a ∈ G, the centralizer of a, denoted by C_a, is the set of all the elements in G which commute with a. That is,
C_a = {x ∈ G : xa = ax} = {x ∈ G : xax⁻¹ = a}
Prove the following:
2 For any a ∈ G, C_a is a subgroup of G.
3 x⁻¹ax = y⁻¹ay iff xy⁻¹ commutes with a iff xy⁻¹ ∈ C_a.
4 x⁻¹ax = y⁻¹ay iff C_a x = C_a y. (Hint: Use Exercise E1.)
5 There is a one-to-one correspondence between the set of all the conjugates of a and the set of all the cosets of C_a. (Hint: Use part 4.)
6 The number of distinct conjugates of a is equal to (G : C_a), the index of C_a in G. Thus, the size of every conjugacy class is a factor of |G|.

† J. Group Acting on a Set
Let A be a set, and let G be any subgroup of S_A. G is a group of permutations of A; we say it is a group acting on the set A. Assume here that G is a finite group. If u ∈ A, the orbit of u (with respect to G) is the set
O(u) = {g(u) : g ∈ G}
1 Define a relation ~ on A by u ~ v iff g(u) = v for some g ∈ G.
Prove that ~ is an equivalence relation on A, and that the orbits are its equivalence classes.
If u ∈ A, the stabilizer of u is the set G_u = {g ∈ G : g(u) = u}, that is, the set of all the permutations in G which leave u fixed.
2 Prove that G_u is a subgroup of G.
# 3 Let α = (1 2)(3 4)(5 6) and β = (2 3) in S₆. Let G be the following subgroup of S₆: G = {ε, α, β, αβ, βα, αβα, βαβ, (αβ)²}. Find O(1), O(2), O(5), G₁, G₂, G₄, G₅.
4 Let f, g ∈ G. Prove that f and g are in the same left coset of G_u iff f(u) = g(u). (Hint: Use Exercise E1 modified for left cosets.)
5 Use part 4 to show that the number of elements in O(u) is equal to the index of G_u in G. [Hint: If f(u) = v, match the coset of f with v.]
6 Conclude from part 5 that the size of every orbit (with respect to G) is a factor of the order of G. In particular, if f ∈ S_A, the length of each cycle of f is a factor of the order of f in S_A.

Recall that Bⁿ is the group of all binary words of length n. A group code C is a subgroup of Bⁿ. To decode a received word x means to find the codeword a closest to x, that is, the codeword a such that the distance d(a, x) is a minimum. But d(a, x) = w(a + x), the weight (number of 1s) of a + x. Thus, to decode a received word x is to find the codeword a such that the weight w(a + x) is a minimum. Now, the coset C + x consists of all the sums c + x as c ranges over all the codewords; so by the previous sentence, if a + x is the word of minimum weight in the coset C + x, then a is the word to which x must be decoded.

Now a = (a + x) + x; so a is found by adding x to the word of minimum weight in the coset C + x. To recapitulate: In order to decode a received word x you examine the coset C + x, find the word e of minimum weight in the coset (it is called the coset leader), and add e to x. Then e + x is the codeword closest to x, and hence the word to which x must be decoded.
1 Let C₁ be the code described in Exercise G of Chapter 3.
(a) List the elements in each of the cosets of C₁.
(b) Find a coset leader in each coset. (There may be more than one word of minimum weight in a coset; choose one of them as coset leader.)
(c) Use the procedure described above to decode the following words x: 11100, 01101, 11011, 00011.
2 Let C₃ be the Hamming code described in Exercise H2 of Chapter 5. List the elements in each of the cosets of C₃ and find a leader in each coset. Then use coset decoding to decode the following words x: 1100001, 0111011, 1001011.
3 Let C be a code and let H be the parity-check matrix of C. Prove that x and y are in the same coset of C if and only if Hx = Hy. (Hint: Use Exercise H8, Chapter 5.)
If x is any word, Hx is called the syndrome of x. It follows from part 3 that all the words in the same coset have the same syndrome, and words in different cosets have different syndromes. The syndrome of a word x is denoted by syn(x).
4 Let a code C have q cosets, and let the coset leaders be e₁, e₂, ..., e_q. Explain why the following is true: To decode a received word x, compare syn(x) with syn(e₁), ..., syn(e_q) and find the coset leader eᵢ such that syn(x) = syn(eᵢ). Then x is to be decoded to x + eᵢ.
5 Find the syndromes of the coset leaders in part 2. Then use the method of part 4 to decode the words x = 1100001 and x = 1001011.

K. Coding Theory: Coset Decoding
In order to undertake this exercise set, the reader should be familiar with the introductory paragraphs (preceding the exercises) of Exercises F and G of Chapter 3 and Exercise H of Chapter 5.

CHAPTER FOURTEEN
HOMOMORPHISMS

We have seen that if two groups are isomorphic, this means there is a one-to-one correspondence between them which transforms one of the groups into the other. Now if G and H are any groups, it may happen that there is a function which transforms G into H, although this function is not a one-to-one correspondence. For example, Z₆ is transformed into Z₃ by

f = ( 0 1 2 3 4 5 )
    ( 0 1 2 0 1 2 )

as we may verify by comparing their tables:

+ | 0 1 2 3 4 5
--+------------
0 | 0 1 2 3 4 5
1 | 1 2 3 4 5 0
2 | 2 3 4 5 0 1
3 | 3 4 5 0 1 2
4 | 4 5 0 1 2 3
5 | 5 0 1 2 3 4

Replace x by f(x):

+ | 0 1 2 0 1 2
--+------------
0 | 0 1 2 0 1 2
1 | 1 2 0 1 2 0
2 | 2 0 1 2 0 1
0 | 0 1 2 0 1 2
1 | 1 2 0 1 2 0
2 | 2 0 1 2 0 1

Eliminate duplicate information (for example, 2 + 2 = 1 appears four separate times in the table above):

+ | 0 1 2
--+------
0 | 0 1 2
1 | 1 2 0
2 | 2 0 1

If G and H are any groups, and there is a function f which transforms G into H, we say that H is a homomorphic image of G. The function f is called a homomorphism from G to H. This notion of homomorphism is one of the skeleton keys of algebra, and this chapter is devoted to explaining it and defining it precisely.

First, let us examine carefully what we mean by saying that "f transforms G into H." To begin with, f must be a function from G onto H; but that is not all, because f must also transform the table of G into the table of H. To accomplish this, f must have the following property: for any two elements a and b in G,

if f(a) = a′ and f(b) = b′, then f(ab) = a′b′    (1)

Graphically, if f carries a to a′ and b to b′, then f carries ab to a′b′.

Condition (1) may be written more succinctly as follows:

f(ab) = f(a)f(b)    (2)

Thus,

Definition If G and H are groups, a homomorphism from G to H is a function f : G → H such that for any two elements a and b in G,

f(ab) = f(a)f(b)

If there exists a homomorphism from G onto H, we say that H is a homomorphic image of G.

Groups have a very important and privileged relationship with their homomorphic images, as the next few examples will show.

Let P denote the group consisting of two elements, e and o, with the table

+ | e o
--+----
e | e o
o | o e

We call this group the parity group of even and odd numbers. We should think of e as "even" and o as "odd," and the table as describing the rule for adding even and odd numbers. For example, even + odd = odd, odd + odd = even, and so on.
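As an aside, the defining condition f(ab) = f(a)f(b) can be checked by brute force for the map from Z₆ to Z₃ given at the start of this chapter. The sketch below is a minimal illustration in Python, written additively as f(a + b) = f(a) + f(b); the helper names `is_homomorphism` and `is_onto` and the second map `g` are hypothetical and introduced only for contrast.

```python
# Brute-force check of f(a + b) = f(a) + f(b) for maps Z_m -> Z_n.
# The map f is the one from the text; g and the helper names are
# illustrative assumptions.

f = {0: 0, 1: 1, 2: 2, 3: 0, 4: 1, 5: 2}   # the map Z_6 -> Z_3 from the text
g = {0: 0, 1: 1, 2: 2, 3: 2, 4: 1, 5: 0}   # onto Z_3, but not a homomorphism

def is_homomorphism(h, m, n):
    """True if h : Z_m -> Z_n satisfies h(a + b) = h(a) + h(b) for all a, b."""
    return all(h[(a + b) % m] == (h[a] + h[b]) % n
               for a in range(m) for b in range(m))

def is_onto(h, n):
    """True if every element of Z_n is a value of h."""
    return set(h.values()) == set(range(n))

print(is_homomorphism(f, 6, 3), is_onto(f, 3))
print(is_homomorphism(g, 6, 3), is_onto(g, 3))
```

The 36 pairs (a, b) examined for f correspond to the 36 entries of the table of Z₆. The map g fails because g(1 + 2) = g(3) = 2, whereas g(1) + g(2) = 0 in Z₃, which shows that being onto is not enough: the function must also carry the table of one group to the table of the other.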
The function f : Z → P which carries every even integer to e and every odd integer to o is clearly a homomorphism from Z to P. This is easy to check because there are only four different cases: for arbitrary integers r and s, r and s are either both even, both odd, or mixed. For example, if r and s are both odd, their sum is even, so f(r) = o, f(s) = o, and f(r + s) = e. Since e = o + o,

f(r + s) = f(r) + f(s)

This equation holds analogously in the remaining three cases; hence f is a homomorphism. (Note that the symbol + is used on both sides of the above equation because the operation, in Z as well as in P, is denoted by +.) It follows that P is a homomorphic image of Z!

Now, what do P and Z have in common? P is a much smaller group than Z, therefore it is not surprising that very few properties of the integers are to be found in P. Nevertheless, one aspect of the structure of Z is retained absolutely intact in P, namely, the structure of the odd and even numbers. (The fact of being odd or even is called the parity of integers.) In other words, as we pass from Z to P we deliberately lose every aspect of the integers except their parity; their parity alone (with its arithmetic) is retained, and faithfully preserved.

Another example will make this point clearer. Remember that D₄ is the group of the symmetries of the square. Now, every symmetry of the square either interchanges the two diagonals here labeled 1 and 2, or leaves them as they were.

[Figure: a square with its two diagonals, labeled 1 and 2]

In other words, every symmetry of the square brings about one of the permutations

( 1 2 )      ( 1 2 )
( 2 1 )  or  ( 1 2 )

of the diagonals. For each Rᵢ ∈ D₄, let f(Rᵢ) be the permutation of the diagonals produced by Rᵢ. Then f is clearly a homomorphism from D₄ onto S₂. Indeed, it is clear on geometrical grounds that when we perform the motion Rᵢ followed by the motion Rⱼ on the square, we are, at the same time, carrying out the motions f(Rᵢ) followed by f(Rⱼ) on the diagonals.
Thus,

f(Rⱼ ∘ Rᵢ) = f(Rⱼ) ∘ f(Rᵢ)

It follows that S₂ is a homomorphic image of D₄. Now S₂ is a smaller group than D₄, and therefore very few of the features of D₄ are to be found in S₂. Nevertheless, one aspect of the structure of D₄ is retained absolutely intact in S₂, namely, the diagonal motions. Thus, as we pass from D₄ to S₂, we deliberately lose every aspect of plane motions except the motions of the diagonals; these alone are retained and faithfully preserved.

A final example may be of some help; it relates to the group Bⁿ described in Chapter 3, Exercise E. Here, briefly, is the context in which this group arises: The most basic way of transmitting information is to code it into strings of 0s and 1s, such as 0010111, 1010011, etc. Such strings are called binary words, and the number of 0s and 1s in any binary word is called its length. The symbol Bⁿ designates the group consisting of all binary words of length n, with an operation of addition described in Chapter 3, Exercise E.

Consider the function f : B⁷ → B⁵ which consists of dropping the last two digits of every seven-digit word. This kind of function arises in many practical situations: for example, it frequently happens that the first five digits of a word carry the message while the last two digits are an error check. Thus, f separates the message from the error check.

It is easy to verify that f is a homomorphism; hence B⁵ is a homomorphic image of B⁷. As we pass from B⁷ to B⁵, the message component of words in B⁷ is exactly preserved while the error check is deliberately lost.

These examples illustrate the basic idea inherent in the concept of a homomorphic image. The cases which arise in practice are not always so clear-cut as these, but the underlying idea is still the same: In a homomorphic image of G, some aspect of G is isolated and faithfully preserved while all else is deliberately lost.

The next theorem presents two elementary properties of homomorphisms.
Theorem 1 Let G and H be groups, and f : G → H a homomorphism. Then
(i) f(e) = e, and
(ii) f(a⁻¹) = [f(a)]⁻¹ for every element a ∈ G.

In the equation f(e) = e, the letter e on the left refers to the neutral element in G, whereas the letter e on the right refers to the neutral element in H.

To prove (i), we note that in any group,

if yy = y then y = e

(Use the cancellation law on the equation yy = ye.) Now, f(e)f(e) = f(ee) = f(e); hence f(e) = e.

To prove (ii), note that f(a)f(a⁻¹) = f(aa⁻¹) = f(e). But f(e) = e, so f(a)f(a⁻¹) = e. It follows by Theorem 2 of Chapter 4 that f(a⁻¹) is the inverse of f(a), that is, f(a⁻¹) = [f(a)]⁻¹. ■

Before going on with our study of homomorphisms, we must be introduced to an important new concept. If a is an element of a group G, a conjugate of a is any element of the form xax⁻¹, where x ∈ G. For example, the conjugates of α in S₃ are

β ∘ α ∘ β⁻¹ = γ
γ ∘ α ∘ γ⁻¹ = κ
δ ∘ α ∘ δ⁻¹ = κ
κ ∘ α ∘ κ⁻¹ = γ

as well as α itself, which may be written in two ways, as ε ∘ α ∘ ε⁻¹ or as α ∘ α ∘ α⁻¹. If H is any subset of a group G, we say that H is closed with respect to conjugates if every conjugate of every element of H is in H. Finally,

Definition Let H be a subgroup of a group G. H is called a normal subgroup of G if it is closed with respect to conjugates, that is, if

for any a ∈ H and x ∈ G, xax⁻¹ ∈ H

(Note that according to this definition, a normal subgroup of G is any nonempty subset of G which is closed with respect to products, with respect to inverses, and with respect to conjugates.)

We now return to our discussion of homomorphisms.

Definition Let f : G → H be a homomorphism. The kernel of f is the set K of all the elements of G which are carried by f onto the neutral element of H. That is,

K = {x ∈ G : f(x) = e}

Theorem 2 Let f : G → H be a homomorphism.
(i) The kernel of f is a normal subgroup of G, and
(ii) The range of f is a subgroup of H.

Proof: Let K denote the kernel of f.
If a, b ∈ K, this means that f(a) = e and f(b) = e. Thus, f(ab) = f(a)f(b) = ee = e; hence ab ∈ K.

If a ∈ K, then f(a) = e. Thus, f(a⁻¹) = [f(a)]⁻¹ = e⁻¹ = e, so a⁻¹ ∈ K.

Finally, if a ∈ K and x ∈ G, then f(xax⁻¹) = f(x)f(a)f(x⁻¹) = f(x)f(a)[f(x)]⁻¹ = e, which shows that xax⁻¹ ∈ K. Thus, K is a normal subgroup of G.

Now we must prove part (ii). If f(a) and f(b) are in the range of f, then their product f(a)f(b) = f(ab) is also in the range of f.

If f(a) is in the range of f, its inverse is [f(a)]⁻¹ = f(a⁻¹), which is also in the range of f. Thus, the range of f is a subgroup of H. ■

If f is a homomorphism, we represent the kernel of f and the range of f with the symbols

ker(f) and ran(f)

EXERCISES

A. Examples of Homomorphisms of Finite Groups
1 Consider the function f : Z₈ → Z₄ given by

f = ( 0 1 2 3 4 5 6 7 )
    ( 0 1 2 3 0 1 2 3 )

Verify that f is a homomorphism, find its kernel K, and list the cosets of K. [Remark: To verify that f is a homomorphism, you must show that f(a + b) = f(a) + f(b) for all choices of a and b in Z₈; there are 64 choices. This may be accomplished by checking that f transforms the table of Z₈ to the table of Z₄, as on page 136.]
2 Consider the function f : S₃ → Z₂ given by

[table of values of f missing]

Verify that f is a homomorphism, find its kernel K, and list the cosets of K.
3 Find a homomorphism f : Z₁₅ → Z₅, and indicate its kernel. (Do not actually verify that f is a homomorphism.)
4 Imagine a square as a piece of paper lying on a table. The side facing you is side A. The side hidden from view is side B. Every motion of the square either interchanges the two sides (that is, side B becomes visible and side A hidden) or leaves the sides as they were. In other words, every motion Rᵢ of the square brings about one of the permutations

( A B )      ( A B )
( A B )  or  ( B A )

of the sides; call it g(Rᵢ). Verify that g : D₄ → S₂ is a homomorphism, and give its kernel.
5 Every motion of the regular hexagon brings about a permutation of its diagonals, labeled 1, 2, and 3.
For each Rᵢ ∈ D₆, let f(Rᵢ) be the permutation of the diagonals produced by Rᵢ. Argue informally (appealing to geometric intuition) to explain why f : D₆ → S₃ is a homomorphism. Then complete the following:

[Table of the elements of D₆ and their images under f, to be completed]

(That is, find the value of f on all 12 elements of D₆.)
† 6 Let B ⊆ A. Let h : P_A → P_B be defined by h(C) = C ∩ B. For A = {1, 2, 3} and B = {1, 2}, complete the following:

h = ( ∅   {1}   {2}   {3}   {1,2}   {1,3}   {2,3}   A )
    ( _    _     _     _      _       _       _     _ )

For any A and B ⊆ A, show that h is a homomorphism.

B. Examples of Homomorphisms of Infinite Groups
Prove that each of the following is a homomorphism, and describe its kernel.
1 The function φ : ℱ(R) → R given by [...]
2 The function φ : 𝒟(R) → ℱ(R) given by φ(f) = f′. 𝒟(R) is the group of differentiable functions from R to R; f′ is the derivative of f.
3 The function f : R × R → R given by f(x, y) = x + y.
4 The function f : R* → R^pos defined by f(x) = |x|.
5 The function f : C* → R^pos defined by f(a + bi) = √(a² + b²).
6 Let G be the multiplicative group of all 2 × 2 matrices

( a b )
( c d )

satisfying ad − bc ≠ 0. Let f : G → R* be given by f(A) = determinant of A = ad − bc.

C. Elementary Properties of Homomorphisms
Let G, H, and K be groups. Prove the following:
1 If f : G → H and g : H → K are homomorphisms, then their composite g ∘ f : G → K is a homomorphism.
2 If f : G → H is a homomorphism with kernel K, then f is injective iff K = {e}.
3 If f : G → H is a homomorphism and K is any subgroup of G, then f(K) = {f(x) : x ∈ K} is a subgroup of H.
4 If f : G → H is a homomorphism and J is any subgroup of H, then

f⁻¹(J) = {x ∈ G : f(x) ∈ J}

is a subgroup of G. Furthermore, ker f ⊆ f⁻¹(J).
5 If f : G → H is a homomorphism with kernel K, and J is a subgroup of G, let f_J designate the restriction of f to J. (In other words, f_J is the same function as f, except that its domain is restricted to J.) Then ker f_J = J ∩ K.
6 For any group G, the function f : G → G defined by f(x) = e is a homomorphism.
7 For any group G, {e} and G are homomorphic images of G.
8 The function $f : G \to G$ defined by $f(x) = x^2$ is a homomorphism iff $G$ is abelian.
9 The functions $f_1(x, y) = x$ and $f_2(x, y) = y$, from $G \times H$ to $G$ and $H$, respectively, are homomorphisms.

D. Basic Properties of Normal Subgroups

In the following, let $G$ denote an arbitrary group.
1 Find all the normal subgroups (a) of $S_3$ and (b) of $D_4$.
Prove the following:
2 Every subgroup of an abelian group is normal.
3 The center of any group $G$ is a normal subgroup of $G$.
4 Let $H$ be a subgroup of $G$. $H$ is normal iff it has the following property: For all $a$ and $b$ in $G$, $ab \in H$ iff $ba \in H$.
5 Let $H$ be a subgroup of $G$. $H$ is normal iff $aH = Ha$ for every $a \in G$.
6 Any intersection of normal subgroups of $G$ is a normal subgroup of $G$.

E. Further Properties of Normal Subgroups

Let $G$ denote a group, and $H$ a subgroup of $G$. Prove the following:
# 1 If $H$ has index 2 in $G$, then $H$ is normal. (Hint: Use Exercise D5.)
2 Suppose an element $a \in G$ has order 2. Then $\langle a \rangle$ is a normal subgroup of $G$ iff $a$ is in the center of $G$.
3 If $a$ is any element of $G$, $\langle a \rangle$ is a normal subgroup of $G$ iff $a$ has the following property: For any $x \in G$, there is a positive integer $k$ such that $xa = a^k x$.
4 In a group $G$, a commutator is any product of the form $aba^{-1}b^{-1}$, where $a$ and $b$ are any elements of $G$. If a subgroup $H$ of $G$ contains all the commutators of $G$, then $H$ is normal.
5 If $H$ and $K$ are subgroups of $G$, and $K$ is normal, then $HK$ is a subgroup of $G$. ($HK$ denotes the set of all products $hk$ as $h$ ranges over $H$ and $k$ ranges over $K$.)
# 6 Let $S$ be the union of all the cosets $Ha$ such that $Ha = aH$. Then $S$ is a subgroup of $G$, and $H$ is a normal subgroup of $S$.

F. Homomorphism and the Order of Elements

If $f : G \to H$ is a homomorphism, prove each of the following:
1 For each element $a \in G$, the order of $f(a)$ is a divisor of the order of $a$.
2 The order of any element $b \neq e$ in the range of $f$ is a common divisor of $|G|$ and $|H|$. (Use part 1.)
3 If the range of $f$ has $n$ elements, then $x^n \in \ker f$ for every $x \in G$.
4 Let $m$ be an integer such that $m$ and $|H|$ are relatively prime. For any $x \in G$, if $x^m \in \ker f$, then $x \in \ker f$.
5 Let the range of $f$ have $m$ elements. If $a \in G$ has order $n$, where $m$ and $n$ are relatively prime, then $a$ is in the kernel of $f$. (Use part 1.)
6 Let $p$ be a prime. If $\operatorname{ran} f$ has an element of order $p$, then $G$ has an element of order $p$.

G. Properties Preserved under Homomorphism

A property of groups is said to be "preserved under homomorphism" if, whenever a group $G$ has that property, every homomorphic image of $G$ does also. In this exercise set, we will survey a few typical properties preserved under homomorphism. If $f : G \to H$ is a homomorphism of $G$ onto $H$, prove each of the following:
1 If $G$ is abelian, then $H$ is abelian.
2 If $G$ is cyclic, then $H$ is cyclic.
3 If every element of $G$ has finite order, then every element of $H$ has finite order.
4 If every element of $G$ is its own inverse, every element of $H$ is its own inverse.
5 If every element of $G$ has a square root, then every element of $H$ has a square root.
6 If $G$ is finitely generated, then $H$ is finitely generated. (A group is said to be "finitely generated" if it is generated by finitely many of its elements.)

† H. Inner Direct Products

If $G$ is any group, let $H$ and $K$ be normal subgroups of $G$ such that $H \cap K = \{e\}$. Prove the following:
1 Let $h_1$ and $h_2$ be any two elements of $H$, and $k_1$ and $k_2$ any two elements of $K$. Then $h_1k_1 = h_2k_2$ implies $h_1 = h_2$ and $k_1 = k_2$. (Hint: If $h_1k_1 = h_2k_2$, then $h_2^{-1}h_1 \in H \cap K$ and $k_2k_1^{-1} \in H \cap K$. Explain why.)
2 For any $h \in H$ and $k \in K$, $hk = kh$. (Hint: $hk = kh$ iff $hkh^{-1}k^{-1} = e$. Use the fact that $H$ and $K$ are normal.)
3 Now, make the additional assumption that $G = HK$; that is, every $x$ in $G$ can be written as $x = hk$ for some $h \in H$ and $k \in K$. Then the function $\phi(h, k) = hk$ is an isomorphism from $H \times K$ onto $G$.
We have thus proved the following: If $H$ and $K$ are normal subgroups of $G$, such that $H \cap K = \{e\}$ and $G = HK$, then $G \cong H \times K$. $G$ is sometimes called the inner direct product of $H$ and $K$.

† I. Conjugate Subgroups

Let $H$ be a subgroup of $G$. For any $a \in G$, let $aHa^{-1} = \{axa^{-1} : x \in H\}$; $aHa^{-1}$ is called a conjugate of $H$. Prove the following:
1 For each $a \in G$, $aHa^{-1}$ is a subgroup of $G$.
2 For each $a \in G$, $H \cong aHa^{-1}$.
3 $H$ is a normal subgroup of $G$ iff $H = aHa^{-1}$ for every $a \in G$.
In the remaining exercises of this set, let $G$ be a finite group. By the normalizer of $H$ we mean the set $N(H) = \{a \in G : axa^{-1} \in H \text{ for every } x \in H\}$.
# 4 If $a \in N(H)$, then $aHa^{-1} = H$. (Remember that $G$ is now a finite group.)
5 $N(H)$ is a subgroup of $G$.
6 $H \subseteq N(H)$. Furthermore, $H$ is a normal subgroup of $N(H)$.
In parts 7-10, let $N = N(H)$.
7 For any $a, b \in G$, $aHa^{-1} = bHb^{-1}$ iff $b^{-1}a \in N$ iff $aN = bN$.
# 8 There is a one-to-one correspondence between the set of conjugates of $H$ and the set of cosets of $N$. (Thus, there are as many conjugates of $H$ as cosets of $N$.)
9 $H$ has exactly $(G : N)$ conjugates. In particular, the number of distinct conjugates of $H$ is a divisor of $|G|$.
10 Let $K$ be any subgroup of $G$, let $K^* = \{Na : a \in K\}$, and let $X_K = \{aHa^{-1} : a \in K\}$. Argue as in part 8 to prove that $X_K$ is in one-to-one correspondence with $K^*$. Conclude that the number of elements in $X_K$ is a divisor of $|K|$.

CHAPTER FIFTEEN
QUOTIENT GROUPS

In Chapter 14 we learned to recognize when a group $H$ is a homomorphic image of a group $G$. Now we will make a great leap forward by learning a method for actually constructing all the homomorphic images of any group. This is a remarkable procedure, of great importance in algebra. In many cases this construction will allow us to deliberately select which properties of a group $G$ we wish to preserve in a homomorphic image, and which other properties we wish to discard.
The most important instrument to be used in this construction is the notion of a normal subgroup. Remember that a normal subgroup of $G$ is any subgroup of $G$ which is closed with respect to conjugates. We begin by giving an elementary property of normal subgroups.

Theorem 1 If $H$ is a normal subgroup of $G$, then $aH = Ha$ for every $a \in G$. (In other words, there is no distinction between left and right cosets for a normal subgroup.)

Proof: Indeed, if $x$ is any element of $aH$, then $x = ah$ for some $h \in H$. But $H$ is closed with respect to conjugates; hence $aha^{-1} \in H$. Thus, $x = ah = (aha^{-1})a$ is an element of $Ha$. This shows that every element of $aH$ is in $Ha$; analogously, every element of $Ha$ is in $aH$. Thus, $aH = Ha$. ■

Let $G$ be a group and let $H$ be a subgroup of $G$. There is a way of combining cosets, called coset multiplication, which works as follows: the coset of $a$, multiplied by the coset of $b$, is defined to be the coset of $ab$. In symbols,
$$Ha \cdot Hb = H(ab)$$
This definition is deceptively simple, for it conceals a fundamental difficulty. Indeed, it is not at all clear that the product of two cosets $Ha$ and $Hb$, multiplied together in this fashion, is uniquely defined. Remember that $Ha$ may be the same coset as $Hc$ (this happens iff $c$ is in $Ha$), and, similarly, $Hb$ may be the same coset as $Hd$. Therefore, the product $Ha \cdot Hb$ is the same as the product $Hc \cdot Hd$. Yet it may easily happen that $H(ab)$ is not the same coset as $H(cd)$. Graphically,
$$\begin{array}{ccccc} Ha & \cdot & Hb & = & H(ab) \\ \| & & \| & & \neq \\ Hc & \cdot & Hd & = & H(cd) \end{array}$$
For example, if $G = S_3$ and $H = \{\varepsilon, \alpha\}$, then
$$H\beta = \{\beta, \gamma\} = H\gamma \qquad H\delta = \{\delta, \kappa\} = H\kappa$$
and yet
$$H(\beta \circ \delta) = H\varepsilon \neq H\beta = H(\gamma \circ \kappa)$$
Thus, coset multiplication does not work as an operation on the cosets of $H = \{\varepsilon, \alpha\}$ in $S_3$. The reason is that, although $H$ is a subgroup of $S_3$, $H$ is not a normal subgroup of $S_3$. If $H$ were a normal subgroup, coset multiplication would work. The next theorem states exactly that!

Theorem 2 Let $H$ be a normal subgroup of $G$. If $Ha = Hc$ and $Hb = Hd$, then $H(ab) = H(cd)$.
Proof: If $Ha = Hc$, then $a \in Hc$; hence $a = h_1c$ for some $h_1 \in H$. If $Hb = Hd$, then $b \in Hd$; hence $b = h_2d$ for some $h_2 \in H$. Thus,
$$ab = h_1ch_2d = h_1(ch_2)d$$
But $ch_2 \in cH = Hc$ (the last equality is true by Theorem 1). Thus, $ch_2 = h_3c$ for some $h_3 \in H$. Returning to $ab$,
$$ab = h_1(ch_2)d = h_1(h_3c)d = (h_1h_3)(cd)$$
and this last element is clearly in $H(cd)$. We have shown that $ab \in H(cd)$. Thus, by Property (1) in Chapter 13, $H(ab) = H(cd)$. ■

We are now ready to proceed with the construction promised at the beginning of the chapter. Let $G$ be a group and let $H$ be a normal subgroup of $G$. Think of the set which consists of all the cosets of $H$. This set is conventionally denoted by the symbol $G/H$. Thus, if $Ha, Hb, Hc, \ldots$ are cosets of $H$, then
$$G/H = \{Ha, Hb, Hc, \ldots\}$$
We have just seen that coset multiplication is a valid operation on this set. In fact,

Theorem 3 $G/H$ with coset multiplication is a group.

Proof: Coset multiplication is associative, because
$$Ha \cdot (Hb \cdot Hc) = Ha \cdot H(bc) = Ha(bc) = H(ab)c = H(ab) \cdot Hc = (Ha \cdot Hb) \cdot Hc$$
The identity element of $G/H$ is $H = He$, for $Ha \cdot He = Ha$ and $He \cdot Ha = Ha$ for every coset $Ha$. Finally, the inverse of any coset $Ha$ is the coset $Ha^{-1}$, because $Ha \cdot Ha^{-1} = Haa^{-1} = He$ and $Ha^{-1} \cdot Ha = Ha^{-1}a = He$. ■

The group $G/H$ is called the factor group, or quotient group, of $G$ by $H$.

And now, the pièce de résistance:

Theorem 4 $G/H$ is a homomorphic image of $G$.

Proof: The most obvious function from $G$ to $G/H$ is the function $f$ which carries every element to its own coset, that is, the function given by
$$f(x) = Hx$$
This function is a homomorphism, because
$$f(xy) = Hxy = Hx \cdot Hy = f(x)f(y)$$
$f$ is called the natural homomorphism from $G$ onto $G/H$. Since there is a homomorphism from $G$ onto $G/H$, $G/H$ is a homomorphic image of $G$. ■

Thus, when we construct quotient groups of $G$, we are, in fact, constructing homomorphic images of $G$. The quotient group construction is useful because it is a way of actually manufacturing homomorphic images of any group $G$.
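The contrast between the subgroup $\{\varepsilon, \alpha\}$ of $S_3$ (where coset multiplication fails) and a normal subgroup (where Theorem 2 guarantees it works) can be checked by brute force. The following is a minimal sketch, not part of the original text: permutations are encoded as tuples on the points 0, 1, 2, and the helper names `compose`, `right_coset`, and `coset_product_is_well_defined` are hypothetical, chosen for illustration only.

```python
from itertools import product

# Elements of S3 as tuples: p[i] is the image of i (points relabeled 0, 1, 2).
EPS, ALPHA, BETA = (0, 1, 2), (0, 2, 1), (2, 0, 1)
GAMMA, DELTA, KAPPA = (1, 0, 2), (1, 2, 0), (2, 1, 0)
S3 = [EPS, ALPHA, BETA, GAMMA, DELTA, KAPPA]

def compose(f, g):
    """Return the composite f o g: apply g first, then f."""
    return tuple(f[g[i]] for i in range(3))

def right_coset(H, a):
    """Return the right coset Ha as a frozenset of permutations."""
    return frozenset(compose(h, a) for h in H)

def coset_product_is_well_defined(H, G):
    """Check that Ha = Hc and Hb = Hd always imply H(ab) = H(cd)."""
    for a, b, c, d in product(G, repeat=4):
        same_inputs = (right_coset(H, a) == right_coset(H, c)
                       and right_coset(H, b) == right_coset(H, d))
        if same_inputs and right_coset(H, compose(a, b)) != right_coset(H, compose(c, d)):
            return False
    return True

print(coset_product_is_well_defined([EPS, ALPHA], S3))        # False
print(coset_product_is_well_defined([EPS, BETA, DELTA], S3))  # True
```

The first subgroup is not normal, and the search finds the failure described earlier ($H\beta = H\gamma$ and $H\delta = H\kappa$, yet the products land in different cosets). The second subgroup has index 2, hence is normal, and no failure exists.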
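As we will soon see, it is in fact a way of manufacturing all the homomorphic images of $G$.

Our first example is intended to clarify the details of quotient group construction. Let $\mathbb{Z}$ be the group of the integers, and let $\langle 6 \rangle$ be the cyclic subgroup of $\mathbb{Z}$ which consists of all the multiples of 6. Since $\mathbb{Z}$ is abelian, and every subgroup of an abelian group is normal, $\langle 6 \rangle$ is a normal subgroup of $\mathbb{Z}$. Therefore, we may form the quotient group $\mathbb{Z}/\langle 6 \rangle$. The elements of this quotient group are all the cosets of the subgroup $\langle 6 \rangle$, namely:
$$\begin{aligned}
\langle 6 \rangle + 0 &= \{\ldots, -18, -12, -6, 0, 6, 12, 18, \ldots\} \\
\langle 6 \rangle + 1 &= \{\ldots, -17, -11, -5, 1, 7, 13, 19, \ldots\} \\
\langle 6 \rangle + 2 &= \{\ldots, -16, -10, -4, 2, 8, 14, 20, \ldots\} \\
\langle 6 \rangle + 3 &= \{\ldots, -15, -9, -3, 3, 9, 15, 21, \ldots\} \\
\langle 6 \rangle + 4 &= \{\ldots, -14, -8, -2, 4, 10, 16, 22, \ldots\} \\
\langle 6 \rangle + 5 &= \{\ldots, -13, -7, -1, 5, 11, 17, 23, \ldots\}
\end{aligned}$$
These are all the different cosets of $\langle 6 \rangle$, for it is easy to see that $\langle 6 \rangle + 6 = \langle 6 \rangle + 0$, $\langle 6 \rangle + 7 = \langle 6 \rangle + 1$, $\langle 6 \rangle + 8 = \langle 6 \rangle + 2$, and so on.

Now, the operation on $\mathbb{Z}$ is denoted by $+$, and therefore we will call the operation on the cosets coset addition rather than coset multiplication. But nothing is changed except the name; for example, the coset $\langle 6 \rangle + 1$ added to the coset $\langle 6 \rangle + 2$ is the coset $\langle 6 \rangle + 3$. The coset $\langle 6 \rangle + 3$ added to the coset $\langle 6 \rangle + 4$ is the coset $\langle 6 \rangle + 7$, which is the same as $\langle 6 \rangle + 1$. To simplify our notation, let us agree to write the cosets in the following shorter form:
$$\bar{0} = \langle 6 \rangle + 0 \quad \bar{1} = \langle 6 \rangle + 1 \quad \bar{2} = \langle 6 \rangle + 2 \quad \bar{3} = \langle 6 \rangle + 3 \quad \bar{4} = \langle 6 \rangle + 4 \quad \bar{5} = \langle 6 \rangle + 5$$
Then $\mathbb{Z}/\langle 6 \rangle$ consists of the six elements $\bar{0}, \bar{1}, \bar{2}, \bar{3}, \bar{4}$, and $\bar{5}$, and its operation is summarized in the following table:
$$\begin{array}{c|cccccc}
+ & \bar{0} & \bar{1} & \bar{2} & \bar{3} & \bar{4} & \bar{5} \\ \hline
\bar{0} & \bar{0} & \bar{1} & \bar{2} & \bar{3} & \bar{4} & \bar{5} \\
\bar{1} & \bar{1} & \bar{2} & \bar{3} & \bar{4} & \bar{5} & \bar{0} \\
\bar{2} & \bar{2} & \bar{3} & \bar{4} & \bar{5} & \bar{0} & \bar{1} \\
\bar{3} & \bar{3} & \bar{4} & \bar{5} & \bar{0} & \bar{1} & \bar{2} \\
\bar{4} & \bar{4} & \bar{5} & \bar{0} & \bar{1} & \bar{2} & \bar{3} \\
\bar{5} & \bar{5} & \bar{0} & \bar{1} & \bar{2} & \bar{3} & \bar{4}
\end{array}$$
The reader will perceive immediately the similarity between this group and $\mathbb{Z}_6$. As a matter of fact, the quotient group construction of $\mathbb{Z}/\langle 6 \rangle$ is considered to be the rigorous way of constructing $\mathbb{Z}_6$. So from now on, we will consider $\mathbb{Z}_6$ to be the same as $\mathbb{Z}/\langle 6 \rangle$; and, in general, we will consider $\mathbb{Z}_n$ to be the same as $\mathbb{Z}/\langle n \rangle$.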
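The coset listing and the addition table for the quotient of $\mathbb{Z}$ by the multiples of 6 can be reproduced mechanically. The sketch below is illustrative only: each coset is represented by its least non-negative member, and the function names `coset_rep`, `coset_add`, and `coset_window` are hypothetical helpers, not notation from the text.

```python
def coset_rep(a, n=6):
    """Least non-negative representative of the coset nZ + a."""
    return a % n

def coset_add(a, b, n=6):
    """Add the cosets nZ + a and nZ + b by adding any representatives."""
    return coset_rep(a + b, n)

def coset_window(r, n=6, lo=-18, hi=23):
    """Members of the coset nZ + r that lie between lo and hi inclusive."""
    return [x for x in range(lo, hi + 1) if x % n == r]

# List a finite window of each of the six cosets.
for r in range(6):
    print(f"6Z + {r}:", coset_window(r))

# Build and print the coset addition table.
table = [[coset_add(a, b) for b in range(6)] for a in range(6)]
for a, row in enumerate(table):
    print(a, "|", *row)
```

Because the subgroup is normal, the result of `coset_add` does not depend on which representatives are used: for instance, `coset_add(9, -8)` and `coset_add(3, 4)` both give 1.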
In particular, we can see that for any $n$, $\mathbb{Z}_n$ is a homomorphic image of $\mathbb{Z}$.

Let us repeat: The motive for the quotient group construction is that it gives us a way of actually producing all the homomorphic images of any group $G$. However, what is even more fascinating about the quotient group construction is that, in practical instances, we can often choose $H$ so as to "factor out" unwanted properties of $G$, and preserve in $G/H$ only "desirable" traits. (By "desirable" we mean desirable within the context of some specific application or use.) Let us look at a few examples.

First, we will need two simple properties of cosets, which are given in the next theorem.

Theorem 5 Let $G$ be a group and $H$ a subgroup of $G$. Then
(i) $Ha = Hb$ iff $ab^{-1} \in H$, and
(ii) $Ha = H$ iff $a \in H$.

Proof: If $Ha = Hb$, then $a \in Hb$, so $a = hb$ for some $h \in H$. Thus, $ab^{-1} = h \in H$. Conversely, if $ab^{-1} \in H$, then $ab^{-1} = h$ for some $h \in H$, and therefore $a = hb \in Hb$. It follows by Property (1) of Chapter 13 that $Ha = Hb$. This proves (i). It follows that $Ha = He$ iff $ae^{-1} = a \in H$, which proves (ii). ■

For our first example, let $G$ be an abelian group and let $H$ consist of all the elements of $G$ which have finite order. It is easy to show that $H$ is a subgroup of $G$. (The details may be supplied by the reader.) Remember that in an abelian group every subgroup is normal; hence $H$ is a normal subgroup of $G$, and therefore we may form the quotient group $G/H$. We will show next that in $G/H$, no element except the neutral element has finite order.

For suppose $G/H$ has an element $Hx$ of finite order. Since the neutral element of $G/H$ is $H$, this means there is an integer $m \neq 0$ such that $(Hx)^m = H$, that is, $Hx^m = H$. Therefore, by Theorem 5(ii), $x^m \in H$, so $x^m$ has finite order, say $t$:
$$(x^m)^t = x^{mt} = e$$
But then $x$ has finite order, so $x \in H$. Thus, by Theorem 5(ii), $Hx = H$. This proves that in $G/H$, the only element $Hx$ of finite order is the neutral element $H$.
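Theorem 5, on which the argument above leans, is easy to spot-check in a small finite group. The following sketch uses an assumed example that does not appear in the text: the group of integers mod 12 under addition with the subgroup {0, 4, 8}. In additive notation, part (i) reads "H + a = H + b iff a - b is in H".

```python
n = 12
H = {0, 4, 8}  # the subgroup generated by 4 in Z12 (an assumed example)

def coset(a):
    """Return the coset H + a in Z12 as a frozenset."""
    return frozenset((h + a) % n for h in H)

# Theorem 5(i), additive form: H + a = H + b  iff  a - b is in H.
part_i = all((coset(a) == coset(b)) == ((a - b) % n in H)
             for a in range(n) for b in range(n))

# Theorem 5(ii): H + a = H  iff  a is in H.
part_ii = all((coset(a) == frozenset(H)) == (a in H) for a in range(n))

print(part_i, part_ii)  # True True
```

A check like this proves nothing in general, of course; it only confirms that the two criteria behave as the theorem says in one concrete case.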
Let us recapitulate: If $H$ is the subgroup of $G$ which consists of all the elements of $G$ which have finite order, then in $G/H$, no element (except the neutral element) has finite order. Thus, in a sense, we have "factored out" all the elements of finite order (they are all in $H$) and produced a quotient group $G/H$ whose elements all have infinite order (except for the neutral element, which necessarily has order 1).

Our next example may bring out this idea even more clearly. Let $G$ be an arbitrary group; by a commutator of $G$ we mean any element of the form $aba^{-1}b^{-1}$ where $a$ and $b$ are in $G$. The reason such a product is called a commutator is that
$$aba^{-1}b^{-1} = e \quad \text{iff} \quad ab = ba$$
In other words, $aba^{-1}b^{-1}$ reduces to the neutral element whenever $a$ and $b$ commute, and only in that case! Thus, in an abelian group all the commutators are equal to $e$. In a group which is not abelian, the number of distinct commutators may be regarded as a measure of the extent to which $G$ departs from being commutative. (The fewer the commutators, the closer the group is to being an abelian group.)

We will see in a moment that if $H$ is a subgroup of $G$ which contains all the commutators of $G$, then $G/H$ is abelian! What this means, in a fairly accurate sense, is that when we factor out the commutators of $G$ we get a quotient group which has no commutators (except, trivially, the neutral element) and which is therefore abelian.

To say that $G/H$ is abelian is to say that for any two elements $Hx$ and $Hy$ in $G/H$, $HxHy = HyHx$, that is, $Hxy = Hyx$. But by Theorem 5(i),
$$Hxy = Hyx \quad \text{iff} \quad xy(yx)^{-1} \in H$$
Now $xy(yx)^{-1}$ is the commutator $xyx^{-1}y^{-1}$; so if all commutators are in $H$, then $G/H$ is abelian.

EXERCISES

A. Examples of Finite Quotient Groups

In each of the following, $G$ is a group and $H$ is a normal subgroup of $G$. List the elements of $G/H$ and then write the table of $G/H$.

Example $G = \mathbb{Z}_6$ and $H = \{0, 3\}$. The elements of $G/H$ are the three cosets $H = H + 0 = \{0, 3\}$, $H + 1 = \{1, 4\}$, and $H + 2 = \{2, 5\}$.
(Note that $H + 3$ is the same as $H + 0$, $H + 4$ is the same as $H + 1$, and $H + 5$ is the same as $H + 2$.) The table of $G/H$ is
$$\begin{array}{c|ccc}
+ & H & H+1 & H+2 \\ \hline
H & H & H+1 & H+2 \\
H+1 & H+1 & H+2 & H \\
H+2 & H+2 & H & H+1
\end{array}$$
1 $G = \mathbb{Z}_{10}$, $H = \{0, 5\}$. (Explain why $G/H \cong \mathbb{Z}_5$.)
2 $G = S_3$, $H = \{\varepsilon, \beta, \delta\}$.
3 $G = D_4$, $H = \{R_0, R_2\}$. (See page 73.)
4 $G = D_4$, $H = \{R_0, R_2, R_4, R_5\}$.
5 $G = \mathbb{Z}_4 \times \mathbb{Z}_2$, $H = \langle (0, 1) \rangle$ = the subgroup of $\mathbb{Z}_4 \times \mathbb{Z}_2$ generated by $(0, 1)$.
6 $G = P_3$, $H = \{\emptyset, \{1\}\}$. ($P_3$ is the group of subsets of $\{1, 2, 3\}$.)

B. Examples of Quotient Groups of $\mathbb{R} \times \mathbb{R}$

In each of the following, $H$ is a subset of $\mathbb{R} \times \mathbb{R}$.
(a) Prove that $H$ is a normal subgroup of $\mathbb{R} \times \mathbb{R}$. (Remember that every subgroup of an abelian group is normal.)
(b) In geometrical terms, describe the elements of the quotient group $G/H$.
(c) In geometrical terms or otherwise, describe the operation of $G/H$.
1 $H = \{(x, 0) : x \in \mathbb{R}\}$
2 $H = \{(x, y) : y = -x\}$
3 $H = \{(x, y) : y = 2x\}$

C. Relating Properties of H to Properties of G/H

In parts 1-5 below, $G$ is a group and $H$ is a normal subgroup of $G$. Prove the following (Theorem 5 will play a crucial role):
1 If $x^2 \in H$ for every $x \in G$, then every element of $G/H$ is its own inverse. Conversely, if every element of $G/H$ is its own inverse, then $x^2 \in H$ for all $x \in G$.
2 Let $m$ be a fixed integer. If $x^m \in H$ for every $x \in G$, then the order of every element in $G/H$ is a divisor of $m$. Conversely, if the order of every element in $G/H$ is a divisor of $m$, then $x^m \in H$ for every $x \in G$.
3 Suppose that for every $x \in G$, there is an integer $n$ such that $x^n \in H$; then every element of $G/H$ has finite order. Conversely, if every element of $G/H$ has finite order, then for every $x \in G$ there is an integer $n$ such that $x^n \in H$.
# 4 Every element of $G/H$ has a square root iff for every $x \in G$, there is some $y \in G$ such that $xy^2 \in H$.
5 $G/H$ is cyclic iff there is an element $a \in G$ with the following property: for every $x \in G$, there is some integer $n$ such that $xa^n \in H$.
6 If $G$ is an abelian group, let $H_p$ be the set of all $x \in G$ whose order is a power of $p$. Prove that $H_p$ is a subgroup of $G$.
Prove that $G/H_p$ has no elements whose order is a nonzero power of $p$.
7 (a) If $G/H$ is abelian, prove that $H$ contains all the commutators of $G$. (b) Let $K$ be a normal subgroup of $G$, and $H$ a normal subgroup of $K$. If $G/H$ is abelian, prove that $G/K$ and $K/H$ are both abelian.

# D. Properties of G Determined by Properties of G/H and H

There are some group properties which, if they are true in $G/H$ and in $H$, must be true in $G$. Here is a sampling. Let $G$ be a group, and $H$ a normal subgroup of $G$. Prove the following:
1 If every element of $G/H$ has finite order, and every element of $H$ has finite order, then every element of $G$ has finite order.
2 If every element of $G/H$ has a square root, and every element of $H$ has a square root, then every element of $G$ has a square root. (Assume $G$ is abelian.)
3 Let $p$ be a prime number. If $G/H$ and $H$ are $p$-groups, then $G$ is a $p$-group. A group $G$ is called a $p$-group if the order of every element $x$ in $G$ is a power of $p$.
4 If $G/H$ and $H$ are finitely generated, then $G$ is finitely generated. (A group is said to be finitely generated if it is generated by a finite subset of its elements.)

E. Order of Elements in Quotient Groups

Let $G$ be a group, and $H$ a normal subgroup of $G$. Prove the following:
1 For each element $a \in G$, the order of the element $Ha$ in $G/H$ is a divisor of the order of $a$ in $G$. (Hint: Use Chapter 14, Exercise F1.)
2 If $(G : H) = m$, the order of every element of $G/H$ is a divisor of $m$.
3 If $(G : H) = p$, where $p$ is a prime, then the order of every element $a \notin H$ in $G$ is a multiple of $p$. (Use part 1.)
4 If $G$ has a normal subgroup of index $p$, where $p$ is a prime, then $G$ has at least one element of order $p$.
5 If $(G : H) = m$, then $a^m \in H$ for every $a \in G$.
# 6 In $\mathbb{Q}/\mathbb{Z}$, every element has finite order.

† F. Quotient of a Group by Its Center

The center of a group $G$ is the normal subgroup $C$ of $G$ consisting of all those elements of $G$ which commute with every element of $G$.
Suppose the quotient group $G/C$ is a cyclic group; say it is generated by the element $Ca$ of $G/C$. Prove parts 1-3:
1 For every $x \in G$, there is some integer $m$ such that $Cx = Ca^m$.
2 For every $x \in G$, there is some integer $m$ such that $x = ca^m$, where $c \in C$.
3 For any two elements $x$ and $y$ in $G$, $xy = yx$. (Hint: Use part 2 to write $x = ca^m$, $y = c'a^n$, and remember that $c, c' \in C$.)
4 Conclude that if $G/C$ is cyclic, then $G$ is abelian.

† G. Using the Class Equation to Determine the Size of the Center

(Prerequisite: Chapter 13, Exercise I.)
Let $G$ be a finite group. Elements $a$ and $b$ in $G$ are called conjugates of one another (in symbols, $a \sim b$) iff $a = xbx^{-1}$ for some $x \in G$ (this is the same as $b = x^{-1}ax$). The relation $\sim$ is an equivalence relation in $G$; the equivalence class of any element $a$ is called its conjugacy class. Hence $G$ is partitioned into conjugacy classes (as shown in the diagram); the size of each conjugacy class divides the order of $G$. (For these facts, see Chapter 13, Exercise I.)

[Diagram: $G$ partitioned into conjugacy classes. Caption: "Each element of the center $C$ is alone in its conjugacy class."]

Let $S_1, S_2, \ldots, S_t$ be the distinct conjugacy classes of $G$, and let $k_1, k_2, \ldots, k_t$ be their sizes. Then $|G| = k_1 + k_2 + \cdots + k_t$. (This is called the class equation of $G$.)
Let $G$ be a group whose order is a power of a prime $p$, say $|G| = p^k$. Let $C$ denote the center of $G$. Prove parts 1-3:
1 The conjugacy class of $a$ contains $a$ (and no other element) iff $a \in C$.
2 Let $c$ be the order of $C$. Then $|G| = c + k_s + k_{s+1} + \cdots + k_t$, where $k_s, \ldots, k_t$ are the sizes of all the distinct conjugacy classes of elements $x \notin C$.
3 For each $i \in \{s, s + 1, \ldots, t\}$, $k_i$ is equal to a power of $p$. (See Chapter 13, Exercise I6.)
4 Solving the equation $|G| = c + k_s + \cdots + k_t$ for $c$, explain why $c$ is a multiple of $p$.
We may conclude from part 4 that $C$ must contain more than just the one element $e$; in fact, $|C|$ is a multiple of $p$.
5 Prove: If $|G| = p^2$, $G$ must be abelian. (Use the preceding Exercise F.)
# 6 Prove: If $|G| = p^2$, then either $G \cong \mathbb{Z}_{p^2}$ or $G \cong \mathbb{Z}_p \times \mathbb{Z}_p$.
† H. Induction on |G|: An Example

Many theorems of mathematics are of the form "$P(n)$ is true for every positive integer $n$." [Here, $P(n)$ is used as a symbol to denote some statement involving $n$.] Such theorems can be proved by induction as follows:
(a) Show that $P(n)$ is true for $n = 1$.
(b) For any fixed positive integer $k$, show that, if $P(n)$ is true for every $n$ less than $k$, then $P(n)$ must also be true for $n = k$.
If we can show (a) and (b), we may safely conclude that $P(n)$ is true for all positive integers $n$.
Some theorems of algebra can be proved by induction on the order $n$ of a group. Here is a classical example: Let $G$ be a finite abelian group. We will show that $G$ must contain at least one element of order $p$, for every prime factor $p$ of $|G|$. If $|G| = 1$, this is true by default, since no prime $p$ can be a factor of 1. Next, let $|G| = k$, and suppose our claim is true for every abelian group whose order is less than $k$. Let $p$ be a prime factor of $k$.
Take any element $a \neq e$ in $G$. If $\operatorname{ord}(a) = p$ or a multiple of $p$, we are done!
1 If $\operatorname{ord}(a) = tp$ (for some positive integer $t$), what element of $G$ has order $p$?
2 Suppose $\operatorname{ord}(a)$ is not equal to a multiple of $p$. Then $G/\langle a \rangle$ is a group having fewer than $k$ elements. (Explain why.) The order of $G/\langle a \rangle$ is a multiple of $p$. (Explain why.)
3 Why must $G/\langle a \rangle$ have an element of order $p$?
4 Conclude that $G$ has an element of order $p$. (Hint: Use Exercise E1.)

CHAPTER SIXTEEN
THE FUNDAMENTAL HOMOMORPHISM THEOREM

Let $G$ be any group. In Chapter 15 we saw that every quotient group of $G$ is a homomorphic image of $G$. Now we will see that, conversely, every homomorphic image of $G$ is a quotient group of $G$. More exactly, every homomorphic image of $G$ is isomorphic to a quotient group of $G$.

It will follow that, for any groups $G$ and $H$, $H$ is a homomorphic image of $G$ iff $H$ is (or is isomorphic to) a quotient group of $G$. Therefore, the notions of homomorphic image and of quotient group are interchangeable.
The thread of our reasoning begins with a simple theorem.

Theorem 1 Let $f : G \to H$ be a homomorphism with kernel $K$. Then
$$f(a) = f(b) \quad \text{iff} \quad Ka = Kb$$
(In other words, any two elements $a$ and $b$ in $G$ have the same image under $f$ iff they are in the same coset of $K$.)

Indeed,
$$\begin{aligned}
f(a) = f(b) \quad &\text{iff} \quad f(a)[f(b)]^{-1} = e \\
&\text{iff} \quad f(ab^{-1}) = e \\
&\text{iff} \quad ab^{-1} \in K \\
&\text{iff} \quad Ka = Kb \quad \text{(by Chapter 15, Theorem 5(i))}
\end{aligned}$$
What does this theorem really tell us? It says that if $f$ is a homomorphism from $G$ to $H$ with kernel $K$, then all the elements in any fixed coset of $K$ have the same image, and, conversely, elements which have the same image are in the same coset of $K$.

[Figure: the cosets $K = Ke$ and $Ka = Kb$ of $K$ in $G$, each carried by $f$ to a single element of $H$.]

It is therefore clear already that there is a one-to-one correspondence matching cosets of $K$ with elements in $H$. It remains only to show that this correspondence is an isomorphism. But first, how exactly does this correspondence match up specific cosets of $K$ with specific elements of $H$? Clearly, for each $x$, the coset $Kx$ is matched with the element $f(x)$. Once this is understood, the next theorem is easy.

Theorem 2 Let $f : G \to H$ be a homomorphism of $G$ onto $H$. If $K$ is the kernel of $f$, then
$$H \cong G/K$$
Proof: To show that $G/K$ is isomorphic to $H$, we must look for an isomorphism from $G/K$ to $H$. We have just seen that there is a function from $G/K$ to $H$ which matches each coset $Kx$ with the element $f(x)$; call this function $\phi$. Thus,
$$\phi(Kx) = f(x)$$
This definition does not make it obvious that $\phi(Kx)$ is uniquely defined, since the same coset may be written both as $Ka$ and as $Kb$. But if $Ka = Kb$, then $f(a) = f(b)$ by Theorem 1, so $\phi(Ka)$ is the same as $\phi(Kb)$.
Next, $\phi$ is injective: if $\phi(Ka) = \phi(Kb)$, then $f(a) = f(b)$; so by Theorem 1, $Ka = Kb$. It is surjective because $f$ is onto $H$, so every element of $H$ is of the form $f(x) = \phi(Kx)$. Finally,
$$\phi(Ka \cdot Kb) = \phi(Kab) = f(ab) = f(a)f(b) = \phi(Ka)\phi(Kb)$$
Thus, $\phi$ is an isomorphism from $G/K$ onto $H$. ■

Theorem 2 is often called the fundamental homomorphism theorem. It asserts that every homomorphic image of $G$ is isomorphic to a quotient group of $G$. Which specific quotient group of $G$? Well, if $f$ is a homomorphism from $G$ onto $H$, then $H$ is isomorphic to the quotient group of $G$ by the kernel of $f$.
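The steps of this proof can be traced on a small concrete case. The sketch below takes, as an illustrative choice, the map from the integers mod 6 onto the integers mod 3 that reduces each element mod 3; the variable names (`phi`, `well_defined`, and so on) are hypothetical and only mirror the steps of the argument: compute the kernel, form its cosets, define the correspondence sending the coset of x to f(x), and check that it is well defined, bijective, and operation-preserving.

```python
n, m = 6, 3
G = range(n)

def f(x):
    """Illustrative homomorphism from Z6 onto Z3: reduce mod 3."""
    return x % m

# f respects addition.
is_hom = all(f((a + b) % n) == (f(a) + f(b)) % m for a in G for b in G)

# Kernel: everything sent to the neutral element 0 of Z3.
K = frozenset(x for x in G if f(x) == 0)

def coset(a):
    """Return the coset K + a in Z6."""
    return frozenset((k + a) % n for k in K)

# phi(K + x) = f(x); it is well defined iff f is constant on each coset.
phi, well_defined = {}, True
for a in G:
    c = coset(a)
    if c in phi and phi[c] != f(a):
        well_defined = False
    phi[c] = f(a)

# Bijective: distinct cosets get distinct images, and every element of Z3 is hit.
bijective = len(set(phi.values())) == len(phi) == m

# phi carries coset addition to addition in Z3.
preserves = all(phi[coset((a + b) % n)] == (phi[coset(a)] + phi[coset(b)]) % m
                for a in G for b in G)

print(sorted(K), is_hom, well_defined, bijective, preserves)
```

Here the kernel is {0, 3}, and the three cosets {0, 3}, {1, 4}, {2, 5} are matched with 0, 1, 2 respectively.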
The fact that $f$ is a homomorphism from $G$ onto $H$ may be symbolized by writing
$$f : G \twoheadrightarrow H$$
Furthermore, the fact that $K$ is the kernel of this homomorphism may be indicated by writing
$$f : G \underset{K}{\twoheadrightarrow} H$$
Thus, in capsule form, the fundamental homomorphism theorem says that
$$\text{If} \quad f : G \underset{K}{\twoheadrightarrow} H \quad \text{then} \quad H \cong G/K$$
Let us see a few examples:

We saw in the opening paragraph of Chapter 14 that
$$f = \begin{pmatrix} 0 & 1 & 2 & 3 & 4 & 5 \\ 0 & 1 & 2 & 0 & 1 & 2 \end{pmatrix}$$
is a homomorphism from $\mathbb{Z}_6$ onto $\mathbb{Z}_3$. Visibly, the kernel of $f$ is $\{0, 3\}$, which is the subgroup of $\mathbb{Z}_6$ generated by 3, that is, the subgroup $\langle 3 \rangle$. This situation may be symbolized by writing
$$f : \mathbb{Z}_6 \underset{\langle 3 \rangle}{\twoheadrightarrow} \mathbb{Z}_3$$
We conclude by Theorem 2 that
$$\mathbb{Z}_3 \cong \mathbb{Z}_6/\langle 3 \rangle$$
For another kind of example, let $G$ and $H$ be any groups and consider their direct product $G \times H$. Remember that $G \times H$ consists of all the ordered pairs $(x, y)$ as $x$ ranges over $G$ and $y$ ranges over $H$. You multiply ordered pairs by multiplying corresponding components: that is, the operation on $G \times H$ is given by
$$(a, b) \cdot (c, d) = (ac, bd)$$
Now, let $f$ be the function from $G \times H$ onto $H$ given by
$$f(x, y) = y$$
It is easy to check that $f$ is a homomorphism. Furthermore, $(x, y)$ is in the kernel of $f$ iff $f(x, y) = y = e$. This means that the kernel of $f$ consists of all the ordered pairs whose second component is $e$. Call this kernel $G^*$; then
$$G^* = \{(x, e) : x \in G\}$$
We symbolize all this by writing
$$f : G \times H \underset{G^*}{\twoheadrightarrow} H$$
By the fundamental homomorphism theorem, we deduce that $H \cong (G \times H)/G^*$. [It is easy to see that $G^*$ is an isomorphic copy of $G$; thus, identifying $G^*$ with $G$, we have shown that, roughly speaking, $(G \times H)/G \cong H$.]

Other uses of the fundamental homomorphism theorem are given in the exercises.

EXERCISES

In the exercises which follow, FHT will be used as an abbreviation for fundamental homomorphism theorem.

A. Examples of the FHT Applied to Finite Groups

In each of the following, use the fundamental homomorphism theorem to prove that the two given groups are isomorphic. Then display their tables.

Example $\mathbb{Z}_2$ and $\mathbb{Z}_6/\langle 2 \rangle$.
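$$f = \begin{pmatrix} 0 & 1 & 2 & 3 & 4 & 5 \\ 0 & 1 & 0 & 1 & 0 & 1 \end{pmatrix}$$
is a homomorphism from $\mathbb{Z}_6$ onto $\mathbb{Z}_2$. (Do not prove that $f$ is a homomorphism.) The kernel of $f$ is $\{0, 2, 4\} = \langle 2 \rangle$. Thus,
$$f : \mathbb{Z}_6 \underset{\langle 2 \rangle}{\twoheadrightarrow} \mathbb{Z}_2$$
It follows by the FHT that $\mathbb{Z}_2 \cong \mathbb{Z}_6/\langle 2 \rangle$.
1 $\mathbb{Z}_5$ and $\mathbb{Z}_{20}/\langle 5 \rangle$.
2 $\mathbb{Z}_3$ and $\mathbb{Z}_6/\langle 3 \rangle$.
3 $\mathbb{Z}_2$ and $S_3/\{\varepsilon, \beta, \delta\}$.
4 $P_2$ and $P_3/K$, where $K = \{\emptyset, \{c\}\}$. [Hint: Consider the function $f(C) = C \cap \{a, b\}$. $P_3$ is the group of subsets of $\{a, b, c\}$, and $P_2$ of $\{a, b\}$.]
5 $\mathbb{Z}_3$ and $(\mathbb{Z}_3 \times \mathbb{Z}_3)/K$, where $K = \{(0, 0), (1, 1), (2, 2)\}$. [Hint: Consider the function $f(a, b) = a - b$ from $\mathbb{Z}_3 \times \mathbb{Z}_3$ to $\mathbb{Z}_3$.]

B. Example of the FHT Applied to $\mathcal{F}(\mathbb{R})$

Let $\alpha : \mathcal{F}(\mathbb{R}) \to \mathbb{R}$ be defined by $\alpha(f) = f(1)$ and let $\beta : \mathcal{F}(\mathbb{R}) \to \mathbb{R}$ be defined by $\beta(f) = f(2)$.
1 Prove that $\alpha$ and $\beta$ are homomorphisms from $\mathcal{F}(\mathbb{R})$ onto $\mathbb{R}$.
2 Let $J$ be the set of all the functions from $\mathbb{R}$ to $\mathbb{R}$ whose graph passes through the point $(1, 0)$ and let $K$ be the set of all the functions whose graph passes through $(2, 0)$. Use the FHT to prove that $\mathbb{R} \cong \mathcal{F}(\mathbb{R})/J$ and $\mathbb{R} \cong \mathcal{F}(\mathbb{R})/K$.
3 Conclude that $\mathcal{F}(\mathbb{R})/J \cong \mathcal{F}(\mathbb{R})/K$.

C. Example of the FHT Applied to Abelian Groups

Let $G$ be an abelian group. Let $H = \{x^2 : x \in G\}$ and $K = \{x \in G : x^2 = e\}$.
1 Prove that $f(x) = x^2$ is a homomorphism of $G$ onto $H$.
2 Find the kernel of $f$.
3 Use the FHT to conclude that $H \cong G/K$.

† D. Group of Inner Automorphisms of a Group G

Let $G$ be a group. By an automorphism of $G$ we mean an isomorphism $f : G \to G$.
# 1 The symbol $\operatorname{Aut}(G)$ is used to designate the set of all the automorphisms of $G$. Prove that the set $\operatorname{Aut}(G)$, with the operation $\circ$ of composition, is a group by proving that $\operatorname{Aut}(G)$ is a subgroup of $S_G$.
2 By an inner automorphism of $G$ we mean any function $\phi_a$ of the following form: for every $x \in G$, $\phi_a(x) = axa^{-1}$. Prove that every inner automorphism of $G$ is an automorphism of $G$.
3 Prove that, for arbitrary $a, b \in G$, $\phi_a \circ \phi_b = \phi_{ab}$ and $(\phi_a)^{-1} = \phi_{a^{-1}}$.
4 Let $I(G)$ designate the set of all the inner automorphisms of $G$. That is, $I(G) = \{\phi_a : a \in G\}$. Use part 3 to prove that $I(G)$ is a subgroup of $\operatorname{Aut}(G)$. Explain why $I(G)$ is a group.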
5 By the center of $G$ we mean the set of all those elements of $G$ which commute with every element of $G$, that is, the set $C$ defined by $C = \{a \in G : ax = xa \text{ for every } x \in G\}$. Prove that $a \in C$ if and only if $axa^{-1} = x$ for every $x \in G$.
6 Let $h : G \to I(G)$ be the function defined by $h(a) = \phi_a$. Prove that $h$ is a homomorphism from $G$ onto $I(G)$ and that $C$ is its kernel.
7 Use the FHT to conclude that $I(G)$ is isomorphic with $G/C$.

† E. The FHT Applied to Direct Products of Groups

Let $G$ and $H$ be groups. Suppose $J$ is a normal subgroup of $G$ and $K$ is a normal subgroup of $H$.
1 Show that the function $f(x, y) = (Jx, Ky)$ is a homomorphism from $G \times H$ onto $(G/J) \times (H/K)$.
2 Find the kernel of $f$.
3 Use the FHT to conclude that $(G \times H)/(J \times K) \cong (G/J) \times (H/K)$.

† F. First Isomorphism Theorem

Let $G$ be a group; let $H$ and $K$ be subgroups of $G$, with $H$ a normal subgroup of $G$. Prove the following:
1 $H \cap K$ is a normal subgroup of $K$.
# 2 If $HK = \{xy : x \in H \text{ and } y \in K\}$, then $HK$ is a subgroup of $G$.
3 $H$ is a normal subgroup of $HK$.
4 Every member of the quotient group $HK/H$ may be written in the form $Hk$ for some $k \in K$.
5 The function $f(k) = Hk$ is a homomorphism from $K$ onto $HK/H$, and its kernel is $H \cap K$.
6 By the FHT, $K/(H \cap K) \cong HK/H$. (This is referred to as the first isomorphism theorem.)

† G. A Sharper Cayley Theorem

If $H$ is a subgroup of a group $G$, let $X$ designate the set of all the left cosets of $H$ in $G$. For each element $a \in G$, define $\rho_a : X \to X$ as follows: $\rho_a(xH) = (ax)H$.
1 Prove that each $\rho_a$ is a permutation of $X$.
2 Prove that $h : G \to S_X$ defined by $h(a) = \rho_a$ is a homomorphism.
# 3 Prove that the set $\{a \in H : xax^{-1} \in H \text{ for every } x \in G\}$, that is, the set of all the elements of $H$ whose conjugates are all in $H$, is the kernel of $h$.
4 Prove that if $H$ contains no normal subgroup of $G$ except $\{e\}$, then $G$ is isomorphic to a subgroup of $S_X$.

† H. Quotient Groups Isomorphic to the Circle Group

Every complex number $a + bi$ may be represented as a point in the complex plane.

[Figure: the complex plane with its real and imaginary axes, showing the point $a + bi$ and the point $\cos x + i \sin x$ on the unit circle.]

The unit circle in the complex plane consists of all the complex numbers whose distance from the origin is 1; thus, clearly, the unit circle consists of all the complex numbers which can be written in the form $\cos x + i \sin x$ for some real number $x$.
# 1 For each $x \in \mathbb{R}$, it is conventional to write $\operatorname{cis} x = \cos x + i \sin x$. Prove that $\operatorname{cis}(x + y) = (\operatorname{cis} x)(\operatorname{cis} y)$.
2 Let $T$ designate the set $\{\operatorname{cis} x : x \in \mathbb{R}\}$, that is, the set of all the complex numbers lying on the unit circle, with the operation of multiplication. Use part 1 to prove that $T$ is a group. ($T$ is called the circle group.)
3 Prove that $f(x) = \operatorname{cis} x$ is a homomorphism from $\mathbb{R}$ onto $T$.
4 Prove that $\ker f = \{2n\pi : n \in \mathbb{Z}\} = \langle 2\pi \rangle$.
5 Use the FHT to conclude that $T \cong \mathbb{R}/\langle 2\pi \rangle$.
6 Prove that $g(x) = \operatorname{cis} 2\pi x$ is a homomorphism from $\mathbb{R}$ onto $T$, with kernel $\mathbb{Z}$.
7 Conclude that $T \cong \mathbb{R}/\mathbb{Z}$.

† I. The Second Isomorphism Theorem

Let $H$ and $K$ be normal subgroups of a group $G$, with $H \subseteq K$. Define $\phi(Ha) = $ [remainder of the definition and part 1 lost in source]
2 $\phi$ is a homomorphism.
3 $\phi$ is surjective.
4 $\ker \phi$ [text breaks off in source]

= a'} = ■ ■ ■ = a'n = g_ If (Dl) and (D2) hold, we will write G = [a,, a2, and 2. , a„J. Assume this in parts 1 1 Let G' be the set of all products a'l ■ ■ ■ a1;, as l2, . . . , la range over Z. Prove that G' is a subgroup of G, and G' = [a2, . . . , aj. 2 Prove: G = (a,) x G'. Conclude that G = («,) x . In the remaining exercises of this set, let p be a prime number, and assume G is a finite abelian group such that the order of every element in G is some power of p. Let a G G be an element whose order is the highest possible in G. We will argue by induction to prove that G is "decomposable." Let H= (a). 3 Explain why we may assume that G/H = [Hbu . . ., Hb ] for some bt.....i.eo. By Exercise O, we may assume that for each i = l,...,n, ord(p.) = ord(/#>,). We will show that G = [a, 6,,. . . , 6J. 168 CHAPTER SIXTEEN 4 Prove that for every xSG, there are integers k0, *„..., *„ such that : 0 1 / are both divisors of zero in M2(U). Of course, there are rings which have no divisors of zero at all! For example, Z, Q, R, and C do not have any divisors of zero. It is important to note carefully what it means for a ring to have no divisors of zero: it means that if the product of two elements in the ring is equal to zero, at least one of the factors is zero. (Our commandment from elementary mathematics!) It is also decreed in elementary algebra that a nonzero number a may be canceled in the equation ax = ay to yield x = y. While undeniably true in the number systems of mathematics, this rule is not true in every ring. For example, in Z6, 2-5 = 2-2 yet we cannot cancel the common factor 2. A similar example involving 2x2 matrices may be seen on page 9. When cancellation is possible, we say the ring has the "cancellation property." A ring is said to have the cancellation property if ab = ac or ba = ca implies b = c for any elements a, b, and c in the ring if a ¥0. 
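The failure of cancellation in Z6 can be checked by brute force. The following is only a minimal sketch: the function names `zero_divisors` and `cancellation_failures` are ours, not the book's, and Z_n is modeled as the integers 0, ..., n − 1 with arithmetic modulo n.

```python
def zero_divisors(n):
    """Return the nonzero a in Z_n with a*b = 0 (mod n) for some nonzero b."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]


def cancellation_failures(n):
    """Return triples (a, b, c) with a != 0 and b != c, but a*b = a*c in Z_n."""
    return [(a, b, c)
            for a in range(1, n)
            for b in range(n)
            for c in range(b + 1, n)
            if (a * b) % n == (a * c) % n]


print(zero_divisors(6))                        # [2, 3, 4]
print((2, 2, 5) in cancellation_failures(6))   # True: 2*2 = 2*5 = 4 in Z6
print(zero_divisors(7), cancellation_failures(7))  # [] []
```

The triple (2, 2, 5) is exactly the example 2 · 5 = 2 · 2 from the text, where the common factor 2 cannot be canceled.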
There is a surprising and unexpected connection between the cancellation property and divisors of zero:

Theorem 2 A ring has the cancellation property iff it has no divisors of zero.

Proof: The proof is very straightforward. Let A be a ring, and suppose first that A has the cancellation property. To prove that A has no divisors of zero we begin by letting ab = 0, and show that a or b is equal to 0. If a = 0, we are done. Otherwise, we have ab = 0 = a0, so by the cancellation property (canceling a), b = 0.

Conversely, assume A has no divisors of zero. To prove that A has the cancellation property, suppose ab = ac where a ≠ 0. Then

ab − ac = a(b − c) = 0

Remember, there are no divisors of zero! Since a ≠ 0, necessarily b − c = 0, so b = c. ■

An integral domain is defined to be a commutative ring with unity having the cancellation property. By Theorem 2, an integral domain may also be defined as a commutative ring with unity having no divisors of zero. It is easy to see that every field is an integral domain. The converse, however, is not true: for example, Z is an integral domain but not a field. We will have a lot to say about integral domains in the following chapters.

EXERCISES

A. Examples of Rings

In each of the following, a set A with operations of addition and multiplication is given. Prove that A satisfies all the axioms to be a commutative ring with unity. Indicate the zero element, the unity, and the negative of an arbitrary a.
I A is the set Z of the integers, with the following "addition" © and "multiplication" O: a®b=a+b-\ aOb = ab-(a + b) + 2 2 A is the set Q of the rational numbers, and the operations arc © and O defined as follows: a@b = a + b + 1 aOb = ab + a + b rings: definitions and elementary properties 175 # 3 A is the set Q x Q of ordered pairs of rational numbers, and the operations are the following addition © and multiplication O: (a, b)®(c, d) = (a + c,b + d) (a, b)Q(c, d) = (ac - bd, ad + be) 4 A = {x + yV2 :x,yEl} with conventional addition and multiplication. 5 Prove that the ring in part 1 is an integral domain. 6 Prove that the ring in part 2 is a field, and indicate the multiplicative inverse of an arbitrary nonzero element. 7 Do the same for the ring in part 3. B. Ring of Real Functions 1 Verify that iF(R) satisfies all the axioms for being a commutative ring with unity. Indicate the zero and unity, and describe the negative of any/. ' 2 Describe the divisors of zero in &(R). 3 Describe the invcrtible elements in &(M). 4 Explain why ^(IR) is neither a field nor an integral domain. C. Ring of 2 x 2 Matrices Let M2(U) designate the set of all 2 x 2 matrices ca whose entries are real numbers a, b, c, and d, with the following addition and multiplication: and 1 Verify that M2( 2 Show that M2(\ 3 Explain why M a b> c di (a + r Kc + t b + s d + u ar + bt as cr + dt cs - bu" dm ) satisfies the ring axioms, is not commutative and has a unity. R) is not an integral domain or a field. D. Rings of Subsets of a Set If D is a set, then the power set of D is the set PD of all the subsets of D. Addition and multiplication are defined as follows: If A and B are elements of P (that is, subsets of D), then A + B = (A- B)U(B- A) and AB = A n B It was shown in Chapter 3, Exercise C, that PD with addition alone is an abelian group. 176 chapter seventeen rings: definitions and elementary properties 177 # 1 Prove: PD is a commutative ring with unity. 
(You may assume CI is associative; for the distributive law, use the same diagram and approach as was used to prove that addition is associative in Chapter 3, Exercise C.) 2 Describe the divisors of zero in PD. 3 Describe the invertible elements in P„. 4 Explain why Pn is neither a field nor an integral domain. (Assume D has more than one element.) 5 Give the tables of P3, that is, PD where D = {a, b, c]. E. Ring of Quaternions A quaternion (in matrix form) is a 2 x 2 matrix of complex numbers of the form _ / a + bi c + di\ 01 \—c + di a - bi) 1 Prove that the set of all the quaternions, with the matrix addition and multiplication explained on pages 7 and 8, is a ring with unity. This ring is denoted by the symbol 2. Find an example to show that Si is not commutative. (You may assume matrix addition and multiplication are associative and obey the distributive law.) 2 Let -('„;) -a °.) i) -g a Show that the quaternion a, defined previously, may be written in the form a = a\ + bi + cj + dk (This is the standard notation for quaternions.) # 3 Prove the following formulas: l'.j'-k'--l ij 4 The conjugate of a is jk = -kj ki = -ik = j _ _ (a — bi —c- di\ a \c - di a + bi) The norm of a is a2 + b1 + c2 + d\ and is written ||a||. Show directly (by matrix multiplication) that t 0\ (i ,) where t — a Conclude that the multiplicative inverse of a is (\lt)a. 5 A skew field is a (not necessarily commutative) ring with unity in which every nonzero element has a multiplicative inverse. Conclude from parts 1 and 4 that .2 is a skew field. F. Ring of Endomorphisms Let G be an abelian group in additive notation. An endomorphism of G is a homomorphism from G to G. Let End(G) denote the set of all the endomorphisms of G, and define addition and multiplication of endomorphisms as follows: [/ + #](•*)= /(*) + g(*) for every x in G fg=f°8 the composite of / and g 1 Prove that End(G) with these operations is a ring with unity. 
2 List the elements of End(Z4), then give the addition and multiplication tables for End(Z.,). Remark: The endomorphisms of Z4 are easy to find. Any endomorphisms of Z4 will carry 1 to either 0, 1, 2, or 3. For example, take the last case: if then necessarily 1 + 1-^3 + 3 = 2 1 + 1 + 1-43 + 3 + 3 = 1 hence / is completely determined by the fact that l-43 and 0-»0 G. Direct Product of Rings If A and B are rings, their direct product is a new ring, denoted by A x fi, and defined as follows: A x B consists of all the ordered pairs (x, y) where x is in A and y is in B. Addition in Ax B consists of adding corresponding components: (*,, y,) + (x2, y2) = (x, + x2, y, + y2) Multiplication in A x B consists of multiplying corresponding components: (x„ yi)-(x2, y2) = (x,x2, yty2) 1 If A and B arc rings, verify that A x B is a ring. 2 If A and B are commutative, show that A x B is commutative. If A and B each has a unity, show that Ax B has a unity. 3 Describe carefully the divisors of zero in A X B. It 4 Describe the invertible elements in A x B. 5 Explain why Ax B can never be an integral domain or a field. (Assume A and B each have more than one element.) H. Elementary Properties of Rings Prove parts 1-4: 178 chapter seventeen rings: definitions and elementary properties 179 1 In any ring, a(b — c) = ab - ac and (b - c)a = ba — ca. 2 In any ring, if ab = -ba, then (a + ft)2 = (a - b)2 = a2 + b2. 3 In any integral domain, if a2 = b2, then a = ±b. 4 In any integral domain, only 1 and -1 are their own multiplicative inverses. (Note thatx = x ' iff x2 = l.) 5 Show that the commutative law for addition need not be assumed in denning a ring with unity: it may be proved from the other axioms. [Hint: Use the distributive law to expand (a + ft)(l + 1) in two different ways.] # 6 Let A be any ring. Prove that if the additive group of A is cyclic, then A is a commutative ring. 7 Prove: In any integral domain, if a" =0 for some integer n, then a = 0. I. 
Properties of Invertible Elements Prove that parts 1-5 are true in a nontrivial ring with unity. 1 If a is invertible and ab = ac, then b = c. 2 An element a can have no more than one multiplicative inverse. 3 If a2 = 0 then a + 1 and a - 1 are invertible. 4 If a and b are invertible, their product ab is invertible. 5 The set S of all the invertible elements in a ring is a multiplicative group. 6 By part 5, the set of all the nonzero elements in a field is a multiplicative group. Now use Lagrange's theorem to prove that in a finite field with m elements, xml = \ for every x ^ 0. 7 If ax = 1, x is a right inverse of a; if ya = 1, y is a left inverse of a. Prove that if a has a right inverse x and a left inverse y, then a is invertible, and its inverse is equal to x and to y. (First show that yaxa = 1.) 8 Prove: In a commutative ring, if ab is invertible, then a and b are both invertible. J. Properties of Divisors of Zero Prove that each of the following is true in a nontrivial ring. 1 If a ^ ±1 and a2 = 1, then a + 1 and a - 1 are divisors of zero. # 2 If ab is a divisor of zero, then a or b is a divisor of zero. 3 In a commutative ring with unity, a divisor of zero cannot be invertible. 4 Suppose ab ^ 0 in a commutative ring. If either a or b is a divisor of zero, so is ab. 5 Suppose a is neither 0 nor a divisor of zero. If ab = ac, then b = c. 6 A x B always has divisors of zero. K. Boolean Rings A ring A is a boolean ring if a2 = a for every a e A. Prove that parts 1 and 2 are true in any boolean ring A. 1 For every a 6 A, a= -a. [Hint: Expand (a + a)2.] 2 Use part 1 to prove that A is a commutative ring. [Hint: Expand (a + b)2.] In parts 3 and 4, assume A has a unity and prove: 3 Every element except 0 and 1 is a divisor of zero. [Consider x(x - 1).| 4 1 is the only invertible element in A. 5 Letting a v ft = a + ft + aft we have the following in A: a v bc = (a v ft)(a v c) a v (1 + a) = 1 a v a = a a(a v ft) = a L. 
The Binomial Formula An important formula in elementary algebra is the binomial expansion formula for an expression (a + ft)". The formula is as follows: (a+by =2 (%" v where the binomial coefficient / n \ = n(/i-l)(n-2)---(rt-A + l) This theorem is true in every commutative ring. (If k is any positive integer and a is an element of a ring, ka refers to the sum a + a + ■ ■ ■ + a with k terms, as in elementary algebra.) The proof of the binomial theorem in a commutative ring is no different from the proof in elementary algebra. We shall review it here. The proof of the binomial formula is by induction on the exponent n. The formula is trivially true for n = 1. In the induction step, we assume the expansion for (a + ft)" is as above, and we must prove that (a + ft)" M = £ n + 1 Now, (a + ft)"M = (a + ft)(a + ft)" = (a + b) 2 (l)a-"bk Collecting terms, we find that the coefficient of a"*' "kbk is (rk) + (k-l, 180 chapter seventeen By direct computation, show that GM:.MT) It will follow that (a + £>)n+I is as claimed, and the proof is complete. M. Nilpotent and Unipotent Elements An element a of a ring is nilpotent if a" = 0 for some positive integer n. 1 In a ring with unity, prove that if a is nilpotent, then a + 1 and a - 1 are both invcrtible. [Hint: Use the factorization 1 - a" = (1 - a)(l + a + a2 + ■ ■ ■ + a" ') for 1 - a, and a similar formula for 1 + a.] 2 In a commutative ring, prove that any product xa of a nilpotent element a by any element x is nilpotent. # 3 In a commutative ring, prove that the sum of two nilpotent elements is nilpotent. (Hint: You must use the binomial formula; see Exercise L.) An element a of a ring is unipotent iff 1 - a is nilpotent. 4 In a commutative ring, prove that the product of two unipotent elements a and b is unipotent. [Hint: Use the binomial formula to expand 1 - ab = (1 - a) + a(l - b) to power n + m.\ 5 In a ring with unity, prove that every unipotent element is invertible. (Hint: Use Part 1.) 
CHAPTER EIGHTEEN
IDEALS AND HOMOMORPHISMS

We have already seen several examples of smaller rings contained within larger rings. For example, Z is a ring inside the larger ring Q, and Q itself is a ring inside the larger ring R. When a ring B is part of a larger ring A, we call B a subring of A. The notion of subring is the precise analog for rings of the notion of subgroup for groups. Here are the relevant definitions:

Let A be a ring, and B a nonempty subset of A. If the sum of any two elements of B is again in B, then B is closed with respect to addition. If the negative of every element of B is in B, then B is closed with respect to negatives. Finally, if the product of any two elements of B is again in B, then B is closed with respect to multiplication. B is called a subring of A if B is closed with respect to addition, multiplication, and negatives.

Why is B then called a subring of A? Quite elementary: If a nonempty subset B ⊆ A is closed with respect to addition, multiplication, and negatives, then B with the operations of A is a ring. This fact is easy to check: If a, b, and c are any three elements of B, then a, b, and c are also elements of A because B ⊆ A. But A is a ring, so

a + (b + c) = (a + b) + c
a(bc) = (ab)c
a(b + c) = ab + ac
and (b + c)a = ba + ca

Thus, in B addition and multiplication are associative and the distributive law is satisfied. Now, B was assumed to be nonempty, so there is an element b ∈ B; but B is closed with respect to negatives, so −b is also in B. Finally, B is closed with respect to addition; hence b + (−b) ∈ B. That is, 0 is in B. Thus, B satisfies all the requirements for being a ring.

For example, Q is a subring of R because the sum of two rational numbers is rational, the product of two rational numbers is rational, and the negative of every rational number is rational.
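For a finite ring the three closure conditions can be tested exhaustively. The sketch below is an illustration under our own assumptions: the ambient ring is Z_n, modeled as the integers 0, ..., n − 1 with arithmetic modulo n, and the helper name `is_subring` is hypothetical.

```python
def is_subring(subset, n):
    """Check closure of `subset` in Z_n under addition, multiplication, negatives."""
    B = set(subset)
    if not B:
        return False  # a subring must be nonempty
    closed_add = all((a + b) % n in B for a in B for b in B)
    closed_mul = all((a * b) % n in B for a in B for b in B)
    closed_neg = all((-a) % n in B for a in B)
    return closed_add and closed_mul and closed_neg


print(is_subring({0, 2, 4}, 6))  # True
print(is_subring({0, 3}, 6))     # True
print(is_subring({0, 1}, 6))     # False: 1 + 1 = 2 is not in the subset
```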
By the way, if B is a nonempty subset of A, there is a more compact way of checking that B is a subring of A:

B is a subring of A if and only if B is closed with respect to subtraction and multiplication.

The reason is that B is closed with respect to subtraction iff B is closed with respect to both addition and negatives. This last fact is easy to check, and is given as an exercise.

A while back, in our study of groups, we singled out certain special subgroups called normal subgroups. We will now describe certain special subrings called ideals which are the counterpart of normal subgroups: that is, ideals are in rings as normal subgroups are in groups.

Let A be a ring, and B a nonempty subset of A. We will say that B absorbs products in A (or, simply, B absorbs products) if, whenever we multiply an element in B by an element in A (regardless of whether the latter is inside B or outside B), their product is always in B. In other words, for all b ∈ B and x ∈ A, xb and bx are in B.

A nonempty subset B of a ring A is called an ideal of A if B is closed with respect to addition and negatives, and B absorbs products in A.

A simple example of an ideal is the set E of the even integers. E is an ideal of Z because the sum of two even integers is even, the negative of any even integer is even, and, finally, the product of an even integer by any integer is always even.

In a commutative ring with unity, the simplest example of an ideal is the set of all the multiples of a fixed element a by all the elements in the ring. In other words, the set of all the products xa, as a remains fixed and x ranges over all the elements of the ring.
This set is obviously an ideal because

xa + ya = (x + y)a
−(xa) = (−x)a
and y(xa) = (yx)a

This ideal is called the principal ideal generated by a, and is denoted by ⟨a⟩.

As in the case of subrings, if B is a nonempty subset of A, there is a more compact way of checking that B is an ideal of A:

B is an ideal of A if and only if B is closed with respect to subtraction and B absorbs products in A.

We shall see presently that ideals play an important role in connection with homomorphisms.

Homomorphisms are almost the same for rings as for groups. A homomorphism from a ring A to a ring B is a function f : A → B satisfying the identities

f(x₁ + x₂) = f(x₁) + f(x₂)
and f(x₁x₂) = f(x₁)f(x₂)

There is a longer but more informative way of writing these two identities:

1. If f(x₁) = y₁ and f(x₂) = y₂, then f(x₁ + x₂) = y₁ + y₂.
2. If f(x₁) = y₁ and f(x₂) = y₂, then f(x₁x₂) = y₁y₂.

In other words, if f happens to carry x₁ to y₁ and x₂ to y₂, then, necessarily, it must carry x₁ + x₂ to y₁ + y₂ and x₁x₂ to y₁y₂. Symbolically,

if x₁ → y₁ and x₂ → y₂, then necessarily x₁ + x₂ → y₁ + y₂ and x₁x₂ → y₁y₂

One can easily confirm for oneself that a function f with this property will transform the addition and multiplication tables of its domain into the addition and multiplication tables of its range. (We may imagine infinite rings to have "nonterminating" tables.) Thus, a homomorphism from a ring A onto a ring B is a function which transforms A into B.

For example, the ring Z6 is transformed into the ring Z3 by

f = ( 0 1 2 3 4 5 )
    ( 0 1 2 0 1 2 )

as we may verify by comparing their tables.
The addition tables are compared on page 136, and we may do the same with their multiplication tables:

  · | 0 1 2 3 4 5
  --+------------
  0 | 0 0 0 0 0 0
  1 | 0 1 2 3 4 5
  2 | 0 2 4 0 2 4
  3 | 0 3 0 3 0 3
  4 | 0 4 2 0 4 2
  5 | 0 5 4 3 2 1

Replace x by f(x):

  · | 0 1 2 0 1 2
  --+------------
  0 | 0 0 0 0 0 0
  1 | 0 1 2 0 1 2
  2 | 0 2 1 0 2 1
  0 | 0 0 0 0 0 0
  1 | 0 1 2 0 1 2
  2 | 0 2 1 0 2 1

Eliminate duplicate information (for example, 2 · 2 = 1 appears four separate times in the table above):

  · | 0 1 2
  --+------
  0 | 0 0 0
  1 | 0 1 2
  2 | 0 2 1

If there is a homomorphism from A onto B, we call B a homomorphic image of A. If f is a homomorphism from a ring A to a ring B, not necessarily onto, the range of f is a subring of B. (This fact is routine to verify.) Thus, the range of a ring homomorphism is always a ring. And obviously, the range of a homomorphism is always a homomorphic image of its domain.

Intuitively, if B is a homomorphic image of A, this means that certain features of A are faithfully preserved in B while others are deliberately lost. This may be illustrated by developing further an example described in Chapter 14. The parity ring P consists of two elements, e and o, with addition and multiplication given by the tables

  + | e o          · | e o
  --+----    and   --+----
  e | e o          e | e e
  o | o e          o | e o

We should think of e as "even" and o as "odd," and the tables as describing the rules for adding and multiplying odd and even integers. For example, even + odd = odd, even times odd = even, and so on.

The function f : Z → P which carries every even integer to e and every odd integer to o is easily seen to be a homomorphism from Z to P; this is made clear on page 137. Thus, P is a homomorphic image of Z. Although the ring P is very much smaller than the ring Z, and therefore few of the features of Z can be expected to reappear in P, nevertheless one aspect of the structure of Z is retained absolutely intact in P, namely, the structure of odd and even numbers.
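The two homomorphism identities can be verified mechanically on small examples. The sketch below rests on our own assumptions: the map from Z6 to Z3 is read as x ↦ x mod 3, the parity ring P is encoded as Z2 (e as 0, o as 1), and the helper name `is_ring_homomorphism` is hypothetical. The parity check covers only a finite sample of integers, so it illustrates the claim rather than proving it.

```python
def is_ring_homomorphism(f, n, m):
    """Check f(x+y) = f(x)+f(y) and f(xy) = f(x)f(y) for f: Z_n -> Z_m (a dict)."""
    return all(
        f[(x + y) % n] == (f[x] + f[y]) % m
        and f[(x * y) % n] == (f[x] * f[y]) % m
        for x in range(n) for y in range(n)
    )


f = {x: x % 3 for x in range(6)}  # the map Z6 -> Z3 from the text
g = {x: x % 4 for x in range(6)}  # a map Z6 -> Z4 that is not a homomorphism

print(is_ring_homomorphism(f, 6, 3))  # True
print(is_ring_homomorphism(g, 6, 4))  # False: 3 + 3 = 0 in Z6, but g(3) + g(3) = 2 in Z4

# Parity map on a finite sample of integers, with e encoded as 0 and o as 1.
sample = range(-10, 11)
parity_ok = all(
    (x + y) % 2 == ((x % 2) + (y % 2)) % 2
    and (x * y) % 2 == ((x % 2) * (y % 2)) % 2
    for x in sample for y in sample
)
print(parity_ok)  # True
```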
As we pass from Z to P, the parity of the integers (their being even or odd), with its arithmetic, is faithfully preserved while all else is lost. Other examples will be given in the exercises. If/is a homomorphism from a ring A to a ring B, the kernel of /is the set of all the elements of A which are carried by / onto the zero element of B. In symbols, the kernel of / is the set K = {x e A : /(*) = 0} It is a very important fact that the kernel offis an ideal of A. (The simple verification of this fact is left as an exercise.) If A and B are rings, an isomorphism from A to B is a homomorphism which is a one-to-one correspondence from A to B. In other words, it is an injective and surjective homomorphism. If there is an isomorphism from A to B we say that A is isomorphic to B, and this fact is expressed by writing A = B EXERCISES A. Examples of S si brings Prove that each of the following is a subring of the indicated ring: 1 (x + V3y : x, y £ 1} is a subring of R. 2 {x + 21/3y + 22,3z : x, y, z 6 Z} is a subring of R. 3 {x2y : x, ySZ] is a subring of R. # 4 Let < 2 h 3 h &(U)->R given by {f) =/(0). R x R-> IR given by h(x, y) = x. R^>M2(R) given by 4 h : R x R-*^«2(R) given by Kx,y) = (l °y) # 5 Let A be the set R x R with the usual addition and the following "multiplication": (a, b)Q(c, d) = (ac, be) Granting that A is a ring, let / : A-+M2(U) be given by 6 h: Pc~* Pc given by h(A) = A D D, where D is a fixed subset of C. 7 List all the homomorphisms from Z2 to Z4; from Z3 to Z6. F. Elementary Properties of Homomorphisms Let A and B be rings, and /: A-* B a homomorphism. Prove each of the following: 1 f{A) = {/(*) : x E A} is a subring of B. 2 The kernel of / is an ideal of A. 3 /(0) = 0, and for every a £ A, /(-a) = -f(a). 4 /is injective iff its kernel is equal to {0}. 5 If B is an integral domain, then either/(1) = 1 or /(1) = 0. If /(1) = 0, then f(x) = 0 for every x e A. If /(1) = 1, the image of every invertible element of A is an invertible element of B. 
188 chapter eighteen ideals and homomosph1sms 189 6 Any homomorphic image of a commutative ring is a commutative ring. Any homomorphic image of a field is a field. 7 If the domain A of the homomorphism / is a field, and if the range of / has more than one element, then / is injective. (Hint: Use Exercise D6.) G. Examples of Isomorphisms 1 Let A be the ring of Exercise A2 in Chapter 17. Show that the function fix) = x - 1 is an isomorphism from Q to A; hence Q s A. 2 Let Sf be the following subset of M2(U): 9 = Prove that the function fia + bi) (ab \-b a is an isomorphism from C to if. [Remark: You must begin by checking that /is a well-defined function; that is, if a + bi = c + di, then fia + bi) = fic + di). To do this, note that if a + bi = c + di then a - c = (d - b)i; this last equation is impossible unless both sides are equal to zero, for otherwise it would assert that a given real number is equal to an imaginary number.] 3 Prove that {(x, x): x GZ} is a subring of Z x Z, and show {(x, x) : x G Z} 3 Z. 4 Show that the set of all 2 x 2 matrices of the form (8 9 is a subring of M2(R), then prove this subring is isomorphic to U. For any integer k, let kl designate the subring of Z which consists of all the multiples of k. 5 Prove that Z ^ 2Z; then prove that 2Z ^ 3Z. Finally, explain why if k I, then kZ ?= /Z. (Remember: How do you show that two rings, or groups, are not isomorphic?) H. Further Properties of Ideals Let A be a ring, and let / and K be ideals of A. Prove parts 1-4. (In parts 2-4, assume A is a commutative ring.) 1 If J n K = {0}, then jk = 0 for every j&J and k G K. 2 For any a 6 A, Ia = {ax + j + k : x G A, j G J, k G K) is an ideal of A. # 3 The radical of / is the set rad J = {a G A : a" G J for some w G Z}. For any ideal J, rad J is an ideal of A. 4 For any aGA, {x G A : ax = 0} is an ideal (called the annihilator of a). Furthermore, {x G ^4 : ax = 0 for every a£4} is an ideal (called the annihilating ideal of /I). 
If A is a ring with unity, its annihilating ideal is equal to {0}. 5 Show that {0} and A are ideals of A. (They are trivial ideals; every other ideal of A is a proper ideal.) A proper ideal / of A is called maximal if it is not strictly contained in any strictly larger proper ideal: that is, if 7 C AT, where K is an ideal containing some element not in J, then necessarily K = A. Show that the following is an example of a maximal ideal: In ?f(U), the ideal J = {/ : /(0) = 0}. [Hint: Use Exercise D5. Note that if g G K and g(0) ¥= 0 (that is, g0J), then the function h(x) = g(x) - g(0) is in J; hence h(x) - g(x) G K. Explain why this last function is an invertible element of ^(ir).] I. Further Properties of Homomorphisms Let A and B be rings. Prove each of the following: 1 If / : A—* B is a homomorphism from A onto B with kernel K, and J is an ideal of A such that KCJ then f(J) is an ideal of B. 2 If /: A —► B is a homomorphism from A onto B, and B is a field, then the kernel of/is a maximal ideal. (Hint: Use part 1, with Exercise D6. Maximal ideals are defined in Exercise H5.) 3 There are no nontrivial homomorphisms from Z to Z. [The trivial homomorphisms are fix) = 0 and fix) = x.] 4 If n is a multiple of m, then Zm is a homomorphic image of Z„. 5 If n is odd, there is an injective homomorphism from Z2 into Z2„. t J. A Ring of Endomorphisms Let A be a commutative ring. Prove each of the following: 1 For each element a in A, the function w„ defined by na(x) = ax satisfies the identity ira(x + y) = ir0{x)+ ira(y). (In other words, «•„ is an endomorphism of the additive group of A.) 2 ira is injective iff a is not a divisor of zero. (Assume a 5^0.) 3 7r„ is surjective iff a is invertible. (Assume A has a unity.) 4 Let sA denote the set (w„ : aG A) with the two operations K + "*IM = 7T„(x) + tTjCk) and w„ tt6 = tt0 ° w6 Verify that sd is a ring. 5 If <^> : A—» jtf is given by $(a) = 7ra, then <£> is a homomorphism. 6 If A has a unity, then is an isomorphism. 
Similarly, if A has no divisors of zero then φ is an isomorphism.

CHAPTER NINETEEN
QUOTIENT RINGS

We continue our journey into the elementary theory of rings, traveling a road which runs parallel to the familiar landscape of groups. In our study of groups we discovered a way of actually constructing all the homomorphic images of any group G. We constructed quotient groups of G, and showed that every quotient group of G is a homomorphic image of G. We will now imitate this procedure and construct quotient rings.

We begin by defining cosets of rings:

Let A be a ring, and J an ideal of A. For any element a ∈ A, the symbol J + a denotes the set of all sums j + a, as a remains fixed and j ranges over J. That is,

J + a = {j + a : j ∈ J}

J + a is called a coset of J in A.

It is important to note that, if we provisionally ignore multiplication, A with addition alone is an abelian group and J is a subgroup of A. Thus, the cosets we have just defined are (if we ignore multiplication) precisely the cosets of the subgroup J in the group A, with the notation being additive. Consequently, everything we already know about group cosets continues to apply in the present case; only, care must be taken to translate known facts about group cosets into additive notation. For example, Property (1) of Chapter 13, with Theorem 5 of Chapter 15, reads as follows in additive notation:

a ∈ J + b      iff   J + a = J + b     (1)
J + a = J + b  iff   a − b ∈ J         (2)
J + a = J      iff   a ∈ J             (3)

We also know, by the reasoning which leads up to Lagrange's theorem, that the family of all the cosets J + a, as a ranges over A, is a partition of A.

There is a way of adding and multiplying cosets which works as follows:

(J + a) + (J + b) = J + (a + b)
(J + a)(J + b) = J + ab

In other words, the sum of the coset of a and the coset of b is the coset of a + b; the product of the coset of a and the coset of b is the coset of ab.
It is important to know that the sum and product of cosets, defined in this fashion, are determined without ambiguity. Remember that J + a may be the same coset as J + c [by Condition (1) this happens iff c is an element of J + a], and, likewise, J + b may be the same coset as J + d. Therefore, we have the equations

(J + a) + (J + b) = J + (a + b)            (J + a)(J + b) = J + ab
   ‖         ‖                      and       ‖       ‖
(J + c) + (J + d) = J + (c + d)            (J + c)(J + d) = J + cd

Obviously we must be absolutely certain that J + (a + b) = J + (c + d) and J + ab = J + cd. The next theorem provides us with this important guarantee.

Theorem 1 Let J be an ideal of A. If J + a = J + c and J + b = J + d, then
(i) J + (a + b) = J + (c + d), and
(ii) J + ab = J + cd.

Proof: We are given that J + a = J + c and J + b = J + d; hence by Condition (2),

a − c ∈ J and b − d ∈ J

Since J is closed with respect to addition, (a − c) + (b − d) = (a + b) − (c + d) is in J. It follows by Condition (2) that J + (a + b) = J + (c + d), which proves (i). On the other hand, since J absorbs products in A,

(a − c)b = ab − cb ∈ J and c(b − d) = cb − cd ∈ J

and therefore (ab − cb) + (cb − cd) = ab − cd is in J. It follows by Condition (2) that J + ab = J + cd. This proves (ii). ■

Now, think of the set which consists of all the cosets of J in A. This set is conventionally denoted by the symbol A/J. For example, if J + a, J + b, J + c, ... are cosets of J, then

A/J = {J + a, J + b, J + c, ...}

We have just seen that coset addition and multiplication are valid operations on this set. In fact,

Theorem 2 A/J with coset addition and multiplication is a ring.

Proof: Coset addition and multiplication are associative, and multiplication is distributive over addition. (These facts may be routinely checked.)
The zero element of A/J is the coset J = J + 0, for if J + a is any coset,

(J + a) + (J + 0) = J + (a + 0) = J + a

Finally, the negative of J + a is J + (−a), because

(J + a) + (J + (−a)) = J + (a + (−a)) = J + 0 ■

The ring A/J is called the quotient ring of A by J.

And now, the crucial connection between quotient rings and homomorphisms:

Theorem 3 A/J is a homomorphic image of A.

Following the plan already laid out for groups, the natural homomorphism from A onto A/J is the function f which carries every element to its own coset, that is, the function f given by

f(x) = J + x

This function is very easily seen to be a homomorphism.

Thus, when we construct quotient rings of A, we are, in fact, constructing homomorphic images of A. The quotient ring construction is useful because it is a way of actually manufacturing homomorphic images of any ring A.

The quotient ring construction is now illustrated with an important example. Let Z be the ring of the integers, and let ⟨6⟩ be the ideal of Z which consists of all the multiples of the number 6. The elements of the quotient ring Z/⟨6⟩ are all the cosets of the ideal ⟨6⟩, namely:

⟨6⟩ + 0 = {..., −18, −12, −6, 0, 6, 12, 18, ...} = 0
⟨6⟩ + 1 = {..., −17, −11, −5, 1, 7, 13, 19, ...} = 1
⟨6⟩ + 2 = {..., −16, −10, −4, 2, 8, 14, 20, ...} = 2
⟨6⟩ + 3 = {..., −15, −9, −3, 3, 9, 15, 21, ...} = 3
⟨6⟩ + 4 = {..., −14, −8, −2, 4, 10, 16, 22, ...} = 4
⟨6⟩ + 5 = {..., −13, −7, −1, 5, 11, 17, 23, ...} = 5

We will represent these cosets by means of the simplified notation 0, 1, 2, 3, 4, 5. The rules for adding and multiplying cosets give us the following tables:

  + | 0 1 2 3 4 5          · | 0 1 2 3 4 5
  --+------------          --+------------
  0 | 0 1 2 3 4 5          0 | 0 0 0 0 0 0
  1 | 1 2 3 4 5 0          1 | 0 1 2 3 4 5
  2 | 2 3 4 5 0 1          2 | 0 2 4 0 2 4
  3 | 3 4 5 0 1 2          3 | 0 3 0 3 0 3
  4 | 4 5 0 1 2 3          4 | 0 4 2 0 4 2
  5 | 5 0 1 2 3 4          5 | 0 5 4 3 2 1

One cannot fail to notice the analogy between the quotient ring Z/⟨6⟩ and the ring Z6. In fact, we will regard them as one and the same.
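The coset construction can be carried out by machine for a small finite ring. The sketch below is a hedged illustration, not part of the text: it takes A = Z6 and J = {0, 3} as a convenient finite example, models each coset as a frozenset, and uses hypothetical helper names `cosets` and `coset_op`. Each operation is computed from every possible pair of representatives, so the assertion inside `coset_op` would fail if the sum or product depended on the representatives chosen.

```python
from itertools import product


def cosets(n, J):
    """Return the distinct cosets J + a of J in Z_n, ordered by smallest element."""
    return sorted({frozenset((j + a) % n for j in J) for a in range(n)}, key=min)


def coset_op(n, J, X, Y, op):
    """Combine cosets X and Y through representatives, checking the result is unique."""
    results = {frozenset((j + op(x, y)) % n for j in J) for x, y in product(X, Y)}
    assert len(results) == 1, "operation depends on the choice of representatives"
    return results.pop()


n, J = 6, {0, 3}
Q = cosets(n, J)
add = {(X, Y): coset_op(n, J, X, Y, lambda x, y: x + y) for X in Q for Y in Q}
mul = {(X, Y): coset_op(n, J, X, Y, lambda x, y: x * y) for X in Q for Y in Q}

print([sorted(X) for X in Q])      # [[0, 3], [1, 4], [2, 5]]
print(sorted(add[(Q[1], Q[2])]))   # [0, 3], that is, (J + 1) + (J + 2) = J
print(sorted(mul[(Q[2], Q[2])]))   # [1, 4], that is, (J + 2)(J + 2) = J + 1
```

Working with all pairs of representatives is wasteful, but it makes the well-definedness of the operations something the code checks rather than assumes.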
More generally, for every positive integer n, we consider Zn to be the same as Z/⟨n⟩. In particular, this makes it clear that Zn is a homomorphic image of Z.

By Theorem 3, any quotient ring A/J is a homomorphic image of A. Therefore the quotient ring construction is a way of actually producing homomorphic images of any ring A. In fact, as we will now see, it is a way of producing all the homomorphic images of A.

Theorem 4 Let f : A → B be a homomorphism from a ring A onto a ring B, and let K be the kernel of f. Then B ≅ A/K.

Proof: To show that A/K is isomorphic with B, we must look for an isomorphism from A/K to B. Mimicking the procedure which worked successfully for groups, we let φ be the function from A/K to B which matches each coset K + x with the element f(x); that is,

φ(K + x) = f(x)

If we ignore multiplication, f is a homomorphism of additive groups with kernel K, so, as in the case of groups, φ is a well-defined, bijective function from A/K to B. Finally,

φ((K + a) + (K + b)) = φ(K + (a + b)) = f(a + b) = f(a) + f(b) = φ(K + a) + φ(K + b)

and

φ((K + a)(K + b)) = φ(K + ab) = f(ab) = f(a)f(b) = φ(K + a)φ(K + b)

Thus, φ is an isomorphism from A/K onto B. ■

Theorem 4 is called the fundamental homomorphism theorem for rings. Theorems 3 and 4 together assert that every quotient ring of A is a homomorphic image of A, and, conversely, every homomorphic image of A is isomorphic to a quotient ring of A. Thus, for all practical purposes, quotients and homomorphic images of a ring are the same.

As in the case of groups, there are many practical instances in which it is possible to select an ideal J of A so as to "factor out" unwanted traits of A, and obtain a quotient ring A/J with "desirable" features.

As a simple example, let A be a ring, not necessarily commutative, and let J be an ideal of A which contains all the differences

ab − ba

as a and b range over A. It is quite easy to show that the quotient ring A/J is then commutative.
Indeed, to say that $A/J$ is commutative is to say that for any two cosets $J + a$ and $J + b$,

\[ (J + a)(J + b) = (J + b)(J + a) \qquad \text{that is} \qquad J + ab = J + ba \]

By Condition (2) this last equation is true iff $ab - ba \in J$. Thus, if every difference $ab - ba$ is in $J$, then any two cosets commute.

A number of important quotient ring constructions, similar in principle to this one, are given in the exercises.

An ideal $J$ of a commutative ring is said to be a prime ideal if for any two elements $a$ and $b$ in the ring,

\[ \text{if } ab \in J \text{ then } a \in J \text{ or } b \in J \]

Whenever $J$ is a prime ideal of a commutative ring with unity $A$, the quotient ring $A/J$ is an integral domain. (The details are left as an exercise.)

An ideal of a ring is called proper if it is not equal to the whole ring. A proper ideal $J$ of a ring $A$ is called a maximal ideal if there exists no proper ideal $K$ of $A$ such that $J \subseteq K$ with $J \ne K$ (in other words, $J$ is not contained in any strictly larger proper ideal). It is an important fact that if $A$ is a commutative ring with unity, then $J$ is a maximal ideal of $A$ iff $A/J$ is a field.

To prove this assertion, let $J$ be a maximal ideal of $A$. If $A$ is a commutative ring with unity, it is easy to see that $A/J$ is one also. In fact, it should be noted that the unity of $A/J$ is the coset $J + 1$, because if $J + a$ is any coset, $(J + a)(J + 1) = J + a1 = J + a$. Thus, to prove that $A/J$ is a field, it remains only to show that if $J + a$ is any nonzero coset, there is a coset $J + x$ such that $(J + a)(J + x) = J + 1$.

The zero coset is $J$. Thus, by Condition (3), to say that $J + a$ is not zero is to say that $a \notin J$. Now, let $K$ be the set of all the sums

\[ xa + j \]

as $x$ ranges over $A$ and $j$ ranges over $J$. It is easy to check that $K$ is an ideal. Furthermore, $K$ contains $a$ because $a = 1a + 0$, and $K$ contains every element $j \in J$ because $j$ can be written as $0a + j$. Thus, $K$ is an ideal which contains $J$ and is strictly larger than $J$ (for remember that $a \in K$ but $a \notin J$). But $J$ is a maximal ideal! Thus, $K$ must be the whole ring $A$.
It follows that $1 \in K$, so $1 = xa + j$ for some $x \in A$ and $j \in J$. Thus, $1 - xa = j \in J$, so by Condition (2), $J + 1 = J + xa = (J + x)(J + a)$. In the quotient ring $A/J$, $J + x$ is therefore the multiplicative inverse of $J + a$.

The converse proof consists, essentially, of "unraveling" the preceding argument; it is left as an entertaining exercise.

EXERCISES

A. Examples of Quotient Rings

In each of the following, $A$ is a ring and $J$ is an ideal of $A$. List the elements of $A/J$, and then write the addition and multiplication tables of $A/J$.

Example $A = \mathbb{Z}_6$, $J = \{0, 3\}$.

The elements of $A/J$ are the three cosets $J = J + 0 = \{0, 3\}$, $J + 1 = \{1, 4\}$, and $J + 2 = \{2, 5\}$. The tables for $A/J$ are as follows:

\[
\begin{array}{c|ccc}
+ & J & J+1 & J+2 \\ \hline
J & J & J+1 & J+2 \\
J+1 & J+1 & J+2 & J \\
J+2 & J+2 & J & J+1
\end{array}
\qquad
\begin{array}{c|ccc}
\cdot & J & J+1 & J+2 \\ \hline
J & J & J & J \\
J+1 & J & J+1 & J+2 \\
J+2 & J & J+2 & J+1
\end{array}
\]

1 $A = \mathbb{Z}_{10}$, $J = \{0, 5\}$.
2 $A = P_3$, $J = \{\emptyset, \{a\}\}$. ($P_3$ is defined in Chapter 17, Exercise D.)
3 $A = \mathbb{Z}_2 \times \mathbb{Z}_6$; $J = \{(0, 0), (0, 2), (0, 4)\}$.

B. Examples of the Use of the FHT

In each of the following, use the FHT (fundamental homomorphism theorem) to prove that the two given groups are isomorphic. Then display their tables.

Example $\mathbb{Z}_2$ and $\mathbb{Z}_6/\langle 2 \rangle$.

The following function is a homomorphism from $\mathbb{Z}_6$ onto $\mathbb{Z}_2$:

\[ f = \begin{pmatrix} 0 & 1 & 2 & 3 & 4 & 5 \\ 0 & 1 & 0 & 1 & 0 & 1 \end{pmatrix} \]

(Do not prove that $f$ is a homomorphism.)

The kernel of $f$ is $\{0, 2, 4\} = \langle 2 \rangle$. It follows by the FHT that $\mathbb{Z}_2 \cong \mathbb{Z}_6/\langle 2 \rangle$.

1 $\mathbb{Z}_5$ and $\mathbb{Z}_{20}/\langle 5 \rangle$.
2 $\mathbb{Z}_3$ and $\mathbb{Z}_6/\langle 3 \rangle$.
3 $P_2$ and $P_3/K$, where $K = \{\emptyset, \{c\}\}$. [Hint: See Chapter 18, Exercise E6. Consider the function $f(X) = X \cap \{a, b\}$.]
4 $\mathbb{Z}_2$ and $\mathbb{Z}_2 \times \mathbb{Z}_2/K$, where $K = \{(0, 0), (0, 1)\}$.

C. Quotient Rings and Homomorphic Images in $\mathcal{F}(\mathbb{R})$

1 Let $\phi$ be the function from $\mathcal{F}(\mathbb{R})$ to $\mathbb{R} \times \mathbb{R}$ defined by $\phi(f) = (f(0), f(1))$. Prove that $\phi$ is a homomorphism from $\mathcal{F}(\mathbb{R})$ onto $\mathbb{R} \times \mathbb{R}$, and describe its kernel.
2 Let $J$ be the subset of $\mathcal{F}(\mathbb{R})$ consisting of all $f$ whose graph passes through the points $(0, 0)$ and $(1, 0)$. Referring to part 1, explain why $J$ is an ideal of $\mathcal{F}(\mathbb{R})$, and $\mathcal{F}(\mathbb{R})/J \cong \mathbb{R} \times \mathbb{R}$.
3 Let {f) =/0 = the restriction of / to Q (Note: The domain of fQ is Q and on this domain fQ is the same function as /.) Prove that is a homomorphism from ^(R) onto !f(Q, R), and describe the kernel of (a) = t, is a homomorphism from A onto A. Let / designate the annihilating ideal of A (defined in Exercise H4 of Chapter 18). Use the FHT to show that All = A. E. Properties of Quotient Rings AIJ in Relation to Properties of J Let A be a ring and J an ideal of A. Use Conditions (1), (2), and (3) of this chapter. Prove each of the following: # 1 Every element of AIJ has a square root iff for every x E A, there is some y £ A such that x - y2 £ /. 2 Every element of AIJ is its own negative iff x + x £ J for every x€. A. 3 /4/7 is a boolean ring iff x2 -xEJ for every xEA. (A ring 5 is called a boolean ring iff s2 = s for every 5 £ 5.) 4 If J is the ideal of all the nilpotent elements of a commutative ring A, then AIJ has no nilpotent elements (except zero). (Nilpotent elements are defined in Chapter 17, Exercise M; by M2 and M3 they form an ideal.) 5 Every element of AIJ is nilpotent iff J has the following property: for every x £ A, there is a positive integer n such that x" £ /. # 6 AIJ has a unity element iff there exists an element a £ A such that ax - x £ 7 and - a: £ 7 for every * E /I. 198 chapter nineteen quotient rings 199 F. Prime and Maximal Ideals Let A be a commutative ring with unity, and J an ideal of A. Prove each of the following: 1 AIJ is a commutative ring with unity. 2 / is a prime ideal iff AIJ is an integral domain. 3 Every maximal ideal of A is a prime ideal. (Hint: Use the fact, proved in this chapter, that if / is a maximal ideal then AIJ is a field.) 4 If AIJ is a field, then / is a maximal ideal. (Hint: See Exercise 12 of Chapter 18.) G. Further Properties of Quotient Rings in Relation to Their Ideals Let A be a ring and J an ideal of A. (In parts 1-3 and 5 assume that A is a commutative ring with unity.) 
1 Prove that $A/J$ is a field iff for every element $a \in A$, where $a \notin J$, there is some $b \in A$ such that $ab - 1 \in J$.
2 Prove that every nonzero element of $A/J$ is either invertible or a divisor of zero iff the following property holds, where $a, x \in A$: For every $a \notin J$, there is some $x \notin J$ such that either $ax \in J$ or $ax - 1 \in J$.
3 An ideal $J$ of a ring $A$ is called primary iff for all $a, b \in A$, if $ab \in J$, then either $a \in J$ or $b^n \in J$ for some positive integer $n$. Prove that every zero divisor in $A/J$ is nilpotent iff $J$ is primary.
4 An ideal $J$ of a ring $A$ is called semiprime iff it has the following property: For every $a \in A$, if $a^n \in J$ for some positive integer $n$, then necessarily $a \in J$. Prove that $J$ is semiprime iff $A/J$ has no nilpotent elements (except zero).
5 Prove that an integral domain can have no nonzero nilpotent elements. Then use part 4, together with Exercise F2, to prove that every prime ideal in a commutative ring is semiprime.

H. $\mathbb{Z}_n$ as a Homomorphic Image of $\mathbb{Z}$

Recall that the function

\[ f(a) = \bar{a} \]

is the natural homomorphism from $\mathbb{Z}$ onto $\mathbb{Z}_n$. If a polynomial equation $p = 0$ is satisfied in $\mathbb{Z}$, necessarily $f(p) = f(0)$ is true in $\mathbb{Z}_n$. Let us take a specific example; there are integers $x$ and $y$ satisfying $11x^2 - 8y^2 + 29 = 0$ (we may take $x = 3$ and $y = 4$). It follows that there must be elements $\bar{x}$ and $\bar{y}$ in $\mathbb{Z}_6$ which satisfy $\overline{11}\,\bar{x}^2 - \bar{8}\,\bar{y}^2 + \overline{29} = \bar{0}$ in $\mathbb{Z}_6$, that is, $\bar{5}\,\bar{x}^2 - \bar{2}\,\bar{y}^2 + \bar{5} = \bar{0}$. (We take $\bar{x} = \bar{3}$ and $\bar{y} = \bar{4}$.) The problems which follow are based on this observation.

1 Prove that the equation $x^2 - 7y^2 - 24 = 0$ has no integer solutions. (Hint: If there are integers $x$ and $y$ satisfying this equation, what equation will $\bar{x}$ and $\bar{y}$ satisfy in $\mathbb{Z}_7$?)
2 Prove that $x^2 + (x + 1)^2 + (x + 2)^2 = y^2$ has no integer solutions.
3 Prove that $x^2 + 10y^2 = n$ (where $n$ is an integer) has no integer solutions if the last digit of $n$ is 2, 3, 7, or 8.
4 Prove that the sequence 3, 8, 13, 18, 23, ... does not include the square of any integer.
(Hint: The image of each number on this list, under the natural homomorphism from $\mathbb{Z}$ to $\mathbb{Z}_5$, is $\bar{3}$.)
5 Prove that the sequence 2, 10, 18, 26, ... does not include the cube of any integer.
6 Prove that the sequence 3, 11, 19, 27, ... does not include the sum of two squares of integers.
7 Prove that if $n$ is a product of two consecutive integers, its units digit must be 0, 2, or 6.
8 Prove that if $n$ is the product of three consecutive integers, its units digit must be 0, 4, or 6.

CHAPTER TWENTY
INTEGRAL DOMAINS

Let us recall that an integral domain is a commutative ring with unity having the cancellation property, that is,

\[ \text{if } a \ne 0 \text{ and } ab = ac \text{ then } b = c \qquad (1) \]

At the end of Chapter 17 we saw that an integral domain may also be defined as a commutative ring with unity having no divisors of zero, which is to say that

\[ \text{if } ab = 0 \text{ then } a = 0 \text{ or } b = 0 \qquad (2) \]

for as we saw, (1) and (2) are equivalent properties in any commutative ring.

The system $\mathbb{Z}$ of the integers is the exemplar and prototype of integral domains. In fact, the term "integral domain" means a system of algebra ("domain") having integerlike properties. However, $\mathbb{Z}$ is not the only integral domain: there are a great many integral domains different from $\mathbb{Z}$.

Our first few comments will apply to rings generally. To begin with, we introduce a convenient notation for multiples, which parallels the exponent notation for powers. Additively, the sum

\[ a + a + \cdots + a \]

of $n$ equal terms is written as $n \cdot a$. We also define $0 \cdot a$ to be 0, and let $(-n) \cdot a = -(n \cdot a)$ for all positive integers $n$. Then

\[ m \cdot a + n \cdot a = (m + n) \cdot a \qquad \text{and} \qquad m \cdot (n \cdot a) = (mn) \cdot a \]

for every element $a$ of a ring and all integers $m$ and $n$. These formulas are the translations into additive notation of the laws of exponents given in Chapter 10.

If $A$ is a ring, $A$ with addition alone is a group. Remember that in additive notation the order of an element $a$ in $A$ is the least positive integer $n$ such that $n \cdot a = 0$.
If there is no such positive integer $n$, then $a$ is said to have order infinity. To emphasize the fact that we are referring to the order of $a$ in terms of addition, we will call it the additive order of $a$.

In a ring with unity, if 1 has additive order $n$, we say the ring has "characteristic $n$." In other words, if $A$ is a ring with unity, the characteristic of $A$ is the least positive integer $n$ such that

\[ \underbrace{1 + 1 + \cdots + 1}_{n \text{ times}} = 0 \]

If there is no such positive integer $n$, $A$ has characteristic 0.

These concepts are especially simple in an integral domain. Indeed,

Theorem 1 All the nonzero elements in an integral domain have the same additive order.

Proof: That is, every $a \ne 0$ has the same additive order as the additive order of 1. The truth of this statement becomes transparently clear as soon as we observe that

\[ n \cdot a = a + a + \cdots + a = 1a + \cdots + 1a = (1 + \cdots + 1)a = (n \cdot 1)a \]

hence $n \cdot a = 0$ iff $n \cdot 1 = 0$. (Remember that in an integral domain, if the product of two factors is equal to 0, at least one factor must be 0.) ■

It follows, in particular, that if the characteristic of an integral domain is a positive integer $n$, then

\[ n \cdot x = 0 \]

for every element $x$ in the domain.

Furthermore,

Theorem 2 In an integral domain with nonzero characteristic, the characteristic is a prime number.

Proof: If the characteristic were a composite number $mn$, then by the distributive law,

\[ (m \cdot 1)(n \cdot 1) = \underbrace{(1 + \cdots + 1)}_{m \text{ terms}}\,\underbrace{(1 + \cdots + 1)}_{n \text{ terms}} = \underbrace{1 + 1 + \cdots + 1}_{mn \text{ terms}} = (mn) \cdot 1 = 0 \]

Thus, either $m \cdot 1 = 0$ or $n \cdot 1 = 0$, which is impossible because $mn$ was chosen to be the least positive integer such that $(mn) \cdot 1 = 0$. ■

A very interesting rule of arithmetic is valid in integral domains whose characteristic is not zero.

Theorem 3 In any integral domain of characteristic $p$,

\[ (a + b)^p = a^p + b^p \qquad \text{for all elements } a \text{ and } b \]

Proof: This formula becomes clear when we look at the binomial expansion of $(a + b)^p$.
Remember that by the binomial formula, (a + bf = ap + ( J) • ap~lb + ■ ■ ■ + ( P_ J ) ■ abp 1 + bp where the binomial coefficient pV, p(p-l)(p-2)--(p-k+l) (I) k/ k! It is demonstrated in Exercise L of Chapter 17 that the binomial formula is correct in every commutative ring. Note that if p is a prime number and 0 be the function from A to A' defined by (a) = [a, 1]. This function is injective because, by Equation (3), if [a, l] = [b, 1] then a = b. It is obviously surjective and is easily shown to be a homomorphism. Thus, 4> is an isomorphism from A to A', so A* contains an isomorphic copy A' of A. EXERCISES A. Characteristic of an Integral Domain Let A be a finite integral domain. Prove each of the following: 1 Let a be any nonzero element of A. If n-a = 0, where n ^ 0, then n is a multiple of the characteristic of A. 2 It A has characteristic zero, n ^ 0, and n • a = 0, then a = 0. 3 If A has characteristic 3, and 5 • a = 0, then a = 0. 4 If there is a nonzero element a in A such that 256 ■ a = 0, then A has characteristic 2. 5 If there are distinct nonzero elements a and b in A such that 125 • a = 125 • b, then A has characteristic 5. 6 If there are nonzero elements a and b in A such that (a + b)2 = a2 + b2, then A has characteristic 2. 7 If there are nonzero elements a and b in A such that 10a = 0 and 14b = 0, then A has characteristic 2. B. Characteristic of a Finite Integral Domain Let A be an integral domain. Prove each of the following: 1 If A has characteristic q, then q is a divisor of the order of A. 2 If the order of A is a prime number p, then the characteristic of A must be equal to p. 3 If the order of A is pm, where p is a prime, the characteristic of A must be equal to p. 4 If A has 81 elements, its characteristic is 3. 5 If A, with addition alone, is a cyclic group, the order of A is a prime number. C. Finite Rings Let A be a finite commutative ring with unity. 1 Prove: Every nonzero element of A is either a divisor of zero or invertible. 
(Hint: Use an argument analogous to the proof of Theorem 4) 206 chapter twenty integral domains 207 2 Prove: If a 0 is not a divisor of zero, then some positive power of a is equal to 1. (Hint: Consider a, a2, a3,.... Since A is finite, there must be positive integers n < m such that a" = am.) 3 Use part 2 to prove: If a is invertible, then a1 is equal to a positive power of a. I). Field of Quotients of an Integral Domain The following questions refer to the construction of a field of quotients of A, as outlined on pages 203 to 205. 1 If [a, b] = [r, s] and [c, d] = [t, u], prove that |a, b\ + [c, d] = \r, s] + [t, u]. 2 If [a, b] = [r, s] and [c, d] = [t, u], prove that [a, b][c, d] = [r, s][t, u). 3 If (a, b)~ (c, d) means ad = be, prove that ~ is an equivalence relation on S. 4 Prove that addition in A* is associative and commutative. 5 Prove that multiplication in A* is associative and commutative. 6 Prove the distributive law in A*. 7 Verify that b has the same meaning as b < a. Furthermore, a =£ b means "a < b or a = b ," and b 3= a means the same as a =s b. In an ordered integral domain A, an element a is called positive if «>0. If «<0 we call a negative. Note that if a is positive then —a is negative. (Proof: Add -a to both sides of the inequality a >0.) Similarly, if a is negative, then —a is positive. In any ordered integral domain, the square of every nonzero element is positive. Indeed, if c is nonzero, then either c>0 of c<0. If c>0, then, multiplying both sides of the inequality c > 0 by c, so c >0. On the other hand cc > cO if c<0, = 0 then (-c)>0 hence (- c)(- c) > 0(- c) = 0 But (—c)(—c) = c2, so once again, c2 >0. In particular, since 1 = I1, 1 is always positive. From the fact that 1 >0, we immediately deduce that 1 + 1 > 1, I + 1 + 1 > 1 + 1, and so on. In general, for any positive integer n, (n+1)-1 >nl where n • 1 designates the unity element of the ring A added to itself n 208 210 CHAPTER TWENTY-ONE THE INTEGERS 211 times. 
Thus, in any ordered integral domain A, the set of all the multiples of 1 is ordered as in Z: namely • • • < (-2) • 1 <(-l)' 7<00 and there are no elements of A between 0 and 1, so b > 1. (Remember that b cannot be equal to 1 because b is not a multiple of 1.) Since b> 1, it follows that b — 1 >0. But b - 1 \, hence b > 1. Thus, b - 1 > 0, and b - 1 E K. But then, by Condition (ii), b E K, which is impossible. ■ Let the symbol Sn represent any statement about the positive integer n. For example, S„ might stand for "n is odd," or "n is a prime," or it might represent an equation such as (n — l)(n + 1) = n2 - 1 or an inequality such as n € n2. If, let us say, 5n stands for n =s n2, then S{ asserts that 1 =s l2, S2 asserts that 2 =£ 22, S3 asserts that 3 =s 32, and so on. Theorem 2: Principle of mathematical induction Consider the following conditions: (i) 5, is true. (ii) For any positive integer k, if Sk is true, then also 5k+1 is true. If Conditions (i) and (ii) are satisfied, then Sa is true for every positive integer n. Proof: Indeed, if K is the set of all the positive integers k such that 5k is true, then K complies with the conditions of Theorem 1. Thus, K contains all the positive integers. This means that Sn is true for every n. ■ As a simple illustration of how the principle of mathematical indue- 212 CHAPTER TWENTY-ONE THE INTEGERS 213 tion is applied, let Sa be the statement that , „ n(n + l) 1 +2+ • ■ • +n= —1 that is, the sum of the first n positive integers is equal to n(n + l)/2. Then S, is simply -¥ which is clearly true. Suppose, next, that k is any positive integer and that Sk is true. In other words, 1 + 2 + ... + k=!E0E±l) Then, by adding k + 1 to both sides of this equation, we obtain k(k+ 1) 1 + 2 + • ■• + k + (k + l) + (k + l) that is, 1 + 2--- + (k + 1) (k+l)(k + 2) However, this last equation is exactly Sk+l. We have therefore verified that whenever Sk is true, Sk+J also is true. 
Now, the principle of mathematical induction allows us to conclude that

\[ 1 + 2 + \cdots + n = \frac{n(n + 1)}{2} \]

for every positive integer $n$.

A variant of the principle of mathematical induction, called the principle of strong induction, asserts that $S_n$ is true for every positive integer $n$ on the conditions that

(i) $S_1$ is true, and
(ii) For any positive integer $k$, if $S_i$ is true for every $i < k$, then $S_k$ is true.

The details are outlined in Exercise H at the end of this chapter.

One of the most important facts about the integers is that any integer $m$ may be divided by any positive integer $n$ to yield a quotient $q$ and a nonnegative remainder $r$. (The remainder is less than the divisor $n$.) For example, 25 may be divided by 8 to give a quotient of 3 and a remainder of 1:

\[ \underbrace{25}_{m} = \underbrace{8}_{n} \times \underbrace{3}_{q} + \underbrace{1}_{r} \]

This process is known as the division algorithm. It is stated in a precise manner as follows:

Theorem 3: Division algorithm If $m$ and $n$ are integers and $n$ is positive, there exist unique integers $q$ and $r$ such that

\[ m = nq + r \qquad \text{and} \qquad 0 \le r < n \]

We call $q$ the quotient, and $r$ the remainder, in the division of $m$ by $n$.

Proof: We begin by showing a simple fact:

\[ \text{There exists an integer } x \text{ such that } xn \le m. \qquad (*) \]

Remember that $n$ is positive; hence $n \ge 1$. As for $m$, either $m \ge 0$ or $m < 0$. We consider these two cases separately:

Suppose $m \ge 0$. Then $0 \le m$, hence $(0)n \le m$; here $x = 0$.

Suppose $m < 0$. We may multiply both sides of $n \ge 1$ by the positive integer $-m$ to get $(-m)n \ge -m$. Adding $mn + m$ to both sides yields $mn \le m$; here $x = m$.

Thus, regardless of whether $m$ is positive or negative, there is some integer $x$ such that $xn \le m$.

Let $W$ be the subset of $\mathbb{Z}$ consisting of all the nonnegative integers which are expressible in the form $m - xn$, where $x$ is any integer. By $(*)$ $W$ is not empty; hence by the well-ordering property, $W$ contains a least integer $r$. Because $r \in W$, $r$ is nonnegative and is expressible in the form $m - nq$ for some integer $q$. That is,

\[ r = m - nq \qquad \text{and} \qquad r \ge 0 \]

Thus, we already have $m = nq + r$ and $0 \le r$.
It remains only to verify that r 0, then a < ft. 8 If a < ft and c aft, if a2 + ft2^0 5 a + ft 1 6 aft + ac + ftc + 1< a + ft + c + aftc, if a, ft, c > 1 C. Uses of Induction Prove parts 1-7, using the principle of mathematical induction. (Assume n is a positive integer.) 1 1 + 3 + 5 + ■ • • + (2n - 1) = n2 (The sum of the first n odd integers is n2) 2 l3 + 23 + --- + n3 = (l + 2 + --- + n)2 3 1- 22 + + (n - 1)2< y 0, F„^F„ + 2-FnF„ + 3 = (-1)" D. Every Integral System Is Isomorphic to Z Let A be an integral system. Let ft: Z—> A be defined by: h(n) = n• 1. The purpose of this exercise is to prove that ft is an isomorphism, from which it follows that A=T. 1 Prove: For every positive integer n, n ■ 1 >0. What is the characteristic of A"? 2 Prove that ft is injective and surjective. 3 Prove that ft is an isomorphism. E. Absolute Values In any ordered integral domain, define |a| by I if a if a&O W I-a if a<0 Using this definition, prove the following: 1 I--I-H 2 a«|a| 3 a 3= -|a| 4 If ft>0, |a|«ft iff - ft«a«ft # 5 |a + ft| =£ |a| + |ft| 6 ja-ft|«|a| + |ft| 7 |aft| = |a|-|ft| # 8 |a|-|ft|*s|a-ft| 9 \\a\-\b\*\a-b\ F. Problems on the Division Algorithm Prove parts 1-3, where k, m, n, q, and r designate integers. 1 Let n > 0 and k > 0. If q is the quotient and r is the remainder when m is divided by n, then q is the quotient and kr is the remainder when km is divided by kn. 2 Let n > 0 and k > 0. If q is the quotient when m is divided by n, and q, is the quotient when q is divided by k, then q, is the quotient when m is divided by nk. 3 If n 5*0, there exist q and r such that m = nq + r and 0*Sr<|n|. (Use Theorem 3, and consider the case when n < 0.) 4 In Theorem 3, suppose m = nq, + r, = nq2 + r2 where 0 =sr,, r2 < n. Prove that r, - r2 = 0. [Hint: Consider the difference (nq, + r,) - (nq2 + r2).] 5 Use part 4 to prove that ql - q2 = 0. Conclude that the quotient and remainder, in the division algorithm, are unique. 
6 If r is the remainder when m is divided by n, prove that m = r in Zn. G. Laws of Multiples The purpose of this exercise is to give rigorous proofs (using induction) of the basic identities involved in the use of exponents or multiples. If A is a ring and a€E A, we define n • a (where n is any positive integer) by the pair of conditions: (i) la = a, and (ii) (n + l)-a = n-a + a Use mathematical induction (with the above definition) to prove that the following are true for all positive integers n and all elements a, b& A: 1 n(a + b) = n-a + nb 2 (n + m)-a = n- a + m'a 3 (n ■ a)b = a(n • b) = n • (ab) 4 m • (n • a) = (mn) • a # 5 n • a = (n • 1 )a where 1 is the unity element of A 6 (n ■ a)(m • b) - (am) ■ ab (Use parts 3 and 4.) H. Principle of Strong Induction Prove the following in Z: 1 Let K denote a set of positive integers. Consider the following conditions: (i) ie*. (ii) For any positive integer k, if every positive integer less than k is in K, then If K satisfies these two conditions, prove that K contains all the positive integers. 2 Let 5„ represent any statement about the positive integer n. Consider the following conditions: (i) 5, is true. (ii) For any positive integer k, if S{ is true for every i 1. In case r > 1 we may multiply both sides of 1< r by s to get 5 < rs = 1; this is impossible because s cannot be positive and <1. Thus, it must be that r = 1; hence 1 = rs = Is = s, so also s = 1. If r and s are both negative, then —r and —s are positive. Thus, 1 = rs = (-r)(-s) and by the preceding case, —r = —s = l. Thus, r = s = — 1. ■ A pair of integers r and s are called associates if they divide each other, that is, if r\ s and s\r.\ir and 5 are associates, this means there are integers k and / such that r= ks and s = Ir. Thus, r = ks = klr, hence kl = 1. By Theorem 2, k and / are ±1, and therefore r = ±s. Thus, we have shown that // r and s are associates in Z, then r = ±s . 
(1)

An integer $t$ is called a common divisor of integers $r$ and $s$ if $t \mid r$ and $t \mid s$. A greatest common divisor of $r$ and $s$ is an integer $t$ such that

(i) $t \mid r$ and $t \mid s$, and
(ii) For any integer $u$, if $u \mid r$ and $u \mid s$, then $u \mid t$.

In other words, $t$ is a greatest common divisor of $r$ and $s$ if $t$ is a common divisor of $r$ and $s$, and every other common divisor of $r$ and $s$ divides $t$. Note that the adjective "greatest" in this definition does not mean primarily that $t$ is greater in magnitude than any other common divisor, but, rather, that it is a multiple of any other common divisor.

The words "greatest common divisor" are familiarly abbreviated by gcd. As an example, 2 is a gcd of 8 and 10; but $-2$ also is a gcd of 8 and 10. According to the definition, two different gcd's must divide each other; hence by Property (1) above, they differ only in sign. Of the two possible gcd's $\pm t$ for $r$ and $s$, we select the positive one, call it the gcd of $r$ and $s$, and denote it by

\[ \gcd(r, s) \]

Does every pair $r$, $s$ of integers have a gcd? Our experience with the integers tells us that the answer is "yes." We can easily prove this, and more:

Theorem 3 Any two nonzero integers $r$ and $s$ have a greatest common divisor $t$. Furthermore, $t$ is equal to a "linear combination" of $r$ and $s$. That is,

\[ t = kr + ls \]

for some integers $k$ and $l$.

Proof: Let $J$ be the set of all the linear combinations of $r$ and $s$, that is, the set of all $ur + vs$ as $u$ and $v$ range over $\mathbb{Z}$. $J$ is closed with respect to addition and negatives and absorbs products because

\[ (u_1 r + v_1 s) + (u_2 r + v_2 s) = (u_1 + u_2)r + (v_1 + v_2)s \]

\[ -(ur + vs) = (-u)r + (-v)s \]

and

\[ w(ur + vs) = (wu)r + (wv)s \]

Thus, $J$ is an ideal of $\mathbb{Z}$. By Theorem 1, $J$ is a principal ideal of $\mathbb{Z}$, say $J = \langle t \rangle$. ($J$ consists of all the multiples of $t$.)

Now $t$ is in $J$, which means that $t$ is a linear combination of $r$ and $s$:

\[ t = kr + ls \]

Furthermore, $r = 1r + 0s$ and $s = 0r + 1s$, so $r$ and $s$ are linear combinations of $r$ and $s$; thus $r$ and $s$ are in $J$.
But all the elements of $J$ are multiples of $t$, so $r$ and $s$ are multiples of $t$. That is,

\[ t \mid r \qquad \text{and} \qquad t \mid s \]

Now, if $u$ is any common divisor of $r$ and $s$, this means that $r = xu$ and $s = yu$ for some integers $x$ and $y$. Thus,

\[ t = kr + ls = kxu + lyu = u(kx + ly) \]

It follows that $u \mid t$. This confirms that $t$ is the gcd of $r$ and $s$. ■

A word of warning: the fact that an integer $m$ is a linear combination of $r$ and $s$ does not necessarily imply that $m$ is the gcd of $r$ and $s$. For example, $3 = (1)15 + (-2)6$, and 3 is the gcd of 15 and 6. On the other hand, $27 = (1)15 + (2)6$, yet 27 is not a gcd of 15 and 6.

A pair of integers $r$ and $s$ are said to be relatively prime if they have no common divisors except $\pm 1$. For example, 4 and 15 are relatively prime. If $r$ and $s$ are relatively prime, their gcd is equal to 1; so by Theorem 3, there are integers $k$ and $l$ such that $kr + ls = 1$. Actually, the converse of this statement is true too: if some linear combination of $r$ and $s$ is equal to 1 (that is, if there are integers $k$ and $l$ such that $kr + ls = 1$), then $r$ and $s$ are relatively prime. The simple proof of this fact is left as an exercise.

If $m$ is any integer, it is obvious that $\pm 1$ and $\pm m$ are factors of $m$. We call these the trivial factors of $m$. If $m$ has any other factors, we call them proper factors of $m$. For example, $\pm 1$ and $\pm 6$ are the trivial factors of 6, whereas $\pm 2$ and $\pm 3$ are proper factors of 6.

If an integer $m$ has proper factors, $m$ is called composite. If an integer $p \ne 0, 1$ has no proper factors (that is, if all its factors are trivial), then we call $p$ a prime. For example, 6 is composite, whereas 7 is a prime.

Composite number lemma If a positive integer $m$ is composite, then $m = rs$ where

\[ 1 < r < m \qquad \text{and} \qquad 1 < s < m \]

Theorem 4 Every integer $n > 1$ can be expressed as a product of positive primes. That is, there are one or more primes $p_1, \ldots, p_r$ such that

\[ n = p_1 p_2 \cdots p_r \]

Proof: Let $K$ represent the set of all the positive integers greater than 1 which cannot be written as a product of one or more primes.
We will assume there are such integers, and derive a contradiction. By the well-ordering principle, $K$ contains a least integer $m$; $m$ cannot be a prime, because if it were a prime it would not be in $K$. Thus, $m$ is composite; so by the composite number lemma,

\[ m = rs \]

for positive integers $r$ and $s$ less than $m$ and greater than 1; $r$ and $s$ are not in $K$ because $m$ is the least integer in $K$. This means that $r$ and $s$ can be expressed as products of primes, hence so can $m = rs$. This contradiction proves that $K$ is empty; hence every $n > 1$ can be expressed as a product of primes. ■

Theorem 5: Unique factorization Suppose $n$ can be factored into positive primes in two ways, namely,

\[ n = p_1 \cdots p_r = q_1 \cdots q_t \]

Then $r = t$, and the $p_i$ are the same numbers as the $q_j$ except, possibly, for the order in which they appear.

Proof: In the equation $p_1 \cdots p_r = q_1 \cdots q_t$, let us cancel common factors from each side, one by one, until we can do no more canceling. If all the factors are canceled on both sides, this proves the theorem. Otherwise, we are left with some factors on each side, say

\[ p_i \cdots p_k = q_j \cdots q_m \]

Now, $p_i$ is a factor of $p_i \cdots p_k$, so $p_i \mid q_j \cdots q_m$. Thus, by Corollary 2 to Euclid's lemma, $p_i$ is equal to one of the factors $q_j, \ldots, q_m$, which is impossible because we assumed we can do no more canceling. ■

It follows from Theorems 4 and 5 that every integer $m$ can be factored into primes, and that the prime factors of $m$ are unique (except for the order in which we happen to list them).

EXERCISES

A. Properties of the Relation "$a$ Divides $b$"

Prove the following, for any integers $a$, $b$, and $c$:

1 If $a \mid b$ and $b \mid c$, then $a \mid c$.
2 $a \mid b$ iff $a \mid (-b)$ iff $(-a) \mid b$.
3 $1 \mid a$ and $(-1) \mid a$.
4 $a \mid 0$.
5 If $c \mid a$ and $c \mid b$, then $c \mid (ax + by)$ for all $x, y \in \mathbb{Z}$.
6 If $a > 0$ and $b > 0$ and $a \mid b$, then $a \le b$.
7 $a \mid b$ iff $ac \mid bc$, when $c \ne 0$.
8 If $a \mid b$ and $c \mid d$, then $ac \mid bd$.
9 Let $p$ be a prime. If $p \mid a^n$ for some $n > 0$, then $p \mid a$.

B. Properties of the gcd

Prove the following, for any integers $a$, $b$, and $c$.
For each of these problems, you will need only the definition of the gcd.

# 1 If $a > 0$ and $a \mid b$, then $\gcd(a, b) = a$.
2 $\gcd(a, 0) = a$, if $a > 0$.
3 $\gcd(a, b) = \gcd(a, b + xa)$ for any $x \in \mathbb{Z}$.
4 Let $p$ be a prime. Then $\gcd(a, p) = 1$ or $p$. (Explain.)
5 Suppose every common divisor of $a$ and $b$ is a common divisor of $c$ and $d$, and vice versa. Then $\gcd(a, b) = \gcd(c, d)$.
6 If $\gcd(ab, c) = 1$, then $\gcd(a, c) = 1$ and $\gcd(b, c) = 1$.
7 Let $\gcd(a, b) = c$. Write $a = ca'$ and $b = cb'$. Then $\gcd(a', b') = 1$.

C. Properties of Relatively Prime Integers

Prove the following, for all integers $a$, $b$, $c$, $d$, $r$, and $s$. (Theorem 3 will be helpful.)

1 If there are integers $r$ and $s$ such that $ra + sb = 1$, then $a$ and $b$ are relatively prime.
2 If $\gcd(a, c) = 1$ and $c \mid ab$, then $c \mid b$. (Reason as in the proof of Euclid's lemma.)
3 If $a \mid d$ and $c \mid d$ and $\gcd(a, c) = 1$, then $ac \mid d$.
4 If $d \mid ab$ and $d \mid cb$, where $\gcd(a, c) = 1$, then $d \mid b$.
5 If $d = \gcd(a, b)$ where $a = dr$ and $b = ds$, then $\gcd(r, s) = 1$.
6 If $\gcd(a, c) = 1$ and $\gcd(b, c) = 1$, then $\gcd(ab, c) = 1$.

D. Further Properties of gcd's and Relatively Prime Integers

Prove the following, for all integers $a$, $b$, $c$, $d$, $r$, and $s$:

1 Suppose $a \mid b$ and $c \mid b$ and $\gcd(a, c) = d$. Then $ac \mid bd$.
2 If $ac \mid b$ and $ad \mid b$ and $\gcd(c, d) = 1$, then $acd \mid b$.
3 Let $d = \gcd(a, b)$. For any integer $x$, $d \mid x$ iff $x$ is a linear combination of $a$ and $b$.
4 Suppose that for all integers $x$, $x \mid a$ and $x \mid b$ iff $x \mid c$. Then $c = \gcd(a, b)$.
5 For all $n > 0$, if $\gcd(a, b) = 1$, then $\gcd(a, b^n) = 1$. (Prove by induction.)
6 Suppose $\gcd(a, b) = 1$ and $c \mid ab$. Then there exist integers $r$ and $s$ such that $c = rs$, $r \mid a$, $s \mid b$, and $\gcd(r, s) = 1$.

E. A Property of the gcd

Let $a$ and $b$ be integers. Prove parts 1 and 2:

# 1 Suppose $a$ is odd and $b$ is even, or vice versa. Then $\gcd(a, b) = \gcd(a + b, a - b)$.
2 Suppose $a$ and $b$ are both odd. Then $2\gcd(a, b) = \gcd(a + b, a - b)$.
3 If $a$ and $b$ are both even, explain why either of the two previous conclusions is possible.

F. Least Common Multiples

A least common multiple of two integers $a$ and $b$ is a positive integer $c$ such that (i) $a \mid c$ and $b \mid c$; (ii) if $a \mid x$ and $b \mid x$, then $c \mid x$.

1 Prove: The set of all the common multiples of $a$ and $b$ is an ideal of $\mathbb{Z}$.
2 Prove: Every pair of integers $a$ and $b$ has a least common multiple. (Hint: Use part 1.)

The positive least common multiple of $a$ and $b$ is denoted by $\operatorname{lcm}(a, b)$. Prove the following for all positive integers $a$, $b$, and $c$:

# 3 $a \cdot \operatorname{lcm}(b, c) = \operatorname{lcm}(ab, ac)$.
4 If $a = a_1 c$ and $b = b_1 c$ where $c = \gcd(a, b)$, then $\operatorname{lcm}(a, b) = a_1 b_1 c$.
5 $\operatorname{lcm}(a, ab) = ab$.
6 If $\gcd(a, b) = 1$, then $\operatorname{lcm}(a, b) = ab$.
7 If $\operatorname{lcm}(a, b) = ab$, then $\gcd(a, b) = 1$.
8 Let $\gcd(a, b) = c$. Then $\operatorname{lcm}(a, b) = ab/c$.
9 Let $\gcd(a, b) = c$ and $\operatorname{lcm}(a, b) = d$. Then $cd = ab$.

G. Ideals in $\mathbb{Z}$

Prove the following:

1 $\langle n \rangle$ is a prime ideal iff $n$ is a prime number.
2 Every prime ideal of $\mathbb{Z}$ is a maximal ideal. [Hint: If $\langle p \rangle \subseteq \langle a \rangle$, but $\langle p \rangle \ne \langle a \rangle$, explain why $\gcd(p, a) = 1$ and conclude that $1 \in \langle a \rangle$.] Assume the ideal is not $\langle 0 \rangle$.
3 For every prime number $p$, $\mathbb{Z}_p$ is a field. (Hint: Remember $\mathbb{Z}_p = \mathbb{Z}/\langle p \rangle$. Use Exercise 4, Chapter 19.)
4 If $c = \operatorname{lcm}(a, b)$, then $\langle a \rangle \cap \langle b \rangle = \langle c \rangle$.
5 Every homomorphic image of $\mathbb{Z}$ is isomorphic to $\mathbb{Z}_n$ for some $n$.
6 Let $G$ be a group and let $a, b \in G$. Then $S = \{n \in \mathbb{Z} : ab^n = b^n a\}$ is an ideal of $\mathbb{Z}$.
7 Let $G$ be a group, $H$ a subgroup of $G$, and $a \in G$. Then $S = \{n \in \mathbb{Z} : a^n \in H\}$ is an ideal of $\mathbb{Z}$.
8 If $\gcd(a, b) = d$, then $\langle a \rangle + \langle b \rangle = \langle d \rangle$. (Note: If $J$ and $K$ are ideals of a ring $A$, then $J + K = \{x + y : x \in J \text{ and } y \in K\}$.)

H. The gcd and the lcm as Operations on $\mathbb{Z}$

For any two integers $a$ and $b$, let $a * b = \gcd(a, b)$ and $a \circ b = \operatorname{lcm}(a, b)$. Prove the following properties of these operations:

1 $*$ and $\circ$ are associative.
2 There is an identity element for $\circ$, but not for $*$ (on the set of positive integers).
3 Which integers have inverses with respect to $\circ$?
4 Prove: $a * (b \circ c) = (a * b) \circ (a * c)$.
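Several statements in this chapter lend themselves to numerical experiment before a proof is attempted. The sketch below is offered only as an aid and is not taken from the text: it uses the extended Euclidean algorithm, which is a different technique from the ideal-theoretic argument given for Theorem 3, to produce integers $k$ and $l$ with $\gcd(r, s) = kr + ls$. The function names `extended_gcd` and `lcm` are hypothetical, and the finite loops merely spot-check Exercises F9 and H4 on small positive integers.

```python
# Sketch (assumption: extended Euclidean algorithm, not the proof in
# the text). Function names are illustrative only.


def extended_gcd(r, s):
    """Return (t, k, l) with t = gcd(r, s) >= 0 and t == k*r + l*s."""
    old_t, t = r, s
    old_k, k = 1, 0
    old_l, l = 0, 1
    # Invariants: old_t == old_k*r + old_l*s and t == k*r + l*s.
    while t != 0:
        q = old_t // t
        old_t, t = t, old_t - q * t
        old_k, k = k, old_k - q * k
        old_l, l = l, old_l - q * l
    if old_t < 0:  # select the positive gcd, as in the text
        old_t, old_k, old_l = -old_t, -old_k, -old_l
    return old_t, old_k, old_l


def lcm(a, b):
    """Positive least common multiple of positive integers a and b."""
    return a * b // extended_gcd(a, b)[0]


t, k, l = extended_gcd(15, 6)
print(t, k, l)  # a gcd of 15 and 6 with coefficients k, l

# Spot-check Exercise F9 (cd = ab) and Exercise H4 on small inputs.
for a in range(1, 25):
    for b in range(1, 25):
        g = extended_gcd(a, b)[0]
        assert g * lcm(a, b) == a * b
        for c in range(1, 25):
            gab = extended_gcd(a, b)[0]
            gac = extended_gcd(a, c)[0]
            assert extended_gcd(a, lcm(b, c))[0] == lcm(gab, gac)
```

A passing check on a finite range is evidence, not proof; the exercises still ask for arguments valid for all integers.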
CHAPTER TWENTY-THREE
ELEMENTS OF NUMBER THEORY (OPTIONAL)

Almost as soon as children are able to count, they learn to distinguish between even numbers and odd numbers. The distinction between even and odd is the most elemental of all concepts relating to numbers. It is also the starting point of the modern science of number theory.

From a sophisticated standpoint, a number is even if the remainder, after dividing the number by 2, is 0. The number is odd if that remainder is 1. This notion may be generalized in an obvious way. Let $n$ be any positive integer: a number is said to be congruent to 0, modulo $n$ if the remainder, when the number is divided by $n$, is 0. The number is said to be congruent to 1, modulo $n$ if the remainder, when the number is divided by $n$, is 1. Similarly, the number is congruent to 2, modulo $n$ if the remainder after division by $n$ is 2; and so on. This is the natural way of generalizing the distinction between odd and even.

Note that "even" is the same as "congruent to 0, modulo 2"; and "odd" is the same as "congruent to 1, modulo 2." In short, the distinction between odd and even is only one special case of a more general notion. We shall now define this notion formally:

[Figure: number lines showing the integers grouped into congruence classes modulo 2, modulo 3, and modulo 4.]

Let $n$ be any positive integer. If $a$ and $b$ are any two integers, we shall say that $a$ is congruent to $b$, modulo $n$ if $a$ and $b$, when they are divided by $n$, leave the same remainder $r$. That is, if we use the division algorithm to divide $a$ and $b$ by $n$, then
$$a = nq_1 + r \quad\text{and}\quad b = nq_2 + r$$
where the remainder $r$ is the same in both equations.
Subtracting these two equations, we see that
$$a - b = (nq_1 + r) - (nq_2 + r) = n(q_1 - q_2)$$
Conversely, if $n$ divides $a - b$, then $a$ and $b$ differ by a multiple of $n$ and so leave the same remainder on division by $n$. Therefore we get the following important fact:

$a$ is congruent to $b$, modulo $n$ iff $n$ divides $a - b$   (1)

If $a$ is congruent to $b$, modulo $n$, we express this fact in symbols by writing
$$a \equiv b \pmod{n}$$
which should be read "$a$ is congruent to $b$, modulo $n$." We refer to this relation as congruence modulo $n$.

By using Condition (1), it is easily verified that congruence modulo $n$ is a reflexive, symmetric, and transitive relation on $\mathbb{Z}$. It is also easy to check that for any $n > 0$ and any integers $a$, $b$, and $c$,
$$a \equiv b \pmod{n} \quad\text{implies}\quad a + c \equiv b + c \pmod{n}$$
and
$$a \equiv b \pmod{n} \quad\text{implies}\quad ac \equiv bc \pmod{n}$$
(The proofs, which are exceedingly easy, are assigned as Exercise C at the end of this chapter.)

Recall that
$$(n) = \{\ldots, -3n, -2n, -n, 0, n, 2n, 3n, \ldots\}$$
is the ideal of $\mathbb{Z}$ which consists of all the multiples of $n$. The quotient ring $\mathbb{Z}/(n)$ is usually denoted by $\mathbb{Z}_n$, and its elements are denoted by $\bar{0}, \bar{1}, \bar{2}, \ldots, \overline{n-1}$. These elements are cosets:
$$\bar{0} = (n) + 0 = \{\ldots, -2n, -n, 0, n, 2n, \ldots\}$$
$$\bar{1} = (n) + 1 = \{\ldots, -2n + 1, -n + 1, 1, n + 1, 2n + 1, \ldots\}$$
$$\bar{2} = (n) + 2 = \{\ldots, -2n + 2, -n + 2, 2, n + 2, 2n + 2, \ldots\}$$
and so on. It is clear by inspection that different integers are in the same coset iff they differ from each other by a multiple of $n$. That is,

$a$ and $b$ are in the same coset iff $n$ divides $a - b$ iff $a \equiv b \pmod{n}$   (2)

If $a$ is any integer, the coset (in $\mathbb{Z}_n$) which contains $a$ will be denoted by $\bar{a}$. For example, in $\mathbb{Z}_6$,
$$\bar{0} = \bar{6} = \overline{-6} = \overline{12} = \overline{18} = \cdots \qquad \bar{1} = \bar{7} = \overline{-5} = \overline{13} = \cdots \qquad \bar{2} = \bar{8} = \overline{-4} = \overline{14} = \cdots$$
etc. In particular, $\bar{a} = \bar{b}$ means that $a$ and $b$ are in the same coset. It follows by Condition (2) that

$\bar{a} = \bar{b}$ in $\mathbb{Z}_n$ iff $a \equiv b \pmod{n}$   (3)

On account of this fundamental connection between congruence modulo $n$ and equality in $\mathbb{Z}_n$, most facts about congruence can be discovered by examining the rings $\mathbb{Z}_n$. These rings have very simple properties, which are presented next.
From these properties we will then be able to deduce all we need to know about congruences.

Let $n$ be a positive integer. It is an important fact that for any integer $a$,

$\bar{a}$ is invertible in $\mathbb{Z}_n$ iff $a$ and $n$ are relatively prime.   (4)

Indeed, if $a$ and $n$ are relatively prime, their gcd is equal to 1. Therefore, by Theorem 3 of Chapter 22, there are integers $s$ and $t$ such that $sa + tn = 1$. It follows that
$$1 - sa = tn \in (n)$$
so by Condition (2) of Chapter 19, 1 and $sa$ belong to the same coset in $\mathbb{Z}/(n)$. This is the same as saying that $\bar{1} = \overline{sa} = \bar{s}\,\bar{a}$; hence $\bar{s}$ is the multiplicative inverse of $\bar{a}$ in $\mathbb{Z}_n$. The converse is proved by reversing the steps of this argument.

It follows from Condition (4) above, that if $n$ is a prime number, every nonzero element of $\mathbb{Z}_n$ is invertible! Thus,

$\mathbb{Z}_p$ is a field for every prime number $p$.   (5)

In any field, the set of all the nonzero elements, with multiplication as the only operation (ignore addition), is a group. Indeed, the product of any two nonzero elements is nonzero, and the multiplicative inverse of any nonzero element is nonzero. Thus, in $\mathbb{Z}_p$, the set
$$\mathbb{Z}_p^* = \{\bar{1}, \bar{2}, \ldots, \overline{p-1}\}$$
with multiplication as its only operation, is a group of order $p - 1$.

Remember that if $G$ is a group whose order is, let us say, $m$, then $x^m = e$ for every $x$ in $G$. (This is true by Theorem 5 of Chapter 13.) Now, $\mathbb{Z}_p^*$ has order $p - 1$ and its identity element is $\bar{1}$, so $(\bar{a})^{p-1} = \bar{1}$ for every $\bar{a} \neq \bar{0}$ in $\mathbb{Z}_p$. If we use Condition (3) to translate this equality into a congruence, we get a classical result of number theory:

Little theorem of Fermat Let $p$ be a prime. Then,
$$a^{p-1} \equiv 1 \pmod{p} \quad\text{for every } a \not\equiv 0 \pmod{p}$$

Corollary $a^p \equiv a \pmod{p}$ for every integer $a$.

Actually, a version of this theorem is true in $\mathbb{Z}_n$ even where $n$ is not a prime number. In this case, let $V_n$ denote the set of all the invertible elements in $\mathbb{Z}_n$. Clearly, $V_n$ is a group with respect to multiplication. (Reason: The product of two invertible elements is invertible, and, if $\bar{a}$ is invertible, so is its inverse.)
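Condition (4) and Fermat's little theorem are easy to check numerically for small moduli. The following is a minimal sketch, not part of the original text: the function names `units`, `coprime_residues`, and `fermat_holds` are hypothetical, and the search for inverses is deliberately brute force so that it relies on nothing beyond the definitions above.

```python
from math import gcd

def units(n):
    """Residues a in {1, ..., n-1} that have a multiplicative inverse modulo n.

    Brute force: a is kept if some s in {1, ..., n-1} satisfies a*s = 1 (mod n).
    """
    return [a for a in range(1, n) if any((a * s) % n == 1 for s in range(1, n))]

def coprime_residues(n):
    """Residues a in {1, ..., n-1} that are relatively prime to n."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

def fermat_holds(p):
    """Check a^(p-1) = 1 (mod p) for every a in {1, ..., p-1}."""
    return all(pow(a, p - 1, p) == 1 for a in range(1, p))

# Condition (4): the invertible residues are exactly the ones coprime to n.
print(units(12), coprime_residues(12))
# Fermat's little theorem for the prime 13.
print(fermat_holds(13))
```

For a composite modulus such as 9 the Fermat check fails (for instance $2^8 = 256 \equiv 4 \pmod 9$), which is why the hypothesis that $p$ is prime cannot be dropped.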
For any positive integer $n$, let $\phi(n)$ denote the number of positive integers, less than $n$, which are relatively prime to $n$. For example, 1, 3, 5, and 7 are relatively prime to 8; hence $\phi(8) = 4$. By Condition (4), the number of elements in $V_n$ is $\phi(n)$. Thus, $V_n$ is a group of order $\phi(n)$, and its identity element is $\bar{1}$. Consequently, for any $\bar{a}$ in $V_n$, $(\bar{a})^{\phi(n)} = \bar{1}$. If we use Condition (3) to translate this equation into a congruence, we get:

Euler's theorem If $a$ and $n$ are relatively prime, $a^{\phi(n)} \equiv 1 \pmod{n}$.

FURTHER TOPICS IN NUMBER THEORY

Congruences are more important in number theory than we might expect. This is because a vast range of problems in number theory (problems which have nothing to do with congruences at first sight) can be transformed into problems involving congruences, and are most easily solved in that form. An example is given next:

A Diophantine equation is any polynomial equation (in one or more unknowns) whose coefficients are integers. To solve a Diophantine equation is to find integer values of the unknowns which satisfy the equation. We might be inclined to think that the restriction to integer values makes it easier to solve equations; in fact, the very opposite is true. For instance, even in the case of an equation as simple as $4x + 2y = 5$, it is not obvious whether we can find an integer solution consisting of $x$ and $y$ in $\mathbb{Z}$. (As a matter of fact, there is no integer solution; try to explain why not.)

Solving Diophantine equations is one of the oldest and most important problems in number theory. Even the problem of solving Diophantine linear equations is difficult and has many applications. Therefore, it is a very important fact that solving linear Diophantine equations is equivalent to solving linear congruences. Indeed,
$$ax + by = c \quad\text{iff}\quad ax - c = b(-y) \quad\text{iff}\quad ax \equiv c \pmod{b}$$
Thus, any solution of $ax \equiv c \pmod{b}$ yields a solution in integers of $ax + by = c$.
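The equivalence between $ax + by = c$ and $ax \equiv c \pmod{b}$ can be turned directly into a small search procedure. The sketch below is an illustration only: the function name `solve_linear_diophantine` is hypothetical, it assumes $b \neq 0$, and it simply tries every residue $x$ modulo $|b|$ rather than using any faster method.

```python
def solve_linear_diophantine(a, b, c):
    """Return one integer pair (x, y) with a*x + b*y == c, or None if none exists.

    Assumes b != 0. Uses the equivalence  a*x + b*y = c  iff  a*x = c (mod b):
    it is enough to test the residues x = 0, 1, ..., |b| - 1.
    """
    m = abs(b)
    for x in range(m):
        if (a * x - c) % m == 0:      # a*x is congruent to c modulo b
            y = (c - a * x) // b      # exact division, because b divides c - a*x
            return x, y
    return None

print(solve_linear_diophantine(4, 2, 5))    # the equation 4x + 2y = 5 from the text
print(solve_linear_diophantine(14, 15, 11))
```

For $4x + 2y = 5$ the search finds nothing, in agreement with the remark above that this equation has no integer solution; for $14x + 15y = 11$ it returns a pair that can be checked by substitution.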
Finding solutions of linear congruences is therefore an important matter, and we will turn our attention to it now.

A congruence such as $ax \equiv b \pmod{n}$ may look very easy to solve, but appearances can be deceptive. In fact, many such congruences have no solutions at all! For example, $4x \equiv 5 \pmod{2}$ cannot have a solution, because $4x$ is always even [hence, congruent to 0 (mod 2)], whereas 5 is odd [hence congruent to 1 (mod 2)]. Our first item of business, therefore, is to find a way of recognizing whether or not a linear congruence has a solution.

Theorem 1 The congruence $ax \equiv b \pmod{n}$ has a solution iff $\gcd(a, n) \mid b$.

Indeed,
$$ax \equiv b \pmod{n} \quad\text{iff}\quad n \mid (ax - b) \quad\text{iff}\quad ax - b = yn \quad\text{iff}\quad ax - yn = b$$
Next, by the proof of Theorem 3 in Chapter 22, if $J$ is the ideal of all the linear combinations of $a$ and $n$, then $\gcd(a, n)$ is the least positive integer in $J$. Furthermore, every integer in $J$ is a multiple of $\gcd(a, n)$. Thus, $b$ is a linear combination of $a$ and $n$ iff $b \in J$ iff $b$ is a multiple of $\gcd(a, n)$. This completes the proof of our theorem.

Now that we are able to recognize when a congruence has a solution, let us see what such a solution looks like. Consider the congruence $ax \equiv b \pmod{n}$. By a solution modulo $n$ of this congruence, we mean a congruence
$$x \equiv c \pmod{n}$$
such that any integer $x$ satisfies $x \equiv c \pmod{n}$ iff it satisfies $ax \equiv b \pmod{n}$. [That is, the solutions of $ax \equiv b \pmod{n}$ are all the integers congruent to $c$, modulo $n$.] Does every congruence $ax \equiv b \pmod{n}$ (supposing that it has a solution) have a solution modulo $n$? Unfortunately not! Nevertheless, as a starter, we have the following:

Lemma If $\gcd(a, n) = 1$, then $ax \equiv b \pmod{n}$ has a solution modulo $n$.

Proof: Indeed, by (3), $ax \equiv b \pmod{n}$ is equivalent to the equality $\bar{a}\bar{x} = \bar{b}$ in $\mathbb{Z}_n$. But by Condition (4), $\bar{a}$ has a multiplicative inverse in $\mathbb{Z}_n$; hence from $\bar{a}\bar{x} = \bar{b}$ we get $\bar{x} = \bar{a}^{-1}\bar{b}$. Setting $\bar{a}^{-1}\bar{b} = \bar{c}$, we get $\bar{x} = \bar{c}$ in $\mathbb{Z}_n$, that is, $x \equiv c \pmod{n}$. ■

Thus, if $a$ and $n$ are relatively prime, $ax \equiv b \pmod{n}$ has a solution modulo $n$.
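Theorem 1 and the lemma translate into two short routines. This is a sketch under stated assumptions rather than a definitive implementation: the names `is_solvable` and `solve_coprime` are hypothetical, and the inverse $\bar{a}^{-1}$ is computed with Python's built-in three-argument `pow` with exponent `-1`, which requires Python 3.8 or later.

```python
from math import gcd

def is_solvable(a, b, n):
    """Theorem 1: a*x = b (mod n) has a solution iff gcd(a, n) divides b."""
    return b % gcd(a, n) == 0

def solve_coprime(a, b, n):
    """Lemma: for gcd(a, n) == 1, return c in {0, ..., n-1} with x = c (mod n).

    Mirrors the proof: multiply b by the inverse of a in Z_n.
    """
    if gcd(a, n) != 1:
        raise ValueError("a and n must be relatively prime")
    a_inverse = pow(a, -1, n)   # multiplicative inverse of a modulo n (Python 3.8+)
    return (a_inverse * b) % n

print(is_solvable(4, 5, 2))    # the congruence 4x = 5 (mod 2) from the text
c = solve_coprime(12, 7, 25)
print(c, (12 * c) % 25)        # the second value should equal 7
```

The first call reports that $4x \equiv 5 \pmod 2$ is not solvable, matching the parity argument given above.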
If $a$ and $n$ are not relatively prime, we have no solution modulo $n$; nevertheless, we have a very nice result:

Theorem 2 If the congruence $ax \equiv b \pmod{n}$ has a solution, then it has a solution modulo $m$, where
$$m = \frac{n}{\gcd(a, n)}$$
This means that the solution of $ax \equiv b \pmod{n}$ is of the form $x \equiv c \pmod{m}$; it consists of all the integers which are congruent to $c$, modulo $m$.

Proof. To prove this, let $\gcd(a, n) = d$ (so that $d \mid b$ by Theorem 1), and note the following:
$$ax \equiv b \pmod{n} \quad\text{iff}\quad n \mid (ax - b) \quad\text{iff}\quad ax - b = ny \quad\text{iff}\quad \frac{a}{d}x - \frac{b}{d} = \frac{n}{d}y \quad\text{iff}\quad \frac{a}{d}x \equiv \frac{b}{d} \pmod{\frac{n}{d}} \qquad (6)$$
But $a/d$ and $n/d$ are relatively prime (because we are dividing $a$ and $n$ by $d$, which is their gcd); hence by the lemma,
$$\frac{a}{d}x \equiv \frac{b}{d} \pmod{\frac{n}{d}}$$
has a solution $x$ mod $n/d$. By Condition (6), this is also a solution of $ax \equiv b \pmod{n}$. ■

As an example, let us solve $6x \equiv 4 \pmod{10}$. $\gcd(6, 10) = 2$ and $2 \mid 4$, so by Theorem 1, this congruence has a solution. By Condition (6), this solution is the same as the solution of
$$\frac{6}{2}x \equiv \frac{4}{2} \pmod{\frac{10}{2}} \quad\text{that is,}\quad 3x \equiv 2 \pmod{5}$$
This is equivalent to the equation $\bar{3}\bar{x} = \bar{2}$ in $\mathbb{Z}_5$, and its solution is $\bar{x} = \bar{4}$. So finally, the solution of $6x \equiv 4 \pmod{10}$ is $x \equiv 4 \pmod{5}$.

How do we go about solving several linear congruences simultaneously? Well, suppose we are given $k$ congruences,
$$a_1x \equiv b_1 \pmod{n_1}, \quad a_2x \equiv b_2 \pmod{n_2}, \quad \ldots, \quad a_kx \equiv b_k \pmod{n_k}$$
If each of these congruences has a solution, we solve each one individually by means of Theorem 2. This gives us
$$x \equiv c_1 \pmod{m_1}, \quad x \equiv c_2 \pmod{m_2}, \quad \ldots, \quad x \equiv c_k \pmod{m_k}$$
We are left with the problem of solving this last set of congruences simultaneously.

Is there any integer $x$ which satisfies all $k$ of these congruences? The answer for two simultaneous congruences is as follows:

Theorem 3 Consider $x \equiv a \pmod{n}$ and $x \equiv b \pmod{m}$. There is an integer $x$ satisfying both simultaneously iff $a \equiv b \pmod{d}$, where $d = \gcd(m, n)$.

Proof: If $x$ is a simultaneous solution, then $n \mid (x - a)$ and $m \mid (x - b)$.
Thus,
$$x - a = nq_1 \quad\text{and}\quad x - b = mq_2$$
Subtracting the first equation from the second gives
$$a - b = mq_2 - nq_1$$
But $d \mid m$ and $d \mid n$, so $d \mid (a - b)$; thus, $a \equiv b \pmod{d}$.

Conversely, if $a \equiv b \pmod{d}$, then $d \mid (a - b)$, so $a - b = dq$. By Theorem 3 of Chapter 22, $d = rn + tm$ for some integers $r$ and $t$. Thus, $a - b = rqn + tqm$. From this equation, we get
$$a - rqn = b + tqm$$
Set $x = a - rqn = b + tqm$; then $x - a = -rqn$ and $x - b = tqm$; hence $n \mid (x - a)$ and $m \mid (x - b)$, so
$$x \equiv a \pmod{n} \quad\text{and}\quad x \equiv b \pmod{m} \quad ■$$

Now that we are able to determine whether or not a pair of congruences has a simultaneous solution, let us see what such a solution looks like.

Theorem 4 If a pair of congruences $x \equiv a \pmod{n}$ and $x \equiv b \pmod{m}$ has a simultaneous solution, then it has a simultaneous solution of the form
$$x \equiv c \pmod{t}$$
where $t$ is the least common multiple of $m$ and $n$.

Before proving the theorem, let us observe that the least common multiple (lcm) of any two integers $m$ and $n$ has the following property: let $t$ be the least common multiple of $m$ and $n$. Every common multiple of $m$ and $n$ is a multiple of $t$, and conversely. That is, for all integers $x$,
$$m \mid x \text{ and } n \mid x \quad\text{iff}\quad t \mid x$$
(See Exercise F at the end of Chapter 22.) In particular,
$$m \mid (x - c) \text{ and } n \mid (x - c) \quad\text{iff}\quad t \mid (x - c)$$
hence

$x \equiv c \pmod{m}$ and $x \equiv c \pmod{n}$ iff $x \equiv c \pmod{t}$   (7)

Returning to our theorem, let $c$ be any solution of the given pair of congruences (remember, we are assuming there is a simultaneous solution). Then $c \equiv a \pmod{n}$ and $c \equiv b \pmod{m}$. Any other integer $x$ is a simultaneous solution iff $x \equiv c \pmod{n}$ and $x \equiv c \pmod{m}$. But by Condition (7), this is true iff $x \equiv c \pmod{t}$. The proof is complete.

A special case of Theorems 3 and 4 is very important in practice: it is the case where $m$ and $n$ are relatively prime. Note that, in this case, $\gcd(m, n) = 1$ and $\operatorname{lcm}(m, n) = mn$. Thus, by Theorems 3 and 4,

If $m$ and $n$ are relatively prime, the pair of congruences $x \equiv a \pmod{n}$ and $x \equiv b \pmod{m}$ always has a solution. This solution is of the form $x \equiv c \pmod{mn}$.
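The proofs of Theorems 3 and 4 are constructive, so they can be followed step by step in code. The sketch below is one possible rendering, not the book's own procedure: the names `extended_gcd` and `solve_pair` are hypothetical, and the coefficients $r$ and $t$ with $d = rn + tm$ are obtained from the extended Euclidean algorithm, which is an assumption about how one might compute them.

```python
def extended_gcd(u, v):
    """Return (d, r, t) with d = gcd(u, v) = r*u + t*v, for non-negative u, v."""
    old_rem, rem = u, v
    old_r, r = 1, 0
    old_t, t = 0, 1
    while rem != 0:
        k = old_rem // rem
        old_rem, rem = rem, old_rem - k * rem
        old_r, r = r, old_r - k * r
        old_t, t = t, old_t - k * t
    return old_rem, old_r, old_t

def solve_pair(a, n, b, m):
    """Solve x = a (mod n) and x = b (mod m) simultaneously.

    Returns (c, lcm) meaning x = c (mod lcm(m, n)), or None when
    a is not congruent to b modulo gcd(m, n) (Theorem 3).
    """
    d, r, t = extended_gcd(n, m)      # d = r*n + t*m
    if (a - b) % d != 0:
        return None
    q = (a - b) // d                  # a - b = d*q
    x = a - r * q * n                 # equals b + t*q*m, as in the proof
    lcm_mn = n * m // d
    return x % lcm_mn, lcm_mn         # Theorem 4: the solution is a class mod lcm

print(solve_pair(2, 3, 3, 4))   # relatively prime moduli: always solvable
print(solve_pair(1, 4, 2, 6))   # 1 and 2 differ by an odd number, gcd(4, 6) = 2
```

When the moduli are relatively prime, as in the first call, the returned modulus is simply the product $mn$.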
This statement can easily be extended to the case of more than two linear congruences. The result is known as the

Chinese remainder theorem Let $m_1, m_2, \ldots, m_k$ be pairwise relatively prime. Then the system of simultaneous linear congruences
$$x \equiv c_1 \pmod{m_1}, \quad x \equiv c_2 \pmod{m_2}, \quad \ldots, \quad x \equiv c_k \pmod{m_k}$$
always has a solution, which is of the form $x \equiv c \pmod{m_1m_2 \cdots m_k}$.

Use Theorem 4 to solve $x \equiv c_1 \pmod{m_1}$ and $x \equiv c_2 \pmod{m_2}$ simultaneously. The solution is of the form $x \equiv d \pmod{m_1m_2}$. Solve the latter simultaneously with $x \equiv c_3 \pmod{m_3}$, to get a solution mod $m_1m_2m_3$. Repeat this process $k - 1$ times.

EXERCISES

A. Solving Single Congruences
1 For each of the following congruences, find $m$ such that the congruence has a unique solution modulo $m$. If there is no solution, write "none."
(a) $60x \equiv 12 \pmod{24}$ (b) $42x \equiv 24 \pmod{30}$ (c) $49x \equiv 30 \pmod{25}$ (d) $39x \equiv 14 \pmod{52}$ (e) $147x \equiv 47 \pmod{98}$ (f) $39x \equiv 26 \pmod{52}$
2 Solve the following linear congruences:
(a) $12x \equiv 7 \pmod{25}$ (b) $35x \equiv 8 \pmod{12}$ (c) $15x \equiv 9 \pmod{6}$ (d) $42x \equiv 12 \pmod{30}$ (e) $147x \equiv 49 \pmod{98}$ (f) $39x \equiv 26 \pmod{52}$
3 (a) Explain why $2x^2 \equiv 8 \pmod{10}$ has the same solutions as $x^2 \equiv 4 \pmod{5}$.
(b) Explain why $x \equiv 2 \pmod{5}$ and $x \equiv 3 \pmod{5}$ are all the solutions of $2x^2 \equiv 8 \pmod{10}$.
4 Solve the following quadratic congruences. (If there is no solution, write "none.")
(a) $6x^2 \equiv 9 \pmod{15}$ (b) $60x^2 \equiv 18 \pmod{24}$ (c) $30x^2 \equiv 18 \pmod{24}$ (d) $4(x + 1)^2 \equiv 14 \pmod{10}$ (e) $4x^2 - 2x + 2 \equiv 0 \pmod{6}$ # (f) $3x^2 - 6x + 6 \equiv 0 \pmod{15}$
5 Solve the following congruences:
(a) $x^4 \equiv 4 \pmod{6}$ (b) $2(x - 1)^4 \equiv 0 \pmod{8}$ (c) $x^3 + 3x^2 + 3x + 1 \equiv 0 \pmod{8}$ (d) $x^4 + 2x^2 + 1 \equiv 4 \pmod{5}$
6 Solve the following Diophantine equations. (If there is no solution, write "none.")
(a) $14x + 15y = 11$ (b) $4x + 5y = 1$ (c) $21x + 10y = 9$ # (d) $30x^2 + 24y = 18$

B. Solving Sets of Congruences
Example Solve the pair of simultaneous congruences $x \equiv 5 \pmod{6}$, $x \equiv 7 \pmod{10}$.
By Theorems 3 and 4, this pair of congruences has a solution modulo 30. From $x \equiv 5 \pmod{6}$, we get $x = 6q + 5$. Introducing this into $x \equiv 7 \pmod{10}$ yields $6q + 5 \equiv 7 \pmod{10}$. Thus, successively, $6q \equiv 2 \pmod{10}$, $3q \equiv 1 \pmod{5}$, $q \equiv 2 \pmod{5}$, $q = 5r + 2$. Introducing this into $x = 6q + 5$ gives $x = 6(5r + 2) + 5 = 30r + 17$. Thus, $x \equiv 17 \pmod{30}$. This is our solution.
1 Solve each of the following pairs of simultaneous congruences:
(a) $x \equiv 7 \pmod{8}$; $x \equiv 11 \pmod{12}$ (b) $x \equiv 12 \pmod{18}$; $x \equiv 30 \pmod{45}$ (c) $x \equiv 8 \pmod{15}$; $x \equiv 11 \pmod{14}$
2 Solve each of the following pairs of simultaneous congruences:
(a) $10x \equiv 2 \pmod{12}$; $6x \equiv 14 \pmod{20}$ (b) $4x \equiv 2 \pmod{6}$; $9x \equiv 3 \pmod{12}$ (c) $6x \equiv 2 \pmod{8}$; $10x \equiv 2 \pmod{12}$
# 3 Use Theorems 3 and 4 to prove the following: Suppose we are given $k$ congruences
$$x \equiv c_1 \pmod{m_1}, \quad x \equiv c_2 \pmod{m_2}, \quad \ldots, \quad x \equiv c_k \pmod{m_k}$$
There is an $x$ satisfying all $k$ congruences simultaneously if for all $i, j \in \{1, \ldots, k\}$, $c_i \equiv c_j \pmod{d_{ij}}$, where $d_{ij} = \gcd(m_i, m_j)$. Moreover, the simultaneous solution is of the form $x \equiv c \pmod{t}$, where $t = \operatorname{lcm}(m_1, m_2, \ldots, m_k)$.
4 Solve each of the following systems of simultaneous linear congruences; if there is no solution, write "none."
(a) $x \equiv 2 \pmod{3}$; $x \equiv 3 \pmod{4}$; $x \equiv 1 \pmod{5}$; $x \equiv 4 \pmod{7}$
(b) $6x \equiv 4 \pmod{8}$; $10x \equiv 4 \pmod{12}$; $3x \equiv 8 \pmod{10}$
(c) $5x \equiv 3 \pmod{6}$; $4x \equiv 2 \pmod{6}$; $6x \equiv 6 \pmod{8}$
5 Solve the following systems of simultaneous Diophantine equations:
# (a) $4x + 6y = 2$; $9x + 12y = 3$ (b) $3x + 4y = 2$; $5x + 6y = 2$; $3x + 10y = 8$.

C. Elementary Properties of Congruence
Prove the following for all integers $a$, $b$, $c$, $d$ and all positive integers $m$ and $n$:
1 If $a \equiv b \pmod{n}$ and $b \equiv c \pmod{n}$, then $a \equiv c \pmod{n}$.
2 If $a \equiv b \pmod{n}$, then $a + c \equiv b + c \pmod{n}$.
3 If $a \equiv b \pmod{n}$, then $ac \equiv bc \pmod{n}$.
4 $a \equiv b \pmod{1}$.
5 If $ab \equiv 0 \pmod{p}$, where $p$ is a prime, then $a \equiv 0 \pmod{p}$ or $b \equiv 0 \pmod{p}$.
6 If $a^2 \equiv b^2 \pmod{p}$, where $p$ is a prime, then $a \equiv \pm b \pmod{p}$.
7 If $a \equiv b \pmod{m}$, then $a + km \equiv b \pmod{m}$, for any integer $k$.
In particular, $a + km \equiv a \pmod{m}$.
8 If $ac \equiv bc \pmod{n}$ and $\gcd(c, n) = 1$, then $a \equiv b \pmod{n}$.
9 If $a \equiv b \pmod{n}$, then $a \equiv b \pmod{m}$ for any $m$ which is a factor of $n$.

D. Further Properties of Congruence
Prove the following for all integers $a$, $b$, $c$ and all positive integers $m$ and $n$:
1 If $ac \equiv bc \pmod{n}$, and $\gcd(c, n) = d$, then $a \equiv b \pmod{n/d}$.
2 If $a \equiv b \pmod{n}$, then $\gcd(a, n) = \gcd(b, n)$.
3 If $a \equiv b \pmod{p}$ for every prime $p$, then $a = b$.
4 If $a \equiv b \pmod{n}$, then $a^m \equiv b^m \pmod{n}$ for every positive integer $m$.
5 If $a \equiv b \pmod{m}$ and $a \equiv b \pmod{n}$ where $\gcd(m, n) = 1$, then $a \equiv b \pmod{mn}$.
# 6 If $ab \equiv 1 \pmod{c}$, $ac \equiv 1 \pmod{b}$, and $bc \equiv 1 \pmod{a}$, then $ab + bc + ac \equiv 1 \pmod{abc}$. (Assume $a, b, c > 0$.)
7 If $a^2 \equiv 1 \pmod{2}$, then $a^2 \equiv 1 \pmod{4}$.
8 If $a \equiv b \pmod{n}$, then $a^2 + b^2 \equiv 2ab \pmod{n^2}$, and conversely.
9 If $a \equiv 1 \pmod{m}$, then $a$ and $m$ are relatively prime.

E. Consequences of Fermat's Theorem
1 If $p$ is a prime, find $\phi(p)$. Use this to deduce Fermat's theorem from Euler's theorem.
Prove parts 2-6:
2 If $p > 2$ is a prime and $a \not\equiv 0 \pmod{p}$, then $a^{(p-1)/2} \equiv \pm 1 \pmod{p}$.
3 (a) Let $p$ be a prime $> 2$. If $p \equiv 3 \pmod{4}$, then $(p - 1)/2$ is odd.
(b) Let $p > 2$ be a prime such that $p \equiv 3 \pmod{4}$. Then there is no solution to the congruence $x^2 + 1 \equiv 0 \pmod{p}$. [Hint: Raise both sides of $x^2 \equiv -1 \pmod{p}$ to the power $(p - 1)/2$, and use Fermat's little theorem.]
# 4 Let $p$ and $q$ be distinct primes. Then $p^{q-1} + q^{p-1} \equiv 1 \pmod{pq}$.
5 Let $p$ be a prime.
(a) If $(p - 1) \mid m$, then $a^m \equiv 1 \pmod{p}$ provided that $p \nmid a$.
(b) If $(p - 1) \mid m$, then $a^{m+1} \equiv a \pmod{p}$ for all integers $a$.
# 6 Let $p$ and $q$ be distinct primes.
(a) If $(p - 1) \mid m$ and $(q - 1) \mid m$, then $a^m \equiv 1 \pmod{pq}$ for any $a$ such that $p$ [...]
[...] $\phi(p^n) = p^n - p^{n-1} = p^{n-1}(p - 1)$. (Hint: For any integer $a$, $a$ and $p^n$ have a common divisor $\neq \pm 1$ iff $a$ is a multiple of $p$. There are exactly $p^{n-1}$ multiples of $p$ between 1 and $p^n$.)
5 For every $a \not\equiv 0 \pmod{p}$, $a^{p^n(p-1)} \equiv 1 \pmod{p^{n+1}}$, where $p$ is a prime.
6 Under the conditions of part 3, if $t$ is a common multiple of $\phi(m)$ and $\phi(n)$, then $a^t \equiv 1 \pmod{mn}$. Generalize to three integers $l$, $m$, and $n$.
7 Use parts 4 and 6 to explain why the following are true:
(a) $a^{12} \equiv 1 \pmod{180}$ for every $a$ such that $\gcd(a, 180) = 1$.
(b) $a^{42} \equiv 1 \pmod{1764}$ if $\gcd(a, 1764) = 1$. (Remark: $1764 = 4 \times 9 \times 49$.)
(c) $a^{60} \equiv 1 \pmod{1800}$ if $\gcd(a, 1800) = 1$.
# 8 If $\gcd(m, n) = 1$, prove that $n^{\phi(m)} + m^{\phi(n)} \equiv 1 \pmod{mn}$.
9 If $l$, $m$, $n$ are relatively prime in pairs, prove that $(mn)^{\phi(l)} + (ln)^{\phi(m)} + (lm)^{\phi(n)} \equiv 1 \pmod{lmn}$.

G. Wilson's Theorem, and Some Consequences
In any integral domain, if $x^2 = 1$, then $x^2 - 1 = (x + 1)(x - 1) = 0$; hence $x = \pm 1$. Thus, an element $x \neq \pm 1$ cannot be its own multiplicative inverse. As a consequence, in $\mathbb{Z}_p$ the integers $\bar{2}, \bar{3}, \ldots, \overline{p-2}$ may be arranged in pairs, each one being paired off with its multiplicative inverse.
Prove the following:
1 In $\mathbb{Z}_p$, $\bar{2} \cdot \bar{3} \cdots \overline{p-2} = \bar{1}$.
2 $(p - 2)! \equiv 1 \pmod{p}$ for any prime number $p$.
3 $(p - 1)! + 1 \equiv 0 \pmod{p}$ for any prime number $p$. This is known as Wilson's theorem.
4 For any composite number $n \neq 4$, $(n - 1)! \equiv 0 \pmod{n}$. [Hint: If $p$ is any prime factor of $n$, then $p$ is a factor of $(n - 1)!$ Why?]
Before going on to the remaining exercises, we make the following observations: Let $p > 2$ be a prime. Then
$$(p - 1)! = 1 \cdot 2 \cdots \frac{p-1}{2} \cdot \frac{p+1}{2} \cdots (p - 2) \cdot (p - 1)$$
Consequently,
$$(p - 1)! \equiv (-1)^{(p-1)/2} \left[\left(\frac{p-1}{2}\right)!\right]^2 \pmod{p}$$
Reason: $p - 1 \equiv -1 \pmod{p}$, $p - 2 \equiv -2 \pmod{p}$, $\ldots$, $(p + 1)/2 \equiv -(p - 1)/2 \pmod{p}$.
With this result, prove the following:
5 $[(p - 1)/2]!^2 \equiv (-1)^{(p+1)/2} \pmod{p}$, for any prime $p > 2$. (Hint: Use Wilson's theorem.)
6 If $p \equiv 1 \pmod{4}$, then $(p + 1)/2$ is odd. (Why?) Conclude that
$$\left(\frac{p-1}{2}\right)!^2 \equiv -1 \pmod{p}$$
7 If $p \equiv 3 \pmod{4}$, then $(p + 1)/2$ is even. (Why?) Conclude that
$$\left(\frac{p-1}{2}\right)!^2 \equiv 1 \pmod{p}$$
8 When $p > 2$ is a prime, the congruence $x^2 + 1 \equiv 0 \pmod{p}$ has a solution if $p \equiv 1 \pmod{4}$.
9 For any prime $p > 2$, $x^2 \equiv -1 \pmod{p}$ has a solution iff $p \not\equiv 3 \pmod{4}$. (Hint: Use part 8 and Exercise E3.)

H. Quadratic Residues
An integer $a$ is called a quadratic residue modulo $m$ if there is an integer $x$ such that $x^2 \equiv a \pmod{m}$. This is the same as saying that $\bar{a}$ is a square in $\mathbb{Z}_m$. If $a$ is not a quadratic residue modulo $m$, then $a$ is called a quadratic nonresidue modulo $m$. Quadratic residues are important for solving quadratic congruences, for studying sums of squares, etc. Here, we will examine quadratic residues modulo an arbitrary prime $p > 2$.
Let $h : \mathbb{Z}_p^* \to \mathbb{Z}_p^*$ be defined by $h(\bar{a}) = \bar{a}^2$.
1 Prove $h$ is a homomorphism. Its kernel is $\{\pm\bar{1}\}$.
# 2 The range of $h$ has $(p - 1)/2$ elements. Prove: If ran $h = R$, $R$ is a subgroup of $\mathbb{Z}_p^*$ having two cosets. One contains all the residues, the other all the nonresidues.
The Legendre symbol is defined as follows:
$$\left(\frac{a}{p}\right) = \begin{cases} +1 & \text{if } p \nmid a \text{ and } a \text{ is a residue mod } p \\ -1 & \text{if } p \nmid a \text{ and } a \text{ is a nonresidue mod } p \\ 0 & \text{if } p \mid a \end{cases}$$
3 Referring to part 2, let the two cosets of $R$ be called 1 and $-1$. Then $\mathbb{Z}_p^*/R = \{1, -1\}$. Explain why $\left(\frac{a}{p}\right) = \bar{a}R$ for every integer $a$ which is not a multiple of $p$.
# 4 [...]
5 Prove: if $a \equiv b \pmod{p}$, then $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$. In particular, [...]
[...] (Hint: Use Exercises G6 and 7.)
The most important rule for computing $\left(\frac{a}{p}\right)$ is the law of quadratic reciprocity, which asserts that for distinct primes $p, q > 2$,
$$\left(\frac{p}{q}\right) = \begin{cases} -\left(\frac{q}{p}\right) & \text{if } p, q \text{ are both} \equiv 3 \pmod{4} \\ \left(\frac{q}{p}\right) & \text{otherwise} \end{cases}$$
(The proof may be found in any textbook on number theory, for example, Fundamentals of Number Theory by W. J. LeVeque.)
8 Use parts 5 to 7 and the law of quadratic reciprocity to find:
$$\left(\frac{30}{101}\right) \quad \left(\frac{10}{151}\right) \quad \left(\frac{15}{41}\right) \quad \left(\frac{14}{59}\right) \quad \left(\frac{379}{401}\right)$$
Is 14 a quadratic residue, modulo 59?
9 Which of the following congruences is solvable?
(a) $x^2 \equiv 30 \pmod{101}$ (b) $x^2 \equiv 6 \pmod{103}$ (c) $2x^2 \equiv 70 \pmod{106}$
Note: $x^2 \equiv a \pmod{p}$ is solvable iff $a$ is a quadratic residue modulo $p$ iff $\left(\frac{a}{p}\right) = 1$.

I. Primitive Roots
Recall that $V_n$ is the multiplicative group of all the invertible elements in $\mathbb{Z}_n$.
If $V_n$ happens to be cyclic, say $V_n = \langle \bar{m} \rangle$, then any integer $a \equiv m \pmod{n}$ is called a primitive root of $n$.
1 Prove that $a$ is a primitive root of $n$ iff the order of $\bar{a}$ in $V_n$ is $\phi(n)$.
2 Prove that every prime number $p$ has a primitive root. (Hint: For every prime $p$, $\mathbb{Z}_p^*$ is a cyclic group. The simple proof of this fact is given as Theorem 1 in Chapter 33.)
3 Find primitive roots of the following integers (if there are none, say so): 6, 10, 12, 14, 15.
4 Suppose $a$ is a primitive root of $m$. Prove: If $b$ is any integer which is relatively prime to $m$, then $b \equiv a^k \pmod{m}$ for some $k \geq 1$.
5 Suppose $m$ has a primitive root, and let $n$ be relatively prime to $\phi(m)$. (Suppose $n > 0$.) Prove that if $a$ is relatively prime to $m$, then $x^n \equiv a \pmod{m}$ has a solution.
6 Let $p > 2$ be a prime. Prove that every primitive root of $p$ is a quadratic nonresidue, modulo $p$. (Hint: Suppose a primitive root $a$ is a residue; then every power of $a$ is a residue.)
7 A prime $p$ of the form $p = 2^m + 1$ is called a Fermat prime. Let $p$ be a Fermat prime. Prove that every quadratic nonresidue mod $p$ is a primitive root of $p$. (Hint: How many primitive roots are there? How many residues? Compare.)

CHAPTER TWENTY-FOUR
RINGS OF POLYNOMIALS

In elementary algebra an important role is played by polynomials in an unknown $x$. These are expressions such as
$$2x^3 - \tfrac{1}{2}x^2 + 3$$
whose terms are grouped in powers of $x$. The exponents, of course, are positive integers and the coefficients are real or complex numbers.

Polynomials are involved in countless applications, of every kind and description. For example, polynomial functions are the easiest functions to compute, and therefore one commonly attempts to approximate arbitrary functions by polynomial functions. A great deal of effort has been expended by mathematicians to find ways of achieving this.
Aside from their uses in science and computation, polynomials come up very naturally in the general study of rings, as the following example will show:

Suppose we wish to enlarge the ring $\mathbb{Z}$ by adding to it the number $\pi$. It is easy to see that we will have to adjoin to $\mathbb{Z}$ other new numbers besides just $\pi$; for the enlarged ring (containing $\pi$ as well as all the integers) will also contain such things as $-\pi$, $\pi + 7$, $6\pi^2 - 11$, and so on.

As a matter of fact, any ring which contains $\mathbb{Z}$ as a subring and which also contains the number $\pi$ will have to contain every number of the form
$$a\pi^n + b\pi^{n-1} + \cdots + k\pi + l$$
where $a, b, \ldots, k, l$ are integers. In other words, it will contain all the polynomial expressions in $\pi$ with integer coefficients.

But the set of all the polynomial expressions in $\pi$ with integer coefficients is a ring. (It is a subring of $\mathbb{R}$ because it is obvious that the sum and product of any two polynomials in $\pi$ is again a polynomial in $\pi$.) This ring contains $\mathbb{Z}$ because every integer $a$ is a polynomial with a constant term only, and it also contains $\pi$.

Thus, if we wish to enlarge the ring $\mathbb{Z}$ by adjoining to it the new number $\pi$, it turns out that the "next largest" ring after $\mathbb{Z}$ which contains $\mathbb{Z}$ as a subring and includes $\pi$, is exactly the ring of all the polynomials in $\pi$ with coefficients in $\mathbb{Z}$.

As this example shows, aside from their practical applications, polynomials play an important role in the scheme of ring theory because they are precisely what we need when we wish to enlarge a ring by adding new elements to it.

In elementary algebra one considers polynomials whose coefficients are real numbers, or in some cases, complex numbers. As a matter of fact, the properties of polynomials are pretty much independent of the exact nature of their coefficients. All we need to know is that the coefficients are contained in some ring. For convenience, we will assume this ring is a commutative ring with unity.

Let $A$ be a commutative ring with unity.
Up to now we have used letters to denote elements or sets, but now we will use the letter $x$ in a different way. In a polynomial expression such as $ax^2 + bx + c$, where $a, b, c \in A$, we do not consider $x$ to be an element of $A$, but rather $x$ is a symbol which we use in an entirely formal way. Later we will allow the substitution of other things for $x$, but at present $x$ is simply a placeholder.

Notationally, the terms of a polynomial may be listed in either ascending or descending order. For example, $4x^3 - 3x^2 + x + 1$ and $1 + x - 3x^2 + 4x^3$ denote the same polynomial. In elementary algebra descending order is preferred, but for our purposes ascending order is more convenient.

Let $A$ be a commutative ring with unity, and $x$ an arbitrary symbol. Every expression of the form
$$a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$$
is called a polynomial in $x$ with coefficients in $A$, or more simply, a polynomial in $x$ over $A$. The expressions $a_kx^k$, for $k \in \{1, \ldots, n\}$, are called the terms of the polynomial.

Polynomials in $x$ are designated by symbols such as $a(x)$, $b(x)$, $q(x)$, and so on. If $a(x) = a_0 + a_1x + \cdots + a_nx^n$ is any polynomial and $a_kx^k$ is any one of its terms, $a_k$ is called the coefficient of $x^k$. By the degree of a polynomial $a(x)$ we mean the greatest $n$ such that the coefficient of $x^n$ is not zero. In other words, if $a(x)$ has degree $n$, this means that $a_n \neq 0$ but $a_m = 0$ for every $m > n$. The degree of $a(x)$ is symbolized by
$$\deg a(x)$$
For example, $1 + 2x - 3x^2 + x^3$ is a polynomial of degree 3.

The polynomial $0 + 0x + 0x^2 + \cdots$ all of whose coefficients are equal to zero is called the zero polynomial, and is symbolized by 0. It is the only polynomial whose degree is not defined (because it has no nonzero coefficient).

If a nonzero polynomial $a(x) = a_0 + a_1x + \cdots + a_nx^n$ has degree $n$, then $a_n$ is called its leading coefficient: it is the last nonzero coefficient of $a(x)$. The term $a_nx^n$ is then called its leading term, while $a_0$ is called its constant term.
If a polynomial $a(x)$ has degree zero, this means that its constant term $a_0$ is its only nonzero term: $a(x)$ is a constant polynomial. Beware of confusing a polynomial of degree zero with the zero polynomial.

Two polynomials $a(x)$ and $b(x)$ are equal if they have the same degree and corresponding coefficients are equal. Thus, if $a(x) = a_0 + \cdots + a_nx^n$ is of degree $n$, and $b(x) = b_0 + \cdots + b_mx^m$ is of degree $m$, then $a(x) = b(x)$ iff $n = m$ and $a_k = b_k$ for each $k$ from 0 to $n$.

The familiar sigma notation for sums is useful for polynomials. Thus,
$$a(x) = a_0 + a_1x + \cdots + a_nx^n = \sum_{k=0}^{n} a_kx^k$$
with the understanding that $x^0 = 1$.

Addition and multiplication of polynomials is familiar from elementary algebra. We will now define these operations formally. Throughout these definitions we let $a(x)$ and $b(x)$ stand for the following polynomials:
$$a(x) = a_0 + a_1x + \cdots + a_nx^n$$
$$b(x) = b_0 + b_1x + \cdots + b_nx^n$$
Here we do not assume that $a(x)$ and $b(x)$ have the same degree, but allow ourselves to insert zero coefficients if necessary to achieve uniformity of appearance.

We add polynomials by adding corresponding coefficients. Thus,
$$a(x) + b(x) = (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_n + b_n)x^n$$
Note that the degree of $a(x) + b(x)$ is less than or equal to the higher of the two degrees, $\deg a(x)$ and $\deg b(x)$.

Multiplication is more difficult, but quite familiar:
$$a(x)b(x) = a_0b_0 + (a_0b_1 + a_1b_0)x + (a_0b_2 + a_1b_1 + a_2b_0)x^2 + \cdots + a_nb_nx^{2n}$$
In other words, the product of $a(x)$ and $b(x)$ is the polynomial
$$c(x) = c_0 + c_1x + \cdots + c_{2n}x^{2n}$$
whose $k$th coefficient (for any $k$ from 0 to $2n$) is
$$c_k = \sum_{i+j=k} a_ib_j$$
This is the sum of all the $a_ib_j$ for which $i + j = k$. Note that $\deg[a(x)b(x)] \leq \deg a(x) + \deg b(x)$.

If $A$ is any ring, the symbol
$$A[x]$$
designates the set of all the polynomials in $x$ whose coefficients are in $A$, with addition and multiplication of polynomials as we have just defined them.

Theorem 1 Let $A$ be a commutative ring with unity. Then $A[x]$ is a commutative ring with unity.
Proof: To prove this theorem, we must show systematically that $A[x]$ satisfies all the axioms of a commutative ring with unity. Throughout the proof, let $a(x)$, $b(x)$, and $c(x)$ stand for the following polynomials:
$$a(x) = a_0 + a_1x + \cdots + a_nx^n$$
$$b(x) = b_0 + b_1x + \cdots + b_nx^n$$
and
$$c(x) = c_0 + c_1x + \cdots + c_nx^n$$
The axioms which involve only addition are easy to check: for example, addition is commutative because
$$a(x) + b(x) = (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_n + b_n)x^n = (b_0 + a_0) + (b_1 + a_1)x + \cdots + (b_n + a_n)x^n = b(x) + a(x)$$
The associative law of addition is proved similarly, and is left as an exercise. The zero polynomial has already been described, and the negative of $a(x)$ is
$$-a(x) = (-a_0) + (-a_1)x + \cdots + (-a_n)x^n$$
To prove that multiplication is associative requires some care. Let $b(x)c(x) = d(x)$, where $d(x) = d_0 + d_1x + \cdots + d_{2n}x^{2n}$. By the definition of polynomial multiplication, the $k$th coefficient of $b(x)c(x)$ is
$$d_k = \sum_{i+j=k} b_ic_j$$
Then $a(x)[b(x)c(x)] = a(x)d(x) = e(x)$, where $e(x) = e_0 + e_1x + \cdots + e_{3n}x^{3n}$. Now, the $l$th coefficient of $a(x)d(x)$ is
$$e_l = \sum_{h+k=l} a_hd_k = \sum_{h+k=l} a_h\left(\sum_{i+j=k} b_ic_j\right)$$
It is easy to see that the sum on the right consists of all the terms $a_hb_ic_j$ such that $h + i + j = l$. Thus,
$$e_l = \sum_{h+i+j=l} a_hb_ic_j$$
For each $l$ from 0 to $3n$, $e_l$ is the $l$th coefficient of $a(x)[b(x)c(x)]$. If we repeat this process to find the $l$th coefficient of $[a(x)b(x)]c(x)$, we discover that it, too, is $e_l$. Thus,
$$a(x)[b(x)c(x)] = [a(x)b(x)]c(x)$$
To prove the distributive law, let $a(x)[b(x) + c(x)] = d(x)$ where $d(x) = d_0 + d_1x + \cdots + d_{2n}x^{2n}$. By the definitions of polynomial addition and multiplication, the $k$th coefficient of $a(x)[b(x) + c(x)]$ is
$$d_k = \sum_{i+j=k} a_i(b_j + c_j) = \sum_{i+j=k} (a_ib_j + a_ic_j) = \sum_{i+j=k} a_ib_j + \sum_{i+j=k} a_ic_j$$
But $\sum_{i+j=k} a_ib_j$ is exactly the $k$th coefficient of $a(x)b(x)$, and $\sum_{i+j=k} a_ic_j$ is the $k$th coefficient of $a(x)c(x)$, hence $d_k$ is equal to the $k$th coefficient of $a(x)b(x) + a(x)c(x)$.
This proves that

$$a(x)[b(x) + c(x)] = a(x)b(x) + a(x)c(x)$$

The commutative law of multiplication is simple to verify and is left to the student. Finally, the unity polynomial is the constant polynomial 1. ■

Theorem 2 If $A$ is an integral domain, then $A[x]$ is an integral domain.

Proof: If $a(x)$ and $b(x)$ are nonzero polynomials, we must show that their product $a(x)b(x)$ is not zero. Let $a_n$ be the leading coefficient of $a(x)$, and $b_m$ the leading coefficient of $b(x)$. By definition, $a_n \ne 0$ and $b_m \ne 0$. Thus $a_n b_m \ne 0$ because $A$ is an integral domain. It follows that $a(x)b(x)$ has a nonzero coefficient (namely, $a_n b_m$), so it is not the zero polynomial. ■

If $A$ is an integral domain, we refer to $A[x]$ as a domain of polynomials, because $A[x]$ is an integral domain. Note that by the preceding proof, if $a_n$ and $b_m$ are the leading coefficients of $a(x)$ and $b(x)$, then $a_n b_m$ is the leading coefficient of $a(x)b(x)$. Thus, $\deg a(x)b(x) = n + m$:

In a domain of polynomials $A[x]$, where $A$ is an integral domain,

$$\deg[a(x) \cdot b(x)] = \deg a(x) + \deg b(x)$$

In the remainder of this chapter we will look at a property of polynomials which is of special interest when all the coefficients lie in a field. Thus, from this point forward, let $F$ be a field, and let us consider polynomials belonging to $F[x]$.

It would be tempting to believe that if $F$ is a field then $F[x]$ also is a field. However, this is not so, for one can easily see that the multiplicative inverse of a polynomial is not generally a polynomial. Nevertheless, by Theorem 2, $F[x]$ is an integral domain.

Domains of polynomials over a field do, however, have a very special property: any polynomial $a(x)$ may be divided by any nonzero polynomial $b(x)$ to yield a quotient $q(x)$ and a remainder $r(x)$. The remainder is either 0, or if not, its degree is less than the degree of the divisor $b(x)$.
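The division property just described can be sketched in code. This is only an illustration under stated assumptions: the field is taken to be $\mathbb{Q}$ (via Python's `fractions.Fraction`), polynomials are coefficient lists `[c0, c1, ..., cn]`, and the name `poly_divmod` is hypothetical. An empty remainder list stands for the zero polynomial.

```python
from fractions import Fraction


def strip(p):
    """Drop trailing zero coefficients so the last entry is the leading one."""
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_divmod(a, b):
    """Return (q, r) with a = b*q + r and r zero or deg r < deg b, over Q."""
    r = strip([Fraction(c) for c in a])
    b = strip([Fraction(c) for c in b])
    if not b:
        raise ZeroDivisionError("divisor must be a nonzero polynomial")
    q = [Fraction(0)] * max(len(r) - len(b) + 1, 1)
    while len(r) >= len(b):
        shift = len(r) - len(b)      # difference of degrees
        factor = r[-1] / b[-1]       # ratio of leading coefficients
        q[shift] = factor
        # Subtract factor * x^shift * b(x); this cancels the leading term of r.
        for i, bi in enumerate(b):
            r[i + shift] -= factor * bi
        strip(r)
    return q, r


# Divide 2x^3 + 3x + 1 by x^2 + 1: quotient 2x, remainder x + 1.
q, r = poly_divmod([1, 3, 0, 2], [1, 0, 1])
print(q, r)
```

Each pass of the loop removes the leading term of the current remainder, so its degree strictly decreases and the loop must stop. Dividing by the leading coefficient of the divisor is the step that requires the coefficients to lie in a field.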
For example, $x^2$ may be divided by $x - 2$ to give a quotient of $x + 2$ and a remainder of 4:

$$\underbrace{x^2}_{a(x)} = \underbrace{(x - 2)}_{b(x)}\,\underbrace{(x + 2)}_{q(x)} + \underbrace{4}_{r(x)}$$

This kind of polynomial division is familiar to every student of elementary algebra. It is customarily set up as follows:

$$\begin{array}{rcl}
 & & x + 2 \qquad \leftarrow \text{Quotient } q(x) \\
\text{Divisor } b(x) \rightarrow\; x - 2 & \big) & x^2 \qquad \leftarrow \text{Dividend } a(x) \\
 & & x^2 - 2x \\
 & & \qquad\;\; 2x \\
 & & \qquad\;\; 2x - 4 \\
 & & \qquad\qquad\quad 4 \qquad \leftarrow \text{Remainder } r(x)
\end{array}$$

The process of polynomial division is formalized in the next theorem.

Theorem 3: Division algorithm for polynomials If $a(x)$ and $b(x)$ are polynomials over a field $F$, and $b(x) \ne 0$, there exist polynomials $q(x)$ and $r(x)$ over $F$ such that

$$a(x) = b(x)q(x) + r(x)$$

and

$$r(x) = 0 \quad \text{or} \quad \deg r(x) < \deg b(x)$$

Proof: Let $b(x)$ remain fixed, and let us show that every polynomial $a(x)$ satisfies the following condition:

There exist polynomials $q(x)$ and $r(x)$ over $F$ such that $a(x) = b(x)q(x) + r(x)$, and $r(x) = 0$ or $\deg r(x) < \deg b(x)$.

We will assume there are polynomials $a(x)$ which do not fulfill the condition, and from this assumption we will derive a contradiction. Let $a(x)$ be a polynomial of lowest degree which fails to satisfy the condition. Note that $a(x)$ cannot be zero, because we can express 0 as $0 = b(x) \cdot 0 + 0$, whereby $a(x)$ would satisfy the condition. Furthermore, $\deg a(x) \ge \deg b(x)$, for if $\deg a(x) < \deg b(x)$ then we could write $a(x) = b(x) \cdot 0 + a(x)$, so again $a(x)$ would satisfy the given condition.

Let $a(x) = a_0 + \cdots + a_n x^n$ and $b(x) = b_0 + \cdots + b_m x^m$. Define a new polynomial

$$A(x) = a(x) - \frac{a_n}{b_m} x^{n-m} b(x) \qquad (1)$$
$$\phantom{A(x)} = a(x) - \Bigl( b_0 \frac{a_n}{b_m} x^{n-m} + b_1 \frac{a_n}{b_m} x^{n-m+1} + \cdots + a_n x^n \Bigr)$$

This expression is the difference of two polynomials both of degree $n$ and both having the same leading term $a_n x^n$. Because $a_n x^n$ cancels in the subtraction, $A(x)$ has degree less than $n$ (or is the zero polynomial, which satisfies the condition as noted above).

Remember that $a(x)$ is a polynomial of least degree which fails to satisfy the given condition; hence $A(x)$ does satisfy it. This means there are polynomials $p(x)$ and $r(x)$ such that

$$A(x) = b(x)p(x) + r(x)$$

where $r(x) = 0$ or $\deg r(x) < \deg b(x)$.
But then

$$a(x) = A(x) + \frac{a_n}{b_m} x^{n-m} b(x) \qquad \text{by Equation (1)}$$
$$\phantom{a(x)} = b(x)p(x) + r(x) + \frac{a_n}{b_m} x^{n-m} b(x)$$
$$\phantom{a(x)} = b(x)\Bigl( p(x) + \frac{a_n}{b_m} x^{n-m} \Bigr) + r(x)$$

If we let $p(x) + (a_n/b_m)x^{n-m}$ be renamed $q(x)$, then $a(x) = b(x)q(x) + r(x)$, so $a(x)$ fulfills the given condition. This is a contradiction, as required. ■

EXERCISES

A. Elementary Computation in Domains of Polynomials

Remark on notation: In some of the problems which follow, we consider polynomials with coefficients in $\mathbb{Z}_n$ for various $n$. To simplify notation, we denote the elements of $\mathbb{Z}_n$ by $1, 2, \ldots, n - 1$ rather than the more correct $\bar{1}, \bar{2}, \ldots, \overline{n - 1}$.

# 1 Let $a(x) = 2x^2 + 3x + 1$ and $b(x) = x^3 + 5x^2 + x$. Compute $a(x) + b(x)$, $a(x) - b(x)$ and $a(x)b(x)$ in $\mathbb{Z}[x]$, $\mathbb{Z}_5[x]$, $\mathbb{Z}_6[x]$, and $\mathbb{Z}_7[x]$.

2 Find the quotient and remainder when $x^3 + x^2 + x + 1$ is divided by $x^2 + 3x + 2$ in $\mathbb{Z}[x]$ and in $\mathbb{Z}_5[x]$.

3 Find the quotient and remainder when $x^3 + 2$ is divided by $2x^2 + 3x + 4$ in $\mathbb{Z}[x]$, in $\mathbb{Z}_3[x]$, and in $\mathbb{Z}_5[x]$.

We call $b(x)$ a factor of $a(x)$ if $a(x) = b(x)q(x)$ for some $q(x)$, that is, if the remainder when $a(x)$ is divided by $b(x)$ is equal to zero.

4 Show that the following is true in $A[x]$ for any ring $A$: For any odd $n$,
(a) $x + 1$ is a factor of $x^n + 1$.
(b) $x + 1$ is a factor of $x^n + x^{n-1} + \cdots + x + 1$.

5 Prove the following: In $\mathbb{Z}_3[x]$, $x + 2$ is a factor of $x^m + 2$, for all $m$. In $\mathbb{Z}_n[x]$, $x + (n - 1)$ is a factor of $x^m + (n - 1)$, for all $m$ and $n$.

6 Prove that there is no integer $m$ such that $3x^2 + 4x + m$ is a factor of $6x^4 + 50$ in $\mathbb{Z}[x]$.

7 For what values of $n$ is $x^2 + 1$ a factor of $x^3 + 5x + 6$ in $\mathbb{Z}_n[x]$?

B. Problems Involving Concepts and Definitions

1 Is $x^8 + 1 = x^3 + 1$ in $\mathbb{Z}_5[x]$? Explain your answer.

2 Is there any ring $A$ such that in $A[x]$, some polynomial of degree 2 is equal to a polynomial of degree 4? Explain.

# 3 Write all the quadratic polynomials in $\mathbb{Z}_5[x]$. How many are there? How many cubic polynomials are there in $\mathbb{Z}_5[x]$? More generally, how many polynomials of degree $m$ are there in $\mathbb{Z}_n[x]$?
4 Let $A$ be an integral domain; prove the following:
If $(x + 1)^2 = x^2 + 1$ in $A[x]$, then $A$ must have characteristic 2.
If $(x + 1)^4 = x^4 + 1$ in $A[x]$, then $A$ must have characteristic 2.
If $(x + 1)^6 = x^6 + 2x^3 + 1$ in $A[x]$, then $A$ must have characteristic 3.

5 Find an example of each of the following in $\mathbb{Z}_8[x]$: a divisor of zero, an invertible element. (Find nonconstant examples.)

6 Explain why $x$ cannot be invertible in any $A[x]$, hence no domain of polynomials can ever be a field.

7 There are rings such as $P_3$ in which every element $\ne 0, 1$ is a divisor of zero. Explain why this cannot happen in any ring of polynomials $A[x]$, even when $A$ is not an integral domain.

8 Show that in every $A[x]$, there are elements $\ne 0, 1$ which are not idempotent, and elements $\ne 0, 1$ which are not nilpotent.

C. Rings A[x] Where A Is Not an Integral Domain

1 Prove: If $A$ is not an integral domain, neither is $A[x]$.

2 Give examples of divisors of zero, of degrees 0, 1, and 2, in $\mathbb{Z}_4[x]$.

3 In $\mathbb{Z}_{10}[x]$, $(2x + 2)(2x + 2) = (2x + 2)(5x^2 + 2x + 2)$, yet $(2x + 2)$ cannot be canceled in this equation. Explain why this is possible in $\mathbb{Z}_{10}[x]$, but not in $\mathbb{Z}_5[x]$.

4 Give examples in $\mathbb{Z}_4[x]$, in $\mathbb{Z}_6[x]$, and in $\mathbb{Z}_9[x]$ of polynomials $a(x)$ and $b(x)$ such that $\deg a(x)b(x) < \deg a(x) + \deg b(x)$.

5 If $A$ is an integral domain, we have seen that in $A[x]$,

$$\deg a(x)b(x) = \deg a(x) + \deg b(x)$$

Show that if $A$ is not an integral domain, we can always find polynomials $a(x)$ and $b(x)$ such that $\deg a(x)b(x) < \deg a(x) + \deg b(x)$.

6 Show that if $A$ is an integral domain, the only invertible elements in $A[x]$ are the constant polynomials with inverses in $A$. Then show that in $\mathbb{Z}_4[x]$ there are invertible polynomials of all degrees.

# 7 Give all the ways of factoring $x^2$ into polynomials of degree 1 in $\mathbb{Z}_9[x]$; in $\mathbb{Z}_5[x]$. Explain the difference in behavior.

8 Find all the square roots of $x^2 + x + 4$ in $\mathbb{Z}_5[x]$. Show that in $\mathbb{Z}_8[x]$, there are infinitely many square roots of 1.

D. Domains A[x] Where A Has Finite Characteristic

In each of the following, let $A$ be an integral domain:

1 Prove that if $A$ has characteristic $p$, then $A[x]$ has characteristic $p$.

2 Use part 1 to give an example of an infinite integral domain with finite characteristic.

3 Prove: If $A$ has characteristic 3, then $x + 2$ is a factor of $x^m + 2$ for all $m$. More generally, if $A$ has characteristic $p$, then $x + (p - 1)$ is a factor of $x^m + (p - 1)$ for all $m$.

4 Prove that if $A$ has characteristic $p$, then in $A[x]$, $(x + c)^p = x^p + c^p$. (You may use essentially the same argument as in the proof of Theorem 3, Chapter 20.)

5 Explain why the following "proof" of part 4 is not valid: $(x + c)^p = x^p + c^p$ in $A[x]$ because $(a + c)^p = a^p + c^p$ for all $a, c \in A$. (Note the following example: in $\mathbb{Z}_2$, $a^2 + 1 = a^4 + 1$ for every $a$, yet $x^2 + 1 \ne x^4 + 1$ in $\mathbb{Z}_2[x]$.)

# 6 Use the same argument as in part 4 to prove that if $A$ has characteristic $p$, then $[a(x) + b(x)]^p = a(x)^p + b(x)^p$ for any $a(x), b(x) \in A[x]$. Use this to prove:

$$(a_0 + a_1 x + \cdots + a_n x^n)^p = a_0^p + a_1^p x^p + \cdots + a_n^p x^{np}$$

E. Subrings and Ideals in A[x]

1 Show that if $B$ is a subring of $A$, then $B[x]$ is a subring of $A[x]$.

2 If $B$ is an ideal of $A$, $B[x]$ is an ideal of $A[x]$.

3 Let $S$ be the set of all the polynomials $a(x)$ in $A[x]$ for which every coefficient $a_i$ for odd $i$ is equal to zero. Show that $S$ is a subring of $A[x]$. Why is the same not true when "odd" is replaced by "even"?

4 Let $J$ consist of all the elements in $A[x]$ whose constant coefficient is equal to zero. Prove that $J$ is an ideal of $A[x]$.

# 5 Let $J$ consist of all the polynomials $a_0 + a_1 x + \cdots + a_n x^n$ in $A[x]$ such that $a_0 + a_1 + \cdots + a_n = 0$. Prove that $J$ is an ideal of $A[x]$.

6 Prove that the ideals in both parts 4 and 5 are prime ideals. (Assume $A$ is an integral domain.)

F. Homomorphisms of Domains of Polynomials

Let $A$ be an integral domain.

1 Let $h : A[x] \to A$ map every polynomial to its constant coefficient; that is,

$$h(a_0 + a_1 x + \cdots + a_n x^n) = a_0$$

Prove that $h$ is a homomorphism from $A[x]$ onto $A$, and describe its kernel.
2 Explain why the kernel of $h$ in part 1 consists of all the products $xa(x)$, for all $a(x) \in A[x]$. Why is this the same as the principal ideal $\langle x \rangle$ in $A[x]$?

3 Using parts 1 and 2, explain why $A[x]/\langle x \rangle \cong A$.

4 Let $g : A[x] \to A$ send every polynomial to the sum of its coefficients. Prove that $g$ is a surjective homomorphism, and describe its kernel.

5 If $c \in A$, let $h : A[x] \to A[x]$ be defined by $h(a(x)) = a(cx)$, that is,

$$h(a_0 + a_1 x + \cdots + a_n x^n) = a_0 + a_1 c x + a_2 c^2 x^2 + \cdots + a_n c^n x^n$$

Prove that $h$ is a homomorphism and describe its kernel.

6 If $h$ is the homomorphism of part 5, prove that $h$ is an automorphism (isomorphism from $A[x]$ to itself) iff $c$ is invertible.

G. Homomorphisms of Polynomial Domains Induced by a Homomorphism of the Ring of Coefficients

Let $A$ and $B$ be rings and let $h : A \to B$ be a homomorphism with kernel $K$. Define $\bar{h} : A[x] \to B[x]$ by

$$\bar{h}(a_0 + a_1 x + \cdots + a_n x^n) = h(a_0) + h(a_1)x + \cdots + h(a_n)x^n$$

(We say that $\bar{h}$ is induced by $h$.)

1 Prove that $\bar{h}$ is a homomorphism from $A[x]$ to $B[x]$.

2 Describe the kernel $\bar{K}$ of $\bar{h}$.

# 3 Prove that $\bar{h}$ is surjective iff $h$ is surjective.

4 Prove that $\bar{h}$ is injective iff $h$ is injective.

5 Prove that if $a(x)$ is a factor of $b(x)$, then $\bar{h}(a(x))$ is a factor of $\bar{h}(b(x))$.

6 If $h : \mathbb{Z} \to \mathbb{Z}_n$ is the natural homomorphism, let $\bar{h} : \mathbb{Z}[x] \to \mathbb{Z}_n[x]$ be the homomorphism induced by $h$. Prove that $\bar{h}(a(x)) = 0$ iff $n$ divides every coefficient of $a(x)$.

7 Let $\bar{h}$ be as in part 6, and let $n$ be a prime. Prove that if $a(x)b(x) \in \ker \bar{h}$, then either $a(x)$ or $b(x)$ is in $\ker \bar{h}$. (Hint: Use Exercise F2 of Chapter 19.)

H. Polynomials in Several Variables

$A[x_1, x_2]$ denotes the ring of all the polynomials in two letters $x_1$ and $x_2$ with coefficients in $A$. For example, $x^2 - 2xy + y^2 + x - 5$ is a quadratic polynomial in $\mathbb{Q}[x, y]$. More generally, $A[x_1, \ldots, x_n]$ is the ring of the polynomials in $n$ letters $x_1, \ldots, x_n$ with coefficients in $A$. Formally it is defined as follows: Let $A[x_1]$ be denoted by $A_1$; then $A_1[x_2]$ is $A[x_1, x_2]$.
Continuing in this fashion, we may adjoin one new letter $x_i$ at a time, to get $A[x_1, \ldots, x_n]$.

1 Prove that if $A$ is an integral domain, then $A[x_1, \ldots, x_n]$ is an integral domain.

2 Give a reasonable definition of the degree of any polynomial $p(x, y)$ in $A[x, y]$ and then list all the polynomials of degree $\le 3$ in $\mathbb{Z}_3[x, y]$.

Let us denote an arbitrary polynomial $p(x, y)$ in $A[x, y]$ by $\sum a_{ij} x^i y^j$ where $\sum$ ranges over some pairs $i, j$ of nonnegative integers.

3 Imitating the definitions of sum and product of polynomials in $A[x]$, give a definition of sum and product of polynomials in $A[x, y]$.

4 Prove that $\deg a(x, y)b(x, y) = \deg a(x, y) + \deg b(x, y)$ if $A$ is an integral domain.

I. Fields of Polynomial Quotients

Let $A$ be an integral domain. By the closing part of Chapter 20, every integral domain can be extended to a "field of quotients." Thus, $A[x]$ can be extended to a field of polynomial quotients, which is denoted by $A(x)$. Note that $A(x)$ consists of all the fractions $a(x)/b(x)$ for $a(x)$ and $b(x) \ne 0$ in $A[x]$, and these fractions are added, subtracted, multiplied, and divided in the customary way.

1 Show that $A(x)$ has the same characteristic as $A$.

2 Using part 1, explain why there is an infinite field of characteristic $p$, for every prime $p$.

3 If $A$ and $B$ are integral domains and $h : A \to B$ is an isomorphism, prove that $h$ determines an isomorphism $\bar{h} : A(x) \to B(x)$.

J. Division Algorithm: Uniqueness of Quotient and Remainder

In the division algorithm, prove that $q(x)$ and $r(x)$ are uniquely determined. [Hint: Suppose $a(x) = b(x)q_1(x) + r_1(x) = b(x)q_2(x) + r_2(x)$, and subtract these two expressions, which are both equal to $a(x)$.]

CHAPTER TWENTY-FIVE

FACTORING POLYNOMIALS

Just as every integer can be factored into primes, so every polynomial can be factored into "irreducible" polynomials which cannot be factored further. As a matter of fact, polynomials behave very much like integers when it comes to factoring them.
This is especially true when the polynomials have all their coefficients in a field.

Throughout this chapter, we let $F$ represent some field and we consider polynomials over $F$. It will be found that $F[x]$ has a considerable number of properties in common with $\mathbb{Z}$. To begin with, all the ideals of $F[x]$ are principal ideals, which was also the case for the ideals of $\mathbb{Z}$.

Note carefully that in $F[x]$, the principal ideal generated by a polynomial $a(x)$ consists of all the products $a(x)s(x)$ as $a(x)$ remains fixed and $s(x)$ ranges over all the members of $F[x]$.

Theorem 1 Every ideal of $F[x]$ is principal.

Proof: Let $J$ be any ideal of $F[x]$. If $J$ contains nothing but the zero polynomial, $J$ is the principal ideal generated by 0. If there are nonzero polynomials in $J$, let $b(x)$ be any polynomial of lowest degree in $J$. We will show that $J = \langle b(x) \rangle$, which is to say that every element of $J$ is a polynomial multiple $b(x)q(x)$ of $b(x)$.

Indeed, if $a(x)$ is any element of $J$, we may use the division algorithm to write $a(x) = b(x)q(x) + r(x)$, where $r(x) = 0$ or $\deg r(x) < \deg b(x)$. Now, $r(x) = a(x) - b(x)q(x)$; but $a(x)$ was chosen in $J$, and $b(x) \in J$; hence $b(x)q(x) \in J$. It follows that $r(x)$ is in $J$.

If $r(x) \ne 0$, its degree is less than the degree of $b(x)$. But this is impossible because $b(x)$ is a polynomial of lowest degree in $J$. Therefore, of necessity, $r(x) = 0$. Thus, finally, $a(x) = b(x)q(x)$; so every member of $J$ is a multiple of $b(x)$, as claimed. ■

It follows that every ideal $J$ of $F[x]$ is principal. In fact, as the proof above indicates, $J$ is generated by any one of its members of lowest degree.

Throughout the discussion which follows, remember that we are considering polynomials in a fixed domain $F[x]$ where $F$ is a field.

Let $a(x)$ and $b(x)$ be in $F[x]$. We say that $b(x)$ is a multiple of $a(x)$ if

$$b(x) = a(x)s(x)$$

for some polynomial $s(x)$ in $F[x]$. If $b(x)$ is a multiple of $a(x)$, we also say that $a(x)$ is a factor of $b(x)$, or that $a(x)$ divides $b(x)$.
In symbols, we write

$$a(x) \mid b(x)$$

Every nonzero constant polynomial divides every polynomial. For if $c \ne 0$ is constant and $a(x) = a_0 + \cdots + a_n x^n$, then

$$a_0 + a_1 x + \cdots + a_n x^n = c\Bigl( \frac{a_0}{c} + \frac{a_1}{c} x + \cdots + \frac{a_n}{c} x^n \Bigr)$$

hence $c \mid a(x)$. A polynomial $a(x)$ is invertible iff it is a divisor of the unity polynomial 1. But if $a(x)b(x) = 1$, this means that $a(x)$ and $b(x)$ both have degree 0, that is, are constant polynomials: $a(x) = a$, $b(x) = b$, and $ab = 1$. Thus,

the invertible elements of $F[x]$ are all the nonzero constant polynomials.

A pair of nonzero polynomials $a(x)$ and $b(x)$ are called associates if they divide one another: $a(x) \mid b(x)$ and $b(x) \mid a(x)$. That is to say,

$$a(x) = b(x)c(x) \quad \text{and} \quad b(x) = a(x)d(x)$$

for some $c(x)$ and $d(x)$. If this happens to be the case, then

$$a(x) = b(x)c(x) = a(x)d(x)c(x)$$

hence $d(x)c(x) = 1$ because $F[x]$ is an integral domain. But then $c(x)$ and $d(x)$ are constant polynomials, and therefore $a(x)$ and $b(x)$ are constant multiples of each other. Thus, in $F[x]$,

$a(x)$ and $b(x)$ are associates iff they are constant multiples of each other.

If $a(x) = a_0 + \cdots + a_n x^n$, the associates of $a(x)$ are all its nonzero constant multiples. Among these multiples is the polynomial

$$\frac{a_0}{a_n} + \frac{a_1}{a_n} x + \cdots + x^n$$

which is equal to $(1/a_n)a(x)$, and which has 1 as its leading coefficient. Any polynomial whose leading coefficient is equal to 1 is called monic. Thus, every nonzero polynomial $a(x)$ has a unique monic associate. For example, the monic associate of $3 + 4x + 2x^2$ is $\frac{3}{2} + 2x + x^2$.

A polynomial $d(x)$ is called a greatest common divisor of $a(x)$ and $b(x)$ if $d(x)$ divides $a(x)$ and $b(x)$, and is a multiple of any other common divisor of $a(x)$ and $b(x)$; in other words,

(i) $d(x) \mid a(x)$ and $d(x) \mid b(x)$, and
(ii) For any $u(x)$ in $F[x]$, if $u(x) \mid a(x)$ and $u(x) \mid b(x)$, then $u(x) \mid d(x)$.

According to this definition, two different gcd's of $a(x)$ and $b(x)$ divide each other, that is, are associates. Of all the possible gcd's of $a(x)$ and $b(x)$, we select the monic one, call it the gcd of $a(x)$ and $b(x)$, and denote it by $\gcd[a(x), b(x)]$.
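The passage from a nonzero polynomial to its monic associate, used above to single out one gcd among its associates, is a one-line computation. The sketch below assumes coefficients in $\mathbb{Q}$ and coefficient lists `[a0, a1, ..., an]`; the name `monic_associate` is a hypothetical helper, not notation from the text.

```python
from fractions import Fraction


def monic_associate(a):
    """Return (1/a_n) * a(x) for a nonzero polynomial [a0, ..., an] over Q."""
    coeffs = [Fraction(c) for c in a]
    # Drop trailing zeros so the last entry really is the leading coefficient.
    while coeffs and coeffs[-1] == 0:
        coeffs.pop()
    if not coeffs:
        raise ValueError("the zero polynomial has no monic associate")
    lead = coeffs[-1]
    return [c / lead for c in coeffs]


# The example from the text: 3 + 4x + 2x^2 has monic associate 3/2 + 2x + x^2.
print(monic_associate([3, 4, 2]))
```

Two nonzero polynomials over a field are associates exactly when this function returns the same list for both, since associates differ only by a nonzero constant factor.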
It is important to know that any pair of polynomials always has a greatest common divisor.

Theorem 2 Any two nonzero polynomials $a(x)$ and $b(x)$ in $F[x]$ have a gcd $d(x)$. Furthermore, $d(x)$ can be expressed as a "linear combination"

$$d(x) = r(x)a(x) + s(x)b(x)$$

where $r(x)$ and $s(x)$ are in $F[x]$.

Proof: The proof is analogous to the proof of the corresponding theorem for integers. If $J$ is the set of all the linear combinations

$$u(x)a(x) + v(x)b(x)$$

as $u(x)$ and $v(x)$ range over $F[x]$, then $J$ is an ideal of $F[x]$, say the ideal $\langle d(x) \rangle$ generated by $d(x)$. Now $a(x) = 1a(x) + 0b(x)$ and $b(x) = 0a(x) + 1b(x)$, so $a(x)$ and $b(x)$ are in $J$. But every element of $J$ is a multiple of $d(x)$, so

$$d(x) \mid a(x) \quad \text{and} \quad d(x) \mid b(x)$$

If $k(x)$ is any common divisor of $a(x)$ and $b(x)$, this means there are polynomials $f(x)$ and $g(x)$ such that $a(x) = k(x)f(x)$ and $b(x) = k(x)g(x)$. Now, $d(x) \in J$, so $d(x)$ can be written as a linear combination

$$d(x) = r(x)a(x) + s(x)b(x) = r(x)k(x)f(x) + s(x)k(x)g(x) = k(x)[r(x)f(x) + s(x)g(x)]$$

hence $k(x) \mid d(x)$. This confirms that $d(x)$ is the gcd of $a(x)$ and $b(x)$. ■

Polynomials $a(x)$ and $b(x)$ in $F[x]$ are said to be relatively prime if their gcd is equal to 1. (This is equivalent to saying that their only common factors are constants in $F$.)

A polynomial $a(x)$ of positive degree is said to be reducible over $F$ if there are polynomials $b(x)$ and $c(x)$ in $F[x]$, both of positive degree, such that

$$a(x) = b(x)c(x)$$

Because $b(x)$ and $c(x)$ both have positive degrees, and the sum of their degrees is $\deg a(x)$, each has degree less than $\deg a(x)$.

A polynomial $p(x)$ of positive degree in $F[x]$ is said to be irreducible over $F$ if it cannot be expressed as the product of two polynomials of positive degree in $F[x]$. Thus, $p(x)$ is irreducible iff it is not reducible.

When we say that a polynomial $p(x)$ is irreducible, it is important that we specify irreducible over the field $F$. A polynomial may be irreducible over $F$, yet reducible over a larger field $E$.
For example, $p(x) = x^2 + 1$ is irreducible over $\mathbb{R}$; but over $\mathbb{C}$ it has factors $(x + i)(x - i)$.

We next state the analogs for polynomials of Euclid's lemma and its corollaries. The proofs are almost identical to their counterparts in $\mathbb{Z}$; therefore they are left as exercises.

Euclid's lemma for polynomials Let $p(x)$ be irreducible. If $p(x) \mid a(x)b(x)$, then $p(x) \mid a(x)$ or $p(x) \mid b(x)$.

Corollary 1 Let $p(x)$ be irreducible. If $p(x) \mid a_1(x)a_2(x) \cdots a_n(x)$, then $p(x) \mid a_i(x)$ for one of the factors $a_i(x)$ among $a_1(x), \ldots, a_n(x)$.

Corollary 2 Let $q_1(x), \ldots, q_r(x)$ and $p(x)$ be monic irreducible polynomials. If $p(x) \mid q_1(x) \cdots q_r(x)$, then $p(x)$ is equal to one of the factors $q_1(x), \ldots, q_r(x)$.

Theorem 3: Factorization into irreducible polynomials Every polynomial $a(x)$ of positive degree in $F[x]$ can be written as a product

$$a(x) = k p_1(x) p_2(x) \cdots p_r(x)$$

where $k$ is a constant in $F$ and $p_1(x), \ldots, p_r(x)$ are monic irreducible polynomials of $F[x]$.

If this were not true, we could choose a polynomial $a(x)$ of lowest degree among those which cannot be factored into irreducibles. Then $a(x)$ is reducible, so $a(x) = b(x)c(x)$ where $b(x)$ and $c(x)$ have lower degree than $a(x)$. But this means that $b(x)$ and $c(x)$ can be factored into irreducibles, and therefore $a(x)$ can also.

Theorem 4: Unique factorization If $a(x)$ can be written in two ways as a product of monic irreducibles, say

$$a(x) = k p_1(x) \cdots p_r(x) = l q_1(x) \cdots q_s(x)$$

then $k = l$, $r = s$, and each $p_i(x)$ is equal to a $q_j(x)$.

The proof is the same, in all major respects, as the corresponding proof for $\mathbb{Z}$; it is left as an exercise.

In the next chapter we will be able to improve somewhat on the last two results in the special cases of $\mathbb{R}[x]$ and $\mathbb{C}[x]$. Also, we will learn more about factoring polynomials into irreducibles.

EXERCISES

A. Examples of Factoring into Irreducible Factors

1 Factor $x^4 - 4$ into irreducible factors over $\mathbb{Q}$, over $\mathbb{R}$, and over $\mathbb{C}$.

2 Factor $x^6 - 16$ into irreducible factors over $\mathbb{Q}$, over $\mathbb{R}$, and over $\mathbb{C}$.
3 Find all the irreducible polynomials of degree $\le 4$ in $\mathbb{Z}_2[x]$.

# 4 Show that $x^2 + 2$ is irreducible in $\mathbb{Z}_5[x]$. Then factor $x^4 - 4$ into irreducible factors in $\mathbb{Z}_5[x]$. (By Theorem 3, it is sufficient to search for monic factors.)

5 Factor $2x^3 + 4x + 1$ in $\mathbb{Z}_5[x]$. (Factor it as in Theorem 3.)

6 In $\mathbb{Z}_6[x]$, factor each of the following into two polynomials of degree 1: $x$, $x + 2$, $x + 3$. Why is this possible?

B. Short Questions Relating to Irreducible Polynomials

Let $F$ be a field. Explain why each of the following is true in $F[x]$:

1 Every polynomial of degree 1 is irreducible.

2 If $a(x)$ and $b(x)$ are distinct monic polynomials, they cannot be associates.

3 Any two distinct irreducible polynomials are relatively prime.

4 If $a(x)$ is irreducible, any associate of $a(x)$ is irreducible.

5 If $a(x) \ne 0$, $a(x)$ cannot be an associate of 0.

6 In $\mathbb{Z}_p[x]$, every nonzero polynomial has exactly $p - 1$ associates.

7 $x^2 + 1$ is reducible in $\mathbb{Z}_p[x]$ iff $p = a + b$ where $ab \equiv 1 \pmod{p}$.

C. Number of Irreducible Quadratics over a Finite Field

1 Without finding them, determine how many reducible monic quadratics there are in $\mathbb{Z}_5[x]$. [Hint: Every reducible monic quadratic can be uniquely factored as $(x + a)(x + b)$.]

2 How many reducible quadratics are there in $\mathbb{Z}_5[x]$? How many irreducible quadratics?

3 Generalize: How many irreducible quadratics are there over a finite field of $n$ elements?

4 How many irreducible cubics are there over a field of $n$ elements?

D. Ideals in Domains of Polynomials

Let $F$ be a field, and let $J$ designate any ideal of $F[x]$. Prove parts 1-4.

1 Any two generators of $J$ are associates.

2 $J$ has a unique monic generator $m(x)$. An arbitrary polynomial $a(x) \in F[x]$ is in $J$ iff $m(x) \mid a(x)$.

3 $J$ is a prime ideal iff it has an irreducible generator.

# 4 If $p(x)$ is irreducible, then $\langle p(x) \rangle$ is a maximal ideal of $F[x]$. (See Chapter 18, Exercise H5.)

5 Let $S$ be the set of all polynomials $a_0 + a_1 x + \cdots + a_n x^n$ in $F[x]$ which satisfy $a_0 + a_1 + \cdots + a_n = 0$.
It has been shown (Chapter 24, Exercise E5) that $S$ is an ideal of $F[x]$. Prove that $x - 1 \in S$, and explain why it follows that $S = \langle x - 1 \rangle$.

6 Conclude from part 5 that $F[x]/\langle x - 1 \rangle \cong F$. (See Chapter 24, Exercise F4.)

7 Let $F[x, y]$ denote the domain of all the polynomials $\sum a_{ij} x^i y^j$ in two letters $x$ and $y$, with coefficients in $F$. Let $J$ be the ideal of $F[x, y]$ which contains all the polynomials whose constant coefficient is zero. Prove that $J$ is not a principal ideal. Conclude that Theorem 1 is not true in $F[x, y]$.

E. Proof of the Unique Factorization Theorem

1 Prove Euclid's lemma for polynomials.

2 Prove the two corollaries of Euclid's lemma.

3 Prove the unique factorization theorem for polynomials.

F. A Method for Computing the gcd

Let $a(x)$ and $b(x)$ be polynomials of positive degree. By the division algorithm, we may divide $a(x)$ by $b(x)$:

$$a(x) = b(x)q_1(x) + r_1(x)$$

1 Prove that every common divisor of $a(x)$ and $b(x)$ is a common divisor of $b(x)$ and $r_1(x)$.

It follows from part 1 that the gcd of $a(x)$ and $b(x)$ is the same as the gcd of $b(x)$ and $r_1(x)$. This procedure can now be repeated on $b(x)$ and $r_1(x)$; divide $b(x)$ by $r_1(x)$:

$$b(x) = r_1(x)q_2(x) + r_2(x)$$

Next

$$r_1(x) = r_2(x)q_3(x) + r_3(x)$$

Finally,

$$r_{n-1}(x) = r_n(x)q_{n+1}(x) + 0$$

In other words, we continue to divide each remainder by the succeeding remainder. Since the remainders continually decrease in degree, there must ultimately be a zero remainder. But we have seen that

$$\gcd[a(x), b(x)] = \gcd[b(x), r_1(x)] = \cdots = \gcd[r_{n-1}(x), r_n(x)]$$

Since $r_n(x)$ is a divisor of $r_{n-1}(x)$, it must be the gcd of $r_n(x)$ and $r_{n-1}(x)$. Thus,

$$r_n(x) = \gcd[a(x), b(x)]$$

This method is called the euclidean algorithm for finding the gcd.

# 2 Find the gcd of $x^3 + 1$ and $x^4 + x^3 + 2x^2 + x - 1$. Express this gcd as a linear combination of the two polynomials.

3 Do the same for $x^{24} - 1$ and $x^{15} - 1$.

4 Find the gcd of $x^3 + x^2 + x + 1$ and $x^4 + x^3 + 2x^2 + 2x$ in $\mathbb{Z}_3[x]$.

G. A Transformation of F[x]

Let $G$ be the subset of $F[x]$ consisting of all polynomials whose constant term is nonzero. Let $h : G \to G$ be defined by

$$h(a_0 + a_1 x + \cdots + a_n x^n) = a_n + a_{n-1} x + \cdots + a_0 x^n$$

Prove parts 1-3:

1 $h$ preserves multiplication, that is, $h[a(x)b(x)] = h[a(x)]h[b(x)]$.

2 $h$ is injective and surjective and $h \circ h = \varepsilon$.

3 $a_0 + a_1 x + \cdots + a_n x^n$ is irreducible iff $a_n + a_{n-1} x + \cdots + a_0 x^n$ is irreducible.

4 Let $a_0 + a_1 x + \cdots + a_n x^n = (b_0 + \cdots + b_m x^m)(c_0 + \cdots + c_q x^q)$. Factor

$$a_n + a_{n-1} x + \cdots + a_0 x^n$$

5 Let $a(x) = a_0 + a_1 x + \cdots + a_n x^n$ and $\hat{a}(x) = a_n + a_{n-1} x + \cdots + a_0 x^n$. If $c \in F$, prove that $a(c) = 0$ iff $\hat{a}(1/c) = 0$.

CHAPTER TWENTY-SIX

SUBSTITUTION IN POLYNOMIALS

Up to now we have treated polynomials as formal expressions. If $a(x)$ is a polynomial over a field $F$, say

$$a(x) = a_0 + a_1 x + \cdots + a_n x^n$$

this means that the coefficients $a_0, a_1, \ldots, a_n$ are elements of the field $F$, while the letter $x$ is a placeholder which plays no other role than to occupy a given position.

When we dealt with polynomials in elementary algebra, it was quite different. The letter $x$ was called an unknown and was allowed to assume numerical values. This made $a(x)$ into a function having $x$ as its independent variable. Such a function is called a polynomial function.

This chapter is devoted to the study of polynomial functions. We begin with a few careful definitions.

Let $a(x) = a_0 + a_1 x + \cdots + a_n x^n$ be a polynomial over $F$. If $c$ is any element of $F$, then

$$a_0 + a_1 c + \cdots + a_n c^n$$

is also an element of $F$, obtained by substituting $c$ for $x$ in the polynomial $a(x)$. This element is denoted by $a(c)$. Thus,

$$a(c) = a_0 + a_1 c + \cdots + a_n c^n$$

Since we may substitute any element of $F$ for $x$, we may regard $a(x)$ as a function from $F$ to $F$. As such, it is called a polynomial function on $F$.

The difference between a polynomial and a polynomial function is mainly a difference of viewpoint.
Given $a(x)$ with coefficients in $F$: if $x$ is regarded merely as a placeholder, then $a(x)$ is a polynomial; if $x$ is allowed to assume values in $F$, then $a(x)$ is a polynomial function. The difference is a small one, and we will not make an issue of it.

If $a(x)$ is a polynomial with coefficients in $F$, and $c$ is an element of $F$ such that

$$a(c) = 0$$

then we call $c$ a root of $a(x)$. For example, 2 is a root of the polynomial $3x^2 + x - 14 \in \mathbb{R}[x]$, because $3 \cdot 2^2 + 2 - 14 = 0$.

There is an absolutely fundamental connection between roots of a polynomial and factors of that polynomial. This connection is explored in the following pages, beginning with the next theorem:

Let $a(x)$ be a polynomial over a field $F$.

Theorem 1 $c$ is a root of $a(x)$ iff $x - c$ is a factor of $a(x)$.

Proof: If $x - c$ is a factor of $a(x)$, this means that $a(x) = (x - c)q(x)$ for some $q(x)$. Thus, $a(c) = (c - c)q(c) = 0$, so $c$ is a root of $a(x)$. Conversely, if $c$ is a root of $a(x)$, we may use the division algorithm to divide $a(x)$ by $x - c$: $a(x) = (x - c)q(x) + r(x)$. The remainder $r(x)$ is either 0 or a polynomial of lower degree than $x - c$; but lower degree than $x - c$ means that $r(x)$ is a constant polynomial: $r(x) = r$ for some constant $r$ in $F$. Then

$$0 = a(c) = (c - c)q(c) + r = 0 + r = r$$

Thus, $r = 0$, and therefore $x - c$ is a factor of $a(x)$. ■

Theorem 1 tells us that if $c$ is a root of $a(x)$, then $x - c$ is a factor of $a(x)$ (and vice versa). This is easily extended: if $c_1$ and $c_2$ are two roots of $a(x)$, then $x - c_1$ and $x - c_2$ are two factors of $a(x)$. Similarly, three roots give rise to three factors, four roots to four factors, and so on. This is stated concisely in the next theorem.

Theorem 2 If $a(x)$ has distinct roots $c_1, \ldots, c_m$ in $F$, then $(x - c_1)(x - c_2) \cdots (x - c_m)$ is a factor of $a(x)$.

Proof: To prove this, let us first make a simple observation: if a polynomial $a(x)$ can be factored, any root of $a(x)$ must be a root of one of its factors. Indeed, if $a(x) = s(x)t(x)$ and $a(c) = 0$, then $s(c)t(c) = 0$, and therefore either $s(c) = 0$ or $t(c) = 0$.
Let $c_1, \ldots, c_m$ be distinct roots of $a(x)$. By Theorem 1,

$$a(x) = (x - c_1)q_1(x)$$

By our observation in the preceding paragraph, $c_2$ must be a root of $x - c_1$ or of $q_1(x)$. It cannot be a root of $x - c_1$ because $c_2 - c_1 \ne 0$; so $c_2$ is a root of $q_1(x)$. Thus, $q_1(x) = (x - c_2)q_2(x)$, and therefore

$$a(x) = (x - c_1)(x - c_2)q_2(x)$$

Repeating this argument for each of the remaining roots gives us our result. ■

An immediate consequence is the following important fact:

Theorem 3 If $a(x)$ has degree $n$, it has at most $n$ roots.

Proof: If $a(x)$ had $n + 1$ roots $c_1, \ldots, c_{n+1}$, then by Theorem 2, $(x - c_1) \cdots (x - c_{n+1})$ would be a factor of $a(x)$, and the degree of $a(x)$ would therefore be at least $n + 1$. ■

It was stated earlier in this chapter that the difference between polynomials and polynomial functions is mainly a difference of viewpoint. Mainly, but not entirely! Remember that two polynomials $a(x)$ and $b(x)$ are equal iff corresponding coefficients are equal, whereas two functions $a(x)$ and $b(x)$ are equal iff $a(x) = b(x)$ for every $x$ in their domain. These two notions of equality do not always coincide!

For example, consider the following two polynomials in $\mathbb{Z}_5[x]$:

$$a(x) = x^5 + 1$$
$$b(x) = x - 4$$

You may check that $a(0) = b(0)$, $a(1) = b(1)$, $\ldots$, $a(4) = b(4)$; hence $a(x)$ and $b(x)$ are equal functions from $\mathbb{Z}_5$ to $\mathbb{Z}_5$. But as polynomials, $a(x)$ and $b(x)$ are quite distinct! (They do not even have the same degree.)

It is reassuring to know that this cannot happen when the field $F$ is infinite. Suppose $a(x)$ and $b(x)$ are polynomials over a field $F$ which has infinitely many elements. If $a(x)$ and $b(x)$ are equal as functions, this means that $a(c) = b(c)$ for every $c \in F$. Define the polynomial $d(x)$ to be the difference of $a(x)$ and $b(x)$: $d(x) = a(x) - b(x)$. Then $d(c) = 0$ for every $c \in F$. Now, if $d(x)$ were not the zero polynomial, it would be a polynomial (with some finite degree $n$) having infinitely many roots, and by Theorem 3 this is impossible!
Thus, $d(x)$ is the zero polynomial (all its coefficients are equal to zero), and therefore $a(x)$ is the same polynomial as $b(x)$. (They have the same coefficients.)

This tells us that if $F$ is a field with infinitely many elements (such as $\mathbb{Q}$, $\mathbb{R}$, or $\mathbb{C}$), there is no need to distinguish between polynomials and polynomial functions. The difference is, indeed, just a difference of viewpoint.

POLYNOMIALS OVER Z AND Q

In scientific computation a great many functions can be approximated by polynomials, usually polynomials whose coefficients are integers or rational numbers. Such polynomials are therefore of great practical interest. It is easy to find the rational roots of such polynomials, and to determine if a polynomial over $\mathbb{Q}$ is irreducible over $\mathbb{Q}$. We will do these things next.

First, let us make an important observation:

Let $a(x)$ be a polynomial with rational coefficients, say

$$a(x) = \frac{k_0}{l_0} + \frac{k_1}{l_1} x + \cdots + \frac{k_n}{l_n} x^n$$

We may factor out $1/(l_0 l_1 \cdots l_n)$ from every term to get

$$a(x) = \frac{1}{l_0 l_1 \cdots l_n}\, b(x), \qquad b(x) = k_0 l_1 \cdots l_n + k_1 l_0 l_2 \cdots l_n\, x + \cdots + k_n l_0 \cdots l_{n-1}\, x^n$$

The polynomial $b(x)$ has integer coefficients; and since it differs from $a(x)$ only by a constant factor, it has the same roots as $a(x)$. Thus, for every polynomial with rational coefficients, there is a polynomial with integer coefficients having the same roots. Therefore, for the present we will confine our attention to polynomials with integer coefficients. The next theorem makes it easy to find all the rational roots of such polynomials:

Let $s/t$ be a rational number in simplest form (that is, the integers $s$ and $t$ do not have a common factor greater than 1). Let $a(x) = a_0 + \cdots + a_n x^n$ be a polynomial with integer coefficients.

Theorem 4 If $s/t$ is a root of $a(x)$, then $s \mid a_0$ and $t \mid a_n$.
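Before the proof, it may help to see what Theorem 4 provides in practice: a finite list of candidates $s/t$ that can simply be tested one by one. The following is a rough sketch, assuming an integer coefficient list `[a0, a1, ..., an]` with $a_0 \ne 0$ and $a_n \ne 0$; the function names and the sample cubic $2x^3 - 3x^2 - 8x + 12 = (x - 2)(x + 2)(2x - 3)$ are illustrative choices, not taken from the text.

```python
from fractions import Fraction


def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def evaluate(coeffs, c):
    """Compute a(c) by Horner's rule; coeffs = [a0, a1, ..., an]."""
    result = Fraction(0)
    for a in reversed(coeffs):
        result = result * c + a
    return result


def rational_roots(coeffs):
    """Rational roots of an integer polynomial with a0 != 0 and an != 0."""
    a0, an = coeffs[0], coeffs[-1]
    # By Theorem 4, any root s/t in lowest terms has s | a0 and t | an.
    candidates = {
        Fraction(sign * s, t)
        for s in divisors(a0)
        for t in divisors(an)
        for sign in (1, -1)
    }
    return sorted(c for c in candidates if evaluate(coeffs, c) == 0)


# 12 - 8x - 3x^2 + 2x^3 has the rational roots -2, 3/2 and 2.
print(rational_roots([12, -8, -3, 2]))
```

Using `Fraction` keeps every evaluation exact, so the test `== 0` is reliable; with floating-point numbers a genuine root could be missed through rounding.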
Proof: If s/t is a root of a(x), this means that

$$a_0 + a_1(s/t) + \cdots + a_n(s^n/t^n) = 0$$

Multiplying both sides of this equation by $t^n$ we get

$$a_0 t^n + a_1 s t^{n-1} + \cdots + a_n s^n = 0 \qquad (1)$$

We may now factor out s from all but the first term to get

$$-a_0 t^n = s(a_1 t^{n-1} + \cdots + a_n s^{n-1})$$

Thus, $s \mid a_0 t^n$; and since s and t have no common factors, $s \mid a_0$. Similarly, in Equation (1), we may factor out t from all but the last term to get

$$-a_n s^n = t(a_0 t^{n-1} + \cdots + a_{n-1} s^{n-1})$$

Thus, $t \mid a_n s^n$; and since s and t have no common factors, $t \mid a_n$. ■

As an example of the way Theorem 4 may be used, let us find the rational roots of $a(x) = 2x^4 + 7x^3 + 5x^2 + 7x + 3$. Any rational root must be a fraction s/t where s is a factor of 3 and t is a factor of 2. The possible roots are therefore $\pm 1$, $\pm 3$, $\pm\frac{1}{2}$, and $\pm\frac{3}{2}$. Testing each of these numbers by direct substitution into the equation a(x) = 0, we find that $-\frac{1}{2}$ and $-3$ are roots.

Before going to the next step in our discussion we note a simple but fairly surprising fact.

Lemma Let a(x) = b(x)c(x), where a(x), b(x), and c(x) have integer coefficients. If a prime number p divides every coefficient of a(x), it either divides every coefficient of b(x) or every coefficient of c(x).

Proof: If this is not the case, let $b_r$ be the first coefficient of b(x) not divisible by p, and let $c_t$ be the first coefficient of c(x) not divisible by p. Now, a(x) = b(x)c(x), so

$$a_{r+t} = b_0 c_{r+t} + \cdots + b_r c_t + \cdots + b_{r+t} c_0$$

Each term on the right, except $b_r c_t$, is a product $b_i c_j$ where either $i < r$ or $j < t$. By our choice of $b_r$ and $c_t$, if $i < r$ then $p \mid b_i$, and if $j < t$ then $p \mid c_j$. Thus, p is a factor of every term on the right with the possible exception of $b_r c_t$, but p is also a factor of $a_{r+t}$. Thus, p must be a factor of $b_r c_t$, hence of either $b_r$ or $c_t$, and this is impossible. ■

We saw (in the discussion immediately preceding Theorem 4) that any polynomial a(x) with rational coefficients has a constant multiple ka(x), with integer coefficients, which has the same roots as a(x).
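The rational-root search of Theorem 4, applied above to $2x^4 + 7x^3 + 5x^2 + 7x + 3$, can be sketched in a few lines. This is only an illustration under stated assumptions: coefficients are stored constant term first, both $a_0$ and $a_n$ are nonzero, and the function names `divisors` and `rational_roots` are hypothetical.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational roots of an integer polynomial (constant term first).

    By Theorem 4, a root s/t in lowest terms has s | a_0 and t | a_n,
    so it suffices to test the finitely many candidates +-s/t.
    Assumes a_0 != 0 and a_n != 0.
    """
    a0, an = coeffs[0], coeffs[-1]
    candidates = {Fraction(sign * s, t)
                  for s in divisors(a0)
                  for t in divisors(an)
                  for sign in (1, -1)}

    def value(x):
        # Horner's rule with exact rational arithmetic
        result = Fraction(0)
        for c in reversed(coeffs):
            result = result * x + c
        return result

    return sorted(x for x in candidates if value(x) == 0)

roots = rational_roots([3, 7, 5, 7, 2])   # 2x^4 + 7x^3 + 5x^2 + 7x + 3
print(roots)
```

Exact `Fraction` arithmetic is used so that the test `value(x) == 0` is not affected by floating-point rounding.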
We can go one better; let $a(x) \in \mathbb{Z}[x]$:

Theorem 5 Suppose a(x) can be factored as a(x) = b(x)c(x), where b(x) and c(x) have rational coefficients. Then there are polynomials B(x) and C(x) with integer coefficients, which are constant multiples of b(x) and c(x), respectively, such that a(x) = B(x)C(x).

Proof: Let k and l be integers such that kb(x) and lc(x) have integer coefficients. Then kla(x) = [kb(x)][lc(x)]. By the lemma, each prime factor of kl may now be canceled with a factor of either kb(x) or lc(x). ■

Remember that a polynomial a(x) of positive degree is said to be reducible over F if there are polynomials b(x) and c(x) in F[x], both of positive degree, such that a(x) = b(x)c(x). If there are no such polynomials, then a(x) is irreducible over F.

If we use this terminology, Theorem 5 states that any polynomial with integer coefficients which is reducible over $\mathbb{Q}$ is reducible already over $\mathbb{Z}$.

In Chapter 25 we saw that every polynomial can be factored into irreducible polynomials. In order to factor a polynomial completely (that is, into irreducibles), we must be able to recognize an irreducible polynomial when we see one! This is not always an easy matter. But there is a method which works remarkably well for recognizing when a polynomial is irreducible over $\mathbb{Q}$:

Theorem 6: Eisenstein's irreducibility criterion Let $a(x) = a_0 + a_1 x + \cdots + a_n x^n$ be a polynomial with integer coefficients. Suppose there is a prime number p which divides every coefficient of a(x) except the leading coefficient $a_n$; suppose p does not divide $a_n$ and $p^2$ does not divide $a_0$. Then a(x) is irreducible over $\mathbb{Q}$.

Proof: If a(x) can be factored over $\mathbb{Q}$ as a(x) = b(x)c(x), then by Theorem 5 we may assume b(x) and c(x) have integer coefficients: say

$$b(x) = b_0 + \cdots + b_k x^k \qquad \text{and} \qquad c(x) = c_0 + \cdots + c_m x^m$$

Now, $a_0 = b_0 c_0$; p divides $a_0$ but $p^2$ does not, so only one of $b_0$, $c_0$ is divisible by p. Say $p \mid c_0$ and $p \nmid b_0$. Next, $a_n = b_k c_m$ and $p \nmid a_n$, so $p \nmid c_m$. Let s be the smallest integer such that $p \nmid c_s$.
We have

$$a_s = b_0 c_s + b_1 c_{s-1} + \cdots + b_s c_0$$

and by our choice of $c_s$, every term on the right except $b_0 c_s$ is divisible by p. But $a_s$ also is divisible by p (note that $s \le m < n$, because b(x) has positive degree), and therefore $b_0 c_s$ must be divisible by p. This is impossible because $p \nmid b_0$ and $p \nmid c_s$. Thus, a(x) cannot be factored. ■

For example, $x^3 + 2x^2 + 4x + 2$ is irreducible over $\mathbb{Q}$ because p = 2 satisfies the conditions of Eisenstein's criterion.

POLYNOMIALS OVER R AND C

One of the most far-reaching theorems of classical mathematics concerns polynomials with complex coefficients. It is so important in the framework of traditional algebra that it is called the fundamental theorem of algebra. It states the following:

Every nonconstant polynomial with complex coefficients has a complex root.

(The proof of this theorem is based upon techniques of calculus and can be found in most books on complex analysis. It is omitted here.)

It follows immediately that the irreducible polynomials in $\mathbb{C}[x]$ are exactly the polynomials of degree 1. For if a(x) is a polynomial of degree greater than 1 in $\mathbb{C}[x]$, then by the fundamental theorem of algebra it has a root c and therefore a factor x - c.

Now, every polynomial in $\mathbb{C}[x]$ can be factored into irreducibles. Since the irreducible polynomials are all of degree 1, it follows that if a(x) is a polynomial of degree n over $\mathbb{C}$, it can be factored into

$$a(x) = k(x - c_1)(x - c_2) \cdots (x - c_n)$$

In particular, if a(x) has degree n it has n (not necessarily distinct) complex roots $c_1, \ldots, c_n$.

Since every real number a is a complex number (a = a + 0i), what has just been stated applies equally to polynomials with real coefficients. Specifically, if a(x) is a polynomial of degree n with real coefficients, it can be factored into $a(x) = k(x - c_1) \cdots (x - c_n)$, where $c_1, \ldots, c_n$ are complex numbers (some of which may be real).

For our closing comments, we need the following lemma:

Lemma Suppose $a(x) \in \mathbb{R}[x]$. If a + bi is a root of a(x), so is a - bi.
Proof: Remember that a — bi is called the conjugate of a + bi. If r is any complex number, we write r for its conjugate. It is easy to see that the function f(r) = f is a homomorphism from C to C (in fact, it is an isomorphism). For every real number a, f(a) = a. Thus, if a(x) has real coefficients, then f(a0 + atr + ■ • ■ + anr") = a0 + a^r + ■ • • + a„r". Since /(0) = 0, it follows that if r is a root of a(x), so is f. ■ Now let a(x) be any polynomial with real coefficients, and let r=a + bi be a complex root of a(x). Then r is also a root of a(x), so (x - r)(x -F) = x2- lax + (a2 + b2) and this is a quadratic polynomial with real coefficients*. We have thus shown that any polynomial with real coefficients can be factored into polynomials of degree I or 2 in U\x\. In particular, the irreducible polynomials of U\x] are the linear polynomials and the irreducible quadratics (that is, the ax2 + bx + c where b2 - 4ac <0). EXERCISES A. Finding Roots of Polynomials over Finite Fields In order to find a root of a(x) in a finite field F, the simplest method (if Fis small) is to test every element of F by substitution into the equation a(x) = 0. 1 Find all the roots of the following polynomials in Z5[x], and factor the polynomials: x3 + x2 + x+l; 3x4 + x2 + l; x5 + 1; x4 + 1; x4 + 4 # 2 Use Fermat's theorem to find all the roots of the following polynomials in Z7\x]: 1; 3x'8 + x19 + 3; 2x: - jt" + 2x + 6 3 Using Fermat's theorem, find polynomials of degree «6 which determine the same functions as the following polynomials in Z7[x]: 3x' - Sx54 + 2x" 4xlm+6x> -2xs 3x" -3x$ 4 Explain why every polynomial in Zp[x] has the same roots as a polynomial of degree (c) for n values of c, prove that a(x) = ft(x). 7 There are infinitely many irreducible polynomials in Z5[x]. # 8 How many roots does x2 - x have in Z10? In Z„? Explain the difference. D. 
Irreducible Polynomials in Q[x] by Eisenstein's Criterion (and Variations on the Theme) 1 Show that each of the following polynomials is irreducible over O: 2 1 3x4 - 8*3 + 6x2 - Ax + 6; r * + ^ x4 lx + J 5* 1 3 2 3* -3* + 1; 1 « 4 , 2 2 , , 2* +3* "3* +1 2 It often happens that a polynomial a(y), as it stands, does not satisfy the conditions of Eisenstein's criterion, but with a simple change of variable y = x + c, it does. It is important to note that if a(x) can be factored into p(x)q(x), then certainly a(x + c) can be factored into p(x + c)q(x + c). Thus, the ir-reducibility of a(x + c) implies the irreducibility of a(x). (a) Use the change of variable y = x + 1 to show that x" + Ax + 1 is irreducible in Q[x]. [In other words, test (x + l)4 + 4(x + 1) + 1 by Eisenstein's criterion.] (b) Find an appropriate change of variable to prove that the following are irreducible in „...,/(«„) = bn. The simplest and most useful kind of function for this purpose is a polynomial function of the lowest possible degree. We now consider a commonly used technique for constructing a polynomial p(x) of degree n which assumes given values b0,bl,. . . ,bn are given points «„, ax,.. ■, a„. That is, P(a0) = b0, p(al) = 6,,. . . , p(a„) = bn First, for each i =0,1,.. ., n, let q,(x) = (x-au)-(x- «,_,)(* - a1+,)- • -(x - aj 1 Show that q.(aj) = 0 for /' ^ i, and a,(a,) ^0. Let qt(at) = c,, and define p(x) as follows: P(x) = S 7 9(F) send every polynomial a(x) to the polynomial function which it determines. Show that h is a homomorphism from F[x] onto 9(F). (Note: To show that h is onto, use Exercise H4.) 7 Let F= {c„ . . . , cj and p(x) = (x - c,)-• ■ (jc - c„). Prove that EXTENSIONS OF FIELDS 271 CHAPTER TWENTY-SEVEN EXTENSIONS OF FIELDS In the first 26 chapters of this book we introduced the cast and set the scene on a vast and complex stage. Now it is time for the action to begin. 
We will be surprised to discover that none of our effort has been wasted; for every notion which was defined with such meticulous care, every subtlety, every fine distinction will have its use and play its prescribed role in the story which is about to unfold. We will see modern algebra reaching out and merging with other disciplines of mathematics; we will see its machinery put to use for solving a wide range of problems which, on the surface, have nothing whatever to do with modern algebra. Some of these problems (ancient problems of geometry, riddles about numbers, questions concerning the solutions of equations) reach back to the very beginnings of mathematics. Great masters of the art of mathematics puzzled over them in every age and left them unsolved, for the machinery to solve them was not there. Now, with a light touch modern algebra uncovers the answers.

Modern algebra was not built in an ivory tower but was created part and parcel with the rest of mathematics: tied to it, drawing from it, and offering it solutions. Clearly it did not develop as methodically as it has been presented here. It would be pointless, in a first course in abstract algebra, to replicate all the currents and crosscurrents, all the hits and misses and false starts. Instead, we are provided with a finished product in which the agonies and efforts that went into creating it cannot be discerned. There is a disadvantage to this: without knowing the origin of a given concept, without knowing the specific problems which gave it birth, the student often wonders what it means and why it was ever invented.

We hope, beginning now, to shed light on that kind of question, to justify what we have already done, and to demonstrate that the concepts introduced in earlier chapters are correctly designed for their intended purposes.

Most of classical mathematics is set in a framework consisting of fields, especially $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$.
The theory of equations deals with polynomials over $\mathbb{R}$ and $\mathbb{C}$, calculus is concerned with functions over $\mathbb{R}$, and plane geometry is set in $\mathbb{R} \times \mathbb{R}$. It is not surprising, therefore, that modern efforts to generalize and unify these subjects should also center around the study of fields. It turns out that a great variety of problems, ranging from geometry to practical computation, can be translated into the language of fields and formulated entirely in terms of the theory of fields. The study of fields will therefore be our central concern in the remaining chapters, though we will see other themes merging and flowing into it like the tributaries of a great river.

If F is a field, then a subfield of F is any nonempty subset of F which is closed with respect to addition and subtraction, multiplication and division. (It would be equivalent to say: closed with respect to addition and negatives, multiplication and multiplicative inverses.) As we already know, if K is a subfield of F, then K is a field in its own right.

If K is a subfield of F, we say also that F is an extension field of K. When it is clear in context that both F and K are fields, we say simply that F is an extension of K.

Given a field F, we may look inward from F at all the subfields of F. On the other hand, we may look outward from F at all the extensions of F. Just as there are relationships between F and its subfields, there are also interesting relationships between F and its extensions. One of these relationships, as we shall see later, is highly reminiscent of Lagrange's theorem (an inside-out version of it).

Why should we be interested in looking at the extensions of fields? There are several reasons, but one is very special. If F is an arbitrary field, there are, in general, polynomials over F which have no roots in F. For example, $x^2 + 1$ has no roots in $\mathbb{R}$. This situation is unfortunate but, it turns out, not hopeless.
For, as we shall soon see, every polynomial over any field F has roots. If these roots are not already in F, they are in a suitable extension of F. For example, $x^2 + 1 = 0$ has solutions in $\mathbb{C}$.

In the matter of factoring polynomials and extracting their roots, $\mathbb{C}$ is Utopia! In $\mathbb{C}$ every polynomial a(x) of degree n has exactly n roots $c_1, \ldots, c_n$ and can therefore be factored as $a(x) = k(x - c_1)(x - c_2) \cdots (x - c_n)$. This ideal situation is not enjoyed by all fields; far from it! In an arbitrary field F, a polynomial of degree n may have any number of roots, from no roots to n roots, and there may be irreducible polynomials of any degree whatever. This is a messy situation, which does not hold the promise of an elegant theory of solutions to polynomial equations. However, it turns out that F always has a suitable extension E such that any polynomial a(x) of degree n over F has exactly n solutions in E. Therefore, a(x) can be factored in E[x] as

$$a(x) = k(x - c_1)(x - c_2) \cdots (x - c_n)$$

Thus, paradise is regained by the expedient of enlarging the field F. This is one of the strongest reasons for our interest in field extensions. They will give us a trim and elegant theory of solutions to polynomial equations.

Now, let us get to work! Let E be a field, F a subfield of E, and c any element of E. We define the substitution function $\sigma_c$ as follows: For every polynomial a(x) in F[x],

$$\sigma_c(a(x)) = a(c)$$

Thus, $\sigma_c$ is the function "substitute c for x." It is a function from F[x] into E, and in fact it is a homomorphism, because

$$\sigma_c(a(x) + b(x)) = a(c) + b(c) = \sigma_c(a(x)) + \sigma_c(b(x))$$

and

$$\sigma_c(a(x)b(x)) = a(c)b(c) = \sigma_c(a(x))\,\sigma_c(b(x))$$

The kernel of the homomorphism $\sigma_c$ is the set of all the polynomials a(x) such that $a(c) = \sigma_c(a(x)) = 0$. That is, the kernel of $\sigma_c$ consists of all the polynomials a(x) in F[x] such that c is a root of a(x).

Let $J_c$ denote the kernel of $\sigma_c$; since the kernel of any homomorphism is an ideal, $J_c$ is an ideal of F[x].

An element c in E is called algebraic over F if it is the root of some nonzero polynomial a(x) in F[x]. Otherwise, c is called transcendental over F.
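The substitution function and its kernel can be made concrete with a small computation. The sketch below is illustrative only and rests on assumptions not taken from the text: it takes $F = \mathbb{Q}$ and $c = \sqrt{2}$, models numbers of the form $u + v\sqrt{2}$ as pairs `(u, v)` of exact rationals, and uses hypothetical helper names (`add`, `mul`, `sigma_c`, `poly_add`, `poly_mul`).

```python
from fractions import Fraction

# A number u + v*sqrt(2) with rational u, v is modeled as the pair (u, v).
def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def mul(x, y):
    # (u1 + v1*r)(u2 + v2*r) with r*r = 2
    return (x[0] * y[0] + 2 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

C = (Fraction(0), Fraction(1))   # c = sqrt(2)

def sigma_c(coeffs, c=C):
    """Substitute c for x in a rational polynomial (constant term first)."""
    result = (Fraction(0), Fraction(0))
    for a in reversed(coeffs):          # Horner's rule
        result = add(mul(result, c), (Fraction(a), Fraction(0)))
    return result

def poly_add(a, b):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

a = [1, 2, 0, 3]    # 1 + 2x + 3x^3
b = [0, 1, 1]       # x + x^2
k = [-2, 0, 1]      # x^2 - 2

# sigma_c respects sums and products for these sample polynomials
sum_ok = sigma_c(poly_add(a, b)) == add(sigma_c(a), sigma_c(b))
prod_ok = sigma_c(poly_mul(a, b)) == mul(sigma_c(a), sigma_c(b))
# x^2 - 2 is sent to 0, so it lies in the kernel and sqrt(2) is algebraic over Q
in_kernel = sigma_c(k) == (0, 0)
print(sigma_c(a), sum_ok, prod_ok, in_kernel)
```

Checking two sample polynomials does not prove the homomorphism property, of course; it only shows what the property asserts in a case where every value can be computed exactly.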
Obviously c is algebraic over F iff $J_c$ contains nonzero polynomials, and transcendental over F iff $J_c = \{0\}$.

We will confine our attention now to the case where c is algebraic. The transcendental case will be examined in Exercise G at the end of this chapter.

Thus, let c be algebraic over F, and let $J_c$ be the kernel of $\sigma_c$ (where $\sigma_c$ is the function "substitute c for x"). Remember that in F[x] every ideal is a principal ideal; hence $J_c = \langle p(x) \rangle$ = the set of all multiples of p(x), for some polynomial p(x). Since every polynomial in $J_c$ is a multiple of p(x), p(x) is a polynomial of lowest degree among all the nonzero polynomials in $J_c$. It is easy to see that p(x) is irreducible; otherwise we could factor it into polynomials of lower degree, say p(x) = f(x)g(x). But then 0 = p(c) = f(c)g(c), so f(c) = 0 or g(c) = 0, and therefore either f(x) or g(x) is in $J_c$. This is impossible, because we have just seen that p(x) has the lowest degree among all the nonzero polynomials in $J_c$, whereas f(x) and g(x) both have lower degree than p(x).

Since every constant multiple of p(x) is in $J_c$, we may take p(x) to be monic, that is, to have leading coefficient 1. Then p(x) is the unique monic polynomial of lowest degree in $J_c$. (Also, it is the only monic irreducible polynomial in $J_c$.) This polynomial p(x) is called the minimum polynomial of c over F, and will be of considerable importance in our discussions in a later chapter.

Let us look at an example: $\mathbb{R}$ is an extension field of $\mathbb{Q}$, and $\mathbb{R}$ contains the irrational number $\sqrt{2}$. The function $\sigma_{\sqrt{2}}$ is the function "substitute $\sqrt{2}$ for x"; for example, $\sigma_{\sqrt{2}}(x^4 - 3x^2 + 1) = (\sqrt{2})^4 - 3(\sqrt{2})^2 + 1 = -1$. By our discussion above, $\sigma_{\sqrt{2}} : \mathbb{Q}[x] \to \mathbb{R}$ is a homomorphism and its kernel consists of all the polynomials in $\mathbb{Q}[x]$ which have $\sqrt{2}$ as one of their roots. The monic polynomial of least degree in $\mathbb{Q}[x]$ having $\sqrt{2}$ as a root is $p(x) = x^2 - 2$; hence $x^2 - 2$ is the minimum polynomial of $\sqrt{2}$ over $\mathbb{Q}$.

Now, let us turn our attention to the range of $\sigma_c$.
Since o~r is a homomorphism, its range is obviously closed with respect to addition, multiplication, and negatives, but it is not obviously closed with respect to multiplicative inverses. Not obviously, but in fact it is closed for multiplicative inverses, which is far from self-evident, and quite a remarkable fact. In order to prove this, let/(c) be any nonzero element in the range of vc. Since /(c) ^0, f(x) is not in the kernel of a,.. Thus, f(x) is not a multiple of p{x), and since p(x) is irreducible, it follows that/(*) and p(x) are relatively prime. Therefore there are polynomials s(x) and t(x) such 274 CHAPTER TWENTY-SEVEN EXTENSIONS OF FIELDS 275 that s(x)f(x) + t(x)p(x) = 1. But then s{c)f{c) + t(c)p(c) = l and therefore s{c) is the multiplicative inverse of /(c). We have just shown that the range of (* + c)) 3 FLr]/(/j(x)>. 5 Let p(x) be irreducible, and let a be a root of p(cx). Then F[x]l(p(cx)) = F(a) and F[x]/(p(x)) = F(ca). Conclude that F[x]l(p(cx)) = F[x]/(p(x)). 6 Use parts 4 and 5 to prove the following: (a) Zu[x]/{x2+l) ~Zu[x]/(x2 + x + 4). (b) If a is a root of x2 - 2 and b is a root of x2 - 4x + 2, then Q(a) = Q(b). (c) If a is a root of x2 - 2 and b is a root of x2 - J, then Q(a) = Q(ft). t F. Quadratic Extensions If the minimum polynomial of a over F has degree 2, we call F(a) a quadratic extension of F. 1 Prove that, if F is a field whose characteristic is 5*2, any quadratic extension of F is of the form F(Va), for some a£F. (Hint: Complete the square, and use Exercise E4.) Let F be a finite field, and F* the multiplicative group of nonzero elements of F. Obviously H = {x2: x £ F*} is a subgroup of F*; since every square x2 in F* is the square of only two different elements, namely ±x, exactly half the elements of F* are in H. Thus, H has exactly two cosets: H itself, containing all the squares, and aH (where a containing all the nonsquares. 
If a and b are nonsquares, then by Chapter 15, Theorem 5(i), a ab~ H Thus: if a and b are nonsquares, alb is a square. Use these remarks in the following: 2 Let F be a finite field. If a, b £ F, let p(x) = x2 - a and a(x) = x2 — b be irreducible in F[jt], and let Va and Vfc denote roots of p(x) and q(x) in an extension of F. Explain why a/ft is a square, say a/ft = c2 for some c£ F. Prove that Vft is a root of p(cx). 3 Use part 2 to prove that F[x)/(p(cx)} = F(Vft); then use Exercise E5 to conclude that F(Va) = F(Vft). 4 Use part 3 to prove: Any two quadratic extensions of a finite field are isomorphic. 5 If a and ft are nonsquares in R, a/ft is a square (why?). Use the same argument as in part 4 to prove that any two simple extensions of R are isomorphic (hence isomorphic to C). G. Questions Relating to Transcendental Elements Let F be a field, and let c be transcendental over F. Prove the following: 1 (a(c):a(x) £ F\x]} is an integral domain isomorphic to F[x\. # 2 F(c) is the field of quotients of (a(c): a(x)E F[x]}, and is isomorphic to F(x), the field of quotients of F[x]. 3 If c is transcendental over F, so are c + 1, kc (where kE F and k ¥= 0), c , 4 If c is transcendental over F, every element in F(c) but not in Fis transcendental over F. t H. Common Factors of Two Polynomials: Over F and over Extensions of F Let F be a field, and let a(x), b(x)E F[x]. Prove the following: 1 If a(x) and ft(jt) have a common root c in some extension of F, they have a common factor of positive degree in F[x\. [Use the fact that a(x), b(x) £ ker crc.] 2 If a(x) and b(x) are relatively prime in F\x\, they are relatively prime in K[x], for any extension K of F. Conversely, if they are relatively prime in K[x\, then they are relatively prime in F[jc]. t I. Derivatives and Their Properties Let a(x) = an + axx + • • • + anx" £ F[x]. 
The derivative of a(x) is the following polynomial $a'(x) \in F[x]$:

$$a'(x) = a_1 + 2a_2 x + \cdots + n a_n x^{n-1}$$

(This is the same as the derivative of a polynomial in calculus.) We now prove the analogs of the formal rules of differentiation, familiar from calculus.

Let $a(x), b(x) \in F[x]$, and let $k \in F$. Prove parts 1-4:

1 $[a(x) + b(x)]' = a'(x) + b'(x)$
2 $[a(x)b(x)]' = a'(x)b(x) + a(x)b'(x)$
3 $[ka(x)]' = ka'(x)$
4 If F has characteristic 0 and a'(x) = 0, then a(x) is a constant polynomial. Why is this conclusion not necessarily true if F has characteristic $p \neq 0$?
5 Find the derivative of the following polynomials in $\mathbb{Z}_5[x]$:
$x^6 + 2x^3 + x + 1$; $x^5 + 3x^2 + 1$; $x^{15} + 3x^{10} + 4x^5 + 1$
6 If F has characteristic $p \neq 0$, and a'(x) = 0, prove that the only nonzero terms of a(x) are of the form $a_{mp}x^{mp}$ for some m. [That is, a(x) is a polynomial in powers of $x^p$.]

† J. Multiple Roots

Suppose $a(x) \in F[x]$, and K is an extension of F. An element $c \in K$ is called a multiple root of a(x) if $(x - c)^m \mid a(x)$ for some m > 1. It is often important to know if all the roots of a polynomial are different, or not. We now consider a method for determining whether an arbitrary polynomial $a(x) \in F[x]$ has multiple roots in any extension of F.

Let K be any field containing all the roots of a(x). Suppose a(x) has a multiple root c.

1 Prove that $a(x) = (x - c)^2 q(x) \in K[x]$.
2 Compute a'(x), using part 1.
3 Show that x - c is a common factor of a(x) and a'(x). Use Exercise H1 to conclude that a(x) and a'(x) have a common factor of degree $\geq 1$ in F[x].

Thus, if a(x) has a multiple root, then a(x) and a'(x) have a common factor in F[x]. To prove the converse, suppose a(x) has no multiple roots. Then a(x) can be factored as $a(x) = (x - c_1) \cdots (x - c_n)$ where $c_1, \ldots, c_n$ are all different.

4 Explain why a'(x) is a sum of terms of the form
$$(x - c_1) \cdots (x - c_{i-1})(x - c_{i+1}) \cdots (x - c_n)$$
5 Using part 4, explain why none of the roots $c_1, \ldots, c_n$ of a(x) are roots of a'(x).
6 Conclude that o(x) and a'^) nave no common factor of degree >1 in F[x]. 7 Show that each of the following polynomials has no multiple roots in any extension of its field of coefficients: x3 - lx2 + 8 £ Q[x] x2 + x + 1 £ Z,[x] 1 £ Z7[x] The preceding example is most interesting: it shows that there are 100 different hundredth roots of 1 over Z7. (The roots ±1 are in Z7, while the remaining 98 roots are in extensions of Z7.) Corresponding results hold for most other fields. This important result is stated as follows: A polynomial a(x) in F[x] has a multiple root iff a(x) and a'(x) have a common factor of degree >1 in F[x]. VECTOR SPACES 283 CHAPTER TWENTY-EIGHT VECTOR SPACES Many physical quantities, such as length, area, weight, and temperature, are completely described by a single real number. On the other hand, many other quantities arising in scientific measurement and everyday reckoning are best described by a combination of several numbers. For example, a point in space is specified by giving its three coordinates with respect to an xyz coordinate system. Here is an example of a different kind: A store handles 100 items; its monthly inventory is a sequence of 100 numbers (a1,a2, specifying the quantities of each of the 100 items currently in stock. Such a sequence of numbers is usually called a vector. When the store is restocked, a vector is added to the current inventory vector. At the end of a good month of sales, a vector is subtracted. As this example shows, it is natural to add vectors by adding corresponding components, and subtract vectors by subtracting corresponding components. If the store manager in the preceding example decided to double inventory, each component of the inventory vector would be multiplied by 2. This shows that a natural way of multiplying a vector by a real number k is to multiply each component by k. This kind of multiplication is commonly called scalar multiplication. 
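The inventory example above translates directly into componentwise operations. Here is a minimal sketch, assuming a store with only three items rather than 100 and using hypothetical function names and made-up stock figures purely for illustration:

```python
# Sketch: vectors as lists of numbers, with componentwise operations.
# The three-item inventory and all quantities are illustrative values only.

def add(u, v):
    """Add two vectors component by component."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return [x + y for x, y in zip(u, v)]

def subtract(u, v):
    """Subtract v from u component by component."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return [x - y for x, y in zip(u, v)]

def scale(k, v):
    """Scalar multiplication: multiply every component by k."""
    return [k * x for x in v]

inventory = [12, 0, 7]   # current stock of items 1, 2, 3
restock = [5, 10, 0]     # delivery added to the inventory
sales = [4, 2, 3]        # quantities sold during the month

after_restock = add(inventory, restock)
after_sales = subtract(after_restock, sales)
doubled = scale(2, inventory)
print(after_restock, after_sales, doubled)
```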
Historically, as the use of vectors became widespread and they came to be an indispensable tool of science, vector algebra grew to be one of the major branches of mathematics. Today it forms the basis for much of advanced calculus, the theory and practice of differential equations, statistics, and vast areas of applied mathematics. Scientific computation is enormously simplified by vector methods; for example, 3, or 300, or 3000 individual readings of scientific instruments can be expressed as a single vector.

In any branch of mathematics it is elegant and desirable (but not always possible) to find a simple list of axioms from which all the required theorems may be proved. In the specific case of vector algebra, we wish to select as axioms only those particular properties of vectors which are absolutely necessary for proving further properties of vectors. And we must select a sufficiently complete list of axioms so that, by using them and them alone, we can prove all the properties of vectors needed in mathematics.

A delightfully simple list of axioms is available for vector algebra. The remarkable fact about this axiom system is that, although we conceive of vectors as finite sequences $(a_1, a_2, \ldots, a_n)$ of numbers, nothing in the axioms actually requires them to be such sequences! Instead, vectors are treated simply as elements in a set, satisfying certain equations. Here is our basic definition:

A vector space over a field F is a set V, with two operations + and · called vector addition and scalar multiplication, such that
1. V with vector addition is an abelian group.
2. For any $k \in F$ and $\mathbf{a} \in V$, the scalar product $k\mathbf{a}$ is an element of V, subject to the following conditions: for all $k, l \in F$ and $\mathbf{a}, \mathbf{b} \in V$,
(a) $k(\mathbf{a} + \mathbf{b}) = k\mathbf{a} + k\mathbf{b}$,
(b) $(k + l)\mathbf{a} = k\mathbf{a} + l\mathbf{a}$,
(c) $k(l\mathbf{a}) = (kl)\mathbf{a}$,
(d) $1\mathbf{a} = \mathbf{a}$.

The elements of V are called vectors and the elements of the field F are called scalars.
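Because the definition allows any field F, the conditions 2(a) through 2(d) can even be verified exhaustively when F is finite. The sketch below is an illustration under assumptions of its own: it takes $F = \mathbb{Z}_5$ and V the set of pairs over $\mathbb{Z}_5$ with componentwise operations, and the names `vadd` and `smul` are hypothetical. It checks only the scalar-multiplication conditions, not condition 1.

```python
from itertools import product

P = 5                                        # the field is Z_5
scalars = range(P)
vectors = list(product(range(P), repeat=2))  # all 25 pairs over Z_5

def vadd(a, b):
    """Componentwise addition modulo P."""
    return tuple((x + y) % P for x, y in zip(a, b))

def smul(k, a):
    """Scalar multiplication modulo P."""
    return tuple((k * x) % P for x in a)

# (a) k(a + b) = ka + kb
axiom_a = all(smul(k, vadd(a, b)) == vadd(smul(k, a), smul(k, b))
              for k in scalars for a in vectors for b in vectors)
# (b) (k + l)a = ka + la
axiom_b = all(smul((k + l) % P, a) == vadd(smul(k, a), smul(l, a))
              for k in scalars for l in scalars for a in vectors)
# (c) k(la) = (kl)a
axiom_c = all(smul(k, smul(l, a)) == smul((k * l) % P, a)
              for k in scalars for l in scalars for a in vectors)
# (d) 1a = a
axiom_d = all(smul(1, a) == a for a in vectors)

print(axiom_a, axiom_b, axiom_c, axiom_d)
```

An exhaustive check like this is possible only because both the field and the set of vectors are finite; for a space such as the n-tuples of real numbers the conditions have to be proved algebraically.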
In the following exposition the field F will not be specifically referred to unless the context requires it. For notational clarity, vectors will be written in bold type and scalars in italics.

The traditional example of a vector space is the set $\mathbb{R}^n$ of all n-tuples of real numbers, $(a_1, a_2, \ldots, a_n)$, with the operations

$$(a_1, a_2, \ldots, a_n) + (b_1, b_2, \ldots, b_n) = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n)$$

and

$$k(a_1, a_2, \ldots, a_n) = (ka_1, ka_2, \ldots, ka_n)$$

For example, $\mathbb{R}^2$ is the set of all two-dimensional vectors (a, b), while $\mathbb{R}^3$ is the set of all vectors (a, b, c) in euclidean space. (See the figure.)

[Figure: the vector (a, b) in $\mathbb{R}^2$ and the vector (a, b, c) in $\mathbb{R}^3$.]

However, these are not the only vector spaces! Our definition of vector space is so very simple that many other things, quite different in appearance from the traditional vector spaces, satisfy the conditions of our definition and are therefore, legitimately, vector spaces.

For example, $\mathcal{F}(\mathbb{R})$, you may recall, is the set of all functions from $\mathbb{R}$ to $\mathbb{R}$. We define the sum f + g of two functions by the rule

$$[f + g](x) = f(x) + g(x)$$

and we define the product af, of a real number a and a function f, by

$$[af](x) = af(x)$$

It is very easy to verify that $\mathcal{F}(\mathbb{R})$, with these operations, satisfies all the conditions needed in order to be a vector space over the field $\mathbb{R}$.

As another example, let $\mathcal{P}$ denote the set of all polynomials with real coefficients. Polynomials are added as usual, and scalar multiplication is defined by

$$k(a_0 + a_1 x + \cdots + a_n x^n) = (ka_0) + (ka_1)x + \cdots + (ka_n)x^n$$

Again, it is not hard to see that $\mathcal{P}$ is a vector space over $\mathbb{R}$.

Let V be a vector space. Since V with addition alone is an abelian group, there is a zero element in V called the zero vector, written as $\mathbf{0}$. Every vector $\mathbf{a}$ in V has a negative, written as $-\mathbf{a}$.
Finally, since V with vector addition is an abelian group, it satisfies the following conditions which are true in all abelian groups:

$\mathbf{a} + \mathbf{b} = \mathbf{a} + \mathbf{c}$ implies $\mathbf{b} = \mathbf{c}$ (1)
$\mathbf{a} + \mathbf{b} = \mathbf{0}$ implies $\mathbf{a} = -\mathbf{b}$ and $\mathbf{b} = -\mathbf{a}$ (2)
$-(\mathbf{a} + \mathbf{b}) = (-\mathbf{a}) + (-\mathbf{b})$ and $-(-\mathbf{a}) = \mathbf{a}$ (3)

There are simple, obvious rules for multiplication by zero and by negative scalars. They are contained in the next theorem.

Theorem 1 If V is a vector space, then:
(i) $0\mathbf{a} = \mathbf{0}$, for every $\mathbf{a} \in V$.
(ii) $k\mathbf{0} = \mathbf{0}$, for every scalar k.
(iii) If $k\mathbf{a} = \mathbf{0}$, then k = 0 or $\mathbf{a} = \mathbf{0}$.
(iv) $(-1)\mathbf{a} = -\mathbf{a}$ for every $\mathbf{a} \in V$.

To prove Rule (i), we observe that

$$0\mathbf{a} = (0 + 0)\mathbf{a} = 0\mathbf{a} + 0\mathbf{a}$$

hence $\mathbf{0} + 0\mathbf{a} = 0\mathbf{a} + 0\mathbf{a}$. It follows by Condition (1) that $\mathbf{0} = 0\mathbf{a}$. Rule (ii) is proved similarly. As for Rule (iii), if k = 0, we are done. If $k \neq 0$, we may multiply $k\mathbf{a} = \mathbf{0}$ by 1/k to get $\mathbf{a} = \mathbf{0}$. Finally, for Rule (iv), we have

$$\mathbf{a} + (-1)\mathbf{a} = 1\mathbf{a} + (-1)\mathbf{a} = (1 + (-1))\mathbf{a} = 0\mathbf{a} = \mathbf{0}$$

so by Condition (2), $(-1)\mathbf{a} = -\mathbf{a}$.

Let V be a vector space, and $U \subseteq V$. We say that U is closed with respect to scalar multiplication if $k\mathbf{a} \in U$ for every scalar k and every $\mathbf{a} \in U$. We call U a subspace of V if U is closed with respect to addition and scalar multiplication. It is easy to see that if V is a vector space over the field F, and U is a subspace of V, then U is a vector space over the same field F.

If $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ are in V and $k_1, k_2, \ldots, k_n$ are scalars, then the vector

$$k_1\mathbf{a}_1 + k_2\mathbf{a}_2 + \cdots + k_n\mathbf{a}_n$$

is called a linear combination of $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$. The set of all the linear combinations of $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ is a subspace of V. (This fact is exceedingly easy to verify.)

If U is the subspace consisting of all the linear combinations of $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$, we call U the subspace spanned by $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$. An equivalent way of saying the same thing is as follows: a space (or subspace) U is spanned by $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ iff every vector in U is a linear combination of $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$.

If U is spanned by $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$, we also say that $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n$ span U.

Let $S = \{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}$ be a set of distinct vectors in a vector space V. Then S is said to be linearly dependent if there are scalars $k_1, \ldots, k_n$, not all zero, such that

$$k_1\mathbf{a}_1 + k_2\mathbf{a}_2 + \cdots + k_n\mathbf{a}_n = \mathbf{0} \qquad (4)$$

Obviously this is the same as saying that at least one of the vectors in S is a linear combination of the remaining ones. [Solve for any vector $\mathbf{a}_i$ in Equation (4) having a nonzero coefficient.]

If $S = \{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}$ is not linearly dependent, then it is linearly independent. That is, S is linearly independent iff

$$k_1\mathbf{a}_1 + k_2\mathbf{a}_2 + \cdots + k_n\mathbf{a}_n = \mathbf{0} \quad \text{implies} \quad k_1 = k_2 = \cdots = k_n = 0$$

This is the same as saying that no vector in S is equal to a linear combination of the other vectors in S.

It is obvious from these definitions that any set of vectors containing the zero vector is linearly dependent. Furthermore, the set $\{\mathbf{a}\}$, containing a single nonzero vector $\mathbf{a}$, is linearly independent.

The next two lemmas, although very easy and at first glance rather trite, are used to prove the most fundamental theorems of this subject.

Lemma 1 If $\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}$ is linearly dependent, then some $\mathbf{a}_i$ is a linear combination of the preceding ones, $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_{i-1}$.

Proof: Indeed, if $\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}$ is linearly dependent, then $k_1\mathbf{a}_1 + \cdots + k_n\mathbf{a}_n = \mathbf{0}$ for coefficients $k_1, k_2, \ldots, k_n$ which are not all zero. If $k_i$ is the last nonzero coefficient among them, then $k_1\mathbf{a}_1 + \cdots + k_i\mathbf{a}_i = \mathbf{0}$, and this equation can be used to solve for $\mathbf{a}_i$ in terms of $\mathbf{a}_1, \ldots, \mathbf{a}_{i-1}$. ■

Let $\{\mathbf{a}_1, \ldots, \hat{\mathbf{a}}_i, \ldots, \mathbf{a}_n\}$ denote the set $\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}$ after removal of $\mathbf{a}_i$.

Lemma 2 If $\{\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_n\}$ spans V, and $\mathbf{a}_i$ is a linear combination of preceding vectors, then $\{\mathbf{a}_1, \ldots, \hat{\mathbf{a}}_i, \ldots, \mathbf{a}_n\}$ still spans V.

Proof: Our assumption is that $\mathbf{a}_i = k_1\mathbf{a}_1 + \cdots + k_{i-1}\mathbf{a}_{i-1}$ for some scalars $k_1, \ldots, k_{i-1}$.
Since every vector $\mathbf{b} \in V$ is a linear combination

$$\mathbf{b} = l_1\mathbf{a}_1 + \cdots + l_i\mathbf{a}_i + \cdots + l_n\mathbf{a}_n$$

it can also be written as a linear combination

$$\mathbf{b} = l_1\mathbf{a}_1 + \cdots + l_i(k_1\mathbf{a}_1 + \cdots + k_{i-1}\mathbf{a}_{i-1}) + \cdots + l_n\mathbf{a}_n$$

in which $\mathbf{a}_i$ does not figure. ■

A set of vectors $\{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$ in $V$ is called a basis of $V$ if it is linearly independent and spans $V$.

For example, the vectors $\mathbf{e}_1 = (1, 0, 0)$, $\mathbf{e}_2 = (0, 1, 0)$, and $\mathbf{e}_3 = (0, 0, 1)$ form a basis of $\mathbb{R}^3$. They are linearly independent because, obviously, no vector in $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ is equal to a linear combination of preceding ones. [Any linear combination of $\mathbf{e}_1$ and $\mathbf{e}_2$ is of the form $a\mathbf{e}_1 + b\mathbf{e}_2 = (a, b, 0)$, whereas $\mathbf{e}_3$ is not of this form; similarly, any linear combination of $\mathbf{e}_1$ alone is of the form $a\mathbf{e}_1 = (a, 0, 0)$, and $\mathbf{e}_2$ is not of that form.] The vectors $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ span $\mathbb{R}^3$ because any vector $(a, b, c)$ in $\mathbb{R}^3$ can be written as $(a, b, c) = a\mathbf{e}_1 + b\mathbf{e}_2 + c\mathbf{e}_3$.

Actually, $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ is not the only basis of $\mathbb{R}^3$. Another basis of $\mathbb{R}^3$ consists of the vectors $(1, 2, 3)$, $(1, 0, 2)$, and $(3, 2, 1)$; in fact, there are infinitely many different bases of $\mathbb{R}^3$. Nevertheless, all bases of $\mathbb{R}^3$ have one thing in common: they contain exactly three vectors! This is a consequence of our next theorem:

Theorem 2 Any two bases of a vector space $V$ have the same number of elements.

Proof: Suppose, on the contrary, that $V$ has a basis $A = \{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$ and a basis $B = \{\mathbf{b}_1, \ldots, \mathbf{b}_m\}$ where $m \neq n$. To be specific, suppose $n < m$. [...] $\ldots, \mathbf{a}_n\}$ still spans $V$. Repeating this process, we discard vectors one by one from $\{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$ and, each time, the remaining vectors still span $V$. We keep doing this until the remaining set is independent. (In the worst case, this will happen when only one vector is left.) ■

The next lemma asserts that if $\{\mathbf{a}_1, \ldots, \mathbf{a}_s\}$ is an independent set of vectors in $V$, there is a way of adding vectors to this set so as to get a basis of $V$.

Lemma 4 If the set $\{\mathbf{a}_1, \ldots, \mathbf{a}_s\}$ is linearly independent, it can be extended to a basis of $V$.
Proof: If $\{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$ is any basis of $V$, then $\{\mathbf{a}_1, \ldots, \mathbf{a}_s, \mathbf{b}_1, \ldots, \mathbf{b}_n\}$ spans $V$. By the proof of Lemma 3, we may discard vectors from this set until we get a basis of $V$. Note that we never discard any $\mathbf{a}_i$, because, by hypothesis, $\mathbf{a}_i$ is not a linear combination of preceding vectors. ■

The next theorem is an immediate consequence of Lemmas 3 and 4.

Theorem 3 Let $V$ have dimension $n$. If $\{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$ is an independent set, it is already a basis of $V$. If $\{\mathbf{b}_1, \ldots, \mathbf{b}_n\}$ spans $V$, it is already a basis of $V$.

If $\{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$ is a basis of $V$, then every vector $\mathbf{c}$ in $V$ has a unique expression $\mathbf{c} = k_1\mathbf{a}_1 + \cdots + k_n\mathbf{a}_n$ as a linear combination of $\mathbf{a}_1, \ldots, \mathbf{a}_n$. Indeed, if

$$\mathbf{c} = k_1\mathbf{a}_1 + \cdots + k_n\mathbf{a}_n = l_1\mathbf{a}_1 + \cdots + l_n\mathbf{a}_n$$

then

$$(k_1 - l_1)\mathbf{a}_1 + \cdots + (k_n - l_n)\mathbf{a}_n = \mathbf{0}$$

hence $k_1 - l_1 = \cdots = k_n - l_n = 0$, so $k_1 = l_1, \ldots, k_n = l_n$. If $\mathbf{c} = k_1\mathbf{a}_1 + \cdots + k_n\mathbf{a}_n$, the coefficients $k_1, \ldots, k_n$ are called the coordinates of $\mathbf{c}$ with respect to the basis $\{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$. It is then convenient to represent $\mathbf{c}$ as the $n$-tuple

$$\mathbf{c} = (k_1, \ldots, k_n)$$

If $U$ and $V$ are vector spaces over a field $F$, a function $h : U \to V$ is a homomorphism if it satisfies the following two conditions:

$$h(\mathbf{a} + \mathbf{b}) = h(\mathbf{a}) + h(\mathbf{b}) \quad\text{and}\quad h(k\mathbf{a}) = k\,h(\mathbf{a})$$

A homomorphism of vector spaces is also called a linear transformation.

If $h : U \to V$ is a linear transformation, its kernel [that is, the set of all $\mathbf{a} \in U$ such that $h(\mathbf{a}) = \mathbf{0}$] is a subspace of $U$, called the null space of $h$. Homomorphisms of vector spaces behave very much like homomorphisms of groups and rings. Their properties are presented in the exercises.

EXERCISES

A. Examples of Vector Spaces

1 Prove that $\mathbb{R}^n$, as defined on page 283, satisfies all the conditions for being a vector space over $\mathbb{R}$.
2 Prove that $\mathcal{F}(\mathbb{R})$, as defined on page 284, is a vector space over $\mathbb{R}$.
3 Prove that $\mathcal{P}(\mathbb{R})$, as defined on page 284, is a vector space over $\mathbb{R}$.
4 Prove that $M_2(\mathbb{R})$, the set of all $2 \times 2$ matrices of real numbers, with matrix addition and the scalar multiplication

$$k\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} ka & kb \\ kc & kd \end{pmatrix}$$

is a vector space over $\mathbb{R}$.

B. Examples of Subspaces

# 1 Prove that $\{(a, b, c) : 2a - 3b + c = 0\}$ is a subspace of $\mathbb{R}^3$.
2 Prove that the set of all $(x, y, z) \in \mathbb{R}^3$ which satisfy the pair of equations $ax + by + cz = 0$, $dx + ey + fz = 0$ is a subspace of $\mathbb{R}^3$.
3 Prove that $\{f : f(1) = 0\}$ is a subspace of $\mathcal{F}(\mathbb{R})$.
4 Prove that $\{f : f \text{ is a constant on the interval } [0, 1]\}$ is a subspace of $\mathcal{F}(\mathbb{R})$.
5 Prove that the set of all even functions [that is, functions $f$ such that $f(x) = f(-x)$] is a subspace of $\mathcal{F}(\mathbb{R})$. Is the same true for the set of all the odd functions [that is, functions $f$ such that $f(-x) = -f(x)$]?
6 Prove that the set of all polynomials of degree $\leq n$ is a subspace of $\mathcal{P}(\mathbb{R})$.

C. Examples of Linear Independence and Bases

1 Prove that $\{(0,0,0,1), (0,0,1,1), (0,1,1,1), (1,1,1,1)\}$ is a basis of $\mathbb{R}^4$.
2 If $\mathbf{a} = (1, 2, 3, 4)$ and $\mathbf{b} = (4, 3, 2, 1)$, explain why $\{\mathbf{a}, \mathbf{b}\}$ may be extended to a basis of $\mathbb{R}^4$. Then find a basis of $\mathbb{R}^4$ which includes $\mathbf{a}$ and $\mathbf{b}$.
3 Let $A$ be the set of eight vectors $(x, y, z)$ where $x, y, z = 1, 2$. Prove that $A$ spans $\mathbb{R}^3$, and find a subset of $A$ which is a basis of $\mathbb{R}^3$.
4 If $\mathcal{P}_n$ is the subspace of $\mathcal{P}(\mathbb{R})$ consisting of all polynomials of degree $\leq n$, prove that $\{1, x, x^2, \ldots, x^n\}$ is a basis of $\mathcal{P}_n$. Then find another basis of $\mathcal{P}_n$.
5 Find a basis for each of the following subspaces of $\mathbb{R}^3$:
# (a) $S_1 = \{(x, y, z) : 3x - 2y + z = 0\}$
(b) $S_2 = \{(x, y, z) : x + y - z = 0 \text{ and } 2x - y + z = 0\}$
6 Find a basis for the subspace of $\mathbb{R}^3$ spanned by the set of vectors $(x, y, z)$ such that $x^2 + y^2 + z^2 = 1$.
7 Let $U$ be the subspace of $\mathcal{F}(\mathbb{R})$ spanned by $\{\cos^2 x, \sin^2 x, \cos 2x\}$. Find the dimension of $U$, and then find a basis of $U$.
8 Find a basis for the subspace of $\mathcal{P}(\mathbb{R})$ spanned by

$$\{x^3 + x^2 + x + 1,\; x^2 + 1,\; x^3 - x^2 + x - 1,\; x^2 - 1\}$$

D. Properties of Subspaces and Bases

Let $V$ be a finite-dimensional vector space. Let $\dim V$ designate the dimension of $V$. Prove each of the following:

1 If $U$ is a subspace of $V$, then $\dim U \leq \dim V$.
2 If $U$ is a subspace of $V$, and $\dim U = \dim V$, then $U = V$.
3 Any set of vectors containing $\mathbf{0}$ is linearly dependent.
4 The set $\{\mathbf{a}\}$, containing only one nonzero vector $\mathbf{a}$, is linearly independent.
5 Any subset of an independent set is independent. Any set of vectors containing a dependent set is dependent.
# 6 If $\{\mathbf{a}, \mathbf{b}, \mathbf{c}\}$ is linearly independent, so is $\{\mathbf{a} + \mathbf{b}, \mathbf{b} + \mathbf{c}, \mathbf{a} + \mathbf{c}\}$.
7 If $\{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$ is a basis of $V$, so is $\{k_1\mathbf{a}_1, \ldots, k_n\mathbf{a}_n\}$ for any nonzero scalars $k_1, \ldots, k_n$.
8 The space spanned by $\{\mathbf{a}_1, \ldots, \mathbf{a}_n\}$ is the same as the space spanned by $\{\mathbf{b}_1, \ldots, \mathbf{b}_m\}$ iff each $\mathbf{a}_i$ is a linear combination of $\mathbf{b}_1, \ldots, \mathbf{b}_m$, and each $\mathbf{b}_j$ is a linear combination of $\mathbf{a}_1, \ldots, \mathbf{a}_n$.

E. Properties of Linear Transformations

Let $U$ and $V$ be finite-dimensional vector spaces over a field $F$, and let $h : U \to V$ be a linear transformation. Prove parts 1-3:

1 The kernel of $h$ is a subspace of $U$. (It is called the null space of $h$.)
2 The range of $h$ is a subspace of $V$. (It is called the range space of $h$.)
3 $h$ is injective iff the null space of $h$ is equal to $\{\mathbf{0}\}$.

Let $\mathcal{N}$ be the null space of $h$, and $\mathcal{R}$ the range space of $h$. Let $\{\mathbf{a}_1, \ldots, \mathbf{a}_r\}$ be a basis of $\mathcal{N}$. Extend it to a basis $\{\mathbf{a}_1, \ldots, \mathbf{a}_r, \ldots, \mathbf{a}_n\}$ of $U$. Prove parts 4-6:

4 Every vector $\mathbf{b} \in \mathcal{R}$ is a linear combination of $h(\mathbf{a}_{r+1}), \ldots, h(\mathbf{a}_n)$.
# 5 $\{h(\mathbf{a}_{r+1}), \ldots, h(\mathbf{a}_n)\}$ is linearly independent.
6 The dimension of $\mathcal{R}$ is $n - r$.
7 Conclude as follows: for any linear transformation $h$, $\dim(\text{domain } h) = \dim(\text{null space of } h) + \dim(\text{range space of } h)$.
8 Let $U$ and $V$ have the same dimension $n$. Use part 7 to prove that $h$ is injective iff $h$ is surjective.

F. Isomorphism of Vector Spaces

Let $U$ and $V$ be vector spaces over the field $F$, with $\dim U = n$ and $\dim V = m$. Let $h : U \to V$ be a homomorphism. Prove the following:

1 Let $h$ be injective. If $\{\mathbf{a}_1, \ldots, \mathbf{a}_r\}$ is a linearly independent subset of $U$, then $\{h(\mathbf{a}_1), \ldots, h(\mathbf{a}_r)\}$ is a linearly independent subset of $V$.
# 2 $h$ is injective iff $\dim U = \dim h(U)$.
3 Suppose $\dim U = \dim V$; $h$ is an isomorphism (that is, a bijective homomorphism) iff $h$ is injective iff $h$ is surjective.
4 Any $n$-dimensional vector space $V$ over $F$ is isomorphic to the space $F^n$ of all $n$-tuples of elements of $F$.

† G. Sums of Vector Spaces

Let $T$ and $U$ be subspaces of $V$. The sum of $T$ and $U$, denoted by $T + U$, is the set of all vectors $\mathbf{a} + \mathbf{b}$, where $\mathbf{a} \in T$ and $\mathbf{b} \in U$.

1 Prove that $T + U$ and $T \cap U$ are subspaces of $V$.

$V$ is said to be the direct sum of $T$ and $U$ if $V = T + U$ and $T \cap U = \{\mathbf{0}\}$. In that case, we write $V = T \oplus U$.

# 2 Prove: $V = T \oplus U$ iff every vector $\mathbf{c} \in V$ can be written, in a unique manner, as a sum $\mathbf{c} = \mathbf{a} + \mathbf{b}$ where $\mathbf{a} \in T$ and $\mathbf{b} \in U$.
3 Let $T$ be a $k$-dimensional subspace of an $n$-dimensional space $V$. Prove that an $(n - k)$-dimensional subspace $U$ exists such that $V = T \oplus U$.
4 If $T$ and $U$ are arbitrary subspaces of $V$, prove that

$$\dim(T + U) = \dim T + \dim U - \dim(T \cap U)$$

CHAPTER TWENTY-NINE
DEGREES OF FIELD EXTENSIONS

In this chapter we will see how the machinery of vector spaces can be applied to the study of field extensions.

Let $F$ and $K$ be fields. If $K$ is an extension of $F$, we may regard $K$ as being a vector space over $F$. We may treat the elements in $K$ as "vectors" and the elements in $F$ as "scalars." That is, when we add elements in $K$, we think of it as vector addition; when we add and multiply elements in $F$, we think of this as addition and multiplication of scalars; and finally, when we multiply an element of $F$ by an element of $K$, we think of it as scalar multiplication.

We will be especially interested in the case where the resulting vector space is of finite dimension. If $K$, as a vector space over $F$, is of finite dimension, we call $K$ a finite extension of $F$. If the dimension of the vector space $K$ is $n$, we say that $K$ is an extension of degree $n$ over $F$. This is symbolized by writing

$$[K : F] = n$$

which should be read, "the degree of $K$ over $F$ is equal to $n$."

Let us recall that $F(c)$ denotes the smallest field which contains $F$ and $c$. This means that $F(c)$ contains $F$ and $c$, and that any other field $K$ containing $F$ and $c$ must contain $F(c)$.
We saw in Chapter 27 that if $c$ is algebraic over $F$, then $F(c)$ consists of all the elements of the form $a(c)$, for all $a(x)$ in $F[x]$. Since $F(c)$ is an extension of $F$, we may regard it as a vector space over $F$. Is $F(c)$ a finite extension of $F$?

Well, let $c$ be algebraic over $F$, and let $p(x)$ be the minimum polynomial of $c$ over $F$. [That is, $p(x)$ is the monic polynomial of lowest degree having $c$ as a root.] Let the degree of the polynomial $p(x)$ be equal to $n$. It turns out, then, that the $n$ elements

$$1, c, c^2, \ldots, c^{n-1}$$

are linearly independent and span $F(c)$. We will prove this fact in a moment, but meanwhile let us record what it means. It means that the set of $n$ "vectors" $\{1, c, c^2, \ldots, c^{n-1}\}$ is a basis of $F(c)$; hence $F(c)$ is a vector space of dimension $n$ over the field $F$. This may be summed up concisely as follows:

Theorem 1 The degree of $F(c)$ over $F$ is equal to the degree of the minimum polynomial of $c$ over $F$.

Proof: It remains only to show that the $n$ elements $1, c, \ldots, c^{n-1}$ span $F(c)$ and are linearly independent. Well, if $a(c)$ is any element of $F(c)$, use the division algorithm to divide $a(x)$ by $p(x)$:

$$a(x) = p(x)q(x) + r(x) \quad\text{where}\quad \deg r(x) \leq n - 1$$

Therefore, since $p(c) = 0$,

$$a(c) = p(c)q(c) + r(c) = 0 + r(c) = r(c)$$

This shows that every element of $F(c)$ is of the form $r(c)$ where $r(x)$ has degree $n - 1$ or less. Thus, every element of $F(c)$ can be written in the form

$$a_0 + a_1c + \cdots + a_{n-1}c^{n-1}$$

which is a linear combination of $1, c, c^2, \ldots, c^{n-1}$.

Finally, to prove that $1, c, c^2, \ldots, c^{n-1}$ are linearly independent, suppose that $a_0 + a_1c + \cdots + a_{n-1}c^{n-1} = 0$. If the coefficients $a_0, a_1, \ldots, a_{n-1}$ were not all zero, $c$ would be the root of a nonzero polynomial of degree $n - 1$ or less, which is impossible because the minimum polynomial of $c$ over $F$ has degree $n$. Thus, $a_0 = a_1 = \cdots = a_{n-1} = 0$. ■

For example, let us look at $\mathbb{Q}(\sqrt{2})$: the number $\sqrt{2}$ is not a root of any monic polynomial of degree 1 over $\mathbb{Q}$.
For such a polynomial would have to be $x - \sqrt{2}$, and the latter is not in $\mathbb{Q}[x]$ because $\sqrt{2}$ is irrational. However, $\sqrt{2}$ is a root of $x^2 - 2$, which is therefore the minimum polynomial of $\sqrt{2}$ over $\mathbb{Q}$, and which has degree 2. Thus,

$$[\mathbb{Q}(\sqrt{2}) : \mathbb{Q}] = 2$$

In particular, every element in $\mathbb{Q}(\sqrt{2})$ is therefore a linear combination of 1 and $\sqrt{2}$, that is, a number of the form $a + b\sqrt{2}$ where $a, b \in \mathbb{Q}$.

As another example, $i$ is a root of the irreducible polynomial $x^2 + 1$ in $\mathbb{R}[x]$. Therefore $x^2 + 1$ is the minimum polynomial of $i$ over $\mathbb{R}$; $x^2 + 1$ has degree 2, so $[\mathbb{R}(i) : \mathbb{R}] = 2$. Thus, $\mathbb{R}(i)$ consists of all the linear combinations of 1 and $i$ with real coefficients, that is, all the $a + bi$ where $a, b \in \mathbb{R}$. Clearly then, $\mathbb{R}(i) = \mathbb{C}$, so the degree of $\mathbb{C}$ over $\mathbb{R}$ is equal to 2.

In the sequel we will often encounter the following situation: $E$ is a finite extension of $K$, where $K$ is a finite extension of $F$. If we know the degree of $E$ over $K$ and the degree of $K$ over $F$, can we determine the degree of $E$ over $F$? This is a question of major importance! Fortunately, it has an easy answer, based on the following lemma:

Lemma Let $a_1, a_2, \ldots, a_m$ be a basis of the vector space $K$ over $F$, and let $b_1, b_2, \ldots, b_n$ be a basis of the vector space $E$ over $K$. Then the set of $mn$ products $\{a_jb_i\}$ is a basis of the vector space $E$ over the field $F$.

Proof: To prove that the set $\{a_jb_i\}$ spans $E$, note that each element $c$ in $E$ can be written as a linear combination $c = k_1b_1 + \cdots + k_nb_n$ with coefficients $k_i$ in $K$. But each $k_i$, because it is in $K$, is a linear combination

$$k_i = l_{i1}a_1 + \cdots + l_{im}a_m$$

with coefficients $l_{ij}$ in $F$. Substituting,

$$c = (l_{11}a_1 + \cdots + l_{1m}a_m)b_1 + \cdots + (l_{n1}a_1 + \cdots + l_{nm}a_m)b_n = \sum l_{ij}a_jb_i$$

and this is a linear combination of the products $a_jb_i$ with coefficients $l_{ij}$ in $F$.

To prove that $\{a_jb_i\}$ is linearly independent, suppose $\sum l_{ij}a_jb_i = 0$.
This can be written as

$$(l_{11}a_1 + \cdots + l_{1m}a_m)b_1 + \cdots + (l_{n1}a_1 + \cdots + l_{nm}a_m)b_n = 0$$

and since $b_1, \ldots, b_n$ are independent, $l_{i1}a_1 + \cdots + l_{im}a_m = 0$ for each $i$. But $a_1, \ldots, a_m$ are also independent, so every $l_{ij} = 0$. ■

With this result we can now conclude the following:

Theorem 2 Suppose $F \subseteq K \subseteq E$ where $E$ is a finite extension of $K$ and $K$ is a finite extension of $F$. Then $E$ is a finite extension of $F$, and

$$[E : F] = [E : K][K : F]$$

This theorem is a powerful tool in our study of fields. It plays a role in field theory analogous to the role of Lagrange's theorem in group theory. See what it says about any two extensions, $K$ and $E$, of a fixed "base field" $F$: If $K$ is a subfield of $E$, then the degree of $K$ (over $F$) divides the degree of $E$ (over $F$).

If $c$ is algebraic over $F$, we say that $F(c)$ is obtained by adjoining $c$ to $F$. If $c$ and $d$ are algebraic over $F$, we may first adjoin $c$ to $F$, thereby obtaining $F(c)$, and then adjoin $d$ to $F(c)$. The resulting field is denoted $F(c, d)$, and is the smallest field containing $F$, $c$ and $d$. [Indeed, any field containing $F$, $c$ and $d$ must contain $F(c)$, hence also $F(c, d)$.] It does not matter whether we first adjoin $c$ and then $d$, or vice versa.

If $c_1, \ldots, c_n$ are algebraic over $F$, we let $F(c_1, \ldots, c_n)$ be the smallest field containing $F$ and $c_1, \ldots, c_n$. We call it the field obtained by adjoining $c_1, \ldots, c_n$ to $F$. We may form $F(c_1, \ldots, c_n)$ step by step, adjoining one $c_i$ at a time, and the order of adjoining the $c_i$ is irrelevant.

An extension $F(c)$ formed by adjoining a single element to $F$ is called a simple extension of $F$. An extension $F(c_1, \ldots, c_n)$, formed by adjoining a finite number of elements $c_1, \ldots, c_n$, is called an iterated extension. It is called "iterated" because it can be formed step by step, one simple extension at a time:

$$F \subseteq F(c_1) \subseteq F(c_1, c_2) \subseteq F(c_1, c_2, c_3) \subseteq \cdots \subseteq F(c_1, \ldots, c_n) \tag{1}$$

If $c_1, \ldots, c_n$ are algebraic over $F$, then by Theorem 1, each extension in Condition (1) is a finite extension.
By Theorem 2, $F(c_1, c_2)$ is a finite extension of $F$; applying Theorem 2 again, $F(c_1, c_2, c_3)$ is a finite extension of $F$; and so on. So finally, if $c_1, \ldots, c_n$ are algebraic over $F$, then $F(c_1, \ldots, c_n)$ is a finite extension of $F$.

Actually, the converse is true too: every finite extension is an iterated extension. This is obvious: for if $K$ is a finite extension of $F$, say an extension of degree $n$, then $K$ has a basis $\{a_1, \ldots, a_n\}$ over $F$. This means that every element in $K$ is a linear combination of $a_1, \ldots, a_n$ with coefficients in $F$; but any field containing $F$ and $a_1, \ldots, a_n$ obviously contains all the linear combinations of $a_1, \ldots, a_n$; hence $K$ is the smallest field containing $F$ and $a_1, \ldots, a_n$. That is, $K = F(a_1, \ldots, a_n)$.

In fact, if $K$ is a finite extension of $F$ and $K = F(a_1, \ldots, a_n)$, then $a_1, \ldots, a_n$ have to be algebraic over $F$. This is a consequence of a simple but important little theorem:

Theorem 3 If $K$ is a finite extension of $F$, every element of $K$ is algebraic over $F$.

Proof: Indeed, suppose $K$ is of degree $n$ over $F$, and let $c$ be any element of $K$. Then the set $\{1, c, c^2, \ldots, c^n\}$ is linearly dependent, because it has $n + 1$ elements in a vector space $K$ of dimension $n$. Consequently, there are scalars $a_0, \ldots, a_n \in F$, not all zero, such that $a_0 + a_1c + \cdots + a_nc^n = 0$. Therefore $c$ is a root of the polynomial $a(x) = a_0 + a_1x + \cdots + a_nx^n$ in $F[x]$. ■

Let us sum up: Every iterated extension $F(c_1, \ldots, c_n)$, where $c_1, \ldots, c_n$ are algebraic over $F$, is a finite extension of $F$. Conversely, every finite extension of $F$ is an iterated extension $F(c_1, \ldots, c_n)$, where $c_1, \ldots, c_n$ are algebraic over $F$.

Here is an example of the concepts presented in this chapter. We have already seen that $\mathbb{Q}(\sqrt{2})$ is of degree 2 over $\mathbb{Q}$, and therefore $\mathbb{Q}(\sqrt{2})$ consists of all the numbers $a + b\sqrt{2}$ where $a, b \in \mathbb{Q}$.
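The description of $\mathbb{Q}(\sqrt{2})$ as the set of numbers $a + b\sqrt{2}$ can be made concrete by storing each element as its coordinate pair $(a, b)$ with respect to the basis $\{1, \sqrt{2}\}$. The following is a minimal sketch only, not part of the text: the class name `QSqrt2` and the method `inverse` are hypothetical names chosen for illustration. It shows that sums, products, and inverses of such pairs are again pairs of rationals.

```python
from fractions import Fraction


class QSqrt2:
    """Element a + b*sqrt(2) of Q(sqrt 2), stored by its coordinates (a, b)
    with respect to the basis {1, sqrt 2}. Hypothetical illustration class."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __add__(self, other):
        # Vector addition: coordinates add componentwise.
        return QSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b√2)(c + d√2) = (ac + 2bd) + (ad + bc)√2
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def inverse(self):
        # 1/(a + b√2) = (a - b√2)/(a² - 2b²); the denominator is nonzero
        # for every nonzero element because √2 is irrational.
        n = self.a ** 2 - 2 * self.b ** 2
        if n == 0:
            raise ZeroDivisionError("zero has no inverse")
        return QSqrt2(self.a / n, -self.b / n)

    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)

    def __repr__(self):
        return f"{self.a} + {self.b}*sqrt(2)"


root2 = QSqrt2(0, 1)          # the element √2
x = QSqrt2(1, 1)              # the element 1 + √2
print(root2 * root2)          # coordinates of (√2)², a rational number
print(x * x.inverse())        # coordinates of the unity
```

Because every result is again a pair of rationals, the sketch reflects the fact that $\{1, \sqrt{2}\}$ spans $\mathbb{Q}(\sqrt{2})$ over $\mathbb{Q}$; comparing elements by their coordinate pairs relies on the linear independence of $1$ and $\sqrt{2}$.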
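Observe that $\sqrt{3}$ cannot be in $\mathbb{Q}(\sqrt{2})$; for if it were, we would have $\sqrt{3} = a + b\sqrt{2}$ for rational $a$ and $b$; squaring both sides and solving for $\sqrt{2}$ would give us $\sqrt{2} = $ a rational number, which is impossible. (If $a = 0$ or $b = 0$ we cannot solve for $\sqrt{2}$, but then $\sqrt{6}$ or $\sqrt{3}$ would be rational, which is equally impossible.)

Since $\sqrt{3}$ is not in $\mathbb{Q}(\sqrt{2})$, $\sqrt{3}$ cannot be a root of a polynomial of degree 1 over $\mathbb{Q}(\sqrt{2})$ (such a polynomial would have to be $x - \sqrt{3}$). But $\sqrt{3}$ is a root of $x^2 - 3$, which is therefore the minimum polynomial of $\sqrt{3}$ over $\mathbb{Q}(\sqrt{2})$. Thus, $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ is of degree 2 over $\mathbb{Q}(\sqrt{2})$, and therefore by Theorem 2, $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ is of degree 4 over $\mathbb{Q}$.

By the comments preceding Theorem 1, $\{1, \sqrt{2}\}$ is a basis of $\mathbb{Q}(\sqrt{2})$ over $\mathbb{Q}$, and $\{1, \sqrt{3}\}$ is a basis of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ over $\mathbb{Q}(\sqrt{2})$. Thus, by the lemma of this chapter, $\{1, \sqrt{2}, \sqrt{3}, \sqrt{6}\}$ is a basis of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ over $\mathbb{Q}$. This means that $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ consists of all the numbers $a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$, for all $a$, $b$, $c$, and $d$ in $\mathbb{Q}$.

For later reference. The technical observation which follows will be needed later.

By the comments immediately preceding Theorem 1, every element of $F(c_1)$ is a linear combination of powers of $c_1$, with coefficients in $F$. That is, every element of $F(c_1)$ is of the form

$$\sum_i k_ic_1^i \tag{2}$$

where the $k_i$ are in $F$. For the same reason, every element of $F(c_1, c_2)$ is of the form

$$\sum_j l_jc_2^j$$

where the coefficients $l_j$ are in $F(c_1)$. Thus, each coefficient $l_j$ is equal to a sum of the form (2). But then, clearing brackets, it follows that every element of $F(c_1, c_2)$ is of the form

$$\sum_{i,j} k_{ij}c_1^ic_2^j$$

where the coefficients $k_{ij}$ are in $F$.

If we continue this process, it is easy to see that every element of $F(c_1, c_2, \ldots, c_n)$ is a sum of terms of the form

$$k\,c_1^{i_1}c_2^{i_2}\cdots c_n^{i_n}$$

where the coefficient $k$ of each term is in $F$.

EXERCISES

A. Examples of Finite Extensions

1 Find a basis for $\mathbb{Q}(i\sqrt{2})$ over $\mathbb{Q}$, and describe the elements of $\mathbb{Q}(i\sqrt{2})$. (See the two examples immediately following Theorem 1.)
2 Show that every element of $\mathbb{R}(2 + 3i)$ can be written as $a + bi$, where $a, b \in \mathbb{R}$. Conclude that $\mathbb{R}(2 + 3i) = \mathbb{C}$.
# 3 If $a = \sqrt{1 + \sqrt[3]{2}}$, show that $\{1, 2^{1/3}, 2^{2/3}, a, 2^{1/3}a, 2^{2/3}a\}$ is a basis of $\mathbb{Q}(a)$ over $\mathbb{Q}$. Describe the elements of $\mathbb{Q}(a)$.
# 4 Find a basis of $\mathbb{Q}(\sqrt{2} + \sqrt[3]{4})$ over $\mathbb{Q}$, and describe the elements of $\mathbb{Q}(\sqrt{2} + \sqrt[3]{4})$.
5 Find a basis of $\mathbb{Q}(\sqrt{5}, \sqrt{7})$ over $\mathbb{Q}$, and describe the elements of $\mathbb{Q}(\sqrt{5}, \sqrt{7})$.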
(See the example at the end of this chapter.)
6 Find a basis of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ over $\mathbb{Q}$, and describe the elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$.
7 Name an extension of $\mathbb{Q}$ over which $\pi$ is algebraic of degree 3.

† B. Further Examples of Finite Extensions

Let $F$ be a field of characteristic $\neq 2$. Let $a \neq b$ be in $F$.

1 Prove that any field $F$ containing $\sqrt{a} + \sqrt{b}$ also contains $\sqrt{a}$ and $\sqrt{b}$. [Hint: Compute $(\sqrt{a} + \sqrt{b})^2$ and show that $\sqrt{ab} \in F$. Then compute $\sqrt{ab}(\sqrt{a} + \sqrt{b})$, which is also in $F$.] Conclude that $F(\sqrt{a} + \sqrt{b}) = F(\sqrt{a}, \sqrt{b})$.
2 Prove that if $b \neq x^2a$ for any $x \in F$, then $\sqrt{b} \notin F(\sqrt{a})$. Conclude that $F(\sqrt{a}, \sqrt{b})$ is of degree 4 over $F$.
3 Show that $x = \sqrt{a} + \sqrt{b}$ satisfies $x^4 - 2(a + b)x^2 + (a - b)^2 = 0$. Show that $x = \sqrt{a + b + 2\sqrt{ab}}$ also satisfies this equation. Conclude that

$$F\left(\sqrt{a + b + 2\sqrt{ab}}\right) = F(\sqrt{a}, \sqrt{b})$$

4 Using parts 1 to 3, find an uncomplicated basis for $\mathbb{Q}(d)$ over $\mathbb{Q}$, where $d$ is a root of $x^4 - 14x^2 + 9$. Then find a basis for $\mathbb{Q}\left(\sqrt{7 + 2\sqrt{10}}\right)$ over $\mathbb{Q}$.

C. Finite Extensions of Finite Fields

By the proof of the basic theorem of field extensions, if $p(x)$ is an irreducible polynomial of degree $n$ in $F[x]$, then $F[x]/\langle p(x) \rangle \cong F(c)$ where $c$ is a root of $p(x)$. By Theorem 1 in this chapter, $F(c)$ is of degree $n$ over $F$. Using the paragraph preceding Theorem 1:

1 Prove that every element of $F(c)$ can be written uniquely as $a_0 + a_1c + \cdots + a_{n-1}c^{n-1}$, for some $a_0, \ldots, a_{n-1} \in F$.
# 2 Construct a field of four elements. (It is to be an extension of $\mathbb{Z}_2$.) Describe its elements, and supply its addition and multiplication tables.
3 Construct a field of eight elements. (It is to be an extension of $\mathbb{Z}_2$.)
4 Prove that if $F$ has $q$ elements, and $a$ is algebraic over $F$ of degree $n$, then $F(a)$ has $q^n$ elements.
5 Prove that for every prime number $p$, there is an irreducible quadratic in $\mathbb{Z}_p[x]$. Conclude that for every prime $p$, there is a field with $p^2$ elements.

D. Degrees of Extensions (Applications of Theorem 2)

Let $F$ be a field, and $K$ a field extension of $F$.
Prove the following:

1 $[K : F] = 1$ iff $K = F$.
# 2 If $[K : F]$ is a prime number, there is no field properly between $F$ and $K$ (that is, there is no field $L$ such that $F \subsetneq L \subsetneq K$).
3 If $[K : F]$ is a prime, then $K = F(a)$ for every $a \in K - F$.
4 Suppose $a, b \in K$ are algebraic over $F$ with degrees $m$ and $n$, where $m$ and $n$ are relatively prime. Then:
(a) $F(a, b)$ is of degree $mn$ over $F$.
(b) $F(a) \cap F(b) = F$.
5 If the degree of $F(a)$ over $F$ is a prime, then $F(a) = F(a^n)$ for any $n$ (on the condition that $a^n \notin F$).
6 If an irreducible polynomial $p(x) \in F[x]$ has a root in $K$, then $\deg p(x) \mid [K : F]$.

E. Short Questions Relating to Degrees of Extensions

Let $F$ be a field. Prove parts 1-3:

1 The degree of $a$ over $F$ is the same as the degree of $1/a$ over $F$. It is also the same as the degrees of $a + c$ and $ac$ over $F$, for any $c \in F$ (with $c \neq 0$ in the case of $ac$).
2 $a$ is of degree 1 over $F$ iff $a \in F$.
3 If a real number $c$ is a root of an irreducible polynomial of degree $> 1$ in $\mathbb{Q}[x]$, then $c$ is irrational.
4 Use part 3 and Eisenstein's irreducibility criterion to prove that $\sqrt{m/n}$ (where $m, n \in \mathbb{Z}$) is irrational if there is a prime number which divides $m$ but not $n$, and whose square does not divide $m$.
5 Show that part 4 remains true for $\sqrt[q]{m/n}$, where $q > 1$.
6 If $a$ and $b$ are algebraic over $F$, prove that $F(a, b)$ is a finite extension of $F$.

† F. Further Properties of Degrees of Extensions

Let $F$ be a field, and $K$ a finite extension of $F$. Prove each of the following:

1 Any element algebraic over $K$ is algebraic over $F$, and conversely.
2 If $b$ is algebraic over $K$, then $[F(b) : F] \mid [K(b) : F]$.
3 If $b$ is algebraic over $K$, then $[K(b) : K] \leq [F(b) : F]$. (Hint: The minimum polynomial of $b$ over $F$ may factor in $K[x]$, and $b$ will then be a root of one of its irreducible factors.)
# 4 If $b$ is algebraic over $K$, then $[K(b) : F(b)] \leq [K : F]$. [Hint: Note that $F \subseteq K \subseteq K(b)$ and $F \subseteq F(b) \subseteq K(b)$. Relate the degrees of the four extensions involved here, using part 3.]
# 5 Let $p(x)$ be irreducible in $F[x]$.
If $[K : F]$ and $\deg p(x)$ are relatively prime, then $p(x)$ is irreducible in $K[x]$.

† G. Fields of Algebraic Elements: Algebraic Numbers

Let $F \subseteq K$ and $a, b \in K$. We have seen on page 295 that if $a$ and $b$ are algebraic over $F$, then $F(a, b)$ is a finite extension of $F$. Use the above to prove parts 1 and 2.

1 If $a$ and $b$ are algebraic over $F$, then $a + b$, $a - b$, $ab$, and $a/b$ are algebraic over $F$. (In the last case, assume $b \neq 0$.)
2 The set $\{x \in K : x \text{ is algebraic over } F\}$ is a subfield of $K$, containing $F$.

Any complex number which is algebraic over $\mathbb{Q}$ is called an algebraic number. By part 2, the set of all the algebraic numbers is a field, which we shall designate by $\mathbb{A}$.

Let $a(x) = a_0 + a_1x + \cdots + a_nx^n$ be in $\mathbb{A}[x]$, and let $c$ be any root of $a(x)$. We will prove that $c \in \mathbb{A}$. To begin with, all the coefficients of $a(x)$ are in $\mathbb{Q}(a_0, a_1, \ldots, a_n)$.

3 Prove: $\mathbb{Q}(a_0, a_1, \ldots, a_n)$ is a finite extension of $\mathbb{Q}$.

Let $\mathbb{Q}(a_0, \ldots, a_n) = \mathbb{Q}_1$. Since $a(x) \in \mathbb{Q}_1[x]$, $c$ is algebraic over $\mathbb{Q}_1$. Prove parts 4 and 5:

4 $\mathbb{Q}_1(c)$ is a finite extension of $\mathbb{Q}_1$, hence a finite extension of $\mathbb{Q}$. (Why?)
5 $c \in \mathbb{A}$.

Conclusion: The roots of any polynomial whose coefficients are algebraic numbers are themselves algebraic numbers.

A field $F$ is called algebraically closed if the roots of every polynomial in $F[x]$ are in $F$. We have thus proved that $\mathbb{A}$ is algebraically closed.

CHAPTER THIRTY
RULER AND COMPASS

The ancient Greek geometers considered the circle and straight line to be the most basic of all geometric figures, other figures being merely variants and combinations of these basic ones. To understand this view we must remember that construction played a very important role in Greek geometry: when a figure was defined, a method was also given for constructing it. Certainly the circle and the straight line are the easiest figures to construct, for they require only the most rudimentary of all geometric instruments: the ruler and the compass.
Furthermore, the ruler, in this case, is a simple, unmarked straightedge.

Rudimentary as these instruments may be, they can be used to carry out a surprising variety of geometric constructions. Line segments can be divided into any number of equal parts, and any angle can be bisected. From any polygon it is possible to construct a square having the same area, or twice or three times the area. With amazing ingenuity, Greek geometers devised ways to cleverly use the ruler and compass, unaided by any other instrument, to perform all kinds of intricate and beautiful constructions. They were so successful that it was hard to believe they were unable to perform three little tasks which, at first sight, appear to be very simple: doubling the cube, trisecting any angle, and squaring the circle. The first task demands that a cube be constructed having twice the volume of a given cube. The second asks that any angle be divided into three equal parts. The third requires the construction of a square whose area is equal to that of a given circle. Remember, only a ruler and compass are to be used!

Mathematicians, in Greek antiquity and throughout the Renaissance, devoted a great deal of attention to these problems, and came up with many brilliant ideas. But they never found ways of performing the above three constructions. This is not surprising, for these constructions are impossible! Of course, the Greeks had no way of knowing that fact, for the mathematical machinery needed to prove that these constructions are impossible (in fact, the very notion that one could prove a construction to be impossible) was still two millennia away.

The final resolution of these problems, by proving that the required constructions are impossible, came from a most unlikely source: it was a by-product of the arcane study of field extensions, in the upper reaches of modern algebra.
To understand how all this works, we will see how the process of ruler-and-compass constructions can be placed in the framework of field theory. Clearly, we will be making use of analytic geometry.

If $\mathcal{A}$ is any set of points in the plane, consider operations of the following two kinds:

1. Ruler operation: Through any two points in $\mathcal{A}$, draw a straight line.
2. Compass operation: Given three points $A$, $B$, and $C$ in $\mathcal{A}$, draw a circle with center $C$ and radius equal in length to the segment $AB$.

The points of intersection of any two of these figures (line-line, line-circle, or circle-circle) are said to be constructible in one step from $\mathcal{A}$. A point $P$ is called constructible from $\mathcal{A}$ if there are points $P_1, P_2, \ldots, P_n = P$ such that $P_1$ is constructible in one step from $\mathcal{A}$, $P_2$ is constructible in one step from $\mathcal{A} \cup \{P_1\}$, and so on, so that $P_i$ is constructible in one step from $\mathcal{A} \cup \{P_1, \ldots, P_{i-1}\}$.

As a simple example, let us see that the midpoint of a line segment $AB$ is constructible from the two points $A$ and $B$ in the above sense. Well, given $A$ and $B$, first draw the line $AB$. Then, draw the circle with center $A$ and radius $AB$ and the circle with center $B$ and radius $AB$; let $C$ and $D$ be the points of intersection of these circles. $C$ and $D$ are constructible in one step from $\{A, B\}$. Finally, draw the line through $C$ and $D$; the intersection of this line with $AB$ is the required midpoint. It is constructible from $\{A, B\}$.

As this example shows, the notion of constructible points is the correct formalization of the intuitive idea of ruler-and-compass constructions.

We call a point in the plane constructible if it is constructible from $\mathbb{Q} \times \mathbb{Q}$, that is, from the set of all points in the plane with rational coordinates.

How does field theory fit into this scheme? Obviously by associating with every point its coordinates.
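To see concretely what "associating with every point its coordinates" amounts to, the midpoint construction described above can be traced numerically. The following is a minimal sketch only, not part of the text: the function names `circle_circle`, `line_line`, and `midpoint_by_construction` are hypothetical, floating-point numbers stand in for exact coordinates, and the code assumes the two circles really do meet in two points and the two lines are not parallel.

```python
import math


def circle_circle(c1, r1, c2, r2):
    """Two intersection points of the circles (center c1, radius r1) and
    (center c2, radius r2); assumes they meet in exactly two points."""
    (x1, y1), (x2, y2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)           # distance between centers
    a = (r1 ** 2 - r2 ** 2 + d ** 2) / (2 * d)  # c1 to foot of common chord
    h = math.sqrt(r1 ** 2 - a ** 2)             # half the chord length
    mx = x1 + a * (x2 - x1) / d
    my = y1 + a * (y2 - y1) / d
    return ((mx - h * (y2 - y1) / d, my + h * (x2 - x1) / d),
            (mx + h * (y2 - y1) / d, my - h * (x2 - x1) / d))


def line_line(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4; assumes they are not parallel."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)


def midpoint_by_construction(A, B):
    """Follow the steps in the text: two compass operations, one ruler
    operation, then intersect line CD with line AB."""
    r = math.hypot(B[0] - A[0], B[1] - A[1])    # radius equal to segment AB
    C, D = circle_circle(A, r, B, r)            # constructible in one step
    return line_line(A, B, C, D)                # the required midpoint


print(midpoint_by_construction((0.0, 0.0), (2.0, 0.0)))
```

For $A = (0, 0)$ and $B = (2, 0)$ the intermediate points $C$ and $D$ have coordinates $(1, \pm\sqrt{3})$, which are not rational; this is the kind of number that the field extensions introduced next are meant to keep track of.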
More exactly, with every constructible point $P$ we associate a certain field extension of $\mathbb{Q}$, obtained as follows:

Suppose $P$ has coordinates $(a, b)$ and is constructed from $\mathbb{Q} \times \mathbb{Q}$ in one step. We associate with $P$ the field $\mathbb{Q}(a, b)$, obtained by adjoining to $\mathbb{Q}$ the coordinates of $P$. More generally, suppose $P$ is constructible from $\mathbb{Q} \times \mathbb{Q}$ in $n$ steps: there are then $n$ points $P_1, P_2, \ldots, P_n = P$ such that each $P_i$ is constructible in one step from $\mathbb{Q} \times \mathbb{Q} \cup \{P_1, \ldots, P_{i-1}\}$. Let the coordinates of $P_1, \ldots, P_n$ be $(a_1, b_1), \ldots, (a_n, b_n)$, respectively. With the points $P_1, \ldots, P_n$ we associate fields $K_1, \ldots, K_n$ where $K_1 = \mathbb{Q}(a_1, b_1)$, and for each $i > 1$,

$$K_i = K_{i-1}(a_i, b_i)$$

Thus, $K_1 = \mathbb{Q}(a_1, b_1)$, $K_2 = K_1(a_2, b_2)$, and so on: beginning with $\mathbb{Q}$, we adjoin first the coordinates of $P_1$, then the coordinates of $P_2$, and so on successively, yielding the sequence of extensions

$$\mathbb{Q} \subseteq K_1 \subseteq K_2 \subseteq \cdots \subseteq K_n = K$$

We call $K$ the field extension associated with the point $P$.

Everything we will have to say in the sequel follows easily from the next lemma.

Lemma If $K_1, \ldots, K_n$ are as defined previously, then $[K_i : K_{i-1}] = 1$, 2, or 4 (where $K_0 = \mathbb{Q}$).

Proof: Remember that $K_{i-1}$ already contains the coordinates of $P_1, \ldots, P_{i-1}$, and $K_i$ is obtained by adjoining to $K_{i-1}$ the coordinates $x_i, y_i$ of $P_i$. But $P_i$ is constructible in one step from $\mathbb{Q} \times \mathbb{Q} \cup \{P_1, \ldots, P_{i-1}\}$, so we must consider three cases, corresponding to the three kinds of intersection which may produce $P_i$, namely: line intersects line, line intersects circle, and circle intersects circle.

Line intersects line: Suppose one line passes through the points $(a_1, a_2)$ and $(b_1, b_2)$, and the other line passes through $(c_1, c_2)$ and $(d_1, d_2)$. We may write equations for these lines in terms of the constants $a_1, a_2, b_1, b_2, c_1, c_2$ and $d_1, d_2$ (all of which are in $K_{i-1}$), and then solve these equations simultaneously to give the coordinates $x, y$ of the point of intersection.
Clearly, these values of $x$ and $y$ are expressed in terms of $a_1, a_2, b_1, b_2, c_1, c_2, d_1, d_2$, hence are still in $K_{i-1}$. Thus, $K_i = K_{i-1}$.

Line intersects circle: Consider the line $AB$ and the circle with center $C$ and radius equal to the distance $k = DE$. Let $A$, $B$, $C$ have coordinates $(a_1, a_2)$, $(b_1, b_2)$, and $(c_1, c_2)$, respectively. By hypothesis, $K_{i-1}$ contains the numbers $a_1, a_2, b_1, b_2, c_1, c_2$, as well as $k^2$ = the square of the distance $DE$. (To understand the last assertion, remember that $K_{i-1}$ contains the coordinates of $D$ and $E$; see the figure and use the Pythagorean theorem.)

Now, the line $AB$ has equation

$$\frac{y - b_2}{x - b_1} = \frac{b_2 - a_2}{b_1 - a_1} \tag{1}$$

and the circle has equation

$$(x - c_1)^2 + (y - c_2)^2 = k^2 \tag{2}$$

Solving for $y$ in (1) and substituting into (2) gives

$$(x - c_1)^2 + \left(\frac{b_2 - a_2}{b_1 - a_1}(x - b_1) + b_2 - c_2\right)^2 = k^2$$

This is obviously a quadratic equation, and its roots are the $x$ coordinates of the two points of intersection, $S$ and $T$. Thus, the $x$ coordinates of both points of intersection are roots of a quadratic polynomial with coefficients in $K_{i-1}$. The same is true of the $y$ coordinates. Thus, if $K_i = K_{i-1}(x_i, y_i)$ where $(x_i, y_i)$ is one of the points of intersection, then

$$[K_{i-1}(x_i, y_i) : K_{i-1}] = [K_{i-1}(x_i, y_i) : K_{i-1}(x_i)]\,[K_{i-1}(x_i) : K_{i-1}]$$

Each factor on the right is 1 or 2, because $x_i$ and $y_i$ are roots of quadratic polynomials with coefficients in $K_{i-1}$; hence the product is 1, 2, or 4.

Circle intersects circle: Suppose the two circles have equations

$$x^2 + y^2 + ax + by + c = 0 \tag{3}$$

and

$$x^2 + y^2 + dx + ey + f = 0 \tag{4}$$

Then both points of intersection satisfy

$$(a - d)x + (b - e)y + (c - f) = 0 \tag{5}$$

obtained simply by subtracting (4) from (3). Thus, $x$ and $y$ may be found by solving (4) and (5) simultaneously, which is exactly the preceding case. ■

We are now in a position to prove the main result of this chapter:

Theorem 1: Basic theorem on constructible points If the point with coordinates $(a, b)$ is constructible, then the degree of $\mathbb{Q}(a)$ over $\mathbb{Q}$ is a power of 2, and likewise for the degree of $\mathbb{Q}(b)$ over $\mathbb{Q}$.

Proof: Let $P$ be a constructible point; by definition, there are points $P_1, \ldots, P_n$ with coordinates $(a_1, b_1), \ldots$
, $(a_n, b_n)$ such that each $P_i$ is constructible in one step from $(\mathbb{Q} \times \mathbb{Q}) \cup \{P_1, \ldots, P_{i-1}\}$, and $P_n = P$. Let the fields associated with $P_1, \ldots, P_n$ be $K_1, \ldots, K_n$. Then
$$[K_n : \mathbb{Q}] = [K_n : K_{n-1}]\,[K_{n-1} : K_{n-2}] \cdots [K_1 : \mathbb{Q}]$$
and by the preceding lemma this is a power of 2, say $2^m$. But
$$[K_n : \mathbb{Q}] = [K_n : \mathbb{Q}(a)]\,[\mathbb{Q}(a) : \mathbb{Q}]$$
hence $[\mathbb{Q}(a) : \mathbb{Q}]$ is a factor of $2^m$, hence also a power of 2. ■

We will now use this theorem to prove that ruler-and-compass constructions cannot possibly exist for the three classical problems described in the opening to this chapter.

Theorem 2 "Doubling the cube" is impossible by ruler and compass.

Proof: Let us place the cube on a coordinate system so that one edge of the cube coincides with the unit interval on the $x$ axis. That is, its endpoints are $(0, 0)$ and $(1, 0)$. If we were able to double the cube by ruler and compass, this means we could construct a point $(c, 0)$ such that $c^3 = 2$. However, by Theorem 1, $[\mathbb{Q}(c) : \mathbb{Q}]$ would have to be a power of 2, whereas in fact it is obviously 3. This contradiction proves that it is impossible to double the cube using only a ruler and compass. ■

Theorem 3 "Trisecting the angle" by ruler and compass is impossible. That is, there exist angles which cannot be trisected using a ruler and compass.

Proof: We will show specifically that an angle of 60° cannot be trisected. If we could trisect an angle of 60°, we would be able to construct a point $(c, 0)$ (see figure) where $c = \cos 20°$; hence certainly we could construct $(b, 0)$ where $b = 2\cos 20°$. But from elementary trigonometry
$$\cos 3\theta = 4\cos^3\theta - 3\cos\theta$$
hence
$$\cos 60° = 4\cos^3 20° - 3\cos 20°$$
Thus, $b = 2\cos 20°$ satisfies $b^3 - 3b - 1 = 0$. The polynomial $p(x) = x^3 - 3x - 1$ is irreducible over $\mathbb{Q}$ because $p(x + 1) = x^3 + 3x^2 - 3$ is irreducible by Eisenstein's criterion. It follows that $\mathbb{Q}(b)$ has degree 3 over $\mathbb{Q}$, contradicting the requirement (in Theorem 1) that this degree has to be a power of 2. ■

Theorem 4 "Squaring the circle" by ruler and compass is impossible.

Proof: If we were able to square the circle by ruler and compass, it would be possible to construct the point $(0, \sqrt{\pi})$; hence by Theorem 1, $[\mathbb{Q}(\sqrt{\pi}) : \mathbb{Q}]$ would be a power of 2. But it is well known that $\pi$ is transcendental over $\mathbb{Q}$. By Theorem 3 of Chapter 29, the square of an algebraic element is algebraic; hence $\sqrt{\pi}$ is transcendental. It follows that $\mathbb{Q}(\sqrt{\pi})$ is not even a finite extension of $\mathbb{Q}$, much less an extension of some degree $2^m$ as required. ■

EXERCISES

† A. Constructible Numbers

If $O$ and $I$ are any two points in the plane, consider a coordinate system such that the interval $OI$ coincides with the unit interval on the $x$ axis. Let $D$ be the set of real numbers such that $a \in D$ iff the point $(a, 0)$ is constructible from $\{O, I\}$. Prove the following:
1 If $a, b \in D$, then $a + b \in D$ and $a - b \in D$.
2 If $a, b \in D$, then $ab \in D$. (Hint: Use similar triangles. See the accompanying figure.)
3 If $a, b \in D$, then $a/b \in D$. (Use the same figure as in part 2.)
4 If $a > 0$ and $a \in D$, then $\sqrt{a} \in D$. (Hint: In the accompanying figure, $AB$ is the diameter of a circle. Use an elementary property of chords of a circle to show that $x = \sqrt{a}$.)
It follows from parts 1 to 4 that $D$ is a field, closed with respect to taking square roots of positive numbers. $D$ is called the field of constructible numbers.
5 $\mathbb{Q} \subseteq D$.
6 If $a$ is a real root of any quadratic polynomial with coefficients in $D$, then $a \in D$. (Hint: Complete the square and use part 4.)

† B. Constructible Points and Constructible Numbers

Prove each of the following:
1 Let $\mathcal{A}$ be any set of points in the plane; $(a, b)$ is constructible from $\mathcal{A}$ iff $(a, 0)$ and $(0, b)$ are constructible from $\mathcal{A}$.
2 If a point $P$ is constructible from $\{O, I\}$ [that is, from $(0, 0)$ and $(1, 0)$], then $P$ is constructible from $\mathbb{Q} \times \mathbb{Q}$.
# 3 Every point in $\mathbb{Q} \times \mathbb{Q}$ is constructible from $\{O, I\}$.
(Use Exercise A5 and the definition of $D$.)
4 If a point $P$ is constructible from $\mathbb{Q} \times \mathbb{Q}$, it is constructible from $\{O, I\}$.
By combining parts 2 and 4, we get the following important fact: Any point $P$ is constructible from $\mathbb{Q} \times \mathbb{Q}$ iff $P$ is constructible from $\{O, I\}$. Thus, we may define a point to be constructible iff it is constructible from $\{O, I\}$.
5 A point $P$ is constructible iff both its coordinates are constructible numbers.

† C. Constructible Angles

An angle $\alpha$ is called constructible iff there exist constructible points $A$, $B$, and $C$ such that $\angle ABC = \alpha$. Prove the following:
1 The angle $\alpha$ is constructible iff $\sin\alpha$ and $\cos\alpha$ are constructible numbers.
2 $\cos\alpha \in D$ iff $\sin\alpha \in D$.
3 If $\cos\alpha, \cos\beta \in D$, then $\cos(\alpha + \beta), \cos(\alpha - \beta) \in D$.
4 $\cos(2\alpha) \in D$ iff $\cos\alpha \in D$.
5 If $\alpha$ and $\beta$ are constructible angles, so are $\alpha + \beta$, $\alpha - \beta$, $\tfrac{1}{2}\alpha$, and $n\alpha$ for any positive integer $n$.
# 6 The following angles are constructible: 30°, 75°, 22½°.
7 The following angles are not constructible: 20°, 40°, 140°. (Hint: Use the proof of Theorem 3.)

D. Constructible Polygons

A polygon is called constructible iff its vertices are constructible points. Prove the following:
# 1 The regular $n$-gon is constructible iff the angle $2\pi/n$ is constructible.
2 The regular hexagon is constructible.
3 The regular polygon of nine sides is not constructible.

† E. A Constructible Polygon

We will show that $2\pi/5$ is a constructible angle, and it will follow that the regular pentagon is constructible.
1 If $r = \cos k + i \sin k$ is a complex number, prove that $1/r = \cos k - i \sin k$. Conclude that $r + 1/r = 2\cos k$.
By de Moivre's theorem,
$$\omega = \cos\frac{2\pi}{5} + i \sin\frac{2\pi}{5}$$
is a complex fifth root of unity. Since
$$x^5 - 1 = (x - 1)(x^4 + x^3 + x^2 + x + 1)$$
$\omega$ is a root of $p(x) = x^4 + x^3 + x^2 + x + 1$.
2 Prove that $\omega^2 + \omega + 1 + \omega^{-1} + \omega^{-2} = 0$.
3 Prove that
$$4\cos^2\frac{2\pi}{5} + 2\cos\frac{2\pi}{5} - 1 = 0$$
(Hint: Use parts 1 and 2.) Conclude that $\cos(2\pi/5)$ is a root of the quadratic $4x^2 + 2x - 1$.
4 Use part 3 and A6 to prove that $\cos(2\pi/5)$ is a constructible number.
5 Prove that $2\pi/5$ is a constructible angle.
6 Prove that the regular pentagon is constructible.

† F. A Nonconstructible Polygon

By de Moivre's theorem,
$$\omega = \cos\frac{2\pi}{7} + i \sin\frac{2\pi}{7}$$
is a complex seventh root of unity. Since
$$x^7 - 1 = (x - 1)(x^6 + x^5 + x^4 + x^3 + x^2 + x + 1)$$
$\omega$ is a root of $x^6 + x^5 + x^4 + x^3 + x^2 + x + 1$.
1 Prove that $\omega^3 + \omega^2 + \omega + 1 + \omega^{-1} +$ . . .

. . . $\in F(c)$, and we are done. Let us suppose $\deg p(x) \geq 2$ and get a contradiction: observe that $h(x)$ and $B(x)$ must both be multiples of $p(x)$ because both have $b$ as a root, and $p(x)$ is the minimum polynomial of $b$. But if $h(x)$ and $B(x)$ have a common factor of degree $\geq 2$, they must have two or more roots in common, contrary to the fact that $b$ is their only common root. Our proof is complete. ■

For example, we may apply this theorem directly to $\mathbb{Q}(\sqrt{2}, \sqrt{3})$. Taking $t = 1$, we get $c = \sqrt{2} + \sqrt{3}$, hence $\mathbb{Q}(\sqrt{2}, \sqrt{3}) = \mathbb{Q}(\sqrt{2} + \sqrt{3})$.

If $a(x)$ is a polynomial of degree $n$ in $F[x]$, let its roots be $c_1, \ldots, c_n$. Then $F(c_1, \ldots, c_n)$ is clearly the smallest extension of $F$ containing all the roots of $a(x)$. $F(c_1, \ldots, c_n)$ is called the root field of $a(x)$ over $F$. We will have a great deal to say about root fields in this and subsequent chapters.

Isomorphisms were important when we were dealing with groups, and they are important also for fields. You will remember that if $F_1$ and $F_2$ are fields, an isomorphism from $F_1$ to $F_2$ is a bijective function $h: F_1 \to F_2$ satisfying
$$h(a + b) = h(a) + h(b) \quad \text{and} \quad h(ab) = h(a)h(b)$$
From these equations it follows that $h(0) = 0$, $h(1) = 1$, $h(-a) = -h(a)$, and $h(a^{-1}) = (h(a))^{-1}$.

Suppose $F_1$ and $F_2$ are fields, and $h: F_1 \to F_2$ is an isomorphism. Let $K_1$ and $K_2$ be extensions of $F_1$ and $F_2$, and let $\bar{h}: K_1 \to K_2$ also be an isomorphism. We call $\bar{h}$ an extension of $h$ if $\bar{h}(x) = h(x)$ for every $x$ in $F_1$, that is, if $h$ and $\bar{h}$ are the same on $F_1$.
($\bar{h}$ is an extension of $h$ in the plain sense that it is formed by "adding on" to $h$.)

As an example, given any isomorphism $h: F_1 \to F_2$, we can extend $h$ to an isomorphism $\bar{h}: F_1[x] \to F_2[x]$. (Note that $F[x]$ is an extension of $F$ when we think of the elements of $F$ as constant polynomials; of course, $F[x]$ is not a field, simply an integral domain, but in the present example this fact is unimportant.) Now we ask: What is an obvious and natural way of extending $h$? The answer, quite clearly, is to let $\bar{h}$ send the polynomial with coefficients $a_0, a_1, \ldots, a_n$ to the polynomial with coefficients $h(a_0), h(a_1), \ldots, h(a_n)$:
$$\bar{h}(a_0 + a_1 x + \cdots + a_n x^n) = h(a_0) + h(a_1)x + \cdots + h(a_n)x^n$$
It is child's play to verify formally that $\bar{h}$ is an isomorphism from $F_1[x]$ to $F_2[x]$. In the sequel, the polynomial $\bar{h}(a(x))$, obtained in this fashion, will be denoted simply by $ha(x)$. Because $\bar{h}$ is an isomorphism, $a(x)$ is irreducible iff $ha(x)$ is irreducible.

A very similar isomorphism extension is given in the next theorem.

Theorem 3 Let $h: F_1 \to F_2$ be an isomorphism, and let $p(x)$ be irreducible in $F_1[x]$. Suppose $a$ is a root of $p(x)$, and $b$ a root of $hp(x)$. Then $h$ can be extended to an isomorphism
$$\bar{h}: F_1(a) \to F_2(b)$$
Furthermore, $\bar{h}(a) = b$.

Proof: Remember that every element of $F_1(a)$ is of the form
$$c_0 + c_1 a + \cdots + c_n a^n$$
where $c_0, \ldots, c_n$ are in $F_1$, and every element of $F_2(b)$ is of the form $d_0 + d_1 b + \cdots + d_n b^n$ where $d_0, \ldots, d_n$ are in $F_2$. Imitating what we did successfully in the preceding example, we let $\bar{h}$ send the expression with coefficients $c_0, \ldots, c_n$ to the expression with coefficients $h(c_0), \ldots, h(c_n)$:
$$\bar{h}(c_0 + c_1 a + \cdots + c_n a^n) = h(c_0) + h(c_1)b + \cdots + h(c_n)b^n$$
Again, it is routine to verify that $\bar{h}$ is an isomorphism. Details are laid out in Exercise H at the end of the chapter. ■

Most often we use Theorem 3 in the special case where $F_1$ and $F_2$ are the same field (let us call it $F$) and $h$ is the identity function $\varepsilon: F \to F$. [Remember that the identity function is $\varepsilon(x) = x$.] When we apply Theorem 3 to the identity function $\varepsilon: F \to F$, we get

Theorem 4 Suppose $a$ and $b$ are roots of the same irreducible polynomial $p(x)$ in $F[x]$. Then there is an isomorphism $g: F(a) \to F(b)$ such that $g(x) = x$ for every $x$ in $F$, and $g(a) = b$.

Now let us consider the following situation: $K$ and $K'$ are finite extensions of $F$, and $K$ and $K'$ have a common extension $E$. If $h: K \to K'$ is an isomorphism such that $h(x) = x$ for every $x$ in $F$, we say that $h$ fixes $F$. Let $c$ be an element of $K$; if $h$ fixes $F$, and $c$ is a root of some polynomial $a(x) = a_0 + \cdots + a_n x^n$ in $F[x]$, $h(c)$ also is a root of $a(x)$. It is easy to see why: the coefficients of $a(x)$ are in $F$ and are therefore not changed by $h$. So if $a(c) = 0$, then
$$a(h(c)) = a_0 + a_1 h(c) + \cdots + a_n h(c)^n = h(a_0 + a_1 c + \cdots + a_n c^n) = h(0) = 0$$
What we have just shown may be expressed as follows:

(*) Let $a(x)$ be any polynomial in $F[x]$. Any isomorphism which fixes $F$ sends roots of $a(x)$ to roots of $a(x)$.

If $K$ happens to be the root field of $a(x)$ over $F$, the situation becomes even more interesting. Say $K = F(c_1, c_2, \ldots, c_n)$, where $c_1, c_2, \ldots, c_n$ are the roots of $a(x)$. If $h: K \to K'$ is any isomorphism which fixes $F$, then by (*), $h$ permutes $c_1, c_2, \ldots, c_n$. Now, by the brief discussion headed "For later reference" on page 296, every element of $F(c_1, \ldots, c_n)$ is a sum of terms of the form
$$k\,c_1^{i_1} c_2^{i_2} \cdots c_n^{i_n}$$
where the coefficient $k$ is in $F$. Because $h$ fixes $F$, $h(k) = k$. Furthermore, $c_1, c_2, \ldots, c_n$ are the roots of $a(x)$, so by (*), the product $c_1^{i_1} c_2^{i_2} \cdots c_n^{i_n}$ is transformed by $h$ into another product of the same form. Thus, $h$ sends every element of $F(c_1, c_2, \ldots, c_n)$ to another element of $F(c_1, c_2, \ldots, c_n)$.

The above comments are summarized in the next theorem.

Theorem 5 Let $K$ and $K'$ be finite extensions of $F$. Assume $K$ is the root field of some polynomial over $F$.
If $h: K \to K'$ is an isomorphism which fixes $F$, then $K = K'$.

Proof: From Theorem 2, $K$ and $K'$ are simple extensions of $F$, say $K = F(a)$ and $K' = F(b)$. Then $E = F(a, b)$ is a common extension of $K$ and $K'$. By the comments preceding this theorem, $h$ maps every element of $K$ to an element of $K$; since $h$ maps $K$ onto $K'$, it follows that $K' \subseteq K$. Since the same argument may be carried out for $h^{-1}$, we also have $K \subseteq K'$. ■

Theorem 5 is often used in tandem with the following (see the figure on the next page):

Theorem 6 Let $L$ and $L'$ be finite extensions of $F$. Let $K$ be an extension of $L$ such that $K$ is a root field over $F$. Any isomorphism $h: L \to L'$ which fixes $F$ can be extended to an isomorphism $\bar{h}: K \to K$.

Proof: From Theorem 2, $K$ is a simple extension of $L$, say $K = L(c)$. Now we can use Theorem 3 to extend the isomorphism $h: L \to L'$ to an isomorphism
$$\bar{h}: \underbrace{L(c)}_{K} \to \underbrace{L'(d)}_{K'}$$
By Theorem 5 applied to $\bar{h}$, $K = K'$. ■

Remark: It follows from the theorem that $L' \subseteq K$, since $\operatorname{ran} h \subseteq \operatorname{ran} \bar{h} = K$.

For later reference. The following results, which are of a somewhat technical nature, will be needed later. The first presents a surprisingly strong property of root fields.

Theorem 7 Let $K$ be the root field of some polynomial over $F$. For every irreducible polynomial $p(x)$ in $F[x]$, if $p(x)$ has one root in $K$, then $p(x)$ must have all of its roots in $K$.

Proof: Indeed, suppose $p(x)$ has a root $a$ in $K$, and let $b$ be any other root of $p(x)$. From Theorem 4, there is an isomorphism $h: F(a) \to F(b)$ fixing $F$. But $F(a) \subseteq K$; so from Theorem 6 and the remark following it $F(b) \subseteq K$; hence $b \in K$. ■

Theorem 8 Suppose $I \subseteq E \subseteq K$, where $E$ is a finite extension of $I$ and $K$ is a finite extension of $E$. If $K$ is the root field of some polynomial over $I$, then $K$ is also the root field of some polynomial over $E$.

Proof: Suppose $K$ is a root field of some polynomial over $I$. Then $K$ is a root field of the same polynomial over $E$. ■

EXERCISES

A. Examples of Root Fields over $\mathbb{Q}$

Example Find the root field of $a(x) = (x^2 - 3)(x^3 - 1)$ over $\mathbb{Q}$.
Solution The complex roots of $a(x)$ are $\pm\sqrt{3}$, $1$, $\tfrac{1}{2}(-1 \pm \sqrt{3}\,i)$, so the root field is $\mathbb{Q}(\pm\sqrt{3}, 1, \tfrac{1}{2}(-1 \pm \sqrt{3}\,i))$. The same field can be written more simply as $\mathbb{Q}(\sqrt{3}, i)$.
1 Show that $\mathbb{Q}(\sqrt{3}, i)$ is the root field of $(x^2 - 2x - 2)(x^2 + 1)$ over $\mathbb{Q}$.
Comparing part 1 with the example, we note that different polynomials may have the same root field. This is true even if the polynomials are irreducible.
2 Prove that $x^2 - 3$ and $x^2 - 2x - 2$ are both irreducible over $\mathbb{Q}$. Then find their root fields over $\mathbb{Q}$ and show they are the same.
3 Find the root field of $x^4 - 2$, first over $\mathbb{Q}$, then over $\mathbb{R}$.
4 Explain: $\mathbb{Q}(i, \sqrt{2})$ is the root field of $x^4 - 2x^2 + 9$ over $\mathbb{Q}$, and is the root field of $x^2 - 2\sqrt{2}\,x + 3$ over $\mathbb{Q}(\sqrt{2})$.
5 Find irreducible polynomials $a(x)$ over $\mathbb{Q}$, and $b(x)$ over $\mathbb{Q}(i)$, such that $\mathbb{Q}(i, \sqrt{3})$ is the root field of $a(x)$ over $\mathbb{Q}$, and is the root field of $b(x)$ over $\mathbb{Q}(i)$. Then do the same for ©(V3!, V^).
# 6 Which of the following extensions are root fields over $\mathbb{Q}$? Justify your answer: $\mathbb{Q}(i)$; $\mathbb{Q}(\sqrt{2})$; $\mathbb{Q}(\sqrt[3]{2})$, where $\sqrt[3]{2}$ is the real cube root of 2; $\mathbb{Q}(2 + \sqrt{5})$; $\mathbb{Q}(i + \sqrt{3})$; $\mathbb{Q}(i, \sqrt{2}, \sqrt{3})$.

B. Examples of Root Fields over $\mathbb{Z}_p$

Example Find the root field of $x^2 + 1$ over $\mathbb{Z}_3$.
Solution By the basic theorem of field extensions,
$$\mathbb{Z}_3[x]/\langle x^2 + 1 \rangle \cong \mathbb{Z}_3(u)$$
where $u$ is a root of $x^2 + 1$. In $\mathbb{Z}_3(u)$, $x^2 + 1 = (x + u)(x - u)$, because $u^2 + 1 = 0$. Since $\mathbb{Z}_3(u)$ contains $\pm u$, it is the root field of $x^2 + 1$ over $\mathbb{Z}_3$. Note that $\mathbb{Z}_3(u)$ has nine elements, and its addition and multiplication tables are easy to construct. (See Chapter 27, Exercise C4.)
1 Show that, in any extension of $\mathbb{Z}_3$ which contains a root $u$ of
$$a(x) = x^3 + 2x + 1 \in \mathbb{Z}_3[x]$$
it happens that $u + 1$ and $u + 2$ are the remaining two roots of $a(x)$. Use this fact to find the root field of $x^3 + 2x + 1$ over $\mathbb{Z}_3$. List the elements of the root field.
2 Find the root field of $x^2 + x + 2$ over $\mathbb{Z}_3$, and write its addition and multiplication tables.
3 Find the root field of $x^3 + x^2 + 1 \in \mathbb{Z}_2[x]$ over $\mathbb{Z}_2$. Write its addition and multiplication tables.
4 Find the root field over Z2 of x3 + x + 1 G 12\x]. (Caution: This will prove to be a little more difficult than part 3.) # 5 Find the root field of x3 + x2 + x + 2 over Z3. Find a basis for this root field over Z,. C. Short Questions Relating to Root Field Prove each of the following 1 Every extension of degree 2 is a root field. 2 If F C / C K and K is a root field of a(x) over F, then K is a root field of a(x) over /. 3 The root field over R of any polynomial in U[x] is R or C. 4 If c is a complex root of a cubic a(x) G Q[x], then Q(c) is the root field of a(x) over Q. # 5 If p(x) = x4 + ax2 + b is irreducible in F[x], then F[x]/(p(x)) is the root field of p(x) over F. 6 If K — F(a) and K is the root field of some polynomial over F, then K is the root field of the minimum polynomial of a over F. 7 Every root field over F is the root field of some irreducible polynomial over F. (Hint: Use part 6 and Theorem 2.) 8 Suppose [K :F] = n, where K is a root field over F. Then K is the root field over F of every irreducible polynomial of degree n in F[x] having a root in K. 9 If a(x) is a polynomial of degree n in F[x], and K is the root field of a(x) over F, then [K: F] divides n\ D. Reducing Iterated Extensions to Simple Extensions 1 Find c such that Q(V2, V^3) = Q(c). Do the same for Q(V2,vI) 2 Let a be a root of x3 - x + 1, and b a root of x2 - 2x - 1. Find c such that Q(a, b) = Q(c). (Hint: Use calculus to show that x3 - x + 1 has one real and two complex roots, and explain why no two of these may differ by a real number.) # 3 Find c such that Q(V2, V3, V^S) = Q(c). 4 Find an irreducible polynomial p(x) such that Q(V2, V3) is the root field of p(x) over Q. (Hint: Use Exercise C6.) 5 Do the same as in part 4 for Q(V2, V3, V^5). t E. Roots of Unity and Radical Extensions De Moivre's theorem provides an explicit formula to write the n complex nth roots of 1. (See Chapter 16, Exercise H.) 
By de Moivre's formula, the $n$th roots of unity consist of $\omega = \cos(2\pi/n) + i \sin(2\pi/n)$ and its first $n$ powers, namely, $1, \omega, \omega^2, \ldots, \omega^{n-1}$. We call $\omega$ a primitive $n$th root of unity, because all the other $n$th roots of unity are powers of $\omega$. Clearly, every $n$th root of unity (except 1) is a root of
$$x^{n-1} + x^{n-2} + \cdots + x + 1$$
This polynomial is irreducible if $n$ is a prime (see Chapter 26, Exercise D3). Prove parts 1-3, where $\omega$ denotes a primitive $n$th root of unity.
1 $\mathbb{Q}(\omega)$ is the root field of $x^n - 1$ over $\mathbb{Q}$.
2 If $n$ is a prime, $[\mathbb{Q}(\omega) : \mathbb{Q}] = n - 1$.
3 If $n$ is a prime, $\omega^{n-1}$ is equal to a linear combination of $1, \omega, \ldots, \omega^{n-2}$ with rational coefficients.
4 Find $[\mathbb{Q}(\omega) : \mathbb{Q}]$, where $\omega$ is a primitive $n$th root of unity, for $n = 6$, $7$, and $8$.
5 Prove that for any $r \in \{1, 2, \ldots, n - 1\}$, $\sqrt[n]{a}\,\omega^r$ is an $n$th root of $a$. Conclude that $\sqrt[n]{a}, \sqrt[n]{a}\,\omega, \ldots, \sqrt[n]{a}\,\omega^{n-1}$ are the $n$ complex $n$th roots of $a$.
6 Prove that $\mathbb{Q}(\omega, \sqrt[n]{a})$ is the root field of $x^n - a$ over $\mathbb{Q}$.
7 Find the degree of $\mathbb{Q}(\omega, \sqrt[3]{2})$ over $\mathbb{Q}$, where $\omega$ is a primitive cube root of 1. Also show that $\mathbb{Q}(\omega, \sqrt[3]{2}) = \mathbb{Q}(\sqrt[3]{2}, i\sqrt{3})$. (Hint: Compute $\omega$.)
8 Prove that if $K$ is the root field of any polynomial over $\mathbb{Q}$, and $K$ contains an $n$th root of any number $a$, then $K$ contains all the $n$th roots of unity.

† F. Separable and Inseparable Polynomials

Let $F$ be a field. An irreducible polynomial $p(x)$ in $F[x]$ is said to be separable over $F$ if it has no multiple roots in any extension of $F$. If $p(x)$ does have a multiple root in some extension, it is inseparable over $F$.
1 Prove that if $F$ has characteristic 0, every irreducible polynomial in $F[x]$ is separable.
Thus, for characteristic 0, there is no question whether an irreducible polynomial is separable or not. However, for characteristic $p \neq 0$, it is different. This case is treated next. In the following problems, let $F$ be a field of characteristic $p \neq 0$.
2 If $a'(x) = 0$, prove that the only nonzero terms of $a(x)$ are of the form $a_{mp}x^{mp}$ for some $m$.
[In other words, $a(x)$ is a polynomial in powers of $x^p$.]
3 Prove that if an irreducible polynomial $a(x)$ is inseparable over $F$, then $a(x)$ is a polynomial in powers of $x^p$. (Hint: Use part 2, and reason as in the proof of Theorem 1.)
4 Use Chapter 27, Exercise J (especially the conclusion following J6) to prove the converse of part 3.
Thus, if $F$ is a field of characteristic $p \neq 0$, an irreducible polynomial $a(x) \in F[x]$ is inseparable iff $a(x)$ is a polynomial in powers of $x^p$. For finite fields, we can say even more:
5 Prove that if $F$ is any field of characteristic $p \neq 0$, then in $F[x]$,
$$(a_0 + a_1 x + \cdots + a_n x^n)^p = a_0^p + a_1^p x^p + \cdots + a_n^p x^{np}$$
(Hint: See Chapter 24, Exercise D6.)
6 If $F$ is a finite field of characteristic $p \neq 0$, prove that, in $F[x]$, every polynomial $a(x^p)$ is equal to $[b(x)]^p$ for some $b(x)$. [Hint: Use part 5 and the fact that in a finite field of characteristic $p$, every element has a $p$th root (see Chapter 20, Exercise F).]
7 Use parts 3 and 6 to prove: In any finite field, every irreducible polynomial is separable.
Thus, fields of characteristic 0 and finite fields share the property that irreducible polynomials have no multiple roots. The only remaining case is that of infinite fields with finite characteristic. It is treated in the next exercise set.

† G. Multiple Roots over Infinite Fields of Nonzero Characteristic

If $\mathbb{Z}_p[y]$ is the domain of polynomials (in the letter $y$) over $\mathbb{Z}_p$, let $E = \mathbb{Z}_p(y)$ be the field of quotients of $\mathbb{Z}_p[y]$. Let $K$ denote the subfield $\mathbb{Z}_p(y^p)$ of $\mathbb{Z}_p(y)$.
1 Explain why $\mathbb{Z}_p(y)$ and $\mathbb{Z}_p(y^p)$ are infinite fields of characteristic $p$.
2 Prove that $a(x) = x^p - y^p$ has the factorization $x^p - y^p = (x - y)^p$ in $E[x]$, but is irreducible in $K[x]$. Conclude that there is an irreducible polynomial $a(x)$ in $K[x]$ with a root whose multiplicity is $p$.
Thus, over an infinite field of nonzero characteristic, an irreducible polynomial may have multiple roots.
Even these fields, however, have a remarkable property: all the roots of any irreducible polynomial have the same multiplicity. The details follow: Let F be any field, p(x) irreducible in F[x], a and b two distinct roots of p(x), and K the root field of p(x) over F. Let i: AT-* i(K) = K' be the isomorphism of Theorem 4, and f: AT[jt[—> K'[x] the isomorphism described immediately preceding Theorem 3. 3 Prove that (leaves p(x) fixed. 4 Prove that u[(x -a)m) = (x- b)m. 5 Prove that a and b have the same multiplicity. t H. An Isomorphism Extension Theorem (Proof of Theorem 3) Let F,, F2, h, p(x), a, b, and h be as in the statement of Theorem 3. To prove that h is an isomorphism, it must first be shown that it is properly defined: that is, if c(a) = o'(a) in F,(a), then h(c(a)) = h(d(a)). 1 If c(a) = d(a), prove that c{x) - d(x) is a multiple of p(x). Deduce from this that hc{x) — hd(x) is a multiple of hp(x). # 2 Use part 1 to prove that h(c(a)) = h(d(a)). 3 Reversing the steps of the preceding argument, show that h is injective. 4 Show that h is surjective. 5 Show that h is a homomorphism. t I. Uniqueness of the Root Field Let h: F, —> F2 be an isomorphism. If a(x) 6 F,[x], let Kl be the root field of a(x) over Flt and K2 the root field of ha(x) over F2. 322 CHAPTER THIRTY-ONE 1 Prove: If p(x) is an irreducible factor of a(x), u G AT, is a root of p(x), and v G K2 is a root of hp(x), then F,(m) = F2(i>). 2 F,(u) = KI iffF2(v) = K1. # 3 Use parts 1 and 2 to form an inductive proof that K, = K2. 4 Draw the following conclusion: The root field of a polynomial a(x) over a field F is unique up to isomorphism. t J. Extending Isomorphism In the following, let F be a subfield of C. An infective homomorphism h: F-called a monomorphism; it is obviously an isomorphism F—>h(F). •C is 1 Let íú be a complex pth root of unity (where p is a prime), and let h: Q(o>)—»C be a monomorphism fixing Q. Explain why h is completely determined by the value of h(ú)). 
Then prove that there exist exactly $p - 1$ monomorphisms $\mathbb{Q}(\omega) \to \mathbb{C}$ which fix $\mathbb{Q}$.
# 2 Let $p(x)$ be irreducible in $F[x]$, and $c$ a complex root of $p(x)$. Let $h: F \to \mathbb{C}$ be a monomorphism. If $\deg p(x) = n$, prove that there are exactly $n$ monomorphisms $F(c) \to \mathbb{C}$ which are extensions of $h$.
3 Let $F \subseteq K \subseteq \mathbb{C}$, with $[K : F] = n$. If $h: F \to \mathbb{C}$ is a monomorphism, prove that there are exactly $n$ monomorphisms $K \to \mathbb{C}$ which are extensions of $h$.
# 4 Prove: The only possible monomorphism $h: \mathbb{Q} \to \mathbb{C}$ is $h(x) = x$. Thus, any monomorphism $h: \mathbb{Q}(a) \to \mathbb{C}$ necessarily fixes $\mathbb{Q}$.
5 Prove: There are exactly three monomorphisms $\mathbb{Q}(\sqrt[3]{2}) \to \mathbb{C}$, and they are determined by the conditions: $\sqrt[3]{2} \to \sqrt[3]{2}$; $\sqrt[3]{2} \to \sqrt[3]{2}\,\omega$; $\sqrt[3]{2} \to \sqrt[3]{2}\,\omega^2$, where $\omega$ is a primitive cube root of unity.

K. Normal Extensions

If $K$ is the root field of some polynomial $a(x)$ over $F$, $K$ is also called a normal extension of $F$. There are other possible ways of defining normal extensions, which are equivalent to the above. We consider the two most common ones here: they are precisely the properties expressed in Theorems 7 and 6. Let $K$ be a finite extension of $F$.
1 Suppose that for every irreducible polynomial $p(x)$ in $F[x]$, if $p(x)$ has one root in $K$, then $p(x)$ must have all its roots in $K$. Prove that $K$ is a normal extension of $F$.
2 Suppose that, if $h$ is any isomorphism with domain $K$ which fixes $F$, then $h(K) \subseteq K$. Prove that $K$ is a normal extension of $F$.

CHAPTER THIRTY-TWO
GALOIS THEORY: THE HEART OF THE MATTER

If $K$ is a field and $h$ is an isomorphism from $K$ to $K$, we call $h$ an automorphism of $K$ (automorphism = "self-isomorphism").

We begin this chapter by restating Theorems 5 and 6 of Chapter 31:

Let $K$ be the root field of some polynomial over $F$; suppose $a \in K$:
(i) Any isomorphism with domain $K$ which fixes $F$ is an automorphism of $K$.
(ii) If $a$ and $b$ are roots of an irreducible polynomial $p(x)$ in $F[x]$, there is an automorphism of $K$ fixing $F$ and sending $a$ to $b$.

Rule (i) is merely a restatement of Theorem 5 of Chapter 31, using the notion of automorphism.
Rule (ii) is a result of combining Theorem 4 of Chapter 31 [which asserts that there exists an $F$-fixing isomorphism from $L = F(a)$ to $L' = F(b)$] with Theorem 6 of the same chapter.

Let $K$ be the root field of a polynomial $a(x)$ in $F[x]$. If $c_1, c_2, \ldots, c_n$ are the roots of $a(x)$, then $K = F(c_1, c_2, \ldots, c_n)$, and, by (*) on page 316, any automorphism $h$ of $K$ which fixes $F$ permutes $c_1, c_2, \ldots, c_n$. On the other hand, remember that every element $a$ in $F(c_1, c_2, \ldots, c_n)$ is a sum of terms of the form
$$k\,c_1^{i_1} c_2^{i_2} \cdots c_n^{i_n}$$
where the coefficient $k$ of each term is in $F$. If $h$ is an automorphism which fixes $F$, $h$ does not change the coefficients, so $h(a)$ is completely determined once we know $h(c_1), \ldots, h(c_n)$. Thus, every automorphism of $K$ fixing $F$ is completely determined by a permutation of the roots of $a(x)$.

This is very important! What it means is that we may identify the automorphisms of $K$ which fix $F$ with permutations of the roots of $a(x)$.

It must be pointed out here that, just as the symmetries of geometric figures determine their geometric properties, so the symmetries of equations (that is, permutations of their roots) give us all the vital information needed to analyze their solutions. Thus, if $K$ is the root field of our polynomial $a(x)$ over $F$, we will now pay very close attention to the automorphisms of $K$ which fix $F$.

To begin with, how many such automorphisms are there? The answer is a classic example of mathematical elegance and simplicity.

Theorem 1 Let $K$ be the root field of some polynomial over $F$. The number of automorphisms of $K$ fixing $F$ is equal to the degree of $K$ over $F$.

Proof: Let $[K : F] = n$, and let us show that $K$ has exactly $n$ automorphisms fixing $F$. By Theorem 2 of Chapter 31, $K = F(a)$ for some $a \in K$. Let $p(x)$ be the minimum polynomial of $a$ over $F$; if $b$ is any root of $p(x)$, then by (ii) on the previous page, there is an automorphism of $K$ fixing $F$ and sending $a$ to $b$.
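Since $p(x)$ has $n$ roots, there are exactly $n$ choices of $b$, and therefore $n$ automorphisms of $K$ fixing $F$. [Remember that every automorphism $h$ which fixes $F$ permutes the roots of $p(x)$ and therefore sends $a$ to some root of $p(x)$; and $h$ is completely determined once we have chosen $h(a)$.] ■

For example, we have already seen that $\mathbb{Q}(\sqrt{2})$ is of degree 2 over $\mathbb{Q}$. $\mathbb{Q}(\sqrt{2})$ is the root field of $x^2 - 2$ over $\mathbb{Q}$ because $\mathbb{Q}(\sqrt{2})$ contains both roots of $x^2 - 2$, namely $\pm\sqrt{2}$. By Theorem 1, there are exactly two automorphisms of $\mathbb{Q}(\sqrt{2})$ fixing $\mathbb{Q}$: one sends $\sqrt{2}$ to $\sqrt{2}$; it is the identity function. The other sends $\sqrt{2}$ to $-\sqrt{2}$, and is therefore the function $a + b\sqrt{2} \to a - b\sqrt{2}$.

Similarly, we saw that $\mathbb{C} = \mathbb{R}(i)$, and $\mathbb{C}$ is of degree 2 over $\mathbb{R}$. The two automorphisms of $\mathbb{C}$ which fix $\mathbb{R}$ are the identity function and the function $a + bi \to a - bi$ which sends every complex number to its complex conjugate.

As a final example, we have seen that $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ is an extension of degree 4 over $\mathbb{Q}$, so by Theorem 1, there are four automorphisms of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ which fix $\mathbb{Q}$. Now, $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ is the root field of $(x^2 - 2)(x^2 - 3)$ over $\mathbb{Q}$ for it contains the roots of this polynomial, and any extension of $\mathbb{Q}$ containing the roots of $(x^2 - 2)(x^2 - 3)$ certainly contains $\sqrt{2}$ and $\sqrt{3}$. Thus, by (*) on page 316, each of the four automorphisms which fix $\mathbb{Q}$ sends roots of $x^2 - 2$ to roots of $x^2 - 2$, and roots of $x^2 - 3$ to roots of $x^2 - 3$. But there are only four possible ways of doing this, namely,
$$\left\{\begin{matrix}\sqrt{2} \to \sqrt{2}\\ \sqrt{3} \to \sqrt{3}\end{matrix}\right\}, \quad \left\{\begin{matrix}\sqrt{2} \to -\sqrt{2}\\ \sqrt{3} \to \sqrt{3}\end{matrix}\right\}, \quad \left\{\begin{matrix}\sqrt{2} \to \sqrt{2}\\ \sqrt{3} \to -\sqrt{3}\end{matrix}\right\}, \quad \text{and} \quad \left\{\begin{matrix}\sqrt{2} \to -\sqrt{2}\\ \sqrt{3} \to -\sqrt{3}\end{matrix}\right\}$$
Since every element of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ is of the form $a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$, these four automorphisms (we shall call them $\varepsilon$, $\alpha$, $\beta$, and $\gamma$) are the following:
$$a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6} \xrightarrow{\ \varepsilon\ } a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$$
$$a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6} \xrightarrow{\ \alpha\ } a - b\sqrt{2} + c\sqrt{3} - d\sqrt{6}$$
$$a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6} \xrightarrow{\ \beta\ } a + b\sqrt{2} - c\sqrt{3} - d\sqrt{6}$$
$$a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6} \xrightarrow{\ \gamma\ } a - b\sqrt{2} - c\sqrt{3} + d\sqrt{6}$$
If $K$ is an extension of $F$, the automorphisms of $K$ which fix $F$ form a group. (The operation, of course, is composition.)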
This is perfectly obvious: for if $g$ and $h$ fix $F$, then for every $x$ in $F$,
$$x \xrightarrow{\ h\ } x \quad \text{and} \quad x \xrightarrow{\ g\ } x \quad \text{so} \quad x \xrightarrow{\ g \circ h\ } x$$
that is, $g \circ h$ fixes $F$. Furthermore, if
$$x \xrightarrow{\ h\ } x \quad \text{then} \quad x \xrightarrow{\ h^{-1}\ } x$$
that is, if $h$ fixes $F$ so does $h^{-1}$.

This fact is perfectly obvious, but nonetheless of great importance, for it means that we can now use all of our accumulated knowledge about groups to help us analyze the solutions of polynomial equations. And that is precisely what Galois theory is all about.

If $K$ is the root field of a polynomial $a(x)$ in $F[x]$, the group of all the automorphisms of $K$ which fix $F$ is called the Galois group of $a(x)$. We also call it the Galois group of $K$ over $F$, and designate it by the symbol
$$Gal(K : F)$$
In our last example we saw that there are four automorphisms of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ which fix $\mathbb{Q}$. We called them $\varepsilon$, $\alpha$, $\beta$, and $\gamma$. Thus, the Galois group of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ over $\mathbb{Q}$ is $Gal(\mathbb{Q}(\sqrt{2}, \sqrt{3}) : \mathbb{Q}) = \{\varepsilon, \alpha, \beta, \gamma\}$; the operation is composition, giving us the table
$$\begin{array}{c|cccc} \circ & \varepsilon & \alpha & \beta & \gamma \\ \hline \varepsilon & \varepsilon & \alpha & \beta & \gamma \\ \alpha & \alpha & \varepsilon & \gamma & \beta \\ \beta & \beta & \gamma & \varepsilon & \alpha \\ \gamma & \gamma & \beta & \alpha & \varepsilon \end{array}$$
As one can see, this is an abelian group in which every element is its own inverse; almost at a glance one can verify that it is isomorphic to $\mathbb{Z}_2 \times \mathbb{Z}_2$.

Let $K$ be the root field of $a(x)$, where $a(x)$ is in $F[x]$. In our earlier discussion we saw that every automorphism of $K$ fixing $F$ [that is, every member of the Galois group of $a(x)$] may be identified with a permutation of the roots of $a(x)$. However, it is important to note that not every permutation of the roots of $a(x)$ need be in the Galois group of $a(x)$, even when $a(x)$ is irreducible. For example, we saw that $\mathbb{Q}(\sqrt{2}, \sqrt{3}) = \mathbb{Q}(\sqrt{2} + \sqrt{3})$, where $\sqrt{2} + \sqrt{3}$ is a root of the irreducible polynomial $x^4 - 10x^2 + 1$ over $\mathbb{Q}$. Since $x^4 - 10x^2 + 1$ has four roots, there are $4! = 24$ permutations of its roots, only four of which are in its Galois group. This is because only four of the permutations are genuine symmetries of $x^4 - 10x^2 + 1$, in the sense that they determine automorphisms of the root field.
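The example above can be checked mechanically. The following is only a sketch under stated assumptions: an element a + b√2 + c√3 + d√6 is stored as the tuple (a, b, c, d), and the names `mul`, `apply_aut`, `compose`, `AUTS`, and `TABLE` are illustrative choices, not notation from the text. It verifies that each of the four sign-changing maps respects multiplication on a few sample elements, rebuilds the composition table, and confirms that √2 + √3 satisfies x^4 - 10x^2 + 1 = 0.

```python
from fractions import Fraction

def mul(u, v):
    """Multiply two elements of Q(sqrt2, sqrt3) given as coefficient tuples."""
    a, b, c, d = u
    e, f, g, h = v
    return (
        a*e + 2*b*f + 3*c*g + 6*d*h,   # rational part
        a*f + b*e + 3*c*h + 3*d*g,     # coefficient of sqrt2
        a*g + c*e + 2*b*h + 2*d*f,     # coefficient of sqrt3
        a*h + d*e + b*g + c*f,         # coefficient of sqrt6
    )

# Each automorphism is described by the signs it puts on sqrt2 and sqrt3.
AUTS = {
    "epsilon": (1, 1),    # identity
    "alpha":   (-1, 1),   # sqrt2 -> -sqrt2
    "beta":    (1, -1),   # sqrt3 -> -sqrt3
    "gamma":   (-1, -1),  # both signs change
}

def apply_aut(name, u):
    """Apply the named automorphism; sqrt6 = sqrt2*sqrt3 picks up both signs."""
    s2, s3 = AUTS[name]
    a, b, c, d = u
    return (a, s2 * b, s3 * c, s2 * s3 * d)

# Spot-check that every map preserves products on some sample elements.
samples = [(Fraction(1, 2), 3, -1, 2), (0, 1, 1, 0), (Fraction(-2, 3), 0, 5, 1)]
for name in AUTS:
    for u in samples:
        for v in samples:
            assert apply_aut(name, mul(u, v)) == mul(apply_aut(name, u), apply_aut(name, v))

def compose(g, h):
    """Name of the automorphism x -> g(h(x)), identified by its effect on a probe."""
    probe = (1, 1, 1, 1)  # all coefficients nonzero, so the four images differ
    image = apply_aut(g, apply_aut(h, probe))
    return next(k for k in AUTS if apply_aut(k, probe) == image)

TABLE = {g: {h: compose(g, h) for h in AUTS} for g in AUTS}

# sqrt2 + sqrt3 is a root of x^4 - 10x^2 + 1.
c = (0, 1, 1, 0)
c2 = mul(c, c)
c4 = mul(c2, c2)
min_poly_value = tuple(x4 - 10 * x2 + k for x4, x2, k in zip(c4, c2, (1, 0, 0, 0)))

for g in AUTS:
    print(g, [TABLE[g][h] for h in AUTS])
```

The sample-based multiplicativity check is a sanity test, not a proof; the proof that these maps are automorphisms is the one given in the text.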
In the discussion throughout the remainder of this chapter, let $F$ and $K$ remain fixed. $F$ is an arbitrary field and $K$ is the root field of some polynomial $a(x)$ in $F[x]$. The thread of our reasoning will lead us to speak about fields $I$ where $F \subseteq I \subseteq K$, that is, fields "between" $F$ and $K$. We will refer to them as intermediate fields. Since $K$ is the root field of $a(x)$ over $F$, it is also the root field of $a(x)$ over $I$ for every intermediate field $I$.

The letter $G$ will denote the Galois group of $K$ over $F$. With each intermediate field $I$, we associate the group
$$I^* = Gal(K : I)$$
that is, the group of all the automorphisms of $K$ which fix $I$. It is obviously a subgroup of $G$. We will call $I^*$ the fixer of $I$.

Conversely, with each subgroup $H$ of $G$ we associate the subfield of $K$ containing all the $a$ in $K$ which are not changed by any $\pi \in H$. That is,
$$\{a \in K : \pi(a) = a \text{ for every } \pi \in H\}$$
One verifies in a trice that this is a subfield of $K$. It obviously contains $F$, and is therefore one of the intermediate fields. It is called the fixed field of $H$. For brevity and euphony we call it the fixfield of $H$.

Let us recapitulate: Every subgroup $H$ of $G$ fixes an intermediate field $I$, called the fixfield of $H$. Every intermediate field $I$ is fixed by a subgroup $H$ of $G$, called the fixer of $I$. This suggests very strongly that there is a one-to-one correspondence between the subgroups of $G$ and the fields intermediate between $F$ and $K$. Indeed, this is correct. This one-to-one correspondence is at the very heart of Galois theory, because it provides the tie-in between properties of field extensions and properties of subgroups.

Just as, in Chapter 29, we were able to use vector algebra to prove new things about field extensions, now we will be able to use group theory to explore field extensions. The vector-space connection was a relative lightweight. The connection with group theory, on the other hand, gives us a tool of tremendous power to study field extensions.
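To make the notion of fixfield concrete, here is a small sketch for the earlier example of the field Q(√2, √3) over Q. It assumes the basis 1, √2, √3, √6 and uses the fact that each of the four automorphisms multiplies every basis element by +1 or -1; the names `BASIS`, `SIGNS`, `SUBGROUPS`, and `fixfield_basis` are hypothetical, introduced only for illustration. Because the basis is linearly independent over Q, an element is left unchanged by an automorphism exactly when its coefficients on the sign-flipped basis elements are zero, so the fixfield of a subgroup is spanned by the basis elements that every member of the subgroup leaves alone.

```python
# Basis of Q(sqrt2, sqrt3) over Q, in a fixed order.
BASIS = ("1", "sqrt2", "sqrt3", "sqrt6")

# Sign that each automorphism puts on each basis element.
SIGNS = {
    "epsilon": (1,  1,  1,  1),   # identity
    "alpha":   (1, -1,  1, -1),   # sqrt2 -> -sqrt2
    "beta":    (1,  1, -1, -1),   # sqrt3 -> -sqrt3
    "gamma":   (1, -1, -1,  1),   # both sqrt2 and sqrt3 change sign
}

# The five subgroups of the four-element Galois group.
SUBGROUPS = [
    ("epsilon",),
    ("epsilon", "alpha"),
    ("epsilon", "beta"),
    ("epsilon", "gamma"),
    ("epsilon", "alpha", "beta", "gamma"),
]

def fixfield_basis(subgroup):
    """Basis elements left unchanged by every automorphism in the subgroup."""
    return tuple(
        b for i, b in enumerate(BASIS)
        if all(SIGNS[h][i] == 1 for h in subgroup)
    )

for H in SUBGROUPS:
    print(H, "fixes the span of", fixfield_basis(H))
```

In this example the subgroup generated by alpha fixes the span of 1 and √3, the one generated by beta fixes the span of 1 and √2, and the one generated by gamma fixes the span of 1 and √6; the larger the subgroup, the smaller the field it fixes.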
We have not yet proved that the connection between subgroups of $G$ and intermediate fields is a one-to-one correspondence. The next two theorems will do that.

Theorem 2 If $H$ is the fixer of $I$, then $I$ is the fixfield of $H$.

Proof: Let $H$ be the fixer of $I$, and $I'$ be the fixfield of $H$. It follows from the definitions of fixer and fixfield that $I \subseteq I'$, so we must now show that $I' \subseteq I$. We will do this by proving that $a \notin I$ implies $a \notin I'$. Well, if $a$ is an element of $K$ which is not in $I$, the minimum polynomial $p(x)$ of $a$ over $I$ must have degree $\ge 2$ (for otherwise, $a \in I$). Thus, $p(x)$ has another root $b$. By Rule (ii) given at the beginning of this chapter, there is an automorphism of $K$ fixing $I$ and sending $a$ to $b$. This automorphism moves $a$, so $a \notin I'$. ■

Lemma Let $H$ be a subgroup of $G$, and $I$ the fixfield of $H$. The number of elements in $H$ is equal to $[K : I]$.

Proof: Let $H$ have $r$ elements, namely, $h_1, \ldots, h_r$. Let $K = I(a)$. Much of our proof will revolve around the following polynomial:

$$b(x) = [x - h_1(a)][x - h_2(a)] \cdots [x - h_r(a)]$$

Since one of the $h_i$ is the identity function, one factor of $b(x)$ is $(x - a)$, and therefore $a$ is a root of $b(x)$. In the next paragraph we will see that all the coefficients of $b(x)$ are in $I$, so $b(x) \in I[x]$. It follows that $b(x)$ is a multiple of the minimum polynomial of $a$ over $I$, whose degree is exactly $[K : I]$. Since $b(x)$ is of degree $r$, this means that $r \ge [K : I]$, which is half our theorem.

Well, let us show that all the coefficients of $b(x)$ are in $I$. We saw on page 314 that every isomorphism $h_i : K \to K$ can be extended to an isomorphism $\bar{h}_i : K[x] \to K[x]$. Because $\bar{h}_i$ is an isomorphism of polynomials, we get

$$\bar{h}_i(b(x)) = \bar{h}_i(x - h_1(a))\,\bar{h}_i(x - h_2(a)) \cdots \bar{h}_i(x - h_r(a)) = (x - h_i \circ h_1(a)) \cdots (x - h_i \circ h_r(a))$$

But $h_i \circ h_1, h_i \circ h_2, \ldots, h_i \circ h_r$ are $r$ distinct elements of $H$, and $H$ has exactly $r$ elements, so they are all the elements of $H$ (that is, they are $h_1, \ldots, h_r$, possibly in a different order).
So the factors of $\bar{h}_i(b(x))$ are the same as the factors of $b(x)$, merely in a different order, and therefore $\bar{h}_i(b(x)) = b(x)$. Since equal polynomials have equal coefficients, $h_i$ leaves the coefficients of $b(x)$ invariant. Thus, every coefficient of $b(x)$ is in the fixfield of $H$, that is, in $I$.

We have just shown that $[K : I] \le r$. For the opposite inequality, remember that by Theorem 1, $[K : I]$ is equal to the number of $I$-fixing automorphisms of $K$. But there are at least $r$ such automorphisms, namely $h_1, \ldots, h_r$. Thus, $[K : I] \ge r$, and we are done. ■

Theorem 3 If $I$ is the fixfield of $H$, then $H$ is the fixer of $I$.

Proof: Let $I$ be the fixfield of $H$, and $I^*$ the fixer of $I$. It follows from the definitions of fixer and fixfield that $H \subseteq I^*$. We will prove equality by showing that there are as many elements in $H$ as in $I^*$. By the lemma, the order of $H$ is equal to $[K : I]$. By Theorem 2, $I$ is the fixfield of $I^*$, so by the lemma again, the order of $I^*$ is also equal to $[K : I]$. ■

It follows immediately from Theorems 2 and 3 that there is a one-to-one correspondence between the subgroups of $\mathrm{Gal}(K : F)$ and the intermediate fields between $K$ and $F$. This correspondence, which matches every subgroup with its fixfield (or, equivalently, matches every intermediate field with its fixer) is called a Galois correspondence. It is worth observing that larger subfields correspond to smaller subgroups; that is,

$$I_1 \subseteq I_2 \quad \text{iff} \quad I_2^* \subseteq I_1^*$$

As an example, we have seen that the Galois group of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ over $\mathbb{Q}$ is $G = \{\varepsilon, \alpha, \beta, \gamma\}$ with the table given on page 325. This group has exactly five subgroups, namely, $\{\varepsilon\}$, $\{\varepsilon, \alpha\}$, $\{\varepsilon, \beta\}$, $\{\varepsilon, \gamma\}$, and the whole group $G$. They may be represented in the "inclusion diagram":

[Inclusion diagram: $\{\varepsilon\}$ at the bottom; $\{\varepsilon, \alpha\}$, $\{\varepsilon, \beta\}$, $\{\varepsilon, \gamma\}$ in the middle; $G$ at the top.]

In order to effectively tie in subgroups of $G$ with extensions of the field $F$, we need one more fact, to be presented next.

Suppose $E \subseteq I \subseteq K$, where $K$ is a root field over $E$ and $I$ is a root field over $E$. (Hence by Theorem 8 of Chapter 31, $K$ is a root field over $I$.)
If $h \in \mathrm{Gal}(K : E)$, $h$ is an automorphism of $K$ fixing $E$. Consider the restriction of $h$ to $I$, that is, $h$ restricted to the smaller domain $I$. It is an isomorphism with domain $I$ fixing $E$, so by Rule (i) given at the beginning of this chapter, it is an automorphism of $I$, still fixing $E$. We have just shown that if $h \in \mathrm{Gal}(K : E)$, then the restriction of $h$ to $I$ is in $\mathrm{Gal}(I : E)$. This permits us to define a function $\mu : \mathrm{Gal}(K : E) \to \mathrm{Gal}(I : E)$ by the rule

$$\mu(h) = \text{the restriction of } h \text{ to } I$$

It is very easy to check that $\mu$ is a homomorphism. $\mu$ is surjective, because every $E$-fixing automorphism of $I$ can be extended to an $E$-fixing automorphism of $K$, by Theorem 6 in Chapter 31.

Finally, if $h \in \mathrm{Gal}(K : E)$, the restriction of $h$ to $I$ is the identity function iff $h(x) = x$ for every $x \in I$, that is, iff $h$ fixes $I$. This proves that the kernel of $\mu$ is $\mathrm{Gal}(K : I)$.

To recapitulate: $\mu$ is a homomorphism from $\mathrm{Gal}(K : E)$ onto $\mathrm{Gal}(I : E)$ with kernel $\mathrm{Gal}(K : I)$. By the FHT, we immediately conclude as follows:

Theorem 4 Suppose $E \subseteq I \subseteq K$, where $I$ is a root field over $E$ and $K$ is a root field over $E$. Then

$$\mathrm{Gal}(I : E) \cong \frac{\mathrm{Gal}(K : E)}{\mathrm{Gal}(K : I)}$$

It follows, in particular, that $\mathrm{Gal}(K : I)$ is a normal subgroup of $\mathrm{Gal}(K : E)$.

EXERCISES

† A. Computing a Galois Group
1 Show that $\mathbb{Q}(i, \sqrt{2})$ is the root field of $(x^2 + 1)(x^2 - 2)$ over $\mathbb{Q}$.
# 2 Find the degree of $\mathbb{Q}(i, \sqrt{2})$ over $\mathbb{Q}$.
3 List the elements of $\mathrm{Gal}(\mathbb{Q}(i, \sqrt{2}) : \mathbb{Q})$ and exhibit its table.
4 Write the inclusion diagram for the subgroups of $\mathrm{Gal}(\mathbb{Q}(i, \sqrt{2}) : \mathbb{Q})$, and the inclusion diagram for the fields intermediate between $\mathbb{Q}$ and $\mathbb{Q}(i, \sqrt{2})$. Indicate the Galois correspondence.

† B. Computing a Galois Group of Eight Elements
1 Show that $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ is the root field of $(x^2 - 2)(x^2 - 3)(x^2 - 5)$ over $\mathbb{Q}$.
2 Show that the degree of $\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5})$ over $\mathbb{Q}$ is 8.
3 List the eight elements of $G = \mathrm{Gal}(\mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5}) : \mathbb{Q})$ and write its table.
4 List the subgroups of $G$. (By Lagrange's theorem, any proper subgroup of $G$ has either two or four elements.)
5 For each subgroup of $G$, find its fixfield.
6 Indicate the Galois correspondence by means of a diagram like the one on page 329.

† C. A Galois Group Equal to $S_3$
1 Show that $\mathbb{Q}(\sqrt[3]{2}, i\sqrt{3})$ is the root field of $x^3 - 2$ over $\mathbb{Q}$, where $\sqrt[3]{2}$ designates the real cube root of 2. (Hint: Compute the complex cube roots of unity.)
2 Show that $[\mathbb{Q}(\sqrt[3]{2}) : \mathbb{Q}] = 3$.
3 Explain why $x^2 + 3$ is irreducible over $\mathbb{Q}(\sqrt[3]{2})$, then show that $[\mathbb{Q}(\sqrt[3]{2}, i\sqrt{3}) : \mathbb{Q}(\sqrt[3]{2})] = 2$. Conclude that $[\mathbb{Q}(\sqrt[3]{2}, i\sqrt{3}) : \mathbb{Q}] = 6$.
4 Use part 3 to explain why $\mathrm{Gal}(\mathbb{Q}(\sqrt[3]{2}, i\sqrt{3}) : \mathbb{Q})$ has six elements. Then use the discussion following Rule (ii) on page 323 to explain why every element of $\mathrm{Gal}(\mathbb{Q}(\sqrt[3]{2}, i\sqrt{3}) : \mathbb{Q})$ may be identified with a permutation of the three cube roots of 2.
5 Use part 4 to prove that $\mathrm{Gal}(\mathbb{Q}(\sqrt[3]{2}, i\sqrt{3}) : \mathbb{Q}) \cong S_3$.

† D. A Galois Group Equal to $D_4$
If $\alpha = \sqrt[4]{2}$ is a real fourth root of 2, then the four fourth roots of 2 are $\pm\alpha$ and $\pm i\alpha$. Explain parts 1-6, briefly but carefully:
# 1 $\mathbb{Q}(\alpha, i)$ is the root field of $x^4 - 2$ over $\mathbb{Q}$.
2 $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 4$.
3 $i \notin \mathbb{Q}(\alpha)$; hence $[\mathbb{Q}(\alpha, i) : \mathbb{Q}(\alpha)] = 2$.
4 $[\mathbb{Q}(\alpha, i) : \mathbb{Q}] = 8$.
5 $\{1, \alpha, \alpha^2, \alpha^3, i, i\alpha, i\alpha^2, i\alpha^3\}$ is a basis for $\mathbb{Q}(\alpha, i)$ over $\mathbb{Q}$.
6 Any $\mathbb{Q}$-fixing automorphism $h$ of $\mathbb{Q}(\alpha, i)$ is determined by its effect on the elements in the basis. These, in turn, are determined by $h(\alpha)$ and $h(i)$.
7 Explain: $h(\alpha)$ must be a fourth root of 2 and $h(i)$ must be equal to $\pm i$. Combining the four possibilities for $h(\alpha)$ with the two possibilities for $h(i)$ gives eight possible automorphisms. List them in the format [...]
8 Compute the table of the group $\mathrm{Gal}(\mathbb{Q}(\alpha, i) : \mathbb{Q})$ and show that it is isomorphic to $D_4$, the group of symmetries of the square.

† E. A Cyclic Galois Group
# 1 Describe the root field $K$ of $x^7 - 1$ over $\mathbb{Q}$. Explain why $[K : \mathbb{Q}] = 6$.
2 Explain: If $\alpha$ is a primitive seventh root of unity, any $h \in \mathrm{Gal}(K : \mathbb{Q})$ must send $\alpha$ to a seventh root of unity. In fact, $h$ is determined by $h(\alpha)$.
3 Use part 2 to list explicitly the six elements of $\mathrm{Gal}(K : \mathbb{Q})$.
Then write the table of $\mathrm{Gal}(K : \mathbb{Q})$ and show that it is cyclic.
4 List all the subgroups of $\mathrm{Gal}(K : \mathbb{Q})$, with their fixfields. Exhibit the Galois correspondence.
5 Describe the root field $L$ of $x^6 - 1$ over $\mathbb{Q}$, and show that $[L : \mathbb{Q}] = 2$. Explain why it follows that there are no intermediate fields between $\mathbb{Q}$ and $L$ (except for $\mathbb{Q}$ and $L$ themselves).
# 6 Let $L$ be the root field of $x^6 - 2$ over $\mathbb{Q}$. List the elements of $\mathrm{Gal}(L : \mathbb{Q})$ and write its table.

† F. A Galois Group Isomorphic to $S_5$
Let $a(x) = x^5 - 4x^4 + 2x + 2 \in \mathbb{Q}[x]$, and let $r_1, \ldots, r_5$ be the roots of $a(x)$ in $\mathbb{C}$. Let $K = \mathbb{Q}(r_1, \ldots, r_5)$ be the root field of $a(x)$ over $\mathbb{Q}$. Prove parts 1-3:
1 $a(x)$ is irreducible in $\mathbb{Q}[x]$.
2 $a(x)$ has three real and two complex roots. [Hint: Use calculus to sketch the graph of $y = a(x)$, and show that it crosses the $x$ axis three times.]
3 If $r_1$ denotes a real root of $a(x)$, $[\mathbb{Q}(r_1) : \mathbb{Q}] = 5$. Use this to prove that $[K : \mathbb{Q}]$ is a multiple of 5.
4 Use part 3 and Cauchy's theorem (Chapter 13, Exercise E) to prove that there is an element $\alpha$ of order 5 in $\mathrm{Gal}(K : \mathbb{Q})$. Since $\alpha$ may be identified with a permutation of $\{r_1, \ldots, r_5\}$, explain why it must be a cycle of length 5. (Hint: Any product of disjoint cycles on $\{r_1, \ldots, r_5\}$ has order $\ne 5$.)
5 Explain why there is a transposition in $\mathrm{Gal}(K : \mathbb{Q})$. [It permutes the conjugate pair of complex roots of $a(x)$.]
6 Prove: Any subgroup of $S_5$ which contains a cycle of length 5 and a transposition must contain all possible transpositions in $S_5$, hence all of $S_5$. Thus, $\mathrm{Gal}(K : \mathbb{Q}) = S_5$.

G. Shorter Questions Relating to Automorphisms and Galois Groups
Let $F$ be a field, and $K$ a finite extension of $F$. Suppose $a, b \in K$. Prove parts 1-3:
1 If an automorphism $h$ of $K$ fixes $F$ and $a$, then $h$ fixes $F(a)$.
2 $F(a, b)^* = F(a)^* \cap F(b)^*$.
3 Aside from the identity function, there are no $\mathbb{Q}$-fixing automorphisms of $\mathbb{Q}(\sqrt[3]{2})$. [Hint: Note that $\mathbb{Q}(\sqrt[3]{2})$ contains only real numbers.]
4 Explain why the conclusion of part 3 does not contradict Theorem 1.
In the next three parts, let $\omega$ be a primitive $p$th root of unity, where $p$ is a prime.
5 Prove: If $h \in \mathrm{Gal}(\mathbb{Q}(\omega) : \mathbb{Q})$, then $h(\omega) = \omega^k$ for some $k$ where $1 \le k \le p - 1$.
6 Use part 5 to prove that $\mathrm{Gal}(\mathbb{Q}(\omega) : \mathbb{Q})$ is an abelian group.
7 Use part 5 to prove that $\mathrm{Gal}(\mathbb{Q}(\omega) : \mathbb{Q})$ is a cyclic group.

† H. The Group of Automorphisms of $\mathbb{C}$
1 Prove: The only automorphism of $\mathbb{Q}$ is the identity function. [Hint: If $h$ is an automorphism, $h(1) = 1$; hence $h(2) = 2$, and so on.]
2 Prove: Any automorphism of $\mathbb{R}$ sends squares of numbers to squares of numbers, hence positive numbers to positive numbers.
3 Using part 2, prove that if $h$ is any automorphism of $\mathbb{R}$, $a$ [...] $a - bi$ are the only automorphisms of $\mathbb{C}$ which fix $\mathbb{R}$.

I. Further Questions Relating to Galois Groups
Throughout this set of questions, let $K$ be a root field over $F$, let $G = \mathrm{Gal}(K : F)$, and let $I$ be any intermediate field. Prove the following:
1 $I^* = \mathrm{Gal}(K : I)$ is a subgroup of $G$.
2 If $H$ is a subgroup of $G$ and $H^\circ = \{a \in K : \pi(a) = a \text{ for every } \pi \in H\}$, then $H^\circ$ is a subfield of $K$, and $F \subseteq H^\circ$.
3 Let $H$ be the fixer of $I$, and $I'$ the fixfield of $H$. Then $I \subseteq I'$. Let $I$ be the fixfield of $H$, and $I^*$ the fixer of $I$. Then $H \subseteq I^*$.
# 4 Let $I$ be a normal extension of $F$ (that is, a root field of some polynomial over $F$). If $G$ is abelian, then $\mathrm{Gal}(K : I)$ and $\mathrm{Gal}(I : F)$ are abelian. (Hint: Use Theorem 4.)
5 Let $I$ be a normal extension of $F$. If $G$ is a cyclic group, then $\mathrm{Gal}(K : I)$ and $\mathrm{Gal}(I : F)$ are cyclic groups.
6 If $G$ is a cyclic group, there exists exactly one intermediate field $I$ of degree $k$, for each integer $k$ dividing $[K : F]$.

† J. Normal Extensions and Normal Subgroups
Suppose $F \subseteq K$, where $K$ is a normal extension of $F$. (This means simply that $K$ is the root field of some polynomial in $F[x]$: see Chapter 31, Exercise K.) Let $I_1 \subseteq I_2$ be intermediate fields.
1 Deduce from Theorem 4 that, if $I_2$ is a normal extension of $I_1$, then $I_2^*$ is a normal subgroup of $I_1^*$.
2 Prove the following for any intermediate field $I$: Let $h \in \mathrm{Gal}(K : F)$, $g \in I^*$, $a \in I$, and $b = h(a)$. Then $[h \circ g \circ h^{-1}](b) = b$. Conclude that $hI^*h^{-1}$ [...]

CHAPTER THIRTY-THREE SOLVING EQUATIONS BY RADICALS

[...] 4 shown to be futile, but a criterion is made available to test any equation and determine if it has solutions given by a radical formula. All this will be made clear in the following pages.

Every quadratic equation $ax^2 + bx + c = 0$ has its roots given by the formula

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

Equations of degree 3 and 4 can be solved by similar formulas. For example, the cubic equation $x^3 + ax + b = 0$ has a solution given by

$$x = \sqrt[3]{-\frac{b}{2} + \sqrt{D}} + \sqrt[3]{-\frac{b}{2} - \sqrt{D}} \qquad \text{where } D = \frac{a^3}{27} + \frac{b^2}{4} \tag{1}$$

Such expressions are built up from the coefficients of the given polynomials by repeated addition, subtraction, multiplication, division, and taking roots. Because of their use of radicals, they are called radical expressions or radical formulas. A polynomial $a(x)$ is solvable by radicals if there is a radical expression giving its roots in terms of its coefficients.

Let us return to the example of $x^3 + ax + b = 0$, where $a$ and $b$ are rational, and look again at Formula (1). We may interpret this formula to assert that if we start with the field of coefficients $\mathbb{Q}$, adjoin the square root $\sqrt{D}$, then adjoin the cube roots $\sqrt[3]{-b/2 \pm \sqrt{D}}$, we reach a field in which $x^3 + ax + b = 0$ has its roots.

In general, to say that the roots of $a(x)$ are given by a radical expression is the same as saying that we can extend the field of coefficients of $a(x)$ by successively adjoining $n$th roots (for various $n$), and in this way obtain a field which contains the roots of $a(x)$. We will express this notion formally now, in the language of field theory.

$F(c_1, \ldots, c_n)$ is called a radical extension of $F$ if, for each $i$, some power of $c_i$ is in $F(c_1, \ldots, c_{i-1})$. In other words, $F(c_1, \ldots, c_n)$ is an iterated extension of $F$ obtained by successively adjoining $n$th roots, for various $n$.
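Formula (1) for the cubic can be tried numerically. The sketch below reads it as $x = \sqrt[3]{-b/2 + \sqrt{D}} + \sqrt[3]{-b/2 - \sqrt{D}}$ with $D = a^3/27 + b^2/4$, and it assumes $D \ge 0$ so that every radical is a real number; the case $D < 0$ needs complex cube roots and is deliberately left out. The function names `real_cbrt` and `cardano_root` are hypothetical.

```python
import math

def real_cbrt(t):
    """Real cube root, valid for negative inputs as well."""
    return math.copysign(abs(t) ** (1.0 / 3.0), t)

def cardano_root(a, b):
    """One real root of x^3 + a*x + b = 0 via Formula (1); assumes D >= 0."""
    D = a ** 3 / 27 + b ** 2 / 4
    if D < 0:
        raise ValueError("D < 0: the square root is not real; not handled in this sketch")
    sqrt_D = math.sqrt(D)                       # first adjoin the square root of D
    return real_cbrt(-b / 2 + sqrt_D) + real_cbrt(-b / 2 - sqrt_D)  # then the cube roots

a, b = 3, -4               # x^3 + 3x - 4 = 0, which has the root x = 1
x = cardano_root(a, b)
print(x, x ** 3 + a * x + b)   # the root, and the residual of the equation at that root
```

Here $D = 5$, so the computation passes through $\sqrt{5}$ and then the cube roots of $2 \pm \sqrt{5}$, which is exactly the two-step tower of adjunctions described in the text, even though the final root happens to be rational.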
We say that a polynomial $a(x)$ in $F[x]$ is solvable by radicals if there is a radical extension of $F$ containing all the roots of $a(x)$, that is, containing the root field of $a(x)$.

To deal effectively with $n$th roots we must know a little about them. To begin with, the $n$th roots of 1, called $n$th roots of unity, are, of course, the solutions of $x^n - 1 = 0$. Thus, for each $n$, there are exactly $n$ $n$th roots of unity. As we shall see, everything we need to know about roots will follow from properties of the roots of unity.

In $\mathbb{C}$ the $n$th roots of unity are obtained by de Moivre's theorem. They consist of a number $\omega$ and its first $n$ powers: $1 = \omega^0, \omega, \omega^2, \ldots, \omega^{n-1}$. We will not review de Moivre's theorem here because, remarkably, the main facts about roots of unity are true in every field of characteristic zero. Everything we need to know emerges from the following theorem:

Theorem 1 Any finite group of nonzero elements in a field is a cyclic group. (The operation in the group is the field's multiplication.)

Proof: If $F^*$ denotes the set of nonzero elements of $F$, suppose that $G \subseteq F^*$, and that $G$, with the field's "multiply" operation, is a group of $n$ elements. We will compare $G$ with $\mathbb{Z}_n$ and show that $G$, like $\mathbb{Z}_n$, has an element of order $n$ and is therefore cyclic.

For any integer $k$, let $g(k)$ be the number of elements of order $k$ in $G$, and let $z(k)$ be the number of elements of order $k$ in $\mathbb{Z}_n$. For every positive integer $k$ which is a factor of $n$, the equation $x^k = 1$ has at most $k$ solutions in $F$; thus,

(*) $G$ contains at most $k$ elements whose order is a factor of $k$.

If $G$ has an element $a$ of order $k$, then $\langle a \rangle = \{e, a, a^2, \ldots, a^{k-1}\}$ are all the distinct elements of $G$ whose order is a factor of $k$. [By (*), there cannot be any others.] In $\mathbb{Z}_n$, the subgroup $\langle n/k \rangle$ contains all the elements of $\mathbb{Z}_n$ whose order is a factor of $k$.
Since $\langle a \rangle$ and $\langle n/k \rangle$ are cyclic groups with the same number of elements, they are isomorphic; thus, the number of elements of order $k$ in $\langle a \rangle$ is the same as the number of elements of order $k$ in $\langle n/k \rangle$. Thus, $g(k) = z(k)$.

Let us recapitulate: if $G$ has an element of order $k$, then $g(k) = z(k)$; but if $G$ has no such elements, then $g(k) = 0$. Thus, for each positive integer $k$ which is a factor of $n$, the number of elements of order $k$ in $G$ is less than (or equal to) the number of elements of order $k$ in $\mathbb{Z}_n$.

Now, every element of $G$ (as well as every element of $\mathbb{Z}_n$) has a well-defined order, which is a divisor of $n$. Imagine the elements of both groups to be partitioned into classes according to their order, and compare the classes in $G$ with the corresponding classes in $\mathbb{Z}_n$. For each $k$, $G$ has as many or fewer elements of order $k$ than $\mathbb{Z}_n$ does. So if $G$ had no elements of order $n$ (while $\mathbb{Z}_n$ does have one), this would mean that $G$ has fewer elements than $\mathbb{Z}_n$, which is false. Thus, $G$ must have an element of order $n$, and therefore $G$ is cyclic. ■

The $n$th roots of unity (which are contained in $F$ or a suitable extension of $F$) obviously form a group with respect to multiplication. By Theorem 1, it is a cyclic group. Any generator of this group is called a primitive $n$th root of unity. Thus, if $\omega$ is a primitive $n$th root of unity, the set of all the $n$th roots of unity is

$$1, \omega, \omega^2, \ldots, \omega^{n-1}$$

If $\omega$ is a primitive $n$th root of unity, $F(\omega)$ is an abelian extension of $F$ in the sense that $g \circ h = h \circ g$ for any two $F$-fixing automorphisms $g$ and $h$ of $F(\omega)$. Indeed, any automorphism must obviously send $n$th roots of unity to $n$th roots of unity. So if $g(\omega) = \omega^r$ and $h(\omega)$ [...] $\omega^{n-1}$. Indeed, if $c$ is any other $n$th root of $a$, then clearly $c/b$ is an $n$th root of 1, say $\omega^r$; hence $c = b\omega^r$.

We may infer from the above that if $F$ contains a primitive $n$th root of unity, and $b$ is an $n$th root of $a$, then $F(b)$ is the root field of $x^n - a$ over $F$. In particular, $F(b)$ is an abelian extension of $F$.
Indeed, any $F$-fixing automorphism of $F(b)$ must send $n$th roots of $a$ to $n$th roots of $a$: for if $c$ is any $n$th root of $a$ and $g$ is an $F$-fixing automorphism, then $g(c)^n = g(c^n) = g(a) = a$; hence $g(c)$ is an $n$th root of $a$. So if $g(b) = b\omega^r$ and $h(b) = b\omega^s$, then

$$g \circ h(b) = g(b\omega^s) = b\omega^r\omega^s = b\omega^{r+s}$$

and

$$h \circ g(b) = h(b\omega^r) = b\omega^s\omega^r = b\omega^{r+s}$$

hence $g \circ h(b) = h \circ g(b)$. Since $g$ and $h$ fix $F$, and every element in $F(b)$ is a linear combination of powers of $b$ with coefficients in $F$, it follows that $g \circ h = h \circ g$.

If $a(x)$ is in $F[x]$, remember that $a(x)$ is solvable by radicals just as long as there exists some radical extension of $F$ containing the roots of $a(x)$. [Any radical extension of $F$ containing the roots of $a(x)$ will do.] Thus, we may as well assume that any radical extension used here begins by adjoining to $F$ the appropriate roots of unity; henceforth we will make this assumption. Thus, if $K = F(c_1, \ldots, c_n)$ is a radical extension of $F$, then

$$F \subseteq F(c_1) \subseteq F(c_1, c_2) \subseteq \cdots \subseteq F(c_1, \ldots, c_n) \tag{2}$$

is a sequence of simple abelian extensions. (The extensions are all abelian by the comments in the preceding three paragraphs.)

Still, this is not quite enough for our purposes: In order to use the machinery which was set up in the previous chapter, we must be able to say that each field in (2) is a root field over $F$. This may be accomplished as follows: Suppose we have already constructed the extensions $I_0 \subseteq I_1 \subseteq \cdots \subseteq I_q$ in (2) so that $I_q$ is a root field over $F$. We must extend $I_q$ to $I_{q+1}$, so that $I_{q+1}$ is a root field over $F$. Also, $I_{q+1}$ must include the element $c_{q+1}$, which is the $n$th root of some element $a \in I_q$. Let $H = \{h_1, \ldots, h_r\}$ be the group of all the $F$-fixing automorphisms of $I_q$, and consider the polynomial

$$b(x) = [x^n - h_1(a)][x^n - h_2(a)] \cdots [x^n - h_r(a)]$$

By the proof of the lemma on page 327, one factor of $b(x)$ is $(x^n - a)$; hence $c_{q+1}$ is a root of $b(x)$. Moreover, by the same lemma, every coefficient of $b(x)$ is in the fixfield of $H$, that is, in $F$.
We now define $I_{q+1}$ to be the root field of $b(x)$ over $F$. Since all the roots of $b(x)$ are $n$th roots of elements in $I_q$, it follows that $I_{q+1}$ is a radical extension of $I_q$. The roots may be adjoined one by one, yielding a succession of abelian extensions, as discussed previously. To conclude, we may assume in (2) that $K$ is a root field over $F$.

If $G$ denotes the Galois group of $K$ over $F$, each of these fields $I_k$ has a fixer which is a subgroup of $G$. These fixers form a sequence

$$\{\varepsilon\} = K^* = I_n^* \subseteq I_{n-1}^* \subseteq \cdots \subseteq I_1^* \subseteq I_0^* = F^* = G$$

For each $k$, by Theorem 4 of Chapter 32, $I_{k+1}^*$ is a normal subgroup of $I_k^*$ and $I_k^*/I_{k+1}^* \cong \mathrm{Gal}(I_{k+1} : I_k)$, which is abelian because $I_{k+1}$ is an abelian extension of $I_k$. The following definition was invented precisely to account for this situation.

A group $G$ is called solvable if it has a sequence of subgroups $\{e\} = H_0 \subseteq H_1 \subseteq \cdots \subseteq H_m = G$ such that for each $k$, $H_k$ is a normal subgroup of $H_{k+1}$ and $H_{k+1}/H_k$ is abelian.

We have shown that if $K$ is a radical extension of $F$, then $\mathrm{Gal}(K : F)$ is a solvable group. We wish to go further and prove that if $a(x)$ is any polynomial which is solvable by radicals, its Galois group is solvable. To do so, we must first prove that any homomorphic image of a solvable group is solvable. A key ingredient of our proof is the following simple fact, which was explained on page 152: $G/H$ is abelian iff $H$ contains all the products $xyx^{-1}y^{-1}$, for all $x$ and $y$ in $G$. (The products $xyx^{-1}y^{-1}$ are called "commutators" of $G$.)

Theorem 2 Any homomorphic image of a solvable group is a solvable group.

Proof: Let $G$ be a solvable group, with a sequence of subgroups

$$\{e\} = H_0 \subseteq H_1 \subseteq \cdots \subseteq H_m = G$$

as specified in the definition. Let $f : G \to X$ be a homomorphism from $G$ onto a group $X$. Then $f(H_0), f(H_1), \ldots, f(H_m)$ are subgroups of $X$, and clearly $\{e\} = f(H_0) \subseteq f(H_1) \subseteq \cdots \subseteq f(H_m) = X$. For each $i$ we have the following: if $f(a) \in f(H_i)$ and $f(x) \in f(H_{i+1})$, then we may take $a \in H_i$ and $x \in H_{i+1}$; hence $xax^{-1} \in H_i$ and therefore $f(x)f(a)f(x)^{-1} \in f(H_i)$. So $f(H_i)$ is a normal subgroup of $f(H_{i+1})$.
Finally, since $H_{i+1}/H_i$ is abelian, every commutator $xyx^{-1}y^{-1}$ (for all $x$ and $y$ in $H_{i+1}$) is in $H_i$; hence every $f(x)f(y)f(x)^{-1}f(y)^{-1}$ is in $f(H_i)$. Thus, $f(H_{i+1})/f(H_i)$ is abelian. ■

Now we can prove the main result of this chapter:

Theorem 3 Let $a(x)$ be a polynomial over a field $F$. If $a(x)$ is solvable by radicals, its Galois group is a solvable group.

Proof: By definition, if $K$ is the root field of $a(x)$, there is an extension by radicals $F(c_1, \ldots, c_n)$ such that $F \subseteq K \subseteq F(c_1, \ldots, c_n)$. It follows by Theorem 4 of Chapter 32 that $\mathrm{Gal}(F(c_1, \ldots, c_n) : F)/\mathrm{Gal}(F(c_1, \ldots, c_n) : K) \cong \mathrm{Gal}(K : F)$; hence by that theorem, $\mathrm{Gal}(K : F)$ is a homomorphic image of $\mathrm{Gal}(F(c_1, \ldots, c_n) : F)$, which we know to be solvable. Thus, by Theorem 2, $\mathrm{Gal}(K : F)$ is solvable. ■

Actually, the converse of Theorem 3 is true also. All we need to show is that, if $K$ is an extension of $F$ whose Galois group over $F$ is solvable, then $K$ may be further extended to a radical extension of $F$. The details are not too difficult and are assigned as Exercise E at the end of this chapter.

Theorem 3 together with its converse say that a polynomial $a(x)$ is solvable by radicals iff its Galois group is solvable.

We bring this chapter to a close by showing that there exist groups which are not solvable, and there exist polynomials having such groups as their Galois group. In other words, there are unsolvable polynomials. First, here is an unsolvable group:

Theorem 4 The symmetric group $S_5$ is not a solvable group.

Proof: Suppose $S_5$ has a sequence of subgroups

$$\{e\} = H_0 \subseteq H_1 \subseteq \cdots \subseteq H_m = S_5$$

as in the definition of solvable group. Consider the subset of $S_5$ containing all the cycles $(ijk)$ of length 3. We will show that if $H_i$ contains all the cycles of length 3, so does the next smaller group $H_{i-1}$. It would follow in $m$ steps that $H_0 = \{e\}$ [...]

[...] a primitive $p$th root of unity in the field $F$.
1 If $d$ is any root of $x^p - a \in F[x]$, show that $F(\omega, d)$ is a root field of $x^p - a$.
Suppose $x^p - a$ is not irreducible in $F[x]$.
2 Explain why $x^p - a$ factors in $F[x]$ as $x^p - a = p(x)f(x)$, where both factors have degree $\ge 2$.
# 3 If $\deg p(x) = m$, explain why the constant term of $p(x)$ (let us call it $b$) is equal to the product of $m$ $p$th roots of $a$. Conclude that $b = \omega^k d^m$ for some $k$.
4 Use part 3 to prove that $b^p = a^m$.
5 Explain why $m$ and $p$ are relatively prime. Explain why it follows that there are integers $s$ and $t$ such that $sm + tp = 1$.
6 Explain why $b^{sp} = a^{sm}$. Use this to show that $(b^s a^t)^p = a$.
7 Conclude: If $x^p - a$ is not irreducible in $F[x]$, it has a root (namely, $b^s a^t$) in $F$.
We have proved: $x^p - a$ either has a root in $F$ or is irreducible over $F$.

† D. Another Way of Defining Solvable Groups
Let $G$ be a group. The symbol $H$ [...] denote the union of all the cosets which are members of $\mathcal{J}$. If $\mathcal{J} \triangleleft G/K$, then $\hat{\mathcal{J}} \triangleleft G$. (Use part 2.)
4 If $K$ is a maximal normal subgroup of $G$, then $G/K$ has no nontrivial normal subgroups. (Use part 3.)
5 If an abelian group $G$ has no nontrivial subgroups, $G$ must be a cyclic group of prime order. (Otherwise, choose some $a \in G$ such that $\langle a \rangle$ is a proper subgroup of $G$.)
6 If $H$ [...] $K$).

24 $A \times (B - D) = (A \times B) - (A \times D)$.

APPENDIX B REVIEW OF THE INTEGERS

One of the most important facts about the integers is that any integer $m$ can be divided by any positive integer $n$ to yield a quotient $q$ and a nonnegative remainder $r$. (The remainder is less than the divisor $n$.) For example, 25 may be divided by 8 to give a quotient of 3 and a remainder of 1:

$$25 = 8 \times 3 + 1$$

This process is known as the division algorithm. It may be stated precisely as follows:

Theorem 1: Division algorithm If $m$ and $n$ are integers and $n$ is positive, there exist unique integers $q$ and $r$ such that

$$m = nq + r \quad \text{and} \quad 0 \le r < n$$

We call $q$ the quotient and $r$ the remainder when $m$ is divided by $n$.

Here we shall take the division algorithm as a postulate of the system of the integers. (In Chapter 21 we started with a more fundamental premise and proved the division algorithm from it.)
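The division algorithm can be illustrated with a few lines of code. This is only a sketch: it relies on Python's built-in `divmod`, which floors the quotient and therefore returns a remainder in the range $0 \le r < n$ whenever the divisor $n$ is positive, even when $m$ is negative. The wrapper name `divide` is hypothetical.

```python
def divide(m, n):
    """Return (q, r) with m == n*q + r and 0 <= r < n, for a positive divisor n."""
    if n <= 0:
        raise ValueError("the divisor n must be positive")
    q, r = divmod(m, n)   # floor division, so r is never negative when n > 0
    return q, r

for m in (25, -25, 7, 0):
    q, r = divide(m, 8)
    print(f"{m} = 8 * {q} + {r}")
```

The case $m = -25$ shows why the remainder is required to be nonnegative: the pair is $q = -4$, $r = 7$, not $q = -3$, $r = -1$, and that requirement is what makes $q$ and $r$ unique.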
If $r$ and $s$ are integers, we say that $s$ is a multiple of $r$ if there is an integer $k$ such that

$$s = rk$$

In this case, we also say that $r$ is a factor of $s$, or $r$ divides $s$, and we symbolize this relationship by writing

$$r \mid s$$

For example, 3 is a factor of 12, so we write: $3 \mid 12$. Some of the elementary properties of divisibility are stated in the next theorem.

Theorem 2: The following are true for all integers $a$, $b$, and $c$:
(i) If $a \mid b$ and $b \mid c$, then $a \mid c$.
(ii) $1 \mid a$.
(iii) $a \mid 0$.
(iv) If $c \mid a$ and $c \mid b$, then $c \mid (ax + by)$ for all integers $x$ and $y$.
(v) If $a \mid b$ and $c \mid d$, then $ac \mid bd$.

The proofs of these relationships follow from the definition of divisibility. For instance, we give here the proof of (iv): If $c \mid a$ and $c \mid b$, this means that $a = kc$ and $b = lc$ for some $k$ and $l$. Then

$$ax + by = kcx + lcy = c(kx + ly)$$

Visibly, $c$ is a factor of $c(kx + ly)$, and hence a factor of $ax + by$. In symbols, $c \mid (ax + by)$, and we are done.

An integer $t$ is called a common divisor of integers $r$ and $s$ if $t \mid r$ and $t \mid s$. A greatest common divisor of $r$ and $s$ is an integer $t$ such that
(i) $t \mid r$ and $t \mid s$, and
(ii) For any integer $u$, if $u \mid r$ and $u \mid s$, then $u \mid t$.

In other words, $t$ is a greatest common divisor of $r$ and $s$ if $t$ is a common divisor of $r$ and $s$ and every other common divisor of $r$ and $s$ divides $t$. It is an important fact that any two nonzero integers $r$ and $s$ always have a positive greatest common divisor:

Theorem 3: Any two nonzero integers $r$ and $s$ have a unique positive greatest common divisor $t$. Moreover, $t$ is equal to a "linear combination" of $r$ and $s$. That is,

$$t = kr + ls$$

for some integers $k$ and $l$.

The unique positive greatest common divisor of $r$ and $s$ is denoted by the symbol $\gcd(r, s)$.

A pair of integers $r$ and $s$ are said to be relatively prime if they have no common divisors except $\pm 1$. For example, 4 and 15 are relatively prime. If $r$ and $s$ are relatively prime, their gcd is equal to 1.
So by Theorem 3, there are integers $k$ and $l$ such that $kr + ls = 1$. Actually, the converse of this statement is true too, and is stated in the next theorem:

Theorem 4: Two integers $r$ and $s$ are relatively prime if and only if there are integers $k$ and $l$ such that $kr + ls = 1$.

The proof of this theorem is left as an exercise. From Theorem 4, we deduce the following:

Theorem 5 If $r$ and $s$ are relatively prime, and $r \mid st$, then $r \mid t$.

Proof: From Theorem 4 we know there are integers $k$ and $l$ such that $kr + ls = 1$. Multiplying through by $t$, we get

$$krt + lst = t \tag{1}$$

But we are given the fact that $r \mid st$; that is, $st$ is a multiple of $r$, say $st = mr$. Substitution into Equation (1) gives $krt + lmr = t$, that is, $r(kt + lm) = t$. This shows that $r$ is a factor of $t$, as required. ■

If an integer $m$ has factors other than $\pm 1$ and $\pm m$, we say that $m$ is composite. If a positive integer $m \ne 1$ is not composite, we call it a prime. For example, 6 is composite (it has factors $\pm 2$ and $\pm 3$), while 7 is prime.

If $p$ is a prime, then for any integer $n$, either $p$ divides $n$ or $p$ and $n$ are relatively prime. Thus, Theorem 5 has the following corollary:

Corollary Let $m$ and $n$ be integers, and let $p$ be a prime: If $p \mid mn$, then either $p \mid m$ or $p \mid n$.

It is a major fact about integers that every positive integer $m > 1$ can be written, uniquely up to the order of the factors, as a product of primes. (The proof is given in Chapter 22.)

By a least common multiple of two integers $r$ and $s$ we mean a positive integer $m$ such that
(i) $r \mid m$ and $s \mid m$, and
(ii) If $r \mid x$ and $s \mid x$, then $m \mid x$.

In other words, $m$ is a common multiple of $r$ and $s$ and $m$ is a factor of every other common multiple of $r$ and $s$. In Chapter 22 it is shown that every pair of integers $r$ and $s$ has a unique least common multiple.

The least common multiple of $r$ and $s$ is denoted by $\mathrm{lcm}(r, s)$. The least common multiple has the following properties:

Theorem 6 For any integers $a$, $b$, and $c$:
(i) If $\gcd(a, b) = 1$, then $\mathrm{lcm}(a, b) = ab$.
(ii) Conversely, if $\mathrm{lcm}(a, b) = ab$, then $\gcd(a, b) = 1$.
(iii) If $\gcd(a, b) = c$, then $\mathrm{lcm}(a, b) = ab/c$.
(iv) $\mathrm{lcm}(a, ab) = ab$.

The proofs are left as exercises.

EXERCISES

Prove that the following are true for any integers $a$, $b$, and $c$:
1 If $a \mid b$ and $b \mid c$, then $a \mid c$.
2 If $a \mid b$, then $a \mid (-b)$ and $(-a) \mid b$.
3 $1 \mid a$ and $(-1) \mid a$.
4 $a \mid 0$.
5 If $a \mid b$, then $ac \mid bc$.
6 If $a > 0$, then $\gcd(a, 0) = a$.
7 If $\gcd(ab, c) = 1$, then $\gcd(a, c) = 1$ and $\gcd(b, c) = 1$.
8 If there are integers $k$ and $l$ such that $ka + lb = 1$, then $a$ and $b$ are relatively prime.
9 If $a \mid d$ and $c \mid d$ and $\gcd(a, c) = 1$, then $ac \mid d$.
10 If $d \mid ab$ and $d \mid cb$ and $\gcd(a, c) = 1$, then $d \mid b$.
11 If $\gcd(a, b) = 1$, then $\mathrm{lcm}(a, b) = ab$.
12 If $\mathrm{lcm}(a, b) = ab$, then $\gcd(a, b) = 1$.
13 If $\gcd(a, b) = c$, then $\mathrm{lcm}(a, b) = ab/c$.
14 $\mathrm{lcm}(a, ab) = ab$.
15 $a \cdot \mathrm{lcm}(b, c) = \mathrm{lcm}(ab, ac)$.

APPENDIX C REVIEW OF MATHEMATICAL INDUCTION

The basic assumption one makes about the ordering of the integers is the following:

Well-ordering principle. Every nonempty set of positive integers has a least element.

From this assumption, it is easy to prove the following theorem, which underlies the method of proof by induction:

Theorem 1: Principle of mathematical induction Let $A$ represent a set of positive integers. Consider the following two conditions:
(i) 1 is in $A$.
(ii) For any positive integer $k$, if $k$ is in $A$, then $k + 1$ is in $A$.
If $A$ is any set of positive integers satisfying these two conditions, then $A$ consists of all the positive integers.

Proof: If $A$ does not contain all the positive integers, then by the well-ordering principle (above), the set of all the positive integers which are not in $A$ has a least element; call it $b$. From Condition (i), $b \ne 1$; hence $b > 1$. Thus, $b - 1 > 0$, and $b - 1 \in A$ because $b$ is the least positive integer not in $A$. But then, from Condition (ii), $b \in A$, which is impossible. ■

Here is an example of how the principle of mathematical induction is used: We shall prove the identity

$$1 + 2 + \cdots + n = \frac{n(n + 1)}{2} \tag{1}$$

that is, the sum of the first $n$ positive integers is equal to $n(n + 1)/2$.
Let $A$ consist of all the positive integers $n$ for which Equation (1) is true. Then 1 is in $A$ because

$$1 = \frac{1 \cdot 2}{2}$$

Next, suppose that $k$ is any positive integer in $A$; we must show that, in that case, $k + 1$ also is in $A$. To say that $k$ is in $A$ means that

$$1 + 2 + \cdots + k = \frac{k(k + 1)}{2}$$

By adding $k + 1$ to both sides of this equation, we get

$$1 + 2 + \cdots + k + (k + 1) = \frac{k(k + 1)}{2} + (k + 1)$$

that is,

$$1 + 2 + \cdots + (k + 1) = \frac{(k + 1)(k + 2)}{2}$$

From this last equation, $k + 1 \in A$.

We have shown that $1 \in A$, and moreover, that if $k \in A$, then $k + 1 \in A$. So by the principle of mathematical induction, all the positive integers are in $A$. This means that Equation (1) is true for every positive integer, as claimed.

EXERCISES

Use mathematical induction to prove the following:
1 $1 + 3 + 5 + \cdots + (2n - 1) = n^2$ (That is, the sum of the first $n$ odd integers is equal to $n^2$.)
2 $1^3 + 2^3 + \cdots + n^3 = (1 + 2 + \cdots + n)^2$
3 $1^2 + 2^2 + \cdots + n^2 = \frac{1}{6} n(n + 1)(2n + 1)$
4 $1^3 + 2^3 + \cdots + n^3 = \frac{1}{4} n^2 (n + 1)^2$
5 $\frac{1}{2!} + \frac{2}{3!} + \cdots + \frac{n}{(n + 1)!} = 1 - \frac{1}{(n + 1)!}$
6 $1^2 + 2^2 + \cdots + (n - 1)^2 < \frac{n^3}{3} < 1^2 + 2^2 + \cdots + n^2$

ANSWERS TO SELECTED EXERCISES

CHAPTER 2

A3 This is not an operation on $\mathbb{R}$, since $a * b$ is not uniquely defined for any $a, b \in \mathbb{R}$, $a \ne 0$, and $b \ne 0$. If $a \ne 0$ and $b \ne 0$, then the equation $x^2 - a^2b^2 = 0$ has two roots, namely, $x = a * b = \pm ab$.

B7 [Table of Yes/No check boxes for Commutative, Associative, Identity, and Inverses; the marks are illegible in this copy.]

(ii) Associative law: [...] xyz(y + 2 + 1) [...]

[...] $= b * a = a$, $(b * b) * a = a * a = b$
8. $b * (b * b) = b * a = a$, $(b * b) * b = a * b = a$

Since $a * (a * b) \ne (a * a) * b$, $*$ is not associative.

By the way, this operation is commutative, since $a * b = a = b * a$. Using commutativity, we need not check all eight cases above; in fact, equality in cases 1, 3, 6, and 8 follows using commutativity. For example, in case 3, it follows from commutativity that $a * (b * a) = (b * a) * a = (a * b) * a$. Thus, for commutative operations, only four of the eight cases need to be checked.
If you are able to show that it is sufficient to check only cases 2 and 4 for commutative operations, your work will be further reduced. We have checked only one of the 16 operations. Check the remaining 15.
CHAPTER 3
A4 (ii) Associative law:
$$x * (y * z) = x * \left(\frac{y+z}{yz+1}\right) = \frac{x + \frac{y+z}{yz+1}}{x\left(\frac{y+z}{yz+1}\right) + 1} = \frac{xyz + x + y + z}{xy + xz + yz + 1}$$
$$(x * y) * z = \left(\frac{x+y}{xy+1}\right) * z = \frac{\frac{x+y}{xy+1} + z}{\left(\frac{x+y}{xy+1}\right)z + 1} = \frac{x + y + z + xyz}{xy + yz + xz + 1}$$
(iii) Identity element: Solve x * e = x for e. If
$$x * e = \frac{x + e}{xe + 1} = x$$
then $e(1 - x^2) = 0$, so e = 0. Now check that x * 0 = 0 * x = x.
(iv) Inverses: Solve x * x' = 0 for x'. (You should get x' = -x.) Then check that x * (-x) = 0 = (-x) * x.
B2 (ii) Associative law:
(a, b) * [(c, d) * (e, f)] = (a, b) * (ce, de + f) = (ace, bce + de + f)
[(a, b) * (c, d)] * (e, f) = (ac, bc + d) * (e, f) = (ace, bce + de + f)
(iii) Identity element: Solve $(a, b) * (e_1, e_2) = (a, b)$ for $(e_1, e_2)$. Suppose
$$(a, b) * (e_1, e_2) = (ae_1, be_1 + e_2) = (a, b)$$
This implies that $ae_1 = a$ and $be_1 + e_2 = b$, so $e_1 = 1$ and $e_2 = 0$. Thus, $(e_1, e_2) = (1, 0)$. Now check that (a, b) * (1, 0) = (1, 0) * (a, b) = (a, b).
(iv) Inverses: Solve (a, b) * (a', b') = (1, 0) for (a', b'). You should get a' = 1/a and b' = -b/a. Then check.
G1 The equation $a_4 = a_1 + a_3$ means that the fourth digit of every codeword is equal to the sum of the first and third digits. We check this fact for the eight codewords of $C_1$ in turn: 0 = 0 + 0, 1 = 0 + 1, 0 = 0 + 0, 1 = 0 + 1, 1 = 1 + 0, 0 = 1 + 1, 1 = 1 + 0, and 0 = 1 + 1.
G2 (a) As stated, the first three positions are the information positions. Therefore there are eight codewords; omitting the numbers in the last three positions (the redundancy positions), the codewords are 000, 001, 010, 011, 100, 101, 110, and 111. The numbers in the redundancy positions are specified by the parity-check equations given in the exercise. Thus, the complete codewords are 000000, 001001, .... (Complete the list.)
G6 Let $a_k$, $b_k$, $x_k$ denote the digits in the kth position of a, b, and x, respectively.
Note that if $x_k \neq a_k$ and $x_k \neq b_k$, then $a_k = b_k$. But if $x_k \neq a_k$ and $x_k = b_k$, then $a_k \neq b_k$. And if $x_k = a_k$ and $x_k \neq b_k$, then $a_k \neq b_k$. Finally, if $x_k = a_k$ and $x_k = b_k$, then $a_k = b_k$. Thus, a differs from b in only those positions where a differs from x or b differs from x. Since a differs from x in t or fewer positions, and b differs from x in t or fewer positions, a cannot differ from b in more than 2t positions.
CHAPTER 4
A3 Take the first equation, $x^2a = bxc^{-1}$, and multiply on the right by c:
$$x^2ac = (bxc^{-1})c = bx(c^{-1}c) = bxe = bx$$
From the second equation, $x^2ac = x(xac) = xacx$. Thus,
$$xacx = bx$$
By the cancellation law (cancel x on the right),
$$xac = b$$
Now multiply on the right by $(ac)^{-1}$ to complete the problem.
C1 From Theorem 3, $a^{-1}b^{-1} = (ba)^{-1}$ and $b^{-1}a^{-1} = (ab)^{-1}$.
D2 From Theorem 2, if (ab)c = e, then c is the inverse of ab.
G3 Let G and H be abelian groups. To prove that G × H is abelian, let (a, b) and (c, d) be any two elements of G × H (so that a ∈ G, c ∈ G, b ∈ H, and d ∈ H) and show that (a, b) · (c, d) = (c, d) · (a, b).
Proof:
(a, b) · (c, d) = (ac, bd)
= (ca, db) because G and H are abelian
= (c, d) · (a, b) ■
CHAPTER 5
B5 Suppose f, g ∈ H. Then df/dx = a and dg/dx = b for constants a and b. From calculus, d(f + g)/dx = df/dx + dg/dx = a + b, which is constant. Thus, f + g ∈ H.
... m, n > 0 such that $x^n \in H$ and $y^m \in H$. Since H is a subgroup of G, it is closed under products, and hence under exponentiation (which is repeated multiplication of an element by itself). Thus $(x^n)^m \in H$ and $(y^m)^n \in H$. Set q = mn. Since G is abelian, $(xy)^q = x^qy^q = x^{mn}y^{mn} = (x^n)^m(y^m)^n \in H$ since both $(x^n)^m$ and $(y^m)^n$ are in H. Complete the problem.
D5 $S = \{a_1, \ldots, a_n\}$ has n elements. The n products $a_1a_1, a_1a_2, \ldots, a_1a_n$ are elements of S (why?) and no two of them can be equal (why?). Hence every element of S is equal to one of these products. In particular, $a_1 = a_1a_k$ for some k. Thus, $a_1e = a_1a_k$, and hence $e = a_k$.
This shows that e ∈ S. Now complete the problem.
D7 (a) Suppose x ∈ K. Then (i) if a ∈ H, then $xax^{-1} \in H$, and (ii) if $xbx^{-1} \in H$, then b ∈ H. We shall prove that $x^{-1} \in K$: we must first show that if a ∈ H, then $x^{-1}a(x^{-1})^{-1} = x^{-1}ax \in H$. Well, $a = x(x^{-1}ax)x^{-1}$, and if $x(x^{-1}ax)x^{-1} \in H$, then $x^{-1}ax \in H$ by (ii) above. [Use (ii) with $x^{-1}ax$ replacing b.] Conversely, we must show that if $x^{-1}ax \in H$, then a ∈ H. Well, if $x^{-1}ax \in H$, then by (i) above, $x(x^{-1}ax)x^{-1} = a \in H$.
E7 We begin by listing all the elements of $\mathbb{Z}_2 \times \mathbb{Z}_4$ obtained by adding (1, 1) to itself repeatedly: (1, 1); (1, 1) + (1, 1) = (0, 2); (1, 1) + (1, 1) + (1, 1) = (1, 3); (1, 1) + (1, 1) + (1, 1) + (1, 1) = (0, 0). If we continue adding (1, 1) to multiples of itself, we simply obtain the above four pairs over and over again. Thus, (1, 1) is not a generator of all of $\mathbb{Z}_2 \times \mathbb{Z}_4$. This process is repeated for every element of $\mathbb{Z}_2 \times \mathbb{Z}_4$. None is a generator of $\mathbb{Z}_2 \times \mathbb{Z}_4$; hence $\mathbb{Z}_2 \times \mathbb{Z}_4$ is not cyclic.
F1 The table of G is as follows:
         e      a      b      b^2    ab     ab^2
  e      e      a      b      b^2    ab     ab^2
  a      a      e      ab     ab^2   b      b^2
  b      b      ab^2   b^2    e      a      ab
  b^2    b^2    ab     e      b      ab^2   a
  ab     ab     b^2    ab^2   a      e      b
  ab^2   ab^2   b      a      ab     b^2    e
H3 Using the defining equations $a^2 = e$, $b^3 = e$, and $ba = ab^2$, we compute the product of ab and $ab^2$ in this way:
$$(ab)(ab^2) = a(ba)b^2 = a(ab^2)b^2 = a^2b^4 = eeb = b$$
Complete the problem by exhibiting the computation of all the table entries.
Recall the definition of the operation + in Chapter 3, Exercise F: x + y has 1s in those positions where x and y differ, and 0s elsewhere.
CHAPTER 6
A4 From calculus, the function $f(x) = x^3 - 3x$ is continuous and unbounded. Its graph is shown below. [f is unbounded because $f(x) = x(x^2 - 3)$ is an arbitrarily large positive number for sufficiently large positive values of x, and an arbitrarily large negative number (large in absolute value) for sufficiently large negative values of x.] Because f is continuous and unbounded, the range of f is ℝ. Thus, f is surjective. Now determine whether f is injective, and prove your answer.
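A small table of values can stand in for a sketch of the graph while working on this exercise. The Python snippet below is only an illustrative aid: the sample points are an arbitrary choice made here, and whatever the printed values suggest about injectivity still has to be proved.

```python
def f(x):
    """The function from Exercise A4: f(x) = x^3 - 3x."""
    return x**3 - 3 * x

# Tabulate f at a few integer points to see how the values move.
sample_points = [-3, -2, -1, 0, 1, 2, 3]
table = [(x, f(x)) for x in sample_points]
for x, y in table:
    print(f"f({x}) = {y}")
```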
[Figure: graph of $f(x) = x^3 - 3x$]
A6 f is injective: To prove this, note first that if x is an integer then f(x) is an integer, and if x is not an integer, then f(x) is not an integer. Thus, if f(x) = f(y), then x and y are either both integers or both nonintegers. Case 1, both integers: then f(x) = 2x, f(y) = 2y, and 2x = 2y; so x = y. Case 2, both nonintegers: ... (Complete the problem. Determine whether f is surjective.)
F5 Let $A = \{a_1, a_2, \ldots, a_n\}$. If f is any function from A to A, there are n possible values for $f(a_1)$, namely, $a_1, a_2, \ldots, a_n$. Similarly there are n possible values for $f(a_2)$. Thus there are $n^2$ pairs consisting of a value of $f(a_1)$ together with a value of $f(a_2)$. Similarly there are $n^3$ triples consisting of a value of $f(a_1)$, a value of $f(a_2)$, and a value of $f(a_3)$. We may continue in this fashion and conclude as follows: Since a function f is specified by giving a value for $f(a_1)$, a value for $f(a_2)$, and so on up to a value for $f(a_n)$, there are $n^n$ functions from A to A. Now, by reasoning in a similar fashion, how many bijective functions are there?
H1 The following is one example (though not the only possible one) of a machine capable of carrying out the prescribed task:
A = {a, b, c, d}    $S = \{s_0, s_1, s_2, s_3, s_4\}$
The next-state function is described by the following table:
         a     b     c     d
  s_0    s_1   s_0   s_0   s_0
  s_1    s_2   s_1   s_1   s_1
  s_2    s_3   s_2   s_2   s_2
  s_3    s_4   s_3   s_3   s_3
  s_4    s_4   s_4   s_4   s_4
To explain why the machine carries out the prescribed function, note first that the letters b, c, and d never cause the machine to change its state; only a does. If the machine begins reading a sequence in state $s_0$, it will be in state $s_3$ after exactly three a's have been read. Any subsequent a's will leave the machine in state $s_4$. Thus, if the machine ends in $s_3$, it has read exactly three a's. The machine's state diagram is illustrated below:
[Figure: state diagram. The letter a advances the machine from $s_0$ to $s_1$ to $s_2$ to $s_3$ to $s_4$; b, c, and d loop on every state; a, b, c, and d all loop on $s_4$.]
I4 $M_1$ has only two distinct transition functions, which we shall denote by $T_o$ and $T_e$ (where o may be any sequence with an odd number of 1s, and e any sequence with an even number of 1s). $T_o$ and $T_e$ may be described as follows:
$$T_o(s_0) = s_1, \quad T_o(s_1) = s_0; \qquad T_e(s_0) = s_0, \quad T_e(s_1) = s_1$$
Now, $T_e \circ T_o = T_{oe}$ by part 3. Since e is a sequence with an even number of 1s and o is a sequence with an odd number of 1s, oe is a sequence with an odd number of 1s; hence $T_{oe} = T_o$. Thus, $T_e \circ T_o = T_o$. Similarly,
$$T_o \circ T_e = T_o, \qquad T_o \circ T_o = T_e, \qquad \text{and} \qquad T_e \circ T_e = T_e$$
In brief, the table of $\mathcal{S}(M_1)$ is as follows:
   ∘      T_e    T_o
   T_e    T_e    T_o
   T_o    T_o    T_e
The table shows that $\mathcal{S}(M_1)$ is a two-element group. The identity element is $T_e$, and $T_o$ is its own inverse.
CHAPTER 7
D2 $f_{n+m}$ is the function defined by the formula $f_{n+m}(x) = x + n + m$. Use this fact to show that $f_n \circ f_m = f_{n+m}$. In order to show that $f_{-n}$ is the inverse of $f_n$, we must verify that $f_n \circ f_{-n} = \varepsilon$ and $f_{-n} \circ f_n = \varepsilon$. We verify the first equation as follows: $f_{-n}(x) = x + (-n)$; hence
$$[f_n \circ f_{-n}](x) = f_n(f_{-n}(x)) = f_n(x - n) = x - n + n = x = \varepsilon(x)$$
Since $[f_n \circ f_{-n}](x) = \varepsilon(x)$ for every x in ℝ, it follows that $f_n \circ f_{-n} = \varepsilon$.
E3 To prove that $f_{1/a,\,-b/a}$ is the inverse of $f_{a,b}$, we must verify that
$$f_{a,b} \circ f_{1/a,\,-b/a} = \varepsilon \qquad \text{and} \qquad f_{1/a,\,-b/a} \circ f_{a,b} = \varepsilon$$
To verify the first equation, we begin with the fact that $f_{1/a,\,-b/a}(x) = x/a - b/a$. Now complete the problem.
H2 If f, g ∈ G, then f moves a finite number of elements of A, say $a_1, \ldots, a_n$, and g moves a finite number of elements of A, say $b_1, \ldots, b_m$. Now if x is any element of A which is not one of the elements $a_1, \ldots, a_n, b_1, \ldots, b_m$, then $[f \circ g](x) = x$. Thus, $f \circ g$ moves at most the finite set of elements $\{a_1, \ldots, a_n, b_1, \ldots, b_m\}$.
CHAPTER 8
A1(e) ...
A4(f) $\gamma^3 = \gamma \circ \gamma \circ \gamma = (1\ 2\ 3\ 4\ 5)$; $\alpha^{-1} = (4\ 1\ 7\ 3)$; thus, $\gamma^3 \circ \alpha^{-1} = (1\ 2\ 3\ 4\ 5) \circ (4\ 1\ 7\ 3) = (1\ 7\ 4\ 2\ 3\ 5)$
B2 ... and so on. Note that $\alpha^2(a_1) = a_3$, $\alpha^2(a_2) = a_4$, ..., $\alpha^2(a_{s-2}) = a_s$. Finally, $\alpha^2(a_{s-1}) = a_1$ and $\alpha^2(a_s) = a_2$. Thus, $\alpha^2(a_i) = a_{i+2}$.
Complete the problem using addition modulo s, page 27.
B4 Let $\alpha = (a_1 a_2 \cdots a_s)$ where s is odd. Then $\alpha^2 = \ldots$
E2 If α and β are cycles of the same length, $\alpha = (a_1 \cdots a_s)$ and $\beta = (b_1 \cdots b_s)$, let π be the following permutation: $\pi(a_i) = b_i$ for i = 1, ..., s and $\pi(k) = k$ for $k \neq a_1, \ldots, a_s, b_1, \ldots, b_s$. Finally, let π map distinct elements of $\{b_1, \ldots, b_s\} - \{a_1, \ldots, a_s\}$ to distinct elements of $\{a_1, \ldots, a_s\} - \{b_1, \ldots, b_s\}$. Now complete the problem, supplying details.
F1 ... when k is a positive integer. For what values of k can you have $\alpha^k = \varepsilon$?
H2 Use Exercise H1 and the fact that (i j) = (1 i)(1 j)(1 i).
CHAPTER 9
C1 The group tables for G and H are as follows:
Table for G
        I    V    H    D
   I    I    V    H    D
   V    V    I    D    H
   H    H    D    I    V
   D    D    H    V    I
Table for H
        1    -1   i    -i
   1    1    -1   i    -i
  -1   -1    1    -i   i
   i    i    -i   -1   1
  -i   -i    i    1    -1
G and H are not isomorphic because in G every element is its own inverse (VV = I, HH = I, and DD = I), whereas in H there are elements not equal to their inverse; for example, (-i)(-i) = -1 ≠ 1. Find at least one other difference which shows that G ≇ H.
C4 The group tables for G and H are as follows:
Table for G
        ε    α    β    γ    δ    κ
   ε    ε    α    β    γ    δ    κ
   α    α    ε    γ    β    κ    δ
   β    β    κ    δ    α    ε    γ
   γ    γ    δ    κ    ε    α    β
   δ    δ    γ    ε    κ    β    α
   κ    κ    β    α    δ    γ    ε
Table for H
        I    A    B    C    D    K
   I    I    A    B    C    D    K
   A    A    I    C    B    K    D
   B    B    K    D    A    I    C
   C    C    D    K    I    A    B
   D    D    C    I    K    B    A
   K    K    B    A    D    C    I
G and H are isomorphic. Indeed, let the function f : G → H be defined by
$$f = \begin{pmatrix} \varepsilon & \alpha & \beta & \gamma & \delta & \kappa \\ I & A & B & C & D & K \end{pmatrix}$$
By inspection, f transforms the table of G into the table of H. Thus, f is an isomorphism from G to H.
E1 Show that the function f : ℤ → E given by f(n) = 2n is bijective and that f(n + m) = f(n) + f(m).
F1 Check that $(2\ 4)^2 = \varepsilon$, $(1\ 2\ 3\ 4)^4 = \varepsilon$, and $(1\ 2\ 3\ 4)(2\ 4) = (2\ 4)(1\ 2\ 3\ 4)^3$. Now explain why G ≅ G'.
H2 Let $f_G : G_1 \to G_2$ and $f_H : H_1 \to H_2$ be isomorphisms, and find an isomorphism $f : G_1 \times H_1 \to G_2 \times H_2$.
CHAPTER 10
A1(c) If m < 0 and n < 0, let m = -k and n = -l, where k, l > 0. Then m + n = -(k + l).
Now,
$$a^m = a^{-k} = (a^{-1})^k \qquad \text{and} \qquad a^n = a^{-l} = (a^{-1})^l$$
Hence
$$a^ma^n = (a^{-1})^k(a^{-1})^l = (a^{-1})^{k+l} = a^{-(k+l)} = a^{m+n}$$
B3 The order of f is 4. Explain why.
C4 For any positive integer k, if $a^k = e$, then
$$(bab^{-1})^k = ba^kb^{-1} \text{ (why?)} = bb^{-1} = e$$
Conversely, if $(bab^{-1})^k = ba^kb^{-1} = e$, then $a^k = e$. (Why?) Thus, for any positive integer k, $a^k = e$ iff $(bab^{-1})^k = e$. Now complete the problem.
D2 Let the order of a be equal to n. Then $(a^k)^n = a^{nk} = (a^n)^k = e^k = e$. Now use Theorem 5.
F2 The order of $a^8$ is 3. Explain why.
H2 The order is 24. Explain why.
CHAPTER 11
A6 If k is a generator of ℤ, this means that ℤ consists of all the multiples of k; that is, k, 2k, 3k, etc., as well as 0 and -k, -2k, -3k, etc.:
[Figure: number line with the multiples of k marked, including -2k, 0, and 2k]
B1 Let G be a group of order n, and suppose G is cyclic, say G = ⟨a⟩. Then a, the generator of G, is an element of order n. (This follows from the discussion on the first two pages of this chapter.) Conversely, let G be a group of order n (that is, a group with n elements), and suppose G has an element a of order n. Prove that G is cyclic.
C4 By Exercise B4, there is an element b of order m in ⟨a⟩, and $b \in C_m$. Since $C_m$ is a subgroup of ⟨a⟩, which is cyclic, we know from Theorem 2 that $C_m$ is cyclic. Since every element x in $C_m$ satisfies $x^m = e$, no element in $C_m$ can have order greater than m. Now complete the argument.
C7 First, assume that $\mathrm{ord}(a^r) = m$. Then $(a^r)^m = a^{rm} = e$. Use Theorem 5 of Chapter 10 to show that r = kl for some integer l. To show that l and m are relatively prime, assume on the contrary that l and m have a common factor q; that is, m = hq ...
[Figure: line with slope = 2]
Complete the solution, supplying details.
D5 If $ab^{-1}$ commutes with every x in G, then we can show that $ba^{-1}$ commutes with every x in G:
$$ba^{-1}x = (x^{-1}ab^{-1})^{-1} = (ab^{-1}x^{-1})^{-1} \text{ (why?)}$$
CHAPTER 13
B1 Note first that the operation in the case of the group ℤ is addition.
The subgroup ⟨3⟩ consists of all the multiples of 3, that is,
⟨3⟩ = {..., -9, -6, -3, 0, 3, 6, 9, ...}
The cosets of ⟨3⟩ are ⟨3⟩ + 0 = ⟨3⟩, as well as
⟨3⟩ + 1 = {..., -8, -5, -2, 1, 4, 7, 10, ...}
⟨3⟩ + 2 = {..., -7, -4, -1, 2, 5, 8, 11, ...}
Note that ⟨3⟩ + 3 = ⟨3⟩, ⟨3⟩ + 4 = ⟨3⟩ + 1, and so on; hence there are only three cosets of ⟨3⟩, namely,
⟨3⟩ = ⟨3⟩ + 0    ⟨3⟩ + 1    ⟨3⟩ + 2
C6 Every element a of order p belongs in a subgroup ⟨a⟩. The subgroup ⟨a⟩ has p - 1 elements other than e (why?), and each of these elements has order p (why?). Complete the solution.
D6 For one part of the problem, use Lagrange's theorem. For another part, use the result of Exercise F4, Chapter 11.
E4 To say that aH = Ha is to say that every element in aH is in Ha and conversely. That is, for any h ∈ H, there is some k ∈ H such that ah = ka and there is some l ∈ H such that ha = al. (Explain why this is equivalent to aH = Ha.) Now, an arbitrary element of $a^{-1}H$ is of the form $a^{-1}h = (h^{-1}a)^{-1}$. Complete the solution.
J3 O(1) = O(2) = {1, 2, 3, 4}; $G_1 = \{\varepsilon, \beta\}$; $G_2 = \{\varepsilon, \alpha\beta\alpha\}$. Complete the problem, supplying details.
CHAPTER 14
A6 We use the following properties of sets: For any three sets X, Y, and Z,
(i) (X ∪ Y) ∩ Z = (X ∩ Z) ∪ (Y ∩ Z)
(ii) (X - Y) ∩ Z = (X ∩ Z) - (Y ∩ Z)
Now here is the proof that h is a homomorphism: Let C and D be any subsets of A; then
h(C + D) = h[(C - D) ∪ (D - C)]    by def. of the operation +
= [(C - D) ∪ (D - C)] ∩ B    by def. of h
Now complete, using (i) and (ii).
C1 Let f be injective. To show that K = {e}, take any arbitrary element x ∈ K and show that necessarily x = e. Well, since x ∈ K, f(x) = e = f(e). Now complete, and prove the converse: Assume K = {e} ....
D6 Consider the following family of subsets of G: $\{H_i : i \in I\}$, where each $H_i$ is a normal subgroup of G. Show that $H = \bigcap_{i \in I} H_i$ is a normal subgroup of G. First, show that H is closed under the group operation: well, if a, b ∈ H, then $a \in H_i$ and $b \in H_i$ for every i ∈ I.
Since each $H_i$ is a subgroup of G, $ab \in H_i$ for every i ∈ I; hence $ab \in \bigcap_{i \in I} H_i$. Now complete.
E1 If H has index 2, then G is partitioned into exactly two right cosets of H; also G is partitioned into exactly two left cosets of H. One of the cosets in each case is H.
E6 First, show that if x ∈ S and y ∈ S, then xy ∈ S. Well, if x ∈ S, then x ∈ Ha = aH for some a ∈ G. And if y ∈ S, then y ∈ Hb = bH for some b ∈ G. Show that xy ∈ H(ab) and that H(ab) = (ab)H and then complete the problem.
I4 It is easy to show that $aHa^{-1} \subseteq H$. Show it. What does Exercise I2 tell you about the number of elements in $aHa^{-1}$?
I8 Let $X = \{aHa^{-1} : a \in G\}$ be the set of all the conjugates of H, and let $Y = \{aN : a \in G\}$ be the set of all the cosets of N. Find a function f : X → Y and show that f is bijective.
CHAPTER 15
C4 Every element of G/H is a coset Hx. Assume every element of G/H has a square root: this means that for every x ∈ G, $Hx = (Hy)^2$ for some y ∈ G. Avail yourself of Theorem 5 in this chapter.
D4 Let H be generated by $\{h_1, \ldots, h_n\}$ and let G/H be generated by $\{Ha_1, \ldots, Ha_m\}$. Show that G is generated by
$$\{a_1, \ldots, a_m, h_1, \ldots, h_n\}$$
that is, every x in G is a product of the above elements and their inverses.
E6 Every element of ℚ/ℤ is a coset ℤ + (m/n).
G6 If G is cyclic, then necessarily $G \cong \mathbb{Z}_{p^2}$. (Why?) If G is not cyclic, then every element x ≠ e in G has order p. (Why?) Take any two elements a ≠ e and b ≠ e in G where b is not a power of a. Complete the problem.
CHAPTER 16
D1 Let f ∈ Aut(G); that is, let f be an isomorphism from G onto G. We shall prove that $f^{-1} \in \mathrm{Aut}(G)$; that is, $f^{-1}$ is an isomorphism from G onto G. To begin with, it follows from the last paragraph of Chapter 6 that $f^{-1}$ is a bijective function from G onto G. It remains to show that $f^{-1}$ is a homomorphism. Let $f^{-1}(c) = a$ and $f^{-1}(d) = b$, so that c = f(a) and d = f(b). Then cd = f(a)f(b) = f(ab), whence $f^{-1}(cd) = ab$. Thus,
$$f^{-1}(cd) = ab = f^{-1}(c)f^{-1}(d)$$
which shows that $f^{-1}$ is a homomorphism.
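The argument in D1 can be tried out on a small concrete case. The Python sketch below is an illustration under assumptions made here, not part of the text: it takes G to be the additive group of integers modulo 7 and f(x) = 3x mod 7 as the automorphism, tabulates the inverse of f, and checks that the inverse respects the group operation.

```python
N = 7  # illustrative choice: the additive group Z_7

def f(x):
    """An automorphism of Z_7 picked for this example: x -> 3x (mod 7)."""
    return (3 * x) % N

# f is a bijection on {0, ..., 6}, so its inverse can be tabulated.
f_inverse = {f(a): a for a in range(N)}

# f is a homomorphism: f(a + b) = f(a) + f(b) in Z_7.
f_is_hom = all(
    f((a + b) % N) == (f(a) + f(b)) % N
    for a in range(N) for b in range(N)
)

# D1's conclusion: f^{-1}(c + d) = f^{-1}(c) + f^{-1}(d) in Z_7.
inverse_is_hom = all(
    f_inverse[(c + d) % N] == (f_inverse[c] + f_inverse[d]) % N
    for c in range(N) for d in range(N)
)

print(len(f_inverse) == N, f_is_hom, inverse_is_hom)
```

Because the group here is written additively, the product cd in the proof corresponds to c + d in the code.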
F2 If a, b ∈ HK, then $a = h_1k_1$ and $b = h_2k_2$, where $h_1, h_2 \in H$ and $k_1, k_2 \in K$. Then $ab = h_1k_1h_2k_2 = h_1(k_1h_2k_1^{-1})k_1k_2$.
G3 Note that the range of h is a group of functions. What is its identity element?
H1 From calculus, cos(x + y) = cos x cos y - sin x sin y, and sin(x + y) = sin x cos y + cos x sin y.
L4 The natural homomorphism (Theorem 4, Chapter 15) is a homomorphism f : G → G/⟨a⟩ with kernel ⟨a⟩. Let S be the normal subgroup of G/⟨a⟩ whose order is $p^{m-1}$. (The existence of S is assured by part 3 of this exercise set.) Referring to Exercise J, show that S* is a normal subgroup of G, and that the order of S* is $p^m$.
CHAPTER 17
A3 We prove that ⊙ is associative:
(a, b) ⊙ [(c, d) ⊙ (p, q)] = (a, b) ⊙ (cp - dq, cq + dp)
= (acp - adq - bcq - bdp, acq + adp + bcp - bdq)
[(a, b) ⊙ (c, d)] ⊙ (p, q) = (ac - bd, ad + bc) ⊙ (p, q)
= (acp - bdp - adq - bcq, acq - bdq + adp + bcp)
Thus, (a, b) ⊙ [(c, d) ⊙ (p, q)] = [(a, b) ⊙ (c, d)] ⊙ (p, q).
B2 A nonzero function f is a divisor of zero if there exists some nonzero function g such that fg = 0, where 0 is the zero function (page 46). The equation fg = 0 means that f(x)g(x) = 0(x) for every x ∈ ℝ. Very precisely, what functions f have this property?
D1 For the distributive law, refer to the diagram on page 30, and show that A ∩ (B + C) = (A ∩ B) + (A ∩ C): B + C consists of the regions 2, 3, 4, and 7; A ∩ (B + C) consists of the regions 2 and 4. Now complete the problem.
E3 ...
G4 (a, b) is an invertible element of A × B iff there is an ordered pair (c, d) in A × B satisfying (a, b) · (c, d) = (1, 1). Now complete.
H6 If A is a ring, then, as we have seen, A with addition as the only operation is an abelian group: this group is called the additive group of the ring A. Now, suppose the additive group of A is a cyclic group, and its generator is c.
If a and b are any two elements of A, then
a = c + c + ··· + c (m terms)
b = c + c + ··· + c (n terms)
for some positive integers m and n.
J2 If ab is a divisor of 0, this means that ab ≠ 0 and there is some x ≠ 0 such that abx = 0. Moreover, a ≠ 0 and b ≠ 0, for otherwise ab = 0.
M3 Suppose $a^m = 0$ and $b^n = 0$. Show that $(a + b)^{m+n} = 0$. Explain why, in every term of the binomial expansion of $(a + b)^{m+n}$, either a is raised to a power ≥ m, or b is raised to a power ≥ n.
CHAPTER 18
A4 From calculus, the sum and product of continuous functions are continuous.
B3 The proof hinges on the fact that if k and a are any two elements of $\mathbb{Z}_n$, then
ka = a + a + ··· + a (k terms)
C4 If the cancellation law holds in A, it must hold in B. (Why?) Why is it necessary to include the condition that B contains 1?
C5 Let B be a subring of a field F. If b is any nonzero element of B, then $b^{-1}$ is in F, though not necessarily in B. (Why?) Complete the argument.
E5 f(x, y)f(u, v) = ...; f((x, y) ⊙ (u, v)) = ... Complete the problem.
H3 If $a^n \in J$ and $b^m \in J$, show that $(a + b)^{n+m} \in J$. (See the solution of Exercise M3, Chapter 17.) Complete the solution.
CHAPTER 19
E1 To say that the coset J + x has a square root is to say that for some element y in A, $J + x = (J + y)(J + y) = J + y^2$.
E6 A unity element of A/J is a coset J + a such that for any x ∈ A,
(J + a)(J + x) = J + x and (J + x)(J + a) = J + x
G1 To say that a ∉ J is equivalent to saying that J + a ≠ J; that is, J + a is not the zero element of A/J. Explain and complete.
CHAPTER 20
E5 Restrict your attention to A with addition alone, and see Chapter 13, Theorem 4.
E6 For n = 2 you have
$(a + b)^{p^2} = [(a + b)^p]^p = [a^p + b^p]^p$ by Theorem 3
$= (a^p)^p + (b^p)^p$ by Theorem 3
$= a^{p^2} + b^{p^2}$
Prove the required formula by reasoning similarly and using induction: assume the formula is true for n = k, and prove for n = k + 1.
CHAPTER 21
B5 Use the product (a - 1)(b - 1).
C8 In the induction step, you assume the formula is true for n = k and prove it is true for n = k + 1. That is, you assume
$$F_{k+1}F_{k+2} - F_kF_{k+3} = (-1)^k$$
and prove
$$F_{k+2}F_{k+3} - F_{k+1}F_{k+4} = (-1)^{k+1}$$
Recall that by the definition of the Fibonacci sequence, $F_{n+2} = F_{n+1} + F_n$ for every n > 2. Thus, $F_{k+2} = F_{k+1} + F_k$ and $F_{k+4} = F_{k+3} + F_{k+2}$. Substitute these in the second of the equations above.
E5 An elegant way to prove this inequality is by use of part 4, with a + b in the place of a, and |a| + |b| in the place of b.
E8 This can be proved easily using part 5.
F2 You are given that m = nq + r and $q = kq_1 + r_1$, where 0 ≤ r < n and $0 \leq r_1 < k$. (Explain.) Thus,
$$m = n(kq_1 + r_1) + r = (nk)q_1 + (nr_1 + r)$$
You must show that $nr_1 + r < nk$. (Why?) Begin by noting that $k - r_1 > 0$; hence $k - r_1 \geq 1$, so $n(k - r_1) \geq n$.
G5 For the induction step, assume k · a = (k · 1)a, and prove (k + 1) · a = [(k + 1) · 1]a. From (ii) in this exercise, (k + 1) · 1 = k · 1 + 1.
CHAPTER 22
B1 Assume a > 0 and a | b. To solve the problem, show that a is the greatest common divisor of a and b. First, a is a common divisor of a and b: a | a and a | b. Next, suppose t is any common divisor of a and b: t | a and t | b. Explain why you are now done.
D3 From the proof of Theorem 3, d is the generator of the ideal consisting of all the linear combinations of a and b.
E1 Suppose a is odd and b is even. Then a + b is odd and a - b is odd. Moreover, if t is any common divisor of a - b and a + b, then t is odd. (Why?) Note also that if t is a common divisor of a - b and a + b, then t divides the sum and difference of a + b and a - b.
F3 If l = lcm(ab, ac), then l = abx = acy for some integers x and y. From these equations you can see that a is a factor of l, say l = am. Thus, the equations become am = abx = acy. Cancel a, then show that m = lcm(b, c).
G8 Look at the proof of Theorem 3.
CHAPTER 23
A4(f) $3x^2 - 6x + 6 = 3(x^2 - 2x + 1) + 3 = 3(x - 1)^2 + 3$. Thus, we are to solve the congruence $3(x - 1)^2 \equiv -3 \pmod{15}$.
We begin by solving $3y \equiv -3 \pmod{15}$, then we will set $y = (x - 1)^2$. We note first that by Condition (6), in a congruence $ax \equiv b \pmod{n}$, if the three numbers a, b, and n have a common factor d, then
$$ax \equiv b \pmod{n} \quad \text{is equivalent to} \quad \frac{a}{d}x \equiv \frac{b}{d} \pmod{\frac{n}{d}}$$
(That is, all three numbers a, b, and n may be divided by the common factor d.) Applying this observation to our congruence $3y \equiv -3 \pmod{15}$, we get
$$3y \equiv -3 \pmod{15} \quad \text{is equivalent to} \quad y \equiv -1 \pmod{5}$$
This is the same as $y \equiv 4 \pmod{5}$, because in $\mathbb{Z}_5$ the negative of 1 is 4. Thus, our quadratic congruence is equivalent to
$$(x - 1)^2 \equiv 4 \pmod{5}$$
In $\mathbb{Z}_5$, $2^2 = 4$ and $3^2 = 4$; hence the solutions are $x - 1 \equiv 2 \pmod{5}$ and $x - 1 \equiv 3 \pmod{5}$, or finally,
$$x \equiv 3 \pmod{5} \quad \text{and} \quad x \equiv 4 \pmod{5}$$
A6(d) We begin by finding all the solutions of 30z + 24y = 18, then set $z = x^2$. Now,
30z + 24y = 18 iff 24y = 18 - 30z iff $30z \equiv 18 \pmod{24}$
By comments in the previous solution, this is equivalent to $5z \equiv 3 \pmod{4}$. But in $\mathbb{Z}_4$, 5 = 1; hence $5z \equiv 3 \pmod{4}$ is the same as $z \equiv 3 \pmod{4}$. Now set $z = x^2$. Then the solution is $x^2 \equiv 3 \pmod{4}$. But this last congruence has no solution, because in $\mathbb{Z}_4$, 3 is not the square of any number. Thus, the Diophantine equation has no solutions.
B3 Here is the idea: By Theorem 3, the first two equations have a simultaneous solution; by Theorem 4, it is of the form $x \equiv c \pmod{t}$, where $t = \mathrm{lcm}(m_1, m_2)$. To solve the latter simultaneously with $x \equiv c_3 \pmod{m_3}$, you will need to know that $c_3 \equiv c$ [mod gcd(t, $m_3$)]. But gcd(t, $m_3$) = lcm($d_{13}$, $d_{23}$). (Explain this carefully, using the result of Exercise H4 in Chapter 22.) From Theorem 4 (especially the last paragraph of its proof), it is clear that since $c_3 \equiv c_1 \pmod{d_{13}}$ and $c_3 \equiv c_2 \pmod{d_{23}}$, therefore $c_3 \equiv c$ [mod lcm($d_{13}$, $d_{23}$)].
...
... $+ \cdots + h(c_n)b^n$. For $d(x) = d_0 + d_1x + \cdots + d_nx^n$, we have similar formulas. Show that $\bar{h}(c(a)) = \bar{h}(d(a))$ iff b is a root of hc(x) - hd(x).
I4 The proof is by induction on the degree of a(x). If deg a(x) = 1, then $K_1 = F_1$ and $K_2 = F_2$ (explain why), and therefore $K_1 \cong K_2$.
Now let deg a(x) = n, and suppose the result is true for any polynomial of degree n - 1. Let p(x) be an irreducible factor of a(x); let u be a root of p(x) and v a root of hp(x). If $F_1(u) = K_1$, then by parts 1 and 2 we are done. If not, let $F_1' = F_1(u)$ and $F_2' = F_2(v)$; h can be extended to an isomorphism $\bar{h} : F_1' \to F_2'$, with $\bar{h}(u) = v$. In $F_1'[x]$, $a(x) = (x - u)a_1(x)$, and in $F_2'[x]$, $ha(x) = \bar{h}a(x) = (x - v)\bar{h}a_1(x)$, where $\deg a_1(x) = \deg \bar{h}a_1(x) = n - 1$. Complete the solution.
J2 Explain why any monomorphism $\bar{h} : F(c) \to C$, which is an extension of h, is fully determined by the value of $\bar{h}(c)$, which is a root of hp(x).
J4 Since h(1) = 1, begin by proving by induction that for any positive integer n, h(n) = n. Then complete the solution.
CHAPTER 32
A2 In the first place, $\mathbb{Q}(\sqrt{2})$ is of degree 2 over ℚ. Next, $x^2 + 1$ is irreducible over $\mathbb{Q}(\sqrt{2})$. (Explain why.) Complete the solution.
D1 The complex fourth roots of 1 are ±1 and ±i. Thus, the complex fourth roots of 2 are ... , $\alpha\omega^2$, $\alpha\omega^3$, $\alpha\omega^4$, and $\alpha\omega^5$. Any automorphism of $\mathbb{Q}(\alpha, \sqrt{3}i)$ fixing ℚ maps sixth roots of 2 to sixth roots of 2, at the same time mapping $\sqrt{3}i$ to $\pm\sqrt{3}i$ (and hence mapping ω to