Do we really understand quantum mechanics? Strange correlations, paradoxes, and theorems

F. Laloëᵃ⁾
Laboratoire de Physique de l'ENS, LKB, 24 rue Lhomond, F-75005 Paris, France

(Received 10 November 1998; accepted 29 January 2001)

This article presents a general discussion of several aspects of our present understanding of quantum mechanics. The emphasis is put on the very special correlations that this theory makes possible: They are forbidden by very general arguments based on realism and local causality. In fact, these correlations are completely impossible in any circumstance, except for very special situations designed by physicists especially to observe these purely quantum effects. Another general point that is emphasized is the necessity for the theory to predict the emergence of a single result in a single realization of an experiment. For this purpose, orthodox quantum mechanics introduces a special postulate: the reduction of the state vector, which comes in addition to the Schrödinger evolution postulate. Nevertheless, the presence in parallel of two evolution processes of the same object (the state vector) may be a potential source of conflicts; various possible attitudes for avoiding this problem are discussed in this text. After a brief historical introduction, recalling how the very special status of the state vector has emerged in quantum mechanics, various conceptual difficulties are introduced and discussed. The Einstein-Podolsky-Rosen (EPR) theorem is presented with the help of a botanical parable, in a way that emphasizes how deeply the EPR reasoning is rooted in what is often called "scientific method." In another section the Greenberger-Horne-Zeilinger argument, the Hardy impossibilities, as well as the Bell-Kochen-Specker theorem are introduced in simple terms.
The final two sections attempt to give a summary of the present situation: One section discusses nonlocality and entanglement as we see it presently, with brief mention of recent experiments; the last section contains a (nonexhaustive) list of various attitudes that are found among physicists, and that are helpful to alleviate the conceptual difficulties of quantum mechanics. © 2001 American Association of Physics Teachers. [DOI: 10.1119/1.1356698]

CONTENTS

I. HISTORICAL PERSPECTIVE
   A. Three periods
      1. Prehistory
      2. The undulatory period
      3. Emergence of the Copenhagen interpretation
   B. The status of the state vector
      1. Two extremes and the orthodox solution
      2. An illustration
II. DIFFICULTIES, PARADOXES
   A. Von Neumann's infinite regress
   B. Wigner's friend
   C. Schrödinger's cat
   D. Unconvincing arguments
III. EINSTEIN, PODOLSKY, AND ROSEN
   A. A theorem
   B. Of peas, pods, and genes
      1. Simple experiments; no conclusion yet
      2. Correlations; causes unveiled
      3. Transposition to physics
IV. QUANTITATIVE THEOREMS: BELL, GREENBERGER-HORNE-ZEILINGER, HARDY, BELL-KOCHEN-SPECKER
   A. Bell inequalities
      1. Two spins in a quantum singlet state
      2. Proof
      3. Contradiction with quantum mechanics and with experiments
      4. Generality of the theorem
   B. Hardy's impossibilities
   C. GHZ equality
   D. Bell-Kochen-Specker; contextuality
V. NONLOCALITY AND ENTANGLEMENT: WHERE ARE WE NOW?
   A. Loopholes, conspiracies
   B. Locality, contrafactuality
   C. "All-or-nothing coherent states;" decoherence
      1. Definition and properties of the states
      2. Decoherence
   D. Quantum cryptography, teleportation
      1. Sharing cryptographic keys by quantum measurements
      2. Teleporting a quantum state
   E. Quantum computing and information
VI. VARIOUS INTERPRETATIONS
   A. Common ground; "correlation interpretation"
   B. Additional variables
      1. General framework
      2. Bohmian trajectories
   C. Modified (nonlinear) Schrödinger dynamics
      1. Various forms of the theory
      2. Physical predictions
   D. History interpretation
      1. Histories, families of histories
      2. Consistent families
      3. Quantum evolution of an isolated system
      4. Comparison with other interpretations
      5. A profusion of points of view; discussion
   E. Everett interpretation
VII. CONCLUSION

655 Am. J. Phys. 69 (6), June 2001 http://ojps.aip.org/ajp/ © 2001 American Association of Physics Teachers 655

Quantum mechanics describes physical systems through a mathematical object, the state vector |Ψ⟩, which replaces positions and velocities of classical mechanics. This is an enormous change, not only mathematically, but also conceptually.
The relations between |Ψ⟩ and physical properties are much less direct than in classical mechanics; the distance between the formalism and the experimental predictions leaves much more room for discussions about the interpretation of the theory. Actually, many difficulties encountered by those who tried (or are still trying) to "really understand" quantum mechanics are related to questions pertaining to the exact status of |Ψ⟩: For instance, does it describe the physical reality itself, or only some partial knowledge that we might have of this reality? Does it fully describe ensembles of systems only (statistical description), or one single system as well (single events)? Assume that, indeed, |Ψ⟩ is affected by an imperfect knowledge of the system; is it then not natural to expect that a better description should exist, at least in principle? If so, what would this deeper and more precise description of the reality be?

Another confusing feature of |Ψ⟩ is that, for systems extended in space (for instance, a system made of two particles at very different locations), it gives an overall description of all its physical properties in a single block from which the notion of space seems to have disappeared; in some cases, the physical properties of the two remote particles seem to be completely "entangled" (the word was introduced by Schrödinger in the early days of quantum mechanics) in a way where the usual notions of space-time and local events seem to become dimmed. Of course, one could think that this entanglement is just an innocent feature of the formalism with no special consequence: For instance, in classical electromagnetism, it is often convenient to introduce a choice of gauge for describing the fields in an intermediate step, but we know very well that gauge invariance is actually fully preserved at the end.
But, as we will see below, it turns out that the situation is different in quantum mechanics: In fact, a mathematical entanglement in |Ψ⟩ can indeed have important physical consequences on the result of experiments, and even lead to predictions that are, in a sense, in contradiction with locality (we will see below in what sense).

Without any doubt, the state vector is a rather curious object to describe reality; one purpose of this article is to describe some situations in which its use in quantum mechanics leads to predictions that are particularly unexpected. As an introduction, and in order to set the stage for this discussion, we will start with a brief historical introduction, which will remind us of the successive steps from which the present status of |Ψ⟩ emerged. Paying attention to history is not inappropriate in a field where the same recurrent ideas are so often rediscovered; they appear again and again, sometimes almost identical over the years, sometimes remodeled or rephrased with new words, but in fact more or less unchanged. Therefore, a look at the past is not necessarily a waste of time!

I. HISTORICAL PERSPECTIVE

The founding fathers of quantum mechanics had already perceived the essence of many aspects of the discussions on quantum mechanics; today, after almost a century, the discussions are still lively and, if some very interesting new aspects have emerged, at a deeper level the questions have not changed so much. What is more recent, nevertheless, is a general change of attitude among physicists: Until about 20 years ago, probably as a result of the famous discussions between Bohr, Einstein, Schrödinger, Heisenberg, Pauli, de Broglie, and others (in particular at the famous Solvay meetings, Ref. 1), most physicists seemed to consider that "Bohr was right and proved his opponents to be wrong," even if this was expressed with more nuance.
In other words, the majority of physicists thought that the so-called "Copenhagen interpretation" had clearly emerged from the infancy of quantum mechanics as the only sensible attitude for good scientists. As we all know, this interpretation introduced the idea that modern physics must contain indeterminacy as an essential ingredient: It is fundamentally impossible to predict the outcome of single microscopical events; it is impossible to go beyond the formalism of the wave function (or its equivalent, the state vector¹ |Ψ⟩) and complete it; for some physicists, the Copenhagen interpretation also includes the difficult notion of "complementarity"... even if it is true that, depending on the context, complementarity comes in many varieties and has been interpreted in many different ways! By and large, the impression of the vast majority was that Bohr had eventually won the debate with Einstein, so that discussing again the foundations of quantum mechanics after these giants was pretentious, useless, and maybe even in bad taste.

Nowadays, the attitude of physicists is much more moderate concerning these matters, probably partly because the community has better realized the irrelevance of the "impossibility theorems" put forward by the defenders of the Copenhagen orthodoxy, in particular by Von Neumann, Ref. 2 (see also Refs. 3-5, as well as the discussion given in Ref. 6); another reason is, of course, the great impact of the discoveries and ideas of J. Bell, Ref. 7. At the turn of the century, it is probably fair to say that we are no longer sure that the Copenhagen interpretation is the only possible consistent attitude for physicists; see for instance the doubts expressed in Ref. 8. Alternative points of view are considered as perfectly consistent: theories including additional variables (or "hidden variables"),² see Refs. 9 and 10; modified dynamics of the state vector, Refs.
4, 11, 12, and 13 (nonlinear and/or stochastic evolution); at the other extreme we have points of view such as the so-called "many worlds interpretation" (or multibranched universe interpretation), Ref. 14, or more recently other interpretations such as that of "decoherent histories," Ref. 15 (the list is nonexhaustive). All these interpretations will be discussed in Sec. VI. For a recent review containing many references, see Ref. 16, which emphasizes additional variables, but which is also characteristic of the variety of positions among contemporary scientists,³ as well as an older but very interesting debate published in Physics Today (Ref. 17); another very useful source of older references is the 1971 AJP "Resource Letter" (Ref. 18). But recognizing this variety of positions should not be the source of misunderstandings! It should also be emphasized very clearly that, until now, no new fact whatsoever (or no new reasoning) has appeared that has made the Copenhagen interpretation obsolete in any sense.

A. Three periods

Three successive periods may be distinguished in the history of the elaboration of the fundamental quantum concepts; they have resulted in the point of view that we may call "the orthodox interpretation," with all provisos that have just been made above. Here we give only a brief historical summary, but we refer the reader who would like to know more about the history of the conceptual development of quantum mechanics to the book of Jammer, Ref. 19; see also Ref. 20; for detailed discussions of fundamental problems in quantum mechanics, one could also look for references such as Refs. 21, 22, and 8 or those given in Ref. 18.

1. Prehistory

Planck's name is obviously the first that comes to mind when one thinks about the birth of quantum mechanics: He is the one who introduced the famous constant h, which now bears his name, even if his method was phenomenological.
His motivation was actually to explain the properties of the radiation in thermal equilibrium (blackbody radiation) by introducing the notion of finite grains of energy in the calculation of the entropy, later interpreted by him as resulting from discontinuous exchange between radiation and matter. It is Einstein who, later, took the idea more seriously and really introduced the notion of quantum of light (which would be named "photon" much later), in order to explain the wavelength dependence of the photoelectric effect; for a general discussion of the many contributions of Einstein to quantum theory, see Ref. 23.

One should nevertheless realize that the most important and urgent question at the time was not so much to explain fine details of the properties of radiation-matter interaction, or the peculiarities of the blackbody radiation; it was, rather, to understand the origin of the stability of atoms, that is, of all matter which surrounds us and of which we are made! Despite several attempts, explaining why atoms do not collapse almost instantaneously was still a complete challenge in physics. One had to wait a little bit more, until Bohr introduced his celebrated atomic model, to see the appearance of the first elements allowing treatment of the question. He proposed the notion of "quantized permitted orbits" for electrons, as well as that of "quantum jumps" to describe how they would go from one orbit to another, during radiation emission processes for instance. To be fair, we must concede that these notions have now almost disappeared from modern physics, at least in their initial forms; quantum jumps are replaced by a much more precise theory of spontaneous emission in quantum electrodynamics. But, on the other hand, one may also see a resurgence of the old quantum jumps in the modern use of the postulate of the wave packet reduction.
After Bohr came Heisenberg, who introduced the theory that is now known as "matrix mechanics," an abstract intellectual construction with a strong philosophical component, sometimes close to positivism; the classical physical quantities are replaced by "observables," mathematically matrices, defined by suitable postulates without much help from intuition. Nevertheless, matrix mechanics contained many elements which turned out to be building blocks of modern quantum mechanics!

In retrospect, one can be struck by the very abstract and somewhat mysterious character of atomic theory at this period of history; why should electrons obey such rules which forbid them to leave a given class of orbits, as if they were miraculously guided on simple trajectories? What was the origin of these quantum jumps, which were supposed to have no duration at all, so that it would make no sense to ask what were the intermediate states of the electrons during such a jump? Why should matrices appear in physics in such an abstract way, with no apparent relation with the classical description of the motion of a particle? One can guess how relieved many physicists felt when another point of view emerged, a point of view which looked at the same time much simpler and in the tradition of the physics of the 19th century: the undulatory (or wave) theory.

2. The undulatory period

It is well known that de Broglie was the first who introduced the idea of associating a wave with every material particle; this was soon proven to be correct by Davisson and Germer in their famous electron diffraction experiment. Nevertheless, for some reason, at that time de Broglie did not proceed much further in the mathematical study of this wave, so that only part of the veil of mystery was raised by him (see for instance the discussion in Ref. 24).
It is sometimes said that Debye was the first who, after hearing about de Broglie's ideas, remarked that in physics a wave generally has a wave equation: The next step would then be to try and propose an equation for this new wave. The story adds that the remark was made in the presence of Schrödinger, who soon started to work on this program; he successfully and rapidly completed it by proposing the equation which now bears his name, one of the most basic equations of all physics. Amusingly, Debye himself does not seem to have remembered the event. The anecdote may not be accurate; in fact, different reports about the discovery of this equation have been given and we will probably never know exactly what happened. What remains clear anyway is that the introduction of the Schrödinger equation is one of the essential milestones in the history of physics. Initially, it allowed one to understand the energy spectrum of the hydrogen atom, but we now know that it also gives successful predictions for all other atoms, molecules and ions, solids (the theory of bands for instance), etc. It is presently the major basic tool of many branches of modern physics and chemistry.

Conceptually, at the time of its introduction, the undulatory theory was welcomed as an enormous simplification of the new mechanics; this is particularly true because Schrödinger and others (Dirac, Heisenberg) promptly showed how it allowed one to recover the predictions of the complicated matrix mechanics from more intuitive considerations on the properties of the newly introduced "wave function," the solution of the Schrödinger equation. The natural hope was then to be able to extend this success, and to simplify all problems raised by the mechanics of atomic particles: One would replace it by a mechanics of waves, which would be analogous to electromagnetic or sound waves.
For instance, Schrödinger thought initially that all particles in the universe looked to us like point particles just because we observe them at a scale which is too large; in fact, they are tiny "wave packets" which remain localized in small regions of space. He had even shown that these wave packets remain small (they do not spread in space) when the system under study is a harmonic oscillator.... Alas, we now know that this is only one of the very few special cases where this is true; in general, they do constantly spread in space!

3. Emergence of the Copenhagen interpretation

It did not take a long time before it became clear that the undulatory theory of matter also suffers from very serious difficulties, actually so serious that physicists were soon led to abandon it. A first example of difficulty is provided by a collision between particles, where the Schrödinger wave spreads in all directions, exactly as the water wave stirred in a pond by a stone thrown into it; but, in all collision experiments, particles are observed to follow well-defined trajectories which remain perfectly localized, going in some precise direction. For instance, every photograph taken in the collision chamber of a particle accelerator shows very clearly that particles never get "diluted" in all space! This remark stimulated the introduction, by Born, of the probabilistic interpretation of the wave function. Another difficulty, even more serious, arises as soon as one considers systems made of more than one single particle: Then, the Schrödinger wave is no longer an ordinary wave since, instead of propagating in normal space, it propagates in the so-called "configuration space" of the system, a space which has 3N dimensions for a system made of N particles!
For instance, already for the simplest of all atoms, the hydrogen atom, the wave propagates in six dimensions (if spins are taken into account, four such waves propagate in six dimensions); for a macroscopic collection of atoms, the dimension quickly becomes an astronomical number. Clearly the new wave was not at all similar to classical waves, which propagate in ordinary space; this deep difference will be a sort of Leitmotiv in this text,⁴ reappearing under various aspects here and there.⁵

In passing, and as a side remark, it is amusing to notice that the recent observation of the phenomenon of Bose-Einstein condensation in dilute gases (Ref. 25) can be seen, in a sense, as a sort of realization of the initial hope of Schrödinger: This condensation provides a case where the many-particle matter wave does propagate in ordinary space. Before condensation takes place, we have the usual situation: The atoms belong to a degenerate quantum gas, which has to be described by wave functions defined in a huge configuration space. But, when they are completely condensed, they are restricted to a much simpler many-particle state that can be described by the same wave function, exactly as a single particle. In other words, the matter wave becomes similar to a classical field with two components (the real part and the imaginary part of the wave function), resembling an ordinary sound wave for instance. This illustrates why, somewhat paradoxically, the "exciting new states of matter" provided by Bose-Einstein condensates are not an example of an extreme quantum situation; they are actually more classical than the gases from which they originate (in terms of quantum description, interparticle correlations, etc.). Conceptually, of course, this remains a very special case and does not solve the general problem associated with a naive view of the Schrödinger waves as real waves.

The purely undulatory description of particles has now disappeared from modern quantum mechanics.
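As a minimal sketch of the contrast drawn above between configuration space and ordinary space, the two situations can be written in standard notation. The symbols $\Psi$, $\varphi$, and $\mathbf{r}_i$ are labels introduced here for illustration, spin is ignored, and the product form is the idealized description of a completely condensed gas in which interparticle correlations are neglected.

```latex
% General state of N spinless particles:
% one wave function defined on the 3N-dimensional configuration space
\Psi(\mathbf{r}_1, \mathbf{r}_2, \dots, \mathbf{r}_N),
\qquad (\mathbf{r}_1, \dots, \mathbf{r}_N) \in \mathbb{R}^{3N}

% Completely condensed state (idealized, correlations neglected):
% every particle occupies the same one-particle wave function,
% which is a field in ordinary three-dimensional space
\Psi(\mathbf{r}_1, \mathbf{r}_2, \dots, \mathbf{r}_N)
  = \prod_{i=1}^{N} \varphi(\mathbf{r}_i),
\qquad \mathbf{r}_i \in \mathbb{R}^{3}
```

For the hydrogen atom (one electron and one proton, so $N = 2$), the first expression is a function of $3N = 6$ coordinates, which is the six-dimensional wave mentioned above. In the second expression the single function $\varphi(\mathbf{r})$, with its real and imaginary parts, plays the role of the two-component classical field.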
In addition to Born and Bohr, Heisenberg (Ref. 26), Jordan, Dirac (Ref. 27), and others played an essential role in the appearance of a new formulation of quantum mechanics (Ref. 20), where probabilistic and undulatory notions are incorporated in a single complex logical edifice. The now classical Copenhagen interpretation of quantum mechanics (often also called "orthodox interpretation") incorporates both a progressive, deterministic evolution of the wave function/state vector according to the Schrödinger equation, as well as a second postulate of evolution that is often called the "wave packet reduction" (or also "wave function collapse"). The Schrödinger equation in itself does not select precise experimental results, but keeps all of them as potentialities in a coherent way; forcing the emergence of a single result in a single experiment is precisely the role of the postulate of the wave packet reduction. In this scheme, separate postulates and equations are therefore introduced, one for the "natural" evolution of the system, another for measurements performed on it.

B. The status of the state vector

With two kinds of evolution, it is no surprise if the state vector should get, in orthodox quantum theory, a nontrivial status; actually it has no equivalent in all the rest of physics.

1. Two extremes and the orthodox solution

Two opposite mistakes should be avoided, since both "miss the target" on different sides. The first is to endorse the initial hopes of Schrödinger and to decide that the (many-dimension) wave function directly describes the physical properties of the system. In such a purely undulatory view, the position and velocities of particles are replaced by the amplitude of a complex wave, and the very notion of point particle becomes diluted; but the difficulties introduced by this view are now so well known (see the discussion in the preceding section) that few physicists seem to be tempted to support it.
Now, by contrast, it is surprising how often one hears colleagues falling into the other extreme, and endorsing the point of view where the wave function does not attempt to describe the physical properties of the system itself, but just the information that we have on it; in other words, the wave function should get a relative (or contextual) status, and become analogous to a classical probability distribution in usual probability theory. Of course, at first sight, this would bring a really elementary solution to all fundamental problems of quantum mechanics: We all know that classical probabilities undergo sudden jumps, and nobody considers this as a special problem. For instance, as soon as new information becomes available to us on any system, the probability distribution that we associate with it changes suddenly; is this not the obvious way to explain the sudden wave packet reduction?

One first problem with this point of view is that it would naturally lead to a relative character of the wave function: If two observers had different information on the same system, should they use different wave functions to describe the same system?⁶ In classical probability theory, there would be no problem at all with "observer-dependent" probability distributions, but standard quantum mechanics clearly rejects this possibility: It certainly does not attribute such a character to the wave function.⁷ Moreover, when in ordinary probability theory a distribution undergoes a sudden "jump" to a more precise distribution, the reason is simply that more precise values of the variables already exist; they actually existed before the jump. In other words, the very fact that the probability distribution reflected our imperfect knowledge implies the possibility for a more precise description, closer to the reality of the system itself.
But this is in complete opposition with orthodox quantum mechanics, which negates the very idea of a better description of the reality than the wave function. In fact, introducing the notion of pre-existing values is precisely the basis of unorthodox theories with additional variables (hidden variables)! So the advocates of this "information interpretation" are often advocates of additional variables (often called hidden variables; see Sec. VI B and note 2), without being aware of it! It is therefore important to keep in mind that, in the classical interpretation of quantum mechanics, the wave function (or state vector) gives the ultimate physical description of the system, with all its physical properties; it is neither contextual, nor observer dependent; if it gives probabilistic predictions on the result of future measurements, it nevertheless remains inherently completely different from an ordinary classical distribution of probabilities.

If none of these extremes is correct, how should we combine them? To what extent should we consider that the wave function describes a physical system itself (realistic interpretation), or rather that it contains only the information that we may have on it (positivistic interpretation), presumably in some sense that is more subtle than a classical distribution function? This is not an easy question, and various authors answer the question with different nuances; we will come back to this question in Sec. II B, in particular in the discussion of the "Schrödinger cat paradox." Even if it is not so easy to be sure about what the perfectly orthodox interpretation is, we could probably express it by quoting Peres in Ref.
29: "a state vector is not a property of a physical system, but rather represents an experimental procedure for preparing or testing one or more physical systems;" we could then add another quotation from the same article, as a general comment: "quantum theory is incompatible with the proposition that measurements are processes by which we discover some unknown and preexisting property." In this context, a wave function is an absolute representation, but of a preparation procedure rather than of the isolated physical system itself; nevertheless, since this procedure may also imply some information on the system itself (for instance, in the case of repeated measurements of the same physical quantity), we have a sort of intermediate situation where none of the answers above is completely correct, but where they are combined in a way that emphasizes the role of the whole experimental setup.

2. An illustration

Just as an illustration of the fact that the debate is not closed, we take a quotation from a recent article (Ref. 30) which, even if taken out of its context, provides an interesting illustration of the variety of nuances that can exist within the Copenhagen interpretation (from the context, it seems clear that the authors adhere to this interpretation); after criticizing erroneous claims of colleagues concerning the proper use of quantum concepts, they write: "(One) is led astray by regarding state reductions as physical processes, rather than accepting that they are nothing but mental processes." The authors do not expand much more on this sentence, which they relate to a "minimalistic interpretation of quantum mechanics;" actually they even give a general warning that it is dangerous to go beyond it ("Van Kampen's caveat"). Nevertheless, let us try to be bold and to cross this dangerous line for a minute; what is the situation then?
We then see that two different attitudes become possible, depending on the properties that we attribute to the Schrödinger evolution itself: Is it also a "purely mental process," or is it of a completely different nature and associated more closely with an external reality? Implicitly, the authors of Ref. 30 seem to favor the second possibility (otherwise, they would probably have made a more general statement about all evolutions of the state vector), but let us examine both possibilities anyway. In the first case, the relation of the wave function to physical reality is completely lost and we meet all the difficulties mentioned in the preceding paragraph as well as some of the next section; we have to accept the idea that quantum mechanics has nothing to say about reality through the wave function (if the word reality even refers to any well-defined notion!). In the second case, we meet the conceptual difficulties related to the coexistence of two processes of completely different nature for the evolution of the state vector, as discussed in the next section. What is interesting is to note that Peres's point of view (at the end of the preceding section), while also orthodox, corresponds to neither possibility: It never refers to mental process, but just to preparation and tests on physical systems, which is clearly different; this illustrates the flexibility of the Copenhagen interpretation and the variety of ways that different physicists use to describe it.

Another illustration of the possible nuances is provided by a recent note published by the same author together with Fuchs (Ref. 31) entitled, "Quantum theory needs no 'interpretation.'" These authors explicitly take a point of view where the wave function is not absolute, but observer dependent: "it is only a mathematical expression for evaluating probabilities and depends on the knowledge of whoever is doing the computing."
The wave function becomes similar to a classical probability distribution which, obviously, depends on the knowledge of the experimenter, so that several different distributions can be associated with the same physical system (if there are several observers). On the other hand, as mentioned above, associating several different wave functions with one single system is not part of what is usually called the orthodox interpretation (except, of course, for a trivial phase factor).

To summarize, the orthodox status of the wave function is indeed a subtle mixture between different, if not opposite, concepts concerning reality and the knowledge that we have of this reality. Bohr is generally considered more as a realist than a positivist or an operationalist (Ref. 19); he would probably have said that the wave function is indeed a useful tool, but that the concept of reality cannot properly be defined at a microscopic level only; it has to include all macroscopic measurement apparatuses that are used to have access to microscopic information (we come back to this point in more detail in Sec. III B 3). In this context, it is understandable why he once even stated that "there is no quantum concept" (Ref. 32)!

II. DIFFICULTIES, PARADOXES

We have seen that, in most cases, the wave function evolves gently, in a perfectly predictable and continuous way, according to the Schrödinger equation; in some cases only (as soon as a measurement is performed), unpredictable changes take place, according to the postulate of wave packet reduction. Obviously, having two different postulates for the evolution of the same mathematical object is unusual in physics; the notion was a complete novelty when it was introduced, and still remains unique in physics, as well as the source of difficulties. Why are two separate postulates necessary? Where exactly does the range of application of the first stop in favor of the second?
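For reference, the two evolution postulates contrasted above can be written schematically in their usual textbook form. This is only a sketch under stated assumptions: the Hamiltonian $H$, the measured eigenvalue $a_n$, and the projector $P_n$ onto the corresponding eigenspace are standard symbols introduced here, not notation taken from the text.

```latex
% (1) Schrodinger evolution: continuous and deterministic
i\hbar \, \frac{d}{dt} \, |\Psi(t)\rangle = H(t) \, |\Psi(t)\rangle

% (2) Wave packet reduction: when a measurement yields the result a_n,
%     the state vector is projected and renormalized
|\Psi\rangle \;\longrightarrow\;
  \frac{P_n |\Psi\rangle}{\sqrt{\langle \Psi | P_n | \Psi \rangle}},
\qquad
\text{which occurs with probability }
\mathcal{P}(a_n) = \langle \Psi | P_n | \Psi \rangle
```

Rule (1) is linear and keeps every possible result as a potentiality in a coherent superposition; rule (2) is nonlinear and stochastic, and selects one result only. Nothing in the two formulas themselves says which physical interactions call for which rule, and that is the question raised here.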
More precisely, among all the interactions (or perturbations) that a physical system can undergo, which ones should be considered as normal (Schrödinger evolution), and which ones as a measurement (wave packet reduction)? Logically, we are faced with a problem that did not exist before, when nobody thought that measurements should be treated as special processes in physics. We learn from Bohr that we should not try to transpose our experience of the everyday world to microscopic systems; this is fine, but where exactly is the limit between the two worlds? Is it sufficient to reply that there is so much room between macroscopic and microscopic sizes that the exact position of the border does not matter?⁹ Moreover, can we accept that, in modern physics, the "observer" should play such a central role, giving to the theory an unexpected anthropocentric foundation, as in astronomy in the Middle Ages? Should we really refuse as unscientific to consider isolated (unobserved) systems, because we are not observing them? These questions are difficult, almost philosophical, and we will not attempt to answer them here. Rather, we will give a few characteristic quotations, which illustrate¹⁰ various positions.

(i) Bohr (second Ref. 19, page 204): "There is no quantum world... it is wrong to think that the task of physics is to find out how Nature is. Physics concerns what we can say about Nature."

(ii) Heisenberg (same reference, page 205): "But the atoms or the elementary particles are not real; they form a world of potentialities or possibilities rather than one of things and facts."¹¹

(iii) Jordan (as quoted by Bell in Ref. 33): "observations not only disturb what has to be measured, they produce it. In a measurement of position, the electron is forced to a decision. We compel it to assume a definite position; previously it was neither here nor there, it had not yet made its decision for a definite position...."
(iv) Mermin (Ref. 6), summarizing the "fundamental quantum doctrine" (orthodox interpretation): "the outcome of a measurement is brought into being by the act of measurement itself, a joint manifestation of the state of the probed system and the probing apparatus. Precisely how the particular result of an individual measurement is obtained (Heisenberg's transition from the possible to the actual) is inherently unknowable."

(v) Bell (Ref. 34), speaking of "modern" quantum theory (Copenhagen interpretation): "it never speaks of events in the system, but only of outcomes of observations upon the system, implying the existence of external equipment."¹² (How, then, do we describe the whole universe, since there can be no external equipment in this case?)

(vi) Shimony (Ref. 8): "According to the interpretation proposed by Bohr, the change of state is a consequence of the fundamental assumption that the description of any physical phenomenon requires reference to the experimental arrangement."

(vii) Rosenfeld (Ref. 35): "the human observer, whom we have been at pains to keep out of the picture, seems irresistibly to intrude into it...."

(viii) Stapp (Ref. 30): "The interpretation of quantum theory is clouded by the following points: (1) Invalid classical concepts are ascribed fundamental status; (2) The process of measurement is not describable within the framework of the theory; (3) The subject-object distinction is blurred; (4) The observed system is required to be isolated in order to be defined, yet interacting to be observed."

A. Von Neumann's infinite regress

In this section, we introduce the notion of the Von Neumann regress, or Von Neumann chain, a process that is at the source of the phenomenon of decoherence. Both actually correspond to the same basic physical process, but the word decoherence usually refers to its initial stage, when the number of degrees of freedom involved in the process is still relatively limited.
The Von Neumann chain, on the other hand, is more general since it includes this initial stage as well as its continuation, which goes on until it reaches the other extreme where it really becomes paradoxical: the Schrödinger cat, the symbol of a macroscopic system, with an enormous number of degrees of freedom, in an impossible state (Schrödinger uses the word "ridiculous" to describe it). Decoherence in itself is an interesting physical phenomenon that is contained in the Schrödinger equation and introduces no particular conceptual problems; the word is relatively recent, and so is the observation of the process in beautiful experiments in atomic physics (Ref. 36); for more details on decoherence, see Sec. V C 2. Since for the moment we are at the stage of a historical introduction of the difficulties of quantum mechanics, we will not discuss microscopic decoherence further, but focus on macroscopic systems, where serious conceptual difficulties do appear.

Assume that we take a simple system, such as a spin 1/2 atom, which enters into a Stern-Gerlach spin analyzer. If the initial direction of the spin is transverse (with respect to the magnetic field which defines the eigenstates associated with the apparatus), the wave function of the atom will split into two different wave packets, one which is pulled upwards, the other pushed downwards; this is an elementary consequence of the linearity of the Schrödinger equation. Propagating further, each of the two wave packets may strike a detector, with which they interact by modifying its state as well as theirs; for instance, the incoming spin 1/2 atoms are ionized and produce electrons; as a consequence, the initial coherent superposition now encompasses new particles. Moreover, when a whole cascade of electrons is produced in photomultipliers, all these additional electrons also become part of the superposition.
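The first links of the chain just described can be written out for an idealized detector. This is only a sketch under stated assumptions: the symbols are introduced here for illustration and are not notation taken from the rest of the article. We denote by |+> and |-> the two spin eigenstates selected by the analyzer, by alpha and beta the initial amplitudes (with |alpha|^2 + |beta|^2 = 1), by |D_0> the initial state of the detector, and by |D_+>, |D_-> its states after it has registered one or the other wave packet; the interaction is assumed to leave the spin label unchanged.

```latex
% Assumed idealized measurement interaction; all symbols are illustrative.
\begin{align}
\bigl(\alpha\,|+\rangle + \beta\,|-\rangle\bigr)\otimes|D_0\rangle
  &\;\longrightarrow\;
  \alpha\,|+\rangle\otimes|D_+\rangle \;+\; \beta\,|-\rangle\otimes|D_-\rangle ,\\
\bigl(\alpha\,|+\rangle\otimes|D_+\rangle + \beta\,|-\rangle\otimes|D_-\rangle\bigr)\otimes|E_0\rangle
  &\;\longrightarrow\;
  \alpha\,|+\rangle\otimes|D_+\rangle\otimes|E_+\rangle
  \;+\; \beta\,|-\rangle\otimes|D_-\rangle\otimes|E_-\rangle .
\end{align}
```

The second line repeats the same step for a further stage E (an amplifier or a recorder, for instance), initially in state |E_0>. In this sketch both terms survive each step with unchanged amplitudes, since the linear evolution alone never removes a branch.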
In fact, there is no intrinsic limit in what soon becomes an almost infinite chain: Rapidly, the linearity of the Schrödinger equation leads to a state vector which is the coherent superposition of states including a macroscopic number of particles, macroscopic currents and, maybe, pointers or recorders which have already printed zeros or ones on a piece of paper! If we stick to the Schrödinger equation, there is nothing to stop this "infinite Von Neumann regress," which has its seed in the microscopic world but rapidly develops into a macroscopic consequence. Can we, for instance, accept the idea that, at the end, it is the brain of the experimenter (who becomes aware of the results) and therefore a human being, which enters into such a superposition?

Needless to say, no one has ever observed two contradictory results at the same time, and the very notion is not even very clear: It would presumably correspond to an experimental result printed on paper looking more or less like two superimposed slides, or a double exposure of a photograph. But in practice we know that we always observe only one single result in a single experiment; linear superpositions somehow resolve themselves before they become sufficiently macroscopic to involve measurement apparatuses and ourselves. It therefore seems obvious¹³ that a proper theory should break the Von Neumann chain, and stop the regress when (or maybe before) it reaches the macroscopic world. But when exactly and how precisely?

B. Wigner's friend

The question can also be asked differently: In a theory where the observer plays such an essential role, who is entitled to play it? Wigner discusses the role of a friend, who has been asked to perform an experiment, a Stern-Gerlach measurement for instance (Ref. 37); the friend may be working in a closed laboratory so that an outside observer will not be aware of the result before he/she opens the door.
What happens just after the particle has emerged from the analyzer and when its position has been observed inside the laboratory, but is not yet known outside? For the outside observer, it is natural to consider the whole ensemble of the closed laboratory, containing the experiment as well as his friend, as the "system" to be described by a big wave function. As long as the door of the laboratory remains closed and the result of the measurement unknown, this wave function will continue to contain a superposition of the two possible results; it is only later, when the result becomes known outside, that the wave packet reduction should be applied. But, clearly, for Wigner's friend who is inside the laboratory, this reasoning is just absurd! He/she will much prefer to consider that the wave function is reduced as soon as the result of the experiment is observed inside the laboratory. We are then back to a point that we already discussed, the absolute/relative character of the wave function: Does this contradiction mean that we should consider two state vectors, one reduced, one not reduced, during the intermediate period of the experiment? For a discussion by Wigner of the problem of the measurement, see Ref. 38.

An unconventional interpretation, sometimes associated with Wigner's name,¹⁴ assumes that the reduction of the wave packet is a real effect which takes place when a human mind interacts with the surrounding physical world and acquires some consciousness of its state; in other words, the electrical currents in the human brain may be associated with a reduction of the state vector of measured objects, by some yet unknown physical process. Of course, in this view, the reduction takes place under the influence of the experimentalist inside the laboratory and the question of the preceding paragraph is settled.
But, even if one accepts the somewhat provocative idea of possible action of the mind (or consciousness) on the environment, this point of view does not suppress all logical difficulties: What is a human mind, what level of consciousness is necessary to reduce the wave packet, etc.?

C. Schrödinger's cat

The famous story of the Schrödinger cat (Refs. 39 and 40) illustrates the problem in a different way; it is probably too well known to be described once more in detail here. Let us then just summarize it: The story illustrates how a living creature could be put into a very strange state, containing life and death, by correlation with a decaying radioactive atom, and through a Von Neumann chain; the chain includes a gamma ray detector, electronic amplification, and finally a mechanical system that automatically opens a bottle of poisonous gas if the atomic decay takes place. The resulting paradox may be seen as an illustration of the following question: Does an animal such as a cat have the intellectual abilities that are necessary to perform a measurement and resolve all Von Neumann branches into one? Can it perceive its own state, projecting itself onto one of the alive or dead states? Or do only humans have access to a sufficient level of introspection to become conscious of their own observations, and to reduce the wave function? In that case, when the wave function includes a cat component, the animal could remain simultaneously dead and alive for an arbitrarily long period of time, a paradoxical situation indeed.

Another view on the paradox is obtained if one just considers the cat as a symbol of any macroscopic object; such objects can obviously never be in a "blurred" state containing possibilities that are contradictory (open and closed bottle, dead and alive cat, etc.).
Schrödinger considers this as a "quite ridiculous case," which emerges from the linearity of his equation, but should clearly be excluded from any reasonable theory, or at best considered as the result of some incomplete physical description. In Schrödinger's words: "an indeterminacy originally restricted to the atomic domain becomes transformed into a macroscopic indeterminacy." The message is simple: Standard quantum mechanics is not only incapable of avoiding these ridiculous cases, it actually provides a recipe for creating them; one obviously needs some additional ingredients in the theory in order to resolve the Von Neumann regress, select one of its branches, and avoid stupid macroscopic superpositions. It is amusing to note in passing that Schrödinger's name is associated with two concepts that are actually mutually exclusive, a continuous equation of evolution and the symbolic cat, a limit that the equation should never reach! Needless to say, the limit of validity of the linear equation does not have to be related to the cat itself: The branch selection process may perfectly well take place before the linear superposition reaches the animal. But the real question is that the reduction process has to take place somewhere, and where exactly?

Is this paradox related to decoherence? Not really. Coherence is completely irrelevant for Schrödinger, since the cat is actually just a symbol of a macroscopic object that is in an impossible blurred state, encompassing two possibilities that are incompatible in ordinary life; the state in question is not necessarily a pure state (only pure states are sensitive to decoherence) but can also be a statistical mixture. Actually, in the story, the cat is never in a coherent superposition, since its blurred state is precisely created by correlation with some parts of the environment (the bottle of poison for instance); the cat is just another part of the environment of the radioactive atom.
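A toy calculation may help to see why the cat of the story is never in a coherent superposition. The sketch below is only an illustration under stated assumptions: the atom and the cat are each reduced to a two-state system, the whole chain between them is collapsed into one perfect correlation, and the state of the cat alone is obtained from the reduced density matrix (a standard tool, but not one used in this section). All names (`reduced_cat_state`, `purity`, `ENTANGLED`, `PRODUCT`) are hypothetical.

```python
import math

S = 1 / math.sqrt(2)

# Amplitudes psi[atom][cat]; atom: 0 = undecayed, 1 = decayed; cat: 0 = alive, 1 = dead.
# Toy assumption: the atom and the cat are perfectly correlated two-state systems.
ENTANGLED = [[S, 0.0], [0.0, S]]  # (|undecayed, alive> + |decayed, dead>) / sqrt(2)
PRODUCT = [[S, S], [0.0, 0.0]]    # |undecayed> x (|alive> + |dead>) / sqrt(2), for contrast


def reduced_cat_state(psi):
    """Sum over the atom states to get the 2x2 density matrix of the cat alone."""
    return [[sum(psi[a][c] * psi[a][d].conjugate() for a in range(2))
             for d in range(2)]
            for c in range(2)]


def purity(rho):
    """Tr(rho^2): 1 for a pure state, 1/2 for an equal-weight two-state mixture."""
    return sum((rho[c][d] * rho[d][c]).real for c in range(2) for d in range(2))


if __name__ == "__main__":
    for name, psi in (("entangled", ENTANGLED), ("product", PRODUCT)):
        rho = reduced_cat_state(psi)
        print(name, "coherence:", abs(rho[0][1]), "purity:", purity(rho))
```

In the entangled case, the off-diagonal terms of the cat's matrix vanish, so that the cat taken alone is described by a statistical mixture of alive and dead. The contrast case, which does contain coherences, is not the situation described in the story.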
In other words, the cat is not the seed of a Von Neumann chain; it is actually trapped into two (or more) of its branches, in a tree that has already expanded into the macroscopic world after decoherence has already taken place at a microscopic level (radioactive atom and radiation detector), and will continue to expand after it has captured the cat. Decoherence is irrelevant for Schrödinger since his point is not to discuss the features of the Von Neumann chain, but to emphasize the necessity to break it: The question is not to have a coherent or a statistical superposition of macroscopically different states, it is to have no superposition at all!¹⁵

So the cat is the symbol of an impossibility, an animal that can never exist (a Schrödinger gargoyle?), and a tool for a "reductio ad absurdum" reasoning that brings to light the limitations of the description of a physical system by a Schrödinger wave function only. Nevertheless, in the recent literature in quantum electronics, it has become more and more frequent to weaken the concept, and to call "Schrödinger cat (SC)" any coherent superposition of states that can be distinguished macroscopically, independently of the number of degrees of freedom of the system. SC states can then be observed (for instance an ion located in two different places in a trap), but often undergo rapid decoherence through correlation to the environment. Moreover, the Schrödinger equation can be used to calculate how the initial stages of the Von Neumann chain take place, and how rapidly the solution of the equation tends to ramify into branches containing the environment. Since this use of the words SC has now become rather common in a subfield of physics, one has to accept it; it is, after all, just a matter of convention to associate them with microscopic systems; any convention is acceptable as long as it does not create confusion.
But it would be an insult to Schrödinger to believe that decoherence can be invoked as the solution of his initial cat paradox: Schrödinger was indeed aware of the properties of entanglement in quantum mechanics, a word that he introduced (and uses explicitly in the article on the cat), and he was not sufficiently naive to believe that standard quantum mechanics would predict possible interferences between dead and alive cats!

To summarize, the crux of most of our difficulties with quantum mechanics is the following question: What exactly is the process that forces Nature to break the regress and to make its choice among the various possibilities for the results of experiments? Indeed, the emergence of a single result in a single experiment, in other words the disappearance of macroscopic superpositions, is a major issue; the fact that such superpositions cannot be resolved at any stage within the linear Schrödinger equation may be seen as the major difficulty of quantum mechanics. As Pearle nicely expresses it (Ref. 12), the problem is to "explain why events occur!"

D. Unconvincing arguments

We have already emphasized that the invention of the Copenhagen interpretation of quantum mechanics has been, and remains, one of the big achievements of physics. One can admire even more, in retrospect, how early its founders conceived it, at a time when experimental data were relatively scarce. Since that time, numerous ingenious experiments have been performed, precisely with the hope of seeing the limits of this interpretation but, until now, not a single fact has disproved the theory. It is really a wonder of pure logic that has allowed the early emergence of such an intellectual construction. This being said, one has to admit that, in some cases, the brilliant authors of this construction may have gone too far, pushed by their great desire to convince.
For instance, authoritative statements have been made concerning the absolute necessity of the orthodox interpretation which now, in retrospect, seem exaggerated, to say the least. According to these statements, the orthodox interpretation would give the only ultimate description of physical reality; no finer description would ever become possible. In this line of thought, the fundamental probabilistic character of microscopic phenomena should be considered as a proven fact, a rule that should be carved into marble and accepted forever by scientists. But, now, we know that this is not proven to be true: Yes, one may prefer the orthodox interpretation if one wishes, but this is only a matter of taste; other interpretations are still perfectly possible; determinism in itself is not disproved at all. As discussed for instance in Ref. 6, and initially clarified by Bell (Refs. 3 and 7) as well as by Bohm (Refs. 4 and 5), the "impossibility proofs" put forward by the proponents of the Copenhagen interpretation are logically unsatisfactory for a simple reason: They arbitrarily impose conditions that may be relevant to quantum mechanics (linearity), but not to the theories that they aim to dismiss (any theory with additional variables, such as the Bohm theory, for instance). Because of the exceptional stature of the authors of the impossibility theorems, it took a long time for the physics community to realize that they were irrelevant; now, this is more widely recognized so that the plurality of interpretations is more easily accepted.

III. EINSTEIN, PODOLSKY, AND ROSEN

It is sometimes said that the article by Einstein, Podolsky, and Rosen (EPR) in Ref. 41 is, by far, the one that has collected the largest number of citations in the literature; the statement sounds very likely to be true. There is some irony in this situation since, so often, the EPR reasoning has been misinterpreted, even by prominent physicists!
A striking example is given in the Einstein-Born correspondence (Ref. 42) where Born, even in comments that he wrote after Einstein's death, still clearly shows that he never really understood the nature of the objections raised by EPR. Born went on thinking that the point of Einstein was an a priori rejection of indeterminism ("look, Einstein, indeterminism is not so bad"), while actually the major concern of EPR was locality and/or separability (we come back later to these terms, which are related to the notion of space-time). If giants like Born could be misled in this way, no surprise that, later on, many others made similar mistakes! This is why, in what follows, we will take an approach that may look elementary, but at least has the advantage of putting the emphasis on the logical structure of the arguments.

A. A theorem

One often speaks of the "EPR paradox," but the word "paradox" is not really appropriate in this case. For Einstein, the basic motivation was not to invent paradoxes or to entertain colleagues inclined to philosophy; it was to build a strong logical reasoning which, starting from well-defined assumptions (roughly speaking: locality and some form of realism), would lead ineluctably to a clear conclusion (quantum mechanics is incomplete, and even: physics is deterministic).¹⁶ To emphasize this logical structure, we will speak here of the "EPR theorem," which formally could be stated as follows:

Theorem: If the predictions of quantum mechanics are correct (even for systems made of remote correlated particles) and if physical reality can be described in a local (or separable) way, then quantum mechanics is necessarily incomplete: some "elements of reality"¹⁷ exist in Nature that are ignored by this theory.

The theorem is valid, and has been scrutinized by many scientists who have found no flaw in its derivation; indeed, the logic which leads from the assumptions to the conclusion is perfectly sound.
It would therefore be an error to repeat (a classical mistake!) "the theorem was shown by Bohr to be incorrect" or, even worse, "the theorem is incorrect since experimental results are in contradiction with it."¹⁸ Bohr himself, of course, did not make the error: In his reply to EPR (Ref. 43), he explains why he thinks that the assumptions on which the theorem is based are not relevant to the quantum world, which makes it inapplicable to a discussion on quantum mechanics; more precisely, he uses the word "ambiguous" to characterize these assumptions, but he never claims that the reasoning is faulty (for more details, see Sec. III B 3). A theorem which is not applicable in a particular case is not necessarily incorrect: Theorems of Euclidean geometry are not wrong, or even useless, because one can also build a consistent non-Euclidean geometry! Concerning possible contradictions with experimental results we will see that, in a sense, they make a theorem even more interesting, mostly because it can then be used within a "reductio ad absurdum" reasoning.

[Fig. 1. Source S (figure not reproduced).]

Good texts on the EPR argument are abundant; a classic is the wonderful little article by Bell (Ref. 33); another excellent introductory text is Ref. 44, which contains a complete description of the scheme (in the particular case where two settings only are used) and provides an excellent general discussion of many aspects of the problem; for a detailed source of references, see for instance Ref. 45. Most readers are probably already familiar with the basic scheme considered, which is summarized in Fig.
1: A source S emits two correlated particles, which propagate toward two remote regions of space where they undergo measurements; the type of these measurements is defined by "settings," or "parameters"¹⁹ (typically orientations of Stern-Gerlach analyzers, often denoted a and b), which are at the choice of the experimentalists; in each region, a result is obtained, which can take only two values symbolized by ±1 in the usual notation; finally, we will assume that, every time both settings are chosen to be the same value, the results of both measurements are always the same.

Here, rather than trying to paraphrase the good texts on EPR with more or less success, we will purposefully take a different presentation, based on a comparison, a sort of parable. Our purpose is to emphasize a feature of the reasoning: The essence of the EPR reasoning is actually nothing but what is usually called "the scientific method" in the sense discussed by Francis Bacon and Claude Bernard. For this purpose, we will leave pure physics for botany! Indeed, in both disciplines, one needs rigorous scientific procedures in order to prove the existence of relations and causes, which is precisely what we want to do.

B. Of peas, pods, and genes

When a physicist attempts to infer the properties of microscopic objects from macroscopic observations, ingenuity (in order to design meaningful experiments) must be combined with a good deal of logic (in order to deduce these microscopic properties from the macroscopic results). Obviously, some abstract reasoning is indispensable, merely because it is impossible to observe with the naked eye, or to take in one's hand, an electron or even a macromolecule for instance.
The scientist of past centuries who, like Mendel, was trying to determine the genetic properties of plants, had exactly the same problem: He did not have access to any direct observation of the DNA molecules, so that he had to base his reasoning on adequate experiments and on the observation of their macroscopic outcome. In our parable, the scientist will observe the color of flowers (the "result" of the measurement, +1 for red, -1 for blue) as a function of the condition in which the peas are grown (these conditions are the "experimental settings" a and b, which determine the nature of the measurement). The basic purpose is to infer the intrinsic properties of the peas (the EPR "element of reality") from these observations.

1. Simple experiments; no conclusion yet

It is clear that many external parameters such as temperature, humidity, amount of light, etc., may influence the growth of vegetables and, therefore, the color of a flower; it seems very difficult in a practical experiment to be sure that all the relevant parameters have been identified and controlled with a sufficient accuracy. Consequently, if one observes that the flowers which grow in a series of experiments are sometimes blue, sometimes red, it is impossible to identify the reason behind these fluctuations; it may reflect some trivial irreproducibility of the conditions of the experiment, or something more fundamental. In more abstract terms, a completely random character of the result of the experiments may originate either from the fluctuations of uncontrolled external perturbations, or from some intrinsic property that the measured system (the pea) initially possesses, or even from the fact that the growth of a flower (or, more generally, life?) is fundamentally an indeterministic process; needless to say, all three reasons can be combined in any complicated way.
Transposing the issue to quantum physics leads to the following formulation of the question: Are the results of the experiments random because of the fluctuation of some uncontrolled influence taking place in the macroscopic apparatus, of some microscopic property of the measured particles, or of some more fundamental process? The scientist may repeat the "experiment" a thousand times and even more: If the results are always totally random, there is no way to decide which interpretation should be selected; it is just a matter of personal taste. Of course, philosophical arguments might be built to favor or reject one of them, but from a pure scientific point of view, at this stage, there is no compelling argument for one choice or another. Such was the situation of quantum physics before the EPR argument.

2. Correlations; causes unveiled

The stroke of genius of EPR was to realize that correlations could allow a big step further in the discussion. They exploit the fact that, when the settings chosen are the same, the observed results turn out to be always identical; in our botanical analogy, we will assume that our botanist observes correlations between colors of flowers. Peas come together in pods, so that it is possible to grow peas taken from the same pod and observe their flowers in remote places. It is then natural to expect that, when no special care is taken to give equal values to the experimental parameters (temperature, etc.), nothing special is observed in this new experiment. But assume that, every time the parameters are chosen to have the same values, the colors are systematically the same; what can we then conclude? Since the peas grow in remote places, there is no way that they can be influenced by any single uncontrolled fluctuating phenomenon, or that they can somehow influence each other in the determination of the colors.
If we believe that causes always act locally, we are led to the following conclusion: The only possible explanation of the common color is the existence of some common property of both peas, which determines the color; the property in question may be very difficult to detect directly, since it is presumably encoded inside some tiny part of a biological molecule, but it is sufficient to determine the results of the experiments.

Since this is the essence of the argument, let us make every step of the EPR reasoning completely explicit, when transposed to botany. The key idea is that the nature and the number of "elements of reality" associated with each pea cannot vary under the influence of some remote experiment, performed on the other pea. For clarity, let us first assume that the two experiments are performed at different times: One week, the experimenter grows a pea, then only next week another pea from the same pod; we assume that perfect correlations of the colors are always observed, without any special influence of the delay between the experiments. Just after completion of the first experiment (observation of the first color), but still before the second experiment, the result of that future experiment has a perfectly determined value; therefore, there must already exist one element of reality attached to the second pea that corresponds to this fact; clearly, it cannot be attached to any other object than the pea, for instance one of the measurement apparatuses, since the observation of perfect correlations only arises when making measurements with peas taken from the same pod. Symmetrically, the first pea also had an element of reality attached to it which ensured that its measurement would always provide a result that coincides with that of the future measurement.
The simplest idea that comes to mind is to assume that the elements of reality associated with both peas are coded in some genetic information, and that the values of the codes are exactly the same for all peas coming from the same pod; but other possibilities exist and the precise nature and mechanism involved in the elements of reality do not really matter here. The important point is that, since these elements of reality cannot appear by any action at a distance, they necessarily also existed before any measurement was performed—presumably even before the two peas were separated. Finally, let us consider any pair of peas, when they are already spatially separated, but before the experimentalist decides what type of measurements they will undergo (values of the parameters, delay or not, etc.). We know that, if the decision turns out to favor time separated measurements with exactly the same parameter, perfect correlations will always be observed. Since elements of reality cannot appear, or change their values, depending on experiments that are performed in a remote place, the two peas necessarily carry some elements of reality with them which completely determine the color of the flowers; any theory which ignores these elements of reality is incomplete. This completes the proof. It seems difficult not to agree that the method which led to these conclusions is indeed the scientific method; no tribunal or detective would believe that, in any circumstance, perfect correlations could be observed in remote places without being the consequence of some common characteristics shared by both objects. Such perfect correlations can then only reveal the initial common value of some variable attached to them, which is in turn a consequence of some fluctuating common cause in the past (a random choice of pods in a bag, for instance). 
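The common-cause reasoning above can be illustrated with a small simulation. It is only a sketch under stated assumptions, not a model taken from the text: two settings labeled `"a"` and `"b"`, "genes" drawn uniformly at random for each pod, and a color that depends only on the genes carried by the pea and on the local setting. All names (`draw_pod`, `grow`, `run`) are hypothetical.

```python
import random

SETTINGS = ("a", "b")  # assumed labels for two growing conditions
COLORS = (+1, -1)      # +1 = red, -1 = blue


def draw_pod(rng):
    """Random choice of a pod: its genes fix the color for every possible setting."""
    return {setting: rng.choice(COLORS) for setting in SETTINGS}


def grow(genes, setting):
    """Local, deterministic result: depends only on the pea's genes and the local setting."""
    return genes[setting]


def run(trials, seed=0):
    """Grow pairs of peas from the same pod in two remote places; return simple counts."""
    rng = random.Random(seed)
    agree = 0  # equal settings and equal colors
    total = 0  # equal settings
    reds = 0   # red flowers observed in the first place
    for _ in range(trials):
        genes = draw_pod(rng)                      # common cause in the past
        pea_1, pea_2 = dict(genes), dict(genes)    # each pea carries its own copy
        setting_1 = rng.choice(SETTINGS)           # chosen independently
        setting_2 = rng.choice(SETTINGS)           # in the two remote places
        result_1 = grow(pea_1, setting_1)
        result_2 = grow(pea_2, setting_2)
        if result_1 == +1:
            reds += 1
        if setting_1 == setting_2:
            total += 1
            if result_1 == result_2:
                agree += 1
    return agree, total, reds


if __name__ == "__main__":
    agree, total, reds = run(10_000)
    print(f"equal settings: {agree}/{total} identical colors; red fraction: {reds / 10_000:.3f}")
```

Whatever the seed, the colors agree in every run with equal settings, by construction, while each individual color remains random. In this sketch the perfect correlations do nothing more than reveal the common property carried by both peas.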
To express things in technical terms, let us for instance assume that we use the most elaborate technology available to build elaborate automata, containing powerful modern computers$^{20}$ if necessary, for the purpose of reproducing the results of the remote experiments: Whatever we do, we must ensure that, somehow, the memory of each computer contains the encoded information concerning all the results that it might have to provide in the future (for any type of measurement that might be made).

To summarize this section, we have shown that each result of a measurement may be a function of two kinds of variables:$^{21}$ (i) intrinsic properties of the peas, which they carry along with them, (ii) the local setting of the experiment (temperature, humidity, etc.); clearly, a given pair that turned out to provide two blue flowers could have provided red flowers in other experimental conditions. We may also add the following. (iii) The results are well-defined functions; in other words no fundamentally indeterministic process takes place in the experiments. (iv) When taken from its pod, a pea cannot "know in advance" to which sort of experiment it will be submitted, since the decision may not yet have been made by the experimenters; when separated, the two peas therefore have to take with them all the information necessary to determine the color of flowers for any kind of experimental conditions. What we have actually shown is that each pea carries with it as many elements of reality as necessary to provide "the correct answer"$^{22}$ to all possible questions it might be submitted to.

3.
Transposition to physics

The starting point of EPR is to assume that quantum mechanics provides correct predictions for all results of experiments; this is why we have built the parable of the peas in a way that exactly mimics the quantum predictions for measurements performed on two spin 1/2 particles for some initial quantum state: The red/blue color is obviously the analog to the result that can be obtained for a spin in a Stern-Gerlach apparatus, and the parameters (or settings) are analogous to the orientation of these apparatuses (rotation around the axis of propagation of the particles). Quantum mechanics predicts that the distance and times at which the spin measurements are performed are completely irrelevant, so that the correlations will remain the same if they take place in very remote places.

Another ingredient of the EPR reasoning is the notion of "elements of reality"; EPR first remark that these elements cannot be found by a priori philosophical considerations, but must be found by an appeal to results of experiments and measurements. They then propose the following criterion: "if, without in any way disturbing a system, we can predict with certainty the value of a physical quantity, then there exists an element of physical reality corresponding to this physical quantity." In other words, certainty cannot emerge from nothing: An experimental result that is known in advance is necessarily the consequence of some pre-existing physical property. In our botanical analogy, we implicitly made use of this idea in the reasoning of Sec. III B 2.

A last, but essential, ingredient of the EPR reasoning is the notion of space-time and locality: The elements of reality in question are attached to the region of space where the experiment takes place, and they cannot vary suddenly (or even less appear) under the influence of events taking place in very distant regions of space. The peas of the parable were in fact not so much the symbol of some microscopic object, electrons, or spin 1/2 atoms for instance. Rather, they symbolize regions of space where we just know that "something is propagating"; it can be a particle, a field, or anything else, with absolutely no assumption on its structure or physical description. Actually, in the EPR quotation of the preceding paragraph, one may replace the word "system" by "region of space," without altering the rest of the reasoning. One may summarize the situation by saying that the basic belief of EPR is that regions of space can contain elements of reality attached to them (attaching distinct elements of reality to separate regions of space is sometimes called "separability") and that they evolve locally.

From these assumptions, EPR prove that the results of the measurements are functions of (i) intrinsic properties of the spins that they carry with them (the EPR elements of reality) and (ii) of course, also of the orientations of the Stern-Gerlach analyzers. In addition, they show the following. (iii) The functions giving the results are well-defined functions, which implies that no indeterministic process is taking place; in other words, a particle with spin carries along with it all the information necessary to provide the result to any possible measurement. (iv) Since it is possible to envisage future measurements of observables that are called "incompatible" in quantum mechanics, as a matter of fact, incompatible observables can simultaneously have a perfectly well-defined value.
Item (i) may be called the EPR-1 result: Quantum mechanics is incomplete (EPR require from a complete theory that "every element of physical reality must have a counterpart in the physical theory"); in other words, the state vector may be a sufficient description for a statistical ensemble of pairs, but for one single pair of spins, it should be completed by some additional information; put differently, inside the ensemble of all pairs, one can distinguish between subensembles with different physical properties. Item (iii) may be called EPR-2, and establishes the validity of determinism from a locality assumption. Item (iv), the EPR-3 result, shows that the notion of incompatible observables is not fundamental, but just a consequence of the incomplete character of the theory; it actually provides a reason to reject complementarity. Curiously, EPR-3 is often presented as the major EPR result, sometimes even with no mention of the two others; actually, the rejection of complementarity is almost marginal or, at least, less important for EPR than the proof of incompleteness. In fact, in all that follows in this article, we will only need EPR-1 and EPR-2.

Niels Bohr, in his reply to the EPR article (Ref. 43), stated that their criterion for physical reality contains an essential ambiguity when it is applied to quantum phenomena. A more extensive quotation of Bohr's reply is the following: "The wording of the above mentioned criterion (the EPR criterion for elements of reality)... contains an ambiguity as regards the expression 'without in any way disturbing a system.' Of course there is in a case like that considered (by EPR) no question of a mechanical disturbance of the system under investigation during the last critical stage of the measuring procedure. But even at this stage there is essentially the question of an influence of the very conditions which define the possible types of predictions regarding the future behavior of the system...
the quantum description may be characterized as a rational utilization of all possibilities of unambiguous interpretation of measurements, compatible with the finite and uncontrollable interactions between the objects and the measuring instruments in the field of quantum theory." Indeed, in Bohr's view, physical reality cannot be properly defined without reference to a complete and well-defined experiment. This includes not only the systems to be measured (the microscopic particles), but also all the measurement apparatuses: "these (experimental) conditions must be considered as an inherent element of any phenomenon to which the term physical reality can be unambiguously applied." Therefore EPR's attempt to assign elements of reality to one of the spins only, or to a region of space containing it, is incompatible with orthodox quantum mechanics,$^{23}$ even if the region in question is very large and isolated from the rest of the world. Expressed differently, a physical system that is extended over a large region of space is to be considered as a single entity, within which no attempt should be made to distinguish physical subsystems or any substructure; trying to attach physical reality to regions of space is then automatically bound to failure. In terms of our Leitmotiv of Sec. I A 3, the difference between ordinary space and configuration space, we could say the following: The system has a single wave function for both particles that propagates in a configuration space with more than three dimensions, and this should be taken very seriously; no attempt should be made to come back to three dimensions and implement locality arguments in a smaller space. Bohr's point of view is, of course, not contradictory with relativity, but since it minimizes the impact of basic notions such as space-time, or events (a measurement process in quantum mechanics is not local; therefore it is not an event stricto sensu), it does not fit very well with it.
One could add that Bohr's article is difficult to understand; many physicists admit that a precise characterization of his attitude, in terms for instance of exactly what traditional principles of physics should be given up, is delicate (see, for example, the discussion of Ref. 8). In Pearle's words: "Bohr's rebuttal was essentially that Einstein's opinion disagreed with his own" (Ref. 46). It is true that, when scrutinizing Bohr's texts, one is never completely sure to what extent he fully realized all the consequences of his position. Actually, in most of his reply to EPR in Physical Review (Ref. 43), he just repeats the orthodox point of view in the case of a single particle submitted to incompatible measurements, and even goes through considerations that are not obviously related to the EPR argument, as if he did not appreciate how interesting the discussion becomes for two remote correlated particles; the relation to locality is not explicitly discussed, as if this was an unimportant issue (while it was the starting point of further important work, the Bell theorem for instance$^{24}$). The precise reply to EPR is actually contained in only a short paragraph of this article, from which the quotations given above have been taken. Even Bell confessed that he had strong difficulties understanding Bohr ("I have very little idea what this means..."; see the appendix of Ref. 33).

IV. QUANTITATIVE THEOREMS: BELL, GREENBERGER-HORNE-ZEILINGER, HARDY, BELL-KOCHEN-SPECKER

The Bell theorem, Ref. 47, may be seen in many different ways.
In fact, Bell initially invented it as a logical continuation of the EPR theorem: The idea is to take completely seriously the existence of the EPR elements of reality, and introduce them into the mathematics with the notation $\lambda$; one then proceeds to study all possible kinds of correlations that can be obtained from the fluctuations of the $\lambda$'s, making the condition of locality explicit in the mathematics (locality was already useful in the EPR theorem, but not used in equations). As a continuation of EPR, the reasoning necessarily develops from a deterministic framework and deals with classical probabilities; it studies in a completely general way all kinds of correlation that can be predicted from the fluctuations in the past of some classical common cause or, if one prefers, from some uncertainty concerning the initial state of the system. This leads to the famous inequalities. But subsequent studies have shown that the scope of the Bell theorem is not limited to determinism; for instance, the $\lambda$'s may influence the results of future experiments by fixing the values of probabilities of the results, instead of these results themselves (see Appendix A). We postpone the discussion of the various possible generalizations to Sec. IV A 4 and, for the moment, we just emphasize that the essential condition for the validity of the Bell theorem is locality: All kinds of fluctuations can be assumed, but they must affect physics only locally. If we assume that throwing dice in Paris may influence physical events taking place in Tokyo, or even in other galaxies, the proof of the theorem is no longer possible. For nonspecialized discussions of the Bell theorem, see for instance Refs. 33, 44, 48, and 49.

A. Bell inequalities

The Bell inequalities are relations satisfied by the average values of products of random variables that are correlated classically (their correlations arise from the fluctuations of some common cause in the past, as above for the peas).
As we will see, the inequalities are especially interesting in cases where they are contradictory with quantum mechanics; one of these situations occurs in the EPRB (B for Bohm, Ref. 50) version of the EPR argument, where two spin 1/2 particles undergo measurements. This is why we begin this section by briefly recalling the predictions of quantum mechanics for such a physical system; the only ingredient we need from quantum mechanics at this stage is the predictions concerning the probabilities of results. Then we again leave standard quantum mechanics and come back to the EPR-Bell argument, discuss its contradictions with quantum mechanics, and finally emphasize the generality of the theorem.

1. Two spins in a quantum singlet state

We assume that two spin 1/2 particles propagate in opposite directions after leaving a source which has emitted them in a singlet spin state. Their spin state is then described by

$$|\Psi\rangle = \frac{1}{\sqrt{2}}\bigl[\,|+,-\rangle - |-,+\rangle\,\bigr]. \tag{1}$$

When they reach distant locations, they are then submitted to spin measurements, with Stern-Gerlach apparatuses oriented along angles $a$ and $b$ around the direction of propagation. If $\theta$ is the angle between $a$ and $b$, quantum mechanics predicts that the probability for a double detection of results $+1$, $+1$ (or of $-1$, $-1$) is

$$\mathcal{P}_{+,+} = \mathcal{P}_{-,-} = \sin^2\theta, \tag{2}$$

while the probability of two opposite results is

$$\mathcal{P}_{+,-} = \mathcal{P}_{-,+} = \cos^2\theta. \tag{3}$$

This is all that we want to know, for the moment, of quantum mechanics: probability of the results of measurements. We note in passing that, if $\theta = 0$ (when the orientations of the measurement apparatuses are parallel), the formulas predict that one of the probabilities vanishes, while the other is equal to one; therefore the condition of perfect correlations required by the EPR reasoning is fulfilled (in fact, the results of the experiments are always opposed, instead of equal, but it is easy to convince oneself that this does not have any impact on the reasoning).

2.
Proof

We now come back to the line of the EPR theorem. In the framework of strict deterministic theories, the proof of the Bell theorem is the matter of a few lines; the longest part is actually the definition of the notation. Following Bell, we assume that $\lambda$ represents all "elements of reality" associated with the spins; it should be understood that $\lambda$ is only a concise notation which may summarize a vector with many components, so that we are not introducing any limitation here. In fact, one can even include in $\lambda$ components which play no special role in the problem; the only thing which matters is that $\lambda$ does contain all the information concerning the results of possible measurements performed on the spins. We use another classical notation, $A$ and $B$, for these results, and small letters $a$ and $b$ for the settings (parameters) of the corresponding apparatuses. Clearly $A$ and $B$ may depend, not only on $\lambda$, but also on the settings $a$ and $b$; nevertheless locality requests that $b$ has no influence on the result $A$ (since the distance between the locations of the measurements can be arbitrarily large); conversely, $a$ has no influence on result $B$. We therefore call $A(a,\lambda)$ and $B(b,\lambda)$ the corresponding functions (their values are either $+1$ or $-1$).

In what follows, it is sufficient to consider two directions only for each separate measurement; we then use the simpler notation:

$$A(a,\lambda) = A, \qquad A(a',\lambda) = A' \tag{4}$$

and

$$B(b,\lambda) = B, \qquad B(b',\lambda) = B'. \tag{5}$$

For each pair of particles, $\lambda$ is fixed, and the four numbers have well-defined values (which can only be $\pm 1$). With Eberhard, Ref. 51, we notice that the quantity

$$M = AB + AB' - A'B + A'B' = (A - A')B + (A + A')B' \tag{6}$$

is always equal to either $+2$ or $-2$; this is because one of the brackets on the right-hand side of this equation always vanishes, while the other is $\pm 2$. Now, if we take the average value of $M$ over a large number of emitted pairs (average over $\lambda$), since each instance of $M$ is limited to these two values, we necessarily have

$$-2 \le \langle M \rangle \le +2. \tag{7}$$

This is the so-called BCHSH form (Ref. 52) of the Bell theorem: The average values of all possible kinds of measurements that provide random results, whatever the mechanism behind them may be (as long as the randomness is local and arises from the effect of some common fluctuating cause in the past), necessarily obey this strict inequality.

3. Contradiction with quantum mechanics and with experiments

The generality of the proof is such that one could reasonably expect that any sensible physical theory will automatically give predictions that also obey this inequality; the big surprise was to realize that quantum mechanics does not: It turns out that, for some appropriate choices of the four directions $a, a', b, b'$ (the precise values do not matter for the discussion here), the inequality is violated by a factor $\sqrt{2}$, which is more than 40%. Therefore, the EPR-Bell reasoning leads to a quantitative contradiction with quantum mechanics; indeed, the latter is not a local realistic theory in the EPR sense. How is this contradiction possible, and how can a reasoning that is so simple be incorrect within quantum mechanics? The answer is the following: What is wrong, if we believe quantum mechanics, is to attribute well-defined values $A, A', B, B'$ to each emitted pair; because only two of them at maximum can be measured in any experiment, we cannot speak of these four quantities, or reason on them, even as unknown quantities. As nicely emphasized by Peres in an excellent short article (Ref. 53), "unperformed experiments have no result," that is all! Wheeler expresses a similar idea when he writes: "No elementary quantum phenomenon is a phenomenon until it is a recorded phenomenon" (Ref. 54). As for Wigner, he emphasizes in Ref.
55 that the proof of the Bell inequalities relies on a very simple notion: the number of categories into which one can classify all pairs of particles.$^{25}$ Each category is associated with well-defined results of measurements, for the various choices of the settings $a$ and $b$ that are considered; in any sequence of repeated experiments, each category will contribute with some given weight, its probability of occurrence, which has to be positive or zero. Wigner then notes that, if one introduces the notion of locality, each category becomes the intersection of a subensemble that depends on $a$ only, by another subensemble that depends on $b$ only. This operation immediately reduces the number of categories: In a specific example, he shows that their number reduces from $4^9$ to $(2^3)^2 = 2^6$ (with three settings on each side, there are nine pairs of settings and four possible pairs of results for each of them, while locality leaves only $2^3$ possibilities for each particle). The mathematical origin of the Bell inequalities lies precisely in the possibility of distributing all pairs into this smaller number of categories, with positive probabilities.

A general way to express the Bell theorem in logical terms is to state that the following system of three assumptions (which could be called the EPR assumptions) is self-contradictory: (1) validity of their notion of "elements of reality," (2) locality, (3) the predictions of quantum mechanics are always correct. The Bell theorem then becomes a useful tool to build a "reductio ad absurdum" reasoning: It shows that, among all three assumptions, one (at least) has to be given up. The motivation of the experimental tests of the Bell inequalities was precisely to check if it was not the third which should be abandoned. Maybe, after all, the Bell theorem is nothing but an accurate pointer toward exotic situations where the predictions of quantum mechanics are so paradoxical that they are actually wrong? Such was the hope of some theorists, as well as the exciting challenge to experimentalists. Experiments were performed in the seventies, initially with photons (Refs.
56 and 57) where they already gave very clear results, as well as with protons (Ref. 58); in the eighties, they were made more and more precise and convincing (Ref. 59; see also Ref. 60); ever since, they have been constantly improved (see for instance Ref. 61, but the list of references is too long to be given here); all these results have clearly shown that, in this conflict between local realism and quantum mechanics, the latter wins completely. A fair summary of the situation is that, even in these most intricate situations invented and tested by the experimentalists, no one has been able to disprove quantum mechanics. In this sense, we can say that Nature obeys laws which are nonlocal, or nonrealist, or both. It goes without saying that no experiment in physics is perfect, and it is always possible to invent ad hoc scenarios where some physical processes, for the moment totally unknown, "conspire" in order to give us the illusion of correct predictions of quantum mechanics (we come back to this point in Sec. V A), but the quality and the number of the experimental results do not make this attitude very attractive intellectually.

4. Generality of the theorem

We have already mentioned that several generalizations of the Bell theorem are possible; they are at the same time mathematically simple and conceptually interesting. For instance, in some of these generalizations, it is assumed that the result of an experiment becomes a function of several fluctuating causes: the fluctuations taking place in the source as usual, but also fluctuations taking place in the measuring apparatuses (Ref. 62), and/or perturbations acting on the particles during their motion toward the apparatuses; actually, even fundamentally indeterministic (but local) processes may influence the results.
The two former cases are almost trivial since they just require the addition of more dimensions to the vector variable $\lambda$; the latter requires replacing the deterministic functions $A$ and $B$ by probabilities, but this is also relatively straightforward, Ref. 49 (see also the footnote in Ref. 62 and Appendix A). Moreover, one should realize that the role of the $A$ and $B$ functions is just to relate the conditions of production of a pair of particles (or of their propagation) to their behavior when they reach the measurement apparatuses (and to the effects that they produce on them); they are, so to say, solutions of the equations of motion, whatever these are. The important point is that they may perfectly include, in a condensed notation, a large variety of physical phenomena: propagation of point particles, propagation of one or several fields from the source to the detectors (see for instance the discussion in Sec. 4 of Ref. 33), particles and fields in interaction, or whatever process one may have in mind (even random propagations can be included), as long as they do not depend on the other setting ($A$ is supposed to be a function of $a$, not of $b$). The exact mathematical form of the equations of propagation is irrelevant; the essential thing is that the functions exist.

Indeed, what really matters for the proof of the Bell theorem is the dependence with respect to the settings $a$ and $b$: The function $A$ must depend on $a$ only, while $B$ must depend on $b$ only. Locality expressed mathematically in terms of $a$ and $b$ is the crucial ingredient. For instance we could, if we wished, assume that the result $A$ of one measurement is also a function of fluctuating random variables attached to the other apparatus, which introduces a nonlocal process; but this does not create any mathematical problem for the proof (as long as these variables are not affected by setting $b$).
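As a rough numerical illustration of the generalization with fluctuations in the apparatuses, the following sketch simulates a purely local model in which each result depends on a shared variable fixed at the source, on the local setting, and on an independent noise term in the local apparatus. Everything in it is an assumption made for illustration only (the names `run_pair`, `correlation`, and `chsh`, the sign-of-cosine response, the Gaussian noise of width 0.3, and the four angles); it is not a model taken from the references.

```python
import math
import random


def sign(x):
    """Return +1 or -1 (zero is mapped to +1)."""
    return 1 if x >= 0 else -1


def run_pair(a, b, rng):
    """Results (A, B) for one emitted pair in a hypothetical local model."""
    lam = rng.uniform(0.0, 2.0 * math.pi)   # shared variable fixed at the source
    noise_a = rng.gauss(0.0, 0.3)           # fluctuation in the first apparatus
    noise_b = rng.gauss(0.0, 0.3)           # fluctuation in the second apparatus
    A = sign(math.cos(lam - a) + noise_a)   # depends on a, never on b
    B = -sign(math.cos(lam - b) + noise_b)  # depends on b, never on a
    return A, B


def correlation(a, b, n, rng):
    """Estimate the average of the product A*B over n pairs."""
    total = 0
    for _ in range(n):
        A, B = run_pair(a, b, rng)
        total += A * B
    return total / n


def chsh(a, a2, b, b2, n=20000, seed=0):
    """Estimate <M> = <AB> + <AB'> - <A'B> + <A'B'> from four separate runs."""
    rng = random.Random(seed)
    return (correlation(a, b, n, rng) + correlation(a, b2, n, rng)
            - correlation(a2, b, n, rng) + correlation(a2, b2, n, rng))


if __name__ == "__main__":
    print(chsh(0.0, -math.pi / 2, math.pi / 4, -math.pi / 4))
```

Up to statistical fluctuations of the finite sample, the estimate should stay within the bounds of inequality (7); the local noise only weakens the correlations.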
On the other hand, if $A$ becomes a function of $a$ and $b$ (and/or the same for $B$), it is easy to see that the situation is radically changed: In the reasoning of Sec. IV A 2 we must now associate eight numbers to each pair (since there are two results to specify for each of the four different combinations of settings), instead of four, so that the proof miserably collapses. Appendix A gives another concrete illustration showing that it is locality, not determinism, which is at stake; see also the appendix of Ref. 49.

Needless to say, the independence of $A$ of $b$ does not mean that the result observed on one side, $A$, is independent of the outcome at the other side, $B$: One should not confuse setting and outcome dependencies! It is actually clear that, in any theory, the correlations would disappear if outcome dependence was totally excluded. We should also mention that the setting dependence is subject to some constraints, if the theory is to remain compatible with relativity. If, for instance, the probability of observation of the results on one side, which is a sum of probabilities over the various possible outcomes on the other side, was still a function of the other setting, one would run into incompatibility; this is because one could use the device to send signals without any fundamental delay, thus violating the constraints of relativity. See Refs. 63 and 64 for a discussion in terms of "strong locality" and "predictive completeness" (or "parameter independence" and "outcome independence" in Ref. 65). Appendix D discusses how the general formalism of quantum mechanics manages to ensure compatibility with relativity.

An interesting generalization of the Bell theorem, where time replaces the settings, has been proposed by Franson in Ref. 66 and implemented in experiments for an observation of a violation of the Bell inequalities (see for instance Ref.
67); another generalization shows that a violation of the Bell inequalities is not limited to a few quantum states (singlet for instance), but includes all states that are not products, Refs. 68 and 69. For a general discussion of the conceptual impact of a violation of the inequalities, we refer to the book collecting Bell's articles (Ref. 7).

We wish to conclude this section by emphasizing that the Bell theorem is much more general than many people think. All potential authors on the subject should think twice and remember this carefully before taking their pen and sending a manuscript to a physics journal: Every year a large number of such manuscripts are submitted, with the purpose of introducing "new" ways to escape the constraints of the Bell theorem, and to "explain" why the experiments have provided results that are in contradiction with the inequalities. According to them, the nonlocal correlations would originate from some new sort of statistics, or from perturbations created by cosmic rays, gas collisions with fluctuating impact parameters, etc. The imagination is the only limit of the variety of the processes that can be invoked, but we know from the beginning that all these attempts are doomed to failure. The situation is analogous to the attempts of past centuries to invent "perpetuum mobile" devices: Even if some of these inventions were extremely clever, and if it is sometimes difficult to find the exact reason why they cannot work, it remains true that the law of energy conservation allows us to know at once that they cannot. In the same way, some of these statistical "Bell beating schemes" may be extremely clever, but we know that the theorem is a very general theorem in statistics: In all situations that can be accommodated by the mathematics of the $\lambda$'s and the $A$ and $B$ functions (and there are many!), it is impossible to escape the inequalities.

[Fig. 2. Logical scheme relating the event $A = 1$, $B = 1$ ("sometimes") to the event $A' = -1$, $B' = -1$ ("never").]
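The two sides of the conflict can be checked numerically with a short sketch. The first function enumerates the sixteen possible assignments of $\pm 1$ to $A, A', B, B'$ and collects the values of the combination $M$ of Eq. (6); the second computes the same combination of correlations for the singlet state with standard Pauli matrices. The function names and the four angles are illustrative choices, and the angles are taken here as the directions of the analyzers in the plane xOz, so that the singlet correlation is the textbook $-\cos(a-b)$; this angle convention is an assumption and is not necessarily that of Eqs. (2) and (3).

```python
import itertools

import numpy as np


def m_values():
    """Values taken by M = AB + AB' - A'B + A'B' over all +/-1 assignments."""
    values = set()
    for A, A2, B, B2 in itertools.product((+1, -1), repeat=4):
        values.add(A * B + A * B2 - A2 * B + A2 * B2)
    return values


# Pauli matrices in the basis (|+>, |->) of the spin component along Oz.
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

# Singlet state (1/sqrt 2)(|+,-> - |-,+>) in the basis |++>, |+->, |-+>, |-->.
SINGLET = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)


def spin(angle):
    """Spin component along a direction at `angle` from Oz in the plane xOz."""
    return np.cos(angle) * SZ + np.sin(angle) * SX


def correlation(a, b):
    """Quantum average of the product of the two results for settings a, b."""
    op = np.kron(spin(a), spin(b))
    return float(np.real(SINGLET.conj() @ op @ SINGLET))


def chsh_quantum(a, a2, b, b2):
    """Quantum value of <M> for the four settings."""
    return (correlation(a, b) + correlation(a, b2)
            - correlation(a2, b) + correlation(a2, b2))


if __name__ == "__main__":
    print(sorted(m_values()))                                       # [-2, 2]
    print(chsh_quantum(0.0, -np.pi / 2, np.pi / 4, -np.pi / 4))     # about -2.828
```

With these settings the quantum value has modulus $2\sqrt{2}$, while no assignment of pre-existing values can give anything other than $\pm 2$ for each pair.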
No, nonlocal correlations cannot be explained cheaply; yes, a violation of the inequalities is therefore a very, very rare situation. In fact, until now, it has never been observed, except of course in experiments designed precisely for this purpose. In other words, if we wanted to build automata including arbitrarily complex mechanical systems and computers, we could never mimic the results predicted by quantum mechanics (at least for remote measurements); this will remain impossible forever, or at least until completely different computers working on purely quantum principles are built.$^{26}$

B. Hardy's impossibilities

Another scheme of the same conceptual type was introduced recently by Hardy (Ref. 70); it also considers two particles but it is nevertheless completely different since it involves, instead of mathematical constraints on correlation rates, the very possibility of occurrence for some type of events; see also Ref. 71 for a general discussion of this interesting contradiction. As in Sec. IV A 2, we assume that the first particle may undergo two kinds of measurements, characterized by two values $a$ and $a'$ of the first setting; if we reason as in the second half of Sec. IV A 2, within the frame of local realism, we can call $A$ and $A'$ the corresponding results. Similar measurements can be performed on the second particle, and we call $B$ and $B'$ the results.

Let us now consider three types of situations: (i) settings without prime: we assume that the result $A = 1$, $B = 1$ is sometimes obtained, (ii) one prime only: we assume that the "double one" is impossible, in other words that one never gets $A = 1$, $B' = 1$, and never $A' = 1$, $B = 1$ either, (iii) double prime settings: we assume that "double minus one" is impossible, in other words that $A' = -1$, $B' = -1$ is never observed. A closer inspection shows that these three assumptions are in fact incompatible. To see why, let us for instance consider the logical scheme of Fig.
2, where the upper part corresponds to the possibility opened by statement (i); statement (ii) then implies that, if $A = 1$, one necessarily has $B' = -1$, which explains the first diagonal in the figure; the second diagonal follows by symmetry. Then we see that all events corresponding to the results $A = B = 1$ also necessarily correspond to $A' = B' = -1$, so that a contradiction with statement (iii) appears: The three propositions are in fact incompatible. A way to express it is to say that the "sometimes" of (i) is contradictory with the "never" of proposition (iii).

But it turns out that quantum mechanics does allow a simultaneous realization of all three propositions! To see how, let us for instance consider a two-spin state vector of the form:

$$|\Psi\rangle = \alpha\,|+,-\rangle + \beta\,|-,+\rangle + \gamma\,|+,+\rangle, \tag{8}$$

where the $|\pm,\pm\rangle$ refer to eigenstates of $A'$ and $B'$ (Note: axis Oz is chosen as the direction of measurement associated with primed operators). From the beginning, the absence of any component of $|\Psi\rangle$ on $|-,-\rangle$ ensures that proposition (iii) is true. As for the measurements without prime, we assume that they are both performed along a direction in the plane xOz that makes an angle $2\theta$ with Oz; the corresponding single-spin eigenstate with eigenvalue $+1$ is then merely

$$\cos\theta\,|+\rangle + \sin\theta\,|-\rangle. \tag{9}$$

The first state excluded by proposition (ii) (diagonal in Fig. 2) is then the two-spin state:

$$\cos\theta\,|+,+\rangle + \sin\theta\,|-,+\rangle \tag{10}$$

while the second is:

$$\cos\theta\,|+,+\rangle + \sin\theta\,|+,-\rangle \tag{11}$$

so that the two exclusion conditions are equivalent to the following conditions:

$$\alpha\sin\theta + \gamma\cos\theta = \beta\sin\theta + \gamma\cos\theta = 0 \tag{12}$$

or, within a proportionality coefficient:

$$\alpha = \beta = -\gamma\cot\theta. \tag{13}$$

This arbitrary coefficient may be used to write $|\Psi\rangle$ in the form:

$$|\Psi\rangle = -\cos\theta\,\bigl(|+,-\rangle + |-,+\rangle\bigr) + \sin\theta\,|+,+\rangle. \tag{14}$$

The last thing to do is to get the scalar product of this ket by that where the two spins are in the state (9); we get the following result:

$$-\sin\theta\cos^2\theta. \tag{15}$$

The final step is to divide the square of this result by the square of the norm of ket (14) in order to obtain the probability of the process considered in (i); this is a straightforward calculation (see Appendix B), but here we just need to point out that the probability is not zero; the precise value of its maximum over $\theta$ found in Appendix B is about 9%. This proves that the pair of results considered in proposition (i) can sometimes be obtained together with (ii) and (iii): Indeed, in 9% of the cases, the predictions of quantum mechanics are in complete contradiction with those of a local realist reasoning.

An interesting aspect of the above propositions is that they can be generalized to an arbitrary number of measurements (Ref. 72); it turns out that this allows a significant increase of the percentage of "impossible events" (impossible within local realism) predicted by quantum mechanics, from 9% to almost 50%! The generalization involves a chain, which keeps the first two lines (i) and (ii) unchanged, and iterates the second in a recurrent way, by assuming that: (iii) for measurements of the type $(a', b'')$ or $(a'', b')$, one never gets opposite results,$^{27}$ (iv) for measurements of the type $(a'', b''')$ or $(a''', b'')$, one never gets opposite results, etc., (n) finally, for measurements of the type $(a^{(n)}, b^{(n)})$, one never gets $-1$ and $-1$. The incompatibility proof is very similar to that given above; it is summarized in Fig. 3.

[Fig. 3. Logical scheme of the chained incompatibility proof, ending with $B^{(n)} = -1$.]

In both cases, the way to resolve the contradiction is the same as for the Bell theorem: In quantum mechanics, it is not correct to reason on all four quantities $A$, $A'$, $B$, and $B'$, even as quantities that are unknown and that could be determined in a future experiment.
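The quantum side of Hardy's argument lends itself to a direct numerical check. The sketch below builds the state (14) as a real vector, evaluates the probabilities of the events appearing in propositions (i), (ii), and (iii), and scans $\theta$ for the largest probability of the "impossible" event. The function names, the ordering of the basis, and the grid of 2001 angles are assumptions made for this illustration; the exact maximum is the one derived in Appendix B.

```python
import numpy as np

# Single-spin eigenstates of the primed measurements (along Oz).
PLUS = np.array([1.0, 0.0])
MINUS = np.array([0.0, 1.0])


def hardy_state(theta):
    """Normalized ket (14) in the basis |+,+>, |+,->, |-,+>, |-,->."""
    psi = np.array([np.sin(theta), -np.cos(theta), -np.cos(theta), 0.0])
    return psi / np.linalg.norm(psi)


def plus_unprimed(theta):
    """Eigenstate (9) with eigenvalue +1 of the measurement without prime."""
    return np.array([np.cos(theta), np.sin(theta)])


def prob(psi, s1, s2):
    """Probability of finding spin 1 in state s1 and spin 2 in state s2."""
    return float(np.dot(np.kron(s1, s2), psi) ** 2)


def hardy_probabilities(theta):
    """Probabilities of the events of propositions (i), (ii), and (iii)."""
    psi = hardy_state(theta)
    u = plus_unprimed(theta)
    return {
        "i": prob(psi, u, u),             # A = 1, B = 1: "sometimes"
        "ii_a": prob(psi, u, PLUS),       # A = 1, B' = 1: "never"
        "ii_b": prob(psi, PLUS, u),       # A' = 1, B = 1: "never"
        "iii": prob(psi, MINUS, MINUS),   # A' = -1, B' = -1: "never"
    }


def max_hardy_probability(samples=2001):
    """Largest probability of event (i) on a grid of theta in [0, pi/2]."""
    thetas = np.linspace(0.0, np.pi / 2, samples)
    return max(hardy_probabilities(t)["i"] for t in thetas)


if __name__ == "__main__":
    print(hardy_probabilities(0.6))
    print(max_hardy_probability())   # close to 0.09
```

For every $\theta$ the three "never" events have vanishing probability, while event (i) has probability $\sin^2\theta\cos^4\theta/(1+\cos^2\theta)$, whose maximum on the grid is about 9%.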
This is simply because, with a given pair of spins, it is obviously impossible to design an experiment that will measure all of them: They are incompatible. If we insisted on introducing similar quantities to reproduce the results of quantum mechanics, since four experimental combinations of settings are considered, we would have to consider eight numbers instead of four, as already discussed in Sec. IV A 4. For a discussion of nonlocal effects with other states, see Ref. 73.

C. GHZ equality

For many years, everyone thought that Bell had basically exhausted the subject by considering all really interesting situations, in other words that two-spin systems provided the most spectacular quantum violations of local realism. It therefore came as a surprise to many when in 1989 Greenberger, Horne, and Zeilinger (GHZ) showed that this is not true: As soon as one considers systems containing more than two correlated particles, even more dramatic violations of local realism become possible in quantum mechanics, and even without involving inequalities. Here, we limit ourselves to the discussion of three-particle systems, as in the original articles (Refs. 74 and 75), but generalization to $N$ particles is possible; see for instance Sec. V C 1 or Ref. 76. While Ref. 75 discussed the properties of three correlated photons, each emitted through two pinholes and impinging beam splitters, we will follow Ref. 77 and consider a system of three spin-1/2 particles (external variables play no role here); we assume that the system is described by the quantum state:

$$|\Psi\rangle = \frac{1}{\sqrt{2}}\bigl[\,|+,+,+\rangle - |-,-,-\rangle\,\bigr], \tag{16}$$

where the $|\pm\rangle$ states are the eigenstates of the spins along the $Oz$ axis of an orthonormal frame $Oxyz$. We now calculate the quantum probabilities of measurements of the spins $\sigma_{1,2,3}$ of the three particles, either along direction $Ox$, or along direction $Oy$, which is perpendicular.
More precisely, we assume that what is measured is not individual spin components, but only the product of three of these components, for instance $\sigma_{1y}\times\sigma_{2y}\times\sigma_{3x}$. A straightforward calculation (see Appendix C) shows that

$$\begin{aligned}
\mathcal{P}(\sigma_{1y}\times\sigma_{2y}\times\sigma_{3x} = 1) &= +1,\\
\mathcal{P}(\sigma_{1x}\times\sigma_{2y}\times\sigma_{3y} = 1) &= +1,\\
\mathcal{P}(\sigma_{1y}\times\sigma_{2x}\times\sigma_{3y} = 1) &= +1.
\end{aligned} \tag{17}$$

In fact, the state vector written in (16) turns out to be a common eigenstate to all three operator products, so that each of them takes a value $+1$ that is known before the measurement.$^{28}$ Now, if we consider the product of three spin components along $Ox$, it is easy to check (Appendix C) that the same state vector is also an eigenstate of the product operator $\sigma_{1x}\times\sigma_{2x}\times\sigma_{3x}$, but now with eigenvalue $-1$, so that

$$\mathcal{P}(\sigma_{1x}\times\sigma_{2x}\times\sigma_{3x} = -1) = 1. \tag{18}$$

This time the result is $-1$, with probability 1, that is with certainty.

Let us now investigate the predictions of a local realist EPR type point of view in this kind of situation. Since the quantum calculation is so straightforward, it may seem useless: Indeed, no one before GHZ suspected that anything interesting could occur in such a simple case, where the initial state is an eigenstate of all observables considered, so that the results are perfectly certain. But, actually, we will see that a complete contradiction emerges from this analysis! The local realist reasoning is a simple generalization of that given in Sec. IV A 2; we call $A_{x,y}$ the results that the first spin will give for a measurement, either along $Ox$ or $Oy$; similar letters $B$ and $C$ are used for the measurement on the two other spins. From the three equalities written in (17) we then get:

$$A_y B_y C_x = 1, \qquad A_x B_y C_y = 1, \qquad A_y B_x C_y = 1. \tag{19}$$

Now, if we assume that the observations of the three spins are performed in three remote regions of space, locality implies that the values measured for each spin should be independent of the type of observation performed on the two other spins.
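The quantum predictions (17) and (18) that enter this argument can be checked with a few lines of code. The sketch below is only an illustration and does not reproduce Appendix C: it stores the state (16) as a dictionary of amplitudes and uses the standard action of the Pauli matrices on the $Oz$ eigenstates, $\sigma_x|s\rangle = |-s\rangle$ and $\sigma_y|s\rangle = i s\,|-s\rangle$ with $s = \pm 1$. The names `ghz_state`, `apply_product`, and `eigenvalue` are assumptions introduced here.

```python
import math

AMPLITUDE = 1 / math.sqrt(2)
# State (16): keys are the three spin values (+1 or -1) along Oz.
ghz_state = {(1, 1, 1): AMPLITUDE, (-1, -1, -1): -AMPLITUDE}


def apply_product(axes, state):
    """Apply sigma_1 x sigma_2 x sigma_3 along the given axes ('x' or 'y') to a state.

    Uses sigma_x|s> = |-s> and sigma_y|s> = i s |-s> for s = +1 or -1.
    """
    result = {}
    for spins, amplitude in state.items():
        factor = 1
        for axis, s in zip(axes, spins):
            if axis == "y":
                factor *= 1j * s
        flipped = tuple(-s for s in spins)  # both sigma_x and sigma_y flip the spin
        result[flipped] = result.get(flipped, 0) + factor * amplitude
    return result


def eigenvalue(axes, state):
    """Return the eigenvalue if the state is an eigenstate of the product, else None."""
    image = apply_product(axes, state)
    value = sum(complex(state[k]).conjugate() * image.get(k, 0) for k in state)
    keys = set(state) | set(image)
    is_eigenstate = all(
        abs(image.get(k, 0) - value * state.get(k, 0)) < 1e-12 for k in keys
    )
    return round(value.real) if is_eigenstate else None


# Axis labels are listed in the order (spin 1, spin 2, spin 3).
eigenvalues = {axes: eigenvalue(axes, ghz_state) for axes in ("yyx", "xyy", "yxy", "xxx")}
print(eigenvalues)
```

The three products appearing in (17) come out with eigenvalue $+1$ and the product of the three $Ox$ components with eigenvalue $-1$, in agreement with (18).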
This means that the same values of $A$, $B$, and $C$ can be used again for the experiment where the three $Ox$ components are measured: The result is merely the product $A_x B_x C_x$. But, since the squares $A_y^2$, etc., are always equal to $+1$, we can obtain this result by multiplying all three parts of Eq. (19), which provides

$$A_x B_x C_x = +1. \tag{20}$$

But equality (18) predicts the opposite sign!

Here we obtain a contradiction that looks even more dramatic than for the Bell inequalities: The two predictions do not differ by some significant fraction (about 40%), they are just completely opposite. In addition, all fluctuations are eliminated since all of the results (the products of the three components) are perfectly known before measurement: The 100% contradiction is obtained with 100% certainty! Unfortunately, this does not mean that, experimentally, tests of the GHZ equality are easy. Three particles are involved, which must be put in state (16), surely a nontrivial task; moreover one has to design apparatuses that measure the product of three spin components. To our knowledge, no experiment analogous to the Bell inequality experiments has been performed on the GHZ equality yet, at least with macroscopic distances; only microscopic analogues have been observed, in nuclear magnetic resonance experiments (Ref. 78); for recent proposals, see for instance Refs. 79 and 80. Nevertheless, constant progress in the techniques of quantum electronics is taking place, and GHZ entanglement has already been observed (Refs. 81 and 82), so that one gets the impression that a full experiment is not too far away in the future.

In a GHZ situation, how precisely is the conflict between the reasoning above and quantum mechanics resolved? There are different stages at which this reasoning can be put into question. First, we have assumed locality, which here takes the form of noncontextuality (see Sec.
IV D): Each of the results is supposed to be independent of the nature of the measurements that are performed on the others, because they take place in remote regions of space. Clearly, there is no special reason why this should necessarily be true within quantum mechanics. Second, we have also made assumptions concerning the nature of the "elements of reality" attached to the particles. In this respect, it is interesting to note that the situation is very different from the EPR-Bell or Hardy cases: Bohr could not have replied that different elements of reality should be attached to different experimental setups! In the GHZ argument, it turns out that all four quantum operators corresponding to the measurements commute, so that there is in principle no impossibility of measuring all of them with a single setup. But the local realist reasoning also assumes that a measurement of the product of three operators is equivalent to a separate measurement of each of them, which attributes to them separate elements of reality. In the formalism of quantum mechanics, the question is more subtle. It turns out that the measurement of a single product of commuting operators is indeed equivalent to the measurement of each of them; but this is no longer the case for several product operators, as precisely illustrated by those introduced above: Clearly, all six spin component operators appearing in the formulas do not commute with each other. It is therefore impossible to design a single experimental setup to have access to all six quantities $A_{x,y}$, $B_{x,y}$, and $C_{x,y}$ that we have used in the local realist proof.$^{29}$

When the measurements are imperfect, the GHZ equality can give rise to inequalities (as in the BCHSH theorem), as discussed in Refs. 75 and 83; the latter reference also presents a generalization to an arbitrary number $N$ of particles; in the same line, Ref.
76 provides a discussion of the $N$-particle correlation function with varying angles for the analyzers, which we partially reproduce in Sec. V C 1.

D. Bell-Kochen-Specker; contextuality

Another theorem was introduced also by Bell (Ref. 3), as well as (independently and very shortly after) by Kochen and Specker (Ref. 84), hence the name "BKS theorem" that is often used for it. This theorem is not particularly related to locality, as opposed to those that we have already discussed in the preceding subsections. It is actually related to another notion, called "contextuality": An additional variable attached to a physical system is called "contextual" if its value depends not only on the physical quantity that it describes, but also on the other physical quantities that can be measured at the same time on the same system (in quantum mechanics they correspond to commuting observables). If, on the other hand, its value is completely independent of all the other observables that the experimenter may decide to measure at the same time, the additional variable is called "noncontextual"; one can then say that it describes a property of the physical system only, and not a combined property of the system and the measurement apparatus; it may have pre-existed in the system before any measurement. The notion of distance is no longer relevant in this context; for instance, the theorem applies to a single system with no extension in space.

Let us first consider a spin 1 particle in quantum mechanics, with three quantum states $|-1\rangle$, $|0\rangle$, and $|+1\rangle$ as a basis of a state space of dimension three. The three components $S_x$, $S_y$, and $S_z$ do not commute (they obey the usual commutation relation for the angular momentum), but it is easy to show that the squares of these three operators do commute; this is a specific property of angular momentum 1, and can be seen by an elementary matrix calculation with the usual operators $S_\pm$.
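The elementary matrix calculation mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the basis ordering $|+1\rangle$, $|0\rangle$, $|-1\rangle$, units where $\hbar = 1$, the standard spin-1 matrix of $S_+$, and the helper names `matmul` and `commutator_size` are all choices made here, not taken from the article.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def commutator_size(a, b):
    """Largest matrix element (in modulus) of the commutator [a, b]."""
    ab, ba = matmul(a, b), matmul(b, a)
    return max(abs(ab[i][j] - ba[i][j]) for i in range(3) for j in range(3))


# Basis ordering |+1>, |0>, |-1>, in units where hbar = 1.
ROOT2 = math.sqrt(2)
s_plus = [[0, ROOT2, 0], [0, 0, ROOT2], [0, 0, 0]]
s_minus = [[s_plus[j][i] for j in range(3)] for i in range(3)]  # transpose of the real matrix S+

# S_x = (S+ + S-)/2 and S_y = (S+ - S-)/(2i)
s_x = [[(s_plus[i][j] + s_minus[i][j]) / 2 for j in range(3)] for i in range(3)]
s_y = [[(s_plus[i][j] - s_minus[i][j]) / 2j for j in range(3)] for i in range(3)]
s_z = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]

components = {"x": s_x, "y": s_y, "z": s_z}
squares = {name: matmul(m, m) for name, m in components.items()}

pairs = [("x", "y"), ("y", "z"), ("z", "x")]
component_commutators = {p: commutator_size(components[p[0]], components[p[1]]) for p in pairs}
square_commutators = {p: commutator_size(squares[p[0]], squares[p[1]]) for p in pairs}

print("components:", component_commutators)
print("squares:", square_commutators)
```

The commutators of the components are clearly nonzero, while those of the squares vanish to within rounding errors, which is the property used in the argument.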
Moreover, the sum of these squares is a constant (a c-number) since

$$S_x^2 + S_y^2 + S_z^2 = 2\hbar^2. \tag{21}$$

It is not against any fundamental principle of quantum mechanics, therefore, to imagine a triple measurement of the observables $S_x^2$, $S_y^2$, and $S_z^2$; we know that the sum of the three results will always be 2 (from now on we drop the factor $\hbar^2$, which plays no role in the discussion). Needless to say, the choice of the three orthogonal directions is completely arbitrary, and the compatibility is ensured for any choice of this triad, but not more than one: The measurements for different choices remain totally incompatible.

In passing, we note that the measurement of the square $S_x^2$ of one component cannot merely be seen as a measurement of $S_x$ followed by a squaring calculation made afterwards by the experimentalist! Ignoring information is not equivalent to not measuring it (we come to this point in more detail, in terms of interferences and decoherence, at the end of Sec. VI A). There is indeed less information in $S_x^2$ than in $S_x$ itself, since the former has only two eigenvalues (1 and 0), while the latter has three ($-1$ is also a possible result). What is needed to measure $S_x^2$ is, for instance, a modified Stern-Gerlach system where the components of the wave function corresponding to results $\pm 1$ are not separated, or where they are separated but subsequently grouped together in a way that makes them impossible to distinguish. Generally speaking, in quantum mechanics, measuring the square of an operator is certainly not the same physical process as measuring the operator itself!

Now, suppose that we try to attach to each individual spin an EPR element of reality/additional variable that corresponds to the result of measurement of $S_x^2$; by symmetry, we will do the same for the two other components, so that each spin now gets three additional variables $\lambda$ to which we may attribute values that determine the possible results: 1 or 0.
The results are described by functions of these variables, which we denote $A_{x,y,z}$:

$$A_x = 0 \text{ or } 1, \qquad A_y = 0 \text{ or } 1, \qquad A_z = 0 \text{ or } 1. \tag{22}$$

At first sight, this seems to provide a total of eight possibilities; but, if we want to preserve relation (21), we have to select among these eight possibilities only those three for which two $A$'s are one, one is zero. As traditional, for this particular spin we then attribute colors to the three orthogonal directions $Ox$, $Oy$, and $Oz$: The two directions that get an $A = 1$ are painted in red, the last in blue (Ref. 85).

The same operation can obviously be made for all possible choices of the triplet of directions $Oxyz$. A question which then naturally arises is: For an arbitrary direction, can one attribute a given color (a given value for $A_x$) that remains independent of the context in which it was defined? Indeed, we did not define the value as a property of an $Ox$ direction only, but in the context of two other directions $Oy$ and $Oz$; the possibility of a context independent coloring is therefore not obvious. Can we for instance fix $Oz$ and rotate $Ox$ and $Oy$ around it, and still keep the same color for $Oz$? We are now facing an amusing little problem of geometry that we might call "ternary coloring of all space directions." Bell as well as Kochen and Specker showed that this is actually impossible; for a proof see either the original articles, or the excellent review (Ref. 6) given by Mermin.

In the same article, this author shows how the complications of the geometrical problem may be entirely avoided by going to a space of states of dimension four instead of three. He considers two spin 1/2 particles and the following table of nine quantum variables (we use the same notation as in Sec. IV C):

$$\begin{array}{ccc}
\sigma_{1x} & \sigma_{2x} & \sigma_{1x}\sigma_{2x} \\
\sigma_{2y} & \sigma_{1y} & \sigma_{1y}\sigma_{2y} \\
\sigma_{1x}\sigma_{2y} & \sigma_{1y}\sigma_{2x} & \sigma_{1z}\sigma_{2z}
\end{array} \tag{23}$$

All operators have eigenvalues $\pm 1$. It is easy to see why all three operators belonging to the same line, or to the same column, always commute (the products of two