Physics in Spacetime (F4051)
Lecture notes
Linus Wulff
Spring 2024

Chapter 1 Space and Time

This course deals with the special theory of relativity introduced by Einstein in a famous 1905 paper. The traditional way of introducing special relativity is to derive it, in much the same way that Einstein did, from two basic principles:

1. The principle of relativity
2. The constancy of the speed of light

From these assumptions the notion of a spacetime with (inertial) observers being connected by Lorentz transformations follows. This is a natural way to proceed if one starts from a knowledge of classical mechanics and Maxwell's equations of electrodynamics. However, it is not the best way to understand the geometrical aspects of spacetime. This is part of the reason it took Einstein another ten years to formulate the general theory of relativity, describing gravity, where the geometry of spacetime is the key player.

In this course we will follow a different route¹ which leads more directly to a geometric picture. Rather than starting with the principles described above we will derive the same physics from what is known as

• The principle of maximum proper time

This approach is more in line with general relativity, and this course can be thought of as a first step towards studying general relativity.

¹ The approach to relativity taken here is inspired by lecture notes and a book by B. Laurent (Introduction To Spacetime, World Scientific, 1994).

1.1 What is space and time?

The notions of space and time are central to physics. In physics we are interested in answering questions like

Given some configuration of particles with given positions and momenta at some initial time $t_i$, what will the configuration look like at some later time $t_f$?

For such questions to make sense we must have a precise way of defining what we mean by time and also what we mean by a particle's position in space. So what are space and time?
Rather than get into a philosophical discussion about the nature of space and time, a more useful approach when faced with such deep questions in physics is to try to replace them by a different, more down-to-earth, question. After all, in physics we deal only with things that can be measured, and therefore a better question to ask is

How do we measure space and time?

We say that we define space and time operationally, by declaring how we measure them.

So how do we measure distances in space? The most basic way is to take a reference object, say a stick of a certain length, and use it to measure the distance between two points. We will call such a reference object a ruler. Of course, a good ruler should not bend or change its length with temperature etc., so we will assume it is always possible to find a sufficiently good ruler (or equivalent) so that we can measure lengths to the precision we need.

How do we measure time? To measure time we need a clock. It does not have to be what we normally think of as a clock; it can be any physical process which has a known time dependence, e.g. a periodic process with definite period like a pendulum or a non-periodic process like an atom in an excited state with known half-life. Again, we will assume that there exist such clocks with good enough precision for the time measurements we need to perform.

We allow each person, or observer, to measure time with their own clock and spatial distances with their own ruler. We will assume that these are small enough that the observer can carry them with her, i.e. they will be assumed to be in the same state of motion as the observer and experience the same forces she experiences.

But if each observer makes their own measurements using their own clock and ruler, how do we relate the measurements of two different observers? Newton and his contemporaries assumed that there was an absolute notion of time, so that all observers' clocks would tick at the same rate.
In that case it is very easy to relate the measurements of two observers. We now know that this assumption was wrong. For example, taking two synchronized atomic clocks, putting one on a plane circling the earth and leaving one on the ground, one finds when comparing them at the end that they differ (by a few hundred nanoseconds). This observation is clearly inconsistent with the Newtonian idea of an absolute time.

1.2 The principle of maximum proper time

Experiments show that time runs differently for different observers. We must therefore assign each observer their own time, their proper time, which is the time measured on their clock. We can now state the key principle that will allow us to compare the measurements of different observers:

The principle of maximum proper time: If two observers are separated and then meet again, the one that does not experience any acceleration always measures the longest proper time.

It says that proper time is maximized for inertial, i.e. unaccelerated, observers. There is plenty of experimental evidence to support this principle, such as the experiments with atomic clocks on planes, or the operation of GPS satellites which requires very precise time measurements. In this course we will take this principle as the starting point from which we will derive the theory of special relativity.

1.3 Spacetime

We are familiar with the fact that to specify the position of an object in our three dimensions we need to give three numbers: the coordinates with respect to some specified coordinate system. For positions on the earth we might for example give the longitude, the latitude and the height above sea level. To specify an event, i.e. something happening at a certain place at a certain instant of time, we must give one more number, namely the time on a clock associated to the coordinate system. In our example this could be the time GMT.
We have argued that we must allow each observer to measure distances and times using their own coordinate system defined by their ruler and clock. Each observer will therefore associate to a given event four numbers $(t, x, y, z)$, the spacetime coordinates relative to their coordinate system. Note that we are defining an event here in an idealized way as a single point in spacetime, i.e. something that happens at a point in space at a single instant of time. The set of all events makes up the four-dimensional spacetime. Note that each observer will (in general) assign different coordinates to the same event because they are using different coordinate systems; there is no preferred coordinate system in spacetime. One of our first tasks will therefore be to understand how to relate the observations of different observers.

1.4 Worldlines

The trajectory of an object traces out a continuous path in spacetime, called a worldline (really a "worldtube" if the object is not point-like, but this distinction won't be very important to us). In ordinary Euclidean space we are familiar with the fact that there is a shortest path between any two points. This path is called a straight line. It is the path an object follows if it is not acted upon by any external forces, i.e. it is unaccelerated. Similarly, we will assume that there is precisely one straight line connecting any two events in spacetime and that any object not acted on by external forces, i.e. not experiencing any acceleration, follows such a straight worldline.

To a worldline connecting two events in spacetime we can associate a number: the proper time along that worldline. Recall that this is the time an observer traveling along the worldline measures on her clock between the two events. The principle of maximum proper time says that a straight worldline corresponds to the longest proper time.
Therefore the analog of shortest length in Euclidean space is longest proper time in spacetime, and a clock can be thought of as measuring distances in spacetime. When we draw spacetime diagrams we will draw the worldlines of unaccelerated objects as straight lines. Curved lines will correspond to worldlines of accelerated objects (Figure 1.1).

Figure 1.1: Spacetime diagram showing an accelerated (curved worldline) and an unaccelerated (straight worldline) observer meeting at events A and B. The proper times measured on their respective clocks between the meetings are $\tau'_{AB}$ (accelerated) and $\tau_{AB}$ (unaccelerated). The principle of maximum proper time then says that $\tau_{AB} > \tau'_{AB}$.

An important notion in Euclidean geometry is the notion of two lines being parallel. In spacetime we can similarly have the notion of two observers being on the same course. How can two observers, e.g. two spaceships traveling in outer space, determine whether they are on the same course? One way to do this uses a construction from Euclidean space adapted to spacetime. Imagine that the two observers each send out a probe fitted with a clock, which travels freely until it is picked up by the other observer at some later time (Figure 1.2). If the two probes happen to meet halfway, i.e. after half of the proper time (from being emitted to being picked up) has elapsed on each clock, then we will say that the observers are on the same course, or that their worldlines are parallel. From the figure we see that this also implies that the lines AB and CD (not drawn) are parallel. Note that to carry out the experiment we really need to send clocks that also have a recording device that records the time they were sent and the time they met. We would also need to do the experiment several times to get them to meet halfway.
Figure 1.2: The worldlines of two observers are parallel if they can send out probes, at A and B, that meet halfway before encountering the other observer at D and C. Each half of a probe's journey is labeled with proper time $\tau/2$.

Notice that this construction does not refer to space or time separately, only to the full spacetime picture or the proper time measured by a particular clock. This is in sharp contrast to how we would describe such an experiment in Newtonian physics.

Just like in Euclidean space the line AC in Figure 1.2 defines a vector, which we can draw as an arrow starting at A and ending at C. We declare the length of the vector to be given by the proper time elapsed from A to C. The construction in the figure gives us a way to parallel transport vectors, i.e. to move a vector while keeping it parallel to itself. The vector AC can be parallel transported to the vector BD. Taking the two worldlines to approach each other we obtain the special case of parallel transport of the vector along the worldline. A general parallel transport is obtained by a sequence of such "elementary" parallel transports.

We will now make a very important assumption. We will assume that the vector one obtains by such a sequence of elementary parallel transports from a point A to a point A′ in spacetime does not depend on how one chooses the sequence of parallel transports, i.e. it does not depend on the path taken. This assumption is actually not true close to gravitating bodies and in that case one must use the more advanced theory of general relativity. The assumption is true if gravity is very weak, which is the case we consider in this course. In this case we are working with special relativity. In fact, the change of a vector under parallel transport is directly related to the curvature of a space. In special relativity spacetime is flat, while in general relativity it can be curved.
Chapter 2 Spacetime vectors

In the last chapter we defined a vector on the straight worldline of an observer as an arrow from one event to a later event on the worldline, with length given by the proper time elapsed between the two events. The notion of vectors is familiar from Euclidean space and we will use the same notation $\bar v$ for a vector in spacetime. Such vectors are often called four-vectors since spacetime is four-dimensional. Just as any point in Euclidean space $\mathbb{R}^3$ can be associated with a vector going from the origin to that point, any event in spacetime can be associated to a spacetime vector from an origin (which we can choose as we please) to the point in question.

We have also seen that we can move vectors around using parallel transport. Two vectors related by parallel transport will be considered the same vector (this is consistent since we are assuming that the vector obtained by parallel transport is independent of the path taken). Spacetime vectors obey the usual axioms familiar from Euclidean space:

• Commutativity of addition: $\bar u + \bar v = \bar v + \bar u$
• Associativity of addition: $\bar u + (\bar v + \bar w) = (\bar u + \bar v) + \bar w$
• Identity element of addition: $\bar v + \bar 0 = \bar v$
• Inverse element of addition: Given $\bar v$ there exists a vector $-\bar v$ such that $\bar v + (-\bar v) = \bar 0$
• Compatibility of scalar multiplication: $a(b\bar v) = (ab)\bar v$ for $a, b \in \mathbb{R}$
• Identity element of scalar multiplication: $1\bar v = \bar v$
• Distributivity of scalar multiplication with respect to vector addition: $a(\bar u + \bar v) = a\bar u + a\bar v$
• Distributivity of scalar multiplication with respect to addition: $(a + b)\bar v = a\bar v + b\bar v$

Addition of spacetime vectors can be done by the geometric construction familiar from Euclidean space, which is illustrated in Figure 2.1.

Figure 2.1: Geometric addition of the vectors $\bar u$ and $\bar v$ producing a third vector $\bar u + \bar v$.

Recall that a basis for a vector space is a set of linearly independent vectors $\bar v_i$ with i = 1, . . .
, n which span the space, so that any vector is expressed uniquely as a linear combination

\[ a_1\bar v_1 + a_2\bar v_2 + \ldots + a_n\bar v_n , \tag{2.1} \]

for some numbers $a_i \in \mathbb{R}$ with $i = 1, \ldots, n$. The vector can be denoted in this basis as $(a_1, a_2, \ldots, a_n)$ and $n$ is called the dimension of the vector space. A basis of spacetime vectors consists of four linearly independent spacetime vectors.

2.1 Inner product

An important notion in linear algebra is that of the inner product between two vectors. Given two vectors their inner product is a real number. We will denote the inner product with a dot, e.g. $\bar u\cdot\bar v$ denotes the inner product between vectors $\bar u$ and $\bar v$. The inner product satisfies the following standard axioms:

• Symmetry: $\bar u\cdot\bar v = \bar v\cdot\bar u$
• Linearity: $(a\bar u)\cdot\bar v = a(\bar u\cdot\bar v)$ and $(\bar u + \bar v)\cdot\bar w = \bar u\cdot\bar w + \bar v\cdot\bar w$
• Non-degeneracy: If $\bar u\cdot\bar v = 0$ for all vectors $\bar v$ then $\bar u = \bar 0$

Often the inner product is required to be positive definite, so that $\bar u^2 = \bar u\cdot\bar u \geq 0$, which is a stronger requirement than being non-degenerate. This is the case in Euclidean space where we are used to identifying $\bar u^2$ with the length-squared of a vector, which is obviously positive. We will see below that it is not possible to require this for spacetime vectors. Instead, for a spacetime vector that goes along the straight worldline of an object from point A to point B we will take

\[ \bar u^2 = -\tau^2 , \tag{2.2} \]

where $\tau$ is the proper time along the worldline from A to B. The minus sign seems strange at this point but we will see shortly that it is needed if we want vectors representing lengths in space to have positive square. All the differences between Euclidean space and spacetime are due to the fact that the inner product in spacetime is not positive definite. As we will see this is what makes it possible to separate the time-direction from the spatial directions.
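As a small numerical illustration of these axioms and of the sign convention in (2.2), here is a minimal Python sketch. It rests on an assumption that the notes have not yet derived at this point: that spacetime vectors can be described by components (t, x, y, z) relative to one inertial observer, with the inner product taking the form −t·t′ + x·x′ + y·y′ + z·z′. The helper names `dot` and `square` are hypothetical and introduced only for this example.

```python
# Sketch: component form of the spacetime inner product.
# Assumption (not derived in the text up to this point): vectors are given by
# components (t, x, y, z) relative to one inertial observer, with time and
# length measured in the same unit and signature (-, +, +, +).

def dot(u, v):
    """Inner product: -t_u*t_v + x_u*x_v + y_u*y_v + z_u*z_v."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def square(u):
    """The 'square' of a vector, i.e. its inner product with itself."""
    return dot(u, u)

# A straight worldline segment along which 3 units of proper time elapse,
# described by an observer travelling along it.
u = (3.0, 0.0, 0.0, 0.0)
print(square(u))   # -tau**2 with tau = 3

# A purely spatial displacement has positive square.
r = (0.0, 2.0, 0.0, 0.0)
print(square(r))

# Symmetry and linearity still hold even though the product is not
# positive definite.
v = (5.0, 1.0, -2.0, 0.5)
w = (1.0, 0.0, 4.0, 2.0)
a = 2.5
scaled = tuple(a * c for c in u)
summed = tuple(p + q for p, q in zip(u, v))
assert dot(u, v) == dot(v, u)
assert abs(dot(scaled, v) - a * dot(u, v)) < 1e-12
assert abs(dot(summed, w) - (dot(u, w) + dot(v, w))) < 1e-12
```

The vector (1, 1, 0, 0) has square zero in this component form without being the zero vector, which is one concrete way to see that the product cannot be positive definite.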
The assumption that $\bar u^2 = -\tau^2$ for vectors corresponding to a segment of a straight worldline also determines the inner product $\bar u\cdot\bar v$ of two straight worldline vectors $\bar u$, $\bar v$. To see this consider three such worldline vectors related by

\[ a\bar u = \bar v + \bar w , \tag{2.3} \]

for some non-zero $a \in \mathbb{R}$. Writing this as $\bar w = a\bar u - \bar v$ and squaring both sides we get

\[ \bar w^2 = (a\bar u - \bar v)\cdot(a\bar u - \bar v) = a^2\bar u^2 - 2a\,\bar u\cdot\bar v + \bar v^2 . \tag{2.4} \]

Rearranging this we have

\[ \bar u\cdot\bar v = \frac{1}{2a}\left(a^2\bar u^2 + \bar v^2 - \bar w^2\right) . \tag{2.5} \]

The right-hand side involves only squares of vectors, which are expressed in terms of the corresponding proper times. Therefore we see that the inner product $\bar u\cdot\bar v$ is also determined in terms of the proper times corresponding to the lengths of the vectors $\bar u$, $\bar v$, $\bar w$.

It is important to understand that the assumption that there exists an inner product for spacetime vectors satisfying the above axioms is not a trivial statement. The mere existence of this inner product has physical consequences. To see this consider the identity

\[ (\bar u + \bar v)^2 + (\bar u - \bar v)^2 = 2\bar u^2 + 2\bar v^2 . \tag{2.6} \]

Let's assume that all these vectors are part of straight worldlines of observers. Since the expression contains only squares it only involves the proper times measured by these observers. With four spaceships traveling along these straight worldlines it is then possible to arrange an experiment (see Figure 2.2) to test whether the proper times they measure satisfy the above identity, i.e. whether $\tau_1^2 + \tau_2^2 = 2\tau_3^2 + 2\tau_4^2$. One finds that it is indeed satisfied.

Figure 2.2: Experiment involving four spaceships to test equation (2.6).

2.2 Timelike, Spacelike and Null

Let $\bar u$, $\bar v$ be two straight worldline vectors. Then

\[ \bar u^2 = -\tau_u^2 , \qquad \bar v^2 = -\tau_v^2 . \tag{2.7} \]

We now define a new vector which is a linear combination of $\bar u$ and $\bar v$,

\[ \bar y = a\bar u + b\bar v , \tag{2.8} \]

for some $a$, $b$. Its inner product with $\bar u$ is $\bar u\cdot\bar y = a\bar u^2 + b\,\bar u\cdot\bar v$ .
(2.9)

Taking $a = -b\,\dfrac{\bar u\cdot\bar v}{\bar u^2}$ we find

\[ \bar u\cdot\bar y = 0 . \tag{2.10} \]

We say that $\bar y$ is orthogonal to $\bar u$.

Now consider the following situation where two spaceships part and then meet again:

(Diagram: the worldlines of ship 1 and ship 2 between parting and meeting again.)

Spaceship 1 is unaccelerated throughout the duration of its journey, while spaceship 2 travels unaccelerated for a while, then accelerates hard for a short time to reverse its direction of motion and then again floats freely until it meets spaceship 1 again. We will take the spacetime vectors corresponding to this situation as in Figure 2.3 with

\[ \bar v_1 = \tfrac{1}{2}\bar u + \epsilon\bar y , \qquad \bar v_2 = \tfrac{1}{2}\bar u - \epsilon\bar y , \tag{2.11} \]

where $\bar y$ is the vector introduced above which is orthogonal to $\bar u$ and $\epsilon$ is a small number. Note that $\bar v_1 + \bar v_2 = \bar u$ so that the spaceships indeed meet at the end.

Figure 2.3: Spacetime vectors corresponding to two spaceships parting and meeting again.

The proper time for the journey of ship 1 is

\[ \tau_1 = \sqrt{-\bar u^2} . \tag{2.12} \]

The proper time for the journey of ship 2 is the sum of the proper times for the two segments of the journey,

\[ \tau_2 = \sqrt{-\bar v_1^2} + \sqrt{-\bar v_2^2} . \tag{2.13} \]

Since $\bar u\cdot\bar y = 0$ we have $\bar v_1^2 = \tfrac{1}{4}\bar u^2 + \epsilon^2\bar y^2 = \bar v_2^2$ so that

\[ \tau_2 = 2\sqrt{-\tfrac{1}{4}\bar u^2 - \epsilon^2\bar y^2} = \sqrt{-\bar u^2}\,\sqrt{1 + \frac{4\epsilon^2\bar y^2}{\bar u^2}} = \tau_1\sqrt{1 - \frac{4\epsilon^2\bar y^2}{\tau_1^2}} . \tag{2.14} \]

The principle of maximum proper time says that the unaccelerated observer measures the longest proper time, i.e. $\tau_1 > \tau_2$ (we assume $\bar y \neq \bar 0$). This in turn implies that

\[ \sqrt{1 - \frac{4\epsilon^2\bar y^2}{\tau_1^2}} < 1 \quad\Rightarrow\quad \bar y^2 > 0 , \tag{2.15} \]

i.e. the vector $\bar y$ has positive square. This result was obtained assuming $\bar u^2 = -\tau_1^2 < 0$. If we had decided instead to take the opposite convention, i.e. $\bar u^2 = +\tau_1^2 > 0$, the same calculation would give $\bar y^2 < 0$. We see that, in contrast to what we are used to from Euclidean space, it is not possible for all spacetime vectors to have positive square. This fact follows from the principle of maximum proper time.
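The round trip in (2.11) to (2.14) can be checked with concrete numbers. The following Python sketch assumes a component description (t, x) with inner product −t·t′ + x·x′, which is an illustration device rather than something the notes have introduced at this point; the particular values of $\bar u$, $\bar y$ and $\epsilon$ are arbitrary choices, constrained only by the requirement that both legs of ship 2 remain timelike.

```python
import math

# Sketch of the round trip in (2.11)-(2.14).
# Assumptions: components (t, x) in one time and one space dimension with
# signature (-, +); the numbers below are illustrative only.

def dot(a, b):
    return -a[0] * b[0] + a[1] * b[1]

u = (10.0, 0.0)   # ship 1: straight worldline with tau_1 = 10
y = (0.0, 1.0)    # orthogonal to u, since dot(u, y) == 0
eps = 3.0         # sets how far ship 2 strays from ship 1

v1 = (0.5 * u[0] + eps * y[0], 0.5 * u[1] + eps * y[1])   # outbound leg
v2 = (0.5 * u[0] - eps * y[0], 0.5 * u[1] - eps * y[1])   # return leg

tau1 = math.sqrt(-dot(u, u))
tau2 = math.sqrt(-dot(v1, v1)) + math.sqrt(-dot(v2, v2))

# Closed form from (2.14): tau_2 = tau_1 * sqrt(1 - 4 eps^2 y^2 / tau_1^2)
tau2_formula = tau1 * math.sqrt(1.0 - 4.0 * eps**2 * dot(y, y) / tau1**2)

# tau1 is 10 and tau2 is 8 (up to floating-point rounding in the formula)
print(tau1, tau2, tau2_formula)
```

With these numbers the ship that turns around records 8 units of proper time against 10 for the ship that never accelerates, and the direct sum of the two legs agrees with the closed form.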
Clearly no observer can travel along $\bar y$ because then his clock would need to show an imaginary time, which is absurd.

Consider now the vector

\[ \bar w = c\bar u + d\bar y , \tag{2.16} \]

with $\bar u$ and $\bar y$ orthogonal as before. Squaring this we find

\[ \bar w^2 = c^2\bar u^2 + d^2\bar y^2 . \tag{2.17} \]

If we take $c^2 = -d^2\,\dfrac{\bar y^2}{\bar u^2}$ (note that the RHS is positive, which is consistent with $c$, $d$ being real numbers) we get $\bar w^2 = 0$! We conclude that there also exist spacetime vectors $\bar w \neq \bar 0$ such that $\bar w^2 = 0$.

To summarize we have learned that there are 3 classes of spacetime vectors:

• $\bar v^2 < 0$: Timelike
• $\bar v^2 > 0$: Spacelike
• $\bar v^2 = 0$: Null (light-like)

Vectors that are part of a straight worldline of an observer are timelike. We have seen above that the principle of maximum proper time implies that if $\bar u$ is timelike and $\bar u\cdot\bar y = 0$ then $\bar y$ is spacelike (or $\bar y = \bar 0$). This is a very useful result to remember when working with spacetime vectors:

\[ \bar u \text{ timelike and } \bar u\cdot\bar v = 0 \quad\Rightarrow\quad \bar v \text{ spacelike (or } \bar v = \bar 0\text{)}. \]

Chapter 3 Simultaneity and spatial distance

An observer traveling along in a spaceship only has direct access to the interior of the spaceship. Nevertheless they must be able to make statements and inferences about what happens in the outside world. To be able to do this they need in particular to be able to say when an event, which is not on their worldline, occurred. Another way to say it is that they need to have a way to determine whether an event far away is simultaneous with an event on their worldline, e.g. a supernova explosion far away happens when their clock shows 10:23.

The natural way to do this is via the construction in figure 3.1. The observer sends out a probe, which travels on a straight worldline to the event and on a straight worldline back. She arranges it so that the probe reaches the event precisely when half the proper time of its journey has elapsed. Then she will say that the event on her worldline halfway between sending out and receiving the probe is simultaneous with the distant event.
From the figure we have

\[ \tau^2 = -(\bar v + \bar r)^2 = -(\bar v - \bar r)^2 \quad\Rightarrow\quad \bar v\cdot\bar r = 0 , \tag{3.1} \]

so that $\bar r$, being orthogonal to a timelike vector, must be spacelike.

Figure 3.1: Via this construction the observer decides that $P_2$, halfway between $P_1$ and $P_3$, is simultaneous with $P$.

We also need a way to measure spatial distances. To see how to do this let us consider a family of straight parallel worldlines $L_0, L_1, \ldots$ defined by the equation

\[ \bar R_n = \lambda_n\bar u + n\bar\rho , \qquad n = 0, 1, 2, \ldots \tag{3.2} \]

where $\bar u$, $\bar\rho$ are timelike vectors and $\lambda_n \in \mathbb{R}$ parametrizes a point on the $n$'th worldline, $L_n$. This is illustrated in figure 3.2. This could be a fleet of identical spaceships traveling unaccelerated and arranged head to tail. Consider now an observer traveling from the front of the fleet to the back, counting how many ships he passes. This number is a measure of how far it is from the head of the fleet to the tail. The distance is expressed in units of "standard spaceship".

There is an alternative way to measure this distance. We first note that there is only one vector going from $L_0$ to $L_n$ with the property that it is orthogonal to $\bar u$.¹ It is given by

\[ \bar r_n = n\bar\rho - n\,\frac{\bar\rho\cdot\bar u}{\bar u^2}\,\bar u = n\left(\bar\rho - \frac{\bar\rho\cdot\bar u}{\bar u^2}\,\bar u\right) . \tag{3.3} \]

Note that $\bar r_n$, and therefore also its magnitude, is proportional to $n$, the number of spaceships. We can therefore use the magnitude $\bar r_n^2$ as a measure of the distance. All we need to do is work out the conversion factor to go between $\bar r_n^2$ and the number of spaceships. Looking at figure 3.1, and using $\bar v^2 = -t^2$ together with $\bar v\cdot\bar r = 0$, we read off

\[ \tau^2 = -(\bar v + \bar r)^2 = t^2 - \bar r^2 , \quad\text{or}\quad \bar r^2 = t^2 - \tau^2 . \tag{3.4} \]

The advantage of this method is that we don't need the fleet of spaceships (other than to fix the unit of distance). Later we will find an even more practical way to measure distance.

How we pick the unit of distance is up to us. Nothing prevents us from choosing units such that $\sqrt{\bar r^2}$ itself is the distance.
This is in fact the most natural choice to make and we will stick to it in this course. From (3.4) we see that now space and time acquire the same dimensions. In the theory of relativity this is as natural as, say, height and width having the same dimensions and being measured in the same units.

¹ Proof: It is clear that at least one such vector exists. Assume there are two such vectors $\bar r_1$, $\bar r_2$. We may assume their foot-point is the same point on $L_0$. The fact that $\bar u\cdot\bar r_1 = \bar u\cdot\bar r_2 = 0$ implies $\bar u\cdot(\bar r_1 - \bar r_2) = 0$. But $\bar r_1 - \bar r_2 = \lambda\bar u$ for some $\lambda$ and the previous equation implies $\lambda = 0$ so that $\bar r_1 = \bar r_2$.

Figure 3.2: A family of parallel worldlines $L_0, L_1, L_2, L_3, \ldots$ with direction $\bar u$, neighboring worldlines being separated by $\bar\rho$.

3.1 Orthogonal space

To every unaccelerated observer there corresponds a straight worldline. Such a worldline is characterized by a timelike vector which we can normalize to a unit vector $\hat u$. We will always use a 'hat' to denote unit vectors. A timelike unit vector satisfies $\hat u^2 = -1$ and a spacelike unit vector $\hat r^2 = 1$.

Given an observer with worldline unit vector $\hat u$ there exist orthogonal vectors $\bar r$,

\[ \hat u\cdot\bar r = 0 . \tag{3.5} \]

They form the orthogonal space to the observer's worldline. This is a vector space since linear combinations of such vectors clearly belong to the space. In fact, since all such $\bar r$ are spacelike (or zero), it is a Euclidean vector space. Since we are imposing one condition on the four components of $\bar r$ the orthogonal space is three-dimensional. Recall that $\sqrt{\bar r^2}$ is the (spatial) distance from the observer. The orthogonal space is the space used in Newtonian physics. The difference is that in the theory of relativity each observer has their own orthogonal space.
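To make the orthogonal space more concrete, the sketch below removes from a few sample vectors their component along a timelike unit vector, using the same projection that appears in (3.3), and checks that what remains is orthogonal to $\hat u$ and has non-negative square. It assumes a component description (t, x, y, z) with signature (−, +, +, +); the helper names and the sample vectors are hypothetical choices made for this example.

```python
# Sketch: building vectors in an observer's orthogonal space.
# Assumptions: components (t, x, y, z) with signature (-, +, +, +); the helper
# names and the sample vectors are illustrative, not taken from the notes.

def dot(a, b):
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))

def orthogonal_part(R, u):
    """Remove from R its component along u, as in (3.3): R - (R.u / u.u) u."""
    k = dot(R, u) / dot(u, u)
    return tuple(Rc - k * uc for Rc, uc in zip(R, u))

u_hat = (1.25, 0.75, 0.0, 0.0)   # timelike unit vector: dot(u_hat, u_hat) == -1

samples = [
    (1.0, 0.0, 0.0, 0.0),        # timelike
    (0.0, 1.0, 0.0, 0.0),        # spacelike
    (2.0, -1.0, 3.0, 0.5),       # spacelike
    (1.0, 1.0, 0.0, 0.0),        # null
]

for R in samples:
    r = orthogonal_part(R, u_hat)
    # r is orthogonal to u_hat and has non-negative square (spacelike or zero)
    print(r, round(dot(u_hat, r), 12), round(dot(r, r), 12))
```

Whatever the character of the starting vector (timelike, spacelike or null), the part orthogonal to $\hat u$ comes out spacelike or zero, in line with the statement that the orthogonal space is Euclidean.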
Given the worldline of an observer with direction $\hat u$, we can split any spacetime vector $\bar R$ into a component along $\hat u$ and a component orthogonal to it as

\[ \bar R = t\hat u + \bar r \quad\text{with}\quad \hat u\cdot\bar r = 0 . \tag{3.6} \]

This is illustrated in figure 3.3. According to the figure, an observer following the worldline $L$ measures the event corresponding to $\bar R$ to happen at a time $t$ and spatial position $\bar r$. The distance to the event is $\sqrt{\bar r^2}$. From the equation we find $t = -\hat u\cdot\bar R$ and $\bar r = \bar R + (\hat u\cdot\bar R)\hat u$. We see that $t$ and $\bar r$ are uniquely fixed in terms of $\hat u$ and $\bar R$.

Figure 3.3: Split of a spacetime vector $\bar R$ with respect to a timelike direction $\hat u$.

3.2 Linearly independent vectors

Consider four spacetime vectors

\[ \bar A, \ \bar B, \ \bar C, \ \bar D . \tag{3.7} \]

They are linearly independent if none of them can be expressed as a linear combination of the others, or equivalently if the equation

\[ a\bar A + b\bar B + c\bar C + d\bar D = \bar 0 \tag{3.8} \]

has only the trivial solution $a = b = c = d = 0$. Note that since spacetime is four-dimensional we cannot have more than four linearly independent vectors.

In Euclidean space we are used to two orthogonal vectors being linearly independent. This is not true in spacetime, e.g. a null vector is orthogonal to itself but clearly not linearly independent of itself. What is true is that if $\bar v\cdot\bar u = 0$ for all $\bar u$ then $\bar v = \bar 0$. To see this take $\bar u$ timelike. We conclude that $\bar v$ must be spacelike, but since it is orthogonal to all other spacelike vectors it must vanish since these form a Euclidean vector space.

To test if $\bar A$, $\bar B$, $\bar C$, $\bar D$ are linearly independent we form the determinant of the matrix of inner products

\[
\begin{vmatrix}
\bar A\cdot\bar A & \bar A\cdot\bar B & \bar A\cdot\bar C & \bar A\cdot\bar D \\
\bar B\cdot\bar A & \bar B\cdot\bar B & \bar B\cdot\bar C & \bar B\cdot\bar D \\
\bar C\cdot\bar A & \bar C\cdot\bar B & \bar C\cdot\bar C & \bar C\cdot\bar D \\
\bar D\cdot\bar A & \bar D\cdot\bar B & \bar D\cdot\bar C & \bar D\cdot\bar D
\end{vmatrix}
\tag{3.9}
\]

The vectors are linearly dependent if and only if this determinant vanishes. To see this assume they are linearly dependent.
Then (3.8) holds for some non-zero coefficients. Taking this linear combination of rows in the matrix we obtain a row of zeros so that the determinant vanishes. Conversely, if the determinant vanishes there exists a linear combination of the rows that gives zero. This means that there exists a vector $a\bar A + b\bar B + c\bar C + d\bar D$, with $a, b, c, d$ not all zero, which is orthogonal to $\bar A$, $\bar B$, $\bar C$, $\bar D$. Assuming they are linearly independent leads to a contradiction, since this vector would then be orthogonal to all vectors and would therefore have to vanish; therefore they must be linearly dependent. Note that this test works only for four vectors. It does not work for lower-dimensional subspaces of spacetime.

Chapter 4 Velocity and light signals

Consider an observer and an object following straight worldlines $L$ and $L'$ respectively, figure 4.1. The observer finds the object's position at a time $t$ to be given by $\bar r$, with $\bar r = \bar 0$ at $t = 0$. He assigns the object the velocity

\[ \bar v = \frac{d\bar r}{dt} = \frac{\bar r}{t} , \tag{4.1} \]

where the last equality follows from the fact that they are both following straight worldlines, so $\bar r$ is proportional to $t$.

Figure 4.1: Observer $L$ describes the object $L'$ as having position $\bar r$ at time $t$.

This velocity clearly depends on the observer, since $\bar r$ and $t$ refer to the observer. For this reason it is called the relative velocity of the observer and the object. Note that $\bar v$ is orthogonal to $\hat u$, i.e. belongs to the observer's orthogonal space. In particular this means that the relative velocity is spacelike.

The unit vector along the object's worldline $\hat v$ tells us how the relative velocity $\bar v$ is directed relative to the direction of the observer's worldline, $\hat u$. We call $\hat v$ the four-velocity of the object. Note that any timelike unit vector (pointing forwards in time) can be a four-velocity, since it could be the direction of an object's worldline.

4.1 Standard velocity split

Let $\hat v$ be the four-velocity of the object and $\hat u$ that of the observer.
The object's spacetime position is given by $\bar R = \tau\hat v$ and from figure 4.1 we see that

\[ \tau\hat v = t\hat u + \bar r = t(\hat u + \bar v) , \tag{4.2} \]

where we used the definition of the relative velocity in (4.1). Dividing by $\tau$ we can write this as

\[ \hat v = \gamma(\hat u + \bar v) \quad\text{where}\quad \gamma = \frac{t}{\tau} \quad\text{and}\quad \hat u\cdot\bar v = 0 . \tag{4.3} \]

This formula for the split of the four-velocity of the object into the four-velocity of the observer and the relative velocity is very useful and has many applications. For example, squaring this equation gives

\[ 1 = \gamma^2(1 - v^2) \quad\text{or}\quad \frac{t}{\tau} = \gamma = \frac{1}{\sqrt{1 - v^2}} , \tag{4.4} \]

where $v^2 = \bar v^2$, the relative velocity squared. Note that $v^2 = 1 - \frac{\tau^2}{t^2} \leq 1$. Equation (4.4) is the famous time dilation formula. Since $t = \gamma\tau$ and $\gamma \geq 1$ the observer sees the object's clock slowed down by a factor of $1/\gamma$.

Taking the inner product of (4.3) with $\hat u$ we find

\[ \hat u\cdot\hat v = -\gamma = -\frac{1}{\sqrt{1 - v^2}} . \tag{4.5} \]

Notice that this implies that the definition of the relative velocity $v$ is symmetric between the observer and object. The object judges the observer to have the same velocity as the observer assigns to the object.

4.2 Light signals

So far we have not discussed null lines, which are on the border between timelike and spacelike lines. Now let us assume that $L'$ in figure 4.1 is such a null line. Then

\[ \tau^2 = -\bar R^2 = 0 , \tag{4.6} \]

but it is still true that the object's position is given by

\[ \bar R = t\hat u + \bar r . \tag{4.7} \]

Squaring this we get

\[ -t^2 + \bar r^2 = 0 \quad\text{or}\quad v^2 = 1 . \tag{4.8} \]

Note that this result is independent of the observer. All observers would measure the relative velocity of such a signal to have magnitude $v = 1$.

It does not follow from the theory of relativity itself that things following such null lines exist. Note that it would be meaningless to assign them a clock since it would not tick ($\tau = 0$ along such a line). Nevertheless, experience tells us that there exist signals in nature that can travel along null lines. Light (electromagnetic radiation) is the most important example, and $v = 1$ is called the "velocity of light".
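The velocity split (4.3) to (4.5) and the observer-independence of (4.8) can be illustrated numerically. The Python sketch below assumes components (t, x) relative to the observer $\hat u$, with signature (−, +) and units where the velocity of light is 1; the value 0.6 for the relative velocity and the particular null vector are arbitrary illustrative choices. The split used for the second observer is the one given in (3.6).

```python
import math

# Sketch of the velocity split (4.3)-(4.5) and of the null case (4.8).
# Assumptions: components (t, x) with signature (-, +), units where the
# velocity of light is 1; the numbers are illustrative.

def dot(a, b):
    return -a[0] * b[0] + a[1] * b[1]

u_hat = (1.0, 0.0)            # observer's four-velocity
v = 0.6                       # relative velocity of the object
gamma = 1.0 / math.sqrt(1.0 - v**2)

# v_hat = gamma * (u_hat + v_bar), with v_bar = (0, v) orthogonal to u_hat
v_hat = (gamma * 1.0, gamma * v)

print(gamma)                  # t / tau, the time dilation factor
print(dot(v_hat, v_hat))      # close to -1: v_hat is a timelike unit vector
print(dot(u_hat, v_hat))      # equals -gamma, as in (4.5)

# Null case: split the same null vector with respect to the moving observer
# v_hat, using t = -v_hat.n and r_bar = n - t * v_hat from (3.6).
n = (2.0, 2.0)                               # null: dot(n, n) == 0
t_prime = -dot(v_hat, n)
r_prime = (n[0] - t_prime * v_hat[0], n[1] - t_prime * v_hat[1])
speed = math.sqrt(dot(r_prime, r_prime)) / t_prime
print(speed)                                 # close to 1 for this observer too
```

For $v = 0.6$ the factor $\gamma$ is 1.25, and the signal along the null vector has unit speed both for $\hat u$ (where $r/t = 2/2$) and for the moving observer $\hat v$.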
(More generally, as we will see later, a particle follows a null worldline if and only if it is massless. The quantum of light is a massless particle called the photon.)

The existence of such signals is of great practical importance. For example, recall that to determine simultaneity we used the setup in figure 3.1. The observer has to arrange the situation so that the proper times of the probe going to and from the event are the same. To achieve this in practice presents great difficulties. The existence of light, or more generally electromagnetic, signals solves this problem since, when such a signal is used instead of the probe, $\tau = 0$ always. Equivalently, the observer knows that $v = 1$ and can therefore calculate the distance directly from the time it takes the signal to go there and back.

Imagine an observer who sends out a flash of light at $t = 0$. The light pulse travels out in all directions with unit velocity, forming a sphere of light. To draw the corresponding spacetime diagram we must go down to two dimensions of space, where the light forms a circle traveling outwards from the observer, figure 4.2. In this three-dimensional spacetime picture the light forms a cone. For this reason it is referred to as the light-cone. It is the surface given by the equation

\[ r = t , \quad\text{where}\quad r = \sqrt{\bar r^2} . \tag{4.9} \]

What we have drawn is really only half of the light-cone, called the future light-cone. There is also the past light-cone given by

\[ r = -t . \tag{4.10} \]

It describes a sphere of light contracting towards the origin. The full light-cone is given by the equation

\[ r^2 = t^2 \tag{4.11} \]

and is illustrated in figure 4.3. Note that this equation is true for any observer, so the light-cone looks the same to any observer. Don't be misled by the picture, which might seem to suggest otherwise!

Figure 4.2: An observer's future light-cone. The circle of light is at distance $r$ at time $t$ and traveling out in all null directions, e.g.
$\bar n$.

4.3 Split of null vectors

As we have seen, any spacetime vector can be split with respect to some four-velocity $\hat u$ into a component along $\hat u$ and a component orthogonal to it. For a null vector $\bar n$ we get

$$\bar n = a(\hat u + \bar m)$$ (4.12)

but $\bar n^2 = 0$ implies $\bar m^2 = 1$ so that

$$\bar n = a(\hat u + \hat m),$$ (4.13)

proportional to the sum of a timelike and a spacelike unit vector. If a vector $\bar k$ is orthogonal to $\bar n$ it is either spacelike or proportional to $\bar n$.¹ In particular two orthogonal null vectors must be parallel.

¹ Proof: Writing $\bar k = b(\hat u + \bar r)$ we find the condition (we may assume $a, b \neq 0$) $\hat m\cdot\bar r = 1$. Since $\hat m$ and $\bar r$ belong to the orthogonal subspace they are Euclidean. Therefore we must have either $\sqrt{\bar r^2} > 1$, in which case $\bar k$ is spacelike, or $\sqrt{\bar r^2} = 1$ and $\bar r = \hat m$, in which case $\bar k = \frac{b}{a}\bar n$.

Figure 4.3: The full light-cone.

4.4 Future and past

Consider a timelike vector $\bar v$ at the origin. We can split it with respect to the four-velocity of an observer as

$$\bar v = t\hat u + \bar r, \qquad \hat u\cdot\bar r = 0.$$ (4.14)

Squaring we find

$$\bar v^2 = r^2 - t^2 < 0,$$ (4.15)

which implies that every timelike vector is pointing inside the light-cone (see figure 4.2). Replacing $\bar v$ with a spacelike vector we similarly find that every spacelike vector points outside the light-cone. Null vectors point along the light-cone.

Let $\bar u$, $\bar v$ be two timelike vectors with negative inner product

$$\bar u\cdot\bar v < 0.$$ (4.16)

Construct the linear combination

$$\bar w = a\bar u + b\bar v, \quad\text{with}\quad a, b \ge 0, \quad a + b \neq 0.$$ (4.17)

Squaring we find

$$\bar w^2 = a^2\bar u^2 + 2ab\,\bar u\cdot\bar v + b^2\bar v^2 < 0,$$ (4.18)

since no term is positive and they cannot all vanish, so that $\bar w$ is also timelike. By varying $a$, $b$ we can continuously go from the vector $\bar u$ to the vector $\bar v$ via only timelike vectors. This would be impossible if one was pointing into the future light-cone and the other into the past light-cone. We therefore conclude that two timelike vectors with negative inner product must be pointing inside the same part of the light-cone, i.e.
either both into the future light-cone or both into the past light-cone. It is easy to see that the same argument goes through if we take one timelike and one null vector. If instead $\bar u\cdot\bar v > 0$ then $\bar u$ and $-\bar v$ must point inside the same part of the light-cone, so that one of $\bar u$ and $\bar v$ is pointing to the past and the other to the future. Conversely, two timelike vectors pointing inside the same part (future/past) of the light-cone have $\bar u\cdot\bar v < 0$ (note that $\bar u\cdot\bar v$ cannot vanish). Again this goes through if one of them is null.

Let $\hat u$, $\hat v$ be timelike and future directed. Then

$$(\hat u + \hat v)^2 = -2 + 2\hat u\cdot\hat v < 0 \quad\text{and}\quad (\hat u + \hat v)\cdot\hat v = \hat u\cdot\hat v - 1 < 0,$$ (4.19)

from which we conclude that $\hat u + \hat v$ is also timelike and future directed. This must hold for any sum of timelike future directed vectors. An important consequence of this is that no spaceship (or other object) can reverse its four-velocity and travel to the past to arrive before it departed. Time travel is therefore impossible in the theory of relativity. Related to this, if two spaceships part and then meet again they will always agree that they parted before they met.

We have seen that timelike (and null) vectors can be divided into two classes: future directed and past directed. Note that no such division is possible for spacelike vectors.

Chapter 5 Lorentz transformation

In this chapter we will restrict to situations where all the spacetime vectors involved lie in a two-dimensional plane. In particular, this means that they can all be expressed in terms of two linearly independent vectors. Many problems one encounters are in fact of this type. In the cases of interest this plane contains a timelike future directed unit vector, which could be the four-velocity of an observer. Every vector can be split with respect to this four-velocity, figure 3.3. Note that this corresponds to a one-dimensional problem in Newtonian physics since there is only one spatial direction.
We are interested in how to relate the measurements of two different observers. Let us consider first the analogous situation in Euclidean space, with two sets of orthogonal vectors rotated with respect to each other. Thinking of this as two different coordinate systems we know how to relate them via the rotation angle. Consider now the corresponding situation in spacetime, illustrated in figure 5.1, which occurs often. We have two sets of orthogonal unit vectors $\hat u$, $\hat r$ with $\hat u\cdot\hat r = 0$ and $\hat v$, $\hat s$ with $\hat v\cdot\hat s = 0$ and $\hat u^2 = \hat v^2 = -1$, $\hat r^2 = \hat s^2 = 1$. Obviously $\hat u$ and $\hat r$ are linearly independent since one is timelike and one is spacelike. Since we are restricting to a two-dimensional plane we can express $\hat v$ and $\hat s$ in terms of $\hat u$ and $\hat r$. Up to a proportionality factor we have

$$\hat v \propto \hat u + \alpha\hat r, \qquad \hat s \propto \hat r + \alpha\hat u,$$ (5.1)

for some $\alpha \in \mathbb{R}$, where we have used the fact that $\hat v\cdot\hat s = 0$ to fix the form of $\hat s$. Let us compute the length of the vectors on the RHS to fix the normalization. We have

$$(\hat r + \alpha\hat u)^2 = -(\hat u + \alpha\hat r)^2 = 1 - \alpha^2$$ (5.2)

and therefore we must have

$$\hat v = \pm\frac{1}{\sqrt{1 - \alpha^2}}(\hat u + \alpha\hat r), \qquad \hat s = \pm\frac{1}{\sqrt{1 - \alpha^2}}(\hat r + \alpha\hat u),$$ (5.3)

so that $\hat s^2 = -\hat v^2 = 1$. Demanding that $\hat v \to \hat u$ and $\hat s \to \hat r$ as $\alpha \to 0$ we see that we need to pick the plus signs.

Figure 5.1: Two sets of orthogonal unit vectors in spacetime.

Recalling now the standard velocity split, eq. (4.3),

$$\hat v = \frac{1}{\sqrt{1 - v^2}}(\hat u + \bar v), \qquad \hat u\cdot\bar v = 0,$$ (5.4)

we read off $\alpha = \pm v$ with $v = \sqrt{\bar v^2}$, the magnitude of the relative velocity. We therefore have

$$\hat v = \gamma(\hat u + v\hat r), \qquad \hat s = \gamma(\hat r + v\hat u), \qquad \gamma = \frac{1}{\sqrt{1 - v^2}},$$ (5.5)

where we have absorbed the sign into $v$ so that $v$ is positive if the relative velocity is along $\hat r$ and negative if it is along $-\hat r$. This is the famous Lorentz transformation. It tells us how the measurements of two inertial observers are related. To see this consider an event specified by the spacetime vector $\bar R$. We can write it in two ways as

$$\bar R = t\hat u + x\hat r \quad\text{or}\quad \bar R = t'\hat v + x'\hat s.$$
(5.6)

Observer $u$ assigns coordinates $(t, x)$ to the event while observer $v$ assigns it coordinates $(t', x')$. Equating the two expressions for $\bar R$ and using the formula for the Lorentz transformation we find

$$t\hat u + x\hat r = t'\gamma(\hat u + v\hat r) + x'\gamma(\hat r + v\hat u)$$ (5.7)

or

$$t = \gamma(t' + vx'), \qquad x = \gamma(x' + vt').$$ (5.8)

This is the Lorentz transformation that relates the time and space measurements of the two observers.

5.1 Addition of velocities

In Euclidean space we can perform first one rotation and then another. The result is a third rotation. Similarly, we can perform first one Lorentz transformation with parameter (relative velocity) $v_1$ and then another with parameter $v_2$. This clearly gives a third Lorentz transformation. The only question is how the parameter of the third Lorentz transformation is related to $v_1$, $v_2$. To answer this question we note that the final timelike unit vector is proportional to

$$(\hat u + v_1\hat r) + v_2(\hat r + v_1\hat u) = (1 + v_1v_2)\hat u + (v_1 + v_2)\hat r.$$ (5.9)

From the equation $\hat v = \gamma(\hat u + v\hat r)$ we see that we can read off the relative velocity $v$ as the ratio of the $\hat r$-component and the $\hat u$-component. In this way we find

$$v = \frac{v_1 + v_2}{1 + v_1v_2}.$$ (5.10)

This is the relativistic formula for addition of velocities. It has been tested to high accuracy in experiments with light propagating through flowing liquids. Note that for velocities much smaller than the speed of light, $v_1, v_2 \ll 1$, we have $v \approx v_1 + v_2$, the Newtonian formula for addition of velocities. The relativistic formula guarantees that the relative velocity $v$ always remains less than the speed of light, $v < 1$ for $v_1, v_2 < 1$.

In the special case $v_2 = -v_1$ we get $v = 0$. This means that the inverse Lorentz transformation is obtained by changing the sign of $v$. This is easily verified directly by performing first a Lorentz transformation with parameter $v$ and then one with parameter $-v$.

5.2 Lorentz contraction

Consider a spaceship of length $\ell$. The observer on the spaceship has four-velocity $\hat u$. Another observer has four-velocity $\hat v$.
Both measure the length of the spaceship from their perspective, getting the answer $\ell$ and $\ell'$ respectively. The setup is illustrated in figure 5.2. From the figure we see that

$$\ell'\hat s - \ell\hat r \propto \hat u$$ (5.11)

and taking the inner product with $\hat r$ gives

$$\ell'\,\hat r\cdot\hat s - \ell = 0.$$ (5.12)

Using the Lorentz transformation $\hat s = \gamma(\hat r + v\hat u)$ we find $\hat r\cdot\hat s = \gamma$ so that

$$\ell'/\ell = 1/\gamma = \sqrt{1 - v^2} \le 1,$$ (5.13)

where $v$ is the relative velocity of the two observers. This is the famous Lorentz contraction. A moving object appears shortened, or contracted, in its direction of motion (relative to the observer). This effect is similar to slicing a sausage in that the length of the slice depends on the cutting angle. In that case the perpendicular slice has the shortest length, whereas in the spaceship case the observer at rest measures the longest length. Note that in this respect the Euclidean figure 5.2 is very misleading.

Figure 5.2: A spaceship stretching between $L_1$ and $L_2$, with length $\ell$, is observed by the observer following the worldline $L$ who measures its length to be $\ell'$.

Chapter 6 Waves

Sound waves are familiar from Newtonian physics. The pressure $P(\bar r, t)$ varies in space and with time around the mean pressure $P_0$. For a plane sound wave the variation, $p = P - P_0$, has the form

$$p = A\sin(-\omega t + \bar k\cdot\bar r + \phi_0).$$ (6.1)

The constants $A$ and $\phi_0$ are the amplitude and phase shift of the wave, while $\omega$ is the angular frequency and the three-vector $\bar k$ is the wave vector. We may assume that $\omega > 0$. The surfaces where $p$ takes a constant value, e.g. its maximum $A$, are 2-dimensional planes in our 3-dimensional space, see figure 6.1, which is the reason for the name plane waves. Sound waves are just one example. Many other types of waves exist in nature that can be

Figure 6.1: Plane wave in two space dimensions.
The wave fronts, surfaces of constant phase at say $t = 0$, given by $\bar k\cdot\bar r = \text{const}$, are straight parallel lines orthogonal to the wave vector $\bar k$. In three dimensions they are planes.

described in the same way. Consider now a plane wave in spacetime given by

$$\psi = A\sin(\bar K\cdot\bar R + \phi_0),$$ (6.2)

where now $\bar K$ is the wave four-vector and $\bar R$ is the four-vector of a spacetime point (event). To see that this expression makes sense we introduce an observer with four-velocity $\hat u$. We can then decompose the four-vectors involved with respect to this observer as

$$\bar R = t\hat u + \bar r, \qquad \bar r\cdot\hat u = 0,$$ (6.3)

$$\bar K = \omega\hat u + \bar k, \qquad \bar k\cdot\hat u = 0.$$ (6.4)

Using these expressions the wave takes the form

$$\psi = A\sin(-\omega t + \bar k\cdot\bar r + \phi_0),$$ (6.5)

which is the same as before, with $\omega$ the angular frequency and $\bar k$ the wave vector. Note that $\omega$ and $\bar k$ depend on the observer, since they are defined using his four-velocity. We have for example

$$\omega_u = -\hat u\cdot\bar K,$$ (6.6)

the angular frequency of the wave as measured by the observer $u$.

The superposition principle says that we can add together waves of the form (6.2) to form new waves. For physical waves the angular frequency is determined by the wave vector, $\omega = \omega(\bar k)$. The relation between $\omega$ and $\bar k$ is called the dispersion relation and depends on the type of wave. Let us consider the sum of two one-dimensional plane waves with similar wave numbers $k_1 = (1 - \epsilon)k_0$ and $k_2 = (1 + \epsilon)k_0$ with $\epsilon \ll 1$ and $\phi_0 = 0$,

$$\psi = \sin(-\omega_1 t + k_1 x) + \sin(-\omega_2 t + k_2 x).$$ (6.7)

We have

$$-\omega_1 t + k_1 x = -\omega(k_1)t + k_1 x \approx -\omega(k_0)t + k_0 x + \epsilon k_0\left(\frac{d\omega}{dk}t - x\right),$$ (6.8)

where we have Taylor expanded the function $\omega$ to first order in $\epsilon$. For $k_2$ we find the same expression but with the sign of $\epsilon$ changed. Therefore, using the formula for sin of a sum of angles and setting $\Delta = k_0\left(\frac{d\omega}{dk}t - x\right)$, we have

$$\begin{aligned}
\psi &= \sin(-\omega_0 t + k_0 x + \epsilon\Delta) + \sin(-\omega_0 t + k_0 x - \epsilon\Delta)\\
&= \sin(-\omega_0 t + k_0 x)\cos(\epsilon\Delta) + \cos(-\omega_0 t + k_0 x)\sin(\epsilon\Delta)\\
&\quad + \sin(-\omega_0 t + k_0 x)\cos(-\epsilon\Delta) + \cos(-\omega_0 t + k_0 x)\sin(-\epsilon\Delta)\\
&= 2\cos\left[\epsilon k_0\left(\frac{d\omega}{dk}t - x\right)\right]\sin(-\omega_0 t + k_0 x).
\end{aligned}$$
(6.9)

This is a plane wave with wave number $k_0$ and frequency $\omega_0 = \omega(k_0)$ but with an amplitude which is modulated by the cosine factor. The cosine factor describes the envelope of the wave packet. The envelope of the wave packet depends on the position through $x - \frac{d\omega}{dk}t$ and therefore travels with the group velocity

$$v = \frac{d\omega}{dk}.$$ (6.10)

This is the velocity of signals sent with such waves. For light we should clearly have $v = 1$ and indeed light (electromagnetic radiation) has the dispersion relation

$$\omega(k) = k = \sqrt{\bar k^2}.$$ (6.11)

Equivalently, we have for light that

$$\bar K^2 = (\omega\hat u + \bar k)^2 = -\omega^2 + \bar k^2 = 0,$$ (6.12)

i.e. light rays have a wave four-vector which is null. This means that for light (electromagnetic radiation) we can write the wave four-vector as

$$\bar K = \omega(\hat u + \hat k), \qquad \hat k\cdot\hat u = 0,$$ (6.13)

where $\hat k$ is a unit vector describing the direction of the wave.

6.1 Doppler shift and aberration

Consider two observers with four-velocities $\hat u$ and $\hat v$ observing a light wave ($\bar K^2 = 0$). Splitting vectors with respect to the first observer we have

$$\hat v = \gamma(\hat u + \bar v), \qquad \bar v\cdot\hat u = 0,$$ (6.14)

the standard velocity split, and

$$\bar K = \omega_u(\hat u + \hat k_u), \qquad \hat k_u\cdot\hat u = 0.$$ (6.15)

We have put a subscript $u$ to emphasize that $\omega_u$ and $\hat k_u$ are the quantities measured by observer $u$. Since $\bar v$ and $\hat k_u$ are in the orthogonal space to $\hat u$ they can be treated as ordinary Euclidean vectors and we can write

$$\bar v\cdot\hat k_u = v\cos\theta_u,$$ (6.16)

where $\theta_u$ is the angle between the relative velocity $\bar v$ of observers $u$ and $v$ and the direction the light is traveling, given by $\hat k_u$, as measured by observer $u$. The angular frequency measured by observer $v$ becomes

$$\omega_v = -\hat v\cdot\bar K = -\omega_u\gamma(-1 + \bar v\cdot\hat k_u) = \omega_u\gamma(1 - v\cos\theta_u),$$ (6.17)

so that

$$\frac{\omega_v}{\omega_u} = \frac{1 - v\cos\theta_u}{\sqrt{1 - v^2}}.$$ (6.18)

This is the formula for the Doppler shift. It says how the ratio of the frequencies measured by two observers depends on their relative velocity and the angle of the velocity to that of the light.
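As a quick numerical check of (6.18), the sketch below evaluates the Doppler ratio in two ways: from the closed formula and directly from the inner product $-\hat v\cdot\bar K$ in (6.17). It assumes components relative to observer $u$ in two space dimensions, the relative velocity along the $x$-axis, and the normalization $\omega_u = 1$; the function names are hypothetical.

```python
import math

def minkowski_dot(a, b):
    """Inner product with signature (-, +, +) in units where c = 1."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def doppler_ratio(v, theta_u):
    """omega_v / omega_u from the closed formula (6.18)."""
    return (1.0 - v * math.cos(theta_u)) / math.sqrt(1.0 - v * v)

def doppler_ratio_from_vectors(v, theta_u):
    """Same ratio computed as -v_hat . K with omega_u normalized to 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    v_hat = (gamma, gamma * v, 0.0)                        # eq. (6.14), velocity along x
    wave_K = (1.0, math.cos(theta_u), math.sin(theta_u))   # eq. (6.15), a null vector
    return -minkowski_dot(v_hat, wave_K)

if __name__ == "__main__":
    for theta in (0.0, math.pi / 2, math.pi):
        print(theta, doppler_ratio(0.6, theta), doppler_ratio_from_vectors(0.6, theta))
```

The angle $\theta_u = \pi$ corresponds to motion against the direction of the light (towards the source) and $\theta_u = 0$ to motion along it.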
To see the physical consequences of this formula we take the observer $u$ to be at rest with respect to the source emitting the light (radiation). Let us consider the two special cases when $v$ is moving directly towards the source or directly away from it. We have

Towards source: $\cos\theta_u = -1$, $\quad \frac{\omega_v}{\omega_u} = \sqrt{\frac{1+v}{1-v}} > 1$, “Blue shift”
Away from source: $\cos\theta_u = +1$, $\quad \frac{\omega_v}{\omega_u} = \sqrt{\frac{1-v}{1+v}} < 1$, “Red shift”

An observer moving towards the source sees a higher frequency, the light is blue shifted, while an observer moving away sees a lower frequency, the light is red shifted. The Doppler effect is very important in astronomy where it is used for example to measure the velocity of stars and galaxies relative to us by looking at the shifts of their spectral lines.

We could have used the observer $v$ instead of $u$ and we would have found instead the formula

$$\frac{\omega_u}{\omega_v} = \frac{1 + v\cos\theta_v}{\sqrt{1 - v^2}}.$$ (6.19)

Note the plus sign in the numerator, which is due to the fact that the relative velocity according to $v$ has opposite sign compared to what $u$ measures. Multiplying this with the previous formula for the Doppler shift we find

$$(1 + v\cos\theta_v)(1 - v\cos\theta_u) = 1 - v^2.$$ (6.20)

This is the formula for aberration derived by Einstein in his 1905 paper. It can also be written as

$$\cos\theta_v = \frac{\cos\theta_u - v}{1 - v\cos\theta_u}.$$ (6.21)

To see its physical consequences we take again $u$ at rest with respect to the source and $v$ traveling towards the source, but now we let them observe the light at a small angle, $\theta_u = \pi - \delta_u$, $\theta_v = \pi - \delta_v$ with $\delta_u, \delta_v \ll 1$. Using the fact that $\cos(\pi - x) = -\cos(x)$ and that for small $x$ we have $\cos(x) \approx 1 - \frac{1}{2}x^2$, we find

$$1 - \frac{1}{2}\delta_v^2 \approx \frac{1 + v - \frac{1}{2}\delta_u^2}{1 + v - \frac{1}{2}v\delta_u^2} \approx 1 - \frac{1}{2}\,\frac{1 - v}{1 + v}\,\delta_u^2,$$ (6.22)

so that

$$\delta_v \approx \sqrt{\frac{1 - v}{1 + v}}\,\delta_u \le \delta_u.$$ (6.23)

We conclude that the observer traveling towards the source of the light measures a smaller angle. Similarly, by changing the sign of $v$, we learn that an observer traveling away from the light source measures a larger angle by a factor $\sqrt{\frac{1+v}{1-v}} \ge 1$.
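The small-angle result (6.23) can be compared with the exact aberration formula (6.21). The following minimal Python sketch does this for one sample value of $v$; the function name `aberration_angle` and the sample numbers are assumptions made for the illustration.

```python
import math

def aberration_angle(v, theta_u):
    """theta_v obtained from theta_u through eq. (6.21)."""
    cos_v = (math.cos(theta_u) - v) / (1.0 - v * math.cos(theta_u))
    # Clamp against rounding so that acos stays within its domain.
    cos_v = max(-1.0, min(1.0, cos_v))
    return math.acos(cos_v)

if __name__ == "__main__":
    v = 0.6
    delta_u = 0.01                      # small angle, in radians
    delta_v = math.pi - aberration_angle(v, math.pi - delta_u)
    print("exact ratio delta_v/delta_u :", delta_v / delta_u)
    print("small-angle prediction      :", math.sqrt((1 - v) / (1 + v)))
```

For small $\delta_u$ the two printed numbers should agree up to corrections of relative order $\delta_u^2$.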
This means that light is concentrated in the direction of motion and an observer traveling towards a star sees it as smaller and brighter, while an observer traveling away sees it as larger and fainter. This is sometimes referred to as the “headlight” effect. It again has important consequences in astronomy, for example when observing relativistic jets of plasma from compact objects accreting matter. These appear brighter when directed towards the earth and fainter when directed away from the earth.

Chapter 7 Particle kinematics

An important application of special relativity is to processes involving elementary particles, e.g. in particle accelerators, where they can often reach velocities close to the speed of light. In order to discuss what happens in particle collisions we first need to introduce the notion of four-momentum.

7.1 Four-momentum

Recall the split of a four-velocity $\hat v$ with respect to another one $\hat u$,

$$\hat v = \gamma(\hat u + \bar v), \qquad \hat u\cdot\bar v = 0, \qquad \gamma = \frac{1}{\sqrt{1 - v^2}}.$$ (7.1)

If $v$ is an object and $u$ an observer, $\bar v$ is the velocity of the object relative to the observer. In the Newtonian limit $v \ll 1$, i.e. for velocities much smaller than the speed of light, we have, dropping terms of order $v^3$ and higher,

$$\hat v \approx \left(1 + \frac{v^2}{2}\right)\hat u + \bar v.$$ (7.2)

To relate this to something more familiar let us multiply by the mass of the object $m$,

$$m\hat v \approx \left(m + \frac{mv^2}{2}\right)\hat u + m\bar v.$$ (7.3)

We recognize the kinetic energy of the object $\frac{1}{2}mv^2$ and its momentum $m\bar v$. Note that every object has a well-defined mass (just consider an observer traveling along with the object, i.e. whose $\hat u$ is (nearly) parallel to $\hat v$, who can define the mass in the usual Newtonian way).

We define the four-momentum of an object to be its mass times its four-velocity

$$\bar P = m\hat v.$$ (7.4)

By construction it satisfies

$$\bar P^2 = -m^2.$$ (7.5)

The split of $\hat v$ gives a corresponding split of the four-momentum

$$\bar P = E\hat u + \bar p, \qquad \hat u\cdot\bar p = 0,$$ (7.6)

with

$$E = m\gamma = \frac{m}{\sqrt{1 - v^2}} \approx m + \frac{mv^2}{2}, \qquad \bar p = m\bar v\gamma = \frac{m\bar v}{\sqrt{1 - v^2}} \approx m\bar v.$$
(7.7)

In analogy with the Newtonian case we call $E$ the energy and $\bar p$ the momentum (or three-momentum) of the object. Note that they are not intrinsic properties of the object but depend on the observer.

The four-momentum $\bar P = m\hat v$ is time-like since $\hat v$ is time-like. Furthermore, since $m > 0$, it is also future directed. The energy measured by observer $u$ can be written

$$E = -\hat u\cdot\bar P$$ (7.8)

and we see that it is always positive since both $\hat u$ and $\bar P$ are time-like and future directed so their inner product is negative.

The usefulness of the notion of momentum in Newtonian mechanics comes from the fact that it is conserved (in the absence of external forces): The sum of all momenta before a collision is equal to the sum of momenta after the collision. This is actually a special case of the relativistic conservation of four-momentum

$$\sum_{\text{before}}\bar P_i = \sum_{\text{after}}\bar P_j.$$ (7.9)

It says that there are four conserved quantities: The total energy and the three components of the momentum. The fact that four-momentum is conserved is intimately tied to symmetries in nature. In fact, you will learn later in your studies that the conservation of momentum is equivalent to the statement that the laws of physics look the same regardless of position in space. The laws of physics are the same here as on the moon or in the Andromeda galaxy. Similarly the conservation of energy is equivalent to the fact that the laws of physics look the same at all times. Just like the theory of relativity combines space and time into a spacetime it combines the notions of momentum and energy into the single concept of four-momentum.

Note that in the Newtonian limit the total energy becomes

$$E \approx \sum_i\left(m_i + \frac{m_i\bar v_i^2}{2}\right).$$ (7.10)

But we know that the kinetic energy is not conserved in general, it is only conserved in elastic collisions. This means that the first term must change to keep the total energy conserved, i.e.
the sum of the masses before and after the collision will in general differ, contrary to what is assumed in Newtonian mechanics. In inelastic collisions the kinetic energy decreases, so the total mass must increase. Of course in everyday situations $v \ll 1$ so the kinetic energy is much smaller than the total mass, $\frac{1}{2}m\bar v^2 \ll m$.

This fact, that mass is also a form of energy, is the content of the most famous equation in physics

$$E = mc^2.$$ (7.11)

Remember that here we are using units where the speed of light is unity, $c = 1$. Note also that $E = m$ is true only for an observer at rest with respect to the object in question. It is often referred to as the rest energy.

7.2 Massless particles

Knowing all but one four-momentum of the particles in a given reaction we can use the conservation of four-momentum to determine the remaining one. In this way one can experimentally establish the existence of particles with null four-momentum

$$\bar P^2 = 0.$$ (7.12)

Since we may define the mass through $\bar P^2 = -m^2$ we call such particles massless. The most important example of such a particle is the photon, the quantum of light (more generally electromagnetic radiation). The four-momentum and energy of such a particle do not vanish (if they did we would not be able to detect it), and therefore we conclude from the expression

$$E = \frac{m}{\sqrt{1 - v^2}}$$ (7.13)

that such a particle must have $v = 1$, i.e. travel at the speed of light. It therefore follows a null line with direction proportional to the null vector $\bar P$. Note that for a massless particle we have

$$\bar P = E\hat u + \bar p = E(\hat u + \hat p),$$ (7.14)

since it is null.

7.3 Tachyons

Sometimes one hears about particles with space-like four-momentum

$$\bar P^2 > 0.$$ (7.15)

Such particles are referred to as tachyons. They would have an imaginary mass from $\bar P^2 = -m^2$ and always travel along space-like lines. This turns out not to be consistent with the laws of quantum mechanics and such particles therefore cannot exist in nature.
Instead in our current best theories particles emerge as “ripples” of a field with a quantum of energy. In such quantum field theories it is not unusual to have $m^2 < 0$. However, the resulting particles do not travel faster than light. Instead such an imaginary mass signals an instability of the vacuum. In fact, this is part of the mechanism by which particles can acquire mass through the so-called Higgs mechanism, related to the famous Higgs boson discovered at CERN.

7.4 Particle reactions and kinematics

The two most important types of particle reactions are

• Decay: One particle goes into two or more
• Collision: Two particles go into one or more

The allowed configurations of four-momenta are constrained by

1. $\bar P_i^2 = -m_i^2$ for all particles involved
2. Conservation of four-momentum $\sum_i\bar P_{i,\text{in}} = \sum_j\bar P_{j,\text{out}}$

These equations form the basis of particle kinematics. It is often convenient to refer all quantities to some particular observer, say with four-velocity $\hat u$. Then we have

$$\bar P_i = E_i\hat u + \bar p_i, \qquad \hat u\cdot\bar p_i = 0.$$ (7.16)

The inner product of the four-momenta of two particles becomes

$$\bar P_i\cdot\bar P_j = -E_iE_j + \bar p_i\cdot\bar p_j,$$ (7.17)

and in the special case $i = j$ we find

$$m_i^2 = -\bar P_i^2 = E_i^2 - \bar p_i^2 \quad\Rightarrow\quad E_i = \sqrt{m_i^2 + \bar p_i^2},$$ (7.18)

expressing the energy in terms of the momentum.

A typical kinematical calculation involves two steps:

1. Take the inner product of the conservation of four-momentum $\sum_i\bar P_{i,\text{in}} = \sum_j\bar P_{j,\text{out}}$ with some $\bar P_i$, or rearrange it and take the square.
2. Replace $\bar P_i^2$ by $-m_i^2$ everywhere and inner products using (7.17).

Figure 7.1: Scattering of an electron, with momentum $\bar q$, off a proton at rest ($\bar p = 0$) in the observer’s orthogonal space. The recoil angle of the proton is $\theta$.

Example

Consider an electron ($e^-$) scattering off a proton ($p^+$). We set $m_{e^-} = m$ and $m_{p^+} = M$ and let $\bar Q$, $\bar P$ be the initial four-momenta and $\bar Q'$, $\bar P'$ the final ones. We have

$$\bar P^2 = \bar P'^2 = -M^2, \qquad \bar Q^2 = \bar Q'^2 = -m^2.$$
(7.19)

The conservation of four-momentum reads

$$\bar P + \bar Q = \bar P' + \bar Q'.$$ (7.20)

Let’s say we are interested in the situation where the proton is at rest before the collision and we want to find the recoil angle of the proton. The situation is illustrated in the observer’s orthogonal space in figure 7.1. When there is one particle which we do not know anything about, in this case the outgoing electron with four-momentum $\bar Q'$, it is often useful to rearrange the conservation of four-momentum so that its four-momentum appears alone on one side of the equation and then take the square. We therefore write the conservation of four-momentum as

$$\bar P + \bar Q - \bar P' = \bar Q'$$ (7.21)

and squaring this equation we find

$$\bar P^2 + \bar Q^2 + \bar P'^2 + 2\bar P\cdot\bar Q - 2\bar P\cdot\bar P' - 2\bar Q\cdot\bar P' = \bar Q'^2.$$ (7.22)

Using the fact that $\bar P^2 = \bar P'^2 = -M^2$ and $\bar Q^2 = \bar Q'^2 = -m^2$ this becomes

$$0 = -M^2 + \bar P\cdot\bar Q - (\bar P + \bar Q)\cdot\bar P' = (\bar P + \bar Q)\cdot(\bar P - \bar P').$$ (7.23)

Note that $\bar Q'$ has dropped out completely. This is a useful equation for two-particle elastic collisions (same masses coming in and going out). Since we are assuming the proton is initially at rest with respect to the observer we have

$$\bar P = M\hat u, \qquad \bar Q = E\hat u + \bar q, \qquad \bar P' = E'\hat u + \bar p',$$ (7.24)

where $\hat u$ is the observer’s four-velocity. Using this we find

$$0 = (\bar P + \bar Q)\cdot(\bar P - \bar P') = ((M + E)\hat u + \bar q)\cdot((M - E')\hat u - \bar p') = -(M + E)(M - E') - \bar q\cdot\bar p'.$$ (7.25)

Since $\bar q$ and $\bar p'$ are in the orthogonal space to $\hat u$ they can be treated as ordinary Euclidean vectors and we can write

$$\bar q\cdot\bar p' = qp'\cos\theta,$$ (7.26)

where $\theta$ is the recoil angle we are after, see figure 7.1. Noting that

$$-m^2 = \bar Q^2 = -E^2 + q^2 \quad\Rightarrow\quad q = \sqrt{E^2 - m^2}$$ (7.27)

and similarly $p' = \sqrt{E'^2 - M^2}$, we get

$$\cos\theta = \frac{\bar q\cdot\bar p'}{qp'} = \frac{(E + M)(E' - M)}{\sqrt{E^2 - m^2}\sqrt{E'^2 - M^2}} = \frac{E + M}{\sqrt{E^2 - m^2}}\sqrt{\frac{E' - M}{E' + M}}.$$ (7.28)

This expresses the recoil angle of the proton in terms of the initial energy of the electron $E$ and the final energy of the proton $E'$.
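The result (7.28) can be checked for consistency with conservation of four-momentum. In the Python sketch below, `recoil_cos_theta` is eq. (7.28) itself, while `proton_energy_for_angle` is an algebraic inversion of (7.28) for $E'$ that is not taken from the notes and is introduced here only for the check. The final helper builds $\bar Q' = \bar P + \bar Q - \bar P'$ in components and returns $\bar Q'^2$, which should equal $-m^2$. All function names and sample numbers are assumptions made for the illustration, and $0 \le \theta < \pi/2$ is assumed.

```python
import math

def recoil_cos_theta(E, E_prime, m, M):
    """cos(theta) of the recoiling proton, eq. (7.28)."""
    q = math.sqrt(E * E - m * m)
    return (E + M) / q * math.sqrt((E_prime - M) / (E_prime + M))

def proton_energy_for_angle(E, theta, m, M):
    """Final proton energy E' for a given recoil angle (eq. (7.28) solved for E')."""
    q = math.sqrt(E * E - m * m)
    a = q * math.cos(theta) / (E + M)
    return M * (1.0 + a * a) / (1.0 - a * a)

def outgoing_electron_P_squared(E, theta, m, M):
    """Q'^2 for Q' = P + Q - P', with the incoming electron along the x-axis."""
    q = math.sqrt(E * E - m * m)
    E_prime = proton_energy_for_angle(E, theta, m, M)
    p_prime = math.sqrt(E_prime * E_prime - M * M)
    energy = M + E - E_prime                  # component along u_hat
    px = q - p_prime * math.cos(theta)        # spatial components
    py = -p_prime * math.sin(theta)
    return -energy * energy + px * px + py * py

if __name__ == "__main__":
    E, m, M, theta = 2.0, 0.5, 1.0, 0.4       # illustrative numbers, units with c = 1
    E_prime = proton_energy_for_angle(E, theta, m, M)
    print("cos(theta) from (7.28):", recoil_cos_theta(E, E_prime, m, M), "vs", math.cos(theta))
    print("Q'^2:", outgoing_electron_P_squared(E, theta, m, M), "expected:", -m * m)
```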
7.5 Center-of-mass observer

For more complicated processes it is often useful to group all (or part) of the out-going particles together. Their total four-momentum is

$$\bar P = \sum_i\bar P_i.$$ (7.29)

Then the “mass” defined by $\bar P^2 = -M^2$ does not have a fixed value. It depends on the relative motion of the particles. It does have a lower bound however. Since a sum of time-like future directed vectors is again time-like and future directed we can consider an observer with four-velocity $\hat u = \bar P/M$. Then we find

$$\hat u = \frac{1}{M}\bar P = \frac{1}{M}\sum_i\bar P_i = \frac{1}{M}\sum_i E_i\,\hat u + \frac{1}{M}\sum_i\bar p_i.$$ (7.30)

We see that the last term must vanish, which means that this observer sees the total spatial momentum of the particles being zero, $\sum_i\bar p_i = 0$. Using this fact we find

$$-M^2 = \bar P^2 = -\left(\sum_i E_i\right)^2 = -\left(\sum_i\sqrt{m_i^2 + p_i^2}\right)^2.$$ (7.31)

Therefore

$$M^2 = \left(\sum_i\sqrt{m_i^2 + p_i^2}\right)^2 \ge \left(\sum_i m_i\right)^2,$$ (7.32)

with equality occurring only if $\bar p_i = 0$ for all $i$, i.e. if all particles are at rest with respect to each other. This inequality leads to so-called threshold conditions for when a certain set of particles can be produced. Note that there is no reference to any observer in the last equation; the result is observer independent. Nevertheless it was convenient to introduce an observer in the derivation of the result and this turns out to often be the case.

An observer, such as the one considered above, whose four-velocity is parallel to the total four-momentum $\sum_i\bar P_i$ is called a center-of-mass observer (sometimes center-of-mass frame or center-of-mass system), since such an observer sees the particles having zero total three-momentum, $\sum_i\bar p_i = 0$. The introduction of such an observer can often simplify the calculations. Note that using Lorentz transformations we can of course translate from one observer to another at the end.

Chapter 8 Curved worldlines and acceleration

So far we have dealt almost exclusively with straight worldlines. Unaccelerated spaceships and particles not influenced by external forces travel along such worldlines.
Conversely, spaceships that run their engines or particles influenced by external forces follow curved worldlines. While it is possible to treat accelerated observers in special relativity it is usually avoided for practical reasons. We will therefore continue to assume that all observers are unaccelerated, unless otherwise stated.

The natural parameter along a (massive) particle’s worldline is the proper time $\tau$. The spacetime position of the particle is given as a function of $\tau$

$$\bar R = \bar R(\tau).$$ (8.1)

We can calculate the rate of change of the position vector with $\tau$ by taking the derivative

$$\frac{d\bar R}{d\tau} = \lim_{\Delta\tau\to 0}\frac{\bar R(\tau + \Delta\tau) - \bar R(\tau)}{\Delta\tau}.$$ (8.2)

Since $\tau$ is the proper time along the worldline, which measures its length, we have for an infinitesimal piece of the worldline

$$d\tau^2 = -(d\bar R)^2.$$ (8.3)

It follows that

$$\hat v = \frac{d\bar R}{d\tau}$$ (8.4)

is a time-like unit vector, $\hat v^2 = -1$, which is tangent to the worldline at the event $\bar R(\tau)$. For a straight worldline $d\bar R/d\tau = \text{constant}$ and $\hat v$ is the four-velocity. We will continue to call it the four-velocity also for curved worldlines, though in that case it is not constant.

The four-acceleration is defined as the rate of change of the four-velocity

$$\bar A = \frac{d\hat v}{d\tau} = \frac{d^2\bar R}{d\tau^2}.$$ (8.5)

Differentiating the condition $\hat v^2 = -1$ we learn that

$$\hat v\cdot\bar A = 0.$$ (8.6)

This implies that $\bar A$ is space-like. Note that $\bar A$ is not, in general, a unit vector. In particular it vanishes for a straight worldline.

Let us now analyze how an unaccelerated observer views an accelerated worldline. Consider an observer whose (straight) worldline is tangent to the curved worldline at a certain point $\bar R(\tau_0)$, as depicted in figure 8.1.

Figure 8.1: An observer’s worldline is tangent to the curved worldline of an accelerated object at the point $\bar R(\tau_0)$.

Splitting $\bar R$ with respect to the four-velocity $\hat u$ of the observer we have

$$\bar R(\tau) = t\hat u + \bar r, \qquad \hat u\cdot\bar r = 0.$$
(8.7)

Considering a small interval along the curve this gives

$$d\bar R = dt\,\hat u + d\bar r$$ (8.8)

and squaring this we find

$$-d\tau^2 = -dt^2 + d\bar r^2$$ (8.9)

or

$$d\tau = dt\sqrt{1 - v^2}, \qquad \bar v = \frac{d\bar r}{dt}.$$ (8.10)

Note that $\bar v$ is the relative velocity and we can also write $d\tau = dt/\gamma$. This gives the four-velocity of the object as

$$\hat v = \frac{d\bar R}{d\tau} = \gamma\frac{d\bar R}{dt} = \gamma(\hat u + \bar v).$$ (8.11)

This is the same velocity split formula we found before when expressing a constant four-velocity $\hat v$ in terms of another one $\hat u$; here however $\hat v$ and $\bar v$ are not constant. At the point $\bar R(\tau_0)$ we have $\hat v = \hat u$ since the worldlines are tangent, so the relative velocity $\bar v = 0$ at this point. For the four-acceleration we find

$$\bar A(\tau_0) = \left(\frac{d\hat v}{d\tau}\right)_0 = \left(\frac{d\hat v}{dt}\right)_0 = (\bar a)_0,$$ (8.12)

where $\bar a = \partial\bar v/\partial t$ is the ordinary Newtonian acceleration measured by the observer momentarily at rest with respect to the accelerated object. If this object happens to be a spaceship the vector $\bar A$ at a certain point of its worldline is the acceleration experienced by the crew at that instant. This gives the physical interpretation of the four-acceleration.

8.1 Special case: Constant acceleration

Consider a 2-plane in spacetime, which could describe for example the history of a spaceship traveling in a straight line as seen by an observer. In two dimensions we need only one condition to determine a curve and we take

$$\bar R^2 = R^2,$$ (8.13)

with $\bar R$ the spacetime position and $R$ a constant. Taking the derivative we find

$$\bar R\cdot\hat v = 0$$ (8.14)

and another derivative gives

$$\bar R\cdot\bar A = 1.$$ (8.15)

The vectors $\bar R$, $\hat v$ and $\bar A$ lie in a 2-plane and since both $\bar R$ and $\bar A$ are orthogonal to $\hat v$ they must be parallel, so that using the equation above we find

$$\bar A = \frac{\bar R}{R^2} \quad\Rightarrow\quad \bar A^2 = \frac{1}{R^2},$$ (8.16)

i.e. the four-acceleration of this worldline is constant (in magnitude). This could for example be the worldline of a spaceship running its engines so that the crew experience a constant acceleration.
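The relations (8.13)–(8.16) can be verified on an explicit curve. The sketch below assumes the standard proper-time parametrization $\bar R(\tau) = R\sinh(\tau/R)\,\hat u + R\cosh(\tau/R)\,\hat r$, which is not written out in these notes but satisfies $\bar R^2 = R^2$, and works with components $(t, x)$ in the observer's 2-plane. The function names are hypothetical.

```python
import math

def dot2(a, b):
    """Inner product in the (t, x) plane with signature (-, +)."""
    return -a[0] * b[0] + a[1] * b[1]

def position(tau, R):
    """Assumed parametrization of the curve R_bar^2 = R^2 by proper time tau."""
    return (R * math.sinh(tau / R), R * math.cosh(tau / R))

def four_velocity(tau, R):
    """Derivative of position with respect to tau."""
    return (math.cosh(tau / R), math.sinh(tau / R))

def four_acceleration(tau, R):
    """Derivative of four_velocity with respect to tau; equals position / R^2."""
    return (math.sinh(tau / R) / R, math.cosh(tau / R) / R)

if __name__ == "__main__":
    R, tau = 2.0, 0.7
    pos, vel, acc = position(tau, R), four_velocity(tau, R), four_acceleration(tau, R)
    print("R.R =", dot2(pos, pos), " (8.13), expected", R * R)
    print("v.v =", dot2(vel, vel), " unit time-like, expected -1")
    print("R.v =", dot2(pos, vel), " (8.14), expected 0")
    print("R.A =", dot2(pos, acc), " (8.15), expected 1")
    print("A.A =", dot2(acc, acc), " (8.16), expected", 1.0 / (R * R))
```

Since $\hat v^2 = -1$ along the curve, the parameter used here is indeed the proper time, and the magnitude of the four-acceleration is $1/R$ at every point.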
Let $\bar R^2 = R_1^2$ and $\bar R^2 = R_2^2$ with $R_2 > R_1$ be two such worldlines in the same 2-plane and with the same origin. From (8.14) we see that any line through the origin cuts the curves orthogonally. Therefore $R_2 - R_1$ is the orthogonal distance between the curves. For example, one of the lines could be the worldline of the front of a spaceship and the other the worldline of its tail. The ship’s length is constant, equal to $R_2 - R_1$, but the accelerations along the two worldlines are unequal: by (8.16) they have $\bar A^2 = 1/R_1^2$ and $\bar A^2 = 1/R_2^2$, i.e. magnitudes $1/R_1$ and $1/R_2$ respectively.

Introducing an observer with four-velocity $\hat u$ we can split

$$\bar R = t\hat u + x\hat r, \qquad \hat u\cdot\hat r = 0$$ (8.17)

and

$$\bar R^2 = R^2 \quad\Rightarrow\quad -t^2 + x^2 = R^2.$$ (8.18)

This means that in the $(t, x)$-plane of the observer the worldlines of constant acceleration are hyperbolas, as illustrated in figure 8.2.

Figure 8.2: Worldlines of constant acceleration as viewed by an observer.

Important properties are hidden in this figure, e.g. that all points on the curves are equivalent and that all observers see the same picture. It is also hard to see that lines through the origin are orthogonal to the curves and that they have constant separation. These problems arise because we are representing a spacetime situation in Euclidean space. The corresponding situation in Euclidean space would be two concentric circles of radius $R_1$ and $R_2$ respectively and here the properties mentioned above become obvious. Note however that this Euclidean picture is also not a reliable representation of the spacetime situation, e.g. the curves are closed which is impossible in spacetime. We simply have to live with the fact that spacetime situations cannot be completely reliably represented in Euclidean space.

8.2 Fitting a relativistic car into a garage

A famous “paradox” in special relativity is this: Imagine you just bought a very fast car. Unfortunately, the car turns out to be slightly too long to fit in your garage.
Is it possible to exploit the Lorentz contraction and drive the car very fast into the garage and slam the door? From the point of view of the garage the car appears shorter due to the Lorentz contraction, which suggests that it should work. On the other hand, from the point of view of the car, the garage appears shorter, making the situation worse, not better. This is the “paradox”. Of course, when you analyze things carefully there is no contradiction.

Let $\hat u$ be the four-velocity of the garage and $\hat v$ that of the car. We take units where the length of the garage is 1 and let $\hat g$ be orthogonal to $\hat u$ and stretch from the doors to the back of the garage. Similarly, we let $\bar c$ be orthogonal to $\hat v$ and stretch from the tail of the car to the front, with $\hat c$ the unit vector along $\bar c$. We assume the car continues unaccelerated until it reaches the back wall of the garage, at which point it suddenly stops. (This is of course an idealized situation involving infinite acceleration.) The spacetime diagram is given in figure 8.3. The following important events are indicated in the figure:

$E_0$: The tail of the car passes the doors and they close.
$E_1$: An event at the front of the car which is simultaneous with $E_0$ from the car's point of view.
$E_2$: The car collides with the back wall of the garage. Simultaneous with $E_0$ from the point of view of the garage.

If $v$ is the relative velocity we can apply the Lorentz transformation relating the observations from the garage to those from the car,
$$ \hat v = \gamma(\hat u + v\hat g)\,, \qquad \hat c = \gamma(\hat g + v\hat u)\,, \tag{8.19} $$
or the inverse transformations
$$ \hat u = \gamma(\hat v - v\hat c)\,, \qquad \hat g = \gamma(\hat c - v\hat v)\,. \tag{8.20} $$
The last of these equations will be enough for us. Let $\bar w$ be the vector from $E_1$ to $E_2$. From figure 8.3 we have
$$ \bar c + \bar w = \hat g = \gamma(\hat c - v\hat v) \tag{8.21} $$

[Figure 8.3: Spacetime diagram for the problem of fitting the car into the garage. Labels: events $E_0$, $E_1$, $E_2$; worldlines of the doors, back wall, front and tail; vectors $\hat u$, $\hat v$, $\hat g$, $\bar c$.]

which implies, since $\bar w$ is proportional to $\hat v$, that
$$ \bar c = \gamma\hat c\,, \qquad \bar w = -\gamma v\hat v\,. $$
(8.22) The length of the car is $\sqrt{\bar c^2} = \gamma > 1$, so the car is longer than the garage, consistent with our assumptions. Since $\bar w$ is a negative multiple of $\hat v$, from the point of view of the car the front collides with the back wall before the doors close, although these events are simultaneous from the point of view of the garage. However, since both $\hat g$ and $\bar c$ are space-like, no signal from the collision can reach the tail of the car before it passes the doors. Therefore no material strength can stop the car from being compressed (from its point of view) and you can shut the door behind it. You have to be quick though if the car is elastic and tends to regain its original shape. This example illustrates the fact that no perfectly rigid body exists.

We have seen that it is indeed possible (though not recommended) to fit the car into the garage by exploiting the Lorentz contraction. We have also seen that what appears at first sight to be a paradox is just due to not analyzing the situation carefully enough, in particular neglecting the issues due to the relativity of simultaneity. When the problem is carefully analyzed there are of course no contradictions.

8.3 Rotating wheel

In a wheel at rest all constituent particles follow straight worldlines. When the wheel starts spinning the worldlines become tilted and form helices in spacetime. Only the center continues on a straight worldline. Using the center as an observer and thinking of a ring of matter a distance $\rho$ around it, we realize that such a ring must have circumference $2\pi\rho$ whether the wheel is spinning or not. In a sense the Lorentz contraction is prevented from taking place. Instead the tilting of the worldlines has the effect that the orthogonal distance between them increases, compensating the Lorentz contraction. However, this means that a deformation of the wheel takes place. This is an inevitable deformation which is always connected with rotation.
How much energy is needed to produce the deformation depends on the stresses in the wheel. Therefore the moment of inertia of a wheel must also depend on the stresses within it. This is an important consequence of the theory of relativity though it plays little role in everyday circumstances.
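The increase of the orthogonal distance between neighbouring worldlines of the ring can be illustrated numerically. The sketch below assumes a ring in rigid uniform rotation with angular velocity $\omega$ (a symbol introduced here, not used in the text), so that a ring particle moves with speed $v = \omega\rho < 1$ relative to the center. It takes a unit separation along the ring, simultaneous for the center, and projects it orthogonally to the particle's four-velocity. The function names `dot` and `orthogonal_stretch` are hypothetical, and units with $c = 1$ are assumed.

```python
import math


def dot(a, b):
    """Scalar product for (t, x, y) components with signature (-, +, +)."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def orthogonal_stretch(omega, rho):
    """Return (stretch, gamma) for a ring particle at radius rho.

    Assumption (illustrative): rigid uniform rotation with angular
    velocity omega, so the particle speed is v = omega * rho < 1.
    """
    v = omega * rho
    if not 0.0 <= v < 1.0:
        raise ValueError("need 0 <= omega * rho < 1 in units with c = 1")
    gamma = 1.0 / math.sqrt(1.0 - v * v)

    # Particle momentarily on the x-axis, moving in the +y direction.
    v_hat = (gamma, 0.0, gamma * v)

    # Unit separation along the ring, simultaneous for the center.
    d = (0.0, 0.0, 1.0)

    # Remove the part of d along v_hat (note v_hat . v_hat = -1).
    k = dot(d, v_hat)
    d_perp = tuple(di + k * vi for di, vi in zip(d, v_hat))

    return math.sqrt(dot(d_perp, d_perp)), gamma


if __name__ == "__main__":
    stretch, gamma = orthogonal_stretch(omega=0.6, rho=1.0)
    print(stretch, gamma)
```

For $v = 0.6$ both returned numbers are 1.25 up to rounding: the orthogonal distance between neighbouring worldlines has grown by the factor $\gamma$, which is exactly what is needed for the Lorentz-contracted ring to still have circumference $2\pi\rho$ for the center.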