Physics in Spacetime (F4051)
Lecture notes
Linus Wulff
Spring 2024
Chapter 1
Space and Time
This course deals with the special theory of relativity introduced by Einstein in a famous
1905 paper. The traditional way of introducing special relativity is to derive it, in much
the same way that Einstein did, from two basic principles:
1. The principle of relativity
2. The constancy of the speed of light
From these assumptions the notion of a spacetime with (inertial) observers being connected
by Lorentz transformations follows. This is a natural way to proceed if one starts from a
knowledge of classical mechanics and Maxwell’s equations of electrodynamics. However,
it is not the best way to understand the geometrical aspects of spacetime. This is part of
the reason it took Einstein another ten years to formulate the general theory of relativity,
describing gravity, where the geometry of spacetime is the key player.
In this course we will follow a different route [1] which leads more directly to a geometric
picture. Rather than starting with the principles described above we will derive the same
physics from what is known as
• The principle of maximum proper time
This approach is more in line with general relativity, and this course can be thought of as
a ﬁrst step towards studying general relativity.
1.1 What is space and time?
The notions of Space and Time are central to physics. In physics we are interested in
answering questions like
Given some conﬁguration of particles with given positions and momenta at
some initial time ti, what will the conﬁguration look like at some later time tf ?
[1] The approach to relativity taken here is inspired by lecture notes and a book by B. Laurent (Introduction
To Spacetime, World Scientific, 1994).
For such questions to make sense we must have a precise way of deﬁning what we mean
by time and also what we mean by a particle’s position in space. So what are space and
time? Rather than get into a philosophical discussion about the nature of space and time,
a more useful approach when faced with such a deep question in physics is to replace
it by a different, more down-to-earth question. After all, in physics we deal only with
things that can be measured, and therefore a better question to ask is
How do we measure space and time?
We say that we deﬁne space and time operationally, by declaring how we measure them.
So how do we measure distances in space? The most basic way is to take a reference
object, say a stick of a certain length, and use it to measure the distance between two points.
We will call such a reference object a ruler. Of course, a good ruler should not bend or
change its length with temperature etc., so we will assume it is always possible to ﬁnd
a suﬃciently good ruler (or equivalent) so that we can measure lengths to the precision
we need. How do we measure time? To measure time we need a clock. It does not have
to be what we normally think of as a clock; it can be any physical process which has a
known time dependence, e.g. a periodic process with a definite period like a pendulum or a
non-periodic process like an atom in an excited state with known half-life. Again, we will
assume that there exist such clocks with good enough precision for the time measurements
we need to perform.
We allow each person, or observer, to measure time with their own clock and spatial
distances with their own ruler. We will assume that these are small enough that the
observer can carry them with her, i.e. they will be assumed to be in the same state
of motion as the observer and experience the same forces she experiences. But if each
observer makes their own measurements using their own clock and ruler, how do we relate
the measurements of two different observers? Newton and his contemporaries assumed
that there was an absolute notion of time, so that all observers’ clocks would tick at the
same rate. In that case it is very easy to relate the measurements of two observers. We
now know that this assumption was wrong. For example, taking two synchronized atomic
clocks, putting one on a plane circling the earth and leaving one on the ground, one ﬁnds
when comparing them at the end that they diﬀer (by a few hundred nanoseconds). This
observation is clearly inconsistent with the Newtonian idea of an absolute time.
1.2 The principle of maximum proper time
Experiments show that time runs diﬀerently for diﬀerent observers. We must therefore
assign each observer their own time, their proper time, which is the time measured on their
clock. We can now state the key principle that will allow us to compare the measurements
of different observers:
The principle of maximum proper time:
If two observers are separated and then meet again, the one that does not
experience any acceleration always measures the longest proper time.
It says that proper time is maximized for inertial, i.e. unaccelerated, observers. There
is plenty of experimental evidence to support this principle, such as the experiments with
atomic clocks on planes, or the operation of GPS satellites which requires very precise time
measurements. In this course we will take this principle as the starting point from which
we will derive the theory of special relativity.
1.3 Spacetime
We are familiar with the fact that to specify the position of an object in our three dimensions
we need to give three numbers – the coordinates with respect to some speciﬁed
coordinate system. For positions on the earth we might for example give the longitude,
the latitude and the height above sea level. To specify an event – something happening
at a certain place at a certain instant of time – we must give one more number, namely
the time on a clock associated to the coordinate system. In our example this could be the
time GMT.
We have argued that we must allow each observer to measure distances and times using
their own coordinate system deﬁned by their ruler and clock. Each observer will therefore
associate to a given event four numbers (t, x, y, z) – the spacetime coordinates relative to
their coordinate system. Note that we are deﬁning an event here in an idealized way as a
single point in spacetime, i.e. something that happens at a point in space at a single instant
of time. The set of all events makes up the four-dimensional spacetime. Note that each
observer will (in general) assign different coordinates to the same event because they are
using different coordinate systems; there is no preferred coordinate system in spacetime.
One of our ﬁrst tasks will therefore be to understand how to relate the observations of
diﬀerent observers.
1.4 Worldlines
The trajectory of an object traces out a continuous path in spacetime – a worldline (really
“worldtube” if the object is not point-like, but this distinction won’t be very important
to us). In ordinary Euclidean space we are familiar with the fact that there is a shortest
path between any two points. This path is called a straight line. It is the path an object
follows if it is not acted upon by any external forces, i.e. it is unaccelerated. Similarly, we
will assume that there is precisely one straight line connecting any two events in spacetime
and that any object not acted on by external forces, i.e. not experiencing any acceleration,
follows such a straight worldline. To a worldline connecting two events in spacetime we can
associate a number – the proper time along that worldline. Recall that this is the time an
observer traveling along the worldline measures on her clock between the two events. The
principle of maximum proper time says that a straight worldline corresponds to the longest
proper time. Therefore the analog of shortest length in Euclidean space is longest proper
time in spacetime and a clock can be thought of as measuring distances in spacetime.
When we draw spacetime diagrams we will draw the worldlines of unaccelerated objects
as straight lines. Curved lines will correspond to worldlines of accelerated objects (Figure
1.1).
Figure 1.1: Spacetime diagram showing an accelerated (curved worldline) and an unaccelerated
(straight worldline) observer meeting at events A and B. The proper times measured
on their respective clocks between the meetings are $\tau'_{AB}$ (accelerated) and $\tau_{AB}$ (unaccelerated). The principle of maximum
proper time then says that $\tau_{AB} > \tau'_{AB}$.
An important notion in Euclidean geometry is the notion of two lines being parallel.
In spacetime we can similarly have the notion of two observers being on the same course.
How can two observers, e.g. two spaceships traveling in outer space, determine whether
they are on the same course? One way to do this uses a construction from Euclidean space
adapted to spacetime. Imagine that the two observers each send out a probe ﬁtted with
a clock, which travels freely until it is picked up by the other observer at some later time
(Figure 1.2). If the two probes happen to meet halfway, i.e. after half of the proper time
(from being emitted to being picked up) has elapsed on each clock, then we will say that
the observers are on the same course, or that their worldlines are parallel. From the ﬁgure
we see that this also implies that the lines AB and CD (not drawn) are parallel. Note
that to carry out the experiment we really need to send clocks that also have a recording
device that records the time they were sent and the time they met. We would also need
to do the experiment several times to get them to meet halfway.
Figure 1.2: The worldlines of two observers are parallel if they can send out probes, at A
and B, that meet halfway before encountering the other observer at D and C.
Notice that this construction does not refer to space or time separately, only to the
full spacetime picture or the proper time measured by a particular clock. This is in sharp
contrast to how we would describe such an experiment in Newtonian physics.
Just like in Euclidean space the line AC in Figure 1.2 deﬁnes a vector, which we can
draw as an arrow starting at A and ending at C. We declare the length of the vector to
be given by the proper time elapsed from A to C. The construction in the ﬁgure gives us
a way to parallel transport vectors, i.e. moving a vector while keeping it parallel to itself.
The vector AC can be parallel transported to the vector BD. Taking the two worldlines
to approach each other we obtain the special case of parallel transport of the vector along
the worldline. A general parallel transport is obtained by a sequence of such “elementary”
parallel transports.
We will now make a very important assumption. We will assume that the vector one
obtains by such a sequence of elementary parallel transports from a point A to a point
A′ in spacetime does not depend on how one chooses the sequence of parallel transports,
i.e. it does not depend on the path taken. This assumption is actually not true close
to gravitating bodies and in that case one must use the more advanced theory of general
relativity. The assumption is true if gravity is very weak, which is the case we consider
in this course. In this case we are working with special relativity. In fact, the change of a
vector under parallel transport is directly related to the curvature of a space. In special
relativity spacetime is ﬂat, while in general relativity it can be curved.
Chapter 2
Spacetime vectors
In the last chapter we deﬁned a vector on the straight worldline of an observer as an arrow
from one event to a later event on the worldline, with length given by the proper time
elapsed between the two events. The notion of vectors is familiar from Euclidean space
and we will use the same notation ¯v for a vector in spacetime. Such vectors are often called
four-vectors since spacetime is four-dimensional. Just as any point in Euclidean space $\mathbb{R}^3$
can be associated with a vector going from the origin to that point, any event in spacetime
can be associated to a spacetime vector from an origin (which we can choose as we please)
to the point in question. We have also seen that we can move vectors around using parallel
transport. Two vectors related by parallel transport will be considered the same vector
(this is consistent since we are assuming that the vector obtained by parallel transport is
independent of the path taken).
Spacetime vectors obey the usual axioms familiar from Euclidean space:
• Commutativity of addition: $\bar u + \bar v = \bar v + \bar u$
• Associativity of addition: $\bar u + (\bar v + \bar w) = (\bar u + \bar v) + \bar w$
• Identity element of addition: $\bar v + \bar 0 = \bar v$
• Inverse element of addition: Given $\bar v$ there exists a vector $-\bar v$ such that $\bar v + (-\bar v) = \bar 0$
• Compatibility of scalar multiplication: $a(b\bar v) = (ab)\bar v$ for $a, b \in \mathbb{R}$
• Identity element of scalar multiplication: $1\bar v = \bar v$
• Distributivity of scalar multiplication with respect to vector addition: $a(\bar u + \bar v) = a\bar u + a\bar v$
• Distributivity of scalar multiplication with respect to scalar addition: $(a + b)\bar v = a\bar v + b\bar v$
Addition of spacetime vectors can be done by the geometric construction familiar from
Euclidean space, which is illustrated in Figure 2.1.
Figure 2.1: Geometric addition of the vectors $\bar u$ and $\bar v$ producing a third vector $\bar u + \bar v$.
Recall that a basis for a vector space is a set of linearly independent vectors $\bar v_i$ with $i = 1, \dots, n$ which span the space, so that
any vector is expressed uniquely as a linear combination
$$a_1\bar v_1 + a_2\bar v_2 + \dots + a_n\bar v_n , \tag{2.1}$$
for some numbers $a_i \in \mathbb{R}$ with $i = 1, \dots, n$. The vector can be denoted in this basis as
$(a_1, a_2, \dots, a_n)$ and $n$ is called the dimension of the vector space. A basis of spacetime
vectors consists of four linearly independent spacetime vectors.
2.1 Inner product
An important notion in linear algebra is that of the inner product between two vectors.
Given two vectors their inner product is a real number. We will denote the inner product
with a dot, e.g. ¯u·¯v denotes the inner product between vectors ¯u and ¯v. The inner product
satisfies the following standard axioms:
• Symmetry: $\bar u \cdot \bar v = \bar v \cdot \bar u$
• Linearity: $(a\bar u) \cdot \bar v = a(\bar u \cdot \bar v)$ and $(\bar u + \bar v) \cdot \bar w = \bar u \cdot \bar w + \bar v \cdot \bar w$
• Non-degeneracy: If $\bar u \cdot \bar v = 0$ for all vectors $\bar v$ then $\bar u = \bar 0$
Often the inner product is required to be positive definite, so that $\bar u^2 = \bar u \cdot \bar u \ge 0$, which
is a stronger requirement than being non-degenerate. This is the case in Euclidean space
where we are used to identifying $\bar u^2$ with the length-squared of a vector, which is obviously
positive. We will see below that it is not possible to require this for spacetime vectors.
Instead, for a spacetime vector that goes along the straight worldline of an object from
point A to point B we will take
$$\bar u^2 = -\tau^2 , \tag{2.2}$$
where τ is the proper time along the worldline from A to B. The minus sign seems strange
at this point but we will see shortly that it is needed if we want vectors representing lengths
in space to have positive square. All the diﬀerences between Euclidean space and spacetime
are due to the fact that the inner product in spacetime is not positive deﬁnite. As we will
see this is what makes it possible to separate the time-direction from the spatial directions.
The assumption that $\bar u^2 = -\tau^2$ for vectors corresponding to a segment of a straight
worldline also determines the inner product $\bar u \cdot \bar v$ of two straight worldline vectors $\bar u, \bar v$. To
see this consider three such worldline vectors related by
$$a\bar u = \bar v + \bar w , \tag{2.3}$$
for some nonzero $a \in \mathbb{R}$. Writing this as $\bar w = a\bar u - \bar v$ and squaring both sides we get
$$\bar w^2 = (a\bar u - \bar v) \cdot (a\bar u - \bar v) = a^2\bar u^2 - 2a\,\bar u \cdot \bar v + \bar v^2 . \tag{2.4}$$
Rearranging this we have
$$\bar u \cdot \bar v = \frac{1}{2a}\left(a^2\bar u^2 + \bar v^2 - \bar w^2\right) . \tag{2.5}$$
The right-hand side involves only squares of vectors, which are expressed in terms of the
corresponding proper times. Therefore we see that the inner product $\bar u \cdot \bar v$ is also determined
in terms of the proper times corresponding to the lengths of the vectors $\bar u, \bar v, \bar w$.
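As a quick numerical illustration of (2.5), the following sketch assumes an orthonormal basis in which a spacetime vector has components (t, x, y, z) and the inner product reads −t₁t₂ + x₁x₂ + y₁y₂ + z₁z₂. The notes have not introduced components at this point, so this representation, the helper names and the sample vectors are assumptions made only for illustration.

```python
import math

def minkowski_dot(a, b):
    """Inner product with signature (-, +, +, +); components are (t, x, y, z)."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))

def proper_time(v):
    """Proper time along a straight timelike segment: tau = sqrt(-v.v)."""
    return math.sqrt(-minkowski_dot(v, v))

def dot_from_proper_times(a, tau_u, tau_v, tau_w):
    """Equation (2.5) with u^2 = -tau_u^2, v^2 = -tau_v^2, w^2 = -tau_w^2 (a != 0)."""
    return (-(a ** 2) * tau_u ** 2 - tau_v ** 2 + tau_w ** 2) / (2 * a)

# Hypothetical sample: u, v and w = a*u - v are all timelike.
a = 2.0
u = (5.0, 1.0, 0.0, 0.0)
v = (3.0, 0.0, 1.0, 0.0)
w = tuple(a * ui - vi for ui, vi in zip(u, v))

# Inner product reconstructed from clock readings only ...
from_clocks = dot_from_proper_times(a, proper_time(u), proper_time(v), proper_time(w))
# ... compared with the inner product computed from components.
from_components = minkowski_dot(u, v)
print(from_clocks, from_components)
```

Under these assumptions the two numbers agree up to rounding, which is the content of (2.5): three proper times suffice to fix the inner product.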
It is important to understand that the assumption that there exists an inner product
for spacetime vectors satisfying the above axioms is not a trivial statement. The mere
existence of this inner product has physical consequences. To see this consider the identity
$$(\bar u + \bar v)^2 + (\bar u - \bar v)^2 = 2\bar u^2 + 2\bar v^2 . \tag{2.6}$$
Let’s assume that all these vectors are part of straight worldlines of observers. Since
the expression contains only squares it only involves the proper times measured by these
observers. With four spaceships traveling along these straight worldlines it is then possible
to arrange an experiment (see Figure 2.2) to test whether the proper times they measure
satisfy the above identity, i.e. whether $\tau_1^2 + \tau_2^2 = 2\tau_3^2 + 2\tau_4^2$, where $\tau_1, \tau_2$ are the proper times along $\bar u + \bar v$ and $\bar u - \bar v$, and $\tau_3, \tau_4$ those along $\bar u$ and $\bar v$. One finds that it is indeed
satisfied.
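The identity can also be checked numerically. The sketch below again assumes components (t, x, y, z) with inner product −t₁t₂ + x₁x₂ + y₁y₂ + z₁z₂, which the notes have not yet introduced; the two sample vectors are hypothetical and chosen so that all four vectors involved are timelike.

```python
def minkowski_dot(a, b):
    """Inner product with signature (-, +, +, +); components are (t, x, y, z)."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))

def tau_squared(v):
    """Squared proper time of a timelike vector: tau^2 = -v.v."""
    return -minkowski_dot(v, v)

u = (5.0, 1.0, 0.0, 0.0)   # proper time tau_3
v = (1.0, 0.0, 0.5, 0.0)   # proper time tau_4
u_plus_v = tuple(p + q for p, q in zip(u, v))    # proper time tau_1
u_minus_v = tuple(p - q for p, q in zip(u, v))   # proper time tau_2

lhs = tau_squared(u_plus_v) + tau_squared(u_minus_v)   # tau_1^2 + tau_2^2
rhs = 2 * tau_squared(u) + 2 * tau_squared(v)          # 2 tau_3^2 + 2 tau_4^2
print(lhs, rhs)
```

In this representation the identity holds automatically because the inner product is bilinear; the physical statement in the text is that real clocks behave as if such an inner product exists.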
2.2 Timelike, Spacelike and Null
Let $\bar u, \bar v$ be two straight worldline vectors. Then
$$\bar u^2 = -\tau_u^2 , \qquad \bar v^2 = -\tau_v^2 . \tag{2.7}$$
Figure 2.2: Experiment involving four spaceships to test equation (2.6).
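We now define a new vector which is a linear combination of $\bar u$ and $\bar v$,
$$\bar y = a\bar u + b\bar v , \tag{2.8}$$
for some $a, b \in \mathbb{R}$. Its inner product with $\bar u$ is
$$\bar u \cdot \bar y = a\bar u^2 + b\,\bar u \cdot \bar v . \tag{2.9}$$
Taking $a = -b\,\dfrac{\bar u \cdot \bar v}{\bar u^2}$ we find
$$\bar u \cdot \bar y = 0 . \tag{2.10}$$
We say that $\bar y$ is orthogonal to $\bar u$.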
Now consider the following situation where two spaceships part and then meet again:
[Diagram: the straight worldline of ship 1 and the two-segment worldline of ship 2, which leaves ship 1 and later rejoins it.]
Spaceship 1 is unaccelerated throughout the duration of its journey, while spaceship 2
travels unaccelerated for a while, then accelerates hard for a short time to reverse its
direction of motion and then again ﬂoats freely until it meets spaceship 1 again.
We will take the spacetime vectors corresponding to this situation as in Figure 2.3 with
$$\bar v_1 = \tfrac{1}{2}\bar u + \epsilon\bar y , \qquad \bar v_2 = \tfrac{1}{2}\bar u - \epsilon\bar y , \tag{2.11}$$
where $\bar y$ is the vector introduced above which is orthogonal to $\bar u$ and $\epsilon$ is a small number.
Figure 2.3: Spacetime vectors corresponding to two spaceships parting and meeting again.
Note that $\bar v_1 + \bar v_2 = \bar u$ so that the spaceships indeed meet at the end. The proper time for
the journey of ship 1 is
$$\tau_1 = \sqrt{-\bar u^2} . \tag{2.12}$$
The proper time for the journey of ship 2 is the sum of the proper times for the two segments
of the journey
$$\tau_2 = \sqrt{-\bar v_1^2} + \sqrt{-\bar v_2^2} . \tag{2.13}$$
Since $\bar u \cdot \bar y = 0$ we have $\bar v_1^2 = \tfrac{1}{4}\bar u^2 + \epsilon^2\bar y^2 = \bar v_2^2$ so that
$$\tau_2 = 2\sqrt{-\tfrac{1}{4}\bar u^2 - \epsilon^2\bar y^2} = \sqrt{-\bar u^2}\,\sqrt{1 + \frac{4\epsilon^2\bar y^2}{\bar u^2}} = \tau_1\sqrt{1 - \frac{4\epsilon^2\bar y^2}{\tau_1^2}} . \tag{2.14}$$
The principle of maximum proper time says that the unaccelerated observer measures the
longest proper time, i.e. $\tau_1 > \tau_2$ (we assume $\bar y \neq \bar 0$). This in turn implies that
$$\sqrt{1 - \frac{4\epsilon^2\bar y^2}{\tau_1^2}} < 1 \;\Rightarrow\; \bar y^2 > 0 , \tag{2.15}$$
i.e. the vector $\bar y$ has positive square. This result was obtained assuming $\bar u^2 = -\tau_1^2 < 0$. If
we had decided instead to take the opposite convention, i.e. $\bar u^2 = +\tau_1^2 > 0$, the same
calculation would give $\bar y^2 < 0$. We see that, in contrast to what we are used to from
Euclidean space, it is not possible for all spacetime vectors to have positive square. This
fact follows from the principle of maximum proper time. Clearly no observer can travel
along $\bar y$ because then his clock would need to show an imaginary time, which is absurd.
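To see (2.11) to (2.14) at work, the sketch below evaluates both proper times for one hypothetical choice of vectors. It assumes components (t, x, y, z) with inner product −t₁t₂ + x₁x₂ + y₁y₂ + z₁z₂, which is not something the notes have set up at this stage; the numbers are chosen only so that both legs of ship 2 stay timelike.

```python
import math

def minkowski_dot(a, b):
    """Inner product with signature (-, +, +, +); components are (t, x, y, z)."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))

def proper_time(v):
    """Proper time along a straight timelike segment: tau = sqrt(-v.v)."""
    return math.sqrt(-minkowski_dot(v, v))

u = (10.0, 0.0, 0.0, 0.0)   # worldline vector of ship 1 (assumed sample)
y = (0.0, 1.0, 0.0, 0.0)    # orthogonal to u, with y.y = 1 > 0
eps = 3.0                   # small enough that 4 eps^2 y^2 < tau_1^2

v1 = tuple(0.5 * ui + eps * yi for ui, yi in zip(u, y))   # outbound leg of ship 2
v2 = tuple(0.5 * ui - eps * yi for ui, yi in zip(u, y))   # return leg of ship 2

tau_1 = proper_time(u)                          # equation (2.12)
tau_2 = proper_time(v1) + proper_time(v2)       # equation (2.13)
tau_2_closed_form = tau_1 * math.sqrt(
    1 - 4 * eps ** 2 * minkowski_dot(y, y) / tau_1 ** 2
)                                               # equation (2.14)
print(tau_1, tau_2, tau_2_closed_form)
```

With these sample values the accelerated ship records less proper time than the unaccelerated one, as the principle of maximum proper time requires when $\bar y^2 > 0$.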
Consider now the vector
$$\bar w = c\bar u + d\bar y , \tag{2.16}$$
with $\bar u$ and $\bar y$ orthogonal as before. Squaring this we find
$$\bar w^2 = c^2\bar u^2 + d^2\bar y^2 . \tag{2.17}$$
If we take $c^2 = -d^2\,\bar y^2/\bar u^2$ (note that the RHS is positive, which is consistent with $c, d$ being
real numbers) we get $\bar w^2 = 0$! We conclude that there also exist spacetime vectors $\bar w \neq \bar 0$
such that $\bar w^2 = 0$.
To summarize we have learned that there are 3 classes of spacetime vectors:
• $\bar v^2 < 0$: Timelike
• $\bar v^2 > 0$: Spacelike
• $\bar v^2 = 0$: Null (light-like)
Vectors that are part of a straight worldline of an observer are timelike. We have seen
above that the principle of maximum proper time implies that if $\bar u$ is timelike and $\bar u \cdot \bar y = 0$
then $\bar y$ is spacelike (or $\bar y = \bar 0$). This is a very useful result to remember when working with
spacetime vectors:
$\bar u$ timelike and $\bar u \cdot \bar v = 0$ ⇒ $\bar v$ spacelike (or $\bar v = \bar 0$).
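The classification, together with the null-vector construction of (2.16) and (2.17), can be sketched in a few lines. As before, components (t, x, y, z) with inner product −t₁t₂ + x₁x₂ + y₁y₂ + z₁z₂ are an assumption of this illustration, and the function name `classify` and the tolerance are hypothetical choices.

```python
import math

def minkowski_dot(a, b):
    """Inner product with signature (-, +, +, +); components are (t, x, y, z)."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))

def classify(v, tol=1e-12):
    """Return 'timelike', 'spacelike' or 'null' according to the sign of v.v.

    Note: the zero vector also has v.v = 0 and is reported as 'null' here.
    """
    s = minkowski_dot(v, v)
    if abs(s) <= tol:
        return "null"
    return "timelike" if s < 0 else "spacelike"

u = (2.0, 0.0, 0.0, 0.0)   # timelike sample vector
y = (0.0, 3.0, 0.0, 0.0)   # orthogonal to u, hence spacelike

# Choose c so that w = c*u + d*y has w.w = 0, following the text below (2.17).
d = 1.0
c = math.sqrt(-d ** 2 * minkowski_dot(y, y) / minkowski_dot(u, u))
w = tuple(c * ui + d * yi for ui, yi in zip(u, y))

print(classify(u), classify(y), classify(w))
```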
Chapter 3
Simultaneity and spatial distance
An observer traveling along in a spaceship only has direct access to the interior of the
spaceship. Nevertheless they must be able to make statements and inferences about what
happens in the outside world. To be able to do this they need in particular to be able to
say when an event, which is not on their worldline, occurred. Another way to say it is that
they need to have a way to determine whether an event far away is simultaneous with an
event on their worldline, e.g. a supernova explosion far away happens when their clock
shows 10:23.
The natural way to do this is via the construction in ﬁgure 3.1. The observer sends
out a probe, which travels on a straight worldline to the event and on a straight worldline
back. She arranges it so that the probe reaches the event precisely when half the proper
time of its journey has elapsed. Then she will say that the event on her worldline halfway
between sending out and receiving the probe is simultaneous with the distant event.
From the figure we have
$$\tau^2 = -(\bar v + \bar r)^2 = -(\bar v - \bar r)^2 \;\Rightarrow\; \bar v \cdot \bar r = 0 , \tag{3.1}$$
so that $\bar r$, being orthogonal to a timelike vector, must be spacelike.
We also need a way to measure spatial distances. To see how to do this let us consider
a family of straight parallel worldlines $L_0, L_1, \dots$ defined by the equation
$$\bar R_n = \lambda_n\bar u + n\bar\rho , \qquad n = 0, 1, 2, \dots \tag{3.2}$$
where $\bar u, \bar\rho$ are timelike vectors and $\lambda_n \in \mathbb{R}$ parametrizes a point on the n’th worldline,
Ln. This is illustrated in ﬁgure 3.2. This could be a ﬂeet of identical spaceships traveling
unaccelerated and arranged head to tail. Consider now an observer traveling from the front
of the ﬂeet to the back, counting how many ships he passes. This number is a measure of
how far it is from the head of the ﬂeet to the tail. The distance is expressed in units of
“standard spaceship”.
Figure 3.1: Via this construction the observer decides that $P_2$, halfway between $P_1$ and $P_3$,
is simultaneous with $P$.
There is an alternative way to measure this distance. We first note that there is only
one vector going from $L_0$ to $L_n$ with the property that it is orthogonal to $\bar u$ [1].
It is given by
$$\bar r_n = n\bar\rho - \frac{n\bar\rho \cdot \bar u}{\bar u^2}\,\bar u = n\left(\bar\rho - \frac{\bar\rho \cdot \bar u}{\bar u^2}\,\bar u\right) . \tag{3.3}$$
Note that $\bar r_n$, and therefore also its magnitude, is proportional to n, the number of spaceships.
We can therefore use the magnitude $\sqrt{\bar r_n^2}$ as a measure of the distance. All we need
to do is work out the conversion factor to go between $\sqrt{\bar r_n^2}$ and the number of spaceships.
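Looking at figure 3.1 we read off
$$\tau^2 = -(\bar v + \bar r)^2 = t^2 - \bar r^2 , \quad\text{or}\quad \bar r^2 = t^2 - \tau^2 , \tag{3.4}$$
where $t = \sqrt{-\bar v^2}$ is the time elapsed on the observer’s clock along $\bar v$ and we used $\bar v \cdot \bar r = 0$.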
The advantage of this method is that we don’t need the ﬂeet of spaceships (other than
to ﬁx the unit of distance). Later we will ﬁnd an even more practical way to measure
distance.
How we pick the unit of distance is up to us. Nothing prevents us from choosing units
such that $\sqrt{\bar r^2}$ itself is the distance. This is in fact the most natural choice to make and we
will stick to it in this course. From (3.4) we see that space and time then acquire the same
dimensions. In the theory of relativity this is as natural as, say, height and width having
the same dimensions and being measured in the same units.
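The probe construction of figure 3.1 combined with (3.4) amounts to a small calculation, sketched below. The function name and its arguments are hypothetical; the sketch assumes the probe was arranged so that it spends equal proper time on each leg, and that distance is measured in the same units as time, as chosen above.

```python
import math

def locate_event_with_probe(t_sent, t_received, probe_proper_time_total):
    """Time and distance the observer assigns to the event reached by the probe.

    t_sent, t_received: readings of the observer's own clock.
    probe_proper_time_total: total proper time on the probe's clock (2 * tau).
    """
    t = (t_received - t_sent) / 2.0        # observer time for each half
    tau = probe_proper_time_total / 2.0    # probe proper time for each leg
    if tau > t:
        raise ValueError("probe proper time cannot exceed the observer's time")
    t_event = t_sent + t                   # midpoint is simultaneous with the event
    distance = math.sqrt(t ** 2 - tau ** 2)   # equation (3.4)
    return t_event, distance

# Hypothetical readings: probe sent at 0, back at 10, probe clock shows 8 in total.
t_event, distance = locate_event_with_probe(0.0, 10.0, 8.0)
print(t_event, distance)
```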
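[1] Proof: It is clear that at least one such vector exists. Assume there are two such vectors $\bar r_1, \bar r_2$. We
may assume their foot-point is the same point on $L_0$. The fact that $\bar u \cdot \bar r_1 = \bar u \cdot \bar r_2 = 0$ implies $\bar u \cdot (\bar r_1 - \bar r_2) = 0$.
But $\bar r_1 - \bar r_2 = \lambda\bar u$ for some $\lambda$, since both vectors end on $L_n$, and the previous equation implies $\lambda = 0$ so that $\bar r_1 = \bar r_2$.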
Figure 3.2: A family of parallel worldlines $L_0, L_1, L_2, L_3, \dots$, with the vectors $\bar u$ and $\bar\rho$ indicated.
3.1 Orthogonal space
To every unaccelerated observer there corresponds a straight worldline. Such a worldline
is characterized by a timelike vector which we can normalize to a unit vector $\hat u$. We will
always use a ‘hat’ to denote unit vectors. A timelike unit vector satisfies $\hat u^2 = -1$ and a
spacelike unit vector $\hat r^2 = 1$. Given an observer with worldline unit vector $\hat u$ there exist
orthogonal vectors $\bar r$,
$$\hat u \cdot \bar r = 0 . \tag{3.5}$$
They form the orthogonal space to the observer’s worldline. This is a vector space since
linear combinations of such vectors clearly belong to the space. In fact, since all such $\bar r$
are spacelike (or zero), it is a Euclidean vector space. Since we are imposing one condition
on the four components of $\bar r$ the orthogonal space is three-dimensional. Recall that $\sqrt{\bar r^2}$
is the (spatial) distance from the observer. The orthogonal space is the space used in
Newtonian physics. The difference is that in the theory of relativity each observer has
their own orthogonal space.
Given the worldline of an observer with direction $\hat u$, we can split any spacetime vector
$\bar R$ into a component along $\hat u$ and a component orthogonal to it as
$$\bar R = t\hat u + \bar r \quad\text{with}\quad \hat u \cdot \bar r = 0 . \tag{3.6}$$
This is illustrated in figure 3.3. According to the figure, an observer following the worldline
L measures the event corresponding to $\bar R$ to happen at a time t and spatial position $\bar r$. The
distance to the event is $\sqrt{\bar r^2}$. From the equation we find $t = -\hat u \cdot \bar R$ and $\bar r = \bar R + (\hat u \cdot \bar R)\hat u$.
We see that t and $\bar r$ are uniquely fixed in terms of $\hat u$ and $\bar R$.
Figure 3.3: Split of a spacetime vector $\bar R$ with respect to a timelike direction $\hat u$.
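The split (3.6) is straightforward to carry out numerically. The sketch below assumes components (t, x, y, z) in some fixed orthonormal basis with inner product −t₁t₂ + x₁x₂ + y₁y₂ + z₁z₂; this component form, the function name `split` and the sample vectors are assumptions of the illustration.

```python
import math

def minkowski_dot(a, b):
    """Inner product with signature (-, +, +, +); components are (t, x, y, z)."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))

def split(R, u_hat):
    """Split R = t*u_hat + r with u_hat.r = 0, for a timelike unit vector u_hat."""
    t = -minkowski_dot(u_hat, R)                        # t = -u_hat.R
    r = tuple(Ri - t * ui for Ri, ui in zip(R, u_hat))  # r = R + (u_hat.R) u_hat
    return t, r

# Hypothetical observer with u_hat.u_hat = -1, and a sample event vector R.
u_hat = (1.25, 0.75, 0.0, 0.0)
R = (5.0, 1.0, 2.0, 0.0)

t, r = split(R, u_hat)
distance = math.sqrt(minkowski_dot(r, r))   # spatial distance assigned by the observer
print(t, r, distance)
```

A useful consistency check is that $\bar R^2 = -t^2 + \bar r^2$, which follows from squaring (3.6).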
3.2 Linearly independent vectors
Consider four spacetime vectors
$$\bar A, \bar B, \bar C, \bar D . \tag{3.7}$$
They are linearly independent if none of them can be expressed as a linear combination of
the others, or equivalently if the equation
$$a\bar A + b\bar B + c\bar C + d\bar D = \bar 0 \tag{3.8}$$
has only the trivial solution a = b = c = d = 0. Note that since spacetime is four-dimensional
we cannot have more than four linearly independent vectors.
In Euclidean space we are used to two orthogonal vectors being linearly independent.
This is not true in spacetime, e.g. a null vector is orthogonal to itself but clearly not
linearly independent of itself. What is true is that if $\bar v \cdot \bar u = 0$ for all $\bar u$ then $\bar v = \bar 0$. To see
this take $\bar u$ timelike. We conclude that $\bar v$ must be spacelike (or zero) and lies in the orthogonal space of $\bar u$. But since it is also orthogonal to
every vector in that space, including itself, it must vanish, because the orthogonal space is a Euclidean vector space.
To test if $\bar A, \bar B, \bar C, \bar D$ are linearly independent we form the determinant of the matrix
of inner products
$$\begin{vmatrix}
\bar A \cdot \bar A & \bar A \cdot \bar B & \bar A \cdot \bar C & \bar A \cdot \bar D \\
\bar B \cdot \bar A & \bar B \cdot \bar B & \bar B \cdot \bar C & \bar B \cdot \bar D \\
\bar C \cdot \bar A & \bar C \cdot \bar B & \bar C \cdot \bar C & \bar C \cdot \bar D \\
\bar D \cdot \bar A & \bar D \cdot \bar B & \bar D \cdot \bar C & \bar D \cdot \bar D
\end{vmatrix} \tag{3.9}$$
The vectors are linearly dependent if and only if this determinant vanishes. To see this assume
they are linearly dependent. Then (3.8) holds for some non-zero coefficients. Taking
this linear combination of rows in the matrix we obtain a row of zeros so that the determinant
vanishes. Conversely, if the determinant vanishes there exists a linear combination of
the rows that gives zero. This means that there exists a vector $a\bar A + b\bar B + c\bar C + d\bar D$, with
a, b, c, d not all zero, which is orthogonal to $\bar A, \bar B, \bar C, \bar D$. Assuming they are linearly independent
leads to a contradiction since this vector would then be orthogonal to all vectors
and would therefore have to vanish; therefore they must be linearly dependent.
Note that this test works only for four vectors. It does not work for lower-dimensional
subspaces of spacetime.
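The determinant test (3.9) can be tried out on concrete vectors. The sketch below assumes integer components (t, x, y, z) with inner product −t₁t₂ + x₁x₂ + y₁y₂ + z₁z₂ so that the arithmetic is exact; the helper names and the two sample sets of vectors are hypothetical.

```python
def minkowski_dot(a, b):
    """Inner product with signature (-, +, +, +); components are (t, x, y, z)."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))

def det(m):
    """Determinant by cofactor expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def gram_determinant(vectors):
    """Determinant of the matrix of inner products, as in (3.9)."""
    gram = [[minkowski_dot(p, q) for q in vectors] for p in vectors]
    return det(gram)

independent = [(1, 0, 0, 0), (1, 1, 0, 0), (0, 1, 1, 0), (0, 0, 1, 1)]
dependent = [(1, 0, 0, 0), (0, 1, 0, 0), (1, 1, 1, 0), (1, 1, 0, 0)]  # D = A + B

print(gram_determinant(independent), gram_determinant(dependent))
```

In this sketch a non-zero result signals linear independence and a zero result signals dependence, in line with the argument above.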
Chapter 4
Velocity and light signals
Consider an observer and an object following straight worldlines L and L′ respectively,
figure 4.1. The observer finds the object’s position at a time t to be given by $\bar r$, with $\bar r = \bar 0$
at t = 0. He assigns the object the velocity
$$\bar v = \frac{d\bar r}{dt} = \frac{\bar r}{t} , \tag{4.1}$$
where the last equality follows from the fact that they are both following straight worldlines,
so $\bar r$ is proportional to t. This velocity clearly depends on the observer, since $\bar r$ and t refer
to the observer. For this reason it is called the relative velocity of the observer and the
object. Note that $\bar v$ is orthogonal to $\hat u$, i.e. belongs to the observer’s orthogonal space. In
particular this means that the relative velocity is spacelike.
Figure 4.1: Observer L describes the object L′ as having position $\bar r$ at time t.
The unit vector along the object’s worldline $\hat v$ tells us how the relative velocity $\bar v$ is
directed relative to the direction of the observer’s worldline, $\hat u$. We call $\hat v$ the four-velocity
of the object. Note that any timelike unit vector (pointing forwards in time) can be a
four-velocity, since it could be the direction of an object’s worldline.
4.1 Standard velocity split
Let ˆv be the four-velocity of the object and ˆu that of the observer. The object’s spacetime
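position is given by $\bar R = \tau\hat v$ and from figure 4.1 we see that
$$\tau\hat v = t\hat u + \bar r = t(\hat u + \bar v) , \tag{4.2}$$
where we used the definition of the relative velocity in (4.1). Dividing by $\tau$ we can write
this as
$$\hat v = \gamma(\hat u + \bar v) \quad\text{where}\quad \gamma = \frac{t}{\tau} \quad\text{and}\quad \hat u \cdot \bar v = 0 . \tag{4.3}$$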
This formula for the split of the four-velocity of the object into the four-velocity of the
observer and the relative velocity is very useful and has many applications. For example,
squaring this equation gives
$$1 = \gamma^2(1 - v^2) \quad\text{or}\quad \frac{t}{\tau} = \gamma = \frac{1}{\sqrt{1 - v^2}} , \tag{4.4}$$
where $v^2 = \bar v^2$, the relative velocity squared. Note that $v^2 = 1 - \tau^2/t^2 \le 1$. Equation (4.4) is the famous
time dilation formula. Since $t = \gamma\tau$ and $\gamma \ge 1$ the observer sees the object’s clock slowed
down by a factor of $1/\gamma$.
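A minimal numerical sketch of (4.4), in the units of these notes where velocities are dimensionless; the function names and sample numbers are hypothetical.

```python
import math

def gamma(v):
    """Time dilation factor gamma = 1 / sqrt(1 - v^2) for relative speed 0 <= v < 1."""
    if not 0.0 <= v < 1.0:
        raise ValueError("relative speed must satisfy 0 <= v < 1")
    return 1.0 / math.sqrt(1.0 - v * v)

def speed_from_times(t, tau):
    """Relative speed from observer time t and object proper time tau: v^2 = 1 - tau^2/t^2."""
    return math.sqrt(1.0 - (tau / t) ** 2)

v = 0.6
tau = 8.0                 # proper time elapsed on the object's clock
t = gamma(v) * tau        # time elapsed according to the observer, t = gamma * tau
print(gamma(v), t, speed_from_times(t, tau))
```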
Taking the inner product of (4.3) with $\hat u$ we find
$$\hat u \cdot \hat v = -\gamma = -\frac{1}{\sqrt{1 - v^2}} . \tag{4.5}$$
Notice that this implies that the deﬁnition of the relative velocity v is symmetric between
the observer and object. The object judges the observer to have the same velocity as the
observer assigns to the object.
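Equations (4.3) and (4.5) together give a recipe for extracting the relative velocity from two four-velocities, sketched below. The component form (t, x, y, z) with inner product −t₁t₂ + x₁x₂ + y₁y₂ + z₁z₂, the helper names, and the choice of two objects moving at ±0.6 along x in some reference frame are all assumptions of this illustration.

```python
import math

def minkowski_dot(a, b):
    """Inner product with signature (-, +, +, +); components are (t, x, y, z)."""
    return -a[0] * b[0] + sum(p * q for p, q in zip(a[1:], b[1:]))

def four_velocity(vx, vy=0.0, vz=0.0):
    """Timelike unit vector gamma * (1, vx, vy, vz) in the assumed reference frame."""
    g = 1.0 / math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz))
    return (g, g * vx, g * vy, g * vz)

def relative_velocity(u_hat, v_hat):
    """Return (gamma, v_bar) with v_hat = gamma * (u_hat + v_bar) and u_hat.v_bar = 0."""
    g = -minkowski_dot(u_hat, v_hat)                           # equation (4.5)
    v_bar = tuple(vi / g - ui for vi, ui in zip(v_hat, u_hat)) # solve (4.3) for v_bar
    return g, v_bar

u_hat = four_velocity(0.6)     # observer
v_hat = four_velocity(-0.6)    # object

g, v_bar = relative_velocity(u_hat, v_hat)
speed = math.sqrt(minkowski_dot(v_bar, v_bar))

# Swapping the roles of observer and object gives the same gamma and speed.
g_swapped, v_bar_swapped = relative_velocity(v_hat, u_hat)
speed_swapped = math.sqrt(minkowski_dot(v_bar_swapped, v_bar_swapped))
print(g, speed, speed_swapped)
```

The symmetry visible in the output reflects the fact that $\hat u \cdot \hat v$ does not care which of the two vectors belongs to the observer.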
4.2 Light signals
So far we have not discussed null lines, which are on the border between timelike and
spacelike lines. Now let us assume that L′ in figure 4.1 is such a null line. Then
$$\tau^2 = -\bar R^2 = 0 , \tag{4.6}$$
but it is still true that the object’s position is given by
$$\bar R = t\hat u + \bar r . \tag{4.7}$$
Squaring this we get
$$-t^2 + \bar r^2 = 0 \quad\text{or}\quad v^2 = 1 . \tag{4.8}$$
Note that this result is independent of the observer. All observers would measure the
relative velocity of such a signal to have magnitude v = 1.
It does not follow from the theory of relativity itself that things following such null
lines exist. Note that it would be meaningless to assign them a clock since it would not
tick (τ = 0 along such a line).
Nevertheless, experience tells us that there exist signals in nature that can travel along
null lines. Light (electromagnetic radiation) is the most important example, and v = 1
is called the “velocity of light”. (More generally, as we will see later, a particle follows
a null worldline if and only if it is massless. The quantum of light is a massless particle
called the photon.)
The existence of such signals is of great practical importance. For example, recall that
to determine simultaneity we used the setup in figure 3.1. The observer has to arrange
the situation so that the proper times of the probe going to and from the event are the
same. To achieve this in practice presents great difficulties. The existence of light, or more
generally electromagnetic, signals solves this problem: if such a signal is used instead of
the probe, τ = 0 on both legs automatically. Equivalently, the observer knows that v = 1 and can therefore
calculate the distance directly from the time it takes the signal to go there and back.
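With a light signal the bookkeeping reduces to two clock readings. The sketch below shows this under the notes' convention v = 1 for light; the function name `radar_fix` is hypothetical, and the SI conversion uses the defined value of the speed of light, which is not part of the notes.

```python
def radar_fix(t_emit, t_return):
    """Time and distance assigned to a distant event from a reflected light signal.

    t_emit, t_return: readings of the observer's clock when the signal leaves
    and when its reflection comes back. Distance is in the same units as time.
    """
    if t_return < t_emit:
        raise ValueError("the signal cannot return before it is emitted")
    t_event = 0.5 * (t_emit + t_return)    # midpoint is simultaneous with the event
    distance = 0.5 * (t_return - t_emit)   # light covers unit distance per unit time
    return t_event, distance

# Hypothetical readings in seconds.
t_event, distance = radar_fix(2.0, 12.0)

# Optional conversion to metres (SI definition of the speed of light).
SPEED_OF_LIGHT_M_PER_S = 299_792_458
distance_in_metres = distance * SPEED_OF_LIGHT_M_PER_S
print(t_event, distance, distance_in_metres)
```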
Imagine an observer who sends out a ﬂash of light at t = 0. The light pulse travels out
in all directions with unit velocity forming a sphere of light. To draw the corresponding
spacetime diagram we must go down to two dimensions of space where the light forms a
circle traveling outwards from the observer, ﬁgure 4.2. In this three-dimensional spacetime
picture the light forms a cone. For this reason it is referred to as the light-cone. It is the
surface given by the equation
$$r = t , \quad\text{where}\quad r = \sqrt{\bar{r}^2} . \tag{4.9}$$
What we have drawn is really only half of the light-cone, called the future light-cone. There
is also the past light-cone given by
r = −t . (4.10)
It describes a sphere of light contracting towards the origin. The full light cone is given by
the equation
$$r^2 = t^2 \tag{4.11}$$
and is illustrated in ﬁgure 4.3. Note that this equation is true for any observer, so the
light-cone looks the same to any observer. Don’t be misled by the picture, which might
seem to suggest otherwise!
[Figure 4.2: diagram omitted; labels t, ¯r, ¯n.]
Figure 4.2: An observer’s future light-cone. The circle of light is at distance r at time t
and traveling out in all null directions, e.g. ¯n.
4.3 Split of null vectors
As we have seen, any spacetime vector can be split with respect to some four-velocity ˆu
into a component along ˆu and a component orthogonal to it. For a null vector ¯n we get
¯n = a(ˆu + ¯m) (4.12)
but $\bar{n}^2 = 0$ implies $\bar{m}^2 = 1$ so that
¯n = a(ˆu + ˆm) , (4.13)
proportional to the sum of a timelike and a spacelike unit vector. If a vector ¯k is orthogonal
to ¯n it is either spacelike or proportional to ¯n.¹ In particular two orthogonal null vectors
must be parallel.
¹ Proof: Writing ¯k = b(ˆu + ¯r) we find the condition (we may assume a, b ≠ 0)
$$\hat{m}\cdot\bar{r} = 1 .$$
Since ˆm and ¯r belong to the orthogonal subspace they are Euclidean. Therefore we must have either
$\sqrt{\bar{r}^2} > 1$, in which case ¯k is spacelike, or $\sqrt{\bar{r}^2} = 1$ and ¯r = ˆm, in which case $\bar{k} = \frac{b}{a}\bar{n}$.
Figure 4.3: The full light-cone.
4.4 Future and past
Consider a timelike vector ¯v at the origin. We can split it with respect to the four-velocity
of an observer as
¯v = tˆu + ¯r , ˆu · ¯r = 0 . (4.14)
Squaring we ﬁnd
$$\bar{v}^2 = r^2 - t^2 < 0 , \tag{4.15}$$
which implies that every timelike vector is pointing inside the light-cone (see ﬁgure 4.2).
Replacing ¯v with a spacelike vector we similarly ﬁnd that every spacelike vector points
outside the light-cone. Null vectors point along the light-cone.
Let ¯u, ¯v be two timelike vectors with negative inner product
¯u · ¯v < 0 . (4.16)
Construct the linear combination
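$$\bar{w} = a\bar{u} + b\bar{v} , \quad\text{with}\quad a, b \ge 0 , \quad a + b \neq 0 . \tag{4.17}$$
Squaring we find
$$\bar{w}^2 = a^2\bar{u}^2 + 2ab\,\bar{u}\cdot\bar{v} + b^2\bar{v}^2 < 0 , \tag{4.18}$$
since no term is positive and at least one is strictly negative, so that ¯w is also timelike. By varying a, b we can continuously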
go from the vector ¯u to the vector ¯v via only timelike vectors. This would be impossible
if one was pointing into the future light-cone and the other into the past light-cone. We
therefore conclude that two timelike vectors with negative inner product must be pointing
inside the same part of the light-cone, i.e. either both into the future light-cone or both
into the past light-cone. It is easy to see that the same argument goes through if we take
one timelike and one null vector.
If instead ¯u · ¯v > 0 then ¯u and −¯v must point inside the same part of the light-cone so
that one of ¯u and ¯v is pointing to the past and the other to the future. Conversely, two
timelike vectors pointing inside the same part (future/past) of the light-cone have ¯u· ¯v < 0
(note that ¯u · ¯v cannot vanish). Again this goes through if one of them is null.
Let ˆu, ˆv be timelike and future directed. Then
$$(\hat{u} + \hat{v})^2 = -2 + 2\,\hat{u}\cdot\hat{v} < 0 \quad\text{and}\quad (\hat{u} + \hat{v})\cdot\hat{v} = \hat{u}\cdot\hat{v} - 1 < 0 , \tag{4.19}$$
from which we conclude that ˆu + ˆv is also timelike and future directed. This must hold
for any sum of timelike future directed vectors. An important consequence of this is that
no spaceship (or other object) can reverse its four-velocity and travel to the past to arrive
before it departed. Time travel is therefore impossible in the theory of relativity. Related
to this, if two spaceships part and then meet again they will always agree that they parted
before they met.
We have seen that timelike (and null) vectors can be divided into two classes: future
directed and past directed. Note that no such division is possible for spacelike vectors.
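The classification of this section can be sketched numerically. The following is a minimal illustration, assuming vectors are given by their components (t, x, y, z) relative to some observer in units with c = 1. The helper names `inner`, `classify` and `same_half_cone` are hypothetical and not part of these notes.

```python
def inner(a, b):
    """Spacetime inner product with signature (-, +, +, +)."""
    return -a[0] * b[0] + sum(ai * bi for ai, bi in zip(a[1:], b[1:]))


def classify(v):
    """Return 'timelike', 'null' or 'spacelike' from the sign of v.v.

    Exact comparison with zero is only meaningful for exactly representable
    components; for general floats a tolerance would be needed.
    """
    s = inner(v, v)
    if s < 0:
        return "timelike"
    if s == 0:
        return "null"
    return "spacelike"


def same_half_cone(u, v):
    """True if two timelike vectors (or one timelike and one null vector)
    point into the same half of the light-cone, i.e. if u.v < 0."""
    return inner(u, v) < 0


u = (2, 1, 0, 0)    # timelike, t > 0
v = (3, -1, 0, 0)   # timelike, t > 0
w = tuple(ui + vi for ui, vi in zip(u, v))
print(classify(u), classify(v), same_half_cone(u, v), classify(w))
```

The last line illustrates the statement that the sum of two timelike vectors in the same half of the light-cone is again timelike.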
Chapter 5
Lorentz transformation
In this chapter we will restrict to situations where all the spacetime vectors involved lie
in a two-dimensional plane. In particular, this means that they can all be expressed in
terms of two linearly independent vectors. Many problems one encounters are in fact of
this type.
In the cases of interest this plane contains a timelike future directed unit vector, which
could be the four-velocity of an observer. Every vector can be split with respect to this four-velocity,
figure 3.3. Note that this corresponds to a one-dimensional problem in Newtonian
physics since there is only one spatial direction.
We are interested in how to relate the measurements of two diﬀerent observers. Let us
consider ﬁrst the analogous situation in Euclidean space, with two sets of orthogonal vectors
rotated with respect to each other. Thinking of this as two diﬀerent coordinate systems we
know how to relate them via the rotation angle. Consider now the corresponding situation
in spacetime, illustrated in ﬁgure 5.1, which occurs often. We have two sets of orthogonal
unit vectors ˆu, ˆr with ˆu · ˆr = 0 and ˆv, ˆs with ˆv · ˆs = 0 and $\hat{u}^2 = \hat{v}^2 = -1$, $\hat{r}^2 = \hat{s}^2 = 1$.
Obviously ˆu and ˆr are linearly independent since one is timelike and one is spacelike. Since
we are restricting to a two-dimensional plane we can express ˆv and ˆs in terms of ˆu and ˆr.
Up to a proportionality factor we have
ˆv ∝ ˆu + αˆr , ˆs ∝ ˆr + αˆu , (5.1)
for some α ∈ R, where we have used the fact that ˆv · ˆs = 0 to ﬁx the form of ˆs. Let us
compute the length of the vectors on the RHS to ﬁx the normalization. We have
$$(\hat{r} + \alpha\hat{u})^2 = -(\hat{u} + \alpha\hat{r})^2 = 1 - \alpha^2 \tag{5.2}$$
and therefore we must have
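$$\hat{v} = \pm\frac{1}{\sqrt{1-\alpha^2}}(\hat{u} + \alpha\hat{r}) , \qquad \hat{s} = \pm\frac{1}{\sqrt{1-\alpha^2}}(\hat{r} + \alpha\hat{u}) , \tag{5.3}$$
so that $\hat{s}^2 = -\hat{v}^2 = 1$. Demanding that ˆv → ˆu and ˆs → ˆr as α → 0 we see that we need to
pick the plus signs.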
[Figure 5.1: diagram omitted; labels ˆu, ˆv, ˆr, ˆs.]
Figure 5.1: Two sets of orthogonal unit vectors in spacetime.
Recalling now the standard velocity split, eq. (4.3)
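$$\hat{v} = \frac{1}{\sqrt{1-v^2}}(\hat{u} + \bar{v}) , \qquad \hat{u}\cdot\bar{v} = 0 , \tag{5.4}$$
we read off α = ±v with $v = \sqrt{\bar{v}^2}$, the magnitude of the relative velocity. We therefore
have
$$\hat{v} = \gamma(\hat{u} + v\hat{r}) , \qquad \hat{s} = \gamma(\hat{r} + v\hat{u}) , \qquad \gamma = \frac{1}{\sqrt{1-v^2}} , \tag{5.5}$$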
where we have absorbed the sign into v so that v is positive if the relative velocity is along
ˆr and negative if it is along −ˆr. This is the famous Lorentz transformation. It tells us
how the measurements of two inertial observers are related. To see this consider an event
speciﬁed by the spacetime vector ¯R. We can write it in two ways as
$$\bar{R} = t\hat{u} + x\hat{r} \quad\text{or}\quad \bar{R} = t'\hat{v} + x'\hat{s} . \tag{5.6}$$
Observer u assigns coordinates (t, x) to the event while observer v assigns it coordinates
(t′, x′). Equating the two expressions for ¯R and using the formula for the Lorentz transformation
we find
$$t\hat{u} + x\hat{r} = t'\gamma(\hat{u} + v\hat{r}) + x'\gamma(\hat{r} + v\hat{u}) \tag{5.7}$$
or
$$t = \gamma(t' + vx') , \qquad x = \gamma(x' + vt') . \tag{5.8}$$
This is the Lorentz transformation that relates the time and space measurements of the
two observers.
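As a quick numerical illustration of eq. (5.8), the sketch below converts the coordinates (t′, x′) assigned by observer v into the coordinates (t, x) assigned by observer u, and checks that the combination −t² + x² is the same for both. The function names are hypothetical, and units with c = 1 are assumed as in the rest of the notes.

```python
import math


def gamma(v):
    """Lorentz factor for relative velocity v (|v| < 1, units with c = 1)."""
    return 1.0 / math.sqrt(1.0 - v * v)


def to_u_coords(t_prime, x_prime, v):
    """Eq. (5.8): coordinates (t, x) of observer u from (t', x') of observer v."""
    g = gamma(v)
    return g * (t_prime + v * x_prime), g * (x_prime + v * t_prime)


def interval(t, x):
    """The square of the spacetime vector t*u + x*r, i.e. -t^2 + x^2."""
    return -t * t + x * x


t, x = to_u_coords(2.0, 1.0, 0.6)
print(t, x, interval(t, x), interval(2.0, 1.0))

# Transforming back with -v should recover the original coordinates.
print(to_u_coords(t, x, -0.6))
```

The second call anticipates the fact that the inverse transformation is obtained by changing the sign of v.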
5.1 Addition of velocities
In Euclidean space we can perform ﬁrst one rotation and then another. The result is a
third rotation. Similarly, we can perform ﬁrst one Lorentz transformation with parameter
(relative velocity) v1 and then another with parameter v2. This clearly gives a third
Lorentz transformation. The only question is how the parameter of the third Lorentz
transformation is related to v1, v2.
To answer this question we note that the ﬁnal timelike unit vector is proportional to
(ˆu + v1ˆr) + v2(ˆr + v1 ˆu) = (1 + v1v2)ˆu + (v1 + v2)ˆr . (5.9)
From the equation ˆv = γ(ˆu + vˆr) we see that we can read oﬀ the relative velocity v as the
ratio of the ˆr-component and the ˆu-component. In this way we ﬁnd
$$v = \frac{v_1 + v_2}{1 + v_1 v_2} . \tag{5.10}$$
This is the relativistic formula for addition of velocities. It has been tested to high accuracy
in experiments with light propagating through ﬂowing liquids. Note that for velocities much
smaller than the speed of light, v1, v2 ≪ 1, we have v ≈ v1 + v2, the Newtonian formula
for addition of velocities. The relativistic formula guarantees that the relative velocity v
always remains less than the speed of light v < 1 for v1, v2 < 1.
In the special case v2 = −v1 we get v = 0. This means that the inverse Lorentz
transformation is obtained by changing the sign of v. This is easily veriﬁed directly by
performing ﬁrst a Lorentz transformation with parameter v and then one with parameter
−v.
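The properties of eq. (5.10) mentioned above are easy to check numerically. This is a minimal sketch; the function name `add_velocities` is a hypothetical helper and velocities are in units with c = 1.

```python
def add_velocities(v1, v2):
    """Relativistic addition of collinear velocities, eq. (5.10)."""
    return (v1 + v2) / (1.0 + v1 * v2)


# Two large velocities still combine to something below the speed of light.
print(add_velocities(0.9, 0.9))

# For small velocities the result is close to the Newtonian sum v1 + v2.
print(add_velocities(1e-4, 2e-4))

# Composing v with -v gives zero relative velocity (the inverse transformation).
print(add_velocities(0.3, -0.3))
```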
5.2 Lorentz contraction
Consider a spaceship of length ℓ. The observer on the spaceship has four-velocity ˆu.
Another observer has four-velocity ˆv. Both measure the length of the spaceship from their
perspective, getting the answers ℓ and ℓ′ respectively. The setup is illustrated in figure 5.2.
From the figure we see that
$$\ell'\hat{s} - \ell\hat{r} \propto \hat{u} \tag{5.11}$$
and taking the inner product with ˆr gives
$$\ell'\,\hat{r}\cdot\hat{s} - \ell = 0 . \tag{5.12}$$
Using the Lorentz transformation ˆs = γ(ˆr + vˆu) we find ˆr · ˆs = γ so that
$$\ell'/\ell = 1/\gamma = \sqrt{1-v^2} \le 1 , \tag{5.13}$$
where v is the relative velocity of the two observers. This is the famous Lorentz contraction.
A moving object appears shortened, or contracted, in its direction of motion (relative to
the observer). This eﬀect is similar to slicing a sausage in that the length of the slice
depends on the cutting angle. In that case the perpendicular slice has the shortest length,
whereas in the spaceship case the observer at rest measures the longest length. Note that
in this respect the Euclidean ﬁgure 5.2 is very misleading.
[Figure 5.2: diagram omitted; labels L, L1, L2, ˆv, ˆu, ˆs, ˆr.]
Figure 5.2: A spaceship stretching between L1 and L2, with length ℓ, is observed by the
observer following the worldline L who measures its length to be ℓ′.
Chapter 6
Waves
Sound waves are familiar from Newtonian physics. The pressure P(¯r, t) varies in space and
with time around the mean pressure P0. For a plane sound wave the variation, p = P −P0,
has the form
p = A sin(−ωt + ¯k · ¯r + φ0) . (6.1)
The constants A and φ0 are the amplitude and phase shift of the wave, while ω is the
angular frequency and the three-vector ¯k is the wave vector. We may assume that ω > 0.
The surfaces where p takes a constant value, e.g. its maximum A, are 2-dimensional planes
in our 3-dimensional space, see ﬁgure 6.1, which is the reason for the name plane waves.
Sound waves are just one example. Many other types of waves exist in nature that can be
described in the same way.
[Figure 6.1: diagram omitted; axes x, y and wave vector ¯k.]
Figure 6.1: Plane wave in two space dimensions. The wave fronts, surfaces of constant
phase at say t = 0, given by ¯k · ¯r = const, are straight parallel lines orthogonal to the wave
vector ¯k. In three dimensions they are planes.
Consider now a plane wave in spacetime given by
ψ = A sin( ¯K · ¯R + φ0) , (6.2)
where now ¯K is the wave four-vector and ¯R is the four-vector of a spacetime point (event).
To see that this expression makes sense we introduce an observer with four-velocity ˆu. We
can then decompose the four-vectors involved with respect to this observer as
¯R = tˆu + ¯r , ¯r · ˆu = 0 , (6.3)
¯K = ωˆu + ¯k , ¯k · ˆu = 0 . (6.4)
Using these expressions the wave takes the form
ψ = A sin(−ωt + ¯k · ¯r + φ0) , (6.5)
which is the same as before, with ω the angular frequency and ¯k the wave vector. Note
that ω and ¯k depend on the observer, since they are deﬁned using his four-velocity. We
have for example
ωu = −ˆu · ¯K , (6.6)
the angular frequency of the wave as measured by the observer u.
The superposition principle says that we can add together waves of the form (6.2) to
form new waves. For physical waves the angular frequency is determined by the wave vector
ω = ω(¯k). The relation between ω and ¯k is called the dispersion relation and depends on
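the type of wave. Let us consider the sum of two one-dimensional plane waves with similar
wave numbers $k_1 = (1 - \epsilon)k_0$ and $k_2 = (1 + \epsilon)k_0$ with $\epsilon \ll 1$ and φ0 = 0,
$$\psi = \sin(-\omega_1 t + k_1 x) + \sin(-\omega_2 t + k_2 x) . \tag{6.7}$$
We have
$$-\omega_1 t + k_1 x = -\omega(k_1)t + k_1 x \approx -\omega(k_0)t + k_0 x + \epsilon k_0\left(\frac{d\omega}{dk}t - x\right) , \tag{6.8}$$
where we have Taylor expanded the function ω to first order in ε. For k2 we find the same
expression but with the sign of ε changed. Therefore, using the formula for sin of a sum of
angles and setting $\Delta = k_0\left(\frac{d\omega}{dk}t - x\right)$, we have
$$\begin{aligned}
\psi &= \sin(-\omega_0 t + k_0 x + \epsilon\Delta) + \sin(-\omega_0 t + k_0 x - \epsilon\Delta) \\
&= \sin(-\omega_0 t + k_0 x)\cos(\epsilon\Delta) + \cos(-\omega_0 t + k_0 x)\sin(\epsilon\Delta) \\
&\quad + \sin(-\omega_0 t + k_0 x)\cos(-\epsilon\Delta) + \cos(-\omega_0 t + k_0 x)\sin(-\epsilon\Delta) \\
&= 2\cos\!\left(\epsilon k_0\left(\frac{d\omega}{dk}t - x\right)\right)\sin(-\omega_0 t + k_0 x) .
\end{aligned} \tag{6.9}$$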
This is a plane wave with wave number k0 and frequency ω0 = ω(k0) but with an amplitude
which is modulated by the cos factor. The cosine factor describes the envelope of the wave
packet. The envelope of the wave packet depends on the position through $x - \frac{d\omega}{dk}t$ and
therefore travels with the group velocity
$$v = \frac{d\omega}{dk} . \tag{6.10}$$
This is the velocity of signals sent with such waves. For light we should clearly have v = 1
and indeed light (electromagnetic radiation) has the dispersion relation
$$\omega(k) = k = \sqrt{\bar{k}^2} . \tag{6.11}$$
Equivalently, we have for light that
$$\bar{K}^2 = (\omega\hat{u} + \bar{k})^2 = -\omega^2 + \bar{k}^2 = 0 , \tag{6.12}$$
i.e. light rays have a wave four-vector which is null. This means that for light (electromagnetic
radiation) we can write the wave four-vector as
$$\bar{K} = \omega(\hat{u} + \hat{k}) , \qquad \hat{k}\cdot\hat{u} = 0 , \tag{6.13}$$
where ˆk is a unit vector describing the direction of the wave.
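The wave-packet argument leading to the group velocity can be checked numerically. The sketch below compares the exact sum of two plane waves, eq. (6.7), with the modulated form of eq. (6.9). The dispersion relation ω = √(k² + m²) is only an illustrative assumption (it reduces to the light dispersion relation ω = k for m = 0), and all function names are hypothetical.

```python
import math


def omega(k, m=1.0):
    """Illustrative dispersion relation (an assumption): omega = sqrt(k^2 + m^2)."""
    return math.sqrt(k * k + m * m)


def group_velocity(k, m=1.0, h=1e-6):
    """Central-difference estimate of d(omega)/dk at wave number k."""
    return (omega(k + h, m) - omega(k - h, m)) / (2.0 * h)


def two_wave_sum(t, x, k0, eps, m=1.0):
    """Exact sum of the two plane waves in eq. (6.7)."""
    k1, k2 = (1.0 - eps) * k0, (1.0 + eps) * k0
    return (math.sin(-omega(k1, m) * t + k1 * x)
            + math.sin(-omega(k2, m) * t + k2 * x))


def envelope_form(t, x, k0, eps, m=1.0):
    """First-order approximation, eq. (6.9): carrier wave times cosine envelope."""
    vg = group_velocity(k0, m)
    envelope = 2.0 * math.cos(eps * k0 * (vg * t - x))
    return envelope * math.sin(-omega(k0, m) * t + k0 * x)


print(two_wave_sum(3.0, 1.5, 2.0, 1e-3), envelope_form(3.0, 1.5, 2.0, 1e-3))
print(group_velocity(2.0, m=0.0))  # light-like dispersion: group velocity close to 1
```

The two printed wave values agree up to terms of second order in ε, which is the accuracy of the Taylor expansion used in the text.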
6.1 Doppler shift and aberration
Consider two observers with four-velocities ˆu and ˆv observing a light wave ($\bar{K}^2 = 0$).
Splitting vectors with respect to the ﬁrst observer we have
ˆv = γ(ˆu + ¯v) ¯v · ˆu = 0 , (6.14)
the standard velocity split, and
¯K = ωu(ˆu + ˆku) ˆku · ˆu = 0 . (6.15)
We have put a subscript u to emphasize that ωu and ˆku are the quantities measured by
observer u. Since ¯v and ˆku are in the orthogonal space to ˆu they can be treated as ordinary
Euclidean vectors and we can write
¯v · ˆku = v cos θu , (6.16)
where θu is the angle between the relative velocity ¯v of observers u and v and the direction
the light is traveling, given by ˆku, as measured by observer u.
The angular frequency measured by observer v becomes
ωv = −ˆv · ¯K = −ωuγ(−1 + ¯v · ˆku) = ωuγ(1 − v cos θu) , (6.17)
so that
$$\frac{\omega_v}{\omega_u} = \frac{1 - v\cos\theta_u}{\sqrt{1-v^2}} . \tag{6.18}$$
This is the formula for the Doppler shift. It says how the ratio of the frequencies measured
by two observers depends on their relative velocity and the angle of the velocity to that
of the light. To see the physical consequences of this formula we take the observer u to be
at rest with respect to the source emitting the light (radiation). Let us consider the two
special cases when v is moving directly towards the source or directly away from it. We
have
Towards source: $\cos\theta_u = -1$, $\frac{\omega_v}{\omega_u} = \sqrt{\frac{1+v}{1-v}} > 1$ (“Blue shift”)
Away from source: $\cos\theta_u = +1$, $\frac{\omega_v}{\omega_u} = \sqrt{\frac{1-v}{1+v}} < 1$ (“Red shift”)
An observer moving towards the source sees a higher frequency, the light is blue shifted,
while an observer moving away sees a lower frequency, the light is red shifted. The Doppler
eﬀect is very important in astronomy where it is used for example to measure the velocity
of stars and galaxies relative to us by looking at the shifts of their spectral lines.
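The Doppler formula (6.18) and its two special cases can be evaluated directly. This is a minimal sketch with a hypothetical function name, in units with c = 1.

```python
import math


def doppler_ratio(v, cos_theta_u):
    """Frequency ratio omega_v / omega_u from eq. (6.18).

    v           : magnitude of the relative velocity of observer v (0 <= v < 1)
    cos_theta_u : cosine of the angle between that velocity and the direction
                  in which the light travels, as measured by observer u
    """
    return (1.0 - v * cos_theta_u) / math.sqrt(1.0 - v * v)


v = 0.6
print("towards source:", doppler_ratio(v, -1.0), math.sqrt((1 + v) / (1 - v)))
print("away from source:", doppler_ratio(v, +1.0), math.sqrt((1 - v) / (1 + v)))
print("transverse:", doppler_ratio(v, 0.0))
```

The transverse case (cos θu = 0) gives the ratio γ, a purely relativistic effect with no Newtonian counterpart.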
We could have used the observer v instead of u and we would have found instead the
formula
$$\frac{\omega_u}{\omega_v} = \frac{1 + v\cos\theta_v}{\sqrt{1-v^2}} . \tag{6.19}$$
Note the plus sign in the numerator which is due to the fact that the relative velocity
according to v has opposite sign compared to what u measures. Multiplying this with the
previous formula for the Doppler shift we ﬁnd
$$(1 + v\cos\theta_v)(1 - v\cos\theta_u) = 1 - v^2 . \tag{6.20}$$
This is the formula for aberration derived by Einstein in his 1905 paper. It can also be
written as
$$\cos\theta_v = \frac{\cos\theta_u - v}{1 - v\cos\theta_u} . \tag{6.21}$$
To see its physical consequences we take again u at rest with respect to the source and
v traveling towards the source, but now we let them observe the light at a small angle,
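θu = π − δu, θv = π − δv with δu, δv ≪ 1. Using the fact that cos(π − x) = − cos(x) and
that for small x, $\cos(x) \approx 1 - \frac{1}{2}x^2$, we find
$$1 - \frac{1}{2}\delta_v^2 \approx \frac{1 + v - \frac{1}{2}\delta_u^2}{1 + v - \frac{1}{2}v\,\delta_u^2} \approx 1 - \frac{1}{2}\,\frac{1-v}{1+v}\,\delta_u^2 , \tag{6.22}$$
so that
$$\delta_v \approx \sqrt{\frac{1-v}{1+v}}\;\delta_u \le \delta_u . \tag{6.23}$$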
We conclude that the observer traveling towards the source of the light measures a smaller
angle. Similarly, by changing the sign of v, we learn that an observer traveling away from
the light source measures a larger angle by a factor $\sqrt{\frac{1+v}{1-v}} \ge 1$. This means that light is
concentrated in the direction of motion and an observer traveling towards a star sees it as
smaller and brighter, while an observer traveling away sees it as larger and fainter. This
is sometimes referred to as the “headlight” eﬀect. It again has important consequences in
astronomy, for example when observing relativistic jets of plasma from compact objects
accreting matter. These appear brighter when directed towards the earth and fainter when
directed away from the earth.
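The aberration formula (6.21) and its small-angle limit (6.23) can be compared numerically. The sketch below uses hypothetical function names and assumes the same setup as in the text: u at rest with respect to the source and v moving towards it, so that θu = π − δu.

```python
import math


def aberration_cos(v, cos_theta_u):
    """cos(theta_v) from cos(theta_u), eq. (6.21)."""
    return (cos_theta_u - v) / (1.0 - v * cos_theta_u)


def small_angle_factor(v):
    """Ratio delta_v / delta_u in the small-angle limit, eq. (6.23)."""
    return math.sqrt((1.0 - v) / (1.0 + v))


v = 0.6
delta_u = 0.01
cos_u = math.cos(math.pi - delta_u)
cos_v = aberration_cos(v, cos_u)
delta_v = math.pi - math.acos(cos_v)

print("exact ratio delta_v / delta_u:", delta_v / delta_u)
print("small-angle prediction:", small_angle_factor(v))

# Consistency with eq. (6.20): the product should equal 1 - v^2.
print((1.0 + v * cos_v) * (1.0 - v * cos_u), 1.0 - v * v)
```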
Chapter 7
Particle kinematics
An important application of special relativity is to processes involving elementary particles,
e.g. in particle accelerators, where they can often reach velocities close to the speed of light.
In order to discuss what happens in particle collisions we ﬁrst need to introduce the notion
of four-momentum.
7.1 Four-momentum
Recall the split of a four-velocity ˆv with respect to another one ˆu
$$\hat{v} = \gamma(\hat{u} + \bar{v}) , \qquad \hat{u}\cdot\bar{v} = 0 , \qquad \gamma = \frac{1}{\sqrt{1-v^2}} . \tag{7.1}$$
If v is an object and u an observer ¯v is the velocity of the object relative to the observer.
In the Newtonian limit v ≪ 1, i.e. for velocities much smaller than the speed of light, we
have, dropping terms of order v³ and higher,
$$\hat{v} \approx \left(1 + \frac{v^2}{2}\right)\hat{u} + \bar{v} . \tag{7.2}$$
To relate this to something more familiar let us multiply by the mass of the object m,
$$m\hat{v} \approx \left(m + \frac{mv^2}{2}\right)\hat{u} + m\bar{v} . \tag{7.3}$$
We recognize the kinetic energy of the object $\frac{1}{2}mv^2$ and its momentum $m\bar{v}$. Note that every
object has a well-deﬁned mass (just consider an observer traveling along with the object,
i.e. whose ˆu is (nearly) parallel to ˆv, who can deﬁne the mass in the usual Newtonian way).
We deﬁne the four-momentum of an object to be its mass times its four-velocity
¯P = mˆv . (7.4)
By construction it satisﬁes
$$\bar{P}^2 = -m^2 . \tag{7.5}$$
The split of ˆv gives a corresponding split of the four-momentum
¯P = Eˆu + ¯p ˆu · ¯p = 0 , (7.6)
with
$$E = m\gamma = \frac{m}{\sqrt{1-v^2}} \approx m + \frac{mv^2}{2} , \qquad \bar{p} = m\bar{v}\gamma = \frac{m\bar{v}}{\sqrt{1-v^2}} \approx m\bar{v} . \tag{7.7}$$
In analogy with the Newtonian case we call E the energy and ¯p the momentum (or three-momentum)
of the object. Note that they are not intrinsic properties of the object but
depend on the observer.
The four-momentum ¯P = mˆv is time-like since ˆv is time-like. Furthermore, since m > 0,
it is also future directed. The energy measured by observer u can be written
E = −ˆu · ¯P (7.8)
and we see that it is always positive since both ˆu and ¯P are time-like and future directed
so their inner product is negative.
The usefulness of the notion of momentum in Newtonian mechanics comes from the
fact that it is conserved (in the absence of external forces): The sum of all momenta before
a collision is equal to the sum of momenta after the collision. This is actually a special
case of the relativistic conservation of four-momentum
$$\sum_{\text{before}} \bar{P}_i = \sum_{\text{after}} \bar{P}_j . \tag{7.9}$$
It says that there are four conserved quantities: The total energy and the three components
of the momentum.
The fact that four-momentum is conserved is intimately tied to symmetries in nature.
In fact, you will learn later in your studies that the conservation of momentum is equivalent
to the statement that the laws of physics look the same regardless of position in space. The
laws of physics are the same here as on the moon or in the Andromeda galaxy. Similarly
the conservation of energy is equivalent to the fact that the laws of physics look the same
at all times. Just like the theory of relativity combines space and time into a spacetime it
combines the notions of momentum and energy into the single concept of four-momentum.
Note that in the Newtonian limit the total energy becomes
$$E \approx \sum_i \left(m_i + \frac{m_i\bar{v}_i^2}{2}\right) . \tag{7.10}$$
But we know that the kinetic energy is not conserved in general, it is only conserved in
elastic collisions. This means that the ﬁrst term must change to keep the total energy
conserved, i.e. the sum of the masses before and after the collision will in general diﬀer,
contrary to what is assumed in Newtonian mechanics. In inelastic collisions the kinetic
energy decreases, so the total mass must increase. Of course in everyday situations v ≪ 1
so the kinetic energy is much smaller than the total mass, $\frac{1}{2}m\bar{v}^2 \ll m$.
This fact, that mass is also a form of energy, is the content of the most famous equation
in physics
$$E = mc^2 . \tag{7.11}$$
Remember that here we are using units where the speed of light is unity, c = 1. Note also
that E = m is true only for an observer at rest with respect to the object in question. It
is often referred to as the rest energy.
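The relations of this section can be illustrated with a short numerical sketch. It evaluates the energy and momentum of eq. (7.7) for motion along a single spatial direction, checks E² − p² = m² (equivalent to eq. (7.5)), and compares E − m with the Newtonian kinetic energy at low speed. The function names are hypothetical and units with c = 1 are assumed.

```python
import math


def energy_momentum(m, v):
    """Energy E and momentum p of a mass m moving with velocity v, eq. (7.7)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return m * g, m * v * g


def newtonian_kinetic_energy(m, v):
    """The familiar (1/2) m v^2."""
    return 0.5 * m * v * v


E, p = energy_momentum(1.0, 0.6)
print(E, p, E * E - p * p)   # the last number should equal m^2

# At low speed, E - m approaches the Newtonian kinetic energy.
E_slow, _ = energy_momentum(2.0, 1e-3)
print(E_slow - 2.0, newtonian_kinetic_energy(2.0, 1e-3))

# For an observer at rest with respect to the object, E = m (the rest energy).
print(energy_momentum(2.0, 0.0))
```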
7.2 Massless particles
Knowing all but one four-momentum of the particles in a given reaction we can use the
conservation of four-momentum to determine the remaining one. In this way one can
experimentally establish the existence of particles with null four-momentum
$$\bar{P}^2 = 0 . \tag{7.12}$$
Since we may define the mass through $\bar{P}^2 = -m^2$ we call such particles massless. The most
important example of such a particle is the photon, the quantum of light (more generally
electromagnetic radiation). The four-momentum and energy of such a particle do not
vanish (if they did we would not be able to detect it), and therefore we conclude from the
expression
$$E = \frac{m}{\sqrt{1-v^2}} \tag{7.13}$$
that such a particle must have v = 1, i.e. travel at the speed of light. It therefore follows a
null line with direction proportional to the null vector ¯P. Note that for a massless particle
we have
¯P = Eˆu + ¯p = E(ˆu + ˆp) , (7.14)
since it is null.
7.3 Tachyons
Sometimes one hears about particles with space-like four-momentum
$$\bar{P}^2 > 0 . \tag{7.15}$$
Such particles are referred to as tachyons. They would have an imaginary mass from
$\bar{P}^2 = -m^2$ and always travel along space-like lines. This turns out not to be consistent
with the laws of quantum mechanics and such particles therefore cannot exist in nature.
Instead in our current best theories particles emerge as “ripples” of a ﬁeld with a
quantum of energy. In such quantum field theories it is not unusual to have $m^2 < 0$.
However, the resulting particles do not travel faster than light. Instead such an imaginary
mass signals an instability of the vacuum. In fact, this is part of the mechanism by which
particles can acquire mass through the so-called Higgs mechanism, related to the famous
Higgs boson discovered at CERN.
7.4 Particle reactions and kinematics
The two most important types of particle reactions are
• Decay: One particle goes into two or more
• Collision: Two particles go into one or more
The allowed configurations of four-momenta are constrained by
1. $\bar{P}_i^2 = -m_i^2$ for all particles involved
2. Conservation of four-momentum $\sum_i \bar{P}_{i,\text{in}} = \sum_j \bar{P}_{j,\text{out}}$
These equations form the basis of particle kinematics. It is often convenient to refer all
quantities to some particular observer, say with four-velocity ˆu. Then we have
¯Pi = Ei ˆu + ¯pi ˆu · ¯pi = 0 . (7.16)
The inner product of the four-momenta of two particles becomes
¯Pi · ¯Pj = −EiEj + ¯pi · ¯pj , (7.17)
and in the special case i = j we ﬁnd
$$m_i^2 = -\bar{P}_i^2 = E_i^2 - \bar{p}_i^2 \quad\Rightarrow\quad E_i = \sqrt{m_i^2 + \bar{p}_i^2} , \tag{7.18}$$
expressing the energy in terms of the momentum.
A typical kinematical calculation involves two steps:
1. Take the inner product of the conservation of four-momentum $\sum_i \bar{P}_{i,\text{in}} = \sum_j \bar{P}_{j,\text{out}}$
with some ¯Pi, or rearrange it and take the square.
2. Replace $\bar{P}_i^2$ by $-m_i^2$ everywhere and inner products using (7.17).
[Figure 7.1: diagram omitted; labels θ, ¯q, ¯p = 0, ¯p′, ¯q′.]
Figure 7.1: Scattering of an electron, with momentum ¯q, oﬀ a proton at rest (¯p = 0) in the
observer’s orthogonal space. The recoil angle of the proton is θ.
Example
Consider an electron (e⁻) scattering off a proton (p⁺). We set $m_{e^-} = m$ and $m_{p^+} = M$
and let ¯Q, ¯P be the initial four-momenta and ¯Q′, ¯P′ the final ones. We have
$$\bar{P}^2 = \bar{P}'^2 = -M^2 , \qquad \bar{Q}^2 = \bar{Q}'^2 = -m^2 . \tag{7.19}$$
The conservation of four-momentum reads
$$\bar{P} + \bar{Q} = \bar{P}' + \bar{Q}' . \tag{7.20}$$
Let’s say we are interested in the situation where the proton is at rest before the collision
and we want to ﬁnd the recoil angle of the proton. The situation is illustrated in the
observer’s orthogonal space in ﬁgure 7.1. When there is one particle which we do not
know anything about, in this case the outgoing electron with four-momentum ¯Q′, it is
often useful to rearrange the conservation of four-momentum so that its four-momentum
appears alone on one side of the equation and then take the square. We therefore write
the conservation of four-momentum as
$$\bar{P} + \bar{Q} - \bar{P}' = \bar{Q}' \tag{7.21}$$
and squaring this equation we find
$$\bar{P}^2 + \bar{Q}^2 + \bar{P}'^2 + 2\bar{P}\cdot\bar{Q} - 2\bar{P}\cdot\bar{P}' - 2\bar{Q}\cdot\bar{P}' = \bar{Q}'^2 . \tag{7.22}$$
Using the fact that $\bar{P}^2 = \bar{P}'^2 = -M^2$ and $\bar{Q}^2 = \bar{Q}'^2 = -m^2$ this becomes
$$0 = -M^2 + \bar{P}\cdot\bar{Q} - (\bar{P} + \bar{Q})\cdot\bar{P}' = (\bar{P} + \bar{Q})\cdot(\bar{P} - \bar{P}') . \tag{7.23}$$
Note that ¯Q′ has dropped out completely. This is a useful equation for two-particle elastic
collisions (same masses coming in and going out). Since we are assuming the proton is
initially at rest with respect to the observer we have
$$\bar{P} = M\hat{u} , \qquad \bar{Q} = E\hat{u} + \bar{q} , \qquad \bar{P}' = E'\hat{u} + \bar{p}' , \tag{7.24}$$
where ˆu is the observer’s four-velocity. Using this we find
$$0 = (\bar{P} + \bar{Q})\cdot(\bar{P} - \bar{P}') = ((M+E)\hat{u} + \bar{q})\cdot((M-E')\hat{u} - \bar{p}') = -(M+E)(M-E') - \bar{q}\cdot\bar{p}' . \tag{7.25}$$
Since ¯q and ¯p′ are in the orthogonal space to ˆu they can be treated as ordinary Euclidean
vectors and we can write
$$\bar{q}\cdot\bar{p}' = qp'\cos\theta , \tag{7.26}$$
where θ is the recoil angle we are after, see figure 7.1. Noting that
$$-m^2 = \bar{Q}^2 = -E^2 + q^2 \quad\Rightarrow\quad q = \sqrt{E^2 - m^2} \tag{7.27}$$
and similarly $p' = \sqrt{E'^2 - M^2}$, we get
$$\cos\theta = \frac{\bar{q}\cdot\bar{p}'}{qp'} = \frac{(E+M)(E'-M)}{\sqrt{E^2-m^2}\,\sqrt{E'^2-M^2}} = \frac{E+M}{\sqrt{E^2-m^2}}\sqrt{\frac{E'-M}{E'+M}} . \tag{7.28}$$
This expresses the recoil angle of the proton in terms of the initial energy of the electron
E and the final energy of the proton E′.
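The result (7.28) can be cross-checked numerically. The sketch below is one possible check, not the method of the derivation above: it builds an explicit elastic collision by rotating the proton momentum in the frame where the total spatial momentum vanishes, transforms back with eq. (5.8), and compares the recoil angle computed directly from ¯p′ with eq. (7.28). The function name, the scattering angle `chi` in that frame, and the sample numbers (masses and energy in MeV) are illustrative assumptions.

```python
import math


def recoil_cosines(m, M, E, chi):
    """Return (cos(theta) computed directly, cos(theta) from eq. (7.28)).

    m, M : electron and proton masses
    E    : initial electron energy, with the proton initially at rest
    chi  : scattering angle in the zero-total-momentum frame (must be nonzero,
           otherwise the proton does not recoil and theta is undefined)
    """
    q = math.sqrt(E * E - m * m)            # initial electron momentum, eq. (7.27)
    beta = q / (E + M)                      # velocity of the zero-momentum frame
    g = 1.0 / math.sqrt(1.0 - beta * beta)

    # Proton in the zero-momentum frame: energy g*M, momentum g*beta*M along -x.
    E_star = g * M
    p_star = g * beta * M
    # Elastic scattering: rotate the momentum by chi, keep its magnitude.
    px_star = -p_star * math.cos(chi)
    py_star = p_star * math.sin(chi)

    # Transform back to the original observer (same structure as eq. (5.8)).
    E_out = g * (E_star + beta * px_star)
    px = g * (px_star + beta * E_star)
    py = py_star

    cos_direct = px / math.hypot(px, py)    # angle between q (along x) and p'
    cos_formula = (E + M) / q * math.sqrt((E_out - M) / (E_out + M))
    return cos_direct, cos_formula


print(recoil_cosines(0.511, 938.272, 500.0, 1.0))
```

Both numbers should agree to rounding error for any nonzero `chi`, since the outgoing proton energy E′ is the only final-state quantity entering eq. (7.28).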
7.5 Center-of-mass observer
For more complicated processes it is often useful to group all (or part) of the out-going
particles together. Their total four-momentum is
$$\bar{P} = \sum_i \bar{P}_i . \tag{7.29}$$
Then the “mass” defined by $\bar{P}^2 = -M^2$ does not have a fixed value. It depends on the
relative motion of the particles. It does have a lower bound however. Since a sum of
time-like future directed vectors is again time-like and future directed we can consider an
observer with four-velocity ˆu = ¯P/M. Then we ﬁnd
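$$\hat{u} = \frac{1}{M}\bar{P} = \frac{1}{M}\sum_i \bar{P}_i = \frac{1}{M}\sum_i E_i\,\hat{u} + \frac{1}{M}\sum_i \bar{p}_i . \tag{7.30}$$
We see that the last term must vanish, which means that this observer sees the total spatial
momentum of the particles being zero, $\sum_i \bar{p}_i = 0$. Using this fact we find
$$-M^2 = \bar{P}^2 = -\Big(\sum_i E_i\Big)^2 = -\Big(\sum_i \sqrt{m_i^2 + p_i^2}\Big)^2 . \tag{7.31}$$
Therefore
$$M^2 = \Big(\sum_i \sqrt{m_i^2 + p_i^2}\Big)^2 \ge \Big(\sum_i m_i\Big)^2 , \tag{7.32}$$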
with equality occurring only if ¯pi = 0 for all i, i.e. if all particles are at rest with respect
to each other. This inequality leads to so-called threshold conditions for when a certain set
of particles can be produced. Note that there is no reference to any observer in the last
equation – the result is observer independent. Nevertheless it was convenient to introduce
an observer in the derivation of the result and this turns out to often be the case.
An observer, such as the one considered above, whose four-velocity is parallel to the
total four-momentum $\sum_i \bar{P}_i$ is called a center-of-mass observer (sometimes center-of-mass
frame or center-of-mass system), since such an observer sees the particles having zero total
three-momentum, $\sum_i \bar{p}_i = 0$. The introduction of such an observer can often simplify the
calculations. Note that using Lorentz transformations we can of course translate from one
observer to another at the end.
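The inequality (7.32) can be illustrated numerically for an arbitrary observer, since M is observer independent. The sketch below computes M from a list of four-momenta given as components (E, px, py, pz) relative to some observer. The function names are hypothetical helpers and units with c = 1 are assumed.

```python
import math


def on_shell(m, px, py, pz):
    """Four-momentum components (E, px, py, pz) with E from eq. (7.18)."""
    return (math.sqrt(m * m + px * px + py * py + pz * pz), px, py, pz)


def invariant_mass(momenta):
    """M defined by (sum of four-momenta)^2 = -M^2."""
    E = sum(p[0] for p in momenta)
    px = sum(p[1] for p in momenta)
    py = sum(p[2] for p in momenta)
    pz = sum(p[3] for p in momenta)
    return math.sqrt(E * E - px * px - py * py - pz * pz)


# Two particles of mass 1 and 2 in relative motion: M exceeds 1 + 2.
moving = [on_shell(1.0, 3.0, 0.0, 0.0), on_shell(2.0, -1.0, 2.0, 0.0)]
print(invariant_mass(moving))

# The same two masses with a common velocity 0.6 along x: M equals 1 + 2.
comoving = [on_shell(1.0, 0.75, 0.0, 0.0), on_shell(2.0, 1.5, 0.0, 0.0)]
print(invariant_mass(comoving))
```

The second case shows that equality holds when the particles are at rest relative to each other, even though they are not at rest relative to the observer used for the components.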
Chapter 8
Curved worldlines and acceleration
So far we have dealt almost exclusively with straight worldlines. Unaccelerated spaceships
and particles not inﬂuenced by external forces travel along such worldlines. Conversely,
spaceships that run their engines or particles inﬂuenced by external forces follow curved
worldlines. While it is possible to treat accelerated observers in special relativity it is usually
avoided for practical reasons. We will therefore continue to assume that all observers
are unaccelerated, unless otherwise stated.
The natural parameter along a (massive) particle’s worldline is the proper time τ. The
spacetime position of the particle is given as a function of τ
¯R = ¯R(τ) . (8.1)
We can calculate the rate of change of the position vector with τ by taking the derivative
$$\frac{d\bar{R}}{d\tau} = \lim_{\Delta\tau\to 0}\frac{\bar{R}(\tau + \Delta\tau) - \bar{R}(\tau)}{\Delta\tau} . \tag{8.2}$$
Since τ is the proper time along the worldline, which measures its length, we have for an
inﬁnitesimal piece of the worldline
$$d\tau^2 = -(d\bar{R})^2 . \tag{8.3}$$
It follows that
$$\hat{v} = \frac{d\bar{R}}{d\tau} \tag{8.4}$$
is a time-like unit vector, $\hat{v}^2 = -1$, which is tangent to the worldline at the event ¯R(τ).
For a straight worldline d¯R/dτ = constant and ˆv is the four-velocity. We will continue to
call it the four-velocity also for curved worldlines, though in that case it is not constant.
The four-acceleration is deﬁned as the rate of change of the four-velocity
$$\bar{A} = \frac{d\hat{v}}{d\tau} = \frac{d^2\bar{R}}{d\tau^2} . \tag{8.5}$$
Differentiating the condition $\hat{v}^2 = -1$ we learn that
ˆv · ¯A = 0 . (8.6)
This implies that ¯A is space-like whenever it is non-zero. Note that ¯A is not, in general, a unit vector. In particular
it vanishes for a straight worldline.
Let us now analyze how an unaccelerated observer views an accelerated worldline.
Consider an observer whose (straight) worldline is tangent to the curved worldline at a
certain point ¯R(τ0), as depicted in figure 8.1.
[Figure 8.1: diagram omitted; labels ¯R(τ0), ¯a0, ¯R(τ), ¯r, tˆu.]
Figure 8.1: An observer’s worldline is tangent to the curved worldline of an accelerated
object at the point ¯R(τ0).
Splitting ¯R with respect to the four-velocity ˆu of the observer we have
¯R(τ) = tˆu + ¯r , ˆu · ¯r = 0 . (8.7)
Considering a small interval along the curve this gives
d ¯R = dtˆu + d¯r (8.8)
and squaring this we find
$$-d\tau^2 = -dt^2 + d\bar{r}^2 \tag{8.9}$$
or
$$d\tau = dt\sqrt{1-v^2} , \qquad \bar{v} = \frac{d\bar{r}}{dt} . \tag{8.10}$$
Note that ¯v is the relative velocity and we can also write dτ = dt/γ. This gives the
four-velocity of the object as
$$\hat{v} = \frac{d\bar{R}}{d\tau} = \gamma\frac{d\bar{R}}{dt} = \gamma(\hat{u} + \bar{v}) . \tag{8.11}$$
This is the same velocity split formula we found before when expressing a constant four-velocity
ˆv in terms of another one ˆu; here, however, ˆv and ¯v are not constant. At the point
¯R(τ0) we have ˆv = ˆu since the worldlines are tangent so the relative velocity ¯v = 0 at this
point. For the four-acceleration we ﬁnd
$$\bar{A}(\tau_0) = \left(\frac{d\hat{v}}{d\tau}\right)_0 = \left(\frac{d\hat{v}}{dt}\right)_0 = (\bar{a})_0 , \tag{8.12}$$
where ¯a = ∂¯v/∂t is the ordinary Newtonian acceleration measured by the observer momentarily
at rest with respect to the accelerated object. If this object happens to be a
spaceship the vector ¯A at a certain point of its worldline is the acceleration experienced by
the crew at that instant. This gives the physical interpretation of the four-acceleration.
8.1 Special case: Constant acceleration
Consider a 2-plane in spacetime, which could describe for example the history of a spaceship
traveling in a straight line as seen by an observer. In two dimensions we need only one
condition to determine a curve and we take
$$\bar{R}^2 = R^2 , \tag{8.13}$$
with ¯R the spacetime position and R a constant. Taking the derivative we ﬁnd
¯R · ˆv = 0 (8.14)
and another derivative gives
¯R · ¯A = 1 . (8.15)
The vectors ¯R, ˆv and ¯A lie in a 2-plane and since both ¯R and ¯A are orthogonal to ˆv they
must be parallel, so that using the equation above we ﬁnd
$$\bar{A} = \frac{\bar{R}}{R^2} \quad\Rightarrow\quad \bar{A}^2 = \frac{1}{R^2} , \tag{8.16}$$
i.e. the four-acceleration of this worldline is constant (in magnitude). This could for
example be the worldline of a spaceship running its engines so that the crew experience a
constant acceleration.
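The relations (8.13) to (8.16) can be checked numerically. The sketch below assumes an explicit parametrization of the curve by proper time, ¯R(τ) = R (sinh(τ/R), cosh(τ/R)) in (t, x) components with c = 1; this parametrization and the function names are not taken from the text above and are introduced here only for illustration.

```python
import math

def dot(p, q):
    """Minkowski product in the (t, x) plane, signature (-, +), c = 1."""
    return -p[0] * q[0] + p[1] * q[1]

def worldline(tau, R):
    """Assumed parametrization of the curve (8.13) by proper time tau."""
    return (R * math.sinh(tau / R), R * math.cosh(tau / R))

def four_velocity(tau, R):
    """First derivative of worldline() with respect to tau."""
    return (math.cosh(tau / R), math.sinh(tau / R))

def four_acceleration(tau, R):
    """Second derivative of worldline() with respect to tau."""
    return (math.sinh(tau / R) / R, math.cosh(tau / R) / R)

R = 2.0  # illustrative value
for tau in (-1.0, 0.0, 0.7, 3.0):
    X = worldline(tau, R)
    v = four_velocity(tau, R)
    A = four_acceleration(tau, R)
    # Columns: tau, X.X (= R**2), v.v (= -1), X.v (= 0), X.A (= 1), A.A (= 1/R**2)
    print(tau, dot(X, X), dot(v, v), dot(X, v), dot(X, A), dot(A, A))
```

Up to rounding, every row gives the same values, which reflects the statement that the four-acceleration has constant magnitude along the curve.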
Let $\bar R^2 = R_1^2$ and $\bar R^2 = R_2^2$ with $R_2 > R_1$ be two such worldlines in the same 2-plane
and with the same origin. From (8.14) we see that any line through the origin cuts the
curves orthogonally. Therefore R2 − R1 is the orthogonal distance between the curves.
For example, one of the lines could be the worldline of the front of a spaceship and the
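other the worldline of its tail. The ship’s length is constant, equal to R2 − R1, but the
accelerations of the front and the tail are unequal. Taking the front to be the outer curve $\bar R^2 = R_2^2$, (8.16) gives
acceleration magnitudes $1/R_2$ for the front and $1/R_1$ for the tail (the squares $\bar A^2$ being $1/R_2^2$ and $1/R_1^2$), so the tail accelerates harder than the front.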
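To get a feeling for the size of this effect, here is a rough sketch in SI units. It uses the magnitude |¯A| = 1/R that follows from (8.16), restores the factors of c so that R = c²/a, and takes the front to be the worldline with the larger R. The function names and the numerical values are illustrative assumptions.

```python
C = 299_792_458.0  # speed of light in m/s

def rindler_radius(acceleration):
    """R = c**2 / a for a worldline with proper acceleration a (SI units)."""
    return C**2 / acceleration

def front_acceleration(tail_acceleration, ship_length):
    """Proper acceleration at the front, using R_front = R_tail + ship_length."""
    return C**2 / (rindler_radius(tail_acceleration) + ship_length)

g = 9.81          # illustrative tail acceleration in m/s^2
length = 1000.0   # illustrative ship length in m

r_tail = rindler_radius(g)
# Fractional reduction of the acceleration at the front, (a_tail - a_front)/a_tail,
# written in a form that avoids subtracting two nearly equal numbers.
fraction = length / (r_tail + length)

print("R_tail in metres:", r_tail)
print("fractional difference:", fraction)  # approximately g*length/c**2
```

For everyday accelerations and ship sizes the difference is far too small to notice, because c²/a is an enormous distance.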
Introducing an observer with four-velocity ˆu we can split
¯R = tˆu + xˆr , ˆu · ˆr = 0 (8.17)
and
$$\bar R^2 = R^2 \quad\Rightarrow\quad -t^2 + x^2 = R^2\,. \tag{8.18}$$
This means that in the (t, x)-plane of the observer the worldlines of constant acceleration
are hyperbolas, as illustrated in ﬁgure 8.2. Important properties are hidden in this ﬁgure,
[Figure 8.2 diagram; axes t and x; hyperbolas $\bar R^2 = R_1^2$ and $\bar R^2 = R_2^2$ crossing the x-axis at $R_1$ and $R_2$]
Figure 8.2: Worldlines of constant acceleration as viewed by an observer.
e.g. that all points on the curves are equivalent and that all observers see the same picture.
It is also hard to see that lines through the origin are orthogonal to the curves and that the curves
have constant separation. These problems arise because we are representing a spacetime
situation in Euclidean space. The corresponding situation in Euclidean space would be
two concentric circles of radius R1 and R2 respectively and here the properties mentioned
above become obvious. Note however that this Euclidean picture is also not a reliable
representation of the spacetime situation, e.g. the curves are closed which is impossible
in spacetime. We simply have to live with the fact that spacetime situations cannot be
completely reliably represented in Euclidean space.
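Although the properties are hard to read off from the diagram, they are easy to verify numerically with the Minkowski scalar product. The sketch below labels points on the hyperbolas by a hyperbolic angle η, so that a line through the origin corresponds to fixed η; this labelling, the function names and the values of R1 and R2 are assumptions made for illustration (c = 1, components (t, x)).

```python
import math

def dot(p, q):
    """Minkowski product in the (t, x) plane, signature (-, +)."""
    return -p[0] * q[0] + p[1] * q[1]

def point(R, eta):
    """Point (t, x) on the hyperbola -t**2 + x**2 = R**2 at hyperbolic angle eta."""
    return (R * math.sinh(eta), R * math.cosh(eta))

def tangent(eta):
    """Unit tangent (four-velocity) at hyperbolic angle eta; the same for every R."""
    return (math.cosh(eta), math.sinh(eta))

R1, R2 = 1.0, 1.5  # illustrative values
for eta in (0.0, 0.5, 2.0):
    p1, p2 = point(R1, eta), point(R2, eta)   # both on the same line through the origin
    sep = (p2[0] - p1[0], p2[1] - p1[1])      # vector from the inner to the outer curve
    # Columns: eta, sep.tangent (= 0, orthogonal), length of sep (= R2 - R1)
    print(eta, dot(sep, tangent(eta)), math.sqrt(dot(sep, sep)))
```

The separation vector is orthogonal to the curves and has the same length R2 − R1 for every η, even though the Euclidean distance between the corresponding points in the diagram grows with η.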
8.2 Fitting a relativistic car into a garage
A famous “paradox” in special relativity is this:
Imagine you just bought a very fast car. Unfortunately, the car turns out to
be slightly too long to ﬁt in your garage. Is it possible to exploit the Lorentz
contraction and drive the car very fast into the garage and slam the door?
From the point of view of the garage the car appears shorter due to the Lorentz contraction,
which suggests that it should work. On the other hand, from the point of view of the car,
the garage appears shorter making the situation worse, not better. This is the “paradox”.
Of course, when you analyze things carefully there is no contradiction.
Let ˆu be the four-velocity of the garage and ˆv that of the car. We take units where the
length of the garage is 1 and let ˆg be orthogonal to ˆu and stretch from the doors to the
back of the garage. Similarly, we let ¯c be orthogonal to ˆv and stretch from the tail of the
car to the front. We assume the car continues unaccelerated until it reaches the back wall
of the garage at which point it suddenly stops. (This is of course an idealized situation
involving inﬁnite acceleration.)
The spacetime diagram is given in ﬁgure 8.3. The following important events are
indicated in the figure:
E0 : The tail of the car passes the doors and they close
E1 : An event at the front of the car which is simultaneous with E0 from the car’s point
of view
E2 : The car collides with the back wall of the garage. Simultaneous with E0 from the
point of view of the garage
If ¯v = vˆg is the relative velocity of the car with respect to the garage we can apply the Lorentz transformation relating the observations
from the garage to those from the car
ˆv = γ(ˆu + vˆg) , ˆc = γ(ˆg + vˆu) , (8.19)
or the inverse transformations
ˆu = γ(ˆv − vˆc) , ˆg = γ(ˆc − vˆv) . (8.20)
The last of these equations will be enough for us. Let ¯w be the vector from E1 to E2. From
ﬁgure 8.3 we have
¯c + ¯w = ˆg = γ(ˆc − vˆv) (8.21)
[Figure 8.3 diagram; labels: events E0, E1, E2; worldlines of the doors, the back wall, and the front and tail of the car; vectors $\hat u$, $\hat v$, $\hat g$, $\bar c$]
Figure 8.3: Spacetime diagram for the problem of ﬁtting the car into the garage.
which implies, since ¯w is proportional to ˆv, that
¯c = γˆc , ¯w = −γvˆv . (8.22)
The length of the car is $\sqrt{\bar c^2} = \gamma > 1$, so the car is longer than the garage, consistent
with our assumptions.
Since ¯w = −γvˆv points backwards along ˆv, E2 precedes E1 in the car’s time. From the point of view of the car the front therefore collides
with the back wall before the doors close, although these events are simultaneous from the
point of view of the garage. However, since both ˆg and ¯c are space-like, no signal from
the collision can reach the tail of the car before it passes the doors. Therefore no material
strength can stop the car from being compressed (from its point of view) and you can shut
the door behind it. You have to be quick though if the car is elastic and tends to regain
its original shape. This example illustrates the fact that no perfectly rigid body exists.
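The vector relations (8.19) to (8.22) can be checked with explicit components. The sketch below works in the garage frame with c = 1, writing every vector as (t, x) so that ˆu = (1, 0) and ˆg = (0, 1). The speed v = 0.6 and the variable names are illustrative assumptions.

```python
import math

def dot(p, q):
    """Minkowski product, signature (-, +), components (t, x) in the garage frame."""
    return -p[0] * q[0] + p[1] * q[1]

v = 0.6                                   # illustrative relative speed
gamma = 1.0 / math.sqrt(1.0 - v * v)

u_hat = (1.0, 0.0)                        # four-velocity of the garage
g_hat = (0.0, 1.0)                        # doors to back wall, length 1
v_hat = (gamma, gamma * v)                # (8.19): gamma*(u + v*g)
c_hat = (gamma * v, gamma)                # (8.19): gamma*(g + v*u)

c_bar = tuple(gamma * x for x in c_hat)         # (8.22): tail to front, E0 to E1
w_bar = tuple(-gamma * v * x for x in v_hat)    # (8.22): E1 to E2

total = tuple(c + w for c, w in zip(c_bar, w_bar))
car_length = math.sqrt(dot(c_bar, c_bar))
# Time from E1 to E2 measured by the car: the component of w_bar along v_hat,
# which is -(w_bar . v_hat) because v_hat squares to -1.
dt_car = -dot(w_bar, v_hat)

print("c_bar + w_bar =", total)                 # reproduces g_hat, as in (8.21)
print("car rest length =", car_length)          # equals gamma
print("car-frame time from E1 to E2 =", dt_car) # negative: E2 comes first
```

With these numbers the car is about 1.25 garage lengths long in its own rest frame, and in the car's time the collision E2 happens before the doors close.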
We have seen that it is indeed possible (though not recommended) to ﬁt the car into
the garage by exploiting the Lorentz contraction. We have also seen that what appears at
ﬁrst sight to be a paradox is just due to not analyzing the situation carefully enough, in
particular neglecting the issues due to the relativity of simultaneity. When the problem is
carefully analyzed there are of course no contradictions.
8.3 Rotating wheel
In a wheel at rest all constituent particles follow straight worldlines. When the wheel
starts spinning the worldlines become tilted and form helices in spacetime. Only the
center continues on a straight worldline.
Using the center as an observer and thinking of a ring of matter a distance ρ around
it we realize that such a ring must have circumference 2πρ whether the wheel is spinning
or not. In a sense the Lorentz contraction is prevented from taking place. Instead the
tilting of the worldlines has the eﬀect that the orthogonal distance between them increases
compensating the Lorentz contraction. However, this means that a deformation of the
wheel takes place.
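To put a number on this deformation, the following sketch reads "compensating the Lorentz contraction" as saying that the orthogonal distance between neighbouring worldlines on the ring grows by the factor γ belonging to the rim speed v = ρω (with c = 1). This reading, the function name and the sample values are assumptions made for illustration.

```python
import math

def ring_stretch_factor(rho, omega):
    """Assumed stretch factor gamma of a ring of radius rho spinning at angular velocity omega (c = 1)."""
    v = abs(rho * omega)  # rim speed as seen from the center
    if v >= 1.0:
        raise ValueError("rim speed must stay below the speed of light")
    return 1.0 / math.sqrt(1.0 - v * v)

rho, omega = 1.0, 0.6     # illustrative values
factor = ring_stretch_factor(rho, omega)
print("circumference seen from the center:", 2 * math.pi * rho)
print("length measured along the ring by co-moving observers:", 2 * math.pi * rho * factor)
```

For a wheel at rest the factor is exactly 1, and it grows without bound as the rim speed approaches the speed of light.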
This is an inevitable deformation which is always connected with rotation. How much
energy is needed to produce the deformation depends on the stresses in the wheel. Therefore
the moment of inertia of a wheel must also depend on the stresses within it. This is an
important consequence of the theory of relativity though it plays little role in everyday
circumstances.