Physics in Spacetime
Lecture notes
Linus Wulff
Spring 2022
Chapter 1
Space and Time
This course deals with the special theory of relativity introduced by Einstein in a famous
1905 paper. The traditional way of introducing special relativity is to derive it, in much
the same way that Einstein did, from
1. The principle of relativity
2. The constancy of the speed of light
From these assumptions the notion of a spacetime with (inertial) observers being connected
by Lorentz transformations follows. This is a natural way to proceed if one starts from a
knowledge of classical mechanics and Maxwell’s equations of electrodynamics. However, it
is not the best way to understand the geometrical aspects of spacetime. This is part of the
reason it took Einstein another ten years to come up with the general theory of relativity,
where the geometry of spacetime is the key player.
In this course we will follow a different route [1] which leads more directly to a geometric
picture. Rather than follow the traditional route starting with the principles described
above we will derive the same physics from what is known as
• The principle of maximum proper time
This approach is more in line with general relativity, and this course can be thought of as
a ﬁrst step towards studying general relativity.
1.1 What is space and time?
The notions of Space and Time are central to physics. In physics we are interested in
answering questions like
Given some conﬁguration of particles with given positions and momenta at
some initial time ti, what will the conﬁguration look like at some later time tf .
[1] The approach to relativity taken here is inspired by lecture notes and a book by B. Laurent (Introduction
To Spacetime, World Scientific, 1994).
For such questions to make sense we must have a precise way of deﬁning what we mean
by time and also what we mean by a particle’s position in space. So what are space and
time? Rather than get into a philosophical discussion about the nature of space and time,
a more useful approach when faced with such deep questions in physics is to try to replace
them by a different, more down-to-earth question. After all, in physics we are only interested in
things that can be measured, and therefore a better question to ask is
How do we measure space and time?
We say that we deﬁne space and time operationally, by declaring how we measure them.
So how do we measure distances in space? The most basic way is to take a reference
object, say a stick of a certain length, and use it to measure the distance between two points.
We will call such a reference object a ruler. Of course, a good ruler should not bend or
change its length with temperature etc., so we will assume it is always possible to ﬁnd
a suﬃciently good ruler (or equivalent) that we can measure lengths to the precision we
need. How do we measure time? To measure time we need a clock. It does not have
to be what we normally think of as a clock, it can be any physical system which has a
known time dependence, e.g. a pendulum or atoms in an excited state with known half-life.
Again, we will assume that there exist such clocks with good enough precision for the time
measurements we need to perform.
We have established that each person, or more generally, each observer measures time
with their clock and spatial distances with their ruler. We will assume that these are
small enough that the observer can carry them with her, i.e. they will be in the same
state of motion as the observer and experience the same forces she experiences. But if
each observer makes their own measurements, how do we relate the measurements of two
diﬀerent observers? Newton and his contemporaries assumed that there was an absolute
notion of time, so that all observers' clocks would tick at the same rate. In that case it is
very easy to relate the measurements of two observers. We now know that this assumption
was wrong. For example, taking two synchronized atomic clocks, putting one on a plane
circling the earth and leaving one on the ground, one ﬁnds when comparing them at the end
that they diﬀer (by a few hundred nanoseconds). This observation is clearly inconsistent
with the Newtonian idea of an absolute time.
1.2 The principle of maximum proper time
Experiments show that time runs diﬀerently for diﬀerent observers. We must therefore
assign each observer their own time, their proper time, which is the time measured on their
clock. We can now state the key principle that will allow us to compare the measurements
of different observers:
If two observers are separated and then meet again, the one that does not experience
any acceleration always measures the longest proper time.
This is the principle of maximum proper time. It says that proper time is maximized for
inertial (unaccelerated) observers. There is plenty of experimental evidence to support this
principle, such as the experiments with atomic clocks on planes, or the operation of GPS
satellites which requires very precise time measurements. We will take this principle as the
starting point from which we will derive the theory of special relativity.
1.3 Spacetime
We are familiar with the fact that to specify the position of an object in our three dimensions
we need to give three numbers – the coordinates with respect to some speciﬁed
coordinate system. For positions on the earth we might for example give the longitude,
the latitude and the height above sea level. To specify an event – something happening
at a certain place at a certain instant of time – we must give one more number, namely
the time on a clock associated to the coordinate system. In our example this could be the
time GMT.
We have argued that we must allow each observer to measure distances and times using
their own coordinate system deﬁned by their ruler and clock. Each observer will therefore
associate to a given event four numbers (t, x, y, z) – the spacetime coordinates relative to
their coordinate system. Note that we are deﬁning an event here in an idealized way as a
single point in spacetime, i.e. something that happens at a point in space at a single instant
of time. The set of all events makes up the four-dimensional spacetime. Note that each
observer will (in general) assign different coordinates to the same event because they are
using different coordinate systems; there is no preferred coordinate system in spacetime.
One of our ﬁrst tasks will therefore be to understand how to relate the observations of
diﬀerent observers.
1.4 Worldlines
The trajectory of an object traces out a continuous path in spacetime – a worldline (really
“worldtube” if the object is not point-like, but this distinction won't be very important to
us). In ordinary Euclidean space we are familiar with the fact that there is a shortest path
between any two points. This path is called a straight line. It is the path a particle follows
if it is not acted upon by any external forces, i.e. it is unaccelerated. Similarly, we will
assume that there is precisely one straight worldline connecting any two events in spacetime
and that any object not acted on by external forces, i.e. not experiencing any acceleration,
follows such a straight worldline. To a worldline connecting two events in spacetime we can
associate a number – the proper time along that worldline. Recall that this is the time an
observer traveling along the worldline measures on her clock between the two events. The
principle of maximum proper time says that a straight worldline corresponds to the longest
proper time. Therefore the analog of shortest length in Euclidean space is longest proper
time in spacetime and a clock can be thought of as measuring distances in spacetime.
When we draw spacetime diagrams we will draw the worldlines of unaccelerated objects
as straight lines. Curved lines will correspond to worldlines of accelerated objects (Figure
1.1).
Figure 1.1: Spacetime diagram showing an accelerated (curved worldline) and an unaccelerated
(straight worldline) observer meeting at events A and B. The proper times measured
on their respective clocks between the meetings are $\tau'_{AB}$ (accelerated) and $\tau_{AB}$ (unaccelerated). The principle of maximum
proper time then says that $\tau_{AB} > \tau'_{AB}$.
An important notion in Euclidean geometry is the notion of two lines being parallel.
In spacetime we can similarly have the notion of two observers being on the same course.
How can two observers, e.g. two spaceships traveling in outer space, determine whether
they are on the same course? One way to do so uses a construction from Euclidean space
adapted to spacetime. Imagine that the two observers each send out a probe ﬁtted with
a clock, which travels freely until it is picked up by the other observer at some later time
(Figure 1.2). If the two probes happen to meet halfway, i.e. after half of the proper time
(from being emitted to being picked up) has elapsed on each clock, then we will say that
the observers are on the same course, or that their worldlines are parallel. From the ﬁgure
we see that this also implies that the lines AB and CD (not drawn) are parallel. Note
that to carry out the experiment we really need to send clocks that also have a recording
device that records the time they were sent and the time they met. We would also need
to do the experiment several times to get them to meet halfway.
Notice that this construction never refers to space or time separately, only to the full
spacetime picture or the proper time measured by a particular clock. This is in sharp
contrast to how we would describe such an experiment in Newtonian physics.
Figure 1.2: The worldlines of two observers are parallel if they can send out probes, at A
and B, that meet halfway before encountering the other observer at D and C.
Just like in Euclidean space the line AC in Figure 1.2 deﬁnes a vector, which we can
draw as an arrow starting at A and ending at C. We declare the length of the vector to
be given by the proper time elapsed from A to C. The construction in the ﬁgure gives us
a way to parallel transport vectors, i.e. moving a vector keeping it parallel to itself. The
vector AC can be parallel transported to the vector BD. Taking the two worldlines to
approach each other we obtain the special case of parallel transport of the vector along
the worldline. A general parallel transport is obtained by a sequence of such “elementary”
parallel transports.
We will now make a very important assumption. We will assume that the vector one
obtains by such a sequence of elementary parallel transports from a point A to a point A′
in spacetime does not depend on how one chooses the sequence of parallel transports, i.e.
it does not depend on the path taken. This assumption is not true close to gravitating
bodies and in that case one must use the more advanced theory of general relativity. The
assumption is true if gravity is very weak, which is the case we consider in this course.
In this case we are working with special relativity. In fact, the change of a vector under
parallel transport is directly related to the curvature of a space. In special relativity we
are working in a ﬂat spacetime.
Chapter 2
Spacetime vectors
In the last chapter we deﬁned a vector on the straight worldline of an observer as an arrow
from one event to another on the worldline with length given by the proper time elapsed
between the two events. The notion of vectors is familiar from Euclidean space and we
will use the same notation ¯v for a vector in spacetime. Such vectors are often called four-vectors
since spacetime is four-dimensional. Just as any point in Euclidean space $\mathbb{R}^3$ can
be associated with a vector going from the origin to that point, any event in spacetime can
be associated to a spacetime vector from an origin (which we can choose as we please) to
the point in question. We have also seen that we can move vectors around using parallel
transport. Two vectors related by parallel transport will be considered the same vector
(this is consistent since we are assuming that the vector obtained by parallel transport is
independent of the path taken).
Spacetime vectors obey the usual axioms familiar from Euclidean space:
• Commutativity of addition: ¯u + ¯v = ¯v + ¯u
• Associativity of addition: ¯u + (¯v + ¯w) = (¯u + ¯v) + ¯w
• Identity element of addition: ¯v + ¯0 = ¯v
• Inverse elements of addition: Given ¯v there exists a vector −¯v such that ¯v + (−¯v) = ¯0
• Compatibility of scalar multiplication: a(b¯v) = (ab)¯v for a, b ∈ R
• Identity element of scalar multiplication: 1¯v = ¯v
• Distributivity of scalar multiplication with respect to vector addition: a(¯u + ¯v) = a¯u + a¯v
• Distributivity of scalar multiplication with respect to scalar addition: (a + b)¯v = a¯v + b¯v
Addition of spacetime vectors can be done by the geometric construction familiar from
Euclidean space, which is illustrated in Figure 2.1.

Figure 2.1: Addition of vectors ¯u and ¯v to produce a third vector ¯u + ¯v.

Recall that a basis for a vector space
is a set of linearly independent vectors $\bar{v}_i$ with $i = 1, \ldots, n$ which span the space, so that
any vector is expressed uniquely as a linear combination
$$a_1 \bar{v}_1 + a_2 \bar{v}_2 + \ldots + a_n \bar{v}_n , \tag{2.1}$$
for some numbers $a_i \in \mathbb{R}$ with $i = 1, \ldots, n$. The vector can be denoted in this basis as
$(a_1, a_2, \ldots, a_n)$ and n is called the dimension of the vector space. A basis of spacetime
vectors consists of four linearly independent spacetime vectors.
2.1 Inner product
An important operation in linear algebra is the inner product (sometimes called scalar
product) between two vectors. Given two vectors their inner product is a real number.
We will denote the inner product with a dot, e.g. ¯u · ¯v denotes the inner product between
vectors ¯u and ¯v. The inner product satisfies the following standard axioms:
• Symmetry: ¯u · ¯v = ¯v · ¯u
• Linearity: (a¯u) · ¯v = a(¯u · ¯v) and (¯u + ¯v) · ¯w = ¯u · ¯w + ¯v · ¯w
• Non-degeneracy: If ¯u · ¯v = 0 for all vectors ¯v then ¯u = ¯0
Often the inner product is required to be positive definite, so that $\bar{u}^2 = \bar{u} \cdot \bar{u} \geq 0$. This is
the case in Euclidean space where we are used to identifying $\bar{u}^2$ with the length-squared of
a vector, which is obviously positive. We will see below that it is not possible to require
this for spacetime vectors. Instead, for a spacetime vector that goes along the straight
worldline of an object from point A to point B we will take
$$\bar{u}^2 = -\tau^2 , \tag{2.2}$$
where τ is the proper time along the worldline from A to B. The minus sign seems strange
at this point but we will see shortly that it is needed if we want lengths in space to have
positive square. All the diﬀerences between Euclidean space and spacetime are due to the
fact that the inner product in spacetime is not positive deﬁnite. As we will see this is what
makes it possible to separate the time-direction from the spatial directions.
The assumption that $\bar{u}^2 = -\tau^2$ for vectors corresponding to a segment of a straight
worldline also determines $\bar{u} \cdot \bar{v}$ for two straight worldline vectors $\bar{u}$, $\bar{v}$. To see this
consider three such worldline vectors related by
$$a\bar{u} = \bar{v} + \bar{w} , \tag{2.3}$$
for some $a \in \mathbb{R}$. Writing this as $\bar{w} = a\bar{u} - \bar{v}$ and squaring both sides we get
$$\bar{w}^2 = (a\bar{u} - \bar{v}) \cdot (a\bar{u} - \bar{v}) = a^2 \bar{u}^2 - 2a\,\bar{u} \cdot \bar{v} + \bar{v}^2 . \tag{2.4}$$
Rearranging this we have
$$\bar{u} \cdot \bar{v} = \frac{1}{2a}\left( a^2 \bar{u}^2 + \bar{v}^2 - \bar{w}^2 \right) . \tag{2.5}$$
The right-hand-side involves only squares of vectors, which are expressed in terms of the
corresponding proper times. Therefore we see that the inner product ¯u · ¯v
is also determined in terms of the proper times corresponding to the lengths of the vectors
¯u, ¯v, ¯w.
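As a numerical illustration of (2.5), the sketch below recovers an inner product purely from three proper times. It assumes vectors are given by components (t, x, y, z) in an orthonormal basis with inner product −t t′ + x x′ + y y′ + z z′; the notes have not introduced components at this point, so this representation, the helper names and the sample vectors are all assumptions made for illustration.

```python
import math

def dot(u, v):
    """Assumed component form of the inner product: -t t' + x x' + y y' + z z'."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def proper_time(u):
    """Proper time along a straight timelike segment u, from u^2 = -tau^2 (2.2)."""
    return math.sqrt(-dot(u, u))

def dot_from_proper_times(a, tau_u, tau_v, tau_w):
    """Equation (2.5) with each square replaced by minus a proper time squared."""
    return (-(a ** 2) * tau_u ** 2 - tau_v ** 2 + tau_w ** 2) / (2 * a)

# Hypothetical sample data: a*u = v + w with u, v, w all timelike.
a = 2.0
u = (1.0, 0.1, 0.0, 0.0)
v = (1.2, 0.5, 0.0, 0.0)
w = tuple(a * ui - vi for ui, vi in zip(u, v))  # w = a*u - v

lhs = dot(u, v)  # computed directly from components
rhs = dot_from_proper_times(a, proper_time(u), proper_time(v), proper_time(w))
print(lhs, rhs)  # the two values agree up to rounding
```

The point of the comparison is that `rhs` uses only quantities that clocks can measure, while `lhs` uses the component formula.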
It is important to understand that the assumption that there exists an inner product
for spacetime vectors satisfying the above axioms is not a trivial statement. The mere
existence of this inner product has physical consequences. To see this consider the identity
$$(\bar{u} + \bar{v})^2 + (\bar{u} - \bar{v})^2 = 2\bar{u}^2 + 2\bar{v}^2 . \tag{2.6}$$
Consider the case that all these vectors are part of straight worldlines of observers. Since
the expression contains only squares it only involves the proper times measured by these
observers. With four spaceships traveling along these straight worldlines it is then possible
to arrange an experiment (see Figure 2.2) to test whether the proper times they measure
satisfy the above identity, i.e. whether $\tau_1^2 + \tau_2^2 = 2\tau_3^2 + 2\tau_4^2$. One finds that it is indeed
satisfied.
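The bookkeeping of this test can be sketched numerically. The example assumes the component form −t t′ + x x′ + y y′ + z z′ of the inner product (not yet introduced in the notes) and hypothetical sample vectors chosen so that ū, v̄, ū + v̄ and ū − v̄ are all timelike. In such components the identity holds automatically, so this only shows which proper times enter where; it is not a substitute for the experiment.

```python
def dot(u, v):
    """Assumed component form of the inner product: -t t' + x x' + y y' + z z'."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def combine(u, v, sign):
    """Return u + sign * v component by component."""
    return tuple(a + sign * b for a, b in zip(u, v))

def tau_squared(u):
    """Squared proper time of a straight timelike segment, tau^2 = -u^2."""
    s = dot(u, u)
    if s >= 0:
        raise ValueError("segment is not timelike")
    return -s

# Hypothetical sample vectors.
u = (2.0, 0.3, 0.0, 0.0)
v = (0.5, 0.1, 0.2, 0.0)

tau1_sq = tau_squared(combine(u, v, +1.0))  # ship travelling along u + v
tau2_sq = tau_squared(combine(u, v, -1.0))  # ship travelling along u - v
tau3_sq = tau_squared(u)                    # ship travelling along u
tau4_sq = tau_squared(v)                    # ship travelling along v

lhs = tau1_sq + tau2_sq
rhs = 2 * tau3_sq + 2 * tau4_sq
print(lhs, rhs)
```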
2.2 Timelike, Spacelike and Null
Let ¯u, ¯v be two straight worldline vectors. Then
$$\bar{u}^2 = -\tau_u^2 , \qquad \bar{v}^2 = -\tau_v^2 . \tag{2.7}$$
Figure 2.2: Experiment involving four spaceships to test equation (2.6). In the figure $\tau_3$ is measured along $\bar{u}$, $\tau_4$ along $\bar{v}$ and $-\bar{v}$, and $\tau_1$, $\tau_2$ along the two diagonals.
We now deﬁne a new vector which is a linear combination of ¯u and ¯v,
¯y = a¯u + b¯v . (2.8)
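Its inner product with ¯u is
$$\bar{u} \cdot \bar{y} = a\bar{u}^2 + b\,\bar{u} \cdot \bar{v} . \tag{2.9}$$
Taking $a = -b\,\frac{\bar{u} \cdot \bar{v}}{\bar{u}^2}$ we find
$$\bar{u} \cdot \bar{y} = 0 . \tag{2.10}$$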
We say that ¯y is orthogonal to ¯u.
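The choice of a in (2.8) to (2.10) can be tried out on numbers. The sketch below assumes the component form −t t′ + x x′ + y y′ + z z′ of the inner product (an assumption, since the notes work without components here); the function name and the sample vectors are hypothetical.

```python
def dot(u, v):
    """Assumed component form of the inner product: -t t' + x x' + y y' + z z'."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def orthogonal_to(u, v, b=1.0):
    """Return y = a*u + b*v with a = -b (u.v)/(u.u), so that u.y = 0."""
    a = -b * dot(u, v) / dot(u, u)
    return tuple(a * ui + b * vi for ui, vi in zip(u, v))

# Hypothetical timelike sample vectors.
u = (2.0, 1.0, 0.0, 0.0)
v = (3.0, 1.0, 1.0, 0.0)

y = orthogonal_to(u, v)
print(dot(u, y))  # vanishes up to rounding
print(dot(y, y))  # positive for this particular example
```

The construction only requires that ū² is non-zero, which holds for any straight worldline vector.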
Now consider the following situation. Two spaceships part and then meet again:
[Diagram: worldlines of ship 1 (straight) and ship 2 (with a turnaround), parting and meeting again.]
Spaceship 1 is unaccelerated throughout the duration of its journey, while spaceship two
travels unaccelerated for a while then accelerates hard for a short time to reverse its
direction of motion and then again ﬂoats freely until it meets spaceship 1 again.
Figure 2.3: Spacetime vectors corresponding to two spaceships parting and meeting again.

We take the spacetime vectors corresponding to this situation as in Figure 2.3 with
$$\bar{v}_1 = \frac{1}{2}\bar{u} + \epsilon\bar{y} , \qquad \bar{v}_2 = \frac{1}{2}\bar{u} - \epsilon\bar{y} , \tag{2.11}$$
where $\bar{y}$ is the vector introduced above which is orthogonal to $\bar{u}$ and $\epsilon$ is a small number.
Note that ¯v1 + ¯v2 = ¯u so that the spaceships indeed meet at the end. The proper time for
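the journey of ship 1 is
$$\tau_1 = \sqrt{-\bar{u}^2} . \tag{2.12}$$
The proper time for the journey of ship 2 is the sum of the proper time for the two segments
of the journey
$$\tau_2 = \sqrt{-\bar{v}_1^2} + \sqrt{-\bar{v}_2^2} . \tag{2.13}$$
Since $\bar{u} \cdot \bar{y} = 0$ we have $\bar{v}_1^2 = \frac{1}{4}\bar{u}^2 + \epsilon^2 \bar{y}^2 = \bar{v}_2^2$ so that
$$\tau_2 = 2\sqrt{-\frac{1}{4}\bar{u}^2 - \epsilon^2 \bar{y}^2} = \sqrt{-\bar{u}^2}\,\sqrt{1 + \frac{4\epsilon^2 \bar{y}^2}{\bar{u}^2}} = \tau_1 \sqrt{1 - \frac{4\epsilon^2 \bar{y}^2}{\tau_1^2}} . \tag{2.14}$$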
The principle of maximum proper time says that the unaccelerated observer measures the
longest proper time, i.e. $\tau_1 > \tau_2$ (we assume $\bar{y} \neq \bar{0}$). This in turn implies that
$$\sqrt{1 - \frac{4\epsilon^2 \bar{y}^2}{\tau_1^2}} < 1 \quad \Rightarrow \quad \bar{y}^2 > 0 . \tag{2.15}$$
This result was obtained using $\bar{u}^2 = -\tau_1^2 < 0$. If we had decided instead to take the
opposite convention, i.e. $\bar{u}^2 = +\tau_1^2 > 0$, the same calculation would lead to $\bar{y}^2 < 0$. We
see that, in contrast to what we are used to from Euclidean space, it is not possible for all
spacetime vectors to have positive square. This fact follows from the principle of maximum
proper time.
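The two-spaceship calculation in (2.11) to (2.14) can be checked numerically. The sketch assumes the component form −t t′ + x x′ + y y′ + z z′ of the inner product (an assumption, not something the notes have set up yet) and uses hypothetical values: ū along the time direction with τ₁ = 10 and a unit ȳ orthogonal to it.

```python
import math

def dot(u, v):
    """Assumed component form of the inner product: -t t' + x x' + y y' + z z'."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def twin_times(u, y, eps):
    """Return (tau1, tau2, closed_form) for the journeys described by (2.11)."""
    v1 = tuple(0.5 * ui + eps * yi for ui, yi in zip(u, y))
    v2 = tuple(0.5 * ui - eps * yi for ui, yi in zip(u, y))
    tau1 = math.sqrt(-dot(u, u))                                 # (2.12)
    tau2 = math.sqrt(-dot(v1, v1)) + math.sqrt(-dot(v2, v2))     # (2.13)
    closed = tau1 * math.sqrt(1 - 4 * eps ** 2 * dot(y, y) / tau1 ** 2)  # (2.14)
    return tau1, tau2, closed

# Hypothetical sample data; eps must stay small enough that v1, v2 remain timelike.
u = (10.0, 0.0, 0.0, 0.0)
y = (0.0, 1.0, 0.0, 0.0)

for eps in (0.5, 1.0, 2.0):
    tau1, tau2, closed = twin_times(u, y, eps)
    print(eps, tau1, tau2, closed)
```

For each ε the ship that turns around records the shorter proper time, and the direct sum agrees with the closed form.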
Clearly no observer can travel along ¯y because then his clock would need to show an
imaginary time, which is absurd.
Consider now the vector
¯w = c¯u + d¯y , (2.16)
with ¯u and ¯y orthogonal as before. Squaring this we find
$$\bar{w}^2 = c^2 \bar{u}^2 + d^2 \bar{y}^2 . \tag{2.17}$$
If we take $c^2 = -d^2\,\frac{\bar{y}^2}{\bar{u}^2}$ (note that the RHS is positive, which is consistent with c, d being
real numbers) we get $\bar{w}^2 = 0$! We conclude that there exist spacetime vectors $\bar{w} \neq \bar{0}$ such
that $\bar{w}^2 = 0$.
To summarize we have learned that there are 3 classes of spacetime vectors:
• $\bar{v}^2 < 0$: Timelike
• $\bar{v}^2 > 0$: Spacelike
• $\bar{v}^2 = 0$: Null (light-like)
Vectors that are part of a straight worldline of an observer are timelike. We have shown
above that if ¯u is timelike and ¯u · ¯y = 0 then ¯y is spacelike (or ¯y = ¯0). This is a very useful
result to remember when working with spacetime vectors.
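The classification, together with the construction of a null vector from (2.16) and (2.17), can be sketched as follows. The component form −t t′ + x x′ + y y′ + z z′ of the inner product, the function name `classify`, the tolerance and the sample vectors are all assumptions made for illustration.

```python
import math

def dot(u, v):
    """Assumed component form of the inner product: -t t' + x x' + y y' + z z'."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def classify(v, tol=1e-12):
    """Classify a vector by the sign of v^2 (the zero vector is reported as 'null')."""
    s = dot(v, v)
    if s < -tol:
        return "timelike"
    if s > tol:
        return "spacelike"
    return "null"

# Hypothetical orthogonal pair: u timelike, y spacelike.
u = (2.0, 0.0, 0.0, 0.0)
y = (0.0, 3.0, 0.0, 0.0)

# Choose c from c^2 = -d^2 y^2 / u^2 so that w = c*u + d*y has w^2 = 0.
d = 1.0
c = math.sqrt(-d ** 2 * dot(y, y) / dot(u, u))
w = tuple(c * ui + d * yi for ui, yi in zip(u, y))

print(classify(u), classify(y), classify(w))
```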
Chapter 3
Simultaneity and spatial distance
An observer traveling along in a spaceship has only direct access to the interior of the
spaceship. Nevertheless they must be able to make statements and inferences about what
happens in the outside world. To be able to do this they need in particular to be able
to say when an event, which is not on their worldline, occurred. To do this they need to
have a way to determine whether an event far away is simultaneous with an event on their
worldline, e.g. a supernova explosion far away happens when their clock shows 10:23.
The natural way to do this is via the construction in ﬁgure 3.1. The observer sends
out a probe, which travels on a straight worldline to the event and on a straight worldline
back. She arranges it so that the probe reaches the event precisely when half the proper
time of its journey has elapsed. Then she will say that the event on her worldline halfway
between sending out and receiving the probe is simultaneous with the distant event.
Figure 3.1: Via this construction the observer decides that P2, halfway between P1 and P3,
is simultaneous with P. In the figure, ¯v is the observer's displacement from P1 to P2 and from P2 to P3 (proper time t each), ¯r points from P2 to P, and each leg of the probe's journey takes proper time τ.
From the figure we have
$$\tau^2 = -(\bar{v} + \bar{r})^2 = -(\bar{v} - \bar{r})^2 \quad \Rightarrow \quad \bar{v} \cdot \bar{r} = 0 , \tag{3.1}$$
so that ¯r is spacelike.
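The implication in (3.1) can be spelled out by expanding both squares, using only the linearity and symmetry of the inner product:

```latex
\begin{aligned}
-(\bar{v} + \bar{r})^2 &= -\bar{v}^2 - 2\,\bar{v}\cdot\bar{r} - \bar{r}^2 , \\
-(\bar{v} - \bar{r})^2 &= -\bar{v}^2 + 2\,\bar{v}\cdot\bar{r} - \bar{r}^2 , \\
\text{equal right-hand sides} \quad &\Rightarrow \quad 4\,\bar{v}\cdot\bar{r} = 0 .
\end{aligned}
```

Since ¯v lies along the observer's worldline it is timelike, so the result at the end of Chapter 2 applies: a vector orthogonal to a timelike vector is spacelike (or zero).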
We also need a way to measure spatial distances. Let us consider a family of straight
parallel worldlines L0, L1, . . . defined by the equation
$$\bar{R}_n = \lambda_n \bar{u} + n\bar{\rho} , \qquad n = 0, 1, 2, \ldots \tag{3.2}$$
where $\bar{u}$, $\bar{\rho}$ are timelike vectors and $\lambda_n \in \mathbb{R}$ parametrizes a point on the n'th worldline,
$L_n$. This is illustrated in figure 3.2.

Figure 3.2: A family of parallel worldlines L0, L1, L2, L3, . . .

This could be a fleet of identical spaceships traveling
unaccelerated and arranged head to tail. Consider now an observer traveling from the front
of the fleet to the back, counting how many ships he passes. This number is a measure of
how far it is from the head of the fleet to the tail. The distance is expressed in units of
“standard spaceship”.
There is an alternative way to measure this distance. There is only one vector going
from L0 to Ln with the property that it is orthogonal to $\bar{u}$ [1]. It is given by
$$\bar{r}_n = n\bar{\rho} - \frac{n\bar{\rho} \cdot \bar{u}}{\bar{u}^2}\,\bar{u} . \tag{3.3}$$
Note that $\bar{r}_n$, and therefore also its length, is proportional to n, the number of spaceships.
We can use $\bar{r}_n^2$ as a measure of the distance. We just need to work out the conversion
factor to go between $\bar{r}_n^2$ and the number of spaceships.
Looking at figure 3.1 (replacing $\bar{v}$ with $\bar{u}$ and dropping the subscript on $\bar{r}_n$ in the
following) we read off
$$\tau^2 = -(\bar{u} + \bar{r})^2 = t^2 - \bar{r}^2 , \quad \text{or} \quad l^2 = \bar{r}^2 = t^2 - \tau^2 . \tag{3.4}$$
The advantage of this method is that we don’t need the ﬂeet of spaceships (other than
to ﬁx the unit of distance). Later we will ﬁnd an even more practical way to measure
distance.
How we pick the unit of distance is up to us. Nothing prevents us from choosing units
such that $\sqrt{\bar{r}^2}$ itself is the distance. This is in fact the most natural choice to make.
From (3.4) we see that now space and time acquire the same dimensions. In the theory
of relativity this is as natural as height and width having the same dimensions and being
measured in the same units.
3.1 Orthogonal space
To every unaccelerated observer there corresponds a straight worldline. Such a worldline
is characterized by a timelike vector which we can normalize to a unit vector $\hat{u}$. We will
always use a ‘hat’ to denote unit vectors. A timelike unit vector satisfies $\hat{u}^2 = -1$ and a
spacelike unit vector $\hat{r}^2 = 1$. Given an observer with unit vector $\hat{u}$ there exist vectors $\bar{r}$
such that
$$\hat{u} \cdot \bar{r} = 0 . \tag{3.5}$$
They form the orthogonal space to the observer's worldline. This is a vector space since
linear combinations of such vectors clearly belong to the space. In fact, since all such $\bar{r}$
are spacelike (or zero), it is a Euclidean vector space. Since we are imposing one condition
on the four components of $\bar{r}$ the orthogonal space is three-dimensional. Recall that $\sqrt{\bar{r}^2}$
is the (spatial) distance from the observer. The orthogonal space is the space used in
Newtonian physics. The diﬀerence is that in the theory of relativity each observer has
their own orthogonal space.
Given the worldline of an observer with direction ˆu, we can split any spacetime vector
¯R into a component along ˆu and a component orthogonal to it as
¯R = tˆu + ¯r with ˆu · ¯r = 0 . (3.6)
[1] Proof: It is clear that at least one such vector exists. Assume there are two such vectors $\bar{r}_1$, $\bar{r}_2$. We
may assume their foot-point is the same point on L0. The fact that $\bar{u} \cdot \bar{r}_1 = \bar{u} \cdot \bar{r}_2 = 0$ implies $\bar{u} \cdot (\bar{r}_1 - \bar{r}_2) = 0$.
But $\bar{r}_1 - \bar{r}_2 = \lambda\bar{u}$ and the previous equation implies $\lambda = 0$ so that $\bar{r}_1 = \bar{r}_2$.
This is illustrated in figure 3.3.

Figure 3.3: Split of a spacetime vector $\bar{R}$ with respect to a timelike direction $\hat{u}$.

According to the figure, an observer following the worldline
L measures the event corresponding to $\bar{R}$ to happen at a time t and spatial position $\bar{r}$. The
distance to the event is $\sqrt{\bar{r}^2}$. From equation (3.6) we find $t = -\hat{u} \cdot \bar{R}$ and $\bar{r} = \bar{R} + (\hat{u} \cdot \bar{R})\hat{u}$.
We see that t and $\bar{r}$ are uniquely fixed in terms of $\hat{u}$ and $\bar{R}$.
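The split (3.6) translates directly into a small routine. The sketch assumes the component form −t t′ + x x′ + y y′ + z z′ of the inner product, which the notes have not introduced; the function name `split`, the observer's unit vector and the event vector are hypothetical sample choices.

```python
def dot(u, v):
    """Assumed component form of the inner product: -t t' + x x' + y y' + z z'."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def split(R, u_hat):
    """Split R = t*u_hat + r with u_hat.r = 0, using t = -u_hat.R."""
    t = -dot(u_hat, R)
    r = tuple(Ri - t * ui for Ri, ui in zip(R, u_hat))  # r = R + (u_hat.R) u_hat
    return t, r

# Hypothetical observer with timelike unit vector (u_hat^2 = -1) and a sample event.
u_hat = (1.25, 0.75, 0.0, 0.0)
R = (2.0, 1.0, 0.5, 0.0)

t, r = split(R, u_hat)
print(t, r)
print(dot(u_hat, r))             # orthogonality check
print(dot(R, R), -t * t + dot(r, r))  # R^2 = -t^2 + r^2
```

The last line checks that the square of the full vector decomposes as −t² + r̄², which is what makes √(r̄²) usable as a spatial distance.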
3.2 Linearly independent vectors
Consider four spacetime vectors
¯A, ¯B, ¯C, ¯D . (3.7)
They are linearly independent if none of them can be expressed as a linear combination of
the others, or equivalently if the equation
a ¯A + b ¯B + c ¯C + d ¯D = 0 (3.8)
has only the trivial solution a = b = c = d = 0. Note that since spacetime is four-dimensional
we cannot have more than four linearly independent vectors.
In Euclidean space we are used to two orthogonal vectors being linearly independent.
This is not true in spacetime, e.g. a null vector is orthogonal to itself but clearly not
linearly independent of itself. What is true is that if ¯v · ¯u = 0 for all ¯u then ¯v = 0. To see
this take ¯u timelike. We conclude that ¯v must be spacelike, but since it is orthogonal to
all other spacelike vectors it must vanish since these form a Euclidean vector space.
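To test if $\bar{A}, \bar{B}, \bar{C}, \bar{D}$ are linearly independent we form the determinant of the matrix
of inner products
$$\begin{vmatrix}
\bar{A} \cdot \bar{A} & \bar{A} \cdot \bar{B} & \bar{A} \cdot \bar{C} & \bar{A} \cdot \bar{D} \\
\bar{B} \cdot \bar{A} & \bar{B} \cdot \bar{B} & \bar{B} \cdot \bar{C} & \bar{B} \cdot \bar{D} \\
\bar{C} \cdot \bar{A} & \bar{C} \cdot \bar{B} & \bar{C} \cdot \bar{C} & \bar{C} \cdot \bar{D} \\
\bar{D} \cdot \bar{A} & \bar{D} \cdot \bar{B} & \bar{D} \cdot \bar{C} & \bar{D} \cdot \bar{D}
\end{vmatrix} \tag{3.9}$$
The vectors are linearly dependent if and only if this determinant vanishes. To see this, assume
they are linearly dependent. Then (3.8) holds for some coefficients that are not all zero. Taking
this linear combination of rows in the matrix we obtain a row of zeros so that the determinant
vanishes. Conversely, if the determinant vanishes there exists a linear combination
of the rows that gives zero. This means that there exists a vector $a\bar{A} + b\bar{B} + c\bar{C} + d\bar{D}$,
with a, b, c, d not all zero, which is orthogonal to $\bar{A}, \bar{B}, \bar{C}, \bar{D}$. If the four vectors were linearly
independent they would span spacetime, so this vector would be orthogonal to every vector and would
have to vanish by non-degeneracy, contradicting linear independence. Therefore they must be linearly dependent.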
Note that this test works only for four vectors. It does not work for lower-dimensional
subspaces of spacetime.
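The determinant test (3.9) can be sketched in a few lines. The example assumes the component form −t t′ + x x′ + y y′ + z z′ of the inner product and uses integer components so that the determinant is exact; with floating-point components the comparison with zero would need a tolerance. The function names and sample vectors are hypothetical.

```python
def dot(u, v):
    """Assumed component form of the inner product: -t t' + x x' + y y' + z z'."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def gram(vectors):
    """Matrix of inner products, as in (3.9)."""
    return [[dot(a, b) for b in vectors] for a in vectors]

def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def linearly_independent(vectors):
    """Apply the test; it is only meaningful for exactly four spacetime vectors."""
    if len(vectors) != 4:
        raise ValueError("the test applies to four vectors only")
    return det(gram(vectors)) != 0

independent = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]
dependent = [(1, 0, 0, 0), (0, 1, 0, 0), (1, 1, 0, 0), (0, 0, 1, 0)]  # third = first + second

print(linearly_independent(independent), linearly_independent(dependent))

# Why fewer than four vectors are excluded: a single null vector is certainly
# linearly independent, yet its 1x1 matrix of inner products has determinant 0.
print(det(gram([(1, 1, 0, 0)])))
```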
Chapter 4
Velocity and light signals
Consider an observer and an object following straight worldlines $L$ and $L'$ respectively,
figure 4.1. The observer describes the object's position at a time t to be $\bar{r}$, with $\bar{r} = \bar{0}$ at
t = 0.

Figure 4.1: Observer $L$ describes the object $L'$ as having position $\bar{r}$ at time t.

He assigns the object the velocity
$$\bar{v} = \frac{d\bar{r}}{dt} = \frac{\bar{r}}{t} , \tag{4.1}$$
where the last equality follows from the fact that they are following straight worldlines so
¯r is proportional to t. This velocity clearly depends on the observer, since ¯r and t refer
to the observer. For this reason it is called the relative velocity of the observer and the
object. Note that ¯v is orthogonal to ˆu, i.e. belongs to the observer’s orthogonal space. In
particular this means that the relative velocity is spacelike.
The unit vector along the object's worldline ˆv tells us how the relative velocity ¯v is
directed relative to the direction of the observer's worldline, ˆu. We call ˆv the four-velocity
of the object. Any timelike unit vector (pointing forwards in time) can be a four-velocity,
since it could be the direction of an object's worldline.
4.1 Standard velocity split
Let ˆv be the four-velocity of the object and ˆu that of the observer. The object's position is
given by ¯R = τˆv and from ﬁgure 4.1 we see that
τˆv = tˆu + ¯r = t(ˆu + ¯v) , (4.2)
where we used the definition of the relative velocity in (4.1). We can write this as
$$\hat{v} = \gamma(\hat{u} + \bar{v}) \quad \text{where} \quad \gamma = \frac{t}{\tau} \quad \text{and} \quad \hat{u} \cdot \bar{v} = 0 . \tag{4.3}$$
This formula for the split of the four-velocity of the object into the four-velocity of the
observer and the relative velocity is very useful and has many applications. For example,
squaring this equation gives
$$1 = \gamma^2 (1 - v^2) \quad \text{or} \quad \frac{t}{\tau} = \gamma = \frac{1}{\sqrt{1 - v^2}} , \tag{4.4}$$
where $v^2 = \bar{v}^2$, the relative velocity squared. Note that $v^2 = 1 - \frac{\tau^2}{t^2} \leq 1$. This is the
famous formula for time dilation. Since $t = \gamma\tau$ and $\gamma \geq 1$ the observer sees the object's
clock slowed down by a factor of $\gamma$.
Taking the inner product of (4.3) with $\hat{u}$ we find
$$-\hat{u} \cdot \hat{v} = \gamma = \frac{1}{\sqrt{1 - v^2}} . \tag{4.5}$$
Notice that this implies that the deﬁnition of the relative velocity v is symmetric between
the observer and object. The object judges the observer to have the same velocity as the
observer assigns to the object.
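The velocity split and the symmetry just described can be checked on numbers. The sketch assumes the component form −t t′ + x x′ + y y′ + z z′ of the inner product and units in which light has v = 1, as in the notes; the function name, the observer at rest in these components and the relative speed 0.6 are hypothetical sample choices.

```python
import math

def dot(u, v):
    """Assumed component form of the inner product: -t t' + x x' + y y' + z z'."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def four_velocity(u_hat, v_rel):
    """Equation (4.3): v_hat = gamma (u_hat + v_rel), with v_rel orthogonal to u_hat."""
    v2 = dot(v_rel, v_rel)
    if abs(dot(u_hat, v_rel)) > 1e-12 or not 0.0 <= v2 < 1.0:
        raise ValueError("v_rel must be orthogonal to u_hat with speed below 1")
    gamma = 1.0 / math.sqrt(1.0 - v2)
    return gamma, tuple(gamma * (a + b) for a, b in zip(u_hat, v_rel))

u_hat = (1.0, 0.0, 0.0, 0.0)   # observer
v_rel = (0.0, 0.6, 0.0, 0.0)   # relative velocity of the object, speed 0.6

gamma, v_hat = four_velocity(u_hat, v_rel)

# Swap the roles: u_hat = gamma (v_hat + v_back) defines the observer's
# relative velocity v_back as judged by the object.
v_back = tuple(a / gamma - b for a, b in zip(u_hat, v_hat))

print(gamma, -dot(u_hat, v_hat))   # gamma from (4.4) and from (4.5)
print(dot(v_hat, v_hat))           # unit timelike vector
print(dot(v_back, v_back))         # same speed squared as v_rel
```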
4.2 Light signals
So far we have not discussed null lines, which are on the border between timelike and
spacelike lines. Now let us assume that $L'$ in figure 4.1 is such a null line. Then
$$\tau^2 = -\bar{R}^2 = 0 , \tag{4.6}$$
but it is still true that the object's position is given by
¯R = tˆu + ¯r . (4.7)
Squaring this we get
$$-t^2 + \bar{r}^2 = 0 \quad \text{or} \quad v^2 = 1 . \tag{4.8}$$
Note that this result is independent of the observer. All observers would measure the
relative velocity of such a signal to have magnitude v = 1.
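The observer independence of v = 1 can be illustrated by splitting one null vector with respect to several observers. The sketch assumes the component form −t t′ + x x′ + y y′ + z z′ of the inner product; the helper names, the null vector and the three observers (one at rest in these components, two moving at speed 0.6 in different directions) are hypothetical sample choices.

```python
import math

def dot(u, v):
    """Assumed component form of the inner product: -t t' + x x' + y y' + z z'."""
    return -u[0] * v[0] + sum(a * b for a, b in zip(u[1:], v[1:]))

def split(R, u_hat):
    """Split R = t*u_hat + r with u_hat.r = 0, as in (4.7)."""
    t = -dot(u_hat, R)
    r = tuple(Ri - t * ui for Ri, ui in zip(R, u_hat))
    return t, r

def speed(R, u_hat):
    """Magnitude of the relative velocity r/t assigned by the observer u_hat."""
    t, r = split(R, u_hat)
    return math.sqrt(dot(r, r)) / t

n = (1.0, 1.0, 0.0, 0.0)  # a null vector: n^2 = 0

observers = [
    (1.0, 0.0, 0.0, 0.0),    # at rest in these components
    (1.25, 0.75, 0.0, 0.0),  # speed 0.6 along x
    (1.25, 0.0, 0.75, 0.0),  # speed 0.6 along y
]

speeds = [speed(n, u_hat) for u_hat in observers]
print(speeds)
```

Each observer assigns a different t and r̄ to the signal, but the ratio has magnitude 1 for all of them.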
It does not follow from the theory of relativity itself that particles following such null
lines exist. It would be meaningless to assign them a clock since it would not tick (τ = 0
along such a line).
Nevertheless experience tells us that there exist signals in nature that can travel along
null lines. Light is the most important example, and v = 1 is called the "velocity of
light". More generally, as we will see later, a particle can follow a null worldline if and
only if it is massless. The quantum of light is the massless particle called the photon.
The existence of such signals is of great practical importance. Recall that to determine
simultaneity we used the setup in ﬁgure 3.1. The observer has to arrange the situation
so that the proper times of the probe to and from the event are the same. To achieve
this in practice presents great difficulties. The existence of light, or more generally electromagnetic,
signals solves this problem, since for such a signal, used in place of the probe,
τ = 0 always. Equivalently, the observer knows that v = 1 and can therefore calculate the
distance directly from the time the signal takes to travel there and back.
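This light-signal procedure amounts to a two-line calculation on the observer's own clock readings. The sketch below is a minimal illustration in units where light has v = 1; the function name `radar` and the sample event are hypothetical, and the check assumes an unaccelerated observer sitting at the spatial origin.

```python
def radar(t_send, t_receive):
    """Time and distance the observer assigns to the event where the signal was reflected.

    The event is simultaneous with the midpoint of the round trip (figure 3.1),
    and with v = 1 the distance is half the round-trip time.
    """
    if t_receive < t_send:
        raise ValueError("the signal cannot return before it is sent")
    time = 0.5 * (t_send + t_receive)
    distance = 0.5 * (t_receive - t_send)
    return time, distance

# Hypothetical check: an event at time 5 and distance 3 from the observer.
t_event, r_event = 5.0, 3.0
t_send = t_event - r_event     # light sent at this time reaches the event
t_receive = t_event + r_event  # the reflected light returns at this time

print(radar(t_send, t_receive))
```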
Imagine an observer who sends out a ﬂash of light at t = 0. The light pulse travels out
in all directions with unit velocity forming a sphere of light. To draw the corresponding
spacetime diagram we must go down to two dimensions of space where the light forms a
circle traveling outwards from the observer, ﬁgure 4.2. In this three-dimensional spacetime
picture the light forms a cone. For this reason it is referred to as the light-cone. It is the
surface given by the equation
$$r = t , \quad \text{where} \quad r = \sqrt{\bar{r}^2} . \tag{4.9}$$
What we have drawn is really only half of the light-cone, called the future light-cone. There
is also the past light-cone given by
r = −t . (4.10)
It describes a sphere of light contracting towards the origin. The full light cone is given by
the equation
r2
= t2
(4.11)
and is illustrated in ﬁgure 4.3. Note that this equation is true for any observer. This means
that the light-cone looks the same to any observer. Don’t be misled by the picture, which
might seem to suggest otherwise!
Figure 4.2: An observer’s future light-cone. The circle of light is at distance r at time t
and traveling out in all null directions, e.g. ¯n.
4.3 Split of null vectors
As we have seen, any spacetime vector can be split with respect to some four-velocity ˆu
into a component along ˆu and a component orthogonal to it. For a null vector ¯n we get
¯n = a(ˆu + ¯m) (4.12)
but $\bar{n}^2 = 0$ implies $\bar{m}^2 = 1$ so that
¯n = a(ˆu + ˆm) , (4.13)
proportional to the sum of a timelike and a spacelike unit vector. If a vector ¯k is orthogonal
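to ¯n it is either spacelike or proportional to ¯n (a proof is given below). In particular we have that two orthogonal
null vectors must be parallel.
Proof: Writing $\bar{k} = b(\hat{u} + \bar{r})$ with $\hat{u}\cdot\bar{r} = 0$, the condition $\bar{n}\cdot\bar{k} = 0$ becomes (we may assume $a, b \neq 0$)
$$ \hat{m}\cdot\bar{r} = 1 . $$
Since ˆm and ¯r belong to the orthogonal subspace they are Euclidean, so $\hat{m}\cdot\bar{r} \leq \sqrt{\bar{r}^2}$. Therefore we must have either
$\sqrt{\bar{r}^2} > 1$, in which case ¯k is spacelike, or $\sqrt{\bar{r}^2} = 1$ and ¯r is parallel to ˆm, in which case $\bar{k} = \frac{b}{a}\bar{n}$.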
Figure 4.3: The full light-cone.
4.4 Future and past
Consider a timelike vector ¯v at the origin. We can split it with respect to the four-velocity
of an observer as
¯v = tˆu + ¯r , ˆu · ¯r = 0 . (4.14)
Squaring we ﬁnd
$$ \bar{v}^2 = r^2 - t^2 < 0 , \tag{4.15} $$
which implies that every timelike vector points inside the light-cone (see figure 4.2).
Replacing ¯v with a spacelike vector we similarly ﬁnd that every spacelike vector points
outside the light-cone. Null vectors point along the light-cone.
Let ¯u, ¯v be timelike vectors with negative inner product
¯u · ¯v < 0 . (4.16)
Construct the linear combination
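$$ \bar{w} = a\bar{u} + b\bar{v} , \quad \text{with } a, b \geq 0 , \; a + b \neq 0 . \tag{4.17} $$
Squaring we find
$$ \bar{w}^2 = a^2\bar{u}^2 + 2ab\,\bar{u}\cdot\bar{v} + b^2\bar{v}^2 < 0 , \tag{4.18} $$
since no term is positive and at least one is strictly negative. By varying a, b we can continuously go from the vector ¯u to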
the vector ¯v via only timelike vectors. This would be impossible if one was pointing into
the future light-cone and the other into the past light-cone. We therefore conclude that
two timelike vectors with negative inner product must be pointing in the same direction,
i.e. either both into the future light-cone or both into the past light-cone. It is easy to see
that the same argument goes through if we take one timelike and one null vector.
If instead ¯u · ¯v > 0 then ¯u and −¯v point in the same direction so that ¯u and ¯v are
pointing in opposite directions. Conversely two timelike vectors pointing inside the same
part (future/past) of the light-cone have ¯u · ¯v < 0 (note that ¯u · ¯v cannot vanish). Again
this goes through if one of them is null.
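The classification of vectors and the inner-product test for time direction can be sketched as follows. This is a minimal illustration, assuming vectors are given by their components (t, x, y, z) relative to one fixed observer, so that the inner product is −t₁t₂ + r̄₁ · r̄₂; the function names are hypothetical and not part of the notes.

```python
def dot(a, b):
    """Spacetime inner product of two vectors given as (t, x, y, z) tuples."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))


def classify(v):
    """Classify a non-zero vector by the sign of its square."""
    square = dot(v, v)
    if square < 0:
        return "timelike"
    if square > 0:
        return "spacelike"
    return "null"


def same_time_direction(u, v):
    """True if u . v < 0, i.e. if two timelike vectors (or one timelike and
    one null vector) point into the same half of the light-cone."""
    return dot(u, v) < 0


if __name__ == "__main__":
    future_a = (2, 1, 0, 0)   # timelike, t > 0
    future_b = (3, -1, 0, 0)  # timelike, t > 0
    past_c = (-3, 1, 0, 0)    # timelike, t < 0
    print(classify(future_a), same_time_direction(future_a, future_b))
    print(classify(past_c), same_time_direction(future_a, past_c))
```

Integer components are used in the example so that the null case is detected exactly; with floating-point input a tolerance would be needed.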
Let ˆu, ˆv be timelike and future directed. Then
$$ (\hat{u} + \hat{v})^2 = -2 + 2\,\hat{u}\cdot\hat{v} < 0 \quad \text{and} \quad (\hat{u} + \hat{v})\cdot\hat{v} = \hat{u}\cdot\hat{v} - 1 < 0 , \tag{4.19} $$
from which we conclude that ˆu + ˆv is also timelike and future directed. This must hold
for any sum of timelike future directed vectors. An important consequence of this is that
no spaceship (or other object) can reverse its four-velocity and travel to the past to arrive
before it departed. Time travel is therefore impossible within the theory of relativity.
Related to this, if two spaceships part and then meet again they will always agree that they
parted before they met.
We have seen that timelike (and null) vectors can be divided into two classes: future
directed and past directed. Note that no such division is possible for spacelike vectors.
Chapter 5
Lorentz transformation
In this chapter we will restrict to situations where all the spacetime vectors involved lie
in a two-dimensional plane. In particular, this means that they can all be expressed in
terms of two linearly independent vectors. Many problems one encounters are in fact of
this type.
In the cases of interest this plane contains a timelike future directed unit vector, which
could be the four-velocity of an observer. Every vector can be split with respect to this four-velocity,
see figure 3.3. Note that this corresponds to a one-dimensional problem in Newtonian
physics since there is only one spatial direction.
We are interested in how to relate the measurements of two diﬀerent observers. Let
us consider ﬁrst the analogous situation in Euclidean space, with two sets of orthogonal
vectors rotated with respect to each other. Thinking of this as two diﬀerent coordinate
systems we know how to relate them via the rotation. Consider now the corresponding
situation in spacetime, illustrated in ﬁgure 5.1, which occurs often. We have two sets of
orthogonal unit vectors $\hat{u}, \hat{r}$ with $\hat{u}\cdot\hat{r} = 0$ and $\hat{v}, \hat{s}$ with $\hat{v}\cdot\hat{s} = 0$, normalized so that $\hat{u}^2 = \hat{v}^2 = -1$ and
$\hat{r}^2 = \hat{s}^2 = 1$. Obviously $\hat{u}$ and $\hat{r}$ are linearly independent since one is timelike and one is
spacelike. Since we are restricting to a two-dimensional plane we can express $\hat{v}, \hat{s}$ in terms of
ˆu, ˆr. Up to a proportionality factor we have
ˆv ∝ ˆu + αˆr , ˆs ∝ ˆr + αˆu , (5.1)
for some α ∈ R, where we have used the fact that ˆv · ˆs = 0 to ﬁx the form of ˆs. Let us
compute the length of the vectors on the RHS to ﬁx the normalization. We have
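$$ (\hat{r} + \alpha\hat{u})^2 = -(\hat{u} + \alpha\hat{r})^2 = 1 - \alpha^2 \tag{5.2} $$
and therefore we must have
$$ \hat{v} = \pm\frac{1}{\sqrt{1-\alpha^2}}(\hat{u} + \alpha\hat{r}) , \qquad \hat{s} = \pm\frac{1}{\sqrt{1-\alpha^2}}(\hat{r} + \alpha\hat{u}) , \tag{5.3} $$
so that $\hat{s}^2 = -\hat{v}^2 = 1$. Demanding that $\hat{v} \to \hat{u}$ and $\hat{s} \to \hat{r}$ as $\alpha \to 0$ we see that we need to
pick the plus signs.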
Figure 5.1: Two sets of orthogonal unit vectors in spacetime.
Recalling now the standard velocity split, eq. (4.3)
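$$ \hat{v} = \frac{1}{\sqrt{1-v^2}}(\hat{u} + \bar{v}) , \qquad \hat{u}\cdot\bar{v} = 0 , \tag{5.4} $$
we read off $\alpha = \pm v$ with $v = \sqrt{\bar{v}^2}$, the magnitude of the relative velocity. We therefore
have
$$ \hat{v} = \gamma(\hat{u} + v\hat{r}) , \qquad \hat{s} = \gamma(\hat{r} + v\hat{u}) , \qquad \gamma = \frac{1}{\sqrt{1-v^2}} , \tag{5.5} $$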
where we have absorbed the sign into v so that v is positive if the relative velocity is along
ˆr and negative if it is along −ˆr. This is the famous Lorentz transformation. It tells us
how the measurements of two inertial observers are related. To see this consider an event
speciﬁed by the spacetime vector ¯R. We can write it in two ways as
$$ \bar{R} = t\hat{u} + x\hat{r} \quad \text{or} \quad \bar{R} = t'\hat{v} + x'\hat{s} . \tag{5.6} $$
Observer u assigns coordinates (t, x) to the event while observer v assigns it coordinates
(t′, x′). Equating the two expressions for ¯R and using the formula for the Lorentz transformation
we find
$$ t\hat{u} + x\hat{r} = t'\gamma(\hat{u} + v\hat{r}) + x'\gamma(\hat{r} + v\hat{u}) \tag{5.7} $$
or
$$ t = \gamma(t' + v x') , \qquad x = \gamma(x' + v t') . \tag{5.8} $$
This is the Lorentz transformation that relates the time and space measurements of the
two observers.
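The coordinate form of the Lorentz transformation can be checked numerically. The sketch below is only an illustration, assuming units with c = 1 and |v| < 1; the function names and the sample event are hypothetical. It applies eq. (5.8) and confirms that the square of ¯R, −t² + x², is the same for both observers.

```python
import math


def gamma(v):
    """Lorentz factor for relative velocity v (units with c = 1, |v| < 1)."""
    return 1.0 / math.sqrt(1.0 - v * v)


def to_u_coordinates(t_prime, x_prime, v):
    """Eq. (5.8): coordinates (t, x) used by observer u, given the
    coordinates (t', x') used by observer v."""
    g = gamma(v)
    return g * (t_prime + v * x_prime), g * (x_prime + v * t_prime)


def interval(t, x):
    """Square of the spacetime vector t*u + x*r, i.e. -t^2 + x^2."""
    return -t * t + x * x


if __name__ == "__main__":
    v = 0.6                          # hypothetical relative velocity
    t, x = to_u_coordinates(1.0, 0.0, v)
    print(t, x, interval(t, x))
    # Transforming back with -v should recover the original coordinates.
    print(to_u_coordinates(t, x, -v))
```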
5.1 Addition of velocities
In Euclidean space we can perform ﬁrst one rotation and then another. The result is a
third rotation. Similarly, we can perform ﬁrst one Lorentz transformation with parameter
(relative velocity) v1 and then another with parameter v2. This clearly gives a third
Lorentz transformation. The only question is how the parameter of the third Lorentz
transformation is related to v1, v2.
To answer this question we note that the ﬁnal timelike unit vector is proportional to
(ˆu + v1ˆr) + v2(ˆr + v1 ˆu) = (1 + v1v2)ˆu + (v1 + v2)ˆr . (5.9)
From the equation $\hat{v} = \gamma(\hat{u} + v\hat{r})$ we see that we can read off v as the ratio of the $\hat{r}$-component
to the $\hat{u}$-component. In this way we find
$$ v = \frac{v_1 + v_2}{1 + v_1 v_2} . \tag{5.10} $$
This is the relativistic formula for addition of velocities. It has been tested to high accuracy
in experiments with light propagating through ﬂowing liquids. Note that for velocities much
smaller than the speed of light, v1, v2 ≪ 1, we have v ≈ v1 + v2, the Newtonian formula
for addition of velocities. The relativistic formula guarantees that the relative velocity v
always remains less than the speed of light v < 1 for v1, v2 < 1.
In the special case v2 = −v1 we get v = 0. This means that the inverse Lorentz
transformation is obtained by changing the sign of v. This is easily veriﬁed directly by
performing ﬁrst a Lorentz transformation with parameter v and then one with parameter
−v.
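The properties of the addition formula stated above are easy to verify numerically. The following is a minimal sketch, assuming units with c = 1; the function name `add_velocities` and the sample values are illustrative only.

```python
def add_velocities(v1, v2):
    """Relativistic addition of collinear velocities, eq. (5.10), with c = 1."""
    return (v1 + v2) / (1.0 + v1 * v2)


if __name__ == "__main__":
    print(add_velocities(0.5, 0.5))    # stays below 1
    print(add_velocities(0.9, 0.9))    # still below 1
    print(add_velocities(0.3, -0.3))   # inverse transformation gives 0
    print(add_velocities(1e-4, 2e-4))  # close to the Newtonian sum
```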
5.2 Lorentz contraction
Consider a spaceship of length ℓ. The observer on the spaceship has four-velocity ˆu.
Another observer has four-velocity ˆv. Both measure the length of the spaceship from their
perspective, getting the answers ℓ and ℓ′ respectively. The setup is illustrated in figure 5.2.
From the figure we see that
$$ \ell'\hat{s} - \ell\hat{r} \propto \hat{u} \tag{5.11} $$
and taking the inner product with ˆr gives
$$ \ell'\,\hat{r}\cdot\hat{s} - \ell = 0 . \tag{5.12} $$
Using the Lorentz transformation $\hat{s} = \gamma(\hat{r} + v\hat{u})$ we find $\hat{r}\cdot\hat{s} = \gamma$ so that
$$ \ell'/\ell = 1/\gamma = \sqrt{1-v^2} \leq 1 , \tag{5.13} $$
where v is the relative velocity of the two observers. This is the famous Lorentz contraction.
A moving object appears shortened, or contracted, in its direction of motion (relative to
the observer). This eﬀect is similar to slicing a sausage in that the length of the slice
depends on the cutting angle. In that case the perpendicular slice has the shortest length,
whereas in the spaceship case the observer at rest measures the longest length. Note that
in this respect the Euclidean ﬁgure 5.2 is very misleading.
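As a small numerical illustration of the contraction factor 1/γ, the sketch below computes the length measured by an observer moving with relative velocity v. It assumes units with c = 1; the function name and the sample numbers are hypothetical.

```python
import math


def contracted_length(rest_length, v):
    """Length measured by an observer moving with relative velocity v
    (c = 1), given the length measured on board: rest_length / gamma."""
    return rest_length * math.sqrt(1.0 - v * v)


if __name__ == "__main__":
    for v in (0.0, 0.6, 0.99):
        print(v, contracted_length(10.0, v))
```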
Figure 5.2: A spaceship stretching between L1 and L2, with length ℓ, is observed by the
observer following the worldline L, who measures its length to be ℓ′.
Chapter 6
Waves
Sound waves are familiar from Newtonian physics. The pressure P(¯r, t) varies in space and
with time around the mean pressure P0. For a plane sound wave the variation, p = P −P0,
has the form
p = A sin(−ωt + ¯k · ¯r + φ0) . (6.1)
The constants A and φ0 are the amplitude and phase shift of the wave, while ω is the
angular frequency and the three-vector ¯k is the wave vector. We may assume that ω > 0.
The surfaces where p takes a constant value, e.g. its maximum A, are 2-dimensional planes
in our 3-dimensional space, see ﬁgure 6.1, which is the reason for the name plane waves.
Sound waves are just one example. Many other types of waves exist in nature that can be
described in the same way.
Figure 6.1: Plane wave in two space dimensions. The wave fronts, surfaces of constant
phase at say t = 0, given by ¯k · ¯r = const, are straight parallel lines orthogonal to the wave
vector ¯k. In three dimensions they are planes.
Consider now a plane wave in spacetime given by
ψ = A sin( ¯K · ¯R + φ0) , (6.2)
where now ¯K is the wave four-vector and ¯R is the four-vector of a spacetime point (event).
To see that this expression makes sense we can introduce an observer with four-velocity ˆu.
We can then decompose the four-vectors involved with respect to this observer as
¯R = tˆu + ¯r , ¯r · ˆu = 0 , (6.3)
¯K = ωˆu + ¯k , ¯k · ˆu = 0 . (6.4)
Using these expressions the wave takes the form
ψ = A sin(−ωt + ¯k · ¯r + φ0) , (6.5)
which is the same as before, with ω the angular frequency and ¯k the wave vector. Note
that ω and ¯k depend on the observer, since they are deﬁned using his four-velocity. We
have for example
ωu = −ˆu · ¯K , (6.6)
the angular frequency of the wave as measured by the observer u.
The superposition principle says that we can add together waves of the form (6.2) to
form new waves. For physical waves the angular frequency is determined by the wave vector
ω = ω(¯k). The relation between ω and ¯k is called the dispersion relation and depends on
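the type of wave. Let us consider the sum of two one-dimensional plane waves with similar
wave numbers $k_1 = (1-\epsilon)k_0$ and $k_2 = (1+\epsilon)k_0$ with $\epsilon \ll 1$ and $\phi_0 = 0$,
$$ \psi = \sin(-\omega_1 t + k_1 x) + \sin(-\omega_2 t + k_2 x) . \tag{6.7} $$
We have
$$ -\omega_1 t + k_1 x = -\omega(k_1)t + k_1 x \approx -\omega(k_0)t + k_0 x + \epsilon k_0\left(\frac{d\omega}{dk}t - x\right) , \tag{6.8} $$
where we have Taylor expanded the function ω to first order in $\epsilon$. For $k_2$ we find the same
expression but with the sign of $\epsilon$ changed. Therefore, using the formula for the sine of a sum of
angles and setting $\Delta = k_0\left(\frac{d\omega}{dk}t - x\right)$, we have
$$ \begin{aligned}
\psi &= \sin(-\omega_0 t + k_0 x + \epsilon\Delta) + \sin(-\omega_0 t + k_0 x - \epsilon\Delta) \\
&= \sin(-\omega_0 t + k_0 x)\cos(\epsilon\Delta) + \cos(-\omega_0 t + k_0 x)\sin(\epsilon\Delta) \\
&\quad + \sin(-\omega_0 t + k_0 x)\cos(-\epsilon\Delta) + \cos(-\omega_0 t + k_0 x)\sin(-\epsilon\Delta) \\
&= 2\cos\!\left(\epsilon k_0\left(\frac{d\omega}{dk}t - x\right)\right)\sin(-\omega_0 t + k_0 x) .
\end{aligned} \tag{6.9} $$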
This is a plane wave with wave number k0 and frequency ω0 = ω(k0) but with an amplitude
which is modulated by the cos factor. The cosine factor describes the envelope of the wave
packet. The envelope of the wave packet depends on the position through $x - \frac{d\omega}{dk}t$ and
therefore travels with the group velocity
$$ v = \frac{d\omega}{dk} . \tag{6.10} $$
This is the velocity of signals sent with such waves. For light we should clearly have v = 1
and indeed light (electromagnetic radiation) has the dispersion relation
$$ \omega = k = \sqrt{\bar{k}^2} . \tag{6.11} $$
Equivalently, we have for light that
$$ \bar{K}^2 = (\omega\hat{u} + \bar{k})^2 = -\omega^2 + \bar{k}^2 = 0 , \tag{6.12} $$
i.e. light rays have a wave four-vector which is null. This means that for light (electromagnetic
radiation) we can write the wave four-vector as
¯K = ω(ˆu + ˆk) ˆk · ˆu = 0 , (6.13)
where ˆk is a unit vector describing the direction of the wave.
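The group velocity v = dω/dk can be estimated numerically for any dispersion relation by a finite difference. The sketch below is only an illustration: the light dispersion relation ω = k is the one given above, while the second relation, ω = √(k² + m²), is a hypothetical example chosen here to show a case with v < 1 and is not discussed in the notes. The function names and the step size h are likewise assumptions.

```python
import math


def group_velocity(omega, k, h=1e-6):
    """Central finite-difference estimate of d(omega)/dk at wave number k."""
    return (omega(k + h) - omega(k - h)) / (2.0 * h)


def omega_light(k):
    """Dispersion relation for light, eq. (6.11), in one dimension."""
    return abs(k)


def omega_massive(k, m=1.0):
    """Hypothetical dispersion relation used only for illustration."""
    return math.sqrt(k * k + m * m)


if __name__ == "__main__":
    print(group_velocity(omega_light, 2.0))    # should be very close to 1
    print(group_velocity(omega_massive, 2.0))  # should be below 1
```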
6.1 Doppler shift and aberration
Consider two observers with four-velocities ˆu and ˆv observing a light wave ($\bar{K}^2 = 0$).
Splitting vectors with respect to the ﬁrst observer we have
ˆv = γ(ˆu + ¯v) ¯v · ˆu = 0 , (6.14)
the standard velocity split, and
¯K = ωu(ˆu + ˆku) ˆku · ˆu = 0 . (6.15)
We have put a subscript u to emphasize that ωu and ˆku are the quantities measured by
observer u. Since ¯v and ˆku are in the orthogonal space to ˆu they can be treated as ordinary
Euclidean vectors and we can write
¯v · ˆku = v cos θu , (6.16)
where θu is the angle between the relative velocity ¯v of observers u and v and the direction
the light is traveling, given by ˆku, as measured by observer u.
The angular frequency measured by observer v becomes
ωv = −ˆv · ¯K = −ωuγ(−1 + ¯v · ˆku) = ωuγ(1 − v cos θu) , (6.17)
so that
$$ \frac{\omega_v}{\omega_u} = \frac{1 - v\cos\theta_u}{\sqrt{1-v^2}} . \tag{6.18} $$
This is the formula for the Doppler shift. It says how the ratio of the frequencies measured
by two observers depends on their relative velocity and the angle of the velocity to that
of the light. To see the physical consequences of this formula we take the observer u to be
at rest with respect to the source emitting the light (radiation). Let us consider the two
special cases when v is moving directly towards the source or directly away from it. We
have
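Towards source:    $\cos\theta_u = -1$ ,   $\frac{\omega_v}{\omega_u} = \sqrt{\frac{1+v}{1-v}} > 1$ ,   “Blue shift”
Away from source:  $\cos\theta_u = +1$ ,   $\frac{\omega_v}{\omega_u} = \sqrt{\frac{1-v}{1+v}} < 1$ ,   “Red shift”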
An observer moving towards the source sees a higher frequency, the light is blue shifted,
while an observer moving away sees a lower frequency, the light is red shifted. The Doppler
eﬀect is very important in astronomy where it is used for example to measure the velocity
of stars and galaxies relative to us by looking at the shifts of their spectral lines.
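The Doppler formula (6.18) and its two special cases can be checked with a short computation. This is a minimal sketch, assuming units with c = 1 and |v| < 1; the function name `doppler_ratio` and the value v = 0.6 are illustrative choices.

```python
import math


def doppler_ratio(v, cos_theta_u):
    """Frequency ratio omega_v / omega_u from eq. (6.18), with c = 1.

    cos_theta_u is the cosine of the angle, measured by observer u, between
    the relative velocity of v and the direction the light is traveling.
    """
    return (1.0 - v * cos_theta_u) / math.sqrt(1.0 - v * v)


if __name__ == "__main__":
    v = 0.6
    print(doppler_ratio(v, -1.0), math.sqrt((1 + v) / (1 - v)))  # towards source
    print(doppler_ratio(v, +1.0), math.sqrt((1 - v) / (1 + v)))  # away from source
    print(doppler_ratio(v, 0.0))                                 # transverse motion
```

The transverse case (cos θu = 0) gives a ratio equal to γ, an effect with no Newtonian counterpart.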
We could have used the observer v instead of u and we would have found instead the
formula
$$ \frac{\omega_u}{\omega_v} = \frac{1 + v\cos\theta_v}{\sqrt{1-v^2}} . \tag{6.19} $$
Note the plus sign in the numerator which is due to the fact that the relative velocity
according to v has opposite sign compared to what u measures. Multiplying this with the
previous formula for the Doppler shift we ﬁnd
$$ (1 + v\cos\theta_v)(1 - v\cos\theta_u) = 1 - v^2 . \tag{6.20} $$
This is the formula for aberration derived by Einstein in his 1905 paper. It can also be
written as
$$ \cos\theta_v = \frac{\cos\theta_u - v}{1 - v\cos\theta_u} . \tag{6.21} $$
To see its physical consequences we take again u at rest with respect to the source and
v traveling towards the source, but now we let them observe the light at a small angle,
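$\theta_u = \pi - \delta_u$, $\theta_v = \pi - \delta_v$ with $\delta_u, \delta_v \ll 1$. Using the fact that cos(π − x) = − cos(x) and
that for small x, $\cos(x) \approx 1 - \frac{1}{2}x^2$, we find
$$ 1 - \frac{1}{2}\delta_v^2 \approx \frac{1 + v - \frac{1}{2}\delta_u^2}{1 + v - \frac{1}{2}v\,\delta_u^2} \approx 1 - \frac{1}{2}\,\frac{1-v}{1+v}\,\delta_u^2 , \tag{6.22} $$
so that
$$ \delta_v \approx \sqrt{\frac{1-v}{1+v}}\;\delta_u \leq \delta_u . \tag{6.23} $$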
We conclude that the observer traveling towards the source of the light measures a smaller
angle. Similarly, by changing the sign of v, we learn that an observer traveling away from
the light source measures an angle that is larger by a factor $\sqrt{\frac{1+v}{1-v}} \geq 1$. This means that light is
concentrated in the direction of motion and an observer traveling towards a star sees it as
smaller and brighter, while an observer traveling away sees it as larger and fainter. This
is sometimes referred to as the “headlight” eﬀect. It again has important consequences in
astronomy, for example when observing relativistic jets of plasma from compact objects
accreting matter. These appear brighter when directed towards the earth and fainter when
directed away from the earth.
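The aberration formula (6.21) and its small-angle limit can be compared numerically. The sketch below assumes units with c = 1; the function names and the sample values v = 0.6 and a small angle of 0.01 rad away from the direction towards the source are hypothetical. It evaluates the exact formula and compares the result with the factor √((1 − v)/(1 + v)).

```python
import math


def aberration_cos(cos_theta_u, v):
    """Eq. (6.21): cosine of the angle measured by observer v, given the
    cosine of the angle measured by observer u (c = 1)."""
    return (cos_theta_u - v) / (1.0 - v * cos_theta_u)


def small_angle(cos_theta):
    """The angle delta defined by theta = pi - delta."""
    return math.pi - math.acos(cos_theta)


if __name__ == "__main__":
    v = 0.6
    delta_u = 0.01                      # hypothetical small angle seen by u
    cos_u = math.cos(math.pi - delta_u)
    cos_v = aberration_cos(cos_u, v)
    delta_v = small_angle(cos_v)
    print(delta_v / delta_u, math.sqrt((1 - v) / (1 + v)))
```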
Chapter 7
Particle kinematics
An important application of special relativity is to processes involving elementary particles,
e.g. in particle accelerators, where they can often reach velocities close to the speed of light.
In order to discuss what happens in particle collisions we ﬁrst need to introduce the notion
of four-momentum.
7.1 Four-momentum
Recall the split of a four-velocity ˆv with respect to another one ˆu
$$ \hat{v} = \gamma(\hat{u} + \bar{v}) , \qquad \hat{u}\cdot\bar{v} = 0 , \qquad \gamma = \frac{1}{\sqrt{1-v^2}} . \tag{7.1} $$
If v is an object and u an observer, $\bar{v}$ is the velocity of the object relative to the observer.
In the Newtonian limit $v \ll 1$, i.e. for velocities much smaller than the speed of light, we
have, dropping terms of order $v^3$ and higher,
$$ \hat{v} \approx \left(1 + \frac{v^2}{2}\right)\hat{u} + \bar{v} . \tag{7.2} $$
To relate this to something more familiar let us multiply by the mass of the object m,
$$ m\hat{v} \approx \left(m + \frac{mv^2}{2}\right)\hat{u} + m\bar{v} . \tag{7.3} $$
We recognize the kinetic energy of the particle $\frac{1}{2}mv^2$ and its momentum $m\bar{v}$. Note that
every object has a well-deﬁned mass (just consider an observer traveling along with the
object, i.e. whose ˆu is (nearly) parallel to ˆv, who can deﬁne the mass in the usual Newtonian
way). We deﬁne the four-momentum of an object to be its mass times its four-velocity
¯P = mˆv . (7.4)
By construction it satisﬁes
$$ \bar{P}^2 = -m^2 . \tag{7.5} $$
The split of ˆv gives a corresponding split of the four-momentum
¯P = Eˆu + ¯p ˆu · ¯p = 0 , (7.6)
with
$$ E = m\gamma = \frac{m}{\sqrt{1-v^2}} \approx m + \frac{mv^2}{2} , \qquad \bar{p} = m\bar{v}\gamma = \frac{m\bar{v}}{\sqrt{1-v^2}} \approx m\bar{v} . \tag{7.7} $$
In analogy with the Newtonian case we call E the energy and $\bar{p}$ the momentum (or three-momentum)
of the object. Note that they are not intrinsic properties of the object but
depend on the observer.
The four-momentum ¯P = mˆv is time-like since ˆv is time-like. Furthermore, since m > 0,
it is also future directed. The energy measured by observer u can be written
E = −ˆu · ¯P (7.8)
and we see that it is always positive since both ˆu and ¯P are time-like and future directed.
The usefulness of the notion of momentum in Newtonian mechanics comes from the
fact that it is conserved (in the absence of external forces): The sum of all momenta before
a collision is equal to the sum of momenta after the collision. This is actually a special
case of the relativistic conservation of four-momentum
$$ \sum_{\text{before}} \bar{P}_i = \sum_{\text{after}} \bar{P}_j . \tag{7.9} $$
It says that there are four conserved quantities: The total energy and the three components
of the momentum.
The fact that four-momentum is conserved is intimately tied to symmetries in nature.
In fact, you will learn later in your studies that the conservation of momentum is equivalent
to the statement that the laws of physics look the same regardless of position in space. The
laws of physics are the same here as on the moon or in the Andromeda galaxy. Similarly
the conservation of energy is equivalent to the fact that the laws of physics look the same
at all times. Just like the theory of relativity combines space and time into a spacetime it
combines the notions of momentum and energy into the single concept of four-momentum.
Note that in the Newtonian limit the total energy becomes
$$ E \approx \sum_i \left( m_i + \frac{m_i \bar{v}_i^2}{2} \right) . \tag{7.10} $$
But we know that the kinetic energy is not conserved in general, it is only conserved in
elastic collisions. This means that the ﬁrst term must change to keep the total energy
conserved, i.e. the sum of the masses before and after the collision will in general diﬀer,
contrary to what is assumed in Newtonian mechanics. In inelastic collisions the kinetic
energy decreases, so the total mass must increase. Of course in everyday situations v << 1
so the kinetic energy is much smaller than the total mass, $\frac{1}{2}m\bar{v}^2 \ll m$.
This fact, that mass is also a form of energy, is the content of what is probably the
most famous equation in physics
$$ E = mc^2 . \tag{7.11} $$
Remember that here we are using units where the speed of light is unity, c = 1. Note also
that E = m is true only for an observer at rest with respect to the object in question. It
is often referred to as the rest energy.
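The relations between mass, energy and momentum in this section can be illustrated numerically. The following sketch assumes one spatial dimension and units with c = 1; the function name and sample values are hypothetical. It evaluates eq. (7.7), checks that E² − p² = m² as required by ¯P² = −m², and compares E − m with the Newtonian kinetic energy at small velocity.

```python
import math


def energy_momentum(m, v):
    """Energy E = m*gamma and momentum p = m*v*gamma of an object of mass m
    moving with velocity v relative to the observer (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return m * g, m * v * g


if __name__ == "__main__":
    E, p = energy_momentum(1.0, 0.6)
    print(E, p, E * E - p * p)           # last value should be close to m^2 = 1

    # Newtonian limit: E - m approaches m v^2 / 2 for v << 1.
    E_slow, p_slow = energy_momentum(1.0, 1e-3)
    print(E_slow - 1.0, 0.5 * 1e-3 ** 2)
```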
7.2 Massless particles
Knowing all but one four-momentum of the particles in a given reaction we can use the
conservation of four-momentum to determine the remaining one. In this way one can
experimentally establish the existence of particles with null four-momentum
$$ \bar{P}^2 = 0 . \tag{7.12} $$
Since we may define the mass through $\bar{P}^2 = -m^2$ we call such particles massless. The most
important example of such a particle is the photon, the quantum of light (more generally
electromagnetic radiation). The four-momentum and energy of such a particle do not
vanish (if they did we would not be able to detect it), and therefore we conclude from the
expression
$$ E = \frac{m}{\sqrt{1-v^2}} \tag{7.13} $$
that such a particle must have v = 1, i.e. travel at the speed of light. It therefore follows a
null line with direction proportional to the null vector ¯P. Note that for a massless particle
we have
¯P = Eˆu + ¯p = E(ˆu + ˆp) , (7.14)
since it is null.
7.3 Tachyons
Sometimes one hears about particles with space-like four-momentum
$$ \bar{P}^2 > 0 . \tag{7.15} $$
Such particles are referred to as tachyons. Such a particle would have an imaginary mass
from $\bar{P}^2 = -m^2$ and always travel along space-like lines. This turns out not to be consistent
with the laws of quantum mechanics and such particles therefore cannot exist in nature.
Instead in our current best theories particles emerge as “ripples” of a field with a
quantum of energy. In such quantum field theories it is not unusual to have $m^2 < 0$.
However, the resulting particles do not travel faster than light. Instead such an imaginary
mass signals an instability of the vacuum. In fact, this is part of the mechanism by which
particles can acquire mass through the so-called Higgs mechanism, related to the famous
Higgs particle discovered at CERN.
7.4 Particle reactions and kinematics
The two most important types of particle reactions are
• Decay: One particle goes into two or more
• Collision: Two particles go into one or more
The allowed configurations of four-momenta are constrained by
1. $\bar{P}_i^2 = -m_i^2$ for all particles involved
2. Conservation of four-momentum, $\sum_i \bar{P}_{i,\text{in}} = \sum_j \bar{P}_{j,\text{out}}$
These equations form the basis of particle kinematics. It is often convenient to refer all
quantities to some particular observer, say with four-velocity ˆu. Then we have
¯Pi = Ei ˆu + ¯pi ˆu · ¯pi = 0 . (7.16)
The inner product of the four-momenta of two particles becomes
¯Pi · ¯Pj = −EiEj + ¯pi · ¯pj , (7.17)
and in the special case i = j we ﬁnd
$$ m_i^2 = -\bar{P}_i^2 = E_i^2 - \bar{p}_i^2 \quad \Rightarrow \quad E_i = \sqrt{m_i^2 + \bar{p}_i^2} . \tag{7.18} $$
A typical kinematical calculation involves two steps:
1. Take the inner product of the conservation of four-momentum, $\sum_i \bar{P}_{i,\text{in}} = \sum_j \bar{P}_{j,\text{out}}$,
with some $\bar{P}_i$, or rearrange it and take the square.
2. Replace $\bar{P}_i^2$ by $-m_i^2$ everywhere and inner products by (7.17).
Figure 7.1: Scattering of an electron, with momentum ¯q, oﬀ a proton at rest (¯p = 0) in the
observer’s orthogonal space. The recoil angle of the proton is θ.
Example
Consider an electron ($e^-$) scattering off a proton ($p^+$). We set $m_{e^-} = m$ and $m_{p^+} = M$
and let $\bar{P}, \bar{Q}$ be the initial four-momenta and $\bar{P}', \bar{Q}'$ the final ones. We have
$$ \bar{P}^2 = \bar{P}'^2 = -M^2 , \qquad \bar{Q}^2 = \bar{Q}'^2 = -m^2 . \tag{7.19} $$
The conservation of four-momentum reads
$$ \bar{P} + \bar{Q} = \bar{P}' + \bar{Q}' . \tag{7.20} $$
Let’s say we are interested in the situation where the proton is at rest before the collision
and we want to ﬁnd the recoil angle of the proton. The situation is illustrated in the
observer’s orthogonal space in ﬁgure 7.1. When there is one particle which we do not
know anything about, in this case the outgoing electron with four-momentum $\bar{Q}'$, it is
often useful to rearrange the conservation of four-momentum so that that particle’s
four-momentum appears alone on one side of the equation and then take the square. We
therefore write
$$ \bar{P} + \bar{Q} - \bar{P}' = \bar{Q}' \tag{7.21} $$
and squaring this equation we find
$$ \bar{P}^2 + \bar{Q}^2 + \bar{P}'^2 + 2\bar{P}\cdot\bar{Q} - 2\bar{P}\cdot\bar{P}' - 2\bar{Q}\cdot\bar{P}' = \bar{Q}'^2 . \tag{7.22} $$
Using the fact that $\bar{P}^2 = \bar{P}'^2 = -M^2$ and $\bar{Q}^2 = \bar{Q}'^2 = -m^2$ this becomes
$$ 0 = -M^2 + \bar{P}\cdot\bar{Q} - (\bar{P} + \bar{Q})\cdot\bar{P}' = (\bar{P} + \bar{Q})\cdot(\bar{P} - \bar{P}') . \tag{7.23} $$
This is a useful equation for two-particle elastic collisions. Since we are assuming the
proton is initially at rest with respect to the observer we have
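$$ \bar{P} = M\hat{u} , \qquad \bar{Q} = E\hat{u} + \bar{q} , \qquad \bar{P}' = E'\hat{u} + \bar{p}' . \tag{7.24} $$
Using this we find
$$ 0 = (\bar{P} + \bar{Q})\cdot(\bar{P} - \bar{P}') = ((M+E)\hat{u} + \bar{q})\cdot((M-E')\hat{u} - \bar{p}') = -(M+E)(M-E') - \bar{q}\cdot\bar{p}' . \tag{7.25} $$
Since $\bar{q}$ and $\bar{p}'$ are in the orthogonal space to $\hat{u}$ they can be treated as ordinary Euclidean
vectors and we can write
$$ \bar{q}\cdot\bar{p}' = q\,p'\cos\theta , \tag{7.26} $$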
where θ is the recoil angle we are after, see ﬁgure 7.1. Noting that
$$ -m^2 = \bar{Q}^2 = -E^2 + q^2 \quad \Rightarrow \quad q = \sqrt{E^2 - m^2} \tag{7.27} $$
and similarly $p' = \sqrt{E'^2 - M^2}$, we get
$$ \cos\theta = \frac{\bar{q}\cdot\bar{p}'}{q\,p'} = \frac{(E+M)(E'-M)}{\sqrt{E^2-m^2}\,\sqrt{E'^2-M^2}} = \frac{E+M}{\sqrt{E^2-m^2}}\sqrt{\frac{E'-M}{E'+M}} . \tag{7.28} $$
This expresses the recoil angle of the proton in terms of the initial energy of the electron
E and the final energy of the proton E′.
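The result (7.28) can be cross-checked numerically against conservation of energy and three-momentum. The sketch below is an illustration only: the function names are hypothetical, and the masses and energies (m = 1, M = 3, E = 5, E′ = 4 in arbitrary units with c = 1) are made-up values chosen to give simple numbers, not physical electron and proton data. Given the recoil angle from (7.28), the outgoing electron momentum follows from ¯q′ = ¯q − ¯p′, and the total energy before and after should agree.

```python
import math


def proton_recoil_cos(E, E_prime, m, M):
    """Eq. (7.28): cosine of the proton recoil angle."""
    return ((E + M) / math.sqrt(E * E - m * m)
            * math.sqrt((E_prime - M) / (E_prime + M)))


def outgoing_electron_energy(E, E_prime, m, M):
    """Energy of the outgoing electron, using three-momentum conservation
    q' = q - p' and the recoil angle from eq. (7.28)."""
    q = math.sqrt(E * E - m * m)              # incoming electron momentum
    p = math.sqrt(E_prime * E_prime - M * M)  # recoiling proton momentum
    cos_theta = proton_recoil_cos(E, E_prime, m, M)
    q_out_sq = q * q + p * p - 2.0 * q * p * cos_theta
    return math.sqrt(m * m + q_out_sq)


if __name__ == "__main__":
    m, M, E, E_prime = 1.0, 3.0, 5.0, 4.0     # hypothetical values
    print(proton_recoil_cos(E, E_prime, m, M))
    # Energy before (E + M) and after (E' + outgoing electron energy).
    print(E + M, E_prime + outgoing_electron_energy(E, E_prime, m, M))
```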
7.5 Center-of-mass observer
For more complicated processes it is often useful to regard all (or part) of the out-going
particles together. Their total four-momentum is
$$ \bar{P} = \sum_i \bar{P}_i . \tag{7.29} $$
Then the “mass” defined by $\bar{P}^2 = -M^2$ does not have a fixed value. It depends on the
relative motion of the particles. It does have a lower bound however. Since a sum of
time-like future directed vectors is again time-like and future directed we can consider an
observer with four-velocity ˆu = ¯P/M. Then we ﬁnd
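$$ \hat{u} = \frac{1}{M}\bar{P} = \frac{1}{M}\sum_i \bar{P}_i = \frac{1}{M}\sum_i E_i\,\hat{u} + \frac{1}{M}\sum_i \bar{p}_i . \tag{7.30} $$
The last term must therefore vanish, which means that this observer sees the total spatial
momentum of the particles being zero, $\sum_i \bar{p}_i = 0$. Using this fact we find
$$ -M^2 = \bar{P}^2 = -\Big(\sum_i E_i\Big)^2 = -\Big(\sum_i \sqrt{m_i^2 + p_i^2}\Big)^2 . \tag{7.31} $$
Therefore
$$ M^2 = \Big(\sum_i \sqrt{m_i^2 + p_i^2}\Big)^2 \geq \Big(\sum_i m_i\Big)^2 , \tag{7.32} $$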
with equality occurring only if ¯pi = 0 for all i, i.e. if all particles are at rest with respect
to each other. This inequality leads to so-called threshold conditions for when a certain
set of particles can be produced, for example a Higgs boson at the LHC. Note that there
is no reference to any observer in the last equation – the result is observer independent.
Nevertheless it was convenient to introduce an observer in the derivation of the result and
this turns out to often be the case.
An observer, such as the one considered above, whose four-velocity is parallel to
the total four-momentum $\sum_i \bar{P}_i$ is called a center-of-mass observer (often center-of-mass
frame or center-of-mass system), since such an observer sees the particles with zero total
three-momentum, $\sum_i \bar{p}_i = 0$. The introduction of such an observer can often simplify the
calculations. Note that using Lorentz transformations we can of course translate from one
observer to another.
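The bound (7.32) can be illustrated by computing the “mass” M of a system directly from the components of the four-momenta relative to any one observer. This is a minimal sketch, assuming units with c = 1; the function names and the sample particles (two particles of mass 1) are hypothetical.

```python
import math


def on_shell(m, px, py, pz):
    """Four-momentum components (E, px, py, pz) of a particle of mass m,
    with E fixed by eq. (7.18)."""
    return (math.sqrt(m * m + px * px + py * py + pz * pz), px, py, pz)


def invariant_mass(momenta):
    """M defined by P^2 = -M^2 for the total four-momentum of the system."""
    E = sum(p[0] for p in momenta)
    px = sum(p[1] for p in momenta)
    py = sum(p[2] for p in momenta)
    pz = sum(p[3] for p in momenta)
    return math.sqrt(E * E - (px * px + py * py + pz * pz))


if __name__ == "__main__":
    # Two particles of mass 1 flying apart: M exceeds the sum of the masses.
    print(invariant_mass([on_shell(1.0, 0.75, 0, 0), on_shell(1.0, -0.75, 0, 0)]))
    # Two particles of mass 1 moving together: M equals the sum of the masses.
    print(invariant_mass([on_shell(1.0, 0.75, 0, 0), on_shell(1.0, 0.75, 0, 0)]))
```

In the second case the observer is not the center-of-mass observer, yet M still equals the sum of the masses, which illustrates that the result does not depend on the observer.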
Chapter 8
Curved worldlines
So far we have dealt almost exclusively with straight worldlines. Unaccelerated spaceships
and particles not inﬂuenced by external forces travel along such worldlines. Conversely,
spaceships that run their engines or particles inﬂuenced by external forces follow curved
worldlines. While it is possible to treat accelerated observers in special relativity it is usually
avoided for practical reasons. We will therefore continue to assume that all observers
are unaccelerated, unless otherwise stated.
The natural parameter along a (massive) particle’s worldline is the proper time τ. The
spacetime position of the particle is given as a function of τ
¯R = ¯R(τ) . (8.1)
We can calculate the rate of change of the position vector with τ by taking the derivative
$$ \frac{d\bar{R}}{d\tau} = \lim_{\Delta\tau \to 0} \frac{\bar{R}(\tau + \Delta\tau) - \bar{R}(\tau)}{\Delta\tau} . \tag{8.2} $$
Since τ is the proper time along the worldline, which measures its length, we have for an
inﬁnitesimal piece of the worldline
$$ d\tau^2 = -(d\bar{R})^2 . \tag{8.3} $$
It follows that
$$ \hat{v} = \frac{d\bar{R}}{d\tau} \tag{8.4} $$
is a time-like unit vector, $\hat{v}^2 = -1$. It is tangent to the worldline at the event $\bar{R}(\tau)$. For a
straight worldline $d\bar{R}/d\tau$ is constant and $\hat{v}$ is the four-velocity. We will continue to call it
the four-velocity also for curved worldlines.
The four-acceleration is deﬁned as the rate of change of the four-velocity
$$ \bar{A} = \frac{d\hat{v}}{d\tau} = \frac{d^2\bar{R}}{d\tau^2} . \tag{8.5} $$
Differentiating the condition $\hat{v}^2 = -1$ we learn that
ˆv · ¯A = 0 . (8.6)
This implies that ¯A is space-like. Note that ¯A is not, in general, a unit vector. In particular
it vanishes for a straight worldline.
Let us now analyze how an unaccelerated observer views an accelerated worldline.
Consider an observer whose (straight) worldline is tangent to the curved worldline at a
certain point ¯R(τ0), as depicted in figure 8.1.
Figure 8.1: An observer’s worldline is tangent to the curved worldline of an accelerated
object at the point ¯R(τ0).
Splitting ¯R with respect to the four-velocity ˆu of the observer we have
¯R(τ) = tˆu + ¯r , ˆu · ¯r = 0 . (8.7)
Considering a small interval along the curve this gives
d ¯R = dtˆu + d¯r (8.8)
and squaring this we find
$$ -d\tau^2 = -dt^2 + d\bar{r}^2 \tag{8.9} $$
or
$$ d\tau = dt\sqrt{1-v^2} , \qquad \bar{v} = \frac{d\bar{r}}{dt} . \tag{8.10} $$
Note that ¯v is the relative velocity and we can also write dτ = dt/γ. This gives the
four-velocity of the object as
$$ \hat{v} = \frac{d\bar{R}}{d\tau} = \gamma\frac{d\bar{R}}{dt} = \gamma(\hat{u} + \bar{v}) . \tag{8.11} $$
This is the same velocity split formula we found before when expressing a constant four-velocity
$\hat{v}$ in terms of another one $\hat{u}$; here, however, $\hat{v}$ and $\bar{v}$ are not constant. At the point
¯R(τ0) we have ˆv = ˆu since the worldlines are tangent so the relative velocity ¯v = 0 at this
point. For the four-acceleration we ﬁnd
$$ \bar{A}(\tau_0) = \left(\frac{d\hat{v}}{d\tau}\right)_0 = \left(\frac{d\hat{v}}{dt}\right)_0 = (\bar{a})_0 , \tag{8.12} $$
where $\bar{a} = d\bar{v}/dt$ is the ordinary Newtonian acceleration measured by the observer momentarily
at rest with respect to the accelerated object. If this object happens to be a
spaceship the vector $\bar{A}$ at a certain point of its worldline is the acceleration experienced by
the crew at that instant. This gives the physical interpretation of the four-acceleration.
8.1 Special case: Constant acceleration
Consider a 2-plane in spacetime, which could describe for example the history of a spaceship
traveling in a straight line as seen by an observer. In two dimensions we need only one
condition to determine a curve and we take
$$ \bar{R}^2 = R^2 , \tag{8.13} $$
with ¯R the spacetime position and R a constant. Taking the derivative we ﬁnd
¯R · ˆv = 0 (8.14)
and another derivative gives
¯R · ¯A = 1 . (8.15)
The vectors ¯R, ˆv and ¯A lie in a 2-plane and since both ¯R and ¯A are orthogonal to ˆv they
must be parallel, so that using the equation above we ﬁnd
$$ \bar{A} = \frac{\bar{R}}{R^2} \quad \Rightarrow \quad \bar{A}^2 = \frac{1}{R^2} , \tag{8.16} $$
i.e. the four-acceleration of this worldline is constant. This could for example be the worldline
of a spaceship running its engines so that the crew experience a constant acceleration.
Let $\bar{R}^2 = R_1^2$ and $\bar{R}^2 = R_2^2$ with $R_2 > R_1$ be two such worldlines in the same 2-plane
and with the same origin. From (8.14) we see that any line through the origin cuts the
curves orthogonally. Therefore $R_2 - R_1$ is the orthogonal distance between the curves.
For example, one of the lines could be the worldline of the front of a spaceship and the
other the worldline of its tail. The ship’s length is constant, equal to $R_2 - R_1$, but the
accelerations of the front and the tail are unequal: by (8.16) the squared four-accelerations on the
two worldlines are $1/R_1^2$ and $1/R_2^2$, i.e. the magnitudes are $1/R_1$ and $1/R_2$ respectively.
Introducing an observer with four-velocity ˆu we can split
¯R = tˆu + xˆr , ˆu · ˆr = 0 (8.17)
and
$$ \bar{R}^2 = R^2 \quad \Rightarrow \quad -t^2 + x^2 = R^2 . \tag{8.18} $$
This means that in the (x, t)-plane of the observer the worldlines of constant acceleration
are hyperbolas, as illustrated in figure 8.2.
Figure 8.2: Worldlines of constant acceleration as viewed by an observer.
Important properties are hidden in this figure,
e.g. that all points on the curves are equivalent and that all observers see the same picture.
It is also hard to see that lines through the origin are orthogonal to the curves and that they
have constant separation. These problems arise because we are representing a spacetime
situation in Euclidean space. The corresponding situation in Euclidean space would be two
concentric circles of radius R1 and R2 respectively, and here those properties mentioned
above become obvious. Note however that this Euclidean picture is also not a reliable
representation of the spacetime situation, e.g. the curves are closed which is impossible in
spacetime.
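The statement that all observers see the same picture can be illustrated with a boost. In the sketch below (the function names and the sample values R = 2, v = 0.6 are illustrative assumptions) an event on the hyperbola −t² + x² = R² is transformed to the frame of an observer moving with speed v along x, in units with c = 1. The event lands on the same hyperbola and is only shifted along the curve.

```python
import math

def boost(event, v):
    """Lorentz boost of an event (t, x) to a frame moving with speed v along x (c = 1)."""
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (t - v * x), gamma * (x - v * t))

def interval(event):
    """Invariant -t^2 + x^2 of an event measured from the origin."""
    t, x = event
    return -t * t + x * x

R, eta, v = 2.0, 0.7, 0.6
event = (R * math.sinh(eta), R * math.cosh(eta))  # a point on the hyperbola
moved = boost(event, v)

print("interval before:", interval(event))
print("interval after: ", interval(moved))
# The boost shifts the rapidity of the point by -artanh(v).
print("rapidity after: ", math.asinh(moved[0] / R), "expected:", eta - math.atanh(v))
```

Since any point on the curve can be moved to any other by a suitable boost, no point on the worldline is distinguished, which is the spacetime analogue of rotating a circle into itself.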
8.2 Fitting a car into a garage
A famous “paradox” in special relativity is this:
Imagine you just bought a very fast car. Unfortunately, the car turns out to
be slightly too long to fit in your garage. Is it possible to exploit the Lorentz
contraction and drive the car very fast into the garage and slam the door?
From the point of view of the garage the car appears shorter due to the Lorentz contraction,
which suggests that it should work. On the other hand, from the point of view of the car,
the garage appears shorter making the situation worse, not better. This is the “paradox”.
Of course, when you analyze things carefully there is no contradiction.
Let ˆu be the four-velocity of the garage and ˆv that of the car. We take units where the
length of the garage is 1 and let ˆg be orthogonal to ˆu and stretch from the doors to the
back of the garage. Similarly, we let ¯c be orthogonal to ˆv and stretch from the tail of the
car to the front. We assume the car continues unaccelerated until it reaches the back wall
of the garage, at which point it suddenly stops. (This is of course an idealized situation
involving infinite acceleration.)
The spacetime diagram is given in figure 8.3. The following important events are
indicated in the figure:
E0 : The tail of the car passes the doors and they close.
E1 : An event at the front of the car which is simultaneous with E0 from the car's point
of view.
E2 : The car collides with the back wall of the garage. This event is simultaneous with E0
from the point of view of the garage.
If ¯v is the relative velocity, with magnitude v, we can apply the Lorentz transformation
relating the observations from the garage to those from the car
ˆv = γ(ˆu + vˆg) , ˆc = γ(ˆg + vˆu) , (8.19)
or the inverse transformations
ˆu = γ(ˆv − vˆc) , ˆg = γ(ˆc − vˆv) . (8.20)
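The last of these equations will be enough for us. Let ¯w be the vector from E1 to E2. From
figure 8.3 we have
¯c + ¯w = ˆg = γ(ˆc − vˆv) (8.21)
which, since ¯c is parallel to ˆc and ¯w is parallel to ˆv (E1 and E2 both lie on the worldline
of the front of the car), implies
¯c = γˆc , ¯w = −γvˆv . (8.22)
[Figure 8.3: Spacetime diagram for the problem of fitting the car into the garage. It shows the worldlines of the doors, the back wall, and the front and tail of the car, the events E0, E1 and E2, and the vectors ˆu, ˆv, ˆg and ¯c.]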
The length of the car is $\sqrt{\bar c^2} = \gamma > 1$, so the car is longer than the garage, consistent
with our assumptions.
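The coefficients in (8.22) can also be obtained by projecting ˆg onto the car's basis: because ˆc² = 1 and ˆv² = −1, the component of ˆg along ˆc is ˆg · ˆc and the component along ˆv is −ˆg · ˆv. A minimal sketch in garage-frame components (t, x), so that ˆu = (1, 0) and ˆg = (0, 1); the sample speed v = 0.6 and the function names are assumptions for illustration, with γ = 1/√(1 − v²) in units with c = 1:

```python
import math

def dot(a, b):
    """Minkowski product in the (t, x) plane, signature (-, +)."""
    return -a[0] * b[0] + a[1] * b[1]

def car_basis(v):
    """Four-velocity v_hat and unit length direction c_hat of the car, as in (8.19)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    v_hat = (gamma, gamma * v)  # gamma * (u_hat + v * g_hat)
    c_hat = (gamma * v, gamma)  # gamma * (g_hat + v * u_hat)
    return v_hat, c_hat

def decompose_garage(v):
    """Return (a, b) with g_hat = a * c_hat + b * v_hat."""
    g_hat = (0.0, 1.0)
    v_hat, c_hat = car_basis(v)
    a = dot(g_hat, c_hat)   # c_hat is space-like with square +1
    b = -dot(g_hat, v_hat)  # v_hat is time-like with square -1
    return a, b

v = 0.6
v_hat, c_hat = car_basis(v)
print("orthonormality:", dot(v_hat, v_hat), dot(c_hat, c_hat), dot(v_hat, c_hat))

a, b = decompose_garage(v)
print("c_bar = a * c_hat with a =", a)  # car length, should equal gamma
print("w_bar = b * v_hat with b =", b)  # should equal -gamma * v
```

With this choice of v the coefficients should come out close to γ = 1.25 and −γv = −0.75, so the car is a quarter longer than the garage in its own rest frame.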
Since ¯w is a negative multiple of ˆv, from the point of view of the car the front collides
with the back wall before the doors close, although these events are simultaneous from the
point of view of the garage. However, since both ˆg and ¯c are space-like, no signal from
the collision can reach the tail of the car before it passes the doors. Therefore no material
strength can stop the car from being compressed (from its point of view) and you can shut
the door behind it. You have to be quick though if the car is elastic and tends to regain
its original length. This example illustrates the fact that no perfectly rigid body exists.
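The ordering of the events can be made concrete by transforming their coordinates. The sketch below places E0 at the origin and E2 at (t, x) = (0, 1) in the garage frame, as in the text, and applies the standard Lorentz transformation to the car frame, in which the tail sits at x′ = 0 until it learns of the collision. The sample speed v = 0.6 and the function names are assumptions for illustration. It also computes when a light signal sent from the collision would reach the tail.

```python
import math

def to_car_frame(event, v):
    """Transform garage coordinates (t, x) to car coordinates (t', x'), with c = 1."""
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (t - v * x), gamma * (x - v * t))

def signal_arrival_at_tail(v):
    """Car-frame time at which light emitted at E2 reaches the tail at x' = 0."""
    t2, x2 = to_car_frame((0.0, 1.0), v)
    return t2 + x2  # light covers the distance x2 at unit speed

v = 0.6
E0 = to_car_frame((0.0, 0.0), v)  # tail passes the doors
E2 = to_car_frame((0.0, 1.0), v)  # front hits the back wall

print("E0 in car frame:", E0)
print("E2 in car frame:", E2)  # negative t': the collision comes first for the car
print("signal from E2 reaches the tail at t' =", signal_arrival_at_tail(v))
```

The arrival time γ(1 − v) is positive for any v < 1, so even the fastest possible signal reaches the tail only after it has passed the doors at t′ = 0.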
We have seen that it is indeed possible (though not recommended) to fit the car into
the garage by exploiting the Lorentz contraction. We have also seen that what appears at
first sight to be a paradox is just due to not analyzing the situation carefully enough, in
particular neglecting the issues due to the relativity of simultaneity. When the problem is
carefully analyzed there are of course no contradictions.
8.3 Rotating wheel
In a wheel at rest all constituent particles follow straight worldlines. When the wheel
starts spinning the worldlines become tilted and form helices in spacetime. Only the
center continues on a straight worldline.
Using the center as an observer and thinking of a ring of matter at a distance ρ from
it, we realize that such a ring must have circumference 2πρ whether the wheel is spinning
or not. In a sense the Lorentz contraction is prevented from taking place. Instead the
tilting of the worldlines has the effect that the orthogonal distance between them increases,
compensating the Lorentz contraction. However, this means that a deformation of the wheel
takes place.
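To make the compensation quantitative, one can compute the orthogonal distance between two neighbouring worldlines on the ring. The sketch below assumes a uniformly rotating ring with angular velocity ω, so that the rim speed is v = ωρ in units with c = 1, and works in coordinates (t, x, y) of the center with signature (−, +, +). The symbol ω, the sample values and the function names are assumptions, not taken from the notes.

```python
import math

def dot(a, b):
    """Minkowski product in (t, x, y), signature (-, +, +)."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def orthogonal_spacing(rho, omega, dphi):
    """Distance, orthogonal to the worldlines, between ring particles dphi apart."""
    v = omega * rho
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    # Particle at angle 0 at t = 0: it moves tangentially, along +y.
    four_velocity = (gamma, 0.0, gamma * v)
    # Neighbour at angle dphi at the same center time t = 0.
    delta = (0.0, rho * (math.cos(dphi) - 1.0), rho * math.sin(dphi))
    # Remove the part of delta along the four-velocity (whose square is -1).
    along = dot(delta, four_velocity)
    perp = tuple(d + along * u for d, u in zip(delta, four_velocity))
    return math.sqrt(dot(perp, perp))

rho, omega, dphi = 1.0, 0.6, 1e-4
gamma = 1.0 / math.sqrt(1.0 - (omega * rho) ** 2)
ratio = orthogonal_spacing(rho, omega, dphi) / (rho * dphi)
print("spacing / arc length:", ratio, " gamma:", gamma)
```

For small dφ the ratio should approach γ: the spacing between neighbouring particles, measured orthogonally to their worldlines, is stretched by exactly the factor needed for the ring to keep the circumference 2πρ in the center's frame.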
This is an inevitable deformation which is always connected with rotation. How much
energy is needed to produce the deformation depends on the stresses in the wheel. Therefore
the moment of inertia of a wheel must also depend on the stresses within it. This is an
important consequence of the theory of relativity though it plays little role in everyday
circumstances.