Gravity for Beginners∗
Keith Head†
February 5, 2003
Contents
1 The Basic Gravity Equation
1.1 Origins: Newton’s Apple
1.2 Economists Discover Gravity
1.3 Economic Explanations for Gravity
2 Estimation of the Gravity Equation
2.1 Economic Mass
2.2 Distance
2.3 Remoteness
3 “Augmenting” the Gravity Equation
3.1 Income per Capita
3.2 Adjacency
3.3 Common Language and Colonial Links
3.4 Border Effects
4 Evaluating Trade-Creating Policies
4.1 Free Trade Agreements
4.2 Monetary Agreements
∗ Original Version: October, 2000. This version prepared for UBC Econ 590a students, January 2003.
This is a work in progress and I welcome comments and suggestions. The most up-to-date version is
available at economics.ca/keith/gravity.pdf
† Faculty of Commerce, University of British Columbia, 2053 Main Mall, Vancouver, BC, V6T1Z2,
Canada. Tel: (604)822-8492, Fax: (604)822-8477, Email: keith.head@ubc.ca
1 The Basic Gravity Equation
The gravity equation is a popular formulation for statistical analyses of bilateral ﬂows
between diﬀerent geographical entities. In this paper, I provide an overview of the
development and use of this equation. I also include some practical tips for researchers
who want to use the equation in their own work.
1.1 Origins: Newton’s Apple
In 1687, Newton proposed the “Law of Universal Gravitation.” It held that the attractive
force between two objects i and j is given by
$$F_{ij} = G \frac{M_i M_j}{D_{ij}^{2}}, \tag{1}$$
where notation is deﬁned as follows
• Fij is the attractive force.
• Mi and Mj are the masses.
• Dij is the distance between the two objects.
• G is a gravitational constant depending on the units of measurement for mass and
force.
1.2 Economists Discover Gravity
In 1962, Jan Tinbergen proposed that roughly the same functional form could be applied
to international trade flows. It has since been applied to a whole range of
what we might call “social interactions” including migration, tourism, and foreign direct
investment. This general gravity law for social interaction may be expressed in roughly
the same notation:
$$F_{ij} = G \frac{M_i^{\alpha} M_j^{\beta}}{D_{ij}^{\theta}}, \tag{2}$$
where notation is deﬁned as follows
• Fij is the “flow” from origin i to destination j. Alternatively, let $\tilde{F}_{ij}$ represent the
total volume of interactions between i and j (i.e. the sum of the flows in both
directions: $\tilde{F}_{ij} = F_{ij} + F_{ji}$).
• Mi and Mj are the relevant economic sizes of the two locations.
– If F is measured as a monetary ﬂow (e.g. export values), then M is usually
the gross domestic product (GDP) or gross national income (GNI, formerly
GNP) of each location.
– For ﬂows of people, it is more natural to measure M with the populations.
• Dij is the distance between the locations (usually measured center to center).
Note that we return to Newton’s Law (equation 1) if α = β = 1 and θ = 2.
1.3 Economic Explanations for Gravity
The gravity equation can be thought of as a kind of short-hand representation of supply
and demand forces. If country i is the origin, then Mi represents the total amount it is
willing to supply to all customers. Meanwhile Mj represents the total amount destination
j demands. Distance acts as a sort of tax “wedge,” imposing trade costs, and resulting
in lower equilibrium trade ﬂows.
More recently (starting with Anderson, 1979) there have been several attempts to
derive the gravity equation formally. Here I sketch a derivation.
Let Mj be the amount of income country j spends on all goods from any source i.
Let sij be the share of Mj spent on goods from country i. Then Fij = sijMj. What do
we know about sij?
1. It must lie between 0 and 1.
2. It should increase if i produces a wide variety of goods (large ni) and/or goods
perceived to be of high quality (large µi).
3. It should decrease due to trade barriers such as distance, Dij.
In light of these arguments we suggest
$$s_{ij} = \frac{g(\mu_i, n_i, D_{ij})}{\sum_{\ell} g(\mu_\ell, n_\ell, D_{\ell j})},$$
where the g(·) function should be increasing in its ﬁrst two arguments and decreasing
in distance for all sij > 0.
To move forward, we need a speciﬁc form for g(). One approach (taken by Bergstrand)
uses the Dixit and Stiglitz model of monopolistic competition between diﬀerentiated but
symmetric ﬁrms. This model sets µi = 1 and makes ni proportional to Mi. A second
approach (due to Anderson) assumes a single good from each country, ni = 1, but allows
the preference parameter µi to vary across countries, subject to the constraint of market
clearing, in such a way that it is also proportional to the size of the economy, Mi.
Both let trade costs be a power function of distance. I prefer the monopolistic competition
approach because it seems more natural to endogenize the number of varieties,
ni, than to endogenize the preference parameter. Later we will see that there are some
empirical speciﬁcations that are valid under either approach.
Allowing both n and µ to vary across countries, let
$$g(n_i, \mu_i) = \sum_{v=1}^{n_i} (p_{ijv}/\mu_{ijv})^{1-\sigma},$$
where v indexes particular varieties that are substitutable with an elasticity of substitution
given by σ. If the goods from the same country are differentiated but of the
same average quality and subject to the same transport costs, then we can drop the v
subscripts and set $g(\cdot) = n_i (p_{ij}/\mu_{ij})^{1-\sigma}$.
The next step is to relate the delivered (quality-adjusted) price to the price in the
origin country and transportation costs between origin and destination. We assume the
following relationship:
$$p_{ij}/\mu_{ij} = (p_i/\mu_i) D_{ij}^{\delta}.$$
The origin price, pi, is often referred to as the free-on-board or fob price. We will
postpone a complete discussion of the justiﬁcation for this equation but note that it
allows for both the eﬀect of distance-based freight charges on the delivered price and for
distance eﬀects on perceived quality (due to mundane causes such as damage in transit
or, more speculatively, to culture-based biases that are correlated with distance).
In the basic gravity equation, we assume away price differences.¹ Note that this
is not quite as unrealistic as it at first seems: we require only that fob prices vary
proportionally to the quality of the exporting country’s products, i.e. that pi/µi ≈ k.
The number of varieties in each country ni is not something we can hope to observe
directly. Rather we take advantage of a property of the Dixit-Stiglitz model: namely, all
ﬁrms are the same size. In that case, ni = Mi/q where q is ﬁrm size. Imposing these last
assumptions, defining $\theta \equiv \delta(\sigma - 1) \ge 0$, we obtain
$g(\cdot) = M_i D_{ij}^{-\theta} / (q k^{\sigma-1})$. This implies
market shares for exporter i in country j of
$$s_{ij} = M_i D_{ij}^{-\theta} R_j,$$
where $R_j = 1 / \sum_{\ell} M_\ell D_{\ell j}^{-\theta}$. After substituting and rearranging we obtain a result that
is very close to what we sought:
$$F_{ij} = R_j \frac{M_i M_j}{D_{ij}^{\theta}}. \tag{3}$$
The main diﬀerence is that now the term Rj replaces the “gravitational constant,” G.
We will discuss the interpretation of that term in the next section.
Before that, note what happens in a “frictionless” world, i.e. one in which θ = 0. Then
$R_j = 1/\sum_{\ell} M_\ell = 1/M_w$ and $F^{*}_{ij} = M_i M_j / M_w$ (the w subscript stands for “world”).
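A minimal Python sketch of the market-share formula and the frictionless benchmark may help fix ideas. The three economies, their sizes, and the distance matrix (including the within-country distances on the diagonal) are made-up values used only for illustration, and `market_shares` is a hypothetical helper name rather than anything from the literature.

```python
def market_shares(M, D, theta):
    """Return s[(i, j)] = M_i * D_ij^(-theta) * R_j for every exporter i and importer j."""
    shares = {}
    for j in M:
        # R_j = 1 / sum over all sources l of M_l * D_lj^(-theta)
        R_j = 1.0 / sum(M[l] * D[l][j] ** (-theta) for l in M)
        for i in M:
            shares[(i, j)] = M[i] * D[i][j] ** (-theta) * R_j
    return shares

# Illustrative (assumed) economic sizes and distances; D[i][i] is an assumed internal distance.
M = {"A": 100.0, "B": 300.0, "C": 600.0}
D = {
    "A": {"A": 50.0, "B": 500.0, "C": 2000.0},
    "B": {"A": 500.0, "B": 80.0, "C": 1500.0},
    "C": {"A": 2000.0, "B": 1500.0, "C": 120.0},
}

s = market_shares(M, D, theta=1.0)
flows = {(i, j): s[(i, j)] * M[j] for (i, j) in s}  # F_ij = s_ij * M_j

# Frictionless benchmark: with theta = 0 every share collapses to M_i / M_w.
s_star = market_shares(M, D, theta=0.0)
M_w = sum(M.values())
```

By construction the shares in each destination sum to one, so the flows into j (including j's purchases from itself) add up to Mj.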
2 Estimation of the Gravity Equation
The multiplicative nature of the gravity equation means that we can take natural logs
and obtain a linear relationship between log trade ﬂows and the logged economy sizes
and distances:
$$\ln F_{ij} = \alpha \ln M_i + \beta \ln M_j - \theta \ln D_{ij} + \rho \ln R_j + \epsilon_{ij}. \tag{4}$$
The inclusion of the error term $\epsilon_{ij}$ delivers an equation that can be estimated by ordinary
least squares regression. If our derivations in the earlier section are correct, we would
expect to estimate α = β = ρ = 1.
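The log-linear regression can be sketched in plain Python as follows. Everything here is an illustrative assumption: the four economies, their sizes and distances are invented, the flows are generated without noise from the gravity form with a constant in place of Rj (so the ρ ln Rj term is folded into the intercept), and `ols` is a small hand-rolled normal-equations solver rather than a library routine. Because the data are noise-free, least squares should recover the parameters used to generate them.

```python
import math

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [x - f * z for x, z in zip(A[r], A[c])]
    return [A[r][k] / A[r][r] for r in range(k)]

# Assumed "true" parameters and made-up data, used only to generate flows.
G, ALPHA, BETA, THETA = 2.0, 0.9, 0.8, 1.1
M = {"A": 100.0, "B": 250.0, "C": 40.0, "D": 900.0}
DIST = {("A", "B"): 500.0, ("A", "C"): 1200.0, ("A", "D"): 3000.0,
        ("B", "C"): 800.0, ("B", "D"): 2500.0, ("C", "D"): 4000.0}

X, y = [], []
for i in M:
    for j in M:
        if i == j:
            continue
        d = DIST.get((i, j)) or DIST[(j, i)]
        flow = G * M[i] ** ALPHA * M[j] ** BETA / d ** THETA
        X.append([1.0, math.log(M[i]), math.log(M[j]), math.log(d)])
        y.append(math.log(flow))

const, alpha_hat, beta_hat, dist_coef = ols(X, y)
theta_hat = -dist_coef  # the regression coefficient on ln D is -theta
```

With real data one would add the error term and normally use a statistics package that also reports standard errors; the point of the sketch is only the mapping from the multiplicative equation to a linear regression.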
2.1 Economic Mass
The economic sizes of the exporting and importing countries, Mi and Mj, are usually
measured with gross domestic product. The estimated coefficients are usually close to
the predicted value of one. However, it is not unusual to obtain values ranging anywhere
between 0.7 and 1.1.
¹ Recently developed methods of analyzing bilateral trade do not require this assumption. See Feenstra
(Scottish Journal of Political Economy, 2002).
Note that the theory we used to derive the gravity equation predicts coeﬃcients of
one. Indeed, we lack an interpretation for coeﬃcients diﬀerent from one. There are
further problems with including ln Mi and ln Mj as regressors. First, they tend to
inflate the R² of the regressions since it is hard to imagine a world in which big countries
don’t trade more in absolute terms. Second, since exports and imports are part of GDP,
there is a built-in accounting relationship between Fij and Mi and Mj. Some studies
have tried to deal with this simultaneity by using instrumental variables for GNP (such
as population). A simpler solution is just to impose the theoretical prediction of unitary
elasticities. This implies that we pass the income terms over to the left hand side.
Subtracting ln Mi + ln Mj − ln Mw from both sides of (4), we obtain
$$\ln(F_{ij}/F^{*}_{ij}) = \ln M_w + \rho \ln R_j - \theta \ln D_{ij} + \epsilon_{ij}. \tag{5}$$
The dependent variable measures the deviation of actual trade ﬂows from the “frictionless”
ideal. The sum of the ﬁrst two terms on the right-hand side will be estimated as
the regression’s constant; that is, variation in Rj is shoved into the error term. There
are two test statistics that one can examine to see if the data statistically reject the
frictionless ideal. One is the t-stat on the constant. The other is the t-stat on $\hat{\theta}$.
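As a rough illustration of the constrained specification, the sketch below builds the dependent variable ln(Fij/F*ij) and regresses it on ln Dij alone. The economies, distances, and the constant `R` standing in for Rj are assumptions made up for this example, and the flows are generated without noise, so the slope should equal minus the assumed θ and the intercept should equal ln Mw + ln R. It does not compute the t-statistics.

```python
import math

# Made-up sizes and distances; flows follow the gravity form with a constant R and theta = 0.94.
THETA, R = 0.94, 0.002
M = {"A": 100.0, "B": 250.0, "C": 40.0, "D": 900.0}
DIST = {("A", "B"): 500.0, ("A", "C"): 1200.0, ("A", "D"): 3000.0,
        ("B", "C"): 800.0, ("B", "D"): 2500.0, ("C", "D"): 4000.0}
M_w = sum(M.values())

x, y = [], []
for (i, j), d in DIST.items():
    for origin, dest in ((i, j), (j, i)):
        flow = R * M[origin] * M[dest] / d ** THETA
        frictionless = M[origin] * M[dest] / M_w  # F*_ij
        x.append(math.log(d))
        y.append(math.log(flow / frictionless))

# Simple OLS with one regressor: slope = cov(x, y) / var(x).
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
slope = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
         / sum((a - x_bar) ** 2 for a in x))
intercept = y_bar - slope * x_bar
theta_hat = -slope
```

Imposing unit income elasticities leaves a single slope to estimate, which is why the closed-form one-regressor formula is enough here.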
2.2 Distance
Distance is almost always measured using the “great circle” formula. This formula
approximates the shape of the earth as a sphere and calculates the minimum distance
along the surface.
Tip: To calculate great circle distances you need the longitude and latitude of the
capital or “economic center” of each economy in the study. Then apply the following
formula to obtain the distance measure in miles:
$$D_{ij} = 3962.6 \arccos\big(\sin(Y_i)\sin(Y_j) + \cos(Y_i)\cos(Y_j)\cos(X_i - X_j)\big), \tag{6}$$
where X is longitude and Y is latitude, each converted from degrees to radians by dividing
by 57.3. Longitudes measured in degrees West (and latitudes in degrees South) enter with
a negative sign.
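The great circle calculation can be sketched in a few lines of Python. The function name and the sign convention (north and east positive, in decimal degrees) are choices made for this example; `math.radians` does the degree-to-radian conversion, and the city coordinates in the usage line are approximate values assumed for illustration.

```python
import math

EARTH_RADIUS_MILES = 3962.6  # radius used in equation (6)

def great_circle_miles(lat_i, lon_i, lat_j, lon_j):
    """Great circle distance in miles; inputs in signed decimal degrees (north and east positive)."""
    y_i, x_i, y_j, x_j = map(math.radians, (lat_i, lon_i, lat_j, lon_j))
    cos_angle = (math.sin(y_i) * math.sin(y_j)
                 + math.cos(y_i) * math.cos(y_j) * math.cos(x_i - x_j))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding just outside [-1, 1]
    return EARTH_RADIUS_MILES * math.acos(cos_angle)

# Approximate (assumed) coordinates: Lisbon 38.72 N, 9.14 W; Vienna 48.21 N, 16.37 E.
lisbon_vienna = great_circle_miles(38.72, -9.14, 48.21, 16.37)
```

With these coordinates the result should land near 1430 miles, though the exact figure depends on which point in each city is used.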
Even for air travel, great circle distances probably underestimate true distances since
they do not take into account that most ﬂights avoid the North Pole. For maritime
travel, they do not take into account indirect routes mandated by land and ice barriers.
In addition, many air and sea routes are shaped by economic considerations such as
“hub economies.” Furthermore international shipping cartels often set freight costs that
bear little relationship to distance travelled. Also, the costs of packaging, loading and
unloading, seem to be primarily ﬁxed costs that do not vary with distance. Taken
together, these considerations suggest that distance should matter very little for trade.
While we have many ex-ante reasons to expect little relationship between trade and
distance, the facts say that distance dramatically impedes trade. Together with Anne-Celia
Disdier of the University of Paris, I have been conducting a meta-analysis of
gravity equation distance estimates from 595 regressions reported in about 35 papers.
The samples ranged from 1928 to 1995. The trading partners were mainly nations,
though some results for the trade of Canada’s provinces were included as well. The
average distance effect turns out to be $\hat{\theta} = 0.94$. This means that a doubling of distance
will decrease trade by roughly one half, since $2^{-0.94} \approx 0.52$.
Leamer and Levinsohn’s (1994) survey of the empirical evidence on international
trade oﬀers the identiﬁcation of distance eﬀects on bilateral trade as one of the “clearest
and most robust empirical findings in economics.”²
They asked “Why don’t trade economists ‘admit’ the eﬀect of distance into their
thinking? One [answer] is that human beings are not disposed toward processing numbers,
and empirical results will remain unpersuasive if not accompanied by a graph.”
They showed Germany’s trade but I will stay closer to home, showing trade by Canadian
provinces and US states.
The graphical method is a scatterplot of F1j/Mj (exporter 1’s share of market j) on
the vertical axis against D1j on the horizontal axis. Both axes are shown in “log” scale.
Thus each space between tic marks raises the variable by some factor. For the vertical
axis the factor is 10 while it is 2 for the horizontal axis. A line through the means of the
data with a slope of -1 (in log scale) is also shown as a reference. It is often revealing
to contrast exporter 1’s performance with that of some comparable economy (i = 2).
The gap between the intercepts should be approximately equal to the relative sizes of
the two exporters, i.e. M1/M2. Alternatively, one might estimate (using ordinary least
squares, for instance) and graph lines that best ﬁt the data for each exporter.
Why does distance matter so much? Economists have offered six major explanations:
1. Distance is a proxy for transport costs. David Hummels has argued that shipping
costs (freight charges and marine insurance) can go a long way towards explaining
why distance matters.
2. Distance indicates the time elapsed during shipment. For perishable goods the
probability of surviving intact is a decreasing function of time in transit. Perishability
may be interpreted quite broadly to include the following risks:
(a) Damage or loss of the good due to weather or mishandling (e.g. ship sinks in
a storm).
(b) Decomposition and spoiling of organic materials (e.g. maggot infestation).
(c) Loss of the market (the intended purchaser becomes unwilling or unable to
make payment).
² They assert that the typical distance effect is 0.6.
Figure 1: Trade is Inversely Proportionate to Distance. [Two log-scale scatterplots. First panel: 1995 provincial exports / GDP of importer against distance (miles) for British Columbia (actual), BC (gravity prediction), Ontario (actual), and ON (predicted). Second panel: commodity flow share, 1997, against distance (miles) for Washington and California.]
3. Synchronization costs. When factories combine multiple inputs in the production
process, they need those inputs to arrive in time or bottlenecks emerge.
One possibility is to use warehouses to keep inventories of each input but this
approach suﬀers from various drawbacks (land costs, technological obsolescence,
fashion changes, and low pressures for quality control). Sourcing inputs from
nearby lowers synchronization costs.
4. Communication costs. According to Paul Krugman, distance “proxies for the
possibilities of personal contact between managers, customers, and so on; that
much business depends on the ability to exchange more information, of a less
formal kind, than can be sent over a wire.”
5. Transaction costs. Distance may also be correlated with the costs of searching
for trading opportunities and the establishment of trust between potential trading
partners.
6. “Cultural distance.” It may also be that greater geographic distances are correlated
with larger cultural differences. Cultural differences can impede trade in many
ways, such as by inhibiting communication and generating misunderstandings and
clashes in negotiation styles.
2.3 Remoteness
Until recently, most papers implicitly assumed that Rj is constant across countries and
therefore becomes the intercept in the regression equation. However, Rj is important
because it measures each importer’s set of alternatives. Countries with many nearby
sources of goods, i.e. those with low values of Rj, will import less from each particular
source.
A few studies have included variables like Rj and referred to them as “remoteness.”
However some of these measures diﬀer from the theoretically correct Rj in ways that
may be problematic. For instance, Helliwell (1998) measures remoteness as
$REM_j = \sum_{\ell} D_{\ell j}/M_\ell$. This measure causes remoteness to be very large if it includes distant (high
$D_{\ell j}$) but tiny (low $M_\ell$) countries. Since the previous literature usually finds θ ≈ 1, a
better measure of remoteness is $1/(\sum_{\ell} M_\ell / D_{\ell j})$. In this measure the size of very distant
countries becomes irrelevant.
The importance of remoteness in actual trade patterns can be illustrated by comparing
trade between Australia and New Zealand with trade between Austria and Portugal.
The distance between each pair’s major cities is approximately the same: Lisbon–Vienna
and Auckland–Canberra both happen to be 1430 miles apart. Furthermore, the products
of their GDPs are similar (Australia–New Zealand is 20% smaller). Hence, omitting
remoteness, the gravity equation would predict that Austria–Portugal trade would be
slightly larger. In fact, however, in 1993 Australia–New Zealand trade was nine times
greater than Austria–Portugal trade.
Tip: The remoteness measure includes Mi/Dii in its summation requiring us to specify
a country’s distance from itself, Dii. For reasons provided in Head and Mayer (2000), I
believe a good approximation for this “internal distance” is provided by the square root
of the country’s area multiplied by about 0.4.
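To see how the two remoteness measures behave, here is a small Python sketch comparing the distance-over-size sum with the inverse of the size-over-distance sum (the θ = 1 case), using 0.4 times the square root of area as the internal distance. The economies, sizes, areas, and distances are invented for illustration, and the function names are hypothetical.

```python
import math

def internal_distance(area):
    """Approximate a country's distance from itself as 0.4 * sqrt(area)."""
    return 0.4 * math.sqrt(area)

def remoteness_distance_over_size(j, M, D):
    """Sum over sources l of D_lj / M_l (the measure attributed to Helliwell in the text)."""
    return sum(D[l][j] / M[l] for l in M)

def remoteness_inverse(j, M, D):
    """1 / sum over sources l of (M_l / D_lj), i.e. the theta = 1 version of R_j."""
    return 1.0 / sum(M[l] / D[l][j] for l in M)

# Made-up world seen from importer "home": itself plus one nearby partner.
M = {"home": 100.0, "near": 200.0}
D = {"home": {"home": internal_distance(10_000.0)}, "near": {"home": 500.0}}

# Add a tiny, very distant economy and recompute both measures.
M_plus = dict(M, tiny=0.1)
D_plus = dict(D, tiny={"home": 10_000.0})

before_sum = remoteness_distance_over_size("home", M, D)
after_sum = remoteness_distance_over_size("home", M_plus, D_plus)
before_inv = remoteness_inverse("home", M, D)
after_inv = remoteness_inverse("home", M_plus, D_plus)
```

In this toy example the first measure explodes when the tiny distant economy is added, while the second barely moves, which is the sense in which the size of very distant countries becomes irrelevant.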
3 “Augmenting” the Gravity Equation
Gravity equations do a pretty good job at explaining trade with just the size of the
economies and their distances. However, there is a huge amount of variation in trade they
cannot explain. Most authors add a few other variables with less theoretical justiﬁcation,
usually because past experience has shown that they “work.” In the next subsections I
discuss the most commonly included of these variables.
3.1 Income per Capita
Many authors estimate gravity equations with the log of per-capita incomes (ln M/POP) of
the exporting and importing countries included as well as the log of aggregate incomes
(ln M).
The idea behind this appears to be that higher income countries trade more in
general. One cause might be superior transportation infrastructure (roads to the interior,
container ports, airports, etc.). High income countries probably have lower tariﬀs. A
countervailing eﬀect is that high income countries tend to be more service-oriented,
leading to lower trade in merchandise for a given level of GDP.
Estimated coeﬃcients on the log of per-capita GDP display considerable variation
across studies, ranging as low as 0.2 and as high as 1.
3.2 Adjacency
Adjacent, or contiguous, countries share a border. Many studies include a dummy
variable to identify such pairs.
The estimated coeﬃcient usually lies in the vicinity of 0.5, suggesting that trade is
about 65% higher as a result of sharing a border. It is not clear why adjacency should
matter if one is already controlling for distance. Perhaps center-to-center distance overstates
the eﬀective distance because neighboring countries often engage in large volumes
of border trade. Examples of this phenomenon include Windsor–Detroit, Tijuana–San
Diego, and Hong Kong–Shenzhen.
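Because the dependent variable is in logs, a dummy-variable coefficient translates into a proportional effect by exponentiating. As a worked check of the 65% figure, write γ for the adjacency coefficient (a symbol introduced here only for illustration) and hold the other regressors fixed:

```latex
\frac{F_{ij}^{\text{adjacent}}}{F_{ij}^{\text{non-adjacent}}} = e^{\gamma},
\qquad
e^{0.5} \approx 1.65 ,
```

so a coefficient of 0.5 corresponds to trade that is about 65% higher for adjacent pairs.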
3.3 Common Language and Colonial Links
Recall that one explanation for the trade impeding eﬀects of distance was transaction
costs caused by inability to communicate and cultural diﬀerences. If so, we would expect
that countries that speak the same language would trade more. The evidence strongly
confirms this proposition. Two countries that speak the same language will trade two
to three times as much as pairs that do not share a common language.
Part of the reason for this common language effect is probably the shared history that
caused the two countries to share a language. Indeed, measures of colonial links also are
positively correlated with trade. Including them as controls reduces the language eﬀect
somewhat but it remains quite strong.
3.4 Border Eﬀects
A recent literature initiated by John McCallum’s 1995 American Economic Review article
investigates whether national borders still matter for trade.
In The Borderless World, Kenichi Ohmae of McKinsey asserted
“National borders have eﬀectively disappeared and, along with them, the
economic logic that made them useful lines of demarcation in the ﬁrst place.”
McCallum’s examination of the trade patterns of Canadian provinces countered that
borders must matter very much because the typical Canadian province trades 20 times
more with other provinces than with American states of a given size and distance.
Perhaps the best way to see how this sort of calculation would arise is from considering
Ontario’s shipments to British Columbia and Washington state. The distances
involved are essentially the same but one case involves crossing a border and the other
does not.
If borders were irrelevant, the gravity equation would predict that exports to BC
should be 0.6 of exports to Washington because that is the ratio of the two economies’
sizes. However, BC actually receives 12.6 times more goods from Ontario than
does Washington. Thus the border eﬀect, deﬁned as the actual trade ratio divided by
the predicted trade ratio, is 12.6/0.6 = 21!
Since the Canada-US Free Trade Agreement was implemented, cross-border trade
has grown dramatically (around 60%) and border eﬀects have fallen to about 12 on
average for Canadian trade.
Border effects can also be calculated without the “intra-national” trade flows that
are only available for a few countries. This method, developed by Shang-Jin Wei, requires
estimates of each country’s distance to itself. Head and Mayer developed a way to measure
internal and external distances in a consistent manner and applied it to European
trade. They also found high border eﬀects.
Why do borders matter? One approach is to question the methods and the measurements.
Another approach is to accept the result and argue that it points to the great
importance of national institutions (legal, monetary, social) that promote trade. The
dust has not settled on this debate.
I believe that trade depends on networks of connected ﬁrms. These networks formed
over time when borders and distance imposed higher costs because both tariﬀs and
transport costs were higher. Members of networks focused on building local relationships.
These strong local ties generate trade. Thus I think border and distance eﬀects
are large for the same reasons.
4 Evaluating Trade-Creating Policies
Countries often enter into agreements with the intent of facilitating bilateral trade. Do such
agreements work?
4.1 Free Trade Agreements
Regional trade liberalizing agreements like Europe’s common market and North America’s
free trade agreements have proliferated in the last 20 years and one of the primary
uses of gravity equations has been to evaluate them.
On average FTAs seem to raise trade by around 50%. However, a recent study by
Frankel and Rose (National Bureau of Economic Research Working Paper 7857) ﬁnds
that FTAs lead to a tripling of trade between partners.
4.2 Monetary Agreements
Studies of how exchange rate volatility aﬀects trade have obtained mixed results. One
recent study, by Frankel and Rose, finds that countries that share a common currency,
such as the US and Panama, trade three times more with each other than one would
expect.
This effect is surprisingly large and perhaps implausible as a general rule. For instance,
I find it very unlikely that the adoption of the euro by 11 countries in Europe
will cause trade between them to triple!