PORTFOLIO THEORY – LECTURE NOTES 2
Dr. Andrea Rigamonti
FACTOR INVESTING WITH LONG-SHORT PORTFOLIOS
Computing optimal weights is not the only possibility. An alternative approach involves creating
long-short portfolios. Suppose we want to invest in N assets. First, assets are ranked according to
their predicted return. We then assemble a portfolio with two legs: a long leg which contains a given
number of assets with the highest predicted returns, and a short leg with a given number of assets
predicted to have the lowest returns. Within each leg the assets are often equally weighted,
although other weighting schemes that assign different weights based on the predicted return are
of course possible.
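The construction just described can be sketched in a few lines of Python. This is only a minimal illustration, assuming equal weights within each leg and a fully self-financing portfolio; the function name `long_short_weights` and the predicted returns are hypothetical and chosen for the example.

```python
import numpy as np

def long_short_weights(predicted_returns, n_long, n_short):
    """Equal-weighted long-short weights from a vector of predicted returns.

    Assumption for this sketch: the long leg sums to +1 and the short leg
    to -1, so the portfolio is fully self-financing.
    """
    mu = np.asarray(predicted_returns, dtype=float)
    if n_long < 1 or n_short < 1:
        raise ValueError("each leg needs at least one asset")
    if n_long + n_short > mu.size:
        raise ValueError("the two legs would overlap")
    order = np.argsort(mu, kind="stable")      # ascending: lowest predictions first
    weights = np.zeros(mu.size)
    weights[order[-n_long:]] = 1.0 / n_long    # long leg: highest predicted returns
    weights[order[:n_short]] = -1.0 / n_short  # short leg: lowest predicted returns
    return weights

# Hypothetical predicted returns for N = 6 assets
predicted = [0.02, -0.01, 0.05, 0.00, 0.03, -0.02]
w = long_short_weights(predicted, n_long=2, n_short=2)
print(w)
```

With these inputs the two assets with the highest predictions (0.05 and 0.03) each receive a weight of +0.5, the two with the lowest (-0.02 and -0.01) each receive -0.5, and the remaining assets are left out.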
One of the advantages of such an approach is that it allows the investor to consider predicted
returns without suffering the full consequences of the estimation error in the mean. Optimization
procedures like the mean-variance one are in fact error-maximizing: errors in the inputs lead to
extreme weights, which can lead to abysmal performance. This is why standard mean-variance
optimization rarely works well and is generally replaced either with more advanced techniques that
limit extreme allocations, or with a minimum variance portfolio. A long-short portfolio does not
have theoretically optimal weights, but it can still work better by avoiding this error-maximization
trap.
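A small numerical sketch can make the error-maximization effect concrete. It assumes two highly correlated assets with equal volatility and uses weights proportional to the inverse covariance matrix times the mean vector, rescaled to sum to one (the tangency portfolio when the means are read as excess returns). All numbers and the name `tangency_weights` are hypothetical.

```python
import numpy as np

def tangency_weights(mu, cov):
    """Weights proportional to inv(cov) @ mu, rescaled to sum to one."""
    raw = np.linalg.solve(cov, mu)
    return raw / raw.sum()

# Assumed inputs: 20% volatility for both assets, correlation 0.95
vol, rho = 0.20, 0.95
cov = vol**2 * np.array([[1.0, rho], [rho, 1.0]])

# Identical means give an even split
w_base = tangency_weights(np.array([0.08, 0.08]), cov)

# Shift each mean by one percentage point in opposite directions
w_shift = tangency_weights(np.array([0.09, 0.07]), cov)

print(w_base)   # approximately [0.5, 0.5]
print(w_shift)  # approximately [2.9375, -1.9375]
```

A one-percentage-point error in each mean moves the allocation from 50/50 to roughly 294% long in one asset and 194% short in the other, which is the kind of extreme weight the text refers to.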
The other advantage is that a long-short portfolio can be self-financing: the money obtained from
shorting the assets predicted to perform poorly is used to go long on the assets with predicted high
return. As short positions tend to be riskier than long positions, using a partially self-financing
portfolio is also common. In this case, less than half (e.g., 30%) of the wealth is placed on the short
leg, and the long leg is financed partly from the shorting and partly from the investor’s initial wealth
(in our example, where 70% of the invested sum goes to the long leg, 30% of the money comes from
the shorting and the other 40% from the investor’s funds).
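The 70/30 example can be written out as portfolio weights. The sketch below assumes equal weighting within each leg and expresses all weights as fractions of the total invested sum; the function name, the `short_share` parameter and the predicted returns are hypothetical.

```python
import numpy as np

def partially_financed_weights(predicted_returns, n_long, n_short, short_share=0.3):
    """Long-short weights where `short_share` of the invested sum is shorted.

    Assumption for this sketch: the long leg receives 1 - short_share,
    and assets are equally weighted within each leg.
    """
    mu = np.asarray(predicted_returns, dtype=float)
    if n_long < 1 or n_short < 1 or n_long + n_short > mu.size:
        raise ValueError("invalid leg sizes")
    order = np.argsort(mu, kind="stable")
    weights = np.zeros(mu.size)
    weights[order[-n_long:]] = (1.0 - short_share) / n_long
    weights[order[:n_short]] = -short_share / n_short
    return weights

predicted = [0.02, -0.01, 0.05, 0.00, 0.03, -0.02]   # hypothetical predictions
w = partially_financed_weights(predicted, n_long=2, n_short=2, short_share=0.3)

long_total = w[w > 0].sum()    # share of the invested sum in the long leg
short_total = -w[w < 0].sum()  # share raised by shorting
own_funds = w.sum()            # share that must come from the investor
print(long_total, short_total, own_funds)
```

Here the long leg is 0.7 of the invested sum, the shorting provides 0.3, and the net position of 0.4 is what the investor has to fund, matching the figures in the example.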
A practical disadvantage of such a portfolio is that in the real world it is generally difficult to short a
large number of stocks. So in practice N has to be relatively small, and the portfolio can therefore be riskier,
as it is not very well diversified. A long-short portfolio can also be particularly vulnerable in turbulent
market conditions, when the prices of virtually all stocks are either increasing or decreasing at the
same time. The latter problem can be eased by making the portfolio construction more flexible (e.g.,
by varying the number of stocks and/or the amount of wealth in the long and short leg depending
on the market conditions).
Obviously, a pre-condition for creating a long-short portfolio is having a ranking based on how we
expect the assets to perform. How can we obtain it? Using the sample means is not appropriate, as
we pointed out that such estimates are too unreliable. A much better alternative is to use a factor-based
approach. In fact, long-short portfolios are a typical way in which factor investing is performed. This
generally involves computing expected returns using multifactor models.
Remember that the formula of a multifactor model with 𝑘 factors is:
$$R_i = \alpha_i + b_{i1} f_1 + b_{i2} f_2 + \cdots + b_{ik} f_k + \varepsilon_i$$
In practice, the expected return of a stock given a certain multifactor model is computed as:
$$E[R_i] = \alpha_i + b_{i1} \gamma_1 + b_{i2} \gamma_2 + \cdots + b_{ik} \gamma_k$$
where 𝛾 is the factor risk premium.¹
Therefore, we need to estimate the loadings 𝑏𝑖𝑘 and the risk premia. This is typically done using the
Fama-MacBeth regression. It is a two-stage linear regression. Consider an estimation sample with
𝑁 assets and 𝑇 periods.
In the first stage, the loadings are estimated by regressing the returns of each asset 𝑖 on the 𝑘
factors, using the entire set of 𝑇 periods:
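$$
\begin{aligned}
R_{1t} &= \alpha_1 + b_{11} f_{1t} + b_{12} f_{2t} + \cdots + b_{1k} f_{kt} \\
R_{2t} &= \alpha_2 + b_{21} f_{1t} + b_{22} f_{2t} + \cdots + b_{2k} f_{kt} \\
&\;\;\vdots \\
R_{it} &= \alpha_i + b_{i1} f_{1t} + b_{i2} f_{2t} + \cdots + b_{ik} f_{kt} \\
&\;\;\vdots \\
R_{Nt} &= \alpha_N + b_{N1} f_{1t} + b_{N2} f_{2t} + \cdots + b_{Nk} f_{kt}
\end{aligned}
$$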
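The first stage amounts to N ordinary least-squares regressions that all share the same regressors. A minimal sketch with NumPy follows; the function name `first_stage_loadings`, the array layout (periods in rows) and the simulated data are assumptions made for illustration only.

```python
import numpy as np

def first_stage_loadings(returns, factors):
    """Time-series OLS of each asset's returns on the k factors.

    returns: array of shape (T, N); factors: array of shape (T, k).
    Gives back the intercepts (N,) and the loadings (N, k).
    """
    returns = np.asarray(returns, dtype=float)
    factors = np.asarray(factors, dtype=float)
    T = returns.shape[0]
    X = np.column_stack([np.ones(T), factors])           # constant + factors
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)   # shape (k + 1, N)
    return coef[0], coef[1:].T

# Simulated data, purely illustrative
rng = np.random.default_rng(0)
T, N, k = 120, 10, 3
factors = rng.normal(0.005, 0.04, size=(T, k))
true_b = rng.uniform(0.5, 1.5, size=(N, k))
returns = factors @ true_b.T + rng.normal(0.0, 0.01, size=(T, N))

alphas, b_hat = first_stage_loadings(returns, factors)
print(b_hat.shape)  # one row of k loadings per asset
```

Because every asset is regressed on the same factor matrix, a single least-squares call with N right-hand sides gives the same coefficients as running the N regressions one by one.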
The estimated loadings are then used as explanatory variables in a second regression that, for each
period 𝑡, regresses the returns of the entire set of 𝑁 assets on the estimated loadings:
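$$
\begin{aligned}
R_{i1} &= \gamma_{10} + \gamma_{11} \hat{b}_{i1} + \gamma_{12} \hat{b}_{i2} + \cdots + \gamma_{1k} \hat{b}_{ik} \\
R_{i2} &= \gamma_{20} + \gamma_{21} \hat{b}_{i1} + \gamma_{22} \hat{b}_{i2} + \cdots + \gamma_{2k} \hat{b}_{ik} \\
&\;\;\vdots \\
R_{it} &= \gamma_{t0} + \gamma_{t1} \hat{b}_{i1} + \gamma_{t2} \hat{b}_{i2} + \cdots + \gamma_{tk} \hat{b}_{ik} \\
&\;\;\vdots \\
R_{iT} &= \gamma_{T0} + \gamma_{T1} \hat{b}_{i1} + \gamma_{T2} \hat{b}_{i2} + \cdots + \gamma_{Tk} \hat{b}_{ik}
\end{aligned}
$$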
Ideally we should use the true loadings, but their value is of course unknown in practice.
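The second stage can be sketched in the same way: for each period, the cross-section of N returns is regressed on a constant and the estimated loadings. The function name `second_stage_premia` and the simulated inputs below are hypothetical.

```python
import numpy as np

def second_stage_premia(returns, b_hat):
    """Cross-sectional OLS of the N asset returns on the loadings, period by period.

    returns: array of shape (T, N); b_hat: array of shape (N, k).
    Gives back an array of shape (T, k + 1) whose row t holds
    gamma_t0, gamma_t1, ..., gamma_tk.
    """
    returns = np.asarray(returns, dtype=float)
    b_hat = np.asarray(b_hat, dtype=float)
    N = b_hat.shape[0]
    X = np.column_stack([np.ones(N), b_hat])               # constant + loadings
    coef, *_ = np.linalg.lstsq(X, returns.T, rcond=None)   # shape (k + 1, T)
    return coef.T

# Simulated check: returns built exactly from known period-by-period premia
rng = np.random.default_rng(1)
N, T, k = 25, 60, 3
b_hat = rng.uniform(0.0, 2.0, size=(N, k))
true_gammas = rng.normal(0.004, 0.02, size=(T, k + 1))
returns = true_gammas[:, [0]] + true_gammas[:, 1:] @ b_hat.T

gammas = second_stage_premia(returns, b_hat)
print(gammas.shape)  # T rows, one intercept plus k premia per row
```

The regressors are the same in every period, so one least-squares call with T right-hand sides is numerically equivalent to T separate cross-sectional regressions.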
To compute the expected returns of each asset 𝑖 we need the loadings, estimated in the first
regression, and the risk premia, estimated in the second regression. Notice however that the risk
premia are time-varying. A common approach is to compute their average value over the 𝑇 periods
(just like it is common to compute the average market excess return when using the CAPM). The
expected return of asset 𝑖 according to the chosen multifactor model is given by (we omit the ^ to
keep the notation light):
$$E[R_i] = b_{i1} \gamma_1 + b_{i2} \gamma_2 + \cdots + b_{ik} \gamma_k$$
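As a small worked example of this last step, the sketch below averages the period-by-period premia and combines them with the loadings, following the formula above (no intercept term). The numbers and the name `expected_returns` are made up for illustration.

```python
import numpy as np

def expected_returns(b_hat, gammas):
    """Expected returns from loadings and time-averaged risk premia.

    b_hat: array of shape (N, k); gammas: array of shape (T, k + 1) whose
    first column is the second-stage intercept, which is not used here.
    """
    avg_premia = np.asarray(gammas, dtype=float)[:, 1:].mean(axis=0)
    return np.asarray(b_hat, dtype=float) @ avg_premia

# Two assets, two factors, two periods (hypothetical values)
b_hat = np.array([[1.0, 0.5],
                  [0.8, -0.2]])
gammas = np.array([[0.0, 0.01, 0.002],
                   [0.0, 0.03, 0.004]])

exp_ret = expected_returns(b_hat, gammas)
print(exp_ret)
```

The average premia are 0.02 and 0.003, so the first asset has an expected return of 1.0 × 0.02 + 0.5 × 0.003 = 0.0215 and the second 0.8 × 0.02 − 0.2 × 0.003 = 0.0154.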
For greater clarity, let us consider how this works with the Fama-French three-factor model, which
is probably the most important factor model. Recall that the model is:
$$R_i = R_f + b_{i1} (R_m - R_f) + b_{i2} \, \mathit{SMB} + b_{i3} \, \mathit{HML}$$
In practice the expected return of asset 𝑖 will be computed as:
$$E[R_i] = R_f + b_{i1} \gamma_{(R_m - R_f)} + b_{i2} \gamma_{\mathit{SMB}} + b_{i3} \gamma_{\mathit{HML}}$$
We use the Fama-MacBeth regression to estimate the loadings and the risk premia. Usually, the
excess return is used as dependent variable, to focus on the component of the return that is
dependent on factor exposure. Therefore, the first stage regression for each asset 𝑖 is:
$$R_{it} - R_{ft} = \alpha_i + b_{i1} (R_{mt} - R_{ft}) + b_{i2} \, \mathit{SMB}_t + b_{i3} \, \mathit{HML}_t$$
¹ In the CAPM, and in single-factor models in general, we can directly use the factor value (the excess market return in the case of the CAPM). In multifactor models we cannot do this, and we need to use the risk premia of the factors instead.
To simplify the notation, we indicate the first factor as 𝑀𝐾𝑇:
$$R_{it} - R_{ft} = \alpha_i + b_{i1} \, \mathit{MKT}_t + b_{i2} \, \mathit{SMB}_t + b_{i3} \, \mathit{HML}_t$$
As explained before, this regression needs to be carried out separately for each of the 𝑁 assets.
Now that we have the estimates for the loadings, we can set up the second stage regression:
$$R_{it} - R_{ft} = \gamma_{t0} + \gamma_{t1} \hat{b}_{i1} + \gamma_{t2} \hat{b}_{i2} + \gamma_{t3} \hat{b}_{i3}$$
This regression needs to be carried out separately for each of the 𝑇 periods in the estimation
window, obtaining 𝑇 values for 𝛾𝑡1, 𝛾𝑡2 and 𝛾𝑡3. We then compute their average in order to have a
single value. We rename the averages of 𝛾𝑡1, 𝛾𝑡2 and 𝛾𝑡3 as 𝛾𝑀𝐾𝑇, 𝛾𝑆𝑀𝐵 and 𝛾𝐻𝑀𝐿 respectively, for
better clarity. We also compute the average risk-free rate in order to have a single value for 𝑅𝑓.²
We can now compute the expected return of each asset 𝑖 as:
$$E[R_i] = R_f + b_{i1} \gamma_{\mathit{MKT}} + b_{i2} \gamma_{\mathit{SMB}} + b_{i3} \gamma_{\mathit{HML}}$$
It is now straightforward to create the long-short portfolio. We simply rank the assets according to
their expected return, and take a long position in those in the upper part of the ranking
and a short position in those in the lower part of the ranking.
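Putting the pieces together, the whole procedure for the three-factor model might be sketched as follows. The data are simulated (no real Fama-French factor series are used), and the function name, the leg size of five assets and the equal weighting are assumptions made for this illustration.

```python
import numpy as np

def fama_macbeth_expected_returns(excess_returns, factors, rf):
    """Two-stage estimate of expected returns for a three-factor model.

    excess_returns: (T, N) asset returns minus the risk-free rate.
    factors: (T, 3) with columns MKT, SMB, HML.  rf: (T,) risk-free rate.
    """
    T, N = excess_returns.shape
    # First stage: time-series regression of each asset on the factors
    X1 = np.column_stack([np.ones(T), factors])
    coef1, *_ = np.linalg.lstsq(X1, excess_returns, rcond=None)
    b_hat = coef1[1:].T                              # loadings, shape (N, 3)
    # Second stage: cross-sectional regression on the loadings, each period
    X2 = np.column_stack([np.ones(N), b_hat])
    coef2, *_ = np.linalg.lstsq(X2, excess_returns.T, rcond=None)
    premia = coef2[1:].mean(axis=1)                  # average premia, shape (3,)
    # Expected return: average risk-free rate plus loadings times premia
    return rf.mean() + b_hat @ premia, b_hat, premia

# Simulated inputs, purely illustrative
rng = np.random.default_rng(42)
T, N = 120, 20
factors = rng.normal([0.006, 0.002, 0.003], [0.045, 0.03, 0.03], size=(T, 3))
rf = np.full(T, 0.002)
true_b = np.column_stack([rng.uniform(0.6, 1.4, N),
                          rng.uniform(-0.5, 1.0, N),
                          rng.uniform(-0.5, 1.0, N)])
excess = factors @ true_b.T + rng.normal(0.0, 0.02, size=(T, N))

exp_ret, b_hat, premia = fama_macbeth_expected_returns(excess, factors, rf)

# Rank the assets and form equal-weighted legs of five assets each
n_leg = 5
ranking = np.argsort(exp_ret)          # ascending expected return
short_leg, long_leg = ranking[:n_leg], ranking[-n_leg:]
weights = np.zeros(N)
weights[long_leg] = 1.0 / n_leg
weights[short_leg] = -1.0 / n_leg
print(long_leg, short_leg)
```

Since the same average risk-free rate is added to every asset, it does not affect the ranking; it only matters if the level of the expected return is needed.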
Each time the portfolio has to be updated, we need to compute new estimates of the expected
returns. So, for example, if we want to update the portfolio monthly, we need to repeat the
procedure every month, using the up-to-date data.
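The periodic update can be sketched as a loop that re-estimates everything at each rebalancing date. The version below assumes a fixed-length rolling estimation window, which is only one possible choice (an expanding window would also use up-to-date data); the function names, the window length and the simulated data are hypothetical.

```python
import numpy as np

def fm_expected_excess(excess, factors):
    """Fama-MacBeth expected excess returns on one estimation window."""
    T, N = excess.shape
    X1 = np.column_stack([np.ones(T), factors])
    b = np.linalg.lstsq(X1, excess, rcond=None)[0][1:].T     # loadings (N, k)
    X2 = np.column_stack([np.ones(N), b])
    g = np.linalg.lstsq(X2, excess.T, rcond=None)[0][1:]     # premia (k, T)
    return b @ g.mean(axis=1)

def rolling_long_short(excess, factors, window, n_leg):
    """One equal-weighted long-short weight vector per rebalancing date."""
    T, N = excess.shape
    path = []
    for end in range(window, T + 1):
        # Re-estimate using only the most recent `window` periods
        mu = fm_expected_excess(excess[end - window:end], factors[end - window:end])
        order = np.argsort(mu)
        w = np.zeros(N)
        w[order[-n_leg:]] = 1.0 / n_leg
        w[order[:n_leg]] = -1.0 / n_leg
        path.append(w)
    return np.array(path)

# Simulated monthly data, purely illustrative
rng = np.random.default_rng(7)
T, N, k = 72, 12, 3
factors = rng.normal(0.004, 0.04, size=(T, k))
betas = rng.uniform(0.0, 1.5, size=(N, k))
excess = factors @ betas.T + rng.normal(0.0, 0.02, size=(T, N))

weights_path = rolling_long_short(excess, factors, window=60, n_leg=3)
print(weights_path.shape)  # (number of rebalancing dates, N)
```

The ranking here uses expected excess returns; adding the same average risk-free rate to every asset would leave the ranking, and hence the legs, unchanged.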
² Computing the average value of the risk premia and of the risk-free rate is a reasonable approach, and the one commonly used. However, it is not “the” right approach. Other approaches might also be appropriate depending on the specific situation.