PORTFOLIO THEORY – LECTURE NOTES 1
Dr. Andrea Rigamonti
MEAN-VARIANCE OPTIMIZATION
From an economist’s point of view, an investor who optimizes a portfolio is trying to maximize a
utility function. An extremely simple utility function is the linear utility function:
𝑈(𝑉) = 𝑎 + 𝑏𝑉, 𝑏 > 0
where 𝑈(𝑉) is the utility that the investor gets depending on the value 𝑉 of the portfolio. This
function simply says that the higher the wealth, the higher the utility of the investor. Its shape is a
line, and the parameter 𝑏 determines how much the utility increases following a wealth increase. 𝑏
is assumed to be positive, otherwise the investor would be indifferent (𝑏 = 0) or less satisfied (𝑏 < 0)
when wealth increases. In other words, only the mean return of the portfolio matters.
Markowitz (1952) revolutionized the field by adding risk to the equation. His standard approach
assumes that an investor cares not only about the mean but also the variance of the portfolio
returns, i.e. the investor has a mean-variance utility. Given a certain mean return, the utility
increases as the variance (which quantifies risk) gets lower. Equivalently, given a certain level of
variance, the utility increases with a higher mean return. The decision in the trade-off between
return and risk is quantified by a risk aversion parameter 𝛾. A higher 𝛾 means that the investor is
more risk averse and will therefore require a higher compensation for an increased risk. A lower 𝛾
means the investor has lower risk aversion and will be willing to take more risk. In other words,
given a set of assets with certain means and variances, the lower the investor's 𝛾, the higher both
the mean return and the variance of the optimal portfolio.
To model such preferences, a quadratic utility function is used:
$$U(V) = V - \frac{\gamma}{2} V^2, \quad \gamma > 0$$
𝛾 is assumed to be positive so that the utility function is concave:
[Figure: plot of a concave utility function. Source: https://financestu.com]
This implies that the investor is risk-averse. With 𝛾 = 0 the investor would be indifferent to risk,
while 𝛾 < 0 would mean that the investor is risk-seeking (i.e., prefers more risk for the same
expected wealth, which is not realistic).
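The link between the sign of 𝛾 and concavity can be checked directly by differentiating the quadratic utility twice. This short derivation is an addition to the notes, using only the function defined above:

```latex
\begin{aligned}
U(V)   &= V - \frac{\gamma}{2} V^2 \\
U'(V)  &= 1 - \gamma V \\
U''(V) &= -\gamma
\end{aligned}
```

With 𝛾 > 0 the second derivative is negative everywhere, so the function is concave. Note also that the first derivative is positive only for 𝑉 < 1/𝛾, so the quadratic utility is increasing in wealth only on that range.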
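Remember that the expected return of a portfolio is given by $\mu_P = \boldsymbol{w}'\boldsymbol{\mu} = V$, where 𝒘 is the vector
of portfolio weights and 𝝁 is the vector of mean returns of the single assets. Moreover, recall that
the variance is the expected value of the squared deviation from the mean, so the variance of the
portfolio return is $\boldsymbol{w}'\boldsymbol{\Sigma}\boldsymbol{w}$, where 𝜮 is the covariance matrix of the asset returns.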
So, the utility function becomes:
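$$U(\boldsymbol{w}) = \boldsymbol{w}'\boldsymbol{\mu} - \frac{\gamma}{2}\,\boldsymbol{w}'\boldsymbol{\Sigma}\boldsymbol{w}$$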
Therefore, given a risk-free asset and a set of 𝑁 risky assets with mean returns 𝝁 and covariance
matrix 𝜮, and a certain risk aversion coefficient 𝛾, the investor we are considering wants to select
the weights 𝒘 in a way that maximizes the following utility function:
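$$\max_{\boldsymbol{w}} \; \boldsymbol{w}'\boldsymbol{\mu} - \frac{\gamma}{2}\,\boldsymbol{w}'\boldsymbol{\Sigma}\boldsymbol{w}$$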
This is an unconstrained optimization problem that is easy to solve. We just need to set the first-order
condition, i.e., take the partial derivative with respect to 𝒘 and set it equal to zero:
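$$\frac{\partial U(\boldsymbol{w})}{\partial \boldsymbol{w}} = \boldsymbol{\mu} - \frac{2\gamma}{2}\,\boldsymbol{\Sigma}\boldsymbol{w} = \boldsymbol{\mu} - \gamma\boldsymbol{\Sigma}\boldsymbol{w} = \boldsymbol{0}$$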
We then solve for 𝒘:
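$$\boldsymbol{\Sigma}\boldsymbol{w} = \frac{1}{\gamma}\,\boldsymbol{\mu}$$
$$\boldsymbol{w} = \frac{1}{\gamma}\,\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}$$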
So, the closed-form solution that gives the optimal weights for the risky assets is:
$$\boldsymbol{w}_{mv} = \frac{1}{\gamma}\,\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}$$
while the weight for the risk-free asset is equal to $1 - \boldsymbol{w}_{mv}'\boldsymbol{1}$, where 𝟏 is a vector of ones with
length equal to the number of risky assets.
The resulting optimal expected utility is:
$$U(\boldsymbol{w}_{mv}) = \frac{1}{2\gamma}\,\boldsymbol{\mu}'\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}$$
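As a numerical illustration of the closed-form solution, the following is a minimal sketch in Python with NumPy. The values of `mu`, `Sigma` and `gamma` are made up for the example and are not taken from the notes; `mu` is assumed to contain mean excess returns.

```python
import numpy as np

# Illustrative inputs (made-up numbers, not from the notes)
mu = np.array([0.05, 0.07, 0.10])          # mean excess returns of 3 risky assets
Sigma = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.090, 0.020],
    [0.010, 0.020, 0.160],
])                                          # covariance matrix
gamma = 3.0                                 # risk aversion coefficient (assumed)

# w_mv = (1/gamma) * Sigma^{-1} mu, obtained by solving Sigma x = mu
w_mv = np.linalg.solve(Sigma, mu) / gamma

# Whatever is not placed on the risky assets goes to the risk-free asset
w_rf = 1.0 - w_mv.sum()

# Utility evaluated at the optimum, and the closed-form expression for it
utility = w_mv @ mu - 0.5 * gamma * (w_mv @ Sigma @ w_mv)
utility_closed_form = (mu @ np.linalg.solve(Sigma, mu)) / (2.0 * gamma)

print("risky weights:", w_mv)
print("risk-free weight:", w_rf)
print("utility:", utility, "closed form:", utility_closed_form)
```

Solving the linear system with `np.linalg.solve` instead of explicitly inverting `Sigma` is a common numerical choice; both give the same weights up to rounding.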
Providing a specific value for 𝛾 might be difficult in practice. Moreover, it can be difficult to interpret
the meaning of the specific utility value associated with a certain portfolio. While we can say that a
portfolio with a certain utility is preferable to another with a lower utility, it is not obvious how good
it is in a more general sense. In short, the value itself, in the case of utility, is not very informative.
A more intuitive approach is to specify the preferences through a desired mean portfolio return 𝑅𝑒
instead of a level of risk aversion. In this case, the goal becomes to minimize the variance given the
desired mean return. This is a constrained optimization problem:
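$$\min_{\boldsymbol{w}} \; \boldsymbol{w}'\boldsymbol{\Sigma}\boldsymbol{w}$$
subject to:
$$\boldsymbol{w}'\boldsymbol{\mu} + (1 - \boldsymbol{w}'\boldsymbol{1})R_f = R_e$$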
In the constraint, 𝒘′𝝁 is the return of the risky assets, and (1 − 𝒘′𝟏)𝑅𝑓 is the return of the risk-free
asset. Together they give the return of the portfolio, which, as stated, must be equal to 𝑅𝑒.
To solve this problem, we use the method of Lagrange multipliers.
First we need to define the Lagrangian function, i.e., a modified version of the objective function
that incorporates the constraint in this way:
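$$L(\boldsymbol{w}, \lambda) = \boldsymbol{w}'\boldsymbol{\Sigma}\boldsymbol{w} + \lambda\left[R_e - \boldsymbol{w}'\boldsymbol{\mu} - (1 - \boldsymbol{w}'\boldsymbol{1})R_f\right]$$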
where 𝜆 is the Lagrange multiplier.
By including this additional term we can now solve an unconstrained problem instead of a
constrained one. Therefore, we set the first order conditions for the Lagrangian function. The
conditions involve two simultaneous equations, as we have to compute the partial derivative both
with respect to 𝒘 and to 𝜆.
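$$\frac{\partial L}{\partial \boldsymbol{w}} = 2\boldsymbol{\Sigma}\boldsymbol{w} - \lambda\boldsymbol{\mu} + \lambda R_f \boldsymbol{1} = \boldsymbol{0}$$
$$\frac{\partial L}{\partial \lambda} = R_e - \boldsymbol{w}'\boldsymbol{\mu} - (1 - \boldsymbol{w}'\boldsymbol{1})R_f = 0$$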
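Solving this system for 𝒘 requires some algebra, but it gives a closed-form solution.
Specifically, the optimal portfolio weights in this case are:
$$\boldsymbol{w}_{mv} = \frac{R_e}{\boldsymbol{\mu}'\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}}\,\boldsymbol{\Sigma}^{-1}\boldsymbol{\mu}$$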
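The algebra skipped above can be sketched as follows. This derivation is an addition to the notes; it uses only the two first-order conditions of the Lagrangian and keeps the risk-free rate explicit:

```latex
\begin{aligned}
&\text{From } \partial L / \partial \boldsymbol{w} = \boldsymbol{0}: \quad
  \boldsymbol{w} = \frac{\lambda}{2}\,\boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu} - R_f \boldsymbol{1}) \\
&\text{From } \partial L / \partial \lambda = 0: \quad
  \boldsymbol{w}'(\boldsymbol{\mu} - R_f \boldsymbol{1}) = R_e - R_f \\
&\text{Substituting the first into the second:} \quad
  \frac{\lambda}{2} = \frac{R_e - R_f}{(\boldsymbol{\mu} - R_f \boldsymbol{1})'\boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu} - R_f \boldsymbol{1})} \\
&\text{Hence:} \quad
  \boldsymbol{w} = \frac{R_e - R_f}{(\boldsymbol{\mu} - R_f \boldsymbol{1})'\boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu} - R_f \boldsymbol{1})}\,
  \boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu} - R_f \boldsymbol{1})
\end{aligned}
```

When 𝝁 and 𝑅𝑒 are measured in excess of the risk-free rate, which amounts to setting 𝑅𝑓 = 0 in the expressions above, this reduces to the compact formula stated in the text.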
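In both cases we do not need to explicitly include the risk-free asset in the asset menu, as it is
equivalent and simpler to work with excess returns, i.e., the returns of the risky assets minus the
risk-free rate. With this convention, 𝝁 in the two closed-form solutions above is the vector of mean
excess returns, and 𝑅𝑒 is the desired mean excess return.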
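The target-return solution can be checked numerically. The sketch below is illustrative only: the inputs are made up, `mu` is assumed to hold mean excess returns, and `target` plays the role of the desired mean excess return 𝑅𝑒.

```python
import numpy as np

# Illustrative inputs (made-up numbers, not from the notes)
mu = np.array([0.05, 0.07, 0.10])          # mean excess returns
Sigma = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.090, 0.020],
    [0.010, 0.020, 0.160],
])
target = 0.06                               # desired mean excess return R_e (assumed)

# w = R_e / (mu' Sigma^{-1} mu) * Sigma^{-1} mu
Sigma_inv_mu = np.linalg.solve(Sigma, mu)
w_target = target / (mu @ Sigma_inv_mu) * Sigma_inv_mu
w_rf = 1.0 - w_target.sum()                 # remainder goes to the risk-free asset

port_mean = w_target @ mu                   # should equal the target
port_var = w_target @ Sigma @ w_target

# A different portfolio that also hits the target: equal weights, rescaled
w_alt = np.ones(len(mu)) * target / mu.sum()
alt_var = w_alt @ Sigma @ w_alt

print("weights:", w_target, "risk-free:", w_rf)
print("mean:", port_mean, "variance:", port_var, "alternative variance:", alt_var)
```

The comparison with `w_alt` is only a sanity check: any other portfolio reaching the same mean should have a variance at least as large.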
Obviously, it is also possible to specify a given level of variance and maximize the mean return.
Mathematically, these are two equivalent optimization problems. However, it is more intuitive and
more common to specify the desired mean and minimize the variance.
So far we have treated the inputs 𝝁 and ∑ as if they were given, but in practice they need to be estimated.
The simplest approach is called the plug-in approach: the sample estimates of the inputs are computed
from past data, and are then plugged into the optimization problem as if they were the true values.
Obviously, this is not really the case, as sample estimates can be poor estimates of the true
parameter values. In particular, the estimation error in the sample mean is typically so big that
minimizing the variance while ignoring 𝝁 usually leads to portfolios with a higher Sharpe ratio than
those computed via mean-variance optimization.
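The plug-in approach can be sketched as follows. Everything here is an assumption made for illustration: the "true" parameters are invented, the returns are simulated from a multivariate normal distribution, and the sample length `T` is arbitrary. With real data, `returns` would instead be a matrix of historical excess returns with one row per period.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented "true" monthly parameters, used only to simulate data
true_mu = np.array([0.005, 0.007, 0.010])
true_Sigma = np.array([
    [0.0020, 0.0005, 0.0006],
    [0.0005, 0.0030, 0.0010],
    [0.0006, 0.0010, 0.0050],
])
T = 120                                     # number of simulated periods (assumed)
returns = rng.multivariate_normal(true_mu, true_Sigma, size=T)  # shape (T, N)

# Sample estimates of the inputs
mu_hat = returns.mean(axis=0)
Sigma_hat = np.cov(returns, rowvar=False)   # columns are assets; divides by T - 1

# Plug the estimates into the mean-variance formula as if they were true values
gamma = 3.0
w_plugin = np.linalg.solve(Sigma_hat, mu_hat) / gamma
w_true = np.linalg.solve(true_Sigma, true_mu) / gamma

print("plug-in weights:   ", w_plugin)
print("true-input weights:", w_true)
```

Comparing `w_plugin` with `w_true` gives a feel for how estimation error in the inputs propagates to the weights; rerunning with a different seed or a different `T` changes the size of the gap.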
Therefore, the investor might want to compute the global minimum variance portfolio (GMV), also
simply called minimum variance portfolio. Obviously, we need to impose the constraint that the
sum of the weights of the risky assets is equal to one, which means that nothing is invested in the
risk-free asset. Otherwise, everything would be invested in the risk-free asset. Hence, we have to
solve the following constrained optimization problem:
$$\min_{\boldsymbol{w}} \; \boldsymbol{w}'\boldsymbol{\Sigma}\boldsymbol{w}$$
subject to:
$$\boldsymbol{w}'\boldsymbol{1} = 1$$
We write the Lagrangian function:
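$$L(\boldsymbol{w}, \lambda) = \boldsymbol{w}'\boldsymbol{\Sigma}\boldsymbol{w} + \lambda\left[1 - \boldsymbol{w}'\boldsymbol{1}\right]$$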
In order to get a more convenient first order condition, it is common to multiply the first term by
0.5, which does not alter the result. So the Lagrangian becomes:
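$$L(\boldsymbol{w}, \lambda) = \frac{1}{2}\,\boldsymbol{w}'\boldsymbol{\Sigma}\boldsymbol{w} + \lambda\left[1 - \boldsymbol{w}'\boldsymbol{1}\right]$$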
The first order conditions are:
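$$\frac{\partial L}{\partial \boldsymbol{w}} = \boldsymbol{\Sigma}\boldsymbol{w} - \lambda\boldsymbol{1} = \boldsymbol{0}$$
$$\frac{\partial L}{\partial \lambda} = 1 - \boldsymbol{w}'\boldsymbol{1} = 0$$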
Through some simple rearrangement we get:
$$\boldsymbol{w} = \lambda\boldsymbol{\Sigma}^{-1}\boldsymbol{1}$$
$$\boldsymbol{w}'\boldsymbol{1} = 1$$
In the first equation, we can multiply both sides by 𝟏′, obtaining:
$$\boldsymbol{1}'\boldsymbol{w} = \lambda\,\boldsymbol{1}'\boldsymbol{\Sigma}^{-1}\boldsymbol{1}$$
From the second equation we know that:
$$\boldsymbol{w}'\boldsymbol{1} = \boldsymbol{1}'\boldsymbol{w} = 1$$
Hence, the first equation becomes:
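$$1 = \lambda\,\boldsymbol{1}'\boldsymbol{\Sigma}^{-1}\boldsymbol{1}$$
$$\lambda = \frac{1}{\boldsymbol{1}'\boldsymbol{\Sigma}^{-1}\boldsymbol{1}}$$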
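So, finally, we can substitute this last result for 𝜆 in $\boldsymbol{w} = \lambda\boldsymbol{\Sigma}^{-1}\boldsymbol{1}$, obtaining:
$$\boldsymbol{w} = \frac{1}{\boldsymbol{1}'\boldsymbol{\Sigma}^{-1}\boldsymbol{1}}\,\boldsymbol{\Sigma}^{-1}\boldsymbol{1}$$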
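Therefore, the closed-form solution that gives the minimum variance weights is:
$$\boldsymbol{w}_{v} = \frac{1}{\boldsymbol{1}'\boldsymbol{\Sigma}^{-1}\boldsymbol{1}}\,\boldsymbol{\Sigma}^{-1}\boldsymbol{1}$$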
As nothing is invested in the risk-free asset (since we require that the weights of the risky assets
sum to 1), it is equivalent to work with returns or excess returns. However, it might be
convenient to still work with excess returns, so that the results are easily comparable with those
obtained from the mean-variance portfolio.
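The closed-form minimum variance weights are straightforward to compute. The following sketch uses a made-up covariance matrix and compares the result with an equally weighted portfolio, purely as a sanity check:

```python
import numpy as np

# Illustrative covariance matrix (made-up numbers, not from the notes)
Sigma = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.090, 0.020],
    [0.010, 0.020, 0.160],
])
ones = np.ones(Sigma.shape[0])

# w_v = Sigma^{-1} 1 / (1' Sigma^{-1} 1)
Sigma_inv_ones = np.linalg.solve(Sigma, ones)
w_gmv = Sigma_inv_ones / (ones @ Sigma_inv_ones)
gmv_var = w_gmv @ Sigma @ w_gmv

# Equally weighted portfolio, fully invested in the risky assets
w_eq = ones / len(ones)
eq_var = w_eq @ Sigma @ w_eq

print("GMV weights:", w_gmv)
print("GMV variance:", gmv_var, "equal-weight variance:", eq_var)
```

Note that no estimate of 𝝁 enters the computation, which is the point of this portfolio.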
A further improvement can come from restricting the minimum variance portfolio to only have long
positions. In other words, we add another constraint that prohibits short positions (𝒘 ≥ 𝟎), to get a
long-only minimum variance portfolio. This problem does not have a closed-form solution, but it
can easily be solved numerically.
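One possible way to solve the long-only problem numerically is with a general-purpose constrained optimizer. The sketch below uses `scipy.optimize.minimize` with the SLSQP method; this is a choice made for illustration, not a method prescribed by the notes, and a dedicated quadratic programming solver would work as well. The covariance matrix is made up so that the unconstrained minimum variance portfolio contains a short position.

```python
import numpy as np
from scipy.optimize import minimize

# Made-up covariance matrix: the first two assets are highly correlated
Sigma = np.array([
    [0.04, 0.05, 0.00],
    [0.05, 0.09, 0.00],
    [0.00, 0.00, 0.16],
])
n = Sigma.shape[0]

def half_variance(w):
    """Objective: one half of the portfolio variance."""
    return 0.5 * (w @ Sigma @ w)

def half_variance_grad(w):
    """Gradient of the objective with respect to the weights."""
    return Sigma @ w

result = minimize(
    half_variance,
    x0=np.full(n, 1.0 / n),                 # start from equal weights
    jac=half_variance_grad,
    method="SLSQP",
    bounds=[(0.0, None)] * n,               # no short positions
    constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],  # fully invested
    options={"ftol": 1e-12},
)
w_long_only = result.x

# Unconstrained closed-form minimum variance weights, for comparison
ones = np.ones(n)
w_gmv = np.linalg.solve(Sigma, ones)
w_gmv = w_gmv / w_gmv.sum()

print("unconstrained GMV:", w_gmv)
print("long-only GMV:    ", w_long_only)
```

Because the long-only problem adds a constraint, its variance can never be lower than that of the unconstrained solution when both use the same covariance matrix; any benefit in practice would have to come from reduced sensitivity to estimation error.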
When a value for 𝛾 is specified, another strategy that mitigates the impact of the estimation error
is the 1/N rule.¹ In this rule the (sample) estimates of 𝝁 and ∑ are used to optimally allocate the
wealth between the risk-free asset and the equally weighted risky assets. This rule usually performs
very well. The weights for the risky assets according to this strategy are:
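$$\boldsymbol{w}_{1/N} = \frac{1}{\gamma}\,\frac{\boldsymbol{1}'\boldsymbol{\mu}}{\boldsymbol{1}'\boldsymbol{\Sigma}\boldsymbol{1}}\,\boldsymbol{1}$$
and the weight for the riskless asset is given by $1 - \boldsymbol{w}_{1/N}'\boldsymbol{1}$.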
¹ The name “1/N rule” is often used to refer to a naive rule where one simply places all the wealth on risky assets with all
weights equal to 1/N, without estimating any input. In these notes we refer instead to the more elaborate rule described in
the text.
For example, if 𝑁 = 5 and this rule returns a weight of 0.15 for each risky asset, we equally divide
75% of our wealth among the risky assets (i.e., 15% on each risky asset), and then place the
remaining 25% on the risk-free asset.
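The rule can be sketched in a few lines of Python. The inputs below are made up for illustration and `mu` is assumed to hold mean excess returns; the code applies the formula for the risky weights given above and allocates the remainder to the risk-free asset.

```python
import numpy as np

# Illustrative inputs (made-up numbers, not from the notes)
mu = np.array([0.05, 0.07, 0.10])          # mean excess returns
Sigma = np.array([
    [0.040, 0.006, 0.010],
    [0.006, 0.090, 0.020],
    [0.010, 0.020, 0.160],
])
gamma = 3.0                                 # risk aversion coefficient (assumed)

ones = np.ones(len(mu))

# Common weight placed on each risky asset: (1/gamma) * (1'mu) / (1'Sigma 1)
c = (ones @ mu) / (ones @ Sigma @ ones) / gamma
w_1n = c * ones

# Remaining wealth goes to the risk-free asset
w_rf = 1.0 - w_1n.sum()

print("weight on each risky asset:", c)
print("total in risky assets:", w_1n.sum(), "risk-free weight:", w_rf)
```

The scalar `c` is the value that maximizes the mean-variance utility among portfolios of the form `c * ones`, which is why only the sums `ones @ mu` and `ones @ Sigma @ ones` of the estimated inputs are needed.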