PORTFOLIO THEORY – LECTURE NOTES 3
Dr. Andrea Rigamonti

IMPROVING PORTFOLIO OPTIMIZATION

Let us consider again the portfolio optimization problem. Whatever metrics we use to evaluate the results, we need to compare them with appropriate benchmarks in order to know whether the results are satisfactory. A benchmark portfolio that is surprisingly difficult to beat is the one created with the naïve 1/N rule.[1] This portfolio simply assigns equal weights to all the assets in all periods, and therefore does not require estimating any inputs or performing any optimization procedure. This is indeed one of its main strengths: it is completely immune to estimation errors. Another advantage is that it has a very low turnover, which translates into very low transaction costs.

Another possible benchmark is the market, or more precisely the returns of a large stock market index, like the S&P 500. While it is not possible to buy an index directly, it is possible to buy ETFs. An ETF (Exchange-Traded Fund) is a fund that is traded on the financial markets and tries to replicate a certain index. By investing in such a fund, the investor can invest in a certain index without having to trade all the stocks contained in that index. Stock markets sometimes go through periods of poor performance that can last years, but over the long run they provide good returns (in countries with a solid economy). Therefore, such a passive investing solution, which also does not require estimation and optimization procedures, is another reasonable benchmark (over long enough periods of time).

Basic mean-variance optimization with sample estimates often struggles to beat common benchmarks, especially the naïve rule mentioned above. To improve optimization performance we need to reduce the severity of the estimation errors in the inputs, and/or reduce the impact of such errors on portfolio formation. Plenty of solutions have been proposed. We focus on some that have proved to be effective and not too difficult to apply.

On the side of parameter estimation, we consider techniques developed to improve the estimation of the covariance matrix. While in general the error in the vector of sample means is more severe than the error in the sample covariance matrix, the solutions proposed to improve over the latter are simpler and more effective. Moreover, while the number of parameters in μ is equal to the number of assets N, the number of parameters in Σ is of order N². Therefore, as the portfolio gets larger, estimation error in Σ becomes worse much faster than it does in μ, which might jeopardize the performance benefit that we theoretically get from a larger N thanks to the greater diversification potential.

Arguably the most successful approach to estimating the covariance matrix is given by shrinkage estimators. Ledoit and Wolf (2004) define the shrinkage estimator

    Σ_LW = δ μ̂ I + (1 − δ) Σ

where Σ is the usual sample covariance matrix, I is the identity matrix, μ̂ is the average sample variance of all the variables (so the product μ̂ I gives a diagonal matrix whose diagonal elements are equal to the average sample variance and whose other elements are equal to zero), and δ is the shrinkage parameter, whose value is between 0 and 1. It is called a "shrinkage" estimator because the sample estimate is shrunk toward a target matrix. Σ_LW is basically a weighted average of the sample covariance matrix and the target matrix.

[1] We will refer to this rule/portfolio as "naive" or "naïve 1/N", so as not to confuse it with the 1/N rule we described before. However, it is also commonly called simply the "1/N" rule/portfolio in the literature.
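To make the formula concrete, the following is a minimal Python/NumPy sketch of the estimator. It is a sketch under simplifying assumptions: the shrinkage intensity δ is taken as a user-supplied constant rather than the optimal data-driven value of Ledoit and Wolf (2004), whose selection is discussed next, and the function name lw_shrinkage_cov is purely illustrative.

    import numpy as np

    def lw_shrinkage_cov(returns, delta):
        """Shrink the sample covariance matrix toward a scaled identity target.

        returns : (T, N) array of asset returns (rows = periods, columns = assets)
        delta   : shrinkage intensity in [0, 1], here chosen by the user
        """
        sample_cov = np.cov(returns, rowvar=False)        # N x N sample covariance
        mu_hat = np.mean(np.diag(sample_cov))             # average sample variance
        target = mu_hat * np.eye(sample_cov.shape[0])     # target matrix: mu_hat * I
        return delta * target + (1 - delta) * sample_cov  # weighted average

    # Illustrative use with simulated data and an arbitrary delta = 0.3
    rng = np.random.default_rng(0)
    simulated_returns = rng.normal(size=(60, 10))  # T = 60 periods, N = 10 assets
    cov_lw = lw_shrinkage_cov(simulated_returns, delta=0.3)

Note that for any δ > 0 the result is the sum of a positive definite diagonal matrix and a positive semidefinite matrix, which is why the shrunk estimator remains invertible even when T < N, as discussed below.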
The intensity of the shrinkage is controlled by the shrinkage parameter δ. Ledoit and Wolf (2004) select an optimal value for δ with a procedure whose details are beyond our scope here. This shrunk estimator improves over the sample estimator, which translates into better performance of the mean-variance and minimum variance portfolios (and, in general, of any portfolio computed using the covariance matrix as an input). Moreover, it gives a nonsingular covariance matrix even when the number of periods T used for estimation is smaller than the number of assets N. The sample covariance matrix, on the contrary, is singular in such a case and therefore not invertible, which makes it impossible to perform the optimization procedures. It is also very easy to use this estimator in common statistical environments like R. Hence, it is now standard to use this estimator instead of the sample covariance matrix in many applications, including portfolio optimization.

We now turn to solutions that mitigate the impact of estimation errors on portfolio formation. One such solution relies on a logic similar to the one we just described, and consists in computing shrinkage portfolios. The idea, proposed by Tu and Zhou (2011), is that we can improve over the mean-variance portfolio by shrinking it toward the naïve 1/N portfolio. In this way we combine a portfolio that optimizes the weights but suffers from estimation errors with one that is not optimized but is also immune to estimation errors, obtaining a portfolio that improves over both. The weights of such a portfolio are given by

    w* = δ w_NAIVE + (1 − δ) w

where w_NAIVE is the vector of weights of the naïve 1/N portfolio, w is the vector of weights of the optimized portfolio (usually the mean-variance portfolio, but other portfolios can also be used), and δ is again the parameter that controls the shrinkage intensity. The value of δ can be chosen using optimization rules, heuristics, or cross-validation. A sketch of this combination rule is given below.

Another solution to improve over the standard mean-variance portfolio is the grouping strategy proposed by Branger et al. (2019). The idea is that, since the performance of mean-variance (or minimum variance) optimization suffers more and more as N increases, due to the growing number of parameters to estimate, one could achieve better performance by grouping the assets into a certain number of groups. The optimization procedure is then performed between the groups, while within a group the assets are equally weighted. This reduces the dimension of the problem, and therefore the number of parameters to estimate. In other words, it is another strategy that combines the benefits of the naïve 1/N rule (which is applied within groups) and of optimization (which is applied between groups). The stocks can be grouped according to how similar they are in terms of estimated mean, variance, or beta, and the number of groups can be chosen using optimization rules, heuristics, or cross-validation. The higher the number of groups, the closer we get to the usual optimization; the smaller the number of groups, the closer we get to the naïve 1/N rule. In the extreme case where we only have one group we obtain the naïve 1/N portfolio; in the extreme case where the number of groups is equal to N, we get the usual optimized portfolio.
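The combination rule of Tu and Zhou (2011) is straightforward to implement once the optimized weights and δ are available. Below is a minimal sketch in the same style as before; the helper name combination_portfolio is hypothetical, and δ is taken as given (in practice it would be chosen via optimization rules, heuristics, or cross-validation, as noted above).

    import numpy as np

    def combination_portfolio(w_opt, delta):
        """Shrink an optimized weight vector toward the naive 1/N portfolio.

        w_opt : (N,) vector of optimized weights (e.g. mean-variance weights)
        delta : shrinkage intensity in [0, 1]; delta = 1 gives the naive 1/N
                portfolio, delta = 0 gives the purely optimized portfolio
        """
        w_naive = np.full(len(w_opt), 1.0 / len(w_opt))  # equal weights
        return delta * w_naive + (1 - delta) * w_opt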
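The grouping strategy can also be sketched compactly. The sketch below makes several assumptions that are illustrative rather than prescribed by Branger et al. (2019): assets are grouped by estimated variance into equally sized groups, and the between-group optimization is a minimum variance optimization (weights proportional to Σ⁻¹1 over the group portfolios); the function name grouped_min_variance is hypothetical.

    import numpy as np

    def grouped_min_variance(returns, k):
        """Grouping strategy: 1/N within groups, optimization between groups.

        returns : (T, N) array of asset returns
        k       : number of groups, with 2 <= k <= N
        """
        T, N = returns.shape
        order = np.argsort(np.var(returns, axis=0, ddof=1))  # sort assets by variance
        groups = np.array_split(order, k)                    # k (roughly) equal groups

        # Returns of the k equally weighted group portfolios
        group_returns = np.column_stack([returns[:, g].mean(axis=1) for g in groups])

        # Minimum variance weights between groups: proportional to inv(Cov) * 1
        cov = np.cov(group_returns, rowvar=False)
        w_groups = np.linalg.solve(cov, np.ones(k))
        w_groups /= w_groups.sum()

        # Spread each group's weight equally across its member assets
        w = np.zeros(N)
        for wg, g in zip(w_groups, groups):
            w[g] = wg / len(g)
        return w

With k = N each group contains a single asset and the sketch recovers the usual minimum variance portfolio, while shrinking k toward one moves the result toward the naïve 1/N portfolio, mirroring the extreme cases described above.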