CHAPTER 1

The Equity Premium: ABCs

Rajnish Mehra
University of California, Santa Barbara, and NBER

and

Edward C. Prescott
Arizona State University and Federal Reserve Bank of Minneapolis

1. Introduction
1.1. An Important Preliminary Issue
1.2. Data Sources
1.3. Estimates of the Equity Premium
1.4. Variation in the Equity Premium over Time
2. Is the Equity Premium due to a Premium for Bearing Non-Diversifiable Risk?
2.1. Standard Preferences
References
Appendix A
Appendix B
Appendix C
Appendix D

JEL Classification: G10, G12, D9

Keywords: asset pricing, equity risk premium, CAPM, consumption CAPM, risk free rate puzzle

We thank George Constantinides, John Donaldson and Viral Shah for helpful comments and Francisco Azeredo for excellent research assistance.

HANDBOOK OF THE EQUITY RISK PREMIUM
Copyright © 2008 by Rajnish Mehra and Edward C. Prescott. All rights of reproduction in any form reserved.

1. INTRODUCTION

The year 1978 saw the publication of Robert Lucas’ seminal paper “Asset Prices in an Exchange Economy” in Econometrica. Its publication transformed asset pricing and substantially raised the level of discussion, providing a theoretical construct to study issues that could not be addressed within the dominant paradigm at the time, the Capital Asset Pricing Model.¹ A crucial input parameter for using the latter is the equity premium² (the return earned by a broad market index in excess of that earned by a relatively risk-free security). Lucas’ asset pricing model allowed one to pose questions about the magnitude of the equity premium.³ In our paper, “The Equity Premium: A Puzzle,”⁴ we decided to address this issue.
In this chapter we take a retrospective look at our original paper and show why we concluded that the equity premium is not a premium for bearing non-diversifiable risk.⁵ We critically evaluate the data sources used to document the puzzle and touch on other issues that may be of interest to the researcher who did not have a ringside seat 20 years ago. We stress that the perspective here captures the spirit of our original paper and not necessarily our current thinking on these issues.⁶

This and the subsequent two chapters are motivated by the intention to make this volume a self-contained reference for the beginning researcher in the field. The chapters that follow address the research efforts that have preoccupied the profession in an effort to explain the equity premium.

This chapter is organized into two parts. Part 1 documents the historical equity premium in the United States and in selected countries with significant capital markets (in terms of market value) and comments on data sources. Part 2 examines the question, “Is the equity premium a premium for bearing non-diversifiable risk?”

1.1. An Important Preliminary Issue

Any discussion of the equity premium raises the question of whether arithmetic or geometric returns should be used for summarizing historical return data. In Mehra and Prescott (1985), we used arithmetic averages. If returns are uncorrelated over time, the appropriate statistic is the arithmetic average because the expected future value of a $1 investment is obtained by compounding the mean returns. Thus, this is the appropriate statistic to report if one is interested in the mean terminal value of the investment.⁷

¹ See Mossin (1966) for a lucid articulation.
² This was generally assumed to be in the 6–8 percent range.
³ To put this advance in perspective, an equivalent contribution in physics would be to come up with a model that enabled one to address the question of whether the value of Newton’s gravitational constant G (6.672 × 10⁻¹¹ N m²/kg²) is reasonable, given other cosmological observations. Why is the value of G what we observe?
⁴ See Mehra and Prescott (1985).
⁵ This chapter draws on material in Mehra and Prescott (2003). All of the acknowledgements in that chapter continue to apply.
⁶ For an elaboration, see McGrattan and Prescott (2003, 2005) and Mehra and Prescott (2008) in this volume.
⁷ We present a simple proof in Appendix A.

The arithmetic average return exceeds the geometric average return. If returns are log-normally distributed, the difference between the two is one-half the variance of the returns. Since the annual standard deviation of the equity returns is about 20 percent, there is a difference of about 2 percent between the two measures. Using geometric averages significantly underestimates the expected future value of an investment.

In this chapter, as in our 1985 paper, we report arithmetic averages. In instances where we cite the results of research when arithmetic averages are not available, we clearly indicate this.⁸

1.2. Data Sources

A crucial consideration in a discussion of the historical equity premium has to do with the reliability of early data sources. The data we used in documenting the historical equity premium in the United States can be subdivided into three distinct subperiods, 1802–1871, 1871–1926, and 1926–present, with wide variation in the quality of the data over each subperiod. Data on stock prices for the 19th century is patchy, often necessarily introducing an element of arbitrariness to compensate for its incompleteness.

1.2.1. Subperiod 1802–1871

Equity Return Data

The equity return data prior to 1871 is not particularly reliable.
To the best of our knowledge, the stock return data used by all researchers for the period 1802–1871 is due to Schwert (1990), who gives an excellent account of the construction and composition of early stock market indexes. Schwert (1990) constructs a “spliced” index for the period 1802–1987; his index for the period 1802–1862 is based on the work of Smith and Cole (1935), who constructed a number of early stock indexes. For the period 1802–1820, their index was constructed from an equally weighted portfolio of seven bank stocks, and another index for 1815–1845 was composed of six bank stocks and one insurance stock. For the period 1834–1862, the index consisted of an equally weighted portfolio of (at most) 27 railroad stocks.⁹ They used one price quote, per stock, per month, from local newspapers. The prices used were the average of the bid and ask prices, rather than transaction prices, and the computation of returns ignores dividends. For the period 1863–1871, Schwert uses data from Macaulay (1938), who constructed a value-weighted index using a portfolio of 25 Northeast and Mid-Atlantic railroad stocks;¹⁰ this index also excludes dividends. Needless to say, it is difficult to assess how well this data proxies the “market,” since undoubtedly there were other industry sectors that were not reflected in the index.

⁸ In this case an approximate estimate of the arithmetic average return can be obtained by adding one-half the variance of the returns to the geometric average.
⁹ “They chose stocks in hindsight . . . the sample selection bias caused by including only stocks that survived and were actively quoted for the whole period is obvious” (Schwert (1990)).
¹⁰ “It is unclear what sources Macaulay used to collect individual stock prices but he included all railroads with actively traded stocks.” Ibid.
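The approximation described in footnote 8 (arithmetic average ≈ geometric average plus one-half the variance of returns) can be sketched in a few lines of Python. This is an illustration only: the function names and the short return series are hypothetical and are not taken from any of the data sets discussed in this chapter.

```python
import math

def arithmetic_mean(returns):
    """Simple average of per-period net returns."""
    return sum(returns) / len(returns)

def geometric_mean(returns):
    """Constant per-period return that compounds to the same terminal value."""
    log_growth = sum(math.log1p(r) for r in returns)
    return math.expm1(log_growth / len(returns))

def approx_arithmetic_from_geometric(geo_mean, std_dev):
    """Footnote 8 approximation: add one-half the variance to the geometric average."""
    return geo_mean + 0.5 * std_dev ** 2

# Hypothetical annual returns, for illustration only (not historical data).
sample = [0.30, -0.10, 0.20, -0.05]
arith = arithmetic_mean(sample)
geo = geometric_mean(sample)

# With a standard deviation of about 20 percent, the adjustment is about
# 0.5 * 0.20**2 = 0.02, the 2 percent gap mentioned in Section 1.1.
adjusted = approx_arithmetic_from_geometric(0.06, 0.20)
```

For any series with non-constant returns, `arith` exceeds `geo`; the one-half-variance adjustment is exact only under log-normality and is otherwise a rough guide.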
Return on a Risk-Free Security

Since there were no Treasury bills extant at the time, researchers have used the data set constructed by Siegel (2002) for this period, using highly rated securities with an adjustment for the default premium. Interestingly, based on this data set, the equity premium for the period 1802–1862 was zero. We conjecture that this may be due to the fact that since most financing in the first half of the 19th century was done through debt, the distinction between debt and equity securities was not very clear-cut.¹¹

1.2.2. Subperiod 1871–1926

Equity Return Data

Shiller (1989) is the definitive source for the equity return data for this period. His data is based on the work of Cowles (1939), which covers the period 1871–1938. Cowles used a value-weighted portfolio for his index, which consisted of 12 stocks¹² in 1871 and ended with 351 in 1938. He included all stocks listed on the New York Stock Exchange whose prices were reported in the Commercial and Financial Chronicle. From 1918 onward he used the Standard and Poor’s (S&P) industrial portfolios. Cowles reported dividends, so that, unlike the earlier indexes for the period 1802–1871, a total return calculation was possible.

Return on a Risk-Free Security

There is no definitive source for the short-term risk-free rate in the period before 1920, when Treasury certificates were first issued. In our 1985 study, we used short-term commercial paper as a proxy for a riskless short-term security prior to 1920 and Treasury certificates from 1920–1930.¹³ Our data prior to 1920 was taken from Homer (1963). Most researchers have used either our data set or Siegel’s.

1.2.3. Subperiod 1926–Present

Equity Return Data

This period is the “Golden Age” with regard to accurate financial data. The NYSE database at the Center for Research in Security Prices (CRSP) was initiated in 1926 and provides researchers with high-quality equity return data.
The Ibbotson Associates Yearbooks¹⁴ are also a very useful compendium of post-1926 financial data.

¹¹ The first actively traded stock was floated in the U.S. in 1791 and by 1801 there were over 300 corporations, although fewer than 10 were actively traded (Siegel (2002)).
¹² It was only from February 16, 1885, that Dow Jones began reporting an index, initially composed of 12 stocks. The S&P index dates back to 1928, though for the period 1928–1957 it consisted of 90 stocks. The S&P 500 debuted in March 1957.
¹³ See Mehra and Prescott (2008) in this volume for a discussion on the choice of a proxy for the risk-free asset.
¹⁴ Ibbotson Associates, 2006. Stocks, bonds, bills and inflation. 2005 Yearbook. Ibbotson Associates, Chicago.

TABLE 1 U.S. Annual Real Growth Rate of Per Capita Consumption of Non-durables and Services

| | 1889–1978 | 1889–2004 | 1889–1929 | 1930–1978 | 1930–2004 |
|---|---|---|---|---|---|
| Mean | 0.018 | 0.018 | 0.021 | 0.016 | 0.018 |
| Std. Dev. | 0.036 | 0.032 | 0.044 | 0.028 | 0.022 |
| Serial Correlation | −0.140 | −0.135 | −0.463 | 0.520 | 0.450 |

Return on a Risk-Free Security

Since the advent of Treasury bills in 1931, short-maturity bills have almost universally been used as proxy for a “real” risk-free security¹⁵ since the innovation in inflation is orthogonal to the path of real GNP growth.¹⁶ With the debut of Treasury Inflation Protected Securities (TIPS) on January 29, 1997, the return on these securities is the real risk-free rate.¹⁷

1.2.4. Consumption Data

In our study, we used the Kuznets–Kendrick–NIA per capita real consumption of non-durables and services for the period 1889–1978. Our data source was Grossman and Shiller (1981).
An updated version of this series is available in Shiller (1989).¹⁸ The initial series (the flow of perishable and semi-durable goods to consumers) for the period 1889–1919 was constructed by William Shaw.¹⁹ Simon Kuznets (1938, 1946) modified Shaw’s measure by incorporating transportation and distribution costs, and created a series (the flow of perishable and semi-durable goods to consumers) for the period 1919–1929. The final version of these series is available in an unpublished mimeograph underlying Tables R-27 and R-28 in Kuznets (1961). Kendrick (1961) made further adjustments to these series in order to make them comparable to the Department of Commerce’s personal consumption expenditure series. Kendrick’s adjustments are available in Tables A-IIa and A-IIb in Kendrick (1961). This is the data source that Grossman and Shiller (1981) used in constructing the 1889–1929 subset of their series on per capita real consumption of non-durables and services. The post-1929 data is from the National Income and Product Accounts of the United States.

Table 1 details the statistics on the growth rate of real per capita consumption of non-durables and services, while Figure 1 is a time-series plot. We used the statistics in the 1889–1978 column in our original study.

¹⁵ Mehra and Prescott. Ibid.
¹⁶ Litterman (1980) documents that in postwar data the innovation in inflation had a standard deviation of one-half of one percent.
¹⁷ Mehra and Prescott. Ibid.
¹⁸ Further updates are available on Robert Shiller’s website: www.econ.yale.edu/~shiller.
¹⁹ See Shaw (1947).

[Figure 1: time-series plot of annual growth rates by year, 1889–2004; vertical axis runs from −0.15 to 0.15.]

FIGURE 1 U.S. annual real growth rate of per capita consumption of non-durables and services 1889–2004. Source: Mehra and Prescott (1985), updated by the authors.
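The three statistics reported in Table 1 (mean, standard deviation, and first-order serial correlation of the growth rate) can be computed from a series of consumption levels along the following lines. This is a minimal sketch: the function names are hypothetical, the input series is made up for illustration, and the choice of the n − 1 divisor for the standard deviation is an assumption, since the chapter does not state which convention was used.

```python
import math

def growth_rates(levels):
    """Year-over-year growth rates c[t+1] / c[t] - 1 of a positive series."""
    return [nxt / cur - 1.0 for cur, nxt in zip(levels, levels[1:])]

def mean(xs):
    return sum(xs) / len(xs)

def std_dev(xs):
    """Sample standard deviation (n - 1 divisor; the divisor is an assumption)."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def serial_correlation(xs):
    """First-order autocorrelation of xs around its full-sample mean."""
    m = mean(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((x - m) ** 2 for x in xs)
    return num / den

# Hypothetical consumption levels, for illustration only
# (not the Kuznets-Kendrick-NIA series).
levels = [100.0, 102.0, 103.0, 106.0, 107.0]
rates = growth_rates(levels)
summary = (mean(rates), std_dev(rates), serial_correlation(rates))
```

In this toy series high and low growth years alternate, so the serial correlation is negative, the same sign as in the full-sample column of Table 1.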
While the serial correlation of consumption growth for the entire sample is negative, Azeredo (2007) points out that for more than the last 70 years it has been positive. In addition, we note that the standard deviation has declined.²⁰ We discuss the implications for the equity premium in Section 2.1.2 and further in Appendix B.

1.3. Estimates of the Equity Premium

Historical data provides us with a wealth of evidence documenting that for over a century, stock returns have been considerably higher than those for Treasury bills. This is illustrated in Table 2, which reports the unconditional estimates²¹ for the U.S. equity premium based on the various data sets used in the literature, going back to 1802. The average annual real return (the inflation adjusted return) on the U.S. stock market over the last 116 years has been about 7.67 percent. Over the same period, the return on a relatively riskless security was a paltry 1.31 percent. The difference between these two returns, the “equity premium,” was 6.36 percent.

²⁰ Christina Romer (1999) points out that the larger pre-1929 estimates may be an artifact of the early methodology rather than due to a change in the underlying stochastic process.
²¹ To obtain unconditional estimates, we use the entire data set to form our estimate. The Mehra–Prescott data set spans the longest time period for which both consumption and stock return data is available; the former is necessary to test the implications of consumption-based asset pricing models.

TABLE 2 U.S. Equity Premium Using Different Data Sets

| Data set | Real return on a market index (%), mean | Real return on a relatively riskless security (%), mean | Equity premium (%), mean |
|---|---|---|---|
| 1802–2004 (Siegel) | 8.38 | 3.02 | 5.36 |
| 1871–2005 (Shiller) | 8.32 | 2.68 | 5.64 |
| 1889–2005 (Mehra–Prescott) | 7.67 | 1.31 | 6.36 |
| 1926–2004 (Ibbotson) | 9.27 | 0.64 | 8.63 |

TABLE 3 Equity Premium for Selected Countries

| Country | Period | Mean real return: market index (%) | Mean real return: relatively riskless security (%) | Equity premium (%) |
|---|---|---|---|---|
| United Kingdom | 1900–2005 | 7.4 | 1.3 | 6.1 |
| Japan | 1900–2005 | 9.3 | −0.5 | 9.8 |
| Germany | 1900–2005 | 8.2 | −0.9 | 9.1 |
| France | 1900–2005 | 6.1 | −3.2 | 9.3 |
| Sweden | 1900–2005 | 10.1 | 2.1 | 8.0 |
| Australia | 1900–2005 | 9.2 | 0.7 | 8.5 |
| India | 1991–2004 | 12.6 | 1.3 | 11.3 |

Source: Dimson et al. (2002) and Mehra (2007) for India.

Furthermore, this pattern of excess returns to equity holdings is not unique to the U.S. but is observed in every country with a significant capital market. The U.S. together with the U.K., Japan, Germany, and France accounts for more than 85 percent of the capitalized global equity value.

The annual return on the British stock market was 7.4 percent over the last 106 years, an impressive 6.1 percent premium over the average bond return of 1.3 percent. Similar statistical differentials are documented for France, Germany, and Japan. Table 3 documents the equity premium for these countries.

The dramatic investment implications of this differential rate of return can be seen in Table 4, which maps the capital appreciation of $1 invested in different assets from 1802 to 2004 and from 1926 to 2004. One dollar invested in a diversified stock index yields an ending wealth of $655,348 versus a value of $293, in real terms, for $1 invested in a portfolio of T-bills for the period 1802–2004. The corresponding values for the 78-year period, 1926–2004, are $238.30 and $1.54.
It is assumed that all payments to the underlying asset, such as dividend payments to stock and interest payments to bonds, are reinvested and that no taxes are paid.

This long-term perspective underscores the remarkable wealth-building potential of the equity premium. It should come as no surprise, therefore, that the equity premium is of central importance in portfolio allocation decisions and estimates of the cost of capital and is front and center in the current debate about the advantages of investing Social Security Trust funds in the stock market.

In Table 5 we document the premium for some interesting historical subperiods: 1889–1933, when the United States was on a gold standard; 1934–2005, when it was off the gold standard; and 1946–2005, the postwar period. Table 6 presents 30-year moving averages, similar to those reported by the U.S. meteorological service to document “normal” temperature.

TABLE 4 Terminal Value of $1 Invested in Stocks and Bonds

| Investment period | Stocks, real | Stocks, nominal | T-bills, real | T-bills, nominal |
|---|---|---|---|---|
| 1802–2004 | $655,348.00 | $10,350,077.00 | $293.00 | $4,614.00 |
| 1926–2004 | $238.30 | $2,533.43 | $1.54 | $17.87 |

Source: Ibbotson (2006) and Siegel (2002).

TABLE 5 Equity Premium in Different Subperiods

| Time period | Real return on a market index (%), mean | Real return on a relatively riskless security (%), mean | Equity premium (%), mean |
|---|---|---|---|
| 1889–1933 | 7.01 | 3.39 | 3.62 |
| 1934–2005 | 8.08 | 0.01 | 8.07 |
| 1946–2005 | 8.19 | 0.71 | 7.48 |

Source: Mehra and Prescott (1985). Updated by the authors.

TABLE 6 Equity Premium 30-Year Moving Averages

| Time period | Real return on a market index (%), mean | Real return on a relatively riskless security (%), mean | Equity premium (%), mean |
|---|---|---|---|
| 1900–1950 | 7.45 | 2.95 | 4.50 |
| 1951–2005 | 8.53 | 1.11 | 7.42 |

Source: Mehra and Prescott (1985). Updated by the authors.
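The terminal values in Table 4 can be converted back into constant annual compound (geometric) rates of return, which helps connect them to the arithmetic averages in Table 2. The sketch below is illustrative: the function name is hypothetical, and the holding period is taken to be 2004 − 1802 = 202 years, following the convention by which the text counts 1926–2004 as 78 years.

```python
def annualized_return(terminal_value, years, initial=1.0):
    """Constant annual return that grows `initial` into `terminal_value` over `years`."""
    if terminal_value <= 0 or initial <= 0 or years <= 0:
        raise ValueError("terminal_value, initial and years must be positive")
    return (terminal_value / initial) ** (1.0 / years) - 1.0

# Real terminal values of $1 from Table 4, 1802-2004.
YEARS = 2004 - 1802  # 202 years; the counting convention is an assumption
stocks_real = annualized_return(655_348.00, YEARS)
bills_real = annualized_return(293.00, YEARS)

# Gap between the two compound rates (a geometric, not arithmetic, differential).
compound_gap = stocks_real - bills_real
```

Under these assumptions the implied real compound rates are roughly 6.9 percent for stocks and 2.9 percent for T-bills. Both lie below the corresponding arithmetic means for the Siegel data set in Table 2 (8.38 and 3.02 percent), as the discussion in Section 1.1 leads one to expect.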
Although the premium has been increasing over time, this is primarily due to the diminishing return on the riskless asset, rather than a dramatic increase in the return on equity, which has been relatively constant. The low premium in the 19th century is largely due to the fact that the equity premium for the period 1802–1861 was zero.²² If we exclude this period, we find that the difference in the premium in the second half of the 19th century relative to average values in the 20th century is less striking.

We see a dramatic change in the equity premium in the post-1933 period: the premium rose from 3.62 percent to 8.07 percent, an increase of more than 120 percent. Since 1933 marked the end of the period when the U.S. was on the gold standard, this break can be seen as the change in the equity premium after the implementation of the new policy.

1.4. Variation in the Equity Premium Over Time

The equity premium has varied considerably over time, as illustrated in Figures 2 and 3. Furthermore, the variation depends on the time horizon over which it is measured. There have even been periods when it has been negative.

The low-frequency variation has been countercyclical. This is shown in Figure 4, where we have plotted the stock market value as a share of national income²³ and the mean equity premium averaged over certain time periods. We have divided the time period from 1929 to 2005 into subperiods where the ratio of the market value of equity to national income (MV/NI) was greater than and when it was less than the mean value²⁴ over the sample period. Historically, as the figure illustrates, subsequent to periods when this ratio was high, the realized equity premium was low. A similar result holds when stock valuations are low relative to national income. In this case the subsequent equity premium is high.

²² See the earlier discussion on data.
²³ In Mehra (1998) it is argued that the variation in this ratio is difficult to rationalize in the standard neoclassical framework since, over the same period, after-tax cash flows to equity as a share of national income are fairly constant. Here we do not address this issue and simply utilize the fact that this ratio has varied considerably over time.
²⁴ Mean MV/NI for the period 1929–2005 was 0.91.

[Figure 2: plot by year, 1926–2004; vertical axis: equity risk premium (percent).]

FIGURE 2 Realized equity risk premium per year: 1926–2004.

[Figure 3: plot by year, 1945–2003; vertical axis: average equity risk premium (percent).]

FIGURE 3 Equity risk premium over the 20-year period 1926–2004. (Source: Ibbotson (2006))

[Figure 4: plot by year, 1929–2004; left axis: average equity premium; right axis: ratio of market value to national income.]

FIGURE 4 Market value to national income ratio and average equity premium (average of subperiods when the MV/NI ratio is > or < avg. MV/NI ratio).

Since after-tax corporate profits as a share of national income are fairly constant over time, this translates into the observation that the realized equity premium was low subsequent to periods when the price/earnings ratio is high, and vice versa. This is the basis for the returns predictability literature in finance.

In Figure 5 we have plotted stock market value as a share of national income and the subsequent three-year mean equity premium. This provides further confirmation that, historically, periods of relatively high market valuation have been followed by periods when the equity premium was relatively low.

2. IS THE EQUITY PREMIUM DUE TO A PREMIUM FOR BEARING NON-DIVERSIFIABLE RISK?

Why have stocks been such an attractive investment relative to bonds? Why has the rate of return on stocks been higher than that on relatively risk-free assets? One intuitive answer is that since stocks are “riskier” than bonds, investors require a larger premium for bearing this additional risk; and indeed, the standard deviation of the returns to stocks (about 20 percent per annum historically) is larger than that of the returns to T-bills (about 4 percent per annum), so, obviously, they are considerably more risky than bills! But are they?

[Figure 5: plot by year, 1929–2004; left axis: average equity premium; right axis: ratio of market value to national income.]

FIGURE 5 Market value to national income ratio and average 3-year ahead equity premium (average of subperiods when the MV/NI ratio is > or < avg. MV/NI ratio).

Figures 6 and 7 illustrate the variability of the annual real rate of return on the S&P 500 index and a relatively risk-free security over the period 1889–2005.²⁵

To enhance and deepen our understanding of the risk-return trade-off in the pricing of financial assets, we take a detour into modern asset pricing theory and look at why different assets yield different rates of return. The deus ex machina of this theory is that assets are priced such that, ex-ante, the loss in marginal utility incurred by sacrificing current consumption and buying an asset at a certain price is equal to the expected gain in marginal utility, contingent on the anticipated increase in consumption when the asset pays off in the future.

The operative emphasis here is the incremental loss or gain of utility of consumption and should be differentiated from incremental consumption.
This is because the same amount of consumption may result in different degrees of well-being at different times. As a consequence, assets that pay off when times are good and consumption levels are high (when the marginal utility of consumption is low) are less desirable than those that pay off an equivalent amount when times are bad and additional consumption is more highly valued. Hence, consumption in period t has a different price if times are good than if times are bad.

Let us illustrate this principle in the context of the standard, popular paradigm, the Capital Asset Pricing Model (CAPM). The model postulates a linear relationship between an asset’s “beta,” a measure of systematic risk, and its expected return. Thus, high-beta stocks yield a high expected rate of return. That is because in the CAPM, good times and bad times are captured by the return on the market. The performance of the market, as captured by a broad-based index, acts as a surrogate indicator for the relevant state of the economy.

²⁵ The index did not consist of 500 stocks for the entire period.

[Figure 6: plot by year, 1889–2005; vertical axis: real annual return (%).]

FIGURE 6 Real annual return on S&P 500 Index (%) 1889–2005. Source: Mehra and Prescott (1985). Data updated by the authors.

[Figure 7: plot by year, 1889–2005; vertical axis: real annual return (%).]

FIGURE 7 Real annual return on T-bills (%) 1889–2005. Source: Mehra and Prescott (1985). Data updated by the authors.
A high-beta security tends to pay off more when the market return is high, that is, when times are good and consumption is plentiful. Such a security provides less incremental utility than a security that pays off when consumption is low; it is therefore less valuable and consequently sells for less. Thus, higher-beta assets that pay off in states of low marginal utility will sell for a lower price than similar assets that pay off in states of high marginal utility. Since rates of return are inversely proportional to asset prices, the lower-beta assets will, on average, give a lower rate of return than the former.

Another perspective on asset pricing emphasizes that economic agents prefer to smooth patterns of consumption over time. Assets that pay off a larger amount at times when consumption is already high “destabilize” these patterns of consumption, whereas assets that pay off when consumption levels are low “smooth” out consumption. Naturally, the latter are more valuable and thus require a lower rate of return to induce investors to hold these assets. (Insurance policies are a classic example of assets that smooth consumption. Individuals willingly purchase and hold them, despite their very low rates of return.)

To return to the original question: are stocks that much riskier than T-bills so as to justify a 7 percentage point differential in their rates of return?

What came as a surprise to many economists and researchers in finance was the conclusion of our paper, written in 1979. Stocks and bonds pay off in approximately the same states of nature or economic scenarios and, hence, as argued earlier, they should command approximately the same rate of return. In fact, using standard theory to estimate risk-adjusted returns, we found that stocks on average should command, at most, a 1 percent return premium over bills.
Since, for as long as we had reliable data (about 100 years), the mean premium on stocks over bills was considerably and consistently higher, we realized that we had a puzzle on our hands. It took us six more years to convince a skeptical profession and for our paper “The Equity Premium: A Puzzle” to be published (Mehra and Prescott (1985)).

2.1. Standard Preferences

The neoclassical growth model and its stochastic variants are a central construct in contemporary finance, public finance, and business cycle theory. It has been used extensively by, among others, Abel et al. (1989), Auerbach and Kotlikoff (1987), Becker and Barro (1988), Brock (1979), Cox, Ingersoll, and Ross (1985), Donaldson and Mehra (1984), Kydland and Prescott (1982), Lucas (1978), and Merton (1971). In fact, much of our economic intuition is derived from this model class. A key idea of this framework is that consumption today and consumption in some future period are treated as different goods. Relative prices of these different goods are equal to people’s willingness to substitute between these goods and businesses’ ability to transform these goods into each other.

The model has had some remarkable successes when confronted with empirical data, particularly in the stream of macroeconomic research referred to as Real Business Cycle Theory, where researchers have found that it easily replicates the essential macroeconomic features of the business cycle. See, in particular, Kydland and Prescott (1982). Unfortunately, when confronted with financial market data on stock returns, tests of these models have led, without exception, to their rejection. Perhaps the most striking of these rejections is our 1985 paper.²⁶

To illustrate this we employ a variation of Lucas’ (1978) endowment economy rather than the production economy studied in Prescott and Mehra (1980).
This is an appropriate abstraction to use if it is the equilibrium relation between consumption and asset returns that is being used to estimate the premium for bearing non-diversifiable risk, which is what we were doing. Introducing production would only complicate the selection of the exogenous processes that result in the observed process for consumption.²⁷ To examine the role of other factors for mean asset returns, it would be necessary to introduce other features of reality such as taxes and intermediation costs, as has recently been done.²⁸ If the model had accounted for differences in average asset returns, the next step would have been to use the neoclassical growth model, which has intertemporal transformation opportunities through variations in the rate at which the capital stock is accumulated, to see if this abstraction accounted for the observed large differences in average asset returns.

Since per capita consumption has grown over time, we assume that the growth rate of the endowment follows a Markov process. This is in contrast to the assumption in Lucas’ model that the endowment level follows a Markov process. Our assumption, which requires an extension of competitive equilibrium theory,²⁹ enables us to capture the non-stationarity in the consumption series associated with the large increase in per capita consumption that occurred over the last century.

We consider a frictionless economy that has a single representative “stand-in” household. This unit orders its preferences over random consumption paths by

\[ E_0\left\{ \sum_{t=0}^{\infty} \beta^{t} U(c_t) \right\}, \quad 0 < \beta < 1, \tag{1} \]

where \(c_t\) is the per capita consumption and the parameter \(\beta\) is the subjective time discount factor, which describes how impatient households are to consume. If \(\beta\) is small, people are highly impatient, with a strong preference for consumption now versus consumption in the future.
As modeled, these households live forever, which implicitly means that the utility of parents depends on the utility of their children. In the real world, this is true for some people and not for others. However, economies with both types of people (those who care about their children’s utility and those who do not) have essentially the same implications for asset prices and returns.³⁰

²⁶ The reader is referred to McGrattan and Prescott (2003) and Mehra and Prescott (2007 and 2008) for an alternative perspective.
²⁷ In a production economy, consumption would be endogenously determined, restricting the class of consumption processes that could be considered. See Appendix C.
²⁸ See McGrattan and Prescott (2003) and Mehra and Prescott (2007 and 2008).
²⁹ This is accomplished in Mehra (1988).
³⁰ See Constantinides, Donaldson, and Mehra (2002, 2005). Constantinides et al. (2007) explicitly model bequests.

We use this simple abstraction to build quantitative economic intuition about what the returns on equity and debt should be. \(E_0\{\cdot\}\) is the expectations operator conditional upon information available at time zero (which denotes the present time), and \(U : R_+ \to R\) is the increasing, continuously differentiable concave utility function. We further restrict the utility function to be of the constant relative risk aversion (CRRA) class

\[ U(c, \alpha) = \frac{c^{1-\alpha} - 1}{1 - \alpha}, \quad 0 < \alpha < \infty, \tag{2} \]

where the parameter \(\alpha\) measures the curvature of the utility function. When \(\alpha = 1\), the utility function is defined to be logarithmic, which is the limit of the above representation as \(\alpha\) approaches 1. The feature that makes this the “preference function of choice” in much of the literature in Growth and Real Business Cycle Theory is that it is scale-invariant. This means that a household is neither more nor less likely to accept a gamble if both its wealth and the gamble amount are scaled by a positive factor.
Hence, although the level of aggregate variables such as the capital stock has increased over time, the resulting equilibrium return process is stationary. A second attractive feature is that it is one of only two preference functions that allow for aggregation and a “stand-in” representative agent formulation that is independent of the initial distribution of endowments. One disadvantage of this representation is that it links risk preferences with time preferences. With CRRA preferences, agents who like to smooth consumption across various states of nature also prefer to smooth consumption over time; that is, they dislike growth. Specifically, the coefficient of relative risk aversion is the reciprocal of the elasticity of intertemporal substitution. There is no fundamental economic reason why this must be so. We will revisit this issue in the next chapter, where we examine preference structures that do not impose this restriction.³¹

We assume there is one productive unit, which produces output $y_t$ in period $t$, which is the period dividend. There is one equity share with price $p_t$ that is competitively traded; it is a claim to the stochastic process $\{y_t\}$.

Consider the intertemporal choice problem of a typical investor at time $t$. He equates the loss in utility associated with buying one additional unit of equity to the discounted expected utility of the resulting additional consumption in the next period. To carry over one additional unit of equity, $p_t$ units of the consumption good must be sacrificed, and the resulting loss in utility is $p_t U'(c_t)$. By selling this additional unit of equity in the next period, $p_{t+1} + y_{t+1}$ additional units of the consumption good can be consumed, and $\beta E_t\{(p_{t+1} + y_{t+1}) U'(c_{t+1})\}$ is the discounted expected value of the incremental utility next period. At an optimum, these quantities must be equal. Hence, the fundamental relation that prices assets is

$$p_t U'(c_t) = \beta E_t\{(p_{t+1} + y_{t+1})\, U'(c_{t+1})\}.$$
Versions of this expression can be found in Rubinstein (1976), Lucas (1978), Breeden (1979), and Prescott and Mehra (1980), among others. Excellent textbook treatments can be found in Cochrane (2005), Danthine and Donaldson (2005), Duffie (2001), and LeRoy and Werner (2001).

³¹ See Epstein and Zin (1991) and Weil (1989).

We use it to price both stocks and risk-less one-period bonds. For equity we have

$$1 = \beta E_t\left\{\frac{U'(c_{t+1})}{U'(c_t)}\, R_{e,t+1}\right\}, \tag{3}$$

where

$$R_{e,t+1} = \frac{p_{t+1} + y_{t+1}}{p_t}. \tag{4}$$

For the risk-less one-period bonds, the relevant expression is

$$1 = \beta E_t\left\{\frac{U'(c_{t+1})}{U'(c_t)}\right\} R_{f,t+1}, \tag{5}$$

where the gross rate of return on the riskless asset is by definition

$$R_{f,t+1} = \frac{1}{q_t}, \tag{6}$$

with $q_t$ being the price of the bond. Since $U(c)$ is assumed to be increasing, we can rewrite (3) as

$$1 = \beta E_t\{M_{t+1} R_{e,t+1}\}, \tag{7}$$

where $M_{t+1}$ is a strictly positive stochastic discount factor. This guarantees that the economy will be arbitrage-free and the law of one price holds. A little algebra shows that

$$E_t(R_{e,t+1}) = R_{f,t+1} + \frac{\mathrm{Cov}_t\{-U'(c_{t+1}),\, R_{e,t+1}\}}{E_t\{U'(c_{t+1})\}}. \tag{8}$$

The equity premium $E_t(R_{e,t+1}) - R_{f,t+1}$ thus can be easily computed. Expected asset returns equal the risk-free rate plus a premium for bearing risk, which depends on the covariance of the asset returns with the marginal utility of consumption. Assets that co-vary positively with consumption (that is, they pay off in states when consumption is high and marginal utility is low) command a high premium since these assets “destabilize” consumption.

The question we need to address is the following: is the magnitude of the covariance between the marginal utility of consumption and the return on equity large enough to justify the observed 6 percent equity premium in U.S. equity markets? To address this issue, we make some additional assumptions.
While they are not necessary and were not, in fact, part of our original paper on the equity premium, we include them to facilitate exposition and because they result in closed-form solutions.³²

³² The exposition below is based on Abel (1988), his unpublished notes, and Mehra (2003). See Appendix B for the analysis in our 1985 paper.

These assumptions are

1. the growth rate of consumption $x_{t+1} \equiv c_{t+1}/c_t$ is i.i.d.;
2. the growth rate of dividends $z_{t+1} \equiv y_{t+1}/y_t$ is i.i.d.;
3. $(x_t, z_t)$ are jointly log-normally distributed.

The consequences of these assumptions are that the gross return on equity $R_{e,t}$ (defined above) is i.i.d. and that $(x_t, R_{e,t})$ are jointly log-normal. Substituting $U'(c_t) = c_t^{-\alpha}$ in the fundamental pricing relation³³

$$p_t = \beta E_t\left\{(p_{t+1} + y_{t+1})\, \frac{U'(c_{t+1})}{U'(c_t)}\right\}, \tag{9}$$

we get

$$p_t = \beta E_t\left\{(p_{t+1} + y_{t+1})\, x_{t+1}^{-\alpha}\right\}. \tag{10}$$

³³ In contrast to our approach, which is in the applied general equilibrium tradition, there is another tradition of testing Euler equations (such as Eq. (9)) and rejecting them. Hansen and Singleton (1982) and Grossman and Shiller (1981) exemplify this approach. See Appendix D for an elaboration.

As $p_t$ is homogeneous of degree 1 in $y$, we can represent it as $p_t = w y_t$, and hence $R_{e,t+1}$ can be expressed as

$$R_{e,t+1} = \frac{w + 1}{w} \cdot \frac{y_{t+1}}{y_t} = \frac{w + 1}{w} \cdot z_{t+1}. \tag{11}$$

It is easily shown that

$$w = \frac{\beta E_t\{z_{t+1} x_{t+1}^{-\alpha}\}}{1 - \beta E_t\{z_{t+1} x_{t+1}^{-\alpha}\}}; \tag{12}$$

hence,

$$E_t\{R_{e,t+1}\} = \frac{E_t\{z_{t+1}\}}{\beta E_t\{z_{t+1} x_{t+1}^{-\alpha}\}}. \tag{13}$$

Analogously, the gross return on the riskless asset can be written as

$$R_{f,t+1} = \frac{1}{\beta}\, \frac{1}{E_t\{x_{t+1}^{-\alpha}\}}. \tag{14}$$

Since we have assumed the growth rate of consumption and dividends to be log-normally distributed,

$$E_t\{R_{e,t+1}\} = \frac{e^{\mu_z + \frac{1}{2}\sigma_z^2}}{\beta\, e^{\mu_z - \alpha\mu_x + \frac{1}{2}\left(\sigma_z^2 + \alpha^2\sigma_x^2 - 2\alpha\sigma_{x,z}\right)}} \tag{15}$$

and

$$\ln E_t\{R_{e,t+1}\} = -\ln\beta + \alpha\mu_x - \tfrac{1}{2}\alpha^2\sigma_x^2 + \alpha\sigma_{x,z}, \tag{16}$$

where $\mu_x = E(\ln x)$, $\sigma_x^2 = \mathrm{Var}(\ln x)$, $\sigma_{x,z} = \mathrm{Cov}(\ln x, \ln z)$, and $\ln x$ is the continuously compounded growth rate of consumption.
The other terms involving $z$ and $R_e$ are defined analogously. Similarly,

$$R_f = \frac{1}{\beta\, e^{-\alpha\mu_x + \frac{1}{2}\alpha^2\sigma_x^2}} \tag{17}$$

and

$$\ln R_f = -\ln\beta + \alpha\mu_x - \tfrac{1}{2}\alpha^2\sigma_x^2. \tag{18}$$

$$\therefore\; \ln E\{R_e\} - \ln R_f = \alpha\sigma_{x,z}. \tag{19}$$

From (11) it also follows that

$$\ln E\{R_e\} - \ln R_f = \alpha\sigma_{x,R_e}, \quad \text{where } \sigma_{x,R_e} = \mathrm{Cov}(\ln x, \ln R_e). \tag{20}$$

The (log) equity premium in this model is the product of the coefficient of risk aversion and the covariance of the (continuously compounded) growth rate of consumption with the (continuously compounded) return on equity or the growth rate of dividends. If we impose the equilibrium condition that $x = z$, a consequence of which is the restriction that the return on equity is perfectly correlated with the growth rate of consumption, we get

$$\ln E\{R_e\} - \ln R_f = \alpha\sigma_x^2, \tag{21}$$

and the equity premium then is the product of the coefficient of relative risk aversion and the variance of the growth rate of consumption. As we see ahead, this variance is 0.001369, so unless the coefficient of risk aversion $\alpha$ is large, a high equity premium is impossible. The growth rate of consumption just does not vary enough!

In Mehra and Prescott (1985) we report the following sample statistics for the U.S. economy over the period 1889–1978:

Risk-free rate, $R_f$: 1.0080
Mean return on equity, $E\{R_e\}$: 1.0698
Mean growth rate of consumption, $E\{x\}$: 1.0180
Standard deviation of the growth rate of consumption, $\sigma\{x\}$: 0.0360
Mean equity premium, $E\{R_e\} - R_f$: 0.0618

In our calibration, we are guided by the tenet that model parameters should meet the criteria of cross-model verification: not only must they be consistent with the observations under consideration, but they should not be grossly inconsistent with other observations in growth theory, business cycle theory, labor market behavior, and so on.
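Equations (18) and (21) are simple enough to evaluate directly. The following is a minimal sketch rather than the authors' own computation: the function names are hypothetical, and the log moments μx = 0.0175 and σx² = 0.00123 are the rounded U.S. values the chapter quotes in its discussion of the risk-free rate. It returns the gross risk-free rate and the expected gross return on equity implied by a given pair (α, β).

```python
import math

MU_X = 0.0175    # mean of ln x, as quoted in the chapter
VAR_X = 0.00123  # variance of ln x, as quoted in the chapter

def log_risk_free(alpha, beta, mu_x=MU_X, var_x=VAR_X):
    """ln R_f from Eq. (18)."""
    return -math.log(beta) + alpha * mu_x - 0.5 * alpha**2 * var_x

def log_equity_premium(alpha, var_x=VAR_X):
    """ln E{R_e} - ln R_f from Eq. (21), which imposes x = z."""
    return alpha * var_x

def implied_returns(alpha, beta, mu_x=MU_X, var_x=VAR_X):
    """Return (R_f, E{R_e}) as gross rates."""
    ln_rf = log_risk_free(alpha, beta, mu_x, var_x)
    ln_re = ln_rf + log_equity_premium(alpha, var_x)
    return math.exp(ln_rf), math.exp(ln_re)

# beta = 0.99 is an illustrative choice for the discount factor.
for alpha in (1, 2, 5, 10):
    rf, r_e = implied_returns(alpha, beta=0.99)
    print(f"alpha={alpha:>2}: Rf={rf:.3f}, E[Re]={r_e:.3f}, premium={r_e - rf:.4f}")
```

Raising α increases the premium only through the small term ασx², while the risk-free rate rises much faster through αμx, which is the tension the calibration exercise brings out.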
There is a wealth of evidence from various studies that the coefficient of risk aversion $\alpha$ is a small number, certainly less than 10.³⁴ We can then pose a question: if we set the risk aversion coefficient $\alpha$ to be 10 and $\beta$ to be 0.99, what are the expected rates of return and the risk premia using the parameterization above? Using the expressions derived earlier, we have

$$\ln R_f = -\ln\beta + \alpha\mu_x - \tfrac{1}{2}\alpha^2\sigma_x^2 = 0.124$$

or $R_f = 1.132$, that is, a risk-free rate of 13.2 percent! Since

$$\ln E\{R_e\} = \ln R_f + \alpha\sigma_x^2 = 0.136,$$

we have $E\{R_e\} = 1.146$, or a return on equity of 14.6 percent. This implies an equity risk premium of 1.4 percent, far lower than the 6.18 percent historically observed equity premium. In this calculation we have been liberal in choosing the values for $\alpha$ and $\beta$. Most studies indicate a value for $\alpha$ that is close to 3. If we pick a lower value for $\beta$, the risk-free rate will be even higher and the premium lower. So the 1.4 percent value represents the maximum equity risk premium that can be obtained in this class of models given the constraints on $\alpha$ and $\beta$. Since the observed equity premium is over 6 percent, we have a puzzle on our hands that risk considerations alone cannot account for.

³⁴ A number of these studies are documented in Mehra and Prescott (1985).

2.1.1. The Risk-Free Rate Puzzle

Philippe Weil (1989) has dubbed the high risk-free rate obtained above “the risk-free rate puzzle.” The short-term real rate in the U.S. averages less than 1 percent, while the high value of $\alpha$ required to generate the observed equity premium results in an unacceptably high risk-free rate. The risk-free rate as shown in Eq. (18) can be decomposed into three components:

$$\ln R_f = -\ln\beta + \alpha\mu_x - \tfrac{1}{2}\alpha^2\sigma_x^2.$$

The first term, $-\ln\beta$, is a time preference or impatience term. When $\beta < 1$, it reflects the fact that agents prefer early consumption to later consumption.
Thus, in a world of perfect certainty and no growth in consumption, the unique interest rate in the economy will be $R_f = 1/\beta$.

The second term, $\alpha\mu_x$, arises because of growth in consumption. If consumption is likely to be higher in the future, agents with concave utility would like to borrow against future consumption in order to smooth their lifetime consumption. The greater the curvature of the utility function and the larger the growth rate of consumption, the greater the desire to smooth consumption. In equilibrium, this will lead to a higher interest rate since agents in the aggregate cannot simultaneously increase their current consumption.

The third term, $\tfrac{1}{2}\alpha^2\sigma_x^2$, arises due to a demand for precautionary saving. In a world of uncertainty, agents would like to hedge against future unfavorable consumption realizations by building “buffer stocks” of the consumption good. Hence, in equilibrium, the interest rate must fall to counter this enhanced demand for savings.

Figure 8 plots $\ln R_f = -\ln\beta + \alpha\mu_x - \tfrac{1}{2}\alpha^2\sigma_x^2$ calibrated to the U.S. historical values with $\mu_x = 0.0175$ and $\sigma_x^2 = 0.00123$ for various values of $\beta$. It shows that the precautionary savings effect is negligible for reasonable values of $\alpha$ ($1 < \alpha < 5$). For $\alpha = 3$ and $\beta = 0.99$, $R_f = 1.065$, which implies a risk-free rate of 6.5 percent, much higher than the historical mean rate of 0.8 percent. The economic intuition is straightforward: with consumption growing at 1.8 percent a year with a standard deviation of 3.6 percent, agents with isoelastic preferences have a sufficiently strong desire to borrow in order to smooth consumption that it takes a high interest rate to induce them not to do so.

FIGURE 8 Mean risk-free rate vs. α (curves for β = 0.99, β = 0.96, and β = 0.55).

The late Fischer Black³⁵ proposed that $\alpha = 55$ would solve the puzzle.
Indeed, it can be shown that the 1889–1978 U.S. experience reported above can be reconciled with $\alpha = 48$ and $\beta = 0.55$. To see this, observe that since

$$\sigma_x^2 = \ln\left(1 + \frac{\mathrm{var}(x)}{[E(x)]^2}\right) = 0.00123$$

and

$$\mu_x = \ln E(x) - \tfrac{1}{2}\sigma_x^2 = 0.0175,$$

this implies

$$\alpha = \frac{\ln E(R) - \ln R_f}{\sigma_x^2} = 48.4.$$

Since

$$\ln\beta = -\ln R_f + \alpha\mu_x - \tfrac{1}{2}\alpha^2\sigma_x^2 = -0.60,$$

this implies $\beta = 0.55$.

Besides postulating an unacceptably high $\alpha$, another problem is that this is a “knife-edge” solution. No other set of parameters will work, and a small change in $\alpha$ will lead to an unacceptable risk-free rate, as shown in Figure 8. An alternate approach is to experiment with negative time preferences; however, there seems to be no empirical evidence that agents do have such preferences.³⁶

Figure 8 shows that for extremely high $\alpha$, the precautionary savings term dominates and results in a “low” risk-free rate.³⁷ However, then a small change in the growth rate of consumption will have a large impact on interest rates. This is inconsistent with a cross-country comparison of real risk-free rates and their observed variability. For example, throughout the 1980s, South Korea had a much higher growth rate than the U.S., but real rates were not appreciably higher. Nor does the risk-free rate vary considerably over time, as would be expected if $\alpha$ were large. In Section 3 we show how alternative preference structures can help resolve the risk-free rate puzzle.

³⁵ Private communication, 1981.
³⁶ In a model with growth, equilibrium can exist with β > 1. See Mehra (1988) for the restrictions on the parameters α and β for equilibrium to exist.
³⁷ Kandel and Stambaugh (1991) have suggested this approach.

FIGURE 9 Equity premium (percent) vs. α (curves for serial correlations of −0.14, 0.00, and 0.45).

2.1.2. The Effect of Serial Correlation in the Growth Rate of Consumption

The preceding analysis has assumed that the growth rate of consumption is i.i.d. over time.
However, for the sample period 1889–2004 the serial correlation of consumption growth is slightly negative (−0.135), while for the sample period 1930–2004 the value is 0.45. The effect of this non-zero serial correlation on the equity premium can be analyzed using the framework in Appendix B. Figure 9 shows the effect of changes in the risk aversion parameter on the equity premium for different serial correlations.³⁸ When the serial correlation of consumption is positive, the equity premium actually declines with increasing risk aversion, thus further exacerbating the equity premium puzzle.³⁹

³⁸ In addition, see Figure 2B in Appendix B.
³⁹ See Azeredo (2007) for a detailed discussion.

An alternative perspective on the puzzle is provided by Hansen and Jagannathan (1991). The fundamental pricing equation can be written as

$$E_t(R_{e,t+1}) = R_{f,t+1} - \frac{\mathrm{Cov}_t\{M_{t+1},\, R_{e,t+1}\}}{E_t(M_{t+1})}. \tag{22}$$

This expression also holds unconditionally, so that

$$E(R_{e,t+1}) = R_{f,t+1} - \frac{\sigma(M_{t+1})\,\sigma(R_{e,t+1})\,\rho_{R,M}}{E(M_{t+1})} \tag{23}$$

or

$$\frac{E(R_{e,t+1}) - R_{f,t+1}}{\sigma(R_{e,t+1})} = -\frac{\sigma(M_{t+1})\,\rho_{R,M}}{E(M_{t+1})}, \tag{24}$$

and since $-1 \le \rho_{R,M} \le 1$,

$$\frac{E(R_{e,t+1}) - R_{f,t+1}}{\sigma(R_{e,t+1})} \le \frac{\sigma(M_{t+1})}{E(M_{t+1})}. \tag{25}$$

This inequality is referred to as the Hansen–Jagannathan lower bound on the pricing kernel. For the U.S. economy, the Sharpe ratio, $[E(R_{e,t+1}) - R_{f,t+1}]/\sigma(R_{e,t+1})$, can be calculated to be 0.37. Since $E(M_{t+1})$ is the expected price of a one-period risk-free bond, its value must be close to 1. In fact, for the parameterization discussed earlier, $E(M_{t+1}) = 0.96$ when $\alpha = 2$. This implies that the lower bound on the standard deviation for the pricing kernel must be close to 0.3 if the Hansen–Jagannathan bound is to be satisfied. However, when this is calculated in the Mehra–Prescott framework, we obtain an estimate for $\sigma(M_{t+1}) = 0.002$, which is off by more than an order of magnitude.
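The arithmetic behind the bound in (25) can be reproduced in a few lines. This is a sketch under stated assumptions, not the calculation used in the chapter: it takes the pricing kernel to be M = βx^(−α) with log-normal x (so that E(M) is the bond price), uses β = 0.99 together with the rounded log moments μx = 0.0175 and σx² = 0.00123, and the function names are hypothetical. It covers only the bound itself, not the model-implied σ(M) reported above.

```python
import math

def expected_sdf(alpha, beta, mu_x, var_x):
    """E(M) for M = beta * x**(-alpha) with ln x ~ N(mu_x, var_x) (assumed form)."""
    return beta * math.exp(-alpha * mu_x + 0.5 * alpha**2 * var_x)

def hj_lower_bound(sharpe_ratio, mean_sdf):
    """Smallest sigma(M) compatible with inequality (25)."""
    return sharpe_ratio * mean_sdf

e_m = expected_sdf(alpha=2, beta=0.99, mu_x=0.0175, var_x=0.00123)
bound = hj_lower_bound(sharpe_ratio=0.37, mean_sdf=e_m)
print(f"E(M) = {e_m:.2f}; sigma(M) must be at least {bound:.2f}")
```

Because E(M) is close to 1, the required volatility of the pricing kernel is essentially the Sharpe ratio itself, which is why a smooth consumption series has such difficulty meeting the bound.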
We would like to emphasize that the equity premium puzzle is a quantitative puzzle; standard theory is consistent with our notion of risk that, on average, stocks should return more than bonds. The puzzle arises from the fact that the quantitative predictions of theory are an order of magnitude different from what has been historically documented. The puzzle cannot be dismissed lightly, since much of our economic intuition is based on the very class of models that fall short so dramatically when confronted with financial data. It underscores the failure of paradigms central to financial and economic modeling to capture the characteristic that appears to make stocks comparatively so risky. Hence, the viability of using this class of models for any quantitative assessment, say, for instance, to gauge the welfare implications of alternative stabilization policies, is thrown open to question.

For this reason, over the last 20 years or so, attempts to resolve the puzzle have become a major research impetus in finance and economics. Several generalizations of key features of the Mehra and Prescott (1985) model have been proposed to better reconcile observations with theory. These include alternative assumptions on preferences,⁴⁰ modified probability distributions to admit rare but disastrous events,⁴¹ survival bias,⁴² incomplete markets,⁴³ and market imperfections.⁴⁴ They also include attempts at modeling limited participation of consumers in the stock market⁴⁵ and problems of temporal aggregation.⁴⁶ We examine some of the research efforts to resolve the puzzle⁴⁷ in the next two chapters.

⁴⁰ For example, Abel (1990), Bansal and Yaron (2004), Benartzi and Thaler (1995), Boldrin, Christiano, and Fisher (2001), Campbell and Cochrane (1999), Constantinides (1990), Epstein and Zin (1991), and Ferson and Constantinides (1991).
⁴¹ See Rietz (1988) and Mehra and Prescott (1988).

References

Abel, A. B.
Stock prices under time varying dividend risk: An exact solution in an infinite horizon general equilibrium model. Journal of Monetary Economics 22 (1988): 375–394. Abel, A. B. Asset prices under habit formation and catching up with the Joneses. American Economic Review, Papers and Proceedings 80 (1990): 38–42. Abel, A. B., N. G. Mankiw, L. H. Summers, and R. J. Zeckhauser. Assessing dynamic efficiency: Theory and evidence. Review of Economic Studies 56 (1989): 1–20. Aiyagari, S. R., and M. Gertler. Asset returns with transactions costs and uninsured individual risk. Journal of Monetary Economics 27 (1991): 311–331. Alvarez, F., and U. Jermann. Asset pricing when risk sharing is limited by default. Econometrica 68 (2000): 775–797. Attanasio, O. P., J. Banks, and S. Tanner. Asset holding and consumption volatility. Journal of Political Economy 110(4) (2002): 771–792. Auerbach, A. J., and L. J. Kotlikoff. Dynamic Fiscal Policy. Cambridge University Press, Cambridge (1987). Azeredo, F. Essays on aggregate economics and finance. Doctoral dissertation, University of California, Santa Barbara (2007). Bansal, R., and J. W. Coleman. A monetary explanation of the equity premium, term premium and risk-free rate puzzles. Journal of Political Economy 104 (1996): 1135–1171. Bansal, R., and A. Yaron. Risks for the long run: A potential resolution of asset pricing puzzles. Journal of Finance 59 (2004): 1481–1509. Basak, S., and D. Cuoco. An equilibrium model with restricted stock market participation. The Review of Financial Studies 11 (1998): 309–341. Becker, G. S., and R. J. Barro. A reformulation of the economic theory of fertility. Quarterly Journal of Economics 103(1) (1988): 1–25. Benartzi, S., and R. H. Thaler. Myopic loss aversion and the equity premium puzzle. Quarterly Journal of Economics 110 (1995): 73–92. Bewley, T. F. Thoughts on tests of the intertemporal asset pricing model. Working paper, Northwestern University (1982). Billingsley, P. Probability and Measure.
John Wiley and Sons, New York (1995). Boldrin, M., L. J. Christiano, and J. D. M. Fisher. Habit persistence, asset returns, and the business cycle. American Economic Review 91 (2001): 149–166.

⁴² See Brown, Goetzmann, and Ross (1995).
⁴³ For example, Bewley (1982), Brav, Constantinides, and Geczy (2002), Constantinides and Duffie (1996), Heaton and Lucas (1997, 2000), Lucas (1994), Mankiw (1986), Mehra and Prescott (1985), Storesletten, Telmer, and Yaron (2004), and Telmer (1993).
⁴⁴ For example, Aiyagari and Gertler (1991), Alvarez and Jermann (2000), Bansal and Coleman (1996), Basak and Cuoco (1998), Constantinides, Donaldson, and Mehra (2002), Danthine, Donaldson, and Mehra (1992), Daniel and Marshall (1997), He and Modest (1995), Heaton and Lucas (1996), Luttmer (1996), McGrattan and Prescott (2000), Sethi (1997), and Storesletten, Telmer, and Yaron (2006).
⁴⁵ Attanasio, Banks, and Tanner (2002), Brav, Constantinides, and Geczy (2002), Mankiw and Zeldes (1991), and Vissing-Jorgensen (2002).
⁴⁶ Gabaix and Laibson (2001), Heaton (1995), and Lynch (1996).
⁴⁷ The reader is also referred to the excellent surveys by Kocherlakota (1996), Cochrane (1997), and Campbell (1999, 2001).

Brav, A., G. M. Constantinides, and C. C. Geczy. Asset pricing with heterogeneous consumers and limited participation: Empirical evidence. Journal of Political Economy 110 (2002): 793–824. Breeden, D. An intertemporal asset pricing model with stochastic consumption and investment opportunities. Journal of Financial Economics 7 (1979): 265–296. Brock, W. A. An integration of stochastic growth theory and the theory of finance, Part 1: The growth model, in J. Green, and J. Scheinkman, eds., General Equilibrium, Growth & Trade. Academic Press, New York (1979). Brown, S., W. Goetzmann, and S. Ross. Survival. Journal of Finance 50 (1995): 853–873. Campbell, J. Y. Asset prices, consumption, and the business cycle. Chapter 19 in J. B. Taylor, and M. Woodford, eds.
Handbook of Macroeconomics 1. North–Holland, Amsterdam, (1999): 1231–1303. Campbell, J. Y. Asset pricing at the millennium. Journal of Finance 55 (2001): 1515–1567. Campbell, J. Y., and J. H. Cochrane. By force of habit: A consumption–based explanation of aggregate stock market behavior. Journal of Political Economy 107 (1999): 205–251. Cochrane, J. H. Where is the market going? Uncertain facts and novel theories. Economic Perspectives 21 (1997): 3–37. Cochrane, J. H. Asset Pricing. Princeton University Press, Princeton, NJ (2005). Constantinides, G. M. Habit formation: A resolution of the equity premium puzzle. Journal of Political Economy 98 (1990): 519–543. Constantinides, G. M., J. B. Donaldson, and R. Mehra. Junior can’t borrow: A new perspective on the equity premium puzzle. Quarterly Journal of Economics 118 (2002): 269–296. Constantinides, G. M., J. B. Donaldson, and R. Mehra. Junior must pay: Pricing the implicit put in privatizing Social Security. Annals of Finance 1 (2005): 1–34. Constantinides, G. M., J. B. Donaldson, and R. Mehra. Junior is rich: Bequests as consumption. Economic Theory 32 (2007): 125–155. Constantinides, G. M., and D. Duffie. Asset pricing with heterogeneous consumers. Journal of Political Economy 104 (1996): 219–240. Cowles, A., and Associates. Common Stock Indexes, 2d ed. Cowles Commission Monograph no. 3. Principia Press, Bloomington, IN (1939). Cox, J. C., J. E. Ingersoll, Jr., and S. A. Ross. A theory of the term structure of interest rates. Econometrica 53 (1985): 385–407. Daniel, K., and D. Marshall. The equity premium puzzle and the risk-free rate puzzle at long horizons. Macroeconomic Dynamics 1 (1997): 452–484. Danthine, J.-P., J. B. Donaldson, and R. Mehra. The equity premium and the allocation of income risk. Journal of Economic Dynamics and Control 16 (1992): 509–532. Danthine, J.-P., and J. B. Donaldson. Intermediate Financial Theory. Prentice Hall, Upper Saddle River, NJ (2005). Debreu, G. 
Valuation equilibrium and Pareto optimum. Proceedings of the National Academy of Sciences 40 (1954): 588–592. Dimson, E., P. Marsh, and M. Staunton. Triumph of the Optimists: 101 Years of Global Investment Returns. Princeton University Press, Princeton, NJ (2002). Donaldson, J. B., and R. Mehra. Comparative dynamics of an equilibrium intertemporal asset pricing model. Review of Economic Studies 51 (1984): 491–508. Duffie, D. Dynamic Asset Pricing Theory, 3rd ed. Princeton University Press, Princeton, NJ (2001). Epstein, L. G., and S. E. Zin. Substitution, risk aversion, and the temporal behavior of consumption and asset returns: An empirical analysis. Journal of Political Economy 99 (1991): 263–286. Ferson, W. E., and G. M. Constantinides. Habit persistence and durability in aggregate consumption. Journal of Financial Economics 29 (1991): 199–240. Gabaix, X., and D. Laibson. The 6D bias and the equity premium puzzle, in B. Bernanke, and K. Rogoff, eds., NBER Macroeconomics Annual 2001. MIT Press, Cambridge, MA (2001). Grossman, S. J., and R. J. Shiller. The determinants of the variability of stock market prices. American Economic Review 71 (1981): 222–227. Hansen, L. P., and R. Jagannathan. Implications of security market data for models of dynamic economies. Journal of Political Economy 99 (1991): 225–262. Hansen, L. P., and K. J. Singleton. Generalized instrumental variables estimation of nonlinear rational expectations models. Econometrica 50 (1982): 1269–1288. Hansen, L. P., and K. J. Singleton. Stochastic consumption, risk aversion and the intertemporal behavior of asset returns. Journal of Political Economy 91 (1983): 249–268. He, H., and D. M. Modest. Market frictions and consumption-based asset pricing. Journal of Political Economy 103 (1995): 94–117. Heaton, J. An empirical investigation of asset pricing with temporally dependent preference specifications. Econometrica 63 (1995): 681–717. Heaton, J., and D. J.
Lucas. Evaluating the effects of incomplete markets on risk sharing and asset pricing. Journal of Political Economy 104 (1996): 443–487. Heaton, J., and D. J. Lucas. Market frictions, savings behavior and portfolio choice. Journal of Macroeconomic Dynamics 1 (1997): 76–101. Heaton, J. C., and D. J. Lucas. Portfolio choice and asset prices: The importance of entrepreneurial risk. Journal of Finance 55 (2000). Homer, S. A History of Interest Rates. Rutgers University Press, New Brunswick, NJ (1963). Ibbotson Associates. Stocks, bonds, bills and inflation. 2005 Yearbook. Ibbotson Associates, Chicago (2006). Kandel, S., and R. F. Stambaugh. Asset returns and intertemporal preferences. Journal of Monetary Economics 27 (1991): 39–71. Kendrick, J. W. Productivity trends in the United States, NBER 71 (1961). Kocherlakota, N. R. The equity premium: It’s still a puzzle. Journal of Economic Literature 34 (1996): 42–71. Kuznets, S. S. Commodity Flow and Capital Formation, NBER 34, New York (1938). Kuznets, S. S. National Product since 1869, National Bureau of Economic Research Number 46, New York (1946). Kuznets, S. S. Capital in the American economy: Its formation and financing studies in capital formation and Financing, NBER (1961). Kydland, F., and E. C. Prescott. Time to build and aggregate fluctuations. Econometrica 50 (1982): 1345– 1371. LeRoy, S. H., and J. Werner. Principles of Financial Economics. Cambridge University Press, New York (2001). Litterman, R. B. Bayesian procedure for forecasting with vector auto-regressions. Working paper, MIT (1980). Lucas, D. J. Asset pricing with undiversifiable risk and short sales constraints: Deepening the equity premium puzzle. Journal of Monetary Economics 34 (1994): 325–341. Lucas, R. E., Jr. Asset prices in an exchange economy. Econometrica 46 (1978): 1429–1445. Luttmer, E. G. J. Asset pricing in economies with frictions. Econometrica 64 (1996): 1439–1467. Lynch, A. W. 
Decision frequency and synchronization across agents: Implications for aggregate consumption and equity returns. Journal of Finance 51 (1996): 1479–1497. Macaulay, F. R. The movements of interest rates, bond yields and stock prices in the U.S. since 1856. National Bureau of Economic Research, New York (1938). Mankiw, N. G. The equity premium and the concentration of aggregate shocks. Journal of Financial Economics 17 (1986): 211–219. Mankiw, N. G., and S. P. Zeldes. The consumption of stockholders and nonstockholders. Journal of Financial Economics 29 (1991): 97–112. McGrattan, E. R., and E. C. Prescott. Is the stock market overvalued? Federal Reserve Bank of Minneapolis Quarterly Review (2000). McGrattan, E. R., and E. C. Prescott. Average debt and equity returns: Puzzling? American Economic Review 93 (2003): 392–397. McGrattan, E. R., and E. C. Prescott. Taxes, regulations, and the value of U.S. and U.K. corporations. Review of Economic Studies 72 (2005): 767–796. Mehra, R. On the existence and representation of equilibrium in an economy with growth and nonstationary consumption. International Economic Review 29 (1988): 131–135. Mehra, R. On the volatility of stock prices: An exercise in quantitative theory. International Journal of Systems Science 29 (1998): 1203–1211. Mehra, R. The equity premium: Why is it a puzzle? Financial Analysts Journal (2003): 54–69. Mehra, R. The equity premium in India. Oxford Companion to Economics in India, B. Kaushik, ed. Oxford University Press, New York (2007). Mehra, R., and E. C. Prescott. The equity premium: A puzzle. Journal of Monetary Economics 15 (1985): 145–161. Mehra, R., and E. C. Prescott. The equity premium: A solution? Journal of Monetary Economics 22 (1988): 133–136. Mehra, R., and E. C. Prescott. The equity premium in retrospect. In Handbook of the Economics of Finance, G. M. Constantinides, M. Harris, and R. Stulz, eds. North-Holland, Amsterdam (2003). Mehra, R., and E. C.
Prescott. Intermediated quantities and returns. Working paper, UCSB (2007). Mehra, R., and E. C. Prescott. Non-risk based explanations of the equity premium. Forthcoming in Handbook of the Equity Risk Premium, R. Mehra, ed. Amsterdam (2008). Merton, R. C. Optimum consumption and portfolio rules in a continuous time model. Journal of Economic Theory 3 (1971): 373–413. Mossin, J. Equilibrium in a capital asset market. Econometrica 34 (1966): 768–783. Prescott, E. C., and R. Mehra. Recursive competitive equilibrium: The case of homogeneous households. Econometrica 48 (1980): 1365–1379. Rietz, T. A. The equity risk premium: A solution. Journal of Monetary Economics 22 (1988): 117–131. Romer, C. D. Changes in business cycles: Evidence and explanations. Journal of Economic Perspectives 13 (1999): 23–44. Rubinstein, M. The valuation of uncertain income streams and the pricing of options. Bell Journal of Economics 7 (1976): 407–425. Schwert, G. W. Indexes of U.S. stock prices from 1802 to 1987. Journal of Business 63 (1990): 399–426. Sethi, S. P. Optimal consumption and investment with bankruptcy. Kluwer Academic Publishers, Norwell, MA (1997). Shaw, W. H. The value of commodity output since 1869. NBER, 48 (1947). Shiller, R. J. Comovements in stock prices and comovements in dividends. Journal of Finance 44 (1989): 719–729. Siegel, J. Stocks for the Long Run, 3rd ed. Irwin, New York (2002). Smith, W. B., and A. H. Cole. Fluctuations in American business, 1790–1860. Harvard University Press, Cambridge, MA (1935). Storesletten, K., C. I. Telmer, and A. Yaron. Asset pricing with idiosyncratic risk and overlapping generations. Working paper. Carnegie Mellon University (2006). Review of Economic Dynamics, forthcoming. Storesletten, K., C. I. Telmer, and A. Yaron. Consumption and risk sharing over the life cycle. Journal of Monetary Economics 51(3) (2004): 609–633. Telmer, C. I. Asset-pricing puzzles and incomplete markets. Journal of Finance 48 (1993): 1803–1832. Vissing-Jorgensen, A.
Limited asset market participation and the elasticity of intertemporal substitution. Journal of Political Economy, forthcoming. Weil, P. The equity premium puzzle and the risk-free rate puzzle. Journal of Monetary Economics 24 (1989): 401–421.

APPENDIX A

Suppose that returns are independently and identically distributed period by period. Then, as the number of periods tends to infinity, the future value of the investment, computed at the arithmetic average of returns, tends to the expected value of the investment with probability 1. To see this, let

$$V_T = \prod_{t=1}^{T} (1 + r_t),$$

where $r_t$ is the asset return in period $t$ and $V_T$ is the terminal value of \$1 at time $T$. Then

$$E(V_T) = E\left\{\prod_{t=1}^{T} (1 + r_t)\right\}.$$

Since the $r_t$’s are assumed to be uncorrelated, we have

$$E(V_T) = \prod_{t=1}^{T} E(1 + r_t)$$

or

$$E(V_T) = \prod_{t=1}^{T} \left(1 + E(r_t)\right).$$

Let the arithmetic average be $AA = \frac{1}{T}\sum_{t=1}^{T} r_t$. Then, by the strong law of large numbers (Theorem 22.1, Billingsley (1995)),

$$E(V_T) \to \prod_{t=1}^{T} (1 + AA) \quad \text{as } T \to \infty$$

or

$$E(V_T) \to (1 + AA)^T$$

as the number of periods $T$ becomes large.

APPENDIX B

The Original Analysis of the Equity Premium Puzzle

In this appendix we present our original analysis of the equity premium puzzle. Needless to say, it draws heavily from Mehra and Prescott (1985).

The Economy, Asset Prices and Returns

We employ a variation of Lucas’ (1978) pure exchange model. Since per capita consumption has grown over time, we assume that the growth rate of the endowment follows a Markov process. This is in contrast to the assumption in Lucas’ model that the endowment level follows a Markov process. Our assumption, which requires an extension of competitive equilibrium theory, enables us to capture the non-stationarity in the consumption series associated with the large increase in per capita consumption that occurred in the 1889–1978 period.
The economy we consider was judiciously selected so that the joint process governing the growth rates in aggregate per capita consumption and asset prices would be stationary and easily determined. The economy has a single representative “stand-in” household. This unit orders its preferences over random consumption paths by

$$E_0\left\{\sum_{t=0}^{\infty} \beta^t U(c_t)\right\}, \quad 0 < \beta < 1, \tag{1B}$$

where $c_t$ is per capita consumption, $\beta$ is the subjective time discount factor, $E_0\{\cdot\}$ is the expectation operator conditional upon information available at time zero (which denotes the present time), and $U: R_+ \to R$ is the increasing concave utility function. To ensure that the equilibrium return process is stationary, we further restrict the utility function to be of the constant relative risk aversion (CRRA) class

$$U(c, \alpha) = \frac{c^{1-\alpha} - 1}{1 - \alpha}, \quad 0 < \alpha < \infty. \tag{2B}$$

The parameter $\alpha$ measures the curvature of the utility function. When $\alpha$ is equal to one, the utility function is defined to be the logarithmic function, which is the limit of the above function as $\alpha$ approaches one.

We assume there is one productive unit that produces output $y_t$ in period $t$, which is the period dividend. There is one equity share with price $p_t$ that is competitively traded; it is a claim to the stochastic process $\{y_t\}$. The growth rate in $y_t$ is subject to a Markov chain; that is,

$$y_{t+1} = x_{t+1} y_t, \tag{3B}$$

where $x_{t+1} \in \{\lambda_1, \ldots, \lambda_n\}$ is the growth rate, and

$$\Pr\{x_{t+1} = \lambda_j \mid x_t = \lambda_i\} = \phi_{ij}. \tag{4B}$$

It is also assumed that the Markov chain is ergodic. The $\lambda_i$ are all positive and $y_0 > 0$. The random variable $y_t$ is observed at the beginning of the period, at which time dividend payments are made. All securities are traded ex-dividend. We also assume that the matrix $A$ with elements $a_{ij} \equiv \beta \phi_{ij} \lambda_j^{1-\alpha}$ for $i, j = 1, \ldots, n$ is stable; that is, $\lim_{m \to \infty} A^m$ is zero. In Mehra (1988) it is shown that this is necessary and sufficient for the expected utility to exist if the stand-in household consumes $y_t$ every period. The paper also defines and establishes the existence of a Debreu (1954) competitive equilibrium with a price system having a dot product representation under this condition.

Next we formulate expressions for the equilibrium time $t$ price of the equity share and the risk-free bill. We follow the convention of pricing securities ex-dividend or ex-interest payments at time $t$, in terms of the time $t$ consumption good. For any security with process $\{d_s\}$ on payments, its price in period $t$ is

$$P_t = E_t\left\{\sum_{s=t+1}^{\infty} \beta^{s-t} \frac{U'(y_s)\, d_s}{U'(y_t)}\right\}, \tag{5B}$$

as the equilibrium consumption is the process $\{y_s\}$ and the equilibrium price system has a dot product representation.

The dividend payment process for the equity share in this economy is $\{y_s\}$. Consequently, using the fact that $U'(c) = c^{-\alpha}$,

$$P_t^e = P^e(x_t, y_t) = E\left\{\sum_{s=t+1}^{\infty} \beta^{s-t} \frac{y_t^{\alpha}}{y_s^{\alpha}}\, y_s \,\Big|\, x_t, y_t\right\}. \tag{6B}$$

The variables $x_t$ and $y_t$ are sufficient relative to the entire history of shocks up to, and including, time $t$ for predicting the subsequent evolution of the economy. They thus constitute legitimate state variables for the model. Since $y_s = y_t x_{t+1} \cdots x_s$, the price of the equity security is homogeneous of degree one in $y_t$, which is the current endowment of the consumption good. As the equilibrium values of the economies being studied are time-invariant functions of the state $(x_t, y_t)$, the subscript $t$ can be dropped. This is accomplished by redefining the state to be the pair $(c, i)$, if $y_t = c$ and $x_t = \lambda_i$. With this convention, the price of the equity share from (6B) satisfies

$$p^e(c, i) = \beta \sum_{j=1}^{n} \phi_{ij} (\lambda_j c)^{-\alpha} \left[p^e(\lambda_j c, j) + \lambda_j c\right] c^{\alpha}. \tag{7B}$$

Using the result that $p^e(c, i)$ is homogeneous of degree one in $c$, we represent this function as

$$p^e(c, i) = w_i c, \tag{8B}$$

where $w_i$ is a constant. Making this substitution in (7B) and dividing by $c$ yields

$$w_i = \beta \sum_{j=1}^{n} \phi_{ij} \lambda_j^{1-\alpha} (w_j + 1) \quad \text{for } i = 1, \ldots, n. \tag{9B}$$

This is a system of $n$ linear equations in $n$ unknowns.
The assumption that guaranteed the existence of equilibrium guarantees the existence of a unique positive solution to this system.

The period return if the current state is $(c, i)$ and next period state $(\lambda_j c, j)$ is

$$r_{ij}^e = \frac{p^e(\lambda_j c, j) + \lambda_j c - p^e(c, i)}{p^e(c, i)} = \frac{\lambda_j (w_j + 1)}{w_i} - 1. \tag{10B}$$

The equity’s expected period return if the current state is $i$ is

$$R_i^e = \sum_{j=1}^{n} \phi_{ij} r_{ij}^e. \tag{11B}$$

Capital letters are used to denote the expected return. With the subscript $i$, it is the expected return conditional upon the current state being $(c, i)$. Without this subscript, it is the expected return with respect to the stationary distribution. The superscript indicates the type of security.

The other security considered is the one-period real bill or riskless asset, which pays one unit of the consumption good next period with certainty. From (6B),

$$p_i^f = p^f(c, i) = \beta \sum_{j=1}^{n} \phi_{ij} \frac{U'(\lambda_j c)}{U'(c)} = \beta \sum_{j=1}^{n} \phi_{ij} \lambda_j^{-\alpha}. \tag{12B}$$

The certain return on this riskless security is

$$R_i^f = \frac{1}{p_i^f} - 1 \tag{13B}$$

when the current state is $(c, i)$.

As mentioned earlier, the statistics that are probably most robust to the modeling specification are the means over time. Let $\pi \in R^n$ be the vector of stationary probabilities on $i$. This exists because the chain on $i$ has been assumed to be ergodic. The vector $\pi$ is the solution to the system of equations

$$\pi = \phi^T \pi, \quad \text{with } \sum_{i=1}^{n} \pi_i = 1 \text{ and } \phi^T = \{\phi_{ji}\}.$$

The expected returns on the equity and the risk-free security are, respectively,

$$R^e = \sum_{i=1}^{n} \pi_i R_i^e \quad \text{and} \quad R^f = \sum_{i=1}^{n} \pi_i R_i^f. \tag{14B}$$

Time sample averages will converge in probability to these values given the ergodicity of the Markov chain. The risk premium for equity is $R^e - R^f$, a parameter that is used in the test. The parameters defining preferences are $\alpha$ and $\beta$, while the parameters defining technology are the elements of $[\phi_{ij}]$ and $[\lambda_i]$.
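To make the chain of calculations in (9B) through (14B) concrete, the following Python sketch solves the linear system for the price-dividend constants and then averages the state-contingent returns over the stationary distribution. It is only an illustration under stated assumptions: the function name `average_returns`, its argument names, and the use of NumPy are our own choices for exposition and are not part of the original analysis.

```python
import numpy as np

def average_returns(lam, phi, alpha, beta):
    """Illustrative helper (hypothetical name): returns (Re, Rf, Re - Rf).

    lam   : growth rates lambda_1..lambda_n
    phi   : n x n transition matrix, phi[i, j] = Pr(next = j | current = i)
    alpha : curvature of CRRA utility
    beta  : subjective discount factor
    """
    lam = np.asarray(lam, dtype=float)
    phi = np.asarray(phi, dtype=float)
    n = lam.size

    # a_ij = beta * phi_ij * lambda_j^(1 - alpha); A must be stable.
    A = beta * phi * lam ** (1.0 - alpha)
    if np.max(np.abs(np.linalg.eigvals(A))) >= 1.0:
        raise ValueError("A is not stable, so expected utility does not exist")

    # (9B): w = A (w + 1), i.e. (I - A) w = A 1.
    w = np.linalg.solve(np.eye(n) - A, A @ np.ones(n))

    # (10B): r_ij = lambda_j (w_j + 1) / w_i - 1.
    r = (lam * (w + 1.0))[None, :] / w[:, None] - 1.0
    # (11B): expected equity return conditional on state i.
    Re_i = (phi * r).sum(axis=1)

    # (12B)-(13B): price of and return on the one-period bill.
    pf_i = (beta * phi) @ lam ** (-alpha)
    Rf_i = 1.0 / pf_i - 1.0

    # Stationary distribution: pi = phi^T pi, normalized to sum to one.
    evals, evecs = np.linalg.eig(phi.T)
    pi = np.real(evecs[:, np.argmin(np.abs(evals - 1.0))])
    pi = pi / pi.sum()

    # (14B): unconditional expected returns.
    Re = float(pi @ Re_i)
    Rf = float(pi @ Rf_i)
    return Re, Rf, Re - Rf


# Example call with a two-state chain and log utility (alpha = 1).
Re, Rf, premium = average_returns(
    lam=[1.054, 0.982],
    phi=[[0.43, 0.57], [0.57, 0.43]],
    alpha=1.0,
    beta=0.99,
)
print(Re, Rf, premium)
```

With $\alpha = 1$ the matrix $A$ reduces to $\beta \phi$, so every $w_i$ equals $\beta / (1 - \beta)$ and the average equity return is the mean growth rate divided by $\beta$, minus one. This gives a simple check on the linear solve.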
Our approach is to assume two states for the Markov chain and to restrict the process as follows:

$$\lambda_1 = 1 + \mu + \delta, \quad \lambda_2 = 1 + \mu - \delta, \quad \phi_{11} = \phi_{22} = \phi, \quad \phi_{12} = \phi_{21} = (1 - \phi).$$

The parameters $\mu$, $\phi$, and $\delta$ now define the technology. We require $\delta > 0$ and $0 < \phi < 1$. This particular parameterization was selected because it permitted us to independently vary the average growth rate of output by changing $\mu$, the variability of consumption by altering $\delta$, and the serial correlation of growth rates by adjusting $\phi$.

The parameters were selected so that the average growth rate of per capita consumption, the standard deviation of the growth rate of per capita consumption, and the first-order serial correlation of this growth rate, all with respect to the model’s stationary distribution, matched the sample values for the U.S. economy between 1889 and 1978. The sample values for the U.S. economy were 0.018, 0.036, and −0.14, respectively. The resulting parameter values were $\mu = 0.018$, $\delta = 0.036$, and $\phi = 0.43$. Given these values, the nature of the test is to search for parameters $\alpha$ and $\beta$ for which the model’s averaged risk-free rate and equity risk premium match those observed for the U.S. economy over this 90-year period.

The parameter $\alpha$, which measures people’s willingness to substitute consumption between successive yearly time periods, is an important one in many fields of economics. As mentioned in the text, there is a wealth of evidence from various studies that the coefficient of risk aversion $\alpha$ is a small number, certainly less than 10. A number of these studies are documented in Mehra and Prescott (1985). This is an important restriction, for with a large $\alpha$ virtually any pair of average equity and risk-free returns can be obtained by making small changes in the process on consumption.
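The mapping from the three sample moments to $(\mu, \delta, \phi)$ can be checked directly. For the symmetric two-state chain the stationary distribution puts probability one half on each state, so the mean net growth rate is $\mu$, the standard deviation is $\delta$, and the first-order serial correlation is $2\phi - 1$. The short Python sketch below verifies this numerically; the variable names are illustrative only and NumPy is assumed.

```python
import numpy as np

# Calibrated values reported in the text.
mu, delta, phi = 0.018, 0.036, 0.43

lam = np.array([1 + mu + delta, 1 + mu - delta])   # gross growth rates
P = np.array([[phi, 1 - phi], [1 - phi, phi]])      # transition matrix
pi = np.array([0.5, 0.5])                           # stationary distribution of a symmetric chain

g = lam - 1.0                                       # net growth rates
mean_g = pi @ g
var_g = pi @ (g - mean_g) ** 2
std_g = np.sqrt(var_g)

# First-order autocovariance: E[g_t g_{t+1}] - mean^2 under the stationary distribution.
cov1 = pi @ (g * (P @ g)) - mean_g ** 2
autocorr = cov1 / var_g

print(mean_g, std_g, autocorr)   # 0.018, 0.036 and -0.14, up to floating-point rounding
```

Because the serial correlation equals $2\phi - 1$, the sample value of −0.14 pins down $\phi = 0.43$ without reference to the other two moments, which is the sense in which the three parameters can be varied independently.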
Given the estimated process on consumption, Figure 1B depicts the set of values of the average risk-free rate and equity risk premium, which are both consistent with the model and result in average real risk-free rates between zero and four percent. These are values that can be obtained by varying preference parameters $\alpha$ between 0 and 10 and $\beta$ between 0 and 1. The observed real return of 0.80 percent and equity premium of 6 percent are clearly inconsistent with the predictions of the model. The largest premium obtainable with the model is 0.35 percent, which is not close to the observed value.

An advantage of our approach is that we can easily test the sensitivity of our results to such distributional assumptions. With $\alpha$ less than 10, we found that our results were essentially unchanged for very different consumption processes, provided that the mean and variances of growth rates equaled the historically observed values. We use this fact in motivating the discussion in the text.

[FIGURE 1B Set of admissible average equity risk premia and real returns. Vertical axis: average risk premia, $R^e - R^f$ (percent). Horizontal axis: average risk-free rate, $R^f$ (percent). The plotted area is labeled “Admissible Region.”]

[FIGURE 2B Set of admissible average equity risk premia and real returns. Vertical axis: equity premium (percent). Horizontal axis: risk-free rate (percent). Regions are shown for serial correlation −0.14 and serial correlation 0.45.]

As mentioned earlier in the text, the serial correlation of the growth rate of consumption for the period 1930–2004 is 0.45. Figure 2B shows the resulting feasible region in this case (we have also included the region from Figure 1B for comparison). The conclusion that the premium for bearing non-diversifiable aggregate risk is small remains unchanged.
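The admissible regions in Figures 1B and 2B can be approximated by a simple grid search over the preference parameters. The Python sketch below is one way to do this and is not the original computation: the function names, the grid spacing, and the lower bound on $\beta$ are our own assumptions, and the second call simply changes $\phi$ to $(1 + 0.45)/2 = 0.725$ while holding $\mu$ and $\delta$ at the values reported above.

```python
import numpy as np

MU, DELTA = 0.018, 0.036   # calibrated mean and standard deviation of growth

def two_state_returns(alpha, beta, phi):
    """Average (Re, Rf) for the symmetric two-state economy, or None if A is unstable."""
    lam = np.array([1 + MU + DELTA, 1 + MU - DELTA])
    P = np.array([[phi, 1 - phi], [1 - phi, phi]])
    A = beta * P * lam ** (1.0 - alpha)
    if np.max(np.abs(np.linalg.eigvals(A))) >= 1.0:
        return None
    w = np.linalg.solve(np.eye(2) - A, A.sum(axis=1))      # (9B)
    r = (lam * (w + 1.0))[None, :] / w[:, None] - 1.0       # (10B)
    Re_i = (P * r).sum(axis=1)                              # (11B)
    Rf_i = 1.0 / ((beta * P) @ lam ** (-alpha)) - 1.0       # (12B)-(13B)
    pi = np.array([0.5, 0.5])                               # symmetric chain
    return float(pi @ Re_i), float(pi @ Rf_i)               # (14B)

def max_admissible_premium(phi, alphas=None, betas=None):
    """Largest Re - Rf on the grid subject to 0 <= Rf <= 4 percent."""
    # Grid choices are illustrative assumptions, not the original procedure.
    alphas = np.linspace(0.2, 10.0, 50) if alphas is None else alphas
    betas = np.linspace(0.94, 0.999, 60) if betas is None else betas
    premia = []
    for alpha in alphas:
        for beta in betas:
            out = two_state_returns(alpha, beta, phi)
            if out is None:
                continue
            Re, Rf = out
            if 0.0 <= Rf <= 0.04:
                premia.append(Re - Rf)
    return max(premia)

print(max_admissible_premium(0.43))    # serial correlation -0.14
print(max_admissible_premium(0.725))   # serial correlation 0.45
```

The grid only covers $\beta$ close to one because smaller discount factors push the average risk-free rate well above four percent, so they fall outside the region being traced. A finer grid would approach the boundary of the region more closely at the cost of more evaluations.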
APPENDIX C

Expanding the set of technologies in a pure exchange, Arrow–Debreu economy to admit capital accumulation and production as in Brock (1979), Prescott and Mehra (1980), or Donaldson and Mehra (1984) does not increase the set of joint equilibrium processes on consumption and asset prices. Since the set of equilibria in a production economy is a subset of those in an exchange economy, it follows immediately that if the equity premium cannot be accounted for in an exchange economy, modifying the technology to incorporate production will not alter this conclusion.48 To see this, let $\theta$ denote preferences, $\tau$ technologies, $E$ the set of the exogenous processes on the aggregate consumption good, $P$ the set of technologies with production opportunities, and $m(\theta, \tau)$ the set of equilibria for economy $(\theta, \tau)$.

Theorem

$$\bigcup_{\tau \in E} m(\theta, \tau) \supset \bigcup_{\tau \in P} m(\theta, \tau)$$

Proof. For $\theta_0 \in \theta$ and $\tau_0 \in P$, let $(a_0, c_0)$ be a joint equilibrium process on asset prices and consumption. A necessary condition for equilibrium is that the asset prices $a_0$ be consistent with $c_0$, the optimal consumption for the household with preferences $\theta_0$. Thus, if $(a_0, c_0)$ is an equilibrium, then

$$a_0 = g(c_0, \theta_0),$$

where $g$ is defined by the first-order necessary conditions for household maximization. This functional relation must hold for all equilibria, regardless of whether they are for a pure exchange or a production economy.

Let $(a_0, c_0)$ be an equilibrium for some economy $(\theta_0, \tau_0)$ with $\tau_0 \in P$. Consider the pure exchange economy with $\theta_1 = \theta_0$ and $\tau_1 = c_0$. Our contention is that $(a_0, c_0)$ is a joint equilibrium process for asset prices and consumption for the pure exchange economy $(\theta_1, \tau_1)$. For all pure exchange economies, the equilibrium consumption process is $\tau$, so $c_1 = \tau_1 = c_0$, given that more is preferred to less. If $c_0$ is the equilibrium process, the corresponding asset price must be $g(c_0, \theta_1)$. But $\theta_1 = \theta_0$, so $g(c_0, \theta_1) = g(c_0, \theta_0) = a_0$.
Hence, $a_0$ is the equilibrium for the pure exchange economy $(\theta_1, \tau_1)$, proving the theorem.

48The discussion below is based on Mehra (1998).

APPENDIX D

Estimating the Equity Risk Premium Versus Estimating the Risk Aversion Parameter

Estimating or measuring the relative risk aversion parameter using statistical tools is very different from estimating the equity risk premium. Mehra and Prescott (1985), as discussed earlier, use an extension of Lucas’ (1978) asset pricing model to estimate how much of the historical difference in yields on Treasury bills and corporate equity is a premium for bearing aggregate risk. Crucial to their analysis is their use of micro observations to restrict the value of the risk aversion parameter. They did not estimate either the risk aversion parameter or the discount rate parameters. Mehra and Prescott (1985) reject extreme risk aversion based upon observations on individual behavior. These observations include the small size of premia for jobs with uncertain income and the limited amount of insurance against idiosyncratic income risk. Another observation is that people with limited access to capital markets make investments in human capital that result in very uneven consumption over time.

A sharp estimate for the magnitude of the risk aversion parameter comes from macroeconomics. The evidence is that the basic growth model, when restricted to be consistent with the growth facts, generates business cycle fluctuations if and only if this risk aversion parameter is near zero. (This corresponds to the log case in standard usage.) The point is that the risk aversion parameter comes up in a wide variety of observations at both the household and the aggregate level and is not found to be large.
For all values of the risk aversion coefficient less than 10, which is an upper bound number for this parameter, Mehra and Prescott find that a premium for bearing aggregate risk accounts for little of the historic equity premium. This finding has stood the test of time.

Another tradition is to use consumption and stock market data to estimate the relative risk aversion parameter and the discount factor parameter. This is what Grossman and Shiller report they did in their American Economic Review Papers and Proceedings article (1981, p. 226). In a paper in which they develop “a method for estimating nonlinear rational expectations models directly from stochastic Euler equations,” Hansen and Singleton illustrate their methods by estimating the risk aversion parameter and the discount factor using stock dividend consumption prices (1982, p. 1269).

What the work of Grossman and Shiller (Ibid.) and Hansen and Singleton (1982, 1983) establishes is that using consumption and stock market data and assuming frictionless capital markets is a bad way to estimate the risk aversion and discount factor parameters. It is analogous to estimating the force of gravity near the earth’s surface by dropping a feather from the top of the Leaning Tower of Pisa, under the assumption that friction is zero.

A tradition related to statistical estimation is to statistically test whether the stochastic Euler equation arising from the stand-in household’s intertemporal optimization holds. Both Grossman and Shiller (1983) and Hansen and Singleton (1982) reject this relation. The fact that this relation is inconsistent with the U.S. time-series data is no reason to conclude that the model economy used by Mehra and Prescott to estimate how much of the historical equity premium is a premium for bearing aggregate risk is not a good one for that purpose.
Returning to the analogy from physics, it would be silly to reject Newtonian mechanics as a useful tool for drawing scientific inference because the distance traveled by the feather did not satisfy $\frac{1}{2} g t^2$.