Theory of economic growth
Slides made by: Kåre Bævre, Department of Economics, University of Oslo
Slight adjustment: Miroslav Hloušek, ESF MU

5 Cross-country studies and human capital

Required reading: Mankiw, Romer and Weil (1992), Klenow and Rodriguez-Clare (1997), BSiM: 1.2.10-1.2.11, 10.1-10.2, 10.5
Secondary reading: Young (1995), Bils and Klenow (2000), Hsieh (1999), Pritchett (2001)

5.1 Growth econometrics: Why and how?

• We desire theories that are able to teach us something of relevance to the real world (e.g. the questions in section 1.2.3).
• Remember: We are addressing large and complex questions. Should we expect our theories to be true? What other criteria should we then use?
• We can adopt several perspectives when confronting empirical data:
  1. Testing theories (Is it 'true'?)
  2. Assessing the explanatory power of the theory (How much do we explain?)
  3. Checking stylized facts
  4. Exploring data for regularities suggesting new theories
  5. Calibrating our models
• There are good reasons to be eclectic when it comes to the choice of methods.

5.2 Levels regressions

• A fundamental question in studies of growth is: Why are some countries so rich, and some so poor?
• It is therefore natural to investigate empirically how well we can explain variation in income levels across countries by different explanatory variables.
• A common framework for such analysis is the so-called levels regression
$$\ln\big((Y/L)_i\big) = a + b_1 x_{1,i} + b_2 x_{2,i} + \dots + u_i \tag{1}$$
where $x_{j,i}$ are explanatory variables for country $i$, and $u_i$ is an error term.
• From the production function we know that $Y/L$ depends on $K/L$ and $H/L$, or in the Cobb-Douglas case:
$$\ln(Y/L) = \alpha \ln(K/L) + \eta \ln(H/L)$$
However, we cannot readily enter the capital intensities as explanatory variables ($x_{j,i}$) in (1) because they are endogenous (that is, themselves dependent upon $Y/L$).
• Estimation of an equation like (1) will give biased estimates unless we are able to control for this problem.
From econometrics we know that this can be achieved by use of instrumental variables. However, suitable instruments are hard to come by.
• Levels regressions are therefore usually formulated somewhat differently, based on the assumption that all countries are in their steady state.
• This is obviously a crude approximation, but it is still useful for putting the model to a first test.
• We remember that the augmented Solow model tells us that the savings rates ($s_k$, $s_h$) and population growth ($n$) affect the steady-state level of GDP per capita, $(Y/L)^*$. Or, as we have seen:
$$\ln\big((Y(t)/L(t))^*\big) = \ln T(t) + \frac{\alpha}{1-\alpha-\eta}\ln(s_k) + \frac{\eta}{1-\alpha-\eta}\ln(s_h) - \frac{\alpha+\eta}{1-\alpha-\eta}\ln(n+x+\delta) \tag{2}$$
• By assuming that the observed measure $(Y/L)_i$ corresponds to the steady-state level for country $i$, we can therefore motivate the regression equation
$$\ln\big((Y/L)_i\big) = a + b_1 \ln(s_{k,i}) + b_2 \ln(s_{h,i}) - b_3 \ln(n_i + x + \delta) + u_i \tag{3}$$
• Note that we do not include country-specific data on depreciation ($\delta$), since these are too hard to measure or not available, and probably do not vary that much across countries anyway.
• The approach adopted by MRW effectively ignores differences in technology between countries. This is a deliberate choice, based on the following three considerations:
  (i) Since technology is exogenous and unexplained in the Solow model, they want to see how far they can get without including technology as an explanatory variable.
  (ii) The perspective they adopt insists that knowledge flows quite freely across countries. This should result in the absence of large and persistent technology differences across countries.
  (iii) It is not possible to readily observe $x$ and $T(t)$ for individual countries.
• MRW therefore use average values of $x + \delta$ (like 0.02 + 0.03 = 0.05); the exact levels are not very important for the results.
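As an illustration of what equation (2) implies, the following minimal Python sketch evaluates the steady-state log income for two hypothetical countries that differ only in $s_k$. All numerical values (the savings rates, $n$, $T = 1$, and $\alpha = \eta = 1/3$) are assumptions chosen for illustration, not estimates from MRW.

```python
import math

def steady_state_log_income(T, s_k, s_h, n, x=0.02, delta=0.03,
                            alpha=1/3, eta=1/3):
    """Log steady-state output per worker, following equation (2)."""
    denom = 1.0 - alpha - eta
    return (math.log(T)
            + alpha / denom * math.log(s_k)
            + eta / denom * math.log(s_h)
            - (alpha + eta) / denom * math.log(n + x + delta))

# Two hypothetical countries that differ only in the physical-capital
# savings rate (illustrative numbers, not data).
low = steady_state_log_income(T=1.0, s_k=0.10, s_h=0.05, n=0.02)
high = steady_state_log_income(T=1.0, s_k=0.20, s_h=0.05, n=0.02)

income_ratio = math.exp(high - low)
print(f"Doubling s_k multiplies steady-state income by {income_ratio:.2f}")
```

With $\alpha = \eta = 1/3$ the elasticity $\alpha/(1-\alpha-\eta)$ equals 1, so doubling $s_k$ doubles steady-state income per worker. In the textbook model ($\eta = 0$) the same elasticity would be only $\alpha/(1-\alpha) = 1/2$.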
• Notice, however, that the use of a common intercept $a$ for all countries amounts to assuming that
$$\ln T_i(t) = a + \varepsilon_i, \tag{4}$$
where $a = \ln T(t)$ is the average (log) level of technology at time $t$, and $\varepsilon_i$ is a random country-specific deviation from this level that is included in the total error term $u_i$ for this country.
• In econometric terminology the theoretical equation (2) is a structural equation, with equation (3) as its reduced form.
• From theory (the structural equation) we are led to believe that
$$b_1 = \frac{\alpha}{1-\alpha-\eta} \tag{5}$$
$$b_2 = \frac{\eta}{1-\alpha-\eta} \tag{6}$$
$$b_3 = \frac{\alpha+\eta}{1-\alpha-\eta} \tag{7}$$
which in turn implies
$$b_1 + b_2 = b_3 \tag{8}$$
which is a testable restriction on the reduced-form parameters $b_1, b_2, b_3$.
• Imposing condition (8), we are left with two independent equations from which we can solve for the structural parameters $\alpha$ and $\eta$ based on the estimated values of the reduced-form parameters (MRW call these the 'implied' $\alpha$ and $\beta$; their $\beta$ corresponds to our $\eta$).
• Though this framework allows for a nice way of exploring the relationship between theory (the augmented Solow model) and the data, there are important problems:
  – What if deviations from the steady state are important?
  – Are the right-hand-side variables exogenous? Or do $s_k$, $s_h$ and $n$ also depend on $Y/L$?
  – Are $s_k$, $s_h$ and $n$ independent of the country-specific level of technology (i.e. $T_i(t)$)? If not, our inclusion of $\varepsilon_i$ in the error term $u_i$ makes the error term correlated with the explanatory variables. It is well known from econometrics that this is an important source of bias in the estimates of $b_1, b_2, b_3$ (omitted variable bias).
  – It is hard to include additional explanatory variables (e.g. political variables) in levels regressions because they are likely to be endogenous to the income level.

5.3 A broader base for the accumulated factor saves the Solow model? MRW (1992)

5.3.1 The textbook Solow model

• In MRW, Sect. II, the authors conduct a levels regression based on the textbook Solow model (i.e.
with only accumulation of physical capital).
• Their results (Table I) are qualitatively in accordance with what we expect from theory. Coefficients have the predicted signs and are significant (in two of three samples). The restriction of equal magnitude and opposite sign of the coefficients is not rejected.
• However:
  1. The explanatory power is modest.
  2. The implied value of $\alpha$ is much higher than what we would expect from estimates of this elasticity based on capital's share of income.
• This is yet another indication that we should reconsider the role of capital, and also include human capital.
• Note that an alternative way of formulating (2) is
$$\ln\big((Y(t)/L(t))^*\big) = \ln T(0) + xt + \frac{\alpha}{1-\alpha}\ln(s_k) - \frac{\alpha}{1-\alpha}\ln(n+x+\delta) + \frac{\eta}{1-\alpha}\ln(h^*) \tag{9}$$
• If (9) is the true model, it is thus as if they have neglected the term with $h^*$ when they estimate the textbook model. Since $h^*$ is probably positively correlated with $s_k$, this introduces a bias. More specifically: the estimate of $b_1 = \frac{\alpha}{1-\alpha}$ will be biased upwards (why?), and hence we will get too high an implied value of $\alpha$.

5.3.2 The augmented Solow model

• In Section III the authors conduct a levels regression based on the augmented Solow model.
• They prefer to estimate (2) rather than (9). Implicitly they are arguing that $s_h$ is easier to measure than $h^*$. Another argument is that $s_h$ is probably less likely to suffer from endogeneity problems.
• MRW use the secondary-school enrollment rate as a proxy for $s_h$.
• Their results are impressive. Main results:
  1. The human capital measure is significant in all three samples.
  2. The coefficient on physical capital investment is reduced.
  3. The model explains much of the cross-country income differences in the data ($R^2 = 0.77$ in the intermediate sample).
  4. The values of $\alpha$ and $\beta$ implied by the reduced-form coefficients (MRW's $\beta$ is the human capital share, $\eta$ in our notation) are both roughly 1/3, which is close to what we should expect from other considerations.
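The step from reduced-form coefficients to 'implied' structural parameters can be sketched in a few lines of Python. The sketch inverts $b_1 = \alpha/(1-\alpha-\eta)$ and $b_2 = \eta/(1-\alpha-\eta)$, using the fact that $1 + b_1 + b_2 = 1/(1-\alpha-\eta)$. The function names and the coefficient values $b_1 = b_2 = 1$ are hypothetical and chosen only because they reproduce shares of 1/3; they are not MRW's estimates.

```python
def implied_shares(b1, b2):
    """Invert b1 = alpha/(1-alpha-eta) and b2 = eta/(1-alpha-eta).

    Since 1 + b1 + b2 = 1/(1-alpha-eta), the shares follow directly.
    """
    scale = 1.0 + b1 + b2
    return b1 / scale, b2 / scale

def restriction_gap(b1, b2, b3):
    """Deviation from the restriction b1 + b2 = b3 (zero if it holds exactly)."""
    return b1 + b2 - b3

# Illustrative (hypothetical) reduced-form coefficients.
alpha, eta = implied_shares(1.0, 1.0)
print(f"implied alpha = {alpha:.3f}, implied eta = {eta:.3f}")
print("restriction gap:", restriction_gap(1.0, 1.0, 2.0))
```

In an actual application the restriction would be tested statistically (taking sampling error into account) rather than checked for exact equality.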
• That the implied value of $\beta$ makes sense can be seen from a crude guesstimate:
$$\beta \approx (1-\alpha)\left(1 - \frac{\text{minimum wage}}{\text{average wage in manufacturing}}\right) \in (1/3,\, 1/2)$$
• All in all, the results in MRW are very good news for the Solow model. It is in accordance with the data, and capital accumulation seems to go a long way towards answering our question about why some countries are rich and some are poor.
• Quite impressive for such a simple model!
• MRW was a major contribution to a 'neo-classical' revival.

5.4 Education and the measurement of human capital

• The conclusion of MRW has been significantly modified by later research.
• The criticism basically goes along two lines:
  1. Methodological shortcomings: i) endogeneity, ii) neglecting differences in $T_i$.
  2. Their measure of $s_h$ is poor.
• Few remedies for the first set of problems have been suggested within the framework of levels regressions. We will have more to say on these issues when we turn to growth regressions.
• We therefore focus on problems connected to measuring human capital.
• We follow Klenow and Rodriguez-Clare (1997).
• Their approach is somewhat different. They do not want to estimate $\alpha$ and $\eta$, due to the methodological problems and likely biases. Instead they set out to use 'independent' evidence on these parameter values, and based on these calculate the contributions of the inputs directly. I.e. with
$$\frac{Y}{L} = A \left(\frac{K}{Y}\right)^{\alpha/(1-\alpha-\eta)} \left(\frac{H}{Y}\right)^{\eta/(1-\alpha-\eta)} = AX \tag{10}$$
they measure the contributions $\left(\frac{K}{Y}\right)^{\alpha/(1-\alpha-\eta)}$ and $\left(\frac{H}{Y}\right)^{\eta/(1-\alpha-\eta)}$ directly. They use $X = \left(\frac{K}{Y}\right)^{\alpha/(1-\alpha-\eta)} \left(\frac{H}{Y}\right)^{\eta/(1-\alpha-\eta)}$ to denote the total contribution from accumulated factors. The remainder, $A$, is attributed to the level of technology.
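A minimal sketch of the split in equation (10) may help fix ideas. It computes $X$ from given capital-output ratios and backs out $A$ as the remainder. The function names and all numbers below (the ratios $K/Y = 3$, $H/Y = 2$, output per worker of 30, and $\alpha = \eta = 1/3$) are hypothetical values for illustration, not figures from Klenow and Rodriguez-Clare.

```python
def factor_contribution(k_over_y, h_over_y, alpha, eta):
    """X in equation (10): the combined contribution of accumulated factors."""
    denom = 1.0 - alpha - eta
    return k_over_y ** (alpha / denom) * h_over_y ** (eta / denom)

def technology_residual(y_per_worker, X):
    """A in equation (10): whatever part of Y/L is not accounted for by X."""
    return y_per_worker / X

# Hypothetical country: illustrative numbers only.
alpha, eta = 1/3, 1/3
X = factor_contribution(k_over_y=3.0, h_over_y=2.0, alpha=alpha, eta=eta)
A = technology_residual(y_per_worker=30.0, X=X)
print(f"X = {X:.2f}, A = {A:.2f}")
```

Note that $A$ is a residual here: any mismeasurement of $K/Y$ or $H/Y$, or a wrong choice of $\alpha$ and $\eta$, ends up in it.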
• Instead of running regressions they use a method called variance decomposition:
$$1 = \frac{\operatorname{var}(\ln(Y/L))}{\operatorname{var}(\ln(Y/L))} = \frac{\operatorname{cov}(\ln(Y/L), \ln(Y/L))}{\operatorname{var}(\ln(Y/L))} = \frac{\operatorname{cov}(\ln(Y/L), \ln X) + \operatorname{cov}(\ln(Y/L), \ln A)}{\operatorname{var}(\ln(Y/L))}$$
• The reported covariance shares are equal to the slope coefficients from univariate regressions of $\ln X$ and $\ln A$, respectively, on $\ln(Y/L)$.
• Since $\operatorname{cov}(\ln X, \ln A) = 0$ by construction in MRW, their decomposition is unique and equal to $(R^2,\ 1 - R^2)$.
• Methodological differences aside, the main point of the article is to show how sensitive the MRW results are to modifications of how the role of human capital is measured.
• Their first major finding is that if we also include primary and tertiary enrollment rates, human capital seems to play a much smaller role (MRW3/4 in Table 1).
• In particular, primary enrollment rates vary much less between countries. MRW's focus on only secondary enrollment rates can be misleading since it exaggerates the true variation in the explanatory variables.
• A more fundamental difference in methodology is their use of estimates based on micro-evidence on the returns to schooling. They exploit estimates from Mincer (1974) wage regressions.
• Mincer regressions:
$$\ln(w_s) = \ln(w_0) + r s$$
where $w_s$ is the wage of a person with $s$ years of schooling. Microeconomic studies also control for a lot of other factors, so $r$ is in principle the partial effect of schooling when all else is equal.
• In general, this functional form fits the data well. The estimates of $r$ are surprisingly stable around 0.1 across different studies. We can use this as a rule of thumb/first approximation. (More precisely: $r = 0.095$.)
• Note that the rate of return, $r$, is fixed, so each additional year raises the wage by the same proportion. In absolute terms an additional year of higher education is therefore more valuable than an additional year of lower education. Thus we put different weights on the contributions of the different types of education.
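The point about the fixed return $r$ can be checked with a small Python sketch of the Mincer wage profile. The base wage $w_0 = 1$ and the comparison between the 1st and the 13th year of schooling are arbitrary illustrative choices; $r = 0.1$ is the rule-of-thumb value mentioned above.

```python
import math

def mincer_wage(w0, r, s):
    """Wage implied by the Mincer equation ln(w_s) = ln(w_0) + r*s."""
    return w0 * math.exp(r * s)

r = 0.1    # rule-of-thumb return to schooling from the text
w0 = 1.0   # base wage, normalised to 1 (an arbitrary choice)

# The proportional gain from one more year is the same at every level of s.
pct_gain = math.exp(r) - 1.0

# The absolute gain grows with the level of schooling already attained.
gain_first_year = mincer_wage(w0, r, 1) - mincer_wage(w0, r, 0)
gain_13th_year = mincer_wage(w0, r, 13) - mincer_wage(w0, r, 12)

print(f"proportional gain per year: {pct_gain:.3f}")
print(f"absolute gain, 1st year:  {gain_first_year:.3f}")
print(f"absolute gain, 13th year: {gain_13th_year:.3f}")
```

The ratio of the two absolute gains is $e^{12r}$, which is why a year of higher education receives a larger weight than a year of primary education when human capital is constructed this way.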
• Based on measures constructed in this way, they find results which once again modify the original MRW results (see Table 2). However, the effects may not be as drastic as those suggested by their MRW4 (or BK4) estimation.
• Based on these new results, it appears that MRW's results attribute too much to the role of factor accumulation.
• As a conclusion we can say that the results indicate that both factor accumulation and technological differences are important (roughly 50/50).

5.5 Why is education privately profitable but with weak macro-effects?

• An $r = 0.1$ in Mincer regressions estimated on micro-data suggests substantial private returns to education.
• Pritchett (2001) also estimates human capital based on Mincer regressions.
• His estimates of the growth-growth equation
$$\gamma_y = a + b_1 \gamma_k + b_2 \gamma_h$$
yield a negative $b_2$, i.e. a negative effect of increased aggregate education levels on growth in aggregate production.
• The results appear somewhat extreme, but strongly suggest a conflict between the private profitability of education and weak (if any) effects on aggregate production.
• Why is this so?
  1. Substantial differences between countries.
  2. How is the human capital put to use? Piracy vs. chemical engineering. The role of private opportunities.
  3. Education has failed and is non-productive.

5.6 Does schooling cause growth or the other way around?

• MRW's finding of a positive correlation between enrollment rates and output remains.
• Can this be due to reverse causality?
• Higher expected growth can induce more schooling because it puts more weight on how an individual values future human capital relative to the current (alternative) costs of education.
• This is the intuition in Bils and Klenow (2000) (not a required paper).
5.7 Growth accounting and development accounting

5.7.1 Growth accounting

• Consider a general production function
$$Y = F(T, K, L)$$
• Simple differentiation with respect to time gives
$$\frac{\dot Y}{Y} = \frac{F_T T}{Y}\cdot\frac{\dot T}{T} + \frac{F_K K}{Y}\cdot\frac{\dot K}{K} + \frac{F_L L}{Y}\cdot\frac{\dot L}{L}$$
• The different terms can be seen as capturing the contributions of the growth in the inputs $T, K, L$ to growth in $Y$.
• Here, $\dot Y/Y$, $\dot K/K$, and $\dot L/L$ are directly observable (which is not to say that they are always perfectly measured). In addition, $F_K K/Y$ and $F_L L/Y$ can be set equal to the respective factor shares, $s_K$ and $s_L$.
• This leaves us with
$$g = \frac{F_T T}{Y}\cdot\frac{\dot T}{T}$$
as the unobserved contribution to growth from technological progress.
• We can estimate $g$ as the so-called Solow residual
$$\tilde g = \frac{\dot Y}{Y} - s_K \frac{\dot K}{K} - s_L \frac{\dot L}{L}$$
• We also often refer to the estimate $\tilde g$ as growth of Total Factor Productivity (TFP).
• Note that the estimate is an accounting residual, and we cannot guarantee that TFP growth is due to technological progress as we think of it in the model (i.e. $\dot T/T = x$). Anything we are not able to explain from changes in $K$ and $L$ will fall in this category (ranging from the effects of a war to some types of measurement errors).
• One should therefore be careful not to read too much causality into growth accounting. (See also BSiM 10.5.)
• Note that growth accounting differs from the regression approaches since we use estimates of factor shares taken from separate sources instead of estimating these as elasticities.
• The challenge for getting good growth accounting estimates is to measure changes in inputs carefully. One would like to capture both quantitative and qualitative changes.
• Note in particular that changes in human capital are usually entered as changes in the quality of labor (educational attainment, age composition etc.).
• In practice it is often convenient to work with a so-called trans-log production function (e.g. Young (1995)), which is basically a generalization of the Cobb-Douglas.
The main advantage is that it makes it easy to work with disaggregation of inputs into sub-categories.
• Growth accounts usually find that a substantial part of growth is due to factor accumulation, but also that TFP growth is substantial.
• Particular attention has been devoted to the East Asian growth miracles.
• The most famous study is that by Young (1995). His estimates attribute the high growth rates to exceptionally strong factor accumulation. TFP growth in these countries is at levels comparable to other countries.
• Young's results were a main contribution to the neo-classical revival.

5.7.2 Dual growth accounting

• Under the standard assumptions we have
$$Y = RK + wL$$
• From this relationship, differentiation gives us
$$\frac{\dot Y}{Y} = s_K\left(\frac{\dot R}{R} + \frac{\dot K}{K}\right) + s_L\left(\frac{\dot w}{w} + \frac{\dot L}{L}\right)$$
• Hence we get the dual formulation of the estimate of TFP growth:
$$\tilde g = \frac{\dot Y}{Y} - s_K \frac{\dot K}{K} - s_L \frac{\dot L}{L} = s_K \frac{\dot R}{R} + s_L \frac{\dot w}{w}$$
That is, TFP growth is estimated from changes in factor prices.
• The most important aspect of this approach is that it allows us to estimate TFP growth based on alternative data, which are perhaps easier to measure and more reliable. At the least it provides us with an important check of estimates based on the primary approach.
• Hsieh (1999) used the dual approach to redo Young's (1995) analysis of East Asia.
• The most important difference is for Singapore. He argues that this is because the national accounts overstate capital growth, while rental prices have been fairly stable.

5.7.3 Development accounting

• Let the production function be Cobb-Douglas:
$$Y = T K^{\alpha} H^{\eta} L^{1-\alpha-\eta}$$
• In this special case, the (primary) growth account is based on looking at the differentiated form
$$\frac{\dot Y}{Y} = \frac{\dot T}{T} + \alpha \frac{\dot K}{K} + \eta \frac{\dot H}{H} + (1-\alpha-\eta)\frac{\dot L}{L}$$
• The growth account looks at changes in a given country over time.
• It is also worth considering development accounting. I.e.
for countries $i$ and $j$ we should have
$$\frac{Y_i}{Y_j} = \frac{T_i}{T_j}\left(\frac{K_i}{K_j}\right)^{\alpha}\left(\frac{H_i}{H_j}\right)^{\eta}\left(\frac{L_i}{L_j}\right)^{1-\alpha-\eta}$$
• Based on this relationship one uses different criteria to judge how much of the differences across countries in $Y$ (i.e. the left-hand side) is due to differences in TFP (i.e. $T$) or in inputs ($K$, $H$ and $L$).
• Different authors have used different criteria. The main thing to notice, however, is that this approach is based on calibration, using estimates of $\alpha$ and $\eta$ derived from independent sources. I.e. the parameters are not estimated from the data under consideration, as is done in the cross-country regressions.
• Klenow and Rodriguez-Clare (1997) use a variant of development accounting.

6 Convergence and growth regressions

Required reading: BSiM: 11 (11.6-11.9 cursory reading), 12 (12.5 cursory)
Secondary reading: Islam (1995), Jones (1997)

6.1 Convergence to the steady state

• A fundamental property of the Solow model is convergence to a steady state, characterized by the fundamental equation
$$\frac{\dot{\hat k}}{\hat k} = \frac{s f(\hat k)}{\hat k} - (n + x + \delta) \tag{11}$$
• Along the transition to the steady state, growth is stronger the further we are from the steady state (growth is positive if we are below, negative if we are above).
• This is easily seen from the graph.
• More formally, we can easily show (BSiM 1.30) that in the Cobb-Douglas case (11) can be reformulated as
$$\frac{\dot{\hat k}}{\hat k} = (n + x + \delta)\left[\left(\frac{\hat k}{\hat k^*}\right)^{\alpha-1} - 1\right]$$
• To characterize the convergence it is, however, often more convenient to work instead with the linearization
$$\frac{\dot{\hat k}}{\hat k} = \beta(\ln \hat k^* - \ln \hat k)$$
where
$$\beta = (1-\alpha)(n + x + \delta) \tag{12}$$
is the speed of convergence.
• We also know that the dynamics of $\hat k$ translate directly to those of $\hat y$, so we get
$$\frac{\dot{\hat y}}{\hat y} = \beta(\ln \hat y^* - \ln \hat y) \tag{13}$$
• The parameter $\beta$ indicates how rapidly output per effective worker, $\hat y$, approaches its steady-state value, $\hat y^*$, in the neighborhood of the steady state. If $\beta = 0.05$, about 5 per cent of the gap between $\ln \hat y$ and $\ln \hat y^*$ vanishes in one year. (Half-life $= \ln 2 / \beta \approx 14$ years; see the rule of 70.)
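The half-life quoted above follows from the solution of (13), under which the log gap shrinks as $e^{-\beta t}$. The short Python sketch below computes the speed of convergence from (12) and the implied half-life. The parameter values $\alpha = 1/3$ and $n = 0.01$ are assumptions for illustration; $x + \delta = 0.05$ follows the average value used earlier in these notes.

```python
import math

def convergence_speed(alpha, n, x, delta):
    """Speed of convergence from equation (12): beta = (1 - alpha)(n + x + delta)."""
    return (1.0 - alpha) * (n + x + delta)

def half_life(beta):
    """Years until half of the initial log gap to the steady state is closed."""
    return math.log(2.0) / beta

def gap_remaining(beta, t):
    """Fraction of the initial log gap still open after t years."""
    return math.exp(-beta * t)

# Textbook Solow model with illustrative (assumed) parameter values.
beta_model = convergence_speed(alpha=1/3, n=0.01, x=0.02, delta=0.03)
print(f"model-implied beta: {beta_model:.3f}")
print(f"half-life at beta = 0.05: {half_life(0.05):.1f} years")
print(f"half-life at beta = 0.02: {half_life(0.02):.1f} years")
```

Lower values of $\beta$ lengthen the half-life proportionally, so a speed of convergence of 0.02 implies a transition that takes decades rather than years.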
• Solving the differential equation (13), we get:
$$\ln \hat y(t) - \ln \hat y(0) = (1 - e^{-\beta t}) \ln \hat y^* - (1 - e^{-\beta t}) \ln \hat y(0)$$
• Dividing throughout by $t$ gives a growth rate on the left-hand side (hence the choice of linearizing in $\ln \hat y$ rather than $\hat y$):
$$\frac{\ln \hat y(t) - \ln \hat y(0)}{t} = b_1 \ln \hat y^* - b_2 \ln \hat y(0) \tag{14}$$
where $b_1 = b_2 = (1 - e^{-\beta t})/t$.

6.2 Absolute and conditional convergence

• So far we have been looking at convergence of a given economy to its steady state.
• It is also an important question whether we should expect convergence across countries.
• We now use (14) to discuss this property. Note that we talk about poor and rich countries (i.e. as measured by $y(t) = Y(t)/L(t)$) even if (14) involves $\hat y(t) = y(t)/T(t)$. The role of $T(t)$ in the denominator is not important here, since we assume equal technological progress across countries. We will deal with this more carefully soon.
• The Solow model predicts the following: Countries that are poor will grow faster than rich ones provided they have the same steady state (and are both below this steady state).
• This is conditional convergence, which is the same as predicting that $b_2$ in (14) is positive, so the partial impact of $\ln \hat y(0)$ on the growth rate is negative.
• In (14) we have conditioned on the steady state through the term $\ln \hat y^*$.
• Note that the Solow model does not predict absolute convergence, i.e. that a poor country will always grow faster than a rich country.
• This can easily be demonstrated in the familiar graph. Absolute convergence does not hold, but conditional convergence does.

6.3 β- and σ-convergence

• It is important to be careful about how we think about convergence.
• The two types of convergence we discussed above are usually referred to as β-convergence. That is, they have to do with the $b_2$ above (which is derived from $\beta$, hence the name). We speak respectively of absolute and conditional β-convergence.
• In addition we can define σ-convergence: A group of economies are converging in the sense of σ if the dispersion of their real per capita GDP levels (or an equivalent measure of interest) tends to decrease over time. That is, if $\sigma_{t_2} < \sigma_{t_1}$ when $t_2 > t_1$, where $\sigma_t$ is the standard deviation of $\ln(y(t))$ across the economies at time $t$.
• It is easy to realize that β-convergence is necessary for σ-convergence.
• However, β-convergence does not necessarily imply σ-convergence (Galton's fallacy).
• This can be illustrated by the graphs.
• See BSiM 11.1 for a more detailed discussion.

6.4 Growth regressions

• Most of the empirical literature (at least the early part) is primarily concerned with β-convergence, and with obtaining estimates of the speed of convergence.
• We have derived the following log-linear equation for growth rates of $\hat y$:
$$\frac{\ln \hat y(t) - \ln \hat y(0)}{t} = b_1 \ln \hat y^* - b_2 \ln \hat y(0) \tag{15}$$
where $b_1 = b_2 = (1 - e^{-\beta t})/t$.
• This suggests that we should run regressions of growth rates on the initial level of income per capita. This will enable us to derive an estimate of $\beta$ from $b_2$.
• Importantly, we must also condition upon $\hat y^*$, i.e. we must consider conditional convergence.
• We can distinguish between two approaches:
  1. Theory driven (MRW): We use the theoretical expression for $\hat y^*$, i.e. as a function of the savings rate(s) and population growth. For the augmented model we insert from equation (2) in L.N. 3, solve for $y$ instead of $\hat y$ (using $\ln T(t) = \ln T(0) + xt$), and get the structural equation
$$\frac{\ln y(t) - \ln y(0)}{t} = \frac{1 - e^{-\beta t}}{t}\ln T(0) + x + \frac{1 - e^{-\beta t}}{t}\,\frac{\alpha}{1-\alpha-\eta}\ln(s_k) + \frac{1 - e^{-\beta t}}{t}\,\frac{\eta}{1-\alpha-\eta}\ln(s_h) - \frac{1 - e^{-\beta t}}{t}\,\frac{\alpha+\eta}{1-\alpha-\eta}\ln(n + x + \delta) - \frac{1 - e^{-\beta t}}{t}\ln(y(0))$$
  2. Barro regressions: We run regressions of the type
$$\frac{\ln y(t) - \ln y(0)}{t} = b_0 + B_1 X - b_2 \ln(y(0)) \tag{16}$$
where $X$ is a vector of variables characterizing each economy which are assumed to influence the steady-state level.
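Since $b_2 = (1 - e^{-\beta t})/t$, an estimated coefficient on initial income can be converted back into a speed of convergence via $\beta = -\ln(1 - b_2 t)/t$. The following Python sketch shows the mapping in both directions. The function names, the 25-year horizon and the value $\beta = 0.02$ are hypothetical choices for illustration, not results from any of the cited studies.

```python
import math

def b2_from_beta(beta, t):
    """Coefficient on initial log income implied by speed beta over t years."""
    return (1.0 - math.exp(-beta * t)) / t

def beta_from_b2(b2, t):
    """Invert b2 = (1 - exp(-beta*t))/t; requires b2 * t < 1."""
    return -math.log(1.0 - b2 * t) / t

t = 25.0                      # length of the growth period in years (assumed)
b2 = b2_from_beta(0.02, t)    # coefficient a regression would ideally recover
beta = beta_from_b2(b2, t)    # back out the speed of convergence

print(f"b2 = {b2:.5f}, recovered beta = {beta:.5f}")
```

Note that $b_2 < \beta$ for any $t > 0$, so reading the regression coefficient directly as the speed of convergence would understate it, and more so the longer the period.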
6.5 Empirical evidence

• Initial results on absolute convergence were spurious, and due to ex post selection bias (only the success stories had data/had joined the OECD); hence convergence followed from the choice of sample.
• MRW: No absolute convergence. Evidence of conditional β-convergence.
• Speed of convergence: 0.01-0.02. The short run is long!
• The implied share of the accumulated factor suggests a broad concept of capital or externalities ($\alpha + \eta \approx 0.75$).
• Similar studies: There is no evidence of absolute β-convergence in samples covering 'the world'. There is even evidence of σ-divergence.
• When we run conditional regressions we find rather strong evidence of both β- and σ-convergence. The rate of convergence is usually found to be around 0.02.
• We also find strong signs of absolute convergence when we confine our comparisons to close/similar economies such as US states or Japanese prefectures.

6.6 Econometric issues

• A problem with the approach of MRW above is that it assumes that the initial level of technology, $T(0)$, is the same for all economies in the sample.
• In reality this amounts to an omitted variable in the regression (because $T(0)$ is likely to differ).
• Once again we are left with potential biases because $T(0)$ will be positively correlated with $y(0)$. This gives a bias toward zero for the parameter $b_2$, and hence a downward bias for $\beta$.
• Note that the problem is probably more severe here than in the case of levels regressions.
• A solution to this problem is to include a dummy for each economy in the sample. This dummy then captures $T(0)$.
• One must then include in the model several growth rates for each country, i.e. from subsequent time periods.
• This panel data approach is adopted in most recent studies. See Islam (1995), which you should read cursorily (without caring too much about technical details).
• The results differ quite a bit from those reported by MRW.
Most notably there appears to be a higher speed of convergence and hence a smaller implied share of capital. In addition, note that the concept of convergence is now even more conditional and in a sense more hollow.
• A problem with this approach is that available time series are usually short, so that we must use growth rates for undesirably short periods.

6.7 World income distribution, twin peaks and club convergence

• Even if we find evidence of conditional β-convergence, it is worth noticing that absolute σ-divergence is perhaps the really important phenomenon.
• There also appears to be evidence of a twin-peaks phenomenon in the world distribution of income.
• Recent research focuses much on club convergence, i.e. on how multiple equilibria might give rise to a twin-peaks-like phenomenon.
• The important message is that we have to look closer at structural characteristics of the economies under consideration.

6.8 "Barro-regressions"

• There is a substantial literature on theory-free regressions of the type (16).
• These are sometimes referred to as "Barro-regressions" after an influential early study by Barro.
• These studies also report fairly similar results about convergence.
• However, the focus of these studies is often directed more towards exploring the empirical relationship between different variables and growth.
• On these issues there is substantial variation in the results reported, and few robust findings seem to emerge. There are substantial methodological problems connected to this type of regression.
• A brief survey of results can be cast as follows:
  – Population growth does not seem as detrimental as usually assumed
  – High inequality lowers growth, perhaps by raising social and political instability
  – The depth of financial intermediation seems important for growth
  – Democracy in itself does not seem to be important
  – Extended economic freedom and protection of property rights seem to be important
  – Results are ambiguous over the role of big government and high taxation
  – Government spending on infrastructure is important
  – Openness to trade is beneficial for growth, but only under certain circumstances that have not been fully identified

References

[1] Bils, M. and Klenow, P., Does Schooling Cause Growth?, American Economic Review, December 2000, 90 (5), 1160-1183.
[2] Hsieh, C. T., Productivity Growth and Factor Prices in East Asia, American Economic Review, May 1999, 89 (2), 133-138.
[3] Islam, N., Growth Empirics: A Panel Data Approach, Quarterly Journal of Economics, 1995, 110 (4), 1127-1170.
[4] Jones, C. I., On the Evolution of the World Income Distribution, Journal of Economic Perspectives, 1997, 11 (3), 19-36.
[5] Klenow, P. and Rodriguez-Clare, A., The Neoclassical Revival in Growth Economics: Has It Gone Too Far?, in B. S. Bernanke and J. J. Rotemberg (eds), NBER Macroeconomics Annual 1997, Cambridge, MA: MIT Press, 1997.
[6] Mankiw, N. G., Romer, D. and Weil, D. N., A Contribution to the Empirics of Economic Growth, Quarterly Journal of Economics, May 1992, 107 (2), 407-437.
[7] Pritchett, L., Where Has All the Education Gone?, World Bank Economic Review, 2001, 15 (3), 367-391.
[8] Young, A., The Tyranny of Numbers: Confronting the Statistical Realities of the East Asian Growth Experience, Quarterly Journal of Economics, August 1995, 110 (3), 641-680.