American Economic Association

Schooling and Labor Market Consequences of School Construction in Indonesia: Evidence from an Unusual Policy Experiment
Author(s): Esther Duflo
Source: The American Economic Review, Vol. 91, No. 4 (Sep., 2001), pp. 795-813
Published by: American Economic Association
Stable URL: http://www.jstor.org/stable/2677813
Accessed: 05-03-2018 14:27 UTC

This content downloaded from 147.251.185.127 on Mon, 05 Mar 2018 14:27:50 UTC. All use subject to http://about.jstor.org/terms

By ESTHER DUFLO*

Between 1973 and 1978, the Indonesian government engaged in one of the largest school construction programs on record. Combining differences across regions in the number of schools constructed with differences across cohorts induced by the timing of the program suggests that each primary school constructed per 1,000 children led to an average increase of 0.12 to 0.19 years of education, as well as a 1.5 to 2.7 percent increase in wages. This implies estimates of economic returns to education ranging from 6.8 to 10.6 percent.
(JEL I2, J31, O15, O22)

The questions of whether investments in infrastructure can cause an increase in educational attainment, and whether an increase in educational attainment causes an increase in earnings, are basic concerns for development economists. A large body of literature investigates the impact of schooling infrastructure on schooling, as well as the returns to education in developing countries [see George Psacharopoulos (1994) and John Strauss and Duncan Thomas (1995) for surveys]. Estimated returns to education are, in general, larger in developing countries than in industrialized countries. However, most of the existing studies are based on simple correlations between years of education and wages. Family and community background are important determinants of both schooling and labor market outcomes in developing countries, and the bias in estimates that treat an individual's education level as exogenous could be important.

This paper exploits a dramatic change in policy to evaluate the effect building schools has on education and earnings in Indonesia, a country where the GDP per capita in 1995 was only $720, 3.5 percent that of the United States. In 1973, the Indonesian government launched a major school construction program, the Sekolah Dasar INPRES program. Between 1973-1974 and 1978-1979, more than 61,000 primary schools were constructed, an average of two schools per 1,000 children aged 5 to 14 in 1971. Enrollment rates among children aged 7 to 12 increased from 69 percent in 1973 to 83 percent by 1978. This was in contrast to the absence of capital expenditure and a decline in enrollment in the early 1970's.

Using a large cross section of men born between 1950 and 1972 from the 1995 intercensal survey of Indonesia (SUPAS), I linked an adult's education and wages with district-level data on the number of new schools built between 1973-1974 and 1978-1979 in his region of birth.
The exposure of an individual to the program was determined both by the number of schools built in his region of birth and by his age when the program was launched. After controlling for region of birth and cohort of birth effects, interactions between dummy variables indicating the age of the individual in 1974 and the intensity of the program in his region of birth are plausibly exogenous variables, and are used as instruments in the wage equation. Similar strategies were used to estimate the effect of school quality on returns to education (David Card and Alan Krueger, 1992), and the effect of college education on earnings (Card and Thomas Lemieux, 1998).

* Department of Economics, Massachusetts Institute of Technology, 50 Memorial Drive, Cambridge, MA 02142. Financial support from the Fondation Thiers and the Alfred P. Sloan Dissertation Fellowship is gratefully acknowledged. I also thank the World Bank for partially funding this research, the Central Bureau of Statistics of Indonesia, the Bappenas and the Ministry of Education and Culture for data, and the staff of the HIID office in Jakarta (in particular Joe Stern and Peter Rosner) for their help and hospitality. I am grateful to Joshua Angrist, Abhijit Banerjee, Michael Kremer, and Jonathan Morduch for their advice and support throughout this research, and to Daron Acemoglu, David Card, Anne Case, Aimee Chin, Angus Deaton, Guido Imbens, Emmanuel Saez, various seminar participants, and two referees for helpful comments. I bear sole responsibility for the content of this paper, which is not meant to reflect the views of the World Bank or any government agency.
The estimates suggest that each new school constructed per 1,000 children was associated with an increase of 0.12 to 0.19 in years of education and 1.5 to 2.7 percent in earnings for the first cohort fully exposed to the program. This implies estimates of economic returns to education ranging from 6.8 to 10.6 percent.

The remainder of this paper is organized as follows. In Section I, I describe the data, the INPRES program, and an overview of the identification strategy. In Section II, I present the estimated impact of the program on education. Section III is devoted to the estimation of the effect of the program on wages, and Section IV to the estimate of economic returns to education. Section V combines the estimates of the program effect on wages with detailed data on the cost of education in Indonesia in a tentative cost-benefit analysis of the program. Section VI concludes.

I. The Program

A. Data

The data used in this paper come from the 1995 intercensal survey of Indonesia (SUPAS). I focus on men born between 1950 and 1972. Summary statistics for this sample are presented in Table 1, panel A. There are 152,989 individuals in the sample, with an average level of 7.98 years of completed education (6 years of education correspond to graduation from primary school). There are 60,633 individuals who work for a wage (sample selection issues are examined in Section IV).
Using information on the district of birth of each individual, I matched the individual survey data with district-level census data and the number of schools scheduled to be constructed in each district under the INPRES program.¹ District-level descriptive statistics are presented in Table 1, panel B.

¹ According to a survey of the implementation of the program conducted by the Ministry of Education and Culture in 1983, the actual number of schools constructed closely corresponded to the plans.

TABLE 1-DESCRIPTIVE STATISTICS

                                                                                  Mean
Panel A: Individual Level Means
  Education (whole sample N = 152,989)                                            7.98
  Education (sample with valid wage data N = 60,663)                              9.00
  INPRES schools built per 1,000 children                                         1.98
  INPRES schools built per 1,000 children (sample with valid wage data)           1.89
  INPRES schools built per 1,000 children (High program regions)                  2.44
  INPRES schools built per 1,000 children (Low program regions)                   1.54
  Log(hourly wage)                                                                6.87
  Monthly earnings (SUPAS 1995), thousands Rupiah                                 13
  Monthly earnings (SUSENAS 1993) of wage earners, thousands Rupiah               205
  Monthly earnings (SUSENAS 1993) of self-employed individuals,
    thousands Rupiah                                                              152

Panel B: District Level Means (N = 293)
  INPRES schools constructed (1973-1974 to 1978-1979)                             222
  INPRES schools constructed per 1,000 children (1973-1974 to 1978-1979)          2.34
  Number of teachers in 1973-1974                                                 1,530
  Number of teachers in 1978-1979                                                 2,082
  Number of schools in 1973-1974                                                  219
  Fraction of the population attending school in 1971 (Census)                    0.174
  Enrollment rate in primary school in 1973
    (Ministry of Education and Culture)                                           0.68

Panel C: Indonesian Family Life Survey, Individuals Born Between 1950 and 1972
(all numbers are in percentages)
  Proportion of individuals having migrated between birth and age 12              8.5
  Proportion of people having repeated at least one grade in primary school       20.0
  Proportion of people completing more than primary having repeated
    at least one grade in primary school                                          6.0
  Proportion of individuals having attended primary school after age 12
    (estimated)                                                                   15.8
  Proportion of individuals having attended primary school after age 13
    (estimated)                                                                   6.8
  Proportion of individuals born 1950-1961, completing primary or less,
    who left school after 1974                                                    2.8
  Proportion of individuals born 1962-1966, completing primary or less,
    who left school after 1974                                                    24.5

Sources: IFLS, SUPAS, SUSENAS, INPRES instruction, Census (1971), Ministry of Education and Culture.

B. The Sekolah Dasar INPRES Program

Starting in 1973, the Indonesian government emphasized the need for "equity" across provinces. Oil revenues were mobilized to finance centrally administered development programs, the "presidential instructions" (INPRES). The Sekolah Dasar INPRES was one of the first INPRES programs and by far the largest at the time it was launched (in 1973-1974). As a result of the oil boom, real expenditures on regional development more than doubled between 1973 and 1980, and the Sekolah Dasar INPRES program became extremely important. Between 1973-1974 and 1978-1979, 61,807 new schools were constructed (Table 1, panel B), at a cost of over 500 million 1990 U.S. dollars (1.5 percent of the Indonesian GDP in 1973). This represented more than one school per 500 children aged 5 to 14 in 1971, which reportedly makes INPRES the fastest primary school construction program ever undertaken in the world (World Bank, 1990).

Once an INPRES school was established, the government recruited the teachers and paid their salaries (each school was designed for three teachers and 120 pupils). An effort to train more teachers paralleled the INPRES program (World Bank, 1990), and the proportion of teachers meeting the minimum qualification requirements did not worsen significantly between 1971 and 1978.
The stock of schools doubled over the period, and the stock of teachers grew by 43 percent. This contrasted with a freeze of capital expenditure and teacher recruiting prior to 1973 (Daroesman, 1971).

The program was designed explicitly to target children who had not previously been enrolled in school. The general allocation rule was that the number of schools to be constructed in each district was proportional to the number of children of primary school age not enrolled in school in 1972. The "presidential instructions" also listed the exact number of schools to be constructed in each district. Table 2 presents the results of a regression of the logarithm of the number of INPRES schools planned in each region on the logarithm of the nonenrollment rate and the logarithm of the number of children. The actual rule implies that both coefficients should be close to 1. Both coefficients have the expected sign, but the coefficient of the nonenrollment rate is smaller than 1. This might be explained by measurement error in the nonenrollment measure as well as by imperfect application of the general rule: The program appears to have been less redistributive than intended.

TABLE 2-THE ALLOCATION OF SCHOOLS

                                                         Log(INPRES schools)(a)
Log of number of children aged 5-14 in the region        0.78
                                                         (0.027)
Log(1 - enrollment rate in primary school in 1973)(b)    0.12
                                                         (0.038)
Number of observations                                   255
R²                                                       0.78

Notes: Standard errors are in parentheses.
(a) The dependent variable is the log of the number of INPRES schools built between 1973 and 1978.
(b) The enrollment rate in primary school is the number of children enrolled in primary school in 1973 (obtained from the Ministry of Education and Culture) divided by the number of children aged 5-14 in the region in 1973.

C. Identification Strategy

The date of birth and the region of birth jointly determine an individual's exposure to the program. Indonesian children normally attend primary school between the ages of 7 and 12.
All children born in 1962 or before were 12 or older in 1974, when the first INPRES schools were constructed. Thus, they did not benefit from the program, since they should have left primary school before the first INPRES schools were opened. Grade repetition and delayed school entry could lead a few of these children to benefit from the program during their last year in school. However, according to the 1993 Indonesian Family Life Survey (IFLS) data set (conducted in 1993 by RAND and the Demographic Institute at the University of Indonesia), less than 3 percent of the children born between 1950 and 1962 were still in primary school in 1974. For younger children, the exposure is an increasing function of their date of birth. Hence, the effect of the program should be close to 0 for children 12 or older in 1974 and increasing for younger children.

Because the program intensity was related to enrollment rates in 1972, which differed widely across regions, region of birth is a second dimension of variation in the intensity of the program. Region of birth is highly correlated with the region of education: 91.5 percent of the children in the IFLS sample were still living in the district where they were born at age 12. However, unlike region of education, it is not endogenous with respect to the program [which would lead to bias in the program effect; see Rosenzweig and Wolpin (1988)], given that all individuals in the sample were born before the program was started.

TABLE 3-MEANS OF EDUCATION AND LOG(WAGE) BY COHORT AND LEVEL OF PROGRAM CELLS

                          Years of education                Log(wages)
                          Level of program in               Level of program in
                          region of birth                   region of birth
                          High      Low       Difference    High      Low       Difference
                          (1)       (2)       (3)           (4)       (5)       (6)
Panel A: Experiment of Interest
Aged 2 to 6 in 1974       8.49      9.76      -1.27         6.61      6.73      -0.12
                          (0.043)   (0.037)   (0.057)       (0.0078)  (0.0064)  (0.010)
Aged 12 to 17 in 1974     8.02      9.40      -1.39         6.87      7.02      -0.15
                          (0.053)   (0.042)   (0.067)       (0.0085)  (0.0069)  (0.011)
Difference                0.47      0.36      0.12          -0.26     -0.29     0.026
                          (0.070)   (0.038)   (0.089)       (0.011)   (0.0096)  (0.015)
Panel B: Control Experiment
Aged 12 to 17 in 1974     8.02      9.40      -1.39         6.87      7.02      -0.15
                          (0.053)   (0.042)   (0.067)       (0.0085)  (0.0069)  (0.011)
Aged 18 to 24 in 1974     7.70      9.12      -1.42         6.92      7.08      -0.16
                          (0.059)   (0.044)   (0.072)       (0.0097)  (0.0076)  (0.012)
Difference                0.32      0.28      0.034         0.056     0.063     0.0070
                          (0.080)   (0.061)   (0.098)       (0.013)   (0.010)   (0.016)

Notes: The sample is made of the individuals who earn a wage. Standard errors are in parentheses.

The basic idea behind the identification strategy can be illustrated using simple two-by-two tables. Table 3 shows means of education and wages for different cohorts and program levels. Regions are separated into "high program" and "low program" regions. The difference between the number of schools constructed per 1,000 children in high and low program regions is 0.90.² In panel A, I compare the educational attainment and the wages of individuals who had little or no exposure to the program (they were 12 to 17 in 1974) to those of individuals who were exposed the entire time they were in primary school (they were 2 to 6 in 1974), in both types of regions. In both cohorts, the average educational attainment and wages in regions that received fewer schools are higher than in regions that received more schools. This reflects the program provision that more schools were to be built in regions where enrollment rates were low. In both types of regions, average educational attainment increased over time. However, it increased more in regions that received more schools.
The difference in these differences can be interpreted as the causal effect of the program, under the assumption that in the absence of the program, the increase in educational attainment would not have been systematically different in low and high program regions. An individual young enough, born in a high program region, received on average 0.12 more years of education, and the logarithm of his wage in 1995 was 0.026 higher. These differences in differences are not significantly different from 0. This simple estimator suggests that one school per 1,000 children contributed to an increase in education by 0.13 years (0.12 divided by 0.90) and wages by 0.029 for children aged 2 to 6 when the program was initiated. The Wald estimate of returns to education is the ratio of these two estimates.

² To make Wald estimates meaningful, estimates in Table 3 are presented for the sample with valid wage data. High program regions are defined as regions where the residual of a regression of the number of schools on the number of children is positive.

The identification assumption should not be taken for granted: The pattern of increase in education could vary systematically across regions. In particular, there could be mean reversion. However, an implication of the identification assumption can be tested because individuals aged 12 or older in 1974 were not exposed to the program. The increase in education between cohorts in this age-group should not differ systematically across regions. In Table 3, panel B, I present this control experiment. I consider a cohort aged 18 to 24 in 1974 and a cohort aged 12 to 17 in 1974. The estimated differences in differences are very close to 0.
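
The two-by-two comparison in Table 3 can be retraced with a few lines of arithmetic. The sketch below is illustrative only: it recomputes the differences in differences from the rounded cell means printed in Table 3, panel A (which gives about 0.11 and 0.03 rather than the reported 0.12 and 0.026, presumably because the published means are rounded), and then rescales the reported values by the 0.90 intensity gap. The variable and function names are assumptions introduced for this example.

```python
# Cell means copied from Table 3, panel A (sample of wage earners).
EDUCATION = {
    ("young", "high"): 8.49, ("young", "low"): 9.76,
    ("old", "high"): 8.02, ("old", "low"): 9.40,
}
LOG_WAGE = {
    ("young", "high"): 6.61, ("young", "low"): 6.73,
    ("old", "high"): 6.87, ("old", "low"): 7.02,
}


def diff_in_diff(cells):
    """(young - old) in high program regions minus (young - old) in low ones."""
    change_high = cells[("young", "high")] - cells[("old", "high")]
    change_low = cells[("young", "low")] - cells[("old", "low")]
    return change_high - change_low


# Differences in differences recomputed from the rounded cell means.
did_education = diff_in_diff(EDUCATION)
did_log_wage = diff_in_diff(LOG_WAGE)

# Values as reported in Table 3, and the intensity gap quoted in the text.
REPORTED_DID_EDUCATION = 0.12
REPORTED_DID_LOG_WAGE = 0.026
INTENSITY_GAP = 0.90  # schools per 1,000 children, high minus low regions

# Effect of one school per 1,000 children on education and on log wages.
education_per_school = REPORTED_DID_EDUCATION / INTENSITY_GAP
wage_per_school = REPORTED_DID_LOG_WAGE / INTENSITY_GAP

# Wald estimate: ratio of the wage effect to the education effect.
wald_return = REPORTED_DID_LOG_WAGE / REPORTED_DID_EDUCATION

print(round(education_per_school, 2), round(wage_per_school, 3), round(wald_return, 2))
```

The ratio is a point estimate only; since neither difference in differences is significantly different from 0, it is very imprecise.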
These results provide some suggestive evidence that the differences in differences are not driven by inappropriate identification assumptions, although they are imprecisely estimated. In panel B, for example, the differences in differences are insignificantly different from 0 but also from the differences in differences in panel A. The remainder of this paper will elaborate on this strategy to lead to more convincing results.

II. Effect on Education

A. Basic Results

To exploit the variation in treatment intensity across regions and cohorts, this strategy can be generalized to a regression framework. Consider first the difference between the average education of a young cohort exposed to the program and that of an older cohort not exposed to the program. If additional schools led to an increase in educational attainment, the difference will be positively related to the number of schools constructed in each region. This suggests running the following regression:

(1)  $S_{ijk} = c_1 + \alpha_{1j} + \beta_{1k} + (P_j T_i)\gamma_1 + (C_j T_i)\delta_1 + \varepsilon_{ijk}$

where $S_{ijk}$ is the education of individual $i$ born in region $j$ in year $k$, $T_i$ is a dummy indicating whether the individual belongs to the "young" cohort in the subsample, $c_1$ is a constant, $\beta_{1k}$ is a cohort of birth fixed effect, $\alpha_{1j}$ is a district of birth fixed effect, $P_j$ denotes the intensity of the program in the region of birth, and $C_j$ is a vector of region-specific variables.

Table 4 (columns 1-3) presents estimates of equation (1) for two subsamples. In panel A, I compare children aged 2 to 6 in 1974 with children aged 12 to 17 in 1974. In column 1, the specification controls only for the interaction of a cohort of birth dummy and the population aged 5 to 14 in 1971. The suggested effect is that one school built per 1,000 children increased the education of the children aged 2 to 6 in 1974 by 0.12 years for the whole sample, and by 0.20 years for the sample of wage earners.
This interpretation relies on the identification assumption that there are no omitted time-varying and region-specific effects correlated with the program. The allocation of schools to each region was an explicit function of the enrollment rate in the region in 1972. Therefore, the estimate could potentially confound the effect of the program with mean reversion that would have taken place even in its absence. The identification assumption will also be violated if the allocation of other governmental programs initiated as a result of the oil boom (and potentially affecting education) was correlated with the allocation of INPRES schools. Thus, I present specifications that control for the interactions between cohort dummies and the enrollment rate in the population in 1971, as well as for interactions between cohort dummies and the allocation of the water and sanitation program, the second largest INPRES program centrally administered at the time. Controlling for both the enrollment rate and the water and sanitation program makes the estimates higher (columns 2 and 3), suggesting that the estimates are not upwardly biased by mean reversion or omitted programs.

Panel B of Table 4 shows the results of the control experiment (comparing the cohort aged 12 to 17 to the cohort aged 18 to 24 in 1974). If, before the program was started, education had increased faster in regions that received more schools, panel B would show (spurious) positive coefficients. But the impact of the "program" is very small and never significant. The coefficients are statistically different from the corresponding coefficients in panel A. Although this is not definitive evidence (education level could have started converging precisely after 1973), it is reassuring.
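
The logic of equation (1) can be sketched in a simplified setting. The example below is not the paper's estimation code: it assumes only two cohorts, equal cell sizes, and no additional controls $C_j$, in which case the region fixed effects difference out and the interaction coefficient equals the slope of a regression of each region's cohort difference on its program intensity. All numbers (intensities, region effects, cohort trend, and the "true" coefficient) are invented for illustration, and the data are noise-free so that the slope is recovered exactly.

```python
# Synthetic illustration only: none of these numbers come from the paper.
TRUE_GAMMA = 0.12      # hypothetical effect of one school per 1,000 children
COHORT_TREND = 0.40    # hypothetical common increase in education across cohorts


def did_slope(regions):
    """Slope of (young mean - old mean) on program intensity across regions.

    regions: list of (intensity, mean_education_young, mean_education_old).
    """
    pairs = [(p, young - old) for p, young, old in regions]
    n = len(pairs)
    p_bar = sum(p for p, _ in pairs) / n
    d_bar = sum(d for _, d in pairs) / n
    covariance = sum((p - p_bar) * (d - d_bar) for p, d in pairs)
    variance = sum((p - p_bar) ** 2 for p, _ in pairs)
    return covariance / variance


# Regions with more schools start from lower education levels, mimicking the
# allocation rule; these levels play the role of region fixed effects.
intensities = [1.0, 1.5, 2.0, 2.5, 3.0]
region_effects = [9.5, 9.0, 8.4, 8.1, 7.6]

regions = [
    (p, level + COHORT_TREND + TRUE_GAMMA * p, level)
    for p, level in zip(intensities, region_effects)
]

gamma_hat = did_slope(regions)
print(round(gamma_hat, 6))
```

Because the region levels cancel in each within-region difference, the negative correlation between initial education and program intensity does not contaminate the estimate; only differential trends across regions would.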
TABLE 4-EFFECT OF THE PROGRAM ON EDUCATION AND WAGES: COEFFICIENTS OF THE INTERACTIONS BETWEEN COHORT DUMMIES AND THE NUMBER OF SCHOOLS CONSTRUCTED PER 1,000 CHILDREN IN THE REGION OF BIRTH

                                                Dependent variable
                                          Years of education                Log(hourly wage)
                            Observations  (1)        (2)        (3)        (4)        (5)        (6)
Panel A: Experiment of Interest: Individuals Aged 2 to 6 or 12 to 17 in 1974
(Youngest cohort: Individuals ages 2 to 6 in 1974)
Whole sample                78,470        0.124      0.15       0.188
                                          (0.0250)   (0.0260)   (0.0289)
Sample of wage earners      31,061        0.196      0.199      0.259      0.0147     0.0172     0.0270
                                          (0.0424)   (0.0429)   (0.0499)   (0.00729)  (0.00737)  (0.00850)
Panel B: Control Experiment: Individuals Aged 12 to 24 in 1974
(Youngest cohort: Individuals ages 12 to 17 in 1974)
Whole sample                78,488        0.0093     0.0176     0.0075
                                          (0.0260)   (0.0271)   (0.0297)
Sample of wage earners      30,225        0.012      0.024      0.079      0.0031     0.00399    0.0144
                                          (0.0474)   (0.0481)   (0.0555)   (0.00798)  (0.00809)  (0.00915)
Control variables:
Year of birth*enrollment
  rate in 1971                            No         Yes        Yes        No         Yes        Yes
Year of birth*water and
  sanitation program                      No         No         Yes        No         No         Yes

Notes: All specifications include region of birth dummies, year of birth dummies, and interactions between the year of birth dummies and the number of children in the region of birth (in 1971). The number of observations listed applies to the specification in columns (1) and (4). Standard errors are in parentheses.

Even if the identification assumption is satisfied, the coefficient may slightly overestimate the effect of the program on average education.³ Note that such a large program could potentially have affected the returns to education by increasing the stock of primary school graduates (Angrist, 1995). Individuals' education choices could then have responded to this decrease in the returns to education.
To the extent that Indonesia is an integrated labor market, the returns to education would have declined in the entire country. The estimates do not take this negative effect of the program into account because it is common to all regions. This effect, however, is not likely to be very large. Its size ultimately depends on the elasticity of the demand for educated labor (which is likely to be low in a rapidly growing economy), the sensitivity of educational choice to perceived returns to education, and the extent of integration in the Indonesian labor market.

³ In the working paper version (Duflo, 2000), this point is made in the context of a simple formal model.

B. Reduced-Form Evidence

This identification strategy can be generalized to an interaction terms analysis. Consider the following relationship between the education ($S_{ijk}$) of an individual $i$, born in region $j$, in year $k$, and his exposure to the program:

(2)  $S_{ijk} = c_1 + \alpha_{1j} + \beta_{1k} + \sum_{l=2}^{23} (P_j \times d_{il})\gamma_{1l} + \sum_{l=2}^{23} (C_j \times d_{il})\delta_{1l} + \varepsilon_{ijk}$

where $d_{il}$ is a dummy that indicates whether individual $i$ is age $l$ in 1974 (a year-of-birth dummy). In these unrestricted estimates, I measure the time dimension of exposure to the program with 22 year-of-birth dummies. Individuals aged 24 in 1974 form the control group, and this dummy is omitted from the regression. Each coefficient $\gamma_{1l}$ can be interpreted as an estimate of the impact of the program on a given cohort. This is simply a generalization of equation (1) to estimate cohort-by-cohort contrasts.

There is a testable restriction on the pattern of the coefficients $\gamma_{1l}$. Because children aged 13 and older in 1974 did not benefit from the program, the coefficients $\gamma_{1l}$ should be 0 for $l > 12$ and start increasing for $l$ smaller than some threshold (the oldest age at which an individual could have been exposed to the program and still benefit from it).

Figure 1 plots the $\gamma_{1l}$. Each dot on the solid line is the coefficient of the interaction between a dummy for being a given age in 1974 and the number of schools constructed per 1,000 children in the region of birth (a 95-percent confidence interval is plotted by broken lines). These coefficients fluctuate around 0 until age 12 and start increasing after age 12. As expected, the program had no effect on the education of cohorts not exposed to it, and it had a positive effect on the education of younger cohorts. All coefficients are significantly different from 0 after age 8. These figures show that the identification strategy is reasonable and that the program had an effect on education.

[FIGURE 1. COEFFICIENTS OF THE INTERACTIONS AGE IN 1974 * PROGRAM INTENSITY IN THE REGION OF BIRTH IN THE EDUCATION EQUATION. Horizontal axis: age in 1974.]

C. Restricted Estimation

Instead of testing whether the $\gamma_{1l}$ are equal to 0 for $l \geq 13$, one can impose this restriction. The equation to be estimated is then

(3)  $S_{ijk} = c_1 + \alpha_{1j} + \beta_{1k} + \sum_{l=2}^{12} (P_j d_{il})\gamma_{1l} + \sum_{l=2}^{12} (C_j d_{il})\delta_{1l} + \varepsilon_{ijk}$

The omitted group (the control group) is now comprised of individuals aged 13 to 24 in 1974. This is more efficient and leads to more precise estimates of the effect of the program. Columns (1)-(3) in Table 5 show the coefficients of the interactions between age in 1974 and the intensity of the program in the region of birth in three specifications in the whole sample [columns (4)-(6) show the same results for the sample of wage earners]. In all columns, the estimated effect is positive after age 10.
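
To make the regressors of the restricted specification concrete, the sketch below builds the interaction terms $P_j d_{il}$ for one individual. It is a hypothetical helper rather than anything taken from the paper: the function name and the rule that age in 1974 equals 1974 minus year of birth are assumptions made for this example (the latter is consistent with the statement that children born in 1962 were 12 in 1974).

```python
PROGRAM_YEAR = 1974
TREATED_AGES = range(2, 13)  # ages 2 to 12 in 1974; ages 13 to 24 are the control group


def interaction_terms(year_of_birth, schools_per_1000):
    """Return {age: P_j * d_il} for l = 2, ..., 12 for one individual.

    Exactly one entry equals the program intensity when the individual was
    aged 2 to 12 in 1974; all entries are zero for the control group.
    """
    age_in_1974 = PROGRAM_YEAR - year_of_birth
    return {
        age: (schools_per_1000 if age == age_in_1974 else 0.0)
        for age in TREATED_AGES
    }


# A man born in 1968 (aged 6 in 1974) in a region with 2.44 schools per 1,000 children.
exposed = interaction_terms(1968, 2.44)

# A man born in 1958 (aged 16 in 1974) in the same region belongs to the control group.
control = interaction_terms(1958, 2.44)

print(exposed[6], sum(control.values()))
```

Stacking these rows across individuals, together with region and year-of-birth dummies, gives the design matrix for the interaction coefficients $\gamma_{1l}$; the same construction with 22 dummies for ages 2 to 23 corresponds to the unrestricted equation (2).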
All coefficients are significantly greater than 0 after age 8. All sets of interactions are statistically different from 0 (the F-statistic for the null hypothesis is presented at the bottom of the table). The coefficients generally increase with date of birth (decreasing with age), except for a high value at age 9 and a decline between ages 6 and 5. They increase faster between ages 12 and 9 than they do subsequently, thus suggesting that once the education level in the population reaches a certain point, increasing it by building primary schools becomes more difficult.

TABLE 5-EFFECT OF THE PROGRAM ON EDUCATION AND WAGES: COEFFICIENTS OF THE INTERACTIONS BETWEEN DUMMIES INDICATING AGE IN 1974 AND THE NUMBER OF SCHOOLS CONSTRUCTED PER 1,000 CHILDREN IN REGION OF BIRTH

               Dependent variable: years of education                      Dependent variable:
               Whole sample                  Sample of wage earners        log(hourly wage)
Age in 1974    (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)
12             -0.035    -0.025    0.002     -0.040    -0.010    0.009     0.016     0.019     0.027
               (0.047)   (0.048)   (0.054)   (0.077)   (0.078)   (0.091)   (0.013)   (0.013)   (0.015)
11             0.011     0.025     0.018     0.008     0.014     -0.003    -0.014    -0.013    -0.009
               (0.046)   (0.047)   (0.051)   (0.073)   (0.074)   (0.083)   (0.012)   (0.013)   (0.014)
10             0.059     0.049     0.078     0.10      0.092     0.13      0.0036    0.0042    0.0059
               (0.047)   (0.049)   (0.054)   (0.075)   (0.076)   (0.090)   (0.013)   (0.013)   (0.015)
9              0.14      0.14      0.15      0.067     0.063     0.17      0.0095    0.010     0.018
               (0.039)   (0.041)   (0.044)   (0.065)   (0.066)   (0.077)   (0.011)   (0.011)   (0.013)
8              0.088     0.11      0.11      0.19      0.20      0.28      0.019     0.021     0.027
               (0.049)   (0.050)   (0.054)   (0.078)   (0.079)   (0.089)   (0.013)   (0.013)   (0.015)
7              0.12      0.14      0.16      0.11      0.13      0.16      -0.0095   -0.0049   0.0066
               (0.044)   (0.046)   (0.051)   (0.072)   (0.073)   (0.084)   (0.012)   (0.012)   (0.014)
6              0.14      0.17      0.26      0.23      0.23      0.32      0.011     0.013     0.018
               (0.042)   (0.044)   (0.049)   (0.070)   (0.070)   (0.084)   (0.012)   (0.012)   (0.014)
5              0.10      0.13      0.13      0.14      0.16      0.27      0.021     0.023     0.052
               (0.043)   (0.045)   (0.050)   (0.075)   (0.075)   (0.088)   (0.013)   (0.013)   (0.015)
4              0.11      0.12      0.18      0.19      0.19      0.29      0.019     0.020     0.038
               (0.039)   (0.041)   (0.046)   (0.069)   (0.069)   (0.082)   (0.012)   (0.012)   (0.014)
3              0.11      0.14      0.20      0.15      0.17      0.30      0.0079    0.013     0.027
               (0.044)   (0.046)   (0.053)   (0.079)   (0.080)   (0.097)   (0.013)   (0.014)   (0.016)
2              0.14      0.19      0.19      0.20      0.22      0.25      0.016     0.023     0.040
               (0.041)   (0.043)   (0.049)   (0.073)   (0.074)   (0.088)   (0.012)   (0.013)   (0.015)
Control variables:(a)
Year of birth*enrollment rate in 1971
               No        Yes       Yes       No        Yes       Yes       No        Yes       Yes
Year of birth*water and sanitation program
               No        No        Yes       No        No        Yes       No        No        Yes
F-statistic(b) 4.03      5.18      6.15      2.70      2.74      4.38      1.13      1.29      2.05
R²             0.19      0.19      0.17      0.14      0.14      0.13      0.14      0.15      0.13
Number of observations
               152,989   152,495   143,107   60,633    60,466    55,144    60,633    60,466    55,144

Notes: All specifications include region of birth dummies, year of birth dummies, and interactions between the year of birth dummies and the number of children in the region of birth (in 1971). Standard errors are in parentheses.
(a) The control group is comprised of individuals aged 13-24 in 1974.
(b) The F-statistics test the hypothesis that the coefficients of the interaction between the year of birth dummies and the program intensity in the region of birth are jointly zero.

The estimates in column (1) (without controls) suggest that one school per 1,000 children increases the education of the youngest children by 0.14 years. On average, 1.98 schools were built per 1,000 children. This implies that at its mean value, the program caused an increase in education of 0.27 years for these children (the average education in the sample is 7.98 years). As before, controlling for enrollment rate in 1971 [column (2)] and the water and sanitation program [column (3)] makes the estimate slightly higher. In columns (4)-(6), I present the same estimates for the subsample of wage earners.
The program effect is higher for wage earners than it is in the whole sample.

More insight into why this program was effective is obtained by examining its impact in different types of regions. In Table 6 (panel A), I present results equivalent to the specification in Table 4 [equation (1)] for various subsamples of regions of birth. Columns (2) and (3) suggest that the program had no effect in densely populated regions, and a large effect in sparsely populated regions. In sparsely populated regions, each new school significantly reduces the distance to school. In densely populated regions, the main effect will be to increase the availability of slots or to reduce the overcrowding of old schools. This suggests that reducing the distance children traveled to school was the most important effect of the program. This interpretation, however, should be taken with caution, in that this difference may come from other characteristics correlated with density. Columns (4) and (5) suggest that the program had more impact in poor provinces. In columns (6) and (7), I divide the sample into regions where the education of the cohort not exposed to the program (men born between 1950 and 1962) was lower or higher than the median (3.08 years of education). Results are similar for both sets of regions.

TABLE 6-PROGRAM EFFECT AND RETURNS TO EDUCATION BY CATEGORIES OF REGION OF BIRTH

                                  Characteristics of region of birth
                                  Whole     Density(a)          1976 Poverty(b)     Preprogram education(c)
                                  sample    Below     Above     High      Low       Below     Above
                                            median    median                        median    median
                                  (1)       (2)       (3)       (4)       (5)       (6)       (7)

Panel A: Effect of the Program on Education
Dependent variable: Years of education. Sample: individuals aged 2 to 6 or 12 to 17 in 1974
Interaction (2-6 in 1974)*program 0.15      0.19      -0.014    0.13      0.083     0.14      0.13
intensity in region of birth      (0.026)   (0.035)   (0.048)   (0.058)   (0.035)   (0.040)   (0.036)

Panel B: Effect of the Program on Wages
Dependent variable: log(hourly wage). Sample: individuals aged 2 to 6 or 12 to 17 in 1974 (wage earners)
Interaction (2-6 in 1974)*program 0.017     0.032     -0.00084  0.051     -0.00083  0.028     0.0046
intensity in region of birth      (0.0074)  (0.011)   (0.012)   (0.017)   (0.0094)  (0.013)   (0.0095)

Panel C: Returns to Education
Dependent variable: log(hourly wage). Sample: wage earners
Years of education                0.078     0.11      No first  0.10      No first  0.12      0.029
                                  (0.00062) (0.026)   stage     (0.028)   stage     (0.032)   (0.052)
                                  [0.9]     [0.86]              [0.88]              [0.72]    [0.83]

Notes: Region of birth dummies, year of birth dummies, and the enrollment in the region in 1971 are included in the regressions. F-statistics of the overidentification test are in square brackets.
a The median density (the density for the region of birth for the median person in the weighted sample) is 308 inhabitants per square kilometer.
b The high poverty provinces are the provinces where the proportion of people consuming less than 1,500 Rp per capita is larger than the national average for rural regions (in the 1976 SUSENAS). I define "high poverty" as rural districts in these provinces, which are: Lampung, Central Java, East Java, East Nusa Tenggara, Central Sulawesi, South Sulawesi, Southeast Sulawesi, Maluku, Irian Jaya (World Bank, 1979).
c The preprogram education is the average education in the region of birth for people born in 1962 or before. The median is 3.18 years.

In summary, it appears that the school construction program had an impact on education. It should be recalled that this program was accompanied by a general effort by the Indonesian government in favor of education, a priority of the second five-year plan. As part of this effort, primary school fees were suppressed in 1978 (World Bank, 1990). Therefore, these results cannot be generalized to less favorable contexts without caution.

D.
At What Level of Education Was the Program Effective?

The impact of the program on welfare depends on whether it primarily affected children with a low or a high level of education. Differences in differences in the cumulative distribution function of education provide information on the level at which the program was effective. In practice, for $S_{ijkm}$, a dummy that indicates whether individual i, born in region j, in year k, completed m years of education or less, and for $P_j$, a dummy indicating whether the child was born in a high program region, I estimate the following equation:

(4) $S_{ijkm} = c + \alpha_j + \beta_k + (P_j T_i)\kappa_m + \varepsilon_{ijk}$

The $\kappa_m$, for m = 0 to 19, are the values of the estimated impact of the program at each level of education. They are plotted in Figure 2 (the 95-percent confidence interval is plotted by broken lines).

FIGURE 2. DIFFERENCE IN DIFFERENCES IN CDF (ESTIMATED FROM LINEAR PROBABILITY MODEL) WITH 95-PERCENT CONFIDENCE INTERVAL [horizontal axis: years of education, 0 to 20]

The shape of Figure 2 indicates at what level the program was effective. The effect is increasing until the sixth year of education, decreasing until the twelfth, and slightly increasing thereafter. A maximum of about 6 percent of the sample living in high program regions were induced to complete at least primary school. This also shows some impact of the program on the probability of completing lower secondary school (1.5 percent of the sample is estimated to have been induced by the program to complete the 7th, 8th, and 9th grades or more). There is a negative difference in differences at the senior high-school level. The program increased average schooling essentially by increasing primary schooling.
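As a rough illustration of the quantity that equation (4) targets, the sketch below computes a difference in differences in the empirical CDF of education at each level m. It is a simplified two-by-two version (high versus low program region, young versus old cohort) and assumes away the region-of-birth and cohort-of-birth effects; the function names and the toy records are hypothetical and are not drawn from the SUPAS data. Under the "m years or less" definition used here, a negative value at m means that fewer young people in high program regions stopped at m years or below.

```python
def cdf_share(records, high, young, m):
    """Share of a (region type, cohort) cell with m years of education or less."""
    cell = [years for (h, y, years) in records if h == high and y == young]
    return sum(1 for years in cell if years <= m) / len(cell)


def did_in_cdf(records, max_years=19):
    """Difference in differences in P(S <= m) for m = 0..max_years.

    records: iterable of (high_program_region, young_cohort, years_of_education),
    where the first two entries are 0/1 dummies.
    """
    estimates = []
    for m in range(max_years + 1):
        change_high = cdf_share(records, 1, 1, m) - cdf_share(records, 1, 0, m)
        change_low = cdf_share(records, 0, 1, m) - cdf_share(records, 0, 0, m)
        estimates.append(change_high - change_low)
    return estimates


# Hypothetical toy data: only young people in high program regions shift
# from 3 to 6 years of education.
toy = (
    [(1, 0, s) for s in (3, 3, 6, 6)]    # high program region, old cohort
    + [(1, 1, s) for s in (3, 6, 6, 6)]  # high program region, young cohort
    + [(0, 0, s) for s in (3, 3, 6, 6)]  # low program region, old cohort
    + [(0, 1, s) for s in (3, 3, 6, 6)]  # low program region, young cohort
)
kappa = did_in_cdf(toy)
```

In this toy example the estimates are nonzero only for m between 3 and 5, the range over which the assumed shift in schooling takes place.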
The concentration of the effect at the primary level provides additional evidence that the assumption underlying the identification strategy is reasonable, as the estimated effect of the program for the levels of education that it did not target is small or nonexistent. The negative difference in differences at the senior high school level may indicate that some variable predicting the probability of attending senior high school is omitted from this regression (and changed in low program regions more than in high program regions). The program could also have induced more marginal people to complete primary school and move on to junior high school.4 However, the direct and indirect costs of junior high school were much higher than the costs of primary education and were not equalized across regions at the time. This may explain why we do not observe large spillovers.

4 For example, Angrist and Imbens (1995) find that compulsory attendance laws in the United States induce a fraction of the sample to complete some college as a consequence of constraining them to complete high school.

III. Effect on Wages

A. Basic Results

The same identification strategy can be applied to estimate the effect of the program on wages. As with education, I estimate:

(5) $y_{ijk} = c_1 + \alpha_{1j} + \beta_{1k} + (P_j T_i)\gamma_1 + (C_j T_i)\delta_1 + \varepsilon_{ijk}$

where $y_{ijk}$ is the logarithm of the 1995 wage of an individual i, born in region j, in cohort k. Results are presented in Table 4 [columns (4)-(6)] and in Figure 1. In Table 4, panel A, I set $T_i$ equal to 1 for children aged 2 to 6 in 1974, and use children aged 12 to 17 as the comparison group. In Table 4, panel A, the estimates range from 1.5 to 2.7 percent.
As in the case of education, the estimates increase when I control for enrollment rates in 1971 and for the allocation of the water and sanitation program, although none of these estimates is significantly different from the others. In panel B (which presents the control experiment), the interaction coefficient is small and not significantly different from zero in all specifications. However, these estimates are imprecise and I cannot reject equality of the coefficients in panels A and B (although the point estimates are much smaller in panel B).

B. Reduced-Form Evidence

As for education, we can write an unrestricted reduced-form relationship between exposure to the program and the logarithm of the wage of an individual ($y_{ijk}$):

(6) $y_{ijk} = c_2 + \alpha_{2j} + \beta_{2k} + \sum_{l=2}^{23} (P_j d_{il})\gamma_{2l} + \sum_{l=2}^{23} (C_j d_{il})\delta_{2l} + \upsilon_{ijk}$

where $\alpha_{2j}$ is a region-of-birth effect and $\beta_{2k}$ is a cohort-of-birth effect. $P_j$, $C_j$, and $d_{il}$ are defined as in the education equation: $P_j$ is the intensity of the program in the region of birth, $C_j$ is the vector of control variables, and $d_{il}$ is a dummy indicating whether individual i was of age l in 1974. The $\gamma_{2l}$ should be zero for l greater than 12 and start increasing as l falls below that threshold. Moreover, if the program affected wages only through its effect on education, the coefficients $\gamma_{2l}$ should track the $\gamma_{1l}$ (in the education equation). In Figure 3, the $\gamma_{2l}$ are plotted by a dotted line, and the $\gamma_{1l}$ are plotted by a solid line. Both oscillate until age 10 and start increasing after age 11. The coefficients of the interactions for education and wages track each other.

C. Restricted Estimates

Finally, in columns (7)-(9) of Table 5, I present estimates of the equation

(7) $y_{ijk} = c_2 + \alpha_{2j} + \beta_{2k} + \sum_{l=2}^{12} (P_j d_{il})\gamma_{2l} + \sum_{l=2}^{12} (C_j d_{il})\delta_{2l} + \upsilon_{ijk}.$

The effect of the program on wages is less precisely estimated than the effect on education because wages fluctuate more and the sample is smaller (given that wages are not collected for self-employed people).
However, qualitatively, the results parallel the estimated effects on education. No effect is found for children aged 10 or older in 1974. The coefficients are positive for younger children (except at age 7). The coefficients of the interactions generally decrease with age. The estimates are higher when I control for both enrollment rate and the water and sanitation program. The last line in this table indicates that constructing one school per 1,000 children increased the 1995 wages of individuals aged 2 in 1974 by 1.6 percent to 4.0 percent. The average number of schools constructed per 1,000 children is 1.89 in the sample with valid wage data. Thus, on average, the program caused a 3 to 7 percent increase in the wages of this cohort.

FIGURE 3. COEFFICIENTS OF THE INTERACTIONS AGE IN 1974*PROGRAM INTENSITY IN THE REGION OF BIRTH IN THE WAGE AND EDUCATION EQUATIONS [horizontal axis: age in 1974, from 24 down to 2; series: education, log(wage)]

In Table 6 (panel B), I present the estimates of equation (5) for different subsamples. The variations of the effect of the program on wages across subsamples parallel those on education. In particular, the program had no effects on wages in regions where it had no effect on years of education. This suggests that the program effect on wages was caused by the changes in years of education. In the next section, I use this to construct instrumental variables (IV) estimates of the effect of education on wages.

IV.
Estimating Returns to Education

The identification assumption that the evolution of wages and education across cohorts would not have varied systematically from one region to another, in the absence of the program, is sufficient to estimate the impact of the program. Additionally, if we assume that the program had no effect on wages other than by increasing educational attainment, one can use this program to construct instrumental variables estimates of the impact of additional years of education on wages. The most serious concern, for this interpretation, is that the program might have affected both the quality and the quantity of education, and that changes in wages could reflect both effects. I examine below whether there is evidence that this occurred.

A. Two-Stage Least-Squares Estimates of the Returns to Education

Estimates of equation (3) are of intrinsic interest because they provide an assessment of the impact of the program on education. But they also represent the first stage of a two-stage least-squares (2SLS) estimation of the impact of education on wages. Consider the following equation, which characterizes the causal effect of education on wages:

(8) $y_{ijk} = d + \alpha_j + \beta_k + S_{ijk} b + \eta_{ijk}$

where $\alpha_j$ and $\beta_k$ denote region-of-birth and cohort-of-birth effects, respectively. Note that the returns to education are measured in 1995. If the program was large enough to have general equilibrium effects on the returns to education, this will therefore be reflected in the estimates. Ordinary least-squares (OLS) estimates of equation (8) may be biased if there is a correlation between $\eta_{ijk}$ and $S_{ijk}$.
However, under the assumptions that the differences in wages across cohorts would not have been systematically correlated with the program intensity in the absence of the program, and that the program had no direct effect on wages, the interactions between the age in 1974 and the program intensity in the region of birth are available as instruments for equation (8). These instruments have been shown to have good explanatory power in the first stage. The equation will also be estimated using a single instrument, the interaction of being in the "young" cohort and the program intensity in the region of birth. Equation (8) can also be modified to incorporate control variables as follows:

(9) $y_{ijk} = d + \alpha_j + \beta_k + S_{ijk} b + \sum_{l=2}^{12} (C_j d_{il})\pi_l + \eta_{ijk}$

The results are presented in Table 7, panel A1 (panel A2 presents results with the logarithm of monthly earnings as the dependent variable). The first line shows the OLS estimate. The estimated return to education is 7.8 percent and is not affected by introducing control variables. This is lower than OLS estimates in Indonesia in older samples, but consistent with estimates in other Indonesian data sets of the 1990's and with the decline in estimated returns to education over time. The estimates reported in World Bank (1990) decrease from 19 percent in 1982 to 10 percent in 1986. The second line presents 2SLS estimates of equation (9) (the F-statistics of the overidentifying restrictions test are shown in square brackets). In column (1), the number of children in 1971 is the only control variable. The point estimate (6.75 percent) is slightly lower than the OLS estimate, although I cannot reject equality. In column (2), I introduce interactions between the enrollment rate in 1971 and year-of-birth dummies. The point estimate is higher than without the controls (8.1 percent). When I introduce a control for the water and sanitation program, the estimate is again slightly higher (10.6 percent).
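To make the mechanics of a single-instrument estimate concrete, the sketch below compares an OLS slope with an instrumental variables slope on simulated data. It is only an illustration under stated assumptions: the data-generating process, the parameter values, and the function names are hypothetical, there are no region or cohort effects or control variables, and the upward OLS bias is built into the simulation rather than taken from the estimates in Table 7. With one instrument and no controls, the IV slope reduces to the ratio of the reduced-form covariance to the first-stage covariance.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def ols_slope(s, y):
    """Slope from regressing y on s (with a constant)."""
    return cov(s, y) / cov(s, s)


def iv_slope(z, s, y):
    """Single-instrument IV slope: reduced form divided by first stage."""
    return cov(z, y) / cov(z, s)


def simulate(n=20000, true_return=0.08, seed=0):
    """Hypothetical data: exposure shifts schooling; ability shifts both."""
    rng = random.Random(seed)
    z, s, y = [], [], []
    for _ in range(n):
        exposure = rng.uniform(0.0, 4.0)   # stand-in for young cohort * intensity
        ability = rng.gauss(0.0, 1.0)      # unobserved background factor
        schooling = 6.0 + 0.5 * exposure + ability + rng.gauss(0.0, 1.0)
        log_wage = true_return * schooling + 0.05 * ability + rng.gauss(0.0, 0.3)
        z.append(exposure)
        s.append(schooling)
        y.append(log_wage)
    return z, s, y


z, s, y = simulate()
b_ols = ols_slope(s, y)    # contaminated by the unobserved factor
b_iv = iv_slope(z, s, y)   # uses only the variation induced by exposure
```

In this simulated setting the IV slope should land near the assumed return of 0.08, while the OLS slope is pushed upward by the omitted factor. Whether such a gap exists in real data is an empirical question.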
In the third line, I present the 2SLS estimate using only one instrument. The results are very similar to the IV estimates using more instruments. In Table 6, panel C, I examine whether returns to education vary across regions.5 They are higher (11 percent) in sparsely populated regions and in regions where the average education level of cohorts not exposed to the program is low (12 percent). They seem to be lower in regions where initial education was high, although the standard error of this estimate is too large to be conclusive. This last result is consistent with the idea that the general equilibrium effect of an increase in education is to depress the returns, but it suggests that even after the program, returns were still higher in regions that received more schools. I now turn to two potential sources of bias: the assumption that the program had no impact on wages other than through the increase in the quantity of education, and problems arising from sample selection.

5 I have not presented the 2SLS estimate when the F-statistic for the joint significance of the instruments in the first stage was below 2, because it would not be interpretable.

B. Could Change in Quality Bias the 2SLS Estimates?

Estimates of returns to education are biased if the program affects both the quality and the quantity of education. Two pieces of evidence suggest that the program did not substantially affect the quality of education. First, using data from the Ministry of Education and Culture, I verified that changes in the average pupil/teacher ratio between 1973 and 1977 were not systematically related to the number of INPRES schools constructed in each region.
TABLE 7-EFFECT OF EDUCATION ON LABOR MARKET OUTCOMES: OLS AND 2SLS ESTIMATES

Method  Instrument                                  (1)         (2)         (3)         (4)

Panel A: Sample of Wage Earners

Panel A1: Dependent variable: log(hourly wage)
OLS                                                 0.0776      0.0777      0.0767
                                                    (0.000620)  (0.000621)  (0.000646)
2SLS    Year of birth dummies*program               0.0675      0.0809      0.106       0.0908
        intensity in region of birth                (0.0280)    (0.0272)    (0.0222)    (0.0541)
                                                    [0.96]      [0.9]       [0.93]      [0.9]
2SLS    (Aged 2-6 in 1974)*program                  0.0752      0.0862      0.104
        intensity in region of birth                (0.0338)    (0.0336)    (0.0304)

Panel A2: Dependent variable: log(monthly earnings)
OLS                                                 0.0698      0.0698      0.0689
                                                    (0.000601)  (0.000602)  (0.000628)
2SLS    Year of birth dummies*program               0.0756      0.0925      0.0913      0.134
        intensity in region of birth                (0.0280)    (0.0278)    (0.0219)    (0.0631)
                                                    [0.73]      [0.63]      [0.58]      [0.7]

Panel B: Whole Sample

Panel B1: Dependent variable: participation in the wage sector
OLS                                                 0.0328      0.0327      0.0337
                                                    (0.00311)   (0.000311)  (0.000319)
2SLS    Year of birth dummies*program               0.101       0.118       0.0892
        intensity in region of birth                (0.0210)    (0.0197)    (0.0162)
                                                    [0.66]      [0.93]      [1.12]

Panel B2: Dependent variable: log(monthly earnings), imputed for self-employed individuals
OLS                                                 0.0539      0.0539      0.0539
                                                    (0.000354)  (0.000354)  (0.000355)
2SLS    Year of birth dummies*program               0.0509      0.0745      0.0346
        intensity in region of birth                (0.0157)    (0.0136)    (0.0138)
                                                    [0.68]      [0.58]      [1.16]

Control variables:
Year of birth*enrollment rate in 1971               No          Yes         Yes         Yes
Year of birth*water and sanitation program          No          No          Yes         No
Propensity score, propensity score squared          No          No          No          Yes

Notes: Year of birth dummies, region of birth dummies, and the interactions between year of birth dummies and the number of children in the region of birth in 1971 are included in the regressions. Standard errors are in parentheses.
F-statistics of the test of overidentification restrictions are in square brackets.

Second, the program did not affect the educational attainment of individuals completing nine years of education or more (as shown in Section IV). However, if the quality of education had been affected, their wages would have reflected it. I estimated equation (2) in the sample of individuals with an education level above 9 years. No specific pattern emerges.6 The evidence in Table 6 can be interpreted along the same lines: In densely populated regions [column (3)], the program had no effect on years of education, and it also had no effect on wages. If the quality of education had changed and this had affected wages, we would see an effect of the program on wages. These two separate pieces of evidence lend some support to the assumption that the increase in wages was attributable mainly to the increase in the quantity of education. There is no clear evidence that the program significantly altered the quality of education.

6 A figure is shown in the working paper version (2000) of this study.

D. Correction for Sample Selection

The returns to education are estimated in a selected sample: Only 45 percent of the individuals in the sample are working for a wage, with most remaining individuals being self-employed. The probability of working for a wage is potentially affected by education. To examine this, I use 2SLS to estimate

(10) $w_{ijk} = d + \alpha_j + \beta_k + S_{ijk}\lambda + \sum_{l=2}^{12} (C_j d_{il})\pi_l + \eta_{ijk}$

where $w_{ijk}$ is a dummy variable indicating whether an individual reports a positive wage. Estimates of this equation are presented in Table 7, panel B1. The IV coefficient ranges from 0.09 to 0.12. The probability of working for a wage is indeed affected by education.
This is an interesting result, but it casts a shadow on the validity of the 2SLS estimate of returns to education. Because the probability of working for a wage is affected by schooling (and by the instruments), the sample selection is likely to induce a correlation between the instruments and the error in equation (9). I implement two alternative procedures to investigate whether sample selection is likely to be an important problem in this case.

First, I follow a suggestion introduced by James Heckman and Joseph Hotz (1989), later elaborated by Hyungtaik Ahn and James L. Powell (1993), to condition in the second stage on the probability of selection given the instruments. In practice, an indicator of whether the individual is working for a wage is regressed on the instruments, and polynomials of the predicted value from this regression are introduced as controls in the wage equation. The result of the introduction of the correction for sample selection is presented in Table 7, column (4) (panel A1). The coefficient changes very little, from 8.1 percent [in column (2)] to 9.2 percent.

An alternative approach is to impute an income for self-employed individuals and examine whether the results change when the estimation is performed in this "completed sample." The income and expenditure module of the 1993 SUSENAS survey, made up of 50,000 individuals, allows us to compute income for all individuals, but it does not contain the region of birth. Households report the members' occupations and the sector of activity from which they derive their main source of income. I calculate the average income derived from the main activity of the household for cells defined by sector (nine industrial sectors and services and four types of agricultural activities), status, and urban/rural residence.
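The cell-averaging step can be sketched as follows. This is a minimal illustration, assuming hypothetical record layouts and function names; it is not the procedure actually run on the SUSENAS and SUPAS files. The `scale` argument is a placeholder for any rescaling between the two surveys and defaults to no rescaling.

```python
from collections import defaultdict
from math import log


def cell_means(survey_rows):
    """Average income by (sector, status, urban) cell.

    survey_rows: iterable of (sector, status, urban, income) tuples.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for sector, status, urban, income in survey_rows:
        acc = totals[(sector, status, urban)]
        acc[0] += income
        acc[1] += 1
    return {cell: total / count for cell, (total, count) in totals.items()}


def log_earnings(person, means, scale=1.0):
    """Recorded log earnings if available, else the log of the cell average.

    person: dict with keys "wage" (None if not recorded), "sector",
    "status", and "urban". Returns None when no cell average exists.
    """
    if person["wage"] is not None:
        return log(person["wage"])
    cell = (person["sector"], person["status"], person["urban"])
    if cell not in means:
        return None
    return log(means[cell] * scale)


# Hypothetical survey rows and individuals.
rows = [
    ("agriculture", "self", 0, 100.0),
    ("agriculture", "self", 0, 300.0),
    ("trade", "self", 1, 500.0),
]
means = cell_means(rows)
wage_earner = {"wage": 400.0, "sector": "trade", "status": "employee", "urban": 1}
farmer = {"wage": None, "sector": "agriculture", "status": "self", "urban": 0}
```

Here `log_earnings(farmer, means)` falls back on the average of the two agricultural rows, while `log_earnings(wage_earner, means)` uses the recorded wage.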
I then "complete" the SUPAS sample by defining the dependent variable as the logarithm of monthly earnings if they are recorded in the SUPAS data (for individuals working for wages) and the logarithm of the average income from the SUSENAS in the individual's occupation cell for all self-employed individuals (multiplied by the wage inflation factor, defined as the ratio of the average wage from the SUPAS and the average income of wage earners imputed from the SUSENAS).7 The results are presented in Table 7, panel B2. They must be compared to the results in panel A2, where the dependent variable is the logarithm of monthly earnings of wage earners. In all cases, the estimates using the completed sample are smaller than those using the sample of wage earners. In the specification controlling for the water and sanitation program, the estimate drops to 3.5 percent. This particular result is surprising, but the fact that the returns for the complete sample are somewhat smaller than those for the sample of wage earners indicates that returns to education might be higher in the wage sector than among the self-employed.

7 Individuals who did not work at least one hour in the previous week do not report a branch of activity. They are, therefore, still excluded from this sample.

V. Comparing Costs and Benefits

The estimates of the program's effect on wages can be used to compare the costs of building and operating the new schools to the additional wealth they generated, under the assumption that the increase in wages represents an increase in human capital. Note that in this case, the increase in wages underestimates the total benefit generated by the program: The increase in education is likely to affect other outcomes (fertility, child morbidity and mortality, etc.).
These calculations require additional assumptions and should be taken with considerable caution. Nevertheless, it is useful to estimate the magnitude of the consequences of such a large-scale program. Using information contained in the presidential instruction and in a study on the cost of education in Indonesia conducted in 1971 (Daroesman, 1971), I estimated the cost of building, staffing, and maintaining the INPRES schools for 20 years. Yearly costs are estimated using the following formula:

$C(t) = rK + rTC + 1.25\,W(t)$

where K is the total capital cost, TC is the total training cost of new teachers, W(t) represents the sum of teachers' salaries at date t, 1.25 is the average ratio of total recurrent costs over wage costs, and r is the real interest rate (discount rate). I present the cost-benefit analysis for two different assumptions about the deadweight burden of taxation (0.2 and 0.6).

Further assumptions are needed to compute the yearly benefits of the program. First, an important assumption is that the increase in wages attributed to the program represents an increase in the productivity of labor (and that there is no general equilibrium effect on the returns to education). Second, I assume that the effect on (working) women and on self-employed people is the same as the effect on men working for a wage. I also assume that the share of total labor income going to people of any given age is constant across years and is equal to the share of total wages going to this cohort in 1995 (which I can calculate from my data). Thus, I estimate the benefit of the program at date t for a cohort c using the following formula:

$B(c, t) = \alpha\,GDP(t)\,S(c, t)\,E(c)$

where $\alpha$ is the share of labor in GDP, S(c, t) is the fraction of total wages earned by cohort c in year t, and E(c) is the estimated average effect of the program on cohort c. To obtain the total benefits for each year, I take the sum of these benefits over all cohorts.
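The two formulas above can be turned into a small calculation, sketched here under stated assumptions. The function names, the dictionary layout for cohorts, and all example inputs are hypothetical, and the sketch leaves out the deadweight burden of taxation, whose exact treatment the text does not spell out. The last function computes a discounted sum of net benefits from yearly series, with year 0 undiscounted.

```python
def yearly_cost(r, capital_cost, training_cost, teacher_wages_t, recurrent_ratio=1.25):
    """C(t) = r*K + r*TC + 1.25*W(t)."""
    return r * capital_cost + r * training_cost + recurrent_ratio * teacher_wages_t


def yearly_benefit(labor_share, gdp_t, wage_shares_t, effects):
    """Sum over cohorts c of alpha * GDP(t) * S(c, t) * E(c).

    wage_shares_t and effects are dicts keyed by cohort.
    """
    return sum(
        labor_share * gdp_t * wage_shares_t[cohort] * effects[cohort]
        for cohort in effects
    )


def discounted_net_benefits(benefits, costs, discount_rate):
    """Discounted sum of (benefit - cost), with year 0 undiscounted."""
    return sum(
        (b - c) / (1.0 + discount_rate) ** t
        for t, (b, c) in enumerate(zip(benefits, costs))
    )


# Hypothetical inputs, for illustration only.
cost_example = yearly_cost(r=0.05, capital_cost=522.0, training_cost=0.0,
                           teacher_wages_t=100.0)
benefit_example = yearly_benefit(
    labor_share=0.7,
    gdp_t=1000.0,
    wage_shares_t={1968: 0.02, 1972: 0.03},
    effects={1968: 0.03, 1972: 0.05},
)
npv_example = discounted_net_benefits([0.0, 110.0], [100.0, 0.0], discount_rate=0.10)
```

In the last line, a cost of 100 in year 0 followed by a benefit of 110 in year 1 nets out to zero at a 10 percent discount rate, which is the sense in which an internal rate of return sets the net present value to zero.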
The relevant variable for the cost-benefit calculation is the discounted sum of net benefits. In Table 8, I present an evaluation of the program's returns for the first two specifications estimated in Table 4 and three different assumptions about the projected growth rate of GDP from 1996 to 2050. To evaluate the contribution of economic growth to the benefits of the program, I also present these results with the assumption that Indonesia's GDP grew at a rate of 2 percent annually from 1973 to 2050.

The cost-benefit analysis is sensitive to the specification chosen for the estimation of the program effect and to the assumptions about future growth rates in Indonesia. Nevertheless, three main points emerge from this analysis. First, a school construction program takes a very long time to generate positive returns (because the costs are incurred early on, whereas the benefits are spread over a generation). Second, the returns generated are large. The internal rates of return range from 8.8 to 12 percent, well above the average interest rate on government debt in Indonesia during the period. Third, the benefits are, to a large extent, driven by the rapid growth of Indonesia's GDP from 1973 to 1997 (which results from the fact that each year's benefits are a fraction of that year's GDP). If the growth rate had been very low from 1973 until today, the net present value of the program would actually have been slightly negative, according to all specifications but one. Investing in education is much more valuable, from a government point of view, if it expects fast subsequent growth.

VI. Conclusion

The INPRES program led to an increase in educational attainment in Indonesia. On average, the estimates indicate that the program led to an increase of 0.25 to 0.40 years of education (0.12 to 0.19 years for each new school built per 1,000 children), and increased by 12 percent the probability that an affected child would complete primary school.
The estimates also suggest that the program led to an increase of 3 to 5.4 percent in wages. Combining the effect of the program on years of schooling and wages generates 2SLS estimates of economic returns to education ranging from 6.8 to 10.6 percent. These 2SLS estimates are close to and not significantly different from the OLS estimates. Therefore, these estimates do not support the view that OLS estimates of returns to education in developing countries are biased upward as a result of omitted family and community background variables, which has been argued by Behrman (1990), among others. Nor do they conform to most studies in industrialized countries, which obtain higher IV estimates than OLS estimates [see surveys in Orley Ashenfelter et al. (1999) and Card (1999)]. Both the OLS estimates and the 2SLS estimates are similar to most estimates reported for developed countries, but smaller than estimates reported in Psacharopoulos (1994) for developing economies.

TABLE 8-EVALUATION OF THE PROGRAM'S NET RETURN

                                                   Deadweight loss coefficient
                                                   0.2                 0.6
                                                   (1)       (2)       (3)       (4)

Panel A: Results
Control for year of birth*enrollment rate          No        Yes       No        Yes
First year where benefit > costs (discount rate = 5 percent)
  In annual value                                  1996      1996      1997      1997
  In discounted sum                                2005      2002      2009      2005
Discounted sum of net benefits in 2050 (growth rate after 1997 = 5 percent, discount rate 5 percent)
  In million 1990 U.S.$                            13,025    13,096    11,340    18,807
  As a fraction of Indonesia's GDP in 1973         0.30      0.36      0.31      0.52
  Divided by initial costs                         24.1      24.2      21.0      35.0
Discounted sum of net benefits in 2050 (growth rate after 1997 = 2 percent, discount rate 5 percent)
  In million 1990 U.S.$                            6,691     11,589    5,008     9,905
  As a fraction of Indonesia's GDP in 1973         0.18      0.32      0.14      0.27
  Divided by initial costs                         12.4      21.4      9.26      18.3
Discounted sum of net benefits in 2050 (growth rate from 1973 = 2 percent, discount rate 5 percent)
  In million 1990 U.S.$                            -631.6    1,200     -2,315    -483
  As a fraction of Indonesia's GDP in 1973         -0.017    0.033     -0.063    -0.013
  Divided by initial costs                         -1.16     2.22      -4.28     -0.89
Internal rate of return(a)
  Growth rate after 1997 = 5 percent               0.102     0.118     0.0895    0.105
  Growth rate after 1997 = 2 percent               0.088     0.106     0.0750    0.0915
  Growth rate from 1973 = 2 percent                0.0443    0.059     0.0326    0.0467

Panel B: Assumptions and Parameters
  Population growth rate after 1997                                0.015
  Yearly teacher's salary in 1973 (1990 U.S. dollars)              363
  Yearly teacher's salary in 1995 (1990 U.S. dollars)              2,467
  Total recurrent costs/teacher salary                             1.25
  Total cost of construction (million 1990 U.S. dollars)           522
  Number of schools constructed                                    61,800
  Lifetime of the schools (years)                                  20
  Share of labor income in GDP                                     0.7

Notes: The estimates underlying these calculations are taken from Table 5 [columns (7) and (8)]. Program effect has been set to 0 for children aged 7 or older in 1974.
a The internal rate of return is the interest rate such that the net present value of the project at infinity is 0.

A number of specification checks support the causal interpretation of these estimates of the effect of the INPRES program. However, they need not generalize to other contexts. First, the emphasis on education in Indonesia at the time of the program created a context particularly favorable to its success. Second, the program was large and could have had general equilibrium effects on the returns to education. Since the returns to education are estimated for 1995, in an environment where the education levels were higher than when the program began, individuals' returns may be lower than they would be in other developing countries.
Finally, if returns to education are not constant, the 2SLS estimates are a weighted average of the returns to education for people who are affected by the instruments (Angrist and Imbens, 1995). The INPRES program induced variation only at the primary school level. Returns to secondary education may be different. In particular, flexible OLS specifications allowing the returns to education to vary by year suggest that returns to education may be convex in developing countries (Strauss and Thomas, 1995). Moreover, individuals whose education level changed because of the program may experience returns to education that differ from the population average. On one hand, those affected children likely belong to the poorest segment of the population because they were prevented from attending school by the lack of infrastructure. On the other hand, they took advantage of the opportunity once it arose. It is conceivable that only individuals with high expected returns chose to do so.

The findings reported here are important because they show that an unusually large government-administered intervention was effective in increasing both education and wages in Indonesia. This intervention was meant to increase the quantity of education. It is sometimes feared that the deterioration in the quality of education that might result from this type of program could offset any gain in quantity. However, the estimates reported here suggest that the program was effective in increasing not only education levels but also wages. This suggests that the combined effect of quality and quantity changes in education was an increase in human capital.

This study concentrated on estimating the private returns to education. This large increase in the education of the young cohorts, however, may have had a broader impact on the Indonesian economy. How did the economy adjust to a shock in the supply of educated workers? Studying these effects will be the object of future work.
REFERENCES

Ahn, Hyungtaik and Powell, James L. "Semiparametric Estimation of Censored Selection Models with a Nonparametric Selection Mechanism." Journal of Econometrics, July 1993, 58(1-2), pp. 3-29.

Angrist, Joshua D. "The Economic Returns to Schooling in the West Bank and Gaza Strip." American Economic Review, December 1995, 85(5), pp. 1065-87.

Angrist, Joshua D. and Imbens, Guido W. "Two Stage Least Squares Estimation of Average Causal Effects in Models with Variable Treatment Intensity." Journal of the American Statistical Association, June 1995, 90(430), pp. 431-42.

Ashenfelter, Orley; Harmon, Colm P. and Oosterbeek, Hessel. "A Review of Estimates of the Schooling/Earnings Relationship." Labor Economics, November 1999, 6(4), pp. 453-70.

Behrman, Jere R. "The Action of Human Resources and Poverty on One Another: What We Have Yet to Learn." World Bank Living Standards Measurement Studies Working Paper No. 74, 1990.

Card, David. "The Causal Effect of Education on Earnings," in Orley Ashenfelter and David Card, eds., Handbook of labor economics. Amsterdam: North-Holland, 1999, pp. 1802-63.

Card, David and Krueger, Alan. "Does School Quality Matter? Returns to Education and the Characteristics of Public Schools in the United States." Journal of Political Economy, February 1992, 100(1), pp. 1-40.

Card, David and Lemieux, Thomas. "Earnings, Education and the Canadian GI Bill." National Bureau of Economic Research (Cambridge, MA) Working Paper No. 6718, September 1998.

Daroesman, Ruth. "Finance of Education." Bulletin of Indonesian Economic Studies, December 1971, Pts. 1 and 2, 7(3), pp. 61-95.

Duflo, Esther. "Schooling and Labor Market Consequences of School Construction in Indonesia: Evidence from an Unusual Policy Experiment." National Bureau of Economic Research (Cambridge, MA) Working Paper No. 7860, August 2000.

Heckman, James J. and Hotz, V. Joseph. "Choosing Among Alternative Non Experimental Methods for Estimating the Impact of Social Programs: The Case of Manpower Training." Journal of the American Statistical Association, December 1989, 84(408), pp. 862-74.

Psacharopoulos, George. "Returns to Investments in Education: A Global Update." World Development, September 1994, 22(9), pp. 1325-43.

Rosenzweig, Mark R. and Wolpin, Kenneth I. "Migration Selectivity and the Effects of Public Programs." Journal of Public Economics, December 1988, 37(3), pp. 265-89.

Strauss, John and Thomas, Duncan. "Human Resources: Empirical Modeling of Household and Family Decisions," in Jere Behrman and T. N. Srinivasan, eds., Handbook of development economics. Amsterdam: North-Holland, 1995, 3A(9), pp. 1885-2023.

World Bank. "Indonesia: Strategy for a Sustained Reduction in Poverty." Washington, DC: World Bank Country Study, 1990.