Estimates of the Economic Return to Schooling from a New Sample of Twins

By Orley Ashenfelter and Alan Krueger*

Source: The American Economic Review, Vol. 84, No. 5 (Dec., 1994), pp. 1157-1173
Published by: American Economic Association
Stable URL: http://www.jstor.org/stable/2117766
Accessed: 17-03-2015 15:28 UTC
This content downloaded from 147.251.185.127 on Tue, 17 Mar 2015 15:28:07 UTC. All use subject to JSTOR Terms and Conditions.

This paper uses a new survey to contrast the wages of genetically identical twins with different schooling levels. Multiple measurements of schooling levels were also collected to assess the effect of reporting error on the estimated economic returns to schooling. The data indicate that omitted ability variables do not bias the estimated return to schooling upward, but that measurement error does bias it downward. Adjustment for measurement error indicates that an additional year of schooling increases wages by 12-16 percent, a higher estimate of the economic returns to schooling than has been previously found. (JEL J31)

This paper uses a new survey of identical twins to study the economic returns to schooling.
We estimate the returns to schooling by contrasting the wage rates of identical twins with different schooling levels. Our goal is to ensure that the correlation we observe between schooling and wage rates is not due to a correlation between schooling and a worker's ability or other characteristics. We do this by taking advantage of the fact that monozygotic (from the same egg) twins are genetically identical and have similar family backgrounds. In our survey we also took some unusual steps to measure a worker's schooling level accurately. We obtained independent estimates of each sibling's schooling level by asking the twins to report on both their own and their twin's schooling. These new data provide a simple and powerful method for assessing the role of measurement error in estimates of the economic returns to schooling.

* Industrial Relations Section, Princeton University, Princeton, NJ 08544. This research was supported by the Industrial Relations Section, Princeton University, and the National Science Foundation (SES-9012149). We are indebted to Graham Bürge, Greg Fisher, Kevin Hallock, and Michael Quinn for excellent assistance with data collection and processing, and to Michael Boozer for assistance with econometric computations. We are also indebted to Andy Miller of the Twins Days Festival, Twinsburg, Ohio, for help in arranging our interview survey of twins. We have received helpful comments on an earlier draft from James Heckman, David Neumark, and the referees.

The results of our study indicate that the economic returns to schooling may have been underestimated in the past.1 We estimate that each year of school completed increases a worker's wage rate by 12-16 percent. This estimate is nearly double previous estimates, and it is much greater than the estimate we would have obtained from these data had we been unable to adjust for omitted ability variables and measurement error.
Surprisingly, we find no evidence that unobserved ability is positively related to the schooling level completed; instead, we find some weak evidence that unobserved ability may be negatively related to schooling level. We also find significant evidence of measurement error in schooling levels. Our results indicate that measurement error may lead to considerable underestimation of the returns to schooling in studies based on siblings.

1 Jacob Mincer (1974) shows that if the return to schooling is independent of schooling level, and if the only costs of schooling are forgone earnings, then the proportional increase in earnings per year of schooling is the rate of return on schooling investments. We follow conventional practice and simply call the proportional earnings increase per year of schooling the rate of return.

We begin the paper with a discussion of the data we have collected. We compare our sample with more conventional data and with other surveys of twins, and we report on the extent of the measurement error we have found. We next report the detailed results of our study of the earnings of twins using conventional econometric methods to adjust for measurement error. In a final section of the paper we provide estimates and tests of the restrictions from a simple model of the earnings process that incorporates errors in the measurement of schooling.

I. Data Collection and Appraisal

Our goal was to obtain a sample of data on twins in which we could obtain independent measures of each sibling's schooling level. We realized at the outset that this would be a simple task if both twins could be interviewed simultaneously. Both twins could then be asked questions about themselves and their siblings.
A natural place to interview twins for this purpose is one of the many "twins festivals" held throughout the United States. In fact, we chose to attend the 16th Annual Twins Days Festival in Twinsburg, Ohio, in August of 1991. The Twinsburg Festival is the largest gathering of twins in the world, and in 1991, it attracted over 3,000 sets of twins, triplets, and quadruplets, many of whom were children. We managed to interview over 495 separate individuals over the age of 18 during the three days of the festival.

A. Data Collection

Our data-collection instrument was patterned after the questionnaire used by the Bureau of the Census for the Current Population Survey (CPS). (A copy of the questionnaire we used is available from the authors upon request.) Many of the questions on the survey are identical to those administered in the CPS, but some were written by us and are relevant only for a study of twins.

Monozygotic (commonly called "identical") twins result from the splitting of a fertilized egg and are considered to be genetically identical. Dizygotic (commonly called "fraternal") twins result from the fertilization of separate eggs and lead only to siblings that are genetically similar, as are non-twin brothers and sisters. One goal of our survey instrument was to determine whether the twins we interviewed were identical or fraternal. Much of our analysis below is restricted to a sample of identical twins.2

Our interviewing technique employed a team of five interviewers. The Twinsburg Festival maintains a research pavilion, which consists of a tent near the main entrance to the festival where researchers are located. To carry out our survey we placed an advertisement in the festival program inviting all adult twins to come to our booth to be interviewed. As an incentive we offered to make a contribution to the Twins Festival Scholarship Fund for every pair of adult twins who completed an interview.
Our interviewers also roved throughout the festival grounds and approached every adult twin pair they encountered with a request for an interview. We were pleasantly surprised to find that virtually every pair of twins that we approached agreed to participate in our interviews. (Only four pairs of twins refused to be interviewed.) At the outset we were concerned that our questions about earnings, when asked in a face-to-face interview, might lead to some nonresponse. As it turned out, our concerns were misplaced, and virtually every twin provided the requested data (leading to a response rate for this question that is far higher than in the CPS). We asked each twin about his or her wage rate on the most recent job, but we have included twins in our sample only if they held jobs within the previous two years. In every case we separated the twins for the purposes of our interview, so that no twin heard his or her sibling's response to the questionnaire.

2 We determined whether twins were identical by their answers to the question "Is your twin brother/sister an identical twin? That is, are you monozygotic twins?" In a study of questionnaire responses by pairs who claimed to be monozygotic twins Seymour Jablon et al. (1967) found that fewer than 3 percent were incorrect as measured by serological tests.

Although we report on a detailed comparison of our survey with data from the CPS below, we have some casual impressions about our sample of twins that should be kept in mind. Much of the purpose of a twins festival is to celebrate the similarity of the twins who are present. For the participants, these festivals provide an environment where twins are not so unusual as they ordinarily seem. The participants therefore tend to dress alike and to celebrate their similarity.
As a result, we suspect that twins in our sample may bear stronger similarities than would be the case in a random sample of twins. For example, our sample contains a far greater representation of identical twins relative to fraternal twins than would exist in a random sample. These similarities will cause no problem for estimating the returns to schooling, but they may make a comparison of our study with other studies of twins more difficult. On the other hand, the twins in our study do vary in dimensions that the twins in other studies do not. For example, the Jere Behrman et al. (1980) study is based on a sample of male veterans of World War II. Our study has a representation considerably broader than this, and it includes women as well as men.

B. Representativeness of the Sample

Table 1 provides sample means and standard deviations for the variables we study below and for a few additional variables designed to measure the extent to which the twins shared a common environment. The table also contains similar data from the Current Population Survey for comparison purposes. Two things are clear from this table. First, although similar to the CPS sample, our sample of twins is better educated and more highly paid than the CPS sample. Likewise, our sample of twins is younger and contains more women and whites than the CPS sample.
Second, it is clear that the identical twins in our sample tend to have similar education levels, and that identical twins bear a closer similarity than fraternal twins. For example, 49 percent of identical twins (but 43 percent of fraternal twins) report attaining exactly the same level of education, while 74 percent of identical twins (but 38 percent of fraternal twins) report having studied together during high school.

Table 1. Descriptive Statistics
Means (standard deviations in parentheses)

Variable                      Identical twins^a   Fraternal twins^a   Population^b
Self-reported education       14.11 (2.16)        13.72 (2.01)        13.14 (2.73)
Sibling-reported education    14.02 (2.14)        13.41 (2.07)        -
Hourly wage                   $13.31 (11.19)      $12.07 (5.40)       $11.10 (7.41)
Age                           36.56 (10.36)       35.59 (8.29)        38.91 (12.53)
White                         0.94 (0.24)         0.93 (0.25)         0.87 (0.34)
Female                        0.54 (0.50)         0.48 (0.50)         0.45 (0.50)
Self-employed                 0.15 (0.36)         0.10 (0.30)         0.12 (0.32)
Covered by union              0.24 (0.43)         0.30 (0.46)         -
Married                       0.45 (0.50)         0.54 (0.50)         0.62 (0.48)
Age of mother at birth        28.27 (6.37)        29.38 (7.05)        -
Twins report same education   0.49 (0.50)         0.43 (0.50)         -
Twins studied together        0.74 (0.44)         0.38 (0.49)         -
Helped sibling find job       0.43 (0.50)         0.24 (0.43)         -
Sibling helped find job       0.35 (0.48)         0.22 (0.41)         -
Sample size                   298                 92                  164,085

a Source: Twinsburg Twins Survey, August 1991.
b Source: 1990 Current Population Survey (Outgoing Rotation Groups File). Sample includes workers aged 18-65 with an hourly wage greater than $1.00 per hour.

Table 2 reports the correlations among the (logarithmic) wages, (self-reported and sibling-reported) education levels, and father's and mother's education levels for our sample of twins. In all our analyses we have randomly selected one twin as the first in each pair. We write $S_1^1$ for the self-reported education level of the first twin, $S_1^2$ for the sibling-reported education level of the first twin, $S_2^2$ for the self-reported education level of the second twin, and $S_2^1$ for the sibling-reported education level of the second twin. (That is, $S_n^m$, $m, n = 1, 2$, refers to the education level of the $n$th twin as reported by the $m$th twin.) All six of the possible correlations are reported in the table. It is apparent that the independent measures of education levels are highly correlated. There are, of course, two measures of the father's and mother's education levels, and we have reported the correlations across both of these also.

Table 2. Correlation Matrices

A. Identical Twins

Variable                      Y_1     Y_2     S_1^1   S_1^2   S_2^2   S_2^1   E_F^1   E_F^2   E_M^1   E_M^2
Y_1                           1.000
Y_2                           0.563   1.000
S_1^1                         0.382   0.168   1.000
S_1^2                         0.375   0.140   0.920   1.000
S_2^2                         0.267   0.272   0.658   0.697   1.000
S_2^1                         0.248   0.247   0.700   0.643   0.877   1.000
Father's education (E_F^1)    0.155   0.088   0.345   0.266   0.361   0.416   1.000
Father's education (E_F^2)    0.159   0.091   0.357   0.278   0.320   0.389   0.857   1.000
Mother's education (E_M^1)    0.102   0.088   0.348   0.343   0.392   0.410   0.614   0.644   1.000
Mother's education (E_M^2)    0.126   0.087   0.316   0.321   0.322   0.337   0.503   0.579   0.837   1.000

B. Fraternal Twins

Variable                      Y_1     Y_2     S_1^1   S_1^2   S_2^2   S_2^1   E_F^1   E_F^2   E_M^1   E_M^2
Y_1                           1.000
Y_2                           0.364   1.000
S_1^1                         0.142   0.233   1.000
S_1^2                         0.128   0.256   0.869   1.000
S_2^2                         0.140   0.367   0.543   0.535   1.000
S_2^1                         0.136   0.387   0.621   0.565   0.951   1.000
Father's education (E_F^1)    0.109   0.028   0.332   0.408   0.353   0.407   1.000
Father's education (E_F^2)    0.025   -0.107  0.259   0.392   0.230   0.253   0.803   1.000
Mother's education (E_M^1)    0.147   -0.117  0.025   0.127   0.244   0.244   0.547   0.458   1.000
Mother's education (E_M^2)    -0.065  -0.178  0.180   0.216   0.109   0.180   0.587   0.600   0.742   1.000

Note: $Y_1$ and $Y_2$ represent sibling 1's and sibling 2's log hourly wage rate, respectively.

It is apparent from the table that the wage rates and education levels of identical twins are highly correlated and that they are more highly correlated than the wage rates and education levels of fraternal twins.

It is possible to compare some of the correlations in Table 2 with other reports of sibling correlations. For identical twins, Behrman et al. (1980) report intrapair correlations of 0.76 for years of schooling and 0.55 for (the logarithm of) earnings. These may be contrasted with our estimates of intrapair correlations for identical twins of 0.66 for self-reported schooling and 0.56 for (the logarithm of) wage rates. For fraternal twins Behrman et al. report intrapair correlations of 0.55 for schooling (compared to our estimate of 0.54) and 0.30 for earnings (compared to our estimate of 0.36). Although they are not identical, the correlation coefficients from the Behrman et al. data differ only a little from those in our survey.

C. The Extent of Measurement Error

The correlations in Table 2 provide a comprehensive set of estimates of the measurement error in these data. In the classical model of measurement error we may write

$S_n^m = S_n + v_n^m$

where $S_n$ is the true schooling level and $v_n^m$ ($m = 1, 2$) are measurement errors that are uncorrelated with $S_n$ ($n = 1, 2$) and with each other.3

3 We call this the "classical measurement error model." The assumption that the measurement errors are uncorrelated with each other may be relaxed by allowing a family fixed effect in the measurement error, or a correlation between the two reports by a single twin, and we do so in Section III.

In this model the correlation between the two measures of schooling, $S_n^1$ and $S_n^2$, is just

$\mathrm{Var}(S_n) / [\mathrm{Var}(S_n^1) \cdot \mathrm{Var}(S_n^2)]^{1/2}$.

This correlation is the fraction of the variance in the reported measures of schooling that is due to true variation in schooling.
This ratio is sometimes called the "reliability ratio" of the schooling measure. The two estimates of the reliability ratio for the twins' schooling levels in Table 2 are 0.92 and 0.88. These estimates indicate that between 8 percent and 12 percent of the measured variance in schooling levels is error. Previous estimates of the reliability ratio in schooling levels (derived by resurveying) by Paul Siegel and Robert Hodge (1968) and William Bielby et al. (1977) have ranged between 0.80 and 0.93 and are very similar to our estimates from the survey of twins.

Since both twins were asked about the schooling levels of their parents, it is also possible to estimate the measurement error in parental schooling levels. These estimates of the reliability ratio in the schooling levels of the twins' parents are lower than the estimates of the reliability ratios for the twins themselves. The reliability ratios are around 0.86 for the father's schooling and 0.84 for the mother's schooling.

II. Conceptual Framework and Basic Empirical Results

A. Conceptual Framework

We denote by $y_{1i}$ and $y_{2i}$ the logarithms of the wage rates of the first and second twins in the $i$th pair. We let $X_i$ represent the set of variables that vary by family, but not across twins. In our study the variables in $X_i$ include age, race, and any measures of family background. We let $Z_{1i}$ and $Z_{2i}$ represent the sets of variables that may vary across the twins. In our study these variables include the education levels, union status, job tenure, and marital status of each twin. A general setup (see e.g., Gary Chamberlain, 1982) specifies wage rates as consisting of an unobservable component that varies by family, $\mu_i$, observable components that vary by family, $X_i$, observable components that vary across individuals, $Z_{1i}$ and $Z_{2i}$, and unobservable individual components ($\varepsilon_{1i}$ and $\varepsilon_{2i}$). This implies

(1)  $y_{1i} = \alpha X_i + \beta Z_{1i} + \mu_i + \varepsilon_{1i}$
and

(2)  $y_{2i} = \alpha X_i + \beta Z_{2i} + \mu_i + \varepsilon_{2i}$

where we assume that the equations are identical for the two twins. A general representation for the correlation between the family effect and the observables is

(3)  $\mu_i = \gamma Z_{1i} + \gamma Z_{2i} + \delta X_i + \omega_i$

where we have assumed that the correlations between the family effect and the observables for each twin are the same, and where $\omega_i$ is uncorrelated with $Z_{1i}$, $Z_{2i}$, and $X_i$. The coefficients $\gamma$ measure the "selection effect" relating earnings and the observables, while the coefficients $\beta$ measure the structural (or selection-corrected) effect of the observables on earnings.4 The data on twins make it possible to measure the selection effect and therefore to identify the rate of return to schooling. The reduced form for this model is obtained by substituting (3) into (2) and (1) and collecting terms:

(4)  $y_{1i} = [\alpha + \delta] X_i + [\beta + \gamma] Z_{1i} + \gamma Z_{2i} + \varepsilon'_{1i}$

(5)  $y_{2i} = [\alpha + \delta] X_i + \gamma Z_{1i} + [\beta + \gamma] Z_{2i} + \varepsilon'_{2i}$

where $\varepsilon'_{1i} = \omega_i + \varepsilon_{1i}$ and $\varepsilon'_{2i} = \omega_i + \varepsilon_{2i}$.

4 These selection effects are precisely "omitted-variable bias."

Although equations (4) and (5) may be fitted by ordinary least squares (OLS), generalized least squares (GLS) is the optimal estimator for these equations because of the cross-equation restrictions on the coefficients. (Generalized least squares also provides the appropriate estimates of standard errors for the estimated coefficients.) In this framework $Z_{2i}$ may influence $y_{1i}$ and $Z_{1i}$ may influence $y_{2i}$ in the reduced form. That is, both siblings' education levels (or any other variable that varies across twins) may enter into both siblings' wage equations because of the correlation between the family effect and schooling levels. These correlations are entirely a result of selection effects.
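The mechanics of equations (1) through (5) can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the authors' estimation procedure: it uses a single variable $Z$ (schooling), omits $X$, fits equation (4) alone by OLS rather than stacking (4) and (5) and applying GLS, and every numerical value (the coefficients, variances, and sample size) is hypothetical. It shows that the reduced-form coefficient on own schooling approximates $\beta + \gamma$, that the coefficient on the sibling's schooling approximates $\gamma$, and that the within-pair difference regression approximates $\beta$.

```python
import random


def simulate_pairs(n_pairs, beta, gamma, seed=0):
    """Draw twin pairs from equations (1)-(3) with one Z (schooling) and no X.

    All distributional choices here are hypothetical illustration values.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        shared = rng.gauss(0.0, 1.0)                    # makes twins' schooling correlated
        z1 = 12.0 + 2.0 * shared + rng.gauss(0.0, 1.0)
        z2 = 12.0 + 2.0 * shared + rng.gauss(0.0, 1.0)
        mu = gamma * z1 + gamma * z2 + rng.gauss(0.0, 0.3)  # equation (3)
        y1 = beta * z1 + mu + rng.gauss(0.0, 0.3)           # equation (1)
        y2 = beta * z2 + mu + rng.gauss(0.0, 0.3)           # equation (2)
        pairs.append((y1, y2, z1, z2))
    return pairs


def _cov(a, b):
    """Sample covariance (divides by n; the scaling cancels in the ratios below)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def reduced_form(pairs):
    """OLS of y1 on own and sibling's schooling, as in equation (4).

    Returns (coefficient on own schooling, coefficient on sibling's schooling).
    """
    y1 = [p[0] for p in pairs]
    z1 = [p[2] for p in pairs]
    z2 = [p[3] for p in pairs]
    s11, s22, s12 = _cov(z1, z1), _cov(z2, z2), _cov(z1, z2)
    s1y, s2y = _cov(z1, y1), _cov(z2, y1)
    det = s11 * s22 - s12 ** 2
    own = (s22 * s1y - s12 * s2y) / det
    sibling = (s11 * s2y - s12 * s1y) / det
    return own, sibling


def first_difference(pairs):
    """Slope from regressing the within-pair wage gap on the within-pair schooling gap."""
    dy = [p[0] - p[1] for p in pairs]
    dz = [p[2] - p[3] for p in pairs]
    return _cov(dz, dy) / _cov(dz, dz)


if __name__ == "__main__":
    data = simulate_pairs(20000, beta=0.09, gamma=-0.01, seed=1)
    own, sibling = reduced_form(data)
    print(f"own schooling (beta + gamma): {own:.3f}")
    print(f"sibling's schooling (gamma):  {sibling:.3f}")
    print(f"within-pair estimate (beta):  {first_difference(data):.3f}")
```

Subtracting the sibling coefficient from the own coefficient recovers the structural effect, which is the same quantity the within-pair regression estimates directly.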
If, for example, families that would otherwise have high wage rates are more likely to educate their children, then the component of $\gamma$ for the schooling variable should be positive.

Finally, it is clear that the coefficients $\beta$ of the variables that differ across twins are identified. They may be estimated because the selection effects $\gamma$ may be estimated. On the other hand, the coefficients $\alpha$ of the variables that vary only across families are not identified.

The difference between (1) and (2) [or (4) and (5)] is

(6)  $y_{1i} - y_{2i} = \beta (Z_{1i} - Z_{2i}) + \varepsilon_{1i} - \varepsilon_{2i}$.

In (6) the individual effect $\mu_i$ has been removed. The least-squares estimator for this equation is called the "fixed-effects" estimator. In equations (4) and (5) the selection effect is estimated explicitly and then subtracted to obtain the structural estimate of the return to schooling. In (6) the selection effect is eliminated by differencing. We report estimates of all these equations below in order to provide direct evidence on the size of the selection effect.

B. The Effect of Measurement Error

Classical measurement error in schooling will lead to bias in the estimators of the effect of schooling on wage rates. In a bivariate regression, the least-squares regression coefficient in the presence of measurement error in schooling is attenuated by an amount equal to the reliability ratio; that is,

$\operatorname{plim} \hat{\beta}_{OLS} = \beta_{OLS} \left(1 - \mathrm{Var}(v) / [\mathrm{Var}(v) + \mathrm{Var}(S)]\right)$

where $\beta_{OLS}$ is the population regression coefficient if schooling were perfectly measured, $\mathrm{Var}(S)$ is the variance in true schooling levels, and $\mathrm{Var}(v_n^1) = \mathrm{Var}(v_n^2) = \mathrm{Var}(v)$ is the assumed common variance of measurement error.
Our estimates of the reliability ratio in the level of schooling are about 0.90, indicating that the ordinary least-squares regression estimator would be biased downward by about 10 percent relative to its value in the absence of measurement error.

In the presence of selection effects, however, the ordinary least-squares estimator will be biased even in the absence of measurement error (because of the omitted sibling's schooling variable). The fixed-effects estimator eliminates this selection (or "omitted variable") bias, but it does so at the expense of introducing far greater measurement-error bias. In the presence of classical measurement error (see Zvi Griliches, 1979), the probability limit of the fixed-effects estimator, $\hat{\beta}_{FE}$, is

$\operatorname{plim} \hat{\beta}_{FE} = \beta_{FE} \left(1 - \dfrac{\mathrm{Var}(v)}{[\mathrm{Var}(v) + \mathrm{Var}(S)](1 - \rho_S)}\right)$

where $\rho_S$ is the correlation between the measured schooling levels of the twins and $\beta_{FE}$ is the population fixed-effects estimator that would be obtained in the absence of measurement error. For the fixed-effects estimator, the attenuation caused by measurement error is increased because of the correlation between the schooling level of the twins. For example, with a reliability ratio of 0.9 and a correlation between the twins' self-reported schooling of 0.66, the fixed-effects estimator would be biased downward by 0.1/(1 - 0.66) = 0.294, or about 30 percent relative to its value in the absence of measurement error.

One simple procedure for reducing the effect of measurement error on either estimator is to average the multiple reports on schooling and to use this average as the independent variable in equation (6).
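The attenuation arithmetic in the preceding example can be reproduced directly. The following is a minimal sketch that simply evaluates the two bias terms stated in the text, written in terms of the reliability ratio (so that $\mathrm{Var}(v)/[\mathrm{Var}(v) + \mathrm{Var}(S)]$ equals one minus the reliability ratio); the function names are hypothetical and chosen only for this illustration.

```python
def ols_attenuation(reliability):
    """Proportional downward bias of OLS under classical measurement error.

    Equals Var(v) / [Var(v) + Var(S)], i.e. one minus the reliability ratio.
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must lie in (0, 1]")
    return 1.0 - reliability


def fixed_effects_attenuation(reliability, rho_s):
    """Proportional downward bias of the within-pair (fixed-effects) estimator.

    rho_s is the correlation between the twins' measured schooling levels.
    """
    if not 0.0 < reliability <= 1.0:
        raise ValueError("reliability must lie in (0, 1]")
    if not -1.0 < rho_s < 1.0:
        raise ValueError("rho_s must lie strictly between -1 and 1")
    return (1.0 - reliability) / (1.0 - rho_s)


if __name__ == "__main__":
    reliability = 0.90   # approximate reliability ratio quoted in the text
    rho_s = 0.66         # intrapair correlation of self-reported schooling
    print(f"OLS bias:           {ols_attenuation(reliability):.3f}")                   # 0.100
    print(f"Fixed-effects bias: {fixed_effects_attenuation(reliability, rho_s):.3f}")  # 0.294
```

The same reliability ratio therefore implies a bias roughly three times as large once the regression is run in within-pair differences, because differencing removes much of the true variation in schooling but none of the reporting error.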
Assuming classical measurement error and using $(S_1^1 - S_2^2)/2 + (S_1^2 - S_2^1)/2$ as the independent variable in equation (6) leads to a modified fixed-effects estimator with the following property:

$\operatorname{plim} \hat{\beta} = \beta_{FE} \left(1 - \dfrac{\mathrm{Var}(v)}{[\mathrm{Var}(S) + \mathrm{Var}(v)](1 - \rho_S) + \tfrac{1}{2} \mathrm{Var}(S_1 - S_2)}\right)$

Measurement error causes a smaller asymptotic bias here than in the standard fixed-effects estimator because the averaging decreases the measurement error as a fraction of the total variance in the independent variable. We report the results of estimates based on averages of the schooling data below to appraise further the importance of measurement error in estimation of the returns to schooling.

A straightforward consistent estimator for equation (4), (5), or (6), assuming classical measurement error, may be obtained by the method of instrumental variables using the independent measures of the schooling variables as instruments. For example, we may fit

(7)  $y_{1i} - y_{2i} = \beta (S_1^1 - S_2^2) + \varepsilon_{1i} - \varepsilon_{2i} = \beta \Delta S' + \Delta\varepsilon$

using $\Delta S'' = (S_1^2 - S_2^1)$ as an instrument for $\Delta S'$. We also report these estimates below.

Finally, since we have multiple measures of schooling for each twin it is possible to relax the classical assumption that the measurement errors $v_1^1$ and $v_2^1$ (or $v_1^2$ and $v_2^2$) are uncorrelated. For example, if a twin who reports an upward-biased measure of her own schooling is more likely to report an upward-biased measure of her sibling's schooling, then the correlation, $\rho_v$, between the measurement errors $v_1^1$ and $v_2^1$ (and $v_1^2$ and $v_2^2$) will be positive.
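The ranking of the three estimators under classical measurement error (standard fixed effects, fixed effects on averaged reports, and instrumental variables as in equation (7)) can be checked with a simulation. The sketch below is an illustration under stated assumptions rather than a reproduction of the paper's estimates: the true return, the variances, the sample size, and all function names are hypothetical, and the four reporting errors are drawn independently, which is exactly the classical case.

```python
import math
import random


def simulate_reports(n_pairs, beta, error_sd, seed=0):
    """Simulate within-pair wage gaps and two noisy measures of the schooling gap.

    Returns tuples (dy, ds_self, ds_sib), where ds_self = S_1^1 - S_2^2 and
    ds_sib = S_1^2 - S_2^1. All parameter values are hypothetical.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_pairs):
        family = rng.gauss(0.0, 1.0)
        s1 = 12.0 + math.sqrt(3.0) * family + rng.gauss(0.0, 1.0)  # true schooling, twin 1
        s2 = 12.0 + math.sqrt(3.0) * family + rng.gauss(0.0, 1.0)  # true schooling, twin 2
        dy = beta * (s1 - s2) + rng.gauss(0.0, 0.5)  # equation (6) with Z = schooling
        s11 = s1 + rng.gauss(0.0, error_sd)  # twin 1 reporting on self
        s12 = s1 + rng.gauss(0.0, error_sd)  # twin 2 reporting on twin 1
        s22 = s2 + rng.gauss(0.0, error_sd)  # twin 2 reporting on self
        s21 = s2 + rng.gauss(0.0, error_sd)  # twin 1 reporting on twin 2
        rows.append((dy, s11 - s22, s12 - s21))
    return rows


def _cov(a, b):
    """Sample covariance (divides by n; the scaling cancels in the ratios below)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


def fixed_effects(rows):
    """Least squares of the wage gap on the self-reported schooling gap."""
    dy = [r[0] for r in rows]
    ds = [r[1] for r in rows]
    return _cov(ds, dy) / _cov(ds, ds)


def averaged_fixed_effects(rows):
    """Least squares of the wage gap on the average of the two schooling gaps."""
    dy = [r[0] for r in rows]
    ds = [(r[1] + r[2]) / 2.0 for r in rows]
    return _cov(ds, dy) / _cov(ds, ds)


def iv_first_difference(rows):
    """IV estimate of equation (7): the sibling-reported gap instruments the self-reported gap."""
    dy = [r[0] for r in rows]
    ds_self = [r[1] for r in rows]
    ds_sib = [r[2] for r in rows]
    return _cov(ds_sib, dy) / _cov(ds_sib, ds_self)


if __name__ == "__main__":
    rows = simulate_reports(20000, beta=0.12, error_sd=0.66, seed=2)
    print(f"fixed effects:          {fixed_effects(rows):.3f}")
    print(f"averaged fixed effects: {averaged_fixed_effects(rows):.3f}")
    print(f"instrumental variables: {iv_first_difference(rows):.3f}")
```

With these hypothetical settings the true schooling variance is 4 and the error variance is about 0.44, giving a reliability ratio near 0.90. The attenuation formulas then imply a fixed-effects estimate near 70 percent of the true coefficient and an averaged estimate near 82 percent, while the instrumental-variables estimate should be close to the true value.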
A positive correlation in the measurement error in each sibling's report will lead to a higher correlation between $S_1^1$ and $S_2^1$ than between $S_1^1$ and $S_2^2$ (and a higher correlation between $S_2^2$ and $S_1^2$ than between $S_2^1$ and $S_1^2$), because the own-reports contain a common measurement-error component that the cross-sibling reports do not contain. In contrast, in the presence of classical measurement error these correlations would be identical. In fact, the correlations in Table 2 are consistent with the hypothesis of positively correlated measurement error in the siblings' reports.

In the presence of correlated measurement errors the instrumental-variables estimators of equation (4), (5), or (6) will be inconsistent. For example, instrumental variables used to obtain the fixed-effects estimator in (6) leads to

$\operatorname{plim} \hat{\beta}_{FE,IV} = \beta / \{1 - 2\rho_v [\mathrm{Var}(v) / \mathrm{Var}(\Delta S)]\}$.

A straightforward consistent estimator of equation (6) may be obtained by instrumental-variables estimation of

(8)  $y_{1i} - y_{2i} = \beta (S_1^1 - S_2^1) + \varepsilon_{1i} - \varepsilon_{2i} = \beta \Delta S^{*} + \Delta\varepsilon$

in which $\Delta S^{**} = S_1^2 - S_2^2$ is used as an instrument for $\Delta S^{*}$, and we report this estimate below.5

5 Note that the estimates using averages of the schooling differences will be inconsistent in the presence of correlated measurement errors, but as in the classical case, the inconsistency will be reduced by averaging.

C. The Basic Empirical Results

Table 3 contains simple estimates of the effect of schooling on earnings that control only for demographic variables (that may be considered strictly exogenous). In columns (i) and (ii) we report the results of stacking equations (1) and (2) and fitting them by least squares and generalized least squares (the seemingly-unrelated-regression method due to Arnold Zellner [1962]). The results in columns (i) and (ii) are comparable to most of the estimates that have appeared in the literature which ignore the potential correlation between schooling level and family background. For example, a regression fitted to data from the 1990 CPS with an identical specification as that in column (i) of Table 3 gives an estimate of the effect of schooling on the wage of 8.3 percent per year completed (compared to 8.7 percent in the data for twins). Estimates of the effect of age and gender on wage rates are also similar in the CPS, but estimates of the effect of race on wage rates are very different (9 percent vs. -40 percent).

Table 3. Ordinary Least-Squares (OLS), Generalized Least-Squares (GLS), Instrumental-Variables (IV), and Fixed-Effects Estimates of Log Wage Equations for Identical Twins^a

Variable               OLS (i)          GLS (ii)         GLS (iii)        IV (iv)          First difference (v)   First difference by IV (vi)
Own education          0.084 (0.014)    0.087 (0.015)    0.088 (0.015)    0.116 (0.030)    0.092 (0.024)          0.167 (0.043)
Sibling's education    -                -                -0.007 (0.015)   -0.037 (0.029)   -                      -
Age                    0.088 (0.019)    0.090 (0.023)    0.090 (0.023)    0.088 (0.019)    -                      -
Age squared (÷ 100)    -0.087 (0.023)   -0.089 (0.028)   -0.090 (0.029)   -0.087 (0.024)   -                      -
Male                   0.204 (0.063)    0.204 (0.077)    0.206 (0.077)    0.206 (0.064)    -                      -
White                  -0.410 (0.127)   -0.417 (0.143)   -0.424 (0.144)   -0.428 (0.128)   -                      -
Sample size            298              298              298              298              149                    149
R2                     0.260            0.219            0.219            -                0.092                  -

Notes: Each equation also includes an intercept term. Numbers in parentheses are estimated standard errors.
a Own education and sibling's education are instrumented for using each sibling's report of the other sibling's education as instruments.

The results in column (iii) of Table 3 correspond to stacking equations (4) and (5) and fitting them by generalized least squares.
These are the results that include the sibling's education level in each twin's wage equation. The coefficient of this variable is a measure of the selection effect, $\gamma$, in equation (3). As the table indicates, this effect is small and negative, indicating that the selection effect in these data is negative. In this sample the better-educated families are not those who would otherwise be the most highly compensated in the labor market. This result also implies that a regression estimator of the returns to schooling that does not adjust for the selection effect will be downward-biased.

A regression of the intrapair difference in wage rates on the intrapair difference in schooling levels (which is the fixed-effects estimate) is reported in column (v) of Table 3. This result confirms that the OLS regression result is smaller, not larger, than the intrapair regression estimate. This result is dramatically different from the result reported by Behrman et al. (1980). Behrman et al. report a simple regression estimate of the return to schooling similar to what we report in column (i), but their intrapair regressions [comparable to those in our column (v)] indicate schooling returns that are only around 40 percent as large.6

6 We are comparing the regression coefficient in line Y-1 in Behrman et al.'s (1980) table 6.1, which is for identical and fraternal twins, with the regression coefficient in line Y-4 in their table 6.2, which is for identical twins only. The result in line Y-4 in table 6.2 of Behrman et al. is a typographical error and should read 0.03, not 0.003.

[Figure 1. Intrapair Returns to Schooling, Identical Twins. Horizontal axis: Difference in Years of Schooling.]

Figure 1 contains the scatter diagram of the intrapair (logarithmic) wage difference against the intrapair schooling difference. This diagram displays much of what the basic data contain.
First, it is clear that many twins report identical education levels, so that many intrapair education differences are zero. Second, there is still a large amount of variability in the reported wage differences of identical twins with the same education levels. The standard deviation of the difference in the log wages is 0.56 for identical twins with identically reported education levels. This may be compared with a standard deviation in the difference in log wages in the overall sample of 0.58. Finally, and despite the variability in wage rates, there is a clear tendency for better-educated twins to report higher wage rates. Columns (iv) and (vi) in Table 3 report the instrumental-variables estimates which are intended to correct for measurement error in the education data. Here we use each sibling's report of his (or her) sibling's education level as an instrumental variable for his (or her) sibling's education level. These instrumental-variables estimates are much larger than the least-squares estimates, and they are consistent with our finding above that a considerable fraction of the variability in reported differences in twins' education levels is due to measurement error. 
If we accept the sibling reports as valid instruments, it seems likely that conventional methods are producing serious underestimates of the economic returns to schooling. A conventional test of the difference between the least-squares estimate (0.09) and the instrumental-variables estimate (0.17) rejects the hypothesis that these are equal with a t ratio of 1.97 (see Jerry Hausman, 1978). A table containing estimates similar to those in Table 3 for the pooled sample of fraternal and identical twins is available from the authors upon request.

Table 4—Estimates Using Average of Schooling Reports, Log Wage Equations for Identical Twins

Variable                        OLS (i)          GLS (ii)         GLS (iii)        First difference (iv)
Average own education^a         0.087 (0.015)    0.094 (0.016)    0.098 (0.016)    0.117 (0.026)
Average sibling's education^b   —                —                -0.017 (0.016)   —
Age                             0.089 (0.019)    0.091 (0.023)    0.091 (0.023)    —
Age squared (÷ 100)             -0.088 (0.023)   -0.091 (0.029)   -0.091 (0.029)   —
Male                            0.203 (0.063)    0.202 (0.077)    0.208 (0.077)    —
White                           -0.406 (0.127)   -0.382 (0.144)   -0.385 (0.144)   —
Sample size:                    298              298              298              149
R2:                             0.272            0.223            0.225            0.122

Notes: Each equation also includes an intercept term. Numbers in parentheses are estimated standard errors.
^a Average own education is equal to $(S_1^1 + S_1^2)/2$.
^b Average sibling's education is equal to $(S_2^2 + S_2^1)/2$.

Table 4 contains some further tests of the effect of measurement error on estimates of the returns to schooling. In this table we report the results of reestimating the least-squares and generalized least-squares results of Table 3 using simple averages of the multiple indicators of education levels as independent variables. As expected, all of the estimates in Table 4 are larger than the corresponding estimates in Table 3.
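A short worked equation may help show why averaging the two reports raises the estimates. It is a sketch under simplifying assumptions that are ours rather than the paper's: a bivariate regression with no other covariates, and two reports on the same schooling level $S$ whose errors are classical, share a common variance $\sigma_v^2$, and are uncorrelated with each other. The symbols $\sigma_S^2$ and $\sigma_v^2$ are introduced here only for illustration.

```latex
% Probability limits of the least-squares slope under the stated assumptions.
% The average of two reports has error variance sigma_v^2 / 2.
\[
\operatorname{plim}\hat\beta_{\text{single report}}
  = \beta\,\frac{\sigma_S^2}{\sigma_S^2+\sigma_v^2},
\qquad
\operatorname{plim}\hat\beta_{\text{average of two reports}}
  = \beta\,\frac{\sigma_S^2}{\sigma_S^2+\sigma_v^2/2}.
\]
```

Because averaging halves the error variance, the second ratio is closer to one, so the slope on the averaged measure is less attenuated, though still not fully corrected.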
These results provide further evidence that measurement error is producing a downward bias in conventional estimates of the returns to schooling.

Table 5 contains an analysis that parallels the analysis in Table 3 except that variables measuring union status, marital status, years of tenure on the current job, and the education of the worker's parents have been added to the regressions. The estimated returns to schooling here are even larger than in Table 4. In addition, worker job tenure has a strong positive and precisely determined effect on wage rates. Marital status and union status have positive effects on wages, but neither effect is measured precisely. It is also worth noting that when we control for a standard list of variables, as we do in Table 5, the fixed-effect estimate of the return to schooling is attenuated compared to the GLS estimate.

Table 5—GLS, IV, and Fixed-Effects Estimates of Augmented Log-Wage Equations for Identical Twins

Variable               GLS (i)          GLS (ii)         IV^a (iii)       First difference (iv)   First difference by IV (v)
Own education          0.105 (0.016)    0.105 (0.016)    0.147 (0.034)    0.091 (0.022)           0.179 (0.041)
Sibling's education    —                -0.008 (0.016)   -0.062 (0.035)   —                       —
Age                    0.082 (0.023)    0.082 (0.023)    0.082 (0.019)    —                       —
Age squared (÷ 100)    -0.094 (0.029)   -0.094 (0.029)   -0.092 (0.024)   —                       —
Male                   0.147 (0.080)    0.149 (0.081)    0.139 (0.066)    —                       —
White                  -0.472 (0.143)   -0.482 (0.144)   -0.506 (0.130)   —                       —
Covered by union       0.115 (0.072)    0.118 (0.072)    0.153 (0.081)    0.063 (0.090)           0.095 (0.095)
Married                0.089 (0.065)    0.086 (0.065)    0.051 (0.073)    0.142 (0.081)           0.140 (0.086)
Years of tenure        0.025 (0.005)    0.024 (0.005)    0.020 (0.005)    0.028 (0.006)           0.028 (0.006)
Father's education     0.001 (0.014)    0.001 (0.014)    0.006 (0.013)    —                       —
Mother's education     0.013 (0.017)    0.015 (0.018)    0.019 (0.017)    —                       —
Sample size:           284              284              284              147                     147
R2:                    0.320            0.320            —                0.257                   —

Notes: Each equation also includes an intercept term. Numbers in parentheses are estimated standard errors.
^a Own education and sibling's education are instrumented using sibling's report of the other sibling's education as instruments.

Many of the results in Tables 3, 4, and 5 are similar to those that have been reported elsewhere in the study of the determination of wage rates. Wage rates are concave in age, males earn more than females, and parental education seems to have very little independent effect on wage rates. One anomaly in Tables 3, 4, and 5 is the estimated effect of race on wage rates, which indicates that white workers earn less than nonwhite workers. It seems possible that this result is due to selection in the relatively small sample of nonwhites who attended the twins festival and turned up in our sample. We have, therefore, computed the results in Tables 4 and 5 deleting the sample of nonwhite workers. The results of these regressions for white workers do not differ in any material way from those already reported. (The effect of schooling on wage rates is slightly higher for white twin pairs than for the group as a whole, but this difference is not statistically significant.)
Table 6—OLS and IV First-Difference Estimates of Log-Wage Equations for Identical Twins, Assuming Correlated Measurement Errors

Variable              OLS (i)         IV (ii)         OLS (iii)       IV (iv)
$\Delta S^*$          0.107 (0.025)   0.129 (0.030)   0.112 (0.023)   0.132 (0.028)
Δ Covered by union    —               —               0.089 (0.088)   0.099 (0.089)
Δ Married             —               —               0.157 (0.080)   0.160 (0.080)
Δ Years of tenure     —               —               0.028 (0.006)   0.028 (0.006)
Sample size:          149             149             147             147
R2:                   0.105           —               0.286           —

Notes: $\Delta S^*$ is the difference between sibling 1's report of her (his) own education and her (his) report of sibling 2's education. The instrument used for $\Delta S^*$ is $\Delta S^{**}$, the difference between sibling 2's report of sibling 1's education and sibling 2's report of sibling 2's own education. Numbers in parentheses are estimated standard errors.

Finally, we implement an instrumental-variables approach that is consistent in the presence of measurement errors that are correlated between the twins' reports of their own schooling and of their siblings' schooling. Specifically, we include $\Delta S^* = S_1^1 - S_2^1$ in the first-differenced wage equations, and use $\Delta S^{**} = S_1^2 - S_2^2$ as an instrument for $\Delta S^*$. These instrumental-variables first-difference estimates, along with least-squares first-difference estimates, are reported in Table 6. When no other covariates are included, the instrumental-variable estimate that is robust to correlated measurement errors is 0.129, which is 20 percent greater than the OLS estimate of 0.107. Similar results hold when other variables are added to the regression [see columns (iii) and (iv)]. In each case, however, the new instrumental-variables estimates yield returns to education that are 3 percentage points smaller than specifications that use differences in sibling reports of education as the instrument for differences in own-reported education.
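A small simulation can illustrate why the choice of instrument matters when a reporter's errors are correlated. This is a sketch under assumptions of our own, not the authors' procedure: each twin's two reports (on own and on the sibling's schooling) share a reporter-specific error component, errors are independent across reporters, and every numeric value (sample size, return of 0.10, variances, within-reporter correlation of 0.5) is hypothetical. Variable names such as `s1_by1` (twin 1's schooling as reported by twin 1) are ours.

```python
import numpy as np


def cov(a, b):
    """Sample covariance of two 1-D arrays."""
    return np.cov(a, b)[0, 1]


def simulate_correlated_errors(n_pairs=200_000, beta=0.10, rho=0.5, seed=0):
    """Compare two IV estimators when each reporter's errors are correlated.

    rho is the assumed correlation between the two errors made by one reporter.
    """
    rng = np.random.default_rng(seed)

    # True schooling: a shared family component plus an individual component.
    family = rng.normal(14.0, 2.0, n_pairs)
    s1 = family + rng.normal(0.0, 1.0, n_pairs)
    s2 = family + rng.normal(0.0, 1.0, n_pairs)

    # Reporter-specific error shared by both of that reporter's answers.
    shared1 = rng.normal(0.0, np.sqrt(rho), n_pairs)
    shared2 = rng.normal(0.0, np.sqrt(rho), n_pairs)

    def idio():
        # Idiosyncratic error, drawn fresh for every individual report.
        return rng.normal(0.0, np.sqrt(1.0 - rho), n_pairs)

    s1_by1 = s1 + shared1 + idio()  # twin 1 on own schooling
    s2_by1 = s2 + shared1 + idio()  # twin 1 on twin 2's schooling
    s1_by2 = s1 + shared2 + idio()  # twin 2 on twin 1's schooling
    s2_by2 = s2 + shared2 + idio()  # twin 2 on own schooling

    dy = beta * (s1 - s2) + rng.normal(0.0, 0.5, n_pairs)

    d_self = s1_by1 - s2_by2   # difference in self-reports
    d_sib = s1_by2 - s2_by1    # difference in sibling reports
    d_star = s1_by1 - s2_by1   # both reports from twin 1
    d_2star = s1_by2 - s2_by2  # both reports from twin 2

    iv_classical = cov(dy, d_sib) / cov(d_self, d_sib)
    iv_robust = cov(dy, d_2star) / cov(d_star, d_2star)
    return iv_classical, iv_robust


iv_classical, iv_robust = simulate_correlated_errors()
print(f"IV, sibling-report difference as instrument: {iv_classical:.3f}")
print(f"IV, same-reporter differences:               {iv_robust:.3f}")
```

With a positive within-reporter correlation, as assumed here, the estimate that pairs self-reported with sibling-reported differences overshoots the assumed return, while the estimate built from same-reporter differences stays near it, because the shared component cancels inside each reporter's difference.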
Apparently, the classical model of measurement error is too restrictive for these data.

III. A Simple Model of Wage Rates, Schooling, and Measurement Error

A. Classical Measurement Errors

A simplified version of equation (6), which represents the intrapair difference in wage rates, is

(9) $\Delta y_i = \beta \Delta s_i + \Delta\varepsilon_i$

where $\beta$ represents the return to schooling, $\Delta y_i$ represents the intrapair difference in log wages, $\Delta s_i$ represents the true intrapair difference in schooling, and $\Delta\varepsilon_i$ is an error that is independent of schooling levels. Letting $\Delta s'_i$ and $\Delta s''_i$ represent the self-reported schooling difference ($S_1^1 - S_2^2$) and the sibling-reported schooling difference ($S_1^2 - S_2^1$), we may also write

(10) $\Delta s'_i = \Delta s_i + \Delta\nu'_i$

(11) $\Delta s''_i = \Delta s_i + \Delta\nu''_i$

where we assume that $\Delta\nu'_i$ and $\Delta\nu''_i$ are classical measurement errors in schooling that are uncorrelated with the true schooling levels, with each other, and with $\Delta\varepsilon_i$. Notice that any fixed tendency for some families to misreport their schooling levels has been eliminated by differencing.

This setup leads to a very simple method-of-moments estimation scheme. The theoretical covariance matrix of the three variables $\Delta y$, $\Delta S'$, and $\Delta S''$ is contained in Table 7, where [...]

Table 8—Empirical Covariance Matrix

Variable       $\Delta y$   $\Delta s'$   $\Delta s''$
$\Delta y$     0.336        0.338         0.360
$\Delta s'$                 3.691         2.158
$\Delta s''$                              3.902

[Table fragment; row and column labels not recoverable: 1.744 (0.327) 1.744 (0.327) 1.721 (0.313) — — — — 0.500 (0.129) — — — 0.583 (0.132) Pi — — — 0.515 (0.136) Note: Estimated asymptotic standard errors are in parentheses.]

[...] and most important, there is the restriction that the covariance between the wage difference and the education difference should be the same for each measure of the education difference. Remarkably, Table 8 indicates that this equality holds almost precisely in the data. Second, if self-reported measures of education are more accurate than sibling-reported measures of education, then the variance of self-reported education differences (3.69) should be less than the variance of sibling-reported education differences (3.90).
The empirical covariance matrix is also consistent with this hypothesis.

Table 9 contains the maximum-likelihood estimates of the basic parameters set out in Table 7. Since equations (7)-(9) are over-identified, there are two estimates of the rate of return to schooling in the unrestricted model. This implies that there are also two estimates of the variance in the difference in wage rates that is explained by schooling differences. The first estimate of the return to schooling is simply the ordinary instrumental-variables estimate (reported earlier in Table 3) of $\operatorname{Cov}(\Delta y, \Delta S'')/\operatorname{Cov}(\Delta S', \Delta S'') = 0.167$. The second estimate, which corresponds to the instrumental-variables estimate we would Table 10—Theoretical Moment Matrix Assuming Correlated Measurement Errors Parameter Ay AS" AS" AS* AS** Ay PWs + vl P°ls P<*Is P°Zs P°Zs AS' als+2a} tTls-2pv