Econometrics - Lecture 5
Endogeneity, Instrumental Variables, IV Estimator
Hackl, Econometrics, Lecture 5, Dec 10, 2010

Contents
- The OLS Estimator: With Error Correlated Regressors
- Regressors Correlated with Error Terms: Some Cases
- Instrumental Variables (IV) Estimator: The Concept
- IV Estimator: The Method
- Calculation of the IV Estimator
- An Example
- The GIV Estimator
- Some Tests

OLS Estimator
Linear model for $y_t$:
    $y_t = x_t'\beta + \varepsilon_t$, $t = 1, \dots, T$ (or $y = X\beta + \varepsilon$)
given observations $x_{tk}$, $k = 1, \dots, K$, of the regressor variables and the error term $\varepsilon_t$.
Properties of the OLS estimator
    $b = (\sum_t x_t x_t')^{-1} \sum_t x_t y_t = (X'X)^{-1}X'y$
1. The OLS estimator $b$ is unbiased if
- (A1) $E\{\varepsilon\} = 0$
- (A10) $E\{\varepsilon|X\} = 0$, i.e., $X$ is uninformative about $E\{\varepsilon_t\}$ for all $t$ ($\varepsilon$ is conditional mean independent of $X$)
  - (A2) [$\{x_t, t = 1, \dots, T\}$ and $\{\varepsilon_t, t = 1, \dots, T\}$ are independent] is stronger
  - (A8) [$x_t$ and $\varepsilon_t$ are independent for all $t$] is less strong
  - (A7) [$E\{x_t \varepsilon_t\} = 0$ for all $t$, no contemporaneous correlation] is even less strong than (A8)

OLS Estimator, cont'd
2. The OLS estimator $b$ is consistent for $\beta$ if
- (A8) $x_t$ and $\varepsilon_t$ are independent for all $t$
- (A6) $(1/T)\sum_t x_t x_t'$ has as limit ($T \to \infty$) a nonsingular matrix $\Sigma_{xx}$
(A8) can be substituted by (A7) [$E\{x_t \varepsilon_t\} = 0$ for all $t$, no contemporaneous correlation]
3. The OLS estimator $b$ is asymptotically normally distributed if (A6), (A8) and
- (A11) $\varepsilon_t \sim IID(0, \sigma^2)$
are true;
- for large $T$, $b$ follows approximately the normal distribution
    $b \overset{a}{\sim} N\{\beta, \sigma^2 (\sum_t x_t x_t')^{-1}\}$
- Use the White and Newey-West estimators for $V\{b\}$ in case of heteroskedasticity and autocorrelation of the error terms, respectively

Assumption (A7): $E\{x_t \varepsilon_t\} = 0$ for all $t$
- Implication of (A7): for all $t$, each of the regressors is uncorrelated with the current error term, i.e., no contemporaneous correlation
- The stronger assumptions (A2), (A8), (A10) have the same consequences
- (A7) guarantees consistency (though not, by itself, unbiasedness) of the OLS
estimator
- In reality, (A7) is not always true: alternative estimation procedures are required
- Examples of situations with $E\{x_t \varepsilon_t\} \neq 0$:
  - Regressors with measurement errors
  - Regression on the lagged dependent variable with autocorrelated error terms
  - Endogeneity of regressors
  - Simultaneity

Regressor with Measurement Error
    $y_t = \beta_1 + \beta_2 w_t + v_t$
- with white noise $v_t$, $V\{v_t\} = \sigma_v^2$, and $E\{v_t|w_t\} = 0$; conditional expectation of $y_t$ given $w_t$: $E\{y_t|w_t\} = \beta_1 + \beta_2 w_t$
- E.g., $w_t$: household income, $y_t$: household savings
- Measurement process: reported household income $x_t$ deviates from household income $w_t$
    $x_t = w_t + u_t$
  where $u_t$ is (i) white noise with $V\{u_t\} = \sigma_u^2$, (ii) independent of $v_t$, and (iii) independent of $w_t$
- The model to be analyzed is
    $y_t = \beta_1 + \beta_2 x_t + \varepsilon_t$ with $\varepsilon_t = v_t - \beta_2 u_t$
- $E\{x_t \varepsilon_t\} = -\beta_2 \sigma_u^2 \neq 0$: the requirement for consistency is violated
- $x_t$ and $\varepsilon_t$ are negatively correlated if $\beta_2 > 0$ (positively correlated if $\beta_2 < 0$)

Measurement Error, cont'd
- Inconsistency of $b_2$
    $\operatorname{plim} b_2 = \beta_2 + E\{x_t \varepsilon_t\} / V\{x_t\} = \beta_2 \left(1 - \frac{\sigma_u^2}{V\{w_t\} + \sigma_u^2}\right)$
  since $V\{x_t\} = V\{w_t\} + \sigma_u^2$; for $\beta_2 > 0$, $\beta_2$ is underestimated (the estimate is pulled toward zero)
- Inconsistency of $b_1$
    $\operatorname{plim} (b_1 - \beta_1) = -\operatorname{plim} (b_2 - \beta_2)\, E\{x_t\}$
  given $E\{x_t\} > 0$ for the reported income: $\beta_1$ is overestimated; the inconsistency carries over
- The model does not correspond to the conditional expectation of $y_t$ given $x_t$:
    $E\{y_t|x_t\} = \beta_1 + \beta_2 x_t - \beta_2 E\{u_t|x_t\} \neq \beta_1 + \beta_2 x_t$
  as $E\{u_t|x_t\} \neq 0$

Dynamic Regression
- Allows modelling of dynamic effects of changes of $x$ on $y$:
    $y_t = \beta_1 + \beta_2 x_t + \beta_3 y_{t-1} + \varepsilon_t$
- OLS estimators are consistent if $E\{x_t \varepsilon_t\} = 0$ and $E\{y_{t-1} \varepsilon_t\} = 0$
- AR(1) model for
$\varepsilon_t$:
    $\varepsilon_t = \rho \varepsilon_{t-1} + v_t$
  with $v_t$ white noise with variance $\sigma_v^2$
- From $y_t = \beta_1 + \beta_2 x_t + \beta_3 y_{t-1} + \rho \varepsilon_{t-1} + v_t$ follows, using $E\{\varepsilon_{t-1} \varepsilon_t\} = \rho \sigma_v^2 (1 - \rho^2)^{-1}$,
    $E\{y_{t-1} \varepsilon_t\} = \beta_3 E\{y_{t-2} \varepsilon_t\} + \rho \sigma_v^2 (1 - \rho^2)^{-1}$
  i.e., $y_{t-1}$ is correlated with $\varepsilon_t$
- OLS estimators are not consistent
- The model does not correspond to the conditional expectation of $y_t$ given the regressors $x_t$ and $y_{t-1}$:
    $E\{y_t|x_t, y_{t-1}\} = \beta_1 + \beta_2 x_t + \beta_3 y_{t-1} + E\{\varepsilon_t|x_t, y_{t-1}\}$

Omission of Relevant Regressors
- Two models:
    $y_i = x_i'\beta + z_i'\gamma + \varepsilon_i$   (A)
    $y_i = x_i'\beta + v_i$   (B)
- True model (A), fitted model (B)
- OLS estimates $b_B$ of $\beta$ from (B)
    $b_B = (\sum_i x_i x_i')^{-1} \sum_i x_i y_i$
  can be written with $y_i$ from (A):
    $b_B = \beta + (\sum_i x_i x_i')^{-1} \sum_i x_i z_i'\gamma + (\sum_i x_i x_i')^{-1} \sum_i x_i \varepsilon_i$
- Omitted variable bias: $E\{(\sum_i x_i x_i')^{-1} \sum_i x_i z_i'\}\gamma = E\{(X'X)^{-1} X'Z\}\gamma$
- No bias if (a) $\gamma = 0$ or if (b) the variables in $x_i$ and $z_i$ are orthogonal
- OLS estimators are biased if relevant regressors are omitted that are non-orthogonal to, i.e., correlated with, the included regressors

Unobserved Regressors
- Example: wage equation with $y_i$: log wage, $x_{1i}$: personal characteristics, $x_{2i}$: years of schooling, $u_i$: abilities (unobservable)
    $y_i = x_{1i}'\beta_1 + x_{2i}\beta_2 + u_i\gamma + v_i$
- Model for analysis (unobserved $u_i$ covered in the error term)
    $y_i = x_i'\beta + \varepsilon_i$
  with $x_i = (x_{1i}', x_{2i})'$, $\beta = (\beta_1', \beta_2)'$, $\varepsilon_i = u_i\gamma + v_i$
- Given $E\{x_i v_i\} = 0$
    $\operatorname{plim} b = \beta + \Sigma_{xx}^{-1} E\{x_i u_i\} \gamma$
- The OLS estimator $b$ is inconsistent if $x_i$ and $u_i$ are correlated (and $\gamma \neq 0$), e.g., if higher abilities induce more years at school: the estimator for $\beta_2$ might be too large, i.e., the effect of years at school etc. is
overestimated: "ability bias"
- Unobserved heterogeneity: observational units might differ in aspects other than those that are observable

Endogenous Regressors
- Regressors correlated with the error term: $E\{X'\varepsilon\} \neq 0$
- Endogeneity bias
- Relevant in many economic applications
- OLS estimators $b = \beta + (X'X)^{-1}X'\varepsilon$
  - $E\{b\} \neq \beta$, $b$ is biased; the bias $E\{(X'X)^{-1}X'\varepsilon\}$ is difficult to assess
  - $\operatorname{plim} b = \beta + \Sigma_{xx}^{-1} q$ with $q = \operatorname{plim}(T^{-1}X'\varepsilon)$
- For $q = 0$ (regressors and error term asymptotically uncorrelated), the OLS estimators $b$ are consistent also in case of endogenous regressors
- For $q \neq 0$ (error term and at least one regressor asymptotically correlated): $\operatorname{plim} b \neq \beta$, the OLS estimators $b$ are not consistent
- Exogenous regressors: regressors that are uncorrelated with the error term, i.e., all non-endogenous regressors

Consumption Function
- AWM data base, 1970:1-2003:4
- $C$: private consumption (PCR), growth rate p.y.
- $Y$: disposable income of households (PYR), growth rate p.y.
    $C_t = \beta_1 + \beta_2 Y_t + \varepsilon_t$   (A)
  $\beta_2$: marginal propensity to consume, $0 < \beta_2 < 1$
- OLS estimates:
    $\hat C_t = 0.011 + 0.718\, Y_t$
  with $t = 15.55$, $R^2 = 0.65$, $DW = 0.50$
- $I_t$: per capita investment (exogenous, $E\{I_t \varepsilon_t\} = 0$)
    $Y_t = C_t + I_t$   (B)
- Both $Y_t$ and $C_t$ are endogenous: $E\{C_t \varepsilon_t\} = E\{Y_t \varepsilon_t\} = \sigma_\varepsilon^2 (1 - \beta_2)^{-1}$
- The regressor $Y_t$ has an impact on $C_t$; at the same time $C_t$ has an impact on $Y_t$

Simultaneous Equation Models
- The variables $Y_t$ and $C_t$ are simultaneously determined by equations (A) and (B)
- Equations (A) and (B) are the structural equations or the structural form of the simultaneous equation model that describes both $Y_t$ and $C_t$
- The coefficients $\beta_1$ and $\beta_2$ are behavioral parameters
- Reduced form of the model: one equation for each of the endogenous variables $C_t$ and $Y_t$, with only the exogenous variable $I_t$ as regressor
- The OLS estimators are biased and inconsistent

Consumption Function, cont'd
- Reduced form of the model:
    $C_t = \frac{\beta_1}{1 - \beta_2} + \frac{\beta_2}{1 - \beta_2} I_t + \frac{1}{1 - \beta_2} \varepsilon_t$
    $Y_t = \frac{\beta_1}{1 - \beta_2} + \frac{1}{1 - \beta_2} I_t + \frac{1}{1 - \beta_2} \varepsilon_t$
- The OLS estimator $b_2$ from (A) is inconsistent
    $\operatorname{plim} b_2 = \beta_2 + \operatorname{Cov}\{Y_t, \varepsilon_t\} / V\{Y_t\} = \beta_2 + (1 - \beta_2)\, \sigma_\varepsilon^2 (V\{I_t\} + \sigma_\varepsilon^2)^{-1}$
  for $0 < \beta_2 < 1$, $b_2$ overestimates $\beta_2$
- The OLS estimator $b_1$ is also inconsistent

An Alternative Estimator
- Model
    $y_t = \beta_1 + \beta_2 x_t + \varepsilon_t$
  with $E\{\varepsilon_t x_t\} \neq 0$, i.e., endogenous regressor: OLS estimators are biased and inconsistent
- Instrumental variable $z_t$ satisfying
  1. Exogeneity: $E\{\varepsilon_t z_t\} = 0$: uncorrelated with the error term
  2. Relevance: $\operatorname{Cov}\{x_t, z_t\} \neq 0$: correlated with the endogenous regressor
- Transformation of the model equation
    $\operatorname{Cov}\{y_t, z_t\} = \beta_2 \operatorname{Cov}\{x_t, z_t\} + \operatorname{Cov}\{\varepsilon_t, z_t\}$
  gives
    $\beta_2 = \operatorname{Cov}\{y_t, z_t\} / \operatorname{Cov}\{x_t, z_t\}$

IV Estimator for $\beta_2$
- Substitution of sample moments for covariances gives the instrumental variables (IV) estimator
    $\hat\beta_{2,IV} = \frac{\sum_t (z_t - \bar z)(y_t - \bar y)}{\sum_t (z_t - \bar z)(x_t - \bar x)}$
- Consistent estimator for $\beta_2$ given that the instrumental variable $z_t$ is valid, i.e., it is
  - Exogenous, i.e., $E\{\varepsilon_t z_t\} = 0$
  - Relevant, i.e., $\operatorname{Cov}\{x_t, z_t\} \neq 0$
- Typically, it cannot be shown that the IV estimator is unbiased; small sample properties are unknown
- Coincides with the OLS estimator for $z_t = x_t$

Consumption Function, cont'd
- Alternative model: $C_t = \beta_1 + \beta_2 Y_{t-1} + \varepsilon_t$
- $Y_{t-1}$ and $\varepsilon_t$ are certainly uncorrelated; this avoids the risk of inconsistency due to correlated $Y_t$ and $\varepsilon_t$
- $Y_{t-1}$ is certainly highly correlated with $Y_t$ and is almost as good a regressor as $Y_t$
- Fitted model:
    $\hat C = 0.012 + 0.660\, Y_{-1}$
  with $t = 12.86$, $R^2 = 0.56$, $DW = 0.79$ (instead of $\hat C = 0.011 + 0.718\, Y$ with $t = 15.55$, $R^2 = 0.65$, $DW = 0.50$)
- The deterioration of the t-statistic and of $R^2$ is the price for the improvement of the estimator

IV Estimator: The Idea
- Alternative to the OLS estimator
- Avoids inconsistency in case of endogenous regressors
- Idea of the IV estimator:
  - Replace regressors which are correlated with the error terms by regressors
    - which are uncorrelated with the error terms
    - which are (highly) correlated with the regressors that are to be replaced
  - and use OLS estimation
- The hope is that the IV estimator is consistent (and less biased than the OLS estimator)
- Price: deteriorated model fit, e.g., t-statistic, $R^2$

IV Estimator: General Case
- The model is
    $y_t = x_t'\beta + \varepsilon_t$
  with $V\{\varepsilon_t\} = \sigma_\varepsilon^2$ and
    $E\{\varepsilon_t x_t\} \neq 0$
- At least one component of $x_t$ is correlated with the error term
- The vector of instruments $z_t$ (with the same dimension as $x_t$) fulfills
    $E\{\varepsilon_t z_t\} = 0$
- IV estimator based on the instruments $z_t$:
    $\hat\beta_{IV} = (\sum_t z_t x_t')^{-1} \sum_t z_t y_t$

IV Estimator: General Case, cont'd
- The (asymptotic) covariance matrix of $\hat\beta_{IV}$ is given by
    $V\{\hat\beta_{IV}\} = \sigma^2 \left[(\sum_t x_t z_t')(\sum_t z_t z_t')^{-1}(\sum_t z_t x_t')\right]^{-1}$
- In the estimated covariance matrix, $\sigma^2$ is substituted by
    $\hat\sigma^2 = \frac{1}{T} \sum_t (y_t - x_t'\hat\beta_{IV})^2$
- The asymptotic distribution of IV estimators, given $IID(0, \sigma_\varepsilon^2)$ error terms, leads to the approximate distribution
    $\hat\beta_{IV} \overset{a}{\sim} N(\beta, \hat V\{\hat\beta_{IV}\})$
  with the estimated covariance matrix
    $\hat V\{\hat\beta_{IV}\} = \hat\sigma^2 \left[(\sum_t x_t z_t')(\sum_t z_t z_t')^{-1}(\sum_t z_t x_t')\right]^{-1}$

Derivation of IV Estimators
- The model is
    $y_t = x_t'\beta + \varepsilon_t = x_{0t}'\beta_0 + \beta_K x_{Kt} + \varepsilon_t$
  with $x_{0t} = (x_{1t}, \dots, x_{K-1,t})'$ containing the first $K-1$ components of $x_t$, and $E\{\varepsilon_t x_{0t}\} = 0$
- The $K$-th component is endogenous: $E\{\varepsilon_t x_{Kt}\} \neq 0$
- The instrumental variable $z_{Kt}$ fulfills
    $E\{\varepsilon_t z_{Kt}\} = 0$
- Moment conditions: $K$ conditions to be satisfied by the coefficients, the $K$-th condition with $z_{Kt}$ instead of $x_{Kt}$:
    $E\{\varepsilon_t x_{0t}\} = E\{(y_t - x_{0t}'\beta_0 - \beta_K x_{Kt})\, x_{0t}\} = 0$   ($K-1$ conditions)
    $E\{\varepsilon_t z_{Kt}\} = E\{(y_t - x_{0t}'\beta_0 - \beta_K x_{Kt})\, z_{Kt}\} = 0$
- The number of conditions, and of corresponding linear equations, equals the number of coefficients to be estimated

Derivation of IV Estimators, cont'd
- The system of linear equations for the $K$ coefficients $\beta$ can be uniquely solved for $\beta$: the coefficients $\beta$ are identified
- To derive the IV estimators from the moment conditions, the expectations are replaced by sample averages
    $\frac{1}{T} \sum_t (y_t - x_{0t}'\hat\beta_0 - \hat\beta_K x_{Kt})\, x_{0t} = 0$
    $\frac{1}{T} \sum_t (y_t - x_{0t}'\hat\beta_0 - \hat\beta_K x_{Kt})\, z_{Kt} = 0$
- The solution of the linear equation system, with $z_t' = (x_{0t}', z_{Kt})$, is
    $\hat\beta_{IV} = (\sum_t z_t x_t')^{-1} \sum_t z_t y_t$
- Identification requires that the $K \times K$ matrix $\sum_t z_t x_t'$ is finite and invertible; the instrument $z_{Kt}$ is relevant when this is fulfilled
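The derivation of the IV estimator from the sample moment conditions can be illustrated numerically. The following is a minimal sketch on simulated data, not part of the original slides: the function name `iv_estimator`, the data-generating process, and all numerical values are assumptions chosen for illustration.

```python
import numpy as np

def iv_estimator(y, X, Z):
    """Solve the K sample moment conditions sum_t z_t (y_t - x_t'b) = 0 for b."""
    return np.linalg.solve(Z.T @ X, Z.T @ y)

# Simulated data (all numbers are illustrative assumptions)
rng = np.random.default_rng(0)
T = 20_000
z = rng.normal(size=T)                 # instrument: exogenous and relevant
u = rng.normal(size=T)                 # unobserved factor driving both x and eps
x = 0.8 * z + u + rng.normal(size=T)   # endogenous regressor
eps = u + rng.normal(size=T)           # error term, correlated with x through u
y = 1.0 + 0.5 * x + eps                # true coefficients: beta = (1.0, 0.5)

X = np.column_stack([np.ones(T), x])   # x_t = (1, x_t)'
Z = np.column_stack([np.ones(T), z])   # z_t = (1, z_t)': x_t replaced by z_t

b_ols = iv_estimator(y, X, X)          # with Z = X the IV estimator is OLS
b_iv = iv_estimator(y, X, Z)
print("OLS:", b_ols)                   # slope is pulled away from 0.5
print("IV: ", b_iv)                    # slope is close to 0.5 in large samples
```

Using `np.linalg.solve` on the $K \times K$ system avoids forming an explicit inverse; the call fails if the matrix of cross products of instruments and regressors is singular, which corresponds to the identification requirement stated above.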

Calculation of IV Estimators
- The model in matrix notation:
    $y = X\beta + \varepsilon$
- The IV estimator
    $\hat\beta_{IV} = (Z'X)^{-1} Z'y$
  with $z_t$ obtained from $x_t$ by substituting values of the instrumental variable(s) for all endogenous regressors
- Calculation in two steps:
  1. Regression of the explanatory variables $x_1, \dots, x_K$ (including the endogenous ones) on the columns of $Z$: fitted values
       $\hat X = Z(Z'Z)^{-1}Z'X$
  2. Regression of $y$ on the fitted explanatory variables:
       $\hat\beta_{IV} = (\hat X'\hat X)^{-1} \hat X'y$

Calculation of IV Estimators, cont'd
- Remarks:
  - The $K \times K$ matrix $Z'X = \sum_t z_t x_t'$ is required to be finite and invertible
  - From
      $(\hat X'\hat X)^{-1} \hat X'y = [X'Z(Z'Z)^{-1}Z'X]^{-1} X'Z(Z'Z)^{-1}Z'y = (Z'X)^{-1}Z'y$
    it is obvious that the estimator obtained in the second step is the IV estimator
  - However, the estimator obtained in the second step is more general; see below

Choice of Instrumental Variables
- Instrumental variables are required to be
  - exogenous, i.e., uncorrelated with the error terms
  - relevant, i.e., correlated with the endogenous regressors
- Instruments
  - must be based on subject matter arguments, e.g., arguments from economic theory
  - should be explained and motivated
  - must show a significant effect in explaining the endogenous regressor
- The choice of instruments is often not easy
- Regression of endogenous variables on instruments
  - Best linear approximation of the endogenous regressor (e.g., of $S_i$)
  - Economic interpretation is not of importance or interest

Example: Returns to Schooling
- Human capital earnings function:
    $w_i = \beta_1 + S_i\beta_2 + E_i\beta_3 + E_i^2\beta_4 + \varepsilon_i$
  with $w_i$: log of individual earnings, $S_i$: years of schooling, $E_i$: years of experience ($E_i = age_i - S_i - 6$)
- Empirically, more education is associated with higher income
- Question: Is this effect causal?
  - If yes, one more year at school increases the (log) wage by $\beta_2$
  - Otherwise, abilities may cause higher income and also more years at school; one more year at school does not increase the wage
- An issue of substantial attention in the literature

Returns to Schooling
- Wage equation: besides $S_i$ and $E_i$, additional explanatory variables like gender, regional, and racial dummies
- Model for analysis:
    $w_i = \beta_1 + z_i'\gamma + S_i\beta_2 + E_i\beta_3 + E_i^2\beta_4 + \varepsilon_i$
  $z_i$: observable variables excluding $E_i$, $S_i$
- $z_i$ is assumed to be exogenous, i.e., $E\{z_i \varepsilon_i\} = 0$
- $S_i$ may be endogenous, i.e., $E\{S_i \varepsilon_i\} \neq 0$
  - Ability bias: unobservable factors like intelligence, family background, etc. lead to both more schooling and higher earnings
  - Measurement error in measuring schooling
  - Etc.
- With $S_i$, also $E_i = age_i - S_i - 6$ and $E_i^2$ are endogenous
- OLS estimators may be inconsistent

Returns to Schooling: Data
- Verbeek's data set "schooling"
- National Longitudinal Survey of Young Men (Card, 1995)
- Data from 3010 males, survey 1976
- Individual characteristics, incl. experience, race, region, family background etc.
- Human capital function
    $\log(wage_i) = \beta_1 + \beta_2\, ed_i + \beta_3\, exp_i + \beta_4\, exp_i^2 + \varepsilon_i$
  with $ed_i$: years of schooling ($S_i$), $exp_i$: years of experience ($E_i$)
- Further explanatory variables: black: dummy for African-American, smsa: dummy for living in a metropolitan area, south: dummy for living in the south

OLS Estimation
OLS estimated wage function: Output from GRETL

  Model 2: OLS, using observations 1-3010
  Dependent variable: l_WAGE76

              coefficient    std. error    t-ratio   p-value
  ------------------------------------------------------------
  const        4.73366       0.0676026      70.02    0.0000     ***
  ED76         0.0740090     0.00350544     21.11    2.28e-092  ***
  EXP76        0.0835958     0.00664779     12.57    2.22e-035  ***
  EXP762      -0.00224088    0.000317840    -7.050   2.21e-012  ***
  BLACK       -0.189632      0.0176266     -10.76    1.64e-026  ***
  SMSA76       0.161423      0.0155733      10.37    9.27e-025  ***
  SOUTH76     -0.124862      0.0151182      -8.259   2.18e-016  ***

  Mean dependent var    6.261832    S.D. dependent var    0.443798
  Sum squared resid     420.4760    S.E. of regression    0.374191
  R-squared             0.290505    Adjusted R-squared    0.289088
  F(6, 3003)            204.9318    P-value(F)            1.5e-219
  Log-likelihood       -1308.702    Akaike criterion      2631.403
  Schwarz criterion     2673.471    Hannan-Quinn          2646.532

Instruments for $S_i$, $E_i$, $E_i^2$
- Potential instrumental variables
  - Factors which affect schooling but are uncorrelated with the error terms, in particular with unobserved abilities that determine the wage
- For years of experience ($E_i$, $E_i^2$): age is a natural candidate
- For years of schooling ($S_i$)
  - Costs of schooling, e.g., distance to school (lived near college), number of siblings
  - Parents' education
  - Quarter of birth

Step 1 of IV Estimation
Model for schooling (ed76), gives predicted values ed76_h

  Model 3: OLS, using observations 1-3010
  Dependent variable: ED76

              coefficient    std. error    t-ratio   p-value
  ------------------------------------------------------------
  const       -1.81870       4.28974        -0.4240  0.6716
  AGE76        1.05881       0.300843        3.519   0.0004     ***
  sq_AGE76    -0.0187266     0.00522162     -3.586   0.0003     ***
  BLACK       -1.46842       0.115245      -12.74    2.96e-036  ***
  SMSA76       0.841142      0.105841        7.947   2.67e-015  ***
  SOUTH76     -0.429925      0.102575       -4.191   2.85e-05   ***
  NEARC4A      0.441082      0.0966588       4.563   5.24e-06   ***

  Mean dependent var    13.26346    S.D. dependent var    2.676913
  Sum squared resid     18941.85    S.E. of regression    2.511502
  R-squared             0.121520    Adjusted R-squared    0.119765
  F(6, 3003)            69.23419    P-value(F)            5.49e-81
  Log-likelihood       -7039.353    Akaike criterion      14092.71
  Schwarz criterion     14134.77    Hannan-Quinn          14107.83

Step 2 of IV Estimation
Wage equation, estimated by IV with instruments age, age2, and nearc4a

  Model 4: OLS, using observations 1-3010
  Dependent variable: l_WAGE76

              coefficient    std. error    t-ratio   p-value
  ------------------------------------------------------------
  const        3.69771       0.435332        8.494   3.09e-017  ***
  ED76_h       0.164248      0.036887        4.453   8.79e-06   ***
  EXP76_h      0.044588      0.022502        1.981   0.0476     **
  EXP762_h    -0.000195      0.001152       -0.169   0.8655
  BLACK       -0.057333      0.056772       -1.010   0.3126
  SMSA76       0.079372      0.037116        2.138   0.0326     **
  SOUTH76     -0.083698      0.022985       -3.641   0.0003     ***

  Mean dependent var    6.261832    S.D. dependent var    0.443798
  Sum squared resid     446.8056    S.E. of regression    0.385728
  R-squared             0.246078    Adjusted R-squared    0.244572
  F(6, 3003)            163.3618    P-value(F)            4.4e-180
  Log-likelihood       -1516.471    Akaike criterion      3046.943
  Schwarz criterion     3089.011    Hannan-Quinn          3062.072

GRETL's TSLS Estimation
Wage equation, estimated by IV: Output from GRETL

  Model 8: TSLS, using observations 1-3010
  Dependent variable: l_WAGE76
  Instrumented: ED76 EXP76 EXP762
  Instruments: const AGE76 sq_AGE76 BLACK SMSA76 SOUTH76 NEARC4A

              coefficient    std. error    t-ratio   p-value
  ------------------------------------------------------------
  const        3.69771       0.495136        7.468   8.14e-014  ***
  ED76         0.164248      0.0419547       3.915   9.04e-05   ***
  EXP76        0.0445878     0.0255932       1.742   0.0815     *
  EXP762      -0.00019526    0.0013110      -0.1489  0.8816
  BLACK       -0.0573333     0.0645713      -0.8879  0.3746
  SMSA76       0.0793715     0.0422150       1.880   0.0601     *
  SOUTH76     -0.0836975     0.0261426      -3.202   0.0014     ***

  Mean dependent var    6.261832    S.D. dependent var    0.443798
  Sum squared resid     577.9991    S.E. of regression    0.438718
  R-squared             0.195884    Adjusted R-squared    0.194277
  F(6, 3003)            126.2821    P-value(F)            8.9e-143

Returns to Schooling: Summary of Estimates
Estimated regression coefficients and t-statistics

                        OLS       IV 1)     TSLS 1)   IV (M.V.)
  ed76     coeff.      0.0740    0.1642    0.1642    0.1329
           t-stat.    21.11      4.45      3.92      2.59
  exp76    coeff.      0.0836    0.0446    0.0446    0.0560
           t-stat.    12.57      1.98      1.74      2.15
  exp762   coeff.     -0.0022   -0.0002   -0.0002   -0.0008
           t-stat.    -7.05     -0.17     -0.15     -0.59
  black    coeff.     -0.1896   -0.0573   -0.0573   -0.1031
           t-stat.   -10.76     -1.01     -0.89     -1.33

  1) The model differs from that used by Verbeek

Some Comments
- Instrumental variables (age, age2, nearc4a)
  - are relevant, i.e., have explanatory power for ed76, exp76, exp762
  - Whether they are exogenous, i.e., uncorrelated with the error terms, is not answered
- Test for exogeneity of regressors: Wu-Hausman test
- Estimates of the ed76 coefficient:
  - IV estimate: 0.13, i.e., 13% higher wage for one additional year of schooling; nearly double the OLS estimate (0.07); not in line with the "ability bias" argument!
  - s.e. of IV estimate (0.04) much higher than s.e.
    of OLS estimate (0.004)
  - Loss of efficiency especially in case of weak instruments: $R^2$ of the model for ed76: 0.12; Corr{ed76, ed76_h} = 0.35

From OLS to IV Estimation
- Linear model $y_i = x_i'\beta + \varepsilon_i$
- OLS estimator $b$: solution of the $K$ normal equations
    $\frac{1}{N} \sum_i (y_i - x_i'b)\, x_i = 0$
- Corresponding moment conditions
    $E\{\varepsilon_i x_i\} = E\{(y_i - x_i'\beta)\, x_i\} = 0$
- IV estimator given $R$ instrumental variables $z_i$, which may overlap with $x_i$: based on the $R$ moment conditions
    $E\{\varepsilon_i z_i\} = E\{(y_i - x_i'\beta)\, z_i\} = 0$
- IV estimator: solution of the corresponding sample moment conditions
    $\frac{1}{N} \sum_i (y_i - x_i'\hat\beta_{IV})\, z_i = 0$

Number of Instruments
- Moment conditions
    $E\{\varepsilon_i z_i\} = E\{(y_i - x_i'\beta)\, z_i\} = 0$
  one equation for each component of $z_i$
- $z_i$ possibly overlapping with $x_i$
- General case: $R$ moment conditions
- Substitution of expectations by sample averages gives $R$ equations
    $\frac{1}{N} \sum_i (y_i - x_i'\hat\beta_{IV})\, z_i = 0$
1. $R = K$: one unique solution, the IV estimator $\hat\beta_{IV} = (Z'X)^{-1}Z'y$; identified model
2. $R < K$: infinite number of solutions, not enough instruments; under-identified or not identified model

The GIV Estimator
3. $R > K$: more instruments than necessary for identification; over-identified model
- For $R > K$, in general, no unique solution of all $R$ sample moment conditions can be obtained; instead:
  - the weighted quadratic form in the sample moments
      $Q_N(\beta) = \left[\frac{1}{N} \sum_i (y_i - x_i'\beta)\, z_i\right]' W_N \left[\frac{1}{N} \sum_i (y_i - x_i'\beta)\, z_i\right]$
    with an $R \times R$ positive definite weighting matrix $W_N$ is minimized
  - this gives the generalized instrumental variable (GIV) estimator
      $\hat\beta_{IV} = (X'Z\, W_N Z'X)^{-1} X'Z\, W_N Z'y$

The GIV Estimator, cont'd
- The weighting matrix $W_N$
  - Different weighting matrices result in different consistent GIV estimators with different covariance matrices
  - For $R = K$, the matrix $X'Z$ is square and invertible; the IV estimator is $(Z'X)^{-1}Z'y$ for any $W_N$
- Optimal choice for $W_N$?

GIV and TSLS Estimator
- Optimal weighting matrix: $W_N^{opt} = [\frac{1}{N}(Z'Z)]^{-1}$; corresponds to the most efficient IV estimator
    $\hat\beta_{IV} = [X'Z(Z'Z)^{-1}Z'X]^{-1} X'Z(Z'Z)^{-1}Z'y$
- If the error terms are heteroskedastic or autocorrelated, the optimal weighting matrix has to be adapted
- Regression of each regressor, i.e., each column of $X$, on $Z$ results in
    $\hat X = Z(Z'Z)^{-1}Z'X$
  and
    $\hat\beta_{IV} = (\hat X'\hat X)^{-1} \hat X'y$
- This explains why the GIV estimator is also called the "two stage least squares" (TSLS) estimator:
  1. First step: regress each column of $X$ on $Z$
  2. Second step: regress $y$ on the predictions of $X$

GIV Estimator and Properties
- The GIV estimator is consistent
- The asymptotic distribution of the GIV estimator, given $IID(0, \sigma_\varepsilon^2)$ error terms, leads to the approximate distribution
    $\hat\beta_{IV} \overset{a}{\sim} N(\beta, \hat V\{\hat\beta_{IV}\})$
- The (asymptotic) covariance matrix of $\hat\beta_{IV}$ is given by
    $V\{\hat\beta_{IV}\} = \sigma^2 [X'Z(Z'Z)^{-1}Z'X]^{-1}$
- In the estimated covariance matrix, $\sigma^2$ is substituted by
    $\hat\sigma^2 = \frac{1}{N} \sum_i (y_i - x_i'\hat\beta_{IV})^2$
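The two-stage calculation of the GIV estimator and its estimated covariance matrix can be sketched as follows for an over-identified model. This is an illustrative sketch on simulated data, assuming the optimal weighting matrix for IID errors; the function name `tsls`, the two simulated instruments, and all parameter values are hypothetical and not taken from the slides.

```python
import numpy as np

def tsls(y, X, Z):
    """GIV/TSLS estimate with W_N = [(1/N) Z'Z]^{-1} and its estimated covariance."""
    X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)      # step 1: regress X on Z
    b = np.linalg.solve(X_hat.T @ X_hat, X_hat.T @ y)  # step 2: regress y on X_hat
    e = y - X @ b                                      # residuals use X, not X_hat
    sigma2 = e @ e / len(y)                            # (1/N) sum of squared residuals
    cov = sigma2 * np.linalg.inv(X_hat.T @ X_hat)      # sigma2 * [X'Z (Z'Z)^{-1} Z'X]^{-1}
    return b, cov

# Simulated data (all numbers are illustrative assumptions)
rng = np.random.default_rng(1)
N = 20_000
z1, z2 = rng.normal(size=N), rng.normal(size=N)    # two instruments
u = rng.normal(size=N)                             # source of endogeneity
x = 0.6 * z1 + 0.4 * z2 + u + rng.normal(size=N)   # endogenous regressor
eps = u + rng.normal(size=N)
y = 1.0 + 0.5 * x + eps                            # true beta = (1.0, 0.5)

X = np.column_stack([np.ones(N), x])               # K = 2
Z = np.column_stack([np.ones(N), z1, z2])          # R = 3 > K: over-identified

b, cov = tsls(y, X, Z)

# Closed-form GIV estimator with the same weighting matrix, for comparison
A = X.T @ Z @ np.linalg.inv(Z.T @ Z)
b_giv = np.linalg.solve(A @ Z.T @ X, A @ Z.T @ y)
print(b, b_giv, np.sqrt(np.diag(cov)))
```

Note that the residuals are computed with the original regressors, not with the fitted ones. Running the second-stage OLS by hand gives the same coefficients but bases the standard errors on residuals from the fitted regressors, which is presumably why the Step 2 output and the TSLS output in the schooling example show identical coefficients but different standard errors.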

Some Tests
- For testing
  - Endogeneity of regressors: Wu-Hausman test or Durbin-Wu-Hausman test
  - Validity of potential instrumental variables: over-identifying restrictions test or Sargan test
  - Weak instruments: Cragg-Donald test

Wu-Hausman Test
- For testing whether one or more regressors are endogenous (correlated with the error term)
- Based on the assumption that the instrumental variables are valid; i.e., given that $E\{\varepsilon_i z_i\} = 0$, $E\{\varepsilon_i x_i\} = 0$ can be tested
- The idea of the test:
  - Under the null hypothesis, both the OLS and the IV estimator are consistent; they should differ by sampling errors only
  - Rejection of the null hypothesis indicates inconsistency of the OLS estimator

Wu-Hausman Test, cont'd
- Based on the (squared) difference between the OLS and IV estimators
- Added variable interpretation of the Wu-Hausman test: checks whether the residuals $v_i$ from the reduced form equation of the potentially endogenous regressors contribute to explaining $y_i$ in
    $y_i = x_{1i}'b_1 + x_{2i}b_2 + v_i\gamma + \varepsilon_i$
- $v_i$: residuals from the reduced form equation for $x_2$ (predicted values for $x_2$: $\hat x_{2i} = x_{2i} - v_i$)
- $H_0$: $\gamma = 0$; corresponds to: $x_2$ is exogenous
- For testing $H_0$: use of the
  - t-test, if $\gamma$ has one component, i.e., $x_2$ is just one regressor
  - F-test, if more than one regressor is tested for exogeneity

Wu-Hausman Test, cont'd
- Remarks
  - The test requires valid instruments
  - The test has little power if instruments are weak or invalid
  - The test can be used to test whether additional instruments are valid

Sargan Test
- For testing whether the instruments are valid
- The validity of the instruments requires that all moment conditions are fulfilled; the $R$ values of the sums
    $\frac{1}{N} \sum_i e_i z_i$, with $e_i = y_i - x_i'\hat\beta_{IV}$ the IV residuals,
  must be close to zero
- Test
  statistic
    $\xi = N Q_N(\hat\beta_{IV}) = (\sum_i e_i z_i)' (\hat\sigma^2 \sum_i z_i z_i')^{-1} (\sum_i e_i z_i)$
  has under the null hypothesis an asymptotic Chi-squared distribution with $R - K$ df
- Calculation of $\xi = N R_e^2$ using $R_e^2$ from the auxiliary regression of the IV residuals $e_i$ on the instruments $z_i$

Sargan Test, cont'd
- Remarks
  - Only $R - K$ of the $R$ moment conditions are free; in case of an identified model ($R = K$), all $R$ moment conditions are fulfilled
  - The test is also called over-identifying restrictions test
  - Rejection implies: the joint validity of all moment conditions, and hence of all instruments, is not acceptable
  - The Sargan test gives no indication of which instruments are invalid
- Test whether a subset of $R - R_1$ instruments is valid; $R_1$ ($> K$) instruments are out of doubt:
  - Calculate $\xi$ for all $R$ moment conditions
  - Calculate $\xi_1$ for the $R_1$ moment conditions
  - Under $H_0$, $\xi - \xi_1$ has a Chi-squared distribution with $R - R_1$ df

Cragg-Donald Test
- Weak (only marginally valid) instruments:
  - Biased estimates
  - Inconsistent estimates
  - Inappropriate large-sample approximations to the finite-sample distributions even for large $T$
- Definition of weak instruments: estimates are biased to an extent that is unacceptably large
- Null hypothesis: instruments are weak, i.e., can lead to an asymptotic relative bias greater than some value $b$

Your Homework
1. Use the data set "schooling" of Verbeek for the following analyses based on the wage equation
    log(wage76) = b1 + b2 ed76 + b3 exp76 + b4 exp762
                  + b5 black + b6 smsa76 + b7 south76 + b8 nearc4 + e
   a. Estimate the reduced form for ed76, including daded and momed, (i) with and (ii) without nearc4; assess the validity of the potential instruments; what do the correlation coefficients indicate?
   b. Estimate the returns to schooling, using the instruments age, age2, daded, and momed; interpret the results including the test for validity and the Sargan test
   c. Estimate the returns to schooling, using the instruments age, age2, nearc4, daded, and momed; interpret the results including the test for validity and the Sargan test
   d. Compare the estimates of b., c., and of the model with instruments age, age2, and nearc4

Your Homework, cont'd
2. For the model for consumption and income (slide 13 ff, "Consumption Function"):
   a. Show that both $y_t$ (i.e., $C_t$) and $x_t$ (i.e., $Y_t$) are endogenous:
        $E\{y_t \varepsilon_t\} = E\{x_t \varepsilon_t\} = \sigma_\varepsilon^2 (1 - \beta_2)^{-1}$
   b. Derive the reduced form of the model
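As a numerical companion to the consumption and income model, the following sketch simulates the two structural equations and compares the OLS slope with the probability limit stated on the "Consumption Function, cont'd" slide, and with the IV slope that uses investment as instrument. It is only an illustration under assumed values: the parameters `beta1`, `beta2`, `sigma_eps`, `sigma_I`, the normal distributions, and the helper `slope` are hypothetical choices, not estimates from the AWM data.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 50_000
beta1, beta2 = 1.0, 0.7          # assumed behavioral parameters, 0 < beta2 < 1
sigma_eps, sigma_I = 1.0, 2.0    # assumed standard deviations of eps_t and I_t

I = rng.normal(loc=5.0, scale=sigma_I, size=T)   # exogenous investment
eps = rng.normal(scale=sigma_eps, size=T)        # error term of equation (A)

# Solve (A) C = beta1 + beta2*Y + eps and (B) Y = C + I jointly for Y and C
Y = (beta1 + I + eps) / (1.0 - beta2)
C = Y - I

def slope(y, x, z):
    """IV slope: sample Cov(y, z) / Cov(x, z); z = x gives the OLS slope."""
    return np.cov(y, z)[0, 1] / np.cov(x, z)[0, 1]

b2_ols = slope(C, Y, Y)          # OLS of C on Y
b2_iv = slope(C, Y, I)           # IV with investment as instrument
plim_ols = beta2 + (1 - beta2) * sigma_eps**2 / (sigma_I**2 + sigma_eps**2)

print("OLS slope:", b2_ols, "stated plim:", plim_ols)
print("IV slope: ", b2_iv, "true beta2:", beta2)
```

With these assumed values the OLS slope settles near the stated probability limit, which lies above the true marginal propensity to consume, while the IV slope settles near the true value.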