Econometrics 2 - Lecture 7
Models Based on Panel Data

Contents
- Panel Data
- Pooling Independent Cross-sectional Data
- Panel Data: Pooled OLS Estimation
- Panel Data Models
- Fixed Effects Model
- Random Effects Model
- Analysis of Panel Data Models
- Panel Data in Gretl

May 4, 2012 Hackl, Econometrics 2, Lecture 6

Types of Data
- Population of interest: individuals, households, companies, countries
- Types of observations
  - Cross-sectional data: observations of all units of a population, or of a representative subset, at one specific point in time
  - Time series data: series of observations on units of the population over a period of time
  - Panel data (longitudinal data): repeated observations of (the same) population units collected over a number of periods; data set with both a cross-sectional and a time series aspect; multi-dimensional data
- Cross-sectional and time series data are one-dimensional, special cases of panel data
- Pooling independent cross-sections: (only) similar to panel data

Example: Individual Wages
- Verbeek's data set "males"
- Sample of
  - 545 full-time working males
  - each person observed yearly after completion of school, from 1980 to 1987
- Variables
  - wage: log of hourly wage (in USD)
  - school: years of schooling
  - exper: age - 6 - school
  - dummies for union membership, married, black, Hispanic, public sector
  - others

Panel Data in Gretl
- Three types of data:
  - Cross-sectional data: matrix of observations, variables over the columns, each row corresponding to the set of variables observed for a unit
  - Time series data: matrix of observations, each column a time series, rows correspond to observation periods (annual, quarterly, etc.)
  - Panel data: matrix of observations with special data structure
    - Stacked time series: each column one variable, with stacked time series corresponding to observational units
    - Stacked cross sections: each column one variable, with stacked cross sections corresponding to observation periods
    - Use of index variables: index variables defined for units and observation periods

Stacked Data: Examples

Stacked time series

       unit  year  x1     x2
1:1    1     2009  1.197  252
1:2    1     2010  1.369  269
1:3    1     2011  1.675  275
...    ...   ...   ...    ...
2:1    2     2009  1.220  198
2:2    2     2010  1.397  212
2:3    2     2011  1.569  275
...    ...   ...   ...    ...

Stacked cross sections

       unit  year  x1     x2
1:1    1     2009  1.197  252
2:1    2     2009  1.220  198
3:1    3     2009  1.173  167
...    ...   ...   ...    ...
1:2    1     2010  1.369  269
2:2    2     2010  1.397  212
3:2    3     2010  1.358  201
...    ...   ...   ...    ...

Panel Data Files
- Files with one record per observation (see Table)
  - For each unit (individual, company, country, etc.) T records
  - Stacked time series or stacked cross sections
  - Allows easy differencing
- Files with one record per unit
  - Each record contains all observations for all T periods
  - Time-constant variables are stored only once

Panel Data
- Typically data at micro-economic level (individuals, households), but also at macro-economic level (e.g., countries)
- Notation:
  - N: number of cross-sectional units
  - T: number of time periods
- Types of panel data:
  - Large T, small N: "long and narrow"
  - Small T, large N: "short and wide"
  - Large T, large N: "long and wide"
- Example: data set "males": short and wide panel ($N \gg T$)

Panel Data: Some Examples
- Data set "males": short and wide panel (N = 545, T = 8)
  - rich in information (~40 variables)
  - analysis of effects of unobserved differences (heterogeneity)
- Grunfeld investment data: investments in plant and equipment by
  - N = 10 firms
  - for each, T = 20 yearly observations for 1935-1954
- Penn World Table: purchasing power parity and national income accounts for
  - N = 189 countries/territories
  - for some or all of the years 1950-2009 (T ≤ 60)

Use of Panel Data
- Econometric models for describing the behaviour of cross-sectional units over time
- Panel data models
  - Allow controlling for individual differences, comparing behaviour, analysing dynamic adjustment, measuring effects of policy changes
  - More realistic models
  - Allow more detailed or sophisticated research questions
- Methodological implications:
  - Dependence of sample units in the time dimension
  - Some variables might be time-constant (e.g., variable school in "males", population size in the Penn World Table dataset)
  - Missing values

Example: Schooling and Wages
- Data set "males"
- Independent random samples for 1980 and 1987
- $N_{80} = N_{87} = 100$
- Variables: wage (log of hourly wage), exper (age - 6 - years of schooling)

                  1980               1987
                  Full set  Sample   Full set  Sample
wage   mean       1.39      1.37     1.87      1.89
       st.dev.    0.558     0.598    0.467     0.475
exper  mean       3.01      2.96     10.02     9.99
       st.dev.    1.65      1.29     1.65      1.85

Pooling of Samples
- Independent random samples:
  - Pooling gives an independently pooled cross section
  - OLS estimates with higher precision, tests with higher power
  - Requires
    - the same distributional properties of sampled variables
    - the same relation between variables in the samples

Example: Schooling and Wages
- Some wage equations:
  - 1980 sample
    wage = 1.344 + 0.010*exper, $R^2$ = 0.001
  - 1987 sample
    wage = 2.776 - 0.089*exper, $R^2$ = 0.119
  - pooled sample
    wage = 1.300 + 0.051*exper, $R^2$ = 0.111
  - pooled sample with dummy d87
    wage = 1.542 - 0.056*exper + 0.912*d87, $R^2$ = 0.210
  - pooled sample with dummy d87 and interaction
    wage = 1.344 + 0.010*exper + 1.432*d87 - 0.099*d87*exper
- d87: dummy for observations from 1987

Wage Equations
- Wage equations, dependent variable: wage (log of hourly wage)

                      1980    1987    80+87   80+87   80+87
Interc.    coeff      1.344   2.776   1.300   1.542   1.344
           s.e.       0.386   0.247   0.078   0.088   0.134
exper      coeff      0.010   -0.089  0.051   -0.056  0.010
           s.e.       0.032   0.024   0.010   0.024   0.041
d87        coeff                              0.912   1.432
           s.e.                               0.184   0.321
d87*exper  coeff                                      -0.099
           s.e.                                       0.050
R2 (%)                3.4     6.6     11.1    21.0    22.5

- At least the intercept changes from 1980 to 1987

Pooled Independent Cross-sectional Data
- Pooling of two independent cross-sectional samples
- The model
  $y_{it} = \beta_1 + \beta_2 x_{it} + \varepsilon_{it}$ for $i = 1,...,N$, $t = 1,2$
- Implicit assumption: identical $\beta_1$, $\beta_2$ for $i = 1,...,N$, $t = 1,2$
- OLS estimation: requires homoskedastic and uncorrelated $\varepsilon_{it}$
  $E\{\varepsilon_{it}\} = 0$, $Var\{\varepsilon_{it}\} = \sigma^2$ for $i = 1,...,N$, $t = 1,2$
  $Cov\{\varepsilon_{i1}, \varepsilon_{j2}\} = 0$ for all $i$, $j$ with $i \neq j$

Questions of Interest
- Changes between the two cross-sectional samples
  - in distributional properties of the variables?
  - in parameters of the model?
- Model in presence of changes:
  - Dummy variable D: indicator for t = 2 (D = 0 for t = 1, D = 1 for t = 2)
    $y_{it} = \beta_1 + \beta_2 x_{it} + \beta_3 D + \beta_4 D x_{it} + \varepsilon_{it}$
  - Change (from t = 1 to t = 2)
    - of the intercept from $\beta_1$ to $\beta_1 + \beta_3$
    - of the coefficient of x from $\beta_2$ to $\beta_2 + \beta_4$
- Tests for constancy of (1) $\beta_1$ or (2) $\beta_1$, $\beta_2$ over time:
  $H_0^{(1)}$: $\beta_3 = 0$ or $H_0^{(2)}$: $\beta_3 = \beta_4 = 0$
- Similarly, testing for constancy of $\sigma^2$ over time
- Generalization to more than two time periods

A Model for Two-period Panel Data
- Model for y, based on panel data for two periods:
  $y_{it} = \beta_0 + \delta_1 d_t + \beta_1 x_{it} + \varepsilon_{it}$
  $\quad\;\; = \beta_0 + \delta_1 d_t + \beta_1 x_{it} + \alpha_i + u_{it}$
  - $i = 1,...,N$: sample units of the panel
  - $t = 1, 2$: time period of sample
  - $d_t$: dummy for period t = 2
- $\varepsilon_{it} = \alpha_i + u_{it}$: composite error
- $\alpha_i$: represents all unit-specific, time-constant factors; also called unobserved (individual) heterogeneity
- $u_{it}$: represents unobserved factors that change over time; also called idiosyncratic or time-varying error
  - $u_{it}$ (and $\varepsilon_{it}$) may be correlated over time for the same unit
- The model is called unobserved effects model or fixed effects model
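
To illustrate why the unit-specific component $\alpha_i$ in the two-period model matters, the following minimal sketch simulates such a panel and compares OLS on the stacked data with OLS on the differences between the two periods, in which $\alpha_i$ cancels. All numbers (sample size, parameter values, the way $x_{it}$ depends on $\alpha_i$) are assumptions chosen for illustration; they are not taken from the "males" data.

```python
import numpy as np

# Simulated two-period panel (all parameter values are illustrative assumptions)
rng = np.random.default_rng(0)
N = 5000
beta0, delta1, beta1 = 1.0, 0.3, 0.5

alpha = rng.normal(size=N)                     # unit-specific, time-constant factor
x = alpha[:, None] + rng.normal(size=(N, 2))   # x_it is correlated with alpha_i
d = np.array([0.0, 1.0])                       # dummy for period t = 2
u = rng.normal(size=(N, 2))                    # idiosyncratic error
y = beta0 + delta1 * d + beta1 * x + alpha[:, None] + u

# OLS on the stacked data (unit-major ordering: y_11, y_12, y_21, y_22, ...)
X_pooled = np.column_stack([np.ones(2 * N), np.tile(d, N), x.ravel()])
b_pooled = np.linalg.lstsq(X_pooled, y.ravel(), rcond=None)[0]

# OLS on the differences y_i2 - y_i1: alpha_i drops out
dy = y[:, 1] - y[:, 0]
dx = x[:, 1] - x[:, 0]
X_fd = np.column_stack([np.ones(N), dx])
b_fd = np.linalg.lstsq(X_fd, dy, rcond=None)[0]

print("pooled OLS slope:       ", b_pooled[2])
print("first-difference slope: ", b_fd[1])
print("first-difference const.:", b_fd[0])
```

In this design the slope from the stacked regression is pushed away from the assumed value 0.5 because $x_{it}$ and $\alpha_i$ are correlated, while the slope from the differenced regression stays close to it; the constant of the differenced regression estimates $\delta_1$.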

Estimation of the Parameters of Interest
- Parameter of interest is $\beta_1$
- Estimation concepts:
  1. Pooled OLS estimation of $\beta_1$ from $y_{it} = \beta_0 + \delta_1 d_t + \beta_1 x_{it} + \varepsilon_{it}$ based on the pooled dataset
     - Inconsistent if $x_{it}$ and $\alpha_i$ are correlated
     - Incorrect standard errors due to correlation of $u_{it}$ (and $\varepsilon_{it}$) over time; typically too small standard errors
  2. First-difference estimator: OLS estimation of $\beta_1$ from the first-difference equation
     $\Delta y_i = y_{i2} - y_{i1} = \delta_1 + \beta_1 \Delta x_i + \Delta u_i$
     - $\alpha_i$ are differenced away
     - $x_{it}$ and $\alpha_i$ may be correlated
  3. Fixed effects estimation (see below)

Wage Equations
- Data set "males", cross-sectional samples for 1980 and 1987
- (1): OLS estimation in pooled sample
- (2): OLS estimation in pooled sample with interaction dummy

                    (1)      (2)
interc.   coeff     1.045    1.241
          s.e.      0.048    0.056
exper     coeff     0.160    0.073
          s.e.      0.017    0.021
exper2    coeff     -0.008   -0.006
          s.e.      0.001    0.001
d87       coeff              0.479
          s.e.               0.076
R2 (%)              16.2     19.0

Pooled OLS Estimation
- Model for y, based on panel data from T periods:
  $y_{it} = x_{it}'\beta + \varepsilon_{it}$
- Pooled OLS estimation of $\beta$
  - Assumes equal unit means $\alpha_i$
  - Consistent if $x_{it}$ and $\varepsilon_{it}$ are (at least contemporaneously) uncorrelated
- Diagnostics of interest:
  - Test whether the panel data structure has to be taken into account
  - Test whether the fixed or the random effects model is preferable
- In Gretl: the output window of OLS estimation applied to a panel data structure offers a special test: Tests > Panel diagnostics
  - Tests of H0: pooled model preferable to fixed effects and random effects model
  - Hausman test (H0: random effects model preferable to fixed effects model)

Models for Panel Data
- Model for y, based on panel data from N cross-sectional units and T periods
  $y_{it} = \beta_0 + x_{it}'\beta_1 + \varepsilon_{it}$
  - $i = 1, ..., N$: sample unit
  - $t = 1, ..., T$: time period of sample
  - $x_{it}$ and $\beta_1$: K-vectors
- $\beta_0$ and $\beta_1$: represent intercept and K regression coefficients; are assumed to be identical for all units and all time periods
- $\varepsilon_{it}$: represents unobserved factors that affect $y_{it}$
  - Assumption that $\varepsilon_{it}$ are uncorrelated over time is not realistic
  - Standard errors of OLS estimates misleading, OLS estimation not efficient

Random Effects Model
- Model
  $y_{it} = \beta_0 + x_{it}'\beta_1 + \varepsilon_{it}$
- Specification for the error terms: two components
  $\varepsilon_{it} = \alpha_i + u_{it}$
  - $\alpha_i \sim IID(0, \sigma_a^2)$; represents all unit-specific, time-constant factors
  - $u_{it} \sim IID(0, \sigma_u^2)$; uncorrelated over time
  - $\alpha_i$ and $u_{it}$ are assumed to be mutually independent and independent of $x_{js}$ for all j and s
- Random effects model
  $y_{it} = \beta_0 + x_{it}'\beta_1 + \alpha_i + u_{it}$
- Correlation of error terms only via the unit-specific factors $\alpha_i$
- Efficient estimation of $\beta_0$ and $\beta_1$: takes error covariance structure into account; GLS estimation

Fixed Effects Model
- Model
  $y_{it} = \beta_0 + x_{it}'\beta_1 + \varepsilon_{it}$
- Specification for the error terms: two components
  $\varepsilon_{it} = \alpha_i + u_{it}$
  - $\alpha_i$: unit-specific, time-constant factors; may be correlated with $x_{it}$
  - $u_{it} \sim IID(0, \sigma_u^2)$; uncorrelated over time
- Fixed effects model
  $y_{it} = \alpha_i + x_{it}'\beta_1 + u_{it}$
- Overall intercept omitted; unit-specific intercepts $\alpha_i$

Fixed Effects Model
- Model for y, based on panel data for T periods
  $y_{it} = \alpha_i + x_{it}'\beta + u_{it}$, $u_{it} \sim IID(0, \sigma_u^2)$
  - $i = 1, ..., N$: sample unit
  - $t = 1, ..., T$: time period of sample
- $\alpha_i$: fixed parameter, represents all unit-specific, time-constant factors, unobserved (individual) heterogeneity
- $x_{it}$: all K components are assumed to be independent of all $u_{it}$; may be correlated with $\alpha_i$
- Model with dummies $d_{ij} = 1$ for $i = j$ and 0 otherwise:
  $y_{it} = \sum_j \alpha_j d_{ij} + x_{it}'\beta + u_{it}$
- Number of coefficients: N + K
- Least squares dummy variable (LSDV) estimator

LSDV Estimator
- LSDV estimator: OLS estimation of the dummy variable version of the fixed effects model
  $y_{it} = \sum_j \alpha_j d_{ij} + x_{it}'\beta + u_{it}$
- NT observations for estimating N + K coefficients
- Numerically not attractive
- Estimates for $\alpha_i$ usually not of interest

Fixed Effects (or Within) Estimator
- Within transformation: transforms $y_{it}$ into time-demeaned $\ddot{y}_{it}$ by subtracting the average $\bar{y}_i = (\sum_t y_{it})/T$:
  $\ddot{y}_{it} = y_{it} - \bar{y}_i$
  analogously $\ddot{x}_{it}$ and $\ddot{u}_{it}$
- Model in time-demeaned variables
  $\ddot{y}_{it} = \ddot{x}_{it}'\beta + \ddot{u}_{it}$
- Time-demeaning differences away the time-constant factors $\alpha_i$; cf. the first-difference estimator
- Pooled OLS estimation of $\beta$ gives the fixed effects estimator $b_{FE}$, also called within estimator
- Uses time variation in y and x within each cross-sectional unit; explains deviations of $y_{it}$ from $\bar{y}_i$, not of $\bar{y}_i$ from $\bar{y}_j$
- Gretl: Model > Panel > Fixed or random effects ...

Properties of Fixed Effects Estimator
$b_{FE} = (\sum_i\sum_t \ddot{x}_{it}\ddot{x}_{it}')^{-1} \sum_i\sum_t \ddot{x}_{it}\ddot{y}_{it}$
- Unbiased if all $x_{it}$ are independent of all $u_{it}$
- Normally distributed if normality of $u_{it}$ is assumed
- Consistent (for N going to infinity) if $x_{it}$ are strictly exogenous, i.e., $E\{x_{it} u_{is}\} = 0$ for all s, t
- Asymptotically normally distributed
- Covariance matrix $V\{b_{FE}\} = \sigma_u^2 (\sum_i\sum_t \ddot{x}_{it}\ddot{x}_{it}')^{-1}$
- Estimated covariance matrix: substitution of $\sigma_u^2$ by $s_u^2 = (\sum_i\sum_t \tilde{u}_{it}\tilde{u}_{it})/[N(T-1)]$ with the residuals $\tilde{u}_{it} = \ddot{y}_{it} - \ddot{x}_{it}'b_{FE}$
- Attention!
  The standard OLS estimate of the covariance matrix underestimates the true values

Estimator for $\alpha_i$
- Time-constant factors $\alpha_i$, $i = 1, ..., N$
- Estimates based on the fixed effects estimator $b_{FE}$
  $a_i = \bar{y}_i - \bar{x}_i'b_{FE}$
  with averages over time $\bar{y}_i$ and $\bar{x}_i$ for the i-th unit
- Consistent (for T increasing to infinity) if $x_{it}$ are strictly exogenous
- Interesting aspects of the estimates $a_i$
  - Distribution of the $a_i$, $i = 1, ..., N$
  - Value of $a_i$ for a unit i of special interest

First-Difference Estimator
Elimination of time-constant factors $\alpha_i$ by differencing
$\Delta y_{it} = y_{it} - y_{i,t-1} = \Delta x_{it}'\beta + \Delta u_{it}$
$\Delta x_{it}$ and $\Delta u_{it}$ are defined analogously to $\Delta y_{it} = y_{it} - y_{i,t-1}$
First-difference estimator: OLS estimation
$b_{FD} = (\sum_i\sum_t \Delta x_{it}\Delta x_{it}')^{-1} \sum_i\sum_t \Delta x_{it}\Delta y_{it}$
Properties
- Consistent (for N going to infinity) under slightly weaker conditions than $b_{FE}$
- Slightly less efficient than $b_{FE}$ due to serial correlation of the $\Delta u_{it}$
- For T = 2, $b_{FD}$ and $b_{FE}$ coincide

Differences-in-Differences Estimator
Natural experiment or quasi-experiment:
- Exogenous event, e.g., a new law, changes in operating conditions
- Treatment group, control group
- Assignment to groups not at random (unlike in a true experiment)
- Data: before event, after event
Model for response $y_{it}$
$y_{it} = \delta r_{it} + \mu_t + \alpha_i + u_{it}$, $i = 1,...,N$, $t = 1$ (before), 2 (after event)
- Dummy $r_{it} = 1$ if the i-th unit receives treatment in period t, $r_{it} = 0$ otherwise
- $\delta$: treatment effect
- Fixed effects model (for differencing away time-constant factors):
  $\Delta y_{it} = y_{i2} - y_{i1} = \delta \Delta r_{it} + \mu_0 + \Delta u_{it}$
  with $\mu_0 = \mu_2 - \mu_1$

Differences-in-Differences Estimator, cont'd
Differences-in-differences (DD or DID or D-in-D) estimator of treatment effect $\delta$
$d_{DD} = \Delta\bar{y}_{treated} - \Delta\bar{y}_{untreated}$
$\Delta\bar{y}_{treated}$: average difference $y_{i2} - y_{i1}$ of treatment group units
$\Delta\bar{y}_{untreated}$: average difference $y_{i2} - y_{i1}$ of control group units
- Treatment effect $\delta$ measured as difference between changes of y with and without treatment
- $d_{DD}$ consistent if $E\{\Delta r_{it} \Delta u_{it}\} = 0$
- Allows correlation between time-constant factors $\alpha_i$ and $r_{it}$

Random Effects Model
Model:
$y_{it} = \beta_0 + x_{it}'\beta + \alpha_i + u_{it}$, $u_{it} \sim IID(0, \sigma_u^2)$
- Time-constant factors $\alpha_i$: stochastic variables with identical distribution for all units
  $\alpha_i \sim IID(0, \sigma_a^2)$
- Attention! More information about $\alpha_i$ as compared to the fixed effects model
- $\alpha_i + u_{it}$: error term with two components
  - Unit-specific component $\alpha_i$, time-constant
  - Remainder $u_{it}$, assumed to be uncorrelated over time
- $\alpha_i$, $u_{it}$: mutually independent, independent of $x_{js}$ for all j and s
- OLS estimators for $\beta_0$ and $\beta$ are unbiased, consistent, not efficient (see next slide)

GLS Estimator
$\alpha_i \iota_T + u_i$: T-vector of error terms for the i-th unit, T-vector $\iota_T = (1, ..., 1)'$
$\Omega = Var\{\alpha_i \iota_T + u_i\}$: covariance matrix of $\alpha_i \iota_T + u_i$
$\Omega = \sigma_a^2 \iota_T \iota_T' + \sigma_u^2 I_T$
Inverted covariance matrix
$\Omega^{-1} = \sigma_u^{-2}\{[I_T - (\iota_T\iota_T')/T] + \psi (\iota_T\iota_T')/T\}$
with $\psi = \sigma_u^2/(\sigma_u^2 + T\sigma_a^2)$
$(\iota_T\iota_T')/T$: transforms into averages
$I_T - (\iota_T\iota_T')/T$: transforms into deviations from average
GLS estimator
$b_{GLS} = [\sum_i\sum_t \ddot{x}_{it}\ddot{x}_{it}' + \psi T\sum_i(\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})']^{-1}[\sum_i\sum_t \ddot{x}_{it}\ddot{y}_{it} + \psi T\sum_i(\bar{x}_i - \bar{x})(\bar{y}_i - \bar{y})]$
with the average $\bar{y}$ over all i and t, analogously $\bar{x}$
- $\psi = 0$: $b_{GLS}$ coincides with $b_{FE}$; $b_{GLS}$ and $b_{FE}$ are equivalent for large T
- $\psi = 1$: $b_{GLS}$ coincides with the OLS estimators for $\beta_0$ and $\beta$

Between Estimator
Model for individual means $\bar{y}_i$ and $\bar{x}_i$:
$\bar{y}_i = \beta_0 + \bar{x}_i'\beta + \alpha_i + \bar{u}_i$, $i = 1, ..., N$
OLS estimator
$b_B = [\sum_i(\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})']^{-1}\sum_i(\bar{x}_i - \bar{x})(\bar{y}_i - \bar{y})$
is called the between estimator
- Consistent if $x_{it}$ are strictly exogenous and uncorrelated with $\alpha_i$
- GLS estimator can be written as
  $b_{GLS} = \Delta b_B + (I_K - \Delta) b_{FE}$
  $\Delta$: weighting matrix, proportional to the inverse of $Var\{b_B\}$
  - Matrix-weighted average of between estimator $b_B$ and within estimator $b_{FE}$
  - The more accurate $b_B$, the more weight $b_B$ has in $b_{GLS}$
  - $b_{GLS}$: optimal combination of $b_B$ and $b_{FE}$, more efficient than $b_B$ and $b_{FE}$

GLS Estimator: Properties
$b_{GLS} = [\sum_i\sum_t \ddot{x}_{it}\ddot{x}_{it}' + \psi T\sum_i(\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})']^{-1}[\sum_i\sum_t \ddot{x}_{it}\ddot{y}_{it} + \psi T\sum_i(\bar{x}_i - \bar{x})(\bar{y}_i - \bar{y})]$
- Unbiased if $x_{it}$ are independent of all $\alpha_i$ and $u_{it}$
- Consistent for N or T or both tending to infinity if
  - $E\{\ddot{x}_{it} u_{it}\} = 0$
  - $E\{\bar{x}_i u_{it}\} = 0$, $E\{\bar{x}_i \alpha_i\} = 0$
  - The conditions in the second line are also required for consistency of $b_B$
- More efficient than the between estimator $b_B$ and the within estimator $b_{FE}$; also more efficient than the OLS estimator

Random Effects Estimator
EGLS or Balestra-Nerlove estimator: calculation of $b_{GLS}$ from the model
$y_{it} - \vartheta\bar{y}_i = \beta_0(1 - \vartheta) + (x_{it} - \vartheta\bar{x}_i)'\beta + v_{it}$
with $\vartheta = 1 - \psi^{1/2}$, $v_{it} \sim IID(0, \sigma_v^2)$
quasi-demeaned $y_{it} - \vartheta\bar{y}_i$ and $x_{it} - \vartheta\bar{x}_i$
Two-step estimator:
1. Step 1: transformation parameter $\psi$ calculated from
   - within estimation: $s_u^2 = (\sum_i\sum_t \tilde{u}_{it}\tilde{u}_{it})/[N(T-1)]$
   - between estimation: $s_B^2 = (1/N)\sum_i (\bar{y}_i - b_{0B} - \bar{x}_i'b_B)^2 = s_a^2 + (1/T)s_u^2$
   - $s_a^2 = s_B^2 - (1/T)s_u^2$
2. Step 2:
   - Calculation of $1 - [s_u^2/(s_u^2 + T s_a^2)]^{1/2}$ for parameter $\vartheta$
   - Transformation of $y_{it}$ and $x_{it}$
   - OLS estimation gives the random effects estimator $b_{RE}$ for $\beta$

Random Effects Estimator: Properties
$b_{RE}$: EGLS estimator of $\beta$ from
$y_{it} - \vartheta\bar{y}_i = \beta_0(1 - \vartheta) + (x_{it} - \vartheta\bar{x}_i)'\beta + v_{it}$
with $\vartheta = 1 - \psi^{1/2}$, $\psi = \sigma_u^2/(\sigma_u^2 + T\sigma_a^2)$
- Asymptotically normally distributed under weak conditions
- Covariance matrix
  $Var\{b_{RE}\} = \sigma_u^2[\sum_i\sum_t \ddot{x}_{it}\ddot{x}_{it}' + \psi T\sum_i(\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})']^{-1}$
- More efficient than the within estimator $b_{FE}$ (if $\psi > 0$)
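
The two-step calculation of the random effects estimator can be sketched in a few lines of NumPy. This is a minimal sketch for a balanced panel, not Gretl's implementation: the function name `random_effects`, the array layout (`y` of shape N x T, `x` of shape N x T x K), the truncation of a negative $s_a^2$ at zero, and all simulation settings are assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def random_effects(y, x):
    """Two-step EGLS sketch for a balanced panel.
    y: array (N, T); x: array (N, T, K). Hypothetical helper, not a library API."""
    N, T, K = x.shape
    y_bar = y.mean(axis=1)                      # unit means of y, shape (N,)
    x_bar = x.mean(axis=1)                      # unit means of x, shape (N, K)

    # Step 1a: within estimation gives s_u^2
    y_w = (y - y_bar[:, None]).reshape(N * T)
    x_w = (x - x_bar[:, None, :]).reshape(N * T, K)
    res_w = y_w - x_w @ ols(x_w, y_w)
    s_u2 = res_w @ res_w / (N * (T - 1))

    # Step 1b: between estimation gives s_B^2 = s_a^2 + s_u^2 / T
    X_b = np.column_stack([np.ones(N), x_bar])
    res_b = y_bar - X_b @ ols(X_b, y_bar)
    s_B2 = res_b @ res_b / N
    s_a2 = max(s_B2 - s_u2 / T, 0.0)            # assumption: truncate at zero

    # Step 2: quasi-demeaning with theta = 1 - sqrt(psi), then OLS
    theta = 1.0 - np.sqrt(s_u2 / (s_u2 + T * s_a2))
    y_q = (y - theta * y_bar[:, None]).reshape(N * T)
    x_q = (x - theta * x_bar[:, None, :]).reshape(N * T, K)
    X_q = np.column_stack([np.full(N * T, 1.0 - theta), x_q])
    coef = ols(X_q, y_q)                        # coef[0]: beta_0, coef[1:]: beta
    return coef, theta, s_u2, s_a2

# Simulated data satisfying the random effects assumptions (illustrative values)
rng = np.random.default_rng(0)
N, T = 500, 8
alpha = rng.normal(size=N)                      # sigma_a^2 = 1
x = rng.normal(size=(N, T, 1))                  # independent of alpha and u
y = 1.0 + 0.5 * x[:, :, 0] + alpha[:, None] + rng.normal(size=(N, T))

coef, theta, s_u2, s_a2 = random_effects(y, x)
print("b_RE:", coef, "theta:", theta, "s_u2:", s_u2, "s_a2:", s_a2)
```

With $\sigma_u^2 = \sigma_a^2 = 1$ and T = 8 as assumed here, $\psi = 1/9$ and $\vartheta = 2/3$, so the estimated $\vartheta$ should land near that value. The intercept column is set to $1 - \vartheta$ so that its coefficient is $\beta_0$ itself.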

Summary of Estimators
- Between estimator
- Fixed effects (within) estimator
- Combined estimators
  - OLS estimator
  - Random effects (EGLS) estimator
- First-difference estimator

Estimator                    Consistent if
Between $b_B$                $x_{it}$ strictly exogenous, $x_{it}$ and $\alpha_i$ uncorrelated
Fixed effects $b_{FE}$       $x_{it}$ strictly exogenous
OLS $b$                      $x_{it}$ and $\alpha_i$ uncorrelated, $x_{it}$ and $u_{it}$ contemporaneously uncorrelated
Random effects $b_{RE}$      conditions for $b_B$ and $b_{FE}$ are met
First-difference $b_{FD}$    $E\{\Delta x_{it} \Delta u_{it}\} = 0$

Fixed Effects or Random Effects?
Random effects model
$E\{y_{it} | x_{it}\} = x_{it}'\beta$
- Large values of N; of interest: population characteristics ($\beta$), not characteristics of individual units ($\alpha_i$)
- More efficient estimation of $\beta$, given adequate specification of the time-constant model characteristics
Fixed effects model
$E\{y_{it} | x_{it}\} = x_{it}'\beta + \alpha_i$
- Of interest: besides population characteristics ($\beta$), also characteristics of individual units ($\alpha_i$), e.g., of countries or companies; rather small values of N
- Also for large values of N if $x_{it}$ and $\alpha_i$ are correlated: $b_{FE}$ remains consistent in that case

Diagnostic Tools
- Test for common intercept of all units
  - Applied to pooled OLS estimation: rejection indicates preference for fixed or random effects model
  - Applied to fixed effects estimation: non-rejection indicates preference for pooled OLS estimation
- Hausman test:
  - Null hypothesis that GLS estimates are consistent
  - Rejection indicates preference for fixed effects model
- Test for non-constant variance of the error terms, Breusch-Pagan test
  - Rejection indicates preference for fixed or random effects model
  - Non-rejection indicates preference for pooled OLS estimation

Hausman Test
Tests for correlation between $x_{it}$ and $\alpha_i$
H0: $x_{it}$ and $\alpha_i$ are uncorrelated
Test statistic:
$\xi_H = (b_{FE} - b_{RE})' [\tilde{V}\{b_{FE}\} - \tilde{V}\{b_{RE}\}]^{-1} (b_{FE} - b_{RE})$
with estimated covariance matrices $\tilde{V}\{b_{FE}\}$ and $\tilde{V}\{b_{RE}\}$
- $b_{RE}$: consistent if $x_{it}$ and $\alpha_i$ are uncorrelated
- $b_{FE}$: consistent also if $x_{it}$ and $\alpha_i$ are correlated
Under H0: plim$(b_{FE} - b_{RE}) = 0$
- $\xi_H$ is asymptotically chi-squared distributed with K d.f.
- K: dimension of $x_{it}$ and $\beta$
The Hausman test may also indicate other types of misspecification

Robust Inference
Consequences of heteroskedasticity and autocorrelation of the error term:
- Standard errors and related tests are incorrect
- Inefficiency of estimators
Robust covariance matrix for estimator b of $\beta$ from $y_{it} = x_{it}'\beta + \varepsilon_{it}$
$b = (\sum_i\sum_t x_{it}x_{it}')^{-1} \sum_i\sum_t x_{it}y_{it}$
- Adjustment of covariance matrix similar to Newey-West: assuming uncorrelated error terms for different units ($E\{\varepsilon_{it}\varepsilon_{js}\} = 0$ for all $i \neq j$)
  $V\{b\} = (\sum_i\sum_t x_{it}x_{it}')^{-1} \sum_i\sum_t\sum_s e_{it}e_{is}\,x_{it}x_{is}' \,(\sum_i\sum_t x_{it}x_{it}')^{-1}$
  $e_{it}$: OLS residuals
- Allows for heteroskedasticity and autocorrelation within units
- Called panel-robust estimate of the covariance matrix
Analogous variants of the Newey-West estimator for robust covariance matrices of random effects and fixed effects estimators

Testing for Autocorrelation and Heteroskedasticity
Tests for heteroskedasticity and autocorrelation in random effects model error terms
- Computationally cumbersome
Tests based on fixed effects model residuals
- Easier case
- Applicable for testing in both the fixed and the random effects case

Test for Autocorrelation
Durbin-Watson test for autocorrelation in the fixed effects model
- Error term $u_{it} = \rho u_{i,t-1} + v_{it}$
  - Same autocorrelation coefficient $\rho$ for all units
  - $v_{it}$ iid across time and units
- Test of H0: $\rho = 0$ against $\rho > 0$
- Adaptation of the Durbin-Watson statistic
- Tables with critical limits $d_U$ and $d_L$ for K, T, and N; e.g., Verbeek's Table 10.1

Test for Heteroskedasticity
Breusch-Pagan test for heteroskedasticity of fixed effects model residuals
- $V\{u_{it}\} = \sigma^2 h(z_{it}'\gamma)$; unknown function h(.) with h(0) = 1, J-vector z
- H0: $\gamma = 0$, homoskedastic $u_{it}$
- Auxiliary regression of squared residuals on intercept and regressors z
- Test statistic: N(T-1) times $R^2$ of the auxiliary regression
- Chi-squared distribution with J d.f. under H0

Goodness-of-Fit
Goodness-of-fit measures for panel data models: different from OLS-estimated regression models
- Focus may be on within or between variation in the data
- The usual $R^2$ measure relates to OLS-estimated models
Definition of goodness-of-fit measures: squared correlation coefficients between actual and fitted values
- $R^2_{within}$: squared correlation between within-transformed actual and fitted $y_{it}$; maximized by the within estimator
- $R^2_{between}$: based upon individual averages of actual and fitted $y_{it}$; maximized by the between estimator
- $R^2_{overall}$: squared correlation between actual and fitted $y_{it}$; maximized by OLS
Corresponds to the decomposition
$[1/(TN)]\sum_i\sum_t (y_{it} - \bar{y})^2 = [1/(TN)]\sum_i\sum_t (y_{it} - \bar{y}_i)^2 + [1/N]\sum_i (\bar{y}_i - \bar{y})^2$

Goodness-of-Fit, cont'd
Fixed effects estimator $b_{FE}$
- Explains the within variation
- Maximizes $R^2_{within}$
  $R^2_{within}(b_{FE}) = corr^2\{\hat{y}_{it}^{FE} - \bar{\hat{y}}_i^{FE}, y_{it} - \bar{y}_i\}$
Between estimator $b_B$
- Explains the between variation
- Maximizes $R^2_{between}$
  $R^2_{between}(b_B) = corr^2\{\hat{y}_i^{B}, \bar{y}_i\}$

Panel Data and Gretl
Estimation of panel models
Pooled OLS
- Model > Ordinary Least Squares ...
- Special diagnostics on the output window: Tests > Panel diagnostics
Fixed and random effects models
- Model > Panel > Fixed or random effects...
- Provide diagnostic tests
  - Fixed effects model: test for common intercept of all units
  - Random effects model: Breusch-Pagan test, Hausman test
Further estimation procedures
- Between estimator
- Weighted least squares
- Instrumental variable panel procedure

Your Homework
1. Use Verbeek's data set MALES, which contains panel data for 545 full-time working males over the period 1980-1987. Estimate a wage equation which explains the individual log wages by the variables years of schooling, years of experience and its square, and dummy variables for union membership, being married, black, Hispanic, and working in the public sector. Use (i) pooled OLS, (ii) the between and (iii) the within estimator, and (iv) the random effects estimator.
2.
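
As a warm-up for comparing estimators, the following minimal sketch computes the pooled OLS, between, and within estimates of a single slope on a simulated balanced panel, and checks that the LSDV regression reproduces the within estimate. The data are simulated under assumed parameter values with $x_{it}$ correlated with $\alpha_i$; this is not the MALES data set, and the helper `slope` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

# Simulated balanced panel (all values are illustrative assumptions)
rng = np.random.default_rng(1)
N, T, beta = 300, 8, 0.5
alpha = rng.normal(size=N)                              # unit-specific effects
x = 0.8 * alpha[:, None] + rng.normal(size=(N, T))      # correlated with alpha_i
y = 1.0 + beta * x + alpha[:, None] + rng.normal(size=(N, T))

def slope(xv, yv):
    """Slope of an OLS regression of yv on a constant and xv."""
    X = np.column_stack([np.ones(xv.size), xv])
    return np.linalg.lstsq(X, yv, rcond=None)[0][1]

# (i) pooled OLS on all N*T observations
b_pooled = slope(x.ravel(), y.ravel())

# (ii) between estimator: OLS on the unit means
b_between = slope(x.mean(axis=1), y.mean(axis=1))

# (iii) within estimator: OLS on time-demeaned data
x_w = x - x.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)
b_within = (x_w * y_w).sum() / (x_w ** 2).sum()

# LSDV: one dummy per unit (unit-major row order matches x.ravel())
D = np.kron(np.eye(N), np.ones((T, 1)))                 # shape (N*T, N)
X_lsdv = np.column_stack([D, x.ravel()])
b_lsdv = np.linalg.lstsq(X_lsdv, y.ravel(), rcond=None)[0][-1]

print("pooled: ", b_pooled)
print("between:", b_between)
print("within: ", b_within)
print("LSDV:   ", b_lsdv)
```

Because $x_{it}$ and $\alpha_i$ are correlated in this design, only the within (and LSDV) estimate stays close to the assumed slope 0.5; the pooled and between estimates are pushed upwards, the between estimate most strongly since it uses only the variation in unit means.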