Econometrics - Lecture 4
Heteroskedasticity and Autocorrelation
Hackl, Econometrics, Lecture 4, Dec 9, 2011

Contents
- Violations of $V\{\varepsilon\} = \sigma^2 I_N$
- Heteroskedasticity
- GLS Estimation
- Autocorrelation

Gauss-Markov Assumptions
Observation $y_i$ is a linear function
  $y_i = x_i'\beta + \varepsilon_i$
of the observations $x_{ik}$, $k = 1, \dots, K$, of the regressor variables and the error term $\varepsilon_i$, for $i = 1, \dots, N$; $x_i' = (x_{i1}, \dots, x_{iK})$; $X = (x_{ik})$
- A1: $E\{\varepsilon_i\} = 0$ for all $i$
- A2: all $\varepsilon_i$ are independent of all $x_i$ (exogenous $x_i$)
- A3: $V\{\varepsilon_i\} = \sigma^2$ for all $i$ (homoskedasticity)
- A4: $Cov\{\varepsilon_i, \varepsilon_j\} = 0$ for all $i$ and $j$ with $i \neq j$ (no autocorrelation)
In matrix notation: $E\{\varepsilon\} = 0$, $V\{\varepsilon\} = \sigma^2 I_N$

OLS Estimator: Properties
Under assumptions (A1) and (A2):
1. The OLS estimator $b$ is unbiased: $E\{b\} = \beta$
Under assumptions (A1), (A2), (A3) and (A4):
2. The variance of the OLS estimator is given by
  $V\{b\} = \sigma^2 (\sum_i x_i x_i')^{-1} = \sigma^2 (X'X)^{-1}$
3. The sampling variance $s^2$ of the residuals $e_i$,
  $s^2 = (N-K)^{-1} \sum_i e_i^2$,
 is unbiased for the error term variance $\sigma^2$
4. The OLS estimator $b$ is BLUE (best linear unbiased estimator)

Violations of $V\{\varepsilon\} = \sigma^2 I_N$
- Implication of the Gauss-Markov assumptions for $\varepsilon$: $V\{\varepsilon\} = \sigma^2 I_N$
- Violations:
  - Heteroskedasticity: $V\{\varepsilon\} = \mathrm{diag}(\sigma_1^2, \dots, \sigma_N^2)$ or $V\{\varepsilon\} = \sigma^2 \Psi = \sigma^2\, \mathrm{diag}(h_1^2, \dots, h_N^2)$
  - Autocorrelation: $Cov\{\varepsilon_i, \varepsilon_j\} \neq 0$ for at least one pair $i \neq j$, or $V\{\varepsilon\} = \sigma^2 \Psi$ with non-diagonal elements different from zero

Example: Household Income and Expenditures
- 70 households (HH): monthly HH-income and expenditures for durable goods
[Figure: scatter plot of expenditures for durable goods against monthly HH-income]

Household Income and Expenditures, cont'd
- Residuals $e = y - \hat{y}$ from $\hat{Y} = 44.18 + 0.17 X$
  - X: monthly HH-income
  - Y: expenditures for durable goods
- The larger the income, the more scattered are the residuals
[Figure: residuals against monthly HH-income]

Typical Situations for Heteroskedasticity
Heteroskedasticity is typically observed in
- data from cross-sectional surveys, e.g., of households or regions
- data with a variance that depends on one or several explanatory variables, e.g., firm size
- data from financial markets, e.g., exchange rates, stock returns

Example: Household Expenditures
- With growing income, the variation of expenditures increases; from Verbeek, Fig. 4.1
[Figure: Verbeek, Fig. 4.1]

Autocorrelation of Economic Time-series
- Consumption in the current period is similar to that of the preceding period; current consumption "depends" on the consumption of the preceding period
- Consumption, production, investments, etc.: successive observations of economic variables are expected to be positively correlated
- Seasonal adjustment: the application of smoothing and filtering algorithms induces correlation

Example: Imports
- Scatter diagram of imports lagged by one period [MTR(-1)] against current imports [MTR]
- Correlation coefficient between MTR and MTR(-1): 0.9994
[Figure: scatter diagram of MTR(-1) against MTR]

Example: Import Function
- MTR: imports; FDD: demand (from the AWM database)
- Import function: MTR = -227320 + 0.36 FDD, $R^2 = 0.977$, $t_{FDD} = 74.8$

Import Function, cont'd
- MTR: imports; FDD: demand (from the AWM database)
- RESID: $e_t$ = MTR - (-227320 + 0.36 FDD)

Import Function, cont'd
- Scatter diagram of residuals lagged by one period [Resid(-1)] against current residuals [Resid]
- Serial correlation!
[Figure: scatter diagram of Resid(-1) against Resid]

Typical Situations for Autocorrelation
Autocorrelation is typically observed if
- a relevant regressor with trend or seasonal pattern is not included in the model: misspecified model
- the functional form of a regressor is incorrectly specified
- the dependent variable is correlated in a way that is not appropriately represented in the systematic part of the model
- Warning! Omission of a relevant regressor with trend implies autocorrelation of the error terms; in econometric analyses autocorrelation of the error terms is always possible!
- Autocorrelation of the error terms indicates deficiencies of the model specification
- Tests for autocorrelation are the most frequently used tool for diagnostic checking of the model specification

Import Functions
- Regression of imports (MTR) on demand (FDD)
  MTR = $-2.27 \times 10^9$ + 0.357 FDD, $t_{FDD} = 74.9$, $R^2 = 0.977$
  - Autocorrelation (order 1) of residuals: $Corr(e_t, e_{t-1}) = 0.993$
- Import function with trend (T)
  MTR = $-4.45 \times 10^9$ + 0.653 FDD $- 0.030 \times 10^9$ T
  $t_{FDD} = 45.8$, $t_T = -21.0$, $R^2 = 0.995$
  - Multicollinearity? Corr(FDD, T) = 0.987!
- Import function with lagged imports as regressor
  MTR = $-0.124 \times 10^9$ + 0.020 FDD + 0.956 MTR$_{-1}$
  $t_{FDD} = 2.89$, $t_{MTR(-1)} = 50.1$, $R^2 = 0.999$

Consequences of $V\{\varepsilon\} \neq \sigma^2 I_N$
- OLS estimators $b$ for $\beta$
  - are unbiased
  - are consistent
  - have the covariance matrix $V\{b\} = \sigma^2 (X'X)^{-1} X'\Psi X (X'X)^{-1}$
  - are not efficient estimators, not BLUE
  - follow, under general conditions, asymptotically the normal distribution
- The estimator $s^2 = e'e/(N-K)$ for $\sigma^2$ is biased

Consequences of $V\{\varepsilon\} \neq \sigma^2 I_N$ for Applications
- OLS estimators $b$ for $\beta$ are still unbiased
- Routinely computed standard errors are biased; the bias can be positive or negative
- t- and F-tests may be misleading
- Remedies
  - Alternative estimators
  - Corrected standard errors
  - Modification of the model
- Tests for the identification of heteroskedasticity and autocorrelation are important tools

Inference under Heteroskedasticity
- Covariance matrix of $b$: $V\{b\} = \sigma^2 (X'X)^{-1} X'\Psi X (X'X)^{-1}$
- Use of $s^2 (X'X)^{-1}$ (the standard output of econometric software) instead of $V\{b\}$ for inference on $\beta$ may be misleading
- Remedies
  - Use of correct variances and standard errors
  - Transformation of the model so that the error terms are homoskedastic

The Correct Variances
- $V\{\varepsilon_i\} = \sigma_i^2 = \sigma^2 h_i^2$: each observation has its own unknown parameter $h_i$
- N observations for estimating N unknown parameters?
- To estimate $\sigma_i^2$ and $V\{b\}$:
  - Known form of the heteroskedasticity, specific correction
    - E.g., $h_i^2 = z_i'\alpha$ for some variables $z_i$
    - Requires estimation of $\alpha$
  - White's heteroskedasticity-consistent covariance matrix estimator (HCCME), which replaces $\sigma^2 h_i^2$ by the squared OLS residuals $e_i^2$:
    $\tilde{V}\{b\} = (X'X)^{-1} (\sum_i e_i^2 x_i x_i') (X'X)^{-1}$
    - Denoted as HC0
    - Inference based on HC0: heteroskedasticity-robust inference

White's Standard Errors
- White's standard errors for $b$
  - Square roots of the diagonal elements of the HCCME
  - Tend to underestimate the true standard errors in small samples
  - Various refinements, e.g., HC1 = HC0 $\cdot$ [N/(N-K)]
- In GRETL: HC0 is the default HCCME; HC1 and other refinements are optionally available

An Alternative Estimator for $\beta$
- Idea of the estimator
  - Transform the model so that it satisfies the Gauss-Markov assumptions
  - Apply OLS to the transformed model
  - Should result in a BLUE
- The transformation often depends upon unknown parameters that characterize the heteroskedasticity: two-step procedure
  1. Estimate the parameters that characterize the heteroskedasticity and transform the model
  2. Estimate the transformed model
- The procedure results in an approximately BLUE estimator

An Example
- Model: $y_i = x_i'\beta + \varepsilon_i$ with $V\{\varepsilon_i\} = \sigma_i^2 = \sigma^2 h_i^2$
- Division by $h_i$ results in
  $y_i/h_i = (x_i/h_i)'\beta + \varepsilon_i/h_i$
  with a homoskedastic error term: $V\{\varepsilon_i/h_i\} = \sigma_i^2/h_i^2 = \sigma^2$
- OLS applied to the transformed model gives
  $\hat\beta = (\sum_i h_i^{-2} x_i x_i')^{-1} \sum_i h_i^{-2} x_i y_i$
- It is called a generalized least squares (GLS) or weighted least squares (WLS) estimator

Weighted Least Squares Estimator
- A GLS or WLS estimator is a least squares estimator where each observation is weighted by a positive factor $w_i > 0$:
  $\hat\beta_w = (\sum_i w_i x_i x_i')^{-1} \sum_i w_i x_i y_i$
- Weights proportional to the inverse of the error term variance: observations with a higher error term variance have a lower weight; they provide less accurate information on $\beta$
- Needs knowledge of the $h_i$
  - Is seldom available
  - Is mostly provided by estimates of $h_i$ based on assumptions on the form of $h_i$
  - E.g., $h_i^2 = z_i'\alpha$ for some variables $z_i$
- Analogous with general weights $w_i$

Example: Labor Demand
- Verbeek's data set "labour2": sample of 569 Belgian companies (data from 1996)
- Variables
  - labour: total employment (number of employees)
  - capital: total fixed assets
  - wage: total wage costs per employee (in 1000 EUR)
  - output: value added (in million EUR)
- Labour demand function
  labour = $\beta_1$ + $\beta_2$*wage + $\beta_3$*output + $\beta_4$*capital

Labor Demand Function
- For Belgian companies, 1996; Verbeek
[Table: OLS estimation output; not preserved]

Labor Demand Function, cont'd
- Can the error terms be assumed to be homoskedastic?
- Their variance may depend on the company size, measured by, e.g., output or capital
- A regression of the squared residuals on appropriate regressors will indicate heteroskedasticity

Labor Demand Function, cont'd
- Auxiliary regression of squared residuals, Verbeek
[Table: auxiliary regression output; not preserved]
- Indicates dependence of the error term variance on output and capital, not on wage

Labor Demand Function, cont'd
- Estimated function
  labour = $\beta_1$ + $\beta_2$*wage + $\beta_3$*output + $\beta_4$*capital
- OLS estimates with routinely computed standard errors (s.e.) and with White standard errors (White s.e.), and GLS estimates with $w_i = 1/e_i$

                  b1        b2       b3       b4
  OLS coeff     287.72    -6.742   15.400   -4.590
  s.e.           19.642    0.501    0.356    0.269
  White s.e.     64.877    1.852    2.482    1.713
  GLS coeff     282.06    -6.609   15.235   -4.197
  s.e.            1.808    0.042    0.094    0.141

- The White standard errors exceed the routinely computed OLS standard errors by factors of 3.7 (wage), 6.4 (capital), and 7.0 (output)

Labor Demand Function, cont'd
With White standard errors: output from GRETL

  Dependent variable: LABOR
  Heteroskedastic-robust standard errors, variant HC0

              coefficient   std. error   t-ratio   p-value
  -------------------------------------------------------------
  const       287,719       64,8770       4,435    1,11e-05  ***
  WAGE         -6,7419       1,8516      -3,641    0,0003    ***
  CAPITAL      -4,59049      1,7133      -2,679    0,0076    ***
  OUTPUT       15,4005       2,4820       6,205    1,06e-09  ***

  Mean dependent var   201,024911   S.D. dependent var   611,9959
  Sum squared resid    13795027     S.E. of regression   156,2561
  R-squared            0,935155     Adjusted R-squared   0,934811
  F(2, 129)            225,5597     P-value (F)          3,49e-96
  Log-likelihood       455,9302     Akaike criterion     7367,341
  Schwarz criterion    -3679,670    Hannan-Quinn         7374,121

Tests against Heteroskedasticity
- Due to the unbiasedness of $b$, the residuals are expected to indicate heteroskedasticity
- Graphical displays of residuals may give useful hints
- Residual-based tests:
  - Breusch-Pagan test
  - Koenker test
  - Goldfeld-Quandt test
  - White test

Breusch-Pagan Test
- For testing whether the error term variance is a function of $Z_2, \dots, Z_p$
- Model for heteroskedasticity
  $\sigma_i^2/\sigma^2 = h(z_i'\alpha)$
  with a function $h$ with $h(0) = 1$, p-vectors $z_i$ and $\alpha$; $z_i$ contains an intercept and the p-1 variables $Z_2, \dots, Z_p$
- Null hypothesis
  $H_0: \alpha = 0$
  implies $\sigma_i^2 = \sigma^2$ for all $i$, i.e., homoskedasticity
- Auxiliary regression of the standardized squared OLS residuals $g_i = e_i^2/s^2 - 1$, with $s^2 = e'e/N$, on $z_i$ (and squares of $z_i$)
- Test statistic: BP = ESS/2 with the explained sum of squares ESS = $N \cdot V(\hat{g})$ of the auxiliary regression; $\hat{g}$ are the fitted values for $g$. Under $H_0$, BP follows approximately the Chi-squared distribution with p-1 d.f. (the number of variables $Z_2, \dots, Z_p$)
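The auxiliary-regression idea behind the Breusch-Pagan test can be sketched in a few lines. The following is a minimal illustration on synthetic data, not the lecture's GRETL procedure: the helper names `ols` and `breusch_pagan` and the simulated data are assumptions made for this example. It returns two commonly used forms of the statistic, the classical ESS/2 and the Koenker form N*R^2 of the auxiliary regression.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients, fitted values and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    return beta, fitted, y - fitted

def breusch_pagan(X, y, Z):
    """Z must contain a column of ones plus the variables Z2, ..., Zp."""
    n = len(y)
    _, _, e = ols(X, y)
    s2 = e @ e / n                        # variance estimate e'e/N
    g = e**2 / s2 - 1.0                   # standardized squared residuals
    _, g_hat, _ = ols(Z, g)               # auxiliary regression of g on z
    ess = np.sum((g_hat - g_hat.mean())**2)
    tss = np.sum((g - g.mean())**2)
    return 0.5 * ess, n * ess / tss       # classical BP, Koenker N*R^2

# Synthetic data (illustrative only): error s.d. proportional to x
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1.0, 10.0, n)
X = np.column_stack([np.ones(n), x])
y_het = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, n) * x   # heteroskedastic
y_hom = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, n)       # homoskedastic

bp_het, koenker_het = breusch_pagan(X, y_het, X)
bp_hom, koenker_hom = breusch_pagan(X, y_hom, X)
print(bp_het, koenker_het, bp_hom, koenker_hom)
```

Both statistics would be compared with a Chi-squared critical value (here with 1 d.f., since Z contains a single variable besides the intercept).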

Breusch-Pagan Test, cont'd
- Typical functions $h$ for $h(z_i'\alpha)$
  - Linear regression: $h(z_i'\alpha) = z_i'\alpha$
  - Exponential function: $h(z_i'\alpha) = \exp\{z_i'\alpha\}$
    - Auxiliary regression of $\log(e_i^2)$ upon $z_i$
    - "Multiplicative heteroskedasticity"
    - Variances are non-negative
- Koenker test: variant of the BP test which is robust against non-normality of the error terms
- GRETL: the output window of the OLS estimation allows the execution of the Breusch-Pagan test with $h(z_i'\alpha) = z_i'\alpha$
  - OLS output => Tests => Heteroskedasticity => Breusch-Pagan
  - Koenker test: OLS output => Tests => Heteroskedasticity => Koenker

Labor Demand Function, cont'd
- Auxiliary regression of squared residuals, Verbeek
[Table: auxiliary regression output; not preserved]
- $NR^2 = 331.04$, p-value = 2.17E-70; reject the null hypothesis of homoskedasticity

Goldfeld-Quandt Test
- For testing whether the error term variance has values $\sigma_A^2$ and $\sigma_B^2$ for observations from regime A and B, respectively, with $\sigma_A^2 \neq \sigma_B^2$; regimes can be urban vs rural area, economic prosperity vs stagnation, etc.
- Example (in matrix notation):
  - $y_A = X_A\beta_A + \varepsilon_A$, $V\{\varepsilon_A\} = \sigma_A^2 I_{N_A}$ (regime A)
  - $y_B = X_B\beta_B + \varepsilon_B$, $V\{\varepsilon_B\} = \sigma_B^2 I_{N_B}$ (regime B)
- Null hypothesis: $\sigma_A^2 = \sigma_B^2$
- Test statistic:
  $F = \frac{S_A/(N_A-K)}{S_B/(N_B-K)}$
  with $S_i$: sum of squared residuals for the i-th regime; under $H_0$, $F$ follows exactly or approximately the F-distribution with $N_A-K$ and $N_B-K$ d.f.
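As an illustration of the Goldfeld-Quandt idea, the sketch below fits the model separately to two regimes and forms the ratio of the two residual variance estimates. It assumes the usual variance-ratio form of the statistic with $N_A-K$ and $N_B-K$ degrees of freedom; the function names and the synthetic two-regime data are hypothetical.

```python
import numpy as np

def rss(X, y):
    """Sum of squared OLS residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    return float(e @ e)

def goldfeld_quandt(XA, yA, XB, yB):
    """Variance-ratio F statistic with its degrees of freedom."""
    k = XA.shape[1]
    dfA, dfB = len(yA) - k, len(yB) - k
    F = (rss(XA, yA) / dfA) / (rss(XB, yB) / dfB)
    return F, dfA, dfB

# Synthetic regimes (illustrative only): error s.d. 3 in A, 1 in B
rng = np.random.default_rng(1)
nA, nB = 120, 100
xA, xB = rng.normal(size=nA), rng.normal(size=nB)
XA = np.column_stack([np.ones(nA), xA])
XB = np.column_stack([np.ones(nB), xB])
yA = 1.0 + 0.5 * xA + rng.normal(0.0, 3.0, nA)
yB = 1.0 + 0.5 * xB + rng.normal(0.0, 1.0, nB)

F, dfA, dfB = goldfeld_quandt(XA, yA, XB, yB)
print(F, dfA, dfB)
```

A value of F far from 1 speaks against equal variances; the decision would be based on the F(dfA, dfB) distribution.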

Goldfeld-Quandt Test, cont'd
Test procedure in three steps:
1. Sort the observations with respect to the regimes
2. Fit the model separately to the $N_A$ and $N_B$ observations; compute the sums of squared residuals $S_A$ and $S_B$
3. Calculate the test statistic $F$

White Test
- For testing whether the error term variance is a function of the model regressors, their squares and their cross-products
- Auxiliary regression of the squared OLS residuals upon the $x_i$'s, the squares of the $x_i$'s and the cross-products
- Test statistic: $NR^2$ with $R^2$ of the auxiliary regression; follows the Chi-squared distribution with the number of coefficients in the auxiliary regression (excluding the intercept) as d.f.
- The number of coefficients in the auxiliary regression may become large relative to N, resulting in low power of the White test

Labor Demand Function, cont'd
White's test for heteroskedasticity
OLS, using observations 1-569
Dependent variable: uhat^2

                coefficient   std. error   t-ratio    p-value
  --------------------------------------------------------------
  const          -260,910     18478,5       -0,01412  0,9887
  WAGE            554,352       833,028      0,6655   0,5060
  CAPITAL        2810,43        663,073      4,238    2,63e-05   ***
  OUTPUT        -2573,29        512,179     -5,024    6,81e-07   ***
  sq_WAGE         -10,0719        9,29022   -1,084    0,2788
  X2_X3           -48,2457       14,0199    -3,441    0,0006     ***
  X2_X4            58,5385        8,11748    7,211    1,81e-012  ***
  sq_CAPITAL       14,4176        2,01005    7,173    2,34e-012  ***
  X3_X4           -40,0294        3,74634  -10,68     2,24e-024  ***
  sq_OUTPUT        27,5945        1,83633   15,03     4,09e-043  ***

  Unadjusted R-squared = 0,818136

Test statistic: TR^2 = 465,519295,
with p-value = P(Chi-square(9) > 465,519295) = 0,000000

Generalized Least Squares Estimator
- A GLS or WLS estimator is a least squares estimator where each observation is weighted by a positive factor $w_i > 0$
- Example: $y_i = x_i'\beta + \varepsilon_i$ with $V\{\varepsilon_i\} = \sigma_i^2 = \sigma^2 h_i^2$
  - Division by $h_i$ results in a model with homoskedastic error terms: $V\{\varepsilon_i/h_i\} = \sigma_i^2/h_i^2 = \sigma^2$
  - OLS applied to the transformed model results in the weighted least squares (GLS) estimator with $w_i = h_i^{-2}$:
    $\hat\beta = (\sum_i h_i^{-2} x_i x_i')^{-1} \sum_i h_i^{-2} x_i y_i$
- The concept of transforming the model so that the Gauss-Markov assumptions are fulfilled is also used in more general situations, e.g., for autocorrelated error terms

Properties of GLS Estimators
- The GLS estimator
  $\hat\beta = (\sum_i h_i^{-2} x_i x_i')^{-1} \sum_i h_i^{-2} x_i y_i$
  is a least squares estimator; the standard properties of the OLS estimator apply
- The covariance matrix of the GLS estimator is
  $V\{\hat\beta\} = \sigma^2 (\sum_i h_i^{-2} x_i x_i')^{-1}$
- Unbiased estimator of the error term variance
  $\hat\sigma^2 = (N-K)^{-1} \sum_i h_i^{-2} (y_i - x_i'\hat\beta)^2$
- Under the assumption of normality of the errors, t- and F-tests can be used; for large N, these properties apply approximately without the normality assumption

Feasible GLS Estimator
- Is a GLS estimator with estimated weights $w_i$
- Substitution of the weights $w_i = h_i^{-2}$ by estimates $\hat{h}_i^{-2}$:
  $\hat\beta^* = (\sum_i \hat{h}_i^{-2} x_i x_i')^{-1} \sum_i \hat{h}_i^{-2} x_i y_i$
- Feasible (or estimated) GLS or FGLS or EGLS estimator
- For consistent estimates $\hat{h}_i$, the FGLS and GLS estimators are asymptotically equivalent
- For small values of N, FGLS estimators are in general not BLUE
- For consistently estimated $\hat{h}_i$, the FGLS estimator is consistent and asymptotically efficient with covariance matrix (estimate $\hat\sigma^2$ for $\sigma^2$: based on FGLS residuals)
  $\hat{V}\{\hat\beta^*\} = \hat\sigma^2 (\sum_i \hat{h}_i^{-2} x_i x_i')^{-1}$
- Warning: the transformed model is uncentered

Multiplicative Heteroskedasticity
- Assume $V\{\varepsilon_i\} = \sigma_i^2 = \sigma^2 h_i^2 = \sigma^2 \exp\{z_i'\alpha\}$
- The auxiliary regression
  $\log e_i^2 = \log \sigma^2 + z_i'\alpha + v_i$ with $v_i = \log(e_i^2/\sigma_i^2)$
  provides a consistent estimator $a$ for $\alpha$
- Transform the model $y_i = x_i'\beta + \varepsilon_i$ with $V\{\varepsilon_i\} = \sigma_i^2 = \sigma^2 h_i^2$ by dividing through by $\hat{h}_i$, obtained from $\hat{h}_i^2 = \exp\{z_i'a\}$
- The error term in this model is (approximately) homoskedastic
- Applying OLS to the transformed model gives the FGLS estimator for $\beta$

FGLS Estimation
In the following steps:
1. Calculate the OLS estimates $b$ for $\beta$
2. Compute the OLS residuals $e_i = y_i - x_i'b$
3. Regress $\log(e_i^2)$ on $z_i$ and a constant, obtaining estimates $a$ for $\alpha$:
  $\log e_i^2 = \log \sigma^2 + z_i'\alpha + v_i$
4. Compute $\hat{h}_i^2 = \exp\{z_i'a\}$, transform all variables and estimate the transformed model to obtain the FGLS estimators:
  $y_i/\hat{h}_i = (x_i/\hat{h}_i)'\beta + \varepsilon_i/\hat{h}_i$
5. The consistent estimate $s^2$ for $\sigma^2$, based on the FGLS residuals, and the consistently estimated covariance matrix
  $\hat{V}\{\hat\beta^*\} = s^2 (\sum_i \hat{h}_i^{-2} x_i x_i')^{-1}$
  are part of the standard output when regressing the transformed model
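The FGLS steps for multiplicative heteroskedasticity can be sketched as follows. This is a minimal illustration under stated assumptions, not the lecture's GRETL workflow: the function name `fgls_multiplicative`, the choice z = x, and the synthetic data with true parameter 0.8 in the variance function are all hypothetical.

```python
import numpy as np

def ols(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def fgls_multiplicative(X, y, Z):
    """FGLS assuming V{eps_i} = sigma^2 * exp(z_i' alpha); Z has no constant."""
    n, k = X.shape
    b = ols(X, y)                                   # step 1: OLS estimates
    e = y - X @ b                                   # step 2: OLS residuals
    Zc = np.column_stack([np.ones(n), Z])
    a = ols(Zc, np.log(e**2))                       # step 3: log(e^2) on const, z
    alpha_hat = a[1:]                               # the constant estimates log sigma^2 (with bias)
    h = np.sqrt(np.exp(Z @ alpha_hat))              # step 4: h_i^2 = exp(z_i' a)
    Xt, yt = X / h[:, None], y / h                  # transformed variables
    b_fgls = ols(Xt, yt)
    res = yt - Xt @ b_fgls
    s2 = res @ res / (n - k)                        # step 5: estimate of sigma^2
    cov = s2 * np.linalg.inv(Xt.T @ Xt)             # estimated covariance matrix
    return b_fgls, np.sqrt(np.diag(cov)), alpha_hat

# Synthetic data (illustrative only): variance grows exponentially in x
rng = np.random.default_rng(2)
n = 2000
x = rng.uniform(1.0, 5.0, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + np.exp(0.4 * x) * rng.normal(size=n)   # V = exp(0.8 x)

b_fgls, se_fgls, alpha_hat = fgls_multiplicative(X, y, x[:, None])
print(b_fgls, se_fgls, alpha_hat)
```

Note that the constant of the auxiliary regression is not used for the weights: a common scale factor in the $\hat{h}_i$ leaves the FGLS coefficients unchanged.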

Labor Demand Function
- For Belgian companies, 1996; Verbeek
[Table: OLS estimation output for the model in logarithms; not preserved]
- The log-transformation is expected to reduce heteroskedasticity

Labor Demand Function, cont'd
- For Belgian companies, 1996; Verbeek
[Table: auxiliary regression output; not preserved]
- Breusch-Pagan test: $NR^2 = 66.23$, p-value: 1,42E-13

Labor Demand Function, cont'd
- For Belgian companies, 1996; Verbeek
- Weights estimated assuming multiplicative heteroskedasticity
[Table: FGLS estimation output; not preserved]

Labor Demand Function, cont'd
- Estimated function
  log(labour) = $\beta_1$ + $\beta_2$*log(wage) + $\beta_3$*log(output) + $\beta_4$*log(capital)
- The table shows OLS estimates with routinely computed standard errors (s.e.) and with White standard errors (White s.e.), as well as FGLS estimates and standard errors

                  b1       b2       b3       b4
  OLS coeff      6.177   -0.928    0.990   -0.0037
  s.e.           0.246    0.071    0.026    0.0188
  White s.e.     0.293    0.086    0.047    0.0377
  FGLS coeff     5.895   -0.856    1.035   -0.0569
  s.e.           0.248    0.072    0.027    0.0216

Labor Demand Function, cont'd
Some comments:
- Reduction of standard errors in FGLS estimation as compared with heteroskedasticity-robust estimation: efficiency gains
- Comparison with the OLS standard errors is not appropriate
- FGLS estimates differ slightly from OLS estimates; the effect of capital is indicated to be relevant (p-value: 0.0086)
- $R^2$ of the FGLS estimation is misleading
  - The model is uncentered, no intercept
  - Comparison with that of the OLS estimation is not appropriate; the explained variables differ

Autocorrelation
- Typical for time series data such as consumption, production, investments, etc., and for models for time series data
- Autocorrelation of error terms is typically observed if
  - a relevant regressor with trend or seasonal pattern is not included in the model: misspecified model
  - the functional form of a regressor is incorrectly specified
  - the dependent variable is correlated in a way that is not appropriately represented in the systematic part of the model
- Autocorrelation of the error terms indicates deficiencies of the model specification such as omitted regressors, incorrect functional form, incorrect dynamics
- Tests for autocorrelation are the most frequently used tool for diagnostic checking of the model specification

Example: Demand for Ice Cream
- Time series of 30 four-weekly observations (1951-1953)
- Variables
  - cons: consumption of ice cream per head (in pints)
  - income: average family income per week (in USD, red line)
  - price: price of ice cream (in USD per pint, blue line)
  - temp: average temperature (in Fahrenheit); tempc: (green, in °C)

Demand for Ice Cream, cont'd
Time series plot of
- Cons: consumption of ice cream per head (in pints); mean: 0.36
- Temp/100: average temperature (in Fahrenheit)
- Price (in USD per pint); mean: 0.275 USD
[Figure: time series plot of Cons, Temp/100 and Price]

Demand for Ice Cream, cont'd
- Demand for ice cream, measured by cons, explained by price, income, and temp
[Table: OLS estimation output; not preserved]

Demand for Ice Cream, cont'd
- Demand for ice cream explained from income and price index
[Figure: not preserved]

Demand for Ice Cream, cont'd
- Ice cream model: scatter plot of residuals $e_t$ vs $e_{t-1}$ (r = 0.401)
[Figure: scatter plot of $e_t$ against $e_{t-1}$]

A Model with AR(1) Errors
- Linear regression
  $y_t = x_t'\beta + \varepsilon_t$ (1)
  with
  $\varepsilon_t = \rho\varepsilon_{t-1} + v_t$ with $-1 < \rho < 1$ or $|\rho| < 1$
  where the $v_t$ are uncorrelated random variables with mean zero and constant variance $\sigma_v^2$
- For $\rho \neq 0$, the error terms $\varepsilon_t$ are correlated; the Gauss-Markov assumption $V\{\varepsilon\} = \sigma_\varepsilon^2 I_N$ is violated
- The other Gauss-Markov assumptions are assumed to be fulfilled
- The sequence $\varepsilon_t$, $t = 0, 1, 2, \dots$, which follows $\varepsilon_t = \rho\varepsilon_{t-1} + v_t$, is called an autoregressive process of order 1 or AR(1) process
(1) In the context of time series models, variables are indexed by "t"

Properties of AR(1) Processes
- Repeated substitution of $\varepsilon_{t-1}$, $\varepsilon_{t-2}$, etc. results in
  $\varepsilon_t = \rho\varepsilon_{t-1} + v_t = v_t + \rho v_{t-1} + \rho^2 v_{t-2} + \dots$
- With $v_t$ being uncorrelated and having mean zero and variance $\sigma_v^2$:
  - $E\{\varepsilon_t\} = 0$
  - $V\{\varepsilon_t\} = \sigma_\varepsilon^2 = \sigma_v^2 (1-\rho^2)^{-1}$
- This results from $V\{\varepsilon_t\} = \sigma_v^2 + \rho^2\sigma_v^2 + \rho^4\sigma_v^2 + \dots = \sigma_v^2 (1-\rho^2)^{-1}$, as the geometric series $1 + \rho^2 + \rho^4 + \dots$ has the sum $(1-\rho^2)^{-1}$ given that $|\rho| < 1$
  - for $|\rho| \geq 1$, $V\{\varepsilon_t\}$ is undefined
- $Cov\{\varepsilon_t, \varepsilon_{t-s}\} = \rho^s \sigma_v^2 (1-\rho^2)^{-1}$ for $s > 0$
  - all error terms are correlated; the covariances, and the correlations $Corr\{\varepsilon_t, \varepsilon_{t-s}\} = \rho^s$, decrease in absolute value with growing distance $s$ in time

AR(1) Process, cont'd
- The covariance matrix $V\{\varepsilon\}$:
  $V\{\varepsilon\} = \sigma_v^2 \Psi = \frac{\sigma_v^2}{1-\rho^2} \begin{pmatrix} 1 & \rho & \cdots & \rho^{T-1} \\ \rho & 1 & \cdots & \rho^{T-2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho^{T-1} & \rho^{T-2} & \cdots & 1 \end{pmatrix}$
- $V\{\varepsilon\}$ has a band structure
- Depends only on two parameters: $\rho$ and $\sigma_v^2$

Consequences of $V\{\varepsilon\} \neq \sigma^2 I_T$
- OLS estimators $b$ for $\beta$
  - are unbiased
  - are consistent
  - have the covariance matrix $V\{b\} = \sigma^2 (X'X)^{-1} X'\Psi X (X'X)^{-1}$
  - are not efficient estimators, not BLUE
  - follow, under general conditions, asymptotically the normal distribution
- The estimator $s^2 = e'e/(T-K)$ for $\sigma^2$ is biased
- For an AR(1) process $\varepsilon_t$ with $\rho > 0$, the s.e. from $s^2 (X'X)^{-1}$ underestimates the true s.e.
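The effect of positive AR(1) autocorrelation on the OLS standard errors can be checked numerically without simulation. The sketch below builds the AR(1) correlation matrix with elements $\rho^{|t-s|}$ and compares the sandwich covariance $(X'X)^{-1}X'V\{\varepsilon\}X(X'X)^{-1}$ with the naive formula; the regressors (intercept and linear trend), the value $\rho = 0.5$, and all function names are assumptions chosen for illustration. For simplicity the naive formula uses the true error variance rather than its biased estimate $s^2$.

```python
import numpy as np

def ar1_corr_matrix(T, rho):
    """Correlation matrix of an AR(1) process: element (t, s) is rho^|t-s|."""
    idx = np.arange(T)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def ols_se_naive_and_true(X, rho, sigma_v2=1.0):
    """Standard errors of OLS from the naive and from the sandwich formula."""
    T = X.shape[0]
    sigma_e2 = sigma_v2 / (1.0 - rho**2)          # V{eps_t} of the AR(1) process
    R = ar1_corr_matrix(T, rho)
    XtX_inv = np.linalg.inv(X.T @ X)
    naive = sigma_e2 * XtX_inv                    # ignores autocorrelation
    true = sigma_e2 * XtX_inv @ X.T @ R @ X @ XtX_inv
    return np.sqrt(np.diag(naive)), np.sqrt(np.diag(true))

# Illustrative design: intercept and linear trend, T = 30
T = 30
X = np.column_stack([np.ones(T), np.arange(1, T + 1)])
se_naive, se_true = ols_se_naive_and_true(X, rho=0.5)
se_naive0, se_true0 = ols_se_naive_and_true(X, rho=0.0)
print(se_true / se_naive)    # ratios above 1: naive s.e. too small
```

With $\rho = 0$ both formulas coincide, which serves as a simple consistency check of the sketch.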

Inference under Autocorrelation
- Covariance matrix of $b$: $V\{b\} = \sigma^2 (X'X)^{-1} X'\Psi X (X'X)^{-1}$
- Use of $s^2 (X'X)^{-1}$ (the standard output of econometric software) instead of $V\{b\}$ for inference on $\beta$ may be misleading
- Identification of autocorrelation:
  - Statistical tests, e.g., Durbin-Watson test
- Remedies
  - Use of correct variances and standard errors
  - Transformation of the model so that the error terms are uncorrelated

Estimation of $\rho$
- Autocorrelation coefficient $\rho$: parameter of the AR(1) process $\varepsilon_t = \rho\varepsilon_{t-1} + v_t$
- Estimation of $\rho$ by regressing the OLS residual $e_t$ on the lagged residual $e_{t-1}$:
  $r = (\sum_{t=2}^T e_{t-1}^2)^{-1} \sum_{t=2}^T e_t e_{t-1}$
- The estimator is
  - biased
  - but consistent under weak conditions

Autocorrelation Function
- Autocorrelation of order $s$:
  $\rho_s = Corr\{\varepsilon_t, \varepsilon_{t-s}\}$
- The autocorrelation function assigns $\rho_s$ to $s$
- Correlogram: graphical representation of the autocorrelation function

Example: Ice Cream Demand
- Autocorrelation function (ACF) of cons

  LAG     ACF           PACF          Q-stat.   [p-value]
    1    0,6627  ***    0,6627  ***   14,5389   [0,000]
    2    0,4283  **    -0,0195        20,8275   [0,000]
    3    0,0982        -0,3179  *     21,1706   [0,000]
    4   -0,1470        -0,1701        21,9685   [0,000]
    5   -0,3968  **    -0,2630        28,0152   [0,000]
    6   -0,4623  **    -0,0398        36,5628   [0,000]
    7   -0,5145  ***   -0,1735        47,6132   [0,000]
    8   -0,4068  **    -0,0299        54,8362   [0,000]
    9   -0,2271         0,0711        57,1929   [0,000]
   10   -0,0156         0,0117        57,2047   [0,000]
   11    0,2237         0,1666        59,7335   [0,000]
   12    0,3912  **     0,0645        67,8959   [0,000]

Example: Ice Cream Demand
- Correlogram of cons
[Figure: correlogram of cons]

Tests for Autocorrelation of Error Terms
- Due to the unbiasedness of $b$, the residuals are expected to indicate autocorrelation
- Graphical displays, e.g., the correlogram of the residuals, may give useful hints
- Residual-based tests:
  - Durbin-Watson test
  - Box-Pierce test
  - Breusch-Godfrey test

Durbin-Watson Test
- Test of $H_0: \rho = 0$ against $H_1: \rho \neq 0$
- Test statistic
  $dw = \frac{\sum_{t=2}^T (e_t - e_{t-1})^2}{\sum_{t=1}^T e_t^2} \approx 2(1 - r)$
- For $\rho > 0$, dw is expected to have a value in (0,2)
- For $\rho < 0$, dw is expected to have a value in (2,4)
- dw close to the value 2 indicates no autocorrelation of the error terms
- Critical limits of dw
  - depend upon the $x_t$'s
  - the exact critical value is unknown, but upper and lower bounds can be derived, which depend only on the number of regression coefficients
- The test can be inconclusive

Durbin-Watson Test: Bounds for Critical Limits
- Derived by Durbin and Watson
- Upper ($d_U$) and lower ($d_L$) bounds for the critical limits for $\alpha = 0.05$

           K=2            K=3            K=10
    T    dL     dU      dL     dU      dL     dU
   15   1.08   1.36    0.95   1.54    0.17   3.22
   20   1.20   1.41    1.10   1.54    0.42   2.70
  100   1.65   1.69    1.63   1.71    1.48   1.87

- dw < $d_L$: reject $H_0$
- dw > $d_U$: do not reject $H_0$
- $d_L$ < dw < $d_U$: no decision (inconclusive region)

Durbin-Watson Test: Remarks
- The Durbin-Watson test gives no indication of the causes of the rejection of the null hypothesis or of how to modify the model
- Various types of misspecification may cause the rejection of the null hypothesis
- The Durbin-Watson test is a test against first-order autocorrelation; a test against autocorrelation of other orders may be more suitable, e.g., order four if the model is for quarterly data
- Use of tables is unwieldy
  - Limited number of critical bounds (K, T, $\alpha$) in tables
  - Inconclusive region

Asymptotic Tests
- AR(1) process for the error terms: $\varepsilon_t = \rho\varepsilon_{t-1} + v_t$
- Auxiliary regression of $e_t$ on $x_t$ and $e_{t-1}$: produces $R_e^2$
- Test of $H_0: \rho = 0$
1. Breusch-Godfrey test (GRETL: OLS output => Tests => Autocorr.)
  - $R_e^2$ of the auxiliary regression: close to zero if $\rho = 0$
  - $(T-1) R_e^2$ follows approximately the Chi-square distribution with 1 d.f. if $\rho = 0$
  - Lagrange multiplier F (LMF) statistic: F-test for the explanatory power of $e_{t-1}$; follows approximately the F(1, T-K-1) distribution if $\rho = 0$
  - General case of the Breusch-Godfrey test: auxiliary regression based on a higher-order autoregressive process

Asymptotic Tests, cont'd
2. Box-Pierce test
  - The corresponding t-statistic
    $t = \sqrt{T}\, r$
    follows approximately the t-distribution if $\rho = 0$
  - The test based on $\sqrt{T}\, r$ is a special case of the Box-Pierce test, which uses the test statistic $Q_m = T \sum_{s=1}^m r_s^2$
  - Similarly the Ljung-Box test, based on
    $Q_m^{LB} = T(T+2) \sum_{s=1}^m \frac{r_s^2}{T-s}$
    which follows approximately the Chi-square distribution with m d.f. if $\rho = 0$
  - Ljung-Box test in GRETL: OLS output => Graphs => Residual correlogram

Asymptotic Tests, cont'd
Remarks
- If the model of interest contains lagged values of y, the auxiliary regression should also include all explanatory variables (to make sure that the distribution of the test statistic is correct)
- If heteroskedasticity is suspected, White standard errors may be used in the auxiliary regression

Demand for Ice Cream, cont'd
- Demand for ice cream, measured by cons, explained by price, income, and temp
[Table: OLS estimation output; not preserved]

Demand for Ice Cream, cont'd
OLS estimated demand function: output from GRETL

  Dependent variable: CONS

              coefficient   std. error    t-ratio   p-value
  -------------------------------------------------------------
  const       0.197315      0.270216      0.7302    0.4718
  INCOME      0.00330776    0.00117142    2.824     0.0090    ***
  PRICE      -1.04441       0.834357     -1.252     0.2218
  TEMP        0.00345843    0.000445547   7.762     3.10e-08  ***

  Mean dependent var   0.359433    S.D. dependent var   0,065791
  Sum squared resid    0,035273    S.E. of regression   0,036833
  R-squared            0,718994    Adjusted R-squared   0,686570
  F(2, 129)            22,17489    P-value (F)          2,45e-07
  Log-likelihood       58,61944    Akaike criterion     -109,2389
  Schwarz criterion    -103,6341   Hannan-Quinn         -107,4459
  rho                  0,400633    Durbin-Watson        1,021170

Demand for Ice Cream, cont'd
- Test for autocorrelation of the error terms
  - $H_0: \rho = 0$, $H_1: \rho \neq 0$
  - dw = 1.02 < 1.21 = $d_L$ for T = 30, K = 4
- GRETL also shows the estimated autocorrelation coefficient: r = 0.401
- Plot of actual (o) and fitted (polygon) values
[Figure: actual and fitted values of cons]

Demand for Ice Cream, cont'd
- Auxiliary regression $e_t = \rho e_{t-1} + v_t$: OLS estimation gives
  $e_t = 0.401\, e_{t-1}$
  with s.e.(r) = 0.177, $R^2 = 0.154$
- Test of $H_0: \rho = 0$ against $H_1: \rho > 0$
1. Box-Pierce test
  - $t \approx \sqrt{30} \cdot 0.401 = 2.196$, p-value: 0.018
  - t-statistic: 2.258, p-value: 0.016
2. Breusch-Godfrey test
  - $(T-1) R^2 = 4.47$, p-value: 0.035
- Both reject the null hypothesis

Inference under Autocorrelation
- Covariance matrix of $b$: $V\{b\} = \sigma^2 (X'X)^{-1} X'\Psi X (X'X)^{-1}$
- Use of $s^2 (X'X)^{-1}$ (the standard output of econometric software) instead of $V\{b\}$ for inference on $\beta$ may be misleading
- Remedies
  - Use of correct variances and standard errors
  - Transformation of the model so that the error terms are uncorrelated

HAC-estimator for $V\{b\}$
- Substitution of $\Psi$ in
  $V\{b\} = \sigma^2 (X'X)^{-1} X'\Psi X (X'X)^{-1}$
  by a suitable estimator
- Newey-West: substitution of $S_x = \sigma^2 (X'\Psi X)/T = (\sum_t \sum_s \sigma_{ts} x_t x_s')/T$ by
  $\hat{S}_x = \frac{1}{T} \sum_t e_t^2 x_t x_t' + \frac{1}{T} \sum_{j=1}^p \sum_{t=j+1}^T (1 - w_j)\, e_t e_{t-j} (x_t x_{t-j}' + x_{t-j} x_t')$
  with $w_j = j/(p+1)$; $p$, the truncation lag, is to be chosen suitably
- The estimator
  $T (X'X)^{-1} \hat{S}_x (X'X)^{-1}$
  for $V\{b\}$ is called the heteroskedasticity and autocorrelation consistent (HAC) estimator; the corresponding standard errors are the HAC s.e.
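A Newey-West type HAC covariance estimate can be sketched as below. The sketch assumes the common form with Bartlett weights $1 - j/(p+1)$ applied to the lag-$j$ cross-products of $x_t e_t$; the function name `newey_west_cov`, the truncation lag $p = 4$, and the simulated AR(1) errors are hypothetical choices for illustration and do not reproduce GRETL's defaults.

```python
import numpy as np

def newey_west_cov(X, e, p):
    """HAC covariance T (X'X)^-1 S (X'X)^-1 with Bartlett weights 1 - j/(p+1)."""
    T = X.shape[0]
    Xe = X * e[:, None]                       # rows x_t * e_t
    S = Xe.T @ Xe / T                         # lag 0: (1/T) sum e_t^2 x_t x_t'
    for j in range(1, p + 1):
        w = j / (p + 1.0)
        G = Xe[j:].T @ Xe[:-j] / T            # (1/T) sum e_t e_{t-j} x_t x_{t-j}'
        S += (1.0 - w) * (G + G.T)
    XtX_inv = np.linalg.inv(X.T @ X)
    return T * XtX_inv @ S @ XtX_inv

# Synthetic regression with AR(1) errors (illustrative only), started at zero
rng = np.random.default_rng(3)
T = 200
x = rng.normal(size=T)
X = np.column_stack([np.ones(T), x])
v = rng.normal(size=T)
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = 0.6 * eps[t - 1] + v[t]
y = 1.0 + 0.5 * x + eps

b, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b                                 # OLS residuals
cov_hac = newey_west_cov(X, e, p=4)
se_hac = np.sqrt(np.diag(cov_hac))
print(b, se_hac)
```

With p = 0 the loop is skipped and the estimate reduces to White's HC0 estimator, which links the HAC estimator to the heteroskedasticity-robust case.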
n Dec 9, 2011 Hackl, Econometrics, Lecture 4 79 Demand for Ice Cream, cont’d nDemand for ice cream, measured by cons, explained by price, income, and temp, OLS and HAC standard errors Dec 9, 2011 Hackl, Econometrics, Lecture 4 80 coeff s.e. OLS HAC constant 0.197 0.270 0.288 price -1.044 0.834 0.876 income*10-3 3.308 1.171 1.184 temp*10-3 3.458 0.446 0.411 Cochrane-Orcutt Estimator nGLS estimator nWith transformed variables yt* = yt – ryt-1 and xt* = xt – rxt-1, also called quasi-differences, the model yt = xt‘b + et with et = ret-1 + vt can be written as n yt – ryt-1 = yt* = (xt – rxt-1)‘b + vt = xt*‘b + vt (A) nThe model in quasi-differences has error terms which fulfill the Gauss-Markov assumptions nGiven observations for t = 1, …, T, model (A) is defined for t = 2, …, T nEstimation of r using, e.g., the auxiliary regression et = ret-1 + vt gives the estimate r; substitution of r in (A) for r results in FGLS estimators for b nThe FGLS estimator is called Cochrane-Orcutt estimator n Dec 9, 2011 Hackl, Econometrics, Lecture 4 81 Cochrane-Orcutt Estimation nIn following steps 1.OLS estimation of b for b from yt = xt‘b + et, t = 1, …, T 2.Estimation of r for r from the auxiliary regression et = ret-1 + vt 3.Calculation of quasi-differences yt* = yt – ryt-1 and xt* = xt – rxt-1 4.OLS estimation of b from n yt* = xt*‘b + vt, t = 2, …, T n resulting in the Cochrane-Orcutt estimators nSteps 2. to 4. 
can be repeated: iterated Cochrane-Orcutt estimator
   ◦ GRETL provides the iterated Cochrane-Orcutt estimator:
   ◦ Model => Time series => Cochrane-Orcutt

Demand for Ice Cream, cont'd
• Iterated Cochrane-Orcutt estimator
  [GRETL output]
• Durbin-Watson test: dw = 1.55; dL = 1.21 < dw < 1.65 = dU

Demand for Ice Cream, cont'd
• Demand for ice cream, measured by cons, explained by price, income, and temp: OLS and HAC standard errors, and Cochrane-Orcutt estimates

              OLS                                  Cochrane-Orcutt
              coeff    s.e. (OLS)   s.e. (HAC)     coeff    se
  constant    0.197    0.270        0.288          0.157    0.300
  price      -1.044    0.834        0.881         -0.892    0.830
  income      3.308    1.171        1.151          3.203    1.546
  temp        3.458    0.446        0.449          3.558    0.555

Demand for Ice Cream, cont'd
• Model extended by temp₋₁
  [GRETL output]
• Durbin-Watson test: dw = 1.58; dL = 1.21 < dw < 1.65 = dU

Demand for Ice Cream, cont'd
• Demand for ice cream, measured by cons, explained by price, income, and temp: OLS and HAC standard errors, Cochrane-Orcutt estimates, and OLS estimates for the extended model

              OLS                    Cochrane-Orcutt     OLS (extended model)
              coeff    s.e. (HAC)    coeff    se         coeff    se
  constant    0.197    0.288         0.157    0.300      0.189    0.232
  price      -1.044    0.881        -0.892    0.830     -0.838    0.688
  income      3.308    1.151         3.203    1.546      2.867    1.053
  temp        3.458    0.449         3.558    0.555      5.332    0.670
  temp₋₁                                                -2.204    0.731

General Autocorrelation Structures
• Generalization of the model
    yₜ = xₜ'β + εₜ
  with εₜ = ρεₜ₋₁ + vₜ
• Alternative dependence structures of error terms
   ◦ Autocorrelation of higher order than 1
   ◦ Moving average pattern

Higher Order Autocorrelation
• For quarterly data, error terms may develop according to
    εₜ = γεₜ₋₄ + vₜ
  or, more generally, to
    εₜ = γ₁εₜ₋₁ + … + γ₄εₜ₋₄ + vₜ
• εₜ follows an AR(4) process, an autoregressive process of order 4
• More complex structures of
correlations between variables with autocorrelation of order 4 are possible than with that of order 1

Moving Average Processes
• Moving average process of order 1, MA(1) process
    εₜ = vₜ + αvₜ₋₁
• εₜ is correlated with εₜ₋₁, but not with εₜ₋₂, εₜ₋₃, …
• Generalizations to higher orders

Remedies against Autocorrelation
• Change the functional form, e.g., use log(y) instead of y
• Extend the model by including additional explanatory variables, e.g., seasonal dummies, or additional lags
• Use HAC standard errors for the OLS estimators
• Reformulate the model in quasi-differences (FGLS) or in differences

Your Homework
1. Use the data set "labour2" of Verbeek for the following analyses:
   a. Estimate (OLS) the model where log labor is explained by log output and log wage; generate a display of the residuals which may indicate heteroskedasticity of the error term
   b. Perform the Breusch-Pagan test (i) with h(zᵢ'α) = exp(zᵢ'α) and (ii) with h(zᵢ'α) = zᵢ'α, and the White test (iii) with and (iv) without interactions; explain the tests and compare and interpret the results
   c. Compare (i) the OLS and the White standard errors with (ii) HC0 and (iii) HC1 of the estimated coefficients; interpret the results
   d. Estimate the model of a. using FGLS and weights obtained in the auxiliary regression of the Breusch-Pagan test (ii) in b.; compare the results with those of a.
Your Homework, cont'd
2. Use the data set "icecream" of Verbeek for the following analyses:
   a. Estimate the model where cons is explained by income and temp; generate two displays of the residuals which may indicate autocorrelation of the error terms
   b. Use the Durbin-Watson and the Breusch-Godfrey test against autocorrelation; interpret the result
   c. Repeat a., using (i) the iterative Cochrane-Orcutt estimation and (ii) OLS estimation of the model in differences; interpret the result
3. Durbin-Watson:
   (a) Explain the meaning of the statement "The Durbin-Watson test is a misspecification test"
   (b) Show that dw ≈ 2 − 2r
   (c) Which of the following tests is a generalization of the DW test: (i) Box-Pierce test; (ii) Breusch-Godfrey test? Explain why.
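As a starting point for the iterative Cochrane-Orcutt estimation asked for in 2c, the four steps from the lecture can be sketched as follows. This is a minimal sketch under stated assumptions: AR(1) errors, simulated data rather than the "icecream" data set, and hypothetical helper names (ols, cochrane_orcutt) with an arbitrary convergence tolerance. GRETL's own implementation may differ in its details.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients b = (X'X)^-1 X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def cochrane_orcutt(X, y, tol=1e-8, max_iter=100):
    """Iterated Cochrane-Orcutt estimates (b, r) for AR(1) errors."""
    b = ols(X, y)                                      # step 1: OLS on the original model
    r = 0.0
    for _ in range(max_iter):
        e = y - X @ b                                  # residuals of the original model
        r_new = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])   # step 2: e_t = r e_{t-1} + v_t
        y_star = y[1:] - r_new * y[:-1]                # step 3: quasi-differences
        X_star = X[1:] - r_new * X[:-1]
        b = ols(X_star, y_star)                        # step 4: OLS for t = 2, ..., T
        converged = abs(r_new - r) < tol
        r = r_new
        if converged:
            break
    return b, r

# Simulated example (hypothetical data): y = 1 + 2 x + eps, eps AR(1) with rho = 0.6
rng = np.random.default_rng(1)
T = 500
x = rng.normal(size=T)
v = rng.normal(size=T)
eps = np.zeros(T)
eps[0] = v[0]
for t in range(1, T):
    eps[t] = 0.6 * eps[t - 1] + v[t]
X = np.column_stack([np.ones(T), x])
y = 1.0 + 2.0 * x + eps

b_co, r_hat = cochrane_orcutt(X, y)
print("Cochrane-Orcutt coefficients:", b_co)
print("estimated rho:", r_hat)
```

Because the constant column is quasi-differenced along with the other regressors, it becomes 1 − r, so the first element of b_co is directly the intercept of the original model.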