Econometrics - Lecture 2
Introduction to Linear Regression - Part 2

Contents
- Goodness-of-Fit
- Hypothesis Testing
- Asymptotic Properties of the OLS Estimator
- Multicollinearity
- Prediction

Oct 1, 2010 Hackl, Econometrics, Lecture 2

Goodness-of-Fit R²
- The quality of the model $y_i = x_i'\beta + \varepsilon_i$ can be measured by $R^2$, the goodness-of-fit (GoF) statistic
- $R^2$ is the proportion of the variance in $y$ that can be explained by the linear regression with regressors $x_k$, $k = 1, \dots, K$:
  $R^2 = \frac{\hat{V}\{\hat{y}_i\}}{\hat{V}\{y_i\}} = \frac{\sum_i (\hat{y}_i - \bar{y})^2}{\sum_i (y_i - \bar{y})^2}$
- If the model contains an intercept (as usual):
  $R^2 = 1 - \frac{\tilde{V}\{e_i\}}{\hat{V}\{y_i\}}$, with $\tilde{V}\{e_i\} = (\sum_i e_i^2)/(N-1)$
- Alternatively, $R^2$ can be calculated as [formula missing in the source]

Properties of R²
- $0 \le R^2 \le 1$ if the model contains an intercept
- $R^2 = 1$: all residuals are zero
- $R^2 = 0$: $b_k = 0$ for all regressors (apart from the intercept); the model explains nothing
- Comparing the $R^2$ of two models makes no sense if the explained variables are different
- $R^2$ cannot decrease if a variable is added

Example: Individual Wages, cont'd
- OLS estimated wage equation (Table 2.1, Verbeek)
- Only 3.17% of the variation of individual wages p.h. is due to gender

Other GoF Measures
- For the case of no intercept: uncentered $R^2$; cannot become negative
  uncentered $R^2 = 1 - \sum_i e_i^2 / \sum_i y_i^2$
- For comparing models: adjusted $R^2$; compensates for added regressors by a penalty for increasing $K$
  $\bar{R}^2 = 1 - \frac{\frac{1}{N-K}\sum_i e_i^2}{\frac{1}{N-1}\sum_i (y_i - \bar{y})^2}$
  for a given model, the adjusted $R^2$ is smaller than $R^2$
- For models estimated by methods other than OLS:
  $R^2 = \mathrm{corr}^2\{y_i, \hat{y}_i\}$
  coincides with $R^2$ for OLS-estimated models

Hypothesis Testing

Individual Wages
- OLS estimated wage equation (Table 2.1, Verbeek)
- $b_1 = 5.147$, $se(b_1) = 0.081$: mean wage p.h.
for females: 5.15 USD, with a standard error of 0.08 USD
- $b_2 = 1.166$, $se(b_2) = 0.112$
- 95% confidence interval for $\beta_1$: $4.988 \le \beta_1 \le 5.306$

OLS Estimator: Distributional Properties
- Under the assumptions (A1) to (A5):
- The OLS estimator $b = (X'X)^{-1}X'y$ is normally distributed with mean $\beta$ and covariance matrix $V\{b\} = \sigma^2 (X'X)^{-1}$:
  $b \sim N(\beta, \sigma^2 (X'X)^{-1})$, $b_k \sim N(\beta_k, \sigma^2 c_{kk})$, $k = 1, \dots, K$
  ($c_{kk}$: the k-th diagonal element of $(X'X)^{-1}$)
- The statistic
  $z = \frac{b_k - \beta_k}{\sigma \sqrt{c_{kk}}}$
  follows the standard normal distribution N(0,1)
- The statistic
  $t_k = \frac{b_k - \beta_k}{s \sqrt{c_{kk}}}$
  follows the t-distribution with N-K degrees of freedom (df)

Testing a Regression Coefficient: t-Test
- For testing a restriction on a single regression coefficient $\beta_k$:
- Null hypothesis $H_0$: $\beta_k = q$
- Alternative $H_A$: $\beta_k > q$
- Test statistic (computed from the sample, with known distribution under the null hypothesis):
  $t_k = \frac{b_k - q}{se(b_k)}$
- $t_k$ is a realization of the random variable $t_{N-K}$, which follows the t-distribution with N-K degrees of freedom (df = N-K)
  - under $H_0$ and
  - given the Gauss-Markov assumptions and normality of the errors
- Reject $H_0$ if the p-value $P\{t_{N-K} > t_k \mid H_0\}$ is small (the $t_k$-value is large)

Normal and t-Distribution
- Standard normal distribution: $Z \sim N(0,1)$
  - Distribution function $\Phi(z) = P\{Z \le z\}$
- t(df)-distribution
  - Distribution function $F(t) = P\{T_{df} \le t\}$
  - p-value: $P\{t_{N-K} > t_k \mid H_0\} = 1 - F_{H_0}(t_k)$
- For growing df, the t-distribution approaches the standard normal distribution; t follows asymptotically ($N \to \infty$) the N(0,1)-distribution
- 0.975-percentiles $t_{df,0.975}$ of the t(df)-distribution:

  df             5      10     20     30     50     100    200    ∞
  t_{df,0.975}   2.571  2.228  2.086  2.042  2.009  1.984  1.972  1.960

- 0.975-percentile of the standard normal distribution: $z_{0.975} = 1.96$

OLS Estimators: Asymptotic Distribution
- If the Gauss-Markov assumptions (A1)-(A4) hold but not the normality assumption (A5):
- the t-statistic $t_k = \frac{b_k - \beta_k}{se(b_k)}$ follows
asymptotically ($N \to \infty$) the standard normal distribution
- In many situations, the unknown exact properties are replaced by approximate results (asymptotic theory)
- The t-statistic
  - follows the t-distribution with N-K d.f. if the errors are normal (A5)
  - follows approximately the standard normal distribution N(0,1) otherwise
- The approximation error decreases with increasing sample size N

Two-sided t-Test
- For testing a restriction on a single regression coefficient $\beta_k$:
- Null hypothesis $H_0$: $\beta_k = q$
- Alternative $H_A$: $\beta_k \ne q$
- Test statistic (computed from the sample, with known distribution under the null hypothesis):
  $t_k = \frac{b_k - q}{se(b_k)}$
- Reject $H_0$ if the p-value $P\{|t_{N-K}| > |t_k| \mid H_0\}$ is small (the $|t_k|$-value is large)

Individual Wages, cont'd
- OLS estimated wage equation (Table 2.1, Verbeek)
- Test of the null hypothesis $H_0$: $\beta_2 = 0$ (no gender effect on wages) against $H_A$: $\beta_2 > 0$
  $t_2 = b_2/se(b_2) = 1.1661/0.1122 = 10.38$
- Under $H_0$, t follows the t-distribution with df = 3294 - 2 = 3292
- p-value = $P\{t_{3292} > 10.38 \mid H_0\}$ = 3.7E-25: reject $H_0$!

Individual Wages, cont'd
OLS estimated wage equation: output from GRETL

Model 1: OLS, using observations 1-3294
Dependent variable: WAGE

           coefficient   std. error   t-ratio   p-value
  const    5.14692       0.0812248    63.3664   <0.00001 ***
  MALE     1.1661        0.112242     10.3891   <0.00001 ***

Mean dependent var 5.757585; S.D. dependent var 3.269186; Sum squared resid 34076.92; S.E. of regression
3.217364; R-squared 0.031746; Adjusted R-squared 0.031452; F(1, 3292) 107.9338; P-value(F) 6.71e-25; Log-likelihood -8522.228; Akaike criterion 17048.46; Schwarz criterion 17060.66; Hannan-Quinn criterion 17052.82

p-value for the t-test on MALE: < 0.00001; "gender has a significant effect on wages p.h."

Significance Tests
- For testing a restriction on a single regression coefficient $\beta_k$:
- Null hypothesis $H_0$: $\beta_k = q$
- Alternative $H_A$: $\beta_k \ne q$
- Test statistic (computed from the sample, with known distribution under the null hypothesis):
  $t_k = \frac{b_k - q}{se(b_k)}$
- Determine the critical value $t_{N-K,1-\alpha/2}$ for the significance level $\alpha$ from
  $P\{|t_k| > t_{N-K,1-\alpha/2} \mid H_0\} = \alpha$
- Reject $H_0$ if $|t_k| > t_{N-K,1-\alpha/2}$
- Typically, $\alpha$ has the value 0.05

Significance Tests, cont'd
- One-sided test:
- Null hypothesis $H_0$: $\beta_k = q$
- Alternative $H_A$: $\beta_k > q$ ($\beta_k < q$)
- Test statistic (computed from the sample, with known distribution under the null hypothesis):
  $t_k = \frac{b_k - q}{se(b_k)}$
- Determine the critical value $t_{N-K,1-\alpha}$ for the significance level $\alpha$ from
  $P\{t_k > t_{N-K,1-\alpha} \mid H_0\} = \alpha$
- Reject $H_0$ if $t_k > t_{N-K,1-\alpha}$ ($t_k < -t_{N-K,1-\alpha}$)

Confidence Interval for β_k
- Range of values $(b_{kl}, b_{ku})$ for which the null hypothesis on $\beta_k$ is not rejected:
  $b_{kl} = b_k - t_{N-K,1-\alpha/2}\, se(b_k) < \beta_k < b_k + t_{N-K,1-\alpha/2}\, se(b_k) = b_{ku}$
- Refers to the significance level $\alpha$ of the test
- For large values of df and $\alpha = 0.05$ (1.96 ≈ 2):
  $b_k - 2\, se(b_k) < \beta_k < b_k + 2\, se(b_k)$
- Confidence level: $\gamma = 1 - \alpha$
- Interpretation:
  - A range of values for the true $\beta_k$ that are not unlikely, given the data (?)
  - A range of values for the true $\beta_k$ such that $100\gamma$% of all intervals constructed in that way contain the true $\beta_k$

Individual Wages, cont'd
- OLS estimated wage equation (Table 2.1, Verbeek)
- The confidence interval for the gender wage difference (in USD p.h.)
- confidence level $\gamma = 0.95$:
  $1.1661 - 1.96 \cdot 0.1122 < \beta_2 < 1.1661 + 1.96 \cdot 0.1122$
  $0.946 < \beta_2 < 1.386$ (or $0.94 < \beta_2 < 1.39$)
- $\gamma = 0.99$: $0.877 < \beta_2 < 1.455$

Testing a Linear Restriction on Regression Coefficients
- Linear restriction $r'\beta = q$
- Null hypothesis $H_0$: $r'\beta = q$
- Alternative $H_A$: $r'\beta > q$
- Test statistic
  $t = \frac{r'b - q}{se(r'b)}$
  - $se(r'b)$ is the square root of the estimated variance $\hat{V}\{r'b\} = r'\hat{V}\{b\}r$
- Under $H_0$ and (A1)-(A5), t follows the t-distribution with df = N-K
- GRETL: the option Linear restrictions under Tests in the output window of the Ordinary Least Squares model allows testing linear restrictions on the regression coefficients

Testing Several Regression Coefficients: F-test
- For testing a restriction on more than one, say J with 1 < J < K, regression coefficients:
- Null hypothesis $H_0$: $\beta_k = 0$, $K-J+1 \le k \le K$
- Alternative $H_A$: for at least one of these k, $\beta_k \ne 0$
- F-statistic (computed from the sample, with known distribution under the null hypothesis; $R_0^2$ ($R_1^2$): $R^2$ of the (un)restricted model):
  $F = \frac{(R_1^2 - R_0^2)/J}{(1 - R_1^2)/(N-K)}$
- Under $H_0$ and (A1)-(A5), F follows the F-distribution with J and N-K d.f.
- Reject $H_0$ if the p-value $P\{F_{J,N-K} > F \mid H_0\}$ is small (the F-value is large)
- The test with J = K-1 is a standard test in GRETL

Individual Wages, cont'd
- A more general model is
  $wage_i = \beta_1 + \beta_2\, male_i + \beta_3\, school_i + \beta_4\, exper_i + \varepsilon_i$
- $\beta_2$ measures the difference in expected wages p.h. between males and females, holding the other regressors fixed, i.e., with the same schooling and experience: ceteris paribus condition
- Do school and exper have explanatory power?
- Test of the null hypothesis $H_0$: $\beta_3 = \beta_4 = 0$ against $H_A$: $H_0$ not true
  - $R_0^2 = 0.0317$
  - $R_1^2 = 0.1326$
  - $F = \frac{(R_1^2 - R_0^2)/2}{(1 - R_1^2)/(3294-4)} = 191.24$ (computed from the unrounded $R^2$ values)
  - p-value = $P\{F_{2,3290} > 191.24 \mid H_0\}$ = 2.68E-79

Individual Wages, cont'd
- OLS estimated wage equation (Table 2.2, Verbeek)

Alternatives for Testing Several Regression Coefficients
- Test again
  - $H_0$: $\beta_k = 0$, $K-J+1 \le k \le K$
  - $H_A$: at least one of these $\beta_k \ne 0$
1. The test statistic F can alternatively be calculated as
   $F = \frac{(S_0 - S_1)/J}{S_1/(N-K)}$
   - $S_0$ ($S_1$): sum of squared residuals for the (un)restricted model
   - Under $H_0$ and (A1)-(A5), F follows the F(J, N-K)-distribution
2. If $\sigma^2$ is known, the test can be based on
   $(S_0 - S_1)/\sigma^2$,
   which under $H_0$ and (A1)-(A5) is Chi-squared distributed with J d.f.
   - For large N, $s^2$ is very close to $\sigma^2$; the Chi-squared test based on $(S_0 - S_1)/s^2$ approximates the F-test

Individual Wages, cont'd
- A more general model is
  $wage_i = \beta_1 + \beta_2\, male_i + \beta_3\, school_i + \beta_4\, exper_i + \varepsilon_i$
- Do school and exper have explanatory power?
- Test of the null hypothesis $H_0$: $\beta_3 = \beta_4 = 0$ against $H_A$: $H_0$ not true
  - $S_0 = 34076.92$
  - $S_1 = 30527.87$
  - $F = [(34076.92 - 30527.87)/2]/[30527.87/(3294-4)] = 191.24$
- Does any regressor contribute to the explanation?
- Overall F-test for $H_0$: $\beta_2 = \dots = \beta_4 = 0$ against $H_A$: $H_0$ not true (see Table 2.2 or the GRETL output): J = 3
  - F = 167.63, p-value: 4.0E-101

The General Case
- Test of $H_0$: $R\beta = q$
  - $R\beta = q$: J linear restrictions on the coefficients (R: J×K matrix, q: J-vector)
  - Example: [missing in the source]
- Wald test: the test statistic
  $\xi = (Rb - q)'[R\hat{V}\{b\}R']^{-1}(Rb - q)$
  follows under $H_0$ for large N approximately the Chi-squared distribution with J d.f.
- The test based on $F = \xi/J$ is algebraically identical to the F-test with
  $F = \frac{(S_0 - S_1)/J}{S_1/(N-K)}$

p-value, Size, and Power
- Type I error: the null hypothesis is rejected although it is actually true
  - p-value: the probability, under $H_0$, of a test statistic at least as extreme as the observed one; equivalently, the smallest significance level at which $H_0$ would be rejected
  - In experimental situations, the probability of committing a type I error can be chosen before applying the test; this probability is the significance level $\alpha$, also called the size of the test
  - In model-building situations, the aim is not a decision but learning from the data; multiple testing is quite usual; reporting p-values is more appropriate than using a strict $\alpha$
- Type II error: the null hypothesis is not rejected although it is actually wrong; the decision is not in favor of the true alternative
  - The probability of deciding in favor of the true alternative, i.e., of not making a type II error, is called the power of the test; it depends on the true parameter values

p-value, Size, and Power, cont'd
- The smaller the size of the test, the smaller is its power (for a given sample size)
- The more $H_A$ deviates from $H_0$, the larger is the power of a test of a given size (given the sample size)
- The larger the sample size, the larger is the power of a test of a given size
- Attention!
Significance vs relevance

Asymptotic Properties of the OLS Estimator

OLS Estimators: Asymptotic Properties
- The Gauss-Markov assumptions (A1)-(A4) plus the normality assumption (A5) are very restrictive in many situations
- An alternative is to rely on properties derived from asymptotic theory
  - Asymptotic results are hoped to be sufficiently precise approximations for large (but finite) N
  - Typically, Monte Carlo simulations are used to assess the quality of asymptotic results
- Asymptotic theory deals with the case where the sample size N goes to infinity: $N \to \infty$

Chebychev's Inequality
- Chebychev's inequality: bound for the probability that a random variable z deviates from its mean
  $P\{|z - E\{z\}| > k\sigma\} < k^{-2}$
  for all k > 0; true for any distribution with finite moments $E\{z\}$ and $\sigma^2 = V\{z\}$
- For the OLS estimator $b_k$:
  $P\{|b_k - \beta_k| > \delta\} < \frac{V\{b_k\}}{\delta^2} = \frac{\sigma^2 c_{kk}}{\delta^2}$
  for all $\delta > 0$; $c_{kk}$: the k-th diagonal element of $(X'X)^{-1} = (\sum_i x_i x_i')^{-1}$
- For growing N: the elements of $\sum_i x_i x_i'$ increase, $V\{b_k\}$ decreases
- Given (A6), for all $\delta > 0$:
  $\lim_{N \to \infty} P\{|b_k - \beta_k| > \delta\} = 0$

OLS Estimators: Consistency
- If (A2) from the Gauss-Markov assumptions (uncorrelated $x_i$ and $\varepsilon_i$) and the assumption (A6) are fulfilled:
  $\lim_{N \to \infty} P\{|b_k - \beta_k| > \delta\} = 0$ for all $\delta > 0$,
  i.e., $b_k$ converges in probability to $\beta_k$ for $N \to \infty$
- Consistency of the OLS estimators b:
  - For $N \to \infty$, b converges in probability to $\beta$, i.e., the probability that b differs from $\beta$ by more than any given amount goes to zero
  - $\mathrm{plim}_{N \to \infty}\, b = \beta$
  - The distribution of b collapses at $\beta$
- Needs no assumptions beyond (A2) and (A6)!
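The consistency statement can be illustrated with a small Monte Carlo sketch. The design is an assumption made purely for illustration and is not taken from the lecture: a simple regression with true slope 2, regressors uniform on (0, 1), and deliberately non-normal errors uniform on (-1, 1). The helper names `ols_slope` and `slope_spread` are hypothetical.

```python
import random
import statistics


def ols_slope(x, y):
    """OLS slope of y on x in a simple regression with intercept."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def slope_spread(n, reps=200, beta1=1.0, beta2=2.0, seed=0):
    """Mean and standard deviation of the OLS slope over `reps` samples of size n."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        x = [rng.uniform(0.0, 1.0) for _ in range(n)]
        # Illustrative assumption: non-normal errors, uniform on (-1, 1),
        # with zero mean and independent of x
        y = [beta1 + beta2 * xi + rng.uniform(-1.0, 1.0) for xi in x]
        slopes.append(ols_slope(x, y))
    return statistics.fmean(slopes), statistics.stdev(slopes)


mean_small, sd_small = slope_spread(n=25)
mean_large, sd_large = slope_spread(n=400)
print(f"N=25:  mean={mean_small:.3f}, sd={sd_small:.3f}")
print(f"N=400: mean={mean_large:.3f}, sd={sd_large:.3f}")
```

In this setup the spread of the estimated slopes around the true value should shrink as N grows, which is what "the distribution of b collapses" refers to; normality of the errors is not needed for this.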
(A6): $\frac{1}{N}\sum_{i=1}^{N} x_i x_i' = \frac{1}{N} X'X$ converges with growing N to a finite, nonsingular matrix $\Sigma_{xx}$

OLS Estimators: Consistency, cont'd
- Consistency of the OLS estimators can also be shown to hold under weaker assumptions:
- The OLS estimators b are consistent,
  $\mathrm{plim}_{N \to \infty}\, b = \beta$,
  if the assumptions (A7) and (A6) are fulfilled
- (A7): The error terms have zero mean and are uncorrelated with each of the regressors: $E\{x_i \varepsilon_i\} = 0$
- Follows from
  $b = \beta + \left(\frac{1}{N}\sum_i x_i x_i'\right)^{-1} \frac{1}{N}\sum_i x_i \varepsilon_i$
  and
  $\mathrm{plim}(b - \beta) = \Sigma_{xx}^{-1} E\{x_i \varepsilon_i\}$

Consistency of s²
- The estimator $s^2$ for the error term variance $\sigma^2$ is consistent,
  $\mathrm{plim}_{N \to \infty}\, s^2 = \sigma^2$,
  if the assumptions (A3), (A6), and (A7) are fulfilled

Consistency: Some Properties
- $\mathrm{plim}\, g(b) = g(\beta)$ for continuous functions g
  - e.g., if $\mathrm{plim}\, s^2 = \sigma^2$, then $\mathrm{plim}\, s = \sigma$
- The conditions for consistency are weaker than those for unbiasedness

OLS Estimators: Asymptotic Normality
- The exact distribution of OLS estimators is mostly unknown
- Approximate distribution, based on the asymptotic distribution
- Most estimators in econometrics asymptotically follow the normal distribution
- Asymptotic distribution of the consistent estimator b: distribution of
  $N^{1/2}(b - \beta)$ for $N \to \infty$
- Under the Gauss-Markov assumptions (A1)-(A4) and assumption (A6), the OLS estimator b fulfills
  $N^{1/2}(b - \beta) \to N(0, \sigma^2 \Sigma_{xx}^{-1})$
  "→" means "is asymptotically distributed as"

OLS Estimators: Approximate Normality
- Under the Gauss-Markov assumptions (A1)-(A4) and assumption (A6), the OLS estimators b approximately follow the normal distribution
  $N\left(\beta, s^2 \left(\sum_i x_i x_i'\right)^{-1}\right)$
- The approximate distribution does not make use of assumption (A5), i.e., the normality of the error terms!
- Tests of hypotheses on the coefficients $\beta_k$,
  - t-test
  - F-test
  can be performed by making use of the approximate normal distribution

Assessment of Approximate Normality
- The quality of
  - the approximate normal distribution of OLS estimators
  - p-values of t- and F-tests
  - power of tests, confidence intervals, etc.
  depends on the sample size N and on factors related to the Gauss-Markov assumptions etc.
- Monte Carlo studies: simulations that indicate the consequences of deviations from ideal situations
- Example: $y_i = \beta_1 + \beta_2 x_i + \varepsilon_i$; distribution of $b_2$ under classical assumptions?
  - 1) Choose N; 2) generate $x_i$, $\varepsilon_i$, calculate $y_i$, $i = 1, \dots, N$; 3) estimate $b_2$
  - Repeat steps 1)-3) R times: the R values of $b_2$ allow assessment of the distribution of $b_2$

Multicollinearity

Multicollinearity
- OLS estimators $b = (X'X)^{-1}X'y$ for the regression coefficients $\beta$ require that the K×K matrix
  $X'X$ or $\sum_i x_i x_i'$
  can be inverted
- In real situations, regressors may be correlated, such as
  - experience and schooling (measured in years)
  - age and experience
  - inflation rate and nominal interest rate
  - common trends of economic time series, e.g., in lag structures
- Multicollinearity: between the explanatory variables there exists
  - an exact linear relationship, or
  - an approximate linear relationship

Multicollinearity: Consequences
- Approximate linear relationship between regressors:
- When correlations between regressors are high: hard to identify the individual impact of each of the regressors
- Inflated variances
  - If $x_k$ can be approximated by the other regressors, the variance of $b_k$ is inflated;
  - Smaller $t_k$-statistic, reduced power of the t-test
- Example: $y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \varepsilon_i$
  - with zero sample means, sample variances of $X_1$ and $X_2$ equal to 1, and correlation $r_{12}$,
    $V\{b\} = \frac{\sigma^2}{N}\begin{pmatrix} 1 & r_{12} \\ r_{12} & 1 \end{pmatrix}^{-1} = \frac{\sigma^2/N}{1 - r_{12}^2}\begin{pmatrix} 1 & -r_{12} \\ -r_{12} & 1 \end{pmatrix}$
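To get a feel for how quickly the variances inflate in the two-regressor example, the following sketch evaluates the standard result $V\{b_k\} = \sigma^2 / [N(1 - r_{12}^2)]$. It rests on assumptions stated here rather than in the slides: zero-mean regressors, sample variances computed with divisor N, and illustrative values $\sigma^2 = 1$ and N = 100. The function names are hypothetical.

```python
def two_regressor_cov(sigma2, n, r12):
    """Covariance matrix of (b1, b2) for y = beta1*x1 + beta2*x2 + eps.

    Assumes zero-mean regressors with unit sample variances (divisor N)
    and sample correlation r12, so that X'X = N * [[1, r12], [r12, 1]].
    """
    if not -1.0 < r12 < 1.0:
        raise ValueError("r12 must lie strictly between -1 and 1")
    scale = sigma2 / (n * (1.0 - r12 ** 2))
    return [[scale, -scale * r12], [-scale * r12, scale]]


def vif(r12):
    """Variance inflation factor 1/(1 - R_k^2); here R_k^2 = r12^2."""
    return 1.0 / (1.0 - r12 ** 2)


# Illustrative values only: sigma^2 = 1, N = 100
for r in (0.0, 0.5, 0.9, 0.99):
    var_b1 = two_regressor_cov(sigma2=1.0, n=100, r12=r)[0][0]
    print(f"r12={r:4.2f}  V{{b1}}={var_b1:.4f}  VIF={vif(r):.2f}")
```

With uncorrelated regressors the variance equals $\sigma^2/N$; as $r_{12}$ approaches 1 it grows without bound, which is the approximate-multicollinearity problem described above.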
Exact Multicollinearity
- Exact linear relationship between regressors:
- Example: wage equation
  - Regressors male and female in addition to the intercept
  - Regressor exper defined as exper = age - school - 6
- $\sum_i x_i x_i'$ is not invertible
- Econometric software reports an ill-defined matrix $\sum_i x_i x_i'$
- GRETL drops a regressor
- Remedy:
  - Exclude one of the regressors
  - Example: wage equation
    - Drop regressor female, use regressor male in addition to the intercept
    - Alternatively: use female and the intercept
    - Not good: use of both male and female

Variance Inflation Factor
- Variance of $b_k$:
  $V\{b_k\} = \frac{\sigma^2}{(1 - R_k^2)\sum_i (x_{ik} - \bar{x}_k)^2}$
  $R_k^2$: $R^2$ of the regression of $x_k$ on all other regressors
- If $x_k$ can be approximated by the other regressors, $R_k^2$ is close to 1 and the variance is inflated
- Variance inflation factor: $VIF(b_k) = (1 - R_k^2)^{-1}$
- Large values for some or all VIFs indicate multicollinearity
- Attention! A large variance of $b_k$ can also have other causes
  - Small variance of $X_k$
  - Small number N of observations

Other Indicators
- Large values for some or all variance inflation factors $VIF(b_k)$ are an indicator of multicollinearity
- Other indicators:
  - At least one of the $R_k^2$, $k = 1, \dots, K$, has a large value
  - Large standard errors $se(b_k)$ (low t-statistics), but reasonable or good $R^2$ and F-statistics
  - Effect of adding a regressor on the standard errors $se(b_k)$ of the estimates $b_k$ of regressors already in the model: increasing values of $se(b_k)$ indicate multicollinearity

Prediction

The Predictor
- Given the relation $y_i = x_i'\beta + \varepsilon_i$
- Given the estimators b, the predictor for Y at $x_0$, i.e., for $y_0 = x_0'\beta + \varepsilon_0$, is $\hat{y}_0 = x_0'b$
- Prediction error: $f_0 = \hat{y}_0 - y_0 = x_0'(b - \beta) - \varepsilon_0$
- Some properties of $\hat{y}_0$:
  - Under assumptions (A1) and (A2), $E\{b\} = \beta$ and $\hat{y}_0$ is an unbiased predictor
  - Variance of $\hat{y}_0$:
    $V\{\hat{y}_0\} = V\{x_0'b\} = x_0' V\{b\} x_0 = \sigma^2 x_0'(X'X)^{-1}x_0$
  - Variance of the prediction error $f_0$:
    $V\{f_0\} = V\{x_0'(b - \beta) - \varepsilon_0\} = \sigma^2 + \sigma^2 x_0'(X'X)^{-1}x_0 = \sigma_{f_0}^2$,
    given that $\varepsilon_0$ and b are uncorrelated
  - $100\gamma$% prediction interval, with $s_{f_0}$ the estimate of $\sigma_{f_0}$ obtained by replacing $\sigma^2$ with $s^2$:
    $\hat{y}_0 - z_{(1+\gamma)/2}\, s_{f_0} \le y_0 \le \hat{y}_0 + z_{(1+\gamma)/2}\, s_{f_0}$

Example: Simple Regression
- Given the relation $y_i = \beta_1 + x_i \beta_2 + \varepsilon_i$
- Predictor for Y at $x_0$, i.e., for $y_0 = \beta_1 + x_0 \beta_2 + \varepsilon_0$:
  $\hat{y}_0 = b_1 + x_0 b_2$
- Variance of the prediction error:
  $V\{f_0\} = \sigma^2 \left[1 + \frac{1}{N} + \frac{(x_0 - \bar{x})^2}{\sum_i (x_i - \bar{x})^2}\right]$
- Figure: prediction intervals for various values of x

Your Homework
1. For Verbeek's data set "WAGES", use GRETL (a) to estimate a linear regression model with intercept for WAGES p.h. with explanatory variables MALE, SCHOOL, and AGE; (b) interpret the coefficients of the model; (c) test the hypothesis that men and women, on average, have the same wage p.h., against the alternative that women earn less; (d) calculate a 95% confidence interval for the wage difference between males and females.
2. Generate a variable EXPER_B by adding the Binomial random variable BE ~ B(2, 0.05) to EXPER; (a) estimate two linear regression models with intercept for WAGES p.h. with explanatory variables (i) MALE, SCHOOL, EXPER and AGE, and (ii) MALE, SCHOOL, EXPER_B and AGE; compare the R² of the models; (b) compare the VIFs for the variables of the two models.
3. Show for a linear regression with intercept that [equation missing in the source]
4. Show that the F-test based on
   $F = \frac{(R_1^2 - R_0^2)/J}{(1 - R_1^2)/(N-K)}$
   and the F-test based on
   $F = \frac{(S_0 - S_1)/J}{S_1/(N-K)}$
   are identical.
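As a numerical companion to the last exercise (it does not replace the algebraic argument), the sketch below evaluates the two forms of the F statistic used in this lecture with the wage-equation figures quoted in the slides: N = 3294, K = 4, J = 2, S0 = 34076.92, S1 = 30527.87. The total sum of squares is an assumption derived here from the standard deviation of WAGE in the GRETL output, TSS = (N - 1) · 3.269186². Function names are hypothetical.

```python
def f_from_ssr(s0, s1, j, n, k):
    """F statistic from restricted (s0) and unrestricted (s1) sums of squared residuals."""
    return ((s0 - s1) / j) / (s1 / (n - k))


def f_from_r2(r2_0, r2_1, j, n, k):
    """F statistic from the R^2 of the restricted (r2_0) and unrestricted (r2_1) model."""
    return ((r2_1 - r2_0) / j) / ((1.0 - r2_1) / (n - k))


N, K, J = 3294, 4, 2
S0, S1 = 34076.92, 30527.87   # sums of squared residuals from the slides
SD_WAGE = 3.269186            # S.D. of the dependent variable (GRETL output)
TSS = (N - 1) * SD_WAGE ** 2  # assumption: total sum of squares recovered from the S.D.

R2_0 = 1.0 - S0 / TSS         # R^2 of the restricted model (regressor: male)
R2_1 = 1.0 - S1 / TSS         # R^2 of the unrestricted model

F_SSR = f_from_ssr(S0, S1, J, N, K)
F_R2 = f_from_r2(R2_0, R2_1, J, N, K)
print(f"R0^2={R2_0:.4f}, R1^2={R2_1:.4f}")
print(f"F from sums of squares: {F_SSR:.2f}")
print(f"F from R^2 values:      {F_R2:.2f}")
```

Both routes give the same value up to floating-point error because $R^2 = 1 - S/\mathrm{TSS}$ with the same TSS in both models. Plugging the rounded values 0.0317 and 0.1326 into the R² form gives a slightly different number, which is a rounding effect only.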