LECTURE 5
Introduction to Econometrics
Multiple Regression Analysis: Hypothesis Testing II
Hieu Nguyen
Fall semester, 2024

REVISION: The Classical Linear Model (CLM) Assumptions
1. Linearity: the regression model is linear in the parameters (coefficients).
2. Random sampling: the data are a random sample drawn from the population, and each data point follows the population equation.
3. No perfect collinearity: the values of the explanatory variables are not all the same, and no explanatory variable is a perfect linear function of the other explanatory variables.
4. Zero conditional mean: the values of the explanatory variables must contain no information about the mean of the unobserved factors, i.e. the explanatory variables are uncorrelated with the error term.
5. Homoskedasticity: the error term has a constant variance.
6. Normality of the error term: the error term is normally distributed.

TODAY'S LECTURE
• We continue our discussion of how hypotheses about coefficients can be tested in regression models.
• We will explain what the significance of coefficients means.
• We will learn how to read regression output.
• Readings: Wooldridge Chapter 4; Studenmund Chapters 5.1-5.4.

INFERENCE: The t Test
• "Statistically significant" variables in a regression
  - If a regression coefficient is different from zero in a two-sided test, the corresponding variable is said to be "statistically significant".
  - If the number of degrees of freedom is large enough for the normal approximation to apply, the following rules of thumb hold:
      |t| > 1.645: "statistically significant at the 10% level"
      |t| > 1.96:  "statistically significant at the 5% level"
      |t| > 2.576: "statistically significant at the 1% level"

INFERENCE: The t Test
• Guidelines for discussing economic and statistical significance
  - If a variable is statistically significant, discuss the magnitude of the coefficient to get an idea of its economic or practical importance.
  - The fact that a coefficient is statistically significant does not necessarily mean it is economically or practically significant!
  - If a variable is statistically and economically important but has the "wrong" sign, the regression model might be misspecified.
  - If a variable is statistically insignificant at the usual levels (10%, 5%, 1%), one may think of dropping it from the regression.
  - If the sample size is small, effects might be imprecisely estimated, so the case for dropping insignificant variables is less strong.

INFERENCE: The t Test
• Testing more general hypotheses about a regression coefficient
  - Null hypothesis: H0: βj = aj, where aj is the hypothesized value of the coefficient.
  - t-statistic: t = (β̂j − aj) / se(β̂j)
  - The test works exactly as before, except that the hypothesized value is subtracted from the estimate when forming the statistic.

INFERENCE: The t Test
• Example: campus crime and enrollment
  - An interesting hypothesis is whether crime increases by one percent if enrollment is increased by one percent, i.e. whether the coefficient on enrollment equals one.
  - The estimate is different from one, but is this difference statistically significant?
  - [Regression output not reproduced.] The hypothesis is rejected at the 5% level.
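To make the mechanics concrete, here is a minimal sketch in Python of this t test against a hypothesized value. All inputs (estimate, standard error, degrees of freedom) are made-up placeholders, not the numbers from the campus-crime regression; only the formula t = (β̂j − aj)/se(β̂j) and the rules of thumb come from the slides.

```python
# Minimal sketch: t test of H0: beta_j = a_j against a two-sided alternative.
# The inputs are hypothetical placeholders, not real regression output.
from scipy import stats

beta_hat = 1.15   # hypothetical coefficient estimate
se_beta  = 0.08   # hypothetical standard error
a_j      = 1.0    # hypothesized value under H0
df       = 95     # hypothetical degrees of freedom, n - k - 1

t_stat = (beta_hat - a_j) / se_beta        # hypothesized value is subtracted from the estimate
p_val  = 2 * stats.t.sf(abs(t_stat), df)   # two-sided p-value

# Rules of thumb for large df: |t| > 1.645 (10%), |t| > 1.96 (5%), |t| > 2.576 (1%)
crit_5pct = stats.t.ppf(0.975, df)         # exact two-sided 5% critical value
print(f"t = {t_stat:.2f}, p-value = {p_val:.3f}, reject at 5%: {abs(t_stat) > crit_5pct}")
```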
INFERENCE: The t Test
• Computing p-values for t tests
  - If the significance level is made smaller and smaller, there will be a point where the null hypothesis can no longer be rejected.
  - The reason is that, by lowering the significance level, one wants to avoid more and more the error of rejecting a correct H0.
  - The smallest significance level at which the null hypothesis is still rejected is called the p-value of the hypothesis test.
  - A small p-value is evidence against the null hypothesis, because one would reject the null hypothesis even at small significance levels.
  - A large p-value is evidence in favor of the null hypothesis.
  - P-values are more informative than tests at fixed significance levels.

INFERENCE: The t Test
• How the p-value is computed (here: two-sided test)
  - The p-value is the significance level at which one is indifferent between rejecting and not rejecting the null hypothesis.
  - In the two-sided case, the p-value is thus the probability that the t-distributed variable takes on a larger absolute value than the realized value of the test statistic:
      p-value = P(|T(n−k−1)| > |t|)
  - From this it is clear that a null hypothesis is rejected if and only if the corresponding p-value is smaller than the significance level.
  - In the example shown on the slide, the t-statistic would not lie in the rejection region for a significance level of 5%.
  - [Figure not reproduced: t distribution with the value of the test statistic and the critical values for a 5% significance level marked.]

INFERENCE: Confidence Intervals
• Confidence intervals
  - Simple manipulation of the result in Theorem 4.2 implies that the interval
      [β̂j − c · se(β̂j), β̂j + c · se(β̂j)]
    is a confidence interval for βj, where c is the critical value of the two-sided test at the significance level corresponding to the chosen confidence level.
  - The lower bound of the confidence interval is β̂j − c · se(β̂j); the upper bound is β̂j + c · se(β̂j).
• Interpretation of the confidence interval
  - The bounds of the interval are random.
  - In repeated samples, the interval constructed in this way will cover the population regression coefficient in 95% of the cases (for a 95% confidence level).

INFERENCE: Confidence Intervals
• Confidence intervals for typical confidence levels
  - For large degrees of freedom, the rules of thumb for the critical value apply: c ≈ 1.645 for a 90% interval, c ≈ 1.96 for a 95% interval, and c ≈ 2.576 for a 99% interval.
• Relationship between confidence intervals and hypothesis tests
  - H0: βj = aj is rejected in favor of the two-sided alternative H1: βj ≠ aj at the 5% level exactly when aj lies outside the 95% confidence interval.

INFERENCE: Confidence Intervals
• Example: model of firms' R&D expenditures
  - Spending on R&D is regressed on annual sales and on profits as a percentage of sales (the profit margin). [Regression output not reproduced.]
  - The effect of sales on R&D is relatively precisely estimated, as the interval is narrow. Moreover, the effect is significantly different from zero, because zero is outside the interval.
  - The effect of the profit margin (point estimate 0.0217, standard error 0.0128) is imprecisely estimated, as the interval is very wide. It is not even statistically significant, because zero lies in the interval.
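The following is a minimal sketch of how such a confidence interval β̂j ± c·se(β̂j) can be computed, and of its link to the two-sided t test. The inputs are hypothetical placeholders, not the R&D regression output.

```python
# Minimal sketch: confidence interval for a coefficient and the equivalent two-sided test.
# The inputs are hypothetical placeholders, not real regression output.
from scipy import stats

beta_hat = 0.32   # hypothetical coefficient estimate
se_beta  = 0.10   # hypothetical standard error
df       = 29     # hypothetical degrees of freedom, n - k - 1
conf     = 0.95   # confidence level

c = stats.t.ppf(1 - (1 - conf) / 2, df)   # critical value of the two-sided test
lower = beta_hat - c * se_beta            # lower bound of the confidence interval
upper = beta_hat + c * se_beta            # upper bound of the confidence interval
print(f"{conf:.0%} CI: [{lower:.3f}, {upper:.3f}]")

# H0: beta_j = a_j is rejected at the 5% level exactly when a_j lies outside the 95% CI.
a_j = 0.0
print("reject H0: beta_j = 0 at the 5% level:", not (lower <= a_j <= upper))
```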
INFERENCE: Testing Hypotheses About a Linear Combination of Parameters
• Example: return to education at 2-year vs. 4-year colleges
  - The regression includes years of education at 2-year colleges (coefficient β1) and years of education at 4-year colleges (coefficient β2) as explanatory variables. [Regression output not reproduced.]
  - Test H0: β1 = β2 against H1: β1 < β2. A possible test statistic would be:
      t = (β̂1 − β̂2) / se(β̂1 − β̂2)
  - The difference between the estimates is normalized by the estimated standard deviation of the difference. The null hypothesis has to be rejected if the statistic is "too negative" to believe that the true difference between the parameters is equal to zero.

INFERENCE: Testing Hypotheses About a Linear Combination of Parameters
• Impossible to compute with standard regression output, because
  - se(β̂1 − β̂2) depends on the covariance between the two estimates, which is usually not available in regression output.
• Alternative method
  - Define θ = β1 − β2 and test H0: θ = 0 against H1: θ < 0.
  - Substituting β1 = θ + β2 into the original regression makes θ the coefficient on years of education at 2-year colleges and makes β2 the coefficient on a new regressor: total years of college (= years at 2-year colleges + years at 4-year colleges).

INFERENCE: Testing Hypotheses About a Linear Combination of Parameters
• Estimation results
  - [Output of the reparameterized regression, which includes the new regressor "total years of college", not reproduced.]
  - The standard error of θ̂ can now be read directly from the regression output, and the usual t test applies.
  - The hypothesis is rejected at the 10% level but not at the 5% level.
• This method always works for single linear hypotheses.

TESTING MULTIPLE HYPOTHESES
• Suppose we have a model yi = β0 + β1xi1 + β2xi2 + β3xi3 + εi.
• Suppose we want to test multiple linear hypotheses in this model.
• For example, we want to see whether the following restrictions on the coefficients hold jointly: β1 + β2 = 1 and β3 = 0.
• We cannot use a t test in this case (a t test can be used only for one hypothesis at a time).
• We will use an F test.

RESTRICTED VS. UNRESTRICTED MODEL
• We can reformulate the model by plugging in the restrictions as if they were true (the model under H0).
• We call this model the restricted model, as opposed to the unrestricted model.
• The unrestricted model is yi = β0 + β1xi1 + β2xi2 + β3xi3 + εi.
• The restricted model can be derived to have the following form:
    y∗i = β0 + β1x∗i + εi, where y∗i = yi − xi2 and x∗i = xi1 − xi2.

IDEA OF THE F-TEST
• If the restrictions are true, the restricted model fits the data in (nearly) the same way as the unrestricted model: the residuals are nearly the same.
• If the restrictions are false, the restricted model fits the data poorly: the residuals from the restricted model are much larger than those from the unrestricted model.
• The idea is thus to compare the residuals from the two models.

IDEA OF THE F-TEST
• How to compare the residuals in the two models?
  - Calculate the sum of squared residuals in the two models.
  - Test whether the difference between the two sums is (statistically) equal to zero.
  - H0: the difference is zero (the residuals in the two models are the same, the restrictions hold).
  - HA: the difference is positive (the residuals in the restricted model are bigger, the restrictions do not hold).
• Sum of squared residuals: SSR = Σ ei², where the ei are the residuals of the model.

F-TEST
• The test statistic is defined as
    F = [(SSRr − SSRur)/q] / [SSRur/(n − k − 1)] ∼ F(q, n−k−1),
  where:
    SSRr . . . sum of squared residuals from the restricted model
    SSRur . . . sum of squared residuals from the unrestricted model
    q . . . number of restrictions
    n . . . number of observations
    k . . . number of explanatory variables in the unrestricted model (so that n − k − 1 is its number of degrees of freedom)

INFERENCE: The F Test
• Testing multiple linear restrictions: the F test
• Testing exclusion restrictions
  - Example: the salary of a major league baseball player is regressed on years in the league, the average number of games per year, the batting average, home runs per year, and runs batted in per year. [Regression equation not reproduced.]
  - Test H0: the coefficients on the three performance measures (batting average, home runs per year, runs batted in per year) are all zero, against the alternative that at least one of them is nonzero.
  - That is, test whether the performance measures have no effect and can be excluded from the regression.

INFERENCE: The F Test
• Estimation of the unrestricted model
  - [Regression output not reproduced.] None of the three performance variables is statistically significant when tested individually.
  - Idea: how would the model fit be if these variables were dropped from the regression?

INFERENCE: The F Test
• Estimation of the restricted model
  - [Regression output not reproduced.] The sum of squared residuals necessarily increases, but is the increase statistically significant?
• Test statistic
  - F = [(SSRr − SSRur)/q] / [SSRur/(n − k − 1)], with q the number of restrictions.
  - The relative increase of the sum of squared residuals when going from H1 to H0 follows an F distribution (if the null hypothesis H0 is correct).
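Given the two sums of squared residuals, the F statistic defined above is straightforward to compute. Below is a minimal sketch with hypothetical numbers (they are not the values from the baseball-salary regressions); q = 3 restrictions mirrors the exclusion test described above.

```python
# Minimal sketch: F test of q restrictions from the restricted and unrestricted SSRs.
# All numbers are hypothetical placeholders, not the baseball-salary output.
from scipy import stats

ssr_r  = 135.0        # hypothetical SSR of the restricted model (always >= the unrestricted SSR)
ssr_ur = 120.0        # hypothetical SSR of the unrestricted model
n, k, q = 200, 5, 3   # hypothetical: observations, regressors in the unrestricted model, restrictions

df_denom = n - k - 1
F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df_denom)   # F = [(SSR_r - SSR_ur)/q] / [SSR_ur/(n-k-1)]
p_val = stats.f.sf(F, q, df_denom)                 # P(F(q, n-k-1) > F)
crit_5pct = stats.f.ppf(0.95, q, df_denom)         # 5% critical value

print(f"F = {F:.2f}, 5% critical value = {crit_5pct:.2f}, p-value = {p_val:.4f}")
```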
INFERENCE: The F Test
• Rejection rule
  - An F-distributed variable only takes on positive values. This corresponds to the fact that the sum of squared residuals can only increase when one moves from H1 to H0.
  - Choose the critical value so that the null hypothesis is rejected in, for example, 5% of the cases in which it is true.

INFERENCE: The F Test
• Test decision in the example
  - With the number of restrictions to be tested and the degrees of freedom of the unrestricted model, the null hypothesis is overwhelmingly rejected (even at very small significance levels). [Numerical values not reproduced.]
• Discussion
  - The three variables are "jointly significant".
  - They were not significant when tested individually.
  - The likely reason is multicollinearity between them.

INFERENCE: The F Test
• Test of overall significance of a regression
  - The null hypothesis states that the explanatory variables are not useful at all in explaining the dependent variable: H0: β1 = β2 = . . . = βk = 0.
  - The restricted model is a regression on a constant only; in this case the F statistic can also be written as F = [R²/k] / [(1 − R²)/(n − k − 1)].
  - The test of overall significance is reported in most regression packages; the null hypothesis is usually overwhelmingly rejected.

GOODNESS OF FIT MEASURE
• We know that education and experience have a significant influence on wages.
• But how important are they in determining wages?
• How much of the difference in wages between people is explained by differences in education and in experience?
• How well does variation in the independent variable(s) explain variation in the dependent variable?
• These are the questions answered by the goodness-of-fit measure, R².

TOTAL AND EXPLAINED VARIATION
• Total variation in the dependent variable: Σ (yi − ȳ)².
• Predicted value of the dependent variable = the part that is explained by the independent variables: ŷi = β̂0 + β̂1xi (case of a regression line, for simplicity of notation).
• Explained variation in the dependent variable: Σ (ŷi − ȳ)².

GOODNESS OF FIT - R²
• Denote:
    SST = Σ (yi − ȳ)² . . . total sum of squares (total variation in y)
    SSE = Σ (ŷi − ȳ)² . . . explained sum of squares (explained variation in y)
    SSR = Σ ei² . . . sum of squared residuals (unexplained variation in y)
• Define the measure of the goodness of fit:
    R² = SSE / SST = explained variation in y / total variation in y

GOODNESS OF FIT - R²
• In all models: 0 ≤ R² ≤ 1.
• R² tells us what percentage of the total variation in the dependent variable is explained by the variation in the independent variable(s): R² = 0.3 means that the independent variables can explain 30% of the variation in the dependent variable.
• A higher R² means a better fit of the regression model (not necessarily a better model!).

DECOMPOSING THE VARIANCE
• For models with an intercept, R² can be rewritten using the decomposition of variance.
• Variance decomposition: yi − ȳ = (ŷi − ȳ) + ei, which implies SST = SSE + SSR.

VARIANCE DECOMPOSITION AND R²
• Variance decomposition: SST = SSE + SSR.
• Intuition: the total variation can be divided between the explained variation and the unexplained variation; the residual ei is the unexplained part.
• We can rewrite R²:
    R² = SSE / SST = (SST − SSR) / SST = 1 − SSR / SST
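As a quick numerical check of the decomposition SST = SSE + SSR and of R² = 1 − SSR/SST, here is a minimal sketch that fits a regression line to artificial data using numpy (all data are simulated purely for illustration).

```python
# Minimal sketch: variance decomposition SST = SSE + SSR and R^2 on artificial data.
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.normal(size=n)            # simulated data for illustration

X = np.column_stack([np.ones(n), x])              # regressors including an intercept
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS estimates
y_hat = X @ beta_hat                              # predicted values
e = y - y_hat                                     # residuals

SST = np.sum((y - y.mean()) ** 2)                 # total variation
SSE = np.sum((y_hat - y.mean()) ** 2)             # explained variation
SSR = np.sum(e ** 2)                              # unexplained variation

print("SST = SSE + SSR:", np.isclose(SST, SSE + SSR))   # holds because the model has an intercept
print("R^2 =", SSE / SST, "= 1 - SSR/SST =", 1 - SSR / SST)
```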
ADJUSTED R²
• The sum of squared residuals (SSR) decreases when additional explanatory variables are introduced into the model, whereas the total sum of squares (SST) remains the same.
• R² = 1 − SSR/SST therefore increases if we add explanatory variables: models with more variables automatically have a better fit.
• To deal with this problem, we define the adjusted R²:
    R²adj = 1 − [SSR/(n − k − 1)] / [SST/(n − 1)] ≤ R²
  (k is the number of explanatory variables, i.e. of estimated slope coefficients).
• This measure introduces a "punishment" for including more explanatory variables.

FOUR IMPORTANT SPECIFICATION CRITERIA
Does a variable belong in the equation?
1. Theory: Is the variable's place in the equation unambiguous and theoretically sound? Does intuition tell you it should be included?
2. t-test: Is the variable's estimated coefficient significant in the expected direction?
3. R²: Does the overall fit of the equation improve (enough) when the variable is added to the equation?
4. Bias: Do other variables' coefficients change significantly when the variable is added to the equation?

FOUR IMPORTANT SPECIFICATION CRITERIA
• If all conditions hold, the variable belongs in the equation.
• If none of them holds, the variable is irrelevant and can be safely excluded.
• If the criteria give contradictory answers, most importance should be attributed to theoretical justification. Therefore, if theory (intuition) says that a variable belongs in the equation, we include it (even though its coefficient might be insignificant!).
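To illustrate the adjusted-R² penalty and specification criterion 3, the following minimal sketch adds a pure-noise regressor to a simulated regression and compares R² with adjusted R². It assumes the statsmodels package is available; all data are artificial.

```python
# Minimal sketch: R^2 never falls when a regressor is added, but adjusted R^2 can.
# Artificial data; assumes numpy and statsmodels are installed.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
noise_reg = rng.normal(size=n)            # irrelevant regressor, unrelated to y
y = 2.0 + 0.7 * x1 + rng.normal(size=n)

small = sm.OLS(y, sm.add_constant(x1)).fit()
big = sm.OLS(y, sm.add_constant(np.column_stack([x1, noise_reg]))).fit()

print("R^2:     ", round(small.rsquared, 4), "->", round(big.rsquared, 4))           # weakly increases
print("adj R^2: ", round(small.rsquared_adj, 4), "->", round(big.rsquared_adj, 4))   # penalized for the extra regressor
```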