Introduction to econometrics
IV. Multiple linear regression model

Introduction to econometrics (INEC) IV. Multiple regression model Autumn 2011

Content
1 Basic results
2 Choice of explanatory variables
3 Hypothesis testing
   F-test
   Likelihood ratio tests
4 Other issues

Introduction
The multiple LRM is discussed in more detail.
Some proofs are given as an illustration.
Hypothesis testing – extended methods.

1 Basic results

Classical assumptions
1. $E(\varepsilon_i) = 0$.
2. $\mathrm{var}(\varepsilon_i) = E(\varepsilon_i^2) = \sigma^2$.
3. $\mathrm{cov}(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$.
4. $\varepsilon_i$ is Normally distributed.
5. $X_{1i}, \ldots, X_{ki}$ are fixed (non-random) variables.

Parameter estimates – two explanatory variables
Model: $Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$.
Minimizing the SSR gives:
\[ \hat\beta_1 = \frac{\left(\sum x_{1i} y_i\right)\left(\sum x_{2i}^2\right) - \left(\sum x_{2i} y_i\right)\left(\sum x_{1i} x_{2i}\right)}{\left(\sum x_{1i}^2\right)\left(\sum x_{2i}^2\right) - \left(\sum x_{1i} x_{2i}\right)^2}, \]
\[ \hat\beta_2 = \frac{\left(\sum x_{2i} y_i\right)\left(\sum x_{1i}^2\right) - \left(\sum x_{1i} y_i\right)\left(\sum x_{1i} x_{2i}\right)}{\left(\sum x_{1i}^2\right)\left(\sum x_{2i}^2\right) - \left(\sum x_{1i} x_{2i}\right)^2}, \]
\[ \hat\alpha = \bar Y - \hat\beta_1 \bar X_1 - \hat\beta_2 \bar X_2, \]
where $y_i = Y_i - \bar Y$, $x_{1i} = X_{1i} - \bar X_1$, $x_{2i} = X_{2i} - \bar X_2$.

OLS estimate – variance of the error terms
Unbiased estimator of $\sigma^2$:
\[ s^2 = \frac{\sum \hat\varepsilon_i^2}{N - k - 1}, \]
where $\hat\varepsilon_i = Y_i - \hat\alpha - \hat\beta_1 X_{1i} - \ldots - \hat\beta_k X_{ki}$ are the OLS residuals.

Estimating the variance of the parameter estimates – two regressors
Case $k = 2$:
\[ \mathrm{var}(\hat\beta_1) = \frac{\sigma^2}{(1 - r^2) \sum x_{1i}^2}, \qquad \mathrm{var}(\hat\beta_2) = \frac{\sigma^2}{(1 - r^2) \sum x_{2i}^2}, \]
where $r$ is the (sample) correlation coefficient between $X_1$ and $X_2$.
In practice $\sigma^2$ is replaced by its estimate $s^2$.
Useful for hypothesis testing.
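The two-regressor formulas above can be checked numerically. The following is a minimal sketch in plain Python, not part of the original slides: the function name `ols_two_regressors` and the data are made up for illustration, and the variance formulas use $s^2$ in place of the unknown $\sigma^2$.

```python
def ols_two_regressors(Y, X1, X2):
    """OLS for Y = alpha + beta1*X1 + beta2*X2 + eps via deviations from means."""
    n = len(Y)
    my, m1, m2 = sum(Y) / n, sum(X1) / n, sum(X2) / n
    y = [v - my for v in Y]
    x1 = [v - m1 for v in X1]
    x2 = [v - m2 for v in X2]
    s11 = sum(a * a for a in x1)
    s22 = sum(a * a for a in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * b for a, b in zip(x1, y))
    s2y = sum(a * b for a, b in zip(x2, y))
    det = s11 * s22 - s12 ** 2          # zero under perfect collinearity
    beta1 = (s1y * s22 - s2y * s12) / det
    beta2 = (s2y * s11 - s1y * s12) / det
    alpha = my - beta1 * m1 - beta2 * m2
    resid = [Yi - alpha - beta1 * u - beta2 * v for Yi, u, v in zip(Y, X1, X2)]
    ssr = sum(e * e for e in resid)
    s2 = ssr / (n - 2 - 1)              # unbiased estimate of sigma^2, k = 2
    r2 = s12 ** 2 / (s11 * s22)         # squared sample correlation of X1 and X2
    return {
        "alpha": alpha, "beta1": beta1, "beta2": beta2, "s2": s2,
        "var_beta1": s2 / ((1 - r2) * s11),
        "var_beta2": s2 / ((1 - r2) * s22),
    }

# Hypothetical data, chosen only for illustration.
X1 = [1, 2, 3, 4, 5, 6]
X2 = [2, 1, 4, 3, 6, 5]
Y = [4.3, 5.1, 9.4, 10.2, 14.3, 15.6]
fit = ols_two_regressors(Y, X1, X2)
print(fit)
```

The denominator `det` makes the role of collinearity visible: as the sample correlation between the regressors approaches one in absolute value, `det` approaches zero and the estimated variances grow.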
Test of parameter significance (assuming $\sigma^2$ is known)
Koop (2008), p. 94.
In practice $\sigma^2$ is not known ⇒ t-test.

Measure of model fit
Coefficient of determination:
\[ R^2 = 1 - \frac{SSR}{TSS} = 1 - \frac{\sum \hat\varepsilon_i^2}{\sum \left(Y_i - \bar Y\right)^2}. \]
Adding new explanatory variables never decreases $R^2$ (in practice it almost always increases it).
Adjusted $R^2$, denoted $\bar R^2$:
\[ \bar R^2 = 1 - \frac{SSR / (N - k - 1)}{TSS / (N - 1)} = 1 - \frac{s^2}{\frac{1}{N-1} \sum \left(Y_i - \bar Y\right)^2}. \]

2 Choice of explanatory variables

Omitted variable bias I
True model: $Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$.
OLS estimate:
\[ \hat\beta_1 = \frac{\left(\sum x_{1i} y_i\right)\left(\sum x_{2i}^2\right) - \left(\sum x_{2i} y_i\right)\left(\sum x_{1i} x_{2i}\right)}{\left(\sum x_{1i}^2\right)\left(\sum x_{2i}^2\right) - \left(\sum x_{1i} x_{2i}\right)^2}. \]
Lower-case letters denote deviations from means.

Omitted variable bias II
Estimated model: $Y_i = \alpha + \beta_1 X_{1i} + \varepsilon_i$.
Parameter estimate of $\beta_1$:
\[ \tilde\beta_1 = \frac{\sum x_{1i} y_i}{\sum x_{1i}^2}. \]
$\tilde\beta_1$ is biased.

Omitted variable bias – proof
It may be shown that
\[ E(\tilde\beta_1) = E\left( \beta_1 + \beta_2 \frac{\sum x_{1i} x_{2i}}{\sum x_{1i}^2} + \frac{\sum x_{1i} (\varepsilon_i - \bar\varepsilon)}{\sum x_{1i}^2} \right) = \beta_1 + \beta_2 \frac{\sum x_{1i} x_{2i}}{\sum x_{1i}^2}. \]
$\tilde\beta_1$ is biased.

Omitted variable bias – comments
There is no bias if $\beta_2 = 0$ or $\frac{\sum x_{1i} x_{2i}}{\sum x_{1i}^2} = 0$.
If $\beta_2 = 0$, then $X_2$ is not an omitted variable.
$\frac{\sum x_{1i} x_{2i}}{\sum x_{1i}^2}$ is connected with the correlation between $X_1$ and $X_2$ (denoted by $r$).
Bias does not arise if the omitted variable is uncorrelated with the included variable.
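The bias formula can be illustrated with noise-free data, where the error term drops out and the short-regression slope equals $\beta_1 + \beta_2 \sum x_{1i} x_{2i} / \sum x_{1i}^2$ exactly. The sketch below is an illustration under that assumption ($\varepsilon_i = 0$ for all $i$); the helper `slope` and all numbers are hypothetical.

```python
def slope(x, y):
    """Simple-regression slope sum(x_i*y_i)/sum(x_i^2), in deviations from means."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Assumed "true" parameters and made-up regressors (no error term).
alpha, beta1, beta2 = 1.0, 2.0, 3.0
X1 = [1, 2, 3, 4, 5, 6]
X2_corr = [2, 1, 4, 3, 6, 5]      # correlated with X1
X2_uncorr = [1, 2, 3, 3, 2, 1]    # sample covariance with X1 is zero

for label, X2 in (("correlated", X2_corr), ("uncorrelated", X2_uncorr)):
    Y = [alpha + beta1 * a + beta2 * b for a, b in zip(X1, X2)]
    beta1_tilde = slope(X1, Y)            # X2 omitted from the regression
    bias_term = beta2 * slope(X1, X2)     # beta2 * sum(x1*x2) / sum(x1^2)
    print(label, "beta1_tilde =", beta1_tilde, "bias term =", bias_term)
```

With the correlated regressor the omitted-variable slope overshoots $\beta_1$ by the bias term; with the uncorrelated regressor the bias term is zero and the slope recovers $\beta_1$.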
Inclusion of irrelevant explanatory variables
True model: $Y_i = \alpha + \beta_1 X_{1i} + \varepsilon_i$.
Incorrect specification: $Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$.
Estimator from the incorrect specification:
\[ \tilde\beta_1 = \frac{\left(\sum x_{1i} y_i\right)\left(\sum x_{2i}^2\right) - \left(\sum x_{2i} y_i\right)\left(\sum x_{1i} x_{2i}\right)}{\left(\sum x_{1i}^2\right)\left(\sum x_{2i}^2\right) - \left(\sum x_{1i} x_{2i}\right)^2}. \]
Correct estimator:
\[ \hat\beta_1 = \frac{\sum x_{1i} y_i}{\sum x_{1i}^2}. \]
$\tilde\beta_1$ is unbiased, so by the Gauss-Markov theorem $\mathrm{var}(\tilde\beta_1) > \mathrm{var}(\hat\beta_1)$ (whenever $X_1$ and $X_2$ are correlated in the sample).
Including irrelevant variables leads to less precise estimates.

Multicollinearity
High or perfect correlation among explanatory variables.
The OLS estimator has difficulty estimating the separate marginal effects.
Two explanatory variables:
\[ \mathrm{var}(\hat\beta_1) = \frac{\sigma^2}{(1 - r^2) \sum x_{1i}^2}, \qquad \mathrm{var}(\hat\beta_2) = \frac{\sigma^2}{(1 - r^2) \sum x_{2i}^2}. \]
These variances are used in hypothesis testing.
High multicollinearity ($r^2$ close to 1) → large variances → small t-statistics, wide confidence intervals.

3 Hypothesis testing

Introduction
General model: $Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \ldots + \beta_k X_{ki} + \varepsilon_i$.
Hypotheses involving several parameters.
F-tests and likelihood ratio tests.

F-test

Basic test
The test of $R^2 = 0$ corresponds to the hypothesis
\[ H_0: \beta_1 = \ldots = \beta_k = 0. \]
This is not the same as testing the $k$ individual hypotheses $H_0: \beta_1 = 0$, $H_0: \beta_2 = 0$, through $H_0: \beta_k = 0$.
F-statistic for a model with $k$ explanatory variables and an intercept:
\[ F = \frac{R^2}{1 - R^2} \cdot \frac{N - k - 1}{k}. \]
Assuming the null hypothesis is true, the F-statistic is distributed as $F_{k, N-k-1}$.

General tests
Unrestricted model: $Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i$.
For example: $H_0: \beta_1 = \beta_2 = 0$.
Any linear restrictions can be included: $a\beta_1 + b\beta_2 + c\beta_3 = d$ for some constants $a$, $b$, $c$ and $d$.
Restricted model (under $H_0: \beta_1 = \beta_2 = 0$): $Y_i = \alpha + \beta_3 X_{3i} + \varepsilon_i$.

General tests – examples
Hypothesis: $H_0: \beta_1 = 0,\ \beta_2 + \beta_3 = 1$.
The second restriction may be written as $\beta_2 = 1 - \beta_3$.
Substituting both restrictions into the unrestricted model gives the restricted model:
\[ Y_i - X_{2i} = \alpha + \beta_3 (X_{3i} - X_{2i}) + \varepsilon_i. \]
This is a simple LRM with dependent variable $Y_i - X_{2i}$, an intercept and the explanatory variable $(X_{3i} - X_{2i})$.

General tests – F-test
For linear restrictions, the test statistic is
\[ F = \frac{(SSR_R - SSR_{UR}) / q}{SSR_{UR} / (N - k - 1)}. \]
SSR is the sum of squared residuals; the subscripts denote the unrestricted (UR) and restricted (R) models.
The number of restrictions is $q$.
Intuition: "big" values of $F$ suggest that $H_0$ is not correct.
Under $H_0$, $F$ is distributed as $F_{q, N-k-1}$.
F-statistic using $R^2$ (valid only when both models have the same dependent variable):
\[ F = \frac{(R^2_{UR} - R^2_R) / q}{(1 - R^2_{UR}) / (N - k - 1)}. \]

Likelihood ratio tests

Motivation
More complicated than the F-test, but applicable in a wider variety of situations.
Likelihood function:
\[ L(\alpha, \beta_1, \ldots, \beta_k, \sigma^2) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{1}{2\sigma^2} (Y_i - \alpha - \beta_1 X_{1i} - \ldots - \beta_k X_{ki})^2 \right] \]
\[ = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[ -\frac{1}{2\sigma^2} \sum_{i=1}^{N} (Y_i - \alpha - \beta_1 X_{1i} - \ldots - \beta_k X_{ki})^2 \right]. \]

Some basic results
The ML estimates of the coefficients coincide with the OLS estimates: $\hat\alpha, \hat\beta_1, \ldots, \hat\beta_k$.
The ML estimate of the error variance is biased:
\[ \hat\sigma^2 = \frac{\sum \left(Y_i - \hat\alpha - \hat\beta_1 X_{1i} - \ldots - \hat\beta_k X_{ki}\right)^2}{N} = \frac{\sum \hat\varepsilon_i^2}{N}. \]
Likelihood evaluated at the unrestricted MLEs: $L(\hat\alpha^U, \hat\beta_1^U, \ldots, \hat\beta_k^U, \hat\sigma^{2U})$.
Likelihood evaluated at the restricted MLEs: $L(\hat\alpha^R, \hat\beta_1^R, \ldots, \hat\beta_k^R, \hat\sigma^{2R})$.

Illustration
Three explanatory variables.
$H_0: \beta_1 = 0,\ \beta_2 + \beta_3 = 1$.
Restricted model: $Y_i - X_{2i} = \alpha + \beta_3 (X_{3i} - X_{2i}) + \varepsilon_i$.
OLS estimates → $\hat\alpha^R$ and $\hat\beta_3^R$.
Values of $\hat\beta_1^R$ and $\hat\beta_2^R$? → from the restrictions in $H_0$: $\hat\beta_1^R = 0$ and $\hat\beta_2^R = 1 - \hat\beta_3^R$.
Non-linear restrictions are possible, e.g. $H_0: \beta_1 = \beta_2^3,\ \beta_3 = 1/\beta_2$ → in general $H_0: g(\beta_1, \ldots, \beta_k) = 0$, where $g(\cdot)$ is a set of non-linear functions (one per restriction).
Non-linear estimates are obtained using econometric software.

Likelihood ratio test
Likelihood ratio:
\[ \lambda = \frac{L(\hat\alpha^R, \hat\beta_1^R, \ldots, \hat\beta_k^R, \hat\sigma^{2R})}{L(\hat\alpha^U, \hat\beta_1^U, \ldots, \hat\beta_k^U, \hat\sigma^{2U})}. \]
Test statistic: $-2 \ln(\lambda)$.
The statistic is distributed (approximately) as $\chi^2$: $-2 \ln(\lambda) \sim \chi^2_q$ ($q$ is the number of restrictions in $H_0$).
Intuition: imposing restrictions leads to a lower (or at best equal) likelihood.
It holds that
\[ L(\hat\alpha^R, \hat\beta_1^R, \ldots, \hat\beta_k^R, \hat\sigma^{2R}) \le L(\hat\alpha^U, \hat\beta_1^U, \ldots, \hat\beta_k^U, \hat\sigma^{2U}), \]
and therefore $0 \le \lambda \le 1$.
$H_0$ is true ⇒ $\lambda$ should be near 1 ⇒ the test statistic $-2 \ln(\lambda)$ should be small.

Examples
Koop (2008), pp. 107–108.
Figure of N(1, 2).
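The likelihood ratio statistic can be computed directly from the Normal log-likelihood evaluated at the restricted and unrestricted MLEs. The sketch below does this for a two-regressor model with the single restriction $H_0: \beta_2 = 0$, and also computes the F-statistic for the same restriction. It is an illustration only: the data are made up, the function names are hypothetical, and this is not the example from Koop (2008).

```python
import math

def _dev(v):
    """Deviations from the sample mean."""
    m = sum(v) / len(v)
    return [a - m for a in v]

def ssr_restricted(Y, X1):
    """SSR of Y = alpha + beta1*X1 + eps (beta2 = 0 imposed)."""
    y, x1 = _dev(Y), _dev(X1)
    b1 = sum(a * b for a, b in zip(x1, y)) / sum(a * a for a in x1)
    return sum((yi - b1 * u) ** 2 for yi, u in zip(y, x1))

def ssr_unrestricted(Y, X1, X2):
    """SSR of Y = alpha + beta1*X1 + beta2*X2 + eps."""
    y, x1, x2 = _dev(Y), _dev(X1), _dev(X2)
    s11 = sum(a * a for a in x1)
    s22 = sum(a * a for a in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * b for a, b in zip(x1, y))
    s2y = sum(a * b for a, b in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return sum((yi - b1 * u - b2 * v) ** 2 for yi, u, v in zip(y, x1, x2))

def max_loglik(ssr, n):
    """Normal log-likelihood at the MLEs, with sigma^2 estimated by SSR/N."""
    sigma2_ml = ssr / n
    return -0.5 * n * math.log(2 * math.pi * sigma2_ml) - ssr / (2 * sigma2_ml)

# Hypothetical data, chosen only for illustration.
X1 = [1, 2, 3, 4, 5, 6]
X2 = [2, 1, 4, 3, 6, 5]
Y = [4.3, 5.1, 9.4, 10.2, 14.3, 15.6]
n, k, q = len(Y), 2, 1

ssr_u = ssr_unrestricted(Y, X1, X2)
ssr_r = ssr_restricted(Y, X1)
lr_stat = -2 * (max_loglik(ssr_r, n) - max_loglik(ssr_u, n))   # -2 ln(lambda)
f_stat = ((ssr_r - ssr_u) / q) / (ssr_u / (n - k - 1))
print("LR statistic:", lr_stat, "F statistic:", f_stat)
```

The LR statistic would be compared with a $\chi^2_q$ critical value (3.84 at the 5% level for $q = 1$) and the F-statistic with an $F_{q, N-k-1}$ critical value. With only six observations the $\chi^2$ approximation is rough, so the numbers serve only to show the mechanics.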
Likelihood function
[Figure: likelihood function plotted against $\beta$, marking $L(\beta = 0)$, $L(\beta = -2)$ and $L(\beta = \text{MLE})$.]

Alternative for LRM
Likelihood function:
\[ L(\alpha, \beta_1, \ldots, \beta_k, \sigma^2) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\left[ -\frac{1}{2\sigma^2} \sum_{i=1}^{N} (Y_i - \alpha - \beta_1 X_{1i} - \ldots - \beta_k X_{ki})^2 \right]. \]
Evaluated at the ML estimates, with $\hat\sigma^2 = SSR / N$:
\[ L(\hat\alpha, \hat\beta_1, \ldots, \hat\beta_k, \hat\sigma^2) \propto \frac{1}{(\hat\sigma^2)^{N/2}} \propto \frac{1}{(SSR)^{N/2}}, \]
where $SSR = \sum \hat\varepsilon_i^2$.
Likelihood ratio:
\[ \lambda = \frac{1 / (SSR_R)^{N/2}}{1 / (SSR_U)^{N/2}} = \left( \frac{SSR_U}{SSR_R} \right)^{N/2}. \]

Wald and Lagrange multiplier tests
Approximations of the LR test.
Abraham Wald (1902–1950)
Joseph-Louis Lagrange (1736–1813)

Wald test
Uses only the unrestricted estimates.
Example: hypothesis $H_0: g(\alpha, \beta_1, \beta_2, \ldots, \beta_k) = c$.
ML estimates $\hat\alpha^U, \hat\beta_1^U, \ldots, \hat\beta_k^U$.
Idea: if $H_0$ is true, then the unrestricted estimates should (approximately) satisfy the restrictions, i.e. $g(\hat\alpha^U, \hat\beta_1^U, \ldots, \hat\beta_k^U)$ should be near $c$.

Wald test – statistic
Wald statistic:
\[ W = \frac{\left[ g(\hat\alpha^U, \hat\beta_1^U, \ldots, \hat\beta_k^U) - c \right]^2}{\mathrm{var}\left[ g(\hat\alpha^U, \hat\beta_1^U, \ldots, \hat\beta_k^U) \right]}. \]
In some cases the denominator is easy to compute, e.g. for $g(\hat\alpha^U, \hat\beta_1^U, \ldots, \hat\beta_k^U) = \hat\beta_1^U + \hat\beta_2^U$:
\[ \mathrm{var}(\hat\beta_1^U + \hat\beta_2^U) = \mathrm{var}(\hat\beta_1^U) + \mathrm{var}(\hat\beta_2^U) + 2\,\mathrm{cov}(\hat\beta_1^U, \hat\beta_2^U). \]
Non-linear restrictions → econometric software.
Distribution of the test statistic: $W \sim \chi^2_q$ (approximately), where $q$ is the number of restrictions.

Lagrange multiplier test
Uses only the restricted estimates.
Example: the unrestricted model is a simple LRM with slope $\beta$; the restricted model imposes $H_0: \beta = c$, so $\hat\beta^R = c$.
Motivation: if $H_0$ is true, then the MLE of the restricted model should be close to the unrestricted MLE (in our case, $c$ should be near $\hat\beta$, the OLS or ML estimate).
Basic calculus: at the maximum of the likelihood function the first derivative (slope) equals zero.
If $H_0$ is true, then the derivative of the likelihood function evaluated at $\hat\beta^R$ should be close to zero.

Lagrange multiplier test – statistic
Test statistic:
\[ LM = \frac{\left[ \dfrac{d \ln L(\hat\beta^R)}{d\beta} \right]^2}{I(\hat\beta^R)}. \]
Intuition: how far from zero does the slope of the likelihood function become if we impose the restrictions?
The numerator measures the size of the slope directly; dividing by the denominator expresses it relative to its uncertainty.
The denominator of LM is related to the variance of the first derivative of the likelihood function: $I(\cdot)$ (information matrix).
The LM statistic is distributed approximately (asymptotically) as $LM \sim \chi^2_q$, where $q$ is the number of restrictions in $H_0$.

Comparing tests
Likelihood ratio (LR) test, Wald test (W), Lagrange multiplier test (LM).
[Figure: log-likelihood ($\ln L$) as a function of $\beta$; maximum at $\beta^{MLE}$; restriction $g(\beta) = 0$; restricted value $\beta^{MLE}_R$.]
Source: Kennedy (2008) – A Guide to Econometrics.

4 Other issues

Choice of functional form
Koop (2008), pp. 109–115 (including examples).
Non-linearity in regression.
Logarithms of the variables and interpretation of the parameters.
Interaction terms and powers of the variables – changing marginal effects.
How to decide which non-linear form to use?
Changing the units of measurement of the variables – do the estimates and the associated statistics change?
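The last question can be explored numerically. The sketch below is an illustration with made-up data and a hypothetical helper `simple_ols`. It fits a simple regression twice, once with $X$ in its original units and once with $X$ multiplied by 100 (for example metres converted to centimetres), and compares the slope, its t-statistic and $R^2$.

```python
import math

def simple_ols(Y, X):
    """Simple regression Y = alpha + beta*X + eps: slope, t-statistic and R^2."""
    n = len(Y)
    mx, my = sum(X) / n, sum(Y) / n
    sxx = sum((x - mx) ** 2 for x in X)
    sxy = sum((x - mx) * (y - my) for x, y in zip(X, Y))
    tss = sum((y - my) ** 2 for y in Y)
    beta = sxy / sxx
    alpha = my - beta * mx
    ssr = sum((y - alpha - beta * x) ** 2 for x, y in zip(X, Y))
    s2 = ssr / (n - 1 - 1)                 # k = 1 explanatory variable
    se_beta = math.sqrt(s2 / sxx)
    return {"beta": beta, "t": beta / se_beta, "r2": 1 - ssr / tss}

# Hypothetical data, chosen only for illustration.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
Y = [4.3, 5.1, 9.4, 10.2, 14.3, 15.6]

base = simple_ols(Y, X)
rescaled = simple_ols(Y, [100 * x for x in X])   # same variable, new units

print("slope:", base["beta"], "->", rescaled["beta"])
print("t-statistic:", base["t"], "->", rescaled["t"])
print("R^2:", base["r2"], "->", rescaled["r2"])
```

In this sketch the slope is divided by 100 when $X$ is multiplied by 100, while the t-statistic and $R^2$ are unchanged: the change of units alters how the coefficient is read, not the evidence for it.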