Econometrics 2 - Lecture 1
ML Estimation, Diagnostic Tests
Mar 9, 2018, Hackl, Econometrics 2, Lecture 1

Contents
- Organizational Issues
- Overview of Contents
- Linear Regression: A Review
- Estimation of Regression Parameters
- Estimation Concepts
- ML Estimator: Idea and Illustrations
- ML Estimator: Notation and Properties
- ML Estimator: Two Examples
- Asymptotic Tests
- Some Diagnostic Tests

Organizational Issues
- Course schedule (classes start at 10:00):
  Class 1: Fr, Mar 9
  Class 2: Fr, Mar 16
  Class 3: Fr, Mar 23
  Class 4: Fr, Apr 6
  Class 5: Fr, Apr 20
  Class 6: Fr, Apr 27
- Teaching and learning method
  - Course in six blocks
  - Class discussion; written homework (computer exercises, GRETL) submitted by groups of 3-5 students; presentation of homework by participants
  - Final exam
- Assessment of student work
  - Relevant for grading: the written homework, the presentation of homework in class, and a final written exam
  - Weights: homework 40 %, final written exam 60 %
  - Presentation of homework in class: students must be prepared to be called at random
- Literature
  - Course textbook: Marno Verbeek, A Guide to Modern Econometrics, 3rd Ed., Wiley, 2008
  - Suggestions for further reading:
    - W.H. Greene, Econometric Analysis, 7th Ed., Pearson International, 2012
    - R.C. Hill, W.E. Griffiths, G.C. Lim, Principles of Econometrics, 4th Ed., Wiley, 2012

Aims and Content
- Aims of the course
  - Deepening the understanding of econometric concepts and principles
  - Learning about advanced econometric tools and techniques:
    - ML estimation and testing methods (MV, Cpt. 6)
    - Time series models (MV, Cpt. 8, 9)
    - Multi-equation models (MV, Cpt. 9)
    - Models for limited dependent variables (MV, Cpt. 7)
    - Panel data models (MV, Cpt. 10)
  - Use of econometric tools for analyzing economic data: specification of adequate models, identification of appropriate econometric methods, interpretation of results
  - Use of GRETL

Limited Dependent Variables: An Example
- Explain whether a household owns a car; explanatory power have
  - income
  - household size
  - etc.
- Linear regression is not suitable. Why?
  - Owning a car has only two manifestations: yes/no
  - The indicator for owning a car is a binary variable
  - Models are needed that can describe a binary or, more generally, a limited dependent variable

Cases of Limited Dependent Variable
- Typical situations in which functions of explanatory variables are used to describe or explain:
  - Dichotomous dependent variable, e.g., ownership of a car (yes/no), employment status (employed/unemployed)
  - Ordered response, e.g., qualitative assessment (good/average/bad), working status (full-time/part-time/not working)
  - Multinomial response, e.g., trading destinations (Europe/Asia/Africa), means of transportation (train/bus/car)
  - Count data, e.g., number of orders a company receives in a week, number of patents granted to a company in a year
  - Censored data, e.g., expenditures for durable goods, duration of study with dropouts

Time Series Example: Price/Earnings Ratio
- Verbeek's data set PE: PE = ratio of the S&P composite stock price index to the S&P composite earnings, annual, 1871-2002
- Is the PE ratio mean reverting?
- Summary statistics of log(PE) (in parentheses: PE):
  Mean 2.63 (13.9), Min 1.81 (6.1), Max 3.60 (36.6), Std 0.33

Time Series Models
- Purpose of modelling:
  - Description of the data-generating process
  - Forecasting
- Types of model specification
  - Deterministic trend: a function f(t) of time t, describing the evolution of E{Yt} over time:
    Yt = f(t) + εt, εt: white noise
    e.g., Yt = α + βt + εt
  - Autoregression AR(1):
    Yt = δ + θYt-1 + εt, |θ| < 1, εt: white noise
  - Generalization: ARMA(p,q) process
    Yt = θ1Yt-1 + … + θpYt-p + εt + α1εt-1 + … + αqεt-q

PE Ratio: Various Models
- Diagnostics for various competing models of Δyt = log PEt - log PEt-1:

  Model  Lags  AIC      BIC      Q12    p-value
  MA(4)  1-4   -73.389  -56.138   5.03  0.957
  AR(4)  1-4   -74.709  -57.458   3.74  0.988
  MA     2,4   -76.940  -65.440   5.48  0.940
  AR     2,4   -78.057  -66.556   4.05  0.982
  MA     2     -76.072  -67.447   9.30  0.677
  AR     2     -73.994  -65.368  12.12  0.436

- Best fit according to
  - BIC: MA(2) model Δyt = 0.008 + et - 0.250 et-2
  - AIC: AR(2,4) model Δyt = 0.008 - 0.202 Δyt-2 - 0.211 Δyt-4 + et
- Q12: Box-Ljung statistic for the first 12 autocorrelations

Multi-equation Models
- Economic processes: simultaneous and interrelated development of a set of variables
- Examples:
  - Households consume a set of commodities (e.g., food, durables); the demanded quantities depend on the prices of the commodities, the household income, the number of persons living in the household, etc.; a consumption model contains a set of dependent variables and a set of explanatory variables.
  - The market of a product is characterized by (a) the demanded and supplied quantities and (b) the price of the product; a model for the market consists of equations representing the development and interdependencies of these variables.
  - An economy consists of markets for commodities, labour, finances, etc.; a model for a sector or the full economy contains descriptions of the development of the relevant variables and their interactions.

Panel Data
- Population of interest: individuals, households, companies, countries
- Types of observations:
  - Cross-sectional data: observations of all units of a population, or of a (representative) subset, at one specific point in time
  - Time series data: series of observations on units of the population over a period of time
  - Panel data (longitudinal data): repeated observations of (the same) population units collected over a number of periods; a data set with both a cross-sectional and a time series aspect; multi-dimensional data
- Cross-sectional and time series data are special cases of panel data

Panel Data Example: Individual Wages
- Verbeek's data set "males"
- Sample of
  - 545 full-time working males
  - each person observed yearly after completion of school, from 1980 till 1987
- Variables:
  - wage: log of hourly wage (in USD)
  - school: years of schooling
  - exper: age - 6 - school
  - dummies for union membership, married, black, Hispanic, public sector
  - others

Panel Data Models
- Panel data models allow
  - controlling for individual differences, comparing behaviour, analysing dynamic adjustment, measuring effects of policy changes
  - more realistic models than cross-sectional and time series models
  - more detailed or sophisticated research questions
- E.g.: What is the effect of being married on the hourly wage?

The Linear Model
- Y: explained variable; X: explanatory or regressor variable
- The model describes the data-generating process of Y conditional on X
- Simple linear regression model: Y = α + βX; β: coefficient of X; α: intercept
- Multiple linear regression model: Y = β1 + β2X2 + … + βKXK

Fitting a Model to Data
- Choice of values b1, b2 for the model parameters β1, β2 of Y = β1 + β2X, given the observations (yi, xi), i = 1, …, N
- Model for the observations: yi = β1 + β2xi + εi, i = 1, …, N
- Fitted values: ŷi = b1 + b2xi, i = 1, …, N
- The principle of (ordinary) least squares gives the OLS estimators
  bi = arg min b1,b2 S(b1, b2), i = 1, 2
- Objective function: the sum of squared deviations
  S(b1, b2) = Σi [yi - (b1 + b2xi)]² = Σi εi²
- Deviations between observations and fitted values, the residuals:
  ei = yi - ŷi = yi - (b1 + b2xi)

Observations and Fitted Regression Line
- Simple linear regression: fitted line and observation points (Verbeek, Figure 2.1)

OLS Estimators
- Equating the partial derivatives of S(b1, b2) to zero gives the normal equations; solving them, the OLS estimators b1 and b2 result in
  b2 = Σi (xi - x̄)(yi - ȳ) / Σi (xi - x̄)²
  b1 = ȳ - b2x̄
  with the mean values x̄, ȳ and the second moments of the observations

OLS Estimators: The General Case
- The model for Y contains K-1 explanatory variables:
  Y = β1 + β2X2 + … + βKXK = x'β
  with x = (1, X2, …, XK)' and β = (β1, β2, …, βK)'
- Observations: [yi, xi] = [yi, (1, xi2, …, xiK)'], i = 1, …, N
- The OLS estimates b = (b1, b2, …, bK)' are obtained by minimizing
  S(b) = Σi (yi - xi'b)²
- This results in the OLS estimators
  b = (Σi xi xi')-1 Σi xi yi

In Matrix Notation
- N observations (y1, x1), …, (yN, xN)
- Model: yi = β1 + β2xi + εi, i = 1, …, N, or
  y = Xβ + ε
  with the NxK matrix X of regressor values and the N-vectors y and ε
- OLS estimators:
  b = (X'X)-1X'y

Gauss-Markov Assumptions
- Observation yi (i = 1, …, N) is a linear function yi = xi'β + εi of the observations xik, k = 1, …, K, of the regressor variables and the error term εi; xi = (xi1, …, xiK)'; X = (xik)
  A1: E{εi} = 0 for all i
  A2: all εi are independent of all xi (exogenous xi)
  A3: V{εi} = σ² for all i (homoskedasticity)
  A4: Cov{εi, εj} = 0 for all i ≠ j (no autocorrelation)

Normality of Error Terms
  A5: εi normally distributed for all i
- Together with assumptions (A1), (A3), and (A4), (A5) implies εi ~ NID(0, σ²) for all i, i.e., all εi are
  - independent drawings
  - from the normal distribution N(0, σ²)
  - with mean 0 and variance σ²
- The error terms are "normally and independently distributed" (NID, n.i.d.)

Properties of OLS Estimators
- OLS estimator b = (X'X)-1X'y
  1. The OLS estimator b is unbiased: E{b} = β
  2. The variance of the OLS estimator is given by V{b} = σ²(Σi xi xi')-1
  3. The OLS estimator b is a BLUE (best linear unbiased estimator) for β
  4. The OLS estimator b is normally distributed with mean β and covariance matrix V{b} = σ²(Σi xi xi')-1
- Properties 1., 2., and 3. follow from the Gauss-Markov assumptions; 4. needs in addition the normality assumption (A5)

Distribution of t-statistic
- The t-statistic
  tk = bk / se(bk)
  with the standard error se(bk) of bk follows
  1. the t-distribution with N-K d.f. if the Gauss-Markov assumptions (A1)-(A4) and the normality assumption (A5) hold
  2. approximately the t-distribution with N-K d.f. if the Gauss-Markov assumptions (A1)-(A4) hold but not the normality assumption (A5)
  3. asymptotically (N → ∞) the standard normal distribution N(0,1)
  4. approximately, for large N, the standard normal distribution N(0,1)
- The approximation error decreases with increasing sample size N

OLS Estimators: Consistency
- The OLS estimators b are consistent, plim N→∞ b = β, if one of the two following sets of conditions is fulfilled:
  - (A2) from the Gauss-Markov assumptions and assumption (A6), or
  - assumption (A7), which is weaker than (A2), and assumption (A6)
- Assumptions (A6) and (A7):
  A6: (1/N) Σi xi xi' converges with growing N to a finite, nonsingular matrix Σxx
  A7: the error terms have zero mean and are uncorrelated with each of the regressors: E{xi εi} = 0

Estimation Concepts
- OLS estimator: minimization of the objective function S(b) = Σi εi² gives
  - the K first-order conditions Σi (yi - xi'b) xi = Σi ei xi = 0, the normal equations
  - the OLS estimators as the solutions of the normal equations
  - Moment conditions: E{(yi - xi'β) xi} = E{εi xi} = 0
  - The normal equations are the corresponding sample moment conditions (times N)
- IV estimator: the model allows derivation of the moment conditions
  E{(yi - xi'β) zi} = E{εi zi} = 0
  which are functions of
  - observable variables yi, xi, instrument variables zi, and unknown parameters β
  - The moment conditions are used for deriving IV estimators
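The normal equations above, the sample moment conditions Σi ei xi = 0, can be checked numerically for any OLS fit. A minimal sketch with simulated data (all names and numbers are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: y = 1 + 2*x + noise
N = 200
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])      # regressor matrix with intercept column
y = 1.0 + 2.0 * x + rng.normal(size=N)

# OLS estimator b = (X'X)^{-1} X'y
b = np.linalg.solve(X.T @ X, X.T @ y)

# Normal equations: sum_i e_i x_i = 0 (sample analogue of E{eps_i x_i} = 0)
e = y - X @ b
print(np.abs(X.T @ e).max())  # numerically zero up to floating-point error
```

The printed value is zero up to rounding error, which is exactly the statement that the residuals are orthogonal to every regressor, including the constant.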
  - OLS estimators are a special case of IV estimators

Estimation Concepts, cont'd
- GMM estimator: generalization of the moment conditions
  E{f(wi, zi, β)} = 0
  with observable variables wi, instrument variables zi, and unknown parameters β; f: a multidimensional function with as many components as moment conditions
  - Allows for non-linear models
  - Under weak regularity conditions, the GMM estimators are
    - consistent
    - asymptotically normal
- Maximum likelihood estimation
  - Basis is the distribution of yi conditional on the regressors xi
  - It depends on unknown parameters β
  - The estimates of the parameters β are chosen so that the distribution corresponds as well as possible to the observations yi and xi

Example: Urn Experiment
- The experiment:
  - The urn contains red and white balls
  - Proportion of red balls: p (unknown)
  - N random draws
  - Random draw i: yi = 1 if the ball in draw i is red, yi = 0 otherwise; P{yi = 1} = p
  - Sample: N1 red balls, N - N1 white balls
  - Probability of this result (up to a combinatorial factor):
    P{N1 red balls, N - N1 white balls} ∝ p^N1 (1 - p)^(N-N1)
- Likelihood function L(p): the probability of the sample result, interpreted as a function of the unknown parameter p:
  L(p) = p^N1 (1 - p)^(N-N1), 0 < p < 1

Urn Experiment: Likelihood Function and ML Estimator
- Likelihood function: (proportional to) the probability of the sample result, interpreted as a function of the unknown parameter p:
  L(p) = p^N1 (1 - p)^(N-N1), 0 < p < 1
- Maximum likelihood estimator: the value p̂ of p which maximizes L(p)
- Calculation of p̂: maximization algorithm
- As the log function is monotonous, the coordinates p of the extremes of L(p) and log L(p) coincide
- Use of the log-likelihood function is often more convenient:
  log L(p) = N1 log p + (N - N1) log(1 - p)

Urn Experiment: Likelihood Function, cont'd
- Values of the log-likelihood (Verbeek, Fig. 6.1):
  p    log L(p)
  0.1  -107.21
  0.2   -83.31
  0.3   -72.95
  0.4   -68.92
  0.5   -69.31
  0.6   -73.79
  0.7   -83.12
  0.8   -99.95
  0.9  -133.58

Urn Experiment: ML Estimator
- Maximizing log L(p) with respect to p gives the first-order condition
  d log L(p)/dp = N1/p - (N - N1)/(1 - p) = 0
- Solving this equation for p gives the maximum likelihood (ML) estimator
  p̂ = N1/N
- For N = 100, N1 = 44, the ML estimator for the proportion of red balls is p̂ = 0.44

Maximum Likelihood Estimator: The Idea
- Specify the distribution of the data (of y, or of y given x)
- Determine the likelihood of observing the available sample as a function of the unknown parameters
- Choose as ML estimates those values of the unknown parameters that give the highest likelihood
- Properties: in general, the ML estimators are
  - consistent
  - asymptotically normal
  - efficient
  provided the likelihood function is correctly specified, i.e., the distributional assumptions are correct

Example: Normal Linear Regression
- Model: yi = β1 + β2xi + εi with assumptions (A1)-(A5)
- From the normal distribution of εi follows the contribution of observation i to the likelihood function:
  f(yi|xi; β, σ²) = (2πσ²)^(-1/2) exp{-(yi - β1 - β2xi)² / (2σ²)}
- L(β, σ²) = Πi f(yi|xi; β, σ²) due to independent observations; the log-likelihood function is
  log L(β, σ²) = -(N/2) log(2πσ²) - (1/(2σ²)) Σi (yi - β1 - β2xi)²

Normal Linear Regression, cont'd
- Maximizing log L(β, σ²) with respect to β and σ² gives the ML estimators
  β̂ = b, which coincides with the OLS estimator, and
  σ̂² = (1/N) Σi ei²
  which is biased and underestimates σ²!
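The downward bias of the ML variance estimator can be seen directly: σ̂² = e'e/N versus the unbiased OLS estimator s² = e'e/(N-K). A numerical sketch with simulated data (names and parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 50, 2
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])
y = 0.5 + 1.5 * x + rng.normal(scale=2.0, size=N)  # true sigma^2 = 4

b = np.linalg.solve(X.T @ X, X.T @ y)  # ML estimator of beta = OLS estimator
e = y - X @ b

sigma2_ml = (e @ e) / N        # ML estimator: E{sigma2_ml} = (N-K)/N * sigma^2, biased down
s2_ols = (e @ e) / (N - K)     # OLS estimator: unbiased

print(sigma2_ml < s2_ols)      # the ML estimate is always the smaller of the two
```

The ratio of the two estimators is exactly (N-K)/N, so the bias disappears as N grows, in line with the consistency remark below.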
- Remarks:
  - These results are obtained assuming normally and independently distributed (NID) error terms
  - ML estimators are consistent but not necessarily unbiased; see the properties of ML estimators below

ML Estimator: Notation
- Let the density (or probability mass function) of yi, given xi, be f(yi|xi, θ) with a K-dimensional vector θ of unknown parameters
- Given independent observations, the likelihood function for the sample of size N is
  L(θ) = Πi Li(θ) = Πi f(yi|xi, θ)
- The ML estimators are the solutions of
  maxθ log L(θ) = maxθ Σi log Li(θ)
  or the solutions of the K first-order conditions
  s(θ) = Σi si(θ) = 0, with si(θ) = ∂ log Li(θ)/∂θ
  the K-vector of gradients, also denoted score vector
- Solution of s(θ) = 0:
  - analytically (see the examples above), or
  - by use of numerical optimization algorithms

Matrix Derivatives
- The scalar-valued function log L(θ) has the K arguments θ1, …, θK
- K-vector of partial derivatives, or gradient vector (score vector):
  ∂ log L(θ)/∂θ = (∂ log L/∂θ1, …, ∂ log L/∂θK)'
- KxK matrix of second derivatives, or Hessian matrix:
  ∂² log L(θ)/∂θ ∂θ'

ML Estimator: Properties
- The ML estimator θ̂ is
  1. consistent
  2. asymptotically efficient
  3. asymptotically normally distributed, with V the asymptotic covariance matrix of θ̂

The Information Matrix
- Information matrix I(θ):
  - I(θ) is the limit (for N → ∞) of (1/N) Σi Ii(θ), with Ii(θ) based on the expected second derivatives (Hessian) of log Li(θ)
  - For the asymptotic covariance matrix V it can be shown that V = I(θ)-1
  - I(θ)-1 is the lower bound of the asymptotic covariance matrix of any consistent, asymptotically normal estimator of θ: the Cramér-Rao lower bound
- Calculation of Ii(θ) can also be based on the outer product of the score vector, Ji(θ) = E{si(θ) si(θ)'}; for a misspecified likelihood function, Ji(θ) can deviate from Ii(θ)

Example: Normal Linear Regression
- Model: yi = β1 + β2xi + εi with assumptions (A1)-(A5) fulfilled
- The score vector with respect to β = (β1, β2)' is, using xi = (1, xi)',
  si(β) = (εi/σ²) xi
- The information matrix is obtained both via the Hessian and via the outer product of the scores

Covariance Matrix V: Calculation
- Two ways to calculate V:
  - Estimator based on the information matrix I(θ); index "H": the estimate of V is based on the Hessian matrix
  - Estimator based on the score vector s(θ); index "G": the estimate of V is based on gradients
    - also called the OPG (outer product of the gradient) estimator
    - also called the BHHH (Berndt, Hall, Hall, Hausman) estimator
    - E{si(θ) si(θ)'} coincides with Ii(θ) if f(yi|xi, θ) is correctly specified

Again the Urn Experiment
- Likelihood contribution of the i-th observation:
  log Li(p) = yi log p + (1 - yi) log(1 - p)
- This gives the scores
  si(p) = yi/p - (1 - yi)/(1 - p)
  and the second derivatives
  ∂si(p)/∂p = -yi/p² - (1 - yi)/(1 - p)²
- With E{yi} = p, the expected value turns out to be
  Ii(p) = -E{∂si(p)/∂p} = 1/p + 1/(1 - p) = [p(1 - p)]-1
- The asymptotic variance of the ML estimator: V = I-1 = p(1 - p)

Urn Experiment and Binomial Distribution
- Asymptotic distribution: √N(p̂ - p) is asymptotically N(0, p(1 - p))
- Small sample distribution: N1 = N p̂ ~ B(N, p)
- Use of the approximate normal distribution for proportions; rule of thumb for using the approximation:
  N p (1 - p) > 9
- A test of H0: p = p0 can be based on the test statistic
  (p̂ - p0) / √(p0(1 - p0)/N)

Example: Normal Linear Regression
- Model: yi = xi'β + εi with assumptions (A1)-(A5)
- Log-likelihood function:
  log L(β, σ²) = -(N/2) log(2πσ²) - (1/(2σ²)) Σi (yi - xi'β)²
- Scores of the i-th observation:
  ∂ log Li/∂β = (εi/σ²) xi
  ∂ log Li/∂σ² = -1/(2σ²) + εi²/(2σ⁴)

Normal Linear Regression: ML Estimators
- The first-order conditions, setting both components of Σi si(β, σ²) to zero, give as ML estimators the OLS estimator for β and the average squared residuals for σ²:
  β̂ = (Σi xi xi')-1 Σi xi yi, σ̂² = (1/N) Σi ei²
- Asymptotic covariance matrix: the contribution of the i-th observation (using E{εi} = E{εi³} = 0, E{εi²} = σ², E{εi⁴} = 3σ⁴) gives
  V = I(β, σ²)-1 = diag(σ²Σxx-1, 2σ⁴)
  with Σxx = lim (Σi xi xi')/N

Normal Linear Regression: ML and OLS Estimators
- The ML estimators for β and σ² follow asymptotically the normal distribution with the covariance matrix V above
- For finite samples, the covariance matrix of the ML estimators for β is estimated by
  σ̂² (Σi xi xi')-1
  similar to the OLS results

Diagnostic Tests
- Diagnostic (or specification) tests based on ML estimators
- Test situation:
  - K-dimensional parameter vector θ = (θ1, …, θK)'
  - J ≥ 1 linear restrictions (K ≥ J)
  - H0: Rθ = q with a JxK matrix R of full rank and a J-vector q
- Test principles based on the likelihood function:
  1. Wald test: checks whether the restrictions are fulfilled for the unrestricted ML estimator of θ; test statistic ξW
  2. Likelihood ratio test: checks whether the difference between the log-likelihood values with and without the restriction is close to zero; test statistic ξLR
  3. Lagrange multiplier test (or score test): checks whether the first-order conditions (of the unrestricted model) are violated by the restricted ML estimators; test statistic ξLM

Likelihood and Test Statistics
- [Figure: log-likelihood function with restriction g(b) = 0, illustrating the three test statistics]

The Asymptotic Tests
- Under H0, the test statistics of all three tests follow asymptotically, and for finite sample sizes approximately, the Chi-square distribution with J d.f.
- The tests are asymptotically (large N) equivalent
- For finite sample sizes, the values of the test statistics obey the relation
  ξW ≥ ξLR ≥ ξLM
- Choice of the test: the criterion is computational effort
  1. Wald test: requires estimation only of the unrestricted model; e.g., testing for omitted regressors: estimate the full model and test whether the coefficients of the potentially omitted regressors differ from zero
  2. Lagrange multiplier test: requires estimation only of the restricted model; preferable if the restrictions complicate estimation
  3. Likelihood ratio test: requires estimation of both the restricted and the unrestricted model

Wald Test
- Checks whether the restrictions are fulfilled for the unrestricted ML estimator of θ
- Asymptotic distribution of the unrestricted ML estimator: √N(θ̂ - θ) is asymptotically N(0, V)
- Hence, under H0: Rθ = q, √N(Rθ̂ - q) is asymptotically N(0, RVR')
- The test statistic:
  ξW = N(Rθ̂ - q)' (RV̂R')-1 (Rθ̂ - q)
  - under H0, ξW is expected to be close to zero
  - the p-value is read from the Chi-square distribution with J d.f.

Wald Test, cont'd
- Typical application: tests of linear restrictions on regression coefficients
- Test of H0: βi = 0:
  ξW = bi²/[se(bi)²]
  - ξW follows the Chi-square distribution with 1 d.f.
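This single-restriction Wald statistic can be checked numerically: ξW = bi²/se(bi)² is the squared t-statistic. A sketch with simulated data (all names and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 100, 2
x = rng.normal(size=N)
X = np.column_stack([np.ones(N), x])
y = 1.0 + 0.3 * x + rng.normal(size=N)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
s2 = (e @ e) / (N - K)               # unbiased estimate of the error variance
V = s2 * np.linalg.inv(X.T @ X)      # estimated covariance matrix of b

t_stat = b[1] / np.sqrt(V[1, 1])     # t-statistic for H0: beta_2 = 0
xi_W = b[1] ** 2 / V[1, 1]           # Wald statistic for the same single restriction

print(np.isclose(xi_W, t_stat ** 2))  # True: xi_W is the squared t-statistic
```

Under H0 the statistic is compared with the Chi-square distribution with 1 d.f., consistent with the squared t-statistic being approximately Chi-square(1) for large N.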
  - ξW is the square of the t-test statistic
- Test of the null hypothesis that a subset of J of the coefficients β are zero:
  ξW = (eR'eR - e'e)/[e'e/(N-K)]
  - e: residuals from the unrestricted model
  - eR: residuals from the restricted model
  - ξW follows the Chi-square distribution with J d.f.
  - ξW is related to the F-test statistic by ξW = J·F

Likelihood Ratio Test
- Checks whether the difference between the ML estimates obtained with and without the restriction is close to zero, for nested models
- Unrestricted ML estimator: θ̂
- Restricted ML estimator: θ̃, obtained by maximizing the log-likelihood subject to Rθ = q
- Under H0: Rθ = q, the test statistic
  ξLR = 2[log L(θ̂) - log L(θ̃)]
  - is expected to be close to zero
  - the p-value is read from the Chi-square distribution with J d.f.

Likelihood Ratio Test, cont'd
- Test of linear restrictions on regression coefficients
- Test of the null hypothesis that J linear restrictions on the coefficients β are valid:
  ξLR = N log(eR'eR/e'e)
  - e: residuals from the unrestricted model
  - eR: residuals from the restricted model
  - ξLR follows the Chi-square distribution with J d.f.
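The residual-based form ξLR = N log(eR'eR/e'e) can be sketched by fitting the restricted and unrestricted models by OLS (simulated data; the regressor names are illustrative, and the dropped regressor truly has a zero coefficient, so H0 holds):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 120
x2, x3 = rng.normal(size=N), rng.normal(size=N)
y = 1.0 + 0.8 * x2 + rng.normal(size=N)  # coefficient of x3 is 0, so H0: beta_3 = 0 is true

def ols_rss(X, y):
    """Residual sum of squares e'e of the OLS fit."""
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    return e @ e

rss_u = ols_rss(np.column_stack([np.ones(N), x2, x3]), y)  # unrestricted model
rss_r = ols_rss(np.column_stack([np.ones(N), x2]), y)      # restricted model (x3 dropped)

xi_LR = N * np.log(rss_r / rss_u)  # compare with Chi-square, J = 1 d.f.
print(xi_LR >= 0.0)                # True: dropping a regressor never lowers the RSS
```

Since the restricted model is nested in the unrestricted one, rss_r ≥ rss_u always holds, so ξLR is nonnegative by construction.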
- Requires that the restricted model is nested within the unrestricted model

Lagrange Multiplier Test
- Checks whether the derivative of the likelihood at the restricted ML estimator is close to zero
- Based on the Lagrange method of constrained maximization
- Lagrangian, given θ = (θ1', θ2')' with restriction θ2 = q (J-vectors θ2, q, λ):
  H(θ, λ) = Σi log Li(θ) - λ'(θ2 - q)
- The first-order conditions give the restricted ML estimators θ̃ and λ̃
- λ measures the extent of violation of the restrictions and is the basis for ξLM
- The si are the scores; the LM test is also called the score test

Lagrange Multiplier Test, cont'd
- For λ̃ it can be shown that it follows asymptotically the normal distribution N(0, Vλ), where Vλ is the inverted lower block diagonal (dimension JxJ) of the inverted information matrix
- The Lagrange multiplier test statistic ξLM has under H0 an asymptotic Chi-square distribution with J d.f.; it uses the lower block diagonal of the estimated inverted information matrix, evaluated at the restricted estimators for θ

The LM Test Statistic
- Outer product gradient (OPG) form of ξLM: the information matrix is estimated on the basis of the scores (cf. the OPG estimator in "Covariance Matrix V: Calculation")
- With the NxK matrix of first derivatives S = [s1(θ̃), …, sN(θ̃)]' and the N-vector i = (1, …, 1)', so that Σi si(θ̃) = S'i and Σi si(θ̃) si(θ̃)' = S'S, the LM test statistic can be written as
  ξLM = i'S(S'S)-1S'i

Calculation of the LM Test Statistic
- Auxiliary regression of the N-vector i = (1, …, 1)' on the scores si(θ̃), i.e., on the columns of S; no intercept
- Predicted values from the auxiliary regression: S(S'S)-1S'i
- Sum of squared predictions: i'S(S'S)-1S'S(S'S)-1S'i = i'S(S'S)-1S'i
- Total sum of squares: i'i = N
- LM test statistic:
  ξLM = i'S(S'S)-1S'i = [i'S(S'S)-1S'i (i'i)-1] N = N uncR²
  with the uncentered R² of the auxiliary regression
- Remember: for the regression y = Xβ + ε
  - OLS estimates for β: b = (X'X)-1X'y
  - predictions for y: ŷ = X(X'X)-1X'y
  - uncentered R²: uncR² = ŷ'ŷ/y'y
- Also: Σi si(θ) = S'i and Σi si(θ) si(θ)' = S'S

The Urn Experiment: Three Tests of H0: p = p0
- The likelihood contribution of the i-th observation is
  log Li(p) = yi log p + (1 - yi) log(1 - p)
- This gives
  si(p) = yi/p - (1 - yi)/(1 - p) and Ii(p) = [p(1 - p)]-1
- Wald test (with the unrestricted estimator p̂ and V̂ = p̂(1 - p̂)):
  ξW = N(Rp̂ - q)'[RV̂R']-1(Rp̂ - q) = N(p̂ - p0)[p̂(1 - p̂)]-1(p̂ - p0)
  with J = 1, R = I
- Example: in a sample of N = 100 balls, N1 = 40 are red, i.e., p̂ = 0.40
- The test of H0: p0 = 0.5 results in ξW = 4.167, corresponding to a p-value of 0.041

The Urn Experiment: LR Test of H0: p = p0
- Likelihood ratio test:
  ξLR = 2[log L(p̂) - log L(p0)]
  with the unrestricted estimator p̂ and the restricted estimator p̃ = p0
- Example: in the sample of N = 100 balls, N1 = 40 are red; p̂ = 0.40, p̃ = p0 = 0.5
- The test of H0: p0 = 0.5 results in ξLR = 4.027, corresponding to a p-value of 0.045

The Urn Experiment: LM Test of H0: p = p0
- Lagrange multiplier test: with the scores si(p0) and the inverted information matrix [I(p)]-1 = p(1 - p), calculated for the restricted case, the LM test statistic is
  ξLM = N(p̂ - p0)[p0(1 - p0)]-1(p̂ - p0)
- Example: in the sample of N = 100 balls, 40 are red; the LM test of H0: p0 = 0.5 gives ξLM = 4.000 with a p-value of 0.046

Comparison of the test results:
  Test            Wald   LR     LM
  Test statistic  4.167  4.027  4.000
  p-value         0.041  0.045  0.046

Normal Linear Regression: Scores
- Log-likelihood function:
  log L(β, σ²) = -(N/2) log(2πσ²) - (1/(2σ²)) Σi (yi - xi'β)²
- Scores:
  ∂ log Li/∂β = (εi/σ²) xi, ∂ log Li/∂σ² = -1/(2σ²) + εi²/(2σ⁴)
- Covariance matrix:
  V = I(β, σ²)-1 = diag(σ²Σxx-1, 2σ⁴)

Testing for Omitted Regressors
- Model: yi = xi'β + zi'γ + εi, εi ~ NID(0, σ²); sample size N
- Test whether the J regressors zi are erroneously omitted:
  - Fit the restricted model
  - Apply the LM test to check H0: γ = 0
- The first-order conditions give the scores, with the restricted ML estimators for β and σ² and the ML residuals
- The auxiliary regression of the N-vector i = (1, …, 1)' on the scores gives the uncentered R²
- The LM test statistic is ξLM = N uncR²
- An asymptotically equivalent LM test statistic is NRe² with Re² from the regression of the ML residuals on xi and zi

Testing for Heteroskedasticity
- Model: yi = xi'β + εi, εi ~ NID, V{εi} = σ² h(zi'α); h(·) > 0 but unknown, with h(0) = 1 and nonzero derivative of h(·) at zero; J-vector zi
- Test for homoskedasticity: apply the LM test to check H0: α = 0
- The first-order conditions with respect to σ² and α give the scores, with the restricted ML estimators for β and σ² and the ML residuals
- The auxiliary regression of the N-vector i = (1, …, 1)' on the scores gives the uncentered R²
- LM test statistic: ξLM = N uncR²; a version of the Breusch-Pagan test
- An asymptotically equivalent version of the Breusch-Pagan test is based on NRe² with Re² from the regression of the squared ML residuals on zi and an intercept
- Note: the functional form of h(·) has no effect on the test

Testing for Autocorrelation
- Model: yt = xt'β + εt, εt = ρεt-1 + vt, vt ~ NID(0, σ²)
- LM test of H0: ρ = 0
- The first-order conditions give the scores with respect to β and ρ, with the restricted ML estimators for β and σ²
- The LM test statistic is ξLM = (T-1) uncR² with the uncentered R² from the auxiliary regression of the vector i = (1, …, 1)' on the scores
- If xt contains no lagged dependent variables, the products with xt can be dropped from the regressors; ξLM = (T-1) R² with R² from the regression of i = (1, …, 1)' on the scores
- An asymptotically equivalent test is the Breusch-Godfrey test, based on NRe² with Re² from the regression of the ML residuals on xt and the lagged residuals

Your Homework
1. Open the Greene sample file "greene7_8, Gasoline price and consumption", offered within the Gretl system. The dataset contains time series of annual observations from 1960 through 1995. The variables to be used in the following are: G = total U.S. gasoline consumption, computed as total expenditure on gasoline divided by the price index; Pg = price index for gasoline; Y = per capita (p.c.) disposable income; Pnc = price index for new cars; Puc = price index for used cars; Pop = U.S. total population in millions. Perform the following analyses and interpret the results:
   a. Produce and discuss a time series plot of the gasoline consumption (G), the disposable income (Y), and the U.S. total population (Pop).
   b. Produce and interpret the scatter plot of the p.c. gasoline consumption (Gpc) over the p.c. disposable income (Y).
   c. Fit the linear regression of log(Gpc) on the regressors log(Y) and Pg and give an interpretation of the outcome.
   d. Test for autocorrelation of the error terms using the LM test statistic ξLM = (T-1) R² with the uncentered R² from the auxiliary regression of the vector of ones i = (1, …, 1)' on the scores (et·et-1).
   e. Test for autocorrelation using the Breusch-Godfrey test, the test statistic being TRe² with Re² from the regression of the residuals on the regressors and the lagged residuals et-1.
   f. Use the Chow test to test for a structural break between 1979 and 1980.
2. Assume that the errors εt of the linear regression yt = β1 + β2xt + εt are NID(0, σ²) distributed. (a) Determine the log-likelihood function of the sample for t = 1, …, T; (b) derive (i) the first-order conditions and (ii) the ML estimators for β1, β2, and σ²; (c) derive the asymptotic covariance matrix of the ML estimators for β1 and β2 on the basis (i) of the information matrix and (ii) of the score vector.
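As a closing numerical check, the three urn-experiment test statistics reported above (N = 100 draws, N1 = 40 red balls, H0: p0 = 0.5) can be reproduced in a few lines; this is a sketch of the formulas from the slides, not part of the graded homework:

```python
import numpy as np

# Urn example from the slides: N = 100 draws, N1 = 40 red, H0: p0 = 0.5
N, N1, p0 = 100, 40, 0.5
p_hat = N1 / N  # ML estimator p_hat = N1/N

def loglik(p):
    # Binomial log-likelihood, up to a combinatorial constant
    return N1 * np.log(p) + (N - N1) * np.log(1 - p)

xi_W = N * (p_hat - p0) ** 2 / (p_hat * (1 - p_hat))  # Wald: unrestricted variance
xi_LR = 2 * (loglik(p_hat) - loglik(p0))              # Likelihood ratio
xi_LM = N * (p_hat - p0) ** 2 / (p0 * (1 - p0))       # LM: restricted variance

print(round(xi_W, 3), round(xi_LR, 3), round(xi_LM, 3))  # 4.167 4.027 4.0
```

The output matches the comparison table (ξW = 4.167 ≥ ξLR = 4.027 ≥ ξLM = 4.000), illustrating the finite-sample ordering of the three asymptotically equivalent tests.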