Econometrics 1

1. For the simple linear regression y_i = a + β x_i + ε_i, i = 1, …, N, the OLS estimator b for β has the variance V{b} = σ^2/(N s_x^2), where σ^2 and s_x^2 are the variances of the error term and the regressor, respectively. To make the estimate b as accurate as possible, how should the design of the data collection be chosen?

2. Assume that the random variable ε_t follows the AR(1) process ε_t = φ ε_{t-1} + v_t with |φ| < 1 and v_t being uncorrelated and having mean zero and variance σ^2. Show that the covariance Cov{ε_t, ε_{t-s}} equals φ^s σ^2 (1 - φ^2)^{-1} for s > 0. Draw a schematic figure which shows the covariance Cov{ε_t, ε_{t-s}} as a function of s for σ^2 = 1 and (a) φ = 0.7 and (b) φ = -0.5.

3. For estimating the parameters of the model y_i = a + x_{i1} β_1 + x_{i2} β_2 + ε_i, observations for i = 1, …, N are available. Given that the regressors fulfill the relation x_{i2} = h x_{i1} with a real number h, show that the (N×3) matrix X = (ℓ, x_1, x_2) has a rank of at most 2; ℓ is an N-vector of ones, x_1 and x_2 are the N-vectors of observations of x_{i1} and x_{i2}, respectively.

4. The linear regression model y_i = x_i'β + ε_i, i = 1, …, N, with K regressors is fitted to a set of data; you suspect that the error term variances are a function of variables Z_2, …, Z_p. State the model for heteroskedasticity and the null hypothesis which form the basis of the Breusch-Pagan test. Which probability distribution does the test statistic of the Breusch-Pagan test follow?

5. The private consumption C_t of households is assumed to depend on the disposable income Y_t as indicated by C_t = β_1 + β_2 Y_t + ε_t. It is, moreover, assumed that Y_t = C_t + I_t, where I_t are exogenous investments; exogeneity means that E{I_t ε_t} = 0. Show that both Y_t and C_t are endogenous, i.e., that E{C_t ε_t} ≠ 0 and E{Y_t ε_t} ≠ 0.

6. Open the Ramanathan sample file “data3-7, Toyota station wagon repairs”, offered within the Gretl system.
Perform the following analyses and interpret the results:

   a. Interpret the scatter plots of COST over (i) AGE and (ii) MILES. What do they suggest for the specification of a model for COST?

   b. Estimate the linear regression for COST with regressors AGE and MILES; show appropriate graphs of the residuals and report results of the diagnostic tests suggested by a.; discuss possible remedies for at least two of the issues.

   c. Repeat b. for the linear regression for log(COST) with regressors MILES and squared MILES, using heteroskedasticity-robust standard errors.

   d. Use both the White and the Breusch-Pagan test for heteroskedasticity; explain the results.

   e. Perform the Chow test for a break (i) starting with observation 15 and (ii) starting with observation 43; compare the results with the analogous ones for the regression of b.

   f. Perform the PE test to test whether the regression of c. is preferable to the regression for COST with regressors MILES and squared MILES (using heteroskedasticity-robust standard errors).

   g. Generate a dummy variable D43 and estimate the regression for COST with regressor MILES and a term that makes use of D43 in order to take into account the break starting with observation 43 (using heteroskedasticity-robust standard errors); compare this model, based on suitable diagnostic tests, with the model from b.

Document your analyses and interpretations in Gretl: save all relevant outputs and write your explanations and comments in a session file.
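
As a numerical aid for the schematic figure in problem 2, the following sketch tabulates the covariance function φ^s σ^2 (1 - φ^2)^{-1} stated there, for σ^2 = 1 and the two values of φ. It is only an illustration that takes the formula of problem 2 as given; the function name `ar1_autocov` and the range of lags (s = 0, …, 5) are choices made here and are not part of the assignment.

```python
def ar1_autocov(phi, s, sigma2=1.0):
    """Cov{e_t, e_(t-s)} of a stationary AR(1) process, using the formula from problem 2.

    phi    -- AR(1) coefficient, must satisfy |phi| < 1
    s      -- non-negative integer lag
    sigma2 -- variance of the innovation v_t
    """
    if abs(phi) >= 1:
        raise ValueError("stationarity requires |phi| < 1")
    if s < 0:
        raise ValueError("lag s must be non-negative")
    return phi ** s * sigma2 / (1.0 - phi ** 2)


if __name__ == "__main__":
    # Tabulate the covariances for the two cases asked for in problem 2.
    for phi in (0.7, -0.5):
        values = [round(ar1_autocov(phi, s), 3) for s in range(6)]
        print(f"phi = {phi:+.1f}:", values)
```

For φ = 0.7 the values stay positive and decay geometrically in s; for φ = -0.5 they decay in absolute value while alternating in sign.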