Introduction to Econometrics
Home assignment # 2 (Suggested solutions)

1. You are given the following model

$$y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 + \beta_6 X_6 + \varepsilon$$

Assume that you want to test the following set of restrictions:
(a) $\beta_2 - \beta_3 = 1$
(b) $\beta_4 = \beta_6$ and $\beta_5 = 0$
Construct models that incorporate restrictions (a) and (b), separately and together. Describe what test you will use to test the restrictions, including its distribution and parameters (i.e., describe how you would test restriction (a), restriction (b), and both of them together).

Solution:
• First, let us test the restrictions separately.
(a) We can express $\beta_2 = 1 + \beta_3$ and plug it into the unrestricted model:

$$y = \beta_1 + (1 + \beta_3) X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 + \beta_6 X_6 + \varepsilon$$
$$y - X_2 = \beta_1 + \beta_3 (X_3 + X_2) + \beta_4 X_4 + \beta_5 X_5 + \beta_6 X_6 + \varepsilon .$$

We have here $J = 1$ restriction, $n$ observations and $k = 6$ parameters in the unrestricted model. Hence, to test the restriction, we should run the unrestricted model

$$y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 + \beta_6 X_6 + \varepsilon$$

and the restricted model

$$y - X_2 = \beta_1 + \beta_3 (X_3 + X_2) + \beta_4 X_4 + \beta_5 X_5 + \beta_6 X_6 + \varepsilon ,$$

save the sum of squared residuals (SSE) in both cases and compute

$$F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(n - k)} = \frac{(SSE_R - SSE_U)/1}{SSE_U/(n - 6)} \sim F_{1,\,n-6} .$$

Note that since we have only one restriction, we could equivalently use a t-test: in this case $F = t^2$, where $t \sim t_{n-6}$ and, if $n$ is large, approximately $t \sim N(0, 1)$.
(b) When we plug these restrictions into the original model, we obtain

$$y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_6 (X_4 + X_6) + \varepsilon .$$

We would apply the same method as in (a), with the only difference that now we have $J = 2$ restrictions and so

$$F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(n - k)} = \frac{(SSE_R - SSE_U)/2}{SSE_U/(n - 6)} \sim F_{2,\,n-6} .$$

• Second, let us test the restrictions together. The restricted model now becomes

$$y - X_2 = \beta_1 + \beta_3 (X_3 + X_2) + \beta_6 (X_4 + X_6) + \varepsilon$$

and $J = 3$, so we have

$$F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(n - k)} = \frac{(SSE_R - SSE_U)/3}{SSE_U/(n - 6)} \sim F_{3,\,n-6} .$$

2. Imagine you are interested in the determinants of the revenues in shoe stores in Prague.
Suppose you have specified the following model:

$$Rev_t = \alpha + \beta\, Inc_t + \delta\, Price_t + \theta\, Popul_t + \eta\, Weekend_t + \varepsilon_t ,$$

where $Rev_t$ denotes the amount of revenues in the Prague shoe stores on a particular day $t$, $Inc_t$ is per capita income in Prague, $Price_t$ is a price index for shoes relative to other goods in Prague, $Popul_t$ is the number of people living in Prague, and $Weekend_t$ is a dummy variable for weekend days.
(a) This specification recognizes that people might go shopping for shoes more often on weekends than on working days. Explain how you would test such a hypothesis.
(b) What are the predicted revenues (in terms of the coefficients of the model) for weekends and for working days?
(c) Explain how you would alter the specification to account for the fact that people buy more shoes during the sales period, which is in January and July.
(d) If people have higher income, they buy more shoes on weekends (i.e., the effect of per capita income on revenues is larger on weekends compared to working days). Is this incorporated in your specification? If not, how would you do it? How would you test the hypothesis that if people have higher income, they buy more shoes on weekends?

Solution:
(a) This specification allows for a different intercept for weekends and for working days (ceteris paribus).
This could reflect the fact that most people go shopping on weekends, possibly making the revenues larger (so $\eta > 0$), which is a hypothesis that can be tested using a one-sided t-test:

$$H_0 : \eta \le 0, \qquad H_A : \eta > 0$$
$$t = \frac{\hat\eta}{\text{s.d.}(\hat\eta)} \sim t_{n-5}$$

At the 5% significance level we reject $H_0$ if $t > t_{n-5}(0.95)$.
(b) For weekend days the model looks like:

$$\widehat{Rev}_t = (\hat\alpha + \hat\eta) + \hat\beta\, Inc_t + \hat\delta\, Price_t + \hat\theta\, Popul_t ,$$

whereas during working days it is:

$$\widehat{Rev}_t = \hat\alpha + \hat\beta\, Inc_t + \hat\delta\, Price_t + \hat\theta\, Popul_t .$$

(c) To recognize the effect of a sales period (January and July), we have to create a dummy variable indicating whether or not the day in question is in January or July:

$$Sales_t = \begin{cases} 1 & \text{if day } t \text{ is in January or July} \\ 0 & \text{otherwise} \end{cases}$$

We introduce this dummy variable in our model as an additional independent variable:

$$Rev_t = \alpha + \beta\, Inc_t + \delta\, Price_t + \theta\, Popul_t + \eta\, Weekend_t + \gamma\, Sales_t + \varepsilon_t .$$

(d) This is not incorporated in the original specification, which only allows the intercept to differ. If we want to incorporate that people buy more shoes on weekends if they have higher incomes, we need to allow for different coefficients of $Inc$ for weekends and working days. In other words, we need to allow for both a different intercept and a different slope coefficient of $Inc$ for weekends and working days. During weekends the model should look like

$$Rev_t = (\alpha + \eta) + (\beta + \omega)\, Inc_t + \delta\, Price_t + \theta\, Popul_t + \varepsilon_t$$

and during working days it should look like

$$Rev_t = \alpha + \beta\, Inc_t + \delta\, Price_t + \theta\, Popul_t + \varepsilon_t .$$

We can achieve that by using the dummy variable $Weekend$ in the following way:

$$Rev_t = \alpha + (\beta + \omega\, Weekend_t)\, Inc_t + \delta\, Price_t + \theta\, Popul_t + \eta\, Weekend_t + \varepsilon_t ,$$

so that finally we obtain

$$Rev_t = \alpha + \beta\, Inc_t + \omega\, Weekend_t\, Inc_t + \delta\, Price_t + \theta\, Popul_t + \eta\, Weekend_t + \varepsilon_t .$$

If we assume that people buy more shoes on weekends if they have higher incomes, the coefficient $\omega$ should be significant and positive, which is a hypothesis that can be tested using a one-sided t-test ($H_0 : \omega \le 0$, $H_A : \omega > 0$).

3. Use data ceosal2.gdt for this exercise. Consider an equation to explain salaries of CEOs in terms of annual firm sales:

$$\ln(salary) = \beta_0 + \beta_1 \ln(sales) + \beta_2\, roe + \beta_3\, \text{neg ros} + \varepsilon ,$$

where
salary . . . CEO’s salary in thousands USD
sales . . . firm’s sales in millions USD
roe . . . firm’s return on equity
neg ros . . . dummy, equal to 1 if return on firm’s stock is negative
(a) Define the variables you need and estimate the equation.
(b) What is the interpretation of the coefficients $\beta_1$, $\beta_2$, and $\beta_3$?
(c) Test for the presence of a significant impact of firm’s sales on CEO’s salary by hand (using only the estimated coefficient and the standard error from the Gretl output) and then compare your results to the results of this test in Gretl. Define the null and alternative hypothesis, the test statistic, its distribution, and interpret the results of the test.
(d) You wonder if the impact of firm’s return on equity on the CEO’s salary is indeed linear. You decide to test for the presence of a non-linear relationship, which you approximate by a third order polynomial of roe (i.e., $\alpha_1 roe + \alpha_2 roe^2 + \alpha_3 roe^3$).
i. Define the null and alternative hypothesis, the test statistic and its distribution. Describe all specifications you need to be able to conduct the test, construct the necessary variables, and estimate these specifications in Gretl.
ii. Calculate the test statistic by hand, compare it to the critical value at the 1% significance level (99% confidence), and interpret the results.
iii. Conduct the test in Gretl and compare the results.

Solution:
(a) We estimate the model in Gretl and obtain the following results:
[Gretl output]
(b) We obtain $\hat\beta_1 = 0.26$. This means that a 1% increase in firm’s sales increases the CEO’s salary by about 0.26%, ceteris paribus. $\hat\beta_2 = 0.016$: if the firm’s return on equity increases by 1 (one unit of roe), the CEO’s salary increases by about 1.6%. $\hat\beta_3 = -0.18$: CEOs in firms with a negative return on stock have a salary about 18% lower than those in firms with a non-negative return.
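The percentage readings of the roe and neg ros coefficients in (b) use the approximation $100 \cdot \beta$, which is accurate only for small coefficients. When the dependent variable is in logs, the exact implied change is $100 \cdot (e^{\beta} - 1)$. The short sketch below compares the two using the rounded estimates quoted in (b); the function name is illustrative and not part of the assignment.

```python
import math


def exact_pct_change(coef: float) -> float:
    """Exact % change in y implied by a coefficient when the model is for ln(y)."""
    return 100.0 * (math.exp(coef) - 1.0)


# Rounded estimates quoted in part (b) of the solution.
beta2_roe = 0.016
beta3_neg_ros = -0.18

for name, b in [("roe", beta2_roe), ("neg ros", beta3_neg_ros)]:
    approx = 100.0 * b            # the usual 100*beta approximation
    exact = exact_pct_change(b)   # 100*(exp(beta) - 1)
    print(f"{name}: approx {approx:.2f}%, exact {exact:.2f}%")
```

For the dummy, the exact effect is roughly a 16.5% lower salary rather than 18%, while for roe the two readings almost coincide.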
(c) Testing the statistical significance of the variable sales using a two-sided t-test at the 5% significance level:

$$H_0 : \beta_1 = 0, \qquad H_A : \beta_1 \ne 0$$
$$t = \frac{\hat\beta_1}{\text{s.d.}(\hat\beta_1)} \sim t_{n-k}$$

t-statistic: $t = \dfrac{0.2633}{0.0372} = 7.08$
critical value: $t_{n-k}(1 - \alpha/2) = t_{185}(0.975) \approx 1.97$ (1.960 in the large-sample normal approximation)

$$|t| > t_{n-k}(1 - \alpha/2)$$

Therefore, we reject $H_0$ and conclude that the effect of sales on CEO’s salary is statistically significant. This is also confirmed by the Gretl output, which shows both the calculated t-statistic and the corresponding p-value, equal to $2.88 \times 10^{-11}$. The p-value is smaller than the chosen significance level (0.05), so we reject the null hypothesis and conclude that the effect is significant.
(d) i. We can test for a non-linear relationship between roe and salary with an F-test. Let’s first define the unrestricted model:

$$\ln(salary) = \alpha_0 + \alpha_1 \ln(sales) + \alpha_2\, roe + \alpha_3\, roe^2 + \alpha_4\, roe^3 + \alpha_5\, \text{neg ros} + \varepsilon .$$

The null and alternative hypotheses are:

$$H_0 : \alpha_3 = 0 \;\&\; \alpha_4 = 0, \qquad H_A : \alpha_3 \ne 0 \,\vee\, \alpha_4 \ne 0$$

We get the restricted model by plugging in the restrictions (under the null hypothesis). The restricted model is thus the same as the original equation:

$$\ln(salary) = \alpha_0 + \alpha_1 \ln(sales) + \alpha_2\, roe + \alpha_5\, \text{neg ros} + \varepsilon .$$

The F-statistic is as follows:

$$F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(n - k)} \sim F_{J,\,n-k} .$$

To be able to conduct the test, we need to run the restricted and the unrestricted model and save the sum of squared residuals (SSE) from both. We already have the results for the restricted model from part (a). Therefore, we create the variables $roe^2$ and $roe^3$ and run the unrestricted model. We obtain the following results:
[Gretl output]
ii. We can now plug $SSE_R$ and $SSE_U$ into the test statistic (number of restrictions $J = 2$, number of observations $n = 189$, and number of coefficients in the unrestricted model $k = 6$):

$$F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(n - k)} = \frac{(43.63565 - 43.62930)/2}{43.62930/(189 - 6)} = 0.0133 .$$

Next, we compare the calculated statistic to the critical value at the 1% significance level (99% confidence):

$$F_{J,\,n-k}(1 - \alpha) = F_{2,183}(0.99) \approx 4.72 \quad (4.61 \text{ in the large-sample limit})$$
$$F < F_{J,\,n-k}(1 - \alpha)$$

Therefore, we do not reject the null hypothesis: the second and third order terms in roe are not jointly significant, and we conclude that there is no evidence of a non-linear relationship in roe.
iii. We obtain the following results of the test of linear restrictions in Gretl:
[Gretl output]
The results of the F-test in Gretl show both the calculated F-statistic (the same value as the one calculated by hand) and the corresponding p-value. The p-value is equal to 0.987, which is bigger than the chosen significance level (0.01). Therefore, we do not reject the null hypothesis and conclude that the variables are not jointly significant.
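The hand calculations in 3(c) and 3(d)ii can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, the inputs are the figures quoted in the solution, and the critical values are hard-coded large-sample values (1.960 and 4.61) rather than computed from the t and F distributions.

```python
def t_statistic(coef: float, std_err: float) -> float:
    """t-statistic for H0: coefficient = 0."""
    return coef / std_err


def f_statistic(sse_r: float, sse_u: float, j: int, n: int, k: int) -> float:
    """F-statistic comparing a restricted and an unrestricted regression.

    sse_r, sse_u: sums of squared residuals of the two models
    j: number of restrictions, n: observations,
    k: number of coefficients in the unrestricted model
    """
    return ((sse_r - sse_u) / j) / (sse_u / (n - k))


# 3(c): significance of ln(sales), figures from the Gretl output quoted above.
t = t_statistic(0.2633, 0.0372)
T_CRIT = 1.960  # large-sample 5% two-sided critical value (assumption)
print(f"t = {t:.2f}, reject H0: {abs(t) > T_CRIT}")

# 3(d)ii: joint significance of roe^2 and roe^3.
f = f_statistic(sse_r=43.63565, sse_u=43.62930, j=2, n=189, k=6)
F_CRIT = 4.61  # large-sample 1% critical value for 2 restrictions (assumption)
print(f"F = {f:.4f}, reject H0: {f > F_CRIT}")
```

Because the F-statistic is far below any plausible critical value, the conclusion does not depend on whether the exact finite-sample or the large-sample critical value is used.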