Introduction to Econometrics
Worksheet week # 9

1. Imagine that you want to estimate race-specific crime rates. Using the data on over 2000 criminals (crime2.gdt), estimate the relationship between race and the number of crimes committed. The dataset contains the following variables:

   crime86 - number of crimes committed in 1986
   race - race (=1 if black, =2 if Hispanic, =0 otherwise)
   tottime - total number of months spent in prison since age 18
   pcnv - proportion of prior convictions
   qemp86 - number of quarters employed in 1986
   inc86 - legal income in 1986, $100s

   (a) Estimate the baseline model of the impact of race on the number of crimes committed in 1986:

       \[ crime86_i = \alpha_0 + \alpha_1 race_i + \varepsilon_i . \]

   (b) Interpret the results. Do you believe that the coefficient \alpha_1 is correctly estimated? Under what assumptions would it be?

   (c) Create two dummy variables for black and Hispanic individuals. Estimate the equation again with these two variables:

       \[ crime86_i = \beta_0 + \beta_1 black_i + \beta_2 hispanic_i + v_i . \]

       Interpret the results.

   (d) Is there anything that could still create a bias in this equation? If yes, how would you address this problem? What direction of bias do you expect?

   (e) Re-estimate the equation with variables controlling for the crime history of a person:

       \[ crime86_i = \gamma_0 + \gamma_1 black_i + \gamma_2 hispanic_i + \gamma_3 tottime_i + \gamma_4 pcnv_i + e_i . \]

   (f) Control further for the current employment status and income of an individual:

       \[ crime86_i = \delta_0 + \delta_1 black_i + \delta_2 hispanic_i + \delta_3 tottime_i + \delta_4 pcnv_i + \delta_5 qemp86_i + \delta_6 inc86_i + u_i . \]

   (g) Interpret the results from parts (e) and (f) in comparison with part (c). How did the coefficients on black and hispanic change? Did you expect this direction of potential bias? Would you conclude that the additional variables indeed belong in the model?

   (h) Test the hypothesis that no relevant explanatory variables have been omitted using the RESET test in Gretl (test the model from part (f)).
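
Illustration for parts (a) and (c). The following is a minimal sketch, not part of the exercise and not a substitute for estimating the models in Gretl. It uses a small made-up sample (the numbers are hypothetical and are not taken from crime2.gdt) to show why the coding of race matters. With race entered as a single 0/1/2 regressor, the fitted gap for Hispanic individuals is forced to be exactly twice the fitted gap for black individuals. With a constant and two mutually exclusive dummies, OLS reproduces the group means, so each gap is estimated freely. The helper names (mean, simple_ols) are introduced here for illustration only.

```python
# Hypothetical toy sample, NOT taken from crime2.gdt: (race code, crimes in 1986).
# Coding follows the worksheet: 0 = other, 1 = black, 2 = Hispanic.
sample = [
    (0, 0), (0, 1), (0, 0), (0, 1),
    (1, 2), (1, 1), (1, 3),
    (2, 1), (2, 0), (2, 1),
]
race = [r for r, _ in sample]
crime = [c for _, c in sample]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def simple_ols(x, y):
    """Return (intercept, slope) of the OLS fit y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Part (a): race enters as one numeric regressor, so the fitted gap relative
# to the base group is alpha1 for black and 2 * alpha1 for Hispanic.
alpha0, alpha1 = simple_ols(race, crime)

# Part (c): one dummy variable per group, base group = "otherwise".
black = [int(r == 1) for r in race]
hispanic = [int(r == 2) for r in race]

# With a constant and mutually exclusive dummies the model is saturated, so
# the OLS coefficients equal the base-group mean and the differences in
# group means relative to the base group.
beta0 = mean([c for c, b, h in zip(crime, black, hispanic) if not b and not h])
beta1 = mean([c for c, b in zip(crime, black) if b]) - beta0
beta2 = mean([c for c, h in zip(crime, hispanic) if h]) - beta0

print(f"(a) implied gaps: black = {alpha1:.3f}, Hispanic = {2 * alpha1:.3f}")
print(f"(c) estimated gaps: black = {beta1:.3f}, Hispanic = {beta2:.3f}")
```

In this toy sample the two specifications disagree, because the true group means are not equally spaced in the race code. Whether the same happens in crime2.gdt is what parts (a) to (c) ask you to check.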