Days 5 and 6: Popular research designs (Regression discontinuity design, Difference-in-differences and Synthetic control method) and Machine learning essentials
We will start by finishing the part on Instrumental variables.
Then we will move on to studying various research designs: Regression Discontinuity, Difference-in-differences and the Synthetic Control method. The last part will be an introduction to Machine Learning methods, finishing with Causal Machine Learning: utilizing the great predictive properties of ML algorithms to improve the estimation of the parameters of interest (which is what economists are typically interested in).
"Nature does not make jumps". Regression discontinuity design utilizes the variation that is induced by some discontinuity in the data, typically induced by some administrative rule/law. (lecture notes updated 6.1.2022)
One of the most popular current research designs is arguably Difference-in-differences (DiD). It is everywhere. It accounts for approximately one quarter of the 100 most cited papers in economics from 2015-2019. The main assumption that DiD is based on is the so-called "parallel trends assumption": we make an assumption about how a treated unit would have behaved had the treatment/intervention not happened. There have been many important contributions to the literature in the past three years, showing that the methods that were being routinely used have some serious drawbacks. We will go through some of these recent developments too, with an emphasis on intuition. (lecture notes updated 6.1.2022)
There is R code that accompanies some of the examples presented.
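Separately from the R code mentioned above, the basic two-group, two-period DiD calculation can be sketched in a few lines of Python. The numbers and the function name `did_2x2` are made up for illustration. Under the parallel trends assumption, the change in the control group stands in for the change the treated group would have experienced without treatment, so the effect is the difference between the two changes.

```python
def mean(values):
    return sum(values) / len(values)


def did_2x2(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences for two groups and two periods."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # Parallel trends: control_change is the counterfactual change
    # of the treated group in the absence of treatment.
    return treated_change - control_change


# Hypothetical outcomes, three units per group and period.
treated_pre = [10.0, 12.0, 11.0]
treated_post = [15.0, 17.0, 16.0]
control_pre = [8.0, 9.0, 10.0]
control_post = [10.0, 11.0, 12.0]

effect = did_2x2(treated_pre, treated_post, control_pre, control_post)
print("DiD estimate:", effect)  # (16 - 11) - (11 - 9) = 3.0
```

This simple comparison covers only the case of one treatment date; the recent literature mentioned above deals with what can go wrong when units are treated at different times.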
The next research design, the Synthetic Control Method, is so beautiful and intuitive that it made it into the Wall Street Journal and the Washington Post. The person who introduced and developed these ideas is Alberto Abadie from MIT. Suppose there is no proper match that a treated unit could be compared to. What do we do? We create a synthetic one as a weighted average of several untreated units. The very visual nature of this method makes it very popular in industry too, and it is an area of active current research.
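The weighted-average idea can be illustrated with a small Python sketch. Everything in it is hypothetical: the three donor units, their outcome paths and the helper names `weighted_path` and `fit_weights`. It searches a coarse grid of non-negative weights that sum to one and keeps the combination whose pre-treatment path is closest to the treated unit. Real implementations solve a constrained optimization problem and usually match on covariates as well, so this is only a toy version.

```python
def weighted_path(weights, donor_paths):
    """Weighted average of the donor outcome paths, period by period."""
    periods = len(donor_paths[0])
    return [sum(w * path[t] for w, path in zip(weights, donor_paths))
            for t in range(periods)]


def fit_weights(treated_pre, donors_pre, steps=20):
    """Grid search over weights (w1, w2, w3) >= 0 with w1 + w2 + w3 = 1.

    Works for exactly three donors; minimises the squared pre-treatment gap.
    """
    best_weights, best_loss = None, float("inf")
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            k = steps - i - j
            weights = (i / steps, j / steps, k / steps)
            synthetic = weighted_path(weights, donors_pre)
            loss = sum((y - s) ** 2 for y, s in zip(treated_pre, synthetic))
            if loss < best_loss:
                best_weights, best_loss = weights, loss
    return best_weights


# Hypothetical outcome paths: four pre-treatment and two post-treatment periods.
donors_pre = [[1.0, 2.0, 3.0, 4.0],   # donor A
              [2.0, 2.0, 4.0, 3.0],   # donor B
              [5.0, 1.0, 0.0, 2.0]]   # donor C
donors_post = [[5.0, 6.0], [3.0, 5.0], [1.0, 1.0]]
treated_pre = [1.5, 2.0, 3.5, 3.5]
treated_post = [6.0, 8.0]

weights = fit_weights(treated_pre, donors_pre)
synthetic_post = weighted_path(weights, donors_post)
# The estimated effect is the gap between the treated unit and its synthetic twin.
gap = [y - s for y, s in zip(treated_post, synthetic_post)]
print("weights:", weights)
print("post-treatment gap:", gap)
```

Plotting the treated path against the synthetic path over time gives the familiar picture that makes the method so easy to communicate.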
The last thing we will cover is some basics of Machine Learning, after which we will mention some of the modern literature on how ML is making its way into economics. ML methods are built for prediction. In certain situations we don't care much about estimation or about understanding the underlying mechanism, as long as the method works/predicts well/earns money. Combining ML algorithms with causal models is where a lot of research effort is currently directed. It seems like a good idea to get familiar with the basics of ML methods, as they are spreading everywhere.
The R code for the last two topics is here
The third assignment is here:
And can be downloaded here: