Let’s get right into it. Let’s create a data set first. In this data set we’ll assign a treatment effect of 2 to the treated unit (the value of btrue in the code below).
Simplest Case (2x2)
In this code, you can change the numbers to create as many graphs as you like and practice reading regression coefficients off a graph, as in Worksheet: DiD Coefficients onto Graphs.
clear
local units = 2
local start = 1
local end = 2
local time = `end' - `start' + 1
local obsv = `units' * `time'
set obs `obsv'
egen id = seq(), b(`time')
egen t = seq(), f(`start') t(`end')
sort id t
xtset id t
lab var id "Panel variable"
lab var t "Time variable"
gen D = id==2 & t==2
gen btrue = cond(D==1, 2, 0)
gen Y = id + 3*t + btrue*D
lab de prepost 1 "Pre" 2 "Post"
lab val t prepost
twoway ///
(connected Y t if id==1) ///
(connected Y t if id==2) ///
, ///
legend(order(1 "id=1" 2 "id=2")) ///
xlabel(1 2, valuelabel) ylabel(4(1)10)
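To connect the graph to the numbers, here is a minimal sketch that rebuilds the same 2x2 data and computes the difference-in-differences by hand from the four points. The scalar names (ctrl_pre, ctrl_post, trt_pre, trt_post, did) are my own labels for illustration and are not part of the code above.

```stata
* Rebuild the 2x2 data: 2 units, 2 periods, effect of 2 on id 2 in period 2
clear
set obs 4
egen id = seq(), b(2)
egen t  = seq(), f(1) t(2)
gen D = id==2 & t==2
gen Y = id + 3*t + 2*D

* Store each of the four points on the graph (scalar names are illustrative)
quietly sum Y if id==1 & t==1, meanonly
scalar ctrl_pre  = r(mean)
quietly sum Y if id==1 & t==2, meanonly
scalar ctrl_post = r(mean)
quietly sum Y if id==2 & t==1, meanonly
scalar trt_pre   = r(mean)
quietly sum Y if id==2 & t==2, meanonly
scalar trt_post  = r(mean)

* DiD = change for the treated unit minus change for the control unit
scalar did = (trt_post - trt_pre) - (ctrl_post - ctrl_pre)
display "DiD by hand = " did
```

With these values the treated unit goes from 5 to 10 and the control unit from 4 to 7, so the difference-in-differences is 5 - 3 = 2, the value of btrue.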
More time periods
clear
local units = 2
local start = 1
local end = 10
local time = `end' - `start' + 1
local obsv = `units' * `time'
set obs `obsv'
egen id = seq(), b(`time')
egen t = seq(), f(`start') t(`end')
lab var id "Panel variable"
lab var t "Time variable"
sort id t
xtset id t
gen D = 0
replace D = 1 if id>=2 & t>=5
lab var D "Treated"
cap drop Y
gen Y = 0
replace Y = cond(D==1, 2, 0) if id==2
lab var Y "Outcome variable"
twoway ///
(connected Y t if id==1) ///
(connected Y t if id==2) ///
, ///
xline(4.5) ///
xlabel(1(1)10) ///
legend(order(1 "id=1" 2 "id=2"))
This will plot a graph that looks like this:
Now let’s run a DD, which, judging by the graph, should give us a treatment effect of 2.
gen treat=0 if id==1
replace treat=1 if id==2
gen post=0 if t<5
replace post=1 if t>=5
gen dd=treat*post
reg Y treat post dd
The story checks out: the coefficient on dd is 2 and the rest are 0. This makes sense because the difference between treatment and control before the reform is 0 (the coefficient on treat), the change over time for the control group is 0 (the coefficient on post), and the average Y for the control group before the reform is also 0 (the constant). Now let’s dive into the event study. Note that you’ll need to install coefplot if you haven’t installed it yet: ssc install coefplot
fvset base 4 t
reg Y 1.treat post 1.treat#i.t
coefplot, keep(1.treat#*.t) vertical recast(connected) xlabel(,angle(45)) xline(3.5)
* play with coefplot by adding each part of the code separately to see how things change
The graph above should be the graph that you get. Notice that I’ve added a vertical line at 3.5 because 4 is the omitted category.
Now let’s complicate things and see the value of the generalized DD. Let’s create a data set that has parallel trends, but where the outcomes evolve linearly over time as opposed to staying flat, and let’s add more units to the treatment group.
clear
local units = 3
local start = 1
local end = 10
local time = `end' - `start' + 1
local obsv = `units' * `time'
set obs `obsv'
egen id = seq(), b(`time')
egen t = seq(), f(`start') t(`end')
sort id t
xtset id t
lab var id "Panel variable"
lab var t "Time variable"
gen D = 0
replace D = 1 if id>=2 & t>=5
lab var D "Treated"
cap drop Y
gen Y = 0
replace Y = id + t + cond(D==1, 0, 0) if id==1
replace Y = id + t + cond(D==1, 2, 0) if id==2
replace Y = id + t + cond(D==1, 4, 0) if id==3
lab var Y "Outcome variable"
twoway ///
(connected Y t if id==1) ///
(connected Y t if id==2) ///
(connected Y t if id==3) ///
, ///
xline(4.5) ///
xlabel(1(1)10) ///
legend(order(1 "id=1" 2 "id=2" 3 "id=3"))
This is how the graph should look:
So here we’ve added an increasing trend over time to all units, and we now have two treated units, id 2 and id 3. id 2 has a treatment effect of 2 and id 3 has a treatment effect of 4, so with equal weights the DD should give us an average of 3.
gen treat=0 if id==1
replace treat=1 if id==2 | id==3
gen post=0 if t<5
replace post=1 if t>=5
gen dd=treat*post
reg Y treat post dd
* We can check the regression coefficients by computing the four cell means
sum Y if treat==0 & post==0
sum Y if treat==1 & post==0
sum Y if treat==0 & post==1
sum Y if treat==1 & post==1
* and doing the manual subtraction.
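The manual subtraction can be sketched as follows. This block rebuilds the three-unit data in compact form so it runs on its own; the scalar names (m00, m01, m10, m11, did) are hypothetical labels, and the regression uses factor-variable notation (i.treat##i.post) instead of the hand-made dd variable, which should give the same interaction coefficient.

```stata
* Rebuild the three-unit data (same values as above, compact form)
clear
set obs 30
egen id = seq(), b(10)
egen t  = seq(), f(1) t(10)
gen treat = id >= 2
gen post  = t >= 5
gen Y = id + t + 2*(id==2)*post + 4*(id==3)*post

* Four cell means, stored as m<treat><post> (illustrative names)
foreach g in 0 1 {
    foreach p in 0 1 {
        quietly sum Y if treat==`g' & post==`p', meanonly
        scalar m`g'`p' = r(mean)
    }
}

* DiD = (treated post - treated pre) - (control post - control pre)
scalar did = (m11 - m10) - (m01 - m00)
display "manual DiD     = " did

* Same number from the regression's interaction term
quietly reg Y i.treat##i.post
display "regression DiD = " _b[1.treat#1.post]
```

The cell means are 3.5 and 8.5 for the control unit and 5 and 13 for the treated units, so the subtraction is (13 - 5) - (8.5 - 3.5) = 3.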
So far, so good. Let’s do the event study now:
fvset base 4 t
reg Y 1.treat 1.post 1.treat#i.t
coefplot, keep(1.treat#*.t) vertical recast(connected) xlabel(,angle(45)) xline(3.5)
You may have gotten something like this:
What is happening here is that Stata is dropping 4 and 10 from the graph, and the regression output shows that it is really using 10 as the baseline. Period 10 comes after the intervention, which is why our event study doesn’t look good. This is true even though we told Stata to use 4 as the base. It happens because what we told Stata is that the variable t has a base of 4, but Stata doesn’t recognize that 1.treat#i.t comes from t; it treats it as a different variable.
So for that to work, we have to expand post into year fixed effects (time fixed effects), which makes 4 the baseline year:
fvset base 4 t
reg Y 1.treat i.t 1.treat#i.t
coefplot, keep(1.treat#*.t) vertical recast(connected) xlabel(,angle(45)) xline(3.5)
Now we have an event study that works well and gives us an overall DD effect of 3. Let’s look at the regression output, and notice that t=4 is missing because that’s the baseline.
We can always ask Stata to show us that coefficient by doing this:
reg Y 1.treat i.t 1.treat#i.t, allbase
and now we can see that 4 is the base.
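One way to see how the event study lines up with the overall DD of 3 is to average the post-period interaction coefficients. The sketch below rebuilds the data so it runs on its own, and it sets the base period with ib4.t inside the ## operator rather than with fvset, which is my substitution and not the specification used above. The scalar name avg_effect is hypothetical.

```stata
* Rebuild the three-unit data (same values as above, compact form)
clear
set obs 30
egen id = seq(), b(10)
egen t  = seq(), f(1) t(10)
gen treat = id >= 2
gen Y = id + t + 2*(id==2)*(t>=5) + 4*(id==3)*(t>=5)

* Event study with period 4 as the explicit base level of t
quietly reg Y i.treat##ib4.t

* Average the six post-period coefficients (t = 5, ..., 10)
scalar avg_effect = 0
forvalues k = 5/10 {
    scalar avg_effect = avg_effect + _b[1.treat#`k'.t] / 6
}
display "average post-period effect = " avg_effect
```

Because the pre-period coefficients are all 0 relative to period 4 in this data, the average of the post-period coefficients is the same 3 that the simple DD regression reports.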
Finally, let’s fully generalize the regression and the event study. First the regression:
reg Y i.id i.t dd
This gives us the correct DD value.
Now the event study:
fvset base 4 t
reg Y i.id i.t 1.treat#i.t, allbase
coefplot, keep(1.treat#*.t) vertical recast(connected) xlabel(,angle(45)) xline(3.5)
You may notice that this doesn’t create the nice event study, because of the same problem as before (the baseline is 10).
If that is happening, a fix is the following:
fvset base 4 t
reg Y i.id i.t i.treat##i.t, allbase
coefplot, keep(1.treat#*.t) vertical recast(connected) xlabel(,angle(45)) xline(3.5)
This happens because, for Stata to treat treat as taking the value of 1 in the interaction, we also need to add treat by itself to the model so that the baseline changes. The ## operator does all of that for us: it includes both main effects and their interaction.
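To make the role of ## concrete, this sketch compares the shorthand with the fully written-out version on the same data. It again uses ib4.t to set the base period explicitly, which is my choice and not the fvset call used above; the scalar names b_short and b_long are hypothetical.

```stata
* Rebuild the three-unit data (same values as above, compact form)
clear
set obs 30
egen id = seq(), b(10)
egen t  = seq(), f(1) t(10)
gen treat = id >= 2
gen Y = id + t + 2*(id==2)*(t>=5) + 4*(id==3)*(t>=5)

* Shorthand: ## adds both main effects and the interaction
quietly reg Y i.id i.treat##ib4.t
scalar b_short = _b[1.treat#7.t]

* Written out: main effects listed separately, # for the interaction only
quietly reg Y i.id i.treat ib4.t i.treat#ib4.t
scalar b_long = _b[1.treat#7.t]

display "## version: " b_short "   written out: " b_long
```

Both regressions should report the same coefficient for any given period. Stata will flag treat as omitted because it is collinear with the unit dummies, but it still has to appear in the model for the interaction to be measured against the intended base.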