Sebastian Tello

STATA: Plotting Event Studies

Let’s get right into it. Let’s create a data set first. In this data set we’ll set the treatment effect to 2 for the treated group.

Simplest Case (2x2)

In the code for this case (the first data block at the end of this page), you can change the numbers to create as many graphs as you like and practice identifying the coefficients of a regression, as in 📝 Worksheet: DiD Coefficients onto Graphs.

More time periods

The code for this case (the second data block at the end of this page) plots a graph that looks like this:

image

Now let’s run a DD, which, by the look of the graph, should give us a treatment effect of 2.

gen treat=0 if id==1
replace treat=1 if id==2

gen post=0 if t<5
replace post=1 if t>=5

gen dd=treat*post

reg Y treat post dd
image
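As a worked check, the four coefficients can be read straight off the group means of the flat data. The notation here is mine, not part of the original post: $\bar Y_{g,p}$ is the mean of Y for group $g$ (control C or treated T) in period $p$ (pre or post). In this data set every mean is 0 except the treated group after the reform, which is 2.

```latex
\begin{aligned}
\hat\beta_{\text{cons}}  &= \bar Y_{C,\text{pre}} = 0 \\
\hat\beta_{\text{treat}} &= \bar Y_{T,\text{pre}} - \bar Y_{C,\text{pre}} = 0 - 0 = 0 \\
\hat\beta_{\text{post}}  &= \bar Y_{C,\text{post}} - \bar Y_{C,\text{pre}} = 0 - 0 = 0 \\
\hat\beta_{\text{dd}}    &= \left(\bar Y_{T,\text{post}} - \bar Y_{T,\text{pre}}\right)
                           - \left(\bar Y_{C,\text{post}} - \bar Y_{C,\text{pre}}\right)
                          = (2 - 0) - (0 - 0) = 2
\end{aligned}
```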

The story checks out: the coefficient on DD is 2 and the rest are 0. This makes sense because the difference between treatment and control before the reform is 0, the change over time for the control group is 0, and the average Y for the control group before the reform is also 0. Now let’s dive into the event study. Note that you’ll need to install coefplot if you haven’t installed it yet: ssc install coefplot


fvset base 4 t
reg Y 1.treat post 1.treat#i.t
coefplot, keep(1.treat#*.t) vertical recast(connected) xlabel(,angle(45)) xline(3.5)

* play with coefplot by adding each part of the code separately to see how things change

This should be the graph that you get. Notice that I’ve added a vertical line at 3.5 because 4 is the omitted category.

image

Now let’s complicate things and see the value of the generalized DD. Let’s create a data set (the third data block at the end of this page) in which the units have parallel trends that evolve linearly over time, as opposed to staying flat, and let’s add more units to the treatment group.

This is how the graph should look:

image

So here we’ve added an increasing trend over time to all units, and we now have two treated units, id 2 and id 3. id 2 has a treatment effect of 2 and id 3 has a treatment effect of 4, so with equal weights the DD should give us an average of 3.

gen treat=0 if id==1
replace treat=1 if id==2 | id==3

gen post=0 if t<5
replace post=1 if t>=5

gen dd=treat*post

reg Y treat post dd

* We can check the results of the coefficients by getting the numbers of
sum Y if treat==0 & post==0
sum Y if treat==1 & post==0
sum Y if treat==0 & post==1
sum Y if treat==1 & post==1
* and doing the manual subtraction.
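The manual subtraction can also be scripted. This is only a sketch: it assumes Y, treat and post exist exactly as defined above, and the scalar names (c_pre, t_pre, c_post, t_post) are mine. With the trending data the four means are 3.5, 5, 8.5 and 13, so the difference-in-differences is (13 - 5) - (8.5 - 3.5) = 3, matching the coefficient on dd.

```stata
* Store the four group means in scalars (scalar names are illustrative)
quietly summarize Y if treat==0 & post==0
scalar c_pre = r(mean)
quietly summarize Y if treat==1 & post==0
scalar t_pre = r(mean)
quietly summarize Y if treat==0 & post==1
scalar c_post = r(mean)
quietly summarize Y if treat==1 & post==1
scalar t_post = r(mean)

* Change for the treated minus change for the control
display "DD by hand = " (t_post - t_pre) - (c_post - c_pre)
```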

So far, so good. Let’s do the event study now.

fvset base 4 t
reg Y 1.treat 1.post 1.treat#i.t
coefplot, keep(1.treat#*.t) vertical recast(connected) xlabel(,angle(45)) xline(3.5)

You may have gotten something like this:

image

What is happening here is that STATA is dropping 4 and 10 from the graph, and the regression output shows that it is really using 10 as the baseline. Period 10 is after the intervention, which is why our event study doesn’t look good. This happens even though we told STATA to use 4 as the base: what we set is the base of the variable t, but STATA doesn’t recognize that 1.treat#i.t comes from t, and treats it as a different variable.

image

So, for this to work well, we have to replace post with year (time) fixed effects, which makes the baseline year 4:

fvset base 4 t
reg Y 1.treat i.t 1.treat#i.t
coefplot, keep(1.treat#*.t) vertical recast(connected) xlabel(,angle(45)) xline(3.5)
image

Now we have an event study that works well and gives us an overall DD effect of 3. Let’s look at the regression output, and notice that the coefficient for t=4 is missing because that’s the baseline:

image
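To make the link between these coefficients and the data explicit, here is a worked version of the event-study regression with time fixed effects. The symbols are my own shorthand, not the post's: $\bar Y_{T,k}$ and $\bar Y_{C,k}$ are the treated and control means of Y in period $k$, and $\beta_k$ is the coefficient on the treat-by-period interaction. In the trending data the treated-minus-control gap is 1.5 in every pre period and 4.5 in every post period, so each $\beta_k$ is that gap minus the gap in the base period 4.

```latex
\begin{aligned}
Y_{it} &= \alpha + \delta\,\text{treat}_i + \gamma_t
          + \sum_{k \neq 4} \beta_k\,\text{treat}_i \cdot 1[t = k] + \varepsilon_{it} \\[4pt]
\beta_k &= \left(\bar Y_{T,k} - \bar Y_{C,k}\right) - \left(\bar Y_{T,4} - \bar Y_{C,4}\right) \\
k \le 3 &: \quad \beta_k = 1.5 - 1.5 = 0 \\
k \ge 5 &: \quad \beta_k = 4.5 - 1.5 = 3
\end{aligned}
```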

We can always ask STATA to show us that coefficient by doing this:

reg Y 1.treat i.t 1.treat#i.t, allbase
image

and now we can see that 4 is the base.

Finally, let’s fully generalize the regression and the event study. First, the regression:

reg Y i.id i.t dd

This gives us the correct DD value:

image
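Written as an equation, the generalized regression is the usual two-way fixed effects model. The notation is mine: $\alpha_i$ stands for the unit dummies (i.id) and $\gamma_t$ for the period dummies (i.t). The unit dummies absorb treat and the period dummies absorb post, so only their product dd is left to estimate.

```latex
Y_{it} = \alpha_i + \gamma_t + \beta\,\text{dd}_{it} + \varepsilon_{it},
\qquad \text{dd}_{it} = \text{treat}_i \times \text{post}_t
```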

and the event study:

fvset base 4 t
reg Y i.id i.t 1.treat#i.t, allbase
coefplot, keep(1.treat#*.t) vertical recast(connected) xlabel(,angle(45)) xline(3.5)

You may notice that this doesn’t create a nice event study, because of the same problem as before (the baseline is 10).

image

If that happens, a fix is the following:


fvset base 4 t
reg Y i.id i.t i.treat##i.t, allbase
coefplot, keep(1.treat#*.t) vertical recast(connected) xlabel(,angle(45)) xline(3.5)
image

This happens because, for STATA to take treat equal to 1 in the interaction and use the baseline we want, we also need to add treat by itself to the regression. The ## operator does all of that for us: it adds the main effects together with the interaction.
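A different route, which is not what the code above does, is to skip factor-variable base levels altogether and build the treat-by-period dummies by hand, leaving out period 4. This is a sketch that assumes the trending data set with treat already generated and t running from 1 to 10; the variable names ev1, ev2, and so on are made up for illustration.

```stata
* One treat-by-period dummy for every period except the base period 4
forvalues k = 1/10 {
    if `k' != 4 {
        gen byte ev`k' = (treat == 1 & t == `k')
    }
}

* Unit and time fixed effects plus the hand-built interactions
reg Y i.id i.t ev*

* The dummies are plotted in order (ev1 ev2 ev3 ev5 ...), so 3.5 sits where period 4 was left out
coefplot, keep(ev*) vertical recast(connected) xlabel(,angle(45)) xline(3.5)
```

Because period 4 is never created, it is the reference period by construction, and the plot should show the same pattern as the ## version.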

* Data and plot for the simplest case (2x2)
clear
local units = 2
local start = 1
local end   = 2

local time = `end' - `start' + 1
local obsv = `units' * `time'
set obs `obsv'

egen id	   = seq(), b(`time')  
egen t 	   = seq(), f(`start') t(`end') 	

sort id t
xtset id t

lab var id "Panel variable"
lab var t  "Time  variable"

gen D = id==2 & t==2

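* True treatment effect: 2 for the treated observation, 0 otherwise
gen btrue = cond(D==1, 2, 0)
* Outcome: unit level, plus 3 per period, plus the treatment effect
gen Y = id + 3*t + btrue*D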

lab de prepost 1 "Pre" 2 "Post"
lab val t prepost

twoway ///
	(connected Y t if id==1) ///
	(connected Y t if id==2) ///
		,	///
		legend(order(1 "id=1" 2 "id=2")) ///
		xlabel(1 2, valuelabel) ylabel(4(1)10)

* Data and plot for the case with more time periods (2 units, 10 periods, flat outcomes)
clear
local units = 2
local start = 1
local end 	= 10

local time = `end' - `start' + 1
local obsv = `units' * `time'
set obs `obsv'

egen id	   = seq(), b(`time')  
egen t 	   = seq(), f(`start') t(`end') 	

lab var id "Panel variable"
lab var t  "Time  variable"

sort  id t
xtset id t


gen D = 0
replace D = 1 if id>=2 & t>=5
lab var D "Treated"

cap drop Y
gen Y = 0
replace Y = cond(D==1, 2, 0) if id==2

lab var Y "Outcome variable"	

twoway ///
	(connected Y t if id==1) ///
	(connected Y t if id==2) ///
		,	///
		xline(4.5) ///
		xlabel(1(1)10) ///
		legend(order(1 "id=1" 2 "id=2"))	

* Data and plot for the trending case (3 units, 10 periods, parallel linear trends)
clear
local units = 3
local start = 1
local end 	= 10

local time = `end' - `start' + 1
local obsv = `units' * `time'
set obs `obsv'

egen id	   = seq(), b(`time')  
egen t 	   = seq(), f(`start') t(`end') 	

sort  id t
xtset id t

lab var id "Panel variable"
lab var t  "Time  variable"

gen D = 0
replace D = 1 if id>=2 & t>=5
lab var D "Treated"

cap drop Y
gen Y = 0
replace Y = id + t + cond(D==1, 0, 0) if id==1
replace Y = id + t + cond(D==1, 2, 0) if id==2
replace Y = id + t + cond(D==1, 4, 0) if id==3

lab var Y "Outcome variable"		

twoway ///
	(connected Y t if id==1) ///
	(connected Y t if id==2) ///
	(connected Y t if id==3) ///
		,	///
		xline(4.5) ///
		xlabel(1(1)10) ///
		legend(order(1 "id=1" 2 "id=2" 3 "id=3"))