For this exercise on size, we will use data and visualizations to answer the question. We will walk you through the code, but as always, try to do it first!
One measure of the burden that health outcomes place on the healthcare system is the number of emergency room or emergency department visits (ER or ED visits). The ER sees people who were recently injured, for example in a car accident, as well as people whose health has deteriorated so much that they decided to go to the hospital. Visits can be coded as preventable or non-preventable. If, after controlling for seasonality (for example, there may be more accidents in winter than in spring), the rate of non-preventable visits is fairly similar across places (i.e., they are random events), then the leftover variation across places reflects preventable visits.
The dataset er_visits.dta contains total ER visits for two small towns over 20 years. It also includes the population of each town in each year.
You can find the data here:
We are interested in understanding the impact of a hospital policy that was implemented in both towns. Both towns ran very similar vaccination campaigns, except that Metroville pushed the campaign to "everyone" while Smallville pushed it only to a target population of elderly residents. In short, Smallville had more pushback against using government resources on these types of campaigns than Metroville. Both policies went into effect in 2015, and we want to see their impact on ER visits in each town. Of course, we'll need to learn causal tools to determine this more precisely, but for now we'll focus on a before-and-after comparison.
- Let's say that's all the information you have: the data and the prompt of us trying to figure out which had a larger impact on ER visits. What would you do given what you've learned in class? (Look at the answer after you are done with all the questions)
- Let's plot the data to see what we see. The task is to plot a graph in which the y-axis is the number of ER visits and the x-axis is years. We should have two lines, one per town. I recommend exploring how to use the command twoway in Stata.
- From the graph, it seems like Smallville didn't really have an effect and Metroville had a big effect. Could both of them still have the same size of effect in percentage terms even though not the same in numbers? What amendment to the graph would you recommend?
- Create a new variable that is the rate of ER visits per 100,000 residents. Provide the code, and provide the mean rate before the intervention and the mean rate after it for both towns (i.e., 4 numbers).
- Create the same graph as in question 2 but now using the rate variable. Do you notice any differences?
- Now let's answer two questions: Which town had the larger impact, and what is the size of the drop for each town expressed as a percentage?
- Is this difference meaningful? How would you decide whether this difference in effect size is meaningful or not?
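The arithmetic behind the rate and percentage questions above can be sketched in a few lines. This is only an illustration of the two formulas, written in Python rather than Stata so it can be run anywhere. The function names and all of the numbers are hypothetical and are not taken from er_visits.dta.

```python
def rate_per_100k(visits, population):
    """ER visits per 100,000 residents."""
    # Multiply before dividing so whole-number inputs stay exact.
    return visits * 100_000 / population


def percent_change(before, after):
    """Change from `before` to `after`, as a percentage of `before`."""
    return (after - before) * 100 / before


# Hypothetical numbers for illustration only (NOT from er_visits.dta):
# a larger town and a smaller town, before and after some policy.
big_before = rate_per_100k(visits=5_000, population=100_000)
big_after = rate_per_100k(visits=4_000, population=100_000)
small_before = rate_per_100k(visits=500, population=10_000)
small_after = rate_per_100k(visits=400, population=10_000)

print("larger town: ", big_before, big_after,
      percent_change(big_before, big_after))
print("smaller town:", small_before, small_after,
      percent_change(small_before, small_after))
```

With these made-up inputs, the raw drop is 1,000 visits in the larger town and only 100 in the smaller one, yet both towns have the same rate per 100,000 and the same percentage change. A plot of raw counts can therefore make effects look more different than they are.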