Purpose
The objective of this homework is for you to practice concepts learned in class and apply them to a real-world scenario. This homework focuses on regression discontinuity (RD) designs.
Guidelines
- Your responses should be professionally formatted and written. You can type your answers in a Word, PDF, or Google Docs file.
- Submit your do-file and answers on this submission form
- You will get points for correct answers. Points will be deducted if an answer includes unnecessary information or mixes incorrect statements with correct ones. In short, we want you to use the fewest characters possible while maximizing the accuracy of your responses.
- The due date is June 30th, 2025.
Understanding a paper on RD
Read the enclosed paper by Huh and Reif (2021) on the effect of teenage driving on mortality and risky behaviors in the United States. For those interested in learning more about the causal effects of alcohol use, we've listed some optional readings at the end of this assignment. Use Huh and Reif (2021) to answer the following questions:
- What is the running variable?
- What ages does the analytic sample contain?
- Who is in the treated group?
- Who is in the control group?
- What do we have to be willing to assume if we want to believe that this analysis captures the causal effect of teenage driving on mortality?
- For the following questions, use the results from Table 1:
- What is the mean mortality rate from all causes for the “control group” in this analysis? Be sure to specify the units in your answer.
- What is the effect of reaching the minimum driving age (MDA) on all-cause mortality? Present your answer as a percent change relative to the control mean from the previous question. Is this effect statistically significant?
- Repeat the previous two questions for the homicide results: specify the control mean, calculate the treatment effect in percentage terms relative to that mean, and interpret the magnitude and statistical significance.
- Which category of mortality, external or internal, accounts for the larger share of the change in all-cause mortality at the discontinuity? Within the external categories, which one contributes the most?
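The percent-change and significance questions above come down to simple arithmetic. The sketch below, in Python, shows the calculation with made-up numbers; the function names are hypothetical and the values are NOT taken from Table 1. The significance check assumes a normal approximation at roughly the 5% level, which may differ from how the paper reports significance.

```python
def percent_change(effect: float, control_mean: float) -> float:
    """Express an RD estimate as a percent of the control-group mean."""
    if control_mean == 0:
        raise ValueError("control mean must be nonzero")
    return 100.0 * effect / control_mean


def is_significant(effect: float, std_error: float, critical: float = 1.96) -> bool:
    """Two-sided test at roughly the 5% level (normal approximation)."""
    return abs(effect / std_error) > critical


# Hypothetical values for illustration only (not from the paper).
# The estimate, its standard error, and the mean must share the same units.
example_effect = 5.0
example_se = 2.0
example_mean = 40.0

print(round(percent_change(example_effect, example_mean), 1))  # 12.5
print(is_significant(example_effect, example_se))               # True
```

With these made-up inputs the estimate is 12.5 percent of the control mean, and the t-ratio of 2.5 exceeds 1.96.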
Assumptions and Policy Implication
- One concern is that, since most MDAs are set at 16, the discontinuity may just be capturing other things that happen specifically at 16: for example, a big birthday bash (sweet sixteen) or, more seriously, the federal minimum legal working age or a state’s minimum school-leaving age. How do the authors argue that their analysis is not capturing any of these components? What empirical exercise do they provide?
- Even though the MDA is clearly defined, not everyone who passes the threshold chooses to get a driver's license. What is the effect of getting a driver's license on mortality? Show your work.
- Each US state can set its own MDA. What do Huh and Reif's findings suggest about changing the MDA? What caveats would you offer in drawing policy conclusions from these results? (Ungraded) (completion point)
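When not everyone crossing the threshold takes up treatment, one common approach is to rescale the jump in the outcome by the jump in take-up (a Wald or fuzzy-RD ratio). The Python sketch below assumes that crossing the MDA affects mortality only through licensing; whether that assumption is reasonable here is part of the question. The function name and the numbers are hypothetical and are not taken from the paper.

```python
def wald_estimate(reduced_form: float, first_stage: float) -> float:
    """Scale the jump in the outcome by the jump in treatment take-up.

    reduced_form: discontinuity in the outcome at the cutoff
    first_stage:  discontinuity in the share treated at the cutoff
    """
    if first_stage == 0:
        raise ValueError("first stage must be nonzero")
    return reduced_form / first_stage


# Hypothetical values for illustration only (not from the paper).
example_reduced_form = 6.0    # jump in the outcome at the cutoff
example_first_stage = 0.25    # jump in the share holding a license

print(wald_estimate(example_reduced_form, example_first_stage))  # 24.0
```

Because the first stage is a share between 0 and 1, the rescaled effect is larger in magnitude than the reduced-form jump.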
Estimating the results using Stata
For this section of the homework, use the data provided on the homework page. We will try to replicate the results from the paper. Open the data file all_dl_hw4.dta. An easy way to make great-looking graphs is to use a good scheme. Schemes in Stata are preset preferences for color and design. I recommend a scheme called plotplainblind, which uses color-blind-friendly colors. You can find it by searching for blindschemes. Use your Google abilities to figure out how to use it!
- Use the commands twoway, scatter, and lfit to create a figure similar to Figure 1, panel a. Use the full sample rather than splitting by males and females. Use the variable Proportion with Driver’s license. Label your axes. (Use the range from -12 to +21 months.)
- Use the commands twoway, scatter, and lfit to create a figure similar to Figure 1, panel a. Use the full sample rather than splitting by males and females. Use the variable VehicleMiles_150. Label your axes.
- Seeing the full data is useful, but sometimes it is noisy and patterns are hard to see. Create the same two figures as before, now using the command binscatter. Check out the “by” option in the binscatter command. Label your axes.
- Use binscatter to graph an RD figure, where “Work during last 4 weeks” is the y-axis variable instead. Does this graph show any evidence of a discontinuity at MDA?
- Now let's estimate the following regressions. End each regression command with [aweight=tri_wgt], robust. The regression to run is the “baseline” model, which is the main model estimated in the paper. You may need to create new variables.
- Model 1: Estimate the baseline RD model where the outcome is DriverLicense.
- Model 2: Estimate the baseline RD model where the outcome is VehicleMiles_150.
- Model 3: Estimate the baseline RD model where the outcome is VehicleMiles_265.
- Model 4: Estimate the baseline RD model where the outcome is Work4weeks.
- Model 5: Estimate the baseline RD model where the outcome is NotEnrolled.
- Provide a table presenting your results from the five models (try to format it nicely; you are welcome to use formats from previous homeworks). The table should report only the coefficient of interest (the discontinuity), along with the sample size of each regression and the mean of the dependent variable in the bottom rows. The results should be similar to the ones from the paper.
- For the following questions, use the mortality data. First, let's create RD graphs, using either binscatter or the scatter/lfit commands, for the following outcomes:
- all-cause mortality
- motor vehicle mortality
- homicides
- Without seeing the results from the regressions, do you see discontinuities in all of the graphs? Discuss. (ungraded) (completion point)
- Find the baseline specification that the authors use; it is usually described under the empirical strategy. Open the mortality data (all_mortality_hw4.dta). Using the baseline model, create another table with the following models as columns:
- Model 1: Estimate the baseline RD model where the outcome is all-cause mortality.
- Model 2: Estimate the baseline RD model where the outcome is motor vehicle mortality.
- Model 3: Estimate the baseline RD model where the outcome is homicide.
- Provide a table presenting your results from the three models (try to format it nicely; you are welcome to use formats from previous homeworks as examples). The table should report only the coefficient of interest (the discontinuity), along with the sample size of each regression and the mean of the dependent variable in the bottom rows.
- Why do we check if MDA is affecting homicides?
- Using the baseline model, create another table with the following models as columns. Use all-cause mortality as the outcome in every model.
- Model 1: Baseline model
- Model 2: Baseline model
- Model 3: Baseline model +
- Provide a table presenting your results from the three models (try to format it nicely; you are welcome to use formats from previous homeworks as examples). The table should report only the coefficient of interest (the discontinuity), along with the sample size of each regression and the mean of the dependent variable in the bottom rows.
- Does including the quadratic or cubic term of the running variable affect the results? What does this mean?
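To build intuition for the polynomial check above, here is a minimal Python sketch on synthetic data. It assumes an RD regression with a post-cutoff indicator, polynomial terms in the running variable that are allowed to differ on each side of the cutoff, and triangular kernel weights; confirm the paper's actual baseline specification before relying on this form. All names (`rd_estimate`, `triangular_weight`, `solve_linear_system`) and the data are hypothetical, and no standard errors are computed.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def triangular_weight(x, bandwidth):
    """Triangular kernel: weight falls linearly with distance from the cutoff."""
    return max(0.0, 1.0 - abs(x) / bandwidth)


def rd_estimate(x, y, w, order=1):
    """Weighted least squares of y on 1, post, and x^k, post*x^k for k = 1..order.

    The cutoff is at x = 0. Returns the coefficient on post (the discontinuity).
    """
    rows = []
    for xi in x:
        post = 1.0 if xi >= 0 else 0.0
        row = [1.0, post]
        for k in range(1, order + 1):
            row += [xi ** k, post * xi ** k]
        rows.append(row)
    p = len(rows[0])
    xtwx = [[sum(wi * r[i] * r[j] for r, wi in zip(rows, w)) for j in range(p)]
            for i in range(p)]
    xtwy = [sum(wi * r[i] * yi for r, yi, wi in zip(rows, y, w)) for i in range(p)]
    return solve_linear_system(xtwx, xtwy)[1]


# Synthetic data: running variable in years relative to the cutoff,
# a true jump of 2.0, and curvature on the right side only.
x = [m / 12 for m in range(-12, 12)]
y = [10 + 3 * xi + (2 + 4 * xi ** 2 if xi >= 0 else 0.0) for xi in x]
w = [triangular_weight(xi, bandwidth=13 / 12) for xi in x]

for order in (1, 2, 3):
    print(order, round(rd_estimate(x, y, w, order), 3))
```

In this synthetic example the linear fit misses the curvature to the right of the cutoff, so its estimate drifts away from the true jump of 2.0, while the quadratic and cubic fits recover it. Estimates that stay similar across polynomial orders suggest the result does not hinge on the functional form chosen for the running variable.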
References
- Christopher Carpenter and Carlos Dobkin. The effect of alcohol consumption on mortality: Regression discontinuity evidence from the minimum drinking age. American Economic Journal: Applied Economics, 1(1):164–182, 2009.
- Christopher Carpenter and Carlos Dobkin. The minimum legal drinking age and public health. The Journal of Economic Perspectives, 25(2):133–156, 2011.
- Scott E Carrell, Mark Hoekstra, and James E West. Does drinking impair college performance? Evidence from a regression discontinuity approach. Journal of Public Economics, 95(1):54–62, 2011.
- Benjamin Crost and Santiago Guerrero. The effect of alcohol availability on marijuana use: Evidence from the minimum legal drinking age. Journal of Health Economics, 31(1):112–121, 2012.
- Jason Huh and Julian Reif. Teenage driving, mortality, and risky behaviors. American Economic Review: Insights, 3(4):523–39, 2021.