Sebastian Tello

Homework 6 (RD)

Purpose

The objective of this homework is for you to practice concepts learned in class and apply them to a real-world scenario. The concepts we will practice in this homework are related to understanding regression discontinuity.

Guidelines

  • You can work by yourself or in groups of up to two.
  • Submit your answers to Gradescope. One submission per group.
  • We encourage you to use the boxes. PDFs, JPGs, and PNGs are preferable to Word documents or CSVs. Recall that you can always save something as a PDF. You can also screenshot anything. In Windows, you can do this using the Snipping Tool or Windows+Shift+S. On Mac, you can do this using Command+Shift+4.
  • Submit your do-file to Gradescope.
  • You will get points for correct answers. Points will be deducted if an answer includes unnecessary information or mixes incorrect statements with correct ones. In short, we want you to use as few characters as possible while maximizing the accuracy of your responses.
  • Your responses should be professionally formatted and written.
  • Questions marked “(Pass or fail)” are completion-type questions.
  • The due date is April 14th at 10:00pm EDT.

Understanding a paper on RD

Read the enclosed paper on the relationship between teenage driving, mortality, and risky behaviors in the United States by Huh and Reif (2021). For those interested in learning more about the causal effects of alcohol use, we've listed some optional readings at the end of this assignment. Use the paper by Huh and Reif (2021) to answer the following questions:

huh and reif (2021).pdf (653.8 KiB)
  1. What is the research question?
  2. What is the running variable?
  3. What ages does the analytic sample contain?
  4. Who is in the treated group?
  5. Who is in the control group?
  6. What do we have to be willing to assume if we want to believe this analysis captures the causal effect of teenage driving on mortality?
  7. For the following questions, use the results from the tables.
    (a) What is the mean mortality rate from all causes for the “control group” in this analysis? Be sure to specify the units in your answer.
    (b) What is the effect of the minimum driving age (MDA) on all-cause mortality? Present your answer as a percent change relative to the mean you found in (a). Is this effect statistically significant?
    (c) Repeat steps (a)-(b) for the homicide results. Specify the control mean, calculate the treatment effect in percentage terms relative to the mean, and interpret the magnitude and statistical significance.
    (d) Which category of mortality accounts for the largest share of the change in all-cause mortality at the discontinuity: external or internal? Which category accounts for the largest share within external causes?
  8. (Pass or fail) The senator's communications director asks you to write one sentence a parent can understand summarizing the paper's main finding. Write that sentence. Then write one honest limitation the senator should communicate alongside the result. (Focus on a limitation of how the result can be interpreted, not on reasons the identifying assumptions might fail.)
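
As a reminder for the percent-change questions above, the calculation can be sketched as follows. This is a generic formula, not one copied from the paper; the symbols are notation assumed here, where $\hat{\tau}$ is the estimated discontinuity and $\bar{y}_{c}$ is the control-group mean, both in the same units (for example, deaths per 100,000).

$$\%\Delta = 100 \times \frac{\hat{\tau}}{\bar{y}_{c}}$$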

Assumptions and Policy Implication

  1. One concern is that, since most MDAs are set at 16, the discontinuity may just be capturing other things that happen specifically at 16: for example, a big birthday bash (sweet sixteen) or, more seriously, the federal minimum legal working age or a state’s minimum school-leaving age. How do the authors argue that their analysis is not capturing any of these components? What empirical exercise do they provide?
  2. A skeptical colleague says: "I don't buy this RD. Teenagers who rush to get their license the moment they hit the MDA are the most risk-loving kids. Risky teenagers were going to die of something; driving just gave them one more way to do it. This paper isn't estimating the effect of driving; it's just picking up that reckless teenagers are reckless."

    (a) Which RD assumption does this concern challenge? State it precisely.

    (b) What evidence from the paper does or does not address this concern?

    (c) (Pass or fail) Propose one empirical test the authors could run to assess whether this critique has merit.

  3. Even though the MDA is clearly defined, not everyone who passes the threshold chooses to have a driver's license. What is the effect of getting a driver's license on mortality? Show your work.
  4. (Pass or fail) Each US state can set its own MDA. What do Huh and Reif's findings suggest about changing the MDA? What caveats would you offer in drawing policy conclusions from these results? (Ungraded) (completion point)
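
For the question on the effect of holding a driver's license, one common way to set up the calculation is a ratio of two discontinuities (a fuzzy-RD or Wald-style scaling). The sketch below uses notation assumed here, not taken from the paper: $\hat{\tau}_{\text{mortality}}$ is the jump in mortality at the MDA and $\hat{\tau}_{\text{license}}$ is the jump in the share of teenagers holding a license. It relies on the additional assumption that reaching the MDA affects mortality only through licensing.

$$\hat{\tau}_{\text{license effect}} = \frac{\hat{\tau}_{\text{mortality}}}{\hat{\tau}_{\text{license}}}$$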

Translating Results for Decision-Makers

  1. A state senator in Virginia is considering raising the MDA from 16 to 17. Suppose Virginia has approximately 100,000 teenagers at each age. Using the paper's estimates, how many fewer teenage deaths would Virginia expect per year if it raised the MDA by one year? Show your work and state clearly what assumptions you are making.
  2. Raising the MDA imposes real costs on teenagers — reduced mobility, reliance on parents, potential lost earnings from part-time jobs. Suppose those costs are equivalent to $1,500 per teen-year of delayed driving. Using your answer from (1), compute a rough cost-per-life-saved estimate for a one-year MDA increase in Virginia. How does this compare to the cost-per-life-saved of other road safety policies (e.g., mandating seatbelts costs roughly $30,000–$300,000 per life saved)? What does this comparison suggest — and what caveat should you always attach when making this kind of comparison?
  3. (Pass or fail) If the goal is to reduce teen mortality from driving, raising the MDA is one tool. Using the paper's mechanism findings — that motor vehicle accidents account for 84% of the excess deaths — propose one alternative policy that targets the same mechanism without changing the MDA. What do the paper's results tell you about how effective that policy (the one you recommend) might be? What do they not tell you?
  4. (Pass or fail) The paper's estimates identify the effect of reaching the MDA for a specific subpopulation: teenagers at the margin of the age cutoff in states and years covered by the data. A state health official asks: "Can I use these numbers to predict what would happen in my state if I raised the MDA tomorrow?" What concept from class should guide your answer? Identify two specific ways the treatment effect could differ in her state, and say whether the true effect would likely be larger or smaller than 5.84 per 100,000.
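
The arithmetic behind questions 1 and 2 of this section can be sketched symbolically. The notation is assumed here: $N$ is the number of teenagers affected (100,000 in the question), $\hat{\tau}$ is the paper's estimated effect in deaths per 100,000, $c$ is the cost per teen-year of delayed driving ($1,500 in the question), $D$ is deaths averted, and $C$ is total cost. The sketch assumes the estimate applies uniformly to the whole affected cohort for one year and that averted deaths are not simply delayed.

$$D = N \times \frac{\hat{\tau}}{100{,}000}, \qquad C = c \times N, \qquad \frac{C}{D} = \frac{c \times 100{,}000}{\hat{\tau}}$$

Note that $N$ cancels in the last expression, so under these assumptions the cost per life saved does not depend on the size of the cohort.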

Estimating the results using Stata

For this section of the homework, use the data provided on the homework page. We will try to replicate the results from the paper. Open the data all_dl_hw.dta. An easy way to make great-looking graphs is to use a good scheme. Schemes in Stata are preset preferences for color and design. I recommend the scheme plotplainblind, which uses color-blind-friendly colors. You can find it by searching for blindschemes. Use your Google abilities to figure out how to use it!

  1. Use the commands twoway, scatter, and lfit to create a figure similar to Figure 1, Panel A. Use all observations rather than males and females separately. Use the variable Proportion with Driver’s license. Label your axes. (Use the -12 to +21 months range.)
  2. Use the commands twoway, scatter, and lfit to create a figure similar to Figure 1, Panel B. Use all observations rather than males and females separately. Use the variable VehicleMiles_150. Label your axes.
  3. Seeing the full data is useful, but sometimes it is noisy and patterns are hard to see. Create the same two figures as before, now using the command binscatter. Label your axes. (Tip: check out this worksheet before using binscatter. There are several correct ways to do this; you may also consider the rd() option or the by() option.)
  4. Use binscatter to graph an RD figure where “Work during last 4 weeks” is the y-axis variable instead. Does this graph show any evidence of a discontinuity at the MDA?
  5. Now let's estimate the following regressions to make a table. At the end of each regression command, add [aweight=tri_wgt], robust. The regression to run is the “baseline” model, which is the main model estimated in the paper. You may need to create new variables.
    1. Model 1: Estimate the baseline RD model where the outcome is DriverLicense.
    2. Model 2: Estimate the baseline RD model where the outcome is VehicleMiles_150.
    3. Model 3: Estimate the baseline RD model where the outcome is VehicleMiles_265.
    4. Model 4: Estimate the baseline RD model where the outcome is Work4weeks.
    5. Model 5: Estimate the baseline RD model where the outcome is NotEnrolled.
    6. Provide a table that presents your results from the five models (try to format it nicely; you are welcome to use formats from previous homeworks). Report only the coefficient of interest (the discontinuity), and include the sample size of each regression and the mean of the dependent variable in the bottom rows. The results should be similar to those in the paper.
    7. For the following questions, use the mortality data. First, create RD graphs of the following outcomes using either binscatter or the scatter/lfit commands.
      1. all-cause mortality
      2. motor vehicle mortality
      3. homicides
      4. Without seeing the results from the regressions, do you see discontinuities in all of the graphs? Discuss. (ungraded) (completion point)
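
The RD figures described above can be sketched on simulated data. This is a minimal illustration, not the solution for the homework files: the variable names (agemo, post, y) and the simulated jump of 0.30 are made up here and do not come from all_dl_hw.dta or the mortality data.

```stata
* Simulated stand-in data; these names and values are hypothetical.
clear
set seed 2021
set obs 3400
gen agemo = mod(_n, 34) - 12     // running variable: -12 to +21 months from the cutoff
gen post  = agemo >= 0           // 1 at or after the cutoff, 0 before
gen y     = 0.10 + 0.002*agemo + 0.30*post + rnormal(0, 0.05)

* Raw scatter plus a separate linear fit on each side of the cutoff
twoway (scatter y agemo, msize(tiny))  ///
       (lfit y agemo if agemo < 0)     ///
       (lfit y agemo if agemo >= 0),   ///
       xline(0) legend(off)            ///
       xtitle("Months since cutoff (simulated)") ytitle("Outcome (simulated)")

* Optional binned version. binscatter is user-written (ssc install binscatter),
* so the line is left commented out. Setting the break at -0.5 keeps month 0
* on the right-hand side regardless of how the boundary value is assigned.
* binscatter y agemo, discrete rd(-0.5)
```

Fitting the two lfit lines on separate subsamples is what lets the fitted lines break at the cutoff; a single lfit over the full range would smooth over any jump.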

Find the baseline specification the authors use; it is usually described under the empirical strategy. Open the mortality data (all_mortality_hw.dta). With the baseline model, create another table in which the columns are the following models:

  1. Model 1: Estimate the baseline RD model where the outcome is all-cause mortality.
  2. Model 2: Estimate the baseline RD model where the outcome is driving fatality mortality.
  3. Model 3: Estimate the baseline RD model where the outcome is homicide.
  4. Provide a table that presents your results from the three models (try to format it nicely; you are welcome to use formats from previous homeworks as examples). Report only the coefficient of interest (the discontinuity), and include the sample size of each regression and the mean of the dependent variable in the bottom rows.
  5. Why do we check if MDA is affecting homicides?
  6. Using the baseline model, create another table with the following models as columns. Use all-cause mortality as the outcome in every model.
    1. Model 1: Baseline model
    2. Model 2: Baseline model + $(age - cutoff)^{2}$
    3. Model 3: Baseline model + $(age - cutoff)^{2}$ and $(age - cutoff)^{3}$
    4. Provide a table that presents your results from the three models (try to format it nicely; you are welcome to use formats from previous homeworks as examples). Report only the coefficient of interest (the discontinuity), and include the sample size of each regression and the mean of the dependent variable in the bottom rows.
  7. Does including the quadratic or cubic term of the running variable affect the results? What does this mean?
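
The sequence of models above can be sketched on simulated data. This is only an illustration under stated assumptions: it assumes a linear RD with separate slopes on each side of the cutoff, which may differ from the authors' baseline specification, and every variable name here (agemo, post, y, w) is hypothetical. The weight w merely stands in for a triangular weight such as tri_wgt.

```stata
* Simulated stand-in data; names, weights, and effect size are hypothetical.
clear
set seed 2021
set obs 3400
gen agemo = mod(_n, 34) - 12             // running variable centered at the cutoff
gen post  = agemo >= 0                   // discontinuity indicator
gen y     = 0.10 + 0.002*agemo + 0.30*post + rnormal(0, 0.05)
gen w     = 1 - abs(agemo)/22            // stand-in triangular weight, positive on -12..21

gen agemo_post = agemo*post              // lets the slope differ after the cutoff
gen agemo2     = agemo^2                 // quadratic term
gen agemo3     = agemo^3                 // cubic term

* Model 1: assumed linear baseline
regress y post agemo agemo_post [aweight=w], robust
estimates store m1
* Model 2: baseline plus the quadratic term
regress y post agemo agemo_post agemo2 [aweight=w], robust
estimates store m2
* Model 3: baseline plus the quadratic and cubic terms
regress y post agemo agemo_post agemo2 agemo3 [aweight=w], robust
estimates store m3

* Show only the discontinuity coefficient, its standard error, and N
estimates table m1 m2 m3, keep(post) b(%9.3f) se(%9.3f) stats(N)

* Unweighted outcome mean, one candidate for a "mean of dependent variable" row
summarize y
```

Because the simulated outcome is truly linear in the running variable, the coefficient on post should stay close to 0.30 across the three columns here; with real data, comparing the columns is what shows whether the estimate is sensitive to the functional form.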

References

  • Christopher Carpenter and Carlos Dobkin. The effect of alcohol consumption on mortality: Regression discontinuity evidence from the minimum drinking age. American Economic Journal: Applied Economics, 1(1):164–182, 2009.
  • Christopher Carpenter and Carlos Dobkin. The minimum legal drinking age and public health. The Journal of Economic Perspectives, 25(2):133–156, 2011.
  • Scott E Carrell, Mark Hoekstra, and James E West. Does drinking impair college performance? Evidence from a regression discontinuity approach. Journal of Public Economics, 95(1):54–62, 2011.
  • Benjamin Crost and Santiago Guerrero. The effect of alcohol availability on marijuana use: Evidence from the minimum legal drinking age. Journal of Health Economics, 31(1):112–121, 2012.
  • Jason Huh and Julian Reif. Teenage driving, mortality, and risky behaviors. American Economic Review: Insights, 3(4):523–39, 2021.