Purpose
By now, you have had good practice reading papers and understanding both the gist and the technical set-up. That’s great, and those are already valuable skills. In this part of the semester, we’ll push you to use the stats to elevate the conversation. This means thinking more about “what results mean” and “how to use those results to think about policy implications.” We will take off more training wheels, and the questions will start feeling less “homework-y” and more like questions you would get in a work conversation or a meeting. Of course, we will still ask questions about particular concepts and targeted material!
Guidelines
- You can work by yourself or in groups of up to two students.
- Submit your answers to Gradescope (within Canvas). If you are a group, submit one assignment per group.
- We encourage you to use the boxes; PDFs, JPGs, and PNGs are preferable to Word documents, RTF, or CSV. Remember, you can always save something as a PDF. You can also “Screenshot” anything. In Windows, you can do this by using the Snipping Tool or Windows+Shift+S. On a Mac, you can do this by using Command+Shift+4.
- Submit your do-file to Gradescope (within Canvas).
- You will get points for correct answers. You will lose points if an answer contains unnecessary information or mixes incorrect statements in with correct ones. In short, we are trying to incentivize students to use the fewest characters while maximizing the accuracy of responses.
- Your responses should be professionally formatted and written.
- The due date is Monday, April 1st, at 10 pm EDT. (Note that this is the week of your 48-hour project, so you may not want to spend all of your Monday on it.)
Tips
- Clean your desk, get a bottle of water, pick your favorite beverage, turn on “do not disturb,” set a timer for 45 min (then take breaks), put on some work tunes, and dive into the fun of learning.
- Tackle the homework by doing a rough draft, reading it, noticing what you must do to answer questions, and then assessing how long it could take. Then, plan for your week and when you’ll tackle it.
- Start early!
Understanding the paper: The Minimum Legal Drinking Age and Crime
Obtain the paper and data for this homework here.
Read the enclosed paper on the minimum legal drinking age (MLDA) (Carpenter and Dobkin, 2015). For those interested in learning more about the causal effects of alcohol use, we've listed some optional readings at the end of this assignment. Briefly describe the natural experiment in this regression discontinuity analysis.
- What is the research question?
- What ages does the analytic sample contain?
- What is the running variable?
- Who should be considered the “treated” group?
- Who should be considered the “control” group?
- What must we be willing to assume if we want to believe that this analysis captures the causal effect of legal alcohol access on arrests? Use the context, not general words that could be applied to any context.
- Interpret the results from Table 1 in words. What is the total arrest rate for the control group in this analysis? Be sure to specify the units for arrest rates in your answer. Is the arrest rate larger or smaller than you anticipated?
- How does legal alcohol access affect the total arrest rate? Present your answer as a percent change relative to the control group's arrest rate.
- Repeat the previous two questions for the violent crime results in Table 1. Specify the control mean, calculate the treatment effect in percentage terms, interpret the magnitudes, and assess statistical significance.
- What category of crime accounts for the largest share of the change in crime rates at the MLDA discontinuity? Give three examples of crimes from this category.
- Figures 1-4 reveal some interesting trends in arrest rates for young adults.
- Give two examples of crimes for which arrest rates are pretty stable as teenagers enter their early twenties.
- Give two examples of crimes for which arrest rates vary dramatically with age.
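For the percent-change questions above, the arithmetic is simply the estimated jump divided by the control-group mean. Here is a minimal Python sketch of that calculation. The function name `percent_change` and the numbers are hypothetical and chosen only for illustration; they are not values from Table 1.

```python
def percent_change(effect: float, control_mean: float) -> float:
    """Express an RD jump as a percent of the control-group mean."""
    if control_mean == 0:
        raise ValueError("control mean must be nonzero")
    return 100.0 * effect / control_mean


# Hypothetical numbers for illustration only; they are NOT taken from Table 1.
jump = 5.0            # estimated discontinuity, arrests per 10,000 person-years
rate_under_21 = 50.0  # arrest rate just under the cutoff, in the same units

print(f"{percent_change(jump, rate_under_21):.1f}% change relative to control")
```

Both inputs must be in the same units (here, arrests per 10,000 person-years) for the ratio to make sense.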
Policy Implications & Robustness of Results
- Carpenter and Dobkin analyze the effect of legal alcohol access on arrest rates, but they're interested in the effects on crime. Good crime data are hard to come by since many crimes don't result in arrests.
- Why might arrests increase at age 21 even if legal access to alcohol doesn't increase crime? Give two distinct reasons.
- What evidence do Carpenter and Dobkin provide to bolster the case that their findings capture effects on crimes committed and not just arrests?
- Each US state gets to set its legal drinking age, but the federal government strongly incentivizes states to choose 21. According to the National Minimum Drinking Age Act, states that set the drinking age below 21 risk losing up to 10 percent of their federal highway funding. Mothers Against Drunk Driving (MADD) was a major political player in lobbying for the Act's passage in 1984. What do Carpenter and Dobkin's findings suggest about the effects of this federal act on young adult crime? Would crime rates be lower if state MLDAs went back to 18? What caveats would you offer in drawing policy conclusions from these results? (Pass or fail)
- The United States is one of only four developed countries in the world with a nationwide drinking age above 18. Iceland, Japan, and South Korea are the other three. Why should we be cautious in generalizing MLDA results from the US to countries like France and Italy, which permit alcohol purchasing at younger ages? (Pass or fail)
Estimating the results
For this section of the homework, use the data provided to you in this link. We will try to replicate the results from the paper. Open the data file `carpenter_dobkin.dta` and familiarize yourself with it. The “_r” suffix stands for rate, but explore the other variables to understand their meaning.
In this section, we will replicate the table and the graphs, so to get there, we’ll start small and simple and build ourselves up. The best part about this homework is that the answer key is the paper itself: if you have replicated the results from the paper, you’ll know you have the correct answer! (As always, you can use any coding language, but we’ll provide tips for the people using Stata.)
- The goal is to recreate Figure 1. For that, we’ll start by plotting violent crimes. A similar process was used to create the rest of the figure. The commands that will be super helpful are `collapse` and `twoway scatter`, and the option `lfit`.
- Step 1: Open the data and understand its layout.
- Step 2: Type `twoway (scatter violent_r days_to_21, sort)`. This graph shows, for every day away from a person’s 21st birthday, the violent crime rate per 10,000 person-years. We can start seeing the discontinuity there, but it is not as clear as in Figure 1. The main difference is how the information is “binned.” The x-axis in Figure 1 is not “every day”; instead, it groups days together and shows the average violent crime rate for each bin of days. So we need to create the bins. Figure 1 bins the data into bins of 14 days, so each point is the average violent crime rate over 14 days. See if you can create a set of bins and then use the `collapse` command on your own, but if that’s taking too much time, jump to Step 3.
- Step 3: Create the 14-day bins: `gen age_fortnight = 21 + (14*floor(days_to_21/14))/365`. This variable holds the 14-day bins we are creating for you. Browse the data to better understand what it is doing: `br days_to_21 age_fortnight`
- Step 4: Now that we’ve created the bins, use the `collapse` command to transform the data into the mean violent crime rate for each bin.
- Step 5: Now we are ready to plot! Use the `twoway` command with the `scatter` and `lfit` options. Here are a couple of hints:
- Use the point-and-click feature to get the code!
- Do things one by one, and add things as you go.
- Label your axes!
- Notice that Figure 1 only uses data for 2 years before 21 and 2 years after 21, not the whole span.
- Notice that the violent crime rate range in Figure 1 is on the right y-axis, not the left one. So you should compare your results with those numbers.
- Turn the legend off in the graphs that you produce.
- Change the y-axis range to be similar to the one in the paper rather than the one Stata gives you.
- Step 6: Once you’ve created the graph, submit a screenshot of your figure and the code in Gradescope.
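For those not working in Stata, Steps 3 and 4 above can be sketched in plain Python. This is an illustration under stated assumptions, not the course's solution: the helper names `age_fortnight` and `collapse_mean` are hypothetical, and the rows are synthetic stand-ins for `days_to_21` and `violent_r`, not values from `carpenter_dobkin.dta`.

```python
import math
from collections import defaultdict


def age_fortnight(days_to_21: int) -> float:
    """Mirror the Step 3 formula: 21 + (14*floor(days_to_21/14))/365."""
    return 21 + (14 * math.floor(days_to_21 / 14)) / 365


def collapse_mean(rows):
    """Average the outcome within each 14-day bin (the idea behind collapse)."""
    totals = defaultdict(lambda: [0.0, 0])  # bin -> [running sum, count]
    for days, rate in rows:
        cell = totals[age_fortnight(days)]
        cell[0] += rate
        cell[1] += 1
    return {age: s / n for age, (s, n) in sorted(totals.items())}


# Synthetic (days_to_21, rate) pairs with a made-up jump of 20 at the cutoff.
rows = [(d, 100.0 + (20.0 if d >= 0 else 0.0)) for d in range(-28, 28)]

binned = collapse_mean(rows)
for age, mean_rate in binned.items():
    print(f"age {age:.4f}: mean rate {mean_rate:.1f}")
```

Because `floor` rounds toward negative infinity, day -1 lands in the bin that starts 14 days before the birthday rather than in the bin that starts at age 21, so no bin straddles the cutoff.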
- Now that you’ve created “one” of those lines from the graph, to get closer to Figure 1, we need to add lines for the following outcomes: violent crimes, property crimes, alcohol-related, drug possession (ill_drugs_r), and other crimes. Follow a process similar to the one above to create the figure.
- The only tricky thing is to label the two axes, which is not essential, but if you want to figure out how to do it, here are some tips:
- Make property the first variable you plot, and in the options of anything property related, add `yscale(range(200,600) axis(1))`, and right before the title option (after the main “,”) add `ylabel(#3)`.
- Then, for the variables on the second axis, violent crimes and drug possession, add the following code after their respective “,”: `yaxis(2) yscale(range(100,300) axis(2)) ylabel(#3, axis(2))`
- Using `binscatter`. Now that you’ve figured out how to graph things using `collapse` and `twoway`, let’s do something similar with `binscatter`. To do so, install binscatter (if you haven’t already) using `ssc install binscatter`.
- Open the original data again. Learn how to use the binscatter command; try `binscatter dui_r days_to_21`. Notice what binscatter is doing: it automatically creates bins for you and does the collapse plus the fit line, all in one command.
- Change the graph to include two different fit lines before and after the cutoff (hint: look at the `by` option or the `rd` option) and use only information from ages 19 to 23.
- Replicating Table 1. Finally, we’ll replicate Table 1. The following steps will help you get the first estimate, but you’ll have to create the final table yourself.
- Step 1: Run the most basic RD regression using “all” arrests as your primary outcome. (Save this as regression 1.) Add the option “`, robust`” at the end of this regression and all of the following regressions. This is a great place to use `eststo` and `esttab`.
- Step 2: Run the same regression, but now only use data from ages 19 to 23. This is regression 2.
- Step 3: Run the same regression as step 2, but now add an interaction between your cutoff variable and the “running variable - cutoff” variable, allowing for different slopes. This is regression 3.
- Step 4: Now run the same regression as step 3, but allow both slopes (before and after the cutoff) to be quadratic and to differ from each other. This is regression 4.
- Step 5: Finally, add the birthday covariates. Read the notes of Table 1, which we write here: “The first line contains the estimate of the discrete increase at age 21 in arrest rates for each crime type. The regressions include a quadratic polynomial in age fully interacted with an indicator variable for age over 21 and indicator variables for 19th, 20th, 21st, and 22nd birthdays and the days immediately after. The regressions are estimated based on arrests of people 19 to 22 years old. Each observation is the arrest rate per 10,000 person-years days from a person’s 21st birthday. Standard errors are in parentheses. The codes that comprise each crime category are in online appendix A.” The only thing left is adding indicator variables for the 19th, 20th, 21st, and 22nd birthdays and the days immediately after. You could create these dummies yourself, but they have already been created for you: `birthday_19`, `birthday_19_1`, and so on. This is regression 5.
- Answer the following question: Why must we add the birthday covariates?
- Now that you’ve run the 5 regressions, output them in a table with their standard errors and significance stars. Your results should look like this:
- Now fully replicate Table 1, including the “Rate just under 21” row. Present your screenshot and code.
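To make the structure of the Step 3 regression concrete, here is a minimal Python sketch of an RD with different slopes on each side of the cutoff. It is an illustration under stated assumptions, not the paper's code: the helper names `ols` and `rd_design` are hypothetical, the data are synthetic with a made-up jump of 12, and it reports only point estimates (nothing like Stata's `robust` standard errors is computed).

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def rd_design(days_to_21):
    """Columns: constant, over-21 dummy, centered age, dummy x centered age."""
    x = days_to_21 / 365  # running variable minus the cutoff, in years
    over = 1.0 if days_to_21 >= 0 else 0.0
    return [1.0, over, x, over * x]


# Synthetic outcome: made-up intercept, slopes, and a jump of 12 at the cutoff.
days = list(range(-730, 730))
y = [
    200.0 + 12.0 * (d >= 0) + 10.0 * (d / 365) - 4.0 * (d >= 0) * (d / 365)
    for d in days
]
X = [rd_design(d) for d in days]

beta = ols(X, y)
print(f"estimated jump at 21: {beta[1]:.3f}")
```

Centering the running variable at the cutoff is what lets the coefficient on the over-21 dummy be read as the jump at exactly age 21. Squaring the centered age and interacting the square with the dummy would extend the same design toward the quadratic specification in Step 4.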
References
- Christopher Carpenter and Carlos Dobkin. The effect of alcohol consumption on mortality: Regression discontinuity evidence from the minimum drinking age. American Economic Journal: Applied Economics, 1(1):164–182, 2009.
- Christopher Carpenter and Carlos Dobkin. The minimum legal drinking age and public health. The Journal of Economic Perspectives, 25(2):133–156, 2011.
- Christopher Carpenter and Carlos Dobkin. The minimum legal drinking age and crime. Review of Economics and Statistics, 97(2):521–524, 2015.
- Scott E Carrell, Mark Hoekstra, and James E West. Does drinking impair college performance? Evidence from a regression discontinuity approach. Journal of Public Economics, 95(1):54–62, 2011.
- Benjamin Crost and Santiago Guerrero. The effect of alcohol availability on marijuana use: Evidence from the minimum legal drinking age. Journal of Health Economics, 31(1):112–121, 2012.