Purpose
The objective of this homework is for you to practice concepts learned in class and apply them to a real-world scenario. The concepts we will practice in this homework are related to understanding instrumental variables (IV). You are building your skills: reading papers, interpreting tables, interpreting coefficients from complex designs, and using research to think about policy implications. Look at you!
Clean your desk, get a bottle of water, pick your favorite beverage, turn on “do not disturb,” set a timer for 45 min (then take breaks), put on some work tunes, and dive into the fun of learning.
Guidelines
- You can work by yourself or in groups of up to two students.
- Submit your answers to Gradescope (within Canvas). If you work in a group, we encourage you to submit one assignment per group, though this is not required.
- We encourage you to use the boxes; PDFs, JPGs, and PNGs are preferable to Word documents or CSV files. Remember, you can always save something as a PDF. You can also “Screenshot” anything. On Windows, you can do this with the Snipping Tool or Windows+Shift+S. On a Mac, you can do this with Command+Shift+4.
- Submit your do-file to Gradescope (within Canvas).
- You will get points for correct answers. Points will be deducted if an answer contains unnecessary information or mixes incorrect statements with correct ones. In short, we are trying to incentivize students to use the fewest characters while maximizing the accuracy of their responses.
- Your responses should be professionally formatted and written.
- The due date is Monday, March 18th, at 10 pm EDT.
Private Prisons & Length of Service
For this homework, we will ask questions based on the study by Anita Mukherjee titled “Impacts of private prison contracting on inmate time served and recidivism”. This paper examines the relationship between private prisons and length of service. New prisons are usually built because there is a “demand” for putting people into prisons. The main argument for a private prison system is that it saves money. The arguments against it focus on the dangers of having a private agent manage a “public” service. For example, a private operator's incentive may be to maximize profits (like any other economic agent) rather than to reduce recidivism, a welfare-improving outcome. In this homework, we will go over a paper that tries to tackle part of this issue. In theory, whether a prison is private or public should affect neither the crimes committed nor the sentencing, so we should see no difference in length of service between public and private prisons, especially for two individuals with the same characteristics. This paper analyzes that question.
A lot of context is needed to understand the importance of this question and the complexity of the issue. First, read the introduction of the paper to get yourself acquainted with the context.
Understand the Paper
- If we wanted to understand the effect of going to a private prison on length of service, what would be the regression we would want to run? Write it out and describe how you would code each variable and your unit of observation.
- Read the paper and answer the following questions:
- What are the main outcome variables? Explain how each of them is coded.
- What is the main explanatory variable? Explain how it is coded.
- What is the instrument in this setting (be as explicit as possible)? Who are the "compliers" in this case? (We are not looking for a textbook definition of what compliers are). How is this instrument coded?
- What is the LATE in this context? (We are not looking for the definition of LATE, but for its interpretation within this context.)
- Write down an equation representing the “First Stage” in this setting.
- Write down an equation representing the “Reduced Form” in this setting.
- Which table (column?) or figure provides evidence of the relevance condition?
- Does the instrument satisfy the exclusion restriction? Can you provide a compelling argument against it?
- What are the main findings?
- What does this research imply about the cost-saving argument for private prisons?
- Create a variable called “sentence_end” that represents the end of the sentence. This can be done by adding SentencedDays to sentence_start_date. Browse SentencedDays and sentence_start_date to understand the variables, and then create the new variable. Show a screenshot of your browse window as your answer.
- Notice that when you create this variable, it may contain numbers that are hard to interpret. This is because sentence_start_date is stored as a numeric date and only its display format makes it readable; the new variable does not inherit that format. To make sentence_end readable, format it as a date variable by typing
format sentence_end %td
Browse the start and end date variables to check that they are readable. Take a screenshot and submit it as your answer, and write the command in your do-file.
- Now you are ready to create the instrument. Although there are many ways to create the instrument in Stata, the generate and replace commands together with the d(date) function are enough. The challenge is to think about how to construct it. Provide a screenshot of your code for this variable and then show the output from the following code:
sum (instrument variable name), det
- Run the first stage regression and report the estimates.
- Run the reduced form regression and report the estimates.
- Use the display command to obtain the IV estimate using your estimates from 1 and 2.
- Now use the ivregress command and report the results.
- Interpret the obtained IV estimate in plain words.
- Run the following regression in Stata:
- Repeat the exercises 1-4 from section “Reduced form IV & 2SLS”, now using controls for the outcome time served.
- Use the estimates to complete a table like this. Feel free to use esttab or to build the table in Word. Your results may not match the author's exactly because we are not using exactly the same data.
Replicating the results
In this part of the homework, we will try to replicate the results from the paper. Open the dataset “hw3_clean.dta” and get familiar with the data. This data was provided by Professor Mukherjee but stripped of some details. The actual data is available to everyone through a FOIA request. Look for the outcome (Y) and the treatment (D) in the data; the instrument is something we will create. Notice that SentencedDays captures the length of the sentence in days, while TimeServed captures the days actually served by the individual.
Data can be found here
One important instrument component is knowing when a prison opens or closes. The information can be found in Figure 1. For your convenience, we have summarized the changes in the following table.
First, we need to create the instrument. Recall the information from pages 420 and 421 of the paper. As Mukherjee states in the paper, the instrument is the accumulated change in bed availability over the individual's sentence spell. For example, say an individual's sentence starts on December 6, 1996, and ends on December 5, 2001. Looking at Table 1, their sentence spell covers the openings in January 1998, August 1998, and April 1999. Since each of those openings added 500 beds, the accumulated change during their sentence spell is 1,500 beds. Since the instrument is expressed per 1,000 beds, we divide this number by 1,000 and end up with a final value of the instrument of 1.5.
Let's do another example. Imagine a sentence that starts on May 25, 2000, with an assigned end date of May 24, 2005. This sentence spell covers the closing in 2002 and the opening of a prison in 2004. The total accumulated change is -1000 + 1000 = 0. Therefore, the final value of the instrument is 0.
There is an additional component: a change counts only if it occurs at least 90 days after the sentence start date and at least 90 days before the sentence end date. Remember this when creating your variable.
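The two examples and the 90-day rule can be summarized in a single expression. The notation is ours, chosen for illustration, and is not taken from the paper: s_i and e_i are individual i's sentence start and end dates, t_k is the date of capacity change k in Table 1, and \Delta B_k is the corresponding change in beds (negative for a closing).

```latex
Z_i = \frac{1}{1000} \sum_{k \,:\, s_i + 90 \,\le\, t_k \,\le\, e_i - 90} \Delta B_k
```

In the first example, the three 500-bed openings give Z_i = (500 + 500 + 500)/1000 = 1.5. In the second, Z_i = (-1000 + 1000)/1000 = 0.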
Before you create the instrument, you may need to know how to use the d(date) function in Stata. This can help with the 90-days-after and 90-days-before requirement. The d(date) function tells Stata that you are referring to a date. For example, suppose you want to create a variable that takes the value 1 if the sentence start date is at least a year (365 days) after a particular date, say June 1st, 1996, and 0 otherwise. You would use the d(date) function in the following way:
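* start at 0, then switch to 1 when the start date is at least 365 days after 01jun1996
gen dummy = 0
* Stata treats missing values as larger than any number, so exclude them explicitly
replace dummy = 1 if sentence_start_date >= d(01jun1996) + 365 & !missing(sentence_start_date)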
The date must be written in the following format: ddmmmyyyy (for example, 01jun1996).
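To see what d() returns and why a date variable needs a display format, a short sketch like the following can be run from a do-file. It uses only display and a one-observation toy dataset; the variable name toy_date is hypothetical and is not part of hw3_clean.dta. Note that clear drops any data in memory, so run it before loading the homework data.

```stata
* d() converts a date literal into Stata's numeric date:
* the number of days since 01jan1960.
display d(01jun1996)              // raw number of days: 13301
display %td d(01jun1996)          // the same value shown as a date
display %td d(01jun1996) + 90     // date arithmetic is done in days

* Toy example (hypothetical variable, not from the homework data)
clear
set obs 1
generate toy_date = d(01jun1996) + 365
list toy_date                     // unformatted: just a number
format toy_date %td
list toy_date                     // formatted: a readable date
```

The format command changes only how the value is displayed, not the stored number, so comparisons and additions work the same way before and after formatting.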
Now that your background is ready, think about how you would create this variable, and then complete the following steps:
Reduced form IV & 2SLS
For the following regressions, use the outcome TimeServed. Present screenshots of your analysis as part of your answers. Stata output is fine.
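The display step in this section rests on a standard identity for the case of one instrument and one endogenous regressor: the IV estimate equals the reduced-form coefficient divided by the first-stage coefficient. In the notation below (ours, chosen for illustration), \hat{\pi} is the coefficient on the instrument in the first stage and \hat{\rho} is the coefficient on the instrument in the reduced form.

```latex
\hat{\beta}_{IV} = \frac{\hat{\rho}}{\hat{\pi}}
```

When both regressions use the same sample and the same controls, this ratio should match the point estimate reported by ivregress 2sls, which makes it a useful check on your work.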
Getting Closer to Mukherjee’s results
You may notice that the estimates from the previous section are not the same as the ones from the paper. We need to add controls to the exercise above. The controls we want to include are: age, race, education level, a dummy if the inmate is single or not, a dummy for each county of conviction, a dummy for each level of care, a dummy for each classification, and a dummy for each medical classification. You can read the paper to understand what these controls represent.
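A dummy for each category does not have to be created by hand: Stata's factor-variable prefix i. expands a categorical variable into a set of indicators inside the estimation command. The sketch below illustrates the syntax on Stata's built-in auto dataset, not on the homework data. The variables price, mpg, and rep78 belong to that example dataset and simply stand in for an outcome, a continuous control, and a categorical control; the names in the final comment are placeholders.

```stata
* Illustration on a built-in dataset (not hw3_clean.dta)
sysuse auto, clear

* i.rep78 adds one indicator per category of rep78,
* omitting a base category to avoid perfect collinearity
regress price mpg i.rep78

* The same prefix can be used for exogenous controls in ivregress, e.g.
* ivregress 2sls y x1 i.group (d = z)
* where y, x1, group, d, and z are placeholder names
```

The i. prefix requires a numeric variable. If a categorical control is stored as a string, it first needs to be converted, for example with encode.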