Sebastian Tello
Practice Problems for Exam 2 (2026)

The following are practice questions for the upcoming exam; we’ve written a hefty 75ish questions. Just a reminder that some of the answers are didactic (written to teach) rather than exactly what we would want in an exam answer. In addition to the questions here, recall that you can always re-take some of the quizzes or go over the worksheets and turn some of those into questions; the worksheets have several questions to help you think about regression in different ways. If you need more questions, you can go over the homeworks. If that’s not enough, you could also go over the exercises we did in lecture or that are on the slides. You can also create new exercises by doing things “backward” or in different directions. Finally, you can find even more exercises/questions at the end of each chapter of the book Real Stats. And if that’s not quite enough, the internet is your oyster!

  • Warm-up is a great way to start your exercises, so as a warm-up, you can redo the practice for midterm 1 and then do midterm 1 again!
  • Recall that these exercises are not exhaustive: they do not cover all the concepts we’ve seen in class, or all the types of questions we’ve seen or will see.

Some broad tips on using these tools

  • Doing a bunch of questions could be useful, but it could also be not beneficial if you are not training your brain to think carefully. For example, a better way of using tools like this is to first write answers for all without looking at the right answer. Second, discuss the answers with a peer. Finding someone who disagrees with your answer is particularly helpful. Discussing how to approach a question (without looking at the right answer) is a useful exercise that the brain can take advantage of to make things “click.” Finally, have someone grade you and give you a grade without them telling you what you got wrong or right. Then, re-take it or re-do the exercise and have them grade it again until you get 100%. In short, the ideal case is when one never looks at the right answer.
    • Notice that when doing any work assignment, especially in a non-school setting, one doesn’t know the “right answer.” You only know how you think you would approach it, and that’s exactly what we are after. So overall, the key is to see the “right answer” as the last thing you do.
  • When you have a question that has multiple-choice options (like in the quizzes), go through each option and think about why it’s right or wrong or what you could change it to to make it right.
  • “Trivial mistakes.” Sometimes, we look at an answer, realize we made a mistake, and categorize it as a “silly mistake.” Sometimes this makes sense, but we must be careful not to label something a “silly mistake” and then do nothing about it. You want to ask yourself, “How could I change my process to guarantee that this doesn’t happen again?” For example, if pilots were comfortable with “silly mistakes,” we’d be in a pickle. Their approach is to create a set of “checklists” to bring the likelihood of making a silly mistake down to zero. What does that mean for RMDA? Let’s say your silly mistake is “wrong units.” Then something you want to add to your process is “check what units the final answer should be in”; you can add that step, maybe at the end. The takeaway from this tip should be: “How could I change my process to guarantee that this doesn’t happen again?”
  • Change the scenarios: You can create more questions out of these questions. For example, change some numbers and re-do problems. Maybe change the Y to other units; how do the results change? Ask yourself: What other questions could we ask given this setting? etc. The practice of making your brain think about other potential questions is the “studying” itself.
  • You can use these questions as either assessment or evaluative. We note that some answers are meant to be didactical (teaching moments) rather than answers that get straight to the point. Some questions will say, “Show your work,” but in the answers, we show numbers. It should be understood that one would want to show the process, and the answer is to check if you are using the right method.
    • If you plan to use these questions as an “assessment,” I recommend you not study, take these questions, and then go back to studying those topics in which you feel weaker.
    • If you plan to use these questions as “evaluative,” I recommend timing yourself. Since the exam is a time-constrained exercise, it’s also good to practice questions with a time constraint.

Practice questions

Let’s make sure your regression interpretation is not rusty. Let’s work through these questions (some that you may have seen before) and try to get 100% before moving to the more conceptual questions.

  1. For the following questions, refer to the following equation and its respective graph
    1. \(Y_{i}=\alpha_0+\alpha_1 Treatment_{i}+\epsilon_{i}\)
      image
    2. What’s the value of \(\hat\alpha_0\) and \(\hat\alpha_1\)?
    3. ‣
      Answer
      ✅
      \(\hat\alpha_0=0\) and \(\hat\alpha_1=2\)
      image
    4. What’s the value of \(\hat\alpha_0\) and \(\hat\alpha_1\)?
    5. ‣
      Answer
      ✅
      \(\hat\alpha_0=4\) and \(\hat\alpha_1=-10\)
  2. When do countries tax wealth? Taxes are a big deal. They affect how people allocate their time, how much money the government has, etc. Inheritance taxes are a particularly interesting tax policy because of the clear potential for conflict between rich and poor. Scheve and Stasavage (2012) investigated the sources of inheritance taxes by looking at tax policy and other characteristics of 19 countries for which data are available from 1816 to 2000. The data are measured every five years. Specifically, the researchers looked at the relationship between inheritance taxes and who was allowed to vote. To assess whether expanded suffrage led to increases or decreases in inheritance taxes, we can begin with the following model:
\(Inheritance\ tax_{it}=\beta_0+\beta_1 Expanded\ Suffrage_{i,t-1}+\epsilon_{it}\)

The dependent variable is the top inheritance tax rate, which is measured as a percent (0-100), and the independent variable is a dummy variable for whether all men were eligible to vote in at least half of the previous years.

  1. What does \(\beta_0\) represent?
  2. ‣
    Answer
    ✅
    The mean tax rate across 19 countries and 184 years, when suffrage wasn’t expanded. Notice that “the mean tax rate for the control group” would have been a complete answer.
  3. What does \(\beta_1\) represent?
  4. ‣
    Answer
    ✅
    The additional tax, or the marginal change in the tax rate, that happens on average when countries expand suffrage. Notice that it is not an “increase” or a “decrease” yet, since we don’t know what the number is.
  5. What’s the average difference in inheritance tax between expanded and not expanded suffrage?
  6. ‣
    Answer
    ✅
    \(\beta_1\)
  7. Let’s say you get:
    1. \(Inheritance\ tax_{it}=4.75+19.33\,Expanded\ Suffrage_{i,t-1}\)
    2. What’s the average inheritance tax for countries without expanded suffrage? Recall that the units of inheritance tax are percent, from 0 to 100.
    3. ‣
      Answer
      ✅
      4.75 percent.
    4. What’s the average inheritance tax for countries with expanded suffrage?
    5. ‣
      Answer
      ✅
      The average inheritance tax rate for countries that expanded suffrage is 24.08 percent. This is because we add both coefficients (4.75 + 19.33).
    6. What’s the average difference in inheritance tax between expanded and not expanded suffrage?
    7. ‣
      Answer
      ✅
      The average difference is 19.33 percentage points. Notice that the difference is a change between two percents, which is measured in percentage points.
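The arithmetic in the last three answers can be checked with a short sketch. The function name `predicted_tax` is ours, introduced only for illustration; the two coefficients are the ones given in the estimated equation above.

```python
# Coefficients from the estimated equation in the question (units: percent, 0-100).
INTERCEPT = 4.75   # beta_0 hat: mean top rate without expanded suffrage
SLOPE = 19.33      # beta_1 hat: average difference between the two groups


def predicted_tax(expanded_suffrage: int) -> float:
    """Predicted top inheritance tax rate for a dummy value of 0 or 1."""
    return INTERCEPT + SLOPE * expanded_suffrage


without = predicted_tax(0)
with_suffrage = predicted_tax(1)
difference = with_suffrage - without

print(f"Without expanded suffrage: {without:.2f} percent")
print(f"With expanded suffrage:    {with_suffrage:.2f} percent")
print(f"Difference:                {difference:.2f} percentage points")
```

With a single dummy regressor, the intercept is the mean of the group coded 0 and the intercept plus the slope is the mean of the group coded 1, which is why the difference between the two predictions is just the slope.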

3. For the following questions refer to the following table. The outcome is inheritance tax rate.

image

  1. What’s the marginal effect of having universal suffrage on inheritance tax rate in column (d)?
  2. ‣
    Answer
    ✅
    0.69 percentage points
  3. Write column (d) as an equation with numbers.
  4. ‣
    Answer
    ✅
    \(InheritanceTax=-521.63+0.69\,UniversalCoverage+0.28\,Year+11.76\,War+2.19\,Europe+18.71\,Asia+7.84\,NorthAmerica+\epsilon\)
  1. In an effort to better understand the effects of “Get-out-the-vote” messages on voter turnout, Gerber and Green (2005) conducted an RCT involving approximately 30,000 individuals in New Haven, CT, in 1998. One of the treatments was randomly assigned in person visits in which a volunteer visited the person's home and encouraged him or her to vote. Table 3 reflects the findings from the RCT.
    1. image
    2. What’s the marginal effect of being assigned to in-person contact on voting?
    3. ‣
      Answer
      ✅
      0.47 - 0.45 = 0.02, so it’s 2 percentage points (Y is a binary 0/1 variable).
  2. Use the figure below to answer: What’s the sign of \(\beta_2\) in the following equation? \(Y=\beta_0+\beta_1X+\beta_2X^2\)
  3. image
    ‣
    Answer
    ✅
    It’s negative. The figure shows a local maximum, which means the second derivative must be negative: the slope (the first derivative) is decreasing as X increases. To see this, pick a value of X (say X=25) and look at the slope there; in this case it is positive. Now pick a higher value (say X=75) and look at the slope again; now it is negative. The slope went from a positive number to a negative number, so the first derivative is decreasing, which means the second derivative is negative. For this equation the first derivative is \(\beta_1+2\beta_2X\) and the second derivative is \(2\beta_2\), which has the same sign as \(\beta_2\), so we know that \(\beta_2\) is negative.
  4. Energy efficiency promises a double whammy of benefits: if we reduce the amount of energy we use, we can both save the world and save money. What’s not to love? In this exercise we’ll dig into how to explore this relationship. The technology innovation is a programmable thermostat, which is a device that allows the user to preset temperatures at energy-efficient levels. Another important variable is HDD, “heating degree-days,” which is a measure of how cold it was in the month (it is the number of degrees that a day’s average temperature is below 65 degrees Fahrenheit). Usually the relationship between HDD and energy use (measured in therms) is positive (the colder it gets, the more energy people use for heating). We have data on houses that use a programmable thermostat and houses that don’t. The results from an OLS analysis are below; the outcome variable for all of these regressions is “Therms.” The cost of a therm is $1.59. The cost of the thermostat is $60.
    1. image
    2. Using Model (a), what’s the main conclusion?
    3. ‣
      Answer
      ✅
      When houses use a programmable thermostat, they use about 13.02 fewer therms of energy. The overall conclusion is that thermostats are making people more energy efficient.
    4. Does your main conclusion change when accounting for HDD?
    5. ‣
      Answer
      ✅
      It only gets stronger! Now the difference is larger.
    6. Using results from Model (b), how much money are houses that use a thermostat saving? According to this model, is the thermostat worth it?
    7. ‣
      Answer
      ✅
      We can start making meaningful statements about how much money one is saving. We just have to convert the coefficient -20.05 to dollars. Since the cost of a therm is $1.59, this means savings of about $31.88 per month (20.05 × 1.59). Since the cost of the thermostat is $60, one could easily pay for it in the span of two months.
    8. Does it make sense that the programmable thermostat should save $30 in the middle of the summer? This indicates that the cost-savings depend on the weather outside. It makes more sense to think about the effects of the thermostat with respect to temperature outside. Therefore we focus on model (c). What’s the interpretation of the number -0.48?
    9. ‣
      Answer
      ✅
      This is the difference in therms between houses that have a thermostat and houses that don’t when HDD is 0 (meaning the weather is warm for the whole month). This coefficient seems small relative to the others, and it is not statistically significant. Notice that this is consistent with the idea that a programmable thermostat should not reduce heating costs when the furnace isn’t running.
    10. Using Model (c), what is the effect of the thermostat when HDD is 500?
    11. ‣
      Answer
      ✅
      The effect of the thermostat is \(\hat\beta_1+\hat\beta_3\times500=-0.48-0.062\times500=-31.48\). This means that the thermostat helps reduce therms by 31.48, lowering the bill by $50.05.
    12. Using Model (c), what’s the average therm use for houses that don’t have a thermostat in particularly hot months?
    13. ‣
      Answer
      ✅
      We can think of this as HDD=0, and so the answer will be 4.24
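The dollar conversions in the thermostat answers all follow one pattern: therms saved times the price per therm. Here is a minimal sketch that uses only numbers quoted in the question and answers above (the coefficients come from the regression table, which we take as given). The function names are hypothetical.

```python
PRICE_PER_THERM = 1.59   # dollars per therm, from the question
THERMOSTAT_COST = 60.0   # dollars, from the question


def model_b_savings() -> float:
    """Monthly dollar savings implied by the Model (b) thermostat coefficient."""
    therms_saved = 20.05  # absolute value of the coefficient quoted in the answer
    return therms_saved * PRICE_PER_THERM


def model_c_effect(hdd: float) -> float:
    """Change in therms from the thermostat at a given HDD under Model (c)."""
    beta_thermostat = -0.48     # effect when HDD = 0
    beta_interaction = -0.062   # additional effect per heating degree-day
    return beta_thermostat + beta_interaction * hdd


savings_b = model_b_savings()
months_to_pay_off = THERMOSTAT_COST / savings_b
effect_500 = model_c_effect(500)
savings_500 = -effect_500 * PRICE_PER_THERM

print(f"Model (b): ${savings_b:.2f} saved per month")
print(f"Months to pay off the thermostat: {months_to_pay_off:.1f}")
print(f"Model (c) at HDD=500: {effect_500:.2f} therms, ${savings_500:.2f} saved")
```

Trying `model_c_effect(0)` returns the coefficient on the thermostat alone, which is the point of the question about -0.48: with an interaction, the "main" coefficient is the effect only when the other variable is zero.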

Enough warm-up let’s get it: Research Design Questions

Get out the vote

To better understand the effects of “Get-out-the-vote” messages on voter turnout, Gerber and Green (2005) conducted an RCT involving approximately 30,000 individuals in New Haven, CT, in 1998. One of the treatments was randomly assigned in-person visits in which a volunteer visited the person's home and encouraged him or her to vote. Table 3 reflects the findings from the RCT. Before answering the questions, think about what the instrument (Z), the main explanatory variable (D), and the main outcome (Y) would be.

image
  1. What is the estimate of the first stage (Effect of Z on D)? Show your calculation
  2. ‣
    Answer
    ✅
    Z = assigned or not to in-person contact, D = actually contacted, and Y = voted in 1998. The effect of Z on D is the effect of being assigned to in-person contact on actually being contacted. This is just the difference in the first row: 0.28 - 0.03 = 0.25. This says that being assigned to in-person contact increases the likelihood of being contacted by 25 percentage points.
  3. What is the estimate of the reduced form (Effect of Z on Y)? Show your calculation
  4. ‣
    Answer
    ✅
    This is the effect of being assigned to in-person contact on voting. This is just the difference of the second row 0.47-0.45=0.02. This says that being assigned to in-person contact increases the likelihood of voting by 2 percentage points.
  5. What is the IV estimate of the effect of in-person contact on voting?
  6. ‣
    Answer
    ✅
    In order to obtain the IV estimate we need \(\frac{ITT}{FS}\). In this case, the ITT is 0.02 and the first stage is 0.25, so the IV estimate is \(\frac{0.02}{0.25}=0.08\).
  7. Provide an interpretation of the IV estimate.
  8. ‣
    Answer
    ✅
    This says that having an in-person contact increases your likelihood of voting by 8 percentage points. Notice that it is not the effect of “having been assigned to in-person contact”; it’s the effect of actually being contacted. One could add “for people who were assigned to in-person contact and were actually contacted,” but in this context that’s not very different from just saying “people who had an in-person contact.”
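The three calculations above (first stage, reduced form, IV) can be put together in a small sketch. The function name `wald_estimate` and its argument names are ours; the four inputs are the Table 3 values quoted in the answers.

```python
def wald_estimate(d_assigned, d_not_assigned, y_assigned, y_not_assigned):
    """Return (first stage, reduced form, IV estimate) for a binary instrument.

    d_* are the shares actually contacted and y_* are the shares who voted,
    by assignment group.
    """
    first_stage = d_assigned - d_not_assigned      # effect of Z on D
    reduced_form = y_assigned - y_not_assigned     # effect of Z on Y (the ITT)
    if first_stage == 0:
        raise ValueError("No first stage: the instrument does not move D.")
    return first_stage, reduced_form, reduced_form / first_stage


# Values quoted in the answers above.
first_stage, reduced_form, iv = wald_estimate(0.28, 0.03, 0.47, 0.45)

print(f"First stage:  {first_stage:.2f}")
print(f"Reduced form: {reduced_form:.2f}")
print(f"IV estimate:  {iv:.2f}")
```

Dividing by the first stage scales the ITT up: only about a quarter of those assigned were actually contacted, so the effect per person contacted is about four times the effect per person assigned.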

Three times the charm

A researcher is interested in the effect of having a third child on a woman’s wages (where the data set contains women with at least two children). She wants to estimate the following model:

\(log(wage)=\beta_{0}+\beta_{1}ThirdKid+\beta_{2}Educ+\beta_{3}Exper+\beta_{4}Exper^{2}+\epsilon\)

Where wages are log hourly wages, \(ThirdKid\) is a dummy=1 if the woman has a third child, and the education and experience variables are defined in years.

  1. The researcher decides to use “\(SameSex\)” as an instrument for “\(ThirdKid\),” where “\(SameSex\)” is a dummy=1 if the first two children are of the same sex and is equal to zero if they are of the opposite sex. First, why might the researcher want to use an instrument for “\(ThirdKid\)”?
  2. ‣
    Answer
    ✅
    If we don't instrument for the third kid, we are comparing women who decided to have a third kid versus women who did not. Deciding to have a third kid could be driven by education level, family size, marriage status and a series of observable and unobservable characteristics that also have an effect on wage. In addition, even if we could control for those observable characteristics, we also have the problem of reverse causality. That is, higher wages could lead to having more kids. IV can help address both of these problems.
  3. Do you think the variable “\(SameSex\)” meets the requirements for an instrument? Be sure to address each of the requirements for instrumental variables.
  4. ‣
    Answer
    ✅
    We would like to ask four major questions:
    1. Does the instrument affect X (First-stage)? This is plausible. We have seen evidence that having two same-sex kids makes certain people more likely to have a third kid. More importantly, this is testable.
    2. Is the instrument randomly assigned? Conditional on having two kids, who gets two boys or two girls vs. one of each could be considered random. One could argue that in certain places families could “choose” this based on sex preferences before birth (i.e. making birthing decisions based on gender) and that families with higher resources could enact these preferences at a higher rate than families with lower resources.
    3. Can the instrument affect Y through another mechanism that is not X? It is hard to come up with arguments for another path through which sex-mix could affect wages that is not the number-of-children mechanism. One argument could be that the gender composition of the siblings directly affects educational investment because of gender preferences of the parents.
    4. Monotonicity: Does the instrument only push people in one direction? This is credible; it's hard to think that having two same-sex children makes you more likely not to have a third child (relative to a non-same-sex pair). It's possible but less likely.
  5. Write down the equation the researcher will estimate as the first stage using 2SLS.
  6. ‣
    Answer
    ✅
    \(ThirdKid=\alpha_{0}+\alpha_{1}SameSex+\alpha_{2}Educ+\alpha_{3}Exper+\alpha_{4}Exper^{2}\)

    Note that it is important to include the controls.

  7. Write down the equation the researcher will estimate as the second step. Which parameter tells you the effect of a third child on wages?
  8. ‣
    Answer
    \(log(wage)=\beta_{0}+\beta_{1}\widehat{ThirdKid}+\beta_{2}Educ+\beta_{3}Exper+\beta_{4}Exper^{2}+\epsilon\)

    Where \(\widehat{ThirdKid}\) is the predicted value from the first stage (which is important to note). \(\beta_1\) would recover the parameter of interest. Note that it is important to include the controls and to specify what \(\widehat{ThirdKid}\) is; it is not enough to say it is just a “hat.”

  9. Write down the equation to estimate if they were to use the reduced form.
  10. ‣
    Answer
    ✅
    \(log(wage)=\beta_{0}+\beta_{1}SameSex+\beta_{2}Educ+\beta_{3}Exper+\beta_{4}Exper^{2}+\epsilon\)
  11. Who are the never-takers in this example?
  12. ‣
    Answer
    ✅
    The never-takers are women who would not have a third kid regardless of the instrument: having two first kids of the same sex would not make them have a third kid, and having two first kids of different sexes would not make them have one either. They would stop at two.
  13. Imagine you find a table with the following results.
    1. image
    2. Which of these estimates is the IV estimate? What is the interpretation?
    3. ‣
      Answer
      ✅
      It’s the -0.15. This means that having a third kid decreases wages by 15% for women whose third kid was induced by having same-sex first two children.
    4. What are columns 1, 2, and 3 representing, respectively?
    5. ‣
      Answer
      ✅
      Column 1 (First stage) Column 2 (Second Stage) Column 3 (ITT or Reduced Form)
    6. What is the value of the coefficient on same-sex in the third column? Provide an interpretation
    7. ‣
      Answer
      ✅
      It’s -0.01185. The interpretation is that having the first two kids be the same sex decreases a woman’s wages by approximately 1.2 percent.
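The never-takers question above is easier to answer once all four compliance types are laid out side by side. The sketch below uses the standard potential-treatment framing: for each woman, imagine whether she would have a third kid if her first two were of different sexes and if they were of the same sex. The function name `compliance_type` and its arguments are hypothetical, introduced only for illustration.

```python
def compliance_type(third_kid_if_mixed: bool, third_kid_if_same: bool) -> str:
    """Classify a woman by what she would do under each value of SameSex."""
    if third_kid_if_same and not third_kid_if_mixed:
        return "complier"       # has a third kid only because of SameSex=1
    if third_kid_if_same and third_kid_if_mixed:
        return "always-taker"   # has a third kid no matter what
    if not third_kid_if_same and not third_kid_if_mixed:
        return "never-taker"    # stops at two no matter what
    return "defier"             # ruled out by the monotonicity assumption


for mixed in (False, True):
    for same in (False, True):
        label = compliance_type(mixed, same)
        print(f"third kid if mixed={mixed!s:5}  if same-sex={same!s:5}  ->  {label}")
```

The IV estimate speaks only to the compliers, which is why the interpretation of the -0.15 is restricted to women whose third kid was induced by the sex mix of the first two.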

NICU Babies

  1. Suppose infants with birthweights below 1500 grams are classified as “very low birthweight” and are therefore automatically eligible for a stay in the neonatal intensive care unit (NICU) under most insurance plans.
    1. Explain intuitively how you would use this fact to estimate the effect of NICU visits on infant health outcomes.
    2. ‣
      Answer
      ✅
      We can compare the health outcomes of babies born just below and just above this threshold. These babies and families should be, on average, very similar (there is no reason why babies with a couple of grams of difference should be really different from each other), other than the fact that the ones below would go to the NICU, and the ones above would be less likely to (they potentially still could). Making this comparison would allow us to recover the effect of being assigned to the NICU for sure vs. not.
      Other thoughts:
      - Notice that the question says “intuitively,” which is another way of saying “use non-technical language.”
      - Notice that we state the comparison and then the assumption.
      - Notice that we are not comparing going vs. not going, because kids above the cutoff could still go; it is just that kids below the threshold are more likely to go.
    3. We want to know the effect of being sent to the NICU on 1-year infant mortality. Do you think this is sharp or fuzzy regression discontinuity? What is the “running variable”?
    4. ‣
      Answer
      ✅
      Fuzzy. The discontinuity likely increases eligibility for the NICU, but that's not to say that babies born at 1,500 grams or more cannot go to the NICU. The running variable is birth weight. The research question is “Effect of NICU on health outcomes.” This tells us that the “treatment” is going to the NICU.
    5. What do we need to assume about the 1500-gram cutoff to get credible identification of the effect of NICU stays?
    6. ‣
      Answer
      ✅
      That (1) kids born right below the threshold are similar in observable and unobservable characteristics to children born right above the threshold and that (2) no other treatment is happening around the cutoff.
    7. How would you explore the discontinuity in a regression? Write the equation.
    8. ‣
      Answer
      ✅
      There are a couple of possible answers:
      \(Y_{i}=\alpha+\delta D_{i}+\gamma(Birthweight-1500)+\epsilon_{i}\)
      \(Y_{i}=\alpha+\delta D_{i}+\gamma(Birthweight-1500)+\beta_{1}(D_{i}\times(Birthweight-1500))+\epsilon_{i}\)
      Where \(Y_i\) is the outcome of interest and \(D_i\) is a binary variable taking the value of 1 if the birthweight is 1500 or lower. However, you would have to do a 2SLS or reduced form if we wanted to explore the effect of NICU on 1-year mortality, since this is a fuzzy RD.
    9. Mention a particular robustness check you would suggest performing.
    10. ‣
      Answer
      ✅
      Many potential suggestions. One is a “density” check. We want to know if people in the hospital are coding babies as weighing less than 1,500 grams to get them into the NICU. This would mean more babies than usual are considered “very low birthweight.”
    11. Let’s say we are concerned that hospital staff are coding some babies that are above the cutoff (e.g., 1550, 1600) as under the cutoff (e.g., 1450, 1490). How would this bias the coefficients? Let’s say we do the analysis and ignore these potential sources of bias. Would the effect we estimate be an upper or lower bound?
    12. ‣
      Answer
      ✅
      If babies who are actually above the cutoff are coded as “treated,” this could bias us against finding an effect, because the babies who wouldn’t otherwise have received care (those above the cutoff) may, on average, not have needed the care, and so it would look as if the NICU had a smaller effect. This means that the effect is a lower bound, since the “true” effect could be bigger. If that didn’t make sense, treat this as a sign-of-the-bias question. We need corr(OV, D) and corr(OV, Y) for that. Here, the OV is something that makes babies put on more weight. This is negatively correlated with D because of the miscoding and presumably positively correlated with Y, since more weight leads to better outcomes. This means the bias is negative, so the estimated effects would be more negative than the true effects. If we obtain positive estimated impacts, the true effects are more positive.
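To make the RD regressions in the NICU answers concrete, here is a sketch of how the right-hand-side variables could be coded for a single baby. The function name `rd_regressors` and the dictionary keys are hypothetical. We follow the answer's coding of D as 1 at "1500 or lower"; the question text says "below 1500," so treat the handling of exactly 1500 grams as an assumption.

```python
CUTOFF = 1500  # grams, from the question


def rd_regressors(birthweight: float, cutoff: float = CUTOFF) -> dict:
    """Build the regressors for the RD specification with an interaction."""
    centered = birthweight - cutoff          # running variable, centered at the cutoff
    d = 1 if birthweight <= cutoff else 0    # assumption: exactly 1500 counts as treated
    return {
        "D": d,
        "centered": centered,
        "interaction": d * centered,         # lets the slope differ below the cutoff
    }


for weight in (1450, 1500, 1550):
    print(weight, rd_regressors(weight))
```

Centering the running variable at the cutoff is what lets the coefficient on D be read as the jump at exactly 1500 grams rather than at a birthweight of zero.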

Hope Scholarship

  1. Students who graduate from Georgia high schools with GPAs of 3.0 or higher are eligible for the state’s HOPE scholarship. HOPE scholarships provide tuition support for students to enroll at public or private colleges in Georgia. The program aims to increase college enrollment overall and encourage strong students to stay in their home state.
    1. Describe how you would evaluate the effects of HOPE eligibility on enrollment at Georgia colleges using a regression discontinuity (RD) strategy. Specify the treatment group, the control group, and any assumptions required for this strategy to capture the causal effect of being eligible for HOPE on college-going. Specify what type of relationship the running variable has with the outcome.
    2. ‣
      Answer
      ✅
      The running variable would be GPA, and the outcome variable could be one of several outcomes: “Any College Attendance” or “Any Georgia College Attendance.” The discontinuity would be at a 3.00 GPA. People at or above the cutoff would be the treatment group, and those below the cutoff would be the control group. We would compare the people right around the cutoff and assume that people right below and right above the cutoff are similar in observable and unobservable characteristics. We would imagine that college attendance is an increasing function of GPA (the higher the GPA, the higher the likelihood of college attendance). However, around the cutoff, differences in GPA could be driven by randomness in the grading scheme rather than representing a true difference in “ability.”
    3. Create a hypothesis of how the HOPE scholarship affects enrollment at Georgia colleges. What about how it would impact “college enrollment” overall?
    4. ‣
      Answer
      ✅
      We could expect that receiving the scholarship would increase attendance at Georgia colleges. It is unclear if it would change the overall college attendance, as people could have attended other colleges too.
    5. Draw some graphs that are consistent with your story and that would represent the RDs. Be sure to label any important features of your picture (axes, legend, etc.) How closely your picture matches your story is more important than which story you believe to be true.
    6. ‣
      Answer
      image
      image
    7. Write out a regression that is consistent with this story. Write down how you would code each variable. Practice seeing how these coefficients map to your graph.
    8. ‣
      Answer
      ✅
      \(GACollegeAttendance=\alpha_0+\beta D_i+\gamma(GPA-3.00)+\epsilon\), where \(\beta=0.04\) (approximately, the way it was drawn). D is a binary variable equal to 1 if your GPA is 3.0 or above and 0 if it is below, and GPA-3.0 is the distance between one's GPA and the threshold. We do not include an interaction between GPA-3.0 and D because we drew the slopes as very similar.
    9. Do you think the HOPE scholarship is well suited for an RD study? Would you offer any caveats about using the HOPE eligibility threshold for an RD analysis?
    10. ‣
      Answer
      ✅
      Yes, it is well suited for an RD study. Potential caveats:

      (1) Students may pick certain classes in high school to optimize their GPA to become eligible for a HOPE scholarship. This could create differences between the people who cared and were marginal versus people who didn't. This can create “bunching” around the cutoff. In other words, once you pass a 3.0 GPA, you may not care about your GPA.

      (2) Policy implication caveats: Even if the policy had an impact by increasing attendance in Georgia colleges, this policy may not increase college attendance overall since those people could have already been going to a different college. It is also unclear if incentivizing students to attend colleges in Georgia, away from other colleges, is the right incentive for students.

      (3) There could be other scholarships with eligibility at the same threshold. This would be a concern.

    11. Imagine you run an RD regression with the student's age as the outcome variable. You find a jump around the threshold. Would this finding make you more or less confident about your results?
    12. ‣
      Answer
      ✅
      Less confident. The student's age should not “jump” around the student's GPA threshold. If this is happening, then it may be the case that something else is happening around the discontinuity.
    13. Someone is concerned that you haven’t controlled for the students' race in the proposed model, so your estimate is biased. What conditions would need to be true for a student’s race to be an issue of concern?
    14. ‣
      Answer
      ✅
      We would be worried about race being an OVB if the share of students from a particular race who are above the threshold is substantially different than those of the same race right below the threshold. In other words, if there is a “jump” in the racial composition of people right below vs. right above the 3.0 GPA, the person would have to provide a reason why this jump specifically exists at 3.0 and not at, say, any other part of the running variable, like, for example, 2.3 to 2.4.
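To practice mapping the HOPE regression coefficients to the graph, it can help to plug in numbers. In the sketch below only the jump of 0.04 comes from the answer above; the intercept (0.30) and slope (0.10) are made-up values chosen purely for illustration, and the function name `predicted_attendance` is hypothetical.

```python
def predicted_attendance(gpa: float,
                         alpha0: float = 0.30,   # hypothetical intercept
                         beta: float = 0.04,     # jump at the cutoff, from the answer
                         gamma: float = 0.10     # hypothetical slope
                         ) -> float:
    """Predicted share attending a Georgia college at a given GPA."""
    d = 1 if gpa >= 3.0 else 0          # eligible for HOPE at 3.0 or higher
    return alpha0 + beta * d + gamma * (gpa - 3.0)


just_below = predicted_attendance(2.999)
at_cutoff = predicted_attendance(3.0)

print(f"Just below the cutoff: {just_below:.4f}")
print(f"At the cutoff:         {at_cutoff:.4f}")
print(f"Gap:                   {at_cutoff - just_below:.4f}")
```

The gap between the two predictions is almost exactly the coefficient on D; the small remainder is the slope times the 0.001 difference in GPA, which shrinks to zero as the two points get closer to the cutoff.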

Texas accountability program

  1. In 1996, Texas adopted a new school accountability program to help with student performance, while the states bordering Texas did not adopt such a program. With standardized test scores (Score) in 1995 and 1997 for a large sample of 4th graders in Texas and the bordering states, we could run the regression:
    1. \(Score_{st}=\beta_{0}+\beta_{1}Texas_{s}+\beta_{2}(Texas_{s}\times D97_{t})+\beta_{3}D97_{t}+\epsilon_{st}\)

      Here, Texas=1 if the observation was drawn from a Texas school (=0 otherwise), and D97=1 if the observation was from 1997 (=0 otherwise).

    2. Interpret what each coefficient is capturing.
    3. ‣
      Answer
      ✅
      \(\beta_0\) represents the average score in 95 for the states bordering Texas. \(\beta_1\) represents the difference in scores between Texas and the bordering states in 95 (before the reform). \(\beta_2\) is the difference between two gaps: the difference in scores between Texas and the bordering states after the reform, minus the difference in scores between Texas and the bordering states before the reform. This coefficient gives us the DD estimate. \(\beta_3\) represents the change in scores in the bordering states between 95 and 97. Note that we are not using general language like “pre-period” or “control group.”
    4. Someone would like to know what would have happened in Texas had we not implemented the policy. How would you obtain this from the regression?
    5. ‣
      Answer
      ✅
      This is obtained by $\beta_0+\beta_1+\beta_3$, the predicted 1997 score for Texas without the interaction term $\beta_2$.
    6. Which of the following provides the best estimate of the causal effect of the policy: $\beta_{0}$, $\beta_{1}$, $\beta_{2}$, or $\beta_{3}$?
    7. ‣
      Answer
      ✅
      $\beta_{2}$ provides the causal effect of the policy, provided the assumptions of DD are met.
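To see why the interaction coefficient is the DD estimate, it may help to write out the four group means implied by the regression. This is a sketch that assumes the error term has mean zero within each group-year cell:

```latex
\begin{aligned}
E[Score \mid Texas=0,\ D97=0] &= \beta_0 \\
E[Score \mid Texas=1,\ D97=0] &= \beta_0 + \beta_1 \\
E[Score \mid Texas=0,\ D97=1] &= \beta_0 + \beta_3 \\
E[Score \mid Texas=1,\ D97=1] &= \beta_0 + \beta_1 + \beta_2 + \beta_3 \\[4pt]
\underbrace{\left[(\beta_0+\beta_1+\beta_2+\beta_3)-(\beta_0+\beta_1)\right]}_{\text{change in Texas}}
-\underbrace{\left[(\beta_0+\beta_3)-\beta_0\right]}_{\text{change in bordering states}} &= \beta_2
\end{aligned}
```

The change in Texas is $\beta_2+\beta_3$ and the change in the bordering states is $\beta_3$, so the difference between the two changes leaves only $\beta_2$.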

      d. Imagine that $\beta_{0}=10$, $\beta_{1}=2$, $\beta_{2}=4$, $\beta_{3}=0$. Fill in the following table:

      |      | Texas | Border States |
      |------|-------|---------------|
      | 1995 |       |               |
      | 1997 |       |               |
      ‣
      Answer
      |      | Texas | Border States |
      |------|-------|---------------|
      | 1995 | 12    | 10            |
      | 1997 | 16    | 10            |

      e. What would have happened to the treatment group had they not received treatment?

      ‣
      Answer
      ✅
      They would have ended up with 12 as their score.
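The filled-in table and the counterfactual can be checked with a few lines of Python. This is only a sketch: the function and variable names are made up for illustration, and the coefficient values are the ones given in part (d).

```python
# Coefficient values from part (d); names are illustrative, not from the course.
b0, b1, b2, b3 = 10, 2, 4, 0


def predicted_score(texas: int, d97: int) -> int:
    """Predicted mean score for one group-year cell of the DD regression."""
    return b0 + b1 * texas + b2 * (texas * d97) + b3 * d97


# Build the 2x2 table of predicted means.
table = {
    (year, group): predicted_score(texas, d97)
    for year, d97 in (("1995", 0), ("1997", 1))
    for group, texas in (("Texas", 1), ("Border States", 0))
}

# Counterfactual: Texas in 1997 without the program (drop the interaction term).
counterfactual_texas_1997 = b0 + b1 + b3

for (year, group), score in table.items():
    print(year, group, score)
print("Counterfactual Texas 1997:", counterfactual_texas_1997)
```

The gap between the observed 1997 Texas cell and the counterfactual is exactly the interaction coefficient, which is why it carries the DD estimate.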

      f. Someone is concerned that in the model above, you have not controlled for the difference in “culture” between TX and other bordering states. What would have to be true for this concern to be valid?

      ‣
      Answer
      ✅
      (1) That there is a change in culture between 95 and 97 in either TX or the bordering states, or, if it changes in both, that the change is differential. (2) That this change in culture affects children’s test scores.

      g. Someone is concerned that you have not included year FE in the model. Explain if this is a concern or not?

      ‣
      Answer
      ✅
      This is not a concern. In this example, we only observe scores for 95 and 97, so the model already includes year FE: with two years, the year FE is just one dummy for 97 (D97).

Health Coverage

  1. A state implemented a reform (at year 0) to increase health coverage. You are in charge of estimating the effect of this policy. Use the following graphs to answer the following questions:
    1. ‣
      Figure A
      image
      ‣
      Figure B
      image
      ‣
      Figure C
      image
    2. You have decided to use a DD strategy to analyze the effects of this policy. Which panel provides the most appropriate control group to implement a DD strategy? Explain.
    3. ‣
      Answer
      ✅
      Figure (A) provides the most appropriate control group because it provides the control group that matches the pre-trends of the treatment group.
    4. Using the graph you picked in 4a, calculate the effect of the policy using a DD strategy. Show your work in a table:
    5. Table to fill in:

      |           | Before | After | Diff |
      |-----------|--------|-------|------|
      | Treatment |        |       |      |
      | Control   |        |       |      |
      | Diff      |        |       |      |
      ‣
      Answer
      ✅
      |           | Before | After | Diff |
      |-----------|--------|-------|------|
      | Treatment | 0.60   | 0.74  | 0.14 |
      | Control   | 0.78   | 0.81  | 0.03 |
      | Diff      | -0.18  | -0.07 | 0.11 |

      This implies that the DD is 0.11, or in other words, the policy increased the outcome by 0.11.
    6. Imagine if we had run the following regression: $Y=\alpha+\beta_{1}Treat+\beta_{2}Post+\beta_{3}Treat\times Post$. Indicate what would be the value of each coefficient.
    7. ‣
      Answer
      ✅
      $\alpha=0.78$, $\beta_{1}=-0.18$, $\beta_{2}=0.03$, $\beta_{3}=0.11$
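The link between the 2x2 table and the regression coefficients can be sketched in Python. With only a treatment dummy, a post dummy, and their interaction, each coefficient is a difference of cell means, so no regression package is needed here. The dictionary layout and variable names are assumptions made for this illustration; the four means come from the answer table.

```python
# Cell means read off the answer table (treatment/control, before/after).
means = {
    ("treatment", "before"): 0.60,
    ("treatment", "after"): 0.74,
    ("control", "before"): 0.78,
    ("control", "after"): 0.81,
}

# In a saturated 2x2 DD regression each coefficient is a difference of means.
alpha = means[("control", "before")]  # control group, before
beta1 = means[("treatment", "before")] - means[("control", "before")]  # Treat
beta2 = means[("control", "after")] - means[("control", "before")]  # Post
treat_change = means[("treatment", "after")] - means[("treatment", "before")]
beta3 = treat_change - beta2  # Treat x Post: the DD estimate

for name, value in [("alpha", alpha), ("beta1", beta1),
                    ("beta2", beta2), ("beta3", beta3)]:
    print(f"{name} = {value:.2f}")
```

Rounding to two decimals is only there to hide floating-point noise in the subtractions.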

A bill in VA

  1. For the following example, explain how to create a difference-in-difference design to estimate the effects of the policy:
    1. The state of VA passed a bill in February 2015 to increase funding for mental health in schools, and the bill was enacted that fall. The government plans to use this money to increase the counselor-per-student ratio in each school. You have a school-by-year dataset for all the southern states (using the census definition). This dataset includes the average students' mental health outcomes and other school characteristics. Write down the model you would use and explain what each (set of) variable(s) means and how it would be coded (not the Stata code). Try to have a standard model and a generalized model.
    2. ‣
      Answer
      ✅
      There are several possible models:

      $Mental\ Health\ Outcome_{csy}=\alpha_{0}+\beta_{1}VA_{s}+\beta_{2}Post2015_{y}+\beta_{3}Post2015_{y}\times VA_{s}+\epsilon_{csy}$

      $Mental\ Health\ Outcome_{csy}=\alpha_{0}+\beta_{1}VA_{s}+\beta_{2}Post2015_{y}+\beta_{3}Post2015_{y}\times VA_{s}+\boldsymbol{\gamma}SchoolControls_{cs}+\epsilon_{csy}$

      $Mental\ Health\ Outcome_{csy}=\alpha_{0}+\beta_{DD}VA_{s}\times Post2015_{y}+\boldsymbol{\gamma}SchoolFE_{c}+\boldsymbol{\lambda}YearFE_{y}+\epsilon_{csy}$

      VA is a variable that takes the value of 1 if the school is in VA and 0 otherwise. It captures the difference in MH outcomes between VA and non-VA schools before the reform. Post2015 is a dummy variable that takes the value of 1 if the year is post 2015 and 0 otherwise. It captures the change in MH outcomes in non-VA schools between the periods before and after 2015. The interaction of Post2015 and VA equals 1 only for VA schools in the post period; its coefficient ($\beta_{3}$, or $\beta_{DD}$ in the third specification) is the DD estimate. In the second specification, we can include school-level controls since these vary by school. These capture potential confounders that would affect the outcome as well as the timing and adoption of the bill. In the third specification, we include school fixed effects, meaning a dummy for each school in the sample (minus one), and year fixed effects, meaning a dummy for each year in the sample (except one). In this specification, we can't include school-level controls that are time-invariant because they would be absorbed by the school FE. The school FE capture time-invariant characteristics of each school, such as the student-teacher ratio (if it does not change over time). The year FE capture shocks common to all schools in a given year, for example, the release of a new social app or of a new TV show like “13 Reasons Why.”
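The coding of the dummies described above could look like the following Python sketch. The rows, school names, and the `FIRST_POST_YEAR` cutoff are all hypothetical; in particular, whether 2015 itself counts as a post year depends on how school years are recorded in the data, so treat the cutoff as an assumption to revisit.

```python
# Hypothetical school-by-year rows; schools, states, and years are made up.
rows = [
    {"school": "A", "state": "VA", "year": 2014},
    {"school": "A", "state": "VA", "year": 2016},
    {"school": "B", "state": "NC", "year": 2014},
    {"school": "B", "state": "NC", "year": 2016},
]

# Assumption: first year coded as "post". 2016 follows the "post 2015" wording;
# change it if 2015 should already count as treated in your data.
FIRST_POST_YEAR = 2016

for row in rows:
    row["VA"] = 1 if row["state"] == "VA" else 0
    row["Post2015"] = 1 if row["year"] >= FIRST_POST_YEAR else 0
    # DD term: equals 1 only for VA schools observed in the post period.
    row["VA_x_Post2015"] = row["VA"] * row["Post2015"]

for row in rows:
    print(row["school"], row["year"], row["VA"], row["Post2015"],
          row["VA_x_Post2015"])
```

Only the VA school in the post year gets a 1 on the interaction, which is the group of observations the DD coefficient is identified from.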

    3. You have the following dataset representing the share of teens reporting anxiety in school. Using these data, create an event study figure with 2014 as your base year.
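    4. Share of teens reporting anxiety, by year:

      | Year | Southern States | Virginia |
      |------|-----------------|----------|
      | 2011 | 14              | 9        |
      | 2012 | 15              | 9        |
      | 2013 | 16              | 7        |
      | 2014 | 17              | 6        |
      | 2015 | 18              | 6        |
      | 2016 | 19              | 8        |
      | 2017 | 20              | 9        |
      | 2018 | 21              | 10       |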
      ‣
      Answer
      ✅
      First step is to plot the data:
      image

      The event study would look like this:

      image
    5. Use the event study to evaluate the assumptions of the research design. Is this supporting the assumptions or not?
    6. ‣
      Answer
      ✅
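      Definitely not! This is not supporting parallel trends: the gap between VA and the other southern states is already changing in the years before the reform instead of staying flat.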
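The event-study points can be reproduced from the data table with a short Python sketch. It assumes the event study is built directly from these aggregate means with no additional controls, so each point is the VA-minus-southern-states gap in a given year relative to the same gap in the base year. Variable names are illustrative.

```python
# Share of teens reporting anxiety, copied from the data table in the question.
years = [2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018]
southern = [14, 15, 16, 17, 18, 19, 20, 21]
virginia = [9, 9, 7, 6, 6, 8, 9, 10]
BASE_YEAR = 2014

# Gap between Virginia and the other southern states in each year.
gap = {y: v - s for y, s, v in zip(years, southern, virginia)}

# Event-study point: each year's gap relative to the gap in the base year.
event_study = {y: gap[y] - gap[BASE_YEAR] for y in years}

for y in years:
    print(y, event_study[y])
```

Under parallel trends the points for 2011 to 2013 would sit near zero; here they are 6, 5, and 2, which is the pre-trend the answer refers to.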
    7. Discuss how these events could affect the causal interpretation of the generalized model from the example above. For each event, under what conditions would these events be a concern, and what conditions would it not be a concern?
      1. A pandemic occurred in 2020.
      2. ‣
        Answer
        ✅
        Our data end in 2018, so this is not a concern. If we had data up to 2020, we would be concerned if the pandemic affected the average student's anxiety in VA schools differently from the average student’s anxiety in other states. However, this would only be a concern for 2020 onward rather than for findings from before 2020.
      3. The rollout of a social media app (say Instagram) in 2015.
      4. ‣
        Answer
        ✅
        It could only be a concern if we think it is affecting schools in VA differently than schools not in VA. If there is no ex-ante reason why that would be the case, this could be absorbed in the year FE.
      5. The long-term closure of schools in Tennessee in 2013 due to unprecedented snow.
      6. ‣
        Answer
        ✅
        Potentially. On the one hand, (1) it affects the control group differently than the treatment group, (2) it is close to the reform, and (3) it happens before the reform in VA. On the other hand, it affects only one state, and the control group includes a number of states, so it is unclear how much this would change the whole control group.
      7. Increases in teacher pay of about 10% in all southern states but VA in 2017
      8. ‣
        Answer
        ✅
        Most definitely. This is a change that (1) affects schools in VA differently than schools in the control group, and (2) although it happens “after the treatment,” it would be hard to disentangle the effects of the main policy from those of the pay increase with the current data. If we had more granular time data, we could use information up to this pay reform. (3) We also need to assume that increases in teacher pay can affect students' mental health outcomes. To be explicit, we would think that an increase in teacher pay improves the morale and mental health of educators and therefore affects students’ mental health. Notice that if this pay increase has no effect on students’ mental health outcomes, it would not be a concern. As with any OVB, it needs to be correlated with Y.
      9. Increases in teacher pay of about 10% in all southern states but VA in 2011
      10. ‣
        Answer
        ✅
        This is (1) a differential change, but (2) it happens right at the beginning of our study period, in the pre-period. If it were a concern, we would see it in the event study (which we do!). However, if the event study showed parallel trends, we would be less concerned. To make things explicit: to be concerned about this even though the event study showed “good” parallel trends, we would have to argue that the 2011 increase in teacher pay in the control group did not differentially affect mental health outcomes for the first 4 years but then did right around the 5th or 6th year. That's a bit of a stretch.

True or False

  1. A dataset is a city-year panel. In a DD design, we include city fixed effects and year fixed effects. We cannot include a variable such as “area in square miles” for each city.
    1. True
    2. False, explain
    3. ‣
      Answer
      ✅
      True. The area of a city rarely changes (or just doesn’t), so it would be absorbed by the city FE.
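One way to see the absorption is through the within transformation: including city fixed effects is equivalent to demeaning every variable within city. The Python sketch below uses a made-up panel (the cities and areas are hypothetical) to show that a time-invariant variable has nothing left after demeaning.

```python
# Hypothetical city-year panel; cities and areas are made up for illustration.
panel = [
    {"city": "X", "year": 2010, "area_sq_miles": 50.0},
    {"city": "X", "year": 2011, "area_sq_miles": 50.0},
    {"city": "Y", "year": 2010, "area_sq_miles": 120.0},
    {"city": "Y", "year": 2011, "area_sq_miles": 120.0},
]

# Mean area within each city.
city_means = {}
for city in {row["city"] for row in panel}:
    values = [r["area_sq_miles"] for r in panel if r["city"] == city]
    city_means[city] = sum(values) / len(values)

# Within transformation: subtract the city mean from each observation.
demeaned_area = [r["area_sq_miles"] - city_means[r["city"]] for r in panel]
print(demeaned_area)  # all zeros: no within-city variation is left
```

With no variation left, the coefficient on area cannot be estimated separately from the city fixed effects.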
  2. In a regression that includes school-fixed effects, these capture all time-invariant characteristics across each child.
    1. True
    2. False, explain
    3. ‣
      Answer
      ✅
      False: school fixed effects capture all time-invariant characteristics of each school, not of each child.
  3. A new policy in Costa Rica has expanded the number of cafeterias in some public schools. To understand the effects of cafeterias on children's nutrition, we should control for enrollment in the schools to account for new students coming because of the new cafeterias.
    1. True
    2. False, explain
    3. ‣
      Answer
      ✅
      False: Enrollment, as stated in the question, is an outcome of the policy, and we wouldn't want to control for an outcome.
  4. If you were interested in the effect of attending a “selective college” (selective meaning something like an Ivy League school, so the main explanatory variable is being in a selective college or not) on lifetime earnings, we should not include college fixed effects.
    1. True
    2. False, explain
    3. ‣
      Answer
      ✅
      True. You cannot include them because doing so would require within-college variation in “selectivity” or “Ivy” status. Since selectivity does not change within a college, either the college fixed effects or the selectivity variable would drop.
  5. A clear rule decides congressional elections: whoever gets the most votes in November wins. Because virtually every congressional race in the United States is between two parties, whoever gets more than 50% of the vote wins. We can use this fact to determine if Republican (or Democrat) elected members of Congress follow the average ideology of their constituents (i.e., the median voter theory) or if they deviate in ideology because of the party they belong to. (In other words, do elected members of Congress follow the people or their party?) Some argue that Republicans and Democrats are very distinctive; others say that members of Congress have a strong incentive to respond to the median voter in the district, regardless of party. We can assess how much party matters by looking at the ideology of members of Congress in the 112th Congress (which covers the years 2011 and 2012). (We define the median voter as the voter who represents the median ideology of the population.) Ideology measures the “conservatism”/“liberalism” of the members of Congress. This measure was developed by Carroll et al. (2009, 2014). It ranges from -0.779 to 1.293. Higher values indicate more conservative voting in Congress. Share of vote Republican is the share of the vote received by the Republican congressional candidate in the district. It ranges from 0 to 1.
    1. image

      a. If the elected member of Congress is to the left of the threshold, what party do they belong to?

      1. Democrat
      2. Republican
      ‣
      Answer

      A: Democrat

      b. If the elected member of Congress is to the right of the threshold, what party do they belong to?

      1. Democrat
      2. Republican
      ‣
      Answer

      B: Republican

      c. What is the running variable?

      1. Percentage of people that go vote
      2. A dummy variable for whether the Republican won
      3. Share of votes for Republican candidate
      4. The conservatism of the candidate
      ‣
      Answer

      C: Share of votes for Republican candidate

      d. What is the outcome variable?

      1. A variable measuring distance from 50% share of Republican voting
      2. Ideology voting of the elected member of Congress
      3. Being Republican
      4. Being Democrat
      ‣
      Answer

      B: Ideology voting of the elected member of Congress

      e. Given the research question from the prompt, what is the treatment?

      1. Being conservative
      2. Being less conservative
      3. Being from a particular party (R or D) and elected member of Congress
      ‣
      Answer

      C: Being from a particular party (R or D) and elected member of Congress

    2. Is this a sharp or fuzzy discontinuity?
      1. Sharp
      2. Fuzzy
      3. ‣
        Answer

        A: Sharp
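What makes the design sharp is that the winning party is a deterministic function of the running variable. A minimal Python sketch, using hypothetical vote shares and a made-up function name:

```python
def republican_wins(rep_vote_share: float) -> int:
    """Return 1 if the Republican wins a two-party race, 0 otherwise."""
    return 1 if rep_vote_share > 0.5 else 0


# Hypothetical Republican vote shares on either side of the 50% threshold.
shares = [0.30, 0.49, 0.499, 0.501, 0.51, 0.70]
treated = [republican_wins(s) for s in shares]
print(list(zip(shares, treated)))
```

Treatment status jumps from 0 to 1 exactly at the threshold, with no district on the "wrong" side. In a fuzzy design, crossing the threshold would only change the probability of treatment.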

      g. The results from this RD provide empirical evidence that strengthens which theory:

      1. Congress members respond to the political party.
      2. Congress members respond to the median voter.
      ‣
      Answer

      A: Congress members respond to the political party.