Purpose
The objective of this homework is for you to practice concepts learned in class and apply them. The ideas we will practice in this homework relate to understanding fixed effects. As you may notice, the questions are becoming more colloquial. There may be many paths toward the correct answer in some cases, and in others, there is only one path.
col·lo·qui·al
/kəˈlōkwēəl/
adjective
- (of language) used in ordinary or familiar conversation; not formal or literary.
"colloquial and everyday language"
Guidelines
- Work is to be done independently.
- Submit your answers to Gradescope (within Canvas).
- We encourage you to use the answer boxes, PDFs, JPGs, or PNGs, rather than Word documents or CSVs. Recall that you can always save something as a PDF, and you can screenshot anything: on Windows, use the Snipping Tool or Windows+Shift+S; on a Mac, use Command+Shift+4.
- Submit your do-file to Gradescope (within Canvas).
- You will get points for correct answers. Points will be deducted if an answer includes unnecessary information or mixes incorrect statements in with correct ones. In short, we want to incentivize answers that use the fewest characters while maximizing accuracy.
- Your responses should be professionally formatted and written.
- The due date is Friday, April 11th, at 11:59pm EDT.
- For statistical significance, we’ll count a result if it is significant at the 95% confidence level or higher.
Preamble
For this homework, we will work with the dataset called “Ch8_Exercise3_Teaching_evals.dta”, which you can find at this link.
With these data, we want to understand the relationship between teaching evaluations and class grades. Faculty have always hypothesized that classes with higher overall grades obtain better teaching evaluations. Many reasons could explain this relationship (higher class grade, higher evals), and we will evaluate this claim using data in this homework. The questions asked here are similar to questions I was asked when evaluating data for a school. You will notice that the questions are broad, but I was able to use tools and skills from RMDA to provide concrete answers.
Understanding the panel
Open the dataset and notice its structure. Answer the following questions (for yourself): What type of panel is this? What could its two dimensions be? Before you keep reading, see if you can work out what each variable means. This is good practice for noticing what data contain without a data dictionary.
Some Tips
The variable Apct indicates the percentage of As in the class. Eval indicates the average teaching evaluation score the instructor received for that particular class: one (1) indicates the lowest score (not great evals) and five (5) the highest (great evals). The rest of the variables should be self-explanatory. Notice that the year variable indicates the academic year rather than the calendar year, so the value 200304 indicates the academic year 2003-2004.
The race variable indicates the instructor's race. Unfortunately, we don’t know the meaning of the values 1 or 2 for this variable; we only know that the value 0 indicates “white”, so convert it into a binary variable where white takes the value of 1 and 0 otherwise. Math indicates whether the course is quantitative or not.
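The recoding step described above can be sketched in Stata as follows (variable names are taken from the do-file at the end of this document; the value labels are illustrative):

```stata
* Create a binary indicator: 1 if the instructor is white (race == 0), 0 otherwise
gen white = (race == 0)
label define whitelbl 0 "Non-white" 1 "White"
label values white whitelbl
```

The single-line `gen white = (race == 0)` is equivalent to the two-step `gen`/`replace` approach, as long as `race` has no missing values.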
You’ve been hired as an intern to analyze Batten data on classes and to evaluate, from the data, the claim that a higher average grade in a class correlates with better faculty evaluations. Think about what other confounders there may be (for example, whether the class is quantitative or not).
- Amanda Crombie is eager to hear what you find in the data. She emailed you a set of questions you could answer with the data. For each of the following questions, use whatever statistical method you prefer to determine which class type has higher evals. In addition to reporting the actual difference for each comparison, report the difference in evals as a percentage difference and the p-value indicating whether the difference is statistically significant. In Gradescope, submit a table with the three numbers needed, and then, for each question, write sentences that explain your findings from the table. These questions aim to describe patterns rather than untangle causal effects. We will, however, test the hypotheses and see what the data say.
Sample Table
- Submit the table with the results
- Who gets higher evaluations, male or female instructors? Does this difference matter?
- Who gets higher evaluations, white or non-white instructors? Does this difference matter?
- Which classes get higher evaluations, spring or fall semester classes? Does this difference matter?
- Which classes get higher evaluations, required or non-required classes? Does this difference matter?
- Which classes get higher evaluations, classes that use math or don’t? Does this difference matter?
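One way to produce the three numbers requested for each comparison (difference, percentage difference, and p-value) is a t-test followed by `display` of the stored results; this sketch uses the `female` comparison, with variable names taken from the do-file at the end:

```stata
* Compare mean evals by gender and report the three requested numbers
ttest eval, by(female)
display "Difference:      " %6.3f (r(mu_1) - r(mu_2))
display "Pct. difference: " %6.1f 100*(r(mu_1) - r(mu_2))/r(mu_1) " %"
display "p-value:         " %6.3f r(p)
```

The same pattern applies to `white`, `spring`, `required`, and `math` (see the loop in the do-file below).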
- Amanda also wants to know which classes get higher evaluations: Classes with a large number of students or a small number of students. Use a tool learned in class to respond to this question. Again, the goal is to describe, not to detect causal effects.
- Present a screenshot of your analysis.
- Provide an answer based on what you find in the analysis.
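Since enrollment is continuous rather than binary, a t-test by group is not available; one descriptive approach (a sketch, not the only valid tool) is a simple regression plus a scatter plot with a fitted line:

```stata
* Describe the eval-enrollment relationship: slope, significance, and a visual check
reg eval enrollment
twoway (scatter eval enrollment) (lfit eval enrollment)
```

The sign and p-value of the `enrollment` coefficient answer whether larger or smaller classes tend to get higher evals, and whether that pattern is statistically distinguishable from zero.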
- Now let’s move on to understanding how students’ performance in a class relates to the instructor’s evaluations. Please run the following models and present them in a table; report only the coefficient on Apct:
- Model 1: regression of evaluations on the percent of students with As
- Model 2: Model 1 + all variables related to observable characteristics about the course (year, required, enrollment, spring, math)
- Model 3: Model 2 + all variables related to observable characteristics of the instructor (race and gender)
- Model 4: Model 1 + course FE
- Model 5: Model 1 + instructor FE
- Model 6: Model 1 + course FE + instructor FE
- Use the results from the table you created in question 3 to answer the following questions:
- Using the results from Model 1, what would be your overall conclusion about the relationship between students’ performance and evals?
- Using the results from Models 1, 2, and 3, how does adding either instructor or course characteristics change this conclusion?
- What are some observable characteristics that course FE accounts for?
- What are some unobservable characteristics that course FE accounts for?
- Which characteristics about the course that you are controlling for in model 2 are also controlled in model 4?
- Amanda sees the result from Model 5 and worries that you haven’t accounted for the instructor’s gender. You mentioned this seems important, but she hasn’t seen it in the regression. What would be your response?
- Amanda sees the result from Model 5 and worries that you haven’t accounted for enrollment in the course. You mentioned to her before that this seems like an important variable. What would be your response?
- Building Model 7: Model 6 includes course and instructor fixed effects; now let’s add to that model the variables that change across those two margins: enrollment and the semester when the class is taught. Finally, add year fixed effects to the model. Use the results from this model to report a final conclusion about how student performance affects instructors’ evaluations. Report a screenshot of your result with just the main explanatory variable. Present your results in a technical and a non-technical way, and remember to assess significance!
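Model 7 as described above can be sketched as follows (variable names follow the do-file at the end; `i.year` treats each academic year as its own fixed effect):

```stata
* Model 7: Model 6 plus time-varying controls (enrollment, spring) and year FE
eststo m7: reg eval apct enrollment spring i.year i.courseid i.instrid
esttab m7, se keep(apct)
```

The `keep(apct)` option restricts the output to the main explanatory variable, which is what the screenshot should show.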
Extensions
As always, these are ungraded questions for you to practice.
Provide examples of what a fixed effect for each of the following would control for: instructor, course, and year. Then, we invite you to think about whether each characteristic you came up with is relevant or not.
Provide a graph that shows the evolution of the percentage of As over time, separately for quantitative vs. non-quantitative classes.
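One way to build this graph (a sketch; variable names follow the do-file below) is to collapse the data to yearly means by course type and then plot two lines:

```stata
* Average percent As by academic year, quantitative vs. non-quantitative courses
preserve
collapse (mean) apct, by(year math)
twoway (line apct year if math == 1) (line apct year if math == 0), ///
    legend(order(1 "Quantitative" 2 "Non-quantitative")) ///
    ytitle("Average % As") xtitle("Academic year")
restore
```

`preserve`/`restore` keeps the original class-level data intact after the collapse.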
Code
* Start of Do File For Students!
global hw5 "$dropbox/1_Classes/Research Methods/Spring 2023/homework/homework 5"
use "$hw5/hw5.dta", clear
gen white =1 if race==0
replace white=0 if race!=0
foreach var of varlist female white spring required math {
    ttest eval, by(`var')
    display (`r(mu_1)'-`r(mu_2)')/`r(mu_1)'
}
reg eval enrollment
estimates clear
eststo: reg eval apct
* Add in Characteristics about the course
eststo: reg eval apct year required enrollment spring math
* Add in Characteristics about the instructor
eststo: reg eval apct year required enrollment spring math white female
* Add in FE:
eststo: reg eval apct i.courseid
eststo: reg eval apct i.instrid
eststo: reg eval apct i.courseid i.instrid
* Model 7: add time-varying controls (enrollment, spring) and year FE
eststo: reg eval apct enrollment spring i.year i.courseid i.instrid
esttab, se keep(apct) mtitle("Baseline" "+Course Char" "+Instr Char" "Course FE" "Instr FE" "Course+Instr FE" "Model 7")