Purpose
The objective of this homework is for you to practice concepts learned in class and apply them to a real-world scenario. The ideas we will practice in this homework relate to understanding fixed effects. As you may notice, the questions are becoming more “colloquial”: in some cases there are many paths to the correct answer, and in others there is only one.
Guidelines
- Work will be independent.
- Submit your answers to Gradescope (within Canvas).
- We encourage you to use the answer boxes, PDFs, JPGs, and PNGs, preferably over Word documents. Recall that you can always save something as a PDF, and you can screenshot anything: on Windows, use the Snipping Tool or Windows+Shift+S; on a Mac, use Command+Shift+4.
- Submit your do-file to Gradescope (within Canvas).
- You will get points for correct answers. Points will be deducted if an answer includes unnecessary information or mixes incorrect statements in with correct ones. In short, we want to incentivize you to use the fewest characters while maximizing the accuracy of your responses.
- Your responses should be professionally formatted and written.
- The due date is Monday, April 15th, at 10 pm EDT.
Preamble
For this homework, we will work with the dataset called “Ch8_Exercise3_Teaching_evals.dta”, which you can find at this link.
With these data, we want to understand the relationship between teaching evaluations and class grades. Faculty have long hypothesized that classes with higher overall grades obtain better teaching evaluations. Many reasons could explain this relationship (higher class grade, higher evals), and we will evaluate this claim using the data in this homework. The questions asked here are similar to questions I was asked when evaluating data for a school. You will notice that the questions are broad, but I was able to use tools and skills from RMDA to provide concrete answers.
Understanding the panel
Open the dataset and notice its structure. Answer the following questions (for yourself): What type of panel is this? What could the two dimensions be? Before you keep reading, see if you can work out what each variable means. This is good practice for learning to infer what a dataset contains without a data dictionary.
The variable Apct indicates the percentage of As in the class. Eval indicates the average teaching evaluation score the instructor received for that particular class: 1 indicates the lowest score (not great evals) and 5 the highest (great evals). The rest of the variables should be self-explanatory. Notice that the year variable indicates the academic year rather than the calendar year.
The race variable indicates the instructor's race. Unfortunately, we don’t know the meaning of the values 1 and 2 for the race variable; we only know that the value 0 indicates “white.”
- Amanda is eager to hear what you find in the data. She emailed you a set of questions you can answer with the data. For each question, use any statistical method to determine which class type has higher evals, and report the difference in evals as a percentage difference along with the p-value that indicates whether the difference is statistically significant. For Gradescope, submit a table with these two numbers for each comparison. These questions aim to describe patterns rather than untangle causal effects; we will simply test each hypothesis and see what the data say.
- Who gets higher evaluations, male or female instructors?
- Who gets higher evaluations, white or non-white instructors?
- Which classes get higher evaluations, in the spring or fall semester?
- Which classes get higher evaluations, required classes or non-required classes?
- Which classes get higher evaluations, classes that use math or don’t?
- Amanda also wants to know which classes get higher evaluations: classes with a large number of students or a small number of students? Use a tool learned in class to answer this question. Again, the goal is to describe, not to detect causal effects.
- Based on your exercise in (1), Amanda would like to know which characteristics of instructors or courses seem to matter for evaluations. (Hint: What does “matter” mean here? What concepts could one use to answer this?) Your answer should identify the characteristics and explain why you picked them.
- Now let’s move on to understanding how students’ performance in the class is related to the instructor’s evaluations. Please run the following models and present them in a table; report only the coefficient on Apct.
- Model 1: regression of evaluations on the percent of students with As
- Model 2: Model 1 + all variables related to observable characteristics of the course (year, required, enrollment, spring, math)
- Model 3: Model 2 + all variables related to observable characteristics of the instructor.
- Model 4: Model 1 + course FE
- Model 5: Model 1 + Instructor FE
- Model 6: Model 1 + course FE + instructor FE.
- Use the results from the table to answer the following questions:
- Using the results from Model 1, what would be your overall conclusion about the relationship between students’ performance and evals?
- How does adding either instructor or course characteristics change this conclusion? (Model 2 and Model 3)
- What are some observable characteristics that course FE accounts for?
- What are some unobservable characteristics that course FE accounts for?
- Amanda sees the result from Model 5 and worries that you haven’t accounted for the instructor’s gender. You mentioned this seems important, but she hasn’t seen it in the regression. What would be your response?
- Amanda sees the result from Model 6 and worries that you haven’t accounted for enrollment in the course. You mentioned to her before that this seems like an important variable. What would be your response?
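The two Model 5/6 concerns above hinge on collinearity: any variable that is constant within instructor (like gender) is an exact linear combination of the instructor dummies, so instructor fixed effects already absorb it. A minimal Python sketch on simulated data (the ids and 0/1 coding are invented for illustration):

```python
# Hypothetical illustration: a characteristic that never varies within
# instructor (e.g., gender) lies in the column space of the instructor
# dummies, so adding it to an instructor-FE regression is redundant.
import numpy as np

n_instr, n_classes = 5, 4
instr = np.repeat(np.arange(n_instr), n_classes)     # instructor id per class
gender = np.repeat([0, 1, 1, 0, 1], n_classes)       # fixed within instructor

D = (instr[:, None] == np.arange(n_instr)).astype(float)  # instructor dummies
# Project gender onto the dummies: the fit is exact (perfect collinearity).
coef = np.linalg.lstsq(D, gender.astype(float), rcond=None)[0]
print(np.allclose(D @ coef, gender))                 # True
```

Because the fit is exact, Stata would simply drop the gender dummy if you added it alongside instructor fixed effects.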
Notice that these questions do not ask for causal effects, just associations. A simple regression would suffice.
If you run a t-test or regress eval on each of the variables below, you’ll get approximately the results below.
Category | % difference | p-value | Interpretation |
Female | 2.5% | 0.0000 | Female instructors score 2.5% lower than male instructors |
White | 0.4% | 0.5230 | White instructors score 0.4% higher than non-white instructors |
Spring | 0.2% | 0.6912 | Spring classes score 0.2% lower than fall classes |
Required | 4.9% | 0.0000 | Required courses score 4.9% lower than non-required courses |
Math | 2.9% | 0.0000 | Math courses score 2.9% lower than non-math courses |
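For reference, the t-test-plus-percent-difference recipe behind these numbers can be sketched in Python on simulated data (the variable names and 0/1 codings mirror the dataset but are assumed here; actual magnitudes come from the .dta file):

```python
# Hypothetical sketch: compare mean evaluations across the two groups of a
# binary characteristic with a two-sample t-test, and express the gap as a
# percentage of the first group's mean, mirroring (mu_1 - mu_2)/mu_1.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
female = rng.integers(0, 2, 2000)                    # 0 = male, 1 = female (assumed)
eval_score = 4.0 - 0.10 * female + rng.normal(0, 0.5, 2000)  # simulated evals

g0, g1 = eval_score[female == 0], eval_score[female == 1]
t, p = stats.ttest_ind(g0, g1)                       # two-sample t-test
pct_diff = (g0.mean() - g1.mean()) / g0.mean()       # percent difference
print(f"percent difference: {pct_diff:.1%}, p-value: {p:.4f}")
```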
If you use regression to obtain the coefficients in percent, you’ll get approximately the results below.
Category | Reg eval on | Reg log(eval), approx. | Reg log(eval), actual |
Female | 2.5%*** | 2.46%*** | 2.434%*** |
White | 0.4% | 0.179% | 0.179% |
Spring | 0.2% | -0.128% | -0.128% |
Required | 4.9%*** | 5.21%*** | 5.08%*** |
Math | 2.9%*** | 2.74%*** | 2.67%*** |
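The gap between the “approx.” and “actual” log columns reflects the usual log-point approximation: a coefficient b from a log(eval) regression is only approximately a percent change, and the exact proportional change is exp(b) − 1. A quick check in Python, using the female coefficient as an example (the negative sign is assumed for illustration):

```python
# The log-point estimate b approximates a percent change; exp(b) - 1 is exact.
import math

b = -0.0246                      # log-point estimate (the "approx." column)
exact = math.exp(b) - 1          # exact proportional change, about -2.43%
print(f"approx: {b:.2%}, exact: {exact:.3%}")
```

The approximation error is small here because b is close to zero; it grows with the magnitude of b.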
We could also use a graphical representation of this through binscatter:
[Binscatter, without logs: figure omitted]
[Binscatter, with logs: figure omitted]
6. Run a final model, Model 6 + enrollment. Use the results from this model to report a conclusion about how student performance affects instructors’ evaluations. Report a screenshot of your result with just the main explanatory variable.
Extensions
As always, these are ungraded questions for you to practice.
Provide examples of what a fixed effect for each of the following would control for: instructor, course, and year. Then, we invite you to think about whether each characteristic you identified is actually relevant.
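As a companion to that exercise, it may help to see mechanically what a fixed effect does: the within (FE) estimator demeans every variable inside each group, and its slope matches OLS with a full set of group dummies. A sketch in Python on simulated data (group sizes and coefficients are invented for illustration):

```python
# Hypothetical sketch: the within transformation (demeaning by group) yields
# the same slope as regression with a full set of group dummies.
import numpy as np

rng = np.random.default_rng(1)
g = np.repeat(np.arange(30), 10)                 # 30 instructors, 10 classes each
alpha = rng.normal(0, 1, 30)[g]                  # instructor fixed effects
x = alpha + rng.normal(0, 1, 300)                # x correlated with the FE
y = 2.0 * x + alpha + rng.normal(0, 1, 300)      # true slope is 2.0

def demean(v):
    """Subtract each group's mean from its observations."""
    means = np.bincount(g, v) / np.bincount(g)
    return v - means[g]

xd, yd = demean(x), demean(y)
slope_within = (xd @ yd) / (xd @ xd)             # within (FE) estimator

D = (g[:, None] == np.arange(30)).astype(float)  # dummy-variable version
beta = np.linalg.lstsq(np.column_stack([x, D]), y, rcond=None)[0]
print(np.isclose(slope_within, beta[0]))          # True: identical slope
```

Naive OLS of y on x alone would be biased upward here, because x is correlated with the instructor effect; both FE versions recover a slope near 2.0.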
Code
* Start of do-file for students
global hw5 "$dropbox/1_Classes/Research Methods/Spring 2023/homework/homework 5"
use "$hw5/hw5.dta", clear

* Create a white indicator (race == 0 means "white"; 1 and 2 are unknown codes)
gen white = 1 if race == 0
replace white = 0 if race != 0

* Question 1: t-test and percent difference for each binary characteristic
foreach var of varlist female white spring required math {
    ttest eval, by(`var')
    display (`r(mu_1)' - `r(mu_2)') / `r(mu_1)'
}

* Question 2: evaluations and enrollment
reg eval enrollment

* Questions 4-6: Models 1-6, plus Model 6 + enrollment
estimates clear
eststo: reg eval apct
* Add observable characteristics of the course
eststo: reg eval apct year required enrollment spring math
* Add observable characteristics of the instructor
eststo: reg eval apct year required enrollment spring math white female
* Add fixed effects
eststo: reg eval apct i.courseid
eststo: reg eval apct i.instrid
eststo: reg eval apct i.courseid i.instrid
eststo: reg eval apct i.courseid i.instrid enrollment
esttab, se keep(apct) mtitle("Baseline" "+Course Char" "+Instr Char" "Course FE" "Instructor FE" "Course+Instr FE" "+Enrollment")