In this worksheet we will work through how to create regression tables in Stata, using the esttab command from the estout package.
You can find more information at the following links:
- First, install the estout package if you haven’t already. “estout” contains several commands, including estout, esttab, eststo, and estadd.
- Let’s work with the following dataset
- Before we start adding regressions to our esttab, we’ll want to clear anything we’ve “Stored” as estimates. This will get rid of anything we had saved to our esttab previously.
- Now let’s add some regressions! We’ll start with a simple linear regression of hourly wage on total work experience (in years). (This regression shows us the association/effect of work experience on wage.) To add this regression to our esttab, put “eststo:” before your regression code. Then, to produce the table, just run esttab.
- We will now look at this regression with controls. Let’s add age and race as control variables, because we think that omitting them would bias our estimates. Let’s add this regression to our esttab. Notice that because we haven’t typed “estimates clear” before running esttab again, the new regression and the old regression appear in the same table.
- Next, we want to see how this effect of total work experience on hourly wage differs between those in a union and those not in a union. So, we’ll add an interaction term to this regression, and then add this regression to our esttab.
- We’re also interested in the effect of tenure in a position on wages, but we think this is a quadratic relationship (non-linear, or the effect of tenure in a position on wages changes depending on the level of tenure). So, we’ll generate a squared term of tenure, and then run a regression with a squared term, and add this regression to our esttab.
- We realize that we aren’t sure whether the relationship between tenure and wages is quadratic or logarithmic. We’ll run a new regression, but instead of a quadratic term we will use the natural log of tenure. First, we’ll generate the new ln(tenure) term, and then run a regression of wage on ln(tenure).
- Now let’s edit our table to ensure it’s displaying everything we are interested in. You can edit the table using options. To figure out which options are available and how to add them, use “help esttab.” Let’s display the standard errors in parentheses under the coefficients, instead of the t-statistics. Notice that the note at the end of the table changes accordingly.
- Let’s also change the column names to represent what each regression is showing. This is a bit challenging, and it isn’t clear how to do this using “help esttab.” However, a key coding skill is being able to google until we find some sample code that helps us achieve what we want. Go ahead and try to figure this out yourself! By “google” we mean not just Google itself, but also searching within Twitter or YouTube.
- Now say that you want to add the mean of the Y variable in a row at the bottom. We do this by collecting the mean from each regression with “estadd ysumm.” For this we’ll start with a fresh new table.
- Now that we have everything we need for our table, and we have confirmed that it looks the way we want in Stata, let’s export it to a Word document. To do this, we will save the table in some location as an RTF file, which we can then open in Word. Make sure you use “replace” to overwrite any old file with that name. Then open the file that you just created. If you use a Mac, you should open the RTF file with Word rather than the default TextEdit program.
ssc install estout, replace
Dataset: dataset_2.13.dta (93.4 KB)
estimates clear
eststo: reg wage ttl_exp
esttab
----------------------------
                      (1)
                     wage
----------------------------
ttl_exp             0.331***
                  (13.04)
_cons               3.612***
                  (10.65)
----------------------------
N                    2246
----------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
eststo: reg wage ttl_exp age race
esttab
--------------------------------------------
                      (1)             (2)
                     wage            wage
--------------------------------------------
ttl_exp             0.331***        0.346***
                  (13.04)         (13.49)
age                                -0.138***
                                  (-3.57)
race                               -1.389***
                                  (-5.21)
_cons               3.612***        9.187***
                  (10.65)          (6.05)
--------------------------------------------
N                    2246            2220
--------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
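Because stored estimates keep accumulating until you run estimates clear, it can help to give each one a name and tell esttab which ones to show. The sketch below is only an illustration: it assumes Stata's built-in nlsw88 data as a stand-in for the worksheet's dataset (so the numbers will differ), and the names basic and controls are made up.

```stata
* Assumption: nlsw88 stands in for the worksheet's dataset
sysuse nlsw88, clear
estimates clear

* Store each regression under a name of our choosing (hypothetical names)
eststo basic: reg wage ttl_exp
eststo controls: reg wage ttl_exp age race

* List only the models we want; mtitles with no argument
* uses the stored names as column titles
esttab basic controls, se mtitles
```

Running esttab with no names still displays every stored model, so naming is optional; it just makes it easier to build several tables from one set of regressions.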
generate interact = ttl_exp*union
eststo: reg wage ttl_exp age race union interact
esttab
------------------------------------------------------------
                      (1)             (2)             (3)
                     wage            wage            wage
------------------------------------------------------------
ttl_exp             0.331***        0.346***        0.342***
                  (13.04)         (13.49)         (15.55)
age                                -0.138***      -0.0609*
                                  (-3.57)         (-2.08)
race                               -1.389***       -1.323***
                                  (-5.21)         (-6.66)
union                                               2.040**
                                                   (3.25)
interact                                          -0.0467
                                                  (-1.03)
_cons               3.612***        9.187***        5.559***
                  (10.65)          (6.05)          (4.79)
------------------------------------------------------------
N                    2246            2220            1854
------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
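The interaction above is built by hand with generate. Stata's factor-variable notation can express the same model without creating a new variable. This is a sketch rather than the worksheet's method, and it assumes the built-in nlsw88 data as a stand-in for the worksheet's dataset.

```stata
* Assumption: nlsw88 stands in for the worksheet's dataset
sysuse nlsw88, clear
estimates clear

* Manual interaction term, as in the worksheet
generate interact = ttl_exp*union
eststo: reg wage ttl_exp age race union interact

* Same model with factor-variable notation:
* c. marks ttl_exp as continuous, i. marks union as categorical,
* and ## adds both main effects plus their interaction
eststo: reg wage c.ttl_exp##i.union age race

* nobaselevels hides the empty rows for the omitted base category
esttab, se nobaselevels
```

The two columns should report the same coefficients under different row names (for example, interact versus 1.union#c.ttl_exp), so they will not line up in the same rows.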
generate tenure_sq = tenure^2
eststo: reg wage tenure tenure_sq
esttab
----------------------------------------------------------------------------
                      (1)             (2)             (3)             (4)
                     wage            wage            wage            wage
----------------------------------------------------------------------------
ttl_exp             0.331***        0.346***        0.342***
                  (13.04)         (13.49)         (15.55)
age                                -0.138***      -0.0609*
                                  (-3.57)         (-2.08)
race                               -1.389***       -1.323***
                                  (-5.21)         (-6.66)
union                                               2.040**
                                                   (3.25)
interact                                          -0.0467
                                                  (-1.03)
tenure                                                              0.344***
                                                                   (4.84)
tenure_sq                                                        -0.00891*
                                                                  (-2.34)
_cons               3.612***        9.187***        5.559***        6.327***
                  (10.65)          (6.05)          (4.79)         (27.14)
----------------------------------------------------------------------------
N                    2246            2220            1854            2231
----------------------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
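With a positive coefficient on tenure and a negative one on tenure_sq, the fitted curve rises and then flattens out, peaking where the slope is zero, at tenure = -b1/(2*b2). The sketch below computes that point from the stored coefficients. It assumes the built-in nlsw88 data as a stand-in for the worksheet's dataset, so the result will not match the table exactly.

```stata
* Assumption: nlsw88 stands in for the worksheet's dataset
sysuse nlsw88, clear
generate tenure_sq = tenure^2
reg wage tenure tenure_sq

* Turning point of the quadratic: -b1 / (2*b2)
display -_b[tenure] / (2 * _b[tenure_sq])
```

Using the rounded coefficients in column (4) above, 0.344 / (2 × 0.00891) comes out at roughly 19 years of tenure.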
generate ln_tenure = ln(tenure)
eststo: reg wage ln_tenure
esttab
--------------------------------------------------------------------------------------------
                      (1)             (2)             (3)             (4)             (5)
                     wage            wage            wage            wage            wage
--------------------------------------------------------------------------------------------
ttl_exp             0.331***        0.346***        0.342***
                  (13.04)         (13.49)         (15.55)
age                                -0.138***      -0.0609*
                                  (-3.57)         (-2.08)
race                               -1.389***       -1.323***
                                  (-5.21)         (-6.66)
union                                               2.040**
                                                   (3.25)
interact                                          -0.0467
                                                  (-1.03)
tenure                                                              0.344***
                                                                   (4.84)
tenure_sq                                                        -0.00891*
                                                                  (-2.34)
ln_tenure                                                                           0.970***
                                                                                   (9.53)
_cons               3.612***        9.187***        5.559***        6.327***        6.562***
                  (10.65)          (6.05)          (4.79)         (27.14)         (36.87)
--------------------------------------------------------------------------------------------
N                    2246            2220            1854            2231            2180
--------------------------------------------------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001
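One thing to watch with the log specification: ln(0) is undefined, so Stata sets ln_tenure to missing for anyone with zero tenure, and those observations drop out of the regression. The lower N in column (5) compared with column (4) is consistent with this. A quick check, sketched here on the built-in nlsw88 data as an assumed stand-in for the worksheet's dataset:

```stata
* Assumption: nlsw88 stands in for the worksheet's dataset
sysuse nlsw88, clear

* How many observations have exactly zero tenure?
count if tenure == 0

* ln(0) evaluates to missing, so these rows are lost in the log model
generate ln_tenure = ln(tenure)
count if missing(ln_tenure) & !missing(tenure)
```

If the two counts match, the sample difference between the quadratic and log models comes entirely from the zero-tenure observations.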
esttab, se
--------------------------------------------------------------------------------------------
                      (1)             (2)             (3)             (4)             (5)
                     wage            wage            wage            wage            wage
--------------------------------------------------------------------------------------------
ttl_exp             0.331***        0.346***        0.342***
                 (0.0254)        (0.0257)        (0.0220)
age                                -0.138***      -0.0609*
                                 (0.0387)        (0.0292)
race                               -1.389***       -1.323***
                                  (0.267)         (0.199)
union                                               2.040**
                                                  (0.628)
interact                                          -0.0467
                                                 (0.0452)
tenure                                                              0.344***
                                                                 (0.0709)
tenure_sq                                                        -0.00891*
                                                                (0.00381)
ln_tenure                                                                           0.970***
                                                                                  (0.102)
_cons               3.612***        9.187***        5.559***        6.327***        6.562***
                  (0.339)         (1.518)         (1.159)         (0.233)         (0.178)
--------------------------------------------------------------------------------------------
N                    2246            2220            1854            2231            2180
--------------------------------------------------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
esttab, se mtitles("Basic" "Controls" "Interaction" "Quadratic" "Linear-Log")
--------------------------------------------------------------------------------------------
                      (1)             (2)             (3)             (4)             (5)
                    Basic        Controls     Interaction       Quadratic      Linear-Log
--------------------------------------------------------------------------------------------
ttl_exp             0.331***        0.346***        0.342***
                 (0.0254)        (0.0257)        (0.0220)
age                                -0.138***      -0.0609*
                                 (0.0387)        (0.0292)
race                               -1.389***       -1.323***
                                  (0.267)         (0.199)
union                                               2.040**
                                                  (0.628)
interact                                          -0.0467
                                                 (0.0452)
tenure                                                              0.344***
                                                                 (0.0709)
tenure_sq                                                        -0.00891*
                                                                (0.00381)
ln_tenure                                                                           0.970***
                                                                                  (0.102)
_cons               3.612***        9.187***        5.559***        6.327***        6.562***
                  (0.339)         (1.518)         (1.159)         (0.233)         (0.178)
--------------------------------------------------------------------------------------------
N                    2246            2220            1854            2231            2180
--------------------------------------------------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
estimates clear
eststo: reg wage ttl_exp
estadd ysumm
eststo: reg wage ttl_exp age race
estadd ysumm
esttab, se stats(N r2 ymean)
--------------------------------------------
                      (1)             (2)
                     wage            wage
--------------------------------------------
ttl_exp             0.331***        0.346***
                 (0.0254)        (0.0257)
age                                -0.138***
                                 (0.0387)
race                               -1.389***
                                  (0.267)
_cons               3.612***        9.187***
                  (0.339)         (1.518)
--------------------------------------------
N                    2246            2220
r2                 0.0705          0.0855
ymean               7.767           7.758
--------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
* We can then change the labels of N, r2, and ymean
esttab, se stats(N r2 ymean, label("n" "R-Squared" "Pre-mean of Y"))
--------------------------------------------
                      (1)             (2)
                     wage            wage
--------------------------------------------
ttl_exp             0.331***        0.346***
                 (0.0254)        (0.0257)
age                                -0.138***
                                 (0.0387)
race                               -1.389***
                                  (0.267)
_cons               3.612***        9.187***
                  (0.339)         (1.518)
--------------------------------------------
n                    2246            2220
R-Squared          0.0705          0.0855
Pre-mean o~Y        7.767           7.758
--------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
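The stats() option can also control how many decimals each statistic shows, through its fmt() suboption. The sketch below assumes the built-in nlsw88 data as a stand-in for the worksheet's dataset, and the label text is just an example.

```stata
* Assumption: nlsw88 stands in for the worksheet's dataset
sysuse nlsw88, clear
estimates clear

eststo: reg wage ttl_exp
estadd ysumm
eststo: reg wage ttl_exp age race
estadd ysumm

* fmt() takes one format per statistic, in the same order:
* 0 decimals for N, 3 for r2, 2 for ymean
esttab, se stats(N r2 ymean, fmt(0 3 2) labels("N" "R-squared" "Mean of Y"))
```

Keeping labels at 12 characters or fewer avoids the abbreviated “Pre-mean o~Y” seen above when variable labels are not in use.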
* We can also use the variable labels instead of the variable names
* by adding the option "label" after the comma.
esttab, se label stats(N r2 ymean, label("N" "R-Squared" "Pre-mean of Y"))
----------------------------------------------------
                              (1)             (2)
                      Hourly wage     Hourly wage
----------------------------------------------------
Total work experie~)        0.331***        0.346***
                         (0.0254)        (0.0257)
Age in current year                        -0.138***
                                         (0.0387)
Race                                       -1.389***
                                          (0.267)
Constant                    3.612***        9.187***
                          (0.339)         (1.518)
----------------------------------------------------
N                            2246            2220
R-Squared                  0.0705          0.0855
Pre-mean of Y               7.767           7.758
----------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
* If you want a different label, you can change the variable's label itself,
* or you can override it within esttab.
* For example, say you want the variable age to be labeled "Age" and not "Age in current year".
* For this you use the option coeflabels
esttab, se label coeflabels(age "Age") stats(N r2 ymean, label("N" "R-Squared" "Pre-mean of Y"))
----------------------------------------------------
                              (1)             (2)
                      Hourly wage     Hourly wage
----------------------------------------------------
Total work experie~)        0.331***        0.346***
                         (0.0254)        (0.0257)
Age                                        -0.138***
                                         (0.0387)
Race                                       -1.389***
                                          (0.267)
Constant                    3.612***        9.187***
                          (0.339)         (1.518)
----------------------------------------------------
N                            2246            2220
R-Squared                  0.0705          0.0855
Pre-mean of Y               7.767           7.758
----------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
* Let's say you only want to keep the coefficient on experience
* and not the other coefficients. For this you use the option "keep"
esttab, keep(ttl_exp) se label coeflabels(ttl_exp "Years of Experience") stats(N r2 ymean, label("N" "R-Squared" "Pre-mean of Y"))
----------------------------------------------------
                              (1)             (2)
                      Hourly wage     Hourly wage
----------------------------------------------------
Years of Experience         0.331***        0.346***
                         (0.0254)        (0.0257)
----------------------------------------------------
N                            2246            2220
R-Squared                  0.0705          0.0855
Pre-mean of Y               7.767           7.758
----------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
esttab using "filepath", keep(ttl_exp) se label coeflabels(ttl_exp "Years of Experience") stats(N r2 ymean, label("N" "R-Squared" "Pre-mean of Y"))
esttab using "/Users/laptop/Downloads/test.rtf", replace keep(ttl_exp) se label coeflabels(ttl_exp "Years of Experience") stats(N r2 ymean, label("N" "R-Squared" "Pre-mean of Y"))
- If you want even more advanced features, and to use LaTeX, then check out this guide:
- What if you wanted one title spanning a group of columns?
sysuse auto, clear
estimates clear
eststo: reg price mpg
eststo: reg price mpg weight
eststo: reg trunk mpg
eststo: reg trunk weight
esttab, se mgroups("Price" "Trunk", pattern(1 0 1 0)) nomtitle
Note that this may create group column titles that are not “centered” when you open them in Word. There is a fix for this if you use LaTeX, but currently no fix in Word (that I know of!).
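For the LaTeX route, one commonly used pattern is to have mgroups wrap each group title in a \multicolumn so that it spans and centers over its columns. This is a sketch, not a tested recipe for your setup: the file name table.tex is made up, and the booktabs option assumes your LaTeX document loads the booktabs package.

```stata
sysuse auto, clear
estimates clear
eststo: reg price mpg
eststo: reg price mpg weight
eststo: reg trunk mpg
eststo: reg trunk weight

* span makes each group title cover its columns; @span is filled in by esttab.
* prefix()/suffix() wrap the title in \multicolumn{...}{c}{...} to center it,
* and erepeat() draws a short rule under each group.
* Hypothetical output file name: table.tex
esttab using "table.tex", replace se booktabs nomtitles ///
    mgroups("Price" "Trunk", pattern(1 0 1 0) span ///
    prefix(\multicolumn{@span}{c}{) suffix(}) ///
    erepeat(\cmidrule(lr){@span}))
```

The /// line continuations only work when the code is run from a do-file; in the Command window the esttab call has to go on a single line.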