Treatment Effects - The Basics
University of Oxford
Today we’ll replicate a famous paper on bias in the labor market: “Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination” by Bertrand and Mullainathan (2004).
To reduce clutter, I’ll refer to the paper as “BM” throughout.
Our first step is to sign up for a free account on Posit Cloud.
Next, install the tidyverse family of R packages: install.packages('tidyverse')
The data are available at https://ditraglia.com/data/lakisha_aer.csv.
After loading the tidyverse, we can read the data into a tibble called bm using the read_csv() function.
bm contains 4870 rows and 65 columns; each row is a fictitious job applicant.
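A minimal sketch of the loading step described above, assuming the tidyverse is installed and the CSV at the URL from the text is reachable. The name data_url is only an illustrative choice.

```r
library(tidyverse)

# URL given in the text above; reading it requires an internet connection
data_url <- 'https://ditraglia.com/data/lakisha_aer.csv'

# read_csv() returns a tibble and guesses the column types
bm <- read_csv(data_url)

# Printing a tibble shows its dimensions and the first ten rows
bm
```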
# A tibble: 4,870 × 65
id ad education ofjobs yearsexp honors volunteer military empholes
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 b 1 4 2 6 0 0 0 1
2 b 1 3 3 6 0 1 1 0
3 b 1 4 1 6 0 0 0 0
4 b 1 3 4 6 0 1 0 1
5 b 1 3 3 22 0 0 0 0
6 b 1 4 2 6 1 0 0 0
7 b 1 4 2 5 0 1 0 0
8 b 1 3 4 21 0 1 0 1
9 b 1 4 3 3 0 0 0 0
10 b 1 4 2 6 0 1 0 0
# ℹ 4,860 more rows
# ℹ 56 more variables: occupspecific <dbl>, occupbroad <dbl>,
# workinschool <dbl>, email <dbl>, computerskills <dbl>, specialskills <dbl>,
# firstname <chr>, sex <chr>, race <chr>, h <dbl>, l <dbl>, call <dbl>,
# city <chr>, kind <chr>, adid <dbl>, fracblack <dbl>, fracwhite <dbl>,
# lmedhhinc <dbl>, fracdropout <dbl>, fraccolp <dbl>, linc <dbl>, col <dbl>,
# expminreq <chr>, schoolreq <chr>, eoe <dbl>, parent_sales <dbl>, …
We will work with the following seven variables:
call: equals 1 if the resume elicits an email or telephone callback for an interview
sex: equals f for female, m for male
race: equals b for black, w for white
computerskills: equals 1 if the resume says the applicant has computer skills
education: level of education on the resume
yearsexp: years of experience on the resume
ofjobs: number of previous jobs on the resume
# A tibble: 4,870 × 7
sex race call computerskills education yearsexp ofjobs
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 f w 0 1 4 6 2
2 f w 0 1 3 6 3
3 f b 0 1 4 6 1
4 f b 0 1 3 6 4
5 f w 0 1 3 22 3
6 m w 0 0 4 6 2
7 f w 0 1 4 5 2
8 f b 0 1 3 21 4
9 f b 0 1 4 3 3
10 m b 0 0 4 6 2
# ℹ 4,860 more rows
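The seven-column tibble above can be obtained by keeping only the variables of interest. The following is a sketch, assuming dplyr's select() with the column order taken from the printout; overwriting bm is one option, not necessarily what the original code did.

```r
library(tidyverse)

bm <- read_csv('https://ditraglia.com/data/lakisha_aer.csv')

# Keep only the seven variables used below, in the order shown in the printout
bm <- bm |>
  select(sex, race, call, computerskills, education, yearsexp, ofjobs)

bm
```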
Skim the introduction and conclusion of BM so we can discuss the following:
Skim parts A-D of section II in BM so you can answer the following:
Random assignment of treatment implies that the characteristics of the treatment and control groups will be balanced: the same on average.
Is sex balanced across race?
More of the fake resumes are female than male, but within sex we see that race is approximately balanced, as it should be:
# A tibble: 4 × 3
# Groups: sex [2]
sex race count
<chr> <chr> <int>
1 f b 1886
2 f w 1860
3 m b 549
4 m w 575
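A count table like the one above can be built by grouping on both variables and counting rows. This is a sketch; the name sex_race_counts is illustrative, and the column name count is taken from the printout.

```r
library(tidyverse)

bm <- read_csv('https://ditraglia.com/data/lakisha_aer.csv')

# Number of resumes in each sex-by-race cell; n() counts rows within a group
sex_race_counts <- bm |>
  group_by(sex, race) |>
  summarize(count = n())

sex_race_counts
```

Because summarize() drops only the last grouping variable by default, the result stays grouped by sex, which is why the printout reports "Groups: sex [2]".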
Remember: names were randomly assigned to resumes.
Are sex, education, ofjobs, and yearsexp balanced across race?
Are computerskills and education balanced across sex? What’s going on here? Hint: see BM section II C.
# balanced across race
bm |>
  group_by(race) |>
  summarize(avg_educ = mean(education),
            avg_jobs = mean(ofjobs),
            avg_exp = mean(yearsexp))
# A tibble: 2 × 4
race avg_educ avg_jobs avg_exp
<chr> <dbl> <dbl> <dbl>
1 b 3.62 3.66 7.83
2 w 3.62 3.66 7.86
We care about balance because we want to be sure that the perception of race is responsible for any difference in callback rates, not some other factor.
# A tibble: 2 × 3
sex avg_comp avg_educ
<chr> <dbl> <dbl>
1 f 0.868 3.58
2 m 0.662 3.73
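The comparison across sex shown above follows the same pattern as the comparison across race. A sketch, with the column names avg_comp and avg_educ taken from the printout and the name sex_balance chosen for illustration:

```r
library(tidyverse)

bm <- read_csv('https://ditraglia.com/data/lakisha_aer.csv')

# Average computer skills (a 0/1 indicator, so its mean is a share)
# and average education level, separately for female and male resumes
sex_balance <- bm |>
  group_by(sex) |>
  summarize(avg_comp = mean(computerskills),
            avg_educ = mean(education))

sex_balance
```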
From the paper:
We use nearly exclusively female names for administrative and clerical jobs to increase callback rates.
Compute the following and compare to Table 1 of BM:
The callback rate for all resumes in bm.
The callback rate by race.
The callback rate by race and sex.
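One way to compute callback rates for comparison with Table 1 of BM is to average the 0/1 variable call, first overall and then within groups. This is a sketch; the object names overall, by_race, and by_race_sex and the column name callback are illustrative.

```r
library(tidyverse)

bm <- read_csv('https://ditraglia.com/data/lakisha_aer.csv')

# The mean of a 0/1 indicator is the share of resumes that got a callback
overall <- bm |>
  summarize(callback = mean(call))

# Callback rate separately for black-sounding and white-sounding names
by_race <- bm |>
  group_by(race) |>
  summarize(callback = mean(call))

# Callback rate within each race-by-sex cell; .groups = 'drop' ungroups the result
by_race_sex <- bm |>
  group_by(race, sex) |>
  summarize(callback = mean(call), .groups = 'drop')

overall
by_race
by_race_sex
```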
Welch Two Sample t-test
data: call by race
t = -4.1147, df = 4711.6, p-value = 3.943e-05
alternative hypothesis: true difference in means between group b and group w is not equal to 0
95 percent confidence interval:
-0.04729503 -0.01677067
sample estimates:
mean in group b mean in group w
0.06447639 0.09650924
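Output of this form is what R's t.test() prints when given a formula. The following sketch assumes the test was run on call grouped by race with the default settings, which give the Welch (unequal variance) version; the name welch is illustrative.

```r
library(tidyverse)

bm <- read_csv('https://ditraglia.com/data/lakisha_aer.csv')

# Two-sample test of equal mean callback rates for race == 'b' and race == 'w';
# var.equal = FALSE is the default, so this is the Welch test
welch <- t.test(call ~ race, data = bm)

welch
```

The group means differ by about 0.032, or 3.2 percentage points: resumes with white-sounding names receive roughly 50 percent more callbacks (0.0965 / 0.0645 ≈ 1.5).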