Shorter Data Analysis Example

Author

Matthew Weirick Johnson

Code

# Load necessary libraries (warn.conflicts and quietly are just hiding error messages because you're not really supposed to run plyr and dplyr together; when running both, it's recommended that plyr be loaded first)
library(formatR, warn.conflicts=F, quietly=T)
library(plyr, warn.conflicts=F, quietly=T)
library(dplyr, warn.conflicts=F, quietly=T)
library(tidyr, warn.conflicts=F, quietly=T)
library(tibble, warn.conflicts=F, quietly=T)
library(knitr, warn.conflicts=F, quietly=T)
library(DataExplorer, warn.conflicts=F, quietly=T)
library(SmartEDA, warn.conflicts=F, quietly=T)
library(ThemePark, warn.conflicts=F, quietly=T)
library(ggplot2, warn.conflicts=F, quietly=T)
library(easystats, warn.conflicts=F, quietly=T)

knitr::opts_chunk$set(
 fig.width = 12,
  fig.asp = 0.8,
 out.width = "100%",
 width.cutoff=I(50),60)

Caution

Please keep in mind that I do not know what I am doing, and I am not a statistician. xoxo.

Results

Descriptive Statistics

Within the sample of academic librarians who have some degree of instructional responsibility, the mean job control perceived overall (job control general) was 3.33 and the mean job control when completing instructional responsibilities (job control instruction) was 3.13. A comparison of the characteristics of these data are included in Table 1. A box plot showing the distribution of these data is included in Figure 1.

Note

Descriptive statistics summarize the data set (representing the study sample), identifying some key features such as central tendency and variability or spread.

Skewness and kurtosis are numerical tests of the normality of data (there are arguments for testing normality graphically, usually through quantile-quantile plots, histograms, box plots, violin plots, or rainclouds). The closer the skewness and kurtosis are to zero the closer the approximation of normality.

Figure 1 presents a box plot as another test for normality. A violin plot or raincloud would be more useful visuals but are more difficult to read and interpret.

Code

# report_table() would achieve the same function as as.data.frame(report())
table1 <- rbind(as.data.frame(report(d$jobcontrol.general.score.1989)), as.data.frame(report(d$jobcontrol.instruction.score.1989)))
table1 <- t(table1)
table1 <- as.data.frame(table1)
colnames(table1)[1] = "Job Control (General)"
colnames(table1)[2] = "Job Control (Instruction)"
table1 <- table1[-10,]
rownames(table1) <- c("Mean", "Std. Dev.", "Median", "MAD", "Min", "Max", "N", "Skewness", "Kurtosis")
table1 <- round(table1, 2)
table1 <- rownames_to_column(table1, var="Attribute")
kable(table1)

Table 1: A summary of job control data.
Attribute	Job Control (General)	Job Control (Instruction)
Mean	3.33	3.13
Std. Dev.	0.52	0.60
Median	3.33	3.14
MAD	0.49	0.64
Min	1.86	1.62
Max	5.00	5.00
N	245.00	245.00
Skewness	-0.09	0.26
Kurtosis	0.13	0.30

Code

p7 <- ggplot(d, aes(y=jobcontrol.general.score.1989)) +
  geom_boxplot(outlier.color = "red") +
  labs(y = "Job Control General") +
  coord_cartesian(ylim=c(1.5, 5)) + 
  theme_barbie()

p8 <- ggplot(d, aes(y=jobcontrol.instruction.score.1989)) +
  geom_boxplot(outlier.color = "red") +
  labs(y = "Job Control Instruction") +
  coord_cartesian(ylim=c(1.5, 5)) + 
  theme_barbie()

plots(p7, p8, title = "Distribution of Job Control Data")

two box plots showing the distribution of data for job control (general) and job control (instruction). The mean for the former is greater than the latter. — Figure 1: Distribution of Job Control Data.

Hypothesis Testing

Note

Hypothesis testing is a method of statistics inference used to decide if the sample data provide enough evidence or support to draw conclusions about the population. For this study, I used t-tests and ANOVA to compare means.

Student’s t-test is a statistical test used to compare the means of two groups. A paired t-test compares means of two measurements taken from the same individual. For example, paired t-tests are frequently used for comparing pre- and post-test scores. In this case, the instruction and general job control scores are from the same group of people. Analysis of Variance (ANOVA) compares the means of multiple groups.

Two hypotheses are formulated: the null hypothesis (H0) that the difference between means is 0 and the alternative hypothesis (Ha) that the true mean difference is not equal to 0.

T-tests and ANOVA are parametric tests. One assumption of parametric tests is that the data are approximately normal. As demonstrated in the descriptive statistics above, the data are approximately normal.

Tip

If your data aren’t approximately normal, there are non-parametric tests that can be used. Instead of a paired t-test, Wilcoxon Signed Rank test, instead of unpaired t-test, Mann-Whitney U test, instead of ANOVA, Kruskal-Wallis H test.

UCLA OARC provides a useful table to help you choose the correct statistical test, and they provide links with directions for executing the tests in SAS, Stata, SPSS, and R.

For an example of a resent study that used non-parametric tests, see “Complex and Varied: Factors Related to the Research Productivity of Academic Librarians in the United States.”

The paired t-test testing the difference between job control general and job control instruction suggests that the effect is positive, statistically significant, and medium (difference = 0.20, 95% CI [0.16, 0.25], t(244) = 8.29, p < .001; Cohen’s d = 0.53, 95% CI [0.40, 0.66]). The effect size is labeled following Cohen’s (1988) recommendations.

Note

The results of the paired t-test are reported above. P-values are frequently used to assess statistical significance and can carry significant weight. Generally, for research in LIS, a p-value less than 0.05 is significant. 0.05, 0.01, and 0.001 are frequently used thresholds.

In 2016, the American Statistical Association (ASA) released a statement on p-values and statistical significance. They informally defined a p-value as “the probability under a specified statistical model that a statistical summary of the data (e.g., the sample mean difference between two compared groups) would be equal to or more extreme than its observed value.” They note both that “p-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone” and that “scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold,” which is possibly to say that these thresholds are arbitrary.

In the same statement, the ASA notes that p-values don’t measure the size of an effect. As such, in my reporting of the results, I’ve included the effect size. For t-tests, the effect size is measured using Cohen’s d. Based on Cohen’s recommendations in his 1988 textbook, 0.5 is considered medium. The effect size measures the strength of the relationship between the variables.

Additionally, the 95% confidence intervals for the mean difference and Cohen’s d are reported. The confidence interval takes into account the size of the sample. There’s a 95% chance that the true value falls within the confidence interval. For example, there is a 95% chance that the true mean is between 0.16 and 0.25. For smaller samples, the confidence interval would be larger.

I included the code for the t-test in R below just to show how simple it is to run in R. Cohen’s d is also simple to calculate using the effectsize package from easystats. Click the arrow to unfold the code.

Code

t.test(d$jobcontrol.general.score.1989, 
       d$jobcontrol.instruction.score.1989, 
       paired = TRUE) 
## 
##  Paired t-test
## 
## data:  d$jobcontrol.general.score.1989 and d$jobcontrol.instruction.score.1989
## t = 8.2908, df = 244, p-value = 7.593e-15
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  0.1552993 0.2520865
## sample estimates:
## mean difference 
##       0.2036929

cohens_d(t.test(d$jobcontrol.general.score.1989, 
                d$jobcontrol.instruction.score.1989, 
                paired = TRUE))
## Cohen's d |       95% CI
## ------------------------
## 0.53      | [0.40, 0.66]

Correlation and Linear Regression

Note

Pearson’s product-moment correlation is a measure of the strength of the linear correlation between two sets of data. The correlation coefficient (r) is a value between -1 and 1. Similarly to ANOVA and t-tests, Pearson’s product moment correlation is a parametric test.

Linear regression is a method of modeling the relationship between variables. The coefficient of determination (r-squared) is a measurement of the proportion of variance in the dependent variable that can be explained by the independent variable. The value of the coefficient is always between 0 and 1.

Correlation measures the linear relationship between the variable and the linear regression allows us to predict the impact of the independent variable on the dependent variable. For example, job control in instruction predicts a moderate proportion of variance in work-related burnout. Hence, we can predict that academic instruction librarians with low job control in instruction will experience greater work-related burnout in comparison to colleagues with greater job control. Linear regression may be described as explanatory. However, as with correlation, causation is not implied or inferred.

The Pearson’s product-moment correlations between job control (instruction) and the three dimensions of burnout are as follows and as visualized in Figure 2:

personal burnout: negative, statistically significant, and medium (r = -0.29, 95% CI [-0.40, -0.17], t(243) = -4.76, p < .001)
work-related burnout: negative, statistically significant, and large (r = -0.36, 95% CI [-0.47, -0.25], t(243) = -6.10, p < .001)
client-related burnout: negative, statistically significant, and large (r = -0.33, 95% CI [-0.44, -0.21], t(243) = -5.41, p < .001).

Code

correlation_matrix <- correlation(d, select=c("jobcontrol.general.score.1989","jobcontrol.instruction.score.1989", "TPBS", "TWRBS", "TCRBS"), rename = c("Job Control (General)","Job Control (Instruction)", "Personal Burnout Score", "Work-related Burnout Score", "Client-related Burnout Score"))
correlation_matrix |>
  summary(redundant = FALSE) |>
  plot(cex=50)

A diagram with boxes demonstrating the correlations between the factors mentioned above. — Figure 2: Correlation Matrix.

Linear models (estimated using OLS) were fitted to predict TWRBS (Figure 3), TPBS (Figure 4), and TCRBS (Figure 5) with job control when completing instructional responsibilities:

TWRBS ~ Job Control (Instruction): The model explains a statistically significant and moderate proportion of variance (R2 = 0.13, F(1, 243) = 37.22, p < .001, adj. R2 = 0.13). The model's intercept, corresponding to Job Control (Instruction) = 0, is at 86.96 (95% CI [74.77, 99.15], t(243) = 14.05, p < .001). Within this model, the effect of Job Control (Instruction) is statistically significant and negative (beta = -11.85, 95% CI [-15.67, -8.02], t(243) = -6.10, p < .001; Std. beta = -0.36, 95% CI [-0.48, -0.25]). The model is shown in Figure 3 along with the linear model to predict TWRBS with Job Control (General).

Code

controlxinstruction <- data.frame(d$jobcontrol.general.score.1989, d$jobcontrol.instruction.score.1989, d$TWRBS, d$TPBS, d$TCRBS)
controlxinstruction <- tidyr::pivot_longer(controlxinstruction, cols = c('d.jobcontrol.general.score.1989', 'd.jobcontrol.instruction.score.1989'), names_to = 'variable', values_to = 'value')

ggplot(controlxinstruction, aes(x=value, y=d.TWRBS, color=variable)) +
  geom_point() +
  geom_smooth(method="lm") +
  labs(title = "Correlation between Work-related Burnout and Job Control",
       x = "Job Control Score",
       y = "Total Work-Related Burnout Score") +
  scale_color_brewer(palette="Dark2", name = "Variables", labels = c("Job Control General", "Job Control Instruction")) +
  theme_minimal(base_size = 16)

A diagram of a linear model — Figure 3: Correlation between Work-related Burnout and Job Control.

TPBS ~ Job Control (Instruction): The model explains a statistically significant and weak proportion of variance (R2 = 0.09, F(1, 243) = 22.61, p < .001, adj. R2 = 0.08). The model's intercept, corresponding to Job Control (Instruction) = 0, is at 85.16 (95% CI [73.27, 97.04], t(243) = 14.11, p < .001). Within this model, the effect of Job Control (Instruction) is statistically significant and negative (beta = -9.00, 95% CI [-12.73, -5.27], t(243) = -4.76, p < .001; Std. beta = -0.29, 95% CI [-0.41, -0.17]). The model is shown in Figure 4 along with the linear model to predict TPBS with Job Control (General).

Code

ggplot(controlxinstruction, aes(x=value, y=d.TPBS, color=variable)) +
  geom_point() +
  geom_smooth(method="lm") +
  labs(title = "Correlation between Personal Burnout and Job Control",
       x = "Job Control Score",
       y = "Total Personal Burnout Score") +
  scale_color_brewer(palette="Dark2", name = "Variables", labels = c("Job Control General", "Job Control Instruction")) +
  theme_minimal(base_size = 16)

TCRBS ~ Job Control (Instruction): The model explains a statistically significant and weak proportion of variance (R2 = 0.11, F(1, 243) = 29.23, p < .001, adj. R2 = 0.10). The model's intercept, corresponding to Job Control (Instruction) = 0, is at 63.41 (95% CI [50.52, 76.30], t(243) = 9.69, p < .001). Within this model, the effect of Job Control (Instruction) is statistically significant and negative (beta = -11.10, 95% CI [-15.15, -7.06], t(243) = -5.41, p < .001; Std. beta = -0.33, 95% CI [-0.45, -0.21]). The model is shown in Figure 5 along with the linear model to predict TCRBS with Job Control (General).

Code

ggplot(controlxinstruction, aes(x=value, y=d.TCRBS, color=variable)) +
  geom_point() +
  geom_smooth(method="lm") +
  labs(title = "Correlation between Client-related Burnout and Job Control",
       x = "Job Control Score",
       y = "Total Client-Related Burnout Score") +
  scale_color_brewer(palette="Dark2", name = "Variables", labels = c("Job Control General", "Job Control Instruction")) +
  theme_minimal(base_size = 16)

Standardized parameters were obtained by fitting the models on a standardized version of the dataset. 95% Confidence Intervals (CIs) and p-values were computed using a Wald t-distribution approximation.

While the R2 for all of the above models using job control for instruction are lower than found in Johnson (2023) using job control in general, which suggests that job control for instruction explains a weaker proportion of variance, the data still suggest some impact of job control on burnout. Since burnout is a multi-faceted phenomenon, it makes sense that job control wouldn't be a sole predictor of burnout.

Analysis of Variance

Note

We have finally arrived at the analysis of variance (ANOVA) mentioned back when I mentioned t-tests. Analysis of variance can compare means across multiple groups. In this case, ANOVA tests were used for all of the demographic and job characteristic data collected in the study as the goal is to compare the means across multiple groups (e.g., among people of different genders, different income levels, etc.).

While one-way ANOVA can test the difference between multiple groups, Tukey’s HSD can be used for post-hoc or follow up analysis test for statistically significant differences between pairs of group means.

Author (2023) found that extent of job control is tied to status (faculty, academic staff, or staff) and teaching workload when measuring job control generally. This study looks specifically at perception of job control when completing instructional tasks and found that time since degree (in years) and whether or not training for instruction was received have statistically significant (p < 0.05) effects on job control. Time at institution, full-time or part-time status, and tenure status for individuals had weaker evidence that the difference in means was not random (p < 0.1). For income, the p value was slightly higher (p = 1.04), so the data are presented in the results for readers who may still be interested. Finally, because status and teaching workload were significant in the study of job control in general, details about those results are included below as well. Table 2 shows the results for statistical significance computed using ANOVA tests for each demographic and job characteristic studied. In the sections that follow, the effect sizes are labeled following Field's (2013) recommendations.

Code

attributes <- c("Gender", "Gender Modality","Disability", "Income", "Time at Institution", "Time Since Degree", "Time in Libraries", "Public or Private", "Non-profit or For-profit", "Permanent or Temporary", "Full-time or Part-time", "Staff or Faculty", "Tenure (individual)", "Tenure (institution)", "Union", "Training Received", "Training Quality", "Teaching Workload")

values <- c(0.572, 0.192, 0.822, 0.104, "0.0583 *", "0.0263 **", 0.195, 0.406, 0.858, 0.187, "0.0934 *", 0.238, "0.0845 *", 0.215, 0.972, "0.0357 **", 0.814, 0.11)

table <- cbind(attributes, values)
colnames(table) <- c("Attribute", "p-value")
kable(table)

Table 2: P-values from ANOVA tests of attributes studied (* denotes p < 0.1 and ** denotes p < 0.05).
Attribute	p-value
Gender	0.572
Gender Modality	0.192
Disability	0.822
Income	0.104
Time at Institution	0.0583 *
Time Since Degree	0.0263 **
Time in Libraries	0.195
Public or Private	0.406
Non-profit or For-profit	0.858
Permanent or Temporary	0.187
Full-time or Part-time	0.0934 *
Staff or Faculty	0.238
Tenure (individual)	0.0845 *
Tenure (institution)	0.215
Union	0.972
Training Received	0.0357 **
Training Quality	0.814
Teaching Workload	0.11

Time Since Degree

The ANOVA testing the effect of time since degree on job control for instruction suggests that the main effect is statistically significant and small (F(4, 238) = 2.81, p = 0.026; Eta2 = 0.05, 95% CI [2.91e-03, 1.00]). Post-hoc analysis using Tukey's HSD test revealed a significant difference between participants with 16 or more years since receiving their degree and participants with 1 to 5 years since receiving their degree (p < 0.05) as demonstrated in Table 4.

As demonstrated in Figure 6 and Table 3, job control for instruction generally increases over time, though the mean job control is highest for individuals who received their degrees less than a year ago, though the number of participants in that category is particularly low.

Code

d |>
  group_by(time.institution) |>
  summarize(N = n(),
            Mean = mean(jobcontrol.instruction.score.1989), 
            Median = median(jobcontrol.instruction.score.1989),
            "Std. Dev." = sd(jobcontrol.instruction.score.1989),
            Min = min(jobcontrol.instruction.score.1989),
            Max = max(jobcontrol.instruction.score.1989)) |>
  kable(col.names = c("Time at Institution", "N", "Mean", "Median", "Std. Dev.", "Min.", "Max."), digits = 2)

Table 3: Job control for instruction by time since degree in intervals of years.
Time at Institution	N	Mean	Median	Std. Dev.	Min.	Max.
Less than 1	31	3.11	3.05	0.59	1.86	5.00
1 to 5	100	3.02	3.02	0.61	1.62	5.00
6 to 10	55	3.16	3.24	0.57	1.90	5.00
11 to 15	23	3.22	3.19	0.58	2.24	4.38
16 or more	36	3.35	3.40	0.58	2.19	4.29

Code

TukeyHSD(aov(jobcontrol.instruction.score.1989 ~ time.institution, data = d)) |>
  broom::tidy() |>
  select(-term, -null.value) |>
  kable(col.names = c("Paired Groups", "Difference", "CI Low", "CI High", "p-value"),digits = 2)

Table 4: Tukey’s HSD grouped means comparison for job control for instruction by time since degree in intervals of years.
Paired Groups	Difference	CI Low	CI High	p-value
1 to 5-Less than 1	-0.09	-0.42	0.24	0.95
6 to 10-Less than 1	0.05	-0.31	0.42	0.99
11 to 15-Less than 1	0.11	-0.33	0.56	0.96
16 or more-Less than 1	0.24	-0.16	0.64	0.46
6 to 10-1 to 5	0.14	-0.13	0.41	0.61
11 to 15-1 to 5	0.20	-0.17	0.58	0.57
16 or more-1 to 5	0.33	0.01	0.65	0.04
11 to 15-6 to 10	0.06	-0.34	0.47	0.99
16 or more-6 to 10	0.19	-0.16	0.54	0.57
16 or more-11 to 15	0.13	-0.31	0.56	0.93

Code

Control_instruction_timeatinstitution_means <- data.frame(d %>%
                                                          group_by(time.institution) %>%
                                                          summarise_at(vars(jobcontrol.instruction.score.1989), list(name=mean)))

tempdf <- d %>%
  group_by(time.institution) %>%
  summarize(N = n(),
            Mean = mean(jobcontrol.instruction.score.1989), 
            Median = median(jobcontrol.instruction.score.1989),
            "Std. Dev." = sd(jobcontrol.instruction.score.1989),
            CI = qt(0.975, df=n()-1)*sd(jobcontrol.instruction.score.1989)/sqrt(n()),
            CI_low = mean(jobcontrol.instruction.score.1989)-CI,
            CI_hi = mean(jobcontrol.instruction.score.1989)+CI,
            Min = min(jobcontrol.instruction.score.1989),
            Max = max(jobcontrol.instruction.score.1989))

Control_instruction_timeatinstitution_means$CI_low <- tempdf$CI_low
Control_instruction_timeatinstitution_means$CI_hi <- tempdf$CI_hi
Control_instruction_timeatinstitution_means$time.institution <- factor(Control_instruction_timeatinstitution_means$time.institution, levels = c("Less than 1", "1 to 5", "6 to 10", "11 to 15", "16 or more"))
Control_instruction_timeatinstitution_means <- Control_instruction_timeatinstitution_means[-6,]

ggplot(Control_instruction_timeatinstitution_means, aes(x=time.institution, y=name)) +
  geom_col(fill = barbie_theme_colors["dark"]) + 
  geom_errorbar(aes(ymin=CI_low, ymax=CI_hi),
                linewidth=.3,
                width=.2,
                position=position_dodge(.9)) +
  labs(x="Time at Institution (in Years)", y="Job Control Score", title="Average Job Control for Instruction Duties \n by Intervals of Number of Years at Current Institution", caption = "*Error bars indicate a 95% confidence interval for the corresponding mean.") +
  theme_barbie()

A column chart — Figure 6: Average job control for instruction duties by intervals of numbers of years since library school.

Training Received

The ANOVA testing the effect of whether or not teacher training was received on job contrtol for instruction suggests that the main effect is statistically significant and small (F(2, 242) = 3.38, p = 0.036; Eta2 = 0.03, 95% CI [9.70e-04, 1.00]). For this test, teacher training received was flattened to yes or no. Respondents had three yes options: "Yes, in library school and on the job," "Yes, only in library school," and "Yes, only on the job." The difference between these was not statistically significant, but the difference was statistically significant when comparing yes to no. Post-hoc analysis using Tukey's HSD test revealed a significant difference between respondents who received no teacher training and respondents who provided free text responses coded as "Other” as demonstrated in Table 6.

As demonstrated in Table 5 and Figure 7, mean job control for instruction is higher for those who have received training for library instruction than those who haven't; however, job control for instruction is particularly high for those who responded "Other." Responses for other generally referenced teacher training that was not specific to libraries or not provided through a library program.

Code

d |>
  group_by(Training3) |>
  summarize(N = n(),
            Mean = mean(jobcontrol.instruction.score.1989), 
            Median = median(jobcontrol.instruction.score.1989),
            "Std. Dev." = sd(jobcontrol.instruction.score.1989),
            Min = min(jobcontrol.instruction.score.1989),
            Max = max(jobcontrol.instruction.score.1989)) |>
    kable(col.names = c("Training Received", "N", "Mean", "Median", "Std. Dev.", "Min.", "Max."), digits = 2)

Table 5: Job control for instruction by whether or not training was received.
Training Received	N	Mean	Median	Std. Dev.	Min.	Max.
Yes	173	3.14	3.14	0.61	1.86	5.00
No	58	3.02	3.00	0.57	1.62	4.29
Other	14	3.47	3.40	0.43	2.71	4.10

Table 6: Tukey’s HSD grouped means comparison for job control for instruction by time since degree in intervals of years.
Paired Groups	Difference	CI Low	CI High	p-value
No-Yes	-0.13	-0.34	0.08	0.34
Other-Yes	0.32	-0.06	0.71	0.12
Other-No	0.45	0.03	0.87	0.03

Code

Control_instruction_Training_means <- data.frame(d %>%
                                     group_by(Training3) %>%
                                     summarise_at(vars(jobcontrol.instruction.score.1989), list(name=mean)))

tempdf <- d %>%
  group_by(Training3) %>%
  summarize(N = n(),
            Mean = mean(jobcontrol.instruction.score.1989), 
            Median = median(jobcontrol.instruction.score.1989),
            "Std. Dev." = sd(jobcontrol.instruction.score.1989),
            CI = qt(0.975, df=n()-1)*sd(jobcontrol.instruction.score.1989)/sqrt(n()),
            CI_low = mean(jobcontrol.instruction.score.1989)-CI,
            CI_hi = mean(jobcontrol.instruction.score.1989)+CI,
            Min = min(jobcontrol.instruction.score.1989),
            Max = max(jobcontrol.instruction.score.1989))

Control_instruction_Training_means$CI_low <- tempdf$CI_low
Control_instruction_Training_means$CI_hi <- tempdf$CI_hi
Control_instruction_Training_means$Training3 <- factor(Control_instruction_Training_means$Training3, levels = c("Yes", "No", "Other"))

ggplot(Control_instruction_Training_means, aes(x=Training3, y=name)) +
  geom_col(fill = barbie_theme_colors["dark"]) + 
  geom_errorbar(aes(ymin=CI_low, ymax=CI_hi),
                linewidth=.3,
                width=.2,
                position=position_dodge(.9)) +
  labs(x="Teacher Training Received", y="Job Control Score", title="Average Job Control for Instruction Duties \n by Whether or Not Participant Received Teacher Training", caption = "*Error bars indicate a 95% confidence interval for the corresponding mean.") +
  theme_barbie()