Week 7 (Workbook) - Piecing Plots Together
Packages
Code
Custom Theme:
Code
my_theme <- function(){
  theme_bw() +
  theme(
    legend.position = "bottom"
    , legend.title = element_text(face = "bold", size = rel(1))
    , legend.text = element_text(face = "italic", size = rel(1))
    , axis.text = element_text(face = "bold", size = rel(1.1), color = "black")
    , axis.title = element_text(face = "bold", size = rel(1.2))
    , plot.title = element_text(face = "bold", size = rel(1.2), hjust = .5)
    , plot.subtitle = element_text(face = "italic", size = rel(1.2), hjust = .5)
    , strip.text = element_text(face = "bold", size = rel(1.1), color = "white")
    , strip.background = element_rect(fill = "black")
  )
}
Review
- Over the last several weeks, we have talked about:
  - tidying data
  - ggplot2 logic
  - visualizing proportions
  - visualizing differences
  - visualizing time series
  - visualizing uncertainty
- For the rest of the course, we will pivot to taking everything we’ve learned and piecing it all together
  - Today: Piecing visualizations together
  - Next week: Polishing visualizations
  - 03/05: Interactive Visualizations (shiny)
Today
- There are lots of packages for piecing visualizations together
- Over the years, I’ve tried:
  - ggExtra
  - cowplot
  - patchwork
- Although patchwork wins by a landslide (imo), each has helpful unique features, so I’ll show you elements of each
- Here is a short list of some core ggplot2 extensions: https://exts.ggplot2.tidyverse.org/gallery/
- We’ll cover:
  - ggExtra
  - patchwork
  - cowplot (and lots of assortments)
ggExtra
- We’ll start with ggExtra because it will help us create plots with distributions in the margins.
- After, we’ll move to patchwork, where there will be lots of little odds and ends to step through
- Remember these data?
Let’s plot the association between conscientiousness and self-rated health across genders in Study 1:
Code
p <- pred_data %>%
  filter(study == "Study1") %>%
  ggplot(aes(x = p_value, y = SRhealth, color = gender)) +
    geom_point(
      size = 2
      , alpha = .5
    ) +
    scale_color_manual(
      values = c("cornflowerblue", "coral")
      , labels = c("Male", "Female")
    ) +
    labs(
      x = "Conscientiousness (POMP, 0-10)"
      , y = "Self-Rated Health (POMP, 0-10)"
      , color = "Gender"
    ) +
    my_theme()
p
Add a smoothed line:
To get marginal distributions, we can just use ggExtra::ggMarginal()
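As a minimal sketch of that call: the data frame `toy` below is simulated and only stands in for the Study 1 rows of `pred_data` (its values, and the names `toy`, `p_toy`, and `m_default`, are assumptions made for illustration).

```r
library(ggplot2)
library(ggExtra)

# Simulated stand-in for the Study 1 data (not the real pred_data)
set.seed(7)
toy <- data.frame(
  p_value = runif(200, 0, 10),
  gender  = factor(sample(c("0", "1"), 200, replace = TRUE))
)
toy$SRhealth <- pmin(pmax(2 + 0.5 * toy$p_value + rnorm(200, sd = 1.5), 0), 10)

# ggMarginal() needs a scatterplot to wrap
p_toy <- ggplot(toy, aes(x = p_value, y = SRhealth, color = gender)) +
  geom_point(size = 2, alpha = .5) +
  theme_bw() +
  theme(legend.position = "bottom")

# Default call: density curves in both margins
m_default <- ggMarginal(p_toy)
m_default
```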
This is fine, but we can do better!
Let’s try color:
Let’s try fill:
Let’s try a histogram:
Let’s color and fill based on groups in the data:
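The variations above might look roughly like the following. This is a sketch on simulated data (`toy` is a hypothetical stand-in for `pred_data`), and the specific colors are arbitrary choices rather than the ones used in the original figures.

```r
library(ggplot2)
library(ggExtra)

# Simulated stand-in for the Study 1 data (not the real pred_data)
set.seed(7)
toy <- data.frame(
  p_value = runif(200, 0, 10),
  gender  = factor(sample(c("0", "1"), 200, replace = TRUE))
)
toy$SRhealth <- pmin(pmax(2 + 0.5 * toy$p_value + rnorm(200, sd = 1.5), 0), 10)

p_toy <- ggplot(toy, aes(x = p_value, y = SRhealth, color = gender)) +
  geom_point(size = 2, alpha = .5) +
  theme_bw()

# One outline color for both marginal densities
m_color <- ggMarginal(p_toy, type = "density", color = "purple4")

# Outline plus fill
m_fill <- ggMarginal(p_toy, type = "density", color = "purple4", fill = "plum")

# Histograms instead of densities
m_hist <- ggMarginal(p_toy, type = "histogram", color = "black", fill = "grey70")

# Reuse the color mapping of the scatterplot to split the margins by group
m_group <- ggMarginal(p_toy, type = "density", groupColour = TRUE, groupFill = TRUE)
m_group
```

Note that `groupColour` and `groupFill` rely on the scatterplot already mapping a grouping variable to color, as `p_toy` does here.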
cowplot + patchwork
- Why cowplot or patchwork?
  - figure alignment
  - easier to choose relative values and layouts
  - can mix base R plots and ggplot2 plots
  - allows you to annotate plots (including stacking, as opposed to layering)
  - shared legends!
  - cowplot includes the themes from its author’s (Claus Wilke’s) book
Patchwork: Piecing the Plots Together
Code
px <- pred_data %>%
filter(study == "Study1") %>%
ggplot(aes(x = p_value, fill = gender, color = gender)) +
geom_density(alpha = .5) +
scale_color_manual(
values = c("cornflowerblue", "coral")
, labels = c("Male", "Female")
) +
scale_fill_manual(
values = c("cornflowerblue", "coral")
, labels = c("Male", "Female")
) +
labs(fill = "Gender", color = "Gender") +
theme_void()
px
Code
py <- pred_data %>%
filter(study == "Study1") %>%
ggplot(aes(x = SRhealth, fill = gender, color = gender)) +
geom_density(alpha = .5) +
scale_color_manual(
values = c("cornflowerblue", "coral")
, labels = c("Male", "Female")
) +
scale_fill_manual(
values = c("cornflowerblue", "coral")
, labels = c("Male", "Female")
) +
labs(fill = "Gender", color = "Gender") +
coord_flip() +
theme_void()
py
We can use the + and / operators to arrange them:
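A small sketch of both operators, assuming simulated data in place of `pred_data` (the objects `toy`, `p_main`, `p_x`, and `p_y` are hypothetical stand-ins for `p`, `px`, and `py`):

```r
library(ggplot2)
library(patchwork)

# Simulated stand-in for the Study 1 data (not the real pred_data)
set.seed(7)
toy <- data.frame(
  p_value = runif(200, 0, 10),
  gender  = factor(sample(c("0", "1"), 200, replace = TRUE))
)
toy$SRhealth <- pmin(pmax(2 + 0.5 * toy$p_value + rnorm(200, sd = 1.5), 0), 10)

p_main <- ggplot(toy, aes(p_value, SRhealth, color = gender)) +
  geom_point(alpha = .5)
p_x <- ggplot(toy, aes(x = p_value, fill = gender)) +
  geom_density(alpha = .5) +
  theme_void()
p_y <- ggplot(toy, aes(x = SRhealth, fill = gender)) +
  geom_density(alpha = .5) +
  coord_flip() +
  theme_void()

side_by_side <- p_main + p_y          # `+` places plots next to each other
stacked      <- p_x / p_main          # `/` stacks plots vertically
combined     <- p_x / (p_main + p_y)  # parentheses nest one arrangement in another
combined
```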
That arrangement isn’t quite right. Let’s try a custom layout:
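One way to get a scatterplot with margins is a 2 x 2 grid with an empty top-right cell. The sketch below uses `plot_spacer()` for the empty cell and unequal `widths` and `heights`; the 4:1 ratios and the simulated `toy` data are assumptions for illustration, not necessarily the layout used for the original figure.

```r
library(ggplot2)
library(patchwork)

# Simulated stand-in for the Study 1 data (not the real pred_data)
set.seed(7)
toy <- data.frame(
  p_value = runif(200, 0, 10),
  gender  = factor(sample(c("0", "1"), 200, replace = TRUE))
)
toy$SRhealth <- pmin(pmax(2 + 0.5 * toy$p_value + rnorm(200, sd = 1.5), 0), 10)

p_main <- ggplot(toy, aes(p_value, SRhealth, color = gender)) +
  geom_point(alpha = .5)
p_x <- ggplot(toy, aes(x = p_value, fill = gender)) +
  geom_density(alpha = .5) +
  theme_void()
p_y <- ggplot(toy, aes(x = SRhealth, fill = gender)) +
  geom_density(alpha = .5) +
  coord_flip() +
  theme_void()

# Cells fill row by row: [p_x, empty] on top, [p_main, p_y] below
custom <- p_x + plot_spacer() + p_main + p_y +
  plot_layout(ncol = 2, widths = c(4, 1), heights = c(1, 4))
custom
```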
Those legends are messing us up! Let’s “collect” them and move them to the bottom.
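In patchwork, collecting legends is done with `plot_layout(guides = "collect")`, and `&` applies a theme change to every panel at once. A sketch on simulated data (`toy` and the plot names are hypothetical stand-ins):

```r
library(ggplot2)
library(patchwork)

# Simulated stand-in for the Study 1 data (not the real pred_data)
set.seed(7)
toy <- data.frame(
  p_value = runif(200, 0, 10),
  gender  = factor(sample(c("0", "1"), 200, replace = TRUE))
)
toy$SRhealth <- pmin(pmax(2 + 0.5 * toy$p_value + rnorm(200, sd = 1.5), 0), 10)

p_main <- ggplot(toy, aes(p_value, SRhealth, color = gender)) +
  geom_point(alpha = .5)
p_x <- ggplot(toy, aes(x = p_value, fill = gender)) +
  geom_density(alpha = .5) +
  theme_void()
p_y <- ggplot(toy, aes(x = SRhealth, fill = gender)) +
  geom_density(alpha = .5) +
  coord_flip() +
  theme_void()

collected <- (
  p_x + plot_spacer() + p_main + p_y +
    plot_layout(
      ncol = 2, widths = c(4, 1), heights = c(1, 4),
      guides = "collect"           # gather legends from all panels
    )
) & theme(legend.position = "bottom")  # `&` modifies every panel
collected
```

Only identical legends are merged, so the color legend of the scatterplot and the fill legend of the margins will both still appear in this sketch.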
Honestly, we don’t need the marginal legend, so let’s remove those legends all together.
We could also use this to add marginal boxplots, instead of the density distributions from ggExtra:
Code
px <- pred_data %>%
filter(study == "Study1") %>%
ggplot(aes(x = p_value, y = gender, fill = gender, color = gender)) +
geom_boxplot(alpha = .5) +
geom_jitter(aes(y = gender), alpha = .5) +
scale_color_manual(
values = c("cornflowerblue", "coral")
, labels = c("Male", "Female")
) +
scale_fill_manual(
values = c("cornflowerblue", "coral")
, labels = c("Male", "Female")
) +
labs(fill = "Gender", color = "Gender") +
theme_void() +
theme(legend.position = "none")
px
Code
py <- pred_data %>%
filter(study == "Study1") %>%
ggplot(aes(x = SRhealth, y = gender, fill = gender, color = gender)) +
geom_boxplot(alpha = .5) +
geom_jitter(aes(y = gender), alpha = .5) +
scale_color_manual(
values = c("cornflowerblue", "coral")
, labels = c("Male", "Female")
) +
scale_fill_manual(
values = c("cornflowerblue", "coral")
, labels = c("Male", "Female")
) +
labs(fill = "Gender", color = "Gender") +
coord_flip() +
theme_void() +
theme(legend.position = "none")
py
And then put them together using our custom arrangement:
Advanced Piecing Plots Together
- Marginal plots are great for lots of reasons
- But when it comes to piecing plots together, we are often interested in combining different kinds of figures that can’t be brought together with facets or other tools
Let me show you a couple of examples from my work that have used cowplot or patchwork
Example: Forest Plots
- Let’s use forest plots as an example. Why use forest plots:
- Meta-analyses are common, and within-paper meta-analyses in multi-study papers are becoming more common
- Not only will this let us practice piecing plots together, this is a particularly advanced case that will let us learn about new elements that we can create (e.g., via grobs)
- Let’s build up our use cases incrementally!
- But first, we need some data to plot!
And remember these models?
Run the models
- Let’s make two small changes:
- Add the number of observations
- Add the residual degrees of freedom
- Why? We usually include these in a plot because they’re relevant information
Code
# Logistic regression: outcome on personality, marital status, and their interaction
m_fun <- function(d) {
  glm(o_value ~ p_value + married + married:p_value
      , data = d
      , family = binomial(link = "logit"))
}
# Tidy the coefficients (with CIs) and attach residual df and number of observations
tidy_ci <- function(m) {
  tidy(m, conf.int = TRUE) %>%
    mutate(df.resid = m$df.residual, n = nrow(m$data))
}
nested_m <- pred_data %>%
  group_by(study) %>%
  nest() %>%
  ungroup() %>%
  mutate(
    m = map(data, m_fun)
    , tidy = map(m, tidy_ci)
  )
nested_m
Here are our unnested model terms:
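Unnesting is a `select()` plus `unnest()` on the list column of tidied coefficients. The sketch below repeats the nest-fit-tidy pattern on simulated data so that it runs on its own; `sim`, `fit_one`, `nested_sim`, and `terms_sim` are hypothetical names, and the simulated effects are arbitrary.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

# Simulated stand-in for pred_data: three studies, binary outcome
set.seed(7)
sim <- tibble(
  study   = rep(c("Study1", "Study2", "Study3"), each = 300),
  p_value = runif(900, 0, 10),
  married = factor(rbinom(900, 1, .6))
) %>%
  mutate(
    o_value = rbinom(n(), 1, plogis(-0.5 - 0.1 * p_value + 0.3 * (married == "1")))
  )

fit_one <- function(d) {
  glm(o_value ~ p_value + married + married:p_value,
      data = d, family = binomial(link = "logit"))
}

nested_sim <- sim %>%
  group_by(study) %>%
  nest() %>%
  ungroup() %>%
  mutate(
    m    = map(data, fit_one),
    tidy = map(m, ~ tidy(.x, conf.int = TRUE))
  )

# One row per study x model term
terms_sim <- nested_sim %>%
  select(study, tidy) %>%
  unnest(tidy)
terms_sim
```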
But maybe we are particularly interested in the interaction between marital status and personality in predicting mortality, which we want to plot as a forest plot
- We could hack our way to a forest plot in a single figure, but it never looks as nice as if we do it in two:
  - the forest plot itself
  - the table of values
Example Setup: Forest Plot (P1)
Let’s add our point estimates and uncertainty intervals
But we want to order the points by the effect sizes:
Code
p1 <- nested_m %>%
select(study, tidy) %>%
unnest(tidy) %>%
mutate_at(vars(estimate, conf.low, conf.high), exp) %>%
filter(term == "p_value:married1") %>%
arrange(desc(estimate)) %>%
mutate(study = fct_inorder(study)) %>%
ggplot(aes(x = estimate, y = study)) +
labs(
x = "Model Estimated OR (CI)"
, y = NULL
) +
my_theme()
p1
Let’s add our point estimates and uncertainty intervals back in
And add in a vertical line at OR = 1:
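A compact sketch of these layers, using made-up odds ratios rather than the model output (`or_toy` and its numbers are invented for illustration). `geom_pointrange()` draws the estimate and its interval in one layer; a point plus error bar layer would work as well.

```r
library(ggplot2)

# Made-up odds ratios and intervals, for illustration only
or_toy <- data.frame(
  study     = c("Study1", "Study2", "Study3"),
  estimate  = c(0.92, 1.05, 0.98),
  conf.low  = c(0.85, 0.97, 0.90),
  conf.high = c(1.00, 1.14, 1.07)
)
# Order the studies by effect size (largest first, as with arrange(desc()))
or_toy$study <- factor(or_toy$study, levels = or_toy$study[order(-or_toy$estimate)])

forest_toy <- ggplot(or_toy, aes(x = estimate, y = study)) +
  geom_vline(xintercept = 1, linetype = "dashed") +          # reference line at OR = 1
  geom_pointrange(aes(xmin = conf.low, xmax = conf.high)) +  # estimate + interval
  labs(x = "Model Estimated OR (CI)", y = NULL) +
  theme_bw()
forest_toy
```

Drawing the reference line first keeps it underneath the estimates.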
Example Setup: Forest Plot Table (P2)
In a forest plot, we don’t just show estimates, we print them with the sample size
Let’s build a table with those!
There are packages to do this, but I like to build them myself because it helps them play nicer with cowplot or patchwork
To figure out how to make it, it’s easiest to figure out where you want to end up and work backward.
First, we need to set up the data for the plot:
Code
p2 <- nested_m %>%
select(study, tidy) %>%
unnest(tidy) %>%
mutate_at(vars(estimate, conf.low, conf.high), exp) %>%
filter(term == "p_value:married1") %>%
arrange(desc(estimate)) %>%
mutate(
study = fct_inorder(study)
, study2 = 1:n()
) %>%
mutate_at(vars(estimate, conf.low, conf.high), ~sprintf("%.2f", .)) %>%
mutate(
est = sprintf("%s [%s, %s]", estimate, conf.low, conf.high)
, n = as.character(n)
) %>%
select(study, study2, estimate, n, est) %>%
pivot_longer(
cols = c(est, n)
, values_to = "lab"
, names_to = "est"
)
p2
Let’s build our base:
Add in the text:
Set the theme
We’ll add a horizontal line at the top and bottom to match the forest plot:
Add the column labels:
We need a little margin on the top and bottom:
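Put together, the table steps could look something like this sketch. The labels in `tab_toy` are invented, and the exact positions of the lines, header row, and margins are guesses at a reasonable layout rather than the values used in the original figure.

```r
library(ggplot2)

# Made-up table cells: one row per study, one column per label type
tab_toy <- data.frame(
  row = rep(1:3, times = 2),
  col = factor(rep(c("est", "n"), each = 3), levels = c("est", "n")),
  lab = c("0.92 [0.85, 1.00]", "0.98 [0.90, 1.07]", "1.05 [0.97, 1.14]",
          "812", "640", "1203")
)

table_toy <- ggplot(tab_toy, aes(x = col, y = row, label = lab)) +
  geom_text(size = 4) +                                  # the cell text
  annotate("text", x = 1:2, y = 4,                       # header row above the data
           label = c("OR [95% CI]", "N"), fontface = "bold") +
  geom_hline(yintercept = c(0.5, 4.5)) +                 # rules at bottom and top
  scale_y_continuous(limits = c(0.4, 4.6), expand = c(0, 0)) +
  theme_void() +                                         # drop axes and grid
  theme(plot.margin = margin(t = 10, b = 10))            # a little vertical room
table_toy
```

Using a numeric `row` on the y-axis is what makes it possible to add a header at `y = 4`, one step above the last data row.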
Example Setup: Back to the Forest Plot (P1)
- We added an extra row at the top of the table, so we need to do that for the forest plot, too.
- To do so, we will use the same trick we did for the table, which is “tricking” ggplot into thinking we have a continuous y-axis
Code
p1 <- nested_m %>%
  select(study, tidy) %>%
  unnest(tidy) %>%
  mutate_at(vars(estimate, conf.low, conf.high), exp) %>%
  filter(term == "p_value:married1") %>%
  arrange(desc(estimate)) %>%
  mutate(
    study = fct_inorder(study)
    , study2 = 1:n()
  ) %>%
  ggplot(aes(x = estimate, y = study2)) +
    labs(
      x = "Model Estimated OR (CI)"
      , y = NULL
    ) +
    my_theme()
p1
Add our point estimates and uncertainty intervals, along with the vertical line at OR = 1
Change the y scale back
Add in that top bar:
Remove the y axis line
- Remember that ggplot is layered.
- So sometimes, you have to hack ggplot and use annotate() rectangles to block out portions of the plot.
- Let’s block out where the dashed line touches the top:
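A sketch of these forest plot hacks on made-up estimates: a numeric y-axis relabeled with study names, an extra empty band at the top, and a white `annotate("rect")` drawn after the dashed line so that it hides the line inside that band. The data and the band limits (3.5 to 4.5) are assumptions for illustration.

```r
library(ggplot2)

# Made-up odds ratios, sorted so row 1 is the smallest estimate
or_toy <- data.frame(
  study     = c("Study1", "Study3", "Study2"),
  estimate  = c(0.92, 0.98, 1.05),
  conf.low  = c(0.85, 0.90, 0.97),
  conf.high = c(1.00, 1.07, 1.14)
)
or_toy$row <- seq_len(nrow(or_toy))  # numeric positions instead of a factor

forest_rows <- ggplot(or_toy, aes(x = estimate, y = row)) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  geom_pointrange(aes(xmin = conf.low, xmax = conf.high)) +
  # Later layers sit on top: this rectangle covers the dashed line in the header band
  annotate("rect", xmin = -Inf, xmax = Inf, ymin = 3.5, ymax = 4.5, fill = "white") +
  geom_hline(yintercept = 3.5) +     # top bar, matching the table
  scale_y_continuous(
    breaks = or_toy$row,             # put the study names back on the axis
    labels = or_toy$study,
    limits = c(0.5, 4.5),
    expand = c(0, 0)
  ) +
  labs(x = "Model Estimated OR (CI)", y = NULL) +
  theme_bw() +
  theme(panel.border = element_blank(), axis.ticks.y = element_blank())
forest_rows
```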
patchwork
- I know that was a lot, but such is the reality of ggplot: we have to hack it!
  - annotate() is a great tool for this
  - so are our scale_[map]_[type] functions, especially given the labels can be anything we want!
  - and our theme elements also let us hack many more parts!
- The biggest trick to ggplot2 is simply having lots of tricks up your sleeve, which come from knowledge (and StackOverflow)
- patchwork is great, and a little more intuitive for simple use cases
  - (We’ll still talk some about cowplot, and a fuller demo of it is at the end of the slides and in the workbook)
- patchwork allows you to use the + operator to piece plots together and makes a lot of default assumptions about alignment
- It also lets you continue to layer on top of figures that are pieced together, which cowplot doesn’t do (easily)
We can just use the + operator!
We can also add rows using the / operator
And change their arrangement using plot_layout(), adjusting the heights and/or the widths:
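For instance, with three throwaway plots built from the built-in `mtcars` data (a stand-in for the forest plot and table, chosen only so the sketch runs on its own):

```r
library(ggplot2)
library(patchwork)

# Placeholder plots from a built-in dataset
p_a <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p_b <- ggplot(mtcars, aes(factor(cyl), mpg)) + geom_boxplot()
p_c <- ggplot(mtcars, aes(mpg)) + geom_histogram(bins = 10)

# One row, with the first panel twice as wide as the second
row_wide <- p_a + p_b + plot_layout(widths = c(2, 1))

# Two rows, with the top row twice as tall as the bottom one
two_rows <- (p_a + p_b) / p_c + plot_layout(heights = c(2, 1))
two_rows
```

The values in `widths` and `heights` are relative, so `c(2, 1)` and `c(4, 2)` give the same result.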
We can add titles using plot_annotation()
We can add labels to plot using plot_annotation()
And change various properties of those with additional arguments:
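A sketch of `plot_annotation()` covering titles, panel labels (tags), and a few of the extra arguments. The plots come from the built-in `mtcars` data and the title text is invented.

```r
library(ggplot2)
library(patchwork)

# Placeholder plots from a built-in dataset
p_a <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p_b <- ggplot(mtcars, aes(factor(cyl), mpg)) + geom_boxplot()

annotated <- p_a + p_b +
  plot_annotation(
    title      = "Fuel economy in mtcars",      # overall title
    subtitle   = "Two views of the same data",
    tag_levels = "A",                           # label panels A, B, ...
    tag_prefix = "(",                           # ... shown as (A), (B)
    tag_suffix = ")",
    theme      = theme(                         # styles the overall title area
      plot.title    = element_text(face = "bold", hjust = .5),
      plot.subtitle = element_text(face = "italic", hjust = .5)
    )
  )
annotated
```

The `theme` argument here only affects the annotation of the combined figure; use `&` to change the theme of the individual panels.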
Example 2 Setup: Simple Effects
To accompany our parameter estimates, we may want to couple that with simple effects plots that decompose the interaction. To do so, we’ll need to get model-based predictions for the C-mortality association across levels of marital status using the predict() function.
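The general recipe is: build a grid of predictor values, predict on the link scale with standard errors, form the interval there, and back-transform. The sketch below does this for a single model fit to simulated data (all names and values are hypothetical); it back-transforms to probabilities with `plogis()`, whereas `exp()` would give odds instead.

```r
# Simulated data for a single study (not the real pred_data)
set.seed(7)
sim <- data.frame(
  p_value = runif(500, 0, 10),
  married = factor(rbinom(500, 1, .6))
)
sim$o_value <- rbinom(
  500, 1, plogis(-0.5 - 0.1 * sim$p_value + 0.3 * (sim$married == "1"))
)

m <- glm(o_value ~ p_value + married + married:p_value,
         data = sim, family = binomial(link = "logit"))

# Grid: the full range of the predictor at each level of the moderator
grid <- expand.grid(
  p_value = seq(0, 10, by = .5),
  married = factor(c("0", "1"), levels = levels(sim$married))
)

# Predict on the link (log-odds) scale so the interval is symmetric there
link <- predict(m, newdata = grid, type = "link", se.fit = TRUE)
grid$.fitted <- plogis(link$fit)
grid$lower   <- plogis(link$fit - 1.96 * link$se.fit)
grid$upper   <- plogis(link$fit + 1.96 * link$se.fit)
head(grid)
```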
Let’s set up the core part of the simple effects plots:
Code
p3 <- nested_m %>%
mutate(df.resid = map_dbl(m, df.residual)) %>%
select(study, pred, df.resid) %>%
unnest(pred) %>%
mutate(married = factor(married, c(0,1), c("Never Married", "Married"))) %>%
ggplot(aes(x = p_value, y = .fitted, fill = study, color = study)) +
labs(x = "Conscientiousness (POMP, 0-10)"
, y = "Predicted Odds Ratio\nof Mortality (95% CI)"
, fill = NULL
, color = NULL) +
facet_grid(~married) +
my_theme()
p3
Add in our lineribbon:
We can even put these back together to combine the pieces of information together, just like what I showed you from my A&D paper!
But we need to add better labels
And deal with the legends:
cowplot
New grobs for drawing on our plots
- Relative to patchwork, cowplot also adds some other new tools to our repertoire:
  - ggdraw()
  - draw_label()
  - draw_plot_label()
  - draw_grob()
  - draw_image()
ggdraw()
- ggdraw() is more of a setup function that allows us to add grobs on top
- We’ll use it with draw_label() to make our title (just some text to put on the plot)
It’d be nice if the title was centered, right?
We could use cowplot::draw_label() to add a title and subtitle to our plot:
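A common pattern is to draw the title on an empty `ggdraw()` canvas and stack it above the plot with `plot_grid()`. The sketch below uses a placeholder plot from `mtcars`; the title text, label positions, and the 0.15 relative height are arbitrary choices.

```r
library(ggplot2)
library(cowplot)

# Placeholder plot from a built-in dataset
p_toy <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# An empty canvas holding a centered title and subtitle
title_toy <- ggdraw() +
  draw_label("Heavier cars get fewer miles per gallon",
             fontface = "bold", size = 14, x = .5, y = .65, hjust = .5) +
  draw_label("Built-in mtcars data",
             fontface = "italic", size = 11, x = .5, y = .25, hjust = .5)

# Stack the title strip on top of the plot
titled <- plot_grid(title_toy, p_toy, ncol = 1, rel_heights = c(.15, 1))
titled
```

Because the title lives on its own canvas, `x = .5` centers it over the whole figure rather than over the plotting panel.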
draw_label()
- draw_label() is meant to be a better wrapper for geom_text() that requires less customization
- Say, for example, we want to put a wordmark on our plots (there are journals that require this!)
- Doing this with geom_text() would require 10+ arguments and has no easy application to figures put together with cowplot (or other packages for doing so)
Imagine you want to put a plot inside of another
Code
inset <-
  pred_data %>%
  filter(study == "Study1") %>%
  ggplot(aes(y = gender, x = SRhealth, fill = gender)) +
    scale_fill_manual(values = c("cornflowerblue", "coral")) +
    scale_y_discrete(labels = c("Male", "Female")) +
    stat_halfeye(alpha = .8) +
    my_theme() +
    # theme_half_open() is a complete theme and resets earlier theme() tweaks,
    # so the legend is removed after it
    theme_half_open(12) +
    theme(legend.position = "none")
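Placing the inset is then a matter of wrapping the main plot in `ggdraw()` and adding the inset with `draw_plot()`. A sketch with placeholder plots from `mtcars` (the position and size values are arbitrary and are given as fractions of the full canvas):

```r
library(ggplot2)
library(cowplot)

# Placeholder main plot and inset from a built-in dataset
main_toy <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  theme_half_open(12)
inset_toy <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  geom_boxplot() +
  labs(x = "cyl") +
  theme_half_open(8)

# x/y give the lower-left corner of the inset; width/height give its size
with_inset <- ggdraw(main_toy) +
  draw_plot(inset_toy, x = .55, y = .55, width = .4, height = .4)
with_inset
```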
We can also add images!
Extra Slides: cowplot::plot_grid()
plot_grid()
- The core function of cowplot is plot_grid(), which allows us to place different figures within the same figure in a grid, and it has a lot of useful arguments
- It’s the alternative to + and / in patchwork
plotlist = NULL
align = c("none", "h", "v", "hv")
axis = c("none", "l", "r", "t", "b", "lr", "tb", "tblr")
nrow = NULL
ncol = NULL
rel_widths = 1
rel_heights = 1
labels = NULL
label_size = 14
label_fontfamily = NULL
label_fontface = "bold"
label_colour = NULL
label_x = 0
label_y = 1
hjust = -0.5
vjust = 1.5
scale = 1
greedy = TRUE
byrow = TRUE
cols = NULL
rows = NULL
- But now that we have our plot, we want to put it together! Remember these?
Not bad, but we want to align our plots
Similar behavior, but "hv" leads to odd spacing
Doesn’t properly align our bottom because it’s not optimized for labels
Let our interval estimates shine
We wouldn’t do this, but note that when we have rows, we use rel_heights
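The alignment and relative-size arguments discussed here can be sketched as follows, with placeholder plots from `mtcars` standing in for the forest plot and table:

```r
library(ggplot2)
library(cowplot)

# Placeholder plots from a built-in dataset
p_a <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p_b <- ggplot(mtcars, aes(factor(cyl), mpg)) + geom_boxplot()

# Side by side: align horizontally on the top and bottom axes,
# and give the first panel twice the width of the second
grid_cols <- plot_grid(
  p_a, p_b,
  nrow = 1, align = "h", axis = "tb",
  rel_widths = c(2, 1), labels = c("A", "B")
)

# Stacked: align vertically on the left and right axes,
# and control the rows with rel_heights instead
grid_rows <- plot_grid(
  p_a, p_b,
  ncol = 1, align = "v", axis = "lr",
  rel_heights = c(2, 1)
)
grid_cols
```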