library(RColorBrewer)
library(knitr)
library(kableExtra)
library(plyr)
library(broom)
library(modelr)
library(lme4)
library(broom.mixed)
library(tidyverse)
library(ggdist)
library(patchwork)
library(cowplot)
library(DiagrammeR)
library(wordcloud)
library(tidytext)
library(ggExtra)
library(distributional)
library(gganimate)
my_theme <- function(){
theme_classic() +
theme(
legend.position = "bottom"
, legend.title = element_text(face = "bold", size = rel(1))
, legend.text = element_text(face = "italic", size = rel(1))
, axis.text = element_text(face = "bold", size = rel(1.1), color = "black")
, axis.title = element_text(face = "bold", size = rel(1.2))
, plot.title = element_text(face = "bold", size = rel(1.2), hjust = .5)
, plot.subtitle = element_text(face = "italic", size = rel(1.2), hjust = .5)
, strip.text = element_text(face = "bold", size = rel(1.1), color = "white")
, strip.background = element_rect(fill = "black")
)
}
DiagrammeR
DiagrammeR is a unique interface because it brings together multiple ways of building diagrams in R and tries to unite them with consistent syntax
DiagrammeR package, so I’m going to make a strong assumption based on my knowledge of your ongoing interests and research:
DiagrammeR: Graphviz
strict basically determines whether we can add multiple edges between the same pair of nodes
directed [digraph] or undirected [graph] graph.
[ID] is what you want to name your graph object
'{' stmt_list '}' is where you specify the nodes and edges of the graph (more on this next)
DiagrammeR: Graphviz
digraph says we want the graph to be directed
graph lets us control elements of the graph in the []
overlap = true means nodes can overlap
node means we’re about to specify some nodes (and their properties in [])
DiagrammeR: Graphviz
We can control lots of properties of nodes (either as groups or individually):
color
fillcolor
fontcolor
alpha
shape
style (like linestyle)
sides
peripheries
fixedsize
height
width
distortion
penwidth
x
y
tooltip
fontname
fontsize
icon
DiagrammeR: Graphviz
But we also want to add edges
-> indicates directed edges
-- indicates undirected edges
A->{B,C} is the same as A->B A->C
DiagrammeR: Graphviz
Edge properties can be defined like node properties:
arrowsize
arrowhead
arrowtail
dir
color
alpha
headport
tailport
fontname
fontsize
fontcolor
penwidth
minlen
tooltip
DiagrammeR: Graphviz
grViz("
digraph b5 {
# a 'graph' statement
graph [overlap = true, fontsize = 10]
# def latent Big Five
node [shape = circle]
E; A; C; N; O
# def observed indicators
node [shape = square]
e1; e2; e3
a1; a2; a3
c1; c2; c3
n1; n2; n3
o1; o2; o3
# several 'edge' statements
E->{e1,e2,e3}
A->{a1,a2,a3}
C->{c1,c2,c3}
N->{n1,n2,n3}
O->{o1,o2,o3}
}"
)
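The brace shorthand described above (A->{B,C} is the same as A->B A->C) can be checked without drawing anything. The sketch below is only an illustration: expand_edges() is a hypothetical helper, not part of DiagrammeR, and the graph name toy is made up. It writes out the long form of the edges and assembles a small DOT string in base R; the grViz() call is guarded in case DiagrammeR isn't installed.

```r
# Hypothetical helper (not a DiagrammeR function):
# expand the shorthand A->{B,C} into one edge statement per target
expand_edges <- function(from, to, op = "->") {
  paste0(from, op, to)
}

edges <- expand_edges("E", c("e1", "e2", "e3"))
edges

# Assemble a minimal DOT specification as a single string
dot <- paste0(
  "digraph toy {\n",
  "  node [shape = circle]\n  E\n",
  "  node [shape = square]\n  e1; e2; e3\n",
  "  ", paste(edges, collapse = "\n  "), "\n",
  "}"
)
cat(dot)

# Build the widget only if DiagrammeR is available; print toy_graph to display it
if (requireNamespace("DiagrammeR", quietly = TRUE)) {
  toy_graph <- DiagrammeR::grViz(dot)
}
```

Building the string with paste0() is handy when the node names come from data rather than being typed by hand.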
DiagrammeR: Graphviz
But they aren’t orthogonal, so we need to let the factors correlate.
grViz("
digraph b5 {
# a 'graph' statement
graph [overlap = true, fontsize = 10]
# def latent Big Five
node [shape = circle]
E; A; C; N; O
# def observed indicators
node [shape = square]
e1; e2; e3
a1; a2; a3
c1; c2; c3
n1; n2; n3
o1; o2; o3
# several 'edge' statements
E->{e1,e2,e3}
A->{a1,a2,a3}
C->{c1,c2,c3}
N->{n1,n2,n3}
O->{o1,o2,o3}
E->{A,C,N,O} [dir = both]
A->{C,N,O} [dir = both]
C->{N,O} [dir = both]
N->{O} [dir = both]
}"
)
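The four hand-written correlation statements above cover every pair of factors exactly once. As a rough sketch (the object names factors, pairs, and cor_edges are illustrative), the same ten edges could be generated with combn() in base R, which helps when the number of factors grows.

```r
factors <- c("E", "A", "C", "N", "O")

# All unordered pairs of latent factors: choose(5, 2) = 10 pairs
pairs <- combn(factors, 2)

# One DOT edge statement per pair, drawn with arrowheads at both ends
cor_edges <- paste0(pairs[1, ], "->", pairs[2, ], " [dir = both]")
cor_edges
```

The result can be collapsed with paste(cor_edges, collapse = "\n") and spliced into the string passed to grViz().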
DiagrammeR: Graphviz
But they aren’t orthogonal, so we need to let the factors correlate. Let’s change the layout to neato:
grViz("
digraph b5 {
# a 'graph' statement
graph [overlap = true, fontsize = 10, layout = neato]
# def latent Big Five
node [shape = circle]
E; A; C; N; O
# def observed indicators
node [shape = square,
fixedsize = true,
width = 0.25]
e1; e2; e3
a1; a2; a3
c1; c2; c3
n1; n2; n3
o1; o2; o3
# several 'edge' statements
E->{e1,e2,e3}
A->{a1,a2,a3}
C->{c1,c2,c3}
N->{n1,n2,n3}
O->{o1,o2,o3}
E->{A,C,N,O} [dir = both]
A->{C,N,O} [dir = both]
C->{N,O} [dir = both]
N->{O} [dir = both]
}"
)
DiagrammeR: Graphviz
lavaan, wasn’t it?
create_graph() and accompanying functions
R.
R, there are lots of great tools for tokenizing, basic sentiment analysis, and more
text_df <- read.table("https://github.com/emoriebeck/psc290-data-viz-2022/raw/main/08-week8-polishing/01-data/part2_pymupdf.txt", sep = "\n") %>%
setNames("text") %>%
mutate(line = 1:n()) %>%
as_tibble() %>%
mutate(text = str_remove_all(text, "[0-9]"))
text_df
# A tibble: 1,521 × 2
text line
<chr> <int>
1 "CASE REPORTS" 1
2 "LETTERS FROM JENNY (continued)" 2
3 "ANONYMOUS" 3
4 " (continued)" 4
5 "N.Y.C. Sunday /" 5
6 "My dearest Boy and Girl:" 6
7 "This is not a regular letter, but even if it" 7
8 "were I could never begin to express my" 8
9 "gratitude to you. I believe that when two" 9
10 "persons really love each other in the highest" 10
# ℹ 1,511 more rows
[1] "CASE REPORTS"
[2] "LETTERS FROM JENNY (continued)"
[3] "ANONYMOUS"
[4] " (continued)"
[5] "N.Y.C. Sunday /"
[6] "My dearest Boy and Girl:"
[7] "This is not a regular letter, but even if it"
[8] "were I could never begin to express my"
[9] "gratitude to you. I believe that when two"
[10] "persons really love each other in the highest"
A token is a meaningful unit of text, most often a word, that we are interested in using for further analysis, and tokenization is the process of splitting text into tokens (Silge & Robinson, Tidy Text Mining in R)
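To make tokenization concrete, here is a minimal sketch using tidytext's unnest_tokens() on two lines from the letters shown above. The object names toy_text, toy_tokens, and toy_content are illustrative, and this is not necessarily how the tidy_text object used later was built.

```r
library(dplyr)
library(tidytext)

# Two lines taken from the letters shown above
toy_text <- tibble::tibble(
  line = 6:7,
  text = c(
    "My dearest Boy and Girl:",
    "This is not a regular letter, but even if it"
  )
)

# One row per word; unnest_tokens() lowercases and drops punctuation by default
toy_tokens <- toy_text %>%
  unnest_tokens(output = word, input = text)
toy_tokens

# Optionally drop common stop words using tidytext's built-in stop_words table
toy_content <- toy_tokens %>%
  anti_join(stop_words, by = "word")
```

Each row of toy_tokens keeps its line number, so the tokens can still be traced back to their position in the text.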
Let’s visualize the count:
How negative is Jenny?
Does her negativity change over time?
# A tibble: 32 × 3
sentiment index n
<chr> <dbl> <int>
1 positive 4 97
2 negative 4 94
3 negative 11 89
4 negative 7 63
5 negative 10 60
6 positive 7 60
7 negative 12 53
8 negative 1 52
9 negative 8 42
10 negative 2 39
# ℹ 22 more rows
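A table like the one above can be produced by joining tokens to a sentiment lexicon and counting within chunks of lines. The sketch below uses made-up toy tokens and an assumed chunk length of 100 lines (chunk_size is a hypothetical value, and the slides' own chunking may differ); the bing lexicon ships with tidytext.

```r
library(dplyr)
library(tidytext)

# Toy tokens: `line` is the source line of each word (values are made up)
toy_tokens <- tibble::tibble(
  line = c(1, 2, 3, 120, 130, 140),
  word = c("love", "prison", "the", "dead", "fine", "love")
)

chunk_size <- 100  # assumed chunk length, in lines

toy_counts <- toy_tokens %>%
  inner_join(get_sentiments("bing"), by = "word") %>%  # keeps only lexicon words
  mutate(index = line %/% chunk_size) %>%               # integer division gives a chunk id
  count(sentiment, index, sort = TRUE)
toy_counts
```

Words that are not in the lexicon (here, "the") are dropped by the inner join, so the counts reflect only sentiment-bearing words.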
Does her negativity change over time?
p +
scale_color_manual(
values = c("grey40", "goldenrod")
) +
scale_x_continuous(
limits = c(0,18)
, breaks = seq(0,15,5)
) +
annotate("label"
, label = "negative"
, y = 32
, x = 15.5
, hjust = 0
, fill = "grey40"
, color = "white") +
annotate("label"
, label = "positive"
, y = 13
, x = 15.5
, hjust = 0
, fill = "goldenrod") +
labs(x = "Chunk", y = "Count") +
theme(legend.position = "none")
We can also look at most common negative and positive words:
tidy_text %>%
inner_join(get_sentiments("bing")) %>%
count(sentiment, word, sort = T) %>%
group_by(sentiment) %>%
top_n(10)
# A tibble: 21 × 3
# Groups: sentiment [2]
sentiment word n
<chr> <chr> <int>
1 positive love 53
2 negative prison 41
3 negative dead 20
4 positive fine 20
5 positive lovely 18
6 positive pretty 16
7 negative death 15
8 negative terrible 15
9 positive nice 14
10 negative damn 13
# ℹ 11 more rows
We can also look at most common negative and positive words:
p <- tidy_text %>%
inner_join(get_sentiments("bing")) %>%
count(sentiment, word, sort = T) %>%
group_by(sentiment) %>%
top_n(10) %>%
ungroup() %>%
mutate(word = reorder(word, n)) %>%
ggplot(aes(x = n, y = word, fill = sentiment)) +
geom_col() +
labs(y = NULL) +
facet_wrap(~sentiment, scales = "free_y") +
my_theme()
p
par(mar = c(0, 0, 0, 0), mfrow = c(1,2))
tidy_text %>%
inner_join(get_sentiments("bing")) %>%
count(sentiment, word, sort = T) %>%
filter(sentiment == "negative") %>%
with(wordcloud(
word
, n
, max.words = 100
, colors = "grey40")
)
title("Negative", line = -2)
tidy_text %>%
inner_join(get_sentiments("bing")) %>%
count(sentiment, word, sort = T) %>%
filter(sentiment == "positive") %>%
with(wordcloud(
word
, n
, max.words = 100
, colors = "goldenrod")
)
title("Positive", line = -2)
ggplot2 hacks
ggplot2 doesn’t communicate with correlation matrices because they are in wide format
ggplot2 wants a data frame
Column names (colnames()) and row names (rownames())
p_value age gender SRhealth smokes
p_value 1.000000000 -0.005224085 0.053627861 0.15917525 -0.069013463
age -0.005224085 1.000000000 -0.057243245 -0.22438335 -0.078788619
gender 0.053627861 -0.057243245 1.000000000 -0.03182278 0.022275557
SRhealth 0.159175251 -0.224383351 -0.031822781 1.00000000 -0.129241536
smokes -0.069013463 -0.078788619 0.022275557 -0.12924154 1.000000000
exercise 0.048576025 -0.361768736 0.061659017 0.34546038 -0.155018841
BMI -0.019741798 0.036151816 0.012217132 -0.09340105 -0.037713371
education 0.001465775 -0.173399716 -0.001603648 0.11008540 -0.096936630
parEdu 0.019871078 -0.374733606 0.055468171 0.08273023 0.005215303
mortality -0.089637524 0.627069166 -0.092109448 -0.31142292 0.035759332
exercise BMI education parEdu mortality
p_value 0.04857602 -0.01974180 0.001465775 0.019871078 -0.08963752
age -0.36176874 0.03615182 -0.173399716 -0.374733606 0.62706917
gender 0.06165902 0.01221713 -0.001603648 0.055468171 -0.09210945
SRhealth 0.34546038 -0.09340105 0.110085399 0.082730234 -0.31142292
smokes -0.15501884 -0.03771337 -0.096936630 0.005215303 0.03575933
exercise 1.00000000 -0.06217297 0.210204022 0.176766791 -0.32138385
BMI -0.06217297 1.00000000 -0.048914825 -0.075000576 0.01643219
education 0.21020402 -0.04891483 1.000000000 0.232321970 -0.17215791
parEdu 0.17676679 -0.07500058 0.232321970 1.000000000 -0.18796244
mortality -0.32138385 0.01643219 -0.172157913 -0.187962436 1.00000000
ggplot2 default behavior, and one of those things is that it will treat columns of class() character as something that should be ordered alphabetically via scale_[map]_discrete()
factor with levels and/or labels we provide
cor() with the raw data.
You can see that order by looking at the row and column names:
p_value age gender SRhealth smokes
p_value 1.000000000 -0.005224085 0.053627861 0.15917525 -0.069013463
age -0.005224085 1.000000000 -0.057243245 -0.22438335 -0.078788619
gender 0.053627861 -0.057243245 1.000000000 -0.03182278 0.022275557
SRhealth 0.159175251 -0.224383351 -0.031822781 1.00000000 -0.129241536
smokes -0.069013463 -0.078788619 0.022275557 -0.12924154 1.000000000
exercise 0.048576025 -0.361768736 0.061659017 0.34546038 -0.155018841
BMI -0.019741798 0.036151816 0.012217132 -0.09340105 -0.037713371
education 0.001465775 -0.173399716 -0.001603648 0.11008540 -0.096936630
parEdu 0.019871078 -0.374733606 0.055468171 0.08273023 0.005215303
mortality -0.089637524 0.627069166 -0.092109448 -0.31142292 0.035759332
exercise BMI education parEdu mortality
p_value 0.04857602 -0.01974180 0.001465775 0.019871078 -0.08963752
age -0.36176874 0.03615182 -0.173399716 -0.374733606 0.62706917
gender 0.06165902 0.01221713 -0.001603648 0.055468171 -0.09210945
SRhealth 0.34546038 -0.09340105 0.110085399 0.082730234 -0.31142292
smokes -0.15501884 -0.03771337 -0.096936630 0.005215303 0.03575933
exercise 1.00000000 -0.06217297 0.210204022 0.176766791 -0.32138385
BMI -0.06217297 1.00000000 -0.048914825 -0.075000576 0.01643219
education 0.21020402 -0.04891483 1.000000000 0.232321970 -0.17215791
parEdu 0.17676679 -0.07500058 0.232321970 1.000000000 -0.18796244
mortality -0.32138385 0.01643219 -0.172157913 -0.187962436 1.00000000
p_value age gender SRhealth smokes exercise
p_value NA -0.005224085 0.05362786 0.15917525 -0.06901346 0.04857602
age NA NA -0.05724324 -0.22438335 -0.07878862 -0.36176874
gender NA NA NA -0.03182278 0.02227556 0.06165902
SRhealth NA NA NA NA -0.12924154 0.34546038
smokes NA NA NA NA NA -0.15501884
exercise NA NA NA NA NA NA
BMI NA NA NA NA NA NA
education NA NA NA NA NA NA
parEdu NA NA NA NA NA NA
mortality NA NA NA NA NA NA
BMI education parEdu mortality
p_value -0.01974180 0.001465775 0.019871078 -0.08963752
age 0.03615182 -0.173399716 -0.374733606 0.62706917
gender 0.01221713 -0.001603648 0.055468171 -0.09210945
SRhealth -0.09340105 0.110085399 0.082730234 -0.31142292
smokes -0.03771337 -0.096936630 0.005215303 0.03575933
exercise -0.06217297 0.210204022 0.176766791 -0.32138385
BMI NA -0.048914825 -0.075000576 0.01643219
education NA NA 0.232321970 -0.17215791
parEdu NA NA NA -0.18796244
mortality NA NA NA NA
r <- r_data$r[[1]]
coln <- colnames(r)
r[lower.tri(r, diag = T)] <- NA
r %>% data.frame() %>%
rownames_to_column("V1")
V1 p_value age gender SRhealth smokes
1 p_value NA -0.005224085 0.05362786 0.15917525 -0.06901346
2 age NA NA -0.05724324 -0.22438335 -0.07878862
3 gender NA NA NA -0.03182278 0.02227556
4 SRhealth NA NA NA NA -0.12924154
5 smokes NA NA NA NA NA
6 exercise NA NA NA NA NA
7 BMI NA NA NA NA NA
8 education NA NA NA NA NA
9 parEdu NA NA NA NA NA
10 mortality NA NA NA NA NA
exercise BMI education parEdu mortality
1 0.04857602 -0.01974180 0.001465775 0.019871078 -0.08963752
2 -0.36176874 0.03615182 -0.173399716 -0.374733606 0.62706917
3 0.06165902 0.01221713 -0.001603648 0.055468171 -0.09210945
4 0.34546038 -0.09340105 0.110085399 0.082730234 -0.31142292
5 -0.15501884 -0.03771337 -0.096936630 0.005215303 0.03575933
6 NA -0.06217297 0.210204022 0.176766791 -0.32138385
7 NA NA -0.048914825 -0.075000576 0.01643219
8 NA NA NA 0.232321970 -0.17215791
9 NA NA NA NA -0.18796244
10 NA NA NA NA NA
r <- r_data$r[[1]]
coln <- colnames(r)
r[lower.tri(r, diag = T)] <- NA
r %>% data.frame() %>%
rownames_to_column("V1") %>%
pivot_longer(
cols = -V1
, values_to = "r"
, names_to = "V2"
)
# A tibble: 100 × 3
V1 V2 r
<chr> <chr> <dbl>
1 p_value p_value NA
2 p_value age -0.00522
3 p_value gender 0.0536
4 p_value SRhealth 0.159
5 p_value smokes -0.0690
6 p_value exercise 0.0486
7 p_value BMI -0.0197
8 p_value education 0.00147
9 p_value parEdu 0.0199
10 p_value mortality -0.0896
# ℹ 90 more rows
r <- r_data$r[[1]]
coln <- colnames(r)
r[lower.tri(r, diag = T)] <- NA
r %>% data.frame() %>%
rownames_to_column("V1") %>%
pivot_longer(
cols = -V1
, values_to = "r"
, names_to = "V2"
) %>%
mutate(V1 = factor(V1, levels = rev(coln))
, V2 = factor(V2, levels = coln))
# A tibble: 100 × 3
V1 V2 r
<fct> <fct> <dbl>
1 p_value p_value NA
2 p_value age -0.00522
3 p_value gender 0.0536
4 p_value SRhealth 0.159
5 p_value smokes -0.0690
6 p_value exercise 0.0486
7 p_value BMI -0.0197
8 p_value education 0.00147
9 p_value parEdu 0.0199
10 p_value mortality -0.0896
# ℹ 90 more rows
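With the matrix in long format and both variables stored as factors, ggplot2 can draw it directly. The sketch below repeats the same steps on a stand-in correlation matrix from the built-in mtcars data, since r_data is not defined here; r_toy, r_long, and p_heat are illustrative names, and geom_tile() is just one possible geom for the result.

```r
library(dplyr)
library(tidyr)
library(tibble)
library(ggplot2)

# Stand-in correlation matrix from a built-in dataset
r_toy <- cor(mtcars[, c("mpg", "hp", "wt", "qsec")])
coln  <- colnames(r_toy)
r_toy[lower.tri(r_toy, diag = TRUE)] <- NA

r_long <- r_toy %>%
  data.frame() %>%
  rownames_to_column("V1") %>%
  pivot_longer(cols = -V1, names_to = "V2", values_to = "r") %>%
  mutate(
    V1 = factor(V1, levels = rev(coln)),  # reversed so the first variable sits at the top
    V2 = factor(V2, levels = coln)        # original column order, not alphabetical
  )

p_heat <- ggplot(r_long, aes(x = V2, y = V1, fill = r)) +
  geom_tile(color = "white") +
  scale_fill_gradient2(limits = c(-1, 1), na.value = "white") +
  labs(x = NULL, y = NULL, fill = "r")
p_heat
```

Because the factor levels come from coln, the axes follow the order of the original matrix rather than the alphabet.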
load(url("https://github.com/emoriebeck/psc290-data-viz-2022/raw/main/05-week5-time-series/01-data/ipcs_data.RData"))
ipcs_data
# A tibble: 4,222 × 70
SID Full_Date afraid angry attentive content excited goaldir guilty happy
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 02 2018-10-22… 1 2 4 4 2 5 2 3
2 02 2018-10-22… 1 1 4 3 2 5 1 3
3 02 2018-10-23… 2 1 2 3 1 2 2 3
4 02 2018-10-23… 2 2 4 3 2 4 1 3
5 02 2018-10-23… 2 1 4 4 3 4 1 3
6 02 2018-10-24… 2 1 4 4 2 4 1 3
7 02 2018-10-24… 2 1 4 3 2 4 1 3
8 02 2018-10-24… 2 1 4 4 4 4 1 4
9 02 2018-10-24… 2 2 3 3 3 3 2 2
10 02 2018-10-25… 2 1 4 4 3 3 2 4
# ℹ 4,212 more rows
# ℹ 60 more variables: proud <dbl>, purposeful <dbl>,
# agreeableness_Compassion <dbl>, agreeableness_Respectfulness <dbl>,
# agreeableness_Trust <dbl>, conscientiousness_Organization <dbl>,
# conscientiousness_Productiveness <dbl>,
# conscientiousness_Responsibility <dbl>, extraversion_Assertiveness <dbl>,
# extraversion_Energy.Level <dbl>, extraversion_Sociability <dbl>, …
ipcs_long <- ipcs_data %>%
filter(SID == "02") %>%
select(SID:purposeful) %>%
pivot_longer(
cols = c(-SID, -Full_Date)
, values_to = "value"
, names_to = "var"
, values_drop_na = T
) %>%
mutate(valence = ifelse(var %in% c("afraid", "angry", "guilty"), "Negative", "Positive"))
ipcs_long
# A tibble: 480 × 5
SID Full_Date var value valence
<chr> <chr> <chr> <dbl> <chr>
1 02 2018-10-22 13:23 afraid 1 Negative
2 02 2018-10-22 13:23 angry 2 Negative
3 02 2018-10-22 13:23 attentive 4 Positive
4 02 2018-10-22 13:23 content 4 Positive
5 02 2018-10-22 13:23 excited 2 Positive
6 02 2018-10-22 13:23 goaldir 5 Positive
7 02 2018-10-22 13:23 guilty 2 Negative
8 02 2018-10-22 13:23 happy 3 Positive
9 02 2018-10-22 13:23 proud 4 Positive
10 02 2018-10-22 13:23 purposeful 4 Positive
# ℹ 470 more rows
ipcs_long %>%
group_by(var, valence) %>%
summarize_at(vars(value), lst(mean, sd)) %>%
ungroup() %>%
ggplot(aes(x = var, y = mean, fill = valence)) +
geom_bar(
stat = "identity"
, position = "dodge"
) +
geom_errorbar(
aes(ymin = mean - sd, ymax = mean + sd)
, width = .1
) +
facet_grid(~valence, scales = "free_x", space = "free_x") +
my_theme()
ipcs_long %>%
group_by(var, valence) %>%
summarize_at(vars(value), lst(mean, sd)) %>%
ungroup() %>%
ggplot(aes(x = var, y = mean - 1, fill = valence)) +
geom_bar(
stat = "identity"
, position = "dodge"
) +
geom_errorbar(
aes(ymin = mean - 1 - sd, ymax = mean - 1 + sd)
, width = .1
) +
scale_y_continuous(limits = c(0,4), breaks = seq(0,4,1), labels = 1:5) +
facet_grid(~valence, scales = "free_x", space = "free_x") +
my_theme()
p <- ipcs_long %>%
group_by(var, valence) %>%
summarize_at(vars(value), lst(mean, sd)) %>%
ungroup() %>%
ggplot(aes(x = var, y = mean - 1, fill = valence)) +
geom_bar(
stat = "identity"
, position = "dodge"
) +
geom_jitter(
data = ipcs_long
, aes(y = value - 1, fill = valence)
, color = "black"
, shape = 21
, alpha = .5
, width = .2
, height = .1
) +
geom_errorbar(
aes(ymin = mean - 1 - sd, ymax = mean - 1 + sd)
, width = .1
) +
scale_y_continuous(limits = c(-.1,4), breaks = seq(0,4,1), labels = 1:5) +
facet_grid(~valence, scales = "free_x", space = "free_x") +
my_theme()
p
Here’s the data:
Let’s add the core ggplot code:
And our geoms, labs, and theme:
tibble(
p = as.character(rep(1, 4))
, x = paste0("S", c(1,2,3,"p"))
, y = c(1, 2, 4, 3)
) %>%
ggplot(aes(x = x, y = y, group = p)) +
geom_line(size = 1, color = "#8cdbbe") +
geom_point(
size = 2.5
, color = "black"
, shape = "square"
) +
labs(
x = "Situation"
, y = "Mean Response"
, title = "Intraindividual Variability"
, subtitle = "Person 1"
) +
my_theme()
Let’s switch to a continuous scale, then we can use labels to add it!
tibble(
p = as.character(rep(1, 4))
, x = paste0("S", c(1,2,3,"p"))
, x2 = 1:4
, y = c(1, 2, 4, 3)
) %>%
ggplot(aes(x = x2, y = y, group = p)) +
geom_line(size = 1, color = "#8cdbbe") +
geom_point(
size = 2.5
, color = "black"
, shape = "square"
) +
labs(
x = "Situation"
, y = "Mean Response"
, title = "Intraindividual Variability"
, subtitle = "Person 1"
) +
my_theme()
Let’s switch to a continuous scale, then we can use labels to add it!
tibble(
p = as.character(rep(1, 4))
, x = paste0("S", c(1,2,3,"p"))
, x2 = 1:4
, y = c(1, 2, 4, 3)
) %>%
ggplot(aes(x = x2, y = y, group = p)) +
geom_line(size = 1, color = "#8cdbbe") +
geom_point(
size = 2.5
, color = "black"
, shape = "square"
) +
scale_x_continuous(
limits = c(.9, 4.1)
, breaks = c(1,2,3,3.5,4)
, labels = c("S1", "S2", "S3", "...", "S4")
) +
labs(
x = "Situation"
, y = "Mean Response"
, title = "Intraindividual Variability"
, subtitle = "Person 1"
) +
my_theme()
We can actually supply a vector with one value per break to axis.ticks.x specifying the size of the ticks!
tibble(
p = as.character(rep(1, 4))
, x = paste0("S", c(1,2,3,"p"))
, x2 = 1:4
, y = c(1, 2, 4, 3)
) %>%
ggplot(aes(x = x2, y = y, group = p)) +
geom_line(size = 1, color = "#8cdbbe") +
geom_point(
size = 2.5
, color = "black"
, shape = "square"
) +
scale_x_continuous(
limits = c(.9, 4.1)
, breaks = c(1,2,3,3.5,4)
, labels = c("S1", "S2", "S3", "...", "Sn")
) +
labs(
x = "Situation"
, y = "Mean Response"
, title = "Intraindividual Variability"
, subtitle = "Person 1"
) +
my_theme() +
theme(axis.ticks.x = element_line(size = c(rep(.5, 3), 0, .5)))
coord_cartesian(): the default and what you’ll use most of the time
coord_polar(): remember Trig and Calculus?
coord_quickmap(): sets you up to plot maps
coord_trans(): apply transformations to coordinate plane
coord_flip(): flip x and y
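As a small illustration of the coordinate systems listed above, the sketch below builds one bar chart from made-up data and swaps only the coord. The names toy, p_base, p_flip, and p_polar are illustrative.

```r
library(ggplot2)

# Made-up data: three categories with one value each
toy <- data.frame(var = c("a", "b", "c"), m = c(1, 2, 3))

p_base <- ggplot(toy, aes(x = var, y = m, fill = var)) +
  geom_col()

p_flip  <- p_base + coord_flip()   # swap x and y: horizontal bars
p_polar <- p_base + coord_polar()  # map x to angle and y to radius

p_polar
```

The data and geoms are untouched in both versions; only the mapping from positions to the page changes.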
coord_polar()
ipcs_m <- ipcs_data %>%
filter(SID %in% c(216, 211, 174)) %>%
select(SID, Full_Date, afraid:purposeful, Adversity:Sociability)
ipcs_m
# A tibble: 310 × 20
SID Full_Date afraid angry attentive content excited goaldir guilty happy
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 174 2019-10-23… 1 1 3 3 3 3 1 4
2 174 2019-10-23… 1 1 4 3 3 3 1 3
3 174 2019-10-23… 1 1 4 4 3 3 1 4
4 174 2019-10-23… 1 1 3 3 3 3 1 3
5 174 2019-10-24… 1 1 3 3 3 3 1 3
6 174 2019-10-24… 1 1 3 3 3 3 1 3
7 174 2019-10-24… 1 1 3 3 3 3 1 3
8 174 2019-10-24… 1 1 3 3 3 3 1 3
9 174 2019-10-25… 1 1 3 3 3 3 1 3
10 174 2019-10-25… 1 1 3 3 3 3 1 3
# ℹ 300 more rows
# ℹ 10 more variables: proud <dbl>, purposeful <dbl>, Adversity <dbl>,
# Deception <dbl>, Duty <dbl>, Intellect <dbl>, Mating <dbl>,
# Negativity <dbl>, pOsitivity <dbl>, Sociability <dbl>
coord_polar()
# A tibble: 54 × 4
SID var m sd
<chr> <chr> <dbl> <dbl>
1 174 Adversity 1 0
2 174 Deception 1 0
3 174 Duty 1.93 0.482
4 174 Intellect 1.76 0.432
5 174 Mating 2.03 0.739
6 174 Negativity 1.03 0.173
7 174 Sociability 1.96 0.591
8 174 afraid 1.01 0.101
9 174 angry 1.05 0.362
10 174 attentive 3.07 0.296
# ℹ 44 more rows
coord_polar()
vars <- tibble(
var = vars
, cat = c(rep("Emotion", 10), rep("Situation", 8))
, num = 1:length(vars)
)
ipcs_m <- ipcs_m %>%
left_join(vars %>% rename(var2 = num))
p <- ipcs_m %>%
ggplot(aes(x = var2, y = m, fill = cat)) +
geom_bar(
stat = "identity"
, position = "dodge"
) +
my_theme() +
facet_wrap(~SID)
p
coord_polar()
p <- p +
labs(
fill = "Feature Category"
, title = "Relative Differences in Intraindividual Means"
, subtitle = "Across Emotions and Situation Perceptions"
) +
theme(
axis.line = element_blank()
, axis.text = element_blank()
, axis.ticks = element_blank()
, axis.title = element_blank()
, panel.background = element_rect(color = "black", fill = NA, size = 1)
)
p
coord_polar()
You can make points any text character
annotate("text", ...)
annotate("text", label = "mu", parse = T) or annotate("text", label = expression(mu[i]), parse = T) to produce math text in our geoms
Here’s another figure from a grant I’m working on that uses several of the features we’ve been discussing:
set.seed(11)
dist_df = tibble(
dist = dist_normal(3,0.75),
dist_name = format(dist)
)
dist_df %>%
ggplot(aes(y = 1, xdist = dist)) +
stat_slab(fill = "#8cdbbe") +
annotate("point", x = 3, y = 1, size = 3) +
annotate("text", label = "mu", x = 3, y = .92, parse = T, size = 8) +
annotate("text", label = "people", x = 2, y = .95) +
annotate("segment", size = 1, x = 2.8, xend = 1.2, y = .98, yend = .98
, arrow = arrow(type = "closed", length=unit(2, "mm"))) +
annotate("text", label = "people", x = 4, y = .95) +
annotate("segment", size = 1, x = 3.2, xend = 4.8, y = .98, yend = .98
, arrow = arrow(type = "closed", length=unit(2, "mm"))) +
labs(title = "Between-Person Differences") +
theme_void() +
theme(
plot.title = element_text(face = "bold", size = rel(1.2), hjust = .5)
)
theme(legend.position = [arg]) to change its position
labs([mappings] = "[titles]") to control legend titles
guides() to do about everything else
theme()
labs
color and fill equal to variable V1
labs(fill = "My Title", color = "My Title")
guides()
theme() lets you control the position of the legend and how it appears
labs() lets you control its titles
scale_[map]_[type] lets you control limits, breaks, and labels
guides() lets you control individual legend components
guides()
Remember correlograms? Do we need the size legend?
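The three legend tools can be combined in one plot. The sketch below is only an illustration on the built-in mtcars data (not the correlogram from earlier): labs() sets the color legend title, guides(size = "none") drops just the size legend, and theme() moves what is left.

```r
library(ggplot2)

p_leg <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl), size = hp)) +
  geom_point() +
  labs(color = "My Title") +          # legend title for the color mapping
  guides(size = "none") +             # remove only the size legend
  theme(legend.position = "bottom")   # position of the remaining legend
p_leg
```

Dropping a single guide this way keeps the size mapping on the points while removing a legend that readers may not need.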
PSC 290 - Data Visualization