purrr
purrr
function
or subroutine is a sequence of program instructions that performs a specific task, packaged as a unit.bfi %>%
select(matches("1$")) %>%
mutate(
E1 = (E1 - min(E1, na.rm = T))/(max(E1, na.rm = T) - min(E1, na.rm = T))*100
, A1 = (A1 - min(A1, na.rm = T))/(max(A1, na.rm = T) - min(A1, na.rm = T))*100
, C1 = (C1 - min(C1, na.rm = T))/(max(C1, na.rm = T) - min(C1, na.rm = T))*100
, N1 = (N1 - min(N1, na.rm = T))/(max(N1, na.rm = T) - min(N1, na.rm = T))*100
, O1 = (O1 - min(O1, na.rm = T))/(max(O1, na.rm = T) - min(O1, na.rm = T))*100
) %>%
head(10)
A1 C1 E1 N1 O1
61617 20 20 40 40 40
61618 20 80 0 40 60
61620 80 60 20 60 60
61621 60 60 80 20 40
61622 20 60 20 20 40
61623 100 100 20 40 60
61624 20 80 60 0 80
61629 60 40 40 100 40
61630 60 100 80 80 100
61633 20 100 20 80 80
pomp <- function(x, na) (x - min(x, na.rm = na))/(max(x, na.rm = na) - min(x, na.rm = na))*100
bfi %>%
select(matches("1$")) %>%
mutate(
E1 = pomp(E1, T)
, A1 = pomp(A1, T)
, C1 = pomp(C1, T)
, N1 = pomp(N1, T)
, O1 = pomp(O1, T)
) %>%
head(10)
A1 C1 E1 N1 O1
61617 20 20 40 40 40
61618 20 80 0 40 60
61620 80 60 20 60 60
61621 60 60 80 20 40
61622 20 60 20 20 40
61623 100 100 20 40 60
61624 20 80 60 0 80
61629 60 40 40 100 40
61630 60 100 80 80 100
61633 20 100 20 80 80
mutate_at()
!mutate_at()
saves you time in the short and long run.1
pomp <- function(x, na) (x - min(x, na.rm = na))/(max(x, na.rm = na) - min(x, na.rm = na))*100
bfi %>%
select(matches("1$")) %>%
mutate_at(vars(matches("1$")), pomp, na = T) %>%
head(10)
A1 C1 E1 N1 O1
61617 20 20 40 40 40
61618 20 80 0 40 60
61620 80 60 20 60 60
61621 60 60 80 20 40
61622 20 60 20 20 40
61623 100 100 20 40 60
61624 20 80 60 0 80
61629 60 40 40 100 40
61630 60 100 80 80 100
61633 20 100 20 80 80
E1 = (E1 - min(E1, na.rm = T))/(max(E1, na.rm = T) - min(E1, na.rm = T))*100
A1 = (A1 - min(A1, na.rm = T))/(max(A1, na.rm = T) - min(A1, na.rm = T))*100
C1 = (C1 - min(C1, na.rm = T))/(max(C1, na.rm = T) - min(C1, na.rm = T))*100
N1 = (N1 - min(N1, na.rm = T))/(max(N1, na.rm = T) - min(N1, na.rm = T))*100
O1 = (O1 - min(O1, na.rm = T))/(max(O1, na.rm = T) - min(O1, na.rm = T))*100
We need three things to turn this into a function:
min()
twice and min()
oncerange()
instead, which reduces that computation time1
within mutate, it will just repeat that value for every rowTry turning the following into functions:
mean(is.na(x))
mean(is.na(y))
mean(is.na(z))
x / sum(x, na.rm = TRUE)
y / sum(y, na.rm = TRUE)
z / sum(z, na.rm = TRUE)
round(x / sum(x, na.rm = TRUE) * 100, 1)
round(y / sum(y, na.rm = TRUE) * 100, 1)
round(z / sum(z, na.rm = TRUE) * 100, 1)
(x - mean(x, na.rm = T))/sd(x, na.rm = T)
(y - mean(y, na.rm = T))/sd(y, na.rm = T)
(z - mean(z, na.rm = T))/sd(z, na.rm = T)
mutate()
calls)bfi_long <- bfi %>%
rownames_to_column("SID") %>%
pivot_longer(
cols = c(-SID, -education, -gender, -age)
, names_to = c("trait", "item")
, names_sep = -1
, values_to = "value"
, values_drop_na = T
)
head(bfi_long, 8)
# A tibble: 8 × 7
SID gender education age trait item value
<chr> <int> <int> <int> <chr> <chr> <int>
1 61617 1 NA 16 A 1 2
2 61617 1 NA 16 A 2 4
3 61617 1 NA 16 A 3 3
4 61617 1 NA 16 A 4 4
5 61617 1 NA 16 A 5 4
6 61617 1 NA 16 C 1 2
7 61617 1 NA 16 C 2 3
8 61617 1 NA 16 C 3 3
tidy_describe <- function(df, var) {
df %>%
summarize(
mean = mean({{ var }}, na.rm = TRUE),
sd = sd({{ var }}, na.rm = TRUE),
median = median({{ var }}, na.rm = TRUE),
min = min({{ var }}, na.rm = TRUE),
max = max({{ var }}, na.rm = TRUE),
n = n(),
n_miss = sum(is.na({{ var }})),
.groups = "drop"
)
}
bfi_long %>%
group_by(trait) %>%
tidy_describe(value)
# A tibble: 5 × 8
trait mean sd median min max n n_miss
<chr> <dbl> <dbl> <dbl> <int> <int> <int> <int>
1 A 4.22 1.61 5 1 6 13896 0
2 C 3.81 1.57 4 1 6 13893 0
3 E 3.79 1.61 4 1 6 13906 0
4 N 3.16 1.59 3 1 6 13881 0
5 O 3.87 1.67 4 1 6 13916 0
purrr
Iteration is everywhere. It underpins much of mathematics and statistics. If you’ve ever seen the \(\Sigma\) symbol, then you’ve seen (and probably used) iteration.
Reasons for iteration:
- reading in multiple files from a directory
- running the same operation multiple times
- running different combinations of the same model
- creating similar figures / tables / outputs
for
loopsEnter for
loops. for
loops are the “OG” form of iteration in computer science. The basic syntax is below. Basically, we can use a for loop to loop through and print a series of things.
_apply()
familyfor
loops comes from the _apply()
family of functions, including:
[1] "a"
[1] "b"
[1] "c"
[1] "d"
[1] "e"
[[1]]
[1] "a"
[[2]]
[1] "b"
[[3]]
[1] "c"
[[4]]
[1] "d"
[[5]]
[1] "e"
[1] "a"
[1] "b"
[1] "c"
[1] "d"
[1] "e"
a b c d e
"a" "b" "c" "d" "e"
[1] "a"
[1] "b"
[1] "c"
[1] "d"
[1] "e"
a b c d e
"a" "b" "c" "d" "e"
purrr
and _map_()
functionsmap()
family of functions, which is the functions through which purrr
iterates.for
loops, the _apply()
family, and _map_()
functions, see https://jennybc.github.io/purrr-tutorial/purrr
and _map_()
predicatesmap()
family of functions, which is the functions through which purrr
iterates.purrr
, we can change the kind of output of map()
by adding a predicate, like lgl
, dbl
, chr
, and df
.map_chr()
:purrr
and _map_()
antecedentsmap_()
map2_()
pmap_()
map2()
and pmap()
with the predicates from above.There are a number of different cases where purrr
and map()
maybe useful for reading in data including:
For this first example, I’ll show you how this would look with a for
loop before I show you how it looks with purrr
.
Assuming you have all the data in a single folder and the format is reasonably similar, you have the following basic syntax:
This works fine in this simple case, but where purrr
really shines in when you need to make modifications to your data before combining, whether this be recoding, removing missing cases, or renaming variables.
But first, the simple case of reading data. The code below will download a .zip file when you run it. Once, you do, navigate to your Desktop to unzip the folder. Open the R Project and you should be able to run the rest of the code
map()
function from purrr
, removes the “.csv” from the ID variable, then unnests the data, resulting in a data frame for each person.purrr
On the previous slide, we saw a data frame inside of a data frame. This is called a list column within a nested data frame.
In this case, we created a list column using map, but one of the best things about purrr
is how it combines with the nest()
and unnest()
functions from the tidyr
package.
We’ll return to nest()
later to demonstrate how anything you would iterate across is also something we can nest()
by in long format data frames.
# A tibble: 11 × 7
old_name name short_name item_name description scale reverse_code
<chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 O_AesSens Openness O O_1 Value arti… "lik… no
2 E_Assert Extraversion E E_1 Assertive "lik… no
3 N_Depr Neuroticism N N_1 Depressed "lik… no
4 N_EmoVol Neuroticism N N_2 Feel emoti… "lik… no
5 O_IntCur Openness O O_2 Curious, o… "lik… no
6 C_Org Conscientio… C C_1 Organized "lik… no
7 A_Rspct Agreeablene… A A_1 Resepct ot… "lik… no
8 C_Rspnbl Conscientio… C C_2 Responsible "lik… no
9 A_Cmpn Agreeablene… A A_2 Compassion… "lik… no
10 E_EnerLev Extraversion E E_2 Energetic "lik… no
11 satisfaction Life Satisf… SWL satisfac… How satisf… "lik… no
df1 <- tibble(
ID = list.files("data/example_1")
) %>%
mutate(data = map(ID, ~read_csv(sprintf("data/example_1/%s", .)))) %>%
unnest(data)
head(df1, 6)
# A tibble: 6 × 18
ID O_AesSens E_Assert N_Depr N_EmoVol O_IntCur A_Rspct A_Cmpn O_CrtvImag
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 031501.… 3.98 3.82 2 3.89 5 3 2.63 2
2 031501.… 4.00 2 1 3 2.90 2.16 3.54 1.22
3 031501.… 4 3 3.5 4 4 4.69 4 2.87
4 031501.… 3 3 3 4 3.06 4 3.27 4
5 031501.… 3.35 3.50 2 2 2.01 4 4 3.5
6 031501.… 3.89 3 2.33 2.38 3.5 4 4.25 2.09
# ℹ 9 more variables: E_EnerLev <dbl>, A_Trust <dbl>, N_Anxty <dbl>,
# E_Scblty <dbl>, count <dbl>, satisfaction <dbl>, C_Org <dbl>,
# C_Rspnbl <dbl>, C_Prdctv <dbl>
old.names <- codebook$old_name # pull old names in raw data from codebook
new.names <- codebook$item_name # pull new names from codebook
df1 <- tibble(
ID = list.files("data/example_1")
) %>%
mutate(data = map(ID, ~read_csv(sprintf("data/example_1/%s", .)))
, ID = str_remove_all(ID, ".csv")) %>%
unnest(data) %>%
select(ID, count, all_of(old.names)) %>% # select columns from codebook in loaded data
setNames(c("ID", "count", new.names)) # rename columns with new names
head(df1, 6)
# A tibble: 6 × 13
ID count O_1 E_1 N_1 N_2 O_2 C_1 A_1 C_2 A_2 E_2
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 031501 1 3.98 3.82 2 3.89 5 NA 3 NA 2.63 4
2 031501 2 4.00 2 1 3 2.90 NA 2.16 NA 3.54 3
3 031501 3 4 3 3.5 4 4 NA 4.69 NA 4 3
4 031501 4 3 3 3 4 3.06 NA 4 NA 3.27 3.46
5 031501 5 3.35 3.50 2 2 2.01 NA 4 NA 4 4
6 031501 6 3.89 3 2.33 2.38 3.5 NA 4 NA 4.25 4
# ℹ 1 more variable: satisfaction <dbl>
With these kinds of data, the first thing, we may want to do is look at within-person correlations, which we can do with purrr
.
cor_fun <- function(x) cor(x %>% select(-count), use = "pairwise")
nested_r <- df1 %>%
group_by(ID) %>%
nest() %>%
ungroup() %>%
mutate(r = map(data, cor_fun))
head(nested_r, 6)
# A tibble: 6 × 3
ID data r
<chr> <list> <list>
1 031501 <tibble [64 × 12]> <dbl [11 × 11]>
2 031502 <tibble [39 × 12]> <dbl [11 × 11]>
3 032201 <tibble [68 × 12]> <dbl [11 × 11]>
4 032202 <tibble [65 × 12]> <dbl [11 × 11]>
5 032203 <tibble [61 × 12]> <dbl [11 × 11]>
6 032901 <tibble [79 × 12]> <dbl [11 × 11]>
We can access it like a list:
# A tibble: 64 × 12
count O_1 E_1 N_1 N_2 O_2 C_1 A_1 C_2 A_2 E_2
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 3.98 3.82 2 3.89 5 NA 3 NA 2.63 4
2 2 4.00 2 1 3 2.90 NA 2.16 NA 3.54 3
3 3 4 3 3.5 4 4 NA 4.69 NA 4 3
4 4 3 3 3 4 3.06 NA 4 NA 3.27 3.46
5 5 3.35 3.50 2 2 2.01 NA 4 NA 4 4
6 6 3.89 3 2.33 2.38 3.5 NA 4 NA 4.25 4
7 7 3.01 3 2.5 2 5.06 NA 4 NA 4 4.27
8 8 3.00 3 2.5 2 3.56 NA 4 NA 4 4
9 9 4.00 2.5 2 3 4 NA 4 NA 3.95 3.65
10 10 4.20 3.45 2 2 3 NA 3.87 NA 4 4
# ℹ 54 more rows
# ℹ 1 more variable: satisfaction <dbl>
unnest()
a matrix# A tibble: 649 × 2
ID r[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 031501 1 -0.0503 -0.133 0.0490 0.0849 NA 0.139 NA 0.197
2 031501 -0.0503 1 0.135 0.0467 -0.0439 NA 0.0169 NA -0.171
3 031501 -0.133 0.135 1 0.294 0.0122 NA 0.0835 NA -0.0669
4 031501 0.0490 0.0467 0.294 1 0.248 NA -0.0752 NA -0.114
5 031501 0.0849 -0.0439 0.0122 0.248 1 NA -0.0523 NA 0.0653
6 031501 NA NA NA NA NA NA NA NA NA
7 031501 0.139 0.0169 0.0835 -0.0752 -0.0523 NA 1 NA 0.303
8 031501 NA NA NA NA NA NA NA NA NA
9 031501 0.197 -0.171 -0.0669 -0.114 0.0653 NA 0.303 NA 1
10 031501 0.00547 -0.201 -0.150 -0.0782 0.229 NA -0.00452 NA 0.236
# ℹ 639 more rows
# ℹ 1 more variable: r[10:11] <dbl>
# A tibble: 10 × 3
ID data r
<chr> <list> <list>
1 031501 <tibble [64 × 12]> <tibble [11 × 12]>
2 031502 <tibble [39 × 12]> <tibble [11 × 12]>
3 032201 <tibble [68 × 12]> <tibble [11 × 12]>
4 032202 <tibble [65 × 12]> <tibble [11 × 12]>
5 032203 <tibble [61 × 12]> <tibble [11 × 12]>
6 032901 <tibble [79 × 12]> <tibble [11 × 12]>
7 040501 <tibble [48 × 12]> <tibble [11 × 12]>
8 040502 <tibble [41 × 12]> <tibble [11 × 12]>
9 040503 <tibble [53 × 12]> <tibble [11 × 12]>
10 040504 <tibble [41 × 12]> <tibble [11 × 12]>
# A tibble: 649 × 13
ID var O_1 E_1 N_1 N_2 O_2 C_1 A_1
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 031501 satisfact… 0.103 0.0994 -0.261 -0.00209 0.0765 NA 5.65e-2
2 031502 satisfact… 0.0929 0.126 NA NA -0.0482 0.0672 -3.81e-3
3 032201 satisfact… -0.00857 -0.0182 -0.190 0.00409 0.108 -0.157 NA
4 032202 satisfact… 0.186 0.0614 -0.348 -0.190 -0.00751 0.102 NA
5 032203 satisfact… -0.00329 0.122 -0.0633 0.113 -0.210 -0.139 NA
6 032901 satisfact… 0.137 NA -0.189 -0.148 -0.0821 -0.0344 3.96e-2
7 040501 satisfact… -0.0527 -0.137 NA NA -0.145 0.0694 6.01e-2
8 040502 satisfact… 0.273 NA 0.275 -0.0369 -0.00917 -0.0725 2.03e-1
9 040503 satisfact… 0.0595 NA 0.0272 0.0926 -0.0266 -0.108 -8.85e-4
10 040504 satisfact… 0.339 NA -0.305 0.0471 0.144 0.0378 3.53e-1
# ℹ 639 more rows
# ℹ 4 more variables: C_2 <dbl>, A_2 <dbl>, E_2 <dbl>, satisfaction <dbl>
tidy_describe <- function(df) {
df %>%
pivot_longer(
-count
, names_to = "item"
, values_to = "value"
, values_drop_na = T
) %>%
group_by(item) %>%
summarize(
mean = mean(value, na.rm = TRUE),
sd = sd(value, na.rm = TRUE),
median = median(value, na.rm = TRUE),
min = min(value, na.rm = TRUE),
max = max(value, na.rm = TRUE),
n = n(),
n_miss = sum(is.na(value)),
.groups = "drop"
)
}
# A tibble: 59 × 4
ID data r desc
<chr> <list> <list> <list>
1 031501 <tibble [64 × 12]> <tibble [11 × 12]> <tibble [9 × 8]>
2 031502 <tibble [39 × 12]> <tibble [11 × 12]> <tibble [9 × 8]>
3 032201 <tibble [68 × 12]> <tibble [11 × 12]> <tibble [9 × 8]>
4 032202 <tibble [65 × 12]> <tibble [11 × 12]> <tibble [9 × 8]>
5 032203 <tibble [61 × 12]> <tibble [11 × 12]> <tibble [9 × 8]>
6 032901 <tibble [79 × 12]> <tibble [11 × 12]> <tibble [9 × 8]>
7 040501 <tibble [48 × 12]> <tibble [11 × 12]> <tibble [9 × 8]>
8 040502 <tibble [41 × 12]> <tibble [11 × 12]> <tibble [9 × 8]>
9 040503 <tibble [53 × 12]> <tibble [11 × 12]> <tibble [9 × 8]>
10 040504 <tibble [41 × 12]> <tibble [11 × 12]> <tibble [9 × 8]>
# ℹ 49 more rows
# A tibble: 531 × 9
ID item mean sd median min max n n_miss
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int>
1 031501 A_1 3.87 0.616 4 2 5.24 64 0
2 031501 A_2 3.71 0.658 4 2 5.29 64 0
3 031501 E_1 2.89 0.798 3 0.346 4.62 64 0
4 031501 E_2 3.31 0.773 3.50 1.24 4.27 64 0
5 031501 N_1 2.27 0.503 2 1 4 64 0
6 031501 N_2 2.51 0.749 2.28 0.581 4 64 0
7 031501 O_1 3.72 0.749 4.00 1.15 6.12 64 0
8 031501 O_2 3.44 0.835 3.42 1.35 5.94 64 0
9 031501 satisfaction 3.16 1.10 3.09 1.05 4.97 64 0
10 031502 A_1 4.07 0.540 4 3 5 39 0
# ℹ 521 more rows
To create composites, we’ll:
1. separate traits from items 2. group_by()
trait, count, and ID 3. calculate the composites using summarize()
df_long <- df_long %>%
group_by(ID, count, trait) %>%
summarize(value = mean(value)) %>%
ungroup() %>%
left_join(df1 %>% select(ID, count, satisfaction))
df_long
# A tibble: 11,052 × 5
ID count trait value satisfaction
<chr> <dbl> <chr> <dbl> <dbl>
1 031501 1 A 2.81 4.96
2 031501 1 E 3.91 4.96
3 031501 1 N 2.94 4.96
4 031501 1 O 4.49 4.96
5 031501 2 A 2.85 4.79
6 031501 2 E 2.5 4.79
7 031501 2 N 2 4.79
8 031501 2 O 3.45 4.79
9 031501 3 A 4.35 1.06
10 031501 3 E 3 1.06
# ℹ 11,042 more rows
Then we’ll get within-person centered values using our own little function!
center <- function(x) x - mean(x, na.rm = T)
df_long <- df_long %>%
group_by(ID, trait) %>%
mutate(value_c = center(value)) %>%
ungroup()
df_long
# A tibble: 11,052 × 6
ID count trait value satisfaction value_c
<chr> <dbl> <chr> <dbl> <dbl> <dbl>
1 031501 1 A 2.81 4.96 -0.976
2 031501 1 E 3.91 4.96 0.807
3 031501 1 N 2.94 4.96 0.552
4 031501 1 O 4.49 4.96 0.907
5 031501 2 A 2.85 4.79 -0.939
6 031501 2 E 2.5 4.79 -0.604
7 031501 2 N 2 4.79 -0.393
8 031501 2 O 3.45 4.79 -0.128
9 031501 3 A 4.35 1.06 0.557
10 031501 3 E 3 1.06 -0.104
# ℹ 11,042 more rows
And grand-mean centered within-person averages
df_long <- df_long %>%
group_by(ID, trait) %>%
mutate(value_gmc = mean(value)) %>%
group_by(trait) %>%
mutate(value_gmc = center(value_gmc)) %>%
ungroup()
df_long
# A tibble: 11,052 × 7
ID count trait value satisfaction value_c value_gmc
<chr> <dbl> <chr> <dbl> <dbl> <dbl> <dbl>
1 031501 1 A 2.81 4.96 -0.976 -0.107
2 031501 1 E 3.91 4.96 0.807 -0.0644
3 031501 1 N 2.94 4.96 0.552 0.0443
4 031501 1 O 4.49 4.96 0.907 0.301
5 031501 2 A 2.85 4.79 -0.939 -0.107
6 031501 2 E 2.5 4.79 -0.604 -0.0644
7 031501 2 N 2 4.79 -0.393 0.0443
8 031501 2 O 3.45 4.79 -0.128 0.301
9 031501 3 A 4.35 1.06 0.557 -0.107
10 031501 3 E 3 1.06 -0.104 -0.0644
# ℹ 11,042 more rows
And now we are ready to run our models. But first, we’ll nest()
our data.
And now run the models.
run_model <- function(d) lmer(satisfaction ~ value_c * value_gmc + (1 | ID), data = d)
nested_mods <- df_long %>%
group_by(trait) %>%
nest() %>%
ungroup() %>%
mutate(model = map(data, run_model))
nested_mods
# A tibble: 5 × 3
trait data model
<chr> <list> <list>
1 A <tibble [2,278 × 6]> <lmerMod>
2 E <tibble [2,146 × 6]> <lmerMod>
3 N <tibble [2,184 × 6]> <lmerMod>
4 O <tibble [2,393 × 6]> <lmerMod>
5 C <tibble [2,051 × 6]> <lmerMod>
And get data frames of the results:
sprintfna <- function(x) ifelse(is.na(x), NA_character_, sprintf("%.2f", x))
tidy_tab <- function(m){
tidy(m, conf.int = T) %>%
mutate(pval = pnorm(abs(estimate/`std.error`), lower.tail = FALSE),
p = round(pval, digits = 3),
p = ifelse(pval < .001, "p < .001", paste0("p = ", p))) %>%
mutate_at(vars(estimate, conf.low, conf.high), sprintfna) %>%
mutate(CI = ifelse(is.na(conf.low), "", sprintf("[%s,%s]", conf.low, conf.high))) %>%
dplyr::select(term, estimate, CI, p)
}
nested_mods <- nested_mods %>%
mutate(tidy = map(model, tidy_tab))
nested_mods
# A tibble: 5 × 4
trait data model tidy
<chr> <list> <list> <list>
1 A <tibble [2,278 × 6]> <lmerMod> <tibble [6 × 4]>
2 E <tibble [2,146 × 6]> <lmerMod> <tibble [6 × 4]>
3 N <tibble [2,184 × 6]> <lmerMod> <tibble [6 × 4]>
4 O <tibble [2,393 × 6]> <lmerMod> <tibble [6 × 4]>
5 C <tibble [2,051 × 6]> <lmerMod> <tibble [6 × 4]>
Which we can print into pretty data frames
# A tibble: 30 × 5
trait term estimate CI p
<chr> <chr> <chr> <chr> <chr>
1 A (Intercept) 2.98 "[2.93,3.03]" p < .001
2 A value_c -0.02 "[-0.10,0.05]" p = 0.279
3 A value_gmc 0.13 "[-0.05,0.31]" p = 0.081
4 A value_c:value_gmc 0.16 "[-0.12,0.44]" p = 0.127
5 A sd__(Intercept) 0.04 "" <NA>
6 A sd__Observation 1.15 "" <NA>
7 E (Intercept) 2.98 "[2.93,3.03]" p < .001
8 E value_c -0.07 "[-0.14,0.00]" p = 0.029
9 E value_gmc 0.17 "[0.03,0.32]" p = 0.008
10 E value_c:value_gmc 0.02 "[-0.17,0.22]" p = 0.399
# ℹ 20 more rows
Which we can pretty easily turn into tables:
mod_tab <- nested_mods %>%
select(trait, tidy) %>%
unnest(tidy) %>%
pivot_wider(
names_from = "trait"
, names_glue = "{trait}_{.value}"
, values_from = c("estimate", "CI", "p")
) %>%
select(term, starts_with("E"), starts_with("A"), starts_with("C"), starts_with("N"), starts_with("O"))
mod_tab
# A tibble: 6 × 16
term E_estimate E_CI E_p A_estimate A_CI A_p C_estimate C_CI C_p
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 (Interce… 2.98 "[2.… p &l… 2.98 "[2.… p &l… 2.97 "[2.… p &l…
2 value_c -0.07 "[-0… p = … -0.02 "[-0… p = … 0.03 "[-0… p = …
3 value_gmc 0.17 "[0.… p = … 0.13 "[-0… p = … 0.19 "[0.… p = …
4 value_c:… 0.02 "[-0… p = … 0.16 "[-0… p = … 0.15 "[-0… p = …
5 sd__(Int… 0.00 "" <NA> 0.04 "" <NA> 0.00 "" <NA>
6 sd__Obse… 1.16 "" <NA> 1.15 "" <NA> 1.17 "" <NA>
# ℹ 6 more variables: N_estimate <chr>, N_CI <chr>, N_p <chr>,
# O_estimate <chr>, O_CI <chr>, O_p <chr>
hdr <- c(1, rep(3, 5))
names(hdr) <- c(" ", "Extraversion", "Agreeableness", "Conscientiousness", "Neuroticism", "Openness")
mod_tab <- mod_tab %>%
kable(.
, "html"
, escape = F
, col.names = c("Term", rep(c("<em>b</em>", "CI", "<em>p</em>"), times = 5))
, align = c("r", rep("c", 15))
, caption = "<strong>Table 1</strong><br><em>Multilevel Model Estimates of Between- and Within-Person Big Five-State Satisfaction Associations"
) %>%
kable_classic(full_width = F, html_font = "Times", font_size = 15) %>%
add_header_above(hdr)
Extraversion |
Agreeableness |
Conscientiousness |
Neuroticism |
Openness |
|||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Term | b | CI | p | b | CI | p | b | CI | p | b | CI | p | b | CI | p |
(Intercept) | 2.98 | [2.93,3.03] | p < .001 | 2.98 | [2.93,3.03] | p < .001 | 2.97 | [2.92,3.02] | p < .001 | 2.99 | [2.94,3.03] | p < .001 | 2.97 | [2.93,3.02] | p < .001 |
value_c | -0.07 | [-0.14,0.00] | p = 0.029 | -0.02 | [-0.10,0.05] | p = 0.279 | 0.03 | [-0.05,0.10] | p = 0.231 | -0.01 | [-0.08,0.06] | p = 0.378 | 0.02 | [-0.05,0.09] | p = 0.29 |
value_gmc | 0.17 | [0.03,0.32] | p = 0.008 | 0.13 | [-0.05,0.31] | p = 0.081 | 0.19 | [0.06,0.32] | p = 0.003 | -0.09 | [-0.19,0.01] | p = 0.032 | 0.14 | [-0.04,0.33] | p = 0.067 |
value_c:value_gmc | 0.02 | [-0.17,0.22] | p = 0.399 | 0.16 | [-0.12,0.44] | p = 0.127 | 0.15 | [-0.04,0.34] | p = 0.061 | -0.19 | [-0.34,-0.03] | p = 0.009 | 0.25 | [0.01,0.49] | p = 0.019 |
sd__(Intercept) | 0.00 | 0.04 | 0.00 | 0.00 | 0.03 | ||||||||||
sd__Observation | 1.16 | 1.15 | 1.17 | 1.15 | 1.16 |
PSC 290 - Data Management and Cleaning