A Mega-Analysis of Personality Predictions: Robustness and Boundary Conditions
Feinberg School of Medicine
Washington University in St. Louis
2020-08-26
Chapter 1 Workspace
1.1 Packages
1.2 Directory Path
1.3 Codebook
Each study has its own codebook that indexes the matching, covariate, personality, and outcome variables. Each codebook also records the original scale of every variable, any recoding applied to it (including binarizing outcomes, changing the scale, and removing missing data), reverse coding of scale variables, and the variable categories.
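As an illustration of how a codebook entry can drive recoding, the sketch below reverse-scores items flagged in a small stand-in codebook. The column names `reverse`, `mini`, and `maxi` and all values are hypothetical, chosen only for this example; the real codebooks may store this information differently.

```r
library(dplyr)

# Hypothetical codebook rows: `reverse`, `mini`, and `maxi` are assumed
# column names for illustration, not the columns of the real codebooks.
cb_toy <- tibble(
  name    = c("E", "N"),
  reverse = c("no", "yes"),
  mini    = c(1, 1),
  maxi    = c(5, 5)
)

# Made-up item responses in long format
dat_toy <- tibble(
  name  = c("E", "E", "N", "N"),
  value = c(1, 4, 2, 5)
)

# Reverse-score flagged items: new value = max + min - old value
recoded <- dat_toy %>%
  left_join(cb_toy, by = "name") %>%
  mutate(value = if_else(reverse == "yes", maxi + mini - value, value)) %>%
  select(name, value)
```

Keeping the scale bounds in the codebook rather than in the code means the same recoding step can be reused for items with different response ranges.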
# list of all codebook sheets
sheets <- sprintf("%s/codebooks/master_codebook_01.24.20.xlsx", res_path) %>% excel_sheets()
# function for reading in sheets
read_fun <- function(x){
  sprintf("%s/codebooks/master_codebook_01.24.20.xlsx", res_path) %>%
    read_xlsx(., sheet = x)
}
# read in sheets and index source
codebook <- tibble(
  study = sheets,
  codebook = map(study, read_fun)
)
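The result is a nested tibble with one row per sheet and the sheet contents stored in a list column. A minimal sketch of two ways to work with that structure follows, using a made-up stand-in (`codebook_toy`, with the hypothetical sheet names "key" and "hrs") in place of the real workbook.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Toy stand-in for the nested `codebook` tibble; sheet names and
# contents are invented for illustration.
codebook_toy <- tibble(
  study = c("key", "hrs"),
  codebook = list(
    tibble(name = c("E", "mortality"), category = c("pers", "out")),
    tibble(name = c("age", "E"), category = c("match", "pers"))
  )
)

# Pull a single sheet by its name rather than by its position
hrs_cb <- codebook_toy %>%
  filter(study == "hrs") %>%
  pull(codebook) %>%
  pluck(1)

# Or stack every sheet into one long tibble, keeping the source label
all_cb <- codebook_toy %>% unnest(codebook)
```

Selecting by name is less fragile than positional indexing if sheets are reordered. Stacking with `unnest()` assumes the sheets share compatible column types, which may not hold for the real codebooks.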
## short and long versions of names of all categories for later use
studies <- c("addhealth", "bhps", "gsoep", "hilda", "hrs", "liss", "midus", "nlsy", "shp", "wls")
studies_long <- c("Add Health", "BHPS", "GSOEP", "HILDA", "HRS", "LISS", "MIDUS", "NLSY", "SHP", "WLS")
traits <- codebook$codebook[[2]] %>% filter(category == "pers") %>%
  select(long_name = Construct, short_name = name)
outcomes <- codebook$codebook[[2]] %>% filter(category == "out") %>%
  select(long_name = Construct, short_name = name)
moderators <- codebook$codebook[[2]] %>% filter(category == "mod") %>%
  select(long_name = Construct, short_name = name, breaks, mod, mod_name)
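One plausible later use of these short/long name tables is relabelling results for tables and figures. The sketch below assumes a lookup shaped like `traits` and a made-up results tibble (`res_toy`); the trait names and estimates are illustrative only.

```r
library(dplyr)

# Toy lookup with the same two columns as `traits`
traits_toy <- tibble(
  long_name  = c("Extraversion", "Neuroticism"),
  short_name = c("E", "N")
)

# Invented results keyed by short names
res_toy <- tibble(
  Trait = c("N", "E", "N"),
  est   = c(0.10, -0.05, 0.20)
)

# Convert short names to an ordered factor with long labels
res_toy <- res_toy %>%
  mutate(Trait = factor(Trait,
                        levels = traits_toy$short_name,
                        labels = traits_toy$long_name))
```

Using a factor keeps the display order tied to the order of the lookup table rather than to alphabetical order.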
1.4 Other Supporting Documents
# used personality waves
p_waves <- sprintf("%s/codebooks/personality_waves.xlsx", res_path) %>% read_xlsx()
# used covariates for specifications
specifications <- sprintf("%s/codebooks/specifications.xlsx", res_path) %>% read_xlsx()
# coded specification curve results
spec_summ <- sprintf("%s/results/sca/SCA_summary.xlsx", res_path) %>% read_xlsx()
# occupation code converters
occ90to00 <- sprintf("%s/codebooks/occ_90-00.xls", res_path) %>% read_xls() %>%
  mutate_at(vars(OCC90, OCC00), as.numeric)
occ70to90 <- sprintf("%s/codebooks/occ1970_occ1990dd.dta", res_path) %>% read_dta() %>%
  setNames(c("OCC70", "OCC90"))
occ80to90 <- sprintf("%s/codebooks/occ1980_occ1990dd.dta", res_path) %>% read_dta() %>%
  setNames(c("OCC80", "OCC90"))
# npb codes for ses
npb <- sprintf("%s/codebooks/npb.xlsx", res_path) %>% read_xlsx() %>%
  setNames(c("OCC00", "NPB", "desc"))
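The shared column names (`OCC70`, `OCC90`, `OCC00`) suggest the converters are meant to be chained so that older occupation codes can be matched to the `NPB` table. A minimal sketch of that chaining with made-up codes and scores follows; the `_toy` objects and the `jobs` tibble are hypothetical, and the actual harmonization may differ.

```r
library(dplyr)

# Toy crosswalks with the same column names as the real converters;
# all codes and scores are invented.
occ70to90_toy <- tibble(OCC70 = c(1, 2, 3), OCC90 = c(10, 20, 30))
occ90to00_toy <- tibble(OCC90 = c(10, 20), OCC00 = c(100, 200))
npb_toy <- tibble(
  OCC00 = c(100, 200),
  NPB   = c(55.5, 42.0),
  desc  = c("Job A", "Job B")
)

# Hypothetical respondents with 1970-basis occupation codes
jobs <- tibble(id = 1:3, OCC70 = c(1, 2, 3))

# Chain the crosswalks: 1970 -> 1990 -> 2000 -> NPB
jobs_npb <- jobs %>%
  left_join(occ70to90_toy, by = "OCC70") %>%
  left_join(occ90to00_toy, by = "OCC90") %>%
  left_join(npb_toy, by = "OCC00")
```

Using `left_join()` keeps every respondent; codes with no match at any step end up with `NA` in `NPB`, which makes unmatched occupations easy to audit.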