-
Notifications
You must be signed in to change notification settings - Fork 3
/
codebook.Rmd
162 lines (118 loc) · 7.69 KB
/
codebook.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
---
title: "README"
author: "Melinda K. Higgins, PhD."
date: "September 27, 2017"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# N736Fall2017_HELPdataset
N736 Fall 2017 - Details and Supporting Files and Code for HELP dataset
**Data - Basic Data - No Formatting - No Codebook**
* `help.csv` is the RAW basic file with all of the data. This file does NOT have any labels or "codes" defined; i.e. not "formatting" applied [https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/help.csv](https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/help.csv)
* `help.RData` is the basic data file in a binary format readable by `R`; you can load this into `R` by simply running `load("help.RData")`. This "loads" the data.frame `helpdata` into your Global Environment in `R`. [https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/help.Rdata](https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/help.Rdata)
* `help.sav` is the basic data file in SPSS format. However, this does NOT have any labelling - there is no codebook for this version of the data. [https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/help.sav](https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/help.sav)
**Data - With Formatting - With a Codebook**
* `helpmkh.sav` is the SPSS formatted datafile WITH the codebook applied - all of the items have labels and where applicable the responses have labels also (i.e. the "codes" and "labels"). [https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/helpmkh.sav](https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/helpmkh.sav)
* `helpmkh.RData` is the dataset WITH Labels applied in a binary format readable by `R`; you can load this into `R` by simply running `load("helpmkh.RData")`. This "loads" the data.frame `help.spss` into your Global Environment in `R`. The object `help.spss` indicates that this data was read in from the SPSS formatted file WITH the codebook applied - thus the `attributes()` for the variables in this dataset will have labels and "factors" with codes and labels applied. [https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/helpmkh.RData](https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/helpmkh.RData)
* `helpmkh.sas7bdat` along with the SAS program `helpmkh.sas` creates the `formats.sas7bcat` formats catalog. Together these help you "apply" the codebook formats to the dataset in `SAS`.
- [https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/helpmkh.sas7bdat](https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/helpmkh.sas7bdat)
- [https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/helpmkh.sas](https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/helpmkh.sas)
- [https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/formats.sas7bcat](https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/formats.sas7bcat)
**`R` and `SAS` CODE to get you started**
* `loadHELPdata.R` - use this R script to load the HELP dataset with the codebook applied - this has the variable labels and coded levels [https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/loadHELPdata.R](https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/loadHELPdata.R)
* `loadHELPdata.sas` - run this SAS program AFTER running `helpmkh.sas` to get the dataset into your WORK library. [https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/loadHELPdata.sas](https://github.com/melindahiggins2000/N736Fall2017_HELPdataset/blob/master/loadHELPdata.sas)
## Variables in HELP dataset
### Variable List and Item Labels
All variables are listed below except for `racegrp` and `substance` which are character type variables. These are listed in the next section below. For this table, all variable names, labels and minimum and maximum values are listed (excluding missing data `NA`s).
```{r, echo=FALSE, message=FALSE, warning=FALSE}
load("helpmkh.RData")
helpdata <- help.spss
library(tidyverse)
getlist <- function(x){
data.frame(attributes(x)$label,
min(x, na.rm=TRUE),
max(x, na.rm=TRUE))
}
dat1 <- helpdata %>%
select(-racegrp, -substance)
namesdat1 <- names(dat1)
library(purrr)
ldf <- purrr::map_df(dat1, getlist)
row.names(ldf) <- namesdat1
knitr::kable(ldf,
digits=1,
col.names = c("Item Label","Minimum","Maximum"),
caption="HELP Variables without `racegrp` and `substance`")
```
```{r, echo=FALSE, message=FALSE, warning=FALSE}
#load("helpmkh.RData")
#helpdata <- help.spss
#library(tidyverse)
# create a function to get the label
# label output from the attributes() function
#getlabel <- function(x) attributes(x)$label
#a <- getlabel(helpdata$age)
#library(purrr)
#ldf <- purrr::map_df(helpdata, getlabel) # this is a 1x15 tibble data.frame
# t(ldf) # transpose for easier reading to a 15x1 single column list
# using knitr to get a table of these
# variable names for Rmarkdown
#library(knitr)
#knitr::kable(t(ldf),
# col.names = c("Variable Label"),
# caption="HELP Variables")
```
## Variables with Coded Levels and Their Labels
These next set of tables provide all the variables that have coded levels and labels (i.e. the "codebook" and values used for `PROC FORMAT` in `SAS` and treated as "factors" in `R`).
### Codes for `racegrp` and `substance`
```{r, echo=FALSE, message=FALSE, warning=FALSE}
knitr::kable(names(table(helpdata$racegrp)),
col.names = attributes(helpdata$racegrp)$label)
knitr::kable(names(table(helpdata$substance)),
col.names = attributes(helpdata$substance)$label)
```
### Codes for rest of the variables ...
```{r, echo=FALSE, message=FALSE, warning=FALSE}
codetable <- function(x,captitle=" "){
knitr::kable(attributes(x)$labels,
col.names = attributes(x)$label,
caption=captitle)
}
codetable(helpdata$treat,"HELP Variable `treat`")
codedvars <- helpdata %>%
select(treat, female, homeless, g1b,
satreat, drinkstatus, anysubstatus,
linkstatus, f1a, f1b, f1c, f1d, f1e,
f1f, f1g, f1h, f1i, f1j, f1k, f1l,
f1m, f1n, f1o, f1p, f1q, f1r, f1s, f1t)
#sapply(codedvars, codetable)
codetable(helpdata$female,"HELP Variable `female`")
codetable(helpdata$homeless,"HELP Variable `homeless`")
codetable(helpdata$g1b,"HELP Variable `g1b`")
codetable(helpdata$satreat,"HELP Variable `satreat`")
codetable(helpdata$drinkstatus,"HELP Variable `drinkstatus`")
codetable(helpdata$anysubstatus,"HELP Variable `anysubstatus`")
codetable(helpdata$linkstatus,"HELP Variable `linkstatus`")
codetable(helpdata$f1a,"HELP Variable `f1a`")
codetable(helpdata$f1b,"HELP Variable `f1b`")
codetable(helpdata$f1c,"HELP Variable `f1c`")
codetable(helpdata$f1d,"HELP Variable `f1d`")
codetable(helpdata$f1e,"HELP Variable `f1e`")
codetable(helpdata$f1f,"HELP Variable `f1f`")
codetable(helpdata$f1g,"HELP Variable `f1g`")
codetable(helpdata$f1h,"HELP Variable `f1h`")
codetable(helpdata$f1i,"HELP Variable `f1i`")
codetable(helpdata$f1j,"HELP Variable `f1j`")
codetable(helpdata$f1k,"HELP Variable `f1k`")
codetable(helpdata$f1l,"HELP Variable `f1l`")
codetable(helpdata$f1m,"HELP Variable `f1m`")
codetable(helpdata$f1n,"HELP Variable `f1n`")
codetable(helpdata$f1o,"HELP Variable `f1o`")
codetable(helpdata$f1p,"HELP Variable `f1p`")
codetable(helpdata$f1q,"HELP Variable `f1q`")
codetable(helpdata$f1r,"HELP Variable `f1r`")
codetable(helpdata$f1s,"HELP Variable `f1s`")
codetable(helpdata$f1t,"HELP Variable `f1t`")
```