REDCapLintR #511

Open
13 tasks
wibeasley opened this issue Sep 15, 2023 · 0 comments

@wibeasley

@thomasnwilson suggested that we develop a function/package that can lint REDCap dictionaries and return a markdown/html report. These are the initial rules we thought of while waiting in an airport.


REDCapLintR: Tool for REDCap Dictionary Good Practices

Working title: "REDCrapR" or "CrapR" or "Moving from REDCrap to REDCap"

  • Rule: no variable should end in "_v\d" (e.g., _v2 or _v3)
    Opinion: each variable in a sequence should have a meaningful name that clearly communicates its position in the sequence.
    Examples of bad behavior: age, age_v2, and age_v2_v2
    Suggested fix: Rename variables to age_baseline, age_discharge, age_followup

  • Smell: at least 10% of text variables should have validation

  • Smell: at least 20% of variables should be non-text, like dropdowns or sliders

  • All piped values should originate from variables, events, or smart variables that are currently in the dictionary.
    Check that all these still exist among a combined list of variables, events, & smart variables.
    regex: \[[a-z][a-z0-9_-]*\]

  • All embedded variables should originate from variables, events, or smart variables that are currently in the dictionary.
    Check that all these still exist among a combined list of variables, events, & smart variables.
    regex: \{[a-z][a-z0-9_-]*\}

  • Rule: all date variables should have the same format within the project. Don't mix & match dmy and mdy.

  • All forms/instruments should be mapped to at least one event

  • Rule: any variable with something like "phone" in the variable name, field label or field note should have a phone validation. Tokens include

    • phone
    • mobile
    • cell
    • contact number
  • Rule: any variable with something like "number" in the variable name, field label or field note should have an integer or numeric validation. Tokens include

    • number
    • age
    • count
  • Rule: any variable with something like "zip code" in the variable name, field label or field note should have a zip code validation. Tokens include

    • zip
    • zip_code
    • zipcode
  • Rule: any variable with something like T/F or Y/N in the variable name, field label or field note should be coded "1" for true/yes/on and "0" for false/no/off. Tokens include (case insensitive):

    • t/f
    • true/false
    • y/n
    • yes/no
    • on/off
  • Rule: male & female should be consistently coded as 1/0, 1/2, or 8507/8532 (for OMOP). Tokens include

    • m/f
    • male/female
  • ?? can we expand this to tri-state variables like yes/no/maybe or yes/no/null ??

  • Rule: multiple-choice response options are coded as integers (instead of letters)
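
The "_v\d" naming rule and the piped/embedded reference checks above are concrete enough to sketch. This is only a rough illustration in Python (the package itself would presumably be written in R). It assumes the dictionary has already been read into a list of dicts with the keys `field_name` and `field_label`. The function names, the sample fields, and the one-item smart-variable list are all made up for the example. It scans only field labels, whereas a real check would probably also need to look at notes, choices, and branching logic.

```python
import re

# Rule: no variable name should end in "_v<digits>", e.g. age_v2.
# The "+" also catches multi-digit suffixes such as _v10 (an assumption beyond "_v\d").
VERSION_SUFFIX = re.compile(r"_v\d+$")

# Patterns copied from the rules above: [name] for piping, {name} for embedding.
PIPED = re.compile(r"\[([a-z][a-z0-9_-]*)\]")
EMBEDDED = re.compile(r"\{([a-z][a-z0-9_-]*)\}")


def find_versioned_names(field_names):
    """Return the names that end in a version suffix such as _v2."""
    return [name for name in field_names if VERSION_SUFFIX.search(name)]


def find_dangling_references(fields, known_names):
    """Return (field_name, reference) pairs whose reference is not in known_names."""
    dangling = []
    for field in fields:
        text = field.get("field_label", "")
        for ref in PIPED.findall(text) + EMBEDDED.findall(text):
            if ref not in known_names:
                dangling.append((field["field_name"], ref))
    return dangling


# Hypothetical dictionary rows, for illustration only.
fields = [
    {"field_name": "age", "field_label": "Age at baseline"},
    {"field_name": "age_v2", "field_label": "Age at discharge"},
    {"field_name": "greeting", "field_label": "Hello [first_name], record [record-name]"},
    {"field_name": "layout", "field_label": "Height: {height_cm}"},
]
names = [f["field_name"] for f in fields]
# A real check would combine variables, events, & smart variables here.
known = set(names) | {"record-name"}

print(find_versioned_names(names))
print(find_dangling_references(fields, known))
```

With these sample rows, `age_v2` is flagged by the naming rule, and `first_name` and `height_cm` are reported as references that no longer exist in the dictionary.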

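The token-based validation rules and the two percentage smells in the list could be prototyped along these lines. Again, this is a hedged Python sketch rather than a proposed implementation. The `validation` key is a hypothetical shorthand for whatever column holds the text validation type, and `PHONE_VALIDATIONS` is a placeholder set. Limiting the check to text fields is an assumption, as is whole-word matching, which is meant to keep a token like "age" from firing on "page" or "message".

```python
import re

# Token list copied from the phone rule above; matching is case-insensitive.
PHONE_TOKENS = ("phone", "mobile", "cell", "contact number")
# Hypothetical validation codes; check them against the actual dictionary values.
PHONE_VALIDATIONS = {"phone"}


def mentions_token(field, tokens):
    """True if any token appears as a whole word in the name, label, or note."""
    text = " ".join(
        field.get(key, "") for key in ("field_name", "field_label", "field_note")
    ).lower()
    # Treat underscores as spaces so "home_phone" matches the token "phone".
    text = text.replace("_", " ")
    return any(re.search(rf"\b{re.escape(tok)}\b", text) for tok in tokens)


def missing_validation(fields, tokens, accepted):
    """Names of text fields that mention a token but lack an accepted validation."""
    return [
        f["field_name"]
        for f in fields
        if f.get("field_type") == "text"
        and mentions_token(f, tokens)
        and f.get("validation", "") not in accepted
    ]


def share(fields, predicate):
    """Fraction of fields that satisfy predicate (0.0 for an empty list)."""
    if not fields:
        return 0.0
    return sum(1 for f in fields if predicate(f)) / len(fields)


def smells(fields):
    """Return warnings for the 10% validation smell and the 20% non-text smell."""
    warnings = []
    text_fields = [f for f in fields if f.get("field_type") == "text"]
    if text_fields and share(text_fields, lambda f: bool(f.get("validation"))) < 0.10:
        warnings.append("fewer than 10% of text variables have validation")
    if fields and share(fields, lambda f: f.get("field_type") != "text") < 0.20:
        warnings.append("fewer than 20% of variables are non-text")
    return warnings


# Hypothetical dictionary rows, for illustration only.
fields = [
    {"field_name": "home_phone", "field_type": "text", "field_label": "Home phone", "validation": ""},
    {"field_name": "mobile", "field_type": "text", "field_label": "Mobile", "validation": "phone"},
    {"field_name": "notes", "field_type": "text", "field_label": "Notes", "validation": ""},
    {"field_name": "sex", "field_type": "dropdown", "field_label": "Sex", "validation": ""},
]

print(missing_validation(fields, PHONE_TOKENS, PHONE_VALIDATIONS))
print(smells(fields))
```

The same `missing_validation` helper could be reused for the number and zip code rules by swapping in their token lists. Tokens such as "cell" will still produce false positives (e.g., a cell count), so results are probably best reported as warnings rather than errors.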
@wibeasley wibeasley self-assigned this Sep 15, 2023