Skip to content

VEuPathDB/study-wrangler

Repository files navigation

README

study.wranger R package for tidying data in a way compatible with VEuPathDB workflows for EDA loading.

Install in existing R environment or RStudio

# Install Bioconductor packages
install.packages("BiocManager") # if not already installed
BiocManager::install(c("SummarizedExperiment", "DESeq2"))

# Install the remotes package
install.packages("remotes") # if you don't already have 'remotes'

# Install (or upgrade) the study.wrangler package from GitHub
remotes::install_github("VEuPathDB/study-wrangler", build_vignettes = TRUE, upgrade = FALSE)
# And load all its functions into your namespace
library(study.wrangler)

# Browse the vignettes (tutorials) in your web browser
browseVignettes("study.wrangler")

# If your web browser doesn't open with the vignette index page, use one of these:
options(browser = "xdg-open")   # Linux
# or
options(browser = "open")       # macOS
# or
options(browser = "C:/Program Files/Google/Chrome/Application/chrome.exe")  # Windows
# and then try again with 
browseVignettes("study.wrangler")

Tip: GitHub rate limiting

If you see errors like HTTP error 403 or API rate limit exceeded when installing packages from GitHub, you may be hitting the unauthenticated GitHub API limit (60 requests/hour).

To avoid this, set a GitHub Personal Access Token (PAT):

  1. Generate a token via https://github.com/settings/personal-access-tokens Only the default read-only permissions to public repositories are needed.
  2. If you’re using R from the command line (not RStudio), set your token in your shell config:
    • zsh (default on most Macs): Add
      export GITHUB_PAT=github_pat_...
              

      to your ~/.zshrc

    • bash: Add it to ~/.bash_profile or ~/.bashrc
  3. For RStudio, we recommend setting it in your R environment file via:
    usethis::edit_r_environ()
        

    Add the line

    GITHUB_PAT=github_pat_...
        

    Save and restart your R-session

Tip: Edit and run the vignette as a notebook

To experiment with the vignette code interactively:

  1. Clone the source repo:
    git clone https://github.com/VEuPathDB/study-wrangler.git
        
  2. Open the .Rmd file in RStudio (File -> Open):
    study-wrangler/vignettes/cleaning-and-preparing-basics.Rmd
        

You can now run code chunks interactively and modify them/play around.

To build a docker image

docker build -t veupathdb/study-wrangler .

Build notes:

  • add `–progress=plain` if you want to see the errors more easily
  • each `remotes::install_github()` command in the Dockerfile eats into a rate limit for anonymous users at GitHub (60 per hour!) - so if you are playing around with things you almost certainly will hit the limit
  • setting up credentials for a higher limit seems not worth the cost

To run in docker:

docker run --rm -ti -e PASSWORD=password -p 8989:8787 veupathdb/study-wrangler
# Then in your web browser navigate to localhost:8989 and login with "rstudio" and "password"

To add R functions to the repo (using docker):

docker run --rm -ti --name study-wrangler-dev -v $PWD:/study.wrangler -e PASSWORD=password -p 8888:8787 veupathdb/study-wrangler
# Then in your web browser navigate to localhost:8888 and login with "rstudio" and "password"

# at the RStudio prompt
> library(devtools)
# then in the file browser:
# 1. navigate into ./study.wrangler directory
# 2. Set As Working Directory (gear icon in file browser)
# 3. (or do `setwd("~/study.wrangler")` in console)
> devtools::test()
# actually just `test()` will work

# If running commands in console, you will need to do
> load_all()
# after making changes in the code.

# To update the documentation and/or NAMESPACE file
> document()
> build(path='dist')
> install()
# Note that we are not committing the man/*.Rd files to the repo at the moment

About

Code to keep study data tidy

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 3

  •  
  •  
  •