study.wrangler is an R package for tidying data in a way that is compatible with VEuPathDB workflows for EDA loading.
# Install Bioconductor packages
install.packages("BiocManager") # if not already installed
BiocManager::install(c("SummarizedExperiment", "DESeq2"))
# Install the remotes package
install.packages("remotes") # if you don't already have 'remotes'
# Install (or upgrade) the study.wrangler package from GitHub
remotes::install_github("VEuPathDB/study-wrangler", build_vignettes = TRUE, upgrade = FALSE)
# And load all its functions into your namespace
library(study.wrangler)
# Browse the vignettes (tutorials) in your web browser
browseVignettes("study.wrangler")
# If your web browser doesn't open with the vignette index page, use one of these:
options(browser = "xdg-open") # Linux
# or
options(browser = "open") # macOS
# or
options(browser = "C:/Program Files/Google/Chrome/Application/chrome.exe") # Windows
# and then try again with
browseVignettes("study.wrangler")
If you see errors like HTTP error 403 or API rate limit exceeded when installing packages from GitHub, you may be hitting the unauthenticated GitHub API limit (60 requests/hour).
To avoid this, set a GitHub Personal Access Token (PAT):
- Generate a token via https://github.com/settings/personal-access-tokens. Only the default read-only permissions for public repositories are needed.
- If you’re using R from the command line (not RStudio), set your token in your shell config:
  - zsh (default on most Macs): Add `export GITHUB_PAT=github_pat_...` to your `~/.zshrc`
  - bash: Add it to `~/.bash_profile` or `~/.bashrc`
- For RStudio, we recommend setting it in your R environment file via:
usethis::edit_r_environ()
Add the line
GITHUB_PAT=github_pat_...
Save the file and restart your R session.
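Before retrying the install, it can help to confirm that the token is actually visible in your environment. The following is a minimal sketch and not part of study.wrangler: the function names `check_github_pat` and `show_github_rate_limit` are made up for illustration, and the second one assumes `curl` is installed and queries GitHub's public `rate_limit` endpoint. It only inspects the shell environment; a token set through the R environment file is visible inside R rather than in the shell.

```shell
# Hypothetical helper: report whether GITHUB_PAT is set in this shell.
# Prints a message without revealing the token; returns non-zero if unset.
check_github_pat() {
  if [ -n "${GITHUB_PAT:-}" ]; then
    echo "GITHUB_PAT is set"
  else
    echo "GITHUB_PAT is not set"
    return 1
  fi
}

# Hypothetical helper: ask GitHub for the current rate-limit status.
# Assumes curl is available; sends the token only if one is set.
show_github_rate_limit() {
  if [ -n "${GITHUB_PAT:-}" ]; then
    curl -s -H "Authorization: Bearer ${GITHUB_PAT}" https://api.github.com/rate_limit
  else
    curl -s https://api.github.com/rate_limit
  fi
}
```

Run `check_github_pat` first, then `show_github_rate_limit`. With a working token, the reported limit should be higher than the anonymous 60 requests per hour.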
To experiment with the vignette code interactively:
- Clone the source repo:
git clone https://github.com/VEuPathDB/study-wrangler.git
- Open the `.Rmd` file in RStudio (File -> Open): study-wrangler/vignettes/cleaning-and-preparing-basics.Rmd
You can now run the code chunks interactively and modify them to experiment.
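To see which vignette files are available in your clone, a small shell helper along these lines may be useful. This is a sketch only: `list_vignettes` is a hypothetical name, and it assumes that any vignettes sit as `.Rmd` files directly inside the `vignettes` directory of the clone, as the one named above does.

```shell
# Hypothetical helper: list the R Markdown vignettes in a local clone.
# $1 is the path to the cloned repo (defaults to ./study-wrangler).
list_vignettes() {
  repo_dir="${1:-study-wrangler}"
  if [ ! -d "$repo_dir/vignettes" ]; then
    echo "No vignettes directory under $repo_dir" >&2
    return 1
  fi
  for rmd in "$repo_dir"/vignettes/*.Rmd; do
    # Skip the unexpanded pattern when no .Rmd files exist.
    if [ -e "$rmd" ]; then
      echo "$rmd"
    fi
  done
}
```

Call `list_vignettes` from the directory where you ran `git clone`, or pass the path to the clone as the first argument.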
To build the docker image:
docker build -t veupathdb/study-wrangler .
Build notes:
- add `--progress=plain` to the `docker build` command if you want to see the errors more easily
- each `remotes::install_github()` command in the Dockerfile counts against GitHub's rate limit for anonymous users (60 requests per hour), so if you rebuild repeatedly you will almost certainly hit the limit
- setting up credentials inside the build to get a higher limit does not seem worth the effort
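One way to make build failures easier to inspect is to combine `--progress=plain` with a saved log. The sketch below is an illustration, not part of the repo: `build_with_log` is a hypothetical name and `build.log` is an arbitrary default path.

```shell
# Hypothetical helper: build the image with plain progress output and
# keep a copy of the full build log for later inspection.
# $1 is the log file path (defaults to build.log, an arbitrary choice).
# Note: the pipeline's exit status is that of tee, not of docker build.
build_with_log() {
  log_file="${1:-build.log}"
  docker build --progress=plain -t veupathdb/study-wrangler . 2>&1 | tee "$log_file"
}
```

After a failed build, searching the log with something like `grep -i "rate limit" build.log` can show whether a step failed because of the GitHub limit rather than a problem in the Dockerfile.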
To run in docker:
docker run --rm -ti -e PASSWORD=password -p 8989:8787 veupathdb/study-wrangler
# Then in your web browser navigate to localhost:8989 and log in with "rstudio" and "password"
To add R functions to the repo (using docker):
docker run --rm -ti --name study-wrangler-dev -v "$PWD":/study.wrangler -e PASSWORD=password -p 8888:8787 veupathdb/study-wrangler
# Then in your web browser navigate to localhost:8888 and log in with "rstudio" and "password"
# at the RStudio prompt
> library(devtools)
# then in the file browser:
# 1. navigate into ./study.wrangler directory
# 2. Set As Working Directory (gear icon in file browser)
# 3. (or do `setwd("~/study.wrangler")` in console)
> devtools::test() # actually just `test()` will work
# If running commands in console, you will need to do
> load_all()
# after making changes in the code.
# To update the documentation and/or NAMESPACE file
> document()
> build(path='dist')
> install()
# Note that we are not committing the man/*.Rd files to the repo at the moment