A guide to reproducibility in scientific computing, software design and usage, data management and analysis. Primarily R-focused, but the principles I've learned are extensible to your tool of choice.

rvanmazijk/my-reproducibility-guide

A guide to reproducibility in scientific computing and analysis

This is a (non-comprehensive, very personally biased) guide to reproducibility in scientific computing, software design and usage, data management and analysis. I focus primarily on programming and analysis practices in R, but the principles I've learned are extensible to your tool of choice. I also talk about general data management and open-science platforms.

WIP

Guiding principles for reproducible R analyses

  1. Do not alter another machine's state
  2. Do not refer to things unique to your machine, or that are likely to differ between machines
  3. Record everything that can affect the analysis
  4. R-scripts (.R-files) should be able to run from top to bottom hitch-free (as far as possible)
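
As one possible illustration of principle 3, the sketch below writes the R version, platform and loaded packages to a text file that can be kept next to the analysis outputs. The helper name `record_session` and the file name `session-info.txt` are hypothetical choices made for this example, not part of any package.

```r
# Hypothetical helper: save the output of sessionInfo() (R version,
# platform, attached and loaded packages) to a plain-text file.
record_session <- function(path = "session-info.txt") {
  writeLines(utils::capture.output(utils::sessionInfo()), path)
  invisible(path)
}

# Example usage: write to a temporary directory so that nothing outside
# the current R session's scratch space is touched.
log_file <- record_session(file.path(tempdir(), "session-info.txt"))
```

Committing such a file with the results gives a later reader at least a rough record of the environment the analysis ran in.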

Bad vs good R practices for reproducibility

| Bad R | Good R |
| --- | --- |
| setwd("my/path/that/no-one/else/has/") | Use R-projects (i.e. a .Rproj-file) |
| read.csv("an/absolute/path/") | read.csv(here::here("relative/path/")) |
| rm(list = ls()) | Don't use this in R-scripts |
| install.packages(...) | Don't use this in R-scripts |
| important_random_nos <- runif(...) | set.seed(1234); reproducible_random_nos <- runif(...) |
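
Two of the rows above can be sketched in a few lines. The first part shows that fixing the seed makes random draws repeatable. The second part is one possible alternative to calling `install.packages()` inside a script: report which packages are missing and leave the installing to the user. The helper `missing_packages` is a hypothetical name introduced for this example.

```r
# Fixing the seed makes the "random" numbers repeatable across runs.
set.seed(1234)
first_draw <- runif(5)

set.seed(1234)
second_draw <- runif(5)

identical(first_draw, second_draw)  # TRUE

# Hypothetical helper: return the names of packages that are not
# installed, without installing anything (principle 1: do not alter
# another machine's state).
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Example usage with packages that ship with R itself.
not_found <- missing_packages(c("stats", "utils"))
if (length(not_found) > 0) {
  message("Please install: ", paste(not_found, collapse = ", "))
}
```

Reporting the missing packages leaves the decision about what gets installed, and where, with the person running the script.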
