A guide to reproducibility in scientific computing and analysis

This is a (non-comprehensive, very personally biased) guide to reproducibility in scientific computing, software design and usage, data management, and analysis. I focus primarily on programming and analysis practices in R, but the principles I've learned extend to your tool of choice. I also talk about general data management and open-science platforms.

WIP

Guiding principles for reproducible R analyses

Do not alter another machine's state
Do not refer to things unique to your machine, or that are likely to differ between machines
Record everything that can affect the analysis
R-scripts (`.R`-files) should run from top to bottom without errors or manual fixes (as far as possible)
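
As a minimal sketch of the first and third principles, the snippet below checks for required packages without installing anything, then writes the session details to a log file. The helper `record_session()`, the `required_packages` vector, and the log location are hypothetical names chosen for illustration; only base R and the bundled `utils` package are assumed.

```r
# Hypothetical list of packages the analysis depends on.
required_packages <- c("stats", "utils")

# Check availability without installing anything on the user's machine.
is_available <- vapply(required_packages, requireNamespace, logical(1), quietly = TRUE)
missing_packages <- required_packages[!is_available]
if (length(missing_packages) > 0) {
  stop("Please install: ", paste(missing_packages, collapse = ", "))
}

# Hypothetical helper: save the R version, platform, and loaded package
# versions to a text file next to the analysis outputs.
record_session <- function(path = tempfile(fileext = ".txt")) {
  info <- utils::capture.output(utils::sessionInfo())
  stamp <- paste("Recorded:", format(Sys.time(), tz = "UTC", usetz = TRUE))
  writeLines(c(stamp, info), path)
  invisible(path)
}

log_file <- record_session()
```

Stopping with a message leaves the decision to install packages with whoever runs the script, and the log gives a later reader a record of the environment that produced the results.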

Bad vs good R practices for reproducibility

| Bad R | Good R |
| --- | --- |
| `setwd("my/path/that/no-one/else/has/")` | Use R-projects (i.e. a `.Rproj`-file) |
| `read.csv("an/absolute/path/")` | `read.csv(here::here("relative/path/"))` |
| `rm(list = ls())` | Don't use this in R-scripts |
| `install.packages(...)` | Don't use this in R-scripts |
| `important_random_nos <- runif(...)` | `set.seed(1234); reproducible_random_nos <- runif(...)` |
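
To illustrate the seeding and relative-path rows, here is a small sketch. The seed value follows the table; the object names, the `data/measurements.csv` file, and the temporary `project_root` directory are hypothetical stand-ins, and the sketch uses base R's `file.path()` so that it runs without the `here` package installed.

```r
# Seeding: resetting the same seed reproduces the same random draws.
set.seed(1234)
first_draw <- runif(5)
set.seed(1234)
second_draw <- runif(5)
draws_match <- identical(first_draw, second_draw)

# Relative paths: build every path from the project root, never from a
# location that only exists on one machine. A temporary directory stands
# in for the project root here (an assumption for the sake of the demo).
project_root <- tempfile("my-project-")
dir.create(file.path(project_root, "data"), recursive = TRUE)

data_path <- file.path(project_root, "data", "measurements.csv")
write.csv(data.frame(x = first_draw), data_path, row.names = FALSE)
measurements <- read.csv(data_path)

# Inside a real R-project, the equivalent call would be along the lines of:
# measurements <- read.csv(here::here("data", "measurements.csv"))
```

Because the path is assembled from the project root, the script keeps working when the project folder is moved or cloned onto another machine.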

rvanmazijk/my-reproducibility-guide

Folders and files

Latest commit

History

Repository files navigation

A guide to reproducibility scientific computing and analysis

Guiding principles for reproducible R analyses

Bad vs good R practices for reproducibility

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Packages