
merge from upstream #6

Open

wants to merge 47 commits into base: main

47 commits
e08a8df
Update Snakemake execution configuration
timtroendle Jul 28, 2022
ef32de2
Update README to ensure minimal base env
timtroendle Jul 28, 2022
2e47baf
Add automated reruns triggered by code/params/env changes
timtroendle Jul 28, 2022
b5c11f2
Fix min_version check in default folder
timtroendle Jul 28, 2022
0fc3ffc
Fix bug in interaction with tabulate
timtroendle Jan 5, 2023
c23cabb
Update from default to IPython debugger
timtroendle Jan 5, 2023
b60ae88
Simplify Snakemake calls by defining default profile within env
timtroendle Jan 5, 2023
d335a64
Run GitHub workflow only on push to main and PR
timtroendle Jan 5, 2023
89cd6cc
Merge pull request #15 from timtroendle/feature-default-profiles-thro…
timtroendle Jan 5, 2023
24e9167
Update dependencies for Apple Silicon compatibility
timtroendle Jan 5, 2023
3768296
Update Python to 3.11
timtroendle Apr 28, 2023
61f5b76
Update to actions/checkout v3
timtroendle Apr 28, 2023
20aefce
Update other dependencies
timtroendle Apr 28, 2023
e235d47
Merge pull request #19 from timtroendle/feature-update-python
timtroendle Apr 28, 2023
2d603a4
Remove outputs from DAG rule to automatically enforce execution
timtroendle Apr 28, 2023
26dd584
Add snakemake object to Pyright builtins to avoid linting errors
timtroendle Apr 28, 2023
d69b6a4
Add conda-prefix to configuration
timtroendle Apr 28, 2023
014294f
Merge pull request #21 from timtroendle/feature-conda-prefix
timtroendle Apr 28, 2023
c9928d2
Add conda-prefix to cluster config
timtroendle Apr 28, 2023
f3dcb5c
Update deprecated pandoc option
timtroendle Apr 28, 2023
fb5e181
Add Snakemake object to Pyright builtins to avoid linting errors (clu…
timtroendle Apr 28, 2023
da33e26
Update from pandoc-xnos to pandoc-crossref
timtroendle May 26, 2023
12e29a4
Merge pull request #25 from timtroendle/fix-pantable-xnos-compat
timtroendle May 26, 2023
08011d9
Update to pytest-html 4
timtroendle Mar 1, 2024
1e5cb36
Add test infrastructure for model results
timtroendle Mar 1, 2024
c0d768c
Rename tags to keywords in pandoc metadata
timtroendle Mar 1, 2024
113addf
Add option to highlight text
timtroendle Mar 1, 2024
7d2aa8b
Update DAG creation to avoid requiring snakemake within sub env
timtroendle Mar 1, 2024
6028d23
Update default profile use to Snakemake mechanism
timtroendle Mar 1, 2024
6e23a70
Upgrade Weasyprint to 61.2 as a security update
timtroendle Mar 8, 2024
3943d67
Remove unnecessary dependencies from dag.env
timtroendle Mar 13, 2024
8fc54aa
Update Snakemake from 7.32 to 8.10.7
timtroendle May 1, 2024
290c051
Merge pull request #37 from timtroendle/feature-no-default-packages
timtroendle May 1, 2024
2d86ecb
Update from flake8 to ruff
timtroendle May 1, 2024
b166713
Fix weasyprint compatibility issue
timtroendle Nov 19, 2024
ec2f057
Add CI trigger on schedule once per month
timtroendle Nov 19, 2024
249a577
Merge pull request #39 from timtroendle/fix-weasyprint-compatibility-…
timtroendle Nov 19, 2024
4188bd1
Add support for Slurm
timtroendle Dec 6, 2024
0197224
Update GitHub workflow actions
timtroendle Dec 6, 2024
e8d1e09
Merge pull request #42 from timtroendle/feature-slurm
timtroendle Dec 9, 2024
6b4aeff
Update optional notifications from email to Pushcut
timtroendle Dec 9, 2024
58cc515
Merge pull request #43 from timtroendle/feature-portable-notification
timtroendle Dec 9, 2024
906957f
Add math support
timtroendle Dec 9, 2024
2e150ec
Merge pull request #44 from timtroendle/feature-math
timtroendle Dec 9, 2024
7f68489
Add archive rule
timtroendle Dec 9, 2024
039d0a2
Merge pull request #45 from timtroendle/feature-archive
timtroendle Dec 9, 2024
6c10c9e
Update log path handling from shell to pure Python
timtroendle Dec 12, 2024
48 changes: 31 additions & 17 deletions .github/workflows/reproduction.yaml
Original file line number Diff line number Diff line change
@@ -1,5 +1,11 @@
name: Reproduction
on: [push, pull_request]
on:
schedule:
- cron: "0 3 8 * *" # Runs on the eighth day of every month at 3am.
push:
branches:
- main
pull_request:
defaults:
run:
shell: bash -l {0}
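The new `schedule` trigger uses a five-field cron expression: `"0 3 8 * *"` fires at 03:00 on the eighth day of every month. As an illustration of how those fields are read, here is a toy matcher that supports only plain numbers and `*` (real cron also handles ranges, lists, and steps):

```python
from datetime import datetime

def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` matches a 5-field cron expression.

    Toy matcher: supports only plain numbers and '*', which is
    enough to interpret "0 3 8 * *".
    """
    minute, hour, day_of_month, month, day_of_week = expr.split()

    def field_ok(field: str, value: int) -> bool:
        return field == "*" or int(field) == value

    return (
        field_ok(minute, when.minute)
        and field_ok(hour, when.hour)
        and field_ok(day_of_month, when.day)
        and field_ok(month, when.month)
        and field_ok(day_of_week, when.isoweekday() % 7)  # cron: 0 = Sunday
    )

print(cron_matches("0 3 8 * *", datetime(2024, 12, 8, 3, 0)))  # True
print(cron_matches("0 3 8 * *", datetime(2024, 12, 9, 3, 0)))  # False
```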
@@ -8,61 +14,69 @@ jobs:
name: Reproduce the default demo analysis
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
- name: Setup cookiecutter environment
uses: conda-incubator/setup-miniconda@v2
uses: conda-incubator/setup-miniconda@v3
with:
auto-update-conda: true
python-version: "3.10"
python-version: 3.11
add-pip-as-python-dependency: true
- name: Install cookiecutter
run: pip install cookiecutter
- name: Apply cookiecutter
run: cookiecutter . --no-input --directory default
- name: Setup Snakemake environment
uses: conda-incubator/setup-miniconda@v2
uses: conda-incubator/setup-miniconda@v3
with:
auto-update-conda: true
python-version: "3.10"
mamba-version: "*"
python-version: 3.11
mamba-version: 1.5.11
activate-environment: reproducible-research-project
environment-file: reproducible-research-project/environment.yaml
- name: Reproduce results
run: |
cd reproducible-research-project
snakemake --cores 1 --use-conda
snakemake
- name: Generate DAG
run: |
cd reproducible-research-project
snakemake --cores 1 --use-conda -f dag
snakemake dag
- name: Archive results
run: |
cd reproducible-research-project
snakemake archive
run_cluster_workflow:
name: Reproduce the cluster demo analysis (run locally only)
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
- name: Setup cookiecutter environment
uses: conda-incubator/setup-miniconda@v2
uses: conda-incubator/setup-miniconda@v3
with:
auto-update-conda: true
python-version: "3.10"
python-version: 3.11
add-pip-as-python-dependency: true
- name: Install cookiecutter
run: pip install cookiecutter
- name: Apply cookiecutter
run: cookiecutter . --no-input --directory cluster
- name: Setup Snakemake environment
uses: conda-incubator/setup-miniconda@v2
uses: conda-incubator/setup-miniconda@v3
with:
auto-update-conda: true
python-version: "3.10"
mamba-version: "*"
python-version: 3.11
mamba-version: 1.5.11
activate-environment: reproducible-research-project
environment-file: reproducible-research-project/environment.yaml
- name: Reproduce results
run: |
cd reproducible-research-project
snakemake --cores 1 --use-conda
snakemake
- name: Generate DAG
run: |
cd reproducible-research-project
snakemake --cores 1 --use-conda -f dag
snakemake dag
- name: Archive results
run: |
cd reproducible-research-project
snakemake archive
2 changes: 1 addition & 1 deletion LICENSE.md
@@ -1,4 +1,4 @@
Copyright (c) 2017-2021 Tim Tröndle
Copyright (c) 2017-2024 Tim Tröndle

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
22 changes: 14 additions & 8 deletions README.md
@@ -4,9 +4,11 @@

This repository provides [cookiecutter](http://cookiecutter.readthedocs.io) templates for reproducible research projects. The templates do not attempt to be generic, but have a clear and opinionated focus.

Projects build with these templates aim at full automation, and use `Python 3.10`, `mamba/conda`, `Git`, `Snakemake`, and `pandoc` to create a HTML report out of raw data, code, and `Markdown` text. Fork, clone, or download this repository on GitHub if you want to change any of these.
Projects built with these templates aim at full automation and use `Python 3.11`, `mamba/conda`, `Git`, `Snakemake`, and `pandoc` to create an HTML and PDF report out of raw data, code, and `Markdown` text. Fork, clone, or download this repository on GitHub if you want to change any of these.

The template includes a few lines of code as a demo to allow you to create a HTML report out of made-up simulation results right away. Read the `README.md` in the generated repository to see how.
The template includes a few lines of code as a demo to allow you to create a report out of made-up simulation results right away. Read the `README.md` in the generated repository to see how.

These templates are developed on macOS and tested on Linux. They may work with Windows Subsystem for Linux, but Windows is not actively supported.

## Template types

@@ -37,14 +39,16 @@ Parameter | Description
`author` | Your name.
`institute` | The name of your institute, used for report metadata.
`short_description` | A short description of the project, used for documentation and report.
`path_to_conda_envs` | The path to the directory hosting your conda envs (leave untouched for Snakemake default).

The `cluster` template requires the following parameter values in addition:

Parameter | Description
--- | ---
`cluster_url` | The address of the cluster to allow syncing to and from the cluster.
`cluster_base_dir` | The base path for the project on the cluster (default: `~/<project-short-name>`).
`cluster_type` | The type of job scheduler used on the cluster. Currently, only LSF is supported.
`cluster_type` | The type of job scheduler used on the cluster. Currently, only Slurm is supported.
`slurm_account` | The user account on Slurm.

## Project Structure

@@ -58,6 +62,9 @@ The generated repository will have the following structure:
│ ├── default.yaml <- Default execution environment.
│ ├── report.yaml <- Environment for compilation of the report.
│ └── test.yaml <- Environment for executing tests.
├── profiles <- Snakemake profiles.
│ └── default <- Default Snakemake profile folder.
│ └── config.yaml <- Default Snakemake profile.
├── report <- All files creating the final report, usually text and figures.
│ ├── apa.csl <- Citation style definition to be used in the report.
│ ├── literature.yaml <- Bibliography file for the report.
@@ -70,7 +77,7 @@ The generated repository will have the following structure:
├── tests <- Automatic tests of the source code go in here.
│ └── test_model.py <- Demo file.
├── .editorconfig <- Editor agnostic configuration settings.
├── .flake8 <- Linting settings for flake8.
├── .ruff <- Linter and formatter settings for ruff.
├── .gitignore
├── environment.yaml <- A file to create an environment to execute your project in.
├── LICENSE.md <- MIT license description
@@ -81,12 +88,11 @@ The generated repository will have the following structure:
`cluster` templates additionally contain the following files:

```
├── config
│ └── cluster <- Cluster configuration.
│ ├── cluster-config.yaml <- A Snakemake cluster-config file.
│ └── config.yaml <- A set of Snakemake command-line parameters for cluster execution.
├── envs
│ └── shell.yaml <- An environment for shell rules.
├── profiles
│ └── cluster <- Cluster Snakemake profile folder.
│ └── config.yaml <- Cluster Snakemake profile.
├── rules
│ └── sync.yaml <- Snakemake rules to sync to and from the cluster.
├── .syncignore-receive <- Build files to ignore when receiving from the cluster.
7 changes: 5 additions & 2 deletions cluster/cookiecutter.json
@@ -4,8 +4,11 @@
"author": "Your name",
"institute": "Your institution",
"short_description": "A short description of this project.",
"path_to_conda_envs": "Snakemake-default",
"cluster_url": "cluster.example.org",
"cluster_base_dir": "~/{{ cookiecutter.project_short_name }}",
"cluster_type": ["LSF"],
"_add_cluster_infrastructure": true
"cluster_type": ["Slurm"],
"slurm_account": "Your slurm account",
"_add_cluster_infrastructure": true,
"_jinja2_env_vars": {"lstrip_blocks": true, "trim_blocks": true}
}
1 change: 0 additions & 1 deletion cluster/{{cookiecutter.project_short_name}}/.flake8

This file was deleted.

1 change: 1 addition & 0 deletions cluster/{{cookiecutter.project_short_name}}/.ruff.toml
@@ -7,4 +7,5 @@ __pycache__
.vscode
.DS_Store
build
archive
notebooks

This file was deleted.

This file was deleted.

@@ -0,0 +1,16 @@
executor: slurm
jobs: 999
local-cores: 1
cores: 10
latency-wait: 60
use-envmodules: True
use-conda: True
conda-frontend: mamba
{% if cookiecutter.path_to_conda_envs != "Snakemake-default" %}
conda-prefix: {{cookiecutter.path_to_conda_envs}}
{% endif %}
default-resources:
- runtime=10
- mem_mb_per_cpu=16000
- disk_mb=1000
- slurm_account={{cookiecutter.slurm_account}}
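The Slurm profile above replaces the deleted cluster-config file: each key corresponds to a `snakemake` command-line option, so `snakemake --profile profiles/cluster` behaves as if those flags had been typed out. A rough, hypothetical sketch of that mapping (Snakemake's real profile loader is more involved; this is for illustration only):

```python
def profile_to_flags(profile: dict) -> list[str]:
    """Turn Snakemake profile keys into CLI flags (illustrative only)."""
    flags = []
    for key, value in profile.items():
        if isinstance(value, bool):
            if value:  # boolean keys become bare flags, e.g. --use-conda
                flags.append(f"--{key}")
        elif isinstance(value, list):  # e.g. the default-resources entries
            flags.append(f"--{key}")
            flags.extend(str(item) for item in value)
        else:
            flags.extend([f"--{key}", str(value)])
    return flags

profile = {
    "executor": "slurm",
    "jobs": 999,
    "use-conda": True,
    "default-resources": ["runtime=10", "slurm_account=my-account"],
}
print(profile_to_flags(profile))
# ['--executor', 'slurm', '--jobs', '999', '--use-conda',
#  '--default-resources', 'runtime=10', 'slurm_account=my-account']
```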
3 changes: 1 addition & 2 deletions cluster/{{cookiecutter.project_short_name}}/rules/sync.smk
@@ -14,7 +14,6 @@ rule send:
{params.url}:{params.cluster_base_dir}
"""


rule receive:
message: "Receive build changes from cluster"
params:
@@ -27,7 +26,7 @@ rule receive:
conda: "../envs/shell.yaml"
shell:
"""
rsync -avzh --progress --delete -r --exclude-from={params.receive_ignore} \
rsync -avzhL --progress --delete -r --exclude-from={params.receive_ignore} \
{params.url}:{params.cluster_build_dir} {params.local_results_dir}
"""
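The only functional change to `receive` is the added `L` in `rsync -avzhL`: rsync then follows symlinks on the cluster and copies the files they point to rather than the links themselves. A sketch of how the command could be assembled in Python (host and paths are placeholders, not taken from the repository):

```python
def build_receive_command(url, cluster_build_dir, local_results_dir, ignore_file):
    # -a archive, -v verbose, -z compress, -h human-readable sizes,
    # -L copy the targets of symlinks instead of the links themselves
    return [
        "rsync", "-avzhL", "--progress", "--delete", "-r",
        f"--exclude-from={ignore_file}",
        f"{url}:{cluster_build_dir}", local_results_dir,
    ]

cmd = build_receive_command(
    "cluster.example.org", "~/project/build/", "build/cluster", ".syncignore-receive"
)
print(" ".join(cmd))
```

To actually execute it, pass the list to `subprocess.run(cmd, check=True)`.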

4 changes: 3 additions & 1 deletion default/cookiecutter.json
@@ -4,5 +4,7 @@
"author": "Your name",
"institute": "Your institution",
"short_description": "A short description of this project.",
"_add_cluster_infrastructure": false
"path_to_conda_envs": "Snakemake-default",
"_add_cluster_infrastructure": false,
"_jinja2_env_vars": {"lstrip_blocks": true, "trim_blocks": true}
}
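Both templates now pass `lstrip_blocks` and `trim_blocks` to Jinja2, which keeps block tags such as the `{% if %}` around `conda-prefix` from leaving stray blank lines in the rendered files. A minimal demonstration (requires the third-party `jinja2` package):

```python
from jinja2 import Environment

template = "a\n{% if flag %}\nb\n{% endif %}\nc\n"

default = Environment().from_string(template).render(flag=True)
trimmed = Environment(trim_blocks=True, lstrip_blocks=True).from_string(template).render(flag=True)

print(repr(default))  # 'a\n\nb\n\nc' -- block tags leave empty lines behind
print(repr(trimmed))  # 'a\nb\nc'    -- tags and their trailing newlines removed
```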
5 changes: 0 additions & 5 deletions default/{{cookiecutter.project_short_name}}/.flake8

This file was deleted.

1 change: 1 addition & 0 deletions default/{{cookiecutter.project_short_name}}/.gitignore
@@ -1,5 +1,6 @@
# generated files
build/
archive/

## Core latex/pdflatex auxiliary files:
*.aux
36 changes: 36 additions & 0 deletions default/{{cookiecutter.project_short_name}}/.ruff.toml
@@ -0,0 +1,36 @@
line-length = 88
preview = true # required to activate many pycodestyle errors and warnings as of 2024-05-01
builtins = ["snakemake"]

[format]
quote-style = "double"
indent-style = "space"
docstring-code-format = false
line-ending = "auto"

[lint]
select = [
# pycodestyle errors
"E",
# pycodestyle warnings
"W",
# Pyflakes
"F",
# pyupgrade
"UP",
# flake8-bugbear
"B",
# flake8-simplify
"SIM",
# isort
"I",
]
ignore = [
# here and below, rules are redundant with formatter, see
# https://docs.astral.sh/ruff/formatter/#conflicting-lint-rules
"E501",
"W191",
"E111",
"E114",
"E117",
]
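The `lint.select` entries enable whole rule families. For example, `B` (flake8-bugbear) flags likely bugs such as mutable default arguments (rule B006); a sketch of the kind of code it catches:

```python
# The kind of bug ruff's "B" (flake8-bugbear) family flags: a mutable
# default argument (rule B006) is created once and shared across calls.
def append_bad(item, bucket=[]):  # B006: mutable default
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):  # idiomatic fix: default to None
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

print(append_bad(1), append_bad(2))    # [1, 2] [1, 2] -- state leaks between calls
print(append_good(1), append_good(2))  # [1] [2]
```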
29 changes: 16 additions & 13 deletions default/{{cookiecutter.project_short_name}}/README.md
@@ -8,49 +8,51 @@ This repository contains the entire scientific project, including code and report sources.

You need [mamba](https://mamba.readthedocs.io/en/latest/) to run the analysis. Using mamba, you can create an environment from within which you can run it:

mamba env create -f environment.yaml
mamba env create -f environment.yaml --no-default-packages

## Run the analysis

snakemake --cores 1 --use-conda
snakemake

This will run all analysis steps to reproduce results and eventually build the report.

You can also run certain parts only by using other `snakemake` rules; to get a list of all rules run `snakemake --list`.

To generate a PDF of the dependency graph of all steps `build/dag.pdf` run:

snakemake --use-conda --cores 1 -f dag
snakemake dag

{% if cookiecutter._add_cluster_infrastructure == True -%}
{% if cookiecutter._add_cluster_infrastructure == True %}
## Run on a cluster

You may want to run the workflow on a cluster. While you can run on [any cluster that is supported by Snakemake](https://snakemake.readthedocs.io/en/stable/executing/cluster.html), the workflow currently supports [LSF](https://en.wikipedia.org/wiki/Platform_LSF) clusters only. To run the workflow on a LSF cluster, use the following command:
You may want to run the workflow on a cluster. While you can run on [any cluster that is supported by Snakemake](https://snakemake.readthedocs.io/en/stable/executing/cluster.html), the workflow currently supports [Slurm](https://en.wikipedia.org/wiki/Slurm_Workload_Manager) clusters only. To run the workflow on a Slurm cluster, use the following command:

snakemake --use-conda --profile config/cluster
snakemake --profile profiles/cluster

If you want to run on another cluster, read [snakemake's documentation on cluster execution](https://snakemake.readthedocs.io/en/stable/executable.html#cluster-execution) and take `config/cluster` as a starting point.

## Work local, build on remote

You may want to work locally (to change configuration parameters, add modules etc), but execute remotely on the cluster. This workflow supports you in working this way through three Snakemake rules: `send`, `receive`, and `clean_cluster_results`. It works like the following.

First, start local and make sure the `cluster-sync` configuration parameters fit your environment. Next, run `snakemake --use-conda send` to send the entire repository to your cluster. On the cluster, execute the workflow with Snakemake (see above). After the workflow has finished, download results by locally running `snakemake --use-conda receive`. By default, this will download results into `build/cluster`.
First, start locally and make sure the `cluster-sync` configuration parameters fit your environment. Next, run `snakemake send` to send the entire repository to your cluster. On the cluster, execute the workflow with Snakemake (see above). After the workflow has finished, download results by locally running `snakemake receive`. By default, this will download results into `build/cluster`.

This workflow works iteratively too. After analysing your cluster results locally, you may want to make changes locally, send these changes to the cluster (`snakemake --use-conda send`), rerun on the cluster, and download updated results (`snakemake --use-conda receive`).
This workflow works iteratively too. After analysing your cluster results locally, you may want to make changes locally, send these changes to the cluster (`snakemake send`), rerun on the cluster, and download updated results (`snakemake receive`).

To remove cluster results on your local machine, run `snakemake --use-conda clean_cluster_results`.
{%- endif %}
To remove cluster results on your local machine, run `snakemake clean_cluster_results`.
{% endif %}

## Be notified of build successes or fails

As the execution of this workflow may take a while, you can be notified whenever the execution terminates either successfully or unsuccessfully. Notifications are sent by email. To activate notifications, add the email address of the recipient to the configuration key `email`. You can add the key to your configuration file, or you can run the workflow the following way to receive notifications:
As the execution of this workflow may take a while, you can be notified whenever the execution terminates either successfully or unsuccessfully. Notifications are sent by the webservice [Pushcut](https://pushcut.io/) for which you need a free account. To activate notifications, add your Pushcut secret to the configuration using the configuration key `pushcut_secret`. You can add the key to your configuration file, or you can run the workflow the following way to receive notifications:

snakemake --cores 1 --use-conda --config email=<your-email>
snakemake --config pushcut_secret=<your-secret>

This workflow will then trigger the Pushcut notifications `snakemake_succeeded` and `snakemake_failed`.
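Pushcut triggers notifications through simple webhooks. As an illustration only (the endpoint shape is an assumption based on Pushcut's public webhook format, not taken from this repository), the `snakemake_succeeded` trigger could be built like this:

```python
from urllib.request import Request

API_BASE = "https://api.pushcut.io"  # assumed base URL of Pushcut's webhook API

def notification_request(secret: str, name: str) -> Request:
    """Build (but do not send) the webhook request for notification `name`."""
    return Request(f"{API_BASE}/{secret}/notifications/{name}", method="POST")

req = notification_request("my-secret", "snakemake_succeeded")
print(req.get_method(), req.full_url)
# POST https://api.pushcut.io/my-secret/notifications/snakemake_succeeded
```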

## Run the tests

snakemake --use-conda --cores 1 test
snakemake test

## Repo structure

@@ -60,6 +62,7 @@ To remove cluster results on your local machine, run `snakemake --use-conda clean_cluster_results`.
* `envs`: contains execution environments
* `tests`: contains the test code
* `config`: configurations used in the study
* `profiles`: Snakemake execution profiles
* `data`: place for raw data
* `build`: will contain all results (does not exist initially)
