Contributing to NMDC-notebooks

👍 First of all: Thank you for taking the time to contribute!

The following is a set of guidelines for contributing to the nmdc_notebooks repo. This guide is aimed primarily at the developers of the notebooks and this repo, although anyone is welcome to contribute.

Code of Conduct

The NMDC team strives to create a welcoming environment for editors, users and other contributors.

Please carefully read NMDC's Code of Conduct.

Guidelines for Contributions and Requests

Reporting issues with existing notebooks

Please use the Issue Tracker to report problems or suggest enhancements for the notebooks. Issues should be focused and actionable (a single PR should be able to close an issue). Complex issues should be broken down into simpler issues where possible.

Please review GitHub's overview article, "Tracking Your Work with Issues".

Pull Requests

See Pull Requests for all pull requests. Every pull request should be associated with an issue.

Please review GitHub's article, "About Pull Requests", and make your changes on a new branch.

We also recommend reading "GitHub Pull Requests: 10 Tips to Know".

Best Practices

  • Read "About Issues" and "About Pull Requests"
  • Issues should be focused and actionable
  • Bugs should be reported with a clear description of the problem and steps to reproduce. If bugs are found within a notebook, please include the link to the notebook in the issue and the specific cell that is causing the issue.
  • Complex issues should be broken down into simpler issues where possible
  • Pull Requests (PRs) should be atomic and aim to close a single issue
  • PRs should reference issues following standard conventions (e.g. “Fixes #123”)
  • Never work on the main branch; always work on an issue/feature branch
  • Core developers can work on branches off origin rather than forks
  • If possible, create a draft or work-in-progress PR on a branch to maximize transparency about what you are doing
  • PRs should be reviewed and merged in a timely fashion
  • In the case of git conflicts, the contributor should try to resolve the conflict
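
As an illustration of the "Fixes #123" convention, here is a minimal sketch that extracts the issue numbers a PR description would close. The function name `referenced_issues` is hypothetical and not part of this repo. The pattern only covers GitHub's closing keywords (close, fix, resolve and their inflections) followed by whitespace and `#<number>`; other reference forms are not handled.

```python
import re

# GitHub closing keywords: close/closes/closed, fix/fixes/fixed,
# resolve/resolves/resolved, followed by whitespace and "#<issue number>".
CLOSING_REF = re.compile(
    r"\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s+#(\d+)",
    re.IGNORECASE,
)


def referenced_issues(pr_description: str) -> list:
    """Return the issue numbers that the PR description references for closing."""
    return [int(number) for number in CLOSING_REF.findall(pr_description)]


if __name__ == "__main__":
    print(referenced_issues("Fixes #123"))
```

An atomic PR, as recommended above, would normally yield exactly one issue number.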

Adding new notebooks

To add a new notebook to this repository:

  1. Create a folder in the base directory
    • Name the folder with a short version of the analysis/question that will be explored.
    • Use snake_case for the folder name.
  2. Create a README.md in the folder outlining the analysis or question.
  3. Create a sub-folder for each language that will be demonstrated
    • e.g. one subfolder named R and one subfolder named python
  4. Create a Jupyter Notebook in each sub-folder, written in that sub-folder's language, or
  5. (Alternative to step 4) Create a .Rmd file and convert it to a Jupyter Notebook. Several methods for this exist and none are perfect, but this open source method currently works.
  6. Run the entire notebook to ensure it is working as expected and save the rendered notebook in the folder.
  7. Update the README.md in the folder to include links to the rendered notebook (using nbviewer and Google Colab).
  8. Add the notebooks to the appropriate GitHub workflow so that they are included in the continuous integration process. See the .github/workflows folder for existing workflows (one for the R notebooks and one for the Python notebooks). Add the new notebook to the end of the list of notebooks in the workflow file.
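
Steps 1 to 3 above can be sketched as a small helper. This is an illustrative sketch only: the function name `scaffold_notebook_folder`, the README placeholder text, and the example folder name `taxonomic_abundance` are assumptions, not something that exists in this repo. Steps 4 to 8 (writing, running, and registering the notebook) remain manual.

```python
import re
import tempfile
from pathlib import Path

# snake_case: lowercase letters/digits in groups separated by single underscores.
SNAKE_CASE = re.compile(r"^[a-z0-9]+(?:_[a-z0-9]+)*$")


def scaffold_notebook_folder(base_dir, name, languages=("R", "python")):
    """Create <base_dir>/<name>/ with a README.md and one sub-folder per language."""
    if not SNAKE_CASE.match(name):
        raise ValueError(f"folder name must be snake_case, got {name!r}")
    folder = Path(base_dir) / name
    folder.mkdir(parents=True, exist_ok=False)  # refuse to overwrite an existing folder
    (folder / "README.md").write_text(
        f"# {name}\n\nDescribe the analysis or question here.\n", encoding="utf-8"
    )
    for language in languages:
        (folder / language).mkdir()
    return folder


if __name__ == "__main__":
    # Demonstrate in a throwaway directory; in practice base_dir is the repo root.
    with tempfile.TemporaryDirectory() as tmp:
        created = scaffold_notebook_folder(tmp, "taxonomic_abundance")
        print(sorted(p.name for p in created.iterdir()))
```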

Dependency Management

R

This project uses renv for package management. Because renv does not track library calls within .ipynb files (only .R or .Rmd files), we maintain a small libraries.R script that loads the packages the project needs for local development.

To install the R dependencies:

  1. Clone the GitHub repository
  2. Open the R project
  3. Run renv::restore() to make sure your packages match. To learn more about how renv works, see this resource.

To add new R dependencies:

  1. Install the library with renv::install("package_name").
  2. Add the package to the libraries.R file so that renv can track it (e.g. library("package_name")).
  3. Run renv::snapshot() to update the lockfile.
  4. Commit the changes and push to GitHub.
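
Because renv cannot see library calls inside .ipynb files, it can be useful to check that every package a notebook loads is also listed in libraries.R. The following is a rough sketch of such a check, not a tool shipped with this repo: the function names are hypothetical, and the regular expression only recognises simple `library(pkg)` and `require(pkg)` calls (it misses `pkg::function` usage and other loading idioms).

```python
import json
import re
from pathlib import Path

# Matches library(pkg), library("pkg"), require(pkg), require('pkg', ...).
LIBRARY_CALL = re.compile(
    r"""\b(?:library|require)\(\s*["']?([A-Za-z][A-Za-z0-9._]*)["']?\s*[,)]"""
)


def packages_in_r_source(text):
    """Return the set of package names loaded via library()/require() in R code."""
    return set(LIBRARY_CALL.findall(text))


def packages_in_notebook(notebook_path):
    """Collect package names loaded in the code cells of a Jupyter notebook."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    packages = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat stores source as a list of lines
            source = "".join(source)
        packages |= packages_in_r_source(source)
    return packages


def missing_from_libraries_r(notebook_path, libraries_r_path):
    """Packages the notebook loads that libraries.R does not declare."""
    declared = packages_in_r_source(Path(libraries_r_path).read_text(encoding="utf-8"))
    return packages_in_notebook(notebook_path) - declared
```

Any package name returned by `missing_from_libraries_r` is a candidate for steps 2 and 3 above.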

Python

This project uses pip paired with venv to manage dependencies. Note that requirements_dev.txt should be used and updated for local development dependencies, and requirements.txt should be used for production/binder dependencies (updated manually and with discretion).

To install the Python dependencies:

  1. Clone the GitHub repository
  2. Create a virtual environment: python -m venv venv
  3. Activate the virtual environment: source venv/bin/activate
  4. Install the necessary packages: pip install -r requirements_dev.txt
    • To update packages that are already installed: pip install -U -r requirements_dev.txt

To add new Python dependencies:

  1. Activate the virtual environment: source venv/bin/activate
  2. Install any new packages: pip install <package>
  3. Capture the new requirements: pip freeze > requirements_dev.txt
  4. Commit the changes and push to GitHub
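
Since pip freeze rewrites the whole file, it can help to review what actually changed before committing. Below is a minimal sketch that compares two versions of a requirements file; the function names are hypothetical, and it assumes plain `name==version` lines as produced by pip freeze (lines in other forms, such as `name @ url`, are compared as opaque strings).

```python
def parse_requirements(text):
    """Map package name -> pinned version for `name==version` lines."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and pip options such as "-e ." or "-r other.txt".
        if not line or line.startswith("#") or line.startswith("-"):
            continue
        name, separator, version = line.partition("==")
        pins[name.strip().lower()] = version.strip() if separator else None
    return pins


def diff_requirements(old_text, new_text):
    """Summarise added, removed, and re-pinned packages between two files."""
    old, new = parse_requirements(old_text), parse_requirements(new_text)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(n for n in set(old) & set(new) if old[n] != new[n]),
    }


if __name__ == "__main__":
    before = "numpy==1.26.0\npandas==2.1.0\n"
    after = "numpy==1.26.4\npandas==2.1.0\nrequests==2.31.0\n"
    print(diff_requirements(before, after))
```

An unexpectedly long "added" or "changed" list usually means the freeze was run in an environment that contains packages unrelated to the notebooks.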