👍 First of all: Thank you for taking the time to contribute!
The following is a set of guidelines for contributing to the nmdc_notebooks repo. This guide is aimed primarily at the developers of the notebooks and this repo, although anyone is welcome to contribute.
- Code of Conduct
- Guidelines for Contributions and Requests
- Best practices
- Adding new notebooks
- Dependency Management
The NMDC team strives to create a welcoming environment for editors, users and other contributors.
Please carefully read NMDC's Code of Conduct.
Please use the Issue Tracker to report problems or suggest enhancements for the notebooks. Issues should be focused and actionable (a PR could close an issue). Complex issues should be broken down into simpler issues where possible.
Please review GitHub's overview article, "Tracking Your Work with Issues".
See Pull Requests for all pull requests. Every pull request should be associated with an issue.
Please review GitHub's article, "About Pull Requests", and make your changes on a new branch.
We also recommend reading "GitHub Pull Requests: 10 Tips to Know".
- Read "About Issues" and "About Pull Requests"
- Issues should be focused and actionable
- Bugs should be reported with a clear description of the problem and steps to reproduce. If bugs are found within a notebook, please include the link to the notebook in the issue and the specific cell that is causing the issue.
- Complex issues should be broken down into simpler issues where possible
- Pull Requests (PRs) should be atomic and aim to close a single issue
- PRs should reference issues following standard conventions (e.g. “Fixes #123”)
- Never work on the main branch; always work on an issue/feature branch
- Core developers can work on branches off origin rather than forks
- If possible create a draft or work-in-progress PR on a branch to maximize transparency of what you are doing
- PRs should be reviewed and merged in a timely fashion
- In the case of git conflicts, the contributor should try to resolve the conflict
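As an illustration of the "Fixes #123" convention above, here is a minimal sketch of how a PR description could be checked for an issue reference. The function name `referenced_issues` is hypothetical and not part of this repo. The keyword list follows GitHub's documented closing keywords, and the matching is deliberately simplified (it only handles same-repo references of the form `keyword #number`).

```python
import re

# GitHub's documented closing keywords; matching is case-insensitive.
CLOSING_KEYWORDS = (
    "close", "closes", "closed",
    "fix", "fixes", "fixed",
    "resolve", "resolves", "resolved",
)

_PATTERN = re.compile(
    r"\b(?:" + "|".join(CLOSING_KEYWORDS) + r")\s+#(\d+)\b",
    re.IGNORECASE,
)


def referenced_issues(pr_description: str) -> list[int]:
    """Return the issue numbers a PR description would close (hypothetical helper)."""
    return [int(number) for number in _PATTERN.findall(pr_description)]
```

A description such as "Adds the plot, fixes #123" yields one issue number, while "Related to #123" yields none, which is a hint that the PR is not yet tied to an issue it closes.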
To add a new notebook to this repository:
- Create a folder in the base directory
  - Name the folder with a short version of the analysis/question that will be explored.
  - Use `snake_case` for the folder name.
- Create a `README.md` in the folder outlining the analysis or question.
- Create a sub-folder for each language that will be demonstrated
  - e.g. one subfolder named `R` and one subfolder named `python`
- In each language sub-folder, instantiate a Jupyter Notebook coded in the corresponding language, or create a .Rmd and convert it to a Jupyter Notebook. Several methods for this exist and none are perfect, but this open source method currently works.
- Run the entire notebook to ensure it is working as expected and save the rendered notebook in the folder.
- Update the `README.md` in the folder to include links to the rendered notebook (using nbviewer and Google Colab).
- Add the notebooks to the appropriate GitHub workflow to ensure they are included in the continuous integration process. See the `.github/workflows` folder for existing workflows (one for the R notebooks and one for the Python notebooks). Add the new notebook to the end of the list of notebooks in the workflow file.
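The folder layout described in the first steps above can be sketched as a small script. This is only an illustration under stated assumptions: `to_snake_case` and `scaffold_notebook_folder` are hypothetical helpers that do not exist in this repo, and the README placeholder text is invented for the example.

```python
import re
from pathlib import Path


def to_snake_case(title: str) -> str:
    """Turn a short analysis title into a snake_case folder name."""
    words = re.findall(r"[A-Za-z0-9]+", title)
    return "_".join(word.lower() for word in words)


def scaffold_notebook_folder(repo_root: Path, title: str,
                             languages=("R", "python")) -> Path:
    """Create <repo_root>/<snake_case title>/ with a README.md and one
    sub-folder per language (hypothetical helper)."""
    folder = Path(repo_root) / to_snake_case(title)
    # exist_ok=False: refuse to overwrite an analysis folder that already exists.
    folder.mkdir(parents=True, exist_ok=False)
    (folder / "README.md").write_text(
        f"# {title}\n\nDescribe the analysis or question here.\n",
        encoding="utf-8",
    )
    for language in languages:
        (folder / language).mkdir()
    return folder
```

For example, `scaffold_notebook_folder(Path("."), "Soil pH by depth")` would create `soil_ph_by_depth/` containing `README.md`, `R/` and `python/`. Creating the notebooks, rendering them, and registering them in the workflow file remain manual steps.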
For R, this project uses renv for package management. We maintain a libraries.R mini script that calls the necessary packages for the project for local development, because renv does not track library calls within .ipynb files (only .R or .Rmd files).
- Clone the GitHub repository
- Open the R project
- Run `renv::restore()` to make sure your packages match. To learn more about how renv works, see this resource.
To add a new package:
- Install the library with `renv::install("package_name")`.
- Add the package to the libraries.R file so that `renv` can track it (e.g. `library("package_name")`).
- Run `renv::snapshot()` to update the lockfile.
- Commit changes and push to GitHub.
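Because renv does not see library calls inside .ipynb files, it can be easy to forget the libraries.R step. The following is a rough sketch, not an existing tool in this repo, of how one might list packages that a notebook loads but libraries.R does not. The function names are hypothetical, and the regular expression is an assumption that only catches plain `library(pkg)` / `require(pkg)` calls, not `pkg::function` usage.

```python
import json
import re
from pathlib import Path

# Simplified pattern: library(pkg), library("pkg"), require(pkg, ...).
_LIBRARY_CALL = re.compile(
    r"""\b(?:library|require)\(\s*["']?([A-Za-z][A-Za-z0-9._]*)["']?\s*[,)]"""
)


def packages_in_notebook(notebook_path) -> set[str]:
    """Collect package names loaded in the code cells of an .ipynb file."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    packages = set()
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # ignore markdown and raw cells
        source = cell.get("source", "")
        if isinstance(source, list):  # nbformat stores source as a list of lines
            source = "".join(source)
        packages.update(_LIBRARY_CALL.findall(source))
    return packages


def packages_in_script(script_path) -> set[str]:
    """Collect package names loaded in an R script such as libraries.R."""
    text = Path(script_path).read_text(encoding="utf-8")
    return set(_LIBRARY_CALL.findall(text))


def missing_from_libraries(notebook_path, libraries_path) -> set[str]:
    """Packages the notebook loads that libraries.R does not mention."""
    return packages_in_notebook(notebook_path) - packages_in_script(libraries_path)
```

Any name returned by `missing_from_libraries` is a candidate for a new `library("package_name")` line in libraries.R, followed by `renv::snapshot()`.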
For Python, this project uses pip paired with venv to manage dependencies. Note that requirements_dev.txt should be used and updated for local development dependencies, and requirements.txt should be used for production/Binder dependencies (updated manually and with discretion).
- Clone the GitHub repository
- Create a virtual environment: `python -m venv venv`
- Activate the virtual environment: `source venv/bin/activate`
- Install the necessary packages: `pip install -r requirements_dev.txt`

Note: to update your package installations, run `pip install -U -r requirements_dev.txt`
To add a new package:
- Activate the virtual environment: `source venv/bin/activate`
- Install any new packages: `pip install <package>`
- Capture the new requirements: `pip freeze > requirements_dev.txt`
- Push changes to GitHub
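Before pushing, it can help to confirm that the active virtual environment actually matches requirements_dev.txt. The sketch below is one possible way to do that with the standard library; `parse_pins` and `out_of_sync` are hypothetical names, and the parsing is an assumption that only handles the exact `name==version` pins that `pip freeze` writes (package names are compared in lowercase only, without full name normalization).

```python
from importlib import metadata


def parse_pins(requirements_text: str) -> dict[str, str]:
    """Parse 'name==version' lines, as written by `pip freeze`, into a dict."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blank lines and anything that is not an exact pin
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def out_of_sync(pins: dict[str, str], installed=None) -> dict[str, tuple]:
    """Return {package: (pinned_version, installed_version_or_None)} for
    every pin that the environment does not satisfy exactly."""
    if installed is None:
        # Default: inspect the packages installed in the running interpreter.
        installed = {}
        for dist in metadata.distributions():
            name = dist.metadata.get("Name")
            if name:
                installed[name.lower()] = dist.version
    return {
        name: (version, installed.get(name))
        for name, version in pins.items()
        if installed.get(name) != version
    }
```

Run inside the activated environment, `out_of_sync(parse_pins(open("requirements_dev.txt").read()))` returns an empty dict when every pinned package is installed at the pinned version.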