From 72b8c34ab36baaa3e0d4c8f351720ce732c0857e Mon Sep 17 00:00:00 2001 From: Dunstan Matekenya Date: Fri, 1 Nov 2024 05:10:37 -0400 Subject: [PATCH] Updated the workflows documentation --- ...general-guide.md => folders-and-naming.md} | 7 +++++++ docs/git-workflows.md | 19 +++++++++++++------ docs/notebooks-workflows.md | 10 ++++++++++ 3 files changed, 30 insertions(+), 6 deletions(-) rename docs/{general-guide.md => folders-and-naming.md} (68%) create mode 100644 docs/notebooks-workflows.md diff --git a/docs/general-guide.md b/docs/folders-and-naming.md similarity index 68% rename from docs/general-guide.md rename to docs/folders-and-naming.md index 3f6d1c6..4c622cf 100644 --- a/docs/general-guide.md +++ b/docs/folders-and-naming.md @@ -17,6 +17,13 @@ since there are many things which can be named, here provide general guidelines. - **Underscore Vs. hyphen.** Except for cases where use of hyphen is not allowed (e.g., Python script names), all folder and file names should be separated by hyphen. For example, ```damage-assessment``` as opposed to ```DamageAssessment``` or ```Damage-Assessment```. - **Theme based naming.** As much as possible, ensure names are informative and match with topic/theme. For example, in the data folder, one can have directory for ```admin-boundaries``` +## Adding Data to Project SharePoint +We recognize that this approach may create some duplication and additional effort. However, wherever possible (if datasets aren’t too large), we require that datasets (both raw and derived) be uploaded to the project’s SharePoint. This enables other Bank staff, who are often our clients on the project, to access the data as needed. In summary, you will maintain copies of the data in the data folder on your local machine for your analysis. As outlined in the [Git workflows](/docs/git-workflows.md), this data will not be uploaded to GitHub and will remain locally stored. 
+ +## Programming Environments +- **Python virtual environments.** We recommend using ```.venv``` for virtual environments. This allows for automatic detection by tools and editors like VS Code, simplifies setup, and keeps the folder hidden in most operating systems, reducing clutter. It also promotes consistency across projects, making it easier for others to understand and navigate your setup, and it keeps the folder tree clean. +- **Environment file for secrets and credentials**. In the project folder, you will find a file ```.env.example```; rename that file to ```.env```. This is where you will keep API keys and other secrets. Again, refer to [this part](https://worldbank.github.io/template/README.html) of the documentation for details. + diff --git a/docs/git-workflows.md b/docs/git-workflows.md index 2111f51..eb569fd 100644 --- a/docs/git-workflows.md +++ b/docs/git-workflows.md @@ -1,14 +1,21 @@ # Guidelines for Git and GitHub Workflows -In this series of documents, we present what we consider best practices for executing data science projects. It’s important to note that these practices are tailored specifically to the work of the Data Lab. While they may not be universally applicable to all data science projects, we believe they remain highly valuable. +This section provides essential guidelines for using Git and GitHub effectively, ensuring a structured and collaborative workflow for all team members in a project. By following these practices (such as consistently ignoring the "data" folder to protect sensitive information, avoiding direct pushes to the main branch, creating descriptive branch names, and submitting pull requests once work on a branch is complete), we can maintain a clean, organized codebase and promote efficient collaboration. These guidelines help uphold version control best practices, streamline teamwork, and reduce the potential for errors in project repositories. 
-These documents will cover the following topics: + +## Branch Names and Other General Practices +- **Branch names**. After joining the project and cloning the repository, create a concise, descriptive branch name for your work and ensure you switch to that branch before beginning any work on your machine. +- **Update branches**. Avoid creating new update branches; instead, push your changes and resolve any conflicts directly. For instance, if bots in the repository modify your code (e.g., adjusting indentations), simply pull these changes before pushing your own updates. +- **Pull requests (PR)**. When you believe your changes are final, create a pull request and assign the project lead as the reviewer. + +## Folders and Files to Ignore +As all data science repos in the Data Lab use this template, the project repo will come with a ```.gitignore``` file prepopulated with most files and folders which need to be ignored. However, once you join the project and create your own branch, you will have to make sure that the following files and folders are being ignored. +- Data folder +- Virtual environments (```.venv```) +- Environment file (```.env```) +Feel free to add any other files (e.g., system files specific to your OS) to the ```.gitignore``` file. -1.**Folder Structure and Naming Conventions for Project Setup** -2. **Git and GitHub Workflow Standards and Guidelines** -3.**Standards for Documenting and Styling Analytical Notebooks** -4.**Guidelines for Communicating and Presenting Data Outputs.** diff --git a/docs/notebooks-workflows.md b/docs/notebooks-workflows.md new file mode 100644 index 0000000..9a7bed8 --- /dev/null +++ b/docs/notebooks-workflows.md @@ -0,0 +1,10 @@ +# Guidelines for Documenting and Styling Analytical Notebooks +This section provides best practices for structuring analytical notebooks to enhance readability. 
The guidelines include recommendations for hiding code cells to maintain a clean appearance in Jupyter Book, incorporating references where relevant, and organizing content logically to ensure clarity for readers. + +- **Structure**. In all Data Lab projects, please follow [this analytics structure](https://github.com/worldbank/sudan-poverty-monitoring/blob/main/docs/2-analytics.md). +- **Editing _toc.yml** +- **Removing/hiding cell blocks**. All notebooks will be rendered in Jupyter Book. To enhance readability, ensure code cells are hidden or removed using cell tags. In some cases, you may use the hide-input cell tag. + + + +