Revert "Lint repo using mdformat"
This reverts commit 0d1bb07.
SorenSpicknall committed Aug 24, 2023
1 parent 0d1bb07 commit 2b0681e
Showing 89 changed files with 2,528 additions and 2,876 deletions.
1 change: 1 addition & 0 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -4,6 +4,7 @@ about: Create a report to help us improve
title: 'Bug: '
labels: ''
assignees: ''

---

**Describe the bug**
8 changes: 2 additions & 6 deletions .github/ISSUE_TEMPLATE/new-team-member.md
@@ -2,9 +2,9 @@
name: New Team Member
about: Kick off the onboarding process.
title: New Team Member - [Name]
labels: new team member
---
labels: 'new team member'

---
Name:
Role:
Reports to:
@@ -14,11 +14,9 @@ GitHub Username:
Slack Username:

**Set-up:**

- [ ] Technical Onboarding call scheduled

- [ ] Added to tools:

- [ ] Github
- [ ] Organization: Cal-ITP
- [ ] Team: warehouse-users and warehouse-contributors
@@ -28,14 +26,12 @@ Slack Username:
- [ ] Slack

- [ ] Added to meetings:

- [ ] Analyst Round Tables (Tuesday & Thursday)
- [ ] Lunch n' Learn
- [ ] All-hands
- [ ] Data & Digital Services email list

- [ ] Added to Slack channels:

- [ ] #data-analyses
- [ ] #data-office-hours
- [ ] #data
8 changes: 6 additions & 2 deletions .github/ISSUE_TEMPLATE/user-story.md
@@ -4,16 +4,20 @@ about: Submit a user story or feature request
title: ''
labels: ''
assignees: ''

---

## User story / feature request

_Please describe your need, outlining the key users, the feature being requested, and the goal that that the feature will facilitate. For example: **As a \[user or stakeholder type\], I want \[software feature\] so that \[some business value\]**_
_Please describe your need, outlining the key users, the feature being requested, and the goal that that the feature will facilitate. For example: **As a [user or stakeholder type], I want [software feature] so that [some business value]**_



### Acceptance Criteria

_Please enter something that can be verified to show that this user story is satisfied. For example: **I can join table X with table Y.** or **Column A appears in table Z in Metabase.**_

### Notes


### Notes
_Please enter any additional information that will facilitate the completion of this ticket. For example: Are there any constraints not mentioned above? Are there any alternatives you have considered?_
31 changes: 15 additions & 16 deletions .github/README.md
@@ -7,39 +7,38 @@ All CI/CD automation in this project is executed via GitHub Actions, whose workf
While we're using GCP Composer, "deployment" of Airflow consists of two parts:

1. Calling `gcloud composer environments update ...` to update the Composer environment with new (or specific versions of) packages
1. Copying the `dags` and `plugins` folders to a GCS bucket that Composer reads (this is specified in the Composer Environment)
2. Copying the `dags` and `plugins` folders to a GCS bucket that Composer reads (this is specified in the Composer Environment)
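
As a concrete illustration of those two steps (not part of the diff itself — the environment name, region, and bucket below are placeholders), the commands look roughly like:

```bash
# Step 1: update the Composer environment's Python packages (placeholder names)
gcloud composer environments update calitp-composer \
  --location us-west2 \
  --update-pypi-packages-from-file requirements.txt

# Step 2: sync DAGs and plugins into the GCS bucket the environment reads from
gsutil -m rsync -r airflow/dags gs://example-composer-bucket/dags
gsutil -m rsync -r airflow/plugins gs://example-composer-bucket/plugins
```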

## deploy-apps-maps.yml

This workflow builds a static website from the Svelte app and deploys it to Netlify.

## build-\*.yml workflows
## build-*.yml workflows

Workflows prefixed with `build-` generally lint, test, and (usually) publish either a Python package or a Docker image.

## service-\*.yml workflows
## service-*.yml workflows

Workflows prefixed with `service-` deal with Kubernetes deployments.

- `service-release-candidate.yml` creates candidate branches, using [hologit](https://github.com/JarvusInnovations/hologit) to bring in external Helm charts and remove irrelevant (i.e. non-infra) code
- `service-release-diff.yml` renders kubectl diffs on PRs targeting release branches
- `service-release-channel.yml` deploys to a given channel (i.e. environment) on updates to a release branch
* `service-release-candidate.yml` creates candidate branches, using [hologit](https://github.com/JarvusInnovations/hologit) to bring in external Helm charts and remove irrelevant (i.e. non-infra) code
* `service-release-diff.yml` renders kubectl diffs on PRs targeting release branches
* `service-release-channel.yml` deploys to a given channel (i.e. environment) on updates to a release branch

Some of these workflows use hologit or invoke. See the READMEs in [.holo](../.holo) and [ci](../ci) for documentation regarding hologit and invoke, respectively.

## GitOps

The workflows described above also define their triggers. In general, developer workflows should follow these steps.

1. Check out a feature branch
1. Put up a PR for that feature branch, targeting `main`
- `service-release-candidate` will run and create a remote branch named `candidate/<feature-branch-name`
1. Create and merge a PR from the candidate branch to `releases/test`
- `service-release-diff` will run on the PR and print the expected changes
- `service-release-channel` will run on merge (i.e. push on `releases/test`) and deploy
1. Merge the original PR
- `service-release-candidate` will then update the remote `candidates/main` branch
1. Create and merge a PR from `candidates/main` to `releases/prod`
- `service-release-channel` will run and deploy to `prod` this time
2. Put up a PR for that feature branch, targeting `main`
* `service-release-candidate` will run and create a remote branch named `candidate/<feature-branch-name`
3. Create and merge a PR from the candidate branch to `releases/test`
* `service-release-diff` will run on the PR and print the expected changes
* `service-release-channel` will run on merge (i.e. push on `releases/test`) and deploy
4. Merge the original PR
* `service-release-candidate` will then update the remote `candidates/main` branch
5. Create and merge a PR from `candidates/main` to `releases/prod`
* `service-release-channel` will run and deploy to `prod` this time

Note: One alternative would be to use `candidates/main` to deploy into both `test` and `prod`. This is very possible but can be a bit annoying if GitHub is configured to delete branches on merge and the `cleanup-release-candidates` action then deletes `candidates/main` after it has been merged into `releases/test`.
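
A minimal sketch of that developer flow using git and the GitHub CLI (branch names and PR titles here are hypothetical, and any branch protection or review requirements still apply):

```bash
# 1. Check out a feature branch and open a PR against main
git checkout -b my-feature
git push -u origin my-feature
gh pr create --base main --title "feat: my change"

# 2. Once service-release-candidate has pushed candidate/my-feature,
#    open (and then merge) a PR from it into releases/test
gh pr create --base releases/test --head candidate/my-feature \
  --title "Deploy my-feature to test"
```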
9 changes: 5 additions & 4 deletions .github/pull_request_template.md
@@ -1,8 +1,9 @@
# Description

_Describe your changes and why you're making them. Please include the context, motivation, and relevant dependencies._

Resolves #\[issue\]


Resolves #[issue]

## Type of change

@@ -12,11 +13,11 @@ Resolves #\[issue\]
- [ ] Documentation

## How has this been tested?

_Include commands/logs/screenshots as relevant._

## Post-merge follow-ups


## Post-merge follow-ups
_Document any actions that must be taken post-merge to deploy or otherwise implement the changes in this PR (for example, running a full refresh of some incremental model in dbt). If these actions will take more than a few hours after the merge or if they will be completed by someone other than the PR author, please create a dedicated follow-up issue and link it here to track resolution._

- [ ] No action required
14 changes: 7 additions & 7 deletions .holo/README.md
@@ -5,10 +5,10 @@ GitHub branches in a manner that facilitates an "obvious" GitOps workflow for CI
hologit allows:

1. Building branches containing only a subset of repository contents (for example, a branch only including infra-related code)
- This action is called "projection"
1. Bringing in contents from another repository without relying on published artifacts such as Helm charts
1. Applying transformations to files as part of #1
- These transformations are called "lenses"
* This action is called "projection"
2. Bringing in contents from another repository without relying on published artifacts such as Helm charts
3. Applying transformations to files as part of #1
* These transformations are called "lenses"

In this repository, we declare one holobranch named [release-candidate](../branches/release-candidate).
By projecting this holobranch in GitHub Actions, individual "candidate" branches end up containing
@@ -17,6 +17,6 @@ repository. Then, a PR from a `candidate/<some-branch>` to `releases/<env>` (suc
relevant to infra in addition to `releases/*` branches only ever containing infra code. For example:

1. Create a [PR making an infra-related change](https://github.com/cal-itp/data-infra/pull/2828)
1. Create and merge a [PR to deploy a candidate branch to test](https://github.com/cal-itp/data-infra/pull/2829)
1. Merge the PR from #1
1. After merge, [PR to deploy the main candidate branch to prod](https://github.com/cal-itp/data-infra/pull/2832)
2. Create and merge a [PR to deploy a candidate branch to test](https://github.com/cal-itp/data-infra/pull/2829)
3. Merge the PR from #1
4. After merge, [PR to deploy the main candidate branch to prod](https://github.com/cal-itp/data-infra/pull/2832)
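
For readers unfamiliar with hologit, projecting the holobranch locally looks roughly like the following. This is a sketch assuming hologit's `git holo` subcommand is installed; exact flags and output handling may differ, so treat the hologit documentation as authoritative.

```bash
# Project the release-candidate holobranch defined under .holo/branches/;
# this prints the hash of the projected tree, which CI then pushes to a
# candidate/<branch> ref.
git holo project release-candidate
```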
24 changes: 11 additions & 13 deletions CONTRIBUTING.md
@@ -4,18 +4,16 @@ These guidelines are meant to provide a foundation for collaboration in Cal-ITP'
primarily [#data-infra](https://github.com/cal-itp/data-infra).

## Issues

- When submitting an issue, please try to use an existing template if one is appropriate
- Provide enough information and context; try to do one or more of the following:
- Include links to specific lines of code, error logs, Slack context, etc.
- Include error messages or tracebacks if relevant and short
- Connect issues to Sentry issues
* When submitting an issue, please try to use an existing template if one is appropriate
* Provide enough information and context; try to do one or more of the following:
* Include links to specific lines of code, error logs, Slack context, etc.
* Include error messages or tracebacks if relevant and short
* Connect issues to Sentry issues

## Pull Requests

- We generally use merge commits as we think they provide clarity in a PR-based workflow
- PRs should be linked to any issues that they close. [Keywords](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword) are one good way to do this
- Google provides a [How to do a code review reference](https://google.github.io/eng-practices/review/reviewer/) that reviewers may find helpful
- Use draft PRs to keep track of work without notifying reviewers, and avoid giving pre-emptive feedback on draft PRs
- Reviewers should not generally merge PRs themselves and should instead let the author merge, since authors will have the most context about merge considerations (for example, whether additional reviews are still needed, or whether any communication is needed about the impacts of the PR when it merges)
- After a PR is merged, the author has the responsibility of monitoring any subsequent CI actions for successful completions
* We generally use merge commits as we think they provide clarity in a PR-based workflow
* PRs should be linked to any issues that they close. [Keywords](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword) are one good way to do this
* Google provides a [How to do a code review reference](https://google.github.io/eng-practices/review/reviewer/) that reviewers may find helpful
* Use draft PRs to keep track of work without notifying reviewers, and avoid giving pre-emptive feedback on draft PRs
* Reviewers should not generally merge PRs themselves and should instead let the author merge, since authors will have the most context about merge considerations (for example, whether additional reviews are still needed, or whether any communication is needed about the impacts of the PR when it merges)
* After a PR is merged, the author has the responsibility of monitoring any subsequent CI actions for successful completions
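
As an aside on the issue-linking guidance above, a closing keyword can go straight into the PR body when opening it from the command line (the issue number below is purely illustrative):

```bash
# "Closes #1234" lets GitHub close the linked issue automatically on merge
gh pr create --base main --title "fix: handle empty feeds" \
  --body "Closes #1234"
```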
31 changes: 15 additions & 16 deletions README.md
@@ -6,21 +6,21 @@ Documentation for this codebase lives at [docs.calitp.org/data-infra](https://do

## Repository Structure

- [./airflow](./airflow) contains the local dev setup and source code for Airflow DAGs (i.e. ETL)
- [./ci](./ci) contains continuous integration and deployment scripts using GitHub actions.
- [./docs](./docs) builds the [docs site](https://docs.calitp.org/data-infra).
- [./kubernetes](./kubernetes) contains helm charts, scripts and more for deploying apps/services (e.g. Metabase, JupyterHub) on our kubernetes cluster.
- [./images](./images) contains images we build and deploy for use by services such as JupyterHub.
- [./services](./services) contains apps that we write and deploy to kubernetes.
- [./warehouse](./warehouse) contains our dbt project that builds and tests models in the BigQuery warehouse.
* [./airflow](./airflow) contains the local dev setup and source code for Airflow DAGs (i.e. ETL)
* [./ci](./ci) contains continuous integration and deployment scripts using GitHub actions.
* [./docs](./docs) builds the [docs site](https://docs.calitp.org/data-infra).
* [./kubernetes](./kubernetes) contains helm charts, scripts and more for deploying apps/services (e.g. Metabase, JupyterHub) on our kubernetes cluster.
* [./images](./images) contains images we build and deploy for use by services such as JupyterHub.
* [./services](./services) contains apps that we write and deploy to kubernetes.
* [./warehouse](./warehouse) contains our dbt project that builds and tests models in the BigQuery warehouse.

## Contributing

- Follow the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) standard for all commits
- Use Conventional Commit format for PR titles
- Use GitHub's *draft* status to indicate PRs that are not ready for review/merging
- Do not use GitHub's "update branch" button or merge the `main` branch back into a PR branch to update it. Instead, rebase PR branches to update them and resolve any merge conflicts.
- We use GitHub's "code owners" functionality to designate a person or group of people who are in the line of approval for changes to some parts of this repository - if one or more people are automatically tagged as reviewers by GitHub when you create a PR, an approving review from at least one of them is required to merge. This does not automatically place the PR review in somebody's list of priorities, so please reach out to a reviewer to get eyes on your PR if it's time-sensitive.
* Follow the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) standard for all commits
* Use Conventional Commit format for PR titles
* Use GitHub's *draft* status to indicate PRs that are not ready for review/merging
* Do not use GitHub's "update branch" button or merge the `main` branch back into a PR branch to update it. Instead, rebase PR branches to update them and resolve any merge conflicts.
* We use GitHub's "code owners" functionality to designate a person or group of people who are in the line of approval for changes to some parts of this repository - if one or more people are automatically tagged as reviewers by GitHub when you create a PR, an approving review from at least one of them is required to merge. This does not automatically place the PR review in somebody's list of priorities, so please reach out to a reviewer to get eyes on your PR if it's time-sensitive.
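
For example, a Conventional Commits-style message might look like this (the scope and description are hypothetical, not taken from this repository's history):

```bash
git commit -m "fix(warehouse): handle null service dates in a staging model"
```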

## Linting and type-checking

@@ -34,16 +34,15 @@ disabled in the CI run due to flakiness, but it will still lint any SQL files
you attempt to commit locally.

### mypy

We encourage mypy compliance for Python when possible, though we do not
currently run mypy on Airflow DAGs. All service and job images do pass mypy,
which runs in the GitHub Actions that build the individual images. If you are
unfamiliar with Python type hints or mypy, the following documentation links
will prove useful.

- [PEP 484](https://peps.python.org/pep-0484/), which added type hints
- [The typing module docs](https://docs.python.org/3/library/typing.html)
- [The mypy docs](https://mypy.readthedocs.io/en/stable/)
* [PEP 484](https://peps.python.org/pep-0484/), which added type hints
* [The typing module docs](https://docs.python.org/3/library/typing.html)
* [The mypy docs](https://mypy.readthedocs.io/en/stable/)

In general, it should be relatively easy to make most of our code pass mypy
since we make heavy use of Pydantic types. Some of our imported modules will
4 changes: 1 addition & 3 deletions airflow/README.md
@@ -39,13 +39,11 @@ gcloud init
```

Next, run the initial database migration which also creates a default user named `airflow`.

```shell
docker-compose run airflow db init
```

Next, start all services including the Airflow web server.

```console
docker-compose up
```
@@ -63,7 +61,7 @@ Additional reading about this setup can be found on the [Airflow Docs](https://a

### Common Issues

- `docker-compose up` exits with code 137 - Check that your docker has enough RAM (e.g. 8Gbs). See [this post](https://stackoverflow.com/questions/44533319/how-to-assign-more-memory-to-docker-container) on how to increase its resources.
* `docker-compose up` exits with code 137 - Check that your docker has enough RAM (e.g. 8Gbs). See [this post](https://stackoverflow.com/questions/44533319/how-to-assign-more-memory-to-docker-container) on how to increase its resources.

## Deploying to production

9 changes: 4 additions & 5 deletions airflow/dags/transform_warehouse/README.md
@@ -5,11 +5,10 @@ Type: [Now / Scheduled](https://docs.calitp.org/data-infra/airflow/dags-maintena
This DAG orchestrates the running of the Cal-ITP dbt project and deployment of associated artifacts like the [dbt docs site](https://dbt-docs.calitp.org/#!/overview).

This DAG has some special considerations:
* If a task fails, look carefully before assuming that clearing the task will help. If the failure was caused by a `DbtModelError`, there is an issue with the SQL or data in an individual model and clearing the task will not help until that issue is fixed.

- If a task fails, look carefully before assuming that clearing the task will help. If the failure was caused by a `DbtModelError`, there is an issue with the SQL or data in an individual model and clearing the task will not help until that issue is fixed.
* While this DAG does not have any formal dependencies on other DAGs, the data transformations within the dbt project do depend on successful upstream data capture and parsing.

- While this DAG does not have any formal dependencies on other DAGs, the data transformations within the dbt project do depend on successful upstream data capture and parsing.
* Because the tasks in this DAG involve running a large volume of SQL transformations, they risk triggering data quotas if the DAG is run multiple times in a single day.

- Because the tasks in this DAG involve running a large volume of SQL transformations, they risk triggering data quotas if the DAG is run multiple times in a single day.

- This task can be run with a `dbt_select` statement provided (use the `Trigger DAG w/ config` button (option under the "play" icon in the upper right corner when looking at an individual DAG) in the Airflow UI and provide a JSON configuration like `{"dbt_select": "<+ if you want to run parents><your_model_here><+ if you want to run children>"}` using [dbt selection syntax](https://docs.getdbt.com/reference/node-selection/syntax#specifying-resources)) to re-run a specific individual model's lineage.
* This task can be run with a `dbt_select` statement provided (use the `Trigger DAG w/ config` button (option under the "play" icon in the upper right corner when looking at an individual DAG) in the Airflow UI and provide a JSON configuration like `{"dbt_select": "<+ if you want to run parents><your_model_here><+ if you want to run children>"}` using [dbt selection syntax](https://docs.getdbt.com/reference/node-selection/syntax#specifying-resources)) to re-run a specific individual model's lineage.
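
To make the selection syntax concrete (the model name below is hypothetical), a `dbt_select` value passed via the DAG config corresponds to a dbt CLI invocation along these lines:

```bash
# Trigger-DAG-with-config JSON: {"dbt_select": "+some_model+"}
# which dbt ultimately runs roughly as:
dbt run --select "+some_model+"
```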
1 change: 0 additions & 1 deletion apps/maps/README.md
@@ -38,7 +38,6 @@ Netlify sites deployed via `netlify deploy ...` with `--alias=some-alias` and/or
The site is deployed to production on merges to main, as defined in [../../.github/workflows/deploy-apps-maps.yml](../../.github/workflows/deploy-apps-maps.yml).

You may also deploy manually with the following:

```bash
(from the apps/maps folder)
npm run build
1 change: 0 additions & 1 deletion ci/README.md
@@ -5,7 +5,6 @@ a deployment named `archiver` is configured in [the prod channel](./channels/pro
by `invoke` (see below) calling `kubectl` commands.

## invoke (aka pyinvoke)

[invoke](https://docs.pyinvoke.org/en/stable/) is a Python framework for executing subprocesses and building a CLI application.
The tasks are defined in `tasks.py` and configuration in `invoke.yaml`; config values under the top-level `calitp`
are specific to our defined tasks.
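
As a quick orientation (task names below are placeholders; see `tasks.py` for the real ones), invoke is typically used like:

```bash
pip install invoke
invoke --list      # enumerate the tasks defined in tasks.py
invoke some-task   # run a task (name is a placeholder)
```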