Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update our EDD process documentation #166

Merged
merged 21 commits into from
Sep 26, 2023
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
1ae24e1
Initial pass at updating our EDD database change processes
joseph-flinn Aug 4, 2023
a7db292
Fix file name of new image
joseph-flinn Aug 4, 2023
e0cd9d8
Switch from 'rerunnable' to 'repeatable'
joseph-flinn Aug 4, 2023
9df35ad
Push the quick fixes from feedback
joseph-flinn Aug 8, 2023
09e4213
Removed repeated use of Fowler's name as well as the repeated use of EDD
joseph-flinn Aug 9, 2023
4717bea
Fix the image caption
joseph-flinn Aug 9, 2023
e00c9a8
Removing all added personal pronouns
joseph-flinn Aug 9, 2023
2a11473
Use markdown text styling
joseph-flinn Aug 9, 2023
a37c363
Rename the application code version in the Phase definitions to be mo…
joseph-flinn Aug 9, 2023
29ff1b7
Update terminology definitions
joseph-flinn Aug 9, 2023
bba1df6
Update language to be more focused
joseph-flinn Aug 9, 2023
0b2da50
Accepted introduction summary improvements
joseph-flinn Aug 10, 2023
cc1150e
Accepted suggested changes.
joseph-flinn Aug 10, 2023
4fedfc9
Accept revised defenition and examples of non-destructive changes
joseph-flinn Aug 10, 2023
9ea294d
Update docs/contributing/database-migrations/edd.mdx
joseph-flinn Aug 16, 2023
818cecf
addtional updates
joseph-flinn Aug 18, 2023
602f9ca
revert to JSX to fix the broken image link
joseph-flinn Aug 18, 2023
46b5f6f
Merge branch 'master' into update-db-migrations-docs
Hinton Aug 22, 2023
46dcc12
Merge branch 'master' into update-db-migrations-docs
joseph-flinn Sep 14, 2023
e4f5218
Merge branch 'master' into update-db-migrations-docs
joseph-flinn Sep 21, 2023
f420880
Tweak the EDD doc (#184)
Hinton Sep 21, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
168 changes: 125 additions & 43 deletions docs/contributing/database-migrations/edd.mdx
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This document should be written with the aim to give a high level overview of how Evolutionary database design works.

Developer focused documentation on how to write migrations should be in either the MSSQL or EF files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What portion of the article does not line up with the high level overview of EDD?

Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,7 @@ EDD describes a process where the database schema is continuously updated while
compatibility with older releases by using database transition phases.

In short the Database Schema for the Bitwarden Server **must** support the previous release of the
server. The database migrations will be performed before the code deployment, and in the event of a
release rollback the database schema will **not** be updated.
joseph-flinn marked this conversation as resolved.
Show resolved Hide resolved
server at any given time.

<bitwarden>

Expand All @@ -24,22 +23,72 @@ For background on this decision please see the [Evolutionary Database Design RFD

## Design

### Nullable
Martin Fowler's EDD defines two types of database changes: destructive and non-destructive
joseph-flinn marked this conversation as resolved.
Show resolved Hide resolved
([EDD -Transition Phase: All database changes are database refactors](https://www.martinfowler.com/articles/evodb.html#TransitionPhase)).
A destructive change is any database change that requires an accompanying code change to continue
working as expected. A non-destructive change is the opposite: a database change that does not
require a code change to allow the application to continue working as expected.

Database tables, views and stored procedures should almost always use either nullable fields or have
a default value. Since this will allow stored procedures to omit columns, which is a requirement
when running both old and new code.
### Non-destructive Database Changes
joseph-flinn marked this conversation as resolved.
Show resolved Hide resolved

### EDD Process
An example of a non-destructive change is almost always using either nullable fields or default
values in database tables, views, and stored procedures. We have adopted this as a standard for any
such changes. This will allow stored procedures to omit columns, which is a requirement when running
both old and new code.

The EDD breaks up each database migration into three phases. _Start_, _Transition_ and _End_.
### Destructive Changes

In our current release process where our database changes and our code changes are coupled, even a
new column can even be considered a destructive change if the default value of the column is a
non-constant value that needs to be computed from elsewhere.
joseph-flinn marked this conversation as resolved.
Show resolved Hide resolved

Martin Fowler's explanation of how to elegantly handle destructive database changes in an EDD way
breaks up such a change into three phases: _Start_, _Transition_ and _End_.

![Refactoring Stages](./stages_refactoring.jpg)
joseph-flinn marked this conversation as resolved.
Show resolved Hide resolved
[https://www.martinfowler.com/articles/evodb.html#TransitionPhase](https://www.martinfowler.com/articles/evodb.html#TransitionPhase)

This necessitates two different database migrations. The first migration adds new content and is
backwards compatible with the existing code. The second migration removes content and is not
backwards compatible with that same code prior to the first migration.
We tweak the terminology to be more easily understandable in how EDD relates to our deployment
processes in both our environments: our always-on application in the cloud and our self-host
deployments. We use the terms: _Initial_ Phase (instead of _Start_), _Transition_ Phase, and
_Finalization_ Phase (instead of _End_).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm always hesitant to diverge from established naming but I wonder if start here more accurately refer to the beginning of the development work, not deployment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. I have updated the terminology to better align with the diagram and not redefine the established naming. The added "types" of migrations is to help overlay EDD on the deployment process and to clearly communicate between Development and *Ops the expectations that come with those migrations.


#### Initial Phase

- <u>
joseph-flinn marked this conversation as resolved.
Show resolved Hide resolved
Compatible with <i>previous</i> <b>and</b> <i>next</i> application code changes
</u>
- Represents the beginning of a database change
- Updates our database schema to support any new functionality while also maintaining old
functionality
- Supports both the previous version of code and the one being upgraded to
- Run during upgrade
- Must execute quickly to minimize downtime.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Drop the period here and in the next list.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Define quickly. Do you mean non-locking, or few operations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With k8s deployments, the goal is to get an end-to-end automated deploy finished within ten minutes. DB migrations are run in a serial manner. So any DB changes that put this initial target at risk would be considered "not quick enough". Any schema changes that put this target at risk should be flagged for more in depth discussion to see if it can be moved to during the Transition Phase.


#### Transition Phase

- <u>
Compatible with <i>previous</i> <b>and</b> <i>next</i> application code changes
</u>
- The time between initial migration and finalization
- Exists to provide an opportunity to rollback server to _previous_ version prior to breaking
changes
- Only data population migrations may be run at this time, if they are needed
- Optional step, required only when migrating data would be too slow to execute during the initial
migration. This might be a column population, index creation, anything to prepare our database
for the _next_ version
- Must be run as a background task during our Transition phase.
- These MUST run in a way where the database stays responsive during the full migration
- Schema changes are NOT to be run during this phase.

#### Finalization Phase

- <u>
Only compatible with <i>next</i> application code; represents the point of no return for this
migration
</u>
- Removes columns, data, and fallback code required to support _previous_ version
- Should be run as a typical migration either during a subsequent upgrade

### Example

Expand Down Expand Up @@ -73,7 +122,7 @@ actions.
:::

<Tabs>
<TabItem value="first" label="First Migration" default>
<TabItem value="first" label="Initial Migration" default>

```sql
-- Add Column
Expand Down Expand Up @@ -120,7 +169,7 @@ END
```

</TabItem>
<TabItem value="data" label="Data Migration">
<TabItem value="data" label="Transition Migration">

```sql
UPDATE [dbo].Customer SET
Expand All @@ -129,7 +178,7 @@ WHERE FirstName IS NULL
```

</TabItem>
<TabItem value="second" label="Second Migration">
<TabItem value="second" label="Finalization Migration">

```sql
-- Remove Column
Expand Down Expand Up @@ -173,49 +222,82 @@ END
</TabItem>
</Tabs>

## Workflow
## Our EDD Process

There are some unique constraints to Bitwarden that are not addressed directly in Martin Fowler's
EDD article.

- Our Production instances in the cloud are required to be on at all times
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ When possible I think it's a good idea to eliminate personal pronouns; there's a lot of "our" and "we" in this document.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed all of the personal pronouns that I added. I did not update the documented process that already existed.

- Our self-host instances that we do not directly have access to manage must support the same EDD
processes; however, they do not have the same always-on application constraint
- Minimization of manual steps in our process
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are in my opinion all considered in the article.

  1. That's the purpose of EDD.
  2. Self-hosted does not really need to support EDD, the current deployment mechanism turns off the server,r runs migrations and starts the new code instances. I.e. there is no transition phase.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the language a bit to focus more on the implementation of orchestrating EDD in our environment rather than EDD itself.

After reading through Fowler's article, I came away with the impression that the purpose of EDD is not to support an always-on DB/application. He went so far as to say that it would take a whole separate article to examine the technicalities to orchestrate EDD in such an environment. I think it is important to keep the always-on requirement for orchestration in mind.

I see this a different way. Exiting the Transition Phase is not done until the Finalization migrations have been run in the next update. If every release of server has destructive DB changes, I would say that the self-host deployment will constantly switch from one Transition Phase to the next.


The process to support all of these constraints is a complex one. Below is an image of a state
machine that will hopefully help visualize the process and what it supports. It assumes that all
database changes follow the standards that are laid out in [Migrations](./).

The Bitwarden specific workflow for writing migrations are described below.
---

### Developer
![Bitwarden EDD State Machine](./edd_state_machine.jpg) \[Open Image in a new tab for better
viewing\]

The development flow is described in [Migrations](./).
---

### Devops
### Cloud Environments
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 Is "cloud" the term we want? Or perhaps more generically "online"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with "always-on" over "online" to avoid the ambiguous "online application" possibly referring to any application on the internet. Thoughts?


#### On `rc` cut
Since we treat both schema migrations and data migrations as just migrations, the only issues that
we are solving for is orchestrating the runtime constraints on the migration. Eventually, all
migrations will end up in `DbScripts`. However, to control the timing of running _Transition_ and
associated _Finalization_ migrations, we need to keep them outside of `DbScripts` until the correct
timing.

Create a PR moving the future scripts.
In our environments with always-on applications, _Transition_ scripts must be run after the new code
has been rolled out. To execute a full deploy, we run all new migrations in `DbScripts`, roll out
the new code, and then run all _Transition_ migrations in the `DbScripts_transition` directory as
soon as all of the new code services are online. In the case of a critical failure after the new
code is rolled out, we will conduct a Rollback (see Rollbacks below). _Finalization_ migrations will
not be run until the start of the next deploy when they are moved into `DbScripts`.

- `DbScripts_future` to `DbScripts`, prefix the script with the current date, but retain the
existing date.
- `dbo_future` to `dbo`.
<bitwarden>
<li>
Create a ticket in Jira with a `Due Date` of the release date to ensure future migrations are
merged in and ready to be executed. Set the ticket that created the future migration as a
blocker.
</li>
</bitwarden>
After this deploy, to prep for the next release, all migrations in `DbScripts_transition` are moved
to `DbScripts` and then all migrations in `DbScripts_finalization` are moved to `DbScripts`,
conserving their execution order for a clean install. For our current branching strategy, PRs will
be open against `master` when `rc` is cut to prep for this release. This PR automation will also
handle renaming the migration file and updating any reference of `[dbo_future]` to `[dbo]`.
joseph-flinn marked this conversation as resolved.
Show resolved Hide resolved

#### After server release
The next deploy will pick up the newly added migrations in `DbScripts` and set the previously
repeatable _Transition_ migrations to no longer be repeatable, execute the _Finalization_
migrations, and then execute any new migrations associated with the code changes that are about to
go out.

1. Run whatever data migration scripts might be needed. (This might need to be batched and executed
until all the data has been migrated)
2. After having the server run for a while execute the future migration script to clean up the
database.
The the state of migrations in the different directories at any one time is is saved and versioned
in our Migrator Utility which supports the EDD phased migration process in both types of
environments.

### Standard Self-host Environments
joseph-flinn marked this conversation as resolved.
Show resolved Hide resolved

We need to have a similarly orchestrated process as Cloud environments here. However, we are not
constrained to having an always-on application. Our updated orchestration process for self-host will
be:

- Stop the Bitwarden stack as we do today
- Start the database
- Run all new migrations in `DbScripts` (both _Finalization_ migrations from the last deploy and any
_Initial_ migrations from the deploy currently going out)
- Run all _Transition_ migrations
- Restart the Bitwarden stack.

## Rollbacks

In the event the server release failed and needs to be rolled back, it should be as simple as just
re-deploying the previous version again. The database will **stay** in the transition phase until a
hotfix can be released, and the server can be updated.

The goal is to resolve the issue quickly and re-deploy the fixed code to minimize the time the
database stays in the transition phase. Should a feature need to be completely pulled, a new
migration needs to be written to undo the database changes and the future migration will also need
to be updated to work with the database changes. This is generally not recommended since pending
migrations (for other releases) will need to be revisited.
hotfix can be released, and the server can be updated. Once a hotfix is ready to go out, we deploy
that hotfix and rerun the _Transition_ migrations to verify that the DB is in the state that it is
required to be in.

Should a feature need to be completely pulled, a new migration needs to be written to undo the
database changes and the future migration will also need to be updated to work with the database
changes. This is generally not recommended since pending migrations (for other releases) will need
to be revisited.

## Testing

Expand Down
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 What id we sliced this up into thirds so it can be read top to bottom?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a stab at refiguring the image. Thoughts on the new format?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joseph-flinn could we add the source to this image if we need to modify it in the future?

Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
11 changes: 11 additions & 0 deletions docs/contributing/database-migrations/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -72,4 +72,15 @@ pwsh ef_migrate.ps1 [NAME_OF_MIGRATION]

This will generate the migrations, which should then be included in your PR.

### [Not Yet Implemented] Manual MSSQL Migrations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❓ Will this effectively be implemented once we publish these documents though?

Also, there's a few references in this document to "future" that you're limiting on the EDD one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. We need to implement EDD as soon as possible because we are already using EDD without the processes to support it in the new EU environments and also in the self-host environments. I am considering this out of scope for the EDD process implementation and do not have a timeline on when we will get to this. I'm hoping that this note here will avoid the issue that we are running into now with needing to scramble to add in automated processes to support all of our environment types.


There may be a need for a migration to be run outside of our normal update process. These types of
migrations should be saved for very exceptional purposes. One such reason could be an Index rebuild.

1. Write a new Migration with a prefixed current date and place it in
`src/Migrator/DbScripts_manual`
2. After it has been run against our Cloud environments and we are satisfied with the outcome,
create a PR to move it to `DbScripts`. This will enable it to be run by our Migrator processes in
self-host and clean installs of both cloud and self-host environments

[code-style-sql]: ../code-style/sql.md