Branching strategy

Jonathan Van der Cruysse edited this page Mar 15, 2018 · 3 revisions

This document details a proposal for our branching strategy. It may seem somewhat complicated at first glance, but it's easy to use in practice and it's based on a tried and tested strategy that tends to work well.

Decomposing the hardware and software

We are building a client-side web application, that is, an application that is built and served to users using web technologies but is otherwise designed as a desktop application.

The server plays two distinct roles for such an application:

  1. to serve the client to users and
  2. to connect users that want to communicate with each other.

In our case, the latter role mostly concerns collaborative editing.

Note that these two roles can in fact be satisfied by two different servers: there is no dependency whatsoever between the logic that serves the application to users and the logic that handles a collaborative editing (or other communication) protocol.

This leaves us with four distinct entities:

  • the client application,
  • the server that transmits the client application to users,
  • the server that connects clients, and
  • the protocol that connects clients and the communication server.

Clearly, the server that transmits the client application to users depends on the client application—it needs to know which pages to serve—and both the client and the communication server depend on the communication protocol.

Note that the dependencies described so far are the only true dependencies between components.
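The dependencies listed above can be written down as a small graph, which makes it easy to check that nothing else has crept in. The following is a minimal sketch; the component labels are illustrative names chosen here, not actual repository names.

```python
from graphlib import TopologicalSorter

# Each component maps to the set of components it depends on directly.
# The labels are illustrative assumptions, not names used by the project.
DEPENDENCIES = {
    "protocol": set(),
    "client": {"protocol"},
    "communication-server": {"protocol"},
    "client-serving-server": {"client"},
}


def build_order(dependencies):
    """Return the components ordered so that each follows its dependencies.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


def depends_on(dependencies, component, target):
    """Report whether `component` depends on `target`, directly or transitively."""
    pending = list(dependencies[component])
    seen = set()
    while pending:
        current = pending.pop()
        if current == target:
            return True
        if current not in seen:
            seen.add(current)
            pending.extend(dependencies[current])
    return False


if __name__ == "__main__":
    print(build_order(DEPENDENCIES))
    # The two servers are independent of each other.
    print(depends_on(DEPENDENCIES, "client-serving-server", "communication-server"))
```

A check like `depends_on` could guard against the two servers ever starting to depend on each other.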

Invariants: an overview of repositories, forks and branches

To keep false inter-component dependencies from creeping into our codebase, we will create three different "main" repositories: one for the client application, one for the server that transmits the client application and one for the server that connects clients.

Each of these main repositories contains a master branch that at all times contains a release candidate that passes all automated tests. To resolve the dependency of the client-transmitting server on the client itself, the server's master branch includes the client repository as a git submodule.

Additionally, we will create a "waterfall" repository that contains both types of server via inclusion as git submodules. This repository's master branch is a release, tested and approved by quality assurance.
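As a rough illustration of how the waterfall repository could be wired up, the sketch below builds the `git submodule add` commands for both servers. The URLs and submodule paths are placeholders invented for this example; the real repository locations are not given in this document.

```python
import subprocess

# Hypothetical submodule paths and URLs, used only for illustration.
WATERFALL_SUBMODULES = {
    "client-server": "https://github.com/example/client-server.git",
    "communication-server": "https://github.com/example/communication-server.git",
}


def submodule_add_commands(submodules):
    """Build one `git submodule add <url> <path>` command per submodule."""
    return [
        ["git", "submodule", "add", url, path]
        for path, url in submodules.items()
    ]


def run_all(commands, cwd):
    """Run each command inside the repository at `cwd`, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=cwd, check=True)


if __name__ == "__main__":
    # Only print the commands; call run_all(...) inside a real checkout to apply them.
    for command in submodule_add_commands(WATERFALL_SUBMODULES):
        print(" ".join(command))
```

The same approach would apply to the client-transmitting server's repository, which includes the client as its single submodule.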

Each team member can have their own fork of every main repository. Like the main repository, these forks are hooked up to the CI server and the forks' CI configuration is the same as the main repository's. Unlike the main repository, almost all activity in these forks happens on branches that focus on a specific feature, refactor, or other change. The master branch can remain mostly untouched in team members' forks.

Workflow

  1. To create a new release, the quality assurance manager creates a release branch from the master branch of the waterfall repository, updates the server submodules and goes through the product's functionality. If no defects are found, the release branch is merged into the master branch. The CI server then automatically builds, packages, uploads and deploys the resulting version to our production server.

  2. When someone wants to merge the changes they made in a branch within their repository, they post a pull request to the master branch of the relevant repository. CI performs a hypothetical merge of the pull request, builds it and tests it. At least one person must then (or in parallel with the CI build) review the pull request, possibly requesting changes. Once the CI green-lights the merge and all reviewers approve (that is, they decide that no further changes to the pull request are warranted), either the person who posted the pull request or one of the reviewers can merge the pull request into the master branch.

  3. When two or more team members want to collaborate on the same feature, they agree on a division of labor—which should include a discussion of how each team member's part of the work depends on the other team members'—and then start working on branches in their own forks. Keeping branches in sync is relatively straightforward: any team member can merge any other team member's changes into their own fork and any team member can use the pull request mechanism to propose merging their changes into someone else's fork.
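The fork-to-fork synchronization in step 3 comes down to a few ordinary git commands. The sketch below lists them for one teammate; the remote name, fork URL and branch name are made up for the example.

```python
import shlex


def sync_commands(teammate, fork_url, branch):
    """List the git commands that merge a teammate's branch into the current branch.

    `git remote add` only needs to run once per teammate; it fails if the
    remote already exists, in which case that first command can be skipped.
    """
    return [
        ["git", "remote", "add", teammate, fork_url],
        ["git", "fetch", teammate],
        ["git", "merge", f"{teammate}/{branch}"],
    ]


if __name__ == "__main__":
    # "alice", the URL and "feature-x" are hypothetical values.
    for command in sync_commands(
        "alice", "https://github.com/alice/client.git", "feature-x"
    ):
        print(shlex.join(command))
```

Going the other way, proposing changes to someone else's fork, uses the pull request mechanism rather than a direct push.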

Enforcing the workflow

To enforce the workflow and invariants described above, we will use GitHub's protected branches feature. Specifically, it is impossible to add a commit directly to a protected branch, which in our case is the master branch. Additionally, a pull request can only be merged into a protected branch if all CI tests pass, there has been at least one code review of the pull request and all code reviews of the pull request indicate approval.

GitHub allows us to use a subset of these features, but we'll use them all.
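The merge rule for protected branches can be stated as a small predicate. This is only a model of the rule as described here, not a call into GitHub's API; the `PullRequest` class and its fields are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PullRequest:
    """Hypothetical summary of a pull request's status."""
    ci_passed: bool
    # One entry per code review; True means the reviewer approved.
    review_approvals: List[bool] = field(default_factory=list)


def can_merge(pr):
    """Apply the three conditions: CI passes, at least one review, all reviews approve."""
    has_review = len(pr.review_approvals) >= 1
    all_approve = all(pr.review_approvals)
    return pr.ci_passed and has_review and all_approve


if __name__ == "__main__":
    print(can_merge(PullRequest(ci_passed=True, review_approvals=[True])))
    print(can_merge(PullRequest(ci_passed=True, review_approvals=[])))
```

Note that the explicit `has_review` check matters: `all([])` is true in Python, so a pull request without any reviews would otherwise slip through.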

Tools

We use the following set of tools:

  • Regular (open-source) GitHub repositories.
  • Travis CI and AppVeyor CI for testing our software.
  • Codacy for automated code reviews.
  • GitHub Pages for hosting release candidates and pull request builds.
  • Dubious Spongebot, which deploys builds to GitHub Pages and comments on pull requests with a URL to the deployed build and the coverage report for that build.

Why not X?

  • Why not use Jenkins instead of Travis CI and AppVeyor CI? Travis CI and AppVeyor CI have three distinct advantages:

    • They build our software on a clean machine: we write a script that downloads all dependencies and builds our software. This script is run on every build and doubles as documentation on how to build our software. A Jenkins instance on our own server would instead build on that machine, with all dependencies already installed, so the dependencies would stay implicit and undocumented.

    • Travis CI covers both Linux and Mac OS X. AppVeyor CI covers Windows. Jenkins only covers the server's platform, which is Linux in our case.

    • Travis CI and AppVeyor CI are incredibly easy to integrate in our GitHub pull request workflow. Once integrated, we won't have to check the CI server manually: all the information we need will be aggregated in GitHub's pull request summaries.

  • Why not use a closed-source development model? There are a number of reasons to prefer an open-source approach:

    • We don't have a corporate sponsor to impose scary IP agreements, so we can just make UnSHACLed open-source if we want to.

    • No other teams are working on the same project, so there's no risk of plagiarism.

    • We can't keep our client-side source code private, not even if we keep our git repository hidden. After all, we're building a client-side web app and any user can ask their browser for the page's HTML/CSS/JavaScript source. What's the point of hiding our source code repository if anyone can just download the source code from a browser?

    • We are not in the business of selling software licenses. Indeed, that would be foolish because of the previous bullet. The most sensible way to monetize UnSHACLed is to sell it as a service: corporate users pay a monthly fee for their account and, in return, we provide them with a ubiquitous, always up-to-date application and allow users to collaborate via our server. Publishing our source code shouldn't cost us any subscribers.

    • Making UnSHACLed open source doesn't affect copyright (though licensing our source code might be a good thing). Publishing our source code does not by itself make it legal for others to offer a competing service based on our software or even to create a private copy.

    • Perhaps most importantly, using an open-source GitHub repository will give us free access to CI tools such as Travis CI, AppVeyor CI and Coveralls.

    • Also, open-source software becomes part of our resumes; we'll have something to show for all the work we'll have done. Closed-source software stays hidden.

  • Why not use UGent GitHub? Because then we can't use Travis CI/AppVeyor CI/Coveralls. Well, not for free, anyway. If the powers that be decide that we absolutely must use the UGent GitHub for their comfort, then we can easily run a script on our server to mirror our repositories from regular GitHub to UGent GitHub.

  • Why not use branches in our main repository instead of forked repositories? Using branches requires less set-up initially, but the main advantage of having multiple forks is that the person working on a branch actually owns the branch: only they can commit directly to their fork. Others can only open pull requests if they want to collaborate on a branch. This is a good thing because:

    • allowing someone to commit directly onto someone else's (feature) branch disrupts the vision of the branch's "owner" and replaces it with two people's superimposed visions of how a feature should be implemented;
    • merge conflicts are likely to arise when two people work on the exact same thing at the same time;
    • allowing others to undo/change/refactor code at any time places the branch's owner under constant scrutiny, which is bad; and
    • imposing a "fixup"/refactor onto someone else's branch denies the owner of the branch a learning opportunity and makes it more likely that most of the work will shift to a single team member.

    Using pull requests only is better because:

    • they offer a convenient way for team members to communicate and synchronize their visions for a feature's design,
    • merge conflicts are still possible, but are likely to be easier to solve because pull requests tend to represent somewhat polished nuggets of code,
    • a branch's owner gets to decide when their feature is ready for scrutiny,
    • it is the responsibility of the person sending out a pull request (to either the main repository or someone's fork) to fix defects in their proposed changes, making it less likely for a lopsided group dynamic to arise.
  • Reviewing pull requests is hard. Why not use a lower-ceremony branching strategy? Ignorance may be bliss but software quality does not improve with fewer pairs of eyes looking at it. Additionally, pull requests represent measurable milestones in the development of a feature and the code reviews they trigger force us to talk to each other about the project—in a way, requiring reviews for all pull requests bakes communication into our workflow.