GitHub Pull Request Workflows


All of our code in Cesium is reviewed before being merged into master. Since Cesium is on GitHub, we, of course, use pull requests for code reviews and merging. Other than unit tests and having contributors that strive to be craftsmen, code reviews are the biggest reason for our code quality.

We have experimented with the granularity of pull requests to strike a balance between reviewability, integration headaches, and community noise. There isn’t a one-size-fits-all workflow; some are better than others in different situations and many workflows can be combined. Here I’ll explain the workflows we’ve used for better or worse, starting with the two extremes.

Coarse-grained pull requests

Say we have a major feature to implement like streaming terrain, 3D models, or KML. One approach is to branch master and then work on the feature for several months, continuously merging master into the branch. When the pull request comes, given its size, it can take weeks to review, revise, and finally merge. We’d rather have pull requests that we can merge in hours or days. Also, keeping code, especially low-level code, on a branch for several months can cause integration headaches as changes are made in master without taking into account the changes on the branch.

Reviewers can also get tired and, for example, skimp on reviewing the tests or looking at intense math in any detail. On the flipside, these pull requests are very unlikely to become victim to bikeshedding. In addition, pull requests of this granularity have a high signal-to-noise ratio compared to fine-grained pull requests when looking at all open pull requests.

In Cesium, we use coarse-grained pull requests sparingly. The Cartesian functions pull request, which is +1.5K/-2K lines of code, is a good example of a coarse-grained pull request that needs to be cohesive. Given its size and broad changes, it still took a few weeks to merge.

Fine-grained pull requests

On the other end of the extreme are fine-grained pull requests. These may change a few files, a single function, or even be a one-liner. Fine-grained pull requests can be merged quickly and minimize integration headaches, but they create a lot of noise and can fall victim to bikeshedding as reviewers look for something to say. However, we are reasonably good at not bikeshedding.

In Cesium, we like fine-grained pull requests for bug fixes, small cohesive changes, and new contributors. A good example is the bounding sphere scale pull request, which fixes a bug and adds tests for it.

We always encourage new contributors to open a pull requests for a small part of their work before moving forward. This can turn a fine-grained pull request into a larger, longer-lived pull request for continuous feedback.

Two-part pull requests

For major features, to reduce the noise of fine-grained pull requests and to allow early continuous feedback, we have used two-part pull requests. We branch from master, which becomes the master for a feature, and then branch off that for smaller features. For example, we created a branch for a new geometry engine, and then branched off that for individual geometry types before merging the entire geometry branch into master.

This avoids the downsides of reviewing a coarse-grained pull requests and the noise of fine-grained pull requests going into master. The only pull request that goes into master is a final coarse-grained pull request of code that has already been reviewed. Despite their size, these pull requests can be merged quickly. A small focused group reviews the smaller pull requests and the larger community reviews the final pull request.

The drawback of a two-part pull request is the branch is still not merged into master for quite a while leaving the possibility of integration headaches.

In Cesium, we use two-part pull requests for major features especially when multiple contributors, new contributors in particular, are working on them. Of course, this can also be done with forks in addition to branches. A good example in Cesium is the geometry and appearances pull request, which contained +16K/-4.5K lines of code, but was merged in a few days since it contained code that was already reviewed in more fine-grained pull requests like the ellipsoid geometry (+292/-133). We also used GitHub task lists to organize our work in a GitHub issue.

Feature subset pull requests

Even if a major feature takes months to develop, a lot of the code will be ready for master before the feature is finished. For example, if we are working on 3D models, we may add new functions to our matrix types or new debugging aids to our renderer. These can go into master without the entire feature.

In this workflow, we branch from master for our main feature development, and then make another branch off master for the subset we want to open a pull request for. This subset can be developed in its own branch and then merged into the feature branch and master, or the subset can be developed in the feature branch and git cherry-pick can be used to move its commits to the subset branch for merging into master.

This reduces our risk of integration headaches since more of our code gets into master sooner, and this reduces the size of the final pull request making it easier to review. We risk that the subset that goes into master will need additional, possibly breaking, changes or that it turns out that the subset wasn’t needed at it.

In Cesium, we use subset pull requests quite a bit especially if we have a critical fix or other branches need a subset of our feature-in-progress. A good example is the EXT_frag_depth pull request (+66/-5) which was a subset of the moon pull request (+763/-20).

Closing

Of course, if we can break a feature into multiple parts, we can create several medium-sized pull requests over time. For example, 3D models may be divided into pull requests for geometry and shading, animation, and optimizations.

To minimize integration headaches in all workflows, we merge master into our branch or branches often, even several times a day. This also allows users to experiment with features in branches and still have an up-to-date version of the rest of the code. This is actually quite common in Cesium.

It’s also important that we use pull requests in the first place. In Cesium, we do not commit directly to master except for the most trivial changes like fixing a doc typo or revising CHANGES.md.

Finally, in Cesium, we strive to merge pull requests quickly since it minimizes our potential for integration headaches and minimizes the task switching a contributor has to do between revising pull requests and writing new code. However, reviewing pull requests takes a significant amount of time and can prohibit our most experienced contributors, who generally review the most, from writing new code so we try to strike a balance between dropping everything to review a new pull request and being heads-down coding.


Back to top

blog comments powered by Disqus
Fork Cesium on GitHub