How RevenueCat’s SDK team uses Release Trains
How we automate the releases of our SDKs
Cesar de la Vega
At RevenueCat, we currently support multiple platforms (Apple and Android) and multiple stores (Google Play Store, Apple App Store, Amazon Appstore, and Stripe). We support an array of multi-platform frameworks, too: Flutter, React Native, Cordova, and Unity.
In total, we have six SDKs that we actively work on and maintain. When we add a new feature or fix a bug, we need to release updates for all the SDKs, so all developers using RevenueCat get access to the new code.
Some of the common code for the multi-platform SDKs lives in a library we call purchases-hybrid-common, which we have a whole article about. Our multi-platform SDKs depend on this library, which, in turn, has a dependency on our native iOS and Android SDKs, so the multi-platform SDKs transitively get the native iOS and Android libraries through it.
Let’s imagine for a moment that we wanted to roll out a new feature across all our libraries. In our old system, we would have needed to do the following:
- Make the changes to the iOS and Android codebases.
- Create a new release of both iOS and Android.
- Make a new release of purchases-hybrid-common, increasing the dependency on the native libraries, and make any code changes if necessary. There are both iOS and Android versions of this library.
- Update every hybrid multi-platform SDK to depend on the newer version of purchases-hybrid-common, and make any necessary changes and release versions of these SDKs. This would mean four releases: Flutter, Cordova, React Native, and Unity.
As you can see, this process involves a lot of steps and can be costly in terms of engineering resources. That was especially the case because we had been performing most of the process manually.
We were only automating the deployment process for iOS and Android — we were preparing the releases manually, and the deployments for the multi-platform SDKs were not really automated. Most of our automations were just some local scripts that we would run to prepare the pull requests for new releases, and not much was being performed in CI.
To create those scripts, we had been using a tool called fastlane, which helps with app automation by offering a variety of already-made scripts that automate common tasks performed during development and deployment. It is written in Ruby, and it’s easy to extend. We use fastlane for tasks like updating version numbers in multiple files, updating dependencies, creating distribution packages, opening automatic pull requests, etc.
This approach worked for a while, but as the team grew and the development speed of the SDKs accelerated, we started to face multiple challenges:
- Making a release was slow and error-prone.
- Making a release meant following a long sequence of steps, which was hard to get right, especially for new teammates. It also required a development setup on a work computer, as opposed to being able to deploy from a personal computer, tablet, or phone.
- Since releases take time, it’s hard to predict when the next release will occur, making it hard for the support team to set expectations for developers about when something will be released.
- Bug fixes would take days, weeks, or even months to get to hybrid SDKs.
- We want our SDKs to have meaningful changelogs, so our customers know exactly what the difference is between versions. Until now, changelogs for each release had to be compiled manually, a time-consuming and error-prone process.
- CI code and scripts were duplicated everywhere (e.g., fastlane and CircleCI config files).
- We had some issues with our Git flow. For example, we were creating releases on the main branch by freezing our main development branch until the release was out.
We noticed that there were two flows to improve: updating the hybrid (multi-platform) SDKs to depend on the latest versions of the native SDKs, and automating the releases of all SDKs. We needed to create release trains that automatically make all these updates and releases for us.
Always be shipping is one of our company values, and we take it very seriously. To achieve that goal, we realized we had to do all of the following:
- Speed up our release process by moving the flow of preparing a new release to CI.
- Completely move deployments to CI while ensuring flexibility to make a new release locally if necessary.
- Automatically prepare releases for all our SDKs weekly.
- Share as much as we could of the code used for making releases across all SDKs. For this, we decided to create a common fastlane plugin and use a common CircleCI orb and common Dangerfile. This common code should be unit tested if possible.
- Automatically generate changelogs.
- Never automate major releases, since they break developers’ code.
- Add some sort of human approval to finish a release, so we don’t unintentionally release code we don’t want to go out.
Since we want to make automated weekly releases, we had to pick a day of the week to perform them. We chose Wednesday because it falls in the middle of the week. The timeline of the automatic process we planned looks like the following:
- Every Wednesday at 17:00 UTC, a new pipeline is triggered in CircleCI. We configured this using Scheduled Pipelines.
- The next version number is then determined automatically. If the automation determines that the next version is a major release, it should skip the rest of the automation since we don’t want to make major releases automatically.
- The new version number is set in all places that require an update (Podspec, Gradle files, etc.).
- The changelog since the latest release should also be automatically created.
- All these changes should be part of a new pull request opened by a CircleCI GitHub bot.
Some of these steps are very interesting to automate. Increasing the version number is easy: It’s just a text replacement in certain files. However, how would we automatically determine what the next version should be? And how do we divide the changelog into sections, so we can give more importance to bug fixes and new features? How do we skip some versions if there have only been changes that don’t have to be released?
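To illustrate the "just a text replacement" step, here is a minimal Python sketch. It is not our fastlane code: the helper name `replace_version` and the sample file contents are hypothetical, and a real lane would read and write the actual Podspec and Gradle files.

```python
import re


def replace_version(text: str, old: str, new: str) -> str:
    """Replace whole-token occurrences of the `old` version string with `new`."""
    # The lookarounds keep "4.9.0" from matching inside "14.9.0" or "4.9.01".
    # Assumption: version strings are never directly followed by a period.
    pattern = re.compile(r"(?<![\d.])" + re.escape(old) + r"(?![\d.])")
    return pattern.sub(new, text)


# Hypothetical file contents, for illustration only.
podspec = 's.version = "4.9.0"\n'
gradle_properties = "VERSION_NAME=4.9.0\n"

for contents in (podspec, gradle_properties):
    print(replace_version(contents, "4.9.0", "4.10.0"), end="")
```

Anchoring the match on token boundaries matters more than it looks: a plain string replacement could silently rewrite an unrelated dependency version that happens to contain the same digits.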
The solution here is to be able to categorize each change (pull request) that is merged to the main branch of the repository.
One of the most common ways of categorizing changes is to add a label to the commit message of each commit. This is how semantic-release, a popular npm versioning and publishing package, solves the categorizing problem. It follows the Angular commit message convention, in which each commit is categorized by adding a label and a scope [like fix(pencil), feat(pencil), or perf(pencil)] at the beginning of the commit message. A commit can be marked as a breaking change by adding BREAKING CHANGE: to the commit footer.
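For readers unfamiliar with the convention, the following Python sketch shows roughly how a tool could read such a commit message. It is a simplified illustration, not semantic-release's actual parser; `parse_commit` and `ParsedCommit` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class ParsedCommit(NamedTuple):
    kind: str              # e.g. "fix", "feat", "perf"
    scope: Optional[str]   # e.g. "pencil"; None when no scope is given
    subject: str
    breaking: bool


# Header shape: type(scope): subject, where the scope is optional.
HEADER = re.compile(r"^(?P<kind>\w+)(?:\((?P<scope>[^)]+)\))?: (?P<subject>.+)$")


def parse_commit(message: str) -> Optional[ParsedCommit]:
    """Return the parsed commit, or None if the header ignores the convention."""
    lines = message.splitlines()
    if not lines:
        return None
    match = HEADER.match(lines[0])
    if match is None:
        return None
    # A breaking change is flagged by a "BREAKING CHANGE:" line after the header.
    breaking = any(line.startswith("BREAKING CHANGE:") for line in lines[1:])
    return ParsedCommit(match["kind"], match["scope"], match["subject"], breaking)


commit = parse_commit(
    "fix(pencil): stop graphite breaking\n\nBREAKING CHANGE: the pencil API changed"
)
print(commit)
```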
We can’t use this approach because it focuses on commits and introduces unwanted friction for external contributors, who would be forced to use the convention. We have a policy of using the squash and merge option when merging a pull request to the main branch. Squash and merge creates a single new commit containing all the changes of each merged pull request. If we were to follow what semantic-release suggests, we would have to manually edit the squashed commit message on every pull request merge.
Sadly, there’s no way to enforce updating the message and footer of the squashed commit, and we didn’t want to move away from squash and merge since we like how the history looks using this option.
We needed a way to categorize changes that didn’t involve commit messages, and the solution we found was GitHub labels. A set of labels can be created and then applied to pull requests or issues on GitHub.
Inspired by the Angular convention, we created a set of labels that categorize all pull requests:
| Label | Description |
|---|---|
| breaking | Changes that are breaking |
| build | Changes that affect the build system |
| ci | Changes to our CI configuration files and scripts |
| feat | A new feature |
| fix | A bug fix |
| perf | A code change that improves performance |
| refactor | A code change that neither fixes a bug nor adds a feature |
| style | Changes that don’t affect the meaning of the code (white-space, formatting, missing semicolons, etc.) |
| test | Adding missing tests or correcting existing tests |
| next_release | Preparing a new release |
We use breaking to determine when a release should be a major release because it contains breaking changes. It is the only label that can’t be used by itself: It needs to be used together with another label.
Requiring pull request labels
We need to make sure that all pull requests are labeled; otherwise, it’s very difficult to automate the whole process. This is where Danger comes into play. Danger is a tool that is meant to be run on CI during the code review process to automate common code review chores.
Danger works by creating a Dangerfile with your own checks in your repository. Danger will run on every commit and add a comment to the pull request if any of the checks is not successful.
In our case, we wanted to write a check to make sure that pull requests were correctly labeled. We created a new repository to host this Dangerfile, so it could be shared across all our repositories. The contents of our Dangerfile are public and can be found here. It’s pretty simple: It just checks the labels using Danger’s APIs and displays a warning in case the PR is not properly labeled.
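The real check is written against Danger's APIs in the shared Dangerfile. The Python sketch below only illustrates the rule it enforces, under the assumption that a pull request needs at least one category label and that breaking never counts on its own. The function name `label_warning` and the warning texts are hypothetical.

```python
from typing import Iterable, Optional

# Category labels from the table above; "breaking" is handled separately
# because it must always accompany one of these.
CATEGORY_LABELS = {
    "build", "ci", "feat", "fix", "perf",
    "refactor", "style", "test", "next_release",
}
BREAKING = "breaking"


def label_warning(labels: Iterable[str]) -> Optional[str]:
    """Return a warning message for a mislabeled PR, or None if the labels are fine."""
    present = set(labels)
    if present & CATEGORY_LABELS:
        return None
    if BREAKING in present:
        return "The 'breaking' label must be combined with another label."
    return "This PR is missing one of the required labels."


print(label_warning(["breaking"]))
print(label_warning(["breaking", "feat"]))
```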
This is how the pull request comment looks in the event of failure:
Determining the next version
Now that all pull requests are properly labeled, we can automatically determine what the next version should be. We follow semantic versioning, so we just need to determine if the next version is a major, minor, or patch release. The logic is pretty simple:
- Get the latest non-prerelease release in the repository.
- Get all commits in the main branch since the latest release. We squash commits, so each commit is linked to a pull request that has been merged into the main branch.
- Iterate over all of the commits and check the labels of the associated pull requests; this can be done using GitHub APIs, which we access through fastlane.
- Determine the next version number based on what changed:
- If any of the pull requests included a breaking change, the next version will be a major release. For example, the next major release after version 0.0.5 is 1.0.0.
- If any of the pull requests included a non-breaking feature, the next version will be a minor release. The next minor release of version 0.0.5 is 0.1.0.
- In any other case, the next release will be a patch. The next patch release of version 0.0.5 is 0.0.6.
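The decision rule above can be sketched in a few lines of Python. This is an illustration of the logic rather than our fastlane plugin's code: `next_version` is a hypothetical name, and the sketch assumes the labels of each merged pull request have already been fetched from GitHub.

```python
from typing import Iterable, Set


def next_version(current: str, pr_labels: Iterable[Set[str]]) -> str:
    """Compute the next semantic version from the labels of the merged PRs."""
    major, minor, patch = (int(part) for part in current.split("."))
    # Flatten the per-PR label sets: one breaking or feat PR is enough.
    labels = set().union(*pr_labels)
    if "breaking" in labels:
        return f"{major + 1}.0.0"
    if "feat" in labels:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


merged = [{"fix"}, {"feat"}, {"ci"}]
print(next_version("0.0.5", merged))
```

Because a major result means the automation should stop, a caller could compare the major component before and after and skip the release pull request when it changes.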
Another area of focus for automation is changelogs. This is an example of one of our changelogs:
We show the changelog in GitHub releases, and in a CHANGELOG.md file in each repository. As you can see, we like to categorize the changes, so it’s clear what’s a bugfix and what’s a new feature.
Using the same labels that determine the next version, we can also determine which section of the changelog each change belongs to. For now, we only have four sections in our changelog: breaking changes, new features, bug fixes, and other changes.
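As a rough Python sketch of that grouping: the four section names come from the text above, while the Markdown layout, the precedence between labels, and the names `section_for` and `build_changelog` are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

SECTIONS = ["Breaking Changes", "New Features", "Bug Fixes", "Other Changes"]


def section_for(labels: Set[str]) -> str:
    """Pick the changelog section for one pull request based on its labels."""
    if "breaking" in labels:
        return "Breaking Changes"
    if "feat" in labels:
        return "New Features"
    if "fix" in labels:
        return "Bug Fixes"
    return "Other Changes"


def build_changelog(prs: Iterable[Tuple[str, Set[str]]]) -> str:
    """Group (title, labels) pairs into sections, omitting empty sections."""
    grouped: Dict[str, List[str]] = {name: [] for name in SECTIONS}
    for title, labels in prs:
        grouped[section_for(labels)].append(title)
    lines: List[str] = []
    for name in SECTIONS:
        if grouped[name]:
            lines.append(f"### {name}")
            lines.extend(f"* {title}" for title in grouped[name])
    return "\n".join(lines)


print(build_changelog([
    ("Fix crash on restore", {"fix"}),
    ("Add new offering API", {"feat"}),
    ("Update CI image", {"ci"}),
]))
```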
Updating purchases-hybrid-common to the latest iOS and Android SDKs
Every Monday, using CircleCI’s scheduled pipelines, a new pull request is opened, updating purchases-hybrid-common to depend on the latest purchases-android and purchases-ios native SDKs.
Leveraging GitHub APIs wrapped by fastlane, we can easily detect the latest stable release of both the iOS and Android native SDKs, increase the dependency versions accordingly, and open a new pull request with the update. We don’t open any automatic pull requests for major releases or prereleases.
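The selection rule can be sketched in Python as follows. The sketch assumes the release tags have already been fetched from GitHub and that stable tags look like `4.10.0` (optionally prefixed with `v`). The names `latest_stable` and `should_open_update_pr` are hypothetical.

```python
import re
from typing import Iterable, Optional, Tuple

Version = Tuple[int, int, int]

# Only plain MAJOR.MINOR.PATCH tags count as stable; anything with a
# suffix such as "-beta.1" or "-rc.1" is treated as a prerelease.
STABLE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def latest_stable(tags: Iterable[str]) -> Optional[Version]:
    """Return the highest stable version among the tags, or None if there is none."""
    versions = []
    for tag in tags:
        match = STABLE.match(tag)
        if match:
            versions.append((int(match[1]), int(match[2]), int(match[3])))
    return max(versions, default=None)


def should_open_update_pr(current: Version, latest: Optional[Version]) -> bool:
    """Open a PR only for newer versions that stay within the same major."""
    if latest is None or latest <= current:
        return False
    return latest[0] == current[0]


tags = ["4.9.0", "4.10.0", "4.10.1-rc.1", "5.0.0-beta.1"]
print(latest_stable(tags))
```

Comparing versions as integer tuples avoids the classic string-comparison bug where `"4.9.0"` sorts after `"4.10.0"`.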
Since we make a release of purchases-hybrid-common on every merge to the main branch, merging that automatic dependency-upgrade pull request is all it takes to start a new release of purchases-hybrid-common. This way, we guarantee that the latest bug fixes are quickly made ready to be integrated into the hybrid SDKs. If any code change is required when upgrading to the new version, we make the changes either in a separate pull request that can be released afterward as a new release or directly in the branch preparing the release.
Also, since new versions of all SDKs are released each Wednesday, it takes a week for a purchases-ios or purchases-android release to make it to the hybrid SDKs (the Flutter, Cordova, Unity, and React Native SDKs). We think a full week leaves enough time to detect important bugs early and prevent them from reaching the multi-platform SDKs, which would widen the scope of the bug. It’s our equivalent of staggered releases in apps.
That makes our timeline similar to the following:
- Wednesday: Automatic PR for Android and iOS, increasing the version number and preparing a new release with changes since the last release. Approving the job will make the release. PR is merged after a successful deploy.
- Monday: Automatic purchases-hybrid-common pull request increasing native dependencies. Merging the PR will create an automatic new release pull request. Approving the job will make the release. PR is merged after a successful deploy.
- Wednesday of the following week: Automatic PR for hybrid SDKs increasing the purchases-hybrid-common version and preparing a new release with changes since last release. Approving the job will make the release. PR is merged after a successful deploy. On this day, a new version of iOS and Android will also be deployed, and the cycle will start again.
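To make the one-week lag concrete, here is a small Python sketch that derives the Monday and following-Wednesday dates from the Wednesday of a native release. The function name `train_dates` and the dictionary keys are hypothetical, and the sketch works on calendar dates only, ignoring the 17:00 UTC trigger time.

```python
from datetime import date, timedelta

WEDNESDAY = 2  # date.weekday(): Monday is 0, so Wednesday is 2


def train_dates(native_release: date) -> dict:
    """Return the key dates of one release-train cycle, given its starting Wednesday."""
    if native_release.weekday() != WEDNESDAY:
        raise ValueError("the native release train leaves on Wednesdays")
    return {
        "native_release": native_release,
        # The following Monday: purchases-hybrid-common picks up the native SDKs.
        "hybrid_common_update": native_release + timedelta(days=5),
        # The following Wednesday: the hybrid SDKs are released.
        "hybrid_release": native_release + timedelta(days=7),
    }


for step, day in train_dates(date(2023, 3, 1)).items():
    print(step, day.isoformat(), day.strftime("%A"))
```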
Preparing release and deployment
These automated release pull requests are created off the main development branch and opened automatically, but they still need to be reviewed so we don’t deploy any change by mistake.
Someone in the team takes a look at the changes from the automated pull request and makes any required updates. The changelog sometimes will need to be updated to phrase changes in a better way.
If everything looks good, the release is approved in CircleCI (using an approval job). Approving that step will tag the release branch with the version number to be released.
Tagging a branch with a version number is our trigger to start the automatic deployment process. The CircleCI GitHub bot will continue publishing the new packages to the specific package manager and create a new release in GitHub pointing to this newly created tag.
Sharing deployment code
We created a fastlane plugin to hold all the scripts that are common to all SDKs (check it out in its GitHub repository). Not all the new release and deployment scripts are shared among all SDKs, but this approach lets us share the common code and, more importantly, lets us unit-test it, which is a huge win.
We also created a common Danger repository that we import in all SDKs’ corresponding Dangerfiles. It’s also public.
To share the CI code, we leveraged CircleCI orbs and created our own, which can be checked out in its GitHub repository. This orb can be imported in each SDK’s CircleCI config.yml file. It contains common jobs like the automatic-bump job that calls a fastlane lane to prepare the new release.
Summary of changes
To summarize and see clearly how our process has changed, this is an example of how our Flutter plugin used to be released, and how it is released now:
- We have automated releases only because we trust our sophisticated automated testing structure. We have a blog post about how we test our SDKs.
- We don’t “move fast and break things” because we provide critical infrastructure as a service.
- This is already implemented: You can take a look at our OSS SDKs to see it in practice. We store our code in GitHub, and you can check our SDK code at https://github.com/RevenueCat.
- Since we implemented these automations, our average time since the last release has dropped dramatically, and the automations have saved us a ton of developer time.