Why we gave up on continuous integration

Four years and three companies ago we (I’ve worked with the same core team across these transitions) ditched our continuous integration server and we haven’t gone back. We spent too much time dealing with the impedance mismatch between the CI environment and development/production. Instead we just keep our test suite short enough (it runs in less than 2 minutes) that developers run it often and “continuously”, and of course with every merge and deploy.

When we last used a CI server we were running TeamCity on an EC2 instance on Ubuntu. We found that we were spending far too much time making code run in a CI environment that matched neither our production (Scientific Linux) nor our dev environment (Mac OS). Why bother managing one more environment? (And yes, if I did do CI again, I’d make it a duplicate of production.)

Instead we work to keep our test suites down to less than 2 minutes (that’s about the limit that I’ve found acceptable). This way developers can and will run the entire suite often while coding. And our policy is to always run the entire suite for every pull request / merge.

We keep the suite running fast by segmenting code (in our case into gems and separate Rails apps), by touching the database as little as possible (mock, or build new objects and test in memory, unless you absolutely need to test the database), and by running tests in parallel (taking advantage of all four cores on a MacBook Pro). Our largest project has 8377 tests that complete in 150 seconds on my MBP. Across our 20-some projects we have nearly 30,000 tests, but there’s no reason to run those all, all the time. After all, you don’t run the test suite for your third-party libraries with every CI run, do you?
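
A minimal sketch of the “build new objects and test in memory” style, assuming plain Ruby objects. The class names (`Order`, `Invoice`, `FakeTaxRates`) are hypothetical and not from the projects mentioned above; the fake stands in for whatever collaborator would normally query the database.

```ruby
# Hypothetical domain objects: plain Ruby, no ActiveRecord, no database.
LineItem = Struct.new(:price_cents, :quantity)

class Order
  attr_reader :line_items

  def initialize(line_items)
    @line_items = line_items
  end

  # Pure calculation that only needs objects already in memory.
  def total_cents
    line_items.sum { |item| item.price_cents * item.quantity }
  end
end

# Hypothetical fake for a collaborator that would normally hit the database.
# It answers from a Hash, so the test never opens a connection.
class FakeTaxRates
  def initialize(rates)
    @rates = rates
  end

  def rate_for(region)
    @rates.fetch(region)
  end
end

class Invoice
  def initialize(order, tax_rates)
    @order = order
    @tax_rates = tax_rates
  end

  def total_with_tax_cents(region)
    rate = @tax_rates.rate_for(region)
    (@order.total_cents * (1 + rate)).round
  end
end

# Build everything in memory; nothing is saved or loaded.
order = Order.new([LineItem.new(1_000, 2), LineItem.new(250, 4)])
invoice = Invoice.new(order, FakeTaxRates.new("CA" => Rational(1, 10)))
```

Because `Invoice` receives its tax-rate source as an argument, the real database-backed object and the fake are interchangeable, and only the handful of tests that exercise the real one need a database at all.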

Why not use a hosted CI system?

When we gave up on CI, Travis-CI, CircleCI and the plethora of such were not as readily available (or we weren’t aware of them). Yes, using a hosted CI makes a lot of sense because it’s one less system for us to manage. However, then you run into problems where code doesn’t run in the CI environment or runs differently (e.g. when building native extensions in Ruby gems). Or issues where the build environment has dependencies that collide with the app dependencies (e.g. different versions of Python).

For a brief while, I acquired responsibility for deploying a project that used CircleCI for deployment to Heroku. CircleCI had unreasonable GitHub permissions: it required full access to all your repositories (across all organizations that you belong to). As this was for a side project I had a lot of problems with that because I didn’t want Circle to have access to my primary employer’s code. Yes, I could have created a new GitHub account for Circle, but that’s kind of silly. I’ve heard that they were working on fixing this, but I just tried and that still appears to be the only way it works.

But what about Continuous Deployment?

I don’t think that CI should be a deployment tool. There are a growing number of deployment tools, particularly all the new container deployment tools. Use the right tool for the job.

I think that developers should deploy to production and be responsible for deployments. But I’d be willing to try continuous deployment in the right setting.

We don’t use CD now because there’s a marketing advantage to being able to roll out a set of features together. Of course we could deploy the code and then roll out the features to customers as we make them available. In fact we do, using the rollout gem, but not very often. Instead we make a commitment that master is always ready to deploy. This means that when we merge to master it may not be deployed but we accept that it could be at any time.
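
To make the “deploy the code, then roll out the feature” idea concrete, here is a toy sketch. It is an in-memory stand-in written for illustration, not the rollout gem’s API; the `FeatureFlags` class, the `:new_dashboard` feature name, and the user ids are all hypothetical.

```ruby
require "set"

# Toy in-memory feature-flag store. Hypothetical class: this is NOT the
# rollout gem, only the same idea in miniature.
class FeatureFlags
  def initialize
    @enabled_for_all = Set.new
    @enabled_users = Hash.new { |hash, feature| hash[feature] = Set.new }
  end

  # Turn the feature on for everyone.
  def activate(feature)
    @enabled_for_all << feature
  end

  # Turn the feature on for a single user.
  def activate_user(feature, user_id)
    @enabled_users[feature] << user_id
  end

  def active?(feature, user_id)
    @enabled_for_all.include?(feature) || @enabled_users[feature].include?(user_id)
  end
end

flags = FeatureFlags.new

# The new code path ships with the deploy but stays dark until the flag flips.
def dashboard_for(flags, user_id)
  flags.active?(:new_dashboard, user_id) ? "new dashboard" : "old dashboard"
end
```

With something like this in place, merging and deploying the code is one decision and showing the feature to customers is another, which is what lets a set of features be released together after the code is already in production.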

For us, deployment is still a one-line command: cap production deploy. [This is a core tenet of a strong development practice.] Under a continuous deployment system, developers deploy with git merge feature-branch. Not really much difference, right?


One thought on “Why we gave up on continuous integration”

  1. About two years after this post my mind was changed. We moved to containers, Kubernetes, and Google Cloud and the build-and-deploy process became long enough that we needed to offload it from the developer machines. But the real thing that changed my mind was that there are many powerful security scanning services (like scanning container layers for CVEs) that we could implement as part of the CI chain. These run too long to do with each developer build (although we have fast ones like `bundler-audit` and `brakeman` built into the developer test suite). Running CI with the security scanning improved my confidence in the security of the system and made security a first-class, always-on process.
