I recently had a conversation with a colleague around what Continuous Integration (CI) meant in reality – what should a development team be doing to truly say they are doing CI?
Courtesy of ThoughtWorks, the widely accepted definition of CI is “a development practice that requires developers to integrate code into a shared repository several times a day. Each check-in is then verified by an automated build, allowing teams to detect problems early.”
This is a good start, but my opinion is that this leaves quite a lot open to interpretation. I also think that what I would consider “true CI” can be quite difficult to achieve, especially with older legacy development practices such as long-lived branching. So, for these types of projects where do you start with CI?
First of all, what’s the benefit?
Hopefully nothing. But if you are practicing CI, “hopefully” can be removed from the equation. From the point of committing code to your central repo, you will “know” if you have broken anything based on the outcome of the CI build. If your CI is set up correctly, this feedback should arrive within minutes, if not seconds.
If your last CI build was green and your next is red, then you can be sure that it was something in the last changeset that broke it! This means that you have a very small surface area of change to examine to find the fault. You can also be sure that it isn’t a compounded issue (for example, a change inadvertently affecting another change) as each commit or changeset is built and tested.
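To make the “small surface area” point concrete, here is a minimal sketch in Python. The `Build` record and the changeset ids are made up for illustration and do not come from any particular CI server; the idea is simply that the suspects are whatever went into the first red build after a green one.

```python
from typing import List, NamedTuple, Optional


class Build(NamedTuple):
    """One CI build (hypothetical record, not a real CI server type)."""
    changesets: List[str]  # changesets included since the previous build
    passed: bool


def suspects(history: List[Build]) -> Optional[List[str]]:
    """Return the changesets in the first red build that follows a green one.

    `history` is ordered oldest first. Returns None when there is no
    green-to-red transition.
    """
    for previous, current in zip(history, history[1:]):
        if previous.passed and not current.passed:
            return current.changesets
    return None


# Building on every commit: exactly one changeset to examine.
per_commit = [Build(["a1"], True), Build(["b2"], True), Build(["c3"], False)]

# Building once a day: every changeset from that day is a suspect.
nightly = [Build(["a1", "b2"], True), Build(["c3", "d4", "e5"], False)]
```

With `per_commit` the function returns a single changeset; with `nightly` it returns three, and any of them (or a combination) could be the culprit.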
Your CI build will, of course, be running your unit and integration tests. This is great, as you will know in a few minutes if your solution compiles, if all the logic is working as expected (from your unit test results), and if components are interacting correctly (from your integration tests). Your CI server, if correctly configured, will alert the team immediately if anything fails. No news is good news, as they say.
When CI is taken seriously by a team, high-quality practices, and high-quality code as a result, tend to emerge. Smaller and more frequent iterative activities help teams to improve more quickly (you don’t need to wait for a retrospective to improve things!).
To be maintainable, the CI process will have to be as automated as possible, or it won’t be fast enough and reliable enough to provide any benefit. There is more to CI, but for anyone getting started, here are the first steps:
The above is a basic CI process. It is not a mature or full CI. But, before we go any further, one “feature” that I see a lot in older projects (with old-style project management) is long-lived branches.
Why is this a problem? Because if you have multiple branches, even if you are running a build on every commit to the repository, you are not building your final code until you merge. At some point two (or more) sets of code are going to be merged together, and it is only at that point that the code is truly integrated.
Let’s take this simple example:
A standard (but old) branching strategy – with a master, and feature branches. On the bottom feature branch, a developer has extended the file myfuncs.cs, and their build is green. They have also merged it back into master, and the master build is green. On the top feature branch, a developer has moved the contents of myfuncs.cs elsewhere and deleted the file. When they commit to their branch, their build is green. When they merge to master, the fun begins. Before the merge, everything looks great: green builds on all code bases. There is no indication that the merge will cause an issue. This means each developer has a false sense of security and confidence in their code base. When the second branch is merged, there will immediately be a conflict. Even once the conflict is resolved, the only way of verifying that nothing is broken is to re-run the build. The question is then: what value is the previous green build result offering? It is limited to the scope of the branch, and the final merge is the only point at which the code is truly integrated.
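The scenario above can be modelled in a few lines of Python. This is only a toy, file-level model of a three-way merge (real version control tools work on file contents, not whole files), and the file name `helpers.cs` and the content strings are invented for the illustration.

```python
from typing import Dict, List

# A snapshot maps file name -> content. A missing key means the file
# does not exist in that snapshot.
Snapshot = Dict[str, str]


def conflicting_files(base: Snapshot, ours: Snapshot, theirs: Snapshot) -> List[str]:
    """List files that both sides changed, differently, relative to base."""
    conflicts = []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o != b and t != b and o != t:
            conflicts.append(name)
    return conflicts


# Common ancestor of both feature branches.
base = {"myfuncs.cs": "original functions"}

# Master after the bottom feature branch (which extended the file) was merged.
master = {"myfuncs.cs": "original functions + new function"}

# Top feature branch: contents moved to another file, myfuncs.cs deleted.
top_feature = {"helpers.cs": "original functions"}
```

Each snapshot is perfectly consistent on its own, which is why every build was green. Only the comparison of all three, which happens at merge time, reveals that `myfuncs.cs` was extended on one side and deleted on the other.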
The example above is simple, but imagine the same with branches that are days or weeks old with many commits. Even if the merge is done very carefully the likelihood is that there will be a breaking conflict. This means you are only performing a full integration on merge, not on every commit. This is not CI, this is “SI” (sometimes integration).
Understand the impact of your branching strategy on achieving CI. The goal would be a no-branching strategy; but the reality is that this takes maturity, and practices such as feature toggling, which not all teams will be able to adopt overnight. Branching is also commonly used with a gated master to manage code reviews and maintain a stable code base, and sometimes having a gated master is the only option and may be dictated by policy in your company.
The next best thing to no branching is to keep branches as short-lived and as few as possible. Is it really necessary to create a branch for that bit of work, or can you hide it easily? You may want to reduce the size of your tasks or user stories, and introduce a rule of one user story per branch to help limit the life of your branches. Parallel work streams are another common reason for multiple branches. To reduce the number of work streams, introduce pair or mob programming, with several developers working on the same feature at the same workstation. It may surprise you how much this can increase the speed and quality of development. It can also remove the need for a separate code review, as code is under constant review throughout development.
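If you want to keep an eye on branch lifetimes, a small report can help. The sketch below is an assumption-laden example rather than a recommendation of specific limits: it takes a mapping of branch name to the Unix timestamp at which the branch diverged from master (how you obtain that is up to your tooling) and lists the branches that have lived longer than a chosen number of days. The branch names are hypothetical.

```python
import time
from typing import Dict, List, Optional

SECONDS_PER_DAY = 86400


def long_lived_branches(
    branched_at: Dict[str, float],
    max_age_days: float,
    now: Optional[float] = None,
) -> List[str]:
    """Return the names of branches older than max_age_days, oldest first.

    branched_at maps branch name -> Unix timestamp of the point where the
    branch diverged from master.
    """
    now = time.time() if now is None else now
    limit = max_age_days * SECONDS_PER_DAY
    too_old = [name for name, started in branched_at.items() if now - started > limit]
    return sorted(too_old, key=lambda name: branched_at[name])


# Hypothetical example data, expressed relative to a fixed "now".
NOW = 1_700_000_000
example = {
    "feature/search": NOW - 1 * SECONDS_PER_DAY,    # one day old
    "feature/reports": NOW - 12 * SECONDS_PER_DAY,  # twelve days old
    "feature/rewrite": NOW - 40 * SECONDS_PER_DAY,  # forty days old
}
```

With Git, one way to get the divergence timestamp is to take the commit printed by `git merge-base master <branch>` and read its committer date with `git show -s --format=%ct <commit>`.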
One major factor in successful CI is automation. You need a CI server. There are plenty of these available (VSTS, TFS, TeamCity, Jenkins, GoCD, Bamboo… the list goes on). You will need to set up a CI build, which can contain any steps necessary to produce a runnable, distributable version of your software. At the bare minimum, it should compile your code and run your unit and integration tests. Time is of the essence, so you want this process to be as quick as possible. Minutes is OK, seconds is better.
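Whatever CI server you pick, the build itself boils down to “run these steps in order, stop at the first failure, and keep an eye on how long it takes”. Here is a minimal sketch of that idea in Python. It is not how any of the servers above is implemented, and the step names and commands in the comment are placeholders to swap for whatever your project uses.

```python
import subprocess
import time
from typing import List, NamedTuple, Tuple


class StepResult(NamedTuple):
    name: str
    passed: bool
    seconds: float


def run_pipeline(steps: List[Tuple[str, str]]) -> List[StepResult]:
    """Run each (name, shell command) in order, stopping at the first failure."""
    results = []
    for name, command in steps:
        started = time.monotonic()
        completed = subprocess.run(command, shell=True)
        elapsed = time.monotonic() - started
        results.append(StepResult(name, completed.returncode == 0, elapsed))
        if completed.returncode != 0:
            break  # fail fast: later steps would only delay the feedback
    return results


def is_green(results: List[StepResult]) -> bool:
    """The build is green only if every step that ran passed."""
    return all(result.passed for result in results)


# Placeholder steps for a .NET project; substitute your own commands:
# steps = [("compile", "dotnet build"), ("tests", "dotnet test")]
```

Recording the time per step is deliberate: when the build starts creeping past a few minutes, the timings tell you where to look first.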
I recently spent some time working with a team on their CI build. They had a monolithic application made up of more than 110 .NET projects, all dependent on one another, which had to be built together. They also had a requirement to record technical debt via SonarQube, and to check their third-party dependencies for vulnerabilities using OWASP Dependency-Check. Their CI build was taking around 60 minutes: far too long to provide any real value, as nobody waited for the build to go green before making the next commit, so broken builds became difficult to diagnose. Working with this team, I helped them to reduce the build to the 10-minute mark, while keeping all the elements of their existing build. If your build is slow, take the time to speed it up, and it will provide a high return on investment.
Each CI server is different, so I won’t tell you how to set up a CI build on each one – however I have previously written a blog post here with an example of a simple VSTS/TFS CI build.
It is important that your CI server alerts your team to the status of the build on completion, especially in the case of a failure. The team then works to fix the build before anything else is committed, ensuring that the code stays in a buildable, releasable state all the time.
So, at this point we have talked about the reasons for doing CI, and what your tooling will look like. There is one other very important factor: the development team.
The team will probably need to make some changes to the way they are used to working to get the full benefit of CI. As with any successful change, the team must commit to it to make it work for them. Some of the practices that you may need to introduce are things like:
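Most CI servers have this kind of alerting built in, so you would normally configure it rather than write it. Still, the rule behind it is simple enough to sketch. The function name and message wording below are my own invention, purely to show the decision being made:

```python
from typing import Optional


def alert_for(previous_passed: Optional[bool], current_passed: bool, changeset: str) -> Optional[str]:
    """Decide what, if anything, to tell the team after a build completes.

    previous_passed is None when there is no earlier build to compare with.
    """
    if not current_passed:
        if previous_passed is False:
            return f"Build still RED at {changeset}: hold further commits until it is fixed"
        return f"Build BROKEN by {changeset}: fix before anything else is committed"
    if previous_passed is False:
        return f"Build FIXED by {changeset}"
    return None  # green after green: no news is good news
```

Failures always produce a message, a recovery produces one so the team knows it is safe to commit again, and a run of green builds stays quiet.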
For most teams, these practices become second nature eventually. An information radiator with the build status on it is a good start. Pair programming can help motivate developers to stick to the rules they’ve set themselves. If your team is struggling with discipline around the process, be honest and work together to get better – maybe appoint one or two “build watchers”. Ensure your alerts are visible enough – if nobody checks their inbox, there are plenty of other gadgets out there that could help – e.g. DevOps UFOs, or just grab an old traffic light – whatever works. Your scrum master will probably be willing to help you out and guide you if you head off-track. Once you have your CI process in place, don’t be afraid to change it. If something isn’t working, or you think you can do it better, give it a go.
So far I’ve talked about a basic CI process. For some teams (especially those working with brownfield projects), getting this in place will be challenge enough. But once you’ve got there, what’s next?
CI is meant to provide fast feedback on the state of your code, and confidence in your codebase. So far, the basic process we’ve talked about has taken you from your repository to a set of compiled binaries, and test results that should mean you have a potentially releasable product.
If your next move would be to your production environment, I strongly urge you to read Continuous Delivery by Humble and Farley, and investigate the concept of a continuous delivery pipeline (sometimes called a deployment pipeline).
So, deployment. The next step would be to modify your CI build so it produces a deployable artefact. Again, the how is down to your choice of CI server, but generally this involves outputting an artefact containing all the files needed to deploy, configure and run your application in an environment. At this point, you can either just treat the artefact as throw-away, and deploy it to a test environment where you will run your automated tests, or you can store the artefact in an artefact repository, so it can be promoted down your deployment pipeline on the success of your automated tests.
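Your CI server will have its own way of publishing artefacts, but the principle is the same everywhere: bundle the build output into one file and record which changeset it came from. As a rough sketch of that principle in Python (the manifest format, file names and function name are assumptions of mine, not a standard):

```python
import json
import zipfile
from pathlib import Path


def package_artifact(build_dir: Path, artifact_path: Path, version: str, changeset: str) -> Path:
    """Zip everything under build_dir, plus a manifest saying where it came from.

    artifact_path should live outside build_dir, otherwise the archive
    would try to include itself.
    """
    manifest = {"version": version, "changeset": changeset}
    with zipfile.ZipFile(artifact_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(build_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to build_dir so the archive unpacks cleanly.
                archive.write(path, path.relative_to(build_dir).as_posix())
        archive.writestr("manifest.json", json.dumps(manifest, indent=2))
    return artifact_path
```

Because the manifest travels inside the artefact, the same file can be promoted from environment to environment and you can always tell exactly which changeset is running where.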
Again, automation is king. The deployment and configuration to your test environment should be automated, as should your tests. If you haven’t automated this, you reduce confidence in the process. For example, is my website not starting because there is a code error, or because I copied the wrong folder? Automation increases speed, reliability and confidence in this process.
Feedback from your test environment should be automated, and again the team should be alerted as soon as anything fails so they can take steps to get the tests green again as soon as possible.
Once you’ve got CI established, hopefully you want to go even further! Continuous Integration is a subset of Continuous Delivery, so get yourself familiar with continuous delivery practices. Get your Continuous Delivery pipeline established, get comfortable with releases, automate all deployment and configuration.
Once you have confidence in your continuous delivery process, you can think about continuous deployment to production – that is, having enough confidence in your release and deployment process that you allow every commit in your source control to result in a fully automated release to your production environment. This is the ultimate goal that many of us strive for!