This is my second post in a series I’m writing about impedance to doing continuous delivery and how to overcome it. Part 1 about Infrastructure challenges can be found here. I also promised to write about complexity in continuous delivery in this earlier post about delivery pipelines.
I’m defining “a solution” as the software application, or multiple applications under current development, that need to work together (and hence be tested in an integrated manner) before release to production.
In my experience, continuous delivery pipelines work extremely well when you have a simple solution with the following convenient characteristics:
- All code and configuration is stored in one version control repository (e.g. Git)
- The full solution can be deployed all the way to production without needing to test it in conjunction with other applications / components under development
- You are using a 3rd party PaaS (treated as a black box, like Heroku, Google App Engine, or AWS Elastic Beanstalk)
- The build is quick to run, i.e. less than five minutes
- The automated tests are quick to run, i.e. minutes
- The automated test coverage is sufficient that the risks associated with releasing the software can be understood to be lower than the benefits of releasing.
The first three characteristics are what I am calling “Solution Complexity” and what I want to discuss in this post.
Here is a nice simple depiction of an application ticking all the above boxes.
Developers can make changes in one place, know that their change will be fully tested and know that when deployed into the production platform, their application should behave exactly as expected. (I’ve squashed the continuous delivery (CD) pipeline into just one box, but inside it I’d expect to see a succession of code deployments, and automated quality gates like this.)
But what about when our solution is more complex?
What about if we fail to meet the first characteristic and our code is in multiple places, possibly not all in version control? This is definitely a common problem I’ve seen, in particular for configuration and data-loading scripts. However, it isn’t particularly difficult to solve from a technical perspective (more on the people side in a future post!). Get everything managed by a version control tool like Git.
Depending on the SCM tool you use, you shouldn’t feel obliged to keep everything in one repository. If you do use multiple, most continuous integration tools (e.g. Jenkins) can be configured to run builds that consume from multiple repositories. If you are using Git, you can even handle this complexity within your version control repository, e.g. by using submodules.
What about if your solution includes multiple applications like the following?
Suddenly our beautiful pipeline metaphor is broken and we have a network of pipelines that need to converge (analogous to fan-in in electronics). This is far from a rarity; I would say it is overwhelmingly the norm. It certainly makes things more difficult, and we now have to consider carefully how our plumbing is going to work. We need to build what I call an “integrated pipeline”.
Designing an integrated pipeline is all about determining the “points of integration” (POIs), i.e. the first time that testing involves the combination of two or more components. At this point, you need to record the versions of each component so that they are kept consistent for the rest of the pipeline. If you fail to do this, the earlier quality gates in the pipeline are invalidated.
In the example below, Applications A and B have their own CD pipelines where they are deployed to independent test environments and face a succession of independent quality gates. Whenever a version of Application A or B gets to the end of its respective pipeline, instead of going into production, it moves into the integrated pipeline and creates a new integrated or composite build number. After this POI the applications progress towards production in the same pipeline and can only move in sync. In the diagram, version A4 of Application A and version B7 of Application B have made it into integration build I8. If integration build I8 makes it through the pipeline, it will be eligible to progress to production.
Depending on the tool you use for orchestration, there are different ways of achieving the above. Fundamentally it doesn’t have to be particularly complicated. You are simply aggregating version numbers, which can easily be stored together in a text file in any format you like (YAML, JSON, a POM, etc.).
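As a minimal sketch of such a composite build record (the file name, format, and function here are my own assumptions, not a prescription), pinning the component versions at the POI could look like this:

```python
import json

def create_integrated_build(integration_number, component_versions):
    """Record which component versions make up an integrated build.

    The manifest pins each component at the point of integration so
    that every later stage of the pipeline tests the same combination.
    """
    manifest = {
        "integration_build": integration_number,
        "components": component_versions,
    }
    # Persist the manifest so downstream pipeline stages can read it
    with open(f"integration-{integration_number}.json", "w") as f:
        json.dump(manifest, f, indent=2)
    return manifest

# The scenario from the diagram: build I8 pins Application A at A4 and B at B7
create_integrated_build("I8", {"application-a": "A4", "application-b": "B7"})
```

The point is that every stage after the POI reads this manifest rather than picking up “latest”, so each subsequent quality gate exercises exactly the A4 + B7 combination that was recorded.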
Some people reading this may by now be boiling up inside, ready to scream “MICROSERVICES” at their screens. Microservices are by design independently deployable services. The independence is achieved by ensuring that they expose and consume strict API contracts, so that integration with other services can be managed and components can be upgraded independently. A convention like SemVer can be adopted to manage changes to contract compatibility. For a while I had this tagged in my head as the eBay way or the Amazon way of doing things, but microservices are now gaining a lot of attention. If you are implementing microservices and achieving this independence between pipelines, that’s great. Personally, on the one microservices solution I’ve worked on so far, we still opted for an integrated pipeline that operated on an integrated build and produced predictable upgrades to production (we are looking to relax that at some point in the future).
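To make the SemVer idea concrete, here is a hedged sketch of a contract-compatibility check. The rule encoded (same major version, and the provider at least as new as the version the consumer requires) follows the common SemVer convention; the function names are my own, and a real system would also need to handle pre-release tags:

```python
def semver_parse(version):
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)

def is_compatible(provider_version, consumer_requires):
    """A provider satisfies a consumer if the major versions match
    (no breaking changes) and the provider is at least as new as
    the required version (all expected features are present)."""
    provider = semver_parse(provider_version)
    required = semver_parse(consumer_requires)
    return provider[0] == required[0] and provider >= required

# A consumer built against contract 1.2.0 can talk to a 1.4.2 provider,
# but not to a 2.0.0 provider (breaking change) or a 1.1.0 one (too old).
```

A check like this at each service boundary is what lets components flow through separate pipelines: instead of pinning versions in a composite build, each deployment verifies that the contracts it depends on are still satisfied.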
Depending on how you are implementing your automated deployment, you may have deployment automation scripts that live separately from your application code. Obviously we want to use consistent versions of these throughout deployments to the different environments in the pipeline, so I strongly advise managing these scripts as a component in the same manner.
What about if you are not using a PaaS? In my experience, this represents the vast majority of solutions I’ve worked on. If you are not deploying into a fully managed container, you have to care about the version of the environment that you are deploying into. The great thing about treating infrastructure as code (assuming you overcome the associated impedance) is that you can treat it like an application, give it a pipeline, and feed it into the integrated pipeline (probably at a very early POI). Effectively you are creating your own platform and performing continuous delivery on it. Obviously, the further your production environment is from being a versionable component like this, the greater the manual effort required to keep environments in sync.
Coming soon: more sources of impedance to doing continuous delivery: Software packages, Organisation size, Organisation structure, etc.
(Thanks to Tom Kuhlmann for the graphic symbols.)