The problem was that I got over-enthusiastic and the blog became very long. Hence I’ve broken out the description of why a DevOps Service team is a pattern here.
I described two valid types of team:
- One that provides an appropriate set of services to Development and Operations (on-going). Described right here in this blog!
- One that runs a change programme to expedite adoption of Continuous Delivery (temporary). Described here.
DevOps Service Team
Determining which services are appropriate is at the heart of the controversy around DevOps teams. Another way of putting it is: “when is an acceptable service not also an evil functional silo?” For me, any service that a DevOps team might provide must pass the following tests (where a “yes” answer means failure):
- Test: Does it abstract Development and/or Operations from the consequences of their actions and their core responsibilities?
- Test: Does it complicate communications and the interface between Development and Operations?
Suppose a DevOps Service team runs the Continuous Integration / Continuous Delivery orchestration tooling. This does not move developers away from writing or testing code, and it doesn’t move operators away from their core concerns, so it passes test 1. These tools are also great for improving communication because they make mundane but critical information, such as ‘what code has been deployed where’, transparent, freeing up communication channels for more complex matters and thus passing test 2. CloudBees have demonstrated that people want this by offering Jenkins as a SaaS. In the organisation where I work, we also provide a similar (but more extensive) service.
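To make the ‘what code has been deployed where’ point concrete, here is a minimal sketch of the kind of record CI/CD orchestration tooling keeps and exposes for you. The environment names, application name, and versions are hypothetical; real tooling persists and displays this automatically.

```python
# Minimal sketch of a "what is deployed where" record, the mundane but
# critical information that CI/CD orchestration tooling makes transparent.
# Environment, application, and version names are hypothetical.

deployments = {}  # environment -> (application, version)

def record_deployment(environment: str, application: str, version: str) -> None:
    """Record that a given application version landed in an environment."""
    deployments[environment] = (application, version)

def whats_deployed(environment: str) -> str:
    """Answer the question without anyone having to pick up the phone."""
    if environment not in deployments:
        return f"{environment}: nothing recorded"
    app, version = deployments[environment]
    return f"{environment}: {app} {version}"

record_deployment("system-test", "webshop", "1.4.2")
record_deployment("integration-test", "webshop", "1.4.1")
print(whats_deployed("system-test"))   # system-test: webshop 1.4.2
print(whats_deployed("production"))    # production: nothing recorded
```

The value is not the code, which is trivial, but the fact that it is a shared, automated source of truth rather than a conversation that has to be repeated.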
What about hosting Version Control tools? If your organisation has problems with version control, making the tools the responsibility of someone in a DevOps team will not fail either of my tests above. GitHub make the success of this model pretty obvious and may also be an appropriate solution for you. Alternatively, for various reasons (e.g. possibly cheaper in the long term, or consistent with company Infosec requirements), an internal DevOps team could be your solution.
What about writing Automated Tests? Ok, ok, just checking you are concentrating. This immediately fails my first test. If Development doesn’t write the tests, they are abstracted from whether the code they are writing even works! DevOps teams shouldn’t write automated tests!
What about setting up Automated Build Scripts? This fails my first test. If a developer doesn’t even know how their code is going to be built for release outside their local workstation, they have been abstracted too far from writing working code. However, we live in an imperfect world, and there are times when in reality you have to balance tests 1 and 2 against test 3:
- Test: Do the Development or Operations teams have the capacity and skills to deliver these?
Thankfully, developers generally do understand how to create build scripts. But there are certainly cases, for example with many packaged products, where the process for building and deploying code is not easy to script or decouple from a manual trigger inside the IDE. Almost invariably this leads to an unceremonious end to a Continuous Delivery journey before even achieving Continuous Integration. A DevOps service with specialist skills could be appropriate for owning and solving this problem of extracting the requisite command-line build process. I also want to draw a distinction between setting up scripts and maintaining them. Once created, Developers need to own build scripts and be empowered to maintain them without an inefficient and/or asynchronous external dependency.
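The end state of “extracting the requisite command-line build process” is simply that the same command runs on a workstation and on the CI server, with no IDE button in the loop. A small sketch, assuming a Maven-built Java product (the tool, project name, and flags are illustrative, not a prescription):

```python
# Sketch of an IDE-triggered build extracted into a repeatable command
# line that CI can run headlessly. Maven and its flags are illustrative
# assumptions; the point is that developer and CI run the same command.
from typing import List

def headless_build_command(project_dir: str, skip_tests: bool = False) -> List[str]:
    """Compose the command a CI server runs instead of an IDE build button."""
    cmd = ["mvn", "-B", "-f", f"{project_dir}/pom.xml", "clean", "package"]
    if skip_tests:
        cmd.append("-DskipTests")  # acceptable while bootstrapping, not long-term
    return cmd

# A developer (or the CI server) would run exactly the same thing, e.g.:
# subprocess.run(headless_build_command("webshop"), check=True)
print(" ".join(headless_build_command("webshop")))
```

Once this exists, the specialist’s job is done and the script transfers to the developers, consistent with the setting-up versus maintaining distinction above.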
As a side point on specialist skills: these can be much easier to develop within, or attract to, a dedicated DevOps service team. Trying to persuade some developers that they should be interested in scripts, or in any code that won’t feature prominently in production, isn’t easy. Trying to persuade operators that they need to write more scripts is usually an easier sell, but giving them a dedicated focus can help a lot. If you do hire specialists, you also need to be able to protect them. I’ve seen a number of examples of specialists hired to work on DevOps processes getting quickly and irreversibly dragged into normal Development or Operations work (i.e. helping with what those teams typically achieve – see the sections above!).
DevOps Service Team: Development and Test environments
What about creating and supporting Development and Test environments? By this I mean ownership of non-production environments where code is almost continually refreshed and tested, and which may be under constant use by tens if not *gulp* hundreds of people. If environments experience unplanned outages, or even just slow code deployments, this costs the organisation a lot of wasted productivity (not to mention morale). Since this type of activity is usually on the critical path for releasing software, lost non-production environment time will usually also mean delayed releases.
To an extent, creating and supporting non-production environments with a DevOps team fails test 1. The further an Operations team is from the environments, the later they will find out about new changes that will impact production. It also fails test 2, as the classic “it works on my development machine” response to things not working in production now has an evil new breed of mutant cousins: “it works in the System Test environment”, “it works in the Integration Test environment”, etc.
But despite this apparent “new” problem created above, in my experience it usually already exists, because I rarely see Operations successfully owning all of the lower environments. Instead they will be owned by Development (or specifically “Testing”). This not only distances Operations from the environments but usually breeds frustration within Development. Wherever Development have “control”, things appear to “work” (or at least get fixed according to their priorities), but wherever in the life cycle Operations take over, there is carnage. The later that is, the worse the effect. In fact, you get a problem with a lot of parallels to Martin Fowler’s infamous software integration phase.
Even when Operations do own creating and supporting Development and Test environments, this often doesn’t go well because test 3 fails. Operations reduce the burden of supporting non-production environments (aka “optimising” efficiency) by handing full access to the environments over to Development. This quickly breeds snowflakes; Operations are once again abstracted from the details of non-production environments, and trouble is stored up until the first environment that they keep locked down.
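The “snowflake” problem is, at root, configuration drift: an environment no longer matches the reference it was built from. A toy sketch of how a dedicated team might surface that drift instead of discovering it at the locked-down environment (all keys and values here are hypothetical examples):

```python
# Toy sketch of detecting "snowflake" drift: compare an environment's
# actual configuration against the reference it was built from.
# Configuration keys and values are hypothetical.

def config_drift(reference: dict, actual: dict) -> dict:
    """Return settings where the environment differs from the reference,
    mapped to (reference_value, actual_value); missing keys are flagged."""
    drift = {}
    for key in reference.keys() | actual.keys():
        ref_val = reference.get(key, "<missing>")
        act_val = actual.get(key, "<missing>")
        if ref_val != act_val:
            drift[key] = (ref_val, act_val)
    return drift

reference = {"java_version": "8", "heap_mb": 2048, "debug_port": "closed"}
system_test = {"java_version": "8", "heap_mb": 4096, "debug_port": "8000"}

print(config_drift(reference, system_test))
```

Real configuration-management tooling does this continuously; the point is that someone has to own the reference and look at the report, which is exactly the kind of responsibility that falls between Development and Operations.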
Another challenge with Operations managing non-Production environments is that they achieve the perception of overall service “stability” by playing a numbers game, i.e. prioritising effort based on perceived level of impact. This equation rarely favours anything non-production, and Operations teams always have a solid gold excuse for neglecting development and test environments and tools: “the Prod Card”. Playing the Prod Card grants impunity from blame despite almost any amount of neglect of non-Production. The only way to protect non-Production is by identifying people who are dedicated to it, and these could form a shared DevOps service.
Yet another challenge with Operations managing non-Production environments could be that Operations don’t yet exist. It’s not unusual on projects that the operator has either not yet been appointed or, if internal, not yet been commissioned to work on the project. I’m not arguing that this is desirable, but in those cases having a separate service to manage things can be more effective than having nothing at all.
All in all, non-production environment management is often a big problem that can be solved by a dedicated team, provided they acknowledge where they stand in relation to the three tests and manage their shortcomings accordingly, and provided they are aware of the impending ‘integration phase’ when transitioning environments to Operations.
A well designed DevOps Service Team is a Pattern!