DevOps – have fun and don’t miss the point!

DevOps is a term coined in 2009 which has since evolved into a movement of people passionate about a set of practices that enable companies to successfully manage rapidly changing IT services by bridging the gap between Development and Operations teams.

In my experience DevOps is very rewarding, so in this blog I’m going to try to bring its practical side to life. I hope my post may help a few people connect DevOps to things they are already doing and, even better, form some ideas about how to go further.

Brief Definition

The basic premise of DevOps is that most organisations with IT functions have separate teams responsible for software Development and for IT Operations (aka Service Management). The observation is that this separation of duties can have the negative side effects of inefficiency, unresponsiveness and unhappiness.

If your company is meeting its business objectives and you don’t have any sense for this pain, you’ve probably read enough!  Put the blog down, get yourself an ‘I’m rocking the DevOps’ t-shirt and leave me some comments containing your secrets so we can all learn from you!

A lot has been written that analyses the misery of Development versus Operations, so I’ll keep it brief.

The problems stem from the natural tension between Developers (who deliver changes) and Operators (who are incentivised to maximise service availability and hence to prevent changes, along with other such obvious threats). Add to the mix some wholesome but local optimisation (performed independently by each team) and you may start having ‘annual budget’ loads of completed yet unreleased features and a stable but stagnant live service. Periodically the Business teams mandate a ‘go live’, thus blowing up the dam and drowning live service in changes, some of which have actually gone quite toxic…

In the ideal world developers should be assured of a smooth, almost instantaneous transition of their code into production.  Operations staff should be assured of receiving stable and operable applications, and of having sufficient involvement in all changes that will impact service (in particular altered hosting requirements).

I think of DevOps as a campaign which we can all join to maximise the throughput of successful IT changes where we work.  I’m personally ready to trust business teams to ensure we are choosing to deliver the right changes!

Where to Find the Fun

People and processes…, blah-dee-blah…  Ok, fair enough, DevOps does revolve around people, but we all like technology, so let’s start with the fun parts and come back to people and processes later!

Typical DevOps Technical Concerns include:

  • Configuration Management
  • Application Lifecycle Management
  • Environment Management
  • Software and Infrastructure Automation.

Where I work we have a well-defined technology architecture methodology which neatly classifies these concerns as Development Architecture (aka DevArch).

Q. Does taking the above concerns very seriously mean that you are doing DevOps?
It helps, it’s a good start.  But it’s also key that the above concerns are consistently understood and are important priorities to both Development and Operations.

Caring about these concerns alone isn’t enough; time to tool them up!

The great news is that the popularity of DevOps has left the tooling space awash with excellent tools and fantastic innovators. There are too many to mention, but current favourites of mine include the orchestration tool Jenkins, the version control tool Git, the code quality framework and dashboard Sonar, the automated configuration management tool Chef, and the VM manager Vagrant.

We are also now at the point where freely available open source tools (including all the above) exceed the capability of many commercial alternatives (hasta la vista ClearCase!).

Q. Does using some or all of the above types of tool mean you are doing DevOps? 
Not necessarily, but they help a lot.  Especially if they are equally prominent to and shared by Development and Operations.

Perhaps we need to add automation – lovely stuff like automated builds, automated deployments, automated testing, automated infrastructure.
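To make “automated builds” a little more concrete, here is a minimal sketch using Gradle, whose build scripts happen to be written in Groovy (Gradle is just one of many build tools, and the JUnit version below is purely illustrative):

    // build.gradle - a minimal automated build: compile, unit test, package
    apply plugin: 'java'

    repositories {
        mavenCentral()
    }

    dependencies {
        // illustrative test dependency
        testCompile 'junit:junit:4.11'
    }

Running “gradle build” on a developer laptop or from a Jenkins Job now performs exactly the same compile-test-package sequence every time, which is the essence of build automation and the kind of repeatable step a delivery pipeline can orchestrate.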

In more good news, the understanding and appreciation of software delivery automation has gone up massively in the last few years.  Gradually this has reduced fear (and denial!) and increased management support and budgets, and also raised expectations!  I must thank misters Humble and Farley in particular for cementing a usable meaning of continuous delivery and creating the excellent delivery/deployment pipeline pattern for orchestrating automation.

Q. Does doing continuous integration and/or having a delivery pipeline mean you are doing DevOps?
It helps a lot.  But assuming (as with every pipeline I’ve worked with so far) that your pipeline doesn’t implement zero-touch changes from check-in to production, Operations will still be prominent in software releases.  And of course their involvement continues long after deployments and throughout live service.  One Development team’s automated deployment orchestrator might be one Operations team’s idea of malware!  Automation alone will certainly not eliminate the tension caused by the opposing relationships to change.

The Point (Don’t Miss It)

Ok, technology over, let’s talk directly about people and processes.  The key point is that to globally optimise your organisation’s ability to deliver successful software changes, you have to think across both Development and Operations teams.  It’s even nearly in the name: DevOps.

Development teams optimising software delivery processes in a silo (for example with Agile) will not help Operations teams accommodate shorter release cycles.

Operations teams optimising risk assessment processes (for example with ITIL) around “black box” software releases with which they have had little involvement will draw the only logical conclusion: that changes should be further inhibited.

Operations teams optimising systems administration with server virtualisation and automated configuration management (e.g. Puppet) will not help development and testing productivity if they are not understood and adopted in earlier lifecycle stages by Development teams.

Development teams optimising code deployment processes with shiny automated deployment tools achieve diminished returns and in fact increased risk if the processes are not approved for use in Production.

There is no substitute for uniting over the common goal of improving release processes. Tools and automation are a lot of fun, but pursuing them in silos will not achieve what DevOps really promises. Collaboration is paramount, don’t miss the point!

How Does Wikipedia work? Should major Enterprises use wikis for internal knowledge management?

In this blog, I’m going to highlight my support for major Enterprises using wikis for internal knowledge management, and I will also call out some things that I think need hard thought to make them what they deserve to be: Wikipedia-quality content for internal information.

As a quick aside, the concept of an internal wiki, and hence not sharing things in the public domain “open-source style”, might sound a bit evil. It’s not: it is absolutely necessary for companies to keep certain information private, and it just so happens that I believe wikis (more commonly associated with public sharing) have another use case.

Pits not to fall into (sprawl and deception)

In my view you need to consider a few things to maximise the benefit of a global Enterprise wiki.

A large project that I personally spent some time on, and that lasted quite a few years, hosted a wiki (MoinMoin as it happens, for the wikistorians out there). As you’d expect of anything that lasted so long and had such a variety of people interacting with it, we learnt a lot about what did and didn’t work.

The main problem was what I call wiki sprawl. Perhaps I can misquote Pareto (aka the 80:20 principle) here and state that over time, “80% of wiki pages end up containing inaccurate, part-duplicated, inconsistent information, whilst 20% of the pages contain 80% of the good quality information.”

Granularity of pages was a contributor to this. Working with wiki pages can be similar to working with aggregate-oriented design in the NoSQL world. What is the most common unit of information that will be accessed in one go? What should be normalised across different “aggregate types” (interlinked pages) and what should be denormalised (lumped onto one page)? Without getting this right it can be very difficult to create consistent information, and even more difficult to find it.

Another problem was page formatting. You could take the view that I’m being a bit anally retentive on this point, but genuinely, when pages were well laid out (good summary, appropriate use of sections and subsections, highlighted code blocks etc.) they were a lot more usable and it was also a lot more tempting to contribute to them for the benefit of others.

Sprawl started to take over. Anyone who created a page called, for example, UsefulLinuxCommands without first searching for and discovering the existence of a similar page called HelpfulLinuxCommands was definitely causing proliferation when they proudly part-duplicated, and perhaps even part-contradicted, the advice on the pre-existing page.

Over time the problem compounded itself. When someone comes along wanting to create a wiki page for the “greater good”, diligently searches for any pre-existing pages, and finds a list of 10 possible pages to trawl through before they can be sure their information is not already captured, I can sympathise with them creating a new page. Especially using their fancy flavour-of-the-month page formatting style.

It’s not all doom and gloom, and I can hardly suggest this is the fate of all wikis. We’re all aware of what I am assuming is the world’s largest and greatest wiki – Wikipedia.

How to avoid pitfalls (not so moderate moderation)

As we all know, the success of Wikipedia cannot be attributed to the technology. Whilst the underlying software application MediaWiki is certainly more powerful than moinmoin (RIP), it is predominantly the community of people that have turned it into possibly the best single source of information on the planet (excluding indexers to other information like Google).

So how did Wikipedia get so good?

Most people know that it is supported by thousands of moderators protecting content accuracy, legality, conformance to the site policies, and of course fighting duplication, inconsistency and sprawl. It’s also widely recognised that the moderators (Wikipedia seems to call everyone Editors) are volunteers. So how is this distributed, almost anonymous, self-organising team successful?

From some rather limited research, I’ve observed some contributing factors. But the truth is, having only updated a few pages myself, I don’t fully understand the formula.

Observations:

  • Firstly, anyone can edit a page, even anonymously. The barrier to contribution is therefore extremely low. Obviously, however, this also has implications for information quality.
  • Traceability of change. The site does its best to make it easy to see exactly what has changed, when, and by whom. This is through a combination of Watchlists, related-changes pages, the Talk communication channel, etc.
  • Editor rank. Editors start off with a low rank which limits what they can do in terms of revoking and approving changes. Through a meritocracy based on the quality and quantity of contributions, people graduate from Autoconfirmed, to Administrators, Bureaucrats, the Arbitration Committee, and finally Stewards. (The next position, “Founder”, is reserved!)
  • Some pages become protected, a process whereby higher-ranking editors (perhaps with local status achieved due to their contributions to a particular page) can control and approve edits to protect against vandalism.
  • Finally, one of my favourites, Wikipedia is policed by numerous tools or “bots”. These all extend the capacity of human editors to moderate content.

So what does Wikipedia teach Enterprises about internal wikis?

I think the following very positive points:

  • it is possible to do this at global scale
  • it is possible to do this with volunteering (or in our case people going beyond their day jobs)
  • this mechanism for structuring information to be stored in one place is so far the most powerful solution on the planet.

But we mustn’t ignore the critical success factors:

  • Updating information has to be a pleasurable experience (goodbye SharePoint)
  • Review and governance processes need to be extremely well thought out and able to evolve over time
  • Some tooling is needed to help track and moderate change
  • Perhaps a sort of “quality” critical mass is needed, beyond which good content attracts and sustains further good contributions.

I’d be interested to hear if anyone has read this book about Wikipedia and/or has experiences good and bad to share.

 

Jenkins in the Enterprise

Several months ago I attended a conference about Continuous Delivery. The highlight of the conference was a talk from Kohsuke Kawaguchi, the original creator of Jenkins. I enjoyed it so much that I decided to recapture it in this blog.

To the uninitiated, Jenkins is a Continuous Integration (CI) engine. This basically means it is a job scheduling tool designed to orchestrate automation related to all aspects of software delivery (e.g. compilation, deployment, testing).

To make this blog clear, two pieces of Jenkins terminology are worth defining upfront:

Job – everything that you trigger to do some work in Jenkins is called a Job. A Job may have multiple pre-, during, and post- steps where a step could be executing a shell script or invoking an external tool.

Build – an execution of a Job (irrespective of whether that Job actually builds code or does something else like test or deploy code). There are numerous ways to trigger a Job to create a new Build, but the most common is by polling your version control repository (e.g. Git) and automatically triggering when new changes are detected.
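To see the Job/Build relationship for yourself, here is a small system Groovy script (a sketch that assumes a standard Jenkins installation where the top-level items are ordinary Jobs) which can be pasted into the Script Console under Manage Jenkins:

    import jenkins.model.Jenkins

    // print every top-level Job and its most recent Build, if it has ever run
    Jenkins.instance.items.each { job ->
        def lastBuild = job.lastBuild
        println "Job: ${job.name} - last Build: " +
                (lastBuild ? "#${lastBuild.number} (${lastBuild.result})" : "never built")
    }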

Jenkins is to some extent the successor of an earlier tool called CruiseControl that did the same thing but was much more fiddly to configure. Jenkins is very easy to install (seriously, try it), easy to configure, and also very easy to extend with custom plugins. The result is that Jenkins is the most installed CI engine in the world, with around 64,000 tracked installations. It also has around 700 plugins available which do everything from integrating with version control systems, to executing automated testing, to posting the status of your build to IRC or Twitter.

I’ve used Jenkins on and off since 2009 and when I come back to it, I am always impressed at how far it has developed. As practices of continuous delivery have evolved, Jenkins has kept up (if not enabled experimentation and innovation), predominantly through new plugins. Hence the prospect of hearing the original creator talking about the latest set of must-have plugins was perhaps more exciting to me than I should really let on!

Kohsuke Kawaguchi’s lecture about using Jenkins for Continuous Delivery

After a punchy introduction to Jenkins, Kohsuke spent his whole lecture taking us through a list of the plugins that he considers most useful to implementing Continuous Delivery. Rather than sticking to the order in which he presented them, I’m going to describe them in my own categories: Staples that I use; Alternative Solutions to Plugins I Use; and New Ideas to Me.

NB. it is incredibly easy to find the online documentation for these plugins, so I’m not going to go crazy and hyperlink each one; instead, please just Google the plugin name and the word Jenkins.

Staples that I use

First up was the Parameterised Builds Plugin. For me this is such a staple that I didn’t even realise it is a plugin. It allows you to alter the behaviour of a Job by supplying different values to input parameters. Kohsuke likened this to passing arguments to a function in code. The alternative to using this is to have lots of similar Job definitions, all of them hard-coded for their specific purpose (bad).
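As a flavour of how a parameterised Job gets used, here is a sketch of a system Groovy build step (the parameter name TARGET_ENV is hypothetical) that reads one of the Job’s parameters, so a single Job definition can serve several environments:

    // system Groovy build step: the variable "build" is bound to the current Build
    def targetEnv = build.buildVariableResolver.resolve("TARGET_ENV")
    println "This Build was asked to deploy to: ${targetEnv}"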

Parameterised Trigger was next. This allows you to chain a sequence of Jobs in your first steps towards creating a delivery pipeline. With this plugin, the upstream Job can pass information to the downstream Job. What gets passed is usually a subset of its own parameters. If you want to pass information that is derived inside the upstream Job, you’ll need some Groovy magic… get in touch if you want help.
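For the curious, the “Groovy magic” is usually nothing more exotic than the following sketch: derive the value in a system Groovy build step, write it to a properties file, and point the Parameterized Trigger plugin’s parameters-from-properties-file option at it. The file and property names here are purely illustrative, and the snippet assumes the Build runs on the master node so that java.io.File can reach the workspace:

    // derive a value that only exists inside the upstream Build...
    def releaseLabel = "release-${build.number}-${new Date().format('yyyyMMdd')}"

    // ...and write it where the Parameterized Trigger plugin can pick it up
    def propsFile = new File(build.workspace.toString(), "downstream.properties")
    propsFile.text = "RELEASE_LABEL=${releaseLabel}\n"
    println "Wrote ${propsFile.path}"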

Arguably this plugin is also the first step towards building what might be a complex workflow of Jenkins Jobs, i.e. where steps can be executed in parallel and follow logical routing, and support goodness like restarting/skipping failed steps.

Kohsuke described a pattern of using this plugin to implement a chain of Jobs where the first Job triggers other Jobs, but does not complete until all triggered Jobs have completed. This was a new idea to me and probably something worth experimenting with.

The Build Pipeline view plugin was next. This is for me the most significant UI development that Jenkins has ever seen. It allows you to visualise a delivery pipeline as just that; if you’ve not seen it before, click here and scroll down to the screenshot. Interestingly, the plugin hadn’t had a new version published for nearly a year and this was asked about during the Q&A (EDIT: it is now under active development again!). Apparently, as can happen with Jenkins plugins, the original authors developed it to do everything they needed and moved on. A year later 3,000 more people have downloaded it and thought of their own functionality requests. It then takes a while for one of those people to decide they are going to enhance it, learn how to, and submit back. Two key features for me are:

  1. The ability to manually trigger parameterised jobs (I’ve got a Groovy solution for this, sketched after this list – get in touch if you need it) (EDIT: now included!)
  2. The ability to define end points of the pipeline so that you can create integrated pipelines for multiple code bases.
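A Groovy approach to manually triggering a parameterised Job, not necessarily identical to the solution mentioned in point 1, might look like this sketch (the Job name and parameter are hypothetical, and it assumes a system Groovy context where “build” is the current Build):

    import hudson.model.*
    import jenkins.model.Jenkins

    // schedule a parameterised downstream Job, passing it a single parameter
    def downstream = Jenkins.instance.getItemByFullName("deploy-to-test", AbstractProject)
    def params = new ParametersAction(new StringParameterValue("TARGET_ENV", "test"))
    downstream.scheduleBuild2(0, new Cause.UpstreamCause(build), params)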

The Join plugin allows you to trigger a Job upon the completion of more than one predecessor. This is essential if you want to implement a workflow where (for example) Job C waits for both A and B to complete.

This is a good plugin, but I do have a word of caution: you need to patch your Build Pipeline plugin if you want it to display this pattern correctly (ask me if you need instructions).

The Dependency Graph plugin was mentioned as not only a good way of automatically visualising the dependencies between your Jobs, but also of allowing you to create basic triggering using the JavaScript UI. This one I’ve used locally, but not tried on a project yet. It seems good, I’m just slightly nervous that it may not be compatible with all triggers. However, on reflection, using it in read-only mode would still be useful and should be low risk.

The Job Config History plugin got a mention. If you use Jenkins and don’t use this, my advice is GET IT NOW! It is extremely useful for tracking Job configuration changes. It highlights Builds that were the first to include a Job configuration change. It allows you to diff old and new versions of a Job configuration in a meaningful way directly in the Jenkins UI. It tells you who made changes and when. AND it lets you roll back to old versions of Jobs (particularly useful when you want to regress a change – perhaps to rule out your Job configuration changes as the cause of a code build failure).

Alternative Solutions to Plugins I Use

Jenkow plugin, the Build Flow plugin and the Job DSL plugin were all recommended as alternative methods of turning individual Jobs into a pipeline or workflow.
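To give a flavour of this style, the Build Flow plugin describes a whole flow as a short Groovy DSL script held in a single “flow” Job. The sketch below, with hypothetical Job names, runs two Jobs in parallel and then a third with a parameter:

    // Build Flow DSL sketch
    parallel (
        { build("compile-and-unit-test") },
        { build("static-analysis") }
    )
    build("deploy-to-test", TARGET_ENV: "test")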

Jenkow stood out in my mind for the fact that it stores the configuration in a Git repository, which is an interesting approach. I know I’m contradicting my appreciation of the Job Config History plugin here, but I’m basically just interested to see how well this works. In addition, Jenkow supports defining the workflows in BPMN, which I guess is great if you speak it and, even if not, good in that it opens up the use of many free BPMN authoring tools. All of these plugins seem to have been created to support more advanced workflows, and I think it is encouraging that people have felt the need to do this.

The only doubt in my mind is how compatible some of these will be with the Build Pipeline plugin which for me is easily the best UI in Jenkins.

New Ideas To Me

The Promoted Builds plugin allows you to manually or automatically assign different promotions to indicate the level of quality of a particular build. This will be a familiar concept to anyone who has used ClearCase UCM where you can update the promotion levels of a label. I think in the context of Jenkins this is an excellent idea and I plan to explore whether it can be used to track sign-off of manual testing activities.

Fingerprinting (storing a database of checksums for build artefacts) was something that I knew Jenkins could do, but had never looked to exploit. The idea is that you can track artefact versions used in different Jobs. This page gives a good intro.
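Under the covers a fingerprint is essentially an MD5 checksum of an artefact, recorded against every Job that produces or consumes it. Here is a rough standalone Groovy illustration of the idea (the artefact path is hypothetical; in Jenkins itself you simply enable the fingerprinting post-build option rather than writing any code):

    import java.security.MessageDigest

    // compute the MD5 checksum that identifies an artefact wherever it reappears
    def fingerprint(File artefact) {
        def md5 = MessageDigest.getInstance("MD5")
        artefact.eachByte(4096) { buffer, length -> md5.update(buffer, 0, length) }
        return md5.digest().encodeHex().toString()
    }

    println fingerprint(new File("build/libs/myapp.jar"))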

The Activiti plugin was also a big eye-opener for me. It seems to be an open source business process management (BPM) engine that supports manual tasks and has the key mission statement of being easy to use (like Jenkins). The reason this is of interest to me is that I think its support for manual processes could be a good mechanism for gluing Jenkins and continuous delivery into large existing enterprises, rather than having some tasks hidden in Jenkins and some hidden elsewhere. I’m also interested in whether this tool could support formal ITIL-esque release processes (for example CAB approvals) which are still highly unlikely to disappear in a cloud of DevOps smoke.

Book Review: The Phoenix Project

Earlier this year the “DevOps” movement hit a new milestone with the publication of the first novel on the subject (yes, as in an entertaining work of fiction).

The Phoenix Project: A Novel About IT, DevOps and Helping Your Business Win

http://www.amazon.co.uk/The-Phoenix-Project-Business-ebook/d…

If you can’t be bothered to read this whole review, then my advice is to buy it. Just don’t then blame me if you don’t like it… you should have read the whole review.

To anyone familiar with Eliyahu M. Goldratt’s “The Goal”, The Phoenix Project will feel pleasantly familiar.
http://www.amazon.co.uk/The-Goal-Process-Improvement-ebook/d…

To anyone unfamiliar with The Goal, it is basically the crusade of a middle manager faced with the challenge of turning around a failing manufacturing plant to save it from closure. He is supported by a quirky physicist advisor who uses the Socratic method to reveal how to apply scientific reasoning in place of conventional manufacturing processes and economics. Throughout the book there are lots of simple models designed to explain the principles and teach you something. It makes you feel good whilst you are reading it, but at the end you are left a little uncertain whether you’ve actually learnt anything you can apply in the real world.

Modernise the hero and swap their dysfunctional manufacturing plant for a dysfunctional IT Operations team, and you aren’t far off The Phoenix Project. In fact it is almost a sequel in The Goal series. A manufacturing plant which could easily have been from The Goal is used heavily in The Phoenix Project to highlight what manufacturing can teach IT – a great metaphor that I definitely subscribe to.

So is The Phoenix Project entertaining and do you actually learn anything?

I certainly found it highly entertaining; the observations were very sharp and definitely reminiscent of things I’ve seen. There are plenty of familiar examples of poor decisions to go too fast at the expense of quality and stability, resulting in unpredictability and mayhem. All exciting stuff to a DevOps freak.

Do you learn anything from the Phoenix Project? Perhaps mostly just through re-evaluating your own experiences. There isn’t a huge amount of detailed substance on DevOps implementation in the book and in fact, it appears to be a good plug for the author’s next book, the DevOps Cookbook:
http://www.realgenekim.me/devops-cookbook/
Really looking forward to that!!

In summary, I personally recommend reading either The Phoenix Project or The Goal, and I eagerly await the Cookbook.

A blog-alanche? It’ll pass!

I’ve been blogging for about a year now, but as I nervously started trying this thing out, I felt more comfortable doing it on internal company blog sites as opposed to writing to the whole world.

Anyway, I’ve got my courage up and I’m going public!

Expect an avalanche of blogs here as I re-master some old posts.

Don’t expect this extremely high level of literary productivity to be maintained long term.

I hope you find something interesting.

Agile, Waterfall, Common Sense, Experience

Even if you work for a large, old, traditional enterprise, it’s almost impossible now not to feel the influence of Agile, and my advice is to embrace it!

Agile:

  • creates, or lays claim to, a lot of great ideas. Some of these pre-dated it, but there is no harm at all in getting them distilled all in one place
  • is extremely popular and even mandated by many organisations
  • when misunderstood, misquoted, misinterpreted or misused, spells TROUBLE.

Therefore, I highly recommend getting up to speed, for example by reading the manifesto. But don’t stop at that; there are huge amounts of great information out there in books and blogs to discover.

I would also encourage putting some effort into learning the lingo. I don’t want to start ranting about how many times I’ve heard Agile terminology misquoted to defend some truly bad (non-Agile) practices (it’s many, many times).

However, if you are starting out dabbling in Agile within an established organisation with entrenched processes, my advice is to keep the following close to hand:

  • What Agile has to say about how to do things (and what it calls things!)
  • Your common sense
  • Your experiences

There really is no excuse for ignoring any of these (and believe me I’ve seen people do it time and time again).

In my head, there is a parallel to be drawn between a methodology like Agile and an Enterprise software package like SAP (stick with me). Both are powerful and capable of doing massive things, but “out of the box” neither will ever be as effective as when it was first created as a bespoke solution to the unique problem that prompted it. The problem you are solving (whether with a software package or a software delivery methodology) will always be unique, and hence so must be the solution.

Based on what I’ve seen over the last 4-5 years, here are the key success factors when following Agile in a project-based scenario:

Key Success Factors

1. Have a project initiation phase that does not complete until the sacrosanct high-level priorities of the sponsor and key high-level architectural decisions are agreed.
2. Adoption requires full co-operation of the Business as well as IT. It also requires all teams to be well trained and people with Agile experience to be distributed across scrum teams.
3. Ensure that your performance management process aligns closely with the expected behaviours of people performing Agile roles.
4. Invest heavily in ensuring your application is fast and predictable to release. Without this, the benefits of faster development are lost when changes hit a bottleneck of live implementation.
5. Document and publish development patterns and coding standards and ensure you continually monitor conformance.
6. Constantly strive to reduce quality feedback time, i.e. the time from a developer making a change until the identification of anything related to that change which would indicate it is unworthy of promotion to production. This involves automating as much testing as possible and optimising the execution time of your tests.
7. Allow scrum teams to estimate and plan their own sprints and track the accuracy of this process over time.
8. Ensure your programme is heavily metric-driven and expect to continually refine your methodology over time to improve effectiveness.

Here are common pitfalls:

1. Forgetting that adopting Agile affects the whole organisation and it will not succeed in silos.
2. Making the assumption that documentation is no longer necessary. If something cannot be validated through an automated test or system monitoring, it still needs to be written down.
3. Using Agile’s innate scope flexibility to the extent that the project direction is in a constant state of flux leading to delays to benefit realisation and a lost recognition of the overall business case.
4. Becoming functionality obsessed and overlooking non-functional requirements such as performance, scalability and operability.
5. Putting an overemphasis on getting “working code” at the expense of application architecture, leading to convoluted code in urgent need of refactoring to adopt patterns and standards that reduce the cost of maintenance.
6. Devoting too much focus to executing pure Agile, as opposed to tailoring an approach that retains the key benefits whilst minimising risk on a large-scale distributed programme.
7. Making the assumption that less planning is required. Agile requires highly detailed planning involving the whole team, and an iterative approach that complements and responds to the Agile delivery.
8. Focusing on increasing development/test velocity when the bottleneck lies in your test environment stability and ability to predictably deploy code.
9. Ignoring the impact on service and operations teams who are expected to support new systems at a faster rate.
10. Overlooking how to accommodate potentially slower-moving dependencies such as infrastructure provisioning.
11. Doing things based on a perception of what “Agile” tells you to do and neglecting experience and knowledge of your organisation.

However, I must concede most of these pointers are based on common sense and my experience, and most could be applied to Agile, Waterfall, or any methodology.

Please let me know your thoughts.