Proposed Reference Architecture of a Platform Application (PaaA)

In this blog I’m going to propose a design for modelling a Platform Application as a series of generic layers.

I hope this model will be useful for anyone developing and operating a platform, in particular if they share my aspirations to treat the Platform as an Application and to:

Hold your Platform to the same engineering standards as you would (should?!) your other applications

This is my fourth blog in a series where I’ve been exploring treating our Platforms as an Application (PaaA). My idea is simple: whether you are re-using a third-party Platform Application (e.g. Cloud Foundry) or rolling your own, you should:

  • Make sure it is reproducible from version control
  • Make sure you test it before releasing changes
  • Make sure you release only known and tested and reproducible versions
  • Industrialise and build a Continuous Delivery pipeline for your platform application
  • Industrialise and build a Continuous Delivery pipeline within your platform.

As I’ve suggested, if we are treating Business Applications as Products, we should also treat our Platform Application as a Product.  With this approach in mind, clearly a Product Owner of a Business Application (e.g. a website) is not going to be particularly interested in the detail of how something like high-availability works.

A Platform Application should abstract the applications using it from many concerns which are important but not interesting to them.  

You could have a Product Owner for the whole Platform Application, but that’s a lot to think about, so I believe this reference architecture is a useful way to divide and conquer.  To further simplify things, I’ve defined this anatomy in layers, each of which abstracts the next layer from the underlying implementation.

So here it is:

PaaA_Anatomy

 

Starting from the bottom:

  • Hardware management 
    • Consists of: Hypervisor, Logical storage managers, Software defined network
    • The owner of this layer can make the call: “I’ll use this hardware”
    • Abstracts you from: the hardware and allows you to work logically with compute, storage and network resources
    • Meaning you can: within the limits of this layer (e.g. physical capacity or performance) consider hardware to be fully logical
    • Presents to the next layer: the ability to work with logical infrastructure
  • Basic Infrastructure orchestration 
    • Consists of: Cloud console and API equivalent. See this layer as described in OpenStack here
    • The owner of this layer can make the call: “I will use these APIs to interact with the Hardware Management layer.”
    • Abstracts you from: having to manually track usage levels of compute and storage, and having to monitor the hardware
    • Meaning you can: perform operations on compute and storage in bulk using an API
    • Presents to the next layer: a convenient way to programmatically make bulk updates to what logical infrastructure has been provisioned
  • Platform Infrastructure orchestration (auto-scaling, resource usage optimisation)
    • Consists of: effectively a software application built to manage creation of the required infrastructure resources. Holds the logic required for auto-scaling, auto-recovery and resource usage optimisation
    • The owner of this layer can make the call: “I need this many servers of that size, and this storage, and this network”
    • Abstracts you from: manually creating the required infrastructure at the required scale, and from changing this over time in response to demand levels
    • Meaning you can: expect that enough logical infrastructure will always be available for use
    • Presents to the next layer: the required amount of logical infrastructure resources to meet the requirements of the platform
  • Execution architecture 
    • Consists of: operating systems, containers, and middleware e.g. Web Application Server, RDBMS
    • The owner of this layer can make the call: “This is how I will provide the runtime dependencies that the Business Application needs to operate”
    • Abstracts you from: the software and configuration required for your application to run
    • Meaning you can: know you have a resource that could receive release packages of code and run them
    • Presents to the next layer: the ability to create the software resources required to run the Business Applications
  • Logical environment separation
    • Consists of: logically separate and isolated instances of environments that can be used to host a whole application by providing the required infrastructure resources and runtime dependencies
    • The owner of this layer can make the call: “This is what an environment consists of in terms of different execution architecture components and this is the required logical infrastructure scale”
    • Abstracts you from: working out what you need to create fully separate environments
    • Meaning you can: create environments
    • Presents to the next layer: logical environments (aka Spaces) where code can be deployed
  • Deployment architecture
    • Consists of: the orchestration and automation tools required to release new Business Application versions to the Platform Application
    • The owner of this layer can make the call: “These are the tools I will use to deploy the application and configure it to work in the target logical environment”
    • Abstracts you from: the details about how to promote new versions of your application, static content, database and data
    • Meaning you can: release code to environments
    • Presents to the next layer: a user interface and API for releasing code
  • Security model
    • Consists of: a user directory, an authentication mechanism, an authorisation mechanism
    • The owner of this layer can make the call: “These authorised people can make the following changes to all layers down to Platform Infrastructure orchestration”
    • Abstracts you from: having to implement controls over platform use.
    • Meaning you can: empower the right people and be protected from the wrong people
    • Makes the call: “I want only authenticated and authorised users to be able to use my platform application”
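
To make the layering a little more concrete, here is a minimal sketch of the anatomy as a data structure. It is only an illustration, not part of any real platform: the `Layer` class and the `layers_hidden_from` helper are names I have made up, and the one-line `presents` summaries are my own paraphrases of the list above (the entry for the Security model in particular is an assumption, since that layer sits at the top).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    presents: str  # paraphrase of what this layer offers to the layer above


# Ordered bottom to top, mirroring the list above.
LAYERS = [
    Layer("Hardware management", "logical infrastructure"),
    Layer("Basic Infrastructure orchestration",
          "an API for bulk changes to logical infrastructure"),
    Layer("Platform Infrastructure orchestration",
          "enough logical infrastructure for the platform"),
    Layer("Execution architecture",
          "software resources to run Business Applications"),
    Layer("Logical environment separation",
          "logical environments (Spaces) for deployments"),
    Layer("Deployment architecture",
          "a user interface and API for releasing code"),
    # Assumption: the top layer has no layer above it, so this is my summary.
    Layer("Security model", "authenticated and authorised use of the platform"),
]


def layers_hidden_from(layer_name: str) -> list[str]:
    """Return the names of the layers below the given one, i.e. the
    detail that the owner of that layer is abstracted from."""
    names = [layer.name for layer in LAYERS]
    index = names.index(layer_name)  # raises ValueError for unknown names
    return names[:index]
```

For example, `layers_hidden_from("Execution architecture")` lists the three infrastructure layers that the owner of the execution architecture should not have to think about.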

I’d love to hear some feedback on this.  In the meantime, I’m planning to map some of the recent projects I’ve been involved with into this architecture to see how well they fit and what the challenges are.

PaaA is great for DevOps too: treat your Platform as a Product!

In this previous post, I chronicled my evolving understanding of PaaS and how it has taught me the virtues of treating your Platform as an Application (PaaA). Here I documented what I believe a self-respecting platform application should do.  In this post I’m going to describe how I’ve seen PaaA help solve the Dev and Ops “problem” in large organisations (“Traditional Enterprises” if you prefer).

DevOps is a highly used/abused term and here I’d like to define it as:

An organisational structure optimised for the fastest release of changes possible within a pre-defined level of acceptable risk associated with making changes. Or simply: the organisational structure that lets you release as fast as possible without losing control and messing up too badly.

This isn’t my first attempt at tackling DevOps teams, also see a blog here and an Ignite here. Of course lots of other good things have been written about it as well, e.g. here from Matt Skelton. I believe PaaA provides a good path.

So this is the traditional diagram for siloed Dev and Ops:

devops1

*Skull and crossbones denote issues which I won’t describe again here.  If you aren’t familiar with the standard story, I suggest viewing this excellent video by Rackspace.

For any organisation with more than one major application component (aka Product), when we add these to the diagram above it starts looking something like this:

devops2

Each application component (or Product), e.g. the website (Site) or the Content Management System (CMS), is affected by both silos. Traditionally the Development (Dev) silo write the code, whilst the Operations (Ops) silo use part-automated processes to release, host, operate, and do whatever is necessary to keep the application in service.  Whilst each “Business Application” exists in both silos, only the Ops team have the pleasure of implementing and supporting “the Platform” i.e. the infrastructure and middleware.

So if silos are bad, perhaps the solution is the following. One giant converged team:

devops3

The problem with this is scale. There is a high likelihood that attempting to adopt this in practice fails, and that sub-teams quickly form within it and reinforce the original silos.

So we can look to subdivide this and make smaller combined Development and Operations teams per application component or small group of them. If that works for you, then fantastic!  This also is effectively the model you are already using when connecting your in-house application to any external or 3rd party web-services (for example Experian).

devops4

In my experience though, it is impractical and inappropriate to have so many different teams within one organisation each looking after their own Platform. Logically (as per the experience of public cloud) and physically (as per traditional data centres and private cloud) major elements of the platform are best shared, e.g. for economies of scale, or perhaps for application performance.

So what about when you treat your Platform as an Application?  Where could Dev and Ops reside?

The optimum solution in my experience is as follows:

devopsPaaA

The Platform Application (highlighted above by a glowing yellow halo) has a dedicated and independent, fully-combined Development and Operations team and it is treated just like any other Business Application.

Hang on a minute, haven’t I just re-branded what would traditionally be known as the Operations team as a Platform Application team?

Well no. Firstly the traditional Development team usually has no Operations duties such as following their code all the way to production and then being on call to support it once it is in there. They may not feel accountable for instrumentation, monitoring and operability, perhaps not even performance.  Now they must consider all of these and implement them within the constraints of the capabilities provided by the Platform Application upon which they depend.  By default nothing will be provided for them; it is for them to consume from the Platform Application.  So the Platform Application team are already relieved of a lot of accountability compared to a traditional Operations team. So long as they can prove the Platform Application is available and meeting service levels, their pagers will not bother them.

Secondly, the platform team are no longer quite so different from other end-to-end Business Application teams. They manage scope, they develop code, they manage dependencies, they measure quality, they can do Continuous Delivery and they must release their application just like anyone else.  Sure, their application is extremely critical in that everyone else (all the products using the platform instance) depends on them, but managing dependencies is very important between Business Applications as well, so it isn’t a new problem.

The Platform Application delivery team (which we could also call the Platform Product team) have to constantly recognise that their application has to provide a consistent experience to consuming Business Applications. One great technique for this (borrowing from “normal” applications) is Semantic Versioning (SemVer), where every change made has to be labelled to provide a meaningful depiction of the compatibility of the new version relative to the previous.  In Platform Application terms we can update the SemVer description as:

  1. MAJOR version when you expect consuming Business Applications to need changes e.g. you change the RDBMS
  2. MINOR version when you don’t expect consuming Business Applications to break, but they need full regression testing, e.g. configuration tuning or a security update
  3. PATCH version when you make backwards-compatible changes expected to have no or a very low chance of external impact.  For example if the IaaS API has a change which the Platform Application fully abstracts Business Applications from.
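
As a rough illustration of how a platform team might apply these rules mechanically, here is a minimal sketch. The `Change` enum and the `bump` function are hypothetical names of my own, and the sketch assumes a plain three-part MAJOR.MINOR.PATCH version string with no pre-release or build suffix.

```python
from enum import Enum


class Change(Enum):
    MAJOR = "major"  # consuming Business Applications are expected to need changes
    MINOR = "minor"  # consumers should not break, but need full regression testing
    PATCH = "patch"  # backwards compatible, no or very low chance of external impact


def bump(version: str, change: Change) -> str:
    """Return the next platform version for the given kind of change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change is Change.MAJOR:
        return f"{major + 1}.0.0"
    if change is Change.MINOR:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

So swapping the RDBMS on platform 2.3.0 would be `bump("2.3.0", Change.MAJOR)`, giving 3.0.0, whereas absorbing an IaaS API change that consumers never see would be a PATCH, giving 2.3.1.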

Hopefully it is becoming clear how powerful and effective the mentality of treating your Platform as an Application (or Product) can be.  Everything that has been invented to help deliver normal applications can be re-used and/or adapted for Platform Applications.  The pattern is also extremely conducive to switching to a public PaaS; in fact, it is exactly how you operate when using one.

Full disclosure: I run an organisation that develop and manage multiple different Platform Applications for different enterprises.  I am most enthusiastic about this approach because I feel it reconciles a lot of the conventional wisdom around DevOps that I’ve heard about, with what I’ve seen first-hand to be extremely successful in my job working in “traditional Enterprises”.

Reducing Continuous Delivery Impedance – Part 4: People

This is my fourth post in a series I’m writing about impedance to doing continuous delivery.  Click these links for earlier parts about the trials of Infrastructure, Solution Complexity and COTS products.

On the subject of Continuous Delivery, where the intention is to fail fast, it’s actually rather sloppy of me to defer talking about people until my fourth blog on this. When it comes to implementing Continuous Delivery there is nothing more potentially obstructive than people.  Or to put things more positively, nothing can have a more positive impact than people!

Here are my top 4 reasons that people could cause impedance.

#1 Ignorance  A lack of understanding and appreciation of Continuous Delivery, even among a small but perhaps vocal or influential minority, can be a large source of impedance.  Many Developers and Operators (and a new species of cross-breeds!) have heard of Continuous Delivery and DevOps, but often Project Managers, Architects, Testers, Management/Leadership may not.  Continuous Delivery is like Agile in that it needs to be embraced by an organisation as a whole, simply because anyone in an organisation is capable of causing impedance by their actions and the decisions they make.  For example the timelines set by a project manager simply may not support taking time to automate.  A software package selected by an architect could cause a lot of pain to everyone with an interest in automation.

A solution to this that I’ve seen work well has been awareness sessions.  Whatever format works best for sharing knowledge (brownbag lunches, webinars, communication sessions, memos, the pub etc.) should be used to make people aware of what Continuous Delivery can do, how it works, why it is important, and what the various terminology means.

I once spent a week doing this and talked to around 10 different projects in an organisation and hundreds of people.  It was a very rewarding experience and by the end of it we’d gone from knowing 1 or 2 interested people to scores.  It was also great to make connections with people already starting to do great things towards the cause.  We even created a social media group to share ideas and research.

#2 Ambivalence?  As I’ve discussed before some people reject Continuous Delivery because they see it as un-achievable and / or inappropriate for their organisation. (Often I’ve seen this being due to confusion with Continuous Deployment.)  Also, don’t overlook a cultural aversion to automation. In my experience it’s only been around 5 years since the majority of people “in charge” were still very skeptical about the concept of automating the full software stack preferring.

A solution here (assuming you’ve revisited the awareness sessions where necessary) is to organise demos of any aspects of Continuous Delivery already adopted and demonstrate that it is real and already adding value.

#3 Obedience  Another source of impedance could perhaps be a misguided perception that Continuous Delivery is actually forbidden in a particular organisation.  So people will impede it due to a misinformed attempt at obedience to the management/leadership.  Perhaps a management steer to focus only tactically on “delivery, delivery, delivery” does not allow room for automation.  Or perhaps they take a very strong interest in how everything works and haven’t yet spoken about Continuous Delivery practices, or even oppose certain important techniques like Continuous Integration.  Or perhaps a leadership mandate to cut costs makes strategic tasks like automation seem frivolous or impossible.

A solution here is for management/leadership to publicly endorse Continuous Delivery and cite it as the core strategy / methodology for ongoing delivery.  Getting them along to the above mentioned training sessions can help a lot.  Getting them to blog about it is good.  As can be setting up demos with them to highlight the benefits of automation already developed.  Working Continuous Delivery into the recognition and rewards processes could also be effective (if you please C suite!).

#4 Disobedience  Finally, if people know what Continuous Delivery is, they want it, they know they are allowed it, why would they then disobey and not do it?  Firstly it could be down to other sources of impedance that make it difficult even for the most determined (e.g. Infrastructure).  But it could also easily be a lack of time or resources or budget or skills.

Skills are relatively easy to address so long as you make time.  Depending on where you live there could be masses of good MeetUps to go and learn at.  There are superb tutorials online for all of the open source tools.  #FreeNode is packed with good IRC channels of supportive individuals.  The list goes on.

Another thing to consider here is governance.  As I’ve confessed before, some people like me really like things like pipeline orchestration, configuration management, automated deployments etc. But this is not the norm.  It is very common for such concerns to be unloved and to slip through the cracks with no-one feeling accountable.  Making sure there is a clear owner for all of these is a very important step.  Personally I am always more than happy to take this accountability on projects as opposed to seeing them sit unloved and ignored.

Finally as I’ve said before, DevOps discussions often focus around the idea that an organisation has just two silos – Development and Operations.  But in my experience, things are usually a lot more complex, with multiple silos perhaps by technology, release, department etc., multiple vendors, multiple suppliers, you name it.  Putting a DevOps team in place to help get started towards Continuous Delivery can be one effective way of ensuring there is ownership, dedicated focus and skills ready to work with others to overcome people impedance.  Of course heed the warnings.

Obviously overall People Impedance is a huge subject.  I hope this has been of some use.  Please let me know your own experiences.

So if we are treating our Platform as an Application (PaaA), what should it do?

In my last post, I described the eureka moment I’d had whilst using Cloud Foundry.  I’d suddenly realised the fantastic benefits of treating your Platform as an Application.  I’d then decided this pattern needed its very own acronym “PaaA” to highlight the distinction from using a “Platform Application Delivered by someone else as a Service” (i.e. what is traditionally called PaaS).

In hindsight it is really obvious that PaaA is a good idea – if it wasn’t, why would PaaS providers (who manage platforms commercially on industrialized scale for a living) bother doing it? In this post I’m going to define the features that I think any self-respecting Platform Application should have.

A quick aside: Should you build or buy/reuse a Platform Application?  There are plenty of applications available to buy/reuse:

In my opinion, since we’re treating our Platform as an Application, the usual build vs buy logic should be applied!  However (not to avoid the question entirely), my advice is that if you are in a greenfield scenario you should try buy/reuse, and if you already have a platform, start an initiative to move towards treating what you already have platform-wise more like an application.

So if we are treating our Platform like an Application (PaaA), what should it do?

Firstly we need a name for the part of the IT solution which is not the platform.  It’s tempting to take a platform-centric position and call it the Guest Application (since it resides and functions on the platform). I fear some may consider this name derogatory, so for lack of an alternative, I’m calling it the Business Application. In terms of cardinality, I would expect any Platform Application to host one or more Business Applications.

The most basic requirement of a Platform Application is that it can provide the run-time operating system and middleware dependencies needed for the Business Application to run.  For example if the Business Application is a Java Web Application requiring a Servlet Container, the Platform Application must provide that.  If an RDBMS e.g. PostgreSQL is required, the platform must of course also provide that.  To put it another way, we’re treating the whole environment minus the Business Application as being something the Platform Application must supply.

All applications should be build-able from version control and releasable with a unique build number.  A Platform Application is no different and they also need a fully automated and repeatable installation (platform deployment) process, i.e. you should be able to fully destroy and recreate your whole platform (aka phoenix it) with great confidence.  You should also be able to make confident statements like “We completed all our testing on version 2.3.0.25 of the platform”. (My use of version number resembling Semantic Versioning was deliberate as I believe it is very useful for Platform Applications.)
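
To show what being able to name the tested version could look like in practice, here is a small sketch. It assumes a four-part MAJOR.MINOR.PATCH.BUILD scheme, which is my reading of the example 2.3.0.25 above rather than anything mandated by Semantic Versioning; `PlatformVersion` and `is_tested` are hypothetical names.

```python
from typing import NamedTuple


class PlatformVersion(NamedTuple):
    major: int
    minor: int
    patch: int
    build: int  # assumption: the fourth part is the unique build number

    @classmethod
    def parse(cls, text: str) -> "PlatformVersion":
        parts = text.split(".")
        if len(parts) != 4:
            raise ValueError(f"expected MAJOR.MINOR.PATCH.BUILD, got {text!r}")
        return cls(*(int(part) for part in parts))

    def __str__(self) -> str:
        return ".".join(str(number) for number in self)


def is_tested(candidate: str, tested: set) -> bool:
    """True only if exactly this build of the platform has been tested."""
    return PlatformVersion.parse(candidate) in tested
```

Because a named tuple compares field by field, versions parsed this way also sort in the order you would expect, which is handy when deciding which build is the latest.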

A Platform Application should abstract the Business Applications that run on top of it from the underlying infrastructure i.e. the servers, storage and network.  Whilst doing this, the Platform Application must provide infrastructure features to the level of sophistication required by the hosted Business Applications, for example auto-scaling and high-availability / anti-fragility.  A nice-to-have feature is some built-in independence of the underlying infrastructure solution. This provides a level of portability to deploy the Platform Application to different physical, virtual and cloud infrastructure providers.

A Platform Application should work coherently with your software delivery lifecycle. For example it must have a cost-effective solution for supporting multiple isolated test environments.  Cloud Foundry, for instance, supports multi-tenancy through Spaces, of which you can create several per Platform Application instance.

A Platform Application must make the process of performing fully automated deployments of the Business Applications onto it trivial.  Of course the release packages of the Business Applications must conform to the required specifications.  This includes both the binary artefact format e.g. WAR files and any required configuration (aka manifest) files.
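
As an illustration of checking that a release package conforms before the platform accepts it, here is a minimal sketch. The file name `manifest.yml`, the rule that only `.war` artefacts are allowed, and the `package_problems` function are all assumptions made up for this example; a real platform would define its own specification.

```python
from pathlib import PurePosixPath

REQUIRED_MANIFEST = "manifest.yml"      # hypothetical manifest file name
ALLOWED_ARTEFACT_SUFFIXES = {".war"}    # hypothetical: only WAR files accepted


def package_problems(file_names: list[str]) -> list[str]:
    """Return a list of reasons the package does not conform (empty if it does)."""
    problems = []
    if REQUIRED_MANIFEST not in file_names:
        problems.append(f"missing {REQUIRED_MANIFEST}")
    artefacts = [
        name for name in file_names
        if PurePosixPath(name).suffix.lower() in ALLOWED_ARTEFACT_SUFFIXES
    ]
    if not artefacts:
        problems.append("no deployable artefact found")
    return problems
```

Rejecting a non-conforming package at the front door like this keeps the deployment itself trivial, because everything that gets past the check has a known shape.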

There are a number of main security concerns for a Platform Application.  It needs an authentication and authorisation solution for controlling administration of the platform e.g. who can perform Business Application deployments or create new environments.  The platform must have an appropriate solution for securely managing keys and certificates required by the Business Applications.  Finally the Platform Application must support the access protocols required by the Business Application e.g. https.
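
The administration side of this can be sketched very simply. The role names, action names and the `is_authorised` function below are all invented for illustration; in a real platform they would come from whatever user directory and authorisation mechanism you have chosen.

```python
# Hypothetical mapping of roles to the platform administration actions they may perform.
PERMISSIONS = {
    "platform-admin": {"deploy_application", "create_environment"},
    "release-manager": {"deploy_application"},
    "developer": set(),
}


def is_authorised(roles: list[str], action: str) -> bool:
    """True if any of the user's roles permits the action; unknown roles grant nothing."""
    return any(action in PERMISSIONS.get(role, set()) for role in roles)
```

The point is less the code than the default: anything not explicitly granted is denied.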

There are a number of logging concerns for a Platform Application.  Of course it should create adequate logs of its own so that it can be operated successfully.  It also needs a solution for managing the logs of the Business Applications, for example an inbuilt aggregation service that could be based on Logstash, Kibana and Elasticsearch.

Finally there are monitoring concerns for a Platform Application.  Of course it needs to monitor itself and the underlying infrastructure that it is managing.  It also needs to provide a standardised solution for monitoring the Business Applications deployed onto it.

 

I’d love to hear if anyone thinks of other core features that I should add to the list.

 

I finally get PaaS – they should actually be called Platform as an Application

I’ve been aware of Platforms-as-a-Service (PaaS) for a few years, but I wouldn’t say I completely understood how important they are, until now.  In part I blame the name which leads me to thinking PaaS is all about receiving a service.  Instead I believe a pattern of treating your Platform as an Application (PaaA) is where the real value lies.

In this post I’m going to share the evolution of my understanding and hopefully leave you as fired up about Paa[AS]’ as I am.

The first PaaS that caught my attention was Google App Engine.  My understanding was that it was basically:

  • a place online where Google will host your applications for you
  • something that only worked when you write “special” compatible applications
  • not something that would change my life (i.e. my day job delivering large-scale Enterprise IT systems).

The next thing that caught my attention was Heroku which to me was basically:

  • a place that supported more “normal” applications like Ruby on Rails
  • a realization that if this trend of supporting more and more application types (middleware solutions) continues, using a PaaS could be something that I’d end up doing.

At this point I realized that I’d actually already used a PaaS when I wrote my very first static HTML website back in 2000.  The hosting provider was providing me with a very simple PaaS.  It only served static content, but none the less it was a PaaS and had already proved to me that the service model works.

So my understanding of a PaaS was that it was a service supplied by someone else to provide the platform to deliver your applications.  And I was starting to imagine that the trajectory I’d seen going from a PaaS that supported static content to one supporting full-blown Rails applications meant that soon they’d be applicable to the types of IT system I worked on.

Late last year, I had the privilege of my day job putting me in close proximity with a PaaS called Cloud Foundry.  My understanding evolved again:

  • this time I was responsible for installing PaaS software myself (and hosting it in the cloud)
  • this time it was fully extensible and I had the responsibility of doing this to meet the middleware requirements (RabbitMQ, Cassandra etc.)
  • I was now expected to be a PaaS provider

Building and supporting environments for test and production was nothing new to me, but this was the first time I was doing it using a PaaS.  Yet I wasn’t receiving the platform as a service from someone externally, I was delivering it using a software application (Cloud Foundry) still referred to as a PaaS.  I’d somehow jumped from thinking one day I’d receive the benefits of a PaaS service from someone else to realizing now I was having to provide one… I felt a bit cheated!

So the obvious question in my mind was: will running a PaaS make my life easier and of course improve how well I could provide environments for test and production?

The answer wasn’t immediate. Getting up and running with Cloud Foundry was definitely a steeper learning curve than using something like CloudFormation by Amazon. Suddenly there was another application to deal with and this one wasn’t even created by the development team. It was open source and complex and quite opinionated about how to do things.

The developers weren’t in love either. They had more to learn and more rules to follow – some rules that even the Ops team couldn’t explain…

However over time (weeks not months or years) we stabilised the platform and started to enjoy a few pretty great things:

  • we could easily rebuild our entire data centre in about 1 hour including everything from networks up to all test environments including a working application and data
  • adding new applications was extremely easy – efficiently cloning our continuous delivery pipeline in Jenkins was our new challenge (which we solved)!
  • predictability across test environments and production was higher than I’d EVER seen (and I’ve spent years solving that)
  • developers had a very clean relationship with the platform and found it a very productive eco-system to work in

In short, I was very happy and now a big fan of PaaS. But it still took another month before I really felt like I understood why.

The answer is not the fact that Cloud Foundry is some kind of magic application, the answer is that it IS an application.

To understand why PaaS is so important, I now actually think of it as Platform-as-an-Application (PaaA?!). The true value does not lie in the fact that someone else could deliver it to you as a service. The true value is treating everything that your application relies on as a configuration-manageable, version-able, testable, releasable software application. Naturally this is complicated and consists of multiple sub-components all subject to the same rigor, but managed as one application.

Whether you achieve this by reusing a pre-written PaaS application like Cloud Foundry, Stratos or OpenShift, or whether you write your own is up to you (and I suggest subject to the normal buy/build considerations).  Whether you host it for yourself (on cloud infrastructure or not) or whether you use a public PaaS (e.g. Pivotal’s public Cloud Foundry) is not the point.  The thing that PaaS teaches us is to treat your platform as an application.

It’s a lovely idea from a DevOps maturity perspective. We’ve gone from Ops having manual silo-ed processes all the way to the logical extreme: the platform is treated no different from the application. We are all doing the same things!

It’s lovely from a Continuous Delivery practice because automated build and deployment of code is a native part of your platform application.  No extra work to do!

Another perhaps understated term is infrastructure as code (if you define infrastructure as used in IaaS). It is essential to implementing your platform as an application, but the name could leave you just thinking you should use it to write code to manipulate your servers, storage and network. Yes you should, that’s great! But not treating this code as a coherent part of one logical application that is capable of (and optimised for) hosting your application is missing out.

So where next?  It’s not hard to think of ways of making platform applications that are more powerful and more operable. Already some PaaS applications e.g. Stratos are thinking hard how to realise similar benefits earlier in the lifecycle i.e. via App Factory.

I’m sure there will be an explosion of both PaaS applications and PaaS service providers all offering richer functionality, broader compatibility and higher service levels. Naturally Docker will significantly help implement compatibility of applications to PaaS’. For me, I plan to apply the Platform-as-an-Application pattern as widely as I can and of course try out as many more pre-existing PaaS applications as I can (starting with Stratos and App Factory).

Reducing Continuous Delivery Impedance – Part 3: COTS Software Packages

This is my third post in a series I’m writing about impedance to doing continuous delivery.  Click these links for earlier parts about the trials of Infrastructure and Solution Complexity.

Software packages, specifically Commercial-Off-The-Shelf Software (COTS), are software solutions purchased from a 3rd party vendor.  The idea is that you are purchasing something already delivering your needs for less money than it would cost you to build it.  There are also lots of other stated benefits.  I’m certainly not challenging the use of COTS products altogether, or claiming that they are all the same.  However, many of those I’ve worked with over the years have created particularly strong impedance to doing continuous delivery.

In my experience there is nothing like custom code applications when it comes to great compatibility for doing continuous delivery.  For example Java:

  • Starting with early stages of the pipeline you’ve got Maven (or Ant or Gradle), FindBugs, Checkstyle, Sonar, JUnit, Mockito, etc. all excellent. (Equivalents are readily available for C#, JavaScript, HTML, CSS, Ruby, Python, Perl, PHP, you name it…)
  • Moving on to deployment, you can literally take your pick: building native OS packages (e.g. RPMs), self-sufficient Java applications (e.g. Spring Boot), Configuration Management like Chef or Puppet, Jenkins and Shell scripts, I could go on…

This isn’t a new thing either: C and even Fortran are great, and even mainframes coupled with a proficient use of something like Endevor are very build-able, releasable and deploy-able.

Sadly with COTS products things are often not so simple…

scratched

Let’s start at the beginning of the pipeline.  COTS products can be hard to version control.  Many COTS products seem to consider a development repository part of their package.  How generous of them to take version control / SCM off our hands?  Not really; how inconvenient, as they rarely do a good job.  The staples of a good SCM are often absent:

  • Offline / distributed working – many COTS products require access to a shared development database.  This can be bad for productivity (slow), it may even be bad for flexible working (if people can’t get remote access)
  • Revision tracking – many COTS products may store limited metadata e.g. who changed what, but may limit your ability to do things like make an association to an ALM ticket e.g. in Jira
  • Branching  – usually not even a concept leaving you to fork irreconcilably (admittedly arguably branching isn’t really in the spirit of continuous delivery…)
  • Baselines – very rarely a concept allowing no reasonable way of promoting something other than the latest state of the development repository.

Usually, to achieve the sound SCM described above, you need to work hard to export your configurable items into flat files so that you can then store them in a sound SCM system (e.g. Git).  Generally this will take some kind of export / check-in tool that you may have to write yourself, and you thus take on responsibility for ensuring integrity between the SCM system and the COTS development repositories.
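As a rough illustration of what such an export / check-in tool might do, here is a minimal sketch.  It assumes the COTS product can hand you its configurable items as simple name/content pairs; the function names and the `manifest.json` file are hypothetical and not part of any vendor tool.  Each item is written to a flat file and a SHA-256 checksum is recorded, so you can later check that what sits in Git still matches what was exported.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def export_items(items: dict[str, str], dest: Path) -> dict[str, str]:
    """Write each exported item to a flat file and return name -> SHA-256."""
    dest.mkdir(parents=True, exist_ok=True)
    checksums = {}
    for name in sorted(items):  # stable ordering keeps the manifest diff-friendly
        data = items[name].encode("utf-8")
        (dest / name).write_bytes(data)
        checksums[name] = hashlib.sha256(data).hexdigest()
    # Hypothetical manifest file, committed alongside the exported files.
    (dest / "manifest.json").write_text(
        json.dumps(checksums, indent=2, sort_keys=True)
    )
    return checksums


def verify_export(dest: Path) -> list[str]:
    """Return the names of files whose content no longer matches the manifest."""
    checksums = json.loads((dest / "manifest.json").read_text())
    return [
        name
        for name, expected in checksums.items()
        if hashlib.sha256((dest / name).read_bytes()).hexdigest() != expected
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "export"
        export_items({"workflow.xml": "<workflow/>", "rules.xml": "<rules/>"}, target)
        print(verify_export(target))  # an empty list means nothing has drifted
```

In practice you would run the verification step in the pipeline before importing the files back into the COTS product, so that hand edits or partial exports are caught early.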

So let’s say we’ve got our COTS product’s moving parts flattened into version control and under our control.  How do we handle merge?  The answer is usually simple: you can’t.  Any modification of these fragile, sensitive XML (or, in practical terms, binary) files outside of the IDE is strictly prohibited by the COTS vendor, and even if tried it is unlikely to lead to anything that will work (even if you get into clever schema-informed XML merging).

Even if you aren’t branching, it may be perfectly reasonable for two people to end up simultaneously updating the same files.  Since the files cannot be merged, you need to prevent multiple people from changing the same file at the same time.  We must turn to our SCM system, or perhaps a hybrid of the COTS IDE and the SCM system, to implement pessimistic locking, which even if achieved successfully has a negative impact on productivity.
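To make the idea of pessimistic locking concrete, here is a minimal in-memory sketch.  The `LockRegistry` class is purely illustrative; a real setup would lean on whatever locking your SCM system or the COTS IDE offers (for example Subversion locks or Git LFS file locking) rather than on code like this.

```python
class LockRegistry:
    """In-memory sketch of pessimistic (exclusive) locks on files."""

    def __init__(self) -> None:
        self._owners: dict[str, str] = {}  # file path -> user holding the lock

    def acquire(self, path: str, user: str) -> bool:
        """Lock path for user; return False if someone else already holds it."""
        owner = self._owners.setdefault(path, user)
        return owner == user

    def release(self, path: str, user: str) -> bool:
        """Release the lock; only the current owner is allowed to do so."""
        if self._owners.get(path) != user:
            return False
        del self._owners[path]
        return True


if __name__ == "__main__":
    locks = LockRegistry()
    print(locks.acquire("rules.xml", "alice"))  # alice gets the lock
    print(locks.acquire("rules.xml", "bob"))    # bob has to wait
    locks.release("rules.xml", "alice")
    print(locks.acquire("rules.xml", "bob"))    # now bob can edit
```

The productivity cost is visible even in this toy: while one person holds the lock, everybody else who needs that file is blocked.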

So we’ve got a process of developers pushing consistent changes into version control.  The next challenge may be how to build.  Whilst the COTS product may export flat files, often this is just so that you can import the files into another development repository, and you may not be able to build without re-importing your flat files back into another repository.  Worst case of all is when you have to import manually and then trigger a build manually within the IDE.  I’m not saying it isn’t possible to automate these things, I have definitely done it, but when it comes to having to involve AutoIt (screen scraping), you’ve got to feel somewhat frustrated/humiliated/sick/sad.

It’s not uncommon for COTS products to be slow to build.  I’m not saying the problem is exclusive to them, but with a custom application it is usually much easier to break the build up into smaller components that can be built separately.

Once we have a build, the task of deploying may not be easy.  Many COTS products expect fully manual, mouse-driven deployments and you basically end up reverse engineering to automate deployments.  Deployments can also be complex, slow and unpredictable.

Finally, environments for COTS products can be tricky.  Usually a custom application will rely on standard middle-ware, or better still a PaaS.  COTS products can involve following hundreds if not thousands of pages of manual steps in installation manuals, leading to a huge amount of work with your configuration management tool like Puppet or Chef.  Worse still, there may be manual steps, e.g. a configuration wizard, that are very difficult to automate, and again you are left reverse engineering to remove the manual steps.

Like I said earlier, I’m not rejecting COTS products altogether, but I’d like to be very clear that they can often in many different ways create impedance for doing continuous delivery.  It is very rare that given enough effort these limitations cannot be overcome, but it may actually be equally rare (sadly) that organisations invest the necessary effort.

 

Reducing Continuous Delivery Impedance – Part 2: Solution Complexity

This is my second post in a series I’m writing about impedance to doing continuous delivery and how to overcome it.  Part 1 about Infrastructure challenges can be found here.  I also promised to write about complexity in continuous delivery in this earlier post about delivery pipelines.

I’m defining “a solution” as the software application or multiple applications under current development that need to work together (and hence be tested in an integrated manner) before release to production.

In my experience, continuous delivery pipelines work extremely well when you have a simple solution with the following convenient characteristics:

  1. All code and configuration is stored in one version control repository (e.g. Git)
  2. The full solution can be deployed all the way to production without needing to test it in conjunction with other applications / components under development
  3. You are using a 3rd party PaaS (treated as a black box, like Heroku, Google App Engine, or AWS Elastic Beanstalk)
  4. The build is quick to run i.e. less than 5 minutes
  5. The automated tests are quick to run, i.e. minutes
  6. The automated test coverage is sufficient that the risks associated with releasing software can be understood to be lower in value than the benefits of releasing.

The first 3 characteristics are what I am calling “Solution Complexity” and what I want to discuss in this post.

Here is a nice simple depiction of an application ticking all the above boxes.

perfect

Developers can make changes in one place, know that their change will be fully tested and know that when deployed into the production platform, their application should behave exactly as expected.  (I’ve squashed the continuous delivery (CD) pipeline into just one box, but inside it I’d expect to see a succession of code deployments, and automated quality gates like this.)

 

But what about when our solution is more complex?

What about if we fail to meet the first characteristic and our code is in multiple places and possibly not all in version control?  This is definitely a common problem I’ve seen, in particular for configuration and data loading scripts.  However, this isn’t particularly difficult to solve from a technical perspective (more on the people-side in a future post!).  Get everything managed by a version control tool like Git.

Depending on the SCM tool you use, it may not be appropriate to feel obliged to use one repository.  If you do use multiple, most continuous integration tools (e.g. Jenkins) can be set up in such a way as to support handling builds that consume from multiple repositories.  If you are using Git, you can even handle this complexity within your version control repository e.g. by using sub-modules.

 

What about if your solution includes multiple applications like the following?

complex

Suddenly our beautiful pipeline metaphor is broken and we have a network of pipelines that need to converge (analogous to fan-in in electronics).  This is far from a rarity and I would say it is overwhelmingly the norm.  This certainly makes things more difficult and we now have to carefully consider how our plumbing is going to work.  We need to build what I call an “integrated pipeline”.

Designing an integrated pipeline is all about determining the “points of integration” aka POI, i.e. the first time that testing involves the combination of two or more components.  At this point, you need to record the versions of each component so that they are kept consistent for the rest of the pipeline.  If you fail to do this, earlier quality gates in the pipeline are invalidated.

In the below example, Applications A and B have their own CD pipelines where they will be deployed to independent test environments and face a succession of independent quality gates.  Whenever a version of Application A or B gets to the end of its respective pipeline, instead of going into production, it moves into the Integrated Pipeline and creates a new integrated or composite build number.  After this “POI” the applications progress towards production in the same pipeline and can only move in sync.  In the diagram, version A4 of Application A and version B7 of B have made it into integration build I8.  If integration build I8 makes it through the pipeline it will be worthy to progress to production.

int

Depending on the tool you use for orchestration, there are different solutions for achieving the above.  Fundamentally it doesn’t have to be particularly complicated.  You are simply aggregating version numbers, which can easily be stored together in a text document in any format you like (YAML, POM, JSON etc).
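As a minimal sketch of that aggregation, assuming JSON as the format (the function names and the `application-a` / `application-b` component names are made up for illustration), the following pins the versions that met at the POI into one integrated build record and checks whether an environment still matches it:

```python
import json


def create_integrated_build(build_number: int, components: dict[str, str]) -> str:
    """Return a JSON document pinning each component version at the POI."""
    record = {
        "integrated_build": f"I{build_number}",
        "components": dict(sorted(components.items())),
    }
    return json.dumps(record, indent=2)


def check_environment(manifest_json: str, deployed: dict[str, str]) -> list[str]:
    """List the components whose deployed version differs from the pinned one."""
    pinned = json.loads(manifest_json)["components"]
    return [
        name for name, version in pinned.items() if deployed.get(name) != version
    ]


if __name__ == "__main__":
    # Versions A4 and B7 come together to form integrated build I8.
    manifest = create_integrated_build(
        8, {"application-a": "A4", "application-b": "B7"}
    )
    print(manifest)
    # An environment still running B6 is out of sync with I8.
    print(check_environment(manifest, {"application-a": "A4", "application-b": "B6"}))
```

Every later stage of the integrated pipeline would deploy from this one record rather than from “latest”, which is what keeps the earlier quality gates valid.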

Some people reading this may by now be boiling up inside ready to scream “MICRO SERVICES” at their screens.  Micro services are by design independently deploy-able services.  The independence is achieved by ensuring that they fulfill and expect to consume strict contract APIs so that integration with other services can be managed and components can be upgraded independently.  A convention like SemVer can be adopted to manage change to contract compatibility.  I’ve for a while had this tagged in my head as the eBay way or Amazon way of doing this but micro services are now gaining a lot of attention.  If you are implementing micro services and achieving this independence between pipelines, that’s great.  Personally, on the one micro services solution I’ve worked on so far, we still opted for an integrated pipeline that operated on an integrated build and produced predictable upgrades to production (we are looking to relax that at some point in the future).
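To show how a SemVer convention can express contract compatibility, here is a simplified sketch.  The function names are my own, and it deliberately ignores pre-release tags, build metadata and the special rules for 0.x versions, so treat it as an illustration of the idea rather than a full SemVer implementation.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split a plain MAJOR.MINOR.PATCH string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def satisfies_contract(provided: str, required: str) -> bool:
    """True if the provided API version can serve a consumer built against required.

    Simplified rule: same major version (no breaking change) and the
    provided version is at least as new as the one required.
    """
    have = parse_version(provided)
    need = parse_version(required)
    return have[0] == need[0] and have >= need


if __name__ == "__main__":
    print(satisfies_contract("2.3.1", "2.1.0"))  # newer minor, same major
    print(satisfies_contract("3.0.0", "2.1.0"))  # major bump signals a breaking change
```

A check like this could sit at the end of each service’s own pipeline, letting the service go to production independently as long as every declared consumer contract is still satisfied.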

Depending on how you are implementing your automated deployment, you may have deployment automation scripts that live separately to your application code.  Obviously we want to use a consistent version of these throughout deployments to different environments in the pipeline.  Therefore I strongly advise managing these scripts as a component in the same manner.

What about if you are not using a PaaS?  In my experience, this represents the vast majority of solutions I’ve worked on.  If you are not deploying into a fully managed container, you have to care about the version of the environment that you are deploying into.  The great thing about treating infrastructure as code (assuming you overcome that associated impedance) is that you can treat it like an application, give it a pipeline and feed it into the integrated pipeline (probably at a POI very early).  Effectively you are creating your own platform and performing continuous delivery on that.  Obviously the further your production environment is from being a version-able component like this, the greater the manual effort to keep environments in sync.

paas

 

Coming soon: more sources of impedance to doing continuous delivery: Software packages, Organisation size, Organisation structure, etc.

 

(Thanks to Tom Kuhlmann for the graphic symbols.)