Team Topologies Book Summary – Part 2 of 3: Topologies and Interaction Modes

In part 1 of this 3 part blog series about the Team Topologies book, I summarised a large set of general things that can help make teams successful.  In this part I will get to actual team design, i.e. the types of team and how they should interact.

The Four Team Types

Based on their extensive research, the authors present 4 types of team and make an important assertion: to be an effective organisation you only need a mixture of teams conforming to these 4 types.

Personally, I really identified with the 4 types, but even if you have different opinions, it is still very useful to be able to reference and extend / update the common set of types described (in such great detail) in the book.

The 4 types are as follows.

Stream Aligned Teams

This is the name the authors coined for an end-to-end team that owns the full value stream for delivering changes to software products or services AND operating them.  This is the stereotypical modern and popular “you build it, you run it” team.

For Stream Aligned teams to be able to do their job, they need to be composed of team members who collectively have all of the different skills they need e.g. business knowledge, architecture, development and testing skills, UX, security and compliance knowledge, monitoring and metrics skills etc.  This enables them to meet the requirement of minimising queues and handoffs to other teams (especially in comparison to teams comprised of people performing a single function e.g. testing).

The expectation is that by teams owning their whole value stream including the performance of the system in production, they can optimise for rapid small batch size changes, and reap all of the expected benefits around both agility and safety.  The hope is also that this end to end scope might help teams achieve “autonomy, mastery and purpose” (things which Daniel Pink highlights as most important to knowledge workers).

The amount of cognitive load required of them is a function of a few things:

  • If they have enough of the requisite skills in their team, intrinsic cognitive load will be manageable.
  • If the modes of interaction with other teams are clean and efficient, their extrinsic cognitive load shouldn’t be too high.  They can also minimise this within their team through automation.
  • If they were formed using fracture planes effectively, they can keep their scope (and therefore their germane cognitive load) to a manageable amount.

Overall this seems like a very compelling team type and indeed the book recommends that the majority of teams in organisations have this form.

The book goes into a lot more detail about how to make a Stream Aligned Team effective, which I highly recommend reading.

If the book only included Stream Aligned Teams, I would have considered it a cynical attempt to document fashionable ideas and ignore the factors that have led many people to adopt other team types.  Fortunately, they instead did a great job of considering the whole picture via 3 other types (plus bonus type SRE – read on!).

Platform Teams

In a tech start-up that begins with just one Stream Aligned Team owning the full stack and software lifecycle end-to-end, at some point they will run into Dunbar’s Number and cognitive overload.  It will be time to look for fracture planes to enable splitting into at least two teams.  Separating the platform from the application is a very successful and well-established pattern.  At this point we could potentially have one Stream Aligned Team working on the application and one on the platform.

Let’s say the business and demand for application complexity continues to increase and the application team splits into two teams focussed on different application value streams.  We’re now in a situation where both teams most likely re-use the same platform, and we get the benefit of that re-use.  This could of course repeat many times.  The authors recognised that the nature of running a platform team (especially in terms of the type of coupling to other teams and effective modes of communication) differed enough to warrant a new team type, which they called the Platform Team.

Platform Teams create and operate something re-usable for meeting one or more application hosting requirements.  These could be teams running platform applications like Kubernetes, offering a Content Management System as a service, providing IaaS, or even wrapping third-party as-a-service offerings like a Relational Database Service.  They may also layer on additional features valuable to the consumers of the platform, such as security and compliance scanning.

But there are potential traps with platform teams.  If the platform is too opinionated and, worse, its usage is mandated within an organisation, the impedance mismatch may do more harm than good for consuming applications.  If a platform isn’t observable and applications are deployed into a black box, the application teams will be disconnected from the detail of how their application is performing in production and disempowered to make a difference to it.  The book goes into a lot of detail about how to avoid creating bad platform teams and strongly makes an argument for keeping platforms as thin as possible (what it calls the thinnest viable platform).

Personally, I think the dynamic that occurs when consuming a platform from a third party is very powerful for a number of reasons:

  • The platform provider is under commercial pressure to create a platform good enough that consumers pay to use it.
  • The platform provider has to be able to deliver the service at a cost point below the total revenue its users will give it (so it will be efficient and choose its offered services carefully).

If organisations can at least try to think like that (with or without internal use of “wooden dollars”) I think they will create an effective dynamic.  The book mentions that Don Reinertsen recommends internal pricing as a mechanism for avoiding platform consumers demanding things they don’t need.

Enabling Teams

The next type of team is designed to serve other teams to help them improve.  I think it is great that the book acknowledges and explores these as they are very common and often a good thing.  Enabling teams should contain experienced people excited about finding ways to help other teams improve.  To some extent they can be thought of as technical consultants with a remit for helping drive improvement.

Most important is how well Enabling teams engage with other teams.  There are various traps such a team can fall into and must avoid:

  • Becoming an ivory tower defining processes, policies, perhaps even technical ‘decisions’ and inflicting them upon the teams that they are supposed to be helping.
  • Generally being disruptive by causing things like interruptions, context switching, cognitive load, and communication overhead – especially if this cost outweighs the benefit.

As with all other types, the book provides detailed expected behaviours and success criteria for enabling teams, such as understanding the needs of the teams they support and taking an experimental approach to meeting those needs.  It’s also important that enabling teams bring in a wider perspective, for example ideas about technology advances and industry technology and process trends.  Enabling teams may be long lived, but their engagement with the teams they are supporting should probably not be permanent.

Complicated-Subsystem Teams

Finally, the book defines a 4th type of team called a complicated-subsystem team.  Essentially these teams own the development and maintenance of a complicated subcomponent that is probably consumed as code or a binary, rather than at runtime as a service over a network.  The concept is that the component requires specialist knowledge to build and change and that can be done most effectively by a dedicated team without the cognitive load required to consume and deploy their component.

Other types of teams: SRE teams

The book does acknowledge some other team types outside of the main 4, for example SRE teams.

Without getting into too much detail, Site Reliability Engineering (SRE) is an approach to Operations that Google developed and started sharing with the world a couple of years ago.  The thing that is most interesting about it from a team design perspective is that an SRE team is effectively a traditional separate operations team.  In this regard it doesn’t really fit the 4 topologies above.  I think there are a couple of reasons the SRE model is successful:

  1. At least at Google, the default mode of operation is Stream Aligned teams.  An SRE team will only operate an application if it is demonstrably reliable and doesn’t require a lot of manual effort to operate.
  2. It promotes use of some conceptually simple but effective metrics to ensure the application meets the standards for an SRE to operate it.

The book says that if your systems are reliable and operable enough you can use SRE teams.  Doing so will of course reduce some cognitive load on the Stream Aligned team, which has to pay less attention to the demands of the application in production.  This is actually very interesting because in some ways a Stream Aligned team handing over applications to an SRE team to operate is very similar to traditional separate Dev and Ops teams.

So, the message in a way becomes: if your processes are mature enough and your engineering good enough, traditional ways will suffice; BUT the problem is they probably aren’t, and instead you need to use Stream Aligned teams until you get there.  I think this leads to an alternative option for organisations – directly implement the measurements of SRE and focus on quality until you can make your existing separate Dev and Ops teams perform well enough.  I even suggest this justifies a first-class fifth team type.

Other types of teams: Service Experience

The book also acknowledges that many business models include call centre and service desk teams who engage with customers directly.  These teams are proposed to be kept separate from the team types above, whilst ensuring that information about the operation of the live system makes it back to the Stream Aligned teams.  It also talks about the concept of grouping and tightly aligning Service Experience Teams with Stream Aligned teams.  The Service Experience Teams are so named to emphasise their customer orientation.

A key success criterion is the relationship to Stream Aligned teams, which must be close and ideally 1 to 1.  I think this is an excellent point: in many organisations, the 1-to-many (overly shared) relationships between teams that should be closely aligned cause as many problems as the team boundaries themselves.  For example, if an application support team is spread across too many applications, whilst it may achieve some efficiencies of scale, the overall service delivered to each supported team will be far less effective than if the team was subdivided into smaller, more closely aligned teams.

If you spotted other team types in the book that were acknowledged as acceptable – let me know.

The Three Modes of Team Interaction

The book defines 3 interaction modes and recommends which team types should use each one.  By interactions the book means how and when teams communicate and collaborate with each other.  Central to this is the idea that communication is expensive and erodes team structures, so it should be used wisely.

The first interaction mode the book defines is called Collaboration.  This is a multi-directional, regular, and close interaction.  It is fast, often real time, and therefore responsive and agile.  It is the mode that enables teams to work most closely together and is useful when there is a high degree of uncertainty or change and co-evolution is needed.

The cost of this mode is higher communication overhead and increased cognitive load – because teams will need to know more about the other teams.

Stream Aligned Teams will be the most likely to use this, probably with other Stream Aligned teams, and especially earlier in the life of a particular product when uncertainty is highest and boundaries are evolving.

The second mode is called X-as-a-Service.  A team wishing to use this interaction mode needs to consciously design and optimise it so that it serves both them and the other parties as effectively as possible.

Making this mode effective entails abstracting unnecessary detail away from other teams and making the interface discoverable and self-documenting – possibly as an API.

The downside of this mode is that if it isn’t implemented effectively it may reduce the flow of other teams.  The interface also needs to stay relatively static in order to avoid consuming teams having to constantly relearn how to integrate.

As you’ve probably guessed this is a great model for a Platform Team to adopt – especially if they are highly re-used by many consumers.  It can be especially effective when platforms are well established and do not require rapid change or co-evolution of the API.

The final mode is called Facilitating.  This is the best mode for Enabling teams and describes how they can be helpful without being overly demanding.

All of the above modes are described in much more detail than this and come with very actionable advice.

Continuous Evolution

The book has a section on static topologies, i.e. how teams may interact at a point in time.  However, it stresses the importance of sensing and adapting.  As the maturity of a team changes, collaboration modes and even team types may change or subdivide.

The Team Topologies book uses SRE as a good example of teams moving from Stream Aligned, to a function containing reliability specialists (who may or may not call themselves SREs) operating as an enabling team, to an SRE team acting as a separate Ops team, and then back.  Obviously, there is a tension between making these changes to keep teams effective and the costs of changing teams and resetting back to Storming (as per Tuckman).

The book even offers advice for helping existing functional or traditional teams transition into the new model for example: infrastructure teams to Platform teams, support teams to Stream Aligned teams.

In the final part of this series, I will share some thoughts about how to use this information.

Team Topologies Book Summary – Part 1 of 3: Key Concepts

Team Topologies is one of the latest books published by IT Revolution (the excellent company created by Gene Kim, co-author of The Phoenix Project and author of The Unicorn Project).  Team Topologies was written by Matthew Skelton and Manuel Pais (the people behind the DevOps Topologies website), but it is far more than an ‘extended dance remix’ of that.  The book takes a much wider view on team designs that make companies successful and considers the full socio-technical problem space.

This blog series is a set of highlights from the book followed by some suggestions of what to do with the information.  I hope it will also inspire you to read the book and have some fresh thoughts about improving your organisation.

Conway’s Law demonstrates that team structures are incredibly important

Conway’s Law is the famous observation that the structure of teams (or more specifically the structure of information flow between groups of people) impacts the design of a system.  For example, if you take a 4-team organisation and ask them to build a compiler, you will most likely get a compiler that has 4 stages to it.  This is important because of the implication that the structure of Teams doesn’t just impact governance, efficiency, and agility, it also impacts the actual architecture of the products that get built.  The architecture of products is vitally important because of the heavy bearing it has on the agility and reliability of not just systems, but the businesses driven by them.

So Conway’s law teaches us two things:

  1. Team designs are very important because good team designs lead to good software design and good software design leads to better, more effective teams (or, if you get it wrong, the cycle goes the other way).
  2. The factors that influence both a good architecture as well as good teams need to be considered when designing teams, team boundaries, and the planned communication required between teams.

General guidance for great teams

The book doesn’t overlook covering general factors that influence high performing people and teams.  For example:

  • Team sizing – Dunbar’s number is highlighted for its recommendation about the limits of how many people can successfully collaborate in different ways:
    • The strongest form of communication happens in an individual team working closely together with a very consistent shared context and specific purpose.  In software this number is supposed to be 8 people.
    • The limit of people who can all deeply trust each other is 15, and there is also significance in the dynamics of 50 and 150 people.  So this can provide useful constraints when thinking about the design of teams of teams.
  • The importance of having long-lived teams that are given enough time to get to a state of high performance and then stick together and capitalise on it.  Research shows teams can take 3 months to get to the point where the team is many times more effective than the sum of the individual members.  The better the overall company culture is, the easier changing teams can be, but even in the best cases the research recommends keeping teams together for no less than a year.  The book also highlights that the well-known Tuckman team performance model (Forming, Storming, Norming, Performing) has been shown to be less linear than it sounds.  The Storming stage has been found to restart every time there is a personnel change or other major change to the team.
  • Some aspects of office design are important.
  • Putting other team members first and generally investing in relationships and making the team an inclusive place to work is vital.
  • Defining effective team scope, boundaries, and approaches to communication.  This is broadly the topic of the rest of the book (and this blog).

Team Design Consideration: Minimising Team Handoffs, Queues, and Communication Overhead

The book talks about the importance of organising for flow of change to software products.  In order to do that you need to consider team responsibilities and to minimise and optimise communication.  The book presents a case study from Amazon as an exponent of “you build it, you run it” approaches.

You also need what the book calls “sensing” which is where an organisation possesses enough feedback mechanisms to ensure software and service quality is understood as early and clearly as possible.

Whilst teams may communicate and prioritise effectively within the boundary of their team, external communication across team boundaries is always much more costly and much less effective.  When teams have demands upon other teams, this can:

  • be disruptive and lead to context switching
  • lead to queues, delays, and prioritisation problems
  • create a large communication and management overhead.

I found the point about communication thought provoking because it’s often a popular idea that the more collaboration within an organisation, the better.  As the book states, in practice all communication is expensive and should be considered in terms of cost versus benefit.  A friend of mine, Tom Burgess, pointed out a nice similarity to the memory hierarchy in Computer Science.  This also got me thinking about the parallels between people and the Fallacies of Distributed Computing!

Team Design Consideration: Cognitive load

The book introduces a very helpful concept from psychology (created by John Sweller) called Cognitive Load.  This describes how much ‘thinking effort’ a particular ‘thing’ needs in order to be done effectively.  It observes that not all types of thinking effort are the same in nature:

  • Different tasks require different types of thinking
  • There are different causes of the need to think
  • There are different strategies for managing and reducing the effort required for the different types of thinking.

The types are:

  • Intrinsic Cognitive Load – this relates to the skill of how to do a task e.g. which buttons to press according to the specific information available and scenario.
  • Extraneous Cognitive Load – this relates to broader environmental knowledge, not related to the specific skill of the task, but still necessary, e.g. what surrounding process admin steps must be done after this task.
  • Germane Cognitive Load – this relates to thinking about how to make the work as effective as possible e.g. what should the design be, how will things integrate.

The strategies for managing each type of cognitive load are as follows:

  • Intrinsic Cognitive Load – can be reduced by training people or finding people experienced at a task.  The greater their relevant skill levels, the lower the load required to do the task.
  • Extraneous Cognitive Load – can be reduced by automation or good process design and discoverability.
  • Germane Cognitive Load – is really the type of work you want people focusing on.  However, the amount required is a function of how much scope someone has to worry about.

This is all very interesting and applicable to individuals, but you might be wondering: how does this relate to organisational team design?  The book presents a very useful idea here: cognitive load should be considered in terms of the total amount required by whole teams.

Putting this in plainer terms, when you are thinking about team design and organisational structure, you need to consider how much collectively you are expecting the team to:

  • know
  • be able to perform
  • be able to make effective decisions about
  • be able to make brilliant.

The reason I found this so powerful is that it gives you a logical way to reason about the currently fashionable idea that end-to-end / DevOps / BusDevSecOps(!) teams are the utopia (or worse: the meme that if there are separate teams in the value stream you are not doing DevOps and you are doing it wrong).  Sure, giving as much ownership of a product or service as possible to one team avoids team boundaries, but it also increases the cognitive load on the team and potentially the minimum number of people needed in a team.

Decomposing and fracture planes

So a simplified summary so far:

  • Amazon have helped highlight the benefit of minimising handoffs, queues, and communication
  • Sweller has taught us to avoid giving a team too much Cognitive load
  • Dunbar has taught us to keep teams at around 8 people.

At this point we have to consider the amount of scope assigned to a team as our way to satisfy the above constraints.  If we can keep it small enough, perhaps we can still give them end-to-end autonomy whilst keeping team sizes down and cognitive load manageable.

This is where the book starts talking about the concept of Fracture Planes.  The name is a metaphor for how stonemasons, when breaking rocks, focus on the natural characteristics of the structure of the rock in order to break it up cleanly and efficiently.  The theory is that software systems also naturally have characteristics that create more effective places to divide things up.  The metaphor is especially poetic considering large tightly coupled software systems are often likened to monolithic rocks.

The book provides a useful discussion of different types of fracture plane to explore including:

  • Business domains (i.e. where the whole domain-driven design and bounded contexts design techniques come into play)
  • Change cadence i.e. things that need to change at the same pace
  • Risk i.e. things with similar risk profiles
  • (Most important of all) separation of platform and application (more on that to come).

Ideally all of these can help decompose systems whilst minimising cognitive load and keeping team sizes small enough.

In Part 2 I’ll share the reusable patterns the book proposes for doing this.

Transforming the future together

We have chosen “Transforming the future together” as the theme for the DevOpsDays London 2017 Conference.  I wanted to share here some personal thoughts about what it means.  These may not be the same as the views of the other organisers (including Bob Walker, who proposed the theme).  If you are reading this before 31st May 2017 there is still time for you to submit a proposal to talk!

One of the things I find most inspiring about the DevOps community is the level of sharing, not just in terms of Open Source software, but in terms of strong views on culture, processes, metrics and usage of technology.  I am especially excited by the theme because of the potential for expanding what we share.


If we are going to transform the future I think to an extent we have to hold an interpretation of:

  • where we are today,
  • where the future might be heading without our intervention,
  • how we would like it to be, and
  • what we might do to influence it.

Even standing alone I can imagine each of these having lots of potential.

The great thing about hearing about the experiences of others is that it can be a shortcut to discovering things without having to try them yourself.  For example, you are walking in the country and you pass someone covered in mud.  “The middle of this track is much more slippery than it looks,” they volunteer with a pained smile.  Whether you choose to take the side of the path or test your own ability to brave the slippery part, you are much better off for the experience.

I hope to see some talks at DevOpsDays London that attempt to describe, with great balance, where they are today as technology organisations.  Not just how many times a day they can change static wording on their website, but also how fast they can change, for example, their Oracle packaged products, and what their bottlenecks are.

“The future is already here — it’s just not very evenly distributed.”  William Gibson

Naturally everyone is at a different stage of their DevOps journey.  I’m hoping there will be a good range of talks reflecting this variety.

I would love to hear some predictions, optimistic and pessimistic, about where we might be and where we should be heading.  How will the world change when we cross 50% of people being online?  Are robots really coming for our jobs or our lives?  Is blockchain really going to find widespread usage?  Will the private data centre be marginalised as much as private electricity generators have been?  Will Platform-as-a-Service finally gain widespread traction, or will it be eclipsed by Serverless?  Will 100% remote working liberate more of us from living near expensive cities?  Will the debate about what is and isn’t Agile ever end?  What is coming after Microservices?  Will the gender balance in our industry become more equal?  Can holacracies succeed as a form of organisation?  These questions are of course just the start.

Finally, what are people doing now towards achieving some of the above?  What technologies are really starting to change the game?  What new ways of working and techniques are really making an impact (positive or negative)?  How are they working to be more inclusive and improve diversity?  Are they actually liking bi-modal IT?  What are they doing to alleviate their bottlenecks?

Please share your own questions that you would like to see answered and once again please consider submitting a proposal.

Image: https://en.wikipedia.org/wiki/File:ArcaBoard_Dumitru_Popescu.jpg

Using ADOP and Docker to Learn Ansible

As I have written here, the DevOps Platform (aka ADOP) is an integration of open source tools that is designed to provide the tooling capability required for Continuous Delivery.  Through the concept of cartridges (plugins) ADOP also makes it very easy to re-use automation.

In this blog I will describe an ADOP Cartridge that I created as an easy way to experiment with Ansible.  Of course there are many other ways of experimenting with Ansible such as using Vagrant.  I chose to create an ADOP cartridge because ADOP is so easy to provision and predictable.  If you have an ADOP instance running you will be able to experience Ansible doing various interesting things in under 15 minutes.

To try this for yourself:

  1. Spin up an ADOP instance
  2. Load the Ansible 101 Cartridge (instructions)
  3. Run the jobs one-by-one and in each case read the console output.
  4. Re-run the jobs with different input parameters.

For anyone only loosely familiar with ADOP, Docker, and Ansible, I recognise that this blog could be hard to follow, so here is a quick diagram of what is going on.

[Diagram: docker-ansible]

The Jenkins Jobs in the Cartridge

The jobs do the following things:

As the name suggests, this job just demonstrates how to install Ansible on CentOS.  It installs Ansible in a Docker container in order to keep things simple and easy to clean up.  Having built a Docker image with Ansible installed, it tests the image just by running the following inside the container:

$ ansible --version
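The Dockerfile behind that image might look something like this sketch (the CentOS 7 base image and EPEL install route are my assumptions – the cartridge may do it differently):

```shell
# Write a hypothetical Dockerfile for an image with Ansible installed.
cat > Dockerfile <<'EOF'
FROM centos:7
# EPEL provides the ansible package on CentOS
RUN yum install -y epel-release && \
    yum install -y ansible && \
    yum clean all
EOF

# It would then be built and tested roughly like this:
# docker build -t ansible-control .
# docker run --rm ansible-control ansible --version
```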

2_Run_Example_Adhoc_Commands

This job is a lot more interesting than the previous one.  As the name suggests, the job is designed to run some ad hoc Ansible commands (which is one of the first things you’ll do when learning Ansible).

Since the purpose of Ansible is infrastructure automation, we first need to set up an environment to run commands against.  My idea was to set up an environment of Docker containers pretending to be servers.  In real life I don’t think we would ever want Ansible configuring running Docker containers (we normally want Docker containers to be immutable and certainly don’t want them to have ssh access enabled).  However, I felt it was a quick way to get started and create something repeatable and disposable.

The environment created resembles the diagram above.  As you can see, we create two Docker containers (acting as servers) calling themselves web nodes and one calling itself db-node.  The images already contain a public key (the same one Vagrant uses, actually) so that they can be ssh’d to (once again, not good practice with Docker containers, but needed so that we can treat them like servers and use Ansible).  We then use an image which we refer to as the Ansible Control Container.  We create this image by installing Ansible and adding an Ansible hosts file that tells Ansible how to connect to the db and web “nodes” using the same key mentioned above.
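For illustration, the hosts file baked into the Ansible Control Container could look something like the following sketch (the group names, hostnames, and connection settings here are my assumptions based on the description above):

```shell
# Write a hypothetical Ansible inventory describing the pretend servers.
cat > hosts <<'EOF'
[web]
web-node-1
web-node-2

[db]
db-node

[all:vars]
ansible_user=root
ansible_ssh_private_key_file=/root/.ssh/id_rsa
EOF

# The "web" and "db" groups are what the ad hoc commands target.
grep -c 'node' hosts   # → 3
```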

With the environment in place the job runs the following ad hoc Ansible commands:

  1. ping all web nodes using the Ansible ping module: ansible web -m ping
  2. gather facts about the db node using the Ansible setup module: ansible db -m setup
  3. add a user to all web servers using the Ansible user module: ansible web -b -m user -a "name=johnd comment='John Doe' uid=1040"

By running the job and reading the console output you can see Ansible in action and then update the job to learn more.

3_Run_Your_Adhoc_Command

This job is identical to the job above in terms of setting up an environment to run Ansible.  However, instead of having the hard-coded ad hoc Ansible commands listed above, it allows you to enter your own commands when running the job.  By default it pings all nodes:

ansible all -m ping

4_Run_A_Playbook

This job is identical to the job above in terms of setting up an environment to run Ansible.  However, instead of passing in an ad hoc Ansible command, it lets you pass in an Ansible playbook to run against the nodes.  By default, the playbook that gets run installs Apache on the web nodes and PostgreSQL on the db node.  Of course you can change this to run any playbook you like, so long as it is set to run on a host expression that matches: web-node-1, web-node-2, and/or db-node (or “all”).
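As a sketch, a playbook doing what the default one is described as doing might look like this (the module choices and package names are my assumptions, not necessarily what the cartridge ships):

```shell
# Write a hypothetical playbook: Apache on the web nodes, PostgreSQL on db.
cat > site.yml <<'EOF'
---
- hosts: web
  become: yes
  tasks:
    - name: Install Apache
      yum:
        name: httpd
        state: present

- hosts: db
  become: yes
  tasks:
    - name: Install PostgreSQL
      yum:
        name: postgresql-server
        state: present
EOF

# The job would then run it from the Ansible Control Container, roughly:
# ansible-playbook -i hosts site.yml
```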

How jobs 2-4 work

To understand exactly how jobs 2-4 work, read the code: it is reasonably well commented and should be fairly readable.  At a high level, the following steps are run:

  1. Create the Ansible inventory (hosts) file that our Ansible Control Container will need so that it can connect (ssh) to our db and web “nodes” to control them.
  2. Build the Docker image for our Ansible Control Container (install Ansible like the first Jenkins job, and then add the inventory file)
  3. Create a Docker network for our pretend server containers and our Ansible Control container to all run on.
  4. Create a docker-compose file for our pretend servers environment
  5. Use docker-compose to create our pretend servers environment
  6. Run the Ansible Control Container, mounting in the Jenkins workspace if we want to run a local playbook file, or otherwise just running the ad hoc Ansible command.
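Steps 3-5 can be sketched roughly as follows (the image and network names here are hypothetical; the repo’s actual names may differ):

```shell
# Hypothetical sketch of steps 3-5: an external Docker network plus a
# compose file for the three pretend-server containers.
# Step 3 would be: docker network create ansible_net
cat > docker-compose.yml <<'EOF'
version: "2"
networks:
  ansible_net:
    external: true
services:
  web-node-1:
    image: pretend-server    # assumed image with sshd and the public key baked in
    networks: [ansible_net]
  web-node-2:
    image: pretend-server
    networks: [ansible_net]
  db-node:
    image: pretend-server
    networks: [ansible_net]
EOF
# Step 5 would then be: docker-compose up -d
```

Putting the Ansible Control Container on the same network (step 6) is what lets it reach the “nodes” by their container names.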

Conclusion

I hope this has been a useful read and has clarified a few things about Ansible, ADOP and Docker.  If you found it useful, please star the GitHub repo and/or share a pull request!

Bonus: here is an ADOP Platform Extension for Ansible Tower.

ADOP with Pivotal Cloud Foundry

As I have written here, the DevOps Platform (aka ADOP) is an integration of open source tools that is designed to provide the tooling capability required for Continuous Delivery.

In this blog I will describe integrating ADOP with the Cloud Foundry public PaaS from Pivotal.  Whilst it is of course technically possible to run all of the tools found in ADOP on Cloud Foundry, that wasn’t our intention.  Instead we wanted to combine the Continuous Delivery pipeline capabilities of ADOP with the industrial-grade, cloud-first environments that Cloud Foundry offers.

Many ADOP cartridges, for example the Java Petclinic one, contain two Continuous Delivery pipelines:

  • The first to build and test the infrastructure code and build the Platform Application
  • The second to build and test the application code and deploy it to an environment built on the Platform Application.

The beauty of using a public PaaS like Pivotal Cloud Foundry is that your platforms and environments are taken care of, leaving you much more time to focus on the application code.  However, you do of course still need to create an account and provision your environments:

  1. Register here
  2. Click Pivotal Web Services
  3. Create a free tier account
  4. Create an organisation
  5. Create one or more spaces

With this in place you are ready to:

  1. Spin up an ADOP instance
  2. Store your Cloud Foundry credentials in Jenkins’ Secure Store
  3. Load the Cloud Foundry Cartridge (instructions)
  4. Trigger the Continuous Delivery pipeline.
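Before storing your Cloud Foundry credentials in Jenkins (step 2), it can help to verify them from a terminal with the cf CLI. A dry-run sketch, where the org and space names are hypothetical and the commands are echoed rather than executed:

```shell
# Hypothetical dry run of verifying Cloud Foundry credentials with the cf CLI.
# ORG and SPACE are assumptions; replace them with the ones you created above.
ORG="my-org"
SPACE="development"

# Echo each command instead of executing it, so this reads as a dry run;
# remove the echo wrapper to run the commands for real.
run() { echo "+ $*"; }

run cf login -a https://api.run.pivotal.io   # prompts for your account credentials
run cf target -o "$ORG" -s "$SPACE"          # point the CLI at your org and space
run cf apps                                  # list apps in the space as a smoke test
```

If `cf target` succeeds interactively, the same credentials should work once stored in Jenkins’ Secure Store.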

Having done all of this, the pipeline now does the following:

  1. Builds the code (which happens to be the JPetStore).
  2. Runs the unit tests and performs static code analysis using SonarQube.
  3. Deploys the code to an environment (known in Cloud Foundry as a Space).
  4. Performs functional testing using Selenium and some security testing using OWASP ZAP.
  5. Performs some performance testing using Gatling.
  6. Kills the running application in the environment and waits to verify that Cloud Foundry automatically restores it.
  7. Deploys the application to a multi node Cloud Foundry environment.
  8. Kills one of the nodes in Cloud Foundry and validates that Cloud Foundry automatically avoids sending traffic to the killed node.

The beauty of ADOP is that all of this great Continuous Delivery automation is fully portable and can be loaded time and time again into any ADOP instance running on any cloud.

There is plenty more we could have done with the cartridge to really put the PaaS through its paces, such as generating load and watching auto-scaling in action.  Everything is on GitHub, so pull requests will be warmly welcomed!  If you’ve tried to follow along but got stuck at all, please comment on this blog.

Abstraction is not Obsoletion – Abstraction is Survival

Successfully delivering Enterprise IT is a complicated, probably even complex, problem.  What’s surprising is that, as an industry, many of us are still comfortable accepting so much of the problem as our own to manage.

Let’s consider an admittedly very simplified and arguably imprecise view of the “full stack”:

  • Physical electrical characteristics of materials (e.g. copper / p-type silicon, …)
  • Electronic components (resistor, capacitor, transistor)
  • Integrated circuits
  • CPUs and storage
  • Hardware devices
  • Operating Systems
  • Assembly Language
  • Modern Software Languages
  • Middleware Software
  • Business Software Systems
  • Business Logic

When you examine this view, hopefully (irrespective of what you think about what’s included or missing, and the order) it is clear that when we do “IT” we are already extremely comfortable being abstracted from detail.  We are already fully ready to use things which we do not, and may never, understand.  When we build an eCommerce platform, an ERP, or a CRM system, little thought is given to electronic components, for example.

My challenge to the industry as a whole is to recognise more openly the immense benefit of abstraction, on which we are already entirely dependent, and to embrace it even more urgently!

Here is my thinking:

  • Electrons are hard – so we take them for granted
  • Integrated circuits are hard – so we take them for granted
  • Hardware devices (servers for example) are hard – so why are so many enterprises still buying and managing them?
  • The software that it takes to make servers useful for hosting an application is hard – so why are we still doing this by default?

For solutions that still involve writing code, the most extreme example of abstraction I’ve experienced so far is the Lambda service from AWS.  Some have started calling such things serverless computing.

With Lambda you write your software functions and upload them, ready for AWS to run for you.  Then you configure the triggering event that causes your function to run.  Then you sit back and pay for the privilege whilst enjoying the benefits.  Obviously, if the benefits outweigh the cost of the service, you are making money.  (Or perhaps, in the world of venture capital, if the benefits are generating lots of revenue or even just active-user growth, for now you don’t care…)

Let’s take a mobile example.  Anyone with enough time and dedication can sit at home on a laptop and start writing mobile applications.  If they write a purely standalone, offline application and charge a small fee for it, theoretically they can make enough money to retire on without even knowing how to spell “server”.  But in practice most applications (even if they just rely on in-app adverts) require network-enabled services.  Even so, our app developer still doesn’t need to spell “server”; they just need to use the API of an online ad company, e.g. AdWords, and their app will start generating advertising revenue.  Next, perhaps the application relies on persisting data off the device, or on notifications being pushed to it.  The developer still only needs to use another API to do this; for example, Parse provides all of that as a programming service.  You just use the software development kit and are completely abstracted from servers.

So why are so many enterprises still exposing themselves to so much of the “full stack” above?  I wonder how much inertia there was to integrated circuits in the 1950s and how many people argued against abstraction from transistors…

To survive is to embrace Abstraction!

 


[1] Abstraction in a general computer science sense not a mathematical one (as used by Joel Spolsky in his excellent Law of Leaky Abstractions blog.)

Join the DevOps Community Today!

As I’ve said in the past, if your organisation does not yet consider itself to be “doing DevOps” you should change that today.

If I was pushed to say the one thing I love most about the DevOps movement, it would be the sense of community and sharing.

I’ve never experienced anything like it previously in our industry.  It seems like everyone involved is united by being passionate about collaborating in as many ways as possible to improve:

  • the world through software
  • the rate at which we can do that
  • the lives of those working in our industry.

The barrier to entry to this community is extremely low.

You could also consider attending the DevOps Enterprise Summit London (DOES).  It’s the third DOES event, the first ever in Europe, and highly likely to be one of the most important professional development things you do this year.  Organised by Gene Kim (co-author of The Phoenix Project) and IT Revolution, the conference is highly focused on bringing together anyone interested in DevOps and providing them with as much support as humanly possible over two days.  This involves presentations from some of the most advanced IT organisations in the world (aka unicorns), as well as many from those in traditional enterprises who may be on a very similar journey to you.  Already confirmed are talks from:

  • Rosalind Radcliffe talking about doing DevOps with mainframe systems
  • Ron Van Kemenade, CIO of ING Bank
  • Jason Cox talking about DevOps transformation at Disney
  • Scott Potter, Head of New Engineering at News UK
  • And many more.

My recommendation is to get as many of your organisation along to the event as possible.  They won’t be disappointed.

Early bird tickets are available until 11th May 2016.

(Full disclosure – I’m a volunteer on the DOES London committee.)
