Start Infrastructure Coding Today!

*Warning: this post contains mildly anti-Windows sentiments*

It has never been easier to get ‘hands-on’ with Infrastructure Coding and Containers (yes including Docker), even if your daily life is spent using a Windows work laptop.  My friend Kumar and I proved this the other Saturday night in just one hour in a bar in Chennai.  Here are the steps we performed on his laptop.  I encourage you to do the same (with an optional side order of Kingfisher Ultra).

 

  1. We installed Docker Toolbox.
    It turns out this is an extremely fruitful first step as it gives you:

    1. Git (and in particular GitBash). This allows you to use the world’s best Software Configuration Management tool, Git, and welcomes you into the world of being able to use and contribute to Open Source software on GitHub.  Plus it has the added bonus of turning your laptop into something which understands good wholesome Linux commands.
    2. VirtualBox. This is a hypervisor that turns your laptop from being one machine running one Operating System (Windoze) into something capable of running multiple virtual machines with almost any Operating System you want (even unikernels!).  Suddenly you can run (and develop on) local copies of servers that, from a software perspective, match Production.
    3. Docker Machine. This is a command line utility that creates virtual machines for running Docker on.  It can do this either locally on your shiny new VirtualBox install or remotely in the cloud (even the Azure cloud; Linux machines, of course).
    4. Docker command line. This is the main command line utility of Docker.  This will enable you to download and build Docker images, and turn them into running Docker containers.  The beauty of the Docker command line is that you can run it locally (ideally in GitBash) on your local machine and have it control Docker running on a Linux machine.  See diagram below.
    5. Docker Compose. This is a utility that gives you the ability to run and associate multiple Docker containers by reading what is required from a text file (a minimal example is sketched below the diagram).
    [Diagram: the Docker command line running in GitBash on Windows, controlling the Docker daemon on a VirtualBox virtual machine]
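    As a taster, here is roughly what a minimal Docker Compose file and its use might look like (the docker-compose.yml name is the tool’s default; the nginx and redis images and the port mapping are purely illustrative):
    $ cat docker-compose.yml          # describes two containers and how they relate
    web:
      image: nginx:latest
      ports:
        - "8080:80"
    cache:
      image: redis:latest
    $ docker-compose up -d            # start both containers in the background
    $ docker-compose ps               # list the containers Compose is managing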
  2. Having completed step 1, we opened up the Docker Quickstart Terminal by clicking the entry that had appeared in the Windows start menu. This runs a shell script via GitBash that performs the following:
    1. Creates a VirtualBox virtual machine (called ‘default’) and starts it
    2. Installs Docker on the new virtual machine
    3. Leaves you with a GitBash window open that has the necessary environment variables set so that the Docker command line utility points at your new virtual machine.
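    If you ever need to wire this up by hand in a plain GitBash window, the same effect can be achieved with something like the following (assuming your machine is still called ‘default’):
    $ docker-machine start default            # boot the VM if it is not already running
    $ docker-machine env default              # print the DOCKER_HOST etc. variables
    $ eval "$(docker-machine env default)"    # point this shell's Docker CLI at the VM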
  3. We wanted to test things out, so we ran:
    $ docker ps -a
    CONTAINER ID  IMAGE   COMMAND   CREATED   STATUS   PORTS  NAMES

    This showed us that our Docker command line tool was successfully talking to the Docker daemon (process) running on the ‘default’ virtual machine, and that no containers were either running or stopped there.

  4. We wanted to test things a little further, so we ran:
    $ docker run hello-world

    Hello from Docker.
    This message shows that your installation appears to be working correctly.

    To generate this message, Docker took the following steps:
    The Docker client contacted the Docker daemon.
    The Docker daemon pulled the "hello-world" image from the Docker Hub.
    The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
    The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

    To try something more ambitious, you can run an Ubuntu container with:
    $ docker run -it ubuntu bash

    Share images, automate workflows, and more with a free Docker Hub account:
    https://hub.docker.com

    For more examples and ideas, visit:
    https://docs.docker.com/userguide

    The output is very self-explanatory, so I recommend reading it now.

  5. We followed the instructions above to run a container from the Ubuntu image.  This started a container running Ubuntu, and we ran a command to satisfy ourselves that we really were on Ubuntu.  Note one slight modification: we had to prefix the command with ‘winpty’ to work around a tty-related issue in GitBash.
    $ winpty docker run -it ubuntu bash
    
    root@2af72758e8a9:/# apt-get -v | head -1
    
    apt 1.0.1ubuntu2 for amd64 compiled on Aug  1 2015 19:20:48
    
    root@2af72758e8a9:/# exit
    
    $ exit

     

  6. We wanted to run something else, so we ran:
    $ docker run -d -P nginx:latest

     

  7. This caused the Docker command line to do more or less what is described in the previous step, with a few exceptions:
    • The -d flag caused the container to run in the background (so we didn’t need -it).
    • The -P flag caused Docker to publish Nginx’s exposed ports on randomly chosen ports of the Docker host (our ‘default’ virtual machine).
    • The image was Nginx rather than Ubuntu.  We didn’t need to specify a command for the container to run after starting (leaving it to run its default command).
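    If you would rather choose the port yourself than have Docker pick a random one, the lower-case -p flag maps it explicitly, for example:
    $ docker run -d -p 8080:80 nginx:latest   # publish container port 80 on port 8080 of the Docker host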
  8. We then ran the following to establish how to connect to our Nginx:
    $ docker-machine ip default
    192.168.99.100
    
     $ docker ps
    
    CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                                           NAMES
    
    826827727fbf        nginx:latest        "nginx -g 'daemon off"   14 minutes ago      Up 14 minutes       0.0.0.0:32769->80/tcp, 0.0.0.0:32768->443/tcp   ecstatic_einstein
    
    

     

  9. We opened a proper web browser (Chrome) and navigated to http://192.168.99.100:32769/ using the information above (your IP address and port may differ). Pleasingly, we were presented with the ‘Welcome to nginx!’ default page.
  10. We decided to clean up some of what we’d created locally on the virtual machine, so we ran the following to:
    1. Stop the Nginx container
    2. Delete the stopped containers
    3. Demonstrate that we still had the Docker ‘images’ downloaded

 

$ docker kill `docker ps -q`

8d003ca14410
$ docker rm `docker ps -aq`

8d003ca14410

2af72758e8a9

…

$ docker ps -a

CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

$ docker images

REPOSITORY                     TAG                 IMAGE ID            CREATED             VIRTUAL SIZE

nginx                          latest              sha256:99e9a        4 weeks ago         134.5 MB

ubuntu                         latest              sha256:3876b        5 weeks ago         187.9 MB

hello-world                    latest              sha256:690ed        4 months ago        960 B

 

 

  11. We went back to Chrome and hit refresh. As expected, Nginx was gone.
  12. We opened Oracle VM VirtualBox from the Windows start menu so that we could see our ‘default’ machine listed as running.
  13. We ran the following to stop our ‘default’ machine, and watched it shut down in VirtualBox:
    $ docker-machine stop default

     

  14. Finally we installed Vagrant. This is essentially a much more generic version of Docker Machine: it can create virtual machines in VirtualBox not just for running Docker but for many other purposes.  For example, from an Infrastructure Coding perspective, you might use one as a virtual machine for developing Chef code (a minimal taster is sketched below).
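    To give a flavour, a first Vagrant virtual machine is only a handful of commands away (the ‘hashicorp/precise64’ box name is just one commonly used example):
    $ vagrant init hashicorp/precise64   # write a Vagrantfile describing the VM
    $ vagrant up                         # create and boot the VM in VirtualBox
    $ vagrant ssh                        # open a shell on the new VM
    $ vagrant destroy                    # tear it down again when you are finished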

 

Not bad for one hour on hotel wifi!

Kumar keenly agreed he would complete the following next steps.  I hope you’ll join him on the journey and Start Infrastructure Coding Today!

  1. Learn Git. It really only takes 10 minutes with this tutorial LINK to learn the basics.
  2. Docker – continue the journey here
  3. Vagrant
  4. Chef
  5. Ansible

 

Please share any issues you have following this and I’ll improve the instructions.  Please also share any other useful tutorials and I will add those too.


Neither Carrot Nor Stick

Often when we talk about motivating people, we reach for the idiom of choosing between a Carrot and a Stick. I believe this originates from conventional wisdom about the best way to get a mule (as in the four-legged, horse-like animal) to move. You could try using a carrot, which might be enough of a treat for the mule to move in order to reach it. Or you could try a stick, which might be enough of a threat to get the mule to move in order to avoid being hit.

The idiom works because the carrot is analogous to offering someone an incentive (such as pay rises or bonuses) to get them to do something. The stick is analogous to offering them the threat of punishment (such as being fired or demoted). It’s curious how threat and treat differ by just one letter…

This all makes sense for a mule but not really for people.

The idiom has a major flaw because humans are significantly more complex than animals (all of us!).

Instead, if we want to influence someone effectively and sustainably, we need to think about how to help them form an emotional attachment to the thing you are looking to achieve.

I think this comes down to the following:

  • Being open to exploring both their and your personal motivations, with a view to maximising the achievement of both, in particular the overlap.
  • Starting from an open mind and only looking to agree the desired outcome. This is not the same as agreeing the approach. The approach is key to the satisfaction and motivation of the implementer and key to their attachment to achieving a great solution.
  • Supporting them in their chosen approach, taking care not to challenge unnecessarily or do anything that risks eroding their sense of your trust.
  • Being transparent about the consequences of not delivering the desired outcome, and clarifying your own role in shielding them from blame and creating a safe environment in which to operate.

Of course these ideas are not my own.  I would encourage you to explore the great materials that I have taken inspiration from.

And I’d love to hear your own ideas and recommended reading.

 

Reducing Continuous Delivery Impedance – Part 3: COTS Software Packages

This is my third post in a series I’m writing about impedance to doing continuous delivery.  Click these links for earlier parts about the trials of Infrastructure and Solution Complexity.

Software packages, specifically Commercial-Off-The-Shelf Software (COTS), are software solutions purchased from a 3rd party vendor.  The idea is that you are purchasing something that already delivers what you need, for less money than it would cost you to build it.  There are also lots of other stated benefits.  I’m certainly not challenging the use of COTS products altogether, or claiming that they are all the same.  However, many of those I’ve worked with over the years have created particularly strong impedance to doing continuous delivery.

In my experience there is nothing like custom-code applications when it comes to compatibility with continuous delivery.  For example, Java:

  • Starting with the early stages of the pipeline you’ve got Maven (or Ant or Gradle), FindBugs, Checkstyle, Sonar, JUnit, Mockito, etc., all excellent. (Equivalents are readily available for C#, JavaScript, HTML, CSS, Ruby, Python, Perl, PHP, you name it…)
  • Moving on to deployment, you can literally take your pick: building native OS packages (e.g. RPMs), self-sufficient Java applications (e.g. Spring Boot), configuration management tools like Chef or Puppet, Jenkins and shell scripts, I could go on…

This isn’t a new thing either: C and even Fortran are great, and even mainframes, coupled with proficient use of something like Endevor, are very buildable, releasable and deployable.

Sadly with COTS products things are often not so simple…


Let’s start at the beginning of the pipeline.  COTS products can be hard to version control.  Many COTS products seem to consider a development repository part of their package.  How generous of them to take version control / SCM off our hands?  Not really; how inconvenient: they rarely do a good job of it.  The staples of a good SCM are often absent:

  • Offline / distributed working – many COTS products require access to a shared development database.  This can be bad for productivity (slow), it may even be bad for flexible working (if people can’t get remote access)
  • Revision tracking – many COTS products may store limited metadata e.g. who changed what, but may limit your ability to do things like make an association to an ALM ticket e.g. in Jira
  • Branching  – usually not even a concept leaving you to fork irreconcilably (admittedly arguably branching isn’t really in the spirit of continuous delivery…)
  • Baselines – very rarely a concept allowing no reasonable way of promoting something other than the latest state of the development repository.

Usually, to achieve the sound SCM described above, you need to work hard to export your configurable items into flat files so that you can store them in a proper SCM system (e.g. Git).  Generally this takes some kind of export / check-in tool that you may have to write yourself, making you responsible for keeping the SCM and COTS development repositories consistent with each other.
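The glue often ends up looking something like the sketch below, where ‘cots-export’ is purely a placeholder for whatever export mechanism your particular vendor provides (it is not a real command):

$ cots-export --out ./export            # placeholder: dump configurable items as flat files
$ cd ./export && git add -A             # stage everything that changed in this export
$ git commit -m "COTS development repository snapshot"
$ git push origin master                # Git now holds the reviewable, versioned history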

So let’s say we’ve got our COTS product’s moving parts flattened into version control and under our control.  How do we handle merge?  The answer is usually simple: you can’t.  Any modification of these fragile, sensitive XML (in practical terms, binary) files outside the IDE is strictly prohibited by the COTS vendor, and even if attempted it is unlikely to produce anything that works (even if you get into clever schema-informed XML merging).

Even if you aren’t branching, it is perfectly reasonable for two people to end up updating the same files simultaneously.  Since merging is off the table, you now need to prevent multiple people changing the same file at the same time.  We must turn to our SCM system, or perhaps a hybrid of the COTS IDE and the SCM system, to implement pessimistic locking, which even when achieved successfully has a negative impact on productivity.

So we’ve got a process of developers pushing consistent changes into version control.  The next challenge may be how to build.  Whilst the COTS product may export flat files, often this is just so that you can import them into another development repository, and you may not be able to build at all without re-importing your flat files into such a repository.  Worst of all is when you have to import manually and then trigger the build manually within the IDE.  I’m not saying it isn’t possible to automate these things (I have definitely done it), but when it comes to having to involve AutoIt (screen scraping), you’ve got to feel somewhat frustrated/humiliated/sick/sad.

It’s not uncommon for COTS products to be slow to build.  I’m not saying the problem is exclusive to them, but with a custom application it is usually much easier to break the build up into smaller components that can be built separately.

Once we have a build, the task of deploying may not be easy.  Many COTS products expect fully manual, mouse-driven deployments, and you basically end up reverse engineering the product to automate deployments.  Deployments can also be complex, slow and unpredictable.

Finally, environments for COTS products can be tricky.  Usually a custom application relies on standard middleware, or better still a PaaS.  COTS products can involve following hundreds if not thousands of pages of manual steps from installation manuals, leading to a huge amount of work with a configuration management tool like Puppet or Chef.  Worse still, there may be manual steps (e.g. a configuration wizard) that are very difficult to automate, and again you are left reverse engineering the product to remove them.

Like I said earlier, I’m not rejecting COTS products altogether, but I’d like to be very clear that they can, in many different ways, create impedance to doing continuous delivery.  It is very rare that, given enough effort, these limitations cannot be overcome; sadly, it may be equally rare that organisations invest the necessary effort.

 

Reducing Continuous Delivery Impedance – Part 2: Solution Complexity

This is my second post in a series I’m writing about impedance to doing continuous delivery and how to overcome it.  Part 1 about Infrastructure challenges can be found here.  I also promised to write about complexity in continuous delivery in this earlier post about delivery pipelines.

I’m defining “a solution” as the software application, or multiple applications, under current development that need to work together (and hence be tested in an integrated manner) before release to production.

In my experience, continuous delivery pipelines work extremely well when you have a simple solution with the following convenient characteristics:

  1. All code and configuration is stored in one version control repository (e.g. Git)
  2. The full solution can be deployed all the way to production without needing to test it in conjunction with other applications / components under development
  3. You are using a 3rd party PaaS (treated as a black box, like Heroku, Google App Engine, or AWS Elastic Beanstalk)
  4. The build is quick to run i.e. less than 5 minutes
  5. The automated tests are quick to run, i.e. minutes
  6. The automated test coverage is sufficient that the risks associated with releasing the software can be understood to be lower in value than the benefits of releasing.

The first 3 characteristics are what I am calling “Solution Complexity” and what I want to discuss in this post.

Here is a nice simple depiction of an application ticking all the above boxes.

[Diagram: a simple application with a single repository, one CD pipeline and a 3rd party PaaS]

Developers can make changes in one place, know that their change will be fully tested and know that when deployed into the production platform, their application should behave exactly as expected.  (I’ve squashed the continuous delivery (CD) pipeline into just one box, but inside it I’d expect to see a succession of code deployments, and automated quality gates like this.)

 

But what about when our solution is more complex?

What about if we fail to meet the first characteristic and our code is in multiple places and possibly not all in version control?  This is definitely a common problem I’ve seen, in particular for configuration and data-loading scripts.  However, it isn’t particularly difficult to solve from a technical perspective (more on the people side in a future post!).  Get everything managed by a version control tool like Git.

Depending on the SCM tool you use, you may not need to feel obliged to use one repository.  If you do use multiple, most continuous integration tools (e.g. Jenkins) can be set up to handle builds that consume from multiple repositories.  If you are using Git, you can even handle this complexity within your version control repository, e.g. by using sub-modules.
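If you do go the sub-module route, wiring a second repository into the main one is a one-off operation along these lines (the repository URLs and path are purely illustrative):

$ git submodule add https://example.com/acme/deploy-scripts.git deploy-scripts
$ git commit -m "Pin deploy-scripts as a sub-module"
$ git clone --recursive https://example.com/acme/main-app.git   # consumers then fetch both repositories together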

 

What about if your solution includes multiple applications like the following?

[Diagram: a solution made up of multiple applications, each with its own pipeline, that must be integrated before production]

Suddenly our beautiful pipeline metaphor is broken and we have a network of pipelines that need to converge (analogous to fan-in in electronics).  This is far from a rarity and I would say it is overwhelmingly the norm.  This certainly makes things more difficult and we now have to carefully consider how our plumbing is going to work.  We need to build what I call an “integrated pipeline”.

Designing an integrated pipeline is all about determining the “points of integration”, aka POIs, i.e. the first time that testing involves the combination of two or more components.  At each POI, you need to record the versions of each component so that they are kept consistent for the rest of the pipeline.  If you fail to do this, earlier quality gates in the pipeline are invalidated.

In the below example, Applications A and B have their own CD pipelines where they will be deployed to independent test environments and face a succession of independent quality gates.  Whenever a version of Application A or B gets to the end of its respective pipeline, instead of going into production, it moves into the integrated pipeline and creates a new integrated or composite build number.  After this POI the applications progress towards production in the same pipeline and can only move in sync.  In the diagram, version A4 of Application A and version B7 of B have made it into integration build I8.  If integration build I8 makes it through the pipeline it will be worthy of promotion to production.

[Diagram: Applications A and B, each with their own pipeline, converging at the POI into integration builds that progress to production]

Depending on the tool you use for orchestration, there are different solutions for achieving the above.  Fundamentally it doesn’t have to be particularly complicated.  You are simply aggregating version numbers, which can easily be stored together in a text document in any format you like (YAML, POM, JSON, etc.).
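For instance, the record created at the POI can be as simple as a few lines like the following (the file name and keys are purely illustrative; the A4, B7 and I8 values come from the example above):

$ cat integration/I8.yaml       # hypothetical manifest captured at the POI
integrated_build: I8
application_a: A4
application_b: B7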

Some people reading this may by now be boiling up inside, ready to scream “MICRO SERVICES” at their screens.  Micro services are by design independently deployable services.  The independence is achieved by ensuring that they fulfil strict contract APIs (and expect the same of the services they consume), so that integration with other services can be managed and components can be upgraded independently.  A convention like SemVer can be adopted to manage changes to contract compatibility.  I’ve for a while had this tagged in my head as the eBay or Amazon way of doing things, but micro services are now gaining a lot of attention.  If you are implementing micro services and achieving this independence between pipelines, that’s great.  Personally, on the one micro services solution I’ve worked on so far, we still opted for an integrated pipeline that operated on an integrated build and produced predictable upgrades to production (we are looking to relax that at some point in the future).

Depending on how you are implementing your automated deployment, you may have deployment automation scripts that live separately from your application code.  Obviously we want to use consistent versions of these throughout deployments to the different environments in the pipeline, so I strongly advise managing these scripts as a component in the same manner.

What about if you are not using a PaaS?  In my experience, this represents the vast majority of solutions I’ve worked on.  If you are not deploying into a fully managed container, you have to care about the version of the environment that you are deploying into.  The great thing about treating infrastructure as code (assuming you overcome the associated impedance) is that you can treat it like an application, give it a pipeline and feed it into the integrated pipeline (probably at a POI very early on).  Effectively you are creating your own platform and performing continuous delivery on that.  Obviously the further your production environment is from being a versionable component like this, the greater the manual effort needed to keep environments in sync.

[Diagram: infrastructure code given its own pipeline and fed into the integrated pipeline alongside the applications]

 

Coming soon: more sources of impedance to doing continuous delivery: Software packages, Organisation size, Organisation structure, etc.

 

(Thanks to Tom Kuhlmann for the graphic symbols.)

 

Agile, Waterfall, Common Sense, Experience

Even if you work for an old large traditional enterprise, it’s almost impossible now not to feel the influence of Agile and my advice is embrace it!

Agile:

  • creates (or lays claim to) a lot of great ideas. Some of these pre-dated it, but there is no harm at all in getting them all distilled in one place
  • is extremely popular and even mandated by many organisations
  • when mis-understood, mis-quoted, mis-interpreted or mis-used, spells TROUBLE.

Therefore, I highly recommend getting up to speed, for example by reading the manifesto. But don’t stop at that, there are huge amounts of great information out there in books and blogs to discover.

I would also encourage some effort towards learning the lingo. I don’t want to start ranting about how many times I’ve heard Agile terminology misquoted to defend some truly bad (non-Agile) practices (it’s many, many times).

However, if you are starting out dabbling in Agile within an established organisation with entrenched processes my advice is to keep the following close to hand:

  • What Agile has to say about how to do things (and what it calls things!)
  • Your common sense
  • Your experiences

There really is no excuse for ignoring any of these (and believe me I’ve seen people do it time and time again).

In my head, there is a parallel to be drawn between a methodology like Agile and an enterprise software package like SAP (stick with me). Both are powerful and both are capable of doing massive things, but “out-of-the-box” neither will ever be as effective as when they were first created as bespoke solutions to the unique problems that prompted them. The problem you are solving (whether with a software package or a software delivery methodology) will always be unique, and hence so must be the solution.

Based on what I’ve seen over the last 4-5 years here are key success factors when following Agile in a project-based scenario:

Key Success Factors

1. Have a project initiation phase that does not complete until the sacrosanct high-level priorities of the sponsor and key high-level architectural decisions are agreed.
2. Adoption requires full co-operation of the Business as well as IT. It also requires all teams to be well trained and to distribute people with Agile experience across scrum teams.
3. Ensure that your performance management process aligns closely with the expected behaviours of people performing Agile roles.
4. Invest heavily in ensuring your application is fast and predictable to release. Without this, the benefits of faster development are lost when changes hit a bottleneck of live implementation.
5. Document and publish development patterns and coding standards and ensure you continually monitor conformance.
6. Constantly strive to reduce quality feedback time, i.e. the time from a developer making a change until the identification of anything related to that change which would indicate it is unworthy of promotion to production. This involves automating as much testing as possible and optimising the execution time of your tests.
7. Allow scrum teams to estimate and plan their own sprints and track the accuracy of this process over time.
8. Ensure your programme is heavily metric-driven and expect to continually refine your methodology over time to improve effectiveness.

Here are common pitfalls:

1. Forgetting that adopting Agile affects the whole organisation and it will not succeed in silos.
2. Making the assumption that documentation is no longer necessary. If something cannot be validated through an automated test or system monitoring, it still needs to be written down.
3. Using Agile’s innate scope flexibility to the extent that the project direction is in a constant state of flux leading to delays to benefit realisation and a lost recognition of the overall business case.
4. Becoming functionality obsessed and overlooking non-functional requirements such as performance, scalability and operability.
5. Putting an overemphasis on getting “working code” at the expense of application architecture, resulting in convoluted code in urgent need of re-factoring to adopt patterns and standards that reduce the cost of maintenance.
6. Devoting too much focus to executing pure Agile, as opposed to tailoring an approach that retains the key benefits whilst minimising risk on a large-scale distributed programme.
7. Making the assumption that less planning is required. Agile requires highly detailed planning involving the whole team and an iterative approach that complements and responds to the Agile delivery.
8. Focusing on increasing development/test velocity when the bottleneck lies in your test environment stability and ability to predictably deploy code.
9. Ignoring the impact on service and operations teams who are expected to support new systems at a faster rate.
10. Overlooking how to accommodate potentially slower-moving dependencies such as infrastructure provisioning.
11. Doing things based on a perception of what “Agile” tells you to do and neglecting experience and knowledge of your organisation.

However, I must concede most of these pointers are based on common sense and my experience, and most could be applied to Agile, Waterfall, or any methodology.

Please let me know your thoughts.