Abstraction is not Obsoletion – Abstraction is Survival

Successfully delivering Enterprise IT is a complicated, probably even complex, problem.  What's surprising is that, as an industry, many of us are still comfortable accepting so much of the problem as our own to manage.

Let's consider an (albeit very simplified and arguably imprecise) view of the "full stack":

  • Physical electrical characteristics of materials (e.g. copper / p-type silicon, …)
  • Electronic components (resistor, capacitor, transistor)
  • Integrated circuits
  • CPUs and storage
  • Hardware devices
  • Operating Systems
  • Assembly Language
  • Modern Software Languages
  • Middleware Software
  • Business Software Systems
  • Business Logic

When you examine this view, hopefully it is clear (irrespective of what you think about what's included, what's missing, or the order) that when we do "IT" we are already extremely comfortable being abstracted from detail. We are already fully ready to use things which we do not, and may never, understand. When we build an eCommerce platform, an ERP, or a CRM system, little thought is given to electronic components, for example.

My challenge to the industry as a whole is to recognise more openly the immense benefit of the abstraction on which we are already entirely dependent, and to embrace it even more urgently!

Here is my thinking:

  • Electrons are hard – we take them for granted
  • Integrated circuits are hard – so we take them for granted
  • Hardware devices (servers for example) are hard – so why are so many enterprises still buying and managing them?
  • The software that it takes to make servers useful for hosting an application is hard – so why are we still doing this by default?

For solutions that still involve writing code, the most extreme example of abstraction I've experienced so far is the Lambda service from AWS.  Some have started calling such things "serverless" computing.

With Lambda you write your software functions and upload them, ready for AWS to run for you. Then you configure the triggering event that should cause your function to run. Then you sit back and pay for the privilege whilst enjoying the benefits. Obviously if the benefits outweigh the cost of the service, you are making money. (Or perhaps in the world of venture capital, if the benefits are generating lots of revenue, or even just active-user growth, for now you don't care…)
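To make the model concrete, here is a minimal sketch of such a function in Ruby (the handler signature follows AWS's Ruby runtime for Lambda; the event shape and the greeting are purely illustrative):

# Lambda invokes this for every triggering event (an HTTP request, an S3
# upload, a queue message, ...). No servers appear anywhere in the code.
def handler(event:, context:)
  name = event.fetch('name', 'world')
  { statusCode: 200, body: "Hello, #{name}" }
end

You upload the function, wire up the trigger, and everything below this level is AWS's problem.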

Let's take a mobile example. Anyone with enough time and dedication can sit at home on a laptop and start writing mobile applications. If they write a purely standalone, offline application and charge a small fee for it, theoretically they can make enough money to retire on without even knowing how to spell "server".  But in practice most applications (even if they just rely on in-app adverts) require network-enabled services. Even then our app developer still doesn't need to spell "server"; they just need to use the API of an online ad company (e.g. AdWords) and their app will start generating advertising revenue. Next perhaps the application relies on persisting data off the device, or on notifications being pushed to it. The developer still only needs to use another API to do this; Parse, for example, can provide all of that as a programming service.  You just use the software development kit and are completely abstracted from servers.

So why are so many enterprises still exposing themselves to so much of the “full stack” above?  I wonder how much inertia there was to integrated circuits in the 1950s and how many people argued against abstraction from transistors…

To survive is to embrace Abstraction!

 


[1] Abstraction in a general computer science sense, not a mathematical one (as used by Joel Spolsky in his excellent Law of Leaky Abstractions blog post).


Reusable Docker Testing Approach

In this blog I will describe a reusable approach to testing Docker that I have been working on.

By 'testing Docker' I mean performing the following actions:

  • Static code analysis of the Dockerfile, i.e. is the file syntactically valid and written to our expected standards?
  • Unit testing the Docker image created by performing a build with our Dockerfile, i.e. does our Dockerfile appear to have created the image we were expecting?
  • Functional testing the container created by running an instance of our image, i.e. when running, does it look and behave as we expected?

I wanted a solution that was very easy to adopt and extend so I chose to:

  • implement it in Docker so that it will work for anyone using Docker (see this diagram)
  • use Docker Compose to make it as easy as possible to trigger
  • reuse Dockerlint
  • use Ruby because it is fairly widespread as a required skill among infrastructure-as-code people (for now, until Go takes over…), and because the docker-api Gem is very powerful, albeit it expects you to learn more about the Docker API in order to use it.
  • use RSpec and Serverspec as the testing frameworks because they have good documentation and they support BDD

So what is the solution?

Essentially it is a Docker image called test-docker.  To use it, you mount in your 'Dockerfile' and your 'tests' directory; it then:

  1. Runs Dockerlint to perform the static code analysis on the Dockerfile
  2. Runs your tests, which I encourage you to write for both inspecting the image and testing a running container.

How to see it in action?

To run this you need Docker installed and functioning happily.  Personally I’m using:

  • a Windows laptop
  • Docker Toolbox, which gave me docker-machine, which in turn manages a Linux virtual machine running on a local installation of VirtualBox
  • docker-compose installed (I did it manually)
  • Git Bash (aka Git for Windows) as my terminal

With the above or equivalent, you simply need to do:

$ git clone https://github.com/kramos/test-docker.git
$ cd test-docker
$ docker-compose -f docker-compose-test-docker.yml up

You should see an output like this:

Creating testdocker
Creating testdocker_lintdocker_1
Attaching to testdocker, testdocker_lintdocker_1
testdocker   | /usr/local/bin/ruby -I/usr/local/bundle/gems/rspec-support-3.4.0/lib:/usr/local/bundle/gems/rspec-core-3.4.0/lib /usr/local/bundle/gems/rspec-core-3.4.0/exe/rspec --pattern spec/\*_spec.rb
lintdocker_1 | Check passed!
testdocker_lintdocker_1 exited with code 0
testdocker   |
testdocker   | Container
testdocker   |   get running
testdocker   |     check ruby
testdocker   |       Command "ruby --version"
testdocker   |         stdout
testdocker   |           should match /ruby/
testdocker   |         stderr
testdocker   |           should be empty
testdocker   |
testdocker   | Image
testdocker   |   inpsect metadata
testdocker   |     should not expose any ports
testdocker   |
testdocker   | Finished in 1.48 seconds (files took 1.45 seconds to load)
testdocker   | 3 examples, 0 failures
testdocker   |
testdocker exited with code 0
Gracefully stopping... (press Ctrl+C again to force)


 

All good.  But what happened?  Well, everything I said we wanted to happen, run against the test-docker tool itself.  #Dogfood and all that.

You can also try out another example e.g.:

$ docker-compose -f examples/redis/docker-compose.yml up

So how to use this for your own work?

Hopefully you’ll agree this is very easy (at least to get started):

  1. Replace the Dockerfile in the root of the test-docker folder with your own Dockerfile (plus any other local resources your Dockerfile needs)
  2. Run the following (this time we allow docker-compose to use the default configuration file, which you also pulled from Git):
$ docker-compose up
  3. You will find out what Dockerlint thinks of your code, followed by finding out whether, by extreme luck, any of the tests that were written for the test-docker image (as opposed to your image) pass
  4. Open the .rb spec file (in tests/spec) and update it to test your application using anything that you can do to a stopped container using the docker-api (see the sketch after this list)
  5. Open the .rb spec file (in tests/spec) and update it to test your application using anything that you can do to a running container using the docker-api and Serverspec
  6. I suggest removing the .git folder and initialising a new git repository to manage your Dockerfile and your tests.
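To give a flavour of what those tests can look like, here is a sketch (the file name, image details and assertions are illustrative; it assumes the docker-api and Serverspec gems that test-docker already uses):

# tests/spec/image_spec.rb (illustrative)
require 'serverspec'
require 'docker'

describe 'my image' do
  before(:all) do
    # Build the image from the mounted Dockerfile.
    @image = Docker::Image.build_from_dir('.')

    # Point Serverspec at a container created from that image.
    set :backend, :docker
    set :docker_image, @image.id
  end

  it 'does not expose any ports' do
    # docker-api returns the `docker inspect` output as a plain hash.
    expect(@image.json['Config']['ExposedPorts']).to be_nil
  end

  it 'has Ruby available' do
    expect(command('ruby --version').stdout).to match(/ruby/)
  end
end

This roughly mirrors the example output shown earlier: one check that inspects the image metadata and one that runs a command inside a container.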

 

Functional tests that run the container require two subtly different approaches, according to whether your Docker image is expected to run as a daemon, or to just run, do something and stop.  In the former case, you can use a lot of Serverspec functionality.  In the latter, your choices are more limited: essentially running the container (possibly multiple times), and in each case grabbing the output and parsing it.
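For the run-and-stop case, here is a sketch of that "run it, grab the output, parse it" approach using the docker-api gem (the image name, command and expected output are all illustrative):

require 'docker'

describe 'one-shot container' do
  it 'does its job and exits cleanly' do
    container = Docker::Container.create('Image' => 'my-image', 'Cmd' => ['./run-job.sh'])
    begin
      container.start
      status = container.wait(30)                          # wait up to 30s for the process to exit
      output = container.logs(stdout: true, stderr: true)  # grab everything it printed

      expect(status['StatusCode']).to eq(0)
      expect(output).to match(/job complete/)
    ensure
      container.delete(force: true)                        # clean up even if an assertion fails
    end
  end
end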

 

Conclusion

 

There were a surprising number of things I had to learn on the fly here to get this working, but I don't want this blog to drag on.  Let me know how you get on and I will happily share more, especially for when any of the magic doesn't work as expected – for example when writing tests.

 

I’ll leave you with my current list of things I want to improve:

  • Make it work with the Ruby slim base image (the current image is huge)
  • Get it working with InSpec instead of Serverspec
  • Provide better examples of tests
  • I should really draw a diagram to help anyone new to this understand all this inception computing…

 

Credits:

I took a huge amount of help from:
http://www.unixdaemon.net/tools/testing-dockerfiles-with-serverspec/
https://github.com/sherzberg/docker-hhvm/
https://github.com/18F/docker-elasticsearch

 

Docker Inception


Sometimes when you are working with Docker it feels a bit like the movie Inception.  It doesn't help when you are doing things like this.  So here is a diagram that might make things clearer.

[Image: docker-inception]

Reducing Continuous Delivery Impedance – Part 2: Solution Complexity

This is my second post in a series I’m writing about impedance to doing continuous delivery and how to overcome it.  Part 1 about Infrastructure challenges can be found here.  I also promised to write about complexity in continuous delivery in this earlier post about delivery pipelines.

I'm defining "a solution" as the software application, or multiple applications, under current development that need to work together (and hence be tested in an integrated manner) before release to production.

In my experience, continuous delivery pipelines work extremely well when you have a simple solution with the following convenient characteristics:

  1. All code and configuration is stored in one version control repository (e.g. Git)
  2. The full solution can be deployed all the way to production without needing to test it in conjunction with other applications / components under development
  3. You are using a 3rd party PaaS (treated as a black box, like Heroku, Google App Engine, or AWS Elastic Beanstalk)
  4. The build is quick to run i.e. less than 5 minutes
  5. The automated tests are quick to run, i.e. minutes
  6. The automated test coverage is sufficient that the risks associated with releasing the software can be understood to be lower in value than the benefits of releasing.

The first 3 characteristics are what I am calling "Solution Complexity" and what I want to discuss in this post.

Here is a nice simple depiction of an application ticking all the above boxes.

[Image: perfect]

Developers can make changes in one place, know that their change will be fully tested and know that when deployed into the production platform, their application should behave exactly as expected.  (I’ve squashed the continuous delivery (CD) pipeline into just one box, but inside it I’d expect to see a succession of code deployments, and automated quality gates like this.)

 

But what about when our solution is more complex?

What about if we fail to meet the first characteristic and our code is in multiple places, and possibly not all in version control?  This is definitely a common problem I've seen, in particular for configuration and data-loading scripts.  However, this isn't particularly difficult to solve from a technical perspective (more on the people side in a future post!).  Get everything managed by a version control tool like Git.

Depending on the SCM tool you use, you may not need to feel obliged to use a single repository.  If you do use multiple, most continuous integration tools (e.g. Jenkins) can be set up to handle builds that consume from multiple repositories.  If you are using Git, you can even handle this complexity within your version control repository, e.g. by using submodules.

 

What about if your solution includes multiple applications like the following?

[Image: complex]

Suddenly our beautiful pipeline metaphor is broken and we have a network of pipelines that need to converge (analogous to fan-in in electronics).  This is far from a rarity; I would say it is overwhelmingly the norm.  This certainly makes things more difficult and we now have to carefully consider how our plumbing is going to work.  We need to build what I call an "integrated pipeline".

Designing an integrated pipeline is all about determining the "points of integration" (aka POI), i.e. the first time that testing involves the combination of two or more components.  At this point, you need to record the versions of each component so that they are kept consistent for the rest of the pipeline.  If you fail to do this, earlier quality gates in the pipeline are invalidated.

In the below example, Applications A and B have their own CD pipelines where they will be deployed to independent test environments and face a succession of independent quality gates.  Whenever a version of Application A or B gets to the end of its respective pipeline, instead of going into production, it moves into the integrated pipeline and creates a new integrated (or composite) build number.  After this POI the applications progress towards production in the same pipeline and can only move in sync.  In the diagram, version A4 of Application A and version B7 of B have made it into integration build I8.  If integration build I8 makes it through the pipeline it will be worthy of progressing to production.

[Image: int]

Depending on the tool you use for orchestration, there are different solutions for achieving the above.  Fundamentally it doesn't have to be particularly complicated.  You are simply aggregating version numbers, which can easily be stored together in a text document in any format you like (YAML, POM, JSON, etc.).
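As a sketch of how little machinery this needs, the integrated build can be nothing more than a small manifest written at the POI (the component names, version numbers and file format below are illustrative):

require 'yaml'

# Record which component versions make up integrated build I8 at the POI.
manifest = {
  'integrated_build' => 'I8',
  'components' => {
    'application-a'      => 'A4',
    'application-b'      => 'B7',
    'deployment-scripts' => 'D2'   # deployment automation versioned like any other component
  }
}

File.write('integrated-build-I8.yml', manifest.to_yaml)

Every later stage of the pipeline then reads the manifest rather than picking up "latest", so the combination that was tested is exactly the combination that reaches production.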

Some people reading this may by now be boiling up inside, ready to scream "MICROSERVICES" at their screens.  Microservices are by design independently deployable services.  The independence is achieved by ensuring that they fulfil, and expect to consume, strict contract APIs so that integration with other services can be managed and components can be upgraded independently.  A convention like SemVer can be adopted to manage changes to contract compatibility.  I've for a while had this tagged in my head as the eBay or Amazon way of doing things, but microservices are now gaining a lot of attention.  If you are implementing microservices and achieving this independence between pipelines, that's great.  Personally, on the one microservices solution I've worked on so far, we still opted for an integrated pipeline that operated on an integrated build and produced predictable upgrades to production (we are looking to relax that at some point in the future).

Depending on how you are implementing your automated deployment, you may have deployment automation scripts that live separately to your application code.  Obviously we want to use consistent versions of these throughout deployments to different environments in the pipeline.  Therefore I strongly advise managing these scripts as a component in the same manner.

What about if you are not using a PaaS?  In my experience, this represents the vast majority of solutions I've worked on.  If you are not deploying into a fully managed container, you have to care about the version of the environment that you are deploying into.  The great thing about treating infrastructure as code (assuming you overcome the associated impedance) is that you can treat it like an application, give it a pipeline and feed it into the integrated pipeline (probably at a very early POI).  Effectively you are creating your own platform and performing continuous delivery on that.  Obviously the further your production environment is from being a version-able component like this, the greater the manual effort to keep environments in sync.

[Image: paas]

 

Coming soon: more sources of impedance to doing continuous delivery: Software packages, Organisation size, Organisation structure, etc.

 

(Thanks to Tom Kuhlmann for the graphic symbols.)

 

It works on my laptop… #WIN

There is an infamous, much-repeated dialogue between Development and Operations teams that goes:

  • Operator (with urgency):   “The application you’ve developed doesn’t work”
  • Developer (frustrated):   “It works on my laptop”
  • Operator (sarcastically):   “Shall we add your laptop into production?”
  • Developer (under breath):   “$%£@$% !”

The moral of the conversation is the importance of developers understanding how their application is going to be deployed into, run in, and perform in the production environment (a key tenet of DevOps).

Whilst the conversation is still telling, I believe now more than ever that developers have the chance to make what works on their local laptop genuinely production-ready.

Historically a developer’s machine would contain:

  • an operating system completely different to production (e.g. Mac OS or Windows XP)
  • a tool for editing code (somewhere on the spectrum from a text editor like VIM to an Integrated Development Environment (IDE) like Eclipse)
  • a local environment for testing the code (often consisting of an application server within the IDE)

The local environment could be further characterised as:

  • Using a different version of the application server to production
  • Running as the developer’s user account
  • Having access to the developer’s home directory and the version control repository
  • Excluding any application components other than the primary one under development
  • Almost entirely free from security measures e.g. password encryption

However this no longer needs to be the case.

VirtualBox makes it possible to run a local virtual instance of exactly the same operating system as Production.

Vagrant makes it easy to manage VirtualBox from the command line (and even to use version control, e.g. Git, to manage the configuration of this).

Online repositories of virtual machine images, e.g. www.vagrantbox.es, make it quick to get started with the virtual machine that you need.

Box image creation tools like Veewee and Packer make it easy to create your own box images (and if you are using a cloud provider supporting machine images, like AWS, you can even use these boxes in production).

Automated Configuration Management tools like Puppet and Chef mean that not only can you set up your local virtual server(s) to match the configuration of the Production environment, you can do it using exactly the same scripts that will be used in Production.
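Pulling a few of those tools together, a local environment can be declared in a handful of lines. Here is a minimal Vagrantfile sketch (the box name and the Puppet manifest paths are illustrative):

# Vagrantfile (Vagrant's configuration files are plain Ruby)
Vagrant.configure('2') do |config|
  # The same operating system as Production, as a versioned, shareable box.
  config.vm.box = 'ubuntu/trusty64'

  # Provision the VM with the same Puppet code that configures Production.
  config.vm.provision 'puppet' do |puppet|
    puppet.manifests_path = 'puppet/manifests'
    puppet.manifest_file  = 'site.pp'
  end
end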

Integration with cloud providers (e.g. from Vagrant) means that you don't even need to stop when your local laptop runs out of steam.

Challenge yourself and your development teams to do a better job of development environments using the above tools (and more), and hopefully this infamous dispute will become a thing of the past!

AWS Certified Solutions Architect – Sample Questions Answered and Discussed

I recently took (and passed!) the Amazon Web Services – AWS Certified Solutions Architect certification.  It was a great way to get immersed in how Amazon Web Services work from a Technology Architecture perspective.

I consider myself quite a fan of new infrastructure solutions and progressive ways to deliver IT.  But seeing the full might of AWS up close really changed my outlook, even on using the cloud.

Elasticity and designing for failure (aka anti-fragility) aren't just a pipe dream; they are first-class capabilities, and implementing them is almost as simple as creating a server.

Managing your ENTIRE Production environment (servers, storage, networks, HA, DR, security, etc.) as code isn't an aspiration, it is a reality: think "export EVERYTHING to script", then version control it.  Then use it again and again and again.

Find out about this stuff, or get left behind!

If you start to think seriously about getting certified, you'll quickly find yourself staring at these sample questions.

In this blog, I'm going to provide worked answers and as much discussion as I can about the underlying concepts.  I'll have a go at including a few helpful links, but keep in mind that the AWS documentation, in particular the FAQs, is excellent and also very easy to find.

Please don’t hesitate to provide suggestions, corrections or ask any clarifications.

Question 1 (of 7): Amazon Glacier is designed for: (Choose 2 answers)

  • A. active database storage.
  • B. infrequently accessed data.
  • C. data archives.
  • D. frequently accessed data.
  • E. cached session data.

Answer: B. infrequently accessed data. C. data archives.

Think "cold storage" and the name Glacier makes a bit more sense.  AWS includes a number of storage solutions and, to pass the exam, you are expected to know the appropriate use of all of them.

I picture them on the following scale:

Instance (aka ephemeral, aka local) storage is a device, a bit like a RAM disk, physically attached to your server (your EC2 instance), and characteristically it gets completely wiped when the instance is stopped or terminated.  Naturally this makes it suitable for temporary storage, but nothing that needs to outlive the instance. You can store the Operating System on there if nothing important gets stored there after the instance is started (and bootstrapping completes).  Micro-sized instance types (low-specification servers) don't have ephemeral storage.  Some larger, more expensive instance types come with SSD instance storage for higher performance.

Elastic Block Store (EBS) is a service where you buy devices more akin to a hard disk that can be attached to one (and only one, at the time of writing) EC2 instance.  They can be set to persist after an instance is restarted.  They can be easily "snapshotted", i.e. backed up in a way that lets you create a new identical device and attach that to the same or another EC2 instance.  One other thing to know about EBS is that you can pay extra money for what is known as provisioned IOPS, which means guaranteed (and very high, if you like) disk read and write speeds.
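As a sketch of those EBS operations using the Ruby aws-sdk gem (the region, size, instance ID and device name are illustrative):

require 'aws-sdk'

ec2 = Aws::EC2::Client.new(region: 'eu-west-1')

# Pay extra for provisioned IOPS if you need guaranteed read/write performance.
volume = ec2.create_volume(availability_zone: 'eu-west-1a', size: 100,
                           volume_type: 'io1', iops: 1000)

# A volume attaches to one (and only one) instance at a time.
ec2.attach_volume(volume_id: volume.volume_id, instance_id: 'i-0123456789abcdef0',
                  device: '/dev/sdf')

# Snapshots let you create a new identical device later, on this or another instance.
ec2.create_snapshot(volume_id: volume.volume_id, description: 'nightly backup')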

S3 is a cloud file storage service more akin to Dropbox or Google Drive.  It is possible to attach a storage volume created and stored in S3 to an EC2 instance, but this is no longer recommended (EBS is preferable).  S3 is instead for storing things like your EC2 server images (Amazon Machine Images, aka AMIs), static content (e.g. for a web site), input or output data files (as you'd use an SFTP site for), or anything that you'd treat like a file.

An S3 store is called a bucket; whilst it lives in one specified region, it has a globally unique name.  S3 integrates extremely well with the CloudFront content distribution service, which offers caching of content to a much more globally distributed set of edge locations (thus improving performance and saving bandwidth costs).

Glacier comes next as basically a variant on S3 where you expect to want to view the files either hardly ever or never again.  For example old backups, old data only kept for compliance purposes.  Instead of a bucket, Glacier files are stored in a Vault. Instead of getting instant access to files, you have to make a retrieval request and wait a number of hours. S3 and Glacier play very nicely together because you can set up Lifecycles for S3 objects which cause them to be moved to Glacier after a certain trigger e.g. a certain elapsed “expiry” time passing.
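For example, a lifecycle rule takes only a few lines with the Ruby aws-sdk gem (the bucket name, prefix and timings are illustrative): transition objects to Glacier 30 days after creation and expire them after roughly 7 years.

require 'aws-sdk'

s3 = Aws::S3::Client.new(region: 'eu-west-1')

s3.put_bucket_lifecycle_configuration(
  bucket: 'my-archive-bucket',
  lifecycle_configuration: {
    rules: [{
      id:     'archive-then-expire',
      status: 'Enabled',
      filter: { prefix: 'logs/' },
      transitions: [{ days: 30, storage_class: 'GLACIER' }],   # S3 -> Glacier after 30 days
      expiration:  { days: 2555 }                              # delete after ~7 years
    }]
  }
)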

Wrong answers:

A. active database storage.

Obviously databases are written to regularly i.e. the polar (excuse the pun) opposite of Glacier.

Amazon offers 5 different options for databases.

RDS is the Relational Database Service. This allows Amazon to handle the database software for you (including licences, replication, backups and more). You aren't given access to any underlying EC2 servers; instead you simply connect to the database using your preferred method (e.g. JDBC). NB. currently this supports MySQL, Oracle, PostgreSQL and Microsoft SQL Server.

SimpleDB is a non-relational database service that works in a similar way to RDS.

Redshift is Amazon’s relational data warehouse solution capable of much larger (and efficient at large scale) storage.

DynamoDB is Amazon's managed NoSQL database service. For this storage Amazon apparently uses Solid State Drives for high performance.

Finally of course, you can create servers with EC2, install the database software yourself and work as you would in your own data centre. This is the only time that you would need to consider which storage solution you actually want to use for a database.  EBS would be most appropriate.  Clearly Instance storage is a very risky option as it does not persist when the instance is stopped.  S3 is inappropriate for databases, especially for Oracle, which can efficiently manage raw storage devices rather than writing files to a file system.

D. frequently accessed data.

Clearly this is the opposite of Glacier.  Obviously if your data doesn't need to persist beyond the life of the instance, Instance storage would be the best choice for frequently accessed data. Otherwise EBS is the choice if your applications are reading and writing the data, and S3 (plus CloudFront) is the option if end users access your data over the web.

E. cached session data.

ElastiCache is the AWS service that provides a Memcached- or Redis-compliant caching server that your applications can make use of.

Question 2 (of 7): Your web application front end consists of multiple EC2 instances behind an Elastic Load Balancer. You configured ELB to perform health checks on these EC2 instances. If an instance fails to pass health checks, which statement will be true?

  • A. The instance is replaced automatically by the ELB.
  • B. The instance gets terminated automatically by the ELB.
  • C. The ELB stops sending traffic to the instance that failed its health check.
  • D. The instance gets quarantined by the ELB for root cause analysis.

Answer: C. The ELB stops sending traffic to the instance that failed its health check.

This question tests that you properly understand how auto-scaling works. If you don’t, you might take a guess that load balancers take the more helpful sounding option A, i.e. automatically replacing a failed server.

The fact is, an elastic load balancer is still just a load balancer. Arguably when you ignore the elastic part, it is quite a simple load balancer in that (currently) it only supports round robin routing as opposed to anything more clever (perhaps balancing that takes into account the load on each instance).

The elastic part just means that when new servers are added to an "auto-scaling group", the load balancer recognises them and starts sending them traffic. In fact, to make answer A happen, you need the following:

  • A launch configuration: this tells AWS how to stand up a bootstrapped server that, once up, is ready to do work without any human intervention
  • An auto-scaling group: this tells AWS where it can create servers (which could be subnets in different Availability Zones in one region (NB. subnets can't span AZs), but not across multiple regions).  Also: which launch configuration to use, the minimum and maximum number of servers allowed in the group, and how to scale up and down (for example, 1 at a time, 10% more, and various other options).  With both of these configured, when an instance fails the health checks (presumably because it is down), it is the auto-scaling group that will decide whether another server now needs to be added to compensate.

Just to complete the story about auto scaling, it is worth mentioning the CloudWatch service. This is the name for the monitoring service in AWS. You can add custom checks and use these to trigger scaling policies to expand or contract your group of servers (and of course the ELB keeps up and routes traffic appropriately).
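Here is a sketch of that setup with the Ruby aws-sdk gem (all the names, the AMI and the subnet IDs are illustrative): it is the launch configuration plus the auto-scaling group, attached to the ELB, that actually replaces failed instances.

require 'aws-sdk'

autoscaling = Aws::AutoScaling::Client.new(region: 'eu-west-1')

autoscaling.create_launch_configuration(
  launch_configuration_name: 'web-lc-v1',
  image_id: 'ami-0123456789abcdef0',       # a bootstrapped AMI that needs no human help once started
  instance_type: 't2.micro'
)

autoscaling.create_auto_scaling_group(
  auto_scaling_group_name: 'web-asg',
  launch_configuration_name: 'web-lc-v1',
  min_size: 2,
  max_size: 10,
  vpc_zone_identifier: 'subnet-aaaa1111,subnet-bbbb2222',  # subnets in different AZs, same region
  load_balancer_names: ['web-elb'],
  health_check_type: 'ELB'                 # treat an ELB health-check failure as instance failure
)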

Wrong answers:

A. The instance is replaced automatically by the ELB.

As described above, you need an Auto Scaling group to handle replacements.

B. The instance gets terminated automatically by the ELB.

As discussed above, load balancers aren’t capable of manipulating EC2 like this.

D. The instance gets quarantined by the ELB for root cause analysis.

There is no concept of quarantining.

Question 3 (of 7): You are building a system to distribute confidential training videos to employees. Using CloudFront, what method could be used to serve content that is stored in S3, but not publically accessible from S3 directly?

  • A. Create an Origin Access Identity (OAI) for CloudFront and grant access to the objects in your S3 bucket to that OAI.
  • B. Add the CloudFront account security group “amazon-cf/amazon-cf-sg” to the appropriate S3 bucket policy.
  • C. Create an Identity and Access Management (IAM) User for CloudFront and grant access to the objects in your S3 bucket to that IAM User.
  • D. Create a S3 bucket policy that lists the CloudFront distribution ID as the Principal and the target bucket as the Amazon Resource Name (ARN).

Answer: A. Create an Origin Access Identity (OAI) for CloudFront and grant access to the objects in your S3 bucket to that OAI.

An Origin Access Identity is a special user that you set up the CloudFront service to use to access your restricted content, see here.

Wrong Answers:

B. Add the CloudFront account security group “amazon-cf/amazon-cf-sg” to the appropriate S3 bucket policy.

The CloudFront OAI solution is more tightly integrated with S3 and you don’t need to know implementation level details like the actual user name as that gets handled under the covers by the service.

C. Create an Identity and Access Management (IAM) User for CloudFront and grant access to the objects in your S3 bucket to that IAM User.

IAM is the service for controlling who can do what within your AWS account. The fact is that an AWS account is so incredibly powerful, that it would be far too dangerous to have many people in a company with full access to create servers, remove storage, etc. etc.

IAM allows you to create that fine-grained access to the use of services. It doesn't work down to the level suggested in this answer of specific objects. IAM could stop a user accessing S3 admin functions, but not specific objects.

D. Create a S3 bucket policy that lists the CloudFront distribution ID as the Principal and the target bucket as the Amazon Resource Name (ARN).

When configuring Bucket policies, a Principal is one or more named individuals in receipt of a particular policy statement. For example, you could be listed as a principal so that you can be denied access to delete objects in an S3 bucket. So the terminology is misused.

Question 4 (of 7): Which of the following will occur when an EC2 instance in a VPC (Virtual Private Cloud) with an associated Elastic IP is stopped and started? (Choose 2 answers)

  • A. The Elastic IP will be dissociated from the instance
  • B. All data on instance-store devices will be lost
  • C. All data on EBS (Elastic Block Store) devices will be lost
  • D. The ENI (Elastic Network Interface) is detached
  • E. The underlying host for the instance is changed

Answers: B. All data on instance-store devices will be lost

(See storage explanations above)

E. The underlying host for the instance is changed

Not a great answer here.  You are completely abstracted from underlying hosts.  So you have no way of knowing this.  But by elimination, I picked this.

Wrong Answers:

A. The Elastic IP will be dissociated from the instance

This is the opposite of the truth. Elastic IPs are sticky until re-assigned for a good reason (such as the instance has been terminated i.e. it is never coming back).

C. All data on EBS (Elastic Block Store) devices will be lost

EBS devices are independent of EC2 instances and by default outlive them (unless configured otherwise). All data on Instance storage, however, will be lost, as will the data on the root (/dev/sda1) partition of S3-backed servers.

D. The ENI (Elastic Network Interface) is detached

As far as I know, just a silly answer!

Question 5 (of 7): In the basic monitoring package for EC2, Amazon CloudWatch provides the following metrics:

  • A. web server visible metrics such as number failed transaction requests
  • B. operating system visible metrics such as memory utilization
  • C. database visible metrics such as number of connections
  • D. hypervisor visible metrics such as CPU utilization

Answer: D. hypervisor visible metrics such as CPU utilization

Amazon needs to know this anyway to provide IaaS, so it seems natural that they share it.

Wrong Answers:

A. web server visible metrics such as number failed transaction requests

Too detailed for EC2 – Amazon don't even want to know whether or not you have installed a web server.

B. operating system visible metrics such as memory utilization

Too detailed for EC2 – Amazon don’t want to interact with your operating system.

C. database visible metrics such as number of connections

Too detailed for EC2 – Amazon don't even want to know whether or not you have installed a database.  NB. the question states EC2 monitoring; RDS monitoring does include this.

Question 6 (of 7): Which is an operational process performed by AWS for data security?

  • A. AES-256 encryption of data stored on any shared storage device
  • B. Decommissioning of storage devices using industry-standard practices
  • C. Background virus scans of EBS volumes and EBS snapshots
  • D. Replication of data across multiple AWS Regions
  • E. Secure wiping of EBS data when an EBS volume is un-mounted

Answer: B. Decommissioning of storage devices using industry-standard practices

Clearly there is no way you could do this yourself, so AWS takes care of it.

Wrong Answers:

A. AES-256 encryption of data stored on any shared storage device

Encryption of storage devices (EBS) is your concern.

C. Background virus scans of EBS volumes and EBS snapshots

Too detailed for EC2 – Amazon don’t want to interact with your data.

D. Replication of data across multiple AWS Regions

No, you have to do this yourself.

E. Secure wiping of EBS data when an EBS volume is un-mounted

An un-mount doesn’t cause an EBS volume to be wiped.

Question 7 (of 7): To protect S3 data from both accidental deletion and accidental overwriting, you should:

  • A. enable S3 versioning on the bucket
  • B. access S3 data using only signed URLs
  • C. disable S3 delete using an IAM bucket policy
  • D. enable S3 Reduced Redundancy Storage
  • E. enable Multi-Factor Authentication (MFA) protected access

Answer: A. enable S3 versioning on the bucket

As the name suggests, S3 versioning means that all versions of a file are kept and are retrievable at a later date (by making a request to the bucket using the object ID and also the version number). The only cost of having this enabled is the extra storage you will incur. When an object is deleted, it will still be accessible, just not visible.
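Switching versioning on is a single call; here is a sketch with the Ruby aws-sdk gem (the bucket name is illustrative):

require 'aws-sdk'

s3 = Aws::S3::Client.new(region: 'eu-west-1')

s3.put_bucket_versioning(
  bucket: 'my-important-bucket',
  versioning_configuration: { status: 'Enabled' }
)
# From now on an overwrite creates a new version, and a delete only adds a
# "delete marker"; older versions stay retrievable by version ID.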

Wrong Answers:

B. access S3 data using only signed URLs

Signed URLs are actually part of CloudFront which, as I mentioned earlier, is the content distribution service. These protect content from unauthorised access.

C. disable S3 delete using an IAM bucket policy

No such thing as an IAM bucket policy.  There are IAM policies and there are Bucket policies.

D. enable S3 Reduced Redundancy Storage

Reduced Redundancy Storage (RRS) is a way of storing something on S3 with a lower durability, i.e. a lower assurance from Amazon that they won't lose the data on your behalf. Obviously this lower standard of service comes at a lower price. RRS is designed for things that you store for convenience, e.g. software binaries, that you could recreate (or re-download) if they got deleted. So with this in mind, enabling RRS reduces the level of protection rather than increases it. It is worth noticing the incredible level of durability that S3 provides. Without RRS enabled, durability is 11 9s, which equates to:

“If you store 10,000 objects with us, on average we may lose one of them every 10 million years or so. This storage is designed in such a way that we can sustain the concurrent loss of data in two separate storage facilities.”

(see here, thanks to here).

With RRS, this drops to 4 9s, which is still probably better than most IT departments can offer.

E. enable Multi-Factor Authentication (MFA) protected access

This answer is of little relevance. As I mentioned, accounts on AWS are incredibly powerful due to the logical nature of what they control. In the physical world it isn't possible for someone to press a button and delete an entire data centre (servers, storage, backups and all). In AWS, you could press a few buttons and do that, not just in one data centre, but in every data centre you've used globally. So MFA is a mechanism for increasing security over the people accessing your AWS account. As I mentioned earlier, IAM is the mechanism for further restricting what authenticated people are authorised to do.

Good Luck!!