Scaling on AWS From Your First User …and to Infinity

An excellent presentation that will test some of your assumptions. It starts with your first user (likely yourself) and then gets you thinking about scalability in logical increments. The key takeaway is that the approaches are very different at each step, and that worrying about what to do at 10 million users doesn’t make much sense when you have, say, 100,000. Includes specific approaches for each step.

The presentation slides:

The presentation video:


‘Of course I can. I’m an expert.’

I must admit, I’ve had a few meetings like this (and sat in on others which I suspect were going like this from the expert’s perspective).

Funny business meeting illustrating how hard it is for an engineer to fit into the corporate world!

I love how he starts catching on and adapting to the situation. Pretty funny.


Microsoft Azure price reductions, new instances, storage tweaks

As anticipated:

Announcing price reductions on Azure Block Blob Storage and Standard Memory Intensive Windows and Linux Virtual Machines (VMs) across all regions. Also introducing a new tier of lower priced Basic compute instances.

Their blog post has the details (and a few friendly jabs).


DevOps Reactions

A nice little Tumblr blog for DevOps and sysadmin folks - DevOps Reactions:

Say it with pictures. Describe your feelings about your everyday sysadmin interactions.

Lots of good stuff.


A Good Week for Public Cloud Users

It’s only Wednesday and two of the three big public cloud players have announced significant price drops in core services (computing, storage, etc.) along with some functionality improvements:

Google Cloud Platform Live – Blending IaaS and PaaS, Moore’s Law for the cloud

Today, at Google Cloud Platform Live we’re introducing the next set of improvements to Cloud Platform: lower and simpler pricing, cloud-based DevOps tooling, Managed Virtual Machines (VM) for App Engine, real-time Big Data analytics with Google BigQuery, and more.

AWS Price Reduction #42 – EC2, S3, RDS, ElastiCache, and Elastic MapReduce

Effective April 1, 2014 we are reducing prices for Amazon EC2, Amazon S3, the Amazon Relational Database Service, and Elastic MapReduce.

It would not surprise me if we heard some announcements from Microsoft Azure soon, given that the big three tend to track each other rather well when it comes to the apples-to-apples elements of their service menus. Besides the financial savings, Google’s new offerings – while still in limited beta – increase parity between these players’ service offerings. For example, Managed VMs (which give a bit more flexibility than pure PaaS options) are now available in some form from all three companies. All now offer Windows-based VMs too.



Server Hugging

Chris Munns, for Amazon Web Services, “Why You Need To Stop Worrying About prodweb001 And Start Loving i-98fb9856”:

Traditionally, IT organizations have treated infrastructure components like family pets. We name them, we worry about them, and we let them wake us up at 4:00 am.

On “server hugging” and its impact when translated to the cloud environment.


The Panacea for ‘We Want No Downtime’ – A Brief Conceptual Framework for High Availability Planning

In Greek mythology, Panacea was a goddess of universal remedy. Panacea was said to have a potion with which she healed the sick. This brought about the concept of the panacea in medicine, a substance meant to cure all diseases [1]. The term is also used figuratively as something intended to completely solve a large, multi-faceted problem [2]. Unfortunately – when it comes to business computing applications – when someone says “We want no downtime” there is simply no panacea [3].

About the only thing you can be sure of when doing high availability planning is that there are a lot of tools to consider, a lot of decisions to make, and a lot of work to do [4]. This is why a good conceptual framework is important: it makes sure the right things are considered and the appropriate decisions are made.

In this post I’ve attempted to outline the framework I use. The aim is to help – in any given situation and over the life of a business application and the business itself – figure out which approaches make sense and what path to take to get there. We’ll get to that framework in a moment, but let’s make sure we’re clear on one other important thing first.

There’s being proactive and then there’s luck …and then there’s being smart.

The “no downtime” request is perhaps somewhat akin to a patient telling a doctor “I want to be healthy” (which, I suppose, is typically driven by the desire to minimize downtime of a different sort). You can’t literally be healthy any more than you can literally design for zero downtime. You can only control the inputs, manage which knobs you turn to minimize the likelihood of being unhealthy (or having an outage), and plan so that you are prepared for (or have options – or at least are willing to accept) the inevitable things you can’t prevent with 100% certainty. And you have to make some decisions along the way as to how much you’re willing to invest – time, energy, money, distraction.

Just as it’s possible to eat cheeseburgers your entire life and still live to see your 90s, it is possible to have no downtime with your web application without even making any investments in eliminating single points of failure. Single physical server hosting your web/app and database? No data backups? Experienced no downtime or data loss? Congratulations. Sometimes you just luck out. At the same time, that doesn’t make it a good strategy.

Minimize downtime by managing it

Having a business goal of managing downtime is perfectly reasonable, but as with most things involving technology, the requirement must be broken down and analyzed in a practical way before any action can be taken on it. The following conceptual framework is about the closest thing I know of to a universal way of breaking down “We want no downtime” into something meaningful and useful, so that engineering and investment decisions can be made around it.

How to think about “managing downtime”

There are numerous facets to managing downtime – preventing it, minimizing its negative impact, handling it gracefully when it does occur, and having options for handling those really bad situations no one anticipated too. So let’s get to breaking these facets down with specificity:

1. Minimize downtime

…for all reasonable events

2. Speed up recovery time

…for all events that are unreasonable to protect against

3. Handle outages as gracefully as possible

…don’t leave users hanging (blind) even when the app becomes unavailable (e.g. continue to provide reduced functionality if possible or, when not possible, then provide a friendly outage message)

…provide options for response (see next item)

4. Have enough depth in the architecture so that there are multiple options when the unforeseen occurs

…have data stored in multiple repositories that are as independent as possible

…have various data rollback points

…understand the architecture/platform and individual elements well enough that these options can be used if need be
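Item 3 above (handling outages gracefully) can be sketched in a few lines. This is only a minimal illustration, not a prescription: the function name `serve`, the two fetcher callables, and the wording of the outage message are all hypothetical, and a real application would hang this logic off whatever web framework and cache it already uses.

```python
# Hypothetical friendly message shown when nothing else works.
OUTAGE_MESSAGE = "We are having trouble right now. Please try again shortly."


def serve(fetch_live, fetch_cached):
    """Return a (status, body) pair, degrading step by step.

    fetch_live and fetch_cached are hypothetical callables supplied by
    the application; either one may raise when its backend is down.
    """
    try:
        # Normal path: full functionality.
        return ("ok", fetch_live())
    except Exception:
        pass
    try:
        # Reduced functionality: possibly stale data beats no data.
        return ("degraded", fetch_cached())
    except Exception:
        # Nothing works: still tell the user something useful.
        return ("outage", OUTAGE_MESSAGE)


def broken():
    # Stand-in for a backend that is currently unavailable.
    raise ConnectionError("backend down")


def cached():
    # Stand-in for a cache that still holds an older copy.
    return "yesterday's report"


results = [
    serve(lambda: "fresh report", cached),
    serve(broken, cached),
    serve(broken, broken),
]
```

The three calls walk through the three states a user might see: full service, reduced service, and a friendly outage message. The point is that the user never gets left hanging, even in the last case.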

Important definitions

I threw out two seemingly straightforward terms above – reasonable and unreasonable – that can have very different definitions to different stakeholders (and at different points in time over the life of an application and organization). Defining these is paramount to getting this stuff right. I’d even go as far as to state that defining these well is the crux of getting high availability investments aligned with the business requirements.

What are “reasonable” events?

The definition of reasonable events:

  • What we can anticipate; or
  • What we can afford

What are “unreasonable” events?

The definition of unreasonable events:

  • What we can’t anticipate; or
  • What we can’t afford to prevent

The “or” between each of the above bullet points is important. We can’t always afford all the things we need or know we want. Thinking about “what-ifs” in the above context provides a conceptual framework which technologists and business sponsors can use to make informed decisions about how to proceed.

Once the above are defined, the particular situations and events that apply to a given business application can be discussed with clarity, and decisions can be made about them.

The decisions made in the above categories drive the architecture and overall investment.
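To make the sorting step concrete, here is a small sketch of one way to apply the definitions above. It assumes one particular reading: an event is treated as unreasonable if we either can’t anticipate it or can’t afford to prevent it, and as reasonable otherwise. The `Event` class, the field names, and the sample events are all made up for illustration; real inputs would come out of conversations between technologists and business sponsors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """A hypothetical "what-if" scenario under discussion."""
    name: str
    anticipated: bool            # can we foresee this happening?
    affordable_to_prevent: bool  # can we pay to prevent it today?


def is_reasonable(event):
    # Assumed reading: unreasonable if EITHER condition fails,
    # so reasonable only when both hold.
    return event.anticipated and event.affordable_to_prevent


def plan_for(event):
    # Reasonable events get prevention (item 1: minimize downtime);
    # unreasonable ones get a recovery plan (item 2: speed up recovery).
    return "prevent" if is_reasonable(event) else "recover"


# Illustrative events only; the flags are assumptions, not data.
events = [
    Event("single disk failure", anticipated=True, affordable_to_prevent=True),
    Event("whole data center outage", anticipated=True, affordable_to_prevent=False),
    Event("something nobody thought of", anticipated=False, affordable_to_prevent=False),
]

plans = {event.name: plan_for(event) for event in events}
```

As resources grow, flipping `affordable_to_prevent` from `False` to `True` moves an event from the "recover" pile to the "prevent" pile, which is how the same framework keeps working over the life of an application.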

In other words

All of the above put another way:

  • We want to be able to sleep better at night
  • We want to prevent what we can
  • We want to manage what we can’t prevent as best as possible
  • We want to have options when the shit really hits the fan
  • We want to invest wisely
  • We want to be able to improve as tools mature, lessons are learned, business requirements change, and our resources increase

Getting there

As the saying goes, it’s not a question of if, only when.

That doesn’t mean we have to blindly spend money on every conceivable scenario. Nor does it mean we even can spend money on every possible scenario (i.e. unlimited resources is not a panacea, sorry). We can, however, get better as our business maturity demands it and as our resources permit it.

While every business and every situation is different, the analytical framework for making these decisions is simple enough. Every conceivable scenario can be incorporated into the framework above [5]. Combined with a strong understanding of the capabilities of the infrastructure and people, and the resources available for investment, the development of an architecture to support your business application’s high availability requirements is completely doable.

Or you can just wing it on a single server and pray to your favorite Greek goddess [6].

  1. alas, does not exist as far as I know 

  2. Paraphrased from Wikipedia: 

  3. regardless of your spiritual beliefs and, incidentally, regardless of whether you outsource this problem or handle it all in-house 

  4. cloud offerings have increased the tools available and decreased the barriers to their use, but each service provider’s elements still cannot simply be adopted blindly if one hopes to achieve their organization’s particular business goals 

  5. I think; I’m not perfect, but apparently I don’t mind making bold claims. Ha! 

  6. to be clear: there’s nothing wrong with starting with a single server. Everybody has to start somewhere. Do make sure you have reliable data backups though. 


The Danger of Having Role Models

The always thought-provoking Brian Gardner, of StudioPress, writes on his personal blog about The Danger of Having Role Models:

There’s value in being original, and I want to embrace that about myself. I want to create content that’s unique to who I am, and not imitate the works of others.


A New Blog for Technical Freelancers

A few years ago I wanted to start sharing things I was learning, as a technical freelancer and self-employed technology services provider.

This is when was born. First as a small email list – that grew to over 200 subscribers – and now a blog. If you’re a technical freelancer — current or aspiring, full-timer or moonlighter — I invite you to join us. A blog about being a self-employed freelancer, consultant, or service provider. Edited by a consulting technologist.


‘Everybody is somebody else’s monster.’

Great piece from Greg Knauss, writing on his blog, about the recent response to his publishing of the app Romantimatic:

What I’m talking about here is how addictive the righteousness that comes from that condemnation is, and how we will apparently turn to any source we can find for it — even when that source is not evil or harmful or part of any world we exist in or understand.
A few years ago, a photo made the rounds. It was taken from the back, its subject unaware. He was a fat guy wearing a jeans-jacket, and on the back he had stenciled the name of his heavy metal band. It was a sloppy and amateurish job. The photo earned a lot of mocking comments in my circle, including from me. Ha ha, look at the fat guy with the rock-and-roll pretensions. Look at him. Looooook.
And then someone said, “I think he’s awesome. He’s found something he loves, and he thinks it’s great enough to share with the world. This guy is a hero.”
And… Oh, my God. That’s right. That’s exactly right. Who was I to judge, much less judge publicly? Maybe his music was terrible, but so what? It wasn’t for me. It was for him, and his friends, and his fans. Nobody was seeking my opinion, because it would be ill-informed and emotional, because those are the only opinions I could possibly have.
I was just pumping poison into the atmosphere, to feel good about myself, for another hit of self-righteousness. I was what was wrong, because I vomited out disapproval — could only vomit out disapproval — without intent or willingness to even attempt to understand.