Azure VM Scale Sets and cloud-scale compute patterns

Azure VM Scale Sets are a way to manage a set of identically configured virtual machines. At face value, the benefit is this: if your machines can all be configured the same way, you reduce the overhead of managing them individually and can elastically scale the workload to meet demand. The real power of VM Scale Sets and comparable technologies becomes apparent when you move to cloud-scale models.

Some traditional multi-machine computing scenarios map directly onto a VM Scale Set model. Examples include high-performance computing (HPC), large-scale data analysis (e.g. Hadoop clusters), and scalable, often stateless middle-tier or backend servers (e.g. web servers). Other traditional scenarios, such as distributed databases, are potential matches but pose challenges: how to handle stateful data, how to apply unique properties to individual VMs, how to deal with control nodes, and so on.

Pets, cattle and other animals
The full potential of using scalable groups of identical VMs extends beyond traditional compute scenarios and lies in the realm of cloud-scale compute patterns. As these patterns developed, a simplified concept emerged: Pets and Cattle. Your pets are resources which have their own names and require individual support and nurturing. Cattle tend to be numbered and are inherently replaceable.

The architectural cornerstone of the pets vs. cattle paradigm, articulated by Gavin McCance from CERN, is: “Future application architectures should use Cattle but Pets with strong configuration management are viable and still needed.” In other words, many application architectures combine unique components that require individual configuration with fungible components that can be managed collectively.

Pets: unique components provided with the specific resources they need. Also cute.

The power of the new compute architectures comes from separating out the unique and non-unique components of the application. This allows unique components such as control nodes to be configured with the specific resources they need, and appropriate high availability solutions to be put in place to keep them operational.

The non-unique components, now logically separated, are then available for cloud-scale management patterns: scaling out efficiently when work is available, scaling in to save costs when capacity is not required, being reset or deleted in the event of failure, and being managed through an aggregated dashboard view. As general-purpose compute nodes, they are governed by a “capacity” rather than by individual characteristics. Easily replaceable components also fit into a microservice model with immutable infrastructure deployment (think Spinnaker).
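To make the idea of being governed by a “capacity” concrete, here is a minimal Python sketch. It is not an Azure API: the `Pool` class, the `target_capacity` and `replace_failed` helpers, the capacity limits and the `node-N` naming are all hypothetical, chosen only to illustrate the pattern of sizing a pool from outstanding work and swapping out failed nodes rather than repairing them.

```python
import math
from dataclasses import dataclass


@dataclass
class Pool:
    """A group of interchangeable nodes described only by a count (hypothetical)."""
    capacity: int             # current number of nodes
    min_capacity: int = 0     # assumed lower bound, for illustration
    max_capacity: int = 100   # assumed upper bound, for illustration


def target_capacity(pool: Pool, queued_tasks: int, tasks_per_node: int) -> int:
    """Return the node count needed to drain the queue, clamped to the pool limits."""
    if tasks_per_node <= 0:
        raise ValueError("tasks_per_node must be positive")
    needed = math.ceil(queued_tasks / tasks_per_node)
    return max(pool.min_capacity, min(pool.max_capacity, needed))


def replace_failed(nodes: list[str], failed: set[str], next_id: int) -> list[str]:
    """Drop failed nodes and append freshly numbered replacements (cattle, not pets)."""
    healthy = [n for n in nodes if n not in failed]
    missing = len(nodes) - len(healthy)
    replacements = [f"node-{next_id + i}" for i in range(missing)]
    return healthy + replacements


if __name__ == "__main__":
    pool = Pool(capacity=2, min_capacity=1, max_capacity=10)
    print(target_capacity(pool, queued_tasks=45, tasks_per_node=10))
    print(replace_failed(["node-0", "node-1", "node-2"], {"node-1"}, next_id=3))
```

Note that nothing in the sketch cares which node does the work or which node failed; the only inputs are a count and a queue length, which is what makes these components manageable as a group.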

Since zoomorphic analogies involving pets and cattle tend to break down when you get into details, I’ve chosen a duck sitting on a shoal of koi carp to represent the concept of non-unique compute components, managed by a head node, scaling out to consume resources until all work is complete. Deal with it.

Shoal: General purpose nodes which can scale out to consume resources.

Building out scalable architecture
When building out scalable architecture in the cloud, there’s a trade-off between the level of control you have over individual infrastructure resources and the complexity of defining and placing those resources. Higher-level PaaS solutions such as Cloud Foundry reduce infrastructure development complexity by deploying fixed architectures that you conform to and build your application around.

In many cases, however, you want to deploy a specific architecture without the overhead of a pre-defined PaaS solution. For example, you might want a scalable compute layer running lightweight containerized tasks and connecting to another layer of your application, without requiring a new packaged PaaS layer on top of it.

VM Scale Sets (VMSS) occupy a sweet spot: they provide a high level of control over infrastructure without requiring you to correlate separate resources like networking, storage and compute, or to figure out how to balance nodes across fault and update domains. A VM Scale Set lets you define a single compute resource which has network, storage and extension properties. A single call goes down to the fabric, which leaves room to optimize performance and reliability (through overprovisioning, for example).
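As a rough illustration of what the platform takes off your hands, the Python sketch below models a scale set as one definition and then does by hand the two things mentioned above: spreading instances across fault domains and overprovisioning. Everything here is an assumption made for illustration. `ScaleSetSpec` is not the Azure resource schema, the field names and the `name_index` instance naming are invented, and the real fabric's placement and overprovisioning logic is not described in this post.

```python
from dataclasses import dataclass, field


@dataclass
class ScaleSetSpec:
    """Hypothetical single-resource description; not the Azure schema."""
    name: str
    capacity: int                       # number of identical VMs wanted
    vm_size: str                        # compute property
    subnet: str                         # network property
    os_disk: str                        # storage property
    extensions: list[str] = field(default_factory=list)


def spread_across_domains(spec: ScaleSetSpec, fault_domains: int) -> dict[int, list[str]]:
    """Assign instances round-robin so domain sizes differ by at most one."""
    if fault_domains <= 0:
        raise ValueError("fault_domains must be positive")
    placement: dict[int, list[str]] = {d: [] for d in range(fault_domains)}
    for i in range(spec.capacity):
        placement[i % fault_domains].append(f"{spec.name}_{i}")
    return placement


def keep_first_successful(results: list[bool], requested: int) -> list[int]:
    """Overprovisioning sketch: start more VMs than requested, then keep the
    indices of the first `requested` that came up and discard the rest."""
    kept = [i for i, ok in enumerate(results) if ok]
    return kept[:requested]


if __name__ == "__main__":
    spec = ScaleSetSpec("web", 7, "Standard_D2", "frontend-subnet", "ubuntu-image")
    print(spread_across_domains(spec, fault_domains=3))
    # Five VMs started for three requested; the second one failed to provision.
    print(keep_first_successful([True, False, True, True, True], requested=3))
```

The point of the sketch is the shape of the input: one object with a capacity, rather than seven VM definitions plus their NICs, disks and placement rules.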


VMSS is sometimes called IaaS+, though adding another label to the compute continuum should probably be discouraged. A better term is “infrastructure for PaaS”. The goal is to provide an easy-to-deploy, scalable resource that PaaS solutions can build on. Higher-level services that run on VMSS include Azure Service Fabric and Azure Container Service, and other services such as Azure Batch are moving to this model. By following an application model built on identically configured compute nodes, you get to deploy infrastructure in scale units. Deploying scale units (in this case, VM Scale Sets) dramatically reduces management overhead.

In future posts I’ll go into detail about some of the architectures deployed with VM Scale Sets to solve specific problems.
