How to create an Azure VM with the azurerm Python library

There are at least two ways to work with Azure infrastructure using Python. You can use the official Azure SDK for Python, which supports all Azure functionality, or the azurerm REST wrapper library, which is unofficial and supports a subset of the Azure REST API.

When to use which? Use the official SDK if you’re creating a production app or service. Use azurerm when you need something very lightweight that is easy to extend and contribute to, for example a quick ops script that figures out which VMs are in which fault domains.

Here’s a simple azurerm example which goes through the steps to create a virtual machine. Note: Since creating a VM imperatively with the Azure Resource Manager deployment model requires several steps, in most cases it is easier to deploy an ARM template to create a set of resources declaratively. When you deploy a template, Azure Resource Manager takes care of parallelizing resource creation, so your program doesn’t need multithreading or checks for resource completion. (That isn’t required for creating a simple VM imperatively like this, but it would be for a whole set of VMs or scale sets.) The azurerm library also includes functions to deploy templates.

This example first creates the resources the VM depends on: a resource group, network security group, storage account, VNet, public IP address and NIC. Then it creates the VM. The current azurerm.create_vm() function creates a pretty simple VM and lacks options for data disks, disk encryption, keyvault integration, etc. but you’re welcome to extend it.

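import azurerm
import json

# replace these placeholder values with your own
tenant_id = 'your-tenant-id'
application_id = 'your-application-id'
application_secret = 'your-application-secret'
subscription_id = 'your-subscription-id'
name = 'myvm'  # example base name, reused for the resource group and every resource
location = 'eastus'  # example region

# authenticate
access_token = azurerm.get_access_token(tenant_id, application_id, application_secret)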

# create resource group
print('Creating resource group: ' + name)
rmreturn = azurerm.create_resource_group(access_token, subscription_id, name, location)
print(rmreturn)

# create NSG
nsg_name = name + 'nsg'
print('Creating NSG: ' + nsg_name)
rmreturn = azurerm.create_nsg(access_token, subscription_id, name, nsg_name, location)
nsg_id = rmreturn.json()['id']
print('nsg_id = ' + nsg_id)

# create NSG rule
nsg_rule = 'ssh'
print('Creating NSG rule: ' + nsg_rule)
rmreturn = azurerm.create_nsg_rule(access_token, subscription_id, name, nsg_name, nsg_rule, description='ssh rule',
                                  destination_range='22')
print(rmreturn)

# create storage account
print('Creating storage account: ' + name)
rmreturn = azurerm.create_storage_account(access_token, subscription_id, name, name, location, storage_type='Premium_LRS')
print(rmreturn)

# create VNET
vnetname = name + 'vnet'
print('Creating VNet: ' + vnetname)
rmreturn = azurerm.create_vnet(access_token, subscription_id, name, vnetname, location, nsg_id=nsg_id)
print(rmreturn)
# print(json.dumps(rmreturn.json(), sort_keys=False, indent=2, separators=(',', ': ')))
subnet_id = rmreturn.json()['properties']['subnets'][0]['id']
print('subnet_id = ' + subnet_id)

# create public IP address
public_ip_name = name + 'ip'
dns_label = name + 'ip'
print('Creating public IP address: ' + public_ip_name)
rmreturn = azurerm.create_public_ip(access_token, subscription_id, name, public_ip_name, dns_label, location)
print(rmreturn)
ip_id = rmreturn.json()['id']
print('ip_id = ' + ip_id)

# create NIC
nic_name = name + 'nic'
print('Creating NIC: ' + nic_name)
rmreturn = azurerm.create_nic(access_token, subscription_id, name, nic_name, ip_id, subnet_id, location)
print(rmreturn)
nic_id = rmreturn.json()['id']

# create VM
vm_name = name
vm_size = 'Standard_A1'
publisher = 'Canonical'
offer = 'UbuntuServer'
sku = '16.04.0-LTS'
version = 'latest'
os_uri = 'http://' + name + '.blob.core.windows.net/vhds/osdisk.vhd'
username = 'rootuser'
password = 'myPassw0rd'

print('Creating VM: ' + vm_name)
rmreturn = azurerm.create_vm(access_token, subscription_id, name, vm_name, vm_size, publisher, offer, sku,
                             version, name, os_uri, username, password, nic_id, location)
print(rmreturn)
print(json.dumps(rmreturn.json(), sort_keys=False, indent=2, separators=(',', ': ')))
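
The script above indexes straight into rmreturn.json()['id'], so a failed call shows up as a confusing KeyError rather than a useful message. Here’s a minimal sketch of a guard. It assumes the azurerm functions return requests-style response objects with a status_code attribute and a json() method (the script’s own use of rmreturn.json() suggests this, but check the library). The resource_id helper and the FakeResponse stand-in are hypothetical names, used only so the sketch runs without Azure credentials:

```python
class FakeResponse:
    """Hypothetical stand-in for the response object returned by an azurerm call."""

    def __init__(self, status_code, payload):
        self.status_code = status_code
        self._payload = payload

    def json(self):
        return self._payload


def resource_id(response, what):
    """Return the 'id' field of a successful create call, or raise a clear error."""
    # Assumption: any 2xx status means the request was accepted.
    if not 200 <= response.status_code < 300:
        raise RuntimeError('Creating %s failed with HTTP %d: %s'
                           % (what, response.status_code, response.json()))
    return response.json()['id']


ok = FakeResponse(201, {'id': '/subscriptions/xxx/resourceGroups/myrg'})
print(resource_id(ok, 'resource group'))
```

In the script you would then write, for example, nsg_id = resource_id(rmreturn, 'NSG') instead of reading the id directly.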

Compare this azurerm example with an Azure Python SDK example to create a VM.

Upgrading Minecraft on an Azure VM

When you deploy an Azure Minecraft VM using the Azure Resource Manager template it should be running the latest Minecraft server version, but the Mojang folks update the server fairly often, and before you know it your Minecraft launcher is complaining that the server is no longer on the latest version. If you deploy the Azure Marketplace Minecraft image rather than the ARM template, the server is more likely to be out of date. Here’s how you can upgrade the Minecraft server on the Azure VM to the latest version.

The basic steps are:

  • ssh to the VM.
  • Download the latest Minecraft server JAR file.
  • Update the minecraft-server systemctl service to point to the new JAR file.
  • Restart the minecraft-server service.

For convenience here’s a script that performs all those steps automatically. I’ll paste it below, though go here for the latest version: https://github.com/gbowerman/azure-minecraft/blob/master/scripts/mineserverupgrade.sh.

To upgrade a Minecraft server on Azure (as long as it was originally deployed using the ARM template or Marketplace image), copy the script to the virtual machine and run it using sudo. Since it’s a fairly short script a simple way to put it on the VM might be to just start vi (or nano or whatever) and paste the script into the editor and save it. Then remember to run chmod +x on the file to make it executable, and run it as root. E.g. like this (if you’re upgrading to Minecraft server version 1.10.2):

sudo bash
./mineserverupgrade.sh 1.10.2

Here’s the listing:

#!/bin/bash
# Minecraft server upgrade script for Azure
# $1 = new version (e.g. 1.10.2)

# check for a command line argument
if [[ ! $# -eq 1 ]] ; then
    echo The Minecraft server version needs to be passed as a command line argument, e.g. sudo $0 1.10.2
    exit 1
fi

# server values
minecraft_server_path=/srv/minecraft_server
server_jar=minecraft_server.$1.jar
SERVER_JAR_URL=https://s3.amazonaws.com/Minecraft.Download/versions/$1/minecraft_server.$1.jar

# adjust memory usage depending on VM size
totalMem=$(free -m | awk '/Mem:/ { print $2 }')
if [ $totalMem -lt 1024 ]; then
    memoryAlloc=512m
else
    memoryAlloc=1024m
fi

cd $minecraft_server_path

# download the server jar, retrying every 10 seconds until it succeeds
while ! wget "$SERVER_JAR_URL"; do
    sleep 10
done

# stop the service
systemctl stop minecraft-server

# move the old service file
mv /etc/systemd/system/minecraft-server.service /tmp/minecraft-server.service.old

# recreate the service
touch /etc/systemd/system/minecraft-server.service
printf '[Unit]\nDescription=Minecraft Service\nAfter=rc-local.service\n' >> /etc/systemd/system/minecraft-server.service
printf '[Service]\nWorkingDirectory=%s\n' $minecraft_server_path >> /etc/systemd/system/minecraft-server.service
printf 'ExecStart=/usr/bin/java -Xms%s -Xmx%s -jar %s/%s nogui\n' $memoryAlloc $memoryAlloc $minecraft_server_path $server_jar >> /etc/systemd/system/minecraft-server.service
printf 'ExecReload=/bin/kill -HUP $MAINPID\nKillMode=process\nRestart=on-failure\n' >> /etc/systemd/system/minecraft-server.service
printf '[Install]\nWantedBy=multi-user.target\nAlias=minecraft-server.service' >> /etc/systemd/system/minecraft-server.service

# reload systemd so it picks up the new service file, then restart the service
systemctl daemon-reload
systemctl start minecraft-server

# closing message
echo Upgrade completed. If any problems, you can revert to the previous version by running:
echo sudo systemctl stop minecraft-server
echo sudo cp /tmp/minecraft-server.service.old /etc/systemd/system/minecraft-server.service
echo sudo systemctl daemon-reload
echo sudo systemctl start minecraft-server
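
If you want to preview what the printf lines in the script write before running it on a server, here’s a small Python sketch that builds the same service file text. It mirrors the listing above (same path, same memory rule); the function names memory_alloc and service_unit are my own and not part of the script:

```python
def memory_alloc(total_mem_mb):
    """Mirror the script's rule: 512m below 1024 MB of RAM, otherwise 1024m."""
    return '512m' if total_mem_mb < 1024 else '1024m'


def service_unit(version, total_mem_mb, server_path='/srv/minecraft_server'):
    """Build the text of minecraft-server.service for a given server version."""
    mem = memory_alloc(total_mem_mb)
    jar = '%s/minecraft_server.%s.jar' % (server_path, version)
    lines = [
        '[Unit]',
        'Description=Minecraft Service',
        'After=rc-local.service',
        '[Service]',
        'WorkingDirectory=%s' % server_path,
        'ExecStart=/usr/bin/java -Xms%s -Xmx%s -jar %s nogui' % (mem, mem, jar),
        'ExecReload=/bin/kill -HUP $MAINPID',
        'KillMode=process',
        'Restart=on-failure',
        '[Install]',
        'WantedBy=multi-user.target',
        'Alias=minecraft-server.service',
    ]
    return '\n'.join(lines) + '\n'


# Example: version 1.10.2 on a VM with about 3.5 GB of RAM
print(service_unit('1.10.2', 3500))
```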

How to convert an Azure virtual machine to a VM Scale Set

If you have a regular Azure Resource Manager virtual machine, you can convert it to be a source image for a VM Scale Set. In this example I’ll convert an Azure VM running a Minecraft server, to a load balanced VM scale set of servers. Multiple Minecraft client connections will hit a load balancer and be routed to different VMs in the set. This would enable you to start with a single VM, and scale it out to handle a much larger load.

In a nutshell, to convert a single VM into a scale set you need to: capture a generalized image of the VM and copy that image into the storage account you’ll use for the set, then deploy a VM Scale Set with a custom image pointing to the generalized image. The steps are:

1. Generalize the VM (e.g. run sysprep on Windows or waagent -deprovision on Linux).

2. Stop and deallocate the VM.

3. Set the VM state as generalized.

4. Save the image to a storage account.

5. Copy the image to the storage account where you want to create the scale set.

6. Deploy a VM Scale Set template with the image->uri property set to the image location.

Now let’s go through those steps in more detail.

Before starting make sure you have an Azure VM that you can log in to. For the Minecraft scenario the starting point would be to deploy a Minecraft server VM using the Minecraft Server Azure template. See Creating a Minecraft server using an Azure Resource Manager template for more information on how to do that.

Note: in the steps below I will mostly use Azure CLI examples. Steps 1-4 for PowerShell are described in detail in Stephane Laponte’s excellent blog post STEP BY STEP: HOW TO CAPTURE YOUR OWN CUSTOM VIRTUAL MACHINE IMAGE UNDER AZURE RESOURCE MANAGER. Another useful PowerShell resource for steps 1-4, particularly if you have a Windows VM, is the Azure documentation: How to capture a Windows virtual machine in the Resource Manager deployment model.

1. Generalize the VM

The first step in preparing a VM to be a source image for new VM deployments is to log in to the machine and generalize the image so it can be assigned a new name/user/password/certificate etc. at VMSS deployment time.

On Windows that means running sysprep. On Linux call the VM agent with the -deprovision argument: sudo waagent -deprovision.

2. Stop and deallocate the VM

Stop and deallocate the VM so the OS drive image can be captured.

For PowerShell the command is Stop-AzureRmVm. The CLI command is: azure vm deallocate <resource group> <vm name>.

3. Set the VM state as generalized

Now tell Azure that the VM is generalized.

The PowerShell command is:
Set-AzureRmVM -ResourceGroupName <resource group name> -Name <vm to generalize> -Generalized

The Azure CLI command is: azure vm generalize <resource group> <vm name>.

4. Save the image to a storage account

Now it’s time to capture the generalized image and save it in a storage account. The PowerShell command is Save-AzureRmVMImage. The CLI command is: azure vm capture <resource group> <vm name>. You’ll get back a template for the captured image which includes the properties->storageProfile->osDisk->image->uri setting, which is the link to the captured image that you’ll need when copying it to a new storage account.
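
If you save the returned template to a file, you can pull the image URI out with a few lines of Python rather than hunting for it by eye. This is a rough sketch: it assumes the template has a top-level resources list, and the sample data and storage account name are made up for illustration:

```python
def find_image_uri(template):
    """Return the first osDisk image URI found in a captured-image template."""
    for resource in template.get('resources', []):
        os_disk = (resource.get('properties', {})
                   .get('storageProfile', {})
                   .get('osDisk', {}))
        uri = os_disk.get('image', {}).get('uri')
        if uri:
            return uri
    return None


# Hypothetical, trimmed-down example of what a captured template might contain.
captured = {
    'resources': [{
        'type': 'Microsoft.Compute/virtualMachines',
        'properties': {
            'storageProfile': {
                'osDisk': {
                    'image': {
                        'uri': 'https://mystorage.blob.core.windows.net/system/'
                               'Microsoft.Compute/Images/vhds/example-osDisk.vhd'
                    }
                }
            }
        }
    }]
}
print(find_image_uri(captured))
```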

5. Copy the image to the storage account where you’ll create the scale set

If the generalized VM image capture is already in the storage account and container you want it to be in, fine. In most cases at this point you’ll probably want to create a new storage account that you will use for the scale set you’ll be creating. You can copy the image to the new storage account using PowerShell, CLI, or a storage explorer like CloudBerry Explorer. I like the CloudBerry tool because it offers a nice split screen to show 2 storage accounts at a time and easily copy blobs between them.

Make a note of the URI for the new image as it will be used when deploying the scale set.

6. Deploy a VM Scale Set template with the image->uri property set to the new image location

The last step is to create a VM Scale Set with the image URI property set to the new image. There are some example ARM templates which allow you to specify a custom image, like this one in Azure Quickstart Templates: https://github.com/Azure/azure-quickstart-templates/tree/master/201-vmss-windows-customimage, but for the Minecraft server scenario, as well as being a Linux image, I also wanted to create a public IP address, and a load balancer with a rule to load balance incoming requests to the default Minecraft server port of 25565 to every VM in the set. The specialized template I created is here: vmss-minecraft-custom.json.

Deploying this template to Azure as a new custom deployment in the portal allows the URI of the new image to be specified as a deployment parameter (along with the number of VMs, VM size, etc.).
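
If you’d rather script the deployment than use the portal, the request body for an ARM template deployment wraps the template and its parameter values as sketched below. The parameter names here (sourceImageVhdUri, instanceCount, vmSku) are hypothetical, so use whatever names the template actually declares. The template dict is just a placeholder for the real JSON:

```python
import json


def deployment_body(template, **params):
    """Wrap a template and parameter values in an ARM deployment request body."""
    return {
        'properties': {
            'mode': 'Incremental',
            'template': template,
            # ARM expects each parameter as {"value": ...}
            'parameters': {k: {'value': v} for k, v in params.items()},
        }
    }


# Parameter names below are hypothetical; use whatever the template declares.
body = deployment_body(
    {'placeholder': 'contents of vmss-minecraft-custom.json go here'},
    sourceImageVhdUri='https://mystorage.blob.core.windows.net/vhds/image.vhd',
    instanceCount=10,
    vmSku='Standard_A1',
)
print(json.dumps(body, indent=2))
```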

Once the template is successfully deployed, a VM Scale Set of 10 Minecraft servers is now running behind a load balancer. Yay! Now the set of Minecraft servers can handle 10 times the incoming load, or I could scale this out to 40 servers.

Note the Minecraft world on each VM in the scale set is exactly how it was when I generalized the original VM, with the same operators, whitelist settings etc. When users start making changes to different VMs the worlds will diverge, but I can always reimage the VMs in the scale set to set them back to the source image.

One manual thing I had to do was start the Minecraft server on each VM (i.e. SSH to each VM using the inbound NAT rules defined in the template and run sudo systemctl start minecraft-server). This shouldn’t be necessary, and it may have been because I had shut down the Minecraft server before generalizing the image.

Next steps

This was a basic walkthrough of converting a standalone Azure VM to a VM Scale Set. A next logical step would be to configure the VMSS template to use Azure autoscale. This way instead of launching a fixed number of VMs and manually scaling in or out, you could save costs by automatically scaling in or out depending on a workload metric such as average CPU usage.

How to upgrade an Azure VM Scale Set without shutting it down

This article describes how you can roll out an OS update to an Azure VM Scale Set without any downtime. In this context an OS update is either changing the version/sku of the OS, or changing the URI of a custom image. Updating without downtime means updating VMs one at a time, or in groups (such as one fault domain at a time), rather than all at once, so any VMs which are not being upgraded can keep running.

To avoid ambiguity let’s distinguish 3 types of OS update you might want to do:
1. Changing the version or sku of a platform image. E.g. changing Ubuntu 14.04.2-LTS version from 14.04.201506100 to 14.04.201507060, or changing the Ubuntu 15.10/latest sku to 16.04.0-LTS/latest. Covered in this article.
2. You built a new version of a custom image and want to change the URI which points to the image (properties->virtualMachineProfile->storageProfile->osDisk->image->uri). Covered in this article.
3. Patching the OS from within a VM e.g. installing a security patch, using Windows Update etc. Supported but not covered in this article.

The first 2 are supported requirements. For the third one, at least for now, you’d need to create a new scale set to do that. This article covers options 1 and 2.
Note: VM Scale Sets which are deployed as part of an Azure Service Fabric cluster are not covered here.

The basic sequence for changing the OS version/sku of a platform image or the URI of a custom image looks like this:
– Get the VMSS model.
– Change the version, sku or URI value in the model.
– Update the model.
– Do a manualUpgrade call on the VMs in the scale set. This is only relevant if the upgradePolicy property of your Scale Set is set to “Manual”. If it is set to “Automatic”, all the VMs will be upgraded at once and there will be downtime.
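
The “change the value in the model” step is just an edit to the JSON you got back from the GET. Here’s a minimal Python sketch of that step on its own, with no Azure calls. The property paths follow the ones mentioned in this article; the function names and the trimmed sample model are mine and only for illustration:

```python
import copy


def set_platform_image(model, version=None, sku=None):
    """Return a copy of a VMSS model with a new platform image version and/or sku."""
    updated = copy.deepcopy(model)
    profile = updated['properties']['virtualMachineProfile']
    image_ref = profile['storageProfile']['imageReference']
    if version is not None:
        image_ref['version'] = version
    if sku is not None:
        image_ref['sku'] = sku
    return updated


def set_custom_image_uri(model, uri):
    """Return a copy of a VMSS model pointing at a new custom image URI."""
    updated = copy.deepcopy(model)
    profile = updated['properties']['virtualMachineProfile']
    os_disk = profile['storageProfile']['osDisk']
    os_disk.setdefault('image', {})['uri'] = uri
    return updated


# Hypothetical, heavily trimmed model as it might come back from a GET.
model = {'properties': {'virtualMachineProfile': {'storageProfile': {
    'imageReference': {'publisher': 'Canonical', 'offer': 'UbuntuServer',
                       'sku': '14.04.2-LTS', 'version': '14.04.201506100'},
    'osDisk': {'name': 'osdisk', 'createOption': 'FromImage'}}}}}

new_model = set_platform_image(model, version='14.04.201507060')
storage = new_model['properties']['virtualMachineProfile']['storageProfile']
print(storage['imageReference'])
```

Working on a copy keeps the original model around, which is handy if the update fails and you want to compare or roll back.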

With all this in mind, let’s review how you could update the version of a scale set in PowerShell, and using the REST API. These examples cover the case of a platform image, but hopefully I’ve provided enough information for you to adapt this to a custom image.

PowerShell

This example updates a Windows VM Scale Set to a new version “4.0.20160229”. After updating the model, it does an update one VM instance at a time.

$rgname = "myrg"
$vmssname = "myvmss"
$newversion = "4.0.20160229"
$instanceid = "1"
# get the VMSS model
$vmss = Get-AzureRmVmss -ResourceGroupName $rgname -VMScaleSetName $vmssname
# set the new version in the model data
$vmss.virtualMachineProfile.storageProfile.imageReference.version = $newversion
# update the VMSS model
Update-AzureRmVmss -ResourceGroupName $rgname -Name $vmssname -VirtualMachineScaleSet $vmss
# now start updating instances
Update-AzureRmVmssInstance -ResourceGroupName $rgname -VMScaleSetName $vmssname -InstanceId $instanceid

If you were updating the URI for a custom image instead of changing a platform image version, you’d replace the “set the new version” line with something like this:

# set the new custom image URI in the model data
$vmss.virtualMachineProfile.storageProfile.osDisk.image.uri = $newURI

Using the REST API

Here are a couple of Python examples which use the Azure REST API to roll out an OS version update. In both cases they make use of the lightweight azurerm library of Azure REST API wrapper functions to do a GET on the scale set to get the model, and then a PUT with an updated model. They also look at VM instances views to identify the VMs by update domain.

vmssupgrade

vmssupgrade is a Python script to roll out an OS upgrade to a running VM Scale Set, one update domain at a time. You can find it here: https://github.com/gbowerman/vmsstools

This script lets you choose specific VMs to update, or specify an update domain, and supports changing a platform image version OR changing the URI of a custom image.
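
To give a feel for the “one update domain at a time” part, here’s a small sketch of grouping VM instances by update domain. It is not taken from vmssupgrade. It assumes you have already fetched each VM’s instance view and that the view carries a platformUpdateDomain field; the sample data is hypothetical:

```python
from collections import defaultdict


def group_by_update_domain(instance_views):
    """Map update domain number -> sorted list of instance IDs."""
    domains = defaultdict(list)
    for instance_id, view in instance_views.items():
        domains[view['platformUpdateDomain']].append(instance_id)
    return {ud: sorted(ids, key=int) for ud, ids in sorted(domains.items())}


# Hypothetical instance views, keyed by instance ID.
views = {
    '0': {'platformUpdateDomain': 0, 'platformFaultDomain': 0},
    '1': {'platformUpdateDomain': 1, 'platformFaultDomain': 1},
    '5': {'platformUpdateDomain': 0, 'platformFaultDomain': 0},
    '6': {'platformUpdateDomain': 1, 'platformFaultDomain': 1},
}

for ud, ids in group_by_update_domain(views).items():
    print('UD %d: upgrade instances %s' % (ud, ', '.join(ids)))
```

You would then issue the manual upgrade call for all the instance IDs in one update domain, wait for them to come back, and move on to the next.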

vmsseditor

This is a general purpose editor for VM Scale Sets, which shows VM status as a heatmap where one row represents one update domain (UD). Among other things you can update the model for a VMSS with a new version, sku or custom image URI, and then pick update domains to upgrade (i.e. all the VMs in that UD are then upgraded to the new model), or do a rolling upgrade based on the batch size of your choice. vmsseditor can be found in the following github repo: https://github.com/gbowerman/vmssdashboard

E.g. here I’ve just updated the model of a scale set to Ubuntu 14.04.2-LTS version 14.04.201507060 (note this is an old screenshot; many more options have since been added to this tool).

image

After clicking Upgrade and then Get Details again, VMs in UD 0 are starting to update.

image

CLI?

I haven’t included a CLI example yet. Will try and get to that soon. I’d probably do it by deploying an empty template just consisting of an updated SKU packet and version to an existing template.

Change the instance count of an Azure VM Scale Set

image

This article describes how to change the number of VMs in an Azure VM Scale Set.

The most important property of a scale set is “capacity”, which represents the number of VMs in the set. The main premise of a scale set is that you can easily change the number of VMs, i.e. scale it in or out, without having to worry about underlying resources like NICs, storage accounts, update domain/fault domain placement, changing VM properties etc.

So scaling a scale set is easy right? Well it should be. The basic steps to change the capacity of a scale set are:

  • GET the “model”, i.e. configuration of the scale set.
  • Change the “capacity” setting in the model.
  • PUT the model, i.e. update the configuration of the scale set.

Fortunately when you update the model of a scale set, you don’t have to include all the details, so you can skip the first two steps and do a “PATCH” which only passes in the “sku”, i.e. the information packet pictured above.

Here’s a quick review of how to change the capacity of a scale set using the Azure portal, PowerShell, CLI, and the REST API.

Change capacity using the Portal

Though the portal doesn’t directly support changing the capacity of an existing scale set yet, you can go here: https://github.com/Azure/azure-quickstart-templates/tree/master/201-vmss-scale-existing and click on the “Deploy to Azure” button. When you enter the parameters in the portal, make sure you use the same resource group, VM Scale Set name, and vmSku (machine size) as your existing scale set.

Change capacity using PowerShell

Here’s an example using Azure PowerShell that uses the GET/Change/PUT approach to set a scale set called winvmss to 10 VMs:

$vmss = Get-AzureRmVmss -ResourceGroupName winvmss -VMScaleSetName winvmss  
$vmss.Sku.Capacity = 10
Update-AzureRmVmss -ResourceGroupName winvmss -Name winvmss -VirtualMachineScaleSet $vmss  

Change capacity using CLI

Update 6/6/16: You can now change the number of VMs in a scale set using a single CLI command – azure vmss scale. See this announcement on the Azure Linux blog: https://blogs.msdn.microsoft.com/azurelinux/2016/06/06/vm-scale-set-scale-command-is-now-available-in-the-0-10-1-azure-cli-release/

Change capacity using the REST API

The good thing about calling the Azure REST API directly is that it makes it easy to wrap a simple function call around a scale operation so it can be as simple as it should be.

Here’s an example in Python for doing a PATCH call to a scale set which only passes the ‘sku’ property. The azurerm unofficial Python library implements a function like this:

# scale_vmss(access_token, subscription_id, resource_group, vmss_name, size, tier, capacity) 
# change the instance count of an existing VM Scale Set
def scale_vmss(access_token, subscription_id, resource_group, vmss_name, size, tier, capacity): 
    endpoint = ''.join([azure_rm_endpoint, 
                '/subscriptions/', subscription_id,  
                '/resourceGroups/', resource_group,                                
                '/providers/Microsoft.Compute/virtualMachineScaleSets/', vmss_name, 
                '?api-version=', COMP_API]) 
    body = '{"sku":{ "name":"' + size + '", "tier":"' + tier + '", "capacity":"' + str(capacity) + '"}}' 
    return do_patch(endpoint, body, access_token) 
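
Building JSON by string concatenation works but gets fragile once a value contains a quote. Here’s a variation on just the body-building line using json.dumps. This is my sketch rather than what azurerm does; the name scale_body is made up. It sends capacity as a number instead of a quoted string, which I’m assuming the API accepts just as well:

```python
import json


def scale_body(size, tier, capacity):
    """Build the JSON body for a PATCH that only changes the scale set sku."""
    sku = {'name': size, 'tier': tier, 'capacity': int(capacity)}
    return json.dumps({'sku': sku})


print(scale_body('Standard_A1', 'Standard', 10))
```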

Easier scaling coming soon

There will be simple scale options in the portal coming soon, and the API may be simplified to make capacity a simple direct call, which would also result in simpler PowerShell and CLI commands to scale.

Extension sequencing in Azure VM Scale Sets

VM extensions are a good way to customize an Azure VM at deployment time. You can deploy a platform image from the Azure Marketplace and then customize it with one or more extensions. Examples of extensions include diagnostics extensions to emit performance data, antivirus extensions, custom script extensions (where you can run your shell script or PowerShell at VM startup).

Templates, and Azure Resource Manager in general, will do everything they can in parallel. This can be a problem when you have multiple VM extensions defined, and both of them try to use an OS resource which is locked. For example, on Ubuntu, only one process at a time can successfully run apt-get to install software. If Azure Resource Manager ran these two extensions at the same time on the same VM, one would fail.

Extension sequencing is not a problem for regular Azure virtual machines, because you can define the extension resource with a dependsOn clause. You can use this to make one extension depend on another, so it will only run after the other extension has completed.

Extension sequencing is a problem for Azure VM Scale Sets however, because scale set extensions are not defined externally. Extensions are just another property under the Microsoft.Compute/virtualMachineScaleSets resource and don’t have a dependsOn clause. If you define a list of extensions, they could all potentially run at the same time.

Autoscale

A common VM Scale Set scenario where lack of extension sequencing can be a problem is with autoscale. To set up autoscale on a scale set, you need to define a diagnostic extension to emit performance data to a storage account. This performance data is then evaluated by the Insights engine to determine when to emit scale events.

On Linux, and let’s take the example of Ubuntu, the Linux Diagnostic Extension uses the OS installer while it’s setting up. If you also want to install something on your VMs at deployment time, and use a custom script extension, the custom script extension can fail with errors like this in the extension.log file: Could not get lock /var/lib/dpkg/lock

Ways to sequence extensions in VM Scale Sets

Here are a couple of Ubuntu examples of how to make a custom script extension wait for the Linux Diagnostic extension to finish executing:

1. Put a loop around apt-get until it returns success

Here’s an example from the custom script used by the Ubuntu Apache PHP autoscale example in Azure Quickstart templates. The idea is to put a loop around your apt-get calls to keep trying until they work:

until apt-get -y update && apt-get -y install apache2 php5 
do
 echo "Try again"
 sleep 2
done

This usually works. The only potential issue is that there could be a timing problem where the script temporarily gets a lock on the apt process and then loses it, causing the extension to fail.
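
If your custom script is written in Python rather than shell, the same retry idea might look like the sketch below. The helper name run_until_success and the optional attempt cap are my additions, not part of the Quickstart example. The cap stops a permanently broken install from looping forever:

```python
import subprocess
import time


def run_until_success(command, delay=2, max_attempts=None):
    """Run a shell command repeatedly until it exits with status 0.

    Returns the number of attempts used; raises RuntimeError when
    max_attempts is set and exhausted.
    """
    attempt = 0
    while True:
        attempt += 1
        if subprocess.call(command, shell=True) == 0:
            return attempt
        if max_attempts is not None and attempt >= max_attempts:
            raise RuntimeError('gave up after %d attempts: %s' % (attempt, command))
        print('Try again')
        time.sleep(delay)


# On a real VM the command would be the apt-get line from the shell loop, e.g.
# run_until_success('apt-get -y update && apt-get -y install apache2 php5')
print(run_until_success('exit 0', delay=0))
```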

2. Wait for the Linux Diagnostic extension to complete

Another approach is to put a loop at the beginning of your custom script which waits for the Linux Diagnostic extension to write a record to the extension log saying it’s done installing stuff. This check can’t make any assumptions about what version of the diagnostic extension is running, so it involves using the find command. Here’s an example from another autoscale example which installs Python bottle:

while ( ! (find /var/log/azure/Microsoft.OSTCExtensions.LinuxDiagnostic/*/extension.log | xargs grep "Start mdsd"));
do
  sleep 5 
done 
apt-get -y update 
apt-get -y install python3-bottle

Overall this approach is safer, because you know the diagnostic extension is done at this point. Though if a future updated version of the diagnostic extension changed the way it wrote to the extension log, you might need to maintain this code.
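
Here’s the same wait expressed as a Python sketch, with a timeout added so the script can’t hang forever if the log format changes. The function names are mine. The demo at the bottom runs against a temporary directory standing in for the real log location, and the version folder name is made up:

```python
import glob
import os
import tempfile
import time


def marker_logged(pattern, marker):
    """Return True if any file matching the glob pattern contains the marker text."""
    for path in glob.glob(pattern):
        with open(path, errors='replace') as log:
            if marker in log.read():
                return True
    return False


def wait_for_marker(pattern, marker, delay=5, timeout=600):
    """Poll until the marker shows up in a matching log file, or the timeout passes."""
    deadline = time.time() + timeout
    while not marker_logged(pattern, marker):
        if time.time() >= deadline:
            return False
        time.sleep(delay)
    return True


# Demo against a temporary directory standing in for
# /var/log/azure/Microsoft.OSTCExtensions.LinuxDiagnostic/<version>/extension.log
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, '2.3.9'))  # hypothetical version folder
with open(os.path.join(demo_dir, '2.3.9', 'extension.log'), 'w') as f:
    f.write('... Start mdsd\n')
demo_pattern = os.path.join(demo_dir, '*', 'extension.log')
print(wait_for_marker(demo_pattern, 'Start mdsd', delay=0))
```

On a real VM you would pass the /var/log/azure/Microsoft.OSTCExtensions.LinuxDiagnostic/*/extension.log pattern and only run apt-get once wait_for_marker returns True.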

How you implement extension sequencing is going to depend on which extensions are running and how they compete with one another if at all.

The good news is that a built-in way to do extension sequencing will be added to VM Scale Sets in the near future, so there will be an easier way to manage the order in which extensions are installed.

Deploying Applications in Azure VM Scale Sets

How are Applications deployed on VM Scale Sets?

An application running on a VM Scale Set is typically deployed in one of three ways:

1. Installing new software on a Platform image at deployment time. A platform image in this context is an operating system image from the Azure Marketplace, like Ubuntu 16.04, Windows Server 2012 R2 etc.

You can install new software on a platform image using a VM Extension. A VM extension is software that runs when a VM is deployed. You can run any code you like at deployment time using a custom script extension. Here’s an example Azure Resource Manager Template with two VM extensions, a custom script extension to install Apache and PHP, and a diagnostic extension to emit performance data which can be used by Azure autoscaling: Autoscale a VM Scale Set running an Ubuntu/Apache/PHP app.

An advantage of this approach is you have a level of separation between your application code and the OS, and can maintain your application separately. Of course that means there are also more moving parts, and depending on how much needs to download and configure when the extension runs, it could add to the VM deployment time.

2. Create a custom VM image which includes both the OS and the application in a single VHD. Here the scale set consists of a set of VMs copied from an image created by you, which you have to maintain. This means no extra configuration is required at VM deployment time, but there are some limitations with custom images in the current version of VM Scale Sets – you are limited to a single storage account, and hence a maximum of 40 VMs in a scale set (as opposed to 100 VMs in a scale set which uses platform images).

3. Deploy a platform or a custom image which is basically a container host, and install your application as one or more containers which you can manage with an orchestrator or config management tool. The nice thing about this approach is that you have completely abstracted your cloud infrastructure from the application layer and can maintain them separately.

What happens when a VM Scale Set Scales Out? 

When you add one or more VMs to a scale set by increasing the capacity – whether manually or through autoscale – the application is automatically installed. For example if the scale set has extensions defined, they run on a new VM each time it is created. If the scale set is based on a custom image, any new VM will be a copy of the source custom image. If the scale set VMs are container hosts, then you might have startup code to load the containers in a custom script extension, or an extension might install an agent which registers with a cluster orchestrator (e.g. Azure Container Service).

How do you manage application updates in VM Scale Sets?

For application updates in VM Scale Sets, three main approaches follow from the three application deployment methods outlined above:

1. Updating with Extensions. Any VM extensions which are defined for a VM Scale Set are executed each time a new VM is deployed, an existing VM is reimaged, or a VM extension is updated. If you need to update your application, updating it directly through extensions is a viable approach – you can update the extension definition, or the extension code can point to a location which contains updateable software.

The hard problems there are:

– Security – how to maintain certificates/shared access signatures.

– Scaling – how the application updates, how long it takes, when you scale out.

2. The immutable approach. When you bake the application (or app components) into a VM image you can focus on building a reliable pipeline to automate build, test, deployment of the images (e.g. Jenkins based). You can design infrastructure architecture to facilitate rapid swapping of a stage scale set into production. A good example of this approach is the Azure Spinnaker driver work: https://github.com/spinnaker/deck/tree/master/app/scripts/modules/azure and http://www.spinnaker.io/

Packer and Terraform support Azure Resource Manager, so you can also define your images “as code” and build them in Azure, then use the VHD in your scale set. Where this would become problematic is for Marketplace images, where extensions/custom scripts become more important as you don’t directly manipulate bits from Marketplace.

3. Update Containers. Abstract the application lifecycle management to a level above the cloud infrastructure, e.g. by encapsulating applications and app components into containers and managing these through container orchestrators and app managers like chef/puppet.

The scale set VMs then become a stable substrate for the containers and only require occasional security and OS related updates. As mentioned, the Azure Container Service is a good example of taking this approach and building a service around it.

How do you roll out an OS update across update domains?

Suppose you want to update your OS image while keeping the VM Scale Set running. One way to do this is to update the VM images one VM at a time. You can do this with PowerShell or Azure CLI. There are separate commands to update the VM Scale Set model (how its configuration is defined), and to issue “manual upgrade” calls on individual VMs.

Here’s an example Python script which automates this to update a VM Scale Set one update domain at a time: https://github.com/gbowerman/vmsstools. (Caveat: it’s more of a proof of concept than a hardened production-ready solution – you might want to add some error checking etc.).
