Autoscaling VM scale sets with Azure host metrics

It’s now easier to set up autoscaling with Azure VM scale sets than it used to be. Until recently the only way to set up autoscaling for scale sets was by installing the Azure diagnostics extension in every VM. The diagnostics extension was required to emit performance data to a storage account that you also had to manage. Your autoscale rules would then reference that data to evaluate whether to trigger a scale action.

In October 2016 a more efficient data pipeline for Azure VMs went into production, based on host metrics. “Host metrics” means the hypervisor host running the Azure VM collects performance data about the VM and stores it in a free storage account managed on your behalf. This is the data used when you look at your VM scale set properties in the Azure portal and view or edit a graph.

image

Autoscaling using Azure templates

You can access the host metric data directly, and also set up Azure autoscale rules to use it. The autoscaling templates in Azure Quickstart templates have already been converted to use host-based metrics (that is, they no longer install a diagnostics extension, no longer create a storage account for metrics, and instead reference the host metric display names in the autoscale rules). This makes the templates simpler than they used to be. See the following examples:

https://github.com/Azure/azure-quickstart-templates/tree/master/201-vmss-bottle-autoscale
https://github.com/Azure/azure-quickstart-templates/tree/master/201-vmss-lapstack-autoscale
https://github.com/Azure/azure-quickstart-templates/tree/master/201-vmss-ubuntu-autoscale
https://github.com/Azure/azure-quickstart-templates/tree/master/201-vmss-windows-autoscale

For example, one of the scale rules in the python-bottle template references the Percentage CPU host metric:

image
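
The JSON behind a rule like this follows the standard Azure autoscale schema. Here is the shape of such a rule, sketched as a Python dict; the resource URI and the numeric values are illustrative placeholders, not taken from the template:

```python
# Sketch of an autoscale scale-out rule using the "Percentage CPU" host metric.
# The metricResourceUri and numeric values are illustrative placeholders.
scale_out_rule = {
    "metricTrigger": {
        "metricName": "Percentage CPU",
        "metricResourceUri": "/subscriptions/<sub-id>/resourceGroups/<rg>/"
                             "providers/Microsoft.Compute/virtualMachineScaleSets/<vmss>",
        "timeGrain": "PT1M",        # granularity of the metric samples
        "statistic": "Average",     # how samples are combined across VMs
        "timeWindow": "PT5M",       # window the rule evaluates
        "timeAggregation": "Average",
        "operator": "GreaterThan",
        "threshold": 60.0
    },
    "scaleAction": {
        "direction": "Increase",    # scale out
        "type": "ChangeCount",
        "value": "1",               # add one VM at a time
        "cooldown": "PT5M"          # wait before evaluating again
    }
}
```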

The host metric names you can use in the scale rules are documented here: https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-supported-metrics

For example, at the time of writing the metrics for Microsoft.Compute/virtualMachineScaleSets are:

| Metric | Metric Display Name | Unit | Aggregation Type | Description |
|---|---|---|---|---|
| Percentage CPU | Percentage CPU | Percent | Average | The percentage of allocated compute units that are currently in use by the Virtual Machine(s) |
| Network In | Network In | Bytes | Total | The number of bytes received on all network interfaces by the Virtual Machine(s) (Incoming Traffic) |
| Network Out | Network Out | Bytes | Total | The number of bytes out on all network interfaces by the Virtual Machine(s) (Outgoing Traffic) |
| Disk Read Bytes | Disk Read Bytes | Bytes | Total | Total bytes read from disk during monitoring period |
| Disk Write Bytes | Disk Write Bytes | Bytes | Total | Total bytes written to disk during monitoring period |
| Disk Read Operations/Sec | Disk Read Operations/Sec | CountPerSecond | Average | Disk Read IOPS |
| Disk Write Operations/Sec | Disk Write Operations/Sec | CountPerSecond | Average | Disk Write IOPS |

If you need to autoscale on other metrics, such as memory, it is recommended to keep using the diagnostics extension.

Accessing host metrics directly

You can view the host metric data coming from scale sets directly using the Azure Monitor REST API (formerly known as the Insights API). Using a simple Python program like insights_metrics.py, I can view the same data that the graph in the Azure portal shows me:

image
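
The same data is available with a plain REST call. Below is a minimal sketch of querying host metrics for a scale set; the api-version and URL shape are assumptions based on the 2016-era Insights API, so check the Azure Monitor REST reference (or the insights_metrics.py example) for current values. The access token is an AAD bearer token obtained elsewhere, e.g. with azurerm.get_access_token():

```python
import json
import urllib.request

AZURE_RM = 'https://management.azure.com'

def build_metrics_url(subscription_id, resource_group, vmss_name,
                      api_version='2016-09-01'):
    # Build the Azure Monitor (Insights) metrics URL for a scale set.
    # The api-version here is an assumption for the 2016-era API.
    resource_id = ('/subscriptions/' + subscription_id +
                   '/resourceGroups/' + resource_group +
                   '/providers/Microsoft.Compute/virtualMachineScaleSets/' +
                   vmss_name)
    return (AZURE_RM + resource_id +
            '/providers/microsoft.insights/metrics?api-version=' + api_version)

def get_host_metrics(access_token, subscription_id, resource_group, vmss_name):
    # access_token is an AAD bearer token, e.g. from azurerm.get_access_token()
    request = urllib.request.Request(
        build_metrics_url(subscription_id, resource_group, vmss_name),
        headers={'Authorization': 'Bearer ' + access_token})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode('utf-8'))
```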

Or create my own graphs (in this case by just dumping the data into a spreadsheet):

image

Here’s another example, this time feeding the data into a matplotlib graph: vmsscpuplot.py. The graph below shows a case of maxing out the CPU of the 201-vmss-bottle-autoscale scale set, starting with a capacity of 1 and then autoscale kicking in to scale out to 2 VMs. It looks like it may have scaled out to 3 and then settled back to a steady state of 2.

image

Posted in Cloud, Computers and Internet, Python, VM Scale Sets

Azure scale set upgrade policy explained

image

Azure VM scale sets have an “upgradePolicy” setting which can be set to “Manual” or “Automatic”. What does this setting do? What should you set it to? And how do you set it?

What upgrade policy means

The upgrade policy of a scale set determines what happens after you change the scale set model. In other words, regardless of this setting, nothing “automatic” happens unless you make a change to the model. Changing the scale set model means changing a property of the scale set which affects its VMs, for example the VM size (sku->name), the OS version, or an extension property.

If you change a scale set property in the model (i.e. change a value and update the scale set) and the upgradePolicy is set to “Manual”, nothing happens. It is then up to you to apply the model to VMs in the scale set manually, e.g. by calling the Update-AzureRmVmssInstance PowerShell command or the Azure CLI 2.0 az vmss update-instances command. When you apply the model to a VM, this will typically result in a reboot, and if you’re changing the OS version, a reimage.

For more information about how to manually roll out an upgrade across VMs in a scale set, refer to: https://docs.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-upgrade-scale-set.

If the upgradePolicy is set to “Automatic”, when you deploy an updated scale set model, the update will be applied to all the VMs in the scale set at once. This is likely to result in an interruption to your application, which is why it is usually recommended to set this value to “Manual”.

What to set upgradePolicy to

If you don’t mind all your VMs being rebooted at the same time, you can set upgradePolicy to “Automatic”. Otherwise set it to “Manual” and take care of applying changes to the scale set model to individual VMs yourself. It is fairly easy to script rolling out the update to VMs while maintaining application uptime. See https://docs.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-upgrade-scale-set for more details.
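
One way to script that rollout is to apply the model a few instance IDs at a time with the az vmss update-instances command mentioned above. A minimal sketch, where the batch size and instance IDs are illustrative placeholders, and a real script would also check instance health between batches:

```python
import subprocess

def batches(instance_ids, batch_size):
    """Split instance IDs into batches so only part of the
    scale set is rebooted at any one time."""
    for i in range(0, len(instance_ids), batch_size):
        yield instance_ids[i:i + batch_size]

def rolling_update(resource_group, vmss_name, instance_ids, batch_size=2):
    for batch in batches(instance_ids, batch_size):
        # apply the current scale set model to this batch of VMs;
        # a real script would wait for these VMs to come back healthy
        # before moving on to the next batch
        subprocess.check_call(['az', 'vmss', 'update-instances',
                               '--resource-group', resource_group,
                               '--name', vmss_name,
                               '--instance-ids'] + batch)
```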

If your scale set is in a Service Fabric cluster, certain updates, like changing the OS version, are currently blocked (that will change in future), and it is recommended that upgradePolicy be set to “Automatic”, as Service Fabric takes care of safely applying model changes (like updated extension settings) while maintaining availability.

How do you set upgrade policy?

A simple way to change the upgradePolicy setting is to change it in the template you used to deploy the scale set and redeploy the template. If you didn’t use a template (for example, you created the scale set from scratch using imperative PowerShell or CLI commands, or deployed it from the portal), a simple place to change the property is the Azure Resource Explorer. Find your scale set under Microsoft.Compute in your resource group, select Edit, change the upgradePolicy setting, and click PUT.
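
For reference, the property sits under the scale set’s top-level properties, so the fragment you end up editing looks roughly like this (sketched here as a Python dict, with the rest of the model omitted):

```python
# The portion of the scale set model that holds the upgrade policy.
# Everything else in the model is omitted for brevity.
vmss_model_fragment = {
    "properties": {
        "upgradePolicy": {
            "mode": "Manual"   # or "Automatic"
        }
    }
}
```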

What about fully automated OS updates and patching?

In Azure PaaS v1 (Worker and Web roles), you could deploy cloud services and never have to worry about patching; OS updates were automatically taken care of behind the scenes. People looking at migrating cloud services to scale sets often ask when they can get equivalent functionality. Automated patching is a feature that is expected from PaaS, and it is reasonable to expect scale sets (an infrastructure layer designed to support PaaS solutions) to provide this ability.

Fully automated OS updates are on the scale set roadmap, but for now only manual patching is available. There will be some interim steps along the way; for example, expect to see built-in manually triggered rolling updates coming soon.

In theory the primitives available now can be used to create a DIY automated upgrade feature. For example, you could write an Azure Function which checks whether there is a new OS platform image version, updates the VMSS model with the new version, and then rolls out the update across the scale set. The VMSS Editor tool, for example, implements the manually triggered rolling-update part of this. It’s on my to-do list to write a simple auto-upgrade demo, but I won’t try to claim it would be equivalent to a fully automated Azure service.
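
The check-and-update loop described above could be sketched like this. The four callables are hypothetical stand-ins for real REST calls (they are not azurerm functions), and the version-comparison helper assumes dotted numeric version strings:

```python
def needs_upgrade(model_version, latest_version):
    """Decide whether the scale set model is behind the latest platform
    image version. Versions are dotted numeric strings like '16.04.201611150'."""
    def as_tuple(v):
        return tuple(int(part) for part in v.split('.'))
    return as_tuple(model_version) < as_tuple(latest_version)

def auto_upgrade(get_latest_version, get_model_version,
                 set_model_version, roll_out_update):
    # Hypothetical stand-ins for real REST calls: query the newest platform
    # image version, read/update the VMSS model, and roll the update across
    # the VMs (e.g. in batches while maintaining availability).
    latest = get_latest_version()
    if needs_upgrade(get_model_version(), latest):
        set_model_version(latest)
        roll_out_update()
```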

Posted in Cloud, Computers and Internet, VM Scale Sets

Deploying Azure Container Service using the azurerm Python library

image

Azure Container Service is an easy-to-deploy container framework for Azure. It’s an open framework that, among other things, lets you choose whether to deploy DCOS- or Swarm-based cluster orchestration. You can deploy ACS directly from the Azure portal or command line, and it has a convenient set of REST APIs to deploy and manage the service programmatically, which is supported by the standard Azure SDKs. The azurerm Python library of Azure REST wrappers also recently added support for ACS.

Here’s an example showing how you can deploy a new Container Service with azurerm. You can see a similar example in the examples section of the azurerm github repo: create_acs.py, and see all the ACS API calls exercised in the azurerm ACS unit tests.

import azurerm
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.hazmat.backends import default_backend
from haikunator import Haikunator  # used to generate random word strings
import json
import sys

tenant_id = "your tenant id"
app_id = "your application id"
app_secret = "your application secret"
subscription_id = "your Azure subscription id"

# authenticate
access_token = azurerm.get_access_token(tenant_id, app_id, app_secret)

# set Azure data center location
location = 'eastus'

# create resource group - use Haikunator to generate a random name
namer = Haikunator()
rgname = namer.haikunate()
print('Creating resource group: ' + rgname)
response = azurerm.create_resource_group(access_token, subscription_id, rgname, location)
if response.status_code != 201:
    print(json.dumps(response.json(), sort_keys=False, indent=2, separators=(',', ': ')))
    sys.exit('Expecting return code 201 from create_resource_group(): ' + str(response.status_code))

# create Container Service name and DNS values - random names again
service_name = namer.haikunate(delimiter='')
agent_dns = namer.haikunate(delimiter='')
master_dns = namer.haikunate(delimiter='')

# generate RSA Key for container service - put your own public key here instead
key = rsa.generate_private_key(backend=default_backend(), public_exponent=65537, \
    key_size=2048)
public_key = key.public_key().public_bytes(serialization.Encoding.OpenSSH, \
    serialization.PublicFormat.OpenSSH).decode('utf-8')

# create container service (orchestrator will default to DCOS)
agent_count = 3                # the container hosts which will do the work
agent_vm_size = 'Standard_A1'
master_count = 1               # use 3 for production deployments
admin_user = 'azure'
print('Creating container service: ' + service_name)
print('Agent DNS: ' + agent_dns)
print('Master DNS: ' + master_dns)
print('Agents: ' + str(agent_count) + ' * ' + agent_vm_size)
print('Master count: ' + str(master_count))

response = azurerm.create_container_service(access_token, subscription_id, \
    rgname, service_name, agent_count, agent_vm_size, agent_dns, \
    master_dns, admin_user, public_key, location, master_count=master_count)
if response.status_code != 201:
    sys.exit('Expecting return code 201 from create_container_service(): ' + str(response.status_code))

print(json.dumps(response.json(), sort_keys=False, indent=2, separators=(',', ': ')))
Posted in Cloud, Computers and Internet, Containers, Python

Generating RSA keys with Python 3

I was looking for a quick way to generate an RSA key in Python 3 for some unit tests which needed a public key as an OpenSSH string. It ended up taking longer than expected because I started by trying to use the pycrypto library, which is hard to install on Windows (weird dependencies on specific Visual Studio runtimes) and has unresolved bugs with Python 3.

If you’re using Python 3 it’s much easier to use the cryptography library.

Here’s an example which generates an RSA key pair, prints the private key as a string in PEM container format, and prints the public key as a string in OpenSSH format.

from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.hazmat.backends import default_backend

# generate private/public key pair
key = rsa.generate_private_key(backend=default_backend(), public_exponent=65537, \
    key_size=2048)

# get public key in OpenSSH format
public_key = key.public_key().public_bytes(serialization.Encoding.OpenSSH, \
    serialization.PublicFormat.OpenSSH)

# get private key in PEM container format
pem = key.private_bytes(encoding=serialization.Encoding.PEM,
    format=serialization.PrivateFormat.TraditionalOpenSSL,
    encryption_algorithm=serialization.NoEncryption())

# decode to printable strings
private_key_str = pem.decode('utf-8')
public_key_str = public_key.decode('utf-8')

print('Private key = ')
print(private_key_str)
print('Public key = ')
print(public_key_str)
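
To sanity-check the PEM output, the same library can load the private key string back. A quick round-trip sketch, which assumes the unencrypted PEM produced above:

```python
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa

# generate a key and serialize it, as in the example above
key = rsa.generate_private_key(backend=default_backend(),
                               public_exponent=65537, key_size=2048)
pem = key.private_bytes(encoding=serialization.Encoding.PEM,
                        format=serialization.PrivateFormat.TraditionalOpenSSL,
                        encryption_algorithm=serialization.NoEncryption())

# round-trip: load the PEM bytes back into a key object
reloaded = serialization.load_pem_private_key(pem, password=None,
                                              backend=default_backend())
print(reloaded.key_size)  # 2048
```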

Posted in Computers and Internet, Cryptography, Python

Install Azure CLI 2.0 on the Windows 10 bash on Ubuntu shell

Microsoft Azure CLI 2.0 is an excellent Azure CLI reboot based on Python which is currently in preview. If you want to install and run it on the bash on Ubuntu shell provided as a developer feature with the Windows 10 Anniversary Update, here’s what you need to do.

1. Install the bash on Ubuntu shell. Follow these instructions if you don’t have it. If you already have it but have installed random libraries and want to reset, you can do that by uninstalling (after saving any data) with the ‘lxrun /uninstall’ command from a command window, and then clicking on the bash on Ubuntu icon again to reinstall.

2. Get the Ubuntu subsystem up to date:

sudo apt-get update
sudo apt-get upgrade

3. Make sure you can ping your hostname. The first CLI install instruction in the next step will fail if you can’t. To fix it I edited my /etc/hosts file and added the hostname to the 127.0.0.1 localhost line:

127.0.0.1 localhost PONDLIFE

I might change this later depending on what I need to do with networking and name resolution, but this works for now.

4. Follow the Debian/Ubuntu package install instructions from the Azure CLI github readme:

$ echo "deb https://apt-mo.trafficmanager.net/repos/azure-cli/ wheezy main" | sudo tee /etc/apt/sources.list.d/azure-cli.list
$ sudo apt-key adv --keyserver apt-mo.trafficmanager.net --recv-keys 417A0893
$ sudo apt-get install apt-transport-https
$ sudo apt-get update && sudo apt-get install azure-cli

Once the azure-cli package is installed you can run the ‘az’ command and see the options:

image

Posted in Cloud, Computers and Internet, Linux, Python, Ubuntu

Creating an Azure VM Scale Set with the azurerm Python library

There are various ways to create an Azure VM Scale Set. The easiest methods are directly in the Azure portal, using the CLI quick-create command, and deploying an Azure template. If instead you want to create a VMSS programmatically and imperatively – that is, by creating each resource one call at a time – here’s how to do it using the azurerm library of REST wrapper functions.

Before using this library you need to create a service principal. These steps are covered here.

The azurerm library added a create_vmss() function in version 0.6.12. The initial implementation has some limitations, notably:

  • Doesn’t support creating VMs with certificates, only user/password.
  • Expects a load balancer with an inbound NAT pool to be created and its ID to be provided as a function argument.
  • Only supports VM platform images, not custom images.

These limitations are easy to fix. Let me know if you need one of these features.

Here’s an example program which creates a resource group, VNet, public IP address, load balancer, storage accounts, and NSG, and uses them to create a scale set. You can find a similar example program in the azurerm examples folder here: create_vmss.py. You can also see how the azurerm unit tests create a VM and a VMSS in the same VNet here: compute_test.py.

    # simple program to do an imperative VMSS quick create from a platform image
    # Arguments:
    #   --name     [resource names are derived from this]
    #   --capacity [number of VMs]
    #   --location [same location used for all resources]
    import argparse
    import azurerm
    import json
    import sys
    from random import choice
    from string import ascii_lowercase
    from haikunator import Haikunator
    
    # validate command line arguments
    argParser = argparse.ArgumentParser()
    
    argParser.add_argument('--name', '-n', required=True, action='store', help='Name of vmss')
    argParser.add_argument('--capacity', '-c', required=True, action='store',
                           help='Number of VMs')
    argParser.add_argument('--location', '-l', action='store', help='Location, e.g. eastus')
    argParser.add_argument('--verbose', '-v', action='store_true', default=False, help='Print operational details')
    
    args = argParser.parse_args()
    
    name = args.name
    location = args.location
    capacity = int(args.capacity)  # argparse returns a string; convert for the REST request body
       
    tenant_id = 'put your tenant id here'
    app_id = 'put your app id here'
    app_secret = 'put your app secret here'
    subscription_id = 'put your subscription id here'
    
    # authenticate
    access_token = azurerm.get_access_token(tenant_id, app_id, app_secret)
    
    # create resource group
    print('Creating resource group: ' + name)
    rmreturn = azurerm.create_resource_group(access_token, subscription_id, name, location)
    print(rmreturn)
    
    # create NSG - not strictly necessary
    nsg_name = name + 'nsg'
    print('Creating NSG: ' + nsg_name)
    rmreturn = azurerm.create_nsg(access_token, subscription_id, name, nsg_name, location)
    nsg_id = rmreturn.json()['id']
    print('nsg_id = ' + nsg_id)
    
    # create NSG rule
    nsg_rule = 'ssh'
    print('Creating NSG rule: ' + nsg_rule)
    rmreturn = azurerm.create_nsg_rule(access_token, subscription_id, name, nsg_name, nsg_rule, \
        description='ssh rule', destination_range='22')
    
    # create set of storage accounts, and construct container array
    print('Creating storage accounts')
    container_list = []
    for count in range(5):
        sa_name = ''.join(choice(ascii_lowercase) for i in range(10))
        print(sa_name)
        rmreturn = azurerm.create_storage_account(access_token, subscription_id, name, sa_name, \
            location, storage_type='Standard_LRS')
        if rmreturn.status_code == 202:
            container = 'https://' + sa_name + '.blob.core.windows.net/' + name + 'vhd'
            container_list.append(container)
        else:
            print('Error ' + str(rmreturn.status_code) + ' creating storage account ' + sa_name)
            sys.exit()
    
    # create VNET
    vnetname = name + 'vnet'
    print('Creating VNet: ' + vnetname)
    rmreturn = azurerm.create_vnet(access_token, subscription_id, name, vnetname, location, \
        nsg_id=nsg_id)
    print(rmreturn)
    
    subnet_id = rmreturn.json()['properties']['subnets'][0]['id']
    print('subnet_id = ' + subnet_id)
    
    # create public IP address
    public_ip_name = name + 'ip'
    dns_label = name + 'ip'
    print('Creating public IP address: ' + public_ip_name)
    rmreturn = azurerm.create_public_ip(access_token, subscription_id, name, public_ip_name, \
        dns_label, location)
    print(rmreturn)
    ip_id = rmreturn.json()['id']
    print('ip_id = ' + ip_id)
    
    # create load balancer with nat pool
    lb_name = vnetname + 'lb'
    print('Creating load balancer with nat pool: ' + lb_name)
    rmreturn = azurerm.create_lb_with_nat_pool(access_token, subscription_id, name, lb_name, ip_id, \
        '50000', '50100', '22', location)
    be_pool_id = rmreturn.json()['properties']['backendAddressPools'][0]['id']
    lb_pool_id = rmreturn.json()['properties']['inboundNatPools'][0]['id']
    
    # create VMSS
    vmss_name = name
    vm_size = 'Standard_A1'
    publisher = 'Canonical'
    offer = 'UbuntuServer'
    sku = '16.04.0-LTS'
    version = 'latest'
    username = 'azure'
    
    # this example creates a random password. You might want to change this, or at
    # least save the random password that gets created somewhere
    password = Haikunator().haikunate(delimiter=',')
    
    print('Creating VMSS: ' + vmss_name)
    rmreturn = azurerm.create_vmss(access_token, subscription_id, name, vmss_name, vm_size, capacity, \
        publisher, offer, sku, version, container_list, subnet_id, \
        be_pool_id, lb_pool_id, location, username=username, password=password)
    print(rmreturn)
    print(json.dumps(rmreturn.json(), sort_keys=False, indent=2, separators=(',', ': ')))
    

Next steps

Next on my to-do list:

– A VMSS create example using the official Azure Python SDK.
– Add Azure ACS wrappers to the azurerm library.
– Make the azurerm create_vmss() and create_vm() functions support certificates.

Posted in Cloud, Computers and Internet, Python, VM Scale Sets

How to create an Azure VM with the azurerm Python library

There are at least two ways to work with Azure infrastructure using Python. You can use the official Azure SDK for Python, which supports all Azure functionality, or the azurerm REST wrapper library, which is unofficial and supports a subset of the Azure REST API.

When to use which? You might use azurerm when you need something very lightweight that is easy to extend and contribute to. Use the official SDK if you’re creating a production app or service. Use azurerm if you’re writing a quick ops script, like figuring out which VMs are in which fault domains, etc.

Here’s a simple azurerm example which goes through the steps to create a virtual machine. Note: since creating a VM with the Azure Resource Manager deployment model imperatively requires several steps, in most cases it is easier to simply deploy an ARM template to create a set of resources declaratively. When you deploy a template, the Azure Resource Manager takes care of parallelizing resource creation, so your program doesn’t need multithreading and checks for resource completion (for creating a simple VM imperatively like this that’s not required, but to create a whole set of VMs or scale sets it would be). The azurerm library also includes functions to deploy templates.

This example first creates the VM resources, including the resource group, storage account, public IP address, VNet, and NIC. Then it creates the VM. The current azurerm.create_vm() function creates a pretty simple VM and lacks options for data disks, disk encryption, Key Vault integration, etc., but you’re welcome to extend it.

    import azurerm
    import json
    
    tenant_id = 'your-tenant-id'
    app_id = 'your-application-id'
    app_secret = 'your-application-secret'
    subscription_id = 'your-subscription-id'
    
    # base name from which the resource names below are derived, and the location to use
    name = 'myvmdemo'
    location = 'eastus'
    
    # authenticate
    access_token = azurerm.get_access_token(tenant_id, app_id, app_secret)
    
    # create resource group
    print('Creating resource group: ' + name)
    rmreturn = azurerm.create_resource_group(access_token, subscription_id, name, location)
    print(rmreturn)
    
    # create NSG
    nsg_name = name + 'nsg'
    print('Creating NSG: ' + nsg_name)
    rmreturn = azurerm.create_nsg(access_token, subscription_id, name, nsg_name, location)
    nsg_id = rmreturn.json()['id']
    print('nsg_id = ' + nsg_id)
    
    # create NSG rule
    nsg_rule = 'ssh'
    print('Creating NSG rule: ' + nsg_rule)
    rmreturn = azurerm.create_nsg_rule(access_token, subscription_id, name, nsg_name, nsg_rule, description='ssh rule',
                                      destination_range='22')
    print(rmreturn)
    
    # create storage account
    print('Creating storage account: ' + name)
    rmreturn = azurerm.create_storage_account(access_token, subscription_id, name, name, location, storage_type='Premium_LRS')
    print(rmreturn)
    
    # create VNET
    vnetname = name + 'vnet'
    print('Creating VNet: ' + vnetname)
    rmreturn = azurerm.create_vnet(access_token, subscription_id, name, vnetname, location, nsg_id=nsg_id)
    print(rmreturn)
    # print(json.dumps(rmreturn.json(), sort_keys=False, indent=2, separators=(',', ': ')))
    subnet_id = rmreturn.json()['properties']['subnets'][0]['id']
    print('subnet_id = ' + subnet_id)
    
    # create public IP address
    public_ip_name = name + 'ip'
    dns_label = name + 'ip'
    print('Creating public IP address: ' + public_ip_name)
    rmreturn = azurerm.create_public_ip(access_token, subscription_id, name, public_ip_name, dns_label, location)
    print(rmreturn)
    ip_id = rmreturn.json()['id']
    print('ip_id = ' + ip_id)
    
    # create NIC
    nic_name = name + 'nic'
    print('Creating NIC: ' + nic_name)
    rmreturn = azurerm.create_nic(access_token, subscription_id, name, nic_name, ip_id, subnet_id, location)
    print(rmreturn)
    nic_id = rmreturn.json()['id']
    
    # create VM
    vm_name = name
    vm_size = 'Standard_A1'
    publisher = 'Canonical'
    offer = 'UbuntuServer'
    sku = '16.04.0-LTS'
    version = 'latest'
    os_uri = 'http://' + name + '.blob.core.windows.net/vhds/osdisk.vhd'
    username = 'rootuser'
    password = 'myPassw0rd'
    
    print('Creating VM: ' + vm_name)
    rmreturn = azurerm.create_vm(access_token, subscription_id, name, vm_name, vm_size, publisher, offer, sku, \
        version, name, os_uri, nic_id, location, username=username, password=password)
    print(rmreturn)
    print(json.dumps(rmreturn.json(), sort_keys=False, indent=2, separators=(',', ': ')))
    

Compare this azurerm example with an Azure Python SDK example to create a VM.

Posted in Cloud, Computers and Internet, Python