
Load testing with Locust and Azure
Setting up distributed Locust on Azure

Saturday, August 22, 2020
Graph of load testing

This is a walkthrough of how I set up a distributed Locust load testing environment on Azure. I try to explain the different parts along the way, but since the setup turned out quite complex, it is aimed at the more experienced user.

Locust

Locust is a Python-based, open source load testing tool. It is configured by writing test cases in a file called a locustfile. Locust can then be started either as a single node or with multiple nodes. When multiple nodes are deployed, one of them must be the master; the others connect to it, receive instructions on when to start, and report their results back to it.
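
To give a feel for the two modes, here is a minimal sketch of how Locust can be started locally, assuming the old locustio 0.x CLI used throughout this post, a locustfile.py in the current directory and a placeholder target URL:

# Single node: one process does all the work and serves the web UI
locust -f locustfile.py --host=https://your-site.example

# Distributed: one master and two slaves on the same machine
locust -f locustfile.py --master --expect-slaves=2
locust -f locustfile.py --slave --master-host=127.0.0.1
locust -f locustfile.py --slave --master-host=127.0.0.1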

Azure

To get hold of multiple servers that can run Locust, I chose Azure. Azure has a great CLI, which we will use in the setup below. AWS offers the same functionality, but then this guide won't help you. Heroku cannot be used, since its worker dynos can't be connected to directly.

What to test

A website today contains HTML, scripts, images, videos, data and so on. Most likely, some of these resources are hosted elsewhere, and there is no need to load test them. Downloading, for example, images from a CDN does not test the performance of your backend servers. The HTML and the data that come from your own server are what should be tested.

It's easy to find out what comes from where. In the Network tab of the browser's developer tools, filter for "your-api-url" to see which resources your server provides.

Filtering requests from a specific URI
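
You can also check a single resource from the command line; the response headers usually reveal whether it is served by a CDN. A hypothetical example:

# Headers such as "x-cache", "cf-cache-status" or "server: cloudflare" suggest a CDN
curl -sI https://your-site.example/images/hero.jpg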

Locustfile

Locust has documentation on how to write test cases here: https://docs.locust.io/en/stable/writing-a-locustfile.html

I tested a site that has a couple of special pages (start page, toplist page, index page for posts) and hundreds of user-generated posts. I looked into Google Analytics to work out what the typical visitor did, and found that a normal user enters on the start page, visits on average three different post pages, and that a fifth of the users use the search. I defined a Locust task for every page type, looking at what data the browser fetches and fetching the same data in Locust. I had a list of URIs for posts and let Python randomly choose some of them to visit, and used the same approach for the search: a list of predefined search terms to randomly select from. To mirror the behaviour of real users, I used Locust's support for giving every task a weight that specifies how common it is; the weights 5, 1 and 15 below correspond to that ratio of start page visits, searches and post views. Here is a skeleton locustfile:

import random

from locust import HttpLocust, TaskSet, task, between

# Placeholder data; replace with URIs and search terms from your own site
POST_URIS = ["/posts/first-post", "/posts/second-post", "/posts/third-post"]
SEARCH_TERMS = ["foo", "bar", "baz"]

class NormalUser(TaskSet):
  @task(5)
  def start_page(self):
    # Fetch the js bundles, the styles and the startpage data
    self.client.get("/bundle.js")
    self.client.get("/styles.css")
    self.client.get("/api/startpage")

  @task(1)
  def search(self):
    # Search for a randomly selected predefined term
    self.client.get("/api/search", params={"q": random.choice(SEARCH_TERMS)})

  @task(15)
  def single(self):
    # Randomly select a post and fetch its data/page
    self.client.get(random.choice(POST_URIS))

class NormalUserLocust(HttpLocust):
  task_set = NormalUser
  wait_time = between(2, 5)
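
Before involving Azure, the locustfile can be smoke-tested locally in headless mode. A sketch, again assuming the locustio 0.x CLI and a placeholder host:

locust -f locustfile.py NormalUserLocust --no-web -c 5 -r 1 --run-time 30s --host=https://your-site.example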

Docker

To be able to start multiple nodes running Locust, we use Docker containers. A Dockerfile defines what our container should run; in this case it should install some Python dependencies and, of course, Locust itself.

FROM python:3.8.2-alpine3.11

# Build dependencies needed to compile pyzmq, removed again after pip install
RUN apk --no-cache add --virtual=.build-dep build-base
RUN apk --no-cache add g++ zeromq-dev libffi-dev
RUN pip install --no-cache-dir locustio pyzmq
RUN apk del .build-dep

# The locustfile will be mounted into /locust
RUN mkdir /locust
WORKDIR /locust

# 8089 is the web UI, 5557 and 5558 are used for master/slave communication
EXPOSE 8089 5557 5558
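
To verify the image before pushing it anywhere, you can build and run it locally. This is only a sketch; it assumes Docker is installed and that locustfile.py sits in the current directory, which is mounted into /locust just like the Azure file share will be later:

docker build -t locust-image .
docker run --rm -p 8089:8089 -v "$(pwd)":/locust locust-image \
  locust NormalUserLocust --host=https://your-site.example

The web UI should then be reachable at http://localhost:8089.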

Azure setup

Install the Azure CLI and log in. Start by defining names for the resources we are about to create.

export AZURE_RESOURCE_GROUP=...
export AZURE_STORAGE_ACCOUNT=...
export AZURE_STORAGE_SHARE=...
export AZURE_MASTER_CONTAINER_GROUP=...-master
export AZURE_SLAVE_CONTAINER_GROUP=...-slave
export AZURE_ACR_NAME=...
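
The names are up to you, but note that some of them are restricted: the storage account and registry names may only contain lowercase letters and digits. Purely illustrative example values:

export AZURE_RESOURCE_GROUP=locust-load-test
export AZURE_STORAGE_ACCOUNT=locustloadstorage
export AZURE_STORAGE_SHARE=locustfiles
export AZURE_MASTER_CONTAINER_GROUP=locust-load-test-master
export AZURE_SLAVE_CONTAINER_GROUP=locust-load-test-slave
export AZURE_ACR_NAME=locustloadacr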

Then we need an Azure resource group. A location is required, and since the container DNS names later in this guide are in North Europe, we create the group there:

az group create -n $AZURE_RESOURCE_GROUP -l northeurope

The locustfile needs to be hosted somewhere the nodes can access it. The following commands create a storage account with a file share and export the storage key, which is needed later when mounting the share:

az storage account create -n $AZURE_STORAGE_ACCOUNT -g $AZURE_RESOURCE_GROUP --sku Standard_LRS
export AZURE_STORAGE_KEY=$(az storage account keys list -n $AZURE_STORAGE_ACCOUNT -g $AZURE_RESOURCE_GROUP --query '[0].value' -o tsv)
az storage share create -n $AZURE_STORAGE_SHARE

...and this command uploads the locustfile. Run it again every time you change the locustfile:

az storage file upload -s $AZURE_STORAGE_SHARE --source ./locustfile.py
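
To double-check that the upload worked, you can list the contents of the share:

az storage file list -s $AZURE_STORAGE_SHARE -o table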

To host the Docker image, create a container registry, build the image in it (tagged locust:latest, which is what the container commands below expect) and save the registry credentials:

az acr create -g $AZURE_RESOURCE_GROUP -n $AZURE_ACR_NAME --sku Basic --admin-enabled true
az acr build --image locust:latest --registry $AZURE_ACR_NAME --file Dockerfile .
# Export the credentials (requires the admin user enabled above)
export AZURE_REGISTRY_USERNAME=$(az acr credential show -n $AZURE_ACR_NAME --query 'username' -o tsv)
export AZURE_REGISTRY_PASSWORD=$(az acr credential show -n $AZURE_ACR_NAME --query 'passwords[0].value' -o tsv)
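
To confirm that the image ended up in the registry, list its repositories and tags:

az acr repository list -n $AZURE_ACR_NAME -o table
az acr repository show-tags -n $AZURE_ACR_NAME --repository locust -o table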

Starting the nodes

The commands to start the containers might seem very hairy, but they are actually not that difficult to understand. Most of the flags are human readable, and most of the values are already exported into variables.

The master

Start by choosing how many slaves you want; the master will wait for all the slaves to connect and must therefore know this number in advance. Here, m is a placeholder for the number of slaves.

az container create -g $AZURE_RESOURCE_GROUP -n $AZURE_MASTER_CONTAINER_GROUP \
  --image ${AZURE_ACR_NAME}.azurecr.io/locust:latest \
  --registry-username $AZURE_REGISTRY_USERNAME --registry-password $AZURE_REGISTRY_PASSWORD \
  --ports 8089 5557 5558 --ip-address public --dns-name-label $AZURE_MASTER_CONTAINER_GROUP \
  --azure-file-volume-account-name $AZURE_STORAGE_ACCOUNT \
  --azure-file-volume-account-key $AZURE_STORAGE_KEY \
  --azure-file-volume-share-name $AZURE_STORAGE_SHARE \
  --azure-file-volume-mount-path /locust \
  --command-line 'locust NormalUserLocust --master --expect-slaves=m'

The --command-line flag is the command that is run on the node, and as you can see, we tell it to run Locust with the NormalUserLocust class that we defined in the locustfile. It also specifies that this node is the master and that it expects m slaves.

The slaves

az container create -g $AZURE_RESOURCE_GROUP -n ${AZURE_SLAVE_CONTAINER_GROUP}n \
  --image ${AZURE_ACR_NAME}.azurecr.io/locust:latest \
  --registry-username $AZURE_REGISTRY_USERNAME --registry-password $AZURE_REGISTRY_PASSWORD \
  --azure-file-volume-account-name $AZURE_STORAGE_ACCOUNT \
  --azure-file-volume-account-key $AZURE_STORAGE_KEY \
  --azure-file-volume-share-name $AZURE_STORAGE_SHARE \
  --azure-file-volume-mount-path /locust \
  --command-line "locust NormalUserLocust --slave --master-host ${AZURE_MASTER_CONTAINER_GROUP}.northeurope.azurecontainer.io" &

You will want to run this command m times, changing the trailing n in the container group name from 1 to m. Each run starts a Locust slave that connects to the master, and the trailing & puts the az command in the background so that several slaves can be created in parallel; see the loop sketch below.
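
If you don't feel like running the command by hand m times, a small shell loop can start all the slaves in one go. A sketch with m set to 4, using the same flags as above:

for n in $(seq 1 4); do
  az container create -g $AZURE_RESOURCE_GROUP -n ${AZURE_SLAVE_CONTAINER_GROUP}${n} \
    --image ${AZURE_ACR_NAME}.azurecr.io/locust:latest \
    --registry-username $AZURE_REGISTRY_USERNAME --registry-password $AZURE_REGISTRY_PASSWORD \
    --azure-file-volume-account-name $AZURE_STORAGE_ACCOUNT \
    --azure-file-volume-account-key $AZURE_STORAGE_KEY \
    --azure-file-volume-share-name $AZURE_STORAGE_SHARE \
    --azure-file-volume-mount-path /locust \
    --command-line "locust NormalUserLocust --slave --master-host ${AZURE_MASTER_CONTAINER_GROUP}.northeurope.azurecontainer.io" &
done
wait

Remember that the master's --expect-slaves value has to match the number of slaves you start.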

For debugging purposes, here are some other useful commands to know about:

# Tail the logs
az container logs --name <container name> --resource-group $AZURE_RESOURCE_GROUP --follow
# Show container info
az container show -g $AZURE_RESOURCE_GROUP -n <container name>
# Restart container
az container restart -n <container name> -g $AZURE_RESOURCE_GROUP

Running the tests

Phew... lots of commands, lots of setup, but we are now ready to start the tests. Go to `${AZURE_MASTER_CONTAINER_GROUP}.northeurope.azurecontainer.io:8089` to open the Locust web UI. There you can see whether the slaves managed to connect to the master, and then start the testing. Locust will not spawn all the users at once, but ramp them up gradually. You can follow along both in the statistics and the chart views. It is also useful to monitor the servers you are testing, for example with New Relic, while the load test is running, but finding bottlenecks is another story.