Azure Batch Autoscale Formula Made Easy with Auto Scaling

The Azure Batch autoscale formula is a powerful tool that scales your Batch pools up or down based on demand. The formula is a mathematical expression that determines when to scale your resources.

By using the autoscale formula, you can save costs by not overprovisioning your resources and ensure that your batch jobs are completed efficiently.

The autoscale formula works by assigning values to "target" variables, such as the target number of dedicated compute nodes in a pool. You typically derive these targets from service-defined metrics, such as the average CPU utilization of your batch nodes, to decide when to scale your resources up or down.

With an Azure Batch autoscale formula, you can, for example, set a scale-out condition that increases the number of nodes when the average CPU utilization exceeds 50% over a 5-minute period.
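
A minimal sketch of that condition as an autoscale formula might look like this (the 10-node cap and the doubling step are illustrative assumptions, not values from any particular workload):

```
avgCPU = avg($CPUPercent.GetSample(TimeInterval_Minute * 5));
$TargetDedicatedNodes = (avgCPU > 50) ? min(10, $CurrentDedicatedNodes * 2) : $CurrentDedicatedNodes;
```

Here avgCPU is a user-defined variable, while $CPUPercent, $CurrentDedicatedNodes, and $TargetDedicatedNodes are system-defined.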

Understanding Azure Batch

Azure Batch is a powerful tool for running large-scale parallel workloads at a low cost. It has no dedicated graphical interface of its own; you drive it through the Azure portal, APIs, or command-line tools.

You can create compute pools, which are pools of one or more compute nodes to which you assign jobs. Each compute node within a pool is identical: an Azure VM configured to your specifications.

Azure Batch assumes there is data somewhere it needs to crunch, typically residing in Azure Blob Storage or Azure Data Lake Store. You'll need to upload your data to one of these storage services before running your jobs.

A compute pool is set up by defining the name of the pool and the type of nodes it should contain, such as the OS, software installed, and linked Azure Storage. All nodes within a compute pool are identical and can absorb one or multiple tasks, depending on the number of cores the VM has.

Jobs are collections of tasks that run in parallel. Tasks are individual runs within a job, such as 10 different simulations of a model or 1,000 iterations of a transformation script. Each task can include its own instructions and data.

Azure Batch will allocate tasks dynamically to the different nodes in a compute pool. Once all tasks are finished, the job will be marked as complete and the compute nodes will be ready to run another job.

Azure Batch has several distinctive features that make it stand out from alternative products. These include:

  • Running large-scale parallel workloads at very low cost through the use of low-priority VMs.
  • Allowing you to fully configure the nodes yourself, for example through Docker configuration.
  • Providing an easy-to-use code interface for R (through doAzureParallel) and Python to create pools and run jobs.
  • Auto-scaling to provide more nodes when needed, using a formula to increase the number of compute nodes if more than X tasks are queued.
  • Allowing you to interactively monitor your jobs with Application Insights or Batch Explorer.
  • Enabling you to run any type of node: GPU instances, Linux or Windows nodes, Docker containers.
  • Integrating easily with Blob Storage and Data Lake Storage to fetch data for each task.

Auto Scaling Basics

Auto scaling in Azure Batch is a powerful feature that allows you to scale compute resources automatically based on your job's needs.

You can define an automatic scaling formula as a string value assigned to a pool's autoScaleFormula element in a request body (REST API) or CloudPool.AutoScaleFormula property (.NET API). This formula string cannot exceed 8KB in size and can include up to 100 statements separated by semicolons.

To enable auto scaling on an existing pool, you can use the Batch .NET library's BatchClient.PoolOperations.EnableAutoScale method or the REST API's Enable automatic scaling on a pool request. Both methods require the ID of an existing pool and the automatic scaling formula to apply to the pool.

Here are some common variables and operations used in auto scaling formulas:

  • maxNumberOfVMs: a user-defined variable holding the maximum number of virtual machines you want to use.
  • samplePercentThreshold: a user-defined variable holding the CPU-usage threshold (as a percentage) that decides whether to increase or decrease the number of nodes.
  • sampleDuration: a user-defined variable holding the length of the sampling window for CPU usage.
  • $CPUPercent.GetSample(sampleDuration): retrieves the CPU usage samples collected during the last sampleDuration interval.
  • $sample >= samplePercentThreshold: checks whether the sampled average CPU usage meets or exceeds the threshold.
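
The Batch service evaluates the formula itself, but the decision logic it expresses can be mimicked locally. The following Python sketch shows how these variables interact; the function name, the doubling policy, and the scale-in step are all illustrative assumptions, not part of the Batch API:

```python
def next_target_nodes(cpu_samples, current_nodes,
                      max_number_of_vms=10, sample_percent_threshold=70):
    """Mimic a CPU-based autoscale decision: double the node count
    (capped at max_number_of_vms) when the average of the sampled CPU
    percentages meets the threshold, otherwise scale in by one node."""
    sample = sum(cpu_samples) / len(cpu_samples)  # avg($CPUPercent.GetSample(...))
    if sample >= sample_percent_threshold:        # $sample >= samplePercentThreshold
        return min(max_number_of_vms, current_nodes * 2)
    return max(1, current_nodes - 1)

# Busy pool: average CPU is 85%, so scale out from 3 to 6 nodes.
print(next_target_nodes([80, 85, 90], current_nodes=3))  # 6
# Quiet pool: average CPU is 20%, so scale in from 3 to 2 nodes.
print(next_target_nodes([10, 20, 30], current_nodes=3))  # 2
```

The cap applied by min mirrors how maxNumberOfVMs keeps a formula from growing a pool without bound.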

Design for Retries

Batch can automatically retry tasks that fail, which is a huge relief when working with unpredictable compute resources.

User-controlled retries are specified by the task's maxTaskRetryCount, which determines how many times a task is retried after exiting with a nonzero exit code.

A task will be retried up to the value of maxTaskRetryCount if it fails, allowing you to design your tasks to withstand failure.
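
In other words, a task that exits with a nonzero code gets up to maxTaskRetryCount additional attempts. A small Python sketch of that behavior (the task callable here is a stand-in for your task's command line, not a Batch API):

```python
def run_with_retries(task, max_task_retry_count):
    """Run a task, retrying after each nonzero exit code, up to
    max_task_retry_count retries (at most 1 + max_task_retry_count attempts)."""
    attempts = 0
    while True:
        attempts += 1
        exit_code = task()
        if exit_code == 0 or attempts > max_task_retry_count:
            return exit_code, attempts

# A flaky stand-in task that fails twice, then succeeds.
outcomes = iter([1, 1, 0])
exit_code, attempts = run_with_retries(lambda: next(outcomes), max_task_retry_count=3)
print(exit_code, attempts)  # 0 3
```

Note the off-by-one: maxTaskRetryCount counts retries, so a value of 3 allows up to four attempts in total.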

Internal retries can occur due to failures on the compute node, such as not being able to update internal state or a failure on the node while the task is running.

The task will be retried on the same compute node, if possible, up to an internal limit before giving up on the task and deferring it to be rescheduled by Batch.

Whether a task is preempted while running on a Spot node or interrupted by a failure on a dedicated node, the mitigation is the same: design the task to withstand failure.

There are no design differences when executing your tasks on dedicated or Spot nodes, making it easier to plan for retries in your auto scaling strategy.

Auto Scaling

Auto Scaling is a powerful feature that allows you to dynamically adjust the number of compute nodes in a pool based on your application's needs. You can define an automatic scaling formula, which is a string value that determines the number of available compute nodes in a pool for the next interval of processing.

An automatic scaling formula can include up to 100 statements separated by semicolons and can be up to 8KB in size. It can include system-defined variables, user-defined variables, constant values, and supported operations on these variables or constants.

To create a complex formula, you can use multiple statements and variables. For example, you can use the avg function to calculate the average CPU usage and then use that value to determine the number of nodes to add or remove.

Here are some examples of how you can use the avg function in an automatic scaling formula:

  • avg($CPUPercent.GetSample(sampleDuration))
  • avg($MemoryBytes.GetSample(sampleDuration))

You can also use other functions such as max, min, and sum to create more complex formulas.

Some important functions to note are:

  • avg: calculates the average value of a list
  • max: returns the maximum value in a list
  • min: returns the minimum value in a list
  • sum: returns the sum of all values in a list
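
For instance, a sketch that combines avg, min, and max in one formula (the 80% threshold, the 20-node cap, and the one-node scale-in step are illustrative assumptions):

```
tenMinAvgCPU = avg($CPUPercent.GetSample(TimeInterval_Minute * 10));
$TargetDedicatedNodes = (tenMinAvgCPU > 80) ?
    min($CurrentDedicatedNodes * 2, 20) :
    max($CurrentDedicatedNodes - 1, 1);
```

Here min caps growth during scale-out, while max keeps at least one node running during scale-in.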

Other functions available in autoscale formulas include len (the length of a list), percentile (a given percentile of a list), rand (a random value between 0 and 1), std (the sample standard deviation of a list), and time (a timestamp for the current time). You can combine these functions to create a formula that adjusts the number of nodes based on your application's needs.

Once you have created your automatic scaling formula, you can enable it on an existing pool using the Batch .NET library or the REST API. You can also evaluate the formula before applying it to the pool to ensure it is working as expected.

To enable auto scaling on an existing pool, you can use the EnableAutoScale method in the Batch .NET library or the Enable automatic scaling on a pool request in the REST API. You will need to specify the ID of the existing pool and the automatic scaling formula to apply to the pool.

To evaluate the formula, you can use the EvaluateAutoScale method in the Batch .NET library or the Evaluate an automatic scaling formula request in the REST API. You will need to specify the ID of the existing pool and the string that contains the automatic scaling formula.
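
As a rough sketch, the REST evaluation request looks something like the following (the account URL, pool ID, and api-version are placeholders; check the current REST reference for exact values):

```
POST https://{account}.{region}.batch.azure.com/pools/{pool-id}/evaluateautoscale?api-version=2023-05-01.17.0

{
  "autoScaleFormula": "$TargetDedicatedNodes = min(25, $CurrentDedicatedNodes + 1);"
}
```

The response reports the evaluated result or any errors in the formula, letting you validate it before enabling it on the pool.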

By using auto scaling, you can ensure that your application is running at the right scale to meet its needs, while also controlling costs and managing resources.

Autoscale Formula

An autoscale formula is a set of statements that define how to adjust the number of compute nodes in a pool based on certain conditions. These conditions can include metrics such as CPU usage, memory usage, and disk usage.

To construct an autoscaling formula, you need to define the requirements for the formula, such as increasing or decreasing the target number of compute nodes in a pool. For example, you can define a statement that increases the target number of nodes if the minimum average CPU usage during the last 10 minutes was above 70%.

You can also use system-defined variables in your formula, such as $TargetDedicated (named $TargetDedicatedNodes in newer API versions), which represents the target number of dedicated compute nodes for the pool. The service resizes the pool to match whatever value your formula assigns to this variable.

Some common system-defined variables used in autoscaling formulas include $CPUPercent, $WallClockSeconds, $MemoryBytes, and $DiskBytes, which represent the average percentage of CPU usage, number of seconds consumed, average number of megabytes used, and average number of gigabytes used on the local disks, respectively.

Here are some examples of system-defined variables and their descriptions:

  • $PendingTasks: the number of tasks that are queued to run or currently running.
  • $ActiveTasks: the number of tasks that are queued to run but not yet running.
  • $RunningTasks: the number of tasks that are currently running.
  • $SucceededTasks: the number of tasks that finished successfully.
  • $FailedTasks: the number of tasks that failed.
  • $CurrentDedicatedNodes: the current number of dedicated compute nodes.

You can also use user-defined variables in your formula, which can be used to store and retrieve values. For example, you can use a variable to store the current target number of nodes and then use it in subsequent statements.

To limit the target number of dedicated compute nodes, you can use a statement that restricts the maximum number of nodes to a specific value, such as 400.
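
Putting these pieces together, the 70% CPU rule and the 400-node ceiling described above might be sketched as follows (using the newer variable name $TargetDedicatedNodes; the 10% growth step is an illustrative assumption):

```
totalNodes = (min($CPUPercent.GetSample(TimeInterval_Minute * 10)) > 70) ?
    ($CurrentDedicatedNodes * 1.1) : $CurrentDedicatedNodes;
$TargetDedicatedNodes = min(400, totalNodes);
```

The user-defined variable totalNodes holds the intermediate result, and the final statement clamps it to the 400-node maximum.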

Creating and Managing Auto-Scale Pools

Creating and managing auto-scale pools is a crucial part of Azure Batch autoscaling. You can enable automatic scaling when creating a pool using the AutoScaleFormula parameter with the New-AzureBatchPool cmdlet or by setting the CloudPool.AutoScaleEnabled and CloudPool.AutoScaleFormula properties after creating the pool with BatchClient.PoolOperations.CreatePool.

To do this, you'll need to specify the automatic scaling formula, which is a string value that can include up to 100 statements separated by semicolons. This formula string cannot exceed 8KB in size and can include line breaks and comments.

You can also enable automatic scaling on an existing pool by using the BatchClient.PoolOperations.EnableAutoScale method or by sending a REST API request to the pool's ID with the automatic scaling formula in the request body.

If you've already set up a pool with a fixed number of compute nodes using the targetDedicated parameter, you can still switch the pool to automatic scaling; once enabled, the automatic scaling formula takes over and the targetDedicated value is ignored.

Here are the techniques to enable automatic scaling when creating or updating a pool:

  • New-AzureBatchPool with AutoScaleFormula parameter
  • BatchClient.PoolOperations.CreatePool with CloudPool.AutoScaleEnabled and CloudPool.AutoScaleFormula properties
  • Add a pool to an account with enableAutoScale and autoScaleFormula elements
  • BatchClient.PoolOperations.EnableAutoScale method
  • Enable automatic scaling on a pool REST API request

Remember, if you set up automatic scaling when the pool is created, you must not specify the targetDedicated parameter, and if you wish to manually resize an autoscale-enabled pool, you must first disable automatic scaling on the pool.

Francisco Parker

Assigning Editor

Francisco Parker is a seasoned Assigning Editor with a keen eye for compelling content. With a passion for storytelling, Francisco has spent years honing his skills in the journalism industry, where he has developed a keen sense of what readers want to know. Throughout his career, Francisco has assigned articles on a wide range of topics, including SEO Strategies, where he has helped readers navigate the ever-changing landscape of online search and optimization.
