Scaling and Availability: Preparing Your Workloads for Peak Demand

By Chris Henry, BCN Head of Azure CoE. Posted 19th November 2024

Chris Henry, our Head of Azure CoE, delves into the key considerations for preparing for traffic increases. He highlights the crucial aspects to consider for customer-facing workloads, particularly around scaling, elasticity, resiliency, and availability.

To maximise your revenue you must prepare to scale

Everyone probably remembers the woes and frustrations of trying to obtain tickets to their favourite band’s reunion tour (even if this can sometimes be a tactic for driving up demand). But, to maximise your revenue, you’ll want to ensure your workloads can meet the demand of your customers, especially for peak shopping periods like Black Friday, festive or January sales or other peak demand times specific to your business and sales cycle.

A benefit of the cloud over a traditional on-premises datacentre is that scaling and elasticity are much easier to achieve. You pay for the additional capacity only while it is scaled up, rather than having to provision resources permanently for peak demand.

Let’s look at what each of these terms means, how they differ and how Azure can help.

Scaling & Elasticity

Scaling and Elasticity are often used interchangeably, but Scaling generally means increasing the capacity of a workload to meet additional demand. For example, your infrastructure might usually support 1,000 users on your website, but on a busy day, such as when you launch a new offer to your customers, it may need to scale automatically to serve 10,000 users so that your website remains available.

Elasticity is very similar to Scaling, but it is generally taken to mean scaling both up and down so that capacity matches what is needed at any given time. For example, your website might need to support 10,000 users at weekends but only 1,000 users on weekdays. An elastic workload scales to follow these peaks and troughs in demand.
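To make the distinction concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that one instance serves 1,000 users; that figure and all the names below are hypothetical and are not taken from any Azure service.

```python
import math

USERS_PER_INSTANCE = 1_000  # hypothetical capacity of a single instance
MIN_INSTANCES = 1           # never run with fewer than this


def instances_needed(users: int) -> int:
    """Return how many instances are required to serve `users`."""
    return max(MIN_INSTANCES, math.ceil(users / USERS_PER_INSTANCE))


def scale_only(current: int, users: int) -> int:
    """Scaling in the narrow sense: grow to meet demand, never shrink."""
    return max(current, instances_needed(users))


def elastic(current: int, users: int) -> int:
    """Elasticity: follow demand both up and down."""
    return instances_needed(users)


def simulate(policy, demand, start=MIN_INSTANCES):
    """Apply a policy to each day's demand and record the instance count."""
    counts = []
    current = start
    for users in demand:
        current = policy(current, users)
        counts.append(current)
    return counts


# Five weekdays, a busy weekend, then the following Monday.
weekly_users = [1_000, 1_000, 1_000, 1_000, 1_000, 10_000, 10_000, 1_000]

print("scale only:", simulate(scale_only, weekly_users))
print("elastic:   ", simulate(elastic, weekly_users))
```

Under these assumptions the scale-only policy is still running weekend capacity on the following Monday, whereas the elastic policy drops back to a single instance, which is where the cost saving comes from.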

How does Azure help with Scaling and Elasticity?

Many Azure PaaS services, such as App Service, Application Gateway and Container Instances, include scaling and elasticity features. Additionally, Autoscale can be configured to add resources to your IaaS workload automatically, either based on specific metrics such as percentage of CPU used or on a schedule such as specific days of the week or month. Autoscale also includes ‘scale-in’ options to reduce capacity once demand falls again.
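The logic behind a metric-based rule with a scheduled minimum can be sketched as follows. This is a simplified illustration in Python, not the Azure Autoscale API: the class, the thresholds (70% and 30% CPU), the instance limits and the weekday names are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AutoscaleRule:
    """Hypothetical rule: all values here are illustrative assumptions."""
    min_instances: int = 2
    max_instances: int = 10
    scale_out_cpu: float = 70.0   # scale out above this average CPU %
    scale_in_cpu: float = 30.0    # scale in below this average CPU %
    step: int = 1                 # instances added or removed per decision
    # Optional schedule: a higher minimum on specific days, e.g. {"Sat": 6}
    scheduled_min: dict = field(default_factory=dict)


def next_instance_count(rule, current, avg_cpu, weekday=None):
    """Decide the instance count for the next evaluation period."""
    # A scheduled minimum, if one exists for this day, replaces the default.
    floor = rule.scheduled_min.get(weekday, rule.min_instances)

    if avg_cpu > rule.scale_out_cpu:
        target = current + rule.step      # scale out
    elif avg_cpu < rule.scale_in_cpu:
        target = current - rule.step      # scale in
    else:
        target = current                  # within the comfortable band

    # Keep the result between the floor and the configured maximum.
    return max(floor, min(rule.max_instances, target))


rule = AutoscaleRule(scheduled_min={"Sat": 6, "Sun": 6})
print(next_instance_count(rule, current=2, avg_cpu=85.0))                 # busy
print(next_instance_count(rule, current=5, avg_cpu=20.0))                 # quiet
print(next_instance_count(rule, current=2, avg_cpu=50.0, weekday="Sat"))  # scheduled
```

Keeping a gap between the scale-out and scale-in thresholds is a deliberate choice in this sketch: it stops the instance count from flapping up and down when CPU hovers around a single threshold.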

Resiliency

Resiliency refers to your workload’s ability to resist disruptions or recover from them. It often goes hand-in-hand with Availability: making your workload highly available, i.e. running it in more than one place at the same time, makes it more resilient, because both locations would have to be disrupted at once for the workload to fail, which is less likely.

Resiliency covers both resisting the initial disruption and recovering from it. Running your workload on two servers is more resilient than running it on one: if one server fails, the other keeps the workload running. Recovery might mean spinning up a secondary server should the first one fail, or restoring a recent backup should your data become corrupted.

This generally aligns with your overall Disaster Recovery strategy around your Recovery Time Objectives (RTOs) and Recovery Point Objectives (RPOs).

An RTO is the maximum amount of time your business has decided a workload can be unavailable following a disruption before it must be available again.
An RPO is the maximum amount of data you can afford to lose, measured as the time between the last backup and the moment the workload became unavailable. Generally, the lower the RPO, the closer to real time your recovery point is. For example, if a backup is taken every 5 minutes, the most data you can lose is whatever was written during that 5-minute period, so a recovered workload would be at most 5 minutes behind the live one.
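The arithmetic behind these two objectives can be sketched in a few lines of Python. The function names, the write rate and the detection and restore times are hypothetical values for illustration only; your own figures will come from your Disaster Recovery planning.

```python
from datetime import timedelta


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """Worst-case data loss equals the backup interval, so compare the two."""
    return backup_interval <= rpo


def max_data_at_risk_mb(backup_interval: timedelta, write_rate_mb_per_min: float) -> float:
    """Most data that could be written (and lost) between two backups."""
    minutes = backup_interval.total_seconds() / 60
    return minutes * write_rate_mb_per_min


def meets_rto(detection: timedelta, recovery: timedelta, rto: timedelta) -> bool:
    """Total outage is the time to notice the problem plus the time to recover."""
    return detection + recovery <= rto


backup_every = timedelta(minutes=5)

# Assumed targets and rates, for illustration only.
print(meets_rpo(backup_every, rpo=timedelta(minutes=15)))
print(max_data_at_risk_mb(backup_every, write_rate_mb_per_min=20))
print(meets_rto(timedelta(minutes=10), timedelta(minutes=45), rto=timedelta(hours=1)))
```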

How does Azure help with Resiliency?

Resiliency in Azure could mean using a Load Balancer to balance the traffic between two or more backend services in a single region or using global load balancing via Azure Front Door or Traffic Manager across regions.
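The general idea of balancing traffic across backends while skipping unhealthy ones can be illustrated with a small Python sketch. This is a deliberately simplified round-robin model with hypothetical names; it does not represent how Azure Load Balancer, Front Door or Traffic Manager distribute traffic internally.

```python
class RoundRobinBalancer:
    """Toy health-aware round-robin balancer (illustrative only)."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)  # all backends start healthy
        self._index = 0                    # position of the next candidate

    def mark_down(self, backend):
        """Record a failed health check for a backend."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Record a recovered backend."""
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        """Return the next healthy backend, skipping any that are down."""
        for _ in range(len(self.backends)):
            candidate = self.backends[self._index]
            self._index = (self._index + 1) % len(self.backends)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends available")


lb = RoundRobinBalancer(["vm-1", "vm-2"])
print([lb.next_backend() for _ in range(4)])  # traffic alternates

lb.mark_down("vm-1")                          # simulate a failure
print([lb.next_backend() for _ in range(3)])  # all traffic goes to vm-2
```

The point of the sketch is that users keep being served when one backend fails, as long as at least one healthy backend remains.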

Additionally, you can improve your RTOs with services such as Azure Site Recovery, which can fail workloads over between regions in the event of a disruption, or improve your RPOs with Azure Backup, geo-restore or point-in-time restore (PITR) options.

Availability

Availability refers to how available your workload is and how exposed it is to different disruptions. For example, a workload on a single server is much more exposed to a hardware failure than one spread across two separate servers, provided they do not share hardware.

If those two servers are also in different datacentre locations, the exposure is reduced even further: a power outage or other disruption at one datacentre would not affect your server in the second, provided the two locations are geographically separate.

How does Azure help with Availability?

Within Azure, most geographic regions operate at least three Availability Zones (AZs). These are separate groups of datacentres with independent power, cooling and networking infrastructure, deliberately designed so that if one zone experiences disruption the other zones are very unlikely to be disrupted too. Microsoft rigorously assesses the risks around these zones to minimise any risks they share.

Splitting your workload between two AZs in a single region means that even if there is disruption to Zone 1, your workload would continue to operate from the second server in Zone 2.
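A rough way to see why a second zone helps is to work through the probabilities. The Python sketch below assumes that failures in each zone are fully independent and uses a 99% availability figure for a single server; both are simplifying assumptions for illustration, not Azure SLA values.

```python
HOURS_PER_YEAR = 365 * 24


def combined_availability(single: float, copies: int) -> float:
    """Availability when the workload survives as long as any one copy is up.

    Assumes copies fail independently, which real systems only approximate.
    """
    return 1 - (1 - single) ** copies


def downtime_hours_per_year(availability: float) -> float:
    """Expected unavailable hours in a year for a given availability."""
    return (1 - availability) * HOURS_PER_YEAR


single_server = 0.99  # hypothetical availability of one server in one zone

for copies in (1, 2, 3):
    a = combined_availability(single_server, copies)
    print(f"{copies} zone(s): {a:.6f} available, "
          f"about {downtime_hours_per_year(a):.2f} hours down per year")
```

Under these assumptions each additional zone multiplies the chance of a total outage by the failure probability of a single copy, which is why splitting across zones has such a large effect.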

With planning and preparation, you can achieve Seamless Scaling and Elasticity in the Cloud

We know there’s a lot to consider, but with the right preparation Azure can scale seamlessly and cost-effectively, so that your business maximises its return during peak times and avoids giving customers a poor experience.

To learn more about how BCN can assist you with architectural design, validation and recommendations around resiliency, please contact us.