Autoscaling
- Horizontal Pod Autoscaler
- Vertical Pod Autoscaler
- Cluster Autoscaler
- Set Resource Requests and Limits
The Azure Cloud Kit has three different types of autoscalers. They are:
- The Horizontal Pod Autoscaler;
- The Vertical Pod Autoscaler; and
- The Cluster Autoscaler.
The following are descriptions of each.
Horizontal Pod Autoscaler
The Horizontal Pod Autoscaler controls the scale of a Deployment and its ReplicaSet. It’s implemented as a Kubernetes API resource and a controller. It cannot be deployed with Terraform; it’s deployed together with the application, because scaling depends on the application’s load requirements.
The walkthrough example on Kubernetes’s site shows how to use horizontal pod autoscaling.
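For instance, a minimal HorizontalPodAutoscaler manifest, assuming a hypothetical Deployment named prodapp1, might look like this:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: prodapp1-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: prodapp1              # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50  # add replicas when average CPU rises above 50%

The controller then adjusts the ReplicaSet’s replica count between minReplicas and maxReplicas to keep observed CPU utilization near the target.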
Vertical Pod Autoscaler
The Vertical Pod Autoscaler tries to automatically set resource requests and limits on running containers based on their past usage.
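As a sketch, a VerticalPodAutoscaler object for the same hypothetical prodapp1 Deployment could look like the following; note that the VPA controller is a separate component that must be installed in the cluster (on AKS it is available as an add-on):

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: prodapp1-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: prodapp1        # hypothetical Deployment to tune
  updatePolicy:
    updateMode: "Auto"    # let the VPA apply new requests by recreating pods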
Cluster Autoscaler
The Azure Cluster Autoscaler component watches for pods in your cluster that can’t be scheduled because of resource constraints. When such pods are detected, the number of nodes in a node pool is increased to meet the application demand.
To enable autoscaling with Terraform, you need to set these variables in the variables.tf file (a sketch follows the list):
- enable_auto_scaling: Set to true.
- min_count: Set to the minimum number of nodes in the cluster.
- max_count: Set to the maximum number of nodes in the cluster.
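As an illustrative sketch (the variable defaults, resource names, and values here are assumptions, and the argument names follow the azurerm 3.x provider; newer major versions may rename them), these variables could be declared in variables.tf and wired into the cluster’s default node pool:

variable "enable_auto_scaling" {
  type    = bool
  default = true
}

variable "min_count" {
  type    = number
  default = 1
}

variable "max_count" {
  type    = number
  default = 5
}

resource "azurerm_kubernetes_cluster" "aks" {
  name                = "example-aks" # illustrative values
  location            = "westeurope"
  resource_group_name = "example-rg"
  dns_prefix          = "exampleaks"

  default_node_pool {
    name                = "default"
    vm_size             = "Standard_DS2_v2"
    enable_auto_scaling = var.enable_auto_scaling
    min_count           = var.min_count
    max_count           = var.max_count
  }

  identity {
    type = "SystemAssigned"
  }
}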
Note: By default, Azure has quotas on how many resources of a specific type are permitted. For example, vCPU amounts can be limited to as low as 10 per subscription. This limits the cluster size unless you increase the quota. These limits can be increased in the portal’s Quotas section.
You can find more information about Microsoft Azure subscription limits on the page entitled Azure Subscription and Service Limits, Quotas, and Constraints.
Set Resource Requests and Limits
Note that setting resource requests and limits works differently with Vertical Pod Autoscaling, which adjusts these values automatically.
For improved scaling, you should set resource requests and limits in your application deployments. There are plenty of best-practice guides; the right settings depend on your application. You can read more on Kubernetes’ Resource Management for Pods and Containers page.
For example:
containers:
- name: prodcontainer1
  image: ubuntu
  resources:
    requests:
      memory: "64Mi"
      cpu: "300m"
    limits:
      memory: "128Mi"