Azure Cluster Autoscaler

Metrics, Dashboards, Alerts and more for Azure Cluster Autoscaler Integration in Sysdig Monitor.

This integration can be enabled through Azure Cloud Metrics.

This integration has 4 metrics.

List of Alerts

| Alert | Description | Format |
| --- | --- | --- |
| [Azure Cluster AutoScaler] Safe Autoscale not working | Cluster Safe Autoscale not working. | Prometheus |
| [Azure Cluster AutoScaler] Unneeded nodes | Unneeded nodes. | Prometheus |
| [Azure Cluster AutoScaler] Scale down is in cooldown | Scale down is in cooldown. | Prometheus |

List of Dashboards

Azure Cluster Autoscaler

The dashboard provides information on the Azure Cluster Autoscaler for AKS.

List of Metrics

Metric name
azure_containerservice_managedclusters_cluster_autoscaler_cluster_safe_to_autoscale_avg
azure_containerservice_managedclusters_cluster_autoscaler_scale_down_in_cooldown_avg
azure_containerservice_managedclusters_cluster_autoscaler_unneeded_nodes_count_avg
azure_containerservice_managedclusters_cluster_autoscaler_unschedulable_pods_count_avg

Monitoring and Troubleshooting Azure Cluster Autoscaler

This document describes important metrics and queries that you can use to monitor and troubleshoot Azure Cluster Autoscaler.

Cluster Autoscaler

Safe Autoscale

Use the following query to detect clusters that are not safe to autoscale. While a cluster is in this state, it won’t be able to autoscale:

azure_containerservice_managedclusters_cluster_autoscaler_cluster_safe_to_autoscale_avg != 1

The query returns a result only when the metric value is not 1, so any returned series indicates a problem.
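The filtering behavior of a comparison such as != 1 can be illustrated with a minimal Python sketch. It is not how Prometheus evaluates queries internally; the `cluster` label, the cluster names, and the sample values are hypothetical and stand in for a real query result.

```python
from typing import Dict, List, Tuple

# A series is modeled as (labels, latest sample value).
# The "cluster" label and the values below are hypothetical examples.
Series = Tuple[Dict[str, str], float]


def filter_not_equal(series: List[Series], expected: float) -> List[Series]:
    """Mimic a PromQL comparison filter such as `metric != 1`:
    keep only the series whose sample value differs from `expected`."""
    return [(labels, value) for labels, value in series if value != expected]


samples: List[Series] = [
    ({"cluster": "aks-prod"}, 1.0),     # safe to autoscale, filtered out
    ({"cluster": "aks-staging"}, 0.0),  # not safe, kept by the filter
]

unsafe = filter_not_equal(samples, 1.0)
for labels, value in unsafe:
    print(f"{labels['cluster']}: safe_to_autoscale={value}")
# prints "aks-staging: safe_to_autoscale=0.0"
```

An empty result therefore means every cluster reports a value of 1, which is the healthy case.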

Unneeded Nodes

Use the following query to get the number of unneeded nodes in the AKS cluster. These nodes will be deleted by the Cluster Autoscaler as soon as all their pods are evicted:

azure_containerservice_managedclusters_cluster_autoscaler_unneeded_nodes_count_avg

Scale Down Cooldown

Use the following query to check whether scale down is currently in cooldown:

azure_containerservice_managedclusters_cluster_autoscaler_scale_down_in_cooldown_avg

A return value other than 0 indicates that scale down is in cooldown, so the AKS cluster won’t remove nodes until the cooldown ends.

Unschedulable Pods

The Cluster Autoscaler component can watch for pods in your cluster that can’t be scheduled because of resource constraints. The following query returns the number of unschedulable pods:

azure_containerservice_managedclusters_cluster_autoscaler_unschedulable_pods_count_avg

A return value other than 0 indicates that the AKS cluster has pods it can’t schedule, and the autoscaler will add nodes to accommodate them.
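The interpretations of the four metrics can be combined into a single check. The following Python sketch assumes the latest value of each metric has already been retrieved; the `AutoscalerSnapshot` class and the `interpret` function are hypothetical helpers written for illustration and are not part of Sysdig Monitor or Azure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AutoscalerSnapshot:
    """Latest values of the four metrics (hypothetical container type)."""
    safe_to_autoscale: float
    scale_down_in_cooldown: float
    unneeded_nodes_count: float
    unschedulable_pods_count: float


def interpret(s: AutoscalerSnapshot) -> List[str]:
    """Return human-readable findings based on the rules described above."""
    findings: List[str] = []
    if s.safe_to_autoscale != 1:
        findings.append("cluster is not safe to autoscale")
    if s.scale_down_in_cooldown != 0:
        findings.append("scale down is in cooldown")
    if s.unneeded_nodes_count > 0:
        findings.append(f"{s.unneeded_nodes_count:g} unneeded node(s) awaiting removal")
    if s.unschedulable_pods_count != 0:
        findings.append(f"{s.unschedulable_pods_count:g} unschedulable pod(s)")
    return findings


# Example values are made up for illustration.
healthy = interpret(AutoscalerSnapshot(1, 0, 0, 0))
degraded = interpret(AutoscalerSnapshot(0, 1, 2, 3))
print(healthy)
print(degraded)
```

An empty list means none of the conditions described in this document apply. Because the metrics are averages, the values may be fractional, which is why the sketch compares against 0 and 1 rather than assuming integers.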