Azure Cluster Autoscaler
This integration can be enabled via the Connect an Azure Account page.
This integration has 4 metrics.
List of Alerts
Alert | Description | Format |
---|---|---|
[Azure Cluster AutoScaler] Safe Autoscale not working | Cluster Safe Autoscale not working. | Prometheus |
[Azure Cluster AutoScaler] Unneeded nodes | Unneeded nodes. | Prometheus |
[Azure Cluster AutoScaler] Scale down is in cooldown | Scale down is in cooldown. | Prometheus |
List of Dashboards
Azure Cluster Autoscaler
The dashboard provides information on the Azure Cluster Autoscaler for AKS.
List of Metrics
Metric name |
---|
azure_containerservice_managedclusters_cluster_autoscaler_cluster_safe_to_autoscale_avg |
azure_containerservice_managedclusters_cluster_autoscaler_scale_down_in_cooldown_avg |
azure_containerservice_managedclusters_cluster_autoscaler_unneeded_nodes_count_avg |
azure_containerservice_managedclusters_cluster_autoscaler_unschedulable_pods_count_avg |
Monitoring and Troubleshooting Azure Cluster Autoscaler
This document describes important metrics and queries that you can use to monitor and troubleshoot Azure Cluster Autoscaler.
Cluster Autoscaler
Safe Autoscale
Use the following query to check whether the cluster is currently safe to autoscale. If it is not, your cluster won’t be able to autoscale:
azure_containerservice_managedclusters_cluster_autoscaler_cluster_safe_to_autoscale_avg != 1
Because of the != 1 filter, the query returns only series whose value is not 1. Any returned series indicates a problem.
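As a minimal sketch of how this check could be automated, the Python snippet below builds an instant-query URL and lists the series that the query returns. It assumes your metrics backend exposes a Prometheus-compatible HTTP API (`/api/v1/query`). The base URL, the `cluster_name` label, and the sample payload are illustrative assumptions, not values provided by this integration.

```python
import json
from urllib.parse import urlencode

# Query taken from this section: it returns only series whose value is not 1.
SAFE_TO_AUTOSCALE_QUERY = (
    "azure_containerservice_managedclusters_cluster_autoscaler_"
    "cluster_safe_to_autoscale_avg != 1"
)


def build_query_url(base_url, promql):
    """Return an instant-query URL for a Prometheus-compatible HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def returned_series(response):
    """Return (labels, value) pairs for every series in an instant-query response."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    return [
        (series["metric"], float(series["value"][1]))
        for series in response["data"]["result"]
    ]


# Hypothetical response body; the label name and value are made up for illustration.
SAMPLE_RESPONSE = json.loads("""
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"cluster_name": "demo-aks"}, "value": [1700000000, "0"]}
    ]
  }
}
""")

for labels, value in returned_series(SAMPLE_RESPONSE):
    print(f"not safe to autoscale: {labels} (value={value})")
```

An empty result list means every monitored cluster reported a value of 1 at query time.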
Unneeded Nodes
Use the following query to get the number of unneeded nodes in the AKS cluster. These nodes will be deleted by Cluster Autoscaler as soon as all pods are evicted:
azure_containerservice_managedclusters_cluster_autoscaler_unneeded_nodes_count_avg
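If the count stays above zero for a long time, it can help to know how long nodes have been marked as unneeded. The sketch below assumes the metric was fetched with a Prometheus-style range query, where each series carries a `values` list of `[timestamp, "value"]` pairs, and measures the trailing run of samples above zero. The function name and the sample data are hypothetical.

```python
def seconds_with_unneeded_nodes(values):
    """Return the length, in seconds, of the trailing run of samples above zero.

    `values` is a list of [unix_timestamp, "value"] pairs in ascending time
    order, as found in a Prometheus-style range-query result.
    """
    run_start = None
    # Walk backwards from the newest sample until the count drops to zero.
    for timestamp, raw_value in reversed(values):
        if float(raw_value) <= 0:
            break
        run_start = timestamp
    if run_start is None:
        return 0.0
    return float(values[-1][0]) - float(run_start)


# Made-up samples at 60-second intervals, for illustration only.
SAMPLES = [[0, "0"], [60, "2"], [120, "2"], [180, "1"]]
print(seconds_with_unneeded_nodes(SAMPLES))
```

A result that keeps growing between checks suggests that pods on the unneeded nodes are not being evicted.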
Scale down cooldown
Use the following query to get the state of scale down:
azure_containerservice_managedclusters_cluster_autoscaler_scale_down_in_cooldown_avg
A return value other than 0 indicates that scale down is in cooldown, so the AKS cluster won’t be able to scale down until the cooldown ends.
Unschedulable pods
The Cluster Autoscaler component can watch for pods in your cluster that can’t be scheduled because of resource constraints. The following query returns the number of unschedulable pods:
azure_containerservice_managedclusters_cluster_autoscaler_unschedulable_pods_count_avg
A return value other than 0 indicates that the AKS cluster can’t schedule some pods and that the autoscaler will try to add nodes.
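The four checks in this document can be combined into one summary. The following sketch assumes you have already read the latest value of each metric for a single cluster. The function name, parameter names, and message wording are hypothetical and only illustrate how the thresholds described above fit together.

```python
def summarize_autoscaler(safe_to_autoscale, scale_down_in_cooldown,
                         unneeded_nodes, unschedulable_pods):
    """Return human-readable findings for one cluster's autoscaler metrics."""
    findings = []
    # cluster_safe_to_autoscale: any value other than 1 indicates a problem.
    if safe_to_autoscale != 1:
        findings.append("cluster is not safe to autoscale")
    # scale_down_in_cooldown: any value other than 0 means cooldown is active.
    if scale_down_in_cooldown != 0:
        findings.append("scale down is in cooldown")
    # unneeded_nodes_count: nodes waiting to be removed by the autoscaler.
    if unneeded_nodes > 0:
        findings.append(f"{unneeded_nodes:g} unneeded node(s) waiting for removal")
    # unschedulable_pods_count: pods that cannot be placed on current nodes.
    if unschedulable_pods > 0:
        findings.append(f"{unschedulable_pods:g} unschedulable pod(s)")
    return findings


# Made-up values for illustration only.
for finding in summarize_autoscaler(1, 0, 2, 0):
    print(finding)
```

An empty list means none of the four conditions described in this document applies to the values passed in.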