Azure Kubernetes Service
This integration can be enabled via the Connect an Azure Account page.
This integration has 16 metrics.
List of Alerts
Alert | Description | Format |
---|---|---|
[Azure Kubernetes Service] Cluster Node not ready | Kubernetes Cluster Node not ready. | Prometheus |
[Azure Kubernetes Service] Cluster node memory rss usage | Kubernetes Cluster node high memory rss usage. | Prometheus |
[Azure Kubernetes Service] Cluster node memory working set usage | Kubernetes Cluster node high memory working set usage. | Prometheus |
[Azure Kubernetes Service] Cluster node disk usage | Kubernetes Cluster node high disk usage. | Prometheus |
[Azure Kubernetes Service] Pod in failed phase | Kubernetes Cluster pod is in failed phase. | Prometheus |
List of Dashboards
Azure Kubernetes Service
The dashboard provides information on Azure Kubernetes Service (AKS).
List of Metrics
Metric name |
---|
azure_containerservice_managedclusters_apiserver_current_inflight_requests_avg |
azure_containerservice_managedclusters_kube_node_status_allocatable_cpu_cores_avg |
azure_containerservice_managedclusters_kube_node_status_allocatable_memory_bytes_avg |
azure_containerservice_managedclusters_kube_node_status_condition_avg |
azure_containerservice_managedclusters_kube_pod_status_phase_avg |
azure_containerservice_managedclusters_kube_pod_status_ready_avg |
azure_containerservice_managedclusters_node_cpu_usage_millicores_avg |
azure_containerservice_managedclusters_node_cpu_usage_percentage_avg |
azure_containerservice_managedclusters_node_disk_usage_bytes_avg |
azure_containerservice_managedclusters_node_disk_usage_percentage_avg |
azure_containerservice_managedclusters_node_memory_rss_bytes_avg |
azure_containerservice_managedclusters_node_memory_rss_percentage_avg |
azure_containerservice_managedclusters_node_memory_working_set_bytes_avg |
azure_containerservice_managedclusters_node_memory_working_set_percentage_avg |
azure_containerservice_managedclusters_node_network_in_bytes_avg |
azure_containerservice_managedclusters_node_network_out_bytes_avg |
Monitoring and Troubleshooting Azure Kubernetes Service
This document describes important metrics and queries that you can use to monitor and troubleshoot Azure Kubernetes Service.
Cluster State
Nodes not ready
Use the following query to check if there are nodes with NotReady state:
azure_containerservice_managedclusters_kube_node_status_condition_avg{condition="Ready", status2="NotReady"}
A return value other than 0 indicates a problem in the cluster.
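As a sketch, the same selector can be turned into a filter that returns only the offending series, which is the shape an alert expression usually takes. It assumes the metric reports 0 when no node is in the NotReady state.

```promql
# Returns a series only when the NotReady value is non-zero;
# an empty result means no problem was detected.
azure_containerservice_managedclusters_kube_node_status_condition_avg{condition="Ready", status2="NotReady"} != 0
```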
CPU cores
Use the following query to get the number of cores on your cluster:
azure_containerservice_managedclusters_kube_node_status_allocatable_cpu_cores_avg
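If this metric is reported once per node (an assumption; check the labels on your own series), summing it gives a cluster-wide total. Dividing the summed CPU usage in millicores by that total gives a rough utilization ratio:

```promql
# Total allocatable cores, assuming one series per node.
sum(azure_containerservice_managedclusters_kube_node_status_allocatable_cpu_cores_avg)

# Fraction of allocatable CPU in use (1000 millicores = 1 core).
sum(azure_containerservice_managedclusters_node_cpu_usage_millicores_avg) / 1000
  /
sum(azure_containerservice_managedclusters_kube_node_status_allocatable_cpu_cores_avg)
```

A ratio close to 1 suggests the cluster as a whole is running near its allocatable CPU capacity.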
Memory Available
Use the following query to get the available memory:
azure_containerservice_managedclusters_kube_node_status_allocatable_memory_bytes_avg
A low value can indicate memory pressure. In that case, a new node should be added, either manually or by the autoscaler.
Cluster nodes
CPU usage
Use the following query to get the CPU percentage usage:
azure_containerservice_managedclusters_node_cpu_usage_percentage_avg
A return value greater than 95 indicates CPU pressure in a node.
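To avoid reacting to short spikes, the threshold can be applied to an average over a time window. A minimal sketch; the 10-minute window is an illustrative choice, not a value taken from this integration:

```promql
# Series whose CPU usage averaged above 95% over the last 10 minutes.
avg_over_time(azure_containerservice_managedclusters_node_cpu_usage_percentage_avg[10m]) > 95
```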
Memory RSS
Memory RSS is the amount of anonymous and swap cache memory.
Use the following query to get the memory RSS usage percentage:
azure_containerservice_managedclusters_node_memory_rss_percentage_avg
Memory working set
The memory working set includes the amount of kernel memory, dirty memory, and recently accessed memory.
Use the following query to get the memory working set usage percentage:
azure_containerservice_managedclusters_node_memory_working_set_percentage_avg
A return value greater than 95 indicates memory pressure in a node.
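When several nodes are close to the threshold, ranking them can help decide where to look first. A sketch, with the choice of 5 being arbitrary:

```promql
# The five series with the highest memory working set percentage.
topk(5, azure_containerservice_managedclusters_node_memory_working_set_percentage_avg)
```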
Disk usage
Use the following query to get the disk usage:
azure_containerservice_managedclusters_node_disk_usage_percentage_avg
A high return value indicates that a disk is running low on space.
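Because this metric is a gauge, PromQL's `predict_linear` function can extrapolate its recent trend. The sketch below returns series that would reach 100% within four hours if the trend of the last six hours continued. Both windows are assumptions to tune, and a linear fit is only a rough guide.

```promql
# Extrapolate the last 6h of disk usage 4h (4 * 3600 seconds) into the future.
predict_linear(azure_containerservice_managedclusters_node_disk_usage_percentage_avg[6h], 4 * 3600) >= 100
```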
Network In
Use the following query to get network bandwidth in:
azure_containerservice_managedclusters_node_network_in_bytes_avg
Network Out
Use the following query to get network bandwidth out:
azure_containerservice_managedclusters_node_network_out_bytes_avg
API Inflight Requests
Use the following query to get the number of in-flight API server requests:
azure_containerservice_managedclusters_apiserver_current_inflight_requests_avg