Kubernetes
This integration is disabled by default. See Enable and Disable Integrations to enable it in your account.
This integration has 70 metrics.
List of Alerts
Alert | Description | Format |
---|---|---|
[Kubernetes] Container Waiting | Container in waiting status for a long time (CrashLoopBackOff, ImagePullErr…) | Prometheus |
[Kubernetes] Container Restarting | Container restarting | Prometheus |
[Kubernetes] Pod Not Ready | Pod in not ready status | Prometheus |
[Kubernetes] Init Container Waiting For a Long Time | Init container in waiting state (CrashLoopBackOff, ImagePullErr…) | Prometheus |
[Kubernetes] Pod Container Creating For a Long Time | Pod is stuck in ContainerCreating state | Prometheus |
[Kubernetes] Pod Container Terminated With Error | Pod Container Terminated With Error (OOMKilled, Error…) | Prometheus |
[Kubernetes] Init Container Terminated With Error | Init Container Terminated With Error (OOMKilled, Error…) | Prometheus |
[Kubernetes] Workload with Pods not Ready | Workload with Pods not Ready (Evicted, NodeLost, UnexpectedAdmissionError) | Prometheus |
[Kubernetes] Workload Replicas Mismatch | There are pods in the workload that could not start | Prometheus |
[Kubernetes] Pod Not Scheduled For DaemonSet | Pods cannot be scheduled for DaemonSet | Prometheus |
[Kubernetes] Pods In DaemonSet Incorrectly Scheduled | There are pods from a DaemonSet that should not be running | Prometheus |
[Kubernetes] CPU Overcommit | CPU resources in the cluster are overcommitted. If a node fails, the cluster may be unable to reschedule the affected pods due to insufficient CPU capacity. | Prometheus |
[Kubernetes] Memory Overcommit | Memory resources in the cluster are overcommitted. If a node fails, the cluster may be unable to reschedule the affected pods due to insufficient memory capacity. | Prometheus |
[Kubernetes] CPU OverUsage | CPU OverUsage in cluster. If one node fails, the cluster will not have enough CPU to run all the current pods. | Prometheus |
[Kubernetes] Memory OverUsage | Memory OverUsage in cluster. If one node fails, the cluster will not have enough memory to run all the current pods. | Prometheus |
[Kubernetes] Container CPU Throttling | Container CPU usage is close to its limit. CPU throttling is possible. | Prometheus |
[Kubernetes] Container Memory Next To Limit | Container memory usage is close to its limit. Risk of Out Of Memory Kill. | Prometheus |
[Kubernetes] Container CPU Unused | Container unused CPU higher than 85% of request for 8 hours. | Prometheus |
[Kubernetes] Container Memory Unused | Container unused Memory higher than 85% of request for 8 hours. | Prometheus |
[Kubernetes] Node Not Ready | Node in Not-Ready condition | Prometheus |
[Kubernetes] Not All Nodes Are Ready | Not all nodes are in Ready condition. | Prometheus |
[Kubernetes] Too Many Pods In Node | Node close to its limits of pods. | Prometheus |
[Kubernetes] Node Readiness Flapping | Node availability is unstable. | Prometheus |
[Kubernetes] Nodes Disappeared | Fewer nodes in the cluster than 30 minutes earlier. | Prometheus |
[Kubernetes] All Nodes Gone In Cluster | All Nodes Gone In Cluster. | Prometheus |
[Kubernetes] Node CPU High Usage | High usage of CPU in node. | Prometheus |
[Kubernetes] Node Memory High Usage | High usage of memory in node. Risk of pod eviction. | Prometheus |
[Kubernetes] Node Root File System Almost Full | Root file system in node almost full. To include other file systems, change the value of the device label from '.*root.*' to your device name. | Prometheus |
[Kubernetes] Max Schedulable Pod Less Than 1 CPU Core | The maximum schedulable CPU request in a pod is less than 1 core. | Prometheus |
[Kubernetes] Max Schedulable Pod Less Than 512Mb Memory | The maximum schedulable memory request in a pod is less than 512 MB. | Prometheus |
[Kubernetes] HPA Desired Scale Up Replicas Unreached | HPA could not reach the desired scaled up replicas for a long time. | Prometheus |
[Kubernetes] HPA Desired Scale Down Replicas Unreached | HPA could not reach the desired scaled down replicas for a long time. | Prometheus |
[Kubernetes] Job failed to complete | Job failed to complete | Prometheus |
[Kubernetes] Cluster is reaching maximum pod capacity (95%) | Review cluster pod capacity to ensure pods can be scheduled. | Prometheus |
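The overcommit alerts above describe a "can the cluster survive losing a node" check. The following is a minimal sketch of that reasoning, not the alert's actual Prometheus expression, which this page does not show. It assumes the worst case is losing the largest node; the `Node` class, the `cpu_overcommitted` function, and the sample values are hypothetical and used only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical record holding one node's allocatable CPU, in cores."""
    name: str
    allocatable_cpu_cores: float


def cpu_overcommitted(nodes: List[Node], requested_cpu_cores: float) -> bool:
    """Return True if the CPU requests would not fit after losing the largest node.

    Assumption: the worst case is the failure of the node with the most
    allocatable CPU. The real alert may use a different rule.
    """
    if not nodes:
        # With no nodes at all, any positive request cannot be scheduled.
        return requested_cpu_cores > 0
    total = sum(n.allocatable_cpu_cores for n in nodes)
    largest = max(n.allocatable_cpu_cores for n in nodes)
    return requested_cpu_cores > total - largest


# Illustrative values only: 16 cores in total, 8 left if node "c" fails.
nodes = [Node("a", 4.0), Node("b", 4.0), Node("c", 8.0)]
print(cpu_overcommitted(nodes, 7.0))  # False: 7 cores still fit in 8
print(cpu_overcommitted(nodes, 9.0))  # True: 9 cores do not fit in 8
```

The same comparison applies to memory by swapping the allocatable and requested values for their byte counterparts.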
List of Dashboards
Workload Status & Performance
The dashboard provides information on the Workload Status and Performance.
Pod Status & Performance
The dashboard provides information on the Pod Status and Performance.
Cluster / Namespace Available Resources
The dashboard provides information on the Cluster and Namespace Available Resources.
Cluster Capacity Planning
Dashboard used for Cluster Capacity Planning.
Container Resource Usage & Troubleshooting
The dashboard provides information on the Container Resource Usage and Troubleshooting.
Node Status & Performance
The dashboard provides information on the Node Status and Performance.
Pod Rightsizing & Workload Capacity Optimization
Dashboard used for Pod Rightsizing and Workload Capacity Optimization.
Pod Scheduling Troubleshooting
Dashboard used for Pod Scheduling Troubleshooting.
Horizontal Pod Autoscaler
The dashboard provides information on the Horizontal Pod Autoscalers.
Kubernetes Jobs
The dashboard provides information on the Kubernetes Jobs.
List of Metrics
Metric name |
---|
container.image |
container.image.tag |
kube_cronjob_next_schedule_time |
kube_cronjob_status_active |
kube_cronjob_status_last_schedule_time |
kube_daemonset_status_current_number_scheduled |
kube_daemonset_status_desired_number_scheduled |
kube_daemonset_status_number_misscheduled |
kube_daemonset_status_number_ready |
kube_hpa_status_current_replicas |
kube_hpa_status_desired_replicas |
kube_job_complete |
kube_job_failed |
kube_job_spec_completions |
kube_job_status_active |
kube_namespace_labels |
kube_node_info |
kube_node_status_allocatable |
kube_node_status_allocatable_cpu_cores |
kube_node_status_allocatable_memory_bytes |
kube_node_status_capacity |
kube_node_status_capacity_cpu_cores |
kube_node_status_capacity_memory_bytes |
kube_node_status_capacity_pods |
kube_node_status_condition |
kube_node_sysdig_host |
kube_pod_container_info |
kube_pod_container_resource_limits |
kube_pod_container_resource_requests |
kube_pod_container_status_restarts_total |
kube_pod_container_status_terminated_reason |
kube_pod_container_status_waiting_reason |
kube_pod_info |
kube_pod_init_container_status_terminated_reason |
kube_pod_init_container_status_waiting_reason |
kube_pod_status_ready |
kube_resourcequota |
kube_workload_pods_status_reason |
kube_workload_status_desired |
kube_workload_status_ready |
kubernetes.hpa.replicas.current |
kubernetes.hpa.replicas.desired |
kubernetes.hpa.replicas.max |
kubernetes.hpa.replicas.min |
sysdig_container_cpu_cores_used |
sysdig_container_cpu_quota_used_percent |
sysdig_container_info |
sysdig_container_memory_limit_used_percent |
sysdig_container_memory_used_bytes |
sysdig_container_net_connection_in_count |
sysdig_container_net_connection_out_count |
sysdig_container_net_connection_total_count |
sysdig_container_net_error_count |
sysdig_container_net_http_error_count |
sysdig_container_net_http_request_time |
sysdig_container_net_http_statuscode_request_count |
sysdig_container_net_in_bytes |
sysdig_container_net_out_bytes |
sysdig_container_net_request_count |
sysdig_container_net_request_time |
sysdig_fs_free_bytes |
sysdig_fs_inodes_used_percent |
sysdig_fs_total_bytes |
sysdig_fs_used_bytes |
sysdig_fs_used_percent |
sysdig_program_cpu_cores_used |
sysdig_program_cpu_used_percent |
sysdig_program_memory_used_bytes |
sysdig_program_net_connection_total_count |
sysdig_program_net_total_bytes |
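As an illustration of how metrics from this list might be combined, the sketch below estimates the unused share of a container's CPU request, which is the quantity the Container CPU Unused alert describes (more than 85% of the request unused). It assumes you already have samples of `sysdig_container_cpu_cores_used` and the CPU value from `kube_pod_container_resource_requests`, both in cores; how the samples are fetched is out of scope. The function name and the sample values are hypothetical, and the real alert expression may differ.

```python
from typing import Sequence


def unused_cpu_ratio(used_cores_samples: Sequence[float], requested_cores: float) -> float:
    """Return the fraction of the CPU request left unused, based on average usage.

    The result is clamped at 0.0 when average usage exceeds the request.
    """
    if not used_cores_samples:
        raise ValueError("at least one usage sample is required")
    if requested_cores <= 0:
        raise ValueError("requested_cores must be positive")
    avg_used = sum(used_cores_samples) / len(used_cores_samples)
    return max(0.0, 1.0 - avg_used / requested_cores)


# Illustrative samples only: a container requesting 1 core and using very little.
samples = [0.05, 0.10, 0.05, 0.10]
ratio = unused_cpu_ratio(samples, requested_cores=1.0)
print(ratio > 0.85)  # True: well over 85% of the request is unused
```

A threshold comparison like the last line would only match the alert's intent if the samples cover the full 8-hour window mentioned in the alert description.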
Prerequisites
None.
Installation
Installing an exporter is not required for this integration.
Agent Configuration
This integration has no default agent job.