IBM Kubernetes API Server

Metrics, Dashboards, Alerts, and more for the IBM Kubernetes API Server integration in Sysdig Monitor.

This integration is disabled by default. See Enable and Disable Integrations to enable it in your account.

This integration is out-of-the-box, so it doesn’t require any exporter.

This integration has 11 metrics.

Timeseries generated: ~1200 TS

List of Alerts

Alert | Description | Format
[IBM Kubernetes API Server] Certificate Expiry | API-Server Certificate Expiry | Prometheus
[IBM Kubernetes API Server] Admission Controller High Latency | API-Server Admission Controller High Latency | Prometheus
[IBM Kubernetes API Server] High 4xx Request Error Rate | API-Server High 4xx Request Error Rate | Prometheus
[IBM Kubernetes API Server] High 5xx Request Error Rate | API-Server High 5xx Request Error Rate | Prometheus
[IBM Kubernetes API Server] High Request Latency | API-Server High Request Latency | Prometheus

List of Dashboards

IBM Kubernetes API Server

The dashboard provides information on the IBM Kubernetes API Server.

List of Metrics

Metric name

apiserver_admission_controller_admission_duration_seconds_count
apiserver_admission_controller_admission_duration_seconds_sum
apiserver_client_certificate_expiration_seconds_bucket
apiserver_client_certificate_expiration_seconds_count
apiserver_request_duration_seconds_count
apiserver_request_duration_seconds_sum
apiserver_request_total
apiserver_response_sizes_count
apiserver_response_sizes_sum
workqueue_adds_total
workqueue_depth

Install Sysdig Agent

To collect the IBM Kubernetes API Server metrics, you must install the Sysdig Agent on your IKS cluster nodes.


Installing an exporter is not required for this integration.
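
If the agent is not yet deployed, it can be installed with Helm. The following is a minimal sketch; the chart and parameter names (sysdig-deploy, global.sysdig.accessKey, global.clusterConfig.name) come from Sysdig's public Helm charts, but verify them against the agent installation docs for your region and agent version:

helm repo add sysdig https://charts.sysdig.com
helm repo update
# <ACCESS_KEY> and <CLUSTER_NAME> are placeholders for your Sysdig
# access key and a name identifying this IKS cluster.
helm install sysdig-agent sysdig/sysdig-deploy \
  --namespace sysdig-agent --create-namespace \
  --set global.sysdig.accessKey=<ACCESS_KEY> \
  --set global.clusterConfig.name=<CLUSTER_NAME>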

Monitoring and Troubleshooting IBM Kubernetes API Server

Learning how to monitor the Kubernetes API server is vital when running Kubernetes in production. Monitoring kube-apiserver will help you detect and troubleshoot latency and errors, and validate whether the service performs as expected.

Here are some interesting queries to run and metrics to monitor for troubleshooting the IBM Kubernetes API Server.

Certificate Expiration

Clients use certificates to authenticate to the API server. Use the following query to check whether any client certificate will expire within the next week:

apiserver_client_certificate_expiration_seconds_count > 0 and histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket[5m]))) < 7*24*60*60
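
Sysdig ships this alert out of the box, but if you also manage your own Prometheus rules, the query can be wrapped in a standard alerting rule. This is a minimal sketch; the group name, alert name, severity label, and 5m pending period are illustrative, not Sysdig defaults:

groups:
  - name: iks-apiserver-certificates   # illustrative group name
    rules:
      - alert: APIServerClientCertExpiringSoon   # illustrative alert name
        expr: |
          apiserver_client_certificate_expiration_seconds_count > 0
          and histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket[5m]))) < 7*24*60*60
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: A client certificate presented to the API server expires in less than 7 days.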

API Server Latency

A latency spike is typically a sign of overload in the API server: the cluster is likely under high load, and the API server may need to be scaled out. Use the following query to check average request latency over the last 10 minutes.

sum by (kube_cluster_name,verb,apiserver)(rate(apiserver_request_duration_seconds_sum{verb!="WATCH"}[10m]))/sum by (kube_cluster_name,verb,apiserver)(rate(apiserver_request_duration_seconds_count{verb!="WATCH"}[10m]))
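
The query above returns the average request duration per cluster and verb. To turn it into a simple check, compare the result against a threshold; the one-second value below is an illustrative starting point, not a Sysdig default:

sum by (kube_cluster_name,verb,apiserver)(rate(apiserver_request_duration_seconds_sum{verb!="WATCH"}[10m]))/sum by (kube_cluster_name,verb,apiserver)(rate(apiserver_request_duration_seconds_count{verb!="WATCH"}[10m])) > 1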

Request Error Rate

A high request error rate means that the API server is responding with 5xx errors. The following query returns results when more than 5% of requests over the last 5 minutes failed with a 5xx error:

sum by(kube_cluster_name)(rate(apiserver_request_total{code=~"5.."}[5m])) / sum by(kube_cluster_name)(rate(apiserver_request_total[5m])) > 0.05
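
The alert list above also covers 4xx client errors; the same pattern applies with a different code regex. The 5% threshold mirrors the 5xx query and is likewise only a starting point:

sum by(kube_cluster_name)(rate(apiserver_request_total{code=~"4.."}[5m])) / sum by(kube_cluster_name)(rate(apiserver_request_total[5m])) > 0.05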

Agent Configuration

The default agent job for this integration is as follows:

- job_name: iks-apiservers-default
  bearer_token_file: /var/run/secrets/
  tls_config:
    ca_file: /var/run/secrets/
    insecure_skip_verify: true
  scheme: https
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    # Keep only pods running on this node; the agent substitutes
    # __HOSTIPS__ with the host IPs of the local node.
    - action: keep
      source_labels: [__meta_kubernetes_pod_host_ip]
      regex: __HOSTIPS__
    - source_labels: [__meta_kubernetes_pod_phase]
      action: keep
      regex: Running
    - action: keep
      source_labels: [__meta_kubernetes_pod_container_name]
      regex: dashboard-metrics-scraper
    # Rewrite the scrape address from the discovered pod to the
    # API server service endpoint.
    - action: replace
      source_labels: [__address__]
      target_label: __address__
      replacement: kubernetes.default.svc:443
    - action: replace
      source_labels: [__meta_kubernetes_pod_uid]
      target_label: sysdig_k8s_pod_uid
    - action: replace
      source_labels: [__meta_kubernetes_pod_container_name]
      target_label: sysdig_k8s_pod_container_name
  metric_relabel_configs:
    # Keep only the 11 metrics used by this integration.
    - source_labels: [__name__]
      regex: (apiserver_admission_controller_admission_duration_seconds_count|apiserver_admission_controller_admission_duration_seconds_sum|apiserver_client_certificate_expiration_seconds_bucket|apiserver_client_certificate_expiration_seconds_count|apiserver_request_duration_seconds_count|apiserver_request_duration_seconds_sum|apiserver_request_total|apiserver_response_sizes_count|apiserver_response_sizes_sum|workqueue_adds_total|workqueue_depth)
      action: keep
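
The metric_relabel_configs block drops every series whose name is not matched by the keep regex, so to collect an additional API-server metric, append its name to the alternation. Keep in mind that any metric you add increases the time-series count beyond the ~1200 TS estimate above.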