OpenShift API-Server

This integration is disabled by default. Please contact Sysdig Support to enable it in your account.

List of Alerts:

  • [OpenShift API Server] Deprecated APIs: API-Server Deprecated APIs (Prometheus)
  • [OpenShift API Server] Certificate Expiry: API-Server Certificate Expiry (Prometheus)
  • [OpenShift API Server] Admission Controller High Latency: API-Server Admission Controller High Latency (Prometheus)
  • [OpenShift API Server] Webhook Admission Controller High Latency: API-Server Webhook Admission Controller High Latency (Prometheus)
  • [OpenShift API Server] High 4xx RequestError Rate: API-Server High 4xx Request Error Rate (Prometheus)
  • [OpenShift API Server] High 5xx RequestError Rate: API-Server High 5xx Request Error Rate (Prometheus)
  • [OpenShift API Server] High Request Latency: API-Server High Request Latency (Prometheus)

List of Dashboards:

  • OpenShift v4 API Server

List of Metrics:

  • apiserver_admission_controller_admission_duration_seconds_count
  • apiserver_admission_controller_admission_duration_seconds_sum
  • apiserver_admission_webhook_admission_duration_seconds_count
  • apiserver_admission_webhook_admission_duration_seconds_sum
  • apiserver_client_certificate_expiration_seconds_bucket
  • apiserver_client_certificate_expiration_seconds_count
  • apiserver_request_duration_seconds_count
  • apiserver_request_duration_seconds_sum
  • apiserver_request_total
  • apiserver_requested_deprecated_apis

How to monitor OpenShift API Server with Sysdig agent

No further installation is needed, since OpenShift 4.X ships with both Prometheus and the API Server ready to use. OpenShift API Server metrics are exposed through the Prometheus /federate endpoint.
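To give a rough idea of what a federation request looks like, the sketch below builds a /federate URL that selects only the apiserver_* series. The host name is a placeholder and the selector is an assumption made for illustration; the actual scrape is handled by the integration and may use different selectors.

```python
from typing import Iterable
from urllib.parse import urlencode


def federate_url(base_url: str, selectors: Iterable[str]) -> str:
    """Build a federation URL with one match[] parameter per series selector."""
    query = urlencode([("match[]", selector) for selector in selectors])
    return f"{base_url.rstrip('/')}/federate?{query}"


# Hypothetical host name, used only for illustration.
url = federate_url(
    "https://prometheus.example.invalid",
    ['{__name__=~"apiserver_.*"}'],
)
print(url)
```

Each match[] parameter carries one series selector, so additional metric families can be requested by passing more selectors to the function.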

Learning how to monitor the Kubernetes API server is vital when running Kubernetes in production. Monitoring kube-apiserver lets you detect and troubleshoot latency and errors, and validate that the service performs as expected.

Here are some useful metrics and queries for monitoring and troubleshooting the OpenShift API Server.

API Server deprecated APIs

To check whether deprecated API versions are being used, run the following query:

sum by (kube_cluster_name, resource, removed_release,version)(apiserver_requested_deprecated_apis)

Certificate expiration

Certificates are used to authenticate to the API server. The following query checks whether a client certificate expires within the next seven days:

apiserver_client_certificate_expiration_seconds_count > 0 and histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket[5m]))) < 7*24*60*60
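The query above compares the 1st percentile of the remaining certificate lifetime with 7*24*60*60 = 604800 seconds. The sketch below is a simplified re-implementation of the bucket interpolation that histogram_quantile performs, written only to make the arithmetic visible. The bucket boundaries and counts are made up, and edge cases handled by Prometheus (for example non-positive bounds) are left out.

```python
import math

SEVEN_DAYS = 7 * 24 * 60 * 60  # 604800 seconds, the threshold used in the query


def histogram_quantile(q, buckets):
    """Estimate a quantile from (upper_bound, cumulative_count) pairs.

    Buckets must be sorted by upper bound and end with the +Inf bucket.
    Simplified sketch: assumes 0 < q <= 1 and a lower bound of 0 for the first bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The rank falls into the +Inf bucket: report the highest finite bound.
                return buckets[-2][0]
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan


# Made-up cumulative counts: 1 day, 7 days, 30 days, +Inf.
buckets = [(86400.0, 0), (604800.0, 4), (2592000.0, 50), (math.inf, 200)]
estimate = histogram_quantile(0.01, buckets)
expiring_soon = estimate < SEVEN_DAYS
print(estimate, expiring_soon)
```

With these sample counts the estimate lands at four days, which is below the seven-day threshold, so the condition in the query would hold.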

API Server Latency

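Check for latency spikes over the last 10 minutes. The following query returns the average request latency per cluster, verb, and API server, excluding WATCH requests. Sustained high latency is typically a sign that the API server is overloaded: the cluster is probably under heavy load and the API server needs to be scaled out.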

sum by (kube_cluster_name,verb,apiserver)(rate(apiserver_request_duration_seconds_sum{verb!="WATCH"}[10m]))/sum by (kube_cluster_name,verb,apiserver)(rate(apiserver_request_duration_seconds_count{verb!="WATCH"}[10m]))
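Dividing the rate of the _sum counter by the rate of the _count counter over the same window gives the average duration of the requests served in that window, because the window length cancels out. A minimal sketch of that arithmetic, using invented counter samples and ignoring counter resets and the extrapolation that rate() applies:

```python
def average_latency(sum_start, sum_end, count_start, count_end):
    """Average seconds per request between two samples of the _sum and _count counters."""
    requests = count_end - count_start
    if requests <= 0:
        # No requests in the window (or a counter reset): no meaningful average.
        return None
    return (sum_end - sum_start) / requests


# Invented samples taken 10 minutes apart.
latency = average_latency(
    sum_start=1200.0, sum_end=1290.0,   # total seconds spent serving requests
    count_start=50000, count_end=50600,  # total requests served
)
print(latency)
```

Here 90 seconds of accumulated request time spread over 600 requests works out to an average of 0.15 seconds per request.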

Request Error Rate

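A high request error rate means that the API server is responding with 5xx errors. Check the CPU and memory usage of your API server pods. The following query returns the clusters where more than 5% of requests over the last 5 minutes failed with a 5xx code: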

sum by(kube_cluster_name)(rate(apiserver_request_total{code=~"5..",kube_cluster_name=~$cluster}[5m])) / sum by(kube_cluster_name)(rate(apiserver_request_total{kube_cluster_name=~$cluster}[5m])) > 0.05
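The ratio in the query can be sketched outside PromQL as follows. The per-code request counts are invented, and the function name is hypothetical. Like the code=~"5.." matcher, the pattern must match the whole status code.

```python
import re


def error_ratio(requests_by_code, pattern="5.."):
    """Share of requests whose status code fully matches the pattern."""
    total = sum(requests_by_code.values())
    if total == 0:
        # Choice made for this sketch; PromQL would produce NaN for 0/0.
        return 0.0
    errors = sum(
        count
        for code, count in requests_by_code.items()
        if re.fullmatch(pattern, code)
    )
    return errors / total


# Invented request counts for a 5-minute window, keyed by HTTP status code.
window = {"200": 9000, "404": 400, "500": 450, "503": 150}
ratio = error_ratio(window)
alert = ratio > 0.05  # same 5% threshold as the query
print(ratio, alert)
```

In this sample, 600 of 10000 requests returned a 5xx code, so the ratio exceeds the 5% threshold.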