OpenShift CoreDNS
This integration is enabled by default.
Versions supported: > v4.8
This integration is out-of-the-box, so it doesn’t require any exporter.
This integration has 13 metrics.
Timeseries generated: CoreDNS generates ~230 timeseries per dns-default pod
List of Alerts
Alert | Description | Format |
---|---|---|
[OpenShift CoreDNS] Process Down | CoreDNS has disappeared from target discovery. | Prometheus |
[OpenShift CoreDNS] High Failed Responses | CoreDNS is returning failed responses. | Prometheus |
[OpenShift CoreDNS] High Latency | CoreDNS response latency is higher than 60 ms. | Prometheus |
[OpenShift CoreDNS] Panics Observed | CoreDNS Panics Observed. | Prometheus |
List of Dashboards
OpenShift v4 CoreDNS
If you are using Prometheus Remote Write, you will need to add the following metric relabel config to set this label:
- action: replace
  source_labels: [ __address__ ]
  target_label: _sysdig_integration_openshift_coredns
  replacement: true
The dashboard provides information on OpenShift CoreDNS.
List of Metrics
Metric name |
---|
coredns_cache_hits_total |
coredns_cache_misses_total |
coredns_dns_request_duration_seconds_bucket |
coredns_dns_request_size_bytes_bucket |
coredns_dns_requests_total |
coredns_dns_response_size_bytes_bucket |
coredns_dns_responses_total |
coredns_forward_request_duration_seconds_bucket |
coredns_panics_total |
coredns_plugin_enabled |
go_goroutines |
process_cpu_seconds_total |
process_resident_memory_bytes |
Prerequisites
None.
Installation
Installing an exporter is not required for this integration.
Monitoring and Troubleshooting OpenShift CoreDNS
Because OpenShift 4.x comes with both Prometheus and CoreDNS ready to use, no additional installation is required. OpenShift CoreDNS metrics are exposed over HTTPS on port 9154.
Here are some interesting queries to run and metrics to monitor for troubleshooting OpenShift 4.
CoreDNS Panics
Number of Panics
To check the number of CoreDNS panics, use the following query:
sum(coredns_panics_total)
Check the CoreDNS pod logs if you see this number growing.
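Because `coredns_panics_total` is a counter, its raw sum only grows until a pod restarts, so recent growth can be easier to read than the absolute value. The query below is a minimal sketch of that idea; the 10-minute window is an assumed value, not one taken from this integration.

```promql
# Panics recorded across all CoreDNS pods over the last 10 minutes.
# The 10m window is an assumption; adjust it to your scrape interval.
sum(increase(coredns_panics_total[10m]))
```

A result above zero suggests that at least one panic occurred within the window, which is a reasonable cue to inspect the pod logs.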
DNS Requests
By Type
To view the DNS request rate by record type, use the following query:
sum(rate(coredns_dns_requests_total[$__interval])) by (type,kube_cluster_name,kube_pod_name)
By Protocol
To view the DNS request rate by protocol, use the following query:
sum(rate(coredns_dns_requests_total[$__interval])) by (proto,kube_cluster_name,kube_pod_name)
By Zone
To view the DNS request rate by zone, use the following query:
sum(rate(coredns_dns_requests_total[$__interval])) by (zone,kube_cluster_name,kube_pod_name)
By Latency
This metric helps detect degradation in the service. The following query returns the 99th percentile of request duration, which you can compare against the average.
histogram_quantile(0.99, sum(rate(coredns_dns_request_duration_seconds_bucket[5m])) by(server, zone, le))
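As a baseline for the 99th percentile, the median can be derived from the same histogram buckets. This is a sketch under one assumption: since the metric list for this integration includes only the `_bucket` series of this histogram, the median is used here as a stand-in for the average.

```promql
# Median (50th percentile) request duration, grouped the same way
# as the 99th percentile query so the two can be plotted side by side.
histogram_quantile(0.5, sum(rate(coredns_dns_request_duration_seconds_bucket[5m])) by (server, zone, le))
```

A 99th percentile that climbs while the median stays flat usually points to a subset of slow requests rather than a general slowdown.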
Error Rate
Watch this metric carefully. You can filter it by status code, for example 200, 404, 400, or 500.
sum by (server, status)(coredns_dns_https_responses_total)
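The metric list for this integration also includes `coredns_dns_responses_total`, which can give a DNS-level view of errors. The sketch below assumes that this metric carries an `rcode` label, as it does in upstream CoreDNS; the 5-minute window is likewise an assumption.

```promql
# Response rate per DNS return code (for example NOERROR, NXDOMAIN, SERVFAIL).
sum by (rcode) (rate(coredns_dns_responses_total[5m]))

# Fraction of all responses that are SERVFAIL.
sum(rate(coredns_dns_responses_total{rcode="SERVFAIL"}[5m]))
  /
sum(rate(coredns_dns_responses_total[5m]))
```

The first query is suited to a stacked graph, while the second yields a single ratio that may be easier to alert on.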
Cache
Cache Hit
To check the cache hit rate, use the following query:
sum(rate(coredns_cache_hits_total[$__interval])) by (type,kube_cluster_name,kube_pod_name)
Cache Miss
To check the cache miss rate, use the following query:
sum(rate(coredns_cache_misses_total[$__interval])) by(server,kube_cluster_name,kube_pod_name)
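The two rates above can be combined into a single cache hit ratio. This is a minimal sketch, assuming a fixed 5-minute window in place of the dashboard's `$__interval` variable.

```promql
# Cache hit ratio: hits divided by total cache lookups (hits + misses).
sum(rate(coredns_cache_hits_total[5m]))
  /
(sum(rate(coredns_cache_hits_total[5m])) + sum(rate(coredns_cache_misses_total[5m])))
```

A value close to 1 means most lookups are served from the cache. A sustained drop may be worth correlating with the request rate by zone.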
Agent Configuration
The default agent job for this integration is as follows:
- job_name: openshift-dns-default
  honor_labels: true
  tls_config:
    insecure_skip_verify: true
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  scheme: https
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: keep
    source_labels: [__meta_kubernetes_pod_host_ip]
    regex: __HOSTIPS__
  - source_labels: [__meta_kubernetes_pod_phase]
    action: keep
    regex: Running
  - action: keep
    source_labels:
    - __meta_kubernetes_namespace
    - __meta_kubernetes_pod_name
    separator: '/'
    regex: 'openshift-dns/dns-default.+'
  - source_labels:
    - __address__
    action: keep
    regex: (.*:9154)
  - source_labels:
    - __meta_kubernetes_pod_name
    action: replace
    target_label: instance
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - action: replace
    source_labels: [__meta_kubernetes_pod_uid]
    target_label: sysdig_k8s_pod_uid
  - action: replace
    source_labels: [__meta_kubernetes_pod_container_name]
    target_label: sysdig_k8s_pod_container_name
  - action: replace
    source_labels: [ __address__ ]
    target_label: _sysdig_integration_openshift_coredns
    replacement: true
  metric_relabel_configs:
  - source_labels: [__name__]
    regex: (coredns_cache_hits_total|coredns_cache_misses_total|coredns_dns_request_duration_seconds_bucket|coredns_dns_request_size_bytes_bucket|coredns_dns_requests_total|coredns_dns_response_size_bytes_bucket|coredns_dns_responses_total|coredns_forward_request_duration_seconds_bucket|coredns_panics_total|coredns_plugin_enabled|go_goroutines|process_cpu_seconds_total|process_resident_memory_bytes)
    action: keep