This integration is enabled by default.
Versions supported: > v15.2.12
This integration is out-of-the-box, so it doesn’t require any exporter.
This integration has 24 metrics.
Time series generated: 600
List of Alerts
Alert | Description | Format |
---|---|---|
[Ceph] Ceph Manager is absent | Ceph Manager has disappeared from Prometheus target discovery. | Prometheus |
[Ceph] Ceph Manager is missing replicas | Ceph Manager is missing replicas. | Prometheus |
[Ceph] Ceph quorum at risk | Storage cluster quorum is low. Contact Support. | Prometheus |
[Ceph] High number of leader changes | Ceph Monitor has seen a lot of leader changes per minute recently. | Prometheus |
List of Dashboards
Ceph
The dashboard provides information on the status, capacity, latency and throughput of Ceph.
List of Metrics
Metric name |
---|
ceph_cluster_total_bytes |
ceph_cluster_total_used_bytes |
ceph_health_status |
ceph_mgr_status |
ceph_mon_metadata |
ceph_mon_num_elections |
ceph_mon_quorum_status |
ceph_osd_apply_latency_ms |
ceph_osd_commit_latency_ms |
ceph_osd_in |
ceph_osd_metadata |
ceph_osd_numpg |
ceph_osd_op_r |
ceph_osd_op_r_latency_count |
ceph_osd_op_r_latency_sum |
ceph_osd_op_r_out_bytes |
ceph_osd_op_w |
ceph_osd_op_w_in_bytes |
ceph_osd_op_w_latency_count |
ceph_osd_op_w_latency_sum |
ceph_osd_recovery_bytes |
ceph_osd_recovery_ops |
ceph_osd_up |
ceph_pool_max_avail |
Preparing the Integration
Enable Prometheus Module
Ceph exposes metrics in Prometheus format and annotates the manager pod with Prometheus annotations.
Make sure that the Prometheus module is activated in the Ceph cluster by running the following command:
ceph mgr module enable prometheus
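Once the module is enabled, the manager should serve metrics in the Prometheus text format. The following is a minimal Python sketch for checking that a scraped payload contains the expected metric names. It assumes the manager serves plain HTTP at `/metrics` on port 9283 (the port matched by the default agent job); the `fetch_metrics` helper, the host argument, and the sample payload are hypothetical and only illustrate the idea.

```python
import urllib.request


def fetch_metrics(host, port=9283, timeout=5.0):
    """Fetch the raw metrics text from the manager (plain HTTP is an assumption)."""
    url = f"http://{host}:{port}/metrics"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def metric_names(payload):
    """Return the set of metric names found in a Prometheus text-format payload."""
    names = set()
    for line in payload.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The metric name ends at the first '{' (labels) or space (value).
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


# Hypothetical payload, used only to illustrate the parsing.
SAMPLE = """\
# HELP ceph_health_status Cluster health status
# TYPE ceph_health_status untyped
ceph_health_status 0.0
ceph_osd_up{ceph_daemon="osd.0"} 1.0
"""

found = metric_names(SAMPLE)
print("ceph_health_status" in found)
```

Against a live cluster, `metric_names(fetch_metrics("<manager-host>"))` could be compared with the metric list above; an empty or failing response suggests the module is not active or not reachable.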
Installing
The installation of an exporter is not required for this integration.
Monitoring and Troubleshooting Ceph
This document describes important metrics and queries that you can use to monitor and troubleshoot Ceph.
Tracking metrics status
You can track the status of Ceph metrics with the following alert, which fires when the exporter process is not serving metrics:
# [Ceph] Exporter Process Down
absent(ceph_health_status{kube_cluster_name=~$cluster,kube_namespace_name=~$namespace,kube_workload_name=~$workload}) > 0
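The query relies on `absent()`, which returns 1 only when no series matches the selectors, so the `> 0` comparison fires exactly when `ceph_health_status` is missing. The Python sketch below imitates that behavior on in-memory label sets. The `absent` function here is an illustration, not a Prometheus API, and the label values standing in for `$cluster`, `$namespace`, and `$workload` are hypothetical.

```python
import re


def absent(series, selectors):
    """Imitate PromQL absent(): 1 when no series matches all selectors, else None.

    series: list of label dicts, one per time series of the metric.
    selectors: label name -> regex, matched like PromQL's =~ (fully anchored).
    """
    for labels in series:
        if all(
            re.fullmatch(rx, labels.get(name, ""))
            for name, rx in selectors.items()
        ):
            return None  # at least one series exists: the alert stays silent
    return 1  # nothing matched: absent() yields 1, so "> 0" fires


# Hypothetical values standing in for $cluster, $namespace and $workload.
selectors = {
    "kube_cluster_name": "prod",
    "kube_namespace_name": "rook-ceph",
    "kube_workload_name": "rook-ceph-mgr.*",
}
healthy = [{
    "kube_cluster_name": "prod",
    "kube_namespace_name": "rook-ceph",
    "kube_workload_name": "rook-ceph-mgr-a",
}]

print(absent(healthy, selectors))  # a matching series is present
print(absent([], selectors))       # no series at all
```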
Agent Configuration
This is the default agent job for this integration:
- job_name: ceph-default
  tls_config:
    insecure_skip_verify: true
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: keep
    source_labels: [__meta_kubernetes_pod_host_ip]
    regex: __HOSTIPS__
  - action: drop
    source_labels: [__meta_kubernetes_pod_annotation_promcat_sysdig_com_omit]
    regex: true
  - action: keep
    source_labels:
    - __meta_kubernetes_pod_container_name
    - __meta_kubernetes_pod_annotation_prometheus_io_port
    regex: mgr;9283
  - action: replace
    source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
    target_label: __scheme__
    regex: (https?)
  - action: replace
    source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - action: replace
    source_labels: [__meta_kubernetes_pod_uid]
    target_label: sysdig_k8s_pod_uid
  - action: replace
    source_labels: [__meta_kubernetes_pod_container_name]
    target_label: sysdig_k8s_pod_container_name
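
To make the two regex-based rules in this job easier to follow, here is a minimal Python sketch of how the `mgr;9283` keep rule and the `__address__` replace rule behave. It assumes the Prometheus relabeling defaults (source label values joined with `;`, regexes anchored at both ends). The function names and the sample pod address are hypothetical; this only approximates the agent's behavior and is not its implementation.

```python
import re

# Regexes copied from the job above. Prometheus anchors relabel regexes at
# both ends, which re.fullmatch reproduces here.
KEEP_RE = re.compile(r"mgr;9283")
ADDRESS_RE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def join_labels(*values):
    """Join source label values with ';', the default relabel separator."""
    return ";".join(values)


def keep_target(container_name, port_annotation):
    """True when the container/port pair passes the 'mgr;9283' keep rule."""
    joined = join_labels(container_name, port_annotation)
    return KEEP_RE.fullmatch(joined) is not None


def rewrite_address(address, port_annotation):
    """Apply the __address__ replace rule; return the input if nothing matches."""
    m = ADDRESS_RE.fullmatch(join_labels(address, port_annotation))
    if m is None:
        return address  # a non-matching replace rule leaves the label unchanged
    return f"{m.group(1)}:{m.group(2)}"  # equivalent of replacement $1:$2


# Hypothetical pod address and annotated port, for illustration only.
print(keep_target("mgr", "9283"))
print(rewrite_address("10.0.0.5:6800", "9283"))
```

In this sketch, any port already present in the discovered address is dropped and replaced with the port from the `prometheus.io/port` annotation, so only the `mgr` container annotated with port 9283 is scraped.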