Ceph

Metrics, Dashboards, Alerts and more for Ceph Integration in Sysdig Monitor.

This integration is enabled by default.

Versions supported: > v15.2.12

This integration is out-of-the-box, so it doesn’t require any exporter.

This integration has 24 metrics.

Timeseries generated: 600 timeseries

List of Alerts

Alert | Description | Format
[Ceph] Ceph Manager is absent | Ceph Manager has disappeared from Prometheus target discovery. | Prometheus
[Ceph] Ceph Manager is missing replicas | Ceph Manager is missing replicas. | Prometheus
[Ceph] Ceph quorum at risk | Storage cluster quorum is low. Contact Support. | Prometheus
[Ceph] High number of leader changes | Ceph Monitor has seen a lot of leader changes per minute recently. | Prometheus

List of Dashboards

Ceph

The dashboard provides information on the status, capacity, latency and throughput of Ceph.

List of Metrics

Metric name
ceph_cluster_total_bytes
ceph_cluster_total_used_bytes
ceph_health_status
ceph_mgr_status
ceph_mon_metadata
ceph_mon_num_elections
ceph_mon_quorum_status
ceph_osd_apply_latency_ms
ceph_osd_commit_latency_ms
ceph_osd_in
ceph_osd_metadata
ceph_osd_numpg
ceph_osd_op_r
ceph_osd_op_r_latency_count
ceph_osd_op_r_latency_sum
ceph_osd_op_r_out_bytes
ceph_osd_op_w
ceph_osd_op_w_in_bytes
ceph_osd_op_w_latency_count
ceph_osd_op_w_latency_sum
ceph_osd_recovery_bytes
ceph_osd_recovery_ops
ceph_osd_up
ceph_pool_max_avail
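
Several of these metrics are most useful in combination. The following is a minimal sketch, not the dashboard's actual queries, of two common derivations: the mean read latency over an interval from the ceph_osd_op_r_latency_sum and ceph_osd_op_r_latency_count pair, and the used-capacity fraction from ceph_cluster_total_used_bytes and ceph_cluster_total_bytes. The function names and the sample numbers are hypothetical, and the latency is returned in whatever unit the sum is reported in.

```python
from typing import Optional


def mean_latency(sum_start: float, count_start: float,
                 sum_end: float, count_end: float) -> Optional[float]:
    """Mean latency per operation between two scrapes of a _sum/_count pair.

    Returns None when no operations completed in the interval, because the
    ratio is undefined in that case.
    """
    ops = count_end - count_start
    if ops <= 0:
        return None
    return (sum_end - sum_start) / ops


def used_fraction(total_bytes: float, used_bytes: float) -> Optional[float]:
    """Fraction of raw cluster capacity in use (0.0 to 1.0)."""
    if total_bytes <= 0:
        return None
    return used_bytes / total_bytes


if __name__ == "__main__":
    # Invented sample values, for illustration only.
    print(mean_latency(10.0, 100, 16.0, 400))
    print(used_fraction(1000, 250))
```

The same pattern applies to the write-side pair, ceph_osd_op_w_latency_sum and ceph_osd_op_w_latency_count.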

Prerequisites

Enable Prometheus Module

Ceph exposes Prometheus metrics through the Prometheus module of its manager (mgr) and annotates the manager pod with Prometheus annotations.

Make sure that the Prometheus module is activated in the Ceph cluster by running the following command:

ceph mgr module enable prometheus
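
Once the module is enabled, one way to confirm that the manager is serving the expected metrics is to inspect its Prometheus text output. The sketch below is illustrative only: it assumes you have already fetched the text (for example from port 9283, the port the default agent job matches, on the default /metrics path), and the helper names and the sample payload are hypothetical.

```python
import re


def metric_names(exposition_text: str) -> set:
    """Collect metric names from Prometheus text exposition output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first '{' (labels) or whitespace (value).
        names.add(re.split(r"[{\s]", line, maxsplit=1)[0])
    return names


def missing_metrics(exposition_text: str, expected) -> list:
    """Return the expected metric names that do not appear in the output."""
    return sorted(set(expected) - metric_names(exposition_text))


# Invented sample payload, for illustration only.
SAMPLE = """\
# HELP ceph_health_status Cluster health status
# TYPE ceph_health_status untyped
ceph_health_status 0.0
ceph_osd_up{ceph_daemon="osd.0"} 1.0
"""

if __name__ == "__main__":
    wanted = ["ceph_health_status", "ceph_osd_up", "ceph_mgr_status"]
    print(missing_metrics(SAMPLE, wanted))
```

An empty result for the metric names listed on this page suggests the module is exporting what the integration expects.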

Installation

Installing an exporter is not required for this integration.

Agent Configuration

The default agent job for this integration is as follows:

- job_name: ceph-default
  tls_config:
    insecure_skip_verify: true
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - action: keep
    source_labels: [__meta_kubernetes_pod_host_ip]
    regex: __HOSTIPS__
  - action: drop
    source_labels: [__meta_kubernetes_pod_annotation_promcat_sysdig_com_omit]
    regex: true
  - source_labels: [__meta_kubernetes_pod_phase]
    action: keep
    regex: Running
  - action: keep
    source_labels:
    - __meta_kubernetes_pod_container_name
    - __meta_kubernetes_pod_annotation_prometheus_io_port
    regex: mgr;9283
  - action: replace
    source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
    target_label: __scheme__
    regex: (https?)
  - action: replace
    source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - action: replace
    source_labels: [__meta_kubernetes_pod_uid]
    target_label: sysdig_k8s_pod_uid
  - action: replace
    source_labels: [__meta_kubernetes_pod_container_name]
    target_label: sysdig_k8s_pod_container_name
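
To make the relabeling rules above easier to follow, here is a small sketch that mimics two of them in Python. It relies on standard Prometheus relabeling behavior: the source label values are joined with ";" and the regex must match the whole joined string. The function names and sample addresses are hypothetical, and this is a simplified model rather than the agent's implementation.

```python
import re

# Same patterns as in the job above.
KEEP_RE = re.compile(r"mgr;9283")
ADDRESS_RE = re.compile(r"([^:]+)(?::\d+)?;(\d+)")


def keeps_target(container_name: str, port_annotation: str) -> bool:
    """Model the 'keep' rule: only the mgr container annotated with 9283 stays."""
    joined = f"{container_name};{port_annotation}"
    return KEEP_RE.fullmatch(joined) is not None


def rewrite_address(address: str, port_annotation: str) -> str:
    """Model the 'replace' rule that points __address__ at the annotated port.

    If the regex does not match (for example, the annotation is empty),
    the address is left unchanged, as a replace action would do.
    """
    match = ADDRESS_RE.fullmatch(f"{address};{port_annotation}")
    if match is None:
        return address
    # Equivalent to the replacement "$1:$2": host from group 1, port from group 2.
    return f"{match.group(1)}:{match.group(2)}"


if __name__ == "__main__":
    print(keeps_target("mgr", "9283"))
    print(rewrite_address("10.0.0.5:8080", "9283"))
```

In this model, a discovered address such as 10.0.0.5:8080 ends up as 10.0.0.5:9283, so the scrape goes to the port named in the pod annotation rather than the container port that service discovery reported.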