Shield Health Metrics

Cluster Shield and Host Shield expose metrics about their operational health. You can use Prometheus to collect these metrics and monitor the status of both components in your environments, ensuring continuous protection and visibility.

Prerequisites

  • Cluster Shield 1.9.0 or later installed. Earlier versions of Cluster Shield do not expose health metrics.
  • To populate the Cluster-Shield Monitoring dashboard in Sysdig Monitor with metrics, ensure Prometheus is enabled in your values.yaml file:

features:
  monitor:
    prometheus:
      enabled: true

Health Metrics Available

Cluster Shield exposes health metrics through the /metrics endpoint on port 8080.

Collect Health Metrics

To enable Prometheus to scrape health metrics from Cluster Shield, add the following annotations to the values.yaml configuration file:

cluster:
  pod_annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '8080'
    prometheus.io/path: '/metrics'

Once the annotations are applied, Prometheus scrapes the metrics from the specified path and port.
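
The annotations only take effect if the scraping Prometheus is configured to honor them. The following is a minimal sketch of such a scrape job, assuming you run your own Prometheus server with Kubernetes pod service discovery; this page does not say which Prometheus instance does the scraping in your setup. The job name shield-annotated-pods is hypothetical, and the relabeling rules follow the widely used prometheus.io/* annotation convention rather than anything specific to Cluster Shield.

```yaml
# prometheus.yml fragment (illustrative; the job name is hypothetical)
scrape_configs:
  - job_name: shield-annotated-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: 'true'
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use the prometheus.io/path annotation as the metrics path
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Rewrite the scrape address to use the prometheus.io/port annotation
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
```

With a job like this, a pod annotated with port '8080' and path '/metrics' would be scraped at that port and path without any per-pod configuration.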

View Metrics

Sysdig Monitor

To view Cluster Shield health metrics and confirm that Prometheus is scraping them successfully:

  1. Log in to Monitor.
  2. Go to Dashboards > Dashboards Manager.
  3. Locate the Cluster-Shield Monitoring dashboard. You can use the search bar to find it.
  4. Select the dashboard.

Host Shield

Host Shield exposes health metrics at the /metrics endpoint on port 9544.

To enable Host Shield metrics in your Shield Chart, use the following configuration:

host:
  additional_settings:
    prometheus_exporter:
      enabled: true
      export_health_metrics: true

  pod_annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '9544'
    prometheus.io/path: '/metrics'

The /metrics endpoint exposes metrics such as:

  • sysdig_agent_connected
  • sysdig_agent_healthy, where:
    • Value 1: healthy
    • Value 0: unhealthy

For a more extensive list of metrics, see the Agent Health metrics page.
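
Because sysdig_agent_healthy reports 0 when Host Shield is unhealthy, it lends itself to a simple alert. The following is a minimal sketch of a Prometheus alerting rule, assuming the metric is scraped into a Prometheus server that evaluates rule files. The group name, alert name, 5-minute hold period, and severity label are all hypothetical choices, not values defined by Sysdig.

```yaml
# Prometheus rule file fragment (illustrative; names and thresholds are hypothetical)
groups:
  - name: shield-health
    rules:
      - alert: HostShieldUnhealthy
        # sysdig_agent_healthy is 1 when healthy and 0 when unhealthy
        expr: sysdig_agent_healthy == 0
        # Require the condition to hold for 5 minutes before firing
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Host Shield reports an unhealthy state"
```

The for clause is there to avoid firing on a brief unhealthy reading, such as one during a restart; adjust it to match how quickly you need to be notified.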