        Collect Prometheus Metrics

        Sysdig supports collecting, storing, and querying Prometheus native metrics and labels. You can use Sysdig in the same way that you use Prometheus and leverage the Prometheus Query Language (PromQL) to create dashboards and alerts. Sysdig is also compatible with the Prometheus HTTP API, so you can query your monitoring data programmatically using PromQL and extend Sysdig to other platforms, such as Grafana.
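Because the API is Prometheus-compatible, any HTTP client can issue PromQL queries. The sketch below builds such a request with Python's standard library; the base URL and token shown are placeholders (assumptions) — substitute the API endpoint for your Sysdig region and your own API token.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_promql_request(base_url: str, api_token: str, query: str) -> Request:
    """Build a GET request against the Prometheus-compatible /api/v1/query endpoint."""
    url = f"{base_url}/api/v1/query?{urlencode({'query': query})}"
    return Request(url, headers={"Authorization": f"Bearer {api_token}"})


# Placeholder base URL and token; check your region's endpoint and use your own token.
req = build_promql_request(
    "https://app.sysdigcloud.com/prometheus",
    "YOUR-API-TOKEN",
    "avg(rate(container_cpu_usage_seconds_total[5m]))",
)
# resp = urlopen(req)   # uncomment to run against a live endpoint
```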

        From a metric collection standpoint, a lightweight Prometheus server is directly embedded into the Sysdig agent to facilitate metric collection. It supports targets, instances, and jobs, with filtering and relabeling using Prometheus syntax. You can configure the agent to identify processes that expose Prometheus metric endpoints on its own host and send the metrics to the Sysdig collector for storage and further processing.

        You do not need to install the Prometheus product itself to collect Prometheus metrics.

        Agent Compatibility

        See the Sysdig agent versions and compatibility with Prometheus features:

        Sysdig Agent v12.2.0 and Above

        The following features are enabled by default:

        • Automatically scrape any Kubernetes pods with the following annotation set: prometheus.io/scrape=true
        • Automatically scrape applications supported by Monitoring Integrations.

        For more information, see Set up the Environment.

        Sysdig Agent Prior to v12.0.0

        Manually enable Prometheus in dragent.yaml file:

          prometheus:
            enabled: true
        

        For more information, see Enable Promscrape V2 on Older Versions of Sysdig Agent.

        Learn More

        The following topics describe in detail how to set up the environment for service discovery, metrics collection, and further processing.

        See the following blog posts for additional context on Prometheus metrics and how such metrics are typically used.

        1 - Set Up the Environment

        If you are already leveraging Kubernetes Service Discovery, specifically the approach given in prometheus-kubernetes.yml, you might already have annotations attached to the pods that mark them as eligible for scraping. Such environments can quickly begin scraping the same metrics by using the Sysdig agent in a single step.

        If you are not using Kubernetes Service Discovery, follow the instructions given below:

        Annotation

        Ensure that the Kubernetes pods that contain your Prometheus exporters have been deployed with the following annotations to enable scraping, substituting the listening exporter-TCP-port:

        spec:
          template:
            metadata:
              annotations:
                prometheus.io/scrape: "true"
                prometheus.io/port: "exporter-TCP-port"
        

        The configuration above assumes your exporters use the typical endpoint called /metrics. If your exporter uses a different endpoint, specify it by adding the following additional annotation, substituting the exporter-endpoint-name:

        prometheus.io/path: "/exporter-endpoint-name"
        
        

        Sample Exporter

        Use the Sample Exporter to test your environment. You will quickly see auto-discovered Prometheus metrics being displayed on Sysdig Monitor. You can use this working example as a basis to similarly annotate your own exporters.
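If you prefer to hand-roll a test target, the following is a minimal sketch of an exporter built on Python's standard library, serving the Prometheus text exposition format on /metrics. This is only an illustration, not the Sysdig Sample Exporter itself; the metric names and port 9876 are made up and should match whatever you put in the pod annotations.

```python
import http.server


def render_metrics() -> str:
    """Render a counter and a gauge in the Prometheus text exposition format."""
    return (
        "# HELP demo_requests_total Total demo requests.\n"
        "# TYPE demo_requests_total counter\n"
        'demo_requests_total{path="/"} 42\n'
        "# HELP demo_temperature_celsius Current demo temperature.\n"
        "# TYPE demo_temperature_celsius gauge\n"
        "demo_temperature_celsius 21.50\n"
    )


class MetricsHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


# To serve on the port referenced by the pod annotation (9876 here as an example):
# http.server.HTTPServer(("", 9876), MetricsHandler).serve_forever()
```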

        2 - Enable Prometheus Native Service Discovery

        Prometheus service discovery is a standard method of finding endpoints to scrape for metrics. You configure prometheus.yaml and custom jobs to prepare for scraping endpoints in the same way you do for native Prometheus.

        For metric collection, a lightweight Prometheus server, named promscrape, is directly embedded into the Sysdig agent. Promscrape supports filtering and relabeling targets, instances, and jobs, and identifies them using the custom jobs configured in the prometheus.yaml file. The latest versions of the Sysdig agent (above v12.0.0) by default identify the processes that expose Prometheus metric endpoints on the agent's own host and send the metrics to the Sysdig collector for storage and further processing. On older versions of the Sysdig agent, you enable these features by configuring dragent.yaml.

        Working with Promscrape

        Promscrape is a lightweight Prometheus server that is embedded with the Sysdig agent. Promscrape scrapes metrics from Prometheus endpoints and sends them for storing and processing.

        Promscrape has two versions: Promscrape V1 and Promscrape V2.

        • Promscrape V2

          Promscrape itself discovers targets by using the standard Prometheus configuration (native Prometheus service discovery), allowing the use of relabel_configs to find or modify targets. An instance of promscrape runs on every node that is running a Sysdig agent and is intended to collect metrics from local as well as remote targets specified in the prometheus.yaml file. The prometheus.yaml file you create is shared across all such nodes.

          Promscrape V2 is enabled by default on Sysdig agent v12.5.0 and above. On older versions of Sysdig agent, you need to manually enable Promscrape V2, which allows for native Prometheus service discovery, by setting the prom_service_discovery parameter to true in dragent.yaml.

        • Promscrape V1

          Sysdig agent discovers scrape targets through the Sysdig process_filter rules. For more information, see Process Filter.

        About Promscrape V2

        Supported Features

        Promscrape V2 supports the following native Prometheus capabilities:

        • Relabeling: Promscrape V2 supports Prometheus native relabel_config and metric_relabel_configs. Relabel configuration enables the following:

          • Drop unnecessary metrics or unwanted labels from metrics

          • Edit the label format of the target before scraping the labels

        • Sample format: In addition to the regular sample format (metrics name, labels, and metrics reading), Promscrape V2 includes metrics type (counter, gauge, histogram, summary) to every sample sent to the agent.

        • Scraping configuration: Promscrape V2 supports all types of scraping configuration, such as federation, blackbox-exporter, and so on.

        • Label mapping: The metrics can be mapped to their source (pod, process) by using source labels, which in turn map certain Prometheus label names to known agent tags.

        Unsupported Features

        • Promscrape V2 does not support calculated metrics.

        • Promscrape V2 does not support cluster-wide features such as recording rules and alert management.

        • Service discovery configurations in Promscrape V1 (process_filter) and Promscrape V2 (prometheus.yaml) are incompatible and non-translatable.

        • Promscrape V2 collects metrics from both local and remote targets specified in the prometheus.yaml file. Because this file is shared by every agent, configuring promscrape to scrape remote targets causes each agent instance to scrape them, which duplicates the metrics.

        • Promscrape V2 does not have a cluster view, and therefore it ignores the configuration of recording rules and alerts, which are used in cluster-wide metrics collection. Such Prometheus configurations are not supported.

        • Sysdig uses __HOSTNAME__, which is not a standard Prometheus keyword.

        Enable Promscrape V2 on Older Versions of Sysdig Agent

        To enable Prometheus native service discovery on agent versions prior to 11.2:

        1. Open the dragent.yaml file.

        2. Set the following Prometheus Service Discovery parameter to true:

          prometheus:
            prom_service_discovery: true
          

          If true, promscrape.v2 is used. Otherwise, promscrape.v1 is used to scrape the targets.

        3. Restart the agent.

        Create Custom Jobs

        Prerequisites

        Ensure the following features are enabled:

        • Monitoring Integration
        • Promscrape V2

        If you are using Sysdig agent v12.0.0 or above, these features are enabled by default.

        Prepare Custom Job

        You set up custom jobs in the Prometheus configuration file to identify endpoints that expose Prometheus metrics. Sysdig agent uses these custom jobs to scrape endpoints by using promscrape, the lightweight Prometheus server embedded in it.

        Guidelines

        • Ensure that targets are scraped only by the agent running on the same node as the target. You do this by adding the host selection relabeling rules.

        • Use the Sysdig-specific relabeling rules to automatically apply the correct workload labels.

        Example Prometheus Configuration file

        The prometheus.yaml file comes with a default configuration for scraping the pods running on the local node. This configuration also includes the rules to preserve pod UID and container name labels for further correlation with Kubernetes State Metrics or Sysdig native metrics.

        Here is an example prometheus.yaml file that you can use to set up custom jobs.

        global:
          scrape_interval: 10s
        scrape_configs:
        - job_name: 'my_pod_job'
          sample_limit: 40000
          tls_config:
            insecure_skip_verify: true
          kubernetes_sd_configs:
          - role: pod
          relabel_configs:
            # Look for pod name starting with "my_pod_prefix" in namespace "my_namespace"
          - action: keep
            source_labels: [__meta_kubernetes_namespace,__meta_kubernetes_pod_name]
            separator: /
            regex: my_namespace/my_pod_prefix.+
        
            # In those pods try to scrape from port 9876
          - source_labels: [__address__]
            action: replace
            target_label: __address__
            regex: (.+?)(:\d+)?
            replacement: $1:9876
        
            # Trying to ensure we only scrape local targets
            # __HOSTIPS__ is replaced by promscrape with a regex list of the IP addresses
            # of all the active network interfaces on the host
          - action: keep
            source_labels: [__meta_kubernetes_pod_host_ip]
            regex: __HOSTIPS__
        
            # Holding on to pod-id and container name so we can associate the metrics
            # with the container (and cluster hierarchy)
          - action: replace
            source_labels: [__meta_kubernetes_pod_uid]
            target_label: sysdig_k8s_pod_uid
          - action: replace
            source_labels: [__meta_kubernetes_pod_container_name]
            target_label: sysdig_k8s_pod_container_name
        

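The port-rewrite rule in the job above relies on Prometheus anchoring relabeling regexes as full matches: the optional port group is stripped and the replacement forces port 9876. A rough Python sketch of what that __address__ replacement does (Prometheus actually uses RE2 rather than Python's re, but the behavior is the same for this pattern):

```python
import re

# Prometheus anchors relabel regexes, so we mimic that with fullmatch().
pattern = re.compile(r"(.+?)(:\d+)?")
replacement = r"\1:9876"  # Prometheus writes this as $1:9876


def relabel_address(address: str) -> str:
    """Mimic the relabel rule: force the scrape port to 9876."""
    m = pattern.fullmatch(address)
    return m.expand(replacement) if m else address


print(relabel_address("10.1.2.3:8080"))  # existing port is replaced
print(relabel_address("10.1.2.3"))       # missing port is appended
```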
        Default Scrape Job

        If Monitoring Integration is not enabled for you and you still want to automatically collect metrics from pods with the Prometheus annotations set (prometheus.io/scrape=true), add the following default scrape job to your prometheus.yaml file:

        - job_name: 'k8s-pods'
          sample_limit: 40000
          tls_config:
            insecure_skip_verify: true
          kubernetes_sd_configs:
          - role: pod
          relabel_configs:
            # Trying to ensure we only scrape local targets
            # __HOSTIPS__ is replaced by promscrape with a regex list of the IP addresses
            # of all the active network interfaces on the host
          - action: keep
            source_labels: [__meta_kubernetes_pod_host_ip]
            regex: __HOSTIPS__
          - action: keep
            source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            regex: true
          - action: replace
            source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
            target_label: __scheme__
            regex: (https?)
          - action: replace
            source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            target_label: __metrics_path__
            regex: (.+)
          - action: replace
            source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
        
            # Holding on to pod-id and container name so we can associate the metrics
            # with the container (and cluster hierarchy)
          - action: replace
            source_labels: [__meta_kubernetes_pod_uid]
            target_label: sysdig_k8s_pod_uid
          - action: replace
            source_labels: [__meta_kubernetes_pod_container_name]
            target_label: sysdig_k8s_pod_container_name
        

        Default Prometheus Configuration File

        Here is the default prometheus.yaml file.

        global:
          scrape_interval: 10s
        scrape_configs:
        - job_name: 'k8s-pods'
          tls_config:
            insecure_skip_verify: true
          kubernetes_sd_configs:
          - role: pod
          relabel_configs:
            # Trying to ensure we only scrape local targets
            # __HOSTIPS__ is replaced by promscrape with a regex list of the IP addresses
            # of all the active network interfaces on the host
          - action: keep
            source_labels: [__meta_kubernetes_pod_host_ip]
            regex: __HOSTIPS__
          - action: keep
            source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            regex: true
          - action: replace
            source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
            target_label: __scheme__
            regex: (https?)
          - action: replace
            source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            target_label: __metrics_path__
            regex: (.+)
          - action: replace
            source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
            # Holding on to pod-id and container name so we can associate the metrics
            # with the container (and cluster hierarchy)
          - action: replace
            source_labels: [__meta_kubernetes_pod_uid]
            target_label: sysdig_k8s_pod_uid
          - action: replace
            source_labels: [__meta_kubernetes_pod_container_name]
            target_label: sysdig_k8s_pod_container_name
        

        Understand the Prometheus Settings

        Scrape Interval

        The default scrape interval is 10 seconds. However, the value can be overridden per scraping job. The scrape interval configured in the prometheus.yaml is independent of the agent configuration.
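For example, a single job can override the global interval by setting its own scrape_interval (the job name, target, and 60s value below are illustrative):

```yaml
global:
  scrape_interval: 10s          # applies to every job unless overridden
scrape_configs:
- job_name: 'slow-exporter'     # illustrative job name
  scrape_interval: 60s          # per-job override
  static_configs:
  - targets: ['localhost:9090'] # illustrative target
```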

        Promscrape V2 reads prometheus.yaml and initiates scraping jobs.

        The metrics from each target are collected once per scrape interval and immediately forwarded to the agent. The agent sends metrics to the Sysdig collector every 10 seconds, including only the metrics received since the last transmission. If a scraping job has a scrape interval longer than 10 seconds, some agent transmissions might not include metrics from that job.

        Hostname Selection

        __HOSTIPS__ is replaced by the host IP addresses. Selection by the host IP address is preferred because of its reliability.

        __HOSTNAME__ is replaced with the actual hostname before promscrape starts scraping the targets. This allows promscrape to ignore targets running on other hosts.

        Relabeling Configuration

        The default Prometheus configuration file contains the following two relabeling configurations:

        - action: replace
          source_labels: [__meta_kubernetes_pod_uid]
          target_label: sysdig_k8s_pod_uid
        - action: replace
          source_labels: [__meta_kubernetes_pod_container_name]
          target_label: sysdig_k8s_pod_container_name
        

        These rules add two labels, sysdig_k8s_pod_uid and sysdig_k8s_pod_container_name, to every metric gathered from the local targets, containing the pod ID and the container name respectively. These labels are dropped from the metrics before they are sent to the Sysdig collector for further processing.

        Configure Prometheus Configuration File Using the Agent Configmap

        Here is an example for setting up the prometheus.yaml file using the agent configmap:

        apiVersion: v1
        data:
          dragent.yaml: |
            new_k8s: true
            k8s_cluster_name: your-cluster-name
            metrics_excess_log: true
            10s_flush_enable: true
            app_checks_enabled: false
            use_promscrape: true
            promscrape_fastproto: true
            prometheus:
              enabled: true
              prom_service_discovery: true
              log_errors: true
              max_metrics: 200000
              max_metrics_per_process: 200000
              max_tags_per_metric: 100
              ingest_raw: true
              ingest_calculated: false
            snaplen: 512
            tags: role:cluster
          prometheus.yaml: |
            global:
              scrape_interval: 10s
            scrape_configs:
            - job_name: 'haproxy-router'
              basic_auth:
                username: USER
                password: PASSWORD
              tls_config:
                insecure_skip_verify: true
              kubernetes_sd_configs:
              - role: pod
              relabel_configs:
                # Trying to ensure we only scrape local targets
                # __HOSTIPS__ is replaced by promscrape with a regex list of the IP addresses
                # of all the active network interfaces on the host
              - action: keep
                source_labels: [__meta_kubernetes_pod_host_ip]
                regex: __HOSTIPS__
              - action: keep
                source_labels:
                - __meta_kubernetes_namespace
                - __meta_kubernetes_pod_name
                separator: '/'
                regex: 'default/router-1-.+'
                # Holding on to pod-id and container name so we can associate the metrics
                # with the container (and cluster hierarchy)
              - action: replace
                source_labels: [__meta_kubernetes_pod_uid]
                target_label: sysdig_k8s_pod_uid
              - action: replace
                source_labels: [__meta_kubernetes_pod_container_name]
                target_label: sysdig_k8s_pod_container_name
        
        kind: ConfigMap
        metadata:
            labels:
              app: sysdig-agent
            name: sysdig-agent
            namespace: sysdig-agent
        

        3 - Migrating from Promscrape V1 to V2

        Promscrape is the lightweight Prometheus server in the Sysdig agent. An updated version of promscrape, named Promscrape V2, is available. This configuration is controlled by the prom_service_discovery parameter in the dragent.yaml file. To use the latest features, such as Service Discovery and Monitoring Integrations, you need to enable this option in your environment.

        Compare Promscrape V1 and V2

        The main difference between V1 and V2 is how scrape targets are determined.

        In V1, targets are found through the process-filtering rules configured in dragent.yaml, or in dragent.default.yaml if no rules are given in dragent.yaml. The process-filtering rules are applied to all the running processes on the host. Matches are made based on process attributes, such as the process name or the TCP ports being listened on, as well as associated contexts from Docker or Kubernetes, such as container labels or Kubernetes annotations.
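As an illustration, a V1 process_filter section in dragent.yaml might look like the following sketch; the port and path values are made up, and the exact rule attributes available are documented under Process Filter:

```yaml
prometheus:
  enabled: true
  process_filter:
    - include:
        port: 8080          # match processes listening on this port (example)
      conf:
        path: "/metrics"    # endpoint to scrape on matched processes (example)
```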

        With Promscrape V2, scrape targets are determined by scrape_configs fields in a prometheus.yaml file (or the prometheus-v2.default.yaml file if no prometheus.yaml exists). Because promscrape is adapted from the open-source Prometheus server, the scrape_config settings are compatible with the normal Prometheus configuration. Here is an example:

        global:
          scrape_interval: 10s
        scrape_configs:
        - job_name: 'my_pod_job'
          sample_limit: 40000
          tls_config:
            insecure_skip_verify: true
          kubernetes_sd_configs:
          - role: pod
          relabel_configs:
            # Look for pod name starting with "my_pod_prefix" in namespace "my_namespace"
          - action: keep
            source_labels: [__meta_kubernetes_namespace,__meta_kubernetes_pod_name]
            separator: /
            regex: my_namespace/my_pod_prefix.+
          - action: keep
            source_labels: [__meta_kubernetes_pod_label_app]
            regex: my_app_metrics
        
            # In those pods try to scrape from port 9876
          - source_labels: [__address__]
            action: replace
            target_label: __address__
            regex: (.+?)(:\d+)?
            replacement: $1:9876
        
            # Trying to ensure we only scrape local targets
            # __HOSTIPS__ is replaced by promscrape with a regex list of the IP addresses
            # of all the active network interfaces on the host
          - action: keep
            source_labels: [__meta_kubernetes_pod_host_ip]
            regex: __HOSTIPS__
        
            # Holding on to pod-id and container name so we can associate the metrics
            # with the container (and cluster hierarchy)
          - action: replace
            source_labels: [__meta_kubernetes_pod_uid]
            target_label: sysdig_k8s_pod_uid
          - action: replace
            source_labels: [__meta_kubernetes_pod_container_name]
            target_label: sysdig_k8s_pod_container_name
        

        Migrate Using Default Configuration

        The default configuration for Promscrape V1 triggers scraping based on standard Kubernetes pod annotations and container labels. The default configuration for V2 currently triggers scraping only based on the standard Kubernetes pod annotations, leveraging Prometheus native service discovery.

        Example Pod Annotations

        Annotation                Value                        Description
        prometheus.io/scrape      "true"                       Required field.
        prometheus.io/port        The port number to scrape    Optional. All pod-registered ports are scraped if omitted.
        prometheus.io/scheme      http or https                Optional. The default is http.
        prometheus.io/path        The URL path to scrape       Optional. The default is /metrics.
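Putting the annotations together, a pod template might carry the following; the port and path values here are illustrative, and the scheme and path lines can be omitted when the defaults apply:

```yaml
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"    # required
        prometheus.io/port: "9876"      # optional; example port
        prometheus.io/scheme: "http"    # optional; the default
        prometheus.io/path: "/metrics"  # optional; the default
```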

        Example Static Job

        - job_name: 'static10'
          static_configs:
            - targets: ['localhost:5010']
        

        Guidelines

        • Users running Kubernetes with the Promscrape V1 default rules, where scraping is triggered based on pod annotations, do not need to take any action to migrate to V2. The migration happens automatically.

        • Users operating non-Kubernetes environments might need to continue using V1 for now, depending on how scraping is triggered. As of today, Promscrape V2 does not support leveraging container and Docker labels to discover Prometheus metrics endpoints. If your environment depends on such labels, define static jobs with the IP:port pairs to be scraped.

        Migrate Using Custom Rules

        If you are relying on custom process_filter rules to collect metrics, use standard Prometheus configuration syntax to scrape the endpoints. We recommend one of the following:

        • Adopt the standard approach of adding the standard Prometheus annotations to your pods. For more information, see Migrate Using Default Configuration.
        • Write a Prometheus scrape_config that uses Kubernetes pod service discovery and the appropriate pod metadata to trigger scraping.

        See the following examples for converting your process_filter rules to Prometheus terminology.

        process_filter:

            - include:
                kubernetes.pod.annotation.sysdig.com/test: true

        Prometheus:

            - action: keep
              source_labels: [__meta_kubernetes_pod_annotation_sysdig_com_test]
              regex: true

        process_filter:

            - include:
                kubernetes.pod.label.app: sysdig

        Prometheus:

            - action: keep
              source_labels: [__meta_kubernetes_pod_label_app]
              regex: 'sysdig'

        process_filter:

            - include:
                container.label.com.sysdig.test: true

        Prometheus:

            Not supported.

        process_filter:

            - include:
                process.name: test

        Prometheus:

            Not supported.

        process_filter:

            - include:
                process.cmdline: sysdig-agent

        Prometheus:

            Not supported.

        process_filter:

            - include:
                port: 8080

        Prometheus:

            - action: keep
              source_labels: [__meta_kubernetes_pod_container_port_number]
              regex: '8080'

        process_filter:

            - include:
                container.image: sysdig-agent

        Prometheus:

            Not supported.

        process_filter:

            - include:
                container.name: sysdig-agent

        Prometheus:

            - action: keep
              source_labels: [__meta_kubernetes_pod_container_name]
              regex: 'sysdig-agent'

        process_filter:

            - include:
                appcheck.match: sysdig

        Prometheus:

            Appchecks are not compatible with Promscrape V2. See Configure Monitoring Integrations for supported integrations.

        Contact Support

        If you have any queries related to promscrape migration, contact Sysdig Support.