(Legacy) Integrations for Sysdig Monitor
Integrate metrics with Sysdig Monitor from a number of platforms,
orchestrators, and a wide range of applications. Sysdig collects metrics
from Prometheus, JMX, StatsD, Kubernetes, and many application stacks to
provide a 360-degree view of your infrastructure. Many metrics are
collected by default out of the box; you can also extend the integration
or create custom metrics.
Key Benefits
Collects the richest data set for cloud-native visibility and
security
Polls data and auto-discovers context to provide operational
and security insights
Extends the power of Prometheus metrics with additional insights
from other metric types and the infrastructure stack
Integrates Prometheus alerts and events for Kubernetes monitoring
needs
Exposes application metrics using Java JMX and MBeans monitoring
Key Integrations
Inbound
Prometheus Metrics
Describes how the Sysdig Agent automatically collects metrics
from Prometheus exporters, how to set up your environment, and how to
scrape Prometheus metrics from local as well as remote hosts.
Java Management Extensions (JMX) Metrics
Describes how to configure your Java virtual machines so Sysdig
Agent can collect JMX metrics using the JMX protocol.
StatsD Metrics
Describes how the Sysdig agent collects custom StatsD metrics with
an embedded StatsD server.
Node.JS Metrics
Illustrates how Sysdig is able to monitor node.js applications by
linking a library to the node.js codebase.
Integrate Applications
Describes the monitoring capabilities of Sysdig agent with
application check scripts or ‘app checks’.
Monitor Log Files
Learn how to search a string by using the chisel script called logwatcher.
AWS CloudWatch
Illustrates how to configure Sysdig to collect various types of CloudWatch metrics.
Agent Installation
Learn how to install Sysdig agents on supported platforms.
Outbound
Notification Channels
Learn how to add, edit, or delete a variety of notification channel types, and how to disable or delete notifications when they are not needed, for example, during scheduled downtime.
S3 Capture Storage
Learn how to configure Sysdig to use an AWS S3 bucket or custom S3 storage for storing Capture files.
For Sysdig instances deployed on IBM Cloud Monitoring with
Sysdig, an additional form of metrics
collection is offered: Platform metrics. Rather than being collected by
the Sysdig agent, when enabled, Platform metrics are reported to Sysdig
directly by the IBM Cloud infrastructure.
Platform metrics are metrics that are exposed by enabled services across
the IBM Cloud platform. These services have made metrics and pre-defined
dashboards for their services available by publishing metrics associated
with the customer’s space or account. Customers can view these platform
metrics alongside the metrics from their applications and other services
within IBM Cloud monitoring.
Enable this feature by logging into the IBM Cloud console and selecting
“Enable” for IBM Platform metrics under the Configure your resource
section when creating a new IBM Cloud Monitoring with a Sysdig instance,
as described
here.
1 - (Legacy) Collect Prometheus Metrics
Sysdig supports collecting, storing, and querying Prometheus native
metrics and labels. You can use Sysdig in the same way that you use
Prometheus and leverage
Prometheus Query Language (PromQL) to create dashboards and alerts.
Sysdig is compatible with Prometheus HTTP API to query your monitoring
data programmatically using PromQL and extend Sysdig to other platforms
like Grafana.
From a metric collection standpoint, a lightweight Prometheus server is
directly embedded into the Sysdig agent to facilitate metric collection.
This also supports targets, instances, and jobs with filtering and
relabeling using Prometheus syntax. You can configure the agent to
identify the processes that expose Prometheus metric endpoints on its
own host and send the metrics to the Sysdig collector for storing and further
processing.

This document uses metric and time series interchangeably. The
description of configuration parameters refers to “metric”, but in
strict Prometheus terms, those imply time series. That is, applying a
limit of 100 metrics implies applying a limit on time series, where all
the time series data might not have the same metric name.
The Prometheus product
itself does not necessarily have to be installed for Prometheus metrics
collection.
See the Sysdig agent versions and their compatibility with Prometheus features.
Learn More
The following topics describe in detail how to configure the Sysdig agent for service discovery, metrics collection, and further processing.
See the following blog posts for additional context on the Prometheus
metric and how such metrics are typically used.
1.1 - (Legacy) Working with Prometheus Metrics
The Sysdig agent uses its visibility into all running processes (at both
the host and container levels) to find eligible targets for scraping
Prometheus metrics. By default, no scraping is attempted. Once the
feature is enabled, the agent assembles a list of eligible targets,
applies filtering rules, and sends the results back to the Sysdig collector.
Latest Prometheus Features
Sysdig agent v12.0 or above is required for the following capabilities:
Sysdig agent v10.0 or above is required for the following capabilities:
The new PromQL data cannot be visualized by using the Dashboard
v2 Histogram. Use time-series based visualization for the
histogram metrics.
Prerequisites and Guidelines
Sysdig agent v10.0.0 or above is required for the latest
Prometheus features.
The Prometheus feature is enabled in the dragent.yaml file:
prometheus:
  enabled: true
See Setting up the Environment for more information.
The endpoints of the target should be reachable over a TCP connection
from the agent. The agent scrapes a target, remote or local, specified
by the IP:Port or the URL in dragent.yaml.
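For illustration only, here is a minimal sketch of both styles of target specification; the addresses, the port, and the remote_node_exporter service name are placeholders, and the remote_services syntax is covered in detail in the remote hosts section later in this document:
prometheus:
  enabled: true
  process_filter:
    - include:
        port: 9100                  # local target, identified by its listening port (placeholder)
        conf:
          port: 9100
  remote_services:
    - remote_node_exporter:         # hypothetical remote endpoint, identified by URL
        conf:
          url: "http://10.0.0.12:9100/metrics"   # placeholder address
          tags:
            service: remote_node_exporter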
Service Discovery
To use native Prometheus service discovery, enable Promscrape V2 as described in Enable Prometheus Native Service Discovery. This section covers the Sysdig way of service discovery, which involves configuring
process filters in the Sysdig agent.
The way service discovery works in the Sysdig agent differs from that of
the Prometheus server.
While the Prometheus server has built-in integration with several
service discovery mechanisms and reads its configuration settings from the
prometheus.yml file, the Sysdig agent auto-discovers any process
(exporter or instrumented) that matches the specifications in the
dragent.yaml file and instructs the embedded lightweight Prometheus
server to retrieve the metrics from it.
The lightweight Prometheus server in the agent is named promscrape and
is controlled by the flag of the same name in the dragent.yaml file.
See Configuring Sysdig Agent for more information.
Unlike the Prometheus server that can scrape processes running on all
the machines in a cluster, the agent can scrape only those processes
that are running on the host that it is installed on.
Within the set of eligible processes/ports/endpoints, the agent scrapes
only the ports that are exporting Prometheus metrics and will stop
attempting to scrape or retry on ports based on how they respond to
attempts to connect and scrape them. It is therefore strongly
recommended that you create a configuration that restricts the process
and ports for attempted scraping to the minimum expected range for your
exporters. This minimizes the potential for unintended side-effects in
both the Agent and your applications due to repeated failed connection
attempts.
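As a sketch of such a restrictive configuration (the process name and port are placeholders, not part of the default configuration):
prometheus:
  enabled: true
  process_filter:
    - include:
        process.name: my-exporter   # hypothetical exporter process name
        port: 9100                  # only match processes listening on this port
        conf:
          port: 9100                # scrape only this port
          path: "/metrics"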
The end to end metric collection can be summarized as follows:
A process is determined to be eligible for possible scraping if it
positively matches against a series of Process Filter
include/exclude rules. See Process Filter
for more information.
The Agent will then attempt to scrape an eligible process at a
/metrics endpoint on all of its listening TCP ports unless
additional configuration is present to restrict scraping to a subset
of ports and/or another endpoint name.
Upon receiving the metrics, the agent applies the following rules
before sending them to the Sysdig collector.
The metrics ultimately appear in the Sysdig Monitor Explore interface in
the Prometheus section.

1.2 - (Legacy) Set up the Environment
Quick Start For Kubernetes Environments
Prometheus users who are already leveraging Kubernetes Service
Discovery
(specifically the approach in this sample
prometheus-kubernetes.yml)
may already have Annotations attached to the Pods that mark them as
eligible for scraping. Such environments can quickly begin scraping the
same metrics using the Sysdig Agent in a couple of easy steps.
Enable the Prometheus metrics feature in the Sysdig Agent. Assuming
you are deploying using
DaemonSets,
the needed config can be added to the Agent’s dragent.yaml
by
including the following in your DaemonSet YAML (placing it in the
env
section for the sysdig-agent
container):
- name: ADDITIONAL_CONF
value: "prometheus:\n enabled: true"
Ensure the Kubernetes Pods that contain your Prometheus exporters
have been deployed with the following Annotations to enable scraping
(substituting the listening exporter-TCP-port)
:
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "exporter-TCP-port"
The configuration above assumes your exporters use the typical
endpoint called /metrics. If an exporter is using a different
endpoint, this can also be specified by adding the following
additional optional Annotation, substituting the
exporter-endpoint-name:
prometheus.io/path: "/exporter-endpoint-name"
If you try this Kubernetes Deployment of a simple
exporter,
you will quickly see auto-discovered Prometheus metrics being displayed
in Sysdig Monitor. You can use this working example as a basis to
similarly Annotate your own exporters.
If you have Prometheus exporters not deployed in annotated Kubernetes
Pods that you would like to scrape, the following sections describe the
full set of options to configure the Agent to find and scrape your
metrics.
Quick Start for Container Environments
In order for Prometheus scraping to work in a Docker-based container
environment, set the following labels on the application containers,
substituting <exporter-port> and <exporter-path> with
the correct port and path where metrics are exported by your
application:
io.prometheus.scrape=true
io.prometheus.port=<exporter-port>
io.prometheus.path=<exporter-path>
For example, if mysqld-exporter is to be scraped, spin up the
container as follows:
docker run -d -l io.prometheus.scrape=true -l io.prometheus.port=9104 -l io.prometheus.path=/metrics mysqld-exporter
1.3 - (Legacy) Configuring Sysdig Agent
This feature is not supported with Promscrape V2. For information on different versions of Promscrape and migrating to the latest version, see Migrating from Promscrape V1 to V2.
As is typical for the agent, the default configuration for the feature is specified in dragent.default.yaml,
and you can override the defaults by configuring parameters in dragent.yaml. For each parameter
you do not set in dragent.yaml, the defaults in dragent.default.yaml remain in effect.
Main Configuration Parameters
| Parameter | Default | Description |
|---|---|---|
| prometheus | See below | Turns Prometheus scraping on and off. |
| process_filter | See below | Specifies which processes may be eligible for scraping. See [Process Filter](/en/docs/sysdig-monitor/monitoring-integrations/legacy-integrations/legacycollect-prometheus-metrics/configuring-sysdig-agent/#process-filter). |
| use_promscrape | See below | Determines whether to use promscrape for scraping Prometheus metrics. |
promscrape
Promscrape is a lightweight Prometheus server that is embedded with the
Sysdig agent. The use_promscrape parameter controls whether to use it
to scrape Prometheus endpoints.
| Parameter | Default | Description |
|---|---|---|
| use_promscrape | true | Promscrape has two versions: Promscrape V1 and Promscrape V2. With V1, the Sysdig agent discovers scrape targets through the process_filter rules. With V2, promscrape itself discovers targets by using the standard Prometheus configuration, allowing the use of relabel_configs to find or modify targets. |
prometheus
The prometheus section defines the behavior related to Prometheus
metrics collection and analysis. It allows you to turn the feature on,
set a limit from the agent side on the number of metrics to be scraped,
and determine whether to report histogram metrics and log failed scrape
attempts.
| Parameter | Default | Description |
|---|---|---|
| enabled | false | Turns Prometheus scraping on and off. |
| interval | 10 | How often (in seconds) the agent scrapes a port for Prometheus metrics. |
| prom_service_discovery | true | Enables native Prometheus service discovery. If disabled, promscrape.v1 is used to scrape the targets. See Enable Prometheus Native Service Discovery. On agent versions prior to 11.2, the default is false. |
| max_metrics | 1000 | The maximum number of total Prometheus metrics that will be scraped across all targets. This value of 1000 is the maximum per-agent, and is a separate limit from other Custom Metrics (for example, StatsD, JMX, and App Checks). |
| timeout | 1 | The amount of time (in seconds) the agent will wait while scraping a Prometheus endpoint before timing out. As of agent v10.0, this parameter is only used when promscrape is disabled. Since promscrape is now the default, timeout can be considered deprecated; however, it is still used when you explicitly disable promscrape. |
Process Filter
The process_filter section specifies which of the processes known
by an agent may be eligible for scraping.
Note that once you specify a process_filter in your
dragent.yaml, this replaces the entire Prometheus
process_filter section (i.e. all the rules) shown in
dragent.default.yaml.
The Process Filter is specified in a series of include and
exclude rules that are evaluated top-to-bottom for each process
known by an Agent. If a process matches an include rule, scraping
will be attempted via a /metrics endpoint on each listening TCP
port for the process, unless a conf section also appears within
the rule to further restrict how the process will be scraped. See
conf for more information.
Multiple patterns can be specified in a single rule, in which case all
patterns must match for the rule to be a match (AND logic).
Within a pattern value, simple "glob" wildcarding may be used, where
* matches any number of characters (including none) and ?
matches any single character. Note that due to YAML syntax, when using
wildcards, be sure to enclose the value in quotes ("*").
The table below describes the supported patterns in Process Filter
rules. To provide realistic examples, we'll use a simple sample
Prometheus exporter (source code here)
which can be deployed as a container using the Docker command line
below. To help illustrate some of the configuration options, this sample
exporter presents Prometheus metrics on /prometheus instead of the
more common /metrics endpoint, which will be shown in the example
configurations further below.
# docker run -d -p 8080:8080 \
--label class="exporter" \
--name my-java-app \
luca3m/prometheus-java-app
# ps auxww | grep app.jar
root 11502 95.9 9.2 3745724 753632 ? Ssl 15:52 1:42 java -jar /app.jar --management.security.enabled=false
# curl http://localhost:8080/prometheus
...
random_bucket{le="0.005",} 6.0
random_bucket{le="0.01",} 17.0
random_bucket{le="0.025",} 51.0
...
| Pattern | Description | Example |
|---|---|---|
| container.image | Matches if the process is running inside a container running the specified image. | - include:<br>    container.image: luca3m/prometheus-java-app |
| container.name | Matches if the process is running inside a container with the specified name. | - include:<br>    container.name: my-java-app |
| container.label.* | Matches if the process is running in a container that has a Label matching the given value. | - include:<br>    container.label.class: exporter |
| kubernetes.<object>.annotation.* kubernetes.<object>.label.* | Matches if the process is attached to a Kubernetes object (Pod, Namespace, etc.) that is marked with the Annotation/Label matching the given value. Note: This pattern does not apply to the Docker-only command line shown above, but would instead apply if the exporter were installed as a Kubernetes Deployment using this example YAML. Note: See Kubernetes Objects, below, for information on the full set of supported Annotations and Labels. | - include:<br>    kubernetes.pod.annotation.prometheus.io/scrape: true |
| process.name | Matches the name of the running process. | - include:<br>    process.name: java |
| process.cmdline | Matches a command line argument. | - include:<br>    process.cmdline: "*app.jar*" |
| port | Matches if the process is listening on one or more TCP ports. The pattern for a single rule can specify a single port as shown in this example, or a single range (e.g. 8079-8081), but does not support comma-separated lists of ports/ranges. Note: This parameter is only used to confirm if a process is eligible for scraping based on the ports on which it is listening. For example, if a process is listening on one port for application traffic and has a second port open for exporting Prometheus metrics, it would be possible to specify the application port here (but not the exporting port), and the exporting port in the conf section (but not the application port), and the process would be matched as eligible and the exporting port would be scraped. | - include:<br>    port: 8080 |
| appcheck.match | Matches if an Application Check with the specific name or pattern is scheduled to run for the process. | - exclude:<br>    appcheck.match: "*" |
Instead of the include examples shown above, each of which would have
matched our process, the previously described ability to combine
multiple patterns in a single rule means the following very strict
configuration would also have matched:
- include:
    container.image: luca3m/prometheus-java-app
    container.name: my-java-app
    container.label.class: exporter
    process.name: java
    process.cmdline: "*app.jar*"
    port: 8080
conf
Each include rule in the process_filter may include a
conf portion that further describes how scraping will be attempted
on the eligible process. If a conf portion is not included,
scraping will be attempted at a /metrics endpoint on all listening
ports of the matching process. The possible settings:
| Setting | Description | Example |
|---|---|---|
| port | Either a static number for a single TCP port to be scraped, or a container/Kubernetes Label name or Kubernetes Annotation specified in curly braces. If the process is running in a container that is marked with this Label or is attached to a Kubernetes object (Pod, Namespace, etc.) that is marked with this Annotation/Label, scraping will be attempted only on the port specified as the value of the Label/Annotation. Note: The Label/Annotation to match against does not include the leading prefix (for example, container.label. or kubernetes.pod.annotation.). Note: See Kubernetes Objects for information on the full set of supported Annotations and Labels. Note: If running the exporter inside a container, this should specify the port number that the exporter process in the container is listening on, not the port that the container exposes to the host. | port: 8080<br>- or -<br>port: "{container.label.io.prometheus.port}"<br>- or -<br>port: "{kubernetes.pod.annotation.prometheus.io/port}" |
| port_filter | A set of include and exclude rules that define the ultimate set of listening TCP ports for an eligible process on which scraping may be attempted. Note that the syntax is different from the port pattern option from within the higher-level include rule in the process_filter. Here a given rule can include single ports, comma-separated lists of ports (enclosed in square brackets), or contiguous port ranges (without brackets). | port_filter:<br>  - include: 8080<br>  - exclude: [9092,9200,9300]<br>  - include: 9090-9100 |
| path | Either the static specification of an endpoint to be scraped, or a container/Kubernetes Label name or Kubernetes Annotation specified in curly braces. If the process is running in a container that is marked with this Label or is attached to a Kubernetes object (Pod, Namespace, etc.) that is marked with this Annotation/Label, scraping will be attempted via the endpoint specified as the value of the Label/Annotation. If path is not specified, or is specified but the Agent does not find the Label/Annotation attached to the process, the common Prometheus exporter default of /metrics will be used. Note: The Label/Annotation to match against does not include the leading prefix. Note: See Kubernetes Objects for information on the full set of supported Annotations and Labels. | path: "/prometheus"<br>- or -<br>path: "{container.label.io.prometheus.path}"<br>- or -<br>path: "{kubernetes.pod.annotation.prometheus.io/path}" |
| host | A hostname or IP address. The default is localhost. | host: 192.168.1.101<br>- or -<br>host: subdomain.example.com<br>- or -<br>host: localhost |
| use_https | When set to true, connectivity to the exporter will only be attempted through HTTPS instead of HTTP. It is false by default. (Available in Agent version 0.79.0 and newer.) | use_https: true |
| ssl_verify | When set to true, verification will be performed for the server certificates for an HTTPS connection. It is false by default. Verification was enabled by default before 0.79.0. (Available in Agent version 0.79.0 and newer.) | ssl_verify: true |
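As a sketch that combines several of these conf settings under one include rule (the process name, ports, and path are placeholders):
process_filter:
  - include:
      process.name: java
      conf:
        path: "/prometheus"          # non-default endpoint
        use_https: true
        ssl_verify: false
        port_filter:
          - include: 9090-9100
          - exclude: [9092]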
Authentication Integration
As of agent version 0.89, Sysdig can collect Prometheus metrics from
endpoints requiring authentication. Use the parameters below to enable
this function.
For username/password authentication:
username
password
For authentication using a token:
auth_token_path
For certificate authentication with a certificate key:
auth_cert_path
auth_key_path
Token substitution is also supported for all the authorization
parameters. For instance, a username can be taken from a Kubernetes
annotation by specifying
username: "{kubernetes.service.annotation.prometheus.openshift.io/username}"
conf Authentication Example
Below is an example of the dragent.yaml
section showing all the
Prometheus authentication configuration options, on OpenShift,
Kubernetes, and etcd.
In this example:
The username/password
are taken from a default annotation used by
OpenShift.
The auth token
path is commonly available in Kubernetes
deployments.
The certificate
and key
used here for etcd may normally not be
as easily accessible to the agent. In this case they were extracted
from the host namespace, constructed into Kubernetes secrets, and
then mounted into the agent container.
prometheus:
  enabled: true
  process_filter:
    - include:
        port: 1936
        conf:
          username: "{kubernetes.service.annotation.prometheus.openshift.io/username}"
          password: "{kubernetes.service.annotation.prometheus.openshift.io/password}"
    - include:
        process.name: kubelet
        conf:
          port: 10250
          use_https: true
          auth_token_path: "/run/secrets/kubernetes.io/serviceaccount/token"
    - include:
        process.name: etcd
        conf:
          port: 2379
          use_https: true
          auth_cert_path: "/run/secrets/etcd/client-cert"
          auth_key_path: "/run/secrets/etcd/client-key"
Kubernetes Objects
As described above, there are multiple configuration options that can be
set based on auto-discovered values for Kubernetes Labels and/or
Annotations. The format in each case begins with
"kubernetes.OBJECT.annotation."
or "kubernetes.OBJECT.label."
where
OBJECT
can be any of the following supported Kubernetes object types:
daemonSet
deployment
namespace
node
pod
replicaSet
replicationController
service
statefulset
The configuration text you add after the final dot becomes the name of
the Kubernetes Label/Annotation that the Agent will look for. If the
Label/Annotation is discovered attached to the process, the value of
that Label/Annotation will be used for the configuration option.
Note that there are multiple ways for a Kubernetes Label/Annotation to
be attached to a particular process. One of the simplest examples of
this is the Pod-based approach shown in Quick Start For Kubernetes
Environments.
However, as an example alternative to marking at the Pod level, you
could attach Labels/Annotations at the Namespace level, in which case
auto-discovered configuration options would apply to all processes
running in that Namespace regardless of whether they’re in a Deployment,
DaemonSet, ReplicaSet, etc.
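For instance, a hedged sketch of a rule keyed on Namespace-level annotations; the annotation names follow the prometheus.io convention used elsewhere in this document but are otherwise an assumption:
process_filter:
  - include:
      kubernetes.namespace.annotation.prometheus.io/scrape: true
      conf:
        path: "{kubernetes.namespace.annotation.prometheus.io/path}"
        port: "{kubernetes.namespace.annotation.prometheus.io/port}"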
1.4 - (Legacy) Filtering Prometheus Metrics
As of Sysdig agent 9.8.0, a lightweight Prometheus server named promscrape
is embedded in the agent, and a prometheus.yaml file is included as
part of the configuration files. Using open source Prometheus
capabilities, Sysdig allows you to
filter Prometheus metrics at the source before ingestion. To do so, you
will:
Ensure that the Prometheus scraping is enabled in the
dragent.yaml
file.
prometheus:
  enabled: true
On agent v9.8.0 and above, enable the feature by setting the
use_promscrape parameter to true in dragent.yaml. See
Enable Filtering at Ingestion.
Edit the configuration in the prometheus.yaml file. See Edit
Prometheus Configuration File.
Sysdig-specific configuration is found in the prometheus.yaml file.
Enable Filtering at Ingestion
On agent v9.8.0, in order for target filtering to work, the
use_promscrape parameter in dragent.yaml must be set to true.
For more information on configuration, see Configuring Sysdig
Agent.
On agent v10.0, use_promscrape is enabled by default, which implies that
promscrape is used for scraping Prometheus metrics.
Filtering configuration is optional. The absence of prometheus.yaml
will not change the existing behavior of the agent.
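As a sketch, the relevant dragent.yaml entries would be along these lines; this assumes use_promscrape sits at the top level of dragent.yaml, as the parameter table above lists it separately from the prometheus block:
prometheus:
  enabled: true
use_promscrape: true    # enables filtering at ingestion via promscrape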
Edit Prometheus Configuration File
About the Prometheus Configuration File
The prometheus.yaml file contains mostly the filtering/relabeling
configuration, expressed as a list of key-value pairs representing target process
attributes. Replace the keys and values with the tags corresponding to your
environment.
In this file, you configure the filtering and relabeling behavior described
in the sections below.
The prometheus.yaml file is installed alongside dragent.yaml. For
the most part, the syntax of prometheus.yaml complies with the
standard Prometheus configuration.
Default Configuration
A configuration with empty key-value pairs is considered a default
configuration. The default configuration will be applied to all the
processes to be scraped that don’t have a matching filtering
configuration. In Sample Prometheus Configuration
File,
the job_name: 'default'
section represents the default configuration.
Kubernetes Environments
If the agent runs in Kubernetes environments (Open
Source/OpenShift/GKE), include the following Kubernetes objects as
key-value pairs. See Agent Install:
Kubernetes for details on
agent installation.
For example:
sysdig_sd_configs:
  - tags:
      namespace: backend
      deployment: my-api
In addition to the aforementioned tags, any of these object types can be
matched against:
daemonset: my_daemon
deployment: my_deployment
hpa: my_hpa
namespace: my_namespace
node: my_node
pod: my_pod
replicaset: my_replica
replicationcontroller: my_controller
resourcequota: my_quota
service: my_service
stateful: my_statefulset
For Kubernetes/OpenShift/GKE deployments, prometheus.yaml shares the
same ConfigMap with dragent.yaml.
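As an illustrative sketch of that ConfigMap layout (the ConfigMap name, namespace, and contents are assumptions; keep the names used by your agent deployment):
apiVersion: v1
kind: ConfigMap
metadata:
  name: sysdig-agent          # assumed name; match your agent deployment
  namespace: sysdig-agent     # assumed namespace
data:
  dragent.yaml: |
    prometheus:
      enabled: true
  prometheus.yaml: |
    global:
      scrape_interval: 20s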
Docker Environments
In Docker environments, include attributes such as container, host,
port, and more. For example:
sysdig_sd_configs:
  - tags:
      host: my-host
      port: 8080
For Docker-based deployments, prometheus.yaml
can be mounted from the
host.
Sample Prometheus Configuration File
global:
  scrape_interval: 20s
scrape_configs:
  - job_name: 'default'
    sysdig_sd_configs: # default config
    relabel_configs:
  - job_name: 'my-app-job'
    sample_limit: 2000
    sysdig_sd_configs: # apply this filtering config only to my-app
      - tags:
          namespace: backend
          deployment: my-app
    metric_relabel_configs:
      # Drop all metrics starting with http_
      - source_labels: [__name__]
        regex: "http_(.+)"
        action: drop
      # Drop all metrics for which the city label equals atlantis
      - source_labels: [city]
        regex: "atlantis"
        action: drop
1.5 - (Legacy) Example Configuration
This topic introduces you to default and specific Prometheus
configurations.
Default Configuration
As an example that pulls together many of the configuration elements
shown above, consider the default Agent configuration that’s inherited
from the dragent.default.yaml
.
prometheus:
  enabled: true
  interval: 10
  log_errors: true
  max_metrics: 1000
  max_metrics_per_process: 100
  max_tags_per_metric: 20
  # Filtering processes to scan. Processes not matching a rule will not
  # be scanned
  # If an include rule doesn't contain a port or port_filter in the conf
  # section, we will scan all the ports that a matching process is listening to.
  process_filter:
    - exclude:
        process.name: docker-proxy
    - exclude:
        container.image: sysdig/agent
    # special rule to exclude processes matching configured prometheus appcheck
    - exclude:
        appcheck.match: prometheus
    - include:
        container.label.io.prometheus.scrape: "true"
        conf:
          # Custom path definition
          # If the Label doesn't exist we'll still use "/metrics"
          path: "{container.label.io.prometheus.path}"
          # Port definition
          # - If the Label exists, only scan the given port.
          # - If it doesn't, use port_filter instead.
          # - If there is no port_filter defined, skip this process
          port: "{container.label.io.prometheus.port}"
          port_filter:
            - exclude: [9092,9200,9300]
            - include: 9090-9500
            - include: [9913,9984,24231,42004]
    - exclude:
        container.label.io.prometheus.scrape: "false"
    - include:
        kubernetes.pod.annotation.prometheus.io/scrape: true
        conf:
          path: "{kubernetes.pod.annotation.prometheus.io/path}"
          port: "{kubernetes.pod.annotation.prometheus.io/port}"
    - exclude:
        kubernetes.pod.annotation.prometheus.io/scrape: false
Consider the following about this default configuration:
All Prometheus scraping is disabled by default. To enable the entire
configuration shown here, you would only need to add the following
to your dragent.yaml:
prometheus:
  enabled: true
Once this option is enabled, any pods (in the case of Kubernetes) that have
the right annotations set, or containers (otherwise) that have the right labels
set, will automatically be scraped.
Once enabled, this default configuration is ideal for the use case
described in the Quick Start For Kubernetes
Environments.
A Process Filter rule excludes processes that are likely to exist in
most environments but are known to never export Prometheus metrics,
such as the Docker Proxy and the Agent itself.
Another Process Filter rule ensures that any processes configured to
be scraped by the legacy Prometheus application check will not be
scraped.
Another Process Filter rule is tailored to use container Labels.
Processes marked with the container Label io.prometheus.scrape
will become eligible for scraping, and if further marked with
container Labels io.prometheus.port
and/or
io.prometheus.path
, scraping will be attempted only on this
port and/or endpoint. If the container is not marked with the
specified path Label, scraping the /metrics
endpoint will be
attempted. If the container is not marked with the specified port
Label, any listening ports in the port_filter
will be
attempted for scraping (this port_filter
in the default is set
for the range of ports for common Prometheus
exporters,
with exclusions for ports in the range that are known to be used by
other applications that are not exporters).
The final Process Filter Include rule is tailored to the use case
described in the Quick Start For Kubernetes
Environments.
Scrape a Single Custom Process
If you need to scrape a single custom process, for instance, a java
process listening on port 9000 with path /prometheus
, add the
following to the dragent.yaml
:
prometheus:
  enabled: true
  process_filter:
    - include:
        process.name: java
        port: 9000
        conf:
          # ensure we only scrape port 9000 as opposed to all ports this process may be listening to
          port: 9000
          path: "/prometheus"
This configuration overrides the default process_filter
section shown
in Default Configuration.
You can add relevant rules from the default configuration to this to
further filter down the metrics.
port has different purposes depending on where it's placed in the
configuration. When placed under the include section, it is a
condition for matching the include rule.
Placing a port under conf indicates that only that particular port
is scraped when the rule is matched, as opposed to all the ports that the
process could be listening on.
In this example, the first rule will be matched for the Java process
listening on port 9000, and only port 9000 of that process
will be scraped.
Scrape a Single Custom Process Based on Container Labels
If you still want to scrape based on container labels, you could just
append the relevant rules from the defaults to the process_filter
. For
example:
prometheus:
  enabled: true
  process_filter:
    - include:
        process.name: java
        port: 9000
        conf:
          # ensure we only scrape port 9000 as opposed to all ports this process may be listening to
          port: 9000
          path: "/prometheus"
    - exclude:
        process.name: docker-proxy
    - include:
        container.label.io.prometheus.scrape: "true"
        conf:
          path: "{container.label.io.prometheus.path}"
          port: "{container.label.io.prometheus.port}"
port has a different meaning depending on where it's placed in the
configuration. When placed under the include section, it's a condition
for matching the include rule.
Placing port under conf indicates that only that port is scraped
when the rule is matched, as opposed to all the ports that the process
could be listening on.
In this example, the first rule will be matched for the process
listening on port 9000, and only port 9000 of that Java process
will be scraped.
Container Environment
With this default configuration enabled, a containerized install of our
example exporter shown below would be automatically scraped via the
Agent.
# docker run -d -p 8080:8080 \
--label io.prometheus.scrape="true" \
--label io.prometheus.port="8080" \
--label io.prometheus.path="/prometheus" \
luca3m/prometheus-java-app
Kubernetes Environment
In a Kubernetes-based environment, a Deployment with the Annotations as
shown in this example
YAML
would be scraped by enabling the default configuration.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: prometheus-java-app
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: prometheus-java-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: "/prometheus"
        prometheus.io/port: "8080"
    spec:
      containers:
        - name: prometheus-java-app
          image: luca3m/prometheus-java-app
          imagePullPolicy: Always
Non-Containerized Environment
This is an example of a non-containerized environment or a containerized
environment that doesn’t use Labels or Annotations. The following
dragent.yaml
would override the default and do per-second scrapes
of our sample exporter and also a second exporter on port 5005, each at
their respective non-standard endpoints. This can be thought of as a
conservative “whitelist” type of configuration since it restricts
scraping to only exporters that are known to exist in the environment
and the ports on which they’re known to export Prometheus metrics.
prometheus:
  enabled: true
  interval: 1
  process_filter:
    - include:
        process.cmdline: "*app.jar*"
        conf:
          port: 8080
          path: "/prometheus"
    - include:
        port: 5005
        conf:
          port: 5005
          path: "/wacko"
port has a different meaning depending on where it's placed in the
configuration. When placed under the include section, it's a condition
for matching the include rule. Placing port under conf indicates
that only that port is scraped when the rule is matched, as opposed to
all the ports that the process could be listening on.
In this example, the first rule will be matched for the process
*app.jar*, and the Java process will be scraped only on port 8080,
as opposed to all the ports that *app.jar* could be listening
on. The second rule will be matched for port 5005, and the process
listening on 5005 will be scraped only on that port.
1.6 - (Legacy) Logging and Troubleshooting
Logging
After the Agent begins scraping Prometheus metrics, there may be a delay
of up to a few minutes before the metrics become visible in Sysdig
Monitor. To help quickly confirm your configuration is correct, starting
with Agent version 0.80.0, the following log line will appear in the
Agent log the first time since starting that it has found and is
successfully scraping at least one Prometheus exporter:
2018-05-04 21:42:10.048, 8820, Information, 05-04 21:42:10.048324 Starting export of Prometheus metrics
As this is an INFO level log message, it will appear in Agents using the
default logging settings. To reveal even more detail, increase the Agent
log level to DEBUG, which
produces a message like the following that reveals the name of a
specific metric first detected. You can then look for this metric to be
visible in Sysdig Monitor shortly after.
2018-05-04 21:50:46.068, 11212, Debug, 05-04 21:50:46.068141 First prometheus metrics since agent start: pid 9583: 5 metrics including: randomSummary.95percentile
Troubleshooting
See the previous section for information on expected log messages during
successful scraping. If you have enabled Prometheus and are not seeing
the Starting export
message shown there, revisit your
configuration.
It is also suggested to leave the configuration option in its default
setting of log_errors: true
, which will reveal any issues
scraping eligible processes in the Agent log.
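A sketch of that setting in dragent.yaml, shown here only to make its placement explicit:
prometheus:
  enabled: true
  log_errors: true    # default; failed scrape attempts are written to the agent log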
For example, here is an error message for a failed scrape of a TCP port
that was listening but not accepting HTTP requests:
2017-10-13 22:00:12.076, 4984, Error, sdchecks[4987] Exception on running check prometheus.5000: Exception('Timeout when hitting http://localhost:5000/metrics',)
2017-10-13 22:00:12.076, 4984, Error, sdchecks, Traceback (most recent call last):
2017-10-13 22:00:12.076, 4984, Error, sdchecks, File "/opt/draios/lib/python/sdchecks.py", line 246, in run
2017-10-13 22:00:12.076, 4984, Error, sdchecks, self.check_instance.check(self.instance_conf)
2017-10-13 22:00:12.076, 4984, Error, sdchecks, File "/opt/draios/lib/python/checks.d/prometheus.py", line 44, in check
2017-10-13 22:00:12.076, 4984, Error, sdchecks, metrics = self.get_prometheus_metrics(query_url, timeout, "prometheus")
2017-10-13 22:00:12.076, 4984, Error, sdchecks, File "/opt/draios/lib/python/checks.d/prometheus.py", line 105, in get_prometheus_metrics
2017-10-13 22:00:12.077, 4984, Error, sdchecks, raise Exception("Timeout when hitting %s" % url)
2017-10-13 22:00:12.077, 4984, Error, sdchecks, Exception: Timeout when hitting http://localhost:5000/metrics
Here is an example error message for a failed scrape of a port that was
responding to HTTP requests on the /metrics
endpoint but not
responding with valid Prometheus-format data. The invalid endpoint is
responding as follows:
# curl http://localhost:5002/metrics
This ain't no Prometheus metrics!
And the corresponding error message in the Agent log, indicating no
further scraping will be attempted after the initial failure:
2017-10-13 22:03:05.081, 5216, Information, sdchecks[5219] Skip retries for Prometheus error: could not convert string to float: ain't
2017-10-13 22:03:05.082, 5216, Error, sdchecks[5219] Exception on running check prometheus.5002: could not convert string to float: ain't
1.7 - (Legacy) Collecting Prometheus Metrics from Remote Hosts
This feature is not supported with Promscrape V2. For information on different versions of Promscrape and migrating to the latest version, see Migrating from Promscrape V1 to V2.
Sysdig Monitor can collect Prometheus metrics from remote endpoints with
minimal configuration. Remote endpoints (remote hosts) refer to hosts
where the Sysdig Agent cannot be deployed, for example, a Kubernetes master
node on managed Kubernetes services such as GKE and EKS, where user
workloads cannot be deployed and therefore no Agent is involved.
Enabling remote scraping on such hosts is as simple as identifying an
Agent to perform the scraping and declaring the endpoint configurations
with a remote services section in the Agent configuration file.
The collected Prometheus metrics are reported under and associated with
the Agent that performed the scraping as opposed to associating them
with a process.
Preparing the Configuration File
Multiple Agents can share the same configuration. Therefore, determine
which one of those Agents scrape the remote endpoints with the
dragent.yaml
file. This is applicable to both
Create a separate configuration section for remote services in the
Agent configuration file under the prometheus
configuration.
Include a configuration section for each remote endpoint, and add
either a URL or host/port (and an optional path) parameter to each
section to identify the endpoint to scrape. The optional path
identifies the resource at the endpoint. An empty path parameter
defaults to the "/metrics"
endpoint for scraping.
Optionally, add custom tags for each endpoint configuration for
remote services. In the absence of tags, metric reporting might not
work as expected when multiple endpoints are involved. Agents cannot
distinguish similar metrics scraped from multiple endpoints unless
those metrics are uniquely identified by tags.
To help you get started, an example configuration for Kubernetes is
given below:
prometheus:
  remote_services:
    - prom_1:
        kubernetes.node.annotation.sysdig.com/region: europe
        kubernetes.node.annotation.sysdig.com/scraper: true
        conf:
          url: "https://xx.xxx.xxx.xy:5005/metrics"
          tags:
            host: xx.xxx.xxx.xy
            service: prom_1
            scraping_node: "{kubernetes.node.name}"
    - prom_2:
        kubernetes.node.annotation.sysdig.com/region: india
        kubernetes.node.annotation.sysdig.com/scraper: true
        conf:
          host: xx.xxx.xxx.yx
          port: 5005
          use_https: true
          tags:
            host: xx.xxx.xxx.yx
            service: prom_2
            scraping_node: "{kubernetes.node.name}"
    - prom_3:
        kubernetes.pod.annotation.sysdig.com/prom_3_scraper: true
        conf:
          url: "{kubernetes.pod.annotation.sysdig.com/prom_3_url}"
          tags:
            service: prom_3
            scraping_node: "{kubernetes.node.name}"
    - haproxy:
        kubernetes.node.annotation.yourhost.com/haproxy_scraper: true
        conf:
          host: "mymasternode"
          port: 1936
          path: "/metrics"
          username: "{kubernetes.node.annotation.yourhost.com/haproxy_username}"
          password: "{kubernetes.node.annotation.yourhost.com/haproxy_password}"
          tags:
            service: router
In the above example, scraping is triggered by node and pod annotations.
You can add annotations to nodes and pods by using the
kubectl annotate command as follows:
kubectl annotate node mynode --overwrite sysdig.com/region=india sysdig.com/scraper=true yourhost.com/haproxy_scraper=true yourhost.com/haproxy_username=admin yourhost.com/haproxy_password=admin
In this example, you set annotations on a node to trigger scraping of the
prom_2 and haproxy services as defined in the above configuration.
Preparing Container Environments
An example configuration for Docker environment is given below:
prometheus:
  remote_services:
    - prom_container:
        container.label.com.sysdig.scrape_xyz: true
        conf:
          url: "https://xyz:5005/metrics"
          tags:
            host: xyz
            service: xyz
In order for remote scraping to work in a Docker-based container
environment, set the com.sysdig.scrape_xyz=true
label to the Agent
container. For example:
docker run -d --name sysdig-agent --restart always --privileged --net host --pid host -e ACCESS_KEY=<KEY> -e COLLECTOR=<COLLECTOR> -e SECURE=true -e TAGS=example_tag:example_value -v /var/run/docker.sock:/host/var/run/docker.sock -v /dev:/host/dev -v /proc:/host/proc:ro -v /boot:/host/boot:ro -v /lib/modules:/host/lib/modules:ro -v /usr:/host/usr:ro --shm-size=512m sysdig/agent
Substitute <KEY>, <COLLECTOR>, and TAGS with your account
key, collector, and tags respectively.
Syntax of the Rules
The syntax of the rules for remote_services is almost identical to
that of the process_filter, with the exception of the include/exclude rules.
The remote_services section does not use include/exclude rules.
The process_filter uses include and exclude rules of which only the
first match against a process is applied, whereas in
the remote_services section, each rule has a corresponding service name
and all the matching rules are applied.
Rule Conditions
The rule conditions work the same way as those for the process_filter
.
The only caveat is that the rules will be matched against the Agent
process and container because the remote process/context is unknown.
Therefore, matches for container labels and annotations work as before
but they must be applicable to the Agent container as well. For
instance, node annotations will apply because the Agent container runs
on a node.
For annotations, multiple patterns can be specified in a single rule, in
which case all patterns must match for the rule to be a match (AND
operator). In the following example, the endpoint will not be considered
unless both the annotations match:
kubernetes.node.annotation.sysdig.com/region_scraper: europe
kubernetes.node.annotation.sysdig.com/scraper: true
That is, Kubernetes nodes belonging to only the Europe region are
considered for scraping.
Authenticating Sysdig Agent
The Sysdig Agent requires the necessary permissions on the remote host to scrape
for metrics. The authentication
methods for local scraping work for authenticating agents on remote hosts as
well, but the authorization parameters work only in the agent context.
Authentication based on a certificate-key pair requires the pair to be
constructed into a Kubernetes secret and mounted to the agent.
In token-based authentication, make sure the agent token has access
rights on the remote endpoint to do the scraping.
Use annotations to retrieve the username/password instead of passing them
in plaintext. Any annotation enclosed in curly braces will be
replaced by the value of the annotation. If the annotation doesn't
exist, the value will be an empty string. Token substitution is
supported for all the authorization parameters. Because
authorization works only in the Agent context, credentials cannot be
automatically retrieved from the target pod. Therefore, use an
annotation in the Agent pod to pass them. To do so, set the password
into an annotation for the selected Kubernetes object.
In the following example, an HAProxy account is authenticated with the
password supplied in the yourhost.com/haproxy_password
annotation on
the agent node.
- haproxy:
    kubernetes.node.annotation.yourhost.com/haproxy_scraper: true
    conf:
      host: "mymasternode"
      port: 1936
      path: "/metrics"
      username: "{kubernetes.node.annotation.yourhost.com/haproxy_username}"
      password: "{kubernetes.node.annotation.yourhost.com/haproxy_password}"
      tags:
        service: router
2 - (Legacy) Integrate Applications (Default App Checks)
The Sysdig agent supports additional application monitoring capabilities
with application check scripts or ‘app checks’. These are a set of
plugins that poll for custom metrics from the specific applications
which export them via status or management pages: e.g. NGINX, Redis,
MongoDB, Memcached and more.
Many app checks are enabled by default in the agent and when a
supported application is found, the correct app check script will be
called and metrics polled automatically.
However, if default connection parameters are changed in your
application, you will need to modify the app check connection parameters
in the Sysdig Agent configuration file (dragent.yaml)
to match your
application.
In some cases, you may also need to enable the metrics reporting
functionality in the application before the agent can poll them.
This page details how to make configuration changes in the agent’s
configuration file, and provides an application integration example.
Click the Supported Applications links for application-specific details.
Python Version for App Checks:
As of agent version 9.9.0, the default version of Python used for app
checks is Python 3.
Python 2 can still be used by setting the following option in your
dragent.yaml
:
python_binary: <path to python 2.7 binary>
For containerized agents, this path will be: /usr/bin/python2.7
Edit dragent.yaml to Integrate or Modify Application Checks
Out of the box, the Sysdig agent will gather and report on a wide
variety of pre-defined metrics. It can also accommodate any number of
custom parameters for additional metrics collection.
The agent relies on a pair of configuration files to define metrics
collection parameters:
| File | Description |
|---|---|
| dragent.default.yaml | The core configuration file. You can look at it to understand more about the default configurations provided. Location: /opt/draios/etc/dragent.default.yaml. CAUTION: This file should never be edited. |
| dragent.yaml | The configuration file where parameters can be added, either directly in YAML as name/value pairs, or using environment variables such as ADDITIONAL_CONF. Location: /opt/draios/etc/dragent.yaml. |
The dragent.yaml file can be accessed and edited in several ways,
depending on how the agent was installed.
Review Understanding the Agent Config
Files for details.
The examples in this section presume you are entering YAML code directly
into dragent.yaml, under the app_checks section.
Find the default settings
To find the default app-checks for already supported applications, check
the dragent.default.yaml
file.
(Location: /opt/draios/etc/dragent.default.yaml
.)
app_checks:
  - name: APP_NAME
    check_module: APP_CHECK_SCRIPT
    pattern:
      comm: PROCESS_NAME
    conf:
      host: IP_ADDR
      port: PORT
| Parameter | Sub-key | Description | Example |
|---|---|---|---|
| app_checks | | The main section of dragent.default.yaml that contains a list of pre-configured checks. | n/a |
| name | | Every check should have a unique name:, which will be displayed on Sysdig Monitor as the process name of the integrated application. | e.g. MongoDB |
| check_module | | The name of the Python plugin that polls the data from the designated application. All the app check scripts can be found inside the /opt/draios/lib/python/checks.d directory. | e.g. elastic |
| pattern | | This section is used by the Sysdig agent to match a process with a check. Four kinds of keys can be specified along with any arguments to help distinguish them. | n/a |
| | comm | Matches the process name as seen in /proc/PID/status | |
| | port | Matches based on the port used (i.e. MySQL identified by port: 3306) | |
| | arg | Matches any process arguments | |
| | exe | Matches the process exe as seen in the /proc/PID/exe link | |
| conf | | This section is specific to each plugin. You can specify any key/values that the plugin supports. | |
| | host | Application-specific. A URL or IP address | |
| | port | | |
{…}
tokens can be used as values, which will be substituted with
values from process info.
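For example, a hedged sketch of token substitution in an app check entry (the check name and process name are hypothetical):
app_checks:
  - name: my-app            # hypothetical check name
    pattern:
      comm: my-app-server   # hypothetical process name
    conf:
      host: localhost
      port: "{port}"        # {port} is replaced with the port the matched process is listening on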
Change the default settings
To override the defaults:
Copy relevant code blocks from dragent.default.yaml
into
dragent.yaml
. (Or copy the code from the appropriate app
check integration page in this documentation section.)
Any entries copied into dragent.yaml
file will override similar
entries in dragent.default.yaml
.
Never modify dragent.default.yaml
, as it will be overwritten
whenever the agent is updated.
Modify the parameters as needed.
Be sure to use proper YAML. Pay attention to consistent spacing for
indents (as shown) and list all check entries under an app_checks:
section title.
Save the changes and restart the agent with
service dragent restart or docker restart sysdig-agent.
Metrics for the relevant application should appear in the Sysdig Monitor
interface under the appropriate name.
Example 1: Change Name and Add Password
Here is a sample app-check entry for Redis. The app_checks
section was
copied from the dragent.default.yaml
file and modified for a specific
instance.
customerid: 831f3-Your-Access-Key-9401
tags: local:sf,acct:dev,svc:db
app_checks:
  - name: redis-6380
    check_module: redisdb
    pattern:
      comm: redis-server
    conf:
      host: 127.0.0.1
      port: PORT
      password: PASSWORD
Edits made:
The check name was changed to redis-6380 and a password entry was added.
As the token PORT is used, it will be translated to the actual port
where Redis is listening.
Example 2: Increase Polling Interval
The default interval for an application check to be run by the agent is
set to every second. You can increase the interval per application check
by adding the interval: parameter (under the - name section) and the
number of seconds to wait before each run of the script.
interval: must be put into each app check entry that should run less
often; there is no global setting.
Example: Run the NTP check once per minute:
app_checks:
  - name: ntp
    interval: 60
    pattern:
      comm: systemd
    conf:
      host: us.pool.ntp.org
Disabling
Disable a Single Application Check
Sometimes the default configuration shipped with the Sysdig agent does
not work for you or you may not be interested in checks for a single
application. To turn a single check off, add an entry like this to
disable it:
app_checks:
  - name: nginx
    enabled: false
This entry overrides the default configuration of the nginx
check,
disabling it.
If you are using the ADDITIONAL_CONF
parameter to modify your
container agent’s configuration, you would add an entry like this to
your Docker run command (or Kubernetes manifest):
-e ADDITIONAL_CONF="app_checks:\n - name: nginx\n enabled: false\n"
Disable ALL Application Checks
If you do not need it or otherwise want to disable the application check
functionality, you can add the following entry to the agent’s user
settings configuration file /opt/draios/etc/dragent.yaml
:
app_checks_enabled: false
Restart the agent as shown immediately above for either the native Linux
agent installation or the container agent installation.
Sysdig allows custom application check-script configurations to be
created for each individual container in the infrastructure, via the
environment variable SYSDIG_AGENT_CONF
. This avoids the need for
multiple edits and entries to achieve the container-specific
customization, by enabling application teams to configure their own
checks.
The SYSDIG_AGENT_CONF
variable stores a YAML-formatted configuration
for the app check, and is used to match app-check configurations. It can
be stored directly within the Docker file.
The syntax is the same as dragent.yaml
syntax.
The example below defines a per container app-check for Redis in the
Dockerfile, using the SYSDIG_AGENT_CONF
environment variable:
FROM redis
# This config file adds a password for accessing redis instance
ADD redis.conf /
ENV SYSDIG_AGENT_CONF { "app_checks": [{ "name": "redis", "check_module": "redisdb", "pattern": {"comm": "redis-server"}, "conf": { "host": "127.0.0.1", "port": "6379", "password": "protected"} }] }
ENTRYPOINT ["redis-server"]
CMD [ "/redis.conf" ]
The example below shows how parameters can be added to a container
started with docker run, by either using the -e/--env flag,
or injecting the parameters using an orchestration system (for example,
Kubernetes):
PER_CONTAINER_CONF='{ "app_checks": [{ "name": "redis", "check_module": "redisdb", "pattern": {"comm": "redis-server"}, "conf": { "host": "127.0.0.1", "port": "6379", "password": "protected"} }] }'
docker run --name redis -v /tmp/redis.conf:/etc/redis.conf -e SYSDIG_AGENT_CONF="${PER_CONTAINER_CONF}" -d redis /etc/redis.conf
Metrics Limit
Metric limits are defined by your payment plan. If more metrics are
needed, please contact your sales representative with your use case.
Note that a metric with the same name but a different tag will count
as a unique metric by the agent. For example, a metric 'user.clicks' with
the tag 'country=us' and another 'user.clicks' with the
tag 'country=it' are considered two metrics, which count towards the
limit.
Supported Applications
Below is the supported list of applications the agent will automatically
poll.
Some app-check scripts will need to be configured since no defaults
exist, while some applications may need to be configured to output their
metrics. Click a highlighted link to see application-specific notes.
- Active MQ
- Apache
- Apache CouchDB
- Apache HBase
- Apache Kafka
- Apache Zookeeper
- Consul
- CEPH
- Couchbase
- Elasticsearch
- etcd
- fluentd
- Gearman
- Go
- Gunicorn
- HAProxy
- HDFS
- HTTP
- Jenkins
- JVM
- Lighttpd
- Memcached
- Mesos/Marathon
- MongoDB
- MySQL
- NGINX and NGINX Plus
- NTP
- PGBouncer
- PHP-FPM
- Postfix
- PostgreSQL
- Prometheus
- RabbitMQ
- RedisDB
- Supervisord
- SNMP
- TCP
You can also
2.1 - Apache
Apache web server is open-source software for web
server creation, deployment, and management. If Apache is
installed in your environment, the Sysdig agent will connect using the
mod_status module on Apache. You may need to edit the default entries
in the agent configuration file to connect. See the Default
Configuration, below.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
Apache Setup
Install mod_status
on your Apache servers and enable ExtendedStatus.
The following configuration is required. If it is already present, then
un-comment the lines, otherwise add the configuration.
LoadModule status_module modules/mod_status.so
...
<Location /server-status>
SetHandler server-status
Order Deny,Allow
Deny from all
Allow from localhost
</Location>
...
ExtendedStatus On
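Optionally, you can verify that the status endpoint responds before configuring the agent. This is only a quick check and assumes Apache is listening on port 80 on the same host; adjust the port if your installation differs:
curl "http://localhost/server-status?auto"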
Sysdig Agent Configuration
Review how to edit dragent.yaml to Integrate or Modify Application
Checks.
Apache has a common default for exposing metrics. The process command
name can be either apache2
or httpd
. By default, the Sysdig agent
will look for the process apache2
. If named differently in your
environment (e.g. httpd
), edit the configuration file to match the
process name, as shown in the example below.
Default Configuration
By default, Sysdig’s dragent.default.yaml
uses the following code to
connect with Apache and collect all metrics.
app_checks:
- name: apache
check_module: apache
pattern:
comm: apache2
conf:
apache_status_url: "http://localhost:{port}/server-status?auto"
log_errors: false
Example
If it is necessary to edit dragent.yaml
to change the process name,
use the following example and update the comm
with the value httpd.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
app_checks:
- name: apache
check_module: apache
pattern:
comm: httpd
conf:
apache_status_url: "http://localhost/server-status?auto"
log_errors: false
Metrics Available
The Apache metrics are listed in the metrics dictionary here: Apache Metrics.
UI Examples

2.2 - Apache Kafka
Apache Kafka is a distributed streaming
platform. Kafka is used for building real-time data pipelines and
streaming apps. It is horizontally scalable, fault-tolerant, wicked
fast, and runs in production in thousands of companies. If Kafka is
installed on your environment, the Sysdig agent will automatically
connect. See the Default Configuration, below.
The Sysdig agent automatically collects metrics from Kafka via JMX
polling. You need to provide consumer names and topics in the agent
config file to collect consumer-based Kafka metrics.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
Kafka Setup
Kafka will automatically expose all metrics. You do not need to add
anything on the Kafka instance.
Zstandard, one of the compression codecs available in the Kafka
integration, is only included in Kafka versions 2.1.0 or newer. See also
the Apache documentation.
Sysdig Agent Configuration
Review how to edit dragent.yaml to Integrate or Modify Application
Checks.
Metrics from Kafka via JMX polling are already configured in the agent’s
default-settings configuration file. Metrics for consumers, however,
need to use app-checks to poll the Kafka and Zookeeper API. You need to
provide consumer names and topics in dragent.yaml
file.
Default Configuration
Since consumer names and topics are environment-specific, a default
configuration is not present in dragent.default.yaml
.
Refer to the following examples for adding Kafka checks to
dragent.yaml.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example 1: Basic Configuration
A basic example with sample consumer and topic names:
app_checks:
- name: kafka
check_module: kafka_consumer
pattern:
comm: java
arg: kafka.Kafka
conf:
kafka_connect_str: "127.0.0.1:9092" # kafka address, usually localhost as we run the check on the same instance
zk_connect_str: "localhost:2181" # zookeeper address, may be different than localhost
zk_prefix: /
consumer_groups:
sample-consumer-1: # sample consumer name
sample-topic-1: [0, ] # sample topic name and partitions
sample-consumer-2: # sample consumer name
sample-topic-2: [0, 1, 2, 3] # sample topic name and partitions
Example 2: Store Consumer Group Info (Kafka 9+)
From Kafka 9 onwards, you can store consumer group config info inside
Kafka itself for better performance.
app_checks:
- name: kafka
check_module: kafka_consumer
pattern:
comm: java
arg: kafka.Kafka
conf:
kafka_connect_str: "localhost:9092"
zk_connect_str: "localhost:2181"
zk_prefix: /
kafka_consumer_offsets: true
consumer_groups:
sample-consumer-1: # sample consumer name
sample-topic-1: [0, ] # sample topic name and partitions
If the kafka_consumer_offsets entry is set to true, the app check will
look for consumer offsets in Kafka. The app check will also look in
Kafka if zk_connect_str is not set.
Example 3: Aggregate Partitions at the Topic Level
To enable aggregation of partitions at the topic level, use
kafka_consumer_topics with aggregate_partitions = true.
In this case the app check will aggregate the lag and offset values
across partitions, reducing the number of metrics collected.
Set aggregate_partitions = false to disable aggregation. In this case,
the app check will report lag and offset values for each partition.
app_checks:
- name: kafka
check_module: kafka_consumer
pattern:
comm: java
arg: kafka.Kafka
conf:
kafka_connect_str: "localhost:9092"
zk_connect_str: "localhost:2181"
zk_prefix: /
kafka_consumer_offsets: true
kafka_consumer_topics:
aggregate_partitions: true
consumer_groups:
sample-consumer-1: # sample consumer name
sample-topic-1: [0, ] # sample topic name and partitions
sample-consumer-2: # sample consumer name
sample-topic-2: [0, 1, 2, 3] # sample topic name and partitions
Example 4: Add Tags
Optional tags can be applied to every emitted metric, service check,
and/or event.
app_checks:
- name: kafka
check_module: kafka_consumer
pattern:
comm: java
arg: kafka.Kafka
conf:
kafka_connect_str: "localhost:9092"
zk_connect_str: "localhost:2181"
zk_prefix: /
consumer_groups:
sample-consumer-1: # sample consumer name
sample-topic-1: [0, ] # sample topic name and partitions
tags: ["key_first_tag:value_1", "key_second_tag:value_2", "key_third_tag:value_3"]
Example 5: SSL and Authentication
If SSL and authentication are enabled on Kafka, use the following
configuration.
app_checks:
- name: kafka
check_module: kafka_consumer
pattern:
comm: java
arg: kafka.Kafka
conf:
kafka_consumer_offsets: true
kafka_connect_str: "127.0.0.1:9093"
zk_connect_str: "localhost:2181"
zk_prefix: /
consumer_groups:
test-group:
test: [0, ]
test-4: [0, 1, 2, 3]
security_protocol: SASL_SSL
sasl_mechanism: PLAIN
sasl_plain_username: <USERNAME>
sasl_plain_password: <PASSWORD>
ssl_check_hostname: true
ssl_cafile: <SSL_CA_FILE_PATH>
#ssl_context: <SSL_CONTEXT>
#ssl_certfile: <CERT_FILE_PATH>
#ssl_keyfile: <KEY_FILE_PATH>
#ssl_password: <PASSWORD>
#ssl_crlfile: <SSL_FILE_PATH>
Configuration Keywords and Descriptions
Keyword | Description | Default |
---|---|---|
security_protocol (str) | Protocol used to communicate with brokers. | PLAINTEXT |
sasl_mechanism (str) | String picking the SASL mechanism when security_protocol is SASL_PLAINTEXT or SASL_SSL. | Currently only PLAIN is supported |
sasl_plain_username (str) | Username for SASL PLAIN authentication. | |
sasl_plain_password (str) | Password for SASL PLAIN authentication. | |
ssl_context (ssl.SSLContext) | Pre-configured SSLContext for wrapping socket connections. If provided, all other ssl_* configurations will be ignored. | none |
ssl_check_hostname (bool) | Flag to configure whether the SSL handshake should verify that the certificate matches the broker's hostname. | true |
ssl_cafile (str) | Optional filename of the CA file to use in certificate verification. | none |
ssl_certfile (str) | Optional filename of a file in PEM format containing the client certificate, as well as any CA certificates needed to establish the certificate's authenticity. | none |
ssl_keyfile (str) | Optional filename containing the client private key. | none |
ssl_password (str) | Optional password to be used when loading the certificate chain. | none |
ssl_crlfile (str) | Optional filename containing the CRL to check for certificate expiration. By default, no CRL check is done. When providing a file, only the leaf certificate will be checked against this CRL. The CRL can only be checked with Python 2.7.9+. | none |
Example 6: Regex for Consumer Groups and Topics
As of Sysdig agent version 0.94, the Kafka app check has added
optional regex (regular expression) support for Kafka consumer groups
and topics.
Regex Configuration:
- No new metrics are added with this feature.
- A new parameter, consumer_groups_regex, is added, which includes regex
  for consumers and topics from Kafka. Consumer offsets stored in
  Zookeeper are not collected.
- Regex for topics is optional. When not provided, all topics under the
  consumer will be reported.
- The regex Python syntax is documented here:
  https://docs.python.org/3.7/library/re.html#regular-expression-syntax
- If both consumer_groups and consumer_groups_regex are provided at the
  same time, matched consumer groups from both parameters will be merged.
Sample configuration:
app_checks:
- name: kafka
check_module: kafka_consumer
pattern:
comm: java
arg: kafka.Kafka
conf:
kafka_connect_str: "localhost:9092"
zk_connect_str: "localhost:2181"
zk_prefix: /
kafka_consumer_offsets: true
# Regex can be provided in following format
# consumer_groups_regex:
# 'REGEX_1_FOR_CONSUMER_GROUPS':
# - 'REGEX_1_FOR_TOPIC'
# - 'REGEX_2_FOR_TOPIC'
consumer_groups_regex:
'consumer*':
- 'topic'
- '^topic.*'
- '.*topic$'
- '^topic.*'
- 'topic\d+'
- '^topic_\w+'
Example
Regex | Description | Matches | Does not match |
---|---|---|---|
topic_\d+ | All strings having the keyword topic followed by _ and one or more digit characters (equal to [0-9]) | my-topic_1, topic_23, topic_5-dev | topic_x, my-topic-1, topic-123 |
topic | All strings having the topic keyword | topic_x, x_topic123 | xyz |
consumer* | All strings having the consumer keyword | consumer-1, sample-consumer, sample-consumer-2 | xyz |
^topic_\w+ | All strings starting with topic followed by _ and one or more word characters (equal to [a-zA-Z0-9_]) | topic_12, topic_x, topic_xyz_123 | topic-12, x_topic, topic__xyz |
^topic.* | All strings starting with topic | topic-x, topic123 | x-topic, x_topic123 |
.*topic$ | All strings ending with topic | x_topic, sampletopic | topic-1, x_topic123 |
Metrics Available
Kafka Consumer Metrics (App Checks)
See Apache Kafka Consumer Metrics.
JMX Metrics
See Apache Kafka JMX Metrics.
Result in the Monitor UI

2.3 - Consul
Consul is a distributed service mesh to
connect, secure, and configure services across any runtime platform and
public or private cloud. If Consul is installed on your environment, the
Sysdig agent will automatically connect and collect basic metrics. If
the Consul Access Control List (ACL) is configured, you may need to edit
the default entries to connect. Also, additional latency metrics can be
collected by modifying default entries. See the Default Configuration,
below.
It’s easy! Sysdig automatically detects metrics from this app based on
standard default configurations.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
Consul Configuration
Consul is ready to expose metrics without any special configuration.
Sysdig Agent Configuration
Review how to edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
By default, Sysdig’s dragent.default.yaml
uses the following code to
connect with Consul and collect basic metrics.
app_checks:
- name: consul
pattern:
comm: consul
conf:
url: "http://localhost:8500"
catalog_checks: yes
With the dragent.default.yaml
file, the following metrics are
available in the Sysdig Monitor UI:
Metrics name |
---|
consul.catalog.nodes_critical |
consul.catalog.nodes_passing |
consul.catalog.nodes_up |
consul.catalog.nodes_warning |
consul.catalog.total_nodes |
consul.catalog.services_critical |
consul.catalog.services_passing |
consul.catalog.services_up |
consul.catalog.services_warning |
consul.peers |
Additional metrics and events can be collected by adding configuration
to the dragent.yaml
file. The ACL token must be provided if ACLs are enabled. See the
following examples.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example 1: Enable Leader Change Event
When self_leader_check is enabled, the node will watch for itself to
become the leader and will emit an event when that happens. It can be
enabled on all nodes.
app_checks:
- name: consul
pattern:
comm: consul
conf:
url: "http://localhost:8500"
catalog_checks: yes
self_leader_check: yes
logs_enabled: true
Example 2: Enable Latency Metrics
If the network_latency_checks
flag is enabled, then the Consul network
coordinates will be retrieved and the latency calculated for each node
and between data centers.
app_checks:
- name: consul
pattern:
comm: consul
conf:
url: "http://localhost:8500"
catalog_checks: yes
network_latency_checks: yes
logs_enabled: true
With the above changes, additional network latency metrics become available in the Sysdig Monitor UI.
Example 3: Enable ACL Token
When the ACL System
is enabled in Consul, the ACL Agent Token
must
be added in dragent.yaml
in order to collect metrics.
Follow Consul’s official documentation to Configure
ACL,
Bootstrap
ACL and
Create Agent
Token.
app_checks:
- name: consul
pattern:
comm: consul
conf:
url: "http://localhost:8500"
acl_token: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" #Add agent token
catalog_checks: yes
logs_enabled: true
Example 4: Collect Metrics from Non-Leader Node
Required: Agent 9.6.0+
With agent 9.6.0, you can use the configuration option
single_node_install
(Optional. Default: false
). Set this option to
true
and the app check will be performed on non-leader nodes of
Consul.
app_checks:
- name: consul
pattern:
comm: consul
conf:
url: "http://localhost:8500"
catalog_checks: yes
single_node_install: true
StatsD Metrics
In addition to the metrics from the Sysdig app-check, there are many
other metrics that Consul can send using StatsD. Those metrics will be
automatically collected by the Sysdig agent’s StatsD integration if
Consul is configured to send them.
Add statsd_address
under telemetry
to the Consul config file. The
default config file location is /consul/config/local.json
{
...
"telemetry": {
"statsd_address": "127.0.0.1:8125"
}
...
}
See Telemetry Metrics
for more details.
Metrics Available
See Consul Metrics.
Result in the Monitor UI

2.4 - Couchbase
Couchbase Server is a distributed,
open-source, NoSQL database
engine. The core architecture is designed to simplify building modern
applications with a flexible data model and simpler high availability,
high scalability, high performance, and advanced security. If Couchbase
is installed on your environment, the Sysdig agent will automatically
connect. If authentication is configured, you may need to edit the
default entries to connect. See the Default Configuration, below.
The Sysdig agent automatically collects all bucket and node metrics. You
can also edit the configuration to collect query metrics.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
Couchbase Setup
Couchbase will automatically expose all metrics. You do not need to
configure anything on the Couchbase instance.
Sysdig Agent Configuration
Review how to edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
By default, Sysdig’s dragent.default.yaml
uses the following code to
connect with Couchbase and collect all bucket and node metrics.
app_checks:
- name: couchbase
pattern:
comm: beam.smp
arg: couchbase
port: 8091
conf:
server: http://localhost:8091
If authentication is enabled, you need to edit dragent.yaml
file to
connect with Couchbase. See Example 1.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example 1: Authentication
Replace <username>
and <password>
with appropriate values and update
the dragent.yaml
file.
app_checks:
- name: couchbase
pattern:
comm: beam.smp
arg: couchbase
port: 8091
conf:
server: http://localhost:8091
user: <username>
password: <password>
# The following block is optional and required only if the 'path' and
# 'port' need to be set to non-default values specified here
cbstats:
port: 11210
path: /opt/couchbase/bin/cbstats
Example 2: Query Stats
Additionally, you can configure query_monitoring_url
to get query
monitoring stats. This is available from Couchbase version 4.5. See
Query
Monitoring
for more detail.
app_checks:
- name: couchbase
pattern:
comm: beam.smp
arg: couchbase
port: 8091
conf:
server: http://localhost:8091
query_monitoring_url: http://localhost:8093
Metrics Available
See Couchbase Metrics.
Result in the Monitor UI

2.5 - Elasticsearch
Elasticsearch is an open-source, distributed,
document storage and search engine that stores and retrieves data
structures in near real-time. Elasticsearch represents data in the form
of structured JSON documents and makes full-text search accessible via
RESTful API and web clients for languages like PHP, Python, and Ruby.
It’s also elastic in the sense that it’s easy to scale
horizontally—simply add more nodes to distribute the load. If
Elasticsearch is
installed on your environment, the Sysdig agent will automatically
connect in most of the cases. See the Default Configuration, below.
The Sysdig Agent automatically collects default metrics. You can also
edit the configuration to collect Primary
Shard
stats.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
Elasticsearch Setup
Elasticsearch is ready to expose metrics without any special
configuration.
Sysdig Agent Configuration
Review how to edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
By default, Sysdig’s dragent.default.yaml
uses the following code to
connect with Elasticsearch and collect basic metrics.
app_checks:
- name: elasticsearch
check_module: elastic
pattern:
port: 9200
comm: java
conf:
url: http://localhost:9200
For more metrics, you may need to change the Elasticsearch default
setting in dragent.yaml:
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example 1: Authentication to an Elasticsearch Cluster
Password Authentication
app_checks:
- name: elasticsearch
check_module: elastic
pattern:
port: 9200
comm: java
conf:
url: https://sysdigcloud-elasticsearch:9200
username: readonly
password: some_password
ssl_verify: false
Certificate Authentication
app_checks:
- name: elasticsearch
check_module: elastic
pattern:
port: 9200
comm: java
conf:
url: https://localhost:9200
ssl_cert: /tmp/certs/ssl.crt
ssl_key: /tmp/certs/ssl.key
ssl_verify: true
- ssl_cert: Path to the certificate chain used for validating the
  authenticity of the Elasticsearch server.
- ssl_key: Path to the certificate key used for authenticating to the
  Elasticsearch server.
Example 2: Enable Primary shard Statistics
app_checks:
- name: elasticsearch
check_module: elastic
pattern:
port: 9200
comm: java
conf:
url: http://localhost:9200
pshard_stats : true
pshard-specific Metrics
Enable pshard_stats
to monitor the following additional metrics:
Metric Name |
---|
elasticsearch.primaries.flush.total |
elasticsearch.primaries.flush.total.time |
elasticsearch.primaries.docs.count |
elasticsearch.primaries.docs.deleted |
elasticsearch.primaries.get.current |
elasticsearch.primaries.get.exists.time |
elasticsearch.primaries.get.exists.total |
elasticsearch.primaries.get.missing.time |
elasticsearch.primaries.get.missing.total |
elasticsearch.primaries.get.time |
elasticsearch.primaries.get.total |
elasticsearch.primaries.indexing.delete.current |
elasticsearch.primaries.indexing.delete.time |
elasticsearch.primaries.indexing.delete.total |
elasticsearch.primaries.indexing.index.current |
elasticsearch.primaries.indexing.index.time |
elasticsearch.primaries.indexing.index.total |
elasticsearch.primaries.merges.current |
elasticsearch.primaries.merges.current.docs |
elasticsearch.primaries.merges.current.size |
elasticsearch.primaries.merges.total |
elasticsearch.primaries.merges.total.docs |
elasticsearch.primaries.merges.total.size |
elasticsearch.primaries.merges.total.time |
elasticsearch.primaries.refresh.total |
elasticsearch.primaries.refresh.total.time |
elasticsearch.primaries.search.fetch.current |
elasticsearch.primaries.search.fetch.time |
elasticsearch.primaries.search.fetch.total |
elasticsearch.primaries.search.query.current |
elasticsearch.primaries.search.query.time |
elasticsearch.primaries.search.query.total |
elasticsearch.primaries.store.size |
Example 3: Enable Primary shard Statistics for Master Node only
app_checks:
- name: elasticsearch
check_module: elastic
pattern:
port: 9200
comm: java
conf:
url: http://localhost:9200
pshard_stats_master_node_only: true
Note that this option takes precedence over the pshard_stats
option
(above). This means that if the following configuration were put into
place, only the pshard_stats_master_node_only
option would be
respected:
app_checks:
- name: elasticsearch
check_module: elastic
pattern:
port: 9200
comm: java
conf:
url: http://localhost:9200
pshard_stats: true
pshard_stats_master_node_only: true
All Available Metrics
With the default settings and the pshard
setting, the total available
metrics are listed here: Elasticsearch
Metrics.
Result in the Monitor UI

2.6 - etcd
etcd is a distributed key-value store that
provides a reliable way to store data across a cluster of machines. If
etcd is installed on your environment, the Sysdig agent will
automatically connect. If you are using etcd older than version 2, you
may need to edit the default entries to connect. See the Default
Configuration section, below.
The Sysdig Agent automatically collects all metrics.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
etcd Versions
etcd v2
The app check functionality described on this page supports etcd
metrics from APIs that are specific to v2 of etcd.
These APIs are present in etcd v3 as well, but export metrics only
for the v2 datastores. For example, after upgrading from etcd v2 to v3,
if the v2 datastores are not migrated to v3, the v2 APIs will continue
exporting metrics for these datastores. If the v2 datastores are
migrated to v3, the v2 APIs will no longer export metrics for these
datastores.
etcd v3
etcd v3 uses a native Prometheus exporter. The exporter only exports
metrics for v3 datastores. For example, after upgrading from etcd v2 to
v3, if v2 datastores are not migrated to v3, the Prometheus endpoint
will not export metrics for these datastores. The Prometheus endpoint
will only export metrics for datastores migrated to v3 or datastores
created after the upgrade to v3.
If your etcd version is v3 or higher, use the information on this page
to enable an integration: Integrate Prometheus
Metrics.
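Before enabling the Prometheus integration, you can optionally confirm that the native exporter is serving metrics. This quick check assumes the default client port 2379 and no TLS:
curl http://localhost:2379/metrics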
etcd Setup
etcd will automatically expose all metrics. You do not need to add
anything to the etcd instance.
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
The default agent configuration for etcd will look for the application
on localhost, port 2379.
No customization is required.
Default Configuration
By default, Sysdig’s dragent.default.yaml
uses the following code to
connect with etcd and collect all metrics.
app_checks:
- name: etcd
pattern:
comm: etcd
conf:
url: "http://localhost:2379"
etcd (before version 2) does not listen on localhost
, so the Sysdig
agent will not connect to it automatically. In that case, you may need
to edit the dragent.yaml
file with the hostname and port. See Example 1.
Alternatively, you can add the option -bind-addr 0.0.0.0:4001
to the
etcd command line to allow the agent to connect.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example 1
You can use {hostname}
and {port}
as tokens in the conf:
section. This is the recommended setting for Kubernetes customers.
app_checks:
- name: etcd
pattern:
comm: etcd
conf:
url: "http://{hostname}:{port}"
Alternatively you can specify the real hostname and port.
app_checks:
- name: etcd
pattern:
comm: etcd
conf:
url: "http://my_hostname:4000" #etcd service listening on port 4000
Example 2: SSL/TLS Certificate
If encryption is used, add the appropriate SSL/TLS entries. Provide the
correct paths of the SSL/TLS key and certificates used in the etcd
configuration in the fields ssl_keyfile, ssl_certfile, and ssl_ca_certs.
app_checks:
- name: etcd
pattern:
comm: etcd
conf:
url: "https://localhost:PORT"
ssl_keyfile: /etc/etcd/peer.key # Path to key file
ssl_certfile: /etc/etcd/peer.crt # Path to SSL certificate
ssl_ca_certs: /etc/etcd/ca.crt # Path to CA certificate
ssl_cert_validation: True
Metrics Available
See etcd Metrics.
Result in the Monitor UI

2.7 - fluentd
Fluentd is an open source data collector,
which allows unifying data collection and consumption to better use and
understand data. Fluentd structures data as JSON as much as possible, to
unify all facets of processing log data: collecting, filtering,
buffering, and outputting logs across multiple sources and destinations.
If Fluentd is installed on your environment, the Sysdig agent will
automatically connect. See the Default Configuration section, below.
The Sysdig agent automatically collects default metrics.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
Fluentd Setup
Fluentd can be installed as a package (.deb, .rpm, etc) depending on the
OS flavor, or it can be deployed in a Docker container. Fluentd
installation is documented
here. For the
examples on this page, a .deb package
installation is
used.
After installing Fluentd, add following lines in fluentd.conf
:
<source>
@type monitor_agent
bind 0.0.0.0
port 24220
</source>
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
By default, Sysdig’s dragent.default.yaml
uses the following code to
connect with Fluentd and collect default metrics.
(If you use a non-standard port for monitor_agent
, you can
configure it as usual in the agent config file dragent.yaml.)
- name: fluentd
pattern:
comm: fluentd
conf:
monitor_agent_url: http://localhost:24220/api/plugins.json
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example
To generate the metric data, it is necessary to generate some logs
through an application. In the following example, HTTP is used. (For
more information, see Life of a Fluentd
event.)
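The command below assumes an HTTP input source listening on port 8888, as used in the Fluentd quick-start referenced above. If your fluentd.conf does not already define one, a minimal source entry would look like this:
<source>
  @type http
  port 8888
  bind 0.0.0.0
</source>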
Execute the following command in the Fluentd environment:
$ curl -i -X POST -d 'json={"action":"login","user":2}' http://localhost:8888/test.cycle
Expected output: (Note: Here the status code is 200 OK, as HTTP traffic
is successfully generated; it will vary per application.)
HTTP/1.1 200 OK
Content-type: text/plain
Connection: Keep-Alive
Content-length: 0
Metrics Available
See fluentd Metrics.
Result in the Monitor UI

2.8 - Go
Golang expvar is the standard interface designed to instrument and
expose custom metrics from a Go program via HTTP. In addition to custom
metrics, it also exports some metrics
out-of-the-box, such as command line arguments, allocation stats, heap
stats, and garbage collection metrics.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
Go_expvar Setup
You will need to create a custom entry in the user settings config file
for your Go application, due to the difficulty in determining if an
application is written in Go by looking at process names or arguments.
Be sure your app has expvars
enabled, which means importing the
expvar
module and having an HTTP server started from inside your
app, as follows:
import (
...
"net/http"
"expvar"
...
)
// If your application has no HTTP server running for the DefaultServeMux,
// you'll have to start one for expvar to use, for example
// by adding the following to your init function
func init() {
    go http.ListenAndServe(":8080", nil)
}

// You can also expose variables that are specific to your application
// See http://golang.org/pkg/expvar/ for more information
var (
    exp_points_processed = expvar.NewInt("points_processed")
)

func processPoints(p RawPoints) {
    points_processed, err := parsePoints(p)
    if err != nil {
        return
    }
    exp_points_processed.Add(points_processed)
    ...
}
See also the following blog entry: How to instrument Go code with
custom expvar
metrics.
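Once the application is running, you can optionally verify that expvar is exposed before configuring the agent (this assumes the HTTP server in the snippet above is listening on port 8080):
curl http://localhost:8080/debug/vars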
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
No default configuration for Go is provided in the Sysdig agent
dragent.default.yaml
file. You must edit the agent config file as
described in Example 1.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example
Add the following code sample to dragent.yaml
to collect Go metrics.
app_checks:
- name: go-expvar
check_module: go_expvar
pattern:
comm: go-expvar
conf:
expvar_url: "http://localhost:8080/debug/vars" # automatically match url using the listening port
# Add custom metrics if you want
metrics:
- path: system.numberOfSeconds
type: gauge # gauge or rate
alias: go_expvar.system.numberOfSeconds
- path: system.lastLoad
type: gauge
alias: go_expvar.system.lastLoad
- path: system.numberOfLoginsPerUser/.* # You can use / to get inside the map and use .* to match any record inside
type: gauge
- path: system.allLoad/.*
type: gauge
Metrics Available
See Go Metrics.
Result in the Monitor UI

2.9 - HAProxy
HAProxy provides a high-availability load
balancer and proxy server for TCP- and HTTP-based applications which
spreads requests across multiple servers.
The Sysdig agent automatically collects haproxy
metrics. You can also
edit the agent configuration file to collect additional metrics.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
HAProxy Setup
The stats
feature must be enabled on your HAProxy instance. This can
be done by adding the following entry to the HAProxy configuration file
/etc/haproxy/haproxy.cfg
listen stats
bind :1936
mode http
stats enable
stats hide-version
stats realm Haproxy\ Statistics
stats uri /haproxy_stats
stats auth stats:stats
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
By default, Sysdig’s dragent.default.yaml
uses the following code to
connect with HAProxy and collect haproxy metrics:
app_checks:
- name: haproxy
pattern:
comm: haproxy
port: 1936
conf:
username: stats
password: stats
url: http://localhost:1936/
collect_aggregates_only: True
log_errors: false
You can get a few additional status metrics by editing the configuration
in dragent.yaml,
as in the following examples.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
Example: Collect Status Metrics Per Service
Enable the collect_status_metrics flag to collect the metrics
haproxy.count_per_status and haproxy.backend_hosts.
app_checks:
- name: haproxy
pattern:
comm: haproxy
port: 1936
conf:
username: stats
password: stats
url: http://localhost:1936/haproxy_stats
collect_aggregates_only: True
collect_status_metrics: True
log_errors: false
Example: Collect Status Metrics Per Host
Enable:
- collect_status_metrics_by_host: Instructs the check to collect status
  metrics per host, instead of per service. This only applies if
  collect_status_metrics is true.
- tag_service_check_by_host: When this flag is set, the hostname is also
  passed with the service check haproxy.backend_up. By default, only the
  backend name and service name are associated with it.
app_checks:
- name: haproxy
pattern:
comm: haproxy
port: 1936
conf:
username: stats
password: stats
url: http://localhost:1936/haproxy_stats
collect_aggregates_only: True
collect_status_metrics: True
collect_status_metrics_by_host: True
tag_service_check_by_host: True
log_errors: false
Example: Collect HAProxy Stats by UNIX Socket
If you’ve configured HAProxy to report statistics to a UNIX socket, you
can set the url
in dragent.yaml
to the socket’s path (e.g.,
unix:///var/run/haproxy.sock).
Set up HAProxy Config File
Edit your HAProxy configuration file ( /etc/haproxy/haproxy.cfg
)
to add the following lines to the global
section:
global
[snip]
stats socket /run/haproxy/admin.sock mode 660 level admin
stats timeout 30s
[snip]
Edit dragent.yaml url
Add the socket URL from the HAProxy config to the dragent.yaml file:
app_checks:
- name: haproxy
pattern:
comm: haproxy
conf:
url: unix:///run/haproxy/admin.sock
log_errors: True
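To confirm that the socket is reachable from the host, you can optionally query it directly. This is only a quick check and assumes socat is installed; it is not required by the agent:
echo "show stat" | socat stdio UNIX-CONNECT:/run/haproxy/admin.sock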
Metrics Available
See HAProxy Metrics.
Example: Enable Service Check
Required: Agent 9.6.0+
enable_service_check
: Enable/Disable service
check haproxy.backend.up
.
When set to false
, all service checks will be disabled.
app_checks:
- name: haproxy
pattern:
comm: haproxy
port: 1936
conf:
username: stats
password: stats
url: http://localhost:1936/haproxy_stats
collect_aggregates_only: true
enable_service_check: false
Example: Filter Metrics Per Service
Required: Agent 9.6.0+
services_exclude
(Optional): Name or regex of services to be excluded.
services_include
(Optional): Name or regex of services to be included
If a service is excluded with services_exclude
, it can still be
included explicitly by services_include
. The following example
excludes all services except service_1
and service_2
.
app_checks:
- name: haproxy
pattern:
comm: haproxy
port: 1936
conf:
username: stats
password: stats
url: http://localhost:1936/haproxy_stats
collect_aggregates_only: true
services_exclude:
- ".*"
services_include:
- "service_1"
- "service_2"
Example: Add Active Tag and Extra Headers
Required: Agent 9.6.0+
There are two additional configuration options introduced with agent
9.6.0:
- active_tag (Optional. Default: false): Adds the tag active to backend
  metrics that belong to the active pool of connections.
- headers (Optional): Extra headers, such as an auth-token, can be
  passed along with requests.
app_checks:
- name: haproxy
pattern:
comm: haproxy
port: 1936
conf:
username: stats
password: stats
url: http://localhost:1936/haproxy_stats
collect_aggregates_only: true
active_tag: true
headers:
<HEADER_NAME>: <HEADER_VALUE>
<HEADER_NAME>: <HEADER_VALUE>
Result in the Monitor UI

2.10 - HTTP
The HTTP check monitors HTTP-based applications for URL availability.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
HTTP Setup
You do not need to configure anything on HTTP-based applications for the
Sysdig agent to connect.
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
No default entry is present in the dragent.default.yaml
for the HTTP
check. You need to add an entry in dragent.yaml
as shown in following
examples.
Never edit dragent.default.yaml
directly; always edit only
dragent.yaml
.
Example 1
First, identify the process pattern (comm:). It must match an actively
running process for the HTTP check to work. Sysdig recommends using the
process that serves the URL being checked.
If the URL is remote from the agent, use a process that is always
running, such as systemd.
Confirm the comm value of the process you choose.
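For example, on a Linux host you could list the command names of running processes and look for the one you plan to match (the grep pattern below is just a placeholder):
ps -eo pid,comm | grep -i <PROCESS_NAME>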
Then add the following entry to the dragent.yaml file and modify the
name:, comm:, and url: parameters as needed:
app_checks:
- name: EXAMPLE_WEBSITE
check_module: http_check
pattern:
comm: systemd
conf:
url: https://www.MYEXAMPLE.com
Example 2
There are multiple configuration options available with the HTTP check.
A full list is provided in the table following Example 2. These keys
should be listed under the conf:
section of the configuration in
Example 1.
app_checks:
- name: EXAMPLE_WEBSITE
check_module: http_check
pattern:
comm: systemd
conf:
url: https://www.MYEXAMPLE.com
# timeout: 1
# method: get
# data:
# <KEY>: <VALUE>
# content_match: '<REGEX>''
# reverse_content_match: false
# username: <USERNAME>
# ntlm_domain: <DOMAIN>
# password: <PASSWORD>
# client_cert: /opt/client.crt
# client_key: /opt/client.key
# http_response_status_code: (1|2|3)\d\d
# include_content: false
# collect_response_time: true
# disable_ssl_validation: true
# ignore_ssl_warning: false
# ca_certs: /etc/ssl/certs/ca-certificates.crt
# check_certificate_expiration: true
# days_warning: <THRESHOLD_DAYS>
# check_hostname: true
# ssl_server_name: <HOSTNAME>
# headers:
# Host: alternative.host.example.com
# X-Auth-Token: <AUTH_TOKEN>
# skip_proxy: false
# allow_redirects: true
# include_default_headers: true
# tags:
# - <KEY_1>:<VALUE_1>
# - <KEY_2>:<VALUE_2>
Option | Description |
---|---|
url | The URL to test. |
timeout | The time in seconds to allow for a response. |
method | The HTTP method. This setting defaults to GET, though many other HTTP methods are supported, including POST and PUT. |
data | The data option is only available when using the POST method. Data should be included as key-value pairs and will be sent in the body of the request. |
content_match | A string or Python regular expression. The HTTP check will search for this value in the response and will report as DOWN if the string or expression is not found. |
reverse_content_match | When true, reverses the behavior of the content_match option, i.e. the HTTP check will report as DOWN if the string or expression in content_match IS found. (default is false) |
username & password | If your service uses basic authentication, you can provide the username and password here. |
http_response_status_code | A string or Python regular expression for an HTTP status code. This check will report DOWN for any status code that does not match. This defaults to 1xx, 2xx and 3xx HTTP status codes. For example: 401 or 4\d\d. |
include_content | When set to true, the check will include the first 200 characters of the HTTP response body in notifications. The default value is false. |
collect_response_time | By default, the check will collect the response time (in seconds) as the metric network.http.response_time. To disable, set this value to false. |
disable_ssl_validation | This setting will skip SSL certificate validation and is enabled by default. If you require SSL certificate validation, set this to false. This option is only used when gathering the response time/aliveness from the specified endpoint. Note this setting doesn't apply to the check_certificate_expiration option. |
ignore_ssl_warning | When SSL certificate validation is enabled (see setting above), this setting allows you to disable security warnings. |
ca_certs | This setting allows you to override the default certificate path as specified in init_config. |
check_certificate_expiration | When check_certificate_expiration is enabled, the service check will check the expiration date of the SSL certificate. Note that this will cause the SSL certificate to be validated, regardless of the value of the disable_ssl_validation setting. |
days_warning | When check_certificate_expiration is enabled, this setting will raise a warning alert when the SSL certificate is within the specified number of days of expiration. |
check_hostname | When check_certificate_expiration is enabled, this setting will raise a warning if the hostname on the SSL certificate does not match the host of the given URL. |
headers | This parameter allows you to send additional headers with the request, e.g. X-Auth-Token: <AUTH_TOKEN>. |
skip_proxy | If set, the check will bypass proxy settings and attempt to reach the check URL directly. This defaults to false. |
allow_redirects | This setting allows the service check to follow HTTP redirects and defaults to true. |
tags | A list of arbitrary tags that will be associated with the check. |
Metrics Available
HTTP metrics concern response time and SSL certificate expiry
information.
See HTTP Metrics.
Service Checks
http.can_connect: Returns DOWN when any of the following occur:
- the request to the URL times out
- the response code is 4xx/5xx, or it doesn't match the pattern provided
  in http_response_status_code
- the response body does not contain the pattern in content_match
- reverse_content_match is true and the response body does contain the
  pattern in content_match
- the URI contains https, disable_ssl_validation is false, and the SSL
  connection cannot be validated
Otherwise, returns UP. The http.can_connect check can be segmented by URL.
http.ssl_cert: Checks the expiration of the SSL certificate. To disable
this check, set check_certificate_expiration to false.
Result in the Monitor UI

2.11 - Jenkins
Jenkins is an open-source automation server which
helps to automate part of the software development process, permitting
continuous integration and facilitating the technical aspects of
continuous delivery. It supports version control tools (such as
Subversion, Git, Mercurial, etc), can execute Apache Ant, Apache Maven
and SBT-based projects, and allows shell scripts and Windows batch
commands. If Jenkins is installed on your environment, the Sysdig agent
will automatically connect and collect all Jenkins metrics. See the
Default Configuration section, below.
This page describes the default configuration settings, the metrics
available for integration, and a sample result in the Sysdig Monitor UI.
Jenkins Setup
Requires the standard Jenkins server setup with one or more Jenkins Jobs
running on it.
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Default Configuration
By default, Sysdig’s dragent.default.yaml
uses the following code to
connect with Jenkins and collect basic metrics.
- name: jenkins
pattern:
comm: java
port: 50000
conf:
name: default
jenkins_home: /var/lib/jenkins #this depends on your environment
Jenkins Folders Plugin
By default, the Sysdig agent does not monitor jobs under job folders
created using the Folders plugin.
Set jobs_folder_depth to monitor these jobs. Job folders are scanned
recursively for jobs until the designated folder depth is reached. The
default value is 1.
app_checks:
- name: jenkins
pattern:
comm: java
port: 50000
conf:
name: default
jenkins_home: /var/lib/jenkins
jobs_folder_depth: 3
Metrics Available
The following metrics will be available only after running one or more
Jenkins jobs. They handle queue size, job duration, and job waiting
time.
See Jenkins Metrics.
Result in the Monitor UI

2.12 - Lighttpd
Lighttpd is a secure, fast, compliant, and
very flexible web server that has been optimized for high-performance
environments. It has a very low memory footprint compared to other web
servers and takes care of CPU load. Its advanced feature set (FastCGI,
CGI, Auth, Output Compression, URL Rewriting, and many more) make
Lighttpd the perfect web server software for every server that suffers
load problems. If Lighttpd is installed on your environment, the Sysdig
agent will automatically connect. See the Default Configuration section,
below. The Sysdig agent automatically collects the default metrics.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
At this time, the Sysdig app check for Lighttpd supports Lighttpd
version 1.x.x only.
Lighttpd Setup
For Lighttpd, the status page must be enabled. Add mod_status
in
the /etc/lighttpd/lighttpd.conf
config file:
server.modules = ( ..., "mod_status", ... )
Then configure an endpoint for it. If (for security purposes) you want
to open the status page only to users from the local network, it can be
done by adding the following lines in the
/etc/lighttpd/lighttpd.conf file
:
$HTTP["remoteip"] == "127.0.0.1/8" {
status.status-url = "/server-status"
}
If you want an endpoint to be open for remote users based on
authentication, then the mod_auth module should be enabled in the
/etc/lighttpd/lighttpd.conf
config file:
server.modules = ( ..., "mod_auth", ... )
Then you can add the auth.require parameter in the
/etc/lighttpd/lighttpd.conf
config file:
auth.require = ( "/server-status" => ( "method" => ... , "realm" => ... , "require" => ... ) )
For more information on the auth.require
parameter, see the Lighttpd
documentation.
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
By default, Sysdig’s dragent.default.yaml
uses the following code to
connect with Lighttpd and collect basic metrics.
app_checks:
- name: lighttpd
pattern:
comm: lighttpd
conf:
lighttpd_status_url: "http://localhost:{port}/server-status?auto"
log_errors: false
Metrics Available
These metrics are supported for Lighttpd version 1.x.x only. Lighttpd
version 2.x.x is
being built and is NOT ready for use as of this publication.
See Lighttpd Metrics.
Result in the Monitor UI

2.13 - Memcached
Memcached is an in-memory key-value store for
small chunks of arbitrary data (strings, objects) from the results of
database calls, API calls, or page rendering. If Memcached is installed
on your environment, the Sysdig agent will automatically connect. See
the Default Configuration section, below. The Sysdig agent automatically
collects basic metrics. You can also edit the configuration to collect
additional metrics related to items and slabs.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
Memcached Setup
Memcached will automatically expose all metrics. You do not need to add
anything on the Memcached instance.
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
By default, Sysdig’s dragent.default.yaml
uses the following code to
connect with Memcached and collect basic metrics:
app_checks:
- name: memcached
check_module: mcache
pattern:
comm: memcached
conf:
url: localhost
port: "{port}"
Additional metrics can be collected by editing Sysdig’s configuration
file dragent.yaml
. If
SASL
is enabled, authentication parameters must be added to dragent.yaml.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example 1: Additional Metrics
memcache.items.*
and memcache.slabs.*
can be collected by setting
flags in the options
section, as follows. Either value can be set to false if you do not
want to collect those metrics.
app_checks:
- name: memcached
check_module: mcache
pattern:
comm: memcached
conf:
url: localhost
port: "{port}"
options:
items: true # Default is false
slabs: true # Default is false
Example 2: SASL
SASL authentication can be enabled with Memcached (see instructions
here). If
enabled, credentials must be provided against username
and password
fields as shown in Example 2.
app_checks:
- name: memcached
check_module: mcache
pattern:
comm: memcached
conf:
url: localhost
port: "{port}"
username: <username>
# Some memcached version will support <username>@<hostname>.
# If memcached is installed as a container, hostname of memcached container will be used as username
password: <password>
Metrics Available
See Memcached Metrics.
Result in the Monitor UI

2.14 - Mesos/Marathon
Mesos is built using the same principles as
the Linux kernel, only at a different level of abstraction. The Mesos
kernel runs on every machine and provides applications (e.g., Hadoop,
Spark, Kafka, Elasticsearch) with APIs for resource management and
scheduling across entire datacenter and cloud environments. The Mesos
metrics are divided into master and
agent.
Marathon is a production-grade
container orchestration platform for Apache Mesos.
If Mesos and Marathon are installed in your environment, the Sysdig
agent will automatically connect and start collecting metrics. You may
need to edit the default entries to add a custom configuration if the
default does not work. See the Default Configuration section, below.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
Mesos/Marathon Setup
Both Mesos and Marathon will automatically expose all metrics. You do
not need to add anything to the Mesos/Marathon instance.
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
The Sysdig agent has different entries for mesos-master, mesos-slave
and marathon
in its configuration file. Default entries are present in
Sysdig’s dragent.default.yaml
file and collect all metrics for Mesos.
For Marathon, it collects basic metrics. You may need to add configuration
to collect additional metrics.
Default Configuration
In the URLs for mesos-master
and mesos-slave, {mesos_url}
will be
replaced with either the hostname of the auto-detected mesos
master/slave (if auto-detection is enabled), or with an explicit value
from mesos_state_uri
otherwise.
In the URLs for marathon, {marathon_url}
will be replaced with the
hostname of the first configured/discovered Marathon framework.
For all Mesos and Marathon apps, {auth_token}
will either be blank or
an auto-generated token obtained via the /acs/api/v1/auth/login
endpoint.
Mesos Master
app_checks:
- name: mesos-master
check_module: mesos_master
interval: 30
pattern:
comm: mesos-master
conf:
url: "http://localhost:5050"
auth_token: "{auth_token}"
mesos_creds: "{mesos_creds}"
Mesos Agent
app_checks:
- name: mesos-slave
check_module: mesos_slave
interval: 30
pattern:
comm: mesos-slave
conf:
url: "http://localhost:5051"
auth_token: "{auth_token}"
mesos_creds: "{mesos_creds}"
Marathon
app_checks:
- name: marathon
check_module: marathon
interval: 30
pattern:
arg: mesosphere.marathon.Main
conf:
url: "{marathon_url}"
auth_token: "{auth_token}"
marathon_creds: "{marathon_creds}"
Remember! Never edit dragent.default.yaml
directly; always edit
dragent.yaml
.
Marathon: Collect All Metrics
Enable the flag full_metrics to collect all metrics for Marathon.
The following additional metrics are collected with this configuration:
- marathon.cpus
- marathon.disk
- marathon.instances
- marathon.mem
app_checks:
- name: marathon
check_module: marathon
interval: 30
pattern:
arg: mesosphere.marathon.Main
conf:
url: "{marathon_url}"
auth_token: "{auth_token}"
marathon_creds: "{marathon_creds}"
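      full_metrics: true   # assumed placement: add this flag under conf, as described above, to collect all Marathon metrics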
Metrics Available
See Mesos Master Metrics.
See Mesos Agent Metrics.
See Marathon Metrics.
Result in the Monitor UI
Mesos Master

Mesos Agent

Marathon

2.15 - MongoDB
MongoDB is an open-source database
management system (DBMS) that uses a document-oriented database model
that supports various forms of data. If MongoDB is installed in your
environment, the Sysdig agent will automatically connect and collect
basic metrics (if
authentication is not
used). You may need to edit the default entries to connect and collect
additional metrics. See the Default Configuration section, below.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
MongoDB Setup
Create a read-only user for the Sysdig agent.
# Authenticate as the admin user.
use admin
db.auth("admin", "<YOUR_MONGODB_ADMIN_PASSWORD>")
# On MongoDB 2.x, use the addUser command.
db.addUser("sysdig-cloud", "sysdig-cloud-password", true)
# On MongoDB 3.x or higher, use the createUser command.
db.createUser({
"user":"sysdig-cloud",
"pwd": "sysdig-cloud-password",
"roles" : [
{role: 'read', db: 'admin' },
{role: 'clusterMonitor', db: 'admin'},
{role: 'read', db: 'local' }
]
})
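Optionally, you can verify that the new read-only user can authenticate before configuring the agent. This example uses the legacy mongo shell, which is assumed to be installed on the host (adapt the command if you use mongosh):
mongo admin -u sysdig-cloud -p sysdig-cloud-password --eval "db.serverStatus().ok"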
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
By default, Sysdig’s dragent.default.yaml
uses the following code to
connect with MongoDB.
app_checks:
- name: mongodb
check_module: mongo
pattern:
comm: mongod
conf:
server: "mongodb://localhost:{port}/admin"
The default MongoDB entry should work without modification if
authentication is not
configured. If you have enabled password authentication, the entry will
need to be changed.
Some metrics are not available by default. Additional configuration
needs to be provided to collect them as shown in following examples.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example 1: With Authentication
Replace <username> and <password> with actual username and
password.
app_checks:
- name: mongodb
check_module: mongo
pattern:
comm: mongod
conf:
server: mongodb://<username>:<password>@localhost:{port}/admin
replica_check: true
Example 2: Additional Metrics
Some metrics are not collected by default. These can be collected by
adding additional_metrics
section in the dragent.yaml
file under the
app_checks mongodb
configuration.
Available options are:
- collection - Metrics of the specified collections
- metrics.commands - Use of database commands
- tcmalloc - TCMalloc memory allocator
- top - Usage statistics for each collection
app_checks:
- name: mongodb
check_module: mongo
pattern:
comm: mongod
conf:
server: mongodb://<username>:<password>@localhost:{port}/admin
replica_check: true
additional_metrics:
- collection
- metrics.commands
- tcmalloc
- top
List of metrics with respective entries in dragent.yaml:
Metric prefix | Entry under additional_metrics |
---|---|
mongodb.collection | collection |
mongodb.usage.commands | top |
mongodb.usage.getmore | top |
mongodb.usage.insert | top |
mongodb.usage.queries | top |
mongodb.usage.readLock | top |
mongodb.usage.writeLock | top |
mongodb.usage.remove | top |
mongodb.usage.total | top |
mongodb.usage.update | top |
mongodb.usage.writeLock | top |
mongodb.tcmalloc | tcmalloc |
mongodb.metrics.commands | metrics.commands |
Example 3: Collections Metrics
MongoDB stores documents in collections. Collections are analogous to
tables in relational databases. The Sysdig agent by default does not
collect the following collections metrics:
- collections: List of MongoDB collections to be polled by the agent.
  Metrics will be collected for the specified set of collections. This
  configuration requires the additional_metrics.collection section to be
  present, with an entry for collection in the dragent.yaml file. The
  collection entry under additional_metrics is a flag that enables the
  collection metrics.
- collections_indexes_stats: Collect index access metrics for every
  index in every collection in the collections list. The default value
  is false. This metric is available starting with MongoDB v3.2.
For the agent to poll them, you must configure the dragent.yaml
file
and add an entry corresponding to the metrics to the conf
section as
follows.
app_checks:
- name: mongodb
check_module: mongo
pattern:
comm: mongod
conf:
server: mongodb://<username>:<password>@localhost:{port}/admin
replica_check: true
additional_metrics:
- collection
- metrics.commands
- tcmalloc
- top
collections:
- <LIST_COLLECTIONS>
collections_indexes_stats: true
SSL Connection
You can tighten the security of the app check connection to MongoDB by
establishing an SSL connection. To enable secure communication, set the
SSL configuration in dragent.yaml to true. In an advanced deployment
with multiple MongoDB instances, you need to include a custom CA
certificate or client certificate and other additional configuration.
Basic SSL Connection
In a basic SSL connection:
- A single MongoDB instance is running on the host.
- The SSL connection uses no advanced features, such as a custom CA
  certificate or client certificate.
To establish a basic SSL connection between the agent and the MongoDB
instance:
Open the dragent.yaml
file.
Configure the SSL entries as follows:
app_checks:
- name: mongodb
check_module: mongo
pattern:
comm: mongod
conf:
server: "mongodb://<HOSTNAME>:{port}/admin"
ssl: true
# ssl_cert_reqs: 0 # Disable SSL validation
To disable SSL validation, set ssl_cert_reqs
to 0
. This setting
is equivalent to ssl_cert_reqs=CERT_NONE
.
Advanced SSL Connection
In an advanced SSL connection:
- Advanced features, such as a custom CA certificate or client
  certificate, are configured.
- Single or multiple MongoDB instances are running on the host.
- The agent is installed either as a container or as a process (see the
  corresponding sections below).
Prerequisites
Set up the following:
- Custom CA certificate
- Client SSL verification
- SSL validation
(Optional) SSL Configuration Parameters
Parameter | Description |
---|---|
ssl_certfile | The certificate file that is used to identify the local connection with MongoDB. |
ssl_keyfile | The private keyfile that is used to identify the local connection with MongoDB. Ignore this option if the key is included with ssl_certfile. |
ssl_cert_reqs | Specifies whether a certificate is required from the MongoDB server, and whether it will be validated if provided. Possible values are: 0 for ssl.CERT_NONE (certificates are ignored), 1 for ssl.CERT_OPTIONAL (certificates are not required, but validated if provided), 2 for ssl.CERT_REQUIRED (certificates are required and validated). |
ssl_ca_certs | The ca_certs file contains a set of concatenated certification authority certificates, which are used to validate certificates used by the MongoDB server. Mostly used when server certificates are self-signed. |
Sysdig Agent as a Container
If the Sysdig agent is installed as a container, start it with an extra volume containing the SSL files referenced in the agent configuration. For example:
# extra parameter added: -v /etc/ssl:/etc/ssl
docker run -d --name sysdig-agent --restart always --privileged --net host --pid host -e ACCESS_KEY=xxxxxxxxxxxxx -e SECURE=true -e TAGS=example_tag:example_value -v /var/run/docker.sock:/host/var/run/docker.sock -v /dev:/host/dev -v /proc:/host/proc:ro -v /boot:/host/boot:ro -v /lib/modules:/host/lib/modules:ro -v /usr:/host/usr:ro -v /etc/ssl:/etc/ssl --shm-size=512m sysdig/agent
Open the dragent.yaml
file and configure the SSL entries:
app_checks:
- name: mongodb
check_module: mongo
pattern:
comm: mongod
conf:
server: "mongodb://<HOSTNAME>:{port}/admin"
ssl: true
# ssl_ca_certs: </path/to/ca/certificate>
# ssl_cert_reqs: 0 # Disable SSL validation
# ssl_certfile: </path/to/client/certfile>
# ssl_keyfile: </path/to/client/keyfile>
Sysdig Agent as a Process
If the Sysdig agent is installed as a process, store the SSL files on the host and provide their paths in the agent configuration.
app_checks:
- name: mongodb
check_module: mongo
pattern:
comm: mongod
conf:
server: "mongodb://<HOSTNAME>:{port}/admin"
ssl: true
# ssl_ca_certs: </path/to/ca/certificate>
# ssl_cert_reqs: 0 # Disable SSL validation
# ssl_certfile: </path/to/client/certfile>
# ssl_keyfile: </path/to/client/keyfile>
See optional SSL configuration
parameters
for information on SSL certificate files.
Multi-MongoDB Setup
In a multi-MongoDB setup, multiple MongoDB instances are running on a
single host. You can configure either a basic or an advanced SSL
connection individually for each MongoDB instance.
Store SSL Files
In an advanced connection, different SSL certificates are used for each
instance of MongoDB on the same host and are stored in separate
directories. For instance, the SSL files corresponding to two different
MongoDB instances can be stored at a mount point as follows:
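For example (using paths that match the configuration below), the files might be laid out as:
/etc/ssl/mongo1/ca-cert-1
/etc/ssl/mongo2/ca-cert-2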
Open the dragent.yaml
file.
Configure the SSL entries as follows:
app_checks:
- name: mongodb-ssl-1
check_module: mongo
pattern:
comm: mongod
args: ssl_certificate-1.pem
conf:
server: "mongodb://<HOSTNAME|Certificate_CN>:{port}/admin"
ssl: true
ssl_ca_certs: /etc/ssl/mongo1/ca-cert-1
tags:
- "instance:ssl-1"
- name: mongodb-ssl-2
check_module: mongo
pattern:
comm: mongod
args: ssl_certificate-2.pem
conf:
server: "mongodb://<HOSTNAME|Certificate_CN>:{port}/admin"
ssl: true
ssl_ca_certs: /etc/ssl/mongo2/ca-cert-2
tags:
- "instance:ssl-2"
Replace the names of the instances and certificate files with the
names that you prefer.
Metrics Available
See MongoDB Metrics.
Result in the Monitor UI

2.16 - MySQL
MySQL is the world’s most popular open-source
database. With its proven performance, reliability, and ease-of-use,
MySQL has become the leading database choice for web-based applications,
used by high-profile web properties including Facebook, Twitter, and
YouTube. Additionally, it is an extremely popular choice as an embedded
database, distributed by thousands of ISVs and OEMs.
Supported Distribution
The MySQL app check is supported for the following MySQL versions.
If the Sysdig agent is installed as a process:
Host with Python 2.7: MySQL versions 5.5 to 8 are supported
Host with Python 2.6: MySQL versions 4.1 to 5.7 are supported (tested with v5.x only)
NOTE: This implies that MySQL 5.5, 5.6, and 5.7 are supported on both the Python 2.6 and 2.7 environments.
If the Sysdig agent is installed as a Docker container:
The Docker container of the Sysdig agent ships with Python 2.7, so the MySQL versions supported with Python 2.7 apply.
The following environments have been tested and are supported. Test environments include both the host/process and Docker setups.

| Python | MySQL | | | | |
|---|---|---|---|---|---|
| 2.7 (Ubuntu 16 / CentOS 7) | No | Yes | Yes | Yes | Yes |
| 2.6 (CentOS 6) | Yes | Yes | Yes | Yes | No |
MySQL Setup
A user must be created in MySQL so the Sysdig agent can collect metrics. To configure credentials, run the following commands on your server, replacing the sysdig-cloud-password parameter with a password of your choice. MySQL version-specific commands to create the user are provided below.
# MySQL 5.6 and earlier
CREATE USER 'sysdig-cloud'@'127.0.0.1' IDENTIFIED BY 'sysdig-cloud-password';
GRANT PROCESS, REPLICATION CLIENT ON *.* TO 'sysdig-cloud'@'127.0.0.1' WITH MAX_USER_CONNECTIONS 5;
## OR ##
# MySQL 5.7 and 8
CREATE USER 'sysdig-cloud'@'127.0.0.1' IDENTIFIED BY 'sysdig-cloud-password' WITH MAX_USER_CONNECTIONS 5;
GRANT PROCESS, REPLICATION CLIENT ON *.* TO 'sysdig-cloud'@'127.0.0.1';
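As a quick sanity check before configuring the agent (assuming the mysql client is installed on the host), you can verify that the new credentials work:
mysql -h 127.0.0.1 -u sysdig-cloud -p'sysdig-cloud-password' -e "SHOW GLOBAL STATUS LIKE 'Threads_connected';"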
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
There is no default configuration for MySQL, as a unique user and
password are required for metrics polling.
Add the following entry for MySQL to dragent.yaml, updating the user and pass fields with your credentials.
app_checks:
- name: mysql
pattern:
comm: mysqld
conf:
server: 127.0.0.1
user: sysdig-cloud
pass: sysdig-cloud-password
Metrics Available
See MySQL Metrics.
Result in the Monitor UI
Default Dashboard

Additional Views

2.17 - NGINX and NGINX Plus
NGINX is open-source
software for web serving, reverse proxying, caching, load balancing,
media streaming, and more. It started out as a web server designed for
maximum performance and stability. In addition to its HTTP server
capabilities, NGINX can also function as a proxy server for email (IMAP,
POP3, and SMTP) and a reverse proxy and load balancer for HTTP, TCP, and
UDP servers.
NGINX Plus is a software load
balancer, web server, and content cache built on top of open source
NGINX. NGINX Plus has exclusive enterprise‑grade features beyond what’s
available in the open-source offering, including session persistence,
configuration via API, and active health checks.
The Sysdig agent has a default configuration to collect metrics for
open-source NGINX, provided that you have the HTTP stub status module
enabled. NGINX exposes basic metrics about server activity on a simple
status page with this status module. If NGINX Plus is installed, a wide
range of metrics is available with the NGINX Plus API.
This page describes the setup steps for NGINX/NGINX Plus, the default
configuration settings, how to edit the configuration to collect
additional information, the metrics available for integration, and
sample results in the Sysdig Monitor UI.
NGINX/ NGINX Plus Setup
This section describes the configuration required on the NGINX server.
The Sysdig agent will not collect metrics until the required endpoint is
added to the NGINX configuration, using one of the following methods.
Configuration examples of each are provided below.
NGINX Stub Status Module Configuration
The ngx_http_stub_status_module
provides access to basic status
information. It is compiled by default on most distributions. If not, it
should be enabled with the --with-http_stub_status_module
configuration parameter.
To check if the module is already compiled, run the following
command:
nginx -V 2>&1 | grep -o with-http_stub_status_module
If with-http_stub_status_module
is listed, the status module is
enabled. (For more information, see
http://nginx.org/en/docs/http/ngx_http_stub_status_module.html.)
Update the NGINX configuration file with /nginx_status
endpoint as
follows. The default NGINX configuration file is present at
/etc/nginx/nginx.conf
or /etc/nginx/conf.d/default.conf.
# HTTP context
server {
...
# Enable NGINX status module
location /nginx_status {
# freely available with open source NGINX
stub_status;
access_log off;
# for open source NGINX < version 1.7.5
# stub_status on;
}
...
}
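To verify the endpoint (assuming NGINX listens on port 80 on the local host), query it directly; the stub status output looks similar to the following:
curl http://localhost/nginx_status
Active connections: 2
server accepts handled requests
 16 18 29
Reading: 0 Writing: 1 Waiting: 1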
NGINX Plus API Configuration
When NGINX Plus is configured, the Plus API can be enabled by adding
/api
endpoint in the NGINX configuration file as follows.
The default NGINX configuration file is present at
/etc/nginx/nginx.conf
or /etc/nginx/conf.d/default.conf.
# HTTP context
server {
...
# Enable NGINX Plus API
location /api {
api write=on;
allow all;
}
...
}
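To confirm that the Plus API endpoint responds (again assuming port 80 on the local host), a request to /api should return the list of supported API versions, for example:
curl http://localhost/api
[1,2,3,4,5,6]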
Sysdig Agent Configuration
Configuration Examples:
Example 1 (Default): Only open-source NGINX is configured.
Example 2: Only an NGINX Plus node is configured.
Example 3: NGINX and NGINX Plus are installed in different containers on the same host.
The use_plus_api flag is used to differentiate NGINX and NGINX Plus metrics:
NGINX Plus metrics are differentiated with the prefix nginx.plus.*
When use_plus_api is true, nginx_plus_api_url is used to fetch NGINX Plus metrics from the NGINX Plus node.
nginx_status_url is used to fetch NGINX metrics from the NGINX node (if a single host is running two separate containers for NGINX and NGINX Plus).
Example 1: Default Configuration
With the default configuration, only NGINX metrics will be available
once the ngx_http_stub_status_module
is configured.
app_checks:
- name: nginx
check_module: nginx
pattern:
exe: "nginx: worker process"
conf:
nginx_status_url: "http://localhost:{port}/nginx_status"
log_errors: true
Example 2: NGINX Plus only
With this example only NGINX Plus Metrics will be available.
app_checks:
- name: nginx
check_module: nginx
pattern:
exe: "nginx: worker process"
conf:
nginx_plus_api_url: "http://localhost:{port}/api"
use_plus_api: true
user: admin
password: admin
log_errors: true
Example 3: NGINX and NGINX Plus
This is a special case where open-source NGINX and NGINX Plus are installed on the same host but in different containers. With this configuration, the respective metrics will be available for the NGINX and NGINX Plus containers.
app_checks:
- name: nginx
check_module: nginx
pattern:
exe: "nginx: worker process"
conf:
nginx_plus_api_url: "http://localhost:{port}/api"
nginx_status_url: "http://localhost:{port}/nginx_status"
use_plus_api: true
user: admin
password: admin
log_errors: true
List of Metrics
NGINX (Open Source)
See NGINX Metrics.
NGINX Plus
See NGINX Plus Metrics.
Result in the Monitor UI

2.18 - NTP
NTP stands for
Network Time Protocol. It is used to synchronize the time on your Linux
system with a centralized NTP server. A local NTP server on the network
can be synchronized with an external timing source to keep all the
servers in your organization in-sync with an accurate time.
If the NTP check is enabled in the Sysdig agent, it reports the time
offset of the local agent from an NTP server.
This page describes how to edit the configuration to collect
information, the metrics available for integration, and a sample result
in the Sysdig Monitor UI.
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
By default, Sysdig's dragent.default.yaml
does not provide any
configuration for NTP.
Add the configuration shown in the example below to the dragent.yaml file to enable NTP checks.
Never edit dragent.default.yaml
directly; always edit only
dragent.yaml
.
Example
- name: ntp
interval: 60
pattern:
comm: systemd
conf:
host: us.pool.ntp.org
offset_threshold: 60
host: (mandatory) The host name of the NTP server.
offset_threshold: (optional) The difference (in seconds) between the local clock and the NTP server at which the ntp.in_sync service check becomes CRITICAL. The default is 60 seconds.
Metrics Available
ntp.offset
, the time difference between the local clock and the NTP
reference clock, is the primary NTP metric.
See also NTP Metrics.
Service Checks
ntp.in_sync:
Returns CRITICAL
if the NTP offset is greater than the threshold
specified in dragent.yaml
, otherwise OK.
Result in the Monitor UI

2.19 - PGBouncer
PgBouncer is a lightweight
connection pooler for PostgreSQL. If PgBouncer is installed on your
environment, you may need to edit the Sysdig agent configuration file to
connect. See the Default Configuration section, below.
This page describes the configuration settings, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
PgBouncer Setup
PgBouncer does not ship with a default stats user configuration. To configure one, add a user that is allowed to access PgBouncer stats by adding the following line to pgbouncer.ini. The default file location is /etc/pgbouncer/pgbouncer.ini:
stats_users = sysdig_cloud
For the same user, add the following entry to userlist.txt. The default file location is /etc/pgbouncer/userlist.txt:
"sysdig_cloud" "sysdig_cloud_password"
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
No default configuration is present in Sysdig’s dragent.default.yaml
file for PgBouncer, as it requires a unique username and password. You
must add a custom entry in dragent.yaml
as follows:
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example
app_checks:
- name: pgbouncer
pattern:
comm: pgbouncer
conf:
host: localhost # set if the bind ip is different
port: 6432 # set if the port is not the default
username: sysdig_cloud
password: sysdig_cloud_password #replace with appropriate password
Metrics Available
See PGBouncer Metrics.
Result in the Monitor UI

2.20 - PHP-FPM
PHP-FPM (FastCGI Process Manager) is an
alternative PHP FastCGI implementation, with some additional features
useful for sites of any size, especially busier sites. If PHP-FPM is
installed on your environment, the Sysdig agent will automatically
connect. You may need to edit the default entries to connect if PHP-FPM
has a custom setting in its config file. See the Default Configuration
section, below.
The Sysdig agent automatically collects all metrics with default
configuration.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
PHP-FPM Setup
This check has a default configuration that should suit most use cases.
If it does not work for you, verify that you have added these lines to
your php-fpm.conf
file. The default location is /etc/
pm.status_path = /status
ping.path = /ping
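If you want to verify the status path outside the agent, one option is to query PHP-FPM directly over FastCGI. This assumes PHP-FPM listens on 127.0.0.1:9000 and that the cgi-fcgi utility (from the libfcgi package) is installed:
SCRIPT_NAME=/status SCRIPT_FILENAME=/status REQUEST_METHOD=GET cgi-fcgi -bind -connect 127.0.0.1:9000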
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
By default, Sysdig’s dragent.default.yaml
uses the following code to
connect with PHP-FPM and collect all metrics:
app_checks:
- name: php-fpm
check_module: php_fpm
retry: false
pattern:
exe: "php-fpm: master process"
If your PHP-FPM settings in php-fpm.conf differ from the defaults, you can edit the Sysdig agent configuration in dragent.yaml, as shown in the example below.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example
Replace the values of status_url
and ping_url
below with the values
set against pm.status_path
and ping.path
respectively in your
php-fpm.conf:
app_checks:
- name: php-fpm
check_module: php_fpm
pattern:
exe: "php-fpm: master process"
conf:
status_url: /mystatus
ping_url: /myping
ping_reply: mypingreply
Metrics Available
See PHP-FPM Metrics.
Result in the Monitor UI

2.21 - PostgreSQL
PostgreSQL is a powerful, open-source,
object-relational database system that has earned a strong reputation
for reliability, feature robustness, and performance.
If PostgreSQL is installed in your environment, the Sysdig agent will
automatically connect in most cases. In some conditions, you may need to
create a specific user for Sysdig and edit the default entries to
connect.
See the Default Configuration section, below. The Sysdig agent
automatically collects all metrics with the default configuration when
correct credentials are provided.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
PostgreSQL Setup
PostgreSQL will be auto-discovered and the agent will connect through the Unix socket using the default configuration with the default postgres user. If this does not work, you can create a user for Sysdig Monitor and give it enough permissions to read Postgres stats. To do this, execute the following example statements on your server:
create user "sysdig-cloud" with password 'password';
grant SELECT on pg_stat_database to "sysdig-cloud";
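To confirm that the grant works (assuming the psql client is installed and Postgres accepts local TCP connections; adjust the host and user to your setup), run a quick query as the new user:
psql -h 127.0.0.1 -U sysdig-cloud -d postgres -c "SELECT datname, numbackends FROM pg_stat_database LIMIT 3;"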
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
By default, Sysdig's dragent.default.yaml uses the following code to connect with Postgres.
app_checks:
- name: postgres
pattern:
comm: postgres
port: 5432
conf:
unix_sock: "/var/run/postgresql/"
username: postgres
If a special user for Sysdig is created, update the dragent.yaml file as shown in Example 1, below.
Never edit dragent.default.yaml directly; always edit only dragent.yaml.
Example 1: Special User
Update the username and password created for the Sysdig agent in the
respective fields, as follows:
app_checks:
- name: postgres
pattern:
comm: postgres
port: 5432
conf:
username: sysdig-cloud
password: password
Example 2: Connecting on Unix Socket
If Postgres is listening on the Unix socket /tmp/.s.PGSQL.5432, set the value of unix_sock to /tmp/:
app_checks:
- name: postgres
pattern:
comm: postgres
port: 5432
conf:
unix_sock: "/tmp/"
username: postgres
Example 3: Relations
Lists of relations (tables) can be specified to track per-relation metrics.
A single relation can be specified in two ways: by exact name (relation_name) or by a regular expression (relation_regex), as shown in the example below.
If schemas are not provided, all schemas are included. dbname must be provided if relations is specified.
app_checks:
- name: postgres
pattern:
comm: postgres
port: 5432
conf:
username: <username>
password: <password>
dbname: <user_db_name>
relations:
- relation_name: <table_name_1>
schemas:
- <schema_name_1>
- relation_regex: <table_pattern>
Example 4: Other Optional Parameters
app_checks:
- name: postgres
check_module: postgres
pattern:
comm: postgres
port: 5432
conf:
username: postgres
unix_sock: "/var/run/postgresql"
dbname: <user_db_name>
#collect_activity_metrics: true
#collect_default_database: true
#tag_replication_role: true
Optional Parameters

| Parameter | Description | Default |
|---|---|---|
| collect_activity_metrics | When set to true, enables metrics from pg_stat_activity. The new metrics added are: postgresql.active_queries, postgresql.transactions.idle_in_transaction, postgresql.transactions.open, postgresql.waiting_queries | false |
| collect_default_database | When set to true, collects statistics from the default database, postgres. All metrics from the postgres database carry the tag db:postgres | false |
| tag_replication_role | When set to true, metrics and checks are tagged with replication_role:<master or standby> | false |
Example 5: Custom Metrics Using Custom Queries
Personalized custom metrics can be collected from Postgres using custom
queries.
app_checks:
- name: postgres
pattern:
comm: postgres
port: 5432
conf:
unix_sock: "/var/run/postgresql/"
username: postgres
custom_queries:
- metric_prefix: postgresql.custom
query: <QUERY>
columns:
- name: <COLUNMS_1_NAME>
type: <COLUMNS_1_TYPE>
- name: <COLUNMS_2_NAME>
type: <COLUMNS_2_TYPE>
tags:
- <TAG_KEY>:<TAG_VALUE>
| Option | Required | Description |
|---|---|---|
| metric_prefix | Yes | Each metric starts with the chosen prefix. |
| query | Yes | The SQL to execute. It can be a simple statement or a multi-line script. All rows of the result are evaluated. Use the YAML pipe character if you require a multi-line script. |
| columns | Yes | A list representing each column, ordered sequentially from left to right. The number of entries must equal the number of columns returned by the query. Each entry requires two pieces of data: name, the suffix appended to metric_prefix to form the full metric name (if type is set to tag, the column is instead applied as a tag to every metric collected by this query); and type, the submission method (gauge, count, rate, and so on), which can also be set to tag to tag each metric in the row with the name and value of the item in this column. |
| tags | No | A list of tags to apply to each metric (in addition to the tags described above). |
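As an illustration, a hypothetical custom query that reports the number of connections per database as a gauge, tagged by database name, could look like the following; the metric suffixes and tag names are examples, not predefined values:
app_checks:
  - name: postgres
    pattern:
      comm: postgres
      port: 5432
    conf:
      unix_sock: "/var/run/postgresql/"
      username: postgres
      custom_queries:
        - metric_prefix: postgresql.custom
          query: SELECT datname, numbackends FROM pg_stat_database
          columns:
            - name: db            # applied as a tag on each metric in the row
              type: tag
            - name: connections   # emitted as postgresql.custom.connections
              type: gauge
          tags:
            - source:custom_query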
Metrics Available
See PostgreSQL Metrics.
Result in the Monitor UI
Default Dashboard
The default PostgreSQL dashboard includes combined metrics and
individual metrics in an overview page.

Other Views
You can also view individual metric charts from a drop-down menu in an
Explore view.

2.22 - RabbitMQ
RabbitMQ is an open-source message-broker
software (sometimes called message-oriented middleware) that implements
Advanced Message Queuing Protocol (AMQP). The RabbitMQ server is written
in the Erlang language and is built on the Open Telecom Platform
framework for clustering and fail-over. Client libraries to interface
with the broker are available in all major programming languages. If
RabbitMQ is installed on your environment, the Sysdig agent will
automatically connect. See the Default Configuration section, below.
The Sysdig agent automatically collects all metrics with the default
configuration. You may need to edit the dragent.yaml
file if a metrics
limit is reached.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
RabbitMQ Setup
Enable the RabbitMQ management plugin. See RabbitMQ’s
documentation to enable it.
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
By default, Sysdig’s dragent.default.yaml
uses the following code to
connect with RabbitMQ and collect all metrics.
app_checks:
- name: rabbitmq
pattern:
port: 15672
conf:
rabbitmq_api_url: "http://localhost:15672/api/"
rabbitmq_user: guest
rabbitmq_pass: guest
The RabbitMQ app check tracks various entities, such as exchanges,
queues and nodes. Each of these entities has its maximum limits. If the
limit is reached, metrics can be controlled by editing the
dragent.yaml
file, as in the following examples.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example 1: Manage logging_interval
When a maximum limit is exceeded, the app check will log an info
message:
rabbitmq: Too many <entity type> (<number of entities>) to fetch and maximum limit is (<configured limit>). You must choose the <entity type> you are interested in by editing the dragent.yaml configuration file
This message is suppressed by a configuration parameter,
logging_interval
.
Its default value is 300 seconds. This can be altered by specifying a
different value in dragent.yaml
.
app_checks:
- name: rabbitmq
pattern:
port: 15672
conf:
rabbitmq_api_url: "http://localhost:15672/api/"
rabbitmq_user: guest
rabbitmq_pass: guest
logging_interval: 10 # Value in seconds. Default is 300
Example 2: Specify Nodes, Queues, or Exchanges
Each of the tracked RabbitMQ entities has its maximum limits. As of
Agent v10.5.1, the default limits are as follows:
Exchanges: 16 per-exchange metrics
Queues: 20 per-queue metrics
Nodes: 9 per-node metrics
The max_detailed_* settings for the RabbitMQ app check do not limit the reported number of queues, exchanges, and nodes, but rather the number of metrics generated for those objects. For example, a single queue might report up to 20 metrics, so set max_detailed_queues to roughly 20 times the actual number of queues, as sketched below.
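The following sketch illustrates raising these limits in the conf section; the values are placeholders sized for roughly 20 queues and 10 exchanges, and the max_detailed_exchanges and max_detailed_nodes names are assumed to follow the same max_detailed_* pattern described above:
app_checks:
  - name: rabbitmq
    pattern:
      port: 15672
    conf:
      rabbitmq_api_url: "http://localhost:15672/api/"
      rabbitmq_user: guest
      rabbitmq_pass: guest
      max_detailed_queues: 400       # ~20 metrics per queue x 20 queues
      max_detailed_exchanges: 160    # ~16 metrics per exchange x 10 exchanges
      # max_detailed_nodes: 90       # ~9 metrics per node x 10 nodes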
The metrics for these entities are tagged. If any of these entities are
present but no transactions have occurred for them, the metrics are
still reported with 0 values, though without tags. Therefore, when
segmenting these metrics, the tags will show as unset
in the Sysdig
Monitor Explore view. However, all such entities are still counted
against the maximum limits. In such a scenario, you can specify the
entity names for which you want to collect metrics in the dragent.yaml
file.
app_checks:
- name: rabbitmq
pattern:
port: 15672
conf:
rabbitmq_api_url: "http://localhost:15672/api/"
rabbitmq_user: guest
rabbitmq_pass: guest
tags: ["queues:<queuename>"]
nodes:
- rabbit@localhost
- rabbit2@domain
nodes_regexes:
- bla.*
queues:
- queue1
- queue2
queues_regexes:
- thisqueue-.*
- another_\d+queue
exchanges:
- exchange1
- exchange2
exchanges_regexes:
- exchange*
Names can be specified by exact name or by regular expression.
Example 3: Custom Tags
Optional tags can be applied to every emitted metric, service check, and/or event, as in the following example.
app_checks:
- name: rabbitmq
pattern:
port: 15672
conf:
rabbitmq_api_url: "http://localhost:15672/api/"
rabbitmq_user: guest
rabbitmq_pass: guest
tags: ["some_tag:some_value"]
Example 4: filter_by_node
Use filter_by_node: true
if you want each node to report information
localized to the node. Without this option, each node reports
cluster-wide info (as presented by RabbitMQ itself). This option makes
it easier to view the metrics in the UI by removing redundant
information reported by individual nodes.
Default: false
.
Prerequisite: Sysdig agent v. 92.3 or higher.
app_checks:
- name: rabbitmq
pattern:
port: 15672
conf:
rabbitmq_api_url: "http://localhost:15672/api/"
rabbitmq_user: guest
rabbitmq_pass: guest
filter_by_node: true
Metrics Available
See RabbitMQ Metrics.
Result in the Monitor UI

2.23 - RedisDB
Redis is an open-source (BSD licensed), in-memory
data structure store, used as a database, cache, and message broker. If
Redis is installed in your environment, the Sysdig agent will
automatically connect in most cases. You may need to edit the default
entries to get additional metrics. See the Default Configuration
section, below.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
Application Setup
Redis will automatically expose all metrics. You do not need to
configure anything in the Redis instance.
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
By default, Sysdig’s dragent.default.yaml
uses the following code to
connect with Redis and collect basic metrics:
app_checks:
- name: redis
check_module: redisdb
pattern:
comm: redis-server
conf:
host: 127.0.0.1
port: "{port}"
Some additional metrics can be collected by editing the configuration
file as shown in following examples. The options shown in Example 2 are
relevant if Redis requires authentication or if a Unix socket is used.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example 1: Key Lengths
The following example entry results in the metric redis.key.length
in
the Sysdig Monitor UI, displaying the length of specific keys (segmented
by: key
). To enable, provide the key names in dragent.yaml
as
follows.
Note that length is 0 (zero) for keys that have a type other than
list, set, hash,
or sorted set.
Keys can be expressed as patterns;
see https://redis.io/commands/keys.
Sample entry in dragent.yaml
:
app_checks:
- name: redis
check_module: redisdb
pattern:
comm: redis-server
conf:
host: 127.0.0.1
port: "{port}"
keys:
- "list_1"
- "list_9*"
Example 2: Additional Configuration Options
app_checks:
- name: redis
check_module: redisdb
pattern:
comm: redis-server
conf:
host: 127.0.0.1
port: "{port}"
# unix_socket_path: /var/run/redis/redis.sock # can be used in lieu of host/port
# password: mypassword # if your Redis requires auth
Example 3: COMMANDSTATS Metrics
You can also collect the INFO COMMANDSTATS result as metrics (redis.command.*). This works with Redis >= 2.6.
Sample implementation:
app_checks:
- name: redis
check_module: redisdb
pattern:
comm: redis-server
conf:
host: 127.0.0.1
port: "{port}"
command_stats: true
Metrics Available
See RedisDB Metrics.
Result in the Monitor UI

2.24 - SNMP
Simple Network Management Protocol
(SNMP)
is an application-layer protocol used to manage and monitor network
devices and their functions. The Sysdig agent can connect to network
devices and collect metrics using SNMP.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
SNMP Overview
Simple Network Management Protocol
(SNMP)
is an Internet Standard protocol for collecting and configuring information about devices on a network. Network devices include physical devices such as switches, routers, and servers.
SNMP has three primary versions (SNMPv1, SNMPv2c, and SNMPv3); SNMPv2c is the most widely used.
SNMP allows device vendors to expose management data in the form of
variables on managed systems organized in a management information base
(MIB), which describe the system status and configuration. The devices
can be queried as well as configured remotely using these variables.
Certain MIBs are generic and supported by the majority of the device
vendors. Additionally, each vendor can have their own private/enterprise
MIBs for vendor-specific information.
An SNMP MIB is a collection of objects, each uniquely identified by an Object Identifier (OID). OIDs are represented in the form x.0, where x is the name of the object in the MIB definition.
For example, suppose you want to identify an instance of the variable sysDescr. The object class for sysDescr is:

| iso | org | dod | internet | mgmt | mib | system | sysDescr |
|---|---|---|---|---|---|---|---|
| 1 | 3 | 6 | 1 | 2 | 1 | 1 | 1 |

Hence, the object type, x, would be 1.3.6.1.2.1.1.1.
SNMP Agent Configuration
To monitor the servers with the Sysdig agent, the SNMP agent must be
installed on the servers to query the system information.
For Ubuntu-based servers, use the following commands to install the SNMP
Daemon:
sudo apt-get update
sudo apt-get install snmpd
Next, configure the SNMP agent to respond to queries from the SNMP manager by updating the configuration file located at /etc/snmp/snmpd.conf.
Below are the important fields that must be configured:
snmpd.conf
# Listen for connections on all interfaces (both IPv4 *and* IPv6)
agentAddress udp:161,udp6:[::1]:161
## ACCESS CONTROL
## system + hrSystem groups only
view systemonly included .1.3.6.1.2.1.1
view systemonly included .1.3.6.1.2.1.25.1
view systemonly included .1.3.6.1.2.1.31.1
view systemonly included .1.3.6.1.2.1.2.2.1.1
# Default access to basic system info
rocommunity public default -V systemonly
# rocommunity6 is for IPv6
rocommunity6 public default -V systemonly
After making changes to the config file, restart the snmpd
service
using:
sudo service snmpd restart
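Before configuring the Sysdig agent, you can confirm that snmpd answers queries. This assumes the snmp command-line tools are installed and uses the public community string configured above; the OID is the sysDescr example from earlier:
snmpget -v2c -c public localhost 1.3.6.1.2.1.1.1.0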
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
No default configuration is present for the SNMP check.
You must specify the OID/MIB for every parameter you want to collect, as in the following example.
The OIDs configured in dragent.yaml are included in the snmpd.conf configuration under the 'ACCESS CONTROL' section.
Ensure that the community_string is the same as the one configured in the system configuration (rocommunity).
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example
app_checks:
- name: snmp
pattern:
comm: python
arg: /opt/draios/bin/sdchecks
interval: 30
conf:
mibs_folder: /usr/share/mibs/ietf/
ip_address: 52.53.158.103
port: 161
community_string: public
# Only required for snmp v1, will default to 2
# snmp_version: 2
# Optional tags can be set with each metric
tags:
- vendor:EMC
- array:VNX5300
- location:front
metrics:
- OID: 1.3.6.1.2.1.25.2.3.1.5
name: hrStorageSize
- OID: 1.3.6.1.2.1.1.7
name: sysServices
- MIB: TCP-MIB
symbol: tcpActiveOpens
- MIB: UDP-MIB
symbol: udpInDatagrams
- MIB: IP-MIB
table: ipSystemStatsTable
symbols:
- ipSystemStatsInReceives
metric_tags:
- tag: ipversion
index: 1 # specify which index you want to read the tag value from
- MIB: IF-MIB
table: ifTable
symbols:
- ifInOctets
- ifOutOctets
metric_tags:
- tag: interface
column: ifDescr # specify which column to read the tag value from
The Sysdig agent allows you to monitor the SNMP counters and gauges of your choice. For each device, specify the metrics that you want to monitor in the metrics subsection using one of the following methods:
Specify a MIB and the symbol that you want to export
metrics:
- MIB: UDP-MIB
symbol: udpInDatagrams
Specify an OID and the name you want the metric to appear under in
Sysdig Monitor:
metrics:
- OID: 1.3.6.1.2.1.6.5
name: tcpActiveOpens
#The name here is the one specified in the MIB but you could use any name.
Specify a MIB and a table from which to extract information:
metrics:
- MIB: IF-MIB
table: ifTable
symbols:
- ifInOctets
metric_tags:
- tag: interface
column: ifDescr
Metrics Available
The SNMP check does not have default metrics. All metrics specified in the dragent.yaml file appear with the snmp.* prefix.
Result in the Monitor UI

2.25 - Supervisord
The Supervisor daemon is a client/server system that allows its users to monitor and control a number of processes on UNIX-like operating systems. The Supervisor check monitors the uptime, status, and number of processes running under Supervisord.
No default configuration is provided for the Supervisor check; you must
provide the configuration in the dragent.yaml
file for the Sysdig
agent to collect the data provided by Supervisor.
This page describes the setup steps required on Supervisor, how to edit
the Sysdig agent configuration to collect additional information, the
metrics available for integration, and a sample result in the Sysdig
Monitor UI.
Supervisor Setup
Configuration
The Sysdig agent can collect data from Supervisor via HTTP server or
UNIX socket. The agent collects the same data regardless of the
configured collection method.
Uncomment the following lines, or add them if they are not present, in /etc/supervisor/supervisord.conf:
[inet_http_server]
port=localhost:9001
username=user # optional
password=pass # optional
...
[supervisorctl]
serverurl=unix:///tmp/supervisor.sock
...
[unix_http_server]
file=/tmp/supervisor.sock
chmod=777 # make sure chmod is set so that non-root users can read the socket.
...
[program:foo]
command=/bin/cat
The programs controlled by Supervisor are given by different [program]
sections in the configuration. Each program you want to manage by
Supervisor must be specified in the Supervisor configuration file, with
its supported options in the [program]
section. See Supervisor’s
sample.conf
file for details.
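A quick way to confirm that the socket or HTTP endpoint and credentials work (assuming supervisorctl is available on the same host) is to query Supervisor directly:
supervisorctl -s unix:///tmp/supervisor.sock status
# or, over HTTP with the optional credentials:
supervisorctl -s http://localhost:9001 -u user -p pass status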
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
By default, Sysdig’s dragent.default.yaml
does not have any
configuration to connect the agent with Supervisor. Edit dragent.yaml
following the Examples given to connect with Supervisor and collect
supervisor.*
metrics.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example 1: Connect by UNIX Socket
- name: supervisord
pattern:
comm: supervisord
conf:
socket: "unix:///tmp/supervisor.sock"
Example 2: Connect by Host Name and Port, Optional Authentication
- name: supervisord
pattern:
comm: supervisord
conf:
host: localhost
port: 9001
# user: user # Optional. Required only if a username is configured.
# pass: pass # Optional. Required only if a password is configured.
Metrics Available
| Metric | Description |
|---|---|
| supervisord.process.count (gauge) | The number of processes monitored by supervisord. Shown as process. |
| supervisord.process.uptime (gauge) | The process uptime. Shown as second. |

See also Supervisord Metrics.
Service Check
supervisord.can_connect:
Returns CRITICAL if the Sysdig agent cannot connect to the configured HTTP server or UNIX socket, otherwise OK.
supervisord.process.status:

| SUPERVISORD STATUS | SUPERVISORD.PROCESS.STATUS |
|---|---|
| STOPPED | CRITICAL |
| STARTING | UNKNOWN |
| RUNNING | OK |
| BACKOFF | CRITICAL |
| STOPPING | CRITICAL |
| EXITED | CRITICAL |
| FATAL | CRITICAL |
| UNKNOWN | UNKNOWN |
Result in the Monitor UI

2.26 - TCP
You can monitor the status of your custom application’s port using the
TCP check. This check will routinely connect to the designated port and
send Sysdig Monitor a simple on/off metric and response time.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
TCP Application Setup
Any application listening on a TCP port can be monitored with
tcp_check
.
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
No default configuration is provided in the default settings file; you
must add the entries in Example 1 to the user settings config file
dragent.yaml.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example
- name: tcp_check
check_module: tcp_check
pattern:
comm: httpd
arg: DFOREGROUND
conf:
port: 80
collect_response_time: true
This example shows monitoring a TCP check on an Apache process running
on the host on port 80.
comm: matches the command name of the Apache server process that listens on port 80.
If you want the response time for your port, meaning the amount of time
the process takes to accept the connection, you can add the
collect_response_time: true
parameter under the conf:
section and the additional metric network.tcp.response_time
will
appear in the Metrics list.
Do not use port:
under the pattern
: section in this case,
because if the process is not listening it will not be matched and the
metric will not be sent to Sysdig Monitor.
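Independently of the agent, you can confirm that the port is accepting connections. This assumes netcat is installed; adjust the host and port to your setup:
nc -zv 127.0.0.1 80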
Metrics Available
| Metric | Description |
|---|---|
| network.tcp.response_time (gauge) | The response time of a given host and TCP port, tagged with url, e.g. 'url:192.168.1.100:22'. Shown as second. |

See TCP Metrics.
Service Checks
tcp.can_connect
:
DOWN if the agent
cannot connect to the configured host and port,
otherwise UP.
Result in the Monitor UI

2.27 - Varnish
Varnish HTTP Cache is a web application
accelerator, also known as a “caching HTTP reverse proxy.” You install
it in front of any server that speaks HTTP and configure it to cache the
contents. If Varnish is installed on your environment, the Sysdig agent
will automatically connect. See the Default Configuration section,
below.
The Sysdig Agent automatically collects all metrics. You can also edit
the configuration to emit service checks for the back end.
This page describes the default configuration settings, how to edit the
configuration to collect additional information, the metrics available
for integration, and a sample result in the Sysdig Monitor UI.
Varnish Setup
Varnish will automatically expose all metrics. You do not need to add
anything to the Varnish instance.
Sysdig Agent Configuration
Review how to Edit dragent.yaml to Integrate or Modify Application
Checks.
Default Configuration
By default, Sysdig’s dragent.default.yaml
uses the following code to
connect with Varnish and collect all but the VBE metrics. See Example 2
Enable Varnish VBE
Metrics.
metrics_filter:
- exclude: varnish.VBE.*
app_checks:
- name: varnish
interval: 15
pattern:
comm: varnishd
conf:
varnishstat: /usr/bin/varnishstat
Optionally, if you want to submit service checks for the health of each
back end, you can configure varnishadm
and edit dragent.yaml
as in
Example 1.
Remember! Never edit dragent.default.yaml
directly; always edit
only dragent.yaml
.
Example 1 Service Health Checks with varnishadm
When varnishadm
is configured, the Sysdig agent must be allowed to
execute the binary with root privileges. Add the following to your
/etc/sudoers
file:
sysdig-agent ALL=(ALL) NOPASSWD:/usr/bin/varnishadm
Then edit dragent.yaml
as follows. Note: If you have configured
varnishadm
and your secret file is NOT /etc/varnish/secret
, you can
comment out secretfile.
app_checks:
- name: varnish
interval: 15
pattern:
comm: varnishd
conf:
varnishstat: /usr/bin/varnishstat
varnishadm: /usr/bin/varnishadm
secretfile: /etc/varnish/secret
This example enables the following service check.
varnish.backend_healthy
: The agent submits a service check for each
Varnish backend, tagging each with backend:<backend_name>
.
Example 2 Enable Varnish VBE Metrics
Varnish VBE metrics are dynamically generated (and therefore are not
listed in the Metrics
Dictionary). Because they
generate unique metric names with timestamps, they can clutter metric
handling and are filtered out by default. If you want to collect these
metrics, use include
in the metrics_filter
in dragent.yaml
:
metrics_filter:
- include: varnish.VBE.*
app_checks:
- name: varnish
interval: 15
pattern:
comm: varnishd
conf:
varnishstat: /usr/bin/varnishstat
Metrics Available
See Varnish Metrics.
Result in the Monitor UI

3 - (Legacy) Create a Custom App Check
Application checks are integrations that allow the Sysdig agent to poll
specific metrics exposed by any application, and the built-in app checks
currently supported are listed on the App Checks main
page. Many other Java-based
applications are also supported out-of-the-box.
If your application is not already supported though, you have a few
options:
Utilize Prometheus, StatsD, or JMX to collect custom metrics.
Send a request to support@sysdig.com, and we’ll do our best to add support for your application.
Create your own check by following the instructions below.
If you do write a custom check, let us know. We love hearing about how
our users extend Sysdig Monitor, and we can also consider embedding your
app check automatically in the Sysdig agent.
See also Understanding the Agent Config
Files for details on
accessing and editing the agent configuration files in general.
Check Anatomy
Essentially, an app check is a Python Class that extends
AgentCheck
:
from checks import AgentCheck
class MyCustomCheck(AgentCheck):
# namespaces of the monitored process to join
# right now we support 'net', 'mnt' and 'uts'
# put there the minimum necessary namespaces to join
# usually 'net' is enough. In this case you can also omit the variable
# NEEDED_NS = ( 'net', )
# def __init__(self, name, init_config, agentConfig):
# '''
# Optional, define it if you need custom initialization
# remember to accept these parameters and pass them to the superclass
# '''
# AgentCheck.__init__(self, name, init_config, agentConfig)
# self.myvar = None
def check(self, instance):
'''
This function gets called to perform the check.
Connect to the application, parse the metrics and add them to aggregation using
superclass methods like `self.gauge(metricname, value, tags)`
'''
server_port = instance['port']
self.gauge("testmetric", 1)
Put this file into /opt/draios/lib/python/checks.custom.d
(create
the directory if not present) and it will be available to the Sysdig
agent. To run your checks, you need to supply configuration information
in the agent’s config file, dragent.yaml
as is done with bundled
checks:
app_checks:
- name: voltdb # check name, must be unique
# name of your .py file, if it's the same of the check name you can omit it
# check_module: voltdb
pattern: # pattern to match the application
comm: java
arg: org.voltdb.VoltDB
conf:
port: 21212 # any key value config you need on `check(self, instance_conf)` function
Check Interface Detail
As you can see, the most important piece of the check interface is the
check function. The function declaration is:
def check(self, instance)
instance
is a dict containing the configuration of the check. It
will contain all the attributes found in the conf:
section in
dragent.yaml
plus the following:
name: The unique name of the check.
ports: An array of all listening ports of the process.
port: The first listening port of the process.
These attributes are available as defaults and allow you to automatically configure your check. The conf: section has higher priority over these values.
Inside the check function you can call these methods to send metrics:
self.gauge(metric_name, value, tags) # Sample a gauge metric
self.rate(metric_name, value, tags) # Sample a point, with the rate calculated at the end of the check
self.increment(metric_name, value, tags) # Increment a counter metric
self.decrement(metric_name, value, tags) # Decrement a counter metric
self.histogram(metric_name, value, tags) # Sample a histogram metric
self.count(metric_name, value, tags) # Sample a raw count metric
self.monotonic_count(metric_name, value, tags) # Sample an increasing counter metric
Usually the most used are gauge and rate. Besides the self-explanatory metric_name and value parameters, you can also add tags to your metric using this format:
tags = [ "key:value", "key2:value2", "key_without_value"]
tags is an array of strings representing tags, either as single values or as key/value pairs. They are useful in Sysdig Monitor for graph segmentation.
You can also send service checks which are on/off metrics, using this
interface:
self.service_check(name, status, tags)
Where status can be:
AgentCheck.OK
AgentCheck.WARNING
AgentCheck.CRITICAL
AgentCheck.UNKNOWN
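Putting these pieces together, here is a minimal sketch of a custom check that verifies a TCP port and emits both a gauge and a service check. The file name, metric names, and the optional host key are illustrative, not part of the built-in checks:
import socket
import time

from checks import AgentCheck

class PortLatencyCheck(AgentCheck):
    def check(self, instance):
        # 'port' is filled in automatically from the matched process;
        # 'host' can optionally be set in the conf: section of dragent.yaml
        host = instance.get('host', '127.0.0.1')
        port = instance['port']
        tags = ["target:%s:%s" % (host, port)]
        start = time.time()
        try:
            sock = socket.create_connection((host, port), timeout=5)
            sock.close()
            # report connection latency in seconds
            self.gauge("custom.port.latency", time.time() - start, tags)
            self.service_check("custom.port.can_connect", AgentCheck.OK, tags)
        except Exception:
            self.service_check("custom.port.can_connect", AgentCheck.CRITICAL, tags)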
Testing
To test your check you can launch Sysdig App Checks from the command
line to avoid running the full agent and iterate faster:
# from /opt/draios directory
./bin/sdchecks runCheck <check_unique_name> <process_pid> [<process_vpid>] [<process_port>]
check_unique_name: The check name as given in the config file.
pid: The process PID as seen from the host.
vpid: Optional; the process PID as seen inside the container. Defaults to 1.
port: Optional; the port where the process is listening. Defaults to None.
Example:
./bin/sdchecks runCheck redis 1254 1 6379
5658:INFO:Starting
5658:INFO:Container support: True
5658:INFO:Run AppCheck for {'ports': [6379], 'pid': 5625, 'check': 'redis', 'vpid': 1}
Conf: {'port': 6379, 'socket_timeout': 5, 'host': '127.0.0.1', 'name': 'redis', 'ports': [6379]}
Metrics: # metrics array
Checks: # metrics check
Exception: None # exceptions
The output is intentionally raw to allow you to better debug what the
check is doing.
4 - (Legacy) Create Per-Container Custom App Checks
Sysdig supports adding custom application check-script configurations
for each individual container in the infrastructure. This avoids
multiple edits and entries to achieve container specific customization.
In particular, this enables PaaS to work smarter, by delegating
application teams to configure their own checks.
See also Understanding the Agent Config
Files for details on
accessing and editing the agent configuration files in general.
How It Works
The SYSDIG_AGENT_CONF variable stores a YAML-formatted configuration
for your app check and will be used to match app check configurations.
All original app_checks are
available, and the syntax is the same as for dragent.yaml
. You can add
the environment variable directly to the Docker file.
Example with Dockerfile
This example defines a per container app-check for Redis. Normally you
would have a YAML formatted entry installed into the agent’s
/opt/draios/etc/dragent.yaml
file that would look like this:
app_checks:
- name: redis
check_module: redisdb
pattern:
comm: redis-server
conf:
host: 127.0.0.1
port: "{port}"
password: protected
For the per-container method, convert and add the above entry to the
Docker file via the SYSDIG_AGENT_CONF environment variable:
FROM redis
# This config file adds a password for accessing redis instance
ADD redis.conf /
ENV SYSDIG_AGENT_CONF { "app_checks": [{ "name": "redis", "check_module": "redisdb", "pattern": {"comm": "redis-server"}, "conf": { "host": "127.0.0.1", "port": "6379", "password": "protected"} }] }
ENTRYPOINT ["redis-server"]
CMD [ "/redis.conf" ]
Example with Docker CLI
You can pass the variable when starting a container with docker run, using the -e/--env flag, or inject it using orchestration systems like Kubernetes:
PER_CONTAINER_CONF='{ "app_checks": [{ "name": "redis", "check_module": "redisdb", "pattern": {"comm": "redis-server"}, "conf": { "host": "127.0.0.1", "port": "6379", "password": "protected"} }] }'
docker run --name redis -v /tmp/redis.conf:/etc/redis.conf -e SYSDIG_AGENT_CONF="${PER_CONTAINER_CONF}" -d redis /etc/redis.conf
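For Kubernetes, the same environment variable can be injected through the pod spec. The following is a minimal sketch; the pod name, container image, and configuration values are illustrative:
apiVersion: v1
kind: Pod
metadata:
  name: redis
spec:
  containers:
    - name: redis
      image: redis
      env:
        - name: SYSDIG_AGENT_CONF
          value: '{ "app_checks": [{ "name": "redis", "check_module": "redisdb", "pattern": {"comm": "redis-server"}, "conf": { "host": "127.0.0.1", "port": "6379", "password": "protected"} }] }'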