Agent Configuration
Out of the box, the Sysdig agent will gather and report on a wide variety of pre-defined metrics from a range of platforms and applications. It can also accommodate any number of custom parameters for additional metrics collection.
You can edit the agent configuration file to extend the default behavior, including additional metrics for JMX,
StatsD, Prometheus, or a wide range of other monitoring integrations.
Use this section when you need to change the default or pre-defined settings by editing the agent configuration files.
For the latest helm-based installation instructions and configuration options, see sysdig-deploy.
1 - Understand the Agent Configuration
Out of the box, the Sysdig agent will gather and report on a wide variety of predefined metrics. It can also accommodate any number of custom parameters for additional metrics collection.
The agent relies on a configuration file named dragent.yaml to define metrics collection parameters. This file is located in the /opt/draios/etc/ directory. You can add configuration parameters directly in YAML as key-value pairs, or pass them through an environment variable such as ADDITIONAL_CONF.
The dragent.yaml
file can be accessed and edited in several ways, depending on how the agent was installed. This document describes how to modify dragent.yaml
.
Environments
For more information about configuring each of the three environments listed in this section, see Edit the Configuration File.
Kubernetes
If Sysdig agent is installed in a Kubernetes environment, you can edit the dragent.yaml
file using one of the following options:
values.yaml
ConfigMap
sysdig-deploy
Helm chart
Non-Orchestrated
If Sysdig agent is installed in a non-orchestrated environment such as Docker, you can edit the dragent.yaml
file using one of the following options:
Linux
If Sysdig agent is installed in a Linux host, edit the dragent.yaml
file directly.
Edit the Configuration File
dragent.yaml
Log in to the host where the agent is installed.
Open /opt/draios/etc/dragent.yaml
.
Edit the file using proper YAML syntax. See the examples at the bottom of the page.
Restart the agent for changes to take effect.
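The manual steps above could be scripted roughly as follows. This is a minimal sketch, not an official Sysdig tool: the function name append_agent_conf and the .bak backup naming are assumptions made for illustration, and the demonstration runs against a scratch file rather than the real /opt/draios/etc/dragent.yaml.

```shell
# Hypothetical helper: back up an agent config file, then append a YAML snippet.
# Usage: append_agent_conf <config-file> <yaml-snippet>
append_agent_conf() {
  conf_file="$1"
  snippet="$2"
  cp "$conf_file" "$conf_file.bak" || return 1   # keep a copy to roll back to
  printf '%s\n' "$snippet" >> "$conf_file"
}

# Demonstration against a scratch file (assumed content, for illustration only).
demo_file=$(mktemp)
printf 'customerid: 1234-your-key-here-1234\n' > "$demo_file"
append_agent_conf "$demo_file" "$(printf 'log:\n  file_priority: debug')"
cat "$demo_file"

# On a real host, restart the agent afterwards, for example:
# service dragent restart
```

Appending blindly can duplicate a top-level key that already exists in the file, so check the result by eye before restarting the agent.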
configmap.yaml
If you install the agent using DaemonSets on a Kubernetes cluster, you use configmap.yaml
to connect with and manipulate the underlying dragent.yaml
file.
Use the following ways to add parameters to configmap.yaml
:
You can edit the files locally and apply the changes with kubectl apply -f:
Open the configmap.yaml
file.
Edit the file as needed.
Apply the changes:
kubectl apply -f sysdig-agent-configmap.yaml
All the running agents will automatically pick the new configuration after Kubernetes pushes the changes across all the nodes in the cluster.
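As an illustration of how dragent.yaml settings sit inside the ConfigMap, the sketch below writes a small manifest to a scratch location. The ConfigMap name, the namespace, and the settings inside the dragent.yaml block are assumptions chosen for the example; match them to your own deployment before applying anything.

```shell
# Write a ConfigMap manifest that embeds dragent.yaml settings.
# The metadata values below are assumptions, not values required by Sysdig.
manifest="${TMPDIR:-/tmp}/sysdig-agent-configmap.yaml"
cat > "$manifest" <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: sysdig-agent
  namespace: sysdig-agent
data:
  dragent.yaml: |
    tags: linux:ubuntu,dept:dev,local:nyc
    log:
      file_priority: info
EOF

# The apply step is printed rather than executed, so the sketch runs anywhere.
echo "Next: kubectl apply -f $manifest"
```

Everything under the dragent.yaml key is passed through as a literal block, so it must keep its own YAML indentation relative to that key.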
docker run
Run the docker run
command with -e ADDITIONAL_CONF="<VARIABLES>"
where <VARIABLES>
contains all the customized parameters you want to include.
To insert ADDITIONAL_CONF
parameters in a docker run
command or a DaemonSet file, you must convert the YAML code into a single line. You can do the conversion manually for short snippets. To convert longer portions of YAML, use echo|sed
commands.
Write your configuration in YAML, as it would be entered directly in dragent.yaml
.
In a Bash shell, use echo
and sed
to convert to a single line:
echo "<YAML_CONTENT>" | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/\\n/g'
Insert the resulting line into the docker run
command or add it to the DaemonSet file as an ADDITIONAL_CONF
.
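The conversion steps above can be wrapped in two small shell functions. This is a sketch under stated assumptions: the function names yaml_to_single_line and single_line_to_yaml are hypothetical, and the reverse helper exists only to check the result by eye. The sed expression is the same one shown in the steps.

```shell
# Collapse multi-line YAML from stdin into one line, replacing every newline
# with a literal backslash-n, which is the form ADDITIONAL_CONF expects.
yaml_to_single_line() {
  sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/\\n/g'
}

# Reverse direction: expand literal \n sequences back into real newlines.
single_line_to_yaml() {
  printf '%b\n' "$1"
}

conf=$(printf 'log:\n  file_priority: debug\n  console_priority: error\n' | yaml_to_single_line)

# printf is used instead of echo because some shells' echo expands \n itself.
printf '%s\n' "$conf"
single_line_to_yaml "$conf"
```

Round-tripping a snippet through both functions is a quick way to confirm that indentation survived the conversion before pasting the line into a docker run command or DaemonSet file.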
Examples
Disable StatsD Collection
This example shows how to turn off StatsD collection and blacklist port 6443.
Sysdig agent uses port 6443 for both inbound and outbound communication with the Sysdig backend. The agent initiates a request and keeps a connection open with the Sysdig backend for the backend to push configurations, Falco rules, policies, and so on.
Ensure that you allow the agents’ inbound and outbound communication on TCP 6443 from the respective IP addresses associated with your SaaS Regions. Note that you are allowing the agent to send communication outbound on TCP 6443 to the inbound IP ranges listed in the SaaS Regions.
statsd:
  enabled: false
blacklisted_ports:
  - 6443
Use spaces, hyphens, and \n
correctly when manually converting to a single line:
ADDITIONAL_CONF="statsd:\n enabled: false\n blacklisted_ports:\n - 6443"
You can run a full agent startup Docker command with the single-line configuration as follows:
docker run \
--name sysdig-agent \
--privileged \
--net host \
--pid host \
-e ACCESS_KEY=1234-your-key-here-1234 \
-e TAGS=dept:sales,local:NYC \
-e ADDITIONAL_CONF="statsd:\n enabled: false\n blacklisted_ports:\n - 6443" \
-v /var/run/docker.sock:/host/var/run/docker.sock \
-v /dev:/host/dev \
-v /proc:/host/proc:ro \
-v /boot:/host/boot:ro \
-v /lib/modules:/host/lib/modules:ro \
-v /usr:/host/usr:ro \
quay.io/sysdig/agent
Add RabbitMQ App Check
This example helps you override the default configuration for a RabbitMQ app check.
app_checks:
  - name: rabbitmq
    pattern:
      port: 15672
    conf:
      rabbitmq_api_url: "http://localhost:15672/api/"
      rabbitmq_user: myuser
      rabbitmq_pass: mypassword
      queues:
        - MyQueue1
        - MyQueue2
From a Bash shell, issue the echo
command and sed script.
echo "app_checks:
  - name: rabbitmq
    pattern:
      port: 15672
    conf:
      rabbitmq_api_url: "http://localhost:15672/api/"
      rabbitmq_user: myuser
      rabbitmq_pass: mypassword
      queues:
        - MyQueue1
        - MyQueue2
" | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/\\n/g'
This results in the single-line format to be used with ADDITIONAL_CONF in a Docker command or DaemonSet file.
"app_checks:\n  - name: rabbitmq\n    pattern:\n      port: 15672\n    conf:\n      rabbitmq_api_url: http://localhost:15672/api/\n      rabbitmq_user: myuser\n      rabbitmq_pass: mypassword\n      queues:\n        - MyQueue1\n        - MyQueue2\n"
helm install
If you installed the Sysdig agent in Kubernetes using the Helm chart, then no configmap.yaml
file was downloaded. You can edit dragent.yaml
using the Helm syntax:
helm install \
--namespace sysdig-agent \
--set agent.sysdig.settings.tags='linux:ubuntu\,dept:dev\,local:nyc' \
--set global.clusterConfig.name='my_cluster' \
sysdig/sysdig-deploy
This command will be translated into the following:
data:
  dragent.yaml: |
    tags: linux:ubuntu,dept:dev,local:nyc
    k8s_cluster_name: my_cluster
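The same two settings can also be kept in a values file, which avoids the backslash-escaped commas that --set needs. The sketch below assumes the key paths mirror the --set flags shown above; the file location and the release-name placeholder are hypothetical.

```shell
# Write a Helm values file equivalent to the two --set flags shown above.
# Key paths mirror the flags; the file location is an assumption.
values_file="${TMPDIR:-/tmp}/sysdig-values.yaml"
cat > "$values_file" <<'EOF'
global:
  clusterConfig:
    name: my_cluster
agent:
  sysdig:
    settings:
      tags: linux:ubuntu,dept:dev,local:nyc
EOF

# The install step is printed rather than executed; <release-name> is a placeholder.
echo "Next: helm install <release-name> --namespace sysdig-agent -f $values_file sysdig/sysdig-deploy"
```

In a values file the tag list is an ordinary YAML string, so the commas are written without escaping.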
Environment Variables for Agent Configuration File
Name | Value | Description |
---|---|---|
ACCESS_KEY | Your Sysdig access key. | Required. |
TAGS | Meaningful tags you want applied to your instances. | Optional. These are displayed in Sysdig Monitor for ease of use. For example: tags: linux:ubuntu,dept:dev,local:nyc. See sysdig-agent-configmap.yaml. |
REGION | The region associated with your Sysdig application. | Enter the SaaS region. |
COLLECTOR | <collector-hostname.com> or 111.222.333.400 | Enter the hostname or IP address of the Sysdig collector service. Note that when used within dragent.yaml, it must be lowercase (collector). For SaaS regions, see: SaaS Regions and IP Ranges. |
COLLECTOR_PORT | 6443 | On-prem only. The port used by the Sysdig collector service. Default: 6443. |
SECURE | true | On-prem only. If using SSL/TLS to connect to the collector service, set the value to true, otherwise to false. |
CHECK_CERTIFICATE | false | On-prem only. Set to true when using SSL/TLS to connect to the collector service and the agent should check for a valid SSL/TLS certificate. |
ADDITIONAL_CONF | | Optional. A place to provide custom configuration values to the agent as environment variables. |
SYSDIG_PROBE_URL | | Optional. An alternative URL to download precompiled kernel modules. |
Here is a sample Docker command using environment variables:
docker run \
--name sysdig-agent \
--privileged \
--net host \
--pid host \
-e ACCESS_KEY=3e762f9a-3936-4c60-9cf4-c67e7ce5793b \
-e COLLECTOR=mycollector.elb.us-west-1.amazonaws.com \
-e COLLECTOR_PORT=6443 \
-e CHECK_CERTIFICATE=false \
-e TAGS=my_tag:some_value \
-e ADDITIONAL_CONF="log:\n file_priority: debug\n console_priority: error" \
-v /var/run/docker.sock:/host/var/run/docker.sock \
-v /dev:/host/dev \
-v /proc:/host/proc:ro \
-v /boot:/host/boot:ro \
-v /lib/modules:/host/lib/modules:ro \
-v /usr:/host/usr:ro \
--shm-size=350m \
quay.io/sysdig/agent
2 - Configure Agent Modes
Agent modes provide the ability to control metric collection to fit your
scale and specific requirement. You can choose one of the following
modes to do so:
Monitor
Monitor Light
Troubleshooting
Secure
Secure Light
Custom Metrics Only
Using a stripped-down mode limits collection of unneeded metrics, which
in turn prevents the consumption of excess resources and helps reduce
expenses.
Monitor
The Monitor mode offers an extensive collection of metrics. We recommend
this mode to monitor enterprise environments.
monitor is the default mode if you are running the Enterprise tier. To switch back to the Monitor mode from a different mode, do one of the following:
Add the following to the dragent.yaml file and restart the agent:
feature:
  mode: monitor
Remove the parameter related to the existing mode from the dragent.yaml file and restart the agent. For example, to switch from troubleshooting mode to monitor, delete the following lines:
feature:
  mode: troubleshooting
Monitor Light
Monitor Light caters to the users that run agents in a
resource-restrictive environment, or to those who are interested only in
a limited set of metrics.
Monitor Light provides CPU, Memory, File, File system, and Network
metrics. For more information, see Metrics Available in Monitor
Light.
Enable Monitor Light Mode
To switch to the Monitor Light mode, edit the dragent.yaml
file:
Open the dragent.yaml
file.
Add the following configuration parameter:
feature:
  mode: monitor_light
Restart the agent.
Troubleshooting
Troubleshooting mode offers sophisticated metrics with detailed
diagnostic capabilities. Some of these metrics are heuristic in nature.
In addition to the extensive metrics available in the Monitor mode,
Troubleshooting mode provides additional metrics such as net.sql
and
additional segmentation for file and network metrics. For more
information, see Additional Metrics Values Available in
Troubleshooting.
Enable Troubleshooting Mode
To switch to the Troubleshooting mode, edit the dragent.yaml
file:
Open the dragent.yaml
file.
Add the following configuration parameter:
feature:
  mode: troubleshooting
Restart the agent.
Secure
The secure mode supports only Sysdig
Secure features.
Sysdig agent collects no metrics in the secure mode, which, in turn,
minimizes network consumption and storage requirement in the Sysdig
backend. Lower resource usage can help reduce costs and improve
performance.
In the Secure mode, the Monitor UI shows no data because no metrics are
sent to the collector.
This feature requires agent v10.5.0 or above.
Enable Secure Mode
Open the dragent.yaml
file.
Add the following:
feature:
  mode: secure
Restart the agent.
Secure Light
The secure light mode supports only the following Sysdig Secure features:
Sysdig agent running in secure_light
mode consumes fewer resources than that of running in the secure mode.
This feature requires agent v12.10.0 or above.
Enable Secure Light
Open the dragent.yaml
file.
Add the following:
feature:
  mode: secure_light
Restart the agent.
Custom Metrics Only Mode
Custom Metrics Only mode collects the same metrics as the Monitor
Light mode, but also adds the ability to collect the following:
- Custom Metrics: StatsD, JMX, App Checks, and Prometheus
- Kubernetes State Metrics
As such, Custom Metrics Only mode is suitable if you would like to use most of the
features of Monitor mode but are limited in resources.
This mode is not compatible with Secure. If your account is
configured for Secure, you must explicitly disable Secure in
the agent configuration if you wish to use this mode.
This mode requires agent v12.4.0 or above.
Enable Custom Metrics Only Mode
Open the dragent.yaml
file.
Add the following configuration parameter:
feature:
  mode: custom-metrics-only
If your account is enabled for Secure, add the following:
security:
  enabled: false
secure_audit_streams:
  enabled: false
falcobaseline:
  enabled: false
This configuration explicitly disables the Secure features in the agent. If you do not disable Secure, the agent will not start due to incompatibility issues.
Restart the agent.
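Because every mode is selected through the same two-line block, a small helper can generate it and reject misspelled names. This is a sketch only: the function name feature_mode_snippet is hypothetical, and the values monitor and secure are assumed to follow the same pattern as the mode values shown explicitly on this page.

```shell
# Hypothetical helper: print the dragent.yaml block for a given agent mode.
# "monitor" and "secure" are assumed values; the others appear on this page.
feature_mode_snippet() {
  case "$1" in
    monitor|monitor_light|troubleshooting|secure|secure_light|custom-metrics-only)
      printf 'feature:\n  mode: %s\n' "$1"
      ;;
    *)
      echo "unknown agent mode: $1" >&2
      return 1
      ;;
  esac
}

feature_mode_snippet monitor_light
```

The printed block can be pasted into dragent.yaml as is. Note that custom-metrics-only uses hyphens while the other multi-word modes use underscores.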
2.1 - Metrics Available in Monitor Light
Monitor Light provides CPU, memory, file, file system, and network metrics.
Sysdig Legacy ID | Prometheus ID |
---|---|
cpu.cores.used | sysdig_host_cpu_cores_used
sysdig_container_cpu_cores_used
sysdig_program_cpu_cores_used |
cpu.cores.used.percent | sysdig_host_cpu_cores_used_percent
sysdig_container_cpu_cores_used_percent
sysdig_program_cpu_cores_used_percent |
cpu.idle.percent | sysdig_host_cpu_idle_percent |
cpu.iowait.percent | sysdig_host_cpu_iowait_percent |
cpu.nice.percent | sysdig_host_cpu_nice_percent |
cpu.stolen.percent | sysdig_host_cpu_stolen_percent |
cpu.system.percent | sysdig_host_cpu_system_percent |
cpu.used.percent | sysdig_host_cpu_used_percent
sysdig_container_cpu_used_percent
sysdig_program_cpu_used_percent |
cpu.user.percent | sysdig_host_cpu_user_percent |
load.average.percpu.1m | sysdig_host_load_average_percpu_1m |
load.average.percpu.5m | sysdig_host_load_average_percpu_5m |
load.average.percpu.15m | sysdig_host_load_average_percpu_15m |
memory.bytes.available | sysdig_host_memory_available_bytes |
memory.bytes.total | sysdig_host_memory_total_bytes |
memory.bytes.used | sysdig_host_memory_used_bytes
sysdig_container_memory_used_bytes
sysdig_program_memory_used_bytes |
memory.bytes.virtual | sysdig_host_memory_virtual_bytes
sysdig_container_memory_virtual_bytes
sysdig_program_memory_virtual_bytes |
memory.pageFault.major | None |
memory.pageFault.minor | None |
memory.swap.bytes.available | sysdig_host_memory_swap_available_bytes |
memory.swap.bytes.total | sysdig_host_memory_swap_total_bytes |
memory.swap.bytes.used | sysdig_host_memory_swap_used_bytes |
memory.swap.used.percent | sysdig_host_memory_swap_used_percent |
memory.used.percent | sysdig_host_memory_used_percent
sysdig_container_memory_used_percent
sysdig_program_memory_used_percent |
file.bytes.in | sysdig_host_file_in_bytes
sysdig_container_file_in_bytes sysdig_program_file_in_bytes |
file.bytes.out | sysdig_host_file_out_bytes
sysdig_container_file_out_bytes sysdig_program_file_out_bytes |
file.bytes.total | sysdig_host_file_bytes_total
sysdig_container_file_bytes_total sysdig_program_file_bytes_total |
file.iops.in | sysdig_host_file_in_iops
sysdig_container_file_in_iops |
file.iops.out | sysdig_host_file_out_iops
sysdig_container_file_out_iops |
file.iops.total | sysdig_host_file_iops_total
sysdig_container_file_iops_total sysdig_program_file_iops_total |
file.open.count | sysdig_host_file_open_count
sysdig_container_file_open_count |
file.time.in | sysdig_host_file_in_time
sysdig_container_file_in_time |
file.time.out | sysdig_host_file_out_time
sysdig_container_file_out_time |
file.time.total | sysdig_host_file_time_total
sysdig_container_file_time_total sysdig_program_file_time_total |
fs.bytes.free | sysdig_host_fs_free_bytes
sysdig_container_fs_free_bytes
sysdig_fs_free_bytes |
fs.bytes.total | sysdig_fs_total_bytes
sysdig_host_fs_total_bytes
sysdig_container_fs_total_bytes |
fs.bytes.used | sysdig_fs_used_bytes
sysdig_host_fs_used_bytes
sysdig_container_fs_used_bytes |
fs.free.percent | sysdig_fs_free_percent
sysdig_host_fs_free_percent
sysdig_container_fs_free_percent |
fs.inodes.total.count | sysdig_fs_inodes_total_count sysdig_container_fs_inodes_total_count
sysdig_host_fs_inodes_total_count |
fs.inodes.used.count | sysdig_fs_inodes_used_count sysdig_container_fs_inodes_used_count
sysdig_host_fs_inodes_used_count |
fs.inodes.used.percent | sysdig_fs_inodes_used_percent sysdig_container_fs_inodes_used_percent
sysdig_host_fs_inodes_used_percent |
fs.largest.used.percent | sysdig_container_fs_largest_used_percent sysdig_host_fs_largest_used_percent |
fs.root.used.percent | sysdig_container_fs_root_used_percent sysdig_host_fs_root_used_percent |
fs.used.percent | sysdig_fs_used_percent sysdig_container_fs_used_percent sysdig_host_fs_used_percent |
net.bytes.in | sysdig_host_net_in_bytes
sysdig_container_net_in_bytes
sysdig_program_net_in_bytes |
net.bytes.out | sysdig_host_net_out_bytes
sysdig_container_net_out_bytes
sysdig_program_net_out_bytes |
net.bytes.total | sysdig_host_net_total_bytes
sysdig_container_net_total_bytes
sysdig_program_net_total_bytes
sysdig_connection_net_total_bytes |
proc.count | sysdig_host_proc_count
sysdig_container_proc_count
sysdig_program_proc_count |
thread.count | sysdig_host_thread_count
sysdig_container_thread_count
sysdig_program_thread_count |
container.count | sysdig_container_count |
system.uptime | sysdig_host_system_uptime |
uptime | sysdig_host_up
sysdig_container_up
sysdig_program_up |
2.2 - Additional Metrics Values Available in Troubleshooting
In addition to the extensive set of metrics available in the monitor mode, Troubleshooting mode provides additional metrics, such as net.sql and net.mongodb, as well as additional segmentations for file and network metrics.
Sysdig Legacy ID | Prometheus ID | Additional Metrics Values Available When Segmented by |
---|---|---|
file.error.total.count | sysdig_host_file_error_total_count
sysdig_container_file_error_total_count
sysdig_program_file_error_total_count | file.name and file.mount labels |
file.bytes.total | sysdig_host_file_total_bytes
sysdig_container_file_total_bytes
sysdig_program_file_total_bytes | |
file.bytes.in | sysdig_host_file_in_bytes
sysdig_container_file_in_bytes
sysdig_program_file_in_bytes | |
file.bytes.out | sysdig_host_file_out_bytes
sysdig_container_file_out_bytes
sysdig_program_file_out_bytes | |
file.open.count | sysdig_host_file_open_count
sysdig_container_file_open_count
sysdig_program_file_open_count | |
file.time.total | sysdig_host_file_total_time
sysdig_container_file_total_time
sysdig_program_file_total_time | |
host.count | None | |
host.error.count | sysdig_host_syscall_error_count
sysdig_container_syscall_error_count | |
proc.count | sysdig_host_proc_count
sysdig_container_proc_count
sysdig_program_proc_count | |
proc.start.count | None | |
net.mongodb.collection | | all |
net.mongodb.error.count | sysdig_host_net_mongodb_error_count
sysdig_container_net_mongodb_error_count | |
net.mongodb.operation | | |
net.mongodb.request.count | sysdig_host_net_mongodb_request_count
sysdig_container_net_mongodb_request_count | |
net.mongodb.request.time | sysdig_host_net_mongodb_request_time
sysdig_container_net_mongodb_request_time | |
net.sql.query | | all |
net.sql.error.count | sysdig_host_net_sql_error_count
sysdig_container_net_sql_error_count | |
net.sql.query.type | | |
net.sql.request.count | sysdig_host_net_sql_request_count
sysdig_container_net_sql_request_count | |
net.sql.request.time | sysdig_host_net_sql_request_time
sysdig_container_net_sql_request_time | |
net.sql.table | | |
net.http.error.count | sysdig_host_net_http_error_count
sysdig_container_net_http_error_count | net.http.url |
net.http.method | None | |
net.http.request.count | sysdig_host_net_http_request_count
sysdig_container_net_http_request_count | |
net.http.request.time | sysdig_host_net_http_request_time
sysdig_container_net_http_request_time | |
net.bytes.in | sysdig_host_net_in_bytes
sysdig_container_net_in_bytes
sysdig_program_net_in_bytes | |
net.bytes.out | sysdig_host_net_out_bytes
sysdig_container_net_out_bytes
sysdig_program_net_out_bytes | |
net.request.time.worst.out | None | |
net.request.count | sysdig_host_net_request_count
sysdig_container_net_request_count
sysdig_program_net_request_count | |
net.request.time | sysdig_host_net_request_time
sysdig_container_net_request_time
sysdig_program_net_request_time | |
net.bytes.total | sysdig_host_net_total_bytes
sysdig_container_net_total_bytes
sysdig_program_net_total_bytes
sysdig_connection_net_total_bytes | |
net.http.request.time.worst | | all |
2.3 - Metrics Not Available in Essentials Mode
The following metrics will not be reported in the essentials
mode as compared to the monitor
mode:
Sysdig ID | Prometheus ID | Segmented By |
---|---|---|
net.bytes.in | sysdig_host_net_in_bytes
sysdig_container_net_in_bytes
sysdig_program_net_in_bytes | net.connection.server , net.connection.direction , net.connection.l4proto , and net.connection.client labels |
net.bytes.out | sysdig_host_net_out_bytes
sysdig_container_net_out_bytes
sysdig_program_net_out_bytes | |
net.connection.count.total | sysdig_host_net_connection_total_count
sysdig_container_net_connection_total_count
sysdig_program_net_connection_total_count
sysdig_connection_net_connection_total_count | |
net.connection.count.in | sysdig_host_net_connection_in_count
sysdig_container_net_connection_in_count
sysdig_program_net_connection_in_count
sysdig_connection_net_connection_in_count | |
net.connection.count.out | sysdig_host_net_connection_out_count
sysdig_container_net_connection_out_count
sysdig_program_net_connection_out_count
sysdig_connection_net_connection_out_count | |
net.request.count | sysdig_host_net_request_count
sysdig_container_net_request_count
sysdig_program_net_request_count | |
net.request.count.in | sysdig_host_net_request_in_count
sysdig_container_net_request_in_count
sysdig_program_net_request_in_count
sysdig_connection_net_request_in_count | |
net.request.count.out | sysdig_host_net_request_out_count
sysdig_container_net_request_out_count
sysdig_program_net_request_out_count
sysdig_connection_net_request_out_count | |
net.request.time | sysdig_host_net_request_time
sysdig_container_net_request_time
sysdig_program_net_request_time | |
net.request.time.in | sysdig_host_net_time_in_count
sysdig_container_net_time_in_count
sysdig_program_net_time_out_count
sysdig_connection_net_time_in_count | |
net.request.time.out | sysdig_host_net_time_out_count
sysdig_container_net_time_out_count
sysdig_program_net_time_out_count
sysdig_connection_net_time_out_count | |
net.bytes.total | sysdig_host_net_total_bytes
sysdig_container_net_total_bytes
sysdig_program_net_total_bytes
sysdig_connection_net_total_bytes | |
net.mongodb.collection | | all |
net.mongodb.error.count | sysdig_host_net_mongodb_error_count
sysdig_container_net_mongodb_error_count | |
net.mongodb.operation | | |
net.mongodb.request.count | sysdig_host_net_mongodb_request_count
sysdig_container_net_mongodb_request_count | |
net.mongodb.request.time | sysdig_host_net_mongodb_request_time
sysdig_container_net_mongodb_request_time | |
net.sql.query | | all |
net.sql.error.count | sysdig_host_net_sql_error_count
sysdig_container_net_sql_error_count | |
net.sql.query.type | | |
net.sql.request.count | sysdig_host_net_sql_request_count
sysdig_container_net_sql_request_count | |
net.sql.request.time | sysdig_host_net_sql_request_time
sysdig_container_net_sql_request_time | |
net.sql.table | | |
net.http.method | | |
net.http.request.count | sysdig_host_net_http_request_count
sysdig_container_net_http_request_count | |
net.http.request.time | sysdig_host_net_http_request_time
sysdig_container_net_http_request_time | |
net.http.statusCode | | |
net.http.url | | |
3 - Tune Agent
This section helps you configure the Sysdig agent in special circumstances to filter data, manage log levels, collect KSM, and process Kubernetes events.
3.1 - Enable HTTP Proxy for Agents
You can configure the agent to allow it to communicate with the Sysdig
collector through an HTTP proxy. HTTP proxy is usually configured to
offer greater visibility and better management of the network.
Agent Behaviour
The agent can connect to the collector through an HTTP proxy by sending
an HTTP CONNECT message and receiving a response. The proxy then
initiates a TCP connection to the collector. These two connections form
a tunnel that acts like one logical connection.
By default, the agent will encrypt all messages sent through this
tunnel. This means that after the initial CONNECT message and response,
all the communication on that tunnel is encrypted by SSL end-to-end.
This encryption is controlled by the top-level ssl
parameter in the
agent configuration.
Optionally, the agent can add a second layer of encryption, securing the
CONNECT message and response. This second layer of encryption may be
desired in the case of HTTP authentication if there is a concern that
network packet sniffing could be used to determine the user’s
credentials. This second layer of encryption is enabled by setting the
ssl
parameter to true in the http_proxy
section of the agent
configuration. See
Examples
for details.
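To make the tunnel handshake concrete, here is a minimal sketch of the plain-text CONNECT request that a generic HTTP client sends to a proxy. It is not taken from the agent's source: the function name build_connect_request is hypothetical, and the header set is the generic HTTP form, assumed here for illustration. The host, user, and password are the placeholder values used elsewhere on this page.

```shell
# Build the plain-text CONNECT request a client sends to an HTTP proxy.
# Usage: build_connect_request <collector-host> <collector-port> [user] [password]
build_connect_request() {
  host="$1"
  port="$2"
  user="${3:-}"
  pass="${4:-}"
  printf 'CONNECT %s:%s HTTP/1.1\r\n' "$host" "$port"
  printf 'Host: %s:%s\r\n' "$host" "$port"
  if [ -n "$user" ]; then
    # Basic authentication is only base64-encoded, not encrypted.
    token=$(printf '%s:%s' "$user" "$pass" | base64)
    printf 'Proxy-Authorization: Basic %s\r\n' "$token"
  fi
  printf '\r\n'   # an empty line ends the request headers
}

build_connect_request collector-your.sysdigcloud.com 6443 sysdig_customer 12345
```

Anyone who can capture this request can decode the Proxy-Authorization value, which is the concern that the second layer of encryption addresses.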
Configuration
You specify the following parameters at the same level as http_proxy
in the dragent.yaml
file. These existing configuration options affect
the communication between the agent and collector (both with and without
a proxy).
ssl
: Default: true
. It is not recommended to change this setting. If set to false
, the metrics sent from the agent to the collector are unencrypted.
ssl_verify_certificate
: Determines whether the agent verifies the
SSL certificate sent from the collector (default is true
).
The following configuration options affect the behavior of the HTTP
Proxy setting. You specify them under the http_proxy
heading in the
dragent.yaml
file.
proxy_host
: Indicates the hostname of the proxy server. The
default is an empty string, which implies communication through an
HTTP proxy is disabled.
proxy_port
: Specifies the port on the proxy server the agent
should connect to. The default is 0, which indicates that the HTTP
proxy is disabled.
proxy_user
: Required if HTTP authentication is configured. This
option specifies the username for the HTTP authentication. The
default is an empty string, which indicates that authentication is
not configured.
proxy_password
: Required if HTTP authentication is configured.
This option specifies the password for the HTTP authentication. The
default is an empty string. Specifying proxy_user
with no
proxy_password
is allowed.
ssl
: Default: false
. If set to true, the connection between the agent and the
proxy server is encrypted.
Note that this parameter requires the top-level ssl parameter to
be enabled, because the agent does not support SSL to the proxy combined
with unencrypted traffic to the collector. This restriction prevents a
misconfiguration in which you assume the metrics are encrypted
end-to-end when they are not.
ssl_verify_certificate
: Determines whether the agent will verify
the certificate presented by the proxy.
This option is configured independently of the top-level
ssl_verify_certificate
parameter. This option is enabled by
default. If the provided certificate is not correct, this option can
cause the connection to the proxy server to fail.
ca_certificate
: The path to the CA certificate for the proxy
server. If ssl_verify_certificate
is enabled, the CA certificate
must be signed appropriately.
Examples
SSL Between Proxy and Collector
In this example, SSL is enabled only between the proxy server and the
collector.
collector_port: 6443
ssl: true
ssl_verify_certificate: true
http_proxy:
  proxy_host: squid.yourdomain.com
  proxy_port: 3128
SSL
The following example shows SSL is enabled between the agent and the
proxy server as well as between the proxy server and the collector.
collector_port: 6443
ssl: true
http_proxy:
  proxy_host: squid.yourdomain.com
  proxy_port: 3129
  ssl: true
  ssl_verify_certificate: true
  ca_certificate: /usr/proxy/proxy.crt
SSL with Username and Password
The following configuration instructs the agent to connect to a proxy
server located at squid.yourdomain.com
on port 3128
. The agent will
request the proxy server to establish an HTTP tunnel to the Sysdig
collector at collector-your.sysdigcloud.com
on port 6443. The agent
will authenticate with the proxy server using the given user and
password combination.
collector: collector-your.sysdigcloud.com
collector_port: 6443
http_proxy:
  proxy_host: squid.yourdomain.com
  proxy_port: 3128
  proxy_user: sysdig_customer
  proxy_password: 12345
  ssl: true
  ssl_verify_certificate: true
  ca_certificate: /usr/proxy/proxy_cert.crt
3.2 - Manage Agent Log Levels
Sysdig allows you to configure log levels for agents globally and granularly.
3.2.1 - Change Agent Log Level Globally
The Sysdig agent generates log entries in /opt/draios/logs/draios.log
.
The agent will rotate the log file when it reaches 10MB in size, keeping
the 10 most recent log files archived with a date-stamp appended to the
filename.
In order of increasing detail, the log levels available are: [ none
| critical | error | warning | notice | info | debug | trace ].
The default level (info) creates an entry for each aggregated
metrics transmission to the backend servers, once per second, in
addition to entries for any warnings and errors.
Setting the value lower than info may hinder troubleshooting of
agent-related issues.
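The ordering of the levels can be expressed as a small lookup, which is handy for warning when a chosen level is less detailed than the default. This is an illustrative sketch; the function name log_level_rank and the numeric ranks are assumptions derived from the order in which the levels are listed above.

```shell
# Rank the log levels listed above, from least (0) to most (7) detail.
log_level_rank() {
  rank=0
  for level in none critical error warning notice info debug trace; do
    if [ "$level" = "$1" ]; then
      echo "$rank"
      return 0
    fi
    rank=$((rank + 1))
  done
  echo "unknown log level: $1" >&2
  return 1
}

# Warn when a chosen level is less detailed than the default (info).
chosen=error
if [ "$(log_level_rank "$chosen")" -lt "$(log_level_rank info)" ]; then
  echo "'$chosen' logs less than info; troubleshooting may be harder"
fi
```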
The type and amount of logging can be changed by adding parameters and
log level arguments shown below to the agent’s user settings
configuration file here:
/opt/draios/etc/dragent.yaml
After editing the dragent.yaml file, restart the agent at the shell
with service dragent restart for the changes to take effect.
Note that dragent.yaml
code can be written in both YAML and JSON. The
examples below use YAML.
File Log Level
When troubleshooting agent behavior, increase the logging to debug for
full detail:
log:
  file_priority: debug
If you wish to reduce log messages going to the
/opt/draios/logs/draios.log
file, add the log:
parameter with one of
the following arguments under it and indented two spaces: [ none |
error | warning | info | debug | trace ]
log:
  file_priority: error
Container Console Logging
If you are running the containerized agent, you can also reduce
container console output by adding the additional parameter
console_priority:
with the same arguments [ none | error | warning
| info | debug | trace ]
log:
  console_priority: warning
Note that troubleshooting a host with less than the default 'info' level
will be more difficult or not possible. You should revert to 'info' when
you are done troubleshooting the agent.
A level of 'error' will generate the fewest log entries, and a level of
'trace' will give the most. 'info' is the default if no entry exists.
Examples
When using Helm charts and passing arguments either via --set flags or via values files, the logPriority parameter lets you directly set both the agent console and file logging priorities. The possible values are "info" and "debug". This parameter is mutually exclusive with sysdig.settings.log, so do not use them together.
Using HELM
helm install ... \
--set agent.sysdig.settings.log.file_priority=debug \
--set agent.sysdig.settings.log.console_priority=debug \
sysdig/sysdig-deploy
OR
helm install ... \
--set agent.logPriority=debug \
sysdig/sysdig-deploy
Using values.yaml
agent:
  sysdig:
    settings:
      log:
        file_priority: debug
        console_priority: debug
OR
agent:
  logPriority: debug
Using dragent.yaml
customerid: 831f3-Your-Access-Key-9401
tags: local:sf,acct:eng,svc:websvr
log:
  file_priority: warning
  console_priority: info
OR
customerid: 831f3-Your-Access-Key-9401
tags: local:sf,acct:eng,svc:websvr
log: { file_priority: debug, console_priority: debug }
Using Docker Run Command
If you are using the “ADDITIONAL_CONF” parameter to start a Docker
containerized agent, you would specify this entry in the Docker run
command:
-e ADDITIONAL_CONF="log: { file_priority: error, console_priority: none }"
-e ADDITIONAL_CONF="log:\n file_priority: error\n console_priority: none"
Using daemonset.yaml in Kubernetes Infrastructure
When running in a Kubernetes infrastructure (installed using the v1
method), uncomment the "ADDITIONAL_CONF" line in the agent
sysdig-daemonset.yaml manifest file, and modify it as needed:
- name: ADDITIONAL_CONF #OPTIONAL pass additional parameters to the agent
  value: "log:\n file_priority: debug\n console_priority: error"
3.2.2 - Manage File Logging for Agent Components
Sysdig Agent provides the ability to set component-wise log levels that override the global file logging level controlled by the file_priority configuration option. The components represent internal software modules, and their names can be found in /opt/draios/logs/draios.log.
By controlling logging at the fine-grained component level, you can avoid excessive logging from certain components in draios.log or enable extra logging from specific components for troubleshooting.
Agent components can also have optional feature-level logging, which provides a way to control the logging for a particular feature in Sysdig Agent.
To set feature-level or component-level logging:
Determine the agent feature or component whose log level you want to set. To do so:
Open the /opt/draios/logs/draios.log file.
Copy the component name.
The format of the log entry is:
<timestamp>, <<pid>.<tid>>, <log level>, [feature]:<component>[pid]:[line]: <message>
For example, the following snippet from a sample log file shows log messages from the promscrape feature and from the sdjagent, mountedfs_reader, watchdog_runnable, protobuf_file_emitter, connection_manager, and dragent components.
2020-09-07 17:56:01.173, 27979.28018, Information, sdjagent[27980]: Java classpath: /opt/draios/share/sdjagent.jar
2020-09-07 17:56:01.173, 27979.28018, Information, mountedfs_reader: Starting mounted_fs_reader with pid 27984
2020-09-07 17:56:01.174, 27979.28019, Information, watchdog_runnable:105: connection_manager starting
2020-09-07 17:56:01.174, 27979.28019, Information, protobuf_file_emitter:64: Will save protobufs for all message types
2020-09-07 17:56:01.174, 27979.28019, Information, connection_manager:282: Initiating connection to collector
2020-09-07 17:56:01.175, 27979.27979, Information, dragent:1243: Created Sysdig inspector
2020-09-07 18:52:40.065, 27979.27980, Debug, promscrape:prom_emitter:72: Sent 927 Prometheus metrics of 7297 total
2020-09-07 18:52:41.129, 27979.27981, Information, promscrape:prom_stats:45: Prometheus timeseries statistics, 5 endpoints
To set feature-level logging:
Open /opt/draios/etc/dragent.yaml.
Edit the dragent.yaml file and add the desired feature.
In this example, the global level is set to notice and the promscrape feature level is set to info:
log:
  file_priority: notice
  file_priority_by_component:
    - "promscrape: info"
The log levels specified for a feature override the global setting.
To set component-level logging:
Open /opt/draios/etc/dragent.yaml.
Edit the dragent.yaml file and add the desired components.
In this example, the global level is set to notice, the promscrape feature level to info, the sdjagent and mountedfs_reader component log levels to debug, the watchdog_runnable component log level to warning, and the promscrape:prom_emitter component log level to debug:
log:
  file_priority: notice
  file_priority_by_component:
    - "promscrape: info"
    - "promscrape:prom_emitter: debug"
    - "watchdog_runnable: warning"
    - "sdjagent: debug"
    - "mountedfs_reader: debug"
The log levels specified for a feature override the global setting.
The log levels specified for a component override both the feature and the global settings.
Restart the agent.
For example, if you have installed the agent as a service, then run:
$ service dragent restart
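Finding the right name to put in file_priority_by_component means reading the [feature]:<component> field of each log entry. The following is a rough sketch of how that field could be extracted from draios.log lines. The regular expression is inferred only from the log format and sample lines shown above, so it is an assumption rather than an official grammar, and override_name is a hypothetical helper.

```python
import re

# <timestamp>, <pid>.<tid>, <log level>, [feature]:<component>[pid]:[line]: <message>
LOG_LINE = re.compile(
    r"^(?P<timestamp>[^,]+), (?P<pid>\d+)\.(?P<tid>\d+), (?P<level>\w+), "
    r"(?:(?P<feature>[A-Za-z_]\w*):)?"  # optional feature prefix
    r"(?P<component>[A-Za-z_]\w*)"      # component name (never purely numeric)
    r"(?:\[\d+\])?"                     # optional component pid
    r"(?::\d+)?"                        # optional source line number
    r": (?P<message>.*)$"
)


def override_name(line):
    """Return 'feature:component' or 'component' for a log line, or None."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    feature, component = match.group("feature"), match.group("component")
    return f"{feature}:{component}" if feature else component


# A few of the sample entries shown above.
sample = [
    "2020-09-07 17:56:01.173, 27979.28018, Information, sdjagent[27980]: Java classpath: /opt/draios/share/sdjagent.jar",
    "2020-09-07 17:56:01.173, 27979.28018, Information, mountedfs_reader: Starting mounted_fs_reader with pid 27984",
    "2020-09-07 17:56:01.174, 27979.28019, Information, watchdog_runnable:105: connection_manager starting",
    "2020-09-07 18:52:40.065, 27979.27980, Debug, promscrape:prom_emitter:72: Sent 927 Prometheus metrics of 7297 total",
]

for name in sorted({override_name(line) for line in sample}):
    print(name)
```

The printed names are in the same form used by the entries in the configuration examples above, such as "promscrape:prom_emitter: debug".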
3.2.3 - Manage Console Logging for Agent Components
Sysdig Agent provides the ability to set component-wise log levels that override the global console logging level controlled by the console_priority configuration option. The components represent internal software modules, and their names can be found in /opt/draios/logs/draios.log.
By controlling logging at the fine-grained component level, you can avoid excessive logging from certain components in the console output or enable extra logging from specific components for troubleshooting.
Components can also have optional feature-level logging, which provides a way to control the logging for a particular feature in Sysdig Agent.
To set feature-level or component-level logging:
Determine the agent feature or component whose log level you want to set. To do so:
Look at the console output.
If you’re using an orchestrator like Kubernetes, the log viewer facility, such as the kubectl logs command, shows the console log output.
Copy the component name.
The format of the log entry is:
<timestamp>, <<pid>.<tid>>, <log level>, [feature]:<component>[pid]:[line]: <message>
For example, the following snippet from a sample log file shows log messages from the promscrape feature and from the sdjagent, mountedfs_reader, watchdog_runnable, protobuf_file_emitter, connection_manager, and dragent components.
2020-09-07 17:56:01.173, 27979.28018, Information, sdjagent[27980]: Java classpath: /opt/draios/share/sdjagent.jar
2020-09-07 17:56:01.173, 27979.28018, Information, mountedfs_reader: Starting mounted_fs_reader with pid 27984
2020-09-07 17:56:01.174, 27979.28019, Information, watchdog_runnable:105: connection_manager starting
2020-09-07 17:56:01.174, 27979.28019, Information, protobuf_file_emitter:64: Will save protobufs for all message types
2020-09-07 17:56:01.174, 27979.28019, Information, connection_manager:282: Initiating connection to collector
2020-09-07 17:56:01.175, 27979.27979, Information, dragent:1243: Created Sysdig inspector
2020-09-07 18:52:40.065, 27979.27980, Debug, promscrape:prom_emitter:72: Sent 927 Prometheus metrics of 7297 total
2020-09-07 18:52:41.129, 27979.27981, Information, promscrape:prom_stats:45: Prometheus timeseries statistics, 5 endpoints
To set feature-level logging:
Open /opt/draios/etc/dragent.yaml.
Edit the dragent.yaml file and add the desired feature.
In this example, the global level is set to notice and the promscrape feature level is set to info:
log:
  console_priority: notice
  console_priority_by_component:
    - "promscrape: info"
The log levels specified for a feature override the global setting.
To set component-level logging:
Open /opt/draios/etc/dragent.yaml.
Edit the dragent.yaml file and add the desired components.
In this example, the global level is set to notice, the promscrape feature level to info, the sdjagent and mountedfs_reader component log levels to debug, the watchdog_runnable component log level to warning, and the promscrape:prom_emitter component log level to debug:
log:
  console_priority: notice
  console_priority_by_component:
    - "promscrape: info"
    - "promscrape:prom_emitter: debug"
    - "watchdog_runnable: warning"
    - "sdjagent: debug"
    - "mountedfs_reader: debug"
The log levels specified for a feature override the global setting.
The log levels specified for a component override both the feature and the global settings.
Restart the agent.
For example, if you have installed the agent as a service, then run:
$ service dragent restart
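The precedence described above (component over feature, feature over global) can be modeled in a few lines. This is only an illustrative sketch of the documented behavior, not the agent's actual implementation; parse_overrides and effective_level are hypothetical names. It assumes that a component belonging to a feature is matched by its feature:component entry or by its feature entry, which is how the examples above are written.

```python
def parse_overrides(entries):
    """Turn entries such as 'promscrape:prom_emitter: debug' into a dict."""
    overrides = {}
    for entry in entries:
        # The level is whatever follows the last colon.
        name, level = entry.rsplit(":", 1)
        overrides[name.strip()] = level.strip()
    return overrides


def effective_level(global_level, overrides, component, feature=None):
    """Resolve a log level: component entry, then feature entry, then global."""
    if feature:
        candidates = [f"{feature}:{component}", feature]
    else:
        candidates = [component]
    for name in candidates:
        if name in overrides:
            return overrides[name]
    return global_level


# The console_priority_by_component list from the example above.
overrides = parse_overrides([
    "promscrape: info",
    "promscrape:prom_emitter: debug",
    "watchdog_runnable: warning",
    "sdjagent: debug",
    "mountedfs_reader: debug",
])

print(effective_level("notice", overrides, "prom_emitter", feature="promscrape"))
print(effective_level("notice", overrides, "prom_stats", feature="promscrape"))
print(effective_level("notice", overrides, "dragent"))
```

The same resolution applies to file_priority and file_priority_by_component, since both sections describe the same override rules.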
Agent Components
analyzer: The logs from this component provide information about events and metrics as they come into the system. These logs assist in basic troubleshooting of event flow.
connection_manager: This component logs details about the agent’s connection to the Sysdig backend. These logs help diagnose and troubleshoot connectivity issues.
security_mgr: These logs describe the security processing steps the agent is taking. Having these logs assists in understanding what the security side of the agent is doing.
infrastructure_state: This component interacts with the orchestration runtime to provide a view of the infrastructure. The logs from this component help troubleshoot orchestration issues and communication with the API server.
procfs_parser: The agent uses the procfs parser to gather information about the state of the system. These logs provide insight into the functioning of the agent.
dragent: These logs provide data about the core functionality of the agent.
process_emitter: This component is used to provide data regarding processes running on a host.
k8s_parser: The k8s_parser is used as part of the communication with the Kubernetes API server. These logs help debug communication issues.
netsec: These logs provide data about the functioning of the netsec component, which provides topology and endpoint security functionality.
protocol_handler: This component logs information about the protobufs the agent sends to the Sysdig backend.
k8s_deleg: Kubernetes uses the concept of delegated nodes to help reduce cluster load and manage distributed systems. These logs help with troubleshooting issues within the Kubernetes distributed environment.
promscrape: Promscrape allows the agent to send Prometheus data as custom metrics.
cm_socket: The cm_socket is the low-level networking code used by the connection_manager. These logs work together with the logs from the connection_manager to show the behavior of the network connection between the agent and the backend.
secure_audit: Audit is a feature of Sysdig Secure which provides information on system activity such as file and network behavior. These logs help understand the behavior of that feature.
memdumper: The memdumper is used to perform back-in-time captures, and logs from this component help troubleshoot any problems which might occur with back-in-time captures.
3.2.4 - Change the Agent Log Directory
The Sysdig agent generates log entries in /opt/draios/logs/draios.log.
The agent will rotate the log file when it reaches 10 MB in size, keeping the 10 most recent log files archived with a date-stamp appended to the filename.
You can change the default location as follows:
log:
  location: new_directory
By default, this location is rooted in the agent install path: /opt/draios/. Therefore, the new log location for the given example would be /opt/draios/new_directory.
You cannot write agent logs outside of the agent install path.
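The rule above, that location is resolved relative to the agent install path, can be illustrated with a short sketch. The function resolve_log_dir is hypothetical, and rejecting values that would escape /opt/draios is an assumption made here to mirror the restriction; how the agent itself treats such values is not described in this section.

```python
import posixpath

AGENT_INSTALL_PATH = "/opt/draios"


def resolve_log_dir(location):
    """Return the directory a 'log: location' value would map to."""
    candidate = posixpath.normpath(posixpath.join(AGENT_INSTALL_PATH, location))
    inside = candidate == AGENT_INSTALL_PATH or candidate.startswith(AGENT_INSTALL_PATH + "/")
    if not inside:
        # Assumption: treat anything outside the install path as invalid.
        raise ValueError(f"{location!r} resolves outside {AGENT_INSTALL_PATH}")
    return candidate


print(resolve_log_dir("new_directory"))
```

With the example value new_directory, the sketch resolves to the same path given above.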
3.2.5 - Control Disk Usage by Agent Logs
The Sysdig agent generates log entries in /opt/draios/logs/draios.log. It periodically performs rotation of its own logs.
You can use the following configuration to control the space taken up by agent logs:
max_size: Sets a limit on the size of a single agent log file, in megabytes. When the log file reaches this size, a new draios.log file is created and the old log is renamed with a timestamp. The default size is 10 megabytes.
rotate: Determines how many old log files are kept on the disk. The default is 10 log files.
log:
  max_size: 10
  rotate: 10
For example, with these settings, when the current log file reaches the 10-megabyte size limit and 10 old log files already exist, the oldest log file is removed. The current log file is then renamed with a timestamp and added to the list of old log files.
Increasing these values provides more logs for troubleshooting at the expense of more disk space.
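To judge how much space a given pair of settings can consume, a rough upper bound is the size of the active log file plus the retained old files. The sketch below assumes that no file grows meaningfully beyond max_size before it is rotated; max_log_disk_usage_mb is a hypothetical helper used only for estimation.

```python
def max_log_disk_usage_mb(max_size_mb=10, rotate=10):
    """Approximate worst-case disk usage of agent logs, in megabytes.

    Counts the active draios.log plus `rotate` old log files, each
    assumed to be at most `max_size_mb` in size.
    """
    return max_size_mb * (rotate + 1)


print(max_log_disk_usage_mb())        # defaults: max_size 10, rotate 10
print(max_log_disk_usage_mb(50, 20))  # a larger troubleshooting setup
```

Under that assumption, the default settings bound agent logs at roughly 110 megabytes.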
3.3 - Disable Captures
Sometimes, security requirements dictate that capture functionality
should NOT be triggered at all (for example, PCI compliance for payment
information).
To disable Captures altogether:
Access the agent configuration file, using one of the options listed.
This example accesses dragent.yaml directly.
Set the parameter:
sysdig_capture_enabled: false
Restart the agent.
See Captures for more information on the feature.
3.4 - Blacklist Ports
Use the blacklisted_ports parameter in the agent configuration file to block network traffic and metrics from unnecessary network ports.
Note: Port 53 (DNS) is always blacklisted.
Access the agent configuration file, using one of the options listed.
Add blacklisted_ports with desired port numbers.
Example (YAML):
blacklisted_ports:
  - 6443
  - 6379
Restart the agent (if editing the dragent.yaml file directly), using either the service dragent restart or docker restart sysdig-agent command as appropriate.
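When the port list is maintained elsewhere, for example in a script, the YAML fragment can be generated rather than typed by hand. The following is a minimal sketch; render_blacklisted_ports is a hypothetical helper, and the range check reflects ordinary TCP/UDP port limits rather than any validation the agent is known to perform.

```python
def render_blacklisted_ports(ports):
    """Render a blacklisted_ports YAML fragment from a list of port numbers."""
    for port in ports:
        if not isinstance(port, int) or not 1 <= port <= 65535:
            raise ValueError(f"invalid port: {port!r}")
    unique = sorted(set(ports))  # drop duplicates and keep a stable order
    lines = ["blacklisted_ports:"] + [f"  - {port}" for port in unique]
    return "\n".join(lines)


print(render_blacklisted_ports([6443, 6379, 6443]))
```

The output can be pasted into dragent.yaml as shown in the example above.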