2021 Archive

2021 Archive of Sysdig Agent release notes.

12.1.1 November 22, 2021

Defect Fixes

Falco Action Works as Expected

The kill container Falco action works as expected for containerd in Azure.

12.1.0 November 08, 2021

Feature Enhancements

Ability to Build eBPF Probes for Debian 11 Kernels

The agent container has been enhanced to build probes for Debian 11 kernels.

Prebuilt Probes for Debian 11 Kernels

Prebuilt probes are added for Debian 11 kernels.

Prebuilt Probes for Fedora Kernels

Prebuilt probes are added for the latest Fedora kernels.

Ability to Build eBPF Probes for Linux Kernel v5.10

The agent container can now build eBPF probes for Linux kernel version 5.10.

Enhanced Agent Containers for Probes on New Kernels with glibc v2.33

The agent container has been enhanced to build probes for new kernel versions that use glibc v2.33.

File Metrics in Audit Tap

File-related metrics are now included in the audit tap.

Promscrape Memory Usage Limit

You can now limit the promscrape memory usage. The default is set to 640 MB.

Remove Self-Signed Certificate for Agent to Collector Connection

Self-signed certificate support has been removed for agent connection to the collector. See End of Support.

Defect Fixes

Image Profile Shows Results Correctly

The imageid is reported correctly when using a CRI engine.

Duplicate Environment Variable Hashes No Longer Appear in Audit Tap

The discrepancy between the reported environment variables and their hash in the audit tap has been fixed.

Kubernetes Daemonset and Replicaset Association Works as Expected

Fixed an issue that could invalidate the association between Kubernetes Daemonset and Replicaset.

Agent Updates Prometheus Configurations Correctly

Fixed a problem that was causing Prometheus configurations to be merged incorrectly when certain integrations were updated from the backend.

12.0.4 October 29, 2021

Defect Fixes

Secure Policies Load as Expected

Fixed an issue present in 12.0.3 where Secure policies might not be loaded correctly by the agent.

12.0.3 October 22, 2021

Defect Fixes

Leases Fallback Works as Expected on OpenShift v3

Fixed an issue where Kubernetes clusters that don’t support leases failed to report Kubernetes data because the agent did not fall back to the previous behavior.

Update the Cluster Install Scripts for Leases on OpenShift

Modified the OpenShift agent installer to add the sysdig-agent cluster role and to assign it to the sysdig-agent service account. The new cluster role allows the agent to utilize the coldstart leases.

12.0.2 September 30, 2021

Defect Fixes

Network Security Communication Works As Expected

In some environments, Sysdig agents could not send any Network Security (Kubernetes Network Policies) communications if CIDR auto-discovery did not complete. This issue has been fixed.

Agent No Longer Crashes in Orchestrated Environments

Fixed a race condition in orchestrated environments, such as OpenShift v3, that could cause the agent to crash repeatedly at startup.

12.0.1 September 27, 2021

Defect Fixes

OpenShift 4 Clusters Able To Retrieve Metadata Without Leases

Fixed an issue where OpenShift clusters would fail to report Kubernetes data when the agent service-account did not have the permission to create leases. With this fix, the Sysdig agent falls back to the previous behavior to retrieve the metadata.

12.0.0 September 15, 2021

Feature Enhancements

Allow Sysdig Backend to Manage Prometheus Configuration

The Sysdig backend can now manage the Prometheus configuration. For more information, see the following:

Agent Console Supports Troubleshooting Prometheus Configuration

The Agent Console now supports troubleshooting Prometheus configuration.

To support this feature, Agent Console is enabled by default. This helps both users and Sysdig support to troubleshoot Sysdig agent issues. Sensitive user configuration is obfuscated and not viewable.

For more information, see Using the Agent Console.

Support for Node Leases

The Sysdig agent can now use Kubernetes Leases to control how and when connections are made to the Kubernetes API Server.

For more information, see the following:

Support for Podman Environments

The Sysdig agent is now supported in Podman environments. For more information, see Prerequisites for Podman Environments.

Add Startup Delay to Agent to Kubernetes API Server Connection

Added a delay prior to the agent connecting to the Kubernetes API server. The delay time is set based on the number of nodes in the cluster to prevent overloading the API server. This is to support environments where node leases cannot be used.

Known Issues

None

Defect Fixes

Stale Capture Files No Longer Exhaust Local File System

Incomplete and stale capture files are no longer left behind, so they no longer consume local storage.

Honor CPU Quotas

Moved the main dragent process to the default cgroup so that CPU quotas can cover all the agent processes.

Containers Are Detected as Expected

Fixed an issue where containers were not detected if SystemdCgroup = true was not set in the containerd configuration.

Report Correct Container Metadata

Fixed a problem that caused some container metadata such as the image repository and image tag to be reported incorrectly.

Upgrading from 10.8.0 to 11.3.0 No Longer Fails

Added an http_proxy configuration option to address connection problems after the OpenSSL upgrade from v11.0 to v11.1.
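
As a rough sketch, such a setting might look like the following in dragent.yaml. The nested key names (proxy_host, proxy_port), the host name, and the port are assumptions for illustration only and are not stated in this release note; check the agent proxy documentation for the exact schema.

```yaml
# Hypothetical dragent.yaml fragment; nested key names are assumed, not confirmed by this note.
http_proxy:
  proxy_host: proxy.example.com   # placeholder host name
  proxy_port: 3128                # placeholder port
```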

11.4.1 August 03, 2021

This is a hotfix release.

Defect Fixes

Fixed broken app checks in agent-slim by adding the missing dependencies.

11.4.0 July 28, 2021

Feature Enhancements

Probe Builder

The probe builder can now be used to build kernel modules for the Sysdig agent. It can run on any host with Docker installed, including (with some preparation) air-gapped hosts.

Probe Builder is now enabled and available at https://github.com/draios/probe-builder. See the Readme for more information.

Promscrape v2

Promscrape v2 (used when prom_service_discovery is enabled for Prometheus) has been changed to discover only Kubernetes pods running on the same node as the agent. This should help reduce the load on the Kubernetes API servers in large clusters.
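
A minimal sketch of turning on the option named above, assuming it sits under the prometheus section of dragent.yaml. The nesting and the enabled key are assumptions, not something this note states.

```yaml
# Hypothetical dragent.yaml fragment; the placement under "prometheus" is assumed.
prometheus:
  enabled: true
  prom_service_discovery: true   # selects Promscrape v2, per the note above
```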

Added Missing Fields for Unified Workload Metrics

Added Kubernetes metric fields indicating the availability of daemon sets (status.numberAvailable, status.numberUnavailable, and status.updatedNumberScheduled) and replica sets (status.availableReplicas) to support workload-level metrics (SaaS only).

Known Issues

App checks in agent-slim don’t work due to missing dependencies. This problem will be addressed in an upcoming hotfix release.

Defect Fixes

Multiple Hosts No Longer Report the Same Pod

Fixed an issue causing multiple hosts to report the same pod if its UUID is the same on both hosts.

Duplicate StatsD Metrics Are Reported Correctly

Fixed an issue related to handling duplicate StatsD metrics corresponding to a container that is reported by a host.

Stale Markers Are Sent Properly for Dropped Targets

Stale markers are now generated properly for Prometheus metrics when a scrape target is no longer available and promscrape.v1 is in use.

Report a Positive Time Delta Value

Fixed a defect that could result in invalid file.time.in, file.time.out, file.time.other, and file.time.total values.

Agent No Longer Crashes When App Check or Prometheus Is Enabled

Fixed a defect that could cause the agent to crash when app checks or Prometheus are enabled.

Secure Captures No Longer Cause Host Shutdown

Prevent agent restarts caused by apparent stalls encountered in the sample handler thread.

11.3.0 June 10, 2021

Feature Enhancements

Console Logging

Introduced a per-component console logging feature. See Manage Console Logging for Agent Components.

Slim Agent for eBPF Probes

agent-kmodule and agent-kmodule-thin can now be used to build eBPF probes.

Replication Controller Fields

Added missing replication controller fields to the aggregator Actions.

Non-Delegated Agents Retrieve Less Data From the API Server

Use Kubernetes leases to better control the load on the Kubernetes API Server. This is disabled by default.

Defect Fixes

Agent No Longer Generates Core Dumps on Java

Prevents Java process core dumps caused by the Sysdig agent trying to access the /tmp directory.

Support Container Action on Containerd

Container actions are now properly supported on containerd (CRI-O and other CRI engines were already supported). Actions for unsupported container engines are now properly reported to the Sysdig backend, and a warning message is written to the agent logs.

Recovery During Agent Shutdown

Introduced a detection and recovery mechanism for hangs during agent shutdown.

Promscrape V2 Termination No Longer Causes Agent Crash

Fixed a problem causing the agent to crash after promscrape_v2 is terminated.

Agent No Longer Restarts in Kubernetes Environment

The agent tries to fetch the metadata of the AWS instance on which it is running in order to tag the generated metrics with information unique to that instance. If the metadata structure was not as expected, the agent restarted continuously because of an error in fetching the metadata. This issue has been fixed.

Profiling Works as Expected

Fixed an issue that disabled support for performance profiles in the agent.

11.2.1 May 06, 2021

This is a hotfix release.

Defect Fixes

Report Container User Information

The agent now tracks container user information and makes it accessible in container events. These events indicate that a container has started. This feature works for Docker as well as CRI-O container engines.

Reporting container user information does not work in OpenShift 4.x because it does not provide necessary CRI-O information.

11.2.0 April 26, 2021

Feature Enhancements

Agent CLI

Sysdig supports Agent CLI, a command-line interactive tool, to troubleshoot agents. This tool helps Sysdig support to solve user issues quickly and efficiently. It is currently disabled by default and requires the customer to turn it on.

For more information, see Using the Agent Console.

Scraping Prometheus Metrics

Scraping Prometheus metrics is supported in the following cases:

  • Advertised ports on container IP addresses

  • Advertised ports on host IP addresses

  • Advertised ports on pod IP addresses

Slim Agent for IKS

Use the following:

Reduce Load on Kubernetes API Server

Terminated pods are no longer collected in order to reduce the load on the Kubernetes API server.

Audit Server Listens on All Interfaces

The audit server now listens on all interfaces for Kubernetes audit events by default. This makes integration with Kubernetes audit events in the agent easier, with no configuration changes required.

Improved Noise-Reduction Filter for Activity Audits

The noise-reduction filter for Activity Audit has been improved. All the filtered data is duplicated.

Defect Fixes

CRI-O Versions Report Correct Image ID

The new CRI-O versions (1.19+, possibly 1.18) now properly report container.image.id.

Log Level Changes for Duplicate Host Container Groups

Demoted logs about duplicate host container_groups from warning to debug level.

Fix CVE-2021-28831

Fixed CVE-2021-28831 in the Slim Agent container.

11.1.3 April 13, 2021

This is a hotfix release.

Defect Fixes

Prevent Agent CrashLoopBackoff Error Caused by Smaller initialDelaySeconds Values

The readiness probe improvement in version 11.1.2 delayed the transition of the agent pod to a ready state until communication with the Kubernetes API server was established. But this delay could cause a CrashLoopBackoff due to liveness or readiness probes configured with an initialDelaySeconds set to less than 90.

In Agent version 11.1.3 the transition to the ready state does not wait for communication with the Kubernetes API server to be established unless the behavior is enabled via a new configuration option: k8s_wait_before_ready.
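
A minimal sketch of opting back in to the 11.1.2 behavior, assuming k8s_wait_before_ready is a boolean set at the top level of dragent.yaml. The value type and placement are assumptions; this note names only the option itself.

```yaml
# Hypothetical dragent.yaml fragment; boolean type and top-level placement are assumed.
k8s_wait_before_ready: true
```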

11.1.2 March 30, 2021

Known Issues

Prevent Agent CrashLoopBackoff Error Caused by Smaller initialDelaySeconds Values

The readiness probe improvement in version 11.1.2 delayed the transition of the agent pod to a ready state until communication with the Kubernetes API server was established. But this delay could cause a CrashLoopBackoff due to liveness or readiness probes configured with an initialDelaySeconds set to less than 90.

Workaround

If you are using agent version 11.1.2, set initialDelaySeconds for both liveness and readiness probes to a value that is greater than or equal to 90.
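
The workaround could look roughly like this in the agent container spec of the DaemonSet. Only initialDelaySeconds matters here; keep whatever probe handler (exec, httpGet, and so on) your manifest already defines.

```yaml
# Fragment of a container spec; existing probe handlers are omitted and stay unchanged.
livenessProbe:
  initialDelaySeconds: 90   # must be greater than or equal to 90 on agent 11.1.2
readinessProbe:
  initialDelaySeconds: 90
```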

Feature Enhancements

Enhanced Connection with Kubernetes API Server

Kubernetes reconnect logic has been improved to back off automatically (1 min, 2 min, 4 min, … 1 hr) if the connection is continuously dropped when using Thin Cointerface. This reduces the load that the agent imposes on the Kubernetes API Server in clusters with heavily burdened API servers.

Reduced Load on Kubernetes API Server

The agent’s readiness probe has been improved to not report ready until after the agent connects to the Kubernetes API server. This reduces the load that the agent imposes on the Kubernetes API server when starting up during RollingUpdate.

11.1.1 March 26, 2021

Defect Fixes

Agent Reports Memory Usage Accurately for Containers

Fixed an issue where the agent would incorrectly report memory.bytes.used for containers that use more than 4 GB.

Runtime Policies Work as Expected

The runtime policies that have a policy type and capture action are handled as expected.

11.1.0 March 23, 2021

Defect Fixes

Agent Tags in Policy Scopes

Agent tags are supported in runtime policy scopes.

Metric Limits Are Updated As Expected

Fixed a problem where metric limits were not updated from the defaults. This is unlikely to happen if agents are connected to the SaaS backend.

Configured Tags in Prometheus Scraper

Fixed a problem in the old Prometheus scraper (used when promscrape is disabled) to ensure that configured tags are properly added to the metrics.

JMX Metrics for Short-Lived Java Processes

Fixed an issue where short-lived Java processes could cause the Sysdig Agent to stop collecting JMX metrics.

Misconfiguration No Longer Leads to Agent Constantly Querying Kubernetes API Server

Fixed a problem where the agent would continuously send requests to the Kubernetes API server to query the endpoints API. This occurred when the agent’s clusterrole was incorrectly configured. With this fix, the agent no longer repeats the attempt if it is unable to connect to the Kubernetes API during boot.

Scope Runtime Policies

The runtime policies are now correctly scoped by kubernetes.cluster.name. The fix in 10.6.0 was incomplete.

Agent Correctly Reports Replicasets

Fixed an issue where the agent could lose track of a replicaset and report incomplete metadata.

Agent Issues Over HTTP Proxy

  • Fixed an agent connection issue over plaintext HTTP proxy with encryption.

  • Fixed an agent connection issue via HTTP proxy connections over SSL.

11.0.0 February 18, 2021

Feature Enhancements

Thin Cointerface to Reduce Memory Usage

Thin cointerface reduces the memory required to handle the Kubernetes metadata on both the agent and the Kubernetes API Server. The reduction in memory usage is significant for Kubernetes clusters with a large number of pods (in the range of 10,000 or more) or clusters that heavily use Replication Controllers.

Using this feature returns the same data to the Sysdig backend and does not affect any Sysdig features. The thin cointerface feature is disabled by default.

To enable:

  1. Add the following to either the sysdig-agent configmap or the dragent.yaml file:

    thin_cointerface_enabled: true
    
  2. Restart the agent.

Reduce the Volume of Agent Log Messages

Some high-frequency information level log messages are converted to debug level to reduce the volume of messages generated at the default information level.

File Logging Capability

Per-component file logging capability for an additional set of agent components has been enabled.

For more information, see Manage File Logging for Agent Components.

Reduce Agent Memory Consumed by Prometheus

The number of Prometheus time series ingested has been limited to reduce agent memory consumption. This limit is applied after Prometheus relabeling rules are applied but before the agent’s metric filter and metric limit.

Defect Fixes

Missing Metrics Due to Aggregation in Agent Fixed

Fixed an issue where processes with certain names were improperly aggregated, which in turn caused missing metrics in certain situations.

Cointerface Fix

Fixed an issue that caused the agent’s cointerface process to restart continuously while processing Kubernetes label selectors.

10.9.1 January 21, 2021

Defect Fixes

Thin Cointerface Works as Expected

Fixed a defect in the Thin Cointerface feature which could cause Kubernetes metadata to stop updating. Because Thin Cointerface is turned off by default, the change affects only a small number of users who have this feature turned on.

10.9.0 January 13, 2021

Feature Improvements

Support for Kubernetes CronJobs

Kubernetes CronJobs are supported when reporting network communications.

Defect Fixes

Runtime Policies and Rules Are Loaded with No Errors

Fixed a race condition that could prevent runtime policies and rules from being loaded properly if multiple messages from the Sysdig backend are received consecutively.

Cluster Overview Displays Compliance Score

Fixed an issue where StatsD metrics related to compliance would have no associated Kubernetes metadata and were not visible on Cluster Overview.