Architecture & System Requirements

Before installing an on-premises solution, review the Sysdig architecture, sizing tips, configuration options, and installation options.

Each on-premises release is tested on several platforms and Kubernetes orchestrators. You can find the official matrix in the onprem-install-docs repository. Click the on-premises version and navigate to the release notes page to view the supported platforms.

The actual installation instructions can be found in the onprem-install-docs repository.

1 - Architecture

Review the diagram and component descriptions. When installing on-premises, you can decide where to deploy various components.

Sysdig Agent

Sysdig will collect monitoring and security information from all the target entities. To achieve this, one Sysdig agent should be deployed in each host. These hosts can be:

  • The nodes that make up a Kubernetes or OpenShift cluster

  • Virtual machines or bare metal

  • Hosts running in a cloud environment (e.g., AWS, Google Cloud, IBM Cloud, or Azure) or on the customer’s premises

The Sysdig agent can be installed as a container itself using a Helm chart, Kubernetes operator, etc.

Once the agent is installed on the host, it automatically starts collecting information from the running containers, the container runtime, the orchestration API (Kubernetes, OpenShift, etc.), defined Prometheus endpoints, auto-detected JMX sources, StatsD, and integrations, as well as from the host itself.

The Sysdig agent maintains a permanent communication channel with the Sysdig backend which is used to encapsulate messages containing the monitoring metrics, infrastructure metadata, and security events. The channel is protected using standard TLS encryption and transports data using binary messages. Using this channel, the agent can transmit data, but also receive additional configuration from the backend, such as security runtime policies or benchmarks.

Sysdig Backend

The Sysdig backend can be consumed as SaaS, in which case it is managed transparently by Sysdig Inc., or it can be installed on the customer’s premises. This distinction does not affect the operation of the platform described below.

Once the agent messages are received in the backend, they are processed and extracted into data available to the platform - time series, infrastructure and security events, and infrastructure metadata.

The main components of the backend/platform include:

  • Extraction and post-processing of the metric data from the agent, so that full time-series, with all the necessary infrastructure metadata, is available to the user

  • Maintenance of the infrastructure metadata (most notably Kubernetes state), so that all events and time series can be enriched and correctly grouped

  • Storage of time-series and event data

  • Processing of time-series data to calculate alert triggers

  • Queuing the security events triggered by the agents so that they are shown on the event feed, sent through the configured notification channels and alerts, and forwarded via the Event Forwarder to external platforms such as Splunk, Syslog, or IBM MCM / QRadar

  • Aggregating and post-processing other security data such as container fingerprints that will be used to generate container profiles, or security benchmark results.

The Sysdig platform then stores this post-processed data in a set of internal databases that will be combined by the API service to create the data views, such as dashboards, event feeds, vulnerability reports, or security benchmarks.

Sysdig APIs

The Sysdig platform provides several ways to consume and present its internal data. All APIs are RESTful, HTTP JSON-based, and secured using TLS. The same APIs are used to power the Sysdig front end, as well as any API clients (such as sdc-cli).

  • Monitor API

    • User and Team management API

    • Dashboard API

    • Events API

    • Alerts API

    • Data API (proprietary Sysdig API for querying time-series data)

  • Secure API

    • Image Scanning API

    • Security Events API

    • Activity Audit API

    • Secure Overview API

  • PromQL API: Prometheus-compatible HTTP API for querying time-series data

These enable different use cases:

  • User access to the platform via the Sysdig user interface

  • Programmatic input and extraction of data, e.g.:

    • Automatic user creation

    • Terraform scripts to save or recover configuration state

    • Inline scanning to push scanning results from the CI/CD pipeline

    • Instrumentation using the sdc-cli.

  • PromQL API interface that can be used to connect any PromQL-compatible solutions, such as Grafana.
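
As a rough illustration of programmatic access, the sketch below builds (but does not send) an instant-query request against a Prometheus-compatible HTTP API. The `/api/v1/query` path is the standard Prometheus instant-query endpoint; the base URL, the bearer-token authentication, the token value, and the query itself are assumptions made for this example and should be replaced with the values for your own deployment.

```python
import urllib.parse
import urllib.request


def build_promql_request(base_url: str, api_token: str, query: str) -> urllib.request.Request:
    """Build (without sending) an instant-query request for a Prometheus-compatible API."""
    # /api/v1/query is the standard Prometheus instant-query path; the prefix
    # carried in base_url is deployment-specific and assumed here.
    params = urllib.parse.urlencode({"query": query})
    url = f"{base_url.rstrip('/')}/api/v1/query?{params}"
    request = urllib.request.Request(url, method="GET")
    # Bearer-token authentication is an assumption for this sketch.
    request.add_header("Authorization", f"Bearer {api_token}")
    return request


request = build_promql_request(
    "https://sysdig.example.com/prometheus",  # hypothetical base URL
    "MY_API_TOKEN",                           # placeholder token
    'up{job="node"}',                         # placeholder PromQL query
)
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it over TLS; a tool such as Grafana performs the equivalent calls once it is pointed at the same base URL.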

2 - System Requirements

Supported Distributions

Linux Distributions

A 64-bit Linux distribution with a minimum kernel version of 3.10, and support of docker-engine 1.7.1 or later, is required for each server instance.

Recommended Linux distributions: RedHat, Ubuntu, Amazon AMI, Amazon Linux 2.

Docker Requirements

For the Docker installation, running devicemapper in 'loopback mode' is not supported. It has known performance problems, and a different storage driver should be used.

Please see this note from our Replicated infrastructure partner: devicemapper-installation-warning.

Installing the latest version of Docker is recommended.

Cassandra

Cassandra is used as the metrics store for Sysdig agents. It is the most dynamic component of the system, and requires additional attention to ensure that your system is performing well and highly responsive.

This component is stateful, and should be treated more carefully than stateless components. Cassandra sizing is based on a minimum replication factor as well as the number of agents writing data.

A minimum replication factor of 3 is recommended for the Sysdig application, which allows the cluster to survive the failure of 1 Cassandra instance.

Each agent consumes anywhere from 500MB to 2GB of Cassandra storage, with average sizing at 1.5GB/agent. Because of Sysdig’s data aggregation model, this storage should comfortably handle multi-year history. This needs to then be multiplied by the replication factor to determine the total disk space required. A rough calculation might be:

100 agents × 1.5GB = 150GB raw; 150GB × replication factor of 3 = 450GB total

To be safe we recommend that you size some additional disk space as buffer (say 25-50%) on top of that.
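
The sizing arithmetic above can be sketched as a small helper. The function and parameter names are illustrative only; the defaults (1.5GB per agent, replication factor of 3, 25% buffer) are taken from the figures in this section.

```python
def cassandra_disk_gb(agents: int, gb_per_agent: float = 1.5,
                      replication_factor: int = 3,
                      buffer_fraction: float = 0.25) -> float:
    """Estimate total Cassandra disk space in GB for a given number of agents."""
    raw_gb = agents * gb_per_agent                 # storage before replication
    replicated_gb = raw_gb * replication_factor    # every replica stores a full copy
    return replicated_gb * (1 + buffer_fraction)   # extra headroom on top


print(cassandra_disk_gb(100, buffer_fraction=0.0))   # 450.0
print(cassandra_disk_gb(100, buffer_fraction=0.25))  # 562.5
print(cassandra_disk_gb(100, buffer_fraction=0.5))   # 675.0
```

For 100 agents, this gives 450GB without a buffer and roughly 560GB to 675GB once the recommended 25-50% buffer is included.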

Network Configuration

The following firewall/security configurations are required for inbound and outbound traffic for the Sysdig platform:

Ports

| Port | State | Direction | Description |
| --- | --- | --- | --- |
| 6666 | Open (optional) | Inbound | Agent communication (unencrypted) |
| 6443 | Open | Inbound | Agent Communication (TLS/encrypted) |
| 443 | Open | Inbound | Sysdig Monitor user-interface access inbound |
| 443* | Open | Outbound | *Optional, used if collecting AWS CloudWatch metrics. See also AWS: Integrate AWS Account and CloudWatch Metrics (Optional). |
| 443* | Open | Outbound | *Optional, needed if using Sysdig Secure Image Scanning to download vulnerability definitions. Must be open to Cloudflare IP ranges: https://www.cloudflare.com/ips/. |
| 8800 | Open | Inbound | Replicated Management Console access (for on-premises installations that don't use Kubernetes) |

Warning: Port 6666 should only be opened if agents will be communicating with the collectors without encryption.

Additional ports may need to be configured for the Replicated infrastructure manager. Refer to the Replicated port requirements documentation for more information.
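
Once the firewall rules are in place, a quick TCP reachability check from an agent host can confirm that the inbound ports listed above are open. The following is a minimal sketch using only the Python standard library; the function name is illustrative, and the hostname in the commented usage line is hypothetical. A successful TCP connection only shows that the port is reachable, not that the service behind it is healthy.

```python
import socket


def check_ports(host: str, ports: list[int], timeout: float = 3.0) -> dict[int, bool]:
    """Return {port: True/False} depending on whether a TCP connection could be opened."""
    results = {}
    for port in ports:
        try:
            # create_connection resolves the host and attempts a TCP handshake.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and DNS failures.
            results[port] = False
    return results


# Example usage (hypothetical backend hostname):
# print(check_ports("sysdig.example.com", [6443, 443]))
```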

HTTP/HTTPS and Proxy Support

All non-airgapped hosts require outbound HTTP/S internet access for:

  • License validation

  • Pulling Sysdig/Agent containers from the Docker hub repository

  • Release update checks

Note: Sysdig does not support HTTP/S proxies for Sysdig platform components.

Summary: Plan Proxy Support for Notification Channels, CloudWatch Metrics, Capture Storage

In release #760 and newer of the Sysdig platform back-end, an option is available to configure outgoing HTTP/HTTPS connections to be made via proxy. This has been tested and supports outgoing web connections that are necessary to support the following features:

  • Notification Channels

    • PagerDuty

    • Slack

    • Amazon SNS

    • VictorOps

    • OpsGenie

    • WebHook

  • Gathering of AWS CloudWatch data

  • Capture storage to an AWS S3 bucket

Proxied web connectivity to support authentication mechanisms (SAML, OpenID Connect, OAuth) is not supported at this time.

Configure Proxy Using JVM Options

The proxy settings are configured via the JVM options passed to the Sysdig software components. JVM options can be added/appended at any time (with a required restart).

  • In a Replicated on-premises install, use the Advanced Settings panel to enter JVM options in the Sysdig application JVM options field. (See “Define Advanced Settings” on Install Components (Replicated).)

    If JVM settings have already been set, log in to the Replicated Management console and choose the Settings tab. At the bottom of the screen, check the box to Show Advanced Settings to reveal the configuration option.

  • In a Kubernetes-based on-premises install, set the sysdigcloud.jvm.options in the config.yaml used to set the ConfigMap:

    # Optional: Sysdig Cloud application JVM options. For heavy load environments you'll need to tweak
    # the memory or garbage collection settings
    sysdigcloud.jvm.api.options: ""
    sysdigcloud.jvm.worker.options: ""
    sysdigcloud.jvm.collector.options: ""
    
  • Enter the proxy parameters, as in the example below.

    This JVM options string will forward all HTTP and HTTPS traffic via outgoing port 8888 on a proxy at hostname proxy.example.com. The IP address may be specified instead of hostname.

    -Dhttp.proxyHost=proxy.example.com -Dhttp.proxyPort=8888 -Dhttps.proxyPort=8888 -Dhttps.proxyHost=proxy.example.com

    In a Kubernetes config.yaml, the proxy parameters are appended to any existing JVM options. For example, with a different proxy host and port:

    # Optional: Sysdig Cloud application JVM options. For heavy load environments you'll need to tweak
    # the memory or garbage collection settings
    sysdigcloud.jvm.api.options: -Xms2048m -Xmx2048m -Dhttp.proxyHost=xxx.xxx.sysdig.com -Dhttp.proxyPort=80 -Dhttps.proxyHost=xxx.xxx.sysdig.com -Dhttps.proxyPort=80
    

Exclusions

  • Do not use localhost or 127.0.0.1. By default, HTTP/HTTPS requests to localhost or 127.0.0.1 are not directed by the back-end toward any configured proxy; this is necessary for the functioning of some web components internal to the Sysdig platform containers.

  • If you deploy the Sysdig platform in AWS, add an additional proxy parameter

    -Dhttp.nonProxyHosts=169.254.169.254

    Rationale: This provides a workaround for the backend occasionally making HTTP requests to the special instance metadata address 169.254.169.254, which is undesirable when using a proxy.

    This IP address will be excluded from proxying by default in a future release.

  • If you have additional proxy exclusions you wish to specify that are unique to your environment, these can also be added using the pipe separator.

    For example, assume your deployment is in AWS and you also have a webhook target 192.168.1.2 that is not reachable via your proxy. To exclude both:

    Replicated: your complete string to enter into the console for Sysdig application JVM options would be:

    -Dhttp.proxyHost=proxy.example.com -Dhttp.proxyPort=8888 -Dhttps.proxyPort=8888 -Dhttps.proxyHost=proxy.example.com -Dhttp.nonProxyHosts=169.254.169.254|192.168.1.2
    

    Kubernetes: when setting the sysdigcloud.jvm.api.options and sysdigcloud.jvm.worker.options in the config.yaml for the ConfigMap, the pipe separator must be double-escaped, such as:

    -Dhttps.proxyPort=80 -Dhttps.proxyHost=xx.xx.sysdig.com -Dhttp.nonProxyHosts=169.123.169.123\\|127.0.0.1\\|localhost\\|.sysdig.com"
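
Because the two install types need different separators, it can help to generate the options string rather than type it by hand. The helper below is a sketch only: its name and parameters are hypothetical, and it simply assembles the same `-D` flags shown above, using a plain pipe for Replicated and a double-escaped pipe for the Kubernetes config.yaml.

```python
def build_proxy_jvm_options(proxy_host: str, proxy_port: int,
                            non_proxy_hosts: list[str],
                            for_kubernetes: bool = False) -> str:
    """Assemble the JVM proxy flags as a single space-separated string."""
    options = [
        f"-Dhttp.proxyHost={proxy_host}",
        f"-Dhttp.proxyPort={proxy_port}",
        f"-Dhttps.proxyHost={proxy_host}",
        f"-Dhttps.proxyPort={proxy_port}",
    ]
    if non_proxy_hosts:
        # Replicated takes a plain pipe; the Kubernetes config.yaml needs
        # the pipe double-escaped (two backslashes before each pipe).
        separator = "\\\\|" if for_kubernetes else "|"
        options.append("-Dhttp.nonProxyHosts=" + separator.join(non_proxy_hosts))
    return " ".join(options)


exclusions = ["169.254.169.254", "192.168.1.2"]
replicated_options = build_proxy_jvm_options("proxy.example.com", 8888, exclusions)
kubernetes_options = build_proxy_jvm_options("proxy.example.com", 8888, exclusions,
                                             for_kubernetes=True)
print(replicated_options)
print(kubernetes_options)
```

The first string can be pasted into the Replicated console field; the second is intended as the value for the relevant JVM options keys in config.yaml.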
    

Time Synchronization

The Sysdig platform requires the system clocks to be closely synchronized between hosts. When provisioning hosts for installation, ensure the system clocks are synchronized.

Recommended: Install NTP to ensure all host clocks stay synchronized.

3 - Securing User Passwords

  • For MySQL, Redis, and the initial “super admin” user, a strong password is recommended: 16-20 alphanumeric characters.

  • For Cassandra and MySQL, it is also possible to set up third-party authentication.

  • For Redis, users can set up an SSH tunnel and Sysdig can connect over this tunnel.
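
One way to produce a password that matches the recommendation above is Python's `secrets` module, which draws from a cryptographically secure random source. This is a minimal sketch; the function name and the default length of 20 are choices made for this example.

```python
import secrets
import string


def generate_password(length: int = 20) -> str:
    """Generate a random alphanumeric password of the given length."""
    if length < 16:
        raise ValueError("use at least 16 characters")
    alphabet = string.ascii_letters + string.digits  # alphanumeric only
    # secrets.choice uses a cryptographically secure random number generator.
    return "".join(secrets.choice(alphabet) for _ in range(length))


password = generate_password()
print(len(password))
```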