Sysdig Documentation

Metric Alerts

Sysdig Monitor watches time-series metrics and alerts you when they violate user-defined thresholds.

384336543.png

The lines shown in the preview chart represent the values for the segments selected for monitoring. The popup is a color-coded legend that shows which segment (or combination of segments, if there is more than one) each line represents. You can deselect segment lines to hide them from the chart. Sysdig Monitor shows at most 10 lines in the preview chart.

Defining a Metric Alert

Guidelines

  • Set a unique name and description: Set a meaningful name and description that help recipients easily identify the alert.

  • Specify multiple segments: A single segment might not always supply enough information to troubleshoot. Enrich the selected entity by adding related segments. Enter hierarchical entities so that you have a top-down picture of what went wrong and where. For example, specifying a Kubernetes Cluster alone does not provide the context necessary to troubleshoot. To narrow down the issue, add further contextual information, such as Kubernetes Namespace, Kubernetes Deployment, and so on.

Specify Entity

  1. Select the entity whose downtime you want to monitor.

    In this example, you are monitoring unscheduled downtime of a host.

  2. Specify additional segments.

    In this example, you are monitoring the MAC address of the host and the mount directory of the file system.

Specify Metrics

Select the metric that this alert will monitor. You can also define how the data is aggregated: avg, max, min, or sum. To alert on multiple metrics using boolean logic, switch to a multi-condition alert.
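To make the aggregation options concrete, here is a minimal Python sketch of how a window of samples reduces to a single value under each option. It is an illustration only, not Sysdig Monitor's implementation: the sample values, the AGGREGATIONS mapping, and the aggregate function are all hypothetical.

```python
from statistics import mean

# Hypothetical samples of one metric over the evaluation window (percent).
samples = [70.0, 78.0, 81.0, 75.0, 80.0]

# Keys mirror the aggregation options named above; the mapping is illustrative.
AGGREGATIONS = {"avg": mean, "max": max, "min": min, "sum": sum}


def aggregate(values, how):
    """Reduce a window of samples to a single value using the chosen aggregation."""
    return AGGREGATIONS[how](values)


for how in AGGREGATIONS:
    print(how, aggregate(samples, how))
```

The same samples can sit above or below a threshold depending on the aggregation chosen, so the aggregation is part of the alert condition rather than a display setting.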

Configure Scope

Filter the environment to which this alert applies. In this example, an alert fires when a host goes down in the availability zone us-east-1b.

384336532.png

Use advanced operators to include, exclude, or pattern-match groups, tags, and entities. See Multi-Condition Alerts.

You can also create alerts directly from Explore and Dashboards, which populates this scope automatically.
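As a rough illustration of what a scope does, the following Python sketch filters a set of hosts down to those in the availability zone us-east-1b. It is a sketch under stated assumptions: the label key availability_zone, the host names, the operator names, and the matches and in_scope helpers are all hypothetical and do not reflect Sysdig Monitor's label names or scope syntax.

```python
# Hypothetical inventory: each entity carries label key/value pairs.
hosts = [
    {"name": "web-1", "availability_zone": "us-east-1b"},
    {"name": "web-2", "availability_zone": "us-east-1a"},
    {"name": "db-1", "availability_zone": "us-east-1b"},
]


def matches(entity, label, op, value):
    """Evaluate one scope condition against one entity (illustrative operators)."""
    actual = entity.get(label)
    if op == "is":            # include an exact value
        return actual == value
    if op == "is_not":        # exclude an exact value
        return actual != value
    if op == "in":            # include any of several values
        return actual in value
    if op == "contains":      # simple pattern match on a substring
        return actual is not None and value in actual
    raise ValueError(f"unknown operator: {op}")


def in_scope(entity, scope):
    """An entity is in scope only if it satisfies every condition."""
    return all(matches(entity, *condition) for condition in scope)


scope = [("availability_zone", "is", "us-east-1b")]
selected = [h["name"] for h in hosts if in_scope(h, scope)]
print(selected)
```

Only entities that pass the scope are evaluated against the trigger, so a narrower scope means fewer entities can raise the alert.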

Configure Trigger

Define the threshold and time window for assessing the alert condition. Single Alert fires one alert for your entire scope, while Multiple Alert fires if any or every segment breaches the threshold at once.

384336537.png

In this example, multiple alerts are triggered if the file system used percentage averages above 75 over the last 5 minutes. The alert notification includes the MAC address of the host and the mount directory of the file system.
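The trigger in this example can be sketched in Python as follows. This is a simplified illustration, not Sysdig Monitor's evaluation logic: the MAC addresses, mount directories, and sample values are made up, and the way single_alert pools all samples across the scope is an assumption made only to contrast it with per-segment evaluation.

```python
from statistics import mean

THRESHOLD = 75.0  # percent, taken from the example above

# Hypothetical 5-minute window: (MAC address, mount directory) -> samples.
window = {
    ("0a:1b:2c:3d:4e:5f", "/"): [72.0, 79.0, 83.0, 80.0, 81.0],
    ("0a:1b:2c:3d:4e:5f", "/var"): [40.0, 41.0, 39.0, 42.0, 40.0],
    ("0a:9f:8e:7d:6c:5b", "/"): [77.0, 76.0, 78.0, 80.0, 79.0],
}


def multiple_alerts(window, threshold):
    """Return every segment whose window average exceeds the threshold."""
    return [seg for seg, values in window.items() if mean(values) > threshold]


def single_alert(window, threshold):
    """Return True if the average over the whole scope exceeds the threshold.

    Pooling all samples is an assumption made for this sketch.
    """
    pooled = [v for values in window.values() for v in values]
    return mean(pooled) > threshold


for mac, mount in multiple_alerts(window, THRESHOLD):
    print(f"ALERT: file system used % above {THRESHOLD} on {mac} at {mount}")

print("single alert fires:", single_alert(window, THRESHOLD))
```

With this data, two segments breach the threshold individually while the pooled average stays below it, which shows why per-segment evaluation can surface problems that a scope-wide condition hides.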

Use Cases

  • Number of processes running on a host is not normal

  • Root volume disk usage in a container is high