Group Outlier Alerts

The Group Outlier alert type evaluates a metric across the segments you choose and identifies entities whose values deviate from the rest of the group. It is useful for detecting abnormal resource utilization, latency, or error rates.

Define a Group Outlier Alert

  • Scope: By default, the alert applies to the Entire Infrastructure within your team scope. You can narrow the scope by filtering on specific labels, such as cloud_provider_region or kube_namespace_name.

  • Metric: Select the metric the alert will evaluate. A Group Outlier alert rule can be configured for a single metric only. For example, to identify whether a web server is handling more HTTP requests than its peers, use the http_requests_total metric segmented by hostname.

  • Group By Segment: Segmentation determines which entities are compared with each other for outlier detection. A Group Outlier alert rule is valid only when this field is set. For instance, when alerting on kube_pod_memory_usage, combine segments such as pod and node to pinpoint memory-intensive pods within particular nodes.

  • Observation Window: Set the time frame over which potential outliers are evaluated. The chosen time frame greatly influences outlier detection results. For instance, a host might show high memory usage over the past hour but not over the last 10 minutes. A longer observation window filters out short-lived deviations, so transient fluctuations are less likely to be flagged as outliers.
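
The effect of segmentation and the observation window can be sketched in a few lines of Python. This is only an illustration, not how the product evaluates alerts: the sample layout (timestamp, labels, value), the function name window_mean_by_segment, and the host names are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean

def window_mean_by_segment(samples, segment_keys, now, window_seconds):
    """Average each entity's values inside (now - window_seconds, now].

    samples: iterable of (timestamp_seconds, labels_dict, value).
    segment_keys: label names whose combined values identify one entity.
    """
    grouped = defaultdict(list)
    for timestamp, labels, value in samples:
        if now - window_seconds < timestamp <= now:
            entity = tuple(labels[key] for key in segment_keys)
            grouped[entity].append(value)
    return {entity: mean(values) for entity, values in grouped.items()}

# Hypothetical memory usage (%) sampled once a minute for an hour.
# host-a ran hot for the first 50 minutes and then recovered; host-b stayed flat.
now = 3600
samples = []
for t in range(60, 3601, 60):
    samples.append((t, {"hostname": "host-a"}, 90.0 if t <= 3000 else 40.0))
    samples.append((t, {"hostname": "host-b"}, 40.0))

last_hour = window_mean_by_segment(samples, ["hostname"], now, 3600)
last_ten_minutes = window_mean_by_segment(samples, ["hostname"], now, 600)
```

Over the one-hour window host-a averages well above host-b, while over the last 10 minutes the two hosts are identical, so the same data can produce an outlier or not depending on the window.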

MAD and DBSCAN Algorithms

Median Absolute Deviation (MAD)

Median Absolute Deviation (MAD) is a robust method for detecting outliers based on their deviation from the median. It calculates the median of the time series data within the configured observation window, and then determines how far each entity deviates from that median. Entities whose deviation exceeds the configured tolerance are considered outliers.

  • Tolerance: The tolerance sets how far an entity's value can sit from the median, expressed as a multiple of the MAD value, before it is considered outlying. A higher tolerance limits detection to entities that exhibit a more pronounced deviation from the median.

  • Outlier Persistence: The Outlier Persistence specifies the percentage of the observation window in which an entity’s reported value must fall outside the configured tolerance to be labeled as an outlier. It ensures that occasional blips aren’t mistakenly classified as outliers; only consistent deviations over the defined percentage of the observation window will trigger an alert.

Use MAD to detect deviations in entities that normally behave consistently.
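
As a rough illustration of how Tolerance and Outlier Persistence interact, the following Python sketch pools every value in the observation window, computes the median and the MAD, and flags an entity when a large enough share of its points falls outside tolerance × MAD. The exact computation used by the product is not described here, so the pooling step, the function name mad_outliers, the default values, and the sample data are assumptions.

```python
from statistics import median

def mad_outliers(series, tolerance=3.0, persistence=0.5):
    """Return the entities flagged as outliers, in input order.

    series: dict mapping entity name -> list of values in the observation window.
    tolerance: multiples of the MAD a value may sit from the median.
    persistence: fraction (0 to 1) of an entity's points that must exceed
        the tolerance before the entity is flagged.
    """
    pooled = [v for values in series.values() for v in values]
    center = median(pooled)
    # MAD: the median of the absolute deviations from the median.
    mad = median(abs(v - center) for v in pooled)
    limit = tolerance * mad  # if mad is 0, any deviation at all exceeds the limit
    flagged = []
    for entity, values in series.items():
        outside = sum(1 for v in values if abs(v - center) > limit)
        if values and outside / len(values) >= persistence:
            flagged.append(entity)
    return flagged

# Hypothetical request rates for four web servers over one window.
series = {
    "web-1": [100, 102, 98, 101],
    "web-2": [99, 101, 100, 103],
    "web-3": [100, 180, 175, 190],  # deviates for most of the window
    "web-4": [101, 99, 150, 100],   # a single blip
}

print(mad_outliers(series))                    # ['web-3']
print(mad_outliers(series, persistence=0.25))  # ['web-3', 'web-4']
```

With persistence at 50%, the single blip on web-4 is ignored; lowering it to 25% flags web-4 as well.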

Density-Based Spatial Clustering of Applications with Noise (DBSCAN)

Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is an algorithm that detects outliers by analyzing their proximity to other entities within a specified observation window. Unlike MAD, which evaluates outliers based on deviation from the median, DBSCAN identifies outliers by determining if they are isolated from neighboring time series.

  • Tolerance: In DBSCAN, the tolerance is the maximum distance within which an entity must have neighboring time series to be considered part of a group. DBSCAN groups similar time series based on their closeness. A larger tolerance produces more inclusive groups that capture a wider range of similar time series. A smaller tolerance makes the grouping criteria stricter, so more individual time series are identified as outliers because they don’t fit closely enough with any established group. Unlike MAD, which checks whether an entity exceeds the tolerance for a specified percentage of the observation window, DBSCAN evaluates spatial relations in a single snapshot, so it needs no percentage-based criterion.

Use DBSCAN to identify outliers in entities that typically exhibit similar trends or patterns.
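
The role of the tolerance can be sketched with DBSCAN's noise rule applied to one value per entity. This simplified Python example only finds the noise points (the outliers) and skips cluster labelling. It assumes the tolerance is used directly as the neighborhood radius in metric units; the min_neighbors parameter, the function name dbscan_noise, and the sample data are hypothetical and not taken from the product.

```python
def dbscan_noise(values, tolerance, min_neighbors=3):
    """Return the entities that DBSCAN would label as noise.

    values: dict mapping entity name -> one value from a single snapshot.
    tolerance: neighborhood radius; two entities are neighbors when their
        values differ by at most this amount.
    min_neighbors: entities (including itself) an entity needs within the
        radius to count as a core point of a group.
    """
    names = list(values)

    def neighbors(name):
        return [o for o in names if abs(values[o] - values[name]) <= tolerance]

    core = {n for n in names if len(neighbors(n)) >= min_neighbors}
    # Noise: not a core point and not within reach of any core point.
    return [
        n for n in names
        if n not in core and not any(o in core for o in neighbors(n))
    ]

# Hypothetical request rates for five web servers in one snapshot.
snapshot = {"web-1": 100, "web-2": 103, "web-3": 98, "web-4": 140, "web-5": 104}

print(dbscan_noise(snapshot, tolerance=5))   # ['web-4']
print(dbscan_noise(snapshot, tolerance=50))  # []
```

With a tolerance of 5, web-4 has no neighbors and is reported as an outlier. Raising the tolerance to 50 absorbs it into the group, while a tolerance of 1 leaves no group dense enough and every server is reported.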