Form and PromQL Editors

You can build data queries in the Dashboards module with the Form editor or the PromQL editor. Form queries are automatically translated to PromQL before they are sent to the Sysdig Prometheus API, but differences between the two query languages can cause issues in some edge cases. In such cases, select Translate to PromQL to continue editing the query in PromQL.

Automatic Query Translation

When you use the Form editor, Sysdig automatically translates each Form query into an equivalent PromQL query before forwarding it to the Sysdig Prometheus API server. The translation logic is the same as the one triggered by the Translate to PromQL button. Whether you build a query in the Form editor or the PromQL editor, the system therefore retrieves data through the Sysdig Prometheus API.

Legacy API and Sysdig Prometheus API

Before automatic query translation was introduced, Form queries were sent to the legacy API, and only PromQL queries were sent to the Sysdig Prometheus API.

The most important difference between data returned by the legacy API and data returned by the Sysdig Prometheus API is granularity. Data from queries built in the Form editor now has the same granularity as data from queries built in the PromQL editor.

For example, a 1-hour time selection now displays metrics with 10-second granularity, whereas before this enhancement you would get only 1-minute granularity.

Translation Limitations

Sysdig translates Form queries to PromQL as faithfully as possible. However, some scenarios pose problems because of the inherent differences between the two models. The following sections outline such cases and describe the relevant differences between data produced by the legacy API and data produced by the Sysdig Prometheus API for equivalent PromQL queries.

Click Translate to PromQL to make further edits to the query in PromQL.

Translating Aggregate Functions

Time Aggregation

When you configure a query in the Form editor and define a time aggregation, Sysdig automatically translates the chosen time aggregation function into the equivalent Prometheus function according to the following tables.

The Prometheus function varies according to the type of the selected metric:

  • Gauge metrics represent a single numerical value that can fluctuate arbitrarily over time, for example, CPU usage.

  • Counter metrics record how many times something has happened, for example, user logins.

    See Metric Types.

Gauge Metrics

Time Aggregation in Legacy API    Time Aggregation in Sysdig Prometheus API
avg                               avg_over_time
sum                               sum_over_time
min                               min_over_time
max                               max_over_time
rate                              sum_over_time / $__interval_sec
roc (rate of change)              deriv (see deriv)

Counter Metrics

Sysdig distinguishes between two types of counter metrics, which differ in the way they store values:

  • Prometheus counter metrics, which Sysdig refers to as prom counters.

    Prom counters are monotonically increasing cumulative metrics that report the total number of events since the event reporter started. The value always increases, except when the reporter restarts or reboots. This is called a reset.

  • StatsD-style counter metrics, which Sysdig refers to as delta counters.

    Delta counters report the number of events in the current time window.

As an example, the following table shows what a delta counter and a prom counter would store for the same sequence of event occurrences:

Time   10  20  30  40  50  60
delta  1   2   2   4   4   3
prom   1   3   5   4   8   11

To verify that both counters store the same information, consider the following question: how many events occurred after t=10 up to and including t=60?

  • For delta counters, you sum the values: 2+2+4+4+3 = 15
  • For prom counters, you sum the increases between consecutive values: (3-1)+(5-3)+4+(8-4)+(11-8) = 15
    • where (3-1) is the number of events at t=20, (5-3) is the number of events at t=30, and so on.
    • The prom counter resets between t=30 and t=40 (its value decreases from 5 to 4); therefore, the number of events at t=40 is just the value at t=40, which is 4.
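
The arithmetic above can be sketched in Python. This is a minimal illustration, not Sysdig's implementation: the function names are hypothetical, and the reset handling simply treats any decrease in a prom counter as a reset, as described in the text.

```python
def events_from_delta(values):
    """Each delta sample already holds the events of its own window, so sum them."""
    return sum(values)


def events_from_prom(values):
    """Sum the increases between consecutive samples of a cumulative counter.

    A decrease is treated as a reset: the counter restarted from zero,
    so the new value itself is the number of events since the reset.
    """
    total = 0
    for previous, current in zip(values, values[1:]):
        if current >= previous:
            total += current - previous
        else:
            total += current  # reset detected
    return total


# Values from the table above, sampled at t=10, 20, ..., 60.
delta = [1, 2, 2, 4, 4, 3]
prom = [1, 3, 5, 4, 8, 11]

# Events after t=10 up to and including t=60.
print(events_from_delta(delta[1:]))  # 15
print(events_from_prom(prom))        # 15
```

Both counters yield the same total, but only the prom counter needs reset detection, which is why the two types call for different functions.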

Because these two counter types store values differently, Sysdig translates the rate and sum time aggregations in two different ways, depending on the counter type.

For prom counter metrics:

Time Aggregation in Legacy API    Time Aggregation in Sysdig Prometheus API
rate                              rate
sum                               increase

For delta counter metrics:

Time Aggregation in Legacy API    Time Aggregation in Sysdig Prometheus API
rate                              sum_over_time / $__interval_sec
sum                               sum_over_time

For additional details about the Prometheus functions, see Query Functions.
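
As a rough illustration, the mapping tables in this section can be expressed as a lookup. The sketch below is an assumption about how the translated expressions might be assembled, not Sysdig's actual translation code: the function name, the metric type keys, and the placement of the [$__interval] range selector are hypothetical, modeled on the form avg_over_time(my_metric_name[$__interval]).

```python
# Hypothetical lookup mirroring the gauge, prom counter, and delta counter tables.
TIME_AGGREGATIONS = {
    "gauge": {
        "avg": "avg_over_time({metric}[$__interval])",
        "sum": "sum_over_time({metric}[$__interval])",
        "min": "min_over_time({metric}[$__interval])",
        "max": "max_over_time({metric}[$__interval])",
        "rate": "sum_over_time({metric}[$__interval]) / $__interval_sec",
        "roc": "deriv({metric}[$__interval])",
    },
    "prom_counter": {
        "rate": "rate({metric}[$__interval])",
        "sum": "increase({metric}[$__interval])",
    },
    "delta_counter": {
        "rate": "sum_over_time({metric}[$__interval]) / $__interval_sec",
        "sum": "sum_over_time({metric}[$__interval])",
    },
}


def translate_time_aggregation(metric, metric_type, aggregation):
    """Return the PromQL expression for a Form time aggregation (illustrative only)."""
    try:
        template = TIME_AGGREGATIONS[metric_type][aggregation]
    except KeyError:
        raise ValueError(
            f"no translation for {aggregation!r} on {metric_type!r}"
        ) from None
    return template.format(metric=metric)


print(translate_time_aggregation("my_metric_name", "gauge", "avg"))
print(translate_time_aggregation("my_metric_name", "prom_counter", "sum"))
print(translate_time_aggregation("my_metric_name", "delta_counter", "rate"))
```

Keying the lookup on the metric type first makes the point of the tables explicit: the same Form aggregation, such as rate, maps to a different expression depending on whether the metric is a prom counter or a delta counter.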

Group Aggregation

When you configure a query in the Form editor and define a group aggregation, Sysdig automatically translates the chosen group aggregation function into the equivalent Prometheus aggregation function. Group aggregation function names and meanings don’t change. For example, avg stays avg in PromQL and has exactly the same meaning.

Using top(k) / bottom(k)

When a group aggregation is defined, you can explicitly select a set of aggregation labels. When this happens, data tends to become bulky and less readable on the charts. For this reason, when an aggregation label is configured, the Form editor automatically selects and returns the top 10 time series. How this selection happens in the Prometheus system is what makes the two models different.

As an example, imagine your data looks like the chart below, where you have four time series, represented using the colors green, blue, red, and orange.

When applying the Prometheus function top(2), Prometheus independently selects the top 2 time series for each point in time on the graph. Each point in time on the graph will have its own set of top 2 time series.

Rank  t0      t1      t2      t3      t4      t5      t6      t7      t8      t9      t10     t11
top1  blue    green   green   green   green   blue    orange  orange  orange  blue    green   orange
top2  green   blue    blue    orange  blue    green   green   green   green   green   orange  green

The output of the top(2) function applied to the time series above will therefore be represented as follows.

Note that:

  • The green time series stays the same because its points are part of all top(2) sets.
  • The red time series disappears because it is not part of any top(2) set.
  • The blue and orange time series get some gaps and some isolated points, according to their presence in the various top(2) sets.
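
The per-timestamp selection described above can be sketched in Python. This is a simplified model, not Prometheus code: the function name and the sample values are hypothetical, chosen only so that one series is always selected, one is never selected, and two are selected part of the time.

```python
def topk_per_timestamp(series, k):
    """Keep, at each timestamp, only the k series with the highest values.

    Returns a dict of series name -> list of values, with None where the
    series is not in the top k. Series that are never selected are dropped.
    """
    names = list(series)
    length = len(next(iter(series.values())))
    result = {name: [None] * length for name in names}
    for t in range(length):
        ranked = sorted(names, key=lambda name: series[name][t], reverse=True)
        for name in ranked[:k]:
            result[name][t] = series[name][t]
    return {
        name: values
        for name, values in result.items()
        if any(value is not None for value in values)
    }


# Hypothetical values for four series over four timestamps.
series = {
    "green": [5, 9, 8, 7],
    "blue": [9, 6, 2, 1],
    "red": [1, 1, 1, 2],
    "orange": [2, 3, 9, 8],
}

top2 = topk_per_timestamp(series, 2)
print(top2["green"])   # [5, 9, 8, 7]
print(top2["blue"])    # [9, 6, None, None]
print(top2["orange"])  # [None, None, 9, 8]
print("red" in top2)   # False
```

The None entries correspond to the gaps on the chart: a series is drawn only at the timestamps where it belongs to the top set.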

Using the Latest Displayed Value with Sparse Metrics

Translated sparse metric queries look like this:

avg(avg_over_time(my_metric_name[$__interval]))

Sysdig uses a range vector, my_metric_name[$__interval], and therefore Prometheus only takes into account the data points that fall within $__interval.

Consider a sparse metric, for example, one that reports a value every 2 minutes. With a small time range, such as 10m, 1h, or 6h, the panel might display No Data because $__interval is shorter than the reporting period and might not include any non-null values. Previous Form panels, by contrast, had a static interval of 5 minutes.

To see data in such cases you can:

  • Translate the Form panel to PromQL and set the Min. interval to an appropriate value, for example, 5m.
  • Switch from Latest to Entire range. This will apply the time aggregation to all points within the selected time range.
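
The effect of the interval on a sparse metric can be sketched with a small simulation. This is a simplified model of a range-vector average, not Prometheus code: the function name, the sample data, and the evaluation time are hypothetical.

```python
def avg_over_window(samples, eval_time, window_sec):
    """Average the samples whose timestamp lies in (eval_time - window_sec, eval_time].

    Returns None when the window contains no samples, which is what a
    panel would show as No Data.
    """
    in_window = [
        value
        for timestamp, value in samples
        if eval_time - window_sec < timestamp <= eval_time
    ]
    if not in_window:
        return None
    return sum(in_window) / len(in_window)


# Hypothetical sparse metric: one sample every 120 seconds over 10 minutes.
samples = [(timestamp, 1.0) for timestamp in range(0, 601, 120)]

latest = 590  # hypothetical evaluation time of the latest displayed point

print(avg_over_window(samples, latest, 10))   # None: a 10-second window falls between samples
print(avg_over_window(samples, latest, 300))  # 1.0: a 5-minute window always contains samples
```

With a 2-minute reporting period, any window shorter than 2 minutes can fall between two samples, whereas a 5-minute window always contains at least two, which is why raising the Min. interval restores the data.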