Understanding Default, Custom, and Missing Metrics

Default Metrics

Default metrics include various kinds of metadata which Sysdig Monitor automatically knows how to label, segment, and display.

For example:

  • System metrics for hosts, containers, and processes (CPU used, etc.)

  • Orchestrator metrics (collected from Kubernetes, Mesos, etc.)

  • Network metrics (e.g. network traffic)

  • HTTP

  • Platform metrics (in some cases)

Default metrics are collected mainly from two sources: syscalls and Kubernetes.

Custom Metrics

About Custom Metrics

Custom metrics generally refer to any metrics that the Sysdig Agent collects from some third-party integration. The type of infrastructure and applications integrated determine the custom metrics that the Agent collects and reports to Sysdig Monitor. The supported custom metrics are:

Each metric comes with a set of custom labels, and additional labels can be user-created. Sysdig Monitor simply collects and reports them with minimal or no internal processing. The limit currently enforced is 3000 metrics per host. Use the metrics_filter option in the dragent.yaml file to remove unwanted metrics or to choose the metrics to report when hosts exceed this limit. For more information on editing the dragent.yaml file, see Understanding the Agent Config Files.

Unit for Custom Metrics

Sysdig Monitor automatically detects the default unit of a custom metric from the delimiter suffix in the metric name. For example, custom_expvar_time_seconds results in a base unit of seconds. The supported base units are byte, percent, and time. A custom metric name must end with one of the following delimiter suffixes for Sysdig Monitor to identify and configure the correct unit type.

  • second

  • seconds

  • byte

  • bytes

  • total (represents accumulating count)

  • percent

Unless a custom metric follows this naming convention, its unit is not auto-detected and will be incorrect. For instance, custom_byte_expvar does not yield the correct unit (MiB) because byte is not the suffix of the name.
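
The suffix rule above can be illustrated with a minimal Python sketch. It is not Sysdig's implementation: the function name infer_base_unit, the use of "_" as the delimiter, and the mapping of total to a "count" unit are assumptions made for illustration only.

```python
from typing import Optional

# Suffixes listed in the text, mapped to the base unit they imply.
# Mapping "total" to "count" is an assumption made for this sketch.
SUFFIX_TO_UNIT = {
    "second": "time",
    "seconds": "time",
    "byte": "byte",
    "bytes": "byte",
    "percent": "percent",
    "total": "count",
}


def infer_base_unit(metric_name: str) -> Optional[str]:
    """Return the base unit implied by the trailing suffix, or None."""
    # Assume "_" is the delimiter, as in custom_expvar_time_seconds.
    if "_" not in metric_name:
        return None
    suffix = metric_name.rsplit("_", 1)[-1].lower()
    return SUFFIX_TO_UNIT.get(suffix)


if __name__ == "__main__":
    for name in ("custom_expvar_time_seconds", "custom_byte_expvar"):
        print(name, "->", infer_base_unit(name))
```

Under these assumptions, custom_expvar_time_seconds resolves to the time base unit, while custom_byte_expvar resolves to no unit because byte appears in the middle of the name rather than at the end.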

Editing the Unit Scale

You can change the unit scale either by editing the panel on the Dashboard or in Explore.

Explore

From the Search Metrics and Dashboard drop-down, select the custom metrics you want to edit the unit selection for, then click More Options. Select the desired unit scale from the Metric Format drop-down and click Save.

Dashboard

Select the Dashboard Panel associated with the custom metrics you want to modify. Select the desired unit scale from the Metrics drop-down and click Save.

Display Missing Data

Data can be missing for a few different reasons:

  • Problems, such as faulty network connectivity, in the communication channel between your infrastructure and the Sysdig metrics store.

  • Metrics or StatsD batch jobs are submitted sporadically.

Sysdig Monitor allows you to configure the behavior of missing data in Dashboards. Though metric type determines the default behavior, you can configure how to visualize missing data and define it at the per-query level. Use the No Data Display drop-down in the Options menu in the panel configuration. See Create a New Panel for more information.

Consider the following guidelines:

  • The No Data Display drop-down has only two options for the Stacked Area timechart: gap and show as zero.

  • For the Number panel, the No Data Display option allows entering a custom no data text.

  • For form-based timechart panels, the default option for a metrics selection that does not contain a StatsD metric is gap.

  • Adding a StatsD metric to a query in a form-based timechart panel changes the selected No Data Display type to show as zero, which is the default option for form-based StatsD metrics. You can change this selection to any other type.

  • The default display option is gap for PromQL Timechart panels.

The options for No Data Display are:

  • gap: The default option for form-based timechart panels where the query's metric selection does not contain a StatsD metric. gap is the best visualization type for most use cases because a break in the line is easy to spot and indicates a problem.

    gap-null-data.png
  • show as zero: The best option for StatsD metrics which are only submitted sporadically. For example, batch jobs and count of errors. This is the default display option for StatsD metrics in form-based panels.

    zero-null-data.png

    For other metrics, we do not recommend this option because showing zero can be misleading. For example, this setting reports free disk space as 0% when the disk or host disappears, when in reality the value is unknown.

    Note

    Prometheus best practices recommend avoiding missing metrics.

  • connect - solid: Use for measuring the value of a metric, typically a gauge, where you want to visualize the missing samples flattened.

    solidline-null-data.png

    The leftmost and rightmost visible data points can be connected because Sysdig does not perform interpolation.

  • connect - dotted: Use for measuring the value of a metric, typically a gauge, where you want to visualize the missing samples flattened.

    dottedline-null-data.png

    The leftmost and rightmost visible data points can be connected because Sysdig does not perform interpolation.
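
To make the difference between the display options concrete, the following Python sketch models how each option might treat a series that contains missing samples. It is a simplified illustration, not Sysdig's rendering logic: the function apply_no_data_display and the (timestamp, value) representation are hypothetical, and the connect options are modeled by dropping the missing points so that a line chart would join the neighboring samples directly.

```python
from typing import List, Optional, Tuple

# (timestamp, value); a value of None marks a missing sample.
Sample = Tuple[int, Optional[float]]


def apply_no_data_display(samples: List[Sample], mode: str) -> List[Sample]:
    """Return the points a chart would plot under the given display mode."""
    if mode == "gap":
        # Keep missing samples as None so the line breaks at that point.
        return list(samples)
    if mode == "show as zero":
        # Replace every missing sample with 0.
        return [(t, 0.0 if v is None else v) for t, v in samples]
    if mode in ("connect - solid", "connect - dotted"):
        # Drop missing samples; the chart then joins the neighbors directly.
        # No values are interpolated for the missing timestamps.
        return [(t, v) for t, v in samples if v is not None]
    raise ValueError(f"unknown No Data Display mode: {mode}")


if __name__ == "__main__":
    series = [(0, 1.0), (10, None), (20, 3.0)]
    for mode in ("gap", "show as zero", "connect - solid"):
        print(mode, apply_no_data_display(series, mode))
```

In this model, show as zero is the only option that introduces a value that was never reported, which is why it suits sporadic counters but can mislead for gauges such as free disk space.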