Enhanced Metric Store

Sysdig has launched its next-generation metric store, which introduces a number of new features and changes or removes some existing features in Sysdig Monitor. This document covers the major enhancements and changes introduced by the metric store.

New Features and Enhancements

Prometheus-Compatible Naming Conventions for Metrics & Labels

In prior versions of Sysdig Monitor, metrics were inconsistent between PromQL and Form querying. This behavior has changed. Metrics are now unified: all metrics use a Prometheus-compatible format instead of the previous statsd-compatible naming convention. For example, underscores are used instead of dot notation, as shown below:

kubernetes.node.allocatable.cpuCores is translated to kube_node_status_allocatable_cpu_cores, and kubernetes.namespace.name to kube_namespace_name.

Your existing dashboards and alerts data will be automatically migrated to the new naming convention. Sysdig APIs support metrics in both the old and new naming conventions.

For metrics mapping, see Metrics and Label Mapping.
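As a rough illustration of the renaming, the sketch below looks up classic names in a small table. Only the two pairs quoted above come from this document; the CLASSIC_TO_PROMETHEUS dictionary and the translate_metric_name helper are hypothetical names, and the authoritative mapping is the one published in Metrics and Label Mapping. The example also shows that the translation is a lookup rather than a mechanical dot-to-underscore rewrite.

```python
# Hypothetical lookup table; only these two pairs are taken from this page.
CLASSIC_TO_PROMETHEUS = {
    "kubernetes.node.allocatable.cpuCores": "kube_node_status_allocatable_cpu_cores",
    "kubernetes.namespace.name": "kube_namespace_name",
}


def translate_metric_name(classic_name: str) -> str:
    """Return the Prometheus-compatible name for a classic name.

    Raises KeyError when the name is not in the table, so a missing
    mapping is never silently guessed.
    """
    return CLASSIC_TO_PROMETHEUS[classic_name]


if __name__ == "__main__":
    for name in CLASSIC_TO_PROMETHEUS:
        print(f"{name} -> {translate_metric_name(name)}")
```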

Context-Specific Metrics

Metrics such as cpu.used.percent previously reported values from a process, a container, or a host, depending on your query segmentation or scope. This has been improved by creating new sets of context-explicit metrics that align with the flat model and resource-specific semantics of the Prometheus naming schema. For example:

Classic Metrics | New Metrics

Network metrics previously reported values from a host, a container, a program, or a connection, depending on your query segmentation or scope. This has also been improved by creating new sets of context-explicit metrics, in this case including per-connection metrics:

Classic Metrics | New Metrics

For the complete list of context-specific metrics, see Mapping Classic Metrics with Context-Specific PromQL Metrics.

Your existing dashboards and alerts data will be automatically migrated to the new naming convention.

Faster Query Performance

Queries now perform faster and handle larger volumes of data. You can expect queries executed in Sysdig Monitor to be 2-3x faster.

Single Stat Panels Display Latest Value

Number panels, tables, histograms, and toplist panels can now show the latest value for an entity. This can be done without having to aggregate multiple values over the time selection.

Overview Displays Latest Data

Overview pages now show the latest data, as opposed to an aggregated value for widgets over the selected time window. Time navigation has been removed to focus this view on the live (latest) status of your infrastructure.

PromQL Dashboard $__scope

You can easily reference a dashboard scope in PromQL queries. To do so, use the reserved $__scope variable in the query.

Under the hood, $__scope is substituted with the expression specified in the dashboard scope. This is achieved by leveraging Sysdig ServiceVision technology, which automatically enriches metrics with Kubernetes and application context. Learn more about ServiceVision.
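The substitution can be pictured as plain text replacement. The sketch below is not Sysdig's implementation: the expand_scope helper, the example scope kube_namespace_name="prod", and the placement of the variable inside the label selector are assumptions made for illustration.

```python
def expand_scope(query: str, scope_expression: str) -> str:
    """Replace the reserved $__scope variable with a scope expression."""
    return query.replace("$__scope", scope_expression)


# Hypothetical dashboard scope and query, for illustration only.
scope = 'kube_namespace_name="prod"'
query = "avg(sysdig_program_cpu_used_percent{$__scope})"

expanded = expand_scope(query, scope)
print(expanded)
```

Because the variable is resolved from the dashboard scope, the same query text can be reused across dashboards that are scoped differently.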

Mixed-Metric Granularity

Sysdig Monitor can now display metrics scraped at different intervals, for example 10s and 1m, on the same graph.

Improved Granularity

The granularity of graphs has been improved in Dashboards. For example, a 1-hour selection now shows metrics at 10-second intervals. In prior versions, a 1-hour selection in Dashboards showed metrics at 1-minute intervals.
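To put the change in numbers, the short calculation below counts the data points a single series contributes to a 1-hour selection at each interval. The points_in_window helper is a hypothetical name, and the sketch assumes exactly one point per interval.

```python
def points_in_window(window_seconds: int, interval_seconds: int) -> int:
    """Number of data points one series contributes to a time window."""
    return window_seconds // interval_seconds


one_hour = 60 * 60
print(points_in_window(one_hour, 10))  # 360 points at 10-second intervals
print(points_in_window(one_hour, 60))  # 60 points at 1-minute intervals
```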

Remove Re-Alignment

Previously, Sysdig Monitor would re-align time selections in graphs due to certain performance limitations. This time re-alignment has been removed to show more up-to-date metrics.

Program Metrics Retention

The retention period for program metrics has been reduced to 4 days. Program metrics are:

  • sysdig_program_cpu_cores_used
  • sysdig_program_cpu_cores_used_percent
  • sysdig_program_cpu_used_percent
  • sysdig_program_memory_used_bytes
  • sysdig_program_net_in_bytes
  • sysdig_program_net_out_bytes
  • sysdig_program_net_connection_in_count
  • sysdig_program_net_connection_out_count
  • sysdig_program_net_connection_total_count
  • sysdig_program_net_error_count
  • sysdig_program_net_request_count
  • sysdig_program_net_request_in_count
  • sysdig_program_net_request_out_count
  • sysdig_program_net_request_time
  • sysdig_program_net_request_in_time
  • sysdig_program_net_tcp_queue_len
  • sysdig_program_proc_count
  • sysdig_program_thread_count
  • sysdig_program_up

Deprecated Features

Topology Maps

Topology Maps will be deprecated because they are incompatible with the new data store and had limitations at scale for certain users. We’re working on an improved version of Topology Maps, which is on the short-term roadmap.

Agent Percentiles

Agent-derived percentiles will be deprecated. If you have been using these, your queries will stop working, and you will have to manually migrate them to leverage Prometheus histograms or PromQL functions such as histogram_quantile to achieve more precise results.
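For readers new to histogram-based percentiles, the sketch below shows the linear-interpolation idea behind histogram_quantile for classic bucketed histograms. It is a simplified illustration, not the Prometheus source code: it assumes non-negative bucket bounds and cumulative counts, and the bucket values in the usage example are made up.

```python
import math
from typing import List, Tuple


def histogram_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative histogram buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs sorted by
    upper bound, ending with a +Inf bucket that holds the total count.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    if len(buckets) < 2 or not math.isinf(buckets[-1][0]):
        raise ValueError("need at least one finite bucket and a +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for upper, count in buckets:
        if count >= rank:
            if math.isinf(upper) or count == prev_count:
                # Above the highest finite bound, or an empty bucket:
                # fall back to the lower edge of this bucket.
                return prev_bound
            # Linear interpolation inside the selected bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (upper - prev_bound) * fraction
        prev_bound, prev_count = upper, count
    return prev_bound


# Made-up request latency buckets (seconds, cumulative request count).
example = [(0.1, 10), (0.5, 60), (1.0, 90), (math.inf, 100)]
print(histogram_quantile(0.5, example))
print(histogram_quantile(0.95, example))
```

The estimate depends on how the bucket boundaries are chosen, so buckets should be placed around the latency range you care about.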

Deprecated Kubernetes Labels

The following labels are no longer supported:

  • net.connection.client
  • net.connection.client.pid
  • net.connection.direction
  • net.connection.endpoint.tcp
  • net.connection.udp.inverted
  • net.connection.errorCode
  • net.connection.l4proto
  • net.connection.server
  • net.connection.server.pid
  • net.connection.state
  • net.role
  • swarm.node.label
  • swarm.service.label
  • swarm.task.label
  • agent.tag
  • cloudProvider.resource.endPoint
  • host.container.mappings
  • host.ip.all
  • host.ip.private
  • host.ip.public
  • host.server.port
  • host.isClientServer
  • host.isInstrumented
  • host.isInternal
  • host.procList.main
  • proc.id
  • proc.name.client
  • proc.name.server
  • program.environment
  • program.usernames
  • kubernetes.service.label
  • kubernetes.statefulSet.label
  • kubernetes.node.label
  • kubernetes.pod.label
  • kubernetes.namespace.label
  • kubernetes.replicaSet.label
  • kubernetes.replicationController.label
  • kubernetes.deployment.label
  • kubernetes.daemonSet.label
  • kubernetes.job.label
  • kubernetes.persistentvolume.label
  • kubernetes.persistentvolumeclaim.label
  • container.label
  • mesos_cluster
  • mesos_node
  • mesos_pid

Usage of Labels in Table Panel

Querying labels as metrics is limited to infrastructure labels. For example, you can use all the host-level labels (for example, agent tags), AWS tags (for example, region), and Kubernetes labels (for example, workload) to build table panels.

Contact Us

If you have any questions or comments about these changes, feel free to contact Sysdig Support or your Sysdig representative.

Last modified January 19, 2022