Metric Store Release Notes
June 2022
New Features and Enhancements
Prometheus-Compatible Naming Conventions for Metrics & Labels
In prior versions of Sysdig Monitor, metrics were inconsistent between PromQL and Form querying. This behavior has been changed. Metrics are now unified: all metrics are presented in a Prometheus-compatible naming convention, as opposed to the previous statsd-compatible naming convention. For example, underscores are used instead of dot notation: kubernetes.node.allocatable.cpuCores is mapped to kube_node_status_allocatable_cpu_cores, and kubernetes.namespace.name to kube_namespace_name.
Your existing dashboards, alerts, and notifications will be automatically migrated to the new naming convention. Sysdig APIs support metrics and labels in both the old and new naming conventions. Note that for the initial release, labels will not be migrated to the new naming convention in the old explore, events, and team settings.
Notifications sent via alerts (webhooks, PagerDuty, and so on) will use the new label and metric conventions. If you are performing further processing to parse the metric or label names within these notification messages, update your scripts as appropriate.
For metrics mapping, see Metrics and Label Mapping.
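If a script parses metric or label names out of notification messages, a small lookup table can bridge the two conventions during the transition. The following is a minimal sketch rather than a Sysdig API: the table holds only the two mappings quoted in these notes, and the helper name to_prometheus_name is hypothetical. The authoritative list is the one in Metrics and Label Mapping.

```python
# Hypothetical helper: translate classic (dot-notation) names to the new
# Prometheus-compatible names. Only the two mappings given in these notes
# are listed; extend the table from the official mapping reference.
CLASSIC_TO_PROMETHEUS = {
    "kubernetes.node.allocatable.cpuCores": "kube_node_status_allocatable_cpu_cores",
    "kubernetes.namespace.name": "kube_namespace_name",
}


def to_prometheus_name(name: str) -> str:
    """Return the new name for a classic name, or the input unchanged if unmapped."""
    return CLASSIC_TO_PROMETHEUS.get(name, name)


if __name__ == "__main__":
    for classic in ("kubernetes.namespace.name", "some.unmapped.name"):
        print(classic, "->", to_prometheus_name(classic))
```

Returning unmapped names unchanged keeps the script working for names that already follow the new convention.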
Context-Specific Metrics
Metrics such as cpu.used.percent would previously show values from a process, container, or host, depending on your query segmentation and scope. This has been improved by creating new sets of context-specific metrics that follow the resource-specific semantics of the Prometheus naming convention. For example:
Classic Metrics | New Metrics |
---|---|
cpu.used.percent | sysdig_program_cpu_used_percent, sysdig_container_cpu_used_percent, sysdig_host_cpu_used_percent |
uptime | sysdig_program_up, sysdig_container_up, sysdig_host_up |
Network metrics would previously show values from a host, container, program, or connection, depending on your query segmentation or scope. This has been improved by creating a new set of context-explicit metrics, which in this case include per-connection metrics:
Classic Metrics | New Metrics |
---|---|
net.bytes.in | sysdig_connection_net_in_bytes, sysdig_container_net_in_bytes, sysdig_host_net_in_bytes, sysdig_program_net_in_bytes |
Your existing dashboards, alerts and notifications will be automatically migrated to the new naming convention. Sysdig APIs support metrics in both old and new naming conventions.
For the complete list of context-specific metrics, see Mapping Classic Metrics with Context-Specific PromQL Metrics.
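To illustrate how one classic metric now fans out into several explicit ones, the sketch below selects the new metric name from the classic name and the context you want to query. The data is copied from the tables in this section; the function name context_metric and the context keys are hypothetical and are not part of any Sysdig API.

```python
# Context-specific replacements for classic metrics, taken from the tables
# in this section. The context keys ("program", "container", ...) are an
# illustrative choice, not an official vocabulary.
CONTEXT_METRICS = {
    "cpu.used.percent": {
        "program": "sysdig_program_cpu_used_percent",
        "container": "sysdig_container_cpu_used_percent",
        "host": "sysdig_host_cpu_used_percent",
    },
    "uptime": {
        "program": "sysdig_program_up",
        "container": "sysdig_container_up",
        "host": "sysdig_host_up",
    },
    "net.bytes.in": {
        "connection": "sysdig_connection_net_in_bytes",
        "container": "sysdig_container_net_in_bytes",
        "host": "sysdig_host_net_in_bytes",
        "program": "sysdig_program_net_in_bytes",
    },
}


def context_metric(classic: str, context: str) -> str:
    """Return the context-specific metric name, or raise KeyError if unknown."""
    try:
        return CONTEXT_METRICS[classic][context]
    except KeyError:
        raise KeyError(
            f"no context-specific metric known for {classic!r} in context {context!r}"
        ) from None


if __name__ == "__main__":
    print(context_metric("cpu.used.percent", "container"))
```

Making the context an explicit argument mirrors the change itself: the scope no longer silently decides which value a metric reports.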
Faster Query Performance
Queries now perform faster and handle larger volumes of data. You can expect queries executed in Sysdig Monitor to be noticeably faster.
Single Stat Panels Display Latest Value
Number panels, tables, histograms, and toplist panels can now show the latest value for an entity. This can be done without having to aggregate multiple values over the time selection.
Overview Displays Latest Data
Overview pages now show the latest data, as opposed to an aggregated value for widgets over the selected time window. Time navigation has been removed to focus this view on the live (latest) status of your infrastructure.
Scope Variable in PromQL Dashboard
You can easily reference a dashboard scope in PromQL queries by using the reserved $__scope variable.
Under the hood, $__scope is substituted with the expression specified in the dashboard scope. This is achieved by leveraging Sysdig ServiceVision technology, which automatically enriches metrics with Kubernetes and application context. To learn more, see ServiceVision.
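The substitution can be pictured as a plain text replacement. The sketch below is only an illustration of that idea: the real expansion is performed by Sysdig Monitor, and the placement of $__scope inside the label selector, the scope expression format, and the function name expand_scope are all assumptions made for this example.

```python
def expand_scope(query: str, scope_expression: str) -> str:
    """Replace every occurrence of the reserved $__scope variable in a query.

    Illustrative only: Sysdig Monitor performs the real substitution.
    """
    return query.replace("$__scope", scope_expression)


# Assumed query shape and scope expression, chosen for illustration.
query = "avg(sysdig_container_cpu_used_percent{$__scope})"
scope = 'kube_namespace_name="default"'
expanded = expand_scope(query, scope)

if __name__ == "__main__":
    print(expanded)
```

Because the variable is resolved from the dashboard scope, the same panel query can be reused unchanged when the scope is edited.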
Mixed-Metric Granularity
Sysdig Monitor can now display metrics scraped at different intervals, for example 10s and 1m, on the same graph.
Improved Granularity for PromQL Panels
Granularity of graphs has been improved for PromQL panels. For example, a 1-hour selection now shows metrics at 10-second intervals. In prior versions, a 1-hour selection in Dashboards showed metrics at 1-minute intervals.
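As a quick worked example of what this means for a graph, the sketch below counts the data points per series in a selection. It assumes one point per interval with no gaps; the function name points_in_selection is hypothetical.

```python
def points_in_selection(selection_seconds: int, interval_seconds: int) -> int:
    """Number of data points per series, assuming one point per interval."""
    return selection_seconds // interval_seconds


ONE_HOUR = 60 * 60

new_points = points_in_selection(ONE_HOUR, 10)  # 10-second intervals
old_points = points_in_selection(ONE_HOUR, 60)  # 1-minute intervals

if __name__ == "__main__":
    print(new_points, old_points)  # prints "360 60"
```

Under that assumption, a 1-hour graph carries six times as many points per series as before.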
Removed Re-Alignment
Previously, Sysdig Monitor would re-align time selections in graphs due to certain performance limitations. Re-alignment has been removed so that graphs show more up-to-date metrics.
Troubleshooting Metrics
Troubleshooting metrics, such as program metrics, connection-level network metrics, and Kubernetes troubleshooting metrics, are reported at 10-second granularity and stored for 4 days. For the list of troubleshooting metrics and the labels that you can use to segment them, see Troubleshooting Metrics.
Discontinued Features
Discontinued Metrics and Labels
Below is the list of metrics and labels that are going to be discontinued. We made an effort to not deprecate any metrics or labels used in existing alerts, but contact us if you encounter any issues.
It is important to note that we have applied automatic mapping of all net.*.request.time.worst metrics to net.*.request.time, as max aggregation gives equivalent results and it was almost exclusively used in combination with these metrics.
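If you maintain scripts that still refer to the net.*.request.time.worst names, the mapping can be approximated as follows. This is a sketch under stated assumptions, not Sysdig's migration logic: it treats the * in the pattern as at most one protocol segment (such as http), and the function name map_worst_metric is hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumption: "*" stands for at most one segment, e.g. "http" or "sql".
_WORST = re.compile(r"^(net\.(?:[A-Za-z0-9]+\.)?request\.time)\.worst$")


def map_worst_metric(name: str) -> Tuple[str, Optional[str]]:
    """Map a *.request.time.worst metric to its base metric plus "max".

    Returns (name, None) unchanged when the name does not match the pattern.
    """
    match = _WORST.match(name)
    if match:
        return match.group(1), "max"
    return name, None


if __name__ == "__main__":
    print(map_worst_metric("net.http.request.time.worst"))
```

Names that do not match the pattern exactly, such as net.request.time.worst.in, are returned untouched by this sketch.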
Discontinued Metrics
The following metrics are no longer supported:
net.request.time.file
net.request.time.file.percent
net.request.time.local
net.request.time.local.percent
net.request.time.net
net.request.time.net.percent
net.request.time.nextTiers
net.request.time.nextTiers.percent
net.request.time.processing
net.request.time.processing.percent
net.request.time.worst.in
net.request.time.worst.out
net.incomplete.connection.count.total
net.http.request.time.worst
net.mongodb.request.time.worst
net.sql.request.time.worst
net.link.clientServer.bytes
net.link.delay.perRequest
net.link.serverClient.bytes
Discontinued Labels
The following labels are no longer supported:
net.connection.client
net.connection.client.pid
net.connection.direction
net.connection.endpoint.tcp
net.connection.udp.inverted
net.connection.errorCode
net.connection.l4proto
net.connection.server
net.connection.server.pid
net.connection.state
net.role
cloudProvider.resource.endPoint
host.container.mappings
host.ip.all
host.ip.private
host.ip.public
host.server.port
host.isClientServer
host.isInstrumented
host.isInternal
host.procList.main
host_domain
proc.id
proc.name.client
proc.name.server
program.environment
program.usernames
mesos_cluster
mesos_node
mesos_pid
In addition to this, composite labels ending with the .label string will no longer be supported. For example, kubernetes.service.label will be deprecated, but kubernetes.service.label.* will continue to be supported.
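A simple suffix check is enough to tell the two cases apart when auditing saved queries. The sketch below is an illustration only; the function name is hypothetical, and kubernetes.service.label.app is an invented example of a label under the kubernetes.service.label.* pattern.

```python
def is_discontinued_composite_label(label: str) -> bool:
    """True for composite labels that end with the ".label" string."""
    return label.endswith(".label")


if __name__ == "__main__":
    for label in ("kubernetes.service.label", "kubernetes.service.label.app"):
        print(label, is_discontinued_composite_label(label))
```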
Removed Features
Topology Maps
Topology Maps will be deprecated due to their incompatibility with the new data store and their limitations at scale for certain users.
Agent Percentiles
Agent-derived percentiles will be deprecated. If you have been using these, your queries will stop working and you will have to manually migrate them to leverage Prometheus histograms or PromQL functions such as histogram_quantile.
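For readers new to histogram-based percentiles, the sketch below shows the idea behind histogram_quantile on a classic Prometheus histogram: find the cumulative bucket that contains the requested rank, then interpolate linearly inside it. It is a simplified illustration and not the Prometheus implementation. The bucket values are invented, and edge cases that Prometheus handles (such as non-positive bucket bounds or q outside the valid range) are either ignored or rejected.

```python
import math
from typing import Sequence, Tuple


def histogram_quantile(q: float, buckets: Sequence[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: (upper_bound, cumulative_count) pairs sorted by upper_bound,
    with the last upper bound equal to +Inf, as in a Prometheus histogram.
    """
    if not 0 < q <= 1:
        raise ValueError("this sketch only supports 0 < q <= 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0  # lowest bucket is assumed to start at 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return buckets[-2][0]
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return math.nan  # unreachable for well-formed input


# Invented example data: request latency buckets in seconds.
example_buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]

if __name__ == "__main__":
    print(histogram_quantile(0.95, example_buckets))
```

With the example data, the 95th percentile falls in the 0.5 to 1.0 bucket and is estimated at roughly 0.75 seconds. In PromQL, the equivalent estimate is computed server-side from the histogram's bucket series.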
Change in Functionality
Usage of Labels in Table Panel
Only Infrastructure labels can be used to query metrics. Build table panels with:
- Host level labels (such as agent tags)
- AWS tags (such as region)
- Kubernetes labels (such as workload)
Known Issue for Non Timecharts
Due to the underlying changes we made to our core metric ingestion engine, charts that are not Timecharts (for example, Number panels) will sometimes fail to display aggregated data for the full requested time range. In this case, we will:
- Aggregate a portion of data spanning a minimum of two weeks.
- Clearly define the time range for which we were able to aggregate in the warning message.
Note that this is a transient side effect, and will be less likely to happen over time.
Contact Us
If you have any questions or comments about these changes, contact your Sysdig representative or Sysdig Support.